searxng/searx/engines/youtube.py

## Youtube (Videos)
#
# @website     https://www.youtube.com/
# @provide-api yes (http://gdata-samples-youtube-search-py.appspot.com/)
#
# @using-api   yes
# @results     JSON
# @stable      yes
# @parse       url, title, content, publishedDate, thumbnail, embedded

from json import loads
# Python 3 location of urlencode (Python 2 used "from urllib import urlencode")
from urllib.parse import urlencode
from dateutil import parser

# engine dependent config
categories = ['videos', 'music']
paging = True
language_support = True

# search-url
base_url = 'https://gdata.youtube.com/feeds/api/videos'
search_url = base_url + '?alt=json&{query}&start-index={index}&max-results=5'

embedded_url = '<iframe width="540" height="304" ' +\
    'data-src="//www.youtube-nocookie.com/embed/{videoid}" ' +\
    'frameborder="0" allowfullscreen></iframe>'


# do search-request
def request(query, params):
    # start-index is 1-based; each page holds 5 results (see max-results)
    index = (params['pageno'] - 1) * 5 + 1

    params['url'] = search_url.format(query=urlencode({'q': query}),
                                      index=index)

    # add language tag if specified
    if params['language'] != 'all':
        params['url'] += '&lr=' + params['language'].split('_')[0]

    return params


# get response from search-request
def response(resp):
    results = []

    search_results = loads(resp.text)

    # return empty array if there are no results
    if 'feed' not in search_results:
        return []

    feed = search_results['feed']

    # parse results
    # a feed without matches may lack the 'entry' key, so default to no entries
    for result in feed.get('entry', []):
        url = [x['href'] for x in result['link'] if x['type'] == 'text/html']

        if not url:
            continue

        # remove tracking
        url = url[0].replace('feature=youtube_gdata', '')
        if url.endswith('&'):
            url = url[:-1]

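        # assumes the URL has the form 'https://www.youtube.com/watch?v=<id>';
        # that prefix is 32 characters long, so the remainder is the video id
        videoid = url[32:]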

        title = result['title']['$t']
        content = ''
        thumbnail = ''

        pubdate = result['published']['$t']
        publishedDate = parser.parse(pubdate)

        if 'media$thumbnail' in result['media$group']:
            thumbnail = result['media$group']['media$thumbnail'][0]['url']

        content = result['content']['$t']

        embedded = embedded_url.format(videoid=videoid)

        # append result
        results.append({'url': url,
                        'title': title,
                        'content': content,
                        'template': 'videos.html',
                        'publishedDate': publishedDate,
                        'embedded': embedded,
                        'thumbnail': thumbnail})

    # return results
    return results