searxng/searx/engines/kickass.py

"""
 Kickass Torrent (Videos, Music, Files)

 @website     https://kickass.cd
 @provide-api no (nothing found)

 @using-api   no
 @results     HTML (using search portal)
 @stable      yes (HTML can change)
 @parse       url, title, content, seed, leech, filesize, files,
              magnetlink, torrentfile
"""

# Python 3 locations of urljoin, quote and escape
# (formerly urlparse.urljoin, urllib.quote and cgi.escape)
from urllib.parse import quote, urljoin
from html import escape
from lxml import html
from operator import itemgetter
from searx.engines.xpath import extract_text
from searx.utils import get_torrent_size, convert_str_to_int

# engine dependent config
categories = ['videos', 'music', 'files']
paging = True

# search-url
url = 'https://kickass.cd/'
search_url = url + 'search/{search_term}/{pageno}/'

# specific xpath variables
magnet_xpath = './/a[@title="Torrent magnet link"]'
torrent_xpath = './/a[@title="Download torrent file"]'
content_xpath = './/span[@class="font11px lightgrey block"]'


# do search-request
def request(query, params):
    params['url'] = search_url.format(search_term=quote(query),
                                      pageno=params['pageno'])

    return params


# get response from search-request
def response(resp):
    results = []

    dom = html.fromstring(resp.text)

    search_res = dom.xpath('//table[@class="data"]//tr')

    # return an empty list if the results table is missing
    if not search_res:
        return []

    # parse results
    for result in search_res[1:]:
        link = result.xpath('.//a[@class="cellMainLink"]')[0]
        href = urljoin(url, link.attrib['href'])
        title = extract_text(link)
        content = escape(extract_text(result.xpath(content_xpath)))
        seed = extract_text(result.xpath('.//td[contains(@class, "green")]'))
        leech = extract_text(result.xpath('.//td[contains(@class, "red")]'))
        filesize_info = extract_text(result.xpath('.//td[contains(@class, "nobr")]'))
        files = extract_text(result.xpath('.//td[contains(@class, "center")][2]'))

        seed = convert_str_to_int(seed)
        leech = convert_str_to_int(leech)

        filesize, filesize_multiplier = filesize_info.split()
        filesize = get_torrent_size(filesize, filesize_multiplier)
        if files.isdigit():
            files = int(files)
        else:
            files = None

        magnetlink = result.xpath(magnet_xpath)[0].attrib['href']

        torrentfile = result.xpath(torrent_xpath)[0].attrib['href']
        torrentfileurl = quote(torrentfile, safe="%/:=&?~#+!$,;'@()*")

        # append result
        results.append({'url': href,
                        'title': title,
                        'content': content,
                        'seed': seed,
                        'leech': leech,
                        'filesize': filesize,
                        'files': files,
                        'magnetlink': magnetlink,
                        'torrentfile': torrentfileurl,
                        'template': 'torrent.html'})

    # return results sorted by seeder count, highest first
    return sorted(results, key=itemgetter('seed'), reverse=True)