Adding a CR in some files and in others not, is a good starting point for a
DOS+Unix mess we all have already seen in many projects.
Patch fixes all files matching (even those comming from grunt's build)::
find ./searx -exec file {} \; | grep CR
BTW: Same with mixing TAB and SPACE indent styles in one and the same file. So
if sources are tuched here in this patch, its also fixed.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Fix this error while travis build::
/home/travis/build/asciimoo/searx/searx/engines/duckduckgo_definitions.py:21:44: E225 missing whitespace around operator
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Add image format and source information to display - needs changes to engines to actually display something.
Displays result.source (website from which the image was taken) and result.img_format (image type and size).
Result is styled with result-format and result-source classes. See PR #1566 for an example of an engine which has the necessary changes.
Strip <span class="highlight">...</span> in the oscar image template.
This PR fixes the result count from bing which was throwing an (hidden) error and add a validation to avoid reading more results than avalaible.
For example :
If there is 100 results from some search and we try to get results from 120 to 130, Bing will send back the results from 0 to 10 and no error. If we compare results count with the first parameter of the request we can avoid this "invalid" results.
The new url parameter "timeout_limit" set timeout limit defined in second.
Example "timeout_limit=1.5" means the timeout limit is 1.5 seconds.
In addition, the query can start with <[number] to set the timeout limit.
For number between 0 and 99, the unit is the second :
Example: "<30 searx" means the timeout limit is 3 seconds
For number above 100, the unit is the millisecond:
Example: "<850 searx" means the timeout is 850 milliseconds.
In addition, there is a new optional setting: outgoing.max_request_timeout.
If not set, the user timeout can't go above searx configuration (as before: the max timeout of selected engine for a query).
If the value is set, the user can set a timeout between 0 and max_request_timeout using
<[number] or timeout_limit query parameter.
Related to #1077
Updated version of PR #1413 from @isj-privacore
Characters that were not ASCII were incorrectly decoded.
Add an helper function: searx.utils.ecma_unescape (Python implementation of unescape Javascript function).
* Search URL is https://www.wikidata.org/w/index.php?{query}&ns0=1 (with ns0=1 at the end to avoid an HTTP redirection)
* url_detail: remove the disabletidy=1 deprecated parameter
* Add eval_xpath function: compile once for all xpath.
* Add get_id_cache: retrieve all HTML with an id, avoid the slow to procress dynamic xpath '//div[@id="{propertyid}"]'.replace('{propertyid}')
* Create an etree.HTMLParser() instead of using the global one (see #1575)
Fetch complete JSON data block, use legend to extract images.
Unquote urlencoded strings.
Add image description as 'content'.
Add 'img_format' and 'source' data (needs PR #1567 to enable this data to be displayed).
Show images which lack ownerid instead of discarding them.
use data from embedded JSON to improve results (e.g. real page title), add image format and source info (see PR #1567), improve paging logic (it now works)
Server Timing specification: https://www.w3.org/TR/server-timing/
In the browser Dev Tools, focus on the main request, there are the responses per engine in the Timing tab.
doi_resolvers / default_doi_resolver were missing in the settings_robots.yml file, so the test server was not able to start (crash). Since the output wasn't displayed, it was not obvious why the Selenium couldn't connect to searx.
Locale and search language was always defined with english value.
This patch inits the locale on `pre_request` in order to define the
default value of locale and language preferences.
Plus the `best_match` function provided by flask babel library did not
work as expected. So the function `match_language` provided
by searx is used to detect that the language from Accepted-Language
header can be used in searx project.