Use httpx.Response.json() to avoid charset_normalizer issues:
DEBUG charset_normalizer : override steps (5) and chunk_size (512) as content does not fit (153 byte(s) given) parameters.
INFO charset_normalizer : ascii passed initial chaos probing. Mean measured chaos is 0.000000 %
DEBUG charset_normalizer : ascii should target any language(s) of ['Latin Based']
INFO charset_normalizer : ascii is most likely the one. Stopping the process.
[1] https://www.python-httpx.org/api/#response
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Currently we have two kinds of user documentation:
* the about page[1] which is written in HTML and part of the web
application and can therefore link instance-specific pages
(like e.g. the preferences) via Jinja variables
* the Sphinx documentation[2] which is written in reStructuredText
and cannot link instance-specific pages since it doesn't know
which instance the user is using
The plan is to integrate the user documentation currently in Sphinx
into the application, so that it can also link instance specific pages.
We also want to enable the user documentation to be translated.
This commit implements the first step in this endeavor (see #722).
[1]: searx/templates/__common__/about.html
[2]: docs/user/ (currently served at https://docs.searxng.org/user/)
Since https://github.com/searxng/searxng/pull/354
the searx.network.stream(...) returns a tuple
This commits update the checker code according to
this function signature change.
webapp.py monkey-patches the Flask request global.
This commit adds a type cast so that e.g. Pyright[1]
doesn't show "Cannot access member" errors everywhere.
[1]: https://github.com/microsoft/pyright
* mirror all inline SVGs so that direction SVGs display correctly on RTL
* set the bold list element in info box to RTL so the colon gets displayed on the right side
* set correct .ltr function for the left border on the search button in #q
* move text to the right in autocomplete
* move search form in lign with result article on RTL
* add the correct padding for img thumbnails in categories like music on RTL
* apply RTL to result table for map results
* align text in tables part of /preferences on RTL
* move burger menu on index page to the left on RTL
* fix positioning of drop down arrow on select boxes on RTL
* align result URL on the right (written LTR)
* align vim hotkeys help on the left since it is not translated
* image detail:
* labels (author, format, URL, etc...) are written on the right,
values are on the left.
* URL are written LTR and overflow on the right
The less grunt runner silently ignore missing files and continue with the build[1]::
Running "less:production" (less) task
>> Destination css/searxng.min.css not written because no source files were found.
>> 1 stylesheet created.
>> 1 sourcemap created.
Add filter function that calls grunt.fail() if the scr file does not exists.
[1] https://github.com/searxng/searxng/pull/750#discussion_r784357031
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Bangs with a `*` suffix (e.g. `!!d*`) overwrite Bangs with the same
prefix (e.g. `!!d`) [1]. This can be avoid when a non printable character is
used to tag a LEAF_KEY.
[1] https://github.com/searxng/searxng/pull/740#issuecomment-1010411888
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
There is an issue with redis v4.1.0 [1] / for the interim lets remove this
python dependency.
[1] https://github.com/searxng/searxng/issues/741
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
An ambiguous bang like `!!d` raises an exception in function get_bang_url(). A
bang is only unique when the bang_definition from get_bang_definition_and_ac() is
a string / for a ambiguous bang the returned bang_definition is a dictionary.
Reported-by: user prg at #searxng:matrix.org on 2022/01/11
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In case of CAPTCHA raise a SearxEngineCaptchaException and suspend for 7 days.
When get_sc_code() fails raise a SearxEngineResponseException and suspend for 7
days.
[1] https://github.com/searxng/searxng/pull/695
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Startpage has introduced new anti-scraping measures that make SearXNG instances
run into captchas:
1. some arguments has been removed and a new `sc` has been added.
2. search path changed from `do/search` to `sp/search`
3. POST request is no longer needed
Closes: https://github.com/searxng/searxng/issues/692
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
api.openverse.engineering is a little picky and wants to have a trailing slash
in the path:
/v1/images? -->/ v1/images/?
otherwise it redirects, here is the debug log:
DEBUG searx.network.openverse : HTTP Request: GET https://api.openverse.engineering/v1/images?&page=1&page_size=20&format=json&q=foo "HTTP/2 301 Moved Permanently" (text/html; charset=utf-8)
DEBUG searx.network.openverse : HTTP Request: GET https://api.openverse.engineering/v1/images/?&page=1&page_size=20&format=json&q=foo "HTTP/2 200 OK" (application/json)
WARNING searx.engines.openverse : ErrorContext('searx/search/processors/online.py', 105, 'count_error(', None, '1 redirects, maximum: 0', ('200', 'OK', 'api.openverse.engineering')) True
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The implementation of the etools engine is poor. No date-range support, no
language support and it is broken by a CAPTCHA.
etools is a metasearch engine, the major search engines it supports (google,
bing, wikipedia, Yahoo) are already available in SeaarXNG.
While etools does support several engines we currently don't support directly,
support for them should be added directly to SearXNG if there is demand.
In practice: in SearXNG the worse etools results will be mixed with good results
from other engines we have (as long as there is no captcha).
At best case, what we win with etools is in e.g. results from de.ask.com in a
query from a german request .. in all other cases worse results are bubble up in
SearXNG's result list.
[1] https://github.com/searxng/searxng/issues/696#issuecomment-1005855499
Closes: https://github.com/searxng/searxng/issues/696
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The previous implementation used two hash sets and a list.
... that's not necessary ... a single hash map suffices.
And it's also less error prone ... because the previous data structure
allowed a setting to be enabled and disabled at the same time.
Previously the default_value was abused for the cookie name.
Having SwitchableSetting subclass Setting doesn't even make sense
in the first place since none of the Setting methods apply.
The ? search operator has been broken for some time and
currently only raises the question why it's still there.
## Context ##
The query "Paris !images" searches for "Paris" in the "images" category.
Once upon a time Searx supported "Paris ?images" to search for "Paris"
in the currently enabled categories and the "images" category.
The feature makes sense ... the ? syntax does not.
We will hopefully introduce a +!images syntax in the future.
Fixes#702.
* allow not to record metrics (response time, etc...)
* this commit doesn't change the UI. If the metrics are disabled
/stats and /stats/errors will return empty response.
in /preferences, the columns response time and reliability will be empty.
The tab icon names are currently hard coded in the templates.
This commit lets us introduce an icon property in the future, e.g:
categories_as_tabs:
general:
icon: search-outline
These dictionaries are no longer part of the general category,
so they're no longer queried by default -> we can enable them
by default without degrading general query performance.
The general category is the category that is searched by default.
From a privacy standpoint it doesn't make sense to send all general
queries to specialized search engines that cannot deal with those
queries anyway.
Previously we didn't have a good place to put search engines that don't
fit into any of the tab categories. This commit automatically puts
search engines that don't belong to any tab category in an "other"
category, that is only displayed in the user preferences (and not above
search results).