paging is broken in searchcode.com's API .. not sure it will ever been fixed /
this commit disables paging in the engine and BTW pylint `searchcode_code.py`.
Closes: https://github.com/searxng/searxng/issues/3287
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Parse the result list from ask.com given in the variable named
window.MESON.initialState::
<script nonce="..">
window.MESON = window.MESON || {};
window.MESON.initialState = {"siteConfig": ...
...}};
window.MESON.loadedLang = "en";
</script>
The result list is in field::
json_resp['search']['webResults']['results']
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In Presearch there are languages for the UI and regions for narrowing down the
search. With this change the SearXNG engine supports a search by region. The
details can be found in the documentation of the source code.
To test, you can search terms like::
!presearch bmw :zh-TW
!presearch bmw :en-CA
1. You should get results corresponding to the region (Taiwan, Canada)
2. and in the language (Chinese, Englisch).
3. The context in info box content is in the same language.
Exceptions:
1. Region or language is not supported by Presearch or
2. SearXNG user did not selected a region tag, example::
!presearch bmw :en
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
DDG's bot detection is sensitive to the vqd value. For some search terms (such
as extremely long search terms that are often sent by bots), no vqd value can be
determined.
If SearXNG cannot determine a vqd value, then no request should go out to
DDG (WEB): a request with a wrong vqd value leads to DDG temporarily putting
SearXNG's IP on a block list.
Requests from IPs in this block list run into timeouts.
Not sure, but it seems the block list is a sliding window: to get my IP rid from
the bot list I had to cool down my IP for 1h (send no requests from that IP to
DDG).
Since such issues can't reproduce in a local instance I tested this patch 24h on
my public SearXNG instance: There are still errors (rare), but the reliability
is still 100%.
Related:
- https://github.com/searxng/searxng/pull/2922
- https://github.com/searxng/searxng/pull/2923
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Some search terms do not have results and therefore no vqd value
BTW: remove a leftover from 9197efa
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
We have had problems with this before, the bot protection from ddg-lite seems to
have included this referer in the rating [1][2].
From reverse engineering:
- The Referer ``https://google.com/`` was set in commt 257dc7d6c4 --> DDG lite
does not like this referer anymore!
- The 'Referer' header is only set on second and follow up pages but not on the
first page
- The vqd value is not needed on the first page, the ddg-lite client sets this
value only on follow up pages / this can help to reduce the vqd requests from
SearXNG.
Related to 'Referer' header & ddg requests:
[1] https://github.com/searxng/searxng/pull/2161
[2] https://github.com/searxng/searxng/pull/2081
Closes: https://github.com/searxng/searxng/issues/2796
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>