All favicons implementations have been documented and moved to the Python
package:
searx.favicons
There is a configuration (based on Pydantic) for the favicons and all its
components:
searx.favicons.config
A solution for caching favicons has been implemented:
searx.favicon.cache
If the favicon is already in the cache, the returned URL is a data URL [1]
(something like `data:image/png;base64,...`). By generating a data url from
the FaviconCache, additional HTTP roundtripps via the favicon_proxy are saved:
favicons.proxy.favicon_url
The favicon proxy service now sets a HTTP header "Cache-Control: max-age=...":
favicons.proxy.favicon_proxy
The resolvers now also provide the mime type (data, mime):
searx.favicon.resolvers
[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The intention of this PR is to modernize the settings_loader implementations.
The concept is old (remember, this is partly from 2014), back then we only had
one config file, meanwhile we have had a folder with config files for a very
long time. Callers can now load a YAML configuration from this folder as
follows ::
settings_loader.get_yaml_cfg('my-config.yml')
- BTW this is a fix of #3557.
- Further the `existing_filename_or_none` construct dates back to times when
there was not yet a `pathlib.Path` in all Python versions we supported in the
past.
- Typehints have been added wherever appropriate
At the same time, this patch should also be downward compatible and not
introduce a new environment variable. The localization of the folder with the
configurations is further based on:
SEARXNG_SETTINGS_PATH (wich defaults to /etc/searxng/settings.yml)
Which means, the default config folder is `/etc/searxng/`.
ATTENTION: intended functional changes!
If SEARXNG_SETTINGS_PATH was set and pointed to a not existing file, the
previous implementation silently loaded the default configuration. This
behavior has been changed: if the file or folder does not exist, an
EnvironmentError exception will be thrown in future.
Closes: https://github.com/searxng/searxng/issues/3557
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In the past, some files were tested with the standard profile, others with a
profile in which most of the messages were switched off ... some files were not
checked at all.
- ``PYLINT_SEARXNG_DISABLE_OPTION`` has been abolished
- the distinction ``# lint: pylint`` is no longer necessary
- the pylint tasks have been reduced from three to two
1. ./searx/engines -> lint engines with additional builtins
2. ./searx ./searxng_extra ./tests -> lint all other python files
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The URL was accidentally deleted in a85907a98, but is still required in
base.html for auto-discovery / from base.html::
<link title="{{ instance_name }}"
type="application/opensearchdescription+xml"
rel="search" href="{{ opensearch_url }}"
/>
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This patch was inspired by the discussion around PR-2882 [2]. The goals of this
patch are:
1. Convert plugin searx.plugin.limiter to normal code [1]
2. isolation of botdetection from the limiter [2]
3. searx/{tools => botdetection}/config.py and drop searx.tools
4. in URL /config, 'limiter.enabled' is true only if the limiter is really
enabled (Redis is available).
This patch moves all the code that belongs to botdetection into namespace
searx.botdetection and code that belongs to limiter is placed in namespace
searx.limiter.
Tthe limiter used to be a plugin at some point botdetection was added, it was
not a plugin. The modularization of these two components was long overdue.
With the clear modularization, the documentation could then also be organized
according to the architecture.
[1] https://github.com/searxng/searxng/pull/2882
[2] https://github.com/searxng/searxng/pull/2882#issuecomment-1741716891
To test:
- check the app works without the limiter, check `/config`
- check the app works with the limiter and with the token, check `/config`
- make docs.live .. and read
- http://0.0.0.0:8000/admin/searx.limiter.html
- http://0.0.0.0:8000/src/searx.botdetection.html#botdetection
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Over the years the webapp module became more and more a mess. To improve the
modulaization a little this patch moves some implementations from the webapp
module to webutils module.
HINT: this patch brings non functional change
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In order to be able to meet the outstanding requirements, the implementation is
modularized and supplemented with documentation.
This patch does not contain functional change, except it fixes issue #2455
----
Aktivate limiter in the settings.yml and simulate a bot request by::
curl -H 'Accept-Language: de-DE,en-US;q=0.7,en;q=0.3' \
-H 'Accept: text/html'
-H 'User-Agent: xyz' \
-H 'Accept-Encoding: gzip' \
'http://127.0.0.1:8888/search?q=foo'
In the LOG:
DEBUG searx.botdetection.link_token : missing ping for this request: .....
Since ``BURST_MAX_SUSPICIOUS = 2`` you can repeat the query above two time
before you get a "Too Many Requests" response.
Closes: https://github.com/searxng/searxng/issues/2455
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
To set the language from language recognition and hold the value selected by the
client, the previous implementation creates a copy of the SearchQuery object and
manipulates the SearchQuery object by calling function replace_auto_language().
This patch tries to implement a similar functionality in a more central place,
in function get_search_query_from_webapp() when the SearchQuery object is build
up.
Additional this patch uses the language preferred by the client, if language
recognition does not have a match / the existing implementation does not care
about client preferences and uses 'all' in case of no match.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This PR does no functional change it is just an attempt to make more clear in
the code, what a default category is and what a subcategory is. The previous
name 'others' leads to confusion with the **category 'other'**.
If a engine is not assigned to a category, the default is assigned::
DEFAULT_CATEGORY = 'other'
If an engine has only one category and this category is shown as tab in the user
interface, this engine has no further subgrouping::
NO_SUBGROUPING = 'without further subgrouping'
Related:
- https://github.com/searxng/searxng/issues/1604
- https://github.com/searxng/searxng/pull/1545
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This patch replaces the *full of magic* ``utils.match_language`` function by a
``locales.match_locale``. The ``locales.match_locale`` function is based on the
``locales.build_engine_locales`` introduced in 9ae409a0 [1].
In the past SearXNG did only support a search by a language but not in a region.
This has been changed a long time ago and regions have been added to SearXNG
core but not to the engines. The ``utils.match_language`` was the function to
handle the different aspects of language/regions in SearXNG core and the
supported *languages* in the engine. The ``utils.match_language`` did it with
some magic and works good for most use cases but fails in some edge case.
To replace the concurrence of languages and regions in the SearXNG core the
``locales.build_engine_locales`` was introduced in 9ae409a0 [1]. With the last
patches all engines has been migrated to a ``fetch_traits`` and a
language/region concept that is based on ``locales.build_engine_locales``.
To summarize: there is no longer a need for the ``locales.match_language``.
[1] https://github.com/searxng/searxng/pull/1652
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
All engines has been migrated from ``supported_languages`` to the
``fetch_traits`` concept. There is no longer a need for the obsolete code that
implements the ``supported_languages`` concept.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
With the language and region tags from the EngineTraitsMap the handling of
SearXNG's tags of languages and regions has been normalized and is no longer
a *mystery*. The "languages" became "locales" that are supported by babel and
by this, the update_engine_traits.py can be simplified a lot.
Other code places can be simplified as well, but these simplifications
should (respectively can) only be done when none of the engines work with the
deprecated EngineTraits.supported_languages interface anymore.
This commit replaces searx.languages by searx.sxng_locales and fix the naming of
some names from "language" to "locale" (e.g. language_codes --> sxng_locales).
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Implementations of the *traits* of the engines.
Engine's traits are fetched from the origin engine and stored in a JSON file in
the *data folder*. Most often traits are languages and region codes and their
mapping from SearXNG's representation to the representation in the origin search
engine.
To load traits from the persistence::
searx.enginelib.traits.EngineTraitsMap.from_data()
For new traits new properties can be added to the class::
searx.enginelib.traits.EngineTraits
.. hint::
Implementation is downward compatible to the deprecated *supported_languages
method* from the vintage implementation.
The vintage code is tagged as *deprecated* an can be removed when all engines
has been ported to the *traits method*.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
When the user choose "Auto-detected", the choice remains on the following queries.
The detected language is displayed.
For example "Auto-detected (en)":
* the next query language is going to be auto detected
* for the current query, the detected language is English.
This replace the autodetect_search_language plugin.
* Move the datetime to str code from searx.webapp.search to searx.webutils.searxng_format_date
* When the month, day, hour, day and second are zero, the function returns only the year.
Some HTTP-Clients do have issues with the ``opensearch.xml`` from SearXNG
(related [1][2]) while other OpenSearch descriptions[3] (e.g. from qwant) work
flawles.
Inspired by the OpenSearch description from qwant and with informations from the
specification[4] the ``opensearch.xml`` has been *improved*.
- convert `<Url>` methods from lower case to upper case (`POST`|`GET`)
- add `<moz:SearchForm>` and `xmlns:moz="http://www.mozilla.org/2006/browser/search/"`
- add `<Query role="example" searchTerms="SearXNG" />` [4]
OpenSearch description documents should include at least one Query element of
`role="example"` that is expected to return search results. Search clients may
use this example query to validate that the search engine is working properly.
- modified `<LongName>` to SearXNG
- modified `<Description>` the word 'hackable' scares uninitiated users and was removed
- add the `type="image/png"` to `<Image>`
Test can be done by::
make run
Visit http://127.0.0.1:8888/ and add the search engine to your WEB-Browser /
test with different WEB-Browser from desktop and Smartphones (are there any iOS
user here, please test on Safari and Chrome).
[1] https://app.element.io/#/room/#searxng:matrix.org/$xN_abdKhNqUlgXRBrb_9F3pqOxnSzGQ1TG0s0G9hQVw
[2] https://github.com/searxng/searxng/issues/431
[3] https://developer.mozilla.org/en-US/docs/Web/OpenSearch
[4] https://github.com/dewitt/opensearch/blob/master/opensearch-1-1-draft-6.md#the-query-element
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The errors make pyright usage useless since a new error won't be seen [1].
[1] https://github.com/searxng/searxng/pull/1569
```
searx/compat.py:11:27 - error: Expression of type "Type[cached_property[_T@cached_property]]" cannot be assigned to declared type "Type[cached_property]"
"Type[cached_property[_T@cached_property]]" is incompatible with "Type[cached_property]"
Type "Type[cached_property[_T@cached_property]]" cannot be assigned to type "Type[cached_property]" (reportGeneralTypeIssues)
searx/utils.py:69:36 - error: Expression of type "None" cannot be assigned to parameter of type "str"
Type "None" cannot be assigned to type "str" (reportGeneralTypeIssues)
searx/utils.py:573:85 - error: Expression of type "None" cannot be assigned to parameter of type "int"
Type "None" cannot be assigned to type "int" (reportGeneralTypeIssues)
searx/webapp.py:1306:22 - error: Argument of type "str" cannot be assigned to parameter "__a" of type "BytesPath" in function "join"
Type "str" cannot be assigned to type "BytesPath"
"str" is incompatible with "bytes"
"str" is incompatible with protocol "PathLike[bytes]"
"__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:68 - error: Argument of type "Literal['themes']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
Type "Literal['themes']" cannot be assigned to type "BytesPath"
"Literal['themes']" is incompatible with "bytes"
"Literal['themes']" is incompatible with protocol "PathLike[bytes]"
"__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:78 - error: Argument of type "str | Any | None" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
Type "str | Any | None" cannot be assigned to type "BytesPath"
Type "str" cannot be assigned to type "BytesPath"
"str" is incompatible with "bytes"
"str" is incompatible with protocol "PathLike[bytes]"
"__fspath__" is not present (reportGeneralTypeIssues)
searx/webapp.py:1306:85 - error: Argument of type "Literal['img']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
Type "Literal['img']" cannot be assigned to type "BytesPath"
"Literal['img']" is incompatible with "bytes"
"Literal['img']" is incompatible with protocol "PathLike[bytes]"
"__fspath__" is not present (reportGeneralTypeIssues)
searx/engines/mongodb.py:8:6 - warning: Import "pymongo" could not be resolved (reportMissingImports)
searx/engines/mysql_server.py:9:8 - warning: Import "mysql.connector" could not be resolved (reportMissingImports)
searx/engines/postgresql.py:9:8 - warning: Import "psycopg2" could not be resolved from source (reportMissingModuleSource)
searx/engines/xpath.py:187:28 - warning: "categories" is not defined (reportUndefinedVariable)
searx/search/__init__.py:184:82 - warning: "flask" is not defined (reportUndefinedVariable)
searx/search/checker/background.py:19:26 - error: Type of "schedule" is partially unknown
Type of "schedule" is "(delay: Any, func: Any, *args: Any) -> Literal[True]" (reportUnknownVariableType)
searx/shared/__init__.py:8:12 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
searx/shared/shared_uwsgi.py:5:8 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
```