1
0
mirror of https://github.com/searxng/searxng.git synced 2024-11-18 02:10:12 +01:00
searxng/searx/botdetection/http_accept_encoding.py
Markus Heiser fd814aac86 [mod] isolation of botdetection from the limiter
This patch was inspired by the discussion around PR-2882 [2].  The goals of this
patch are:

1. Convert plugin searx.plugin.limiter to normal code [1]
2. isolation of botdetection from the limiter [2]
3. searx/{tools => botdetection}/config.py and drop searx.tools
4. in URL /config, 'limiter.enabled' is true only if the limiter is really
   enabled (Redis is available).

This patch moves all the code that belongs to botdetection into namespace
searx.botdetection and code that belongs to limiter is placed in namespace
searx.limiter.

Tthe limiter used to be a plugin at some point botdetection was added, it was
not a plugin.  The modularization of these two components was long overdue.
With the clear modularization, the documentation could then also be organized
according to the architecture.

[1] https://github.com/searxng/searxng/pull/2882
[2] https://github.com/searxng/searxng/pull/2882#issuecomment-1741716891

To test:

- check the app works without the limiter, check `/config`
- check the app works with the limiter and with the token, check `/config`
- make docs.live .. and read
  - http://0.0.0.0:8000/admin/searx.limiter.html
  - http://0.0.0.0:8000/src/searx.botdetection.html#botdetection

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-01 06:44:56 +01:00

42 lines
1.1 KiB
Python

# SPDX-License-Identifier: AGPL-3.0-or-later
# lint: pylint
"""
Method ``http_accept_encoding``
-------------------------------
The ``http_accept_encoding`` method evaluates a request as the request of a
bot if the Accept-Encoding_ header ..
- did not contain ``gzip`` AND ``deflate`` (if both values are missed)
- did not contain ``text/html``
.. _Accept-Encoding:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Encoding
"""
# pylint: disable=unused-argument
from __future__ import annotations
from ipaddress import (
IPv4Network,
IPv6Network,
)
import flask
import werkzeug
from . import config
from ._helpers import too_many_requests
def filter_request(
network: IPv4Network | IPv6Network,
request: flask.Request,
cfg: config.Config,
) -> werkzeug.Response | None:
accept_list = [l.strip() for l in request.headers.get('Accept-Encoding', '').split(',')]
if not ('gzip' in accept_list or 'deflate' in accept_list):
return too_many_requests(network, "HTTP header Accept-Encoding did not contain gzip nor deflate")
return None