anything-llm/collector/utils
Sean Hatfield 612a7e1662
[FEAT] Website depth scraping data connector (#1191)
* WIP website depth scraping, (sort of works)

* website depth data connector stable + add maxLinks option

* linting + loading small ui tweak

* refactor website depth data connector for stability, speed, & readability

* patch: remove console log
Guard clause on URL validity check
reasonable overrides

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2024-05-14 12:49:14 -07:00
comKey | patch comkey path to fallback | 2024-04-04 10:47:26 -07:00
extensions | [FEAT] Website depth scraping data connector (#1191) | 2024-05-14 12:49:14 -07:00
files | patch file types as plaintext (#1095) | 2024-04-12 14:54:33 -07:00
http | Document Processor v2 (#442) | 2023-12-14 15:14:56 -08:00
tokenizer | Document Processor v2 (#442) | 2023-12-14 15:14:56 -08:00
url | Prevent private octets from link collection for self-hosted (#626) | 2024-01-19 10:49:40 -08:00
WhisperProviders | duplicate key (no impact) | 2024-05-02 13:05:20 -07:00
constants.js | Add epub support for parsing (#1017) | 2024-04-02 14:25:52 -07:00