anything-llm/collector
Sean Hatfield a87014822a
[REFACTOR] Improve asPDF collector processor with pdfjs (#1791)
* WIP replace langchain pdfloader with pdfjs and add more context to each page

* remove extras from pdfjs and just replace langchain library

* remove unneeded dep

* fix console log in docs

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-07-03 14:26:48 -07:00
extensions [BETA] Live document sync (#1719) 2024-06-21 13:38:50 -07:00
hotdir Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
middleware [BETA] Live document sync (#1719) 2024-06-21 13:38:50 -07:00
processLink Agent support for @agent default agent inside workspace chat (#1093) 2024-04-16 10:50:10 -07:00
processRawText Add support to upload rawText document via api (#692) 2024-02-07 15:17:32 -08:00
processSingleFile [REFACTOR] Improve asPDF collector processor with pdfjs (#1791) 2024-07-03 14:26:48 -07:00
storage feat: Embed on-instance Whisper model for audio/mp4 transcribing (#449) 2023-12-15 11:20:13 -08:00
utils [FIX] Confluence code snippet blocks not being extracted (#1804) 2024-07-03 14:00:44 -07:00
.env.example devcontainer v1 (#297) 2024-01-08 15:31:06 -08:00
.gitignore Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
.nvmrc Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
index.js Agent support for @agent default agent inside workspace chat (#1093) 2024-04-16 10:50:10 -07:00
nodemon.json Document Processor v2 (#442) 2023-12-14 15:14:56 -08:00
package.json [REFACTOR] Improve asPDF collector processor with pdfjs (#1791) 2024-07-03 14:26:48 -07:00
yarn.lock [REFACTOR] Improve asPDF collector processor with pdfjs (#1791) 2024-07-03 14:26:48 -07:00