mirror of
https://github.com/Mintplex-Labs/anything-llm.git
synced 2024-11-11 01:10:11 +01:00
9f33b3dfcb
### What is this folder of documents?
This folder is a temporary cache of the files produced by `collector/`. You should not add files to this folder manually. The general layout is to partition data by how it was collected; each document is added to the appropriate namespace when it is vectorized.

You can manage these files from the frontend application.

All files should be JSON files. In general there is only one required key, `pageContent`; all other keys are inserted as metadata for each document added to the vector DB.

There is also a special key, `published`, that is reserved for timestamps.
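As a rough illustration of the file format described above, here is a minimal Python sketch of what one cached document might look like and how its keys split into text and metadata. The helper `validate_document`, the `title` key, and the ISO 8601 timestamp format are assumptions made for this example; this README only specifies `pageContent` as required and `published` as reserved for timestamps.

```python
import json
from datetime import datetime, timezone


def validate_document(doc: dict) -> dict:
    """Hypothetical helper: split a cached document into text and metadata."""
    if not isinstance(doc.get("pageContent"), str):
        raise ValueError("document needs a string `pageContent` key")
    # Every key other than `pageContent` is treated as metadata.
    metadata = {k: v for k, v in doc.items() if k != "pageContent"}
    return {"text": doc["pageContent"], "metadata": metadata}


example = {
    "pageContent": "The quick brown fox.",
    "title": "fox.txt",  # hypothetical metadata key
    # Timestamp format is an assumption; the README does not specify one.
    "published": datetime(2023, 6, 1, tzinfo=timezone.utc).isoformat(),
}

# Round-trip through JSON, as the file would be stored on disk.
raw = json.dumps(example, indent=2)
parsed = validate_document(json.loads(raw))
```

A file missing `pageContent` raises `ValueError` in this sketch; how the actual application handles such files is not covered here.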