mirror of
https://github.com/Mintplex-Labs/anything-llm.git
synced 2024-11-05 14:30:11 +01:00
What is this folder of documents?
This is a temporary cache of the files you have collected from collector/. You should not add files to this folder manually. The general format, however, is that data is partitioned by how it was collected, and it will be added to the appropriate namespace when you vectorize it.
You can manage these files from the frontend application.
All files should be JSON files. In general there is only one required key, pageContent; all other keys are inserted as metadata for each document added to the vector DB.
There is also a special reserved key, published, which should be used only for timestamps.
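
The format described above can be sketched in Python as follows. This is only an illustration of the file shape, not the collector's actual code: the helper names (write_document, is_valid_document), the folder name web-pages, the metadata keys title and url, and the ISO 8601 timestamp format are all assumptions. It writes to a scratch directory rather than to this folder, since files here should not be added by hand.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_document(root: Path, source: str, name: str, page_content: str, **metadata) -> Path:
    """Write one document as JSON under root/<source>/<name>.json (hypothetical helper)."""
    doc = {
        **metadata,                   # every other key is treated as metadata
        "pageContent": page_content,  # the only required key
        # Reserved for timestamps; the ISO 8601 format used here is an assumption.
        "published": datetime.now(timezone.utc).isoformat(),
    }
    folder = root / source            # partition by how the data was collected
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.json"
    path.write_text(json.dumps(doc, indent=2), encoding="utf-8")
    return path


def is_valid_document(path: Path) -> bool:
    """Check the one stated requirement: a JSON object with a pageContent key."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return False
    return isinstance(data, dict) and "pageContent" in data


if __name__ == "__main__":
    # Use a scratch directory so nothing is written into the real cache folder.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_document(
            Path(tmp), "web-pages", "example",
            "Text extracted from the page.",
            title="Example page", url="https://example.com",
        )
        print(path.read_text(encoding="utf-8"))
        print("valid:", is_valid_document(path))
```

In this sketch the required and reserved keys are written after the metadata so that a stray metadata key cannot overwrite pageContent or published.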