* Add support for fetching single document in documents folder
* Add document object to upload + support link scraping via API
* hotfixes for documentation
* update api docs
* feat: implement github repo loading
fix: purge of folders
fix: rendering of sub-files
* noshow delete on custom-documents
* Add API key support because of rate limits
* WIP for frontend of data connectors
* wip
* Add frontend form for GitHub repo data connector
* remove console.logs
block custom-documents from being deleted
* remove _meta unused arg
* Add support for ignore pathing in request
Ignore path input via tagging
* Update hint
* feat: Embed on-instance Whisper model for audio/mp4 transcribing
resolves#329
* additional logging
* add placeholder for tmp folder in collector storage
Add cleanup of hotdir and tmp on collector boot to prevent hanging files
split loading of model and file conversion into concurrency
* update README
* update model size
* update supported filetypes
* wip: init refactor of document processor to JS
* add NodeJs PDF support
* wip: partity with python processor
feat: add pptx support
* fix: forgot files
* Remove python scripts totally
* wip:update docker to boot new collector
* add package.json support
* update dockerfile for new build
* update gitignore and linting
* add more protections on file lookup
* update package.json
* test build
* update docker commands to use cap-add=SYS_ADMIN so web scraper can run
update all scripts to reflect this
remove docker build for branch