Commit Graph

11 Commits

Author SHA1 Message Date
frasergr
4079020de0
dockerfile cleanup; enforce text LF line endings (#81) 2023-06-17 20:18:01 -07:00
AntonioCiolino
e7ba028497
Enable web scraping based on a URL and a simple filter. (#73) 2023-06-16 17:29:11 -07:00
timothycarambat
81b2159329
reorder docs 2023-06-16 17:26:42 -07:00
Timothy Carambat
c4eb46ca19
Upload and process documents via UI + document processor in docker image (#65)
* implement dnd uploader
show file upload progress
write files to hotdirector
build simple flaskAPI to process files one off

* move document processor calls to util
build out dockerfile to run both procs at the same time
update UI to check for document processor before upload
* disable pragma update on boot
* dockerfile changes

* add filetype restrictions based on python app support response and show rejected files in the UI

* cleanup

* stub migrations on boot to prevent exit condition

* update CF template for AWS deploy
2023-06-16 16:01:27 -07:00
AntonioCiolino
537a6a91d2
Update __HOTDIR__.md (#70)
fixed typo for text.
2023-06-16 11:17:18 -07:00
Skid Vis
4118c9dcf3
Blocks images in sitemaps from being parsed. (#56)
* Adds ability to import sitemaps to include a website

* adds example sitemap url

* adds filter to bypass common image formats

* moves filetype ignoring to sitemap script
2023-06-14 23:00:03 -07:00
Skid Vis
bd32f97a21
Adds ability to import sitemaps to include a website (#51)
* Adds ability to import sitemaps to include a website

* adds example sitemap url
2023-06-14 11:04:17 -07:00
frasergr
9f33b3dfcb
Docker support (#34)
* Updates for Linux for frontend/server

* frontend/server docker

* updated Dockerfile for deps related to node vectordb

* updates for collector in docker

* docker deps for ODT processing

* ignore another collector dir

* storage mount improvements; run as UID

* fix pypandoc version typo

* permissions fixes
2023-06-13 11:26:11 -07:00
Fabio
d954d7a3d5
Fix pypandoc issue in requirements.txt (#23)
Co-authored-by: Carvalho, Fabio <Fabio_Carvalho@comcast.com>
2023-06-12 11:21:11 -07:00
timothycarambat
728eaff773
fix typo 2023-06-09 11:23:53 -07:00
timothycarambat
27c58541bd
initial commit 2023-06-03 19:28:07 -07:00