9 Commits

Author SHA1 Message Date
Timothy Carambat
04e29203a5
Add header static class for metadata assembly (#2567)
* Add header static class for metadata assembly

* update comments

* patch header parsing for links
2024-11-04 11:47:46 -08:00
Timothy Carambat
dc4ad6b5a9
[BETA] Live document sync (#1719)
* wip bg workers for live document sync

* Add ability to re-embed specific documents across many workspaces via background queue
bgworker is gated behind an experimental system setting flag that needs to be explicitly enabled
UI for watching/unwatching documents that are embedded.
TODO: UI to easily manage all bg tasks and see run results
TODO: UI to enable this feature and background endpoints to manage it

* create frontend views and paths
Move elements to correct experimental scope

* update migration to delete runs on removal of watched document

* Add watch support to YouTube transcripts (#1716)

* Add watch support to YouTube transcripts
refactor how sync is done for supported types

* Watch specific files in Confluence space (#1718)

Add failure-prune check for runs

* create tmp workflow modifications for beta image

* create tmp workflow modifications for beta image

* create tmp workflow modifications for beta image

* dual build
update copy of alert modals

* update job interval

* Add support for live-sync of GitHub files

* update copy for document sync feature

* hide Experimental features from UI

* update docs links

* [FEAT] Implement new settings menu for experimental features (#1735)

* implement new settings menu for experimental features

* remove unused context save bar

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>

* don't run job on boot

* unset workflow changes

* Add persistent encryption service
Relay key to collector so persistent encryption can be used
Encrypt any private data in chunkSources used for replay during resync jobs

* update jsDOC

* Linting and organization

* update modal copy for feature

---------

Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 13:38:50 -07:00
Shixian Sheng
a256db132d
Fixed links (#1485)
* Update CHROMA_SETUP.md

* Update ASTRA_SETUP.md
2024-05-22 10:06:39 -05:00
Timothy Carambat
b23cb1a90f
Improve RAG results via chunkHeader append (#1473)
2024-05-21 14:43:39 -05:00
Timothy Carambat
cae6cee1b5
Do not go through LLM to embed when embedding documents (#1428)
2024-05-16 17:51:04 -07:00
Timothy Carambat
9655880cf0
Update all vector dbs to filter duplicate source documents that may be pinned (#1122)
* Update all vector dbs to filter duplicate parents

* cleanup
2024-04-17 18:04:39 -07:00
Timothy Carambat
24b523d5eb
append missing import for some vectordb providers (#1066)
2024-04-07 14:40:23 -07:00
Timothy Carambat
ce98ff4653
Enable customization of chunk length and overlap (#1059)
* Enable customization of chunk length and overlap

* fix onboarding link
show max limit in UI and prevent overlap >= chunk size
2024-04-06 16:38:07 -07:00
Hakeem Abbas
5614e2ed30
feature: Integrate Astra as vectorDBProvider (#648)
* feature: Integrate Astra as vectorDBProvider

feature: Integrate Astra as vectorDBProvider

* Update .env.example

* Add env.example to docker example file
Update spellcheck for Astra
Update Astra key for vector selection
Update order of AstraDB options
Resize Astra logo image to 330x330
Update methods of Astra to take in latest vectorDB params like TopN and more
Update Astra interface to support default methods and avoid crash errors from 404 collections
Update Astra interface to comply with max chunk insertion limitations
Update Astra interface to dynamically set dimensionality from chunk 0 size on creation

* reset workspaces

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-01-26 13:07:53 -08:00