Commit Graph

12 Commits

Author SHA1 Message Date
Timothy Carambat
dc4ad6b5a9
[BETA] Live document sync (#1719)
* wip bg workers for live document sync

* Add ability to re-embed specific documents across many workspaces via background queue
bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled
UI for watching/unwatching docments that are embedded.
TODO: UI to easily manage all bg tasks and see run results
TODO: UI to enable this feature and background endpoints to manage it

* create frontend views and paths
Move elements to correct experimental scope

* update migration to delete runs on removal of watched document

* Add watch support to YouTube transcripts (#1716)

* Add watch support to YouTube transcripts
refactor how sync is done for supported types

* Watch specific files in Confluence space (#1718)

Add failure-prune check for runs

* create tmp workflow modifications for beta image

* create tmp workflow modifications for beta image

* create tmp workflow modifications for beta image

* dual build
update copy of alert modals

* update job interval

* Add support for live-sync of Github files

* update copy for document sync feature

* hide Experimental features from UI

* update docs links

* [FEAT] Implement new settings menu for experimental features (#1735)

* implement new settings menu for experimental features

* remove unused context save bar

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>

* dont run job on boot

* unset workflow changes

* Add persistent encryption service
Relay key to collector so persistent encryption can be used
Encrypt any private data in chunkSources used for replay during resync jobs

* update jsDOC

* Linting and organization

* update modal copy for feature

---------

Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 13:38:50 -07:00
Timothy Carambat
b23cb1a90f
Improve RAG results via chunkHeader append (#1473) 2024-05-21 14:43:39 -05:00
Timothy Carambat
cae6cee1b5
Do not go through LLM to embed when embedding documents (#1428) 2024-05-16 17:51:04 -07:00
Timothy Carambat
9655880cf0
Update all vector dbs to filter duplicate source documents that may be pinned (#1122)
* Update all vector dbs to filter duplicate parents

* cleanup
2024-04-17 18:04:39 -07:00
Timothy Carambat
24b523d5eb
append missing import for some vectordb providers (#1066) 2024-04-07 14:40:23 -07:00
Timothy Carambat
ce98ff4653
Enable customization of chunk length and overlap (#1059)
* Enable customization of chunk length and overlap

* fix onboarding link
show max limit in UI and prevent overlap >= chunk size
2024-04-06 16:38:07 -07:00
timothycarambat
718062d033 patch milvus/zilliz auto-generated collection name
resolves #1027
2024-04-03 12:34:23 -07:00
Timothy Carambat
44c71013c8
Enforce name requirements for Zilliz/Milvus (#723) 2024-02-14 13:01:05 -08:00
Sean Hatfield
56fa17caf2
create configurable topN per workspace (#616)
* create configurable topN per workspace

* Update TopN UI text
Fix fallbacks for all providers
Add SQLite CHECK to TOPN value

* merge with master
Update zilliz provider for variable TopN

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-01-18 12:34:20 -08:00
Timothy Carambat
658e7fa390
chore: Better VectorDb and Embedder error messages (#620)
* chore: propogate embedder and vectordb errors during document mutations

* add default value for errors on addDocuments
2024-01-18 11:40:48 -08:00
Timothy Carambat
d0a3f1e3e1
Fix present diminsions on vectorDBs to be inferred for providers who require it (#605) 2024-01-16 13:41:01 -08:00
Shuyoou
6faa0efaa8
Issue #543 support milvus vector db (#579)
* issue #543 support milvus vector db

* migrate Milvus to use MilvusClient instead of ORM
normalize env setup for docs/implementation
feat: embedder model dimension added

* update comments

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-01-12 13:23:57 -08:00