anything-llm

mirror of https://github.com/Mintplex-Labs/anything-llm.git synced 2024-11-19 12:40:09 +01:00

Author	SHA1	Message	Date
Timothy Carambat	04e29203a5	Add header static class for metadata assembly (#2567 ) * Add header static class for metadata assembly * update comments * patch header parsing for links	2024-11-04 11:47:46 -08:00
Sean Hatfield	a58f271149	Milvus bug fix (#2183 ) * patch no text results for milvus chunks * wrap addDocumentToNamespace in try catch for handling milvus errors * lint * revert milvus db changes * add try catch to handle grpc error from milvus	2024-09-09 15:32:08 -07:00
Timothy Carambat	9bd65f1567	[CHORE] Migration from `vectordb` to @lancedb/lancedb NodeJS SDK (#1766 ) WIP on migration to @lancedb/lancedb NodeJS SDK	2024-06-26 21:57:16 -07:00
Timothy Carambat	dc4ad6b5a9	[BETA] Live document sync (#1719 ) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>	2024-06-21 13:38:50 -07:00
Sean Hatfield	1b8386b079	[FIX] ChromaDB namespace normalization (#1625 ) * chromadb namespace normalization * update normalization function with more clarity --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2024-06-06 15:38:05 -07:00
Anush	771889ad7f	[FIX] Incorrect vectors count with Qdrant (#1561 ) Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2024-06-06 13:18:01 -07:00
Shixian Sheng	a256db132d	Fixed links (#1485 ) * Update CHROMA_SETUP.md * Update ASTRA_SETUP.md	2024-05-22 10:06:39 -05:00
Timothy Carambat	b23cb1a90f	Improve RAG results via chunkHeader append (#1473 )	2024-05-21 14:43:39 -05:00
Timothy Carambat	cae6cee1b5	Do not go through LLM to embed when embedding documents (#1428 )	2024-05-16 17:51:04 -07:00
Timothy Carambat	94017e2b51	bump langchain deps (#1231 ) * bump langchain deps * patch native and ollama providers remove deprecated deps --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com>	2024-04-30 12:04:24 -07:00
Timothy Carambat	ca63012c0f	bump lancedb dep (#1229 )	2024-04-29 09:52:22 -07:00
Timothy Carambat	9655880cf0	Update all vector dbs to filter duplicate source documents that may be pinned (#1122 ) * Update all vector dbs to filter duplicate parents * cleanup	2024-04-17 18:04:39 -07:00
Timothy Carambat	24b523d5eb	append missing import for some vectordb providers (#1066 )	2024-04-07 14:40:23 -07:00
Timothy Carambat	ce98ff4653	Enable customization of chunk length and overlap (#1059 ) * Enable customization of chunk length and overlap * fix onboarding link show max limit in UI and prevent overlap >= chunk size	2024-04-06 16:38:07 -07:00
timothycarambat	718062d033	patch milvus/zilliz auto-generated collection name resolves #1027	2024-04-03 12:34:23 -07:00
Gabriel Koo	4731ec8be8	[FIX] : missing import for `parseAuthHeader` in `server/utils/vectorDbProviders/chroma/index.js` (#869 ) fix: import parseAuthHeader in chroma/index.js	2024-03-06 09:14:36 -08:00
Timothy Carambat	44c71013c8	Enforce name requirements for Zilliz/Milvus (#723 )	2024-02-14 13:01:05 -08:00
Timothy Carambat	dfab14a5d2	Patch lanceDB not deleting vectors from workspace (#655 ) patch lanceDB not deleting vectors from workspace documentVectors self-sanitize on delete of parent document	2024-01-29 09:49:22 -08:00
Hakeem Abbas	5614e2ed30	feature: Integrate Astra as vectorDBProvider (#648 ) * feature: Integrate Astra as vectorDBProvider feature: Integrate Astra as vectorDBProvider * Update .env.example * Add env.example to docker example file Update spellcheck fo Astra Update Astra key for vector selection Update order of AstraDB options Resize Astra logo image to 330x330 Update methods of Astra to take in latest vectorDB params like TopN and more Update Astra interface to support default methods and avoid crash errors from 404 collections Update Astra interface to comply to max chunk insertion limitations Update Astra interface to dynamically set dimensionality from chunk 0 size on creation * reset workspaces --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2024-01-26 13:07:53 -08:00
Sean Hatfield	2f3db0e63a	[FEAT] support pinecone serverless (#639 ) * migrate pinecone package to latest version and migrate pinecone vectordb provider class * remove pinecone environment name env variable and update docs to reflect removal & serverless support complete * migrate query for pinecone db * typo in log --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2024-01-22 16:41:20 -08:00
Sean Hatfield	56fa17caf2	create configurable topN per workspace (#616 ) * create configurable topN per workspace * Update TopN UI text Fix fallbacks for all providers Add SQLite CHECK to TOPN value * merge with master Update zilliz provider for variable TopN --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2024-01-18 12:34:20 -08:00
Timothy Carambat	658e7fa390	chore: Better VectorDb and Embedder error messages (#620 ) * chore: propogate embedder and vectordb errors during document mutations * add default value for errors on addDocuments	2024-01-18 11:40:48 -08:00
Timothy Carambat	0df86699e7	feat: Add support for Zilliz Cloud by Milvus (#615 ) * feat: Add support for Zilliz Cloud by Milvus * update placeholder text update data handling stmt * update zilliz descriptor	2024-01-17 18:00:54 -08:00
Timothy Carambat	d0a3f1e3e1	Fix present diminsions on vectorDBs to be inferred for providers who require it (#605 )	2024-01-16 13:41:01 -08:00
Shuyoou	6faa0efaa8	Issue #543 support milvus vector db (#579 ) * issue #543 support milvus vector db * migrate Milvus to use MilvusClient instead of ORM normalize env setup for docs/implementation feat: embedder model dimension added * update comments --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2024-01-12 13:23:57 -08:00
Sayan Gupta	b7d2756754	Issue #204 Added a check to ensure that 'chunk.payload' exists and contains the 'id' property (#526 ) * Issue #204 Added a check to ensure that 'chunk.payload' exists and contains the 'id' property before attempting to destructure it * run linter * simplify condition and comment --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2024-01-04 16:39:43 -08:00
Timothy Carambat	8cc1455b72	feat: add support for variable chunk length (#415 ) fix: cleanup code for embedding length clarify resolves #388	2023-12-07 16:27:36 -08:00
Timothy Carambat	6fa8b0ce93	Add API key option to LocalAI (#407 ) * Add API key option to LocalAI * add api key for model dropdown selector	2023-12-04 08:38:15 -08:00
Timothy Carambat	88d4808c52	315 show citations based on relevancy score (#316 ) * settings for similarity score threshold and prisma schema updated * prisma schema migration for adding similarityScore setting * WIP * Min score default change * added similarityThreshold checking for all vectordb providers * linting --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com>	2023-11-06 16:49:29 -08:00
Timothy Carambat	be9d8b0397	Infinite prompt input and compression implementation (#332 ) * WIP on continuous prompt window summary * wip * Move chat out of VDB simplify chat interface normalize LLM model interface have compression abstraction Cleanup compressor TODO: Anthropic stuff * Implement compression for Anythropic Fix lancedb sources * cleanup vectorDBs and check that lance, chroma, and pinecone are returning valid metadata sources * Resolve Weaviate citation sources not working with schema * comment cleanup	2023-11-06 13:13:53 -08:00
Timothy Carambat	5d56ab623b	Anthropic claude 2 support (#305 ) * WIP Anythropic support for chat, chat and query w/context * Add onboarding support for Anthropic * cleanup * fix Anthropic answer parsing move embedding selector to general util	2023-10-30 15:44:03 -07:00
Sean Hatfield	669d7a396d	282 return relevancy score with similarityresponse (#304 ) * include score value in similarityResponse for weaviate * include score value in si milarityResponse for qdrant * include score value in si milarityResponse for pinecone * include score value in similarityResponse for chroma * include score value in similarityResponse for lancedb * distance to similarity --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2023-10-30 12:46:38 -07:00
Timothy Carambat	a8ec0d9584	Compensate for upper OpenAI emedding limit chunk size (#292 ) Limit is due to POST body max size. Sufficiently large requests will abort automatically We should report that error back on the frontend during embedding Update vectordb providers to return on failed	2023-10-26 10:57:37 -07:00
Timothy Carambat	62d39eb4fb	resolves #259 (#260 ) Support API client for chroma	2023-09-29 13:20:06 -07:00
Sean Hatfield	a126b5f5aa	Replace custom sqlite dbms with prisma (#239 ) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2023-09-28 14:00:03 -07:00
Sean Hatfield	ce6951b21f	Renamed all indicies to vectors to avoid confusion of vocab (#246 ) * renamed all indicies to vectors to avoid confusion of vocab * removing unneeded files * changed indicies on frontend sidebar to vectors	2023-09-21 12:04:17 -07:00
timothycarambat	79e3faa82d	Update readme to not prefer Pinecone	2023-09-12 14:58:14 -07:00
timothycarambat	cfcd14a307	Merge branch 'master' of github.com:Mintplex-Labs/anything-llm	2023-08-22 10:49:27 -07:00
timothycarambat	4f8abeb7fc	better loggin on addDocumentToWorkspace and add Qdrant setup doc	2023-08-22 10:30:01 -07:00
Timothy Carambat	c019f5abfa	Enable batch deletion of Pinecone Ids by max limit of 1000 (#210 ) * Enable batch deletion of Pinecone Ids by max limit of 1000 * lint	2023-08-22 09:25:55 -07:00
Timothy Carambat	cf0b24af02	Add Qdrant support for embedding, chat, and conversation (#192 ) * Add Qdrant support for embedding, chat, and conversation * Change comments	2023-08-15 15:26:44 -07:00
timothycarambat	a048cf451a	hot fix storage path for unix	2023-08-10 13:50:17 -07:00
Timothy Carambat	f3a6147ffd	Add support for Weaviate VectorDB (#181 )	2023-08-08 18:02:30 -07:00
Timothy Carambat	1f29cec918	Multiple LLM Support framework + AzureOpenAI Support (#180 ) * Remove LangchainJS for chat support chaining Implement runtime LLM selection Implement AzureOpenAI Support for LLM + Emebedding WIP on frontend Update env to reflect the new fields * Remove LangchainJS for chat support chaining Implement runtime LLM selection Implement AzureOpenAI Support for LLM + Emebedding WIP on frontend Update env to reflect the new fields * Replace keys with LLM Selection in settings modal Enforce checks for new ENVs depending on LLM selection	2023-08-04 14:56:27 -07:00
timothycarambat	9bea7739ed	move OpenAI to AiProvider folder in preparation for new AI provider support	2023-07-28 12:09:49 -07:00
Timothy Carambat	8929d96ed0	Move OpenAI api calls into its own interface/Class (#162 ) * Move OpenAI api calls into its own interface/Class move curate sources to be specific for each vectorDBs response for chat/query * remove comment	2023-07-28 12:05:38 -07:00
Timothy Carambat	0a2f837fb2	improve citations to show all text chunks referred and expand the citation to view full referenced text (#161 ) * improve citations to show all text chunks referred and expand the citation to view full referenced text chunk text of same document together * remove debug	2023-07-27 22:33:27 -07:00
Timothy Carambat	91f5f94200	[FEATURE] Enable the ability to have multi user instances (#158 ) * multi user wip * WIP MUM features * invitation mgmt * suspend or unsuspend users * workspace mangement * manage chats * manage chats * add Support for admin system settings for users to delete workspaces and limit chats per user * fix issue ith system var update app to lazy load invite page * cleanup and bug fixes * wrong method * update readme * update readme * update readme * bump version to 0.1.0	2023-07-25 10:37:04 -07:00
Timothy Carambat	5fa6145872	can now count and remove data in lancedb 0.1.12 so bumped version and added new functionality support (#155 )	2023-07-20 13:09:56 -07:00
Timothy Carambat	c1deca4928	[Fork] Batch embed by jwaltz (#153 ) * refactor: convert chunk embedding to one API call * chore: lint * fix chroma for batch and single vectorization of text * Fix LanceDB multi and single vectorization * Fix pinecone for single and multiple embeddings --------- Co-authored-by: Jonathan Waltz <volcanicislander@gmail.com>	2023-07-20 12:05:23 -07:00

62 Commits