anything-llm

mirror of https://github.com/Mintplex-Labs/anything-llm.git synced 2024-11-10 17:00:11 +01:00

Author	SHA1	Message	Date
Timothy Carambat	a89812703b	repatch path normalization (#1516 )	2024-05-23 12:52:04 -07:00
timothycarambat	05488c81e0	undo path norm whitespace fix	2024-05-23 12:04:00 -07:00
timothycarambat	e208074ef4	patch path normalization	2024-05-22 11:50:01 -05:00
Timothy Carambat	1a5aacb001	Support multi-model whispers (#1444 )	2024-05-17 21:31:29 -07:00
Timothy Carambat	7e0b638a2c	Patch confluence URL patterns(#1426 ) * patch confluence patterns --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com>	2024-05-16 14:15:59 -07:00
timothycarambat	87b41a60e9	refactor spaceKey url pattern for custom domains	2024-05-16 11:01:34 -07:00
Predrag Stojadinović	cf969adf37	1362 custom display confluence url (#1423 ) * chore: confluence data connector can now handle custom urls, in addition to default {subdomain}.atlassian.net ones * chore: formatting as per yarn lint * chore: adding /display/ url matching to confluence data connector	2024-05-16 10:46:18 -07:00
timothycarambat	b5ac944475	patch: bulk-scraper, update when folder is made and path creation params	2024-05-14 12:57:23 -07:00
Sean Hatfield	612a7e1662	[FEAT] Website depth scraping data connector (#1191 ) * WIP website depth scraping, (sort of works) * website depth data connector stable + add maxLinks option * linting + loading small ui tweak * refactor website depth data connector for stability, speed, & readability * patch: remove console log Guard clause on URL validitiy check reasonable overrides --------- Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2024-05-14 12:49:14 -07:00
jazelly	d71db22799	fix: skip undefined confluence pageContent (#1383 ) Refs: https://github.com/Mintplex-Labs/anything-llm/issues/1381 Co-authored-by: Timothy Carambat <rambat1010@gmail.com>	2024-05-14 10:22:13 -07:00
Predrag Stojadinović	78e3e35d27	[FEAT] Confluence Data Connector handles custom Confluence urls (#1362 ) * chore: confluence data connector can now handle custom urls, in addition to default {subdomain}.atlassian.net ones * chore: formatting as per yarn lint	2024-05-14 10:21:04 -07:00
timothycarambat	2d215acb75	patch storage dirs for extensions	2024-05-02 14:03:10 -07:00
timothycarambat	1aa8e5766f	duplicate key (no impact)	2024-05-02 13:05:20 -07:00
Timothy Carambat	547d4859ef	Bump `openai` package to latest (#1234 ) * Bump `openai` package to latest Tested all except localai * bump LocalAI support with latest image * add deprecation notice * linting	2024-04-30 12:33:42 -07:00
Timothy Carambat	94017e2b51	bump langchain deps (#1231 ) * bump langchain deps * patch native and ollama providers remove deprecated deps --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com>	2024-04-30 12:04:24 -07:00
Sean Hatfield	348b36bf85	[FEAT] Confluence data connector (#1181 ) * WIP Confluence data connector backend * confluence data connector complete * confluence citations * fix citation for confluence * Patch confulence integration * fix Citation Icon for confluence --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>	2024-04-25 17:53:38 -07:00
Ken Kuang	a3b7239d05	Fix Cannot read properties of undefined (reading 'length') (#1145 ) Fix upload failed	2024-04-20 12:28:19 -07:00
Timothy Carambat	a5bb77f97a	Agent support for `@agent` default agent inside workspace chat (#1093 ) V1 of agent support via built-in `@agent` that can be invoked alongside normal workspace RAG chat.	2024-04-16 10:50:10 -07:00
Sean Hatfield	af84b01482	[FIX] GitHub repo with periods in link fix (#1084 ) fix periods in github repo links bug	2024-04-12 14:56:59 -07:00
Timothy Carambat	2c6135aa54	patch file types as plaintext (#1095 ) resolves #1089	2024-04-12 14:54:33 -07:00
Timothy Carambat	1f8ab0d245	Remove YoutubeLoader dependency (#1050 ) * WIP data connector redesign * new UI for data connectors complete * remove old data connector page/cleanup imports * cleanup of UI and imports * Remove Youtube Transcript dep and move in-house * lang pref default to en --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com>	2024-04-05 16:33:01 -07:00
timothycarambat	0b454016cf	patch comkey path to fallback	2024-04-04 10:47:26 -07:00
timothycarambat	e524afae9e	Merge branch 'master' of github.com:Mintplex-Labs/anything-llm	2024-04-02 14:30:27 -07:00
timothycarambat	117c3b2bfb	forgot epub file!	2024-04-02 14:30:20 -07:00
Timothy Carambat	4fb4aa2041	Add epub support for parsing (#1017 )	2024-04-02 14:25:52 -07:00
Timothy Carambat	752e3e22ed	Add more text file forced extensions (#1016 )	2024-04-02 14:13:11 -07:00
Timothy Carambat	f4088d9348	RSA-Signing on server<->collector communication via API (#1005 ) * WIP integrity check between processes * Implement integrity checking on document processor payloads	2024-04-01 13:56:35 -07:00
Sean Hatfield	45f50ce13c	[FIX] Update metadata tags in PDF collector script (#925 ) update title in pdf collector script to be the filename instead of metadata title	2024-03-19 18:14:34 -07:00
Timothy Carambat	0ada882991	Support external transcription providers (#909 ) * Support External Transcription providers * patch files * update docs * fix return data	2024-03-14 15:43:26 -07:00
Timothy Carambat	0f31e43fd4	bump YT metadata lib for YT api fix rot (#888 )	2024-03-11 10:57:53 -07:00
Timothy Carambat	ec90060d36	Re-map some file mimes to support text (#842 ) re-map some file mimes to support text	2024-02-29 10:05:03 -08:00
Timothy Carambat	6d18d79bb7	Generic upload fallback as text file. (#808 ) * Do not block any file upload fallback unknown/unsupported types to text if possible * reduce call for frontend * patch	2024-02-26 13:43:54 -08:00
Timothy Carambat	d89610586a	improve error messages from YT scraping (#768 ) parse & enforce URL to allow multiple URL schemas	2024-02-21 10:47:10 -08:00
Timothy Carambat	49fbd09af4	Support more plaintext filetypes (#757 ) * Add more plaintext document types org-mode, asciidoc, and reStructuredText are all text formats Signed-off-by: Christian Romney <christian.a.romney@gmail.com> * lint --------- Signed-off-by: Christian Romney <christian.a.romney@gmail.com> Co-authored-by: Christian Romney <christian.a.romney@gmail.com>	2024-02-19 10:44:01 -08:00
Timothy Carambat	d52f8aafd4	689 links in citation (#715 ) * Include links in citations force ChunkSource key to retain this information old links will be unsupported * show special icons depending on source * remove console log * reset server documents writeTo	2024-02-13 14:11:57 -08:00
Timothy Carambat	48cb8f2897	Add support to upload rawText document via api (#692 ) * Add support to upload rawText document via api * update API doc endpoint with correct textContent key * update response swagger doc	2024-02-07 15:17:32 -08:00
Sean Hatfield	288ff0d18c	fix vector cache not deleting cache after unembedding items with folders (#630 )	2024-01-22 13:03:05 -08:00
Timothy Carambat	0db6c3b2aa	Prevent private octets from link collection for self-hosted (#626 )	2024-01-19 10:49:40 -08:00
Timothy Carambat	b35feede87	570 document api return object (#608 ) * Add support for fetching single document in documents folder * Add document object to upload + support link scraping via API * hotfixes for documentation * update api docs	2024-01-16 16:04:22 -08:00
Timothy Carambat	1563a1b20f	Strict link protocol validation (#577 )	2024-01-11 12:29:00 -08:00
Timothy Carambat	58971e8b30	Build & Publish AnythingLLM for ARM64 and x86 (#549 ) * Update build process to support multi-platform builds Bump @lancedb/vectordb to 0.1.19 for ARM&AMD compatibility Patch puppeteer on ARM builds because of broken chromium resolves #539 resolves #548 --------- Co-authored-by: shatfield4 <seanhatfield5@gmail.com>	2024-01-08 16:15:01 -08:00
Francisco Bischoff	990a2e85bf	devcontainer v1 (#297 ) Implement support for GitHub codespaces and VSCode devcontainers --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>	2024-01-08 15:31:06 -08:00
timothycarambat	26549df6a9	touchup linting	2023-12-27 13:28:37 -08:00
timothycarambat	daadad3859	hoist var in extensions	2023-12-20 19:41:16 -08:00
Timothy Carambat	f2fadd6d2e	Add placeholder collector ENV file (#476 ) resolves #474	2023-12-19 13:27:09 -08:00
Timothy Carambat	ecf4295537	Add ability to grab youtube transcripts via doc processor (#470 ) * Add ability to grab youtube transcripts via doc processor * dynamic imports swap out Github for Youtube in placeholder text	2023-12-18 17:17:26 -08:00
Timothy Carambat	452582489e	GitHub loader extension + extension support v1 (#469 ) * feat: implement github repo loading fix: purge of folders fix: rendering of sub-files * noshow delete on custom-documents * Add API key support because of rate limits * WIP for frontend of data connectors * wip * Add frontend form for GitHub repo data connector * remove console.logs block custom-documents from being deleted * remove _meta unused arg * Add support for ignore pathing in request Ignore path input via tagging * Update hint	2023-12-18 15:48:02 -08:00
timothycarambat	d2e3506bb9	fix: transition on LLM and embedding screen linting	2023-12-15 12:40:11 -08:00
Timothy Carambat	61db981017	feat: Embed on-instance Whisper model for audio/mp4 transcribing (#449 ) * feat: Embed on-instance Whisper model for audio/mp4 transcribing resolves #329 * additional logging * add placeholder for tmp folder in collector storage Add cleanup of hotdir and tmp on collector boot to prevent hanging files split loading of model and file conversion into concurrency * update README * update model size * update supported filetypes	2023-12-15 11:20:13 -08:00
Timothy Carambat	719521c307	Document Processor v2 (#442 ) * wip: init refactor of document processor to JS * add NodeJs PDF support * wip: partity with python processor feat: add pptx support * fix: forgot files * Remove python scripts totally * wip:update docker to boot new collector * add package.json support * update dockerfile for new build * update gitignore and linting * add more protections on file lookup * update package.json * test build * update docker commands to use cap-add=SYS_ADMIN so web scraper can run update all scripts to reflect this remove docker build for branch	2023-12-14 15:14:56 -08:00

79 Commits