Commit Graph

93 Commits

Author SHA1 Message Date
timothycarambat
d603d0fd51 patch: update storage for bulk-website scraper for render 2024-05-14 12:59:14 -07:00
timothycarambat
c8dac6177a Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-05-14 12:57:44 -07:00
timothycarambat
b5ac944475 patch: bulk-scraper, update when folder is made and path creation params 2024-05-14 12:57:23 -07:00
timothycarambat
72c9fda6c9 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-05-14 12:50:17 -07:00
Sean Hatfield
612a7e1662
[FEAT] Website depth scraping data connector (#1191)
* WIP website depth scraping, (sort of works)

* website depth data connector stable + add maxLinks option

* linting + loading small ui tweak

* refactor website depth data connector for stability, speed, & readability

* patch: remove console log
Guard clause on URL validity check
reasonable overrides

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2024-05-14 12:49:14 -07:00
jazelly
d71db22799
fix: skip undefined confluence pageContent (#1383)
Refs: https://github.com/Mintplex-Labs/anything-llm/issues/1381

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2024-05-14 10:22:13 -07:00
Predrag Stojadinović
78e3e35d27
[FEAT] Confluence Data Connector handles custom Confluence urls (#1362)
* chore: confluence data connector can now handle custom urls, in addition to default {subdomain}.atlassian.net ones

* chore: formatting as per yarn lint
2024-05-14 10:21:04 -07:00
timothycarambat
c60077a078 merge with master 2024-05-03 10:02:53 -07:00
timothycarambat
2d215acb75 patch storage dirs for extensions 2024-05-02 14:03:10 -07:00
timothycarambat
1aa8e5766f duplicate key (no impact) 2024-05-02 13:05:20 -07:00
timothycarambat
6150ff41ea Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-05-01 13:33:07 -07:00
Timothy Carambat
547d4859ef
Bump openai package to latest (#1234)
* Bump `openai` package to latest
Tested all except localai

* bump LocalAI support with latest image

* add deprecation notice

* linting
2024-04-30 12:33:42 -07:00
Timothy Carambat
94017e2b51
bump langchain deps (#1231)
* bump langchain deps

* patch native and ollama providers remove deprecated deps

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
2024-04-30 12:04:24 -07:00
Sean Hatfield
348b36bf85
[FEAT] Confluence data connector (#1181)
* WIP Confluence data connector backend

* confluence data connector complete

* confluence citations

* fix citation for confluence

* Patch Confluence integration

* fix Citation Icon for confluence

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-04-25 17:53:38 -07:00
timothycarambat
e1372a81d4 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-04-20 18:22:41 -07:00
Ken Kuang
a3b7239d05
Fix Cannot read properties of undefined (reading 'length') (#1145)
Fix upload failed
2024-04-20 12:28:19 -07:00
timothycarambat
45505630a6 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-04-17 11:55:57 -07:00
Timothy Carambat
a5bb77f97a
Agent support for @agent default agent inside workspace chat (#1093)
V1 of agent support via built-in `@agent` that can be invoked alongside normal workspace RAG chat.
2024-04-16 10:50:10 -07:00
timothycarambat
fde4e5400f Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-04-12 14:57:46 -07:00
Sean Hatfield
af84b01482
[FIX] GitHub repo with periods in link fix (#1084)
fix periods in github repo links bug
2024-04-12 14:56:59 -07:00
Timothy Carambat
2c6135aa54
patch file types as plaintext (#1095)
resolves #1089
2024-04-12 14:54:33 -07:00
timothycarambat
75ced7e65a merge with master
Patch LLM selection for native to be disabled
2024-04-07 14:55:18 -07:00
Timothy Carambat
1f8ab0d245
Remove YoutubeLoader dependency (#1050)
* WIP data connector redesign

* new UI for data connectors complete

* remove old data connector page/cleanup imports

* cleanup of UI and imports

* Remove Youtube Transcript dep and move in-house

* lang pref default to en

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
2024-04-05 16:33:01 -07:00
timothycarambat
2638098d49 patch with master 2024-04-05 09:45:28 -07:00
timothycarambat
0b454016cf patch comkey path to fallback 2024-04-04 10:47:26 -07:00
timothycarambat
a4c1d42e41 merge with master 2024-04-02 14:33:32 -07:00
timothycarambat
e524afae9e Merge branch 'master' of github.com:Mintplex-Labs/anything-llm 2024-04-02 14:30:27 -07:00
timothycarambat
117c3b2bfb forgot epub file! 2024-04-02 14:30:20 -07:00
Timothy Carambat
4fb4aa2041
Add epub support for parsing (#1017) 2024-04-02 14:25:52 -07:00
Timothy Carambat
752e3e22ed
Add more text file forced extensions (#1016) 2024-04-02 14:13:11 -07:00
Timothy Carambat
f4088d9348
RSA-Signing on server<->collector communication via API (#1005)
* WIP integrity check between processes

* Implement integrity checking on document processor payloads
2024-04-01 13:56:35 -07:00
timothycarambat
971c54e2c8 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-03-26 14:12:09 -07:00
Sean Hatfield
45f50ce13c
[FIX] Update metadata tags in PDF collector script (#925)
update title in pdf collector script to be the filename instead of metadata title
2024-03-19 18:14:34 -07:00
timothycarambat
540d18ec84 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-03-18 09:52:11 -07:00
Timothy Carambat
0ada882991
Support external transcription providers (#909)
* Support External Transcription providers

* patch files

* update docs

* fix return data
2024-03-14 15:43:26 -07:00
timothycarambat
429ea0c805 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-03-12 12:29:57 -07:00
Timothy Carambat
0f31e43fd4
bump YT metadata lib for YT api fix rot (#888) 2024-03-11 10:57:53 -07:00
timothycarambat
65f8a01505 merge with master 2024-03-06 16:43:36 -08:00
Timothy Carambat
ec90060d36
Re-map some file mimes to support text (#842)
re-map some file mimes to support text
2024-02-29 10:05:03 -08:00
timothycarambat
2b6e1db79b merge with master 2024-02-27 23:12:09 -08:00
Timothy Carambat
6d18d79bb7
Generic upload fallback as text file. (#808)
* Do not block any file upload
fallback unknown/unsupported types to text if possible

* reduce call for frontend

* patch
2024-02-26 13:43:54 -08:00
timothycarambat
ae01785220 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-02-21 15:11:45 -08:00
Timothy Carambat
d89610586a
improve error messages from YT scraping (#768)
parse & enforce URL to allow multiple URL schemas
2024-02-21 10:47:10 -08:00
Timothy Carambat
49fbd09af4
Support more plaintext filetypes (#757)
* Add more plaintext document types

org-mode, asciidoc, and reStructuredText are all text formats

Signed-off-by: Christian Romney <christian.a.romney@gmail.com>

* lint

---------

Signed-off-by: Christian Romney <christian.a.romney@gmail.com>
Co-authored-by: Christian Romney <christian.a.romney@gmail.com>
2024-02-19 10:44:01 -08:00
Timothy Carambat
d52f8aafd4
689 links in citation (#715)
* Include links in citations
force ChunkSource key to retain this information
old links will be unsupported

* show special icons depending on source

* remove console log

* reset server documents writeTo
2024-02-13 14:11:57 -08:00
Timothy Carambat
48cb8f2897
Add support to upload rawText document via api (#692)
* Add support to upload rawText document via api

* update API doc endpoint with correct textContent key

* update response swagger doc
2024-02-07 15:17:32 -08:00
Sean Hatfield
288ff0d18c
fix vector cache not deleting cache after unembedding items with folders (#630) 2024-01-22 13:03:05 -08:00
Timothy Carambat
0db6c3b2aa
Prevent private octets from link collection for self-hosted (#626) 2024-01-19 10:49:40 -08:00
timothycarambat
addb3d0c3e Update Render.com image for AnythingLLM to latest 2024-01-17 18:12:25 -08:00
Timothy Carambat
b35feede87
570 document api return object (#608)
* Add support for fetching single document in documents folder

* Add document object to upload + support link scraping via API

* hotfixes for documentation

* update api docs
2024-01-16 16:04:22 -08:00