Commit Graph

74 Commits

Author SHA1 Message Date
timothycarambat
f2ebca8f84 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-07-19 18:36:48 -07:00
timothycarambat
f15529653f patch logger for full logs 2024-07-19 18:35:41 -07:00
timothycarambat
cec1a3d585 append stacktraces to winston 2024-07-19 18:13:54 -07:00
timothycarambat
766537180a linting 2024-07-19 15:25:09 -07:00
timothycarambat
a56c124543 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-07-11 15:58:21 -07:00
Sean Hatfield
79656718b2
[FEAT] Create custom pdfloader (#1852)
* implement custom PDFLoader to remove LC dep

* remove unneeded comment

* remove pdfjs as dep and fix page splitting using pdf-parse

* linting + export rename for desktop compat

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-07-11 12:26:11 -07:00
timothycarambat
e6ee872136 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-07-10 15:51:27 -07:00
timothycarambat
8658b1e7c7 linting 2024-07-03 18:25:44 -07:00
Timothy Carambat
29c9eeaa5c
Add winston logging for production (#1811) 2024-07-03 16:39:33 -07:00
Sean Hatfield
f205d51fe9
[FIX] Confluence code snippet blocks not being extracted (#1804)
implement custom confluence loader to extract code blocks properly from documents

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2024-07-03 14:00:44 -07:00
timothycarambat
86a31d7551 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-07-01 17:08:59 -07:00
Sean Hatfield
fc375f4036
[FIX] Bulk link scraper bug fix (#1800)
patch website depth data connector to work for other links that are not root url
2024-07-01 16:59:28 -07:00
Jason Zhang
fa4ab0f65f
fix: sanitize filename before writing (#1743)
* fix: sanitize filename before writing

Fixes: https://github.com/Mintplex-Labs/anything-llm/issues/1737

* fixup

* fixup
2024-06-25 15:45:09 -07:00
Timothy Carambat
dc4ad6b5a9
[BETA] Live document sync (#1719)
* wip bg workers for live document sync

* Add ability to re-embed specific documents across many workspaces via background queue
bgworker is gated behind experimental system setting flag that needs to be explicitly enabled
UI for watching/unwatching documents that are embedded.
TODO: UI to easily manage all bg tasks and see run results
TODO: UI to enable this feature and background endpoints to manage it

* create frontend views and paths
Move elements to correct experimental scope

* update migration to delete runs on removal of watched document

* Add watch support to YouTube transcripts (#1716)

* Add watch support to YouTube transcripts
refactor how sync is done for supported types

* Watch specific files in Confluence space (#1718)

Add failure-prune check for runs

* create tmp workflow modifications for beta image

* create tmp workflow modifications for beta image

* create tmp workflow modifications for beta image

* dual build
update copy of alert modals

* update job interval

* Add support for live-sync of Github files

* update copy for document sync feature

* hide Experimental features from UI

* update docs links

* [FEAT] Implement new settings menu for experimental features (#1735)

* implement new settings menu for experimental features

* remove unused context save bar

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>

* dont run job on boot

* unset workflow changes

* Add persistent encryption service
Relay key to collector so persistent encryption can be used
Encrypt any private data in chunkSources used for replay during resync jobs

* update jsDOC

* Linting and organization

* update modal copy for feature

---------

Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 13:38:50 -07:00
Timothy Carambat
a598c8e04c
1347 human readable confluence url (#1706)
* chore: confluence data connector can now handle custom urls, in addition to default {subdomain}.atlassian.net ones

* chore: formatting as per yarn lint

* chore: fixing the human readable confluence url fetch baseUrl

* chore: fixing the human readable confluence url fetch baseUrl

* chore: fixing the human readable confluence url fetch baseUrl

* chore: fixing the human readable confluence url fetch baseUrl

* chore: fixing the human readable confluence url fetch baseUrl

* refactor implementation of various types of Confluence URL patterns

---------

Co-authored-by: Predrag Stojadinovic <predrag@stojadinovic.net>
Co-authored-by: Predrag Stojadinović <cope@users.noreply.github.com>
Co-authored-by: Predrag Stojadinovic <predrags@nvidia.com>
2024-06-17 16:04:20 -07:00
timothycarambat
393772c4a5 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-06-12 09:05:57 -07:00
Chris Daniel
8a4dd2bdf5
[FEAT] add support for TSX files to be parsed as text (#1597)
add support for TSX files to be parsed as text
2024-06-03 17:01:41 +08:00
Sean Hatfield
9a38b32c74
[FEAT] Add support for R files to be parsed as text (#1577)
add support for R files to be parsed as text
2024-05-31 13:52:00 +08:00
Sean Hatfield
4324a8bb4f
[FEAT] Github repo loader bug fix (#1558)
* fix project names with special characters for github repo data connector

* linting
2024-05-29 17:01:29 +08:00
timothycarambat
6e8a327d98 merge with master 2024-05-23 12:58:36 -07:00
Timothy Carambat
a89812703b
repatch path normalization (#1516) 2024-05-23 12:52:04 -07:00
timothycarambat
05488c81e0 undo path norm whitespace fix 2024-05-23 12:04:00 -07:00
timothycarambat
c6ad94d81a Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-05-22 13:43:09 -05:00
timothycarambat
e208074ef4 patch path normalization 2024-05-22 11:50:01 -05:00
timothycarambat
c65ab6d863 merge with master 2024-05-21 14:48:16 -05:00
Timothy Carambat
1a5aacb001
Support multi-model whispers (#1444) 2024-05-17 21:31:29 -07:00
Timothy Carambat
7e0b638a2c
Patch confluence URL patterns (#1426)
* patch confluence patterns

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
2024-05-16 14:15:59 -07:00
timothycarambat
87b41a60e9 refactor spaceKey url pattern for custom domains 2024-05-16 11:01:34 -07:00
Predrag Stojadinović
cf969adf37
1362 custom display confluence url (#1423)
* chore: confluence data connector can now handle custom urls, in addition to default {subdomain}.atlassian.net ones

* chore: formatting as per yarn lint

* chore: adding /display/ url matching to confluence data connector
2024-05-16 10:46:18 -07:00
timothycarambat
d603d0fd51 patch:update storage for bulk-website scraper for render 2024-05-14 12:59:14 -07:00
timothycarambat
c8dac6177a Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-05-14 12:57:44 -07:00
timothycarambat
b5ac944475 patch: bulk-scraper, update when folder is made and path creation params 2024-05-14 12:57:23 -07:00
timothycarambat
72c9fda6c9 Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-05-14 12:50:17 -07:00
Sean Hatfield
612a7e1662
[FEAT] Website depth scraping data connector (#1191)
* WIP website depth scraping, (sort of works)

* website depth data connector stable + add maxLinks option

* linting + loading small ui tweak

* refactor website depth data connector for stability, speed, & readability

* patch: remove console log
Guard clause on URL validity check
reasonable overrides

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2024-05-14 12:49:14 -07:00
jazelly
d71db22799
fix: skip undefined confluence pageContent (#1383)
Refs: https://github.com/Mintplex-Labs/anything-llm/issues/1381

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
2024-05-14 10:22:13 -07:00
Predrag Stojadinović
78e3e35d27
[FEAT] Confluence Data Connector handles custom Confluence urls (#1362)
* chore: confluence data connector can now handle custom urls, in addition to default {subdomain}.atlassian.net ones

* chore: formatting as per yarn lint
2024-05-14 10:21:04 -07:00
timothycarambat
c60077a078 merge with master 2024-05-03 10:02:53 -07:00
timothycarambat
2d215acb75 patch storage dirs for extensions 2024-05-02 14:03:10 -07:00
timothycarambat
1aa8e5766f duplicate key (no impact) 2024-05-02 13:05:20 -07:00
timothycarambat
6150ff41ea Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-05-01 13:33:07 -07:00
Timothy Carambat
547d4859ef
Bump openai package to latest (#1234)
* Bump `openai` package to latest
Tested all except localai

* bump LocalAI support with latest image

* add deprecation notice

* linting
2024-04-30 12:33:42 -07:00
Sean Hatfield
348b36bf85
[FEAT] Confluence data connector (#1181)
* WIP Confluence data connector backend

* confluence data connector complete

* confluence citations

* fix citation for confluence

* Patch confluence integration

* fix Citation Icon for confluence

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>
2024-04-25 17:53:38 -07:00
timothycarambat
fde4e5400f Merge branch 'master' of github.com:Mintplex-Labs/anything-llm into render 2024-04-12 14:57:46 -07:00
Sean Hatfield
af84b01482
[FIX] GitHub repo with periods in link fix (#1084)
fix periods in github repo links bug
2024-04-12 14:56:59 -07:00
Timothy Carambat
2c6135aa54
patch file types as plaintext (#1095)
resolves #1089
2024-04-12 14:54:33 -07:00
timothycarambat
75ced7e65a merge with master
Patch LLM selection for native to be disabled
2024-04-07 14:55:18 -07:00
Timothy Carambat
1f8ab0d245
Remove YoutubeLoader dependency (#1050)
* WIP data connector redesign

* new UI for data connectors complete

* remove old data connector page/cleanup imports

* cleanup of UI and imports

* Remove Youtube Transcript dep and move in-house

* lang pref default to en

---------

Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
2024-04-05 16:33:01 -07:00
timothycarambat
2638098d49 patch with master 2024-04-05 09:45:28 -07:00
timothycarambat
0b454016cf patch comkey path to fallback 2024-04-04 10:47:26 -07:00
timothycarambat
a4c1d42e41 merge with master 2024-04-02 14:33:32 -07:00