* fix tree/blob GitHub URLs from branches not being loaded
* improve UX of GitHub data connector
* lint
* patch GitHub URL parser to just validate with the native `URL` parser
* uncheck LocalStorage of PAT for security reasons
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
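A minimal sketch of validating a repository URL with the native `URL` parser, as the commit above describes. The helper names (`tryParseUrl`, `parseGitHubUrl`), the `ParsedRepoUrl` shape, and the `acme/widgets` repository are hypothetical and are not taken from the actual connector code. The branch is read as a single path segment, so branch names containing `/` would need extra handling.

```typescript
// Hypothetical result shape; not the connector's real type.
interface ParsedRepoUrl {
  owner: string;
  repo: string;
  branch: string | null; // set only for /tree/<branch> or /blob/<branch> URLs
}

// Wrap the native parser so malformed input yields null instead of throwing.
function tryParseUrl(input: string): URL | null {
  try {
    return new URL(input);
  } catch (e) {
    return null;
  }
}

function parseGitHubUrl(input: string): ParsedRepoUrl | null {
  const url = tryParseUrl(input);
  if (url === null) return null;
  if (url.protocol !== "https:" || url.hostname !== "github.com") return null;

  // Drop empty segments caused by leading or trailing slashes.
  const parts = url.pathname.split("/").filter(function (p) {
    return p.length > 0;
  });
  if (parts.length < 2) return null;

  const hasBranch =
    (parts[2] === "tree" || parts[2] === "blob") && parts.length > 3;
  return {
    owner: parts[0],
    repo: parts[1].replace(/\.git$/, ""),
    branch: hasBranch ? parts[3] : null,
  };
}
```

Delegating well-formedness checks to `URL` keeps the custom logic limited to the host check and path segments.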
* Updated the `GitHubRepoLoader` class to use the new import syntax and adjust the `recursiveLoader` method accordingly.
* add @langchain/community to collector package.json
* fix: Improve handling of complex ignore patterns in GitLabRepoLoader
* refactor: use ignore package for simplified ignore logic
* run yarn lint
* add @langchain/community@^0.2.23
* remove unused dep
* lint
---------
Co-authored-by: Emil Rofors (aider) <emirof@gmail.com>
* Added an option to fetch issues from GitLab. Made the file fetching asynchronous to improve performance. #2334
* Fixed a typo in loadGitlabRepo.
* Convert issues to markdown.
* Fixed an issue with time estimate field names in issueToMarkdown.
* handle rate limits more gracefully + update checkbox to toggle switch
* lint
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
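A rough sketch of what converting an issue to markdown could look like. The `IssueSketch` shape and the `issueToMarkdownSketch` name are assumptions for illustration only. The real `issueToMarkdown` helper and the GitLab issue fields it reads (including the time estimate fields mentioned above) are not shown in this log.

```typescript
// Hypothetical, simplified issue shape; the real GitLab payload has more fields.
interface IssueSketch {
  id: number;
  title: string;
  state: string;
  labels: string[];
  description: string | null;
}

function issueToMarkdownSketch(issue: IssueSketch): string {
  // Fall back to placeholders so empty fields still render readable text.
  const labels = issue.labels.length > 0 ? issue.labels.join(", ") : "none";
  const hasBody =
    issue.description !== null && issue.description.trim().length > 0;
  const body = hasBody
    ? (issue.description as string).trim()
    : "_No description provided._";

  const lines: string[] = [
    "# Issue #" + issue.id + ": " + issue.title,
    "",
    "- State: " + issue.state,
    "- Labels: " + labels,
    "",
    body,
  ];
  return lines.join("\n");
}
```

Rendering each issue as a small markdown document lets it be embedded like any other text file from the repository.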
* support more confluence url formats
* use pattern matching for confluence urls and manual splitting as fallback
* rework entire Confluence flow to prevent issues with custom, local, and cloud spaces
* remove dep
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
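The "pattern matching first, manual splitting as fallback" approach for Confluence URLs could be sketched as follows. The pattern list, the `ConfluenceTarget` shape, the function names, and the example hosts are all assumptions. The URL formats the connector actually supports may differ.

```typescript
// Hypothetical result shape for illustration.
interface ConfluenceTarget {
  origin: string;
  spaceKey: string;
}

// Assumed path layouts: cloud-style /wiki/spaces/KEY, plus /spaces/KEY and /display/KEY.
const SPACE_PATTERNS: RegExp[] = [
  /^\/wiki\/spaces\/([^\/]+)/,
  /^\/spaces\/([^\/]+)/,
  /^\/display\/([^\/]+)/,
];

function toUrlOrNull(input: string): URL | null {
  try {
    return new URL(input);
  } catch (e) {
    return null;
  }
}

function parseConfluenceUrl(input: string): ConfluenceTarget | null {
  const url = toUrlOrNull(input);
  if (url === null) return null;

  // 1. Try the known path patterns.
  for (let i = 0; i < SPACE_PATTERNS.length; i++) {
    const match = SPACE_PATTERNS[i].exec(url.pathname);
    if (match !== null) return { origin: url.origin, spaceKey: match[1] };
  }

  // 2. Fallback: split the path and take the segment after "spaces" or "display",
  //    which covers installs served under a custom context path.
  const parts = url.pathname.split("/").filter(function (p) {
    return p.length > 0;
  });
  for (let i = 0; i < parts.length - 1; i++) {
    if (parts[i] === "spaces" || parts[i] === "display") {
      return { origin: url.origin, spaceKey: parts[i + 1] };
    }
  }
  return null;
}
```

For example, `parseConfluenceUrl("https://wiki.example.com/confluence/display/OPS/Runbook")` matches none of the anchored patterns and is resolved by the splitting fallback.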
* Add support for GitLab repo collection as well as Github Repo collection
* Refactor for repo collectors to be more compact
---------
Co-authored-by: Emil Rofors <emirof@gmail.com>
* implement custom PDFLoader to remove LC dep
* remove unneeded comment
* remove pdfjs as dep and fix page splitting using pdf-parse
* linting + export rename for desktop compat
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* WIP replace langchain pdfloader with pdfjs and add more context to each page
* remove extras from pdfjs and just replace langchain library
* remove unneeded dep
* fix console log in docs
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* wip bg workers for live document sync
* Add ability to re-embed specific documents across many workspaces via background queue
bgworkers are gated behind an experimental system setting flag that needs to be explicitly enabled
UI for watching/unwatching documents that are embedded.
TODO: UI to easily manage all bg tasks and see run results
TODO: UI to enable this feature and background endpoints to manage it
* create frontend views and paths
Move elements to correct experimental scope
* update migration to delete runs on removal of watched document
* Add watch support to YouTube transcripts (#1716)
* Add watch support to YouTube transcripts
refactor how sync is done for supported types
* Watch specific files in Confluence space (#1718)
Add failure-prune check for runs
* create tmp workflow modifications for beta image
* dual build
update copy of alert modals
* update job interval
* Add support for live-sync of Github files
* update copy for document sync feature
* hide Experimental features from UI
* update docs links
* [FEAT] Implement new settings menu for experimental features (#1735)
* implement new settings menu for experimental features
* remove unused context save bar
---------
Co-authored-by: timothycarambat <rambat1010@gmail.com>
* dont run job on boot
* unset workflow changes
* Add persistent encryption service
Relay key to collector so persistent encryption can be used
Encrypt any private data in chunkSources used for replay during resync jobs
* update JSDoc
* Linting and organization
* update modal copy for feature
---------
Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
* chore: confluence data connector can now handle custom urls, in addition to default {subdomain}.atlassian.net ones
* chore: formatting as per yarn lint
* chore: fixing the human readable confluence url fetch baseUrl
* refactor implementation of various types of Confluence URL patterns
---------
Co-authored-by: Predrag Stojadinovic <predrag@stojadinovic.net>
Co-authored-by: Predrag Stojadinović <cope@users.noreply.github.com>
Co-authored-by: Predrag Stojadinovic <predrags@nvidia.com>
* Updated apt-packages source for devcontainer
Switched the devcontainer's package source to a different repository to
align with updated dependencies and package availability. The previous
source from 'rocker-org' is replaced with 'devcontainers-contrib', which
may offer more recent or relevant development tools.
* Centralize prettier ignores and refine config
Centralized all prettier ignore rules by removing individual
`.prettierignore` files in subprojects and updating the root
`.prettierignore` to include previously ignored patterns, ensuring
consistency across the workspace. Additionally, the prettier
configuration was refined by making the file pattern for `.config.js`
files consistent and adjusting quote styles for better readability. All
lint scripts across the project were updated to respect the centralized
ignore path, enhancing maintainability.
The consolidation simplifies the process of managing ignore rules as the
project scales, ensuring developers can focus on writing code without
worrying about divergent formatting standards. These changes also align
with introducing comprehensive linting across multiple environments to
keep the codebase clean and consistent.
This adjustment is a foundational step towards a more streamlined and
unified code base, making it easier for new contributors to adhere to
established coding standards and reducing the cognitive load associated
with managing multiple configuration files across the project.
* unset package json changes
---------
Co-authored-by: Francisco Bischoff <franzbischoff@gmail.com>
Co-authored-by: Francisco Bischoff <984592+franzbischoff@users.noreply.github.com>
* chore: confluence data connector can now handle custom urls, in addition to default {subdomain}.atlassian.net ones
* chore: formatting as per yarn lint
* chore: adding /display/ url matching to confluence data connector