AnythingLLM: A document chatbot to chat with anything!
An efficient, customizable, and open-source enterprise-ready document chatbot solution.
Docs | Hosted Instance
A full-stack application that enables you to turn any document, resource, or piece of content into context that any LLM can use as references during chatting. This application allows you to pick and choose which LLM or Vector Database you want to use.
Watch the demo!
Product Overview
AnythingLLM aims to be a full-stack application where you can use commercial off-the-shelf LLMs or popular open source LLMs and vectorDB solutions.
AnythingLLM is a full-stack product that you can run locally or host remotely, and that lets you chat intelligently with any documents you provide it.
AnythingLLM divides your documents into objects called workspaces. A Workspace functions a lot like a thread, but with the addition of containerization of your documents. Workspaces can share documents, but they do not talk to each other, so you can keep your context for each workspace clean.
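To make the workspace idea concrete, here is a minimal sketch of documents being shared while chat context stays isolated. The Workspace class, its fields, and its methods are hypothetical names chosen for illustration; they are not AnythingLLM's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Hypothetical model: a named container of document references plus its own chat history."""

    name: str
    documents: set[str] = field(default_factory=set)
    history: list[str] = field(default_factory=list)

    def add_document(self, doc_id: str) -> None:
        self.documents.add(doc_id)

    def ask(self, question: str) -> None:
        # Only this workspace's history grows; other workspaces are untouched.
        self.history.append(question)


research = Workspace("research")
legal = Workspace("legal")

# The same document can be attached to both workspaces...
research.add_document("contract.pdf")
legal.add_document("contract.pdf")

# ...but chat context stays separate.
research.ask("Summarize the contract.")

print(research.documents & legal.documents)  # {'contract.pdf'}
print(legal.history)  # []
```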
Some cool features of AnythingLLM
- Multi-user instance support and permissioning
- Atomically manage documents in your vector database from a simple UI
- Two chat modes: conversation and query. Conversation retains previous questions and amendments. Query is simple QA against your documents.
- Each chat response contains a citation that is linked to the original document source
- Simple technology stack for fast iteration
- 100% Cloud deployment ready.
- "Bring your own LLM" model.
- Extremely efficient cost-saving measures for managing very large documents. You'll never pay to embed a massive document or transcript more than once. 90% more cost effective than other document chatbot solutions.
- Full Developer API for custom integrations!
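The "never pay to embed a document more than once" point above can be pictured as a cache keyed by a hash of the content. This README does not describe how AnythingLLM implements it, so the sketch below is an assumption: EmbeddingCache and fake_embed are hypothetical names, and fake_embed stands in for a paid embedding call.

```python
import hashlib
from typing import Callable


class EmbeddingCache:
    """Hypothetical cache keyed by the SHA-256 hash of the text."""

    def __init__(self, embed: Callable[[str], list[float]]) -> None:
        self._embed = embed
        self._store: dict[str, list[float]] = {}
        self.calls = 0  # how many times the (paid) embedder was invoked

    def get(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.calls += 1
            self._store[key] = self._embed(text)
        return self._store[key]


def fake_embed(text: str) -> list[float]:
    # Stand-in for a real embedding API call.
    return [float(len(text))]


cache = EmbeddingCache(fake_embed)
cache.get("a very long transcript")
cache.get("a very long transcript")  # served from the cache
print(cache.calls)  # 1
```

Keying on a content hash rather than a filename means a renamed copy of the same document would also be a cache hit.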
Supported LLMs, Embedders, and Vector Databases
Supported LLMs:
Supported Embedding models:
Supported Vector Databases:
Technical Overview
This monorepo consists of four main sections:
- collector: Python tools that enable you to quickly convert online resources or local documents into an LLM-usable format.
- frontend: A viteJS + React frontend that you can run to easily create and manage all your content the LLM can use.
- server: A nodeJS + express server to handle all the interactions and do all the vectorDB management and LLM interactions.
- docker: Docker instructions and build process + information for building from source.
Requirements
- yarn and node on your machine
- python 3.9+ for running scripts in collector/.
- access to an LLM running locally or remotely.
- (optional) a vector database like Pinecone, qDrant, Weaviate, or Chroma*.
*AnythingLLM by default uses a built-in vector database powered by LanceDB.
*AnythingLLM by default embeds text on instance privately. Learn More
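If you want to confirm the local requirements before going further, a small preflight script along these lines may help. It is only a sketch: the helper names missing_tools and python_ok are made up for this example, and it checks only that the tools are on your PATH, not their versions.

```python
import shutil
import sys


def missing_tools(tools: list[str]) -> list[str]:
    """Return the entries of `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum: tuple[int, int] = (3, 9)) -> bool:
    """True if the running interpreter meets the minimum (major, minor) version."""
    return sys.version_info[:2] >= minimum


print("missing:", missing_tools(["yarn", "node"]))
print("python 3.9+:", python_ok())
```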
Recommended usage with Docker (easy!)
Tip
It is best to mount the container's storage volume to a folder on your host machine so that you can pull in future updates without deleting your existing data!
docker pull mintplexlabs/anythingllm:master
export STORAGE_LOCATION="/var/lib/anythingllm" && \
mkdir -p "$STORAGE_LOCATION" && \
touch "$STORAGE_LOCATION/.env" && \
docker run -d -p 3001:3001 \
-v ${STORAGE_LOCATION}:/app/server/storage \
-v ${STORAGE_LOCATION}/.env:/app/server/.env \
-e STORAGE_DIR="/app/server/storage" \
mintplexlabs/anythingllm:master
Go to http://localhost:3001 and you are now using AnythingLLM! All your data and progress will persist between container rebuilds or pulls from Docker Hub.
Learn more about running AnythingLLM with Docker
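If the page does not load, it can help to check whether anything is listening on the published port at all. The sketch below only tests that a TCP connection to port 3001 succeeds; it does not verify that the listener is AnythingLLM. The function name is_listening is hypothetical.

```python
import socket


def is_listening(host: str = "localhost", port: int = 3001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


print("port 3001 reachable:", is_listening())
```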
How to get started (Development environment)
- Run yarn setup from the project root directory.
  - This will fill in the required .env files you'll need in each of the application sections. Go fill those out before proceeding or else things won't work right.
- Run yarn prisma:setup to build the Prisma client and migrate the database.
To boot the server locally (run commands from root of repo):
- ensure server/.env.development is set and filled out.
yarn dev:server
To boot the frontend locally (run commands from root of repo):
- ensure frontend/.env is set and filled out.
- ensure VITE_API_BASE="http://localhost:3001/api"
yarn dev:frontend
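Since a missing or empty .env value is an easy mistake at this step, here is a rough sketch of checking an env file for required keys. The parser is deliberately simplified (plain KEY=value lines only, not a full dotenv implementation), and parse_env and missing_keys are hypothetical helper names.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; skips blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


sample = 'VITE_API_BASE="http://localhost:3001/api"\n'
print(missing_keys(parse_env(sample), ["VITE_API_BASE"]))  # []
```

To check a real file, read it first, for example with pathlib.Path("frontend/.env").read_text(), and pass the result to parse_env.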
Standalone scripts
This repo contains standalone scripts you can run to collect data from a YouTube channel, Medium articles, local text files, Word documents, and the list goes on. This is where you will use the collector/ part of the repo.
Go set up and run collector scripts
Contributing
- create issue
- create PR with branch name format of <issue number>-<short name>
- yee haw let's merge
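A quick way to sanity-check a branch name against the <issue number>-<short name> format is a small regular expression. The README only gives the overall shape, so the allowed characters in the short name (lowercase letters and digits, hyphen-separated) are an assumption here, as is the helper name is_valid_branch.

```python
import re

# Assumed pattern: digits, a hyphen, then lowercase words separated by hyphens.
BRANCH_PATTERN = re.compile(r"\d+-[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_branch(name: str) -> bool:
    """True if `name` looks like '<issue number>-<short name>'."""
    return BRANCH_PATTERN.fullmatch(name) is not None


print(is_valid_branch("123-fix-upload"))  # True
print(is_valid_branch("fix-upload"))  # False
```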
Telemetry
AnythingLLM by Mintplex Labs Inc contains a telemetry feature that collects anonymous usage information.
Why?
We use this information to help us understand how AnythingLLM is used, to help us prioritize work on new features and bug fixes, and to help us improve AnythingLLM's performance and stability.
Opting out
Set DISABLE_TELEMETRY in your server or docker .env settings to "true" to opt out of telemetry.
DISABLE_TELEMETRY="true"
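Conceptually, the opt-out is an environment flag that gates whether events are sent. The sketch below illustrates that idea only. How the server actually reads the flag is not shown in this README, so the exact-match comparison against "true" and the function name telemetry_enabled are assumptions.

```python
import os
from typing import Mapping


def telemetry_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Telemetry stays on unless DISABLE_TELEMETRY is exactly "true" (assumed rule)."""
    return env.get("DISABLE_TELEMETRY") != "true"


print(telemetry_enabled({"DISABLE_TELEMETRY": "true"}))  # False
print(telemetry_enabled({}))  # True
```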
What do you explicitly track?
We will only track usage details that help us make product and roadmap decisions, specifically:
- Version of your installation
- When a document is added or removed. No information about the document. Just that the event occurred. This gives us an idea of use.
- Type of vector database in use. Lets us know which vector database provider is the most used so we can prioritize changes when updates arrive for that provider.
- Type of LLM in use. Lets us know the most popular choice so we can prioritize changes when updates arrive for that provider.
- Chat is sent. This is the most regular "event" and gives us an idea of the daily activity of this project across all installations. Again, only the event is sent: we have no information on the nature or content of the chat itself.
You can verify these claims by finding all locations where Telemetry.sendTelemetry is called. Additionally, these events are written to the output log, so you can also see the specific data that was sent, if enabled. No IP or other identifying information is collected. The telemetry provider is PostHog, an open-source telemetry collection service.