anything-llm/server/models/documents.js

272 lines
7.8 KiB
JavaScript
Raw Normal View History

2023-06-08 06:31:35 +02:00
const { v4: uuidv4 } = require("uuid");
const { getVectorDbClass } = require("../utils/helpers");
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
const prisma = require("../utils/prisma");
const { Telemetry } = require("./telemetry");
const { EventLogs } = require("./eventLogs");
[BETA] Live document sync (#1719) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 22:38:50 +02:00
const { safeJsonParse } = require("../utils/http");
2023-06-04 04:28:07 +02:00
const Document = {
[BETA] Live document sync (#1719) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 22:38:50 +02:00
writable: ["pinned", "watched", "lastUpdatedAt"],
/**
* @param {import("@prisma/client").workspace_documents} document - Document PrismaRecord
* @returns {{
* metadata: (null|object),
* type: import("./documentSyncQueue.js").validFileType,
* source: string
* }}
*/
parseDocumentTypeAndSource: function (document) {
const metadata = safeJsonParse(document.metadata, null);
if (!metadata) return { metadata: null, type: null, source: null };
// Parse the correct type of source and its original source path.
const idx = metadata.chunkSource.indexOf("://");
const [type, source] = [
metadata.chunkSource.slice(0, idx),
metadata.chunkSource.slice(idx + 3),
];
return { metadata, type, source: this._stripSource(source, type) };
},
2023-06-04 04:28:07 +02:00
forWorkspace: async function (workspaceId = null) {
if (!workspaceId) return [];
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
return await prisma.workspace_documents.findMany({
where: { workspaceId },
});
2023-06-04 04:28:07 +02:00
},
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
delete: async function (clause = {}) {
try {
await prisma.workspace_documents.deleteMany({ where: clause });
return true;
} catch (error) {
console.error(error.message);
return false;
}
2023-06-04 04:28:07 +02:00
},
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
get: async function (clause = {}) {
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
try {
const document = await prisma.workspace_documents.findFirst({
where: clause,
});
return document || null;
} catch (error) {
console.error(error.message);
return null;
}
2023-06-04 04:28:07 +02:00
},
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
[BETA] Live document sync (#1719) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 22:38:50 +02:00
getOnlyWorkspaceIds: async function (clause = {}) {
try {
const workspaceIds = await prisma.workspace_documents.findMany({
where: clause,
select: {
workspaceId: true,
},
});
[BETA] Live document sync (#1719) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 22:38:50 +02:00
return workspaceIds.map((record) => record.workspaceId) || [];
} catch (error) {
console.error(error.message);
return [];
}
},
[BETA] Live document sync (#1719) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 22:38:50 +02:00
where: async function (
clause = {},
limit = null,
orderBy = null,
include = null
) {
try {
const results = await prisma.workspace_documents.findMany({
where: clause,
...(limit !== null ? { take: limit } : {}),
...(orderBy !== null ? { orderBy } : {}),
[BETA] Live document sync (#1719) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 22:38:50 +02:00
...(include !== null ? { include } : {}),
});
return results;
} catch (error) {
console.error(error.message);
return [];
}
},
addDocuments: async function (workspace, additions = [], userId = null) {
2023-06-08 06:31:35 +02:00
const VectorDb = getVectorDbClass();
if (additions.length === 0) return { failed: [], embedded: [] };
const { fileData } = require("../utils/files");
const embedded = [];
const failedToEmbed = [];
const errors = new Set();
2023-06-04 04:28:07 +02:00
for (const path of additions) {
const data = await fileData(path);
if (!data) continue;
const docId = uuidv4();
2023-06-08 06:31:35 +02:00
const { pageContent, ...metadata } = data;
2023-06-04 04:28:07 +02:00
const newDoc = {
docId,
2023-06-08 06:31:35 +02:00
filename: path.split("/")[1],
2023-06-04 04:28:07 +02:00
docpath: path,
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
workspaceId: workspace.id,
2023-06-08 06:31:35 +02:00
metadata: JSON.stringify(metadata),
};
const { vectorized, error } = await VectorDb.addDocumentToNamespace(
2023-06-08 06:31:35 +02:00
workspace.slug,
{ ...data, docId },
path
);
2023-06-04 04:28:07 +02:00
if (!vectorized) {
console.error(
"Failed to vectorize",
metadata?.title || newDoc.filename
);
failedToEmbed.push(metadata?.title || newDoc.filename);
errors.add(error);
2023-06-04 04:28:07 +02:00
continue;
}
2023-07-27 03:06:53 +02:00
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
try {
await prisma.workspace_documents.create({ data: newDoc });
embedded.push(path);
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
} catch (error) {
console.error(error.message);
2023-07-27 03:06:53 +02:00
}
}
await Telemetry.sendTelemetry("documents_embedded_in_workspace", {
LLMSelection: process.env.LLM_PROVIDER || "openai",
2023-12-07 17:53:37 +01:00
Embedder: process.env.EMBEDDING_ENGINE || "inherit",
VectorDbSelection: process.env.VECTOR_DB || "lancedb",
});
await EventLogs.logEvent(
"workspace_documents_added",
{
workspaceName: workspace?.name || "Unknown Workspace",
numberOfDocumentsAdded: additions.length,
},
userId
);
return { failedToEmbed, errors: Array.from(errors), embedded };
2023-06-04 04:28:07 +02:00
},
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
removeDocuments: async function (workspace, removals = [], userId = null) {
2023-06-08 06:31:35 +02:00
const VectorDb = getVectorDbClass();
2023-06-04 04:28:07 +02:00
if (removals.length === 0) return;
2023-07-27 03:06:53 +02:00
2023-06-04 04:28:07 +02:00
for (const path of removals) {
const document = await this.get({
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
docpath: path,
workspaceId: workspace.id,
});
2023-06-04 04:28:07 +02:00
if (!document) continue;
2023-06-08 06:31:35 +02:00
await VectorDb.deleteDocumentFromNamespace(
workspace.slug,
document.docId
);
2023-07-27 03:06:53 +02:00
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
try {
await prisma.workspace_documents.delete({
where: { id: document.id, workspaceId: workspace.id },
});
await prisma.document_vectors.deleteMany({
where: { docId: document.docId },
});
Replace custom sqlite dbms with prisma (#239) * WIP converted all sqlite models into prisma calls * modify db setup and fix ApiKey model calls in admin.js * renaming function params to be consistent * converted adminEndpoints to utilize prisma orm * converted chatEndpoints to utilize prisma orm * converted inviteEndpoints to utilize prisma orm * converted systemEndpoints to utilize prisma orm * converted workspaceEndpoints to utilize prisma orm * converting sql queries to prisma calls * fixed default param bug for orderBy and limit * fixed typo for workspace chats * fixed order of deletion to account for sql relations * fix invite CRUD and workspace management CRUD * fixed CRUD for api keys * created prisma setup scripts/docs for understanding how to use prisma * prisma dependency change * removing unneeded console.logs * removing unneeded sql escape function * linting and creating migration script * migration from depreciated sqlite script update * removing unneeded migrations in prisma folder * create backup of old sqlite db and use transactions to ensure all operations complete successfully * adding migrations to gitignore * updated PRISMA.md docs for info on how to use sqlite migration script * comment changes * adding back migrations folder to repo * Reviewing SQL and prisma integraiton on fresh repo * update inline key replacement * ensure migration script executes and maps foreign_keys regardless of db ordering * run migration endpoint * support new prisma backend * bump version * change migration call --------- Co-authored-by: timothycarambat <rambat1010@gmail.com>
2023-09-28 23:00:03 +02:00
} catch (error) {
console.error(error.message);
2023-07-27 03:06:53 +02:00
}
}
await Telemetry.sendTelemetry("documents_removed_in_workspace", {
LLMSelection: process.env.LLM_PROVIDER || "openai",
2023-12-07 17:53:37 +01:00
Embedder: process.env.EMBEDDING_ENGINE || "inherit",
VectorDbSelection: process.env.VECTOR_DB || "lancedb",
});
await EventLogs.logEvent(
"workspace_documents_removed",
{
workspaceName: workspace?.name || "Unknown Workspace",
numberOfDocuments: removals.length,
},
userId
);
2023-06-04 04:28:07 +02:00
return true;
2023-06-08 06:31:35 +02:00
},
count: async function (clause = {}, limit = null) {
try {
const count = await prisma.workspace_documents.count({
where: clause,
...(limit !== null ? { take: limit } : {}),
});
return count;
} catch (error) {
console.error("FAILED TO COUNT DOCUMENTS.", error.message);
return 0;
}
},
update: async function (id = null, data = {}) {
if (!id) throw new Error("No workspace document id provided for update");
const validKeys = Object.keys(data).filter((key) =>
this.writable.includes(key)
);
if (validKeys.length === 0)
return { document: { id }, message: "No valid fields to update!" };
try {
const document = await prisma.workspace_documents.update({
where: { id },
data,
});
return { document, message: null };
} catch (error) {
console.error(error.message);
return { document: null, message: error.message };
}
},
[BETA] Live document sync (#1719) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 22:38:50 +02:00
_updateAll: async function (clause = {}, data = {}) {
try {
await prisma.workspace_documents.updateMany({
where: clause,
data,
});
return true;
} catch (error) {
console.error(error.message);
return false;
}
},
content: async function (docId) {
if (!docId) throw new Error("No workspace docId provided!");
const document = await this.get({ docId: String(docId) });
if (!document) throw new Error(`Could not find a document by id ${docId}`);
const { fileData } = require("../utils/files");
const data = await fileData(document.docpath);
return { title: data.title, content: data.pageContent };
},
[BETA] Live document sync (#1719) * wip bg workers for live document sync * Add ability to re-embed specific documents across many workspaces via background queue bgworkser is gated behind expieremental system setting flag that needs to be explictly enabled UI for watching/unwatching docments that are embedded. TODO: UI to easily manage all bg tasks and see run results TODO: UI to enable this feature and background endpoints to manage it * create frontend views and paths Move elements to correct experimental scope * update migration to delete runs on removal of watched document * Add watch support to YouTube transcripts (#1716) * Add watch support to YouTube transcripts refactor how sync is done for supported types * Watch specific files in Confluence space (#1718) Add failure-prune check for runs * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * create tmp workflow modifications for beta image * dual build update copy of alert modals * update job interval * Add support for live-sync of Github files * update copy for document sync feature * hide Experimental features from UI * update docs links * [FEAT] Implement new settings menu for experimental features (#1735) * implement new settings menu for experimental features * remove unused context save bar --------- Co-authored-by: timothycarambat <rambat1010@gmail.com> * dont run job on boot * unset workflow changes * Add persistent encryption service Relay key to collector so persistent encryption can be used Encrypt any private data in chunkSources used for replay during resync jobs * update jsDOC * Linting and organization * update modal copy for feature --------- Co-authored-by: Sean Hatfield <seanhatfield5@gmail.com>
2024-06-21 22:38:50 +02:00
contentByDocPath: async function (docPath) {
const { fileData } = require("../utils/files");
const data = await fileData(docPath);
return { title: data.title, content: data.pageContent };
},
// Some data sources have encoded params in them we don't want to log - so strip those details.
_stripSource: function (sourceString, type) {
if (["confluence", "github"].includes(type)) {
const _src = new URL(sourceString);
_src.search = ""; // remove all search params that are encoded for resync.
return _src.toString();
}
return sourceString;
},
2023-06-08 06:31:35 +02:00
};
2023-06-04 04:28:07 +02:00
2023-06-08 06:31:35 +02:00
module.exports = { Document };