@aiconnect/easy-rag 0.3.0
- package/LICENSE +21 -0
- package/README.md +181 -0
- package/dist/chunker/csv.d.ts +4 -0
- package/dist/chunker/csv.d.ts.map +1 -0
- package/dist/chunker/csv.js +10 -0
- package/dist/chunker/csv.js.map +1 -0
- package/dist/chunker/index.d.ts +5 -0
- package/dist/chunker/index.d.ts.map +1 -0
- package/dist/chunker/index.js +20 -0
- package/dist/chunker/index.js.map +1 -0
- package/dist/chunker/markdown.d.ts +4 -0
- package/dist/chunker/markdown.d.ts.map +1 -0
- package/dist/chunker/markdown.js +35 -0
- package/dist/chunker/markdown.js.map +1 -0
- package/dist/chunker/pdf.d.ts +4 -0
- package/dist/chunker/pdf.d.ts.map +1 -0
- package/dist/chunker/pdf.js +26 -0
- package/dist/chunker/pdf.js.map +1 -0
- package/dist/chunker/types.d.ts +12 -0
- package/dist/chunker/types.d.ts.map +1 -0
- package/dist/chunker/types.js +2 -0
- package/dist/chunker/types.js.map +1 -0
- package/dist/commands/init.d.ts +2 -0
- package/dist/commands/init.d.ts.map +1 -0
- package/dist/commands/init.js +120 -0
- package/dist/commands/init.js.map +1 -0
- package/dist/commands/serve.d.ts +2 -0
- package/dist/commands/serve.d.ts.map +1 -0
- package/dist/commands/serve.js +17 -0
- package/dist/commands/serve.js.map +1 -0
- package/dist/config/index.d.ts +9 -0
- package/dist/config/index.d.ts.map +1 -0
- package/dist/config/index.js +66 -0
- package/dist/config/index.js.map +1 -0
- package/dist/config/types.d.ts +5 -0
- package/dist/config/types.d.ts.map +1 -0
- package/dist/config/types.js +2 -0
- package/dist/config/types.js.map +1 -0
- package/dist/embeddings/index.d.ts +3 -0
- package/dist/embeddings/index.d.ts.map +1 -0
- package/dist/embeddings/index.js +2 -0
- package/dist/embeddings/index.js.map +1 -0
- package/dist/embeddings/openai.d.ts +2 -0
- package/dist/embeddings/openai.d.ts.map +1 -0
- package/dist/embeddings/openai.js +57 -0
- package/dist/embeddings/openai.js.map +1 -0
- package/dist/embeddings/types.d.ts +15 -0
- package/dist/embeddings/types.d.ts.map +1 -0
- package/dist/embeddings/types.js +2 -0
- package/dist/embeddings/types.js.map +1 -0
- package/dist/index.d.ts +3 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +248 -0
- package/dist/index.js.map +1 -0
- package/dist/indexer/index.d.ts +4 -0
- package/dist/indexer/index.d.ts.map +1 -0
- package/dist/indexer/index.js +3 -0
- package/dist/indexer/index.js.map +1 -0
- package/dist/indexer/orchestrator.d.ts +3 -0
- package/dist/indexer/orchestrator.d.ts.map +1 -0
- package/dist/indexer/orchestrator.js +106 -0
- package/dist/indexer/orchestrator.js.map +1 -0
- package/dist/indexer/scanner.d.ts +2 -0
- package/dist/indexer/scanner.d.ts.map +1 -0
- package/dist/indexer/scanner.js +34 -0
- package/dist/indexer/scanner.js.map +1 -0
- package/dist/indexer/types.d.ts +12 -0
- package/dist/indexer/types.d.ts.map +1 -0
- package/dist/indexer/types.js +2 -0
- package/dist/indexer/types.js.map +1 -0
- package/dist/parsers/csv.d.ts +3 -0
- package/dist/parsers/csv.d.ts.map +1 -0
- package/dist/parsers/csv.js +63 -0
- package/dist/parsers/csv.js.map +1 -0
- package/dist/parsers/fileTypeDetector.d.ts +3 -0
- package/dist/parsers/fileTypeDetector.d.ts.map +1 -0
- package/dist/parsers/fileTypeDetector.js +16 -0
- package/dist/parsers/fileTypeDetector.js.map +1 -0
- package/dist/parsers/index.d.ts +3 -0
- package/dist/parsers/index.d.ts.map +1 -0
- package/dist/parsers/index.js +18 -0
- package/dist/parsers/index.js.map +1 -0
- package/dist/parsers/markdown.d.ts +3 -0
- package/dist/parsers/markdown.d.ts.map +1 -0
- package/dist/parsers/markdown.js +30 -0
- package/dist/parsers/markdown.js.map +1 -0
- package/dist/parsers/pdf.d.ts +3 -0
- package/dist/parsers/pdf.d.ts.map +1 -0
- package/dist/parsers/pdf.js +22 -0
- package/dist/parsers/pdf.js.map +1 -0
- package/dist/parsers/types.d.ts +17 -0
- package/dist/parsers/types.d.ts.map +1 -0
- package/dist/parsers/types.js +2 -0
- package/dist/parsers/types.js.map +1 -0
- package/dist/query/embedding.d.ts +2 -0
- package/dist/query/embedding.d.ts.map +1 -0
- package/dist/query/embedding.js +6 -0
- package/dist/query/embedding.js.map +1 -0
- package/dist/query/index.d.ts +3 -0
- package/dist/query/index.d.ts.map +1 -0
- package/dist/query/index.js +86 -0
- package/dist/query/index.js.map +1 -0
- package/dist/query/search.d.ts +6 -0
- package/dist/query/search.d.ts.map +1 -0
- package/dist/query/search.js +45 -0
- package/dist/query/search.js.map +1 -0
- package/dist/query/types.d.ts +19 -0
- package/dist/query/types.d.ts.map +1 -0
- package/dist/query/types.js +2 -0
- package/dist/query/types.js.map +1 -0
- package/dist/vector-store/chroma-server.d.ts +10 -0
- package/dist/vector-store/chroma-server.d.ts.map +1 -0
- package/dist/vector-store/chroma-server.js +102 -0
- package/dist/vector-store/chroma-server.js.map +1 -0
- package/dist/vector-store/chromadb.d.ts +8 -0
- package/dist/vector-store/chromadb.d.ts.map +1 -0
- package/dist/vector-store/chromadb.js +98 -0
- package/dist/vector-store/chromadb.js.map +1 -0
- package/dist/vector-store/index.d.ts +4 -0
- package/dist/vector-store/index.d.ts.map +1 -0
- package/dist/vector-store/index.js +3 -0
- package/dist/vector-store/index.js.map +1 -0
- package/dist/vector-store/types.d.ts +12 -0
- package/dist/vector-store/types.d.ts.map +1 -0
- package/dist/vector-store/types.js +2 -0
- package/dist/vector-store/types.js.map +1 -0
- package/dist/vector-store/utils.d.ts +2 -0
- package/dist/vector-store/utils.d.ts.map +1 -0
- package/dist/vector-store/utils.js +17 -0
- package/dist/vector-store/utils.js.map +1 -0
- package/package.json +57 -0
- package/skills/easy-rag/SKILL.md +198 -0
@@ -0,0 +1,98 @@
+import { ChromaClient } from 'chromadb';
+import { ensureChromaDataDirectory } from './utils.js';
+const client = new ChromaClient({
+    path: process.env.CHROMA_URL || 'http://localhost:8000',
+});
+export async function getOrCreateCollection(baseName) {
+    const sanitizedBaseName = baseName
+        .replace(/[^a-zA-Z0-9_-]/g, '_')
+        .toLowerCase();
+    let collectionName = sanitizedBaseName;
+    let suffix = 1;
+    while (true) {
+        try {
+            const collection = await client.getOrCreateCollection({
+                name: collectionName,
+                metadata: { 'hnsw:space': 'cosine' },
+            });
+            if (suffix === 1 || collectionName !== sanitizedBaseName) {
+                console.log(`Using collection: ${collectionName}`);
+            }
+            return collection;
+        }
+        catch (error) {
+            if (error instanceof Error && error.message.includes('already exists')) {
+                collectionName = `${sanitizedBaseName}_${suffix}`;
+                suffix++;
+            }
+            else {
+                throw error;
+            }
+        }
+    }
+}
+export async function storeEmbeddings(collection, chunks) {
+    if (chunks.length === 0) {
+        return;
+    }
+    const ids = chunks.map((c) => c.id);
+    const embeddings = chunks.map((c) => c.embedding);
+    const documents = chunks.map((c) => c.content);
+    const metadatas = chunks.map((c) => {
+        const metadata = {
+            sourcePath: c.metadata.sourcePath,
+            contentType: c.metadata.contentType,
+            chunkIndex: c.metadata.chunkIndex,
+        };
+        if (c.metadata.sectionTitle !== undefined) {
+            metadata.sectionTitle = c.metadata.sectionTitle;
+        }
+        if (c.metadata.startPosition !== undefined) {
+            metadata.startPosition = c.metadata.startPosition;
+        }
+        if (c.metadata.endPosition !== undefined) {
+            metadata.endPosition = c.metadata.endPosition;
+        }
+        if (c.metadata.rowNumber !== undefined) {
+            metadata.rowNumber = c.metadata.rowNumber;
+        }
+        return metadata;
+    });
+    await collection.add({
+        ids,
+        embeddings,
+        documents,
+        metadatas,
+    });
+}
+export async function initializeChromaDB() {
+    await ensureChromaDataDirectory();
+}
+export async function listCollections() {
+    const collectionNames = await client.listCollections();
+    const collectionInfos = [];
+    for (const name of collectionNames) {
+        try {
+            const collection = await client.getCollection({ name });
+            const count = await collection.count();
+            collectionInfos.push({
+                name,
+                count,
+            });
+        }
+        catch {
+            collectionInfos.push({
+                name,
+                count: 0,
+            });
+        }
+    }
+    return collectionInfos;
+}
+export async function deleteCollection(collectionName) {
+    const sanitized = collectionName
+        .replace(/[^a-zA-Z0-9_-]/g, '_')
+        .toLowerCase();
+    await client.deleteCollection({ name: sanitized });
+}
+//# sourceMappingURL=chromadb.js.map
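The collection-naming logic in `getOrCreateCollection` above can be isolated from the ChromaDB client for inspection. The following is a minimal sketch, not code from the package: `sanitizeCollectionName` and `candidateNames` are hypothetical helper names, and the sketch only reproduces the regex, the lowercasing, and the `_1`, `_2`, ... suffix sequence that the retry loop would walk through on an "already exists" error.

```typescript
// Hypothetical helper (not in the package): applies the same regex and
// lowercasing that getOrCreateCollection and deleteCollection use.
function sanitizeCollectionName(baseName: string): string {
  return baseName.replace(/[^a-zA-Z0-9_-]/g, "_").toLowerCase();
}

// Hypothetical helper: lists the names the retry loop would try, in order.
// The first candidate is the sanitized base name; later ones append _1, _2, ...
function candidateNames(baseName: string, attempts: number): string[] {
  const sanitized = sanitizeCollectionName(baseName);
  const names = [sanitized];
  for (let suffix = 1; names.length < attempts; suffix++) {
    names.push(`${sanitized}_${suffix}`);
  }
  return names;
}

console.log(sanitizeCollectionName("My Docs (2024)")); // my_docs__2024_
console.log(candidateNames("knowledge-base", 3));
```

One consequence worth noting: distinct inputs such as `My Docs` and `my_docs` sanitize to the same name, so they would resolve to the same collection.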
@@ -0,0 +1 @@
+{"version":3,"file":"chromadb.js","sourceRoot":"","sources":["../../src/vector-store/chromadb.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,YAAY,EAAmB,MAAM,UAAU,CAAC;AAEzD,OAAO,EAAE,yBAAyB,EAAE,MAAM,YAAY,CAAC;AAEvD,MAAM,MAAM,GAAG,IAAI,YAAY,CAAC;IAC9B,IAAI,EAAE,OAAO,CAAC,GAAG,CAAC,UAAU,IAAI,uBAAuB;CACxD,CAAC,CAAC;AAEH,MAAM,CAAC,KAAK,UAAU,qBAAqB,CAAC,QAAgB;IAC1D,MAAM,iBAAiB,GAAG,QAAQ;SAC/B,OAAO,CAAC,iBAAiB,EAAE,GAAG,CAAC;SAC/B,WAAW,EAAE,CAAC;IAEjB,IAAI,cAAc,GAAG,iBAAiB,CAAC;IACvC,IAAI,MAAM,GAAG,CAAC,CAAC;IAEf,OAAO,IAAI,EAAE,CAAC;QACZ,IAAI,CAAC;YACH,MAAM,UAAU,GAAG,MAAM,MAAM,CAAC,qBAAqB,CAAC;gBACpD,IAAI,EAAE,cAAc;gBACpB,QAAQ,EAAE,EAAE,YAAY,EAAE,QAAQ,EAAE;aACrC,CAAC,CAAC;YAEH,IAAI,MAAM,KAAK,CAAC,IAAI,cAAc,KAAK,iBAAiB,EAAE,CAAC;gBACzD,OAAO,CAAC,GAAG,CAAC,qBAAqB,cAAc,EAAE,CAAC,CAAC;YACrD,CAAC;YAED,OAAO,UAAU,CAAC;QACpB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,IAAI,KAAK,YAAY,KAAK,IAAI,KAAK,CAAC,OAAO,CAAC,QAAQ,CAAC,gBAAgB,CAAC,EAAE,CAAC;gBACvE,cAAc,GAAG,GAAG,iBAAiB,IAAI,MAAM,EAAE,CAAC;gBAClD,MAAM,EAAE,CAAC;YACX,CAAC;iBAAM,CAAC;gBACN,MAAM,KAAK,CAAC;YACd,CAAC;QACH,CAAC;IACH,CAAC;AACH,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,eAAe,CACnC,UAAsB,EACtB,MAAqB;IAErB,IAAI,MAAM,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;QACxB,OAAO;IACT,CAAC;IAED,MAAM,GAAG,GAAG,MAAM,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC;IACpC,MAAM,UAAU,GAAG,MAAM,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,SAAS,CAAC,CAAC;IAClD,MAAM,SAAS,GAAG,MAAM,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,OAAO,CAAC,CAAC;IAC/C,MAAM,SAAS,GAAG,MAAM,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE;QACjC,MAAM,QAAQ,GAA8C;YAC1D,UAAU,EAAE,CAAC,CAAC,QAAQ,CAAC,UAAU;YACjC,WAAW,EAAE,CAAC,CAAC,QAAQ,CAAC,WAAW;YACnC,UAAU,EAAE,CAAC,CAAC,QAAQ,CAAC,UAAU;SAClC,CAAC;QAEF,IAAI,CAAC,CAAC,QAAQ,CAAC,YAAY,KAAK,SAAS,EAAE,CAAC;YAC1C,QAAQ,CAAC,YAAY,GAAG,CAAC,CAAC,QAAQ,CAAC,YAAY,CAAC;QAClD,CAAC;QACD,IAAI,CAAC,CAAC,QAAQ,CAAC,aAAa,KAAK,SAAS,EAAE,CAAC;YAC3C,QAAQ,CAAC,aAAa,GAAG,CAAC,CAAC,QAAQ,CAAC,aAAa,CAAC;QACpD,CAAC;QACD,IAAI,CAAC,CAAC,QAAQ,CAAC,WAAW,KAAK,SAAS,EAAE,CAAC;YACzC,QAAQ,CAAC,WAAW,GAAG,CAAC,CAAC,QAAQ,CAAC,WAAW,CAAC;QAChD,CAAC;QACD,IAAI,CAAC,CAAC,QAAQ,CAAC,SAAS,KAAK,SAAS,EAAE,CAAC;YACvC,QAAQ,CAAC,SAAS,GAAG,CAAC,CAAC,QAAQ,CAAC,SAAS,CAAC;QAC5C,CAAC;QAED,OAAO,QAAQ,CAAC;IAClB,CAAC,CAAC,CAAC;IAEH,MAAM,UAAU,CAAC,GAAG,CAAC;QACnB,GAAG;QACH,UAAU;QACV,SAAS;QACT,SAAS;KACV,CAAC,CAAC;AACL,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,kBAAkB;IACtC,MAAM,yBAAyB,EAAE,CAAC;AACpC,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,eAAe;IACnC,MAAM,eAAe,GAAG,MAAM,MAAM,CAAC,eAAe,EAAE,CAAC;IACvD,MAAM,eAAe,GAAqB,EAAE,CAAC;IAE7C,KAAK,MAAM,IAAI,IAAI,eAAe,EAAE,CAAC;QACnC,IAAI,CAAC;YACH,MAAM,UAAU,GAAG,MAAM,MAAM,CAAC,aAAa,CAAC,EAAE,IAAI,EAAE,CAAC,CAAC;YACxD,MAAM,KAAK,GAAG,MAAM,UAAU,CAAC,KAAK,EAAE,CAAC;YACvC,eAAe,CAAC,IAAI,CAAC;gBACnB,IAAI;gBACJ,KAAK;aACN,CAAC,CAAC;QACL,CAAC;QAAC,MAAM,CAAC;YACP,eAAe,CAAC,IAAI,CAAC;gBACnB,IAAI;gBACJ,KAAK,EAAE,CAAC;aACT,CAAC,CAAC;QACL,CAAC;IACH,CAAC;IAED,OAAO,eAAe,CAAC;AACzB,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,gBAAgB,CAAC,cAAsB;IAC3D,MAAM,SAAS,GAAG,cAAc;SAC7B,OAAO,CAAC,iBAAiB,EAAE,GAAG,CAAC;SAC/B,WAAW,EAAE,CAAC;IACjB,MAAM,MAAM,CAAC,gBAAgB,CAAC,EAAE,IAAI,EAAE,SAAS,EAAE,CAAC,CAAC;AACrD,CAAC"}
@@ -0,0 +1,4 @@
+export { getOrCreateCollection, storeEmbeddings, initializeChromaDB, listCollections, deleteCollection } from './chromadb.js';
+export { ensureChromaDataDirectory } from './utils.js';
+export type { VectorChunk, CollectionInfo } from './types.js';
+//# sourceMappingURL=index.d.ts.map
@@ -0,0 +1 @@
+{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/vector-store/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,qBAAqB,EAAE,eAAe,EAAE,kBAAkB,EAAE,eAAe,EAAE,gBAAgB,EAAE,MAAM,eAAe,CAAC;AAC9H,OAAO,EAAE,yBAAyB,EAAE,MAAM,YAAY,CAAC;AACvD,YAAY,EAAE,WAAW,EAAE,cAAc,EAAE,MAAM,YAAY,CAAC"}
@@ -0,0 +1 @@
+{"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/vector-store/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,qBAAqB,EAAE,eAAe,EAAE,kBAAkB,EAAE,eAAe,EAAE,gBAAgB,EAAE,MAAM,eAAe,CAAC;AAC9H,OAAO,EAAE,yBAAyB,EAAE,MAAM,YAAY,CAAC"}
@@ -0,0 +1,12 @@
+import type { ChunkMetadata } from '../chunker/types.js';
+export interface VectorChunk {
+    id: string;
+    embedding: number[];
+    metadata: ChunkMetadata;
+    content: string;
+}
+export interface CollectionInfo {
+    name: string;
+    count: number;
+}
+//# sourceMappingURL=types.d.ts.map
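The `VectorChunk` interface above is what `storeEmbeddings` in `chromadb.js` consumes: it splits the chunks into parallel `ids`, `embeddings`, `documents`, and `metadatas` arrays and copies the optional metadata fields only when they are defined. The sketch below illustrates that transformation without a ChromaDB client. It is not code from the package: `ChunkMetadataSketch` is an assumed shape inferred from the fields `storeEmbeddings` reads (the real `ChunkMetadata` is declared in `../chunker/types.js`, which this diff does not show), and `toFlatMetadata` and `toAddPayload` are hypothetical names.

```typescript
// Assumed shape, inferred from the fields storeEmbeddings reads.
interface ChunkMetadataSketch {
  sourcePath: string;
  contentType: string;
  chunkIndex: number;
  sectionTitle?: string;
  startPosition?: number;
  endPosition?: number;
  rowNumber?: number;
}

// Mirrors VectorChunk from the hunk above, with the assumed metadata shape.
interface VectorChunkSketch {
  id: string;
  embedding: number[];
  metadata: ChunkMetadataSketch;
  content: string;
}

type FlatMetadata = Record<string, string | number>;

const OPTIONAL_KEYS = ["sectionTitle", "startPosition", "endPosition", "rowNumber"] as const;

// Hypothetical helper: copies required fields, then optional ones only if defined,
// so the resulting record never carries an undefined value.
function toFlatMetadata(meta: ChunkMetadataSketch): FlatMetadata {
  const flat: FlatMetadata = {
    sourcePath: meta.sourcePath,
    contentType: meta.contentType,
    chunkIndex: meta.chunkIndex,
  };
  for (const key of OPTIONAL_KEYS) {
    const value = meta[key];
    if (value !== undefined) {
      flat[key] = value;
    }
  }
  return flat;
}

// Hypothetical helper: builds the four parallel arrays; index i in each array
// describes the same chunk.
function toAddPayload(chunks: VectorChunkSketch[]) {
  return {
    ids: chunks.map((c) => c.id),
    embeddings: chunks.map((c) => c.embedding),
    documents: chunks.map((c) => c.content),
    metadatas: chunks.map((c) => toFlatMetadata(c.metadata)),
  };
}

// Example values (ids, paths, and content types here are made up for illustration).
const examplePayload = toAddPayload([
  {
    id: "guide.md#0",
    embedding: [0.1, 0.2],
    content: "Intro text",
    metadata: { sourcePath: "guide.md", contentType: "markdown", chunkIndex: 0, sectionTitle: "Intro" },
  },
  {
    id: "prices.csv#1",
    embedding: [0.3, 0.4],
    content: "sku=42,price=9.99",
    metadata: { sourcePath: "prices.csv", contentType: "csv", chunkIndex: 1, rowNumber: 1 },
  },
]);
console.log(examplePayload.metadatas);
```

Omitting undefined fields presumably keeps every metadata value a plain string or number, which is the kind of flat record a vector store's metadata filter can work with.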
@@ -0,0 +1 @@
+{"version":3,"file":"types.d.ts","sourceRoot":"","sources":["../../src/vector-store/types.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,aAAa,EAAE,MAAM,qBAAqB,CAAC;AAEzD,MAAM,WAAW,WAAW;IAC1B,EAAE,EAAE,MAAM,CAAC;IACX,SAAS,EAAE,MAAM,EAAE,CAAC;IACpB,QAAQ,EAAE,aAAa,CAAC;IACxB,OAAO,EAAE,MAAM,CAAC;CACjB;AAED,MAAM,WAAW,cAAc;IAC7B,IAAI,EAAE,MAAM,CAAC;IACb,KAAK,EAAE,MAAM,CAAC;CACf"}
@@ -0,0 +1 @@
+{"version":3,"file":"types.js","sourceRoot":"","sources":["../../src/vector-store/types.ts"],"names":[],"mappings":""}
@@ -0,0 +1 @@
+{"version":3,"file":"utils.d.ts","sourceRoot":"","sources":["../../src/vector-store/utils.ts"],"names":[],"mappings":"AAMA,wBAAsB,yBAAyB,IAAI,OAAO,CAAC,MAAM,CAAC,CAYjE"}
@@ -0,0 +1,17 @@
+import os from 'os';
+import path from 'path';
+import fs from 'fs/promises';
+const CHROMA_DATA_DIR = path.join(os.homedir(), '.easy-rag', 'chromadb');
+export async function ensureChromaDataDirectory() {
+    try {
+        await fs.mkdir(CHROMA_DATA_DIR, { recursive: true });
+        return CHROMA_DATA_DIR;
+    }
+    catch (error) {
+        if (error instanceof Error && 'code' in error && error.code === 'EACCES') {
+            throw new Error(`Cannot write to ${CHROMA_DATA_DIR}. Please check directory permissions.`);
+        }
+        throw error;
+    }
+}
+//# sourceMappingURL=utils.js.map
@@ -0,0 +1 @@
+{"version":3,"file":"utils.js","sourceRoot":"","sources":["../../src/vector-store/utils.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,MAAM,IAAI,CAAC;AACpB,OAAO,IAAI,MAAM,MAAM,CAAC;AACxB,OAAO,EAAE,MAAM,aAAa,CAAC;AAE7B,MAAM,eAAe,GAAG,IAAI,CAAC,IAAI,CAAC,EAAE,CAAC,OAAO,EAAE,EAAE,WAAW,EAAE,UAAU,CAAC,CAAC;AAEzE,MAAM,CAAC,KAAK,UAAU,yBAAyB;IAC7C,IAAI,CAAC;QACH,MAAM,EAAE,CAAC,KAAK,CAAC,eAAe,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;QACrD,OAAO,eAAe,CAAC;IACzB,CAAC;IAAC,OAAO,KAAK,EAAE,CAAC;QACf,IAAI,KAAK,YAAY,KAAK,IAAI,MAAM,IAAI,KAAK,IAAI,KAAK,CAAC,IAAI,KAAK,QAAQ,EAAE,CAAC;YACzE,MAAM,IAAI,KAAK,CACb,mBAAmB,eAAe,uCAAuC,CAC1E,CAAC;QACJ,CAAC;QACD,MAAM,KAAK,CAAC;IACd,CAAC;AACH,CAAC"}
package/package.json
ADDED
@@ -0,0 +1,57 @@
+{
+  "name": "@aiconnect/easy-rag",
+  "version": "0.3.0",
+  "description": "A TypeScript CLI tool for local RAG indexing and querying",
+  "type": "module",
+  "main": "dist/index.js",
+  "bin": {
+    "easy-rag": "./dist/index.js"
+  },
+  "scripts": {
+    "build": "tsc",
+    "dev": "tsx src/index.ts",
+    "start": "node dist/index.js",
+    "test": "node --experimental-vm-modules node_modules/jest/bin/jest.js",
+    "test:watch": "node --experimental-vm-modules node_modules/jest/bin/jest.js --watch",
+    "prepublishOnly": "npm run build"
+  },
+  "publishConfig": {
+    "access": "public"
+  },
+  "files": [
+    "dist",
+    "skills",
+    "README.md"
+  ],
+  "keywords": [
+    "rag",
+    "cli",
+    "vector-search",
+    "embeddings",
+    "chromadb",
+    "openai",
+    "retrieval-augmented-generation"
+  ],
+  "author": "AI Connect",
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/johnjohn-aic/easy-rag.git"
+  },
+  "devDependencies": {
+    "@types/jest": "^29.5.14",
+    "@types/node": "^22.19.11",
+    "jest": "^29.7.0",
+    "ts-jest": "^29.4.6",
+    "typescript": "^5.9.3"
+  },
+  "dependencies": {
+    "commander": "^13.1.0",
+    "pdf-parse": "^1.1.1",
+    "openai": "^4.89.0",
+    "chromadb": "^2.1.5"
+  },
+  "engines": {
+    "node": ">=18.0.0"
+  }
+}
@@ -0,0 +1,198 @@
+---
+name: easy-rag
+description: >
+  Index documents and query them using local RAG (Retrieval-Augmented Generation).
+  Use when you need to build a knowledge base from files (PDF, Markdown, CSV) and
+  search it with natural language questions. Runs locally with ChromaDB embedded.
+---
+
+# Easy RAG Skill
+
+## When to Use
+
+- You need to **index documents** (PDF, Markdown, CSV) into a searchable knowledge base
+- You need to **query** indexed documents with natural language questions
+- You need to **manage collections** (list, delete)
+
+## Requirements
+
+- **OpenAI API key** must be configured (via `easy-rag init` or `OPENAI_API_KEY` env var)
+- No external services needed — ChromaDB runs embedded locally
+
+### Setting Up API Key
+
+**Option 1: Interactive init (recommended)**
+```bash
+easy-rag init
+```
+Prompts for API key and embedding model selection.
+
+**Option 2: Environment variable**
+```bash
+export OPENAI_API_KEY="sk-..."
+```
+
+Environment variables override config file settings.
+
+## Installation
+
+```bash
+# From npm (when published)
+npx easy-rag <command>
+
+# From source
+cd /path/to/easy-rag
+npm install
+npm run build
+node dist/index.js <command>
+```
+
+## Commands
+
+### Initialize Configuration
+
+Run interactive setup to create global configuration.
+
+```bash
+easy-rag init
+```
+
+Prompts for:
+- OpenAI API key (masked input)
+- Embedding model selection (default: `text-embedding-3-large`)
+
+Creates `~/.easy-rag/config.json` with your settings.
+
+**Always run this first** before indexing if you haven't configured EasyRAG yet.
+
+### Index Documents
+
+Recursively scans a folder for supported files and indexes them into ChromaDB.
+
+```bash
+easy-rag index <folder>
+```
+
+- **Supported formats:** `.pdf`, `.md`, `.csv`
+- **Chunking:** Automatic — Markdown by heading/section, PDF by paragraph, CSV by row
+- **Collection:** Named after the folder (e.g., `my-docs/` → collection `my-docs`)
+- **Idempotent:** Re-indexing the same folder updates the existing collection
+
+**Example:**
+```bash
+easy-rag index ./knowledge-base
+# Indexes all PDF, MD, CSV files in ./knowledge-base/
+```
+
+### Query Documents
+
+Search indexed documents using natural language. Returns the most relevant chunks.
+
+```bash
+easy-rag query [options] "<question>"
+```
+
+| Option | Description | Default |
+|--------|-------------|---------|
+| `--top <n>` | Number of results to return | 5 |
+| `--metadata` | Include source file, score, chunk index | off |
+| `--collection <name>` | Search a specific collection | all |
+
+**Examples:**
+```bash
+# Simple query
+easy-rag query "What is the refund policy?"
+
+# Top 3 results with metadata
+easy-rag query --top 3 --metadata "How do I configure the API?"
+
+# Search specific collection
+easy-rag query --collection my-docs "quarterly revenue"
+```
+
+**Output with `--metadata`:**
+```
+[Score: 0.92 | Source: docs/api-guide.md | Chunk: 3]
+To configure the API, set the following environment variables...
+
+[Score: 0.87 | Source: docs/setup.pdf | Chunk: 12]
+The API configuration requires...
+```
+
+**Output without `--metadata`:**
+```
+To configure the API, set the following environment variables...
+
+The API configuration requires...
+```
+
+### List Collections
+
+Show all indexed collections and their document counts.
+
+```bash
+easy-rag collections
+```
+
+### Delete a Collection
+
+Remove an indexed collection. Asks for confirmation unless `--force` is used.
+
+```bash
+easy-rag delete <collection-name>
+easy-rag delete <collection-name> --force
+```
+
+**Always use `--force` in automated workflows** to avoid interactive prompts.
+
+## Workflow for AI Agents
+
+### Building a Knowledge Base
+
+1. Place documents in a folder
+2. Index: `easy-rag index ./my-folder`
+3. Query: `easy-rag query "your question"`
+
+### Recommended Patterns
+
+- **One collection per topic/project** — index separate folders for different knowledge domains
+- **Always use `--metadata`** — helps trace answers back to source documents
+- **Use `--top 3`** for focused answers, `--top 10` for broad research
+- **Re-index after adding files** — run `easy-rag index` again on the same folder
+
+### Error Handling
+
+| Error | Cause | Fix |
+|-------|-------|-----|
+| `OpenAI API key is required` | Missing configuration | Run `easy-rag init` or export `OPENAI_API_KEY` |
+| `No supported files found` | Folder has no PDF/MD/CSV | Check folder path and file extensions |
+| `Collection not found` | Querying non-existent collection | Run `easy-rag collections` to check available ones |
+
+## Environment Variables
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `OPENAI_API_KEY` | Yes (if not in config) | OpenAI API key for generating embeddings |
+| `EMBEDDING_MODEL` | No | Override embedding model (default: `text-embedding-3-large`) |
+
+**Environment variables override config file settings.**
+
+### Configuration File
+
+Config is stored at `~/.easy-rag/config.json`:
+```json
+{
+  "openai_api_key": "sk-...",
+  "embedding_model": "text-embedding-3-large"
+}
+```
+
+**Priority (highest first):**
+1. Environment variables
+2. Config file
+3. Hardcoded defaults
+
+**Available models:**
+- `text-embedding-3-large` (default, best performance)
+- `text-embedding-3-small` (faster, cheaper)
+- `text-embedding-ada-002` (legacy)
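The priority order stated in the Configuration File section of SKILL.md (environment variables, then the config file, then hardcoded defaults) can be sketched as a small resolver. This is an illustrative sketch, not the package's `dist/config/index.js`, whose contents are not shown in this diff: `resolveConfig`, `FileConfig`, and `ResolvedConfig` are hypothetical names, and treating an empty string as "unset" is an assumption. The variable names, config keys, and default model come from SKILL.md.

```typescript
// Keys as documented for ~/.easy-rag/config.json.
interface FileConfig {
  openai_api_key?: string;
  embedding_model?: string;
}

// Hypothetical result shape for this sketch.
interface ResolvedConfig {
  apiKey: string | undefined;
  embeddingModel: string;
}

// Default model as documented in SKILL.md.
const DEFAULT_EMBEDDING_MODEL = "text-embedding-3-large";

// Hypothetical resolver: environment first, then config file, then default.
// Using || (rather than ??) means an empty string falls through to the next
// source; that behavior is an assumption of this sketch.
function resolveConfig(
  env: Record<string, string | undefined>,
  file: FileConfig,
): ResolvedConfig {
  return {
    apiKey: env["OPENAI_API_KEY"] || file.openai_api_key,
    embeddingModel:
      env["EMBEDDING_MODEL"] || file.embedding_model || DEFAULT_EMBEDDING_MODEL,
  };
}

// The env var overrides the file's model; the API key falls back to the file.
console.log(
  resolveConfig(
    { EMBEDDING_MODEL: "text-embedding-3-small" },
    { openai_api_key: "sk-from-file", embedding_model: "text-embedding-ada-002" },
  ),
);
```

There is no default for the API key, which matches the documented `OpenAI API key is required` error when neither source provides one.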