@robthepcguy/rag-vault 1.0.0 → 1.1.0
- package/README.md +441 -421
- package/dist/errors/index.d.ts +2 -6
- package/dist/errors/index.d.ts.map +1 -1
- package/dist/errors/index.js +8 -16
- package/dist/errors/index.js.map +1 -1
- package/dist/explainability/index.d.ts +2 -0
- package/dist/explainability/index.d.ts.map +1 -0
- package/dist/explainability/index.js +8 -0
- package/dist/explainability/index.js.map +1 -0
- package/dist/explainability/keywords.d.ts +18 -0
- package/dist/explainability/keywords.d.ts.map +1 -0
- package/dist/explainability/keywords.js +237 -0
- package/dist/explainability/keywords.js.map +1 -0
- package/dist/flywheel/feedback.d.ts +105 -0
- package/dist/flywheel/feedback.d.ts.map +1 -0
- package/dist/flywheel/feedback.js +219 -0
- package/dist/flywheel/feedback.js.map +1 -0
- package/dist/flywheel/index.d.ts +2 -0
- package/dist/flywheel/index.d.ts.map +1 -0
- package/dist/flywheel/index.js +9 -0
- package/dist/flywheel/index.js.map +1 -0
- package/dist/index.js +1 -0
- package/dist/index.js.map +1 -1
- package/dist/server/index.d.ts +40 -0
- package/dist/server/index.d.ts.map +1 -1
- package/dist/server/index.js +113 -0
- package/dist/server/index.js.map +1 -1
- package/dist/server/raw-data-utils.d.ts.map +1 -1
- package/dist/server/raw-data-utils.js.map +1 -1
- package/dist/utils/config.d.ts +15 -0
- package/dist/utils/config.d.ts.map +1 -1
- package/dist/utils/config.js +84 -0
- package/dist/utils/config.js.map +1 -1
- package/dist/utils/math.d.ts +0 -7
- package/dist/utils/math.d.ts.map +1 -1
- package/dist/utils/math.js +0 -14
- package/dist/utils/math.js.map +1 -1
- package/dist/vectordb/index.d.ts +57 -2
- package/dist/vectordb/index.d.ts.map +1 -1
- package/dist/vectordb/index.js +255 -33
- package/dist/vectordb/index.js.map +1 -1
- package/dist/web/api-routes.d.ts.map +1 -1
- package/dist/web/api-routes.js +120 -7
- package/dist/web/api-routes.js.map +1 -1
- package/dist/web/config-routes.d.ts.map +1 -1
- package/dist/web/config-routes.js +84 -2
- package/dist/web/config-routes.js.map +1 -1
- package/dist/web/database-manager.d.ts +119 -1
- package/dist/web/database-manager.d.ts.map +1 -1
- package/dist/web/database-manager.js +339 -51
- package/dist/web/database-manager.js.map +1 -1
- package/dist/web/http-server.d.ts.map +1 -1
- package/dist/web/http-server.js +12 -2
- package/dist/web/http-server.js.map +1 -1
- package/dist/web/index.js +18 -10
- package/dist/web/index.js.map +1 -1
- package/dist/web/middleware/error-handler.d.ts +0 -16
- package/dist/web/middleware/error-handler.d.ts.map +1 -1
- package/dist/web/middleware/error-handler.js +0 -18
- package/dist/web/middleware/error-handler.js.map +1 -1
- package/dist/web/middleware/request-logger.d.ts +2 -1
- package/dist/web/middleware/request-logger.d.ts.map +1 -1
- package/package.json +129 -135
- package/skills/rag-vault/SKILL.md +111 -111
- package/web-ui/dist/assets/index-BcRp9-z9.js +120 -0
- package/web-ui/dist/assets/index-ej8i4PGl.css +1 -0
- package/web-ui/dist/index.html +14 -0
- package/web-ui/dist/vite.svg +3 -0
- package/dist/utils/logger.d.ts +0 -36
- package/dist/utils/logger.d.ts.map +0 -1
- package/dist/utils/logger.js +0 -64
- package/dist/utils/logger.js.map +0 -1
@@ -1,111 +1,111 @@
----
-name: rag-vault
-description: This skill should be used when the user asks to "search documents", "query RAG", "ingest file", "ingest PDF", "save web page", "add to knowledge base", or mentions document search, semantic search, vector search, or RAG operations. Provides score interpretation (< 0.3 good, > 0.5 skip), query optimization, and ingestion guidance for query_documents, ingest_file, ingest_data tools.
-version: 1.0.0
----
-
-# RAG Vault Skills
-
-## Tools
-
-| Tool | Use When |
-|------|----------|
-| `ingest_file` | Local files (PDF, DOCX, TXT, MD, JSON, JSONL) |
-| `ingest_data` | Raw content (HTML, text) with source URL |
-| `query_documents` | Semantic + keyword hybrid search |
-| `delete_file` / `list_files` / `status` | Management |
-
-## Search: Core Rules
-
-Hybrid search combines vector (semantic) and keyword (BM25).
-
-### Score Interpretation
-
-Lower = better match. Use this to filter noise.
-
-| Score | Action |
-|-------|--------|
-| < 0.3 | Use directly |
-| 0.3-0.5 | Include if mentions same concept/entity |
-| > 0.5 | Skip unless no better results |
-
-### Limit Selection
-
-| Intent | Limit |
-|--------|-------|
-| Specific answer (function, error) | 5 |
-| General understanding | 10 |
-| Comprehensive survey | 20 |
-
-### Query Formulation
-
-| Situation | Why Transform | Action |
-|-----------|---------------|--------|
-| Specific term mentioned | Keyword search needs exact match | KEEP term |
-| Vague query | Vector search needs semantic signal | ADD context |
-| Error stack or code block | Long text dilutes relevance | EXTRACT core keywords |
-| Multiple distinct topics | Single query conflates results | SPLIT queries |
-| Few/poor results | Term mismatch | EXPAND (see below) |
-
-### Query Expansion
-
-When results are few or all score > 0.5, expand query terms:
-
-- Keep original term first, add 2-4 variants
-- Types: synonyms, abbreviations, related terms, word forms
-- Example: `"config"` → `"config configuration settings configure"`
-
-Avoid over-expansion (causes topic drift).
-
-### Result Selection
-
-When to include vs skip—based on answer quality, not just score.
-
-**INCLUDE** if:
-- Directly answers the question
-- Provides necessary context
-- Score < 0.5
-
-**SKIP** if:
-- Same keyword, unrelated context
-- Score > 0.7
-- Mentions term without explanation
-
-## Ingestion
-
-### ingest_file
-```
-ingest_file({ filePath: "/absolute/path/to/document.pdf" })
-```
-
-### ingest_data
-```
-ingest_data({
-content: "<html>...</html>",
-metadata: { source: "https://example.com/page", format: "html" }
-})
-```
-
-**Format selection** — match the data you have:
-- HTML string → `format: "html"`
-- Markdown string → `format: "markdown"`
-- Other → `format: "text"`
-
-**Source format:**
-- Web page → Use URL: `https://example.com/page`
-- Other content → Use scheme: `{type}://{date}` or `{type}://{date}/{detail}`
-- Examples: `clipboard://2024-12-30`, `chat://2024-12-30/project-discussion`
-
-**HTML source options:**
-- Static page → LLM fetch
-- SPA/JS-rendered → Browser MCP
-- Auth required → Manual paste
-
-Re-ingest same source to update. Use same source in `delete_file` to remove.
-
-## References
-
-For edge cases and examples:
-- [html-ingestion.md](references/html-ingestion.md) - URL normalization, SPA handling
-- [query-optimization.md](references/query-optimization.md) - Query patterns by intent
-- [result-refinement.md](references/result-refinement.md) - Contradiction resolution, chunking
+---
+name: rag-vault
+description: This skill should be used when the user asks to "search documents", "query RAG", "ingest file", "ingest PDF", "save web page", "add to knowledge base", or mentions document search, semantic search, vector search, or RAG operations. Provides score interpretation (< 0.3 good, > 0.5 skip), query optimization, and ingestion guidance for query_documents, ingest_file, ingest_data tools.
+version: 1.0.0
+---
+
+# RAG Vault Skills
+
+## Tools
+
+| Tool | Use When |
+|------|----------|
+| `ingest_file` | Local files (PDF, DOCX, TXT, MD, JSON, JSONL) |
+| `ingest_data` | Raw content (HTML, text) with source URL |
+| `query_documents` | Semantic + keyword hybrid search |
+| `delete_file` / `list_files` / `status` | Management |
+
+## Search: Core Rules
+
+Hybrid search combines vector (semantic) and keyword (BM25).
+
+### Score Interpretation
+
+Lower = better match. Use this to filter noise.
+
+| Score | Action |
+|-------|--------|
+| < 0.3 | Use directly |
+| 0.3-0.5 | Include if mentions same concept/entity |
+| > 0.5 | Skip unless no better results |
+
+### Limit Selection
+
+| Intent | Limit |
+|--------|-------|
+| Specific answer (function, error) | 5 |
+| General understanding | 10 |
+| Comprehensive survey | 20 |
+
+### Query Formulation
+
+| Situation | Why Transform | Action |
+|-----------|---------------|--------|
+| Specific term mentioned | Keyword search needs exact match | KEEP term |
+| Vague query | Vector search needs semantic signal | ADD context |
+| Error stack or code block | Long text dilutes relevance | EXTRACT core keywords |
+| Multiple distinct topics | Single query conflates results | SPLIT queries |
+| Few/poor results | Term mismatch | EXPAND (see below) |
+
+### Query Expansion
+
+When results are few or all score > 0.5, expand query terms:
+
+- Keep original term first, add 2-4 variants
+- Types: synonyms, abbreviations, related terms, word forms
+- Example: `"config"` → `"config configuration settings configure"`
+
+Avoid over-expansion (causes topic drift).
+
+### Result Selection
+
+When to include vs skip—based on answer quality, not just score.
+
+**INCLUDE** if:
+- Directly answers the question
+- Provides necessary context
+- Score < 0.5
+
+**SKIP** if:
+- Same keyword, unrelated context
+- Score > 0.7
+- Mentions term without explanation
+
+## Ingestion
+
+### ingest_file
+```
+ingest_file({ filePath: "/absolute/path/to/document.pdf" })
+```
+
+### ingest_data
+```
+ingest_data({
+content: "<html>...</html>",
+metadata: { source: "https://example.com/page", format: "html" }
+})
+```
+
+**Format selection** — match the data you have:
+- HTML string → `format: "html"`
+- Markdown string → `format: "markdown"`
+- Other → `format: "text"`
+
+**Source format:**
+- Web page → Use URL: `https://example.com/page`
+- Other content → Use scheme: `{type}://{date}` or `{type}://{date}/{detail}`
+- Examples: `clipboard://2024-12-30`, `chat://2024-12-30/project-discussion`
+
+**HTML source options:**
+- Static page → LLM fetch
+- SPA/JS-rendered → Browser MCP
+- Auth required → Manual paste
+
+Re-ingest same source to update. Use same source in `delete_file` to remove.
+
+## References
+
+For edge cases and examples:
+- [html-ingestion.md](references/html-ingestion.md) - URL normalization, SPA handling
+- [query-optimization.md](references/query-optimization.md) - Query patterns by intent
+- [result-refinement.md](references/result-refinement.md) - Contradiction resolution, chunking
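
Illustrative note, not part of the package: the search rules in the skill file diffed above (score bands, limit selection, query expansion) could be applied on the client side roughly as follows. This is only a sketch under assumptions. The `Hit` shape, every function name, and the "fewer than 2 results counts as few" cutoff are hypothetical and do not come from rag-vault; only the numeric thresholds and limits are taken from the skill text.

```typescript
// Hypothetical shape of one search hit; the actual query_documents
// result fields are not shown in the diff.
interface Hit {
  text: string;
  score: number; // distance-like score: lower means a closer match
}

type Action = "use" | "check-concept" | "skip-unless-nothing-better";

// Map a score onto the three bands of the Score Interpretation table.
function triage(score: number): Action {
  if (score < 0.3) return "use";
  if (score <= 0.5) return "check-concept";
  return "skip-unless-nothing-better";
}

type Intent = "specific" | "general" | "survey";

// Pick a result limit from the Limit Selection table.
function limitFor(intent: Intent): number {
  const limits: Record<Intent, number> = { specific: 5, general: 10, survey: 20 };
  return limits[intent];
}

// Query expansion: original term first, then at most four distinct variants.
function expandQuery(term: string, variants: string[]): string {
  const extras = variants
    .filter((v, i) => v !== term && variants.indexOf(v) === i)
    .slice(0, 4);
  return [term, ...extras].join(" ");
}

// True when the skill text says to expand: few results, or every score above 0.5.
// The default cutoff for "few" is an assumption made for this sketch.
function shouldExpand(hits: Hit[], fewThreshold: number = 2): boolean {
  return hits.length < fewThreshold || hits.every((h) => h.score > 0.5);
}
```

For example, `expandQuery("config", ["configuration", "settings", "configure"])` reproduces the expansion string given in the skill file, and capping the variants at four keeps the query inside the "2-4 variants" guidance that guards against topic drift.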