ruvnet-kb-first 5.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +674 -0
- package/SKILL.md +740 -0
- package/bin/kb-first.js +123 -0
- package/install/init-project.sh +435 -0
- package/install/install-global.sh +257 -0
- package/install/kb-first-autodetect.sh +108 -0
- package/install/kb-first-command.md +80 -0
- package/install/kb-first-skill.md +262 -0
- package/package.json +87 -0
- package/phases/00-assessment.md +529 -0
- package/phases/01-storage.md +194 -0
- package/phases/01.5-hooks-setup.md +521 -0
- package/phases/02-kb-creation.md +413 -0
- package/phases/03-persistence.md +125 -0
- package/phases/04-visualization.md +170 -0
- package/phases/05-integration.md +114 -0
- package/phases/06-scaffold.md +130 -0
- package/phases/07-build.md +493 -0
- package/phases/08-verification.md +597 -0
- package/phases/09-security.md +512 -0
- package/phases/10-documentation.md +613 -0
- package/phases/11-deployment.md +670 -0
- package/phases/testing.md +713 -0
- package/scripts/1.5-hooks-verify.sh +252 -0
- package/scripts/8.1-code-scan.sh +58 -0
- package/scripts/8.2-import-check.sh +42 -0
- package/scripts/8.3-source-returns.sh +52 -0
- package/scripts/8.4-startup-verify.sh +65 -0
- package/scripts/8.5-fallback-check.sh +63 -0
- package/scripts/8.6-attribution.sh +56 -0
- package/scripts/8.7-confidence.sh +56 -0
- package/scripts/8.8-gap-logging.sh +70 -0
- package/scripts/9-security-audit.sh +202 -0
- package/scripts/init-project.sh +395 -0
- package/scripts/verify-enforcement.sh +167 -0
- package/src/commands/hooks.js +361 -0
- package/src/commands/init.js +315 -0
- package/src/commands/phase.js +372 -0
- package/src/commands/score.js +380 -0
- package/src/commands/status.js +193 -0
- package/src/commands/verify.js +286 -0
- package/src/index.js +56 -0
- package/src/mcp-server.js +412 -0
- package/templates/attention-router.ts +534 -0
- package/templates/code-analysis.ts +683 -0
- package/templates/federated-kb-learner.ts +649 -0
- package/templates/gnn-engine.ts +1091 -0
- package/templates/intentions.md +277 -0
- package/templates/kb-client.ts +905 -0
- package/templates/schema.sql +303 -0
- package/templates/sona-config.ts +312 -0
@@ -0,0 +1,413 @@ package/phases/02-kb-creation.md
# Phase 2: World-Class Knowledge Base Creation

## Purpose

Create a knowledge base that represents the collective expertise of the **top 100 world experts** on the topic. This is not a simple FAQ — it's a comprehensive, structured repository of domain knowledge.

---

## Overview

This phase has 8 sub-phases:

| Sub-Phase | Purpose | Quality Criteria |
|-----------|---------|------------------|
| 2.1 | Domain Scoping | Formal definition established |
| 2.2 | Perspective Expansion | All user types and questions identified |
| 2.3 | Expert Discovery | 100 experts, 500+ content, 1000+ insights |
| 2.4 | Completeness Audit | 30+ gaps identified |
| 2.5 | Gap Filling | All gaps addressed with sources |
| 2.6 | Structure Organization | ≤9 primary nodes |
| 2.7 | Recursive Depth | Actual data at all leaves |
| 2.8 | Quality Loop | Score ≥98/100 |

**Do not skip sub-phases.** Each builds on the previous.

---

## Phase 2.1: Domain Scoping

### Purpose
Formally define the domain to ensure complete coverage.

### Process

Use this prompt:
```
You are defining the complete scope of the domain: [TOPIC]

Before identifying specific topics, establish formal boundaries:

1. FORMAL DEFINITION
- What is the academic/professional definition of this field?
- What discipline(s) does it belong to?
- How is it distinguished from adjacent fields?

2. ADJACENT FIELDS
- What related fields overlap with this domain?
- What topics are OUT of scope?
- What topics are shared between fields?

3. SUB-DISCIPLINES
- What are the major recognized sub-areas?
- What professional certifications exist?
- What academic programs cover this field?

4. HISTORICAL CONTEXT
- How has this field evolved?
- What are the major schools of thought?
- What paradigm shifts have occurred?

5. CURRENT STATE
- What are the active debates?
- What's cutting edge?
- What's considered settled knowledge?

Output a structured domain scope document.
```

### Output
A formal domain scope document with clear boundaries.

### Quality Gate
- [ ] Formal definition documented
- [ ] Adjacent fields identified with boundaries
- [ ] Sub-disciplines listed
- [ ] Historical context captured
- [ ] Current debates noted

---

## Phase 2.2: Perspective Expansion

### Purpose
Ensure the KB serves all potential users and questions.

### Process

Use this prompt:
```
For the domain: [TOPIC]

Identify all perspectives that the KB must serve:

1. USER TYPES
- Who are the primary users? (e.g., beginners, professionals)
- Who are secondary users?
- What are their expertise levels?
- What are their goals?

2. QUESTION TYPES
- What factual questions do people ask?
- What "how do I..." questions?
- What "should I..." questions?
- What comparison questions?
- What "what if..." questions?

3. USE CASES
- Educational use cases
- Professional use cases
- Personal decision-making use cases
- Research use cases

4. EDGE CASES
- What unusual situations arise?
- What exceptions to general rules exist?
- What misconceptions need addressing?

5. INTERACTION PATTERNS
- One-time queries vs. ongoing guidance
- Simple lookups vs. complex analysis
- General advice vs. personalized recommendations

Output a comprehensive perspective map.
```

### Output
A perspective map covering all users, questions, and use cases.

### Quality Gate
- [ ] At least 5 user types identified
- [ ] At least 50 question types documented
- [ ] Edge cases explicitly listed
- [ ] Interaction patterns defined

---

## Phase 2.3: Expert Discovery & Knowledge Extraction

### Purpose
Identify and extract knowledge from the top 100 world experts.

### Process

**Step 1: Identify Experts**

Use prompt from `./prompts/expert-discovery.md`:
```
For the domain: [TOPIC]

Identify the top 100 world experts. For each expert provide:
- Full name with credentials
- Primary affiliation
- Key contributions (3-5 bullet points)
- Most cited work
- Where to find their content (books, papers, websites, podcasts)

Categories to cover:
- Academic researchers (30+)
- Practitioners/professionals (30+)
- Authors/educators (20+)
- Industry leaders (10+)
- Emerging voices (10+)

Output as structured list with categories.
```

**Step 2: Extract Content**

For each expert, extract:
- Core concepts they've contributed
- Unique frameworks or models
- Key insights and findings
- Practical recommendations
- Contrarian or nuanced views

Target: **500+ distinct content items**

**Step 3: Synthesize Insights**

Cross-reference experts to identify:
- Points of consensus
- Points of disagreement
- Complementary perspectives
- Evolution of thinking over time

Target: **1000+ insights** (including relationships between concepts)

### Output
- List of 100 experts with metadata
- 500+ content items with attribution
- 1000+ insights with sources

### Quality Gate
- [ ] 100 experts identified across all categories
- [ ] Each expert has verifiable credentials and sources
- [ ] 500+ content items extracted
- [ ] All content has expert attribution
- [ ] 1000+ insights documented

---

## Phase 2.4: Completeness Audit

### Purpose
Identify gaps in the knowledge base.

### Process

Use prompt from `./prompts/completeness-audit.md`:
```
Review the knowledge base for [TOPIC] and identify gaps:

1. COVERAGE GAPS
- What sub-topics have <3 expert sources?
- What user questions can't be answered?
- What use cases aren't supported?

2. DEPTH GAPS
- Where is coverage superficial?
- What topics lack practical examples?
- What lacks quantitative data?

3. PERSPECTIVE GAPS
- What viewpoints are missing?
- What user types are underserved?
- What expertise levels lack content?

4. TEMPORAL GAPS
- What historical context is missing?
- What recent developments aren't covered?
- What emerging topics need addition?

5. STRUCTURAL GAPS
- What relationships between concepts are unclear?
- What prerequisites aren't explained?
- What comparisons are missing?

For each gap:
- Describe specifically what's missing
- Rate importance (critical/high/medium/low)
- Suggest sources to fill it

Target: Identify 30+ gaps.
```

### Output
Gap analysis document with 30+ identified gaps.

### Quality Gate
- [ ] 30+ gaps identified
- [ ] Each gap has importance rating
- [ ] Each gap has suggested sources
- [ ] Critical gaps flagged for priority

---

## Phase 2.5: Gap Filling

### Purpose
Address all identified gaps with quality content.

### Process

For each gap (starting with critical/high importance):

1. **Research** — Find authoritative sources
2. **Extract** — Pull relevant content with attribution
3. **Integrate** — Add to appropriate place in KB
4. **Verify** — Ensure gap is adequately addressed

### Output
All gaps addressed with sourced content.

### Quality Gate
- [ ] All critical gaps filled
- [ ] All high-importance gaps filled
- [ ] Each fill has expert attribution
- [ ] Re-audit shows no critical gaps remaining

---

## Phase 2.6: Structure Organization

### Purpose
Organize all content into a navigable structure.

### Constraint
**Maximum 9 primary nodes** (cognitive limit for comprehension)

### Process

```
Organize all [TOPIC] knowledge into a hierarchical structure:

CONSTRAINTS:
- Maximum 9 top-level nodes
- Each node can have unlimited children
- Leaf nodes contain actual content
- No orphan content

PRINCIPLES:
- MECE (Mutually Exclusive, Collectively Exhaustive)
- User mental models (how do people think about this?)
- Progressive disclosure (simple → complex)
- Multiple access paths (can reach content different ways)

Output:
1. The 9 (or fewer) primary nodes with descriptions
2. Second-level breakdown for each
3. Rationale for organization
```

### Output
- Primary node structure (≤9 nodes)
- Second-level breakdown
- Organization rationale

### Quality Gate
- [ ] ≤9 primary nodes
- [ ] All content has a home
- [ ] Structure follows MECE principle
- [ ] Rationale documented

---

## Phase 2.7: Recursive Depth Expansion

### Purpose
Ensure actual data exists at all leaf nodes.

### Process

For each branch:
1. **Expand** until you reach atomic content
2. **Verify** leaf nodes have actual data (not just labels)
3. **Add examples** where appropriate
4. **Cross-reference** related content

### Leaf Node Requirements

Every leaf node must have (see the TypeScript sketch after this list):
- **Title** — Clear, descriptive name
- **Content** — Actual information (not a placeholder)
- **Expert Source** — Who said this?
- **Source URL** — Where can it be verified?
- **Confidence** — How reliable is this? (0.0-1.0)
- **Metadata** — Tags, related nodes, etc.

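For TypeScript tooling built around the KB (for example, the `kb-client.ts` template), these requirements can be captured as a type. The interface below is only an illustrative sketch: the field names echo the list above and the `kb_nodes` columns used in Phase 3, but the shipped templates and `schema.sql` remain the source of truth.

```typescript
// Illustrative sketch, not the package's actual API: one leaf node with the
// required fields from the list above.
interface KBLeafNode {
  path: string;             // position in the tree, e.g. "primary-node/sub-topic/leaf"
  title: string;            // clear, descriptive name
  content: string;          // actual information, never a placeholder
  sourceExpert: string;     // who said this?
  sourceUrl: string;        // where it can be verified
  confidence: number;       // reliability, 0.0-1.0
  metadata: {
    tags: string[];
    relatedPaths: string[]; // cross-references to related nodes
  };
}
```
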
### Output
Complete tree with data at all leaves.

### Quality Gate
- [ ] No empty leaf nodes
- [ ] All leaves have expert attribution
- [ ] All leaves have confidence scores
- [ ] Cross-references in place

---

## Phase 2.8: Quality Enhancement Loop

### Purpose
Iterate until KB quality reaches ≥98/100.

### Process

Use prompt from `./prompts/quality-critique.md`:
```
Critically evaluate this knowledge base on a scale of 0-100:

SCORING CRITERIA (weight):
1. Completeness (25%) — Does it cover the full domain?
2. Accuracy (25%) — Is the information correct?
3. Attribution (15%) — Are sources properly cited?
4. Structure (15%) — Is it well-organized?
5. Depth (10%) — Is there sufficient detail?
6. Currency (10%) — Is it up to date?

For each criterion:
- Score 0-100
- Specific weaknesses found
- Specific improvements needed

Overall score = weighted average

If score < 98:
- List top 10 improvements needed
- Prioritize by impact
```

### Loop
1. Score current KB
2. If score ≥98, exit loop
3. Implement top improvements
4. Re-score
5. Repeat until ≥98 (see the sketch below)

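A minimal sketch of this loop in TypeScript, assuming hypothetical `scoreKB()` and `applyImprovements()` helpers that stand in for "run the critique prompt" and "implement the top improvements"; only the weights and the ≥98 exit condition come from the criteria above.

```typescript
// Weights from the scoring criteria above (they sum to 1.0, so the weighted
// sum is the weighted average).
const WEIGHTS: Record<string, number> = {
  completeness: 0.25,
  accuracy: 0.25,
  attribution: 0.15,
  structure: 0.15,
  depth: 0.10,
  currency: 0.10,
};

// Hypothetical helpers: run the critique prompt / apply the top improvements.
declare function scoreKB(): Promise<Record<string, number>>; // per-criterion scores, 0-100
declare function applyImprovements(scores: Record<string, number>): Promise<void>;

async function qualityLoop(target = 98): Promise<number> {
  while (true) {
    const scores = await scoreKB();
    const overall = Object.entries(WEIGHTS)
      .reduce((sum, [criterion, weight]) => sum + weight * (scores[criterion] ?? 0), 0);

    if (overall >= target) return overall; // exit the loop at >=98
    await applyImprovements(scores);       // prioritized by impact, then re-score
  }
}
```
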
### Output
KB scoring ≥98/100 with documented improvements.

### Quality Gate
- [ ] Final score ≥98/100
- [ ] All criterion scores documented
- [ ] Improvement history logged
- [ ] No critical weaknesses remaining

---

## Phase 2 Complete

When all sub-phases pass their quality gates:

**Proceed to Phase 3: Persistence & Verification**

@@ -0,0 +1,125 @@ package/phases/03-persistence.md
# Phase 3: Persistence & Verification

## Purpose

Store the knowledge base to PostgreSQL with embeddings and verify that retrieval works correctly.

---

## Steps

### 3.1 Generate Embeddings

For each KB node, generate an embedding:

```typescript
import { embed } from './embedding';

async function generateEmbeddings(nodes: KBNode[]) {
  for (const node of nodes) {
    // Combine title and content for embedding
    const text = `${node.title} ${node.content}`;
    node.embedding = await embed(text);
  }
}
```

**Using ruvector (if available):**
```sql
-- Generate embedding inline
UPDATE kb_nodes
SET embedding = ruvector_embed('all-MiniLM-L6-v2', title || ' ' || content)
WHERE embedding IS NULL;
```

**Using an external API (OpenAI, etc.):**
```typescript
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: text  // `${node.title} ${node.content}`, as above
});
node.embedding = response.data[0].embedding;
```

### 3.2 Insert Nodes

```sql
INSERT INTO kb_nodes (
  namespace, path, title, content,
  source_expert, source_url, confidence,
  embedding, metadata
) VALUES (
  $1, $2, $3, $4, $5, $6, $7, $8, $9
)
ON CONFLICT (namespace, path) DO UPDATE SET
  content = EXCLUDED.content,
  source_expert = EXCLUDED.source_expert,
  source_url = EXCLUDED.source_url,
  confidence = EXCLUDED.confidence,
  embedding = EXCLUDED.embedding,
  updated_at = NOW();
```

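If the upsert is driven from TypeScript rather than psql, it can be executed as a parameterized query. The sketch below is an assumption-laden example (node-postgres, pgvector, and the `KBLeafNode` shape sketched in Phase 2.7), not something the package prescribes.

```typescript
import { Pool } from 'pg';

const pool = new Pool(); // connection settings come from the standard PG* env vars

// Sketch: run the upsert above for a single node. Assumes the kb_nodes
// columns shown in this phase and a pgvector `embedding` column.
async function upsertNode(
  namespace: string,
  node: KBLeafNode & { embedding: number[] }
): Promise<void> {
  await pool.query(
    `INSERT INTO kb_nodes (
       namespace, path, title, content,
       source_expert, source_url, confidence,
       embedding, metadata
     ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
     ON CONFLICT (namespace, path) DO UPDATE SET
       content = EXCLUDED.content,
       source_expert = EXCLUDED.source_expert,
       source_url = EXCLUDED.source_url,
       confidence = EXCLUDED.confidence,
       embedding = EXCLUDED.embedding,
       updated_at = NOW()`,
    [
      namespace, node.path, node.title, node.content,
      node.sourceExpert, node.sourceUrl, node.confidence,
      `[${node.embedding.join(',')}]`, // pgvector accepts a bracketed string literal
      JSON.stringify(node.metadata),
    ]
  );
}
```
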
### 3.3 Create Index

```sql
-- HNSW index for fast similarity search
CREATE INDEX IF NOT EXISTS kb_nodes_embedding_idx
ON kb_nodes USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Text search index
CREATE INDEX IF NOT EXISTS kb_nodes_content_idx
ON kb_nodes USING gin(to_tsvector('english', content));
```

### 3.4 Verify Retrieval

Test semantic search:
```sql
SELECT title, source_expert, confidence,
       embedding <=> $query_embedding AS distance
FROM kb_nodes
WHERE namespace = 'your-namespace'
ORDER BY distance
LIMIT 5;
```

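The same check can be scripted. Below is a sketch of issuing the semantic-search query from TypeScript; node-postgres, pgvector, and the `embed()` helper from step 3.1 are assumptions, and the query embedding is passed as a bracketed vector literal.

```typescript
import { Pool } from 'pg';
import { embed } from './embedding'; // same helper as in step 3.1

const pool = new Pool();

// Sketch: top-5 semantic matches for a natural-language query.
async function semanticSearch(namespace: string, query: string, limit = 5) {
  const queryEmbedding = await embed(query);
  const { rows } = await pool.query(
    `SELECT title, source_expert, confidence,
            embedding <=> $1::vector AS distance
     FROM kb_nodes
     WHERE namespace = $2
     ORDER BY distance
     LIMIT $3`,
    [`[${queryEmbedding.join(',')}]`, namespace, limit]
  );
  return rows; // relevant results should cluster near the top (small distance)
}
```
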
Test hybrid search:
```sql
SELECT title, source_expert,
  0.7 * (1.0 / (1.0 + (embedding <=> $query_embedding))) +
  0.3 * ts_rank(to_tsvector('english', content), plainto_tsquery('english', $keywords))
  AS score
FROM kb_nodes
WHERE namespace = 'your-namespace'
ORDER BY score DESC
LIMIT 5;
```

### 3.5 Generate Statistics Report

```sql
SELECT
  namespace,
  COUNT(*) as node_count,
  COUNT(*) FILTER (WHERE embedding IS NOT NULL) as with_embedding,
  AVG(confidence) as avg_confidence,
  COUNT(DISTINCT source_expert) as expert_count
FROM kb_nodes
GROUP BY namespace;
```

---

## Quality Gate

- [ ] All nodes have embeddings
- [ ] HNSW index created
- [ ] Semantic search returns relevant results
- [ ] All nodes have source_expert attribution
- [ ] All nodes have confidence scores

---

**Proceed to Phase 4: Visualization**

@@ -0,0 +1,170 @@ package/phases/04-visualization.md
# Phase 4: Visualization

## Purpose

Generate an interactive visualization of the knowledge base for human verification and exploration.

---

## Requirements

- Interactive tree (expandable/collapsible nodes)
- Click node to see full content + expert sources
- Search functionality
- Breadcrumb navigation
- Statistics display
- Responsive design

---

## Implementation Options

### Option 1: React + D3 Tree

```tsx
import { Tree } from 'react-d3-tree';

function KBVisualization({ data }) {
  return (
    <Tree
      data={data}
      orientation="vertical"
      pathFunc="step"
      onNodeClick={(node) => showNodeDetails(node)}
    />
  );
}
```

### Option 2: Three.js 3D Tree

For larger knowledge bases, a 3D visualization can provide better navigation:

```typescript
import * as THREE from 'three';

// camera and renderer come from the normal Three.js scene setup (not shown)

// Create 3D tree with force-directed layout
const nodes = createNodes(kbData);
const edges = createEdges(kbData);

// Add interactivity: Three.js has no built-in click events, so pick the
// clicked node with a raycaster
const raycaster = new THREE.Raycaster();
const pointer = new THREE.Vector2();

renderer.domElement.addEventListener('click', (event) => {
  pointer.x = (event.clientX / window.innerWidth) * 2 - 1;
  pointer.y = -(event.clientY / window.innerHeight) * 2 + 1;
  raycaster.setFromCamera(pointer, camera);
  const hit = raycaster.intersectObjects(nodes)[0];
  if (hit) {
    showNodePanel(hit.object.userData);
  }
});
```

### Option 3: Simple HTML Tree

For quick visualization:

```html
<div class="kb-tree">
  <ul>
    {{#each nodes}}
    <li class="node {{type}}">
      <span class="title" onclick="showDetails('{{id}}')">
        {{title}}
      </span>
      {{#if children}}
      <ul>
        {{#each children}}
        <!-- recursive -->
        {{/each}}
      </ul>
      {{/if}}
    </li>
    {{/each}}
  </ul>
</div>
```

---

## Node Details Panel

When clicking a node, show:

```html
<div class="node-details">
  <h2>{{title}}</h2>
  <div class="path">{{path}}</div>

  <div class="content">
    {{content}}
  </div>

  <div class="attribution">
    <h4>Source</h4>
    <div class="expert">{{sourceExpert}}</div>
    <a href="{{sourceUrl}}">{{sourceUrl}}</a>
    <div class="confidence">
      Confidence: {{confidence}}%
    </div>
  </div>

  <div class="related">
    <h4>Related Nodes</h4>
    {{#each relatedNodes}}
    <a href="#" onclick="navigateTo('{{id}}')">{{title}}</a>
    {{/each}}
  </div>
</div>
```

---

## Search Functionality

```typescript
async function searchKB(query: string) {
  const response = await fetch(`/api/kb/search?q=${encodeURIComponent(query)}`);
  const results = await response.json();

  // Highlight matching nodes in tree
  highlightNodes(results.map(r => r.id));

  // Show results panel
  showSearchResults(results);
}
```

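The `/api/kb/search` route called above is not defined in this phase. A possible server-side counterpart, sketched here under the assumptions of Express, node-postgres, the hybrid query from Phase 3.4, and an `id` column on `kb_nodes` (which the client code expects for highlighting):

```typescript
import express from 'express';
import { Pool } from 'pg';
import { embed } from './embedding'; // same helper as in Phase 3.1

const app = express();
const pool = new Pool();

// Hypothetical handler behind the searchKB() fetch above, reusing the
// hybrid (vector + keyword) ranking from Phase 3.4.
app.get('/api/kb/search', async (req, res) => {
  const q = String(req.query.q ?? '');
  const queryEmbedding = await embed(q);
  const { rows } = await pool.query(
    `SELECT id, title, source_expert,
            0.7 * (1.0 / (1.0 + (embedding <=> $1::vector))) +
            0.3 * ts_rank(to_tsvector('english', content), plainto_tsquery('english', $2)) AS score
     FROM kb_nodes
     WHERE namespace = $3
     ORDER BY score DESC
     LIMIT 20`,
    [`[${queryEmbedding.join(',')}]`, q, 'your-namespace']
  );
  res.json(rows);
});

app.listen(3000);
```
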
---

## Statistics Display

```html
<div class="kb-stats">
  <div class="stat">
    <span class="value">{{nodeCount}}</span>
    <span class="label">Total Nodes</span>
  </div>
  <div class="stat">
    <span class="value">{{expertCount}}</span>
    <span class="label">Expert Sources</span>
  </div>
  <div class="stat">
    <span class="value">{{avgConfidence}}%</span>
    <span class="label">Avg Confidence</span>
  </div>
  <div class="stat">
    <span class="value">{{depth}}</span>
    <span class="label">Max Depth</span>
  </div>
</div>
```

---

## Quality Gate

- [ ] Full tree renders without errors
- [ ] All nodes expandable/collapsible
- [ ] Click node shows details with sources
- [ ] Search returns relevant results
- [ ] Statistics display correctly
- [ ] Navigation breadcrumbs work

---

**Proceed to Phase 5: Integration Layer**