agentboot 0.1.0 → 0.2.0

Files changed (66)
  1. package/README.md +8 -7
  2. package/agentboot.config.json +4 -1
  3. package/package.json +2 -2
  4. package/scripts/cli.ts +42 -14
  5. package/scripts/compile.ts +30 -7
  6. package/scripts/dev-sync.ts +1 -1
  7. package/scripts/lib/config.ts +17 -1
  8. package/scripts/validate.ts +12 -7
  9. package/.github/ISSUE_TEMPLATE/persona-request.md +0 -62
  10. package/.github/ISSUE_TEMPLATE/quality-feedback.md +0 -67
  11. package/.github/workflows/cla.yml +0 -25
  12. package/.github/workflows/validate.yml +0 -49
  13. package/.idea/agentboot.iml +0 -9
  14. package/.idea/misc.xml +0 -6
  15. package/.idea/modules.xml +0 -8
  16. package/.idea/vcs.xml +0 -6
  17. package/CLAUDE.md +0 -230
  18. package/CONTRIBUTING.md +0 -168
  19. package/PERSONAS.md +0 -156
  20. package/core/instructions/baseline.instructions.md +0 -133
  21. package/core/instructions/security.instructions.md +0 -186
  22. package/core/personas/code-reviewer/SKILL.md +0 -175
  23. package/core/personas/security-reviewer/SKILL.md +0 -233
  24. package/core/personas/test-data-expert/SKILL.md +0 -234
  25. package/core/personas/test-generator/SKILL.md +0 -262
  26. package/core/traits/audit-trail.md +0 -182
  27. package/core/traits/confidence-signaling.md +0 -172
  28. package/core/traits/critical-thinking.md +0 -129
  29. package/core/traits/schema-awareness.md +0 -132
  30. package/core/traits/source-citation.md +0 -174
  31. package/core/traits/structured-output.md +0 -199
  32. package/docs/ci-cd-automation.md +0 -548
  33. package/docs/claude-code-reference/README.md +0 -21
  34. package/docs/claude-code-reference/agentboot-coverage.md +0 -484
  35. package/docs/claude-code-reference/feature-inventory.md +0 -906
  36. package/docs/cli-commands-audit.md +0 -112
  37. package/docs/cli-design.md +0 -924
  38. package/docs/concepts.md +0 -1117
  39. package/docs/config-schema-audit.md +0 -121
  40. package/docs/configuration.md +0 -645
  41. package/docs/delivery-methods.md +0 -758
  42. package/docs/developer-onboarding.md +0 -342
  43. package/docs/extending.md +0 -448
  44. package/docs/getting-started.md +0 -298
  45. package/docs/knowledge-layer.md +0 -464
  46. package/docs/marketplace.md +0 -822
  47. package/docs/org-connection.md +0 -570
  48. package/docs/plans/architecture.md +0 -2429
  49. package/docs/plans/design.md +0 -2018
  50. package/docs/plans/prd.md +0 -1862
  51. package/docs/plans/stack-rank.md +0 -261
  52. package/docs/plans/technical-spec.md +0 -2755
  53. package/docs/privacy-and-safety.md +0 -807
  54. package/docs/prompt-optimization.md +0 -1071
  55. package/docs/test-plan.md +0 -972
  56. package/docs/third-party-ecosystem.md +0 -496
  57. package/domains/compliance-template/README.md +0 -173
  58. package/domains/compliance-template/traits/compliance-aware.md +0 -228
  59. package/examples/enterprise/agentboot.config.json +0 -184
  60. package/examples/minimal/agentboot.config.json +0 -46
  61. package/tests/REGRESSION-PLAN.md +0 -705
  62. package/tests/TEST-PLAN.md +0 -111
  63. package/tests/cli.test.ts +0 -705
  64. package/tests/pipeline.test.ts +0 -608
  65. package/tests/validate.test.ts +0 -278
  66. package/tsconfig.json +0 -62
@@ -1,464 +0,0 @@
1
- # Knowledge Layer — From Flat Files to RAG
2
-
3
- How AgentBoot's domain knowledge evolves from markdown files to structured datastores
4
- to vector-powered semantic retrieval as an organization's needs grow.
5
-
6
- ---
7
-
8
- ## The Problem That Flat Files Can't Solve
9
-
10
- AgentBoot starts with flat files: traits as markdown, gotchas as markdown, persona
11
- prompts as markdown. This works brilliantly up to a point:
12
-
13
- | Org Maturity | Knowledge Volume | Flat Files Work? |
14
- |---|---|---|
15
- | Getting started | 6 traits, 10 gotchas, 4 personas | Yes — everything fits in context |
16
- | Growing | 20 traits, 50 gotchas, 10 personas | Mostly — path scoping keeps it manageable |
17
- | Mature | 50 traits, 200 gotchas, 20 personas, 500 ADRs, 1000 incident learnings | **No** — can't load it all, can't find what's relevant |
18
-
19
- The breaking point is when the organization's accumulated knowledge exceeds what
20
- fits in a context window — or more practically, when the right knowledge for a given
21
- task is buried in hundreds of files and there's no good way to find it.
22
-
23
- A security reviewer looking at an auth endpoint doesn't need all 200 gotchas. It
24
- needs the 5 that are relevant to authentication, JWT tokens, and the specific
25
- framework being used. Flat files with `paths:` frontmatter help (the gotcha activates
26
- when you're in `src/auth/`), but they can't do semantic relevance: "this code is
27
- doing token validation, and we had an incident last year where token expiry was
28
- miscalculated" — that kind of retrieval requires understanding the code's intent,
29
- not just its file path.
30
-
31
- ---
32
-
33
- ## The Knowledge Progression
34
-
35
- ```
36
- Stage 1                   Stage 2                   Stage 3
37
- FLAT FILES                STRUCTURED STORE          VECTOR / RAG
38
- (markdown)                (queryable)               (semantic retrieval)
39
-
40
- core/traits/*.md          Knowledge DB              Embeddings
41
- .claude/rules/*.md    →   (SQLite, JSON, or     →   (pgvector, Pinecone,
42
- .claude/gotchas/*.md      structured markdown)      Chroma, local HNSW)
43
-
44
- 5-50 items                50-500 items              500+ items
45
- Path-scoped               Category/tag queries      Semantic similarity
46
- Full context load         Filtered retrieval        "Find what's relevant
47
-                                                      to THIS code"
48
-
49
- Free                      Free (local DB)           $ (embedding API +
50
-                                                       storage)
51
- ```
52
-
53
- **Most orgs stay at Stage 1 forever.** That's fine. AgentBoot's flat file system
54
- is the right default. The progression to Stage 2 or 3 is opt-in, driven by the
55
- org's actual knowledge volume, not by AgentBoot's architecture preferences.
56
-
57
- ---
58
-
59
- ## Stage 1: Flat Files (Current — Default)
60
-
61
- What AgentBoot does today. Markdown files loaded into context.
62
-
63
- **How personas access knowledge:**
64
- - Always-on instructions loaded at session start
65
- - Path-scoped rules loaded when Claude reads matching files
66
- - Trait content composed at build time
67
- - Skills loaded on invocation
68
-
69
- **Strengths:**
70
- - Zero infrastructure (just files)
71
- - Version-controlled in git
72
- - Human-readable and editable
73
- - Composable via AgentBoot's build system
74
-
75
- **Limits:**
76
- - Everything that MIGHT be relevant loads into context (token cost)
77
- - No semantic retrieval (can't find "gotchas related to JWT token handling")
78
- - Doesn't scale beyond ~50 rules without context bloat
79
- - No cross-referencing between knowledge items
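The path-scoped loading described in this stage, and the limit it runs into, can be sketched roughly as follows. This is an illustration only, not AgentBoot's loader: the `ScopedRule` shape, the `activeRules` helper, and the sample file names are hypothetical, and plain prefix matching stands in for whatever glob semantics `paths:` actually uses.

```typescript
// Hypothetical shape of a flat-file rule; `paths` mirrors the frontmatter key.
interface ScopedRule {
  file: string;
  paths?: string[]; // directory prefixes such as "src/auth/"; absent means always-on
}

// Returns the rules that would load when a given file is read.
// Assumption: prefix matching stands in for real glob matching.
function activeRules(rules: ScopedRule[], openedFile: string): ScopedRule[] {
  return rules.filter((rule) => {
    const paths = rule.paths;
    if (paths === undefined) return true; // always-on instruction
    return paths.some((prefix) => openedFile.indexOf(prefix) === 0);
  });
}

// Sample rules (invented for the example).
const rules: ScopedRule[] = [
  { file: "baseline.md" },
  { file: "gotchas-auth.md", paths: ["src/auth/"] },
  { file: "gotchas-postgres.md", paths: ["db/", "migrations/"] },
];

const loaded = activeRules(rules, "src/auth/token-refresh.ts").map((r) => r.file);
// loaded is ["baseline.md", "gotchas-auth.md"]
```

The selection depends only on where the file lives. Nothing in it looks at what the code does, which is the gap the later stages address.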
80
-
81
- ---
82
-
83
- ## Stage 2: Structured Knowledge Store
84
-
85
- A queryable layer on top of flat files. The files are still the source of truth
86
- (authored in markdown, stored in git), but they're indexed into a structured store
87
- that personas can query.
88
-
89
- ### What Gets Structured
90
-
91
- | Knowledge Type | Flat File | Structured Fields |
92
- |---|---|---|
93
- | **Gotchas** | `gotchas-postgres.md` | `{ technology: "postgres", tags: ["rls", "partitions", "auth"], severity: "high", learned_from: "incident-2025-Q3" }` |
94
- | **ADRs** | `adrs/ADR-001.md` | `{ id: "ADR-001", status: "accepted", domain: "auth", supersedes: null, date: "2026-01" }` |
95
- | **Incident learnings** | `incidents/2025-Q3-token-expiry.md` | `{ id: "INC-2025-Q3-01", domain: "auth", root_cause: "token-expiry", affected_services: ["api-gateway", "auth-service"] }` |
96
- | **Standards** | `standards/api-versioning.md` | `{ domain: "api", applies_to: ["rest", "graphql"], mandatory: true }` |
97
- | **Patterns** | `patterns/retry-with-backoff.md` | `{ category: "resilience", languages: ["typescript", "python"], anti_patterns: ["retry-without-backoff"] }` |
98
-
99
- ### How Personas Query It
100
-
101
- Via an MCP server that reads the structured index:
102
-
103
- ```markdown
104
- # In persona SKILL.md
105
- ## Setup
106
-
107
- Before reviewing, query the knowledge base for relevant context:
108
- 1. Use the agentboot-kb tool to find gotchas related to the technologies in the diff
109
- 2. Use the agentboot-kb tool to find ADRs related to the domains being modified
110
- 3. Use the agentboot-kb tool to find incident learnings for the affected services
111
- ```
112
-
113
- The MCP server:
114
-
115
- ```json
116
- {
117
- "mcpServers": {
118
- "agentboot-kb": {
119
- "type": "stdio",
120
- "command": "npx",
121
- "args": ["@agentboot/knowledge-server", "--store", ".agentboot/knowledge.db"]
122
- }
123
- }
124
- }
125
- ```
126
-
127
- ### MCP Tools Exposed
128
-
129
- ```
130
- agentboot_kb_search
131
- query: "postgres RLS partitions"
132
- filters: { technology: "postgres", severity: "high" }
133
- → Returns: 3 relevant gotchas (not all 200)
134
-
135
- agentboot_kb_get
136
- id: "ADR-001"
137
- → Returns: full ADR content
138
-
139
- agentboot_kb_related
140
- id: "INC-2025-Q3-01"
141
- → Returns: related gotchas, ADRs, and patterns
142
-
143
- agentboot_kb_list
144
- type: "gotcha"
145
- tags: ["auth"]
146
- → Returns: all auth-related gotchas (titles + IDs, not full content)
147
- ```
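To make the tool semantics above concrete, here is a minimal in-memory sketch of what `agentboot_kb_search` and `agentboot_kb_list` might do. The tool names and filter fields come from the listing; everything else (the `KnowledgeItem` shape, the `kbSearch` and `kbList` helpers, the naive keyword matching, and the sample rows and their IDs) is an assumption made for illustration, not the server's actual behavior.

```typescript
// Hypothetical index row; field names follow the structured-fields table.
interface KnowledgeItem {
  id: string;
  type: "gotcha" | "adr" | "incident" | "standard" | "pattern";
  title: string;
  technology?: string;
  severity?: "low" | "medium" | "high";
  tags: string[];
}

interface SearchFilters {
  technology?: string;
  severity?: string;
}

// Sketch of agentboot_kb_search: exact-match filters first, then a naive
// keyword match of the query terms against title and tags.
function kbSearch(items: KnowledgeItem[], query: string, filters: SearchFilters): KnowledgeItem[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return items.filter((item) => {
    if (filters.technology !== undefined && item.technology !== filters.technology) return false;
    if (filters.severity !== undefined && item.severity !== filters.severity) return false;
    const haystack = (item.title + " " + item.tags.join(" ")).toLowerCase();
    return terms.some((term) => haystack.indexOf(term) !== -1);
  });
}

// Sketch of agentboot_kb_list: titles and IDs only, not full content.
function kbList(
  items: KnowledgeItem[],
  type: KnowledgeItem["type"],
  tags: string[],
): { id: string; title: string }[] {
  return items
    .filter((item) => item.type === type && tags.every((tag) => item.tags.indexOf(tag) !== -1))
    .map((item) => ({ id: item.id, title: item.title }));
}

// Sample rows (IDs and titles invented for the example).
const index: KnowledgeItem[] = [
  { id: "G-1", type: "gotcha", title: "PostgreSQL RLS on Partitions", technology: "postgres", severity: "high", tags: ["rls", "partitions"] },
  { id: "G-2", type: "gotcha", title: "JWT clock skew handling", technology: "jwt", severity: "high", tags: ["auth"] },
  { id: "G-3", type: "gotcha", title: "Postgres enum migrations", technology: "postgres", severity: "low", tags: ["migrations"] },
];

const hits = kbSearch(index, "postgres RLS partitions", { technology: "postgres", severity: "high" });
const authGotchas = kbList(index, "gotcha", ["auth"]);
```

Here `hits` contains only the RLS gotcha and `authGotchas` only the JWT entry, which is the "3 relevant gotchas, not all 200" effect at a very small scale.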
148
-
149
- ### How the Index Gets Built
150
-
151
- The structured index is generated from flat files during `agentboot build`:
152
-
153
- ```bash
154
- agentboot build
155
- # Compiles personas, traits...
156
- # Also indexes knowledge files into .agentboot/knowledge.db
157
-
158
- agentboot sync
159
- # Syncs personas to repos...
160
- # Also syncs the knowledge DB (or the MCP server config to access it)
161
- ```
162
-
163
- The flat files gain optional frontmatter for structured fields:
164
-
165
- ```markdown
166
- ---
167
- type: gotcha
168
- technology: postgres
169
- tags: [rls, partitions, security]
170
- severity: high
171
- learned_from: incident-2025-Q3
172
- ---
173
-
174
- # PostgreSQL RLS on Partitions
175
-
176
- Partitions do NOT inherit `relrowsecurity`...
177
- ```
178
-
179
- Files without frontmatter still work (Stage 1 behavior). The frontmatter adds
180
- queryability without breaking existing content.
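A rough sketch of how an indexer could read that optional frontmatter while leaving plain files untouched is shown below. It is deliberately not a YAML parser: it handles only the flat `key: value` and `key: [a, b]` forms used in the example above, and the `parseKnowledgeFile` name and `ParsedFile` shape are hypothetical.

```typescript
type FieldValue = string | string[];

interface ParsedFile {
  meta: { [key: string]: FieldValue } | null; // null means no frontmatter (a Stage 1 file)
  body: string;
}

// Parses the flat `key: value` and `key: [a, b]` forms only.
// Nesting, quoting, and multi-line values are out of scope for this sketch.
function parseKnowledgeFile(source: string): ParsedFile {
  const lines = source.split("\n");
  if (lines[0] !== "---") return { meta: null, body: source };
  const end = lines.indexOf("---", 1);
  if (end === -1) return { meta: null, body: source };

  const meta: { [key: string]: FieldValue } = {};
  for (const line of lines.slice(1, end)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip lines that are not key/value pairs
    const key = line.slice(0, colon).trim();
    const raw = line.slice(colon + 1).trim();
    if (raw.charAt(0) === "[" && raw.charAt(raw.length - 1) === "]") {
      // Inline list: split on commas and drop empty entries.
      meta[key] = raw.slice(1, -1).split(",").map((s) => s.trim()).filter((s) => s.length > 0);
    } else {
      meta[key] = raw;
    }
  }
  return { meta, body: lines.slice(end + 1).join("\n").trim() };
}

const gotcha = parseKnowledgeFile([
  "---",
  "type: gotcha",
  "technology: postgres",
  "tags: [rls, partitions, security]",
  "severity: high",
  "---",
  "",
  "# PostgreSQL RLS on Partitions",
].join("\n"));

const plain = parseKnowledgeFile("# Just a rule\n\nNo frontmatter here.");
```

`gotcha.meta` carries the structured fields for indexing, while `plain.meta` is `null`, so a file without frontmatter simply keeps its Stage 1 behavior.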
181
-
182
- ### Implementation Options
183
-
184
- | Option | Pros | Cons |
185
- |---|---|---|
186
- | **SQLite (local)** | Zero infra, ships with the MCP server, fast | Not shared across machines |
187
- | **JSON index file** | Zero infra, git-trackable, simple | Slow for large datasets |
188
- | **Turso/LibSQL (hosted SQLite)** | Shared, serverless, still SQLite API | Requires account/hosting |
189
- | **PostgreSQL** | Full SQL, mature, team-shared | Requires running DB |
190
-
191
- **Recommendation:** SQLite local for V1. It's a single file, ships with the MCP
192
- server, requires zero infrastructure, and handles thousands of knowledge items with
193
- sub-millisecond queries. The MCP server reads the SQLite file; `agentboot build`
194
- writes it.
195
-
196
- ---
197
-
198
- ## Stage 3: Vector Embeddings / RAG
199
-
200
- Semantic retrieval. Instead of querying by tags and categories (Stage 2), the
201
- persona describes what it's looking at and the knowledge base returns the most
202
- semantically relevant items.
203
-
204
- ### When You Need This
205
-
206
- Stage 2's structured queries work when you know what to ask for: "give me postgres
207
- gotchas." Stage 3 shines when the relevance isn't obvious from tags:
208
-
209
- - "This code is doing a double-check on token expiry before refreshing. Is there
210
- a reason for that?" → Retrieves the incident report about token expiry race
211
- conditions, even though the code doesn't mention "race condition"
212
- - "This migration adds a new column to the users table" → Retrieves the gotcha
213
- about ALTER TABLE locking on large tables, the standard about column naming
214
- conventions, AND the ADR about the users table schema evolution plan
215
- - "This PR changes the retry logic" → Retrieves the retry-with-backoff pattern,
216
- the incident where retry-without-backoff caused a cascading failure, and the
217
- circuit breaker ADR
218
-
219
- The connection between the code and the knowledge is **semantic**, not
220
- keyword-based. The code says "retry logic"; the incident report says "cascading
221
- failure from unbounded retries." A keyword search wouldn't connect them.
222
- Embeddings would.
223
-
224
- ### Architecture
225
-
226
- ```
227
- Knowledge files (markdown)
228
-
229
-
230
- agentboot build --embeddings
231
-
232
- ├── Chunks each file into sections
233
- ├── Generates embeddings via API (OpenAI, Voyage AI, or local model)
234
- ├── Stores in vector DB
235
- └── Stores metadata (type, tags, source file, last updated)
236
-
237
-
238
-
239
- MCP Server (agentboot-kb with vector search)
240
-
241
-
242
- Persona queries:
243
- "Find knowledge relevant to this code: [code snippet]"
244
-
245
-
246
- Vector similarity search → top 5 results → injected into persona context
247
- ```
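The "chunks each file into sections" step in the pipeline above could look something like the following. This is one plausible strategy (one chunk per heading-delimited section), not necessarily the one AgentBoot would adopt; `chunkByHeading` and `SectionChunk` are hypothetical names, and the sample text is invented.

```typescript
interface SectionChunk {
  sourceFile: string;
  heading: string; // "" for text that appears before the first heading
  text: string;
}

// Splits a markdown file into one chunk per heading-delimited section.
// Caveat: a "#" line inside a fenced code block is treated as a heading too.
function chunkByHeading(sourceFile: string, markdown: string): SectionChunk[] {
  const chunks: SectionChunk[] = [];
  let heading = "";
  let buffer: string[] = [];

  // Emits the buffered lines as a chunk (if non-empty) and resets the buffer.
  const flush = (): void => {
    const text = buffer.join("\n").trim();
    if (text.length > 0) chunks.push({ sourceFile, heading, text });
    buffer = [];
  };

  for (const line of markdown.split("\n")) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match !== null) {
      flush();
      heading = (match[1] ?? "").trim();
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}

const chunks = chunkByHeading("patterns/retry-with-backoff.md", [
  "# Retry with Backoff",
  "Use exponential backoff.",
  "",
  "## Anti-patterns",
  "Retrying without backoff.",
].join("\n"));
```

Each chunk keeps its source file and heading, which matches the metadata the pipeline stores alongside the embedding.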
248
-
249
- ### MCP Tools (Extended for Vector)
250
-
251
- ```
252
- agentboot_kb_semantic_search
253
- query: "JWT token refresh with expiry validation"
254
- limit: 5
255
- min_similarity: 0.7
256
- → Returns: ranked results by semantic similarity
257
- 1. (0.92) Incident: Token expiry race condition (2025-Q3)
258
- 2. (0.87) Gotcha: JWT clock skew handling
259
- 3. (0.84) Pattern: Token refresh with mutex
260
- 4. (0.78) ADR: Auth token lifecycle management
261
- 5. (0.71) Standard: Session timeout requirements
262
-
263
- agentboot_kb_relevant_to_diff
264
- diff: "<git diff output>"
265
- limit: 10
266
- → Returns: knowledge items most relevant to the code changes
267
- (Embeds the diff, searches against knowledge embeddings)
268
- ```
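The `limit` and `min_similarity` parameters above map naturally onto a ranked cosine-similarity search. The sketch below shows that mechanic with hand-written three-dimensional vectors; real embeddings have hundreds or thousands of dimensions and come from a model, and the `semanticSearch` helper and sample vectors here are assumptions for illustration only.

```typescript
interface EmbeddedItem {
  title: string;
  vector: number[];
}

// Cosine similarity of two vectors; returns 0 if either has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every item, drops those below the threshold, returns the top `limit`.
function semanticSearch(
  items: EmbeddedItem[],
  queryVector: number[],
  limit: number,
  minSimilarity: number,
): { title: string; score: number }[] {
  return items
    .map((item) => ({ title: item.title, score: cosineSimilarity(item.vector, queryVector) }))
    .filter((r) => r.score >= minSimilarity)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Toy vectors, written by hand rather than produced by an embedding model.
const embedded: EmbeddedItem[] = [
  { title: "Pattern: Retry with backoff", vector: [0.0, 0.1, 0.9] },
  { title: "Gotcha: JWT clock skew handling", vector: [0.7, 0.7, 0.0] },
  { title: "Incident: Token expiry race condition", vector: [0.9, 0.1, 0.0] },
];

const ranked = semanticSearch(embedded, [1, 0, 0], 5, 0.7);
// ranked lists the incident first, then the JWT gotcha; the retry pattern is filtered out
```

A linear scan like this is fine for a few thousand items. A vector index only becomes necessary when the corpus is large enough that scoring every item per query is too slow.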
269
-
270
- ### The Killer Use Case: Context-Aware Review
271
-
272
- A code reviewer persona with RAG doesn't just check rules — it brings
273
- organizational memory to every review:
274
-
275
- ```
276
- Reviewing: src/api/auth/token-refresh.ts
277
-
278
- Standard review findings:
279
- [WARN] Missing error handling on refresh token call (line 34)
280
-
281
- Knowledge-augmented findings:
282
- [WARN] Missing error handling on refresh token call (line 34)
283
-
284
- [CONTEXT] This is similar to INC-2025-Q3-01: our token refresh service
285
- experienced a cascading failure when the auth provider returned 503 and
286
- the retry logic had no backoff. The current code has the same pattern.
287
- See: patterns/retry-with-backoff.md
288
-
289
- [CONTEXT] ADR-007 requires all auth token operations to use the shared
290
- AuthClient wrapper (src/lib/auth-client.ts) which includes retry,
291
- circuit breaking, and telemetry. This file is calling the provider
292
- directly.
293
- ```
294
-
295
- The persona found things a rule-based review never would — because the connection
296
- between "this code" and "that incident" is semantic, not syntactic.
297
-
298
- ### Vector Store Options
299
-
300
- | Option | Pros | Cons |
301
- |---|---|---|
302
- | **Chroma (local)** | Python, runs locally, simple API | Requires Python runtime |
303
- | **LanceDB (local)** | Rust-based, embedded, no server | Newer, smaller community |
304
- | **SQLite + sqlite-vss** | Extension for SQLite, same DB as Stage 2 | Limited vector ops |
305
- | **pgvector (hosted)** | PostgreSQL extension, full SQL + vectors | Requires running Postgres |
306
- | **Pinecone / Weaviate (cloud)** | Managed, scalable, team-shared | Cost, vendor dependency |
307
- | **Hosted embeddings API (e.g. Voyage AI, OpenAI)** | No local model to run; note that Anthropic itself does not offer an embeddings endpoint | Per-call cost |
308
-
309
- **Recommendation:** Start with sqlite-vss (extends the Stage 2 SQLite DB with vector
310
- search). Zero new infrastructure. When the org outgrows it, migrate to pgvector or
311
- a managed service — the MCP interface doesn't change, only the backing store.
312
-
313
- ### Embedding Cost
314
-
315
- | Content Volume | Embedding Cost (one-time) | Storage |
316
- |---|---|---|
317
- | 100 knowledge items (~200KB) | ~$0.02 | <1MB |
318
- | 1,000 items (~2MB) | ~$0.20 | ~10MB |
319
- | 10,000 items (~20MB) | ~$2.00 | ~100MB |
320
-
321
- When content changes, only the changed items are re-embedded, not the full corpus.
322
- These incremental updates run during `agentboot build`.
323
-
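One common way to get "only the changed items" is to store a fingerprint of each item's text at build time and compare it on the next build. The sketch below assumes that approach; it is not taken from AgentBoot. `contentFingerprint` and `itemsToReembed` are hypothetical names, the sample texts are invented, and the FNV-1a hash stands in for a stronger content hash such as SHA-256.

```typescript
// FNV-1a 32-bit hash plus the text length; a stand-in for a real content hash.
function contentFingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16) + "-" + text.length;
}

interface SourceItem {
  id: string;
  text: string;
}

// `previous` maps item id to the fingerprint recorded at the last build.
// Returns the ids whose text is new or has changed since then.
function itemsToReembed(items: SourceItem[], previous: { [id: string]: string }): string[] {
  return items
    .filter((item) => previous[item.id] !== contentFingerprint(item.text))
    .map((item) => item.id);
}

// Fingerprints saved by the previous build (sample text invented for the example).
const lastBuild: { [id: string]: string } = {
  "ADR-001": contentFingerprint("Use AuthClient for token calls."),
  "INC-2025-Q3-01": contentFingerprint("Token expiry was miscalculated."),
};

const current: SourceItem[] = [
  { id: "ADR-001", text: "Use AuthClient for token calls." }, // unchanged
  { id: "INC-2025-Q3-01", text: "Token expiry was miscalculated under clock skew." }, // edited
  { id: "ADR-007", text: "All auth token operations use AuthClient." }, // new
];

const stale = itemsToReembed(current, lastBuild);
// stale is ["INC-2025-Q3-01", "ADR-007"]
```

Only the ids in `stale` would be sent to the embedding API, so the per-build cost tracks the size of the change rather than the size of the corpus.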
324
- ---
325
-
326
- ## What Each Stage Gives Acme-Org
327
-
328
- ### Stage 1 (Flat Files): "Read the rules"
329
-
330
- ```
331
- Developer writes auth code
332
- → Security reviewer loads auth gotchas (path-scoped)
333
- → Finds: "missing null check" (rule-based)
334
- → Doesn't know about last year's auth incident
335
- → Doesn't know about the ADR requiring AuthClient wrapper
336
- ```
337
-
338
- ### Stage 2 (Structured): "Query the knowledge"
339
-
340
- ```
341
- Developer writes auth code
342
- → Security reviewer queries: tags=["auth", "token"]
343
- → Finds: null check gotcha + JWT gotcha + AuthClient standard
344
- → Knows the rules but not the history
345
- → Doesn't connect "this code" to "that incident" semantically
346
- ```
347
-
348
- ### Stage 3 (Vector/RAG): "Understand the context"
349
-
350
- ```
351
- Developer writes auth code
352
- → Security reviewer embeds the code, searches knowledge base
353
- → Finds: null check + JWT gotcha + AuthClient standard
354
- + incident INC-2025-Q3-01 (semantically similar)
355
- + ADR-007 (related to auth token lifecycle)
356
- → Review includes: what's wrong, why it matters, what happened
357
- last time, and what the org decided about it
358
- ```
359
-
360
- The progression is: **rules → knowledge → organizational memory.**
361
-
362
- ---
363
-
364
- ## How AgentBoot Supports Each Stage
365
-
366
- ### Stage 1 (Current Design — Partially Implemented)
367
-
368
- - Flat markdown files in `core/traits/`, `.claude/rules/`, `.claude/gotchas/`
369
- - Path-scoped activation via `paths:` frontmatter
370
- - Build-time composition via `agentboot build`
371
- - No additional infrastructure
372
-
373
- ### Stage 2 (Needs Building)
374
-
375
- | Component | What | Phase |
376
- |---|---|---|
377
- | Knowledge frontmatter spec | Optional fields: type, tags, severity, domain, learned_from | V1.5 |
378
- | `agentboot build --index` | Generate SQLite index from frontmatter | V1.5 |
379
- | `@agentboot/knowledge-server` | MCP server reading SQLite, exposing search/get/related tools | V2 |
380
- | Persona templates with KB queries | Setup steps that query KB before reviewing | V2 |
381
- | `agentboot add incident` / `agentboot add standard` | Scaffold knowledge items with proper frontmatter | V2 |
382
- | Knowledge dashboard | "You have 142 gotchas, 23 ADRs, 8 incident learnings" | V2 |
383
-
384
- ### Stage 3 (Future)
385
-
386
- | Component | What | Phase |
387
- |---|---|---|
388
- | `agentboot build --embeddings` | Generate embeddings + vector index | V3 |
389
- | sqlite-vss integration | Vector search in the existing SQLite DB | V3 |
390
- | `agentboot_kb_semantic_search` MCP tool | Semantic retrieval | V3 |
391
- | `agentboot_kb_relevant_to_diff` MCP tool | Auto-contextualize reviews with relevant knowledge | V3 |
392
- | Incremental embedding updates | Only re-embed changed items | V3 |
393
- | Migration path to pgvector / managed | When SQLite isn't enough | V3+ |
394
-
395
- ---
396
-
397
- ## The MCP Interface Stays Stable
398
-
399
- The most important architectural decision: **the MCP interface is the same
400
- across Stages 2 and 3.** Personas don't know (or care) whether the backing
401
- store is SQLite or pgvector. They call the same MCP tools:
402
-
403
- ```
404
- Stage 1: No MCP — files loaded directly into context (path-scoped)
405
- Stage 2: agentboot_kb_search → queries SQLite with filters
406
- Stage 3: agentboot_kb_search → vector similarity + SQLite filters
407
- ```
408
-
409
- The persona prompt doesn't change when the org upgrades from Stage 2 to Stage 3.
410
- Only the MCP server implementation changes. This is why MCP-first matters — the
411
- abstraction boundary is clean.
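The abstraction boundary described here can be sketched as one interface with two interchangeable backends. Everything below is illustrative: `KnowledgeBackend`, `tagBackend`, `vectorBackend`, `contextFor`, and `toyEmbed` are hypothetical names, the toy embedding is a two-keyword stand-in for a real model, and an actual MCP server would expose this over the protocol rather than as in-process calls.

```typescript
interface KbResult {
  id: string;
  title: string;
}

// The single surface the persona sees. Backends differ; the call does not.
interface KnowledgeBackend {
  search(query: string, limit: number): KbResult[];
}

interface TaggedRow extends KbResult {
  tags: string[];
}

interface VectorRow extends KbResult {
  vector: number[];
}

// Stage 2 flavor: match query terms against tags.
function tagBackend(rows: TaggedRow[]): KnowledgeBackend {
  return {
    search(query: string, limit: number): KbResult[] {
      const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
      return rows
        .filter((row) => row.tags.some((tag) => terms.indexOf(tag) !== -1))
        .slice(0, limit)
        .map((row) => ({ id: row.id, title: row.title }));
    },
  };
}

// Stage 3 flavor: rank by dot product against an injected embedding function.
function vectorBackend(rows: VectorRow[], embed: (text: string) => number[]): KnowledgeBackend {
  return {
    search(query: string, limit: number): KbResult[] {
      const q = embed(query);
      const dot = (a: number[], b: number[]): number =>
        a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
      return rows
        .map((row) => ({ id: row.id, title: row.title, score: dot(row.vector, q) }))
        .sort((a, b) => b.score - a.score)
        .slice(0, limit)
        .map((r) => ({ id: r.id, title: r.title }));
    },
  };
}

// Persona-side code: identical regardless of which backend is plugged in.
function contextFor(backend: KnowledgeBackend, query: string): string[] {
  return backend.search(query, 5).map((r) => r.id);
}

// Toy embedding: one dimension per keyword. A real model replaces this.
const toyEmbed = (text: string): number[] => [
  text.indexOf("token") !== -1 ? 1 : 0,
  text.indexOf("retry") !== -1 ? 1 : 0,
];

const stage2 = tagBackend([
  { id: "INC-2025-Q3-01", title: "Token expiry incident", tags: ["auth", "token"] },
  { id: "retry-with-backoff", title: "Retry with backoff", tags: ["resilience", "retry"] },
]);
const stage3 = vectorBackend(
  [
    { id: "INC-2025-Q3-01", title: "Token expiry incident", vector: [1, 0] },
    { id: "retry-with-backoff", title: "Retry with backoff", vector: [0, 1] },
  ],
  toyEmbed,
);

const fromStage2 = contextFor(stage2, "token refresh");
const fromStage3 = contextFor(stage3, "token refresh");
```

Both calls put `INC-2025-Q3-01` first. Swapping `stage2` for `stage3` changes how results are found and ranked, but `contextFor` itself is untouched.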
412
-
413
- An org can start with flat files, graduate to structured queries when they have
414
- 100+ knowledge items, and add vector search when they need semantic retrieval.
415
- At no point do they rewrite their personas.
416
-
417
- ---
418
-
419
- ## What Kinds of Knowledge Belong Here
420
-
421
- Not everything should be in the knowledge layer. The rule: **knowledge that a
422
- persona needs to retrieve at query time goes here. Knowledge that shapes persona
423
- behavior goes in traits and instructions.**
424
-
425
- | Content | Where It Belongs | Why |
426
- |---|---|---|
427
- | "Always check for null safety" | **Trait / rule** | Behavioral directive — always active |
428
- | "PostgreSQL partitions don't inherit RLS" | **Gotcha (flat file, Stage 1)** | Path-scoped, activates on relevant files |
429
- | "We had an incident where token refresh caused cascading failure" | **Knowledge store (Stage 2+)** | Historical context, retrieved when relevant |
430
- | "ADR-007: All auth tokens use the AuthClient wrapper" | **Knowledge store (Stage 2+)** | Architectural decision, queried by domain |
431
- | "The retry backoff formula is: delay = base * 2^attempt" | **Knowledge store (Stage 2+)** | Reference data, retrieved when relevant |
432
- | "Our API versioning convention is /v{N}/ in the URL path" | **Standard (flat file or Stage 2)** | Could be a rule or a queryable standard |
433
- | "3,000 customer records with transaction histories for testing" | **Not here** | Test data, not knowledge. Use test-data-expert persona. |
434
-
435
- ---
436
-
437
- ## The Honest Assessment
438
-
439
- **Most orgs will never need Stage 3.** Vector search is powerful, but it adds
440
- complexity. An org with 50 gotchas and 10 ADRs doesn't need embeddings; they
441
- need well-organized flat files with good path scoping.
442
-
443
- **Stage 2 is the sweet spot for mature orgs.** A structured SQLite index with
444
- tag-based queries handles hundreds of knowledge items with zero infrastructure.
445
- The MCP server is a single `npx` command. This is where the cost/value curve
446
- peaks for most organizations.
447
-
448
- **Stage 3 is for orgs where knowledge IS the competitive advantage.** Compliance-
449
- heavy industries (healthcare, finance, government) where the accumulated knowledge
450
- of "what happened, what we decided, and why" is as valuable as the code itself.
451
- For these orgs, a security reviewer that cites last year's incident report is
452
- worth the embedding cost.
453
-
454
- AgentBoot should make the progression effortless — but never push orgs up the
455
- ladder faster than they need. Flat files are the right answer for most teams,
456
- most of the time.
457
-
458
- ---
459
-
460
- *See also:*
461
- - [`docs/concepts.md`](concepts.md) — gotchas rules, MCP-first integrations
462
- - [`docs/extending.md`](extending.md) — domain layers and per-persona extensions
463
- - [`docs/third-party-ecosystem.md`](third-party-ecosystem.md) — MCP server as cross-platform bridge
464
- - [`docs/claude-code-reference/feature-inventory.md`](claude-code-reference/feature-inventory.md) — MCP configuration