amalfa 1.0.2 → 1.0.4

Files changed (55)
  1. package/package.json +1 -1
  2. package/src/cli.ts +1 -1
  3. package/docs/AGENT-METADATA-PATTERNS.md +0 -1021
  4. package/docs/AGENT_PROTOCOLS.md +0 -28
  5. package/docs/ARCHITECTURAL_OVERVIEW.md +0 -123
  6. package/docs/BENTO_BOXING_DEPRECATION.md +0 -281
  7. package/docs/Bun-SQLite.html +0 -464
  8. package/docs/COMMIT_GUIDELINES.md +0 -367
  9. package/docs/CONFIG_E2E_VALIDATION.md +0 -147
  10. package/docs/CONFIG_UNIFICATION.md +0 -187
  11. package/docs/CONFIG_VALIDATION.md +0 -103
  12. package/docs/DEVELOPER_ONBOARDING.md +0 -36
  13. package/docs/Graph and Vector Database Best Practices.md +0 -214
  14. package/docs/LEGACY_DEPRECATION.md +0 -174
  15. package/docs/MCP_SETUP.md +0 -317
  16. package/docs/PERFORMANCE_BASELINES.md +0 -88
  17. package/docs/QUICK_START_MCP.md +0 -168
  18. package/docs/REPOSITORY_CLEANUP_SUMMARY.md +0 -261
  19. package/docs/SESSION-2026-01-06-METADATA-PATTERNS.md +0 -346
  20. package/docs/SETUP.md +0 -464
  21. package/docs/SETUP_COMPLETE.md +0 -464
  22. package/docs/VISION-AGENT-LEARNING.md +0 -1242
  23. package/docs/_current-config-status.md +0 -93
  24. package/docs/edge-generation-methods.md +0 -57
  25. package/docs/elevator-pitch.md +0 -118
  26. package/docs/graph-and-vector-database-playbook.html +0 -480
  27. package/docs/hardened-sqlite.md +0 -85
  28. package/docs/headless-knowledge-management.md +0 -79
  29. package/docs/john-kaye-flux-prompt.md +0 -46
  30. package/docs/keyboard-shortcuts.md +0 -80
  31. package/docs/opinion-proceed-pattern.md +0 -29
  32. package/docs/polyvis-nodes-edges-schema.md +0 -77
  33. package/docs/protocols/lab-protocol.md +0 -30
  34. package/docs/reaction-iquest-loop-coder.md +0 -46
  35. package/docs/services.md +0 -60
  36. package/docs/sqlite-wal-readonly-trap.md +0 -228
  37. package/docs/strategy/css-architecture.md +0 -40
  38. package/docs/test-document-cycle.md +0 -83
  39. package/docs/test_lifecycle_E2E.md +0 -4
  40. package/docs/the-bicameral-graph.md +0 -83
  41. package/docs/user-guide.md +0 -70
  42. package/docs/vision-helper.md +0 -53
  43. package/polyvis.settings.json.bak +0 -38
  44. package/src/EnlightenedTriad.ts +0 -146
  45. package/src/JIT_Triad.ts +0 -137
  46. package/src/data/experience/test_doc_1.md +0 -2
  47. package/src/data/experience/test_doc_2.md +0 -2
  48. package/src/demo-triad.ts +0 -45
  49. package/src/gardeners/BaseGardener.ts +0 -55
  50. package/src/llm/EnlightenedProvider.ts +0 -95
  51. package/src/services/README.md +0 -56
  52. package/src/services/llama.ts +0 -59
  53. package/src/services/llamauv.ts +0 -56
  54. package/src/services/olmo3.ts +0 -58
  55. package/src/services/phi.ts +0 -52
@@ -1,1242 +0,0 @@
1
- # Agent-Generated Knowledge: Beyond Spec-Driven Development
2
-
3
- **Author:** Insights from PolyVis project experience
4
- **Date:** 2026-01-06
5
- **Status:** Vision Document
6
-
7
- ---
8
-
9
- ## Executive Summary
10
-
11
- Traditional software development separates execution (agents/developers write code) from documentation (humans write specs and docs). The PolyVis project demonstrated a more effective pattern: **agents generate their own institutional knowledge** through structured reflection. This document explores why this works, how it scales, and what it means for Amalfa's design.
12
-
13
- **Key Finding:** When given minimal structure (brief-debrief-playbook workflow), agents spontaneously maintain documentation, update standards, and build compounding organizational knowledge - with humans acting as curators rather than primary authors.
14
-
15
- ---
16
-
17
- ## Table of Contents
18
-
19
- 1. [The Problem with Traditional Documentation](#the-problem-with-traditional-documentation)
20
- 2. [The PolyVis Discovery](#the-polyvis-discovery)
21
- 3. [Why the Pattern Works](#why-the-pattern-works)
22
- 4. [The Brief-Debrief-Playbook Flywheel](#the-brief-debrief-playbook-flywheel)
23
- 5. [Spec-Driven vs. Learning-Driven Development](#spec-driven-vs-learning-driven-development)
24
- 6. [The Human Role: Reader, Not Writer](#the-human-role-reader-not-writer)
25
- 7. [Implications for Amalfa Design](#implications-for-amalfa-design)
26
- 8. [Evolution Path: From Manual to Emergent](#evolution-path-from-manual-to-emergent)
27
- 9. [Implementation Principles](#implementation-principles)
28
- 10. [Conclusion: Documentation as Cognition](#conclusion-documentation-as-cognition)
29
-
30
- ---
31
-
32
- ## The Problem with Traditional Documentation
33
-
34
- ### The Classic Pattern
35
-
36
- ```
37
- Human writes spec → Agent executes → Code ships
38
-
39
- Knowledge stays in heads
40
-
41
- Next project: start over
42
- ```
43
-
44
- **Problems:**
45
- - Documentation lags reality (always outdated)
46
- - Knowledge stays tacit (in developer heads)
47
- - Human is bottleneck (must write everything)
48
- - No feedback loop (learnings don't update specs)
49
- - Context loss (why decisions were made)
50
-
51
- ### The Maintenance Burden
52
-
53
- **Traditional approach:**
54
- ```
55
- Day 1: Human writes comprehensive style guide
56
- Day 30: Half the team forgot it exists
57
- Day 90: Guide is outdated, nobody updates it
58
- Day 180: Guide is ignored, tribal knowledge rules
59
- ```
60
-
61
- **Why it fails:**
62
- - Writing docs is separate from doing work
63
- - Updates require explicit effort
64
- - No immediate payoff for maintainer
65
- - Docs drift from reality
66
-
67
- ---
68
-
69
- ## The PolyVis Discovery
70
-
71
- ### The Pattern
72
-
73
- ```
74
- Brief   →  Work   →  Debrief         →  Playbook Update
75
-   ↓          ↓          ↓                 ↓
76
- Human      Agent     Agent (mostly)     Agent
77
- ```
78
-
79
- ### What Happened
80
-
81
- **Expected behavior:**
82
- - Agent executes brief
83
- - Human writes debrief
84
- - Human updates playbooks
85
-
86
- **Actual behavior:**
87
- - Agent executes brief
88
- - Agent writes debrief (as required)
89
- - **Agent updates playbooks spontaneously** (unprompted!)
90
- - Human reads debriefs for oversight
91
- - Human does "occasional tidy up"
92
-
93
- ### The Key Observation
94
-
95
- > "Most times the lessons learned would be copied to the playbooks by the agent without prompting."
96
-
97
- This wasn't programmed - it **emerged** from the structure.
98
-
99
- ---
100
-
101
- ## Why the Pattern Works
102
-
103
- ### 1. Immediate Context
104
-
105
- **Debriefs capture knowledge while it's fresh:**
106
-
107
- ```markdown
108
- ## Debrief: Auth Refactor (2025-12-05)
109
-
110
- ### What Worked
111
- - Alpine's x-data was better than manual state management
112
- - Eliminated 200 lines of imperative DOM code
113
-
114
- ### What Failed
115
- - Tried CSS Grid for layout → broke on Safari
116
- - Switched to Flexbox → worked everywhere
117
-
118
- ### Lesson Learned
119
- - Test layout in Safari EARLY, not as afterthought
120
- - Alpine handles state better for reactive UIs
121
- ```
122
-
123
- **Why this works:**
124
- - Written **immediately** after work (context in memory)
125
- - Captures **reasoning**, not just outcomes
126
- - Documents **dead ends** (what not to try next time)
127
- - Natural language (no friction to write)
128
-
129
- ### 2. The Reflection Gap
130
-
131
- **Debrief forces cognitive processing:**
132
-
133
- | Without Debrief | With Debrief |
134
- |----------------|--------------|
135
- | Task done → forget | Task done → reflect |
136
- | Knowledge stays implicit | Knowledge becomes explicit |
137
- | "It works" | "It works *because*..." |
138
- | No pattern recognition | Patterns emerge from writing |
139
-
140
- **The act of writing** the debrief causes:
141
- - Pattern recognition ("Alpine was better than...")
142
- - Causal reasoning ("Grid broke *because*...")
143
- - Abstraction ("Test Safari *early*" - generalizable rule)
144
-
145
- ### 3. Intrinsic Motivation
146
-
147
- **Without playbooks:**
148
- ```
149
- Session 1: Agent solves problem (hard)
150
- Session 2: Agent encounters same problem (hard again)
151
- Session 3: Agent encounters same problem (hard again)
152
- ```
153
- → No incentive to document
154
-
155
- **With playbooks:**
156
- ```
157
- Session 1: Agent solves problem → writes debrief → updates playbook
158
- Session 2: Agent reads playbook → solves similar problem (easy)
159
- Session 3: Agent benefits from own past work
160
- ```
161
- → **Self-interest** drives documentation
162
-
163
- ### 4. Closed Feedback Loop
164
-
165
- ```
166
- ┌─────────────────────────────────────────┐
167
- │                                         │
168
- │  Brief ──→ Work ──→ Debrief ──→ Playbook│
169
- │    ↑                             │      │
170
- │    │                             │      │
171
- │    └──────────── Next Brief ←────┘      │
172
- │            (informed by playbook)       │
173
- │                                         │
174
- └─────────────────────────────────────────┘
175
- ```
176
-
177
- **Each cycle:**
178
- 1. Brief references playbook ("follow CSS standards")
179
- 2. Agent works, encounters edge case
180
- 3. Debrief captures edge case + solution
181
- 4. Agent updates playbook unprompted
182
- 5. Next brief benefits from richer playbook
183
-
184
- **Result:** Compounding knowledge with minimal human intervention.
185
-
186
- ---
187
-
188
- ## The Brief-Debrief-Playbook Flywheel
189
-
190
- ### Document Types
191
-
192
- #### Brief (Tactical)
193
- **Purpose:** Define specific task
194
- **Scope:** Single work session
195
- **Author:** Human (or agent proposal)
196
-
197
- **Contents:**
198
- - Objective (what to build)
199
- - Requirements (success criteria)
200
- - Context (why this matters)
201
- - Constraints (what to avoid)
202
-
203
- **Example:**
204
- ```markdown
205
- # Brief: Add Vector Search to Explorer
206
-
207
- ## Objective
208
- Enable semantic search in graph explorer UI
209
-
210
- ## Requirements
211
- - Search input in sidebar
212
- - Results highlight matching nodes
213
- - < 100ms response time
214
-
215
- ## Context
216
- Users need to find concepts without knowing exact node names
217
- ```
218
-
219
- #### Debrief (Reflective)
220
- **Purpose:** Capture learnings
221
- **Scope:** Single work session
222
- **Author:** Agent (immediately after work)
223
-
224
- **Contents:**
225
- - What worked (successes)
226
- - What failed (dead ends)
227
- - Lessons learned (abstractions)
228
- - Open questions (future work)
229
-
230
- **Example:**
231
- ```markdown
232
- # Debrief: Vector Search Implementation
233
-
234
- ## What Worked
235
- - Sigma's filter API for highlighting
236
- - Debounced search (300ms) prevents lag
237
- - Used existing VectorEngine (no new code)
238
-
239
- ## What Failed
240
- - Tried animating node transitions → janky on 1000+ nodes
241
- - Z-index for results popup → broke in Safari
242
- - Computed `position: fixed` instead
243
-
244
- ## Lessons Learned
245
- - Animation performance degrades non-linearly with node count
246
- - Test Safari layout EARLY (different stacking context rules)
247
- - Debouncing is critical for live search
248
-
249
- ## Open Questions
250
- - Should search history persist across sessions?
251
- ```
252
-
253
- #### Playbook (Strategic)
254
- **Purpose:** Codify standards
255
- **Scope:** All future work
256
- **Author:** Agent (extracted from debriefs)
257
-
258
- **Contents:**
259
- - Principles (how we do things)
260
- - Patterns (reusable solutions)
261
- - Anti-patterns (what to avoid)
262
- - Decision records (why we chose X over Y)
263
-
264
- **Example:**
265
- ```markdown
266
- # CSS Performance Playbook
267
-
268
- ## Principles
269
- 1. Test Safari early - different rendering behavior
270
- 2. Animations degrade with node count
271
- 3. Use CSS variables for all tweakable values
272
-
273
- ## Patterns
274
-
275
- ### Debounced Search Input
276
- ```javascript
277
- let timeout;
278
- input.addEventListener('input', (e) => {
279
- clearTimeout(timeout);
280
- timeout = setTimeout(() => search(e.target.value), 300);
281
- });
282
- ```
283
-
284
- ## Anti-Patterns
285
- - ❌ `position: fixed` with z-index (breaks Safari)
286
- - ✅ Use `position: absolute` with explicit stacking
287
-
288
- ## Decision Records
289
- **DR-023: Why Flexbox over Grid for layouts**
290
- Grid has better semantics but Safari rendering is inconsistent.
291
- Flexbox trades elegance for reliability. See debrief-2025-12-05.
292
- ```
293
-
294
- ### The Flow
295
-
296
- ```
297
- Briefs
298
- (this task)
299
-
300
- Work
301
- (execution)
302
-
303
- Debriefs
304
- (what we learned)
305
-
306
- Playbooks
307
- (how we do things)
308
-
309
- Future Briefs
310
- (informed by standards)
311
- ```
312
-
313
- **Key insight:** Knowledge flows **upward** (tactical → strategic) **and** **cycles** (strategic informs future tactical).
314
-
315
- ---
316
-
317
- ## Spec-Driven vs. Learning-Driven Development
318
-
319
- ### Spec-Driven (Traditional)
320
-
321
- ```
322
- Define requirements → Implement → Verify against spec
323
-
324
- Done ✓
325
- ```
326
-
327
- **Optimization target:** This task
328
- **Question answered:** Did we meet requirements?
329
- **Knowledge captured:** None (implicit)
330
-
331
- ### Learning-Driven (PolyVis)
332
-
333
- ```
334
- Define requirements → Implement → Verify → Reflect → Update knowledge
335
-
336
- All future tasks ✓
337
- ```
338
-
339
- **Optimization target:** All future tasks
340
- **Question answered:** What did we learn?
341
- **Knowledge captured:** Explicit (codified)
342
-
343
- ### The Synthesis
344
-
345
- **Spec-driven is necessary but not sufficient.**
346
-
347
- **Combined approach:**
348
- - Spec-driven ensures **this task succeeds**
349
- - Learning-driven ensures **future tasks succeed faster**
350
-
351
- **Example:**
352
-
353
- **Spec-driven only:**
354
- ```
355
- Week 1: Build auth system (40 hours)
356
- Week 5: Build payment system (40 hours - similar patterns)
357
- Week 9: Build admin system (40 hours - similar patterns)
358
- ```
359
- → Linear time complexity
360
-
361
- **Learning-driven addition:**
362
- ```
363
- Week 1: Build auth (40h) + debrief (2h) + playbook (1h) = 43h
364
- Week 5: Build payment (25h, leveraged auth playbook)
365
- Week 9: Build admin (15h, leveraged both playbooks)
366
- ```
367
- → Compounding efficiency
368
-
369
- ---
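The arithmetic behind the two schedules above can be checked with a short sketch. The hour figures are the ones from the example; the function and variable names are illustrative only and are not part of Amalfa.

```typescript
// Hours per project, taken from the spec-driven example above.
const specDrivenHours: number[] = [40, 40, 40];

// Week 1 adds 2h of debrief and 1h of playbook work to the 40h build.
const learningDrivenHours: number[] = [40 + 2 + 1, 25, 15];

// Sum a list of hour estimates.
function totalHours(hours: number[]): number {
  return hours.reduce((sum, h) => sum + h, 0);
}

const specTotal = totalHours(specDrivenHours);
const learningTotal = totalHours(learningDrivenHours);
const savedHours = specTotal - learningTotal;

console.log({ specTotal, learningTotal, savedHours });
```

With these illustrative numbers the totals are 120h against 83h, so the 3 hours of reflection in week 1 are already recovered by the week 5 project.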
370
-
371
- ## The Human Role: Reader, Not Writer
372
-
373
- ### The Inversion
374
-
375
- **Old model:**
376
- ```
377
- Human: Primary Author
378
-
379
- Writes specs, docs, standards
380
-
381
- Agent: Reader/Executor
382
-
383
- Follows documentation
384
-
385
- Human: Maintainer
386
-
387
- Updates docs (bottleneck)
388
- ```
389
-
390
- **PolyVis model:**
391
- ```
392
- Human: Editor/Curator
393
-
394
- Writes initial briefs
395
-
396
- Agent: Author/Executor
397
-
398
- Writes code + debriefs + playbooks
399
-
400
- Human: Reader/Overseer
401
-
402
- Reads debriefs, occasional tidy-up
403
- ```
404
-
405
- ### What "Occasional Tidy Up" Means
406
-
407
- **The playbooks didn't need constant human maintenance because:**
408
-
409
- 1. **Agents write at point of knowledge creation** (fresh context)
410
- 2. **Agents have incentive to maintain** (benefit next session)
411
- 3. **Errors self-correct** (agent encounters bad advice → fixes playbook)
412
- 4. **Human passively monitors** (reads debriefs for issues)
413
-
414
- **Human's "tidy up" role:**
415
- - Merge duplicate playbooks
416
- - Reorganize structure
417
- - Deprecate obsolete sections
418
- - Resolve contradictions
419
- - **Curation, not creation**
420
-
421
- ### Why This Scales
422
-
423
- **Traditional docs:**
424
- ```
425
- N tasks = N × human_writing_time (linear bottleneck)
426
- ```
427
-
428
- **PolyVis pattern:**
429
- ```
430
- N tasks = N × agent_writing_time + √N × human_curation_time
431
- (parallelizable) (sublinear oversight)
432
- ```
433
-
434
- **Key:** Agent writing scales linearly (no bottleneck), human curation scales sublinearly (occasional intervention).
435
-
436
- ---
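To make the two scaling formulas above concrete, here is a minimal sketch that compares human hours under each model. The √N term is the document's own proposed model rather than a measured result, and the input values (100 tasks, 2 hours per write-up or curation pass) are made up for illustration.

```typescript
// Human hours under the "traditional docs" model: every task needs human writing.
function traditionalHumanHours(tasks: number, humanWritingTime: number): number {
  return tasks * humanWritingTime;
}

// Human hours under the proposed model: curation effort grows with sqrt(N).
function curatedHumanHours(tasks: number, humanCurationTime: number): number {
  return Math.sqrt(tasks) * humanCurationTime;
}

// Illustrative inputs only.
const taskCount = 100;
const traditional = traditionalHumanHours(taskCount, 2);
const curated = curatedHumanHours(taskCount, 2);

console.log(`traditional: ${traditional}h, curated: ${curated}h`);
```

Under these assumptions the human spends 200 hours writing in the traditional model and 20 hours curating in the PolyVis pattern; agent writing time is excluded because it is not the bottleneck being compared.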
437
-
438
- ## Implications for Amalfa Design
439
-
440
- ### What PolyVis Taught Us
441
-
442
- **PolyVis pattern:**
443
- ```
444
- Briefs/Debriefs = Markdown files in repo
445
- Playbooks = Markdown files in repo
446
- Discovery = Grep/search file names
447
- Links = Manual (file references)
448
- ```
449
-
450
- **Strengths:**
451
- - Simple (no infrastructure)
452
- - Version controlled (git history)
453
- - Human readable (markdown)
454
-
455
- **Limitations:**
456
- - Linear discovery (keyword matching)
457
- - Manual linking (file references)
458
- - No semantic search (must know filenames)
459
- - Hard to find related concepts
460
-
461
- ### Amalfa Enhancement
462
-
463
- **Amalfa pattern:**
464
- ```
465
- Briefs/Debriefs = Nodes in semantic graph
466
- Playbooks = Nodes in semantic graph
467
- Discovery = Vector search + graph traversal
468
- Links = Automatic (similarity + explicit)
469
- ```
470
-
471
- **Example:**
472
-
473
- **PolyVis:**
474
- ```bash
475
- $ grep "CSS" playbooks/*.md
476
- playbooks/css-performance.md
477
- playbooks/css-variables.md
478
- ```
479
- → Must know keyword "CSS"
480
-
481
- **Amalfa:**
482
- ```
483
- Query: "styling broke on Safari"
484
-
485
- Results:
486
- 1. debrief-2025-12-03-layout-debug (0.89 similarity)
487
- "Flexbox works, Grid breaks Safari stacking context"
488
-
489
- 2. playbook-cross-browser-testing (0.85 similarity)
490
- "Test Safari early - rendering differs from Chrome"
491
-
492
- 3. brief-safari-animation-bug (0.82 similarity)
493
- "Animation performance issues Safari vs Chrome"
494
- ```
495
- → Semantic matching, even without "CSS" keyword
496
-
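The semantic lookup above could be approximated by ranking documents by cosine similarity between embedding vectors. This is a sketch under stated assumptions: the three-dimensional vectors are hand-made stand-ins for real embeddings, `EmbeddedDoc` and `rankBySimilarity` are hypothetical names, and nothing here reflects Amalfa's actual search API.

```typescript
interface EmbeddedDoc {
  id: string;
  vector: number[];
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and keep the topK best matches.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[], topK: number) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy vectors standing in for real embeddings.
const docs: EmbeddedDoc[] = [
  { id: "playbook-auth-patterns", vector: [0.0, 0.1, 0.9] },
  { id: "playbook-cross-browser-testing", vector: [0.7, 0.3, 0.1] },
  { id: "debrief-2025-12-03-layout-debug", vector: [0.9, 0.1, 0.0] },
];

// Pretend embedding of the query "styling broke on Safari".
const queryVector = [1.0, 0.2, 0.0];
const results = rankBySimilarity(queryVector, docs, 2);
console.log(results);
```

The query never mentions "CSS", yet the two layout-related documents rank first because their vectors point in a similar direction to the query vector.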
497
- ### First-Class Workflow Support
498
-
499
- **Bad Amalfa design:**
500
- ```
501
- Generic document store
502
- Agent dumps random markdown
503
- No structure, no patterns
504
- Human digs through chaos
505
- ```
506
-
507
- **Good Amalfa design:**
508
- ```
509
- Structured document types:
510
- - Brief: requirements, context, success criteria
511
- - Debrief: what worked, failed, lessons learned
512
- - Playbook: principles, patterns, anti-patterns
513
-
514
- Auto-linking:
515
- - Debrief → Brief (what task it relates to)
516
- - Debrief → Playbook (which standard it updates)
517
- - Similar debriefs (encountered same problem)
518
-
519
- Auto-promotion:
520
- - Extract lessons from debrief
521
- - Suggest playbook additions
522
- - Identify contradictions
523
- ```
524
-
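One way to picture the structured document types and the auto-promotion step described above is as typed records plus a small helper. The interfaces and `suggestPlaybookAdditions` are hypothetical and only mirror the fields listed in the text; the exact-match comparison is a deliberately naive stand-in for whatever similarity check a real implementation would use.

```typescript
interface Brief {
  kind: "brief";
  id: string;
  requirements: string[];
  context: string;
  successCriteria: string[];
}

interface Debrief {
  kind: "debrief";
  id: string;
  briefId: string; // Debrief → Brief link
  whatWorked: string[];
  whatFailed: string[];
  lessonsLearned: string[];
}

interface Playbook {
  kind: "playbook";
  id: string;
  principles: string[];
  patterns: string[];
  antiPatterns: string[];
}

// Auto-promotion sketch: lessons that the playbook does not already list.
function suggestPlaybookAdditions(debrief: Debrief, playbook: Playbook): string[] {
  const known = playbook.principles.map((p) => p.trim().toLowerCase());
  return debrief.lessonsLearned.filter(
    (lesson) => known.indexOf(lesson.trim().toLowerCase()) === -1
  );
}

const exampleDebrief: Debrief = {
  kind: "debrief",
  id: "debrief-vector-search",
  briefId: "brief-vector-search",
  whatWorked: ["Debounced search (300ms) prevents lag"],
  whatFailed: ["Animating node transitions on 1000+ nodes"],
  lessonsLearned: ["Test Safari early", "Debouncing is critical for live search"],
};

const examplePlaybook: Playbook = {
  kind: "playbook",
  id: "playbook-performance",
  principles: ["Test Safari early"],
  patterns: [],
  antiPatterns: [],
};

const suggestions = suggestPlaybookAdditions(exampleDebrief, examplePlaybook);
console.log(suggestions);
```

Here only the debouncing lesson is suggested, because the Safari lesson is already a playbook principle.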
525
- ### Discovery Patterns
526
-
527
- **PolyVis (manual search):**
528
- ```
529
- Agent: "I need to style this component"
530
- Agent: *greps for CSS playbook*
531
- Agent: *reads playbook*
532
- Agent: *applies patterns*
533
- ```
534
-
535
- **Amalfa (semantic discovery):**
536
- ```
537
- Agent: "I need to style this component for Safari"
538
- Agent: *queries "safari styling patterns"*
539
- Amalfa: Returns:
540
- - Playbook: cross-browser CSS
541
- - Debrief: Safari layout debugging
542
- - Brief: Safari-specific animation work
543
- Agent: *synthesizes from multiple sources*
544
- ```
545
-
546
- **Key difference:** Amalfa returns **related concepts**, not just exact matches.
547
-
548
- ### Graph Traversal
549
-
550
- **Beyond search - navigate relationships:**
551
-
552
- ```
553
- Query: "What problems did auth work encounter?"
554
-
555
- Amalfa returns graph:
556
- brief-auth-refactor
557
- ├─→ debrief-auth-refactor (completed)
558
- │ ├─→ playbook-alpine-patterns (updated)
559
- │ └─→ debrief-state-management (similar problem)
560
- ├─→ brief-auth-tests (follow-up work)
561
- └─→ issue-token-refresh (blocker)
562
- ```
563
-
564
- **Agent can:**
565
- - See full context (brief → debrief → playbook)
566
- - Find related work (similar problems)
567
- - Identify follow-ups (what came after)
568
- - Understand dependencies (blockers)
569
-
570
- ---
571
-
572
- ## Evolution Path: From Manual to Emergent
573
-
574
- ### Level 1: Manual Process (Pre-Agent)
575
-
576
- ```
577
- Human writes specs
578
- Human writes code
579
- Human reviews code
580
- Human documents learnings
581
- Human maintains docs
582
- ```
583
-
584
- **Characteristics:**
585
- - Human bottleneck
586
- - Linear scaling
587
- - Knowledge stays tacit
588
-
589
- ### Level 2: Agent Execution (Early Agent Era)
590
-
591
- ```
592
- Human writes specs
593
- Agent writes code
594
- Human reviews code
595
- Human documents learnings
596
- Human maintains docs
597
- ```
598
-
599
- **Characteristics:**
600
- - Execution scales
601
- - Documentation still bottleneck
602
- - Knowledge extraction still manual
603
-
604
- ### Level 3: Agent Reflection (PolyVis Pattern)
605
-
606
- ```
607
- Human writes briefs
608
- Agent writes code + debriefs
609
- Agent updates playbooks
610
- Human reads debriefs (oversight)
611
- Human occasional tidy-up
612
- ```
613
-
614
- **Characteristics:**
615
- - Execution scales
616
- - Documentation scales
617
- - Human acts as curator
618
- - **This is where PolyVis succeeded**
619
-
620
- ### Level 4: Semantic Infrastructure (Amalfa-Enhanced)
621
-
622
- ```
623
- Human writes briefs (or reviews agent proposals)
624
- Agent queries past work semantically
625
- Agent writes code + debriefs
626
- Agent links related concepts
627
- Agent updates playbooks
628
- Human queries debriefs for trends
629
- Human sets strategic direction
630
- ```
631
-
632
- **Characteristics:**
633
- - Execution scales
634
- - Documentation scales
635
- - Discovery scales (semantic)
636
- - Knowledge compounds faster
637
-
638
- ### Level 5: Full Emergence (Future Vision)
639
-
640
- ```
641
- Agents propose work based on knowledge gaps
642
- Agents self-organize documentation
643
- Agents identify obsolete knowledge
644
- Agents coordinate multi-agent work
645
- Human sets goals and values
646
- Human monitors for alignment
647
- ```
648
-
649
- **Characteristics:**
650
- - Agents manage full workflow
651
- - Self-organizing knowledge base
652
- - Human provides direction, not execution
653
- - Emergent collective intelligence
654
-
655
- **Note:** Level 5 requires significant AI advances. Level 4 is achievable today.
656
-
657
- ---
658
-
659
- ## Implementation Principles
660
-
661
- ### 1. Structure Enables Emergence
662
-
663
- **Don't:** Create generic note-taking app
664
- **Do:** Provide structured templates that guide reflection
665
-
666
- **Example:**
667
-
668
- **Bad:**
669
- ```
670
- "Write notes about your work"
671
- ```
672
- → Agent writes random thoughts
673
-
674
- **Good:**
675
- ```markdown
676
- # Debrief Template
677
-
678
- ## What Worked
679
- [What approaches succeeded and why?]
680
-
681
- ## What Failed
682
- [What did you try that didn't work? Why did it fail?]
683
-
684
- ## Lessons Learned
685
- [What generalizable insights emerged?]
686
-
687
- ## Open Questions
688
- [What remains unclear or needs investigation?]
689
- ```
690
- → Agent follows structure, captures useful knowledge
691
-
692
- ### 2. Make Documentation Self-Interested
693
-
694
- **Principle:** Agents maintain docs when *they* benefit.
695
-
696
- **Design implications:**
697
- - Debriefs inform future agent sessions (not just human record)
698
- - Playbooks save agents time (not just human reference)
699
- - Search finds agent's own past work (personal memory)
700
-
701
- **Anti-pattern:** Documentation as "homework" (pure overhead)
702
- **Pattern:** Documentation as "external memory" (future productivity)
703
-
704
- ### 3. Reduce Friction
705
-
706
- **Principle:** Easier to document than not document.
707
-
708
- **Design implications:**
709
- - In-line templates (don't make agents search for format)
710
- - Auto-fill context (task ID, date, files changed)
711
- - Suggest related docs (based on code changes)
712
- - One-command workflows (`debrief create`, `playbook add`)
713
-
714
- **Example:**
715
- ```bash
716
- # Bad: Multiple steps, high friction
717
- $ touch debrief.md
718
- $ open debrief.md
719
- $ # manually fill in metadata
720
- $ # manually find related brief
721
- $ # manually write content
722
-
723
- # Good: Single command, pre-filled template
724
- $ amalfa debrief create brief-auth-refactor
725
-
726
- Creating debrief for brief-auth-refactor...
727
-
728
- Auto-detected changes:
729
- - src/auth/login.ts (modified)
730
- - src/auth/session.ts (new)
731
-
732
- Related documents:
733
- - playbook-auth-patterns
734
- - debrief-session-management-2025-11-12
735
-
736
- Template loaded. Fill in sections below:
737
- [opens editor with pre-filled template]
738
- ```
739
-
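The pre-filled template step in the example above could be sketched as a pure function that turns auto-detected context into markdown. `renderDebriefTemplate` and `DebriefContext` are hypothetical names; the `amalfa debrief create` command is the document's proposal, and this sketch makes no claim about how, or whether, the CLI implements it.

```typescript
interface DebriefContext {
  briefId: string;
  date: string;
  changedFiles: string[];
  relatedDocs: string[];
}

// Render a list as markdown bullets, with a placeholder when it is empty.
function toBullets(items: string[]): string {
  if (items.length === 0) return "- (none detected)";
  return items.map((item) => `- ${item}`).join("\n");
}

// Build a debrief skeleton with metadata filled in and reflection sections left open.
function renderDebriefTemplate(ctx: DebriefContext): string {
  return [
    `# Debrief: ${ctx.briefId}`,
    `Date: ${ctx.date}`,
    "",
    "## Files Changed",
    toBullets(ctx.changedFiles),
    "",
    "## Related Documents",
    toBullets(ctx.relatedDocs),
    "",
    "## What Worked",
    "",
    "## What Failed",
    "",
    "## Lessons Learned",
    "",
    "## Open Questions",
    "",
  ].join("\n");
}

const template = renderDebriefTemplate({
  briefId: "brief-auth-refactor",
  date: "2026-01-06",
  changedFiles: ["src/auth/login.ts", "src/auth/session.ts"],
  relatedDocs: ["playbook-auth-patterns"],
});
console.log(template);
```

Keeping the renderer free of file and git access means the change detection and related-document lookup can be swapped in separately.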
740
- ### 4. Close the Loop
741
-
742
- **Principle:** Knowledge must flow back into workflow.
743
-
744
- **Design implications:**
745
- - Briefs reference playbooks (standards apply)
746
- - Debriefs update playbooks (learnings codify)
747
- - Playbooks link to debriefs (provenance)
748
- - Search surfaces relevant past work (discovery)
749
-
750
- **The cycle:**
751
- ```
752
- Playbook → Brief → Work → Debrief → Playbook
753
-    ↑                                    ↓
754
-    └────────── (knowledge compounds) ───┘
755
- ```
756
-
757
- ### 5. Support Human Oversight
758
-
759
- **Principle:** Human should read, not write.
760
-
761
- **Design implications:**
762
- - Dashboard: Recent debriefs (what happened today)
763
- - Digest: Weekly summary (trends across sessions)
764
- - Alerts: Contradictions (agent A says X, agent B says Y)
765
- - Queries: "What did we learn about auth this month?"
766
-
767
- **Example dashboard:**
768
- ```
769
- Recent Debriefs (Last 7 Days)
770
- 📝 debrief-vector-search (2h ago)
771
- Lesson: Debouncing critical for live search
772
- Updated: playbook-performance
773
-
774
- 📝 debrief-safari-layout (1d ago)
775
- Lesson: Test Safari early, not late
776
- Updated: playbook-cross-browser
777
- ⚠️ Contradicts: playbook-css-grid (suggests Grid)
778
-
779
- Playbook Updates (Last 30 Days)
780
- - performance: 3 updates
781
- - cross-browser: 2 updates
782
- - auth-patterns: 1 update
783
-
784
- 🔍 Query: "What problems did we have with Safari?"
785
- ```
786
-
787
- ### 6. Semantic Over Keyword
788
-
789
- **Principle:** Match concepts, not strings.
790
-
791
- **Design implications:**
792
- - Vector embeddings for all documents
793
- - Similarity search (not just keyword grep)
794
- - Graph traversal (related concepts)
795
- - Summarization (compressed context)
796
-
797
- **Example:**
798
-
799
- **Keyword search:**
800
- ```
801
- Query: "authentication"
802
- Matches: Documents containing "authentication"
803
- Misses: Documents about "login", "session", "tokens" without exact word
804
- ```
805
-
806
- **Semantic search:**
807
- ```
808
- Query: "authentication"
809
- Matches:
810
- - Documents about authentication (exact)
811
- - Documents about login (related concept)
812
- - Documents about session management (related concept)
813
- - Documents about JWT tokens (related implementation)
814
- - Ranked by relevance
815
- ```
816
-
817
- ### 7. Explicit Over Implicit Links
818
-
819
- **Principle:** Relationships should be first-class, not inferred.
820
-
821
- **Design implications:**
822
- - Debrief explicitly links to brief (task reference)
823
- - Playbook explicitly links to debriefs (provenance)
824
- - Cross-references are typed (related-to, contradicts, supersedes)
825
-
826
- **Example graph:**
827
- ```
828
- brief-auth-refactor
829
- ├─[completed-by]─→ debrief-auth-refactor
830
- │ ├─[updated]─→ playbook-alpine-patterns
831
- │ └─[similar-to]─→ debrief-state-management
832
- ├─[led-to]─→ brief-auth-tests
833
- └─[blocked-by]─→ issue-token-refresh
834
- ```
835
-
836
- **Why explicit links:**
837
- - Agent can traverse graph intentionally
838
- - Human can audit reasoning
839
- - Changes don't break relationships
840
- - Provenance is clear
841
-
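The typed links in the example graph above can be modelled as plain records, which lets an agent follow only the relationship types it cares about. This is a minimal sketch: the `Link` shape and the `reachable` helper are assumptions for illustration, and the link types are the ones shown in the example graph.

```typescript
type LinkType = "completed-by" | "updated" | "similar-to" | "led-to" | "blocked-by";

interface Link {
  from: string;
  type: LinkType;
  to: string;
}

// The example graph, written out as explicit typed links.
const links: Link[] = [
  { from: "brief-auth-refactor", type: "completed-by", to: "debrief-auth-refactor" },
  { from: "debrief-auth-refactor", type: "updated", to: "playbook-alpine-patterns" },
  { from: "debrief-auth-refactor", type: "similar-to", to: "debrief-state-management" },
  { from: "brief-auth-refactor", type: "led-to", to: "brief-auth-tests" },
  { from: "brief-auth-refactor", type: "blocked-by", to: "issue-token-refresh" },
];

// Breadth-first walk from `start`, following only the allowed link types.
function reachable(start: string, allowed: LinkType[], allLinks: Link[]): string[] {
  const visited: string[] = [start];
  const queue: string[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const link of allLinks) {
      const followable = link.from === current && allowed.indexOf(link.type) !== -1;
      if (followable && visited.indexOf(link.to) === -1) {
        visited.push(link.to);
        queue.push(link.to);
      }
    }
  }
  return visited.slice(1); // drop the start node itself
}

// Provenance trail: which debrief completed the brief, and which playbook it updated.
const provenanceTrail = reachable("brief-auth-refactor", ["completed-by", "updated"], links);
console.log(provenanceTrail);
```

Because the link type is stored rather than inferred, the same data answers a different question, such as what blocked the work, by changing only the `allowed` list.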
842
- ---
843
-
844
- ## Conclusion: Documentation as Cognition
845
-
846
- ### The Core Insight
847
-
848
- **Documentation is not an artifact of work - it's a cognitive tool.**
849
-
850
- When agents write debriefs:
851
- - They **process** what happened (reflection)
852
- - They **abstract** patterns (generalization)
853
- - They **externalize** knowledge (memory)
854
- - They **connect** concepts (synthesis)
855
-
856
- **Writing forces thinking.**
857
-
858
- ### Why PolyVis Worked
859
-
860
- The brief-debrief-playbook pattern succeeded because:
861
-
862
- 1. **Structured reflection** (debrief template)
863
- 2. **Immediate capture** (write while context fresh)
864
- 3. **Self-interest** (agents benefit from own docs)
865
- 4. **Closed loop** (playbooks inform future briefs)
866
- 5. **Minimal friction** (markdown, git, simple tools)
867
- 6. **Human oversight** (curator, not bottleneck)
868
-
869
- ### What Amalfa Adds
870
-
871
- Amalfa enhances the pattern with:
872
-
873
- 1. **Semantic discovery** (find related work by concept)
874
- 2. **Graph structure** (navigate relationships)
875
- 3. **Auto-linking** (suggest connections)
876
- 4. **Multi-agent** (shared memory substrate)
877
- 5. **Cross-session** (persistent context)
878
- 6. **Queryable trends** (what are we learning about X?)
879
-
880
- ### The Vision
881
-
882
- **Current state (2026):**
883
- - Agents execute tasks
884
- - Documentation is afterthought
885
- - Knowledge resets each session
886
-
887
- **Near future (Amalfa + PolyVis pattern):**
888
- - Agents execute + reflect
889
- - Documentation is intrinsic
890
- - Knowledge compounds across sessions
891
-
892
- **Long-term vision:**
893
- - Agents self-organize work
894
- - Documentation is infrastructure
895
- - Collective intelligence emerges
896
-
897
- ### The Path Forward
898
-
899
- **Phase 1: Replicate PolyVis** (MVP)
900
- - Support brief/debrief/playbook types
901
- - Markdown storage
902
- - Basic search
903
- - Git integration
904
-
905
- **Phase 2: Add Semantics** (Enhanced)
906
- - Vector embeddings
907
- - Similarity search
908
- - Auto-suggested links
909
- - Graph traversal
910
-
911
- **Phase 3: Multi-Agent** (Scaled)
912
- - Shared knowledge base
913
- - Cross-agent learning
914
- - Conflict resolution
915
- - Trend analysis
916
-
917
- **Phase 4: Emergence** (Future)
918
- - Agent-proposed work
919
- - Self-organizing structure
920
- - Automated maintenance
921
- - Collective intelligence
922
-
923
- ---
924
-
925
- ## Appendix: Practical Examples
926
-
927
- ### Example 1: Auth Refactor Session
928
-
929
- **Brief:**
930
- ```markdown
931
- # Brief: Refactor Auth to Alpine.js
932
-
933
- ## Objective
934
- Replace imperative DOM manipulation with Alpine state management
935
-
936
- ## Requirements
937
- - User login/logout works
938
- - Session persistence across page loads
939
- - Token refresh on expiry
940
-
941
- ## Context
942
- Current auth code is 400 lines of manual DOM updates. Hard to debug.
943
- ```
944
-
945
- **Work:**
946
- - Agent refactors code
947
- - Replaces 400 lines with 80 lines of Alpine
948
- - Tests pass
949
- - Deploy succeeds
950
-
951
- **Debrief (written by agent):**
952
- ```markdown
953
- # Debrief: Auth Alpine Refactor
954
-
955
- ## What Worked
956
- - Alpine's `x-data` eliminated manual state tracking
957
- - `x-show` replaced 50+ lines of visibility toggling
958
- - Token refresh using Alpine's `$watch` (reactive, not polling)
959
-
960
- ## What Failed
961
- - Tried storing token in Alpine state → broke on page reload
962
- - Switched to localStorage, Alpine reads on mount → works
963
-
964
- ## Lessons Learned
965
- - Alpine is perfect for UI state, NOT persistence
966
- - Alpine + localStorage is the right pattern
967
- - 400 lines → 80 lines = 5x reduction, easier to debug
968
-
969
- ## Files Changed
970
- - src/auth/login.ts (refactored)
971
- - src/auth/session.ts (simplified)
972
-
973
- ## Test Results
974
- All 23 auth tests pass
975
- ```
976
-
977
- **Playbook Update (agent writes unprompted):**
978
- ```markdown
979
- # Alpine.js Playbook
980
-
981
- ## State Management Patterns
982
-
983
- ### UI State: Use Alpine
984
- ```html
985
- <div x-data="{ isLoggedIn: false, user: null }">
986
- <button @click="login()">Login</button>
987
- <div x-show="isLoggedIn">Welcome, <span x-text="user?.name"></span></div>
988
- </div>
989
- ```
990
-
991
- ### Persistent State: Use localStorage + Alpine
992
- ```javascript
993
- // Mount: Read from localStorage
994
- Alpine.data('auth', () => ({
995
- init() {
996
- this.token = localStorage.getItem('token');
997
- this.isLoggedIn = !!this.token;
998
- },
999
-
1000
- // Update: Write to localStorage
1001
- login(token) {
1002
- localStorage.setItem('token', token);
1003
- this.token = token;
1004
- this.isLoggedIn = true;
1005
- }
1006
- }));
1007
- ```
1008
-
1009
- **Provenance:** debrief-auth-alpine-refactor (2026-01-05)
1010
- ```
1011
-
1012
- **Result:**
1013
- - Brief defined task
1014
- - Debrief captured learnings
1015
- - Playbook codified pattern
1016
- - Future auth work references playbook
1017
- - Next agent doesn't repeat localStorage mistake
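
The lesson from this example (Alpine holds UI state, localStorage holds the persistent copy) can be sketched as a small factory. This is a minimal illustration, not code from PolyVis or Amalfa: `createAuthState`, its `logout` method, and the injectable `storage` parameter are hypothetical names introduced here so the pattern can be exercised outside a browser.

```javascript
// Hypothetical factory: builds the plain object an Alpine component could use.
// `storage` defaults to the browser's localStorage; any object exposing
// getItem/setItem/removeItem can be passed instead (useful for tests).
function createAuthState(storage = globalThis.localStorage) {
  return {
    token: null,
    isLoggedIn: false,

    // Mount: restore persisted state. Alpine calls init() automatically
    // when the component is initialized.
    init() {
      this.token = storage.getItem('token');
      this.isLoggedIn = !!this.token;
    },

    // Update: write to storage first, then mirror into reactive state.
    login(token) {
      storage.setItem('token', token);
      this.token = token;
      this.isLoggedIn = true;
    },

    // Clear both the persisted copy and the in-memory copy.
    logout() {
      storage.removeItem('token');
      this.token = null;
      this.isLoggedIn = false;
    },
  };
}

// In a page that loads Alpine, registration might look like this:
// document.addEventListener('alpine:init', () => {
//   Alpine.data('auth', () => createAuthState());
// });
```

Keeping `login` and `logout` as methods on the same object as `init` means `this` always refers to the component state, which is what the debrief's "Alpine reads on mount" fix relies on.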
1018
-
1019
- ### Example 2: Safari Layout Bug
1020
-
1021
- **Brief:**
1022
- ```markdown
1023
- # Brief: Fix Explorer Layout on Safari
1024
-
1025
- ## Objective
1026
- Sidebar + graph layout broken on Safari (works Chrome/Firefox)
1027
-
1028
- ## Requirements
1029
- - Sidebar 300px wide, fixed left
1030
- - Graph fills remaining space
1031
- - Responsive (mobile collapses sidebar)
1032
-
1033
- ## Context
1034
- User report: Safari shows overlapping panels
1035
- ```
1036
-
1037
- **Work:**
1038
- - Agent investigates Safari rendering
1039
- - Discovers CSS Grid stacking context bug
1040
- - Switches to Flexbox
1041
- - Verifies on Safari
1042
-
1043
- **Debrief:**
1044
- ```markdown
1045
- # Debrief: Safari Layout Fix
1046
-
1047
- ## What Worked
1048
- - Flexbox layout: `display: flex` for container
1049
- - `flex: 0 0 300px` for sidebar, `flex: 1` for graph
1050
- - Works Safari/Chrome/Firefox/Edge
1051
-
1052
- ## What Failed
1053
- - CSS Grid: `grid-template-columns: 300px 1fr`
1054
- - Safari bug: z-index doesn't work in grid context
1055
- - Searched for fixes, found Safari 15.x known issue
1056
-
1057
- ## Lessons Learned
1058
- - **Safari != Chrome** - test early, not at end
1059
- - CSS Grid has better semantics, but Safari's implementation has bugs
1060
- - Flexbox is more reliable for layout (trade elegance for compatibility)
1061
- - Check caniuse.com for Safari quirks
1062
-
1063
- ## Browser Tested
1064
- - ✅ Safari 17.2
1065
- - ✅ Chrome 120
1066
- - ✅ Firefox 121
1067
-
1068
- ## Decision
1069
- Using Flexbox for all layouts going forward until Safari Grid bugs fixed.
1070
- ```
1071
-
1072
- **Playbook Update:**
1073
- ```markdown
1074
- # Cross-Browser CSS Playbook
1075
-
1076
- ## Principles
1077
- 1. **Test Safari EARLY** - rendering differs significantly
1078
- 2. **Trade elegance for reliability** - working around Safari bugs takes priority over clean code
1079
- 3. **Check caniuse.com** - don't assume feature parity
1080
-
1081
- ## Layout Patterns
1082
-
1083
- ### ❌ CSS Grid (Safari Issues)
1084
- Safari 15-17 has z-index bugs in grid context. Avoid for layouts with overlays.
1085
-
1086
- ### ✅ Flexbox (Reliable)
1087
- Works consistently across browsers:
1088
- ```css
1089
- .container {
1090
- display: flex;
1091
- }
1092
- .sidebar {
1093
- flex: 0 0 300px; /* fixed 300px */
1094
- }
1095
- .main {
1096
- flex: 1; /* fill remaining */
1097
- }
1098
- ```
1099
-
1100
- ## Decision Records
1101
-
1102
- **DR-042: Why Flexbox over Grid**
1103
- Grid has better semantics (2D layout primitives), but Safari 15-17
1104
- has stacking context bugs that break z-index. Flexbox trades elegance
1105
- for reliability. Revisit when Safari 18+ adoption > 90%.
1106
-
1107
- **Provenance:** debrief-safari-layout-fix (2026-01-05)
1108
- ```
1109
-
1110
- **Result:**
1111
- - Future layout work reads playbook
1112
- - Agents choose Flexbox, avoid Grid (Safari)
1113
- - Decision rationale is documented (not tribal knowledge)
1114
- - The DR records a revisit condition for when Safari fixes the bug
1115
-
1116
- ### Example 3: Vector Search Performance
1117
-
1118
- **Brief:**
1119
- ```markdown
1120
- # Brief: Add Semantic Search to Explorer
1121
-
1122
- ## Objective
1123
- Search nodes by concept, not just exact name match
1124
-
1125
- ## Requirements
1126
- - Input box in sidebar
1127
- - Live results as user types
1128
- - Highlight matching nodes in graph
1129
- - < 100ms response time
1130
-
1131
- ## Context
1132
- Users want "find authentication stuff" not "find node named 'auth.ts'"
1133
- ```
1134
-
1135
- **Work:**
1136
- - Agent integrates VectorEngine
1137
- - Adds search input with Alpine
1138
- - Implements highlighting
1139
- - Tests performance
1140
-
1141
- **Debrief:**
1142
- ```markdown
1143
- # Debrief: Vector Search Implementation
1144
-
1145
- ## What Worked
1146
- - Reused existing VectorEngine (no new code!)
1147
- - Alpine + `@input` event (reactive search)
1148
- - Sigma's `setNodeAttribute` for highlighting
1149
- - Debounced search input (300ms delay)
1150
-
1151
- ## What Failed
1152
- - **Attempt 1:** No debouncing → laggy typing (search on every keystroke)
1153
- - **Attempt 2:** 100ms debounce → still laggy on slow machines
1154
- - **Attempt 3:** 300ms debounce → smooth, feels instant
1155
-
1156
- ## Lessons Learned
1157
- - **Debouncing is CRITICAL** for live search
1158
- - 300ms is sweet spot (feels instant, but batches queries)
1159
- - Never search on raw `@input` - always debounce
1160
- - VectorEngine is already fast (<10ms); network overhead is the issue
1161
-
1162
- ## Performance
1163
- - Vector search: 8ms avg
1164
- - UI render: 12ms avg
1165
- - Total with 300ms debounce: Feels instant
1166
-
1167
- ## Files Changed
1168
- - public/explorer/search.js (new)
1169
- - public/explorer/index.html (added input)
1170
- ```
1171
-
1172
- **Playbook Update:**
1173
- ```markdown
1174
- # Performance Playbook
1175
-
1176
- ## Search Input Patterns
1177
-
1178
- ### ❌ Raw Input (No Debouncing)
1179
- ```javascript
1180
- // DON'T: Fires on every keystroke
1181
- input.addEventListener('input', search);
1182
- ```
1183
- **Problem:** Laggy typing, wasted queries
1184
-
1185
- ### ✅ Debounced Input
1186
- ```javascript
1187
- // DO: Batch queries
1188
- let timeout;
1189
- input.addEventListener('input', (e) => {
1190
- clearTimeout(timeout);
1191
- timeout = setTimeout(() => search(e.target.value), 300);
1192
- });
1193
- ```
1194
- **Result:** Smooth typing, fewer queries
1195
-
1196
- ### Debounce Timing
1197
- - **100ms:** Still feels laggy on slow machines
1198
- - **300ms:** ✅ Sweet spot (feels instant, batches queries)
1199
- - **500ms:** Noticeable delay
1200
-
1201
- **Provenance:** debrief-vector-search-implementation (2026-01-06)
1202
-
1203
- ## Animation Performance
1204
-
1205
- ### Rule: Performance Degrades Non-Linearly
1206
- - 100 nodes: animations smooth
1207
- - 500 nodes: animations slightly janky
1208
- - 1000+ nodes: animations unusable
1209
-
1210
- **Lesson:** Disable animations above threshold, don't try to optimize.
1211
-
1212
- **Provenance:** debrief-graph-animation-performance (2025-12-15)
1213
- ```
1214
-
1215
- **Result:**
1216
- - Future agents know to debounce search inputs
1217
- - 300ms is documented as best practice
1218
- - Performance thresholds are explicit
1219
- - Cross-referenced to original debrief
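
The debounced-input pattern recorded in the playbook can also be factored into a reusable helper. The sketch below is one possible shape, not the project's actual code: the `debounce` name and its default of 300 ms (taken from the debrief's "sweet spot") are assumptions for illustration.

```javascript
// Hypothetical helper: returns a wrapper around `fn` that only runs
// once `delayMs` milliseconds have passed without another call.
function debounce(fn, delayMs = 300) {
  let timer = null;
  return function debounced(...args) {
    // Cancel the pending call, if any, and schedule a fresh one.
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), delayMs);
  };
}

// Usage in a page (`input` and `search` are assumed to exist):
// input.addEventListener('input', debounce((e) => search(e.target.value)));
```

Only the last call in a burst of keystrokes reaches `search`, so the query count drops without changing the handler itself.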
1220
-
1221
- ---
1222
-
1223
- ## References
1224
-
1225
- - **PolyVis Project:** https://github.com/pjsvis/polyvis
1226
- - **Brief-Debrief-Playbook Pattern:** Emerged from PolyVis development (2025)
1227
- - **Amalfa:** This project (MCP server for agent memory)
1228
- - **Related Concepts:**
1229
- - Learning Organizations (Peter Senge)
1230
- - After-Action Reviews (US Army)
1231
- - Retrospectives (Agile)
1232
- - Decision Records (ADRs)
1233
-
1234
- ---
1235
-
1236
- **Status:** Vision document, not specification
1237
- **Next Steps:** Design Amalfa schema to support this workflow
1238
- **Feedback:** Iterate based on implementation experience
1239
-
1240
- ---
1241
-
1242
- _This document captures learnings from PolyVis and charts a path for Amalfa. The goal: make agent-generated knowledge the default, not the exception._