amalfa 1.0.0 → 1.0.2
- package/README.md +226 -247
- package/amalfa.config.example.ts +8 -6
- package/docs/AGENT-METADATA-PATTERNS.md +1021 -0
- package/docs/CONFIG_E2E_VALIDATION.md +147 -0
- package/docs/CONFIG_UNIFICATION.md +187 -0
- package/docs/CONFIG_VALIDATION.md +103 -0
- package/docs/LEGACY_DEPRECATION.md +174 -0
- package/docs/MCP_SETUP.md +317 -0
- package/docs/QUICK_START_MCP.md +168 -0
- package/docs/SESSION-2026-01-06-METADATA-PATTERNS.md +346 -0
- package/docs/SETUP.md +464 -0
- package/docs/SETUP_COMPLETE.md +464 -0
- package/docs/VISION-AGENT-LEARNING.md +1242 -0
- package/docs/_current-config-status.md +93 -0
- package/package.json +6 -3
- package/polyvis.settings.json.bak +38 -0
- package/src/cli.ts +159 -31
- package/src/config/defaults.ts +73 -15
- package/src/core/VectorEngine.ts +18 -9
- package/src/daemon/index.ts +12 -8
- package/src/mcp/index.ts +62 -7
- package/src/pipeline/AmalfaIngestor.ts +22 -12
- package/src/pipeline/PreFlightAnalyzer.ts +434 -0
- package/src/resonance/DatabaseFactory.ts +3 -4
- package/src/resonance/db.ts +8 -6
- package/src/resonance/schema.ts +19 -1
- package/src/resonance/services/vector-daemon.ts +151 -0
- package/src/utils/DaemonManager.ts +147 -0
- package/src/utils/ZombieDefense.ts +5 -1
- package/:memory: +0 -0
- package/:memory:-shm +0 -0
- package/:memory:-wal +0 -0
- package/README.old.md +0 -112
- package/agents.config.json +0 -11
- package/drizzle/0000_minor_iron_fist.sql +0 -19
- package/drizzle/meta/0000_snapshot.json +0 -139
- package/drizzle/meta/_journal.json +0 -13
- package/example_usage.ts +0 -39
- package/experiment.sh +0 -35
- package/hello +0 -2
- package/index.html +0 -52
- package/knowledge/excalibur.md +0 -12
- package/plans/experience-graph-integration.md +0 -60
- package/prompts/gemini-king-mode-prompt.md +0 -46
- package/public/docs/MCP_TOOLS.md +0 -372
- package/schemas/README.md +0 -20
- package/schemas/cda.schema.json +0 -84
- package/schemas/conceptual-lexicon.schema.json +0 -75
- package/scratchpads/dummy-debrief-boxed.md +0 -39
- package/scratchpads/dummy-debrief.md +0 -27
- package/scratchpads/scratchpad-design.md +0 -50
- package/scratchpads/scratchpad-scrolling.md +0 -20
- package/scratchpads/scratchpad-toc-disappearance.md +0 -23
- package/scratchpads/scratchpad-toc.md +0 -28
- package/scratchpads/test_gardener.md +0 -7
- package/src/core/LLMClient.ts +0 -93
- package/src/core/TagEngine.ts +0 -56
- package/src/db/schema.ts +0 -46
- package/src/gardeners/AutoTagger.ts +0 -116
- package/src/pipeline/HarvesterPipeline.ts +0 -101
- package/src/pipeline/Ingestor.ts +0 -555
- package/src/resonance/cli/ingest.ts +0 -41
- package/src/resonance/cli/migrate.ts +0 -54
- package/src/resonance/config.ts +0 -40
- package/src/resonance/daemon.ts +0 -236
- package/src/resonance/pipeline/extract.ts +0 -89
- package/src/resonance/pipeline/transform_docs.ts +0 -60
- package/src/resonance/services/tokenizer.ts +0 -159
- package/src/resonance/transform/cda.ts +0 -393
- package/src/utils/EnvironmentVerifier.ts +0 -67
- package/substack/substack-playbook-1.md +0 -95
- package/substack/substack-playbook-2.md +0 -78
- package/tasks/ui-investigation.md +0 -26
- package/test-db +0 -0
- package/test-db-shm +0 -0
- package/test-db-wal +0 -0
- package/tests/canary/verify_pinch_check.ts +0 -44
- package/tests/fixtures/ingest_test.md +0 -12
- package/tests/fixtures/ingest_test_boxed.md +0 -13
- package/tests/fixtures/safety_test.md +0 -45
- package/tests/fixtures/safety_test_boxed.md +0 -49
- package/tests/fixtures/tagged_output.md +0 -49
- package/tests/fixtures/tagged_test.md +0 -49
- package/tests/mcp-server-settings.json +0 -8
- package/verify-embedder.ts +0 -54
|
@@ -0,0 +1,1242 @@
|
|
|
1
|
+
# Agent-Generated Knowledge: Beyond Spec-Driven Development
|
|
2
|
+
|
|
3
|
+
**Author:** Insights from PolyVis project experience
|
|
4
|
+
**Date:** 2026-01-06
|
|
5
|
+
**Status:** Vision Document
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## Executive Summary
|
|
10
|
+
|
|
11
|
+
Traditional software development separates execution (agents/developers write code) from documentation (humans write specs and docs). The PolyVis project demonstrated a more effective pattern: **agents generate their own institutional knowledge** through structured reflection. This document explores why this works, how it scales, and what it means for Amalfa's design.
|
|
12
|
+
|
|
13
|
+
**Key Finding:** When given minimal structure (brief-debrief-playbook workflow), agents spontaneously maintain documentation, update standards, and build compounding organizational knowledge - with humans acting as curators rather than primary authors.
|
|
14
|
+
|
|
15
|
+
---
|
|
16
|
+
|
|
17
|
+
## Table of Contents
|
|
18
|
+
|
|
19
|
+
1. [The Problem with Traditional Documentation](#the-problem-with-traditional-documentation)
|
|
20
|
+
2. [The PolyVis Discovery](#the-polyvis-discovery)
|
|
21
|
+
3. [Why the Pattern Works](#why-the-pattern-works)
|
|
22
|
+
4. [The Brief-Debrief-Playbook Flywheel](#the-brief-debrief-playbook-flywheel)
|
|
23
|
+
5. [Spec-Driven vs. Learning-Driven Development](#spec-driven-vs-learning-driven-development)
|
|
24
|
+
6. [The Human Role: Reader, Not Writer](#the-human-role-reader-not-writer)
|
|
25
|
+
7. [Implications for Amalfa Design](#implications-for-amalfa-design)
|
|
26
|
+
8. [Evolution Path: From Manual to Emergent](#evolution-path-from-manual-to-emergent)
|
|
27
|
+
9. [Implementation Principles](#implementation-principles)
|
|
28
|
+
10. [Conclusion: Documentation as Cognition](#conclusion-documentation-as-cognition)
|
|
29
|
+
|
|
30
|
+
---
|
|
31
|
+
|
|
32
|
+
## The Problem with Traditional Documentation
|
|
33
|
+
|
|
34
|
+
### The Classic Pattern
|
|
35
|
+
|
|
36
|
+
```
|
|
37
|
+
Human writes spec → Agent executes → Code ships
|
|
38
|
+
↓
|
|
39
|
+
Knowledge stays in heads
|
|
40
|
+
↓
|
|
41
|
+
Next project: start over
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
**Problems:**
|
|
45
|
+
- Documentation lags reality (always outdated)
|
|
46
|
+
- Knowledge stays tacit (in developer heads)
|
|
47
|
+
- Human is bottleneck (must write everything)
|
|
48
|
+
- No feedback loop (learnings don't update specs)
|
|
49
|
+
- Context loss (why decisions were made)
|
|
50
|
+
|
|
51
|
+
### The Maintenance Burden
|
|
52
|
+
|
|
53
|
+
**Traditional approach:**
|
|
54
|
+
```
|
|
55
|
+
Day 1: Human writes comprehensive style guide
|
|
56
|
+
Day 30: Half the team forgot it exists
|
|
57
|
+
Day 90: Guide is outdated, nobody updates it
|
|
58
|
+
Day 180: Guide is ignored, tribal knowledge rules
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
**Why it fails:**
|
|
62
|
+
- Writing docs is separate from doing work
|
|
63
|
+
- Updates require explicit effort
|
|
64
|
+
- No immediate payoff for maintainer
|
|
65
|
+
- Docs drift from reality
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
69
|
+
## The PolyVis Discovery
|
|
70
|
+
|
|
71
|
+
### The Pattern
|
|
72
|
+
|
|
73
|
+
```
|
|
74
|
+
Brief → Work → Debrief → Playbook Update
|
|
75
|
+
↓ ↓ ↓ ↓
|
|
76
|
+
Human Agent Agent (mostly) Agent
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
### What Happened
|
|
80
|
+
|
|
81
|
+
**Expected behavior:**
|
|
82
|
+
- Agent executes brief
|
|
83
|
+
- Human writes debrief
|
|
84
|
+
- Human updates playbooks
|
|
85
|
+
|
|
86
|
+
**Actual behavior:**
|
|
87
|
+
- Agent executes brief
|
|
88
|
+
- Agent writes debrief (as required)
|
|
89
|
+
- **Agent updates playbooks spontaneously** (unprompted!)
|
|
90
|
+
- Human reads debriefs for oversight
|
|
91
|
+
- Human does "occasional tidy up"
|
|
92
|
+
|
|
93
|
+
### The Key Observation
|
|
94
|
+
|
|
95
|
+
> "Most times the lessons learned would be copied to the playbooks by the agent without prompting."
|
|
96
|
+
|
|
97
|
+
This wasn't programmed - it **emerged** from the structure.
|
|
98
|
+
|
|
99
|
+
---
|
|
100
|
+
|
|
101
|
+
## Why the Pattern Works
|
|
102
|
+
|
|
103
|
+
### 1. Immediate Context
|
|
104
|
+
|
|
105
|
+
**Debriefs capture knowledge while it's fresh:**
|
|
106
|
+
|
|
107
|
+
```markdown
|
|
108
|
+
## Debrief: Auth Refactor (2025-12-05)
|
|
109
|
+
|
|
110
|
+
### What Worked
|
|
111
|
+
- Alpine's x-data was better than manual state management
|
|
112
|
+
- Eliminated 200 lines of imperative DOM code
|
|
113
|
+
|
|
114
|
+
### What Failed
|
|
115
|
+
- Tried CSS Grid for layout → broke on Safari
|
|
116
|
+
- Switched to Flexbox → worked everywhere
|
|
117
|
+
|
|
118
|
+
### Lesson Learned
|
|
119
|
+
- Test layout in Safari EARLY, not as afterthought
|
|
120
|
+
- Alpine handles state better for reactive UIs
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
**Why this works:**
|
|
124
|
+
- Written **immediately** after work (context in memory)
|
|
125
|
+
- Captures **reasoning**, not just outcomes
|
|
126
|
+
- Documents **dead ends** (what not to try next time)
|
|
127
|
+
- Natural language (no friction to write)
|
|
128
|
+
|
|
129
|
+
### 2. The Reflection Gap
|
|
130
|
+
|
|
131
|
+
**Debrief forces cognitive processing:**
|
|
132
|
+
|
|
133
|
+
| Without Debrief | With Debrief |
|
|
134
|
+
|----------------|--------------|
|
|
135
|
+
| Task done → forget | Task done → reflect |
|
|
136
|
+
| Knowledge stays implicit | Knowledge becomes explicit |
|
|
137
|
+
| "It works" | "It works *because*..." |
|
|
138
|
+
| No pattern recognition | Patterns emerge from writing |
|
|
139
|
+
|
|
140
|
+
**The act of writing** the debrief causes:
|
|
141
|
+
- Pattern recognition ("Alpine was better than...")
|
|
142
|
+
- Causal reasoning ("Grid broke *because*...")
|
|
143
|
+
- Abstraction ("Test Safari *early*" - generalizable rule)
|
|
144
|
+
|
|
145
|
+
### 3. Intrinsic Motivation
|
|
146
|
+
|
|
147
|
+
**Without playbooks:**
|
|
148
|
+
```
|
|
149
|
+
Session 1: Agent solves problem (hard)
|
|
150
|
+
Session 2: Agent encounters same problem (hard again)
|
|
151
|
+
Session 3: Agent encounters same problem (hard again)
|
|
152
|
+
```
|
|
153
|
+
→ No incentive to document
|
|
154
|
+
|
|
155
|
+
**With playbooks:**
|
|
156
|
+
```
|
|
157
|
+
Session 1: Agent solves problem → writes debrief → updates playbook
|
|
158
|
+
Session 2: Agent reads playbook → solves similar problem (easy)
|
|
159
|
+
Session 3: Agent benefits from own past work
|
|
160
|
+
```
|
|
161
|
+
→ **Self-interest** drives documentation
|
|
162
|
+
|
|
163
|
+
### 4. Closed Feedback Loop
|
|
164
|
+
|
|
165
|
+
```
|
|
166
|
+
┌─────────────────────────────────────────┐
|
|
167
|
+
│ │
|
|
168
|
+
│ Brief ──→ Work ──→ Debrief ──→ Playbook│
|
|
169
|
+
│ ↑ │ │
|
|
170
|
+
│ │ │ │
|
|
171
|
+
│ └──────────── Next Brief ←────┘ │
|
|
172
|
+
│ (informed by playbook) │
|
|
173
|
+
│ │
|
|
174
|
+
└─────────────────────────────────────────┘
|
|
175
|
+
```
|
|
176
|
+
|
|
177
|
+
**Each cycle:**
|
|
178
|
+
1. Brief references playbook ("follow CSS standards")
|
|
179
|
+
2. Agent works, encounters edge case
|
|
180
|
+
3. Debrief captures edge case + solution
|
|
181
|
+
4. Agent updates playbook unprompted
|
|
182
|
+
5. Next brief benefits from richer playbook
|
|
183
|
+
|
|
184
|
+
**Result:** Compounding knowledge with minimal human intervention.
|
|
185
|
+
|
|
186
|
+
---
|
|
187
|
+
|
|
188
|
+
## The Brief-Debrief-Playbook Flywheel
|
|
189
|
+
|
|
190
|
+
### Document Types
|
|
191
|
+
|
|
192
|
+
#### Brief (Tactical)
|
|
193
|
+
**Purpose:** Define specific task
|
|
194
|
+
**Scope:** Single work session
|
|
195
|
+
**Author:** Human (or agent proposal)
|
|
196
|
+
|
|
197
|
+
**Contents:**
|
|
198
|
+
- Objective (what to build)
|
|
199
|
+
- Requirements (success criteria)
|
|
200
|
+
- Context (why this matters)
|
|
201
|
+
- Constraints (what to avoid)
|
|
202
|
+
|
|
203
|
+
**Example:**
|
|
204
|
+
```markdown
|
|
205
|
+
# Brief: Add Vector Search to Explorer
|
|
206
|
+
|
|
207
|
+
## Objective
|
|
208
|
+
Enable semantic search in graph explorer UI
|
|
209
|
+
|
|
210
|
+
## Requirements
|
|
211
|
+
- Search input in sidebar
|
|
212
|
+
- Results highlight matching nodes
|
|
213
|
+
- < 100ms response time
|
|
214
|
+
|
|
215
|
+
## Context
|
|
216
|
+
Users need to find concepts without knowing exact node names
|
|
217
|
+
```
|
|
218
|
+
|
|
219
|
+
#### Debrief (Reflective)
|
|
220
|
+
**Purpose:** Capture learnings
|
|
221
|
+
**Scope:** Single work session
|
|
222
|
+
**Author:** Agent (immediately after work)
|
|
223
|
+
|
|
224
|
+
**Contents:**
|
|
225
|
+
- What worked (successes)
|
|
226
|
+
- What failed (dead ends)
|
|
227
|
+
- Lessons learned (abstractions)
|
|
228
|
+
- Open questions (future work)
|
|
229
|
+
|
|
230
|
+
**Example:**
|
|
231
|
+
```markdown
|
|
232
|
+
# Debrief: Vector Search Implementation
|
|
233
|
+
|
|
234
|
+
## What Worked
|
|
235
|
+
- Sigma's filter API for highlighting
|
|
236
|
+
- Debounced search (300ms) prevents lag
|
|
237
|
+
- Used existing VectorEngine (no new code)
|
|
238
|
+
|
|
239
|
+
## What Failed
|
|
240
|
+
- Tried animating node transitions → janky on 1000+ nodes
|
|
241
|
+
- Z-index for results popup → broke in Safari
|
|
242
|
+
- Used `position: fixed` instead
|
|
243
|
+
|
|
244
|
+
## Lessons Learned
|
|
245
|
+
- Animation performance degrades non-linearly with node count
|
|
246
|
+
- Test Safari layout EARLY (different stacking context rules)
|
|
247
|
+
- Debouncing is critical for live search
|
|
248
|
+
|
|
249
|
+
## Open Questions
|
|
250
|
+
- Should search history persist across sessions?
|
|
251
|
+
```
|
|
252
|
+
|
|
253
|
+
#### Playbook (Strategic)
|
|
254
|
+
**Purpose:** Codify standards
|
|
255
|
+
**Scope:** All future work
|
|
256
|
+
**Author:** Agent (extracted from debriefs)
|
|
257
|
+
|
|
258
|
+
**Contents:**
|
|
259
|
+
- Principles (how we do things)
|
|
260
|
+
- Patterns (reusable solutions)
|
|
261
|
+
- Anti-patterns (what to avoid)
|
|
262
|
+
- Decision records (why we chose X over Y)
|
|
263
|
+
|
|
264
|
+
**Example:**
|
|
265
|
+
```markdown
|
|
266
|
+
# CSS Performance Playbook
|
|
267
|
+
|
|
268
|
+
## Principles
|
|
269
|
+
1. Test Safari early - different rendering behavior
|
|
270
|
+
2. Animations degrade with node count
|
|
271
|
+
3. Use CSS variables for all tweakable values
|
|
272
|
+
|
|
273
|
+
## Patterns
|
|
274
|
+
|
|
275
|
+
### Debounced Search Input
|
|
276
|
+
```javascript
|
|
277
|
+
let timeout;
|
|
278
|
+
input.addEventListener('input', (e) => {
|
|
279
|
+
clearTimeout(timeout);
|
|
280
|
+
timeout = setTimeout(() => search(e.target.value), 300);
|
|
281
|
+
});
|
|
282
|
+
```
|
|
283
|
+
|
|
284
|
+
## Anti-Patterns
|
|
285
|
+
- ❌ `position: fixed` with z-index (breaks Safari)
|
|
286
|
+
- ✅ Use `position: absolute` with explicit stacking
|
|
287
|
+
|
|
288
|
+
## Decision Records
|
|
289
|
+
**DR-023: Why Flexbox over Grid for layouts**
|
|
290
|
+
Grid has better semantics but Safari rendering is inconsistent.
|
|
291
|
+
Flexbox trades elegance for reliability. See debrief-2025-12-05.
|
|
292
|
+
```
|
|
293
|
+
|
|
294
|
+
### The Flow
|
|
295
|
+
|
|
296
|
+
```
|
|
297
|
+
Briefs
|
|
298
|
+
(this task)
|
|
299
|
+
↓
|
|
300
|
+
Work
|
|
301
|
+
(execution)
|
|
302
|
+
↓
|
|
303
|
+
Debriefs
|
|
304
|
+
(what we learned)
|
|
305
|
+
↓
|
|
306
|
+
Playbooks
|
|
307
|
+
(how we do things)
|
|
308
|
+
↓
|
|
309
|
+
Future Briefs
|
|
310
|
+
(informed by standards)
|
|
311
|
+
```
|
|
312
|
+
|
|
313
|
+
**Key insight:** Knowledge flows **upward** (tactical → strategic) **and** **cycles** (strategic informs future tactical).
|
|
314
|
+
|
|
315
|
+
---
|
|
316
|
+
|
|
317
|
+
## Spec-Driven vs. Learning-Driven Development
|
|
318
|
+
|
|
319
|
+
### Spec-Driven (Traditional)
|
|
320
|
+
|
|
321
|
+
```
|
|
322
|
+
Define requirements → Implement → Verify against spec
|
|
323
|
+
↓
|
|
324
|
+
Done ✓
|
|
325
|
+
```
|
|
326
|
+
|
|
327
|
+
**Optimization target:** This task
|
|
328
|
+
**Question answered:** Did we meet requirements?
|
|
329
|
+
**Knowledge captured:** None (implicit)
|
|
330
|
+
|
|
331
|
+
### Learning-Driven (PolyVis)
|
|
332
|
+
|
|
333
|
+
```
|
|
334
|
+
Define requirements → Implement → Verify → Reflect → Update knowledge
|
|
335
|
+
↓
|
|
336
|
+
All future tasks ✓
|
|
337
|
+
```
|
|
338
|
+
|
|
339
|
+
**Optimization target:** All future tasks
|
|
340
|
+
**Question answered:** What did we learn?
|
|
341
|
+
**Knowledge captured:** Explicit (codified)
|
|
342
|
+
|
|
343
|
+
### The Synthesis
|
|
344
|
+
|
|
345
|
+
**Spec-driven is necessary but not sufficient.**
|
|
346
|
+
|
|
347
|
+
**Combined approach:**
|
|
348
|
+
- Spec-driven ensures **this task succeeds**
|
|
349
|
+
- Learning-driven ensures **future tasks succeed faster**
|
|
350
|
+
|
|
351
|
+
**Example:**
|
|
352
|
+
|
|
353
|
+
**Spec-driven only:**
|
|
354
|
+
```
|
|
355
|
+
Week 1: Build auth system (40 hours)
|
|
356
|
+
Week 5: Build payment system (40 hours - similar patterns)
|
|
357
|
+
Week 9: Build admin system (40 hours - similar patterns)
|
|
358
|
+
```
|
|
359
|
+
→ Effort grows linearly: each project costs as much as the last
|
|
360
|
+
|
|
361
|
+
**Learning-driven addition:**
|
|
362
|
+
```
|
|
363
|
+
Week 1: Build auth (40h) + debrief (2h) + playbook (1h) = 43h
|
|
364
|
+
Week 5: Build payment (25h, leveraged auth playbook)
|
|
365
|
+
Week 9: Build admin (15h, leveraged both playbooks)
|
|
366
|
+
```
|
|
367
|
+
→ Compounding efficiency
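The totals implied by the two timelines can be checked with a few lines of arithmetic. This is a sketch that uses only the illustrative hour figures from the example above; the variable names are hypothetical.

```typescript
// Hours per project, copied from the two example timelines.
const specDrivenHours: number[] = [40, 40, 40];
// Week 1 includes the debrief (2h) and playbook (1h) overhead.
const learningDrivenHours: number[] = [40 + 2 + 1, 25, 15];

const sumHours = (hours: number[]): number =>
  hours.reduce((total, h) => total + h, 0);

const specTotal = sumHours(specDrivenHours); // 120
const learningTotal = sumHours(learningDrivenHours); // 83
const hoursSaved = specTotal - learningTotal; // 37
const percentSaved = Math.round((hoursSaved / specTotal) * 100); // 31

console.log(`spec-driven: ${specTotal}h, learning-driven: ${learningTotal}h`);
console.log(`saved: ${hoursSaved}h (${percentSaved}%)`);
```

In this example, the 3 hours of reflection in week 1 are repaid by week 5, and the three projects together take 83 hours instead of 120.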
|
|
368
|
+
|
|
369
|
+
---
|
|
370
|
+
|
|
371
|
+
## The Human Role: Reader, Not Writer
|
|
372
|
+
|
|
373
|
+
### The Inversion
|
|
374
|
+
|
|
375
|
+
**Old model:**
|
|
376
|
+
```
|
|
377
|
+
Human: Primary Author
|
|
378
|
+
↓
|
|
379
|
+
Writes specs, docs, standards
|
|
380
|
+
↓
|
|
381
|
+
Agent: Reader/Executor
|
|
382
|
+
↓
|
|
383
|
+
Follows documentation
|
|
384
|
+
↓
|
|
385
|
+
Human: Maintainer
|
|
386
|
+
↓
|
|
387
|
+
Updates docs (bottleneck)
|
|
388
|
+
```
|
|
389
|
+
|
|
390
|
+
**PolyVis model:**
|
|
391
|
+
```
|
|
392
|
+
Human: Editor/Curator
|
|
393
|
+
↓
|
|
394
|
+
Writes initial briefs
|
|
395
|
+
↓
|
|
396
|
+
Agent: Author/Executor
|
|
397
|
+
↓
|
|
398
|
+
Writes code + debriefs + playbooks
|
|
399
|
+
↓
|
|
400
|
+
Human: Reader/Overseer
|
|
401
|
+
↓
|
|
402
|
+
Reads debriefs, occasional tidy-up
|
|
403
|
+
```
|
|
404
|
+
|
|
405
|
+
### What "Occasional Tidy Up" Means
|
|
406
|
+
|
|
407
|
+
**The human didn't need to do constant maintenance because:**
|
|
408
|
+
|
|
409
|
+
1. **Agents write at point of knowledge creation** (fresh context)
|
|
410
|
+
2. **Agents have incentive to maintain** (benefit next session)
|
|
411
|
+
3. **Errors self-correct** (agent encounters bad advice → fixes playbook)
|
|
412
|
+
4. **Human passively monitors** (reads debriefs for issues)
|
|
413
|
+
|
|
414
|
+
**Human's "tidy up" role:**
|
|
415
|
+
- Merge duplicate playbooks
|
|
416
|
+
- Reorganize structure
|
|
417
|
+
- Deprecate obsolete sections
|
|
418
|
+
- Resolve contradictions
|
|
419
|
+
- **Curation, not creation**
|
|
420
|
+
|
|
421
|
+
### Why This Scales
|
|
422
|
+
|
|
423
|
+
**Traditional docs:**
|
|
424
|
+
```
|
|
425
|
+
N tasks = N × human_writing_time (linear bottleneck)
|
|
426
|
+
```
|
|
427
|
+
|
|
428
|
+
**PolyVis pattern:**
|
|
429
|
+
```
|
|
430
|
+
N tasks = N × agent_writing_time + √N × human_curation_time
|
|
431
|
+
(parallelizable) (sublinear oversight)
|
|
432
|
+
```
|
|
433
|
+
|
|
434
|
+
**Key:** Agent writing scales linearly (no bottleneck), human curation scales sublinearly (occasional intervention).
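To make the scaling claim concrete, the sketch below evaluates the human-time term of both formulas for a few values of N. The √N term is the document's illustrative model rather than a measured result, and the unit costs (1 hour per written doc, 1 hour per curation pass) are assumptions chosen only to keep the numbers readable.

```typescript
// Human hours when the human writes all documentation: N × human_writing_time.
function traditionalHumanHours(tasks: number, writingHoursPerTask: number): number {
  return tasks * writingHoursPerTask;
}

// Human hours under the PolyVis pattern as modeled above: √N × human_curation_time.
function curatedHumanHours(tasks: number, curationHours: number): number {
  return Math.sqrt(tasks) * curationHours;
}

// Assumed unit costs (hypothetical): 1 hour each.
for (const tasks of [1, 25, 100]) {
  const traditional = traditionalHumanHours(tasks, 1);
  const curated = curatedHumanHours(tasks, 1);
  console.log(`N=${tasks}: traditional=${traditional}h, curated=${curated}h`);
}
```

Under these assumptions, 100 tasks cost the human 100 hours of writing in the traditional model but only 10 hours of curation in the PolyVis model; agent writing time is left out because it does not compete for human attention.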
|
|
435
|
+
|
|
436
|
+
---
|
|
437
|
+
|
|
438
|
+
## Implications for Amalfa Design
|
|
439
|
+
|
|
440
|
+
### What PolyVis Taught Us
|
|
441
|
+
|
|
442
|
+
**PolyVis pattern:**
|
|
443
|
+
```
|
|
444
|
+
Briefs/Debriefs = Markdown files in repo
|
|
445
|
+
Playbooks = Markdown files in repo
|
|
446
|
+
Discovery = Grep/search file names
|
|
447
|
+
Links = Manual (file references)
|
|
448
|
+
```
|
|
449
|
+
|
|
450
|
+
**Strengths:**
|
|
451
|
+
- Simple (no infrastructure)
|
|
452
|
+
- Version controlled (git history)
|
|
453
|
+
- Human readable (markdown)
|
|
454
|
+
|
|
455
|
+
**Limitations:**
|
|
456
|
+
- Linear discovery (keyword matching)
|
|
457
|
+
- Manual linking (file references)
|
|
458
|
+
- No semantic search (must know filenames)
|
|
459
|
+
- Hard to find related concepts
|
|
460
|
+
|
|
461
|
+
### Amalfa Enhancement
|
|
462
|
+
|
|
463
|
+
**Amalfa pattern:**
|
|
464
|
+
```
|
|
465
|
+
Briefs/Debriefs = Nodes in semantic graph
|
|
466
|
+
Playbooks = Nodes in semantic graph
|
|
467
|
+
Discovery = Vector search + graph traversal
|
|
468
|
+
Links = Automatic (similarity + explicit)
|
|
469
|
+
```
|
|
470
|
+
|
|
471
|
+
**Example:**
|
|
472
|
+
|
|
473
|
+
**PolyVis:**
|
|
474
|
+
```bash
|
|
475
|
+
$ grep "CSS" playbooks/*.md
|
|
476
|
+
playbooks/css-performance.md
|
|
477
|
+
playbooks/css-variables.md
|
|
478
|
+
```
|
|
479
|
+
→ Must know keyword "CSS"
|
|
480
|
+
|
|
481
|
+
**Amalfa:**
|
|
482
|
+
```
|
|
483
|
+
Query: "styling broke on Safari"
|
|
484
|
+
|
|
485
|
+
Results:
|
|
486
|
+
1. debrief-2025-12-03-layout-debug (0.89 similarity)
|
|
487
|
+
"Flexbox works, Grid breaks Safari stacking context"
|
|
488
|
+
|
|
489
|
+
2. playbook-cross-browser-testing (0.85 similarity)
|
|
490
|
+
"Test Safari early - rendering differs from Chrome"
|
|
491
|
+
|
|
492
|
+
3. brief-safari-animation-bug (0.82 similarity)
|
|
493
|
+
"Animation performance issues Safari vs Chrome"
|
|
494
|
+
```
|
|
495
|
+
→ Semantic matching, even without "CSS" keyword
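A minimal sketch of how similarity-ranked results like these could be produced, assuming each document already has an embedding vector. The `EmbeddedDoc` shape, the function names, and the 3-dimensional toy vectors are hypothetical and are not Amalfa's actual `VectorEngine` API.

```typescript
interface EmbeddedDoc {
  id: string;
  embedding: number[]; // would come from an embedding model; toy values below
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Score every document against the query and return the best matches first.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[], topK: number) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy embeddings (hypothetical values, not real model output).
const corpus: EmbeddedDoc[] = [
  { id: "playbook-auth-patterns", embedding: [0, 0.1, 1] },
  { id: "debrief-2025-12-03-layout-debug", embedding: [0.9, 0.3, 0.1] },
  { id: "playbook-cross-browser-testing", embedding: [0.7, 0.6, 0.2] },
];
const queryEmbedding = [1, 0.2, 0]; // stands in for "styling broke on Safari"

const ranked = rankBySimilarity(queryEmbedding, corpus, 2);
console.log(ranked.map((r) => `${r.id} (${r.score.toFixed(2)})`));
```

The query never needs to share a keyword with the documents; ranking depends only on how close the vectors are.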
|
|
496
|
+
|
|
497
|
+
### First-Class Workflow Support
|
|
498
|
+
|
|
499
|
+
**Bad Amalfa design:**
|
|
500
|
+
```
|
|
501
|
+
Generic document store
|
|
502
|
+
Agent dumps random markdown
|
|
503
|
+
No structure, no patterns
|
|
504
|
+
Human digs through chaos
|
|
505
|
+
```
|
|
506
|
+
|
|
507
|
+
**Good Amalfa design:**
|
|
508
|
+
```
|
|
509
|
+
Structured document types:
|
|
510
|
+
- Brief: requirements, context, success criteria
|
|
511
|
+
- Debrief: what worked, failed, lessons learned
|
|
512
|
+
- Playbook: principles, patterns, anti-patterns
|
|
513
|
+
|
|
514
|
+
Auto-linking:
|
|
515
|
+
- Debrief → Brief (what task it relates to)
|
|
516
|
+
- Debrief → Playbook (which standard it updates)
|
|
517
|
+
- Similar debriefs (encountered same problem)
|
|
518
|
+
|
|
519
|
+
Auto-promotion:
|
|
520
|
+
- Extract lessons from debrief
|
|
521
|
+
- Suggest playbook additions
|
|
522
|
+
- Identify contradictions
|
|
523
|
+
```
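The structured types and the auto-promotion step above could be modeled roughly as follows. This is a sketch under stated assumptions: the field names mirror the section headings used in this document, and every type and function name here is hypothetical rather than Amalfa's actual schema.

```typescript
interface Brief { kind: "brief"; id: string; requirements: string[]; context: string; successCriteria: string[] }
interface Debrief { kind: "debrief"; id: string; briefId: string; worked: string[]; failed: string[]; lessons: string[] }
interface Playbook { kind: "playbook"; id: string; principles: string[]; patterns: string[]; antiPatterns: string[] }
type KnowledgeDoc = Brief | Debrief | Playbook;

// Auto-linking: a debrief always points back to the brief it relates to.
function linkTargets(doc: KnowledgeDoc): string[] {
  switch (doc.kind) {
    case "debrief":
      return [doc.briefId];
    case "brief":
    case "playbook":
      return [];
  }
}

// Naive auto-promotion: propose lessons the playbook does not already list.
// Matching is exact after trimming and lowercasing; catching rephrasings or
// contradictions would need semantic comparison.
function suggestPlaybookAdditions(debrief: Debrief, playbook: Playbook): string[] {
  const known = new Set(playbook.principles.map((p) => p.trim().toLowerCase()));
  return debrief.lessons.filter((lesson) => !known.has(lesson.trim().toLowerCase()));
}

const searchDebrief: Debrief = {
  kind: "debrief",
  id: "debrief-vector-search",
  briefId: "brief-vector-search",
  worked: ["Debounced search (300ms) prevents lag"],
  failed: ["Animating node transitions was janky on 1000+ nodes"],
  lessons: ["Test Safari early", "Debouncing is critical for live search"],
};
const cssPlaybook: Playbook = {
  kind: "playbook",
  id: "playbook-css-performance",
  principles: ["Test Safari early"],
  patterns: [],
  antiPatterns: [],
};

const suggestions = suggestPlaybookAdditions(searchDebrief, cssPlaybook);
console.log(linkTargets(searchDebrief), suggestions);
```

Because each type carries a `kind` tag, tooling can decide what to link or promote from the document's structure alone, which is the difference from a generic document store.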
|
|
524
|
+
|
|
525
|
+
### Discovery Patterns
|
|
526
|
+
|
|
527
|
+
**PolyVis (manual search):**
|
|
528
|
+
```
|
|
529
|
+
Agent: "I need to style this component"
|
|
530
|
+
Agent: *greps for CSS playbook*
|
|
531
|
+
Agent: *reads playbook*
|
|
532
|
+
Agent: *applies patterns*
|
|
533
|
+
```
|
|
534
|
+
|
|
535
|
+
**Amalfa (semantic discovery):**
|
|
536
|
+
```
|
|
537
|
+
Agent: "I need to style this component for Safari"
|
|
538
|
+
Agent: *queries "safari styling patterns"*
|
|
539
|
+
Amalfa: Returns:
|
|
540
|
+
- Playbook: cross-browser CSS
|
|
541
|
+
- Debrief: Safari layout debugging
|
|
542
|
+
- Brief: Safari-specific animation work
|
|
543
|
+
Agent: *synthesizes from multiple sources*
|
|
544
|
+
```
|
|
545
|
+
|
|
546
|
+
**Key difference:** Amalfa returns **related concepts**, not just exact matches.
|
|
547
|
+
|
|
548
|
+
### Graph Traversal
|
|
549
|
+
|
|
550
|
+
**Beyond search - navigate relationships:**
|
|
551
|
+
|
|
552
|
+
```
|
|
553
|
+
Query: "What problems did auth work encounter?"
|
|
554
|
+
|
|
555
|
+
Amalfa returns graph:
|
|
556
|
+
brief-auth-refactor
|
|
557
|
+
├─→ debrief-auth-refactor (completed)
|
|
558
|
+
│ ├─→ playbook-alpine-patterns (updated)
|
|
559
|
+
│ └─→ debrief-state-management (similar problem)
|
|
560
|
+
├─→ brief-auth-tests (follow-up work)
|
|
561
|
+
└─→ issue-token-refresh (blocker)
|
|
562
|
+
```
|
|
563
|
+
|
|
564
|
+
**Agent can:**
|
|
565
|
+
- See full context (brief → debrief → playbook)
|
|
566
|
+
- Find related work (similar problems)
|
|
567
|
+
- Identify follow-ups (what came after)
|
|
568
|
+
- Understand dependencies (blockers)
|
|
569
|
+
|
|
570
|
+
---
|
|
571
|
+
|
|
572
|
+
## Evolution Path: From Manual to Emergent
|
|
573
|
+
|
|
574
|
+
### Level 1: Manual Process (Pre-Agent)
|
|
575
|
+
|
|
576
|
+
```
|
|
577
|
+
Human writes specs
|
|
578
|
+
Human writes code
|
|
579
|
+
Human reviews code
|
|
580
|
+
Human documents learnings
|
|
581
|
+
Human maintains docs
|
|
582
|
+
```
|
|
583
|
+
|
|
584
|
+
**Characteristics:**
|
|
585
|
+
- Human bottleneck
|
|
586
|
+
- Linear scaling
|
|
587
|
+
- Knowledge stays tacit
|
|
588
|
+
|
|
589
|
+
### Level 2: Agent Execution (Early Agent Era)
|
|
590
|
+
|
|
591
|
+
```
|
|
592
|
+
Human writes specs
|
|
593
|
+
Agent writes code
|
|
594
|
+
Human reviews code
|
|
595
|
+
Human documents learnings
|
|
596
|
+
Human maintains docs
|
|
597
|
+
```
|
|
598
|
+
|
|
599
|
+
**Characteristics:**
|
|
600
|
+
- Execution scales
|
|
601
|
+
- Documentation still bottleneck
|
|
602
|
+
- Knowledge extraction still manual
|
|
603
|
+
|
|
604
|
+
### Level 3: Agent Reflection (PolyVis Pattern)
|
|
605
|
+
|
|
606
|
+
```
|
|
607
|
+
Human writes briefs
|
|
608
|
+
Agent writes code + debriefs
|
|
609
|
+
Agent updates playbooks
|
|
610
|
+
Human reads debriefs (oversight)
|
|
611
|
+
Human occasional tidy-up
|
|
612
|
+
```
|
|
613
|
+
|
|
614
|
+
**Characteristics:**
|
|
615
|
+
- Execution scales
|
|
616
|
+
- Documentation scales
|
|
617
|
+
- Human acts as curator
|
|
618
|
+
- **This is where PolyVis succeeded**
|
|
619
|
+
|
|
620
|
+
### Level 4: Semantic Infrastructure (Amalfa-Enhanced)
|
|
621
|
+
|
|
622
|
+
```
|
|
623
|
+
Human writes briefs (or reviews agent proposals)
|
|
624
|
+
Agent queries past work semantically
|
|
625
|
+
Agent writes code + debriefs
|
|
626
|
+
Agent links related concepts
|
|
627
|
+
Agent updates playbooks
|
|
628
|
+
Human queries debriefs for trends
|
|
629
|
+
Human sets strategic direction
|
|
630
|
+
```
|
|
631
|
+
|
|
632
|
+
**Characteristics:**
|
|
633
|
+
- Execution scales
|
|
634
|
+
- Documentation scales
|
|
635
|
+
- Discovery scales (semantic)
|
|
636
|
+
- Knowledge compounds faster
|
|
637
|
+
|
|
638
|
+
### Level 5: Full Emergence (Future Vision)
|
|
639
|
+
|
|
640
|
+
```
|
|
641
|
+
Agents propose work based on knowledge gaps
|
|
642
|
+
Agents self-organize documentation
|
|
643
|
+
Agents identify obsolete knowledge
|
|
644
|
+
Agents coordinate multi-agent work
|
|
645
|
+
Human sets goals and values
|
|
646
|
+
Human monitors for alignment
|
|
647
|
+
```
|
|
648
|
+
|
|
649
|
+
**Characteristics:**
|
|
650
|
+
- Agents manage full workflow
|
|
651
|
+
- Self-organizing knowledge base
|
|
652
|
+
- Human provides direction, not execution
|
|
653
|
+
- Emergent collective intelligence
|
|
654
|
+
|
|
655
|
+
**Note:** Level 5 requires significant AI advances. Level 4 is achievable today.
|
|
656
|
+
|
|
657
|
+
---
|
|
658
|
+
|
|
659
|
+
## Implementation Principles
|
|
660
|
+
|
|
661
|
+
### 1. Structure Enables Emergence
|
|
662
|
+
|
|
663
|
+
**Don't:** Create generic note-taking app
|
|
664
|
+
**Do:** Provide structured templates that guide reflection
|
|
665
|
+
|
|
666
|
+
**Example:**
|
|
667
|
+
|
|
668
|
+
**Bad:**
|
|
669
|
+
```
|
|
670
|
+
"Write notes about your work"
|
|
671
|
+
```
|
|
672
|
+
→ Agent writes random thoughts
|
|
673
|
+
|
|
674
|
+
**Good:**
|
|
675
|
+
```markdown
|
|
676
|
+
# Debrief Template
|
|
677
|
+
|
|
678
|
+
## What Worked
|
|
679
|
+
[What approaches succeeded and why?]
|
|
680
|
+
|
|
681
|
+
## What Failed
|
|
682
|
+
[What did you try that didn't work? Why did it fail?]
|
|
683
|
+
|
|
684
|
+
## Lessons Learned
|
|
685
|
+
[What generalizable insights emerged?]
|
|
686
|
+
|
|
687
|
+
## Open Questions
|
|
688
|
+
[What remains unclear or needs investigation?]
|
|
689
|
+
```
|
|
690
|
+
→ Agent follows structure, captures useful knowledge
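One way a tool could enforce the template is to check a debrief for the four headings before accepting it. The sketch below assumes debriefs are Markdown with `## ` section headings, as in the template above; the function name is hypothetical.

```typescript
const REQUIRED_SECTIONS = ["What Worked", "What Failed", "Lessons Learned", "Open Questions"];

// Returns the required "## " headings that are absent from a debrief.
function missingSections(markdown: string): string[] {
  const present = new Set(
    markdown
      .split("\n")
      .filter((line) => line.startsWith("## "))
      .map((line) => line.slice(3).trim()),
  );
  return REQUIRED_SECTIONS.filter((section) => !present.has(section));
}

const draft = [
  "# Debrief: Vector Search",
  "## What Worked",
  "- Debounced search (300ms) prevents lag",
  "## Lessons Learned",
  "- Test Safari layout EARLY",
].join("\n");

console.log(missingSections(draft)); // ["What Failed", "Open Questions"]
```

A check like this keeps the structure that guides reflection without constraining what the agent writes inside each section.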
|
|
691
|
+
|
|
692
|
+
### 2. Make Documentation Self-Interested
|
|
693
|
+
|
|
694
|
+
**Principle:** Agents maintain docs when *they* benefit.
|
|
695
|
+
|
|
696
|
+
**Design implications:**
|
|
697
|
+
- Debriefs inform future agent sessions (not just human record)
|
|
698
|
+
- Playbooks save agents time (not just human reference)
|
|
699
|
+
- Search finds agent's own past work (personal memory)
|
|
700
|
+
|
|
701
|
+
**Anti-pattern:** Documentation as "homework" (pure overhead)
|
|
702
|
+
**Pattern:** Documentation as "external memory" (future productivity)
|
|
703
|
+
|
|
704
|
+
### 3. Reduce Friction
|
|
705
|
+
|
|
706
|
+
**Principle:** Documenting should be easier than not documenting.
|
|
707
|
+
|
|
708
|
+
**Design implications:**
|
|
709
|
+
- In-line templates (don't make agents search for format)
|
|
710
|
+
- Auto-fill context (task ID, date, files changed)
|
|
711
|
+
- Suggest related docs (based on code changes)
|
|
712
|
+
- One-command workflows (`debrief create`, `playbook add`)
|
|
713
|
+
|
|
714
|
+
**Example:**
|
|
715
|
+
```bash
|
|
716
|
+
# Bad: Multiple steps, high friction
|
|
717
|
+
$ touch debrief.md
|
|
718
|
+
$ open debrief.md
|
|
719
|
+
$ # manually fill in metadata
|
|
720
|
+
$ # manually find related brief
|
|
721
|
+
$ # manually write content
|
|
722
|
+
|
|
723
|
+
# Good: Single command, pre-filled template
|
|
724
|
+
$ amalfa debrief create brief-auth-refactor
|
|
725
|
+
|
|
726
|
+
Creating debrief for brief-auth-refactor...
|
|
727
|
+
|
|
728
|
+
Auto-detected changes:
|
|
729
|
+
- src/auth/login.ts (modified)
|
|
730
|
+
- src/auth/session.ts (new)
|
|
731
|
+
|
|
732
|
+
Related documents:
|
|
733
|
+
- playbook-auth-patterns
|
|
734
|
+
- debrief-session-management-2025-11-12
|
|
735
|
+
|
|
736
|
+
Template loaded. Fill in sections below:
|
|
737
|
+
[opens editor with pre-filled template]
|
|
738
|
+
```
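The pre-filled template in the "good" flow could be generated along these lines. This is only a sketch of the idea: the `amalfa debrief create` command is the proposal above, and `DebriefContext` and `renderDebriefTemplate` are hypothetical names, with the changed files and related documents assumed to come from elsewhere (for example, version control and search).

```typescript
interface DebriefContext {
  briefId: string;
  date: string; // ISO date, e.g. "2026-01-06"
  changedFiles: string[];
  relatedDocs: string[];
}

// Builds a Markdown debrief with metadata filled in and empty sections to complete.
function renderDebriefTemplate(ctx: DebriefContext): string {
  const bullets = (items: string[]): string =>
    items.length > 0 ? items.map((item) => `- ${item}`).join("\n") : "- (none detected)";
  return [
    `# Debrief: ${ctx.briefId}`,
    `**Date:** ${ctx.date}`,
    "",
    "## Changed Files",
    bullets(ctx.changedFiles),
    "",
    "## Related Documents",
    bullets(ctx.relatedDocs),
    "",
    "## What Worked",
    "",
    "## What Failed",
    "",
    "## Lessons Learned",
    "",
    "## Open Questions",
    "",
  ].join("\n");
}

const template = renderDebriefTemplate({
  briefId: "brief-auth-refactor",
  date: "2026-01-06",
  changedFiles: ["src/auth/login.ts (modified)", "src/auth/session.ts (new)"],
  relatedDocs: ["playbook-auth-patterns", "debrief-session-management-2025-11-12"],
});
console.log(template);
```

Everything the tool can detect is filled in up front, so the agent only has to supply the reflective sections.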
|
|
739
|
+
|
|
740
|
+
### 4. Close the Loop
|
|
741
|
+
|
|
742
|
+
**Principle:** Knowledge must flow back into workflow.
|
|
743
|
+
|
|
744
|
+
**Design implications:**
|
|
745
|
+
- Briefs reference playbooks (standards apply)
|
|
746
|
+
- Debriefs update playbooks (learnings codify)
|
|
747
|
+
- Playbooks link to debriefs (provenance)
|
|
748
|
+
- Search surfaces relevant past work (discovery)
|
|
749
|
+
|
|
750
|
+
**The cycle:**
|
|
751
|
+
```
|
|
752
|
+
Playbook → Brief → Work → Debrief → Playbook
|
|
753
|
+
↑ ↓
|
|
754
|
+
└────────── (knowledge compounds) ───┘
|
|
755
|
+
```
|
|
756
|
+
|
|
757
|
+
### 5. Support Human Oversight
|
|
758
|
+
|
|
759
|
+
**Principle:** The human should read, not write.
|
|
760
|
+
|
|
761
|
+
**Design implications:**
|
|
762
|
+
- Dashboard: Recent debriefs (what happened today)
|
|
763
|
+
- Digest: Weekly summary (trends across sessions)
|
|
764
|
+
- Alerts: Contradictions (agent A says X, agent B says Y)
|
|
765
|
+
- Queries: "What did we learn about auth this month?"
|
|
766
|
+
|
|
767
|
+
**Example dashboard:**
|
|
768
|
+
```
|
|
769
|
+
Recent Debriefs (Last 7 Days)
|
|
770
|
+
📝 debrief-vector-search (2h ago)
|
|
771
|
+
Lesson: Debouncing critical for live search
|
|
772
|
+
Updated: playbook-performance
|
|
773
|
+
|
|
774
|
+
📝 debrief-safari-layout (1d ago)
|
|
775
|
+
Lesson: Test Safari early, not late
|
|
776
|
+
Updated: playbook-cross-browser
|
|
777
|
+
⚠️ Contradicts: playbook-css-grid (suggests Grid)
|
|
778
|
+
|
|
779
|
+
Playbook Updates (Last 30 Days)
|
|
780
|
+
- performance: 3 updates
|
|
781
|
+
- cross-browser: 2 updates
|
|
782
|
+
- auth-patterns: 1 update
|
|
783
|
+
|
|
784
|
+
🔍 Query: "What problems did we have with Safari?"
|
|
785
|
+
```
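The "Playbook Updates (Last 30 Days)" panel reduces to counting update events inside a time window. The sketch below assumes each update is recorded with a playbook name and a timestamp; the `PlaybookUpdate` shape and the sample dates are hypothetical.

```typescript
interface PlaybookUpdate {
  playbook: string;
  updatedAt: Date;
}

// Counts updates per playbook within the last `windowDays`, most active first.
function countRecentUpdates(
  updates: PlaybookUpdate[],
  now: Date,
  windowDays: number,
): [string, number][] {
  const cutoff = now.getTime() - windowDays * 24 * 60 * 60 * 1000;
  const counts = new Map<string, number>();
  for (const update of updates) {
    const time = update.updatedAt.getTime();
    if (time >= cutoff && time <= now.getTime()) {
      counts.set(update.playbook, (counts.get(update.playbook) ?? 0) + 1);
    }
  }
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}

// Sample events (hypothetical dates); the November update falls outside the window.
const updates: PlaybookUpdate[] = [
  { playbook: "performance", updatedAt: new Date("2025-11-01") },
  { playbook: "performance", updatedAt: new Date("2025-12-10") },
  { playbook: "cross-browser", updatedAt: new Date("2025-12-15") },
  { playbook: "performance", updatedAt: new Date("2025-12-20") },
  { playbook: "auth-patterns", updatedAt: new Date("2025-12-30") },
  { playbook: "performance", updatedAt: new Date("2026-01-02") },
  { playbook: "cross-browser", updatedAt: new Date("2026-01-05") },
];

const digest = countRecentUpdates(updates, new Date("2026-01-06"), 30);
for (const [playbook, count] of digest) {
  console.log(`- ${playbook}: ${count} update${count === 1 ? "" : "s"}`);
}
```

With this sample data the digest reproduces the three counts shown in the example dashboard.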

### 6. Semantic Over Keyword

**Principle:** Match concepts, not strings.

**Design implications:**
- Vector embeddings for all documents
- Similarity search (not just keyword grep)
- Graph traversal (related concepts)
- Summarization (compressed context)

**Example:**

**Keyword search:**
```
Query: "authentication"
Matches: Documents containing "authentication"
Misses: Documents about "login", "session", "tokens" without exact word
```

**Semantic search:**
```
Query: "authentication"
Matches:
  - Documents about authentication (exact)
  - Documents about login (related concept)
  - Documents about session management (related concept)
  - Documents about JWT tokens (related implementation)
  - Ranked by relevance
```
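
Ranking "by relevance" in semantic search usually means comparing embedding vectors. The sketch below uses cosine similarity, a common choice. It is an illustration only: this document does not show how Amalfa's VectorEngine scores results, and `EmbeddedDoc` and `rankBySimilarity` are hypothetical names.

```typescript
// Cosine similarity between two equal-length vectors (1 = same direction).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector lengths differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as unrelated
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical shape: a document id plus its precomputed embedding.
interface EmbeddedDoc {
  id: string;
  embedding: number[];
}

// Score every document against the query embedding and keep the best `topK`.
function rankBySimilarity(
  query: number[],
  docs: EmbeddedDoc[],
  topK: number,
): Array<{ id: string; score: number }> {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

In practice the embeddings would come from an embedding model. A document about "login" lands near "authentication" in that vector space even though the two share no keyword, which is the difference from keyword search.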

### 7. Explicit Over Implicit Links

**Principle:** Relationships should be first-class, not inferred.

**Design implications:**
- Debrief explicitly links to brief (task reference)
- Playbook explicitly links to debriefs (provenance)
- Cross-references are typed (related-to, contradicts, supersedes)

**Example graph:**
```
brief-auth-refactor
  ├─[completed-by]─→ debrief-auth-refactor
  │    ├─[updated]─→ playbook-alpine-patterns
  │    └─[similar-to]─→ debrief-state-management
  ├─[led-to]─→ brief-auth-tests
  └─[blocked-by]─→ issue-token-refresh
```

**Why explicit links:**
- Agent can traverse graph intentionally
- Human can audit reasoning
- Changes don't break relationships
- Provenance is clear
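
Typed links make "traverse the graph intentionally" concrete: a caller chooses which relationship types to follow. The sketch below is a minimal TypeScript illustration using the link types from the example graph. The `Link` shape and the `reachable` function are hypothetical, not Amalfa's API.

```typescript
// Link types taken from the example graph above.
type LinkType = "completed-by" | "updated" | "similar-to" | "led-to" | "blocked-by";

// One explicit, typed edge between two documents.
interface Link {
  from: string;
  type: LinkType;
  to: string;
}

// Breadth-first walk from `start` that follows only the requested link types.
// Returns the visited document ids in discovery order (excluding `start`).
function reachable(links: Link[], start: string, follow: LinkType[]): string[] {
  const allowed = new Set<LinkType>(follow);
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  const order: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const link of links) {
      if (link.from !== current || !allowed.has(link.type) || seen.has(link.to)) continue;
      seen.add(link.to);
      order.push(link.to);
      queue.push(link.to);
    }
  }
  return order;
}
```

For example, following only `completed-by` and `updated` from a brief leads to the debrief that closed it and then to the playbooks that debrief changed, while ignoring `similar-to` suggestions.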

---

## Conclusion: Documentation as Cognition

### The Core Insight

**Documentation is not an artifact of work - it's a cognitive tool.**

When agents write debriefs:
- They **process** what happened (reflection)
- They **abstract** patterns (generalization)
- They **externalize** knowledge (memory)
- They **connect** concepts (synthesis)

**Writing forces thinking.**

### Why PolyVis Worked

The brief-debrief-playbook pattern succeeded because:

1. **Structured reflection** (debrief template)
2. **Immediate capture** (write while context is fresh)
3. **Self-interest** (agents benefit from their own docs)
4. **Closed loop** (playbooks inform future briefs)
5. **Minimal friction** (markdown, git, simple tools)
6. **Human oversight** (curator, not bottleneck)

### What Amalfa Adds

Amalfa enhances the pattern with:

1. **Semantic discovery** (find related work by concept)
2. **Graph structure** (navigate relationships)
3. **Auto-linking** (suggest connections)
4. **Multi-agent** (shared memory substrate)
5. **Cross-session** (persistent context)
6. **Queryable trends** (what are we learning about X?)

### The Vision

**Current state (2026):**
- Agents execute tasks
- Documentation is an afterthought
- Knowledge resets each session

**Near future (Amalfa + PolyVis pattern):**
- Agents execute + reflect
- Documentation is intrinsic
- Knowledge compounds across sessions

**Long-term vision:**
- Agents self-organize work
- Documentation is infrastructure
- Collective intelligence emerges

### The Path Forward

**Phase 1: Replicate PolyVis** (MVP)
- Support brief/debrief/playbook types
- Markdown storage
- Basic search
- Git integration

**Phase 2: Add Semantics** (Enhanced)
- Vector embeddings
- Similarity search
- Auto-suggested links
- Graph traversal

**Phase 3: Multi-Agent** (Scaled)
- Shared knowledge base
- Cross-agent learning
- Conflict resolution
- Trend analysis

**Phase 4: Emergence** (Future)
- Agent-proposed work
- Self-organizing structure
- Automated maintenance
- Collective intelligence

---

## Appendix: Practical Examples

### Example 1: Auth Refactor Session

**Brief:**
```markdown
# Brief: Refactor Auth to Alpine.js

## Objective
Replace imperative DOM manipulation with Alpine state management

## Requirements
- User login/logout works
- Session persistence across page loads
- Token refresh on expiry

## Context
Current auth code is 400 lines of manual DOM updates. Hard to debug.
```

**Work:**
- Agent refactors code
- Replaces 400 lines with 80 lines of Alpine
- Tests pass
- Deploy succeeds

**Debrief (written by agent):**
```markdown
# Debrief: Auth Alpine Refactor

## What Worked
- Alpine's `x-data` eliminated manual state tracking
- `x-show` replaced 50+ lines of visibility toggling
- Token refresh using Alpine's `$watch` (reactive, not polling)

## What Failed
- Tried storing token in Alpine state → broke on page reload
- Switched to localStorage, Alpine reads on mount → works

## Lessons Learned
- Alpine is perfect for UI state, NOT persistence
- Alpine + localStorage is the right pattern
- 400 lines → 80 lines = 5x reduction, easier to debug

## Files Changed
- src/auth/login.ts (refactored)
- src/auth/session.ts (simplified)

## Test Results
All 23 auth tests pass
```

**Playbook Update (agent writes unprompted):**
````markdown
# Alpine.js Playbook

## State Management Patterns

### UI State: Use Alpine
```html
<div x-data="{ isLoggedIn: false, user: null }">
  <!-- login() must be defined on this component; see the persistent-state pattern below -->
  <button @click="login()">Login</button>
  <div x-show="isLoggedIn">Welcome, <span x-text="user?.name"></span></div>
</div>
```

### Persistent State: Use localStorage + Alpine
```javascript
// Register the component before Alpine starts (e.g. inside an `alpine:init` listener)
Alpine.data('auth', () => ({
  token: null,
  isLoggedIn: false,

  // Mount: Read from localStorage
  init() {
    this.token = localStorage.getItem('token');
    this.isLoggedIn = !!this.token;
  },

  // Update: Write to localStorage (a component method, so `this` is the component)
  login(token) {
    localStorage.setItem('token', token);
    this.token = token;
    this.isLoggedIn = true;
  }
}));
```

**Provenance:** debrief-auth-alpine-refactor (2026-01-05)
````

**Result:**
- Brief defined task
- Debrief captured learnings
- Playbook codified pattern
- Future auth work references playbook
- Next agent doesn't repeat the token-persistence mistake
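
The "read on mount, write on update" pattern from this playbook can also be expressed without any framework. The sketch below is a hedged TypeScript illustration: `KeyValueStore`, `AuthState`, and `createAuthState` are hypothetical names, and the storage interface is just the subset of the Web Storage API that the pattern needs, so `window.localStorage` or an in-memory stand-in can be passed in.

```typescript
// Minimal subset of the Web Storage API that this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical UI-facing auth state.
interface AuthState {
  token: string | null;
  isLoggedIn: boolean;
  login(token: string): void;
  logout(): void;
}

const TOKEN_KEY = "token";

// Build UI state from whatever the store already holds ("read on mount"),
// and write every change through to the store ("write on update").
function createAuthState(store: KeyValueStore): AuthState {
  const initial = store.getItem(TOKEN_KEY);
  return {
    token: initial,
    isLoggedIn: Boolean(initial),
    login(token: string): void {
      store.setItem(TOKEN_KEY, token);
      this.token = token;
      this.isLoggedIn = true;
    },
    logout(): void {
      store.removeItem(TOKEN_KEY);
      this.token = null;
      this.isLoggedIn = false;
    },
  };
}
```

Creating the state a second time from the same store models a page reload: the token survives because it lives in the store, not in the UI state.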

### Example 2: Safari Layout Bug

**Brief:**
```markdown
# Brief: Fix Explorer Layout on Safari

## Objective
Fix the sidebar + graph layout, which is broken on Safari (works in Chrome/Firefox)

## Requirements
- Sidebar 300px wide, fixed left
- Graph fills remaining space
- Responsive (mobile collapses sidebar)

## Context
User report: Safari shows overlapping panels
```

**Work:**
- Agent investigates Safari rendering
- Discovers CSS Grid stacking context bug
- Switches to Flexbox
- Verifies on Safari

**Debrief:**
```markdown
# Debrief: Safari Layout Fix

## What Worked
- Flexbox layout: `display: flex` for container
- `flex: 0 0 300px` for sidebar, `flex: 1` for graph
- Works Safari/Chrome/Firefox/Edge

## What Failed
- CSS Grid: `grid-template-columns: 300px 1fr`
- Safari bug: z-index doesn't work in grid context
- Searched for fixes, found Safari 15.x known issue

## Lessons Learned
- **Safari != Chrome** - test early, not at end
- CSS Grid has better semantics, but Safari has bugs
- Flexbox is more reliable for layout (trade elegance for compatibility)
- Check caniuse.com for Safari quirks

## Browsers Tested
- ✅ Safari 17.2
- ✅ Chrome 120
- ✅ Firefox 121

## Decision
Using Flexbox for all layouts going forward until Safari Grid bugs are fixed.
```

**Playbook Update:**
````markdown
# Cross-Browser CSS Playbook

## Principles
1. **Test Safari EARLY** - rendering differs significantly
2. **Trade elegance for reliability** - avoiding Safari bugs matters more than clean code
3. **Check caniuse.com** - don't assume feature parity

## Layout Patterns

### ❌ CSS Grid (Safari Issues)
Safari 15-17 has z-index bugs in grid context. Avoid for layouts with overlays.

### ✅ Flexbox (Reliable)
Works consistently across browsers:
```css
.container {
  display: flex;
}
.sidebar {
  flex: 0 0 300px; /* fixed 300px */
}
.main {
  flex: 1; /* fill remaining */
}
```

## Decision Records

**DR-042: Why Flexbox over Grid**
Grid has better semantics (2D layout primitives), but Safari 15-17
has stacking context bugs that break z-index. Flexbox trades elegance
for reliability. Revisit when Safari 18+ adoption > 90%.

**Provenance:** debrief-safari-layout-fix (2026-01-05)
````

**Result:**
- Future layout work reads playbook
- Agents choose Flexbox, avoid Grid (Safari)
- Decision rationale is documented (not tribal knowledge)
- When Safari fixes the bug, the DR's revisit condition triggers a review

### Example 3: Vector Search Performance

**Brief:**
```markdown
# Brief: Add Semantic Search to Explorer

## Objective
Search nodes by concept, not just exact name match

## Requirements
- Input box in sidebar
- Live results as user types
- Highlight matching nodes in graph
- < 100ms response time

## Context
Users want "find authentication stuff" not "find node named 'auth.ts'"
```

**Work:**
- Agent integrates VectorEngine
- Adds search input with Alpine
- Implements highlighting
- Tests performance

**Debrief:**
```markdown
# Debrief: Vector Search Implementation

## What Worked
- Reused existing VectorEngine (no new code!)
- Alpine + `@input` event (reactive search)
- Sigma's `setNodeAttribute` for highlighting
- Debounced search input (300ms delay)

## What Failed
- **Attempt 1:** No debouncing → laggy typing (search on every keystroke)
- **Attempt 2:** 100ms debounce → still laggy on slow machines
- **Attempt 3:** 300ms debounce → smooth, feels instant

## Lessons Learned
- **Debouncing is CRITICAL** for live search
- 300ms is sweet spot (feels instant, but batches queries)
- Never search on raw `@input` - always debounce
- VectorEngine already fast (<10ms), network overhead is the issue

## Performance
- Vector search: 8ms avg
- UI render: 12ms avg
- Total with 300ms debounce: Feels instant

## Files Changed
- public/explorer/search.js (new)
- public/explorer/index.html (added input)
```

**Playbook Update:**
````markdown
# Performance Playbook

## Search Input Patterns

### ❌ Raw Input (No Debouncing)
```javascript
// DON'T: Fires on every keystroke
input.addEventListener('input', search);
```
**Problem:** Laggy typing, wasted queries

### ✅ Debounced Input
```javascript
// DO: Batch queries
let timeout;
input.addEventListener('input', (e) => {
  clearTimeout(timeout);
  timeout = setTimeout(() => search(e.target.value), 300);
});
```
**Result:** Smooth typing, fewer queries

### Debounce Timing
- **100ms:** Still feels laggy on slow machines
- **300ms:** ✅ Sweet spot (feels instant, batches queries)
- **500ms:** Noticeable delay

**Provenance:** debrief-vector-search-implementation (2026-01-06)

## Animation Performance

### Rule: Performance Degrades Non-Linearly
- 100 nodes: animations smooth
- 500 nodes: animations slightly janky
- 1000+ nodes: animations unusable

**Lesson:** Disable animations above threshold, don't try to optimize.

**Provenance:** debrief-graph-animation-performance (2025-12-15)
````

**Result:**
- Future agents know to debounce search inputs
- 300ms is documented as best practice
- Performance thresholds are explicit
- Cross-referenced to original debrief
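
The inline debounce from this playbook can be factored into a reusable helper so that every search input shares one implementation. This is a minimal TypeScript sketch: the `debounce` name and signature are assumptions for illustration, and the 300ms value in the usage note is simply the timing the playbook recommends.

```typescript
// Wrap `fn` so that rapid successive calls collapse into a single trailing
// call, made `waitMs` milliseconds after the most recent invocation.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A): void => {
    // A newer call cancels the pending one, so only the latest arguments run.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}
```

A search box would then wire it up as `const debouncedSearch = debounce(search, 300)` and call `debouncedSearch(value)` from the input handler, where `search` is whatever function runs the query.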

---

## References

- **PolyVis Project:** https://github.com/pjsvis/polyvis
- **Brief-Debrief-Playbook Pattern:** Emerged from PolyVis development (2025)
- **Amalfa:** This project (MCP server for agent memory)
- **Related Concepts:**
  - Learning Organizations (Peter Senge)
  - After-Action Reviews (US Army)
  - Retrospectives (Agile)
  - Decision Records (ADRs)

---

**Status:** Vision document, not specification
**Next Steps:** Design Amalfa schema to support this workflow
**Feedback:** Iterate based on implementation experience

---

_This document captures learnings from PolyVis and charts a path for Amalfa. The goal: make agent-generated knowledge the default, not the exception._