universal-agent-memory 3.1.0 → 4.0.0

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "universal-agent-memory",
3
- "version": "3.1.0",
3
+ "version": "4.0.0",
4
4
  "description": "Universal AI agent memory system - CLAUDE.md templates, memory, worktrees for Claude Code, Factory.AI, VSCode, OpenCode",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -1,5 +1,5 @@
1
1
  <!--
2
- CLAUDE.md Universal Template - v10.19-opt
2
+ CLAUDE.md Universal Template - v11.0-slim
3
3
 
4
4
  Core Variables:
5
5
  {{PROJECT_NAME}}, {{DESCRIPTION}}, {{DEFAULT_BRANCH}}, {{STRUCTURE_DATE}}
@@ -44,42 +44,13 @@
44
44
 
45
45
  ---
46
46
 
47
- ## CODE FIELD - COGNITIVE ENVIRONMENT
47
+ ## CODE PRINCIPLES
48
48
 
49
- **Apply to ALL code generation. Creates conditions where better code emerges naturally.**
50
-
51
- ### Core Inhibitions
52
-
53
- ```
54
- Do not write code before stating assumptions.
55
- Do not claim correctness you haven't verified.
56
- Do not handle only the happy path.
57
- Under what conditions does this work?
58
- ```
59
-
60
- ### Before Writing Code
61
-
62
- - What are you assuming about the input?
63
- - What are you assuming about the environment?
64
- - What would break this?
65
- - What would a malicious caller do?
66
-
67
- ### Do Not
68
-
69
- - Write code before stating assumptions
70
- - Claim correctness you haven't verified
71
- - Handle the happy path and gesture at the rest
72
- - Import complexity you don't need
73
- - Solve problems you weren't asked to solve
74
- - Produce code you wouldn't want to debug at 3am
75
-
76
- ### Expected Output Format
77
-
78
- **Before code**: Assumptions stated explicitly, scope bounded
79
- **In code**: Smaller than expected, edge cases handled or explicitly rejected
80
- **After code**: "What this handles" and "What this does NOT handle" sections
81
-
82
- *Attribution: Based on [context-field research](https://github.com/NeoVertex1/context-field)*
49
+ - State assumptions before writing code
50
+ - Verify correctness -- do not claim it
51
+ - Handle error paths, not just the happy path
52
+ - Do not import complexity you do not need
53
+ - Produce code you would want to debug at 3am
83
54
 
84
55
  ---
85
56
 
@@ -98,8 +69,6 @@ Under what conditions does this work?
98
69
  | kubectl with secrets | `ops-approved-operations.yml` |
99
70
  | One-time secret operation | `ops-create-ephemeral.yml` (self-destructs after run) |
100
71
 
101
- **Local commands without secrets** (read-only, public resources) are allowed for testing.
102
-
103
72
  ### Two-Phase Infrastructure Workflow
104
73
 
105
74
  ```
@@ -107,7 +76,6 @@ PHASE 1: LOCAL PROOF (ALLOWED - NO SECRETS)
107
76
  - kubectl get/describe/logs (read-only operations)
108
77
  - terraform plan (uses GitHub pipeline for secrets)
109
78
  - Direct cloud console changes for rapid prototyping
110
- - Manual commands to verify behavior (public resources)
111
79
  - SECRETS REQUIRED? -> Use pipeline, not local commands
112
80
 
113
81
  PHASE 2: IaC PARITY (MANDATORY - VIA PIPELINE)
@@ -119,639 +87,104 @@ PHASE 2: IaC PARITY (MANDATORY - VIA PIPELINE)
119
87
  - RULE: Work is NOT complete until IaC matches live state
120
88
  ```
121
89
 
122
- ### Core Principle
123
-
124
- ```
125
- Local testing proves the solution. IaC ensures reproducibility.
126
- Manual changes are TEMPORARY. IaC changes are PERMANENT.
127
- If it's not in IaC, it doesn't exist (will be destroyed/lost).
128
- Secrets live in GitHub - use pipelines for secret-dependent operations.
129
- ```
130
-
131
90
  ### Approved Pipelines
132
91
 
133
- | Task | Pipeline | Trigger | Notes |
134
- |------|----------|---------|-------|
135
- | Kubernetes operations | `ops-approved-operations.yml` | Manual dispatch | Has cluster secrets |
136
- | Ephemeral environments | `ops-create-ephemeral.yml` | Manual dispatch | Self-destructs after run |
137
- | Terraform changes | `iac-terraform-cicd.yml` | PR to {{DEFAULT_BRANCH}} | Has TF secrets |
138
- | Ephemeral Terraform | `ops-ephemeral-terraform.yml` | Manual dispatch | One-time TF operations |
139
-
140
- ### What This Means for Agents
141
-
142
- **PHASE 1 - Local Testing (ALLOWED for non-secret operations):**
143
- - Run read-only commands: `kubectl get`, `kubectl describe`, `kubectl logs`
144
- - Run `terraform plan` via pipeline (needs secrets)
145
- - Make cloud console changes to prototype
146
- - Use ephemeral pipelines for secret-dependent testing
147
-
148
- **PHASE 2 - IaC Parity (MANDATORY - always via pipeline):**
149
- - ALL manual changes MUST be translated to IaC (Terraform/K8s YAML)
150
- - IaC MUST be committed to version control
151
- - Deployment MUST go through CI/CD pipeline (has secrets)
152
- - Final state MUST match IaC exactly (verify with diff)
153
- - Manual/ephemeral resources MUST be cleaned up
154
-
155
- **NEVER:**
156
- - Run `terraform apply` locally (no secrets available)
157
- - Run `kubectl apply` with secret-dependent resources locally
158
- - Create secrets via `kubectl create secret` (use Sealed Secrets)
159
- - Hardcode or expose secrets in code/logs
92
+ | Task | Pipeline | Trigger |
93
+ |------|----------|---------|
94
+ | Kubernetes operations | `ops-approved-operations.yml` | Manual dispatch |
95
+ | Ephemeral environments | `ops-create-ephemeral.yml` | Manual dispatch |
96
+ | Terraform changes | `iac-terraform-cicd.yml` | PR to {{DEFAULT_BRANCH}} |
97
+ | Ephemeral Terraform | `ops-ephemeral-terraform.yml` | Manual dispatch |
160
98
  {{/if}}
161
99
 
162
100
  ---
163
101
 
164
- ## UNIVERSAL AGENT PATTERNS
165
-
166
- **Forty-two patterns proven to improve LLM agent reliability. Apply to ALL task execution.**
167
-
168
- ### Pattern 1: Environment Isolation
169
- Code installed during execution may NOT persist across phases (agent -> test -> deploy).
170
- - Check what dependencies are pre-installed before using them
171
- - Prefer pre-installed tools over installing new ones
172
- - Design for minimal dependency footprint
173
-
174
- ### Pattern 2: Recipe Following
175
- LLMs excel at explicit step-by-step instructions but struggle with ambiguous tasks.
176
- - Convert complex tasks into numbered sequential commands
177
- - Each step should be atomic and verifiable
178
- - Provide exact commands, not descriptions
179
-
180
- ### Pattern 3: Pre-execution State Protection
181
- LLMs do NOT reliably follow negative instructions ("do not delete X").
182
- - Backup critical files BEFORE starting work
183
- - Never rely on "do not" for critical constraints
184
- - Protect state proactively, not reactively
102
+ ## SESSION START
185
103
 
186
104
  ```bash
187
- cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.bak
188
- sqlite3 db.sqlite ".backup /tmp/db_backup.sqlite"
105
+ uam task ready
106
+ sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM memories ORDER BY id DESC LIMIT 10;"
107
+ sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM session_memories WHERE session_id='current' ORDER BY id DESC LIMIT 5;"
108
+ uam agent status
189
109
  ```
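As a rough illustration of what the two `sqlite3` queries in the session-start block return, here is a minimal Python sketch against an in-memory database. The table layouts are assumptions: the columns are taken from the `INSERT` statements that appear in this template, plus an `id` key implied by `ORDER BY id`. The schema that `uam` actually creates may differ. The helper name `load_session_context` is hypothetical, and it selects only `content` rather than `*` for brevity.

```python
import sqlite3

# Assumed schema: columns come from the template's INSERT statements,
# plus an integer id implied by "ORDER BY id". The real uam schema may differ.
SCHEMA = """
CREATE TABLE memories (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT, type TEXT, content TEXT
);
CREATE TABLE session_memories (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT, timestamp TEXT, type TEXT, content TEXT, importance INTEGER
);
"""

def load_session_context(conn, session_id="current"):
    """Hypothetical helper mirroring the two session-start SELECTs."""
    recent = conn.execute(
        "SELECT content FROM memories ORDER BY id DESC LIMIT 10"
    ).fetchall()
    session = conn.execute(
        "SELECT content FROM session_memories "
        "WHERE session_id = ? ORDER BY id DESC LIMIT 5",
        (session_id,),
    ).fetchall()
    return [row[0] for row in recent], [row[0] for row in session]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Seed more rows than the LIMIT clauses allow, to show the truncation.
for i in range(12):
    conn.execute(
        "INSERT INTO memories (timestamp, type, content) "
        "VALUES (datetime('now'), 'action', ?)",
        (f"action {i}",),
    )
for i in range(7):
    conn.execute(
        "INSERT INTO session_memories "
        "(session_id, timestamp, type, content, importance) "
        "VALUES ('current', datetime('now'), 'decision', ?, 7)",
        (f"decision {i}",),
    )

recent, session = load_session_context(conn)
print(recent[0], len(recent))    # action 11 10
print(session[0], len(session))  # decision 6 5
```

Newest rows come first in both result lists, so an agent reading them sees the latest actions and decisions before older ones.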
190
110
 
191
- ### Pattern 4: Tool Specification
192
- Agents succeed more when told WHICH tool to use, not just WHAT to accomplish.
193
- - Specify exact tool and flags when possible
194
- - Reduce search space of possible approaches
195
-
196
- ### Pattern 5: Recognizing Impossible Tasks
197
- Some tasks require pre-computed solutions (compressed algorithms, lookup tables, trained models).
198
- - Identify tasks requiring: compression below limits, pre-trained weights, exhaustive search
199
- - These cannot be solved by reasoning alone
200
- - Detect early to avoid wasted computation
201
-
202
- ### Pattern 6: Hierarchical Prompting (Recency Bias)
203
- LLMs weight information at the END of context more heavily.
204
- - Structure: capabilities -> context -> CRITICAL instructions
205
- - Put most important constraints at the END
206
- - Repeat critical instructions if essential
207
-
208
- ### Pattern 7: Task Classification
209
- Different task types require different strategies. See also P40 (Adaptive Reasoning Depth).
210
-
211
- | Category | Strategy |
212
- |----------|----------|
213
- | State-sensitive | Pre-backup critical files |
214
- | Recipe-following | Step-by-step commands |
215
- | Tool-dependent | Specify exact tool + flags |
216
- | Research/exploration | Parallel searches |
217
- | Complex/ambiguous | P38 Branching Exploration |
218
- | Multi-file change | P41 Atomic Task Loop |
219
-
220
- ### Pattern 8: CLI over Libraries
221
- When environment dependencies are uncertain, prefer subprocess + CLI over library imports.
222
- - CLI tools more likely pre-installed
223
- - Better backward compatibility
224
- - Easier to verify: `which tool`
225
-
226
- ### Pattern 9: Format Translation Pipeline
227
- When task involves non-text data (images, audio, video, binary), convert to standard format first.
228
- - Identify input format and required output
229
- - Find converter tool: image->text, audio->transcript, binary->structured
230
- - Convert to standard intermediate format (FEN, JSON, CSV, etc.)
231
-
232
- ### Pattern 10: Whitelist-First Sanitization
233
- For security filtering (XSS, SQL injection, command injection), ALWAYS use allow-list approach.
234
- - NEVER use blacklist/regex to filter malicious patterns
235
- - Use established security libraries (bleach, DOMPurify, parameterized queries)
236
-
237
- ### Pattern 11: Pre-Computed Solution Recognition
238
- Some tasks have known solutions that cannot be derived through reasoning alone.
239
- - Recognize tasks matching known solved problems (chess engines, crypto tables, ML models)
240
- - Search GitHub/papers FIRST before attempting to solve
241
- - Download and adapt existing solutions rather than reinventing
242
-
243
- ### Pattern 12: Output Existence Verification (OEV)
244
- **CRITICAL: 37% of task failures are "output file not created"**
245
-
246
- BEFORE completing ANY task:
247
- 1. **Parse expected outputs** from task description (files, paths, artifacts)
248
- 2. **Verify each output EXISTS**: `ls -la /expected/path`
249
- 3. **If missing: CREATE IT** - don't explain, don't ask, just create
250
- 4. **Re-verify** before reporting completion
251
-
252
- ### Pattern 13: Iterative Refinement Loop (IRL)
253
- When tests are available and some pass but others fail:
254
- 1. Run tests after implementation
255
- 2. If partial success: read failing test, fix specific issue, re-run
256
- 3. Repeat until 100% pass OR timeout approaching
257
- 4. Reserve 20% of time budget for iteration
258
-
259
- ### Pattern 14: Output Format Validation (OFV)
260
- When task specifies exact output format:
261
- 1. Extract format requirements from task description
262
- 2. Validate output matches spec before completion
263
- 3. If validation fails: fix output, don't re-explain
264
-
265
- ### Pattern 15: Exception Recovery (ER)
266
- When code throws exceptions:
267
- 1. Read the stack trace
268
- 2. Fix the specific error (ImportError -> install/use stdlib, FileNotFoundError -> create file)
269
- 3. Re-run and verify
270
- 4. Don't give up after first exception
271
-
272
- ### Pattern 16: Task-First Execution (TFE)
273
- **CRITICAL: Prevents regression where agent analyzes but forgets to create outputs**
274
-
275
- BEFORE any analysis or exploration:
276
- 1. **Parse task for REQUIRED OUTPUTS** (files, artifacts, states)
277
- 2. **Create MINIMAL WORKING versions immediately** (touch files, create stubs)
278
- 3. **THEN refine to full implementation**
279
- 4. **Never finish a turn without outputs existing**
280
-
281
- ### Pattern 17: Constraint Extraction (CE)
282
- Agent misses specific constraints buried in task descriptions.
283
-
284
- BEFORE implementing:
285
- 1. **Parse for constraint keywords**: "exactly", "only", "single", "must be", "no more than"
286
- 2. **Extract format constraints**: file types, line counts, size limits, response formats
287
- 3. **Create constraint checklist**
288
- 4. **Validate EACH constraint before completion**
289
-
290
- ### Pattern 18: Multi-Tool Pipeline (MTP)
291
- Complex tasks need multiple tools chained together.
292
-
293
- 1. **Decompose task into stages** (input -> transform -> output)
294
- 2. **Identify tool for EACH stage** (don't implement from scratch if tool exists)
295
- 3. **Verify tools available**: `which tool || pip install package`
296
- 4. **Chain in pipeline**
297
-
298
- ### Pattern 19: Enhanced Impossible Task Refusal (ITR+)
299
- Pattern 5 identifies impossible tasks but agent still attempts them.
300
-
301
- **Impossibility markers -> REFUSE IMMEDIATELY:**
302
- - Compression below Shannon limit (e.g., "GPT-2 in 5KB" - weights are 500MB+)
303
- - Pre-trained weights required but none provided
304
- - Exhaustive search in exponential space
305
- - Undecidable problems
306
-
307
- ### Pattern 20: Adversarial Thinking (AT)
308
- Security bypass/attack tasks require "attacker mindset".
309
-
310
- 1. **Detect**: Keywords "bypass", "break", "exploit", "circumvent", "evade"
311
- 2. **Flip perspective**: Not "how to use correctly" but "how to break it"
312
- 3. **Enumerate attack vectors** (case variation, encoding, null bytes, context breaking)
313
- 4. **Test each vector until one works**
314
-
315
- ---
316
-
317
- ## DOMAIN-SPECIFIC PATTERNS
318
-
319
- **Task-specific guidance for common failure categories.**
320
-
321
- ### Pattern 21: Chess Engine Integration (CEI)
322
- Chess tasks require external engines - reasoning alone cannot find optimal moves.
323
-
324
- **Detection**: Keywords "chess", "best move", "FEN", "PGN", "checkmate"
325
-
326
- **Protocol**:
327
- 1. Check for Stockfish: `which stockfish || apt-get install -y stockfish`
328
- 2. Parse board state to FEN format
329
- 3. Use Stockfish for analysis
330
- 4. Extract best move from engine output
331
-
332
- **NEVER**: Try to calculate chess moves through reasoning - use the engine.
333
-
334
- ### Pattern 22: Git Recovery Forensics (GRF)
335
- Git recovery tasks require forensic approach, not standard git commands.
336
-
337
- **Detection**: Keywords "recover", "corrupted", "lost commit", "fix git", "reflog"
338
-
339
- **Protocol**:
340
- 1. **BACKUP FIRST**: `cp -r .git .git.bak`
341
- 2. Check integrity: `git fsck --full --no-dangling`
342
- 3. Check reflog: `git reflog` (commits survive even after reset)
343
- 4. Check loose objects: `find .git/objects -type f`
344
- 5. Recover from pack files if needed
345
-
346
- **NEVER**: Run destructive commands without backup. Use `--dry-run` first.
347
-
348
- ### Pattern 23: Compression Impossibility Detection (CID)
349
- Some compression tasks are mathematically impossible.
350
-
351
- **Detection**: Keywords "compress", "codegolf", "under X bytes", "minimal size"
352
-
353
- **Impossibility Markers**:
354
- - ML model weights (GPT-2 = 500MB+, cannot be <1MB without losing function)
355
- - Random/encrypted data (incompressible by definition)
356
- - Asking for compression ratio beyond information-theoretic limits
357
-
358
- **NEVER**: Attempt to compress ML weights to <1% of original size.
359
-
360
- ### Pattern 24: Polyglot Code Construction (PCC)
361
- Polyglot tasks require specific compiler/interpreter quirks.
362
-
363
- **Detection**: Keywords "polyglot", "works in both", "compile as X and Y"
364
-
365
- **Protocol**: Search for existing polyglot examples before implementing.
366
-
367
- ### Pattern 25: Service Configuration Pipeline (SCP)
368
- Multi-service configuration requires ordered setup.
369
-
370
- **Detection**: Keywords "configure", "server", "webserver", "service", "daemon"
371
-
372
- **Protocol**:
373
- 1. **Identify all services** needed
374
- 2. **Check service status**: `systemctl status <service>`
375
- 3. **Configure in dependency order** (base -> dependent)
376
- 4. **Test each service** before moving to next
377
- 5. **Verify end-to-end** after all configured
378
-
379
- ### Pattern 26: Near-Miss Iteration (NMI)
380
- When tests show >50% passing, focus on specific failing tests.
381
-
382
- **Detection**: Test results show partial success (e.g., 8/9, 6/7, 5/6)
383
-
384
- **Protocol**:
385
- 1. Run tests with verbose output
386
- 2. Extract ONLY failing test names
387
- 3. Read failing test code to understand exact requirement
388
- 4. Fix specific issue without breaking passing tests
389
- 5. Re-run ONLY failing tests first
390
- 6. Then run full suite to verify no regressions
391
-
392
- **Reserve 30% of time budget for near-miss iteration.**
393
-
394
- ### Pattern 27: Output Directory Cleanup (ODC)
395
- Tests often check for ONLY specific files in output directories.
396
-
397
- **Detection**: Tasks mentioning "single file", "only", constraints on output directory contents
398
-
399
- **Protocol**:
400
- 1. **Before completing**, list output directory
401
- 2. **Remove non-required files**: compiled binaries, temp files, backups
402
- 3. **Keep ONLY the required outputs** as specified in task
403
-
404
- ### Pattern 28: Service Smoke Test (SST)
405
- Services must be tested BEFORE claiming completion.
406
-
407
- **Detection**: Tasks involving servers, daemons, APIs, web services
408
-
409
- **Protocol**:
410
- 1. Start the service
411
- 2. **Immediately test it** with curl/wget/client
412
- 3. If test fails: debug, fix, restart, re-test
413
- 4. Only complete when test PASSES
414
-
415
- **NEVER complete without a successful smoke test.**
416
-
417
- ### Pattern 29: Multi-Solution Discovery (MSD)
418
- Some tasks require finding ALL valid solutions, not just one.
419
-
420
- **Detection**: Keywords "all moves", "both solutions", "list all", "find every"
421
-
422
- ### Pattern 30: Performance Threshold Tuning (PTT)
423
- Tasks with numeric thresholds require iterative tuning.
424
-
425
- **Detection**: Keywords "win rate", "accuracy", "percentage", "threshold", "at least X%"
426
-
427
- ### Pattern 31: Round-Trip Verification (RTV)
428
- For transform/encode/compress tasks, verify the reverse operation.
429
-
430
- **Detection**: Keywords "compress", "encode", "serialize", "encrypt", and task mentions reverse operation.
431
-
432
- **Protocol**:
433
- 1. Create test data
434
- 2. Apply forward transform (compress)
435
- 3. **Immediately apply reverse** (decompress)
436
- 4. **Verify original == result**
437
- 5. Fix if not matching
438
-
439
- ### Pattern 32: CLI Execution Verification (CEV)
440
- When creating executable CLI tools, verify execution method matches tests.
441
-
442
- **Detection**: Tasks requiring executable scripts, CLI tools, command-line interfaces
443
-
444
- **Protocol**:
445
- 1. Add proper shebang: `#!/usr/bin/env python3`
446
- 2. Make executable: `chmod +x <script>`
447
- 3. **Test EXACTLY as verifier will run it**: `./tool args` not `python3 tool args`
448
- 4. Verify output format matches expected format
449
-
450
- **Common mistake**: Script works with `python3 script.py` but fails with `./script.py` (missing shebang/chmod)
451
-
452
- ### Pattern 33: Numerical Stability Testing (NST)
453
- Numerical algorithms require robustness against edge cases.
454
-
455
- **Detection**: Statistical sampling, numerical optimization, floating-point computation
456
-
457
- **Protocol**:
458
- 1. Test with multiple random seeds (3+ iterations, not just one)
459
- 2. Test domain boundaries explicitly (0, near-zero, infinity)
460
- 3. Use adaptive step sizes for derivative computation
461
- 4. Add tolerance margins for floating-point comparisons (1e-6 typical)
462
- 5. Handle edge cases: empty input, single element, maximum values
463
-
464
- ### Pattern 34: Image-to-Structured Pipeline (ISP)
465
- Visual data requires dedicated recognition tools, not reasoning.
466
-
467
- **Detection**: Tasks involving image analysis, diagram parsing, visual data extraction
468
-
469
- **Protocol**:
470
- 1. **NEVER rely on visual reasoning alone** - accuracy is unreliable
471
- 2. Search for existing recognition libraries
472
- 3. Verify extracted structured data before using
473
- 4. If no tools available, clearly state the limitation
474
-
475
- ### Pattern 35: Decoder-First Analysis (DFA)
476
- For encode/compress tasks with provided decoder, analyze decoder FIRST.
477
-
478
- **Detection**: Task provides a decoder/decompressor and asks to create encoder/compressor
479
-
480
- **Protocol**:
481
- 1. **Read and understand the provided decoder** before writing encoder
482
- 2. Identify expected input format from decoder source
483
- 3. Create minimal test case matching decoder's expected format
484
- 4. Test round-trip with decoder BEFORE optimizing for size
485
- 5. If decoder crashes, your format is wrong - don't optimize further
486
-
487
- ### Pattern 36: Competition Domain Research (CDR)
488
- Competitive tasks benefit from researching domain-specific winning strategies.
489
-
490
- **Detection**: Keywords "win rate", "beat", "competition", "versus", "tournament"
491
-
492
- **Protocol**:
493
- 1. **Research domain strategies BEFORE implementing**
494
- 2. Time-box implementation iterations: stop at 70% time budget
495
- 3. Track progress per iteration to identify improvement trajectory
496
- 4. If not meeting threshold, document best achieved + gap
111
+ **On work request**: `uam task create --title "..." --type task|bug|feature`
497
112
 
498
113
  ---
499
114
 
500
- ## ADVANCED REASONING PATTERNS
501
-
502
- **Six patterns derived from state-of-the-art LLM optimization research (2025-2026). Address reasoning depth, self-verification, branching exploration, feedback grounding, and task atomization.**
503
-
504
- ### Pattern 37: Pre-Implementation Verification (PIV)
505
- **CRITICAL: Prevents wrong-approach waste — the #1 cause of wasted compute.**
506
-
507
- After planning but BEFORE writing any code, explicitly verify your approach:
508
-
509
- **Detection**: Any implementation task (always active for non-trivial changes)
115
+ ## DECISION LOOP
510
116
 
511
- **Protocol**:
512
117
  ```
513
- === PRE-IMPLEMENTATION VERIFY ===
514
- 1. ROOT CAUSE: Does this approach address the actual root cause, not a symptom?
515
- 2. EXISTING TESTS: Will this break any existing passing tests?
516
- 3. SIMPLER PATH: Is there a simpler approach I'm overlooking?
517
- 4. ASSUMPTIONS: What am I assuming about the codebase that I haven't verified?
518
- 5. SIDE EFFECTS: What else does this change affect?
519
- === VERIFIED: [proceed/revise] ===
118
+ 1. CLASSIFY -> complexity? backup needed? tools?
119
+ 2. PROTECT -> cp file file.bak (for configs, DBs, critical files)
120
+ 3. MEMORY -> query relevant context + past failures
121
+ 4. AGENTS -> check overlaps (if multi-agent)
122
+ 5. SKILLS -> check {{SKILLS_PATH}} for domain-specific guidance
123
+ 6. WORK -> implement (use worktree for 3+ file changes)
124
+ 7. REVIEW -> self-review diff before testing
125
+ 8. TEST -> completion gates pass
126
+ 9. LEARN -> store outcome in memory
520
127
  ```
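The nine steps above can be read as an ordered pipeline. The Python sketch below is only an illustration of that ordering: the step names and the "worktree for 3+ file changes" rule come from the block, while the handler functions, the `run_decision_loop` name, and the choice to stop at the first failing step are assumptions that the template does not state.

```python
from typing import Callable, Dict, List

# Step names copied from the DECISION LOOP block; everything else is illustrative.
STEPS: List[str] = [
    "CLASSIFY", "PROTECT", "MEMORY", "AGENTS", "SKILLS",
    "WORK", "REVIEW", "TEST", "LEARN",
]

def needs_worktree(changed_files: List[str]) -> bool:
    """Step 6 rule from the template: use a worktree for 3+ file changes."""
    return len(changed_files) >= 3

def run_decision_loop(
    handlers: Dict[str, Callable[[dict], bool]], task: dict
) -> List[str]:
    """Run handlers in template order; stop at the first one that returns False.

    Steps without a handler are skipped (for example AGENTS, which the
    template marks as applying only to multi-agent setups).
    """
    completed: List[str] = []
    for step in STEPS:
        handler = handlers.get(step)
        if handler is None:
            continue
        if not handler(task):
            break
        completed.append(step)
    return completed

def work(task: dict) -> bool:
    # Record whether this change set is large enough to warrant a worktree.
    task["worktree"] = needs_worktree(task["files"])
    return True

task = {"files": ["a.py", "b.py", "c.py"]}
handlers = {
    "CLASSIFY": lambda t: True,
    "WORK": work,
    "REVIEW": lambda t: True,
    "TEST": lambda t: False,  # simulate a failing completion gate
    "LEARN": lambda t: True,
}

done = run_decision_loop(handlers, task)
print(done)              # ['CLASSIFY', 'WORK', 'REVIEW']
print(task["worktree"])  # True
```

In this sketch a failed TEST step prevents LEARN from running; a real agent might instead want to record the failure, which would mean running LEARN unconditionally.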
521
128
 
522
- **If ANY answer raises doubt**: STOP. Re-read the problem. Revise approach before coding.
523
-
524
- *Research basis: CoT verification (+4.3% accuracy), Reflexion framework (+18.5%), SEER adaptive reasoning (+4-9%)*
525
-
526
- ### Pattern 38: Branching Exploration (BE)
527
- For complex or ambiguous problems, explore multiple approaches before committing.
528
-
529
- **Detection**: Problem has multiple valid approaches, ambiguous requirements, or high complexity
530
-
531
- **Protocol**:
532
- 1. **Generate 2-3 candidate approaches** (brief description, not full implementation)
533
- 2. **Evaluate each** against: simplicity, correctness likelihood, test-compatibility, side-effect risk
534
- 3. **Select best** with explicit reasoning
535
- 4. **Commit fully** to selected approach — no mid-implementation switching
536
- 5. **If selected approach fails**: backtrack to step 1, eliminate failed approach, try next
537
-
538
- **NEVER**: Start coding the first approach that comes to mind for complex problems.
539
- **ALWAYS**: Spend 5% of effort exploring alternatives to save 50% on wrong-path recovery.
540
-
541
- *Research basis: MCTS-guided code generation (RethinkMCTS: 70%→89% pass@1), Policy-Guided Tree Search*
542
-
543
- ### Pattern 39: Execution Feedback Grounding (EFG)
544
- Learn from test failures systematically — don't just fix, understand and remember.
545
-
546
- **Detection**: Any test failure or runtime error during implementation
547
-
548
- **Protocol**:
549
- 1. **Categorize the failure** using the Failure Taxonomy (see below)
550
- 2. **Identify root cause** (not just the symptom the error message shows)
551
- 3. **Fix with explanation**: What was wrong, why, and what the fix addresses
552
- 4. **Store structured feedback** in memory:
553
- ```bash
554
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'failure_analysis','type:<category>|cause:<root_cause>|fix:<what_fixed>|file:<filename>');"
555
- ```
556
- 5. **Query before similar tasks**: Before implementing, check memory for past failures in same area
557
-
558
- **Failure Taxonomy** (use for categorization):
559
- | Type | Description | Recovery Strategy |
560
- |------|-------------|-------------------|
561
- | `dependency_missing` | Import/module not found | Install or use stdlib alternative |
562
- | `wrong_approach` | Fundamentally incorrect solution | P38 Branching - try different approach |
563
- | `format_mismatch` | Output doesn't match expected format | P14 OFV - re-read spec carefully |
564
- | `edge_case` | Works for happy path, fails on edge | Add boundary checks, test with extremes |
565
- | `state_mutation` | Unexpected side effect on shared state | Isolate mutations, use copies |
566
- | `concurrency` | Race condition or timing issue | Add locks, use sequential fallback |
567
- | `timeout` | Exceeded time/resource limit | Optimize algorithm, reduce scope |
568
- | `environment` | Works locally, fails in target env | P1 Environment Isolation checks |
569
-
570
- *Research basis: RLEF/RLVR (RL from Execution Feedback), verifiable rewards for coding agents*
571
-
572
- ### Pattern 40: Adaptive Reasoning Depth (ARD)
573
- Match reasoning effort to task complexity — don't over-think simple tasks or under-think hard ones.
574
-
575
- **Detection**: Applied automatically at Pattern Router stage
576
-
577
- **Complexity Classification**:
578
- | Complexity | Indicators | Reasoning Protocol |
579
- |-----------|------------|-------------------|
580
- | **Simple** | Single file, clear spec, known pattern, <20 lines | Direct implementation. No exploration phase. |
581
- | **Moderate** | Multi-file, some ambiguity, 20-200 lines | Plan-then-implement. State assumptions. P37 verify. |
582
- | **Complex** | Cross-cutting concerns, ambiguous spec, >200 lines, unfamiliar domain | P38 explore → P37 verify → implement → P39 feedback loop. |
583
- | **Research** | Unknown solution space, no clear approach | Research first (web search, codebase analysis) → P38 explore → implement iteratively. |
584
-
585
- **Rule**: Never apply Complex-level reasoning to Simple tasks (wastes tokens). Never apply Simple-level reasoning to Complex tasks (causes failures).
586
-
587
- *Research basis: SEER adaptive CoT, test-time compute scaling (2-3x gains from adaptive depth)*
588
-
589
- ### Pattern 41: Atomic Task Loop (ATL)
590
- For multi-step changes, decompose into atomic units with clean boundaries.
591
-
592
- **Detection**: Task involves changes to 3+ files, or multiple independent concerns
593
-
594
- **Protocol**:
595
- 1. **Decompose** the task into atomic sub-tasks (each independently testable)
596
- 2. **Order** by dependency (upstream changes first)
597
- 3. **For each sub-task**:
598
- a. Implement the change (single concern only)
599
- b. Run relevant tests
600
- c. Commit if tests pass
601
- d. If context is getting long/confused, note progress and continue fresh
602
- 4. **Final verification**: Run full test suite after all sub-tasks complete
603
-
604
- **Atomicity rules**:
605
- - Each sub-task modifies ideally 1-2 files
606
- - Each sub-task has a clear pass/fail criterion
607
- - Sub-tasks should not depend on uncommitted work from other sub-tasks
608
- - If a sub-task fails, only that sub-task needs rework
609
-
610
- *Research basis: Addy Osmani's continuous coding loop, context drift prevention research*
611
-
612
- ### Pattern 42: Critic-Before-Commit (CBC)
613
- Review your own diff against requirements before running tests.
129
+ ---
614
130
 
615
- **Detection**: Any implementation about to be tested or committed
131
+ ## MEMORY SYSTEM
616
132
 
617
- **Protocol**:
618
133
  ```
619
- === SELF-REVIEW ===
620
- Diff summary: [what changed, in which files]
621
-
622
- REQUIREMENT CHECK:
623
- ☐ Does the diff address ALL requirements from the task?
624
- ☐ Are there any unintended changes (debug prints, commented code, temp files)?
625
- ☐ Does the code handle the error/edge cases mentioned in the spec?
626
- ☐ Is the code consistent with surrounding style and conventions?
627
- ☐ Would this diff make sense to a reviewer with no context?
628
-
629
- ISSUES FOUND: [list or "none"]
630
- === END REVIEW ===
134
+ L1 Working | SQLite memories | {{SHORT_TERM_LIMIT}} max | <1ms
135
+ L2 Session | SQLite session_mem | current session | <5ms
136
+ L3 Semantic | {{LONG_TERM_BACKEND}}| search | ~50ms
137
+ L4 Knowledge| SQLite entities/rels | graph | <20ms
631
138
  ```
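To make the L1 and L2 rows of the table concrete, the following Python sketch writes to both tiers in an in-memory SQLite database. The column lists follow the `INSERT` statements used elsewhere in this template; the `id` column, the helper names, and the pruning query are assumptions. In particular, the diff does not show how the `{{SHORT_TERM_LIMIT}}` cap is enforced, so deleting the oldest rows is just one plausible reading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema; the real uam tables may carry more columns or constraints.
conn.executescript("""
CREATE TABLE memories (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT, type TEXT, content TEXT
);
CREATE TABLE session_memories (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT, timestamp TEXT, type TEXT, content TEXT, importance INTEGER
);
""")

def remember_action(conn, content, limit):
    """L1 write: insert an action, then keep only the newest `limit` rows.

    The pruning step is a hypothetical way to honor a short-term cap.
    """
    conn.execute(
        "INSERT INTO memories (timestamp, type, content) "
        "VALUES (datetime('now'), 'action', ?)",
        (content,),
    )
    conn.execute(
        "DELETE FROM memories WHERE id NOT IN "
        "(SELECT id FROM memories ORDER BY id DESC LIMIT ?)",
        (limit,),
    )

def remember_decision(conn, content, importance=7, session_id="current"):
    """L2 write: store a decision for the current session."""
    conn.execute(
        "INSERT INTO session_memories "
        "(session_id, timestamp, type, content, importance) "
        "VALUES (?, datetime('now'), 'decision', ?, ?)",
        (session_id, content, importance),
    )

for i in range(5):
    remember_action(conn, f"action {i}", limit=3)
remember_decision(conn, "use worktree for refactor")

kept = [row[0] for row in conn.execute("SELECT content FROM memories ORDER BY id")]
decisions = conn.execute(
    "SELECT session_id, type, importance FROM session_memories"
).fetchall()
print(kept)       # ['action 2', 'action 3', 'action 4']
print(decisions)  # [('current', 'decision', 7)]
```

Parameter binding (`?`) is used here instead of string interpolation so that memory content containing quotes cannot break the statement.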
632
139
 
633
- **If issues found**: Fix BEFORE running tests. Cheaper to catch logic errors by reading than by test-debug cycles.
634
-
635
- *Research basis: Multi-agent reflection (actor+critic, +20% accuracy), RL^V unified reasoner-verifier*
636
-
637
- ---
638
-
639
- ## CONTEXT OPTIMIZATION
640
-
641
- **Reduce token waste and improve response quality through intelligent context management.**
642
-
643
- ### Progressive Context Disclosure
644
- Not all patterns are needed for every task. The Pattern Router activates only relevant patterns.
645
- - **Always loaded**: Pattern Router, Completion Gates, Error Recovery
646
- - **Loaded on activation**: Only patterns flagged YES by router
647
- - **Summarize, don't repeat**: When referencing prior work, summarize in 1-2 lines, don't paste full output
648
-
649
- ### Context Hygiene
650
- - **Prune completed context**: After a sub-task completes, don't carry its full debug output forward
651
- - **Compress tool output**: Quote only the 2-3 lines that inform the next decision
652
- - **Avoid context poisoning**: Don't include failed approaches in context unless actively debugging them
653
- - **Reset on drift**: If responses become unfocused or repetitive, summarize progress and continue with clean context
654
-
655
- ### Token Budget Awareness
656
- | Task Type | Target Context Usage | Strategy |
657
- |-----------|---------------------|----------|
658
- | Simple fix | <10% of window | Direct implementation, minimal exploration |
659
- | Feature implementation | 30-50% of window | Structured exploration, then focused implementation |
660
- | Complex debugging | 50-70% of window | Deep investigation justified, but prune between attempts |
661
- | Research/exploration | 20-40% of window | Broad search first, then narrow and deep |
662
-
663
- ---
664
-
665
- ## SELF-IMPROVEMENT PROTOCOL
666
-
667
- **The agent improves its own effectiveness over time by learning from outcomes.**
668
-
669
- ### After Task Completion (Success or Failure)
670
- 1. **Record outcome** with structured metadata:
671
- ```bash
672
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'outcome','task:<summary>|result:<pass/fail>|patterns_used:<list>|time_spent:<estimate>|failure_type:<category_or_none>',8);"
673
- ```
674
-
675
- 2. **If failure occurred**: Store in semantic memory for cross-session learning:
676
- ```bash
677
- {{MEMORY_STORE_CMD}} lesson "Failed on <task_type>: <what_went_wrong>. Fix: <what_worked>." --tags failure,<category>,<language> --importance 8
678
- ```
679
-
680
- 3. **If novel technique discovered**: Store as reusable pattern:
681
- ```bash
682
- {{MEMORY_STORE_CMD}} lesson "New technique for <domain>: <technique_description>. Use when <conditions>." --tags technique,<domain> --importance 9
683
- ```
140
+ ### Commands
684
141
 
685
- ### Before Starting Similar Tasks
686
- Query memory for relevant past outcomes:
687
142
  ```bash
688
- sqlite3 ./{{MEMORY_DB_PATH}} "SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<relevant_keyword>%' ORDER BY timestamp DESC LIMIT 5;"
689
- ```
690
-
691
- ### Repo-Specific Learning
692
- Over time, accumulate repository-specific patterns:
693
- - Which test frameworks and assertions this repo uses
694
- - Common failure modes in this codebase
695
- - Preferred code style and naming conventions
696
- - Architecture decisions and their rationale
697
-
698
- Store these as high-importance semantic memories tagged with the repo name.
699
-
700
- ---
143
+ # L1: Working Memory
144
+ sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'action','...');"
701
145
 
702
- ## CODE QUALITY HEURISTICS
146
+ # L2: Session Memory
147
+ sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'decision','...',7);"
703
148
 
704
- **Apply to ALL generated code. Verify before committing.**
149
+ # L3: Semantic Memory
150
+ {{MEMORY_STORE_CMD}} lesson "..." --tags t1,t2 --importance 8
705
151
 
706
- ### Pre-Commit Code Review Checklist
707
- - [ ] Functions <= 30 lines (split if longer)
708
- - [ ] No God objects or functions doing multiple unrelated things
709
- - [ ] Names are self-documenting (no single-letter variables outside loops)
710
- - [ ] Error paths handled explicitly (not just happy path)
711
- - [ ] No debug prints, console.logs, or commented-out code left behind
712
- - [ ] Consistent with surrounding code style (indentation, naming, patterns)
713
- - [ ] No hardcoded values that should be constants or config
714
- - [ ] Imports are minimal — only what's actually used
152
+ # L4: Knowledge Graph
153
+ sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO entities (type,name,first_seen,last_seen,mention_count) VALUES ('file','x.ts',datetime('now'),datetime('now'),1);"
154
+ ```
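The shell commands above splice the memory text directly into the SQL string, so content containing a single quote would break the statement. A minimal Python sketch of the same L1 insert using bound parameters follows. The `memories` schema, the in-memory database, and the helper name `record_working_memory` are assumptions for illustration; only the column names are taken from the command above.

```python
import sqlite3


def record_working_memory(conn: sqlite3.Connection, kind: str, content: str) -> int:
    """Insert one working-memory row and return its row id.

    The values are bound as parameters, so quotes in `content` are stored
    verbatim instead of terminating the SQL string.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO memories (timestamp, type, content) "
            "VALUES (datetime('now'), ?, ?)",
            (kind, content),
        )
    return cur.lastrowid


# Hypothetical schema: the real table may have more columns or constraints.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories ("
    "id INTEGER PRIMARY KEY, timestamp TEXT, type TEXT, content TEXT)"
)

row_id = record_working_memory(conn, "action", "renamed user's config loader")
stored = conn.execute(
    "SELECT type, content FROM memories WHERE id = ?", (row_id,)
).fetchone()
```

The same binding approach would apply to the `session_memories` and `entities` inserts, assuming those tables accept the columns listed in the commands.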
715
155
 
716
- ### Code Smell Detection
717
- If you notice any of these, fix before committing:
718
- - **Duplicated logic** → Extract to shared function
719
- - **Deep nesting (>3 levels)** → Early returns, extract helper
720
- - **Boolean parameters** → Consider separate methods or options object
721
- - **Magic numbers** → Named constants
722
- - **Catch-all error handling** → Specific error types with appropriate responses
156
+ Decay: `effective_importance = importance * (0.95 ^ days_since_access)`
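As a rough illustration of the decay formula, the sketch below evaluates it in Python. The function name and the `decay_rate` parameter are hypothetical; only the formula itself comes from the template.

```python
def effective_importance(
    importance: float,
    days_since_access: float,
    decay_rate: float = 0.95,
) -> float:
    """Return importance * (decay_rate ** days_since_access)."""
    if days_since_access < 0:
        raise ValueError("days_since_access must be non-negative")
    return importance * decay_rate ** days_since_access


fresh = effective_importance(8, 0)    # no decay on the day of access
stale = effective_importance(8, 14)   # a little under half after two weeks
```

With a rate of 0.95 per day, the half-life is about 13.5 days, so a memory that has not been accessed for two weeks carries slightly less than half of its original importance.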
723
157
 
724
158
  ---
725
159
 
726
- ## SESSION START PROTOCOL
160
+ ## WORKTREE WORKFLOW
727
161
 
728
- **EXECUTE IMMEDIATELY before any response:**
162
+ | Change Scope | Workflow |
163
+ |-------------|----------|
164
+ | Single-file fix (<20 lines) | Direct commit to feature branch |
165
+ | Multi-file change (2-5 files) | Worktree recommended |
166
+ | Feature/refactor (3+ files) | Worktree required |
729
167
 
730
168
  ```bash
731
- uam task ready # Check existing work
732
- sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM memories ORDER BY id DESC LIMIT 10;"
733
- sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM session_memories WHERE session_id='current' ORDER BY id DESC LIMIT 5;"
734
- uam agent status # Check other active agents
169
+ {{WORKTREE_CREATE_CMD}} <slug> # Create
170
+ cd {{WORKTREE_DIR}}/NNN-<slug>/
171
+ git add -A && git commit -m "type: description"
172
+ {{WORKTREE_PR_CMD}} <id> # PR
173
+ {{WORKTREE_CLEANUP_CMD}} <id> # Cleanup after merge
735
174
  ```
736
175
 
737
- **On work request**: `uam task create --title "..." --type task|bug|feature`
176
+ **Applies to**: {{WORKTREE_APPLIES_TO}}
738
177
 
739
178
  ---
740
179
 
741
- ## MULTI-AGENT COORDINATION PROTOCOL
742
-
743
- **Skip this section for single-agent sessions.** Only activate when multiple agents work concurrently (e.g., parallel subagents via Task tool, or multiple Claude Code sessions on same repo).
180
+ ## MULTI-AGENT COORDINATION
744
181
 
745
- **Parallel-first rule**: When safe, run independent tool calls in parallel (searches, reads, status checks) and invoke multiple subagents concurrently for review.
746
-
747
- ### Before Claiming Any Work (multi-agent only)
182
+ **Skip for single-agent sessions.** Only activate when multiple agents work concurrently.
748
183
 
749
184
  ```bash
750
185
  uam agent overlaps --resource "<files-or-directories>"
751
186
  ```
752
187
 
753
- ### Overlap Response Matrix
754
-
755
188
  | Risk Level | Action |
756
189
  |------------|--------|
757
190
  | `none` | Proceed immediately |
@@ -759,15 +192,14 @@ uam agent overlaps --resource "<files-or-directories>"
759
192
  | `medium` | Announce, coordinate sections |
760
193
  | `high`/`critical` | Wait or split work |
761
194
 
762
- ### Agent Capability Routing
195
+ ### Agent Routing
763
196
 
764
- | Task Type | Route To | Capabilities |
765
- |-----------|----------|--------------|
766
- | Security review | `security-auditor` | owasp, secrets, injection |
767
- | Performance | `performance-optimizer` | algorithms, memory, caching |
768
- | Documentation | `documentation-expert` | jsdoc, readme, api-docs |
769
- | Code quality | `code-quality-guardian` | complexity, naming, solid |
770
- | Solution verification | self (P42 CBC) | diff review, requirement check |
197
+ | Task Type | Route To |
198
+ |-----------|----------|
199
+ | Security review | `security-auditor` |
200
+ | Performance | `performance-optimizer` |
201
+ | Documentation | `documentation-expert` |
202
+ | Code quality | `code-quality-guardian` |
771
203
 
772
204
  {{#if LANGUAGE_DROIDS}}
773
205
  ### Language Droids
@@ -783,121 +215,13 @@ uam agent overlaps --resource "<files-or-directories>"
783
215
  {{{MCP_PLUGINS}}}
784
216
  {{/if}}
785
217
 
786
- ---
787
-
788
- ## MULTI-AGENT EXECUTION (DEPENDENCY-AWARE)
789
-
790
- **Skip for single-agent sessions.** When using parallel subagents:
791
- 1. **Decompose** into discrete work items. **Map dependencies** (A blocks B).
792
- 2. **Parallelize** dependency-free items with separate agents and explicit file boundaries.
793
- 3. **Gate edits** with `uam agent overlaps --resource "<files>"` before touching any file.
794
- 4. **Merge in dependency order** (upstream first).
795
-
796
- ---
797
-
798
- ## TOKEN EFFICIENCY RULES
799
-
800
- - Prefer concise, high-signal responses; avoid repeating instructions or large logs.
801
- - Summarize command output; quote only the lines needed for decisions.
802
- - Use parallel tool calls to reduce back-and-forth.
803
- - Ask for clarification only when necessary to proceed correctly.
804
-
805
- ---
806
-
807
- ## DECISION LOOP
808
-
809
- ```
810
- 0. CLASSIFY -> complexity? backup? tool? steps? (P40 Adaptive Depth)
811
- 1. PROTECT -> cp file file.bak
812
- 2. MEMORY -> query relevant context + past failures (P39)
813
- 3. EXPLORE -> if complex: generate 2-3 approaches (P38)
814
- 4. VERIFY -> pre-implementation check (P37)
815
- 5. AGENTS -> check overlaps
816
- 6. SKILLS -> check {{SKILLS_PATH}}
817
- 7. WORKTREE -> create, work (P41 atomic tasks)
818
- 8. REVIEW -> self-review diff (P42)
819
- 9. TEST -> gates pass
820
- 10. LEARN -> store outcome in memory (P39)
821
- ```
822
-
823
- ---
824
-
825
- ## MEMORY SYSTEM
826
-
827
- ```
828
- L1 Working | SQLite memories | {{SHORT_TERM_LIMIT}} max | <1ms
829
- L2 Session | SQLite session_mem | current | <5ms
830
- L3 Semantic | {{LONG_TERM_BACKEND}}| search | ~50ms
831
- L4 Knowledge| SQLite entities/rels | graph | <20ms
832
- ```
833
-
834
- ### Layer Selection
835
-
836
- | Question | YES -> Layer |
837
- |----------|-------------|
838
- | Just did this (last few minutes)? | L1: Working |
839
- | Session-specific decision/context? | L2: Session |
840
- | Reusable learning for future? | L3: Semantic |
841
- | Entity relationships? | L4: Knowledge Graph |
842
-
843
- ### Memory Commands
844
-
845
- ```bash
846
- # L1: Working Memory
847
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'action','...');"
848
-
849
- # L2: Session Memory
850
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'decision','...',7);"
851
-
852
- # L3: Semantic Memory
853
- {{MEMORY_STORE_CMD}} lesson "..." --tags t1,t2 --importance 8
854
-
855
- # L4: Knowledge Graph
856
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO entities (type,name,first_seen,last_seen,mention_count) VALUES ('file','x.ts',datetime('now'),datetime('now'),1);"
857
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO relationships (source_id,target_id,relation,timestamp) VALUES (1,2,'depends_on',datetime('now'));"
858
- ```
859
-
860
- ### Consolidation Rules
861
-
862
- - **Trigger**: Every 10 working memory entries
863
- - **Action**: Summarize -> session_memories, Extract lessons -> semantic memory
864
- - **Dedup**: Skip if content_hash exists OR similarity > 0.92
865
-
866
- ### Decay Formula
867
-
868
- ```
869
- effective_importance = importance * (0.95 ^ days_since_access)
870
- ```
871
-
872
- ---
873
-
874
- ## WORKTREE WORKFLOW
875
-
876
- **Use worktrees for multi-file features/refactors. Skip for single-file fixes.**
877
-
878
- | Change Scope | Workflow |
879
- |-------------|----------|
880
- | Single-file fix (<20 lines) | Direct commit to feature branch, no worktree needed |
881
- | Multi-file change (2-5 files) | Worktree recommended if touching shared interfaces |
882
- | Feature/refactor (3+ files, new feature) | Worktree required |
883
- | CLAUDE.md or config changes | Worktree required |
884
-
885
- ```bash
886
- # Create (when needed)
887
- {{WORKTREE_CREATE_CMD}} <slug>
888
- cd {{WORKTREE_DIR}}/NNN-<slug>/
889
-
890
- # Work
891
- git add -A && git commit -m "type: description"
218
+ ### Parallel Execution
892
219
 
893
- # PR (runs tests, triggers parallel reviewers)
894
- {{WORKTREE_PR_CMD}} <id>
895
-
896
- # Cleanup (ALWAYS cleanup after merge)
897
- {{WORKTREE_CLEANUP_CMD}} <id>
898
- ```
899
-
900
- **Applies to**: {{WORKTREE_APPLIES_TO}}
220
+ When safe, run independent tool calls in parallel. When using parallel subagents:
221
+ 1. Decompose into discrete work items. Map dependencies.
222
+ 2. Parallelize dependency-free items with separate agents and explicit file boundaries.
223
+ 3. Gate edits with `uam agent overlaps` before touching any file.
224
+ 4. Merge in dependency order (upstream first).
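The decomposition and merge-order steps above can be sketched with a topological sort. This is one possible approach, not something UAM is known to ship: the helper name `plan_batches` and the example work items are hypothetical, and the sketch assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter


def plan_batches(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group work items into batches that can run in parallel.

    `depends_on` maps each item to the items that must be merged first.
    Items in the same batch have no unmet dependencies on each other;
    batches are returned upstream first.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical work items: "api" needs "schema", "ui" needs "api".
batches = plan_batches({"api": {"schema"}, "ui": {"api"}, "docs": set()})
```

Each inner list is a set of dependency-free items that separate agents could take in parallel, and merging the batches in order keeps upstream changes first.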
901
225
 
902
226
  ---
903
227
 
@@ -906,15 +230,12 @@ git add -A && git commit -m "type: description"
906
230
  **Before ANY commit/PR, invoke quality droids in PARALLEL:**
907
231
 
908
232
  ```bash
909
- # These run concurrently - do NOT wait between calls
910
233
  Task(subagent_type: "code-quality-guardian", prompt: "Review: <files>")
911
234
  Task(subagent_type: "security-auditor", prompt: "Audit: <files>")
912
235
  Task(subagent_type: "performance-optimizer", prompt: "Analyze: <files>")
913
236
  Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
914
237
  ```
915
238
 
916
- ### Review Priority
917
-
918
239
  | Droid | Blocks PR | Fix Before Merge |
919
240
  |-------|-----------|------------------|
920
241
  | security-auditor | CRITICAL/HIGH | Always |
@@ -924,6 +245,19 @@ Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
924
245
 
925
246
  ---
926
247
 
248
+ ## CODE QUALITY
249
+
250
+ ### Pre-Commit Checklist
251
+ - Functions <= 30 lines
252
+ - Self-documenting names
253
+ - Error paths handled explicitly
254
+ - No debug prints or commented-out code left behind
255
+ - Consistent with surrounding code style
256
+ - No hardcoded values that should be constants
257
+ - Imports are minimal
258
+
259
+ ---
260
+
927
261
  ## AUTOMATIC TRIGGERS
928
262
 
929
263
  | Pattern | Action |
@@ -931,75 +265,25 @@ Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
931
265
  | work request (fix/add/change/update/create/implement/build) | `uam task create --type task` |
932
266
  | bug report/error | `uam task create --type bug` |
933
267
  | feature request | `uam task create --type feature` |
934
- | single-file fix | direct commit to branch, skip worktree |
935
- | multi-file feature (3+ files) | create worktree, then work |
268
+ | single-file fix | direct commit to branch |
269
+ | multi-file feature (3+ files) | create worktree |
936
270
  | review/check/look | query memory first |
937
271
  | ANY code change | tests required |
938
272
 
939
- **Agent coordination**: Only use `uam agent` commands when multiple agents are active concurrently. For single-agent sessions (most common), skip agent registration and overlap checks.
940
-
941
273
  ---
942
274
 
943
- ## UAM VISUAL STATUS FEEDBACK (MANDATORY WHEN UAM IS ACTIVE)
944
-
945
- **When UAM tools are in use, ALWAYS use the built-in status display commands to provide visual feedback on progress and underlying numbers. Do NOT silently perform operations -- show the user what is happening.**
946
-
947
- ### After Task Operations
948
- After creating, updating, closing, or claiming tasks, run:
949
- ```bash
950
- uam dashboard progress # Show completion %, status bars, velocity
951
- uam task stats # Show priority/type breakdown with charts
952
- ```
953
-
954
- ### After Memory Operations
955
- After storing, querying, or prepopulating memory, run:
956
- ```bash
957
- uam memory status # Show memory layer health, capacity gauges, service status
958
- uam dashboard memory # Show detailed memory dashboard with architecture tree
959
- ```
275
+ ## UAM VISUAL STATUS FEEDBACK
960
276
 
961
- ### After Agent/Coordination Operations
962
- After registering agents, checking overlaps, or claiming resources, run:
963
- ```bash
964
- uam dashboard agents # Show agent status table, resource claims, active work
965
- ```
277
+ **When UAM tools are in use, show visual feedback:**
966
278
 
967
- ### Periodic Overview
968
- At session start and after completing major work items, run:
969
279
  ```bash
970
- uam dashboard overview # Full overview: task progress, agent status, memory health
280
+ uam dashboard overview # Full overview at session start
281
+ uam dashboard progress # After task operations
282
+ uam task stats # After task state changes
283
+ uam memory status # After memory operations
284
+ uam dashboard agents # After agent/coordination operations
971
285
  ```
972
286
 
973
- ### Display Function Reference
974
-
975
- UAM provides these visual output functions (from `src/cli/visualize.ts`):
976
-
977
- | Function | Purpose | When to Use |
978
- |----------|---------|-------------|
979
- | `progressBar` | Completion bar with % and count | Task/test progress |
980
- | `stackedBar` + `stackedBarLegend` | Multi-segment status bar | Status distribution |
981
- | `horizontalBarChart` | Labeled bar chart | Priority/type breakdowns |
982
- | `miniGauge` | Compact colored gauge | Capacity/utilization |
983
- | `sparkline` | Inline trend line | Historical data trends |
984
- | `table` | Formatted data table | Task/agent listings |
985
- | `tree` | Hierarchical tree view | Memory layers, task hierarchy |
986
- | `box` | Bordered summary box | Section summaries |
987
- | `statusBadge` | Colored status labels | Agent/service status |
988
- | `keyValue` | Aligned key-value pairs | Metadata display |
989
- | `inlineProgressSummary` | Compact progress bar with counts | After task mutations |
990
- | `trend` | Up/down arrow with delta | Before/after comparisons |
991
- | `heatmapRow` | Color-coded cell row | Activity density |
992
- | `bulletList` | Status-colored bullet list | Health checks |
993
-
994
- ### Rules
995
-
996
- 1. **Never silently complete a UAM operation** -- always follow up with the relevant dashboard/status command
997
- 2. **Show numbers, not just success messages** -- the user needs to see counts, percentages, and trends
998
- 3. **Use `uam dashboard overview`** at session start to establish baseline awareness
999
- 4. **Use `uam task stats`** after any task state change to show the impact
1000
- 5. **Use `uam memory status`** after any memory write to confirm storage and show capacity
1001
- 6. **Prefer dashboard commands over raw SQLite queries** for status checks -- they provide formatted visual output
1002
-
1003
287
  ---
1004
288
 
1005
289
  {{#if HAS_PROJECT_MD}}
@@ -1096,40 +380,24 @@ UAM provides these visual output functions (from `src/cli/visualize.ts`):
1096
380
 
1097
381
  ## COMPLETION GATES - MANDATORY
1098
382
 
1099
- **CANNOT say "done" or "complete" until ALL gates pass. This is NOT optional.**
383
+ **CANNOT say "done" until ALL gates pass.**
1100
384
 
1101
- ### GATE 1: Output Existence Check
385
+ ### GATE 1: Output Existence
1102
386
  ```bash
1103
- echo "=== GATE 1: OUTPUT EXISTENCE ==="
1104
387
  for f in $EXPECTED_OUTPUTS; do
1105
- [ -f "$f" ] && echo " $f exists" || echo "MISSING: $f"
388
+ [ -f "$f" ] && echo "ok $f" || echo "MISSING: $f"
1106
389
  done
1107
390
  ```
391
+ If missing: CREATE IT immediately.
1108
392
 
1109
- ### GATE 2: Constraint Compliance Check
1110
- ```
1111
- Extract ALL constraints from task description:
1112
- - File count constraint (e.g., "single file")
1113
- - Line count constraint (e.g., "exactly one line")
1114
- - Format constraint (e.g., "valid JSON")
1115
- - Size constraint (e.g., "under 5KB")
1116
-
1117
- Verify EACH constraint is satisfied before proceeding.
1118
- ```
393
+ ### GATE 2: Constraint Compliance
394
+ Extract ALL constraints from task ("exactly", "only", "single", "must be", "no more than"). Verify EACH.
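One way to make the extraction step concrete is a keyword scan over the task text. The sketch below is a simple heuristic under stated assumptions: the marker list is copied from the gate description, while the function name and the sentence-splitting rule are illustrative choices. It will miss constraints phrased without these words.

```python
import re

# Marker phrases listed in the gate description.
CONSTRAINT_MARKERS = ("exactly", "only", "single", "must be", "no more than")

_MARKER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(m) for m in CONSTRAINT_MARKERS) + r")\b",
    re.IGNORECASE,
)


def extract_constraints(task: str) -> list[str]:
    """Return the sentences of `task` that contain a constraint marker."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", task.strip())
    return [s for s in sentences if _MARKER_RE.search(s)]


task = (
    "Write the result to out.json. The output must be valid JSON. "
    "Use a single file. Keep it readable."
)
constraints = extract_constraints(task)
```

Each returned sentence would then become one checklist entry to verify before reporting the task as done.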
1119
395
 
1120
- ### GATE 3: Test Verification
396
+ ### GATE 3: Tests Pass
1121
397
  ```bash
1122
- echo "=== GATE 3: TEST VERIFICATION ==="
1123
398
  {{TEST_COMMAND}} 2>&1 | tail -30
1124
- # If < 100% pass: iterate (fix specific failure, re-run)
1125
- # Reserve 20% of time for iteration
1126
399
  ```
1127
-
1128
- **HARD STOP RULE:**
1129
- - If Gate 1 fails -> CREATE missing files immediately
1130
- - If Gate 2 fails -> FIX constraint violations immediately
1131
- - If Gate 3 fails -> ITERATE until 100% or timeout
1132
- - NEVER report "complete" with failing gates
400
+ If < 100%: iterate (fix specific failure, re-run). Reserve 20% of time for iteration.
1133
401
 
1134
402
  ---
1135
403
 
@@ -1139,15 +407,15 @@ echo "=== GATE 3: TEST VERIFICATION ==="
1139
407
  ☐ Tests pass
1140
408
  ☐ Lint/typecheck pass
1141
409
  ☐ Worktree used (not {{DEFAULT_BRANCH}})
1142
- ☐ Self-review completed (P42)
1143
- ☐ Memory updated (outcome + lessons from P39)
410
+ ☐ Self-review completed
411
+ ☐ Memory updated
1144
412
  ☐ PR created
1145
413
  ☐ Parallel reviews passed
1146
414
  {{#if HAS_INFRA}}
1147
415
  ☐ IaC parity verified
1148
416
  {{/if}}
1149
417
  ☐ No secrets in code
1150
- ☐ No debug artifacts left (console.logs, commented code, temp files)
418
+ ☐ No debug artifacts left
1151
419
  ```
1152
420
 
1153
421
  ---
@@ -1156,67 +424,12 @@ echo "=== GATE 3: TEST VERIFICATION ==="
1156
424
 
1157
425
  ```
1158
426
  MERGE -> DEPLOY -> MONITOR -> FIX (iterate until 100%)
1159
-
1160
- 1. MERGE: PR approved -> merge to {{DEFAULT_BRANCH}} -> delete branch
1161
- 2. DEPLOY: CI/CD runs -> check status -> confirm live
1162
- 3. MONITOR: Check logs -> verify functionality -> smoke tests
1163
- 4. FIX: New worktree -> fix -> GOTO 1
1164
- 5. COMPLETE: Update memory -> close tasks
1165
427
  ```
1166
428
 
1167
429
  **Never "done" until:** PR merged + deployed + verified working
1168
430
 
1169
431
  ---
1170
432
 
1171
- ## NEAR-MISS ITERATION PROTOCOL (P26 ENFORCED)
1172
-
1173
- When >50% of tests pass but not all:
1174
- 1. **Read exact failure message** for each failing test
1175
- 2. **Fix ONLY the specific failing test** - do not refactor passing code
1176
- 3. **Re-run tests** immediately after each fix
1177
- 4. **Reserve 30% of time budget** for this iteration loop
1178
- 5. **Repeat** until 100% pass or time exhausted
1179
- 6. **Never give up** on a task that is >50% passing - small fixes flip outcomes
1180
-
1181
- ---
1182
-
1183
- ## DECODER-FIRST PROTOCOL (P35 ENFORCED)
1184
-
1185
- When a task provides a decoder, validator, or expected output format:
1186
- 1. **READ the decoder/validator source code FIRST** before writing any implementation
1187
- 2. **Extract the exact format** it expects (headers, encoding, byte order, etc.)
1188
- 3. **Implement encoder/generator** to match that exact format
1189
- 4. **Test round-trip**: `original == decode(encode(original))` BEFORE optimizing
1190
- 5. **Never optimize** until round-trip verification passes
1191
-
1192
- ---
1193
-
1194
- ## ERROR RECOVERY ESCALATION
1195
-
1196
- On any test failure or error:
1197
- 1. **Categorize** using P39 Failure Taxonomy (`dependency_missing`, `wrong_approach`, `format_mismatch`, `edge_case`, `state_mutation`, `concurrency`, `timeout`, `environment`)
1198
- 2. **Read exact error message** - do not guess
1199
- 3. **Query memory** for past failures of this type: `SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<type>%' LIMIT 3;`
1200
- 4. **If same error twice**: change approach completely (P38 Branching), do not retry same fix
1201
- 5. **If dependency missing**: install it (`pip install`, `npm install`, `apt-get`)
1202
- 6. **If permission denied**: use alternative path or `chmod`
1203
- 7. **If timeout**: reduce scope, submit partial solution
1204
- 8. **Store failure** in memory after resolution (P39 protocol)
1205
- 9. **Never repeat a failed command** without modification
1206
-
1207
- ---
1208
-
1209
- ## TIME BUDGET ALLOCATION
1210
-
1211
- For complex tasks (compilation, VM, multi-service):
1212
- - **20%** - Environment assessment (deps, tools, constraints)
1213
- - **50%** - Core implementation
1214
- - **30%** - Iteration, debugging, test fixes
1215
-
1216
- If timeout approaching: submit best partial solution rather than nothing.
1217
-
1218
- ---
1219
-
1220
433
  {{#if PREPOPULATED_KNOWLEDGE}}
1221
434
  ## PROJECT KNOWLEDGE
1222
435
 
@@ -1243,126 +456,11 @@ If timeout approaching: submit best partial solution rather than nothing.
1243
456
 
1244
457
  ---
1245
458
 
1246
- ## CRITICAL REMINDERS (READ LAST)
1247
-
1248
- **These reminders exploit recency bias - place at END for maximum compliance.**
1249
-
1250
- 0. **PATTERN ROUTER (MANDATORY FIRST STEP)**:
1251
- BEFORE ANY work, you MUST print this analysis block:
1252
- ```
1253
- === PATTERN ROUTER ===
1254
- Task: [one-line task summary]
1255
- Classification: [file-creation | constraint-validation | adversarial | pipeline | impossible]
1256
- Complexity: [simple | moderate | complex | research] (P40)
459
+ ## TOKEN EFFICIENCY
1257
460
 
1258
- SELECTED PATTERNS:
1259
- - P12 (OEV): [YES/NO] - Does task require creating files?
1260
- - P17 (CE): [YES/NO] - Does task have "exactly/only/single/must be" constraints?
1261
- - P20 (AT): [YES/NO] - Does task involve "bypass/break/exploit/filter"?
1262
- - P11 (Pre-computed): [YES/NO] - Is this a known algorithm (crypto/chess/ML)?
1263
- - P18 (MTP): [YES/NO] - Does task need multiple tools chained?
1264
- - P37 (PIV): [YES/NO] - Non-trivial implementation needing plan verification?
1265
- - P38 (BE): [YES/NO] - Multiple valid approaches or high ambiguity?
1266
- - P41 (ATL): [YES/NO] - Changes span 3+ files or multiple concerns?
1267
- - P42 (CBC): [YES/NO] - Implementation that will be committed?
1268
-
1269
- REASONING DEPTH (per P40):
1270
- - Simple → Direct implementation
1271
- - Moderate → Plan, verify (P37), implement
1272
- - Complex → Explore (P38), verify (P37), implement, feedback (P39)
1273
- - Research → Search first, then explore (P38), implement iteratively
1274
-
1275
- ACTIVE PATTERNS: [list only YES patterns]
1276
- === END ROUTER ===
1277
- ```
1278
- DO NOT proceed without printing this block first.
1279
-
1280
- 1. **BACKUP FIRST**: Before modifying ANY config, database, or critical file:
1281
- `cp <file> <file>.bak`
1282
-
1283
- 2. **CHECK ENVIRONMENT**: Before using a library, verify it exists:
1284
- `which <tool> || python3 -c "import <lib>" 2>/dev/null || echo "NOT AVAILABLE"`
1285
-
1286
- 3. **USE CLI OVER LIBRARIES**: When in doubt, use subprocess + CLI tools.
1287
-
1288
- 4. **STEP-BY-STEP**: Break complex tasks into numbered commands. Execute one at a time.
1289
-
1290
- 5. **VERIFY EACH STEP**: Check output before proceeding to next step.
1291
-
1292
- 6. **STATE-SENSITIVE OPERATIONS**: If task involves recovery, forensics, or state inspection - backup EVERYTHING first.
1293
-
1294
- 7. **IMPOSSIBLE DETECTION**: If task requires compression magic, ML inference, or exhaustive search - flag as potentially impossible.
1295
-
1296
- 8. **TEST IN ISOLATION**: Dependencies installed during work may not persist. Use pre-installed tools.
1297
-
1298
- 9. **OUTPUT VERIFICATION (MANDATORY)**: Before saying "done" or "complete":
1299
- ```bash
1300
- ls -la $EXPECTED_OUTPUTS 2>&1
1301
- # If ANY file is missing: CREATE IT NOW
1302
- ```
1303
-
1304
- 10. **ITERATE ON PARTIAL SUCCESS**: If tests show partial pass (e.g., 8/9):
1305
- - Read the failing test output
1306
- - Fix the specific failure
1307
- - Re-run tests
1308
- - Repeat until 100% or timeout
1309
-
1310
- 11. **CREATE BEFORE EXPLAIN**: If a required output file doesn't exist:
1311
- - CREATE a minimal working version FIRST
1312
- - Don't explain why it's missing
1313
- - Just make the file exist with valid content
1314
-
1315
- 12. **TASK-FIRST**: Create required output files BEFORE analyzing.
1316
- Touch placeholders immediately, then refine with real content.
1317
-
1318
- 13. **EXTRACT CONSTRAINTS (MANDATORY)**: If P17 is active, you MUST print:
1319
- ```
1320
- === CONSTRAINT CHECKLIST ===
1321
- Extracted from task description:
1322
- ☐ [constraint 1]
1323
- ☐ [constraint 2]
1324
-
1325
- BEFORE completing, verify EACH constraint:
1326
- ☑ constraint 1: [PASS/FAIL]
1327
- ☑ constraint 2: [PASS/FAIL]
1328
- === END CONSTRAINTS ===
1329
- ```
1330
-
1331
- 14. **MULTI-TOOL PIPELINE**: For complex tasks, identify ALL tools needed.
1332
- Don't implement from scratch if existing tools exist.
1333
-
1334
- 15. **REFUSE IMPOSSIBLE**: If task violates information theory (compress 500MB to 5KB),
1335
- REFUSE immediately. Create IMPOSSIBLE.md explaining why. Don't attempt.
1336
-
1337
- 16. **VERIFY BEFORE CODING (P37)**: For moderate+ complexity tasks, print the
1338
- PRE-IMPLEMENTATION VERIFY block. Catch wrong approaches before wasting tokens.
1339
-
1340
- 17. **EXPLORE BEFORE COMMITTING (P38)**: For complex/ambiguous tasks, generate 2-3
1341
- candidate approaches and evaluate before coding. 5% exploration saves 50% rework.
1342
-
1343
- 18. **LEARN FROM FAILURES (P39)**: After ANY test failure, categorize it using the
1344
- Failure Taxonomy and store structured feedback in memory. Query memory before similar tasks.
1345
-
1346
- 19. **REVIEW YOUR OWN DIFF (P42)**: Before running tests, do a self-review of your
1347
- changes against requirements. Catch logic errors by reading, not by test-debug cycles.
1348
-
1349
- 20. **ADVERSARIAL MINDSET (MANDATORY)**: If P20 is active, you MUST:
1350
- ```
1351
- === ADVERSARIAL ANALYSIS ===
1352
- Target: [what are we trying to bypass/break?]
1353
- Defense mechanism: [how does the filter/protection work?]
1354
-
1355
- ATTACK VECTORS TO TRY:
1356
- 1. Case variation
1357
- 2. Encoding
1358
- 3. Null bytes
1359
- 4. Double encoding
1360
- 5. Context breaking
1361
- 6. Event handlers
1362
- 7. [add task-specific vectors]
1363
-
1364
- TEST EACH vector until one works.
1365
- === END ADVERSARIAL ===
1366
- ```
461
+ - Prefer concise, high-signal responses
462
+ - Summarize command output; quote only decision-relevant lines
463
+ - Use parallel tool calls to reduce back-and-forth
464
+ - Check `{{SKILLS_PATH}}` for domain-specific skills before re-inventing approaches
1367
465
 
1368
466
  </coding_guidelines>