universal-agent-memory 3.1.0 → 4.1.0

This diff shows the changes between the two published package versions as they appear in their public registry. It is provided for informational purposes only.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "universal-agent-memory",
3
- "version": "3.1.0",
3
+ "version": "4.1.0",
4
4
  "description": "Universal AI agent memory system - CLAUDE.md templates, memory, worktrees for Claude Code, Factory.AI, VSCode, OpenCode",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -1,5 +1,5 @@
1
1
  <!--
2
- CLAUDE.md Universal Template - v10.19-opt
2
+ CLAUDE.md Universal Template - v11.0-slim
3
3
 
4
4
  Core Variables:
5
5
  {{PROJECT_NAME}}, {{DESCRIPTION}}, {{DEFAULT_BRANCH}}, {{STRUCTURE_DATE}}
@@ -44,42 +44,13 @@
44
44
 
45
45
  ---
46
46
 
47
- ## CODE FIELD - COGNITIVE ENVIRONMENT
47
+ ## CODE PRINCIPLES
48
48
 
49
- **Apply to ALL code generation. Creates conditions where better code emerges naturally.**
50
-
51
- ### Core Inhibitions
52
-
53
- ```
54
- Do not write code before stating assumptions.
55
- Do not claim correctness you haven't verified.
56
- Do not handle only the happy path.
57
- Under what conditions does this work?
58
- ```
59
-
60
- ### Before Writing Code
61
-
62
- - What are you assuming about the input?
63
- - What are you assuming about the environment?
64
- - What would break this?
65
- - What would a malicious caller do?
66
-
67
- ### Do Not
68
-
69
- - Write code before stating assumptions
70
- - Claim correctness you haven't verified
71
- - Handle the happy path and gesture at the rest
72
- - Import complexity you don't need
73
- - Solve problems you weren't asked to solve
74
- - Produce code you wouldn't want to debug at 3am
75
-
76
- ### Expected Output Format
77
-
78
- **Before code**: Assumptions stated explicitly, scope bounded
79
- **In code**: Smaller than expected, edge cases handled or explicitly rejected
80
- **After code**: "What this handles" and "What this does NOT handle" sections
81
-
82
- *Attribution: Based on [context-field research](https://github.com/NeoVertex1/context-field)*
49
+ - State assumptions before writing code
50
+ - Verify correctness -- do not claim it
51
+ - Handle error paths, not just the happy path
52
+ - Do not import complexity you do not need
53
+ - Produce code you would want to debug at 3am
83
54
 
84
55
  ---
85
56
 
@@ -98,8 +69,6 @@ Under what conditions does this work?
98
69
  | kubectl with secrets | `ops-approved-operations.yml` |
99
70
  | One-time secret operation | `ops-create-ephemeral.yml` (self-destructs after run) |
100
71
 
101
- **Local commands without secrets** (read-only, public resources) are allowed for testing.
102
-
103
72
  ### Two-Phase Infrastructure Workflow
104
73
 
105
74
  ```
@@ -107,7 +76,6 @@ PHASE 1: LOCAL PROOF (ALLOWED - NO SECRETS)
107
76
  - kubectl get/describe/logs (read-only operations)
108
77
  - terraform plan (uses GitHub pipeline for secrets)
109
78
  - Direct cloud console changes for rapid prototyping
110
- - Manual commands to verify behavior (public resources)
111
79
  - SECRETS REQUIRED? -> Use pipeline, not local commands
112
80
 
113
81
  PHASE 2: IaC PARITY (MANDATORY - VIA PIPELINE)
@@ -119,639 +87,107 @@ PHASE 2: IaC PARITY (MANDATORY - VIA PIPELINE)
119
87
  - RULE: Work is NOT complete until IaC matches live state
120
88
  ```
121
89
 
122
- ### Core Principle
123
-
124
- ```
125
- Local testing proves the solution. IaC ensures reproducibility.
126
- Manual changes are TEMPORARY. IaC changes are PERMANENT.
127
- If it's not in IaC, it doesn't exist (will be destroyed/lost).
128
- Secrets live in GitHub - use pipelines for secret-dependent operations.
129
- ```
130
-
131
90
  ### Approved Pipelines
132
91
 
133
- | Task | Pipeline | Trigger | Notes |
134
- |------|----------|---------|-------|
135
- | Kubernetes operations | `ops-approved-operations.yml` | Manual dispatch | Has cluster secrets |
136
- | Ephemeral environments | `ops-create-ephemeral.yml` | Manual dispatch | Self-destructs after run |
137
- | Terraform changes | `iac-terraform-cicd.yml` | PR to {{DEFAULT_BRANCH}} | Has TF secrets |
138
- | Ephemeral Terraform | `ops-ephemeral-terraform.yml` | Manual dispatch | One-time TF operations |
139
-
140
- ### What This Means for Agents
141
-
142
- **PHASE 1 - Local Testing (ALLOWED for non-secret operations):**
143
- - Run read-only commands: `kubectl get`, `kubectl describe`, `kubectl logs`
144
- - Run `terraform plan` via pipeline (needs secrets)
145
- - Make cloud console changes to prototype
146
- - Use ephemeral pipelines for secret-dependent testing
147
-
148
- **PHASE 2 - IaC Parity (MANDATORY - always via pipeline):**
149
- - ALL manual changes MUST be translated to IaC (Terraform/K8s YAML)
150
- - IaC MUST be committed to version control
151
- - Deployment MUST go through CI/CD pipeline (has secrets)
152
- - Final state MUST match IaC exactly (verify with diff)
153
- - Manual/ephemeral resources MUST be cleaned up
154
-
155
- **NEVER:**
156
- - Run `terraform apply` locally (no secrets available)
157
- - Run `kubectl apply` with secret-dependent resources locally
158
- - Create secrets via `kubectl create secret` (use Sealed Secrets)
159
- - Hardcode or expose secrets in code/logs
92
+ | Task | Pipeline | Trigger |
93
+ |------|----------|---------|
94
+ | Kubernetes operations | `ops-approved-operations.yml` | Manual dispatch |
95
+ | Ephemeral environments | `ops-create-ephemeral.yml` | Manual dispatch |
96
+ | Terraform changes | `iac-terraform-cicd.yml` | PR to {{DEFAULT_BRANCH}} |
97
+ | Ephemeral Terraform | `ops-ephemeral-terraform.yml` | Manual dispatch |
160
98
  {{/if}}
161
99
 
162
100
  ---
163
101
 
164
- ## UNIVERSAL AGENT PATTERNS
165
-
166
- **Forty-two patterns proven to improve LLM agent reliability. Apply to ALL task execution.**
167
-
168
- ### Pattern 1: Environment Isolation
169
- Code installed during execution may NOT persist across phases (agent -> test -> deploy).
170
- - Check what dependencies are pre-installed before using them
171
- - Prefer pre-installed tools over installing new ones
172
- - Design for minimal dependency footprint
173
-
174
- ### Pattern 2: Recipe Following
175
- LLMs excel at explicit step-by-step instructions but struggle with ambiguous tasks.
176
- - Convert complex tasks into numbered sequential commands
177
- - Each step should be atomic and verifiable
178
- - Provide exact commands, not descriptions
179
-
180
- ### Pattern 3: Pre-execution State Protection
181
- LLMs do NOT reliably follow negative instructions ("do not delete X").
182
- - Backup critical files BEFORE starting work
183
- - Never rely on "do not" for critical constraints
184
- - Protect state proactively, not reactively
102
+ ## SESSION START
185
103
 
186
104
  ```bash
187
- cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.bak
188
- sqlite3 db.sqlite ".backup /tmp/db_backup.sqlite"
105
+ uam task ready
106
+ sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM memories ORDER BY id DESC LIMIT 10;"
107
+ sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM session_memories WHERE session_id='current' ORDER BY id DESC LIMIT 5;"
108
+ uam agent status
189
109
  ```
190
110
 
191
- ### Pattern 4: Tool Specification
192
- Agents succeed more when told WHICH tool to use, not just WHAT to accomplish.
193
- - Specify exact tool and flags when possible
194
- - Reduce search space of possible approaches
195
-
196
- ### Pattern 5: Recognizing Impossible Tasks
197
- Some tasks require pre-computed solutions (compressed algorithms, lookup tables, trained models).
198
- - Identify tasks requiring: compression below limits, pre-trained weights, exhaustive search
199
- - These cannot be solved by reasoning alone
200
- - Detect early to avoid wasted computation
201
-
202
- ### Pattern 6: Hierarchical Prompting (Recency Bias)
203
- LLMs weight information at the END of context more heavily.
204
- - Structure: capabilities -> context -> CRITICAL instructions
205
- - Put most important constraints at the END
206
- - Repeat critical instructions if essential
207
-
208
- ### Pattern 7: Task Classification
209
- Different task types require different strategies. See also P40 (Adaptive Reasoning Depth).
210
-
211
- | Category | Strategy |
212
- |----------|----------|
213
- | State-sensitive | Pre-backup critical files |
214
- | Recipe-following | Step-by-step commands |
215
- | Tool-dependent | Specify exact tool + flags |
216
- | Research/exploration | Parallel searches |
217
- | Complex/ambiguous | P38 Branching Exploration |
218
- | Multi-file change | P41 Atomic Task Loop |
219
-
220
- ### Pattern 8: CLI over Libraries
221
- When environment dependencies are uncertain, prefer subprocess + CLI over library imports.
222
- - CLI tools more likely pre-installed
223
- - Better backward compatibility
224
- - Easier to verify: `which tool`
225
-
226
- ### Pattern 9: Format Translation Pipeline
227
- When task involves non-text data (images, audio, video, binary), convert to standard format first.
228
- - Identify input format and required output
229
- - Find converter tool: image->text, audio->transcript, binary->structured
230
- - Convert to standard intermediate format (FEN, JSON, CSV, etc.)
231
-
232
- ### Pattern 10: Whitelist-First Sanitization
233
- For security filtering (XSS, SQL injection, command injection), ALWAYS use allow-list approach.
234
- - NEVER use blacklist/regex to filter malicious patterns
235
- - Use established security libraries (bleach, DOMPurify, parameterized queries)
236
-
237
- ### Pattern 11: Pre-Computed Solution Recognition
238
- Some tasks have known solutions that cannot be derived through reasoning alone.
239
- - Recognize tasks matching known solved problems (chess engines, crypto tables, ML models)
240
- - Search GitHub/papers FIRST before attempting to solve
241
- - Download and adapt existing solutions rather than reinventing
242
-
243
- ### Pattern 12: Output Existence Verification (OEV)
244
- **CRITICAL: 37% of task failures are "output file not created"**
245
-
246
- BEFORE completing ANY task:
247
- 1. **Parse expected outputs** from task description (files, paths, artifacts)
248
- 2. **Verify each output EXISTS**: `ls -la /expected/path`
249
- 3. **If missing: CREATE IT** - don't explain, don't ask, just create
250
- 4. **Re-verify** before reporting completion
251
-
252
- ### Pattern 13: Iterative Refinement Loop (IRL)
253
- When tests are available and some pass but others fail:
254
- 1. Run tests after implementation
255
- 2. If partial success: read failing test, fix specific issue, re-run
256
- 3. Repeat until 100% pass OR timeout approaching
257
- 4. Reserve 20% of time budget for iteration
258
-
259
- ### Pattern 14: Output Format Validation (OFV)
260
- When task specifies exact output format:
261
- 1. Extract format requirements from task description
262
- 2. Validate output matches spec before completion
263
- 3. If validation fails: fix output, don't re-explain
264
-
265
- ### Pattern 15: Exception Recovery (ER)
266
- When code throws exceptions:
267
- 1. Read the stack trace
268
- 2. Fix the specific error (ImportError -> install/use stdlib, FileNotFoundError -> create file)
269
- 3. Re-run and verify
270
- 4. Don't give up after first exception
271
-
272
- ### Pattern 16: Task-First Execution (TFE)
273
- **CRITICAL: Prevents regression where agent analyzes but forgets to create outputs**
274
-
275
- BEFORE any analysis or exploration:
276
- 1. **Parse task for REQUIRED OUTPUTS** (files, artifacts, states)
277
- 2. **Create MINIMAL WORKING versions immediately** (touch files, create stubs)
278
- 3. **THEN refine to full implementation**
279
- 4. **Never finish a turn without outputs existing**
280
-
281
- ### Pattern 17: Constraint Extraction (CE)
282
- Agent misses specific constraints buried in task descriptions.
283
-
284
- BEFORE implementing:
285
- 1. **Parse for constraint keywords**: "exactly", "only", "single", "must be", "no more than"
286
- 2. **Extract format constraints**: file types, line counts, size limits, response formats
287
- 3. **Create constraint checklist**
288
- 4. **Validate EACH constraint before completion**
289
-
290
- ### Pattern 18: Multi-Tool Pipeline (MTP)
291
- Complex tasks need multiple tools chained together.
292
-
293
- 1. **Decompose task into stages** (input -> transform -> output)
294
- 2. **Identify tool for EACH stage** (don't implement from scratch if tool exists)
295
- 3. **Verify tools available**: `which tool || pip install package`
296
- 4. **Chain in pipeline**
297
-
298
- ### Pattern 19: Enhanced Impossible Task Refusal (ITR+)
299
- Pattern 5 identifies impossible tasks but agent still attempts them.
300
-
301
- **Impossibility markers -> REFUSE IMMEDIATELY:**
302
- - Compression below Shannon limit (e.g., "GPT-2 in 5KB" - weights are 500MB+)
303
- - Pre-trained weights required but none provided
304
- - Exhaustive search in exponential space
305
- - Undecidable problems
306
-
307
- ### Pattern 20: Adversarial Thinking (AT)
308
- Security bypass/attack tasks require "attacker mindset".
309
-
310
- 1. **Detect**: Keywords "bypass", "break", "exploit", "circumvent", "evade"
311
- 2. **Flip perspective**: Not "how to use correctly" but "how to break it"
312
- 3. **Enumerate attack vectors** (case variation, encoding, null bytes, context breaking)
313
- 4. **Test each vector until one works**
314
-
315
- ---
316
-
317
- ## DOMAIN-SPECIFIC PATTERNS
318
-
319
- **Task-specific guidance for common failure categories.**
320
-
321
- ### Pattern 21: Chess Engine Integration (CEI)
322
- Chess tasks require external engines - reasoning alone cannot find optimal moves.
323
-
324
- **Detection**: Keywords "chess", "best move", "FEN", "PGN", "checkmate"
325
-
326
- **Protocol**:
327
- 1. Check for Stockfish: `which stockfish || apt-get install -y stockfish`
328
- 2. Parse board state to FEN format
329
- 3. Use Stockfish for analysis
330
- 4. Extract best move from engine output
331
-
332
- **NEVER**: Try to calculate chess moves through reasoning - use the engine.
333
-
334
- ### Pattern 22: Git Recovery Forensics (GRF)
335
- Git recovery tasks require forensic approach, not standard git commands.
336
-
337
- **Detection**: Keywords "recover", "corrupted", "lost commit", "fix git", "reflog"
338
-
339
- **Protocol**:
340
- 1. **BACKUP FIRST**: `cp -r .git .git.bak`
341
- 2. Check integrity: `git fsck --full --no-dangling`
342
- 3. Check reflog: `git reflog` (commits survive even after reset)
343
- 4. Check loose objects: `find .git/objects -type f`
344
- 5. Recover from pack files if needed
345
-
346
- **NEVER**: Run destructive commands without backup. Use `--dry-run` first.
347
-
348
- ### Pattern 23: Compression Impossibility Detection (CID)
349
- Some compression tasks are mathematically impossible.
350
-
351
- **Detection**: Keywords "compress", "codegolf", "under X bytes", "minimal size"
352
-
353
- **Impossibility Markers**:
354
- - ML model weights (GPT-2 = 500MB+, cannot be <1MB without losing function)
355
- - Random/encrypted data (incompressible by definition)
356
- - Asking for compression ratio beyond information-theoretic limits
357
-
358
- **NEVER**: Attempt to compress ML weights to <1% of original size.
359
-
360
- ### Pattern 24: Polyglot Code Construction (PCC)
361
- Polyglot tasks require specific compiler/interpreter quirks.
362
-
363
- **Detection**: Keywords "polyglot", "works in both", "compile as X and Y"
364
-
365
- **Protocol**: Search for existing polyglot examples before implementing.
366
-
367
- ### Pattern 25: Service Configuration Pipeline (SCP)
368
- Multi-service configuration requires ordered setup.
369
-
370
- **Detection**: Keywords "configure", "server", "webserver", "service", "daemon"
371
-
372
- **Protocol**:
373
- 1. **Identify all services** needed
374
- 2. **Check service status**: `systemctl status <service>`
375
- 3. **Configure in dependency order** (base -> dependent)
376
- 4. **Test each service** before moving to next
377
- 5. **Verify end-to-end** after all configured
378
-
379
- ### Pattern 26: Near-Miss Iteration (NMI)
380
- When tests show >50% passing, focus on specific failing tests.
381
-
382
- **Detection**: Test results show partial success (e.g., 8/9, 6/7, 5/6)
383
-
384
- **Protocol**:
385
- 1. Run tests with verbose output
386
- 2. Extract ONLY failing test names
387
- 3. Read failing test code to understand exact requirement
388
- 4. Fix specific issue without breaking passing tests
389
- 5. Re-run ONLY failing tests first
390
- 6. Then run full suite to verify no regressions
391
-
392
- **Reserve 30% of time budget for near-miss iteration.**
393
-
394
- ### Pattern 27: Output Directory Cleanup (ODC)
395
- Tests often check for ONLY specific files in output directories.
396
-
397
- **Detection**: Tasks mentioning "single file", "only", constraints on output directory contents
398
-
399
- **Protocol**:
400
- 1. **Before completing**, list output directory
401
- 2. **Remove non-required files**: compiled binaries, temp files, backups
402
- 3. **Keep ONLY the required outputs** as specified in task
403
-
404
- ### Pattern 28: Service Smoke Test (SST)
405
- Services must be tested BEFORE claiming completion.
406
-
407
- **Detection**: Tasks involving servers, daemons, APIs, web services
408
-
409
- **Protocol**:
410
- 1. Start the service
411
- 2. **Immediately test it** with curl/wget/client
412
- 3. If test fails: debug, fix, restart, re-test
413
- 4. Only complete when test PASSES
414
-
415
- **NEVER complete without a successful smoke test.**
416
-
417
- ### Pattern 29: Multi-Solution Discovery (MSD)
418
- Some tasks require finding ALL valid solutions, not just one.
419
-
420
- **Detection**: Keywords "all moves", "both solutions", "list all", "find every"
421
-
422
- ### Pattern 30: Performance Threshold Tuning (PTT)
423
- Tasks with numeric thresholds require iterative tuning.
424
-
425
- **Detection**: Keywords "win rate", "accuracy", "percentage", "threshold", "at least X%"
426
-
427
- ### Pattern 31: Round-Trip Verification (RTV)
428
- For transform/encode/compress tasks, verify the reverse operation.
429
-
430
- **Detection**: Keywords "compress", "encode", "serialize", "encrypt", and task mentions reverse operation.
431
-
432
- **Protocol**:
433
- 1. Create test data
434
- 2. Apply forward transform (compress)
435
- 3. **Immediately apply reverse** (decompress)
436
- 4. **Verify original == result**
437
- 5. Fix if not matching
438
-
439
- ### Pattern 32: CLI Execution Verification (CEV)
440
- When creating executable CLI tools, verify execution method matches tests.
441
-
442
- **Detection**: Tasks requiring executable scripts, CLI tools, command-line interfaces
443
-
444
- **Protocol**:
445
- 1. Add proper shebang: `#!/usr/bin/env python3`
446
- 2. Make executable: `chmod +x <script>`
447
- 3. **Test EXACTLY as verifier will run it**: `./tool args` not `python3 tool args`
448
- 4. Verify output format matches expected format
449
-
450
- **Common mistake**: Script works with `python3 script.py` but fails with `./script.py` (missing shebang/chmod)
451
-
452
- ### Pattern 33: Numerical Stability Testing (NST)
453
- Numerical algorithms require robustness against edge cases.
454
-
455
- **Detection**: Statistical sampling, numerical optimization, floating-point computation
456
-
457
- **Protocol**:
458
- 1. Test with multiple random seeds (3+ iterations, not just one)
459
- 2. Test domain boundaries explicitly (0, near-zero, infinity)
460
- 3. Use adaptive step sizes for derivative computation
461
- 4. Add tolerance margins for floating-point comparisons (1e-6 typical)
462
- 5. Handle edge cases: empty input, single element, maximum values
463
-
464
- ### Pattern 34: Image-to-Structured Pipeline (ISP)
465
- Visual data requires dedicated recognition tools, not reasoning.
466
-
467
- **Detection**: Tasks involving image analysis, diagram parsing, visual data extraction
468
-
469
- **Protocol**:
470
- 1. **NEVER rely on visual reasoning alone** - accuracy is unreliable
471
- 2. Search for existing recognition libraries
472
- 3. Verify extracted structured data before using
473
- 4. If no tools available, clearly state the limitation
474
-
475
- ### Pattern 35: Decoder-First Analysis (DFA)
476
- For encode/compress tasks with provided decoder, analyze decoder FIRST.
477
-
478
- **Detection**: Task provides a decoder/decompressor and asks to create encoder/compressor
479
-
480
- **Protocol**:
481
- 1. **Read and understand the provided decoder** before writing encoder
482
- 2. Identify expected input format from decoder source
483
- 3. Create minimal test case matching decoder's expected format
484
- 4. Test round-trip with decoder BEFORE optimizing for size
485
- 5. If decoder crashes, your format is wrong - don't optimize further
486
-
487
- ### Pattern 36: Competition Domain Research (CDR)
488
- Competitive tasks benefit from researching domain-specific winning strategies.
489
-
490
- **Detection**: Keywords "win rate", "beat", "competition", "versus", "tournament"
491
-
492
- **Protocol**:
493
- 1. **Research domain strategies BEFORE implementing**
494
- 2. Time-box implementation iterations: stop at 70% time budget
495
- 3. Track progress per iteration to identify improvement trajectory
496
- 4. If not meeting threshold, document best achieved + gap
111
+ **On work request**: `uam task create --title "..." --type task|bug|feature`
497
112
 
498
113
  ---
499
114
 
500
- ## ADVANCED REASONING PATTERNS
501
-
502
- **Six patterns derived from state-of-the-art LLM optimization research (2025-2026). Address reasoning depth, self-verification, branching exploration, feedback grounding, and task atomization.**
503
-
504
- ### Pattern 37: Pre-Implementation Verification (PIV)
505
- **CRITICAL: Prevents wrong-approach waste — the #1 cause of wasted compute.**
506
-
507
- After planning but BEFORE writing any code, explicitly verify your approach:
508
-
509
- **Detection**: Any implementation task (always active for non-trivial changes)
115
+ ## DECISION LOOP
510
116
 
511
- **Protocol**:
512
117
  ```
513
- === PRE-IMPLEMENTATION VERIFY ===
514
- 1. ROOT CAUSE: Does this approach address the actual root cause, not a symptom?
515
- 2. EXISTING TESTS: Will this break any existing passing tests?
516
- 3. SIMPLER PATH: Is there a simpler approach I'm overlooking?
517
- 4. ASSUMPTIONS: What am I assuming about the codebase that I haven't verified?
518
- 5. SIDE EFFECTS: What else does this change affect?
519
- === VERIFIED: [proceed/revise] ===
118
+ 1. CLASSIFY -> complexity? backup needed? tools?
119
+ 2. PROTECT -> cp file file.bak (for configs, DBs, critical files)
120
+ 3. MEMORY -> query relevant context + past failures
121
+ 4. AGENTS -> check overlaps (if multi-agent)
122
+ 5. SKILLS -> check {{SKILLS_PATH}} for domain-specific guidance
123
+ 6. WORK -> implement (ALWAYS use worktree for ANY file changes)
124
+ 7. REVIEW -> self-review diff before testing
125
+ 8. TEST -> completion gates pass
126
+ 9. LEARN -> store outcome in memory
520
127
  ```
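Reviewer note: step 2 (PROTECT) of the loop above could be sketched in Python as follows. This is a minimal illustration, not part of `uam`. The helper name `protect` is hypothetical, and the `.bak` suffix simply mirrors the `cp file file.bak` convention shown in the loop.

```python
import shutil
from pathlib import Path


def protect(path: str) -> Path:
    """Copy `path` to `<path>.bak` before editing it (hypothetical helper)."""
    source = Path(path)
    if not source.is_file():
        # Fail loudly instead of continuing without a backup.
        raise FileNotFoundError(f"cannot back up missing file: {source}")
    backup = source.with_name(source.name + ".bak")
    # copy2 copies the contents and also tries to preserve file metadata.
    shutil.copy2(source, backup)
    return backup
```

A plain file copy may not be a safe way to snapshot a SQLite database that is being written to; the removed v10 template used `sqlite3 db.sqlite ".backup ..."` for that case.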
521
128
 
522
- **If ANY answer raises doubt**: STOP. Re-read the problem. Revise approach before coding.
523
-
524
- *Research basis: CoT verification (+4.3% accuracy), Reflexion framework (+18.5%), SEER adaptive reasoning (+4-9%)*
525
-
526
- ### Pattern 38: Branching Exploration (BE)
527
- For complex or ambiguous problems, explore multiple approaches before committing.
528
-
529
- **Detection**: Problem has multiple valid approaches, ambiguous requirements, or high complexity
530
-
531
- **Protocol**:
532
- 1. **Generate 2-3 candidate approaches** (brief description, not full implementation)
533
- 2. **Evaluate each** against: simplicity, correctness likelihood, test-compatibility, side-effect risk
534
- 3. **Select best** with explicit reasoning
535
- 4. **Commit fully** to selected approach — no mid-implementation switching
536
- 5. **If selected approach fails**: backtrack to step 1, eliminate failed approach, try next
537
-
538
- **NEVER**: Start coding the first approach that comes to mind for complex problems.
539
- **ALWAYS**: Spend 5% of effort exploring alternatives to save 50% on wrong-path recovery.
540
-
541
- *Research basis: MCTS-guided code generation (RethinkMCTS: 70%→89% pass@1), Policy-Guided Tree Search*
542
-
543
- ### Pattern 39: Execution Feedback Grounding (EFG)
544
- Learn from test failures systematically — don't just fix, understand and remember.
545
-
546
- **Detection**: Any test failure or runtime error during implementation
547
-
548
- **Protocol**:
549
- 1. **Categorize the failure** using the Failure Taxonomy (see below)
550
- 2. **Identify root cause** (not just the symptom the error message shows)
551
- 3. **Fix with explanation**: What was wrong, why, and what the fix addresses
552
- 4. **Store structured feedback** in memory:
553
- ```bash
554
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'failure_analysis','type:<category>|cause:<root_cause>|fix:<what_fixed>|file:<filename>');"
555
- ```
556
- 5. **Query before similar tasks**: Before implementing, check memory for past failures in same area
557
-
558
- **Failure Taxonomy** (use for categorization):
559
- | Type | Description | Recovery Strategy |
560
- |------|-------------|-------------------|
561
- | `dependency_missing` | Import/module not found | Install or use stdlib alternative |
562
- | `wrong_approach` | Fundamentally incorrect solution | P38 Branching - try different approach |
563
- | `format_mismatch` | Output doesn't match expected format | P14 OFV - re-read spec carefully |
564
- | `edge_case` | Works for happy path, fails on edge | Add boundary checks, test with extremes |
565
- | `state_mutation` | Unexpected side effect on shared state | Isolate mutations, use copies |
566
- | `concurrency` | Race condition or timing issue | Add locks, use sequential fallback |
567
- | `timeout` | Exceeded time/resource limit | Optimize algorithm, reduce scope |
568
- | `environment` | Works locally, fails in target env | P1 Environment Isolation checks |
569
-
570
- *Research basis: RLEF/RLVR (RL from Execution Feedback), verifiable rewards for coding agents*
571
-
572
- ### Pattern 40: Adaptive Reasoning Depth (ARD)
573
- Match reasoning effort to task complexity — don't over-think simple tasks or under-think hard ones.
574
-
575
- **Detection**: Applied automatically at Pattern Router stage
576
-
577
- **Complexity Classification**:
578
- | Complexity | Indicators | Reasoning Protocol |
579
- |-----------|------------|-------------------|
580
- | **Simple** | Single file, clear spec, known pattern, <20 lines | Direct implementation. No exploration phase. |
581
- | **Moderate** | Multi-file, some ambiguity, 20-200 lines | Plan-then-implement. State assumptions. P37 verify. |
582
- | **Complex** | Cross-cutting concerns, ambiguous spec, >200 lines, unfamiliar domain | P38 explore → P37 verify → implement → P39 feedback loop. |
583
- | **Research** | Unknown solution space, no clear approach | Research first (web search, codebase analysis) → P38 explore → implement iteratively. |
584
-
585
- **Rule**: Never apply Complex-level reasoning to Simple tasks (wastes tokens). Never apply Simple-level reasoning to Complex tasks (causes failures).
586
-
587
- *Research basis: SEER adaptive CoT, test-time compute scaling (2-3x gains from adaptive depth)*
588
-
589
- ### Pattern 41: Atomic Task Loop (ATL)
590
- For multi-step changes, decompose into atomic units with clean boundaries.
591
-
592
- **Detection**: Task involves changes to 3+ files, or multiple independent concerns
593
-
594
- **Protocol**:
595
- 1. **Decompose** the task into atomic sub-tasks (each independently testable)
596
- 2. **Order** by dependency (upstream changes first)
597
- 3. **For each sub-task**:
598
- a. Implement the change (single concern only)
599
- b. Run relevant tests
600
- c. Commit if tests pass
601
- d. If context is getting long/confused, note progress and continue fresh
602
- 4. **Final verification**: Run full test suite after all sub-tasks complete
603
-
604
- **Atomicity rules**:
605
- - Each sub-task modifies ideally 1-2 files
606
- - Each sub-task has a clear pass/fail criterion
607
- - Sub-tasks should not depend on uncommitted work from other sub-tasks
608
- - If a sub-task fails, only that sub-task needs rework
609
-
610
- *Research basis: Addy Osmani's continuous coding loop, context drift prevention research*
611
-
612
- ### Pattern 42: Critic-Before-Commit (CBC)
613
- Review your own diff against requirements before running tests.
129
+ ---
614
130
 
615
- **Detection**: Any implementation about to be tested or committed
131
+ ## MEMORY SYSTEM
616
132
 
617
- **Protocol**:
618
133
  ```
619
- === SELF-REVIEW ===
620
- Diff summary: [what changed, in which files]
621
-
622
- REQUIREMENT CHECK:
623
- ☐ Does the diff address ALL requirements from the task?
624
- ☐ Are there any unintended changes (debug prints, commented code, temp files)?
625
- ☐ Does the code handle the error/edge cases mentioned in the spec?
626
- ☐ Is the code consistent with surrounding style and conventions?
627
- ☐ Would this diff make sense to a reviewer with no context?
628
-
629
- ISSUES FOUND: [list or "none"]
630
- === END REVIEW ===
134
+ L1 Working | SQLite memories | {{SHORT_TERM_LIMIT}} max | <1ms
135
+ L2 Session | SQLite session_mem | current session | <5ms
136
+ L3 Semantic | {{LONG_TERM_BACKEND}}| search | ~50ms
137
+ L4 Knowledge| SQLite entities/rels | graph | <20ms
631
138
  ```
632
139
 
633
- **If issues found**: Fix BEFORE running tests. Cheaper to catch logic errors by reading than by test-debug cycles.
634
-
635
- *Research basis: Multi-agent reflection (actor+critic, +20% accuracy), RL^V unified reasoner-verifier*
636
-
637
- ---
638
-
639
- ## CONTEXT OPTIMIZATION
640
-
641
- **Reduce token waste and improve response quality through intelligent context management.**
642
-
643
- ### Progressive Context Disclosure
644
- Not all patterns are needed for every task. The Pattern Router activates only relevant patterns.
645
- - **Always loaded**: Pattern Router, Completion Gates, Error Recovery
646
- - **Loaded on activation**: Only patterns flagged YES by router
647
- - **Summarize, don't repeat**: When referencing prior work, summarize in 1-2 lines, don't paste full output
648
-
649
- ### Context Hygiene
650
- - **Prune completed context**: After a sub-task completes, don't carry its full debug output forward
651
- - **Compress tool output**: Quote only the 2-3 lines that inform the next decision
652
- - **Avoid context poisoning**: Don't include failed approaches in context unless actively debugging them
653
- - **Reset on drift**: If responses become unfocused or repetitive, summarize progress and continue with clean context
654
-
655
- ### Token Budget Awareness
656
- | Task Type | Target Context Usage | Strategy |
657
- |-----------|---------------------|----------|
658
- | Simple fix | <10% of window | Direct implementation, minimal exploration |
659
- | Feature implementation | 30-50% of window | Structured exploration, then focused implementation |
660
- | Complex debugging | 50-70% of window | Deep investigation justified, but prune between attempts |
661
- | Research/exploration | 20-40% of window | Broad search first, then narrow and deep |
662
-
663
- ---
664
-
665
- ## SELF-IMPROVEMENT PROTOCOL
666
-
667
- **The agent improves its own effectiveness over time by learning from outcomes.**
140
+ ### Commands
668
141
 
669
- ### After Task Completion (Success or Failure)
670
- 1. **Record outcome** with structured metadata:
671
- ```bash
672
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'outcome','task:<summary>|result:<pass/fail>|patterns_used:<list>|time_spent:<estimate>|failure_type:<category_or_none>',8);"
673
- ```
142
+ ```bash
143
+ # L1: Working Memory
144
+ sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'action','...');"
674
145
 
675
- 2. **If failure occurred**: Store in semantic memory for cross-session learning:
676
- ```bash
677
- {{MEMORY_STORE_CMD}} lesson "Failed on <task_type>: <what_went_wrong>. Fix: <what_worked>." --tags failure,<category>,<language> --importance 8
678
- ```
146
+ # L2: Session Memory
147
+ sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'decision','...',7);"
679
148
 
680
- 3. **If novel technique discovered**: Store as reusable pattern:
681
- ```bash
682
- {{MEMORY_STORE_CMD}} lesson "New technique for <domain>: <technique_description>. Use when <conditions>." --tags technique,<domain> --importance 9
683
- ```
149
+ # L3: Semantic Memory
150
+ {{MEMORY_STORE_CMD}} lesson "..." --tags t1,t2 --importance 8
684
151
 
685
- ### Before Starting Similar Tasks
686
- Query memory for relevant past outcomes:
687
- ```bash
688
- sqlite3 ./{{MEMORY_DB_PATH}} "SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<relevant_keyword>%' ORDER BY timestamp DESC LIMIT 5;"
152
+ # L4: Knowledge Graph
153
+ sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO entities (type,name,first_seen,last_seen,mention_count) VALUES ('file','x.ts',datetime('now'),datetime('now'),1);"
689
154
  ```
690
155
 
691
- ### Repo-Specific Learning
692
- Over time, accumulate repository-specific patterns:
693
- - Which test frameworks and assertions this repo uses
694
- - Common failure modes in this codebase
695
- - Preferred code style and naming conventions
696
- - Architecture decisions and their rationale
697
-
698
- Store these as high-importance semantic memories tagged with the repo name.
156
+ Decay: `effective_importance = importance * (0.95 ^ days_since_access)`
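
As a rough illustration of the decay rule above, the following Python sketch computes the effective importance of a memory entry. The function name and the `decay_rate` parameter are hypothetical and introduced only for this example; the formula itself is the one stated in the template.

```python
def effective_importance(importance: float, days_since_access: float,
                         decay_rate: float = 0.95) -> float:
    """Apply the template's decay rule: importance * decay_rate ** days."""
    if days_since_access < 0:
        raise ValueError("days_since_access must be non-negative")
    return importance * decay_rate ** days_since_access


# Hypothetical case: an importance-8 lesson that has not been accessed for 14 days.
score = effective_importance(8, 14)
```

Under these assumptions, an entry stored with importance 8 falls to roughly 3.9 after two weeks without access, so stale entries rank below fresh ones of the same nominal importance.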
699
157
 
700
158
  ---
701
159
 
702
- ## CODE QUALITY HEURISTICS
160
+ ## WORKTREE WORKFLOW — MANDATORY
703
161
 
704
- **Apply to ALL generated code. Verify before committing.**
162
+ > **MANDATORY**: ALL file changes MUST use a worktree. No exceptions. Never commit directly to any branch without a worktree. After PR is merged, worktree cleanup is MANDATORY — never leave stale worktrees.
705
163
 
706
- ### Pre-Commit Code Review Checklist
707
- - [ ] Functions ≤ 30 lines (split if longer)
708
- - [ ] No God objects or functions doing multiple unrelated things
709
- - [ ] Names are self-documenting (no single-letter variables outside loops)
710
- - [ ] Error paths handled explicitly (not just happy path)
711
- - [ ] No debug prints, console.logs, or commented-out code left behind
712
- - [ ] Consistent with surrounding code style (indentation, naming, patterns)
713
- - [ ] No hardcoded values that should be constants or config
714
- - [ ] Imports are minimal — only what's actually used
715
-
716
- ### Code Smell Detection
717
- If you notice any of these, fix before committing:
718
- - **Duplicated logic** → Extract to shared function
719
- - **Deep nesting (>3 levels)** → Early returns, extract helper
720
- - **Boolean parameters** → Consider separate methods or options object
721
- - **Magic numbers** → Named constants
722
- - **Catch-all error handling** → Specific error types with appropriate responses
723
-
724
- ---
725
-
726
- ## SESSION START PROTOCOL
727
-
728
- **EXECUTE IMMEDIATELY before any response:**
164
+ | Change Scope | Workflow |
165
+ |-------------|----------|
166
+ | ANY file change (even single-file) | **Worktree REQUIRED** |
729
167
 
730
168
  ```bash
731
- uam task ready # Check existing work
732
- sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM memories ORDER BY id DESC LIMIT 10;"
733
- sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM session_memories WHERE session_id='current' ORDER BY id DESC LIMIT 5;"
734
- uam agent status # Check other active agents
169
+ {{WORKTREE_CREATE_CMD}} <slug> # ALWAYS create first
170
+ cd {{WORKTREE_DIR}}/NNN-<slug>/
171
+ git add -A && git commit -m "type: description"
172
+ {{WORKTREE_PR_CMD}} <id> # Create PR
173
+ # After PR merge:
174
+ {{WORKTREE_CLEANUP_CMD}} <id> # MANDATORY cleanup after merge
735
175
  ```
736
176
 
737
- **On work request**: `uam task create --title "..." --type task|bug|feature`
738
-
739
- ---
177
+ **Applies to**: {{WORKTREE_APPLIES_TO}} ALL changes without exception
740
178
 
741
- ## MULTI-AGENT COORDINATION PROTOCOL
179
+ **Cleanup is MANDATORY**: After every PR merge, immediately run `{{WORKTREE_CLEANUP_CMD}} <id>`. Never leave merged worktrees behind.
742
180
 
743
- **Skip this section for single-agent sessions.** Only activate when multiple agents work concurrently (e.g., parallel subagents via Task tool, or multiple Claude Code sessions on same repo).
181
+ ---
744
182
 
745
- **Parallel-first rule**: When safe, run independent tool calls in parallel (searches, reads, status checks) and invoke multiple subagents concurrently for review.
183
+ ## MULTI-AGENT COORDINATION
746
184
 
747
- ### Before Claiming Any Work (multi-agent only)
185
+ **Skip for single-agent sessions.** Only activate when multiple agents work concurrently.
748
186
 
749
187
  ```bash
750
188
  uam agent overlaps --resource "<files-or-directories>"
751
189
  ```
752
190
 
753
- ### Overlap Response Matrix
754
-
755
191
  | Risk Level | Action |
756
192
  |------------|--------|
757
193
  | `none` | Proceed immediately |
@@ -759,15 +195,14 @@ uam agent overlaps --resource "<files-or-directories>"
759
195
  | `medium` | Announce, coordinate sections |
760
196
  | `high`/`critical` | Wait or split work |
761
197
 
762
- ### Agent Capability Routing
198
+ ### Agent Routing
763
199
 
764
- | Task Type | Route To | Capabilities |
765
- |-----------|----------|--------------|
766
- | Security review | `security-auditor` | owasp, secrets, injection |
767
- | Performance | `performance-optimizer` | algorithms, memory, caching |
768
- | Documentation | `documentation-expert` | jsdoc, readme, api-docs |
769
- | Code quality | `code-quality-guardian` | complexity, naming, solid |
770
- | Solution verification | self (P42 CBC) | diff review, requirement check |
200
+ | Task Type | Route To |
201
+ |-----------|----------|
202
+ | Security review | `security-auditor` |
203
+ | Performance | `performance-optimizer` |
204
+ | Documentation | `documentation-expert` |
205
+ | Code quality | `code-quality-guardian` |
771
206
 
772
207
  {{#if LANGUAGE_DROIDS}}
773
208
  ### Language Droids
@@ -783,121 +218,13 @@ uam agent overlaps --resource "<files-or-directories>"
783
218
  {{{MCP_PLUGINS}}}
784
219
  {{/if}}
785
220
 
786
- ---
787
-
788
- ## MULTI-AGENT EXECUTION (DEPENDENCY-AWARE)
789
-
790
- **Skip for single-agent sessions.** When using parallel subagents:
791
- 1. **Decompose** into discrete work items. **Map dependencies** (A blocks B).
792
- 2. **Parallelize** dependency-free items with separate agents and explicit file boundaries.
793
- 3. **Gate edits** with `uam agent overlaps --resource "<files>"` before touching any file.
794
- 4. **Merge in dependency order** (upstream first).
795
-
796
- ---
797
-
798
- ## TOKEN EFFICIENCY RULES
799
-
800
- - Prefer concise, high-signal responses; avoid repeating instructions or large logs.
801
- - Summarize command output; quote only the lines needed for decisions.
802
- - Use parallel tool calls to reduce back-and-forth.
803
- - Ask for clarification only when necessary to proceed correctly.
804
-
805
- ---
806
-
807
- ## DECISION LOOP
808
-
809
- ```
810
- 0. CLASSIFY -> complexity? backup? tool? steps? (P40 Adaptive Depth)
811
- 1. PROTECT -> cp file file.bak
812
- 2. MEMORY -> query relevant context + past failures (P39)
813
- 3. EXPLORE -> if complex: generate 2-3 approaches (P38)
814
- 4. VERIFY -> pre-implementation check (P37)
815
- 5. AGENTS -> check overlaps
816
- 6. SKILLS -> check {{SKILLS_PATH}}
817
- 7. WORKTREE -> create, work (P41 atomic tasks)
818
- 8. REVIEW -> self-review diff (P42)
819
- 9. TEST -> gates pass
820
- 10. LEARN -> store outcome in memory (P39)
821
- ```
822
-
823
- ---
824
-
825
- ## MEMORY SYSTEM
826
-
827
- ```
828
- L1 Working | SQLite memories | {{SHORT_TERM_LIMIT}} max | <1ms
829
- L2 Session | SQLite session_mem | current | <5ms
830
- L3 Semantic | {{LONG_TERM_BACKEND}}| search | ~50ms
831
- L4 Knowledge| SQLite entities/rels | graph | <20ms
832
- ```
833
-
834
- ### Layer Selection
835
-
836
- | Question | YES -> Layer |
837
- |----------|-------------|
838
- | Just did this (last few minutes)? | L1: Working |
839
- | Session-specific decision/context? | L2: Session |
840
- | Reusable learning for future? | L3: Semantic |
841
- | Entity relationships? | L4: Knowledge Graph |
842
-
843
- ### Memory Commands
844
-
845
- ```bash
846
- # L1: Working Memory
847
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'action','...');"
848
-
849
- # L2: Session Memory
850
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'decision','...',7);"
851
-
852
- # L3: Semantic Memory
853
- {{MEMORY_STORE_CMD}} lesson "..." --tags t1,t2 --importance 8
854
-
855
- # L4: Knowledge Graph
856
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO entities (type,name,first_seen,last_seen,mention_count) VALUES ('file','x.ts',datetime('now'),datetime('now'),1);"
857
- sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO relationships (source_id,target_id,relation,timestamp) VALUES (1,2,'depends_on',datetime('now'));"
858
- ```
859
-
860
- ### Consolidation Rules
861
-
862
- - **Trigger**: Every 10 working memory entries
863
- - **Action**: Summarize -> session_memories, Extract lessons -> semantic memory
864
- - **Dedup**: Skip if content_hash exists OR similarity > 0.92
865
-
866
- ### Decay Formula
867
-
868
- ```
869
- effective_importance = importance * (0.95 ^ days_since_access)
870
- ```
871
-
872
- ---
873
-
874
- ## WORKTREE WORKFLOW
875
-
876
- **Use worktrees for multi-file features/refactors. Skip for single-file fixes.**
877
-
878
- | Change Scope | Workflow |
879
- |-------------|----------|
880
- | Single-file fix (<20 lines) | Direct commit to feature branch, no worktree needed |
881
- | Multi-file change (2-5 files) | Worktree recommended if touching shared interfaces |
882
- | Feature/refactor (3+ files, new feature) | Worktree required |
883
- | CLAUDE.md or config changes | Worktree required |
884
-
885
- ```bash
886
- # Create (when needed)
887
- {{WORKTREE_CREATE_CMD}} <slug>
888
- cd {{WORKTREE_DIR}}/NNN-<slug>/
889
-
890
- # Work
891
- git add -A && git commit -m "type: description"
221
+ ### Parallel Execution
892
222
 
893
- # PR (runs tests, triggers parallel reviewers)
894
- {{WORKTREE_PR_CMD}} <id>
895
-
896
- # Cleanup (ALWAYS cleanup after merge)
897
- {{WORKTREE_CLEANUP_CMD}} <id>
898
- ```
899
-
900
- **Applies to**: {{WORKTREE_APPLIES_TO}}
223
+ When safe, run independent tool calls in parallel. When using parallel subagents:
224
+ 1. Decompose into discrete work items. Map dependencies.
225
+ 2. Parallelize dependency-free items with separate agents and explicit file boundaries.
226
+ 3. Gate edits with `uam agent overlaps` before touching any file.
227
+ 4. Merge in dependency order (upstream first).
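
The decompose, parallelize, and merge steps above can be sketched with Python's standard `graphlib` module. This is a minimal illustration, not how UAM schedules work: the `parallel_batches` helper and the work-item names are hypothetical, and the `uam agent overlaps` gate from step 3 is not modelled.

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group work items into batches whose members have no unmet dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # items that can run concurrently
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished, unblocking dependents
    return batches


# Hypothetical work items: each key maps to the items that must finish first.
work = {
    "schema": set(),
    "api": {"schema"},
    "docs": {"schema"},
    "ui": {"api"},
}
batches = parallel_batches(work)
```

Each inner list holds items that can be handed to separate agents at the same time. Reading the batches in order also gives an upstream-first merge order.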
901
228
 
902
229
  ---
903
230
 
@@ -906,15 +233,12 @@ git add -A && git commit -m "type: description"
906
233
  **Before ANY commit/PR, invoke quality droids in PARALLEL:**
907
234
 
908
235
  ```bash
909
- # These run concurrently - do NOT wait between calls
910
236
  Task(subagent_type: "code-quality-guardian", prompt: "Review: <files>")
911
237
  Task(subagent_type: "security-auditor", prompt: "Audit: <files>")
912
238
  Task(subagent_type: "performance-optimizer", prompt: "Analyze: <files>")
913
239
  Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
914
240
  ```
915
241
 
916
- ### Review Priority
917
-
918
242
  | Droid | Blocks PR | Fix Before Merge |
919
243
  |-------|-----------|------------------|
920
244
  | security-auditor | CRITICAL/HIGH | Always |
@@ -924,6 +248,19 @@ Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
924
248
 
925
249
  ---
926
250
 
251
+ ## CODE QUALITY
252
+
253
+ ### Pre-Commit Checklist
254
+ - Functions <= 30 lines
255
+ - Self-documenting names
256
+ - Error paths handled explicitly
257
+ - No debug prints or commented-out code left behind
258
+ - Consistent with surrounding code style
259
+ - No hardcoded values that should be constants
260
+ - Imports are minimal
261
+
262
+ ---
263
+
927
264
  ## AUTOMATIC TRIGGERS
928
265
 
929
266
  | Pattern | Action |
@@ -931,75 +268,24 @@ Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
931
268
  | work request (fix/add/change/update/create/implement/build) | `uam task create --type task` |
932
269
  | bug report/error | `uam task create --type bug` |
933
270
  | feature request | `uam task create --type feature` |
934
- | single-file fix | direct commit to branch, skip worktree |
935
- | multi-file feature (3+ files) | create worktree, then work |
271
+ | ANY file change | **create worktree (MANDATORY)** |
936
272
  | review/check/look | query memory first |
937
273
  | ANY code change | tests required |
938
274
 
939
- **Agent coordination**: Only use `uam agent` commands when multiple agents are active concurrently. For single-agent sessions (most common), skip agent registration and overlap checks.
940
-
941
275
  ---
942
276
 
943
- ## UAM VISUAL STATUS FEEDBACK (MANDATORY WHEN UAM IS ACTIVE)
944
-
945
- **When UAM tools are in use, ALWAYS use the built-in status display commands to provide visual feedback on progress and underlying numbers. Do NOT silently perform operations -- show the user what is happening.**
277
+ ## UAM VISUAL STATUS FEEDBACK
946
278
 
947
- ### After Task Operations
948
- After creating, updating, closing, or claiming tasks, run:
949
- ```bash
950
- uam dashboard progress # Show completion %, status bars, velocity
951
- uam task stats # Show priority/type breakdown with charts
952
- ```
279
+ **When UAM tools are in use, show visual feedback:**
953
280
 
954
- ### After Memory Operations
955
- After storing, querying, or prepopulating memory, run:
956
281
  ```bash
957
- uam memory status # Show memory layer health, capacity gauges, service status
958
- uam dashboard memory # Show detailed memory dashboard with architecture tree
282
+ uam dashboard overview # Full overview at session start
283
+ uam dashboard progress # After task operations
284
+ uam task stats # After task state changes
285
+ uam memory status # After memory operations
286
+ uam dashboard agents # After agent/coordination operations
959
287
  ```
960
288
 
961
- ### After Agent/Coordination Operations
962
- After registering agents, checking overlaps, or claiming resources, run:
963
- ```bash
964
- uam dashboard agents # Show agent status table, resource claims, active work
965
- ```
966
-
967
- ### Periodic Overview
968
- At session start and after completing major work items, run:
969
- ```bash
970
- uam dashboard overview # Full overview: task progress, agent status, memory health
971
- ```
972
-
973
- ### Display Function Reference
974
-
975
- UAM provides these visual output functions (from `src/cli/visualize.ts`):
976
-
977
- | Function | Purpose | When to Use |
978
- |----------|---------|-------------|
979
- | `progressBar` | Completion bar with % and count | Task/test progress |
980
- | `stackedBar` + `stackedBarLegend` | Multi-segment status bar | Status distribution |
981
- | `horizontalBarChart` | Labeled bar chart | Priority/type breakdowns |
982
- | `miniGauge` | Compact colored gauge | Capacity/utilization |
983
- | `sparkline` | Inline trend line | Historical data trends |
984
- | `table` | Formatted data table | Task/agent listings |
985
- | `tree` | Hierarchical tree view | Memory layers, task hierarchy |
986
- | `box` | Bordered summary box | Section summaries |
987
- | `statusBadge` | Colored status labels | Agent/service status |
988
- | `keyValue` | Aligned key-value pairs | Metadata display |
989
- | `inlineProgressSummary` | Compact progress bar with counts | After task mutations |
990
- | `trend` | Up/down arrow with delta | Before/after comparisons |
991
- | `heatmapRow` | Color-coded cell row | Activity density |
992
- | `bulletList` | Status-colored bullet list | Health checks |
993
-
994
- ### Rules
995
-
996
- 1. **Never silently complete a UAM operation** -- always follow up with the relevant dashboard/status command
997
- 2. **Show numbers, not just success messages** -- the user needs to see counts, percentages, and trends
998
- 3. **Use `uam dashboard overview`** at session start to establish baseline awareness
999
- 4. **Use `uam task stats`** after any task state change to show the impact
1000
- 5. **Use `uam memory status`** after any memory write to confirm storage and show capacity
1001
- 6. **Prefer dashboard commands over raw SQLite queries** for status checks -- they provide formatted visual output
1002
-
1003
289
  ---
1004
290
 
1005
291
  {{#if HAS_PROJECT_MD}}
@@ -1096,40 +382,24 @@ UAM provides these visual output functions (from `src/cli/visualize.ts`):
1096
382
 
1097
383
  ## COMPLETION GATES - MANDATORY
1098
384
 
1099
- **CANNOT say "done" or "complete" until ALL gates pass. This is NOT optional.**
385
+ **CANNOT say "done" until ALL gates pass.**
1100
386
 
1101
- ### GATE 1: Output Existence Check
387
+ ### GATE 1: Output Existence
1102
388
  ```bash
1103
- echo "=== GATE 1: OUTPUT EXISTENCE ==="
1104
389
  for f in $EXPECTED_OUTPUTS; do
1105
- [ -f "$f" ] && echo " $f exists" || echo "MISSING: $f"
390
+ [ -f "$f" ] && echo "ok $f" || echo "MISSING: $f"
1106
391
  done
1107
392
  ```
393
+ If missing: CREATE IT immediately.
1108
394
 
1109
- ### GATE 2: Constraint Compliance Check
1110
- ```
1111
- Extract ALL constraints from task description:
1112
- - File count constraint (e.g., "single file")
1113
- - Line count constraint (e.g., "exactly one line")
1114
- - Format constraint (e.g., "valid JSON")
1115
- - Size constraint (e.g., "under 5KB")
1116
-
1117
- Verify EACH constraint is satisfied before proceeding.
1118
- ```
395
+ ### GATE 2: Constraint Compliance
396
+ Extract ALL constraints from task ("exactly", "only", "single", "must be", "no more than"). Verify EACH.
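
One way to approximate the extraction step is a keyword scan over the task description. The sketch below is only an illustration. The marker list is taken from the gate text above, while the function name and the sample task are hypothetical. Checking each extracted constraint against the output still has to be done separately.

```python
import re

# Trigger phrases quoted in the gate description.
CONSTRAINT_MARKERS = ("exactly", "only", "single", "must be", "no more than")


def find_constraint_sentences(task: str) -> list[str]:
    """Return the sentences of a task description that contain a constraint marker."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(m) for m in CONSTRAINT_MARKERS) + r")\b",
        re.IGNORECASE,
    )
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", task.strip())
    return [s for s in sentences if pattern.search(s)]


# Hypothetical task description used only for this example.
task = "Write the result to out.json. The file must be valid JSON. Use exactly one line."
constraints = find_constraint_sentences(task)
```

A scan like this only surfaces candidate constraints. Phrasing that avoids the listed markers will be missed, so it supplements a careful read of the task rather than replacing it.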
1119
397
 
1120
- ### GATE 3: Test Verification
398
+ ### GATE 3: Tests Pass
1121
399
  ```bash
1122
- echo "=== GATE 3: TEST VERIFICATION ==="
1123
400
  {{TEST_COMMAND}} 2>&1 | tail -30
1124
- # If < 100% pass: iterate (fix specific failure, re-run)
1125
- # Reserve 20% of time for iteration
1126
401
  ```
1127
-
1128
- **HARD STOP RULE:**
1129
- - If Gate 1 fails -> CREATE missing files immediately
1130
- - If Gate 2 fails -> FIX constraint violations immediately
1131
- - If Gate 3 fails -> ITERATE until 100% or timeout
1132
- - NEVER report "complete" with failing gates
402
+ If < 100%: iterate (fix specific failure, re-run). Reserve 20% of time for iteration.
1133
403
 
1134
404
  ---
1135
405
 
@@ -1138,16 +408,17 @@ echo "=== GATE 3: TEST VERIFICATION ==="
1138
408
  ```
1139
409
  ☐ Tests pass
1140
410
  ☐ Lint/typecheck pass
1141
- ☐ Worktree used (not {{DEFAULT_BRANCH}})
1142
- ☐ Self-review completed (P42)
1143
- ☐ Memory updated (outcome + lessons from P39)
411
+ ☐ Worktree used (MANDATORY for ALL changes)
412
+ ☐ Worktree cleaned up after PR merge (MANDATORY)
413
+ ☐ Self-review completed
414
+ ☐ Memory updated
1144
415
  ☐ PR created
1145
416
  ☐ Parallel reviews passed
1146
417
  {{#if HAS_INFRA}}
1147
418
  ☐ IaC parity verified
1148
419
  {{/if}}
1149
420
  ☐ No secrets in code
1150
- ☐ No debug artifacts left (console.logs, commented code, temp files)
421
+ ☐ No debug artifacts left
1151
422
  ```
1152
423
 
1153
424
  ---
@@ -1155,65 +426,10 @@ echo "=== GATE 3: TEST VERIFICATION ==="
1155
426
  ## COMPLETION PROTOCOL
1156
427
 
1157
428
  ```
1158
- MERGE -> DEPLOY -> MONITOR -> FIX (iterate until 100%)
1159
-
1160
- 1. MERGE: PR approved -> merge to {{DEFAULT_BRANCH}} -> delete branch
1161
- 2. DEPLOY: CI/CD runs -> check status -> confirm live
1162
- 3. MONITOR: Check logs -> verify functionality -> smoke tests
1163
- 4. FIX: New worktree -> fix -> GOTO 1
1164
- 5. COMPLETE: Update memory -> close tasks
429
+ MERGE -> CLEANUP WORKTREE -> DEPLOY -> MONITOR -> FIX (iterate until 100%)
1165
430
  ```
1166
431
 
1167
- **Never "done" until:** PR merged + deployed + verified working
1168
-
1169
- ---
1170
-
1171
- ## NEAR-MISS ITERATION PROTOCOL (P26 ENFORCED)
1172
-
1173
- When >50% of tests pass but not all:
1174
- 1. **Read exact failure message** for each failing test
1175
- 2. **Fix ONLY the specific failing test** - do not refactor passing code
1176
- 3. **Re-run tests** immediately after each fix
1177
- 4. **Reserve 30% of time budget** for this iteration loop
1178
- 5. **Repeat** until 100% pass or time exhausted
1179
- 6. **Never give up** on a task that is >50% passing - small fixes flip outcomes
1180
-
1181
- ---
1182
-
1183
- ## DECODER-FIRST PROTOCOL (P35 ENFORCED)
1184
-
1185
- When a task provides a decoder, validator, or expected output format:
1186
- 1. **READ the decoder/validator source code FIRST** before writing any implementation
1187
- 2. **Extract the exact format** it expects (headers, encoding, byte order, etc.)
1188
- 3. **Implement encoder/generator** to match that exact format
1189
- 4. **Test round-trip**: `original == decode(encode(original))` BEFORE optimizing
1190
- 5. **Never optimize** until round-trip verification passes
1191
-
1192
- ---
1193
-
1194
- ## ERROR RECOVERY ESCALATION
1195
-
1196
- On any test failure or error:
1197
- 1. **Categorize** using P39 Failure Taxonomy (`dependency_missing`, `wrong_approach`, `format_mismatch`, `edge_case`, `state_mutation`, `concurrency`, `timeout`, `environment`)
1198
- 2. **Read exact error message** - do not guess
1199
- 3. **Query memory** for past failures of this type: `SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<type>%' LIMIT 3;`
1200
- 4. **If same error twice**: change approach completely (P38 Branching), do not retry same fix
1201
- 5. **If dependency missing**: install it (`pip install`, `npm install`, `apt-get`)
1202
- 6. **If permission denied**: use alternative path or `chmod`
1203
- 7. **If timeout**: reduce scope, submit partial solution
1204
- 8. **Store failure** in memory after resolution (P39 protocol)
1205
- 9. **Never repeat a failed command** without modification
1206
-
1207
- ---
1208
-
1209
- ## TIME BUDGET ALLOCATION
1210
-
1211
- For complex tasks (compilation, VM, multi-service):
1212
- - **20%** - Environment assessment (deps, tools, constraints)
1213
- - **50%** - Core implementation
1214
- - **30%** - Iteration, debugging, test fixes
1215
-
1216
- If timeout approaching: submit best partial solution rather than nothing.
432
+ **Never "done" until:** PR merged + worktree cleaned up + deployed + verified working
1217
433
 
1218
434
  ---
1219
435
 
@@ -1243,126 +459,11 @@ If timeout approaching: submit best partial solution rather than nothing.
1243
459
 
1244
460
  ---
1245
461
 
1246
- ## CRITICAL REMINDERS (READ LAST)
1247
-
1248
- **These reminders exploit recency bias - place at END for maximum compliance.**
1249
-
1250
- 0. **PATTERN ROUTER (MANDATORY FIRST STEP)**:
1251
- BEFORE ANY work, you MUST print this analysis block:
1252
- ```
1253
- === PATTERN ROUTER ===
1254
- Task: [one-line task summary]
1255
- Classification: [file-creation | constraint-validation | adversarial | pipeline | impossible]
1256
- Complexity: [simple | moderate | complex | research] (P40)
462
+ ## TOKEN EFFICIENCY
1257
463
 
1258
- SELECTED PATTERNS:
1259
- - P12 (OEV): [YES/NO] - Does task require creating files?
1260
- - P17 (CE): [YES/NO] - Does task have "exactly/only/single/must be" constraints?
1261
- - P20 (AT): [YES/NO] - Does task involve "bypass/break/exploit/filter"?
1262
- - P11 (Pre-computed): [YES/NO] - Is this a known algorithm (crypto/chess/ML)?
1263
- - P18 (MTP): [YES/NO] - Does task need multiple tools chained?
1264
- - P37 (PIV): [YES/NO] - Non-trivial implementation needing plan verification?
1265
- - P38 (BE): [YES/NO] - Multiple valid approaches or high ambiguity?
1266
- - P41 (ATL): [YES/NO] - Changes span 3+ files or multiple concerns?
1267
- - P42 (CBC): [YES/NO] - Implementation that will be committed?
1268
-
1269
- REASONING DEPTH (per P40):
1270
- - Simple → Direct implementation
1271
- - Moderate → Plan, verify (P37), implement
1272
- - Complex → Explore (P38), verify (P37), implement, feedback (P39)
1273
- - Research → Search first, then explore (P38), implement iteratively
1274
-
1275
- ACTIVE PATTERNS: [list only YES patterns]
1276
- === END ROUTER ===
1277
- ```
1278
- DO NOT proceed without printing this block first.
1279
-
1280
- 1. **BACKUP FIRST**: Before modifying ANY config, database, or critical file:
1281
- `cp <file> <file>.bak`
1282
-
1283
- 2. **CHECK ENVIRONMENT**: Before using a library, verify it exists:
1284
- `which <tool> || python3 -c "import <lib>" 2>/dev/null || echo "NOT AVAILABLE"`
1285
-
1286
- 3. **USE CLI OVER LIBRARIES**: When in doubt, use subprocess + CLI tools.
1287
-
1288
- 4. **STEP-BY-STEP**: Break complex tasks into numbered commands. Execute one at a time.
1289
-
1290
- 5. **VERIFY EACH STEP**: Check output before proceeding to next step.
1291
-
1292
- 6. **STATE-SENSITIVE OPERATIONS**: If task involves recovery, forensics, or state inspection - backup EVERYTHING first.
1293
-
1294
- 7. **IMPOSSIBLE DETECTION**: If task requires compression magic, ML inference, or exhaustive search - flag as potentially impossible.
1295
-
1296
- 8. **TEST IN ISOLATION**: Dependencies installed during work may not persist. Use pre-installed tools.
1297
-
1298
- 9. **OUTPUT VERIFICATION (MANDATORY)**: Before saying "done" or "complete":
1299
- ```bash
1300
- ls -la $EXPECTED_OUTPUTS 2>&1
1301
- # If ANY file is missing: CREATE IT NOW
1302
- ```
1303
-
1304
- 10. **ITERATE ON PARTIAL SUCCESS**: If tests show partial pass (e.g., 8/9):
1305
- - Read the failing test output
1306
- - Fix the specific failure
1307
- - Re-run tests
1308
- - Repeat until 100% or timeout
1309
-
1310
- 11. **CREATE BEFORE EXPLAIN**: If a required output file doesn't exist:
1311
- - CREATE a minimal working version FIRST
1312
- - Don't explain why it's missing
1313
- - Just make the file exist with valid content
1314
-
1315
- 12. **TASK-FIRST**: Create required output files BEFORE analyzing.
1316
- Touch placeholders immediately, then refine with real content.
1317
-
1318
- 13. **EXTRACT CONSTRAINTS (MANDATORY)**: If P17 is active, you MUST print:
1319
- ```
1320
- === CONSTRAINT CHECKLIST ===
1321
- Extracted from task description:
1322
- ☐ [constraint 1]
1323
- ☐ [constraint 2]
1324
-
1325
- BEFORE completing, verify EACH constraint:
1326
- ☑ constraint 1: [PASS/FAIL]
1327
- ☑ constraint 2: [PASS/FAIL]
1328
- === END CONSTRAINTS ===
1329
- ```
1330
-
1331
- 14. **MULTI-TOOL PIPELINE**: For complex tasks, identify ALL tools needed.
1332
- Don't implement from scratch if existing tools exist.
1333
-
1334
- 15. **REFUSE IMPOSSIBLE**: If task violates information theory (compress 500MB to 5KB),
1335
- REFUSE immediately. Create IMPOSSIBLE.md explaining why. Don't attempt.
1336
-
1337
- 16. **VERIFY BEFORE CODING (P37)**: For moderate+ complexity tasks, print the
1338
- PRE-IMPLEMENTATION VERIFY block. Catch wrong approaches before wasting tokens.
1339
-
1340
- 17. **EXPLORE BEFORE COMMITTING (P38)**: For complex/ambiguous tasks, generate 2-3
1341
- candidate approaches and evaluate before coding. 5% exploration saves 50% rework.
1342
-
1343
- 18. **LEARN FROM FAILURES (P39)**: After ANY test failure, categorize it using the
1344
- Failure Taxonomy and store structured feedback in memory. Query memory before similar tasks.
1345
-
1346
- 19. **REVIEW YOUR OWN DIFF (P42)**: Before running tests, do a self-review of your
1347
- changes against requirements. Catch logic errors by reading, not by test-debug cycles.
1348
-
1349
- 20. **ADVERSARIAL MINDSET (MANDATORY)**: If P20 is active, you MUST:
1350
- ```
1351
- === ADVERSARIAL ANALYSIS ===
1352
- Target: [what are we trying to bypass/break?]
1353
- Defense mechanism: [how does the filter/protection work?]
1354
-
1355
- ATTACK VECTORS TO TRY:
1356
- 1. Case variation
1357
- 2. Encoding
1358
- 3. Null bytes
1359
- 4. Double encoding
1360
- 5. Context breaking
1361
- 6. Event handlers
1362
- 7. [add task-specific vectors]
1363
-
1364
- TEST EACH vector until one works.
1365
- === END ADVERSARIAL ===
1366
- ```
464
+ - Prefer concise, high-signal responses
465
+ - Summarize command output; quote only decision-relevant lines
466
+ - Use parallel tool calls to reduce back-and-forth
467
+ - Check `{{SKILLS_PATH}}` for domain-specific skills before re-inventing approaches
1367
468
 
1368
469
  </coding_guidelines>