universal-agent-memory 2.8.1 → 2.9.0

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "universal-agent-memory",
3
- "version": "2.8.1",
3
+ "version": "2.9.0",
4
4
  "description": "Universal AI agent memory system - CLAUDE.md templates, memory, worktrees for Claude Code, Factory.AI, VSCode, OpenCode",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -1,5 +1,5 @@
1
1
  <!--
2
- CLAUDE.md Universal Template - v10.18-opt
2
+ CLAUDE.md Universal Template - v10.19-opt
3
3
 
4
4
  Core Variables:
5
5
  {{PROJECT_NAME}}, {{DESCRIPTION}}, {{DEFAULT_BRANCH}}, {{STRUCTURE_DATE}}
@@ -163,7 +163,7 @@ Secrets live in GitHub - use pipelines for secret-dependent operations.
163
163
 
164
164
  ## UNIVERSAL AGENT PATTERNS
165
165
 
166
- **Thirty-six patterns proven to improve LLM agent reliability. Apply to ALL task execution.**
166
+ **Forty-two patterns proven to improve LLM agent reliability. Apply to ALL task execution.**
167
167
 
168
168
  ### Pattern 1: Environment Isolation
169
169
  Code installed during execution may NOT persist across phases (agent -> test -> deploy).
@@ -206,7 +206,7 @@ LLMs weight information at the END of context more heavily.
206
206
  - Repeat critical instructions if essential
207
207
 
208
208
  ### Pattern 7: Task Classification
209
- Different task types require different strategies.
209
+ Different task types require different strategies. See also P40 (Adaptive Reasoning Depth).
210
210
 
211
211
  | Category | Strategy |
212
212
  |----------|----------|
@@ -214,6 +214,8 @@ Different task types require different strategies.
214
214
  | Recipe-following | Step-by-step commands |
215
215
  | Tool-dependent | Specify exact tool + flags |
216
216
  | Research/exploration | Parallel searches |
217
+ | Complex/ambiguous | P38 Branching Exploration |
218
+ | Multi-file change | P41 Atomic Task Loop |
217
219
 
218
220
  ### Pattern 8: CLI over Libraries
219
221
  When environment dependencies are uncertain, prefer subprocess + CLI over library imports.
@@ -495,6 +497,232 @@ Competitive tasks benefit from researching domain-specific winning strategies.
495
497
 
496
498
  ---
497
499
 
500
+ ## ADVANCED REASONING PATTERNS
501
+
502
+ **Six patterns derived from state-of-the-art LLM optimization research (2025-2026). Address reasoning depth, self-verification, branching exploration, feedback grounding, and task atomization.**
503
+
504
+ ### Pattern 37: Pre-Implementation Verification (PIV)
505
+ **CRITICAL: Prevents wrong-approach waste — the #1 cause of wasted compute.**
506
+
507
+ After planning but BEFORE writing any code, explicitly verify your approach:
508
+
509
+ **Detection**: Any implementation task (always active for non-trivial changes)
510
+
511
+ **Protocol**:
512
+ ```
513
+ === PRE-IMPLEMENTATION VERIFY ===
514
+ 1. ROOT CAUSE: Does this approach address the actual root cause, not a symptom?
515
+ 2. EXISTING TESTS: Will this break any existing passing tests?
516
+ 3. SIMPLER PATH: Is there a simpler approach I'm overlooking?
517
+ 4. ASSUMPTIONS: What am I assuming about the codebase that I haven't verified?
518
+ 5. SIDE EFFECTS: What else does this change affect?
519
+ === VERIFIED: [proceed/revise] ===
520
+ ```
521
+
522
+ **If ANY answer raises doubt**: STOP. Re-read the problem. Revise approach before coding.
523
+
524
+ *Research basis: CoT verification (+4.3% accuracy), Reflexion framework (+18.5%), SEER adaptive reasoning (+4-9%)*
525
+
526
+ ### Pattern 38: Branching Exploration (BE)
527
+ For complex or ambiguous problems, explore multiple approaches before committing.
528
+
529
+ **Detection**: Problem has multiple valid approaches, ambiguous requirements, or high complexity
530
+
531
+ **Protocol**:
532
+ 1. **Generate 2-3 candidate approaches** (brief description, not full implementation)
533
+ 2. **Evaluate each** against: simplicity, correctness likelihood, test-compatibility, side-effect risk
534
+ 3. **Select best** with explicit reasoning
535
+ 4. **Commit fully** to selected approach — no mid-implementation switching
536
+ 5. **If selected approach fails**: backtrack to step 1, eliminate failed approach, try next
537
+
538
+ **NEVER**: Start coding the first approach that comes to mind for complex problems.
539
+ **ALWAYS**: Spend 5% of effort exploring alternatives to save 50% on wrong-path recovery.
540
+
541
+ *Research basis: MCTS-guided code generation (RethinkMCTS: 70%→89% pass@1), Policy-Guided Tree Search*
542
+
543
+ ### Pattern 39: Execution Feedback Grounding (EFG)
544
+ Learn from test failures systematically — don't just fix, understand and remember.
545
+
546
+ **Detection**: Any test failure or runtime error during implementation
547
+
548
+ **Protocol**:
549
+ 1. **Categorize the failure** using the Failure Taxonomy (see below)
550
+ 2. **Identify root cause** (not just the symptom the error message shows)
551
+ 3. **Fix with explanation**: What was wrong, why, and what the fix addresses
552
+ 4. **Store structured feedback** in memory:
553
+ ```bash
554
+ sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'failure_analysis','type:<category>|cause:<root_cause>|fix:<what_fixed>|file:<filename>');"
555
+ ```
556
+ 5. **Query before similar tasks**: Before implementing, check memory for past failures in same area
557
+
558
+ **Failure Taxonomy** (use for categorization):
559
+ | Type | Description | Recovery Strategy |
560
+ |------|-------------|-------------------|
561
+ | `dependency_missing` | Import/module not found | Install or use stdlib alternative |
562
+ | `wrong_approach` | Fundamentally incorrect solution | P38 Branching - try different approach |
563
+ | `format_mismatch` | Output doesn't match expected format | P14 OFV - re-read spec carefully |
564
+ | `edge_case` | Works for happy path, fails on edge | Add boundary checks, test with extremes |
565
+ | `state_mutation` | Unexpected side effect on shared state | Isolate mutations, use copies |
566
+ | `concurrency` | Race condition or timing issue | Add locks, use sequential fallback |
567
+ | `timeout` | Exceeded time/resource limit | Optimize algorithm, reduce scope |
568
+ | `environment` | Works locally, fails in target env | P1 Environment Isolation checks |
569
+
570
+ *Research basis: RLEF/RLVR (RL from Execution Feedback), verifiable rewards for coding agents*
571
+
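The P39 store-and-query steps can be sketched in Python as follows. This is a minimal illustration, not the package's implementation: it assumes a `memories` table with only the three columns named in the INSERT statement above, and the helper names `store_failure` and `recent_failures` are hypothetical.

```python
import sqlite3

# Failure categories copied from the P39 Failure Taxonomy table.
FAILURE_TYPES = {
    "dependency_missing", "wrong_approach", "format_mismatch", "edge_case",
    "state_mutation", "concurrency", "timeout", "environment",
}


def store_failure(conn, category, cause, fix, filename):
    """Insert one failure_analysis row in the pipe-delimited content format."""
    if category not in FAILURE_TYPES:
        raise ValueError(f"unknown failure type: {category}")
    content = f"type:{category}|cause:{cause}|fix:{fix}|file:{filename}"
    # Parameter binding avoids quoting problems from interpolating into SQL.
    conn.execute(
        "INSERT INTO memories (timestamp, type, content) "
        "VALUES (datetime('now'), 'failure_analysis', ?)",
        (content,),
    )
    conn.commit()


def recent_failures(conn, keyword, limit=5):
    """Return the newest failure_analysis contents that mention keyword."""
    rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE type = 'failure_analysis' AND content LIKE ? "
        "ORDER BY timestamp DESC, rowid DESC LIMIT ?",
        (f"%{keyword}%", limit),
    )
    return [row[0] for row in rows]


# Assumed minimal schema; the real memories table may have more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (timestamp TEXT, type TEXT, content TEXT)")
store_failure(conn, "edge_case", "empty input not handled",
              "added length check", "parser.py")
print(recent_failures(conn, "parser.py"))
```

Rejecting categories outside the taxonomy keeps later `LIKE '%<type>%'` lookups consistent with what was stored.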
572
+ ### Pattern 40: Adaptive Reasoning Depth (ARD)
573
+ Match reasoning effort to task complexity — don't over-think simple tasks or under-think hard ones.
574
+
575
+ **Detection**: Applied automatically at Pattern Router stage
576
+
577
+ **Complexity Classification**:
578
+ | Complexity | Indicators | Reasoning Protocol |
579
+ |-----------|------------|-------------------|
580
+ | **Simple** | Single file, clear spec, known pattern, <20 lines | Direct implementation. No exploration phase. |
581
+ | **Moderate** | Multi-file, some ambiguity, 20-200 lines | Plan-then-implement. State assumptions. P37 verify. |
582
+ | **Complex** | Cross-cutting concerns, ambiguous spec, >200 lines, unfamiliar domain | P38 explore → P37 verify → implement → P39 feedback loop. |
583
+ | **Research** | Unknown solution space, no clear approach | Research first (web search, codebase analysis) → P38 explore → implement iteratively. |
584
+
585
+ **Rule**: Never apply Complex-level reasoning to Simple tasks (wastes tokens). Never apply Simple-level reasoning to Complex tasks (causes failures).
586
+
587
+ *Research basis: SEER adaptive CoT, test-time compute scaling (2-3x gains from adaptive depth)*
588
+
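One way to read the P40 table as a decision rule is sketched below. The table lists indicators but does not say how to combine them, so the precedence used here (research first, then complex, then moderate) and the function name `classify_complexity` are assumptions made for illustration.

```python
# Reasoning protocols, paraphrased from the P40 table.
PROTOCOLS = {
    "simple": "Direct implementation",
    "moderate": "Plan, verify (P37), implement",
    "complex": "Explore (P38), verify (P37), implement, feedback (P39)",
    "research": "Search first, then explore (P38), implement iteratively",
}


def classify_complexity(files_changed, lines_changed,
                        ambiguous_spec=False, unknown_solution_space=False):
    """Map task indicators to a P40 complexity level (assumed precedence)."""
    if unknown_solution_space:
        return "research"
    if ambiguous_spec or lines_changed > 200:
        return "complex"
    if files_changed > 1 or lines_changed >= 20:
        return "moderate"
    return "simple"


level = classify_complexity(files_changed=3, lines_changed=80)
print(level, "->", PROTOCOLS[level])
```

Checking the most demanding category first means a task that matches several rows falls into the deepest applicable protocol, which errs toward under-thinking less often than over-thinking.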
589
+ ### Pattern 41: Atomic Task Loop (ATL)
590
+ For multi-step changes, decompose into atomic units with clean boundaries.
591
+
592
+ **Detection**: Task involves changes to 3+ files, or multiple independent concerns
593
+
594
+ **Protocol**:
595
+ 1. **Decompose** the task into atomic sub-tasks (each independently testable)
596
+ 2. **Order** by dependency (upstream changes first)
597
+ 3. **For each sub-task**:
598
+ a. Implement the change (single concern only)
599
+ b. Run relevant tests
600
+ c. Commit if tests pass
601
+ d. If context is getting long/confused, note progress and continue fresh
602
+ 4. **Final verification**: Run full test suite after all sub-tasks complete
603
+
604
+ **Atomicity rules**:
605
+ - Each sub-task modifies ideally 1-2 files
606
+ - Each sub-task has a clear pass/fail criterion
607
+ - Sub-tasks should not depend on uncommitted work from other sub-tasks
608
+ - If a sub-task fails, only that sub-task needs rework
609
+
610
+ *Research basis: Addy Osmani's continuous coding loop, context drift prevention research*
611
+
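The decompose, order, and loop steps of P41 can be sketched with a topological sort from Python's standard library. The `run_subtask` callback is hypothetical: it stands in for "implement, run relevant tests, commit" and returns whether the sub-task passed.

```python
from graphlib import TopologicalSorter


def order_subtasks(dependencies):
    """Order sub-tasks so that upstream changes come first.

    dependencies maps each sub-task to the set of sub-tasks it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


def run_atomic_loop(dependencies, run_subtask):
    """Run sub-tasks in dependency order and stop at the first failure.

    Returns (completed, failed): the sub-tasks that passed, and the one that
    needs rework (None when everything passed).
    """
    completed = []
    for name in order_subtasks(dependencies):
        if not run_subtask(name):
            return completed, name
        completed.append(name)
    return completed, None


# Hypothetical three-step change: schema first, then API, then UI.
deps = {"schema": set(), "api": {"schema"}, "ui": {"api"}}
print(order_subtasks(deps))
print(run_atomic_loop(deps, lambda name: name != "api"))
```

Stopping at the first failing sub-task mirrors the atomicity rule that only the failed sub-task needs rework; everything already committed stays untouched.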
612
+ ### Pattern 42: Critic-Before-Commit (CBC)
613
+ Review your own diff against requirements before running tests.
614
+
615
+ **Detection**: Any implementation about to be tested or committed
616
+
617
+ **Protocol**:
618
+ ```
619
+ === SELF-REVIEW ===
620
+ Diff summary: [what changed, in which files]
621
+
622
+ REQUIREMENT CHECK:
623
+ ☐ Does the diff address ALL requirements from the task?
624
+ ☐ Are there any unintended changes (debug prints, commented code, temp files)?
625
+ ☐ Does the code handle the error/edge cases mentioned in the spec?
626
+ ☐ Is the code consistent with surrounding style and conventions?
627
+ ☐ Would this diff make sense to a reviewer with no context?
628
+
629
+ ISSUES FOUND: [list or "none"]
630
+ === END REVIEW ===
631
+ ```
632
+
633
+ **If issues found**: Fix BEFORE running tests. Cheaper to catch logic errors by reading than by test-debug cycles.
634
+
635
+ *Research basis: Multi-agent reflection (actor+critic, +20% accuracy), RL^V unified reasoner-verifier*
636
+
637
+ ---
638
+
639
+ ## CONTEXT OPTIMIZATION
640
+
641
+ **Reduce token waste and improve response quality through intelligent context management.**
642
+
643
+ ### Progressive Context Disclosure
644
+ Not all patterns are needed for every task. The Pattern Router activates only relevant patterns.
645
+ - **Always loaded**: Pattern Router, Completion Gates, Error Recovery
646
+ - **Loaded on activation**: Only patterns flagged YES by router
647
+ - **Summarize, don't repeat**: When referencing prior work, summarize in 1-2 lines, don't paste full output
648
+
649
+ ### Context Hygiene
650
+ - **Prune completed context**: After a sub-task completes, don't carry its full debug output forward
651
+ - **Compress tool output**: Quote only the 2-3 lines that inform the next decision
652
+ - **Avoid context poisoning**: Don't include failed approaches in context unless actively debugging them
653
+ - **Reset on drift**: If responses become unfocused or repetitive, summarize progress and continue with clean context
654
+
655
+ ### Token Budget Awareness
656
+ | Task Type | Target Context Usage | Strategy |
657
+ |-----------|---------------------|----------|
658
+ | Simple fix | <10% of window | Direct implementation, minimal exploration |
659
+ | Feature implementation | 30-50% of window | Structured exploration, then focused implementation |
660
+ | Complex debugging | 50-70% of window | Deep investigation justified, but prune between attempts |
661
+ | Research/exploration | 20-40% of window | Broad search first, then narrow and deep |
662
+
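The budget table can be turned into a simple check, sketched below. The task-type keys and the function `budget_status` are illustrative names, the token counts are assumed to come from whatever the agent runtime reports, and the upper bounds are treated as inclusive.

```python
# Target share of the context window per task type, from the table above.
BUDGETS = {
    "simple_fix": (0.0, 0.10),
    "feature": (0.30, 0.50),
    "complex_debugging": (0.50, 0.70),
    "research": (0.20, 0.40),
}


def budget_status(task_type, tokens_used, window_size):
    """Report whether context usage is under, within, or over the target range."""
    low, high = BUDGETS[task_type]
    fraction = tokens_used / window_size
    if fraction > high:
        return "over"
    if fraction < low:
        return "under"
    return "within"


print(budget_status("feature", tokens_used=150_000, window_size=200_000))
```

An "over" result is a natural trigger for the Context Hygiene steps (prune, compress, or reset), while "under" is only informational.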
663
+ ---
664
+
665
+ ## SELF-IMPROVEMENT PROTOCOL
666
+
667
+ **The agent improves its own effectiveness over time by learning from outcomes.**
668
+
669
+ ### After Task Completion (Success or Failure)
670
+ 1. **Record outcome** with structured metadata:
671
+ ```bash
672
+ sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'outcome','task:<summary>|result:<pass/fail>|patterns_used:<list>|time_spent:<estimate>|failure_type:<category_or_none>',8);"
673
+ ```
674
+
675
+ 2. **If failure occurred**: Store in semantic memory for cross-session learning:
676
+ ```bash
677
+ {{MEMORY_STORE_CMD}} lesson "Failed on <task_type>: <what_went_wrong>. Fix: <what_worked>." --tags failure,<category>,<language> --importance 8
678
+ ```
679
+
680
+ 3. **If novel technique discovered**: Store as reusable pattern:
681
+ ```bash
682
+ {{MEMORY_STORE_CMD}} lesson "New technique for <domain>: <technique_description>. Use when <conditions>." --tags technique,<domain> --importance 9
683
+ ```
684
+
685
+ ### Before Starting Similar Tasks
686
+ Query memory for relevant past outcomes:
687
+ ```bash
688
+ sqlite3 ./{{MEMORY_DB_PATH}} "SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<relevant_keyword>%' ORDER BY timestamp DESC LIMIT 5;"
689
+ ```
690
+
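Because outcomes and failures are stored as pipe-delimited `key:value` strings, reading them back requires a small parser. The sketch below shows one possible encoding and decoding; the function names are hypothetical, and rejecting `|` inside values is an assumption, since the template does not define an escaping rule.

```python
def encode_record(fields):
    """Join a dict into the 'key:value|key:value' content format."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        if "|" in key or ":" in key or "|" in text:
            raise ValueError(f"field {key!r} contains a reserved character")
        parts.append(f"{key}:{text}")
    return "|".join(parts)


def parse_record(content):
    """Split a stored content string back into a dict of strings."""
    record = {}
    for part in content.split("|"):
        # Split on the first colon only, so values may contain colons.
        key, _, value = part.partition(":")
        record[key] = value
    return record


stored = encode_record({
    "task": "fix date parser",
    "result": "pass",
    "failure_type": "none",
})
print(stored)
print(parse_record(stored)["result"])
```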
691
+ ### Repo-Specific Learning
692
+ Over time, accumulate repository-specific patterns:
693
+ - Which test frameworks and assertions this repo uses
694
+ - Common failure modes in this codebase
695
+ - Preferred code style and naming conventions
696
+ - Architecture decisions and their rationale
697
+
698
+ Store these as high-importance semantic memories tagged with the repo name.
699
+
700
+ ---
701
+
702
+ ## CODE QUALITY HEURISTICS
703
+
704
+ **Apply to ALL generated code. Verify before committing.**
705
+
706
+ ### Pre-Commit Code Review Checklist
707
+ - [ ] Functions ≤ 30 lines (split if longer)
708
+ - [ ] No God objects or functions doing multiple unrelated things
709
+ - [ ] Names are self-documenting (no single-letter variables outside loops)
710
+ - [ ] Error paths handled explicitly (not just happy path)
711
+ - [ ] No debug prints, console.logs, or commented-out code left behind
712
+ - [ ] Consistent with surrounding code style (indentation, naming, patterns)
713
+ - [ ] No hardcoded values that should be constants or config
714
+ - [ ] Imports are minimal — only what's actually used
715
+
716
+ ### Code Smell Detection
717
+ If you notice any of these, fix before committing:
718
+ - **Duplicated logic** → Extract to shared function
719
+ - **Deep nesting (>3 levels)** → Early returns, extract helper
720
+ - **Boolean parameters** → Consider separate methods or options object
721
+ - **Magic numbers** → Named constants
722
+ - **Catch-all error handling** → Specific error types with appropriate responses
723
+
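Two of the smells listed above, deep nesting and magic numbers, are illustrated in the before-and-after sketch below. The `job` dictionary and both function names are made up for the example.

```python
MAX_RETRIES = 3  # named constant instead of a magic number


def can_retry_nested(job):
    """Before: three levels of nesting and a bare 3."""
    if job is not None:
        if job.get("enabled"):
            if job.get("attempts", 0) < 3:
                return True
    return False


def can_retry(job):
    """After: guard clauses return early, so the main check sits at one level."""
    if job is None:
        return False
    if not job.get("enabled"):
        return False
    return job.get("attempts", 0) < MAX_RETRIES


print(can_retry({"enabled": True, "attempts": 1}))
```

Both versions return the same result for every input; the refactor changes only readability and where the retry limit is defined.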
724
+ ---
725
+
498
726
  ## SESSION START PROTOCOL
499
727
 
500
728
  **EXECUTE IMMEDIATELY before any response:**
@@ -538,6 +766,7 @@ uam agent overlaps --resource "<files-or-directories>"
538
766
  | Performance | `performance-optimizer` | algorithms, memory, caching |
539
767
  | Documentation | `documentation-expert` | jsdoc, readme, api-docs |
540
768
  | Code quality | `code-quality-guardian` | complexity, naming, solid |
769
+ | Solution verification | self (P42 CBC) | diff review, requirement check |
541
770
 
542
771
  {{#if LANGUAGE_DROIDS}}
543
772
  ### Language Droids
@@ -587,13 +816,17 @@ uam agent overlaps --resource "<files-or-directories>"
587
816
  ## DECISION LOOP
588
817
 
589
818
  ```
590
- 0. CLASSIFY -> backup? tool? steps?
591
- 1. PROTECT -> cp file file.bak
592
- 2. MEMORY -> query relevant context
593
- 3. AGENTS -> check overlaps
594
- 4. SKILLS -> check {{SKILLS_PATH}}
595
- 5. WORKTREE -> create, work, PR
596
- 6. VERIFY -> gates pass
819
+ 0. CLASSIFY -> complexity? backup? tool? steps? (P40 Adaptive Depth)
820
+ 1. PROTECT -> cp file file.bak
821
+ 2. MEMORY -> query relevant context + past failures (P39)
822
+ 3. EXPLORE -> if complex: generate 2-3 approaches (P38)
823
+ 4. VERIFY -> pre-implementation check (P37)
824
+ 5. AGENTS -> check overlaps
825
+ 6. SKILLS -> check {{SKILLS_PATH}}
826
+ 7. WORKTREE -> create, work (P41 atomic tasks)
827
+ 8. REVIEW -> self-review diff (P42)
828
+ 9. TEST -> gates pass
829
+ 10. LEARN -> store outcome in memory (P39)
597
830
  ```
598
831
 
599
832
  ---
@@ -906,13 +1139,15 @@ echo "=== GATE 3: TEST VERIFICATION ==="
906
1139
  ☐ Tests pass
907
1140
  ☐ Lint/typecheck pass
908
1141
  ☐ Worktree used (not {{DEFAULT_BRANCH}})
909
- ☐ Memory updated
1142
+ ☐ Self-review completed (P42)
1143
+ ☐ Memory updated (outcome + lessons from P39)
910
1144
  ☐ PR created
911
1145
  ☐ Parallel reviews passed
912
1146
  {{#if HAS_INFRA}}
913
1147
  ☐ IaC parity verified
914
1148
  {{/if}}
915
1149
  ☐ No secrets in code
1150
+ ☐ No debug artifacts left (console.logs, commented code, temp files)
916
1151
  ```
917
1152
 
918
1153
  ---
@@ -959,12 +1194,15 @@ When a task provides a decoder, validator, or expected output format:
959
1194
  ## ERROR RECOVERY ESCALATION
960
1195
 
961
1196
  On any test failure or error:
962
- 1. **Read exact error message** - do not guess
963
- 2. **If same error twice**: change approach completely, do not retry same fix
964
- 3. **If dependency missing**: install it (`pip install`, `npm install`, `apt-get`)
965
- 4. **If permission denied**: use alternative path or `chmod`
966
- 5. **If timeout**: reduce scope, submit partial solution
967
- 6. **Never repeat a failed command** without modification
1197
+ 1. **Categorize** using P39 Failure Taxonomy (`dependency_missing`, `wrong_approach`, `format_mismatch`, `edge_case`, `state_mutation`, `concurrency`, `timeout`, `environment`)
1198
+ 2. **Read exact error message** - do not guess
1199
+ 3. **Query memory** for past failures of this type: `SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<type>%' LIMIT 3;`
1200
+ 4. **If same error twice**: change approach completely (P38 Branching), do not retry same fix
1201
+ 5. **If dependency missing**: install it (`pip install`, `npm install`, `apt-get`)
1202
+ 6. **If permission denied**: use alternative path or `chmod`
1203
+ 7. **If timeout**: reduce scope, submit partial solution
1204
+ 8. **Store failure** in memory after resolution (P39 protocol)
1205
+ 9. **Never repeat a failed command** without modification
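The "same error twice" and "never repeat a failed command" rules can be enforced with a little bookkeeping, sketched below. The class `RecoveryTracker` and its action labels are hypothetical; the escalation list above does not prescribe any particular data structure.

```python
class RecoveryTracker:
    """Track failed commands and repeated errors during one task."""

    def __init__(self):
        self.failed_commands = set()
        self.error_counts = {}

    def record_failure(self, command, error_message):
        """Remember a command that failed and the exact error it produced."""
        self.failed_commands.add(command)
        self.error_counts[error_message] = self.error_counts.get(error_message, 0) + 1

    def next_action(self, command, error_message):
        """Decide what to do before running command after error_message."""
        if self.error_counts.get(error_message, 0) >= 2:
            return "change_approach"   # same error twice: switch strategy (P38)
        if command in self.failed_commands:
            return "modify_command"    # never rerun a failed command unchanged
        return "run"


tracker = RecoveryTracker()
tracker.record_failure("pytest", "ImportError: no module named foo")
print(tracker.next_action("pytest", "ImportError: no module named foo"))
```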
968
1206
 
969
1207
  ---
970
1208
 
@@ -1015,6 +1253,7 @@ If timeout approaching: submit best partial solution rather than nothing.
1015
1253
  === PATTERN ROUTER ===
1016
1254
  Task: [one-line task summary]
1017
1255
  Classification: [file-creation | constraint-validation | adversarial | pipeline | impossible]
1256
+ Complexity: [simple | moderate | complex | research] (P40)
1018
1257
 
1019
1258
  SELECTED PATTERNS:
1020
1259
  - P12 (OEV): [YES/NO] - Does task require creating files?
@@ -1022,6 +1261,16 @@ If timeout approaching: submit best partial solution rather than nothing.
1022
1261
  - P20 (AT): [YES/NO] - Does task involve "bypass/break/exploit/filter"?
1023
1262
  - P11 (Pre-computed): [YES/NO] - Is this a known algorithm (crypto/chess/ML)?
1024
1263
  - P18 (MTP): [YES/NO] - Does task need multiple tools chained?
1264
+ - P37 (PIV): [YES/NO] - Non-trivial implementation needing plan verification?
1265
+ - P38 (BE): [YES/NO] - Multiple valid approaches or high ambiguity?
1266
+ - P41 (ATL): [YES/NO] - Changes span 3+ files or multiple concerns?
1267
+ - P42 (CBC): [YES/NO] - Implementation that will be committed?
1268
+
1269
+ REASONING DEPTH (per P40):
1270
+ - Simple → Direct implementation
1271
+ - Moderate → Plan, verify (P37), implement
1272
+ - Complex → Explore (P38), verify (P37), implement, feedback (P39)
1273
+ - Research → Search first, then explore (P38), implement iteratively
1025
1274
 
1026
1275
  ACTIVE PATTERNS: [list only YES patterns]
1027
1276
  === END ROUTER ===
@@ -1085,7 +1334,19 @@ If timeout approaching: submit best partial solution rather than nothing.
1085
1334
  15. **REFUSE IMPOSSIBLE**: If task violates information theory (compress 500MB to 5KB),
1086
1335
  REFUSE immediately. Create IMPOSSIBLE.md explaining why. Don't attempt.
1087
1336
 
1088
- 16. **ADVERSARIAL MINDSET (MANDATORY)**: If P20 is active, you MUST:
1337
+ 16. **VERIFY BEFORE CODING (P37)**: For moderate+ complexity tasks, print the
1338
+ PRE-IMPLEMENTATION VERIFY block. Catch wrong approaches before wasting tokens.
1339
+
1340
+ 17. **EXPLORE BEFORE COMMITTING (P38)**: For complex/ambiguous tasks, generate 2-3
1341
+ candidate approaches and evaluate before coding. 5% exploration saves 50% rework.
1342
+
1343
+ 18. **LEARN FROM FAILURES (P39)**: After ANY test failure, categorize it using the
1344
+ Failure Taxonomy and store structured feedback in memory. Query memory before similar tasks.
1345
+
1346
+ 19. **REVIEW YOUR OWN DIFF (P42)**: Before running tests, do a self-review of your
1347
+ changes against requirements. Catch logic errors by reading, not by test-debug cycles.
1348
+
1349
+ 20. **ADVERSARIAL MINDSET (MANDATORY)**: If P20 is active, you MUST:
1089
1350
  ```
1090
1351
  === ADVERSARIAL ANALYSIS ===
1091
1352
  Target: [what are we trying to bypass/break?]