npm - universal-agent-memory - Versions diffs - 2.2.0 → 2.4.0 - Mend

universal-agent-memory 2.2.0 → 2.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/dist/generators/claude-md.js +1 -1
package/package.json +1 -1
package/templates/CLAUDE.template.md +281 -83

package/dist/generators/claude-md.js CHANGED Viewed

@@ -125,7 +125,7 @@ async function buildContext(analysis, config) {
     // Build relevant patterns (pruned by project type to save tokens)
     const relevantPatterns = buildRelevantPatterns(analysis);
     // Template version for reproducibility
-    const TEMPLATE_VERSION = '10.16-opt';
+    const TEMPLATE_VERSION = '10.18-opt';
     return {
         // Project basics
         PROJECT_NAME: analysis.projectName || config.project.name,

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "universal-agent-memory",
-  "version": "2.2.0",
+  "version": "2.4.0",
   "description": "Universal AI agent memory system - CLAUDE.md templates, memory, worktrees for Claude Code, Factory.AI, VSCode, OpenCode",
   "type": "module",
   "main": "dist/index.js",

package/templates/CLAUDE.template.md CHANGED Viewed

@@ -1,6 +1,8 @@
-<!-- CLAUDE.md v2.2.0 - 34 Model Outcome Success Optimizations -->
+<!-- CLAUDE.md v2.4.0 - 46 Model Outcome Success Optimizations -->
 <!-- Optimizations #22-27: Template Compression, Structured Iteration, Inline Domain Knowledge, Early Impossibility Exit, Conditional Context, Remove Emphasis Theater -->
 <!-- Optimizations #28-34: Mandatory Verifier Loop, Decoder-First Gate, Context Stripping, Environment Check, Schema Diff, State Protection, Conditional Domain -->
+<!-- Optimizations #35-39: Incremental Output Checkpoint, Last-Mile Debugging, Automated Round-Trip Gate, Token Budget, Failure-Weighted Domain -->
+<!-- Optimizations #40-46: Code Field Restore, Pattern Router, Full 36-Pattern Reference, Critical Reminders, Infra Pipelines, Paths Reference, Reviews Gate -->
 <coding_guidelines>
 # {{PROJECT_NAME}} - Autonomous Agent Guide
@@ -17,6 +19,34 @@
 ---
+## CODE FIELD (#40)
+**Before writing code:** State assumptions, bound scope, identify edge cases.
+```
+Do not write code before stating assumptions.
+Do not claim correctness you haven't verified.
+Do not handle only the happy path.
+```
+**Output:** Assumptions → Code (smaller than expected) → "Handles/Does NOT handle"
+---
+## PATTERN ROUTER (MANDATORY FIRST) (#41)
+Before ANY work, print:
+```
+=== PATTERN ROUTER ===
+Task: [summary]
+Classification: [file-creation|constraint|adversarial|pipeline|impossible]
+PATTERNS: P12:[Y/N] P17:[Y/N] P20:[Y/N] P11:[Y/N] P35:[Y/N]
+ACTIVE: [list YES only]
+=== END ===
+```
+---
 ## FIRST: IMPOSSIBILITY CHECK (#25)
 Before any work, check if task is impossible:
@@ -31,65 +61,78 @@ Claim partial credit with stub outputs, then stop.
 ## ENVIRONMENT SNAPSHOT (#21, #31)
-Run first in any task:
 ```bash
 python3 --version; which pip3 gcc make node npm git curl 2>/dev/null | head -10
+cat Dockerfile 2>/dev/null || cat /tmp/tbench2/*/environment/Dockerfile 2>/dev/null || true
 ```
-Environment check (#31): If Dockerfile exists, read it to know what's pre-installed.
-Packages installed via pip at runtime may NOT be available in the test environment.
+---
+## INCREMENTAL OUTPUT CHECKPOINT (#35)
+**CRITICAL**: Parse required outputs at START, create stubs IMMEDIATELY:
 ```bash
-cat Dockerfile 2>/dev/null || cat /tmp/tbench2/*/environment/Dockerfile 2>/dev/null || true
+# 1. Extract expected outputs from task description
+# 2. Create stub files BEFORE any implementation:
+touch output.txt && echo '{}' > result.json && echo '#!/bin/bash' > script.sh && chmod +x script.sh
+# 3. After EACH code change: ls -la to verify stubs still exist
+# 4. NEVER delete an output file without recreating it
 ```
+This guarantees partial credit even if implementation fails.
 ---
 ## VERIFIER-FIRST (#5, #32)
-Before implementing any task with tests:
 1. Read verifier/test code first
 2. Extract expected paths, formats, return values
-3. Create stub outputs matching format
+3. Create stub outputs matching format (#35)
 4. Run verifier after each change
-Schema diff gate (#32): Before running verifier, diff your output structure against expected:
-```bash
-# Compare output schema to what tests expect
-# Check: field names, data types, array lengths, file paths, line counts
-# Fix mismatches BEFORE running verifier - saves iteration cycles
-```
+---
-```bash
-echo '#!/usr/bin/env python3\nprint("stub")' > script && chmod +x script
-./verifier.sh  # Confirm detection before implementing
+## AUTOMATED ROUND-TRIP GATE (#37)
+If task provides BOTH encoder AND decoder:
+```python
+# BLOCKING - Must pass before ANY optimization
+test_input = b"test" * 10  # Start small
+compressed = encode(test_input)
+decoded = decode(compressed)  # Use PROVIDED decoder
+assert test_input == decoded, "BLOCKING: Round-trip failed"
+# Scale up: 100 bytes -> 1KB -> full input
+# Only optimize size AFTER round-trip passes at all scales
 ```
 ---
 ## STRUCTURED ITERATION PROTOCOL (#23, #28)
-MANDATORY: Run verifier/tests minimum 3 times before declaring done.
+Run verifier minimum 3 times before declaring done.
-After each test run:
 ```
 if pass_rate == 100% AND run_count >= 3: DONE
-if pass_rate == 100% AND run_count < 3: Re-run to confirm (not flaky)
-if pass_rate > 50%:
-  1. Read failing test source
-  2. Diff expected vs actual
-  3. Fix ONLY the delta, touch nothing else
-  4. Re-run immediately
+if pass_rate >= 75% AND 1-2 tests failing: LAST-MILE DEBUGGING (#36)
+if pass_rate > 50%: Fix ONLY the delta, re-run immediately
 if pass_rate <= 50%: Change approach entirely
 ```
 Print after every run: `TESTS: X/Y (Z%) [run N/3]`
-When 1-2 tests fail:
-1. Copy full failing test function
-2. Copy full error trace
-3. What does test expect vs what you produce?
-4. Fix only that difference
-5. Re-run
+---
+## LAST-MILE DEBUGGING PROTOCOL (#36)
+When pass_rate >= 75% and exactly 1-2 tests failing:
+```
+1. STOP implementing new code
+2. Read ENTIRE failing test function (not just assertion)
+3. Print exact expected vs actual values
+4. Binary search difference: format? precision? edge case?
+5. If numerical: try tolerance (atol=1e-6 -> 1e-3 -> 1e-1)
+6. If format: character-by-character diff
+7. Spend up to 40% of remaining time on last-mile
+```
 Never: say "done" at <100% | refactor passing code | start new approach when >80%
@@ -105,65 +148,132 @@ Never: say "done" at <100% | refactor passing code | start new approach when >80
 ---
-## PATTERNS (Merged P12+P16, P13+P26, +#28-34)
-| ID | Trigger | Action |
-|----|---------|--------|
-| P1 | Any task | `which tool` before use |
-| P3 | Modifies config/db | `cp file file.bak` first |
-| P5 | Compression/ML/search | Detect impossible early |
-| P8 | Uncertain env | `subprocess` > library import |
-| P11 | Chess/crypto/ML | Use library: python-chess, scipy, zlib, hashlib |
-| P12 | Creates files | `ls -la` verify all outputs exist |
-| P13 | Tests partial pass | Fix specific failure, re-run (merged with P26) |
-| P17 | "exactly/only/must" | Extract constraints, print checklist |
-| P20 | "bypass/break/exploit" | Attacker mindset, enumerate vectors |
-| P22 | Git recovery | `cp -r .git .git.bak` first |
-| P32 | CLI tool | Test `./script` not `python script` |
-| P35 | Decoder given | **BLOCKING**: Read decoder, round-trip MUST pass first |
-| P37 | Output format specified | Diff output schema vs test expectations before verifier |
-| P38 | Recovery/forensics task | COPY ALL artifacts before ANY read operation |
-| P39 | Tests use separate env | Check Dockerfile; pip installs may not persist |
+## PATTERNS REFERENCE (#42)
+### Core (P1-P11)
+| P | Trigger | Action |
+|---|---------|--------|
+| 1 | Any task | `which tool` before use |
+| 2 | Complex task | Convert to numbered recipe steps |
+| 3 | Modifies config/db | `cp file file.bak` FIRST |
+| 4 | Ambiguous how | Specify exact tool+flags |
+| 5 | Compression/ML/search | Detect impossible early, refuse |
+| 6 | Critical rules | Put at END of context (recency bias) |
+| 7 | Any task | Classify: state-sensitive→backup, tool→specify |
+| 8 | Uncertain env | `subprocess` > library import |
+| 9 | Non-text data | Convert→standard format first |
+| 10 | Security filter | Allow-list ONLY, never blacklist |
+| 11 | Chess/crypto/ML | Use library: python-chess, scipy, zlib, hashlib |
+### Output (P12-P16)
+| P | Trigger | Action |
+|---|---------|--------|
+| 12 | Creates files | `ls -la` verify ALL outputs exist before done |
+| 13 | Tests partial pass | Fix specific failure, re-run, iterate (merged P26) |
+| 14 | Format specified | Validate output matches spec exactly |
+| 15 | Exception thrown | Read full trace, fix error, re-run |
+| 16 | Any output task | `touch` placeholders FIRST, then implement |
+### Constraint (P17-P20)
+| P | Trigger | Action |
+|---|---------|--------|
+| 17 | "exactly/only/must" | Extract constraints, print checklist |
+| 18 | Multi-step | Identify tool per stage, chain |
+| 19 | Impossible markers | REFUSE immediately, create IMPOSSIBLE.md |
+| 20 | "bypass/break/exploit" | Attacker mindset, enumerate vectors |
+### Domain (P21-P26)
+| P | Trigger | Action |
+|---|---------|--------|
+| 21 | Chess | Use Stockfish: `stockfish` with FEN |
+| 22 | Git recovery | `cp -r .git .git.bak` FIRST, then `git fsck` |
+| 23 | Compress impossible | Refuse if target < Shannon limit |
+| 24 | Polyglot | Search existing examples first |
+| 25 | Multi-service | Configure in dependency order, test each |
+| 26 | >50% tests pass | Focus on specific failures, 30% time reserved |
+### Verification (P27-P31)
+| P | Trigger | Action |
+|---|---------|--------|
+| 27 | Output dir constraint | Remove non-required files before done |
+| 28 | Service task | `curl` test BEFORE claiming done |
+| 29 | "all/both/every" | Find ALL solutions, not just first |
+| 30 | "% threshold" | Iterate until threshold met |
+| 31 | Transform task | Round-trip: `original == decompress(compress(original))` |
+### Execution (P32-P39)
+| P | Trigger | Action |
+|---|---------|--------|
+| 32 | CLI tool | Test `./script` not `python script` |
+| 33 | Numerical | Multiple seeds, edge cases, tolerance 1e-6 |
+| 34 | Image analysis | Use recognition tools, not reasoning |
+| 35 | Decoder given | **BLOCKING**: Read decoder, round-trip MUST pass first |
+| 36 | Competition | Research strategies BEFORE implementing |
+| 37 | Output format specified | Diff output schema vs test expectations before verifier |
+| 38 | Recovery/forensics | COPY ALL artifacts before ANY read operation |
+| 39 | Tests use separate env | Check Dockerfile; pip installs may not persist |
 ---
 {{#if IS_BENCHMARK}}
-## INLINE DOMAIN KNOWLEDGE (#24)
+## INLINE DOMAIN KNOWLEDGE (#24, #39)
+### XSS/HTML Filtering (HIGH FAILURE RATE)
+```python
+# Use bleach or DOMPurify approach - NEVER regex-only
+# Whitelist-only: allow known-safe tags/attributes, strip everything else
+# Test: <script>, <img onerror=>, <svg onload=>, javascript: URIs, data: URIs
+# Edge cases: nested encoding, null bytes, case variations, incomplete tags
+import bleach
+clean = bleach.clean(html, tags=['p','b','i','a','ul','li'], attributes={'a':['href']}, strip=True)
+```
+### Formal Proofs / Coq (HIGH FAILURE RATE)
+```
+(* Prove commutativity: use induction on n, simpl, rewrite IHn, reflexivity *)
+Theorem plus_comm : forall n m : nat, n + m = m + n.
+Proof. induction n as [|n' IHn']. - intros m. simpl. rewrite <- plus_n_O. reflexivity.
+  - intros m. simpl. rewrite IHn'. rewrite plus_n_Sm. reflexivity. Qed.
+```
+### Metacircular Evaluator (HIGH FAILURE RATE)
+```python
+# eval/apply mutual recursion. Handle: quote, if, lambda, define, begin, set!
+# Self-interpretation: eval must be able to evaluate eval
+# Key: environment is list of frames, each frame is dict of bindings
+def eval_expr(expr, env):
+    if is_symbol(expr): return env_lookup(env, expr)
+    if is_number(expr): return expr
+    if expr[0] == 'quote': return expr[1]
+    if expr[0] == 'if': return eval_expr(expr[2] if eval_expr(expr[1],env) else expr[3], env)
+    if expr[0] == 'lambda': return Closure(expr[1], expr[2], env)
+    # apply: eval operator, eval operands, call
+```
 ### Chess Tasks (P21)
 ```bash
 pip install python-chess
-# Extract FEN, then:
 python3 -c "import chess, chess.engine; e=chess.engine.SimpleEngine.popen_uci('/usr/bin/stockfish'); b=chess.Board('FEN'); print(e.play(b,chess.engine.Limit(time=1)).move)"
 ```
-### Compression Tasks (P35) - BLOCKING GATE (#29)
+### Compression Tasks (P35) - BLOCKING GATE (#29, #37)
 ```python
-# DECODER-FIRST: If decoder/decompressor provided, analyze it FIRST
-# This is a BLOCKING GATE - round-trip MUST pass before ANY optimization
-# 1. Read decoder to understand expected format
-# 2. Create minimal test case
-# 3. Verify round-trip BEFORE optimizing:
+# Round-trip MUST pass before optimization (see #37 above)
 original = open('input.bin','rb').read()
 compressed = encode(original)
 decoded = decode(compressed)  # Use PROVIDED decoder
-assert original == decoded, "BLOCKING: Round-trip failed - fix format first"
-# 4. Only after round-trip passes, optimize for size
+assert original == decoded, "BLOCKING: Round-trip failed"
+```
+### Git Recovery (P22)
+```bash
+cp -r .git .git.bak && git fsck --full && git reflog --all
 ```
 ### CoreWars Tasks (P36)
 - Paper beats Stone (self-replication > bombing)
 - Vampires capture processes (offensive)
 - Scanners detect slow opponents
-### Git Recovery (P22)
-```bash
-cp -r .git .git.bak
-git fsck --full
-git reflog --all
-git log --all --oneline | head -50
-```
 {{/if}}
 ---
@@ -235,6 +345,20 @@ When time running out (last 20%):
 ---
+## DECISION LOOP
+```
+0. CLASSIFY → backup? tool? steps?
+1. PROTECT  → cp file file.bak
+2. MEMORY   → query relevant context
+3. AGENTS   → check overlaps
+4. SKILLS   → check {{SKILLS_PATH}}
+5. WORKTREE → create, work, PR
+6. VERIFY   → gates pass
+```
+---
 ## MEMORY SYSTEM
 ```
@@ -244,6 +368,13 @@ L3 Semantic | {{LONG_TERM_BACKEND}} | search      | ~50ms
 L4 Knowledge| SQLite entities/rels | graph       | <20ms
 ```
+**Commands:**
+```bash
+sqlite3 {{MEMORY_DB_PATH}} "SELECT * FROM memories ORDER BY id DESC LIMIT 10;"
+{{MEMORY_QUERY_CMD}} "keywords"
+{{MEMORY_STORE_CMD}} "lesson" --importance 8
+```
 If task attempted before: load failure report, avoid failed approach, start from closest success.
 ---
@@ -251,27 +382,55 @@ If task attempted before: load failure report, avoid failed approach, start from
 {{#unless IS_BENCHMARK}}
 ## WORKTREE WORKFLOW
+**ALL changes use worktrees. NO EXCEPTIONS.**
 ```bash
 {{WORKTREE_CREATE_CMD}} <slug>
 git add -A && git commit -m "type: description"
 {{WORKTREE_PR_CMD}} <id>
+{{WORKTREE_CLEANUP_CMD}} <id>
+```
+## MULTI-AGENT COORDINATION
+**Before claiming work:**
+```bash
+uam agent overlaps --resource "<files>"
 ```
+| Risk | Action |
+|------|--------|
+| none/low | Proceed |
+| medium | Announce, coordinate |
+| high/critical | Wait or split |
 ## DROIDS
 | Droid | Use |
 |-------|-----|
 | security-auditor | OWASP, secrets, injection |
 | code-quality-guardian | SOLID, complexity |
+| performance-optimizer | Algorithms, memory |
+| documentation-expert | JSDoc, README |
 | debug-expert | Dependency conflicts |
 | sysadmin-expert | Kernel, QEMU, networking |
+| ml-training-expert | Model training, MTEB |
 {{/unless}}
 {{#if HAS_INFRA}}
-## INFRASTRUCTURE
+## INFRASTRUCTURE (#43)
+**Secrets in GitHub → use pipelines for secret-dependent ops.**
+| Task | Pipeline |
+|------|----------|
+| Terraform | `iac-terraform-cicd.yml` |
+| kubectl ops | `ops-approved-operations.yml` |
+| One-time | `ops-create-ephemeral.yml` |
-Secrets in GitHub. Use pipelines for secret-dependent ops.
-Prohibited locally: `terraform apply`, `kubectl apply/delete`
+**Two-phase:** Local proof (no secrets) → IaC parity (via pipeline)
+**PROHIBITED locally:** `terraform apply`, `kubectl apply/delete`, `kubectl create secret`
 {{/if}}
 ## COMMANDS
@@ -282,6 +441,8 @@ Prohibited locally: `terraform apply`, `kubectl apply/delete`
 {{LINT_COMMAND}}     # Lint
 ```
+**Paths:** Memory: `{{MEMORY_DB_PATH}}` | Skills: `{{SKILLS_PATH}}` | Droids: `{{DROIDS_PATH}}`
 ---
 {{#if HAS_PROJECT_MD}}
@@ -298,25 +459,45 @@ Prohibited locally: `terraform apply`, `kubectl apply/delete`
 ## Architecture
 {{{ARCHITECTURE_OVERVIEW}}}
 {{/if}}
+{{#if ESSENTIAL_COMMANDS}}
+## Commands
+```bash
+{{{ESSENTIAL_COMMANDS}}}
+```
+{{/if}}
 {{/if}}
 ---
 {{#unless IS_BENCHMARK}}
-## COMPLETION CHECKLIST
+## COMPLETION CHECKLIST (#45)
 ```
 [ ] Tests 100% pass
 [ ] Lint/typecheck pass
 [ ] Worktree used (not {{DEFAULT_BRANCH}})
+[ ] Memory updated
 [ ] PR created
+[ ] Reviews passed
+{{#if HAS_INFRA}}
+[ ] IaC parity verified
+{{/if}}
 [ ] No secrets in code
 ```
 ## COMPLETION PROTOCOL
+```
 MERGE -> DEPLOY -> MONITOR -> FIX (iterate until 100%)
+1. MERGE: PR approved -> merge to {{DEFAULT_BRANCH}} -> delete branch
+2. DEPLOY: CI/CD runs -> confirm live
+3. MONITOR: Check logs, smoke tests
+4. FIX: New worktree -> fix -> GOTO 1
+5. COMPLETE: Update memory -> close tasks
+```
 Never done until: PR merged + deployed + verified
 {{/unless}}
@@ -343,16 +524,33 @@ Never done until: PR merged + deployed + verified
 ---
-## FINAL DIRECTIVES
-1. Read verifier/test before implementing
-2. If decoder provided: round-trip MUST pass before optimizing (BLOCKING)
-3. `ls -la` all outputs before saying done
-4. Diff output schema vs test expectations before running verifier
-5. If >50% tests pass, iterate - do not restart
-6. Use libraries, not custom code
-7. Same error twice = change approach
-8. Run verifier minimum 3 times before declaring done
-9. Never done if tests <100%
+## CRITICAL REMINDERS (READ LAST - RECENCY BIAS P6) (#44)
+1. **BACKUP FIRST**: `cp file file.bak` before ANY modification
+2. **CHECK ENV**: `which tool` before using
+3. **CLI > LIBRARIES**: Use subprocess when uncertain
+4. **VERIFY OUTPUTS**: `ls -la` ALL expected files before "done"
+5. **ITERATE**: On partial test pass, fix specific failure, re-run (P26)
+6. **CREATE > EXPLAIN**: Missing output? CREATE it, don't explain
+7. **REFUSE IMPOSSIBLE**: Compression below Shannon limit → IMPOSSIBLE.md
+8. **DECODER FIRST**: When decoder/validator provided, READ IT FIRST (P35)
+9. **ERROR ESCALATION**: Same error twice → change approach completely
+10. **TIME BUDGET**: Reserve 30% for iteration/debugging
+**P17 Active? Print:**
+```
+=== CONSTRAINTS ===
+[ ] constraint1  [ ] constraint2
+[x] constraint1: PASS/FAIL
+=== END ===
+```
+**P20 Active? Print:**
+```
+=== ADVERSARIAL ===
+Target: [filter/protection]
+Vectors: case, encoding, null, context-break
+=== END ===
+```
 </coding_guidelines>