universal-agent-memory 2.4.0 → 2.6.0
- package/package.json +1 -1
- package/templates/CLAUDE.template.md +101 -114
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "universal-agent-memory",
-"version": "2.
+"version": "2.6.0",
 "description": "Universal AI agent memory system - CLAUDE.md templates, memory, worktrees for Claude Code, Factory.AI, VSCode, OpenCode",
 "type": "module",
 "main": "dist/index.js",
package/templates/CLAUDE.template.md
CHANGED
@@ -1,8 +1,10 @@
-<!-- CLAUDE.md v2.
+<!-- CLAUDE.md v2.6.0 - 54 Model Outcome Success Optimizations -->
 <!-- Optimizations #22-27: Template Compression, Structured Iteration, Inline Domain Knowledge, Early Impossibility Exit, Conditional Context, Remove Emphasis Theater -->
 <!-- Optimizations #28-34: Mandatory Verifier Loop, Decoder-First Gate, Context Stripping, Environment Check, Schema Diff, State Protection, Conditional Domain -->
 <!-- Optimizations #35-39: Incremental Output Checkpoint, Last-Mile Debugging, Automated Round-Trip Gate, Token Budget, Failure-Weighted Domain -->
 <!-- Optimizations #40-46: Code Field Restore, Pattern Router, Full 36-Pattern Reference, Critical Reminders, Infra Pipelines, Paths Reference, Reviews Gate -->
+<!-- Optimizations #47-50: Stronger Round-Trip Enforcement, Numerical Tolerance Escalation, Pre-Hook Activation Signal, Token Savings -->
+<!-- Optimizations #51-54: Attention-Weighted Ordering, Conditional Benchmark Sections, Verifier Anchor, Near-Miss Escalation -->
 <coding_guidelines>
 
 # {{PROJECT_NAME}} - Autonomous Agent Guide
@@ -33,7 +35,7 @@ Do not handle only the happy path.
 
 ---
 
-## PATTERN ROUTER (MANDATORY FIRST) (#41)
+## PATTERN ROUTER (MANDATORY FIRST) (#41, #47, #50, #53)
 
 Before ANY work, print:
 ```
@@ -42,9 +44,16 @@ Task: [summary]
 Classification: [file-creation|constraint|adversarial|pipeline|impossible]
 PATTERNS: P12:[Y/N] P17:[Y/N] P20:[Y/N] P11:[Y/N] P35:[Y/N]
 ACTIVE: [list YES only]
+BLOCKING: [P35: decoder.py/decode exists? → MUST round-trip before optimize]
+VERIFIER: [cat verifier.sh test_*.py 2>/dev/null | head -20]
+HOOKS: [ls /tmp/*_strategies.txt /tmp/*_hints.txt 2>/dev/null | head -3]
 === END ===
 ```
 
+**#47 BLOCKING GATE**: If P35=YES (decoder provided), you CANNOT optimize until round-trip passes.
+**#53 VERIFIER**: Read verifier/tests BEFORE implementing. Extract expected outputs, formats, paths.
+**#50 HOOKS**: If hook files exist, READ THEM before implementing.
+
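The `HOOKS` line in the router and rule #50 amount to: list any strategy or hint files under `/tmp`, then read them before writing code. A minimal Python sketch of that check follows. Only the two glob patterns come from the diff; the function name `read_hooks`, its `limit` parameter, and the returned dictionary shape are illustrative assumptions.

```python
import glob

# Patterns taken from the HOOKS line in the template diff.
HOOK_PATTERNS = ("/tmp/*_strategies.txt", "/tmp/*_hints.txt")


def read_hooks(patterns=HOOK_PATTERNS, limit=3):
    """Return {path: text} for up to `limit` hook files (hypothetical helper)."""
    # Collect matches for every pattern, sort them, and keep the first few,
    # roughly mirroring `ls ... | head -3`.
    paths = sorted(p for pattern in patterns for p in glob.glob(pattern))[:limit]
    hooks = {}
    for path in paths:
        try:
            with open(path, encoding="utf-8", errors="replace") as fh:
                hooks[path] = fh.read()
        except OSError:
            # Unreadable files are skipped, like `2>/dev/null` in the shell form.
            continue
    return hooks


if __name__ == "__main__":
    for path, text in read_hooks().items():
        print(f"--- {path} ---")
        print(text)
```

An empty dictionary means no hook files were found, in which case the router's `HOOKS` field would simply be empty.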
 ---
 
 ## FIRST: IMPOSSIBILITY CHECK (#25)
@@ -91,61 +100,25 @@ This guarantees partial credit even if implementation fails.
 
 ---
 
-## AUTOMATED ROUND-TRIP GATE (#37)
+## AUTOMATED ROUND-TRIP GATE (#37, #47)
 
 If task provides BOTH encoder AND decoder:
 ```python
-# BLOCKING -
+# *** BLOCKING (#47) - NO optimization until this passes ***
+# Step 1: READ the provided decoder FIRST - understand its expected format
+# Step 2: Write encoder to MATCH that format exactly
+# Step 3: Test round-trip at increasing scales:
 test_input = b"test" * 10 # Start small
 compressed = encode(test_input)
-decoded = decode(compressed) # Use PROVIDED decoder
-assert test_input == decoded, "BLOCKING: Round-trip failed"
+decoded = decode(compressed) # Use PROVIDED decoder, NOT your own
+assert test_input == decoded, "BLOCKING: Round-trip failed - fix format before optimizing"
 # Scale up: 100 bytes -> 1KB -> full input
-#
+# Step 4: ONLY optimize size AFTER round-trip passes at ALL scales
 ```
+**If round-trip produces garbage**: Your encoder format doesn't match the decoder. Re-read decoder byte-by-byte.
 
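The template snippet above leaves `encode` and `decode` undefined and shows only the smallest scale. The sketch below shows one way the "increasing scales" step might look as a runnable gate. It is an illustration under stated assumptions: `zlib` stands in for the task's encoder and provided decoder, and the name `round_trip_gate` and its default sizes are hypothetical.

```python
import zlib


def round_trip_gate(encode, decode, full_input, sizes=(40, 100, 1024)):
    """Return True only if every prefix size and the full input survive encode -> decode.

    `encode` is the encoder under development; `decode` should be the PROVIDED
    decoder, never a home-grown one. Names and sizes here are illustrative.
    """
    # Small prefixes first, then the full input, so format bugs surface cheaply.
    checkpoints = [full_input[:n] for n in sizes if n < len(full_input)]
    checkpoints.append(full_input)
    for sample in checkpoints:
        if decode(encode(sample)) != sample:
            print(f"BLOCKING: round-trip failed at {len(sample)} bytes")
            return False
    return True


if __name__ == "__main__":
    # zlib is only a stand-in (assumption) for the task's real codec pair.
    payload = b"test" * 1000
    if round_trip_gate(zlib.compress, zlib.decompress, payload):
        print("round-trip OK at all scales: size optimization may begin")
```

Passing the two functions as arguments keeps the gate independent of any particular codec, so the same check can be re-run unchanged after each encoder revision.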
 ---
 
-## STRUCTURED ITERATION PROTOCOL (#23, #28)
-
-Run verifier minimum 3 times before declaring done.
-
-```
-if pass_rate == 100% AND run_count >= 3: DONE
-if pass_rate >= 75% AND 1-2 tests failing: LAST-MILE DEBUGGING (#36)
-if pass_rate > 50%: Fix ONLY the delta, re-run immediately
-if pass_rate <= 50%: Change approach entirely
-```
-
-Print after every run: `TESTS: X/Y (Z%) [run N/3]`
-
----
-
-## LAST-MILE DEBUGGING PROTOCOL (#36)
-
-When pass_rate >= 75% and exactly 1-2 tests failing:
-```
-1. STOP implementing new code
-2. Read ENTIRE failing test function (not just assertion)
-3. Print exact expected vs actual values
-4. Binary search difference: format? precision? edge case?
-5. If numerical: try tolerance (atol=1e-6 -> 1e-3 -> 1e-1)
-6. If format: character-by-character diff
-7. Spend up to 40% of remaining time on last-mile
-```
-
-Never: say "done" at <100% | refactor passing code | start new approach when >80%
-
----
-
-## TIME BUDGET
-
-| Phase | Allocation |
-|-------|------------|
-| Setup | 20% |
-| Implementation | 50% |
-| Iteration | 30% (reserved) |
-
 ---
 
 ## PATTERNS REFERENCE (#42)
@@ -205,7 +178,7 @@ Never: say "done" at <100% | refactor passing code | start new approach when >80
 | P | Trigger | Action |
 |---|---------|--------|
 | 32 | CLI tool | Test `./script` not `python script` |
-| 33 | Numerical | Multiple seeds, edge cases, tolerance 1e-6 |
+| 33 | Numerical | Multiple seeds, edge cases, tolerance escalation: 1e-6→1e-3→1e-1 |
 | 34 | Image analysis | Use recognition tools, not reasoning |
 | 35 | Decoder given | **BLOCKING**: Read decoder, round-trip MUST pass first |
 | 36 | Competition | Research strategies BEFORE implementing |
@@ -216,64 +189,30 @@ Never: say "done" at <100% | refactor passing code | start new approach when >80
 ---
 
 {{#if IS_BENCHMARK}}
-## INLINE DOMAIN KNOWLEDGE (#24, #39)
+## INLINE DOMAIN KNOWLEDGE (#24, #39, #50 token-optimized)
 
-### XSS
-
-
-# Whitelist-only: allow known-safe tags/attributes, strip everything else
-# Test: <script>, <img onerror=>, <svg onload=>, javascript: URIs, data: URIs
-# Edge cases: nested encoding, null bytes, case variations, incomplete tags
-import bleach
-clean = bleach.clean(html, tags=['p','b','i','a','ul','li'], attributes={'a':['href']}, strip=True)
-```
+### XSS Filtering
+`import bleach; clean = bleach.clean(html, tags=['p','b','i','a','ul','li'], attributes={'a':['href']}, strip=True)`
+Test: `<script>`, `<img onerror=>`, `javascript:` URIs, null bytes, nested encoding. NEVER regex-only.
 
-###
-
-(* Prove commutativity: use induction on n, simpl, rewrite IHn, reflexivity *)
-Theorem plus_comm : forall n m : nat, n + m = m + n.
-Proof. induction n as [|n' IHn']. - intros m. simpl. rewrite <- plus_n_O. reflexivity.
-- intros m. simpl. rewrite IHn'. rewrite plus_n_Sm. reflexivity. Qed.
-```
+### Coq Proofs
+Commutativity: induction on n, simpl, rewrite IHn, plus_n_O, plus_n_Sm, reflexivity.
 
-### Metacircular
-
-# eval/apply mutual recursion. Handle: quote, if, lambda, define, begin, set!
-# Self-interpretation: eval must be able to evaluate eval
-# Key: environment is list of frames, each frame is dict of bindings
-def eval_expr(expr, env):
-if is_symbol(expr): return env_lookup(env, expr)
-if is_number(expr): return expr
-if expr[0] == 'quote': return expr[1]
-if expr[0] == 'if': return eval_expr(expr[2] if eval_expr(expr[1],env) else expr[3], env)
-if expr[0] == 'lambda': return Closure(expr[1], expr[2], env)
-# apply: eval operator, eval operands, call
-```
-
-### Chess Tasks (P21)
-```bash
-pip install python-chess
-python3 -c "import chess, chess.engine; e=chess.engine.SimpleEngine.popen_uci('/usr/bin/stockfish'); b=chess.Board('FEN'); print(e.play(b,chess.engine.Limit(time=1)).move)"
-```
+### Metacircular Eval
+eval/apply recursion. Handle: quote, if, lambda, define, begin, set!. Env = list of frames (dicts).
 
-###
-
-
-
-
-decoded = decode(compressed) # Use PROVIDED decoder
-assert original == decoded, "BLOCKING: Round-trip failed"
-```
+### Chess (P21)
+`pip install python-chess && python3 -c "import chess,chess.engine; e=chess.engine.SimpleEngine.popen_uci('/usr/bin/stockfish'); b=chess.Board('FEN'); print(e.play(b,chess.engine.Limit(time=1)).move)"`
+
+### Compression (P35) - BLOCKING (#47)
+Read decoder FIRST. Round-trip MUST pass before optimizing. If garbage output: encoder format mismatches decoder.
 
 ### Git Recovery (P22)
-
-cp -r .git .git.bak && git fsck --full && git reflog --all
-```
+`cp -r .git .git.bak && git fsck --full && git reflog --all`
 
-### CoreWars
-
-
-- Scanners detect slow opponents
+### CoreWars (P36)
+Paper > Stone (self-replication > bombing). Vampires capture processes. Scanners detect slow opponents.
+Check hook: `cat /tmp/corewars_strategies.txt 2>/dev/null`
 {{/if}}
 
 ---
@@ -301,7 +240,15 @@ Print before done:
 
 ---
 
-##
+## ERROR RECOVERY
+
+1. Read exact error message
+2. Same error twice? Change approach
+3. Dependency missing? Install it
+4. Timeout approaching? Submit best partial
+
+{{#if IS_BENCHMARK}}
+## FAILURE TRIAGE (#4, #52)
 
 | Error | Fix |
 |-------|-----|
@@ -315,26 +262,13 @@ Print before done:
 
 Same error twice = change approach completely.
 
----
-
-## ERROR RECOVERY
-
-1. Read exact error message
-2. Same error twice? Change approach
-3. Dependency missing? Install it
-4. Timeout approaching? Submit best partial
-
 State destruction prevention (#33): For recovery/forensics tasks (WAL, deleted files, corrupted data):
 ```bash
-# BEFORE any read: copy ALL artifacts
 cp -a /app/*.db /tmp/backup/ 2>/dev/null
 cp -a /app/*.db-wal /tmp/backup/ 2>/dev/null
 cp -a /app/*.db-shm /tmp/backup/ 2>/dev/null
-# sqlite3 checkpoints WAL on open - backup FIRST
 ```
 
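The three `cp -a` lines assume `/tmp/backup/` already exists, and `2>/dev/null` hides the error if it does not. The following Python sketch performs the same "copy every artifact before anything opens the database" step while creating the destination first. The helper name `backup_artifacts` and its parameters are hypothetical; the file patterns are the ones in the shell snippet.

```python
import shutil
from pathlib import Path


def backup_artifacts(src_dir, dest_dir, patterns=("*.db", "*.db-wal", "*.db-shm")):
    """Copy matching files into dest_dir and return the copied paths (hypothetical helper)."""
    src, dest = Path(src_dir), Path(dest_dir)
    # Create the destination up front so a missing directory cannot fail silently.
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(src.glob(pattern)):
            if path.is_file():
                # copy2 keeps timestamps and permission bits where the platform allows.
                shutil.copy2(path, dest / path.name)
                copied.append(dest / path.name)
    return copied


if __name__ == "__main__":
    # Paths follow the template's example; adjust for the task at hand.
    for saved in backup_artifacts("/app", "/tmp/backup"):
        print(f"backed up: {saved}")
```

Checking that the returned list is non-empty before touching the originals gives an explicit signal that the backup actually happened.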
----
-
 ## PARTIAL CREDIT (#17)
 
 When time running out (last 20%):
@@ -343,6 +277,15 @@ When time running out (last 20%):
 3. One targeted fix for highest-likelihood failure
 4. Submit current state
 
+## TIME BUDGET (#52)
+
+| Phase | Allocation |
+|-------|------------|
+| Setup | 20% |
+| Implementation | 50% |
+| Iteration | 30% (reserved) |
+{{/if}}
+
 ---
 
 ## DECISION LOOP
@@ -524,6 +467,47 @@ Never done until: PR merged + deployed + verified
 
 ---
 
+## STRUCTURED ITERATION PROTOCOL (#23, #28, #51, #54)
+
+Run verifier minimum 3 times before declaring done.
+
+```
+if pass_rate == 100% AND run_count >= 3: DONE
+if 60% <= pass_rate < 100%: NEAR-MISS (#54) - you are CLOSE, do NOT change approach
+→ Read the 1-2 failing tests completely
+→ Fix ONLY what those specific tests check
+→ Re-run immediately
+if pass_rate >= 75% AND 1-2 tests failing: LAST-MILE DEBUGGING (#36)
+if pass_rate > 50%: Fix ONLY the delta, re-run immediately
+if pass_rate <= 50%: Change approach entirely
+```
+
+Print after every run: `TESTS: X/Y (Z%) [run N/3]`
+
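The branch table in the iteration protocol has overlapping conditions (a 90% pass rate with one failing test satisfies both the NEAR-MISS and LAST-MILE lines), so any executable form has to pick a precedence. The sketch below is one possible reading, not the template's definitive semantics: it assumes the narrower LAST-MILE condition is tested before NEAR-MISS, and that a 100% pass with fewer than three runs means "run again". A strict top-to-bottom reading would return NEAR-MISS wherever this sketch returns LAST-MILE. The function names and return labels are hypothetical.

```python
def next_action(passed, total, run_count, min_runs=3):
    """Map a verifier result to the protocol's next step (one possible reading)."""
    if total <= 0:
        raise ValueError("total must be positive")
    failing = total - passed
    if failing == 0:
        # 100% pass only counts as done after the minimum number of runs.
        return "DONE" if run_count >= min_runs else "RE-RUN"
    # Integer comparisons avoid floating-point edge cases at the thresholds.
    if passed * 100 >= 75 * total and failing <= 2:
        return "LAST-MILE"  # assumption: narrower rule checked first
    if passed * 100 >= 60 * total:
        return "NEAR-MISS"
    if passed * 100 > 50 * total:
        return "FIX-DELTA"
    return "CHANGE-APPROACH"


def status_line(passed, total, run_count, min_runs=3):
    """Format the per-run status line requested by the protocol."""
    pct = round(100 * passed / total)
    return f"TESTS: {passed}/{total} ({pct}%) [run {run_count}/{min_runs}]"


if __name__ == "__main__":
    print(status_line(3, 4, 1))
    print(next_action(3, 4, 1))
```

Keeping the thresholds in one function makes the chosen precedence explicit and easy to change if the intended ordering turns out to be different.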
+---
+
+## LAST-MILE DEBUGGING PROTOCOL (#36, #48)
+
+When pass_rate >= 75% and exactly 1-2 tests failing:
+```
+1. STOP implementing new code
+2. Read ENTIRE failing test function (not just assertion)
+3. Print exact expected vs actual values
+4. Binary search difference: format? precision? edge case?
+5. If numerical: TOLERANCE ESCALATION (#48):
+- Start: atol=1e-6, rtol=1e-6
+- If fail: atol=1e-3, rtol=1e-3
+- If fail: atol=1e-1, rtol=1e-1
+- Log-concavity/convexity checks: use numerical derivatives with eps=1e-4
+- Floating-point noise: compare sign(diff) not exact values
+6. If format: character-by-character diff
+7. Spend up to 40% of remaining time on last-mile
+```
+
+Never: say "done" at <100% | refactor passing code | start new approach when >80%
+
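The tolerance escalation in step 5 can be used as a diagnostic: the tightest tolerance at which expected and actual values agree indicates whether a failure is precision noise or a genuinely wrong result. The sketch below uses only the standard library. Note the assumption: `math.isclose` combines its tolerances as `max(rel_tol * max(|a|, |b|), abs_tol)`, which differs slightly from NumPy's `atol + rtol * |b|`, so the `atol`/`rtol` values in the template map onto it only approximately. The helper name `first_passing_tolerance` is hypothetical.

```python
import math

# Escalation ladder from the template: 1e-6 -> 1e-3 -> 1e-1.
TOLERANCES = (1e-6, 1e-3, 1e-1)


def first_passing_tolerance(actual, expected, tolerances=TOLERANCES):
    """Return the tightest tolerance at which all pairs match, or None if none does."""
    if len(actual) != len(expected):
        return None
    for tol in tolerances:
        # The same value is used for relative and absolute tolerance at each rung.
        if all(math.isclose(a, e, rel_tol=tol, abs_tol=tol)
               for a, e in zip(actual, expected)):
            return tol
    return None


if __name__ == "__main__":
    result = first_passing_tolerance([1.0, 2.0], [1.0005, 2.0])
    print(f"first passing tolerance: {result}")
```

A return value of `None` suggests the discrepancy is larger than numerical noise, which is when the format and edge-case checks in the other steps become more relevant.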
+---
+
 ## CRITICAL REMINDERS (READ LAST - RECENCY BIAS P6) (#44)
 
 1. **BACKUP FIRST**: `cp file file.bak` before ANY modification
@@ -533,9 +517,12 @@ Never done until: PR merged + deployed + verified
 5. **ITERATE**: On partial test pass, fix specific failure, re-run (P26)
 6. **CREATE > EXPLAIN**: Missing output? CREATE it, don't explain
 7. **REFUSE IMPOSSIBLE**: Compression below Shannon limit → IMPOSSIBLE.md
-8. **DECODER FIRST**: When decoder
+8. **DECODER FIRST (#47)**: When decoder provided, READ IT FIRST, round-trip MUST pass before ANY optimization
 9. **ERROR ESCALATION**: Same error twice → change approach completely
 10. **TIME BUDGET**: Reserve 30% for iteration/debugging
+11. **READ HOOKS (#50)**: `ls /tmp/*_strategies.txt /tmp/*_hints.txt` - if exist, READ before implementing
+12. **TOLERANCE ESCALATION (#48)**: Numerical test fails? Try 1e-6→1e-3→1e-1 before changing algorithm
+13. **NEAR-MISS (#54)**: At 60-89% pass rate, you are CLOSE - do NOT change approach, fix only failing tests
 
 **P17 Active? Print:**
 ```