universal-agent-memory 2.8.0 → 2.9.0
- package/package.json +1 -1
- package/templates/CLAUDE.template.md +341 -18
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "universal-agent-memory",
-"version": "2.
+"version": "2.9.0",
 "description": "Universal AI agent memory system - CLAUDE.md templates, memory, worktrees for Claude Code, Factory.AI, VSCode, OpenCode",
 "type": "module",
 "main": "dist/index.js",
package/templates/CLAUDE.template.md
CHANGED
@@ -1,5 +1,5 @@
 <!--
-CLAUDE.md Universal Template - v10.
+CLAUDE.md Universal Template - v10.19-opt
 
 Core Variables:
 {{PROJECT_NAME}}, {{DESCRIPTION}}, {{DEFAULT_BRANCH}}, {{STRUCTURE_DATE}}
@@ -163,7 +163,7 @@ Secrets live in GitHub - use pipelines for secret-dependent operations.
 
 ## UNIVERSAL AGENT PATTERNS
 
-**
+**Forty-two patterns proven to improve LLM agent reliability. Apply to ALL task execution.**
 
 ### Pattern 1: Environment Isolation
 Code installed during execution may NOT persist across phases (agent -> test -> deploy).
@@ -206,7 +206,7 @@ LLMs weight information at the END of context more heavily.
 - Repeat critical instructions if essential
 
 ### Pattern 7: Task Classification
-Different task types require different strategies.
+Different task types require different strategies. See also P40 (Adaptive Reasoning Depth).
 
 | Category | Strategy |
 |----------|----------|
@@ -214,6 +214,8 @@ Different task types require different strategies.
 | Recipe-following | Step-by-step commands |
 | Tool-dependent | Specify exact tool + flags |
 | Research/exploration | Parallel searches |
+| Complex/ambiguous | P38 Branching Exploration |
+| Multi-file change | P41 Atomic Task Loop |
 
 ### Pattern 8: CLI over Libraries
 When environment dependencies are uncertain, prefer subprocess + CLI over library imports.
@@ -495,6 +497,232 @@ Competitive tasks benefit from researching domain-specific winning strategies.
 
 ---
 
+## ADVANCED REASONING PATTERNS
+
+**Six patterns derived from state-of-the-art LLM optimization research (2025-2026). Address reasoning depth, self-verification, branching exploration, feedback grounding, and task atomization.**
+
+### Pattern 37: Pre-Implementation Verification (PIV)
+**CRITICAL: Prevents wrong-approach waste — the #1 cause of wasted compute.**
+
+After planning but BEFORE writing any code, explicitly verify your approach:
+
+**Detection**: Any implementation task (always active for non-trivial changes)
+
+**Protocol**:
+```
+=== PRE-IMPLEMENTATION VERIFY ===
+1. ROOT CAUSE: Does this approach address the actual root cause, not a symptom?
+2. EXISTING TESTS: Will this break any existing passing tests?
+3. SIMPLER PATH: Is there a simpler approach I'm overlooking?
+4. ASSUMPTIONS: What am I assuming about the codebase that I haven't verified?
+5. SIDE EFFECTS: What else does this change affect?
+=== VERIFIED: [proceed/revise] ===
+```
+
+**If ANY answer raises doubt**: STOP. Re-read the problem. Revise approach before coding.
+
+*Research basis: CoT verification (+4.3% accuracy), Reflexion framework (+18.5%), SEER adaptive reasoning (+4-9%)*
+
+### Pattern 38: Branching Exploration (BE)
+For complex or ambiguous problems, explore multiple approaches before committing.
+
+**Detection**: Problem has multiple valid approaches, ambiguous requirements, or high complexity
+
+**Protocol**:
+1. **Generate 2-3 candidate approaches** (brief description, not full implementation)
+2. **Evaluate each** against: simplicity, correctness likelihood, test-compatibility, side-effect risk
+3. **Select best** with explicit reasoning
+4. **Commit fully** to selected approach — no mid-implementation switching
+5. **If selected approach fails**: backtrack to step 1, eliminate failed approach, try next
+
+**NEVER**: Start coding the first approach that comes to mind for complex problems.
+**ALWAYS**: Spend 5% of effort exploring alternatives to save 50% on wrong-path recovery.
+
+*Research basis: MCTS-guided code generation (RethinkMCTS: 70%→89% pass@1), Policy-Guided Tree Search*
+
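As an illustration of the Pattern 38 protocol added in this hunk, here is a minimal Python sketch that is not part of the package. It scores candidate approaches against the four criteria the protocol names and skips approaches that have already failed (step 5). The `Candidate` class, the function names, the 1-5 scale, and the equal weighting are all assumptions made for the example.

```python
from dataclasses import dataclass

# Criteria named in the Pattern 38 protocol. The 1-5 scale (higher is better)
# and the equal weighting are assumptions of this sketch.
CRITERIA = ("simplicity", "correctness", "test_compat", "side_effect_safety")


@dataclass
class Candidate:
    """Hypothetical record for one candidate approach."""
    name: str
    scores: dict  # criterion name -> int in 1..5


def rank(candidates):
    """Return candidates ordered best-first by total score across CRITERIA."""
    return sorted(
        candidates,
        key=lambda c: sum(c.scores[k] for k in CRITERIA),
        reverse=True,
    )


def next_candidate(candidates, failed):
    """Drop approaches that already failed, then pick the best remaining one."""
    remaining = [c for c in rank(candidates) if c.name not in failed]
    return remaining[0] if remaining else None
```

Calling `next_candidate(candidates, failed=set())` selects the first approach to commit to. If it fails, add its name to `failed` and call again to get the next one.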
+### Pattern 39: Execution Feedback Grounding (EFG)
+Learn from test failures systematically — don't just fix, understand and remember.
+
+**Detection**: Any test failure or runtime error during implementation
+
+**Protocol**:
+1. **Categorize the failure** using the Failure Taxonomy (see below)
+2. **Identify root cause** (not just the symptom the error message shows)
+3. **Fix with explanation**: What was wrong, why, and what the fix addresses
+4. **Store structured feedback** in memory:
+```bash
+sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'failure_analysis','type:<category>|cause:<root_cause>|fix:<what_fixed>|file:<filename>');"
+```
+5. **Query before similar tasks**: Before implementing, check memory for past failures in same area
+
+**Failure Taxonomy** (use for categorization):
+| Type | Description | Recovery Strategy |
+|------|-------------|-------------------|
+| `dependency_missing` | Import/module not found | Install or use stdlib alternative |
+| `wrong_approach` | Fundamentally incorrect solution | P38 Branching - try different approach |
+| `format_mismatch` | Output doesn't match expected format | P14 OFV - re-read spec carefully |
+| `edge_case` | Works for happy path, fails on edge | Add boundary checks, test with extremes |
+| `state_mutation` | Unexpected side effect on shared state | Isolate mutations, use copies |
+| `concurrency` | Race condition or timing issue | Add locks, use sequential fallback |
+| `timeout` | Exceeded time/resource limit | Optimize algorithm, reduce scope |
+| `environment` | Works locally, fails in target env | P1 Environment Isolation checks |
+
+*Research basis: RLEF/RLVR (RL from Execution Feedback), verifiable rewards for coding agents*
+
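The `sqlite3` one-liner in step 4 interpolates values straight into the SQL string. The following Python sketch, which is not part of the package, shows the same insert with bound parameters and a check against the Failure Taxonomy. The three-column `memories` schema is an assumption inferred from the column list in the command, and `store_failure` and `open_demo_db` are hypothetical helper names.

```python
import sqlite3

# The eight failure types listed in the Pattern 39 taxonomy table.
TAXONOMY = {
    "dependency_missing", "wrong_approach", "format_mismatch", "edge_case",
    "state_mutation", "concurrency", "timeout", "environment",
}


def open_demo_db():
    """In-memory database with an assumed minimal schema (the real table may differ)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE memories (timestamp TEXT, type TEXT, content TEXT)")
    return conn


def store_failure(conn, category, cause, fix, filename):
    """Insert one failure_analysis row in the pipe-delimited format from step 4."""
    if category not in TAXONOMY:
        raise ValueError(f"unknown failure type: {category}")
    content = f"type:{category}|cause:{cause}|fix:{fix}|file:{filename}"
    # Bound parameters avoid quoting problems when a cause contains apostrophes.
    conn.execute(
        "INSERT INTO memories (timestamp, type, content) "
        "VALUES (datetime('now'), ?, ?)",
        ("failure_analysis", content),
    )
    conn.commit()
    return content
```

A `|` inside a cause or fix would break the delimited format, so a real implementation would probably need to escape or reject that character.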
+### Pattern 40: Adaptive Reasoning Depth (ARD)
+Match reasoning effort to task complexity — don't over-think simple tasks or under-think hard ones.
+
+**Detection**: Applied automatically at Pattern Router stage
+
+**Complexity Classification**:
+| Complexity | Indicators | Reasoning Protocol |
+|-----------|------------|-------------------|
+| **Simple** | Single file, clear spec, known pattern, <20 lines | Direct implementation. No exploration phase. |
+| **Moderate** | Multi-file, some ambiguity, 20-200 lines | Plan-then-implement. State assumptions. P37 verify. |
+| **Complex** | Cross-cutting concerns, ambiguous spec, >200 lines, unfamiliar domain | P38 explore → P37 verify → implement → P39 feedback loop. |
+| **Research** | Unknown solution space, no clear approach | Research first (web search, codebase analysis) → P38 explore → implement iteratively. |
+
+**Rule**: Never apply Complex-level reasoning to Simple tasks (wastes tokens). Never apply Simple-level reasoning to Complex tasks (causes failures).
+
+*Research basis: SEER adaptive CoT, test-time compute scaling (2-3x gains from adaptive depth)*
+
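The classification table above can be read as a small decision function. The Python sketch below is not part of the package and shows one possible encoding. The line-count thresholds come from the table, while the precedence of the checks and the treatment of any ambiguous spec as Complex are assumptions. `classify` and `PROTOCOL` are hypothetical names.

```python
# Reasoning protocol per level, condensed from the Pattern 40 table.
PROTOCOL = {
    "simple": "Direct implementation",
    "moderate": "Plan, verify (P37), implement",
    "complex": "Explore (P38), verify (P37), implement, feedback (P39)",
    "research": "Search first, then explore (P38), implement iteratively",
}


def classify(files_touched, est_lines, ambiguous=False, unknown_solution=False):
    """Map rough task indicators to a complexity level.

    Thresholds (<20, 20-200, >200 lines) follow the table. The order of the
    checks is an assumption of this sketch.
    """
    if unknown_solution:
        return "research"
    if ambiguous or est_lines > 200:
        return "complex"
    if files_touched > 1 or est_lines >= 20:
        return "moderate"
    return "simple"
```

For example, `PROTOCOL[classify(files_touched=3, est_lines=50)]` looks up the protocol string for a moderate task.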
+### Pattern 41: Atomic Task Loop (ATL)
+For multi-step changes, decompose into atomic units with clean boundaries.
+
+**Detection**: Task involves changes to 3+ files, or multiple independent concerns
+
+**Protocol**:
+1. **Decompose** the task into atomic sub-tasks (each independently testable)
+2. **Order** by dependency (upstream changes first)
+3. **For each sub-task**:
+a. Implement the change (single concern only)
+b. Run relevant tests
+c. Commit if tests pass
+d. If context is getting long/confused, note progress and continue fresh
+4. **Final verification**: Run full test suite after all sub-tasks complete
+
+**Atomicity rules**:
+- Each sub-task modifies ideally 1-2 files
+- Each sub-task has a clear pass/fail criterion
+- Sub-tasks should not depend on uncommitted work from other sub-tasks
+- If a sub-task fails, only that sub-task needs rework
+
+*Research basis: Addy Osmani's continuous coding loop, context drift prevention research*
+
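Steps 2 and 3 of the Atomic Task Loop amount to a topological ordering followed by a stop-on-first-failure loop. The Python sketch below is not part of the package and illustrates this with the standard-library `graphlib` module (Python 3.9+). The names `order_subtasks` and `run_loop` and the `implement_and_test` callback are hypothetical, and appending to `done` stands in for the commit step.

```python
from graphlib import TopologicalSorter


def order_subtasks(deps):
    """Order sub-tasks so that upstream changes come first.

    deps maps each sub-task name to the set of sub-tasks it depends on.
    """
    return list(TopologicalSorter(deps).static_order())


def run_loop(deps, implement_and_test):
    """Run sub-tasks in dependency order, stopping at the first failure.

    implement_and_test(task) is a caller-supplied callback that returns True
    when the sub-task's tests pass. Returns (completed_tasks, failed_task).
    """
    done = []
    for task in order_subtasks(deps):
        if not implement_and_test(task):
            return done, task  # only this sub-task needs rework
        done.append(task)  # stands in for "commit if tests pass"
    return done, None
```

Because each completed sub-task is recorded before the next one starts, a failure leaves earlier work intact, which matches the last atomicity rule.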
+### Pattern 42: Critic-Before-Commit (CBC)
+Review your own diff against requirements before running tests.
+
+**Detection**: Any implementation about to be tested or committed
+
+**Protocol**:
+```
+=== SELF-REVIEW ===
+Diff summary: [what changed, in which files]
+
+REQUIREMENT CHECK:
+☐ Does the diff address ALL requirements from the task?
+☐ Are there any unintended changes (debug prints, commented code, temp files)?
+☐ Does the code handle the error/edge cases mentioned in the spec?
+☐ Is the code consistent with surrounding style and conventions?
+☐ Would this diff make sense to a reviewer with no context?
+
+ISSUES FOUND: [list or "none"]
+=== END REVIEW ===
+```
+
+**If issues found**: Fix BEFORE running tests. Cheaper to catch logic errors by reading than by test-debug cycles.
+
+*Research basis: Multi-agent reflection (actor+critic, +20% accuracy), RL^V unified reasoner-verifier*
+
+---
+
+## CONTEXT OPTIMIZATION
+
+**Reduce token waste and improve response quality through intelligent context management.**
+
+### Progressive Context Disclosure
+Not all patterns are needed for every task. The Pattern Router activates only relevant patterns.
+- **Always loaded**: Pattern Router, Completion Gates, Error Recovery
+- **Loaded on activation**: Only patterns flagged YES by router
+- **Summarize, don't repeat**: When referencing prior work, summarize in 1-2 lines, don't paste full output
+
+### Context Hygiene
+- **Prune completed context**: After a sub-task completes, don't carry its full debug output forward
+- **Compress tool output**: Quote only the 2-3 lines that inform the next decision
+- **Avoid context poisoning**: Don't include failed approaches in context unless actively debugging them
+- **Reset on drift**: If responses become unfocused or repetitive, summarize progress and continue with clean context
+
+### Token Budget Awareness
+| Task Type | Target Context Usage | Strategy |
+|-----------|---------------------|----------|
+| Simple fix | <10% of window | Direct implementation, minimal exploration |
+| Feature implementation | 30-50% of window | Structured exploration, then focused implementation |
+| Complex debugging | 50-70% of window | Deep investigation justified, but prune between attempts |
+| Research/exploration | 20-40% of window | Broad search first, then narrow and deep |
+
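The budget table can be turned into a simple check of measured usage against the target range. The Python sketch below is not part of the package. The ranges are copied from the table, with "<10%" treated as 0-10%. The task-type keys, the `budget_status` name, and the three status labels are assumptions.

```python
# (low, high) share of the context window per task type, from the table above.
# "<10% of window" is encoded as the range 0-10% in this sketch.
TARGETS = {
    "simple_fix": (0.0, 0.10),
    "feature": (0.30, 0.50),
    "complex_debugging": (0.50, 0.70),
    "research": (0.20, 0.40),
}


def budget_status(task_type, tokens_used, window_size):
    """Return 'under', 'within', or 'over' relative to the target range."""
    low, high = TARGETS[task_type]
    fraction = tokens_used / window_size
    if fraction > high:
        return "over"
    if fraction < low:
        return "under"
    return "within"
```

An "over" result would be a natural trigger for the Context Hygiene steps listed above, such as pruning completed context.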
+---
+
+## SELF-IMPROVEMENT PROTOCOL
+
+**The agent improves its own effectiveness over time by learning from outcomes.**
+
+### After Task Completion (Success or Failure)
+1. **Record outcome** with structured metadata:
+```bash
+sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'outcome','task:<summary>|result:<pass/fail>|patterns_used:<list>|time_spent:<estimate>|failure_type:<category_or_none>',8);"
+```
+
+2. **If failure occurred**: Store in semantic memory for cross-session learning:
+```bash
+{{MEMORY_STORE_CMD}} lesson "Failed on <task_type>: <what_went_wrong>. Fix: <what_worked>." --tags failure,<category>,<language> --importance 8
+```
+
+3. **If novel technique discovered**: Store as reusable pattern:
+```bash
+{{MEMORY_STORE_CMD}} lesson "New technique for <domain>: <technique_description>. Use when <conditions>." --tags technique,<domain> --importance 9
+```
+
+### Before Starting Similar Tasks
+Query memory for relevant past outcomes:
+```bash
+sqlite3 ./{{MEMORY_DB_PATH}} "SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<relevant_keyword>%' ORDER BY timestamp DESC LIMIT 5;"
+```
+
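The lookup query above returns pipe-delimited strings that the agent still has to interpret. The Python sketch below is not part of the package. It runs the same `SELECT` with bound parameters and splits each row into a dictionary. It assumes every stored row follows the `key:value|key:value` format from Pattern 39, and `recent_failures` is a hypothetical name.

```python
import sqlite3


def recent_failures(conn, keyword, limit=5):
    """Return past failure_analysis records matching keyword, newest first.

    Each record is parsed from 'type:..|cause:..|fix:..|file:..' into a dict.
    Rows that do not follow that format are not handled in this sketch.
    """
    rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE type = 'failure_analysis' AND content LIKE ? "
        "ORDER BY timestamp DESC LIMIT ?",
        (f"%{keyword}%", limit),
    ).fetchall()
    return [
        dict(part.split(":", 1) for part in content.split("|"))
        for (content,) in rows
    ]
```

Splitting on the first `:` only keeps values such as file paths or messages that contain further colons intact.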
+### Repo-Specific Learning
+Over time, accumulate repository-specific patterns:
+- Which test frameworks and assertions this repo uses
+- Common failure modes in this codebase
+- Preferred code style and naming conventions
+- Architecture decisions and their rationale
+
+Store these as high-importance semantic memories tagged with the repo name.
+
+---
+
+## CODE QUALITY HEURISTICS
+
+**Apply to ALL generated code. Verify before committing.**
+
+### Pre-Commit Code Review Checklist
+- [ ] Functions ≤ 30 lines (split if longer)
+- [ ] No God objects or functions doing multiple unrelated things
+- [ ] Names are self-documenting (no single-letter variables outside loops)
+- [ ] Error paths handled explicitly (not just happy path)
+- [ ] No debug prints, console.logs, or commented-out code left behind
+- [ ] Consistent with surrounding code style (indentation, naming, patterns)
+- [ ] No hardcoded values that should be constants or config
+- [ ] Imports are minimal — only what's actually used
+
+### Code Smell Detection
+If you notice any of these, fix before committing:
+- **Duplicated logic** → Extract to shared function
+- **Deep nesting (>3 levels)** → Early returns, extract helper
+- **Boolean parameters** → Consider separate methods or options object
+- **Magic numbers** → Named constants
+- **Catch-all error handling** → Specific error types with appropriate responses
+
+---
+
 ## SESSION START PROTOCOL
 
 **EXECUTE IMMEDIATELY before any response:**
@@ -538,6 +766,7 @@ uam agent overlaps --resource "<files-or-directories>"
 | Performance | `performance-optimizer` | algorithms, memory, caching |
 | Documentation | `documentation-expert` | jsdoc, readme, api-docs |
 | Code quality | `code-quality-guardian` | complexity, naming, solid |
+| Solution verification | self (P42 CBC) | diff review, requirement check |
 
 {{#if LANGUAGE_DROIDS}}
 ### Language Droids
@@ -587,13 +816,17 @@ uam agent overlaps --resource "<files-or-directories>"
 ## DECISION LOOP
 
 ```
-0. CLASSIFY
-1. PROTECT
-2. MEMORY
-3.
-4.
-5.
-6.
+0. CLASSIFY -> complexity? backup? tool? steps? (P40 Adaptive Depth)
+1. PROTECT -> cp file file.bak
+2. MEMORY -> query relevant context + past failures (P39)
+3. EXPLORE -> if complex: generate 2-3 approaches (P38)
+4. VERIFY -> pre-implementation check (P37)
+5. AGENTS -> check overlaps
+6. SKILLS -> check {{SKILLS_PATH}}
+7. WORKTREE -> create, work (P41 atomic tasks)
+8. REVIEW -> self-review diff (P42)
+9. TEST -> gates pass
+10. LEARN -> store outcome in memory (P39)
 ```
 
 ---
@@ -707,6 +940,68 @@ Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
 
 ---
 
+## UAM VISUAL STATUS FEEDBACK (MANDATORY WHEN UAM IS ACTIVE)
+
+**When UAM tools are in use, ALWAYS use the built-in status display commands to provide visual feedback on progress and underlying numbers. Do NOT silently perform operations -- show the user what is happening.**
+
+### After Task Operations
+After creating, updating, closing, or claiming tasks, run:
+```bash
+uam dashboard progress # Show completion %, status bars, velocity
+uam task stats # Show priority/type breakdown with charts
+```
+
+### After Memory Operations
+After storing, querying, or prepopulating memory, run:
+```bash
+uam memory status # Show memory layer health, capacity gauges, service status
+uam dashboard memory # Show detailed memory dashboard with architecture tree
+```
+
+### After Agent/Coordination Operations
+After registering agents, checking overlaps, or claiming resources, run:
+```bash
+uam dashboard agents # Show agent status table, resource claims, active work
+```
+
+### Periodic Overview
+At session start and after completing major work items, run:
+```bash
+uam dashboard overview # Full overview: task progress, agent status, memory health
+```
+
+### Display Function Reference
+
+UAM provides these visual output functions (from `src/cli/visualize.ts`):
+
+| Function | Purpose | When to Use |
+|----------|---------|-------------|
+| `progressBar` | Completion bar with % and count | Task/test progress |
+| `stackedBar` + `stackedBarLegend` | Multi-segment status bar | Status distribution |
+| `horizontalBarChart` | Labeled bar chart | Priority/type breakdowns |
+| `miniGauge` | Compact colored gauge | Capacity/utilization |
+| `sparkline` | Inline trend line | Historical data trends |
+| `table` | Formatted data table | Task/agent listings |
+| `tree` | Hierarchical tree view | Memory layers, task hierarchy |
+| `box` | Bordered summary box | Section summaries |
+| `statusBadge` | Colored status labels | Agent/service status |
+| `keyValue` | Aligned key-value pairs | Metadata display |
+| `inlineProgressSummary` | Compact progress bar with counts | After task mutations |
+| `trend` | Up/down arrow with delta | Before/after comparisons |
+| `heatmapRow` | Color-coded cell row | Activity density |
+| `bulletList` | Status-colored bullet list | Health checks |
+
+### Rules
+
+1. **Never silently complete a UAM operation** -- always follow up with the relevant dashboard/status command
+2. **Show numbers, not just success messages** -- the user needs to see counts, percentages, and trends
+3. **Use `uam dashboard overview`** at session start to establish baseline awareness
+4. **Use `uam task stats`** after any task state change to show the impact
+5. **Use `uam memory status`** after any memory write to confirm storage and show capacity
+6. **Prefer dashboard commands over raw SQLite queries** for status checks -- they provide formatted visual output
+
+---
+
 {{#if HAS_PROJECT_MD}}
 {{> PROJECT}}
 {{else}}
@@ -844,13 +1139,15 @@ echo "=== GATE 3: TEST VERIFICATION ==="
 ☐ Tests pass
 ☐ Lint/typecheck pass
 ☐ Worktree used (not {{DEFAULT_BRANCH}})
-☐
+☐ Self-review completed (P42)
+☐ Memory updated (outcome + lessons from P39)
 ☐ PR created
 ☐ Parallel reviews passed
 {{#if HAS_INFRA}}
 ☐ IaC parity verified
 {{/if}}
 ☐ No secrets in code
+☐ No debug artifacts left (console.logs, commented code, temp files)
 ```
 
 ---
@@ -897,12 +1194,15 @@ When a task provides a decoder, validator, or expected output format:
 ## ERROR RECOVERY ESCALATION
 
 On any test failure or error:
-1. **
-2. **
-3. **
-4. **If
-5. **If
-6. **
+1. **Categorize** using P39 Failure Taxonomy (`dependency_missing`, `wrong_approach`, `format_mismatch`, `edge_case`, `state_mutation`, `concurrency`, `timeout`, `environment`)
+2. **Read exact error message** - do not guess
+3. **Query memory** for past failures of this type: `SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<type>%' LIMIT 3;`
+4. **If same error twice**: change approach completely (P38 Branching), do not retry same fix
+5. **If dependency missing**: install it (`pip install`, `npm install`, `apt-get`)
+6. **If permission denied**: use alternative path or `chmod`
+7. **If timeout**: reduce scope, submit partial solution
+8. **Store failure** in memory after resolution (P39 protocol)
+9. **Never repeat a failed command** without modification
 
 ---
 
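Rule 4 of the escalation list ("if same error twice, change approach") needs some record of which errors have already been seen. The Python sketch below is not part of the package and shows one minimal way to keep that record. The `ErrorTracker` class and the two action labels are hypothetical, and matching errors by exact message text is an assumption that a real implementation might relax.

```python
from collections import Counter


class ErrorTracker:
    """Counts how often each exact error message has occurred in this session."""

    def __init__(self):
        self._seen = Counter()

    def record(self, error_message):
        """Record an error and return the suggested next action.

        The first occurrence suggests a modified retry. A repeat of the same
        message suggests switching approach, as rule 4 requires.
        """
        key = error_message.strip()
        self._seen[key] += 1
        return "change_approach" if self._seen[key] >= 2 else "fix_and_retry"
```

Messages that embed timestamps or temporary paths would never compare equal, so normalizing them before calling `record` is likely necessary in practice.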
@@ -953,6 +1253,7 @@ If timeout approaching: submit best partial solution rather than nothing.
 === PATTERN ROUTER ===
 Task: [one-line task summary]
 Classification: [file-creation | constraint-validation | adversarial | pipeline | impossible]
+Complexity: [simple | moderate | complex | research] (P40)
 
 SELECTED PATTERNS:
 - P12 (OEV): [YES/NO] - Does task require creating files?
@@ -960,6 +1261,16 @@ If timeout approaching: submit best partial solution rather than nothing.
 - P20 (AT): [YES/NO] - Does task involve "bypass/break/exploit/filter"?
 - P11 (Pre-computed): [YES/NO] - Is this a known algorithm (crypto/chess/ML)?
 - P18 (MTP): [YES/NO] - Does task need multiple tools chained?
+- P37 (PIV): [YES/NO] - Non-trivial implementation needing plan verification?
+- P38 (BE): [YES/NO] - Multiple valid approaches or high ambiguity?
+- P41 (ATL): [YES/NO] - Changes span 3+ files or multiple concerns?
+- P42 (CBC): [YES/NO] - Implementation that will be committed?
+
+REASONING DEPTH (per P40):
+- Simple → Direct implementation
+- Moderate → Plan, verify (P37), implement
+- Complex → Explore (P38), verify (P37), implement, feedback (P39)
+- Research → Search first, then explore (P38), implement iteratively
 
 ACTIVE PATTERNS: [list only YES patterns]
 === END ROUTER ===
@@ -1023,7 +1334,19 @@ If timeout approaching: submit best partial solution rather than nothing.
 15. **REFUSE IMPOSSIBLE**: If task violates information theory (compress 500MB to 5KB),
 REFUSE immediately. Create IMPOSSIBLE.md explaining why. Don't attempt.
 
-16. **
+16. **VERIFY BEFORE CODING (P37)**: For moderate+ complexity tasks, print the
+PRE-IMPLEMENTATION VERIFY block. Catch wrong approaches before wasting tokens.
+
+17. **EXPLORE BEFORE COMMITTING (P38)**: For complex/ambiguous tasks, generate 2-3
+candidate approaches and evaluate before coding. 5% exploration saves 50% rework.
+
+18. **LEARN FROM FAILURES (P39)**: After ANY test failure, categorize it using the
+Failure Taxonomy and store structured feedback in memory. Query memory before similar tasks.
+
+19. **REVIEW YOUR OWN DIFF (P42)**: Before running tests, do a self-review of your
+changes against requirements. Catch logic errors by reading, not by test-debug cycles.
+
+20. **ADVERSARIAL MINDSET (MANDATORY)**: If P20 is active, you MUST:
 ```
 === ADVERSARIAL ANALYSIS ===
 Target: [what are we trying to bypass/break?]