universal-agent-memory 3.1.0 → 4.1.0
- package/package.json +1 -1
- package/templates/CLAUDE.template.md +116 -1015
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "universal-agent-memory",
|
|
3
|
-
"version": "
|
|
3
|
+
"version": "4.1.0",
|
|
4
4
|
"description": "Universal AI agent memory system - CLAUDE.md templates, memory, worktrees for Claude Code, Factory.AI, VSCode, OpenCode",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "dist/index.js",
|
|
package/templates/CLAUDE.template.md
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
<!--
|
|
2
|
-
CLAUDE.md Universal Template -
|
|
2
|
+
CLAUDE.md Universal Template - v11.0-slim
|
|
3
3
|
|
|
4
4
|
Core Variables:
|
|
5
5
|
{{PROJECT_NAME}}, {{DESCRIPTION}}, {{DEFAULT_BRANCH}}, {{STRUCTURE_DATE}}
|
|
@@ -44,42 +44,13 @@
|
|
|
44
44
|
|
|
45
45
|
---
|
|
46
46
|
|
|
47
|
-
## CODE
|
|
47
|
+
## CODE PRINCIPLES
|
|
48
48
|
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
Do not write code before stating assumptions.
|
|
55
|
-
Do not claim correctness you haven't verified.
|
|
56
|
-
Do not handle only the happy path.
|
|
57
|
-
Under what conditions does this work?
|
|
58
|
-
```
|
|
59
|
-
|
|
60
|
-
### Before Writing Code
|
|
61
|
-
|
|
62
|
-
- What are you assuming about the input?
|
|
63
|
-
- What are you assuming about the environment?
|
|
64
|
-
- What would break this?
|
|
65
|
-
- What would a malicious caller do?
|
|
66
|
-
|
|
67
|
-
### Do Not
|
|
68
|
-
|
|
69
|
-
- Write code before stating assumptions
|
|
70
|
-
- Claim correctness you haven't verified
|
|
71
|
-
- Handle the happy path and gesture at the rest
|
|
72
|
-
- Import complexity you don't need
|
|
73
|
-
- Solve problems you weren't asked to solve
|
|
74
|
-
- Produce code you wouldn't want to debug at 3am
|
|
75
|
-
|
|
76
|
-
### Expected Output Format
|
|
77
|
-
|
|
78
|
-
**Before code**: Assumptions stated explicitly, scope bounded
|
|
79
|
-
**In code**: Smaller than expected, edge cases handled or explicitly rejected
|
|
80
|
-
**After code**: "What this handles" and "What this does NOT handle" sections
|
|
81
|
-
|
|
82
|
-
*Attribution: Based on [context-field research](https://github.com/NeoVertex1/context-field)*
|
|
49
|
+
- State assumptions before writing code
|
|
50
|
+
- Verify correctness -- do not claim it
|
|
51
|
+
- Handle error paths, not just the happy path
|
|
52
|
+
- Do not import complexity you do not need
|
|
53
|
+
- Produce code you would want to debug at 3am
|
|
83
54
|
|
|
84
55
|
---
|
|
85
56
|
|
|
@@ -98,8 +69,6 @@ Under what conditions does this work?
|
|
|
98
69
|
| kubectl with secrets | `ops-approved-operations.yml` |
|
|
99
70
|
| One-time secret operation | `ops-create-ephemeral.yml` (self-destructs after run) |
|
|
100
71
|
|
|
101
|
-
**Local commands without secrets** (read-only, public resources) are allowed for testing.
|
|
102
|
-
|
|
103
72
|
### Two-Phase Infrastructure Workflow
|
|
104
73
|
|
|
105
74
|
```
|
|
@@ -107,7 +76,6 @@ PHASE 1: LOCAL PROOF (ALLOWED - NO SECRETS)
|
|
|
107
76
|
- kubectl get/describe/logs (read-only operations)
|
|
108
77
|
- terraform plan (uses GitHub pipeline for secrets)
|
|
109
78
|
- Direct cloud console changes for rapid prototyping
|
|
110
|
-
- Manual commands to verify behavior (public resources)
|
|
111
79
|
- SECRETS REQUIRED? -> Use pipeline, not local commands
|
|
112
80
|
|
|
113
81
|
PHASE 2: IaC PARITY (MANDATORY - VIA PIPELINE)
|
|
@@ -119,639 +87,107 @@ PHASE 2: IaC PARITY (MANDATORY - VIA PIPELINE)
|
|
|
119
87
|
- RULE: Work is NOT complete until IaC matches live state
|
|
120
88
|
```
|
|
121
89
|
|
|
122
|
-
### Core Principle
|
|
123
|
-
|
|
124
|
-
```
|
|
125
|
-
Local testing proves the solution. IaC ensures reproducibility.
|
|
126
|
-
Manual changes are TEMPORARY. IaC changes are PERMANENT.
|
|
127
|
-
If it's not in IaC, it doesn't exist (will be destroyed/lost).
|
|
128
|
-
Secrets live in GitHub - use pipelines for secret-dependent operations.
|
|
129
|
-
```
|
|
130
|
-
|
|
131
90
|
### Approved Pipelines
|
|
132
91
|
|
|
133
|
-
| Task | Pipeline | Trigger |
|
|
134
|
-
|
|
135
|
-
| Kubernetes operations | `ops-approved-operations.yml` | Manual dispatch |
|
|
136
|
-
| Ephemeral environments | `ops-create-ephemeral.yml` | Manual dispatch |
|
|
137
|
-
| Terraform changes | `iac-terraform-cicd.yml` | PR to {{DEFAULT_BRANCH}} |
|
|
138
|
-
| Ephemeral Terraform | `ops-ephemeral-terraform.yml` | Manual dispatch |
|
|
139
|
-
|
|
140
|
-
### What This Means for Agents
|
|
141
|
-
|
|
142
|
-
**PHASE 1 - Local Testing (ALLOWED for non-secret operations):**
|
|
143
|
-
- Run read-only commands: `kubectl get`, `kubectl describe`, `kubectl logs`
|
|
144
|
-
- Run `terraform plan` via pipeline (needs secrets)
|
|
145
|
-
- Make cloud console changes to prototype
|
|
146
|
-
- Use ephemeral pipelines for secret-dependent testing
|
|
147
|
-
|
|
148
|
-
**PHASE 2 - IaC Parity (MANDATORY - always via pipeline):**
|
|
149
|
-
- ALL manual changes MUST be translated to IaC (Terraform/K8s YAML)
|
|
150
|
-
- IaC MUST be committed to version control
|
|
151
|
-
- Deployment MUST go through CI/CD pipeline (has secrets)
|
|
152
|
-
- Final state MUST match IaC exactly (verify with diff)
|
|
153
|
-
- Manual/ephemeral resources MUST be cleaned up
|
|
154
|
-
|
|
155
|
-
**NEVER:**
|
|
156
|
-
- Run `terraform apply` locally (no secrets available)
|
|
157
|
-
- Run `kubectl apply` with secret-dependent resources locally
|
|
158
|
-
- Create secrets via `kubectl create secret` (use Sealed Secrets)
|
|
159
|
-
- Hardcode or expose secrets in code/logs
|
|
92
|
+
| Task | Pipeline | Trigger |
|
|
93
|
+
|------|----------|---------|
|
|
94
|
+
| Kubernetes operations | `ops-approved-operations.yml` | Manual dispatch |
|
|
95
|
+
| Ephemeral environments | `ops-create-ephemeral.yml` | Manual dispatch |
|
|
96
|
+
| Terraform changes | `iac-terraform-cicd.yml` | PR to {{DEFAULT_BRANCH}} |
|
|
97
|
+
| Ephemeral Terraform | `ops-ephemeral-terraform.yml` | Manual dispatch |
|
|
160
98
|
{{/if}}
|
|
161
99
|
|
|
162
100
|
---
|
|
163
101
|
|
|
164
|
-
##
|
|
165
|
-
|
|
166
|
-
**Forty-two patterns proven to improve LLM agent reliability. Apply to ALL task execution.**
|
|
167
|
-
|
|
168
|
-
### Pattern 1: Environment Isolation
|
|
169
|
-
Code installed during execution may NOT persist across phases (agent -> test -> deploy).
|
|
170
|
-
- Check what dependencies are pre-installed before using them
|
|
171
|
-
- Prefer pre-installed tools over installing new ones
|
|
172
|
-
- Design for minimal dependency footprint
|
|
173
|
-
|
|
174
|
-
### Pattern 2: Recipe Following
|
|
175
|
-
LLMs excel at explicit step-by-step instructions but struggle with ambiguous tasks.
|
|
176
|
-
- Convert complex tasks into numbered sequential commands
|
|
177
|
-
- Each step should be atomic and verifiable
|
|
178
|
-
- Provide exact commands, not descriptions
|
|
179
|
-
|
|
180
|
-
### Pattern 3: Pre-execution State Protection
|
|
181
|
-
LLMs do NOT reliably follow negative instructions ("do not delete X").
|
|
182
|
-
- Backup critical files BEFORE starting work
|
|
183
|
-
- Never rely on "do not" for critical constraints
|
|
184
|
-
- Protect state proactively, not reactively
|
|
102
|
+
## SESSION START
|
|
185
103
|
|
|
186
104
|
```bash
|
|
187
|
-
|
|
188
|
-
sqlite3
|
|
105
|
+
uam task ready
|
|
106
|
+
sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM memories ORDER BY id DESC LIMIT 10;"
|
|
107
|
+
sqlite3 ./{{MEMORY_DB_PATH}} "SELECT * FROM session_memories WHERE session_id='current' ORDER BY id DESC LIMIT 5;"
|
|
108
|
+
uam agent status
|
|
189
109
|
```
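The `sqlite3` queries in the block above can also be issued from a script. The following is a minimal sketch, not part of the package: it assumes a `memories` table with `id`, `timestamp`, `type`, and `content` columns (`timestamp`, `type`, and `content` appear in an `INSERT` statement in the removed lines of this diff; `id` is inferred from `ORDER BY id`), and it uses an in-memory database in place of `{{MEMORY_DB_PATH}}`. The names `recent_memories` and `demo_connection` are hypothetical.

```python
import sqlite3


def recent_memories(conn: sqlite3.Connection, limit: int = 10) -> list[tuple]:
    """Return the newest rows from the assumed `memories` table, newest first."""
    cur = conn.execute(
        "SELECT id, type, content FROM memories ORDER BY id DESC LIMIT ?",
        (limit,),  # bound parameter instead of string formatting
    )
    return cur.fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memories ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "timestamp TEXT, type TEXT, content TEXT)"
    )
    rows = [("note", f"entry {i}") for i in range(1, 13)]
    conn.executemany(
        "INSERT INTO memories (timestamp, type, content) "
        "VALUES (datetime('now'), ?, ?)",
        rows,
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for row in recent_memories(demo_connection(), limit=3):
        print(row)
```

Against a real database, `sqlite3.connect` would be given the rendered value of `{{MEMORY_DB_PATH}}`; the actual schema may differ from the one assumed here.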
|
|
190
110
|
|
|
191
|
-
|
|
192
|
-
Agents succeed more when told WHICH tool to use, not just WHAT to accomplish.
|
|
193
|
-
- Specify exact tool and flags when possible
|
|
194
|
-
- Reduce search space of possible approaches
|
|
195
|
-
|
|
196
|
-
### Pattern 5: Recognizing Impossible Tasks
|
|
197
|
-
Some tasks require pre-computed solutions (compressed algorithms, lookup tables, trained models).
|
|
198
|
-
- Identify tasks requiring: compression below limits, pre-trained weights, exhaustive search
|
|
199
|
-
- These cannot be solved by reasoning alone
|
|
200
|
-
- Detect early to avoid wasted computation
|
|
201
|
-
|
|
202
|
-
### Pattern 6: Hierarchical Prompting (Recency Bias)
|
|
203
|
-
LLMs weight information at the END of context more heavily.
|
|
204
|
-
- Structure: capabilities -> context -> CRITICAL instructions
|
|
205
|
-
- Put most important constraints at the END
|
|
206
|
-
- Repeat critical instructions if essential
|
|
207
|
-
|
|
208
|
-
### Pattern 7: Task Classification
|
|
209
|
-
Different task types require different strategies. See also P40 (Adaptive Reasoning Depth).
|
|
210
|
-
|
|
211
|
-
| Category | Strategy |
|
|
212
|
-
|----------|----------|
|
|
213
|
-
| State-sensitive | Pre-backup critical files |
|
|
214
|
-
| Recipe-following | Step-by-step commands |
|
|
215
|
-
| Tool-dependent | Specify exact tool + flags |
|
|
216
|
-
| Research/exploration | Parallel searches |
|
|
217
|
-
| Complex/ambiguous | P38 Branching Exploration |
|
|
218
|
-
| Multi-file change | P41 Atomic Task Loop |
|
|
219
|
-
|
|
220
|
-
### Pattern 8: CLI over Libraries
|
|
221
|
-
When environment dependencies are uncertain, prefer subprocess + CLI over library imports.
|
|
222
|
-
- CLI tools more likely pre-installed
|
|
223
|
-
- Better backward compatibility
|
|
224
|
-
- Easier to verify: `which tool`
|
|
225
|
-
|
|
226
|
-
### Pattern 9: Format Translation Pipeline
|
|
227
|
-
When task involves non-text data (images, audio, video, binary), convert to standard format first.
|
|
228
|
-
- Identify input format and required output
|
|
229
|
-
- Find converter tool: image->text, audio->transcript, binary->structured
|
|
230
|
-
- Convert to standard intermediate format (FEN, JSON, CSV, etc.)
|
|
231
|
-
|
|
232
|
-
### Pattern 10: Whitelist-First Sanitization
|
|
233
|
-
For security filtering (XSS, SQL injection, command injection), ALWAYS use allow-list approach.
|
|
234
|
-
- NEVER use blacklist/regex to filter malicious patterns
|
|
235
|
-
- Use established security libraries (bleach, DOMPurify, parameterized queries)
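As an illustration of the allow-list approach in Pattern 10, the sketch below accepts a value only if every character is in an explicitly permitted set, then passes it to SQLite as a bound parameter. The `users` table, the `USERNAME_RE` pattern, and both function names are assumptions made for this example; they do not come from the template.

```python
import re
import sqlite3

# Allow-list: only these characters, and only 1 to 32 of them, are accepted.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")


def validate_username(value: str) -> str:
    """Return `value` unchanged if it matches the allow-list, else raise."""
    if USERNAME_RE.fullmatch(value) is None:
        raise ValueError("username contains characters outside the allow-list")
    return value


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user by name with a parameterized query."""
    cur = conn.execute(
        "SELECT id, name FROM users WHERE name = ?",
        (validate_username(username),),  # value is bound, never interpolated
    )
    return cur.fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    print(find_user(conn, "alice"))
```

The validation and the bound parameter are independent defenses: the query would remain safe from injection without the allow-list, and the allow-list additionally rejects input that has no business reaching the database.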
|
|
236
|
-
|
|
237
|
-
### Pattern 11: Pre-Computed Solution Recognition
|
|
238
|
-
Some tasks have known solutions that cannot be derived through reasoning alone.
|
|
239
|
-
- Recognize tasks matching known solved problems (chess engines, crypto tables, ML models)
|
|
240
|
-
- Search GitHub/papers FIRST before attempting to solve
|
|
241
|
-
- Download and adapt existing solutions rather than reinventing
|
|
242
|
-
|
|
243
|
-
### Pattern 12: Output Existence Verification (OEV)
|
|
244
|
-
**CRITICAL: 37% of task failures are "output file not created"**
|
|
245
|
-
|
|
246
|
-
BEFORE completing ANY task:
|
|
247
|
-
1. **Parse expected outputs** from task description (files, paths, artifacts)
|
|
248
|
-
2. **Verify each output EXISTS**: `ls -la /expected/path`
|
|
249
|
-
3. **If missing: CREATE IT** - don't explain, don't ask, just create
|
|
250
|
-
4. **Re-verify** before reporting completion
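The existence check in steps 2 and 4 of Pattern 12 can be automated. This is a hedged sketch: the helper name `missing_outputs` and the example file names are hypothetical, and the list of expected outputs is assumed to have been extracted from the task description by hand.

```python
import tempfile
from pathlib import Path


def missing_outputs(expected: list[str], root: Path) -> list[str]:
    """Return the expected relative paths that do not exist under `root`."""
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "report.json").write_text("{}")
        # Only result.csv has not been created yet.
        print(missing_outputs(["report.json", "result.csv"], root))
```

Running the check once before finishing and again after creating any missing files mirrors the verify, create, re-verify sequence in the list.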
|
|
251
|
-
|
|
252
|
-
### Pattern 13: Iterative Refinement Loop (IRL)
|
|
253
|
-
When tests are available and some pass but others fail:
|
|
254
|
-
1. Run tests after implementation
|
|
255
|
-
2. If partial success: read failing test, fix specific issue, re-run
|
|
256
|
-
3. Repeat until 100% pass OR timeout approaching
|
|
257
|
-
4. Reserve 20% of time budget for iteration
|
|
258
|
-
|
|
259
|
-
### Pattern 14: Output Format Validation (OFV)
|
|
260
|
-
When task specifies exact output format:
|
|
261
|
-
1. Extract format requirements from task description
|
|
262
|
-
2. Validate output matches spec before completion
|
|
263
|
-
3. If validation fails: fix output, don't re-explain
|
|
264
|
-
|
|
265
|
-
### Pattern 15: Exception Recovery (ER)
|
|
266
|
-
When code throws exceptions:
|
|
267
|
-
1. Read the stack trace
|
|
268
|
-
2. Fix the specific error (ImportError -> install/use stdlib, FileNotFoundError -> create file)
|
|
269
|
-
3. Re-run and verify
|
|
270
|
-
4. Don't give up after first exception
|
|
271
|
-
|
|
272
|
-
### Pattern 16: Task-First Execution (TFE)
|
|
273
|
-
**CRITICAL: Prevents regression where agent analyzes but forgets to create outputs**
|
|
274
|
-
|
|
275
|
-
BEFORE any analysis or exploration:
|
|
276
|
-
1. **Parse task for REQUIRED OUTPUTS** (files, artifacts, states)
|
|
277
|
-
2. **Create MINIMAL WORKING versions immediately** (touch files, create stubs)
|
|
278
|
-
3. **THEN refine to full implementation**
|
|
279
|
-
4. **Never finish a turn without outputs existing**
|
|
280
|
-
|
|
281
|
-
### Pattern 17: Constraint Extraction (CE)
|
|
282
|
-
Agent misses specific constraints buried in task descriptions.
|
|
283
|
-
|
|
284
|
-
BEFORE implementing:
|
|
285
|
-
1. **Parse for constraint keywords**: "exactly", "only", "single", "must be", "no more than"
|
|
286
|
-
2. **Extract format constraints**: file types, line counts, size limits, response formats
|
|
287
|
-
3. **Create constraint checklist**
|
|
288
|
-
4. **Validate EACH constraint before completion**
|
|
289
|
-
|
|
290
|
-
### Pattern 18: Multi-Tool Pipeline (MTP)
|
|
291
|
-
Complex tasks need multiple tools chained together.
|
|
292
|
-
|
|
293
|
-
1. **Decompose task into stages** (input -> transform -> output)
|
|
294
|
-
2. **Identify tool for EACH stage** (don't implement from scratch if tool exists)
|
|
295
|
-
3. **Verify tools available**: `which tool || pip install package`
|
|
296
|
-
4. **Chain in pipeline**
|
|
297
|
-
|
|
298
|
-
### Pattern 19: Enhanced Impossible Task Refusal (ITR+)
|
|
299
|
-
Pattern 5 identifies impossible tasks but agent still attempts them.
|
|
300
|
-
|
|
301
|
-
**Impossibility markers -> REFUSE IMMEDIATELY:**
|
|
302
|
-
- Compression below Shannon limit (e.g., "GPT-2 in 5KB" - weights are 500MB+)
|
|
303
|
-
- Pre-trained weights required but none provided
|
|
304
|
-
- Exhaustive search in exponential space
|
|
305
|
-
- Undecidable problems
|
|
306
|
-
|
|
307
|
-
### Pattern 20: Adversarial Thinking (AT)
|
|
308
|
-
Security bypass/attack tasks require "attacker mindset".
|
|
309
|
-
|
|
310
|
-
1. **Detect**: Keywords "bypass", "break", "exploit", "circumvent", "evade"
|
|
311
|
-
2. **Flip perspective**: Not "how to use correctly" but "how to break it"
|
|
312
|
-
3. **Enumerate attack vectors** (case variation, encoding, null bytes, context breaking)
|
|
313
|
-
4. **Test each vector until one works**
|
|
314
|
-
|
|
315
|
-
---
|
|
316
|
-
|
|
317
|
-
## DOMAIN-SPECIFIC PATTERNS
|
|
318
|
-
|
|
319
|
-
**Task-specific guidance for common failure categories.**
|
|
320
|
-
|
|
321
|
-
### Pattern 21: Chess Engine Integration (CEI)
|
|
322
|
-
Chess tasks require external engines - reasoning alone cannot find optimal moves.
|
|
323
|
-
|
|
324
|
-
**Detection**: Keywords "chess", "best move", "FEN", "PGN", "checkmate"
|
|
325
|
-
|
|
326
|
-
**Protocol**:
|
|
327
|
-
1. Check for Stockfish: `which stockfish || apt-get install -y stockfish`
|
|
328
|
-
2. Parse board state to FEN format
|
|
329
|
-
3. Use Stockfish for analysis
|
|
330
|
-
4. Extract best move from engine output
|
|
331
|
-
|
|
332
|
-
**NEVER**: Try to calculate chess moves through reasoning - use the engine.
|
|
333
|
-
|
|
334
|
-
### Pattern 22: Git Recovery Forensics (GRF)
|
|
335
|
-
Git recovery tasks require forensic approach, not standard git commands.
|
|
336
|
-
|
|
337
|
-
**Detection**: Keywords "recover", "corrupted", "lost commit", "fix git", "reflog"
|
|
338
|
-
|
|
339
|
-
**Protocol**:
|
|
340
|
-
1. **BACKUP FIRST**: `cp -r .git .git.bak`
|
|
341
|
-
2. Check integrity: `git fsck --full --no-dangling`
|
|
342
|
-
3. Check reflog: `git reflog` (commits survive even after reset)
|
|
343
|
-
4. Check loose objects: `find .git/objects -type f`
|
|
344
|
-
5. Recover from pack files if needed
|
|
345
|
-
|
|
346
|
-
**NEVER**: Run destructive commands without backup. Use `--dry-run` first.
|
|
347
|
-
|
|
348
|
-
### Pattern 23: Compression Impossibility Detection (CID)
|
|
349
|
-
Some compression tasks are mathematically impossible.
|
|
350
|
-
|
|
351
|
-
**Detection**: Keywords "compress", "codegolf", "under X bytes", "minimal size"
|
|
352
|
-
|
|
353
|
-
**Impossibility Markers**:
|
|
354
|
-
- ML model weights (GPT-2 = 500MB+, cannot be <1MB without losing function)
|
|
355
|
-
- Random/encrypted data (incompressible by definition)
|
|
356
|
-
- Asking for compression ratio beyond information-theoretic limits
|
|
357
|
-
|
|
358
|
-
**NEVER**: Attempt to compress ML weights to <1% of original size.
|
|
359
|
-
|
|
360
|
-
### Pattern 24: Polyglot Code Construction (PCC)
|
|
361
|
-
Polyglot tasks require specific compiler/interpreter quirks.
|
|
362
|
-
|
|
363
|
-
**Detection**: Keywords "polyglot", "works in both", "compile as X and Y"
|
|
364
|
-
|
|
365
|
-
**Protocol**: Search for existing polyglot examples before implementing.
|
|
366
|
-
|
|
367
|
-
### Pattern 25: Service Configuration Pipeline (SCP)
|
|
368
|
-
Multi-service configuration requires ordered setup.
|
|
369
|
-
|
|
370
|
-
**Detection**: Keywords "configure", "server", "webserver", "service", "daemon"
|
|
371
|
-
|
|
372
|
-
**Protocol**:
|
|
373
|
-
1. **Identify all services** needed
|
|
374
|
-
2. **Check service status**: `systemctl status <service>`
|
|
375
|
-
3. **Configure in dependency order** (base -> dependent)
|
|
376
|
-
4. **Test each service** before moving to next
|
|
377
|
-
5. **Verify end-to-end** after all configured
|
|
378
|
-
|
|
379
|
-
### Pattern 26: Near-Miss Iteration (NMI)
|
|
380
|
-
When tests show >50% passing, focus on specific failing tests.
|
|
381
|
-
|
|
382
|
-
**Detection**: Test results show partial success (e.g., 8/9, 6/7, 5/6)
|
|
383
|
-
|
|
384
|
-
**Protocol**:
|
|
385
|
-
1. Run tests with verbose output
|
|
386
|
-
2. Extract ONLY failing test names
|
|
387
|
-
3. Read failing test code to understand exact requirement
|
|
388
|
-
4. Fix specific issue without breaking passing tests
|
|
389
|
-
5. Re-run ONLY failing tests first
|
|
390
|
-
6. Then run full suite to verify no regressions
|
|
391
|
-
|
|
392
|
-
**Reserve 30% of time budget for near-miss iteration.**
|
|
393
|
-
|
|
394
|
-
### Pattern 27: Output Directory Cleanup (ODC)
|
|
395
|
-
Tests often check for ONLY specific files in output directories.
|
|
396
|
-
|
|
397
|
-
**Detection**: Tasks mentioning "single file", "only", constraints on output directory contents
|
|
398
|
-
|
|
399
|
-
**Protocol**:
|
|
400
|
-
1. **Before completing**, list output directory
|
|
401
|
-
2. **Remove non-required files**: compiled binaries, temp files, backups
|
|
402
|
-
3. **Keep ONLY the required outputs** as specified in task
|
|
403
|
-
|
|
404
|
-
### Pattern 28: Service Smoke Test (SST)
|
|
405
|
-
Services must be tested BEFORE claiming completion.
|
|
406
|
-
|
|
407
|
-
**Detection**: Tasks involving servers, daemons, APIs, web services
|
|
408
|
-
|
|
409
|
-
**Protocol**:
|
|
410
|
-
1. Start the service
|
|
411
|
-
2. **Immediately test it** with curl/wget/client
|
|
412
|
-
3. If test fails: debug, fix, restart, re-test
|
|
413
|
-
4. Only complete when test PASSES
|
|
414
|
-
|
|
415
|
-
**NEVER complete without a successful smoke test.**
|
|
416
|
-
|
|
417
|
-
### Pattern 29: Multi-Solution Discovery (MSD)
|
|
418
|
-
Some tasks require finding ALL valid solutions, not just one.
|
|
419
|
-
|
|
420
|
-
**Detection**: Keywords "all moves", "both solutions", "list all", "find every"
|
|
421
|
-
|
|
422
|
-
### Pattern 30: Performance Threshold Tuning (PTT)
|
|
423
|
-
Tasks with numeric thresholds require iterative tuning.
|
|
424
|
-
|
|
425
|
-
**Detection**: Keywords "win rate", "accuracy", "percentage", "threshold", "at least X%"
|
|
426
|
-
|
|
427
|
-
### Pattern 31: Round-Trip Verification (RTV)
|
|
428
|
-
For transform/encode/compress tasks, verify the reverse operation.
|
|
429
|
-
|
|
430
|
-
**Detection**: Keywords "compress", "encode", "serialize", "encrypt", and task mentions reverse operation.
|
|
431
|
-
|
|
432
|
-
**Protocol**:
|
|
433
|
-
1. Create test data
|
|
434
|
-
2. Apply forward transform (compress)
|
|
435
|
-
3. **Immediately apply reverse** (decompress)
|
|
436
|
-
4. **Verify original == result**
|
|
437
|
-
5. Fix if not matching
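The Pattern 31 protocol can be sketched with Python's standard `zlib` module standing in for whatever transform the task actually uses. The function name `round_trip_ok` and the sample inputs are illustrative assumptions.

```python
import os
import zlib


def round_trip_ok(data: bytes) -> bool:
    """Compress, immediately decompress, and compare with the original."""
    return zlib.decompress(zlib.compress(data)) == data


if __name__ == "__main__":
    samples = {
        "empty": b"",
        "repetitive": b"abc" * 1000,
        "random": os.urandom(4096),  # effectively incompressible input
    }
    for name, payload in samples.items():
        print(name, round_trip_ok(payload))
```

Including empty and random inputs alongside ordinary data exercises the cases where a hand-written encoder is most likely to diverge from its decoder.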
|
|
438
|
-
|
|
439
|
-
### Pattern 32: CLI Execution Verification (CEV)
|
|
440
|
-
When creating executable CLI tools, verify execution method matches tests.
|
|
441
|
-
|
|
442
|
-
**Detection**: Tasks requiring executable scripts, CLI tools, command-line interfaces
|
|
443
|
-
|
|
444
|
-
**Protocol**:
|
|
445
|
-
1. Add proper shebang: `#!/usr/bin/env python3`
|
|
446
|
-
2. Make executable: `chmod +x <script>`
|
|
447
|
-
3. **Test EXACTLY as verifier will run it**: `./tool args` not `python3 tool args`
|
|
448
|
-
4. Verify output format matches expected format
|
|
449
|
-
|
|
450
|
-
**Common mistake**: Script works with `python3 script.py` but fails with `./script.py` (missing shebang/chmod)
|
|
451
|
-
|
|
452
|
-
### Pattern 33: Numerical Stability Testing (NST)
|
|
453
|
-
Numerical algorithms require robustness against edge cases.
|
|
454
|
-
|
|
455
|
-
**Detection**: Statistical sampling, numerical optimization, floating-point computation
|
|
456
|
-
|
|
457
|
-
**Protocol**:
|
|
458
|
-
1. Test with multiple random seeds (3+ iterations, not just one)
|
|
459
|
-
2. Test domain boundaries explicitly (0, near-zero, infinity)
|
|
460
|
-
3. Use adaptive step sizes for derivative computation
|
|
461
|
-
4. Add tolerance margins for floating-point comparisons (1e-6 typical)
|
|
462
|
-
5. Handle edge cases: empty input, single element, maximum values
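The following sketch applies steps 1, 4, and 5 of Pattern 33 to a toy Monte Carlo estimate of pi. The estimator, the function names, and the tolerance values are assumptions chosen for illustration; a sampling estimate needs a far looser tolerance than the 1e-6 suggested for ordinary floating-point comparisons.

```python
import math
import random


def estimate_pi(samples: int, seed: int) -> float:
    """Monte Carlo estimate of pi from `samples` points in the unit square."""
    if samples <= 0:
        raise ValueError("samples must be positive")  # reject the empty case
    rng = random.Random(seed)  # private generator, reproducible per seed
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


def stable_across_seeds(seeds, samples: int, abs_tol: float) -> bool:
    """True if every seed's estimate lies within `abs_tol` of pi."""
    return all(
        math.isclose(estimate_pi(samples, s), math.pi, abs_tol=abs_tol)
        for s in seeds
    )


if __name__ == "__main__":
    print(stable_across_seeds((0, 1, 2), samples=20_000, abs_tol=0.1))
    # Exact equality is unreliable for floats; compare with a tolerance.
    print(math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-6))
```

Checking several seeds guards against a result that passes only by luck on one random stream.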
|
|
463
|
-
|
|
464
|
-
### Pattern 34: Image-to-Structured Pipeline (ISP)
|
|
465
|
-
Visual data requires dedicated recognition tools, not reasoning.
|
|
466
|
-
|
|
467
|
-
**Detection**: Tasks involving image analysis, diagram parsing, visual data extraction
|
|
468
|
-
|
|
469
|
-
**Protocol**:
|
|
470
|
-
1. **NEVER rely on visual reasoning alone** - accuracy is unreliable
|
|
471
|
-
2. Search for existing recognition libraries
|
|
472
|
-
3. Verify extracted structured data before using
|
|
473
|
-
4. If no tools available, clearly state the limitation
|
|
474
|
-
|
|
475
|
-
### Pattern 35: Decoder-First Analysis (DFA)
|
|
476
|
-
For encode/compress tasks with provided decoder, analyze decoder FIRST.
|
|
477
|
-
|
|
478
|
-
**Detection**: Task provides a decoder/decompressor and asks to create encoder/compressor
|
|
479
|
-
|
|
480
|
-
**Protocol**:
|
|
481
|
-
1. **Read and understand the provided decoder** before writing encoder
|
|
482
|
-
2. Identify expected input format from decoder source
|
|
483
|
-
3. Create minimal test case matching decoder's expected format
|
|
484
|
-
4. Test round-trip with decoder BEFORE optimizing for size
|
|
485
|
-
5. If decoder crashes, your format is wrong - don't optimize further
|
|
486
|
-
|
|
487
|
-
### Pattern 36: Competition Domain Research (CDR)
|
|
488
|
-
Competitive tasks benefit from researching domain-specific winning strategies.
|
|
489
|
-
|
|
490
|
-
**Detection**: Keywords "win rate", "beat", "competition", "versus", "tournament"
|
|
491
|
-
|
|
492
|
-
**Protocol**:
|
|
493
|
-
1. **Research domain strategies BEFORE implementing**
|
|
494
|
-
2. Time-box implementation iterations: stop at 70% time budget
|
|
495
|
-
3. Track progress per iteration to identify improvement trajectory
|
|
496
|
-
4. If not meeting threshold, document best achieved + gap
|
|
111
|
+
**On work request**: `uam task create --title "..." --type task|bug|feature`
|
|
497
112
|
|
|
498
113
|
---
|
|
499
114
|
|
|
500
|
-
##
|
|
501
|
-
|
|
502
|
-
**Six patterns derived from state-of-the-art LLM optimization research (2025-2026). Address reasoning depth, self-verification, branching exploration, feedback grounding, and task atomization.**
|
|
503
|
-
|
|
504
|
-
### Pattern 37: Pre-Implementation Verification (PIV)
|
|
505
|
-
**CRITICAL: Prevents wrong-approach waste — the #1 cause of wasted compute.**
|
|
506
|
-
|
|
507
|
-
After planning but BEFORE writing any code, explicitly verify your approach:
|
|
508
|
-
|
|
509
|
-
**Detection**: Any implementation task (always active for non-trivial changes)
|
|
115
|
+
## DECISION LOOP
|
|
510
116
|
|
|
511
|
-
**Protocol**:
|
|
512
117
|
```
|
|
513
|
-
|
|
514
|
-
|
|
515
|
-
|
|
516
|
-
|
|
517
|
-
|
|
518
|
-
|
|
519
|
-
|
|
118
|
+
1. CLASSIFY -> complexity? backup needed? tools?
|
|
119
|
+
2. PROTECT -> cp file file.bak (for configs, DBs, critical files)
|
|
120
|
+
3. MEMORY -> query relevant context + past failures
|
|
121
|
+
4. AGENTS -> check overlaps (if multi-agent)
|
|
122
|
+
5. SKILLS -> check {{SKILLS_PATH}} for domain-specific guidance
|
|
123
|
+
6. WORK -> implement (ALWAYS use worktree for ANY file changes)
|
|
124
|
+
7. REVIEW -> self-review diff before testing
|
|
125
|
+
8. TEST -> completion gates pass
|
|
126
|
+
9. LEARN -> store outcome in memory
|
|
520
127
|
```
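Step 2 of the loop (PROTECT) amounts to `cp file file.bak`. A minimal Python equivalent is sketched below; the helper name `backup_file` and the example file are hypothetical and not part of the `uam` tooling.

```python
import shutil
import tempfile
from pathlib import Path


def backup_file(path: Path) -> Path:
    """Copy `path` to `<name>.bak` beside it and return the backup path."""
    if not path.is_file():
        raise FileNotFoundError(f"nothing to back up: {path}")
    backup = path.with_name(path.name + ".bak")
    # copy2 also tries to preserve metadata such as timestamps.
    shutil.copy2(path, backup)
    return backup


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "config.yml"
        config.write_text("key: value\n")
        print(backup_file(config).name)
```

As with `cp`, an existing `.bak` file is overwritten, so a second call after the original has been modified would lose the earlier backup.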
|
|
521
128
|
|
|
522
|
-
|
|
523
|
-
|
|
524
|
-
*Research basis: CoT verification (+4.3% accuracy), Reflexion framework (+18.5%), SEER adaptive reasoning (+4-9%)*
|
|
525
|
-
|
|
526
|
-
### Pattern 38: Branching Exploration (BE)
|
|
527
|
-
For complex or ambiguous problems, explore multiple approaches before committing.
|
|
528
|
-
|
|
529
|
-
**Detection**: Problem has multiple valid approaches, ambiguous requirements, or high complexity
|
|
530
|
-
|
|
531
|
-
**Protocol**:
|
|
532
|
-
1. **Generate 2-3 candidate approaches** (brief description, not full implementation)
|
|
533
|
-
2. **Evaluate each** against: simplicity, correctness likelihood, test-compatibility, side-effect risk
|
|
534
|
-
3. **Select best** with explicit reasoning
|
|
535
|
-
4. **Commit fully** to selected approach — no mid-implementation switching
|
|
536
|
-
5. **If selected approach fails**: backtrack to step 1, eliminate failed approach, try next
|
|
537
|
-
|
|
538
|
-
**NEVER**: Start coding the first approach that comes to mind for complex problems.
|
|
539
|
-
**ALWAYS**: Spend 5% of effort exploring alternatives to save 50% on wrong-path recovery.
|
|
540
|
-
|
|
541
|
-
*Research basis: MCTS-guided code generation (RethinkMCTS: 70%→89% pass@1), Policy-Guided Tree Search*
|
|
542
|
-
|
|
543
|
-
### Pattern 39: Execution Feedback Grounding (EFG)
|
|
544
|
-
Learn from test failures systematically — don't just fix, understand and remember.
|
|
545
|
-
|
|
546
|
-
**Detection**: Any test failure or runtime error during implementation
|
|
547
|
-
|
|
548
|
-
**Protocol**:
|
|
549
|
-
1. **Categorize the failure** using the Failure Taxonomy (see below)
|
|
550
|
-
2. **Identify root cause** (not just the symptom the error message shows)
|
|
551
|
-
3. **Fix with explanation**: What was wrong, why, and what the fix addresses
|
|
552
|
-
4. **Store structured feedback** in memory:
|
|
553
|
-
```bash
|
|
554
|
-
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'failure_analysis','type:<category>|cause:<root_cause>|fix:<what_fixed>|file:<filename>');"
|
|
555
|
-
```
|
|
556
|
-
5. **Query before similar tasks**: Before implementing, check memory for past failures in same area
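Steps 4 and 5 of Pattern 39 can be sketched in Python with bound parameters instead of an interpolated SQL string. The column list and the pipe-delimited `content` format are taken from the `INSERT` statement in step 4; the `id` column, the function names, and the substring-based lookup are assumptions made for this example.

```python
import sqlite3


def format_failure(category: str, cause: str, fix: str, filename: str) -> str:
    """Build the pipe-delimited content string shown in step 4."""
    return f"type:{category}|cause:{cause}|fix:{fix}|file:{filename}"


def store_failure(conn, category: str, cause: str, fix: str, filename: str) -> None:
    """Insert one `failure_analysis` row using a bound parameter."""
    conn.execute(
        "INSERT INTO memories (timestamp, type, content) "
        "VALUES (datetime('now'), 'failure_analysis', ?)",
        (format_failure(category, cause, fix, filename),),
    )
    conn.commit()


def past_failures(conn, filename: str) -> list[str]:
    """Return stored failure records whose content mentions `filename`."""
    cur = conn.execute(
        "SELECT content FROM memories "
        "WHERE type = 'failure_analysis' AND instr(content, ?) > 0 "
        "ORDER BY id",
        ("file:" + filename,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memories "
        "(id INTEGER PRIMARY KEY, timestamp TEXT, type TEXT, content TEXT)"
    )
    store_failure(conn, "edge_case", "empty input", "added guard", "parser.py")
    print(past_failures(conn, "parser.py"))
```

The lookup is a plain substring match, so `parser.py` would also match a record for `parser.pyc`, and a `|` inside any field would make the stored string ambiguous; a stricter design would store the fields in separate columns.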
|
|
557
|
-
|
|
558
|
-
**Failure Taxonomy** (use for categorization):
|
|
559
|
-
| Type | Description | Recovery Strategy |
|
|
560
|
-
|------|-------------|-------------------|
|
|
561
|
-
| `dependency_missing` | Import/module not found | Install or use stdlib alternative |
|
|
562
|
-
| `wrong_approach` | Fundamentally incorrect solution | P38 Branching - try different approach |
|
|
563
|
-
| `format_mismatch` | Output doesn't match expected format | P14 OFV - re-read spec carefully |
|
|
564
|
-
| `edge_case` | Works for happy path, fails on edge | Add boundary checks, test with extremes |
|
|
565
|
-
| `state_mutation` | Unexpected side effect on shared state | Isolate mutations, use copies |
|
|
566
|
-
| `concurrency` | Race condition or timing issue | Add locks, use sequential fallback |
|
|
567
|
-
| `timeout` | Exceeded time/resource limit | Optimize algorithm, reduce scope |
|
|
568
|
-
| `environment` | Works locally, fails in target env | P1 Environment Isolation checks |
|
|
569
|
-
|
|
570
|
-
*Research basis: RLEF/RLVR (RL from Execution Feedback), verifiable rewards for coding agents*
|
|
571
|
-
|
|
572
|
-
### Pattern 40: Adaptive Reasoning Depth (ARD)
|
|
573
|
-
Match reasoning effort to task complexity — don't over-think simple tasks or under-think hard ones.
|
|
574
|
-
|
|
575
|
-
**Detection**: Applied automatically at Pattern Router stage
|
|
576
|
-
|
|
577
|
-
**Complexity Classification**:
|
|
578
|
-
| Complexity | Indicators | Reasoning Protocol |
|
|
579
|
-
|-----------|------------|-------------------|
|
|
580
|
-
| **Simple** | Single file, clear spec, known pattern, <20 lines | Direct implementation. No exploration phase. |
|
|
581
|
-
| **Moderate** | Multi-file, some ambiguity, 20-200 lines | Plan-then-implement. State assumptions. P37 verify. |
|
|
582
|
-
| **Complex** | Cross-cutting concerns, ambiguous spec, >200 lines, unfamiliar domain | P38 explore → P37 verify → implement → P39 feedback loop. |
|
|
583
|
-
| **Research** | Unknown solution space, no clear approach | Research first (web search, codebase analysis) → P38 explore → implement iteratively. |
|
|
584
|
-
|
|
585
|
-
**Rule**: Never apply Complex-level reasoning to Simple tasks (wastes tokens). Never apply Simple-level reasoning to Complex tasks (causes failures).
|
|
586
|
-
|
|
587
|
-
*Research basis: SEER adaptive CoT, test-time compute scaling (2-3x gains from adaptive depth)*
|
|
588
|
-
|
|
589
|
-
### Pattern 41: Atomic Task Loop (ATL)
|
|
590
|
-
For multi-step changes, decompose into atomic units with clean boundaries.
|
|
591
|
-
|
|
592
|
-
**Detection**: Task involves changes to 3+ files, or multiple independent concerns
|
|
593
|
-
|
|
594
|
-
**Protocol**:
|
|
595
|
-
1. **Decompose** the task into atomic sub-tasks (each independently testable)
|
|
596
|
-
2. **Order** by dependency (upstream changes first)
|
|
597
|
-
3. **For each sub-task**:
|
|
598
|
-
a. Implement the change (single concern only)
|
|
599
|
-
b. Run relevant tests
|
|
600
|
-
c. Commit if tests pass
|
|
601
|
-
d. If context is getting long/confused, note progress and continue fresh
|
|
602
|
-
4. **Final verification**: Run full test suite after all sub-tasks complete
|
|
603
|
-
|
|
604
|
-
**Atomicity rules**:
|
|
605
|
-
- Each sub-task modifies ideally 1-2 files
|
|
606
|
-
- Each sub-task has a clear pass/fail criterion
|
|
607
|
-
- Sub-tasks should not depend on uncommitted work from other sub-tasks
|
|
608
|
-
- If a sub-task fails, only that sub-task needs rework
|
|
609
|
-
|
|
610
|
-
*Research basis: Addy Osmani's continuous coding loop, context drift prevention research*
|
|
611
|
-
|
|
612
|
-
### Pattern 42: Critic-Before-Commit (CBC)
|
|
613
|
-
Review your own diff against requirements before running tests.
|
|
129
|
+
---
|
|
614
130
|
|
|
615
|
-
|
|
131
|
+
## MEMORY SYSTEM
|
|
616
132
|
|
|
617
|
-
**Protocol**:
|
|
618
133
|
```
|
|
619
|
-
|
|
620
|
-
|
|
621
|
-
|
|
622
|
-
|
|
623
|
-
☐ Does the diff address ALL requirements from the task?
|
|
624
|
-
☐ Are there any unintended changes (debug prints, commented code, temp files)?
|
|
625
|
-
☐ Does the code handle the error/edge cases mentioned in the spec?
|
|
626
|
-
☐ Is the code consistent with surrounding style and conventions?
|
|
627
|
-
☐ Would this diff make sense to a reviewer with no context?
|
|
628
|
-
|
|
629
|
-
ISSUES FOUND: [list or "none"]
|
|
630
|
-
=== END REVIEW ===
|
|
134
|
+
L1 Working | SQLite memories | {{SHORT_TERM_LIMIT}} max | <1ms
|
|
135
|
+
L2 Session | SQLite session_mem | current session | <5ms
|
|
136
|
+
L3 Semantic | {{LONG_TERM_BACKEND}}| search | ~50ms
|
|
137
|
+
L4 Knowledge| SQLite entities/rels | graph | <20ms
|
|
631
138
|
```
|
|
632
139
|
|
|
633
|
-
|
|
634
|
-
|
|
635
|
-
*Research basis: Multi-agent reflection (actor+critic, +20% accuracy), RL^V unified reasoner-verifier*
|
|
636
|
-
|
|
637
|
-
---
|
|
638
|
-
|
|
639
|
-
## CONTEXT OPTIMIZATION
|
|
640
|
-
|
|
641
|
-
**Reduce token waste and improve response quality through intelligent context management.**
|
|
642
|
-
|
|
643
|
-
### Progressive Context Disclosure
|
|
644
|
-
Not all patterns are needed for every task. The Pattern Router activates only relevant patterns.
|
|
645
|
-
- **Always loaded**: Pattern Router, Completion Gates, Error Recovery
|
|
646
|
-
- **Loaded on activation**: Only patterns flagged YES by router
|
|
647
|
-
- **Summarize, don't repeat**: When referencing prior work, summarize in 1-2 lines, don't paste full output
|
|
648
|
-
|
|
649
|
-
### Context Hygiene
|
|
650
|
-
- **Prune completed context**: After a sub-task completes, don't carry its full debug output forward
|
|
651
|
-
- **Compress tool output**: Quote only the 2-3 lines that inform the next decision
|
|
652
|
-
- **Avoid context poisoning**: Don't include failed approaches in context unless actively debugging them
|
|
653
|
-
- **Reset on drift**: If responses become unfocused or repetitive, summarize progress and continue with clean context
|
|
654
|
-
|
|
655
|
-
### Token Budget Awareness
|
|
656
|
-
| Task Type | Target Context Usage | Strategy |
|
|
657
|
-
|-----------|---------------------|----------|
|
|
658
|
-
| Simple fix | <10% of window | Direct implementation, minimal exploration |
|
|
659
|
-
| Feature implementation | 30-50% of window | Structured exploration, then focused implementation |
|
|
660
|
-
| Complex debugging | 50-70% of window | Deep investigation justified, but prune between attempts |
|
|
661
|
-
| Research/exploration | 20-40% of window | Broad search first, then narrow and deep |
|
|
662
|
-
|
|
663
|
-
---
|
|
664
|
-
|
|
665
|
-
## SELF-IMPROVEMENT PROTOCOL
|
|
666
|
-
|
|
667
|
-
**The agent improves its own effectiveness over time by learning from outcomes.**
|
|
140
|
+
### Commands
|
|
668
141
|
|
|
669
|
-
|
|
670
|
-
|
|
671
|
-
|
|
672
|
-
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'outcome','task:<summary>|result:<pass/fail>|patterns_used:<list>|time_spent:<estimate>|failure_type:<category_or_none>',8);"
|
|
673
|
-
```
|
|
142
|
+
```bash
|
|
143
|
+
# L1: Working Memory
|
|
144
|
+
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'action','...');"
|
|
674
145
|
|
|
675
|
-
|
|
676
|
-
|
|
677
|
-
{{MEMORY_STORE_CMD}} lesson "Failed on <task_type>: <what_went_wrong>. Fix: <what_worked>." --tags failure,<category>,<language> --importance 8
|
|
678
|
-
```
|
|
146
|
+
# L2: Session Memory
|
|
147
|
+
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'decision','...',7);"
|
|
679
148
|
|
|
680
|
-
|
|
681
|
-
|
|
682
|
-
{{MEMORY_STORE_CMD}} lesson "New technique for <domain>: <technique_description>. Use when <conditions>." --tags technique,<domain> --importance 9
|
|
683
|
-
```
|
|
149
|
+
# L3: Semantic Memory
|
|
150
|
+
{{MEMORY_STORE_CMD}} lesson "..." --tags t1,t2 --importance 8
|
|
684
151
|
|
|
685
|
-
|
|
686
|
-
|
|
687
|
-
```bash
|
|
688
|
-
sqlite3 ./{{MEMORY_DB_PATH}} "SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<relevant_keyword>%' ORDER BY timestamp DESC LIMIT 5;"
|
|
152
|
+
# L4: Knowledge Graph
|
|
153
|
+
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO entities (type,name,first_seen,last_seen,mention_count) VALUES ('file','x.ts',datetime('now'),datetime('now'),1);"
|
|
689
154
|
```
|
|
690
155
|
|
|
691
|
-
|
|
692
|
-
Over time, accumulate repository-specific patterns:
|
|
693
|
-
- Which test frameworks and assertions this repo uses
|
|
694
|
-
- Common failure modes in this codebase
|
|
695
|
-
- Preferred code style and naming conventions
|
|
696
|
-
- Architecture decisions and their rationale
|
|
697
|
-
|
|
698
|
-
Store these as high-importance semantic memories tagged with the repo name.
|
|
156
|
+
Decay: `effective_importance = importance * (0.95 ^ days_since_access)`
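As a rough illustration of the decay rule, the sketch below evaluates it in Python. The function name `effective_importance` and the non-negative check are illustrative assumptions, not part of UAM. Note that `^` in the formula denotes exponentiation, which Python spells `**`.

```python
def effective_importance(importance: float, days_since_access: float) -> float:
    """Apply the 5%-per-day exponential decay from the formula above."""
    if days_since_access < 0:
        # Assumption: negative ages are treated as caller errors.
        raise ValueError("days_since_access must be non-negative")
    return importance * (0.95 ** days_since_access)


# An importance-8 memory that has not been accessed for two weeks.
fresh = effective_importance(8, 0)
stale = effective_importance(8, 14)
print(f"fresh={fresh:.2f} stale={stale:.2f}")
```

Under this formula a memory loses half of its effective importance after about 13.5 days without access, since 0.95^13.5 is approximately 0.5.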
|
|
699
157
|
|
|
700
158
|
---
|
|
701
159
|
|
|
702
|
-
##
|
|
160
|
+
## WORKTREE WORKFLOW — MANDATORY
|
|
703
161
|
|
|
704
|
-
**
|
|
162
|
+
> **MANDATORY**: ALL file changes MUST use a worktree. No exceptions. Never commit directly to any branch without a worktree. After PR is merged, worktree cleanup is MANDATORY — never leave stale worktrees.
|
|
705
163
|
|
|
706
|
-
|
|
707
|
-
|
|
708
|
-
|
|
709
|
-
- [ ] Names are self-documenting (no single-letter variables outside loops)
|
|
710
|
-
- [ ] Error paths handled explicitly (not just happy path)
|
|
711
|
-
- [ ] No debug prints, console.logs, or commented-out code left behind
|
|
712
|
-
- [ ] Consistent with surrounding code style (indentation, naming, patterns)
|
|
713
|
-
- [ ] No hardcoded values that should be constants or config
|
|
714
|
-
- [ ] Imports are minimal — only what's actually used
|
|
715
|
-
|
|
716
|
-
### Code Smell Detection
|
|
717
|
-
If you notice any of these, fix before committing:
|
|
718
|
-
- **Duplicated logic** → Extract to shared function
|
|
719
|
-
- **Deep nesting (>3 levels)** → Early returns, extract helper
|
|
720
|
-
- **Boolean parameters** → Consider separate methods or options object
|
|
721
|
-
- **Magic numbers** → Named constants
|
|
722
|
-
- **Catch-all error handling** → Specific error types with appropriate responses
|
|
723
|
-
|
|
724
|
-
---
|
|
725
|
-
|
|
726
|
-
## SESSION START PROTOCOL
|
|
727
|
-
|
|
728
|
-
**EXECUTE IMMEDIATELY before any response:**
|
|
164
|
+
| Change Scope | Workflow |
|
|
165
|
+
|-------------|----------|
|
|
166
|
+
| ANY file change (even single-file) | **Worktree REQUIRED** |
|
|
729
167
|
|
|
730
168
|
```bash
|
|
731
|
-
|
|
732
|
-
|
|
733
|
-
|
|
734
|
-
|
|
169
|
+
{{WORKTREE_CREATE_CMD}} <slug> # ALWAYS create first
|
|
170
|
+
cd {{WORKTREE_DIR}}/NNN-<slug>/
|
|
171
|
+
git add -A && git commit -m "type: description"
|
|
172
|
+
{{WORKTREE_PR_CMD}} <id> # Create PR
|
|
173
|
+
# After PR merge:
|
|
174
|
+
{{WORKTREE_CLEANUP_CMD}} <id> # MANDATORY cleanup after merge
|
|
735
175
|
```
|
|
736
176
|
|
|
737
|
-
**
|
|
738
|
-
|
|
739
|
-
---
|
|
177
|
+
**Applies to**: {{WORKTREE_APPLIES_TO}} — ALL changes without exception
|
|
740
178
|
|
|
741
|
-
|
|
179
|
+
**Cleanup is MANDATORY**: After every PR merge, immediately run `{{WORKTREE_CLEANUP_CMD}} <id>`. Never leave merged worktrees behind.
|
|
742
180
|
|
|
743
|
-
|
|
181
|
+
---
|
|
744
182
|
|
|
745
|
-
|
|
183
|
+
## MULTI-AGENT COORDINATION
|
|
746
184
|
|
|
747
|
-
|
|
185
|
+
**Skip for single-agent sessions.** Only activate when multiple agents work concurrently.
|
|
748
186
|
|
|
749
187
|
```bash
|
|
750
188
|
uam agent overlaps --resource "<files-or-directories>"
|
|
751
189
|
```
|
|
752
190
|
|
|
753
|
-
### Overlap Response Matrix
|
|
754
|
-
|
|
755
191
|
| Risk Level | Action |
|
|
756
192
|
|------------|--------|
|
|
757
193
|
| `none` | Proceed immediately |
|
|
@@ -759,15 +195,14 @@ uam agent overlaps --resource "<files-or-directories>"
|
|
|
759
195
|
| `medium` | Announce, coordinate sections |
|
|
760
196
|
| `high`/`critical` | Wait or split work |
|
|
761
197
|
|
|
762
|
-
### Agent
|
|
198
|
+
### Agent Routing
|
|
763
199
|
|
|
764
|
-
| Task Type | Route To |
|
|
765
|
-
|
|
766
|
-
| Security review | `security-auditor` |
|
|
767
|
-
| Performance | `performance-optimizer` |
|
|
768
|
-
| Documentation | `documentation-expert` |
|
|
769
|
-
| Code quality | `code-quality-guardian` |
|
|
770
|
-
| Solution verification | self (P42 CBC) | diff review, requirement check |
|
|
200
|
+
| Task Type | Route To |
|
|
201
|
+
|-----------|----------|
|
|
202
|
+
| Security review | `security-auditor` |
|
|
203
|
+
| Performance | `performance-optimizer` |
|
|
204
|
+
| Documentation | `documentation-expert` |
|
|
205
|
+
| Code quality | `code-quality-guardian` |
|
|
771
206
|
|
|
772
207
|
{{#if LANGUAGE_DROIDS}}
|
|
773
208
|
### Language Droids
|
|
@@ -783,121 +218,13 @@ uam agent overlaps --resource "<files-or-directories>"
|
|
|
783
218
|
{{{MCP_PLUGINS}}}
|
|
784
219
|
{{/if}}
|
|
785
220
|
|
|
786
|
-
|
|
787
|
-
|
|
788
|
-
## MULTI-AGENT EXECUTION (DEPENDENCY-AWARE)
|
|
789
|
-
|
|
790
|
-
**Skip for single-agent sessions.** When using parallel subagents:
|
|
791
|
-
1. **Decompose** into discrete work items. **Map dependencies** (A blocks B).
|
|
792
|
-
2. **Parallelize** dependency-free items with separate agents and explicit file boundaries.
|
|
793
|
-
3. **Gate edits** with `uam agent overlaps --resource "<files>"` before touching any file.
|
|
794
|
-
4. **Merge in dependency order** (upstream first).
|
|
795
|
-
|
|
796
|
-
---
|
|
797
|
-
|
|
798
|
-
## TOKEN EFFICIENCY RULES
|
|
799
|
-
|
|
800
|
-
- Prefer concise, high-signal responses; avoid repeating instructions or large logs.
|
|
801
|
-
- Summarize command output; quote only the lines needed for decisions.
|
|
802
|
-
- Use parallel tool calls to reduce back-and-forth.
|
|
803
|
-
- Ask for clarification only when necessary to proceed correctly.
|
|
804
|
-
|
|
805
|
-
---
|
|
806
|
-
|
|
807
|
-
## DECISION LOOP
|
|
808
|
-
|
|
809
|
-
```
|
|
810
|
-
0. CLASSIFY -> complexity? backup? tool? steps? (P40 Adaptive Depth)
|
|
811
|
-
1. PROTECT -> cp file file.bak
|
|
812
|
-
2. MEMORY -> query relevant context + past failures (P39)
|
|
813
|
-
3. EXPLORE -> if complex: generate 2-3 approaches (P38)
|
|
814
|
-
4. VERIFY -> pre-implementation check (P37)
|
|
815
|
-
5. AGENTS -> check overlaps
|
|
816
|
-
6. SKILLS -> check {{SKILLS_PATH}}
|
|
817
|
-
7. WORKTREE -> create, work (P41 atomic tasks)
|
|
818
|
-
8. REVIEW -> self-review diff (P42)
|
|
819
|
-
9. TEST -> gates pass
|
|
820
|
-
10. LEARN -> store outcome in memory (P39)
|
|
821
|
-
```
|
|
822
|
-
|
|
823
|
-
---
|
|
824
|
-
|
|
825
|
-
## MEMORY SYSTEM
|
|
826
|
-
|
|
827
|
-
```
|
|
828
|
-
L1 Working | SQLite memories | {{SHORT_TERM_LIMIT}} max | <1ms
|
|
829
|
-
L2 Session | SQLite session_mem | current | <5ms
|
|
830
|
-
L3 Semantic | {{LONG_TERM_BACKEND}}| search | ~50ms
|
|
831
|
-
L4 Knowledge| SQLite entities/rels | graph | <20ms
|
|
832
|
-
```
|
|
833
|
-
|
|
834
|
-
### Layer Selection
|
|
835
|
-
|
|
836
|
-
| Question | YES -> Layer |
|
|
837
|
-
|----------|-------------|
|
|
838
|
-
| Just did this (last few minutes)? | L1: Working |
|
|
839
|
-
| Session-specific decision/context? | L2: Session |
|
|
840
|
-
| Reusable learning for future? | L3: Semantic |
|
|
841
|
-
| Entity relationships? | L4: Knowledge Graph |
|
|
842
|
-
|
|
843
|
-
### Memory Commands
|
|
844
|
-
|
|
845
|
-
```bash
|
|
846
|
-
# L1: Working Memory
|
|
847
|
-
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO memories (timestamp,type,content) VALUES (datetime('now'),'action','...');"
|
|
848
|
-
|
|
849
|
-
# L2: Session Memory
|
|
850
|
-
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO session_memories (session_id,timestamp,type,content,importance) VALUES ('current',datetime('now'),'decision','...',7);"
|
|
851
|
-
|
|
852
|
-
# L3: Semantic Memory
|
|
853
|
-
{{MEMORY_STORE_CMD}} lesson "..." --tags t1,t2 --importance 8
|
|
854
|
-
|
|
855
|
-
# L4: Knowledge Graph
|
|
856
|
-
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO entities (type,name,first_seen,last_seen,mention_count) VALUES ('file','x.ts',datetime('now'),datetime('now'),1);"
|
|
857
|
-
sqlite3 ./{{MEMORY_DB_PATH}} "INSERT INTO relationships (source_id,target_id,relation,timestamp) VALUES (1,2,'depends_on',datetime('now'));"
|
|
858
|
-
```
|
|
859
|
-
|
|
860
|
-
### Consolidation Rules
|
|
861
|
-
|
|
862
|
-
- **Trigger**: Every 10 working memory entries
|
|
863
|
-
- **Action**: Summarize -> session_memories, Extract lessons -> semantic memory
|
|
864
|
-
- **Dedup**: Skip if content_hash exists OR similarity > 0.92
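The dedup rule can be sketched as follows. This is a minimal illustration under stated assumptions: the template does not say how `content_hash` or similarity are computed, so the SHA-256 hash over normalized text and the `difflib.SequenceMatcher` ratio used here are stand-ins (a real backend might well compare embeddings instead). Only the 0.92 threshold comes from the rule itself.

```python
import hashlib
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.92  # from the Dedup rule above


def content_hash(text: str) -> str:
    """Hash case- and whitespace-normalized text (normalization is an assumption)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_duplicate(candidate: str, stored: list[str]) -> bool:
    """True if the candidate's hash already exists OR it is too similar to a stored entry."""
    seen_hashes = {content_hash(s) for s in stored}
    if content_hash(candidate) in seen_hashes:
        return True
    # Stand-in similarity metric: character-level ratio in [0, 1].
    return any(
        SequenceMatcher(None, candidate, s).ratio() > SIMILARITY_THRESHOLD
        for s in stored
    )
```

Checking the hash first keeps the common case cheap; the pairwise similarity scan only runs for entries that are not exact (normalized) repeats.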
|
|
865
|
-
|
|
866
|
-
### Decay Formula
|
|
867
|
-
|
|
868
|
-
```
|
|
869
|
-
effective_importance = importance * (0.95 ^ days_since_access)
|
|
870
|
-
```
|
|
871
|
-
|
|
872
|
-
---
|
|
873
|
-
|
|
874
|
-
## WORKTREE WORKFLOW
|
|
875
|
-
|
|
876
|
-
**Use worktrees for multi-file features/refactors. Skip for single-file fixes.**
|
|
877
|
-
|
|
878
|
-
| Change Scope | Workflow |
|
|
879
|
-
|-------------|----------|
|
|
880
|
-
| Single-file fix (<20 lines) | Direct commit to feature branch, no worktree needed |
|
|
881
|
-
| Multi-file change (2-5 files) | Worktree recommended if touching shared interfaces |
|
|
882
|
-
| Feature/refactor (3+ files, new feature) | Worktree required |
|
|
883
|
-
| CLAUDE.md or config changes | Worktree required |
|
|
884
|
-
|
|
885
|
-
```bash
|
|
886
|
-
# Create (when needed)
|
|
887
|
-
{{WORKTREE_CREATE_CMD}} <slug>
|
|
888
|
-
cd {{WORKTREE_DIR}}/NNN-<slug>/
|
|
889
|
-
|
|
890
|
-
# Work
|
|
891
|
-
git add -A && git commit -m "type: description"
|
|
221
|
+
### Parallel Execution
|
|
892
222
|
|
|
893
|
-
|
|
894
|
-
|
|
895
|
-
|
|
896
|
-
|
|
897
|
-
|
|
898
|
-
```
|
|
899
|
-
|
|
900
|
-
**Applies to**: {{WORKTREE_APPLIES_TO}}
|
|
223
|
+
When safe, run independent tool calls in parallel. When using parallel subagents:
|
|
224
|
+
1. Decompose into discrete work items. Map dependencies.
|
|
225
|
+
2. Parallelize dependency-free items with separate agents and explicit file boundaries.
|
|
226
|
+
3. Gate edits with `uam agent overlaps` before touching any file.
|
|
227
|
+
4. Merge in dependency order (upstream first).
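One way to picture steps 1, 2 and 4 is to group work items into batches where every item in a batch has all of its dependencies satisfied. The sketch below uses Python's standard `graphlib` module; the function name `parallel_batches` and the work-item names are hypothetical, and the `uam agent overlaps` gate from step 3 is not modeled here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group work items into batches; items within a batch can run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for deterministic output
        batches.append(ready)
        sorter.done(*ready)  # mark the whole batch finished, unblocking dependents
    return batches


# Hypothetical work items: each key maps to the items that must land before it.
work_items = {
    "schema": set(),
    "api": {"schema"},
    "cli": {"schema"},
    "docs": {"api", "cli"},
}
batches = parallel_batches(work_items)
print(batches)
```

Reading the batches in order also gives a valid merge order: everything in an earlier batch is upstream of, or independent from, everything in a later one.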
|
|
901
228
|
|
|
902
229
|
---
|
|
903
230
|
|
|
@@ -906,15 +233,12 @@ git add -A && git commit -m "type: description"
|
|
|
906
233
|
**Before ANY commit/PR, invoke quality droids in PARALLEL:**
|
|
907
234
|
|
|
908
235
|
```bash
|
|
909
|
-
# These run concurrently - do NOT wait between calls
|
|
910
236
|
Task(subagent_type: "code-quality-guardian", prompt: "Review: <files>")
|
|
911
237
|
Task(subagent_type: "security-auditor", prompt: "Audit: <files>")
|
|
912
238
|
Task(subagent_type: "performance-optimizer", prompt: "Analyze: <files>")
|
|
913
239
|
Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
|
|
914
240
|
```
|
|
915
241
|
|
|
916
|
-
### Review Priority
|
|
917
|
-
|
|
918
242
|
| Droid | Blocks PR | Fix Before Merge |
|
|
919
243
|
|-------|-----------|------------------|
|
|
920
244
|
| security-auditor | CRITICAL/HIGH | Always |
|
|
@@ -924,6 +248,19 @@ Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
|
|
|
924
248
|
|
|
925
249
|
---
|
|
926
250
|
|
|
251
|
+
## CODE QUALITY
|
|
252
|
+
|
|
253
|
+
### Pre-Commit Checklist
|
|
254
|
+
- Functions <= 30 lines
|
|
255
|
+
- Self-documenting names
|
|
256
|
+
- Error paths handled explicitly
|
|
257
|
+
- No debug prints or commented-out code left behind
|
|
258
|
+
- Consistent with surrounding code style
|
|
259
|
+
- No hardcoded values that should be constants
|
|
260
|
+
- Imports are minimal
|
|
261
|
+
|
|
262
|
+
---
|
|
263
|
+
|
|
927
264
|
## AUTOMATIC TRIGGERS
|
|
928
265
|
|
|
929
266
|
| Pattern | Action |
|
|
@@ -931,75 +268,24 @@ Task(subagent_type: "documentation-expert", prompt: "Check: <files>")
|
|
|
931
268
|
| work request (fix/add/change/update/create/implement/build) | `uam task create --type task` |
|
|
932
269
|
| bug report/error | `uam task create --type bug` |
|
|
933
270
|
| feature request | `uam task create --type feature` |
|
|
934
|
-
|
|
|
935
|
-
| multi-file feature (3+ files) | create worktree, then work |
|
|
271
|
+
| ANY file change | **create worktree (MANDATORY)** |
|
|
936
272
|
| review/check/look | query memory first |
|
|
937
273
|
| ANY code change | tests required |
|
|
938
274
|
|
|
939
|
-
**Agent coordination**: Only use `uam agent` commands when multiple agents are active concurrently. For single-agent sessions (most common), skip agent registration and overlap checks.
|
|
940
|
-
|
|
941
275
|
---
|
|
942
276
|
|
|
943
|
-
## UAM VISUAL STATUS FEEDBACK
|
|
944
|
-
|
|
945
|
-
**When UAM tools are in use, ALWAYS use the built-in status display commands to provide visual feedback on progress and underlying numbers. Do NOT silently perform operations -- show the user what is happening.**
|
|
277
|
+
## UAM VISUAL STATUS FEEDBACK
|
|
946
278
|
|
|
947
|
-
|
|
948
|
-
After creating, updating, closing, or claiming tasks, run:
|
|
949
|
-
```bash
|
|
950
|
-
uam dashboard progress # Show completion %, status bars, velocity
|
|
951
|
-
uam task stats # Show priority/type breakdown with charts
|
|
952
|
-
```
|
|
279
|
+
**When UAM tools are in use, show visual feedback:**
|
|
953
280
|
|
|
954
|
-
### After Memory Operations
|
|
955
|
-
After storing, querying, or prepopulating memory, run:
|
|
956
281
|
```bash
|
|
957
|
-
uam
|
|
958
|
-
uam dashboard
|
|
282
|
+
uam dashboard overview # Full overview at session start
|
|
283
|
+
uam dashboard progress # After task operations
|
|
284
|
+
uam task stats # After task state changes
|
|
285
|
+
uam memory status # After memory operations
|
|
286
|
+
uam dashboard agents # After agent/coordination operations
|
|
959
287
|
```
|
|
960
288
|
|
|
961
|
-
### After Agent/Coordination Operations
|
|
962
|
-
After registering agents, checking overlaps, or claiming resources, run:
|
|
963
|
-
```bash
|
|
964
|
-
uam dashboard agents # Show agent status table, resource claims, active work
|
|
965
|
-
```
|
|
966
|
-
|
|
967
|
-
### Periodic Overview
|
|
968
|
-
At session start and after completing major work items, run:
|
|
969
|
-
```bash
|
|
970
|
-
uam dashboard overview # Full overview: task progress, agent status, memory health
|
|
971
|
-
```
|
|
972
|
-
|
|
973
|
-
### Display Function Reference
|
|
974
|
-
|
|
975
|
-
UAM provides these visual output functions (from `src/cli/visualize.ts`):
|
|
976
|
-
|
|
977
|
-
| Function | Purpose | When to Use |
|
|
978
|
-
|----------|---------|-------------|
|
|
979
|
-
| `progressBar` | Completion bar with % and count | Task/test progress |
|
|
980
|
-
| `stackedBar` + `stackedBarLegend` | Multi-segment status bar | Status distribution |
|
|
981
|
-
| `horizontalBarChart` | Labeled bar chart | Priority/type breakdowns |
|
|
982
|
-
| `miniGauge` | Compact colored gauge | Capacity/utilization |
|
|
983
|
-
| `sparkline` | Inline trend line | Historical data trends |
|
|
984
|
-
| `table` | Formatted data table | Task/agent listings |
|
|
985
|
-
| `tree` | Hierarchical tree view | Memory layers, task hierarchy |
|
|
986
|
-
| `box` | Bordered summary box | Section summaries |
|
|
987
|
-
| `statusBadge` | Colored status labels | Agent/service status |
|
|
988
|
-
| `keyValue` | Aligned key-value pairs | Metadata display |
|
|
989
|
-
| `inlineProgressSummary` | Compact progress bar with counts | After task mutations |
|
|
990
|
-
| `trend` | Up/down arrow with delta | Before/after comparisons |
|
|
991
|
-
| `heatmapRow` | Color-coded cell row | Activity density |
|
|
992
|
-
| `bulletList` | Status-colored bullet list | Health checks |
|
|
993
|
-
|
|
994
|
-
### Rules
|
|
995
|
-
|
|
996
|
-
1. **Never silently complete a UAM operation** -- always follow up with the relevant dashboard/status command
|
|
997
|
-
2. **Show numbers, not just success messages** -- the user needs to see counts, percentages, and trends
|
|
998
|
-
3. **Use `uam dashboard overview`** at session start to establish baseline awareness
|
|
999
|
-
4. **Use `uam task stats`** after any task state change to show the impact
|
|
1000
|
-
5. **Use `uam memory status`** after any memory write to confirm storage and show capacity
|
|
1001
|
-
6. **Prefer dashboard commands over raw SQLite queries** for status checks -- they provide formatted visual output
|
|
1002
|
-
|
|
1003
289
|
---
|
|
1004
290
|
|
|
1005
291
|
{{#if HAS_PROJECT_MD}}
|
|
@@ -1096,40 +382,24 @@ UAM provides these visual output functions (from `src/cli/visualize.ts`):
|
|
|
1096
382
|
|
|
1097
383
|
## COMPLETION GATES - MANDATORY
|
|
1098
384
|
|
|
1099
|
-
**CANNOT say "done"
|
|
385
|
+
**CANNOT say "done" until ALL gates pass.**
|
|
1100
386
|
|
|
1101
|
-
### GATE 1: Output Existence
|
|
387
|
+
### GATE 1: Output Existence
|
|
1102
388
|
```bash
|
|
1103
|
-
echo "=== GATE 1: OUTPUT EXISTENCE ==="
|
|
1104
389
|
for f in $EXPECTED_OUTPUTS; do
|
|
1105
|
-
[ -f "$f" ] && echo "
|
|
390
|
+
[ -f "$f" ] && echo "ok $f" || echo "MISSING: $f"
|
|
1106
391
|
done
|
|
1107
392
|
```
|
|
393
|
+
If missing: CREATE IT immediately.
|
|
1108
394
|
|
|
1109
|
-
### GATE 2: Constraint Compliance
|
|
1110
|
-
|
|
1111
|
-
Extract ALL constraints from task description:
|
|
1112
|
-
- File count constraint (e.g., "single file")
|
|
1113
|
-
- Line count constraint (e.g., "exactly one line")
|
|
1114
|
-
- Format constraint (e.g., "valid JSON")
|
|
1115
|
-
- Size constraint (e.g., "under 5KB")
|
|
1116
|
-
|
|
1117
|
-
Verify EACH constraint is satisfied before proceeding.
|
|
1118
|
-
```
|
|
395
|
+
### GATE 2: Constraint Compliance
|
|
396
|
+
Extract ALL constraints from task ("exactly", "only", "single", "must be", "no more than"). Verify EACH.
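A minimal sketch of the extraction step, assuming constraints can be found by scanning sentences for the marker phrases quoted above. The function name `extract_constraints`, the sample task text and the sentence-splitting heuristic are illustrative assumptions; verification of each extracted constraint still has to be done separately.

```python
import re

# Marker phrases quoted in Gate 2.
CONSTRAINT_MARKERS = ("exactly", "only", "single", "must be", "no more than")
_MARKER_RE = re.compile(
    r"\b(" + "|".join(re.escape(m) for m in CONSTRAINT_MARKERS) + r")\b",
    re.IGNORECASE,
)


def extract_constraints(task_description: str) -> list[str]:
    """Return each sentence that contains at least one constraint marker."""
    # Heuristic (assumption): a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", task_description.strip())
    return [s for s in sentences if _MARKER_RE.search(s)]


# Hypothetical task description.
task = (
    "Write the report to out.json. The output must be valid JSON. "
    "Use a single file. Keep it readable."
)
checklist = extract_constraints(task)
for item in checklist:
    print(f"[ ] {item}")
```

Keyword matching of this kind over-collects on purpose: a false positive costs one extra check, while a missed constraint fails the gate.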
|
|
1119
397
|
|
|
1120
|
-
### GATE 3:
|
|
398
|
+
### GATE 3: Tests Pass
|
|
1121
399
|
```bash
|
|
1122
|
-
echo "=== GATE 3: TEST VERIFICATION ==="
|
|
1123
400
|
{{TEST_COMMAND}} 2>&1 | tail -30
|
|
1124
|
-
# If < 100% pass: iterate (fix specific failure, re-run)
|
|
1125
|
-
# Reserve 20% of time for iteration
|
|
1126
401
|
```
|
|
1127
|
-
|
|
1128
|
-
**HARD STOP RULE:**
|
|
1129
|
-
- If Gate 1 fails -> CREATE missing files immediately
|
|
1130
|
-
- If Gate 2 fails -> FIX constraint violations immediately
|
|
1131
|
-
- If Gate 3 fails -> ITERATE until 100% or timeout
|
|
1132
|
-
- NEVER report "complete" with failing gates
|
|
402
|
+
If < 100%: iterate (fix specific failure, re-run). Reserve 20% of time for iteration.
|
|
1133
403
|
|
|
1134
404
|
---
|
|
1135
405
|
|
|
@@ -1138,16 +408,17 @@ echo "=== GATE 3: TEST VERIFICATION ==="
|
|
|
1138
408
|
```
|
|
1139
409
|
☐ Tests pass
|
|
1140
410
|
☐ Lint/typecheck pass
|
|
1141
|
-
☐ Worktree used (
|
|
1142
|
-
☐
|
|
1143
|
-
☐
|
|
411
|
+
☐ Worktree used (MANDATORY for ALL changes)
|
|
412
|
+
☐ Worktree cleaned up after PR merge (MANDATORY)
|
|
413
|
+
☐ Self-review completed
|
|
414
|
+
☐ Memory updated
|
|
1144
415
|
☐ PR created
|
|
1145
416
|
☐ Parallel reviews passed
|
|
1146
417
|
{{#if HAS_INFRA}}
|
|
1147
418
|
☐ IaC parity verified
|
|
1148
419
|
{{/if}}
|
|
1149
420
|
☐ No secrets in code
|
|
1150
|
-
☐ No debug artifacts left
|
|
421
|
+
☐ No debug artifacts left
|
|
1151
422
|
```
|
|
1152
423
|
|
|
1153
424
|
---
|
|
@@ -1155,65 +426,10 @@ echo "=== GATE 3: TEST VERIFICATION ==="
|
|
|
1155
426
|
## COMPLETION PROTOCOL
|
|
1156
427
|
|
|
1157
428
|
```
|
|
1158
|
-
MERGE -> DEPLOY -> MONITOR -> FIX (iterate until 100%)
|
|
1159
|
-
|
|
1160
|
-
1. MERGE: PR approved -> merge to {{DEFAULT_BRANCH}} -> delete branch
|
|
1161
|
-
2. DEPLOY: CI/CD runs -> check status -> confirm live
|
|
1162
|
-
3. MONITOR: Check logs -> verify functionality -> smoke tests
|
|
1163
|
-
4. FIX: New worktree -> fix -> GOTO 1
|
|
1164
|
-
5. COMPLETE: Update memory -> close tasks
|
|
429
|
+
MERGE -> CLEANUP WORKTREE -> DEPLOY -> MONITOR -> FIX (iterate until 100%)
|
|
1165
430
|
```
|
|
1166
431
|
|
|
1167
|
-
**Never "done" until:** PR merged + deployed + verified working
|
|
1168
|
-
|
|
1169
|
-
---
|
|
1170
|
-
|
|
1171
|
-
## NEAR-MISS ITERATION PROTOCOL (P26 ENFORCED)
|
|
1172
|
-
|
|
1173
|
-
When >50% of tests pass but not all:
|
|
1174
|
-
1. **Read exact failure message** for each failing test
|
|
1175
|
-
2. **Fix ONLY the specific failing test** - do not refactor passing code
|
|
1176
|
-
3. **Re-run tests** immediately after each fix
|
|
1177
|
-
4. **Reserve 30% of time budget** for this iteration loop
|
|
1178
|
-
5. **Repeat** until 100% pass or time exhausted
|
|
1179
|
-
6. **Never give up** on a task that is >50% passing - small fixes flip outcomes
|
|
1180
|
-
|
|
1181
|
-
---
|
|
1182
|
-
|
|
1183
|
-
## DECODER-FIRST PROTOCOL (P35 ENFORCED)
|
|
1184
|
-
|
|
1185
|
-
When a task provides a decoder, validator, or expected output format:
|
|
1186
|
-
1. **READ the decoder/validator source code FIRST** before writing any implementation
|
|
1187
|
-
2. **Extract the exact format** it expects (headers, encoding, byte order, etc.)
|
|
1188
|
-
3. **Implement encoder/generator** to match that exact format
|
|
1189
|
-
4. **Test round-trip**: `original == decode(encode(original))` BEFORE optimizing
|
|
1190
|
-
5. **Never optimize** until round-trip verification passes
|
|
1191
|
-
|
|
1192
|
-
---
|
|
1193
|
-
|
|
1194
|
-
## ERROR RECOVERY ESCALATION
|
|
1195
|
-
|
|
1196
|
-
On any test failure or error:
|
|
1197
|
-
1. **Categorize** using P39 Failure Taxonomy (`dependency_missing`, `wrong_approach`, `format_mismatch`, `edge_case`, `state_mutation`, `concurrency`, `timeout`, `environment`)
|
|
1198
|
-
2. **Read exact error message** - do not guess
|
|
1199
|
-
3. **Query memory** for past failures of this type: `SELECT content FROM memories WHERE type='failure_analysis' AND content LIKE '%<type>%' LIMIT 3;`
|
|
1200
|
-
4. **If same error twice**: change approach completely (P38 Branching), do not retry same fix
|
|
1201
|
-
5. **If dependency missing**: install it (`pip install`, `npm install`, `apt-get`)
|
|
1202
|
-
6. **If permission denied**: use alternative path or `chmod`
|
|
1203
|
-
7. **If timeout**: reduce scope, submit partial solution
|
|
1204
|
-
8. **Store failure** in memory after resolution (P39 protocol)
|
|
1205
|
-
9. **Never repeat a failed command** without modification
|
|
1206
|
-
|
|
1207
|
-
---
|
|
1208
|
-
|
|
1209
|
-
## TIME BUDGET ALLOCATION
|
|
1210
|
-
|
|
1211
|
-
For complex tasks (compilation, VM, multi-service):
|
|
1212
|
-
- **20%** - Environment assessment (deps, tools, constraints)
|
|
1213
|
-
- **50%** - Core implementation
|
|
1214
|
-
- **30%** - Iteration, debugging, test fixes
|
|
1215
|
-
|
|
1216
|
-
If timeout approaching: submit best partial solution rather than nothing.
|
|
432
|
+
**Never "done" until:** PR merged + worktree cleaned up + deployed + verified working
|
|
1217
433
|
|
|
1218
434
|
---
|
|
1219
435
|
|
|
@@ -1243,126 +459,11 @@ If timeout approaching: submit best partial solution rather than nothing.
|
|
|
1243
459
|
|
|
1244
460
|
---
|
|
1245
461
|
|
|
1246
|
-
##
|
|
1247
|
-
|
|
1248
|
-
**These reminders exploit recency bias - place at END for maximum compliance.**
|
|
1249
|
-
|
|
1250
|
-
0. **PATTERN ROUTER (MANDATORY FIRST STEP)**:
|
|
1251
|
-
BEFORE ANY work, you MUST print this analysis block:
|
|
1252
|
-
```
|
|
1253
|
-
=== PATTERN ROUTER ===
|
|
1254
|
-
Task: [one-line task summary]
|
|
1255
|
-
Classification: [file-creation | constraint-validation | adversarial | pipeline | impossible]
|
|
1256
|
-
Complexity: [simple | moderate | complex | research] (P40)
|
|
462
|
+
## TOKEN EFFICIENCY
|
|
1257
463
|
|
|
1258
|
-
|
|
1259
|
-
|
|
1260
|
-
|
|
1261
|
-
|
|
1262
|
-
- P11 (Pre-computed): [YES/NO] - Is this a known algorithm (crypto/chess/ML)?
|
|
1263
|
-
- P18 (MTP): [YES/NO] - Does task need multiple tools chained?
|
|
1264
|
-
- P37 (PIV): [YES/NO] - Non-trivial implementation needing plan verification?
|
|
1265
|
-
- P38 (BE): [YES/NO] - Multiple valid approaches or high ambiguity?
|
|
1266
|
-
- P41 (ATL): [YES/NO] - Changes span 3+ files or multiple concerns?
|
|
1267
|
-
- P42 (CBC): [YES/NO] - Implementation that will be committed?
|
|
1268
|
-
|
|
1269
|
-
REASONING DEPTH (per P40):
|
|
1270
|
-
- Simple → Direct implementation
|
|
1271
|
-
- Moderate → Plan, verify (P37), implement
|
|
1272
|
-
- Complex → Explore (P38), verify (P37), implement, feedback (P39)
|
|
1273
|
-
- Research → Search first, then explore (P38), implement iteratively
|
|
1274
|
-
|
|
1275
|
-
ACTIVE PATTERNS: [list only YES patterns]
|
|
1276
|
-
=== END ROUTER ===
|
|
1277
|
-
```
|
|
1278
|
-
DO NOT proceed without printing this block first.
|
|
1279
|
-
|
|
1280
|
-
1. **BACKUP FIRST**: Before modifying ANY config, database, or critical file:
|
|
1281
|
-
`cp <file> <file>.bak`
|
|
1282
|
-
|
|
1283
|
-
2. **CHECK ENVIRONMENT**: Before using a library, verify it exists:
|
|
1284
|
-
`which <tool> || python3 -c "import <lib>" 2>/dev/null || echo "NOT AVAILABLE"`
|
|
1285
|
-
|
|
1286
|
-
3. **USE CLI OVER LIBRARIES**: When in doubt, use subprocess + CLI tools.
|
|
1287
|
-
|
|
1288
|
-
4. **STEP-BY-STEP**: Break complex tasks into numbered commands. Execute one at a time.
|
|
1289
|
-
|
|
1290
|
-
5. **VERIFY EACH STEP**: Check output before proceeding to next step.
|
|
1291
|
-
|
|
1292
|
-
6. **STATE-SENSITIVE OPERATIONS**: If task involves recovery, forensics, or state inspection - backup EVERYTHING first.
|
|
1293
|
-
|
|
1294
|
-
7. **IMPOSSIBLE DETECTION**: If task requires compression magic, ML inference, or exhaustive search - flag as potentially impossible.
|
|
1295
|
-
|
|
1296
|
-
8. **TEST IN ISOLATION**: Dependencies installed during work may not persist. Use pre-installed tools.
|
|
1297
|
-
|
|
1298
|
-
9. **OUTPUT VERIFICATION (MANDATORY)**: Before saying "done" or "complete":
|
|
1299
|
-
```bash
|
|
1300
|
-
ls -la $EXPECTED_OUTPUTS 2>&1
|
|
1301
|
-
# If ANY file is missing: CREATE IT NOW
|
|
1302
|
-
```
|
|
1303
|
-
|
|
1304
|
-
10. **ITERATE ON PARTIAL SUCCESS**: If tests show partial pass (e.g., 8/9):
|
|
1305
|
-
- Read the failing test output
|
|
1306
|
-
- Fix the specific failure
|
|
1307
|
-
- Re-run tests
|
|
1308
|
-
- Repeat until 100% or timeout
|
|
1309
|
-
|
|
1310
|
-
11. **CREATE BEFORE EXPLAIN**: If a required output file doesn't exist:
|
|
1311
|
-
- CREATE a minimal working version FIRST
|
|
1312
|
-
- Don't explain why it's missing
|
|
1313
|
-
- Just make the file exist with valid content
|
|
1314
|
-
|
|
1315
|
-
12. **TASK-FIRST**: Create required output files BEFORE analyzing.
|
|
1316
|
-
Touch placeholders immediately, then refine with real content.
|
|
1317
|
-
|
|
1318
|
-
13. **EXTRACT CONSTRAINTS (MANDATORY)**: If P17 is active, you MUST print:
|
|
1319
|
-
```
|
|
1320
|
-
=== CONSTRAINT CHECKLIST ===
|
|
1321
|
-
Extracted from task description:
|
|
1322
|
-
☐ [constraint 1]
|
|
1323
|
-
☐ [constraint 2]
|
|
1324
|
-
|
|
1325
|
-
BEFORE completing, verify EACH constraint:
|
|
1326
|
-
☑ constraint 1: [PASS/FAIL]
|
|
1327
|
-
☑ constraint 2: [PASS/FAIL]
|
|
1328
|
-
=== END CONSTRAINTS ===
|
|
1329
|
-
```
|
|
1330
|
-
|
|
1331
|
-
14. **MULTI-TOOL PIPELINE**: For complex tasks, identify ALL tools needed.
|
|
1332
|
-
Don't implement from scratch if existing tools exist.
|
|
1333
|
-
|
|
1334
|
-
15. **REFUSE IMPOSSIBLE**: If task violates information theory (compress 500MB to 5KB),
|
|
1335
|
-
REFUSE immediately. Create IMPOSSIBLE.md explaining why. Don't attempt.
|
|
1336
|
-
|
|
1337
|
-
16. **VERIFY BEFORE CODING (P37)**: For moderate+ complexity tasks, print the
|
|
1338
|
-
PRE-IMPLEMENTATION VERIFY block. Catch wrong approaches before wasting tokens.
|
|
1339
|
-
|
|
1340
|
-
17. **EXPLORE BEFORE COMMITTING (P38)**: For complex/ambiguous tasks, generate 2-3
|
|
1341
|
-
candidate approaches and evaluate before coding. 5% exploration saves 50% rework.
|
|
1342
|
-
|
|
1343
|
-
18. **LEARN FROM FAILURES (P39)**: After ANY test failure, categorize it using the
|
|
1344
|
-
Failure Taxonomy and store structured feedback in memory. Query memory before similar tasks.
|
|
1345
|
-
|
|
1346
|
-
19. **REVIEW YOUR OWN DIFF (P42)**: Before running tests, do a self-review of your
|
|
1347
|
-
changes against requirements. Catch logic errors by reading, not by test-debug cycles.
|
|
1348
|
-
|
|
1349
|
-
20. **ADVERSARIAL MINDSET (MANDATORY)**: If P20 is active, you MUST:
|
|
1350
|
-
```
|
|
1351
|
-
=== ADVERSARIAL ANALYSIS ===
|
|
1352
|
-
Target: [what are we trying to bypass/break?]
|
|
1353
|
-
Defense mechanism: [how does the filter/protection work?]
|
|
1354
|
-
|
|
1355
|
-
ATTACK VECTORS TO TRY:
|
|
1356
|
-
1. Case variation
|
|
1357
|
-
2. Encoding
|
|
1358
|
-
3. Null bytes
|
|
1359
|
-
4. Double encoding
|
|
1360
|
-
5. Context breaking
|
|
1361
|
-
6. Event handlers
|
|
1362
|
-
7. [add task-specific vectors]
|
|
1363
|
-
|
|
1364
|
-
TEST EACH vector until one works.
|
|
1365
|
-
=== END ADVERSARIAL ===
|
|
1366
|
-
```
|
|
464
|
+
- Prefer concise, high-signal responses
|
|
465
|
+
- Summarize command output; quote only decision-relevant lines
|
|
466
|
+
- Use parallel tool calls to reduce back-and-forth
|
|
467
|
+
- Check `{{SKILLS_PATH}}` for domain-specific skills before re-inventing approaches
|
|
1367
468
|
|
|
1368
469
|
</coding_guidelines>
|