@hanzlaa/rcode 3.6.4 → 3.6.6

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@hanzlaa/rcode",
3
- "version": "3.6.4",
3
+ "version": "3.6.6",
4
4
  "description": "rcode — the AI team that never forgets. Persistent memory, specialist agents, and slash commands for AI IDEs. Works in Claude Code, Cursor, Gemini, VS Code, and Antigravity.",
5
5
  "main": "cli/index.js",
6
6
  "bin": {
@@ -1,7 +1,6 @@
1
1
  ---
2
2
  name: rihal-debug
3
- internal: true
4
- description: Root-cause debugging via the scientific method.
3
+ description: Root-cause debugging via the scientific method. Enforces investigate-before-fix, structured hypothesis iteration, multi-component evidence gathering, and architectural escalation after 3 failed fixes.
5
4
  triggers:
6
5
  # English
7
6
  - "debug this"
@@ -12,11 +11,15 @@ triggers:
12
11
  - "track this down"
13
12
  - "narrow down the bug"
14
13
  - "scientific method"
14
+ - "bug fix"
15
+ - "something is broken"
15
16
  # Roman Urdu / Hindi
16
17
  - "kharab kyu hai"
17
18
  - "bug dhoondo"
18
19
  - "fix karo bug"
19
20
  - "theek karo"
21
+ - "kya masla hai"
22
+ - "kyu kaam nahi kar raha"
20
23
  # Arabic native
21
24
  - "صحّح هذا"
22
25
  - "ما المشكلة"
@@ -29,29 +32,136 @@ user-invocable: false
29
32
  @.rihal/references/karpathy-guidelines.md
30
33
 
31
34
 
35
+ ## The Iron Law
36
+
37
+ ```
38
+ NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST.
39
+ ```
40
+
41
+ If you have not completed Phase 1, you cannot propose a fix. "It seems to work" is a red flag — keep investigating until the mechanism is clear. Symptom fixes are failure.
42
+
32
43
  ## Overview
33
44
 
34
- Debugging is investigation, not pattern-matching. Each iteration narrows the problem space — never widens it. The skill enforces a written hypothesis, an experiment that distinguishes "yes" from "no", and a captured observation. Random fixes that "happen to work" are not allowed — the bug must be understood.
45
+ Debugging is investigation, not pattern-matching. Each iteration narrows the problem space — never widens it. The skill enforces a written hypothesis, an experiment that distinguishes "yes" from "no", and a captured observation. Random fixes are not allowed — the bug must be understood before the fix is written.
46
+
47
+ ## Phase 1 — Root Cause Investigation
48
+
49
+ **BEFORE attempting ANY fix:**
50
+
51
+ 1. **Reproduce consistently.** Write the exact steps. If not reproducible, make it reproducible first — anything else is guessing.
52
+
53
+ 2. **Read the error carefully.** Don't skim stack traces. Note file paths, line numbers, error codes. They often contain the exact answer.
54
+
55
+ 3. **Check recent changes.** `git diff`, recent commits, new dependencies, config changes, environment differences.
56
+
57
+ 4. **Gather evidence in multi-component systems.**
58
+
59
+ When the system has multiple layers (API → service → DB, CI → build → signing, frontend → backend → queue):
60
+
61
+ Add diagnostic instrumentation at EACH component boundary BEFORE proposing fixes:
62
+ ```
63
+ For EACH boundary:
64
+ - Log what data enters the component
65
+ - Log what data exits the component
66
+ - Verify env/config propagation
67
+ - Check state at each layer
68
+
69
+ Run ONCE to gather evidence showing WHERE it breaks.
70
+ THEN identify the failing component.
71
+ THEN investigate that specific component.
72
+ ```
73
+
74
+ Example:
75
+ ```js
76
+ // Layer 1: incoming request
77
+ console.log('[L1] body:', req.body, 'userId:', req.user?.id)
78
+
79
+ // Layer 2: service call
80
+ console.log('[L2] args to createTask:', args)
81
+
82
+ // Layer 3: DB query
83
+ console.log('[L3] Prisma input:', data)
84
+ ```
85
+
86
+ This reveals which layer fails — not guessing.
35
87
 
36
- ## Workflow
88
+ 5. **Trace data flow backward.** Where does the bad value originate? What called this function with that bad value? Keep tracing up until you find the source. Fix at source, not at symptom.
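One way to apply the boundary logging from step 4 without scattering `console.log` calls is a small wrapper around each layer. The sketch below is illustrative only: `instrument`, the three stand-in layers, and the deliberately wrong `name` key are assumptions made for this example and are not part of rcode.

```javascript
// Hypothetical helper (not part of rcode): logs what enters and leaves one layer.
function instrument(label, fn, log = console.log) {
  return async (...args) => {
    log(`[${label}] in:`, JSON.stringify(args));
    try {
      const result = await fn(...args);
      log(`[${label}] out:`, JSON.stringify(result));
      return result;
    } catch (err) {
      log(`[${label}] threw:`, err.message);
      throw err; // re-throw so the failure stays visible to the caller
    }
  };
}

// Stand-in layers for illustration. The service passes `name` instead of `title`,
// which is the kind of mismatch that boundary logs make obvious.
const insertTask = instrument("L3 db", async (data) => ({ id: 1, title: data.title ?? null }));
const createTask = instrument("L2 service", async (args) => insertTask({ name: args.title }));
const handleRequest = instrument("L1 controller", async (req) => createTask(req.body));

handleRequest({ body: { title: "Buy milk" } });
```

A single run shows `title` present on entry to L2 but only `name` on entry to L3, which places the fault at the service-to-DB boundary before any fix is proposed.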
37
89
 
38
- 1. **Reproduce the bug.** Write the exact steps. If you can't reproduce it, the first job is making it reproducible; anything else is guessing.
39
- 2. **State the hypothesis.** "I think the bug is in <component>; specifically <mechanism>." One sentence, falsifiable.
40
- 3. **Design the experiment.** What single test, log line, or dataflow change would distinguish a true hypothesis from a false one?
41
- 4. **Run it. Capture the observation.** Console output verbatim, screenshot, stack trace, network response — whatever the experiment produced.
42
- 5. **Update the hypothesis.** Either confirmed (now narrow to the next layer) or refuted (form a new hypothesis based on what was observed).
43
- 6. **Stop conditions:** the bug is reproducible from a unit test (then hand to `rihal-prove-it`), OR the root cause is a known external constraint (e.g. third-party API behaviour) that you record in `incidents/known-issues.md`.
44
- 7. **Never apply a fix without understanding why it works.** "It seems to fix it" is a red flag — keep investigating until the mechanism is clear.
90
+ ## Phase 2 — Pattern Analysis
45
91
 
46
- ## Sentry / observability integration
92
+ Before forming a hypothesis, find the comparison point:
93
+
94
+ 1. **Find working examples.** Locate similar code in the same codebase that works. What's different?
95
+ 2. **Read reference implementations completely.** Don't skim — partial understanding guarantees bugs.
96
+ 3. **List every difference**, however small. Don't assume "that can't matter."
97
+ 4. **Check assumptions.** What config, environment, or state does this code assume?
98
+
99
+ ## Phase 3 — Hypothesis and Experiment
100
+
101
+ Scientific method:
102
+
103
+ 1. **State ONE hypothesis.** "I think X is the root cause because Y." Write it down. Be specific, not vague.
104
+ 2. **Design the minimal experiment.** What single test, log line, or code change would confirm or refute this hypothesis?
105
+ 3. **Run it. Capture the observation verbatim.** Console output, stack trace, network response — whatever was produced.
106
+ 4. **Update.** Confirmed → Phase 4. Refuted → form a new hypothesis based on what was observed. Do NOT add more fixes on top.
107
+
108
+ ## Phase 4 — Implementation
109
+
110
+ 1. **Create a failing test first.** Simplest possible reproduction. Use `rihal-prove-it` for writing the test that locks the fix in.
111
+ 2. **Implement ONE fix.** Address the root cause identified. No "while I'm here" improvements. No bundled refactors.
112
+ 3. **Verify.** Test passes. No other tests broken. Issue actually resolved.
113
+ 4. **If the fix doesn't work:** STOP. Count fix attempts.
114
+ - < 3 attempts: return to Phase 1 with new information.
115
+ - **≥ 3 attempts: STOP — this is an architectural problem.**
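The attempt-counting rule above can be expressed as a tiny decision function. This is a minimal sketch: the function name and the returned labels are hypothetical, and only the threshold of three comes from the text.

```javascript
// Threshold from the text: the third failed fix triggers architectural escalation.
const MAX_FAILED_FIXES = 3;

// Returns a hypothetical label for what to do after a fix attempt has failed.
function nextStepAfterFailedFix(failedAttempts) {
  if (!Number.isInteger(failedAttempts) || failedAttempts < 1) {
    throw new RangeError("failedAttempts must be a positive integer");
  }
  return failedAttempts >= MAX_FAILED_FIXES
    ? "escalate-architecture" // stop fixing, discuss the design
    : "return-to-phase-1"; // investigate again with the new information
}
```

Keeping the threshold in one named constant makes the escalation point explicit rather than something to renegotiate on each failed attempt.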
116
+
117
+ ## Architectural Escalation (after 3 failed fixes)
118
+
119
+ A pattern that signals an architectural problem:
120
+ - Each fix reveals new coupling or shared state in a different place
121
+ - Fixes require "massive refactoring" to implement
122
+ - Each fix creates new symptoms elsewhere
123
+
124
+ When this pattern appears:
125
+ 1. Stop attempting fixes
126
+ 2. Ask: is this pattern fundamentally sound, or are we continuing through inertia?
127
+ 3. Discuss with the user before attempting more fixes
128
+ 4. Consider `/rihal-council` for a cross-functional review
129
+
130
+ ## Sentry / Observability Integration
47
131
 
48
132
  If the project has Sentry (`@sentry/*` in `package.json` or `sentry-sdk` in Python):
49
133
 
50
- - Quote the actual Sentry issue ID and stack trace in the hypothesis section
51
- - Look at breadcrumbs for the chain of events leading to the error
52
- - Check the issue's "first seen / last seen" — recurring or one-off matters
134
+ - Quote the actual Sentry issue ID and stack trace in the hypothesis
135
+ - Read breadcrumbs for the chain of events leading to the error
136
+ - Check "first seen / last seen" — recurring or one-off matters
53
137
  - Cross-reference with deployment timestamps to identify regressions
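Cross-referencing "first seen" with deployment timestamps amounts to finding the latest deployment at or before the first occurrence. The sketch below assumes a hypothetical list of `{ version, deployedAt }` records with ISO-8601 timestamps; it does not call any Sentry API.

```javascript
// Returns the most recent deployment at or before `firstSeen`, or null if none.
// `deployments` is a hypothetical array of { version, deployedAt } records.
function suspectDeployment(firstSeen, deployments) {
  const seenAt = Date.parse(firstSeen);
  const earlier = deployments
    .filter((d) => Date.parse(d.deployedAt) <= seenAt)
    .sort((a, b) => Date.parse(b.deployedAt) - Date.parse(a.deployedAt));
  return earlier.length > 0 ? earlier[0] : null;
}
```

The returned deployment is only a suspect to investigate first; a latent bug can surface long after the release that introduced it.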
54
138
 
139
+ ## Red Flags — STOP and return to Phase 1
140
+
141
+ If you catch yourself thinking any of these:
142
+ - "Quick fix for now, investigate later"
143
+ - "Just try changing X and see if it works"
144
+ - "Add multiple changes and run tests"
145
+ - "It's probably X, let me fix that"
146
+ - "I don't fully understand but this might work"
147
+ - "It seems to fix it"
148
+ - "One more fix attempt" (when already tried 2+)
149
+ - Proposing solutions before tracing data flow
150
+ - Each fix reveals a new problem in a different place
151
+
152
+ **ALL of these mean: STOP. Return to Phase 1.**
153
+
154
+ ## Common Rationalizations
155
+
156
+ | Excuse | Reality |
157
+ |--------|---------|
158
+ | "Issue is simple, don't need process" | Simple bugs have root causes too. Process is fast for simple bugs. |
159
+ | "Emergency, no time for process" | Systematic debugging is FASTER than guess-and-check thrashing. |
160
+ | "Just try this first, then investigate" | First fix sets the pattern. Do it right from the start. |
161
+ | "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. |
162
+ | "I see the problem, let me fix it" | Seeing symptoms ≠ understanding root cause. |
163
+ | "One more fix attempt" (after 2+) | 3+ failures = architectural problem. Escalate, don't fix again. |
164
+
55
165
  ## Output Format
56
166
 
57
167
  ```
@@ -59,9 +169,12 @@ Reproduction:
59
169
  <exact steps>
60
170
  <observed vs expected>
61
171
 
172
+ Phase 1 — Evidence
173
+ <what layers were instrumented and what they showed>
174
+
62
175
  Iteration 1
63
- Hypothesis: <falsifiable claim>
64
- Experiment: <what we did>
176
+ Hypothesis: <falsifiable claim — "I think X because Y">
177
+ Experiment: <the single test/log that would confirm or refute>
65
178
  Observation: <verbatim output>
66
179
  Outcome: confirmed | refuted | partial
67
180
 
@@ -69,26 +182,30 @@ Iteration N
69
182
  ...
70
183
 
71
184
  Root cause:
72
- <one paragraph explanation of the actual mechanism>
185
+ <one paragraph: the actual mechanism, not the symptom>
73
186
 
74
187
  Fix scope:
75
- <minimum change that fixes the cause, not the symptom>
188
+ <minimum change that addresses the cause>
76
189
 
77
190
  Regression test:
78
- <hand off to rihal-prove-it for the test that locks the fix in>
191
+ <hand to rihal-prove-it for the test that locks the fix in>
79
192
  ```
80
193
 
81
- Do NOT include: "tried X and it seems to work"; speculative "maybe it's caching"; broad refactors disguised as bug fixes.
194
+ Do NOT include: "tried X and it seems to work" · speculative "maybe it's caching" · broad refactors disguised as bug fixes.
82
195
 
83
196
  ## Examples
84
197
 
85
- **Happy path** — "Login fails for Arabic usernames" → reproduce: POST `/login` with `محمد` returns 500 → hypothesis: encoding boundary in URL parsing → experiment: add hex-dump log of the raw request → observation: bytes are UTF-8 but the Postgres driver re-encodes as Latin-1 → root cause: client_encoding mismatch → fix: pin client_encoding=utf8 → regression test asserts non-ASCII login returns 200.
198
+ **Happy path** — "Login fails for Arabic usernames" → reproduce: POST `/login` with `محمد` returns 500 → Phase 1: hex-dump log of raw request body → observation: UTF-8 bytes, but Postgres driver re-encodes as Latin-1 → root cause: `client_encoding` mismatch → fix: pin `client_encoding=utf8` in connection string → regression test asserts non-ASCII login returns 200.
199
+
200
+ **Multi-component** — "Tasks not appearing after creation" → instrument three layers: controller logs input, service logs DB call args, DB query logs row count → observation: service receives correct args, DB returns `rowCount: 0` → hypothesis: wrong table name in query → confirmed → one-line fix, regression test added.
201
+
202
+ **Edge case — flaky test** — Passes locally, fails in CI 30% of the time → hypothesis: race condition → experiment: `--runInBand` → still flaky → next hypothesis: filesystem timing → experiment: `await fs.stat` after write → confirmed → fix.
86
203
 
87
- **Edge case — flaky test** — Test passes locally, fails in CI 30% of the time → hypothesis: race condition → experiment: run with `--runInBand` → observation: still flaky → next hypothesis: filesystem timing → experiment: await fs.stat after write → confirmed → fix.
204
+ **Negative — shotgun fix** — "I added a try/catch around the whole function and now it doesn't crash." Refuse. The exception is silently swallowed; the bug still exists. Restore the throw and form a real hypothesis.
88
205
 
89
- **Negative — shotgun fix** — "I added a try/catch around the whole function and now it doesn't crash". Refuse. The exception is now silently swallowed; the bug still exists. Restore the throw and form a real hypothesis.
206
+ **Architectural escalation** — Three separate fixes attempted (missing await, wrong env var, stale cache); each fix exposed a new problem elsewhere. Stop. The async data-flow design is wrong. Escalate to `/rihal-council` before attempting Fix #4.
90
207
 
91
208
  ## Memory Bank Hooks
92
209
 
93
- - **Reads:** `.rihal/memory/incidents/known-issues.md` (so prior debugging context is loaded), `.rihal/memory/project/stack.md` (Sentry presence)
94
- - **Writes:** append the root cause to `.rihal/memory/incidents/post-mortems/YYYYMMDD-<slug>.md` when an incident is resolved; remove the entry from `known-issues.md` once the fix is verified in production
210
+ - **Reads:** `.rihal/memory/incidents/known-issues.md` (prior debugging context), `.rihal/memory/project/stack.md` (Sentry presence, observability tools)
211
+ - **Writes:** append root cause to `.rihal/memory/incidents/post-mortems/YYYYMMDD-<slug>.md` when resolved; remove from `known-issues.md` once fix is verified in production
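The `YYYYMMDD-<slug>.md` naming for post-mortems could be generated as sketched below. The slug rules (lowercase ASCII, hyphen-separated, `incident` as a fallback) and the use of UTC for the date are assumptions; the source only specifies the directory and the overall file name pattern.

```javascript
// Builds a post-mortem path in the directory named in the Memory Bank Hooks.
// Slug rules and the UTC date are assumptions made for this sketch.
function postMortemPath(date, title) {
  const stamp = date.toISOString().slice(0, 10).replace(/-/g, ""); // YYYYMMDD (UTC)
  const slug =
    title
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
      .replace(/^-+|-+$/g, "") || "incident"; // trim hyphens; fall back if nothing is left
  return `.rihal/memory/incidents/post-mortems/${stamp}-${slug}.md`;
}
```

Titles written entirely in a non-Latin script reduce to the fallback slug under these rules, so a real implementation would likely need transliteration or an explicit slug argument.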
@@ -32,15 +32,39 @@ Follow the instructions in ./workflow.md.
32
32
  - Reads story file first; executes tasks in order
33
33
  - Marks tasks [x] only when implementation AND tests pass
34
34
  - Updates story's File List and Dev Agent Record sections
35
- - Reports: "Story complete. N tasks done. Tests: PASS (X). Files: [list]."
35
+ - Runs two-stage automated review before marking complete: spec compliance → code quality
36
+ - Reports: "Story complete. N tasks done. Tests: PASS (X). Files: [list]. Reviews: SPEC ✅ QUALITY ✅"
36
37
  - Do NOT invent scope beyond the story
37
38
  - Do NOT commit with red tests
38
39
 
40
+ ## Review Protocol
41
+
42
+ After all tasks complete, dispatches two fresh reviewer subagents before handing off to human review:
43
+
44
+ **Stage 1 — Spec Compliance:** Confirms that every acceptance criterion (AC) is implemented and that nothing extra was built. Repeats until COMPLIANT.
45
+
46
+ **Stage 2 — Code Quality:** Reviews naming, error handling, test depth, security, maintainability. Fixes High-severity issues; logs Medium issues for human reviewer. Repeats until APPROVED/APPROVED_WITH_NOTES.
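The ordering constraint in the two stages (spec compliance must pass before code quality starts, and each stage repeats until it passes) can be sketched as two sequential loops. The `reviewers` and `fix` callbacks are hypothetical stand-ins for dispatching subagents and applying fixes, and the `maxRounds` cap is an added safety assumption that the source does not specify.

```javascript
// reviewers.spec():    resolves to "COMPLIANT" or "NON_COMPLIANT"
// reviewers.quality(): resolves to "APPROVED", "APPROVED_WITH_NOTES" or "CHANGES_REQUIRED"
// fix(stage):          hypothetical callback that fixes the findings and re-runs tests
async function twoStageReview(reviewers, fix, maxRounds = 5) {
  // Stage 1: repeat until the spec reviewer reports COMPLIANT.
  for (let round = 1; ; round++) {
    if ((await reviewers.spec()) === "COMPLIANT") break;
    if (round >= maxRounds) return { passed: false, stage: "spec" };
    await fix("spec");
  }
  // Stage 2: only reached once Stage 1 has passed.
  for (let round = 1; ; round++) {
    const verdict = await reviewers.quality();
    if (verdict === "APPROVED" || verdict === "APPROVED_WITH_NOTES") {
      return { passed: true, verdict };
    }
    if (round >= maxRounds) return { passed: false, stage: "quality" };
    await fix("quality");
  }
}
```

Returning the failing stage instead of looping forever gives the caller a point at which to hand the problem to a human.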
47
+
48
+ ## Model Selection
49
+
50
+ When dispatching reviewer subagents or sub-tasks:
51
+ - Mechanical tasks (isolated, clear spec, 1-2 files) → cheapest/fastest model
52
+ - Integration tasks (multi-file, pattern matching) → standard model
53
+ - Architecture, design, or review tasks → most capable model
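The model-selection guidance is effectively a lookup from task kind to model tier. In the sketch below the task-kind keys and tier labels are placeholders chosen for illustration; the source names no concrete models or identifiers.

```javascript
// Task kind → model tier, following the three bullets above.
// Keys and tier labels are hypothetical; map them to real model names in configuration.
const MODEL_TIER_BY_TASK = new Map([
  ["mechanical", "cheapest"],
  ["integration", "standard"],
  ["architecture", "most-capable"],
  ["design", "most-capable"],
  ["review", "most-capable"],
]);

function selectModelTier(taskKind) {
  const tier = MODEL_TIER_BY_TASK.get(taskKind);
  if (tier === undefined) {
    throw new Error(`unknown task kind: ${taskKind}`);
  }
  return tier;
}
```

Failing loudly on an unknown kind avoids silently routing an unclassified task to the cheapest tier.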
54
+
55
+ ## Implementer Status Protocol
56
+
57
+ When running as a subagent implementer, report one of:
58
+ - **DONE** — all requirements met, tests pass
59
+ - **DONE_WITH_CONCERNS** — complete but flagging doubts about correctness or scope
60
+ - **NEEDS_CONTEXT** — cannot proceed without specific missing information
61
+ - **BLOCKED** — cannot complete; caller must restructure or escalate
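A caller receiving one of these four statuses needs to route it to a next action. The following is a rough sketch under stated assumptions: the report shape `{ status, detail }` and the action labels are hypothetical, while the four status names come from the protocol above.

```javascript
// Status → hypothetical next action for the caller.
const NEXT_ACTION_BY_STATUS = new Map([
  ["DONE", "start-review"],
  ["DONE_WITH_CONCERNS", "caller-decides-before-review"],
  ["NEEDS_CONTEXT", "supply-missing-information"],
  ["BLOCKED", "restructure-or-escalate"],
]);

// `report` is assumed to look like { status: string, detail?: string }.
function routeImplementerReport(report) {
  const action = NEXT_ACTION_BY_STATUS.get(report.status);
  if (action === undefined) {
    throw new Error(`unknown status: ${report.status}`);
  }
  // Every status except DONE is expected to explain itself to the caller.
  if (report.status !== "DONE" && !report.detail) {
    throw new Error(`${report.status} requires a detail message`);
  }
  return action;
}
```

Requiring a detail message for the non-DONE statuses reflects the protocol's intent that concerns, missing context, and blockers are described rather than merely flagged.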
62
+
39
63
  ## Examples
40
64
 
41
65
  ### Happy Path
42
66
  **Input:** "dev this story: .rihal/phases/phase-02/stories/story-005.md"
43
- **Expected behavior:** Read story, execute tasks in order, write tests, run suite after each task, mark checkboxes, update File List.
67
+ **Expected behavior:** Read story, execute tasks in order, write tests, run suite after each task, mark checkboxes, update File List. After all tasks: dispatch spec compliance reviewer, dispatch code quality reviewer, mark as "review".
44
68
 
45
69
  ### Edge Case: Missing Story
46
70
  **Input:** "dev the login story" (story file doesn't exist)
@@ -49,3 +73,7 @@ Follow the instructions in ./workflow.md.
49
73
  ### Edge Case: Red Tests Mid-Execution
50
74
  **Input:** (task 2 breaks a test from task 1)
51
75
  **Expected behavior:** STOP. Report regression. Fix before continuing.
76
+
77
+ ### Edge Case: Spec Compliance Fails Review
78
+ **Input:** Implementation complete but reviewer finds missing AC
79
+ **Expected behavior:** Fix the gap, re-run tests, re-dispatch spec compliance reviewer. Do not proceed to code quality review until spec compliance passes.
@@ -263,6 +263,21 @@ Load config from `{project-root}/.rihal/config.json` and resolve:
263
263
  <step n="5" goal="Implement task following red-green-refactor cycle">
264
264
  <critical>FOLLOW THE STORY FILE TASKS/SUBTASKS SEQUENCE EXACTLY AS WRITTEN - NO DEVIATION</critical>
265
265
 
266
+ <!-- Model selection guidance — applies when dispatching sub-tasks to subagents -->
267
+ <model_selection>
268
+ Mechanical tasks (isolated function, clear spec, 1-2 files) → use cheapest/fastest model
269
+ Integration tasks (multi-file coordination, pattern matching) → use standard model
270
+ Architecture, design, or review tasks → use most capable model
271
+ </model_selection>
272
+
273
+ <!-- Implementer status protocol — use when completing tasks as a subagent -->
274
+ <status_protocol>
275
+ DONE: All requirements met, tests pass, committed.
276
+ DONE_WITH_CONCERNS: Complete but flagging doubts — describe the concern. Caller decides before review.
277
+ NEEDS_CONTEXT: Cannot proceed without missing information — specify exactly what is needed.
278
+ BLOCKED: Cannot complete despite context — describe the blocker. Caller must restructure or escalate.
279
+ </status_protocol>
280
+
266
281
  <action>Review the current task/subtask from the story file - this is your authoritative implementation guide</action>
267
282
  <action>Plan implementation following red-green-refactor cycle</action>
268
283
 
@@ -410,6 +425,65 @@ Load config from `{project-root}/.rihal/config.json` and resolve:
410
425
  <action if="definition-of-done validation fails">HALT - Address DoD failures before completing</action>
411
426
  </step>
412
427
 
428
+ <step n="9.5" goal="Two-stage automated review: spec compliance then code quality">
429
+ <critical>Both stages must pass before the story is marked complete. Never skip either stage. Spec compliance must pass before starting code quality review.</critical>
430
+
431
+ <output>🔍 **Two-Stage Review** — verifying before handing off to human review</output>
432
+
433
+ <!-- STAGE 1: Spec Compliance -->
434
+ <output>
435
+ ━━━ Stage 1: Spec Compliance Review ━━━
436
+ </output>
437
+ <action>Spawn a fresh spec-compliance reviewer subagent. Provide it:
438
+ - Full story file contents (especially Acceptance Criteria and Tasks sections)
439
+ - List of all modified files from the File List section
440
+ - Brief: "Review that every AC is satisfied in the code. Flag anything built outside spec. Flag any AC with no corresponding implementation."
441
+ </action>
442
+
443
+ <action>Reviewer reports one of: COMPLIANT | NON_COMPLIANT (with specific gaps listed)</action>
444
+
445
+ <check if="spec compliance reviewer reports NON_COMPLIANT">
446
+ <action>Fix each gap: implement missing ACs, remove any out-of-spec additions</action>
447
+ <action>Re-run tests to confirm fixes pass</action>
448
+ <action>Re-dispatch spec compliance reviewer with the same prompt</action>
449
+ <action>Repeat until COMPLIANT</action>
450
+ </check>
451
+
452
+ <output>✅ Stage 1 passed — implementation is spec-compliant</output>
453
+
454
+ <!-- STAGE 2: Code Quality Review -->
455
+ <output>
456
+ ━━━ Stage 2: Code Quality Review ━━━
457
+ </output>
458
+ <action>Spawn a fresh code quality reviewer subagent. Provide it:
459
+ - All modified files (from File List)
460
+ - Story title and ACs (context for what was being built)
461
+ - Project coding standards (contents of CLAUDE.md or project-context.md if available)
462
+ - Brief: "Review code quality: naming conventions, error handling, test coverage depth, security, performance, maintainability. Severity: High (must fix) | Medium (should fix) | Low (note only)."
463
+ </action>
464
+
465
+ <action>Reviewer reports one of: APPROVED | APPROVED_WITH_NOTES | CHANGES_REQUIRED (severity breakdown)</action>
466
+
467
+ <check if="code quality reviewer reports CHANGES_REQUIRED (High severity issues)">
468
+ <action>Fix all High-severity issues</action>
469
+ <action>Re-run tests to confirm fixes pass</action>
470
+ <action>Re-dispatch code quality reviewer with the same prompt</action>
471
+ <action>Repeat until APPROVED or APPROVED_WITH_NOTES</action>
472
+ </check>
473
+
474
+ <check if="Medium-severity issues exist">
475
+ <action>Fix Medium-severity issues when the fix is straightforward and low-risk</action>
476
+ <action>Log unfixed Medium issues in Dev Agent Record → Completion Notes for human reviewer awareness</action>
477
+ </check>
478
+
479
+ <output>✅ Stage 2 passed — code quality verified</output>
480
+
481
+ <output>
482
+ ✅ **Two-stage review complete** — story is spec-compliant and quality-approved.
483
+ Ready for human review.
484
+ </output>
485
+ </step>
486
+
413
487
  <step n="10" goal="Completion communication and user support">
414
488
  <action>Execute the enhanced definition-of-done checklist using the validation framework</action>
415
489
  <action>Prepare a concise summary in Dev Agent Record → Completion Notes</action>
@@ -114,6 +114,37 @@ Output consumed by /rihal-execute. Plans need:
114
114
  </downstream_consumer>
115
115
 
116
116
  <deep_work_rules>
117
+ ## File Structure Map (REQUIRED — before task decomposition)
118
+
119
+ Before writing any task, produce a file structure map listing every file this plan will create or modify:
120
+
121
+ ```
122
+ FILES_TO_CREATE:
123
+ - exact/path/to/new/file.ts — responsibility: [one sentence]
124
+ FILES_TO_MODIFY:
125
+ - exact/path/to/existing.ts — what changes: [one sentence]
126
+ FILES_FOR_TESTS:
127
+ - tests/exact/path/test.ts — tests for: [one sentence]
128
+ ```
129
+
130
+ Rules:
131
+ - Each file has one clear responsibility — if you can't describe it in one sentence, split the file
132
+ - Files that change together should live together (split by responsibility, not layer)
133
+ - This map is what informs task decomposition — each task should produce self-contained changes
134
+ - In existing codebases: follow established patterns; only restructure files if a file is genuinely unwieldy and the split is included as its own task
135
+
136
+ ## No-Placeholders Rule (HARD BLOCKER)
137
+
138
+ Every step must contain the actual content the executor needs. These are **plan failures** — never write them:
139
+ - "TBD", "TODO", "implement later", "fill in details"
140
+ - "Add appropriate error handling" / "add validation" / "handle edge cases" (without code)
141
+ - "Write tests for the above" (without actual test code)
142
+ - "Similar to Task N" — copy the code; executor may read tasks out of order
143
+ - Steps that describe what to do without showing how (code blocks required for code steps)
144
+ - References to types, functions, or methods not yet defined in any task in this plan
145
+
146
+ If a step would require TBD content, either: (a) do the research now and fill it in, or (b) split into a research task that outputs a decision, followed by an implementation task that consumes it.
147
+
117
148
  ## Anti-Shallow Execution Rules (MANDATORY)
118
149
 
119
150
  Every task MUST include these fields — they are NOT optional:
@@ -185,6 +216,8 @@ Every task MUST include these fields — they are NOT optional:
185
216
  </deep_work_rules>
186
217
 
187
218
  <quality_gate>
219
+ - [ ] File structure map written before first task (files_to_create / files_to_modify / files_for_tests)
220
+ - [ ] No placeholder patterns: no TBD/TODO/implement-later, no "similar to Task N", no code steps without code
188
221
  - [ ] SPRINT.md files created in phase directory
189
222
  - [ ] Each plan has valid frontmatter including `files_modified:` array aggregating all `<files>` paths across tasks (consumed by execute.md intra-wave overlap checker)
190
223
  - [ ] Tasks are specific and actionable
@@ -195,10 +228,21 @@ Every task MUST include these fields — they are NOT optional:
195
228
  - [ ] Every task has `<done>` with a single observable acceptance sentence (Dimension 2 requirement)
196
229
  - [ ] Every `<action>` contains concrete values (no "align X with Y" without specifying what)
197
230
  - [ ] Tasks extending existing code have `<interfaces>` with relevant signatures
231
+ - [ ] Type/name consistency: function names, types, and method signatures match across all tasks (no rename drift)
198
232
  - [ ] Dependencies correctly identified
199
233
  - [ ] Waves assigned for parallel execution
200
234
  - [ ] must_haves derived from phase goal
201
235
  </quality_gate>
236
+
237
+ <self_review>
238
+ After writing the complete plan, review it against the spec with fresh eyes before handing off:
239
+
240
+ 1. **Spec coverage** — skim each requirement in the phase goal / CONTEXT.md decisions. Can you point to a task that implements it? List any gaps; add tasks if needed.
241
+ 2. **Placeholder scan** — search the plan for the no-placeholder patterns listed above. Fix any found inline.
242
+ 3. **Type consistency** — check that function names, types, and method signatures used in later tasks match what earlier tasks define. A method called `clearLayers()` in Task 3 but `clearFullLayers()` in Task 7 is a bug.
243
+
244
+ Fix issues inline. No sub-agent needed — this is a quick self-check before the sprint-checker runs.
245
+ </self_review>
202
246
  ```
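The placeholder scan from the self-review step could be partly automated with a line-by-line pattern search. This is a sketch, not the package's checker: the regular expressions below cover only the literal phrases listed in the No-Placeholders Rule, and the function name is hypothetical. Steps that describe work without showing code still need a human or model read.

```javascript
// Literal phrases from the No-Placeholders Rule; the list is illustrative, not exhaustive.
const PLACEHOLDER_PATTERNS = [
  /\bTBD\b/,
  /\bTODO\b/,
  /\bimplement later\b/i,
  /\bfill in details\b/i,
  /\bsimilar to task \d+\b/i,
  /\bwrite tests for the above\b/i,
];

// Returns one entry per matching pattern, with a 1-based line number and the matched text.
function findPlaceholders(planText) {
  const hits = [];
  planText.split("\n").forEach((line, index) => {
    for (const pattern of PLACEHOLDER_PATTERNS) {
      const match = line.match(pattern);
      if (match) hits.push({ line: index + 1, text: match[0] });
    }
  });
  return hits;
}
```

An empty result means only that none of the listed phrases appear; it does not show that every code step actually contains code.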
203
247
 
204
248
  ```