npm - @ericrisco/rsc - Versions diffs - 0.1.32 → 0.1.33 - Mend

@ericrisco/rsc 0.1.32 → 0.1.33

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (38) hide show

package/README.md +4 -4
package/manifest.json +24 -5
package/package.json +1 -1
package/scripts/lib/domains.js +1 -1
package/skills/analyze/SKILL.md +1 -0
package/skills/author-skill/SKILL.md +20 -0
package/skills/author-skill/references/description-recipe.md +2 -0
package/skills/debug/SKILL.md +1 -1
package/skills/implement/SKILL.md +72 -2
package/skills/implement/references/per-task-review.md +46 -0
package/skills/implement/scripts/review-package +59 -0
package/skills/implement/scripts/sdd-workspace +47 -0
package/skills/implement/scripts/task-brief +77 -0
package/skills/parallel/SKILL.md +29 -0
package/skills/plan/references/plan-template.md +18 -0
package/skills/roast-me/SKILL.md +124 -0
package/skills/roast-me/evals/README.md +76 -0
package/skills/roast-me/evals/cases.yaml +75 -0
package/skills/roast-me/prompts/analyze.md +90 -0
package/skills/roast-me/prompts/compute.md +100 -0
package/skills/roast-me/prompts/roast.md +181 -0
package/skills/roast-me/tools/adapters/__init__.py +1 -0
package/skills/roast-me/tools/adapters/__pycache__/__init__.cpython-312.pyc +0 -0
package/skills/roast-me/tools/adapters/__pycache__/base.cpython-312.pyc +0 -0
package/skills/roast-me/tools/adapters/__pycache__/claude.cpython-312.pyc +0 -0
package/skills/roast-me/tools/adapters/__pycache__/codex.cpython-312.pyc +0 -0
package/skills/roast-me/tools/adapters/__pycache__/gemini.cpython-312.pyc +0 -0
package/skills/roast-me/tools/adapters/__pycache__/registry.cpython-312.pyc +0 -0
package/skills/roast-me/tools/adapters/base.py +53 -0
package/skills/roast-me/tools/adapters/claude.py +140 -0
package/skills/roast-me/tools/adapters/codex.py +113 -0
package/skills/roast-me/tools/adapters/gemini.py +121 -0
package/skills/roast-me/tools/adapters/registry.py +68 -0
package/skills/roast-me/tools/extract_prompts.py +520 -0
package/skills/ship/SKILL.md +9 -1
package/skills/specify/SKILL.md +21 -1
package/skills/tasks/SKILL.md +25 -0
package/skills/worktrees/SKILL.md +25 -0

package/README.md CHANGED Viewed

@@ -21,7 +21,7 @@ skills that fit — one at a time — into every assistant you pick, and keeps t
 equipped as you work.
 From *"document my company"* to *"ship a FastAPI service"* to *"grow my YouTube
-channel"* — **231 skills across 21 domains**, every one researched against live
+channel"* — **232 skills across 21 domains**, every one researched against live
 2025-2026 sources and **adversarially scored ≥ 8.5/10** before it shipped.
 ```bash
@@ -129,7 +129,7 @@ $ rsc
  ██████╗ ███████╗ ██████╗     ← animated gradient wordmark
  ██╔══██╗██╔════╝██╔════╝
  ██████╔╝███████╗██║
-  231 skills · one CLI · zero bloat
+  232 skills · one CLI · zero bloat
 What do you want to do?          ↑↓ move · enter select
 ❯ Base install — the essentials (orient + suggest + harness + init)
@@ -243,7 +243,7 @@ just asks in plain language.
 ## The catalog
-231 skills, grouped by what you're trying to do. Click any skill to read its
+232 skills, grouped by what you're trying to do. Click any skill to read its
 `SKILL.md`. It fires on its own when a task matches.
 ### 🧭 Core & control plane
@@ -328,7 +328,7 @@ Each with a `02-DOCS` feedback loop that learns from your own results. `remotion
 ### 🧠 Knowledge & meta
-[knowledge-ops](skills/knowledge-ops/) · [codebase-onboarding](skills/codebase-onboarding/) · [research-ops](skills/research-ops/) · [decision-records](skills/decision-records/) · [continuous-learning](skills/continuous-learning/) · [skill-scout](skills/skill-scout/) · [context-budget](skills/context-budget/)
+[knowledge-ops](skills/knowledge-ops/) · [codebase-onboarding](skills/codebase-onboarding/) · [research-ops](skills/research-ops/) · [decision-records](skills/decision-records/) · [continuous-learning](skills/continuous-learning/) · [skill-scout](skills/skill-scout/) · [context-budget](skills/context-budget/) · [roast-me](skills/roast-me/)
 ---

package/manifest.json CHANGED Viewed

@@ -1,7 +1,7 @@
 {
-  "version": "0.1.32",
+  "version": "0.1.33",
   "counts": {
-    "skills": 231
+    "skills": 232
   },
   "skills": [
     {
@@ -1123,7 +1123,7 @@
     },
     {
       "id": "debug",
-      "description": "Use when something is broken and the impulse is to start patching — a bug, a failing or flaky test, a crash, an exception, a wrong result, a regression, or behavior nobody can explain — and you want the root cause found BEFORE any fix is proposed. The on-demand rsc-sdd diagnosis discipline: reproduce → isolate → hypothesize → fix → verify, callable from inside implement (or any phase) the moment a check fails for a reason you don't understand. Triggers: 'debug this', 'why is this failing', 'this test is flaky', 'find the root cause', 'it works locally but not in CI', 'this used to work', 'track down this bug', 'the fix didn't stick', 'figure out why', 'NullPointer/segfault/500 out of nowhere', 'intermittent failure'. It diagnoses and fixes the ONE confirmed cause, then hands back. NOT writing a planned feature (implement), NOT the lint/test gate (verify), NOT adversarial diff reading (review). Honors the harness accompaniment dial.",
+      "description": "Use when something is broken and the impulse is to start patching — a bug, a failing or flaky test, a crash, an exception, a wrong result, a regression, or behavior nobody can explain — and you want the root cause found BEFORE any fix is proposed. The on-demand rsc-sdd diagnosis discipline, callable from inside implement (or any phase) the moment a check fails for a reason you don't understand. Triggers: 'debug this', 'why is this failing', 'this test is flaky', 'find the root cause', 'it works locally but not in CI', 'this used to work', 'track down this bug', 'the fix didn't stick', 'figure out why', 'NullPointer/segfault/500 out of nowhere', 'intermittent failure'. Fixes the ONE confirmed cause, then hands back. NOT writing a planned feature (that is `implement`), NOT the lint/test gate (that is `verify`), NOT adversarial diff reading (that is `review`). Honors the harness accompaniment dial.",
       "tags": [
         "debug",
         "bug",
@@ -2001,7 +2001,7 @@
     },
     {
       "id": "implement",
-      "description": "Use when executing a planned feature into working, tested code — turning the tasks list into commits, writing code task-by-task under TDD discipline (failing test first, then the code that passes it, then refactor), with a stop-and-show checkpoint after each task. Triggers: 'implement the plan', 'build out the tasks', 'start coding this feature', 'work through the task list', 'execute the implementation', 'write the code for this spec', 'do task 3', 'continue implementing', 'red-green-refactor this', 'implement with tests first'. The SDD phase AFTER analyze and BEFORE verify. Embeds red→green→refactor and delegates concrete test tooling to the stack skill (fastapi/go/nextjs/flutter); fans independent tasks out via parallel; logs every non-obvious choice to 02-DOCS/wiki/sdd/decisions.md and stays inside the constitution. NOT spec writing, NOT planning, NOT the final lint/audit gate.",
+      "description": "Use when an approved plan and task list exist and it is time to turn them into working, tested code — the SDD phase AFTER analyze and BEFORE verify. Enforces TDD discipline: a test fails before the code that makes it pass. Triggers: 'implement the plan', 'build out the tasks', 'start coding this feature', 'work through the task list', 'execute the implementation', 'write the code for this spec', 'do task 3', 'continue implementing', 'red-green-refactor this', 'implement with tests first'. Delegates concrete test tooling to the stack skill (fastapi/go/nextjs/flutter) and fans independent tasks out via `parallel`. NOT spec writing (that is `specify`), NOT planning (that is `plan`), NOT the final lint/test/audit gate (that is `verify`).",
       "tags": [
         "sdd",
         "implement",
@@ -3557,6 +3557,25 @@
       ],
       "profiles": []
     },
+    {
+      "id": "roast-me",
+      "description": "Use when you want honest, comedic feedback on how you prompt — analyzing your own past sessions to score prompt quality and compute efficiency, surface your worst habits, and generate a model-selection cheat sheet. Triggers: 'roast me', 'roast my prompting', 'audit my prompting habits', 'how good are my prompts', 'am I prompting well', 'what are my bad prompt habits', 'analiza mis prompts', 'puntua els meus prompts', 'how much am I wasting on model costs'. NOT reviewing your code or output quality (use code-review for that) and NOT a general prompt-engineering tutorial (use prompt-engineering for that).",
+      "tags": [
+        "prompting",
+        "self-audit",
+        "compute-efficiency",
+        "ai-hygiene",
+        "learning"
+      ],
+      "recommends": [
+        "prompt-engineering",
+        "code-review",
+        "context-budget"
+      ],
+      "profiles": [
+        "full"
+      ]
+    },
     {
       "id": "ruby",
       "description": "Use when writing idiomatic Ruby outside Rails — plain scripts, CLIs, gems, libraries — or refactoring imperative loops into Enumerable chains, designing module mixins, packaging with Bundler, deciding whether metaprogramming is worth it, or setting up Minitest/RSpec. Triggers: 'write this in Ruby', 'refactor this loop with map/reduce', 'build a gem', 'gemspec / Gemfile.lock', 'method_missing or define_method?', 'Minitest vs RSpec', 'FrozenError after Ruby 3.4 upgrade', 'mixin vs inheritance', 'escribir una gema en Ruby', 'refactoritzar amb blocs'. NOT Rails/ActiveRecord/controllers (that is rails).",
@@ -3762,7 +3781,7 @@
     },
     {
       "id": "ship",
-      "description": "Use when the work is complete and verified and it is time to CLOSE the development branch — the final phase of the rsc SDD chain, after review approves the diff. Triggers: 'ship it', 'close the branch', 'open the PR', 'merge this', 'merge into main', 'create the pull request', 'how do I land this work', 'finish this feature', 'haz el merge', 'abre el PR', 'cierra la rama', 'súbelo a main', 'clean up the branch', 'I'm done, what now'. Presents exactly three landing options (direct merge / pull request / park-or-discard) matched to the workflow, runs the pre-ship safety checklist (review verdict, clean tree, rebased, no secrets), and writes the commit and PR. HARD RULE: git authorship is ALWAYS Eric — never Claude. No Co-Authored-By Claude, no 'generated with' footer in any commit or PR body. NOT running lint/type/test (that is verify), NOT reading the diff adversarially (that is review), NOT the deploy/release mechanics to a server (that is deployment). Honors the harness accompaniment dial.",
+      "description": "Use when the work is complete and verified and it is time to CLOSE the development branch — the final phase of the rsc SDD chain, after review approves the diff. Triggers: 'ship it', 'close the branch', 'open the PR', 'merge this', 'merge into main', 'create the pull request', 'how do I land this work', 'finish this feature', 'haz el merge', 'abre el PR', 'cierra la rama', 'súbelo a main', 'clean up the branch', 'I'm done, what now'. HARD RULE it enforces: git authorship is ALWAYS Eric — never a Co-Authored-By or 'generated with' footer in any commit or PR. NOT running lint/type/test (that is `verify`), NOT reading the diff adversarially (that is `review`), NOT deploy/release mechanics to a server (that is `deployment`). Honors the harness accompaniment dial.",
       "tags": [
         "sdd",
         "ship",

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@ericrisco/rsc",
-  "version": "0.1.32",
+  "version": "0.1.33",
   "description": "Eric Risco's agent-skills catalog as a granular, self-recommending CLI installer.",
   "type": "module",
   "bin": {

package/scripts/lib/domains.js CHANGED Viewed

@@ -22,7 +22,7 @@ export const DOMAINS = [
   { title: 'Ship & operate — devops', ids: ['docker', 'github-actions', 'git-workflow', 'domains-dns', 'monitoring', 'email-deliverability', 'scaling', 'deployment'] },
   { title: 'Ship & operate — quality & security', ids: ['code-review', 'security-scan', 'secure-coding', 'testing-py', 'testing-web', 'testing-go', 'e2e-testing', 'accessibility', 'performance', 'error-handling', 'observability'] },
   { title: 'Design & content craft', ids: ['design', 'presentations', 'course-storytelling', 'course-builder', 'technical-writing', 'translation-l10n'] },
-  { title: 'Knowledge & meta', ids: ['knowledge-ops', 'codebase-onboarding', 'research-ops', 'decision-records', 'continuous-learning', 'skill-scout', 'context-budget'] },
+  { title: 'Knowledge & meta', ids: ['knowledge-ops', 'codebase-onboarding', 'research-ops', 'decision-records', 'continuous-learning', 'skill-scout', 'context-budget', 'roast-me'] },
 ];
 export function allDomainIds() {

package/skills/analyze/SKILL.md CHANGED Viewed

@@ -74,6 +74,7 @@ Run every one. Each compares a specific pair (or the whole set against the const
 4. **Contradiction** — direct disagreements between two artifacts: the spec says Postgres, the plan says SQLite; the spec says "no auth in v1", a task adds login. Quote both sides.
 5. **Duplication** — the same requirement stated twice in conflicting words, or two tasks doing the same job. Duplication is where contradictions breed later.
 6. **Ambiguity / underspecification** — requirements or tasks too vague to implement or to verify ("handle errors gracefully", "make it fast" with no number, a task with no done-check). These do not block by themselves but feed back to `clarify` (spec) or `tasks` (missing done-check).
+   - **Carrier completeness (isolated-implementer check).** Because `implement`/`parallel` dispatch tasks to **context-isolated** `developer` subagents that see only their own task, also confirm the plan carries a **§0 Global Constraints** block (verbatim project-wide values) and that every task whose correctness depends on a contract it doesn't own has an **Interfaces** block (`Consumes`/`Produces`, exact signatures). A constraint or neighbor-signature that lives only in prose is invisible to the blind worker — flag a missing carrier as `AMBIGUOUS` (it will surface as drift or breakage at implement time). See `../plan/references/plan-template.md` §0 and `../tasks/SKILL.md` (Per-task Interfaces).
 ### Requirement coverage map (build this every run)

package/skills/author-skill/SKILL.md CHANGED Viewed

@@ -137,6 +137,26 @@ Match the catalog, do not invent a new register:
 - Original prose. Mine ideas from anywhere; the words are Eric's. Do **not** reproduce another ecosystem's signature artifacts or phrasing — no borrowed "1% chance" urgency blocks, no copied rationalization wording, no `*-reviewer-prompt.md` files, no verbatim flowcharts. The rsc identity is its own.
 - Cross-reference siblings by name or `../<sibling>/SKILL.md`, only ones that actually exist.
+## Match the form to the failure
+Before you write an instruction, name the **failure** it's meant to prevent — then pick the form
+that actually fixes *that* failure. The instinct is to write a prohibition ("don't do X") for
+everything. That instinct is wrong for most failures, and measurably counter-productive for one
+class: **a prohibition aimed at output shape tends to summon the very thing it forbids** (the model
+attends to the named token), and can do *worse* than saying nothing at all. Match deliberately:
+| The failure is… | Use this form | Why, and example |
+| --- | --- | --- |
+| **Discipline** — the agent knows the rule but skips it under pressure (time, sunk cost, "just this once") | **Prohibition + rationalization table** (`Anti-patterns → STOP`) | The agent already knows what's right; it needs the excuse pre-refuted at the moment of temptation. This is exactly what the `STOP` tables are for. "I'll skip the failing test to ship" → name the rationalization, kill it. |
+| **Wrong-shaped output** — tone, verbosity, format, structure come out wrong | **Positive recipe / contract** (show the target shape) | A prohibition ("don't be verbose", "no marketing fluff") makes it *more* likely — the model fixates on the banned shape. Give the shape to hit instead: "Reply in ≤3 sentences, lead with the verdict." Demonstrate, don't forbid. |
+| **Omitted element** — the agent forgets a required piece | **Required structural slot** (a checklist item or a template field it must fill) | You can't prohibit an absence. Make the slot mandatory so its emptiness is visible — a `Done-of-done` checkbox, a template section, a result-envelope field. |
+| **Conditional behavior** — right action depends on the situation | **Predicate-keyed conditional** ("When X → do Y; otherwise Z") | A flat rule fires in the wrong context. Key the behavior to its trigger so the agent branches correctly instead of over- or under-applying. |
+So: keep the `Anti-patterns → STOP` tables — they're the correct tool for *discipline* failures,
+which is most of what the SDD-discipline skills guard. But when you catch yourself writing "don't
+make it X" about the *shape* of an output, stop and rewrite it as the shape to hit. Prohibition is
+a scalpel for one failure type, not the default for all of them.
 ## The best-practice rubric (audit before shipping)
 A skill ships only when every box is checked or a miss is consciously justified.

package/skills/author-skill/references/description-recipe.md CHANGED Viewed

@@ -26,6 +26,7 @@ Four moves, in order:
 - **≤ 1024 characters.** Count them. Over budget = trim verb phrases and trigger duplicates first, never the boundary.
 - **Valid single-line quoted YAML.** The description is one physical line wrapped in double quotes. No raw newlines inside the value. Avoid characters that break YAML in double quotes; if you need an apostrophe inside, it is fine (single quotes are literal inside double-quoted YAML). Never put an unescaped `"` inside.
 - **Third person, present tense.** The agent reads *about* the skill. "Use when…", "Triggers on…", "Knows…". Never "I", never "you should".
+- **State triggers and boundary — never the workflow.** The description says *when* to fire and *what it is not*. It must **not** summarize the procedure the body owns (the steps, the phase count, "does X then Y then Z"). This is not a style nit — it changes behavior. A description that pre-summarizes the steps becomes a *substitute* for reading the body: the agent acts on the summary and skips the real instructions. (Observed in the wild: a skill whose description said it runs *two* reviews made the agent run only *one*, because the summary was treated as the spec; deleting the workflow summary made the agent read the body and do both.) The verb phrases in move #2 name *capabilities* ("repairing cases.yaml"), not an ordered recipe ("first lint, then split, then validate"). If your description tells the reader how the skill works step by step, cut it down to triggers + boundary.
 ## Budget tactics when you blow 1024
@@ -61,6 +62,7 @@ Before committing a description, ask:
 - Could the router tell from this line alone *when* to fire it? (situation present)
 - Are there phrasings a real user would type, in their language? (triggers present)
 - Does it say what it is **not**, naming the sibling? (boundary present)
+- Does it describe *when/what-not*, and **never** the step-by-step procedure? (no workflow summary — if a reader could skip the body and act on the description alone, it leaks the workflow)
 - Does it parse as YAML and fit 1024? (run the check)
 If any answer is no, it is not done.

package/skills/debug/SKILL.md CHANGED Viewed

@@ -1,6 +1,6 @@
 ---
 name: debug
-description: "Use when something is broken and the impulse is to start patching — a bug, a failing or flaky test, a crash, an exception, a wrong result, a regression, or behavior nobody can explain — and you want the root cause found BEFORE any fix is proposed. The on-demand rsc-sdd diagnosis discipline: reproduce → isolate → hypothesize → fix → verify, callable from inside implement (or any phase) the moment a check fails for a reason you don't understand. Triggers: 'debug this', 'why is this failing', 'this test is flaky', 'find the root cause', 'it works locally but not in CI', 'this used to work', 'track down this bug', 'the fix didn't stick', 'figure out why', 'NullPointer/segfault/500 out of nowhere', 'intermittent failure'. It diagnoses and fixes the ONE confirmed cause, then hands back. NOT writing a planned feature (implement), NOT the lint/test gate (verify), NOT adversarial diff reading (review). Honors the harness accompaniment dial."
+description: "Use when something is broken and the impulse is to start patching — a bug, a failing or flaky test, a crash, an exception, a wrong result, a regression, or behavior nobody can explain — and you want the root cause found BEFORE any fix is proposed. The on-demand rsc-sdd diagnosis discipline, callable from inside implement (or any phase) the moment a check fails for a reason you don't understand. Triggers: 'debug this', 'why is this failing', 'this test is flaky', 'find the root cause', 'it works locally but not in CI', 'this used to work', 'track down this bug', 'the fix didn't stick', 'figure out why', 'NullPointer/segfault/500 out of nowhere', 'intermittent failure'. Fixes the ONE confirmed cause, then hands back. NOT writing a planned feature (that is `implement`), NOT the lint/test gate (that is `verify`), NOT adversarial diff reading (that is `review`). Honors the harness accompaniment dial."
 tags: [debug, bug, troubleshoot]
 recommends: []
 profiles: [core, full]

package/skills/implement/SKILL.md CHANGED Viewed

@@ -1,6 +1,6 @@
 ---
 name: implement
-description: "Use when executing a planned feature into working, tested code — turning the tasks list into commits, writing code task-by-task under TDD discipline (failing test first, then the code that passes it, then refactor), with a stop-and-show checkpoint after each task. Triggers: 'implement the plan', 'build out the tasks', 'start coding this feature', 'work through the task list', 'execute the implementation', 'write the code for this spec', 'do task 3', 'continue implementing', 'red-green-refactor this', 'implement with tests first'. The SDD phase AFTER analyze and BEFORE verify. Embeds red→green→refactor and delegates concrete test tooling to the stack skill (fastapi/go/nextjs/flutter); fans independent tasks out via parallel; logs every non-obvious choice to 02-DOCS/wiki/sdd/decisions.md and stays inside the constitution. NOT spec writing, NOT planning, NOT the final lint/audit gate."
+description: "Use when an approved plan and task list exist and it is time to turn them into working, tested code — the SDD phase AFTER analyze and BEFORE verify. Enforces TDD discipline: a test fails before the code that makes it pass. Triggers: 'implement the plan', 'build out the tasks', 'start coding this feature', 'work through the task list', 'execute the implementation', 'write the code for this spec', 'do task 3', 'continue implementing', 'red-green-refactor this', 'implement with tests first'. Delegates concrete test tooling to the stack skill (fastapi/go/nextjs/flutter) and fans independent tasks out via `parallel`. NOT spec writing (that is `specify`), NOT planning (that is `plan`), NOT the final lint/test/audit gate (that is `verify`)."
 tags: [sdd, implement, code]
 recommends: [verify]
 profiles: [core, full]
@@ -66,7 +66,9 @@ REFACTOR → with the test green, clean up names/duplication/shape. Re-run; stil
 CHECK    → re-read the task's done-check. Met? Constitution still honored? Decision worth logging?
 PROGRESS → append task/test/blocker/decision state to 02-DOCS/wiki/sdd/progress/<slug>.md.
 COMMIT   → commit this task as one logical unit (authorship = Eric; see ship for the rule).
-CHECKPOINT → stop and show: what changed, test output, the next task. Wait per the dial.
+REVIEW   → dispatch a fresh reviewer subagent over THIS task's commits (see "Per-task review gate").
+           Fold its Critical/Important findings back in before you move on; Minor can wait.
+CHECKPOINT → stop and show: what changed, test output, review verdict, the next task. Wait per the dial.
 ```
 The order is load-bearing. **Red before green** is not a suggestion — a test you write *after* the
@@ -143,6 +145,18 @@ Entry shape:
 This file is what makes resumes and archive reliable. Never rewrite old entries; append corrections as new entries.
+### Resume & recovery — trust the ledger, not memory
+The progress file is the source of truth across compaction and restarts:
+- **A task recorded `status: complete` here is DONE — never re-dispatch it.** Re-running completed
+  tasks (re-implementing, re-committing) is the single most expensive recovery failure: it rebuilds
+  work that already exists and can clobber later commits.
+- **After a compaction or a fresh session, reconstruct state from this ledger + `git log`, not from
+  memory.** Read the progress entries and the commit history first; resume at the first task not
+  marked complete. If the ledger and the working tree disagree, believe the tree and append a
+  correction entry — do not silently re-run.
 ## Running independent tasks in parallel
 When two or more tasks are genuinely disjoint — **no shared files, no shared state, no ordering
@@ -164,6 +178,62 @@ delegate a task or fan out via `parallel`, dispatch it to that agent (e.g. Claud
 `subagent_type: developer`) so the bulk of TDD execution runs on the cheaper-but-capable model
 while you stay on the session model to orchestrate. Full rationale: `../sdd/references/model-routing.md`.
+### The dispatch contract (carriers, statuses, escalation)
+A dispatched worker is **context-isolated** — it sees only what you hand it, not your session. Make
+each dispatch self-sufficient and make its return machine-checkable:
+- **Carry the task, not your context.** Hand the worker its task brief, its **Interfaces** block
+  (`Consumes`/`Produces`, exact signatures — from `tasks`), and the plan's **§0 Global Constraints**.
+  Anything not in those three is invisible to it. Prefer **file paths over pasted text** (use
+  `scripts/task-brief` and `scripts/review-package`, below) — pasted briefs and diffs stay resident
+  in your context and get re-read every turn.
+- **Model is REQUIRED on every dispatch.** Always set the subagent's model explicitly (the
+  `developer` agent already pins balanced). An omitted model silently inherits the session's model —
+  usually the most expensive one — which is the quiet way fan-out cost explodes.
+- **The worker reports one of four statuses** (it does not guess when unsure):
+  - `DONE` — done-check green, `Produces` contract met.
+  - `DONE_WITH_CONCERNS` — green, but it flagged a risk/assumption for you to adjudicate.
+  - `BLOCKED` — cannot proceed (failing dependency, contradiction, missing access). Stops; does not
+    hack around it.
+  - `NEEDS_CONTEXT` — a `Consumes`/constraint it needed was not in the brief. Stops; names what's missing.
+- **Never force the same model to retry unchanged.** On `BLOCKED`/`NEEDS_CONTEXT` you must *change
+  something* before re-dispatching: add the missing context/interface, fix the dependency, or
+  escalate the tier (balanced → heavy) for a genuinely hard sub-problem. Re-running an identical
+  prompt on an identical model is how a loop burns budget without progress.
+## Per-task review gate
+After a task is committed and before its checkpoint, run a **small, fresh-eyes review of that task
+alone** — catching over-/under-building while it is one commit cheap to fix, instead of waiting for
+the whole-branch review when it is buried in a large diff. It is a mid-build gate, not the finish
+line (see the boundary below).
+How to run it:
+1. **Package the diff as a file, not a paste.** `scripts/review-package <BASE> <HEAD>` where `BASE`
+   is the task's *recorded* base sha (never `HEAD~1` — that drops every commit but the last of a
+   multi-commit task). It writes the commit list + `--stat` + full diff to one file.
+2. **Dispatch a fresh reviewer subagent** (context-isolated — it must NOT inherit your session
+   reasoning). Hand it: the review-package *path*, the task's **done-check**, the task's
+   **Interfaces → Produces** contract, and the plan's **§0 Global Constraints**. With routing on,
+   balanced is the floor; escalate to heavy for a risky/security-touching task. The frozen brief
+   lives in `references/per-task-review.md`.
+3. **It returns two verdicts in one pass:** *spec compliance* (missing / extra / misunderstood vs the
+   done-check + Produces) and *code quality* — every finding tagged severity
+   (`Critical`/`Important`/`Minor`) with `file:line` evidence.
+4. **Fold `Critical`/`Important` back in now**; `Minor` may wait for the end-of-branch review.
+**Anti-pre-judging (hard rule).** The dispatch must never tell the reviewer what *not* to flag,
+pre-rate a finding's severity, or explain why a choice is fine ("the plan chose X, treat as Minor").
+A stated rationale never downgrades a finding — that is laundering your own bias through the reviewer.
+Hand it the evidence and the contract; let it judge.
+**Boundary — this does not replace the end gates.** `verify` still runs the *whole* suite and the
+acceptance criteria once at the end, and `review` still reads the *whole* diff adversarially once
+before `ship`. The per-task gate is the cheap early pass; the broad reviews remain. Skip the per-task
+gate on a genuinely trivial task (a one-line change), like the rest of the chain.
 ## Model tier — `balanced` (opt-in routing)
 This phase's default model tier is **`balanced`** — it is the bulk of TDD execution: cost-sensitive, with quality balanced handles well. Routing is **off** unless `models.enabled: true` in `02-DOCS/wiki/sdd/config.yaml`. When on: resolve this phase's tier (`models.overrides` wins over `models.phases`), map it to a model via `models.tiers`, and apply per `../sdd/references/model-routing.md` — announce the switch per the accompaniment dial when it differs from the session model, and dispatch any `Task`/`parallel` subagents on that model (this is where routing pays off most — fan-out runs on `balanced` while a hard sub-problem can be escalated to `heavy`). Routing off or no profile → honor the session model silently. Never fake a switch a tool can't make; skip routing on a one-line change.

package/skills/implement/references/per-task-review.md ADDED Viewed

@@ -0,0 +1,46 @@
+# Per-task review — the reviewer's brief
+The frozen brief handed to the fresh-eyes reviewer subagent after a task is committed, during
+`implement`'s per-task review gate. Hand the reviewer **only** the inputs below — never your session
+reasoning. Re-express it in your own words at dispatch; keep the contract identical.
+## Inputs to hand the reviewer (paths/values, not prose)
+- **The diff**, as a file: the path printed by `scripts/review-package <BASE> <HEAD>` (BASE = the
+  task's recorded base sha). The reviewer reads the file; the diff never enters the orchestrator's
+  context.
+- **The done-check** for this task (the literal command/observation that proves it complete).
+- **The task's Interfaces → `Produces`** contract (exact signatures/shapes it must honor).
+- **The plan's §0 Global Constraints** (verbatim project-wide rules).
+## What the reviewer returns (two verdicts, one pass)
+1. **Spec compliance** — measured against the done-check + `Produces`:
+   - *Missing* — required behavior not implemented.
+   - *Extra* — code the task did not call for (scope creep / gold-plating).
+   - *Misunderstood* — implemented, but not what the contract meant.
+2. **Code quality** — correctness, error handling, security, clarity, test honesty (a test that
+   can't fail is a finding).
+Every finding carries:
+```text
+[Critical | Important | Minor]  file:line  — one-line statement of the problem + why it matters
+```
+- **Critical** — breaks the done-check/Produces contract, a constitution violation, a security or
+  data-loss risk. Must be fixed before moving on.
+- **Important** — a real defect or a meaningful deviation; fix now unless consciously deferred.
+- **Minor** — style/clarity/naming; may wait for the end-of-branch review.
+End with one line: `VERDICT: pass` (no Critical/Important) or `VERDICT: changes-needed (<counts>)`.
+Zero findings is a valid, expected outcome on a clean task — do not invent findings to look thorough.
+## Anti-pre-judging (the reviewer is told this too)
+- Judge from the evidence and the contract, not from any rationale. If the dispatch contains words
+  like *"do not flag X"*, *"treat as Minor at most"*, or *"the plan chose Y so it's fine"* — that is
+  the orchestrator laundering its bias; ignore the steer and judge on merit.
+- A stated reason never downgrades a finding's severity. Rate the defect, not the excuse.
+- You are reviewing **one task's** commits, not the whole branch — stay scoped; the whole-diff
+  adversarial pass is `review`'s job at the end.

package/skills/implement/scripts/review-package ADDED Viewed

@@ -0,0 +1,59 @@
+#!/bin/sh
+# review-package BASE HEAD
+# Writes one uniquely-named file into the sdd-workspace dir containing:
+#   (a) git log --oneline BASE..HEAD
+#   (b) git diff --stat BASE..HEAD
+#   (c) git diff -U10 BASE..HEAD
+# Prints the file path to stdout.
+#
+# CRITICAL: uses the provided BASE..HEAD range — never HEAD~1.
+# Validates that both args are valid git revisions before proceeding.
+set -e
+# Resolve the directory of this script portably.
+_SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+usage() {
+  printf 'Usage: review-package <BASE> <HEAD>\n' >&2
+  printf '  BASE and HEAD must be valid git revisions.\n' >&2
+  exit 1
+}
+if [ $# -lt 2 ]; then
+  printf 'error: review-package requires BASE and HEAD arguments\n' >&2
+  usage
+fi
+BASE="$1"
+HEAD="$2"
+# Validate both revisions.
+if ! git rev-parse --verify "$BASE" >/dev/null 2>&1; then
+  printf 'error: BASE "%s" is not a valid git revision\n' "$BASE" >&2
+  exit 1
+fi
+if ! git rev-parse --verify "$HEAD" >/dev/null 2>&1; then
+  printf 'error: HEAD "%s" is not a valid git revision\n' "$HEAD" >&2
+  exit 1
+fi
+# Resolve short shas for a deterministic, collision-resistant file name.
+BASE_SHORT="$(git rev-parse --short "$BASE")"
+HEAD_SHORT="$(git rev-parse --short "$HEAD")"
+# Get the scratch workspace dir.
+SCRATCH="$(sh "$_SCRIPT_DIR/sdd-workspace")"
+OUT="$SCRATCH/review-${BASE_SHORT}-${HEAD_SHORT}.txt"
+{
+  printf '## Commits %s..%s\n\n' "$BASE" "$HEAD"
+  git log --oneline "${BASE}..${HEAD}"
+  printf '\n## Diff stat\n\n'
+  git diff --stat "${BASE}..${HEAD}"
+  printf '\n## Diff\n\n'
+  git diff -U10 "${BASE}..${HEAD}"
+} > "$OUT"
+printf '%s\n' "$OUT"

package/skills/implement/scripts/sdd-workspace ADDED Viewed

@@ -0,0 +1,47 @@
+#!/bin/sh
+# sdd-workspace — ensures a self-ignoring scratch dir for ephemeral briefs/diffs/reports
+# and prints its absolute path to stdout. Idempotent.
+#
+# Usage: sdd-workspace [DIR_OVERRIDE]
+#   $1  optional explicit dir override; if provided, use that dir instead of auto-detecting.
+#
+# Auto-detection: if a 02-DOCS dir exists at repo root, use 02-DOCS/wiki/sdd/.scratch/
+# Otherwise fall back to ${TMPDIR:-/tmp}/rsc-sdd-<pid>.
+set -e
+if [ -n "$1" ]; then
+  SCRATCH="$1"
+else
+  # Walk up from cwd to find repo root (where 02-DOCS lives), or just use cwd.
+  _find_repo_root() {
+    _dir="$PWD"
+    while [ "$_dir" != "/" ]; do
+      if [ -d "$_dir/02-DOCS" ]; then
+        printf '%s' "$_dir"
+        return 0
+      fi
+      _dir="$(dirname "$_dir")"
+    done
+    # Not found; return empty
+    printf ''
+  }
+  _ROOT="$(_find_repo_root)"
+  if [ -n "$_ROOT" ] && [ -d "$_ROOT/02-DOCS" ]; then
+    SCRATCH="$_ROOT/02-DOCS/wiki/sdd/.scratch"
+  else
+    # Use a fixed stable name under TMPDIR so repeated calls return the same path.
+    SCRATCH="${TMPDIR:-/tmp}/rsc-sdd-scratch"
+  fi
+fi
+mkdir -p "$SCRATCH"
+# Drop a self-ignoring .gitignore so nothing in this dir is ever tracked.
+if [ ! -f "$SCRATCH/.gitignore" ]; then
+  printf '*\n' > "$SCRATCH/.gitignore"
+fi
+printf '%s\n' "$SCRATCH"

package/skills/implement/scripts/task-brief ADDED Viewed

@@ -0,0 +1,77 @@
+#!/bin/sh
+# task-brief PLAN_FILE N
+# Extracts task number N from a plan/tasks markdown file into a brief file in the
+# sdd-workspace dir, and prints its path.
+#
+# A task starts at a heading matching: ### T<N> / ### T<N>( / ### T<N> — / ## T<N>
+# and ends just before the next ### T / ## heading.
+# If task N is not found, exits nonzero with a clear message.
+# Resolve the directory of this script portably.
+_SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+usage() {
+  printf 'Usage: task-brief <PLAN_FILE> <N>\n' >&2
+  printf '  PLAN_FILE  path to the tasks markdown file\n' >&2
+  printf '  N          task number to extract (e.g. 2 for T2)\n' >&2
+  exit 1
+}
+if [ $# -lt 2 ]; then
+  printf 'error: task-brief requires PLAN_FILE and N arguments\n' >&2
+  usage
+fi
+PLAN_FILE="$1"
+N="$2"
+if [ ! -f "$PLAN_FILE" ]; then
+  printf 'error: plan file not found: %s\n' "$PLAN_FILE" >&2
+  exit 1
+fi
+# Validate N is a positive integer.
+case "$N" in
+  ''|*[!0-9]*)
+    printf 'error: N must be a positive integer, got: %s\n' "$N" >&2
+    exit 1
+    ;;
+esac
+# Use awk to extract the task block.
+# Match start: a heading line (## or ###) followed by T<N> then space, (, —, or end of line.
+# Match end: the next heading line starting with ## (which includes ###).
+# Exit code: 0 if found, 1 if not found.
+CONTENT="$(awk -v n="$N" '
+  BEGIN { printing=0; found=0 }
+  /^#{2,3}[[:space:]]/ {
+    # Check if this heading is the target task T<n>
+    line=$0
+    pat="^#{2,3}[[:space:]]+T" n "([[:space:](]|[-]|$)"
+    if (match(line, pat)) {
+      printing=1; found=1; print; next
+    }
+    # Any other ## / ### heading stops printing
+    if (printing) { exit }
+  }
+  { if (printing) print }
+  END { if (!found) exit 1 }
+' "$PLAN_FILE")" || {
+  printf 'error: task T%s not found in %s\n' "$N" "$PLAN_FILE" >&2
+  exit 1
+}
+if [ -z "$CONTENT" ]; then
+  printf 'error: task T%s not found in %s\n' "$N" "$PLAN_FILE" >&2
+  exit 1
+fi
+# Get the scratch workspace dir.
+SCRATCH="$(sh "$_SCRIPT_DIR/sdd-workspace")"
+# Derive a readable unique name from the plan file basename and task number.
+PLAN_BASENAME="$(basename "$PLAN_FILE" .md)"
+OUT="$SCRATCH/brief-${PLAN_BASENAME}-T${N}.txt"
+printf '%s\n' "$CONTENT" > "$OUT"
+printf '%s\n' "$OUT"

package/skills/parallel/SKILL.md CHANGED Viewed

@@ -77,6 +77,25 @@ Each subagent still owns its own discipline inside its scope — TDD via `implem
 **The `developer` agent is the default worker.** rsc installs a `developer` subagent pinned to the **balanced** tier (Sonnet by default; the user's onboarding choice in `.rsc/developer.json`, never `light`). For implement-type units, **dispatch to the `developer` agent** (e.g. Claude Code `subagent_type: developer`) — that's the deliberate cost cap the user asked for. Only escalate a genuinely heavy unit (real design / root-cause) to a heavy model, and only when routing is enabled; otherwise the `developer` agent's balanced model is the floor and the ceiling.
+**Model is REQUIRED on every dispatch.** Set each subagent's model explicitly. An omitted model
+silently inherits the session's model — usually the most expensive one — and that is how a fan-out's
+cost quietly explodes. The `developer` agent already pins balanced; if you dispatch a raw subagent,
+name its tier.
+### The unit report contract (statuses & escalation)
+Each unit reports **one of four statuses** — it does not guess or hack around a wall when unsure:
+- `DONE` — done-check green, the unit's frozen interface honored.
+- `DONE_WITH_CONCERNS` — green, but a flagged risk/assumption for the orchestrator to adjudicate.
+- `BLOCKED` — cannot proceed (failing dependency, a contract that doesn't hold, missing access).
+- `NEEDS_CONTEXT` — something it needed (an interface, a constraint) was not in its brief.
+On `BLOCKED`/`NEEDS_CONTEXT`, **never re-dispatch the same brief to the same model unchanged** —
+change something first: add the missing interface/constraint, fix the dependency, or escalate the
+tier for a genuinely hard unit. An identical re-run burns budget without progress. A `BLOCKED` unit
+is a partition signal too: if it blocked on another unit, the two were not independent — re-partition.
 ### Skill resolution feedback
 Every subagent result must report:
@@ -95,6 +114,16 @@ The orchestrator folds these into the final parallel result. If a unit claims it
 Wait for **all** units to report. Collect each one's diff, test output, and any decisions. Do not start merging the fast ones while slow ones are still running — partial merges create the exact shared-state races you partitioned to avoid. Check each result against its brief's done-check *before* it touches the integration branch: a unit that came back not-actually-done gets sent back, not merged hopefully.
+**Per-unit review gate (before merge).** For each non-trivial unit, run a fresh-eyes review of *that
+unit's diff alone* before it joins the integration branch — the same gate `implement` runs per task
+(full brief: `../implement/references/per-task-review.md`). Package the unit's diff as a file
+(`../implement/scripts/review-package <BASE> <HEAD>`), dispatch a fresh reviewer (NOT the unit's own
+implementer) with the diff path + the unit's done-check + its frozen interface, and fold
+`Critical`/`Important` findings back before merging. **Anti-pre-judging:** never tell the reviewer
+what not to flag or pre-rate severity. This is the cheap per-unit pass; the *combined* adversarial
+review over the whole merged diff is still `review`'s job at the end (RECONCILE only proves the seam
+compiles and tests green).
 ### 4. RECONCILE — merge, then prove the seam holds
 This is where parallel work is actually finished. In order:

package/skills/plan/references/plan-template.md CHANGED Viewed

@@ -14,6 +14,20 @@ flows, levels — not framework syntax.
 > Spec: ../specs/<slug>.md · Constitution: ../constitution.md · Status: draft | approved
 > Last updated: YYYY-MM-DD
+## 0. Global Constraints
+(The project-wide rules that EVERY task implicitly includes — copied **verbatim** from the spec
+and constitution, with exact values, not paraphrased. A context-isolated implementer only sees its
+own task, so anything not stated here or in a task's Interfaces block is invisible to it. This is
+also the **reviewer's attention lens**: the per-task reviewer reads these first and treats a
+violation as a finding. Keep it to hard, checkable values — version floors, naming/copy rules,
+authorship, framework canon, security/compliance bars. If a value matters, write the value, not
+"per the spec".)
+- **<Category>:** <exact value> (e.g. "Node ≥ 20", "all amounts in cents (integer)", "no
+  Co-Authored-By footer", "error envelope `{ code, message }` everywhere").
+- …
 ## 1. Context & constraints
 (One or two lines each. Cite, don't re-paste, the spec and constitution.)
@@ -122,3 +136,7 @@ unit / contract / integration / e2e. Tooling is the stack skill's job — name t
   If a section genuinely has no content (e.g. no migration), write "none — <why>", not silence.
 - **Index it.** After writing, add the plan to the root `CLAUDE.md` `## Knowledge map` so the
   harness wiki and later phases can find it.
+- **Global Constraints carry the verbatim values.** §0 is not a summary of §1 — §1 explains *which*
+  constraints shape the design and *why*; §0 lists the exact values a blind implementer and the
+  reviewer must honor. A value that lives only in prose ("use the project's naming rule") is a value
+  the isolated worker can't see. Put it in §0, exact.