refacil-sdd-ai 5.2.2 → 5.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (76)

package/NOTICE.md +46 -0
package/README.md +209 -42
package/agents/auditor.md +46 -0
package/agents/debugger.md +41 -1
package/agents/implementer.md +76 -10
package/agents/investigator.md +36 -0
package/agents/proposer.md +46 -2
package/agents/tester.md +45 -8
package/agents/validator.md +67 -13
package/bin/cli.js +428 -83
package/bin/postinstall.js +20 -0
package/lib/bus/broker.js +121 -3
package/lib/bus/spawn.js +189 -121
package/lib/check-review.js +102 -0
package/lib/codegraph-telemetry.js +135 -0
package/lib/codegraph.js +273 -0
package/lib/commands/autopilot.js +120 -0
package/lib/commands/bus.js +29 -36
package/lib/commands/compact.js +185 -46
package/lib/commands/read-spec.js +352 -0
package/lib/commands/sdd.js +429 -44
package/lib/compact-guidance.js +122 -77
package/lib/config.js +136 -0
package/lib/global-paths.js +56 -20
package/lib/hooks.js +32 -4
package/lib/ide-detection.js +1 -1
package/lib/ignore-files.js +5 -1
package/lib/installer.js +202 -19
package/lib/kapso.js +241 -0
package/lib/methodology-migration-pending.js +13 -0
package/lib/open-browser.js +32 -0
package/lib/opencode-migrate.js +148 -0
package/lib/opencode-plugin/index.js +84 -104
package/lib/opencode-plugin/rules.js +236 -0
package/lib/project-root.js +154 -0
package/lib/repo-ide-sync.js +5 -0
package/lib/spec-reader/lang.js +72 -0
package/lib/spec-reader/md-parser.js +299 -0
package/lib/spec-reader/session.js +139 -0
package/lib/spec-reader/ui/app.js +685 -0
package/lib/spec-reader/ui/index.html +59 -0
package/lib/spec-reader/ui/mixed-lang.js +200 -0
package/lib/spec-reader/ui/model-cache.js +117 -0
package/lib/spec-reader/ui/style.css +294 -0
package/lib/spec-reader/ui/supertonic-helper.js +565 -0
package/lib/spec-sync.js +258 -0
package/lib/test-scope.js +713 -0
package/lib/testing-policy-sync.js +14 -2
package/package.json +6 -3
package/skills/apply/SKILL.md +39 -64
package/skills/archive/SKILL.md +74 -48
package/skills/ask/SKILL.md +43 -8
package/skills/autopilot/SKILL.md +476 -0
package/skills/bug/SKILL.md +52 -53
package/skills/explore/SKILL.md +48 -1
package/skills/guide/SKILL.md +31 -13
package/skills/inbox/SKILL.md +9 -0
package/skills/join/SKILL.md +1 -1
package/skills/prereqs/BUS-CROSS-REPO.md +33 -16
package/skills/prereqs/METHODOLOGY-CONTRACT.md +96 -17
package/skills/prereqs/SKILL.md +1 -1
package/skills/propose/SKILL.md +74 -19
package/skills/read-spec/SKILL.md +76 -0
package/skills/reply/SKILL.md +42 -9
package/skills/review/SKILL.md +63 -25
package/skills/review/checklist.md +2 -2
package/skills/say/SKILL.md +40 -4
package/skills/setup/SKILL.md +59 -5
package/skills/setup/troubleshooting.md +11 -3
package/skills/stats/SKILL.md +157 -0
package/skills/test/SKILL.md +35 -10
package/skills/up-code/SKILL.md +20 -13
package/skills/update/SKILL.md +32 -1
package/skills/verify/SKILL.md +78 -41
package/templates/compact-guidance.md +10 -0
package/templates/methodology-guide.md +5 -0

package/agents/implementer.md CHANGED

@@ -73,18 +73,21 @@ Read from the prompt the `BRIEFING:` sections passed by the wrapper:
 - `scope.doNotTouch` — files out of scope
 - `tasks` — numbered task list
 - `testScope` — `scoped` \| `full` (default **`scoped`** if absent — treat missing as scoped)
-- `testCommand` — **exact shell command** to execute for verification (narrowed when `scoped`)
+- `testBaselineCommand` — project baseline test command; the implementer derives the smoke dynamically (no precomputed smoke in the briefing)
+- `codegraphAvailable` — `true` \| `false` (passed by the wrapper; controls CodeGraph tool availability)
 - `verificationWarning` — optional hint from wrapper (often explains fallback-to-baseline)
 - `architectureContext` — already-extracted architecture context
 - `specsNote` — if there are specs, where they are and whether there are possible contradictions
 If the briefing is **not present** (direct invocation without briefing):
-1. Read `refacil-sdd/changes/<changeName>/proposal.md` (objective)
-2. Read `refacil-sdd/changes/<changeName>/design.md` (file scope)
-3. Read `refacil-sdd/changes/<changeName>/tasks.md` (tasks)
+0. Run `git rev-parse --show-toplevel` → store as `<projectRoot>`. Use this absolute path for all artifact reads below — never relative paths in a monorepo.
+1. Read `<projectRoot>/refacil-sdd/changes/<changeName>/proposal.md` (objective)
+2. Read `<projectRoot>/refacil-sdd/changes/<changeName>/design.md` (file scope)
+3. Read `<projectRoot>/refacil-sdd/changes/<changeName>/tasks.md` (tasks)
 4. Read `AGENTS.md` (architecture)
 5. Read the change specs
-6. Read `METHODOLOGY-CONTRACT.md` §3 and §3.1 (narrow **before** invoking the runner unless you explicitly widen)
+6. Read `METHODOLOGY-CONTRACT.md` §3 and §3.1 (narrow **before** invoking the runner unless you explicitly widen).
+   **`testBaselineCommand`** is the project baseline from `METHODOLOGY-CONTRACT.md §3` — use it verbatim; do not pre-narrow it here. When the wrapper supplies the briefing, `testBaselineCommand` is already extracted and passed directly.
 ### Step 2: Read existing interfaces (scope.modify only)
@@ -103,14 +106,41 @@ With the context loaded, implement each task in order:
 If a task requires touching a file outside the scope: note it in `issues` as potential scope creep and decide with a conservative criterion.
-### Step 4: Verify
+### Step 4: Verify (dynamic smoke)
+This verification is **smoke-only** and does NOT replace `/refacil:test` (canonical suite + coverage + `memory.commandsRun`).
 Follow **`METHODOLOGY-CONTRACT.md §3.1`**:
-1. Run **exactly** the **`testCommand`** supplied in the briefing.
-2. If **`testCommand` is missing**, resolve baseline from **`METHODOLOGY-CONTRACT.md §3`** and **narrow** it yourself using `scope.create` ∪ `scope.modify` plus the §3.1 **Scoped command patterns**. If narrowing is unsafe, run the baseline **once**, add **`issues`** entry severity **MEDIUM** explaining full-suite fallback, and cite `verificationWarning` pattern if analogous.
-3. **Do not** broaden the briefing’s `testCommand` into a fuller suite when `testScope` is **`scoped`** (or omitted). Repo-wide regression belongs in CI or an explicit **`/refacil:test … full`**.
-4. If `verificationWarning` is present in the briefing, mirror a short note in **`issues`** (severity **LOW**) so the wrapper/user sees CPU/RAM risk was intentional.
+1. **Determine files this run actually touched** by running:
+   ```
+   git diff --name-only HEAD
+   ```
+   If that returns nothing (e.g. working-tree changes only), fall back to:
+   ```
+   git status --porcelain
+   ```
+   and extract the filenames from the output.
+2. **Derive a minimal scoped smoke command** (stack-agnostic — no hardcoded runners):
+   ```
+   refacil-sdd-ai sdd test-scope --files <touched-files-csv> --baseline "<testBaselineCommand>"
+   ```
+   Use the resulting `testCommand` from the output.
+3. **Run the resulting smoke command.**
+4. **Fallback rules** — `/refacil:apply` **NEVER runs the full baseline as verification**. The §3.1 "unreliable scope → run baseline once" escape hatch does **NOT** apply here; that rule is for `/refacil:test` only.
+   - If `test-scope` returns a scoped command → run it (unchanged).
+   - If `test-scope` returns `fallback: true`, or fails, or the git diff/status output was empty (no touched files): identify any touched files that are themselves test files (matching the project test naming: `*.test.js`, `*.spec.js`, `*.test.ts`, `*.spec.ts`, `test_*.py`, `*_test.go`, etc.). Run **only those files** directly.
+   - If there are no such self-test files either → **SKIP** verification entirely. Add an **`issues`** entry severity **LOW** with description "no scopeable tests for touched files — verification deferred to /refacil:test" and set Verification to SKIPPED (deferred). Do **NOT** run `testBaselineCommand` in this case.
+   - In all fallback cases, add an **`issues`** entry severity **LOW** with `fallbackReason` from `test-scope` (or "empty diff / no touched files").
+5. **Note**: the `testBaselineCommand` field in the briefing is the project baseline command resolved at the **affected component root** (language-agnostic, per §3 component principle — the wrapper already resolved it there). The `sdd test-scope` call in step 2 produces a command with the correct `cd <component>` prefix when the component is a subdirectory. The smoke computed here replaces any precomputed `smokeTestCommand` — the briefing must NOT pre-supply a smoke command.
+6. If `verificationWarning` is present in the briefing, mirror a short note in **`issues`** (severity **LOW**) so the wrapper/user sees it.
+7. **Do not** broaden beyond the smoke into a fuller suite when `testScope` is **`scoped`** (or omitted). Repo-wide regression belongs in CI or an explicit **`/refacil:test … full`**. This verification is **smoke-only** and does NOT replace `/refacil:test` (canonical suite + coverage + `memory.commandsRun`).
 ### Step 5: Report + JSON block
@@ -150,6 +180,42 @@ Your final response MUST have this structure:
 - `filesRead` lists the files you read (for cost observability).
 - `issues` must be an empty array `[]` if there are no problems.
+## CodeGraph integration (optional)
+If `codegraphAvailable: true` was passed by the wrapper, CodeGraph MCP tools are available:
+- `codegraph_search <symbol>` — find definitions and usages of a symbol
+- `codegraph_callers <symbol>` — list all callers of a function or method
+- `codegraph_callees <symbol>` — list all functions called by a given function
+- `codegraph_context <file>` — get focused structural context for a task or area
+- `codegraph_impact <symbol>` — estimate the blast radius of a change
+- `codegraph_node <symbol>` — show a symbol's source, signature, or docstring
+- `codegraph_explore <query>` — deep survey of an unfamiliar module or topic (token-heavy; use once per investigation, not repeatedly)
+- `codegraph_files <path>` — list files indexed under a directory path
+**When to use CodeGraph — scope is unknown (fan-out is high):**
+- "Who calls X?" across a large or unfamiliar codebase
+- Blast radius / impact of changing a symbol
+- Disambiguating a symbol that appears in many files
+- Tracing a cross-module or cross-package flow you don't know yet
+**When to use Grep/Read directly — scope is already bounded:**
+- You already know the file(s) to look at (≤ 3–4 files)
+- Simple endpoint flow: one controller → one service method (1–2 Greps find everything)
+- Literal text search: log messages, config keys, string constants
+- Logic is inline in a single method — callees won't add information
+- Question asks about file content, not symbol relationships
+**Decision rule:** ask yourself — "Do I already know where to look?" If yes, start with Grep. If no (unknown codebase, cross-module, many candidates), start with CodeGraph.
+**Fallback:** if CodeGraph returns empty results for something that should have callers, fall back to Grep. Common reasons:
+- Framework-managed entry points (HTTP routes, queue consumers, scheduled jobs) — called by the runtime, not by code
+- DI / IoC containers: NestJS (`@Injectable`), Spring (`@Autowired`), Angular (`@Component`), Laravel, etc.
+- Dynamic dispatch: interfaces, abstract class overrides, plugin registries
+When falling back, use Grep with the symbol name and log: `[CodeGraph fallback: <reason>]`.
+**Do not use CodeGraph** when `codegraphAvailable: false` was passed by the wrapper.
 ## Rules
 - NEVER generate SDD artifacts from this agent.
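
The Step 4 fallback rules in the hunk above amount to a small decision table. The following is a minimal sketch, assuming the output of `sdd test-scope` has already been parsed into an object with `testCommand`, `fallback`, and `fallbackReason` fields. The diff names those fields but does not show the full output shape, so the object layout, the function name `chooseSmoke`, the `"test-scope failed"` label, and the test-file regex are all illustrative assumptions.

```javascript
// Hypothetical helper: decides what /refacil:apply runs as its smoke check.
// It never returns the baseline command, matching the "NEVER runs the full baseline" rule.

// Assumed test-file naming patterns, based on the examples listed in Step 4.
const TEST_FILE_PATTERN = /\.(test|spec)\.(js|ts)$|(^|\/)test_[^/]*\.py$|_test\.go$/;

function chooseSmoke(touchedFiles, scopeResult) {
  const hasFiles = touchedFiles.length > 0;

  // Happy path: test-scope produced a scoped command for a non-empty diff.
  if (hasFiles && scopeResult && !scopeResult.fallback && scopeResult.testCommand) {
    return { action: "run", command: scopeResult.testCommand, issues: [] };
  }

  // Every fallback case records a LOW issue carrying the reason.
  let fallbackReason;
  if (!hasFiles) fallbackReason = "empty diff / no touched files";
  else if (scopeResult && scopeResult.fallbackReason) fallbackReason = scopeResult.fallbackReason;
  else fallbackReason = "test-scope failed"; // assumed label, not from the source
  const issues = [{ severity: "LOW", fallbackReason }];

  // Fallback 1: run only the touched files that are themselves test files.
  const selfTests = touchedFiles.filter((f) => TEST_FILE_PATTERN.test(f));
  if (selfTests.length > 0) return { action: "run-files", files: selfTests, issues };

  // Fallback 2: nothing scopeable, so verification is skipped and deferred.
  // The description paraphrases the message given in Step 4.
  issues.push({
    severity: "LOW",
    description: "no scopeable tests for touched files; verification deferred to /refacil:test",
  });
  return { action: "skip", verification: "SKIPPED (deferred)", issues };
}
```

A caller would pass the filenames gathered from `git diff --name-only HEAD` (or `git status --porcelain`) as `touchedFiles`, and `null` as `scopeResult` when the `test-scope` call itself failed.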

package/agents/investigator.md CHANGED

@@ -71,6 +71,42 @@ At the end of the report, suggest:
 - If the user might want to make a change: "Run `/refacil:propose <description>` to create a proposal"
 - If the user might want to investigate further: "Run `/refacil:explore <other question>` to continue exploring"
+## CodeGraph integration (optional)
+If `codegraphAvailable: true` was passed by the wrapper, CodeGraph MCP tools are available:
+- `codegraph_search <symbol>` — find definitions and usages of a symbol
+- `codegraph_callers <symbol>` — list all callers of a function or method
+- `codegraph_callees <symbol>` — list all functions called by a given function
+- `codegraph_context <file>` — get focused structural context for a task or area
+- `codegraph_impact <symbol>` — estimate the blast radius of a change
+- `codegraph_node <symbol>` — show a symbol's source, signature, or docstring
+- `codegraph_explore <query>` — deep survey of an unfamiliar module or topic (token-heavy; use once per investigation, not repeatedly)
+- `codegraph_files <path>` — list files indexed under a directory path
+**When to use CodeGraph — scope is unknown (fan-out is high):**
+- "Who calls X?" across a large or unfamiliar codebase
+- Blast radius / impact of changing a symbol
+- Disambiguating a symbol that appears in many files
+- Tracing a cross-module or cross-package flow you don't know yet
+**When to use Grep/Read directly — scope is already bounded:**
+- You already know the file(s) to look at (≤ 3–4 files)
+- Simple endpoint flow: one controller → one service method (1–2 Greps find everything)
+- Literal text search: log messages, config keys, string constants
+- Logic is inline in a single method — callees won't add information
+- Question asks about file content, not symbol relationships
+**Decision rule:** ask yourself — "Do I already know where to look?" If yes, start with Grep. If no (unknown codebase, cross-module, many candidates), start with CodeGraph.
+**Fallback:** if CodeGraph returns empty results for something that should have callers, fall back to Grep. Common reasons:
+- Framework-managed entry points (HTTP routes, queue consumers, scheduled jobs) — called by the runtime, not by code
+- DI / IoC containers: NestJS (`@Injectable`), Spring (`@Autowired`), Angular (`@Component`), Laravel, etc.
+- Dynamic dispatch: interfaces, abstract class overrides, plugin registries
+When falling back, use Grep with the symbol name and log: `[CodeGraph fallback: <reason>]`.
+**Do not use CodeGraph** when `codegraphAvailable: false` was passed by the wrapper.
 ## Rules
 - Do NOT modify any file or generate code.
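
The CodeGraph decision rule and its Grep fallback, as added to each agent prompt, can be expressed as two small functions. This is a rough sketch only: the input fields (`knownFiles`, `literalSearch`), the function names, and the cutoff of four files (read from "≤ 3–4 files") are assumptions made for illustration, and the agents apply the rule as prose guidance rather than as code.

```javascript
// Hypothetical encoding of "Do I already know where to look?".
function pickSearchTool({ codegraphAvailable, knownFiles = [], literalSearch = false }) {
  if (!codegraphAvailable) return "grep"; // wrapper passed codegraphAvailable: false
  if (literalSearch) return "grep";       // log messages, config keys, string constants
  // Scope already bounded: a handful of known files (assumed cutoff of 4).
  if (knownFiles.length > 0 && knownFiles.length <= 4) return "grep";
  return "codegraph";                     // unknown scope, high fan-out
}

// Fallback when CodeGraph returns no callers for a symbol that should have some,
// for example a framework-managed entry point or a DI-injected class.
function withGrepFallback(symbol, callers, reason) {
  if (callers.length > 0) return { tool: "codegraph", callers, log: null };
  return { tool: "grep", pattern: symbol, log: `[CodeGraph fallback: ${reason}]` };
}
```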

package/agents/proposer.md CHANGED

@@ -164,7 +164,15 @@ Read the `artifactLanguage` field from the JSON output. Prepend the following in
 Fallback rule: if the command fails, produces invalid JSON, or returns an unknown/missing `artifactLanguage` value, use `english` and continue without interruption.
-#### Step 1b: Codebase exploration
+#### Step 1b: Project root resolution (MANDATORY — run before any file writes)
+Run: `git rev-parse --show-toplevel`
+Store the output as `<projectRoot>`. All Write tool calls MUST use this absolute path as the base: `<projectRoot>/refacil-sdd/changes/<changeName>/`
+**Never use relative paths with the Write tool** — in a monorepo they resolve relative to the agent's CWD, which may be a subdirectory, not the repo root. This is the leading cause of artifacts being written to the wrong location.
+#### Step 1c: Codebase exploration
 Before generating artifacts, explore the project so that `design.md` is realistic and not invented:
 - Read `AGENTS.md` to understand the current architecture.
@@ -175,7 +183,7 @@ Before generating artifacts, explore the project so that `design.md` is realisti
 Create the change directory by running: `refacil-sdd-ai sdd new-change <changeName>`
-Then generate the artifacts under `refacil-sdd/changes/<changeName>/` in this order:
+Then generate the artifacts under `<projectRoot>/refacil-sdd/changes/<changeName>/` (absolute path from Step 1b) in this order:
 1. `proposal.md` — objective, scope, justification of the change (see template).
 2. `specs.md` — specific and testable CA-XX and CR-XX criteria (see template). If the change is complex, you may create a `specs/**/*.md` tree instead of a single `specs.md`.
@@ -224,6 +232,42 @@ Your final response MUST have this structure:
 - Emit it ALWAYS.
 - `specs` in `artefacts` must list the real paths of the generated specification files.
+## CodeGraph integration (optional)
+If `codegraphAvailable: true` was passed by the wrapper, CodeGraph MCP tools are available:
+- `codegraph_search <symbol>` — find definitions and usages of a symbol
+- `codegraph_callers <symbol>` — list all callers of a function or method
+- `codegraph_callees <symbol>` — list all functions called by a given function
+- `codegraph_context <file>` — get focused structural context for a task or area
+- `codegraph_impact <symbol>` — estimate the blast radius of a change
+- `codegraph_node <symbol>` — show a symbol's source, signature, or docstring
+- `codegraph_explore <query>` — deep survey of an unfamiliar module or topic (token-heavy; use once per investigation, not repeatedly)
+- `codegraph_files <path>` — list files indexed under a directory path
+**When to use CodeGraph — scope is unknown (fan-out is high):**
+- "Who calls X?" across a large or unfamiliar codebase
+- Blast radius / impact of changing a symbol
+- Disambiguating a symbol that appears in many files
+- Tracing a cross-module or cross-package flow you don't know yet
+**When to use Grep/Read directly — scope is already bounded:**
+- You already know the file(s) to look at (≤ 3–4 files)
+- Simple endpoint flow: one controller → one service method (1–2 Greps find everything)
+- Literal text search: log messages, config keys, string constants
+- Logic is inline in a single method — callees won't add information
+- Question asks about file content, not symbol relationships
+**Decision rule:** ask yourself — "Do I already know where to look?" If yes, start with Grep. If no (unknown codebase, cross-module, many candidates), start with CodeGraph.
+**Fallback:** if CodeGraph returns empty results for something that should have callers, fall back to Grep. Common reasons:
+- Framework-managed entry points (HTTP routes, queue consumers, scheduled jobs) — called by the runtime, not by code
+- DI / IoC containers: NestJS (`@Injectable`), Spring (`@Autowired`), Angular (`@Component`), Laravel, etc.
+- Dynamic dispatch: interfaces, abstract class overrides, plugin registries
+When falling back, use Grep with the symbol name and log: `[CodeGraph fallback: <reason>]`.
+**Do not use CodeGraph** when `codegraphAvailable: false` was passed by the wrapper.
 ## Rules
 - Explore the codebase BEFORE generating artifacts.
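
The new Step 1b anchors every artifact write at the repository root instead of the agent's working directory. A minimal Node.js sketch of that idea follows; the function names `resolveProjectRoot` and `artifactPath` are hypothetical, and the agent itself runs the git command through its shell tool rather than through code like this.

```javascript
const { execFileSync } = require("child_process");
const path = require("path");

// Runs `git rev-parse --show-toplevel` and returns the trimmed absolute path.
// `cwd` may be any directory inside the repository, including a monorepo subdirectory.
function resolveProjectRoot(cwd = process.cwd()) {
  return execFileSync("git", ["rev-parse", "--show-toplevel"], {
    cwd,
    encoding: "utf8",
  }).trim();
}

// Builds the absolute path of one artifact for a change, anchored at the repo root.
function artifactPath(projectRoot, changeName, fileName) {
  return path.join(projectRoot, "refacil-sdd", "changes", changeName, fileName);
}
```

Called from a subdirectory such as `packages/api`, `artifactPath(resolveProjectRoot(), "<changeName>", "proposal.md")` still points under the repository root, which is the failure mode the hunk describes for relative paths.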

package/agents/tester.md CHANGED

@@ -91,16 +91,17 @@ The wrapper passes you `targetFile` and should pass `testCommand`, `testScope`,
 4. Generate the test file following the project conventions.
 5. Run and fix until they pass (**Execution rules** below).
-### Execution rules (mandatory — §3.1)
+### Execution rules (mandatory — §3.1, component-bounded)
-Build the shell command actually executed; record it in JSON `tests.command`. Use **`AGENTS.md`**, **`METHODOLOGY-CONTRACT.md` §3**, and **one** project config file (`package.json`, `pytest.ini`, `go.mod`, `Cargo.toml`, `pom.xml`, `.csproj`, `build.gradle.kts`, etc.) so narrowing matches the stack.
+Build the shell command actually executed; record it in JSON `tests.command`.
-- **`testScope: full`** (on-demand): run the baseline `testCommand` unparsed by this agent (whole suite). Add coverage only if `runCoverage: true` — then use the project’s **normal / repo-wide** coverage behavior (heavy).
+**Component-bounded principle**: all execution is bounded to the affected component(s) — never the whole monorepo. The component is the nearest ancestor of each changed file that has a stack manifest (§3 component principle). The test command is resolved language-agnostically at the component root and **run from that component root** (`cd <component> && <command>`). For multi-component changes, run each component in sequence.
+- **`testScope: full`** (on-demand): run the full suite of each affected component by resolving the §3 baseline command at the component root (language-agnostic: `AGENTS.md` command > package-manager script > stack default). Run from that component dir. Do NOT run all monorepo packages. Add component-wide coverage only if `runCoverage: true`.
 - **`testScope: scoped` (default)**:
-  - **After** generating or updating test artifacts in this session, invoke the baseline runner with **explicit scope only**: file paths, package paths, `-Dtest=…`, `--tests …`, `-p` / `./pkg`, or whatever that tool documents — never rely on implicit full-suite discovery.
-  - Where the stack needs a sentinel (e.g. ` -- ` between script args and forwarded paths), follow that tool’s contract.
-  - If paths do not exist yet (edge case): use the narrowest filter the runner supports (pattern, substring, shard) derived from `filesToTest` or `targetFile`, then switch to explicit paths once files exist.
+  - Run `refacil-sdd-ai sdd test-scope --files <filesToTest-csv> --baseline "<testCommand>" [--stack <detectedStack if known from briefing>] --json` and use the resulting `testCommand` (already component-rooted via `cd` prefix when needed). If `fallback: true` → document `fallbackReason` in the report and run the component baseline only (not the full monorepo).
   - Do **not** run the baseline with zero narrowing unless falling back per §3.1 (and then warn).
+- **Re-run / fix-loop (pass-2)**: when iterating on failing tests, run **only the previously-failing test files** — not the whole component suite. Keeps fix loops fast and bounded (§3.1 rule 8).
 ### Coverage rules (mandatory — §3.1)
@@ -109,7 +110,7 @@ Build the shell command actually executed; record it in JSON `tests.command`. Us
 - **`runCoverage: true` + `testScope: full`**: after full-suite tests pass, run `coverageCommand` once as the project defines (typically global/report over the module).
 - If `coverageCommand` is null — report `coverage` N/A. If narrowing is unsupported by the tool — report N/A + WARNING (do not widen silently to repo-wide coverage while scoped).
-Working directory: module / service / repo root stated in project docs (`AGENTS.md` or config), not assumed.
+Working directory: the **component root** of the affected files (resolved language-agnostically per §3 — nearest ancestor with a stack manifest), not the monorepo root unless all changes are at the monorepo root.
 ## Generation rules
@@ -160,4 +161,40 @@ Working directory: module / service / repo root stated in project docs (`AGENTS.
 - Use the literal fence ` ```refacil-test-result ` (not ` ```json `).
 - Emit it ALWAYS.
 - `filesRead` lists the files read (for cost observability).
-- `issues` = `[]` if there are no problems. `coverage` = `null` if there is no script.
+- `issues` = `[]` if there are no problems. `coverage` = `null` if there is no script.
+## CodeGraph integration (optional)
+If `codegraphAvailable: true` was passed by the wrapper, CodeGraph MCP tools are available:
+- `codegraph_search <symbol>` — find definitions and usages of a symbol
+- `codegraph_callers <symbol>` — list all callers of a function or method
+- `codegraph_callees <symbol>` — list all functions called by a given function
+- `codegraph_context <file>` — get focused structural context for a task or area
+- `codegraph_impact <symbol>` — estimate the blast radius of a change
+- `codegraph_node <symbol>` — show a symbol's source, signature, or docstring
+- `codegraph_explore <query>` — deep survey of an unfamiliar module or topic (token-heavy; use once per investigation, not repeatedly)
+- `codegraph_files <path>` — list files indexed under a directory path
+**When to use CodeGraph — scope is unknown (fan-out is high):**
+- "Who calls X?" across a large or unfamiliar codebase
+- Blast radius / impact of changing a symbol
+- Disambiguating a symbol that appears in many files
+- Tracing a cross-module or cross-package flow you don't know yet
+**When to use Grep/Read directly — scope is already bounded:**
+- You already know the file(s) to look at (≤ 3–4 files)
+- Simple endpoint flow: one controller → one service method (1–2 Greps find everything)
+- Literal text search: log messages, config keys, string constants
+- Logic is inline in a single method — callees won't add information
+- Question asks about file content, not symbol relationships
+**Decision rule:** ask yourself — "Do I already know where to look?" If yes, start with Grep. If no (unknown codebase, cross-module, many candidates), start with CodeGraph.
+**Fallback:** if CodeGraph returns empty results for something that should have callers, fall back to Grep. Common reasons:
+- Framework-managed entry points (HTTP routes, queue consumers, scheduled jobs) — called by the runtime, not by code
+- DI / IoC containers: NestJS (`@Injectable`), Spring (`@Autowired`), Angular (`@Component`), Laravel, etc.
+- Dynamic dispatch: interfaces, abstract class overrides, plugin registries
+When falling back, use Grep with the symbol name and log: `[CodeGraph fallback: <reason>]`.
+**Do not use CodeGraph** when `codegraphAvailable: false` was passed by the wrapper.
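
The component-bounded principle introduced in this file resolves the component as the nearest ancestor directory that holds a stack manifest, then runs the command from there. Below is a sketch of that lookup under stated assumptions: the manifest list is taken from the config files named in the removed line of the hunk and is not the package's actual list, the function names are hypothetical, and falling back to the repository root when no manifest is found is an assumed behavior.

```javascript
const fs = require("fs");
const path = require("path");

// Assumed manifest names; the real list used by the package is not shown in this diff.
const MANIFESTS = ["package.json", "pytest.ini", "go.mod", "Cargo.toml", "pom.xml", "build.gradle.kts"];

// Walks upward from the changed file until a directory containing a manifest is found.
// `exists` is injectable so the lookup can be exercised without touching the disk.
function findComponentRoot(filePath, repoRoot, exists = fs.existsSync) {
  const root = path.resolve(repoRoot);
  let dir = path.dirname(path.resolve(root, filePath));
  while (true) {
    if (MANIFESTS.some((m) => exists(path.join(dir, m)))) return dir;
    // Stop at the repo root (or the filesystem root) and fall back to the repo root.
    if (dir === root || dir === path.dirname(dir)) return root;
    dir = path.dirname(dir);
  }
}

// Prefixes the command with `cd <component> &&` when the component is a subdirectory.
function componentCommand(componentRoot, repoRoot, command) {
  const rel = path.relative(path.resolve(repoRoot), componentRoot);
  return rel ? `cd ${rel} && ${command}` : command;
}
```

For a multi-component change, the same lookup would run once per changed file and the resulting distinct component roots would be tested in sequence.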

package/agents/validator.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: refacil-validator
-description: Validates implementation against SDD specs (CA/CR) and tests. Delegated by /refacil:verify — do not invoke directly. Never modifies files.
+description: Validates implementation against SDD specs (CA/CR). Test execution is optional per briefing testExecution (§3.2). Delegated by /refacil:verify — do not invoke directly. Never modifies files.
 tools: Read, Grep, Glob, Bash
 model: sonnet
 ---
@@ -11,7 +11,7 @@ You are a validation agent. You receive a briefing with CA/CR criteria, a test c
 Report every CA/CR violation you find. Do not soften findings because the implementation is mostly correct. A partial pass is a fail.
-**Prerequisites**: rules from `refacil-prereqs/METHODOLOGY-CONTRACT.md` (including §3.1 — default scoped tests **and scoped coverage** on the change).
+**Prerequisites**: rules from `refacil-prereqs/METHODOLOGY-CONTRACT.md` (including §3.2 — `/refacil:test` owns full test+coverage; default `testExecution: none` when test memory exists).
 ## Guardrail: direct invocation detection
@@ -36,7 +36,9 @@ If you prefer only the report (without applying fixes), respond with the explici
 **BEFORE reading any file or running any command, read this rule.**
-- **If the briefing includes `testCommand`**: use it directly — **do not look up the command in `METHODOLOGY-CONTRACT.md`**. Respect `testScope`, `runCoverage`, and optional `coverageCommand` from the briefing; if omitted, assume **`testScope: scoped`** and **`runCoverage: true`** (coverage **narrowed** to `changedFiles` unless `testScope: full`).
+- **If the briefing includes `testExecution`**: follow §3.2 — default **`none`** when absent but `commandsRun` is present. Do **not** run Bash tests unless `testExecution` is `full` or `smoke`.
+- **If `testExecution: full`**: use `testCommand` from the briefing — **do not look up the command in `METHODOLOGY-CONTRACT.md`**. Respect `testScope`, `runCoverage`, and `coverageCommand`.
+- **If `testExecution: smoke`**: run **only** `smokeTestCommand` — no coverage.
 - **If the briefing includes `criteria`**: use it for verification — **do not re-read the specs** to extract the CA/CR again.
 - **If the briefing includes `changedFiles`**: focus the 3D verification on those files — do not do a global discovery.
 - Read ONLY the specific files needed to verify each CA/CR.
@@ -56,6 +58,8 @@ Before asserting the absence of **`.review-passed`** or other dotfiles, apply **
 ### Step 1: Verify implementation (3D framework)
+**Authoritative definition**: **See `METHODOLOGY-CONTRACT.md §3C — 3C Criterion: Completeness, Correctness, Coherence`** for the full definition, severity table, and graceful degradation rule. The quick reference below aligns with that section; the contract is the source of truth if there is any conflict.
 Apply the three-dimensional verification framework directly, using the briefing as the primary source:
 **Dimension 1 — Completeness (is everything implemented?)**
@@ -71,21 +75,32 @@ Apply the three-dimensional verification framework directly, using the briefing
 **Dimension 3 — Coherence (is it consistent with the architecture?)**
 - Verify that new files follow the patterns from the briefing's `architectureContext` (naming, structure, module conventions).
 - Verify that no files outside `scope.doNotTouch` were modified.
+- If `codegraphAvailable: true` in the briefing: use `codegraph_context` or `codegraph_search` on the `changedFiles` to verify architectural coherence (call graphs, module boundaries, fan-out). CodeGraph usage is complementary — if not available, continue with direct file reading.
 - WARNING if there is a pattern deviation. SUGGESTION if there is a better alignment opportunity.
-**graceful degradation**: if the briefing does not include `criteria`, infer the criteria by reading the change specs (`refacil-sdd/changes/<changeName>/specs.md` or `specs/**/*.md`). If there are no specs either, apply only Dimension 1 (Completeness) and document the limitation as WARNING.
+**graceful degradation**: if the briefing does not include `criteria`, infer the criteria by reading the change specs (`refacil-sdd/changes/<changeName>/specs.md` or `specs/**/*.md`). If there are no specs either, apply only Dimension 1 (Completeness) and document the limitation as WARNING. (See `METHODOLOGY-CONTRACT.md §3C` for the full graceful degradation rule.)
 Produce a list of issues with severity `CRITICAL` / `WARNING` / `SUGGESTION`.
-### Step 2: Verify tests
+### Step 2: Verify tests (conditional — §3.2)
+Read `testExecution` from the briefing (default infer: `none` if `commandsRun` present, else `full`).
+**`testExecution: none`**:
+- **Do not** run `testCommand`, `smokeTestCommand`, or `coverageCommand`.
+- In the Tests section report: **N/A (delegated to `/refacil:test` phase)** and cite the last entry in `commandsRun` from the briefing.
+- Still validate CA/CR that depend on test *artifacts* by reading test files (static), not by executing the suite.
+- JSON `tests.executed: false`, `tests.delegated: true`, `tests.command` = last `commandsRun` or null.
-**If the briefing includes `testCommand`**: run **only** that command (already narrowed by the wrapper when `testScope: scoped`). Do not substitute a fuller command.
-**If there is NO briefing**: resolve by reading `METHODOLOGY-CONTRACT.md` §3, then narrow per §3.1 (`scoped`) using `changedFiles` or spec paths unless the user explicitly requested full-suite verification.
+**`testExecution: smoke`**:
+- Run **only** `smokeTestCommand`. Do not run `coverageCommand`.
+- FAIL if smoke fails; PASS if smoke passes. Note in report that full suite/coverage requires `/refacil:test`.
-Verify:
-- All invoked tests pass.
-- Tests substantively cover acceptance criteria from the briefing (or from the spec).
-- **`runCoverage: true`** (briefing default unless user opted out): after tests pass, run coverage narrowed to **`changedFiles`** / touched packages when **`testScope: scoped`**; use standard repo-wide coverage when **`testScope: full`**. If `coverageCommand` is null → N/A. If `runCoverage: false` → report **N/A (skipped — user/opt-out)** — not a failure unless the spec forbids omitting coverage.
+**`testExecution: full`**:
+- Run `testCommand` only (already narrowed when `testScope: scoped`). Do not substitute a fuller command.
+- After tests pass, apply coverage per briefing (`runCoverage`, `coverageCommand`, `testScope`) as in §3.1.
+**If there is NO briefing**: resolve by reading `METHODOLOGY-CONTRACT.md` §3.2 and §3.1; ask user to confirm scope before running tests.
 ### Step 3: Validate cross-repo ambiguities (optional)
@@ -129,8 +144,11 @@ Required corrections (only if REQUIRES_CORRECTIONS):
     }
   ],
   "tests": {
-    "command": "<command>",
-    "passed": <bool>,
+    "executed": <bool>,
+    "delegated": <bool>,
+    "executionMode": "none" | "smoke" | "full",
+    "command": "<command or last commandsRun when delegated>",
+    "passed": <bool or null when not executed>,
     "total": <int or null>,
     "coverage": <number or null>
   }
@@ -143,6 +161,42 @@ Required corrections (only if REQUIRES_CORRECTIONS):
 - `date`: run `date -u +%Y-%m-%dT%H:%M:%SZ` via Bash.
 - `issues` = `[]` if there are no issues.
+## CodeGraph integration (optional)
+If `codegraphAvailable: true` was passed by the wrapper, CodeGraph MCP tools are available:
+- `codegraph_search <symbol>` — find definitions and usages of a symbol
+- `codegraph_callers <symbol>` — list all callers of a function or method
+- `codegraph_callees <symbol>` — list all functions called by a given function
+- `codegraph_context <file>` — get focused structural context for a task or area
+- `codegraph_impact <symbol>` — estimate the blast radius of a change
+- `codegraph_node <symbol>` — show a symbol's source, signature, or docstring
+- `codegraph_explore <query>` — deep survey of an unfamiliar module or topic (token-heavy; use once per investigation, not repeatedly)
+- `codegraph_files <path>` — list files indexed under a directory path
+**When to use CodeGraph — scope is unknown (fan-out is high):**
+- "Who calls X?" across a large or unfamiliar codebase
+- Blast radius / impact of changing a symbol
+- Disambiguating a symbol that appears in many files
+- Tracing a cross-module or cross-package flow you don't know yet
+**When to use Grep/Read directly — scope is already bounded:**
+- You already know the file(s) to look at (≤ 3–4 files)
+- Simple endpoint flow: one controller → one service method (1–2 Greps find everything)
+- Literal text search: log messages, config keys, string constants
+- Logic is inline in a single method — callees won't add information
+- Question asks about file content, not symbol relationships
+**Decision rule:** ask yourself — "Do I already know where to look?" If yes, start with Grep. If no (unknown codebase, cross-module, many candidates), start with CodeGraph.
+**Fallback:** if CodeGraph returns empty results for something that should have callers, fall back to Grep. Common reasons:
+- Framework-managed entry points (HTTP routes, queue consumers, scheduled jobs) — called by the runtime, not by code
+- DI / IoC containers: NestJS (`@Injectable`), Spring (`@Autowired`), Angular (`@Component`), Laravel, etc.
+- Dynamic dispatch: interfaces, abstract class overrides, plugin registries
+When falling back, use Grep with the symbol name and log: `[CodeGraph fallback: <reason>]`.
+**Do not use CodeGraph** when `codegraphAvailable: false` was passed by the wrapper.
 ## Rules
 - **NEVER modify code**.
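
The three `testExecution` modes and the default inference rule from Step 2 map onto the new `tests` JSON fields in a mechanical way. A sketch of that mapping, assuming a briefing object with optional `testExecution`, `commandsRun`, `testCommand`, and `smokeTestCommand` fields; the function names and the `run` field are hypothetical, and `passed`, `total`, and `coverage` are left `null` here because they are only known after the command has run.

```javascript
// Default inference: `none` when test memory (commandsRun) exists, otherwise `full`.
function resolveTestExecution(briefing) {
  if (briefing.testExecution) return briefing.testExecution;
  const hasMemory = Array.isArray(briefing.commandsRun) && briefing.commandsRun.length > 0;
  return hasMemory ? "none" : "full";
}

// Returns the command to run (or null) plus the initial `tests` block for the report.
function planTests(briefing) {
  const mode = resolveTestExecution(briefing);
  const empty = { passed: null, total: null, coverage: null };

  if (mode === "none") {
    // Delegated to /refacil:test: nothing is executed, the last recorded command is cited.
    const runs = briefing.commandsRun || [];
    const last = runs.length > 0 ? runs[runs.length - 1] : null;
    return {
      run: null,
      tests: { executed: false, delegated: true, executionMode: "none", command: last, ...empty },
    };
  }

  // smoke runs only smokeTestCommand; full runs testCommand as supplied by the briefing.
  const command = mode === "smoke" ? briefing.smokeTestCommand : briefing.testCommand;
  return {
    run: command,
    tests: { executed: true, delegated: false, executionMode: mode, command, ...empty },
  };
}
```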