refacil-sdd-ai 5.2.3 → 5.3.1

Files changed (76)
  1. package/NOTICE.md +46 -0
  2. package/README.md +210 -42
  3. package/agents/auditor.md +46 -0
  4. package/agents/debugger.md +41 -1
  5. package/agents/implementer.md +76 -10
  6. package/agents/investigator.md +36 -0
  7. package/agents/proposer.md +56 -2
  8. package/agents/tester.md +45 -8
  9. package/agents/validator.md +67 -13
  10. package/bin/cli.js +396 -84
  11. package/lib/bus/broker.js +121 -3
  12. package/lib/bus/spawn.js +189 -121
  13. package/lib/check-review.js +102 -0
  14. package/lib/codegraph-telemetry.js +135 -0
  15. package/lib/codegraph.js +273 -0
  16. package/lib/commands/autopilot.js +120 -0
  17. package/lib/commands/bus.js +29 -36
  18. package/lib/commands/compact.js +185 -46
  19. package/lib/commands/read-spec.js +352 -0
  20. package/lib/commands/sdd.js +600 -47
  21. package/lib/compact-guidance.js +122 -77
  22. package/lib/config.js +136 -0
  23. package/lib/global-paths.js +56 -20
  24. package/lib/hooks.js +26 -4
  25. package/lib/ide-detection.js +1 -1
  26. package/lib/ignore-files.js +5 -1
  27. package/lib/installer.js +196 -19
  28. package/lib/kapso.js +308 -0
  29. package/lib/methodology-migration-pending.js +13 -0
  30. package/lib/open-browser.js +32 -0
  31. package/lib/opencode-migrate.js +148 -0
  32. package/lib/opencode-plugin/index.js +84 -104
  33. package/lib/opencode-plugin/rules.js +236 -0
  34. package/lib/project-root.js +154 -0
  35. package/lib/repo-ide-sync.js +5 -0
  36. package/lib/spec-reader/lang.js +72 -0
  37. package/lib/spec-reader/md-parser.js +299 -0
  38. package/lib/spec-reader/session.js +139 -0
  39. package/lib/spec-reader/ui/app.js +685 -0
  40. package/lib/spec-reader/ui/index.html +59 -0
  41. package/lib/spec-reader/ui/mixed-lang.js +200 -0
  42. package/lib/spec-reader/ui/model-cache.js +117 -0
  43. package/lib/spec-reader/ui/style.css +294 -0
  44. package/lib/spec-reader/ui/supertonic-helper.js +565 -0
  45. package/lib/spec-sync.js +258 -0
  46. package/lib/test-scope.js +713 -0
  47. package/lib/testing-policy-sync.js +14 -2
  48. package/package.json +5 -3
  49. package/skills/apply/SKILL.md +50 -65
  50. package/skills/archive/SKILL.md +84 -50
  51. package/skills/ask/SKILL.md +43 -8
  52. package/skills/autopilot/SKILL.md +505 -0
  53. package/skills/bug/SKILL.md +52 -53
  54. package/skills/explore/SKILL.md +48 -1
  55. package/skills/guide/SKILL.md +35 -13
  56. package/skills/inbox/SKILL.md +9 -0
  57. package/skills/join/SKILL.md +1 -1
  58. package/skills/prereqs/BUS-CROSS-REPO.md +33 -16
  59. package/skills/prereqs/METHODOLOGY-CONTRACT.md +96 -17
  60. package/skills/prereqs/SKILL.md +1 -1
  61. package/skills/propose/SKILL.md +82 -19
  62. package/skills/read-spec/SKILL.md +76 -0
  63. package/skills/reply/SKILL.md +42 -9
  64. package/skills/review/SKILL.md +71 -25
  65. package/skills/review/checklist.md +2 -2
  66. package/skills/say/SKILL.md +40 -4
  67. package/skills/setup/SKILL.md +59 -5
  68. package/skills/setup/troubleshooting.md +11 -3
  69. package/skills/stats/SKILL.md +160 -0
  70. package/skills/status/SKILL.md +116 -0
  71. package/skills/test/SKILL.md +38 -11
  72. package/skills/up-code/SKILL.md +20 -13
  73. package/skills/update/SKILL.md +32 -1
  74. package/skills/verify/SKILL.md +85 -40
  75. package/templates/compact-guidance.md +10 -0
  76. package/templates/methodology-guide.md +5 -0
package/NOTICE.md ADDED
@@ -0,0 +1,46 @@
1
+ # Third-Party Notices
2
+
3
+ This project uses the following third-party packages. Each is used in accordance with its license terms.
4
+
5
+ ---
6
+
7
+ ## refacil-sdd-ai
8
+
9
+ ### @colbymchenry/codegraph
10
+
11
+ - **Author**: Colby McHenry
12
+ - **License**: MIT
13
+ - **Repository**: https://github.com/colbymchenry/codegraph
14
+ - **Purpose**: Optional call-graph indexer integrated into refacil-sdd-ai to reduce token consumption
15
+ in exploratory sub-agents (refacil-investigator, refacil-proposer, refacil-debugger) by querying
16
+ the indexed call graph instead of reading source files directly (~71% estimated token reduction).
17
+ - **Usage**: Optional — the methodology works without it. Enable via `refacil-sdd-ai init` or set
18
+ `codegraphMode: enabled` in `~/.refacil-sdd-ai/config.yaml`. Disable with:
19
+ `refacil-sdd-ai sdd write-config --global --codegraph disabled`
20
+
21
+ ### smol-toml
22
+
23
+ - **Author**: Florian Boulay and contributors
24
+ - **License**: MIT
25
+ - **Repository**: https://github.com/nicolo-ribaudo/smol-toml
26
+ - **Purpose**: TOML parser used for Codex agent frontmatter generation (`convertAgentToToml`).
27
+
28
+ ### ws
29
+
30
+ - **Author**: Einar Otto Stangvik and contributors
31
+ - **License**: MIT
32
+ - **Repository**: https://github.com/websockets/ws
33
+ - **Purpose**: WebSocket library used by the local refacil-bus broker for cross-repo agent communication.
34
+
35
+ ### @clack/prompts (optional)
36
+
37
+ - **Author**: Nate Moore and contributors
38
+ - **License**: MIT
39
+ - **Repository**: https://github.com/bombshell-dev/clack
40
+ - **Purpose**: Optional peer dependency for interactive CLI prompts during `refacil-sdd-ai init`.
41
+ Falls back to a built-in readline implementation when absent.
42
+
43
+ ---
44
+
45
+ All other dependencies included via transitive closure are subject to their respective licenses.
46
+ Refer to each package's `LICENSE` file or the npm registry for details.
package/README.md CHANGED
@@ -81,7 +81,7 @@ refacil-sdd-ai update
81
81
 
82
82
  `update` reads `~/.refacil-sdd-ai/selected-ides.json` (the selection saved during `init`) and only updates those IDEs — it never touches IDEs you did not select. You do not need to run `update` per repo; it operates on the global install.
83
83
 
84
- In Claude Code and Cursor the `check-update` hook (every session) syncs skills and `compact-guidance` automatically. It also cleans up any leftover project-level `refacil-*` artifacts from older installations and prints a message if it removes anything. In OpenCode the equivalent runs via the `session.created` handler of the embedded plugin. Only if a pending methodology migration is detected does the hook prompt `/refacil:update` — otherwise the user is not interrupted.
84
+ In Claude Code, Cursor, Codex, and OpenCode the `check-update` hook (every session / `session.created`) runs `refacil-sdd-ai check-update`: syncs skills, `compact-guidance`, optional CodeGraph reindex, and cleans leftover project-level `refacil-*` artifacts. OpenCode invokes the same CLI via `node <package>/bin/cli.js check-update` from the global plugin. Only if a pending methodology migration is detected does `notify-update` prompt `/refacil:update` — otherwise the user is not interrupted.
85
85
 
86
86
  ### Uninstall
87
87
 
@@ -120,7 +120,7 @@ Native CLI for **`refacil-sdd/`** (no separate OpenSpec skill layer). Used by sk
120
120
  |---|---|
121
121
  | `refacil-sdd-ai sdd new-change <name>` | Scaffold `proposal.md`, `design.md`, `tasks.md`, and specs under `refacil-sdd/changes/<name>/` |
122
122
  | `refacil-sdd-ai sdd list [--json]` | List active changes and review status |
123
- | `refacil-sdd-ai sdd status <name> [--json]` | Artifact and task status for one change |
123
+ | `refacil-sdd-ai sdd status <name> [--json]` | Artifact and task status for one change. `ready.forApply` requires `proposal.md`, `design.md`, `tasks.md`, and specs from `specs.md` and/or recursive `specs/**/*.md` |
124
124
  | `refacil-sdd-ai sdd mark-reviewed <name>` | Write `.review-passed` (requires `--verdict`, `--summary`, counts) |
125
125
  | `refacil-sdd-ai sdd tasks-update <name>` | Mark a task done (`--task N --done`) |
126
126
  | `refacil-sdd-ai sdd archive <name>` | Move a regular change to `refacil-sdd/changes/archive/` |
@@ -130,6 +130,56 @@ Native CLI for **`refacil-sdd/`** (no separate OpenSpec skill layer). Used by sk
130
130
 
131
131
  Run **`refacil-sdd-ai help`** for the full list including `bus` and `compact` subcommands.
132
132
 
133
+ ### read-spec — on-device voice reading of SDD artifacts
134
+
135
+ Opens a Markdown file or a complete SDD change folder in the browser and reads it aloud using **on-device TTS** (Supertonic/Kokoro via ONNX). No audio is sent to any server — synthesis runs entirely in the browser.
136
+
137
+ ```bash
138
+ # Single file
139
+ refacil-sdd-ai read-spec --file refacil-sdd/specs/my-feature/spec.md
140
+
141
+ # Full SDD change folder (proposal + design + tasks + specs in a sidebar)
142
+ refacil-sdd-ai read-spec --change my-feature-change
143
+
144
+ # Archived change folder (path relative to refacil-sdd/changes/)
145
+ refacil-sdd-ai read-spec --change archive/2026-05-20-my-feature-change
146
+ ```
147
+
148
+ | Option | Default | Description |
149
+ |---|---|---|
150
+ | `--file <path>` | — | Single Markdown file (must be inside the project root) |
151
+ | `--change <name>` | — | Load all SDD artifacts for a change folder; accepts `archive/<date>-<name>` paths too |
152
+ | `--select <file.md>` | `proposal.md` | Pre-select a specific file when using `--change` |
153
+ | `--lang <code>` | auto | TTS language (`es`, `en`, …). Defaults to `artifactLanguage` from the SDD meta comment |
154
+ | `--voice <id>` | `M3` | Voice style: `M1`–`M5` or `F1`–`F5` |
155
+ | `--speed <n>` | `1` | Playback speed 0.9–1.5 |
156
+
157
+ #### File mode vs folder mode
158
+
159
+ | | File mode (`--file`) | Folder mode (`--change`) |
160
+ |---|---|---|
161
+ | Sidebar | Hidden — content fills the full width | Shows all `.md` files in the change folder |
162
+ | Navigation | Sections within the single file | Sections within the active file + **auto-advances to next file** when the last section finishes |
163
+ | Use case | Quick review of a single spec | Full walkthrough: proposal → design → tasks → specs in one uninterrupted session |
164
+
165
+ #### TTS pipeline
166
+
167
+ - **Bilingual synthesis**: Spanish text is split into segments; English technical terms (`HTML`, `CSS`, `API`, camelCase identifiers, file paths, CLI flags, etc.) are synthesized with the English voice engine. Both segments are concatenated into a single audio buffer with no perceptible gap.
168
+ - **Markdown rendering**: [`marked.js`](https://marked.js.org/) (loaded via CDN) renders headings, lists, tables, code blocks, bold/italic, and blockquotes as HTML. Falls back to plain text if the CDN is unavailable (offline mode).
169
+ - **TTS text pipeline** — what gets stripped or transformed before synthesis:
170
+ - **Named code blocks** (` ```typescript `) → `"code block: typescript"` (source is not read aloud)
171
+ - **Unlabeled code blocks** (` ``` `) → body is read as plain text (diagrams, dependency graphs)
172
+ - **Markdown tables** → header label (`"tabla: ColA, ColB."`) followed by each data row as a comma list
173
+ - **HTML tag mentions** (e.g. `` `<table>` ``) → tag name only (`"table"`)
174
+ - **Arrows** (`→`) → `"arrow"`; emojis are removed
175
+ - **Paragraph lines** without terminal punctuation → period appended (natural TTS pause)
176
+ - **List items** → comma after each item except the last, which gets a period (enumeration rhythm)
177
+ - **On-device**: models are downloaded from HuggingFace on the first visit and cached in the browser. Subsequent opens are instant. No data leaves the machine.
178
+
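
Three of the text rules listed above (terminal period on paragraph lines, comma/period rhythm on list items, spoken arrows) can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's `md-parser.js`: the function names are hypothetical, and treating only `.`, `!`, and `?` as terminal punctuation is a guess.

```javascript
// Hypothetical sketch of three TTS text rules; not the package's actual parser.
// Assumption: only ".", "!" and "?" count as terminal punctuation.
const TERMINAL = /[.!?]$/;

// Paragraph lines without terminal punctuation get a period (natural TTS pause).
function punctuateParagraph(line) {
  const text = line.trim();
  if (text === "") return text;
  return TERMINAL.test(text) ? text : text + ".";
}

// List items: comma after each item except the last, which gets a period.
function punctuateList(items) {
  return items.map((item, i) => {
    const text = item.trim().replace(/[.,;]$/, "");
    return text + (i === items.length - 1 ? "." : ",");
  });
}

// Arrows are replaced by the spoken word "arrow".
function speakArrows(text) {
  return text.replace(/\s*→\s*/g, " arrow ");
}

console.log(punctuateParagraph("Run the setup step"));
console.log(punctuateList(["proposal", "design", "tasks"]).join(" "));
console.log(speakArrows("apply → test"));
```

Keeping each rule as a small pure function makes it easy to test one transformation at a time before chaining them.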
179
+ #### Artifact Language
180
+
181
+ `read-spec` detects the `artifactLanguage` meta comment at the top of the Markdown file (e.g. `<!-- refacil-sdd: artifactLanguage=spanish -->`) and sets the primary TTS language automatically. The `--lang` flag overrides it.
182
+
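
The precedence described above (`--lang` first, then the meta comment) might look like the sketch below. `resolveLanguage` is a hypothetical name. The sketch scans the whole text rather than only the top of the file, and it returns the raw comment value (for example `spanish`) because the mapping to a TTS code such as `es` is not described here.

```javascript
// Hypothetical sketch: pick the TTS language from --lang or the meta comment.
const META = /<!--\s*refacil-sdd:\s*artifactLanguage=([A-Za-z-]+)\s*-->/;

function resolveLanguage(markdown, langFlag) {
  if (langFlag) return langFlag; // --lang overrides the meta comment
  const match = META.exec(markdown);
  // null means "no preference found"; the caller applies its own default.
  return match ? match[1].toLowerCase() : null;
}

const doc = "<!-- refacil-sdd: artifactLanguage=spanish -->\n# Propuesta";
console.log(resolveLanguage(doc));       // value taken from the meta comment
console.log(resolveLanguage(doc, "en")); // flag wins over the comment
```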
133
183
  ### Artifact Language
134
184
 
135
185
  By default, `/refacil:propose` generates proposal, specs, design, and tasks in **English**. Set `artifactLanguage` to have the artifacts produced in your team's preferred language so developers can review them in their natural language.
@@ -162,14 +212,24 @@ refacil-sdd-ai sdd config --json
162
212
 
163
213
  `refacil-sdd-ai init` also prompts for this preference and writes to the global config. Skip with `--yes` to keep the current value.
164
214
 
215
+ ### Kapso notifications (`kapso`)
216
+
217
+ [Kapso](https://docs.kapso.ai/docs/whatsapp/send-messages/text) is a WhatsApp notification service. You'll need a Kapso account to obtain `KAPSO_API_KEY` and `KAPSO_PHONE_NUMBER_ID`.
218
+
219
+ | Command | Description |
220
+ |---|---|
221
+ | `refacil-sdd-ai kapso setup` | Interactive setup of Kapso WhatsApp notification credentials (`~/.refacil-sdd-ai/kapso.env`) |
222
+
165
223
  ### Command rewrite control (`compact-bash`)
166
224
 
167
225
  | Command | Description |
168
226
  |---|---|
169
- | `refacil-sdd-ai compact stats` | Statistics (hook + already-compact) + estimated tokens and USD |
227
+ | `refacil-sdd-ai compact stats` | Statistics (compact-bash hook + CodeGraph) and estimated tokens/USD |
228
+ | `refacil-sdd-ai compact log-codegraph-event` | Log a sub-agent CodeGraph session (`--skill`, `--has-graph`, `--tool-calls`, `--tokens`) |
170
229
  | `refacil-sdd-ai compact enable` | Re-enable rewriting |
171
230
  | `refacil-sdd-ai compact disable` | Disable rewriting without uninstalling |
172
231
  | `refacil-sdd-ai compact clear-log` | Clear `~/.refacil-sdd-ai/compact.log` |
232
+ | `refacil-sdd-ai compact codegraph-clear-log` | Clear `~/.refacil-sdd-ai/codegraph.log` |
173
233
 
174
234
  ### Agent bus (`bus`)
175
235
 
@@ -192,7 +252,7 @@ refacil-sdd-ai sdd config --json
192
252
 
193
253
  > The `join/leave/say/ask/reply/attend/inbox` subcommands also exist as **IDE skills** (`/refacil:join`, etc.). In most cases use the skills; the CLI commands are for scripting or debugging.
194
254
  >
195
- > **Cross-repo coordination** (ask requests, room agreements, `/refacil:propose`, closing to the requester): after `init`, the file **`BUS-CROSS-REPO.md`** is available in `~/.claude/skills/refacil-prereqs/` and `~/.cursor/skills/refacil-prereqs/`.
255
+ > **Cross-repo coordination** (ask requests, room agreements, `/refacil:propose`, closing to the requester): after `init`, the file **`BUS-CROSS-REPO.md`** is available in each selected IDE's global `refacil-prereqs` skill folder — e.g. `~/.claude/skills/refacil-prereqs/`, `~/.cursor/skills/refacil-prereqs/`, `~/.config/opencode/skills/refacil-prereqs/`, `~/.codex/skills/refacil-prereqs/` (or your `OPENCODE_CONFIG_DIR` skills path).
196
256
 
197
257
  ---
198
258
 
@@ -216,6 +276,10 @@ All invoked as `/refacil:<name>` in Claude Code, Cursor, OpenCode, or Codex.
216
276
  | `/refacil:up-code` | Commit + push + PR (runs review if missing) |
217
277
  | `/refacil:bug` | Full bugfix flow with regression tests |
218
278
  | `/refacil:update` | Detect and apply pending methodology migrations to the current repo |
279
+ | `/refacil:stats` | Show change progress, task status, review gate, and test commands from SDD artifacts |
280
+ | `/refacil:status` | Show which phase of the SDD-AI cycle a change is in and the exact command to resume it |
281
+ | `/refacil:read-spec` | Listen to change specs in the browser with on-device TTS |
282
+ | `/refacil:autopilot` | Autonomous pipeline: chains apply → test → verify → review → archive in one invocation; up-code (push + PR) is optional and configured in pre-flight. Optional WhatsApp notification via `~/.refacil-sdd-ai/kapso.env` |
219
283
 
220
284
  ### Automatic sub-agents (v3.0.0+)
221
285
 
@@ -224,9 +288,9 @@ Some skills delegate their heavy work to **sub-agents** that run in isolated con
224
288
  | Skill | Sub-agent | Role | Can write |
225
289
  |---|---|---|---|
226
290
  | `/refacil:explore` | `refacil-investigator` | Reads codebase, enriches with AGENTS.md, queries cross-repo bus | No |
227
- | `/refacil:verify` | `refacil-validator` | Runs tests + compares against spec, returns prioritized issues | No |
291
+ | `/refacil:verify` | `refacil-validator` | Validates CA/CR vs spec; runs tests only when `testExecution: full` or smoke after fixes (§3.2) | No |
228
292
  | `/refacil:review` | `refacil-auditor` | Evaluates changes against the quality checklist | No |
229
- | `/refacil:test` | `refacil-tester` | Detects stack, generates tests covering CA/CR, runs and fixes | Yes (test files) |
293
+ | `/refacil:test` | `refacil-tester` | **Canonical test phase**: generates tests, runs scoped suite + coverage, writes `memory.commandsRun` | Yes (test files) |
230
294
  | `/refacil:apply` | `refacil-implementer` | Reads SDD artifacts and implements all change tasks | Yes (source code) |
231
295
  | `/refacil:bug` | `refacil-debugger` | `investigation` mode: analyzes root cause without modifying anything. `fix` mode: implements the fix, generates regression tests, creates `summary.md` | Only in fix mode |
232
296
  | `/refacil:propose` | `refacil-proposer` | Explores the codebase and generates proposal, specs, design, and tasks | Yes (SDD artifacts) |
@@ -239,6 +303,15 @@ Some skills delegate their heavy work to **sub-agents** that run in isolated con
239
303
 
240
304
  **Two-pass `refacil:bug` flow**: the wrapper first invokes the sub-agent in `investigation` mode (writes nothing) → the user confirms the hypothesis and approves the fix → the wrapper validates the working branch → invokes the sub-agent in `fix` mode to implement.
241
305
 
306
+ ### Component-bounded testing (monorepos)
307
+
308
+ In a monorepo, **no phase ever runs the entire monorepo's test suite** — each phase scopes execution to the **affected component(s)** only. This `component-bounded` principle is defined in `skills/prereqs/METHODOLOGY-CONTRACT.md` (§3 / §3.1 / §3.2).
309
+
310
+ - **Scope resolution**: `test-scope` resolves every changed file to its owning component (`findModuleRoot` → `affectedComponents`) and runs that component's real test command from its own root (`cd <component> && …`), language-agnostic (Node, Python, Go, Rust, Java/Maven/Gradle, C#/dotnet…). Test files passed directly are recognized as their own scope.
311
+ - **`/refacil:apply` never runs the full suite**: it runs a smoke check of what it modified, or skips and delegates the full run to `/refacil:test` (overrides the §3.1 "unreliable scope → run baseline" clause).
312
+ - **`/refacil:test` is the only phase that runs a full suite** — and only for the affected component. A re-run covers just the previously failing tests, not the whole suite again.
313
+ - **`/refacil:verify`, `/refacil:review`, and `/refacil:archive` do not re-execute tests**: they consume the evidence recorded by `/refacil:test` in `memory.yaml`. In autopilot, missing/stale evidence aborts instead of silently widening the test scope.
314
+
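
The scope resolution step above (changed file → owning component) can be illustrated with a small sketch. The names `findModuleRoot` and `affectedComponents` are borrowed from the text, but the bodies are assumptions, not the code in `lib/test-scope.js`. The manifest list is hypothetical, and `exists` is injected so the example runs without touching the disk.

```javascript
const path = require("node:path");

// Hypothetical manifest list marking a component root (an assumption).
const MANIFESTS = ["package.json", "pyproject.toml", "go.mod", "Cargo.toml", "pom.xml"];

// Walk upward from a changed file until a directory with a manifest is found.
// Falls back to the repo root when no component manifest exists on the way up.
function findModuleRoot(file, repoRoot, exists) {
  let dir = path.posix.dirname(file);
  while (true) {
    if (MANIFESTS.some((m) => exists(path.posix.join(dir, m)))) return dir;
    if (dir === repoRoot || dir === path.posix.dirname(dir)) return repoRoot;
    dir = path.posix.dirname(dir);
  }
}

// Deduplicate component roots so each component's tests run once.
function affectedComponents(changedFiles, repoRoot, exists) {
  const roots = changedFiles.map((f) => findModuleRoot(f, repoRoot, exists));
  return [...new Set(roots)].sort();
}

const manifests = new Set(["/repo/services/api/package.json", "/repo/web/package.json"]);
console.log(
  affectedComponents(
    ["/repo/services/api/src/a.js", "/repo/web/index.js"],
    "/repo",
    (p) => manifests.has(p)
  )
);
```

Each returned root is then the directory a component's own test command would run from, which is what keeps execution bounded to the affected components.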
242
315
  ### Agent bus
243
316
 
244
317
  | Skill | Usage |
@@ -260,7 +333,14 @@ Quick rule for choosing the entry command:
260
333
  - New feature or behavior change → `/refacil:propose`
261
334
  - Functional bug or production error → `/refacil:bug`
262
335
 
263
- From there, the full cycle is:
336
+ **Optional token-reduction layer**: if `.codegraph/` exists at the repo root (created by
337
+ `refacil-sdd-ai codegraph init` via `/refacil:setup` when `codegraphMode` is `enabled`),
338
+ exploratory sub-agents use CodeGraph symbol queries instead of file reads, reducing token
339
+ consumption ~71% in the `/refacil:explore`, `/refacil:propose`, and `/refacil:bug`
340
+ (investigation phase) flows. This layer is transparent — skill invocation and output contracts
341
+ are unchanged.
342
+
343
+ From there, the full cycle is (after `/refacil:propose` you choose step-by-step or autonomous — see note below):
264
344
 
265
345
  ```
266
346
  ┌───────────────────────────┐
@@ -280,45 +360,92 @@ From there, the full cycle is:
280
360
  design + summary.md)
281
361
  tasks) │
282
362
  │ │
363
+ │ ┌─────────────────────────────┐
364
+ ├─┤ read-spec --change <name> │ ← optional
365
+ │ │ (listen to proposal, specs, │
366
+ │ │ design & tasks by voice; │
367
+ │ │ auto-advances file by file)│
368
+ │ └─────────────────────────────┘
283
369
  ▼ │
284
- /refacil: │
285
- apply
286
- │ │
287
-
288
- /refacil: │
289
- test
290
-
291
- ▼ │
292
- /refacil: │
293
- verify
294
- (max 2 rounds
295
- autofix)
296
-
297
- └───┬───┘
298
-
299
- /refacil:review
300
- (generates .review-passed)
301
-
302
- /refacil:archive
303
- (feature: moves to archive/ + syncs specs
304
- bug: fix-*/spec.md + review.yaml)
305
-
306
- /refacil:up-code
307
- (checks review +
308
- commit + push + PR)
309
-
310
- PR created
370
+ ┌──────────────┴──────────┐
371
+ Continue implementation?│
372
+ └────┬──────────────┬─────┘
373
+
374
+ A: step-by-step B: autonomous
375
+
376
+ ▼ ▼
377
+ /refacil: /refacil:
378
+ apply autopilot ──────────────────────────────┐
379
+ (internally chains: │
380
+ ▼ apply test → verify → review
381
+ /refacil: → archive → [up-code, optional])
382
+ test
383
+ │ │ on finish: │
384
+ │ WhatsApp via Kapso │
385
+ /refacil: │ (if configured) │
386
+ verify ▼ │
387
+ (CA/CR; tests PR created or archive-only ◄─────────── ┘
388
+ delegated to (depends on pre-flight up-code choice)
389
+ delegated to
390
+ test phase;
391
+ max 2 autofix
392
+ smoke only)
393
+
394
+
395
+ /refacil:review
396
+ (generates .review-passed)
397
+
398
+
399
+ /refacil:archive
400
+ (feature: moves to archive/ + syncs specs
401
+ bug: fix-*/spec.md + review.yaml)
402
+
403
+
404
+ /refacil:up-code
405
+ (checks review +
406
+ commit + push + PR)
407
+
408
+
409
+ PR created
311
410
  ```
312
411
 
412
+ > **After `/refacil:propose` is approved**, two continuation options are offered:
413
+ > - **`/refacil:apply`** (option A) — step-by-step: each phase (apply → test → verify) pauses for your confirmation.
414
+ > - **`/refacil:autopilot`** (option B) — autonomous: chains apply → test → verify → review → archive in one invocation. During pre-flight you decide whether to include up-code (push + PR) or end the cycle at archive. The pipeline adapts: with up-code it ends at a PR; without up-code it ends at archive. Optional WhatsApp notification via Kapso in both cases (configure with `refacil-sdd-ai kapso setup`). Path B is fully independent — it handles review, archive, and optionally up-code internally without merging into path A.
415
+ >
416
+ > **`read-spec --change <name>`** is an optional review step between propose and the implementation choice. It opens the change folder in the browser and reads proposal, design, tasks, and specs aloud in order, auto-advancing between files. Use it to absorb the scope of a change hands-free before committing to implementation.
417
+
418
+ ---
419
+
420
+ ## Autonomous Mode
421
+
422
+ Run the full post-proposal SDD cycle without manual intervention using `/refacil:autopilot`. After `/refacil:propose` is approved, a single command chains **apply → test → verify → review → archive** and, depending on your pre-flight choice, optionally continues with **up-code** (commit + push + PR). You decide in the pre-flight whether to include up-code or end the cycle at archive. The pipeline adapts accordingly and always sends the Kapso notification and prints the terminal summary when it finishes.
423
+
424
+ ### One-time Kapso setup (optional — required for WhatsApp notifications)
425
+
426
+ ```bash
427
+ refacil-sdd-ai kapso setup
428
+ ```
429
+
430
+ This prompts for `KAPSO_API_KEY`, `KAPSO_PHONE_NUMBER_ID`, and `NOTIFY_PHONE` (E.164 format), then writes `~/.refacil-sdd-ai/kapso.env` with `chmod 600`. You only need to run this once. Autopilot works without it — you just won't receive a WhatsApp notification.
431
+
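
As a rough sketch of what the setup step produces: the snippet below validates `NOTIFY_PHONE` against the E.164 shape and renders the file contents. The `KEY=value` layout of `kapso.env` and the function name `renderKapsoEnv` are assumptions; only the three variable names and the `chmod 600` permission come from the text.

```javascript
// Hypothetical sketch; the KEY=value layout of kapso.env is an assumption.
// E.164: "+" followed by up to 15 digits, with no leading zero.
const E164 = /^\+[1-9]\d{1,14}$/;

function renderKapsoEnv({ apiKey, phoneNumberId, notifyPhone }) {
  if (!E164.test(notifyPhone)) {
    throw new Error(`NOTIFY_PHONE must be in E.164 format: ${notifyPhone}`);
  }
  return [
    `KAPSO_API_KEY=${apiKey}`,
    `KAPSO_PHONE_NUMBER_ID=${phoneNumberId}`,
    `NOTIFY_PHONE=${notifyPhone}`,
  ].join("\n") + "\n";
}

// Writing with owner-only permissions (the chmod 600 equivalent on POSIX):
// require("node:fs").writeFileSync(envPath, renderKapsoEnv(values), { mode: 0o600 });
```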
432
+ > **Getting your Kapso credentials**: see [Kapso docs → Introduction](https://docs.kapso.ai/docs/whatsapp/send-messages/text) for how to create an account, get your API key, and configure a phone number sender.
433
+
434
+
313
435
  **Two-layer review gate**:
314
436
  - `/refacil:up-code` detects a missing `.review-passed` and **automatically runs `/refacil:review`** before pushing.
315
437
  - The `check-review` hook also intercepts manual `git push` commands and **blocks** the operation if it is missing. The hook does not invoke skills — it only blocks and instructs.
316
438
 
439
+ **Behavior on failure**:
440
+ - Autopilot stops at the failing phase, preserves the working tree for inspection, records the relevant evidence, and sends a Kapso failure notification when configured.
441
+ - Normal recovery does not use destructive reset commands. The developer decides how to keep, fix, or discard local edits after reviewing the evidence.
442
+
317
443
  **Archive**:
318
- - For features/improvements: the CLI moves artifacts to `archive/` and extracts `.review-passed` fields to `review.yaml` inside each affected spec.
319
- - For bugs: manual archiving, creates `refacil-sdd/specs/fix-*/spec.md` in standard format + `review.yaml`.
444
+ - For features/improvements: the archive flow moves artifacts to `archive/` and persists `.review-passed` fields to `review.yaml` inside each affected spec. Specs can live in `specs.md`, recursive `specs/**/*.md`, or both; `sync-spec` consumes the same source set as `sdd status`.
445
+ - For bugs: `fix-*` folders are the operational exception to regular proposal readiness. They archive with `summary.md`, regression test evidence, and `.review-passed`, then create `refacil-sdd/specs/fix-*/spec.md` in standard format + `review.yaml`.
320
446
  - A single branch can accumulate multiple bugs, each in its own independent `fix-*/` folder.
321
447
  - `/refacil:archive` always requests one or more **task references** associated with the change before proceeding. Accepted formats: URL, ticket/issue identifier, or task name. References are stored in `review.yaml` under the `taskReferences` field (YAML list). This field is mandatory — archiving does not proceed until the user provides at least one reference.
448
+ - `/refacil:archive` uses current `/refacil:test` evidence from `memory.yaml` by default. In normal mode it asks before continuing if evidence is missing or stale; in autopilot mode it aborts instead of silently re-running or widening tests.
322
449
 
323
450
  ---
324
451
 
@@ -328,19 +455,19 @@ Installed during `init` / `update` for each selected IDE. The same four behavior
328
455
 
329
456
  | Behavior | Claude Code | Cursor | OpenCode | Codex |
330
457
  |---|---|---|---|---|
331
- | **check-update** | `SessionStart` hook in `~/.claude/settings.json` | `SessionStart` hook in `~/.cursor/hooks.json` | `session.created` handler in the global OpenCode plugin | `sessionStart` hook in `~/.codex/config.toml` |
458
+ | **check-update** | `SessionStart` `refacil-sdd-ai check-update` | `sessionStart` same CLI (single entry; no `workspaceOpen` duplicate) | `session.created` same CLI (`node …/bin/cli.js check-update`) | `sessionStart` same CLI |
332
459
  | **notify-update** | `UserPromptSubmit` hook | `beforeSubmitPrompt` hook | `tui.prompt.append` handler | `userPromptSubmit` hook in `~/.codex/config.toml` |
333
460
  | **compact-bash** | `PreToolUse` (Bash) hook | `PreToolUse` (Bash) hook | `tool.execute.before` handler for bash tool | `preToolUse` hook (Bash matcher) in `~/.codex/config.toml` |
334
461
  | **check-review** | `PreToolUse` (Bash) hook | `PreToolUse` (Bash) hook | `tool.execute.before` handler for bash tool | `preToolUse` hook (Bash matcher) in `~/.codex/config.toml` |
335
462
 
336
463
  | Behavior | What it does |
337
464
  |---|---|
338
- | `check-update` | On startup: deletes `.refacil-pending-update` if no migration is pending (stale flags). Then: npm check, sync skills, **compact-guidance**. If skills were synced **and** a migration is pending, writes the flag for `notify-update`. Always refreshes the flag content when a migration is pending (keeps the `to` version current). |
465
+ | `check-update` | On startup: deletes `.refacil-pending-update` if no migration is pending (stale flags). Then: npm check, sync skills, **compact-guidance**, **CodeGraph** auto-init/reindex when enabled. If skills were synced **and** a migration is pending, writes the flag for `notify-update`. Always refreshes the flag content when a migration is pending (keeps the `to` version current). Repo root: `CURSOR_PROJECT_DIR` / `CLAUDE_PROJECT_DIR`, then Cursor `workspace_roots` from stdin, then `.git` traversal (never the embedded `refacil-sdd-ai/` package inside a monorepo). |
339
466
  | `notify-update` | If the flag exists **and** a methodology migration is pending (same table as `/refacil:update`), injects the instruction before the agent processes the next user message; if the sync happened without a migration, the flag is not created or is discarded silently. |
340
467
  | `compact-bash` | Silently rewrites bare Bash commands. No extra turns, the IDE does not see the change. Requires Claude Code >= 2.1.89 for the `updatedInput` path. |
341
- | `check-review` | Intercepts `git push` and blocks if `.review-passed` is missing in any active change. |
468
+ | `check-review` | Intercepts `git push` and blocks if an active change has started implementation (`tasks.md` with ≥1 `[x]`) without `.review-passed`. |
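
The new `check-review` condition (implementation started, no `.review-passed`) reduces to a small predicate. The sketch below is hypothetical and is not the code in `lib/check-review.js`: it takes the `tasks.md` text and a `reviewPassed` flag as inputs instead of reading the change folders, and it assumes tasks are Markdown checkbox items.

```javascript
// Hypothetical sketch of the gate decision; inputs are passed in, not read from disk.
// Assumption: tasks are Markdown checkbox items such as "- [x] 1.1 Add endpoint".
function hasStartedImplementation(tasksMd) {
  return /^\s*[-*]\s+\[x\]/m.test(tasksMd);
}

// Returns the names of active changes that should block `git push`.
function changesBlockingPush(activeChanges) {
  return activeChanges
    .filter((c) => hasStartedImplementation(c.tasksMd) && !c.reviewPassed)
    .map((c) => c.name);
}

const blocking = changesBlockingPush([
  { name: "add-login", tasksMd: "- [x] 1.1 Add route\n- [ ] 1.2 Add tests", reviewPassed: false },
  { name: "draft-idea", tasksMd: "- [ ] 1.1 Explore", reviewPassed: false },
]);
console.log(blocking.length > 0 ? `blocked by: ${blocking.join(", ")}` : "push allowed");
```

Under this rule a change that only has a proposal and untouched tasks no longer blocks pushes, which is the difference from the previous "any active change" behavior.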
342
469
 
343
- > **OpenCode plugin**: a single file installed in the global OpenCode plugins directory implements all four behaviors. It loads `lib/compact/rules.js` from the package to reuse the same rewrite rules — no duplicated logic. If the rules file is not resolvable, compact-bash is disabled gracefully with a warning to stderr; the plugin never crashes the session.
470
+ > **OpenCode plugin**: a single file installed in the global OpenCode plugins directory implements all four behaviors. `session.created` shells out to the same `check-update` CLI as the other IDEs (not a partial reimplementation). For `compact-bash` it loads `rules.js` co-installed in `~/.config/opencode/plugins/` alongside `refacil-hooks.js`, with fallback to `lib/compact/rules.js` from the npm package — no duplicated rewrite logic. If the rules file is not resolvable, compact-bash is disabled gracefully with a warning to stderr; the plugin never crashes the session.
344
471
 
345
472
  > **Codex hooks**: injected into `~/.codex/config.toml` under `[hooks]` with `[features] codex_hooks = true`. Each SDD-AI hook entry carries a boolean marker (`_sdd`, `_sdd_compact`, `_sdd_review`, `_sdd_notify`) for clean removal on `clean`. User-defined hooks outside these entries are preserved.
346
473
 
@@ -450,7 +577,8 @@ Local bus (WebSocket over `127.0.0.1`) so agents across different repos can comm
450
577
  **Properties**:
451
578
 
452
579
  - 100% local: nothing leaves `127.0.0.1`. No accounts, no shared service.
453
- - Zero config: the broker auto-spawns the first time a skill needs it (`127.0.0.1:7821`, fallback 7822/7823).
580
+ - Zero config: the broker auto-spawns the first time a skill needs it (`127.0.0.1:7821`, fallback 7822/7823). If all three fixed candidates are occupied by external processes, the broker binds an OS-assigned ephemeral port instead of failing — clients discover the actual port automatically.
581
+ - **Port override (`REFACIL_BUS_PORT`)**: set this env var when the broker spawns to bind a specific port exclusively — a fixed number (e.g. `REFACIL_BUS_PORT=9000`), or `0` to force an OS-assigned ephemeral port. Useful in CI or sandboxed environments where `7821-7823` are unavailable or reserved.
454
582
  - ~40 MB RAM, 0% CPU idle. Persistence: `~/.refacil-sdd-ai/bus/<room>/inbox.jsonl` (7-day rotation).
455
583
  - Same skills in Claude Code and Cursor.
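
The port selection order described in the properties above can be sketched as a candidate list. `portCandidates` is a hypothetical helper, not the broker's code, and rejecting an invalid `REFACIL_BUS_PORT` with an error is an assumption because the behavior for bad values is not stated.

```javascript
// Hypothetical sketch of the port selection order; not the broker's code.
const DEFAULT_PORTS = [7821, 7822, 7823];

function portCandidates(env) {
  const raw = env.REFACIL_BUS_PORT;
  if (raw !== undefined && raw !== "") {
    const port = Number(raw);
    // Assumption: invalid values are rejected rather than silently ignored.
    if (!Number.isInteger(port) || port < 0 || port > 65535) {
      throw new RangeError(`Invalid REFACIL_BUS_PORT: ${raw}`);
    }
    return [port]; // exclusive: no fallback to the default ports
  }
  // Port 0 asks the OS for an ephemeral port once the fixed candidates are taken.
  return [...DEFAULT_PORTS, 0];
}

console.log(portCandidates({}));
console.log(portCandidates({ REFACIL_BUS_PORT: "9000" }));
```

A broker would try each candidate in order and publish the port it actually bound, so that clients can discover it.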
 
@@ -498,10 +626,11 @@ Skills, sub-agents, and hooks are installed into the user's global IDE directori
  ~/.cursor/agents/refacil-*.md # Cursor sub-agents (readonly:true/false + model:inherit, auto-generated)
  ~/.cursor/hooks.json # SDD hooks merged in (same four behaviors)
 
- # OpenCode (if selected) — macOS/Linux: ~/.config/opencode/ Windows: %APPDATA%\opencode
+ # OpenCode (if selected) — all platforms: ~/.config/opencode/ (override: OPENCODE_CONFIG_DIR)
  ~/.config/opencode/skills/refacil-*/ # OpenCode skills
  ~/.config/opencode/agents/refacil-*.md # OpenCode sub-agents (permission block + mode:subagent)
  ~/.config/opencode/plugins/refacil-hooks.js # Plugin: session.created + tui.prompt.append + tool.execute.before
+ ~/.config/opencode/plugins/refacil-check-review.js # Shared git push review gate (used by refacil-hooks.js)
 
  # Codex (if selected)
  ~/.codex/skills/refacil-*/ # Codex skills (same content as Claude Code)
@@ -541,6 +670,45 @@ refacil-sdd/ # SDD artifacts store
 
  ---
 
+ ## Third-party integrations
+
+ ### CodeGraph (optional)
+
+ - **Author**: Colby McHenry
+ - **License**: MIT
+ - **Repository**: https://github.com/colbymchenry/codegraph
+ - **Purpose**: When present, reduces token consumption ~71% in exploratory sub-agents
+ (`refacil-investigator`, `refacil-proposer`, `refacil-debugger`) by querying an indexed call graph
+ instead of reading source files directly. The methodology works without it — CodeGraph is purely optional.
+
+ **How it works**: after `refacil-sdd-ai init` sets `codegraphMode: enabled`, the setup step
+ (`/refacil:setup`) runs `refacil-sdd-ai codegraph init` in the background. This creates a `.codegraph/`
+ directory at the repo root. Exploratory sub-agents detect `.codegraph/` at the start of each session
+ and prefer CodeGraph symbol queries (`codegraph_search`, `codegraph_callers`, `codegraph_callees`,
+ `codegraph_context`, `codegraph_impact`) over raw file reads.
+
+ **Opt-out** at any time:
+
+ ```bash
+ refacil-sdd-ai sdd write-config --global --codegraph disabled
+ ```
+
+ Or set `codegraphMode: disabled` in `~/.refacil-sdd-ai/config.yaml`.
+
+ **Modes**:
+
+ | Mode | Behavior |
+ |---|---|
+ | `enabled` | Auto-index every repo on `/refacil:setup` (recommended) |
+ | `per-repo` | Ask once per project during `/refacil:setup` |
+ | `disabled` | Never use CodeGraph |
+
+ Configure during `refacil-sdd-ai init` or at any time:
+
+ ```bash
+ refacil-sdd-ai sdd write-config --global --codegraph enabled
+ ```
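As a rough illustration of the three modes and the `.codegraph/` detection described in this section, the sketch below shows one way the decisions could be expressed. `indexDecision`, `hasCodegraphIndex`, and the `repoOptIn` argument are hypothetical; how the package stores a per-repo answer is not stated in the source.

```javascript
const fs = require('fs');
const path = require('path');

const MODES = ['enabled', 'per-repo', 'disabled'];

// Decide what setup should do for one repo.
// `repoOptIn` is a hypothetical stored answer for `per-repo` mode:
// true or false once the user has been asked, undefined before that.
function indexDecision(mode, repoOptIn) {
  if (!MODES.includes(mode)) throw new Error(`Unknown codegraphMode: ${mode}`);
  if (mode === 'disabled') return 'skip';
  if (mode === 'enabled') return 'index';
  if (repoOptIn === undefined) return 'ask';
  return repoOptIn ? 'index' : 'skip';
}

// Sub-agents look for the index directory at the repo root.
function hasCodegraphIndex(repoRoot) {
  const dir = path.join(repoRoot, '.codegraph');
  return fs.existsSync(dir) && fs.statSync(dir).isDirectory();
}
```

Keeping detection as a plain directory check means a repo that was never indexed behaves the same as one where CodeGraph is disabled: sub-agents simply fall back to reading files.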
+
  ## Technologies
 
  - [AGENTS.md](https://agents.md/) — universal AI instructions standard
package/agents/auditor.md CHANGED
@@ -42,6 +42,7 @@ If you prefer only the report (without the marker), respond with the explicit sc
  - **If the briefing includes `projectType`**: use it to decide which checklists to load — **do not re-detect the project type**.
  - **If the briefing includes `changeObjective`**: use it as intent context — **do not read `proposal.md`** to extract the same thing.
  - Read ONLY the files in the blocking scope (those in `changedFiles`). Read pre-existing context only if strictly necessary to evaluate a checklist item.
+ - **Do not run the project's full or scoped test suite via Bash** unless the briefing sets `testExecution: full` (rare) or the user explicitly requested re-execution. For checklist §6, use `commandsRun` / `criteriaRun` from the briefing and static review of test files (**METHODOLOGY-CONTRACT.md §3.2**).
  - **Every tool call has a cost** — justify each Read/Bash with a concrete evaluation need.
 
  ## Critical sub-agent rules
@@ -71,6 +72,8 @@ The main agent passes you the already-resolved scope and the BRIEFING block. Ext
  - `changedFiles` → blocking scope (new/modified files in this change)
  - `projectType` → which checklists to load
  - `changeObjective` → intent context of the change
+ - `commandsRun`, `criteriaRun`, `lastStep` → test phase evidence (do not re-run suite when present)
+ - `testExecution` → default `none` for review; never widen to full suite without explicit user request
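The test-execution rule for the auditor could be expressed as a small guard like the one below. This is a sketch only: the briefing is assumed to arrive as a plain object, the array and `null` defaults are assumptions, and `mayRunTestSuite` and `testEvidence` are hypothetical names.

```javascript
// True only when the briefing sets testExecution to 'full' or the user
// explicitly asked for re-execution. A missing field defaults to 'none'.
function mayRunTestSuite(briefing, userRequestedRerun = false) {
  const testExecution = briefing.testExecution ?? 'none';
  return testExecution === 'full' || userRequestedRerun;
}

// Collect the test-phase evidence the auditor reviews instead of re-running.
function testEvidence(briefing) {
  return {
    commandsRun: briefing.commandsRun ?? [], // assumed to be a list
    criteriaRun: briefing.criteriaRun ?? [], // assumed to be a list
    lastStep: briefing.lastStep ?? null,
  };
}
```

Defaulting to `none` keeps the review cheap: the suite only runs when someone opted in.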
 
  If the scope is ambiguous or empty, **stop** and respond only with:
  ```
@@ -109,6 +112,13 @@ For each FAIL, note whether the affected code belongs to the **blocking scope**
  - **MEDIUM**: Relevant technical debt.
  - **LOW**: Non-blocking recommended improvement.
 
+ **Coherence vs. Correctness distinction** — **See `METHODOLOGY-CONTRACT.md §3C — 3C Criterion: Completeness, Correctness, Coherence`** for the authoritative definitions. Quick reference:
+ - A **coherence issue** is a deviation from established architectural patterns, naming conventions, or module boundaries — the code may work but does not fit the codebase structure (maps to WARNING or SUGGESTION in §3C).
+ - A **correctness issue** is a failure to satisfy a spec criterion (CA-XX) or to handle a rejection condition (CR-XX) — the code does not do what it is supposed to do (maps to CRITICAL or WARNING in §3C).
+ When classifying a FAIL, choose the type that most accurately reflects the root cause. A single finding may have both dimensions; report the dominant one and note the secondary.
+
+ If `codegraphAvailable: true` in the briefing: use `codegraph_impact` or `codegraph_callers` on `changedFiles` before giving verdict on coherence and blast-radius — this helps identify unintended breakage across module boundaries. Absence of CodeGraph does not block or produce a WARNING; the checklist verdict is unaffected.
+
  ### Step 4: Emit report + JSON block
 
  The verdict and `blockers` are determined **exclusively** by findings in the blocking scope:
@@ -178,6 +188,42 @@ Next step: [/refacil:archive | /refacil:verify]
 
  If the main agent indicates `mode: detailed`, after the concise report and BEFORE the JSON block, add a section per checklist with each item and its state `[PASS/FAIL/N/A]`.
 
+ ## CodeGraph integration (optional)
+
+ If `codegraphAvailable: true` was passed by the wrapper, CodeGraph MCP tools are available:
+ - `codegraph_search <symbol>` — find definitions and usages of a symbol
+ - `codegraph_callers <symbol>` — list all callers of a function or method
+ - `codegraph_callees <symbol>` — list all functions called by a given function
+ - `codegraph_context <file>` — get focused structural context for a task or area
+ - `codegraph_impact <symbol>` — estimate the blast radius of a change
+ - `codegraph_node <symbol>` — show a symbol's source, signature, or docstring
+ - `codegraph_explore <query>` — deep survey of an unfamiliar module or topic (token-heavy; use once per investigation, not repeatedly)
+ - `codegraph_files <path>` — list files indexed under a directory path
+
+ **When to use CodeGraph — scope is unknown (fan-out is high):**
+ - "Who calls X?" across a large or unfamiliar codebase
+ - Blast radius / impact of changing a symbol
+ - Disambiguating a symbol that appears in many files
+ - Tracing a cross-module or cross-package flow you don't know yet
+
+ **When to use Grep/Read directly — scope is already bounded:**
+ - You already know the file(s) to look at (≤ 3–4 files)
+ - Simple endpoint flow: one controller → one service method (1–2 Greps find everything)
+ - Literal text search: log messages, config keys, string constants
+ - Logic is inline in a single method — callees won't add information
+ - Question asks about file content, not symbol relationships
+
+ **Decision rule:** ask yourself — "Do I already know where to look?" If yes, start with Grep. If no (unknown codebase, cross-module, many candidates), start with CodeGraph.
+
+ **Fallback:** if CodeGraph returns empty results for something that should have callers, fall back to Grep. Common reasons:
+ - Framework-managed entry points (HTTP routes, queue consumers, scheduled jobs) — called by the runtime, not by code
+ - DI / IoC containers: NestJS (`@Injectable`), Spring (`@Autowired`), Angular (`@Component`), Laravel, etc.
+ - Dynamic dispatch: interfaces, abstract class overrides, plugin registries
+
+ When falling back, use Grep with the symbol name and log: `[CodeGraph fallback: <reason>]`.
+
+ **Do not use CodeGraph** when `codegraphAvailable: false` was passed by the wrapper.
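The decision rule above is guidance for an agent rather than code, but it can be made concrete with a small sketch. Everything here is hypothetical: `chooseTool`, the `knownFiles` count, and the coarse `question` labels are illustrative stand-ins for the agent's own judgment.

```javascript
// Pick a starting tool from the guidance above.
// `knownFiles`: how many files the agent already knows it needs to read.
// `question`: a coarse label for what is being asked (hypothetical values).
function chooseTool({ codegraphAvailable, knownFiles = 0, question }) {
  if (!codegraphAvailable) return 'grep';
  if (question === 'literal-text' || question === 'file-content') return 'grep';
  if (knownFiles > 0 && knownFiles <= 4) return 'grep'; // scope already bounded
  return 'codegraph'; // unknown scope, cross-module, or many candidates
}

console.log(chooseTool({ codegraphAvailable: true, knownFiles: 0, question: 'callers' }));
```

The threshold of four files mirrors the "≤ 3–4 files" bullet; it is a rule of thumb, not a hard limit.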
+
  ## Rules
 
  - Be constructive: not only say what fails, but how to fix it.
package/agents/debugger.md CHANGED
@@ -53,6 +53,44 @@ If you prefer to continue here, provide:
 
  ---
 
+ ## CodeGraph integration (optional — investigation mode only)
+
+ If `codegraphAvailable: true` was passed by the wrapper, CodeGraph MCP tools are available. In **mode=investigation** only:
+ - `codegraph_search <symbol>` — find definitions and usages of a symbol
+ - `codegraph_callers <symbol>` — list all callers of a function or method
+ - `codegraph_callees <symbol>` — list all functions called by a given function
+ - `codegraph_context <file>` — get focused structural context for a task or area
+ - `codegraph_impact <symbol>` — estimate the blast radius of a change
+ - `codegraph_node <symbol>` — show a symbol's source, signature, or docstring
+ - `codegraph_explore <query>` — deep survey of an unfamiliar module or topic (token-heavy; use once per investigation, not repeatedly)
+ - `codegraph_files <path>` — list files indexed under a directory path
+
+ **When to use CodeGraph — scope is unknown (fan-out is high):**
+ - "Who calls X?" across a large or unfamiliar codebase
+ - Blast radius / impact of changing a symbol
+ - Disambiguating a symbol that appears in many files
+ - Tracing a cross-module or cross-package flow you don't know yet
+
+ **When to use Grep/Read directly — scope is already bounded:**
+ - You already know the file(s) to look at (≤ 3–4 files)
+ - Simple endpoint flow: one controller → one service method (1–2 Greps find everything)
+ - Literal text search: log messages, config keys, string constants
+ - Logic is inline in a single method — callees won't add information
+ - Question asks about file content, not symbol relationships
+
+ **Decision rule:** ask yourself — "Do I already know where to look?" If yes, start with Grep. If no (unknown codebase, cross-module, many candidates), start with CodeGraph.
+
+ **Fallback:** if CodeGraph returns empty results for something that should have callers, fall back to Grep. Common reasons:
+ - Framework-managed entry points (HTTP routes, queue consumers, scheduled jobs) — called by the runtime, not by code
+ - DI / IoC containers: NestJS (`@Injectable`), Spring (`@Autowired`), Angular (`@Component`), Laravel, etc.
+ - Dynamic dispatch: interfaces, abstract class overrides, plugin registries
+
+ When falling back, use Grep with the symbol name and log: `[CodeGraph fallback: <reason>]`.
+
+ **Do not use CodeGraph** when `codegraphAvailable: false` was passed by the wrapper, or when you are in **mode=fix** (in fix mode the files to change are already known from the confirmed hypothesis — CodeGraph call-graph traversal adds no value and only burns tokens).
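For the debugger, the mode gate and the empty-result fallback might be combined as in the sketch below. `queryCallers` and `grepSymbol` are hypothetical callbacks standing in for the `codegraph_callers` tool and a Grep call; neither is a real API, and the fallback reason text is only an example.

```javascript
// Look up callers of a symbol, honoring the mode gate and the fallback rule.
function findCallers(symbol, options) {
  const { mode, codegraphAvailable, queryCallers, grepSymbol, log = console.log } = options;

  // CodeGraph is only used in investigation mode and only when available.
  const useCodegraph = codegraphAvailable === true && mode === 'investigation';
  if (!useCodegraph) return grepSymbol(symbol);

  const callers = queryCallers(symbol);
  if (callers.length > 0) return callers;

  // Empty result for something that should have callers: log and fall back.
  log('[CodeGraph fallback: no callers indexed, possible DI or runtime entry point]');
  return grepSymbol(symbol);
}
```

In fix mode the function never touches `queryCallers`, which reflects the note that call-graph traversal adds no value once the files to change are known.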
+
+ ---
+
  ## Investigation mode
 
  The main agent passes you: `mode: investigation` + bug `description`.
@@ -150,7 +188,9 @@ Each test must cover:
 
  Generate a descriptive folder name: `fix-[short-description]` (maximum 3-4 words kebab-case, e.g. `fix-session-timeout-redis`). **Do not use ticket IDs or branch name** — the name must be readable as input to `/refacil:explore`.
 
- Create `refacil-sdd/changes/<fix-name>/summary.md`:
+ Resolve the absolute project root before writing: run `git rev-parse --show-toplevel` and store as `<projectRoot>`. Write `summary.md` to `<projectRoot>/refacil-sdd/changes/<fix-name>/summary.md` — never use a relative path with the Write tool in a monorepo.
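A script doing the same resolution could look like the sketch below. The helper names `resolveProjectRoot` and `summaryPath` are hypothetical; the git command and the directory layout are the ones named in the instruction above.

```javascript
const { execFileSync } = require('child_process');
const path = require('path');

// Ask git for the repository root. Throws if `cwd` is not inside a git repo.
function resolveProjectRoot(cwd = process.cwd()) {
  return execFileSync('git', ['rev-parse', '--show-toplevel'], {
    cwd,
    encoding: 'utf8',
  }).trim();
}

// Build the absolute path of summary.md from an already-resolved root.
function summaryPath(projectRoot, fixName) {
  if (!path.isAbsolute(projectRoot)) {
    throw new Error('projectRoot must be an absolute path');
  }
  return path.join(projectRoot, 'refacil-sdd', 'changes', fixName, 'summary.md');
}
```

Resolving the root first means the result is the same whether the agent's working directory is the repo root or a nested package inside a monorepo.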
+
+ Create `<projectRoot>/refacil-sdd/changes/<fix-name>/summary.md`:
 
  ```markdown
  # Fix: [short description]