@nano-step/skill-manager 5.6.2 → 5.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/dist/utils.d.ts CHANGED
@@ -1,4 +1,4 @@
1
- export declare const MANAGER_VERSION = "5.6.1";
1
+ export declare const MANAGER_VERSION = "5.7.0";
2
2
  export interface SkillManifest {
3
3
  name: string;
4
4
  version: string;
package/dist/utils.js CHANGED
@@ -13,7 +13,7 @@ exports.writeText = writeText;
13
13
  const path_1 = __importDefault(require("path"));
14
14
  const os_1 = __importDefault(require("os"));
15
15
  const fs_extra_1 = __importDefault(require("fs-extra"));
16
- exports.MANAGER_VERSION = "5.6.1";
16
+ exports.MANAGER_VERSION = "5.7.0";
17
17
  async function detectOpenCodePaths() {
18
18
  const homeConfig = path_1.default.join(os_1.default.homedir(), ".config", "opencode");
19
19
  const cwd = process.cwd();
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@nano-step/skill-manager",
3
- "version": "5.6.2",
3
+ "version": "5.7.0",
4
4
  "description": "CLI tool that installs and manages AI agent skills, MCP tool routing, and workflow configurations.",
5
5
  "main": "dist/index.js",
6
6
  "types": "dist/index.d.ts",
@@ -16,8 +16,8 @@
16
16
  },
17
17
  {
18
18
  "name": "pr-code-reviewer",
19
- "version": "7.1.0",
20
- "description": "PR review with MANDATORY cross-repo consumer search, nano-brain code intelligence, and better-context structural analysis. READ-ONLY — no comments, pushes, or code fixes."
19
+ "version": "3.3.0",
20
+ "description": "PR review with 4 parallel subagents, stack-aware setup wizard (Nuxt/Next/React/Express/NestJS/TypeORM/Prisma), AGENTS.md knowledge base integration, cross-repo tracing, verification pipeline, and confidence scoring. READ-ONLY — no comments, pushes, or code fixes."
21
21
  },
22
22
  {
23
23
  "name": "deep-design",
@@ -1,29 +1,71 @@
1
1
  # PR Code Reviewer Changelog
2
2
 
3
- ## v3.1.0 (2026-03-16) - Verification Pipeline (False Positive Reduction)
3
+ ## v3.3.0 (2026-03-24) - Stack-Aware Setup Wizard + Token Efficiency
4
4
 
5
- **FIX**: ~50% of critical/warning findings were false positives because subagents inspected code locally without verifying the full execution context. This release adds a 3-layer verification pipeline that reduces false positives to ~10%.
5
+ ### Added
6
+ - **Phase -2: Setup Check** — runs before Phase -1 on first use (no config) or `/review --setup`
7
+ - Interactive wizard: asks 5 questions (frontend, backend, ORM, language, state management)
8
+ - Writes `.opencode/code-reviewer.json` with `stack` field
9
+ - Shows confirmation with which framework rule files will be used
10
+ - **3 new framework rule files**: `nextjs.md`, `react.md`, `prisma.md`
11
+ - **`references/setup-wizard.md`** — full wizard flow, question text, stack→file mapping
12
+ - **`stack` field** added to `assets/config.json` schema
13
+
14
+ ### Fixed
15
+ - **Token efficiency**: SKILL.md now has an explicit on-demand loading table — each reference file is read only when its phase runs, not all at startup. Prevents 79k+ token bloat.
16
+ - **Subagent 3 (LIBRARIAN) missing `TRACED_DEPENDENCIES`** — added to prompt template
17
+ - **Stale framework reference in LIBRARIAN**: "Next.js, React, Express" → "check ## FRAMEWORK RULES above for project-specific patterns"
18
+ - **`database.md` checklist missing from SKILL.md reference table** — added
19
+ - **`setup-wizard.md` added to SKILL.md reference table**
20
+ - **review-checklist.md missing Phase 4.5 and 4.6** — both added with full step-by-step items
21
+
22
+ ### Changed
23
+ - Framework rules no longer all loaded at once — only stack-matching files from config
24
+ - `$FRAMEWORK_RULES` variable replaces hardcoded framework mentions in subagent prompts
25
+ - All 4 subagent prompts now include `## FRAMEWORK RULES` section
26
+
27
+ ---
28
+
29
+ ## v3.2.0 (2026-03-14) - Consensus Scoring + Evidence Enforcement
30
+
31
+ ### Added
32
+ - **Consensus scoring in Phase 4**: findings flagged by 2+ agents → confidence boosted to `high`
33
+ - **Auto-downgrade rule**: single agent + missing evidence + critical/warning → auto-downgraded to `suggestion`
34
+ - **Phase 4.6: Result Confidence Assessment** — scores review quality 0–100 from accuracy, consensus, evidence rates
35
+ - **Phase 4.5: Orchestrator Verification Spot-Check** — orchestrator reads cited code to catch surviving false positives
36
+ - `evidence` field is REQUIRED for all critical/warning findings (subagent prompts updated)
37
+ - `confidence` field added: `high` | `medium` | `low`
38
+ - `trace_path` optional field added for verification audit trail
39
+
40
+ ### Changed
41
+ - Phase 4 now has two sub-phases: 4.5 (verification) and 4.6 (confidence)
42
+ - Report TL;DR now includes Result Confidence score
43
+
44
+ ---
45
+
46
+ ## v3.1.0 (2026-03-12) - Linear Ambiguity Detection + Premise Check
6
47
 
7
48
  ### Added
8
- - **VERIFICATION PROTOCOL** in all 4 subagent prompts mandatory 4-step process (IDENTIFY TRACE VERIFY → DECIDE) before reporting any critical/warning finding
9
- - **Evidence field** (`evidence`, `confidence`, `trace_path`) in finding schema subagents must cite concrete file:line proof for critical/warning findings
10
- - **Evidence examples** (good vs bad) in filtering rulesteaches subagents what constitutes valid evidence
11
- - **Consensus scoring** in Phase 4 tracks how many subagents independently flagged the same issue; single-agent findings without evidence are auto-downgraded
12
- - **Phase 4.5: Orchestrator Verification Spot-Check** orchestrator reads actual code at cited evidence locations to verify critical/warning findings (30s timeout per finding)
13
- - **`phase-4.5-verification.json` checkpoint** — tracks verification results for resumability
14
- - **Verification metadata in report TL;DR** — shows how many findings were dropped as false positives
49
+ - **Phase 1.5 Ambiguity Detection**: when acceptance criteria are vague, flag as warning and identify multiple interpretations
50
+ - **DELETION classification**: explicit new change type (distinct from REFACTOR) requiring Premise Check
51
+ - **Premise Check in Phase 2**: for DELETION changesanswers why code existed, whether removal is correct
52
+ - **Premise Check section in report**only shown for DELETION PRs
53
+ - **Cross-repo API tracing in Phase 2**: trace hardcoded frontend values vs backend config (e.g., cache TTLs)
54
+
55
+ ---
56
+
57
+ ## v3.0.0 (2026-03-10) - Unified Skill Rename + Phase -1 Resume
58
+
59
+ ### Added
60
+ - **Phase -1: Resume Detection** — checks for existing checkpoints before starting
61
+ - Checkpoint manifest schema with `head_sha` validation (stale checkpoint detection)
62
+ - Skill renamed from project-level name to `pr-code-reviewer` for clarity
15
63
 
16
64
  ### Changed
17
- - Phase 4 now includes consensus scoring step after deduplication
18
- - Phase 4 checkpoint `next_phase` updated from `5` to `4.5`
19
- - Manifest schema includes Phase 4.5 in `phase_status`
20
- - Finding schema in SKILL.md Phase 3 updated to include new fields
21
- - All 4 subagent return format sections updated with `evidence`, `confidence`, `trace_path`
22
-
23
- ### How It Works
24
- 1. **Layer 1 (Subagent self-verification)**: Subagents trace error handling to HTTP boundary, null safety to data source, framework patterns to usage context — before reporting
25
- 2. **Layer 2 (Orchestrator spot-check)**: Orchestrator reads cited files and verifies evidence claims for critical/warning findings
26
- 3. **Layer 3 (Consensus scoring)**: Multi-agent agreement boosts confidence; single-agent findings without evidence are downgraded
65
+ - SKILL.md restructured: inline details moved to reference files (`subagent-prompts.md`, `report-template.md`, etc.)
66
+ - Version reset to 3.x to reflect this is the unified project + global skill
67
+
68
+ ---
27
69
 
28
70
  ## v2.7.0 (2026-03-09) - Clone to Temp Folder
29
71
 
@@ -1,31 +1,24 @@
1
1
  ---
2
2
  name: pr-code-reviewer
3
- description: "Comprehensive code review with 4 parallel subagents, smart tracing, iterative refinement, workspace-aware configuration, and GitHub Copilot-style PR summaries."
3
+ description: "Review pull requests and staged changes for bugs, security issues, and code quality. Use this skill whenever the user mentions: review PR, code review, check this PR, review my changes, /review, PR #123, look at this diff, is this safe to merge, or provides a GitHub PR URL. Also triggers on: 'what do you think of these changes', 'review --staged', 'check my code before merge'."
4
4
  compatibility: "OpenCode with nano-brain"
5
5
  metadata:
6
6
  author: Sisyphus
7
- version: "3.1.0"
7
+ version: "3.3.0"
8
8
  severity-levels: ["critical", "warning", "improvement", "suggestion"]
9
9
  ---
10
10
 
11
11
  # PR Code Reviewer
12
12
 
13
- **Version**: 3.1.0 | **Architecture**: 4 Parallel Subagents + Verification Pipeline | **Memory**: nano-brain
13
+ **Version**: 3.3.0 | **Architecture**: 4 Parallel Subagents + Verification Pipeline + Confidence Scoring | **Memory**: nano-brain
14
14
 
15
15
  ## Overview
16
16
 
17
17
  Comprehensive PR reviewer: gathers full context, applies smart tracing by change type, runs four specialized subagents in parallel, iteratively refines findings, and produces a **short, actionable report** — only what matters. Also suggests code improvements when opportunities exist.
18
18
 
19
- ### NO PHASE SKIPPING (ABSOLUTE RULE)
19
+ ### Why Every Phase Runs
20
20
 
21
- **Every phase MUST be executed for every review, regardless of PR size.** A 1-line deletion can hide a critical logic bug. A "trivial" change can break cross-repo contracts. The phases exist because each one catches different classes of issues.
22
-
23
- - **1-line change?** → Run all phases.
24
- - **Deletion-only PR?** → Run all phases. Deletions are MORE dangerous, not less.
25
- - **"Obviously safe"?** → Run all phases. Your confidence is the risk.
26
- - **Only STYLE changes?** → Run all phases. Verify no hidden logic changes.
27
-
28
- **The ONLY exception**: Phase 0 (clone) is skipped for `--staged` local reviews. Every other phase runs unconditionally.
21
+ Each phase catches a different class of issue. A 1-line deletion can hide a critical logic bug that only cross-repo tracing (Phase 2) would reveal. A "trivial" style change can mask a hidden logic change that only subagent consensus (Phase 3-4) would catch. The only exception: Phase 0 (clone) is skipped for `--staged` local reviews.
29
22
 
30
23
  ## Report Philosophy
31
24
 
@@ -40,6 +33,26 @@ Comprehensive PR reviewer: gathers full context, applies smart tracing by change
40
33
 
41
34
  **Filtering rule**: If a finding wouldn't make a senior engineer stop and think, drop it.
42
35
 
36
+ ## Token Efficiency — Read Files On-Demand
37
+
38
+ **Do NOT load all reference files upfront.** Read each file only when the relevant phase runs:
39
+
40
+ | Phase | Read at start of phase |
41
+ |-------|------------------------|
42
+ | Phase -2 | `references/setup-wizard.md` (only if no config) |
43
+ | Phase 1 | `{workspace_root}/AGENTS.md` + `.agents/_repos/{repo}.md` + `.agents/_domains/{domain}.md` |
44
+ | Phase 1 | `references/nano-brain-integration.md` |
45
+ | Phase 2 | Domain checklist for changed file types (one file only) |
46
+ | Phase 3 | `references/subagent-prompts.md` + stack framework rules (from config) |
47
+ | Phase 4 | `references/confidence-scoring.md` |
48
+ | Phase 4.5 | `references/verification-protocol.md` |
49
+ | Phase 5 | `references/report-template.md` |
50
+ | Phase 5.5 | `references/nano-brain-integration.md` (save section only) |
51
+
52
+ Framework rules: load ONLY the files matching `stack` in `.opencode/code-reviewer.json`. Never load all framework rules.
53
+
54
+ ---
55
+
43
56
  ## Prerequisites
44
57
 
45
58
  ### GitHub MCP Server (Required for PR Reviews)
@@ -98,69 +111,40 @@ Check for config at `.opencode/code-reviewer.json`. **Full example**: [config.js
98
111
 
99
112
  ## Checkpoint System
100
113
 
101
- Reviews are resumable via checkpoints saved at each phase. If the agent crashes mid-review, you can resume from the last completed phase instead of starting over.
102
-
103
- ### Checkpoint Directory
104
-
105
- **For PR Reviews:** `$REVIEW_DIR/.checkpoints/` (inside the temp clone directory)
106
- **For Local Reviews (`--staged`):** `{current_working_directory}/.checkpoints/`
107
-
108
- Checkpoints are automatically removed when the clone directory is deleted (Phase 6 cleanup).
109
-
110
- ### Checkpoint Files
111
-
112
- | File | Content | Updated When |
113
- |------|---------|--------------|
114
- | `manifest.json` | Master state tracker | After every phase |
115
- | `phase-0-clone.json` | Clone metadata (clone_dir, branches, head_sha, files_changed) | Phase 0 |
116
- | `phase-1-context.json` | PR metadata, file classifications | Phase 1 |
117
- | `phase-1.5-linear.json` | Linear ticket context, acceptance criteria | Phase 1.5 |
118
- | `phase-2-tracing.json` | Smart tracing results per file | Phase 2 |
119
- | `phase-2.5-summary.json` | PR summary text | Phase 2.5 |
120
- | `phase-3-subagents.json` | Subagent findings (updated after EACH subagent completes) | Phase 3 |
121
- | `phase-4-refined.json` | Deduplicated/filtered findings | Phase 4 |
122
- | `phase-4.5-verification.json` | Verification results (verified/false/unverifiable counts, dropped/downgraded findings) | Phase 4.5 |
123
- | `phase-5-report.md` | Copy of final report | Phase 5 |
124
-
125
- ### Manifest Schema
126
-
127
- ```json
128
- {
129
- "version": "1.0",
130
- "pr": { "repo": "owner/repo", "number": 123, "url": "..." },
131
- "clone_dir": "/tmp/pr-review-...",
132
- "started_at": "ISO-8601",
133
- "last_updated": "ISO-8601",
134
- "completed_phase": 2,
135
- "next_phase": 2.5,
136
- "phase_status": {
137
- "0": "complete", "1": "complete", "1.5": "complete",
138
- "2": "complete", "2.5": "pending", "3": "pending",
139
- "4": "pending", "4.5": "pending", "5": "pending"
140
- },
141
- "subagent_status": {
142
- "explore": "pending", "oracle": "pending",
143
- "librarian": "pending", "general": "pending"
144
- }
145
- }
146
- ```
114
+ Reviews are resumable via checkpoints saved at each phase. If the agent crashes mid-review, resume from the last completed phase.
115
+
116
+ **Full details**: [checkpoint-system.md](references/checkpoint-system.md) — manifest schema, checkpoint files, Phase 3 special handling.
147
117
 
148
- ### Phase 3 Special Handling
118
+ ## Workflow
149
119
 
150
- Phase 3 runs 4 parallel subagents. After **EACH** subagent completes:
151
- 1. Update `phase-3-subagents.json` with that subagent's findings and status
152
- 2. Update `manifest.json` subagent_status to `"complete"` for that subagent
153
- 3. On resume: only run subagents with status != `"complete"`
120
+ **Full checklist**: [review-checklist.md](checklists/review-checklist.md) use this to track every step.
154
121
 
155
- This allows resuming mid-Phase-3 if only some subagents completed before a crash.
122
+ ### Phase -2: Setup Check (First Run Detection)
156
123
 
157
- ## Workflow (CRITICAL Follow Exactly, NEVER Skip Phases)
124
+ Before anything else, check if `.opencode/code-reviewer.json` exists.
158
125
 
159
- **Full checklist**: [review-checklist.md](checklists/review-checklist.md) use this to track every step.
126
+ **If it exists**: read `stack` field → load only the matching framework rule files (listed in setup-wizard.md mapping table). Continue to Phase -1.
160
127
 
161
- **ABSOLUTE RULE: Execute every phase in order. No phase may be skipped, shortened, or "optimized away" based on PR size, change type, or perceived simplicity. A 1-line deletion PR gets the same phase coverage as a 500-line feature PR.**
128
+ **If it doesn't exist** (or user ran `/review --setup`):
129
+ 1. Read `references/setup-wizard.md` for the full wizard flow
130
+ 2. Ask the 5 setup questions interactively
131
+ 3. Write `.opencode/code-reviewer.json` with `stack` field filled in
132
+ 4. Show confirmation with which framework rule files will be used
133
+ 5. If called as `/review --setup` (not a real review), stop here. Otherwise continue.
162
134
 
163
- ### Phase -1: Resume Detection (MANDATORY Check Before Starting)
135
+ **Stack framework rules mapping** (also in setup-wizard.md):
136
+ - `frontend: nuxt/vue` → `framework-rules/vue-nuxt.md`
137
+ - `frontend: nextjs` → `framework-rules/nextjs.md`
138
+ - `frontend: react` → `framework-rules/react.md`
139
+ - `backend: express` → `framework-rules/express.md`
140
+ - `backend: nestjs` → `framework-rules/nestjs.md`
141
+ - `orm: typeorm` → `framework-rules/typeorm.md`
142
+ - `orm: prisma` → `framework-rules/prisma.md`
143
+ - `language: typescript*` → `framework-rules/typescript.md`
144
+
145
+ Store the resolved `$FRAMEWORK_RULES` content (concatenated) for Phase 3. If multiple match, concatenate them.
146
+
147
+ ### Phase -1: Resume Detection (Check Before Starting)
164
148
 
165
149
  Before starting a new review, check for existing checkpoints to resume interrupted reviews.
166
150
 
@@ -204,7 +188,7 @@ Before starting a new review, check for existing checkpoints to resume interrupt
204
188
 
205
189
  **For Local Reviews:** Checkpoint directory is `.checkpoints/` in current working directory. No SHA validation needed (working directory changes are expected).
206
190
 
207
- ### Phase 0: Repository Preparation (MANDATORY for PR Reviews)
191
+ ### Phase 0: Repository Preparation (PR Reviews Only)
208
192
 
209
193
  **Why**: Your local repo may be on any branch, have uncommitted changes, or not exist at all. Cloning to a temp folder ensures:
210
194
  - You always read the **actual PR branch code**, not whatever is checked out locally
@@ -225,14 +209,14 @@ Before starting a new review, check for existing checkpoints to resume interrupt
225
209
  ```
226
210
  Format: `/tmp/pr-review-{repo}-{pr_number}-{unix_timestamp}`
227
211
 
228
- 3. **Clone the repo** (shallow clone for speed):
212
+ 3. **Clone the repo** (minimal shallow clone only latest commit):
229
213
  ```bash
230
- git clone --depth=50 --branch="${head_branch}" \
214
+ git clone --depth=1 --branch="${head_branch}" \
231
215
  "https://github.com/${owner}/${repo}.git" "$REVIEW_DIR"
232
216
  ```
233
217
  If the branch doesn't exist on remote (force-pushed/deleted), fall back to:
234
218
  ```bash
235
- git clone --depth=50 "https://github.com/${owner}/${repo}.git" "$REVIEW_DIR"
219
+ git clone --depth=1 "https://github.com/${owner}/${repo}.git" "$REVIEW_DIR"
236
220
  cd "$REVIEW_DIR" && gh pr checkout ${pr_number}
237
221
  ```
238
222
 
@@ -253,12 +237,30 @@ Before starting a new review, check for existing checkpoints to resume interrupt
253
237
 
254
238
  **For Local Reviews (`--staged`):** Skip Phase 0 — use current working directory.
255
239
 
256
- **CRITICAL**: Store `$REVIEW_DIR` path. Every file read, grep, and subagent prompt MUST reference this path, not the original workspace repo.
240
+ Store `$REVIEW_DIR` path every file read, grep, and subagent prompt references this path, not the original workspace repo.
257
241
 
258
242
  **Checkpoint:** Save clone metadata to `.checkpoints/phase-0-clone.json` (clone_dir, branches, head_sha, files_changed) and create `manifest.json` with `completed_phase: 0`, `next_phase: 1`, `phase_status: {"0": "complete", ...}`.
259
243
 
260
244
  ### Phase 1: Context Gathering
261
245
 
246
+ **Step 0 — Load agent knowledge base (MANDATORY if configured):**
247
+
248
+ Read `agents` config from `.opencode/code-reviewer.json`. If `has_agents_md: true`:
249
+
250
+ 1. **Read `{workspace_root}/AGENTS.md`** — keyword→domain→repo mapping table. Use this to identify which domain the PR's repo belongs to.
251
+
252
+ 2. **Read `{workspace_root}/.agents/_repos/{repo-name}.md`** (if `has_repos_dir: true`) — repo-specific context: framework, port, key files, known issues, cross-repo relationships. `repo-name` comes from the PR's `owner/repo` (e.g., PR from `tradeit-backend` → read `.agents/_repos/tradeit-backend.md`).
253
+
254
+ 3. **Read `{workspace_root}/.agents/_domains/{domain}.md`** (if `has_domains_dir: true`) — domain context for the repo's domain. Domain identified from AGENTS.md mapping (e.g., `tradeit-backend` → `trading-core` domain → read `.agents/_domains/trading-core.md`).
255
+
256
+ 4. **For cross-repo tracing (Phase 2)**: if the PR touches API contracts or shared data, also read:
257
+ - `.agents/_indexes/by-database.md` — which repo owns which DB table
258
+ - `.agents/_indexes/by-data-source.md` — which repo consumes which external API
259
+
260
+ Do NOT read all repo/domain files — only the ones relevant to the PR being reviewed. Store combined result as `$AGENTS_CONTEXT`.
261
+
262
+ If `agents` config is missing or files not found: continue without it, no error.
263
+
262
264
  **For PR Reviews (GitHub MCP):**
263
265
  1. `get_pull_request` → title, description, author, base branch
264
266
  2. `get_pull_request_files` → changed files with diff stats
@@ -276,7 +278,7 @@ Before starting a new review, check for existing checkpoints to resume interrupt
276
278
  - **REFACTOR**: Structure changes, no logic change → MEDIUM TRACE
277
279
  - **NEW**: New files → FULL REVIEW
278
280
 
279
- **DELETION classification rules**: Any PR that removes user-facing behavior, error messages, validation logic, UI elements, or API responses MUST be classified as DELETION, not STYLE or REFACTOR. Deletions feel safe but can hide regressions — they require the same depth as LOGIC changes plus a Premise Check.
281
+ **DELETION classification**: Any PR that removes user-facing behavior, error messages, validation logic, UI elements, or API responses is classified as DELETION, not STYLE or REFACTOR. Deletions feel safe but can hide regressions — they require the same depth as LOGIC changes plus a Premise Check.
280
282
  4. Gather full context per changed file from `$REVIEW_DIR`: callers/callees, tests, types, usage sites
281
283
  5. **Query nano-brain** for project memory on each changed module — [query patterns](references/nano-brain-integration.md#phase-1-memory-queries)
282
284
  6. **Fetch Linear ticket context** (if ticket ID found) — see Phase 1.5
@@ -301,8 +303,8 @@ If a Linear ticket ID was extracted from the branch name, PR description, or PR
301
303
  - Flag in report if PR appears to miss acceptance criteria items
302
304
  - Include ticket title + status in report header
303
305
 
304
- **Ambiguity Detection (MANDATORY):**
305
- If acceptance criteria are vague or open to multiple interpretations (e.g., "fix it", "make it correct", "improve this", "need to fix to make it correct"), you MUST:
306
+ **Ambiguity Detection:**
307
+ If acceptance criteria are vague or open to multiple interpretations (e.g., "fix it", "make it correct", "improve this", "need to fix to make it correct"):
306
308
  1. Flag it as a **warning** in the report: *"Acceptance criteria are ambiguous — PR may not match intended fix."*
307
309
  2. Identify the multiple interpretations (e.g., "remove the feature" vs "fix the condition")
308
310
  3. Evaluate which interpretation the PR implements
@@ -330,7 +332,7 @@ If acceptance criteria are vague or open to multiple interpretations (e.g., "fix
330
332
  - Are there related components (backend config, i18n keys, API responses) that depend on this code existing?
331
333
  7. Document the Premise Check answers — they feed into the report (Phase 5)
332
334
 
333
- **Cross-Repo API Tracing (MANDATORY for multi-repo workspaces):**
335
+ **Cross-Repo API Tracing** (for multi-repo workspaces):
334
336
  For any changed code that **consumes data from an API** (fetches, reads responses, uses values from backend):
335
337
  1. Identify the API endpoint being called (e.g., `/api/v2/inventory/my/data`)
336
338
  2. Find the backend repo that serves this endpoint (use workspace AGENTS.md domain mappings)
@@ -346,6 +348,13 @@ For any changed code that **consumes data from an API** (fetches, reads response
346
348
 
347
349
  **REFACTOR changes:** Verify behavior preservation, check all usages still work.
348
350
 
351
+ **Domain-Specific Checklists**: Based on the file types in the PR, read the relevant checklist for domain-specific review criteria:
352
+ - Vue/Nuxt frontend files → [frontend-vue-nuxt.md](checklists/frontend-vue-nuxt.md)
353
+ - Express/Node backend → [backend-express.md](checklists/backend-express.md)
354
+ - Database migrations/queries → [database.md](checklists/database.md)
355
+ - CI/CD configs → [ci-cd.md](checklists/ci-cd.md)
356
+ - Consumer search patterns → [consumer-search-matrix.md](checklists/consumer-search-matrix.md)
357
+
349
358
  **Checkpoint:** Save results to `.checkpoints/phase-2-tracing.json` (tracing results per file, callers/callees, test coverage, data flow, premise check answers, cross-repo tracing) and update `manifest.json` (`completed_phase: 2`, `next_phase: 2.5`).
350
359
 
351
360
  ### Phase 2.5: PR Summary Generation (REQUIRED)
@@ -361,11 +370,11 @@ Before launching subagents, generate a GitHub Copilot-style PR summary. Reviewer
361
370
 
362
371
  **Checkpoint:** Save results to `.checkpoints/phase-2.5-summary.json` (PR summary text, key changes, file summaries) and update `manifest.json` (`completed_phase: 2.5`, `next_phase: 3`).
363
372
 
364
- ### Phase 3: Parallel Subagent Execution (NEVER SKIP)
373
+ ### Phase 3: Parallel Subagent Execution
365
374
 
366
- **ALL 4 subagents MUST run for EVERY review. No exceptions. No "this PR is too small." No "this is just a deletion." The phases exist because each subagent catches different classes of issues that you, the orchestrator, will miss.**
375
+ Launch all 4 subagents simultaneously with `run_in_background: true`. Each agent catches issues the others miss the quality agent finds duplication the security agent ignores, the librarian catches framework anti-patterns the integration agent overlooks. Include PR Summary, nano-brain memory, Premise Check results (if DELETION), cross-repo tracing results, `$REVIEW_DIR` path, and `$FRAMEWORK_RULES` (from Phase -2) in each prompt.
367
376
 
368
- Launch ALL 4 subagents simultaneously with `run_in_background: true`. Include PR Summary, nano-brain memory, Premise Check results (if DELETION), cross-repo tracing results, and **`$REVIEW_DIR` path** in each prompt so subagents read from the correct clone.
377
+ Read `references/subagent-prompts.md` now for the full prompt templates.
369
378
 
370
379
  | # | Agent | Type | Focus |
371
380
  |---|-------|------|-------|
@@ -392,7 +401,7 @@ New fields (v3.1): `evidence` (REQUIRED for critical/warning — concrete proof
392
401
  - `consensus_count >= 2` → boost confidence to `high` (multiple agents agree)
393
402
  - `consensus_count == 1` + non-empty `evidence` with file:line references → keep original severity and confidence
394
403
  - `consensus_count == 1` + empty/missing `evidence` + severity `critical` or `warning` → **AUTO-DOWNGRADE to `suggestion`**
395
- 3. **Severity Filter** (CRITICAL — this makes reports short):
404
+ 3. **Severity Filter** (keeps reports short):
396
405
  - `critical` + `warning` → **KEEP with full detail**
397
406
  - `improvement` → **KEEP as one-liner** with optional code suggestion
398
407
  - `suggestion` → **COUNT only** — report total number, omit individual details unless < 3 total
@@ -405,44 +414,33 @@ New fields (v3.1): `evidence` (REQUIRED for critical/warning — concrete proof
405
414
 
406
415
  ### Phase 4.5: Orchestrator Verification Spot-Check (Critical + Warning Only)
407
416
 
408
- The orchestrator MUST verify each finding with severity `critical` or `warning` by reading the actual code at the cited evidence locations in `$REVIEW_DIR`. This catches false positives that survived subagent self-verification.
409
-
410
- **If no critical/warning findings exist after Phase 4:** Skip Phase 4.5 (mark as complete immediately, proceed to Phase 5).
411
-
412
- **For each critical/warning finding:**
413
-
414
- 1. **Parse the evidence field** extract file:line references cited by the subagent
415
- 2. **Read the cited files** from `$REVIEW_DIR` at the referenced line numbers
416
- 3. **Verify the claim** based on finding category:
417
- - **Error handling claims**: Read the controller/route handler that calls this code path. If a try-catch exists at the HTTP boundary → finding is FALSE
418
- - **Null safety claims**: Read the data source (SQL query, API contract). If the source guarantees non-null (PK, NOT NULL, JOIN constraint) finding is FALSE
419
- - **Logic error claims**: Trace the cited execution path. If no realistic input triggers the bug → finding is FALSE
420
- - **Framework pattern claims**: Check if the specific usage context makes the pattern safe → finding is FALSE
421
- 4. **Mark verification status**:
422
- - `verified: true` evidence checks out, issue is real → **KEEP** in report
423
- - `verified: false` evidence is wrong (e.g., try-catch DOES exist) **DROP** from report
424
- - `verified: "unverifiable"` can't confirm within timeout **DOWNGRADE** to `suggestion`
425
-
426
- **Timeout**: 30 seconds per finding. If verification takes longer, mark as `unverifiable` and move on.
427
-
428
- **Checkpoint:** Save results to `.checkpoints/phase-4.5-verification.json`:
429
- ```json
430
- {
431
- "findings_checked": 5,
432
- "verified_true": 3,
433
- "verified_false": 1,
434
- "unverifiable": 1,
435
- "dropped_findings": [{ "original": { "file": "...", "line": 42, "message": "..." }, "reason": "try-catch exists at controller.js:28" }],
436
- "downgraded_findings": [{ "original": { "file": "...", "line": 99, "message": "..." }, "new_severity": "suggestion" }]
437
- }
438
- ```
439
- Update `manifest.json` (`completed_phase: 4.5`, `next_phase: 5`).
417
+ Verify each critical/warning finding by reading the actual code at cited evidence locations. This catches false positives that survived subagent self-verification.
418
+
419
+ If no critical/warning findings exist after Phase 4, skip to Phase 4.6.
420
+
421
+ **Full protocol**: [verification-protocol.md](references/verification-protocol.md) — category-specific verification rules, timeout policy, checkpoint schema.
422
+
423
+ **Checkpoint:** Save to `.checkpoints/phase-4.5-verification.json`. Update manifest (`completed_phase: 4.5`, `next_phase: 4.6`).
424
+
425
+ ### Phase 4.6: Result Confidence Assessment
426
+
427
+ Score how confident we are in the review's findings are they correct and complete? Computed from accuracy rate (40%), consensus rate (30%), and evidence rate (30%).
428
+
429
+ | Score | Label | Gate Action |
430
+ |-------|-------|-------------|
431
+ | 80–100 | 🟢 High | Proceed normally |
432
+ | 60–79 | 🟡 Medium | Add warning: "Some findings may be inaccurate" |
433
+ | < 60 | 🔴 Low | Add warning: "Low confidence — manual review recommended" |
434
+
435
+ **Full scoring details**: [confidence-scoring.md](references/confidence-scoring.md) formula, per-finding confidence levels, special cases, checkpoint schema.
436
+
437
+ **Checkpoint:** Save to `.checkpoints/phase-4.6-confidence.json`. Update manifest (`completed_phase: 4.6`, `next_phase: 5`).
440
438
 
441
439
  ### Phase 5: Report Generation
442
440
  Save to `.opencode/reviews/{type}_{identifier}_{date}.md`. Create directory if needed.
443
441
 
444
442
  **Report structure** (compact — omit empty sections):
445
- 1. **TL;DR** — verdict (APPROVE/REQUEST CHANGES/COMMENT) + issue counts in 3 lines
443
+ 1. **TL;DR** — verdict (APPROVE/REQUEST CHANGES/COMMENT) + issue counts + Result Confidence score from Phase 4.6
446
444
  2. **PR Overview** — what this PR does (1-3 sentences) + key changes by category
447
445
  3. **Ticket Alignment** — acceptance criteria coverage check (only if Linear ticket found). Flag ambiguous criteria.
448
446
  4. **Premise Check** — only for DELETION changes: why the code existed, whether removal is correct vs fixing the logic, cross-repo implications
@@ -466,9 +464,9 @@ Save key findings for future sessions. Includes PR number, title, date, files, c
466
464
 
467
465
  **Checkpoint:** Update `manifest.json` (`completed_phase: 5.5`, `next_phase: 6`).
468
466
 
469
- ### Phase 6: Cleanup (MANDATORY for PR Reviews)
467
+ ### Phase 6: Cleanup (PR Reviews Only)
470
468
 
471
- **NEVER delete the temp folder without asking the user.** The user may want to inspect files, run tests, or review multiple PRs.
469
+ Always ask before deleting the temp folder the user may want to inspect files, run tests, or review multiple PRs.
472
470
 
473
471
  1. **Show the temp folder path and size**:
474
472
  ```bash
@@ -495,7 +493,7 @@ Save key findings for future sessions. Includes PR number, title, date, files, c
495
493
 
496
494
  **Note:** Checkpoints are automatically removed when the clone directory is deleted. For local reviews (`--staged`), checkpoints remain in `.checkpoints/` until manually deleted.
497
495
 
498
- ### User Notification (CRITICAL)
496
+ ### User Notification
499
497
 
500
498
  After review completes, ALWAYS inform the user:
501
499
 
@@ -524,7 +522,16 @@ Summary:
524
522
 
525
523
  | Document | Content | When to Read |
526
524
  |----------|---------|--------------|
525
+ | [setup-wizard.md](references/setup-wizard.md) | Stack setup wizard — questions, mapping, config schema | Phase -2 (first run) |
527
526
  | [subagent-prompts.md](references/subagent-prompts.md) | Full prompt templates for all 4 subagents | Phase 3 execution |
528
527
  | [report-template.md](references/report-template.md) | Report format, PR summary guidelines, pseudocode | Phase 2.5 + Phase 5 |
529
528
  | [nano-brain-integration.md](references/nano-brain-integration.md) | Tool reference, query patterns, save patterns | Phase 1, 2, 5.5 |
530
- | [config.json](assets/config.json) | Full workspace + output + trace config | Setup |
529
+ | [config.json](assets/config.json) | Full workspace + output + trace + stack config | Setup |
530
+ | [security-patterns.md](references/security-patterns.md) | OWASP patterns, auth checks | Phase 3 (Security agent) |
531
+ | [quality-patterns.md](references/quality-patterns.md) | Code quality anti-patterns | Phase 3 (Quality agent) |
532
+ | [performance-patterns.md](references/performance-patterns.md) | N+1, caching, allocation patterns | Phase 3 (Integration agent) |
533
+ | [framework-rules/](references/framework-rules/) | vue-nuxt, express, nestjs, typeorm, typescript, nextjs, react, prisma | Phase -2 (load only stack-matching files) |
534
+ | [checkpoint-system.md](references/checkpoint-system.md) | Manifest schema, checkpoint files, resume logic | Phase -1 (resume detection) |
535
+ | [verification-protocol.md](references/verification-protocol.md) | Category-specific verification rules | Phase 4.5 |
536
+ | [confidence-scoring.md](references/confidence-scoring.md) | Confidence formula, thresholds, display format | Phase 4.6 |
537
+ | [checklists/database.md](checklists/database.md) | MySQL/Redis patterns, transactions, migrations | Phase 2 (DB file changes) |
@@ -1,5 +1,18 @@
1
1
  {
2
- "version": "2.8.0",
2
+ "version": "3.3.0",
3
+ "stack": {
4
+ "frontend": "nuxt",
5
+ "backend": "express",
6
+ "orm": "typeorm",
7
+ "language": "typescript",
8
+ "state": "pinia"
9
+ },
10
+ "agents": {
11
+ "workspace_root": "/path/to/workspace",
12
+ "has_agents_md": true,
13
+ "has_repos_dir": true,
14
+ "has_domains_dir": true
15
+ },
3
16
  "workspace": {
4
17
  "name": "my-project",
5
18
  "github": {
@@ -2,6 +2,15 @@
2
2
 
3
3
  Use this checklist for every PR review. Check off each item as you complete it.
4
4
 
5
+ ## Setup Check (Phase -2)
6
+
7
+ - [ ] Check if `.opencode/code-reviewer.json` exists
8
+ - [ ] If exists: read `stack` field → resolve framework rule files → store as `$FRAMEWORK_RULES`
9
+ - [ ] If exists: read `agents` field → validate workspace_root, AGENTS.md, .agents/ dirs exist
10
+ - [ ] If missing (or `/review --setup`): read `references/setup-wizard.md`, run wizard, write config
11
+ - [ ] Confirm which framework rule files will be loaded (only stack-matching files)
12
+ - [ ] Confirm agents knowledge base paths (workspace_root + which dirs found)
13
+
5
14
  ## Resume Detection (Phase -1)
6
15
 
7
16
  - [ ] Look for existing checkpoint: `find /tmp -maxdepth 1 -type d -name "pr-review-${repo}-${pr_number}-*"`
@@ -15,7 +24,7 @@ Use this checklist for every PR review. Check off each item as you complete it.
15
24
 
16
25
  - [ ] Extract repo info: `owner/repo`, `pr_number`, `head_branch`
17
26
  - [ ] Create unique temp dir: `/tmp/pr-review-{repo}-{pr}-{timestamp}`
18
- - [ ] Clone repo to temp dir (shallow clone with `--depth=50`)
27
+ - [ ] Clone repo to temp dir (shallow clone with `--depth=1`)
19
28
  - [ ] Verify correct branch is checked out (`git log --oneline -1`)
20
29
  - [ ] Record `$REVIEW_DIR` path for all subsequent phases
21
30
  - [ ] Print confirmation with path and branch name
@@ -24,10 +33,14 @@ Use this checklist for every PR review. Check off each item as you complete it.
24
33
 
25
34
  ## Context Gathering (Phase 1)
26
35
 
36
+ - [ ] Read `{workspace_root}/AGENTS.md` → identify PR repo's domain
37
+ - [ ] Read `.agents/_repos/{repo-name}.md` → repo-specific context
38
+ - [ ] Read `.agents/_domains/{domain}.md` → domain context
39
+ - [ ] Store combined as `$AGENTS_CONTEXT`
27
40
  - [ ] Get PR metadata: title, description, author, base branch
28
41
  - [ ] Get changed files with diff
29
42
  - [ ] Read full file context from `$REVIEW_DIR` (not workspace repo)
30
- - [ ] Classify each file: LOGIC / STYLE / REFACTOR / NEW
43
+ - [ ] Classify each file: LOGIC / DELETION / STYLE / REFACTOR / NEW
31
44
  - [ ] Query nano-brain for past context on changed modules
32
45
  - [ ] Save checkpoint: `.checkpoints/phase-1-context.json`
33
46
  - [ ] Update manifest: `completed_phase: 1`, `next_phase: 1.5`
@@ -77,11 +90,31 @@ Use this checklist for every PR review. Check off each item as you complete it.
77
90
  ## Refinement (Phase 4)
78
91
 
79
92
  - [ ] Merge and deduplicate findings across agents
93
+ - [ ] Consensus scoring: 2+ agents flagged same issue → boost confidence to high
94
+ - [ ] Auto-downgrade: single agent + no evidence + critical/warning → suggestion
80
95
  - [ ] Apply severity filter (critical/warning keep, suggestion count-only)
81
96
  - [ ] Gap analysis — any subagent fail? Unreviewed files?
82
97
  - [ ] Second pass on gaps if needed
83
98
  - [ ] Save checkpoint: `.checkpoints/phase-4-refined.json`
84
- - [ ] Update manifest: `completed_phase: 4`, `next_phase: 5`
99
+ - [ ] Update manifest: `completed_phase: 4`, `next_phase: 4.5`
100
+
101
+ ## Verification Spot-Check (Phase 4.5)
102
+
103
+ - [ ] Read `references/verification-protocol.md`
104
+ - [ ] For each critical/warning finding: read cited code at evidence file:line in `$REVIEW_DIR`
105
+ - [ ] Mark each: `verified:true` (keep) | `verified:false` (drop) | `verified:unverifiable` (downgrade to suggestion)
106
+ - [ ] If no critical/warning findings: skip to Phase 4.6
107
+ - [ ] Save checkpoint: `.checkpoints/phase-4.5-verification.json`
108
+ - [ ] Update manifest: `completed_phase: 4.5`, `next_phase: 4.6`
109
+
110
+ ## Confidence Scoring (Phase 4.6)
111
+
112
+ - [ ] Read `references/confidence-scoring.md`
113
+ - [ ] Compute accuracy_rate, consensus_rate, evidence_rate
114
+ - [ ] Compute overall score (0–100)
115
+ - [ ] Apply gate: < 60 → add 🔴 warning, 60–79 → add ⚠️ warning, 80+ → proceed normally
116
+ - [ ] Save checkpoint: `.checkpoints/phase-4.6-confidence.json`
117
+ - [ ] Update manifest: `completed_phase: 4.6`, `next_phase: 5`
85
118
 
86
119
  ## Report (Phase 5)
87
120
 
@@ -97,7 +130,7 @@ Use this checklist for every PR review. Check off each item as you complete it.
97
130
  ## Save to Memory (Phase 5.5)
98
131
 
99
132
  - [ ] Write key findings to nano-brain with tags: review, {repo}
100
- - [ ] Verify searchable (`npx nano-brain search "PR {number}"`)
133
+ - [ ] Verify searchable (`curl -s localhost:3100/api/search -d '{"query":"PR {number}"}'`)
101
134
  - [ ] Update manifest: `completed_phase: 5.5`, `next_phase: 6`
102
135
 
103
136
  ## Cleanup (Phase 6)