@tianhai/pi-workflow-kit 0.15.0 → 0.17.1

@@ -0,0 +1,273 @@
1
+ # Implementation Plan: PR #5 Improvements
2
+
3
+ Fixes CI failures, tightens consistency across the workflow chain, and improves the design-review integration so every skill speaks with one voice.
4
+
5
+ ---
6
+
7
+ ## Task 1: Fix biome lint and format errors in workflow-guard
8
+
9
+ <!-- tdd: modifying-tested-code -->
10
+
11
+ Acceptance Criteria (QA Engineer Hat):
12
+ - **Happy Path**:
13
+ - Given: The biome linter runs on `extensions/workflow-guard.ts` and `tests/workflow-guard.test.ts`
14
+ - When: `npx biome check` is executed
15
+ - Then: Zero errors and zero warnings are emitted
16
+ - **Edge Case (no functional regression)**:
17
+ - Given: The existing test suite for workflow-guard
18
+ - When: `npx vitest run` is executed
19
+ - Then: All 27 existing tests still pass
20
+
21
+ Files:
22
+ - `extensions/workflow-guard.ts`
23
+ - `tests/workflow-guard.test.ts`
24
+
25
+ Steps:
26
+ 1. Fix `extensions/workflow-guard.ts` line 163 — replace string concatenation with template literal:
27
+ ```ts
28
+ // Before:
29
+ return !absolute.startsWith(plansDir + "/");
30
+ // After:
31
+ return !absolute.startsWith(`${plansDir}/`);
32
+ ```
33
+ 2. Fix `tests/workflow-guard.test.ts` line 1 — remove unused `beforeEach` import:
34
+ ```ts
35
+ // Before:
36
+ import { describe, it, expect, beforeEach } from "vitest";
37
+ // After:
38
+ import { describe, it, expect } from "vitest";
39
+ ```
40
+ 3. Fix `tests/workflow-guard.test.ts` line 19 — remove unused `getCurrentPhase` import:
41
+ ```ts
42
+ // Before:
43
+ import { getCurrentPhase, isSafeCommand, shouldBlockFilePath } from "../extensions/workflow-guard";
44
+ // After:
45
+ import { isSafeCommand, shouldBlockFilePath } from "../extensions/workflow-guard";
46
+ ```
47
+ 4. Run `npx biome check extensions/ tests/` — confirm zero errors.
48
+ 5. Run `npx vitest run` — confirm all tests pass.
49
+
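The trailing slash in the line 163 change is what stops a sibling directory with a shared prefix from passing the check. A minimal sketch of that behavior, assuming a hypothetical helper `isOutsidePlansDir` (the real function in `extensions/workflow-guard.ts` is not shown in this plan and may differ):

```ts
// Hypothetical stand-in for the prefix check on line 163 of workflow-guard.ts.
// Returns true when `absolute` is NOT inside `plansDir`.
function isOutsidePlansDir(absolute: string, plansDir: string): boolean {
  // Template literal instead of string concatenation; the result is identical.
  return !absolute.startsWith(`${plansDir}/`);
}

console.log(isOutsidePlansDir("/repo/docs/plans/a.md", "/repo/docs/plans"));     // false
console.log(isOutsidePlansDir("/repo/docs/plans-old/a.md", "/repo/docs/plans")); // true
```

Because the lint fix only changes how the string is built, the existing tests should pass unchanged.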
50
+ ---
51
+
52
+ ## Task 2: Fix CHANGELOG `[Unreleased]` link
53
+
54
+ <!-- tdd: trivial -->
55
+
56
+ Acceptance Criteria (QA Engineer Hat):
57
+ - **Happy Path**:
58
+ - Given: The CHANGELOG.md file with version link references
59
+ - When: A reader clicks the `[Unreleased]` link
60
+ - Then: It shows changes between v0.16.0 and HEAD (not v0.14.0)
61
+
62
+ Files:
63
+ - `CHANGELOG.md`
64
+
65
+ Steps:
66
+ 1. Update the `[Unreleased]` link at the bottom of CHANGELOG.md:
67
+ ```markdown
68
+ // Before:
69
+ [Unreleased]: https://github.com/yinloo-ola/pi-workflow-kit/compare/v0.14.0...HEAD
70
+ // After:
71
+ [Unreleased]: https://github.com/yinloo-ola/pi-workflow-kit/compare/v0.16.0...HEAD
72
+ ```
73
+
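A stale `[Unreleased]` link like this one can be caught mechanically. The sketch below is one possible check, not part of the package: the helper name `checkUnreleasedLink` is hypothetical, and it assumes the changelog lists releases newest first under `## [x.y.z]` headings, which this plan does not confirm.

```ts
interface LinkCheck {
  linkedFrom: string | null; // version the [Unreleased] link compares from
  latest: string | null;     // first "## [x.y.z]" heading found
  ok: boolean;
}

// Hypothetical helper: compares the [Unreleased] base tag with the newest release heading.
function checkUnreleasedLink(changelog: string): LinkCheck {
  const link = changelog.match(
    /^\[Unreleased\]:\s*\S+\/compare\/v(\d+\.\d+\.\d+)\.\.\.HEAD\s*$/m,
  );
  // Assumes newest release first; "## [Unreleased]" is skipped because it has no digits.
  const heading = changelog.match(/^## \[(\d+\.\d+\.\d+)\]/m);
  const linkedFrom = link ? link[1] : null;
  const latest = heading ? heading[1] : null;
  return { linkedFrom, latest, ok: linkedFrom !== null && linkedFrom === latest };
}
```

Run against the "Before" state, this would report `linkedFrom: "0.14.0"` against a latest heading of `0.16.0`, if the changelog follows the assumed layout.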
74
+ ---
75
+
76
+ ## Task 3: Align skill descriptions — trim design-review, match tone
77
+
78
+ <!-- tdd: trivial -->
79
+
80
+ All six skills should follow the same description pattern: a one-sentence summary of what the skill does, followed by trigger guidance. The `design-review` description currently front-loads usage instructions that belong in the skill body.
81
+
82
+ Acceptance Criteria (QA Engineer Hat):
83
+ - **Happy Path**:
84
+ - Given: The `description` frontmatter of `skills/design-review/SKILL.md`
85
+ - When: Compared to the other five skill descriptions
86
+ - Then: It follows the same pattern — concise purpose first, trigger guidance second
87
+ - **Edge Case (no information loss)**:
88
+ - Given: The trimmed description
89
+ - When: An agent reads it for skill matching
90
+ - Then: It still contains enough signal to trigger correctly (keywords: audit, design, production risks, security, scalability)
91
+
92
+ Files:
93
+ - `skills/design-review/SKILL.md`
94
+
95
+ Steps:
96
+ 1. Replace the description in `skills/design-review/SKILL.md` frontmatter:
97
+ ```yaml
98
+ # Before:
99
+ description: "Audit a design doc for production risks — security, scalability, fault tolerance, and operational hazards. Run after brainstorming, before writing-plans. Use when the brainstorm flags a non-trivial design, or when you want to stress-test a design for production readiness."
100
+ # After:
101
+ description: "Audit a design doc for production risks — security, scalability, fault tolerance, and operational hazards. Use after brainstorming for non-trivial designs, or when you want to stress-test a design for production readiness."
102
+ ```
103
+ This trims the redundant "Run after brainstorming, before writing-plans" (the workflow order is documented in README) while keeping the trigger guidance.
104
+
105
+ ---
106
+
107
+ ## Task 4: Remove redundant user confirmation for trivial design-review
108
+
109
+ <!-- tdd: modifying-tested-code -->
110
+
111
+ The brainstorming skill already asked the user to classify trivial vs non-trivial. When the design doc says "Simple change — no design review needed", the user already made that decision. Asking again in design-review step 2 adds friction without value.
112
+
113
+ Acceptance Criteria (QA Engineer Hat):
114
+ - **Happy Path (trivial skip)**:
115
+ - Given: A design doc with "Simple change — no design review needed"
116
+ - When: `/skill:design-review` is run
117
+ - Then: The agent automatically appends the "Skipped — trivial change" section and moves on, without asking the user to confirm
118
+ - **Edge Case (non-trivial proceeds normally)**:
119
+ - Given: A design doc without the trivial marker
120
+ - When: `/skill:design-review` is run
121
+ - Then: The full audit proceeds as before (no behavior change)
122
+
123
+ Files:
124
+ - `skills/design-review/SKILL.md`
125
+
126
+ Steps:
127
+ 1. In step 2 ("Check triviality"), replace the interactive confirmation with an automatic skip:
128
+ ```markdown
129
+ // Before:
130
+ 2. **Check triviality** — if the design doc notes "Simple change — no design review needed", confirm with the user: "This looks like a trivial change. Skip the full audit?" If yes, append a brief section:
131
+
132
+ // After:
133
+ 2. **Check triviality** — if the design doc notes "Simple change — no design review needed", append a brief section:
134
+ ```
135
+ 2. Remove the "If yes," conditional so that the append-and-stop behavior is unconditional for trivial docs.
136
+ 3. Verify the file reads cleanly — the flow is now: find doc → check trivial → (if trivial: append + stop, else: continue to step 3).
137
+
138
+ ---
139
+
140
+ ## Task 5: Evaluate design for review need regardless of design doc presence
141
+
142
+ <!-- tdd: trivial -->
143
+
144
+ The brainstorming skill flags "database changes, external services, auth, concurrency, large data flows" for design-review. The writing-plans safety net checks for a slightly different list and only when a design doc exists. When writing-plans is used standalone (no design doc), the safety net never fires — so a non-trivial standalone design skips design-review entirely.
145
+
146
+ The fix: always evaluate whether the design involves high-risk patterns, regardless of source. A design doc with no `## Architectural Review` section and a standalone user description both deserve the same scrutiny.
147
+
148
+
149
+ Acceptance Criteria (QA Engineer Hat):
150
+ - **Happy Path (design doc, no review section)**:
151
+ - Given: A design doc exists without an `## Architectural Review` section, and the design involves database schema changes
152
+ - When: Writing-plans step 1 runs
153
+ - Then: The agent prompts the user to run `/skill:design-review` or type 'proceed'
154
+ - **Happy Path (standalone, non-trivial)**:
155
+ - Given: No design doc exists, and the user describes a feature involving authentication and external API integrations
156
+ - When: Writing-plans gathers context
157
+ - Then: The agent prompts: "This design involves [auth, external APIs] but hasn't been reviewed for production risks. Run `/skill:design-review` first, or type 'proceed' to skip."
158
+ - **Edge Case (design doc already reviewed)**:
159
+ - Given: A design doc with an `## Architectural Review` section
160
+ - When: Writing-plans step 1 runs
161
+ - Then: No prompt — the review already happened
162
+ - **Edge Case (trivial)**:
163
+ - Given: A trivial design (config rename, simple field addition) with or without a design doc
164
+ - When: Writing-plans evaluates the design
165
+ - Then: No prompt — no trigger categories matched
166
+
167
+ Files:
168
+ - `skills/brainstorming/SKILL.md`
169
+ - `skills/writing-plans/SKILL.md`
170
+
171
+ Steps:
172
+ 1. Update `skills/brainstorming/SKILL.md` step 4 to match writing-plans' more specific list:
173
+ ```markdown
174
+ // Before:
175
+ For non-trivial designs, note any areas that may need production-risk review (database changes, external services, auth, concurrency, large data flows). You don't need to audit them here — just flag them for the design-review stage.
176
+
177
+ // After:
178
+ For non-trivial designs, note any areas that may need production-risk review (database schema changes, authentication or authorization, external API integrations, concurrency or batch processing, file uploads or large data flows, Redis/caching/message queues). You don't need to audit them here — just flag them for the design-review stage.
179
+ ```
180
+ 2. Update `skills/writing-plans/SKILL.md` step 1 — consolidate the safety net into one check that applies regardless of whether a design doc exists. Replace the current conditional with:
181
+ ```markdown
182
+ // Before (current text — only checks design docs):
183
+ Then check whether the design doc has an `## Architectural Review` section. If it doesn't, and the design involves any of the following, prompt the user...
184
+
185
+ // After (unified check):
186
+ Then evaluate whether the design — whether from the design doc or from the user's description and codebase exploration — involves any of the following:
187
+
188
+ - Database schema changes or migrations
189
+ - Authentication or authorization logic
190
+ - External API or service integrations
191
+ - Concurrency or batch processing
192
+ - File uploads or large data flows
193
+ - Redis, caching, or message queues
194
+
195
+ If any apply AND the design doc does not already have an `## Architectural Review` section, prompt the user: "This design involves [list what you found] but hasn't been reviewed for production risks. Run `/skill:design-review` first, or type 'proceed' to skip."
196
+
197
+ If the design doc explicitly notes "Simple change — no design review needed", skip this check.
198
+ ```
199
+ 3. Verify the safety net fires for both design-doc and standalone paths, and skips when the review already exists.
200
+
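The unified check reduces to a small decision procedure. The skill itself is prose that an agent interprets, so the sketch below only illustrates the intended ordering of the three outcomes. The function name `evaluateReviewNeed`, the keyword patterns, and the category labels are all assumptions made for this example.

```ts
type ReviewDecision =
  | { prompt: false; reason: "trivial" | "already-reviewed" | "no-triggers" }
  | { prompt: true; triggers: string[] };

// Keyword patterns are illustrative guesses at how each category might surface in text.
const TRIGGERS: Array<[string, RegExp]> = [
  ["database schema changes", /\b(schema|migration)s?\b/i],
  ["auth", /\bauth(entication|orization)?\b/i],
  ["external APIs", /\bexternal (api|service)s?\b/i],
  ["concurrency or batch processing", /\b(concurren(t|cy)|batch)\b/i],
  ["file uploads or large data flows", /\b(file uploads?|large data)\b/i],
  ["caching or message queues", /\b(redis|cach(e|ing)|message queues?)\b/i],
];

// Tolerates any dash style between the two halves of the marker sentence.
const TRIVIAL_MARKER = /Simple change\s*[\u2014\u2013-]\s*no design review needed/i;

function evaluateReviewNeed(description: string, designDoc?: string): ReviewDecision {
  if (designDoc !== undefined) {
    if (TRIVIAL_MARKER.test(designDoc)) return { prompt: false, reason: "trivial" };
    if (/^## Architectural Review\s*$/m.test(designDoc)) {
      return { prompt: false, reason: "already-reviewed" };
    }
  }
  // Same scan for both paths: design doc text (if any) plus the user's description.
  const text = `${designDoc ?? ""}\n${description}`;
  const triggers = TRIGGERS.filter(([, re]) => re.test(text)).map(([name]) => name);
  return triggers.length > 0
    ? { prompt: true, triggers }
    : { prompt: false, reason: "no-triggers" };
}
```

The point of the sketch is that the standalone path (no `designDoc`) goes through the same trigger scan as the design-doc path, which is the gap this task closes.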
201
+ ---
202
+
203
+ ## Task 6: Generalize `NODE_ENV` reference in executing-tasks
204
+
205
+ <!-- tdd: trivial -->
206
+
207
+ The QA Test frame references `NODE_ENV` which is Node.js-specific. Since the workflow kit is used across languages (the examples reference SQL, Go, etc.), this should be generalized.
208
+
209
+ Acceptance Criteria (QA Engineer Hat):
210
+ - **Happy Path**:
211
+ - Given: The QA Test frame in `skills/executing-tasks/SKILL.md`
212
+ - When: A developer working in Python or Go reads it
213
+ - Then: The guidance makes sense without Node.js context
214
+ - **Edge Case (Node.js users still understand)**:
215
+ - Given: A Node.js developer reading the same text
216
+ - When: They see the generalized phrasing
217
+ - Then: They understand it means `NODE_ENV=test` or equivalent
218
+
219
+ Files:
220
+ - `skills/executing-tasks/SKILL.md`
221
+
222
+ Steps:
223
+ 1. In the QA Test frame, replace the `NODE_ENV` reference:
224
+ ```markdown
225
+ // Before:
226
+ External dependencies must be mocked or stubbed. `NODE_ENV` must be `test` (or equivalent).
227
+
228
+ // After:
229
+ External dependencies must be mocked or stubbed. Ensure the test environment is isolated (e.g., `NODE_ENV=test`, `GO_ENV=test`, or equivalent for your stack).
230
+ ```
231
+
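For a Node.js or TypeScript project, the generalized guidance could be enforced with a small guard in the test setup. This is a sketch under assumptions: the helper name `isTestEnvironment` is hypothetical, and of the variable names below only `NODE_ENV` is a widely used convention; the others vary by project.

```ts
// Returns true when any of the given environment variables is set to "test".
// The default names are illustrative; adjust them to the stack in use.
function isTestEnvironment(
  env: Record<string, string | undefined>,
  names: string[] = ["NODE_ENV", "GO_ENV", "APP_ENV"],
): boolean {
  return names.some((name) => env[name] === "test");
}

console.log(isTestEnvironment({ NODE_ENV: "test" }));       // true
console.log(isTestEnvironment({ NODE_ENV: "production" })); // false
```

In a Node.js setup file this could be called as `isTestEnvironment(process.env)` to abort before any external dependency is touched.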
232
+ ---
233
+
234
+ ## Task 7: Deduplicate test coverage requirement in writing-plans task format
235
+
236
+ <!-- tdd: trivial -->
237
+
238
+ The "Each task must include" section has two overlapping bullets about test coverage:
239
+ 1. The new Acceptance Criteria block (Happy Path + Edge Cases)
240
+ 2. The old "Each task's tests should cover the happy path and at least one edge case" bullet
241
+
242
+ The Acceptance Criteria block supersedes the old bullet. Keeping both is redundant and confusing.
243
+
244
+ Acceptance Criteria (QA Engineer Hat):
245
+ - **Happy Path**:
246
+ - Given: The task format section in `skills/writing-plans/SKILL.md`
247
+ - When: Reading the "Each task must include" bullets
248
+ - Then: Test coverage is specified exactly once (in the Acceptance Criteria block), not duplicated
249
+
250
+ Files:
251
+ - `skills/writing-plans/SKILL.md`
252
+
253
+ Steps:
254
+ 1. Remove the redundant bullet from "Each task must include":
255
+ ```markdown
256
+ // Remove this line (now covered by Acceptance Criteria):
257
+ - Each task's tests should cover the happy path and at least one edge case or error path, with concrete assertions
258
+ ```
259
+ 2. Verify the Acceptance Criteria bullet already covers this requirement with its "Happy Path" and "Edge Cases & Error Paths" sub-bullets.
260
+
261
+ ---
262
+
263
+ ## Task 8: Run tests and verify CI passes
264
+
265
+ <!-- tdd: trivial -->
266
+
267
+ Files:
268
+ - None (verification only)
269
+
270
+ Steps:
271
+ 1. Run `npx biome check extensions/ tests/` — confirm zero errors.
272
+ 2. Run `npx vitest run` — confirm all existing tests pass.
273
+ 3. Verify the CHANGELOG link renders correctly.
@@ -0,0 +1,17 @@
1
+ # Progress: PR #5 Improvements
2
+
3
+ Plan: docs/plans/2026-05-25-pr5-improvements-implementation.md
4
+ Branch: design-review-split
5
+ Started: 2026-05-25T12:00:00Z
6
+ Last updated: 2026-05-25T12:26:00Z
7
+
8
+ | # | Status | Task | Commit |
9
+ |---|--------|------|--------|
10
+ | 1 | ✅ done | Fix biome lint and format errors in workflow-guard | 6f2eb8c |
11
+ | 2 | ✅ done | Fix CHANGELOG `[Unreleased]` link | 4866c25 |
12
+ | 3 | ✅ done | Align skill descriptions — trim design-review, match tone | 541ea9b |
13
+ | 4 | ✅ done | Remove redundant user confirmation for trivial design-review | 953b6d6 |
14
+ | 5 | ✅ done | Evaluate design for review need regardless of design doc presence | 20ea47e |
15
+ | 6 | ✅ done | Generalize `NODE_ENV` reference in executing-tasks | b117a44 |
16
+ | 7 | ✅ done | Deduplicate test coverage requirement in writing-plans task format | 3a34266 |
17
+ | 8 | ✅ done | Run tests and verify CI passes | — |
@@ -0,0 +1,51 @@
1
+ # Add Verify Skill — Design Doc
2
+
3
+ ## Context
4
+
5
+ Based on [Chris Lema's "The Last Prompt"](https://chrislema.com/the-last-prompt-you-need-when-building-software-with-ai), we need a post-implementation code verification phase in pi-workflow-kit. The existing `design-review` skill validates architecture *intentions* at the design-doc level, but there's no review of the *actual implemented code*. This is where the most dangerous bugs hide: signature mismatches between layers, dead code, duplicated logic, and security holes that pass tests but break in production.
6
+
7
+ ## Decision
8
+
9
+ ### Add a `verify` skill (new)
10
+
11
+ A single skill triggered by `/skill:verify` that runs three sequential expert review passes over implemented code:
12
+
13
+ 1. **Security** 🔴 — adversarial review as if a junior wrote it and the best security expert is auditing
14
+ 2. **Optimization** 🟡 — dead code, duplication, over/under-engineering, performance
15
+ 3. **Traceability** 🔵 — end-to-end call chain verification across every layer boundary
16
+
17
+ Output: a structured markdown report at `docs/plans/*-verification-report.md` with findings and an actionable task list.
18
+
19
+ ### Keep `design-review` unchanged
20
+
21
+ `design-review` stays between brainstorm and plan — it validates architecture before task breakdown. Moving it would lose the cheap "catch it before you build it" value.
22
+
23
+ ### Update README
24
+
25
+ Add `verify` to the workflow diagram, skill table, and quick start. The pipeline becomes:
26
+
27
+ ```
28
+ brainstorm → design-review → plan → execute → verify → finalize
29
+ ```
30
+
31
+ ## Workflow Integration
32
+
33
+ ```
34
+ brainstorm → design-review (optional) → plan → execute → verify → finalize
35
+ ↑ ↑
36
+ existing new
37
+ ```
38
+
39
+ - `verify` runs after `executing-tasks` and before `finalizing`
40
+ - It's optional — trivial changes can skip it
41
+ - The report's remediation task list feeds directly into a follow-up `/skill:writing-plans` if fixes are needed
42
+ - Read-only: can write to `docs/plans/` only, cannot modify source code
43
+
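The ordering and optionality described above can be written down as data. This is a sketch only: the names `PIPELINE` and `nextSkill` are hypothetical, and the design doc marks only `design-review` and `verify` as optional, so treating every other phase as required is an assumption.

```ts
interface Phase {
  skill: string;
  optional: boolean;
}

// Skill names follow the /skill:<name> commands used in the README.
const PIPELINE: Phase[] = [
  { skill: "brainstorming", optional: false },
  { skill: "design-review", optional: true },
  { skill: "writing-plans", optional: false },
  { skill: "executing-tasks", optional: false },
  { skill: "verify", optional: true }, // new phase
  { skill: "finalizing", optional: false },
];

// Returns the skill that follows `current`, or null at the end of the pipeline
// or when `current` is unknown. With skipOptional, optional phases are passed over.
function nextSkill(current: string, skipOptional: boolean): string | null {
  const start = PIPELINE.findIndex((p) => p.skill === current);
  if (start === -1) return null;
  const next = PIPELINE.slice(start + 1).find((p) => !(skipOptional && p.optional));
  return next ? next.skill : null;
}

console.log(nextSkill("executing-tasks", false)); // "verify"
console.log(nextSkill("executing-tasks", true));  // "finalizing"
```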
44
+ ## Files to Change
45
+
46
+ 1. **`skills/verify/SKILL.md`** — new skill (full content in `docs/plans/2026-06-03-verify-skill-design.md`)
47
+ 2. **`README.md`** — update workflow diagram, skill table, quick start, and project structure
48
+
49
+ ## Production Risks
50
+
51
+ Simple change — no design review needed. We're adding a new SKILL.md and updating documentation. No code execution, no external integrations, no security surface.
@@ -0,0 +1,111 @@
1
+ # Implementation Plan: Add Verify Skill
2
+
3
+ Design: `docs/plans/2026-06-03-add-verify-skill-design.md`
4
+
5
+ ## Overview
6
+
7
+ Add a `verify` skill to pi-workflow-kit — a post-implementation code verification phase that runs three expert review passes (security, optimization, traceability) over implemented code. Also update the README to reflect the expanded workflow pipeline.
8
+
9
+ Full SKILL.md content is in `docs/plans/2026-06-03-verify-skill-design.md` (lines 8-175, inside the code fence that opens on line 7 and closes on line 176).
10
+
11
+ ## Task 1: Create the verify skill
12
+
13
+ <!-- tdd: trivial -->
14
+
15
+ Acceptance Criteria (QA Engineer Hat):
16
+ - **Happy Path**:
17
+ - Given: No `skills/verify/` directory exists
18
+ - When: `skills/verify/SKILL.md` is created
19
+ - Then: The file contains valid YAML frontmatter with `name: verify` and a description mentioning security, optimization, and traceability. The file body contains all three review pass sections, the report format template, and the principles section.
20
+ - **Edge Case (skill already exists)**:
21
+ - Given: `skills/verify/SKILL.md` already exists
22
+ - When: Task runs
23
+ - Then: The existing file is overwritten with the new content
24
+
25
+ Files:
26
+ - `skills/verify/SKILL.md`
27
+
28
+ Steps:
29
+ 1. Create the directory `skills/verify/`
30
+ 2. Create `skills/verify/SKILL.md` with the full content from the design draft. The content is the markdown inside the code fence in `docs/plans/2026-06-03-verify-skill-design.md` (lines 8-175; line 176 is the closing fence). Copy it exactly. It includes:
31
+ - YAML frontmatter with name and description
32
+ - # Verify heading and intro paragraph
33
+ - ## Process section (5 steps)
34
+ - ## Pass 1 — Security Review 🔴 (framing, what to look for, severity table)
35
+ - ## Pass 2 — Optimization Review 🟡 (framing, what to look for, priority table)
36
+ - ## Pass 3 — Traceability Review 🔵 (framing, what to look for 4 sub-items, severity table)
37
+ - ## Report Format section (full template with summary table, findings sections, remediation task list)
38
+ - ## Principles section (5 bullets)
39
+
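The Happy Path criterion for this task can be checked mechanically. The sketch below is one way to do it, with a hypothetical helper `checkVerifySkill`. It uses a naive line-based frontmatter match, not a real YAML parser, and matches section headings by prefix only.

```ts
// Returns a list of problems; an empty list means the acceptance criterion is met.
function checkVerifySkill(content: string): string[] {
  const frontmatter = content.match(/^---\n([\s\S]*?)\n---\n/);
  if (!frontmatter) return ["missing YAML frontmatter"];

  const problems: string[] = [];
  const fm = frontmatter[1];
  if (!/^name:\s*verify\s*$/m.test(fm)) problems.push("name is not `verify`");

  const description = (fm.match(/^description:\s*(.*)$/m)?.[1] ?? "").toLowerCase();
  for (const keyword of ["security", "optimization", "traceability"]) {
    if (!description.includes(keyword)) problems.push(`description lacks "${keyword}"`);
  }

  // Heading prefixes taken from the step list above.
  const headings = ["## Pass 1", "## Pass 2", "## Pass 3", "## Report Format", "## Principles"];
  for (const heading of headings) {
    if (!content.includes(`\n${heading}`)) problems.push(`missing section "${heading}"`);
  }
  return problems;
}
```

A check like this would only confirm structure; whether the copied content matches the draft exactly still needs a direct comparison.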
40
+ ## Task 2: Update README with verify skill
41
+
42
+ <!-- tdd: trivial -->
43
+
44
+ Acceptance Criteria (QA Engineer Hat):
45
+ - **Happy Path**:
46
+ - Given: README.md has the current workflow (brainstorm → design-review → plan → execute → finalize)
47
+ - When: README is updated
48
+ - Then: All six sections are updated (tagline, workflow diagram, skill table, phase control, quick start, and project structure) to include `verify` between execute and finalize.
49
+ - **Edge Case (verify already in README)**:
50
+ - Given: README already contains verify references
51
+ - When: Task runs
52
+ - Then: No duplicate entries are introduced
53
+
54
+ Files:
55
+ - `README.md`
56
+
57
+ Steps:
58
+
59
+ 1. Update the tagline (line 3) — change `brainstorm→plan→execute→finalize` to `brainstorm→plan→execute→verify→finalize`:
60
+ ```
61
+ > Stop AI agents from rushing to code. Enforce a structured brainstorm→plan→execute→verify→finalize workflow with TDD discipline.
62
+ ```
63
+
64
+ 2. Update the "🧠 6 Workflow Skills" heading (line 36) to "🧠 7 Workflow Skills"
65
+
66
+ 3. Update the workflow diagram (lines 40-44) to:
67
+ ```
68
+ brainstorm → design-review → plan → execute → verify → finalize
69
+
70
+ diagnose (anytime)
71
+ ```
72
+
73
+ 4. Add verify to the skill table (after the Execute row, before Finalize):
74
+ ```
75
+ | **Verify** | `/skill:verify` | Three expert review passes (security, optimization, traceability) on implemented code |
76
+ ```
77
+
78
+ 5. Update the phase control section (lines 61-67) to add verify:
79
+ ```
80
+ /skill:brainstorming → discuss and design
81
+ /skill:design-review → audit for production risks (non-trivial designs)
82
+ /skill:writing-plans → break into tasks
83
+ /skill:executing-tasks → implement with TDD
84
+ /skill:verify → review code for security, optimization, and traceability issues
85
+ /skill:finalizing → ship it
86
+ ```
87
+
88
+ 6. Update the quick start section (lines 110-135) to add verify between executing-tasks and finalizing:
89
+ ```
90
+ > /skill:executing-tasks
91
+
92
+ # (agent implements with TDD, cognitive persona shifts, all tools unlocked)
93
+ > /skill:verify
94
+
95
+ # (agent runs security, optimization, and traceability reviews on implemented code)
96
+ > /skill:finalizing
97
+
98
+ # (agent archives docs, curates lessons, creates PR)
99
+ ```
100
+
101
+ 7. Update the project structure (lines 146-161) to add verify:
102
+ ```
103
+ ├── skills/
104
+ │ ├── brainstorming/SKILL.md
105
+ │ ├── design-review/SKILL.md
106
+ │ ├── writing-plans/SKILL.md
107
+ │ ├── executing-tasks/SKILL.md
108
+ │ ├── verify/SKILL.md
109
+ │ ├── finalizing/SKILL.md
110
+ │ └── diagnose/SKILL.md
111
+ ```
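The "no duplicate entries" edge case amounts to making each README edit idempotent. A minimal sketch for the skill-table row from step 4, assuming a hypothetical helper `addVerifyRow` and assuming the existing Execute row starts with `| **Execute** |` (the current README is not shown in this plan):

```ts
const VERIFY_ROW =
  "| **Verify** | `/skill:verify` | Three expert review passes (security, optimization, traceability) on implemented code |";

// Inserts the Verify row directly after the Execute row, unless one already exists.
function addVerifyRow(readme: string): string {
  const lines = readme.split("\n");
  if (lines.some((line) => line.startsWith("| **Verify** |"))) return readme; // already present
  const executeIndex = lines.findIndex((line) => line.startsWith("| **Execute** |"));
  if (executeIndex === -1) return readme; // assumed anchor row not found; leave untouched
  lines.splice(executeIndex + 1, 0, VERIFY_ROW);
  return lines.join("\n");
}
```

Applying the function twice yields the same text as applying it once, which is the property the edge case asks for.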
@@ -0,0 +1,11 @@
1
+ # Progress: Add Verify Skill
2
+
3
+ Plan: docs/plans/2026-06-03-add-verify-skill-implementation.md
4
+ Branch: add-verify-skill
5
+ Started: 2026-06-03T13:00:00Z
6
+ Last updated: 2026-06-03T13:00:00Z
7
+
8
+ | # | Status | Task | Commit |
9
+ |---|--------|------|--------|
10
+ | 1 | ✅ done | Create the verify skill | c48d47a |
11
+ | 2 | ✅ done | Update README with verify skill | ea37ea8 |
@@ -0,0 +1,176 @@
1
+ # Verify Skill — Draft SKILL.md
2
+
3
+ > **Target path:** `skills/verify/SKILL.md` (to be created during executing-tasks)
4
+
5
+ ---
6
+
7
+ ```markdown
8
+ ---
9
+ name: verify
10
+ description: "Post-implementation code verification with three expert review passes — security, optimization, and traceability. Use after executing-tasks and before finalizing to catch issues that pass tests but break in production. Runs the 'last prompt' pattern: adversarial security review, dead code and duplication audit, and end-to-end contract verification across every layer. Use this skill whenever the user says 'verify', 'review the code', 'check for issues', 'security review', 'the last prompt', 'audit', or when code has been implemented and needs a quality gate before shipping."
11
+ ---
12
+
13
+ # Verify
14
+
15
+ Three expert review passes over the implemented codebase. Read-only — you **may** write the verification report to `docs/plans/`, but you **may not** modify source code.
16
+
17
+ The core insight: code that passes tests is not code that's ready. Working code can have security holes, dead branches, duplicated logic, and broken contracts between layers — especially when AI generates across many files without maintaining a single mental model of the whole system. This skill catches what tests miss.
18
+
19
+ ## Process
20
+
21
+ 1. **Check what's been done** — run `git log --oneline` and `git diff --stat` to understand the scope of recent changes. If nothing has been implemented, say "No code changes found. Run `/skill:executing-tasks` first." and stop.
22
+
23
+ 2. **Identify the project's layers** — before reviewing, map the codebase's architecture. Look for layer boundaries: UI/handlers/routes → services/business logic → repositories/data access → database/models. Note the patterns: does the project use controllers, handlers, or routes? Services or use cases? Repositories or DAOs? This map drives the traceability pass.
24
+
25
+ 3. **Run three expert review passes** — each pass adopts a distinct adversarial framing. Do them sequentially. For each pass, read the relevant code deeply — don't skim. Then write findings.
26
+
27
+ 4. **Compile the report** — write all findings to `docs/plans/*-verification-report.md`. Present the report to the user and wait for feedback.
28
+
29
+ 5. **Offer to create a remediation plan** — after the report, ask: "Want me to create a fix plan from these findings? Run `/skill:writing-plans` to turn the task list into executable tasks."
30
+
31
+ ## Pass 1 — Security Review 🔴
32
+
33
+ **Framing:** A junior developer wrote this code. Now the best security expert on the team is reviewing it — adversarial, suspicious of everything. Trust nothing.
34
+
35
+ **What to look for:**
36
+
37
+ - **Input validation** — every external input (HTTP params, form data, headers, query strings, environment variables) must be validated and sanitized. Unvalidated input is a critical finding.
38
+ - **Authentication & authorization** — every endpoint that handles user data must have auth checks. Are there endpoints that skip auth? Can one user access another user's data by changing an ID?
39
+ - **Injection** — SQL queries built by string concatenation, unsanitized shell commands, template injection, XSS in HTML output. Any raw variable interpolated into a query or command is critical.
40
+ - **Secrets** — API keys, passwords, tokens hardcoded in source files. Check environment variable loading — are defaults set to empty or to actual secrets?
41
+ - **Data exposure** — are sensitive fields (passwords, tokens, PII) logged, returned in API responses, or stored unencrypted?
42
+ - **Dependency risks** — known-vulnerable packages (if `package.json`/`go.mod`/`requirements.txt` is present).
43
+
44
+ **Severity classification:**
45
+
46
+ | Severity | Definition |
47
+ |----------|-----------|
48
+ | Critical | Exploitable right now — auth bypass, injection, data leak |
49
+ | High | Likely exploitable — missing validation on sensitive endpoint, weak auth |
50
+ | Medium | Harder to exploit but real risk — verbose error messages leaking internals, missing rate limits |
51
+ | Low | Best practice violations — missing CSP headers, no HSTS, long session timeouts |
52
+
53
+ ## Pass 2 — Optimization Review 🟡
54
+
55
+ **Framing:** A code quality expert looking for waste — things that make the codebase harder to maintain, slower to run, or more confusing than necessary.
56
+
57
+ **What to look for:**
58
+
59
+ - **Dead code** — functions, methods, types, or exports that are never called anywhere in the codebase. Search for definitions and verify they have callers.
60
+ - **Duplication** — the same logic implemented in slightly different ways across multiple files. AI-generated code is especially prone to this — if context was lost between sessions, the AI solved the same sub-problem differently in two places. Flag each pair with file paths and line numbers.
61
+ - **Over-engineering** — abstractions, interfaces, or layers that add complexity without earning their keep (only one implementation, no real variation across the seam).
62
+ - **Under-engineering** — god functions, 200-line blocks, deeply nested conditionals that should be extracted.
63
+ - **Performance concerns** — N+1 queries, unbounded loops, unnecessary copies of large data structures, missing pagination on list endpoints.
64
+
65
+ **Priority classification:**
66
+
67
+ | Priority | Definition |
68
+ |----------|-----------|
69
+ | P0 | Dead code in a critical path or duplicated logic that will diverge |
70
+ | P1 | Significant duplication or over-engineering that increases maintenance cost |
71
+ | P2 | Minor cleanups — long functions, missing pagination, style inconsistencies |
72
+
73
+ ## Pass 3 — Traceability Review 🔵
74
+
75
+ **Framing:** An integration expert tracing every user-facing action end-to-end — from UI to database and back. The AI generates code file-by-file, and the seams between files are where bugs hide.
76
+
77
+ **What to look for:**
78
+
79
+ 1. **Map every entry point** — list all handlers, routes, controllers, or event listeners that receive external input.
80
+ 2. **Trace each call chain** — for each entry point, follow the call: handler → service → repository → database. At each boundary, verify:
81
+ - **Function name** — does the caller use the exact function name the callee exposes?
82
+ - **Argument names** — does the caller pass `userId` when the function expects `user_id`? Does `id` mean the same thing in both layers?
83
+ - **Argument types** — is a string passed where an integer is expected? Is an object shape different from what the next layer destructures?
84
+ - **Return shape** — does the caller expect fields that the callee actually returns? Are response DTOs consistent across layers?
85
+ 3. **Check error propagation** — when a database query returns no results, does the service layer handle it? Does the handler return 404 or 500? Do errors propagate cleanly or get swallowed silently?
86
+ 4. **Verify the round-trip** — if the UI calls `getUser(id)` and displays `user.name`, trace that `name` actually exists in the DB schema, gets selected by the query, mapped by the repository, passed through the service, included in the response, and rendered by the UI.
87
+
88
+ **This is the pass that catches the most bugs.** AI-generated code will often have a frontend calling `getUserProfile(userId)` and a backend exposing `get_user_profile(user_id)` — both work in isolation, neither works together.
89
+
90
+ **Severity classification:**
91
+
92
+ | Severity | Definition |
93
+ |----------|-----------|
94
+ | Critical | Call chain is completely broken — function doesn't exist or signature is fundamentally wrong |
95
+ | High | Signature mismatch — wrong arg names, wrong types, missing required fields |
96
+ | Medium | Silent error handling — errors swallowed without logging or user feedback |
97
+ | Low | Inconsistent naming conventions that could confuse future developers |
98
+
99
+ ## Report Format
100
+
101
+ Write findings to `docs/plans/*-verification-report.md` using this structure:
102
+
103
+ # Verification Report: <feature/topic>
104
+
105
+ **Date:** <ISO date>
106
+ **Scope:** <summary of what was reviewed>
107
+ **Reviewer:** AI verify skill (security + optimization + traceability)
108
+
109
+ ## Summary
110
+
111
+ | Pass | Critical | High | Medium | Low |
112
+ |------|----------|------|--------|-----|
113
+ | Security | X | X | X | X |
114
+ | Optimization | — | X | X | X |
115
+ | Traceability | X | X | X | X |
116
+ | **Total** | **X** | **X** | **X** | **X** |
117
+
118
+ ## 🔴 Security Findings
119
+
120
+ ### [S-001] Critical — <short title>
121
+
122
+ **Location:** `path/to/file.ts:line`
123
+
124
+ **Issue:** <what's wrong and why it matters>
125
+
126
+ **Fix:** <concrete remediation step>
127
+
128
+ ### [S-002] High — <short title>
129
+ ...
130
+
131
+ ## 🟡 Optimization Findings
132
+
133
+ ### [O-001] P0 — <short title>
134
+
135
+ **Location:** `path/to/file.ts:line` and `path/to/other.ts:line`
136
+
137
+ **Issue:** <what's wrong>
138
+
139
+ **Fix:** <concrete remediation step>
140
+
141
+ ### [O-002] P1 — <short title>
142
+ ...
143
+
144
+ ## 🔵 Traceability Findings
145
+
146
+ ### [T-001] Critical — <short title>
147
+
148
+ **Entry point:** `path/to/handler.ts:line`
149
+ **Call chain:** handler → service → repository → DB
150
+ **Broken at:** <which boundary>
151
+ **Issue:** <what's wrong — e.g., handler passes `userId` but service expects `user_id`>
152
+
153
+ **Fix:** <concrete remediation step>
154
+
155
+ ### [T-002] High — <short title>
156
+ ...
157
+
158
+ ## Remediation Task List
159
+
160
+ Convert findings into actionable tasks:
161
+
162
+ | ID | Priority | Finding | Estimated Effort |
163
+ |----|----------|---------|-----------------|
164
+ | S-001 | Critical | <one-liner> | <small/medium/large> |
165
+ | T-001 | Critical | <one-liner> | <small/medium/large> |
166
+ | O-001 | P0 | <one-liner> | <small/medium/large> |
167
+ | ... | ... | ... | ... |
168
+
169
+ ## Principles
170
+
171
+ - **Be specific** — every finding must include a file path and line reference. "There might be security issues" is useless.
172
+ - **Be adversarial** — actively look for problems. If you don't find any, say so — but don't phone it in.
173
+ - **Be proportional** — a small config change doesn't need the same depth as a new API endpoint. Adjust your review depth to the scope of changes.
174
+ - **Don't fix anything** — this is read-only. Find and report. The user decides what to fix and when.
175
+ - **Focus on seams** — the traceability pass is where the most value lives. Code within a single file is usually coherent; the bugs hide between files.
176
+ ```
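The Remediation Task List in the draft's report format can be generated from structured findings. This is a sketch under assumptions: the `Finding` shape and `remediationTable` are hypothetical, and the ranking that interleaves P0-P2 with the severity levels is a guess based on the order of the template rows (S-001, T-001, then O-001), not something the draft specifies.

```ts
type Level = "Critical" | "High" | "Medium" | "Low" | "P0" | "P1" | "P2";

interface Finding {
  id: string; // e.g. "S-001", "O-001", "T-001"
  level: Level;
  title: string;
  effort: "small" | "medium" | "large";
}

// Assumed ordering: Critical first, then P0, High, P1, Medium, P2, Low.
const RANK: Record<Level, number> = {
  Critical: 0, P0: 1, High: 2, P1: 3, Medium: 4, P2: 5, Low: 6,
};

// Builds the markdown table, sorted by rank and then by finding ID.
function remediationTable(findings: Finding[]): string {
  const rows = [...findings]
    .sort((a, b) => RANK[a.level] - RANK[b.level] || a.id.localeCompare(b.id))
    .map((f) => `| ${f.id} | ${f.level} | ${f.title} | ${f.effort} |`);
  return [
    "| ID | Priority | Finding | Estimated Effort |",
    "|----|----------|---------|-----------------|",
    ...rows,
  ].join("\n");
}
```

Keeping findings as data would also let a follow-up `/skill:writing-plans` run consume the same list, though the draft leaves that handoff to the agent.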