agent-directives 0.1.0

Files changed (57)
  1. package/README.md +385 -0
  2. package/directives/adaptive-routing.md +361 -0
  3. package/directives/architecture-boundaries.md +223 -0
  4. package/directives/codebase-navigation.md +325 -0
  5. package/directives/context-handoff.md +220 -0
  6. package/directives/error-memory.md +169 -0
  7. package/directives/exploration-mode.md +266 -0
  8. package/directives/session-decisions.md +193 -0
  9. package/directives/specification-driven-development.md +278 -0
  10. package/directives/task-framing.md +154 -0
  11. package/directives/test-driven-development.md +305 -0
  12. package/directives/type-driven-development.md +173 -0
  13. package/directives/verification.md +266 -0
  14. package/directives/workspace-isolation.md +219 -0
  15. package/dist/cli.d.ts +3 -0
  16. package/dist/cli.d.ts.map +1 -0
  17. package/dist/cli.js +232 -0
  18. package/dist/cli.js.map +1 -0
  19. package/dist/context-audit.d.ts +30 -0
  20. package/dist/context-audit.d.ts.map +1 -0
  21. package/dist/context-audit.js +75 -0
  22. package/dist/context-audit.js.map +1 -0
  23. package/dist/install.d.ts +18 -0
  24. package/dist/install.d.ts.map +1 -0
  25. package/dist/install.js +28 -0
  26. package/dist/install.js.map +1 -0
  27. package/dist/manifest.d.ts +25 -0
  28. package/dist/manifest.d.ts.map +1 -0
  29. package/dist/manifest.js +29 -0
  30. package/dist/manifest.js.map +1 -0
  31. package/dist/prompt.d.ts +3 -0
  32. package/dist/prompt.d.ts.map +1 -0
  33. package/dist/prompt.js +29 -0
  34. package/dist/prompt.js.map +1 -0
  35. package/dist/targets.d.ts +10 -0
  36. package/dist/targets.d.ts.map +1 -0
  37. package/dist/targets.js +32 -0
  38. package/dist/targets.js.map +1 -0
  39. package/manifest.json +387 -0
  40. package/package.json +74 -0
  41. package/skills/architecture-boundary-reviewer/SKILL.md +228 -0
  42. package/skills/code-reviewer/SKILL.md +77 -0
  43. package/skills/codebase-health-reviewer/SKILL.md +234 -0
  44. package/skills/harness-hooks-reviewer/SKILL.md +159 -0
  45. package/skills/implementation-task-planner/SKILL.md +205 -0
  46. package/skills/mcp-integration-reviewer/SKILL.md +157 -0
  47. package/skills/product-requirements-writer/SKILL.md +205 -0
  48. package/skills/production-readiness-reviewer/SKILL.md +240 -0
  49. package/skills/self-audit/SKILL.md +134 -0
  50. package/skills/spec-reviewer/SKILL.md +304 -0
  51. package/skills/subagent-driven-development/SKILL.md +236 -0
  52. package/skills/systematic-debugging/SKILL.md +313 -0
  53. package/skills/test-reviewer/SKILL.md +293 -0
  54. package/templates/AGENTS.md +120 -0
  55. package/templates/CLAUDE.md +115 -0
  56. package/templates/copilot-instructions.md +116 -0
  57. package/templates/decision-log.md +44 -0
@@ -0,0 +1,157 @@
1
+ ---
2
+ name: "mcp-integration-reviewer"
3
+ description: "Load when adding or reviewing MCP servers, agent tools, tool schemas, internal API bridges, structured search, docs/ticketing/analytics connectors, or agent-accessible write tools."
4
+ version: 1.0.0
5
+ required: false
6
+ category: review
7
+ tools:
8
+ - claude
9
+ - copilot
10
+ - codex
11
+ - cursor
12
+ routing:
13
+ triggers:
14
+ - mcp-server
15
+ - mcp-tool
16
+ - agent-tool-schema
17
+ - internal-tool-bridge
18
+ - structured-search-tool
19
+ - agent-accessible-api
20
+ - agent-write-tool
21
+ paths:
22
+ - full-path
23
+ - review-path
24
+ - policy-path
25
+ ---
26
+
27
+ # MCP Integration Reviewer
28
+
29
+ You are a specialist in reviewing Model Context Protocol (MCP) servers and other
30
+ agent-accessible tool surfaces. Your job is to make sure the agent can call the
31
+ right tool safely, with strict schemas, least privilege, bounded output, and clear
32
+ failure behavior.
33
+
34
+ This skill applies to MCP specifically and to similar internal tool bridges that
35
+ expose APIs, search, tickets, analytics, docs, deploys, or data systems to agents.
36
+
37
+ ---
38
+
39
+ ## When to Use
40
+
41
+ Use this skill when work adds, changes, or reviews:
42
+
43
+ - MCP servers, tool definitions, resources, prompts, or transports
44
+ - agent-callable wrappers around internal APIs, search, docs, ticketing, analytics,
45
+ deploy, data, or operational systems
46
+ - tool schemas, descriptions, argument validation, output contracts, or permissions
47
+ - write-capable tools or tools that can expose sensitive data
48
+
49
+ Do not use this skill for ordinary application APIs unless they are exposed to an
50
+ agent as tools.
51
+
52
+ ---
53
+
54
+ ## Review Process
55
+
56
+ ### Step 1: Inventory the Tool Surface
57
+
58
+ List only the exposed agent-facing capabilities:
59
+
60
+ - tool/resource/prompt names
61
+ - read vs write behavior
62
+ - external/internal systems touched
63
+ - auth identity and permission scope
64
+ - expected output shape and size
65
+
66
+ ### Step 2: Check Tool Routing Quality
67
+
68
+ Verify tool names and descriptions tell an agent when to use the tool and when not
69
+ to. A good tool description includes task intent, boundaries, required identifiers,
70
+ and important side effects.
71
+
72
+ Flag vague names like `run`, `query`, `doThing`, or broad descriptions like
73
+ "access internal systems" unless the surrounding schema strongly disambiguates.
74
+
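The routing check in Step 2 can be partly automated. The following is a minimal sketch, not part of the package: `ToolSpec`, `routing_warnings`, and the word-count threshold are hypothetical names and values chosen for illustration, and the vague-name list contains only the examples given above.

```python
from dataclasses import dataclass

# Examples taken from the text above, lowercased for comparison.
VAGUE_NAMES = {"run", "query", "dothing"}
# Assumed threshold; tune it per project.
MIN_DESCRIPTION_WORDS = 8


@dataclass
class ToolSpec:
    """Hypothetical minimal view of a tool definition."""
    name: str
    description: str


def routing_warnings(tool: ToolSpec) -> list[str]:
    """Return warnings for a tool an agent may struggle to route to."""
    warnings = []
    if tool.name.lower() in VAGUE_NAMES:
        warnings.append(f"vague tool name: {tool.name!r}")
    if len(tool.description.split()) < MIN_DESCRIPTION_WORDS:
        warnings.append(
            "description too short to state intent, boundaries, and side effects"
        )
    return warnings
```

A word count is only a crude proxy for description quality, so a reviewer still needs to read each description for task intent, boundaries, and side effects.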
75
+ ### Step 3: Check Schemas and Validation
76
+
77
+ Require:
78
+
79
+ - strict argument schemas with required fields, enums, bounds, and formats
80
+ - server-side validation, not only client-side hints
81
+ - pagination or limits for large reads
82
+ - structured errors with actionable codes/messages
83
+ - stable output fields that avoid dumping unbounded raw documents by default
84
+
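As an illustration of the Step 3 requirements, here is a minimal sketch of server-side argument validation for a hypothetical ticket-search tool. The field names (`project_id`, `status`, `limit`), the enum values, the limit cap, and the error codes are all assumptions made for the example, not part of any MCP SDK.

```python
ALLOWED_STATUSES = {"open", "closed", "all"}  # hypothetical enum
MAX_LIMIT = 100  # assumed cap on page size
KNOWN_FIELDS = {"project_id", "status", "limit"}


class ToolError(Exception):
    """Structured error carrying a stable code the agent can act on."""

    def __init__(self, code: str, message: str):
        super().__init__(message)
        self.code = code
        self.message = message


def validate_search_args(args: dict) -> dict:
    """Validate arguments on the server and return a normalized copy."""
    unknown = set(args) - KNOWN_FIELDS
    if unknown:
        raise ToolError("unknown_argument", f"unexpected fields: {sorted(unknown)}")

    project_id = args.get("project_id")
    if not isinstance(project_id, str) or not project_id:
        raise ToolError("missing_argument", "project_id must be a non-empty string")

    status = args.get("status", "open")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        raise ToolError("invalid_enum", f"status must be one of {sorted(ALLOWED_STATUSES)}")

    limit = args.get("limit", 20)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= MAX_LIMIT:
        raise ToolError("out_of_range", f"limit must be an integer from 1 to {MAX_LIMIT}")

    return {"project_id": project_id, "status": status, "limit": limit}
```

Rejecting unknown fields and capping `limit` on the server means the tool stays bounded even when the agent ignores the hints in the schema.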
85
+ ### Step 4: Check Auth, Secrets, and Data Boundaries
86
+
87
+ Review:
88
+
89
+ - least-privilege auth for the tool's real blast radius
90
+ - separation between user identity, service identity, and elevated/admin identity
91
+ - secret handling and redaction in logs, errors, traces, and model-visible output
92
+ - tenant/user/project scoping for internal data
93
+ - audit logging for sensitive reads and all meaningful writes
94
+
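The redaction item in Step 4 could look roughly like the sketch below, applied to tool output before it is logged or returned to the model. The key list and the bearer-token pattern are assumptions for illustration; a real deployment would need patterns matched to its own secret formats.

```python
import re

# Assumed list of sensitive field names; extend for the system under review.
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization", "secret"}
BEARER_PATTERN = re.compile(r"Bearer\s+[A-Za-z0-9._\-]+")


def redact(value):
    """Recursively redact sensitive keys and bearer tokens in tool output."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return BEARER_PATTERN.sub("Bearer [REDACTED]", value)
    return value
```

Key-based redaction only catches secrets stored under predictable names, so it complements rather than replaces keeping secrets out of tool output in the first place.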
95
+ ### Step 5: Check Write Safety
96
+
97
+ For write-capable tools, require appropriate safeguards:
98
+
99
+ - dry-run or preview mode when practical
100
+ - explicit confirmation for destructive, deploy, billing, permission, or data writes
101
+ - idempotency keys or duplicate-call protection when retries are plausible
102
+ - rollback/recovery notes for high-impact changes
103
+ - clear distinction between create/update/delete operations
104
+
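A minimal sketch of the dry-run and explicit-confirmation safeguards from Step 5, using an in-memory dictionary as a stand-in for the real system. The function name, parameters, and `WriteResult` type are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WriteResult:
    applied: bool
    summary: str


def delete_records(
    store: dict, ids: list[str], dry_run: bool = True, confirm: bool = False
) -> WriteResult:
    """Delete records from an in-memory store; previews by default."""
    targets = [record_id for record_id in ids if record_id in store]
    preview = f"would delete {len(targets)} of {len(ids)} requested records"
    if dry_run:
        # Preview only: report the effect without mutating anything.
        return WriteResult(applied=False, summary=preview)
    if not confirm:
        # Destructive call without explicit confirmation is refused.
        return WriteResult(applied=False, summary="confirmation required: " + preview)
    for record_id in targets:
        del store[record_id]
    return WriteResult(applied=True, summary=f"deleted {len(targets)} records")
```

Defaulting `dry_run` to `True` makes the safe behavior the one an agent gets when it omits the flag; the destructive path needs two deliberate arguments.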
105
+ ### Step 6: Check Operational Behavior
106
+
107
+ Look for timeouts, retries, rate limits, cancellation, concurrency limits,
108
+ backpressure, and dependency-failure behavior. Tool errors should be visible to
109
+ the agent as actionable feedback, not hidden behind generic failure text.
110
+
111
+ ### Step 7: Recommend Minimal Fixes
112
+
113
+ Prefer narrow fixes: split read/write tools, tighten schema, add limits, redact a
114
+ field, add dry-run, lower permissions, add audit logging, or improve descriptions.
115
+ Do not require a platform rewrite when a small contract change handles the risk.
116
+
117
+ ---
118
+
119
+ ## Output Format
120
+
121
+ ```md
122
+ ## MCP Integration Review
123
+
124
+ ### Tool Surface
125
+ - Tools/resources reviewed: <names>
126
+ - Write-capable: <yes/no + which>
127
+ - Sensitive systems/data: <none or list>
128
+
129
+ ### Findings
130
+ #### BLOCKER: <unsafe tool surface>
131
+ - Evidence: `<file:line>` or reviewed behavior
132
+ - Agent/tool risk: <misuse, data exposure, destructive write, ambiguity, etc.>
133
+ - Fix: <smallest safe fix>
134
+
135
+ #### SHOULD FIX: <schema/routing/operational gap>
136
+ - Evidence: <specific evidence>
137
+ - Risk: <why this affects agent reliability or safety>
138
+ - Fix: <smallest safe fix>
139
+
140
+ ### Verification Needed
141
+ - <schema test, dry-run proof, permission check, audit-log check, etc.>
142
+
143
+ ### Verdict
144
+ - APPROVE / COMMENT / REQUEST_CHANGES
145
+ ```
146
+
147
+ ---
148
+
149
+ ## Common Pitfalls
150
+
151
+ - Exposing broad internal APIs as one generic tool.
152
+ - Trusting tool descriptions instead of validating arguments server-side.
153
+ - Returning huge raw search/docs payloads directly into the model context.
154
+ - Mixing read and write operations under the same ambiguous tool.
155
+ - Letting write tools mutate production systems without dry-run, confirmation,
156
+ audit logging, or rollback/recovery expectations.
157
+ - Logging secrets or sensitive tool outputs where they can re-enter prompts.
@@ -0,0 +1,205 @@
1
+ ---
2
+ name: "product-requirements-writer"
3
+ description: "Load when the user wants to turn a feature idea, product request, vague requirement, or problem statement into a concrete PRD/spec before implementation planning or coding."
4
+ version: 1.0.0
5
+ required: false
6
+ category: planning
7
+ tools:
8
+ - claude
9
+ - copilot
10
+ - codex
11
+ - cursor
12
+ routing:
13
+ triggers:
14
+ - prd
15
+ - product-requirements
16
+ - feature-spec
17
+ - requirements-discovery
18
+ - vague-feature-request
19
+ paths:
20
+ - exploration-path
21
+ - full-path
22
+ - policy-path
23
+ ---
24
+
25
+ # Product Requirements Writer
26
+
27
+ You are a specialist in turning rough feature ideas into clear product requirements documents (PRDs). Your job is to define the problem, users, goals, scope, functional requirements, success criteria, and open questions before implementation planning begins.
28
+
29
+ This skill creates a planning artifact. It does not implement the feature.
30
+
31
+ ## When to Load
32
+
33
+ Load this skill when the user asks to:
34
+
35
+ - turn an idea, product request, or problem statement into a PRD/spec
36
+ - clarify what a feature should do before coding
37
+ - write requirements for a feature, workflow, or user-facing behavior
38
+ - convert vague acceptance criteria into a concrete product/spec document
39
+ - prepare a requirements artifact that can later drive implementation tasks
40
+
41
+ Do not load this skill for:
42
+
43
+ - reviewing whether implementation matches an existing spec — use `skills/spec-reviewer/SKILL.md`
44
+ - generating implementation tasks from an existing PRD — use `skills/implementation-task-planner/SKILL.md`
45
+ - fixing bugs, CI, tests, or runtime behavior directly
46
+ - tiny tasks where a PRD would add ceremony without improving decisions
47
+
48
+ ## Core Principle: Clarify the Contract Before Planning the Work
49
+
50
+ A useful PRD is a contract between product intent and implementation planning. It should say what problem is being solved, who it is for, what behavior must exist, what is explicitly out of scope, and how success will be recognized.
51
+
52
+ Avoid implementation design unless technical constraints are already known and relevant. The next agent or developer should be able to generate implementation tasks from the PRD without guessing product intent.
53
+
54
+ ## Process
55
+
56
+ ### 1. Intake the User Request
57
+
58
+ Identify the feature, user/problem, expected outcome, and any constraints already provided.
59
+
60
+ If the request already contains enough detail for a lightweight PRD, proceed. If critical gaps remain, ask clarifying questions first.
61
+
62
+ ### 2. Ask Only Essential Clarifying Questions
63
+
64
+ Ask no more than five questions, and only for gaps that materially affect the PRD. Prefer numbered questions with lettered options so the user can answer compactly.
65
+
66
+ Common question areas:
67
+
68
+ - **Problem / goal:** what user pain or business outcome matters?
69
+ - **Target user:** who needs this behavior?
70
+ - **Core workflow:** what actions must the user/system perform?
71
+ - **Scope boundary:** what should this feature not include?
72
+ - **Success criteria:** how will we know this is working?
73
+ - **Constraints:** platform, compatibility, data, policy, or timing constraints
74
+
75
+ Example format:
76
+
77
+ ```md
78
+ 1. Who is the primary user for this feature?
79
+ A. New users
80
+ B. Existing users
81
+ C. Admin users
82
+ D. Both end users and admins
83
+
84
+ 2. What is the main success signal?
85
+ A. Faster task completion
86
+ B. Fewer support requests
87
+ C. Higher conversion
88
+ D. Internal workflow reliability
89
+ ```
90
+
91
+ If the user asks for the PRD immediately and the gaps are minor, state assumptions instead of blocking.
92
+
93
+ ### 3. Refine Raw Ideas Before Writing
94
+
95
+ When the request is still a raw idea rather than a clear feature request, do a lightweight refinement pass before generating the PRD:
96
+
97
+ 1. Restate the idea as a crisp "How might we..." problem statement.
98
+ 2. Offer 2-3 meaningfully different directions, including the simplest useful version.
99
+ 3. Ask the user to choose a direction if the choice changes scope, user value, or success criteria.
100
+ 4. Capture key assumptions to validate and an explicit MVP scope in the PRD.
101
+
102
+ Do not run a broad ideation workshop by default. Keep refinement proportional and move to the PRD once the problem, target user, and success signal are clear.
103
+
104
+ ### 4. Generate the PRD
105
+
106
+ Use this structure unless the repo has a stronger local convention:
107
+
108
+ ```md
109
+ # PRD: <Feature Name>
110
+
111
+ ## Overview
112
+ Briefly describe the feature, problem, and intended outcome.
113
+
114
+ ## Goals
115
+ - <Specific measurable or observable goal>
116
+
117
+ ## Non-Goals
118
+ - <Explicitly out-of-scope behavior>
119
+
120
+ ## Target Users
121
+ - <User segment/persona and why they need it>
122
+
123
+ ## User Stories
124
+ - As a <user>, I want <capability>, so that <benefit>.
125
+
126
+ ## MVP Scope
127
+ - <Smallest useful version that validates the core product assumption.>
128
+
129
+ ## Key Assumptions
130
+ - <Assumption and how it could be validated.>
131
+
132
+ ## Functional Requirements
133
+ 1. The system must <required behavior>.
134
+ 2. The system must <required behavior>.
135
+
136
+ ## UX / Design Considerations
137
+ - <Only include if relevant or known.>
138
+
139
+ ## Technical Considerations
140
+ - <Known constraints, integrations, data, compatibility, or migration notes.>
141
+
142
+ ## Success Metrics
143
+ - <Observable metric, quality bar, or acceptance signal.>
144
+
145
+ ## Open Questions
146
+ - <Questions that remain unresolved.>
147
+ ```
148
+
149
+ For small internal features, keep the PRD lightweight. Do not inflate it with generic product-management boilerplate.
150
+
151
+ ### 5. Save the Artifact When Working in a Repo
152
+
153
+ If file editing is in scope, save the PRD under the project root as:
154
+
155
+ ```txt
156
+ tasks/prd-[feature-name].md
157
+ ```
158
+
159
+ Use lowercase hyphenated names. If the repo already has a planning/spec directory, follow that convention instead and mention the chosen path.
160
+
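The naming rule above can be sketched as a small helper. This is an illustrative assumption about how an agent or script might derive the path, not code shipped in the package; `prd_path` is a hypothetical name.

```python
import re
from pathlib import PurePosixPath


def prd_path(feature_name: str, base_dir: str = "tasks") -> str:
    """Build a project-relative PRD path with a lowercase hyphenated slug."""
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", feature_name.lower()).strip("-")
    if not slug:
        raise ValueError("feature name must contain at least one letter or digit")
    # base_dir is relative on purpose: tasks/ under the project root, not /tasks.
    return str(PurePosixPath(base_dir) / f"prd-{slug}.md")
```

For example, `prd_path("Bulk CSV Export!")` yields `tasks/prd-bulk-csv-export.md`. Passing a different `base_dir` covers repos that already have a planning or spec directory.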
161
+ ### 6. Stop Before Implementation
162
+
163
+ After producing the PRD, stop. Do not generate implementation tasks unless the user asks or routing selects `skills/implementation-task-planner/SKILL.md` as a separate follow-on step. Do not edit product code.
164
+
165
+ ## Output Format
166
+
167
+ When asking clarifying questions:
168
+
169
+ ```md
170
+ I need a few details before writing the PRD:
171
+
172
+ 1. <question>
173
+ A. <option>
174
+ B. <option>
175
+ C. <option>
176
+ D. Other: <short prompt>
177
+ ```
178
+
179
+ When producing the PRD in chat, include:
180
+
181
+ ```md
182
+ Created PRD: `tasks/prd-[feature-name].md`
183
+
184
+ <brief summary of the PRD>
185
+
186
+ Open questions: <none or short list>
187
+ ```
188
+
189
+ ## Common Pitfalls
190
+
191
+ 1. **Asking too many questions.** Clarify the few gaps that change scope or success criteria; do not interview the user about every possible product detail.
192
+ 2. **Designing implementation too early.** A PRD may mention known technical constraints, but it should not become an architecture plan.
193
+ 3. **Omitting non-goals.** Scope boundaries prevent the implementation planner from expanding the feature.
194
+ 4. **Writing vague requirements.** "Make it better" is not a functional requirement. State observable system behavior.
195
+ 5. **Saving to `/tasks`.** Use `tasks/` under the project root, not the filesystem root.
196
+ 6. **Continuing into code.** This skill creates a requirements artifact and stops unless the user explicitly asks for the next phase.
197
+
198
+ ## Verification Checklist
199
+
200
+ - [ ] Critical gaps were clarified or assumptions were stated
201
+ - [ ] PRD has goals, non-goals, MVP scope, key assumptions, functional requirements, success metrics, and open questions
202
+ - [ ] Requirements are observable and suitable for implementation planning
203
+ - [ ] Output path follows repo convention or `tasks/prd-[feature-name].md`
204
+ - [ ] No implementation code was changed
205
+ - [ ] Follow-on task planning is treated as a separate routed step
@@ -0,0 +1,240 @@
1
+ ---
2
+ name: "production-readiness-reviewer"
3
+ description: "Load when reviewing changes that may affect production safety: persistence, migrations, external services, async jobs, auth/security/privacy, infra/config/deploy, critical user paths, performance/scale, or cross-service compatibility."
4
+ version: 1.1.0
5
+ required: false
6
+ category: review
7
+ tools:
8
+ - claude
9
+ - copilot
10
+ - codex
11
+ - cursor
12
+ routing:
13
+ triggers:
14
+ - production-readiness
15
+ - production-safety
16
+ - migration
17
+ - persistence
18
+ - external-service
19
+ - async-job
20
+ - auth-security-privacy
21
+ - infra-config-deploy
22
+ - critical-user-path
23
+ - performance-scale
24
+ - cross-service-compatibility
25
+ paths:
26
+ - full-path
27
+ - debugging-path
28
+ - review-path
29
+ ---
30
+
31
+ ## Review Depth
32
+
33
+ Default to the lightest useful review.
34
+
35
+ ### Fast Path
36
+ Use only when the change is small, localized, low-risk, and project gates are already passing or not relevant.
37
+
38
+ Output:
39
+ - Top 1-3 material findings only
40
+ - `No material findings` if clean
41
+ - Verification gaps only when they affect merge confidence
42
+
43
+ Do not emit the full checklist when there are no findings.
44
+
45
+ ### Deep Path
46
+ Use the full review process when the change is high-risk, cross-cutting, production-sensitive, security/data-sensitive, behavior-changing without adequate tests, has failing or missing gates, or is explicitly requested.
47
+
48
+ # Production Readiness Reviewer
49
+
50
+ You are a specialist in reviewing whether working code is safe to ship and
51
+ operate. Your job is to answer: if this reaches production, what could break, how
52
+ would the team notice, and how would they recover?
53
+
54
+ This skill complements tests, code review, architecture-boundary review, and
55
+ codebase-health review. Tests prove expected behavior; this skill reviews
56
+ failure modes, observability, rollback, data safety, compatibility, and scale.
57
+
58
+ ---
59
+
60
+ ## When to Use
61
+
62
+ Use this skill before merge/review when a change touches production-sensitive
63
+ surfaces:
64
+
65
+ - Persistence, database schemas, migrations, backfills, data deletion, or data
66
+ consistency
67
+ - External APIs, webhooks, payment providers, auth providers, email/SMS vendors,
68
+ or vendor SDK upgrades
69
+ - Queues, background jobs, cron jobs, retries, events, streams, cache invalidation,
70
+ or asynchronous workflows
71
+ - Auth, permissions, security, privacy, PII, secrets, or audit-sensitive behavior
72
+ - Critical user paths such as login, signup, checkout, billing, notifications,
73
+ data export/import, or permissions
74
+ - Infra, deploy scripts, environment variables, config flags, feature flags,
75
+ rollout behavior, or rollback-sensitive work
76
+ - High-traffic or performance-sensitive paths, large payloads, expensive queries,
77
+ memory use, or concurrency/locking behavior
78
+ - Cross-service APIs, package contracts, backwards compatibility, or old/new
79
+ client-server coexistence
80
+
81
+ Do not use this skill for docs-only edits, formatting, tests that do not alter
82
+ production behavior, local-only refactors with no runtime/API effect, or small UI
83
+ copy changes unless they affect legal, security, billing, or critical workflows.
84
+
85
+ ---
86
+
87
+ ## Review Process
88
+
89
+ ### Step 1: Classify the Production Risk
90
+
91
+ List only the risk classes that apply:
92
+
93
+ - Persistence/data
94
+ - External dependency
95
+ - Async/background work
96
+ - Auth/security/privacy
97
+ - Critical user path
98
+ - Infra/config/deploy
99
+ - Performance/scale
100
+ - Cross-service compatibility
101
+
102
+ If none apply, say production readiness review is not required and stop.
103
+
104
+ ### Step 2: Identify Failure Modes
105
+
106
+ Ask what can fail even if tests pass:
107
+
108
+ - What happens if a dependency is slow, down, returns malformed data, or succeeds
109
+ after a local timeout?
110
+ - What happens if the deploy is partial, old and new code run together, or the
111
+ operation runs twice?
112
+ - How does the system behave with large inputs, empty states, repeated retries,
113
+ duplicate messages, or stale caches?
114
+ - Which data can be corrupted, lost, duplicated, exposed, or made inconsistent?
115
+ - How large is the blast radius across users, tenants, services, and jobs?
116
+
117
+ ### Step 3: Check Observability
118
+
119
+ Verify the team can diagnose production behavior:
120
+
121
+ - Logs include stable identifiers needed for debugging, without leaking PII or
122
+ secrets.
123
+ - Metrics, counters, traces, or alerts exist when silent failure would matter.
124
+ - Background jobs and webhooks expose attempted, succeeded, skipped, retried, and
125
+ failed counts when relevant.
126
+ - Support/debugging can identify affected users, tenants, requests, events, or
127
+ jobs.
128
+
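The attempted/succeeded/skipped/retried/failed counts from Step 3 can be sketched with an in-process counter. `JobMetrics` is a hypothetical class for illustration; a production system would more likely emit these through its existing metrics library.

```python
from collections import Counter

OUTCOMES = ("succeeded", "skipped", "retried", "failed")


class JobMetrics:
    """In-memory outcome counters, keyed by job name."""

    def __init__(self):
        self.counts = Counter()

    def record(self, job_name: str, outcome: str) -> None:
        """Count one attempt and its outcome for the given job."""
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        self.counts[(job_name, "attempted")] += 1
        self.counts[(job_name, outcome)] += 1

    def snapshot(self, job_name: str) -> dict:
        """Return all counters for one job, including zeros."""
        return {
            name: self.counts[(job_name, name)]
            for name in ("attempted",) + OUTCOMES
        }
```

Reporting zeros explicitly matters here: a job whose `succeeded` count stays at zero while `attempted` grows is the silent failure this step is meant to surface.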
129
+ ### Step 4: Check Rollback and Recovery
130
+
131
+ Review how the team gets back to safety:
132
+
133
+ - Rollback is safe for code and data, or unsafe rollback is explicitly called out.
134
+ - Migrations are backward-compatible when old and new code may coexist.
135
+ - Feature flags, kill switches, replay, reconciliation, or repair scripts exist
136
+ when needed.
137
+ - Writes, jobs, webhooks, and retries are idempotent where duplicate execution is
138
+ plausible.
139
+
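To make the idempotency item in Step 4 concrete, here is a minimal sketch of a write that is safe to retry. `PaymentProcessor` and its in-memory result table are hypothetical; a real service would need a durable store and an atomic check-and-insert so that concurrent duplicates are also caught.

```python
class PaymentProcessor:
    """Toy processor that deduplicates charges by idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored result
        self.charges = []   # side effects actually performed

    def charge(self, idempotency_key: str, account: str, amount_cents: int) -> dict:
        """Charge once per key; a repeated key returns the first result."""
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((account, amount_cents))
        result = {"status": "charged", "account": account, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result
```

With this shape, a webhook redelivery or job retry that reuses the same key gets the original result back instead of moving money twice.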
140
+ ### Step 5: Check Compatibility and Scale
141
+
142
+ For APIs, clients, packages, services, and high-traffic paths:
143
+
144
+ - Public contract changes are additive or have a migration plan.
145
+ - Older clients or consumers continue to work during rollout.
146
+ - Queries are bounded and paginated where needed.
147
+ - Request paths avoid unnecessary N+1 calls, unbounded loops, synchronous heavy work,
148
+ and retry storms.
149
+ - New dependencies have timeout, retry, and failure behavior that fits the caller.
150
+
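One way to keep retries from turning into a retry storm is to bound both the attempt count and the delay. The sketch below assumes a hypothetical `TransientError` raised by the caller's operation; the default limits are illustrative, not recommendations.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_retries(operation, max_attempts=3, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep):
    """Call operation, retrying transient failures with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # retry budget exhausted: surface the failure to the caller
            # Delay doubles on each attempt but never exceeds max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
```

Only `TransientError` is retried, so validation or permission failures fail fast. The `sleep` parameter is injectable so tests can run without waiting, and adding random jitter to the delay is a common refinement when many callers retry at once.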
151
+ ### Step 6: Recommend Minimal Fixes
152
+
153
+ Do not expand the PR into a platform rewrite. For each finding, recommend the
154
+ smallest production-safety fix, such as:
155
+
156
+ - add an idempotency key or dedupe guard
157
+ - split a migration into expand/backfill/contract
158
+ - add a feature flag or rollback note
159
+ - log a stable event/request/job identifier
160
+ - add a metric or alert for silent failure
161
+ - add pagination, bounds, timeout, or retry limits
162
+ - preserve backwards compatibility during transition
163
+ - create a follow-up issue for broad operational hardening outside current scope
164
+
165
+ ---
166
+
167
+ ## Output Format
168
+
169
+ ```md
170
+ ## Production Readiness Review
171
+
172
+ ### Risk Classes
173
+ - <Persistence/data>
174
+ - <External dependency>
175
+ - <Async/background work>
176
+ - <...>
177
+
178
+ ### Findings
179
+ #### BLOCKER: <production safety issue>
180
+ - Evidence: `<file:line>` or reviewed behavior
181
+ - Production impact: <what breaks and blast radius>
182
+ - Fix: <smallest safe fix>
183
+
184
+ #### SHOULD FIX: <operational gap>
185
+ - Evidence: <specific evidence>
186
+ - Production impact: <why it matters>
187
+ - Fix: <smallest safe fix>
188
+
189
+ #### FOLLOW-UP: <pre-existing or broad hardening item>
190
+ - Scope: <why it is outside this change>
191
+ - Recommendation: <issue/docs/tooling follow-up>
192
+
193
+ ### Rollout / Recovery
194
+ - Rollback safe: yes / no / unknown, with reason
195
+ - Required before deploy: <none or concrete action>
196
+
197
+ ### Verdict
198
+ - APPROVE / COMMENT / REQUEST_CHANGES
199
+ ```
200
+
201
+ Use severities consistently:
202
+
203
+ - **BLOCKER** — plausible production incident, data loss/corruption, security or
204
+ privacy exposure, duplicate money movement, irreversible migration risk, or
205
+ no safe rollback for a risky change
206
+ - **SHOULD FIX** — meaningful operability, compatibility, or scale gap that is
207
+ cheaper to fix before merge
208
+ - **FOLLOW-UP** — pre-existing or broader hardening that should not block this
209
+ scoped change
210
+
211
+ ---
212
+
213
+ ## Common Pitfalls
214
+
215
+ 1. **Repeating normal code review.** Do not restate generic correctness, style, or
216
+ test coverage findings unless they create production risk.
217
+ 2. **Inventing theoretical risks for low-risk changes.** If no production-sensitive
218
+ surface changed, say this skill is not required.
219
+ 3. **Blocking on broad pre-existing debt.** Separate new risk from old risk and avoid
220
+ making one PR fix the whole system.
221
+ 4. **Accepting "tests pass" as production proof.** Tests do not prove rollback,
222
+ observability, idempotency, or partial-deploy safety.
223
+ 5. **Ignoring privacy in observability.** Useful logs must not leak secrets, PII,
224
+ auth tokens, or payment data.
225
+ 6. **Demanding a perfect rollout plan for tiny safe changes.** Scale the review to
226
+ blast radius and reversibility.
227
+
228
+ ---
229
+
230
+ ## Verification Checklist
231
+
232
+ - [ ] Production risk classes are identified, or review is explicitly not required
233
+ - [ ] Failure modes include dependency, duplicate execution, partial rollout, and
234
+ data/blast-radius concerns when relevant
235
+ - [ ] Observability is checked without encouraging PII/secrets in logs
236
+ - [ ] Rollback/recovery and idempotency are checked for risky changes
237
+ - [ ] Compatibility and scale risks are checked for APIs, clients, services, and
238
+ high-traffic paths
239
+ - [ ] Findings are classified as blocker / should-fix / follow-up
240
+ - [ ] Recommended fixes are minimal and scoped