agent-directives 0.1.0
- package/README.md +385 -0
- package/directives/adaptive-routing.md +361 -0
- package/directives/architecture-boundaries.md +223 -0
- package/directives/codebase-navigation.md +325 -0
- package/directives/context-handoff.md +220 -0
- package/directives/error-memory.md +169 -0
- package/directives/exploration-mode.md +266 -0
- package/directives/session-decisions.md +193 -0
- package/directives/specification-driven-development.md +278 -0
- package/directives/task-framing.md +154 -0
- package/directives/test-driven-development.md +305 -0
- package/directives/type-driven-development.md +173 -0
- package/directives/verification.md +266 -0
- package/directives/workspace-isolation.md +219 -0
- package/dist/cli.d.ts +3 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/cli.js +232 -0
- package/dist/cli.js.map +1 -0
- package/dist/context-audit.d.ts +30 -0
- package/dist/context-audit.d.ts.map +1 -0
- package/dist/context-audit.js +75 -0
- package/dist/context-audit.js.map +1 -0
- package/dist/install.d.ts +18 -0
- package/dist/install.d.ts.map +1 -0
- package/dist/install.js +28 -0
- package/dist/install.js.map +1 -0
- package/dist/manifest.d.ts +25 -0
- package/dist/manifest.d.ts.map +1 -0
- package/dist/manifest.js +29 -0
- package/dist/manifest.js.map +1 -0
- package/dist/prompt.d.ts +3 -0
- package/dist/prompt.d.ts.map +1 -0
- package/dist/prompt.js +29 -0
- package/dist/prompt.js.map +1 -0
- package/dist/targets.d.ts +10 -0
- package/dist/targets.d.ts.map +1 -0
- package/dist/targets.js +32 -0
- package/dist/targets.js.map +1 -0
- package/manifest.json +387 -0
- package/package.json +74 -0
- package/skills/architecture-boundary-reviewer/SKILL.md +228 -0
- package/skills/code-reviewer/SKILL.md +77 -0
- package/skills/codebase-health-reviewer/SKILL.md +234 -0
- package/skills/harness-hooks-reviewer/SKILL.md +159 -0
- package/skills/implementation-task-planner/SKILL.md +205 -0
- package/skills/mcp-integration-reviewer/SKILL.md +157 -0
- package/skills/product-requirements-writer/SKILL.md +205 -0
- package/skills/production-readiness-reviewer/SKILL.md +240 -0
- package/skills/self-audit/SKILL.md +134 -0
- package/skills/spec-reviewer/SKILL.md +304 -0
- package/skills/subagent-driven-development/SKILL.md +236 -0
- package/skills/systematic-debugging/SKILL.md +313 -0
- package/skills/test-reviewer/SKILL.md +293 -0
- package/templates/AGENTS.md +120 -0
- package/templates/CLAUDE.md +115 -0
- package/templates/copilot-instructions.md +116 -0
- package/templates/decision-log.md +44 -0
@@ -0,0 +1,157 @@
---
name: "mcp-integration-reviewer"
description: "Load when adding or reviewing MCP servers, agent tools, tool schemas, internal API bridges, structured search, docs/ticketing/analytics connectors, or agent-accessible write tools."
version: 1.0.0
required: false
category: review
tools:
  - claude
  - copilot
  - codex
  - cursor
routing:
  triggers:
    - mcp-server
    - mcp-tool
    - agent-tool-schema
    - internal-tool-bridge
    - structured-search-tool
    - agent-accessible-api
    - agent-write-tool
  paths:
    - full-path
    - review-path
    - policy-path
---

# MCP Integration Reviewer

You are a specialist in reviewing Model Context Protocol (MCP) servers and other
agent-accessible tool surfaces. Your job is to make sure the agent can call the
right tool safely, with strict schemas, least privilege, bounded output, and clear
failure behavior.

This skill applies to MCP specifically and to similar internal tool bridges that
expose APIs, search, tickets, analytics, docs, deploys, or data systems to agents.

---

## When to Use

Use this skill when work adds, changes, or reviews:

- MCP servers, tool definitions, resources, prompts, or transports
- agent-callable wrappers around internal APIs, search, docs, ticketing, analytics,
  deploy, data, or operational systems
- tool schemas, descriptions, argument validation, output contracts, or permissions
- write-capable tools or tools that can expose sensitive data

Do not use this skill for ordinary application APIs unless they are exposed to an
agent as tools.

---

## Review Process

### Step 1: Inventory the Tool Surface

List only the exposed agent-facing capabilities:

- tool/resource/prompt names
- read vs write behavior
- external/internal systems touched
- auth identity and permission scope
- expected output shape and size

### Step 2: Check Tool Routing Quality

Verify tool names and descriptions tell an agent when to use the tool and when not
to. A good tool description includes task intent, boundaries, required identifiers,
and important side effects.

Flag vague names like `run`, `query`, `doThing`, or broad descriptions like
"access internal systems" unless the surrounding schema strongly disambiguates.
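
A reviewer could automate a first pass over this check. The sketch below is only an illustration: the `ToolDef` shape, the `routingFindings` helper, and the 40-character description threshold are assumptions made for this example, not part of MCP or of this package. The vague-name list is seeded from the names mentioned above.

```typescript
// Hypothetical tool definition shape; real tool objects usually carry more fields.
interface ToolDef {
  name: string;
  description: string;
}

// Seeded from the examples in this step; extend per project.
const VAGUE_NAMES = new Set(["run", "query", "dothing"]);
const MIN_DESCRIPTION_LENGTH = 40; // assumed threshold, tune as needed

// Return human-readable findings for names or descriptions that give
// an agent too little signal about when to call the tool.
function routingFindings(tool: ToolDef): string[] {
  const findings: string[] = [];
  if (VAGUE_NAMES.has(tool.name.toLowerCase())) {
    findings.push(`vague name: ${tool.name}`);
  }
  if (tool.description.trim().length < MIN_DESCRIPTION_LENGTH) {
    findings.push(`description too short to state intent and boundaries: ${tool.name}`);
  }
  return findings;
}
```

A length check cannot judge whether a description actually states boundaries and side effects, so findings from a helper like this are prompts for manual review, not verdicts.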

### Step 3: Check Schemas and Validation

Require:

- strict argument schemas with required fields, enums, bounds, and formats
- server-side validation, not only client-side hints
- pagination or limits for large reads
- structured errors with actionable codes/messages
- stable output fields that avoid dumping unbounded raw documents by default
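
The requirements above can be sketched as a hand-written server-side validator. Everything here is hypothetical: the `SearchArgs` tool arguments, the `INVALID_ARGUMENT` code, and the cap of 50 results are assumptions chosen for illustration. A real server might use a schema library instead.

```typescript
// Hypothetical arguments for a read-only ticket search tool.
interface SearchArgs {
  query: string;
  status: "open" | "closed";
  limit: number;
}

// Structured error with a stable code and the offending field.
interface ToolError {
  code: "INVALID_ARGUMENT";
  field: string;
  message: string;
}

type Result<T> = { ok: true; value: T } | { ok: false; error: ToolError };

const MAX_LIMIT = 50; // assumed page-size cap
const DEFAULT_LIMIT = 20; // assumed default so reads stay bounded

function invalid(field: string, message: string): Result<never> {
  return { ok: false, error: { code: "INVALID_ARGUMENT", field, message } };
}

function isStatus(value: unknown): value is SearchArgs["status"] {
  return value === "open" || value === "closed";
}

// Validate on the server: never assume the caller honored the schema.
function validateSearchArgs(input: unknown): Result<SearchArgs> {
  if (typeof input !== "object" || input === null) {
    return invalid("(root)", "arguments must be an object");
  }
  const raw = input as Record<string, unknown>;
  const query = raw.query;
  if (typeof query !== "string" || query.trim().length === 0) {
    return invalid("query", "query is required and must be a non-empty string");
  }
  const status = raw.status;
  if (!isStatus(status)) {
    return invalid("status", "status must be one of: open, closed");
  }
  const limit = raw.limit ?? DEFAULT_LIMIT;
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1 || limit > MAX_LIMIT) {
    return invalid("limit", `limit must be an integer between 1 and ${MAX_LIMIT}`);
  }
  return { ok: true, value: { query, status, limit } };
}
```

Returning the field name alongside the code gives the agent enough detail to correct the call rather than retry blindly.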

### Step 4: Check Auth, Secrets, and Data Boundaries

Review:

- least-privilege auth for the tool's real blast radius
- separation between user identity, service identity, and elevated/admin identity
- secret handling and redaction in logs, errors, traces, and model-visible output
- tenant/user/project scoping for internal data
- audit logging for sensitive reads and all meaningful writes
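
For the redaction item, one possible shape is a key-based filter applied before anything is logged or returned to the model. This is a minimal sketch: the `redact` helper and the key patterns in `SENSITIVE_KEY` are assumptions for illustration and should be adapted to the secrets a given system actually handles.

```typescript
// Assumed key patterns; extend to match the secrets your systems actually use.
const SENSITIVE_KEY = /(password|secret|token|api[_-]?key|authorization)/i;

// Recursively copy a value, masking fields whose key looks sensitive.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(redact);
  }
  if (typeof value === "object" && value !== null) {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : redact(inner);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

Key-based masking does not catch secrets embedded in free-text values, so it complements, rather than replaces, keeping secrets out of tool output in the first place.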

### Step 5: Check Write Safety

For write-capable tools, require appropriate safeguards:

- dry-run or preview mode when practical
- explicit confirmation for destructive, deploy, billing, permission, or data writes
- idempotency keys or duplicate-call protection when retries are plausible
- rollback/recovery notes for high-impact changes
- clear distinction between create/update/delete operations
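
As one way to picture the first three safeguards working together, the sketch below models a destructive tool that previews by default, refuses to write without confirmation, and replays the earlier result when a call is retried with the same key. The `DeleteRequest` fields, the `deleteRecord` function, and the in-memory stores are all hypothetical stand-ins.

```typescript
// Hypothetical request for a destructive tool; all field names are illustrative.
interface DeleteRequest {
  recordId: string;
  dryRun?: boolean; // defaults to true, so a bare call only previews
  confirm?: boolean; // must be true for the real delete
  idempotencyKey: string; // caller-supplied, reused on retry
}

type DeleteOutcome =
  | { status: "preview"; wouldDelete: string }
  | { status: "rejected"; reason: string }
  | { status: "deleted"; recordId: string; replayed: boolean };

// In-memory stand-ins for a real datastore and an idempotency table.
const records = new Set(["rec-1", "rec-2"]);
const completed = new Map<string, string>();

function deleteRecord(req: DeleteRequest): DeleteOutcome {
  if (req.dryRun ?? true) {
    return { status: "preview", wouldDelete: req.recordId };
  }
  if (req.confirm !== true) {
    return { status: "rejected", reason: "confirm must be true for destructive writes" };
  }
  const previous = completed.get(req.idempotencyKey);
  if (previous !== undefined) {
    // A retry with the same key returns the earlier result instead of repeating the write.
    return { status: "deleted", recordId: previous, replayed: true };
  }
  if (!records.delete(req.recordId)) {
    return { status: "rejected", reason: `record not found: ${req.recordId}` };
  }
  completed.set(req.idempotencyKey, req.recordId);
  return { status: "deleted", recordId: req.recordId, replayed: false };
}
```

Defaulting `dryRun` to true is a design choice in this sketch: an agent that omits the flag gets a preview instead of a mutation.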

### Step 6: Check Operational Behavior

Look for timeouts, retries, rate limits, cancellation, concurrency limits,
backpressure, and dependency-failure behavior. Tool errors should be visible to
the agent as implementation feedback, not hidden behind generic failure text.
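
A timeout that surfaces a specific error, rather than generic failure text, might look like the sketch below. The `ToolFailure` shape, its error codes, and the `callWithTimeout` helper are assumptions for illustration. The only platform APIs used are `AbortController`, `setTimeout`, and `Promise.race`.

```typescript
// Hypothetical error shape surfaced to the agent; field names are illustrative.
interface ToolFailure {
  code: "TIMEOUT" | "DEPENDENCY_ERROR";
  message: string;
  retryable: boolean;
}

type ToolResult<T> = { ok: true; value: T } | { ok: false; error: ToolFailure };

// Run a dependency call with a deadline and convert failures into specific errors.
async function callWithTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<ToolResult<T>> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // lets the callee cancel in-flight work
      reject(new Error("deadline exceeded"));
    }, timeoutMs);
  });
  try {
    const value = await Promise.race([work(controller.signal), deadline]);
    return { ok: true, value };
  } catch (err) {
    if (controller.signal.aborted) {
      return {
        ok: false,
        error: { code: "TIMEOUT", message: `no response within ${timeoutMs} ms`, retryable: true },
      };
    }
    const detail = err instanceof Error ? err.message : String(err);
    return { ok: false, error: { code: "DEPENDENCY_ERROR", message: detail, retryable: false } };
  } finally {
    clearTimeout(timer);
  }
}
```

Marking every dependency error as non-retryable is a simplification here. A real tool would classify errors, for example treating rate-limit responses as retryable.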

### Step 7: Recommend Minimal Fixes

Prefer narrow fixes: split read/write tools, tighten schema, add limits, redact a
field, add dry-run, lower permissions, add audit logging, or improve descriptions.
Do not require a platform rewrite when a small contract change handles the risk.

---

## Output Format

```md
## MCP Integration Review

### Tool Surface
- Tools/resources reviewed: <names>
- Write-capable: <yes/no + which>
- Sensitive systems/data: <none or list>

### Findings
#### BLOCKER: <unsafe tool surface>
- Evidence: `<file:line>` or reviewed behavior
- Agent/tool risk: <misuse, data exposure, destructive write, ambiguity, etc.>
- Fix: <smallest safe fix>

#### SHOULD FIX: <schema/routing/operational gap>
- Evidence: <specific evidence>
- Risk: <why this affects agent reliability or safety>
- Fix: <smallest safe fix>

### Verification Needed
- <schema test, dry-run proof, permission check, audit-log check, etc.>

### Verdict
- APPROVE / COMMENT / REQUEST_CHANGES
```

---

## Common Pitfalls

- Exposing broad internal APIs as one generic tool.
- Trusting tool descriptions instead of validating arguments server-side.
- Returning huge raw search/docs payloads directly into the model context.
- Mixing read and write operations under the same ambiguous tool.
- Letting write tools mutate production systems without dry-run, confirmation,
  audit logging, or rollback/recovery expectations.
- Logging secrets or sensitive tool outputs where they can re-enter prompts.
@@ -0,0 +1,205 @@
---
name: "product-requirements-writer"
description: "Load when the user wants to turn a feature idea, product request, vague requirement, or problem statement into a concrete PRD/spec before implementation planning or coding."
version: 1.0.0
required: false
category: planning
tools:
  - claude
  - copilot
  - codex
  - cursor
routing:
  triggers:
    - prd
    - product-requirements
    - feature-spec
    - requirements-discovery
    - vague-feature-request
  paths:
    - exploration-path
    - full-path
    - policy-path
---

# Product Requirements Writer

You are a specialist in turning rough feature ideas into clear product requirements documents (PRDs). Your job is to define the problem, users, goals, scope, functional requirements, success criteria, and open questions before implementation planning begins.

This skill creates a planning artifact. It does not implement the feature.

## When to Load

Load this skill when the user asks to:

- turn an idea, product request, or problem statement into a PRD/spec
- clarify what a feature should do before coding
- write requirements for a feature, workflow, or user-facing behavior
- convert vague acceptance criteria into a concrete product/spec document
- prepare a requirements artifact that can later drive implementation tasks

Do not load this skill for:

- reviewing whether implementation matches an existing spec: use `skills/spec-reviewer/SKILL.md`
- generating implementation tasks from an existing PRD: use `skills/implementation-task-planner/SKILL.md`
- fixing bugs, CI, tests, or runtime behavior directly
- tiny tasks where a PRD would add ceremony without improving decisions

## Core Principle: Clarify the Contract Before Planning the Work

A useful PRD is a contract between product intent and implementation planning. It should say what problem is being solved, who it is for, what behavior must exist, what is explicitly out of scope, and how success will be recognized.

Avoid implementation design unless technical constraints are already known and relevant. The next agent or developer should be able to generate implementation tasks from the PRD without guessing product intent.

## Process

### 1. Intake the User Request

Identify the feature, user/problem, expected outcome, and any constraints already provided.

If the request already contains enough detail for a lightweight PRD, proceed. If critical gaps remain, ask clarifying questions first.

### 2. Ask Only Essential Clarifying Questions

Ask no more than five questions, and only for gaps that materially affect the PRD. Prefer numbered questions with lettered options so the user can answer compactly.

Common question areas:

- **Problem / goal:** what user pain or business outcome matters?
- **Target user:** who needs this behavior?
- **Core workflow:** what actions must the user/system perform?
- **Scope boundary:** what should this feature not include?
- **Success criteria:** how will we know this is working?
- **Constraints:** platform, compatibility, data, policy, or timing constraints

Example format:

```md
1. Who is the primary user for this feature?
   A. New users
   B. Existing users
   C. Admin users
   D. Both end users and admins

2. What is the main success signal?
   A. Faster task completion
   B. Fewer support requests
   C. Higher conversion
   D. Internal workflow reliability
```

If the user asks for the PRD immediately and the gaps are minor, state assumptions instead of blocking.

### 3. Refine Raw Ideas Before Writing

When the request is still a raw idea rather than a clear feature request, do a lightweight refinement pass before generating the PRD:

1. Restate the idea as a crisp "How might we..." problem statement.
2. Offer 2-3 meaningfully different directions, including the simplest useful version.
3. Ask the user to choose a direction if the choice changes scope, user value, or success criteria.
4. Capture key assumptions to validate and an explicit MVP scope in the PRD.

Do not run a broad ideation workshop by default. Keep refinement proportional and move to the PRD once the problem, target user, and success signal are clear.

### 4. Generate the PRD

Use this structure unless the repo has a stronger local convention:

```md
# PRD: <Feature Name>

## Overview
Briefly describe the feature, problem, and intended outcome.

## Goals
- <Specific measurable or observable goal>

## Non-Goals
- <Explicitly out-of-scope behavior>

## Target Users
- <User segment/persona and why they need it>

## User Stories
- As a <user>, I want <capability>, so that <benefit>.

## MVP Scope
- <Smallest useful version that validates the core product assumption.>

## Key Assumptions
- <Assumption and how it could be validated.>

## Functional Requirements
1. The system must <required behavior>.
2. The system must <required behavior>.

## UX / Design Considerations
- <Only include if relevant or known.>

## Technical Considerations
- <Known constraints, integrations, data, compatibility, or migration notes.>

## Success Metrics
- <Observable metric, quality bar, or acceptance signal.>

## Open Questions
- <Questions that remain unresolved.>
```

For small internal features, keep the PRD lightweight. Do not inflate it with generic product-management boilerplate.

### 5. Save the Artifact When Working in a Repo

If file editing is in scope, save the PRD under the project root as:

```txt
tasks/prd-[feature-name].md
```

Use lowercase hyphenated names. If the repo already has a planning/spec directory, follow that convention instead and mention the chosen path.
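
The naming rule can be made concrete with a small helper. This is an illustrative sketch, not code shipped in this package: `prdPath` and its `dir` parameter are hypothetical names, and the slug rules (ASCII letters and digits only, runs of other characters collapsed to one hyphen) are an assumption about what "lowercase hyphenated" means in practice.

```typescript
// Build a project-relative PRD path from a feature name (illustrative helper).
function prdPath(featureName: string, dir: string = "tasks"): string {
  const slug = featureName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  if (slug.length === 0) {
    throw new Error("feature name must contain at least one letter or digit");
  }
  // No leading slash, so the path resolves under the project root.
  return `${dir}/prd-${slug}.md`;
}

const defaultPath = prdPath("Bulk CSV Export!"); // "tasks/prd-bulk-csv-export.md"
const customPath = prdPath("  SSO / SAML login ", "docs/specs"); // "docs/specs/prd-sso-saml-login.md"
```

Passing `dir` covers the case where the repo already has its own planning or spec directory.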

### 6. Stop Before Implementation

After producing the PRD, stop. Do not generate implementation tasks unless the user asks or routing selects `skills/implementation-task-planner/SKILL.md` as a separate follow-on step. Do not edit product code.

## Output Format

When asking clarifying questions:

```md
I need a few details before writing the PRD:

1. <question>
   A. <option>
   B. <option>
   C. <option>
   D. Other: <short prompt>
```

When producing the PRD in chat, include:

```md
Created PRD: `tasks/prd-[feature-name].md`

<brief summary of the PRD>

Open questions: <none or short list>
```

## Common Pitfalls

1. **Asking too many questions.** Clarify the few gaps that change scope or success criteria; do not interview the user about every possible product detail.
2. **Designing implementation too early.** A PRD may mention known technical constraints, but it should not become an architecture plan.
3. **Omitting non-goals.** Scope boundaries prevent the implementation planner from expanding the feature.
4. **Writing vague requirements.** "Make it better" is not a functional requirement. State observable system behavior.
5. **Saving to `/tasks`.** Use `tasks/` under the project root, not the filesystem root.
6. **Continuing into code.** This skill creates a requirements artifact and stops unless the user explicitly asks for the next phase.

## Verification Checklist

- [ ] Critical gaps were clarified or assumptions were stated
- [ ] PRD has goals, non-goals, MVP scope, key assumptions, functional requirements, success metrics, and open questions
- [ ] Requirements are observable and suitable for implementation planning
- [ ] Output path follows repo convention or `tasks/prd-[feature-name].md`
- [ ] No implementation code was changed
- [ ] Follow-on task planning is treated as a separate routed step
@@ -0,0 +1,240 @@
---
name: "production-readiness-reviewer"
description: "Load when reviewing changes that may affect production safety: persistence, migrations, external services, async jobs, auth/security/privacy, infra/config/deploy, critical user paths, performance/scale, or cross-service compatibility."
version: 1.1.0
required: false
category: review
tools:
  - claude
  - copilot
  - codex
  - cursor
routing:
  triggers:
    - production-readiness
    - production-safety
    - migration
    - persistence
    - external-service
    - async-job
    - auth-security-privacy
    - infra-config-deploy
    - critical-user-path
    - performance-scale
    - cross-service-compatibility
  paths:
    - full-path
    - debugging-path
    - review-path
---

## Review Depth

Default to the lightest useful review.

### Fast Path
Use only when the change is small, localized, low-risk, and project gates are already passing or not relevant.

Output:
- Top 1-3 material findings only
- `No material findings` if clean
- Verification gaps only when they affect merge confidence

Do not emit the full checklist when there are no findings.

### Deep Path
Use the full review process when the change is high-risk, cross-cutting, production-sensitive, security/data-sensitive, behavior-changing without adequate tests, has failing or missing gates, or is explicitly requested.

# Production Readiness Reviewer

You are a specialist in reviewing whether working code is safe to ship and
operate. Your job is to answer: if this reaches production, what could break, how
would the team notice, and how would they recover?

This skill complements tests, code review, architecture-boundary review, and
codebase-health review. Tests prove expected behavior; this skill reviews
failure modes, observability, rollback, data safety, compatibility, and scale.

---

## When to Use

Use this skill before merge/review when a change touches production-sensitive
surfaces:

- Persistence, database schemas, migrations, backfills, data deletion, or data
  consistency
- External APIs, webhooks, payment providers, auth providers, email/SMS vendors,
  or vendor SDK upgrades
- Queues, background jobs, cron jobs, retries, events, streams, cache invalidation,
  or asynchronous workflows
- Auth, permissions, security, privacy, PII, secrets, or audit-sensitive behavior
- Critical user paths such as login, signup, checkout, billing, notifications,
  data export/import, or permissions
- Infra, deploy scripts, environment variables, config flags, feature flags,
  rollout behavior, or rollback-sensitive work
- High-traffic or performance-sensitive paths, large payloads, expensive queries,
  memory use, or concurrency/locking behavior
- Cross-service APIs, package contracts, backwards compatibility, or old/new
  client-server coexistence

Do not use this skill for docs-only edits, formatting, tests that do not alter
production behavior, local-only refactors with no runtime/API effect, or small UI
copy changes unless they affect legal, security, billing, or critical workflows.

---

## Review Process

### Step 1: Classify the Production Risk

List only the risk classes that apply:

- Persistence/data
- External dependency
- Async/background work
- Auth/security/privacy
- Critical user path
- Infra/config/deploy
- Performance/scale
- Cross-service compatibility

If none apply, say production readiness review is not required and stop.

### Step 2: Identify Failure Modes

Ask what can fail even if tests pass:

- What happens if a dependency is slow, down, returns malformed data, or succeeds
  after a local timeout?
- What happens if the deploy is partial, old and new code run together, or the
  operation runs twice?
- How does the system behave with large inputs, empty states, repeated retries,
  duplicate messages, or stale caches?
- Which data can be corrupted, lost, duplicated, exposed, or made inconsistent?
- How large is the blast radius across users, tenants, services, and jobs?

### Step 3: Check Observability

Verify the team can diagnose production behavior:

- Logs include stable identifiers needed for debugging, without leaking PII or
  secrets.
- Metrics, counters, traces, or alerts exist when silent failure would matter.
- Background jobs and webhooks expose attempted, succeeded, skipped, retried, and
  failed counts when relevant.
- Support/debugging can identify affected users, tenants, requests, events, or
  jobs.

### Step 4: Check Rollback and Recovery

Review how the team gets back to safety:

- Rollback is safe for code and data, or unsafe rollback is explicitly called out.
- Migrations are backward-compatible when old and new code may coexist.
- Feature flags, kill switches, replay, reconciliation, or repair scripts exist
  when needed.
- Writes, jobs, webhooks, and retries are idempotent where duplicate execution is
  plausible.

### Step 5: Check Compatibility and Scale

For APIs, clients, packages, services, and high-traffic paths:

- Public contract changes are additive or have a migration plan.
- Older clients or consumers continue to work during rollout.
- Queries are bounded and paginated where needed.
- Request paths avoid avoidable N+1 calls, unbounded loops, synchronous heavy work,
  and retry storms.
- New dependencies have timeout, retry, and failure behavior that fits the caller.
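
For the retry items, a reviewer might look for something shaped like the sketch below: a hard attempt cap plus exponential backoff with full jitter, so that many callers failing together do not retry in lockstep. The `RetryOptions` fields and the `withRetry` and `backoffDelay` names are hypothetical, and appropriate limits depend on the caller.

```typescript
interface RetryOptions {
  maxAttempts: number; // hard cap so a failing dependency is not hammered forever
  baseDelayMs: number;
  maxDelayMs: number;
}

// Exponential backoff with full jitter: a random delay below a capped ceiling.
function backoffDelay(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retry an operation up to maxAttempts times, sleeping between failed attempts.
async function withRetry<T>(operation: () => Promise<T>, opts: RetryOptions): Promise<T> {
  if (opts.maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxAttempts - 1) {
        const delay = backoffDelay(attempt, opts);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError; // all attempts failed; surface the last error to the caller
}
```

Retrying is only safe when the operation is idempotent, which ties this check back to Step 4.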

### Step 6: Recommend Minimal Fixes

Do not expand the PR into a platform rewrite. For each finding, recommend the
smallest production-safety fix, such as:

- add an idempotency key or dedupe guard
- split a migration into expand/backfill/contract
- add a feature flag or rollback note
- log a stable event/request/job identifier
- add a metric or alert for silent failure
- add pagination, bounds, timeout, or retry limits
- preserve backwards compatibility during transition
- create a follow-up issue for broad operational hardening outside current scope

---

## Output Format

```md
## Production Readiness Review

### Risk Classes
- <Persistence/data>
- <External dependency>
- <Async/background work>
- <...>

### Findings
#### BLOCKER: <production safety issue>
- Evidence: `<file:line>` or reviewed behavior
- Production impact: <what breaks and blast radius>
- Fix: <smallest safe fix>

#### SHOULD FIX: <operational gap>
- Evidence: <specific evidence>
- Production impact: <why it matters>
- Fix: <smallest safe fix>

#### FOLLOW-UP: <pre-existing or broad hardening item>
- Scope: <why it is outside this change>
- Recommendation: <issue/docs/tooling follow-up>

### Rollout / Recovery
- Rollback safe: yes / no / unknown, with reason
- Required before deploy: <none or concrete action>

### Verdict
- APPROVE / COMMENT / REQUEST_CHANGES
```

Use severities consistently:

- **BLOCKER**: plausible production incident, data loss/corruption, security or
  privacy exposure, duplicate money movement, irreversible migration risk, or
  no safe rollback for a risky change
- **SHOULD FIX**: meaningful operability, compatibility, or scale gap that is
  cheaper to fix before merge
- **FOLLOW-UP**: pre-existing or broader hardening that should not block this
  scoped change

---

## Common Pitfalls

1. **Repeating normal code review.** Do not restate generic correctness, style, or
   test coverage findings unless they create production risk.
2. **Inventing theoretical risks for low-risk changes.** If no production-sensitive
   surface changed, say this skill is not required.
3. **Blocking broad pre-existing debt.** Separate new risk from old risk and avoid
   making one PR fix the whole system.
4. **Accepting "tests pass" as production proof.** Tests do not prove rollback,
   observability, idempotency, or partial-deploy safety.
5. **Ignoring privacy in observability.** Useful logs must not leak secrets, PII,
   auth tokens, or payment data.
6. **Demanding a perfect rollout plan for tiny safe changes.** Scale the review to
   blast radius and reversibility.

---

## Verification Checklist

- [ ] Production risk classes are identified, or review is explicitly not required
- [ ] Failure modes include dependency, duplicate execution, partial rollout, and
  data/blast-radius concerns when relevant
- [ ] Observability is checked without encouraging PII/secrets in logs
- [ ] Rollback/recovery and idempotency are checked for risky changes
- [ ] Compatibility and scale risks are checked for APIs, clients, services, and
  high-traffic paths
- [ ] Findings are classified as blocker / should-fix / follow-up
- [ ] Recommended fixes are minimal and scoped