agent-directives 0.1.0
- package/README.md +385 -0
- package/directives/adaptive-routing.md +361 -0
- package/directives/architecture-boundaries.md +223 -0
- package/directives/codebase-navigation.md +325 -0
- package/directives/context-handoff.md +220 -0
- package/directives/error-memory.md +169 -0
- package/directives/exploration-mode.md +266 -0
- package/directives/session-decisions.md +193 -0
- package/directives/specification-driven-development.md +278 -0
- package/directives/task-framing.md +154 -0
- package/directives/test-driven-development.md +305 -0
- package/directives/type-driven-development.md +173 -0
- package/directives/verification.md +266 -0
- package/directives/workspace-isolation.md +219 -0
- package/dist/cli.d.ts +3 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/cli.js +232 -0
- package/dist/cli.js.map +1 -0
- package/dist/context-audit.d.ts +30 -0
- package/dist/context-audit.d.ts.map +1 -0
- package/dist/context-audit.js +75 -0
- package/dist/context-audit.js.map +1 -0
- package/dist/install.d.ts +18 -0
- package/dist/install.d.ts.map +1 -0
- package/dist/install.js +28 -0
- package/dist/install.js.map +1 -0
- package/dist/manifest.d.ts +25 -0
- package/dist/manifest.d.ts.map +1 -0
- package/dist/manifest.js +29 -0
- package/dist/manifest.js.map +1 -0
- package/dist/prompt.d.ts +3 -0
- package/dist/prompt.d.ts.map +1 -0
- package/dist/prompt.js +29 -0
- package/dist/prompt.js.map +1 -0
- package/dist/targets.d.ts +10 -0
- package/dist/targets.d.ts.map +1 -0
- package/dist/targets.js +32 -0
- package/dist/targets.js.map +1 -0
- package/manifest.json +387 -0
- package/package.json +74 -0
- package/skills/architecture-boundary-reviewer/SKILL.md +228 -0
- package/skills/code-reviewer/SKILL.md +77 -0
- package/skills/codebase-health-reviewer/SKILL.md +234 -0
- package/skills/harness-hooks-reviewer/SKILL.md +159 -0
- package/skills/implementation-task-planner/SKILL.md +205 -0
- package/skills/mcp-integration-reviewer/SKILL.md +157 -0
- package/skills/product-requirements-writer/SKILL.md +205 -0
- package/skills/production-readiness-reviewer/SKILL.md +240 -0
- package/skills/self-audit/SKILL.md +134 -0
- package/skills/spec-reviewer/SKILL.md +304 -0
- package/skills/subagent-driven-development/SKILL.md +236 -0
- package/skills/systematic-debugging/SKILL.md +313 -0
- package/skills/test-reviewer/SKILL.md +293 -0
- package/templates/AGENTS.md +120 -0
- package/templates/CLAUDE.md +115 -0
- package/templates/copilot-instructions.md +116 -0
- package/templates/decision-log.md +44 -0
|
@@ -0,0 +1,77 @@
+---
+name: "code-reviewer"
+description: "Load when the user asks to review a PR, branch, diff, local changes, or says approve, merge, or check this change for bugs, regressions, security, maintainability, or merge risk."
+version: 1.1.0
+required: true
+category: review
+tools:
+  - claude
+  - copilot
+  - codex
+  - cursor
+routing:
+  triggers:
+    - pull-request
+    - pr-review
+    - code-review
+    - branch-review
+    - diff-review
+    - merge-risk
+  paths:
+    - review-path
+---
+
+## Review Depth
+
+Default to the lightest useful review.
+
+### Fast Path
+Use only when the change is small, localized, low-risk, and project gates are already passing or not relevant.
+
+Output:
+- Top 1-3 material findings only
+- `No material findings` if clean
+- Verification gaps only when they affect merge confidence
+
+Do not emit the full checklist when there are no findings.
+
+### Deep Path
+Use the full review process when the change is high-risk, cross-cutting, production-sensitive, security/data-sensitive, behavior-changing without adequate tests, has failing or missing gates, or is explicitly requested.
+
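The Fast Path and Deep Path criteria above can be read as a small decision function. The sketch below is only an illustration: `Change`, its field names, and `choose_review_depth` are hypothetical and not part of the package, and treating every change that is not small and localized as "deep" is an assumption the text does not state.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical summary of a change under review (field names are assumptions)."""
    small_and_localized: bool
    high_risk: bool = False  # cross-cutting, production-, security-, or data-sensitive
    behavior_change_without_tests: bool = False
    gates_failing_or_missing: bool = False
    deep_review_requested: bool = False


def choose_review_depth(change: Change) -> str:
    """Return "deep" when any Deep Path condition holds, otherwise "fast"."""
    deep_signals = (
        change.high_risk,
        change.behavior_change_without_tests,
        change.gates_failing_or_missing,
        change.deep_review_requested,
    )
    # Assumption: a change that is not small and localized also gets the deep path.
    if any(deep_signals) or not change.small_and_localized:
        return "deep"
    return "fast"
```

For example, `choose_review_depth(Change(small_and_localized=True))` selects the fast path, while setting `gates_failing_or_missing=True` on the same change selects the deep path.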
+# Code Review Guidelines
+
+When reviewing a pull request, branch, diff, or local change:
+
+## Review Heuristics (What to Check)
+
+Do not just read the code top-to-bottom. Apply these specific checks:
+
+| Severity | Check | Heuristic / Action |
+| :--- | :--- | :--- |
+| **🛑 Critical** | **CI & Config Changes** | Look at `.github/workflows`, test configs, and lint rules first. Flag any change that weakens CI, ignores rules, or deletes tests to "fix" them. |
+| **🛑 Critical** | **Evidence Requirements** | For any non-trivial logic change, require a test that fails on the pre-change behavior. |
+| **⚠️ Warning** | **New Utilities (DRY)** | Search for new functions, helpers, or modules. Run a repo search to check for duplicates. Flag anything that reinvents existing codebase functionality. |
+| **⚠️ Warning** | **Structural Regression** | Flag changes that make the codebase harder to reason about: ad-hoc branches in busy flows, feature checks scattered through shared paths, unnecessary wrappers, cast-heavy contracts, duplicated helpers, or logic added in the wrong layer. Prefer fixes that delete complexity, reuse the canonical owner/helper, or make the data/type boundary explicit. |
+| **🔍 Trace** | **Critical Path** | Pick the most important logic change and trace it end-to-end: input → transforms → output. Check boundary conditions and unexpected branching. |
+| **🔍 Trace** | **Security Boundaries** | If the PR touches untrusted input or LLM calls, explicitly check for prompt injection, auth bypass, and missing sanitization. |
+
+## Standard Categories
+
+## Output Format
+
+For each finding:
+
+- **File:Line** — exact location
+- **Severity** — Critical / Warning / Trace / Suggestion
+- **What's wrong** — one sentence
+- **Fix** — how to fix it
+
+## Rules
+
+- Be specific. Quote the problematic code.
+- Don't flag style nitpicks unless they affect readability.
+- Before accepting a complex implementation, ask whether the same behavior can be achieved by deleting branches, reusing an existing abstraction, moving logic to the canonical owner, or making the data/type boundary explicit.
+- Do not approve merely because behavior works. If the implementation creates obvious structural debt, spaghetti branching, duplicated abstractions, or unclear type/boundary contracts, call that out as a material review finding.
+- Treat these as merge-blocking unless clearly justified: scattered feature-specific conditionals in shared flows, duplicated helpers/models, casts or optionality that hide real invariants, wrappers that add indirection without reducing complexity, or modules made materially harder to scan, test, or own.
+- If the PR looks good, say so. Don't invent problems.
+- End with: APPROVE / REQUEST_CHANGES / COMMENT
|
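The four-field finding layout and the closing verdict from the code-reviewer skill could be modelled roughly as follows. This is a sketch under stated assumptions: `Finding` and `render_review` are hypothetical names, and the rule that a Critical finding yields `REQUEST_CHANGES` while any other finding yields `COMMENT` is an assumption, since the skill leaves the verdict to reviewer judgment.

```python
from dataclasses import dataclass

SEVERITIES = ("Critical", "Warning", "Trace", "Suggestion")


@dataclass
class Finding:
    location: str  # "File:Line" of the problem
    severity: str  # one of SEVERITIES
    problem: str   # one sentence
    fix: str       # how to fix it


def render_review(findings: list[Finding]) -> str:
    """Format findings in the four-field layout and end with a verdict line."""
    lines = []
    for finding in findings:
        if finding.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {finding.severity}")
        lines += [
            f"- **File:Line**: {finding.location}",
            f"- **Severity**: {finding.severity}",
            f"- **What's wrong**: {finding.problem}",
            f"- **Fix**: {finding.fix}",
            "",
        ]
    if not findings:
        lines.append("No material findings")
    # Assumed verdict mapping: Critical blocks, other findings comment, none approves.
    if any(finding.severity == "Critical" for finding in findings):
        verdict = "REQUEST_CHANGES"
    elif findings:
        verdict = "COMMENT"
    else:
        verdict = "APPROVE"
    lines.append(verdict)
    return "\n".join(lines)
```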
@@ -0,0 +1,234 @@
+---
+name: "codebase-health-reviewer"
+description: "Load when reviewing TypeScript/JavaScript refactors, cleanup, AI-generated code, shared utilities, dead code, duplication, complexity, circular dependencies, Fallow output, or maintainability drift."
+version: 1.1.0
+required: false
+category: review
+tools:
+  - claude
+  - copilot
+  - codex
+  - cursor
+routing:
+  triggers:
+    - typescript
+    - javascript
+    - refactor
+    - cleanup
+    - fallow
+    - codebase-health
+  paths:
+    - full-path
+    - review-path
+---
+
+## Review Depth
+
+Default to the lightest useful review.
+
+### Fast Path
+Use only when the change is small, localized, low-risk, and project gates are already passing or not relevant.
+
+Output:
+- Top 1-3 material findings only
+- `No material findings` if clean
+- Verification gaps only when they affect merge confidence
+
+Do not emit the full checklist when there are no findings.
+
+### Deep Path
+Use the full review process when the change is high-risk, cross-cutting, production-sensitive, security/data-sensitive, behavior-changing without adequate tests, has failing or missing gates, or is explicitly requested.
+
+# Codebase Health Reviewer
+
+You are a specialist in reviewing project-wide health signals, especially for
+TypeScript and JavaScript repositories where Fallow is available. Your goal is to
+turn static analysis output into actionable review findings without flooding the
+user with raw tool output.
+
+This skill complements architecture-boundary-reviewer. Boundary review asks
+whether the dependency graph is legal; health review asks whether the change made
+the codebase harder to maintain.
+
+---
+
+## When to Use
+
+Use this skill when a task includes:
+
+- Refactors or cleanup
+- Removing files, exports, types, or dependencies
+- Adding shared utilities or abstractions
+- Touching complex modules
+- Reviewing AI-generated code before merge
+- Investigating duplication, dead code, circular dependencies, or architecture drift
+- Any project where the user asks to incorporate Fallow into directives or review
+
+Do not use it as a substitute for tests. Fallow can show graph and health facts;
+tests still prove behavior.
+
+---
+
+## Primary Tool: Fallow
+
+For TypeScript/JavaScript projects, prefer Fallow when available:
+
+```bash
+npx fallow --summary
+npx fallow dead-code
+npx fallow dupes
+npx fallow health
+```
+
+Targeted checks:
+
+```bash
+npx fallow dead-code --boundary-violations
+npx fallow dead-code --circular-deps
+npx fallow fix --dry-run
+```
+
+Use `--format json` or `--format markdown` when the project needs structured
+review output. Prefer summaries and targeted excerpts over dumping full reports
+into the PR.
+
+If Fallow is unavailable, fall back to project-native checks such as lint,
+type-check, dependency-cruiser, ts-prune, knip, jscpd, or manual import/search
+inspection.
+
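The fallback rule above amounts to a preference order over whatever tooling a project has. The following is a minimal sketch, not the package's behavior: `pick_health_tool` is a hypothetical helper, the caller is assumed to already know which tools are available, and the ordering among the fallback tools is an assumption (the text lists them without ranking).

```python
# Fallback tools named in the text; the order here is an assumed preference.
FALLBACK_TOOLS = ["dependency-cruiser", "ts-prune", "knip", "jscpd"]


def pick_health_tool(available: set[str]) -> str:
    """Prefer Fallow, then the first available fallback, then manual inspection."""
    if "fallow" in available:
        return "fallow"
    for tool in FALLBACK_TOOLS:
        if tool in available:
            return tool
    # Nothing automated is available, so the review falls back to manual checks.
    return "manual import/search inspection"
```

Whichever branch is taken, the review would still record which tool produced the evidence, so a reader can judge the confidence of the result.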
+---
+
+## Review Process
+
+### Step 1: Establish the Baseline
+
+Run or inspect the starting health state before judging the change. Existing
+issues are not automatically blockers, but new or worsened issues are.
+
+```md
+Baseline:
+- Dead code: existing count or not checked
+- Duplication: existing count/rate or not checked
+- Health/complexity: hotspots or not checked
+- Cycles/boundaries: existing count or not checked
+```
+
+If no baseline is available, state that the review is a snapshot rather than a
+regression comparison.
+
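To make the baseline idea concrete, the comparison could look like the sketch below. It assumes health results have already been reduced to per-category issue counts; `compare_to_baseline` and the category keys in the example are hypothetical, not Fallow output fields.

```python
from typing import Optional


def compare_to_baseline(baseline: Optional[dict], current: dict) -> dict:
    """Report which health counts got worse relative to the baseline.

    Both dicts map a category name to an issue count. Without a baseline the
    result is labelled a snapshot, because no regression comparison is possible.
    """
    if baseline is None:
        return {"mode": "snapshot", "regressions": {}}
    regressions = {
        name: count - baseline.get(name, 0)
        for name, count in current.items()
        if count > baseline.get(name, 0)  # only new or worsened issues count
    }
    return {"mode": "regression comparison", "regressions": regressions}
```

With a baseline of `{"dead_code": 4, "cycles": 0}` and a current state of `{"dead_code": 4, "cycles": 1}`, only the new cycle is reported; the four existing dead-code items stay out of the regression list.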
+### Step 2: Review Change-Specific Risk
+
+Map the task to the relevant Fallow checks:
+
+| Change type | Checks to prioritize |
+| --- | --- |
+| Delete or rename code | dead code, unused exports, dependents |
+| Add helper/shared utility | duplication, health, boundary violations |
+| Refactor module | health, duplication, circular dependencies |
+| Add package dependency | unused dependencies, unlisted dependencies |
+| Change public exports | unused exports, duplicate exports, dependents |
+| Move code between folders | boundary violations, circular dependencies |
+
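When a change spans several rows of the table above, the prioritized checks can be merged. The sketch below transcribes the table into a lookup; `CHECKS_BY_CHANGE_TYPE` and `checks_for` are hypothetical names used only for illustration.

```python
# The Step 2 table, keyed by lower-cased change type.
CHECKS_BY_CHANGE_TYPE = {
    "delete or rename code": ["dead code", "unused exports", "dependents"],
    "add helper/shared utility": ["duplication", "health", "boundary violations"],
    "refactor module": ["health", "duplication", "circular dependencies"],
    "add package dependency": ["unused dependencies", "unlisted dependencies"],
    "change public exports": ["unused exports", "duplicate exports", "dependents"],
    "move code between folders": ["boundary violations", "circular dependencies"],
}


def checks_for(change_types: list[str]) -> list[str]:
    """Merge the prioritized checks for several change types, keeping first-seen order."""
    merged: list[str] = []
    for change_type in change_types:
        for check in CHECKS_BY_CHANGE_TYPE[change_type.lower()]:
            if check not in merged:
                merged.append(check)
    return merged
```

A refactor that also moves code between folders would therefore prioritize health, duplication, circular dependencies, and boundary violations, each listed once.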
+### Step 3: Interpret Findings
+
+Classify each finding:
+
+- **Blocker** — newly introduced dead code, illegal boundary, cycle, or major complexity spike
+- **Should fix** — duplicated logic or maintainability issue caused by this change
+- **Follow-up** — pre-existing issue worth tracking separately
+- **Not relevant** — unrelated existing issue outside task scope
+
+Do not demand that one PR fix the whole repository unless the task is explicitly
+cleanup. Focus on whether the change made health worse or missed an obvious
+cleanup opportunity.
+
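The four classes in Step 3 can be approximated with two questions: did this change introduce the issue, and is it within the task's scope? The sketch below is one possible reading, not the skill's definition. `classify_finding` and the `kind` strings are hypothetical, and using task scope to separate "Follow-up" from "Not relevant" is an assumption.

```python
# Issue kinds the text treats as blockers when newly introduced.
BLOCKER_KINDS = {"dead code", "illegal boundary", "cycle", "major complexity spike"}


def classify_finding(kind: str, introduced_by_change: bool, in_task_scope: bool) -> str:
    """Map a health finding to Blocker / Should fix / Follow-up / Not relevant."""
    if introduced_by_change:
        return "Blocker" if kind in BLOCKER_KINDS else "Should fix"
    # Pre-existing issues never block on their own; scope decides whether to track them.
    return "Follow-up" if in_task_scope else "Not relevant"
```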
+### Step 4: Recommend Minimal Fixes
+
+For each actionable finding, give the smallest safe fix:
+
+- remove unused export/file/dependency
+- merge duplicate helper logic
+- split a high-complexity function
+- invert a dependency to preserve boundaries
+- add a public export instead of deep import
+- create a follow-up issue for broad cleanup
+
+---
+
+## Output Format
+
+```md
+## Codebase Health Review
+
+### Tool Evidence
+- Command: `npx fallow --summary`
+- Result: pass / warnings / failed / unavailable
+
+### Findings
+#### BLOCKER: New circular dependency between auth and user modules
+- Evidence: `npx fallow dead-code --circular-deps`
+- Introduced by: `src/features/auth/login.ts` importing `src/features/user/internal.ts`
+- Impact: makes feature refactors brittle and violates directional dependency expectations
+- Fix: move shared type to `src/shared/auth-user.ts` or expose through public API
+
+#### FOLLOW-UP: Existing duplicated validation helpers
+- Evidence: 3 clone groups reported before this change
+- Scope: pre-existing; not introduced by this PR
+- Recommendation: create cleanup issue if not already tracked
+
+### Verdict
+- ✅ Pass — no new health regressions
+- ⚠️ Pass with follow-ups — no blocker, but cleanup recommended
+- ❌ Block — change introduces health regression
+```
+
+---
+
+## PR Verification Snippet
+
+When the review passes, include a concise section in the PR body:
+
+```md
+### Codebase Health
+
+- `npx fallow --summary` passed / reported no new blockers
+- Dead code: no new unused files/exports/dependencies
+- Duplication: no new duplicate clone family relevant to this change
+- Complexity: changed functions remain below project threshold
+- Cycles/boundaries: no new cycle or boundary violation
+```
+
+If Fallow is unavailable:
+
+```md
+### Codebase Health
+
+- Fallow unavailable: <reason>
+- Manual fallback: checked imports/exports/dependents with <tool/command>
+- No new dead code, duplication, cycles, or boundary regressions found in touched files
+```
+
+---
+
+## Common Pitfalls
+
+1. **Confusing pre-existing issues with PR regressions.** Flag them, but do not
+   block unless the change worsens them or depends on them.
+2. **Dumping raw reports.** Summarize evidence and link/paste only relevant lines.
+3. **Treating health checks as behavior proof.** Static analysis complements tests;
+   it does not replace RED/GREEN/REFACTOR.
+4. **Ignoring boundary findings because the task was not architectural.** Illegal
+   dependency edges are blockers even when behavior tests pass.
+5. **Letting Fallow absence stop the task.** Use manual fallback and document the
+   lower confidence.
+
+---
+
+## Verification Checklist
+
+- [ ] Baseline or snapshot status stated
+- [ ] Relevant Fallow checks selected for the change type
+- [ ] Findings classified as blocker / should-fix / follow-up / not relevant
+- [ ] New regressions separated from existing debt
+- [ ] Minimal fixes or issue follow-ups recommended
+- [ ] PR-ready health summary produced
|
@@ -0,0 +1,159 @@
+---
+name: "harness-hooks-reviewer"
+description: "Load when adding or reviewing agent harness hooks, start/stop hooks, pre-write/pre-command hooks, post-change automation, or deterministic agent workflow scripts."
+version: 1.0.0
+required: false
+category: review
+tools:
+  - claude
+  - copilot
+  - codex
+  - cursor
+routing:
+  triggers:
+    - agent-hooks
+    - harness-hooks
+    - start-hook
+    - stop-hook
+    - pre-write-hook
+    - pre-command-hook
+    - post-change-automation
+    - deterministic-agent-automation
+  paths:
+    - full-path
+    - review-path
+    - policy-path
+---
+
+# Harness Hooks Reviewer
+
+You are a specialist in reviewing deterministic automation around AI coding
+agents. Your job is to decide whether hooks and harness scripts make agent work
+safer and more consistent without becoming slow, brittle, surprising, or unsafe.
+
+Hooks are for deterministic behavior: enforcing policy, preparing scoped context,
+running local checks, capturing session learnings, or producing machine-readable
+signals. Do not use hooks to hide vague LLM prompts behind automation.
+
+---
+
+## When to Use
+
+Use this skill when work adds, changes, or reviews:
+
+- start/session hooks that load dynamic project, team, module, or developer context
+- stop/session hooks that summarize work, suggest instruction updates, or emit logs
+- pre-write, pre-command, or permission hooks that block or warn on risky actions
+- post-change hooks for formatting, linting, reporting, or generated artifacts
+- agent harness scripts that run automatically around model actions
+
+Do not use this skill for normal application runtime hooks, framework lifecycle
+hooks, or CI jobs unless they are specifically part of the agent harness.
+
+---
+
+## Review Process
+
+### Step 1: Classify the Hook
+
+State the hook class and intended outcome:
+
+- Start/context setup
+- Stop/session reflection
+- Pre-action policy gate
+- Post-change formatting/check/reporting
+- Logging/telemetry/audit
+- Other deterministic automation
+
+If the hook's purpose is unclear, ask for clarification before approving it.
+
+### Step 2: Check Trigger and Scope
+
+Verify:
+
+- the trigger event is specific enough to avoid surprising execution
+- the hook only runs in intended repositories, paths, tools, or task classes
+- slow checks are scoped to changed or relevant files when possible
+- generated, vendor, dependency, or build-output paths are excluded unless they are
+  the explicit subject of work
+
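The scoping rules in Step 2 can be illustrated with a path filter that a hook might apply to the changed-file list before running slow checks. This is a sketch only: `files_in_scope` is a hypothetical helper, and the directory names in `EXCLUDED_DIRS` are assumed examples of generated, vendor, dependency, and build-output paths rather than a list taken from the package.

```python
from pathlib import PurePosixPath

# Assumed examples of generated, vendor, dependency, and build-output directories.
EXCLUDED_DIRS = {"node_modules", "vendor", "dist", "build", "generated"}


def files_in_scope(changed_files: list[str],
                   explicit_subjects: frozenset = frozenset()) -> list[str]:
    """Keep changed files a hook should check.

    Files under an excluded directory are skipped unless that directory is the
    explicit subject of the work.
    """
    skipped = EXCLUDED_DIRS - explicit_subjects
    return [
        path for path in changed_files
        if not skipped.intersection(PurePosixPath(path).parts)
    ]
```

Passing `explicit_subjects=frozenset({"dist"})` keeps build output in scope for a task that is specifically about the build artifacts.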
+### Step 3: Check Side Effects and Failure Mode
+
+For each side effect, identify whether the hook may read, write, block, network,
+spawn processes, modify config, or expose data.
+
+Require an explicit failure mode:
+
+- **Block** for deterministic safety violations
+- **Warn** for useful but non-authoritative findings
+- **Log-only** for telemetry or improvement suggestions
+
+Hooks must not silently rewrite unrelated files, weaken checks, commit changes,
+exfiltrate secrets, or mask failures to keep an agent moving.
+
+### Step 4: Check Operational Safety
+
+Look for:
+
+- timeout and output-size bounds
+- idempotent behavior, or documented non-idempotent state
+- stable input/output contract for the harness
+- clear handling for missing tools, partial configuration, and unsupported clients
+- least-privilege environment and secret access
+- logs that avoid secrets, tokens, PII, and excessive prompt/context dumps
+
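Several items from the operational-safety list (timeouts, output-size bounds, missing tools, and secret-free logs) can be shown together in one wrapper. The sketch below makes its own assumptions: `run_hook`, `redact`, and `bound_output` are hypothetical names, the 30-second and 2000-character limits are arbitrary, and the regular expression only catches simple `key=value` or `key: value` secrets.

```python
import re
import subprocess

# Assumed pattern: values following token/secret/password/api-key style keys.
SECRET_PATTERN = re.compile(r"(?i)\b(token|secret|password|api[_-]?key)\b(\s*[=:]\s*)(\S+)")


def redact(text: str) -> str:
    """Replace values that follow secret-looking keys with a placeholder."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)


def bound_output(text: str, max_chars: int = 2000) -> str:
    """Truncate long output and say how much was dropped."""
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return text[:max_chars] + f"\n[truncated {omitted} characters]"


def run_hook(command: list[str], timeout_s: float = 30.0) -> str:
    """Run a hook command with a timeout and return bounded, redacted output."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return f"hook timed out after {timeout_s} seconds"
    except FileNotFoundError:
        # Missing tool: report it clearly instead of failing silently.
        return f"hook command not found: {command[0]}"
    return bound_output(redact(result.stdout + result.stderr))
```

Redacting before truncating means a secret cannot be cut in half at the size limit and slip past the pattern.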
+### Step 5: Check Continuous-Improvement Use
+
+For hooks that suggest instruction updates or session learnings:
+
+- emit proposals, not automatic policy changes, unless the repo explicitly allows it
+- include evidence: trigger, affected files, repeated failure, and proposed wording
+- keep generated recommendations compact and reviewable
+- avoid append-forever logs becoming the new context sink
+
+### Step 6: Recommend Minimal Fixes
+
+Prefer small fixes such as narrowing triggers, adding dry-run mode, bounding output,
+adding a timeout, changing block to warn, adding a smoke test, or redacting logs.
+Do not turn a hook review into a broad harness rewrite.
+
+---
+
+## Output Format
+
+```md
+## Harness Hooks Review
+
+### Hook Classification
+- Type: <start | stop | pre-action | post-change | logging | other>
+- Trigger: <event/path/tool scope>
+- Failure mode: <block | warn | log-only>
+
+### Findings
+#### BLOCKER: <unsafe hook behavior>
+- Evidence: `<file:line>` or reviewed behavior
+- Risk: <what can break or leak>
+- Fix: <smallest safe fix>
+
+#### SHOULD FIX: <brittleness or usability issue>
+- Evidence: <specific evidence>
+- Risk: <why this will hurt agent/dev workflow>
+- Fix: <smallest safe fix>
+
+### Verification Needed
+- <hook smoke test, timeout proof, dry-run output, or config check>
+
+### Verdict
+- APPROVE / COMMENT / REQUEST_CHANGES
+```
+
+---
+
+## Common Pitfalls
+
+- Treating hooks as prompts instead of deterministic scripts.
+- Blocking on expensive full-repo checks for every small agent action.
+- Writing instruction updates automatically from one noisy session.
+- Logging raw prompts, secrets, tokens, or sensitive file contents.
+- Making a hook client-specific without documenting fallback behavior for other
+  agent tools.
|
@@ -0,0 +1,205 @@
+---
+name: "implementation-task-planner"
+description: "Load when the user has a PRD, issue, acceptance criteria, or requirements doc and wants a staged implementation task list with relevant files, tests, validation steps, and review checkpoints."
+version: 1.0.0
+required: false
+category: planning
+tools:
+  - claude
+  - copilot
+  - codex
+  - cursor
+routing:
+  triggers:
+    - task-planning
+    - implementation-plan
+    - prd-to-tasks
+    - acceptance-criteria-to-tasks
+    - task-list
+  paths:
+    - exploration-path
+    - full-path
+    - policy-path
+---
+
+# Implementation Task Planner
+
+You are a specialist in turning a PRD, issue, requirements document, or acceptance criteria into an implementation task list that an agent or developer can execute safely.
+
+This skill produces a plan and task artifact. It does not implement the tasks.
+
+## When to Load
+
+Load this skill when the user asks to:
+
+- generate implementation tasks from a PRD, spec, issue, or acceptance criteria
+- break a feature into staged work items before coding
+- identify likely files, test files, and validation gates for a feature
+- convert product requirements into a task checklist
+- create a task list that an implementation agent can work through one item at a time
+
+Do not load this skill for:
+
+- creating the PRD itself — use `skills/product-requirements-writer/SKILL.md`
+- reviewing code against a spec — use `skills/spec-reviewer/SKILL.md`
+- implementing tasks immediately unless the user explicitly switches to execution
+- tiny edits where a task list adds more ceremony than clarity
+
+## Core Principle: Plan Executable Slices, Not Vague Milestones
+
+A good task list reduces ambiguity for the implementation phase. Each task should be small enough to execute and verify, ordered by dependency, and connected to likely files, tests, and quality gates.
+
+Do not invent file paths from vibes. Inspect the repository structure and existing patterns before naming relevant files. If repo context is unavailable, mark file paths as tentative.
+
+## Process
+
+### 1. Intake the Source Requirements
+
+Read the PRD/spec/issue/acceptance criteria. Identify:
+
+- feature name and purpose
+- functional requirements
+- non-goals and constraints
+- user-visible behaviors
+- data/API/type changes
+- test and verification implications
+- open questions that block planning
+
+If requirements are too vague to plan safely, ask a concise clarifying question or route back to `skills/product-requirements-writer/SKILL.md`.
+
+### 2. Inspect Repo Structure Before Naming Files
+
+When working in a repo, use the lightest safe codebase navigation:
+
+- check project instructions and existing planning conventions
+- inspect top-level folders and relevant source/test roots
+- search for existing related components, services, routes, commands, or utilities
+- identify the project-native test/type/lint commands if present
+
+Only list files as concrete when they are grounded in repo evidence. Use `likely` or `tentative` labels when a path is inferred.
+
+### 3. Generate Parent Tasks First
+
+Create high-level tasks before detailed subtasks when user alignment matters. Keep parent tasks few and outcome-oriented.
+
+Default parent task shape:
+
+```md
+## Tasks
+
+- [ ] 0.0 Create feature branch
+- [ ] 1.0 Confirm requirements and existing patterns
+- [ ] 2.0 Define contracts, types, or interfaces
+- [ ] 3.0 Implement the smallest end-to-end behavior slice
+- [ ] 4.0 Add tests for required behavior and edge cases
+- [ ] 5.0 Integrate, verify, and document
+```
+
+Include task `0.0 Create feature branch` unless the user or repo workflow says not to create branches.
+
+If the user asked for an interactive planning flow, stop after parent tasks and ask:
+
+```md
+I have generated the high-level tasks based on the requirements. Ready to generate the sub-tasks? Respond with "Go" to proceed.
+```
+
+If the user asked for the full task list in one pass, continue to subtasks without the pause.
+
+### 4. Generate Subtasks
+
+Subtasks should be concrete, ordered, and independently checkable. Include verification steps as tasks, not just prose.
+
+Good subtasks:
+
+- inspect an existing pattern before editing
+- add or update a type/contract
+- write a failing test for one behavior
+- implement the minimum behavior
+- run a named validation command
+- update docs only when the feature has user-facing or operator-facing behavior
+
+Bad subtasks:
+
+- "implement feature"
+- "make it work"
+- "clean up code"
+- broad refactors not required by the PRD
+
+### 5. Identify Relevant Files
+
+List likely files before tasks. Include test files near the corresponding implementation files. Each entry needs a short reason.
+
+Use this format:
+
+```md
+## Relevant Files
+
+- `path/to/file.ts` - Why this file is relevant.
+- `path/to/file.test.ts` - Tests for `path/to/file.ts`.
+- `tasks/prd-[feature-name].md` - Source PRD for this task list (or the actual existing PRD path).
+```
+
+If a file does not exist yet, mark it as `(new)`. If it is inferred, mark it as `(tentative)`.
+
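The `(new)` and `(tentative)` markers could be applied mechanically when rendering the Relevant Files list. The sketch below is illustrative: `relevant_file_entry` is a hypothetical helper, and placing the markers between the path and the reason is an assumption, since the skill does not fix their position.

```python
def relevant_file_entry(path: str, reason: str, exists: bool, grounded: bool) -> str:
    """Render one Relevant Files bullet with (new) / (tentative) markers."""
    labels = []
    if not exists:
        labels.append("(new)")        # the file has not been created yet
    if not grounded:
        labels.append("(tentative)")  # the path is inferred, not seen in the repo
    suffix = " " + " ".join(labels) if labels else ""
    return f"- `{path}`{suffix} - {reason}"
```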
+### 6. Save the Artifact When Working in a Repo
+
+Save the task list under the project root as:
+
+```txt
+tasks/tasks-[feature-name].md
+```
+
+Follow the repo's existing planning/spec directory if one exists.
+
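A sketch of how the artifact path might be derived from a feature name follows. `task_file_path` is a hypothetical helper and the slug rule (lower-case, non-alphanumerics collapsed to hyphens) is an assumption; the skill only specifies the `tasks/tasks-[feature-name].md` shape and a preference for an existing planning directory.

```python
import re
from pathlib import PurePosixPath


def task_file_path(feature_name: str, planning_dir: str = "tasks") -> PurePosixPath:
    """Build the task-list path, e.g. tasks/tasks-<slug>.md.

    Pass planning_dir when the repo already has its own planning/spec directory.
    """
    # Assumed slug rule: lower-case, runs of other characters become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", feature_name.lower()).strip("-")
    return PurePosixPath(planning_dir) / f"tasks-{slug}.md"
```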
+## Output Format
+
+```md
+# Tasks: <Feature Name>
+
+## Source
+
+- PRD/spec/issue: `<path or description>`
+- Planning assumptions: <brief assumptions or "none">
+
+## Relevant Files
+
+- `path/to/file` - <why relevant>
+
+## Notes
+
+- Use the project-native test/type/lint commands listed in repo instructions.
+- Update each checkbox as work completes.
+- Keep implementation scoped to the PRD non-goals.
+
+## Instructions for Completing Tasks
+
+As each task is completed, change `- [ ]` to `- [x]` in this file. Update after each subtask, not just after a parent task.
+
+## Tasks
+
+- [ ] 0.0 Create feature branch
+  - [ ] 0.1 Create and check out a branch for this feature.
+- [ ] 1.0 <Parent task>
+  - [ ] 1.1 <Subtask>
+```
+
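The checkbox update described in the template above is a small text edit. As a hedged illustration, the hypothetical helper `mark_task_done` below flips one `- [ ]` to `- [x]` for a given task number; it assumes task numbers such as `1.1` appear directly after the checkbox, as in the template.

```python
import re


def mark_task_done(markdown: str, task_id: str) -> str:
    """Change `- [ ]` to `- [x]` for the first task whose number equals task_id."""
    pattern = re.compile(
        rf"^([ \t]*)- \[ \] ({re.escape(task_id)})(\s)",
        re.MULTILINE,
    )
    # Keep the indentation and task number; only the checkbox changes.
    return pattern.sub(r"\1- [x] \2\3", markdown, count=1)
```

Calling it once per finished subtask matches the instruction to update after each subtask rather than only after a parent task.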
+## Verification Checklist
+
+Before returning the task list, verify:
+
+- [ ] Tasks trace back to PRD/spec requirements and non-goals
+- [ ] Relevant files are grounded in repo evidence or marked tentative
+- [ ] Tests and validation gates are included as explicit tasks
+- [ ] Parent tasks are ordered by dependency
+- [ ] Subtasks are small enough for one implementation pass
+- [ ] No implementation code was changed
+- [ ] Saved path follows repo convention or `tasks/tasks-[feature-name].md`
+
+## Common Pitfalls
+
+1. **Inventing file paths without inspecting the repo.** Either inspect first or label paths tentative.
+2. **Skipping validation tasks.** A task list without test/type/lint/check steps pushes ambiguity into implementation.
+3. **Expanding scope beyond the PRD.** Non-goals are binding unless the user changes them.
+4. **Planning giant tasks.** If a task cannot be implemented and verified in one pass, split it.
+5. **Forcing a confirmation pause when the user asked for a complete plan.** Pause by default for interactive planning; continue when the request clearly asks for full output.
+6. **Implementing while planning.** This skill creates the task artifact and stops.