@adityaaria/spark 6.0.3
- package/.claude-plugin/marketplace.json +20 -0
- package/.claude-plugin/plugin.json +20 -0
- package/.codex-plugin/plugin.json +48 -0
- package/.cursor-plugin/plugin.json +23 -0
- package/.kimi-plugin/plugin.json +38 -0
- package/.opencode/INSTALL.md +115 -0
- package/.opencode/plugins/spark.js +139 -0
- package/.pi/extensions/spark.ts +121 -0
- package/.version-bump.json +21 -0
- package/CLAUDE.md +115 -0
- package/CODE_OF_CONDUCT.md +128 -0
- package/GEMINI.md +2 -0
- package/LICENSE +21 -0
- package/README.md +282 -0
- package/RELEASE-NOTES.md +1299 -0
- package/assets/app-icon.png +0 -0
- package/assets/spark-small.svg +1 -0
- package/bin/spark.js +7 -0
- package/docs/README.kimi.md +94 -0
- package/docs/README.opencode.md +170 -0
- package/docs/porting-to-a-new-harness.md +830 -0
- package/gemini-extension.json +6 -0
- package/hooks/hooks-codex.json +16 -0
- package/hooks/hooks-cursor.json +10 -0
- package/hooks/hooks.json +16 -0
- package/hooks/run-hook.cmd +46 -0
- package/hooks/session-start +49 -0
- package/hooks/session-start-codex +26 -0
- package/package.json +52 -0
- package/skills/brainstorming/SKILL.md +159 -0
- package/skills/brainstorming/scripts/frame-template.html +213 -0
- package/skills/brainstorming/scripts/helper.js +167 -0
- package/skills/brainstorming/scripts/server.cjs +722 -0
- package/skills/brainstorming/scripts/start-server.sh +209 -0
- package/skills/brainstorming/scripts/stop-server.sh +120 -0
- package/skills/brainstorming/spec-document-reviewer-prompt.md +49 -0
- package/skills/brainstorming/visual-companion.md +298 -0
- package/skills/dispatching-parallel-agents/SKILL.md +185 -0
- package/skills/executing-plans/SKILL.md +70 -0
- package/skills/finishing-a-development-branch/SKILL.md +241 -0
- package/skills/receiving-code-review/SKILL.md +213 -0
- package/skills/requesting-code-review/SKILL.md +103 -0
- package/skills/requesting-code-review/code-reviewer.md +172 -0
- package/skills/subagent-driven-development/SKILL.md +418 -0
- package/skills/subagent-driven-development/implementer-prompt.md +139 -0
- package/skills/subagent-driven-development/scripts/review-package +44 -0
- package/skills/subagent-driven-development/scripts/sdd-workspace +22 -0
- package/skills/subagent-driven-development/scripts/task-brief +40 -0
- package/skills/subagent-driven-development/task-reviewer-prompt.md +188 -0
- package/skills/systematic-debugging/CREATION-LOG.md +119 -0
- package/skills/systematic-debugging/SKILL.md +296 -0
- package/skills/systematic-debugging/condition-based-waiting-example.ts +158 -0
- package/skills/systematic-debugging/condition-based-waiting.md +115 -0
- package/skills/systematic-debugging/defense-in-depth.md +122 -0
- package/skills/systematic-debugging/find-polluter.sh +63 -0
- package/skills/systematic-debugging/root-cause-tracing.md +169 -0
- package/skills/systematic-debugging/test-academic.md +14 -0
- package/skills/systematic-debugging/test-pressure-1.md +58 -0
- package/skills/systematic-debugging/test-pressure-2.md +68 -0
- package/skills/systematic-debugging/test-pressure-3.md +69 -0
- package/skills/test-driven-development/SKILL.md +371 -0
- package/skills/test-driven-development/testing-anti-patterns.md +299 -0
- package/skills/using-git-worktrees/SKILL.md +202 -0
- package/skills/using-spark/SKILL.md +121 -0
- package/skills/using-spark/references/antigravity-tools.md +96 -0
- package/skills/using-spark/references/claude-code-tools.md +50 -0
- package/skills/using-spark/references/codex-tools.md +72 -0
- package/skills/using-spark/references/copilot-tools.md +49 -0
- package/skills/using-spark/references/gemini-tools.md +63 -0
- package/skills/using-spark/references/pi-tools.md +28 -0
- package/skills/verification-before-completion/SKILL.md +139 -0
- package/skills/writing-plans/SKILL.md +174 -0
- package/skills/writing-plans/plan-document-reviewer-prompt.md +49 -0
- package/skills/writing-skills/SKILL.md +689 -0
- package/skills/writing-skills/anthropic-best-practices.md +1150 -0
- package/skills/writing-skills/examples/CLAUDE_MD_TESTING.md +189 -0
- package/skills/writing-skills/graphviz-conventions.dot +172 -0
- package/skills/writing-skills/persuasion-principles.md +187 -0
- package/skills/writing-skills/render-graphs.js +168 -0
- package/skills/writing-skills/testing-skills-with-subagents.md +384 -0
- package/src/cli/index.js +26 -0
- package/src/cli/install.js +47 -0
- package/src/cli/output.js +11 -0
- package/src/cli/parse-args.js +46 -0
- package/src/cli/prompt.js +10 -0
- package/src/installer/adapters/common.js +59 -0
- package/src/installer/adapters/extension-style.js +67 -0
- package/src/installer/adapters/shell-hook.js +57 -0
- package/src/installer/detect.js +168 -0
- package/src/installer/errors.js +7 -0
- package/src/installer/registry.js +35 -0
@@ -0,0 +1,139 @@
+# Implementer Subagent Prompt Template
+
+Use this template when dispatching an implementer subagent.
+
+```
+Subagent (general-purpose):
+  description: "Implement Task N: [task name]"
+  model: [MODEL — REQUIRED: choose per SKILL.md Model Selection; an omitted
+    model silently inherits the session's most expensive one]
+  prompt: |
+    You are implementing Task N: [task name]
+
+    ## Task Description
+
+    Read your task brief first: [BRIEF_FILE]
+    It contains the full task text from the plan.
+
+    ## Context
+
+    [Scene-setting: where this fits, dependencies, architectural context]
+
+    ## Before You Begin
+
+    If you have questions about:
+    - The requirements or acceptance criteria
+    - The approach or implementation strategy
+    - Dependencies or assumptions
+    - Anything unclear in the task description
+
+    **Ask them now.** Raise any concerns before starting work.
+
+    ## Your Job
+
+    Once you're clear on requirements:
+    1. Implement exactly what the task specifies
+    2. Write tests (following TDD if task says to)
+    3. Verify implementation works
+    4. Commit your work
+    5. Self-review (see below)
+    6. Report back
+
+    Work from: [directory]
+
+    **While you work:** If you encounter something unexpected or unclear, **ask questions**.
+    It's always OK to pause and clarify. Don't guess or make assumptions.
+
+    While iterating, run the focused test for what you're changing; run the
+    full suite once before committing, not after every edit.
+
+    ## Code Organization
+
+    You reason best about code you can hold in context at once, and your edits are more
+    reliable when files are focused. Keep this in mind:
+    - Follow the file structure defined in the plan
+    - Each file should have one clear responsibility with a well-defined interface
+    - If a file you're creating is growing beyond the plan's intent, stop and report
+      it as DONE_WITH_CONCERNS — don't split files on your own without plan guidance
+    - If an existing file you're modifying is already large or tangled, work carefully
+      and note it as a concern in your report
+    - In existing codebases, follow established patterns. Improve code you're touching
+      the way a good developer would, but don't restructure things outside your task.
+
+    ## When You're in Over Your Head
+
+    It is always OK to stop and say "this is too hard for me." Bad work is worse than
+    no work. You will not be penalized for escalating.
+
+    **STOP and escalate when:**
+    - The task requires architectural decisions with multiple valid approaches
+    - You need to understand code beyond what was provided and can't find clarity
+    - You feel uncertain about whether your approach is correct
+    - The task involves restructuring existing code in ways the plan didn't anticipate
+    - You've been reading file after file trying to understand the system without progress
+
+    **How to escalate:** Report back with status BLOCKED or NEEDS_CONTEXT. Describe
+    specifically what you're stuck on, what you've tried, and what kind of help you need.
+    The controller can provide more context, re-dispatch with a more capable model,
+    or break the task into smaller pieces.
+
+    ## Before Reporting Back: Self-Review
+
+    Review your work with fresh eyes. Ask yourself:
+
+    **Completeness:**
+    - Did I fully implement everything in the spec?
+    - Did I miss any requirements?
+    - Are there edge cases I didn't handle?
+
+    **Quality:**
+    - Is this my best work?
+    - Are names clear and accurate (match what things do, not how they work)?
+    - Is the code clean and maintainable?
+
+    **Discipline:**
+    - Did I avoid overbuilding (YAGNI)?
+    - Did I only build what was requested?
+    - Did I follow existing patterns in the codebase?
+
+    **Testing:**
+    - Do tests actually verify behavior (not just mock behavior)?
+    - Did I follow TDD if required?
+    - Are tests comprehensive?
+    - Is the test output pristine (no stray warnings or noise)?
+
+    If you find issues during self-review, fix them now before reporting.
+
+    ## After Review Findings
+
+    If a reviewer finds issues and you fix them, re-run the tests that cover
+    the amended code and append the results to your report file. Reviewers
+    will not re-run tests for you — your report is the test evidence.
+
+    ## Report Format
+
+    Write your full report to [REPORT_FILE]:
+    - What you implemented (or what you attempted, if blocked)
+    - What you tested and test results
+    - **TDD Evidence** (if TDD was required for this task):
+      - RED: command run, relevant failing output before implementation, and why the failure was expected
+      - GREEN: command run and relevant passing output after implementation
+    - Files changed
+    - Self-review findings (if any)
+    - Any issues or concerns
+
+    Then report back with ONLY (under 15 lines — the detail lives in the
+    report file):
+    - **Status:** DONE | DONE_WITH_CONCERNS | BLOCKED | NEEDS_CONTEXT
+    - Commits created (short SHA + subject)
+    - One-line test summary (e.g. "14/14 passing, output pristine")
+    - Your concerns, if any
+    - The report file path
+
+    If BLOCKED or NEEDS_CONTEXT, put the specifics in the final message
+    itself — the controller acts on it directly.
+
+    Use DONE_WITH_CONCERNS if you completed the work but have doubts about correctness.
+    Use BLOCKED if you cannot complete the task. Use NEEDS_CONTEXT if you need
+    information that wasn't provided. Never silently produce work you're unsure about.
+```
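The four status values in the report format lend themselves to a simple dispatch on the controller side. The following is a minimal sketch and not part of the package: `next_action` is a hypothetical helper, and the action names it prints are assumptions chosen only to illustrate the escalation paths the template describes.

```shell
#!/usr/bin/env bash
# Hypothetical controller-side helper (an assumption, not shipped code):
# map the Status line of an implementer's final message to a next step.
set -euo pipefail

next_action() {
  # $1: the implementer's final message (possibly multi-line).
  local status
  # Strip everything up to "Status:" plus any Markdown asterisks and spaces,
  # and stop at the first line that carried a Status field.
  status=$(awk 'sub(/^.*Status:[*]* */, "") { print; exit }' <<< "$1")
  case "$status" in
    DONE)               echo "dispatch-reviewer" ;;
    DONE_WITH_CONCERNS) echo "read-concerns-then-review" ;;
    BLOCKED)            echo "escalate-or-split-task" ;;
    NEEDS_CONTEXT)      echo "provide-context-and-redispatch" ;;
    *)                  echo "unrecognized-status" ;;
  esac
}
```

Calling `next_action '- **Status:** BLOCKED'` prints `escalate-or-split-task`; a message without a Status line falls through to `unrecognized-status`, which a real controller would presumably treat as a protocol error.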
@@ -0,0 +1,44 @@
+#!/usr/bin/env bash
+# Generate a review package: commit list, stat summary, and the net
+# diff with extended context, written to a file the reviewer reads in one
+# call. Using the recorded per-task BASE (not HEAD~1) keeps multi-commit
+# tasks intact.
+#
+# Usage: review-package BASE HEAD [OUTFILE]
+# Default OUTFILE: <repo-root>/.spark/sdd/review-<base7>..<head7>.diff
+# (named per range, so a re-review after fixes gets a distinct fresh file).
+set -euo pipefail
+
+if [ $# -lt 2 ] || [ $# -gt 3 ]; then
+  echo "usage: review-package BASE HEAD [OUTFILE]" >&2
+  exit 2
+fi
+
+base=$1
+head=$2
+
+git rev-parse --verify --quiet "$base" >/dev/null || { echo "bad BASE: $base" >&2; exit 2; }
+git rev-parse --verify --quiet "$head" >/dev/null || { echo "bad HEAD: $head" >&2; exit 2; }
+
+if [ $# -eq 3 ]; then
+  out=$3
+else
+  dir=$("$(cd "$(dirname "$0")" && pwd)/sdd-workspace")
+  out="$dir/review-$(git rev-parse --short "$base")..$(git rev-parse --short "$head").diff"
+fi
+
+{
+  echo "# Review package: ${base}..${head}"
+  echo
+  echo "## Commits"
+  git log --oneline "${base}..${head}"
+  echo
+  echo "## Files changed"
+  git diff --stat "${base}..${head}"
+  echo
+  echo "## Diff"
+  git diff -U10 "${base}..${head}"
+} > "$out"
+
+commits=$(git rev-list --count "${base}..${head}")
+echo "wrote ${out}: ${commits} commit(s), $(wc -c < "$out" | tr -d ' ') bytes"
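The header comment of `review-package` says that a recorded per-task BASE keeps multi-commit tasks intact where `HEAD~1` would not. A small sketch of that difference, assuming `git` is installed: the throwaway repository, the commit messages, and the `demo_range_counts` name are all made up for illustration.

```shell
#!/usr/bin/env bash
# Illustrative only: compare the commit count seen from a recorded BASE with
# the count seen from HEAD~1 when one task produced two commits.
set -euo pipefail

demo_range_counts() {
  local repo base
  repo=$(mktemp -d)
  (
    cd "$repo"
    git init --quiet
    # Pass identity and signing settings per call so the sketch does not
    # depend on the user's global git configuration.
    g() { git -c user.name=demo -c user.email=demo@example.invalid -c commit.gpgsign=false "$@"; }
    g commit --quiet --allow-empty -m "base"
    base=$(git rev-parse HEAD)          # the recorded per-task BASE
    g commit --quiet --allow-empty -m "task: part 1"
    g commit --quiet --allow-empty -m "task: part 2"
    # First number: commits in BASE..HEAD. Second: commits in HEAD~1..HEAD.
    echo "$(git rev-list --count "${base}..HEAD") $(git rev-list --count HEAD~1..HEAD)"
  )
  rm -rf "$repo"
}
```

Running `demo_range_counts` prints `2 1`: the BASE range covers both task commits, while `HEAD~1..HEAD` would hand the reviewer only the last one.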
@@ -0,0 +1,22 @@
+#!/usr/bin/env bash
+# Resolve and ensure the working-tree directory SDD uses for its short-lived
+# artifacts: task briefs, implementer reports, review packages, and the
+# progress ledger. Print the directory's absolute path.
+#
+# The workspace lives in the working tree (not under .git/) because Claude Code
+# treats .git/ as a protected path and denies agent writes there — which blocks
+# an implementer subagent from writing its report file. A self-ignoring
+# .gitignore keeps the workspace out of `git status` and out of accidental
+# commits without modifying any tracked file.
+#
+# Single source of truth for the workspace location, so task-brief and
+# review-package cannot drift to different directories.
+#
+# Usage: sdd-workspace
+set -euo pipefail
+
+root=$(git rev-parse --show-toplevel)
+dir="$root/.spark/sdd"
+mkdir -p "$dir"
+printf '*\n' > "$dir/.gitignore"
+cd "$dir" && pwd
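The self-ignoring `.gitignore` described in the `sdd-workspace` comments can be tried outside a repository. This is a minimal sketch under one stated assumption: the root directory is passed in as an argument (the script itself asks git for it), and `ensure_workspace` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch of the self-ignoring workspace idea. Assumption for illustration:
# the root is given as $1 instead of being resolved with git rev-parse.
set -euo pipefail

ensure_workspace() {
  local dir="$1/.spark/sdd"
  mkdir -p "$dir"
  # A .gitignore holding "*" matches every entry in its own directory,
  # including the .gitignore file itself, so no tracked file has to change.
  printf '*\n' > "$dir/.gitignore"
  # Print the path from a subshell so the caller's working directory is kept.
  (cd "$dir" && pwd)
}
```

For example, `ws=$(ensure_workspace "$(mktemp -d)")` leaves `$ws` pointing at a `.spark/sdd` directory. Inside a real repository, files written under that directory should not appear in `git status`.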
@@ -0,0 +1,40 @@
+#!/usr/bin/env bash
+# Extract one task's full text from an implementation plan into a file the
+# implementer reads in one call, so the task text never has to be pasted
+# through the controller's context.
+#
+# Usage: task-brief PLAN_FILE TASK_NUMBER [OUTFILE]
+# Default OUTFILE: <repo-root>/.spark/sdd/task-<N>-brief.md
+# (per worktree; concurrent runs in the same working tree share it).
+set -euo pipefail
+
+if [ $# -lt 2 ] || [ $# -gt 3 ]; then
+  echo "usage: task-brief PLAN_FILE TASK_NUMBER [OUTFILE]" >&2
+  exit 2
+fi
+
+plan=$1
+n=$2
+[ -f "$plan" ] || { echo "no such plan file: $plan" >&2; exit 2; }
+
+if [ $# -eq 3 ]; then
+  out=$3
+else
+  dir=$("$(cd "$(dirname "$0")" && pwd)/sdd-workspace")
+  out="$dir/task-${n}-brief.md"
+fi
+
+awk -v n="$n" '
+  /^```/ { infence = !infence }
+  !infence && /^#+[ \t]+Task[ \t]+[0-9]+/ {
+    intask = ($0 ~ ("^#+[ \t]+Task[ \t]+" n "([^0-9]|$)"))
+  }
+  intask { print }
+' "$plan" > "$out"
+
+if [ ! -s "$out" ]; then
+  echo "task ${n} not found in ${plan} (no heading matching 'Task ${n}')" >&2
+  exit 3
+fi
+
+echo "wrote ${out}: $(wc -l < "$out" | tr -d ' ') lines"
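The awk program in `task-brief` has two details that are easy to miss: headings inside fenced blocks are ignored, and `Task 1` must not match `Task 10`. The sketch below runs the same program against a small made-up plan. `extract_task`, `write_sample_plan`, and the plan contents are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Exercise the task-brief awk program on a sample plan (contents invented).
set -euo pipefail

write_sample_plan() {
  # \140 is the octal escape for a backtick, so this example can itself sit
  # inside a fenced block without closing it.
  printf '%s\n' '# Plan' '## Task 1: Parser' 'Write the parser.' > "$1"
  printf '\140\140\140sh\n# Task 2 is only mentioned inside a fence\n\140\140\140\n' >> "$1"
  printf '%s\n' '## Task 2: Printer' 'Write the printer.' \
                '## Task 10: Docs' 'Write docs.' >> "$1"
}

extract_task() {
  # $1: plan file, $2: task number. Same awk logic as the script above.
  awk -v n="$2" '
    /^```/ { infence = !infence }
    !infence && /^#+[ \t]+Task[ \t]+[0-9]+/ {
      intask = ($0 ~ ("^#+[ \t]+Task[ \t]+" n "([^0-9]|$)"))
    }
    intask { print }
  ' "$1"
}
```

With a plan written by `write_sample_plan`, `extract_task plan.md 1` prints five lines, from the `## Task 1: Parser` heading through the closing fence. The `# Task 2 ...` line inside the fence does not end the task, and the `([^0-9]|$)` guard keeps `Task 10` out of the result.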
@@ -0,0 +1,188 @@
+# Task Reviewer Prompt Template
+
+Use this template when dispatching a task reviewer subagent. The reviewer
+reads the task's diff once and returns two verdicts: spec compliance and
+code quality.
+
+**Purpose:** Verify one task's implementation matches its requirements (nothing
+more, nothing less) and is well-built (clean, tested, maintainable)
+
+```
+Subagent (general-purpose):
+  description: "Review Task N (spec + quality)"
+  model: [MODEL — REQUIRED: choose per SKILL.md Model Selection; an omitted
+    model silently inherits the session's most expensive one]
+  prompt: |
+    You are reviewing one task's implementation: first whether it matches its
+    requirements, then whether it is well-built. This is a task-scoped gate,
+    not a merge review — a broad whole-branch review happens separately after
+    all tasks are complete.
+
+    ## What Was Requested
+
+    Read the task brief: [BRIEF_FILE]
+
+    Global constraints from the spec/design that bind this task:
+    [GLOBAL_CONSTRAINTS]
+
+    ## What the Implementer Claims They Built
+
+    Read the implementer's report: [REPORT_FILE]
+
+    ## Diff Under Review
+
+    **Base:** [BASE_SHA]
+    **Head:** [HEAD_SHA]
+    **Diff file:** [DIFF_FILE]
+
+    Read the diff file once — it contains the commit list, a stat summary,
+    and the full diff with surrounding context, and it is your view of the
+    change. The diff's context lines ARE the changed files: do not Read a
+    changed file separately unless a hunk you must judge is cut off
+    mid-function — and say so in your report. Do not re-run git commands.
+    If the diff file is missing, fetch the diff yourself:
+    `git diff --stat [BASE_SHA]..[HEAD_SHA]` and `git diff [BASE_SHA]..[HEAD_SHA]`.
+    Do not crawl the broader codebase. Inspect code outside the diff only
+    to evaluate a concrete risk you can name — one focused check per named
+    risk, and name both the risk and what you checked in your report.
+    Cross-cutting changes are legitimate named risks: if the diff changes
+    lock ordering, a function or API contract, or shared mutable state,
+    checking the call sites is the right method.
+
+    Your review is read-only on this checkout. Do not mutate the working
+    tree, the index, HEAD, or branch state in any way.
+
+    ## Do Not Trust the Report
+
+    Treat the implementer's report as unverified claims about the code. It
+    may be incomplete, inaccurate, or optimistic. Verify the claims against
+    the diff. Design rationales in the report are claims too: "left it per
+    YAGNI," "kept it simple deliberately," or any other justification is the
+    implementer grading their own work. Judge the code on its merits — a
+    stated rationale never downgrades a finding's severity.
+
+    ## Tests
+
+    The implementer already ran the tests and reported results with TDD
+    evidence for exactly this code. Do not re-run the suite to confirm their
+    report. Run a test only when reading the code raises a specific doubt
+    that no existing run answers — and then a focused test, never a
+    package-wide suite, race detector run, or repeated/high-count loop. If
+    heavy validation seems warranted, recommend it in your report instead of
+    running it. If you cannot run commands in this environment, name the
+    test you would run.
+
+    Warnings or other noise in the implementer's reported test output are
+    findings — test output should be pristine.
+
+    ## Part 1: Spec Compliance
+
+    Compare the diff against What Was Requested:
+
+    - **Missing:** requirements they skipped, missed, or claimed without
+      implementing
+    - **Extra:** features that weren't requested, over-engineering, unneeded
+      "nice to haves"
+    - **Misunderstood:** right feature built the wrong way, wrong problem
+      solved
+
+    If a requirement cannot be verified from this diff alone (it lives in
+    unchanged code or spans tasks), report it as a ⚠️ item instead of
+    broadening your search.
+
+    ## Part 2: Code Quality
+
+    **Code quality:**
+    - Clean separation of concerns?
+    - Proper error handling?
+    - DRY without premature abstraction?
+    - Edge cases handled?
+
+    **Tests:**
+    - Do the new and changed tests verify real behavior, not mocks?
+    - Are the task's edge cases covered?
+
+    **Structure:**
+    - Does each file have one clear responsibility with a well-defined interface?
+    - Are units decomposed so they can be understood and tested independently?
+    - Is the implementation following the file structure from the plan?
+    - Did this change create new files that are already large, or
+      significantly grow existing files? (Don't flag pre-existing file
+      sizes — focus on what this change contributed.)
+
+    Your report should point at evidence: file:line references for every
+    finding and for any check you would otherwise answer with a bare
+    "yes." A tight report that cites lines gives the controller everything
+    it needs.
+
+    Your final message is the report itself: begin directly with the
+    spec-compliance verdict. Every line is a verdict, a finding with
+    file:line, or a check you ran — no preamble, no process narration,
+    no closing summary.
+
+    ## Calibration
+
+    Categorize issues by actual severity. Not everything is Critical.
+    Important means this task cannot be trusted until it is fixed: incorrect
+    or fragile behavior, a missed requirement, or maintainability damage you
+    would block a merge over — verbatim duplication of a logic block,
+    swallowed errors, tests that assert nothing. "Coverage could be broader"
+    and polish suggestions are Minor.
+    If the plan or brief explicitly mandates something this rubric calls a
+    defect (a test that asserts nothing, verbatim duplication of a logic
+    block), that IS a finding — report it as Important, labeled
+    plan-mandated. The plan's authorship does not grade its own work; the
+    human decides.
+    Acknowledge what was done well before listing issues — accurate praise
+    helps the implementer trust the rest of the feedback.
+
+    ## Output Format
+
+    ### Spec Compliance
+
+    - ✅ Spec compliant | ❌ Issues found: [what's missing/extra/misunderstood,
+      with file:line references]
+    - ⚠️ Cannot verify from diff: [requirements you could not verify from the
+      diff alone, and what the controller should check — report alongside the
+      ✅/❌ verdict for everything you could verify]
+
+    ### Strengths
+    [What's well done? Be specific.]
+
+    ### Issues
+
+    #### Critical (Must Fix)
+    #### Important (Should Fix)
+    #### Minor (Nice to Have)
+
+    For each issue: file:line, what's wrong, why it matters, how to fix
+    (if not obvious).
+
+    ### Assessment
+
+    **Task quality:** [Approved | Needs fixes]
+
+    **Reasoning:** [1-2 sentence technical assessment]
+```
+
+**Placeholders:**
+- `[MODEL]` — REQUIRED: reviewer model per SKILL.md Model Selection
+- `[BRIEF_FILE]` — REQUIRED: the task brief file (`scripts/task-brief PLAN N`
+  prints the path; same file the implementer worked from)
+- `[GLOBAL_CONSTRAINTS]` — the binding requirements copied verbatim from
+  the plan's Global Constraints section or the spec: exact values, formats,
+  and stated relationships between components (not process rules — those
+  are already in this template)
+- `[REPORT_FILE]` — REQUIRED: the file the implementer wrote its detailed
+  report to
+- `[BASE_SHA]` — commit before this task
+- `[HEAD_SHA]` — current commit
+- `[DIFF_FILE]` — REQUIRED: the path the controller wrote the review
+  package to (`scripts/review-package BASE HEAD` prints the unique path it
+  wrote; the package never enters the controller's context)
+
+**Reviewer returns:** Spec Compliance verdict (✅/❌/⚠️), Strengths, Issues
+(Critical/Important/Minor), Task quality verdict
+
+A fix dispatch can address spec gaps and quality findings together;
+re-review after fixes covers both verdicts.
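Because four of the placeholders are marked REQUIRED, a controller could check a filled-in prompt before dispatching it. The following is a sketch of such a check and not part of the package: `unfilled_required` is a hypothetical helper, and it assumes the filled prompt has been written to a file.

```shell
#!/usr/bin/env bash
# Hypothetical pre-dispatch check (an assumption, not shipped code): list the
# REQUIRED placeholders that still appear in a filled-in reviewer prompt.
set -euo pipefail

unfilled_required() {
  # $1: file holding the filled-in prompt.
  local name
  for name in MODEL BRIEF_FILE REPORT_FILE DIFF_FILE; do
    # Match on the opening bracket plus name, so both "[DIFF_FILE]" and the
    # template's longer "[MODEL ...]" form are caught. -F means fixed string.
    if grep -qF "[$name" "$1"; then
      echo "[$name]"
    fi
  done
}
```

An empty result means every REQUIRED placeholder was replaced. Any printed name is a placeholder the controller still needs to fill before dispatching the reviewer.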
@@ -0,0 +1,119 @@
+# Creation Log: Systematic Debugging Skill
+
+Reference example of extracting, structuring, and bulletproofing a critical skill.
+
+## Source Material
+
+Extracted debugging framework from `~/.claude/CLAUDE.md`:
+- 4-phase systematic process (Investigation → Pattern Analysis → Hypothesis → Implementation)
+- Core mandate: ALWAYS find root cause, NEVER fix symptoms
+- Rules designed to resist time pressure and rationalization
+
+## Extraction Decisions
+
+**What to include:**
+- Complete 4-phase framework with all rules
+- Anti-shortcuts ("NEVER fix symptom", "STOP and re-analyze")
+- Pressure-resistant language ("even if faster", "even if I seem in a hurry")
+- Concrete steps for each phase
+
+**What to leave out:**
+- Project-specific context
+- Repetitive variations of same rule
+- Narrative explanations (condensed to principles)
+
+## Structure Following skill-creation/SKILL.md
+
+1. **Rich when_to_use** - Included symptoms and anti-patterns
+2. **Type: technique** - Concrete process with steps
+3. **Keywords** - "root cause", "symptom", "workaround", "debugging", "investigation"
+4. **Flowchart** - Decision point for "fix failed" → re-analyze vs add more fixes
+5. **Phase-by-phase breakdown** - Scannable checklist format
+6. **Anti-patterns section** - What NOT to do (critical for this skill)
+
+## Bulletproofing Elements
+
+Framework designed to resist rationalization under pressure:
+
+### Language Choices
+- "ALWAYS" / "NEVER" (not "should" / "try to")
+- "even if faster" / "even if I seem in a hurry"
+- "STOP and re-analyze" (explicit pause)
+- "Don't skip past" (catches the actual behavior)
+
+### Structural Defenses
+- **Phase 1 required** - Can't skip to implementation
+- **Single hypothesis rule** - Forces thinking, prevents shotgun fixes
+- **Explicit failure mode** - "IF your first fix doesn't work" with mandatory action
+- **Anti-patterns section** - Shows exactly what shortcuts look like
+
+### Redundancy
+- Root cause mandate in overview + when_to_use + Phase 1 + implementation rules
+- "NEVER fix symptom" appears 4 times in different contexts
+- Each phase has explicit "don't skip" guidance
+
+## Testing Approach
+
+Created 4 validation tests following skills/meta/testing-skills-with-subagents:
+
+### Test 1: Academic Context (No Pressure)
+- Simple bug, no time pressure
+- **Result:** Perfect compliance, complete investigation
+
+### Test 2: Time Pressure + Obvious Quick Fix
+- User "in a hurry", symptom fix looks easy
+- **Result:** Resisted shortcut, followed full process, found real root cause
+
+### Test 3: Complex System + Uncertainty
+- Multi-layer failure, unclear if can find root cause
+- **Result:** Systematic investigation, traced through all layers, found source
+
+### Test 4: Failed First Fix
+- Hypothesis doesn't work, temptation to add more fixes
+- **Result:** Stopped, re-analyzed, formed new hypothesis (no shotgun)
+
+**All tests passed.** No rationalizations found.
+
+## Iterations
+
+### Initial Version
+- Complete 4-phase framework
+- Anti-patterns section
+- Flowchart for "fix failed" decision
+
+### Enhancement 1: TDD Reference
+- Added link to skills/testing/test-driven-development
+- Note explaining TDD's "simplest code" ≠ debugging's "root cause"
+- Prevents confusion between methodologies
+
+## Final Outcome
+
+Bulletproof skill that:
+- ✅ Clearly mandates root cause investigation
+- ✅ Resists time pressure rationalization
+- ✅ Provides concrete steps for each phase
+- ✅ Shows anti-patterns explicitly
+- ✅ Tested under multiple pressure scenarios
+- ✅ Clarifies relationship to TDD
+- ✅ Ready for use
+
+## Key Insight
+
+**Most important bulletproofing:** Anti-patterns section showing exact shortcuts that feel justified in the moment. When Claude thinks "I'll just add this one quick fix", seeing that exact pattern listed as wrong creates cognitive friction.
+
+## Usage Example
+
+When encountering a bug:
+1. Load skill: skills/debugging/systematic-debugging
+2. Read overview (10 sec) - reminded of mandate
+3. Follow Phase 1 checklist - forced investigation
+4. If tempted to skip - see anti-pattern, stop
+5. Complete all phases - root cause found
+
+**Time investment:** 5-10 minutes
+**Time saved:** Hours of symptom-whack-a-mole
+
+---
+
+*Created: 2025-10-03*
+*Purpose: Reference example for skill extraction and bulletproofing*