claude-mcp-workflow 0.1.0
- package/.claude-plugin/plugin.json +13 -0
- package/.mcp.json +9 -0
- package/LICENSE +21 -0
- package/README.md +260 -0
- package/build/dashboard.d.ts +4 -0
- package/build/dashboard.d.ts.map +1 -0
- package/build/dashboard.js +91 -0
- package/build/dashboard.js.map +1 -0
- package/build/engine.d.ts +55 -0
- package/build/engine.d.ts.map +1 -0
- package/build/engine.js +486 -0
- package/build/engine.js.map +1 -0
- package/build/index.d.ts +2 -0
- package/build/index.d.ts.map +1 -0
- package/build/index.js +60 -0
- package/build/index.js.map +1 -0
- package/build/loader.d.ts +29 -0
- package/build/loader.d.ts.map +1 -0
- package/build/loader.js +166 -0
- package/build/loader.js.map +1 -0
- package/build/modifier.d.ts +42 -0
- package/build/modifier.d.ts.map +1 -0
- package/build/modifier.js +96 -0
- package/build/modifier.js.map +1 -0
- package/build/storage.d.ts +12 -0
- package/build/storage.d.ts.map +1 -0
- package/build/storage.js +62 -0
- package/build/storage.js.map +1 -0
- package/build/tools.d.ts +7 -0
- package/build/tools.d.ts.map +1 -0
- package/build/tools.js +316 -0
- package/build/tools.js.map +1 -0
- package/build/types.d.ts +417 -0
- package/build/types.d.ts.map +1 -0
- package/build/types.js +82 -0
- package/build/types.js.map +1 -0
- package/dashboard/dagre.min.js +801 -0
- package/dashboard/index.html +652 -0
- package/hooks/hooks.json +24 -0
- package/hooks/workflow-cleanup.sh +51 -0
- package/hooks/workflow-start.sh +79 -0
- package/package.json +44 -0
- package/templates/bug-fix.yaml +283 -0
- package/templates/code-review.yaml +164 -0
- package/templates/coding.yaml +176 -0
- package/templates/debugging.yaml +162 -0
- package/templates/explore.yaml +90 -0
- package/templates/file-code.yaml +69 -0
- package/templates/file-review.yaml +164 -0
- package/templates/investigate.yaml +84 -0
- package/templates/master.yaml +202 -0
- package/templates/new-feature.yaml +41 -0
- package/templates/planning.yaml +85 -0
- package/templates/refactoring.yaml +56 -0
- package/templates/reflection.yaml +61 -0
- package/templates/skills/architecture/SKILL.md +55 -0
- package/templates/skills/coding-skill-selector/SKILL.md +25 -0
- package/templates/skills/lang-haxe/SKILL.md +257 -0
- package/templates/skills/lang-python/SKILL.md +16 -0
- package/templates/skills/math/SKILL.md +14 -0
- package/templates/skills/preferences/SKILL.md +25 -0
- package/templates/skills/task-delegation/SKILL.md +53 -0
- package/templates/skills/web-reading/SKILL.md +62 -0
- package/templates/subagent.yaml +67 -0
- package/templates/testing.yaml +120 -0
- package/templates/web-research.yaml +53 -0
package/templates/coding.yaml
@@ -0,0 +1,176 @@
+name: coding
+description: "Code writing workflow — from thinking to verified implementation"
+initial: think
+max_transitions: 40
+
+states:
+  think:
+    prompt: |
+      MANDATORY FIRST ACTION — call `Skill("coding-skill-selector")` to load language/domain skills BEFORE doing anything else in this state. Do NOT skip this step.
+
+      Then think before writing:
+      1. What will I write? Outline the approach
+      2. Does it smell like a utility? Check framework built-ins first
+      3. Search project for existing utilities/helpers — use if found, create if broadly useful
+      4. Before creating new files: verify name/path doesn't already exist
+
+      For multi-file tasks (2+ files changing):
+      5. Produce a **per-file plan** listing each file and what changes it needs
+      6. Define **interface contracts** between files — function signatures, types, imports
+         that must match across file boundaries
+      7. Identify dependencies between files — which must be written first
+
+      Route based on scope:
+      - `single` → 1 file to change (or tightly coupled changes best done together)
+      - `multi` → 2+ independent files to change in parallel via agents
+      - `multi_self` → 2+ files but high coupling / detailed plan already exists → write yourself
+    transitions:
+      single: write_single
+      multi: delegate
+      multi_self: write_multi
+
+  write_single:
+    sub_workflow: file-code
+    on_complete: review
+    on_fail: review
+
+  write_multi:
+    task: "Implement multi-file changes (self)"
+    prompt: |
+      Write changes across multiple files yourself (high coupling or detailed plan).
+      Implement layer by layer respecting dependencies.
+      Build after all changes to verify compilation.
+      Then → transition `done`.
+    transitions:
+      done: review
+
+  delegate:
+    task: "Delegate per-file coding to agents"
+    prompt: |
+      Spawn one Agent per file for parallel implementation.
+      Use `subagent_type="general-purpose"`, `model="opus"`.
+
+      For each file, include in the agent prompt VERBATIM:
+      ---
+      IMPORTANT: Override the default CLAUDE.md rule about start().
+      Do NOT call start() without arguments.
+      Instead, your FIRST action must be:
+      mcp__plugin_workflow_wf__start({ workflow: "file-code", parent_session_id: "<SESSION_ID>" })
+      where <SESSION_ID> is the current session ID (substitute it before sending).
+      Then follow the file-code workflow states until completion.
+      ---
+      Also include:
+      - The file path
+      - The per-file plan for this specific file
+      - The interface contracts — what signatures/types/exports this file must provide
+        and what it can expect from other files
+      - How to get current state: `git diff HEAD -- <filepath>`
+
+      Launch ALL file agents in parallel (single message, multiple Agent tool calls).
+      Wait for all to complete. Collect their reports.
+
+      After ALL reports are collected:
+      1. Abort orphaned child workflow sessions — call `sessions` to find
+         any active `file-code` sessions that are children of this session,
+         then `abort` each one.
+      2. Transition `done`.
+    transitions:
+      done: integrate
+
+  integrate:
+    task: "Cross-file integration check"
+    prompt: |
+      All file agents have completed. Now verify integration:
+
+      1. **Interface contracts** — do all files satisfy the contracts defined in think?
+         Check function signatures, types, imports match across boundaries.
+      2. **Import resolution** — are all cross-file imports correct and present?
+      3. **Conflict resolution** — did agents make conflicting assumptions?
+         Fix any inconsistencies.
+      4. **Missing glue** — any wiring code needed to connect the pieces?
+
+      Fix any issues found directly.
+      Then → transition `done`.
+    transitions:
+      done: review
+
+  review:
+    task: "Route review by scope"
+    prompt: |
+      Route to the correct review approach based on how many files were changed:
+      - `single` → 1 file changed — review via file-review sub-workflow (yourself)
+      - `multi` → 2+ files changed — spawn file-review agents in parallel
+    transitions:
+      single: review_single
+      multi: review_delegate
+
+  review_single:
+    sub_workflow: file-review
+    on_complete: verify
+    on_fail: write_fix
+
+  review_delegate:
+    task: "Delegate per-file review to agents"
+    prompt: |
+      Spawn one Agent per changed file for deep review.
+      Use `subagent_type="general-purpose"`, `model="opus"`.
+
+      For each file, include in the agent prompt VERBATIM:
+      ---
+      IMPORTANT: Override the default CLAUDE.md rule about start().
+      Do NOT call start() without arguments.
+      Instead, your FIRST action must be:
+      mcp__plugin_workflow_wf__start({ workflow: "file-review", parent_session_id: "<SESSION_ID>" })
+      where <SESSION_ID> is the current session ID (substitute it before sending).
+      Then follow the file-review workflow states until completion.
+      ---
+      Also include:
+      - The file path
+      - How to get the diff: `git diff HEAD -- <filepath>`
+      - Review scope: `diff` — review only the changed lines
+
+      Launch ALL file agents in parallel (single message, multiple Agent tool calls).
+      Wait for all to complete. Collect their reports.
+
+      After ALL reports are collected:
+      1. Abort orphaned child workflow sessions — call `sessions` to find
+         any active `file-review` sessions that are children of this session,
+         then `abort` each one.
+      2. Transition `done`.
+    transitions:
+      done: cross_file
+
+  cross_file:
+    task: "Cross-file review analysis"
+    prompt: |
+      Review the big picture — things per-file agents can't see:
+
+      - [ ] Duplication across files: did different files introduce similar code?
+      - [ ] Missing updates: if an interface/type changed, are all consumers updated?
+      - [ ] Consistency: do the changes follow the same patterns across files?
+      - [ ] Integration: do the changed components work together correctly?
+      - [ ] Blast radius: are there callers/dependents that weren't updated?
+
+      Choose transition:
+      - `ok` → no issues, proceed to testing
+      - `fix` → found issues that need fixing
+    transitions:
+      ok: verify
+      fix: write_fix
+
+  write_fix:
+    task: "Fix review issues"
+    prompt: |
+      Fix the issues found during review. After fixing, transition back to review
+      for re-verification.
+    transitions:
+      done: review
+
+  verify:
+    sub_workflow: testing
+    on_complete: done
+    on_fail: write_fix
+
+  done:
+    prompt: "Code written, reviewed, and verified. Implementation complete."
+    terminal: true
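The `states` and `transitions` keys in the coding template describe a small state machine. As a reading aid, here is a minimal Python sketch of how such a table could be walked. It is not the package's engine (`build/engine.js` is not shown in this diff): the `walk` function is hypothetical, and treating `on_complete` / `on_fail` as ordinary transition names is an assumption made only for illustration.

```python
# Hypothetical walker for the coding workflow's transition table.
# State and transition names are copied from coding.yaml; the walking
# logic itself is an assumption, not the package's actual engine.
TRANSITIONS = {
    "think": {"single": "write_single", "multi": "delegate", "multi_self": "write_multi"},
    "write_single": {"on_complete": "review", "on_fail": "review"},
    "write_multi": {"done": "review"},
    "delegate": {"done": "integrate"},
    "integrate": {"done": "review"},
    "review": {"single": "review_single", "multi": "review_delegate"},
    "review_single": {"on_complete": "verify", "on_fail": "write_fix"},
    "review_delegate": {"done": "cross_file"},
    "cross_file": {"ok": "verify", "fix": "write_fix"},
    "write_fix": {"done": "review"},
    "verify": {"on_complete": "done", "on_fail": "write_fix"},
    "done": {},  # terminal state: no outgoing transitions
}
MAX_TRANSITIONS = 40  # mirrors max_transitions in the template


def walk(events, state="think"):
    """Apply a sequence of transition names and return the visited states."""
    path = [state]
    for count, event in enumerate(events, start=1):
        if count > MAX_TRANSITIONS:
            raise RuntimeError("transition budget exhausted")
        options = TRANSITIONS[state]
        if event not in options:
            raise ValueError(f"state {state!r} has no transition {event!r}")
        state = options[event]
        path.append(state)
    return path
```

For example, `walk(["multi", "done", "done", "multi", "done", "ok", "on_complete"])` follows the parallel path: delegate, integrate, review, review_delegate, cross_file, verify, done.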
package/templates/debugging.yaml
@@ -0,0 +1,162 @@
+name: debugging
+description: "Debugging workflow — diagnose first, fix never (until diagnosed)"
+initial: reproduce
+max_transitions: 40
+
+states:
+  reproduce:
+    prompt: |
+      MANDATORY FIRST ACTION — call `Skill("coding-skill-selector")` to load language/domain skills BEFORE doing anything else in this state. Do NOT skip this step.
+
+      **Rule #1: Diagnose FIRST, fix NEVER.**
+      **Rule #2: Reproduce BEFORE fixing.**
+
+      1. Build with diagnostic flags from loaded skills
+      2. Reproduce the bug via debug bridge or manual steps
+      3. Confirm the crash/error happens
+
+      Do NOT jump to fixing based on code reading alone.
+
+      Choose transition:
+      - `reproduced` → bug confirmed, ready to diagnose
+      - `need_info` → can't reproduce, need more details from user
+    transitions:
+      reproduced: hypothesize
+      need_info: clarify
+
+  clarify:
+    prompt: |
+      Ask the user for more details about the bug:
+      - Exact steps to reproduce
+      - Error messages or logs
+      - When it started happening
+      - Any recent changes
+
+      Choose transition:
+      - `got_info` → have enough info to try reproducing again
+    transitions:
+      got_info: reproduce
+
+  hypothesize:
+    prompt: |
+      **Step 1: Hypothesize WHERE, not WHY.**
+
+      Before adding traces, narrow down WHERE the problem is — not why it happens.
+      Form 1-3 hypotheses about the location of the bug.
+
+      Choose transition:
+      - `trace` → ready to add targeted traces to test hypotheses
+    transitions:
+      trace: trace
+
+  trace:
+    max_visits: 5
+    prompt: |
+      Add 3-5 targeted traces with unique prefixes (`[DBG1]`, `[DBG2]`).
+
+      Delegate to `debugger` subagent: build command, run command, what to look for, timeout.
+      **NEVER use `run_in_background: true`**.
+
+      **Anti-patterns:**
+      - Running builds in main context (100KB+ noise)
+      - Adding Mutex/volatile "just in case" — masks real problem
+
+      Choose transition:
+      - `found` → root cause identified
+      - `more_traces` → need more data
+      - `bridge` → need interactive debug bridge inspection
+      - `escalate` → exhausted trace attempts
+    transitions:
+      found: diagnosed
+      more_traces: trace
+      bridge: bridge_debug
+      escalate: escalated
+
+  bridge_debug:
+    prompt: |
+      Call `Skill("debug-bridge")`.
+
+      Use debug bridge for interactive state inspection.
+      Launch `interactive-openfl-debugger` subagent. Inspect state, take screenshots, simulate events.
+
+      Choose transition:
+      - `found` → root cause identified via bridge
+      - `back_to_trace` → need more code-level tracing
+    transitions:
+      found: diagnosed
+      back_to_trace: trace
+
+  diagnosed:
+    prompt: |
+      Root cause identified. Present diagnosis to the user, then decide:
+      - Trivial fix (1-2 lines, obvious, no risk) → `fix`
+      - Significant changes needed → `done` (exit to master; user decides next)
+      - Unclear scope or trade-offs → `ask_user`
+    transitions:
+      fix: fix
+      done: done
+      ask_user: ask_user
+
+  ask_user:
+    prompt: |
+      Present findings and trade-offs. Ask user how to proceed:
+      - Quick fix here?
+      - Plan a bigger change?
+      - More investigation needed?
+    transitions:
+      fix: fix
+      done: done
+      investigate: hypothesize
+
+  fix:
+    prompt: |
+      Root cause identified. Fix the bug.
+      Apply the minimal fix that addresses the root cause.
+      Do NOT add defensive code "just in case" — fix the actual problem.
+      After fixing → transition `review`.
+    transitions:
+      review: mini_review
+
+  mini_review:
+    prompt: |
+      Re-read what you just wrote. Code review only — do NOT build or run tests.
+      1. Typos, edge cases, off-by-one, null refs
+      2. Does the change affect surrounding logic?
+      3. Ripple check — grep all call sites of changed functions
+      - `ok` → proceed to testing
+      - `fix` → go back and fix
+    transitions:
+      ok: verify
+      fix: fix
+
+  verify:
+    sub_workflow: testing
+    on_complete: final_review
+    on_fail: fix
+
+  final_review:
+    prompt: |
+      Tests pass. Full review of all changes made during debugging.
+      1. New + old code together — do they fit?
+      2. Dead code — did the fix make something unnecessary?
+      3. Constants/config — magic values?
+      4. Simplification opportunities?
+      - `ok` → done
+      - `fix` → needs more work
+    transitions:
+      ok: done
+      fix: fix
+
+  done:
+    prompt: "Debugging complete. Present findings to user."
+    terminal: true
+
+  escalated:
+    prompt: |
+      Could not diagnose the root cause after maximum trace attempts.
+      Report findings to the user:
+      - What was tried
+      - What was ruled out
+      - Remaining hypotheses
+    terminal: true
+    outcome: fail
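The `trace` state asks for traces with unique prefixes (`[DBG1]`, `[DBG2]`) and warns about build noise in the main context. One way to see why the prefixes help is a small Python sketch that pulls only the tagged lines out of a noisy log and groups them by prefix. The function name `group_traces` and the exact log shape are assumptions for illustration; the template does not prescribe any particular tooling.

```python
import re
from collections import defaultdict

# Matches a trace tag such as [DBG1] and captures the message after it.
TRACE_RE = re.compile(r"\[(DBG\d+)\]\s*(.*)")


def group_traces(log_text):
    """Return {prefix: [messages]} for lines that carry a [DBGn] tag."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        match = TRACE_RE.search(line)
        if match:
            groups[match.group(1)].append(match.group(2))
    return dict(groups)
```

Untagged compiler and runtime output is dropped, so each hypothesis can be checked against its own short list of messages.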
package/templates/explore.yaml
@@ -0,0 +1,90 @@
+name: explore
+description: "Codebase exploration workflow — understand structure, trace code, find patterns"
+initial: scope
+max_transitions: 20
+
+states:
+  scope:
+    prompt: |
+      Determine what we're looking for and where.
+
+      Exploration types:
+      - **Structure** — project layout, module boundaries, dependency graph
+      - **Feature** — how a specific feature is implemented end-to-end
+      - **Pattern** — recurring code patterns, conventions, idioms
+      - **Code path** — trace execution flow through the codebase
+
+      1. Check loaded skills, MEMORY.md, and project CLAUDE.md first
+      2. If the answer is already available → transition `already_known`
+      3. If exploration is needed → transition `explore`
+    transitions:
+      already_known: synthesize
+      explore: investigate
+
+  investigate:
+    prompt: |
+      Explore the codebase using available tools.
+
+      **Preferred approach:**
+      - Launch parallel `Explore` subagents for independent questions
+      - Use Glob to find files by pattern
+      - Use Grep to search for symbols, strings, patterns
+      - Use Read to examine specific files
+
+      **Rules:**
+      - Start broad (Glob/Grep), then narrow (Read specific files)
+      - Follow imports and call chains to understand flow
+      - Note key files, patterns, and architectural decisions
+
+      Choose transition:
+      - `found` → gathered enough information to answer
+      - `deeper` → need to explore further (follow a lead, check another area)
+      - `ask_user` → need user input to decide direction (e.g., "is this feature needed?", "which approach do you prefer?")
+      - `not_found` → exhausted search avenues, nothing relevant found
+    max_visits: 4
+    transitions:
+      found: synthesize
+      deeper: investigate
+      ask_user: clarify
+      not_found: synthesize
+
+  clarify:
+    prompt: |
+      Findings so far require user input before proceeding.
+
+      Present what you've found so far concisely, then ask your question(s).
+      Use AskUserQuestion for critical decisions that affect the direction of exploration.
+
+      After getting user's answer → transition `continue` to resume investigation
+      with the new context, or `enough` if the answer completes the picture.
+    transitions:
+      continue: investigate
+      enough: synthesize
+
+  synthesize:
+    prompt: |
+      Compile findings into a structured answer.
+
+      Format:
+      - Start with a concise summary (1-2 sentences)
+      - Key files and their roles
+      - Architecture/patterns discovered
+      - Code flow if traced
+
+      If this reveals a recurring knowledge gap → note it for reflection.
+
+      Choose transition:
+      - `done` → exploration is complete, no further action needed (pure research)
+      - `needs_action` → findings indicate code changes are needed (delete, refactor, fix, implement)
+    transitions:
+      done: done
+      needs_action: needs_action
+
+  needs_action:
+    prompt: "Exploration complete. Findings indicate action is needed."
+    terminal: true
+    outcome: needs_action
+
+  done:
+    prompt: "Codebase exploration complete."
+    terminal: true
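The "start broad, then narrow" rule in the `investigate` state can be pictured outside the agent tooling as a glob pass followed by a line-level search. The Python sketch below is only an analogy for the Glob, Grep, and Read tools named in the prompt; `find_symbol` and its parameters are hypothetical and are not part of the package.

```python
from pathlib import Path


def find_symbol(root, pattern, symbol):
    """Broad pass: glob for candidate files. Narrow pass: scan them for a symbol.

    Returns (relative_path, line_number, stripped_line) tuples.
    """
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if symbol in line:
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

The returned file and line pairs are the kind of "key files" list that the `synthesize` state later asks for.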
package/templates/file-code.yaml
@@ -0,0 +1,69 @@
+name: file-code
+description: "Per-file coding — spawned by coding/bug-fix for each file to implement"
+initial: understand
+max_transitions: 25
+
+states:
+  understand:
+    task: "Understand file context"
+    prompt: |
+      You are implementing changes in a single file as part of a larger task.
+      The file path, plan, and interface contracts were provided in your task prompt.
+
+      1. Call `Skill("coding-skill-selector")` to load language/domain skills.
+      2. Read the existing file (if it exists) to understand current structure.
+      3. Read related files: imports, callers, siblings in the same module.
+         Understand the interfaces this file must satisfy.
+      4. Review the plan and interface contracts from your task prompt.
+         Identify exactly what needs to change in this file.
+
+      Then → transition `done`.
+    transitions:
+      done: write
+
+  write:
+    task: "Write implementation"
+    prompt: |
+      Write the implementation for this file following:
+      - The plan and interface contracts from your task prompt
+      - Loaded preference/style skills
+      - Existing patterns in the file and its neighbors
+
+      Check original before changing: `git diff HEAD -- <file>` to understand original values.
+
+      Write the code, then → transition `done`.
+    transitions:
+      done: self_review
+
+  self_review:
+    task: "Self-review code"
+    prompt: |
+      Re-read what you just wrote. Code review only — do NOT build or run tests.
+
+      1. **Skills as checklist** — re-read loaded `lang-*` and `preferences-*` skills.
+         For every WRONG pattern listed, scan your new code for matches.
+      2. **Typos, edge cases, off-by-one, null refs**
+      3. **Interface contracts** — does your code satisfy the contracts from the plan?
+         Check function signatures, return types, side effects match what other files expect.
+      4. **Ripple check** — if you added a call to function F, grep ALL existing call sites.
+         Some may now fire twice through your new path. Flag for the parent agent.
+      5. **DRY check** — duplicating existing code? Extract if obvious.
+
+      If issues found → fix them, then re-review.
+      When clean → transition `done`.
+    transitions:
+      done: report
+
+  report:
+    task: "Write implementation report"
+    prompt: |
+      Write a structured report for this file:
+
+      **File**: <path>
+      **What was done**: <one-paragraph summary of changes>
+      **Decisions made**: <any implementation choices and rationale>
+      **Interface notes**: <how this file connects to others — exports, imports, contracts satisfied>
+      **Concerns**: <anything the parent agent should verify — edge cases, assumptions, integration risks>
+
+      Output this report as your final response.
+    terminal: true
package/templates/file-review.yaml
@@ -0,0 +1,164 @@
+name: file-review
+description: "Per-file deep review — spawned by code-review for each changed file"
+initial: read_context
+max_transitions: 25
+
+states:
+  read_context:
+    task: "Read file and context"
+    prompt: |
+      You are reviewing a single file as part of a code review.
+      The file path and diff info were provided in your task prompt.
+
+      1. Call `Skill("coding-skill-selector")` to load language/domain skills.
+      2. Read the diff for this file to understand what changed.
+      3. Check the review scope from your task prompt:
+         - `diff` scope → read the diff, then read the full methods/blocks containing changes.
+           Changes can affect surrounding code, so always understand the logical context.
+         - `full` scope → read the FULL file
+      4. Read related files: imports, files that call this one, siblings in the same module.
+         Check if similar code already exists elsewhere that could be reused or unified.
+      5. Understand what changed and why.
+
+      Then → transition `done`.
+    transitions:
+      done: build_checklist
+
+  build_checklist:
+    task: "Build style checklist from skills"
+    prompt: |
+      Build a concrete style checklist from loaded skills:
+
+      Process each loaded `lang-*` and `preferences-*` skill ONE AT A TIME.
+      For each skill:
+      1. Re-read the FULL skill text
+      2. Extract EVERY rule — each `// WRONG` example, explicit rule, or convention
+         becomes a one-line check with the wrong pattern AND the correct one
+      3. Count rules extracted from this skill and output the count
+
+      Do NOT summarize, group, or prioritize — every rule matters equally.
+      If your total is under 25 rules across all skills, you missed some — re-read.
+
+      Output the full checklist, then → transition `done`.
+    transitions:
+      done: check_correctness
+
+  check_correctness:
+    task: "Check correctness"
+    prompt: |
+      Check based on review scope (`diff` or `full`).
+      For `diff` scope: changed lines are entry points, but always check the
+      logical context around them — the full method, callers of changed signatures,
+      other uses of changed fields. A change can break surrounding code.
+      - [ ] Logic errors: wrong conditions, off-by-one, inverted booleans, wrong operators
+      - [ ] Edge cases: null/undefined, empty collections, zero, negative, boundary values
+      - [ ] Error handling: uncaught exceptions, swallowed errors, wrong error types
+      - [ ] Race conditions: async without awaits, shared mutable state
+      - [ ] Resource leaks: unclosed handles, missing cleanup in error paths
+      - [ ] Type mismatches: wrong argument types, unsafe casts
+      - [ ] Regression risk: does this break existing callers or assumptions?
+
+      Then → transition `done`.
+    transitions:
+      done: check_loops
+
+  check_loops:
+    task: "Audit every loop and iteration pattern"
+    prompt: |
+      Find loops based on review scope (`diff` = only loops in changed code, `full` = ALL loops in file). For each one, output:
+
+        Line <N>: `<the loop header>` — iterates <what> to access <what> → verdict: ok / issue
+
+      Check each loop against these anti-patterns:
+
+      1. **Range-over-sparse-container**: the loop iterates an index range (`0...n`,
+         `i++`) but uses the index to look up a Map/Dict/HashMap, with a null check
+         to skip missing entries. This means the container is sparse and should be
+         iterated directly instead of via index range.
+         WRONG: `for (i in 0...n) { val = map[i]; if (val == null) continue; ... }`
+         RIGHT: `for (key => val in map) { ... }` (with filter if needed)
+
+      2. **Keys-then-lookup**: iterates `.keys()` then does a separate `map[key]`
+         lookup inside the body. Use key-value destructuring instead.
+         WRONG: `for (k in map.keys()) { v = map[k]; ... }`
+         RIGHT: `for (k => v in map) { ... }`
+
+      3. **Repeated linear scan**: the same collection is scanned multiple times in
+         the same method or call chain to find related data that could be found in
+         a single pass.
+
+      4. **Wrong container for access pattern**: Array with `.contains()` checks
+         (should be Set/Map), unsorted Array with binary-search-like access, etc.
+
+      5. **Continue instead of guard**: when a single condition filters the entire
+         loop body, use guard-style `for (...) if (cond) {` instead of
+         `if (!cond) continue;`. But when multiple independent checks follow
+         sequentially (each with its own variable), `continue` is fine.
+         WRONG: `for (x in list) { if (x.type != Foo) continue; doWork(x); }`
+         RIGHT: `for (x in list) if (x.type == Foo) { doWork(x); }`
+         OK: `for (f in range) { val = map[f]; if (val == null) continue; ... }`
+
+      6. **If/else-if chain instead of switch**: when a loop body checks the same
+         variable with `if/else if` chain (e.g. `if (x == "a") ... else if (x == "b")`),
+         use `switch` instead — cleaner, exhaustive, idiomatic.
+         WRONG: `if (node.name == "a") { ... } else if (node.name == "b") { ... }`
+         RIGHT: `switch node.name { case "a": ... case "b": ... }`
+
+      You MUST list ALL loops with verdicts. Do not summarize or skip "obvious" ones.
+
+      Then → transition `done`.
+    transitions:
+      done: check_style
+
+  check_style:
+    task: "Check code style"
+    prompt: |
+      Go through EVERY item in your style checklist from build_checklist.
+      Scope: `diff` = scan only lines in the diff; `full` = scan entire file.
+      If a line appears in the diff, it is part of the change and must meet
+      current style standards. Report with file:line.
+
+      Additionally check:
+      - [ ] Naming: consistent with conventions in neighboring files
+      - [ ] Duplication: does this duplicate existing code? Search the codebase.
+      - [ ] Dead code: unused imports, unreachable branches, commented-out code
+      - [ ] Complexity: overly nested logic, functions doing too many things
+      - [ ] Consistency: does new code follow patterns in the same module?
+
+      Then → transition `done`.
+    transitions:
+      done: check_security
+
+  check_security:
+    task: "Check security"
+    prompt: |
+      Check code for security issues (scope: `diff` = changed code, `full` = entire file):
+      - [ ] Injection: unsanitized inputs reaching dangerous APIs
+      - [ ] Auth/authz: missing or insufficient access controls
+      - [ ] Data exposure: secrets, credentials, or sensitive data leaking
+      - [ ] Input validation: missing validation at system boundaries
+      - [ ] Unsafe operations: anything exploitable by malicious input
+
+      Only flag real risks — don't invent hypothetical issues.
+      Then → transition `done`.
+    transitions:
+      done: report
+
+  report:
+    task: "Write file review report"
+    prompt: |
+      Write a structured report for this file:
+
+      **File**: <path>
+      **What changed**: <one-paragraph summary of the changes>
+      **Issues found**:
+      For each issue:
+      - Severity: bug / risk / style / nit
+      - Line: <line number>
+      - Description: <what's wrong>
+      - Suggestion: <how to fix>
+
+      If no issues found, say so explicitly.
+
+      Output this report as your final response.
+    terminal: true
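The WRONG/RIGHT examples in `check_loops` are written in a Haxe-like syntax. For readers more used to Python, the first two anti-patterns can be restated as below. This is a translation made for illustration, with hypothetical function names; it is not taken from the package or its skills.

```python
def sum_by_index_range(sparse, n):
    """Anti-pattern 1 (range-over-sparse-container): walk every index, skip holes."""
    total = 0
    for i in range(n):
        value = sparse.get(i)
        if value is None:
            continue
        total += value
    return total


def sum_direct(sparse, n):
    """Preferred shape: iterate only the entries that exist, filtering by key."""
    return sum(value for key, value in sparse.items() if 0 <= key < n)


def pairs_keys_then_lookup(mapping):
    """Anti-pattern 2 (keys-then-lookup): iterate keys, then look each one up again."""
    return [(key, mapping[key]) for key in mapping.keys()]


def pairs_direct(mapping):
    """Preferred shape: key-value iteration in a single step."""
    return [(key, value) for key, value in mapping.items()]
```

Both pairs of functions return the same results; the preferred versions do work proportional to the number of stored entries rather than to the index range, and avoid the second lookup.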