claude-code-scanner 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/DOCUMENTATION.md +1210 -0
- package/LICENSE +21 -0
- package/README.md +306 -0
- package/bin/cli.js +305 -0
- package/package.json +43 -0
- package/template/.claude/agents/api-builder.md +64 -0
- package/template/.claude/agents/architect.md +92 -0
- package/template/.claude/agents/debugger.md +69 -0
- package/template/.claude/agents/explorer.md +71 -0
- package/template/.claude/agents/frontend.md +61 -0
- package/template/.claude/agents/infra.md +66 -0
- package/template/.claude/agents/product-owner.md +73 -0
- package/template/.claude/agents/qa-lead.md +102 -0
- package/template/.claude/agents/reviewer.md +77 -0
- package/template/.claude/agents/security.md +81 -0
- package/template/.claude/agents/team-lead.md +128 -0
- package/template/.claude/agents/tester.md +72 -0
- package/template/.claude/docs/agent-error-protocol.md +89 -0
- package/template/.claude/docs/best-practices.md +93 -0
- package/template/.claude/docs/commands-template.md +73 -0
- package/template/.claude/docs/conflict-resolution-protocol.md +82 -0
- package/template/.claude/docs/context-budget.md +54 -0
- package/template/.claude/docs/execution-metrics-protocol.md +105 -0
- package/template/.claude/docs/flow-engine.md +475 -0
- package/template/.claude/docs/smithery-setup.md +51 -0
- package/template/.claude/docs/task-record-schema.md +196 -0
- package/template/.claude/hooks/drift-detector.js +143 -0
- package/template/.claude/hooks/execution-report.js +114 -0
- package/template/.claude/hooks/notify-approval.js +30 -0
- package/template/.claude/hooks/post-compact-recovery.js +68 -0
- package/template/.claude/hooks/post-edit-format.js +43 -0
- package/template/.claude/hooks/pre-compact-save.js +94 -0
- package/template/.claude/hooks/protect-files.js +39 -0
- package/template/.claude/hooks/session-start.js +76 -0
- package/template/.claude/hooks/stop-failure-handler.js +77 -0
- package/template/.claude/hooks/tool-failure-tracker.js +54 -0
- package/template/.claude/hooks/track-file-changes.js +34 -0
- package/template/.claude/hooks/validate-bash.js +34 -0
- package/template/.claude/manifest.json +22 -0
- package/template/.claude/profiles/backend.md +34 -0
- package/template/.claude/profiles/devops.md +36 -0
- package/template/.claude/profiles/frontend.md +34 -0
- package/template/.claude/rules/context-budget.md +34 -0
- package/template/.claude/scripts/verify-setup.js +210 -0
- package/template/.claude/settings.json +154 -0
- package/template/.claude/skills/context-check/SKILL.md +112 -0
- package/template/.claude/skills/execution-report/SKILL.md +229 -0
- package/template/.claude/skills/generate-environment/SKILL.md +128 -0
- package/template/.claude/skills/generate-environment/additional-skills.md +276 -0
- package/template/.claude/skills/generate-environment/artifact-templates.md +386 -0
- package/template/.claude/skills/generate-environment/domain-agents.md +202 -0
- package/template/.claude/skills/impact-analysis/SKILL.md +17 -0
- package/template/.claude/skills/metrics/SKILL.md +19 -0
- package/template/.claude/skills/progress-report/SKILL.md +27 -0
- package/template/.claude/skills/rollback/SKILL.md +75 -0
- package/template/.claude/skills/scan-codebase/SKILL.md +59 -0
- package/template/.claude/skills/scan-codebase/deep-scan-instructions.md +101 -0
- package/template/.claude/skills/scan-codebase/tech-markers.md +87 -0
- package/template/.claude/skills/setup-smithery/SKILL.md +38 -0
- package/template/.claude/skills/sync/SKILL.md +239 -0
- package/template/.claude/skills/task-tracker/SKILL.md +40 -0
- package/template/.claude/skills/validate-setup/SKILL.md +30 -0
- package/template/.claude/skills/workflow/SKILL.md +333 -0
- package/template/.claude/templates/README.md +42 -0
- package/template/CLAUDE.md +67 -0
@@ -0,0 +1,105 @@
# Execution Metrics Protocol

Every agent MUST include execution metrics in their HANDOFF block when completing work. This data feeds into the `/execution-report` skill for post-completion analytics.

## Extended HANDOFF Block Format

```
HANDOFF:
  from: @agent-name
  to: @next-agent-or-team-lead
  reason: why this handoff is happening
  artifacts:
    - list of files/docs produced
  context: |
    Summary of what was done and key decisions
  iteration: N/max (if in a loop)
  execution_metrics:
    turns_used: N
    files_read: N
    files_modified: N
    files_created: N
    tests_run: N (pass/fail/skip)
    coverage_delta: "+N%" or "-N%" or "N/A"
    hallucination_flags:
      - "referenced non-existent file X" (if any)
      - "CLEAN" (if none)
    regression_flags:
      - "test X changed from PASS to FAIL" (if any)
      - "CLEAN" (if none)
    confidence: HIGH/MEDIUM/LOW
    notes: |
      Any concerns, assumptions made, or uncertainty
```

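As a rough illustration of how a consumer might sanity-check the `execution_metrics` part of a HANDOFF block before aggregating it, here is a minimal Python sketch. It assumes the block has already been parsed into a dictionary; the name `validate_metrics`, the dictionary representation, and treating `notes` as optional are assumptions made for this example and are not part of the package.

```python
# Hypothetical validator for an already-parsed execution_metrics mapping.
REQUIRED_KEYS = {
    "turns_used", "files_read", "files_modified", "files_created",
    "tests_run", "coverage_delta", "hallucination_flags",
    "regression_flags", "confidence",
}
CONFIDENCE_LEVELS = {"HIGH", "MEDIUM", "LOW"}


def validate_metrics(metrics: dict) -> list[str]:
    """Return a list of problems; an empty list means the block looks complete."""
    missing = sorted(REQUIRED_KEYS - set(metrics))
    problems = [f"missing field: {key}" for key in missing]
    # Only check the value when the field is present, to avoid double reporting.
    if "confidence" in metrics and metrics["confidence"] not in CONFIDENCE_LEVELS:
        problems.append("confidence must be HIGH, MEDIUM, or LOW")
    # The flag lists must never be empty: they hold findings or the literal "CLEAN".
    for key in ("hallucination_flags", "regression_flags"):
        if key in metrics and not metrics[key]:
            problems.append(f"{key} must list findings or contain 'CLEAN'")
    return problems
```

An empty result means every field from the format above is present and the enumerated values are recognized; it says nothing about whether the reported numbers are accurate.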
## What Each Metric Means

### turns_used
How many agentic turns the agent consumed. Compare against `maxTurns` to assess efficiency.
- Under 50% of maxTurns = efficient
- 50-80% = normal
- Over 80% = approaching limit, may indicate complexity or inefficiency

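The three bands above can be expressed as a small helper. This is only a sketch: the function name `classify_turns` is hypothetical, and placing exactly 50% and exactly 80% in the "normal" band is an assumption about how the boundaries are meant to be read.

```python
def classify_turns(turns_used: int, max_turns: int) -> str:
    """Map turn usage onto the efficiency bands described for turns_used."""
    if max_turns <= 0:
        raise ValueError("max_turns must be positive")
    ratio = turns_used / max_turns
    if ratio < 0.5:
        return "efficient"
    if ratio <= 0.8:  # assumption: both band edges count as "normal"
        return "normal"
    return "approaching limit"
```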
### files_read / files_modified / files_created
Raw count of file operations. Used to calculate:
- Context consumption estimate (~500 tokens per file read)
- Blast radius (files modified)
- Code generation volume (files created)

### tests_run
Test suite results after this agent's work. Format: `N (pass/fail/skip)`.
- All pass = quality score contribution
- New failures = regression flag

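One plausible reading of the `N (pass/fail/skip)` format is a total followed by three counts, for example `42 (40/1/1)`. The sketch below parses that shape; the concrete example string, the function name `parse_tests_run`, and the consistency check are assumptions for illustration, since the source does not spell out the exact syntax.

```python
import re

# Assumed concrete shape: "<total> (<passed>/<failed>/<skipped>)"
TESTS_RUN_PATTERN = re.compile(r"^\s*(\d+)\s*\((\d+)/(\d+)/(\d+)\)\s*$")


def parse_tests_run(value: str) -> dict:
    """Split a tests_run string into counts and check that they add up."""
    match = TESTS_RUN_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized tests_run value: {value!r}")
    total, passed, failed, skipped = (int(group) for group in match.groups())
    return {
        "total": total,
        "passed": passed,
        "failed": failed,
        "skipped": skipped,
        "consistent": total == passed + failed + skipped,
    }
```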
### coverage_delta
Change in test coverage after this agent's work.
- Positive or zero = quality maintained
- Negative = quality decreased (flag for review)

### hallucination_flags
Self-check by the agent for potential hallucinations:
- Referenced a file that doesn't exist
- Called a function that doesn't exist in the codebase
- Used an API pattern that doesn't match the project
- Generated import paths that don't resolve
- **CLEAN** if all references verified

### regression_flags
Self-check by the agent for regressions introduced:
- Tests that passed before but fail after changes
- Lint errors introduced
- Type errors introduced
- Build failures introduced
- **CLEAN** if no regressions

### confidence
Agent's self-assessment of output quality:
- **HIGH**: All references verified, tests pass, patterns match
- **MEDIUM**: Most references verified, minor uncertainty about edge cases
- **LOW**: Significant assumptions made, references not fully verifiable, needs human review

## @team-lead Responsibilities
The @team-lead agent is responsible for:
1. Collecting execution_metrics from every agent handoff
2. Aggregating metrics into the task record's Execution Report section
3. Triggering `/execution-report` after each phase completion
4. Flagging any agent with hallucination_flags != CLEAN or confidence = LOW
5. Blocking phase advancement if regression_flags != CLEAN

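Rules 4 and 5 amount to two boolean checks on each handoff. A minimal sketch, assuming the metrics arrive as a dictionary with the flag fields as lists of strings; `review_handoff` and its return keys are hypothetical names, not something the package defines.

```python
def review_handoff(metrics: dict) -> dict:
    """Apply the flagging and blocking rules to one agent's execution_metrics."""
    hallucinations = [f for f in metrics.get("hallucination_flags", []) if f != "CLEAN"]
    regressions = [f for f in metrics.get("regression_flags", []) if f != "CLEAN"]
    return {
        # Rule 4: flag on any hallucination finding or LOW confidence.
        "flag_agent": bool(hallucinations) or metrics.get("confidence") == "LOW",
        # Rule 5: any regression finding blocks phase advancement.
        "block_phase": bool(regressions),
    }
```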
## Automatic Collection
The `execution-report.js` Stop hook automatically captures:
- Active task status and metadata
- Agent mention counts from handoff log
- Loop iteration counts from task record
- File change counts from changes log
- Snapshots saved to `.claude/reports/executions/`

## Report Aggregation
The `/execution-report` skill combines:
- Hook-collected snapshots (automatic)
- Agent-reported execution_metrics (from HANDOFF blocks)
- Test suite results (run on demand)
- Coverage reports (run on demand)
- File reference verification (scan on demand)

Into a single scored report with success, hallucination, and regression assessments.

@@ -0,0 +1,475 @@
# Real-World Flow Engine

## Orchestration Model
**Subagents cannot spawn other subagents** in Claude Code. All agent-to-agent coordination is orchestrated from the `/workflow` skill running in the main conversation context (with `context: fork`). When a phase requires multiple agents, invoke them sequentially or in parallel from the workflow — never expect one agent to call another directly.

**ALL handoffs route through the orchestrator** (workflow skill or @team-lead). Even when the flow-engine describes "@debugger fixes, handoff to @tester", the actual flow is: @debugger -> HANDOFF to orchestrator -> orchestrator invokes @tester.

## Handoff Protocol
Every agent/human transition MUST include a structured handoff block:

```
HANDOFF:
  from: @agent-name
  to: @next-agent-or-user
  reason: why this handoff is happening
  artifacts:
    - path/to/file-produced.md
    - path/to/test-results.json
  context: |
    Summary of what was done, key decisions made,
    and anything the next agent needs to know
  iteration: N/max (if in a loop)
  status: complete | blocked | escalating
```

### Handoff Rules
1. Every agent output MUST end with a HANDOFF block
2. The receiving agent MUST read all listed artifacts before starting
3. If `status: blocked`, route to @team-lead for resolution
4. If `status: escalating`, route to user with options
5. ALL routing goes through the orchestrator — agents NEVER invoke each other directly

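Rules 3 to 5 describe a small routing decision made by the orchestrator. The following sketch shows one way to express it, assuming the HANDOFF block is available as a dictionary; `route_handoff` is a hypothetical helper, and defaulting a missing `status` to `complete` is an assumption.

```python
def route_handoff(handoff: dict) -> str:
    """Return the recipient the orchestrator should invoke next."""
    status = handoff.get("status", "complete")  # assumption: missing status means complete
    if status == "blocked":
        return "@team-lead"
    if status == "escalating":
        return "user"
    # Normal case: the orchestrator, not the sending agent, invokes the named target.
    return handoff["to"]
```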
---

## Loop 1: Dev-Test Loop (Phase 6, max 5 iterations)

### Entry Condition
Phase 5 dev agent completes -> HANDOFF to orchestrator -> orchestrator invokes @tester FIRST (not parallel with @debugger).

### Flow
```
ENTRY: orchestrator invokes @tester with Phase 5 artifacts
  |
  v
@tester runs full test suite
  |
  +-> ALL PASS + coverage maintained -> EXIT: advance to Phase 7
  |
  +-> FAILURES -> @tester reports failing tests in HANDOFF
        |
        v
      orchestrator routes to fix agent:
        - test logic bug -> @debugger (minimal fix)
        - backend code issue -> @api-builder (if needs domain knowledge)
        - frontend code issue -> @frontend (if needs UI knowledge)
        - infra/config issue -> @infra
        |
        v
      fix agent applies fix -> runs tests locally -> HANDOFF back
        |
        v
      orchestrator invokes @tester again (increment iteration)
        |
        +-> ALL PASS -> EXIT: advance to Phase 7
        +-> STILL FAILING -> repeat loop
        +-> iteration 5 -> CIRCUIT BREAKER: escalate to user
```

### Loop State Tracking
```markdown
## Loop State
- dev-test-loop: iteration N/5
- last-failure: [test name] — [error summary] at [ISO timestamp]
- fix-agent: @debugger|@api-builder|@frontend|@infra
- coverage-baseline: N% (measured at Phase 5 end, before Phase 6 starts)
- coverage-current: N%
```

### Coverage Rule
- **Baseline**: measured at end of Phase 5 (before any Phase 6 iteration)
- **Maintained**: coverage-current >= coverage-baseline
- If coverage drops: @tester must add missing tests before advancing

### Exit Criteria
- ALL tests pass (zero failures)
- Coverage >= baseline
- @tester confirms in HANDOFF: `reason: testing complete — all pass`

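To make the interaction between the exit criteria and the circuit breaker concrete, here is a hedged Python sketch of the loop. The callables `run_tests` and `apply_fix` stand in for invoking @tester and a fix agent and are purely hypothetical; counting each @tester run as one iteration, and folding "add missing tests" into `apply_fix`, are simplifying assumptions.

```python
from typing import Callable

MAX_DEV_TEST_ITERATIONS = 5  # from the Loop 1 heading


def run_dev_test_loop(
    run_tests: Callable[[], tuple[bool, float]],
    apply_fix: Callable[[], None],
    coverage_baseline: float,
) -> str:
    """Drive the test/fix cycle until the exit criteria hold or the breaker trips."""
    for iteration in range(1, MAX_DEV_TEST_ITERATIONS + 1):
        all_pass, coverage = run_tests()
        # Exit needs both conditions: zero failures and coverage >= baseline.
        if all_pass and coverage >= coverage_baseline:
            return f"advance to Phase 7 after iteration {iteration}"
        if iteration < MAX_DEV_TEST_ITERATIONS:
            apply_fix()  # stands in for routing to a fix agent or adding tests
    return "CIRCUIT BREAKER: escalate to user"
```

Note that a fully passing suite with reduced coverage does not exit the loop in this sketch, which mirrors the coverage rule above.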
---

## Loop 2: Review-Rework Loop (Phase 7, max 3 iterations)

### Entry Condition
Phase 6 passes -> orchestrator invokes @reviewer + @security in PARALLEL.

### Flow
```
ENTRY: orchestrator invokes @reviewer + @security in parallel
  |
  v
@reviewer produces: APPROVE or REQUEST_CHANGES (with comments)
@security produces: APPROVE or REQUEST_CHANGES (with findings)
  |
  +-> BOTH APPROVE -> EXIT: advance to Phase 8
  |
  +-> SPLIT DECISION (one approves, one requests changes):
  |     -> treat as REQUEST_CHANGES (stricter wins)
  |     -> only re-review with the agent that requested changes
  |
  +-> BOTH REQUEST_CHANGES -> combine all comments
        |
        v
      orchestrator routes fix by comment category:
        - code quality / conventions -> original dev agent (@api-builder / @frontend)
        - security finding -> @debugger (security fix) or original dev agent
        - performance issue -> original dev agent
        - test gap -> @tester (add tests, not a review-loop iteration)
        |
        v
      fix agent addresses ALL comments -> HANDOFF with list of addressed items
        |
        v
      orchestrator invokes ONLY the agent(s) that requested changes (not both)
        - @reviewer if they had comments
        - @security if they had findings
        - both only if both originally requested changes
        |
        v
      re-review focuses on: previously flagged items + any new code from fix
        |
        +-> APPROVE -> EXIT (check if other reviewer still needs to re-review)
        +-> REQUEST_CHANGES again -> repeat (increment iteration)
        +-> iteration 3 -> CIRCUIT BREAKER: escalate to user
```

### Loop State Tracking
```markdown
## Loop State
- review-loop: iteration N/3
- reviewer-status: APPROVE | REQUEST_CHANGES (N critical, M suggestions)
- security-status: APPROVE | REQUEST_CHANGES (N findings)
- open-comments: [count] critical, [count] suggestions
- addressed-comments: [count] fixed, [count] won't-fix (with justification)
```

### Partial Re-Review Rule
- If 8/10 comments addressed and 2 marked "won't fix" with justification:
  - Reviewer evaluates the justifications
  - If justifications are valid -> APPROVE with noted exceptions
  - If justifications are weak -> REQUEST_CHANGES on those 2 only

### Exit Criteria
- @reviewer: APPROVE (zero critical issues)
- @security: APPROVE (zero HIGH/CRITICAL findings)
- Both confirmed in HANDOFF blocks

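The "stricter wins" rule and the partial re-review rule both reduce to tracking which reviewer still has open requests. A minimal sketch under that reading; `next_review_step` and its return shape are hypothetical.

```python
def next_review_step(reviewer: str, security: str) -> dict:
    """Combine the two verdicts; any REQUEST_CHANGES wins over APPROVE."""
    verdicts = {"@reviewer": reviewer, "@security": security}
    # Only agents that requested changes are asked to re-review after the fix.
    pending = [agent for agent, verdict in verdicts.items() if verdict == "REQUEST_CHANGES"]
    if not pending:
        return {"outcome": "APPROVE", "re_review": []}
    return {"outcome": "REQUEST_CHANGES", "re_review": pending}
```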
---

## Loop 3: QA-Bug Loop (Phase 9, max 3 iterations PER BUG)

### Entry Condition
Phase 8 PR+CI pass -> orchestrator invokes @qa-lead to create test plan, then @tester to execute.

### Flow
```
ENTRY: orchestrator invokes @qa-lead (creates QA test plan)
  |
  v
orchestrator invokes @tester (executes test plan scenarios)
  |
  v
@qa-lead reviews results, files bug reports for failures
  |
  +-> ZERO BUGS -> EXIT: advance to Phase 10 (QA sign-off)
  |
  +-> BUGS FOUND -> orchestrator triages by severity:
        |
        v
      Phase 9a: Fix P0 bugs first (block everything)
      Phase 9b: Fix P1 bugs (block sign-off)
      Phase 9c: Fix P2 bugs (QA decides)
      Phase 9d: Log P3/P4 as known issues (don't block)
        |
        v
      For each bug (priority order):
        orchestrator invokes @debugger with bug report
          |
          v
        @debugger fixes -> runs targeted test -> HANDOFF
          |
          v
        orchestrator invokes @tester: run regression suite
        (to catch: fix for BUG-1 broke BUG-2's previous fix)
          |
          v
        orchestrator invokes @qa-lead: re-verify THIS bug
          |
          +-> VERIFIED: close bug, move to next
          +-> REOPENED: back to @debugger (increment per-bug counter)
          +-> per-bug iteration 3 -> escalate to @team-lead
```

### Loop State Tracking
```markdown
## Loop State
- qa-bug-loop:
  - BUG-{id}-1 (P1): iteration N/3 — [OPEN|FIXED|VERIFIED|REOPENED]
  - BUG-{id}-2 (P3): iteration 0/3 — [OPEN] (known issue, won't block)
  - BUG-{id}-3 (P0): iteration N/3 — [VERIFIED]
- total-bugs: N found, M fixed, K verified, J known-issues
- regression-check-after-each-fix: true
```

### Per-Bug vs Global Iteration
- **3 iterations PER BUG** — each bug gets up to 3 fix attempts
- If a single bug exceeds 3 iterations: escalate THAT bug to @team-lead
- Other bugs continue being fixed normally
- **Global circuit breaker**: if total fix attempts across ALL bugs exceeds 15, escalate entire Phase 9

### P2 Bug Handling
P2 bugs get explicit treatment:
1. @qa-lead evaluates: is the workaround acceptable for release?
2. If YES -> mark as known issue with workaround documentation, CONDITIONAL
3. If NO -> treat as P1 (blocks sign-off, must fix)
4. Decision logged in task record Decision Log

### Regression Between Bug Fixes
After EVERY bug fix, @tester runs the full regression suite BEFORE @qa-lead verifies. This catches:
- Fix for BUG-1 that breaks BUG-2's previous fix
- Fix that introduces new failures unrelated to the reported bug

### Exit Criteria
- All P0/P1 bugs: VERIFIED and CLOSED
- P2 bugs: VERIFIED or CONDITIONAL (with documented workaround)
- P3/P4 bugs: logged as known issues
- @qa-lead confirms readiness for sign-off

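The per-bug limit and the global circuit breaker are two counters that are checked on every fix attempt. The sketch below is one possible bookkeeping scheme; the class `BugFixTracker` is hypothetical, and it follows the prose reading ("exceeds 3", "exceeds 15") rather than the diagram's "iteration 3", which could be read as one attempt earlier.

```python
from collections import Counter

PER_BUG_LIMIT = 3   # fix attempts allowed per bug
GLOBAL_LIMIT = 15   # fix attempts allowed across all bugs in Phase 9


class BugFixTracker:
    """Counts fix attempts per bug and reports which escalation applies."""

    def __init__(self) -> None:
        self.attempts: Counter = Counter()

    def record_attempt(self, bug_id: str) -> str:
        self.attempts[bug_id] += 1
        # The global breaker takes precedence over the per-bug escalation.
        if sum(self.attempts.values()) > GLOBAL_LIMIT:
            return "escalate entire Phase 9"
        if self.attempts[bug_id] > PER_BUG_LIMIT:
            return f"escalate {bug_id} to @team-lead"
        return "continue"
```

Escalating one bug leaves the counters for the other bugs untouched, so they continue to be fixed normally.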
---

## Loop 4: CI Fix Loop (Phase 8, max 3 iterations)

### Flow
```
ENTRY: orchestrator creates PR via gh cli
  |
  v
CI runs (GitHub Actions / GitLab CI / etc.)
  |
  +-> ALL CHECKS PASS -> EXIT: advance to Phase 9
  |
  +-> CI FAILURE -> orchestrator reads CI logs
        |
        v
      classify failure:
        - test failure -> @debugger (fix code, NOT test expectations)
        - lint failure -> original dev agent (fix style)
        - type error -> original dev agent (fix types)
        - build failure -> @debugger or @infra (depends on error)
        - flaky test -> @tester (stabilize test)
        |
        v
      fix agent applies fix -> pushes to PR branch
        |
        v
      CI re-runs (increment iteration)
        |
        +-> PASS -> EXIT
        +-> FAIL -> repeat
        +-> iteration 3 -> CIRCUIT BREAKER
```

### Loop State Tracking
```markdown
## Loop State
- ci-fix-loop: iteration N/3
- last-ci-failure: [check name] — [error summary] at [ISO timestamp]
- fix-agent: @debugger|@api-builder|@frontend|@infra|@tester
```

### CI Fix Review Rule
If @debugger changes code to fix CI:
- If change is trivial (typo, import, config): push directly, no re-review needed
- If change is substantive (logic change, new code): flag in HANDOFF, orchestrator decides whether to loop back to Phase 7 for re-review before re-running CI

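The failure classification in the flow above is essentially a lookup table. A small sketch, assuming the orchestrator has already assigned one of the five category labels; the table name, `route_ci_failure`, and the placeholder string for the original dev agent are hypothetical.

```python
# Category -> candidate fix agents, transcribed from the CI Fix flow.
CI_FIX_ROUTES = {
    "test failure": ["@debugger"],
    "lint failure": ["original dev agent"],
    "type error": ["original dev agent"],
    "build failure": ["@debugger", "@infra"],  # choice depends on the error
    "flaky test": ["@tester"],
}


def route_ci_failure(category: str, original_dev_agent: str) -> list[str]:
    """Return candidate fix agents for a classified CI failure."""
    candidates = CI_FIX_ROUTES.get(category)
    if candidates is None:
        raise KeyError(f"unknown CI failure category: {category}")
    # Substitute the placeholder with whichever agent wrote the code in Phase 5.
    return [original_dev_agent if c == "original dev agent" else c for c in candidates]
```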
---

## Loop 5: Sign-off Rejection Loop (Phase 10, max 2 full rejection cycles)

### Flow
```
ENTRY: orchestrator invokes sign-off gates sequentially
  |
  v
Gate 1: @qa-lead (QA sign-off)
  +-> APPROVED or CONDITIONAL -> proceed to Gate 2
  +-> REJECTED (P0/P1 bugs) -> route back to Phase 5
  |     (QA approval is INVALIDATED, must re-sign after fix)
  |
Gate 2: @product-owner (Business sign-off)
  +-> APPROVED -> proceed to Gate 3
  +-> REJECTED:
  |     wrong requirements -> Phase 4 (revise criteria)
  |     UI not right -> Phase 5c (frontend fix)
  |     missing edge case -> Phase 5 (add case)
  |     defer -> ON_HOLD (exit workflow)
  |     (QA approval PRESERVED — business issue, not quality)
  |
Gate 3: @team-lead (Tech sign-off)
  +-> APPROVED -> EXIT: advance to Phase 11
  +-> REJECTED:
        architecture issue -> Phase 3 (redesign -> full re-flow)
        performance issue -> Phase 5 (optimize) -> Phase 6
        needs more tests -> Phase 5d (add tests) -> Phase 6
        (QA + Business approvals INVALIDATED for architecture rejection)
        (QA + Business approvals PRESERVED for test/performance rejection)
```

### Sign-off State Preservation Rules
| Rejection Source | QA Approval | Business Approval | Reason |
|-----------------|-------------|-------------------|--------|
| @qa-lead (bugs) | INVALIDATED | INVALIDATED | Code changed, needs full re-verify |
| @product-owner (requirements) | PRESERVED | INVALIDATED | Code didn't change quality |
| @product-owner (UI) | PRESERVED | INVALIDATED | Only frontend changed |
| @team-lead (architecture) | INVALIDATED | INVALIDATED | Major structural change |
| @team-lead (performance) | PRESERVED | PRESERVED | Optimization, not logic change |
| @team-lead (tests) | PRESERVED | PRESERVED | Adding tests, not changing behavior |

### Outer Rejection Circuit Breaker
**Max 2 full rejection cycles** (Phase 10 -> rework -> return to Phase 10):
- After 2 full rejection cycles at Phase 10, STOP and escalate to user
- Options: continue, re-scope task, split into smaller tasks, cancel
- This prevents infinite ping-pong between phases

### Tracking
```markdown
## Loop State
- signoff-rejection-cycle: N/2
- qa-signoff: APPROVED|CONDITIONAL|REJECTED|PENDING
- biz-signoff: APPROVED|REJECTED|PENDING
- tech-signoff: APPROVED|REJECTED|PENDING
- last-rejection: [who] — [reason] at [ISO timestamp]
```

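The preservation table can be read as a lookup from (rejection source, reason) to which earlier approvals survive. The sketch below encodes it that way. The names `PRESERVATION_RULES` and `apply_rejection` are hypothetical, and representing an INVALIDATED approval by resetting it to `PENDING` is an assumption; the source only lists `PENDING` as one of the tracked states.

```python
# (rejection source, reason) -> (QA approval preserved?, business approval preserved?)
PRESERVATION_RULES = {
    ("@qa-lead", "bugs"): (False, False),
    ("@product-owner", "requirements"): (True, False),
    ("@product-owner", "UI"): (True, False),
    ("@team-lead", "architecture"): (False, False),
    ("@team-lead", "performance"): (True, True),
    ("@team-lead", "tests"): (True, True),
}


def apply_rejection(signoffs: dict, source: str, reason: str) -> dict:
    """Return a copy of the sign-off state after a Phase 10 rejection."""
    keep_qa, keep_biz = PRESERVATION_RULES[(source, reason)]
    updated = dict(signoffs)
    if not keep_qa:
        updated["qa-signoff"] = "PENDING"   # assumption: INVALIDATED maps to PENDING
    if not keep_biz:
        updated["biz-signoff"] = "PENDING"
    return updated
```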
---

## Loop 6: Deploy-Failure Loop (Phase 11, max 2 attempts)

### Flow
```
ENTRY: orchestrator invokes @infra for deployment
  |
  v
@infra runs: pre-checks -> merge PR -> deploy -> health check -> smoke test
  |
  +-> ALL PASS -> EXIT: advance to Phase 12
  |
  +-> FAILURE -> @infra triages:
        |
        +-> config/env issue -> @infra fixes config -> retry deploy (no Phase 5)
        +-> code bug revealed in prod -> @debugger hotfix -> Phase 6->7->8 fast-track
        +-> infra issue (capacity, networking) -> @infra resolves -> retry deploy
        +-> unknown -> rollback via /rollback, escalate to user
        |
        v
      retry deploy (increment iteration)
        +-> PASS -> EXIT
        +-> FAIL -> iteration 2 -> STOP, rollback, escalate
```

### Loop State Tracking
```markdown
## Loop State
- deploy-loop: iteration N/2
- last-deploy-failure: [error type] — [summary] at [ISO timestamp]
- rollback-executed: true|false
```

### Deploy Triage (NOT blind Phase 5 re-route)
Deploy failures are triaged BEFORE routing:
1. **Config issue**: @infra fixes config, retry deploy directly (no code change)
2. **Code bug**: follows hotfix fast-track (Phase 5->6->7->8->11)
3. **Infra issue**: @infra resolves, retry deploy directly
4. **Unknown**: rollback + escalate to user

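The triage step can be sketched as a lookup with two fallbacks: unknown failure types roll back, and a failure on the final allowed attempt stops the loop regardless of type. The labels in `DEPLOY_TRIAGE`, the function name, and treating `iteration` as the number of the attempt that just failed are assumptions for this example.

```python
DEPLOY_TRIAGE = {
    "config": "@infra fixes config, retry deploy",
    "code bug": "@debugger hotfix via fast-track",
    "infra": "@infra resolves, retry deploy",
}


def triage_deploy_failure(failure_type: str, iteration: int, max_attempts: int = 2) -> str:
    """Decide the next action after a failed deploy attempt."""
    # A failure on the last allowed attempt trips the breaker regardless of type.
    if iteration >= max_attempts:
        return "STOP, rollback, escalate"
    # Anything that cannot be classified is treated as "unknown".
    return DEPLOY_TRIAGE.get(failure_type, "rollback via /rollback, escalate to user")
```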
---

## Bug Severity Quick Reference
- **P0**: System down, data loss, security breach — blocks everything, immediate fix
- **P1**: Core feature broken, no workaround — blocks QA sign-off, must fix
- **P2**: Feature broken with workaround — QA decides: treat as P1 or CONDITIONAL with documented workaround
- **P3**: Minor issue, cosmetic — conditional approve
- **P4**: Enhancement, nice-to-have — approve with known issues list

## Hotfix Fast-Track
Production P0/P1 -> skip Phase 3+4:
```
Phase 1 (type=hotfix) -> Phase 2 (abbreviated impact)
  -> Phase 5 (@debugger as primary dev, not @api-builder)
  -> Phase 6 (dev-test: max 3 iterations)
  -> Phase 7 (review: max 2 iterations)
  -> Phase 8 (CI: max 2 attempts)
  -> Phase 9 (verify-only: @qa-lead confirms fix, no full test plan)
  -> Phase 10 (tech sign-off ONLY, skip QA formal + business)
  -> Phase 11 (deploy: max 1 attempt, immediate rollback on failure)
  -> Phase 12 (monitor 15min, not 30)
```

**Hotfix failure at any phase:** rollback + escalate immediately (no re-routing through earlier phases).

## Spike Flow (Research Only)
```
Phase 1 (type=spike, NO branch) -> Phase 2 (@explorer only)
  -> Phase 3 (@architect + @explorer investigate)
  -> SKIP Phases 4-12
  -> Output: Research Report (findings, recommendation, complexity estimate)
  -> Status: CLOSED
  -> If "proceed as feature": user runs /workflow new "..."
```
No loops, no circuit breakers, no sign-offs.

## Concurrency Rule
**ONE active workflow at a time.** Before `/workflow new`:
1. Scan `.claude/tasks/` for active tasks
2. If found: prompt user to ON_HOLD or cancel the active task
3. NEVER run parallel workflows (context/file conflicts)

## ON_HOLD Management
- Enter: product-owner defers, or user explicitly pauses
- State preserved: current phase + all loop counters
- Resume: `/workflow resume TASK-id` -> re-enter at saved phase
- 7+ days: session-start hook warns user
- 30+ days: session-start hook suggests cancel
- NO auto-cancel (user must decide)

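The 7-day and 30-day thresholds for ON_HOLD tasks suggest a simple age check at session start. The sketch below is an illustration only: `on_hold_notice` is a hypothetical function and is not taken from the package's `session-start.js` hook, whose actual logic is not shown here.

```python
from datetime import date
from typing import Optional


def on_hold_notice(held_since: date, today: date) -> Optional[str]:
    """Return the notice for an ON_HOLD task, or None when it is still fresh."""
    days = (today - held_since).days
    if days >= 30:
        return "suggest cancel"  # never auto-cancel; the user decides
    if days >= 7:
        return "warn"
    return None
```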
## CANCELLED Cleanup
When `/workflow cancel TASK-id`:
1. Status -> CANCELLED
2. Close open PR (`gh pr close`)
3. Delete local feature branch
4. Clean up worktrees if any
5. Task record preserved for history

## Circuit Breaker Summary
| Loop | Normal Max | Hotfix Max | Scope | Escalation |
|------|-----------|------------|-------|------------|
| Dev-Test (P6) | 5 | 3 | Global | User options |
| Review (P7) | 3 | 2 | Global | User options |
| CI Fix (P8) | 3 | 2 | Global | User options |
| QA Bug (P9) | 3/bug, 15 total | 2/bug, 6 total | Per-bug | @team-lead per bug |
| Sign-off (P10) | 2 cycles | 1 cycle | Global | User options |
| Deploy (P11) | 2 | 1 | Global | Rollback + escalate |

## Loop Counter Reset Rules
| Event | Which Counters Reset |
|-------|---------------------|
| Phase 10 rejects -> Phase 5 | dev-test, review, ci-fix reset to 0 |
| Phase 10 rejects -> Phase 4 or 3 | ALL loop counters reset to 0 |
| Deploy fails -> Phase 5 | dev-test, review, ci-fix reset to 0 |
| Normal phase advance | Counters preserved for reporting |
| Agent timeout in loop | Counts as +1 iteration (no reset) |
| ON_HOLD -> resume | ALL counters preserved (no reset) |
| CANCELLED | All counters frozen (historical) |

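The two tables above can be combined into a small piece of counter bookkeeping. In this sketch the loop keys, the event labels such as `"phase10->phase5"`, and both function names are hypothetical; the per-bug QA loop is left out because it is tracked per bug rather than as a single counter, and "ALL loop counters" is taken literally.

```python
LOOP_LIMITS = {  # loop -> (normal max, hotfix max), from the summary table
    "dev-test": (5, 3),
    "review": (3, 2),
    "ci-fix": (3, 2),
    "signoff": (2, 1),
    "deploy": (2, 1),
}
RESET_ON_REWORK = {"dev-test", "review", "ci-fix"}


def limit_for(loop: str, hotfix: bool = False) -> int:
    """Look up the circuit-breaker limit for a loop."""
    normal_max, hotfix_max = LOOP_LIMITS[loop]
    return hotfix_max if hotfix else normal_max


def reset_counters(counters: dict, event: str) -> dict:
    """Apply the reset table to a loop-counter mapping and return a new mapping."""
    if event in ("phase10->phase5", "deploy->phase5"):
        return {name: 0 if name in RESET_ON_REWORK else count for name, count in counters.items()}
    if event in ("phase10->phase4", "phase10->phase3"):
        return {name: 0 for name in counters}
    # Normal advance, resume from ON_HOLD, and cancellation leave counters as they are.
    return dict(counters)
```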
## Task State Machine
```
BACKLOG -> INTAKE -> ANALYZING -> DESIGNING -> APPROVED -> DEVELOPING
  -> DEV_TESTING -> REVIEWING -> CI_PENDING -> QA_TESTING
  -> QA_SIGNOFF -> BIZ_SIGNOFF -> TECH_SIGNOFF
  -> DEPLOYING -> MONITORING -> CLOSED

Special: BLOCKED, ON_HOLD, CANCELLED (from any active state)

Reverse: QA_SIGNOFF->DEVELOPING, BIZ_SIGNOFF->APPROVED|DEVELOPING,
         TECH_SIGNOFF->DESIGNING|DEVELOPING, DEPLOYING->DEVELOPING
```

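A transition check for the state machine above might look like the following sketch. The function `can_transition` is hypothetical. Two points are assumptions rather than statements from the source: every state except `CLOSED` is treated as "active", and transitions out of the special states (for example resuming from `ON_HOLD`) are not modelled.

```python
FORWARD = [
    "BACKLOG", "INTAKE", "ANALYZING", "DESIGNING", "APPROVED", "DEVELOPING",
    "DEV_TESTING", "REVIEWING", "CI_PENDING", "QA_TESTING",
    "QA_SIGNOFF", "BIZ_SIGNOFF", "TECH_SIGNOFF",
    "DEPLOYING", "MONITORING", "CLOSED",
]
REVERSE = {
    "QA_SIGNOFF": {"DEVELOPING"},
    "BIZ_SIGNOFF": {"APPROVED", "DEVELOPING"},
    "TECH_SIGNOFF": {"DESIGNING", "DEVELOPING"},
    "DEPLOYING": {"DEVELOPING"},
}
SPECIAL = {"BLOCKED", "ON_HOLD", "CANCELLED"}


def can_transition(current: str, target: str) -> bool:
    """Check a move against the forward chain, reverse edges, and special states."""
    if current not in FORWARD:
        return False  # leaving a special state is not modelled here
    if target in SPECIAL:
        return current != "CLOSED"  # assumption: CLOSED is not an active state
    index = FORWARD.index(current)
    is_next = index + 1 < len(FORWARD) and FORWARD[index + 1] == target
    return is_next or target in REVERSE.get(current, set())
```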
@@ -0,0 +1,51 @@
# Smithery Ecosystem Setup Reference

## Conditional Skill Installation

### Frontend
- React/Vue/Angular/Svelte → `smithery skill add anthropics/frontend-design`
- shadcn/ui → shadcn skill
- Expo/React Native → Expo skills

### Testing
- Playwright → `smithery skill add anthropics/webapp-testing`
- Cypress → `smithery skill search "cypress testing"`

### Security
- Auth/security code → Trail of Bits security skills

### DevOps
- Docker → `smithery skill search "docker devops"`
- Kubernetes → `smithery skill search "kubernetes yaml helm"`
- GitHub Actions → `smithery skill search "github actions"`

### Documents (always)
- anthropics/pdf, anthropics/docx, anthropics/pptx, anthropics/xlsx

### Meta
- anthropics/skill-creator, anthropics/mcp-builder

## MCP Server Connections (scope to agents, max 5)
| Service | Server | Scope To |
|---------|--------|----------|
| GitHub | github | @api-builder, @infra |
| PostgreSQL | postgres | @api-builder |
| MongoDB | mongodb | @api-builder |
| Redis | redis | @api-builder |
| Playwright | playwright | @frontend |
| AWS | aws | @infra |
| GCP | gcp | @infra |
| Slack | slack | global |
| Sentry | sentry | global |

## Context Budget After Smithery
Each skill: ~100 tokens metadata (startup) + ~5k active
Each MCP: ~200-2k tokens ALWAYS loaded

Run `/context-check` after setup. If startup > 20%, remove lowest-priority items.
Prefer skills over MCP when both can solve the problem.

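For a back-of-the-envelope check before running `/context-check`, the per-item figures above can be turned into a range estimate. This sketch is not how `/context-check` works; the function names are hypothetical, and interpreting "startup > 20%" as 20% of the model's context window, with the window size supplied by the caller, is an assumption.

```python
def estimate_startup_tokens(skill_count: int, mcp_count: int) -> tuple[int, int]:
    """Return (low, high) always-loaded token estimates from the rough figures above."""
    skill_metadata = skill_count * 100          # ~100 tokens of metadata per skill
    low = skill_metadata + mcp_count * 200      # ~200 tokens per MCP server, best case
    high = skill_metadata + mcp_count * 2000    # ~2k tokens per MCP server, worst case
    return low, high


def over_budget(startup_tokens: int, context_window: int, threshold: float = 0.20) -> bool:
    """True when startup usage exceeds the given share of the context window."""
    return startup_tokens > context_window * threshold
```

The estimate covers only Smithery skills and MCP servers; other startup content is outside its scope.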
## Community Libraries
- obra/superpowers — 20+ battle-tested skills (TDD, debugging, collaboration)
- Trail of Bits — Security skills (CodeQL, Semgrep)
- Expo — Mobile development skills