@juho0719/cckit 0.1.1
- package/assets/agents/architect.md +211 -0
- package/assets/agents/build-error-resolver.md +114 -0
- package/assets/agents/ccwin-code-reviewer.md +224 -0
- package/assets/agents/database-reviewer.md +91 -0
- package/assets/agents/doc-updater.md +107 -0
- package/assets/agents/e2e-runner.md +107 -0
- package/assets/agents/planner.md +212 -0
- package/assets/agents/python-reviewer.md +98 -0
- package/assets/agents/refactor-cleaner.md +85 -0
- package/assets/agents/security-reviewer.md +108 -0
- package/assets/agents/superpower-code-reviewer.md +48 -0
- package/assets/agents/tdd-guide.md +80 -0
- package/assets/commands/build-fix.md +62 -0
- package/assets/commands/checkpoint.md +74 -0
- package/assets/commands/code-review.md +40 -0
- package/assets/commands/e2e.md +362 -0
- package/assets/commands/eval.md +120 -0
- package/assets/commands/orchestrate.md +172 -0
- package/assets/commands/plan.md +113 -0
- package/assets/commands/python-review.md +297 -0
- package/assets/commands/refactor-clean.md +80 -0
- package/assets/commands/sessions.md +305 -0
- package/assets/commands/tdd.md +326 -0
- package/assets/commands/test-coverage.md +69 -0
- package/assets/commands/update-codemaps.md +72 -0
- package/assets/commands/update-docs.md +84 -0
- package/assets/commands/verify.md +59 -0
- package/assets/hooks/post-edit-format.js +49 -0
- package/assets/hooks/post-edit-typecheck.js +96 -0
- package/assets/mcps/mcp-servers.json +92 -0
- package/assets/rules/common/agents.md +49 -0
- package/assets/rules/common/coding-style.md +48 -0
- package/assets/rules/common/git-workflow.md +45 -0
- package/assets/rules/common/hooks.md +30 -0
- package/assets/rules/common/patterns.md +31 -0
- package/assets/rules/common/performance.md +55 -0
- package/assets/rules/common/security.md +29 -0
- package/assets/rules/common/testing.md +29 -0
- package/assets/rules/python/coding-style.md +42 -0
- package/assets/rules/python/hooks.md +19 -0
- package/assets/rules/python/patterns.md +39 -0
- package/assets/rules/python/security.md +30 -0
- package/assets/rules/python/testing.md +38 -0
- package/assets/rules/typescript/coding-style.md +18 -0
- package/assets/rules/typescript/hooks.md +19 -0
- package/assets/rules/typescript/patterns.md +39 -0
- package/assets/rules/typescript/security.md +30 -0
- package/assets/rules/typescript/testing.md +38 -0
- package/assets/skills/api-design/SKILL.md +522 -0
- package/assets/skills/backend-patterns/SKILL.md +597 -0
- package/assets/skills/brainstorming/SKILL.md +96 -0
- package/assets/skills/coding-standards/SKILL.md +529 -0
- package/assets/skills/database-migrations/SKILL.md +334 -0
- package/assets/skills/deployment-patterns/SKILL.md +426 -0
- package/assets/skills/dispatching-parallel-agents/SKILL.md +180 -0
- package/assets/skills/docker-patterns/SKILL.md +363 -0
- package/assets/skills/e2e-testing/SKILL.md +325 -0
- package/assets/skills/eval-harness/SKILL.md +235 -0
- package/assets/skills/executing-plans/SKILL.md +84 -0
- package/assets/skills/finishing-a-development-branch/SKILL.md +200 -0
- package/assets/skills/frontend-patterns/SKILL.md +641 -0
- package/assets/skills/iterative-retrieval/SKILL.md +210 -0
- package/assets/skills/postgres-patterns/SKILL.md +145 -0
- package/assets/skills/python-patterns/SKILL.md +749 -0
- package/assets/skills/python-testing/SKILL.md +815 -0
- package/assets/skills/receiving-code-review/SKILL.md +213 -0
- package/assets/skills/requesting-code-review/SKILL.md +105 -0
- package/assets/skills/requesting-code-review/code-reviewer-template.md +146 -0
- package/assets/skills/subagent-driven-development/SKILL.md +242 -0
- package/assets/skills/subagent-driven-development/code-quality-reviewer-prompt.md +20 -0
- package/assets/skills/subagent-driven-development/implementer-prompt.md +78 -0
- package/assets/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/assets/skills/systematic-debugging/CREATION-LOG.md +114 -0
- package/assets/skills/systematic-debugging/SKILL.md +296 -0
- package/assets/skills/systematic-debugging/condition-based-waiting-example.ts +158 -0
- package/assets/skills/systematic-debugging/condition-based-waiting.md +115 -0
- package/assets/skills/systematic-debugging/defense-in-depth.md +122 -0
- package/assets/skills/systematic-debugging/root-cause-tracing.md +169 -0
- package/assets/skills/systematic-debugging/scripts/find-polluter.sh +63 -0
- package/assets/skills/systematic-debugging/test-academic.md +14 -0
- package/assets/skills/systematic-debugging/test-pressure-1.md +58 -0
- package/assets/skills/systematic-debugging/test-pressure-2.md +68 -0
- package/assets/skills/systematic-debugging/test-pressure-3.md +69 -0
- package/assets/skills/tdd-workflow/SKILL.md +409 -0
- package/assets/skills/test-driven-development/SKILL.md +371 -0
- package/assets/skills/test-driven-development/testing-anti-patterns.md +299 -0
- package/assets/skills/using-git-worktrees/SKILL.md +218 -0
- package/assets/skills/verification-before-completion/SKILL.md +139 -0
- package/assets/skills/verification-loop/SKILL.md +125 -0
- package/assets/skills/writing-plans/SKILL.md +116 -0
- package/dist/agents-AEKT67A6.js +9 -0
- package/dist/chunk-3GUKEMND.js +28 -0
- package/dist/chunk-3UNN3IBE.js +54 -0
- package/dist/chunk-3Y26YU4R.js +27 -0
- package/dist/chunk-5XOKKPAA.js +21 -0
- package/dist/chunk-6B46AIFM.js +136 -0
- package/dist/chunk-EYY2IZ7N.js +27 -0
- package/dist/chunk-K25UZZVG.js +17 -0
- package/dist/chunk-KEENFBLL.js +24 -0
- package/dist/chunk-RMUKD7CW.js +44 -0
- package/dist/chunk-W63UKEIT.js +50 -0
- package/dist/cli-VZRGF733.js +238 -0
- package/dist/commands-P5LILVZ5.js +9 -0
- package/dist/hooks-IIG2XK4I.js +9 -0
- package/dist/index.js +131 -0
- package/dist/mcps-67Q7TBGW.js +6 -0
- package/dist/paths-FT6KBIRD.js +10 -0
- package/dist/registry-EGXWYWWK.js +17 -0
- package/dist/rules-2CPBVNNJ.js +7 -0
- package/dist/skills-ULMW3UCM.js +8 -0
- package/package.json +36 -0
@@ -0,0 +1,235 @@
+---
+name: eval-harness
+description: Formal evaluation framework for Claude Code sessions implementing eval-driven development (EDD) principles
+tools: Read, Write, Edit, Bash, Grep, Glob
+---
+
+# Eval Harness Skill
+
+A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles.
+
+## When to Activate
+
+- Setting up eval-driven development (EDD) for AI-assisted workflows
+- Defining pass/fail criteria for Claude Code task completion
+- Measuring agent reliability with pass@k metrics
+- Creating regression test suites for prompt or agent changes
+- Benchmarking agent performance across model versions
+
+## Philosophy
+
+Eval-Driven Development treats evals as the "unit tests of AI development":
+- Define expected behavior BEFORE implementation
+- Run evals continuously during development
+- Track regressions with each change
+- Use pass@k metrics for reliability measurement
+
+## Eval Types
+
+### Capability Evals
+Test if Claude can do something it couldn't before:
+```markdown
+[CAPABILITY EVAL: feature-name]
+Task: Description of what Claude should accomplish
+Success Criteria:
+- [ ] Criterion 1
+- [ ] Criterion 2
+- [ ] Criterion 3
+Expected Output: Description of expected result
+```
+
+### Regression Evals
+Ensure changes don't break existing functionality:
+```markdown
+[REGRESSION EVAL: feature-name]
+Baseline: SHA or checkpoint name
+Tests:
+- existing-test-1: PASS/FAIL
+- existing-test-2: PASS/FAIL
+- existing-test-3: PASS/FAIL
+Result: X/Y passed (previously Y/Y)
+```
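
A regression eval of this shape boils down to comparing current test outcomes with a stored baseline. The following is a minimal Python sketch, assuming results are kept as a mapping from test name to a pass/fail boolean. The helpers `find_regressions` and `summarize` and the dictionary layout are illustrative assumptions, not part of the skill.

```python
def find_regressions(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Return tests that passed in the baseline but fail (or are missing) now."""
    return sorted(
        name
        for name, passed in baseline.items()
        if passed and not current.get(name, False)
    )


def summarize(baseline: dict[str, bool], current: dict[str, bool]) -> str:
    """Format a result line in the 'X/Y passed (previously Y/Y)' style."""
    total = len(baseline)
    now = sum(1 for name in baseline if current.get(name, False))
    before = sum(1 for passed in baseline.values() if passed)
    return f"Result: {now}/{total} passed (previously {before}/{total})"


# Hypothetical data: one test regressed since the baseline was recorded.
baseline = {"existing-test-1": True, "existing-test-2": True, "existing-test-3": True}
current = {"existing-test-1": True, "existing-test-2": False, "existing-test-3": True}

print(find_regressions(baseline, current))  # ['existing-test-2']
print(summarize(baseline, current))         # Result: 2/3 passed (previously 3/3)
```

Treating a missing test as a failure is a deliberate choice in this sketch: a test that silently disappears should not count as still passing.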
+
+## Grader Types
+
+### 1. Code-Based Grader
+Deterministic checks using code:
+```bash
+# Check if file contains expected pattern
+grep -q "export function handleAuth" src/auth.ts && echo "PASS" || echo "FAIL"
+
+# Check if tests pass
+npm test -- --testPathPattern="auth" && echo "PASS" || echo "FAIL"
+
+# Check if build succeeds
+npm run build && echo "PASS" || echo "FAIL"
+```
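
The same deterministic checks can be scripted outside the shell. This is a minimal Python sketch, assuming a grader is simply a function that returns `"PASS"` or `"FAIL"`. The names `grade_file_contains` and `grade_command` are hypothetical helpers, not part of the skill, and the demo uses a throwaway file and the current Python interpreter as stand-ins for `src/auth.ts` and `npm`.

```python
import re
import subprocess
import sys
import tempfile
from pathlib import Path


def grade_file_contains(path: str, pattern: str) -> str:
    """PASS if the file exists and the regex matches anywhere in it, else FAIL."""
    file = Path(path)
    if not file.is_file():
        return "FAIL"
    text = file.read_text(encoding="utf-8")
    return "PASS" if re.search(pattern, text) else "FAIL"


def grade_command(argv: list[str]) -> str:
    """PASS if the command exits with status 0 (e.g. a test or build run)."""
    completed = subprocess.run(argv, capture_output=True)
    return "PASS" if completed.returncode == 0 else "FAIL"


# Demo against a throwaway file instead of a real src/auth.ts.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "auth.ts"
    sample.write_text("export function handleAuth() {}\n", encoding="utf-8")
    file_grade = grade_file_contains(str(sample), r"export function handleAuth")
    missing_grade = grade_file_contains(str(Path(tmp) / "missing.ts"), r"handleAuth")

# Use the current interpreter as a stand-in for `npm test` or `npm run build`.
command_grade = grade_command([sys.executable, "-c", "raise SystemExit(0)"])

print(file_grade, missing_grade, command_grade)  # PASS FAIL PASS
```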
+
+### 2. Model-Based Grader
+Use Claude to evaluate open-ended outputs:
+```markdown
+[MODEL GRADER PROMPT]
+Evaluate the following code change:
+1. Does it solve the stated problem?
+2. Is it well-structured?
+3. Are edge cases handled?
+4. Is error handling appropriate?
+
+Score: 1-5 (1=poor, 5=excellent)
+Reasoning: [explanation]
+```
+
+### 3. Human Grader
+Flag for manual review:
+```markdown
+[HUMAN REVIEW REQUIRED]
+Change: Description of what changed
+Reason: Why human review is needed
+Risk Level: LOW/MEDIUM/HIGH
+```
+
+## Metrics
+
+### pass@k
+"At least one success in k attempts"
+- pass@1: First attempt success rate
+- pass@3: Success within 3 attempts
+- Typical target: pass@3 > 90%
+
+### pass^k
+"All k trials succeed"
+- Higher bar for reliability
+- pass^3: 3 consecutive successes
+- Use for critical paths
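
To make the two metrics concrete, here is a minimal Python sketch. It assumes each eval's trials are recorded as an ordered list of booleans (one per attempt) and computes pass@k and pass^k empirically over the first k recorded attempts. The function names and the sample run history are illustrative only.

```python
def pass_at_k(trials: list[bool], k: int) -> bool:
    """pass@k: at least one of the first k attempts succeeded."""
    return any(trials[:k])


def pass_hat_k(trials: list[bool], k: int) -> bool:
    """pass^k: all of the first k attempts succeeded (needs k recorded attempts)."""
    return len(trials) >= k and all(trials[:k])


def rate(results: dict[str, list[bool]], metric, k: int) -> float:
    """Fraction of evals that satisfy the given metric."""
    return sum(metric(trials, k) for trials in results.values()) / len(results)


# Hypothetical run history: three attempts per eval.
results = {
    "create-user": [True, True, True],
    "validate-email": [False, True, True],
    "hash-password": [True, True, True],
}

print(f"pass@1: {rate(results, pass_at_k, 1):.0%}")   # pass@1: 67%
print(f"pass@3: {rate(results, pass_at_k, 3):.0%}")   # pass@3: 100%
print(f"pass^3: {rate(results, pass_hat_k, 3):.0%}")  # pass^3: 67%
```

The gap between pass@3 and pass^3 on the same data is the point: an eval that needs a retry still counts toward pass@k but not toward pass^k.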
+
+## Eval Workflow
+
+### 1. Define (Before Coding)
+```markdown
+## EVAL DEFINITION: feature-xyz
+
+### Capability Evals
+1. Can create new user account
+2. Can validate email format
+3. Can hash password securely
+
+### Regression Evals
+1. Existing login still works
+2. Session management unchanged
+3. Logout flow intact
+
+### Success Metrics
+- pass@3 > 90% for capability evals
+- pass^3 = 100% for regression evals
+```
+
+### 2. Implement
+Write code to pass the defined evals.
+
+### 3. Evaluate
+```bash
+# Run capability evals
+# (run each capability eval and record PASS/FAIL)
+
+# Run regression evals
+npm test -- --testPathPattern="existing"
+
+# Generate report
+```
+
+### 4. Report
+```markdown
+EVAL REPORT: feature-xyz
+========================
+
+Capability Evals:
+  create-user: PASS (pass@1)
+  validate-email: PASS (pass@2)
+  hash-password: PASS (pass@1)
+  Overall: 3/3 passed
+
+Regression Evals:
+  login-flow: PASS
+  session-mgmt: PASS
+  logout-flow: PASS
+  Overall: 3/3 passed
+
+Metrics:
+  pass@1: 67% (2/3)
+  pass@3: 100% (3/3)
+
+Status: READY FOR REVIEW
+```
+
+## Integration Patterns
+
+### Pre-Implementation
+```
+/eval define feature-name
+```
+Creates eval definition file at `.claude/evals/feature-name.md`
+
+### During Implementation
+```
+/eval check feature-name
+```
+Runs current evals and reports status
+
+### Post-Implementation
+```
+/eval report feature-name
+```
+Generates full eval report
+
+## Eval Storage
+
+Store evals in project:
+```
+.claude/
+  evals/
+    feature-xyz.md      # Eval definition
+    feature-xyz.log     # Eval run history
+    baseline.json       # Regression baselines
+```
+
+## Best Practices
+
+1. **Define evals BEFORE coding** - Forces clear thinking about success criteria
+2. **Run evals frequently** - Catch regressions early
+3. **Track pass@k over time** - Monitor reliability trends
+4. **Use code graders when possible** - Deterministic > probabilistic
+5. **Human review for security** - Never fully automate security checks
+6. **Keep evals fast** - Slow evals don't get run
+7. **Version evals with code** - Evals are first-class artifacts
+
+## Example: Adding Authentication
+
+```markdown
+## EVAL: add-authentication
+
+### Phase 1: Define (10 min)
+Capability Evals:
+- [ ] User can register with email/password
+- [ ] User can login with valid credentials
+- [ ] Invalid credentials rejected with proper error
+- [ ] Sessions persist across page reloads
+- [ ] Logout clears session
+
+Regression Evals:
+- [ ] Public routes still accessible
+- [ ] API responses unchanged
+- [ ] Database schema compatible
+
+### Phase 2: Implement (varies)
+[Write code]
+
+### Phase 3: Evaluate
+Run: /eval check add-authentication
+
+### Phase 4: Report
+EVAL REPORT: add-authentication
+==============================
+Capability: 5/5 passed (pass@3: 100%)
+Regression: 3/3 passed (pass^3: 100%)
+Status: SHIP IT
+```
@@ -0,0 +1,84 @@
+---
+name: executing-plans
+description: Use when you have a written implementation plan to execute in a separate session with review checkpoints
+---
+
+# Executing Plans
+
+## Overview
+
+Load plan, review critically, execute tasks in batches, report for review between batches.
+
+**Core principle:** Batch execution with checkpoints for architect review.
+
+**Announce at start:** "I'm using the executing-plans skill to implement this plan."
+
+## The Process
+
+### Step 1: Load and Review Plan
+1. Read plan file
+2. Review critically - identify any questions or concerns about the plan
+3. If concerns: Raise them with your human partner before starting
+4. If no concerns: Create TodoWrite and proceed
+
+### Step 2: Execute Batch
+**Default: First 3 tasks**
+
+For each task:
+1. Mark as in_progress
+2. Follow each step exactly (plan has bite-sized steps)
+3. Run verifications as specified
+4. Mark as completed
+
+### Step 3: Report
+When batch complete:
+- Show what was implemented
+- Show verification output
+- Say: "Ready for feedback."
+
+### Step 4: Continue
+Based on feedback:
+- Apply changes if needed
+- Execute next batch
+- Repeat until complete
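
The batch-and-checkpoint loop in Steps 2 to 4 can be sketched as follows. This is a minimal Python illustration, assuming the plan is a flat list of task names and using the default batch size of 3. The `batches` helper and the task names are hypothetical, and the real review happens with a human between batches rather than in code.

```python
from collections.abc import Iterator


def batches(tasks: list[str], size: int = 3) -> Iterator[list[str]]:
    """Yield consecutive batches of at most `size` tasks, in plan order."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(tasks), size):
        yield tasks[start:start + size]


# Hypothetical plan with seven tasks.
plan = ["task-1", "task-2", "task-3", "task-4", "task-5", "task-6", "task-7"]

for number, batch in enumerate(batches(plan), start=1):
    for task in batch:
        # Placeholder for: mark in_progress, follow the steps, verify, mark completed.
        print(f"batch {number}: completed {task}")
    # Checkpoint: report and wait for feedback before starting the next batch.
    print(f"batch {number}: Ready for feedback.")
```

With seven tasks this produces batches of 3, 3, and 1, so the final checkpoint still happens even when the last batch is short.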
+
+### Step 5: Complete Development
+
+After all tasks complete and verified:
+- Announce: "I'm using the finishing-a-development-branch skill to complete this work."
+- **REQUIRED SUB-SKILL:** Use finishing-a-development-branch skill
+- Follow that skill to verify tests, present options, execute choice
+
+## When to Stop and Ask for Help
+
+**STOP executing immediately when:**
+- Hit a blocker mid-batch (missing dependency, test fails, instruction unclear)
+- Plan has critical gaps that prevent starting
+- You don't understand an instruction
+- Verification fails repeatedly
+
+**Ask for clarification rather than guessing.**
+
+## When to Revisit Earlier Steps
+
+**Return to Review (Step 1) when:**
+- Partner updates the plan based on your feedback
+- Fundamental approach needs rethinking
+
+**Don't force through blockers** - stop and ask.
+
+## Remember
+- Review plan critically first
+- Follow plan steps exactly
+- Don't skip verifications
+- Reference skills when plan says to
+- Between batches: just report and wait
+- Stop when blocked, don't guess
+- Never start implementation on main/master branch without explicit user consent
+
+## Integration
+
+**Required workflow skills:**
+- **using-git-worktrees skill** - REQUIRED: Set up isolated workspace before starting
+- **writing-plans skill** - Creates the plan this skill executes
+- **finishing-a-development-branch skill** - Complete development after all tasks
@@ -0,0 +1,200 @@
+---
+name: finishing-a-development-branch
+description: Use when implementation is complete, all tests pass, and you need to decide how to integrate the work - guides completion of development work by presenting structured options for merge, PR, or cleanup
+---
+
+# Finishing a Development Branch
+
+## Overview
+
+Guide completion of development work by presenting clear options and handling chosen workflow.
+
+**Core principle:** Verify tests → Present options → Execute choice → Clean up.
+
+**Announce at start:** "I'm using the finishing-a-development-branch skill to complete this work."
+
+## The Process
+
+### Step 1: Verify Tests
+
+**Before presenting options, verify tests pass:**
+
+```bash
+# Run project's test suite
+npm test   # or: cargo test, pytest, go test ./...
+```
+
+**If tests fail:**
+```
+Tests failing (<N> failures). Must fix before completing:
+
+[Show failures]
+
+Cannot proceed with merge/PR until tests pass.
+```
+
+Stop. Don't proceed to Step 2.
+
+**If tests pass:** Continue to Step 2.
+
+### Step 2: Determine Base Branch
+
+```bash
+# Try common base branches
+git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null
+```
+
+Or ask: "This branch split from main - is that correct?"
+
+### Step 3: Present Options
+
+Present exactly these 4 options:
+
+```
+Implementation complete. What would you like to do?
+
+1. Merge back to <base-branch> locally
+2. Push and create a Pull Request
+3. Keep the branch as-is (I'll handle it later)
+4. Discard this work
+
+Which option?
+```
+
+**Don't add explanation** - keep options concise.
+
+### Step 4: Execute Choice
+
+#### Option 1: Merge Locally
+
+```bash
+# Switch to base branch
+git checkout <base-branch>
+
+# Pull latest
+git pull
+
+# Merge feature branch
+git merge <feature-branch>
+
+# Verify tests on merged result
+<test command>
+
+# If tests pass
+git branch -d <feature-branch>
+```
+
+Then: Cleanup worktree (Step 5)
+
+#### Option 2: Push and Create PR
+
+```bash
+# Push branch
+git push -u origin <feature-branch>
+
+# Create PR
+gh pr create --title "<title>" --body "$(cat <<'EOF'
+## Summary
+<2-3 bullets of what changed>
+
+## Test Plan
+- [ ] <verification steps>
+EOF
+)"
+```
+
+Then: Keep worktree (see Step 5)
+
+#### Option 3: Keep As-Is
+
+Report: "Keeping branch <name>. Worktree preserved at <path>."
+
+**Don't cleanup worktree.**
+
+#### Option 4: Discard
+
+**Confirm first:**
+```
+This will permanently delete:
+- Branch <name>
+- All commits: <commit-list>
+- Worktree at <path>
+
+Type 'discard' to confirm.
+```
+
+Wait for exact confirmation.
+
+If confirmed:
+```bash
+git checkout <base-branch>
+git branch -D <feature-branch>
+```
+
+Then: Cleanup worktree (Step 5)
+
+### Step 5: Cleanup Worktree
+
+**For Options 1 and 4:**
+
+Check if in worktree:
+```bash
+git worktree list | grep "$(git branch --show-current)"
+```
+
+If yes:
+```bash
+git worktree remove <worktree-path>
+```
+
+**For Options 2 and 3:** Keep worktree.
+
+## Quick Reference
+
+| Option | Merge | Push | Keep Worktree | Cleanup Branch |
+|--------|-------|------|---------------|----------------|
+| 1. Merge locally | ✓ | - | - | ✓ |
+| 2. Create PR | - | ✓ | ✓ | - |
+| 3. Keep as-is | - | - | ✓ | - |
+| 4. Discard | - | - | - | ✓ (force) |
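
The table can also be read as a small lookup. This is a minimal Python sketch, assuming the four options are keyed by their number. The `Option` dataclass and `should_remove_worktree` are illustrative names, and the values simply mirror the Quick Reference rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    label: str
    merge: bool
    push: bool
    keep_worktree: bool
    cleanup_branch: bool
    force_delete: bool = False  # Option 4 deletes the branch with `git branch -D`


# One entry per row of the Quick Reference table.
OPTIONS = {
    1: Option("Merge locally", merge=True, push=False, keep_worktree=False, cleanup_branch=True),
    2: Option("Create PR", merge=False, push=True, keep_worktree=True, cleanup_branch=False),
    3: Option("Keep as-is", merge=False, push=False, keep_worktree=True, cleanup_branch=False),
    4: Option("Discard", merge=False, push=False, keep_worktree=False, cleanup_branch=True,
              force_delete=True),
}


def should_remove_worktree(choice: int) -> bool:
    """True when the chosen option calls for removing the worktree."""
    return not OPTIONS[choice].keep_worktree


for number, option in OPTIONS.items():
    action = "remove worktree" if should_remove_worktree(number) else "keep worktree"
    print(f"{number}. {option.label}: {action}")
```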
+
+## Common Mistakes
+
+**Skipping test verification**
+- **Problem:** Merge broken code, create failing PR
+- **Fix:** Always verify tests before offering options
+
+**Open-ended questions**
+- **Problem:** "What should I do next?" → ambiguous
+- **Fix:** Present exactly 4 structured options
+
+**Automatic worktree cleanup**
+- **Problem:** Removing the worktree when it might still be needed (Options 2 and 3)
+- **Fix:** Only cleanup for Options 1 and 4
+
+**No confirmation for discard**
+- **Problem:** Accidentally delete work
+- **Fix:** Require typed "discard" confirmation
+
+## Red Flags
+
+**Never:**
+- Proceed with failing tests
+- Merge without verifying tests on result
+- Delete work without confirmation
+- Force-push without explicit request
+
+**Always:**
+- Verify tests before offering options
+- Present exactly 4 options
+- Get typed confirmation for Option 4
+- Clean up worktree for Options 1 & 4 only
+
+## Integration
+
+**Called by:**
+- **subagent-driven-development** (Step 7) - After all tasks complete
+- **executing-plans** (Step 5) - After all batches complete
+
+**Pairs with:**
+- **using-git-worktrees** - Cleans up worktree created by that skill