opencodekit 0.20.6 → 0.20.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. package/dist/index.js +1 -1
  2. package/dist/template/.opencode/AGENTS.md +48 -0
  3. package/dist/template/.opencode/agent/build.md +3 -2
  4. package/dist/template/.opencode/agent/explore.md +14 -14
  5. package/dist/template/.opencode/agent/general.md +1 -1
  6. package/dist/template/.opencode/agent/plan.md +1 -1
  7. package/dist/template/.opencode/agent/review.md +1 -1
  8. package/dist/template/.opencode/agent/vision.md +0 -9
  9. package/dist/template/.opencode/command/compound.md +102 -28
  10. package/dist/template/.opencode/command/curate.md +299 -0
  11. package/dist/template/.opencode/command/lfg.md +1 -0
  12. package/dist/template/.opencode/command/ship.md +1 -0
  13. package/dist/template/.opencode/memory.db +0 -0
  14. package/dist/template/.opencode/memory.db-shm +0 -0
  15. package/dist/template/.opencode/memory.db-wal +0 -0
  16. package/dist/template/.opencode/opencode.json +0 -5
  17. package/dist/template/.opencode/package.json +1 -1
  18. package/dist/template/.opencode/pnpm-lock.yaml +791 -9
  19. package/dist/template/.opencode/skill/api-and-interface-design/SKILL.md +162 -0
  20. package/dist/template/.opencode/skill/beads/SKILL.md +10 -9
  21. package/dist/template/.opencode/skill/beads/references/MULTI_AGENT.md +10 -10
  22. package/dist/template/.opencode/skill/ci-cd-and-automation/SKILL.md +202 -0
  23. package/dist/template/.opencode/skill/code-search-patterns/SKILL.md +253 -0
  24. package/dist/template/.opencode/skill/code-simplification/SKILL.md +211 -0
  25. package/dist/template/.opencode/skill/condition-based-waiting/SKILL.md +12 -0
  26. package/dist/template/.opencode/skill/defense-in-depth/SKILL.md +16 -6
  27. package/dist/template/.opencode/skill/deprecation-and-migration/SKILL.md +189 -0
  28. package/dist/template/.opencode/skill/development-lifecycle/SKILL.md +12 -48
  29. package/dist/template/.opencode/skill/documentation-and-adrs/SKILL.md +220 -0
  30. package/dist/template/.opencode/skill/incremental-implementation/SKILL.md +191 -0
  31. package/dist/template/.opencode/skill/performance-optimization/SKILL.md +236 -0
  32. package/dist/template/.opencode/skill/receiving-code-review/SKILL.md +11 -0
  33. package/dist/template/.opencode/skill/reflection-checkpoints/SKILL.md +183 -0
  34. package/dist/template/.opencode/skill/security-and-hardening/SKILL.md +296 -0
  35. package/dist/template/.opencode/skill/structured-edit/SKILL.md +10 -0
  36. package/dist/template/.opencode/skill/swarm-coordination/SKILL.md +66 -1
  37. package/package.json +1 -1
  38. package/dist/template/.opencode/skill/beads-bridge/SKILL.md +0 -321
  39. package/dist/template/.opencode/skill/code-navigation/SKILL.md +0 -130
  40. package/dist/template/.opencode/skill/mqdh/SKILL.md +0 -171
  41. package/dist/template/.opencode/skill/obsidian/SKILL.md +0 -192
  42. package/dist/template/.opencode/skill/obsidian/mcp.json +0 -22
  43. package/dist/template/.opencode/skill/pencil/SKILL.md +0 -72
  44. package/dist/template/.opencode/skill/ralph/SKILL.md +0 -296
  45. package/dist/template/.opencode/skill/tilth-cli/SKILL.md +0 -207
  46. package/dist/template/.opencode/skill/tool-priority/SKILL.md +0 -299
package/dist/template/.opencode/skill/incremental-implementation/SKILL.md
@@ -0,0 +1,191 @@
+ ---
+ name: incremental-implementation
+ description: Use when implementing features or fixes to enforce thin vertical slices with verify-after-each — prevents large, untested changes by requiring working code at every step
+ version: 1.0.0
+ tags: [workflow, implementation, code-quality]
+ dependencies: [test-driven-development, verification-before-completion]
+ ---
+
+ # Incremental Implementation
+
+ > **Replaces** big-bang implementations where everything is built at once and tested at the end — enforces thin vertical slices with verification after each step
+
+ ## When to Use
+
+ - Implementing any feature that touches more than 2 files
+ - Working from a plan or spec with multiple tasks
+ - Building something where partial progress should be demonstrable
+
+ ## When NOT to Use
+
+ - One-line fixes or trivial changes
+ - Pure refactors with no behavior change (use code-simplification instead)
+ - Exploratory prototyping where you need to experiment freely
+
+ ## Common Rationalizations
+
+ | Rationalization | Rebuttal |
+ | --- | --- |
+ | "I'll build everything first and test at the end" | End-to-end testing after 500 lines of changes makes failures impossible to isolate |
+ | "This feature can't be split into slices" | Every feature can be sliced — you're confusing "the UI needs all parts" with "the code must be written all at once" |
+ | "Committing partial work creates noise" | Partial working commits are rollback points. One giant commit is a rollback cliff |
+ | "It's faster to write it all at once" | It feels faster until the first bug takes 2 hours to locate in a 400-line diff |
+ | "The slices are too small to be meaningful" | If a slice compiles, passes tests, and moves toward the goal, it's meaningful |
+ | "I need to see the whole picture first" | Read the plan first, then implement slice by slice. Understanding ≠ building all at once |
+
+ ## Overview
+
+ Large implementations fail because errors compound. When you write 500 lines before running anything, each line can introduce a bug that interacts with bugs from other lines. Thin vertical slices keep the error surface small.
+
+ **Core principle:** Working code at every step. Never be more than one slice away from a green build.
+
+ ## The Cycle
+
+ ```
+ FOR each slice:
+   1. IMPLEMENT — Write the minimal code for this slice (1-3 files max)
+   2. VERIFY — Run typecheck + lint + relevant tests
+   3. COMMIT — Create a checkpoint with a descriptive message
+   4. NEXT — Move to the next slice
+
+ IF verify fails:
+   Fix within the current slice before moving on
+   Do NOT proceed to the next slice with broken code
+ ```
+
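The loop above can be sketched as code. This is a hypothetical runner, not part of the package; the `Slice` type, `implement`, and `verify` callbacks are illustrative names, and the commit step is marked as a comment where it would occur:

```typescript
type Slice = { name: string; implement: () => void; verify: () => boolean };

// Run slices in order, stopping at the first failed verification, so the
// codebase is never more than one slice away from a green build.
function runSlices(slices: Slice[]): { committed: string[]; failedAt?: string } {
  const committed: string[] = [];
  for (const slice of slices) {
    slice.implement();
    if (!slice.verify()) {
      // Fix within this slice before moving on — do NOT continue with broken code
      return { committed, failedAt: slice.name };
    }
    committed.push(slice.name); // the checkpoint commit would happen here
  }
  return { committed };
}
```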
+ ## Slicing Strategies
+
+ ### Vertical Slice (Preferred)
+
+ Each slice delivers one thin path through the full stack:
+
+ ```
+ Slice 1: API endpoint returns hardcoded data → test passes
+ Slice 2: API endpoint reads from database → test passes
+ Slice 3: UI calls API and renders data → test passes
+ Slice 4: Add validation and error handling → test passes
+ ```
+
+ ### Contract-First
+
+ Define interfaces first, then implement behind them:
+
+ ```
+ Slice 1: Define types/interfaces → compiles
+ Slice 2: Implement with stubs → tests pass (with mocked data)
+ Slice 3: Replace stubs with real implementation → tests pass
+ ```
+
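As a concrete sketch of that contract-first sequence (all names here are hypothetical, and the repository interface is only an illustration):

```typescript
// Slice 1: define the contract — this alone must compile
interface UserRepo {
  getUser(id: string): Promise<{ id: string; name: string }>;
}

// Slice 2: a stub implementation so callers and tests can run
class StubUserRepo implements UserRepo {
  async getUser(id: string) {
    return { id, name: "stub" };
  }
}

// Slice 3: the real implementation replaces the stub behind the same interface
class DbUserRepo implements UserRepo {
  constructor(private query: (sql: string, params: unknown[]) => Promise<any[]>) {}
  async getUser(id: string) {
    const rows = await this.query("SELECT id, name FROM users WHERE id = ?", [id]);
    return rows[0];
  }
}
```

Callers written against `UserRepo` in slice 2 need no changes when slice 3 lands.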
+ ### Risk-First
+
+ Implement the hardest or most uncertain part first:
+
+ ```
+ Slice 1: The tricky algorithm or integration → tests pass
+ Slice 2: The straightforward plumbing → tests pass
+ Slice 3: The UI/presentation layer → tests pass
+ ```
+
+ ## Implementation Rules
+
+ ### 1. Simplicity First
+
+ Default to the simplest viable solution for each slice.
+
+ ```
+ ❌ "Let me add a factory pattern for extensibility"
+ ✅ "Direct function call works. Refactor to a pattern IF a second use case appears"
+ ```
+
+ ### 2. Scope Discipline
+
+ Each slice does ONE thing. If you notice something else that needs fixing:
+
+ ```
+ NOTICED BUT NOT TOUCHING: [description of unrelated improvement]
+ ```
+
+ Log it and continue with the current slice.
+
+ ### 3. One Compilable Step at a Time
+
+ Never leave the codebase in a state where typecheck fails between slices.
+
+ ```
+ ❌ Add 5 function signatures, then implement all 5
+ ✅ Add and implement function 1, verify, then function 2
+ ```
+
+ ### 4. Keep Tests Green
+
+ If existing tests break from your change, fix them in the same slice — not in a "fix tests" slice later.
+
+ ### 5. Feature Flags for Incomplete Features
+
+ If a slice can't be hidden behind existing abstractions:
+
+ ```typescript
+ // Temporary gate — remove when the feature is complete
+ if (process.env.ENABLE_NEW_FEATURE) {
+   // new code path
+ } else {
+   // existing behavior
+ }
+ ```
+
+ ### 6. Rollback-Friendly
+
+ Each committed slice should be independently revertible without breaking the build.
+
+ ## Slice Size Guide
+
+ | Slice Size | Signal |
+ | --- | --- |
+ | 1-30 lines | Ideal — easy to review and verify |
+ | 30-100 lines | Acceptable — still isolatable |
+ | 100-200 lines | Too large — find a split point |
+ | 200+ lines | Stop. You're doing big-bang implementation |
+
+ ## Red Flags — STOP
+
+ If you catch yourself:
+
+ - Writing more than 100 lines without running verification
+ - Saying "I'll test this after I finish the next part"
+ - Having 3+ files with uncommitted changes
+ - Building a complex abstraction before the simple version works
+ - Skipping verification because "this slice is trivial"
+
+ **STOP.** Verify what you have. Commit if it passes. Then continue.
+
+ ## Verification
+
+ After each slice:
+
+ ```bash
+ # Minimum verification (must pass)
+ npm run typecheck   # or equivalent
+ npm run lint        # or equivalent
+
+ # If the slice changes behavior
+ npm test            # relevant test files
+ ```
+
+ After all slices complete:
+
+ ```bash
+ # Full verification
+ npm run typecheck && npm run lint && npm test
+ ```
+
+ ## Integration with Other Skills
+
+ - **test-driven-development** — Write the test for each slice FIRST (RED), then implement (GREEN)
+ - **verification-before-completion** — Run full gates after the final slice
+ - **code-simplification** — Refactor AFTER all slices pass, not during implementation
+ - **systematic-debugging** — If a slice fails verification, debug systematically instead of guessing
+
+ ## See Also
+
+ - **writing-plans** — Creates the plan that this skill executes slice-by-slice
+ - **executing-plans** — Orchestrates parallel execution of independent slices
package/dist/template/.opencode/skill/performance-optimization/SKILL.md
@@ -0,0 +1,236 @@
+ ---
+ name: performance-optimization
+ description: Use when profiling, optimizing, or adding performance budgets to applications — covers measure-first workflow, Core Web Vitals, common anti-patterns, and performance regression prevention
+ version: 1.0.0
+ tags: [performance, code-quality]
+ dependencies: []
+ ---
+
+ # Performance Optimization
+
+ > **Replaces** premature optimization and gut-feeling tuning with measurement-driven improvements that target actual bottlenecks
+
+ ## When to Use
+
+ - Application is measurably slow (user reports, metrics, profiler data)
+ - Setting up performance budgets for a new project
+ - Reviewing code for common performance anti-patterns
+ - Performance regression detected in CI or monitoring
+
+ ## When NOT to Use
+
+ - No evidence of a performance problem — premature optimization wastes time
+ - Micro-optimizations that save nanoseconds in non-hot paths
+ - Choosing "fast" over "correct" — correctness first, always
+
+ ## Overview
+
+ Performance optimization is an empirical process. Measure, identify, fix, verify. Never optimize without profiling first.
+
+ **Core principle:** Measure before optimizing. Optimize the bottleneck, not the code you happen to be reading. Verify the improvement with numbers.
+
+ ## Measure-First Workflow
+
+ ```
+ 1. MEASURE — Profile to identify actual bottlenecks (not guessed ones)
+ 2. IDENTIFY — Find the specific hot path or resource constraint
+ 3. FIX — Apply targeted optimization to the bottleneck
+ 4. VERIFY — Measure again to confirm improvement
+ 5. GUARD — Add budget/benchmark to prevent regression
+ ```
+
+ **Rule:** Skip to step 3 only if you already have measurement data that justifies the optimization.
+
+ ## Performance Targets
+
+ ### Core Web Vitals (Web)
+
+ | Metric | Good | Needs Improvement | Poor |
+ | --- | --- | --- | --- |
+ | LCP (Largest Contentful Paint) | ≤ 2.5s | ≤ 4.0s | > 4.0s |
+ | INP (Interaction to Next Paint) | ≤ 200ms | ≤ 500ms | > 500ms |
+ | CLS (Cumulative Layout Shift) | ≤ 0.1 | ≤ 0.25 | > 0.25 |
+
+ ### General Targets
+
+ | Context | Target | Measure |
+ | --- | --- | --- |
+ | API response (p95) | < 200ms | Server-side timing |
+ | CLI command startup | < 500ms | `time` or `perf_hooks` |
+ | Build time | < 60s | CI pipeline metrics |
+ | Bundle size (JS) | < 200KB gzip | Bundler output |
+ | Database query | < 50ms | Query EXPLAIN + timing |
+
+ ## Common Anti-Patterns & Fixes
+
+ ### N+1 Queries
+
+ ```typescript
+ // ❌ N+1: One query per user
+ const users = await db.query("SELECT * FROM users");
+ for (const user of users) {
+   user.posts = await db.query("SELECT * FROM posts WHERE user_id = ?", [user.id]);
+ }
+
+ // ✅ Batch: Two queries total
+ const users = await db.query("SELECT * FROM users");
+ const userIds = users.map((u) => u.id);
+ const posts = await db.query("SELECT * FROM posts WHERE user_id IN (?)", [userIds]);
+ // Group posts by user_id in application code
+ ```
+
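The final grouping step can be sketched with a `Map` in plain application code (the `Post` shape here is a hypothetical minimal example):

```typescript
type Post = { user_id: string; title: string };

// Group a flat list of posts by user_id so each user's posts
// can be attached without issuing further queries.
function groupByUserId(posts: Post[]): Map<string, Post[]> {
  const byUser = new Map<string, Post[]>();
  for (const post of posts) {
    const bucket = byUser.get(post.user_id) ?? [];
    bucket.push(post);
    byUser.set(post.user_id, bucket);
  }
  return byUser;
}
```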
+ ### Unbounded Fetching
+
+ ```typescript
+ // ❌ Fetch everything
+ const allItems = await db.query("SELECT * FROM items");
+
+ // ✅ Paginate
+ const items = await db.query("SELECT * FROM items LIMIT ? OFFSET ?", [pageSize, offset]);
+ ```
+
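A minimal helper for deriving the `OFFSET` from a 1-based page number (hypothetical names; clamped so a zero or negative page cannot produce a negative offset):

```typescript
// Convert a 1-based page number into LIMIT/OFFSET query parameters.
function pageParams(page: number, pageSize: number): { limit: number; offset: number } {
  const safePage = Math.max(1, Math.floor(page));
  return { limit: pageSize, offset: (safePage - 1) * pageSize };
}
```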
+ ### Missing Memoization
+
+ ```typescript
+ // ❌ Recompute on every render (and sort() also mutates the items prop in place)
+ function ExpensiveList({ items }) {
+   const sorted = items.sort((a, b) => complexSort(a, b)); // sorts on every render
+   return <List items={sorted} />;
+ }
+
+ // ✅ Memoize the expensive computation, sorting a copy
+ function ExpensiveList({ items }) {
+   const sorted = useMemo(
+     () => [...items].sort((a, b) => complexSort(a, b)),
+     [items]
+   );
+   return <List items={sorted} />;
+ }
+ ```
+
+ ### Large Bundle Size
+
+ ```typescript
+ // ❌ Import entire library
+ import _ from "lodash";
+ const debounced = _.debounce(fn, 300);
+
+ // ✅ Import only what you need
+ import debounce from "lodash/debounce";
+ const debounced = debounce(fn, 300);
+
+ // ✅✅ Use native (no dependency)
+ function debounce(fn: (...args: any[]) => void, ms: number) {
+   let timer: ReturnType<typeof setTimeout>;
+   return (...args: any[]) => {
+     clearTimeout(timer);
+     timer = setTimeout(() => fn(...args), ms);
+   };
+ }
+ ```
+
+ ### Missing Image Optimization
+
+ ```html
+ <!-- ❌ Unoptimized -->
+ <img src="hero.png" />
+
+ <!-- ✅ Optimized -->
+ <img
+   src="hero.webp"
+   srcset="hero-400.webp 400w, hero-800.webp 800w, hero-1200.webp 1200w"
+   sizes="(max-width: 600px) 400px, (max-width: 1024px) 800px, 1200px"
+   loading="lazy"
+   decoding="async"
+   width="1200"
+   height="630"
+   alt="Hero image"
+ />
+ ```
+
+ ### Unnecessary Re-renders (React)
+
+ ```typescript
+ // ❌ New object every render causes child re-render
+ function Parent() {
+   return <Child style={{ color: 'red' }} onClick={() => doThing()} />;
+ }
+
+ // ✅ Stable references
+ const style = { color: 'red' };
+ function Parent() {
+   const handleClick = useCallback(() => doThing(), []);
+   return <Child style={style} onClick={handleClick} />;
+ }
+ ```
+
+ ## Profiling Tools
+
+ | Context | Tool | What It Shows |
+ | --- | --- | --- |
+ | Web (browser) | Chrome DevTools Performance | Paint, scripting, layout, network |
+ | Web (field data) | CrUX, PageSpeed Insights | Real-user Core Web Vitals |
+ | Node.js | `node --prof` + `--prof-process` | V8 profiling ticks per function |
+ | Node.js | `clinic.js` | Flamegraphs, event loop delays |
+ | React | React DevTools Profiler | Component render times |
+ | SQL | `EXPLAIN ANALYZE` | Query execution plan |
+ | Bundle | `source-map-explorer` | Module size breakdown |
+ | Network | `lighthouse` | Loading performance audit |
+
+ ## Performance Budget
+
+ ### Setting Budgets
+
+ ```json
+ {
+   "budgets": [
+     { "metric": "js-bundle", "max": "200KB", "warn": "150KB" },
+     { "metric": "css-bundle", "max": "50KB", "warn": "40KB" },
+     { "metric": "lcp", "max": "2500ms", "warn": "2000ms" },
+     { "metric": "api-p95", "max": "200ms", "warn": "150ms" }
+   ]
+ }
+ ```
+
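One way such a budget file could be enforced programmatically. The JSON schema above is the skill's own; the checker below is a hypothetical sketch that compares a measured value (in the same unit as the budget entry) against the `warn` and `max` thresholds:

```typescript
type Budget = { metric: string; max: string; warn: string };

// Parse "200KB" / "2500ms" into a bare number; the unit suffix is for humans,
// so the caller must supply the measurement in the same unit.
const toNumber = (v: string): number => parseFloat(v);

// Classify a measured value against one budget entry.
function checkBudget(b: Budget, measured: number): "ok" | "warn" | "fail" {
  if (measured > toNumber(b.max)) return "fail";
  if (measured > toNumber(b.warn)) return "warn";
  return "ok";
}
```

A CI step could run this over Lighthouse or bundler output and exit non-zero on any `"fail"`.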
+ ### Enforcing Budgets in CI
+
+ ```yaml
+ - name: Check bundle size
+   run: |
+     npx bundlesize --config .bundlesizerc.json
+
+ - name: Lighthouse audit
+   run: |
+     npx lighthouse $URL --output json --chrome-flags="--headless"
+     # Parse and assert against budgets
+ ```
+
+ ## Common Rationalizations
+
+ | Excuse | Rebuttal |
+ | --- | --- |
+ | "It's fast enough on my machine" | Test on low-end devices and slow networks. Your machine isn't representative. |
+ | "We'll optimize later" | Performance debt compounds. Set budgets now, optimize when they're breached. |
+ | "This micro-optimization matters" | Profile first. If it's not in the hot path, it doesn't matter. |
+ | "Users won't notice 200ms" | Studies show 100ms delays reduce conversions. Users notice more than you think. |
+ | "Adding metrics is overhead" | The overhead of measurement is trivial compared to the cost of blind optimization. |
+ | "Caching will fix it" | Caching masks problems. Fix the root cause, then add caching as defense. |
+
+ ## Red Flags — STOP
+
+ - Optimizing without profiling data
+ - Adding caching to mask a fundamentally slow operation
+ - Micro-optimizing code that runs once per request
+ - Bundle size growing without review
+ - No performance budget or monitoring in place
+ - Using `SELECT *` in production queries
+
+ ## Verification
+
+ - [ ] Bottleneck identified with profiling data (not intuition)
+ - [ ] Optimization shows measurable improvement in profiler
+ - [ ] Performance budget is set and enforced in CI
+ - [ ] No regressions in existing benchmarks
+ - [ ] Optimization doesn't sacrifice correctness or readability
+
+ ## See Also
+
+ - **react-best-practices** — React-specific performance patterns (server components, bundle optimization)
+ - **ci-cd-and-automation** — Enforcing performance budgets in CI
+ - **code-simplification** — Simplifying code often improves performance as a side effect
package/dist/template/.opencode/skill/receiving-code-review/SKILL.md
@@ -20,6 +20,17 @@ dependencies: []
  - You already verified and accepted the feedback and are ready to implement
  - You need to request a review (use requesting-code-review)
 
+ ## Common Rationalizations
+
+ | Rationalization | Rebuttal |
+ | --- | --- |
+ | "The reviewer is experienced, they must be right" | Experience doesn't mean they have YOUR codebase context. Verify against reality |
+ | "It's faster to just implement it than to verify" | A wrong implementation costs more than the 2 minutes to check |
+ | "Pushing back will create conflict" | Technical correctness > social comfort. Shipping wrong code creates bigger conflict |
+ | "I'll fix it and verify later" | "Later" means after the wrong change is merged and depended upon |
+ | "The suggestion is small, no need to verify" | Small changes break things too. One wrong import can crash a module |
+ | "I understood the feedback, no need to restate" | Restating catches misunderstandings BEFORE you waste time implementing the wrong thing |
+
  ## Overview
 
  Code review requires technical evaluation, not emotional performance.
package/dist/template/.opencode/skill/reflection-checkpoints/SKILL.md
@@ -0,0 +1,183 @@
+ ---
+ name: reflection-checkpoints
+ description: >
+   Use when executing long-running commands (/ship, /lfg) to add self-assessment
+   checkpoints that detect scope drift, stalled progress, and premature completion claims.
+   Inspired by ByteRover's reflection prompt architecture.
+ version: 1.0.0
+ tags: [workflow, quality, autonomous]
+ dependencies: [verification-before-completion]
+ ---
+
+ # Reflection Checkpoints
+
+ ## When to Use
+
+ - During `/ship` execution after completing 50%+ of tasks
+ - During `/lfg` at each phase transition (Plan→Work→Review→Compound)
+ - When a task takes significantly longer than estimated
+ - When context usage exceeds 60% of budget
+
+ ## When NOT to Use
+
+ - Simple, single-task work (< 3 tasks)
+ - Pure research or exploration commands
+ - When the user explicitly requests fast execution without checkpoints
+
+ ## Overview
+
+ Long-running autonomous execution drifts silently. By the time you notice, you've burned context on the wrong thing. Reflection checkpoints force self-assessment at critical moments — catching drift before it compounds.
+
+ **Core principle:** Pause to assess, don't just assess to pause.
+
+ ## The Four Reflection Types
+
+ ### 1. Mid-Point Check
+
+ **Trigger:** After completing ~50% of planned tasks (e.g., 3 of 6 tasks done)
+
+ ```
+ ## 🔍 Mid-Point Reflection
+
+ **Progress:** [N/M] tasks complete
+ **Context used:** ~[X]% estimated
+
+ ### Scope Check
+ - [ ] Am I still solving the original problem?
+ - [ ] Have I introduced any unplanned work?
+ - [ ] Are remaining tasks still correctly scoped?
+
+ ### Quality Check
+ - [ ] Do completed tasks actually work (not just "done")?
+ - [ ] Any verification steps I deferred?
+ - [ ] Any TODO/FIXME I left that needs addressing?
+
+ ### Efficiency Check
+ - [ ] Am I spending context on the right things?
+ - [ ] Should remaining tasks be parallelized?
+ - [ ] Any tasks that should be deferred to a follow-up bead?
+
+ **Assessment:** [On track / Drifting / Blocked]
+ **Adjustment:** [None needed / Describe change]
+ ```
+
+ ### 2. Completion Check
+
+ **Trigger:** Before claiming any task or phase is complete
+
+ ```
+ ## ✅ Completion Check
+
+ **Claiming complete:** [task/phase name]
+
+ ### Evidence Audit
+ - [ ] Verification command was run (not assumed)
+ - [ ] Output confirms the claim (not inferred)
+ - [ ] No stub patterns in modified files
+ - [ ] Imports/exports are wired (not just declared)
+
+ ### Goal-Backward Check
+ - [ ] Does this task achieve its stated end-state?
+ - [ ] Would a user see the expected behavior?
+ - [ ] If tested manually, would it work?
+
+ **Verdict:** [Complete / Needs work: describe what]
+ ```
+
+ ### 3. Near-Limit Warning
+
+ **Trigger:** When context usage exceeds ~70% or the step count approaches its limit
+
+ ```
+ ## ⚠️ Near-Limit Warning
+
+ **Context pressure:** [High / Critical]
+ **Remaining tasks:** [N]
+
+ ### Triage
+ 1. What MUST be done before stopping? [list critical tasks]
+ 2. What CAN be deferred? [list deferrable tasks]
+ 3. What should be handed off? [list with context needed]
+
+ ### Action
+ - [ ] Compress completed work
+ - [ ] Prioritize remaining tasks ruthlessly
+ - [ ] Prepare handoff if needed
+
+ **Decision:** [Continue (enough budget) / Compress and continue / Handoff now]
+ ```
+
+ ### 4. Phase Transition Check
+
+ **Trigger:** At `/lfg` phase boundaries (Plan→Work, Work→Review, Review→Compound)
+
+ ```
+ ## 🔄 Phase Transition: [Previous] → [Next]
+
+ ### Previous Phase Assessment
+ - **Objective met?** [Yes / Partially / No]
+ - **Artifacts produced:** [list]
+ - **Open issues carried forward:** [list or "none"]
+
+ ### Next Phase Readiness
+ - [ ] Prerequisites satisfied
+ - [ ] Context is clean (no stale noise)
+ - [ ] Correct skills loaded for next phase
+
+ **Proceed:** [Yes / Need to resolve: describe]
+ ```
+
+ ## Integration Points
+
+ ### In `/ship` (Phase 3 task loop)
+
+ After completing ceil(totalTasks / 2) tasks, run the **Mid-Point Check**:
+
+ ```typescript
+ const midpoint = Math.ceil(totalTasks / 2);
+ if (completedTasks === midpoint) {
+   // Run mid-point reflection
+   // Log assessment to .beads/artifacts/$BEAD_ID/reflections.md
+ }
+ ```
+
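Combining this trigger with the small-run exemption from the Gotchas section (skip the mid-point check for fewer than 4 tasks), a hypothetical predicate might look like:

```typescript
// True exactly once per run: when the completed count hits the midpoint.
// Runs with fewer than 4 tasks skip the check entirely (see Gotchas).
function shouldRunMidpointCheck(completedTasks: number, totalTasks: number): boolean {
  if (totalTasks < 4) return false;
  return completedTasks === Math.ceil(totalTasks / 2);
}
```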
+ Before each task completion claim, run the **Completion Check** (lightweight — just the evidence audit).
+
+ ### In `/lfg` (phase transitions)
+
+ At each step boundary (Plan→Work, Work→Review, Review→Compound), run the **Phase Transition Check**.
+
+ ### Context pressure monitoring
+
+ When the context usage estimate exceeds 70%, run the **Near-Limit Warning** regardless of task position.
+
+ ## Reflection Log
+
+ Append all reflections to `.beads/artifacts/$BEAD_ID/reflections.md` (or a session-level log if no bead):
+
+ ```markdown
+ ## Reflection Log
+
+ ### [timestamp] Mid-Point Check
+
+ Assessment: On track
+ Context: ~45% used
+ Adjustment: None
+
+ ### [timestamp] Completion Check — Task 3
+
+ Verdict: Complete
+ Evidence: typecheck pass, test pass (12/12)
+
+ ### [timestamp] Near-Limit Warning
+
+ Decision: Compress and continue
+ Deferred: Task 6 (cosmetic cleanup) → follow-up bead
+ ```
+
+ ## Gotchas
+
+ - **Don't over-reflect** — these are quick self-checks, not long analyses. Each should take < 30 seconds of reasoning.
+ - **Don't block on minor drift** — if the drift is cosmetic (variable naming, style), note it and continue. Only pause for scope drift.
+ - **Context cost** — each reflection adds ~200-400 tokens. Budget accordingly. Skip the mid-point check for < 4 tasks.
+ - **Not a replacement for verification** — reflections assess trajectory, not correctness. Always run actual verification commands.