opencodekit 0.20.6 → 0.20.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. package/dist/index.js +1 -1
  2. package/dist/template/.opencode/AGENTS.md +48 -0
  3. package/dist/template/.opencode/agent/build.md +3 -2
  4. package/dist/template/.opencode/agent/explore.md +14 -14
  5. package/dist/template/.opencode/agent/general.md +1 -1
  6. package/dist/template/.opencode/agent/plan.md +1 -1
  7. package/dist/template/.opencode/agent/review.md +1 -1
  8. package/dist/template/.opencode/agent/vision.md +0 -9
  9. package/dist/template/.opencode/command/compound.md +102 -28
  10. package/dist/template/.opencode/command/curate.md +299 -0
  11. package/dist/template/.opencode/command/lfg.md +1 -0
  12. package/dist/template/.opencode/command/ship.md +1 -0
  13. package/dist/template/.opencode/memory.db +0 -0
  14. package/dist/template/.opencode/memory.db-shm +0 -0
  15. package/dist/template/.opencode/memory.db-wal +0 -0
  16. package/dist/template/.opencode/opencode.json +0 -5
  17. package/dist/template/.opencode/package.json +1 -1
  18. package/dist/template/.opencode/pnpm-lock.yaml +791 -9
  19. package/dist/template/.opencode/skill/api-and-interface-design/SKILL.md +162 -0
  20. package/dist/template/.opencode/skill/beads/SKILL.md +10 -9
  21. package/dist/template/.opencode/skill/beads/references/MULTI_AGENT.md +10 -10
  22. package/dist/template/.opencode/skill/ci-cd-and-automation/SKILL.md +202 -0
  23. package/dist/template/.opencode/skill/code-search-patterns/SKILL.md +253 -0
  24. package/dist/template/.opencode/skill/code-simplification/SKILL.md +211 -0
  25. package/dist/template/.opencode/skill/condition-based-waiting/SKILL.md +12 -0
  26. package/dist/template/.opencode/skill/defense-in-depth/SKILL.md +16 -6
  27. package/dist/template/.opencode/skill/deprecation-and-migration/SKILL.md +189 -0
  28. package/dist/template/.opencode/skill/development-lifecycle/SKILL.md +12 -48
  29. package/dist/template/.opencode/skill/documentation-and-adrs/SKILL.md +220 -0
  30. package/dist/template/.opencode/skill/incremental-implementation/SKILL.md +191 -0
  31. package/dist/template/.opencode/skill/performance-optimization/SKILL.md +236 -0
  32. package/dist/template/.opencode/skill/receiving-code-review/SKILL.md +11 -0
  33. package/dist/template/.opencode/skill/reflection-checkpoints/SKILL.md +183 -0
  34. package/dist/template/.opencode/skill/security-and-hardening/SKILL.md +296 -0
  35. package/dist/template/.opencode/skill/structured-edit/SKILL.md +10 -0
  36. package/dist/template/.opencode/skill/swarm-coordination/SKILL.md +66 -1
  37. package/package.json +1 -1
  38. package/dist/template/.opencode/skill/beads-bridge/SKILL.md +0 -321
  39. package/dist/template/.opencode/skill/code-navigation/SKILL.md +0 -130
  40. package/dist/template/.opencode/skill/mqdh/SKILL.md +0 -171
  41. package/dist/template/.opencode/skill/obsidian/SKILL.md +0 -192
  42. package/dist/template/.opencode/skill/obsidian/mcp.json +0 -22
  43. package/dist/template/.opencode/skill/pencil/SKILL.md +0 -72
  44. package/dist/template/.opencode/skill/ralph/SKILL.md +0 -296
  45. package/dist/template/.opencode/skill/tilth-cli/SKILL.md +0 -207
  46. package/dist/template/.opencode/skill/tool-priority/SKILL.md +0 -299
package/dist/template/.opencode/skill/incremental-implementation/SKILL.md
@@ -0,0 +1,191 @@
+ ---
+ name: incremental-implementation
+ description: Use when implementing features or fixes to enforce thin vertical slices with verify-after-each — prevents large, untested changes by requiring working code at every step
+ version: 1.0.0
+ tags: [workflow, implementation, code-quality]
+ dependencies: [test-driven-development, verification-before-completion]
+ ---
+
+ # Incremental Implementation
+
+ > **Replaces** big-bang implementations where everything is built at once and tested at the end — enforces thin vertical slices with verification after each step
+
+ ## When to Use
+
+ - Implementing any feature that touches more than 2 files
+ - Working from a plan or spec with multiple tasks
+ - Building something where partial progress should be demonstrable
+
+ ## When NOT to Use
+
+ - One-line fixes or trivial changes
+ - Pure refactors with no behavior change (use code-simplification instead)
+ - Exploratory prototyping where you need to experiment freely
+
+ ## Common Rationalizations
+
+ | Rationalization | Rebuttal |
+ | --- | --- |
+ | "I'll build everything first and test at the end" | End-to-end testing after 500 lines of changes makes failures impossible to isolate |
+ | "This feature can't be split into slices" | Every feature can be sliced — you're confusing "the UI needs all parts" with "the code must be written all at once" |
+ | "Committing partial work creates noise" | Partial working commits are rollback points. One giant commit is a rollback cliff |
+ | "It's faster to write it all at once" | It feels faster until the first bug takes 2 hours to locate in a 400-line diff |
+ | "The slices are too small to be meaningful" | If a slice compiles, passes tests, and moves toward the goal, it's meaningful |
+ | "I need to see the whole picture first" | Read the plan first, then implement slice by slice. Understanding ≠ building all at once |
+
+ ## Overview
+
+ Large implementations fail because errors compound. When you write 500 lines before running anything, each line can introduce a bug that interacts with bugs from other lines. Thin vertical slices keep the error surface small.
+
+ **Core principle:** Working code at every step. Never be more than one slice away from a green build.
+
+ ## The Cycle
+
+ ```
+ FOR each slice:
+   1. IMPLEMENT — Write the minimal code for this slice (1-3 files max)
+   2. VERIFY — Run typecheck + lint + relevant tests
+   3. COMMIT — Create a checkpoint with a descriptive message
+   4. NEXT — Move to the next slice
+
+ IF verify fails:
+   Fix within the current slice before moving on
+   Do NOT proceed to the next slice with broken code
+ ```
+
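The loop above can be sketched as code. This is a hypothetical runner, not part of the package; the `Slice` type, `implement`, and `verify` callbacks are illustrative names, and the commit step is marked as a comment where it would occur:

```typescript
type Slice = { name: string; implement: () => void; verify: () => boolean };

// Run slices in order, stopping at the first failed verification, so the
// codebase is never more than one slice away from a green build.
function runSlices(slices: Slice[]): { committed: string[]; failedAt?: string } {
  const committed: string[] = [];
  for (const slice of slices) {
    slice.implement();
    if (!slice.verify()) {
      // Fix within this slice before moving on — do NOT continue with broken code
      return { committed, failedAt: slice.name };
    }
    committed.push(slice.name); // the checkpoint commit would happen here
  }
  return { committed };
}
```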
+ ## Slicing Strategies
+
+ ### Vertical Slice (Preferred)
+
+ Each slice delivers one thin path through the full stack:
+
+ ```
+ Slice 1: API endpoint returns hardcoded data → test passes
+ Slice 2: API endpoint reads from database → test passes
+ Slice 3: UI calls API and renders data → test passes
+ Slice 4: Add validation and error handling → test passes
+ ```
+
+ ### Contract-First
+
+ Define interfaces first, then implement behind them:
+
+ ```
+ Slice 1: Define types/interfaces → compiles
+ Slice 2: Implement with stubs → tests pass (with mocked data)
+ Slice 3: Replace stubs with real implementation → tests pass
+ ```
+
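As a concrete sketch of that contract-first sequence (all names here are hypothetical, and the repository interface is only an illustration):

```typescript
// Slice 1: define the contract — this alone must compile
interface UserRepo {
  getUser(id: string): Promise<{ id: string; name: string }>;
}

// Slice 2: a stub implementation so callers and tests can run
class StubUserRepo implements UserRepo {
  async getUser(id: string) {
    return { id, name: "stub" };
  }
}

// Slice 3: the real implementation replaces the stub behind the same interface
class DbUserRepo implements UserRepo {
  constructor(private query: (sql: string, params: unknown[]) => Promise<any[]>) {}
  async getUser(id: string) {
    const rows = await this.query("SELECT id, name FROM users WHERE id = ?", [id]);
    return rows[0];
  }
}
```

Callers written against `UserRepo` in slice 2 need no changes when slice 3 lands.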
+ ### Risk-First
+
+ Implement the hardest or most uncertain part first:
+
+ ```
+ Slice 1: The tricky algorithm or integration → tests pass
+ Slice 2: The straightforward plumbing → tests pass
+ Slice 3: The UI/presentation layer → tests pass
+ ```
+
+ ## Implementation Rules
+
+ ### 1. Simplicity First
+
+ Default to the simplest viable solution for each slice.
+
+ ```
+ ❌ "Let me add a factory pattern for extensibility"
+ ✅ "Direct function call works. Refactor to a pattern IF a second use case appears"
+ ```
+
+ ### 2. Scope Discipline
+
+ Each slice does ONE thing. If you notice something else that needs fixing:
+
+ ```
+ NOTICED BUT NOT TOUCHING: [description of unrelated improvement]
+ ```
+
+ Log it and continue with the current slice.
+
+ ### 3. One Compilable Step at a Time
+
+ Never leave the codebase in a state where typecheck fails between slices.
+
+ ```
+ ❌ Add 5 function signatures, then implement all 5
+ ✅ Add and implement function 1, verify, then function 2
+ ```
+
+ ### 4. Keep Tests Green
+
+ If existing tests break from your change, fix them in the same slice — not in a "fix tests" slice later.
+
+ ### 5. Feature Flags for Incomplete Features
+
+ If a slice can't be hidden behind existing abstractions:
+
+ ```typescript
+ // Temporary gate — remove when the feature is complete
+ if (process.env.ENABLE_NEW_FEATURE) {
+   // new code path
+ } else {
+   // existing behavior
+ }
+ ```
+
+ ### 6. Rollback-Friendly
+
+ Each committed slice should be independently revertible without breaking the build.
+
+ ## Slice Size Guide
+
+ | Slice Size | Signal |
+ | --- | --- |
+ | 1-30 lines | Ideal — easy to review and verify |
+ | 30-100 lines | Acceptable — still isolatable |
+ | 100-200 lines | Too large — find a split point |
+ | 200+ lines | Stop. You're doing big-bang implementation |
+
+ ## Red Flags — STOP
+
+ If you catch yourself:
+
+ - Writing more than 100 lines without running verification
+ - Saying "I'll test this after I finish the next part"
+ - Having 3+ files with uncommitted changes
+ - Building a complex abstraction before the simple version works
+ - Skipping verification because "this slice is trivial"
+
+ **STOP.** Verify what you have. Commit if it passes. Then continue.
+
+ ## Verification
+
+ After each slice:
+
+ ```bash
+ # Minimum verification (must pass)
+ npm run typecheck   # or equivalent
+ npm run lint        # or equivalent
+
+ # If the slice changes behavior
+ npm test            # relevant test files
+ ```
+
+ After all slices complete:
+
+ ```bash
+ # Full verification
+ npm run typecheck && npm run lint && npm test
+ ```
+
+ ## Integration with Other Skills
+
+ - **test-driven-development** — Write the test for each slice FIRST (RED), then implement (GREEN)
+ - **verification-before-completion** — Run full gates after the final slice
+ - **code-simplification** — Refactor AFTER all slices pass, not during implementation
+ - **systematic-debugging** — If a slice fails verification, debug systematically instead of guessing
+
+ ## See Also
+
+ - **writing-plans** — Creates the plan that this skill executes slice-by-slice
+ - **executing-plans** — Orchestrates parallel execution of independent slices
package/dist/template/.opencode/skill/performance-optimization/SKILL.md
@@ -0,0 +1,236 @@
+ ---
+ name: performance-optimization
+ description: Use when profiling, optimizing, or adding performance budgets to applications — covers measure-first workflow, Core Web Vitals, common anti-patterns, and performance regression prevention
+ version: 1.0.0
+ tags: [performance, code-quality]
+ dependencies: []
+ ---
+
+ # Performance Optimization
+
+ > **Replaces** premature optimization and gut-feeling tuning with measurement-driven improvements that target actual bottlenecks
+
+ ## When to Use
+
+ - Application is measurably slow (user reports, metrics, profiler data)
+ - Setting up performance budgets for a new project
+ - Reviewing code for common performance anti-patterns
+ - Performance regression detected in CI or monitoring
+
+ ## When NOT to Use
+
+ - No evidence of a performance problem — premature optimization wastes time
+ - Micro-optimizations that save nanoseconds in non-hot paths
+ - Choosing "fast" over "correct" — correctness first, always
+
+ ## Overview
+
+ Performance optimization is an empirical process. Measure, identify, fix, verify. Never optimize without profiling first.
+
+ **Core principle:** Measure before optimizing. Optimize the bottleneck, not the code you happen to be reading. Verify the improvement with numbers.
+
+ ## Measure-First Workflow
+
+ ```
+ 1. MEASURE — Profile to identify actual bottlenecks (not guessed ones)
+ 2. IDENTIFY — Find the specific hot path or resource constraint
+ 3. FIX — Apply targeted optimization to the bottleneck
+ 4. VERIFY — Measure again to confirm improvement
+ 5. GUARD — Add budget/benchmark to prevent regression
+ ```
+
+ **Rule:** Skip to step 3 only if you already have measurement data that justifies the optimization.
+
+ ## Performance Targets
+
+ ### Core Web Vitals (Web)
+
+ | Metric | Good | Needs Improvement | Poor |
+ | --- | --- | --- | --- |
+ | LCP (Largest Contentful Paint) | ≤ 2.5s | ≤ 4.0s | > 4.0s |
+ | INP (Interaction to Next Paint) | ≤ 200ms | ≤ 500ms | > 500ms |
+ | CLS (Cumulative Layout Shift) | ≤ 0.1 | ≤ 0.25 | > 0.25 |
+
+ ### General Targets
+
+ | Context | Target | Measure |
+ | --- | --- | --- |
+ | API response (p95) | < 200ms | Server-side timing |
+ | CLI command startup | < 500ms | `time` or `perf_hooks` |
+ | Build time | < 60s | CI pipeline metrics |
+ | Bundle size (JS) | < 200KB gzip | Bundler output |
+ | Database query | < 50ms | Query EXPLAIN + timing |
+
+ ## Common Anti-Patterns & Fixes
+
+ ### N+1 Queries
+
+ ```typescript
+ // ❌ N+1: One query per user
+ const users = await db.query("SELECT * FROM users");
+ for (const user of users) {
+   user.posts = await db.query("SELECT * FROM posts WHERE user_id = ?", [user.id]);
+ }
+
+ // ✅ Batch: Two queries total
+ const users = await db.query("SELECT * FROM users");
+ const userIds = users.map((u) => u.id);
+ const posts = await db.query("SELECT * FROM posts WHERE user_id IN (?)", [userIds]);
+ // Group posts by user_id in application code
+ ```
+
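The final grouping step can be sketched with a `Map` in plain application code (the `Post` shape here is a hypothetical minimal example):

```typescript
type Post = { user_id: string; title: string };

// Group a flat list of posts by user_id so each user's posts
// can be attached without issuing further queries.
function groupByUserId(posts: Post[]): Map<string, Post[]> {
  const byUser = new Map<string, Post[]>();
  for (const post of posts) {
    const bucket = byUser.get(post.user_id) ?? [];
    bucket.push(post);
    byUser.set(post.user_id, bucket);
  }
  return byUser;
}
```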
+ ### Unbounded Fetching
+
+ ```typescript
+ // ❌ Fetch everything
+ const allItems = await db.query("SELECT * FROM items");
+
+ // ✅ Paginate
+ const items = await db.query("SELECT * FROM items LIMIT ? OFFSET ?", [pageSize, offset]);
+ ```
+
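A minimal helper for deriving the `OFFSET` from a 1-based page number (hypothetical names; clamped so a zero or negative page cannot produce a negative offset):

```typescript
// Convert a 1-based page number into LIMIT/OFFSET query parameters.
function pageParams(page: number, pageSize: number): { limit: number; offset: number } {
  const safePage = Math.max(1, Math.floor(page));
  return { limit: pageSize, offset: (safePage - 1) * pageSize };
}
```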
+ ### Missing Memoization
+
+ ```typescript
+ // ❌ Recompute on every render (and sort() also mutates the items prop in place)
+ function ExpensiveList({ items }) {
+   const sorted = items.sort((a, b) => complexSort(a, b)); // sorts on every render
+   return <List items={sorted} />;
+ }
+
+ // ✅ Memoize the expensive computation, sorting a copy
+ function ExpensiveList({ items }) {
+   const sorted = useMemo(
+     () => [...items].sort((a, b) => complexSort(a, b)),
+     [items]
+   );
+   return <List items={sorted} />;
+ }
+ ```
+
+ ### Large Bundle Size
+
+ ```typescript
+ // ❌ Import entire library
+ import _ from "lodash";
+ const debounced = _.debounce(fn, 300);
+
+ // ✅ Import only what you need
+ import debounce from "lodash/debounce";
+ const debounced = debounce(fn, 300);
+
+ // ✅✅ Use native (no dependency)
+ function debounce(fn: (...args: any[]) => void, ms: number) {
+   let timer: ReturnType<typeof setTimeout>;
+   return (...args: any[]) => {
+     clearTimeout(timer);
+     timer = setTimeout(() => fn(...args), ms);
+   };
+ }
+ ```
+
+ ### Missing Image Optimization
+
+ ```html
+ <!-- ❌ Unoptimized -->
+ <img src="hero.png" />
+
+ <!-- ✅ Optimized -->
+ <img
+   src="hero.webp"
+   srcset="hero-400.webp 400w, hero-800.webp 800w, hero-1200.webp 1200w"
+   sizes="(max-width: 600px) 400px, (max-width: 1024px) 800px, 1200px"
+   loading="lazy"
+   decoding="async"
+   width="1200"
+   height="630"
+   alt="Hero image"
+ />
+ ```
+
+ ### Unnecessary Re-renders (React)
+
+ ```typescript
+ // ❌ New object every render causes child re-render
+ function Parent() {
+   return <Child style={{ color: 'red' }} onClick={() => doThing()} />;
+ }
+
+ // ✅ Stable references
+ const style = { color: 'red' };
+ function Parent() {
+   const handleClick = useCallback(() => doThing(), []);
+   return <Child style={style} onClick={handleClick} />;
+ }
+ ```
+
+ ## Profiling Tools
+
+ | Context | Tool | What It Shows |
+ | --- | --- | --- |
+ | Web (browser) | Chrome DevTools Performance | Paint, scripting, layout, network |
+ | Web (field data) | CrUX, PageSpeed Insights | Real-user Core Web Vitals |
+ | Node.js | `node --prof` + `--prof-process` | V8 profiling ticks per function |
+ | Node.js | `clinic.js` | Flamegraphs, event loop delays |
+ | React | React DevTools Profiler | Component render times |
+ | SQL | `EXPLAIN ANALYZE` | Query execution plan |
+ | Bundle | `source-map-explorer` | Module size breakdown |
+ | Network | `lighthouse` | Loading performance audit |
+
+ ## Performance Budget
+
+ ### Setting Budgets
+
+ ```json
+ {
+   "budgets": [
+     { "metric": "js-bundle", "max": "200KB", "warn": "150KB" },
+     { "metric": "css-bundle", "max": "50KB", "warn": "40KB" },
+     { "metric": "lcp", "max": "2500ms", "warn": "2000ms" },
+     { "metric": "api-p95", "max": "200ms", "warn": "150ms" }
+   ]
+ }
+ ```
+
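One way such a budget file could be enforced programmatically. The JSON schema above is the skill's own; the checker below is a hypothetical sketch that compares a measured value (in the same unit as the budget entry) against the `warn` and `max` thresholds:

```typescript
type Budget = { metric: string; max: string; warn: string };

// Parse "200KB" / "2500ms" into a bare number; the unit suffix is for humans,
// so the caller must supply the measurement in the same unit.
const toNumber = (v: string): number => parseFloat(v);

// Classify a measured value against one budget entry.
function checkBudget(b: Budget, measured: number): "ok" | "warn" | "fail" {
  if (measured > toNumber(b.max)) return "fail";
  if (measured > toNumber(b.warn)) return "warn";
  return "ok";
}
```

A CI step could run this over Lighthouse or bundler output and exit non-zero on any `"fail"`.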
+ ### Enforcing Budgets in CI
+
+ ```yaml
+ - name: Check bundle size
+   run: |
+     npx bundlesize --config .bundlesizerc.json
+
+ - name: Lighthouse audit
+   run: |
+     npx lighthouse $URL --output json --chrome-flags="--headless"
+     # Parse and assert against budgets
+ ```
+
+ ## Common Rationalizations
+
+ | Excuse | Rebuttal |
+ | --- | --- |
+ | "It's fast enough on my machine" | Test on low-end devices and slow networks. Your machine isn't representative. |
+ | "We'll optimize later" | Performance debt compounds. Set budgets now, optimize when they're breached. |
+ | "This micro-optimization matters" | Profile first. If it's not in the hot path, it doesn't matter. |
+ | "Users won't notice 200ms" | Studies show 100ms delays reduce conversions. Users notice more than you think. |
+ | "Adding metrics is overhead" | The overhead of measurement is trivial compared to the cost of blind optimization. |
+ | "Caching will fix it" | Caching masks problems. Fix the root cause, then add caching as defense. |
+
+ ## Red Flags — STOP
+
+ - Optimizing without profiling data
+ - Adding caching to mask a fundamentally slow operation
+ - Micro-optimizing code that runs once per request
+ - Bundle size growing without review
+ - No performance budget or monitoring in place
+ - Using `SELECT *` in production queries
+
+ ## Verification
+
+ - [ ] Bottleneck identified with profiling data (not intuition)
+ - [ ] Optimization shows measurable improvement in profiler
+ - [ ] Performance budget is set and enforced in CI
+ - [ ] No regressions in existing benchmarks
+ - [ ] Optimization doesn't sacrifice correctness or readability
+
+ ## See Also
+
+ - **react-best-practices** — React-specific performance patterns (server components, bundle optimization)
+ - **ci-cd-and-automation** — Enforcing performance budgets in CI
+ - **code-simplification** — Simplifying code often improves performance as a side effect
package/dist/template/.opencode/skill/receiving-code-review/SKILL.md
@@ -20,6 +20,17 @@ dependencies: []
  - You already verified and accepted the feedback and are ready to implement
  - You need to request a review (use requesting-code-review)
 
+ ## Common Rationalizations
+
+ | Rationalization | Rebuttal |
+ | --- | --- |
+ | "The reviewer is experienced, they must be right" | Experience doesn't mean they have YOUR codebase context. Verify against reality |
+ | "It's faster to just implement it than to verify" | A wrong implementation costs more than the 2 minutes to check |
+ | "Pushing back will create conflict" | Technical correctness > social comfort. Shipping wrong code creates bigger conflict |
+ | "I'll fix it and verify later" | "Later" means after the wrong change is merged and depended upon |
+ | "The suggestion is small, no need to verify" | Small changes break things too. One wrong import can crash a module |
+ | "I understood the feedback, no need to restate" | Restating catches misunderstandings BEFORE you waste time implementing the wrong thing |
+
  ## Overview
 
  Code review requires technical evaluation, not emotional performance.
package/dist/template/.opencode/skill/reflection-checkpoints/SKILL.md
@@ -0,0 +1,183 @@
+ ---
+ name: reflection-checkpoints
+ description: >
+   Use when executing long-running commands (/ship, /lfg) to add self-assessment
+   checkpoints that detect scope drift, stalled progress, and premature completion claims.
+   Inspired by ByteRover's reflection prompt architecture.
+ version: 1.0.0
+ tags: [workflow, quality, autonomous]
+ dependencies: [verification-before-completion]
+ ---
+
+ # Reflection Checkpoints
+
+ ## When to Use
+
+ - During `/ship` execution after completing 50%+ of tasks
+ - During `/lfg` at each phase transition (Plan→Work→Review→Compound)
+ - When a task takes significantly longer than estimated
+ - When context usage exceeds 60% of budget
+
+ ## When NOT to Use
+
+ - Simple, single-task work (< 3 tasks)
+ - Pure research or exploration commands
+ - When the user explicitly requests fast execution without checkpoints
+
+ ## Overview
+
+ Long-running autonomous execution drifts silently. By the time you notice, you've burned context on the wrong thing. Reflection checkpoints force self-assessment at critical moments — catching drift before it compounds.
+
+ **Core principle:** Pause to assess, don't just assess to pause.
+
+ ## The Four Reflection Types
+
+ ### 1. Mid-Point Check
+
+ **Trigger:** After completing ~50% of planned tasks (e.g., 3 of 6 tasks done)
+
+ ```
+ ## 🔍 Mid-Point Reflection
+
+ **Progress:** [N/M] tasks complete
+ **Context used:** ~[X]% estimated
+
+ ### Scope Check
+ - [ ] Am I still solving the original problem?
+ - [ ] Have I introduced any unplanned work?
+ - [ ] Are remaining tasks still correctly scoped?
+
+ ### Quality Check
+ - [ ] Do completed tasks actually work (not just "done")?
+ - [ ] Any verification steps I deferred?
+ - [ ] Any TODO/FIXME I left that needs addressing?
+
+ ### Efficiency Check
+ - [ ] Am I spending context on the right things?
+ - [ ] Should remaining tasks be parallelized?
+ - [ ] Any tasks that should be deferred to a follow-up bead?
+
+ **Assessment:** [On track / Drifting / Blocked]
+ **Adjustment:** [None needed / Describe change]
+ ```
+
+ ### 2. Completion Check
+
+ **Trigger:** Before claiming any task or phase is complete
+
+ ```
+ ## ✅ Completion Check
+
+ **Claiming complete:** [task/phase name]
+
+ ### Evidence Audit
+ - [ ] Verification command was run (not assumed)
+ - [ ] Output confirms the claim (not inferred)
+ - [ ] No stub patterns in modified files
+ - [ ] Imports/exports are wired (not just declared)
+
+ ### Goal-Backward Check
+ - [ ] Does this task achieve its stated end-state?
+ - [ ] Would a user see the expected behavior?
+ - [ ] If tested manually, would it work?
+
+ **Verdict:** [Complete / Needs work: describe what]
+ ```
+
+ ### 3. Near-Limit Warning
+
+ **Trigger:** When context usage exceeds ~70% or the step count approaches its limit
+
+ ```
+ ## ⚠️ Near-Limit Warning
+
+ **Context pressure:** [High / Critical]
+ **Remaining tasks:** [N]
+
+ ### Triage
+ 1. What MUST be done before stopping? [list critical tasks]
+ 2. What CAN be deferred? [list deferrable tasks]
+ 3. What should be handed off? [list with context needed]
+
+ ### Action
+ - [ ] Compress completed work
+ - [ ] Prioritize remaining tasks ruthlessly
+ - [ ] Prepare handoff if needed
+
+ **Decision:** [Continue (enough budget) / Compress and continue / Handoff now]
+ ```
+
+ ### 4. Phase Transition Check
+
+ **Trigger:** At `/lfg` phase boundaries (Plan→Work, Work→Review, Review→Compound)
+
+ ```
+ ## 🔄 Phase Transition: [Previous] → [Next]
+
+ ### Previous Phase Assessment
+ - **Objective met?** [Yes / Partially / No]
+ - **Artifacts produced:** [list]
+ - **Open issues carried forward:** [list or "none"]
+
+ ### Next Phase Readiness
+ - [ ] Prerequisites satisfied
+ - [ ] Context is clean (no stale noise)
+ - [ ] Correct skills loaded for next phase
+
+ **Proceed:** [Yes / Need to resolve: describe]
+ ```
+
+ ## Integration Points
+
+ ### In `/ship` (Phase 3 task loop)
+
+ After completing ceil(totalTasks / 2) tasks, run the **Mid-Point Check**:
+
+ ```typescript
+ const midpoint = Math.ceil(totalTasks / 2);
+ if (completedTasks === midpoint) {
+   // Run mid-point reflection
+   // Log assessment to .beads/artifacts/$BEAD_ID/reflections.md
+ }
+ ```
+
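Combining this trigger with the small-run exemption from the Gotchas section (skip the mid-point check for fewer than 4 tasks), a hypothetical predicate might look like:

```typescript
// True exactly once per run: when the completed count hits the midpoint.
// Runs with fewer than 4 tasks skip the check entirely (see Gotchas).
function shouldRunMidpointCheck(completedTasks: number, totalTasks: number): boolean {
  if (totalTasks < 4) return false;
  return completedTasks === Math.ceil(totalTasks / 2);
}
```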
+ Before each task completion claim, run the **Completion Check** (lightweight — just the evidence audit).
+
+ ### In `/lfg` (phase transitions)
+
+ At each step boundary (Plan→Work, Work→Review, Review→Compound), run the **Phase Transition Check**.
+
+ ### Context pressure monitoring
+
+ When the context usage estimate exceeds 70%, run the **Near-Limit Warning** regardless of task position.
+
+ ## Reflection Log
+
+ Append all reflections to `.beads/artifacts/$BEAD_ID/reflections.md` (or a session-level log if no bead):
+
+ ```markdown
+ ## Reflection Log
+
+ ### [timestamp] Mid-Point Check
+
+ Assessment: On track
+ Context: ~45% used
+ Adjustment: None
+
+ ### [timestamp] Completion Check — Task 3
+
+ Verdict: Complete
+ Evidence: typecheck pass, test pass (12/12)
+
+ ### [timestamp] Near-Limit Warning
+
+ Decision: Compress and continue
+ Deferred: Task 6 (cosmetic cleanup) → follow-up bead
+ ```
+
+ ## Gotchas
+
+ - **Don't over-reflect** — these are quick self-checks, not long analyses. Each should take < 30 seconds of reasoning.
+ - **Don't block on minor drift** — if the drift is cosmetic (variable naming, style), note it and continue. Only pause for scope drift.
+ - **Context cost** — each reflection adds ~200-400 tokens. Budget accordingly. Skip the mid-point check for < 4 tasks.
+ - **Not a replacement for verification** — reflections assess trajectory, not correctness. Always run actual verification commands.