npm - opencodekit - Versions diffs - 0.18.3 → 0.18.5 - Mend

opencodekit 0.18.3 → 0.18.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (44)

package/dist/template/.opencode/skill/context-management/SKILL.md CHANGED

@@ -1,164 +1,173 @@
 ---
 name: context-management
-description: Use when context is growing large, needing to prune/distill tool outputs, or managing conversation size - covers DCP slash commands and context budgets
-version: 1.0.0
-tags: [context, workflow]
+description: Unified protocol for context health and session lifecycle management using DCP tools, thresholds, handoff, and resume workflows
+version: 2.0.0
+tags: [context, workflow, session]
 dependencies: []
 ---
 # Context Management
+> **Replaces** manual context tracking and ad-hoc session management — unified protocol for context health across a session lifecycle
+Use this skill to keep context useful from first turn to final handoff.
 ## When to Use
-- Context is growing large and you need to compress/distill/prune tool outputs
-- You are finishing a phase and want to preserve signal while freeing tokens
+- Context size is growing and you need to reduce noise without losing critical details
+- You are finishing a work phase and want to compress completed exploration/implementation
+- You are preparing `/handoff` or resuming a prior session
+- You need to recover relevant prior context with `find_sessions`, `read_session`, and memory files
 ## When NOT to Use
-- You still need active file contents for upcoming edits
-- The output is protected or required for immediate modifications
+- You are actively editing files whose raw content must remain exact
+- You are in a short, single-step task that will finish before context pressure appears
-## Tool Hierarchy (v2.2+ Philosophy)
+## Core Principle
-DCP beta shifted to a compress-first approach. Follow this order strictly:
+Prefer **phase-level compression** over reactive cleanup.
-```
+```text
 compress > distill > prune
 ```
-| Tool       | Use When                                               | Cache Impact |
-| ---------- | ------------------------------------------------------ | ------------ |
-| `compress` | A phase of work is complete — collapse the whole phase | Minimal      |
-| `distill`  | Large raw output with extractable value to preserve    | Low          |
-| `prune`    | Pure noise: wrong target, irrelevant, zero value       | Moderate     |
+- **compress**: Best default when a phase is complete
+- **distill**: Use when you must preserve technical detail but can remove bulky raw outputs
+- **prune**: Use only for true noise you are certain will never be needed
-**Why this matters:** Granular `prune` calls trigger cache invalidation on every provider, especially Anthropic. Compressing whole phases instead of surgically deleting individual outputs is cheaper, faster, and more reliable.
+## DCP Tool Usage
-**Never prune because it's convenient. Only prune true noise.**
+### `/dcp compress`
-## DCP Slash Commands (Recommended)
+Use for completed chapters of work (research, implementation wave, review sweep).
-| Command                 | Purpose                                  | When to Use                         |
-| ----------------------- | ---------------------------------------- | ----------------------------------- |
-| `/dcp compress [focus]` | Collapse conversation range into summary | Phase complete, research done       |
-| `/dcp distill [focus]`  | Distill key findings before removing     | Large outputs with valuable details |
-| `/dcp sweep [count]`    | Prune all tools since last user message  | Cleanup pure noise only             |
-| `/dcp context`          | Show token breakdown by category         | Check context usage                 |
-| `/dcp stats`            | Show cumulative pruning stats            | Review efficiency                   |
+- Best for large ranges of now-stable outputs
+- Lowest cognitive overhead on later turns
+- Usually lowest risk of deleting needed details
-## Tool Calls (Fallback)
+### `/dcp distill`
-Use when slash commands aren't suitable:
+Use when raw output is large but details still matter later.
-| Tool       | Purpose                       | When to Use                           |
-| ---------- | ----------------------------- | ------------------------------------- |
-| `compress` | Collapse conversation range   | Phase complete, research done         |
-| `distill`  | Extract key info, then remove | Large outputs with valuable details   |
-| `prune`    | Remove tool outputs (no save) | Noise only — irrelevant, never-needed |
+Include concrete facts in distillation:
-**Note:** Prefer `/dcp compress` slash command over the `compress` tool — better boundary matching.
+- function signatures
+- constraints and assumptions
+- file paths and key decisions
+- verification outcomes
-## Phase-Boundary Compress Triggers
+### `/dcp prune`
-The most effective compress timing is at natural phase endings. For the Compound Engineering loop:
+Use only for irrelevant/noise outputs:
-| Phase ends            | What to compress                         | Keep                              |
-| --------------------- | ---------------------------------------- | --------------------------------- |
-| `/plan` research done | Exploration turns, scout/explore outputs | Plan.md facts, key decisions      |
-| `/ship` wave complete | Implementation turns, read file outputs  | Commit refs, verification results |
-| `/review` complete    | Raw agent outputs (all 5 reviewers)      | Synthesized findings summary      |
-| `/compound` done      | Entire compound loop session             | Observation titles stored         |
-| Session → handoff     | Everything since last compress           | Handoff doc summary               |
+- wrong-target searches
+- failed dead-end exploration no longer needed
+- duplicate or superseded junk output
-**Rule:** Every completed phase is a compress candidate. Don't wait until context is full — compress as chapters close.
+Do **not** prune because output is "long". Length alone is not noise.
-## DCP Auto-Strategies
+## Session Lifecycle Protocol
-DCP runs these automatically at zero LLM cost — don't manually manage these:
+### 1) Start Session
-- **Deduplication** — removes duplicate tool calls (same tool + same args)
-- **Supersede Writes** — removes write inputs when file is later read
-- **Purge Errors** — removes errored tool inputs after 4 turns
+1. Load task spec and essential policy docs only
+2. Check context health (`/dcp context`)
+3. Pull prior work only if needed:
+   - `find_sessions({ query })`
+   - `read_session({ session_id, focus })`
+   - `memory-read({ file })` or `memory-search({ query })`
-## When to Evaluate
+### 2) During Active Work
-**DO evaluate at:**
+- Keep active files readable until edits are done
+- At each natural boundary, evaluate compress candidates
+- Distill high-value technical outputs before removal
-- Start of new turn after receiving user message (best timing — you know what's needed next)
-- Phase boundary: research done, implementation wave done, review done
-- Large tool output just returned that won't be needed for upcoming edits
-- Information superseded by newer, more specific output
+### 3) Pre-Handoff / Closeout
-**DO NOT manage when:**
+1. Compress completed phase ranges
+2. Persist key decisions/learnings to memory (observation or memory-update)
+3. Create concise handoff summary (what changed, what is pending, known risks)
-- Output needed for upcoming file edits (read files stay until edit is done)
-- Contains active file contents you're about to modify
-- Uncertain if you'll need it — defer until certain
-- DCP auto-strategies already handle it
+### 4) Resume Session
-## Protected Content
+1. Rehydrate only relevant context (don’t replay everything)
+2. Validate assumptions against current files/git state
+3. Continue with fresh context budget, not accumulated clutter
-Auto-protected from pruning (v2.1.7+):
+## Context Budget Thresholds
-- `write` and `edit` tool outputs
-- `.env*` files
-- `AGENTS.md`
-- `.opencode/**` config
-- `.beads/**` tasks
-- `package.json`, `tsconfig.json`
+Use these thresholds as operational triggers:
-Don't manually protect what's already protected.
+| Threshold | Interpretation | Required Action |
+| --- | --- | --- |
+| <50k | Healthy start | Keep inputs minimal, avoid unnecessary reads |
+| 50k–100k | Moderate growth | Compress completed phases, keep active files intact |
+| >100k | High pressure | Aggressively compress by phase; distill critical leftovers |
+| >150k | Near capacity | Perform handoff and resume in a fresh session |
-## Distill — Preserve + Remove
+Secondary guardrails:
-Extract high-fidelity knowledge from tool outputs, then remove the raw output. Distillation must be a **complete technical substitute** — capture signatures, types, logic, constraints, everything essential.
+- ~70%: Consolidate and drop stale exploration
+- ~85%: Plan handoff window at next natural break
+- ~95%: Immediate cleanup or restart required
-```typescript
-distill({
-  targets: [
-    {
-      id: "10",
-      distillation:
-        "auth.ts: validateToken(token: string) -> User|null, uses bcrypt 12 rounds, throws on expired tokens",
-    },
-    {
-      id: "11",
-      distillation:
-        "user.ts: interface User { id: string, email: string, permissions: Permission[], status: 'active'|'suspended' }",
-    },
-  ],
-});
-```
+## Phase Boundary Triggers
-## Context Budget Guidelines
+Compress at these boundaries:
-| Phase             | Target  | Action                                            |
-| ----------------- | ------- | ------------------------------------------------- |
-| Starting work     | <50k    | Load only essential AGENTS.md + task spec         |
-| Mid-task          | 50-100k | Compress completed phases, keep active files      |
-| Approaching limit | >100k   | Compress aggressively by phase, distill remaining |
-| Near capacity     | >150k   | Session restart with handoff                      |
+- Research complete → compress exploration + search outputs
+- Implementation wave complete → compress completed read/edit/test cycles
+- Review complete → compress raw reviewer outputs, keep synthesized findings
+- Before `/handoff` → compress everything non-essential since last checkpoint
-At >100k: prefer compressing full phases over distilling individual outputs. The cache cost is lower.
+Rule: **Completed phases should not remain uncompressed for long.**
-## Quick Reference
+## Context Transfer Sources (Cross-Session)
+Use in priority order:
+1. Memory artifacts (`memory-search`, `memory-read`, observations)
+2. Session history (`find_sessions`, `read_session`)
+3. Task tracker state (`br show <id>` when applicable)
+4. Git evidence (`git diff`, `git log`, test output)
+Carry forward decisions and constraints, not every intermediate log.
+## Anti-Patterns
+| Anti-Pattern | Why It Hurts | Correct Pattern |
+| --- | --- | --- |
+| Compressing active work areas (losing precision needed for edits) | Removes exact lines needed for safe edits | Keep active file/tool outputs until edit + verification complete |
+| Pruning tool outputs you'll need later | Forces rework and increases error risk | Distill first, then remove raw output |
+| Not compressing completed exploration phases | Bloats context and degrades later turns | Compress immediately at phase completion |
+| Session handoff without persisting key decisions to memory | Next session loses rationale and constraints | Write observations/memory updates before handoff |
+## Verification
+Check context health: are completed phases compressed? Are active files still readable?
+Before claiming cleanup done, confirm:
+- Active edit targets are still present in readable form
+- Completed phases are compressed/distilled
+- No critical decision exists only in transient tool output
+- Handoff includes next actions and blockers
+## Quick Playbook
+```text
+1) Start turn: /dcp context
+2) Identify completed phase ranges
+3) compress completed ranges
+4) distill high-value technical outputs
+5) prune true noise only
+6) persist key decisions to memory
+7) handoff/resume with focused rehydration
 ```
-HIERARCHY: compress > distill > prune
-TIMING: manage at turn START, not turn END
-PHASE ENDS = compress trigger
-DCP SLASH COMMANDS (preferred):
-/dcp compress [focus]  → Collapse completed phase
-/dcp distill [focus]   → Distill key findings
-/dcp sweep [count]     → Prune pure noise only
-/dcp context           → Show token breakdown
-TOOL CALLS (fallback):
-compress({ topic, content: { startId, endId, summary } })
-distill({ targets: [{ id, distillation }] })
-prune({ ids: [...] })  ← last resort only
-BUDGET: <50k start → 50-100k compress phases → >100k aggressive → >150k restart
-```
+## See Also
+- `memory-system`
+- `compaction`
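
The Context Budget Thresholds table added in version 2.0.0 of this skill reads as a simple lookup from token count to required action. A minimal TypeScript sketch of that lookup follows. The function name `budgetAction` and the action labels are hypothetical and are not part of opencodekit or DCP. The table does not say which row owns the exact boundary values, so this sketch assumes 50k and 100k belong to the "Moderate growth" row and 150k to the "High pressure" row.

```typescript
// Hypothetical labels for the "Required Action" column; not an opencodekit or DCP API.
type BudgetAction =
  | "keep-inputs-minimal"               // <50k: healthy start
  | "compress-completed-phases"         // 50k-100k: moderate growth
  | "compress-aggressively-and-distill" // >100k: high pressure
  | "handoff-and-resume";               // >150k: near capacity

// Map a token count to the action named in the threshold table.
// Boundary ownership (exactly 50k, 100k, 150k) is an assumption of this sketch.
function budgetAction(tokens: number): BudgetAction {
  if (!Number.isFinite(tokens) || tokens < 0) {
    throw new RangeError("token count must be a non-negative finite number");
  }
  // Check the highest threshold first so ">150k" wins over ">100k".
  if (tokens > 150_000) return "handoff-and-resume";
  if (tokens > 100_000) return "compress-aggressively-and-distill";
  if (tokens >= 50_000) return "compress-completed-phases";
  return "keep-inputs-minimal";
}
```

Checking the largest threshold first matters because the `>100k` and `>150k` rows overlap; the stricter row has to take precedence.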

package/dist/template/.opencode/skill/defense-in-depth/SKILL.md CHANGED

@@ -8,6 +8,8 @@ dependencies: []
 # Defense-in-Depth Validation
+> **Replaces** single-layer validation where bad data propagates silently until it causes cryptic failures deep in execution
 ## When to Use
 - A bug is caused by invalid data flowing through multiple layers
@@ -18,6 +20,15 @@ dependencies: []
 - Simple, single-layer validation at an obvious entry point is enough
 - The issue is unrelated to invalid data or boundary checks
+## Anti-Patterns
+| Anti-Pattern | Why It Fails | Instead |
+| --- | --- | --- |
+| Validating only at the entry point (trusting downstream) | Alternate paths and refactors bypass one gate | Add independent checks at each boundary |
+| Duplicating identical validation at every layer | Creates noise without improving safety | Tailor each layer to boundary-specific invariants |
+| Catching and swallowing errors silently | Hides failures and delays detection | Raise explicit errors with actionable context |
+| Mixing validation with business logic | Makes behavior hard to reason about and test | Keep validation checks explicit and separate from core logic |
 ## Overview
 When you fix a bug caused by invalid data, adding validation at one place feels sufficient. But that single check can be bypassed by different code paths, refactoring, or mocks.
@@ -144,3 +155,12 @@ All four layers were necessary. During testing, each layer caught bugs the other
 - Debug logging identified structural misuse
 **Don't stop at one validation point.** Add checks at every layer.
+## Verification
+- Test with invalid input at each layer boundary — each should reject independently.
+- Remove one validation layer — the next layer should still catch the error.
+## See Also
+- **structured-edit** - Reliable read/verify/edit workflow when changing validation code across layers
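
The two Verification bullets added in this diff (each layer rejects invalid input independently, and removing one layer still leaves the next one catching the error) can be illustrated with a small TypeScript sketch. Everything here is hypothetical: the names `parseWorkingDir` and `initWorkspace`, the two-layer split, and the absolute-path invariant are illustrative assumptions, not code from the skill.

```typescript
// Hypothetical two-layer example; names and invariants are illustrative only.

// Layer 1 (entry point): reject malformed input as early as possible.
function parseWorkingDir(raw: unknown): string {
  if (typeof raw !== "string" || raw.trim() === "") {
    throw new Error("entry: working directory must be a non-empty string");
  }
  return raw;
}

// Layer 2 (business logic): check the invariant this layer itself depends on.
// It is a different check from layer 1 (absolute path, not just non-empty),
// yet it still rejects the empty string if layer 1 is bypassed or removed.
function initWorkspace(dir: string): string {
  if (!dir.startsWith("/")) {
    throw new Error(`workspace: expected an absolute path, got "${dir}"`);
  }
  return `initialized ${dir}`;
}
```

Calling `initWorkspace("")` directly, without going through `parseWorkingDir`, corresponds to the "remove one validation layer" check: the second layer still throws with its own message.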

package/dist/template/.opencode/skill/design-system-audit/SKILL.md CHANGED

@@ -1,152 +1,153 @@
 ---
 name: design-system-audit
-description: Use when auditing existing design systems for consistency, documenting undocumented design tokens, identifying design debt, preparing for design system refactoring, or comparing implementation vs design specs
-version: 1.0.0
-tags: [design, code-quality]
+description: Unified design audit workflow for UI pattern analysis, design token auditing, and visual comparison against specs
+version: 2.0.0
+tags: [design, audit, ui]
 dependencies: []
 ---
-# Design System Audit Skill
+# Design System Audit
+> **Replaces** separate, overlapping design review skills — unified design analysis covering UI patterns, design tokens, and visual properties
+Use this skill for end-to-end design analysis across code, screenshots, and design specs.
 ## When to Use
-- Auditing existing design systems for consistency
-- Documenting undocumented design tokens
-- Identifying design debt
-- Preparing for design system refactoring
-- Comparing implementation vs design specs
+- Auditing UI consistency across a component library or app
+- Documenting existing patterns before refactor or migration
+- Extracting/validating design tokens from implementation and visuals
+- Comparing rendered output against mockups/Figma/screenshots
 ## When NOT to Use
-- No design system or token set to analyze.
+- Pure backend work with no user-facing UI
+- One-off micro tweak where system-level consistency is irrelevant
+## Modes
-## Core Workflow
+### UI Pattern Analysis
+Analyze component patterns, identify inconsistencies, document undocumented patterns.
-### Phase 1: Visual Inventory
+Focus areas:
-```
-Analyze application screenshots and create a visual inventory:
-1. COLOR PALETTE
-   - Primary colors (brand)
-   - Secondary colors
-   - Neutral/gray scale
-   - Semantic colors (success, warning, error, info)
-2. TYPOGRAPHY SCALE
-   - Heading sizes (H1-H6)
-   - Body text sizes
-   - Font families used
-   - Font weights observed
-3. SPACING PATTERNS
-   - Common padding values
-   - Common margin values
-   - Gap patterns
-4. COMPONENT VARIANTS
-   - Button styles
-   - Input field styles
-   - Card variations
-5. INCONSISTENCIES DETECTED
-   - Similar but different colors
-   - Inconsistent spacing
-   - Typography variations
-Output as structured JSON design tokens.
-```
+- Component variants and usage drift
+- Repeated interaction patterns (forms, tables, dialogs, navigation)
+- Pattern ownership (where canonical implementation should live)
+- Inconsistent states (hover/focus/disabled/error/loading)
-### Phase 2: Consistency Analysis
+Deliverables:
-```
-Compare visual inventory with code.
-Identify:
-1. Tokens used in code but not in designs
-2. Visual patterns not codified as tokens
-3. Naming inconsistencies
-4. Redundant/duplicate values
-5. Missing semantic tokens
-```
+- Pattern inventory with file references
+- Consolidation candidates
+- Priority fixes by user impact
-## Design Token Structure
-```json
-{
-  "color": {
-    "primitive": {
-      "blue": { "50": "#eff6ff", "500": "#3b82f6", "900": "#1e3a8a" }
-    },
-    "semantic": {
-      "primary": "{color.primitive.blue.500}",
-      "background": { "default": "#f9fafb", "muted": "#f3f4f6" },
-      "text": { "default": "#111827", "muted": "#6b7280" }
-    }
-  },
-  "spacing": { "1": "0.25rem", "2": "0.5rem", "4": "1rem", "8": "2rem" },
-  "typography": {
-    "fontFamily": { "sans": "Inter", "mono": "JetBrains Mono" },
-    "fontSize": { "sm": "0.875rem", "base": "1rem", "lg": "1.125rem" }
-  },
-  "borderRadius": { "sm": "0.125rem", "md": "0.375rem", "lg": "0.5rem" }
-}
-```
+### Design Token Audit
+Extract and verify color, typography, spacing tokens against implementation.
-## Audit Report Template
+Focus areas:
-```markdown
-# Design System Audit Report
+- Color token usage vs one-off literals
+- Typography scale consistency (size/weight/line-height)
+- Spacing/radius/shadow value normalization
+- Semantic token gaps (text-muted, border-subtle, success, warning, etc.)
-**Date:** [Date]
-**Application:** [Name]
+Deliverables:
-## Summary
+- Token map (source of truth + current usage)
+- Drift report (where implementation diverges)
+- Proposed canonical token set and migration order
-- Total unique colors: X (recommended: <20)
-- Total spacing values: X (recommended: 8-12)
-- Typography variants: X
-- Consistency score: X/100
+### Visual Comparison
+Compare rendered output against mockups/specs with specific measurements.
-## Color Audit
+Focus areas:
-| Category   | Count | Issues       |
-| ---------- | ----- | ------------ |
-| Primitives | X     | X duplicates |
-| Semantics  | X     | X missing    |
-| One-offs   | X     | Should be 0  |
+- Pixel-level spacing/sizing mismatches
+- Color differences (hex-level)
+- Typography mismatches (font, size, weight, line height)
+- Layout/responsive behavior across breakpoints
-### Recommendations
+Deliverables:
-1. Consolidate similar colors
-2. Add semantic tokens
-3. Remove one-off colors
+- Spec-vs-implementation discrepancy list
+- Severity-ranked visual defects
+- Concrete fix list with measurable targets
-## Priority Actions
+## Recommended Workflow
-### High Priority
+1. **Scope**
+   - Identify target surfaces (pages/components/states/breakpoints)
+   - Gather artifacts (code paths, screenshots, mockups)
-1. [Action with impact]
+2. **Inventory**
+   - Capture pattern and token inventory from code + visuals
+   - Note duplicates, drift, and undocumented conventions
-### Medium Priority
+3. **Cross-Reference**
+   - Validate findings against design system tokens/components
+   - Distinguish intentional exceptions from accidental drift
-1. [Action]
+4. **Measure**
+   - Record measurable values for each issue:
+     - hex color
+     - px/rem size
+     - spacing/gap/padding values
+     - breakpoint-specific behavior
-### Low Priority (Design Debt)
+5. **Report**
+   - Produce actionable findings with file:line and target value
+   - Group by severity and migration effort
-1. [Action]
+## Audit Output Template
+```markdown
+## Design Audit: [Scope]
+### Findings
+1. [Severity] [Issue]
+   - File: `path/to/file.tsx:123`
+   - Current: `#6B7280`, `14px`, `gap: 10px`
+   - Expected: `var(--color-text-muted)`, `13px`, `gap: 8px`
+   - Impact: [consistency/accessibility/brand mismatch]
+### Token Drift Summary
+- Colors: X one-offs, Y missing semantic mappings
+- Typography: X non-scale values
+- Spacing: X non-token values
+### Priority Actions
+- P0: [high impact, low effort]
+- P1: [high impact, medium effort]
+- P2: [design debt cleanup]
 ```
+## Anti-Patterns
+| Anti-Pattern | Why It Hurts | Better Approach |
+| --- | --- | --- |
+| Vague feedback ("looks good") instead of specific measurements | Not actionable, impossible to verify | Report concrete values (hex, px/rem, exact delta) |
+| Not checking existing design tokens before proposing new values | Creates token sprawl and inconsistency | Map to existing tokens first; add new tokens only when justified |
+| Auditing in isolation without cross-referencing the design system | Flags intentional patterns as bugs | Validate each finding against canonical component/token sources |
+| Reporting visual issues without checking responsive breakpoints | Misses major UX regressions on mobile/tablet | Verify each issue across defined breakpoints and states |
+## Verification
+After audit: every finding should reference a specific file:line and measurable value (hex color, px size, etc.)
+Minimum quality gate:
+- Each finding has location + current value + expected value
+- Each recommendation maps to token/system guidance
+- Responsive states checked for impacted components
 ## Storage
-Save audit reports to `.opencode/memory/design/audits/`
-Save design tokens to `.opencode/memory/design/tokens/`
+- Save audits to `.opencode/memory/design/audits/`
+- Save extracted token snapshots to `.opencode/memory/design/tokens/`
-## Related Skills
+## See Also
-| Need                 | Skill                 |
-| -------------------- | --------------------- |
-| Aesthetic principles | `frontend-design`     |
-| Implement components | `mockup-to-code`      |
-| Accessibility        | `accessibility-audit` |
+- `mockup-to-code`
+- `frontend-design`
+- `accessibility-audit`
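
The "Minimum quality gate" introduced in the new Verification section requires each finding to carry a location, a current value, and an expected value. A minimal TypeScript sketch of that check follows. The `Finding` shape and the `gateFailures` function are hypothetical, and treating a location as valid only when it ends in `:<line number>` is an assumption based on the `path/to/file.tsx:123` example in the Audit Output Template.

```typescript
// Hypothetical shape for one audit finding; field names are illustrative.
interface Finding {
  file: string;     // location in "path:line" form, e.g. "path/to/file.tsx:123"
  current: string;  // measured value, e.g. "#6B7280" or "gap: 10px"
  expected: string; // target value, e.g. "var(--color-text-muted)" or "gap: 8px"
}

// Return the reasons a finding fails the gate; an empty array means it passes.
function gateFailures(finding: Finding): string[] {
  const failures: string[] = [];
  // Assumption: a usable location ends with ":<line number>".
  if (!/:\d+$/.test(finding.file)) failures.push("missing file:line location");
  if (finding.current.trim() === "") failures.push("missing current value");
  if (finding.expected.trim() === "") failures.push("missing expected value");
  return failures;
}
```

Returning the list of reasons, instead of a single boolean, keeps the audit report actionable: a rejected finding says what it still needs.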

package/dist/template/.opencode/skill/dispatching-parallel-agents/SKILL.md CHANGED

@@ -8,6 +8,8 @@ dependencies: []
 # Dispatching Parallel Agents
+> **Replaces** sequential investigation of independent failures — one-by-one debugging when problems don't share state
 ## When to Use
 - 3+ independent failures across different subsystems or test files
@@ -178,3 +180,9 @@ From debugging session (2025-10-03):
 - All investigations completed concurrently
 - All fixes integrated successfully
 - Zero conflicts between agent changes
+## See Also
+- `agent-teams` — for coordinated parallel work (not just debugging)
+- `swarm-coordination` — for large-scale task decomposition
+- `executing-plans` — for plan-driven parallel execution

package/dist/template/.opencode/skill/executing-plans/SKILL.md CHANGED

@@ -8,6 +8,7 @@ dependencies: [writing-plans]
 # Executing Plans
+> **Replaces** unstructured implementation where the agent jumps between tasks without review checkpoints or batch control
 ## When to Use
 - A complete implementation plan exists and you need to execute it in batches with checkpoints
@@ -221,3 +222,9 @@ Use numeric batch numbers, not task names, for predictable reference.
 - Reference skills when plan says to
 - Between batches: just report and wait
 - Stop when blocked, don't guess
+## See Also
+- `writing-plans` - Create detailed, zero-ambiguity implementation plans before execution
+- `swarm-coordination` - Coordinate parallel execution when many independent tasks can run concurrently
+- `verification-before-completion` - Run final verification gates before claiming completion