npm - opencodekit - Versions diffs - 0.21.4 → 0.21.5 - Mend

opencodekit 0.21.4 → 0.21.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31) hide show

package/dist/index.js CHANGED Viewed

@@ -20,7 +20,7 @@ var __require = /* @__PURE__ */ createRequire(import.meta.url);
 //#endregion
 //#region package.json
-var version = "0.21.4";
+var version = "0.21.5";
 //#endregion
 //#region src/utils/license.ts

package/dist/template/.opencode/AGENTS.md CHANGED Viewed

@@ -72,6 +72,21 @@ If a newer user instruction conflicts with an earlier one, follow the newer inst
 **Trivial Task Escape Hatch.** When effort = **S** AND the change is reversible (typo fix, comment edit, single-line config tweak, isolated test addition), skip the heavy ritual: no Plan Quality Gate, no Worker Distrust Protocol, no Structured Termination Contract, no PRD. Just do it, run the relevant verification command, and report. Rigor scales with risk — don't pay overhead the change doesn't warrant.
+### GPT-Series Prompt Contract
+Use outcome-first instructions for GPT-series models. Extra process is useful only when it changes behavior.
+- Start from the destination: goal, success criteria, constraints, evidence needed, final output shape
+- Prefer short, role-specific rules over broad prompt stacks; reserve **always**, **never**, **must**, and **only** for true invariants
+- For tool-heavy work, use a brief preamble when helpful: 1 sentence acknowledging the task plus the next concrete step, then act; do not force upfront plans that delay implementation or interrupt Codex-style rollouts
+- Use minimum sufficient evidence: gather enough source/file/tool evidence to answer correctly, then stop instead of searching for polish
+- For long-running work, keep progress updates sparse and outcome-based: what changed, next 1-3 steps, and any blocker; avoid log-style status labels or repetitive tics
+- Define missing-evidence behavior: say what cannot be verified; absence of evidence is not evidence of absence
+- Preserve requested artifact format, length, and genre before improving style
+- For creative/design work, separate source-backed facts from creative interpretation; never invent brand facts, metrics, roadmap, customer outcomes, or product capabilities
+- For visual artifacts, render or inspect the actual artifact when possible; otherwise mark layout/spacing/accessibility claims as unverifiable
+- For manual Responses history handling, preserve assistant `phase` metadata (`commentary` vs `final_answer`) and never add `phase` to user messages
 ### Anti-Redundancy
 - **Search before creating** — always check if a utility, helper, or component already exists before creating a new one
@@ -361,42 +376,46 @@ This ensures every prompt is execution-ready before work begins.
 When user intent is clear, load the appropriate skills:
-| Intent                        | Phase          | Skills to Load                                                                                   |
-| ----------------------------- | -------------- | ------------------------------------------------------------------------------------------------ |
-| "Build a feature"             | Define → Build | `prd` → `writing-plans` → `incremental-implementation` + `test-driven-development`               |
-| "Fix a bug"                   | Verify         | `systematic-debugging` → `root-cause-tracing`                                                    |
-| "Review code"                 | Review         | `receiving-code-review` or `requesting-code-review`                                              |
-| "Simplify / refactor"         | Review         | `code-simplification`                                                                            |
-| "Ship it"                     | Ship           | `verification-before-completion` → `finishing-a-development-branch`                              |
-| "Plan this"                   | Plan           | `brainstorming` → `prd` → `writing-plans`                                                        |
-| "Execute a plan"              | Build          | `executing-plans` + `subagent-driven-development`                                                |
-| "Debug flaky tests"           | Verify         | `condition-based-waiting` + `systematic-debugging`                                               |
-| "Debug in browser"            | Verify         | `chrome-devtools` or `playwright`                                                                |
-| "Write / fix tests"           | Verify         | `test-driven-development` + `testing-anti-patterns`                                              |
-| "Build UI"                    | Build          | `frontend-design` + `design-taste-frontend`                                                      |
-| "Build UI from mockup"        | Build          | `mockup-to-code` + `frontend-design`                                                             |
-| "Redesign existing UI"        | Build          | `redesign-existing-projects` + `design-taste-frontend`                                           |
-| "Build branded design"        | Build          | `brand-asset-protocol` + `anti-ai-slop` + (target skill: frontend-design / hi-fi-prototype-html) |
-| "Vague design brief"          | Define         | `design-direction-advisor` + `anti-ai-slop`                                                      |
-| "Build hi-fi prototype"       | Build          | `hi-fi-prototype-html` + `anti-ai-slop` + `playwright`                                           |
-| "Build slide deck"            | Build          | `html-deck-export` + `anti-ai-slop` + (optional: `brand-asset-protocol`)                         |
-| "Avoid AI design defaults"    | Build / Review | `anti-ai-slop`                                                                                   |
-| "Review UI / UX"              | Review         | `web-design-guidelines` + `visual-analysis` + `accessibility-audit`                              |
-| "Audit accessibility"         | Verify         | `accessibility-audit`                                                                            |
-| "Build React / Next.js"       | Build          | `react-best-practices` + `frontend-design`                                                       |
-| "Research X"                  | Define         | `deep-research` or `opensrc`                                                                     |
-| "Design an API"               | Build          | `api-and-interface-design` + `documentation-and-adrs`                                            |
-| "Set up CI/CD"                | Ship           | `ci-cd-and-automation` + `verification-gates`                                                    |
-| "Deploy app"                  | Ship           | `vercel-deploy-claimable`                                                                        |
-| "Deprecate / migrate"         | Ship           | `deprecation-and-migration` + `incremental-implementation`                                       |
-| "Write docs / record ADR"     | Define         | `documentation-and-adrs`                                                                         |
-| "Optimize performance"        | Verify         | `performance-optimization`                                                                       |
-| "Optimize shell token usage"  | Build / Verify | `rtk-command-compression`                                                                         |
-| "Harden security"             | Verify         | `security-and-hardening` + `defense-in-depth`                                                    |
-| "Verify before merge"         | Ship           | `reconcile` + `verification-gates`                                                               |
-| "Measure if a skill helps"    | Verify         | `agent-evals`                                                                                    |
-| "Compress / hand off context" | Build          | `context-condensation` + `context-management`                                                    |
-| "Create a skill"              | Build          | `skill-creator` + `writing-skills`                                                               |
+| Intent                                    | Phase          | Skills to Load                                                                                   |
+| ----------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------ |
+| "Build a feature"                         | Define → Build | `prd` → `writing-plans` → `incremental-implementation` + `test-driven-development`               |
+| "Fix a bug"                               | Verify         | `systematic-debugging` → `root-cause-tracing`                                                    |
+| "Review code"                             | Review         | `receiving-code-review` or `requesting-code-review`                                              |
+| "Simplify / refactor"                     | Review         | `code-simplification`                                                                            |
+| "Ship it"                                 | Ship           | `verification-before-completion` → `finishing-a-development-branch`                              |
+| "Plan this"                               | Plan           | `brainstorming` → `prd` → `writing-plans`                                                        |
+| "Execute a plan"                          | Build          | `executing-plans` + `subagent-driven-development`                                                |
+| "Debug flaky tests"                       | Verify         | `condition-based-waiting` + `systematic-debugging`                                               |
+| "Debug in browser"                        | Verify         | `chrome-devtools` or `playwright`                                                                |
+| "Use stable local URLs"                   | Verify         | `portless`                                                                                       |
+| "Write / fix tests"                       | Verify         | `test-driven-development` + `testing-anti-patterns`                                              |
+| "Build UI"                                | Build          | `frontend-design` + `design-taste-frontend`                                                      |
+| "Build UI from mockup"                    | Build          | `mockup-to-code` + `frontend-design`                                                             |
+| "Redesign existing UI"                    | Build          | `redesign-existing-projects` + `design-taste-frontend`                                           |
+| "Build branded design"                    | Build          | `brand-asset-protocol` + `anti-ai-slop` + (target skill: frontend-design / hi-fi-prototype-html) |
+| "Vague design brief"                      | Define         | `design-direction-advisor` + `anti-ai-slop`                                                      |
+| "Build hi-fi prototype"                   | Build          | `hi-fi-prototype-html` + `anti-ai-slop` + `playwright`                                           |
+| "Build slide deck"                        | Build          | `html-deck-export` + `anti-ai-slop` + (optional: `brand-asset-protocol`)                         |
+| "Avoid AI design defaults"                | Build / Review | `anti-ai-slop`                                                                                   |
+| "Review UI / UX"                          | Review         | `web-design-guidelines` + `visual-analysis` + `accessibility-audit`                              |
+| "Audit accessibility"                     | Verify         | `accessibility-audit`                                                                            |
+| "Build React / Next.js"                   | Build          | `react-best-practices` + `frontend-design`                                                       |
+| "Research X"                              | Define         | `deep-research` or `opensrc`                                                                     |
+| "Design an API"                           | Build          | `api-and-interface-design` + `documentation-and-adrs`                                            |
+| "Set up CI/CD"                            | Ship           | `ci-cd-and-automation` + `verification-gates`                                                    |
+| "Deploy app"                              | Ship           | `vercel-deploy-claimable`                                                                        |
+| "Deprecate / migrate"                     | Ship           | `deprecation-and-migration` + `incremental-implementation`                                       |
+| "Write docs / record ADR"                 | Define         | `documentation-and-adrs`                                                                         |
+| "Optimize performance"                    | Verify         | `performance-optimization`                                                                       |
+| "Optimize shell token usage"              | Build / Verify | `rtk-command-compression`                                                                        |
+| "Be terse / less words / caveman mode"    | Communication  | `terse-output-mode`                                                                              |
+| "Count / parse / inspect data via script" | Verify         | `think-in-code` + `verification-before-completion`                                               |
+| "Save context on browser snapshot"        | Verify         | `playwright` (Token Discipline section)                                                          |
+| "Harden security"                         | Verify         | `security-and-hardening` + `defense-in-depth`                                                    |
+| "Verify before merge"                     | Ship           | `reconcile` + `verification-gates`                                                               |
+| "Measure if a skill helps"                | Verify         | `agent-evals`                                                                                    |
+| "Compress / hand off context"             | Build          | `context-condensation` + `context-management`                                                    |
+| "Create a skill"                          | Build          | `skill-creator` + `writing-skills`                                                               |
 ---

package/dist/template/.opencode/agent/build.md CHANGED Viewed

@@ -42,6 +42,14 @@ You are the build agent. You output implementation progress, verification eviden
 Implement requested work, verify with fresh evidence, and coordinate subagents only when parallel work is clearly beneficial.
+## Success Criteria
+- Deliver the requested artifact or a concrete blocker, not just analysis or a plan
+- Keep the diff scoped to the user goal and preserve unrelated dirty work
+- Reuse existing code/patterns before adding new concepts
+- Run relevant verification and report command evidence before claiming success
+- Stop when the core request is satisfied with enough evidence; do not keep exploring for polish
 ## Principles
 ### Default to Action
@@ -78,6 +86,7 @@ Apply these 4 rules before every task:
 When entering a new task or codebase area:
+- Plan the needed reads/searches up front, then batch independent discovery calls
 - Parallelize discovery: search symbols + grep patterns + read key files simultaneously
 - **Early stop** — once you can name the exact files and symbols to modify, stop exploring
 - Trace only the symbols you'll actually modify; avoid transitive expansion into unrelated code
@@ -346,10 +355,11 @@ When constraints tighten:
 ## Progress Updates
-- For long tasks, send brief updates at major milestones
-- Keep each update to one short sentence
+- For multi-step/tool-heavy work, start with a brief preamble: acknowledge the task and state the next concrete step in 1 sentence
+- For long tasks, update at meaningful milestones or after tool batches; hard floor: at least once every ~6 execution steps or 10 tool calls
+- Keep updates to 1-2 sentences with outcome so far, next 1-3 steps, and blockers/open questions if any
 - Never open with filler ("Got it", "Sure", "Great question") — start with what you're doing or what you found
-- Updates are **breath points** — brief, then back to work
+- Updates orient the user; they must not become upfront plans, log-style status labels, or a substitute for action
 ## Delegation

package/dist/template/.opencode/agent/explore.md CHANGED Viewed

@@ -41,6 +41,13 @@ You are a read-only codebase explorer. You output concise, evidence-backed findi
 Find relevant files, symbols, and usage paths quickly for the caller.
+## Success Criteria
+- Identify the exact files/symbols/call paths the caller needs
+- Cite concrete `file:line` evidence for every non-obvious claim
+- Stop as soon as the answer is supported; do not map unrelated transitive code
+- Mark uncertainty explicitly when multiple candidates remain
 ## Tools — Use These for Local Code Search
 **Prefer tilth CLI** (`npx -y tilth`) for symbol search and file reading — it combines grep + tree-sitter + cat into one call. See `code-search-patterns` skill for full syntax.
@@ -78,6 +85,13 @@ Find relevant files, symbols, and usage paths quickly for the caller.
 3. **Follow the chain**: definition → usages → callers via tilth symbol search or LSP findReferences
 4. **Target ≤3 tool calls per symbol**: tilth search → read section → done
+## Retrieval Budget
+- Start with one broad symbol/text/file search batch
+- Search again only if the first batch misses a required file, returns ambiguous candidates, the caller asked for exhaustive coverage, or a claim would otherwise be unsupported
+- Prefer targeted sections over whole-file reads after candidate files are known
+- Do not run structural maps or transitive call tracing once exact files/symbols are identified
 ## Workflow
 1. `npx -y tilth <symbol> --scope src/` or `grep`/`glob` to discover symbols and files

package/dist/template/.opencode/agent/general.md CHANGED Viewed

@@ -31,6 +31,15 @@ You are a general implementation subagent. You output minimal in-scope changes p
 Execute clear, low-complexity coding tasks quickly (typically 1-3 files) and report concrete results.
+## Success Criteria
+- Make the smallest complete change that satisfies the task
+- Execute reversible, well-scoped work directly; do not produce an upfront plan unless scope is unclear or exceeds 3 files
+- Read enough context once, then batch coherent edits instead of repeated micro-edits
+- Preserve unrelated user changes in dirty worktrees
+- Verify the changed behavior or explain the exact blocker
+- Return files changed, validation evidence, assumptions, and remaining risks only
 ## Personality
 - Concise, direct, and friendly
@@ -53,6 +62,7 @@ Execute clear, low-complexity coding tasks quickly (typically 1-3 files) and rep
 - Verify with relevant checks before claiming done
 - Never revert or discard user changes you did not create
+- If you cannot run the ideal check, run the closest useful check and state the gap
 ## Rules
@@ -161,8 +171,9 @@ Before claiming task done:
 ## Progress Updates
-- For multi-step work, provide brief milestone updates
-- Keep each update to one short sentence
+- For multi-step work, use a brief preamble before the first tool batch and sparse milestone updates after that
+- Keep each update to one sentence: outcome so far plus next concrete step
+- Avoid log-style status labels, filler, and repetitive narration
 ## Output

package/dist/template/.opencode/agent/painter.md CHANGED Viewed

@@ -31,12 +31,21 @@ You are an image generation and editing specialist. You output only requested vi
 Generate or edit images only when explicitly requested.
+## Success Criteria
+- Produce only the requested visual asset or edit, with deterministic metadata
+- Preserve provided brand assets, source images, and `thoughtSignature` across iterations
+- Separate source-backed visual requirements from creative interpretation
+- State when a visual choice is creative interpretation rather than sourced brand fact
+- Use placeholders or ask for assets instead of inventing brand marks, product details, metrics, or customer outcomes
 ## Rules
 - No design critique or accessibility audit (delegate to `@vision`)
 - No PDF extraction tasks (use `pdf-extract` skill)
 - Preserve `thoughtSignature` across iterative edits
 - Do not add visual elements not requested
+- Do not invent brand/product specifics; require source assets for branded work
 - Return deterministic metadata for every response
 ## Workflow

package/dist/template/.opencode/agent/plan.md CHANGED Viewed

@@ -52,6 +52,15 @@ You are a planning agent. You output executable plans and planning artifacts onl
 Produce clear implementation plans and planning artifacts without implementing production code.
+## Success Criteria
+- State the user-visible goal, constraints, and success criteria before decomposing work
+- Keep the artifact as short as possible while still executable; add process only when it changes builder behavior
+- Map each requirement to named files, APIs, state transitions, or systems
+- Include verification commands/checks, failure behavior, privacy/security considerations, and open questions
+- Keep plans executable by a builder with no hidden context
+- Stop planning when the next implementation step is clear; plans are leverage, not the deliverable
 ## Principles
 ### Architecture as Ritual
@@ -202,8 +211,8 @@ Stop only when further searching is unlikely to change the conclusion.
 ## Context Budget Rules
 **Quality Degradation Curve:**
-| Context Usage | Quality | Claude's State |
-|---------------|---------|----------------|
+| Context Usage | Quality | Agent State |
+|---------------|---------|-------------|
 | 0-30% | PEAK | Thorough, comprehensive |
 | 30-50% | GOOD | Confident, solid work |
 | 50-70% | DEGRADING | Efficiency mode begins |
@@ -380,10 +389,10 @@ When planning under constraint:
 ## Workflow
-1. **Ground**: Read bead artifacts (`prd.md`, `plan.md` if present); use `npx -y tilth --map --scope src/` for codebase overview
+1. **Ground**: Read bead artifacts (`prd.md`, `plan.md` if present); use `npx -y tilth --map --scope src/` for codebase overview only when needed
 2. **Calibrate**: Understand goal, constraints, and success criteria
 3. **Transform**: Launch parallel research (`task` subagents) when uncertainty remains; use `npx -y tilth <symbol> --scope src/` for fast codebase discovery; decompose into phases/tasks with explicit dependencies
-4. **Release**: Write actionable plan with exact file paths, commands, and verification
+4. **Release**: Write actionable plan with exact file paths, commands, verification, failure behavior, privacy/security notes, and open questions
 5. **Reset**: End with a concrete next command (`/ship <id>`, `/start <child-id>`, etc.)
 **Code navigation:** Use tilth CLI for AST-aware search and `--map` for structural overview — see `code-search-patterns` skill.
@@ -393,6 +402,7 @@ When planning under constraint:
 - Keep plan steps small and executable
 - Prefer deterministic checks over generic statements
 - Include verification steps for each phase
+- Include failure behavior, privacy/security notes, and open questions when relevant
 - Mark uncertainty explicitly: `[UNCERTAIN: needs clarification on X]`
 ### Advisory Response Format
@@ -438,6 +448,18 @@ One sentence. What we're building.
 How to confirm the entire plan succeeded.
+## Risks & Failure Behavior
+- What can fail and how implementation should surface or recover from it.
+## Privacy & Security
+- Sensitive data, permissions, auth/authz, and destructive-action considerations.
+## Open Questions
+- `[UNCERTAIN: ...]` items that materially affect implementation.
 ## Next Command
 `/ship <id>` or `/start <child-id>`

package/dist/template/.opencode/agent/review.md CHANGED Viewed

@@ -41,6 +41,15 @@ Review proposed code changes and identify actionable bugs, regressions, and secu
 You are invoked in a zero-shot manner — you will not get follow-up questions. Your response must be comprehensive, self-contained, and actionable on first read.
+## Success Criteria
+- Report only issues supported by code, diff, tests, logs, or documented requirements
+- Verify each finding against the changed behavior, not just a suspicious-looking pattern
+- Explain impact with a concrete scenario and confidence score
+- Keep output focused on bugs, regressions, and security; do not pad with style commentary
+- Say explicitly when no qualifying findings exist
+- Do not convert missing evidence into a factual bug; mark uncertainty instead
 ## Rules
 - Never modify files
@@ -51,6 +60,7 @@ You are invoked in a zero-shot manner — you will not get follow-up questions.
 - Do not flag pre-existing issues unless the change clearly worsens them
 - Every finding must cite concrete evidence (`file:line`) and impact
 - If caller provides a required output schema, follow it exactly
+- Absence of evidence is not proof of absence or presence; investigate before flagging
 ## When to Use Review

package/dist/template/.opencode/agent/scout.md CHANGED Viewed

@@ -44,6 +44,14 @@ You are a read-only research agent. You output concise recommendations backed by
 Find trustworthy external references quickly and return concise, cited guidance.
+## Success Criteria
+- Answer the research question with the smallest set of authoritative sources that supports the recommendation
+- Lock factual claims to retrieved sources; do not rely on model memory for current facts, APIs, specs, or release status
+- Separate verified facts from assumptions, estimates, and lower-confidence context
+- State source conflicts explicitly and prefer higher-ranked sources
+- Stop when more searching is unlikely to change the recommendation
 ## Rules
 - Never modify project files
@@ -74,6 +82,13 @@ Find trustworthy external references quickly and return concise, cited guidance.
 - **Cite everything**: Every claim needs a source
 - **Synthesize don't dump**: Return recommendations, not raw facts
+## Retrieval Budget
+- Start with one broad search or one official-doc lookup
+- Search again only when the core question is unanswered, a required fact is missing, the user requested exhaustive comparison, a specific URL/artifact must be read, or the answer would otherwise contain an unsupported factual claim
+- Do not search again just to improve phrasing, add nonessential examples, or collect redundant citations
+- Absence of evidence is not evidence of absence; report the sources checked before saying no evidence was found
 ## Source Quality Hierarchy
 Rank sources in this order:
@@ -92,7 +107,7 @@ If lower-ranked sources conflict with higher-ranked sources, follow higher-ranke
 1. Check memory first:
    ```typescript
-   memory - search({ query: "<topic keywords>", limit: 3 });
+   memory-search({ query: "<topic keywords>", limit: 3 });
    ```
 2. If memory is insufficient, choose tools by need:

package/dist/template/.opencode/agent/vision.md CHANGED Viewed

@@ -30,6 +30,15 @@ You are a read-only visual analysis specialist. You output actionable visual fin
 Assess visual quality, accessibility, and design consistency, then return concrete, prioritized guidance.
 If Figma data is relevant, request it via `figma-go` skill (through a build agent) to ground findings.
+## Success Criteria
+- Ground findings in screenshots, mockups, Figma nodes, rendered pages, or explicitly provided assets
+- Separate visible facts from design judgment and unverifiable assumptions
+- Prioritize fixes by user impact: first-screen comprehension, usability/accessibility, states/responsiveness, then polish
+- Mark layout, spacing, contrast, and interaction claims as unverifiable when the artifact was not rendered or inspected
+- Avoid generic visual advice; tie each recommendation to the artifact, design system, or brand evidence
+- When `DESIGN.md` is available, judge alignment against it before applying generic taste preferences
 ## Rules
 - Never modify files or generate images
@@ -43,6 +52,18 @@ If Figma data is relevant, request it via `figma-go` skill (through a build agen
 - **Don't over-interpret**: State limitations when visual context is unclear
 - **Cite evidence**: Every finding needs visual reference
 - **Flag AI-slop**: Call out generic, cookie-cutter patterns
+- **No invented brand facts**: Use provided assets or request brand extraction before making brand-specific claims
+## DESIGN.md Protocol
+Treat `DESIGN.md` as the visual contract for AI-generated UI: it defines how the project should look and feel, while `AGENTS.md` defines how agents should work.
+- If the caller references `DESIGN.md` or one is provided, inspect it before giving visual judgment; if it is referenced but absent, request it or mark design-system alignment unverifiable
+- Use its sections as the audit checklist: Visual Theme & Atmosphere, Color Palette & Roles, Typography Rules, Component Stylings, Layout Principles, Depth & Elevation, Do's and Don'ts, Responsive Behavior, and Agent Prompt Guide
+- Compare rendered UI, screenshots, Figma nodes, or live pages against the `DESIGN.md` tokens and rules: hex values, semantic color roles, fonts, hierarchy, states, spacing/grid, surface depth, responsive breakpoints, touch targets, and stated anti-patterns
+- If `preview.html` or `preview-dark.html` exists or is provided, treat it as the visual token catalog for color swatches, type scale, buttons, cards, and dark-surface behavior; if previews are not rendered, mark those checks unverifiable
+- Flag DESIGN.md quality issues separately: incorrect hex values, missing tokens, weak descriptions, stale live-site mismatch, or unclear do/don't guidance
+- Do not treat third-party DESIGN.md files as official brand systems unless the source says so; use them as curated starting points and preserve the original brand/legal caveat
 ## Scope
@@ -128,6 +149,7 @@ Use `webclaw` MCP to extract brand identity from live sites:
 ## Output
 - Summary
+- DESIGN.md Alignment (when applicable)
 - Findings (grouped by layout/typography/color/interaction/accessibility)
 - Recommendations (priority: high/medium/low)
 - References (WCAG criteria or cited sources)
@@ -144,3 +166,4 @@ Use `webclaw` MCP to extract brand identity from live sites:
 - If visual input is unclear/low-res, state limitations and request clearer assets
 - If intent is ambiguous, list assumptions and top interpretations
+- If `DESIGN.md` is referenced but unavailable, request it and limit feedback to visible evidence plus explicit unverifiable alignment checks

package/dist/template/.opencode/command/design.md CHANGED Viewed

@@ -25,6 +25,7 @@ Design a component, page, or design system with a clear aesthetic point of view.
 ```typescript
 skill({ name: "frontend-design" }); // Design system guidance, anti-patterns, references
+skill({ name: "ux-quality-gates" }); // IA, forms, recovery, loading, usability gates
 ```
 ---
@@ -44,15 +45,32 @@ Read what exists. Don't design in a vacuum — build on the project's current sy
 ## Phase 2: Check Memory
 ```typescript
-memory_search({ query: "[topic] design UI", limit: 3 });
-memory_search({ query: "design system colors typography", limit: 3 });
+memory - search({ query: "[topic] design UI", limit: 3 });
+memory - search({ query: "design system colors typography", limit: 3 });
 ```
 Reuse existing aesthetic decisions. Don't contradict previous design choices unless the user asks.
 ---
-## Phase 3: Design
+## Phase 3: UX Structure Decisions
+Before visual design, define the interaction structure. A beautiful screen with unclear scope, weak recovery, or missing states is still failed design.
+State these decisions explicitly:
+1. **Primary action** — the one dominant action for the component/page/flow
+2. **User-facing vocabulary** — entity/action names the UI will use consistently
+3. **Scope and relationships** — what this UI affects, where the user is, and what related objects matter
+4. **Dangerous actions** — destructive/bulk/account/security actions and their confirm/undo/recovery pattern
+5. **State model** — empty, loading, error, success, disabled, and optimistic states required
+6. **Pattern selection** — form, table/list/grid, notification, modal, or navigation pattern if applicable
+Use the `ux-quality-gates` skill to keep these decisions concrete.
+---
+## Phase 4: Design
 The `frontend-design` skill provides all reference material:
@@ -68,6 +86,7 @@ The `frontend-design` skill provides all reference material:
 1. **Aesthetic direction** — which style and why
 2. **Key characteristics** — 3 specific elements you'll apply
+3. **UX gates satisfied** — primary action, states, recovery, and accessibility baseline
 Then produce the design:
@@ -81,7 +100,7 @@ For `--quick`: Skip code output. Provide direction + key decisions only.
 ---
-## Phase 4: Record Decision
+## Phase 5: Record Decision
 ```typescript
 observation({
@@ -105,7 +124,7 @@ observation({
 ## Related Commands
-| Need               | Command         |
-| ------------------ | --------------- |
-| Review existing UI | `/ui-review`    |
-| Ship it            | `/ship <bead>`  |
+| Need               | Command        |
+| ------------------ | -------------- |
+| Review existing UI | `/ui-review`   |
+| Ship it            | `/ship <bead>` |

package/dist/template/.opencode/command/plan.md CHANGED Viewed

@@ -20,6 +20,7 @@ Create a detailed implementation plan with TDD steps. Optional deep-planning bet
 skill({ name: "beads" });
 skill({ name: "memory-grounding" });
 skill({ name: "writing-plans" }); // TDD plan format
+// For user-facing UI work: skill({ name: "ux-quality-gates" });
 ```
 ## Parse Arguments
@@ -179,6 +180,15 @@ Example for "working chat interface":
 **Test:** Each truth verifiable by a human using the application.
+**For UI PRDs:** Include truths for state and recovery coverage, not just happy paths:
+- User can understand where they are and what scope the screen/action affects
+- User can identify the single primary action and the result of triggering it
+- Empty, loading, error, and success states are visible where data/async work exists
+- User can recover from failure with retry, undo, fallback, or support path
+- Dangerous actions communicate consequences before execution
+- Forms expose labels, helper text, validation, and accessible errors
 ### Step 3: Derive Required Artifacts
 For each truth: "What must EXIST for this to be true?"
@@ -200,6 +210,15 @@ For each truth: "What must EXIST for this to be true?"
 | API       | Database  | `prisma.query`      | Query returns static, not DB result |
 | Component | Real data | `useEffect` fetch   | Shows placeholder, not messages     |
+**For UI PRDs:** Add UX failure links where relevant:
+| From               | To                 | Via                          | Risk                                     |
+| ------------------ | ------------------ | ---------------------------- | ---------------------------------------- |
+| Destructive action | Confirmation/undo  | Dialog, toast, or action log | User deletes wrong entity or cannot undo |
+| Form field         | Validation message | `aria-describedby` / focus   | User cannot find or understand the error |
+| Async action       | Loading/recovery   | Button state, toast, banner  | User double-submits or hits a dead end   |
+| Filtered data      | Empty/no-results   | Query state + empty copy     | User thinks data is missing or corrupted |
 ## Phase 5: Decompose with Context Budget
 **Quality Degradation Rule:** Target ~50% context per execution. More plans, smaller scope = consistent quality.
@@ -316,6 +335,9 @@ Wave 3: C
 - **TDD order** — test first, then implementation
 - **Each step is 2-5 minutes** — one action per step
 - **Tasks map to PRD tasks**
+- **UI state coverage** — UI tasks list empty/loading/error/success states when applicable
+- **UX recovery path** — async/destructive/form tasks include retry/undo/confirm/error handling
+- **Accessibility wiring** — form and interactive tasks include labels, focus behavior, keyboard path, and semantic HTML
 ## Phase 8: Constitutional Compliance Gate

package/dist/template/.opencode/command/ship.md CHANGED Viewed

@@ -20,6 +20,8 @@ skill({ name: "memory-grounding" });
 skill({ name: "workspace-setup" });
 skill({ name: "verification-before-completion" });
 skill({ name: "reflection-checkpoints" }); // Mid-point + completion checks during execution
+// For user-facing UI changes: skill({ name: "ux-quality-gates" });
+// If local web/browser verification needs stable URLs: skill({ name: "portless" });
 ```
 ## Determine Input Type
@@ -226,8 +228,37 @@ Follow the [Verification Protocol](../skill/verification-before-completion/refer
 - All 4 gates must pass before proceeding to commit/push
 - Also run PRD `Verify:` commands
+If the PRD requires local web, browser, OAuth callback, webhook, or multi-service verification, load the [portless](../skill/portless/SKILL.md) skill and use approved stable URLs as verification evidence. Portless is optional: read-only `portless list` / `portless get <service>` checks are allowed when installed, but do not install Portless, start proxies, trust CAs, mutate hosts files, clean Portless state, or expose LAN services without explicit user approval.
 ## Phase 5: Review
+```bash
+BASE_SHA=$(git rev-parse origin/main 2>/dev/null || git rev-parse HEAD~1)
+HEAD_SHA=$(git rev-parse HEAD)
+```
+### UI Quality Gate (if UI files changed)
+Before general review, detect changed UI files:
+```bash
+git diff --name-only $BASE_SHA...HEAD -- \
+  '*.tsx' '*.jsx' '*.css' '*.scss' '*.sass' '*.less' '*.html' '*.mdx'
+```
+If any UI files changed:
+1. Load `skill({ name: "ux-quality-gates" })`.
+2. Run `/ui-slop-check auto --since=$BASE_SHA` or manually apply its checklist when slash-command invocation is unavailable.
+3. Verify UX gates for changed surfaces:
+   - One primary action per view/section
+   - Empty/loading/error/success states for async/data flows
+   - Retry/undo/confirm paths for errors and destructive actions
+   - Form labels, helper text, validation, and error association
+   - Semantic HTML, keyboard path, visible focus, reduced motion
+   - Component family consistency for related controls
+4. Treat Critical findings like review Critical findings: fix inline, rerun verification, then continue.
 Load and run the review skill:
 ```typescript
@@ -236,11 +267,6 @@ skill({ name: "requesting-code-review" });
 Run **5 parallel agents**: security/correctness, performance/architecture, type-safety/tests, conventions/patterns, simplicity/completeness.
-```bash
-BASE_SHA=$(git rev-parse origin/main 2>/dev/null || git rev-parse HEAD~1)
-HEAD_SHA=$(git rev-parse HEAD)
-```
 Fill placeholders:
 - `{WHAT_WAS_IMPLEMENTED}`: bead title + brief summary of what changed