npm - orchestrix-skills - Versions diffs - 0.1.0 - Mend

orchestrix-skills 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26) hide show

package/.claude-plugin/plugin.json +10 -0
package/LICENSE +21 -0
package/README.md +73 -0
package/bin/install.js +82 -0
package/package.json +30 -0
package/project-scaffold/README.md +33 -0
package/project-scaffold/core-config.yaml +19 -0
package/project-scaffold/knowledge/README.md +34 -0
package/project-scaffold/knowledge/architecture/decisions.md +32 -0
package/project-scaffold/knowledge/registry/api.yaml +15 -0
package/project-scaffold/knowledge/taste/brand.md +20 -0
package/project-scaffold/knowledge/taste/coding-standards.md +17 -0
package/project-scaffold/knowledge/taste/design-system.md +32 -0
package/skills/README.md +74 -0
package/skills/brainstorm/SKILL.md +86 -0
package/skills/commit/SKILL.md +57 -0
package/skills/design-architecture/SKILL.md +65 -0
package/skills/design-review/SKILL.md +83 -0
package/skills/design-system/SKILL.md +81 -0
package/skills/design-ui/SKILL.md +96 -0
package/skills/draft-story/SKILL.md +106 -0
package/skills/implement/SKILL.md +78 -0
package/skills/orchestrate/SKILL.md +94 -0
package/skills/research/SKILL.md +63 -0
package/skills/review-code/SKILL.md +71 -0
package/skills/run-tests/SKILL.md +73 -0

package/skills/design-architecture/SKILL.md ADDED Viewed

@@ -0,0 +1,65 @@
+---
+name: design-architecture
+description: Use when a feature needs a new or changed architectural decision — a service, data model, cross-cutting pattern, or external integration — before stories are drafted.
+license: MIT
+allowed-tools: [Read, Write, Grep, Glob]
+metadata:
+  contract:
+    inputs: [requirement, system_context]
+    reads: [architecture/*, registry/api, registry/db]
+    outputs: [specs/<slug>-arch.md]
+    updates: [architecture/decisions, registry/api?, registry/db?]
+    authority: "Write to the specs namespace (default docs/specs/) and update the architecture KB / registry. No source code, no production."
+    verify: "States the alternatives and the rationale; consistent with existing decisions (or explicitly supersedes one); no placeholder."
+    accept:
+      when: "Always — an architecture decision is foundational; the whole build rests on it."
+      timing: inline
+---
+# Design Architecture
+Make one architectural decision a feature needs, with its rationale, and record
+it durably — before any story is drafted on top of it.
+**Core principle:** Decide deliberately and record why. A wrong foundation
+discovered late wastes the whole build, so this is a gated, foundational step —
+not something `implement` should improvise.
+## Process
+1. **Read the slice.** Load relevant `architecture/*` decisions and the
+   `registry/api` + `registry/db` facts. Understand what already exists.
+2. **State the decision needed.** One sentence: what must be decided and why it
+   came up now.
+3. **Weigh 2–3 alternatives** with trade-offs; pick one; say why the others lose.
+4. **Check consistency.** It must not contradict an existing decision. If it
+   must, explicitly supersede that decision — never silently diverge.
+5. **Scope it.** Decide only what this feature forces. Don't design the whole
+   system (YAGNI); leave future decisions to future features.
+## Outputs
+**Per-feature record** — `specs/<slug>-arch.md`:
+```markdown
+# <Feature> — Architecture Decision
+## Decision — one line.
+## Context — why it came up.
+## Alternatives — what was rejected and why.
+## Impact — affected modules, data, interfaces.
+```
+**Durable record (metabolism)** — append an ADR to `architecture/decisions`
+(append/supersede, with provenance). If the decision adds or changes endpoints or
+data models, update `registry/api` / `registry/db` so downstream skills read the
+new facts.
+## Done
+Write the decision, update the KB, then stop for approval (`accept: inline`). On
+approval the orchestrator wires `draft-story`, which reads the updated
+`architecture/*` and `registry/*`.

package/skills/design-review/SKILL.md ADDED Viewed

@@ -0,0 +1,83 @@
+---
+name: design-review
+description: Use after a UI feature is built and running, before merging, to review the rendered interface against its UI spec, the design system, and world-class craft.
+license: MIT
+allowed-tools: [Read, Bash, Grep, Glob]
+metadata:
+  contract:
+    inputs: [built_ui, ui_spec]
+    reads: [taste/design-system, taste/brand]
+    outputs: [design_review_report]
+    authority: "Read-only on code; render and screenshot the running UI. No edits, no commits, no production."
+    verify: "Report is based on actual rendered screens (screenshots), not inferred from source; contains both verdicts (spec compliance AND visual quality)."
+    accept:
+      when: "never — findings route back to implement; the human reviews at the end batch."
+      timing: deferred
+---
+# Design Review
+Review a built UI with a senior designer's eye. The visual counterpart of
+`review-code`: two verdicts, in order — does it match the spec, then is it
+world-class.
+**Core principle:** You cannot review design from source. Render it and look at
+the pixels. A claim about how it looks, without a screenshot, is a guess — the
+visual equivalent of claiming tests pass without running them.
+## Inputs
+- `built_ui` — the running app (url / dev server / screenshots of each screen).
+- `ui_spec` — `specs/<slug>-ui.md` it must satisfy.
+- Read `taste/design-system` + `taste/brand` — the bar to calibrate against.
+  Deviations from the system are higher severity, not lower.
+## Look at the real thing first
+Render every screen and state in the spec (loading, empty, error too). Capture
+screenshots. Review those, not the CSS.
+## Verdict 1 — Spec compliance
+For each screen/flow/state in `ui_spec`: present and correct, missing, or wrong.
+Missing a screen or a non-happy state → spec verdict FAIL.
+## Verdict 2 — Visual quality
+Against the design system and world-class craft:
+- **Hierarchy** obvious in one glance? **Spacing** on the system's scale?
+- **Consistency** with the system's type/color/tokens? Drift = defect.
+- Is the **memorable-thing** actually visible here?
+- **Non-happy states** real, not stubs (loading/empty/error)?
+- **Interaction quality** — jank, slow transitions, layout shift?
+### Anti-slop check
+Flag any AI-default tell: Inter/Roboto where the system says otherwise ·
+purple-blue gradient heroes · three-column rounded-card grids · default Tailwind
+swatches · drop shadows on everything · emoji as icons · generic everything.
+Rate each finding **Critical** (must fix) · **Important** (fix before merge) ·
+**Minor** (note it).
+## Output: `design_review_report`
+```yaml
+spec_compliance: pass # pass | fail
+missing: [<screen/state not built>]
+findings:
+  - { severity: Important, screen: settings, issue: "...", fix: "...", shot: "..." }
+verdict: changes_requested # approved | changes_requested
+```
+## Rules
+- **Evidence-based.** Every finding names the screen and, where possible, a
+  screenshot. No "looks off" without showing it.
+- **Don't pre-judge.** Surface deviations; don't excuse one because the spec or a
+  deadline pushed it.
+- **Findings route back, not to the human.** Critical/Important go to `implement`
+  (re-run with this report as `qa_feedback`). The human sees the result at the
+  end-of-run acceptance.
+- If it's clean and on-system, say so plainly and approve.

package/skills/design-system/SKILL.md ADDED Viewed

@@ -0,0 +1,81 @@
+---
+name: design-system
+description: Use when a project has no design direction yet, or must (re)establish its aesthetic — produces the durable design system (aesthetic POV, type, color, space, motion) before any UI is designed.
+license: MIT
+allowed-tools: [Read, Write, WebSearch, Grep, Glob]
+metadata:
+  contract:
+    inputs: [product_context, references?]
+    reads: [taste/brand, taste/design-system]
+    outputs: [taste/design-system, taste/brand]
+    authority: "Author the durable design KB (taste/design-system, taste/brand). High-authority, audited (knowledge write). No source code, no production."
+    verify: "Specific, not generic: a named typeface, real type/space scales, named reference products, one memorable-thing. Covers type + color + space + motion. No AI-default tells (see anti-slop)."
+    accept:
+      when: "Always — the aesthetic direction is foundational and brand-defining."
+      timing: inline
+---
+# Design System
+Establish the project's durable visual direction — once — so every UI after it
+is consistent and unmistakably this product's. This is the source of taste the
+`design-ui` skill applies.
+**Posture:** You are a senior product designer with strong opinions about
+typography, color, space, and motion. You research, then propose ONE coherent
+system and explain why. Opinionated, not dogmatic. Zero tolerance for generic,
+AI-generated-looking interfaces.
+## The forcing question (do this first)
+Ask: **"What is the one thing someone should remember after seeing this product
+for the first time?"** One sentence — a feeling, a claim, a posture. Every
+decision below serves it. A system that tries to be memorable for everything is
+memorable for nothing.
+## Commit to a reference (this is what beats slop)
+AI defaults to the on-distribution average. Beat it by committing to a specific
+point of view BEFORE specifying anything:
+1. **Research the space.** What do the 2–3 best products here actually look like?
+2. **Name 2–3 concrete references** to steal direction from (e.g. Linear,
+   Things 3, Stripe, Vercel, Bloomberg terminal, Notion). Not to copy — to anchor.
+3. **State the one rule that makes this distinctive** (the type personality, a
+   signature color, a density choice, a motion restraint).
+## Specify the system (specifics, not adjectives)
+- **Typography** — a named typeface (not Inter/Roboto unless deliberate and
+  justified), a type scale with real sizes/weights, line-height rules.
+- **Color** — a real palette with roles (bg, surface, text, accent, states), not
+  default framework swatches; state the contrast floor.
+- **Space** — a spacing scale; density posture (tight/airy) tied to the product.
+- **Layout** — grid and composition principles; how hierarchy is created.
+- **Motion** — timing, easing, and where motion is used (and where it is not).
+## Anti-slop (forbidden defaults — name and avoid)
+Inter/Roboto by default · purple-blue gradient heroes · three-column
+rounded-card feature grids · drop shadows on everything · default Tailwind
+palette (blue-500…) · emoji as product icons · centered everything · gradients /
+glassmorphism with no reason. If a choice is one of these, it must be a
+deliberate, justified decision — not a default.
+## Output: the durable KB (structured, with provenance)
+Write `taste/design-system` and `taste/brand` as structured entries (per the
+knowledge format: terse, chunked, each with `source`/`added`/`approved_by`).
+Record the memorable-thing, the references, the one distinctive rule, and each
+specified token/scale. This is the source `design-ui` reads.
+## Self-critique before done
+Look at the system with a designer's eye: does it read as a specific, named
+point of view, or as a generic template? Could you tell it apart from a default
+AI UI? If not, sharpen it.
+## Done
+Write the KB, then stop for approval (`accept: inline`). On approval, `design-ui`
+applies this system per feature. Re-run only to evolve the direction.

package/skills/design-ui/SKILL.md ADDED Viewed

@@ -0,0 +1,96 @@
+---
+name: design-ui
+description: Use when a feature has a user interface, to design its screens and flows by applying the project's design system with world-class craft, before stories are drafted.
+license: MIT
+allowed-tools: [Read, Write, Grep, Glob]
+metadata:
+  contract:
+    inputs: [requirement, ui_context]
+    reads: [taste/design-system, taste/brand]
+    outputs: [specs/<slug>-ui.md]
+    updates: [taste/design-system?, taste/brand?]
+    authority: "Write to the specs namespace (default docs/specs/) and design assets. No source code, no production."
+    verify: "Every screen and flow maps to a requirement; expresses the design system (not generic defaults); passes the designer's-eye self-critique."
+    accept:
+      when: "Visual direction — the look is foundational; everything downstream builds on it."
+      timing: inline
+---
+# Design UI
+Design a feature's UI by APPLYING the project's design system — not inventing a
+one-off look. Consistency is the asset; the system is where world-class comes
+from. This skill expresses it for one feature.
+**Posture:** Senior product designer. Strong opinions on type, color, space,
+motion. Zero tolerance for generic, AI-generated-looking screens.
+## Precondition
+Read `taste/design-system` + `taste/brand`. **If the design system is empty,
+stop and route to `design-system` first.** Do not invent a per-feature
+aesthetic — that is exactly how product consistency dies.
+Carry the system's **memorable-thing** and **distinctive rule** through every
+screen. If this feature can't express them, say so.
+## Process
+1. **Map screens to requirements.** Every screen/state must trace to a
+   requirement. Cut the rest (YAGNI).
+2. **Apply the system, don't restate it.** Use its typeface, scale, palette,
+   spacing, motion. Reuse existing components/tokens before proposing new ones.
+3. **Specify each screen:** purpose, layout and hierarchy, the system tokens
+   used, and the non-happy states — loading, empty, error. These are where UIs
+   actually fail.
+4. **Show, don't just tell.** When a layout choice is clearer shown than
+   described, produce a mockup/wireframe, not prose.
+## Anti-slop (forbidden defaults)
+No Inter/Roboto unless the system says so · no purple-blue gradient heroes · no
+three-column rounded-card grids by reflex · no default Tailwind swatches · no
+drop shadows on everything · no emoji as icons. A default is only allowed as a
+deliberate, justified choice.
+## Designer's-eye self-critique (mandatory gate before done)
+Look at the result and ask:
+- Does this look like it could ship from {the named references}, or like a
+  generic AI UI? If the latter, fix it.
+- Is hierarchy obvious in one glance? Is spacing on the scale? Is the
+  memorable-thing visible here?
+- Did any anti-slop tell sneak in?
+Fix until it passes. This is the visual equivalent of `run-tests` — don't claim
+done without running it.
+## Output: `specs/<slug>-ui.md`
+```markdown
+# <Feature> — UI Spec
+## Screens — each: purpose, the requirement it serves, layout + hierarchy.
+## Flows — how the user moves between screens.
+## States — loading / empty / error for each screen.
+## System use — tokens/components used; how the memorable-thing shows up here.
+## New patterns — anything the system lacked, with rationale (candidate for KB).
+```
+`draft-story`'s `UI Reference` points to this file.
+## Metabolism
+A genuinely new, reusable design decision (not feature-specific) folds back into
+`taste/design-system` / `taste/brand` via `updates:` — append with provenance,
+supersede rather than rewrite. Feature-only details stay in the spec.
+## Done
+Write the UI spec, pass the self-critique, then stop for visual approval
+(`accept: inline`). On approval the orchestrator wires `draft-story`.

package/skills/draft-story/SKILL.md ADDED Viewed

@@ -0,0 +1,106 @@
+---
+name: draft-story
+description: Use when a feature request or requirement must become an implementable spec with acceptance criteria, before any code is written.
+license: MIT
+allowed-tools: [Read, Write, Grep, Glob]
+metadata:
+  contract:
+    inputs: [requirement, context]
+    reads: [taste/coding-standards, registry/api, registry/db, front-end-spec?]
+    outputs: [docs/stories/<slug>.md]
+    authority: "Write one flat story file at docs/stories/<slug>.md. No folders. No source code. No production. No spend."
+    verify: "Every requirement maps to at least one acceptance criterion; constraints are copied verbatim; no placeholders (no TBD/TODO/'handle edge cases')."
+    accept:
+      when: "Always — this output sets the direction the whole build rests on."
+      timing: inline
+---
+# Draft Story
+Turn a requirement into one implementable story. This is the upfront direction
+gate: cheap to get right here, expensive to discover wrong after a build.
+**Core principle:** A story is done when an engineer with no prior context could
+implement it without guessing.
+## Output
+Write ONE flat Markdown file: `docs/stories/<slug>.md` — `<slug>` is a kebab
+handle from the title. No folders, no `epic.story` numbering. Produce exactly
+these sections (with frontmatter) and nothing more.
+```markdown
+---
+origin: <stable short handle of the requirement this story came from>
+---
+# <Story title>
+## Goal
+One sentence: what this delivers and for whom.
+## Constraints
+Hard rules this story must respect, one per line, copied verbatim — no
+paraphrase (version floors, allowed/forbidden dependencies, naming and copy
+rules, platform requirements). Every section below implicitly includes these.
+## Acceptance Criteria
+- AC1: <observable, testable condition>
+- AC2: ...
+  Every AC must be checkable by a test or a human looking at a result.
+## File Map
+- Create: <exact/path>
+- Modify: <exact/path>
+- Test: <exact/path>
+## Interfaces
+- Consumes: <signatures this story relies on from existing code>
+- Produces: <signatures later work will rely on — exact names and types>
+## UI Reference (only if the story touches UI)
+Link to the front-end spec / design and name the exact screens or components in
+scope. Omit this section entirely for non-UI stories.
+## Out of Scope
+What this story deliberately does NOT do.
+```
+## Rules
+- **One story, one coherent deliverable.** If it spans independent subsystems,
+  split into separate stories — each its own `docs/stories/<slug>.md`, all
+  sharing the same `origin` so the set is queryable as one group.
+- **Flat, no hierarchy.** One file per story under `docs/stories/`. Grouping is
+  the `origin` field (a query the AI runs), never a folder. Order comes from
+  Interfaces (dependencies), never from a number.
+- **No placeholders.** "Add validation", "handle errors", "TBD" are failures.
+  State the actual condition.
+- **Acceptance criteria are observable.** "Works well" is not an AC. "Returns
+  400 with `{error:'Email required'}` for empty email" is.
+- **Stay at spec altitude.** Describe what must be true, not how to code it.
+  Implementation is the `implement` skill's job.
+- **Constraints are verbatim.** Copy each hard rule exactly from its source — a
+  paraphrased limit is a changed limit.
+- **UI Reference only when there is UI.** Omit the section for non-UI stories;
+  never leave it as an empty placeholder.
+## Self-review before finishing
+1. Does every requirement map to an AC? List any gap, then close it.
+2. Are all hard constraints copied verbatim, not paraphrased?
+3. Any placeholder text? Replace with the real condition.
+4. Do the names in Interfaces match across sections?
+## Done
+Write `docs/stories/<slug>.md`. Hand off its path for direction confirmation
+(this skill's `accept` is `inline`): the human approves the direction, or sends
+it back, before `implement` begins.

package/skills/implement/SKILL.md ADDED Viewed

@@ -0,0 +1,78 @@
+---
+name: implement
+description: Use when implementing a feature or bugfix from a spec with acceptance criteria, before writing implementation code.
+license: MIT
+allowed-tools: [Read, Write, Edit, Bash]
+metadata:
+  contract:
+    inputs: [story, acceptance_criteria, scope, qa_feedback?]
+    reads: [taste/coding-standards, registry/api, registry/db]
+    outputs: [code_diff, ac_traceability, test_files]
+    authority: "Write src/ and tests/. No production, no deploy, no network spend."
+    verify: "run-tests is green AND every acceptance criterion has evidence (file:line + test)."
+    accept:
+      when: "scope is high-risk"
+      timing: deferred
+---
+# Implement (Test-Driven)
+Write the test first. Watch it fail. Write the minimum code to pass.
+**Core principle:** If you did not watch the test fail, you do not know the test
+tests the right thing.
+> Rework note: when `qa_feedback` is present, you are re-running this same skill
+> on prior output. Read the feedback, add or fix the failing test that proves
+> it, then continue the cycle. Do not start a separate "fix" process.
+## The Iron Law
+```
+NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
+```
+Wrote code before its test? Delete it. Implement again from the test. "Keep it
+as reference" is testing-after with extra steps.
+## RED → GREEN → REFACTOR (one acceptance criterion at a time)
+1. **RED** — Write one test for one behavior. Clear name. Real code, not mocks.
+2. **Verify RED** — Run it. Confirm it fails, and fails because the feature is
+   missing (not a typo). A test that passes now tests existing behavior — fix it.
+3. **GREEN** — Write the simplest code that passes. No extra options, no
+   abstraction the test does not demand (YAGNI).
+4. **Verify GREEN** — Run it. The test passes, other tests still pass, output is
+   clean. Failing? Fix the code, not the test.
+5. **REFACTOR** — Remove duplication, improve names. Stay green. Add no behavior.
+6. **Repeat** for the next acceptance criterion.
+## Scope
+- `scope=small` (bugfix, ≤3 files): same cycle, narrow it to the changed behavior.
+- `scope=high-risk` (security, data, money, irreversible): same cycle; this
+  skill's `accept` triggers a human sign-off on the result.
+## AC traceability (required output)
+For every acceptance criterion, record evidence — the orchestrator and
+`run-tests` consume this:
+```
+AC1 -> code: src/auth/login.ts:42-58 | test: tests/auth/login.test.ts:'rejects empty email'
+```
+No evidence = the criterion is not done. Placeholders (`TODO`, empty paths) are
+not evidence.
+## Red flags — stop and restart the cycle
+- Code written before its test
+- A test that passed the first time you ran it
+- You cannot explain why the test failed
+- "I will add tests after" / "too simple to test" / "just this once"
+## Done
+`run-tests` is green, every AC has evidence. Output `code_diff`,
+`ac_traceability`, and `test_files`.

package/skills/orchestrate/SKILL.md ADDED Viewed

@@ -0,0 +1,94 @@
+---
+name: orchestrate
+description: Use when a goal must be delivered end-to-end by composing skills, with the human approving direction at the start and the result at the end.
+license: MIT
+allowed-tools: [Read, Write, Edit, Bash, Grep, Glob, Task]
+metadata:
+  contract:
+    inputs: [intent, constraints?]
+    reads: [skill-registry, taste/*]
+    outputs: [accepted_deliverable, run_ledger]
+    authority: "Dispatch leaf skills, each within its own authority. Do not directly touch source, production, or spend — leaf skills do that, gated. Enforce every accept gate."
+    verify: "Every dispatched skill's verify passed; final review is clean; all inline accepts were obtained."
+    accept:
+      when: "The final deliverable — once, batched."
+      timing: inline
+---
+# Orchestrate
+The root skill. The human binds intent here; this skill composes leaf skills to
+fulfill it. It holds the only warm context. Leaf skills run as fresh, isolated
+dispatches and keep none.
+**Core principle:** Organize by capability, not by role. The human sets
+direction at the start and accepts the result at the end. In between, the run is
+lights-out: objective `verify` gates each step, not a human.
+**This skill is the designated root.** It does not orchestrate itself. There is
+no step above intent.
+## The loop
+1. **Bind intent.** Read the human's goal and constraints. This is the only
+   place intent enters.
+2. **Select.** Read the skill registry. Pick skills by their `description`
+   (when-to-use). Load a skill's full `contract` only when it is a candidate —
+   never load every contract at once.
+3. **Wire (emergent, not hardcoded).** Build the path by matching one skill's
+   `outputs` to the next skill's `inputs`. Skills do not know each other; only
+   you do. Do not assume a fixed pipeline — wire what this intent needs.
+4. **Dispatch one step.** Hand the skill exactly the `inputs` it declares, as
+   files. Run it as a fresh subagent for isolation. Choose the cheapest model
+   that can do the step.
+5. **Verify (gate).** Run the skill's `verify`. If it fails, re-dispatch the
+   **same** skill with the failure as feedback (see Rework). If it passes,
+   continue.
+6. **Accept (gate).** Apply the rule below. Then continue — do not pause to ask
+   "should I keep going?" mid-run.
+7. **Repeat** 3–6 until the intent is fulfilled.
+8. **Final acceptance.** Present the batched deferred accepts and a final review
+   to the human, once. Apply corrections (see Metabolism), then deliver.
+## Accept gate
+| Skill's `accept.timing` | Skill's `authority`                                               | Action                                                                                        |
+| ----------------------- | ----------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
+| `inline`                | any                                                               | Stop now. Get human sign-off before continuing.                                               |
+| `deferred`              | reversible / low blast                                            | Record it for the final acceptance batch.                                                     |
+| `deferred`              | **irreversible / high blast** (spend, send, deploy, delete, sign) | **Override to inline.** Stop and get sign-off now. A deferred review cannot un-send an email. |
+| `never`                 | any                                                               | No human. Continue.                                                                           |
+You are the teeth. The fields are only data; you enforce them.
+## Rework is a loop, not a skill
+A failed `verify` or a `changes_requested` review is not a separate "fix" step.
+Re-dispatch the same skill with the feedback as an input (e.g. `qa_feedback`).
+Same capability, new input.
+## Metabolism
+When the human corrects something at final acceptance ("not on-brand", "wrong
+tone"), write the correction back into the relevant `taste/*` knowledge base, so
+the next run reads the improved taste. The run teaches the organization.
+## Context discipline (stay lean)
+- **Files, not paste.** Move artifacts between steps as files. Never paste a
+  step's full output into your context — it would be re-read every later turn.
+- **Ledger.** Append each finished step to a progress ledger file
+  (`.orchestrate/ledger.md`). After a compaction, trust the ledger and `git
+log`, not memory. A step recorded done is done — do not re-dispatch it.
+- **Cheapest model per step.** Mechanical step → cheap model. Judgment step →
+  capable model. State the model explicitly on every dispatch.
+- **Keep your own context small.** You coordinate; the leaves do the heavy work.
+## Red flags — stop
+- Pausing to ask the human mid-run when nothing is `inline` or irreversible
+- Running an irreversible-authority skill on a `deferred` accept
+- Hardcoding a fixed skill order instead of wiring outputs→inputs
+- Pasting a step's full output into your context instead of handing a file
+- Re-dispatching a step the ledger already marks done
+- Marking the run complete without every step's `verify` evidence

package/skills/research/SKILL.md ADDED Viewed

@@ -0,0 +1,63 @@
+---
+name: research
+description: Use when a planning or design decision needs external facts — market, competitor, feasibility, or a library/tech choice — before committing.
+license: MIT
+allowed-tools: [Read, WebSearch, WebFetch]
+metadata:
+  contract:
+    inputs: [question, scope]
+    reads: []
+    outputs: [research/<slug>.md]
+    authority: "Read-only, plus external/web reads. No code, no production, no spend."
+    verify: "Every claim carries a source; the recommendation follows from the findings, not from prior belief."
+    accept:
+      when: "never — informational; it feeds a decision, it does not make one."
+      timing: deferred
+---
+# Research
+Answer one decision-relevant question with sourced facts, so a downstream
+decision rests on evidence instead of assumption.
+**Core principle:** Evidence before recommendation. A claim without a source is
+an opinion — label it as one or drop it.
+## Process
+1. **Sharpen the question.** State exactly what decision this informs and what
+   would change the answer. If the question is vague, narrow it first.
+2. **Gather from several independent sources.** One source is a lead, not a fact.
+3. **Verify each claim.** Cross-check; keep the source for every claim. Note
+   where sources disagree.
+4. **Recommend.** Give the answer the findings support, with the trade-offs. If
+   the evidence is thin, say so — "insufficient evidence" is a valid result.
+## Output: `research/<slug>.md`
+```markdown
+# <Question>
+## Answer — the recommendation in one or two lines.
+## Findings
+- <claim> — source: <url/ref> — confidence: high/med/low
+## Disagreements / gaps — where sources conflict or evidence is thin.
+## So what — how this changes the downstream decision.
+```
+## Rules
+- **No unsourced claims.** Each finding names its source.
+- **Recommendation must follow from findings**, not from what you assumed going
+  in. If findings contradict the assumption, say that.
+- **Stay scoped.** Answer the question asked; don't expand into a survey.
+## Done
+Write the brief. It feeds back to `brainstorm` / `design-*` (no human gate —
+`accept: never`). Durable facts it surfaces (e.g. a chosen library) become an
+architecture decision via `design-architecture`, not a direct write to knowledge.