npm - @elementarno/eawf - Versions diffs - 0.5.2 - Mend

@elementarno/eawf 0.5.2

Files changed (48)

package/.claude-plugin/marketplace.json +18 -0
package/.claude-plugin/plugin.json +19 -0
package/README.md +45 -0
package/agents/auditor.md +69 -0
package/agents/domain-specialist.md +60 -0
package/agents/executor.md +69 -0
package/agents/operator.md +66 -0
package/agents/planner.md +86 -0
package/agents/polisher.md +64 -0
package/agents/researcher.md +71 -0
package/agents/reviewer.md +68 -0
package/hooks/post_commit.sh +30 -0
package/hooks/post_push.sh +30 -0
package/hooks/pre_commit.sh +30 -0
package/hooks/pre_compact.sh +30 -0
package/hooks/pre_push.sh +30 -0
package/hooks/session_end.sh +30 -0
package/hooks/session_start.sh +30 -0
package/hooks/subagent_stop.sh +30 -0
package/hooks.json +88 -0
package/package.json +18 -0
package/skills/add-property-test/SKILL.md +31 -0
package/skills/agent-dispatch/SKILL.md +33 -0
package/skills/audit/SKILL.md +35 -0
package/skills/blitz/SKILL.md +24 -0
package/skills/coauthor/SKILL.md +30 -0
package/skills/compress/SKILL.md +30 -0
package/skills/design/SKILL.md +63 -0
package/skills/differentiate/SKILL.md +23 -0
package/skills/extract-function/SKILL.md +31 -0
package/skills/extract-module/SKILL.md +31 -0
package/skills/flow/SKILL.md +28 -0
package/skills/graduate-research-code/SKILL.md +31 -0
package/skills/init/SKILL.md +32 -0
package/skills/math-explainer/SKILL.md +64 -0
package/skills/memory/SKILL.md +32 -0
package/skills/mockup/SKILL.md +30 -0
package/skills/polish/SKILL.md +29 -0
package/skills/prep/SKILL.md +43 -0
package/skills/refactor-god-class/SKILL.md +31 -0
package/skills/research/SKILL.md +45 -0
package/skills/review/SKILL.md +31 -0
package/skills/roadmap/SKILL.md +31 -0
package/skills/security-review/SKILL.md +30 -0
package/skills/ship/SKILL.md +39 -0
package/skills/spike/SKILL.md +65 -0
package/skills/wave-spec/SKILL.md +28 -0
package/skills/write-adr/SKILL.md +30 -0

package/skills/extract-module/SKILL.md ADDED

@@ -0,0 +1,31 @@
+---
+name: extract-module
+description: Model-only refactoring playbook for splitting a multi-concern file into layered modules.
+argument-hint: ""
+user-invocable: false
+disable-model-invocation: false
+---
+# extract-module
+A model-only refactoring playbook. There is no slash command; the model invokes this when a single file has grown to host several unrelated concerns that deserve their own module.
+## When to reach for it
+- One file mixes layers — schema models, business logic, and CLI glue — that the architecture keeps separate elsewhere.
+- The file's imports fan out across many subpackages, signalling it does too much.
+- Two clusters of functions never call each other and share no state.
+## Canonical procedure
+1. Draw the dependency graph inside the file: which functions call which, which share module-level state. Disjoint clusters are extraction candidates.
+2. Choose the cluster with the fewest inbound edges from the rest of the file — it moves with the least churn.
+3. Create the new module in the layer it belongs to (schema, logic, CLI) per the CLI-is-dispatch / separation-of-concerns rules.
+4. Move the cluster, then fix imports; keep public names stable so callers outside the file change only their import path.
+5. Re-export from the original module only if a published API depended on the old location; otherwise update call sites directly.
+## Guardrails
+- Respect the layering: CLI imports logic, logic imports schema, never the reverse.
+- Avoid circular imports — if the split would create a cycle, the seam is wrong; find a different cluster.
+- Each module keeps `from __future__ import annotations` and its own `logger = logging.getLogger(__name__)`.
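
Step 1 of the canonical procedure above (finding disjoint clusters) could be approximated with the standard-library `ast` module. This is a minimal sketch under stated assumptions, not part of the package: `function_clusters` is a hypothetical helper, it only sees calls made by bare name between top-level functions, and it ignores shared module-level state, which the playbook also asks you to check.

```python
from __future__ import annotations

import ast
from collections import defaultdict


def function_clusters(source: str) -> list[set[str]]:
    """Group a module's top-level functions into call-connected clusters."""
    tree = ast.parse(source)
    funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}

    # Undirected edge f -- g whenever f's body calls g by bare name.
    edges: dict[str, set[str]] = defaultdict(set)
    for name, node in funcs.items():
        for sub in ast.walk(node):
            if (
                isinstance(sub, ast.Call)
                and isinstance(sub.func, ast.Name)
                and sub.func.id in funcs
                and sub.func.id != name
            ):
                edges[name].add(sub.func.id)
                edges[sub.func.id].add(name)

    # Connected components via an explicit-stack depth-first search.
    seen: set[str] = set()
    clusters: list[set[str]] = []
    for start in funcs:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(edges[current] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

Each returned set is an extraction candidate; counting inbound edges per cluster (step 2) would need a directed version of the same graph.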

package/skills/flow/SKILL.md ADDED

@@ -0,0 +1,28 @@
+---
+name: flow
+description: Run /research → /prep → /audit → /polish → /ship sequentially; review folds into /ship as the PR-review pass. Short-circuit on any non-ok status.
+argument-hint: "<task-slug> [--auto-accept=<stage>[,<stage>...]]"
+user-invocable: true
+disable-model-invocation: true
+---
+# /flow
+## Canonical algorithm
+1. Run `/research` → `/prep` → `/audit` → `/polish` → `/ship` sequentially. The PR-review pass is folded into `/ship` (it reads the remote review comments, addresses feedback by appending waves to the current iter, then bundles iter + phase close in the final pre-merge commit per the `iter-phase-close-timing` rule in AGENTS.md).
+2. **Inter-stage gate (default).** After each step returns `status=ok`, check `flow.auto_accept.<stage>` (via `uv run eawf config get flow.auto_accept.<stage>`). When `false` (the default) and the stage was not listed in `--auto-accept`, ask the operator via `AskUserQuestion` whether to proceed — options: `proceed` / `skip-next` / `stop`. When `true`, advance without a prompt.
+3. On any non-`ok` status (`blocked`, `needs_user`, `failed`, `partial`), short-circuit with the failing step's repair commands.
+## Pre-flight checklist
+- [ ] All upstream skills are installed.
+- [ ] Per-stage `flow.auto_accept` flags reflect the operator's intended cadence (review existing values; default is "ask each time" for every stage).
+## Decision surfaces
+`/flow` is a long-running pipeline. Every operator-facing decision point — inter-stage gates, "abandon vs retry on `failed`", "merge order on `needs_user`" — MUST be raised through `AskUserQuestion` so the run stays unstuck without dropping the operator into free-text. Per-step skills already follow this rule; the flow merely propagates their `needs_user` envelopes verbatim.
+## Output contract
+Skill envelope whose body accumulates per-step envelopes plus the inter-stage gate decisions. Status is `ok` when every step passed (after any auto-accept or operator confirm), otherwise the first non-`ok` step's status is propagated.
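
The sequencing, inter-stage gate, and short-circuit rules above can be sketched as a small driver loop. Everything here is illustrative: `run_flow`, `run_stage`, and `ask_operator` are hypothetical names standing in for the real skill dispatch and `AskUserQuestion`. The source does not say how `stop` and `skip-next` affect the final status or whether a gate follows the last stage, so the sketch assumes `stop` ends the run with `ok`, `skip-next` skips exactly one stage, and no gate runs after `/ship`.

```python
from __future__ import annotations

from typing import Callable

STAGES = ("research", "prep", "audit", "polish", "ship")


def run_flow(
    run_stage: Callable[[str], str],
    ask_operator: Callable[[str], str],
    auto_accept: frozenset[str] = frozenset(),
) -> tuple[str, list[tuple[str, str]]]:
    """Run the stages in order; return (overall status, per-stage log)."""
    log: list[tuple[str, str]] = []
    skip = False
    for index, stage in enumerate(STAGES):
        if skip:
            log.append((stage, "skipped"))
            skip = False
            continue
        status = run_stage(stage)
        log.append((stage, status))
        if status != "ok":
            # Short-circuit: the first non-ok status is propagated.
            return status, log
        is_last = index == len(STAGES) - 1
        if is_last or stage in auto_accept:
            continue  # advance without a prompt
        decision = ask_operator(stage)  # "proceed" / "skip-next" / "stop"
        if decision == "stop":
            return "ok", log  # assumption: stopping is not a failure
        skip = decision == "skip-next"
    return "ok", log
```

Passing `frozenset(STAGES)` as `auto_accept` mirrors setting every `flow.auto_accept.<stage>` flag to `true`: the operator callback is never invoked.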

package/skills/graduate-research-code/SKILL.md ADDED

@@ -0,0 +1,31 @@
+---
+name: graduate-research-code
+description: Model-only playbook for promoting spike/research code into a typed, tested, maintained module.
+argument-hint: ""
+user-invocable: false
+disable-model-invocation: false
+---
+# graduate-research-code
+A model-only playbook for promoting throwaway research/spike code into a maintained module. There is no slash command; the model invokes this when a notebook or scratch script has proven its idea and must now meet production conventions.
+## When to reach for it
+- A spike under `.ea/local/` (or a notebook) demonstrated a result that a real feature now depends on.
+- The exploratory code lacks types, tests, and error handling but the algorithm is settled.
+- A research artifact is being promoted to `.ea/artifacts/` and its code needs to move out of scratch.
+## Canonical procedure
+1. Separate the kept algorithm from the exploratory scaffolding (plotting, ad-hoc prints, hard-coded paths). Only the algorithm graduates.
+2. Re-home it into the proper package layer with `from __future__ import annotations`, full type hints, and module-level `logger`.
+3. Replace inline constants and machine paths with parameters or config; scrub any PII or local paths before the code is committed.
+4. Add the test debt the spike skipped: boundary cases, error paths, and `pytest.approx` / `assert_allclose` for any numerics.
+5. Back every quantitative claim the graduated code makes with an audit-recorded artifact so the verify-before-claim rule holds.
+## Guardrails
+- A spike brief that ratifies a decision promotes from `.ea/local/research/` to `.ea/artifacts/research/` in the same commit that lands the decision (spike-workflow rule).
+- Do not graduate code whose verdict is still open — promote the brief and the decision first.
+- The graduated module obeys the same lint, type, and coverage gates as any other source file; no exceptions for "it was research".
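
As an illustration of steps 2 to 4 above, here is roughly what a graduated function and its test debt might look like. `rolling_mean` is a hypothetical stand-in for whatever algorithm the spike settled on, and `math.isclose` is used only to keep the sketch standard-library-only; in a real test suite `pytest.approx` or `assert_allclose` would take its place, as the playbook says.

```python
from __future__ import annotations

import logging
import math

logger = logging.getLogger(__name__)


def rolling_mean(values: list[float], window: int) -> list[float]:
    """Return the mean of each full run of `window` consecutive values."""
    if window < 1:
        raise ValueError(f"window must be >= 1, got {window}")
    if window > len(values):
        logger.debug("window %d exceeds %d values", window, len(values))
        return []
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def test_rolling_mean() -> None:
    # Numerics: compare with a tolerance rather than ==.
    result = rolling_mean([0.1, 0.2, 0.3], window=2)
    expected = [0.15, 0.25]
    assert len(result) == len(expected)
    assert all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(result, expected))
    # Boundary case: a window larger than the input yields no full window.
    assert rolling_mean([1.0], window=2) == []
    # Error path: a non-positive window is rejected.
    try:
        rolling_mean([1.0], window=0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

The window size is a parameter rather than an inline constant, which is the kind of change step 3 asks for.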

package/skills/init/SKILL.md ADDED

@@ -0,0 +1,32 @@
+---
+name: init
+description: Initialise a new Eä Workflow workspace. Renders managed regions of AGENTS.md and the .claude/ plugin tree.
+argument-hint: "[--profile=<id>]"
+user-invocable: true
+disable-model-invocation: true
+---
+# /init
+## v0.4 cross-links
+Init persists a typed `Project` row (id, profile composition, default `RoleSpec` set, vcs cadence) — downstream skills resolve role and release-cadence config off the `Project` rather than re-reading the profile yaml. The init wizard emits a `MEMORY` seed entry (`MutationKind=create`) recording the project bootstrap so the calibrated-trust scorecard has a base date.
+## Canonical algorithm
+1. Discover existing `.ea/` (if any) and load profile composition.
+2. Render managed regions of `AGENTS.md`, `CLAUDE.md`, `.claude/`.
+3. Persist `.ea/state.json` and `.ea/profile.yaml`.
+## Pre-flight checklist
+- [ ] Working tree is clean before the first init.
+- [ ] Profile composition is declared.
+## Decision surfaces
+When the wizard pauses on an unanswered question, the `status=needs_user` envelope routes the operator to an `AskUserQuestion` prompt for the missing field rather than guessing a default.
+## Output contract
+Skill envelope wrapping the wizard outcome. `status=needs_user` when the wizard pauses on an unanswered question.

package/skills/math-explainer/SKILL.md ADDED

@@ -0,0 +1,64 @@
+---
+name: math-explainer
+description: Author a verification-grounded math-explainer over typed MathClaim/MathExplainer rows: each claim pins intuition + a runnable CI-checked example gate + assumptions/regime + a canonical citation, run through an in-skill clarity loop (vale-prose + EAWF019 + draft validate). No state mutations.
+argument-hint: "<explainer-slug> [--final] [--from-brief <path>]"
+user-invocable: true
+disable-model-invocation: true
+---
+# /math-explainer
+## Purpose
+Author a math-explainer a non-expert can trust. Agents are strong at olympiad-style math and weak at research-level conceptual math, and the dominant failure is a fluent, confident, *invalid* derivation rather than a refusal — math that *looks* execution-grounded but is not. The defence is the same discipline that lets a reader who cannot check the math act on it: pin every claim to an executed check. `/math-explainer` drives that contract over the typed `MathExplainer` + `MathClaim` rows (`kernel/spec/math.py`), running an in-skill clarity loop so a draft that fails the newcomer / facet test is revised, not emitted. Read-only: it writes only under `.ea/local/`.
+## The four-facet per-claim contract
+Every `MathClaim` carries all four facets — Pydantic refuses to construct one that drops any:
+1. **Intuition** — prose that *explains* why the statement holds (the analogy, the shape of the argument), not a citation standing in for the conceptually-hard step. Substituting a reference for the hard step is the non-expert trap.
+2. **Runnable example** — a `GateSpec` (`command_exit_zero`) hosting the verifier: a CAS identity, a high-precision numeric cross-check, a property test, a units check, an SMT counterexample hunt, an interval-arithmetic bound, or a proof-assistant obligation. The claim is *trustable* only when this gate actually runs in CI.
+3. **Assumptions / regime** — the one-line regime where the claim holds plus the explicit assumptions it rests on. This is what tells a non-expert *when not to trust the result*.
+4. **Citation** — a canonical `EvidenceRef` (audit / artifact / decision / store URN or external URL) that resolves and entails the claim.
+Route by claim shape and refute before you certify: most verifiers (CAS, numeric, property test, units, SMT-sat, reference-DB lookup) only *refute* — record `assurance="refute"`; reserve `assurance="certify"` for validated-interval arithmetic and a kernel-checked proof. Require a second independent path before marking any conceptual claim resolved.
+## Canonical algorithm
+1. Resolve the explainer slug from the argument (or `AskUserQuestion` if absent). The slug stems the artifact: `<YYYY-MM-DD>-<slug>-math.md` plus its typed sidecar `<YYYY-MM-DD>-<slug>-math.json`.
+2. Draft each claim with all four facets. Write the intuition first; never hand-write SDE / derivation math from memory — ground it against the source code or a cited reference and mark intuition-vs-rigor.
+3. Wire each claim's `example_gate` to a real verifier command and pin its `verifier` + `assurance` hints. The gate is dead unless its `args['argv']` is a non-empty vector the runner can execute.
+4. Run the in-skill clarity loop (below). A claim that fails any leg is revised in place — do not emit a draft that fails the loop.
+5. Write the `MathExplainer` JSON sidecar and the chassis-backed markdown companion under `.ea/local/research/`, then return the output envelope. On `--final` with residual unknowns > 0, emit a `/blitz` follow-up under the standard recursion guard.
+## The in-skill clarity loop
+Three layers, each with exactly one owner (no double-enforcement):
+- **Form (prose)** — `eawf hook vale-prose <doc>-math.md`: unglossed jargon, a hedge with no number, notation-spelling consistency, the readability of the intuition. Cannot see structure or correctness.
+- **Structure / binding** — `eawf hook eawf019-math-facets <doc>-math.json`: every claim carries the four facets; each citation resolves to a reference; every fenced example is actually collected by the runner (the silent-skip regression where a "tested" pin runs nothing); each formula parses. Cannot run the math.
+- **Draft artifact** — `eawf draft validate <doc>-math.md`: the chassis sections (Summary / References / Provenance / Scrub), dense `[N]` citations, and the scrub gate (no machine paths, host-local URLs, or PII).
+Correctness itself (the gate-runner executing each verifier) and entailment (a *different* model confirming the intuition entails the formula and the citation supports the claim) sit beyond the in-skill loop — the gate-runner owns truth, the L3 judge owns meaning. A lint can guarantee a claim *has* a verifier of the right type; only running it shows the math holds.
+## Promotability
+A `MathExplainer` is constructible while its claims are ungrounded but `is_promotable()` is false until *every* claim's gate is runnable and its citation resolves — the same EviBound shape as `IntentBrief.evidence_refs`. A claim is promoted on verification-grounding (gate resolves + citation resolves + judge confirms the intuition entails), never on author confidence. The local draft promotes to `.ea/artifacts/` only when it informs a decision recorded in `state.json` (the artifact-chassis rule then applies).
+## Pre-flight checklist
+- [ ] No state mutations — read-only; writes only under `.ea/local/`.
+- [ ] Explainer slug matches the wave / iter / phase prefix so dispatch renderers surface this brief.
+- [ ] Every claim carries all four facets — intuition, runnable example gate, assumptions/regime, canonical citation.
+- [ ] No claim substitutes a citation for the conceptually-hard step; the intuition explains, not asserts.
+- [ ] Each `example_gate` is a real `command_exit_zero` gate with a non-empty argv — no cosmetic "tested against" pin.
+- [ ] `verifier` + `assurance` hints set; `certify` reserved for interval arithmetic and proof assistants.
+- [ ] Derivation math grounded against source / a cited reference — none hand-written from memory.
+- [ ] `eawf hook vale-prose`, `eawf hook eawf019-math-facets`, and `eawf draft validate` all clean.
+- [ ] Citations repo-relative, external URL, or eawf URN — no absolute local paths, no host-local URLs, no PII.
+## Output contract
+Eä-rendered skill envelope (`OutputEnvelope`) with `header.skill = "/math-explainer"`. Body carries the explainer slug, the markdown + JSON artifact paths, the claim count, the per-claim verifier + assurance tally, the clarity-loop leg statuses (vale-prose / EAWF019 / draft validate), the promotability verdict, and any residual unknowns.
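
The four-facet contract and the promotability rule described in this skill can be sketched with plain dataclasses. This is not the package's `MathClaim` or `GateSpec` (the source says those are Pydantic models in `kernel/spec/math.py`); `GateSketch`, `ClaimSketch`, and the `citation_resolves` callback are hypothetical, and the field layout is assumed from the prose. The judge-confirmation leg is left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class GateSketch:
    kind: str
    args: dict[str, list[str]] = field(default_factory=dict)

    def is_runnable(self) -> bool:
        # A gate is dead unless args["argv"] is a non-empty vector.
        return self.kind == "command_exit_zero" and bool(self.args.get("argv"))


@dataclass(frozen=True)
class ClaimSketch:
    intuition: str
    example_gate: GateSketch
    regime: str
    citation: str
    assurance: str = "refute"

    def __post_init__(self) -> None:
        # Refuse to construct a claim that drops a text facet.
        for name in ("intuition", "regime", "citation"):
            if not getattr(self, name).strip():
                raise ValueError(f"facet {name!r} must not be empty")
        if self.assurance not in {"refute", "certify"}:
            raise ValueError(f"unknown assurance {self.assurance!r}")


def is_promotable(
    claims: list[ClaimSketch],
    citation_resolves: Callable[[str], bool],
) -> bool:
    """True only when every claim has a runnable gate and a resolving citation."""
    return all(
        claim.example_gate.is_runnable() and citation_resolves(claim.citation)
        for claim in claims
    )
```

A claim with an empty `argv` still constructs, which matches the text: an explainer is constructible while ungrounded but not promotable.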

package/skills/memory/SKILL.md ADDED

@@ -0,0 +1,32 @@
+---
+name: memory
+description: Save, list, or forget curated durable memory entries.
+argument-hint: "save|list|forget [<name>] [--tier=working|archival|retrieval]"
+user-invocable: true
+disable-model-invocation: true
+---
+# /memory
+## v0.4 cross-links
+Each memory mutation carries a typed `MutationKind` — `create | update | refresh | demote | archive` — so the audit trail distinguishes a freshly captured fact from a content-preserving re-touch. `find_stale` is recommender-only (it surfaces candidates ranked by age and use-count) and never flips `status=STALE` unilaterally; the explicit `demote` mutation does that. Promoted memories carry a `Project` row reference so cross-repo recall stays scoped.
+## Canonical algorithm
+1. Resolve the verb (`save` default / `list` / `forget`) and the target tier (`working` default / `archival` / `retrieval`).
+2. A named verb (`save` / `forget`) without a `name` degrades to `status=needs_user`.
+3. Append a single append-only `EVENT` describing the operation intent; the daemon is the sole canonical writer of the memory JSONL store, so the skill routes the operator to the `eawf memory` writer via `next_valid_actions` rather than mutating the store itself.
+## Pre-flight checklist
+- [ ] The skill records intent only — the daemon owns the store write.
+- [ ] `save` / `forget` carry a memory entry name.
+## Decision surfaces
+A named verb (`save` / `forget`) without a `name` degrades to `status=needs_user`, which routes the operator to an `AskUserQuestion` prompt for the missing entry name rather than inventing one.
+## Output contract
+Skill envelope with `header.skill = "/memory"`. Body carries verb, name, and tier (or a reason on the needs_user path).
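
Steps 1 and 2 of the canonical algorithm (default resolution and the `needs_user` degradation) might look like the following. `resolve_memory_request` and the dictionary shape are hypothetical; the real skill returns a typed envelope and appends an `EVENT`, neither of which is modelled here. Rejecting an unknown verb or tier with `ValueError` is also an assumption.

```python
from __future__ import annotations

VERBS = ("save", "list", "forget")
TIERS = ("working", "archival", "retrieval")


def resolve_memory_request(
    verb: str | None = None,
    name: str | None = None,
    tier: str | None = None,
) -> dict[str, str | None]:
    """Apply the documented defaults and the missing-name rule."""
    verb = verb or "save"      # `save` is the default verb
    tier = tier or "working"   # `working` is the default tier
    if verb not in VERBS:
        raise ValueError(f"unknown verb {verb!r}")
    if tier not in TIERS:
        raise ValueError(f"unknown tier {tier!r}")
    if verb in ("save", "forget") and not name:
        # A named verb without a name degrades instead of inventing one.
        return {
            "status": "needs_user",
            "reason": f"'{verb}' requires a memory entry name",
        }
    return {"status": "ok", "verb": verb, "name": name, "tier": tier}
```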

package/skills/mockup/SKILL.md ADDED

@@ -0,0 +1,30 @@
+---
+name: mockup
+description: Author 2-4 UI mockups as ASCII layouts and surface them as side-by-side AskUserQuestion option previews to compare.
+argument-hint: "<surface-slug>"
+user-invocable: true
+disable-model-invocation: false
+---
+# /mockup
+## Canonical algorithm
+1. Take the UI surface under design and the operator's intent (the fields to show, the terminal width budget, the priority elements).
+2. Author 2-4 concrete mockup variants as ASCII layouts. Use plain ASCII box characters (`+`, `-`, `|`) so the render stays lint-clean; do not use Unicode box-drawing glyphs.
+3. Surface the variants as `UserQuestionOption.preview` strings on a single `UserQuestion` so the operator compares the layouts side-by-side and picks one. Each option's `preview` carries that variant's full multi-line layout; the `label` is the variant name.
+4. Stop at the mockup. `/mockup` is advisory: it produces layouts and the comparison prompt, it does not write files or mutate state.
+## Pre-flight checklist
+- [ ] Read-only / advisory -- no state mutations, no file writes.
+- [ ] 2-4 variants only (the `AskUserQuestion` cap; single-select).
+- [ ] ASCII-only box art so the preview renders cleanly everywhere.
+## Decision surfaces
+The variant choice is the one operator-facing decision. Surface it through `AskUserQuestion`, one option per variant, with the layout in each option's `preview` box -- never bounce the operator to free-text.
+## Output contract
+Skill envelope whose body conforms to `MockupBody`: the authored `variants` plus the optional `UserQuestion` whose options carry the side-by-side `preview` layouts.
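
A minimal sketch of steps 2 and 3: drawing a variant with plain `+`, `-`, `|` characters and packaging variants as options with `label` and `preview` keys. `ascii_box` and `preview_options` are hypothetical helpers, and the plain dictionaries only approximate `UserQuestionOption`, whose real definition is not shown in this diff.

```python
from __future__ import annotations


def ascii_box(title: str, rows: list[str], width: int = 32) -> str:
    """Render a titled box using only '+', '-' and '|' for the frame."""
    inner = width - 4  # one border character and one padding space per side
    if inner < 1:
        raise ValueError("width must be at least 5")

    def cell(text: str) -> str:
        # Truncate or pad so every line has exactly `width` characters.
        return "| " + text[:inner].ljust(inner) + " |"

    border = "+" + "-" * (width - 2) + "+"
    lines = [border, cell(title), border]
    lines.extend(cell(row) for row in rows)
    lines.append(border)
    return "\n".join(lines)


def preview_options(variants: dict[str, str]) -> list[dict[str, str]]:
    """Turn {variant name: layout} into option dicts, enforcing the 2-4 cap."""
    if not 2 <= len(variants) <= 4:
        raise ValueError("expected 2-4 variants")
    return [
        {"label": label, "preview": layout}
        for label, layout in variants.items()
    ]
```

For example, `preview_options({"compact": ascii_box("Status", ["ok"]), "wide": ascii_box("Status", ["ok"], width=48)})` yields two options whose previews can be compared side by side.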

package/skills/polish/SKILL.md ADDED

@@ -0,0 +1,29 @@
+---
+name: polish
+description: Repo-wide consistency sweep. Aligns naming, docstring style, log fields, error message phrasing, and removes dead code.
+argument-hint: "[--scope=<dir|file>]"
+user-invocable: true
+disable-model-invocation: true
+---
+# /polish
+## Canonical algorithm
+1. Resolve scope: default = entire `src/eawf/`; `--scope=<dir|file>` narrows.
+2. Sweep checks: naming, docstrings, log fields, error message phrasing, dead code, citation density, draft sentinels, scrub status.
+3. Apply fixes inline. If a change touches public API, stop and ask.
+## Pre-flight checklist
+- [ ] Scope is declared and bounded.
+- [ ] No public API rename without explicit user confirmation.
+- [ ] The target iter is NOT yet closed. Iter close is gated on `audit + polish + ship CI + PR review pass` per the `iter-phase-close-timing` rule in AGENTS.md; `/polish` runs after `/audit` and before that close.
+## Decision surfaces
+Public-API renames, dead-code deletions, and anything matching `polish.deletion_policy` MUST be raised via `AskUserQuestion` (options: `apply` / `defer-to-backlog` / `skip`) instead of asking in free text. `polish.auto_apply_safe=true` bypasses the prompt for the small "safe" subset only (formatting, comment phrasing).
+## Output contract
+Skill envelope with a change list grouped by category and a deferred-items list for changes needing user OK.

package/skills/prep/SKILL.md ADDED

@@ -0,0 +1,43 @@
+---
+name: prep
+description: Activate the next PLANNED phase: surface its DAG for operator approval, then run the activate_phase hard gate and dispatch subagents per wave.
+argument-hint: "<phase-id>"
+user-invocable: true
+disable-model-invocation: true
+---
+# /prep
+## Canonical algorithm
+`/prep` activates a PLANNED phase. The flow branches on the phase's PLANNED-queue state:
+1. Resolve `<phase-id>` against `state.phases`.
+2. Branch on phase status + wave plan:
   - **Case A — PLANNED phase with at least one PENDING wave.** Render the plan via `eawf roadmap show --phase <id> --md`. Enter Claude Code plan mode (`EnterPlanMode`) with the rendered DAG, then surface an `AskUserQuestion` with the options `use-as-is`, `revise`, `replace`, `cancel`. On `use-as-is`, call `eawf phase activate <id>` (which runs the V11 hard gate: ≥1 wave + deps phases CLOSED). On `revise`, hand back to `/roadmap revise`. On `replace`, hand back to `/roadmap drop` + `/roadmap propose`.
+   - **Case B — PLANNED phase with empty wave DAG.** Dispatch the `planner` agent (`build/eawf-plugin/agents/planner.md`). The planner returns a sequence of `eawf roadmap revise --add-wave` commands (or a YAML payload). **Apply the planner's commands first** through the daemon-backed roadmap surface — waves land as PENDING on the still-PLANNED iter — then render the resulting DAG via `eawf roadmap show --phase <id> --md` and enter Claude Code plan mode (`EnterPlanMode`) with that rendering. Surface `AskUserQuestion` with `approve`, `edit`, `cancel`. The operator reviews the rendered roadmap, not the planner's raw commands. Edits during plan mode are `/roadmap revise` calls (PLANNED scope is mutable). On `approve`, call `eawf phase activate <id>` (V11 hard gate).
+   - **Case C — no PLANNED phase by that id.** Reject with exit 4 and hint `Run \`eawf roadmap propose --phase <id> --title ...\` first.` for the operator.
+3. **Optional spike first.** Before claiming a wave whose success criteria are not yet writable, run `/research <topic>` as a *spike* (read-only) per the `spike-workflow` rule in AGENTS.md. The spike produces a brief under `.ea/local/<YYYY-MM-DD>-<slug>.md` (or the conventional `.ea/local/research/` sub-directory). When a matching spike brief exists, the plan-mode proposal in case A MUST reference it by repo-relative path so the operator and the dispatched executor read the same source-of-truth artifact — the wave dispatch renderer surfaces matching briefs under a `## References` section automatically.
+4. For each parallel wave under the activated iter, dispatch a worktree subagent.
+5. For each sequential wave, run inline; cherry-pick parallel-wave commits in between as they finish.
+6. Validate the rendered plan with `eawf plan show --md`; wave tags and bucket roll-ups must match state.
+## Pre-flight checklist
+- [ ] Confirm current branch is the long-running phase branch.
+- [ ] Confirm `git status` is clean.
+- [ ] Confirm worktree subagents branch from the parent HEAD.
+- [ ] Every wave has success criteria, agent role, effort bucket, and file scope.
+- [ ] The target phase exists in `state.phases` with status `planned` (otherwise hand back to `/roadmap propose`).
+- [ ] If a spike preceded the claim, its brief path is cited in the plan-mode proposal (case A) so the dispatched subagent reads it.
+## Decision surfaces
+`AskUserQuestion` is the canonical surface for the case-A `use-as-is/revise/replace/cancel` pick and the case-B `approve/edit/cancel` pick. Free-text prompts are forbidden per the project-wide approval policy.
+## Output contract
+Skill envelope describing the activated phase + dispatched waves and the expected cherry-pick order. The envelope's `body.plan_mode_approval` records the approval source (`use-as-is`, `revise`, `replace`, `planner-approve`).
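
The V11 hard gate mentioned in cases A and B is described only as "≥1 wave + deps phases CLOSED". A sketch of that predicate, assuming a hypothetical state shape in which each phase is a dictionary with `status`, `waves`, and `deps` keys and statuses are lowercase strings:

```python
from __future__ import annotations

from typing import Any


def passes_activation_gate(phase_id: str, phases: dict[str, dict[str, Any]]) -> bool:
    """Check the two documented conditions before a phase may activate."""
    phase = phases[phase_id]
    has_wave = len(phase["waves"]) >= 1
    # Every phase this one depends on must already be closed.
    deps_closed = all(phases[dep]["status"] == "closed" for dep in phase["deps"])
    return has_wave and deps_closed
```

The real gate runs inside `eawf phase activate <id>`; this predicate only restates its two conditions and does not touch state.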

package/skills/refactor-god-class/SKILL.md ADDED

@@ -0,0 +1,31 @@
+---
+name: refactor-god-class
+description: Model-only playbook for splitting a multi-responsibility class into single-purpose collaborators.
+argument-hint: ""
+user-invocable: false
+disable-model-invocation: false
+---
+# refactor-god-class
+A model-only refactoring playbook. There is no slash command and no CLI verb; the model invokes this while editing a module whose single class has accreted too many responsibilities.
+## When to reach for it
+- One class owns parsing, validation, persistence, and presentation.
+- The class exceeds ~300 lines or has more than ~7 public methods that cluster into distinct concerns.
+- Tests for the class need elaborate setup because one method depends on state another method mutates.
+## Canonical procedure
+1. Map responsibilities. List each public method and tag it with the one concern it serves (parse, validate, compute, persist, render).
+2. Find the seams. Group methods sharing the same tag and the same private state; each group is a candidate collaborator.
+3. Extract the lowest-coupling group first into its own class with a constructor that takes only the state it needs (no back-reference to the god class).
+4. Replace the in-class call sites with delegation to the new collaborator; keep the god class's public API stable for one step so callers do not churn.
+5. Repeat per concern. When the god class is a thin coordinator, decide whether it survives as a facade or dissolves into its callers.
+## Guardrails
+- One concern per extraction commit; never move two seams at once.
+- Preserve behaviour: run the existing tests after each extraction before moving the next group.
+- Honour the project conventions — `extra="forbid"` on any new Pydantic model, single-responsibility per the AGENTS.md engineering rules, and no speculative abstraction (YAGNI).

package/skills/research/SKILL.md ADDED

@@ -0,0 +1,45 @@
+---
+name: research
+description: Read-only investigation of an open question. Produces a research brief or surfaces findings inline; no code changes, no state mutations.
+argument-hint: "<topic-slug> [--final]"
+user-invocable: true
+disable-model-invocation: false
+---
+# /research
+## Canonical algorithm
+1. Define the question. State the hypothesis or unknown in one sentence.
+2. Survey: read source, run `git log`, fetch external refs as needed.
+3. Compare alternatives — bullet list of options with pros/cons.
+4. Verdict: recommend one path, or recommend "stay open" with the next discriminating experiment.
+5. If `--final`: persist a research brief with `references` and render it through `eawf research show --md`.
+## v0.4 output contract: `IntentBrief` + dispatch-plan
+The brief body conforms to `kernel/spec/research.IntentBrief` — typed claims with `evidence_refs` (a brief is promotable iff every claim has at least one resolving + entailing reference). The session also emits an optional dispatch-plan when the verdict names a follow-up wave the brief informs, so `/prep` and `/roadmap propose` can wire the brief into the next wave's References block automatically.
+## `--depth` flag
+`--depth shallow|medium|deep|exhaustive` controls survey budget (file reads, external fetches, cross-wave grep sweeps). Default is `medium`; pin via `research.default_depth` in the layered config (reuses `StageProfile`, no new key).
+## Spike convention
+A *spike* — a short read-only investigation done before claiming a real wave — is run via `/research` and produces a brief under `.ea/local/<YYYY-MM-DD>-<slug>.md` (or the conventional `.ea/local/research/` sub-directory). The filename follows the `<date>-<slug>.md` stem so it sorts chronologically and slug-matches the wave, iter, or phase it informs. Briefs stay local-only — `.ea/local/` is gitignored — and are promoted to `.ea/artifacts/` only when they inform a decision recorded in `state.json` (the artifact-chassis rule then applies). See `spike-workflow` in AGENTS.md for the full convention.
+## Pre-flight checklist
+- [ ] No state mutations — read-only.
+- [ ] Cite sources as dense `[N]` references backed by `Citation` rows.
+- [ ] Keep promoted artifact prose scrub-clean and repo-relative.
+- [ ] Distinguish "what the code does" from "what the doc claims".
+- [ ] If this run is a spike, name the brief `<YYYY-MM-DD>-<slug>.md` and place it under `.ea/local/` (or `.ea/local/research/`) so the dispatch renderer can surface it to the next wave's executor.
+## Decision surfaces
+When the verdict reduces to a small set of named alternatives, surface the choice through `AskUserQuestion` rather than free-text — the operator can pick without retyping the option labels.
+## Output contract
+Eä-rendered skill envelope (`OutputEnvelope`) with `header.skill = "/research"`. Body carries the structured findings; footer records any persisted brief.
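
The spike naming convention (`<YYYY-MM-DD>-<slug>.md` under `.ea/local/` or `.ea/local/research/`) is easy to get subtly wrong by hand. A small sketch of a path builder follows; `spike_brief_path` is a hypothetical helper, and the slug is passed through unchanged because the source does not define a slug grammar.

```python
from __future__ import annotations

from datetime import date
from pathlib import PurePosixPath


def spike_brief_path(
    slug: str,
    day: date,
    use_research_subdir: bool = True,
) -> PurePosixPath:
    """Build the repo-relative path for a spike brief."""
    base = PurePosixPath(".ea/local")
    if use_research_subdir:
        base = base / "research"
    # ISO dates sort chronologically as plain strings.
    return base / f"{day.isoformat()}-{slug}.md"
```

Using `PurePosixPath` keeps the result repo-relative with forward slashes on every platform, which fits the scrub rule against machine-local paths.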

package/skills/review/SKILL.md ADDED

@@ -0,0 +1,31 @@
+---
+name: review
+description: Code review of an open PR or local diff. Surfaces issues with severity tags; no scope creep, no praise.
+argument-hint: "[<PR# | commit-range>]"
+user-invocable: true
+disable-model-invocation: false
+---
+# /review
+## Canonical algorithm
+1. Resolve target: PR number → `gh pr diff <PR>`; commit range → `git diff <range>`; default → `git diff main...HEAD`.
+2. Walk the diff hunk by hunk. For each hunk, read enough surrounding context to make a judgment.
+3. Apply rules in order: correctness > security > clarity > style.
+4. Tag findings: 🔴 blocker, 🟠 must-fix, 🟡 should-fix, 🔵 nit.
+5. Check artifact chassis and dense references when reviewing docs or promoted artifacts.
+## Pre-flight checklist
+- [ ] Read the success criteria for the phase/wave the diff belongs to.
+- [ ] Verify any quantitative claim against `Read`/`grep`.
+- [ ] Verify markdown artifacts keep `Summary`, `References`, `Provenance`, and `Scrub` sections.
+## Decision surfaces
+When the final verdict is ambiguous (e.g. one 🟠 finding the operator might choose to defer), surface `approve | request-changes | comment-only` through `AskUserQuestion` rather than picking silently.
+## Output contract
+Skill envelope with a flat findings list grouped by file and an aggregate verdict (`approve | request-changes | comment-only`).
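
Step 1 of the canonical algorithm (target resolution) maps directly onto three commands. The sketch below returns the argument vector rather than running it; `diff_command` is a hypothetical helper, and treating a leading `#` as part of a PR number is an assumption.

```python
from __future__ import annotations


def diff_command(target: str | None = None) -> list[str]:
    """Map a review target to the command that produces its diff."""
    if target is None or not target.strip():
        # Default: everything on this branch since it left main.
        return ["git", "diff", "main...HEAD"]
    target = target.strip()
    number = target.lstrip("#")
    if number.isdigit():
        return ["gh", "pr", "diff", number]
    # Anything else is treated as a commit range.
    return ["git", "diff", target]
```

The result can be handed to `subprocess.run(..., capture_output=True, text=True)` to obtain the diff text for the hunk-by-hunk walk.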

package/skills/roadmap/SKILL.md ADDED

@@ -0,0 +1,31 @@
+---
+name: roadmap
+description: Plan / revise / apply / drop / show PLANNED-scope phases on the eawf roadmap queue. Mutates state.json via the lifecycle transitions; one phase at a time.
+argument-hint: "propose|revise|apply|drop|show <phase-id> [flags]"
+user-invocable: true
+disable-model-invocation: true
+---
+# /roadmap
+## Canonical algorithm
+1. **`propose`** stages a new PLANNED phase + its `P##-I01` iter on the queue without any waves yet. Emits a `needs_user` envelope with the rendered plan text — the active runtime (Claude plan-mode, Codex text-prompt) surfaces it for operator approval.
+2. **`revise`** edits the PLANNED scope via structured flags: `--add-wave`, `--remove-wave`, `--set-deps`, `--retitle`. Wave-level mutations route through the PENDING-only transitions on the lifecycle state machine.
+3. **`apply`** is the post-propose confirmation step. It validates that the phase is PLANNED with at least one wave and emits an `ok` envelope; the actual planning is already persisted (propose does the state mutation). Use it as the handoff into `/prep`.
+4. **`drop`** archives a PLANNED phase (PLANNED → ARCHIVED) when the operator rejects the proposed plan.
+5. **`show`** renders the queue: text table (default), markdown (`--md`), or JSON envelope (`--json`).
+## Pre-flight checklist
+- [ ] The daemon is the canonical mutator; roadmap commands proxy state mutations through its single-writer path.
+- [ ] Brief ids passed via `--from-briefs` should be promoted research artefacts (RES-YYYY-MM-DD-NNN).
+- [ ] One phase at a time. Bulk-propose is deferred.
+## Decision surfaces
+`roadmap propose` is the single decision surface: its envelope status is `needs_user`. The runtime adapter (Claude / Codex / OpenCode) maps the envelope's `decision_kind=approve_plan` body to its native confirm UI. `revise`, `apply`, and `drop` emit `ok` envelopes because the operator has already approved via `propose` (or is walking back via `drop`).
+## Output contract
+`status=needs_user` envelope for `propose` (carries `plan_text` + `options`). `status=ok` envelope for `revise`, `apply`, `drop`, `show` — body shape varies per command. JSON envelope is the machine surface; the default text render is for terminal use.
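The per-command status split in the output contract above can be pictured with a short sketch. The names `status`, `decision_kind`, `plan_text`, and `options` come from the skill text. The surrounding dict layout, the `header` keys, and the `build_envelope` helper are hypothetical and are not the eawf API.

```python
# Status per subcommand, as stated in the skill's output contract.
STATUS_BY_COMMAND = {
    "propose": "needs_user",
    "revise": "ok",
    "apply": "ok",
    "drop": "ok",
    "show": "ok",
}


def build_envelope(command, phase_id, plan_text="", options=()):
    """Build a hypothetical envelope dict; the key layout is an assumption."""
    if command not in STATUS_BY_COMMAND:
        raise ValueError(f"unknown roadmap command: {command}")
    envelope = {
        "header": {"skill": "/roadmap", "command": command},
        "status": STATUS_BY_COMMAND[command],
        "body": {"phase_id": phase_id},
    }
    if command == "propose":
        # Only propose carries the decision payload for the runtime adapter.
        envelope["body"].update(
            decision_kind="approve_plan",
            plan_text=plan_text,
            options=list(options),
        )
    return envelope
```

In this sketch a runtime adapter would branch on `status` alone and read `decision_kind` only when the status is `needs_user`.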

package/skills/security-review/SKILL.md ADDED

@@ -0,0 +1,30 @@
+---
+name: security-review
+description: Run the security-audit DSL against a closed scope and emit findings.
+argument-hint: "--spec=<path> [--cwd=<dir>]"
+user-invocable: true
+disable-model-invocation: false
+---
+# /security-review
+## Canonical algorithm
+1. Load the caller-supplied audit spec (a YAML file of declarative checks) via the audit-check DSL; a missing/unreadable `spec_path` degrades to `status=needs_user`.
+2. Dispatch every check through the DSL runner against the target `cwd` (defaults to the process working tree).
+3. Fold the pass/fail tally into the body. The terminal status is `ok` when every check passes and `failed` when any check fails (failing check names surface as repair commands).
+When the active profile is `security`, this skill is a required gate for `phase close`.
+## Pre-flight checklist
+- [ ] `spec_path` points at a readable declarative audit spec.
+- [ ] The scope under review is closed.
+## Decision surfaces
+A missing or unreadable `spec_path` degrades to `status=needs_user`, which routes the operator to an `AskUserQuestion` prompt for the audit spec rather than running an empty check set.
+## Output contract
+Skill envelope with `header.skill = "/security-review"`. Body carries scope_id, spec_path, checks_run, and the per-check findings.
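The status fold described in the algorithm above (`needs_user` for a missing or unreadable spec, `failed` if any check fails, otherwise `ok`) might look like the sketch below. It assumes the DSL runner's results arrive as a plain `{check_name: passed}` mapping. The `fold_results` helper and the `repair_commands` key are hypothetical names, and the spec file is only checked for readability, not parsed.

```python
import os
from pathlib import Path


def fold_results(scope_id, spec_path, results):
    """Fold per-check results into an envelope-like dict.

    `results` maps check name -> bool (True = passed). How the real runner
    reports results is not shown in the source, so this shape is an assumption.
    """
    header = {"skill": "/security-review"}
    body = {"scope_id": scope_id, "spec_path": str(spec_path)}
    spec = Path(spec_path)
    if not spec.is_file() or not os.access(spec, os.R_OK):
        # Missing or unreadable spec: ask the operator, do not run an empty set.
        body.update(checks_run=0, findings=[])
        return {"header": header, "status": "needs_user", "body": body}
    failing = sorted(name for name, passed in results.items() if not passed)
    body.update(
        checks_run=len(results),
        findings=[{"check": n, "passed": p} for n, p in sorted(results.items())],
        repair_commands=failing,  # hypothetical key for the failing check names
    )
    status = "failed" if failing else "ok"
    return {"header": header, "status": status, "body": body}
```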

package/skills/ship/SKILL.md ADDED

@@ -0,0 +1,39 @@
+---
+name: ship
+description: Close out a phase by running the full local CI surface, opening the phase PR, and (after merge) advancing state.
+argument-hint: "<phase-id> [--dry-run]"
+user-invocable: true
+disable-model-invocation: true
+---
+# /ship
+## v0.4 cross-links
+`/ship` reads the phase `CloseReadiness` projection (gate-pack aggregate + `EvidenceRecord` summary + outstanding follow-ups) to decide what gates still need clearing. The phase-close commit only lands once `CloseReadiness.status == "ready"`. The phase-PR body is synthesized from the same projection so reviewer and tool see the same shape. `MEMORY` mutations driven by ship (e.g. release-notes entries) carry an explicit `MutationKind` for downstream audit.
+## Canonical algorithm
+1. Resolve `<phase-id>`; verify all waves under it are complete.
+2. Run the local verification gauntlet (pre-commit, mypy, pytest, ruff).
+3. Validate artifact markdown and PR prose against the chassis/scrub rules.
+4. Push the long-running feature branch.
+5. Open the phase PR via `gh pr create`.
+6. **PR-review pass.** Read remote review comments via `gh pr view <PR> --comments` (or the inline equivalent). For each actionable finding, append a follow-up wave to the current iter via `eawf roadmap revise --add-wave` (not a new iter — per the `iter-phase-close-timing` rule). Implement, re-push, wait for green CI, re-request review until clean.
+7. **Bundle close in the final pre-merge commit.** Once CI is green and the review-passed branch is on the remote, emit a single `[P<NN>] state: close iter + phase (audit=<id>)` commit (the legacy `[P<NN>-CORE] state: ...` form remains valid per the `commit-prefix` block in AGENTS.md) that bundles `eawf iter close P<NN>-I<MM>` + `eawf phase close P<NN>` (no other touched files). The operator merges that commit to end the phase.
+## Pre-flight checklist
+- [ ] All waves under `<phase-id>` are complete.
+- [ ] Cherry-picks from worktree subagents have all landed.
+- [ ] `eawf artifact validate` passes for promoted markdown.
+- [ ] CI on the latest push is green.
+- [ ] `/audit` and `/polish` have already run on the iter — phase close is gated on both per `iter-phase-close-timing`.
+## Decision surfaces
+`gh pr create`, `gh pr merge`, and any push to a protected branch are irreversible/visible-to-others actions per AGENTS.md — surface the final confirm through `AskUserQuestion` (options: `proceed` / `defer` / `abort`) unless `vcs.auto_push`, `vcs.pr_open`, and the merge strategy are pre-resolved by config.
+## Output contract
+Skill envelope carrying the PR URL, the post-merge state mutation, and any deferred follow-ups.
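The close-commit subject from step 7 lends itself to a mechanical check. The sketch below validates a subject against the `[P<NN>] state: close iter + phase (audit=<id>)` shape and accepts the legacy `-CORE` prefix. Three things are assumptions: that `<NN>` is a run of digits, that the legacy form shares the same tail (the source elides it), and that an audit id contains no whitespace or closing parenthesis. `parse_close_subject` is a hypothetical helper, not part of eawf.

```python
import re

# Assumed grammar; see the caveats in the paragraph above.
CLOSE_SUBJECT = re.compile(
    r"^\[P(?P<phase>\d+)(?P<legacy>-CORE)?\] state: close iter \+ phase "
    r"\(audit=(?P<audit>[^\s)]+)\)$"
)


def parse_close_subject(subject):
    """Return (phase, audit_id, is_legacy), or None if the subject does not match."""
    match = CLOSE_SUBJECT.match(subject)
    if match is None:
        return None
    return match["phase"], match["audit"], match["legacy"] is not None
```

For example, `parse_close_subject("[P07] state: close iter + phase (audit=AUD-1)")` yields `("07", "AUD-1", False)`, where `AUD-1` is a made-up audit id.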

package/skills/spike/SKILL.md ADDED

@@ -0,0 +1,65 @@
+---
+name: spike
+description: "Read-only multi-axis direction investigation that unblocks /roadmap propose or /design: N rounds x M axis picks, optional postmortem + scope deltas. No state mutations."
+argument-hint: "<spike-slug> [--final] [--from-briefs <path1,path2,...>] [--postmortem <phase-id>]"
+user-invocable: true
+disable-model-invocation: false
+---
+# /spike
+## Purpose
+A *spike* is a time-boxed read-only investigation that produces **direction** — picks across many design axes — to unblock the next planning skill. Direction-only means no wave plan, no success criteria, and no EU estimates fall out of a spike; those belong to `/design` and `/roadmap propose`. Spike writes only under `.ea/local/`.
+Three failure modes it prevents:
+1. Single-verdict context loss. `/research` answers one question; a rebuild or phase-opening decision has many entangled axes. Spike runs them as multi-round AUQ batches in one session with a rolling matrix.
+2. Silent scope creep. Mid-session picks can quietly expand or reduce surface vs prior briefs. Spike surfaces scope deltas as a mandatory section when prior briefs exist.
+3. Postmortem-without-next-plan. Spike couples a postmortem (gap matrix + root causes + salvage matrix) to direction picks in the same brief so the rebuild plan inherits the lessons.
+## When to invoke
+Reach for `/spike` when multiple decisions block `/roadmap propose` and must be picked together to stay coherent, when a prior attempt shipped but missed its briefs and the next phase is a rebuild, or when direction picks span scope-expansions and -reductions that need explicit acknowledgement. Pick a neighbour instead when one unknown blocks progress and a single verdict lands it (`/research`), or when direction is locked and an interactive surface needs a full statechart + matrix + journey design (`/design`).
+## Canonical algorithm
+1. Resolve the slug from the argument or AUQ. Filename stem: `<YYYY-MM-DD>-<slug>.md`.
+2. Frame the multi-axis unknown in one paragraph; list the prior briefs feeding the spike (`--from-briefs`). On `--postmortem <phase-id>`, declare the phase under postmortem.
+3. Survey: read the cited prior briefs, read source on the verdict ladder (verify-before-claim), run `git log` for shipped-vs-spec drift. Optionally dispatch worktree subagents for independent investigative arms; the parent compiles the chunks.
+4. Multi-round AUQ picks. Each round = 3-6 axes batched into one `AskUserQuestion` call. Decisions accumulate into the rolling matrix; round-close is gated on all axes in the batch answered.
+5. Scope-delta surfacing. When picks diverge from prior briefs, record explicit expansion and reduction tables, each row citing the original brief line and the new pick.
+6. Critical-contracts capture. When picks ripple into process changes (commit-prefix lint, success-criteria shape, schema migration, audit kind), surface them with enforcement + effect named.
+7. Open follow-ups. Enumerate explicitly — "none" is rare. Label each with a next-action (next-spike, next-research, hypothesis-open, blocked-on-EU, blocked-on-demo).
+8. Hand-off declaration. The Summary closes with a literal `next:` line naming the unblocked skill and its args.
+9. Self-lint, then write (on `--final`) `.ea/local/research/<YYYY-MM-DD>-<slug>.md` with the sentinel `<!-- eawf-template: spike-brief -->` on line 1. Return the output envelope.
+## PoC allowance + execution-spike isolation
+A spike MAY produce throwaway runnable artifacts to ground direction picks — smoke demos, probe scripts, config experiments — under `.ea/local/` only (`.ea/local/smoke/<slug>/`, `.ea/local/poc/<slug>/`, or `.ea/local/research/notes/<slug>/`), gitignored, with a manifest table + a "How to run the PoCs" section when >=1 is built.
+A *direction spike* is read-only. An *execution spike* writes code to ground a verdict against a running artifact, so it needs isolation the read-only flow does not. Zero-internal-dep throwaway scripts stay in the local gitignored PoC scope. Code that imports `eawf.*` runs in a dedicated worktree branch off the current feature-branch HEAD (`feature/<symbol>-vX.Y-spike-<slug>`); on a green verdict the commits are cherry-picked into the feature branch (never merged), and on a rejected verdict the worktree is torn down. Do NOT commit spike code straight onto the shared feature branch: a pre-verdict commit interleaves with concurrent sessions and bakes an unratified experiment into history.
+## Brief chassis
+Required sections (the writer rejects on missing): frontmatter (scope URN, `status=local-draft`, created date, agent); the sentinel on line 1; Summary (verdict rollup + direction bullets + the `next:` line); a decision matrix (>=1 round table when picks were made, else a Findings section); Open follow-ups (labelled); References (dense `[N]` rows, repo-relative + external URLs + Eä URNs); Provenance (`store_record=none (local-only spike)`, starting commit SHA, session slug); and Scrub (repo-relative paths only, no PII, placeholder names when project codes appear). Conditionally required: the PoC manifest + run guide when >=1 PoC was built, the postmortem arc when `--postmortem` is passed, and the scope-delta tables when `--from-briefs` cites prior briefs.
+## Pre-flight checklist
+- [ ] No state mutations; `state.json` untouched. PoCs under `.ea/local/{smoke,poc,research/notes}/` only (gitignored).
+- [ ] For an execution spike that landed runnable code: the code is isolated (worktree branch for internal deps, local PoC scope for zero deps) and ships a `.ea/local/` user test guide.
+- [ ] Multi-round picks via `AskUserQuestion`, never free-text; recommended option first, labelled.
+- [ ] Every direction pick carries a one-line rationale (`[N]` cite or "because X").
+- [ ] Scope deltas surfaced when `--from-briefs` cited; critical contracts surfaced when picks ripple into process change.
+- [ ] Open follow-ups enumerated with next-action labels; `next:` line present in the Summary.
+- [ ] References dense `[N]`, repo-relative only; Provenance records the starting commit SHA + session slug; Scrub confirms no PII.
+- [ ] Brief filename `<YYYY-MM-DD>-<slug>.md` under `.ea/local/research/`; sentinel `<!-- eawf-template: spike-brief -->` on line 1.
+## Decision surfaces
+Round axis picks, scope-delta acknowledgement, and the hand-off declaration are all surfaced through `AskUserQuestion` batches — the hand-off AUQ options name the unblocked skill (`/roadmap propose`, another `/spike`, `/research --final`, or `/design`).
+## Output contract
+Eä-rendered envelope (`OutputEnvelope`) with `header.skill = "/spike"`. Body carries the Summary + `next:` line, the round tables (decision matrix), the postmortem arc when `--postmortem`, the scope deltas when `--from-briefs` cited, the critical contracts when applicable, the labelled open follow-ups, and the References + Provenance + Scrub chassis tail. The footer records the persisted brief path + sentinel when `--final` was passed.
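Two of the pre-flight items above (the brief filename under `.ea/local/research/` and the sentinel on line 1) are simple enough to check in code. The sketch below is only an illustration of that part of the self-lint, not the package's writer. The slug character set is an assumption, and `brief_path` and `lint_brief` are hypothetical helpers.

```python
import datetime
import re
from pathlib import Path

SENTINEL = "<!-- eawf-template: spike-brief -->"
RESEARCH_DIR = Path(".ea/local/research")
# Assumption: slugs are lowercase alphanumerics separated by single hyphens.
SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def brief_path(slug, day):
    """Build <YYYY-MM-DD>-<slug>.md under the local research directory."""
    if not SLUG_RE.match(slug):
        raise ValueError(f"unexpected slug: {slug!r}")
    return RESEARCH_DIR / f"{day.isoformat()}-{slug}.md"


def lint_brief(path, text):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    path = Path(path)
    if path.parent != RESEARCH_DIR:
        problems.append("brief is not under .ea/local/research/")
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}-.+\.md", path.name):
        problems.append("filename is not <YYYY-MM-DD>-<slug>.md")
    first_line = text.split("\n", 1)[0].rstrip("\r")
    if first_line != SENTINEL:
        problems.append("sentinel missing from line 1")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every chassis violation in a single pass.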