npm - @ai-agent-lead/skills - Versions diffs - 1.3.0 → 1.5.0 - Mend

@ai-agent-lead/skills 1.3.0 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (40)

package/README.md +30 -0
package/package.json +1 -1
package/skills/README.md +29 -20
package/skills/SKILL-TEMPLATE.md +9 -3
package/skills/TRIGGERS.md +3 -3
package/skills/WORKFLOWS.md +17 -16
package/skills/bench/SKILL.md +20 -4
package/skills/bootstrap/SKILL.md +24 -15
package/skills/caveman/SKILL.md +1 -3
package/skills/debug/SKILL.md +0 -2
package/skills/design/ILLEGAL-STATES.md +12 -1
package/skills/design/SKILL.md +10 -3
package/skills/design/templates/design-note.md +47 -0
package/skills/feature-doc/SKILL.md +12 -20
package/skills/{code-hygiene/SKILL.md → formats/CODE-HYGIENE.md} +24 -29
package/skills/formats/CONTEXT-FORMAT.md +5 -5
package/skills/formats/CONTEXT-MAP-FORMAT.md +14 -3
package/skills/formats/CONVENTION-FORMAT.md +40 -0
package/skills/formats/DOC-DRIFT-AUDIT.md +56 -0
package/skills/formats/OKF.md +13 -9
package/skills/formats/{STYLE-comments.md → STYLE-COMMENTS.md} +1 -1
package/skills/grill-plan/SKILL.md +5 -3
package/skills/improve-codebase-architecture/INTERFACE-DESIGN.md +1 -1
package/skills/improve-codebase-architecture/SKILL.md +7 -18
package/skills/investigate/SKILL.md +14 -2
package/skills/pr-review/SKILL.md +7 -22
package/skills/prod-ready/SKILL.md +16 -17
package/skills/security-review/SKILL.md +12 -4
package/skills/simplify/SKILL.md +6 -15
package/skills/system-design/SKILL.md +29 -19
package/skills/tdd/SKILL.md +5 -6
package/skills/tdd-rounds/COMMITS.md +4 -4
package/skills/tdd-rounds/SKILL.md +33 -9
package/skills/tdd-rounds/templates/builder-brief.md +4 -4
package/skills/verify-real-deps/SKILL.md +16 -4
package/skills/verify-real-deps/templates/known-issues.md +33 -28
package/skills/zoom-out/SKILL.md +1 -3
package/skills/sync-check/SKILL.md +0 -69
package/skills/{design → formats}/OBSERVABILITY.md +0 -0
package/skills/{design → formats}/PERSONAS.md +0 -0

package/skills/security-review/SKILL.md CHANGED

@@ -1,8 +1,6 @@
 ---
 name: security-review
 description: Threat-model + control-review pass for surface-changing work — runs alongside `tdd` and as a heavier gate than `prod-ready` Section 3 when the change introduces or alters trust boundaries, authentication, authorization, sensitive data flows, or external surfaces. Use when the user mentions "security review", "threat model", "STRIDE", "auth flow", "permissions", "secrets", "PII", "public API", "external surface", "abuse", "hardening", or whenever a feature doc / PR adds a new entry point, identity flow, or sensitive-data path. Skip for pure-internal refactors with no surface change, dependency bumps that don't change runtime behavior, or doc-only changes. Pairs with `feature-doc` (informs the threat model) and `prod-ready` (which has the lighter operational-security checklist for every change).
-complexity: high
-expected_duration: 30 minutes
 ---
 # Security Review
@@ -118,7 +116,17 @@ Most reviews produce a **`## Security review` section appended to the feature do
 - <risk> — accepted because <reason>; tracked at <link or follow-up>
 ```
-For high-stakes changes (new auth flow, new external surface, regulated data), promote to `docs/security/<feature>.md` with a fuller threat-model section. Same shape, just longer.
+For high-stakes changes (new auth flow, new external surface, regulated data), promote to a dedicated `docs/security/<feature>.md` with a fuller threat-model section — same shape, just longer. As a standalone produced doc it opens with OKF frontmatter (`type: security`, per [`../formats/OKF.md`](../formats/OKF.md) §2); the `## Security review` section appended to a feature doc needs none, since the feature doc already carries its own:
+```yaml
+---
+type: security
+title: <Feature> security review
+description: Threat model and verified controls for <feature>'s surface change.
+tags: [security, threat-model]
+timestamp: YYYY-MM-DD
+---
+```
 ## Anti-patterns
@@ -141,5 +149,5 @@ For high-stakes changes (new auth flow, new external surface, regulated data), p
 - The security surface (entry points + trust zones + data flows) is named, not assumed.
 - Each plausible threat has a verified control or an explicit deferral with rationale.
 - High-impact threats have two independent layers of control where feasible.
-- The artifact (feature-doc section or `docs/security/<feature>.md`) is in the repo.
+- The artifact is in the repo — a `## Security review` feature-doc section, or a standalone `docs/security/<feature>.md` opening with `type: security` OKF frontmatter.
 - The PR description references the review so the reviewer can audit it.
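
The frontmatter requirement added in this file lends itself to a mechanical check. The following is a minimal sketch, not part of the package. It assumes that the five keys shown in the YAML example (`type`, `title`, `description`, `tags`, `timestamp`) are all required, which the diff does not confirm. The helper names `read_frontmatter` and `check_security_doc` are hypothetical, and the parsing is a naive `key: value` split rather than a real YAML parser.

```python
# Assumption: every key from the example block is required.
REQUIRED_KEYS = ("type", "title", "description", "tags", "timestamp")


def read_frontmatter(text: str) -> dict[str, str]:
    """Return the key/value pairs of a leading '---' block, or {} if there is none."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as missing frontmatter


def check_security_doc(text: str) -> list[str]:
    """List the problems found in a standalone security-review doc."""
    fields = read_frontmatter(text)
    if not fields:
        return ["missing frontmatter"]
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in fields]
    if fields.get("type") != "security":
        problems.append("type must be 'security'")
    return problems
```

A check like this would only apply to the standalone `docs/security/<feature>.md` case; the `## Security review` section appended to a feature doc carries no frontmatter of its own.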

package/skills/simplify/SKILL.md CHANGED

@@ -1,8 +1,6 @@
 ---
 name: simplify
-description: Single end-of-round sweep that tightens what `tdd` just left green — review every changed file for reuse, quality, efficiency, and test relevance. Use after `tdd` reaches green and before opening a PR (or before a Builder closes a round in `tdd-rounds`). Triggered by phrases like "simplify pass", "tighten this", "clean up before commit", "end-of-round sweep", or appearing as a step in a Builder brief. Skip for trivial diffs (typo, dep bump, doc-only). Pairs with `tdd` (runs immediately after green), `code-hygiene` (the lens applied during the sweep), and `pr-review` (a self-check after this).
-complexity: low
-expected_duration: 10 minutes
+description: Single end-of-round sweep that tightens what `tdd` just left green — review every changed file for reuse, quality, efficiency, test relevance, and telemetry. Use after `tdd` reaches green and before opening a PR (or before a Builder closes a round in `tdd-rounds`). Triggered by phrases like "simplify pass", "tighten this", "clean up before commit", "end-of-round sweep", or appearing as a step in a Builder brief. Skip for trivial diffs (typo, dep bump, doc-only). Pairs with `tdd` (runs immediately after green), the `code-hygiene` lens (`formats/CODE-HYGIENE.md`, applied during the sweep), and `pr-review` (a self-check after this).
 ---
 # Simplify
@@ -34,7 +32,7 @@ The simplify pass catches these once, deliberately, before the diff lands. Witho
 - Slices where the diff is one or two lines and there's nothing to sweep.
 - During red phase — never refactor while red. See [`tdd/SKILL.md`](../tdd/SKILL.md).
-## The four lenses
+## The five lenses
 Walk every changed file. Apply each lens in order. Fix what you find inline.
@@ -49,14 +47,7 @@ Walk every changed file. Apply each lens in order. Fix what you find inline.
 - Names that read clearly out of context — would a stranger guess what `result`, `data`, `value` referred to? If not, rename.
 - Error messages that name the failing input — `"could not parse: <value>"` beats `"parse error"`.
 - Abstractions that haven't earned their keep — a base class with one subclass, an interface with one implementation. Inline.
-- Comments — **default during the sweep is DELETE**:
-  - Delete WHAT-comments (the code already says it).
-  - Delete docstrings on exports whose contract is obvious from the signature.
-  - Delete "used by X" / "added for Y" caller references — these rot; use grep.
-  - Delete commented-out code (git has it).
-  - Delete banner / ASCII-art dividers and in-function section headers (`// validate`, `// build response`).
-  - KEEP only why-comments — constraint, invariant, trade-off, provenance — and only when the next reader would otherwise reattempt the rejected alternative.
-  - If a kept comment cites a round / ADR / AC / snapshot, normalize to [`skills/formats/STYLE-comments.md`](../formats/STYLE-comments.md) §3 (`ADR-007 §7`, `R6 AC-3`, `v0.3 R1b2`, `snapshot §"Invariants"`).
+- Comments — **default during the sweep is DELETE.** Apply [`CODE-HYGIENE.md`](../formats/CODE-HYGIENE.md) Principle 6 and the bar in [`STYLE-COMMENTS.md`](../formats/STYLE-COMMENTS.md): delete WHAT-comments, obvious-from-signature docstrings, "used by X" caller references, commented-out code, banners, and in-function section headers (`// validate`, `// build response`); keep only why-comments the next reader would otherwise reattempt. Normalize any kept citation (`ADR-007 §7`, `R6 AC-3`, `v0.3 R1b2`) to [`STYLE-COMMENTS.md`](../formats/STYLE-COMMENTS.md) §3.
 ### 3. Efficiency — dead code, redundant work, premature defensive checks
@@ -76,7 +67,7 @@ For each test added in the slice:
 ### 5. Telemetry — the observability pass
-Apply the lens from [`skills/design/OBSERVABILITY.md`](../design/OBSERVABILITY.md):
+Apply the lens from [`skills/formats/OBSERVABILITY.md`](../formats/OBSERVABILITY.md):
 - [ ] Does every error message name the failing input?
 - [ ] Are there any "catch-all" blocks that swallow errors?
@@ -100,13 +91,13 @@ In single-feature flow (no rounds), the sweep can land as a separate commit befo
 ## Pairing with other skills
 - **`tdd`** runs first. Simplify runs after green. Never simplify while red.
-- **`code-hygiene`** is the lens — its 5 principles (boring code, naming, YAGNI, rule of 3, locality) are what you apply during the sweep. Read it once; apply it many times.
+- **[`code-hygiene`](../formats/CODE-HYGIENE.md)** is the lens — its seven principles (boring code, naming, YAGNI, rule of 3, locality, comments, constants placement) are what you apply during the sweep. Read it once; apply it many times.
 - **`pr-review`** comes after — a self-check against the diff. Some of the same lenses, applied as a reviewer rather than an author.
 - **`improve-codebase-architecture`** is the escalation — when simplify surfaces structural issues bigger than a sweep can fix.
 ## Done when
-- Every changed file walked once with the four lenses.
+- Every changed file walked once with the five lenses.
 - Each finding either fixed inline (small) or surfaced as a follow-up (large).
 - Tests still green after the sweep — re-run them.
 - A separate `simplify` commit exists (in `tdd-rounds`) or the sweep is captured in the final commit message (in single-feature flow).
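
Two of the checks in this file (error messages that name the failing input, and no catch-all blocks that swallow errors) can be illustrated in a few lines. This is a sketch under assumptions: `parse_port` is a hypothetical function invented for illustration, not something the package ships.

```python
def parse_port(raw: str) -> int:
    """Parse a TCP port, naming the failing input in every error."""
    try:
        port = int(raw)
    except ValueError as exc:
        # Name the input instead of a bare "parse error", and chain the cause
        # rather than swallowing it.
        raise ValueError(f"could not parse port: {raw!r}") from exc
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port
```

The `except` clause catches only `ValueError` and re-raises with context, so nothing is silently dropped and the original exception stays attached as the cause.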

package/skills/system-design/SKILL.md CHANGED

@@ -1,8 +1,6 @@
 ---
 name: system-design
 description: System-level architecture for greenfield work — name the modules, define responsibilities, set dependency direction, identify seams. Use when starting a new multi-module system or service from scratch, when defining topology before code lands, or when the user mentions "system architecture", "module boundaries", "service boundaries", "how should I structure this system", "draw the architecture", "topology". Skip for single-module work — use `design` instead. Skip for reorganizing an existing codebase — use `improve-codebase-architecture`. Pairs with `investigate` (comes before, surveying options) and `design` (comes after, shaping each module's interface).
-complexity: high
-expected_duration: 30 minutes
 ---
 # System Design
@@ -38,7 +36,7 @@ Before drawing anything, read what already exists:
 - `docs/adr/` — decisions already taken.
 - The feature set — what the system has to do.
-If `CONTEXT.md` doesn't exist yet, **resolve the core domain terms first** — defer to [`grill-plan` bootstrap mode](../grill-plan/BOOTSTRAP.md) (single source of truth for the lazy-creation rules). Module names without domain terms are placeholders.
+If `CONTEXT.md` doesn't exist yet, **resolve the core domain terms first** — defer to [`grill-plan` bootstrap mode](../bootstrap/BOOTSTRAP.md) (single source of truth for the lazy-creation rules). Module names without domain terms are placeholders.
 ### 2. List the modules
@@ -80,7 +78,19 @@ Each seam is a potential test boundary AND a potential failure point. Naming the
 ### 5. Draw the system map
-Output an ASCII diagram + a module table + a seam list. Save to `docs/architecture.md` (create lazily on first use).
+Output an ASCII diagram + a module table + a seam list, saved to `docs/architecture.md` (create lazily on first use). It's a produced doc, so it opens with OKF frontmatter (`type: architecture`, per [`../formats/OKF.md`](../formats/OKF.md) §2):
+```yaml
+---
+type: architecture
+title: <System> architecture
+description: Module topology, dependency direction, and seams for <system>.
+tags: [architecture, topology]
+timestamp: YYYY-MM-DD
+---
+```
+Below the frontmatter, the body — ASCII map, module table, seam list:
 ```
                  ┌──────────────┐
@@ -127,14 +137,6 @@ Before declaring the design done, check each item:
 Failures here are not "warnings" — they're "stop and rework". The system map is load-bearing for everything downstream.
-## Done when
-- `docs/architecture.md` exists with: module table, dependency direction, seam list, ASCII map.
-- Each module name comes from `CONTEXT.md` vocabulary (not framework conventions).
-- The dependency graph is acyclic and explicitly reviewed.
-- Every cross-module boundary has a named seam + adapter location.
-- For any decision you're least sure about, you've raised it as an open question to the user before locking in.
 ## Anti-patterns
 - **Layering instead of slicing.** Naming modules `controllers`, `services`, `repositories` is layering — a category split, not a responsibility split. Prefer feature slicing (`ordering/`, `billing/`).
@@ -142,6 +144,21 @@ Failures here are not "warnings" — they're "stop and rework". The system map i
 - **Premature service extraction.** Splitting modules into separate processes / deployments without a real reason (scale, team, fault isolation) adds operational cost without clarity benefit. Prefer in-process modules first; extract later if needed.
 - **Cycles "for now".** A cyclic dependency is a sign the modules aren't really separate. Don't accept it; merge them or insert a third module to break the cycle.
+## Pairing with other skills
+- **`investigate`** runs *before*. Investigates the problem and chooses the direction. `system-design` then operationalizes the chosen direction into a topology.
+- **`design`** runs *after*, per module. `system-design` says what modules should exist; `design` says what shape each module has.
+- **`grill-plan`** runs alongside or after, to stress-test high-stakes topology decisions.
+- **`improve-codebase-architecture`** is the *opposite* skill: given an existing codebase, find what to deepen. `system-design` is greenfield; that one is brownfield.
+## Done when
+- `docs/architecture.md` exists, opening with `type: architecture` OKF frontmatter, and contains: module table, dependency direction, seam list, ASCII map.
+- Each module name comes from `CONTEXT.md` vocabulary (not framework conventions).
+- The dependency graph is acyclic and explicitly reviewed.
+- Every cross-module boundary has a named seam + adapter location.
+- For any decision you're least sure about, you've raised it as an open question to the user before locking in.
 ## Handoff
 Once the system map is stable:
@@ -151,10 +168,3 @@ Once the system map is stable:
 - Then proceed to `feature-doc` for the first feature using the topology.
 If the system map needs to evolve later (new module, changed boundaries, removed seam), come back to this skill — don't drift the map silently.
-## Pairing with other skills
-- **`investigate`** runs *before*. Investigates the problem and chooses the direction. `system-design` then operationalizes the chosen direction into a topology.
-- **`design`** runs *after*, per module. `system-design` says what modules should exist; `design` says what shape each module has.
-- **`grill-plan`** runs alongside or after, to stress-test high-stakes topology decisions.
-- **`improve-codebase-architecture`** is the *opposite* skill: given an existing codebase, find what to deepen. `system-design` is greenfield; that one is brownfield.
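
The "dependency graph is acyclic" criterion can be checked mechanically once the module table exists. Below is a minimal sketch using a depth-first search. The function name `find_cycle` and the dict-of-lists representation are assumptions for illustration; the skill itself only asks for an explicit review.

```python
from typing import Optional


def find_cycle(deps: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of module names, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = {module: WHITE for module in deps}
    path: list[str] = []

    def visit(module: str) -> Optional[list[str]]:
        color[module] = GREY
        path.append(module)
        for dep in deps.get(module, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: close the loop.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[module] = BLACK
        return None

    for module in deps:
        if color[module] == WHITE:
            cycle = visit(module)
            if cycle:
                return cycle
    return None
```

With feature-sliced modules such as `ordering` and `billing`, a result like `["ordering", "billing", "ordering"]` names the exact loop to break, either by merging the two modules or by inserting a third one.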

package/skills/tdd/SKILL.md CHANGED

@@ -1,8 +1,6 @@
 ---
 name: tdd
-description: Test-driven development with the red-green-refactor loop. Use when implementing a feature, fixing a bug, changing core logic, or when the user mentions "TDD", "test-first", "red-green-refactor", or "integration tests". Skip for trivial UI glue, config changes, or pure docs edits.
-complexity: medium
-expected_duration: 20 minutes
+description: Test-driven development with the red-green-refactor loop. Use when implementing a feature, fixing a bug, changing core logic, or when the user mentions "TDD", "test-first", "red-green-refactor", or "integration tests". Skip for trivial UI glue, config changes, or pure docs edits. Pairs with `feature-doc` (its ACs become the test list), `debug` (run first when the root cause isn't yet known), and `simplify` (the end-of-round sweep after green).
 ---
 # Test-Driven Development
@@ -33,9 +31,9 @@ Tests verify **behavior through public interfaces**, not implementation details.
 See [TESTS.md](TESTS.md) for concrete good/bad examples and a pre-commit checklist.
-## Anti-pattern: Horizontal Slicing
+## Slice vertically, one test at a time
-**Do not write all tests first, then all implementation.** This is the most common TDD failure mode for AI agents — you write five tests in one shot, then five matching functions. The tests describe *imagined* behavior, become insensitive to real bugs, and lock in the wrong shape.
+The principle that drives the workflow below. **Do not write all tests first, then all implementation.** This is the most common TDD failure mode for AI agents — you write five tests in one shot, then five matching functions. The tests describe *imagined* behavior, become insensitive to real bugs, and lock in the wrong shape.
 ```
 WRONG (horizontal):
@@ -74,7 +72,7 @@ One test → minimum code to pass → next test. Each cycle informs the next.
 ### 4. Simplify pass — end-of-round, after green
-Before declaring the round done, run the [`simplify`](../simplify/SKILL.md) skill. It walks every changed file with four lenses — reuse, quality, efficiency, test relevance — and is a single sweep, not a license to refactor everything. If you find structural issues that need a dedicated round, surface them as `Open questions for parent` instead.
+Before declaring the round done, run the [`simplify`](../simplify/SKILL.md) skill. It walks every changed file with five lenses — reuse, quality, efficiency, test relevance, telemetry — and is a single sweep, not a license to refactor everything. If you find structural issues that need a dedicated round, surface them as `Open questions for parent` instead.
 ## Rules
@@ -92,6 +90,7 @@ Before declaring the round done, run the [`simplify`](../simplify/SKILL.md) skil
 ## Anti-patterns
+- Horizontal slicing — all tests first, then all implementation. See *Slice vertically, one test at a time* above; it's the failure mode this skill most guards against.
 - Writing tests after the code "to get coverage" — defeats the design feedback loop.
 - Mocking everything — tests pass but the system is broken. Mock at boundaries only (external APIs, time, randomness).
 - One giant test asserting many things — failures become hard to diagnose.
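
The "mock at boundaries only" rule is easiest to see with time, one of the boundaries the skill names. The sketch below is illustrative only: `is_expired` and its `now` parameter are hypothetical names, and the clock is the single injected seam while the rest of the logic runs for real.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


def is_expired(issued_at: datetime, ttl: timedelta,
               now: Callable[[], datetime] = utc_now) -> bool:
    """True once `ttl` has elapsed since `issued_at`; `now` is the only seam."""
    return now() - issued_at >= ttl
```

A test passes a fixed clock, for example `now=lambda: issued_at + timedelta(minutes=31)`, and asserts on the returned value, so nothing inside the function is mocked.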

package/skills/tdd-rounds/COMMITS.md CHANGED

@@ -38,10 +38,10 @@ End-of-round refactoring (extracting helpers, removing dead branches, tightening
 ### 5. Doc-only commits stay separate from code
-`docs/STATE.md`, `docs/known-issues.md`, ADR patches — never bundle into a code commit. Mixing makes both diffs harder to read. Doc-only commits get a `docs:` prefix (no `R<N>:`):
+`docs/features/<name>.state.md`, `docs/known-issues.md`, ADR patches — never bundle into a code commit. Mixing makes both diffs harder to read. Doc-only commits get a `docs:` prefix (no `R<N>:`):
 ```
-docs: append actual R14 entry to STATE.md
+docs: append actual R14 entry to <name>.state.md
 docs: add public README
 ```
@@ -55,7 +55,7 @@ If you can't justify it, don't lump it.
 ### 7. Commit messages are honest
-The commit message states what the commit actually does, not what you wished it did. The lesson from this skill's source project: an R14 commit message claimed "STATE.md updated" but the Builder had forgotten that file — a follow-up commit was needed to restore truth. **Cost of an honest message: one extra line of typing. Cost of a lying message: a follow-up commit, a smudged history, and a lost minute the next time someone wonders why the round entry is missing.**
+The commit message states what the commit actually does, not what you wished it did. The lesson from this skill's source project: an R14 commit message claimed "state file updated" but the Builder had forgotten that file — a follow-up commit was needed to restore truth. **Cost of an honest message: one extra line of typing. Cost of a lying message: a follow-up commit, a smudged history, and a lost minute the next time someone wonders why the round entry is missing.**
 If a step you described in the brief didn't actually land, say so in `Deviations`. If a commit's diff doesn't match its message, fix the message before committing — or split into two commits if the message's intent really was two things.
@@ -109,7 +109,7 @@ Use whatever model line matches the actual driver. Match exactly the convention
 ## Anti-patterns
 - **Mono-commit at end of round.** "R<N>: round done." Defeats the per-slice review story.
-- **Bundling docs with code.** STATE.md updates ride along with the code change. Diffs become harder to read.
+- **Bundling docs with code.** State-file updates ride along with the code change. Diffs become harder to read.
 - **Imperative-mood squashing.** Four sequential "wip" commits eventually amended into one. The journey isn't the commit; the destination is. Don't force-amend; commit cleanly the first time.
 - **Empty bodies on non-trivial commits.** "fix bug" with no body. The body is where the *why* lives.
 - **Lying about scope.** A commit message saying "also updates X" when the diff has no X. Worse than no message.
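
The prefix and bundling rules in this file could be enforced by a small pre-commit check. What follows is a sketch under stated assumptions, not tooling the package provides: `check_commit` and `is_doc_path` are hypothetical, and treating "anything under `docs/` or any Markdown file" as documentation is a guess at what the rules mean by a doc. The `#<X>` bug-fix marker is left out because its exact shape does not appear in this diff.

```python
import re

ROUND_PREFIX = re.compile(r"^R\d+: ")


def is_doc_path(path: str) -> bool:
    # Assumption: a "doc" is anything under docs/ or any Markdown file.
    return path.startswith("docs/") or path.endswith(".md")


def check_commit(subject: str, paths: list[str]) -> list[str]:
    """Return rule violations for one commit's subject line and changed paths."""
    doc_flags = [is_doc_path(p) for p in paths]
    problems = []
    if any(doc_flags) and not all(doc_flags):
        problems.append("docs bundled with code")
    elif doc_flags and all(doc_flags):
        if not subject.startswith("docs: "):
            problems.append("doc-only commit needs a 'docs:' prefix")
    elif not ROUND_PREFIX.match(subject):
        problems.append("code commit needs an 'R<N>:' prefix")
    return problems
```

The check reads the changed paths rather than trusting the subject line, which is also the cheapest guard against the "lying about scope" anti-pattern.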

package/skills/tdd-rounds/SKILL.md CHANGED

@@ -1,8 +1,6 @@
 ---
 name: tdd-rounds
-description: Multi-round TDD orchestration. Use when delivering a feature larger than one TDD slice — typically 5-15 acceptance criteria across multiple packages — by dispatching Builder sub-agents per round, with the parent maintaining state and verifying. Triggered when the user mentions "drive the sub-agent team", "multi-round TDD", "orchestrate rounds", "Builder agents", or when a plan from `feature-doc` lists more ACs than one agent should reasonably tackle in a single invocation. Pairs with `tdd` (Builders invoke that skill per round) and `prod-ready` (final round before tag).
-complexity: high
-expected_duration: 1 hour
+description: Multi-round TDD orchestration. Use when delivering a feature larger than one TDD slice — typically 5-15 acceptance criteria across multiple packages — by dispatching Builder sub-agents per round, with the parent maintaining state and verifying. Triggered when the user mentions "drive the sub-agent team", "multi-round TDD", "orchestrate rounds", "Builder agents", or when a plan from `feature-doc` lists more ACs than one agent should reasonably tackle in a single invocation. Skip for a single-AC bug fix or single-package feature — invoke `tdd` directly, with no round overhead. Pairs with `tdd` (Builders invoke that skill per round) and `prod-ready` (final round before tag).
 ---
 # Multi-Round TDD Orchestration
@@ -25,8 +23,8 @@ Distills the pattern of a parent agent driving Builder sub-agents through a feat
 - **Dispatch on a feature branch, never `main`.** Verify the current branch before dispatching the first Builder; if it's `main` / `master`, create `feat/<short-name>` first. The whole multi-round delivery lives on one feature branch (or a stack of feature branches), merged to `main` only at the end.
 - **Plan once.** The plan file lists rounds, each round's ACs, the dependency order, and the skills cadence per round (which rounds need `design`; when to invoke `improve-codebase-architecture` mid-project; when to invoke `prod-ready`).
 - **Brief per round.** A self-contained brief — the Builder shouldn't need conversation history.
-  - **Briefing Sub-agent**: For complex rounds, invoke a sub-agent to autonomously generate the brief by analyzing `docs/STATE.md`, the `feature-doc`, and the results of the previous round. See `templates/builder-brief.md` for the schema.
-- **Verify after each round.** Run the test command independently (don't trust the Builder's pasted output), read the diff, tick AC checkboxes in the feature doc with the test name, append a round summary to `docs/STATE.md`.
+  - **Briefing Sub-agent**: For complex rounds, invoke a sub-agent to autonomously generate the brief by analyzing the feature's `docs/features/<name>.state.md`, the `feature-doc`, and the results of the previous round. See `templates/builder-brief.md` for the schema.
+- **Verify after each round.** Run the test command independently (don't trust the Builder's pasted output), read the diff, tick AC checkboxes in the feature doc with the test name, append a round summary to `docs/features/<name>.state.md`.
 - **Never write code yourself.** If you find yourself doing it directly under time pressure, that's a signal the round was misscoped — split it.
 ## The Builder's contract
@@ -34,7 +32,7 @@ Distills the pattern of a parent agent driving Builder sub-agents through a feat
 When dispatched for a round, the Builder:
 0. **Pre-flight.** Verifies the current branch is **not** `main` / `master`. If it is, refuses immediately and returns a blocking report — does not proceed to step 1.
-1. Reads the brief, the plan file, `docs/STATE.md`, and any ADRs / feature docs the brief cites.
+1. Reads the brief, the plan file, the feature's `docs/features/<name>.state.md`, and any ADRs / feature docs the brief cites.
 2. **Executes the listed skills sequentially in this single invocation.** When a brief says "Skills (in order): design, tdd, simplify" — that means run all three in this run, not return to the parent between them. This rule is non-obvious; the brief template makes it explicit.
 3. Commits per AC slice (or per behavior slice for refactor rounds), prefix `R<N>:`. **Read [`COMMITS.md`](COMMITS.md) before the first commit** — it captures the seven rules (`R<N>:` prefix, `#<X>` for bug fixes, per-AC slicing, separate simplify-pass commit, separate doc commits, single-commit-with-justification, honest messages) and the message-body shape. The parent reads commits as the review surface; a clean sequence is reviewable one diff at a time, a mono-commit isn't.
 4. Returns the structured report from `templates/builder-report.md`.
@@ -46,18 +44,28 @@ When dispatched for a round, the Builder:
 |---|---|---|
 | [`design`](../design/SKILL.md) | Builder | Rounds introducing a new package or non-trivial interface. Skip on rounds that only extend existing modules. |
 | [`tdd`](../tdd/SKILL.md) | Builder | **Every** code-writing round. Mandatory. |
-| [`simplify`](../simplify/SKILL.md) | Builder | End of every round, after green. Single sweep — reuse / quality / efficiency / test relevance. Lands as its own commit ([COMMITS.md rule 4](COMMITS.md)). |
+| [`simplify`](../simplify/SKILL.md) | Builder | End of every round, after green. Single sweep — reuse / quality / efficiency / test relevance / telemetry. Lands as its own commit ([COMMITS.md rule 4](COMMITS.md)). |
 | [`improve-codebase-architecture`](../improve-codebase-architecture/SKILL.md) | **Parent** | Once mid-project, after the load-bearing seams exist. Surfaced opportunities become dedicated refactor rounds (no behavior change; ACs are "all existing tests still green"). |
 | [`prod-ready`](../prod-ready/SKILL.md) | Builder, final round only | Pre-tag operational checklist. Output the filled checklist verbatim in the Builder's report. |
 | [`verify-real-deps`](../verify-real-deps/SKILL.md) | **Parent** | After `prod-ready` clean, before tagging. Catches what unit/integration tests can't see — wire-shape mismatches against real third-party APIs. |
 ## State tracking
-`docs/STATE.md` is the parent-maintained running summary, append-only, ~one entry per round. The Builder reads it as pre-flight context for the round; the parent appends after each round closes. Single source of truth for "what we have so far" between Builder invocations.
+Each feature has one state file, `docs/features/<name>.state.md` — a sibling to its contract (`<name>.md`) and design note (`<name>.design.md`). It is the parent-maintained running summary, append-only, ~one `## Round N` section per round. The Builder reads it as pre-flight context for the round; the parent appends after each round closes. Single source of truth for "what we have so far" between Builder invocations. There is no global `docs/STATE.md`; `docs/index.md` is the feature manifest (per [ADR-0001](../../docs/adr/0001-distributed-state.md)).
-Format per round:
+It is a produced doc, so it opens with OKF frontmatter (`type: state`, per [`../formats/OKF.md`](../formats/OKF.md) §2), then one section per round:
 ```
+---
+type: state
+title: <Feature Name> — state
+description: Round-by-round state for the <name> feature.
+tags: [<name>, state]
+timestamp: YYYY-MM-DD
+---
+# <Feature Name> — state
 ## Round N — <title> (DONE YYYY-MM-DD)
 Delivered: AC-NN, AC-NN
 New packages: <list>
@@ -89,6 +97,22 @@ Known follow-ups: <one line>
 - [`COMMITS.md`](COMMITS.md) — commit cadence and message style (per-AC slicing, `R<N>:` prefix, when single-commit is OK, honesty rule). Builders read this before the first commit.
+## Pairing with other skills
+- **[`feature-doc`](../feature-doc/SKILL.md)** — runs *before*. Its AC list is what splits into rounds.
+- **[`tdd`](../tdd/SKILL.md)** — *invoked* by every code-writing round (mandatory).
+- **[`design`](../design/SKILL.md)**, **[`simplify`](../simplify/SKILL.md)**, **[`prod-ready`](../prod-ready/SKILL.md)**, **[`verify-real-deps`](../verify-real-deps/SKILL.md)** — *invoked* per the Skills-cadence table above.
+- **[`improve-codebase-architecture`](../improve-codebase-architecture/SKILL.md)** — run *once by the parent* mid-project, after the load-bearing seams exist.
+- **[`grill-plan`](../grill-plan/SKILL.md)** — *escalation* when an ADR wobbles mid-round.
+## Done when
+- Every AC in the feature doc is ticked with the name of the test that proves it.
+- `docs/features/<name>.state.md` has one append-only `## Round N` section per round.
+- The test suite was green at the end of **every** round (parent-verified independently), not only at the end.
+- The final round ran `prod-ready`; `verify-real-deps` ran before the tag; `docs/known-issues.md` has zero `Open` entries (or explicit deferred-by-design ones).
+- The whole delivery lived on a feature branch and merged to `main` only at the end.
 ## Handoff
 When the final round completes:
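
The "one append-only `## Round N` section per round" criterion can be verified by reading the headings of the state file. The sketch below takes the heading shape from the template in this diff. It assumes rounds are numbered consecutively from 1, which the skill does not state, and the function names `round_numbers` and `check_rounds` are hypothetical.

```python
import re

# Heading shape from the state-file template: "## Round N — <title> (DONE YYYY-MM-DD)".
ROUND_HEADING = re.compile(r"^## Round (\d+) — (.+) \(DONE (\d{4}-\d{2}-\d{2})\)$")


def round_numbers(state_text: str) -> list[int]:
    """Return the round numbers in the order their headings appear."""
    numbers = []
    for line in state_text.splitlines():
        match = ROUND_HEADING.match(line)
        if match:
            numbers.append(int(match.group(1)))
    return numbers


def check_rounds(state_text: str, expected_rounds: int) -> list[str]:
    """Report missing, duplicated, or out-of-order round sections."""
    found = round_numbers(state_text)
    problems = []
    if found != sorted(found):
        problems.append("round sections are out of order")
    if len(found) != len(set(found)):
        problems.append("duplicate round section")
    # Assumption: rounds run 1..expected_rounds with no gaps.
    missing = sorted(set(range(1, expected_rounds + 1)) - set(found))
    if missing:
        problems.append(f"missing rounds: {missing}")
    return problems
```

A parent agent could run a check of this kind against `docs/features/<name>.state.md` after each round, alongside the independent test run the skill already requires.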

package/skills/tdd-rounds/templates/builder-brief.md CHANGED

@@ -1,6 +1,6 @@
 # ROUND N — <title>
-You are a Builder sub-agent in a multi-round TDD project. Read the orchestration plan and `docs/STATE.md` first. **Execute design + tdd + simplify in THIS invocation. Do not stop after design.**
+You are a Builder sub-agent in a multi-round TDD project. Read the orchestration plan and the feature's `docs/features/<name>.state.md` first. **Execute design + tdd + simplify in THIS invocation. Do not stop after design.**
 ## Scope
@@ -34,18 +34,18 @@ You are a Builder sub-agent in a multi-round TDD project. Read the orchestration
 ## Files you must NOT touch
 - closed packages: `<list>`
-- `cli/`, `creds/`, `.claude/`, `docs/STATE.md` (parent owns)
+- `cli/`, `creds/`, `.claude/`, `docs/features/<name>.state.md` (parent owns)
 - ...
 ## Skills (in order — execute ALL in this invocation)
 1. **`design`** — REQUIRED if introducing a new package or non-trivial interface. Decisions to make: <list>.
 2. **`tdd`** — REQUIRED. Vertical slicing per AC; commit prefix `R<N>:`. Read [`tdd-rounds/COMMITS.md`](../COMMITS.md) before the first commit — it captures the per-AC slicing rule, the simplify-pass-gets-its-own-commit rule, the doc-commits-stay-separate rule, when single-commit is justified, and the message-body shape.
-3. **[`simplify`](../../simplify/SKILL.md)** — REQUIRED end-of-round. Walk every changed file with four lenses: reuse, quality, efficiency, test relevance. Land as its own commit (`R<N>: simplify — <summary>`).
+3. **[`simplify`](../../simplify/SKILL.md)** — REQUIRED end-of-round. Walk every changed file with five lenses: reuse, quality, efficiency, test relevance, telemetry. Land as its own commit (`R<N>: simplify — <summary>`).
 ## Pre-flight reading (in order)
-1. `docs/STATE.md` — what previous rounds delivered.
+1. `docs/features/<name>.state.md` — what previous rounds delivered.
 2. `docs/features/<feature>.md` — the AC contract.
 3. The ADRs cited above.
 4. <specific files the Builder needs to read before starting>

package/skills/verify-real-deps/SKILL.md CHANGED

@@ -1,8 +1,6 @@
 ---
 name: verify-real-deps
-description: Pre-tag smoke test against real third-party APIs. Use after `prod-ready` is clean, before tagging vN.0 — the gate that catches wire-shape mismatches that fakes accept but real upstreams reject. Triggered when the user mentions "smoke test", "real API", "live verify", "before tag", or "end-to-end against actual <vendor>". Pairs with `prod-ready` (which catches ops/infra issues tests miss) and `tdd-rounds` (the orchestration that feeds into this gate).
-complexity: high
-expected_duration: 45 minutes
+description: Pre-tag smoke test against real third-party APIs. Use after `prod-ready` is clean, before tagging vN.0 — the gate that catches wire-shape mismatches that fakes accept but real upstreams reject. Triggered when the user mentions "smoke test", "real API", "live verify", "before tag", or "end-to-end against actual <vendor>". Skip when there's no third-party API integration, or the "real" upstream is a staging service you fully control. Pairs with `prod-ready` (which catches ops/infra issues tests miss) and `tdd-rounds` (the orchestration that feeds into this gate).
 ---
 # Verify Against Real Dependencies
@@ -109,10 +107,24 @@ git push --tags
 - [`templates/known-issues.md`](templates/known-issues.md) — the bug-ledger format.
+## Pairing with other skills
+- **[`prod-ready`](../prod-ready/SKILL.md)** — runs *before*. This gate fires only once `prod-ready` is clean; it catches the wire-shape class `prod-ready` can't see.
+- **[`tdd-rounds`](../tdd-rounds/SKILL.md)** — the *orchestration* that feeds this gate; each ledger entry becomes a fix-round brief the parent dispatches.
+- **[`tdd`](../tdd/SKILL.md)** — *invoked* inside each fix-round: reproduce the surprise as a failing test before fixing.
+## Done when
+- The representative end-to-end flow ran against the **real** upstream with real credentials.
+- Every surprise is captured in `docs/known-issues.md` with all six fields.
+- The bug ledger has zero `Open` entries (or only explicit deferred-by-design ones with a tracking line).
+- For each fixed bug, the fake / mock was tightened to reject the wrong shape too.
+- A verification entry was appended to `docs/features/<name>.state.md`, and the release is tagged.
 ## Handoff
 When the bug ledger is clean and the same end-to-end flow runs without surprise:
-- Append a verification entry to `docs/STATE.md`: `## End-to-end smoke test (DONE YYYY-MM-DD)` with what was run and what was found.
+- Append a verification entry to the feature's `docs/features/<name>.state.md`: `## End-to-end smoke test (DONE YYYY-MM-DD)` with what was run and what was found.
 - Tag the release.
 - The bug ledger stays in `docs/known-issues.md` as a permanent post-mortem record. Future contributors read it to understand "what surprises lurk in this codebase that the test suite doesn't show."

package/skills/verify-real-deps/templates/known-issues.md CHANGED

@@ -1,56 +1,61 @@
 ---
 type: known-issues
-title: "Known issues — post-vN.0 smoke test findings"
-description: Bugs surfaced by the first end-to-end run against the real upstream.
-tags: [smoke-test, known-issues]
+title: "Known issues"
+description: <one sentence — what this ledger tracks>
+tags: [known-issues]
 timestamp: YYYY-MM-DD
 ---
-<!-- Frontmatter is OKF per skills/formats/OKF.md (`type: known-issues`).
-     `timestamp` tracks the last edit; the Discovered line below pins the first run. -->
+<!-- Frontmatter is OKF per skills/formats/OKF.md (`type: known-issues`). One per repo: docs/known-issues.md. -->
-# Known issues — post-vN.0 smoke test findings
+# Known issues
-**Discovered:** YYYY-MM-DD (first end-to-end run against real <upstream>).
-**Status:** <N> open / <M> closed.
+This file has **two modes** — use whichever fits (a repo may use both, under separate headings). Delete the mode you don't need.
-These bugs slipped past unit/integration tests because the test harness accepted whatever wire shape the code sent. Real upstream enforces stricter contracts and surfaced them. Each entry doubles as the canonical brief for its fix-round.
+- **A — Smoke-test bug ledger** (`verify-real-deps`): bugs the first real-upstream run surfaced that tests missed. Each entry doubles as the brief for its fix-round.
+- **B — Accepted-but-unimplemented follow-ups** (`tdd-rounds`, deferrals): work that's agreed but not done yet — e.g. an ADR accepted but not implemented. Each entry names the artifact that *claims* it's done, the real state, and the next step.
 ---
-## #N — <one-line title>
+## A — Smoke-test bug ledger
+**Discovered:** YYYY-MM-DD (first end-to-end run against real <upstream>).
+**Status:** <N> open / <M> closed.
+### #N — <one-line title>
 **Severity:** High | Medium | Low — <one-line reason based on user impact>.
-**Status:** Open | Closed in R<N> (commit <hash>). Tests:
-`TestNameOne`,
-`TestNameTwo`.
+**Status:** Open | Closed in R<N> (commit <hash>). Tests: `TestOne`, `TestTwo`.
 **Reproduction:**
 1. <numbered step a stranger could follow>
-2. <step>
-3. <expected vs observed>
+2. <expected vs observed>
-**Root cause:** <one paragraph — why the test passed but the real API didn't. Cite the test harness's permissive behavior, the real API's documented or observed contract, and the gap between them.>
+**Root cause:** <one paragraph — why the test passed but the real API didn't: the harness's permissive behavior, the real contract, the gap between them.>
 **Files:**
-- `<package>/<file>.go` (the function that mishandled the shape)
-- `<test-harness>/<fake>.go` (the fake that accepted the wrong shape — fix this too to prevent regression)
+- `<package>/<file>` — the function that mishandled the shape.
+- `<harness>/<fake>` — the fake that accepted the wrong shape (fix this too, or the test layer regresses).
-**Fix sketch:**
-<one paragraph specific enough to scope a Builder brief. Mention the precise wire shape change, the function signature change if any, and the new test that pins the real-API contract.>
+**Fix sketch:** <one paragraph specific enough to scope a Builder brief.>
 ---
-(repeat per issue)
+## B — Accepted-but-unimplemented follow-ups
+### <short title>
+- **Claimed done by:** <artifact — e.g. ADR-NNNN, a feature doc>.
+- **Actual state:** <what's really true — e.g. "decided, not yet implemented">.
+- **Next step:** <the concrete action, and what unblocks it>.
+_Write `_No open follow-ups._` when the list is empty._
 ---
-## Fix plan
+## Fix plan (mode A)
-**R<N> round (DONE YYYY-MM-DD):** issues #X-#Y closed in <K> dedicated commits + a simplify pass.
-**R<N+1> round (PENDING):** issue #Z.
+**R<N> (DONE YYYY-MM-DD):** issues #X–#Y closed in <K> commits + a simplify pass.
+**R<N+1> (PENDING):** issue #Z.
-After all open issues are closed:
-- Append an end-to-end verification entry to `docs/STATE.md`.
-- Tag vN.0.
-- Leave this file in place as a permanent post-mortem record.
+After all open issues close: append an end-to-end verification entry to `docs/features/<name>.state.md`, tag vN.0, and leave this file as a permanent post-mortem record.
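The mode A layout above is regular enough to audit with a short script. The following is a rough sketch in Python, assuming entries start with a `### #N` heading and that each of the six fields (Severity, Status, Reproduction, Root cause, Files, Fix sketch) appears as a bold `**Name:**` label at the start of a line. `audit_ledger` and the sample ledger are hypothetical, and the sketch does not distinguish deferred-by-design entries from other open ones.

```python
import re

# The six per-entry fields shown in the mode A template.
REQUIRED_FIELDS = ("Severity", "Status", "Reproduction", "Root cause", "Files", "Fix sketch")

ENTRY_HEADING = re.compile(r"^### #(\S+)", re.MULTILINE)
FIELD = re.compile(r"^\*\*([^*:]+):\*\*[ \t]*(.*)$", re.MULTILINE)
# An entry ends at the next level-2 heading or horizontal rule.
NEXT_SECTION = re.compile(r"^(?:## |---)", re.MULTILINE)


def audit_ledger(text):
    """Return (open_ids, missing) for every '### #N' entry.

    open_ids: ids whose Status value starts with "Open".
    missing:  {id: [field names absent from that entry]}.
    """
    open_ids, missing = [], {}
    headings = list(ENTRY_HEADING.finditer(text))
    for i, heading in enumerate(headings):
        end = headings[i + 1].start() if i + 1 < len(headings) else len(text)
        body = NEXT_SECTION.split(text[heading.end():end], maxsplit=1)[0]
        fields = {name.strip(): value.strip() for name, value in FIELD.findall(body)}
        entry_id = heading.group(1)
        if fields.get("Status", "").lower().startswith("open"):
            open_ids.append(entry_id)
        absent = [name for name in REQUIRED_FIELDS if name not in fields]
        if absent:
            missing[entry_id] = absent
    return open_ids, missing


# Hypothetical ledger used only to exercise the sketch.
SAMPLE_LEDGER = """\
## A - Smoke-test bug ledger

**Discovered:** 2024-01-01 (first end-to-end run against real upstream).
**Status:** 1 open / 1 closed.

### #1 - Amount sent as float
**Severity:** High - payments rejected.
**Status:** Closed in R7 (commit abc1234). Tests: `TestAmountIsString`.
**Reproduction:**
1. Submit a payment.
2. Expected 200, observed 422.
**Root cause:** The fake accepted any JSON number.
**Files:**
- `billing/client`
**Fix sketch:** Send the amount as a string.

### #2 - Missing idempotency header
**Severity:** Medium - retries double-charge.
**Status:** Open
**Reproduction:**
1. Retry a request.
**Root cause:** The fake ignored headers.

## B - Accepted-but-unimplemented follow-ups
_No open follow-ups._
"""

if __name__ == "__main__":
    print(audit_ledger(SAMPLE_LEDGER))
```

The summary `**Status:** <N> open / <M> closed.` line sits before the first entry heading, so it is never mistaken for a per-entry status.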

package/skills/zoom-out/SKILL.md CHANGED

@@ -1,9 +1,7 @@
 ---
 name: zoom-out
-description: User-invoked utility — pulls the agent up an abstraction layer when the user is lost in unfamiliar code. Produces a map of relevant modules, callers, and seams in `docs/CONTEXT.md` vocabulary. Use when the user says "I'm lost", "zoom out", "give me higher-level context", "I don't know this area", "what depends on what here", or invokes the slash command. Does not change which workflow the user is in — interrupts to orient, then hands back. Skip when the user already has the map and just needs to read code.
+description: User-invoked utility — pulls the agent up an abstraction layer when the user is lost in unfamiliar code. Produces a map of relevant modules, callers, and seams in `docs/CONTEXT.md` vocabulary. Use when the user says "I'm lost", "zoom out", "give me higher-level context", "I don't know this area", "what depends on what here", or invokes the slash command. Does not change which workflow the user is in — interrupts to orient, then hands back. Skip when the user already has the map and just needs to read code. Runs before `debug` or `improve-codebase-architecture` when the area is unfamiliar.
 disable-model-invocation: true
-complexity: low
-expected_duration: 5 minutes
 ---
 # Zoom Out

package/skills/sync-check/SKILL.md DELETED

@@ -1,69 +0,0 @@
----
-name: sync-check
-description: Diagnostic "Context Auditor" that scans code changes for terminology drift against `CONTEXT.md` and architectural contradictions against `docs/adr/`. Use before `pr-review`, after significant refactors, or when names feel "off." Prevents term collisions (e.g., "Account" vs "Customer") and silent deviations from established architectural decisions. Pairs with `grill-plan` (to resolve contradictions found) and `pr-review` (downstream gate).
-complexity: low
-expected_duration: 10 minutes
----
-# Sync Check (Context Auditor)
-The discipline of ensuring that the code's implementation matches the project's **Domain Language** and **Architectural History**.
-## Why this skill exists
-As a codebase grows, it is easy for synonyms ("User", "Account", "Profile") to bleed into the code, diluting the domain model. Similarly, architectural decisions recorded in ADRs are often forgotten by developers (or LLMs) focused on local implementation. `sync-check` is the background auditor that surfaces these drifts before they calcify.
-## When to use
-- Before starting a `pr-review` (as a primary step).
-- After a large refactor round in `tdd-rounds`.
-- When you encounter a term in the code that isn't in `CONTEXT.md`.
-- When the user asks "is this consistent with our model?" or "are we following our ADRs?"
-## When to skip
-- Initial project setup (pre-vocabulary).
-- Pure doc/comment changes.
-- Trivial typo/config fixes.
-## Process
-### 1. Identify the Scope
-Map the files changed in the current branch or round to their corresponding domain contexts.
-- Use `CONTEXT-MAP.md` if it exists.
-- Otherwise, use the root `CONTEXT.md`.
-### 2. Terminology Audit
-Scan the `git diff` for new or changed public identifiers (classes, functions, types, variables).
-- **Term Collision**: Check if a new term is listed as an `_Avoid_:` alias in `CONTEXT.md`.
-- **Fuzzy Mapping**: Check if a term exists in the code but is missing from the glossary.
-- **Synonym Drift**: Check if the code uses two different words for the same domain concept.
-### 3. ADR Consistency Check
-Read the `docs/adr/` entries relevant to the changed packages.
-- **Direct Contradiction**: Does the change perform an action explicitly forbidden by an ADR (e.g., using an ORM when an ADR mandates manual SQL)?
-- **Surprise Deviation**: Does the change introduce a pattern that warrants a new ADR (hard to reverse, surprising, real trade-off)?
-### 4. Report Drifts
-Output a numbered list of findings. For each:
-- **Location**: `file:line` or `ADR-NNNN`.
-- **Severity**: `Blocker` (direct contradiction/collision) or `Suggestion` (fuzzy mapping).
-- **Finding**: "Code uses 'Account', but `CONTEXT.md` specifies 'Customer'."
-- **Recommended Action**: "Rename to 'Customer' or update `CONTEXT.md` if the concept is genuinely new."
-## Done when
-- All changed files have been cross-referenced with the glossary and relevant ADRs.
-- Every "Avoid" alias found in the diff has been flagged.
-- Every architectural deviation from active ADRs has been surfaced.
-## Handoff
-- If contradictions are found → run `grill-plan` to resolve the terminology or architectural mismatch.
-- If clean → proceed to `pr-review`.

package/skills/{design → formats}/OBSERVABILITY.md RENAMED

File without changes

package/skills/{design → formats}/PERSONAS.md RENAMED

File without changes