npm - @ai-agent-lead/skills - Versions diffs - 1.2.0 → 1.4.0 - Mend

@ai-agent-lead/skills 1.2.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (35) hide show

package/package.json +1 -1
package/skills/README.md +19 -18
package/skills/SKILL-TEMPLATE.md +5 -1
package/skills/TRIGGERS.md +3 -3
package/skills/WORKFLOWS.md +12 -15
package/skills/bench/SKILL.md +0 -2
package/skills/bench/templates/benchmark-report.md +10 -1
package/skills/bootstrap/SKILL.md +0 -2
package/skills/caveman/SKILL.md +0 -2
package/skills/debug/SKILL.md +0 -2
package/skills/design/ILLEGAL-STATES.md +12 -1
package/skills/design/SKILL.md +1 -3
package/skills/feature-doc/SKILL.md +3 -3
package/skills/feature-doc/templates/feature-template.md +13 -1
package/skills/formats/ADR-FORMAT.md +12 -2
package/skills/{code-hygiene/SKILL.md → formats/CODE-HYGIENE.md} +27 -39
package/skills/formats/CONTEXT-FORMAT.md +26 -0
package/skills/formats/DOC-DRIFT-AUDIT.md +55 -0
package/skills/formats/OKF.md +82 -0
package/skills/formats/STYLE-comments.md +33 -44
package/skills/grill-plan/SKILL.md +1 -3
package/skills/improve-codebase-architecture/SKILL.md +1 -3
package/skills/investigate/SKILL.md +0 -2
package/skills/investigate/templates/research-note.md +13 -1
package/skills/pr-review/SKILL.md +7 -20
package/skills/prod-ready/SKILL.md +8 -8
package/skills/security-review/SKILL.md +0 -2
package/skills/simplify/SKILL.md +4 -12
package/skills/system-design/SKILL.md +1 -3
package/skills/tdd/SKILL.md +5 -2
package/skills/tdd-rounds/SKILL.md +2 -14
package/skills/verify-real-deps/SKILL.md +0 -2
package/skills/verify-real-deps/templates/known-issues.md +11 -0
package/skills/zoom-out/SKILL.md +0 -2
package/skills/sync-check/SKILL.md +0 -69

package/skills/formats/DOC-DRIFT-AUDIT.md ADDED Viewed

@@ -0,0 +1,55 @@
+# Doc-Drift Audit
+The single source for one question: *does this change leave the durable docs consistent with the code?* Implementation lands → docs drift. The natural moment to catch the drift is at the merge boundary, not "next sprint".
+This is a **shared reference**, run from three places with the same checks but a different lens:
+- [`prod-ready`](../prod-ready/SKILL.md) Section 7 — **author lens.** Walk the checks before opening the PR; fix drift inline now.
+- [`pr-review`](../pr-review/SKILL.md) §3e — **reviewer lens.** Second line of defense; what the author missed, classified by severity.
+- **Standalone** — the old `sync-check` role. Run mid-stream after a significant refactor, or when a name feels "off", to surface terminology and ADR drift before it calcifies. Produces a numbered findings list (see "Standalone report" below).
+Files don't need to pre-exist — `docs/adr/`, `CONTEXT.md`, design notes are created lazily when the first relevant change appears. If a doc type isn't relevant to this work, write `n/a` — explicit beats implicit.
+## The six checks
+One question per doc type: *did this work change X? Then update Y.*
+1. **New decision with viable alternatives** → an ADR exists in `docs/adr/`, names what it supersedes (if anything), and is referenced from code where the decision is load-bearing. See [`ADR-FORMAT.md`](ADR-FORMAT.md).
+2. **New or changed domain term** → [`docs/CONTEXT.md`](../../docs/CONTEXT.md) entry created or updated, including `_Avoid_:` aliases if the term risks being confused with an existing one. A new term that collides with an existing `_Avoid_:` alias (e.g. "Account" where the glossary says "Customer") is the highest-signal finding here.
+3. **New/removed package, changed public interface, or shifted module boundary** → the feature's design note (`docs/features/<feature>.design.md`) is updated: module map, file layout, public-interface signatures, test boundaries.
+4. **Changed acceptance criteria** → the feature doc reflects what was actually built. Silently-dropped or silently-added behavior is the most common drift class — fix here, don't kick to a follow-up.
+5. **User-visible change** → `CHANGELOG.md` has an entry under `[Unreleased]`, grouped by `Added / Changed / Fixed / Removed`. Skip only for formatter-only / lint-only / test-only / internal-refactor-with-no-behavior-change / dep-bump-with-no-runtime-impact diffs. (Overlaps but is **not identical** to `feature-doc`'s skip list, which also waives the *doc* for typo fixes and one-line config tweaks. A typo fix can still merit a one-liner here.)
+6. **New or changed produced doc under `docs/`** → it opens with OKF frontmatter carrying a non-empty `type` from [`OKF.md`](OKF.md) §2. A produced doc with no `type` breaks bundle conformance.
+## ADR consistency (read alongside check 1)
+Beyond "does an ADR exist", check the change against the ones already recorded:
+- **Direct contradiction** — does the change do something an active ADR explicitly forbids (e.g. uses an ORM where an ADR mandates hand-written SQL)? That's a blocker until the ADR is superseded.
+- **Surprise deviation** — does the change introduce a pattern that *warrants* a new ADR (hard to reverse, surprising, a real trade-off) but doesn't have one?
+## Severity
+- **Blocker** — the missing/contradicted doc is load-bearing for the next reader: an ADR for a hard-to-reverse decision, a `CONTEXT.md` entry for a term other changes will use, an AC drift hiding behavior, or a direct ADR contradiction.
+- **Suggestion** — the doc would help but the diff is self-explanatory in isolation.
+In the author lens (`prod-ready`) every check resolves to `✓`, `✗ + remediation`, or `n/a + reason` and is fixed before the PR. In the reviewer lens (`pr-review`) an unresolved check without `n/a + reason` is a finding at the severity above.
+## Standalone report
+When run on its own (the former `sync-check`), output a numbered list. For each finding:
+- **Location** — `file:line` or `ADR-NNNN`.
+- **Severity** — `Blocker` or `Suggestion` per above.
+- **Finding** — e.g. "Code uses 'Account', but `CONTEXT.md` specifies 'Customer'."
+- **Recommended action** — e.g. "Rename to 'Customer', or update `CONTEXT.md` if the concept is genuinely new."
+If a contradiction is found in *existing* code (not just this diff), escalate to [`grill-plan`](../grill-plan/SKILL.md) to resolve the terminology or architectural mismatch.
+## Done when
+- Every relevant check is `✓`, `✗ + remediation`, or `n/a + reason`.
+- Every `_Avoid_:` alias appearing in the diff has been flagged.
+- Every deviation from an active ADR has been surfaced.
+The doc-map is small enough to walk in 2–3 minutes. Skip it and you're trading 3 minutes now for an hour of orientation in 3 months.

package/skills/formats/OKF.md ADDED Viewed

@@ -0,0 +1,82 @@
+# OKF Frontmatter for Produced Docs
+Every document a skill writes under `docs/` carries [Open Knowledge Format](https://github.com/GoogleCloudPlatform/knowledge-catalog/tree/main/okf) (OKF v0.1) YAML frontmatter. That makes the whole `docs/` tree a consumable OKF **bundle** — a directory of markdown "concept" files any OKF-aware tool, agent, or static server can ingest, with no change to how we author.
+This is the substrate every doc-producing skill consumes (`feature-doc`, `investigate`, `bootstrap`, `grill-plan`, `bench`, `tdd-rounds`, `verify-real-deps`, …) — the way `STYLE-comments.md` is the substrate for comments. The per-skill templates carry a filled-in frontmatter block; this doc is the contract behind them.
+> **Why OKF and not our own scheme:** the format is vendor-neutral and additive — the only hard requirement is a non-empty `type`, and consumers preserve unknown keys. We get interop for free without giving up any of our existing structure.
+---
+## 1. The frontmatter block
+Every non-reserved `.md` file under `docs/` opens with a YAML frontmatter block:
+```yaml
+---
+type: feature
+title: Distributed State for TDD Rounds
+description: Feature-scoped state replaces the global STATE.md to cut merge conflicts.
+tags: [tdd-rounds, state, architecture]
+timestamp: 2026-05-25
+---
+```
+| Field | OKF status | Rule |
+| --- | --- | --- |
+| `type` | **required** | One value from the vocabulary in §2. Non-empty. This is the only field a bundle is rejected for missing. |
+| `title` | recommended | Human-readable display name. Mirror the `# H1`. Omit only if the filename already says it. |
+| `description` | recommended | One sentence. Reuse the doc's opening sentence — don't invent a second summary. |
+| `tags` | recommended | YAML list of short cross-cutting labels. Skip rather than pad. |
+| `timestamp` | recommended | ISO 8601 date (or datetime) of the last meaningful change. This is the canonical home for the date — do **not** also keep a `**Date:**` bold line. |
+| `resource` | recommended | A URI for the underlying asset. Our docs are abstract concepts, so in practice this is usually **absent** — include it only when the doc fronts a real external resource (a dashboard, a BigQuery table, an API). |
+**Extension keys** are allowed and encouraged where a doc kind carries status OKF doesn't model. Our kinds keep `status:` (and `owner:` on feature docs) as extension keys *in addition to* the human-facing `**Status:**` / `**Owner:**` bold lines the skills already read. OKF consumers preserve them; our skills keep working.
+---
+## 2. The `type` vocabulary
+OKF does not register types centrally — producers pick descriptive, self-explanatory values and consumers tolerate unknowns. These are the values this repo's skills emit:
+| Doc kind | `type` | Produced by | Lives in |
+| --- | --- | --- | --- |
+| Architecture decision record | `adr` | `bootstrap`, `grill-plan`, `investigate` handoff | `docs/adr/NNNN-slug.md` |
+| Feature contract | `feature` | `feature-doc` | `docs/features/<name>.md` |
+| Module / interface design note | `design` | `design` | `docs/features/<name>.design.md` |
+| Research / options note | `research` | `investigate` | `docs/research/<topic>.md` |
+| Ubiquitous-language map | `context` | `bootstrap`, `grill-plan` | `docs/CONTEXT.md` |
+| Multi-context map | `context-map` | `bootstrap` | `CONTEXT-MAP.md` |
+| Repo conventions | `convention` | `bootstrap` | `docs/CONVENTIONS.md` |
+| Accepted-but-unimplemented tracker | `known-issues` | `tdd-rounds`, `verify-real-deps` | `docs/known-issues.md` |
+| Benchmark report | `benchmark` | `bench` | `docs/bench/<name>.md` |
+| Round / feature state | `state` | `tdd-rounds` | `docs/features/<name>/state/*.md` |
+Pick the closest existing value before inventing a new one. A new doc kind adds a row here in the same change that first emits it.
+---
+## 3. Bundle conventions
+`docs/` is the bundle root. Two filenames are **reserved** by OKF and never used for a concept:
+- **`index.md`** — a directory listing for progressive disclosure. No frontmatter. Entries are `* [Title](relative-url) — short description`, optionally grouped under `## Section` headings. Add `docs/index.md` once the bundle has more than a couple of files; keep it current as docs are added.
+- **`log.md`** — a date-grouped changelog. Our repo's root **`CHANGELOG.md`** already plays this role (Keep-a-Changelog, newest first); treat it as the bundle's log rather than adding a second file. An OKF consumer tolerates the absence of `log.md`.
+**Cross-references** between docs are plain markdown links. OKF accepts two forms: **bundle-relative** (begin with `/`, resolved from the bundle root: `/adr/0001-distributed-state.md`) and **relative** (`../known-issues.md`). Bundle-relative is the more move-stable form, but this repo's docs use ordinary **relative** links by convention (they also reach files outside the bundle, like `../skills/formats/OKF.md`), and both are equally valid. A link asserts a *relationship* — the kind (supersedes, depends-on, refines) is carried by the surrounding prose, not the link. Broken links are tolerated, not malformed.
+**Body:** no required sections. Favor structural markdown (headings, lists, tables, fenced code) over prose — it aids both human reading and agent retrieval. The headings `# Schema`, `# Examples`, `# Citations` carry their conventional OKF meaning when present.
+---
+## 4. Conformance checklist
+A produced doc is OKF-conformant when:
+- [ ] It opens with a parseable YAML frontmatter block.
+- [ ] `type` is present and non-empty, drawn from §2 (or added there in the same change).
+- [ ] `title` mirrors the H1; `description` reuses the opening sentence; the date lives only in `timestamp`.
+- [ ] Reserved filenames (`index.md`, `log.md` / `CHANGELOG.md`) follow §3 and are **not** used for concepts.
+- [ ] Cross-links use bundle-relative (`/…`) or relative paths.
+This checklist is enforced at the doc-drift audits in `pr-review` (§3e) and `prod-ready` (Section 7) — a produced doc missing `type` is a finding.

package/skills/formats/STYLE-comments.md CHANGED Viewed

@@ -1,59 +1,58 @@
 # Comment Style
-Codifies the comment discipline for this codebase. New contributors (human or agent) should match what's here, and what's already in the code, rather than re-deriving conventions per package.
+Codifies the comment discipline for this codebase. New contributors (human or agent) match what's here, rather than re-deriving conventions per package.
-> Default: write **no** comment. Add one only when the WHY is non-obvious.
+> **Default: write NO comment.** Add one only when the WHY is non-obvious AND a future reader is likely to make the opposite assumption.
-The code's identifiers, types, and structure already explain the *what*. A comment earns its line only when it captures something the reader cannot recover by reading the next ten lines: a constraint, an invariant, a trade-off, a workaround, or a pointer to the decision record.
+The code's identifiers, types, and structure explain the *what*. A comment earns its line only when it captures something the reader cannot recover by reading the next ten lines: a constraint, an invariant, a trade-off, or a pointer to the decision record. If you're unsure whether a comment earns its keep, **delete it** — git history and the next maintainer will be fine.
 ---
-## 1. Five kinds of comment we keep
+## 1. Three kinds of comment we keep
-### 1.1 Package docstring (required, one per package)
+### 1.1 Package docstring (one per package, only when the design needs it)
-Every package starts with a multi-paragraph docstring block in its primary file.
-Shape:
+A package whose design choice or invariant would not be obvious from reading the primary file gets a short docstring block:
 - One-line purpose.
 - One paragraph on the design choice and its rationale.
-- Round / ADR provenance for major decisions.
 - The failure mode or invariant the next reader needs to know.
-### 1.2 Docstrings on every exported identifier (required)
-- Starts with the identifier name (`// Cache stores …`, not `// Stores …`).
-- Interface methods state the **contract**, not just the action: is the call cheap, idempotent, safe under concurrency, called once per request or many times?
+A package that is what it says it is (a thin utility, a generated stub, a glue adapter) does not need a docstring. Don't pad.
-### 1.3 Invariant / constraint comments (encouraged)
+### 1.2 Docstrings on exported identifiers (only when the contract is non-obvious from the signature)
-A comment that catches a bug the next maintainer would otherwise write.
-- `// Zero means 'unknown'` (and re-stated where the bug would bite).
-- `// Selector must be cheap and side-effect-free; ServeHTTP may invoke Pick multiple times on 429 retry.`
+Skip the docstring on getters, simple constructors, and any function whose name + types already convey the contract. Write one only when:
+- The call is **not** cheap, idempotent, or safe under concurrency, and the next reader would assume it was.
+- The function may be invoked multiple times per request (e.g. retried) and that matters.
+- A parameter or return has non-obvious zero-value or nil semantics (`0 = "unknown"`, `(nil, nil) = "not found"`).
-### 1.4 Trade-off comments (encouraged)
+When you write one, start with the identifier name (`// Cache stores …`, not `// Stores …`).
-Name the alternative considered and why it lost. Cite the failure mode if the choice has one.
+### 1.3 Invariant / constraint / trade-off / provenance (only when the alternative would otherwise be reattempted)
-### 1.5 Provenance comments (use the grammar in §3)
+A comment whose absence would let the next maintainer write the bug, reattempt the rejected alternative, or "fix" the deliberate choice. If the next reader would just say "OK, sure" and move on, the comment is not earning its line.
-Anchor non-obvious behavior to its decision record (ADR, round, AC, snapshot).
+- **Invariant** — `// Zero means 'unknown'` (and re-stated where the bug would bite).
+- **Constraint** — `// Selector must be cheap and side-effect-free; ServeHTTP may invoke Pick multiple times on 429 retry.`
+- **Trade-off** — `// Cold-cache choice: an account whose (pk, model) is not yet in the cache is treated as INELIGIBLE.`
+- **Provenance** — `// Manual SQL instead of ORM — see ADR-007 §"Query shape".`
 ---
-## 2. Five kinds of comment we delete on sight
+## 2. Six kinds of comment we delete on sight
-1. **Restates the code.** `// increment i`, `// return the user`, `// loop over items`. The code already says it.
+1. **Restates the code.** `// increment i`, `// return the user`, `// loop over items`.
 2. **Commented-out code.** Git has it. Delete.
-3. **"Added for X" / "Used by Y" caller references.** These rot the day someone renames or removes the caller. Use `grep` instead.
+3. **"Added for X" / "Used by Y" caller references.** These rot the day someone renames or removes the caller. Use `grep`.
 4. **Banner / ASCII-art dividers.** `// =====` and `// --- section ---`. Use a function or a file boundary, not a banner.
 5. **Stale TODO / FIXME / HACK with no owner, ticket, ADR, or date.** Either link it to a decision record or fix it now.
+6. **Section-header comments inside a function.** `// validate`, `// build response`, `// compute total`. If a function needs internal section headers, extract those sections into named functions — the function name is the comment.
 ---
 ## 3. Provenance grammar (canonical forms)
-When a comment cites the project record, use **exactly one** of these forms. Comments that follow the canonical grammar are greppable and stable.
+When a kept comment cites the project record, use **exactly one** of these forms. Comments that follow the canonical grammar are greppable and stable.
 | Kind                                   | Canonical form          | Examples                                |
 | -------------------------------------- | ----------------------- | --------------------------------------- |
@@ -82,32 +81,22 @@ If the explanation needs more than **~6 lines**, the right home is an ADR or a `
 ---
-## 5. "Documented for the next reader" framing
-When a comment exists because a future reader is **likely to make the opposite assumption**, say so explicitly:
-```go
-// Cold-cache choice (documented for the next reader): an account whose
-// (pk, model) is not yet in the cache is treated as INELIGIBLE …
-```
----
-## 6. Test files
+## 5. Test files
-Non-trivial tests — those with fakes, harnesses, time control, channel-driven ceremonies, or unusual lifecycle — get a **5–20 line block comment at the top** of the file (or above the harness type) stating:
+Non-trivial tests — those with fakes, harnesses, time control, or unusual lifecycle — get a **5–20 line block comment** at the top of the file (or above the harness type) stating:
 - What is **faked** versus real (and why each choice).
 - Why the design (deterministic? avoid `time.Sleep`? exercise a specific retry path?).
-- Which knobs are overridden for the test and to **what value**.
+- Which knobs are overridden and to **what value**.
+Trivial tests get nothing. The test name carries the intent.
 ---
-## 7. Done when
+## 6. Done when
 Apply this checklist during the [`simplify`](../simplify/SKILL.md) sweep or in code review:
-- [ ] Every package has a docstring matching §1.1.
-- [ ] Every exported identifier has a docstring starting with its name.
-- [ ] Every comment that survived the sweep is a why, an invariant, a trade-off, or a provenance link — none restates the code.
-- [ ] No commented-out code, no `// used by X`, no banner dividers, no orphan TODOs.
+- [ ] Every surviving comment names a why, invariant, trade-off, or provenance link — none restates the code.
+- [ ] No commented-out code, no `// used by X`, no banner dividers, no orphan TODOs, no in-function section headers.
 - [ ] Every provenance citation matches the canonical grammar in §3.
+- [ ] Docstrings on exported identifiers exist only where the contract isn't obvious from the signature.
 - [ ] Test files with non-trivial harnesses have a strategy preamble; trivial ones do not.

package/skills/grill-plan/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: grill-plan
 description: Grilling session that stress-tests a chosen plan against the existing domain model, sharpens terminology, and updates documentation (CONTEXT.md, ADRs) inline as decisions crystallise. Use AFTER a direction has been picked (post-`investigate` or post-`feature-doc`), when the user wants to pressure-test the plan — triggered by phrases like "grill me on this", "stress-test this plan", "walk me through this", "is this consistent with our model". Skip if the direction is still being explored — use `investigate` instead.
-complexity: medium
-expected_duration: 20 minutes
 ---
 ## When to use
@@ -98,7 +96,7 @@ If any of the three is missing, skip the ADR. Use the format in [`../formats/ADR
 ## Pairing with other skills
 - **[`investigate`](../investigate/SKILL.md)** — runs *before* when direction is unclear. Once a direction is chosen, hand off here.
-- **[`sync-check`](../sync-check/SKILL.md)** — escalates here when a terminology collision or ADR contradiction is found in existing code.
+- **[`DOC-DRIFT-AUDIT.md`](../formats/DOC-DRIFT-AUDIT.md)** — the terminology/ADR audit (formerly the `sync-check` skill) escalates here when it finds a terminology collision or ADR contradiction in existing code.
 - **[`feature-doc`](../feature-doc/SKILL.md)** — runs *before* when a feature doc is the input to grilling, or *after* when grilling refines the plan into a feature doc.
 - **[`system-design`](../system-design/SKILL.md)** — defers here for hard-to-reverse topology decisions.
 - **[`improve-codebase-architecture`](../improve-codebase-architecture/SKILL.md)** — borrows this skill's grilling discipline in its own grilling loop.

package/skills/improve-codebase-architecture/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: improve-codebase-architecture
 description: Find deepening opportunities in EXISTING code, informed by the domain language in CONTEXT.md and the decisions in docs/adr/. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make a codebase more testable and AI-navigable. Use for EXISTING code; for designing the shape of a new module from scratch, use `design`. Skip for single-module local refactors with no cross-module impact — use `design` or just refactor inline.
-complexity: high
-expected_duration: 45 minutes
 ---
 # Improve Codebase Architecture
@@ -20,7 +18,7 @@ Surface architectural friction and propose **deepening opportunities** — refac
 - Designing the shape of a new module from scratch — use [`design`](../design/SKILL.md).
 - Greenfield system topology — use [`system-design`](../system-design/SKILL.md).
 - Single-module local refactor with no cross-module impact — use `design` or refactor inline.
-- Line-level cleanup (renames, dead-code removal) — use [`code-hygiene`](../code-hygiene/SKILL.md) + [`simplify`](../simplify/SKILL.md).
+- Line-level cleanup (renames, dead-code removal) — use the [`code-hygiene`](../formats/CODE-HYGIENE.md) lens + [`simplify`](../simplify/SKILL.md).
 ## Glossary

package/skills/investigate/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: investigate
 description: Use when the user asks for investigation, research, a proposal, or "options" before any code lands; or proactively for non-trivial structural decisions (new dependency, framework choice, API contract change, cross-cutting refactor). Triggered by phrases like "investigate X", "research Y", "give me a proposal", "what are our options", "how would we approach", "let's explore", "should we...". Produces a durable research note in `docs/research/<topic>.md`. Skip for tasks where one obvious approach exists (typo fixes, config tweaks, mechanical refactors). Pairs with `feature-doc` (captures *what* we're building once a direction is chosen) and `grill-plan` (stress-tests a chosen plan).
-complexity: medium
-expected_duration: 20 minutes
 ---
 # Investigation Workflow

package/skills/investigate/templates/research-note.md CHANGED Viewed

@@ -1,8 +1,20 @@
+---
+type: research
+title: "Research: <topic>"
+description: <one sentence — the question this note investigates>
+tags: [<area>]
+timestamp: YYYY-MM-DD
+status: Open
+owner: <user>
+---
+<!-- Frontmatter is OKF per skills/formats/OKF.md. `timestamp` is the canonical date;
+     `status`/`owner` mirror the human-facing lines below (OKF models neither). -->
 # Research: <topic>
 **Status:** Open | Decided | Superseded
 **Owner:** <user>
-**Date:** YYYY-MM-DD
 <!--
 Status values:

package/skills/pr-review/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: pr-review
-description: Discipline for reviewing someone else's pull request — the inverse of `prod-ready` (which is the author's pre-merge gate). Use when the user asks to "review this PR", "look over the diff", "check this change", "give feedback on", or invokes a code-review slash command. Reviews against the linked feature doc / ADRs / domain vocabulary, classifies findings by severity (blocker / suggestion / nit), and returns a structured report. Skip for trivial PRs (typo, dep bump, lint-only) — approve directly. Pairs with `prod-ready` (the author's checklist; the reviewer verifies it landed honestly), `security-review` (escalation when the diff is surface-changing), and `code-hygiene` (the line-level lens applied during the read).
-complexity: medium
-expected_duration: 20 minutes
+description: Discipline for reviewing someone else's pull request — the inverse of `prod-ready` (which is the author's pre-merge gate). Use when the user asks to "review this PR", "look over the diff", "check this change", "give feedback on", or invokes a code-review slash command. Reviews against the linked feature doc / ADRs / domain vocabulary, classifies findings by severity (blocker / suggestion / nit), and returns a structured report. Skip for trivial PRs (typo, dep bump, lint-only) — approve directly. Pairs with `prod-ready` (the author's checklist; the reviewer verifies it landed honestly), `security-review` (escalation when the diff is surface-changing), and the `code-hygiene` lens (`formats/CODE-HYGIENE.md`, applied line-level during the read).
 ---
 # PR Review
@@ -87,24 +85,13 @@ For non-surface-changing diffs: walk `prod-ready` Section 3 (defense-in-depth) b
 #### 3e. Doc-drift audit
-This is the second line of defense for `prod-ready` Section 7. Author may have missed it; reviewer catches what's left. Walk these four questions against the diff:
+Walk the six checks in [`skills/formats/DOC-DRIFT-AUDIT.md`](../formats/DOC-DRIFT-AUDIT.md) against the diff — **reviewer lens**. This is the second line of defense for `prod-ready` Section 7: the author may have missed it; you catch what's left. Any check that resolves to "no" without `n/a + reason` is a finding, at the severity that reference defines — **Blocker** for load-bearing drift (a missing ADR for a hard-to-reverse decision, a `CONTEXT.md` entry for a term other PRs will use, AC drift hiding behavior, a direct ADR contradiction), **Suggestion** when the diff is self-explanatory in isolation.
-- **New decision with viable alternatives** → does an ADR exist in `docs/adr/` for it? Does it name what it supersedes? If load-bearing, is it referenced from the code?
-- **New or changed domain term** → has [`docs/CONTEXT.md`](../../docs/CONTEXT.md) been updated? Are `_Avoid_:` aliases listed if there's risk of confusion?
-- **New / removed package, changed public interface, shifted module boundary** → is the feature's design note (`docs/features/<feature>.design.md`) updated? Module map, file layout, public-interface signatures, test boundaries.
-- **Changed acceptance criteria** → does the feature doc reflect what was actually built? Silently-dropped or silently-added behavior is the most common drift class — flag and don't accept "we'll fix in a follow-up".
+#### 3f. Hygiene (line level)
-If any answer is "no" without `n/a + reason`, that's a finding. Severity:
-- **Blocker** — the missing doc is load-bearing for the next reader (ADR for a hard-to-reverse decision; CONTEXT.md entry for a term other PRs will use; AC drift hiding behavior).
-- **Suggestion** — the doc would help but the diff is self-explanatory in isolation.
+Apply the [`code-hygiene`](../formats/CODE-HYGIENE.md) lens here, not as a primary phase:
-The doc-map is small enough to walk in 2–3 minutes. Skip it and you're trading 3 minutes now for an hour of orientation in 3 months.
-#### 3e. Hygiene (line level)
-Apply `code-hygiene` as a lens here, not as a primary phase:
-- **Comment churn**: new WHAT-comments, stale "used by X" references, citation grammar that doesn't match the repo's comment style doc ([`skills/formats/STYLE-comments.md`](../formats/STYLE-comments.md)). These are nits individually; in aggregate they're a signal the author didn't run `simplify`. Flag as suggestions, not blockers.
+- **Comment noise**: new WHAT-comments, docstrings on exports whose contract is obvious from the signature, in-function section headers (`// validate`, `// build response`), stale "used by X" references, citation grammar that doesn't match the repo's comment style doc ([`skills/formats/STYLE-comments.md`](../formats/STYLE-comments.md)). Flag as **nits by default**; promote to a suggestion only when cumulative comment noise obscures the diff (signal the author skipped `simplify`).
 - Names that mislead (boolean returning non-bool, `getX` that mutates, `Manager`/`Helper` suffixes hiding what the thing is).
 - Cleverness that earns its cost? Or could be boring?
 - YAGNI — "in case we need it" parameters / interfaces / classes? Strip.
@@ -184,9 +171,9 @@ The classification (blocker / suggestion / nit) lives in the parent's notes; onl
 ## Pairing with other skills
 - **`prod-ready`** is the author's pre-merge checklist. Reviewer verifies it landed.
-- **`sync-check`** is the diagnostic context auditor. Reviewer runs it (or verifies it was run) to catch terminology drift and ADR contradictions early.
+- **[`DOC-DRIFT-AUDIT.md`](../formats/DOC-DRIFT-AUDIT.md)** is the shared terminology/ADR/doc-map audit run from §3e — the former `sync-check` skill, now single-sourced and shared with `prod-ready` Section 7.
 - **`security-review`** is the surface-change escalation. Reviewer flags when required.
-- **`code-hygiene`** is the line-level lens applied during the read.
+- **[`code-hygiene`](../formats/CODE-HYGIENE.md)** is the line-level lens applied during the read (§3f).
 - **`grill-plan`** is where load-bearing disagreements go (a new ADR, not a PR comment thread).
 - **`debug`** if a finding turns out to be "this PR introduces a bug" — switch to debug to characterise it before recommending a change.

package/skills/prod-ready/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: prod-ready
 description: Pre-merge production-readiness checklist — operational, infrastructure, and consistency checks that tests alone don't surface. Use after `tdd` reaches green; before opening a PR or merging to main; after significant infra changes (new DB, new deployment target, new auth flow); or when the user mentions "shipping", "ready to merge", "before deploy", "production readiness", or "prod-ready". Pairs with the `tdd` skill — tdd proves the feature works; this catches what tests can't see (server timeouts, DB pragmas, error-response consistency, secrets at rest).
-complexity: low
-expected_duration: 10 minutes
 ---
 # Prod-Readiness Checklist
@@ -54,14 +52,16 @@ Walk each section. An item is OK to fail **only if** the feature doc's Notes / N
 ### 7. Documentation (the doc-map)
-Implementation lands → docs drift. The natural moment to catch drift is now, not "next sprint". One question per doc type: *did this work change X? Then update Y.* Files don't need to pre-exist — create `docs/adr/`, `CONTEXT.md`, design notes lazily when the first relevant change appears.
+Implementation lands → docs drift. The natural moment to catch drift is now, not "next sprint". Walk the six checks in [`skills/formats/DOC-DRIFT-AUDIT.md`](../formats/DOC-DRIFT-AUDIT.md) — **author lens**: tick each `✓`, `✗ + remediation`, or `n/a + reason`, and fix the drift inline before the PR (don't kick it to a follow-up). Files don't need to pre-exist — create `docs/adr/`, `CONTEXT.md`, design notes lazily when the first relevant change appears.
-- [ ] **New decision with viable alternatives** → ADR exists in `docs/adr/`, names what it supersedes (if anything), and is referenced from code where the decision is load-bearing.
-- [ ] **New or changed domain term** → `CONTEXT.md` entry created or updated. Includes `_Avoid_:` aliases if the term is at risk of being confused with an existing one.
-- [ ] **New/removed package, changed public interface, or shifted module boundary** → the feature's design note (e.g., `docs/features/<feature>.design.md`) updated. Specifically: Module map, File layout, public-interface signatures, Test boundaries.
-- [ ] **Changed acceptance criteria** → the feature doc reflects what was actually built. Silently-dropped or silently-added behavior is the most common drift class — fix here, don't kick to a follow-up.
+- [ ] ADR for any new decision with viable alternatives (and no active ADR is contradicted).
+- [ ] `CONTEXT.md` updated for any new/changed domain term, with `_Avoid_:` aliases where confusion is likely.
+- [ ] Design note updated for any new/removed package, changed public interface, or shifted module boundary.
+- [ ] Feature doc reflects what was actually built (no silently-dropped or silently-added behavior).
+- [ ] `CHANGELOG.md` `[Unreleased]` entry for any user-visible change.
+- [ ] Every new/changed `docs/` file opens with OKF `type` frontmatter.
-If a doc type isn't relevant to this work, write "n/a" — explicit beats implicit.
+Per-check definitions, skip lists, and severity live in [`DOC-DRIFT-AUDIT.md`](../formats/DOC-DRIFT-AUDIT.md) — this is the same audit `pr-review` §3e runs from the reviewer side.
 ## When to skip

package/skills/security-review/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: security-review
 description: Threat-model + control-review pass for surface-changing work — runs alongside `tdd` and as a heavier gate than `prod-ready` Section 3 when the change introduces or alters trust boundaries, authentication, authorization, sensitive data flows, or external surfaces. Use when the user mentions "security review", "threat model", "STRIDE", "auth flow", "permissions", "secrets", "PII", "public API", "external surface", "abuse", "hardening", or whenever a feature doc / PR adds a new entry point, identity flow, or sensitive-data path. Skip for pure-internal refactors with no surface change, dependency bumps that don't change runtime behavior, or doc-only changes. Pairs with `feature-doc` (informs the threat model) and `prod-ready` (which has the lighter operational-security checklist for every change).
-complexity: high
-expected_duration: 30 minutes
 ---
 # Security Review

package/skills/simplify/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: simplify
-description: Single end-of-round sweep that tightens what `tdd` just left green — review every changed file for reuse, quality, efficiency, and test relevance. Use after `tdd` reaches green and before opening a PR (or before a Builder closes a round in `tdd-rounds`). Triggered by phrases like "simplify pass", "tighten this", "clean up before commit", "end-of-round sweep", or appearing as a step in a Builder brief. Skip for trivial diffs (typo, dep bump, doc-only). Pairs with `tdd` (runs immediately after green), `code-hygiene` (the lens applied during the sweep), and `pr-review` (a self-check after this).
-complexity: low
-expected_duration: 10 minutes
+description: Single end-of-round sweep that tightens what `tdd` just left green — review every changed file for reuse, quality, efficiency, and test relevance. Use after `tdd` reaches green and before opening a PR (or before a Builder closes a round in `tdd-rounds`). Triggered by phrases like "simplify pass", "tighten this", "clean up before commit", "end-of-round sweep", or appearing as a step in a Builder brief. Skip for trivial diffs (typo, dep bump, doc-only). Pairs with `tdd` (runs immediately after green), the `code-hygiene` lens (`formats/CODE-HYGIENE.md`, applied during the sweep), and `pr-review` (a self-check after this).
 ---
 # Simplify
@@ -34,7 +32,7 @@ The simplify pass catches these once, deliberately, before the diff lands. Witho
 - Slices where the diff is one or two lines and there's nothing to sweep.
 - During red phase — never refactor while red. See [`tdd/SKILL.md`](../tdd/SKILL.md).
-## The four lenses
+## The five lenses
 Walk every changed file. Apply each lens in order. Fix what you find inline.
@@ -49,13 +47,7 @@ Walk every changed file. Apply each lens in order. Fix what you find inline.
 - Names that read clearly out of context — would a stranger guess what `result`, `data`, `value` referred to? If not, rename.
 - Error messages that name the failing input — `"could not parse: <value>"` beats `"parse error"`.
 - Abstractions that haven't earned their keep — a base class with one subclass, an interface with one implementation. Inline.
-- Comments:
-  - Delete WHAT-comments (the code already says it).
-  - Delete "used by X" / "added for Y" caller references — these rot; use grep.
-  - Delete commented-out code (git has it).
-  - Delete banner / ASCII-art dividers.
-  - KEEP why-comments: constraint, invariant, trade-off, provenance.
-  - If a kept comment cites a round / ADR / AC / snapshot, normalize to the project grammar in [`skills/formats/STYLE-comments.md`](../formats/STYLE-comments.md) (§3 — `ADR-007 §7`, `R6 AC-3`, `v0.3 R1b2`, `snapshot §"Invariants"`).
+- Comments — **default during the sweep is DELETE.** Apply [`CODE-HYGIENE.md`](../formats/CODE-HYGIENE.md) Principle 6 and the bar in [`STYLE-comments.md`](../formats/STYLE-comments.md): delete WHAT-comments, obvious-from-signature docstrings, "used by X" caller references, commented-out code, banners, and in-function section headers (`// validate`, `// build response`); keep only why-comments the next reader would otherwise reattempt. Normalize any kept citation (`ADR-007 §7`, `R6 AC-3`, `v0.3 R1b2`) to [`STYLE-comments.md`](../formats/STYLE-comments.md) §3.
 ### 3. Efficiency — dead code, redundant work, premature defensive checks
@@ -99,7 +91,7 @@ In single-feature flow (no rounds), the sweep can land as a separate commit befo
 ## Pairing with other skills
 - **`tdd`** runs first. Simplify runs after green. Never simplify while red.
-- **`code-hygiene`** is the lens — its 5 principles (boring code, naming, YAGNI, rule of 3, locality) are what you apply during the sweep. Read it once; apply it many times.
+- **[`code-hygiene`](../formats/CODE-HYGIENE.md)** is the lens — its seven principles (boring code, naming, YAGNI, rule of 3, locality, comments, constants placement) are what you apply during the sweep. Read it once; apply it many times.
 - **`pr-review`** comes after — a self-check against the diff. Some of the same lenses, applied as a reviewer rather than an author.
 - **`improve-codebase-architecture`** is the escalation — when simplify surfaces structural issues bigger than a sweep can fix.

package/skills/system-design/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: system-design
 description: System-level architecture for greenfield work — name the modules, define responsibilities, set dependency direction, identify seams. Use when starting a new multi-module system or service from scratch, when defining topology before code lands, or when the user mentions "system architecture", "module boundaries", "service boundaries", "how should I structure this system", "draw the architecture", "topology". Skip for single-module work — use `design` instead. Skip for reorganizing an existing codebase — use `improve-codebase-architecture`. Pairs with `investigate` (comes before, surveying options) and `design` (comes after, shaping each module's interface).
-complexity: high
-expected_duration: 30 minutes
 ---
 # System Design
@@ -38,7 +36,7 @@ Before drawing anything, read what already exists:
 - `docs/adr/` — decisions already taken.
 - The feature set — what the system has to do.
-If `CONTEXT.md` doesn't exist yet, **resolve the core domain terms first** — defer to [`grill-plan` bootstrap mode](../grill-plan/BOOTSTRAP.md) (single source of truth for the lazy-creation rules). Module names without domain terms are placeholders.
+If `CONTEXT.md` doesn't exist yet, **resolve the core domain terms first** — defer to [`grill-plan` bootstrap mode](../bootstrap/BOOTSTRAP.md) (single source of truth for the lazy-creation rules). Module names without domain terms are placeholders.
 ### 2. List the modules

package/skills/tdd/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: tdd
 description: Test-driven development with the red-green-refactor loop. Use when implementing a feature, fixing a bug, changing core logic, or when the user mentions "TDD", "test-first", "red-green-refactor", or "integration tests". Skip for trivial UI glue, config changes, or pure docs edits.
-complexity: medium
-expected_duration: 20 minutes
 ---
 # Test-Driven Development
@@ -19,6 +17,11 @@ expected_duration: 20 minutes
 - Trivial getters / setters with no behavior.
 - Bug whose root cause isn't yet known — run [`debug`](../debug/SKILL.md) first; the reproduction crystallises into the failing test.
+## Pre-conditions
+- **Current branch is not `main` / `master`.** If it is, stop and run `git checkout -b feat/<short-name>` (or `fix/...`) before writing the first test. Code lands on a feature branch; `main` receives merges, not commits.
+- A feature doc (`docs/features/<short-name>.md`) exists with testable ACs — or a `debug` reproduction names the root cause.
 ## Philosophy
 Tests verify **behavior through public interfaces**, not implementation details. Code can change entirely; tests shouldn't.

package/skills/tdd-rounds/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: tdd-rounds
 description: Multi-round TDD orchestration. Use when delivering a feature larger than one TDD slice — typically 5-15 acceptance criteria across multiple packages — by dispatching Builder sub-agents per round, with the parent maintaining state and verifying. Triggered when the user mentions "drive the sub-agent team", "multi-round TDD", "orchestrate rounds", "Builder agents", or when a plan from `feature-doc` lists more ACs than one agent should reasonably tackle in a single invocation. Pairs with `tdd` (Builders invoke that skill per round) and `prod-ready` (final round before tag).
-complexity: high
-expected_duration: 1 hour
 ---
 # Multi-Round TDD Orchestration
@@ -22,6 +20,7 @@ Distills the pattern of a parent agent driving Builder sub-agents through a feat
 ## The parent's contract
+- **Dispatch on a feature branch, never `main`.** Verify the current branch before dispatching the first Builder; if it's `main` / `master`, create `feat/<short-name>` first. The whole multi-round delivery lives on one feature branch (or a stack of feature branches), merged to `main` only at the end.
 - **Plan once.** The plan file lists rounds, each round's ACs, the dependency order, and the skills cadence per round (which rounds need `design`; when to invoke `improve-codebase-architecture` mid-project; when to invoke `prod-ready`).
 - **Brief per round.** A self-contained brief — the Builder shouldn't need conversation history.
   - **Briefing Sub-agent**: For complex rounds, invoke a sub-agent to autonomously generate the brief by analyzing `docs/STATE.md`, the `feature-doc`, and the results of the previous round. See `templates/builder-brief.md` for the schema.
@@ -32,6 +31,7 @@ Distills the pattern of a parent agent driving Builder sub-agents through a feat
 When dispatched for a round, the Builder:
+0. **Pre-flight.** Verifies the current branch is **not** `main` / `master`. If it is, refuses immediately and returns a blocking report — does not proceed to step 1.
 1. Reads the brief, the plan file, `docs/STATE.md`, and any ADRs / feature docs the brief cites.
 2. **Executes the listed skills sequentially in this single invocation.** When a brief says "Skills (in order): design, tdd, simplify" — that means run all three in this run, not return to the parent between them. This rule is non-obvious; the brief template makes it explicit.
 3. Commits per AC slice (or per behavior slice for refactor rounds), prefix `R<N>:`. **Read [`COMMITS.md`](COMMITS.md) before the first commit** — it captures the seven rules (`R<N>:` prefix, `#<X>` for bug fixes, per-AC slicing, separate simplify-pass commit, separate doc commits, single-commit-with-justification, honest messages) and the message-body shape. The parent reads commits as the review surface; a clean sequence is reviewable one diff at a time, a mono-commit isn't.
@@ -94,16 +94,4 @@ When the final round completes:
 1. Run `verify-real-deps`. Capture surfaced bugs into `docs/known-issues.md`.
 2. Iterate fix-rounds until clean, or document deferrals to vN.1 with rationale.
 3. Tag and publish via whatever release / distribution channel applies.
-hape the Builder returns.
-## Supporting docs
-- [`COMMITS.md`](COMMITS.md) — commit cadence and message style (per-AC slicing, `R<N>:` prefix, when single-commit is OK, honesty rule). Builders read this before the first commit.
-## Handoff
-When the final round completes:
-1. Run `verify-real-deps`. Capture surfaced bugs into `docs/known-issues.md`.
-2. Iterate fix-rounds until clean, or document deferrals to vN.1 with rationale.
-3. Tag and publish via whatever release / distribution channel applies.

package/skills/verify-real-deps/SKILL.md CHANGED Viewed

@@ -1,8 +1,6 @@
 ---
 name: verify-real-deps
 description: Pre-tag smoke test against real third-party APIs. Use after `prod-ready` is clean, before tagging vN.0 — the gate that catches wire-shape mismatches that fakes accept but real upstreams reject. Triggered when the user mentions "smoke test", "real API", "live verify", "before tag", or "end-to-end against actual <vendor>". Pairs with `prod-ready` (which catches ops/infra issues tests miss) and `tdd-rounds` (the orchestration that feeds into this gate).
-complexity: high
-expected_duration: 45 minutes
 ---
 # Verify Against Real Dependencies

package/skills/verify-real-deps/templates/known-issues.md CHANGED Viewed

@@ -1,3 +1,14 @@
+---
+type: known-issues
+title: "Known issues — post-vN.0 smoke test findings"
+description: Bugs surfaced by the first end-to-end run against the real upstream.
+tags: [smoke-test, known-issues]
+timestamp: YYYY-MM-DD
+---
+<!-- Frontmatter is OKF per skills/formats/OKF.md (`type: known-issues`).
+     `timestamp` tracks the last edit; the Discovered line below pins the first run. -->
 # Known issues — post-vN.0 smoke test findings
 **Discovered:** YYYY-MM-DD (first end-to-end run against real <upstream>).

package/skills/zoom-out/SKILL.md CHANGED Viewed

@@ -2,8 +2,6 @@
 name: zoom-out
 description: User-invoked utility — pulls the agent up an abstraction layer when the user is lost in unfamiliar code. Produces a map of relevant modules, callers, and seams in `docs/CONTEXT.md` vocabulary. Use when the user says "I'm lost", "zoom out", "give me higher-level context", "I don't know this area", "what depends on what here", or invokes the slash command. Does not change which workflow the user is in — interrupts to orient, then hands back. Skip when the user already has the map and just needs to read code.
 disable-model-invocation: true
-complexity: low
-expected_duration: 5 minutes
 ---
 # Zoom Out