npm - @lemoncode/lemony - Versions diffs - 0.1.0 - Mend

@lemoncode/lemony 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (68) hide show

package/LICENSE +21 -0
package/PRIVACY.md +147 -0
package/README.md +189 -0
package/catalog/VERSION +1 -0
package/catalog/agents/README.md +29 -0
package/catalog/agents/architect.md +81 -0
package/catalog/agents/fit-assessment.md +94 -0
package/catalog/agents/implementer.md +67 -0
package/catalog/agents/orchestrator.md +627 -0
package/catalog/agents/reviewer.md +124 -0
package/catalog/agents/spec-author.md +69 -0
package/catalog/agents/ui-designer.md +25 -0
package/catalog/commands/add-capability.md +69 -0
package/catalog/commands/bypass.md +40 -0
package/catalog/commands/define.md +24 -0
package/catalog/commands/hotfix.md +47 -0
package/catalog/commands/pause.md +52 -0
package/catalog/commands/resume.md +56 -0
package/catalog/commands/spinoff.md +59 -0
package/catalog/commands/triage.md +24 -0
package/catalog/harness.config.schema.json +116 -0
package/catalog/hooks/README.md +56 -0
package/catalog/hooks/init.sh +281 -0
package/catalog/hooks/lib/lemony.sh +41 -0
package/catalog/hooks/lib/playbook-scan.sh +394 -0
package/catalog/hooks/lib/transcript-grep.sh +56 -0
package/catalog/hooks/require-playbook.sh +97 -0
package/catalog/hooks/session-close.sh +232 -0
package/catalog/hooks/suggest-playbook.sh +72 -0
package/catalog/playbook-format.md +198 -0
package/catalog/schemas/README.md +13 -0
package/catalog/schemas/tier2-events-history.md +104 -0
package/catalog/schemas/tier2-events.md +286 -0
package/catalog/skills/README.md +62 -0
package/catalog/skills/bootstrap-architecture/SKILL.md +78 -0
package/catalog/skills/code-explorer/SKILL.md +76 -0
package/catalog/skills/grill-with-docs/ADR-FORMAT.md +49 -0
package/catalog/skills/grill-with-docs/CONTEXT-FORMAT.md +77 -0
package/catalog/skills/grill-with-docs/SKILL.md +270 -0
package/catalog/skills/grill-with-docs/reference.md +236 -0
package/catalog/skills/mutation-testing/SKILL.md +84 -0
package/catalog/skills/note-side-finding/SKILL.md +89 -0
package/catalog/skills/playbook-iterate/SKILL.md +78 -0
package/catalog/skills/prd-to-spec/SKILL.md +181 -0
package/catalog/skills/raise-discovery/SKILL.md +112 -0
package/catalog/skills/resolve-discovery/SKILL.md +123 -0
package/catalog/skills/review-pr/SKILL.md +106 -0
package/catalog/skills/review-pr/reference.md +105 -0
package/catalog/skills/security-review/SKILL.md +90 -0
package/catalog/skills/senior-review/SKILL.md +99 -0
package/catalog/skills/silent-failure-hunter/SKILL.md +76 -0
package/catalog/skills/spec-compliance-check/SKILL.md +74 -0
package/catalog/skills/spec-to-issue/SKILL.md +88 -0
package/catalog/skills/task-closeout/SKILL.md +229 -0
package/catalog/skills/tdd/SKILL.md +171 -0
package/catalog/skills/test-gap-report/SKILL.md +71 -0
package/catalog/skills/triage-issue/SKILL.md +102 -0
package/catalog/skills/update-architecture/SKILL.md +69 -0
package/catalog/skills/verify/SKILL.md +90 -0
package/catalog/skills/write-adr/SKILL.md +77 -0
package/catalog/templates/README.md +32 -0
package/catalog/templates/claude-code/.claude/settings.json.tpl +34 -0
package/catalog/templates/claude-code/agents.md.tpl +109 -0
package/catalog/templates/claude-code/docs/playbooks/README.md.tpl +96 -0
package/catalog/templates/claude-code/harness.config.yml.tpl +59 -0
package/catalog/templates/claude-code/state/history.md.tpl +6 -0
package/dist/cli.mjs +5691 -0
package/package.json +80 -0

package/LICENSE ADDED Viewed

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Lemoncode
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/PRIVACY.md ADDED Viewed

@@ -0,0 +1,147 @@
+# Privacy
+_Last updated: 2026-06-21 · Applies to: Lemony anonymous telemetry (v1)._
+Lemony collects a small amount of **anonymous** usage telemetry to improve the
+harness. **It is on by default**; you can turn it off at any time (see
+[How to opt out](#how-to-opt-out)), and you can inspect exactly what would be sent
+with `lemony telemetry show`. This document explains exactly what is collected, why
+it is genuinely anonymous, who the parties are, and how to turn it off. It is written
+for the developers who run Lemony and is versioned with the code — when the posture
+changes, the change is in git history.
+> **Scope.** This describes Lemony's **v1 `anonymous` tier only** — the only tier
+> that ships today, on by default. A future `project` tier (identified at
+> org-level, for contracted Lemoncode clients) is designed but **not built or
+> shipped**; when it is, this document gains the corresponding lawful-basis,
+> retention, and data-subject-rights sections, after a formal legal review.
+## Who we are
+- **Data controller:** Lemoncode S.L.
+- **Contact:** daniel.sanchez@lemoncode.net
+## What is collected
+Telemetry events are written locally to `.claude/state/events.jsonl` as you work.
+When telemetry is on, each event is **sanitized** before it leaves your machine and
+only the anonymous projection is sent. The sanitizer keeps two kinds of field and
+drops everything else (fail-closed — any field not explicitly allowed is dropped):
+**Collected (anonymous):**
+- `ts` — the event timestamp.
+- `harness_version` — the Lemony version that produced the event.
+- **Numeric metrics** — e.g. session duration, cycle time, review-rejection counts,
+  iteration counts, step counts.
+- **Internal enumerations** — fixed, code-defined categories: the event `type` (one
+  of nine), task levels, close reasons, severities, checkpoint results.
+- **`attributed_name`** — a short label naming the harness **component** (an agent,
+  skill, or playbook) attributed to an event; this is the metric we most care about
+  ("which component causes friction"). For the components Lemony ships, these are
+  roster names shared across every install — not your code, your project, or free
+  text. (It is a bounded free string, not a strictly-enforced enum; the
+  custom-local-component edge is covered by the anonymity caveat below.)
+**Never sent:**
+- `user` (your git identity) — this is marked **local-only**: it never leaves your
+  machine in any tier.
+- `project` and task identifiers (`task_id`, `parent_task_id`) — project-correlating
+  identifiers, dropped from the anonymous projection.
+- **Free text** — anything you typed: spec topics, follow-up descriptions, review
+  rejection reasons, bypass reasons.
+You can see exactly what would be sent for your own data at any time:
+```bash
+lemony telemetry show     # prints each local event raw, then its sanitized projection
+lemony telemetry status   # shows whether telemetry is on/off and why
+```
+The full field-by-field classification is the schema document
+[`catalog/schemas/tier2-events.md`](./catalog/schemas/tier2-events.md); the
+sanitizer code (`src/telemetry/`) is its executable mirror and is public.
+## Why this is genuinely anonymous
+The `anonymous` tier carries **no persistent identifier** — no account, no device
+ID, no install UUID — so events cannot be linked to you or to each other across
+sessions. The only non-numeric values are timestamps and fixed internal
+enumerations; the one component-name field, `attributed_name`, refers to a
+catalog component that is identical across all installs, so it identifies a part of
+**Lemony**, not a person or a project.
+Because the data carries no identifier and cannot reasonably be re-linked to an
+individual, we **regard it as genuinely anonymous and, in our assessment, outside the
+scope of the GDPR** — pending the formal legal review noted below. This assessment
+rests on the data staying genuinely anonymous: if a future change ever introduced a
+re-identification vector — e.g. unique locally-authored component names in a small
+population — that change would have to revisit this classification first.
+## Cloudflare sub-processor and your IP address
+Telemetry is sent over HTTPS to a Cloudflare Worker that writes the anonymous
+payload to private storage.
+- **Cloudflare, Inc.** acts as a **sub-processor**. Because Cloudflare terminates
+  TLS at its edge, it transiently sees the **source IP address** of the request.
+- **We never read or store the IP address.** The Lemony Worker does not log it,
+  does not persist it, and there is no IP Logpush. The IP exists only in transit and
+  is used by Cloudflare's edge for connection handling and an abuse / rate-limit
+  rule.
+The IP, while in transit, is the only piece of personal data touched anywhere in
+this pipeline. Our lawful basis for that transient processing is **legitimate
+interest** — operating the service securely and preventing abuse — and it is
+minimized to the maximum: not persisted, not logged, never associated with the
+anonymous payload.
+## Retention
+Because the collected payload is genuinely anonymous (no personal data, no
+identifier), it is retained **indefinitely** to support long-term, version-aware
+analysis of how the harness is used. The transient IP seen by Cloudflare is **not
+retained at all**.
+## How to opt out
+Anonymous telemetry is **on by default**. You can turn it off at any level — the
+**first** matching switch below wins:
+1. **`DO_NOT_TRACK` environment variable** (cross-tool standard) — set it to any
+   value other than `0`/empty and Lemony, along with every other tool that honors
+   the standard, will not send telemetry.
+2. **`LEMONY_TELEMETRY_DISABLED` environment variable** — set to `1`, `true`, or
+   `yes` to disable Lemony telemetry machine- or shell-wide.
+3. **Per-checkout opt-out** — run `lemony telemetry disable`. This writes a local,
+   gitignored opt-out for this repository checkout only (it does not silence your
+   teammates). Add `--purge-local` to also delete the local `events.jsonl` and the
+   send cursor:
+   ```bash
+   lemony telemetry disable                # off for this checkout
+   lemony telemetry disable --purge-local  # off, and delete local event data
+   ```
+4. **Team-wide off-switch** — commit `telemetry.enabled: false` in your
+   `harness.config.yml` to disable telemetry for everyone who uses that repository.
+Run `lemony telemetry status` at any time to see whether telemetry is on and which
+switch decided it. Re-enable a per-checkout opt-out with `lemony telemetry enable`.
+## Transparency
+Everything about this pipeline is inspectable:
+- The exact data for **your** events — `lemony telemetry show`.
+- The field-by-field schema — [`catalog/schemas/tier2-events.md`](./catalog/schemas/tier2-events.md).
+- The sanitizer, send engine, and Worker — all public source (`src/telemetry/`,
+  `telemetry-worker/`).
+## Changes and review
+This document is self-drafted following established industry disclosures and is
+versioned with the code. A **formal legal review** is scheduled before Lemony is
+published as a public open-source package. Material changes to what is collected or
+how it is processed will be reflected here and in git history.

package/README.md ADDED Viewed

@@ -0,0 +1,189 @@
+# 🍋 Lemony
+> A Harness for AI Coding.
+`@lemoncode/lemony` brings a reproducible **Spec-Driven Development (SDD)** workflow to
+AI coding agents. A single CLI installs — into any repo — an **Orchestrator** that
+dispatches work to fresh-context sub-agents (spec, implementation, review, architecture),
+a generic skill catalog, lifecycle hooks, anonymous telemetry, and a versioned
+update/rollback system. You own your code and your domain knowledge; the harness owns the
+framework around it and keeps it up to date.
+**Target:** Claude Code is the only operative target today; the design is
+abstraction-ready, so Cursor / Codex / OpenCode plug in later without a refactor.
+**Status:** `0.x` — still evolving. Pins are exact and any pre-1.0 minor can break; `1.0.0`
+lands when the CLI verbs, config schema, and catalog contract are stable enough to commit
+to semver.
+---
+## Prerequisites
+- **Node.js ≥ 24** and **npm ≥ 11** (`node --version`).
+- **[Claude Code](https://claude.com/claude-code)** installed in the target repo.
+## Install
+The package is public on npm — no token, no registry config:
+```bash
+npm i -D @lemoncode/lemony   # add the harness to the project
+npx lemony install           # scaffold the harness into this repo
+```
+The devDependency is the recommended setup: the harness's hooks and agents emit telemetry
+through a launcher that resolves the CLI from the project's own `node_modules/.bin` first,
+then a global install. That local bin is pinned to the repo, so it survives a global `PATH`
+change or a Node version switch (`fnm`/`nvm`).
+> No `package.json` (a non-Node repo)? A one-shot `npx @lemoncode/lemony install` still
+> scaffolds the harness; telemetry then needs a **global** install
+> (`npm i -g @lemoncode/lemony`) so the launcher can resolve the CLI.
+`install` is **fresh-only** and writes, into the current repo:
+```
+.claude/
+├── agents/             # the agent role instances (orchestrator, spec, impl, review, …)
+├── skills/             # the eligible slice of the generic skill catalog (by capability)
+├── settings.json       # the `hooks` block (vendor-owned) merged into your settings
+└── .harness/
+    └── baseline/<ver>/ # pristine copy of every installed vendor file (the merge base)
+agents.md               # Orchestrator entry point for Claude Code
+harness.config.yml      # your harness config (see below)
+```
+Your own files — `CLAUDE.md`, `CONTEXT.md`, `docs/`, playbooks — are never overwritten. If
+you run `install` over a `.claude/` that already has managed content, it reconciles instead
+of refusing (interactive on a TTY).
+## A day with the harness — when to run what
+> Two surfaces: **CLI verbs** in your **terminal** (setup & maintenance), **slash commands**
+> **inside Claude Code** (the daily work).
+**① First time in a repo** _(terminal)_
+```bash
+npx lemony install     # scaffold the harness
+npx lemony doctor      # confirm it's healthy
+```
+**② You have an idea or a new feature** _(Claude Code)_ — `/define "add CSV export to the reports page"`
+Runs the full SDD round-trip. You stop at two human gates; nothing else needs you:
+1. the Orchestrator **grills** the idea into a reviewable PRD,
+2. opens the task issue + branch, dispatches a fresh-context **Spec Author**,
+3. **▸ approval gate** — you read the spec cold and approve,
+4. the **Implementer** builds it (TDD), the **Reviewer** checks security / tests / spec-compliance,
+5. **▸ merge gate** — you approve the PR; on merge, the task closes out.
+It **honors every human gate and never self-approves** — that, plus fresh-context
+sub-agents, is the difference from prompting an agent ad hoc.
+**③ A small bug, not a whole feature** _(Claude Code)_ — `/triage "dates render a day off in Safari"`
+The lightweight **L2** path: find root cause → fix → review → merge, without the full PRD ceremony.
+**④ Production's on fire** _(Claude Code)_ — `/hotfix "checkout 500s on empty cart"`
+Fix and ship now; the review runs async and the issue + postmortem are backfilled after.
+**⑤ Mid-task you spot an _unrelated_ defect** _(Claude Code)_ — `/spinoff "typo in the 404 page copy"`
+Captures it as a tracked `pending` stub and **keeps you on your current task** — no context
+switch. You pick it up later with `/resume`.
+**⑥ Stopping for the day, and coming back** _(Claude Code)_
+- `/pause` — writes a resume narrative + a `session_closed` event so future-you starts cold.
+- `/resume` — lists the open queue (spec-ready, in-progress, `pending` stubs, parked
+  closeouts) and picks one up from its branch + state.
+**⑦ Throwaway or spike, no ceremony** _(Claude Code)_ — `/bypass`
+Escape hatch: records one `l3_bypass` event, then you work with no issue / spec / review.
+> **Not sure which mode?** Just describe what you want — the Orchestrator routes _"I have an
+> idea for…"_ → DEFINE, _"there's a bug in…"_ → TRIAGE, _"continue #42"_ → RESUME.
+| Slash command     | Mode | Use it to                                                                                                                                                                   |
+| ----------------- | ---- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `/define`         | L1   | Start a new feature/idea: grill → PRD → spec → implement → review → merge.                                                                                                  |
+| `/triage`         | L2   | Handle a small bug on the lightweight path.                                                                                                                                 |
+| `/hotfix`         | L2!  | Urgent fast-track: fix and ship now; the review runs async, issue + postmortem are backfilled.                                                                              |
+| `/spinoff`        | —    | Capture an unrelated, non-blocking defect mid-task as a tracked stub — without leaving your current task.                                                                   |
+| `/resume`         | —    | Pick up an existing harness task (or a `/spinoff` stub) from its branch and state.                                                                                          |
+| `/pause`          | —    | Pause the session: writes a narrative resume + a `session_closed` event.                                                                                                    |
+| `/bypass`         | L3   | Escape hatch: record one `l3_bypass` event, then work with no issue/state/review.                                                                                           |
+| `/add-capability` | —    | Activate an opt-in capability your repo reported as available (see [Opt-in capabilities](#opt-in-capabilities)) — e.g. have the Architect bootstrap `docs/architecture.md`. |
+Behind the slash commands the harness installs a catalog of **agents** (orchestrator,
+spec-author, implementer, reviewer, architect, ui-designer) and **skills** (`grill-with-docs`,
+`tdd`, `senior-review`, `security-review`, …) that the agents invoke — you don't call skills
+directly; the Orchestrator and sub-agents do, gated by your repo's capabilities.
+## Commands
+The CLI ships nine verbs. Run `lemony <command> --help` (or `-h`) for usage;
+`lemony version` (or `-v`) prints the installed version.
+| Command     | What it does                                                                    | Key flags                                                                                    |
+| ----------- | ------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
+| `install`   | Install into a fresh repo, or reconcile a pre-existing `.claude/`.              | `--target=<claude-code>` `--task-storage-repo=<owner/name>` `--on-conflict=<vendor\|client>` |
+| `update`    | Move the install to the CLI's catalog version (3-way merge).                    | `--on-conflict=<vendor\|client>` `--dry-run`                                                 |
+| `repair`    | Re-sync at the **pinned** version (restore missing files, never clobber edits). | `--dry-run`                                                                                  |
+| `rollback`  | Restore a pre-change snapshot (offline).                                        | `--to=<version>` `--list` `--cleanup` `--force`                                              |
+| `uninstall` | Remove vendor-managed files (keeps your docs, state, adopted skills).           | `--labels`                                                                                   |
+| `doctor`    | Diagnose the installation (read-only); proposes `repair`.                       | —                                                                                            |
+| `status`    | Show installed version, branch drift, and open tasks.                           | —                                                                                            |
+| `emit`      | Append a telemetry event to `.claude/state/events.jsonl`.                       | `<type> [--key=value …]`                                                                     |
+| `telemetry` | Inspect or control the local anonymous telemetry.                               | `status` `show` `disable` `enable`                                                           |
+The harness keeps a **committed baseline** under `.claude/.harness/baseline/<version>/` — a
+verbatim copy of every installed vendor file at the pinned version — so `update` is a true
+3-way merge that travels with your repo. Overlapping edits get git-style `<<< / >>>` markers
+and a report; your copy is always snapshotted first, so `rollback` is one command away.
+## `harness.config.yml`
+The installer writes a config that pins and shapes your install (validated with a strict
+schema; `harness.config.schema.json` ships for IDE autocomplete):
+| Key              | Meaning                                                      |
+| ---------------- | ------------------------------------------------------------ |
+| `vendor_version` | Exact semver pin. `update` bumps it; never edit by hand.     |
+| `target`         | Which AI-coding harness the install targets (`claude-code`). |
+| `task_storage`   | Where tasks live (`owner/name` of the issues repo).          |
+| `agents`         | Per-agent overrides.                                         |
+| `paths`          | Where managed files land.                                    |
+## Opt-in capabilities
+Some skills stay dormant until your repo keeps a **convention file** — the harness never
+creates one for you (that would impose an architecture), so you opt in deliberately.
+- **What it is.** Today the one convention is **`docs/architecture.md`**. Keep it and the
+  `update-architecture` skill activates on your next `update` / `repair`, keeping that map
+  current as your system's shape changes.
+- **See what's available.** `install` lists the capabilities your repo qualifies for, and
+  `lemony doctor` shows them under an `ℹ capabilities` line — informational, never a failure.
+- **Turn one on.** Run **`/add-capability`** inside Claude Code: it has the right agent
+  author the convention file for you, then re-syncs so the gated skill installs.
+## Privacy
+The harness emits **anonymous, opt-out** telemetry to improve the framework.
+- **What's collected.** Coarse usage signals only — **no source, no file contents, no
+  identifiers**. Events are written locally to `.claude/state/events.jsonl` first.
+- **Opt out.** Set `DO_NOT_TRACK=1`, or run `npx lemony telemetry disable`.
+- **Inspect it.** `npx lemony telemetry status` / `show` shows exactly what's on disk.
+Full details are in `PRIVACY.md`, shipped with the package.
+---
+Made with 🍋 by [Lemoncode](https://lemoncode.net).

package/catalog/VERSION ADDED Viewed

	@@ -0,0 +1 @@
1	+ 0.1.0

package/catalog/agents/README.md ADDED Viewed

@@ -0,0 +1,29 @@
+# agents/ — role catalog templates
+> **Status: 1 hat + 5 sub-agents, all skill-gated (P4 slice 2).** `orchestrator` (hat)
+> weaves its skills into its prose. The sub-agents `spec-author`, `implementer`,
+> `reviewer`, and the on-demand `architect` each carry a single `{{SKILLS}}` marker the
+> installer fills with the skills they invoke, gated by `applies-when` capabilities
+> (decision #31; the profile tier #39 is retired — ADR 0015). The `architect` is always
+> installed but invoked **on-demand** —
+> outside the linear flow (P4 slice 2 authored it + its skills). `ui-designer` stays
+> inactive (#11).
+These are the **vendor role catalog templates**, not client instances. The
+installer renders them into a client's `.claude/agents/<role>.md` with the repo's
+resolved skills (decision #31, #32).
+The harness runs as **1 hat + 4 sub-agents** plus one inactive slot (decisions #10,
+#29):
+| Role                              | Reification                          | When invoked                                       |
+| --------------------------------- | ------------------------------------ | -------------------------------------------------- |
+| [orchestrator](./orchestrator.md) | hat (continuous, human-facing)       | entry point, RESUME, closeout, discovery mediation |
+| [spec-author](./spec-author.md)   | sub-agent (fresh context)            | PRD → spec, create GitHub issue                    |
+| [implementer](./implementer.md)   | sub-agent (fresh context)            | implement spec via TDD                             |
+| [reviewer](./reviewer.md)         | sub-agent (fresh context)            | validate against spec, tests, security             |
+| [architect](./architect.md)       | sub-agent, on-demand                 | ADRs, architecture docs, playbook iteration        |
+| [ui-designer](./ui-designer.md)   | slot — **inactive by default** (#11) | —                                                  |
+**Fresh context** is the distinctive value of a sub-agent over the hat: it prevents
+confirmation bias and forces self-sufficient artifacts.

package/catalog/agents/architect.md ADDED Viewed

@@ -0,0 +1,81 @@
+---
+name: architect
+description: On-demand sub-agent for decisions about the system's shape — record an ADR, keep docs/architecture.md true, iterate a client playbook, or explore an unfamiliar codebase. Invoked only by the Orchestrator, with fresh context; it proposes and the human decides, and is never part of the linear DEFINE/TRIAGE flow.
+role: Architect
+reification: sub-agent (on-demand)
+invoked-when: on-demand — an ADR / architecture update / playbook iteration, or an explicit request
+origin: vendor
+vendor_version: '{{vendor_version}}'
+---
+# Architect
+A **sub-agent** with fresh context, invoked **on-demand** — never part of the linear
+DEFINE / TRIAGE flow. The Orchestrator is its **only** invoker. Clean eyes for decisions
+about the system's shape: recording why a choice was made, keeping the architecture map
+true, exploring an unfamiliar codebase, and iterating the client's playbooks.
+The Architect **proposes**; the human (via the Orchestrator) **decides**. It owns the
+ADRs, `docs/architecture.md`, and the client's playbooks — but playbooks are
+client-owned (decision #8), so it never imposes content, it suggests changes.
+## When the Orchestrator invokes you
+You are dispatched for one of these, with the relevant context (the discovery entry, the
+decision, the change, or the request):
+| Trigger                                                                                | Skill to run          |
+| -------------------------------------------------------------------------------------- | --------------------- |
+| A discovery resolution worth recording as a durable decision                           | `write-adr`           |
+| A `T6 PLAYBOOK_CONFLICT` resolution that changes a playbook (or "capture how we do X") | `playbook-iterate`    |
+| A change altered the architecture **and** `docs/architecture.md` exists                | `update-architecture` |
+| Orienting in a large or unfamiliar codebase before a decision/spec                     | `code-explorer`       |
+| A grill that a `T6` shows needs a **major** playbook re-think                          | `grill-with-docs`     |
+These are conditions, not a sequence — run the one you were invoked for. Which skills
+are installed depends on the repo's capabilities (see Skills below); run whichever landed.
+Your most reliable activation is **closeout** (#138, ADR 0010): the `task-closeout` skill
+drives `write-adr`, `update-architecture`, and `playbook-iterate` at the end of every task,
+in cold blood, so durable capture isn't lost to mid-task resume pressure. There the
+Orchestrator invokes you **automatically** for `update-architecture` (when
+`docs/architecture.md` exists — the map tracks reality, no offer), and on the human's
+acceptance for `write-adr` / `playbook-iterate`. See the Orchestrator's §Closeout.
+## Operating procedure
+1. **Orient.** Read the context the Orchestrator gave you — the discovery entry and its
+   resolution, the changed code, or the request. If the codebase is unfamiliar and the
+   task needs it, run `code-explorer` first.
+2. **Do the one thing you were invoked for**, via its skill. Keep edits surgical and
+   the rationale linked, not duplicated: an ADR records the _why_, `architecture.md` the
+   _shape_, a playbook the _reusable pattern_. Don't restate one in another.
+3. **Stay project-agnostic in playbooks.** A rule tied to this one codebase belongs in
+   `CLAUDE.md` / an ADR / `CONTEXT.md`, never a playbook. If a "playbook" change is
+   really project-specific, say so and route it there instead.
+4. **Raise a discovery, don't improvise.** If you hit a T1–T6 case the context didn't
+   anticipate — the decision contradicts an existing ADR (T1), an unspecified fork (T2),
+   the change already exists (T4) — **run the `raise-discovery` skill**: write the entry,
+   return the one-line summary, and stop. The Orchestrator mediates and re-invokes you.
+   If instead you spot an **independent**, non-blocking defect while mapping or deciding
+   — unrelated to the artifact at hand — **run `note-side-finding`** to add it to your
+   return summary rather than raising a pausing discovery.
+5. **Report back** to the Orchestrator: the artifact written/changed (path) and a
+   one-line summary — or, if the trigger didn't actually warrant the artifact (an ADR
+   that fails the three tests, a change that isn't architecturally significant), say so
+   and write nothing. The Orchestrator owns the human dialogue and the label lifecycle.
+## Grill condition
+`grill-with-docs` is **shared** with the Orchestrator but you invoke it **only** under
+the documented condition `IF discovery == T6 AND the playbook requires major iteration`
+— a re-think too large for a single `playbook-iterate` edit. The Orchestrator's
+DEFINE-mode grill is a different use; there is no operational overlap.
+## Skills
+The installer fills this list with the skills your repo's capabilities resolved to
+(decision #31); each skill renders with the condition that triggers it. The rich "how"
+of each lives in its own `SKILL.md`.
+{{SKILLS}}

package/catalog/agents/fit-assessment.md ADDED Viewed

@@ -0,0 +1,94 @@
+# Task-fit assessment — the dial
+> Vendor reference doc. The **canonical, operative nudge criterion** lives in
+> `orchestrator.md` (§"Task-fit assessment") — that file is the hat, always
+> loaded, so it is the single source the Orchestrator acts on. This doc is the
+> fuller model and worked examples behind that one-paragraph rule; when the two
+> ever drift, `orchestrator.md` wins.
+The harness is a **dial, not an on/off switch** (decisions #57–#61). Every
+incoming task lands at one of three levels of ceremony. The Orchestrator (an
+LLM) classifies — there is **no runtime scorer** (a keyword heuristic would be
+less accurate and would bias toward the expensive failure; a data-driven scorer
+is a later-phase revisit once telemetry exists).
+## The three levels
+| Level                | Ceremony                                                     | For                                                                    | Path               |
+| -------------------- | ------------------------------------------------------------ | ---------------------------------------------------------------------- | ------------------ |
+| **L1 — full SDD**    | Grill → spec → human gate → implement → review → merge gate  | Real features; anything whose intent is worth pinning down before code | DEFINE (`/define`) |
+| **L2 — lightweight** | Issue + branch + implement (TDD) + light review → merge gate | Small bugs that still deserve a record and a reviewer, not a spec      | TRIAGE (`/triage`) |
+| **L3 — bypass**      | None — no issue, no task state, no review                    | Typos, mechanical renames, lockfile bumps, comment fixes               | `/bypass`          |
+The gradient is **cost of ceremony vs. cost of a missed mistake**. L1 spends the
+most ceremony because a misunderstood feature is the most expensive thing to
+discover at review (or in production). L3 spends none because there is nothing a
+spec or a reviewer would catch that the change itself doesn't already make
+obvious.
+## The criterion: specifiable, verifiable, or reviewable
+A task **deserves the harness** (L1 or L2) if it is at least **one** of:
+- **Specifiable** — you could write down, ahead of time, what "done" means in a
+  way that could be wrong. If there's a spec worth agreeing on before code, it's
+  L1.
+- **Verifiable** — there's a behaviour a test could pin down (a red test today,
+  green after). If the change has a testable contract, it earns at least L2.
+- **Reviewable** — a second reader could catch a real mistake in it. If the diff
+  has judgement in it (logic, naming that encodes intent, an edge case), a
+  reviewer adds value.
+If a task meets **none** of the three — there's nothing to specify, nothing a
+test would meaningfully assert, nothing a reviewer would catch — it's **L3**.
+**When in doubt, keep it in the harness.** A wasted bit of ceremony is cheaper
+than an unreviewed bug. The nudge toward `/bypass` is a single, discardable
+question — it never blocks.
+## L1 vs. L2: spec or no spec
+Both run the branch → PR → human merge gate; the difference is the **front half**:
+- **L1** when the _intent_ is what's at stake — a feature, a behaviour change, a
+  decision that a human should sign off on _before_ code exists. The spec, not
+  the code, is the source of truth, so the approval gate sits at the head of
+  implementation.
+- **L2** when the intent is obvious but the _fix_ still wants a record and a
+  reviewer — a bug with a clear root cause. Skip the spec and its gate; keep the
+  issue, the branch, the TDD loop, and the review.
+The mechanical marker: an L1 task carries `harness:sdd`; an L2 task does not (its
+absence is what selects the lightweight path).
+## Worked examples
+| Task                                                  | Specifiable?                         | Verifiable?                    | Reviewable? | Level  |
+| ----------------------------------------------------- | ------------------------------------ | ------------------------------ | ----------- | ------ |
+| "Add SSO login with role-based access"                | yes — many decisions to pin          | yes                            | yes         | **L1** |
+| "Export the report as CSV"                            | yes — column set, escaping, encoding | yes                            | yes         | **L1** |
+| "Pagination returns the wrong total on the last page" | no spec needed                       | yes — red test on the boundary | yes         | **L2** |
+| "Null deref when the user has no avatar"              | no                                   | yes — reproduce then fix       | yes         | **L2** |
+| "Rename `usr` → `user` across the module"             | no                                   | no — mechanical                | barely      | **L3** |
+| "Fix a typo in a log string"                          | no                                   | no                             | no          | **L3** |
+| "Bump the lockfile after `npm audit fix`"             | no                                   | no                             | no          | **L3** |
+The borderline cases are the interesting ones. A "rename" that also changes a
+public API is **L2** (a reviewer should see the call-site churn). A "typo fix" in
+a user-facing error message that other code matches on is **L2** (something could
+break). The properties, not the surface verb, decide.
+## Escape hatches
+- **`/bypass`** — force L3 on a task the dial would otherwise pull into the
+  harness. It emits exactly one `l3_bypass` telemetry event (so misclassification
+  is observable later) and then does the work with no issue, no task state, no
+  review. The escape hatch is deliberate and recorded, not silent.
+- **`/hotfix`** — an urgent L2 that skips the _human-wait_ gates (no approval
+  pause) while the Reviewer still runs asynchronously. The urgency buys speed,
+  not the loss of a second pair of eyes; the retroactive issue and a short
+  postmortem are backfilled after the fire is out.
+Both are documented inline in their command files (`commands/bypass.md`,
+`commands/hotfix.md`) — the concrete mechanic lives with the command; the
+classification logic stays here and in `orchestrator.md`.

package/catalog/agents/implementer.md ADDED Viewed

@@ -0,0 +1,67 @@
+---
+name: implementer
+description: Implement an approved change via TDD on the task branch — red/green/refactor per behavior, keeping progress.md live, verifying before signaling done. Invoked by the Orchestrator after spec-approval (L1) or triage (L2), with fresh context; it raises a discovery instead of improvising when reality contradicts or exceeds the plan (T1–T6).
+role: Implementer
+reification: sub-agent
+invoked-when: post spec-approval (L1) or post-triage (L2) — implement the change
+origin: vendor
+vendor_version: '{{vendor_version}}'
+---
+# Implementer
+A **sub-agent** with fresh context. Implements the approved change via TDD.
+## Operating procedure
+1. **Orient** — read the issue, the task state (`.claude/state/tasks/<id>/`), and any
+   relevant ADRs / playbooks before writing code. **If `docs/architecture.md` exists, read
+   it** — it is the maintained map of the system's shape, and shape matters most to whoever
+   writes code: don't violate a boundary/seam the map states. Trust it for the parts your
+   change won't touch; the slice you do edit you're reading anyway, so verify there. It is
+   **absent by default** — orient as today and never suggest creating it (decision #8). You
+   work on the task branch `harness/<id>-<slug>` the Orchestrator created; the spec is
+   already committed there.
+2. **Implement via TDD** — run the `tdd` skill: one red → green → refactor cycle per
+   behavior (vertical slices, never all-tests-then-all-code).
+   **Scope: exactly what the invocation hands you.** By default that is the whole
+   `tasks.md` list (all-at-once). In **step-by-step mode** (#176) the Orchestrator
+   invokes you per task — with **one** `tasks.md` task, or that task plus reviewer or
+   human-checkpoint feedback on a later iteration. Build only that task and stop: do
+   **not** run ahead into the next task (the human checkpoints each one before the next
+   starts), and don't re-open tasks the human already OK'd unless the feedback you were
+   handed says so.
+3. **Keep `progress.md` live** — status, active subtask, decision log, next action,
+   blockers. This is what lets RESUME pick the work back up. In step-by-step mode the
+   file also carries a `Mode:` line and a `## Step log` the **Orchestrator** owns
+   (checkpoint outcomes are its to record) — update your active-subtask state as usual,
+   leave the step-log lines to it.
+4. **Raise a discovery, don't improvise** — if reality contradicts or exceeds the
+   plan (T1–T6), **run the `raise-discovery` skill**: write the entry to
+   `tasks/<id>/discoveries.md`, return the one-line summary, and stop. The Orchestrator
+   mediates with the human and re-invokes you with the decision. Don't code past a
+   contradiction (T1), a genuine unspecified fork (T2), scope drift (T3), or work that
+   already exists (T4). If instead you notice a defect that is **independent** of your
+   task — it doesn't block you and your change never touches it — don't pause and don't
+   fix it: **run `note-side-finding`** to add it to your return summary, then carry on
+   (blocking → discovery; independent → side-finding). The same channel covers
+   **architecturally-significant drift** you notice in `docs/architecture.md` (the map
+   states a boundary / seam your reading shows the code no longer matches) outside the slice
+   you're changing — note it so the drift surfaces to the Orchestrator instead of being
+   silently trusted. Closeout's `update-architecture` sees only your diff, so it won't catch
+   untouched-area drift on its own; reconciling such drift into the map is tracked in #148.
+5. **Verify before signaling done** — run the mechanical gates and exercise the real
+   code path. If the `verify` skill is installed, run it (build / type-check /
+   lint / tests + coverage / audit + a real run); otherwise run those gates inline.
+   Commit your work to the branch and **push it, best-effort** (#178: a failed push —
+   offline, auth — is a warning in your summary, never a blocker; the commits stay
+   safe locally), then return a summary to the Orchestrator. The Orchestrator opens
+   the PR when you signal done — you don't open it.
+## Skills
+The installer fills this list with the skills your repo's capabilities resolved to
+(decision #31); the rich "how" of each lives in its own `SKILL.md`. Client-specific
+opt-in skills (e2e, changeset, …) are appended here per the capability scan.
+{{SKILLS}}