claude-code-kit 0.12.0 → 0.13.0 (tar.gz)

Files changed (230)

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/.claude-plugin/marketplace.json RENAMED

@@ -10,7 +10,7 @@
       "name": "claude-kit",
       "source": "./",
       "description": "Cookiecutter-style scaffolder for an autonomous Claude Code SDLC config (no app code, no Docker): install CLAUDE.md + .claude/ (rules, the profile's agents/skills, hooks, artifact templates) + optional .mcp.json, then run /sdlc to drive spec → review → build → test → security → ship through profile-aware quality gates, working memory, and a self-improving learnings loop.",
-      "version": "0.12.0",
+      "version": "0.13.0",
       "license": "MIT",
       "keywords": ["sdlc", "agents", "orchestration", "quality-gates", "workflow", "scaffold", "cookiecutter"]
     }

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/.claude-plugin/plugin.json RENAMED

@@ -1,6 +1,6 @@
 {
   "name": "claude-kit",
-  "version": "0.12.0",
+  "version": "0.13.0",
   "description": "Cookiecutter-style scaffolder for an autonomous Claude Code SDLC config (no app code, no Docker). `claude-kit init` asks ordered questions and installs CLAUDE.md + .claude/ (rules, the profile's agents/skills, hooks, artifact templates) + optional .mcp.json; run /sdlc to drive spec → review → build → test → security → ship through profile-aware quality gates with working memory and a self-improving learnings loop.",
   "author": {
     "name": "Arjunsingh Yadav",

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/CHANGELOG.md RENAMED

@@ -4,6 +4,89 @@ All notable changes to claude-kit are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/), and the project uses
 [semantic versioning](https://semver.org/).
+## [0.13.0] — 2026-06-15
+A **second improvement brief** (external self-review, post-0.12.0) — Item 0 (a covered-vs-gated audit)
++ P0-1/P0-2, P1-1/P1-2/P1-3, and six P2 items — run through the kit's mandated **adversarial
+reuse-first map→verify** (a 24-agent map→verify pass). The decisive finding repeated from last time:
+several premises were **overstated** against the live files (migration safety was already largely
+enforced; the README "no PyPI yet" text was simply stale; the README is already progressively
+disclosed). The result is a mix of **two new gates wired as data, one new live backend stack, and
+targeted extensions** — **zero new agents/skills/rules** beyond what already existed (core counts
+unchanged: 28 agents · 50 skills · 23 rules).
+### Added
+- **Item 0 — `docs/coverage-audit.md`.** The justification record the briefs kept eliding: every
+  "already covered" capability classified **GATED (enforced) / RULE (always-on) / SKILL-DOC
+  (advisory)** with file evidence. Verifies rollback (GATED enterprise-only; RULE elsewhere), cost
+  (DOC by design), migration safety (overlay-advisory + enterprise rollback), accessibility, and
+  flags the one *looks-enforced-but-isn't* trap (the `accessibility-review` skill's internal "Quality
+  gates" heading is **not** a gate token).
+- **P0-1 — `contract-clear` reaches the default `standard` profile** (API stacks), not just
+  enterprise (`catalog/profiles.yaml`). It still self-skips when the stack exposes no API contract
+  surface, so non-API projects are unaffected. *(Deliberate posture change: 0.12.0 placed it in
+  enterprise under golden-rule-#6 "heavyweight gates default to enterprise"; the brief explicitly
+  authorizes promoting it because breaking-change detection is table-stakes for the headline FastAPI
+  backend. Documented, not silent.)* Owned by `merge-reviewer`; quality-gates §4 + mandatory-workflow
+  §2d + the api-change-report template updated to say "standard+".
+- **P1-1 — a live Go backend stack** (Go · stdlib **net/http**): a pure `catalog/stacks.yaml` entry +
+  `templates/stacks/backend/go/net-http/rules/go-patterns.md` overlay + exact `go` commands
+  (`go build ./...`, `go test ./...`, `go vet`, `gofmt`). Chosen over Node/Express precisely because
+  its build/test command shapes differ most from npm/pip — the strongest test of the stack-agnostic
+  claim. The one supporting code change: a **`build`** key added to `_BACKEND_CMD_KEYS` (compiled
+  backends surface a build command; interpreted ones leave it empty). No `resolve()` branch.
+- **P1-2 — `accessibility-clear` gate** at organization scope, **`regulated` strictness only**
+  (`catalog/org.yaml` `extra_gates`). Owned by `acceptance-reviewer` (read-only, already present at
+  standard+), drives the existing `accessibility-review` skill over changed UI (WCAG-AA), self-skips
+  when no UI surface. Wired in `org.yaml` only, so the `lean⊊standard⊊enterprise` profile invariant is
+  untouched.
+- **`examples/react-fastapi-postgres-feature/`** (P2-2) — a clearly-labelled **synthetic** end-to-end
+  walkthrough: request → feature-spec → story breakdown (coverage gate) → gate verdicts (incl. one
+  defect-loop cycle and a Devil's-Advocate CONFIRMED line) → sample PR diff. Repo reference (like
+  `docs/`), **not** bundled into the wheel.
+- **`docs/eval-harness.md`** (P2-4) — a fill-in template to measure the pipeline with vs without the
+  gates (which gate caught which defect), built on `rules/evals.md` §6 median-of-N. Ships **no**
+  numbers by design (an eval result is environment-specific); honesty rules included.
+- **Self-test matrix** (P2-5) — a parametrized test sweeping **every live frontend × backend ×
+  database × profile × scope** (now 24 combos incl. Go), each resolved + installed + validated +
+  Docker-checked. Driven off `catalog.list_options`, so new live stacks auto-join with no test edit.
+### Changed
+- **P0-2 — migration safety made explicit.** Both `migration-specialist` overlays (postgres + mongodb)
+  already mandated expand/contract, reversible down-path, and idempotent backfill *as agent guidance*;
+  added the explicit hard rule **"no destructive drop in the same release as the code that stops using
+  the old shape"** with **severity** to the always-on overlay RULES (`postgres-patterns.md`,
+  `mongodb-patterns.md`) — so it lives in a rule, not only an agent prompt. (Same-release destruction
+  = at least **High**.)
+- **P1-3 — the PyPI story reconciled.** `claude-code-kit` **is** published (latest 0.12.0); the README
+  install block, troubleshooting row, and a stale `changelog-v0.10.0` badge said otherwise. Install is
+  now `pip install claude-code-kit`; the changelog badge is de-versioned (self-healing); the CI
+  publish machinery (`publish.yml`) was correct and left untouched.
+- **P2-3 (on-ramp, minimal)** — added an **Examples** nav link + pipeline pointer only; the proposed
+  full README restructure was **rejected** (see below). Pipeline gate table + `docs/architecture.md`
+  diagram updated for `contract-clear` (standard+) and the Go stack.
+### Not adopted (deliberately — premise overstated or against the kit's design)
+- **A dedicated migration GATE token (P0-2).** Migrations are overlay-conditioned and not every-run;
+  `resolve()` can't emit stack gates without a branch. Strengthened the always-on overlay rules +
+  reviewer agents instead — enforcement via review + the enterprise rollback gate (`pipeline-green`),
+  per the coverage audit.
+- **Node/Express as the new backend (P1-1).** Chose **Go** instead — its command shapes differ more
+  from the existing npm/pip stacks, which is the whole point of the breadth test. Express/Vue/Svelte/
+  Django remain `planned`.
+- **A full README restructure + GIF (P2-3).** The README already uses progressive disclosure
+  (`<details>`); a big move-to-`docs/` churn is negative-value and a GIF can't be produced here. Added
+  only the example link. (Recording a demo GIF is a human follow-up.)
+- **Relocating the CHANGELOG "Not adopted" blocks to `docs/decision-log.md` (P2-6).** Those blocks are
+  a **marketed feature** the README links to; moving them would break that cross-reference for low
+  value. Added a forward-looking note in `CONTRIBUTING.md` instead (split later *only if* the README
+  link is updated in the same change).
+- **Repo About-box metadata (P2-1)** — host config outside the payload; `gh` is unavailable here.
+  Human follow-up: `gh repo edit ajyadav013/claude-kit --description "Config-only, stack-agnostic
+  autonomous-SDLC scaffolder for Claude Code (plugin + pip)" --add-topic claude-code --add-topic
+  claude-code-plugins --add-topic sdlc --add-topic ai-agents --add-topic agentic-coding --add-topic
+  claude-skills`.
 ## [0.12.0] — 2026-06-15
 An **improvement brief** (external self-review, no repo access) proposed ~15 changes — four P0, five
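The self-test matrix described in the 0.13.0 entry (P2-5) can be pictured as a Cartesian product over the live option axes. The following is a minimal sketch, not the kit's actual test: the option lists are hard-coded here, whereas the changelog says the real suite reads them from `catalog.list_options`. The scope name `project` and the axis sizes (1 × 2 × 2 × 3 × 2) are assumptions chosen because they are one decomposition that yields the stated 24 combinations.

```python
from itertools import product

# Hypothetical option lists for illustration only. The changelog says the real
# matrix is driven off `catalog.list_options`; these values are assumed.
LIVE_OPTIONS = {
    "frontend": ["react"],
    "backend": ["python/fastapi", "go/net-http"],
    "database": ["postgres", "mongodb"],
    "profile": ["lean", "standard", "enterprise"],
    "scope": ["project", "organization"],  # "project" is an assumed scope name
}


def matrix(options):
    """Return one dict per combination of the option axes."""
    axes = list(options)
    return [dict(zip(axes, combo)) for combo in product(*(options[a] for a in axes))]


combos = matrix(LIVE_OPTIONS)
print(len(combos))  # 24 with the assumed lists above
```

Because the combinations are derived from the option lists rather than enumerated by hand, adding a value to any axis grows the matrix without touching the test body, which is the property the changelog attributes to the real suite.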

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/CLAUDE.md RENAMED

@@ -29,8 +29,9 @@ distributed two ways from one source of truth:
 | `templates/org/` | **Org overlay** content (scope-gated, organization only): `skills/`, `agents/` (personas), `rules/` (policy/vibe), `packs/<pack>/{pack.yaml,README.md}`, `README.md`. Wired via `catalog/org.yaml`. The only place org-specific content lives. |
 | `scripts/init.sh` | Thin no-pip fallback scaffolder (copies the full payload; no catalog resolution) |
 | `src/claude_kit/` | The pip CLI (Typer): `cli.py`, `catalog.py` (resolver), `prompts.py`, `models.py`, `scaffold.py` (installer), `render.py` (Jinja2), `hooks.py`, `validator.py`, `upgrader.py` |
-| `tests/` | pytest suite (catalog, render, scaffold, validator, upgrader, CLI) |
-| `docs/architecture.md` · `docs/agents.md` | Architecture diagrams · agent guide |
+| `tests/` | pytest suite (catalog, render, scaffold, validator, upgrader, CLI; incl. the profile×stack×scope self-test matrix) |
+| `examples/` | Synthetic end-to-end `/sdlc` worked example (repo reference; **not** bundled into the wheel) |
+| `docs/architecture.md` · `docs/agents.md` · `docs/coverage-audit.md` · `docs/eval-harness.md` | Architecture diagrams · agent guide · the GATED-vs-RULE-vs-SKILL enforcement audit · the with/without eval template |
 | `pyproject.toml` | Packaging (deps: typer/jinja2/pyyaml); `[tool.hatch...force-include]` bundles the payload into the wheel |
 **One source of truth:** `agents/ skills/ commands/ hooks/ rules/ templates/ catalog/` at the repo

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/CONTRIBUTING.md RENAMED

@@ -83,9 +83,14 @@ a specific stack — `pytest` enforces the no-Docker invariant on a scaffolded p
 1. Bump the version in **all four** places: `pyproject.toml`, `.claude-plugin/plugin.json`, the
    `.claude-plugin/marketplace.json` entry, and `src/claude_kit/__init__.py`.
-2. Add a `CHANGELOG.md` entry.
+2. Add a `CHANGELOG.md` entry, including a **"Not adopted (deliberately)"** block stating what you
+   chose *not* to add and why — this is a marketed feature of the changelog (the README links to it),
+   so keep it. If those blocks ever grow unwieldy they may later split into `docs/decision-log.md`,
+   but **only if** the README's CHANGELOG cross-reference is updated in the same change; until then
+   they stay in `CHANGELOG.md` by design.
 3. `pytest` green, then `python3 -m build && python3 -m twine check dist/*`.
-4. `python3 -m twine upload dist/*` (PyPI).
+4. CI auto-publishes to PyPI on merge to `main` when the version is new (`.github/workflows/publish.yml`,
+   OIDC trusted publishing). Manual `python3 -m twine upload dist/*` is the fallback.
 5. Tag the release and push so plugin users get the update.
 ## License
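Step 1 of the release checklist in the CONTRIBUTING.md hunk above asks for the version to be bumped in four places. A pre-release check along the following lines could catch a missed file. This is a sketch under assumptions, not part of the package: the `__version__` attribute name in `src/claude_kit/__init__.py` and the `plugins` key in the sample `marketplace.json` are guesses, and the function takes file contents as strings so the example stays self-contained.

```python
import json
import re


def _json_versions(node):
    """Collect every string "version" value found anywhere in parsed JSON."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "version" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(_json_versions(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(_json_versions(item))
    return found


def collect_versions(pyproject_text, plugin_text, marketplace_text, init_text):
    """Map each of the four files to the version string it declares (or None)."""
    pyproject = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE)
    # Assumption: the package exposes its version as `__version__ = "..."`.
    init = re.search(r'__version__\s*=\s*["\']([^"\']+)["\']', init_text)
    plugin = _json_versions(json.loads(plugin_text))
    marketplace = _json_versions(json.loads(marketplace_text))
    return {
        "pyproject.toml": pyproject.group(1) if pyproject else None,
        ".claude-plugin/plugin.json": plugin[0] if plugin else None,
        ".claude-plugin/marketplace.json": marketplace[0] if marketplace else None,
        "src/claude_kit/__init__.py": init.group(1) if init else None,
    }


def is_consistent(versions):
    """True when every file declares a version and all four agree."""
    values = list(versions.values())
    return None not in values and len(set(values)) == 1


versions = collect_versions(
    '[project]\nname = "claude-code-kit"\nversion = "0.13.0"\n',
    '{"name": "claude-kit", "version": "0.13.0"}',
    '{"plugins": [{"name": "claude-kit", "version": "0.13.0"}]}',  # key name assumed
    '__version__ = "0.13.0"\n',
)
print(is_consistent(versions))
```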

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: claude-code-kit
-Version: 0.12.0
+Version: 0.13.0
 Summary: Cookiecutter-style scaffolder for an autonomous Claude Code SDLC configuration (no app code, no Docker). Asks ordered questions and installs CLAUDE.md + .claude/ (rules, the chosen profile's agents/skills, hooks, artifact templates) + optional .mcp.json; run /sdlc to drive spec → review → build → test → security → ship through profile-aware quality gates, working memory, and a self-improving learnings loop.
 Project-URL: Homepage, https://github.com/ajyadav013/claude-kit
 Project-URL: Repository, https://github.com/ajyadav013/claude-kit
@@ -39,9 +39,9 @@ with a quality gate between every phase. **No application code. No Docker. Confi
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 [![Built for Claude Code](https://img.shields.io/badge/built%20for-Claude%20Code-d97757.svg)](https://www.claude.com/product/claude-code)
 [![CI](https://github.com/ajyadav013/claude-kit/actions/workflows/ci.yml/badge.svg)](https://github.com/ajyadav013/claude-kit/actions/workflows/ci.yml)
-[![Changelog](https://img.shields.io/badge/changelog-v0.10.0-blue.svg)](CHANGELOG.md)
+[![Changelog](https://img.shields.io/badge/changelog-md-blue.svg)](CHANGELOG.md)
-🚀 [Quick start](#quick-start) · 🧭 [How it works](#how-it-works) · 🔁 [The pipeline](#the-pipeline) · 🌱 [What we adopted](#influences--what-we-adopted) · 🤖 [Agents](#the-agents) · 🧩 [Catalog](#catalog--extensibility) · 🛠️ [CLI](#cli-reference) · 📖 [Agent guide](docs/agents.md)
+🚀 [Quick start](#quick-start) · 🧭 [How it works](#how-it-works) · 🔁 [The pipeline](#the-pipeline) · 🧪 [Example](examples/) · 🌱 [What we adopted](#influences--what-we-adopted) · 🤖 [Agents](#the-agents) · 🧩 [Catalog](#catalog--extensibility) · 🛠️ [CLI](#cli-reference) · 📖 [Agent guide](docs/agents.md)
 </div>
@@ -59,7 +59,7 @@ refuses to advance a phase until its **quality gate** passes. You drive it all w
 **At a glance:**
 - 🧱 **Stack-agnostic** — the pipeline assumes no language or framework. Pick a stack at `init` and it
-  installs matching overlay rules (React · FastAPI · PostgreSQL · MongoDB) and your exact
+  installs matching overlay rules (React · FastAPI · Go/net-http · PostgreSQL · MongoDB) and your exact
   lint/test/build commands. It never writes your app code and never needs Docker.
 - 🎚️ **Dial the rigor with profiles** — `lean ⊊ standard ⊊ enterprise` decide how many agents, skills,
   hooks, and gates are active, from "fast track" to "full audit".
@@ -109,9 +109,9 @@ Then, inside any project you want the pipeline to manage:
 A CLI (`claude-kit`, aliases `ckit` / `claude-sdlc`) that scaffolds the same config into any repo:
 ```bash
-# Until the first PyPI release, install straight from the repo:
-pip install "git+https://github.com/ajyadav013/claude-kit.git"
-# Once published to PyPI this becomes:  pip install claude-code-kit
+pip install claude-code-kit
+# or, for the bleeding edge straight from the repo:
+#   pip install "git+https://github.com/ajyadav013/claude-kit.git"
 claude-kit init                 # interactive: prompts for stack, profile, MCP
 claude-kit init --defaults      # non-interactive: React + Python/FastAPI + Postgres + standard
@@ -223,10 +223,16 @@ flowchart TD
 | Profile | Gates that run |
 |---|---|
 | **lean** | code-review · build-green |
-| **standard** | spec-complete · em-approved · code-review · build-green · test-coverage · security-clear |
+| **standard** | spec-complete · em-approved · code-review · build-green · test-coverage · security-clear · contract-clear\* |
 | **enterprise** | standard + pipeline-green · observability-ready · acceptance |
-A **fast-track** mode collapses small changes (< 5 files) to Developer → Code Reviewer → Tester → PR.
+\* `contract-clear` (API breaking-change diff) self-skips when the stack exposes no API surface, so it
+is inert for non-API projects. Organization scope at `regulated` strictness adds `accessibility-clear`
+(WCAG-AA on changed UI). A **fast-track** mode collapses small changes (< 5 files) to Developer →
+Code Reviewer → Tester → PR.
+See [`examples/`](examples/) for a synthetic end-to-end walkthrough — request → spec → story breakdown
+→ gate verdicts (with one defect-loop cycle) → sample PR diff.
 ---
@@ -245,6 +251,7 @@ non-duplicative gaps**, minimally and catalog-wired.
 | **[GitHub spec-kit](https://github.com/github/spec-kit)** | Spec → tasks → **analyze** coverage gate; tasks → tracker issues; stable requirement IDs + assumptions in specs | Wired the (previously orphaned) `story-planner` as the **coverage gate (1f)**, a tracker-agnostic `task-tracker-sync` skill, and enriched the feature-spec template | `0.9.0` |
 | **[protectai/llm-guard](https://github.com/protectai/llm-guard)** | Input→model→output guardrails for LLM features — prompt injection, PII vault, treating model output as untrusted | **Opt-in** "LLM / AI Feature Security" guidance in `security-and-hardening` + the advisory `warn-llm-io` hook (warns, **never blocks**) | `0.10.0` |
 | **Improvement brief** (external self-review) | API backward-compat as a gate; load-against-SLO as a release criterion; supply-chain maintenance cadence; pipeline resumability, clean abort, and worktree lifecycle; pipeline cost/concurrency/cross-platform transparency | The enterprise **`contract-clear`** gate (owned by `merge-reviewer`) + `api-change-report` template; a load-vs-SLO criterion in Observability Ready; dependency **Cadence Mode**; `/sdlc` resume-vs-restart, `/claude-kit:abort`, worktree teardown; cost/concurrency/Windows notes — **9 surgical extensions, 0 new agents/skills/rules** | `0.12.0` |
+| **Improvement brief #2** (external self-review) | The covered-vs-**gated** distinction (a skill ≠ a gate); enforce API breaking-changes by default; expand/contract migration safety; back the stack-agnostic claim with a compiled backend; WCAG as a regulated gate; reconcile the PyPI story; ship a worked example + a self-test matrix | [`docs/coverage-audit.md`](docs/coverage-audit.md); **`contract-clear` promoted to `standard`**; a live **Go/net-http** backend; the **`accessibility-clear`** regulated gate; explicit migration-drop rules; a synthetic [`examples/`](examples/) run; an eval-harness template; a profile×stack×scope self-test matrix — **2 gates wired + 1 stack, 0 new agents/skills/rules** | `0.13.0` |
 > Each adoption is detailed in the [CHANGELOG](CHANGELOG.md) — including, for every review, what we
 > deliberately **did not** add because the kit already covered it.
@@ -386,8 +393,8 @@ change.
 <br>
 - **`catalog/stacks.yaml`** — frontend frameworks, backend languages → frameworks, and databases.
-  Live today: React · Python/FastAPI · PostgreSQL/MongoDB. Vue/Svelte/Django/Express are listed as
-  `planned` (offered by `list-options`, not yet selectable).
+  Live today: React · Python/FastAPI · **Go/net-http** · PostgreSQL/MongoDB. Vue/Svelte/Django/Express
+  are listed as `planned` (offered by `list-options`, not yet selectable).
 - **`catalog/profiles.yaml`** — what each profile activates (`inherit:` composes; `all` = everything).
 - **`catalog/mcp.yaml`** — ready `.mcp.json` fragments per server, with `${ENV}` placeholders.
 - **`catalog/org.yaml`** — the **organization layer**: scopes, teams, the autonomy model, review
@@ -454,7 +461,7 @@ hints.
 | Guard / quality hooks seem to do nothing | `jq` isn't installed (the hooks parse tool input with it) | Install `jq`; without it the hooks degrade to no-ops by design |
 | Hooks do nothing on **Windows** | No POSIX shell — `.sh` hooks can't run under `cmd`/PowerShell | Run claude-kit inside **WSL or Git Bash** (with `jq`); `claude-kit doctor` confirms. Config + CLI work natively regardless |
 | A selected MCP server won't start | `node` / `npx` missing (most MCP servers launch via `npx`) | Install Node.js, or remove the server from `.mcp.json` |
-| `pip install claude-code-kit` fails | Not yet published to PyPI | Use `pip install "git+https://github.com/ajyadav013/claude-kit.git"` |
+| `pip install claude-code-kit` fails | Outdated `pip`, or you want an unreleased change | Upgrade pip (`pip install -U pip`); for unreleased changes use `pip install "git+https://github.com/ajyadav013/claude-kit.git"` |
 | `validate` reports missing files | Partial or outdated install | Re-run `claude-kit init` (choose **merge**), or `claude-kit upgrade` |
 </details>

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/README.md RENAMED

@@ -12,9 +12,9 @@ with a quality gate between every phase. **No application code. No Docker. Confi
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 [![Built for Claude Code](https://img.shields.io/badge/built%20for-Claude%20Code-d97757.svg)](https://www.claude.com/product/claude-code)
 [![CI](https://github.com/ajyadav013/claude-kit/actions/workflows/ci.yml/badge.svg)](https://github.com/ajyadav013/claude-kit/actions/workflows/ci.yml)
-[![Changelog](https://img.shields.io/badge/changelog-v0.10.0-blue.svg)](CHANGELOG.md)
+[![Changelog](https://img.shields.io/badge/changelog-md-blue.svg)](CHANGELOG.md)
-🚀 [Quick start](#quick-start) · 🧭 [How it works](#how-it-works) · 🔁 [The pipeline](#the-pipeline) · 🌱 [What we adopted](#influences--what-we-adopted) · 🤖 [Agents](#the-agents) · 🧩 [Catalog](#catalog--extensibility) · 🛠️ [CLI](#cli-reference) · 📖 [Agent guide](docs/agents.md)
+🚀 [Quick start](#quick-start) · 🧭 [How it works](#how-it-works) · 🔁 [The pipeline](#the-pipeline) · 🧪 [Example](examples/) · 🌱 [What we adopted](#influences--what-we-adopted) · 🤖 [Agents](#the-agents) · 🧩 [Catalog](#catalog--extensibility) · 🛠️ [CLI](#cli-reference) · 📖 [Agent guide](docs/agents.md)
 </div>
@@ -32,7 +32,7 @@ refuses to advance a phase until its **quality gate** passes. You drive it all w
 **At a glance:**
 - 🧱 **Stack-agnostic** — the pipeline assumes no language or framework. Pick a stack at `init` and it
-  installs matching overlay rules (React · FastAPI · PostgreSQL · MongoDB) and your exact
+  installs matching overlay rules (React · FastAPI · Go/net-http · PostgreSQL · MongoDB) and your exact
   lint/test/build commands. It never writes your app code and never needs Docker.
 - 🎚️ **Dial the rigor with profiles** — `lean ⊊ standard ⊊ enterprise` decide how many agents, skills,
   hooks, and gates are active, from "fast track" to "full audit".
@@ -82,9 +82,9 @@ Then, inside any project you want the pipeline to manage:
 A CLI (`claude-kit`, aliases `ckit` / `claude-sdlc`) that scaffolds the same config into any repo:
 ```bash
-# Until the first PyPI release, install straight from the repo:
-pip install "git+https://github.com/ajyadav013/claude-kit.git"
-# Once published to PyPI this becomes:  pip install claude-code-kit
+pip install claude-code-kit
+# or, for the bleeding edge straight from the repo:
+#   pip install "git+https://github.com/ajyadav013/claude-kit.git"
 claude-kit init                 # interactive: prompts for stack, profile, MCP
 claude-kit init --defaults      # non-interactive: React + Python/FastAPI + Postgres + standard
@@ -196,10 +196,16 @@ flowchart TD
 | Profile | Gates that run |
 |---|---|
 | **lean** | code-review · build-green |
-| **standard** | spec-complete · em-approved · code-review · build-green · test-coverage · security-clear |
+| **standard** | spec-complete · em-approved · code-review · build-green · test-coverage · security-clear · contract-clear\* |
 | **enterprise** | standard + pipeline-green · observability-ready · acceptance |
-A **fast-track** mode collapses small changes (< 5 files) to Developer → Code Reviewer → Tester → PR.
+\* `contract-clear` (API breaking-change diff) self-skips when the stack exposes no API surface, so it
+is inert for non-API projects. Organization scope at `regulated` strictness adds `accessibility-clear`
+(WCAG-AA on changed UI). A **fast-track** mode collapses small changes (< 5 files) to Developer →
+Code Reviewer → Tester → PR.
+See [`examples/`](examples/) for a synthetic end-to-end walkthrough — request → spec → story breakdown
+→ gate verdicts (with one defect-loop cycle) → sample PR diff.
 ---
@@ -218,6 +224,7 @@ non-duplicative gaps**, minimally and catalog-wired.
 | **[GitHub spec-kit](https://github.com/github/spec-kit)** | Spec → tasks → **analyze** coverage gate; tasks → tracker issues; stable requirement IDs + assumptions in specs | Wired the (previously orphaned) `story-planner` as the **coverage gate (1f)**, a tracker-agnostic `task-tracker-sync` skill, and enriched the feature-spec template | `0.9.0` |
 | **[protectai/llm-guard](https://github.com/protectai/llm-guard)** | Input→model→output guardrails for LLM features — prompt injection, PII vault, treating model output as untrusted | **Opt-in** "LLM / AI Feature Security" guidance in `security-and-hardening` + the advisory `warn-llm-io` hook (warns, **never blocks**) | `0.10.0` |
 | **Improvement brief** (external self-review) | API backward-compat as a gate; load-against-SLO as a release criterion; supply-chain maintenance cadence; pipeline resumability, clean abort, and worktree lifecycle; pipeline cost/concurrency/cross-platform transparency | The enterprise **`contract-clear`** gate (owned by `merge-reviewer`) + `api-change-report` template; a load-vs-SLO criterion in Observability Ready; dependency **Cadence Mode**; `/sdlc` resume-vs-restart, `/claude-kit:abort`, worktree teardown; cost/concurrency/Windows notes — **9 surgical extensions, 0 new agents/skills/rules** | `0.12.0` |
+| **Improvement brief #2** (external self-review) | The covered-vs-**gated** distinction (a skill ≠ a gate); enforce API breaking-changes by default; expand/contract migration safety; back the stack-agnostic claim with a compiled backend; WCAG as a regulated gate; reconcile the PyPI story; ship a worked example + a self-test matrix | [`docs/coverage-audit.md`](docs/coverage-audit.md); **`contract-clear` promoted to `standard`**; a live **Go/net-http** backend; the **`accessibility-clear`** regulated gate; explicit migration-drop rules; a synthetic [`examples/`](examples/) run; an eval-harness template; a profile×stack×scope self-test matrix — **2 gates wired + 1 stack, 0 new agents/skills/rules** | `0.13.0` |
 > Each adoption is detailed in the [CHANGELOG](CHANGELOG.md) — including, for every review, what we
 > deliberately **did not** add because the kit already covered it.
@@ -359,8 +366,8 @@ change.
 <br>
 - **`catalog/stacks.yaml`** — frontend frameworks, backend languages → frameworks, and databases.
-  Live today: React · Python/FastAPI · PostgreSQL/MongoDB. Vue/Svelte/Django/Express are listed as
-  `planned` (offered by `list-options`, not yet selectable).
+  Live today: React · Python/FastAPI · **Go/net-http** · PostgreSQL/MongoDB. Vue/Svelte/Django/Express
+  are listed as `planned` (offered by `list-options`, not yet selectable).
 - **`catalog/profiles.yaml`** — what each profile activates (`inherit:` composes; `all` = everything).
 - **`catalog/mcp.yaml`** — ready `.mcp.json` fragments per server, with `${ENV}` placeholders.
 - **`catalog/org.yaml`** — the **organization layer**: scopes, teams, the autonomy model, review
@@ -427,7 +434,7 @@ hints.
 | Guard / quality hooks seem to do nothing | `jq` isn't installed (the hooks parse tool input with it) | Install `jq`; without it the hooks degrade to no-ops by design |
 | Hooks do nothing on **Windows** | No POSIX shell — `.sh` hooks can't run under `cmd`/PowerShell | Run claude-kit inside **WSL or Git Bash** (with `jq`); `claude-kit doctor` confirms. Config + CLI work natively regardless |
 | A selected MCP server won't start | `node` / `npx` missing (most MCP servers launch via `npx`) | Install Node.js, or remove the server from `.mcp.json` |
-| `pip install claude-code-kit` fails | Not yet published to PyPI | Use `pip install "git+https://github.com/ajyadav013/claude-kit.git"` |
+| `pip install claude-code-kit` fails | Outdated `pip`, or you want an unreleased change | Upgrade pip (`pip install -U pip`); for unreleased changes use `pip install "git+https://github.com/ajyadav013/claude-kit.git"` |
 | `validate` reports missing files | Partial or outdated install | Re-run `claude-kit init` (choose **merge**), or `claude-kit upgrade` |
 </details>

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/agents/acceptance-reviewer.md RENAMED

@@ -53,6 +53,25 @@ Run the **RARV** cycle (`.claude/rules/rarv-cycle.md`) with a green Verify (you
 checks) before issuing the verdict, and update `.claude/CONTINUITY.md`. This gate is **Acceptance**
 in the enterprise profile and runs before the PR is handed to a human.
+## Join Point: Accessibility (accessibility-clear gate)
+> Active **only** under organization scope at **`regulated`** review strictness (where WCAG is
+> commonly a legal requirement). You own the **accessibility-clear** gate. **Degrade to a no-op**
+> (PASS, note "no UI surface") when the change touches no frontend/UI files — detect with `Bash`
+> (`git diff --name-only <base>` against the frontend stack dir / component globs); never block a
+> back-end-only or API-only change.
+When a UI surface is present:
+1. **Drive `.claude/skills/accessibility-review`** over the changed views/components (and the standards
+   in `.claude/rules/responsive-and-accessibility.md`) — keyboard operability, focus management,
+   semantics/ARIA, color contrast (WCAG AA), motion, and screen-reader labels.
+2. **Classify each finding** by `.claude/rules/quality-gates.md` §1. A WCAG-AA failure on a
+   legally-required surface is at least **High** (per `accessibility-review`'s risk note); a missing
+   label, focus trap, or sub-threshold contrast is **High/Medium**; cosmetic spacing is **Low**.
+3. **Verdict** — *accessibility-clear* PASSes only at zero Critical/High/Medium, consistent with every
+   other gate. Record findings in the acceptance report.
 ## Escalation
 Escalate to the human when acceptance criteria themselves are ambiguous or untestable, when the spec

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/agents/merge-reviewer.md RENAMED

@@ -178,8 +178,8 @@ Frontend code reviewed: ✓ | Build/tests: ✓
 > source is found — mirror the hooks' detect-then-skip pattern; never block a project that has no
 > contract.
-Owns the **contract-clear** gate (enterprise; or any profile an org opts into via `org.yaml`
-strictness). With `Bash`:
+Owns the **contract-clear** gate (runs in **standard and enterprise** — any profile that includes the
+`merge-reviewer` — whenever the selected stack exposes an API surface). With `Bash`:
 1. **Locate or generate the contract** — a committed `openapi.(json|yaml)` / GraphQL SDL, or generate it from the framework's typed routes.
 2. **Diff against the base branch** — `git show <base>:<contract-path>` vs the working copy.
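To make the "diff against the base branch" step concrete, here is a minimal sketch of one kind of breaking-change check on two parsed OpenAPI documents: operations that exist in the base contract but are missing from the working copy. The excerpt does not show which checks the `merge-reviewer` actually performs, so the function name and scope are assumptions, and removed operations are only one class of breaking change (changed parameters or response schemas are not covered).

```python
# Lowercase HTTP method keys that an OpenAPI path item may contain.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def removed_operations(base_spec, head_spec):
    """List (METHOD, path) pairs present in the base document but absent in head."""
    base_paths = base_spec.get("paths", {})
    head_paths = head_spec.get("paths", {})
    removed = []
    for path, item in base_paths.items():
        for method in item:
            # Skip non-operation keys such as "parameters" or "summary".
            if method in HTTP_METHODS and method not in head_paths.get(path, {}):
                removed.append((method.upper(), path))
    return sorted(removed)


base = {"paths": {"/users": {"get": {}, "post": {}}, "/health": {"get": {}}}}
head = {"paths": {"/users": {"get": {}}, "/status": {"get": {}}}}
print(removed_operations(base, head))
```

In this sample the working copy drops `POST /users` and `GET /health`, while the newly added `GET /status` is not reported because additions are backward-compatible.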

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/catalog/org.yaml RENAMED

@@ -77,7 +77,10 @@ strictness:
     regulated:
       label: "Regulated — compliance-grade gates"
       hooks: [validate-frontmatter, validate-settings]
-      extra_gates: [security-clear, acceptance]
+      # accessibility-clear (brief #2 P1-2): a WCAG gate owned by acceptance-reviewer, driving the
+      # accessibility-review skill. Regulated-strictness only (WCAG is often a legal requirement there);
+      # self-skips when the change touches no UI surface, so API/back-end-only work is unaffected.
+      extra_gates: [security-clear, acceptance, accessibility-clear]
 # --- core agents the org layer activates regardless of profile ---------------------------------------
 # These live in the core agents/ dir (installed via the normal agent path); listing them here unions

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/catalog/profiles.yaml RENAMED

@@ -82,7 +82,10 @@ profiles:
       - over-engineering-review
       - simplification-debt
       - task-tracker-sync
-    gates: [spec-complete, em-approved, code-review, build-green, test-coverage, security-clear]
+    # contract-clear self-skips when the stack exposes no API contract surface, so it is inert for
+    # non-API projects on standard while enforcing backward-compatibility for API-exposing backends
+    # (e.g. FastAPI). Promoted from enterprise-only per brief #2 P0-1 — see CHANGELOG 0.13.0.
+    gates: [spec-complete, em-approved, code-review, build-green, test-coverage, security-clear, contract-clear]
     hooks: [load-continuity, load-learnings, load-autonomy, skill-routing, learning-detection, guard-rm-rf, guard-push-main, guard-destructive-git, protect-secrets, guard-commit-secrets, warn-shared-modules, warn-llm-io, lint-fix, type-check]
   enterprise:

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/catalog/stacks.yaml RENAMED

@@ -3,7 +3,7 @@
 # Adding a frontend framework, backend language/framework, or database is a DATA change here
 # (plus a templates/stacks/<stack_dir>/ folder for its overlay rules/agents) — never a code change.
 # Entries marked `status: planned` are offered by `list-options` as "coming soon" but cannot be
-# selected yet (no overlay content shipped). React + Python/FastAPI + Postgres/Mongo are live.
+# selected yet (no overlay content shipped). React + Python/FastAPI + Go/net-http + Postgres/Mongo are live.
 #
 # Each live entry may declare:
 #   label          human name shown in prompts
@@ -78,6 +78,23 @@ backend:
           label: "Express"
           status: planned
           stack_dir: backend/node/express
+    go:
+      label: "Go"
+      default_framework: net-http
+      frameworks:
+        net-http:
+          label: "net/http (stdlib)"
+          overlay_rules: [go-patterns.md]
+          overlay_agents: []
+          skills: [api-and-interface-design, api-integration]
+          stack_dir: backend/go/net-http
+          commands:
+            install: "go mod download"
+            dev: "go run ./..."
+            test: "go test ./..."
+            lint: "go vet ./... && gofmt -l ."
+            format: "gofmt -w ."
+            build: "go build ./..."
 database:
   default: postgres

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/docs/agents.md RENAMED

@@ -54,10 +54,10 @@ Request ─▶ classify ─▶ Spec & Dev Docs ─▶ [Gate: EM approved]
 ```
 Which gates actually run depends on the profile: **lean** = code-review · build-green; **standard**
-adds spec/EM/coverage/security; **enterprise** adds contract-clear · pipeline-green ·
-observability-ready · acceptance (contract-clear self-skips on stacks with no API contract surface). A
-**fast-track** path (bug fixes / < 5 files) skips planning: Developer → Code Reviewer →
-Tester → PR.
+adds spec/EM/coverage/security · contract-clear; **enterprise** adds pipeline-green ·
+observability-ready · acceptance (contract-clear self-skips on stacks with no API contract surface, so
+it is inert for non-API projects). A **fast-track** path (bug fixes / < 5 files) skips planning:
+Developer → Code Reviewer → Tester → PR.
 Every gate uses the same severity model — a gate passes only with **zero Critical/High/Medium**
 findings open — and a *unanimous* PASS triggers the `devils-advocate` agent before the gate counts

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/docs/architecture.md RENAMED

@@ -100,7 +100,8 @@ flowchart TD
     FORK --> LANES
     LANES --> MR1{{"Gate: Merge Reviewer<br/>cross-lane consistency"}}
-    MR1 -->|"pass"| TEST["Testing (parallel): unit · e2e · integration<br/>then Senior Tester verification"]
+    MR1 -->|"pass"| CC{{"Gate: Contract clear<br/>standard+ · API stacks (self-skips otherwise)"}}
+    CC -->|"pass"| TEST["Testing (parallel): unit · e2e · integration<br/>then Senior Tester verification"]
     TEST --> TCG{{"Gate: Test coverage<br/>blind review + Devil's Advocate"}}
     TCG -->|"pass / CONFIRMED"| SEC{{"Gate: Security Clear<br/>security-reviewer + 4 sub-scanners"}}

claude_code_kit-0.13.0/docs/coverage-audit.md ADDED

@@ -0,0 +1,51 @@
+# Coverage audit — GATED vs RULE vs SKILL/DOC
+claude-kit's reuse-first reviews often defer adding something because a skill or rule "touches" the
+topic. But the three are **not** equivalent in enforcement strength:
+| Class | What it means | Enforced? |
+|-------|---------------|-----------|
+| **GATED** | A gate token in `catalog/profiles.yaml` / `catalog/org.yaml`, owned by an agent, blocking at ≥ Medium severity (`rules/quality-gates.md` §1) | **Yes** — blocks delivery |
+| **RULE** | An always-on file in `.claude/rules/` (installed in every profile) | Partly — an instruction the agents must follow; not a blocking checkpoint by itself |
+| **SKILL / DOC** | A profile-gated skill (advisory, invoked on demand) or repo documentation | **No** — guidance, runs only when invoked |
+This document is the **justification record** for what the kit enforces versus documents. Each P0/P1
+item in the improvement briefs cites a row here. It reflects the state **as of 0.13.0**.
+## The named capabilities (verified against the files)
+| Capability | Class | Evidence | Enforced where |
+|------------|-------|----------|----------------|
+| **Rollback (verified)** | **GATED — enterprise only**; RULE elsewhere | `pipeline-green` gate is listed **only** in the enterprise profile (`catalog/profiles.yaml`); owned by `devops-engineer`, which requires a *verified* rollback + runbook (`rules/devops-observability.md`, `agents/devops-engineer.md`). In lean/standard, rollback is **RULE-level** advice via `rules/risk-classification.md` (high-risk changes need rollback notes), not a gate. | enterprise (blocking); lean/standard (advisory) |
+| **Cost expectations** | **DOC — by design** | `rules/model-tiers.md` "Profile cost expectations" (added 0.12.0). A `cost-estimate` skill + per-run cost hook were **deliberately rejected** (CHANGELOG 0.12.0) — the kit cannot reliably meter tokens at scaffold time. | documented only (accepted) |
+| **Migration safety** | **RULE + OVERLAY-AGENT (advisory) + enterprise rollback** | Always-on RULE: `rules/risk-classification.md` (DB migrations = sensitive, ≥ High). Overlay RULES (when a DB is selected): `postgres-patterns.md` / `mongodb-patterns.md` now state expand/contract + "no destructive drop in the same release" with **severity** (0.13.0, brief #2 P0-2). Overlay AGENT: `migration-specialist` (postgres + mongodb) reviews each change (expand/contract, reversible down-path, idempotent backfill; irreversible/table-locking ≥ High). **No dedicated migration gate token** — it is reviewed, not gated, and the enterprise rollback verification (`pipeline-green`) is the nearest enforced backstop. | overlay-advisory + enterprise rollback |
+| **Accessibility** | **SKILL/DOC** in lean/standard/team; **GATED** at org `regulated` strictness (0.13.0) | RULE (standards): `rules/responsive-and-accessibility.md` (always-on, advisory). SKILL (review procedure): `skills/accessibility-review`. As of 0.13.0 there **is** a gate — `accessibility-clear` — but **only** under organization scope at `regulated` strictness (`catalog/org.yaml`), owned by `acceptance-reviewer`, self-skipping when no UI surface (brief #2 P1-2). Outside `regulated`, a11y is advisory and blocks nothing. | regulated-org (blocking); otherwise advisory |
+| **API breaking changes** | **GATED — standard+ (API stacks)** as of 0.13.0 | `contract-clear` gate, owned by `merge-reviewer`, now in the **standard** and enterprise profiles (`catalog/profiles.yaml`); self-skips when the stack exposes no API contract surface (brief #2 P0-1). The manual counterpart is `rules/mandatory-workflow.md` §2d. | standard+ (blocking, API stacks) |
+### The one "looks enforced but isn't" trap
+`skills/accessibility-review/SKILL.md` contains an internal **"Quality gates"** heading — that is the
+*skill's own checklist wording*, **not** a kit gate token. Before 0.13.0 nothing in
+`catalog/*.yaml` enforced accessibility in any profile, so that heading could be misread as an enforced
+gate. The 0.13.0 `accessibility-clear` gate (regulated-org only) is the *actual* enforcing token;
+elsewhere the skill remains advisory. Do not cite a skill's internal "gate" wording as evidence of
+enforcement — only a token in `catalog/{profiles,org}.yaml` enforces.
+## Why the posture is internally consistent
+- Gates come **only** from `prof["gates"] + org.extra_gates` (`src/claude_kit/catalog.py`); stacks
+  contribute overlay rules/agents/skills, never gates. So "gated" always traces to a profile or org
+  strictness level, and `resolve()` stays branch-free.
+- Heavyweight, situational gates default to **enterprise** or to **org strictness** (golden rule #6).
+  `contract-clear` is the deliberate exception promoted to `standard` (brief #2 P0-1) because
+  breaking-change detection is table-stakes for the headline FastAPI backend — and it self-skips for
+  non-API stacks, so it adds no burden where it doesn't apply.
+- Where a capability is *advisory by design* (cost, lean/standard rollback, non-regulated a11y), this
+  audit says so plainly rather than implying enforcement the kit doesn't provide.
+## How to extend enforcement (the lever)
+To move a capability from RULE/SKILL to GATED: add a gate token to a profile (`catalog/profiles.yaml`)
+or to an org strictness level (`catalog/org.yaml` `extra_gates`), give it an **owner agent**, a
+**self-skip** condition when irrelevant, a **severity mapping**, and a row in `rules/quality-gates.md`
+§4. That is exactly how `contract-clear` (standard+) and `accessibility-clear` (regulated) were wired.
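
The audit above says gates come only from the profile's `gates` list plus the org strictness level's `extra_gates`, never from stacks. A minimal Python sketch of that rule follows. It is an illustration under stated assumptions, not the code in `src/claude_kit/catalog.py`: the names `PROFILES`, `ORG_STRICTNESS`, and `resolve_gates` are hypothetical, the dictionaries are trimmed stand-ins for the two catalog files (only the `lean`, `standard`, and `regulated` entries quoted in this diff are reproduced), and the order-preserving de-duplication is an assumption about how overlapping tokens such as `security-clear` are handled.

```python
from typing import Dict, List, Optional

# Hypothetical, trimmed stand-ins for catalog/profiles.yaml and catalog/org.yaml.
PROFILES: Dict[str, Dict[str, List[str]]] = {
    "lean": {"gates": ["code-review", "build-green"]},
    "standard": {
        "gates": [
            "spec-complete", "em-approved", "code-review", "build-green",
            "test-coverage", "security-clear", "contract-clear",
        ]
    },
}
ORG_STRICTNESS: Dict[str, Dict[str, List[str]]] = {
    "regulated": {
        "extra_gates": ["security-clear", "acceptance", "accessibility-clear"]
    },
}


def resolve_gates(profile: str, strictness: Optional[str] = None) -> List[str]:
    """Return profile gates followed by org extra_gates, without duplicates.

    Stacks are intentionally not an input: per the audit, they contribute
    overlay rules/agents/skills but never gate tokens.
    """
    gates = list(PROFILES[profile]["gates"])
    extra: List[str] = []
    if strictness is not None:
        extra = ORG_STRICTNESS.get(strictness, {}).get("extra_gates", [])
    for token in extra:
        # Assumption: a token already present in the profile is not repeated.
        if token not in gates:
            gates.append(token)
    return gates
```

Under these assumptions, `resolve_gates("standard", "regulated")` yields the standard gates followed by `acceptance` and `accessibility-clear`, while `resolve_gates("standard")` contains no accessibility gate at all, which matches the "regulated-org (blocking); otherwise advisory" row in the table.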

claude_code_kit-0.13.0/docs/eval-harness.md ADDED

@@ -0,0 +1,58 @@
+# Eval harness — does the pipeline earn its cost?
+A repeatable method to measure what the claude-kit gate pipeline actually *catches* — and what it
+costs — by comparing the same tasks run **with** and **without** the pipeline. This is a **template you
+fill from your own runs**; the kit deliberately ships **no numbers here**, because an eval result is
+only meaningful for the model, tasks, and environment that produced it.
+> Method reference: `.claude/rules/evals.md` **§6 (Repeat and aggregate)** — run each case **N times**
+> (commonly 5–10) and report the **median**, not the mean, with N stated. This doc does not restate that
+> rule; it gives the with/without comparison structure on top of it.
+## Design
+Two arms over the **same** task set:
+- **Arm A — baseline:** the task done by a single agent, no `/sdlc`, no gates (one Developer pass).
+- **Arm B — pipeline:** the same task through `/sdlc` at a chosen profile (state which: `lean` /
+  `standard` / `enterprise`), with the gates active.
+Pick **5–10 representative tasks** with *objective* pass criteria (a hidden test that must pass, a
+known breaking change that must be flagged, a secret that must be blocked). Avoid tasks graded by
+taste. Run each task **N times per arm** (per `evals.md` §6) and report the **median**.
+Keep a third column for **what caught it**: when Arm B succeeds where Arm A fails, name the gate
+(`code-review`, `test-coverage`, `security-clear`, `contract-clear`, …) and the severity it assigned.
+That is the load-bearing evidence — it converts "the pipeline feels safer" into "gate X caught defect
+class Y, Z% of the time."
+## Results (fill from your own runs — do not ship fabricated numbers)
+> N = ___ runs per arm · model = ___ · profile (Arm B) = ___ · date = ___
+| Task | Objective pass criterion | Arm A median (no pipeline) | Arm B median (pipeline) | Gate that caught the gap | Notes |
+|------|--------------------------|----------------------------|-------------------------|--------------------------|-------|
+| T1 — _e.g._ add endpoint + tests | hidden test suite green | _fill_ | _fill_ | _e.g._ test-coverage (High) | |
+| T2 — introduce a breaking API change | change flagged + migration required | _fill_ | _fill_ | contract-clear (High) | |
+| T3 — paste a hardcoded secret | secret blocked pre-commit | _fill_ | _fill_ | guard-commit-secrets / security-clear | |
+| T4 — … | … | | | | |
+| T5 — … | … | | | | |
+### Cost
+| Arm | Median tokens / task | Median wall-clock / task |
+|-----|----------------------|--------------------------|
+| A (baseline) | _fill_ | _fill_ |
+| B (pipeline) | _fill_ | _fill_ |
+> The pipeline costs more per task by design (more agents, more gates). The question this harness
+> answers is whether the **defects caught** (and their severity) justify that delta **for your task mix**.
+## Honesty rules
+- **Never publish numbers you did not run.** A "90%" from one run and from twenty runs are not the same
+  claim (`evals.md` §6) — always report N.
+- An eval result is environment-specific; do not present one repo's table as a general claim about
+  claude-kit.
+- If a gate caught nothing across the suite, **say so** — that is a signal the gate may be miscalibrated
+  for your tasks, which is exactly what this harness is for.
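
The aggregation step of the harness above (N runs per arm, median reported, N always stated) could be scripted roughly as below. This is a sketch under assumptions, not part of the kit: the function name `summarize_arms` and the input shape (task name → arm name → list of per-run scores, e.g. `1` for pass and `0` for fail) are hypothetical. It deliberately contains no sample results, in keeping with the template's rule against shipping numbers that were not run.

```python
from statistics import median
from typing import Dict, List, Union

Number = Union[int, float]


def summarize_arms(
    runs: Dict[str, Dict[str, List[Number]]]
) -> Dict[str, Dict[str, Number]]:
    """Reduce per-run scores to one median per arm, keeping N alongside.

    `runs[task][arm]` is the list of scores from each run of that task in
    that arm. Every arm of a task must have the same, non-zero number of
    runs, so the reported medians are comparable.
    """
    summary: Dict[str, Dict[str, Number]] = {}
    for task, arms in runs.items():
        run_counts = {len(scores) for scores in arms.values()}
        if len(run_counts) != 1 or 0 in run_counts:
            raise ValueError(f"{task}: arms must share the same non-zero N")
        row: Dict[str, Number] = {
            arm: median(scores) for arm, scores in arms.items()
        }
        row["n"] = run_counts.pop()  # always report N with the median
        summary[task] = row
    return summary
```

Each returned row maps directly onto one line of the Results table: the per-arm medians fill the "Arm A median" and "Arm B median" columns, and `n` goes in the header line. The "Gate that caught the gap" column still has to be filled by reading the gate reports, since this sketch only aggregates scores.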

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/docs/org-capabilities.md RENAMED

@@ -47,7 +47,10 @@ How much Claude may do before a human acts. Set per repo; default **assisted**.
 | `enterprise-controlled` | work through strict gates + audit | edit sensitive files / finish without security + review | `warn-sensitive-files`, `warn-large-edits`, `warn-missing-tests`, `validate-frontmatter`, `validate-settings`, `audit-log`, `guard-push-main`, `guard-commit-secrets` |
 Review **strictness** (`light` / `standard` / `regulated`) is an independent axis; `regulated` adds the
-`validate-frontmatter` + `validate-settings` hooks and the `security-clear` + `acceptance` gates.
+`validate-frontmatter` + `validate-settings` hooks and the `security-clear` + `acceptance` +
+`accessibility-clear` gates. The `accessibility-clear` gate (owned by `acceptance-reviewer`, driving
+the `accessibility-review` skill) enforces WCAG-AA on changed UI and self-skips when the change has no
+UI surface — so it binds only when both `regulated` strictness **and** a frontend are in play.
 ## Risk classification

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "claude-code-kit"
-version = "0.12.0"
+version = "0.13.0"
 description = "Cookiecutter-style scaffolder for an autonomous Claude Code SDLC configuration (no app code, no Docker). Asks ordered questions and installs CLAUDE.md + .claude/ (rules, the chosen profile's agents/skills, hooks, artifact templates) + optional .mcp.json; run /sdlc to drive spec → review → build → test → security → ship through profile-aware quality gates, working memory, and a self-improving learnings loop."
 readme = "README.md"
 requires-python = ">=3.9"

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/rules/evals.md RENAMED

@@ -67,6 +67,10 @@ honest:
 > 3 models × 5 tasks at 10 runs each (median reported), with a line-count *measurement* that always
 > passes beside a *correctness gate* that spawns the runtime to actually execute the generated code.
 > A concrete instance of this section's two practices.
+>
+> To measure the **claude-kit pipeline itself** (the same tasks run with vs without the gates, and
+> which gate caught each defect), the claude-kit repo ships a fill-in template — `docs/eval-harness.md`
+> — that builds the with/without comparison on top of this section's median-of-N method.
 ## Rules

{claude_code_kit-0.12.0 → claude_code_kit-0.13.0}/rules/mandatory-workflow.md RENAMED

@@ -223,7 +223,7 @@ every consumer and verify it still works. Run the full test suite (not just your
 Review the diff for changes outside your scope.
 **Gate:** zero regressions verified across the codebase.
-> **Mechanical counterpart (enterprise, API-exposing stacks):** the `merge-reviewer` runs the
+> **Mechanical counterpart (standard+, API-exposing stacks):** the `merge-reviewer` runs the
 > **contract-clear** gate — a base-branch API-surface diff (`git show <base>:<schema>`) that classifies
 > each delta by severity and blocks backward-incompatible changes lacking an approved migration note +
 > version bump. It self-skips when no API contract surface exists. This §2d is the manual consumer
