npm - joycraft - Versions diffs - 0.5.3 → 0.5.4 - Mend

joycraft 0.5.3 → 0.5.4

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (14)

package/README.md +139 -67
package/dist/{chunk-HHW4Q2UC.js → chunk-5BZUWHEY.js} +251 -36
package/dist/chunk-5BZUWHEY.js.map +1 -0
package/dist/cli.js +12 -7
package/dist/cli.js.map +1 -1
package/dist/{init-DHVJEWGX.js → init-PPOY5PDB.js} +114 -66
package/dist/init-PPOY5PDB.js.map +1 -0
package/dist/{init-autofix-OVHXYVLB.js → init-autofix-F7PLQLVX.js} +11 -15
package/dist/{init-autofix-OVHXYVLB.js.map → init-autofix-F7PLQLVX.js.map} +1 -1
package/dist/{upgrade-RRG2ZRSO.js → upgrade-CZJ6A447.js} +2 -2
package/package.json +1 -1
package/dist/chunk-HHW4Q2UC.js.map +0 -1
package/dist/init-DHVJEWGX.js.map +0 -1
package/dist/{upgrade-RRG2ZRSO.js.map → upgrade-CZJ6A447.js.map} +0 -0
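
Renamed files in the list above use brace notation, `dir/{old → new}`, followed by `+N -M` line counts. For anyone scripting against this list, here is a minimal sketch of expanding one entry into its old and new paths. It assumes the `→` separator and the spacing appear exactly as shown above; `expand_entry` is a hypothetical helper, not part of joycraft or Mend.

```python
import re

# Brace rename notation as it appears in the file list, e.g.
# "package/dist/{chunk-A.js → chunk-B.js} +251 -36"
RENAME = re.compile(
    r"^(?P<prefix>[^{]*)\{(?P<old>.+?) → (?P<new>.+?)\}(?P<suffix>\S*)"
)


def expand_entry(line: str) -> tuple[str, str]:
    """Return (old_path, new_path) for one file-list entry.

    Entries without braces are not renames, so both paths are the same.
    """
    entry = line.strip()
    match = RENAME.match(entry)
    if match:
        old_path = match["prefix"] + match["old"] + match["suffix"]
        new_path = match["prefix"] + match["new"] + match["suffix"]
        return old_path, new_path
    # Plain entry: the path is everything before the "+N -M" counts.
    path = entry.split(" ")[0]
    return path, path
```

Applied to the second entry in the list, this yields `package/dist/chunk-HHW4Q2UC.js` as the old path and `package/dist/chunk-5BZUWHEY.js` as the new one.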

package/README.md CHANGED

@@ -1,14 +1,14 @@
 # Joycraft
 <p align="center">
-  <img src="docs/joycraft-banner.png" alt="Joycraft — the craft of AI development" width="700" />
+  <img src="docs/joycraft-banner.png" alt="Joycraft, the craft of AI development" width="700" />
 </p>
-> The craft of AI development — with joy, not darkness.
+> The craft of AI development. With joy, not darkness.
 ## What is Joycraft?
-Joycraft is a CLI tool and [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin that upgrades your AI development workflow. It installs skills, behavioral boundaries, templates, and documentation structure into any project — taking you from unstructured prompting to autonomous spec-driven development.
+Joycraft is a CLI tool and [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin that upgrades your AI development workflow. It installs skills, behavioral boundaries, templates, and documentation structure into any project, taking you from unstructured prompting to autonomous spec-driven development.
 If you've been using Claude Code (or any AI coding tool) and your workflow looks like this:
@@ -16,18 +16,18 @@ If you've been using Claude Code (or any AI coding tool) and your workflow looks
 ...then Joycraft is for you.
-This project started as a personal exploration by [@maksutovic](https://github.com/maksutovic). I was working across multiple client projects, spending more time wrestling with prompts than building software. I knew Claude Code was capable of extraordinary work, but my *process* was holding it back. I was vibe coding — and vibe coding doesn't scale.
+This project started as a personal exploration by [@maksutovic](https://github.com/maksutovic). I was working across multiple client projects, spending more time wrestling with prompts than building software. I knew Claude Code was capable of extraordinary work, but my *process* was holding it back. I was vibe coding - and vibe coding doesn't scale.
-The spark was [Nate B Jones' video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ). It mapped out a progression I hadn't seen articulated before — from "spicy autocomplete" to fully autonomous development — and lit my brain up to the potential of what Claude Code could do with the right harness around it. Joycraft is the result of that exploration: a tool that encodes the patterns, boundaries, and workflows that make AI-assisted development actually deterministic.
+The spark was [Nate B Jones' video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ). It mapped out a progression I hadn't seen articulated before - from "spicy autocomplete" to fully autonomous development - and lit my brain up to the potential of what Claude Code could do with the right harness around it. Joycraft is the result of that exploration: a tool that encodes the patterns, boundaries, and workflows that make AI-assisted development actually deterministic.
 ### The core idea
 Joycraft is simple. It's a set of **skills** (slash commands for Claude Code) and **instructions** (CLAUDE.md boundaries) that guide you and your agent through a structured development process:
 - **Levels 1-4:** Skills like `/joycraft-tune`, `/joycraft-new-feature`, and `/joycraft-interview` replace unstructured prompting with spec-driven development. You interview, you write specs, the agent executes. No back-and-forth.
-- **Level 5:** The `/joycraft-implement-level5` skill sets up the autonomous loop — where specs go in and validated software comes out, with holdout scenario testing that prevents the agent from gaming its own tests.
+- **Level 5:** The `/joycraft-implement-level5` skill sets up the autonomous loop where specs go in and validated software comes out, with holdout scenario testing that prevents the agent from gaming its own tests.
-StrongDM calls their Level 5 fully autonomous loop a "Dark Factory" — which, albeit a cool name, the world has so much darkness in it right now. I wanted a name that extolled more of what I believe tools like this can provide: joy and craftsmanship. Hence "Joycraft."
+StrongDM calls their Level 5 fully autonomous loop a "Dark Factory" - which, albeit a cool name, the world has so much darkness in it right now. I wanted a name that extolled more of what I believe tools like this can provide: joy and craftsmanship. Hence "Joycraft."
 ### What are the levels?
@@ -35,7 +35,7 @@ StrongDM calls their Level 5 fully autonomous loop a "Dark Factory" — which, a
 | Level | Name | What it looks like | Joycraft's role |
 |-------|------|--------------------|-----------------|
-| 1 | Autocomplete | Tab-complete suggestions | — |
+| 1 | Autocomplete | Tab-complete suggestions | - |
 | 2 | Junior Developer | Prompt → iterate → fix → repeat | `/joycraft-tune` assesses where you are |
 | 3 | Developer as Manager | Your life is reviewing diffs | Behavioral boundaries in CLAUDE.md |
 | 4 | Developer as PM | You write specs, agent writes code | `/joycraft-new-feature` + `/joycraft-decompose` |
@@ -45,7 +45,7 @@ Most developers plateau at Level 2. Joycraft's job is to move you up.
 ### Platform support
-Joycraft is currently focused on making the Claude Code experience state-of-the-art. Better [Codex](https://openai.com/codex) support is coming — `AGENTS.md` generation is already included, and deeper integration is on the roadmap.
+Joycraft is currently focused on making the Claude Code experience state-of-the-art. Better [Codex](https://openai.com/codex) support is coming. `AGENTS.md` generation is already included, and deeper integration is on the roadmap.
 ## Quick Start
@@ -67,14 +67,16 @@ Joycraft auto-detects your tech stack and creates:
 - **CLAUDE.md** with behavioral boundaries (Always / Ask First / Never) and correct build/test/lint commands
 - **AGENTS.md** for Codex compatibility
 - **Claude Code skills** installed to `.claude/skills/`:
-  - `/joycraft-tune` — Assess your harness, apply upgrades, see your path to Level 5
-  - `/joycraft-new-feature` — Interview → Feature Brief → Atomic Specs
-  - `/joycraft-interview` — Lightweight brainstorm — yap about ideas, get a structured summary
-  - `/joycraft-decompose` — Break a brief into small, testable specs
-  - `/joycraft-session-end` — Capture discoveries, verify, commit
-  - `/joycraft-implement-level5` — Set up Level 5: autofix loop, holdout scenarios, scenario evolution
-- **docs/** structure — `briefs/`, `specs/`, `discoveries/`, `contracts/`, `decisions/`
-- **Templates** — Atomic spec, feature brief, implementation plan, boundary framework, and workflow templates for scenario generation and autofix loops
+  - `/joycraft-tune` Assess your harness, apply upgrades, see your path to Level 5
+  - `/joycraft-new-feature` Interview → Feature Brief → Atomic Specs
+  - `/joycraft-interview` Lightweight brainstorm. Yap about ideas, get a structured summary
+  - `/joycraft-decompose` Break a brief into small, testable specs
+  - `/joycraft-add-fact` Capture project knowledge on the fly -- routes to the right context doc
+  - `/joycraft-session-end` Capture discoveries, verify, commit, push
+  - `/joycraft-implement-level5` Set up Level 5 (autofix loop, holdout scenarios, scenario evolution)
+- **docs/** structure: `briefs/`, `specs/`, `discoveries/`, `contracts/`, `decisions/`, `context/`
+- **Context documents** in `docs/context/`: production map, dangerous assumptions, decision log, institutional knowledge, and troubleshooting guide
+- **Templates** including atomic spec, feature brief, implementation plan, boundary framework, and workflow templates for scenario generation and autofix loops
 Once you reach Level 4, you can set up the autonomous loop with `/joycraft-implement-level5`. See [Level 5: The Autonomous Loop](#level-5-the-autonomous-loop) below.
@@ -90,11 +92,12 @@ After init, open Claude Code and use the installed skills:
 ```
 /joycraft-tune                  # Assess your harness, apply upgrades, see path to Level 5
-/joycraft-interview             # Brainstorm freely — yap about ideas, get a structured summary
+/joycraft-interview             # Brainstorm freely, yap about ideas, get a structured summary
 /joycraft-new-feature           # Interview → Feature Brief → Atomic Specs → ready to execute
 /joycraft-decompose             # Break any feature into small, independent specs
-/joycraft-session-end           # Wrap up — discoveries, verification, commit
-/joycraft-implement-level5     # Set up Level 5 — autofix, holdout scenarios, evolution
+/joycraft-add-fact              # Capture a fact mid-session -- auto-routes to the right context doc
+/joycraft-session-end           # Wrap up: discoveries, verification, commit, push
+/joycraft-implement-level5     # Set up Level 5 (autofix, holdout scenarios, evolution)
 ```
 The core loop:
@@ -113,13 +116,13 @@ Joycraft flips this. Before the agent writes a single line of code, you have a c
 ### Two interview modes
-**`/joycraft-interview`** — The lightweight brainstorm. You yap about an idea, the agent asks clarifying questions, and you get a structured summary saved to `docs/briefs/`. Good for early-stage thinking when you're not ready to commit to building anything yet. No pressure, no specs — just organized thought.
+**`/joycraft-interview`** is the lightweight brainstorm. You yap about an idea, the agent asks clarifying questions, and you get a structured summary saved to `docs/briefs/`. Good for early-stage thinking when you're not ready to commit to building anything yet. No pressure, no specs, just organized thought.
-**`/joycraft-new-feature`** — The full workflow. This is the structured interview that produces a **Feature Brief** (the what and why) and then decomposes it into **Atomic Specs** (small, testable, independently executable units of work). Each spec is self-contained — an agent in a fresh session can pick it up and execute without reading anything else.
+**`/joycraft-new-feature`** is the full workflow. This is the structured interview that produces a **Feature Brief** (the what and why) and then decomposes it into **Atomic Specs** (small, testable, independently executable units of work). Each spec is self-contained. An agent in a fresh session can pick it up and execute without reading anything else.
 ### Why this works
-The insight comes from [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic): interview in one session, write the spec, then execute in a *fresh session* with clean context. The interview captures your intent. The spec is the contract. The execution session has only the spec — no baggage from the conversation, no accumulated misunderstandings, no context window full of abandoned approaches.
+The insight comes from [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic): interview in one session, write the spec, then execute in a *fresh session* with clean context. The interview captures your intent. The spec is the contract. The execution session has only the spec. No baggage from the conversation, no accumulated misunderstandings, no context window full of abandoned approaches.
 This is what separates Level 2 (back-and-forth prompting) from Level 4 (spec-driven development). You stop being a typist correcting an agent's guesses and start being a PM defining what needs to be built.
@@ -143,13 +146,13 @@ flowchart LR
 An atomic spec produced by `/joycraft-decompose` has:
-- **What** — One paragraph. A developer with zero context understands the change in 15 seconds.
-- **Why** — One sentence. What breaks or is missing without this?
-- **Acceptance criteria** — Checkboxes. Testable. No ambiguity.
-- **Affected files** — Exact paths, what changes in each.
-- **Edge cases** — Table of scenarios and expected behavior.
+- **What:** One paragraph. A developer with zero context understands the change in 15 seconds.
+- **Why:** One sentence. What breaks or is missing without this?
+- **Acceptance criteria:** Checkboxes. Testable. No ambiguity.
+- **Affected files:** Exact paths, what changes in each.
+- **Edge cases:** Table of scenarios and expected behavior.
-The agent doesn't guess. It reads the spec and executes. If something's unclear, the spec is wrong — fix the spec, not the conversation.
+The agent doesn't guess. It reads the spec and executes. If something's unclear, the spec is wrong. Fix the spec, not the conversation.
 ## Upgrade
@@ -165,7 +168,7 @@ Joycraft tracks what it installed vs. what you've customized. Unmodified files u
 ## Level 5: The Autonomous Loop
-> **A note on complexity:** Setting up Level 5 does have some moving parts and, depending on the complexity of your stack (software vs. hardware, monorepo vs. single app, etc.), this will require a good amount of prompting and trial-and-error to get right. I've done my best to make this as painless as possible, but just note — this is not a one-shot-prompt-done-in-5-minutes kind of thing. For small projects and simple stacks it will be easy, but any level of complexity is going to take some iteration, so plan ahead. Full step-by-step guides along with a video coming soon.
+> **A note on complexity:** Setting up Level 5 does have some moving parts and, depending on the complexity of your stack (software vs. hardware, monorepo vs. single app, etc.), this will require a good amount of prompting and trial-and-error to get right. I've done my best to make this as painless as possible, but just note - this is not a one-shot-prompt-done-in-5-minutes kind of thing. For small projects and simple stacks it will be easy, but any level of complexity is going to take some iteration, so plan ahead. Full step-by-step guides along with a video coming soon.
 Level 5 is where specs go in and validated software comes out. Joycraft implements this as four interlocking GitHub Actions workflows, a separate scenarios repository, and two independent AI agents that can never see each other's work.
@@ -177,7 +180,7 @@ npx joycraft init-autofix --scenarios-repo my-project-scenarios --app-id 3180156
 ### Architecture Overview
-Level 5 has four moving parts. Each is a GitHub Actions workflow that communicates via `repository_dispatch` events — no custom servers, no webhooks, no external services.
+Level 5 has four moving parts. Each is a GitHub Actions workflow that communicates via `repository_dispatch` events. No custom servers, no webhooks, no external services.
 ```mermaid
 graph TB
@@ -241,10 +244,10 @@ sequenceDiagram
 ```
 **Key details:**
-- Uses a GitHub App identity for pushes — avoids GitHub's anti-recursion protection
-- Concurrency group per PR — only one autofix runs at a time per PR
-- Max 3 iterations — posts "human review needed" if it can't fix it
-- No `--model` flag — Claude CLI handles model selection
+- Uses a GitHub App identity for pushes to avoid GitHub's anti-recursion protection
+- Concurrency group per PR so only one autofix runs at a time
+- Max 3 iterations, then posts "human review needed"
+- No `--model` flag. Claude CLI handles model selection.
 - Strips ANSI escape codes from logs so Claude gets clean text
 #### 2. Scenarios Dispatch Workflow (`scenarios-dispatch.yml`)
@@ -281,7 +284,7 @@ sequenceDiagram
         SPD->>SR: repository_dispatch: spec-pushed<br/>payload: {spec_filename, spec_content, commit_sha, branch, repo}
     end
-    Note over SPD: Deleted specs are ignored —<br/>existing scenario tests remain
+    Note over SPD: Deleted specs are ignored -<br/>existing scenario tests remain
 ```
 #### 4. Scenarios Re-run Workflow (`scenarios-rerun.yml`)
@@ -306,7 +309,7 @@ sequenceDiagram
     end
 ```
-**Why this exists:** There's a race condition. The implementation agent might open a PR before the scenario agent finishes writing new tests. The re-run workflow handles this — when new tests land, all open PRs get re-tested. Worst case: a PR merges before the re-run, and the new tests protect the very next PR. You're never more than one cycle behind.
+**Why this exists:** There's a race condition. The implementation agent might open a PR before the scenario agent finishes writing new tests. The re-run workflow handles this by re-testing all open PRs when new tests land. Worst case, a PR merges before the re-run, and the new tests protect the very next PR. You're never more than one cycle behind.
 ### The Holdout Wall
@@ -336,7 +339,7 @@ graph LR
     style Specs fill:#cfc,stroke:#393
 ```
-This is the same principle as a holdout set in machine learning. If the implementation agent could see the scenario tests, it would optimize to pass them specifically — not to build correct software. By keeping the wall intact, scenario tests catch real behavioral regressions, not test-gaming.
+This is the same principle as a holdout set in machine learning. If the implementation agent could see the scenario tests, it would optimize to pass them specifically instead of building correct software. By keeping the wall intact, scenario tests catch real behavioral regressions, not test-gaming.
 ### Scenario Evolution
@@ -348,7 +351,7 @@ flowchart TD
     B --> C[Scenario Agent reads spec]
     C --> D{Triage: is this user-facing?}
-    D -->|Internal refactor, CI, dev tooling| E[Skip — commit note: 'No scenario changes needed']
+    D -->|Internal refactor, CI, dev tooling| E[Skip - commit note: 'No scenario changes needed']
     D -->|New user-facing behavior| F[Write new scenario test file]
     D -->|Modified existing behavior| G[Update existing scenario tests]
@@ -433,11 +436,80 @@ sequenceDiagram
 | Scenarios repo | `package.json` | Minimal vitest setup |
 | Scenarios repo | `README.md` | Explains holdout pattern to contributors |
-### Prerequisites
+### Setup Guide
-- **GitHub App** — Provides a separate identity for autofix pushes (avoids GitHub's anti-recursion protection). You can install the shared [Joycraft Autofix](https://github.com/apps/joycraft-autofix) app (App ID: `3180156`) or create your own.
-- **Secrets** — `JOYCRAFT_APP_PRIVATE_KEY` and `ANTHROPIC_API_KEY` on both the main and scenarios repos.
-- **Scenarios repo** — A private repository where holdout tests live. Created during setup.
+The fastest way: run `/joycraft-implement-level5` in Claude Code and it walks you through everything interactively. Or follow these steps manually:
+#### Step 1: Create a GitHub App
+The autofix workflow needs a GitHub App identity to push commits. GitHub blocks workflows from triggering other workflows with the default `GITHUB_TOKEN` -- a separate App identity solves this. Creating one takes about 2 minutes:
+1. Go to https://github.com/settings/apps/new
+2. Give it a name (e.g., "My Project Autofix")
+3. Uncheck "Webhook > Active" (not needed)
+4. Under **Repository permissions**, set:
+   - **Contents**: Read & Write
+   - **Pull requests**: Read & Write
+   - **Actions**: Read & Write
+5. Click **Create GitHub App**
+6. Note the **App ID** from the settings page (you'll need it in Step 2)
+7. Scroll to **Private keys** > click **Generate a private key**
+8. Save the downloaded `.pem` file -- you'll need it in Step 3
+9. Click **Install App** in the left sidebar > install it on the repo(s) you want to use
+> **Coming soon:** We're working on a shared Joycraft Autofix app that will reduce this to a single click. For now, creating your own app gives you full control and takes just a couple minutes.
+#### Step 2: Run the CLI
+```bash
+npx joycraft init-autofix --scenarios-repo my-project-scenarios --app-id YOUR_APP_ID
+```
+Replace `YOUR_APP_ID` with the App ID from Step 1. This installs the four workflow files in your main repo and copies scenario templates to `docs/templates/scenarios/`.
+#### Step 3: Add secrets to your main repo
+Go to your repo's **Settings > Secrets and variables > Actions** and add:
+| Secret | Value |
+|--------|-------|
+| `JOYCRAFT_APP_PRIVATE_KEY` | The full contents of the `.pem` file from Step 1 |
+| `ANTHROPIC_API_KEY` | Your Anthropic API key (used by the autofix workflow to run Claude) |
+#### Step 4: Create the scenarios repo
+```bash
+# Create a private repo for holdout tests
+gh repo create my-project-scenarios --private
+# Copy the scenario templates into it
+cp -r docs/templates/scenarios/* ../my-project-scenarios/
+cd ../my-project-scenarios
+git add -A && git commit -m "init: scaffold scenarios repo from Joycraft"
+git push
+```
+Then add the **same two secrets** (`JOYCRAFT_APP_PRIVATE_KEY` and `ANTHROPIC_API_KEY`) to the scenarios repo's Settings > Secrets.
+#### Step 5: Verify
+```bash
+# Check workflow files exist in your main repo
+ls .github/workflows/autofix.yml .github/workflows/scenarios-dispatch.yml \
+   .github/workflows/spec-dispatch.yml .github/workflows/scenarios-rerun.yml
+# Check scenario templates in the scenarios repo
+ls ../my-project-scenarios/workflows/run.yml ../my-project-scenarios/workflows/generate.yml \
+   ../my-project-scenarios/prompts/scenario-agent.md ../my-project-scenarios/example-scenario.test.ts
+```
+#### Step 6: Test it
+1. Push a spec to `docs/specs/` on main -- this triggers scenario generation in the scenarios repo
+2. Open a PR with a small change -- when CI passes, scenarios run against the PR
+3. Watch for the scenario test results posted as a PR comment
+Or deliberately break something in a PR to test the autofix loop.
 ### Cost
@@ -454,12 +526,12 @@ When `/joycraft-tune` runs for the first time, it does two things:
 ### Risk interview
-3-5 targeted questions about what's dangerous in your project — production databases, live APIs, secrets, files that should be off-limits. From your answers, Joycraft generates:
+3-5 targeted questions about what's dangerous in your project (production databases, live APIs, secrets, files that should be off-limits). From your answers, Joycraft generates:
 - **NEVER rules** for CLAUDE.md (e.g., "NEVER connect to production DB")
 - **Deny patterns** for `.claude/settings.json` (blocks dangerous bash commands)
-- **`docs/context/production-map.md`** — what's real vs. safe to touch
-- **`docs/context/dangerous-assumptions.md`** — "Agent might assume X, but actually Y"
+- **`docs/context/production-map.md`** documenting what's real vs. safe to touch
+- **`docs/context/dangerous-assumptions.md`** documenting "Agent might assume X, but actually Y"
 This takes 2-3 minutes and dramatically reduces the chance of your agent doing something catastrophic.
@@ -467,8 +539,8 @@ This takes 2-3 minutes and dramatically reduces the chance of your agent doing s
 One question: **how autonomous should git be?**
-- **Cautious** (default) — commits freely, asks before pushing or opening PRs. Good for learning the workflow.
-- **Autonomous** — commits, pushes to feature branches, and opens PRs without asking. Good for spec-driven development where you want full send.
+- **Cautious** (default) commits freely but asks before pushing or opening PRs. Good for learning the workflow.
+- **Autonomous** commits, pushes to feature branches, and opens PRs without asking. Good for spec-driven development where you want full send.
 Either way, Joycraft generates explicit git boundaries in your CLAUDE.md: commit message format (`verb: message`), specific file staging (no `git add -A`), no secrets in commits, no force-pushing.
@@ -476,9 +548,9 @@ Either way, Joycraft generates explicit git boundaries in your CLAUDE.md: commit
 **Claude Code** reads `CLAUDE.md` automatically and discovers skills in `.claude/skills/`. The behavioral boundaries guide every action. The skills provide structured workflows accessible via `/slash-commands`.
-**Codex** reads `AGENTS.md` — same boundaries and commands in a concise format optimized for smaller context windows.
+**Codex** reads `AGENTS.md`, which provides the same boundaries and commands in a concise format optimized for smaller context windows.
-Both agents get the same guardrails and the same development workflow. Joycraft doesn't write your project code — it builds the *system* that makes AI-assisted development reliable.
+Both agents get the same guardrails and the same development workflow. Joycraft doesn't write your project code. It builds the *system* that makes AI-assisted development reliable.
 ### Team Sharing
@@ -489,13 +561,13 @@ git add .claude/skills/ docs/
 git commit -m "add: Joycraft harness"
 ```
-Joycraft also installs a session-start hook that checks for updates — if your templates are outdated, you'll see a one-line nudge when Claude Code starts.
+Joycraft also installs a session-start hook that checks for updates. If your templates are outdated, you'll see a one-line nudge when Claude Code starts.
 ## Why This Exists
-Most developers using AI tools are at Level 2 — they prompt, they iterate, they feel productive. But [METR's randomized control trial](https://metr.org/) found experienced developers using AI tools actually completed tasks **19% slower**, while *believing* they were 24% faster. The problem isn't the tools. It's the absence of structure around them.
+Most developers using AI tools are at Level 2. They prompt, they iterate, they feel productive. But [METR's randomized control trial](https://metr.org/) found experienced developers using AI tools actually completed tasks **19% slower**, while *believing* they were 24% faster. The problem isn't the tools. It's the absence of structure around them.
-The teams seeing transformative results — [StrongDM](https://factory.strongdm.ai/) shipping an entire product with 3 engineers, [Spotify Honk](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) merging 1,000 PRs every 10 days, Anthropic generating effectively 100% of their code with AI — all share the same pattern: **they don't prompt AI to write code. They write specs and let AI execute them.**
+The teams seeing transformative results ([StrongDM](https://factory.strongdm.ai/) shipping an entire product with 3 engineers, [Spotify Honk](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) merging 1,000 PRs every 10 days, Anthropic generating effectively 100% of their code with AI) all share the same pattern: **they don't prompt AI to write code. They write specs and let AI execute them.**
 Joycraft packages that pattern into something anyone can install.
@@ -503,15 +575,15 @@ Joycraft packages that pattern into something anyone can install.
 Joycraft's approach is synthesized from several sources:
-**Spec-driven development.** Instead of prompting AI in conversation, you write structured specifications — Feature Briefs that capture the *what* and *why*, then Atomic Specs that break work into small, testable, independently executable units. Each spec is self-contained: an agent can pick it up without reading anything else. This follows [Addy Osmani's](https://addyosmani.com/blog/good-spec/) principles for AI-consumable specs and [GitHub's Spec Kit](https://github.blog/ai-and-ml/generative-ai/spec-driven-development-with-ai-get-started-with-a-new-open-source-toolkit/) 4-phase process (Specify → Plan → Tasks → Implement).
+**Spec-driven development.** Instead of prompting AI in conversation, you write structured specifications. Feature Briefs capture the *what* and *why*, then Atomic Specs break work into small, testable, independently executable units. Each spec is self-contained: an agent can pick it up without reading anything else. This follows [Addy Osmani's](https://addyosmani.com/blog/good-spec/) principles for AI-consumable specs and [GitHub's Spec Kit](https://github.blog/ai-and-ml/generative-ai/spec-driven-development-with-ai-get-started-with-a-new-open-source-toolkit/) 4-phase process (Specify → Plan → Tasks → Implement).
 **Context isolation.** [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic) recommends: interview in one session, write the spec, then execute in a *fresh session* with clean context. Joycraft's `/joycraft-new-feature` → `/joycraft-decompose` → execute workflow enforces this naturally. The interview session captures intent; the execution session has only the spec.
-**Behavioral boundaries.** CLAUDE.md isn't a suggestion box — it's a contract. Joycraft installs a three-tier boundary framework (Always / Ask First / Never) that prevents the most common AI development failures: overwriting user files, skipping tests, pushing without approval, hardcoding secrets. This is [Addy Osmani's](https://addyosmani.com/blog/good-spec/) "boundaries" principle made concrete.
+**Behavioral boundaries.** CLAUDE.md isn't a suggestion box, it's a contract. Joycraft installs a three-tier boundary framework (Always / Ask First / Never) that prevents the most common AI development failures: overwriting user files, skipping tests, pushing without approval, hardcoding secrets. This is [Addy Osmani's](https://addyosmani.com/blog/good-spec/) "boundaries" principle made concrete.
-**Knowledge capture over session notes.** Most session notes are never re-read. Joycraft's `/joycraft-session-end` skill captures only *discoveries* — assumptions that were wrong, APIs that behaved unexpectedly, decisions made during implementation that aren't in the spec. If nothing surprising happened, you capture nothing. This keeps the signal-to-noise ratio high.
+**Knowledge capture over session notes.** Most session notes are never re-read. Joycraft's `/joycraft-session-end` skill captures only *discoveries*: assumptions that were wrong, APIs that behaved unexpectedly, decisions made during implementation that aren't in the spec. If nothing surprising happened, you capture nothing. This keeps the signal-to-noise ratio high.
-**External holdout scenarios.** [StrongDM's Software Factory](https://factory.strongdm.ai/) proved that AI agents will [actively game visible test suites](https://palisaderesearch.org/blog/specification-gaming). Their solution: scenarios that live *outside* the codebase, invisible to the agent during development. Like a holdout set in ML, this prevents overfitting. Joycraft now implements this directly — `init-autofix` sets up the holdout wall, the scenario agent, and the GitHub App integration, not just provides templates for it.
+**External holdout scenarios.** [StrongDM's Software Factory](https://factory.strongdm.ai/) proved that AI agents will [actively game visible test suites](https://palisaderesearch.org/blog/specification-gaming). Their solution: scenarios that live *outside* the codebase, invisible to the agent during development. Like a holdout set in ML, this prevents overfitting. Joycraft now implements this directly. `init-autofix` sets up the holdout wall, the scenario agent, and the GitHub App integration.
 **The 5-level framework.** [Dan Shapiro's levels](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) give you a map. Level 2 (Junior Developer) is where most teams plateau. Level 3 (Developer as Manager) means your life is diffs. Level 4 (Developer as PM) means you write specs, not code. Level 5 (Dark Factory) means specs in, software out. Joycraft's `/joycraft-tune` assessment tells you where you are and what to do next.
@@ -519,14 +591,14 @@ Joycraft's approach is synthesized from several sources:
 Joycraft synthesizes ideas and patterns from people doing extraordinary work in AI-assisted software development:
-- **[Dan Shapiro](https://x.com/danshapiro)** — The [5 Levels of Vibe Coding](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) framework that Joycraft's assessment and level system is built on
-- **[StrongDM](https://www.strongdm.com/)** / **[Justin McCarthy](https://x.com/BuiltByJustin)** — The [Software Factory](https://factory.strongdm.ai/): spec-driven autonomous development, NLSpec, external holdout scenarios, and the proof that 3 engineers can outproduce 30
-- **[Boris Cherny](https://x.com/bcherny)** — Head of Claude Code at Anthropic. The interview → spec → fresh session → execute pattern, and the insight that [context isolation produces better results](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens)
-- **[Addy Osmani](https://x.com/addyosmani)** — [What makes a good spec for AI](https://addyosmani.com/blog/good-spec/): commands, testing, project structure, code style, git workflow, and boundaries
-- **[METR](https://metr.org/)** — The [randomized control trial](https://metr.org/) that proved unstructured AI use makes experienced developers slower, validating the need for harnesses
-- **[Nate B Jones](https://x.com/natebjones)** — His [video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ) made this research accessible and inspired turning Joycraft into a tool anyone can use
-- **[Simon Willison](https://x.com/simonw)** — [Analysis of the Software Factory](https://simonwillison.net/2026/Feb/7/software-factory/) that helped contextualize StrongDM's approach for the broader community
-- **[Anthropic](https://www.anthropic.com/)** — Claude Code's skills, hooks, and CLAUDE.md system that makes tool-native AI development possible, and the [harness patterns for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)
+- **[Dan Shapiro](https://x.com/danshapiro)** for the [5 Levels of Vibe Coding](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) framework that Joycraft's assessment and level system is built on
+- **[StrongDM](https://www.strongdm.com/)** / **[Justin McCarthy](https://x.com/BuiltByJustin)** for the [Software Factory](https://factory.strongdm.ai/): spec-driven autonomous development, NLSpec, external holdout scenarios, and the proof that 3 engineers can outproduce 30
+- **[Boris Cherny](https://x.com/bcherny)**, Head of Claude Code at Anthropic, for the interview → spec → fresh session → execute pattern and the insight that [context isolation produces better results](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens)
+- **[Addy Osmani](https://x.com/addyosmani)** for [What makes a good spec for AI](https://addyosmani.com/blog/good-spec/): commands, testing, project structure, code style, git workflow, and boundaries
+- **[METR](https://metr.org/)** for the [randomized control trial](https://metr.org/) that proved unstructured AI use makes experienced developers slower, validating the need for harnesses
+- **[Nate B Jones](https://x.com/natebjones)** whose [video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ) made this research accessible and inspired turning Joycraft into a tool anyone can use
+- **[Simon Willison](https://x.com/simonw)** for his [analysis of the Software Factory](https://simonwillison.net/2026/Feb/7/software-factory/) that helped contextualize StrongDM's approach for the broader community
+- **[Anthropic](https://www.anthropic.com/)** for Claude Code's skills, hooks, and CLAUDE.md system that makes tool-native AI development possible, and the [harness patterns for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)
 ## Contributing
@@ -538,10 +610,10 @@ The short version:
 2. `pnpm install && pnpm test --run` to verify your setup
 3. Write tests first, then implement
 4. `pnpm test --run && pnpm typecheck && pnpm build`
-5. Open a PR — one approval required
+5. Open a PR (one approval required)
 Look for [`good first issue`](https://github.com/maksutovic/joycraft/labels/good%20first%20issue) labels if you're new. Areas we'd especially love help with: stack detection for new languages, skill improvements, documentation, and Codex integration.
 ## License
-MIT — see [LICENSE](LICENSE) for details.
+MIT. See [LICENSE](LICENSE) for details.