joycraft 0.5.3 → 0.5.4

package/README.md CHANGED
@@ -1,14 +1,14 @@
1
1
  # Joycraft
2
2
 
3
3
  <p align="center">
4
- <img src="docs/joycraft-banner.png" alt="Joycraft the craft of AI development" width="700" />
4
+ <img src="docs/joycraft-banner.png" alt="Joycraft, the craft of AI development" width="700" />
5
5
  </p>
6
6
 
7
- > The craft of AI development with joy, not darkness.
7
+ > The craft of AI development. With joy, not darkness.
8
8
 
9
9
  ## What is Joycraft?
10
10
 
11
- Joycraft is a CLI tool and [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin that upgrades your AI development workflow. It installs skills, behavioral boundaries, templates, and documentation structure into any project taking you from unstructured prompting to autonomous spec-driven development.
11
+ Joycraft is a CLI tool and [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin that upgrades your AI development workflow. It installs skills, behavioral boundaries, templates, and documentation structure into any project, taking you from unstructured prompting to autonomous spec-driven development.
12
12
 
13
13
  If you've been using Claude Code (or any AI coding tool) and your workflow looks like this:
14
14
 
@@ -16,18 +16,18 @@ If you've been using Claude Code (or any AI coding tool) and your workflow looks
16
16
 
17
17
  ...then Joycraft is for you.
18
18
 
19
- This project started as a personal exploration by [@maksutovic](https://github.com/maksutovic). I was working across multiple client projects, spending more time wrestling with prompts than building software. I knew Claude Code was capable of extraordinary work, but my *process* was holding it back. I was vibe coding and vibe coding doesn't scale.
19
+ This project started as a personal exploration by [@maksutovic](https://github.com/maksutovic). I was working across multiple client projects, spending more time wrestling with prompts than building software. I knew Claude Code was capable of extraordinary work, but my *process* was holding it back. I was vibe coding - and vibe coding doesn't scale.
20
20
 
21
- The spark was [Nate B Jones' video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ). It mapped out a progression I hadn't seen articulated before from "spicy autocomplete" to fully autonomous development and lit my brain up to the potential of what Claude Code could do with the right harness around it. Joycraft is the result of that exploration: a tool that encodes the patterns, boundaries, and workflows that make AI-assisted development actually deterministic.
21
+ The spark was [Nate B Jones' video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ). It mapped out a progression I hadn't seen articulated before - from "spicy autocomplete" to fully autonomous development - and lit my brain up to the potential of what Claude Code could do with the right harness around it. Joycraft is the result of that exploration: a tool that encodes the patterns, boundaries, and workflows that make AI-assisted development actually deterministic.
22
22
 
23
23
  ### The core idea
24
24
 
25
25
  Joycraft is simple. It's a set of **skills** (slash commands for Claude Code) and **instructions** (CLAUDE.md boundaries) that guide you and your agent through a structured development process:
26
26
 
27
27
  - **Levels 1-4:** Skills like `/joycraft-tune`, `/joycraft-new-feature`, and `/joycraft-interview` replace unstructured prompting with spec-driven development. You interview, you write specs, the agent executes. No back-and-forth.
28
- - **Level 5:** The `/joycraft-implement-level5` skill sets up the autonomous loop where specs go in and validated software comes out, with holdout scenario testing that prevents the agent from gaming its own tests.
28
+ - **Level 5:** The `/joycraft-implement-level5` skill sets up the autonomous loop where specs go in and validated software comes out, with holdout scenario testing that prevents the agent from gaming its own tests.
29
29
 
30
- StrongDM calls their Level 5 fully autonomous loop a "Dark Factory" which, albeit a cool name, the world has so much darkness in it right now. I wanted a name that extolled more of what I believe tools like this can provide: joy and craftsmanship. Hence "Joycraft."
30
+ StrongDM calls their Level 5 fully autonomous loop a "Dark Factory." It's a cool name, but the world has so much darkness in it right now. I wanted a name that extolled more of what I believe tools like this can provide: joy and craftsmanship. Hence "Joycraft."
31
31
 
32
32
  ### What are the levels?
33
33
 
@@ -35,7 +35,7 @@ StrongDM calls their Level 5 fully autonomous loop a "Dark Factory" — which, a
35
35
 
36
36
  | Level | Name | What it looks like | Joycraft's role |
37
37
  |-------|------|--------------------|-----------------|
38
- | 1 | Autocomplete | Tab-complete suggestions | |
38
+ | 1 | Autocomplete | Tab-complete suggestions | - |
39
39
  | 2 | Junior Developer | Prompt → iterate → fix → repeat | `/joycraft-tune` assesses where you are |
40
40
  | 3 | Developer as Manager | Your life is reviewing diffs | Behavioral boundaries in CLAUDE.md |
41
41
  | 4 | Developer as PM | You write specs, agent writes code | `/joycraft-new-feature` + `/joycraft-decompose` |
@@ -45,7 +45,7 @@ Most developers plateau at Level 2. Joycraft's job is to move you up.
45
45
 
46
46
  ### Platform support
47
47
 
48
- Joycraft is currently focused on making the Claude Code experience state-of-the-art. Better [Codex](https://openai.com/codex) support is coming `AGENTS.md` generation is already included, and deeper integration is on the roadmap.
48
+ Joycraft is currently focused on making the Claude Code experience state-of-the-art. Better [Codex](https://openai.com/codex) support is coming. `AGENTS.md` generation is already included, and deeper integration is on the roadmap.
49
49
 
50
50
  ## Quick Start
51
51
 
@@ -67,14 +67,16 @@ Joycraft auto-detects your tech stack and creates:
67
67
  - **CLAUDE.md** with behavioral boundaries (Always / Ask First / Never) and correct build/test/lint commands
68
68
  - **AGENTS.md** for Codex compatibility
69
69
  - **Claude Code skills** installed to `.claude/skills/`:
70
- - `/joycraft-tune` Assess your harness, apply upgrades, see your path to Level 5
71
- - `/joycraft-new-feature` Interview → Feature Brief → Atomic Specs
72
- - `/joycraft-interview` Lightweight brainstorm yap about ideas, get a structured summary
73
- - `/joycraft-decompose` Break a brief into small, testable specs
74
- - `/joycraft-session-end` Capture discoveries, verify, commit
75
- - `/joycraft-implement-level5` Set up Level 5: autofix loop, holdout scenarios, scenario evolution
76
- - **docs/** structure `briefs/`, `specs/`, `discoveries/`, `contracts/`, `decisions/`
77
- - **Templates** Atomic spec, feature brief, implementation plan, boundary framework, and workflow templates for scenario generation and autofix loops
70
+ - `/joycraft-tune` Assess your harness, apply upgrades, see your path to Level 5
71
+ - `/joycraft-new-feature` Interview → Feature Brief → Atomic Specs
72
+ - `/joycraft-interview` Lightweight brainstorm. Yap about ideas, get a structured summary
73
+ - `/joycraft-decompose` Break a brief into small, testable specs
74
+ - `/joycraft-add-fact` Capture project knowledge on the fly -- routes to the right context doc
75
+ - `/joycraft-session-end` Capture discoveries, verify, commit, push
76
+ - `/joycraft-implement-level5` Set up Level 5 (autofix loop, holdout scenarios, scenario evolution)
77
+ - **docs/** structure: `briefs/`, `specs/`, `discoveries/`, `contracts/`, `decisions/`, `context/`
78
+ - **Context documents** in `docs/context/`: production map, dangerous assumptions, decision log, institutional knowledge, and troubleshooting guide
79
+ - **Templates** including atomic spec, feature brief, implementation plan, boundary framework, and workflow templates for scenario generation and autofix loops
78
80
 
79
81
  Once you reach Level 4, you can set up the autonomous loop with `/joycraft-implement-level5`. See [Level 5: The Autonomous Loop](#level-5-the-autonomous-loop) below.
80
82
 
@@ -90,11 +92,12 @@ After init, open Claude Code and use the installed skills:
90
92
 
91
93
  ```
92
94
  /joycraft-tune # Assess your harness, apply upgrades, see path to Level 5
93
- /joycraft-interview # Brainstorm freely yap about ideas, get a structured summary
95
+ /joycraft-interview # Brainstorm freely, yap about ideas, get a structured summary
94
96
  /joycraft-new-feature # Interview → Feature Brief → Atomic Specs → ready to execute
95
97
  /joycraft-decompose # Break any feature into small, independent specs
96
- /joycraft-session-end # Wrap up discoveries, verification, commit
97
- /joycraft-implement-level5 # Set up Level 5 — autofix, holdout scenarios, evolution
98
+ /joycraft-add-fact # Capture a fact mid-session -- auto-routes to the right context doc
99
+ /joycraft-session-end # Wrap up: discoveries, verification, commit, push
100
+ /joycraft-implement-level5 # Set up Level 5 (autofix, holdout scenarios, evolution)
98
101
  ```
99
102
 
100
103
  The core loop:
@@ -113,13 +116,13 @@ Joycraft flips this. Before the agent writes a single line of code, you have a c
113
116
 
114
117
  ### Two interview modes
115
118
 
116
- **`/joycraft-interview`** The lightweight brainstorm. You yap about an idea, the agent asks clarifying questions, and you get a structured summary saved to `docs/briefs/`. Good for early-stage thinking when you're not ready to commit to building anything yet. No pressure, no specs just organized thought.
119
+ **`/joycraft-interview`** is the lightweight brainstorm. You yap about an idea, the agent asks clarifying questions, and you get a structured summary saved to `docs/briefs/`. Good for early-stage thinking when you're not ready to commit to building anything yet. No pressure, no specs, just organized thought.
117
120
 
118
- **`/joycraft-new-feature`** The full workflow. This is the structured interview that produces a **Feature Brief** (the what and why) and then decomposes it into **Atomic Specs** (small, testable, independently executable units of work). Each spec is self-contained an agent in a fresh session can pick it up and execute without reading anything else.
121
+ **`/joycraft-new-feature`** is the full workflow. This is the structured interview that produces a **Feature Brief** (the what and why) and then decomposes it into **Atomic Specs** (small, testable, independently executable units of work). Each spec is self-contained. An agent in a fresh session can pick it up and execute without reading anything else.
119
122
 
120
123
  ### Why this works
121
124
 
122
- The insight comes from [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic): interview in one session, write the spec, then execute in a *fresh session* with clean context. The interview captures your intent. The spec is the contract. The execution session has only the spec no baggage from the conversation, no accumulated misunderstandings, no context window full of abandoned approaches.
125
+ The insight comes from [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic): interview in one session, write the spec, then execute in a *fresh session* with clean context. The interview captures your intent. The spec is the contract. The execution session has only the spec. No baggage from the conversation, no accumulated misunderstandings, no context window full of abandoned approaches.
123
126
 
124
127
  This is what separates Level 2 (back-and-forth prompting) from Level 4 (spec-driven development). You stop being a typist correcting an agent's guesses and start being a PM defining what needs to be built.
125
128
 
@@ -143,13 +146,13 @@ flowchart LR
143
146
 
144
147
  An atomic spec produced by `/joycraft-decompose` has:
145
148
 
146
- - **What** One paragraph. A developer with zero context understands the change in 15 seconds.
147
- - **Why** One sentence. What breaks or is missing without this?
148
- - **Acceptance criteria** Checkboxes. Testable. No ambiguity.
149
- - **Affected files** Exact paths, what changes in each.
150
- - **Edge cases** Table of scenarios and expected behavior.
149
+ - **What:** One paragraph. A developer with zero context understands the change in 15 seconds.
150
+ - **Why:** One sentence. What breaks or is missing without this?
151
+ - **Acceptance criteria:** Checkboxes. Testable. No ambiguity.
152
+ - **Affected files:** Exact paths, what changes in each.
153
+ - **Edge cases:** Table of scenarios and expected behavior.
151
154
 
152
- The agent doesn't guess. It reads the spec and executes. If something's unclear, the spec is wrong fix the spec, not the conversation.
155
+ The agent doesn't guess. It reads the spec and executes. If something's unclear, the spec is wrong. Fix the spec, not the conversation.
153
156
 
154
157
  ## Upgrade
155
158
 
@@ -165,7 +168,7 @@ Joycraft tracks what it installed vs. what you've customized. Unmodified files u
165
168
 
166
169
  ## Level 5: The Autonomous Loop
167
170
 
168
- > **A note on complexity:** Setting up Level 5 does have some moving parts and, depending on the complexity of your stack (software vs. hardware, monorepo vs. single app, etc.), this will require a good amount of prompting and trial-and-error to get right. I've done my best to make this as painless as possible, but just note this is not a one-shot-prompt-done-in-5-minutes kind of thing. For small projects and simple stacks it will be easy, but any level of complexity is going to take some iteration, so plan ahead. Full step-by-step guides along with a video coming soon.
171
+ > **A note on complexity:** Setting up Level 5 does have some moving parts and, depending on the complexity of your stack (software vs. hardware, monorepo vs. single app, etc.), this will require a good amount of prompting and trial-and-error to get right. I've done my best to make this as painless as possible, but just note - this is not a one-shot-prompt-done-in-5-minutes kind of thing. For small projects and simple stacks it will be easy, but any level of complexity is going to take some iteration, so plan ahead. Full step-by-step guides along with a video coming soon.
169
172
 
170
173
  Level 5 is where specs go in and validated software comes out. Joycraft implements this as four interlocking GitHub Actions workflows, a separate scenarios repository, and two independent AI agents that can never see each other's work.
171
174
 
@@ -177,7 +180,7 @@ npx joycraft init-autofix --scenarios-repo my-project-scenarios --app-id 3180156
177
180
 
178
181
  ### Architecture Overview
179
182
 
180
- Level 5 has four moving parts. Each is a GitHub Actions workflow that communicates via `repository_dispatch` events no custom servers, no webhooks, no external services.
183
+ Level 5 has four moving parts. Each is a GitHub Actions workflow that communicates via `repository_dispatch` events. No custom servers, no webhooks, no external services.
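As a rough illustration of the `repository_dispatch` mechanism, the sketch below builds the JSON body for a `spec-pushed` event and shows how it could be sent with the GitHub CLI. The function names, the sample filename, and the reduced payload (only `spec_filename` and `commit_sha`) are assumptions for illustration; the workflows Joycraft installs may construct and send the event differently.

```bash
# Illustrative sketch only; not copied from Joycraft's workflow files.

# Build the request body for a repository_dispatch event.
# $1: event type, $2: spec filename, $3: commit SHA.
# Values are assumed to contain no characters that need JSON escaping.
build_dispatch_body() {
  printf '{"event_type":"%s","client_payload":{"spec_filename":"%s","commit_sha":"%s"}}\n' \
    "$1" "$2" "$3"
}

# Send the event to a target repo (OWNER/NAME) through the GitHub REST API.
# Assumes the gh CLI is installed and authenticated with a token that can
# write to the target repo.
send_dispatch() {
  target_repo=$1
  shift
  build_dispatch_body "$@" |
    gh api "repos/$target_repo/dispatches" --method POST --input -
}

# Print the body without sending anything.
build_dispatch_body spec-pushed add-login.md abc1234
```

Calling `send_dispatch my-org/my-project-scenarios spec-pushed add-login.md abc1234` would deliver the same body to the scenarios repo, where a workflow listening for `repository_dispatch` with that event type can pick it up.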
181
184
 
182
185
  ```mermaid
183
186
  graph TB
@@ -241,10 +244,10 @@ sequenceDiagram
241
244
  ```
242
245
 
243
246
  **Key details:**
244
- - Uses a GitHub App identity for pushes avoids GitHub's anti-recursion protection
245
- - Concurrency group per PR only one autofix runs at a time per PR
246
- - Max 3 iterations posts "human review needed" if it can't fix it
247
- - No `--model` flag Claude CLI handles model selection
247
+ - Uses a GitHub App identity for pushes to avoid GitHub's anti-recursion protection
248
+ - Concurrency group per PR so only one autofix runs at a time
249
+ - Max 3 iterations, then posts "human review needed"
250
+ - No `--model` flag. Claude CLI handles model selection.
248
251
  - Strips ANSI escape codes from logs so Claude gets clean text
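Two of the details above, the iteration cap and the ANSI stripping, can be sketched in a few lines of shell. This is a minimal illustration under assumptions: `strip_ansi` and `autofix_loop` are hypothetical names, and the command passed to `autofix_loop` stands in for "apply a fix, then re-run the checks". It is not the logic in `autofix.yml` itself.

```bash
# Illustrative sketch only; not copied from Joycraft's autofix.yml.

MAX_ITERATIONS=3  # the cap stated in the key details

# Remove ANSI CSI sequences (colors, cursor moves) from stdin
# so the log reaches the model as plain text.
strip_ansi() {
  esc=$(printf '\033')
  sed "s/${esc}\[[0-9;]*[A-Za-z]//g"
}

# Run a check command up to MAX_ITERATIONS times.
# "$@" is a hypothetical stand-in for one fix-and-retest attempt.
autofix_loop() {
  attempt=1
  while [ "$attempt" -le "$MAX_ITERATIONS" ]; do
    if "$@"; then
      echo "fixed on attempt $attempt"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  echo "human review needed"
  return 1
}

# A colored test log line becomes plain text.
printf '\033[31mFAIL\033[0m login.test.ts\n' | strip_ansi

# A check that never passes ends with the escalation message.
autofix_loop false || true
```

Capping the loop keeps a stuck fix from burning API calls indefinitely, and handing the problem back to a person is the explicit fallback.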
249
252
 
250
253
  #### 2. Scenarios Dispatch Workflow (`scenarios-dispatch.yml`)
@@ -281,7 +284,7 @@ sequenceDiagram
281
284
  SPD->>SR: repository_dispatch: spec-pushed<br/>payload: {spec_filename, spec_content, commit_sha, branch, repo}
282
285
  end
283
286
 
284
- Note over SPD: Deleted specs are ignored —<br/>existing scenario tests remain
287
+ Note over SPD: Deleted specs are ignored -<br/>existing scenario tests remain
285
288
  ```
286
289
 
287
290
  #### 4. Scenarios Re-run Workflow (`scenarios-rerun.yml`)
@@ -306,7 +309,7 @@ sequenceDiagram
306
309
  end
307
310
  ```
308
311
 
309
- **Why this exists:** There's a race condition. The implementation agent might open a PR before the scenario agent finishes writing new tests. The re-run workflow handles this when new tests land, all open PRs get re-tested. Worst case: a PR merges before the re-run, and the new tests protect the very next PR. You're never more than one cycle behind.
312
+ **Why this exists:** There's a race condition. The implementation agent might open a PR before the scenario agent finishes writing new tests. The re-run workflow handles this by re-testing all open PRs when new tests land. Worst case, a PR merges before the re-run, and the new tests protect the very next PR. You're never more than one cycle behind.
310
313
 
311
314
  ### The Holdout Wall
312
315
 
@@ -336,7 +339,7 @@ graph LR
336
339
  style Specs fill:#cfc,stroke:#393
337
340
  ```
338
341
 
339
- This is the same principle as a holdout set in machine learning. If the implementation agent could see the scenario tests, it would optimize to pass them specifically not to build correct software. By keeping the wall intact, scenario tests catch real behavioral regressions, not test-gaming.
342
+ This is the same principle as a holdout set in machine learning. If the implementation agent could see the scenario tests, it would optimize to pass them specifically instead of building correct software. By keeping the wall intact, scenario tests catch real behavioral regressions, not test-gaming.
340
343
 
341
344
  ### Scenario Evolution
342
345
 
@@ -348,7 +351,7 @@ flowchart TD
348
351
  B --> C[Scenario Agent reads spec]
349
352
  C --> D{Triage: is this user-facing?}
350
353
 
351
- D -->|Internal refactor, CI, dev tooling| E[Skip commit note: 'No scenario changes needed']
354
+ D -->|Internal refactor, CI, dev tooling| E[Skip - commit note: 'No scenario changes needed']
352
355
  D -->|New user-facing behavior| F[Write new scenario test file]
353
356
  D -->|Modified existing behavior| G[Update existing scenario tests]
354
357
 
@@ -433,11 +436,80 @@ sequenceDiagram
433
436
  | Scenarios repo | `package.json` | Minimal vitest setup |
434
437
  | Scenarios repo | `README.md` | Explains holdout pattern to contributors |
435
438
 
436
- ### Prerequisites
439
+ ### Setup Guide
437
440
 
438
- - **GitHub App** Provides a separate identity for autofix pushes (avoids GitHub's anti-recursion protection). You can install the shared [Joycraft Autofix](https://github.com/apps/joycraft-autofix) app (App ID: `3180156`) or create your own.
439
- - **Secrets** — `JOYCRAFT_APP_PRIVATE_KEY` and `ANTHROPIC_API_KEY` on both the main and scenarios repos.
440
- - **Scenarios repo** A private repository where holdout tests live. Created during setup.
441
+ The fastest way: run `/joycraft-implement-level5` in Claude Code and it walks you through everything interactively. Or follow these steps manually:
442
+
443
+ #### Step 1: Create a GitHub App
444
+
445
+ The autofix workflow needs a GitHub App identity to push commits. GitHub blocks workflows from triggering other workflows with the default `GITHUB_TOKEN` -- a separate App identity solves this. Creating one takes about 2 minutes:
446
+
447
+ 1. Go to https://github.com/settings/apps/new
448
+ 2. Give it a name (e.g., "My Project Autofix")
449
+ 3. Uncheck "Webhook > Active" (not needed)
450
+ 4. Under **Repository permissions**, set:
451
+ - **Contents**: Read & Write
452
+ - **Pull requests**: Read & Write
453
+ - **Actions**: Read & Write
454
+ 5. Click **Create GitHub App**
455
+ 6. Note the **App ID** from the settings page (you'll need it in Step 2)
456
+ 7. Scroll to **Private keys** > click **Generate a private key**
457
+ 8. Save the downloaded `.pem` file -- you'll need it in Step 3
458
+ 9. Click **Install App** in the left sidebar > install it on the repo(s) you want to use
459
+
460
+ > **Coming soon:** We're working on a shared Joycraft Autofix app that will reduce this to a single click. For now, creating your own app gives you full control and takes just a couple minutes.
461
+
462
+ #### Step 2: Run the CLI
463
+
464
+ ```bash
465
+ npx joycraft init-autofix --scenarios-repo my-project-scenarios --app-id YOUR_APP_ID
466
+ ```
467
+
468
+ Replace `YOUR_APP_ID` with the App ID from Step 1. This installs the four workflow files in your main repo and copies scenario templates to `docs/templates/scenarios/`.
469
+
470
+ #### Step 3: Add secrets to your main repo
471
+
472
+ Go to your repo's **Settings > Secrets and variables > Actions** and add:
473
+
474
+ | Secret | Value |
475
+ |--------|-------|
476
+ | `JOYCRAFT_APP_PRIVATE_KEY` | The full contents of the `.pem` file from Step 1 |
477
+ | `ANTHROPIC_API_KEY` | Your Anthropic API key (used by the autofix workflow to run Claude) |
478
+
479
+ #### Step 4: Create the scenarios repo
480
+
481
+ ```bash
482
+ # Create a private repo for holdout tests and clone it next to your main repo
483
+ (cd .. && gh repo create my-project-scenarios --private --clone)
484
+
485
+ # Copy the scenario templates into it
486
+ cp -r docs/templates/scenarios/* ../my-project-scenarios/
487
+ cd ../my-project-scenarios
488
+ git add -A && git commit -m "init: scaffold scenarios repo from Joycraft"
489
+ git push -u origin HEAD
490
+ ```
491
+
492
+ Then add the **same two secrets** (`JOYCRAFT_APP_PRIVATE_KEY` and `ANTHROPIC_API_KEY`) to the scenarios repo's Settings > Secrets.
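If you prefer the terminal to the Settings page, the same two secrets can be added to both repos with the GitHub CLI. The helper below is a sketch under assumptions: `add_joycraft_secrets`, the `my-org/...` repo names, and the key path are placeholders, and it defaults to a dry run that only prints the commands it would execute.

```bash
# Sketch only: the repo names and key path below are placeholders.
# DRY_RUN defaults to 1, so nothing is sent to GitHub unless you opt in.
add_joycraft_secrets() {
  secrets_repo=$1  # OWNER/NAME of the repo receiving the secrets
  pem_path=$2      # the .pem file downloaded in Step 1
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "gh secret set JOYCRAFT_APP_PRIVATE_KEY --repo $secrets_repo < $pem_path"
    echo "gh secret set ANTHROPIC_API_KEY --repo $secrets_repo"
    return 0
  fi
  # The private key is read from the file; gh prompts for the API key value.
  gh secret set JOYCRAFT_APP_PRIVATE_KEY --repo "$secrets_repo" < "$pem_path"
  gh secret set ANTHROPIC_API_KEY --repo "$secrets_repo"
}

# Preview the commands for the main repo and the scenarios repo.
for target in my-org/my-project my-org/my-project-scenarios; do
  add_joycraft_secrets "$target" ./autofix-app.private-key.pem
done
```

Run it with `DRY_RUN=0` exported once the printed commands look right. This assumes `gh` is installed and authenticated with permission to manage secrets on both repos.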
493
+
494
+ #### Step 5: Verify
495
+
496
+ ```bash
497
+ # Check workflow files exist in your main repo
498
+ ls .github/workflows/autofix.yml .github/workflows/scenarios-dispatch.yml \
499
+ .github/workflows/spec-dispatch.yml .github/workflows/scenarios-rerun.yml
500
+
501
+ # Check scenario templates in the scenarios repo
502
+ ls ../my-project-scenarios/workflows/run.yml ../my-project-scenarios/workflows/generate.yml \
503
+ ../my-project-scenarios/prompts/scenario-agent.md ../my-project-scenarios/example-scenario.test.ts
504
+ ```
505
+
506
+ #### Step 6: Test it
507
+
508
+ 1. Push a spec to `docs/specs/` on main -- this triggers scenario generation in the scenarios repo
509
+ 2. Open a PR with a small change -- when CI passes, scenarios run against the PR
510
+ 3. Watch for the scenario test results posted as a PR comment
511
+
512
+ Or deliberately break something in a PR to test the autofix loop.
441
513
 
442
514
  ### Cost
443
515
 
@@ -454,12 +526,12 @@ When `/joycraft-tune` runs for the first time, it does two things:
454
526
 
455
527
  ### Risk interview
456
528
 
457
- 3-5 targeted questions about what's dangerous in your project production databases, live APIs, secrets, files that should be off-limits. From your answers, Joycraft generates:
529
+ 3-5 targeted questions about what's dangerous in your project (production databases, live APIs, secrets, files that should be off-limits). From your answers, Joycraft generates:
458
530
 
459
531
  - **NEVER rules** for CLAUDE.md (e.g., "NEVER connect to production DB")
460
532
  - **Deny patterns** for `.claude/settings.json` (blocks dangerous bash commands)
461
- - **`docs/context/production-map.md`** what's real vs. safe to touch
462
- - **`docs/context/dangerous-assumptions.md`** "Agent might assume X, but actually Y"
533
+ - **`docs/context/production-map.md`** documenting what's real vs. safe to touch
534
+ - **`docs/context/dangerous-assumptions.md`** documenting "Agent might assume X, but actually Y"
463
535
 
464
536
  This takes 2-3 minutes and dramatically reduces the chance of your agent doing something catastrophic.
465
537
 
@@ -467,8 +539,8 @@ This takes 2-3 minutes and dramatically reduces the chance of your agent doing s
467
539
 
468
540
  One question: **how autonomous should git be?**
469
541
 
470
- - **Cautious** (default) commits freely, asks before pushing or opening PRs. Good for learning the workflow.
471
- - **Autonomous** commits, pushes to feature branches, and opens PRs without asking. Good for spec-driven development where you want full send.
542
+ - **Cautious** (default) commits freely but asks before pushing or opening PRs. Good for learning the workflow.
543
+ - **Autonomous** commits, pushes to feature branches, and opens PRs without asking. Good for spec-driven development where you want full send.
472
544
 
473
545
  Either way, Joycraft generates explicit git boundaries in your CLAUDE.md: commit message format (`verb: message`), specific file staging (no `git add -A`), no secrets in commits, no force-pushing.
474
546
 
@@ -476,9 +548,9 @@ Either way, Joycraft generates explicit git boundaries in your CLAUDE.md: commit
476
548
 
477
549
  **Claude Code** reads `CLAUDE.md` automatically and discovers skills in `.claude/skills/`. The behavioral boundaries guide every action. The skills provide structured workflows accessible via `/slash-commands`.
478
550
 
479
- **Codex** reads `AGENTS.md` same boundaries and commands in a concise format optimized for smaller context windows.
551
+ **Codex** reads `AGENTS.md`, which provides the same boundaries and commands in a concise format optimized for smaller context windows.
480
552
 
481
- Both agents get the same guardrails and the same development workflow. Joycraft doesn't write your project code it builds the *system* that makes AI-assisted development reliable.
553
+ Both agents get the same guardrails and the same development workflow. Joycraft doesn't write your project code. It builds the *system* that makes AI-assisted development reliable.
482
554
 
483
555
  ### Team Sharing
484
556
 
@@ -489,13 +561,13 @@ git add .claude/skills/ docs/
489
561
  git commit -m "add: Joycraft harness"
490
562
  ```
491
563
 
492
- Joycraft also installs a session-start hook that checks for updates if your templates are outdated, you'll see a one-line nudge when Claude Code starts.
564
+ Joycraft also installs a session-start hook that checks for updates. If your templates are outdated, you'll see a one-line nudge when Claude Code starts.
493
565
 
494
566
  ## Why This Exists
495
567
 
496
- Most developers using AI tools are at Level 2 they prompt, they iterate, they feel productive. But [METR's randomized control trial](https://metr.org/) found experienced developers using AI tools actually completed tasks **19% slower**, while *believing* they were 24% faster. The problem isn't the tools. It's the absence of structure around them.
568
+ Most developers using AI tools are at Level 2. They prompt, they iterate, they feel productive. But [METR's randomized controlled trial](https://metr.org/) found experienced developers using AI tools actually completed tasks **19% slower**, while *expecting* to be 24% faster. The problem isn't the tools. It's the absence of structure around them.
497
569
 
498
- The teams seeing transformative results [StrongDM](https://factory.strongdm.ai/) shipping an entire product with 3 engineers, [Spotify Honk](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) merging 1,000 PRs every 10 days, Anthropic generating effectively 100% of their code with AI all share the same pattern: **they don't prompt AI to write code. They write specs and let AI execute them.**
570
+ The teams seeing transformative results ([StrongDM](https://factory.strongdm.ai/) shipping an entire product with 3 engineers, [Spotify Honk](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) merging 1,000 PRs every 10 days, Anthropic generating effectively 100% of their code with AI) all share the same pattern: **they don't prompt AI to write code. They write specs and let AI execute them.**
499
571
 
500
572
  Joycraft packages that pattern into something anyone can install.
501
573
 
@@ -503,15 +575,15 @@ Joycraft packages that pattern into something anyone can install.
503
575
 
504
576
  Joycraft's approach is synthesized from several sources:
505
577
 
506
- **Spec-driven development.** Instead of prompting AI in conversation, you write structured specifications Feature Briefs that capture the *what* and *why*, then Atomic Specs that break work into small, testable, independently executable units. Each spec is self-contained: an agent can pick it up without reading anything else. This follows [Addy Osmani's](https://addyosmani.com/blog/good-spec/) principles for AI-consumable specs and [GitHub's Spec Kit](https://github.blog/ai-and-ml/generative-ai/spec-driven-development-with-ai-get-started-with-a-new-open-source-toolkit/) 4-phase process (Specify → Plan → Tasks → Implement).
578
+ **Spec-driven development.** Instead of prompting AI in conversation, you write structured specifications. Feature Briefs capture the *what* and *why*, then Atomic Specs break work into small, testable, independently executable units. Each spec is self-contained: an agent can pick it up without reading anything else. This follows [Addy Osmani's](https://addyosmani.com/blog/good-spec/) principles for AI-consumable specs and [GitHub's Spec Kit](https://github.blog/ai-and-ml/generative-ai/spec-driven-development-with-ai-get-started-with-a-new-open-source-toolkit/) 4-phase process (Specify → Plan → Tasks → Implement).
507
579
 
508
580
  **Context isolation.** [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic) recommends: interview in one session, write the spec, then execute in a *fresh session* with clean context. Joycraft's `/joycraft-new-feature` → `/joycraft-decompose` → execute workflow enforces this naturally. The interview session captures intent; the execution session has only the spec.
509
581
 
510
- **Behavioral boundaries.** CLAUDE.md isn't a suggestion box it's a contract. Joycraft installs a three-tier boundary framework (Always / Ask First / Never) that prevents the most common AI development failures: overwriting user files, skipping tests, pushing without approval, hardcoding secrets. This is [Addy Osmani's](https://addyosmani.com/blog/good-spec/) "boundaries" principle made concrete.
582
+ **Behavioral boundaries.** CLAUDE.md isn't a suggestion box, it's a contract. Joycraft installs a three-tier boundary framework (Always / Ask First / Never) that prevents the most common AI development failures: overwriting user files, skipping tests, pushing without approval, hardcoding secrets. This is [Addy Osmani's](https://addyosmani.com/blog/good-spec/) "boundaries" principle made concrete.
511
583
 
512
- **Knowledge capture over session notes.** Most session notes are never re-read. Joycraft's `/joycraft-session-end` skill captures only *discoveries* assumptions that were wrong, APIs that behaved unexpectedly, decisions made during implementation that aren't in the spec. If nothing surprising happened, you capture nothing. This keeps the signal-to-noise ratio high.
584
+ **Knowledge capture over session notes.** Most session notes are never re-read. Joycraft's `/joycraft-session-end` skill captures only *discoveries*: assumptions that were wrong, APIs that behaved unexpectedly, decisions made during implementation that aren't in the spec. If nothing surprising happened, you capture nothing. This keeps the signal-to-noise ratio high.
 
- **External holdout scenarios.** [StrongDM's Software Factory](https://factory.strongdm.ai/) proved that AI agents will [actively game visible test suites](https://palisaderesearch.org/blog/specification-gaming). Their solution: scenarios that live *outside* the codebase, invisible to the agent during development. Like a holdout set in ML, this prevents overfitting. Joycraft now implements this directly `init-autofix` sets up the holdout wall, the scenario agent, and the GitHub App integration, not just provides templates for it.
+ **External holdout scenarios.** [StrongDM's Software Factory](https://factory.strongdm.ai/) proved that AI agents will [actively game visible test suites](https://palisaderesearch.org/blog/specification-gaming). Their solution: scenarios that live *outside* the codebase, invisible to the agent during development. Like a holdout set in ML, this prevents overfitting. Joycraft now implements this directly. `init-autofix` sets up the holdout wall, the scenario agent, and the GitHub App integration.
 
  **The 5-level framework.** [Dan Shapiro's levels](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) give you a map. Level 2 (Junior Developer) is where most teams plateau. Level 3 (Developer as Manager) means your life is diffs. Level 4 (Developer as PM) means you write specs, not code. Level 5 (Dark Factory) means specs in, software out. Joycraft's `/joycraft-tune` assessment tells you where you are and what to do next.
 
@@ -519,14 +591,14 @@ Joycraft's approach is synthesized from several sources:
 
  Joycraft synthesizes ideas and patterns from people doing extraordinary work in AI-assisted software development:
 
- - **[Dan Shapiro](https://x.com/danshapiro)** The [5 Levels of Vibe Coding](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) framework that Joycraft's assessment and level system is built on
- - **[StrongDM](https://www.strongdm.com/)** / **[Justin McCarthy](https://x.com/BuiltByJustin)** The [Software Factory](https://factory.strongdm.ai/): spec-driven autonomous development, NLSpec, external holdout scenarios, and the proof that 3 engineers can outproduce 30
- - **[Boris Cherny](https://x.com/bcherny)** Head of Claude Code at Anthropic. The interview → spec → fresh session → execute pattern, and the insight that [context isolation produces better results](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens)
- - **[Addy Osmani](https://x.com/addyosmani)** [What makes a good spec for AI](https://addyosmani.com/blog/good-spec/): commands, testing, project structure, code style, git workflow, and boundaries
- - **[METR](https://metr.org/)** The [randomized control trial](https://metr.org/) that proved unstructured AI use makes experienced developers slower, validating the need for harnesses
- - **[Nate B Jones](https://x.com/natebjones)** His [video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ) made this research accessible and inspired turning Joycraft into a tool anyone can use
- - **[Simon Willison](https://x.com/simonw)** [Analysis of the Software Factory](https://simonwillison.net/2026/Feb/7/software-factory/) that helped contextualize StrongDM's approach for the broader community
- - **[Anthropic](https://www.anthropic.com/)** Claude Code's skills, hooks, and CLAUDE.md system that makes tool-native AI development possible, and the [harness patterns for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)
+ - **[Dan Shapiro](https://x.com/danshapiro)** for the [5 Levels of Vibe Coding](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) framework that Joycraft's assessment and level system is built on
+ - **[StrongDM](https://www.strongdm.com/)** / **[Justin McCarthy](https://x.com/BuiltByJustin)** for the [Software Factory](https://factory.strongdm.ai/): spec-driven autonomous development, NLSpec, external holdout scenarios, and the proof that 3 engineers can outproduce 30
+ - **[Boris Cherny](https://x.com/bcherny)**, Head of Claude Code at Anthropic, for the interview → spec → fresh session → execute pattern and the insight that [context isolation produces better results](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens)
+ - **[Addy Osmani](https://x.com/addyosmani)** for [What makes a good spec for AI](https://addyosmani.com/blog/good-spec/): commands, testing, project structure, code style, git workflow, and boundaries
+ - **[METR](https://metr.org/)** for the [randomized control trial](https://metr.org/) that proved unstructured AI use makes experienced developers slower, validating the need for harnesses
+ - **[Nate B Jones](https://x.com/natebjones)** whose [video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ) made this research accessible and inspired turning Joycraft into a tool anyone can use
+ - **[Simon Willison](https://x.com/simonw)** for his [analysis of the Software Factory](https://simonwillison.net/2026/Feb/7/software-factory/) that helped contextualize StrongDM's approach for the broader community
+ - **[Anthropic](https://www.anthropic.com/)** for Claude Code's skills, hooks, and CLAUDE.md system that makes tool-native AI development possible, and the [harness patterns for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)
 
  ## Contributing
 
@@ -538,10 +610,10 @@ The short version:
  2. `pnpm install && pnpm test --run` to verify your setup
  3. Write tests first, then implement
  4. `pnpm test --run && pnpm typecheck && pnpm build`
- 5. Open a PR one approval required
+ 5. Open a PR (one approval required)
 
  Look for [`good first issue`](https://github.com/maksutovic/joycraft/labels/good%20first%20issue) labels if you're new. Areas we'd especially love help with: stack detection for new languages, skill improvements, documentation, and Codex integration.
 
  ## License
 
- MIT see [LICENSE](LICENSE) for details.
+ MIT. See [LICENSE](LICENSE) for details.