joycraft 0.5.3 → 0.5.4
- package/README.md +139 -67
- package/dist/{chunk-HHW4Q2UC.js → chunk-5BZUWHEY.js} +251 -36
- package/dist/chunk-5BZUWHEY.js.map +1 -0
- package/dist/cli.js +12 -7
- package/dist/cli.js.map +1 -1
- package/dist/{init-DHVJEWGX.js → init-PPOY5PDB.js} +114 -66
- package/dist/init-PPOY5PDB.js.map +1 -0
- package/dist/{init-autofix-OVHXYVLB.js → init-autofix-F7PLQLVX.js} +11 -15
- package/dist/{init-autofix-OVHXYVLB.js.map → init-autofix-F7PLQLVX.js.map} +1 -1
- package/dist/{upgrade-RRG2ZRSO.js → upgrade-CZJ6A447.js} +2 -2
- package/package.json +1 -1
- package/dist/chunk-HHW4Q2UC.js.map +0 -1
- package/dist/init-DHVJEWGX.js.map +0 -1
- package/dist/{upgrade-RRG2ZRSO.js.map → upgrade-CZJ6A447.js.map} +0 -0
package/README.md
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
# Joycraft
|
|
2
2
|
|
|
3
3
|
<p align="center">
|
|
4
|
-
<img src="docs/joycraft-banner.png" alt="Joycraft
|
|
4
|
+
<img src="docs/joycraft-banner.png" alt="Joycraft, the craft of AI development" width="700" />
|
|
5
5
|
</p>
|
|
6
6
|
|
|
7
|
-
> The craft of AI development
|
|
7
|
+
> The craft of AI development. With joy, not darkness.
|
|
8
8
|
|
|
9
9
|
## What is Joycraft?
|
|
10
10
|
|
|
11
|
-
Joycraft is a CLI tool and [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin that upgrades your AI development workflow. It installs skills, behavioral boundaries, templates, and documentation structure into any project
|
|
11
|
+
Joycraft is a CLI tool and [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin that upgrades your AI development workflow. It installs skills, behavioral boundaries, templates, and documentation structure into any project, taking you from unstructured prompting to autonomous spec-driven development.
|
|
12
12
|
|
|
13
13
|
If you've been using Claude Code (or any AI coding tool) and your workflow looks like this:
|
|
14
14
|
|
|
@@ -16,18 +16,18 @@ If you've been using Claude Code (or any AI coding tool) and your workflow looks
|
|
|
16
16
|
|
|
17
17
|
...then Joycraft is for you.
|
|
18
18
|
|
|
19
|
-
This project started as a personal exploration by [@maksutovic](https://github.com/maksutovic). I was working across multiple client projects, spending more time wrestling with prompts than building software. I knew Claude Code was capable of extraordinary work, but my *process* was holding it back. I was vibe coding
|
|
19
|
+
This project started as a personal exploration by [@maksutovic](https://github.com/maksutovic). I was working across multiple client projects, spending more time wrestling with prompts than building software. I knew Claude Code was capable of extraordinary work, but my *process* was holding it back. I was vibe coding - and vibe coding doesn't scale.
|
|
20
20
|
|
|
21
|
-
The spark was [Nate B Jones' video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ). It mapped out a progression I hadn't seen articulated before
|
|
21
|
+
The spark was [Nate B Jones' video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ). It mapped out a progression I hadn't seen articulated before - from "spicy autocomplete" to fully autonomous development - and lit my brain up to the potential of what Claude Code could do with the right harness around it. Joycraft is the result of that exploration: a tool that encodes the patterns, boundaries, and workflows that make AI-assisted development actually deterministic.
|
|
22
22
|
|
|
23
23
|
### The core idea
|
|
24
24
|
|
|
25
25
|
Joycraft is simple. It's a set of **skills** (slash commands for Claude Code) and **instructions** (CLAUDE.md boundaries) that guide you and your agent through a structured development process:
|
|
26
26
|
|
|
27
27
|
- **Levels 1-4:** Skills like `/joycraft-tune`, `/joycraft-new-feature`, and `/joycraft-interview` replace unstructured prompting with spec-driven development. You interview, you write specs, the agent executes. No back-and-forth.
|
|
28
|
-
- **Level 5:** The `/joycraft-implement-level5` skill sets up the autonomous loop
|
|
28
|
+
- **Level 5:** The `/joycraft-implement-level5` skill sets up the autonomous loop where specs go in and validated software comes out, with holdout scenario testing that prevents the agent from gaming its own tests.
|
|
29
29
|
|
|
30
|
-
StrongDM calls their Level 5 fully autonomous loop a "Dark Factory"
|
|
30
|
+
StrongDM calls their Level 5 fully autonomous loop a "Dark Factory". It's a cool name, but the world has so much darkness in it right now. I wanted a name that extolled more of what I believe tools like this can provide: joy and craftsmanship. Hence "Joycraft."
|
|
31
31
|
|
|
32
32
|
### What are the levels?
|
|
33
33
|
|
|
@@ -35,7 +35,7 @@ StrongDM calls their Level 5 fully autonomous loop a "Dark Factory" — which, a
|
|
|
35
35
|
|
|
36
36
|
| Level | Name | What it looks like | Joycraft's role |
|
|
37
37
|
|-------|------|--------------------|-----------------|
|
|
38
|
-
| 1 | Autocomplete | Tab-complete suggestions |
|
|
38
|
+
| 1 | Autocomplete | Tab-complete suggestions | - |
|
|
39
39
|
| 2 | Junior Developer | Prompt → iterate → fix → repeat | `/joycraft-tune` assesses where you are |
|
|
40
40
|
| 3 | Developer as Manager | Your life is reviewing diffs | Behavioral boundaries in CLAUDE.md |
|
|
41
41
|
| 4 | Developer as PM | You write specs, agent writes code | `/joycraft-new-feature` + `/joycraft-decompose` |
|
|
@@ -45,7 +45,7 @@ Most developers plateau at Level 2. Joycraft's job is to move you up.
|
|
|
45
45
|
|
|
46
46
|
### Platform support
|
|
47
47
|
|
|
48
|
-
Joycraft is currently focused on making the Claude Code experience state-of-the-art. Better [Codex](https://openai.com/codex) support is coming
|
|
48
|
+
Joycraft is currently focused on making the Claude Code experience state-of-the-art. Better [Codex](https://openai.com/codex) support is coming. `AGENTS.md` generation is already included, and deeper integration is on the roadmap.
|
|
49
49
|
|
|
50
50
|
## Quick Start
|
|
51
51
|
|
|
@@ -67,14 +67,16 @@ Joycraft auto-detects your tech stack and creates:
|
|
|
67
67
|
- **CLAUDE.md** with behavioral boundaries (Always / Ask First / Never) and correct build/test/lint commands
|
|
68
68
|
- **AGENTS.md** for Codex compatibility
|
|
69
69
|
- **Claude Code skills** installed to `.claude/skills/`:
|
|
70
|
-
- `/joycraft-tune`
|
|
71
|
-
- `/joycraft-new-feature`
|
|
72
|
-
- `/joycraft-interview`
|
|
73
|
-
- `/joycraft-decompose`
|
|
74
|
-
- `/joycraft-
|
|
75
|
-
- `/joycraft-
|
|
76
|
-
-
|
|
77
|
-
- **
|
|
70
|
+
- `/joycraft-tune` Assess your harness, apply upgrades, see your path to Level 5
|
|
71
|
+
- `/joycraft-new-feature` Interview → Feature Brief → Atomic Specs
|
|
72
|
+
- `/joycraft-interview` Lightweight brainstorm. Yap about ideas, get a structured summary
|
|
73
|
+
- `/joycraft-decompose` Break a brief into small, testable specs
|
|
74
|
+
- `/joycraft-add-fact` Capture project knowledge on the fly -- routes to the right context doc
|
|
75
|
+
- `/joycraft-session-end` Capture discoveries, verify, commit, push
|
|
76
|
+
- `/joycraft-implement-level5` Set up Level 5 (autofix loop, holdout scenarios, scenario evolution)
|
|
77
|
+
- **docs/** structure: `briefs/`, `specs/`, `discoveries/`, `contracts/`, `decisions/`, `context/`
|
|
78
|
+
- **Context documents** in `docs/context/`: production map, dangerous assumptions, decision log, institutional knowledge, and troubleshooting guide
|
|
79
|
+
- **Templates** including atomic spec, feature brief, implementation plan, boundary framework, and workflow templates for scenario generation and autofix loops
|
|
78
80
|
|
|
79
81
|
Once you reach Level 4, you can set up the autonomous loop with `/joycraft-implement-level5`. See [Level 5: The Autonomous Loop](#level-5-the-autonomous-loop) below.
|
|
80
82
|
|
|
@@ -90,11 +92,12 @@ After init, open Claude Code and use the installed skills:
|
|
|
90
92
|
|
|
91
93
|
```
|
|
92
94
|
/joycraft-tune # Assess your harness, apply upgrades, see path to Level 5
|
|
93
|
-
/joycraft-interview # Brainstorm freely
|
|
95
|
+
/joycraft-interview # Brainstorm freely, yap about ideas, get a structured summary
|
|
94
96
|
/joycraft-new-feature # Interview → Feature Brief → Atomic Specs → ready to execute
|
|
95
97
|
/joycraft-decompose # Break any feature into small, independent specs
|
|
96
|
-
/joycraft-
|
|
97
|
-
/joycraft-
|
|
98
|
+
/joycraft-add-fact # Capture a fact mid-session -- auto-routes to the right context doc
|
|
99
|
+
/joycraft-session-end # Wrap up: discoveries, verification, commit, push
|
|
100
|
+
/joycraft-implement-level5 # Set up Level 5 (autofix, holdout scenarios, evolution)
|
|
98
101
|
```
|
|
99
102
|
|
|
100
103
|
The core loop:
|
|
@@ -113,13 +116,13 @@ Joycraft flips this. Before the agent writes a single line of code, you have a c
|
|
|
113
116
|
|
|
114
117
|
### Two interview modes
|
|
115
118
|
|
|
116
|
-
**`/joycraft-interview`**
|
|
119
|
+
**`/joycraft-interview`** is the lightweight brainstorm. You yap about an idea, the agent asks clarifying questions, and you get a structured summary saved to `docs/briefs/`. Good for early-stage thinking when you're not ready to commit to building anything yet. No pressure, no specs, just organized thought.
|
|
117
120
|
|
|
118
|
-
**`/joycraft-new-feature`**
|
|
121
|
+
**`/joycraft-new-feature`** is the full workflow. This is the structured interview that produces a **Feature Brief** (the what and why) and then decomposes it into **Atomic Specs** (small, testable, independently executable units of work). Each spec is self-contained. An agent in a fresh session can pick it up and execute without reading anything else.
|
|
119
122
|
|
|
120
123
|
### Why this works
|
|
121
124
|
|
|
122
|
-
The insight comes from [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic): interview in one session, write the spec, then execute in a *fresh session* with clean context. The interview captures your intent. The spec is the contract. The execution session has only the spec
|
|
125
|
+
The insight comes from [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic): interview in one session, write the spec, then execute in a *fresh session* with clean context. The interview captures your intent. The spec is the contract. The execution session has only the spec. No baggage from the conversation, no accumulated misunderstandings, no context window full of abandoned approaches.
|
|
123
126
|
|
|
124
127
|
This is what separates Level 2 (back-and-forth prompting) from Level 4 (spec-driven development). You stop being a typist correcting an agent's guesses and start being a PM defining what needs to be built.
|
|
125
128
|
|
|
@@ -143,13 +146,13 @@ flowchart LR
|
|
|
143
146
|
|
|
144
147
|
An atomic spec produced by `/joycraft-decompose` has:
|
|
145
148
|
|
|
146
|
-
- **What
|
|
147
|
-
- **Why
|
|
148
|
-
- **Acceptance criteria
|
|
149
|
-
- **Affected files
|
|
150
|
-
- **Edge cases
|
|
149
|
+
- **What:** One paragraph. A developer with zero context understands the change in 15 seconds.
|
|
150
|
+
- **Why:** One sentence. What breaks or is missing without this?
|
|
151
|
+
- **Acceptance criteria:** Checkboxes. Testable. No ambiguity.
|
|
152
|
+
- **Affected files:** Exact paths, what changes in each.
|
|
153
|
+
- **Edge cases:** Table of scenarios and expected behavior.
|
|
151
154
|
|
|
152
|
-
The agent doesn't guess. It reads the spec and executes. If something's unclear, the spec is wrong
|
|
155
|
+
The agent doesn't guess. It reads the spec and executes. If something's unclear, the spec is wrong. Fix the spec, not the conversation.
|
|
153
156
|
|
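The five parts of an atomic spec listed above can be sketched as a data shape plus a completeness check. This is an illustrative sketch only: the `AtomicSpec` interface, its field names, and `lintSpec` are hypothetical and are not taken from Joycraft's actual templates.

```typescript
// Hypothetical shape of an atomic spec. Field names are illustrative,
// not Joycraft's real template format.
interface EdgeCase {
  scenario: string;
  expected: string;
}

interface AffectedFile {
  path: string;   // exact path
  change: string; // what changes in this file
}

interface AtomicSpec {
  what: string;                 // one paragraph
  why: string;                  // one sentence
  acceptanceCriteria: string[]; // testable checkboxes
  affectedFiles: AffectedFile[];
  edgeCases: EdgeCase[];
}

// Returns a list of problems. An empty list means every section is filled in.
function lintSpec(spec: AtomicSpec): string[] {
  const problems: string[] = [];
  if (spec.what.trim() === "") problems.push("What is empty");
  if (spec.why.trim() === "") problems.push("Why is empty");
  if (spec.acceptanceCriteria.length === 0) problems.push("No acceptance criteria");
  if (spec.affectedFiles.length === 0) problems.push("No affected files");
  if (spec.edgeCases.length === 0) problems.push("No edge cases");
  return problems;
}
```

A check like this only confirms that each section exists. Whether the criteria are actually testable and unambiguous still takes a human (or agent) review.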
|
154
157
|
## Upgrade
|
|
155
158
|
|
|
@@ -165,7 +168,7 @@ Joycraft tracks what it installed vs. what you've customized. Unmodified files u
|
|
|
165
168
|
|
|
166
169
|
## Level 5: The Autonomous Loop
|
|
167
170
|
|
|
168
|
-
> **A note on complexity:** Setting up Level 5 does have some moving parts and, depending on the complexity of your stack (software vs. hardware, monorepo vs. single app, etc.), this will require a good amount of prompting and trial-and-error to get right. I've done my best to make this as painless as possible, but just note
|
|
171
|
+
> **A note on complexity:** Setting up Level 5 does have some moving parts and, depending on the complexity of your stack (software vs. hardware, monorepo vs. single app, etc.), this will require a good amount of prompting and trial-and-error to get right. I've done my best to make this as painless as possible, but just note - this is not a one-shot-prompt-done-in-5-minutes kind of thing. For small projects and simple stacks it will be easy, but any level of complexity is going to take some iteration, so plan ahead. Full step-by-step guides along with a video coming soon.
|
|
169
172
|
|
|
170
173
|
Level 5 is where specs go in and validated software comes out. Joycraft implements this as four interlocking GitHub Actions workflows, a separate scenarios repository, and two independent AI agents that can never see each other's work.
|
|
171
174
|
|
|
@@ -177,7 +180,7 @@ npx joycraft init-autofix --scenarios-repo my-project-scenarios --app-id 3180156
|
|
|
177
180
|
|
|
178
181
|
### Architecture Overview
|
|
179
182
|
|
|
180
|
-
Level 5 has four moving parts. Each is a GitHub Actions workflow that communicates via `repository_dispatch` events
|
|
183
|
+
Level 5 has four moving parts. Each is a GitHub Actions workflow that communicates via `repository_dispatch` events. No custom servers, no webhooks, no external services.
|
|
181
184
|
|
|
182
185
|
```mermaid
|
|
183
186
|
graph TB
|
|
@@ -241,10 +244,10 @@ sequenceDiagram
|
|
|
241
244
|
```
|
|
242
245
|
|
|
243
246
|
**Key details:**
|
|
244
|
-
- Uses a GitHub App identity for pushes
|
|
245
|
-
- Concurrency group per PR
|
|
246
|
-
- Max 3 iterations
|
|
247
|
-
- No `--model` flag
|
|
247
|
+
- Uses a GitHub App identity for pushes to avoid GitHub's anti-recursion protection
|
|
248
|
+
- Concurrency group per PR so only one autofix runs at a time
|
|
249
|
+
- Max 3 iterations, then posts "human review needed"
|
|
250
|
+
- No `--model` flag. Claude CLI handles model selection.
|
|
248
251
|
- Strips ANSI escape codes from logs so Claude gets clean text
|
|
249
252
|
|
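Two of the key details above, the iteration cap and the ANSI stripping, can be sketched in a few lines. This is a minimal sketch under assumptions: the names `stripAnsi`, `nextStep`, and `MAX_ITERATIONS` are hypothetical, and the regular expression is a simplified pattern for common escape sequences, not necessarily what the generated workflow uses.

```typescript
// Matches ANSI CSI sequences such as colors ("\x1b[31m") and line clears ("\x1b[2K"):
// ESC, "[", parameter bytes, intermediate bytes, then one final byte.
const ANSI_CSI = /\x1b\[[0-?]*[ -\/]*[@-~]/g;

// Remove escape sequences so the log reads as plain text.
function stripAnsi(log: string): string {
  return log.replace(ANSI_CSI, "");
}

// Cap taken from "Max 3 iterations" in the list above.
const MAX_ITERATIONS = 3;

type NextStep = "run-autofix" | "human-review-needed";

// attemptsSoFar is the number of autofix attempts already made on this PR.
function nextStep(attemptsSoFar: number): NextStep {
  return attemptsSoFar < MAX_ITERATIONS ? "run-autofix" : "human-review-needed";
}
```

Stripping the codes before the log reaches the model keeps color and cursor-control bytes from cluttering the failure output.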
|
250
253
|
#### 2. Scenarios Dispatch Workflow (`scenarios-dispatch.yml`)
|
|
@@ -281,7 +284,7 @@ sequenceDiagram
|
|
|
281
284
|
SPD->>SR: repository_dispatch: spec-pushed<br/>payload: {spec_filename, spec_content, commit_sha, branch, repo}
|
|
282
285
|
end
|
|
283
286
|
|
|
284
|
-
Note over SPD: Deleted specs are ignored
|
|
287
|
+
Note over SPD: Deleted specs are ignored -<br/>existing scenario tests remain
|
|
285
288
|
```
|
|
286
289
|
|
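The `spec-pushed` event in the diagram above maps onto the request body of GitHub's repository dispatch endpoint (`POST /repos/{owner}/{repo}/dispatches`), which takes an `event_type` and a `client_payload`. The sketch below only builds that body. The payload field names come from the diagram, while the interface and function names are hypothetical and the example values are made up.

```typescript
// Payload fields as listed in the sequence diagram.
interface SpecPushedPayload {
  spec_filename: string;
  spec_content: string;
  commit_sha: string;
  branch: string;
  repo: string;
}

// Request body for GitHub's "create a repository dispatch event" endpoint.
interface DispatchBody {
  event_type: string;
  client_payload: SpecPushedPayload;
}

// Wrap a spec payload in the body shape the endpoint expects.
// Sending the request (and authenticating it) is out of scope for this sketch.
function buildSpecPushedDispatch(payload: SpecPushedPayload): DispatchBody {
  return { event_type: "spec-pushed", client_payload: payload };
}
```

The receiving workflow in the scenarios repo would then filter on the event type, for example with `on: repository_dispatch` and `types: [spec-pushed]`.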
|
287
290
|
#### 4. Scenarios Re-run Workflow (`scenarios-rerun.yml`)
|
|
@@ -306,7 +309,7 @@ sequenceDiagram
|
|
|
306
309
|
end
|
|
307
310
|
```
|
|
308
311
|
|
|
309
|
-
**Why this exists:** There's a race condition. The implementation agent might open a PR before the scenario agent finishes writing new tests. The re-run workflow handles this
|
|
312
|
+
**Why this exists:** There's a race condition. The implementation agent might open a PR before the scenario agent finishes writing new tests. The re-run workflow handles this by re-testing all open PRs when new tests land. Worst case, a PR merges before the re-run, and the new tests protect the very next PR. You're never more than one cycle behind.
|
|
310
313
|
|
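The re-run rule described above (re-test every open PR when new scenario tests land) comes down to a simple selection step. A minimal sketch, assuming a hypothetical `PullRequest` shape. The real workflow would get this list from the GitHub API.

```typescript
// Hypothetical, simplified view of a pull request.
interface PullRequest {
  number: number;
  state: "open" | "closed";
}

// Given all known PRs, return the numbers that need a scenario re-run
// after new tests are pushed to the scenarios repo.
function prsToRetest(prs: PullRequest[]): number[] {
  return prs
    .filter((pr) => pr.state === "open")
    .map((pr) => pr.number);
}
```

A PR that merged before the re-run is simply absent from the result. It is the next PR that picks up the new tests, which is the "one cycle behind" case.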
|
311
314
|
### The Holdout Wall
|
|
312
315
|
|
|
@@ -336,7 +339,7 @@ graph LR
|
|
|
336
339
|
style Specs fill:#cfc,stroke:#393
|
|
337
340
|
```
|
|
338
341
|
|
|
339
|
-
This is the same principle as a holdout set in machine learning. If the implementation agent could see the scenario tests, it would optimize to pass them specifically
|
|
342
|
+
This is the same principle as a holdout set in machine learning. If the implementation agent could see the scenario tests, it would optimize to pass them specifically instead of building correct software. By keeping the wall intact, scenario tests catch real behavioral regressions, not test-gaming.
|
|
340
343
|
|
|
341
344
|
### Scenario Evolution
|
|
342
345
|
|
|
@@ -348,7 +351,7 @@ flowchart TD
|
|
|
348
351
|
B --> C[Scenario Agent reads spec]
|
|
349
352
|
C --> D{Triage: is this user-facing?}
|
|
350
353
|
|
|
351
|
-
D -->|Internal refactor, CI, dev tooling| E[Skip
|
|
354
|
+
D -->|Internal refactor, CI, dev tooling| E[Skip - commit note: 'No scenario changes needed']
|
|
352
355
|
D -->|New user-facing behavior| F[Write new scenario test file]
|
|
353
356
|
D -->|Modified existing behavior| G[Update existing scenario tests]
|
|
354
357
|
|
|
@@ -433,11 +436,80 @@ sequenceDiagram
|
|
|
433
436
|
| Scenarios repo | `package.json` | Minimal vitest setup |
|
|
434
437
|
| Scenarios repo | `README.md` | Explains holdout pattern to contributors |
|
|
435
438
|
|
|
436
|
-
###
|
|
439
|
+
### Setup Guide
|
|
437
440
|
|
|
438
|
-
|
|
439
|
-
|
|
440
|
-
|
|
441
|
+
The fastest way: run `/joycraft-implement-level5` in Claude Code and it walks you through everything interactively. Or follow these steps manually:
|
|
442
|
+
|
|
443
|
+
#### Step 1: Create a GitHub App
|
|
444
|
+
|
|
445
|
+
The autofix workflow needs a GitHub App identity to push commits. GitHub blocks workflows from triggering other workflows with the default `GITHUB_TOKEN` -- a separate App identity solves this. Creating one takes about 2 minutes:
|
|
446
|
+
|
|
447
|
+
1. Go to https://github.com/settings/apps/new
|
|
448
|
+
2. Give it a name (e.g., "My Project Autofix")
|
|
449
|
+
3. Uncheck "Webhook > Active" (not needed)
|
|
450
|
+
4. Under **Repository permissions**, set:
|
|
451
|
+
- **Contents**: Read & Write
|
|
452
|
+
- **Pull requests**: Read & Write
|
|
453
|
+
- **Actions**: Read & Write
|
|
454
|
+
5. Click **Create GitHub App**
|
|
455
|
+
6. Note the **App ID** from the settings page (you'll need it in Step 2)
|
|
456
|
+
7. Scroll to **Private keys** > click **Generate a private key**
|
|
457
|
+
8. Save the downloaded `.pem` file -- you'll need it in Step 3
|
|
458
|
+
9. Click **Install App** in the left sidebar > install it on the repo(s) you want to use
|
|
459
|
+
|
|
460
|
+
> **Coming soon:** We're working on a shared Joycraft Autofix app that will reduce this to a single click. For now, creating your own app gives you full control and takes just a couple minutes.
|
|
461
|
+
|
|
462
|
+
#### Step 2: Run the CLI
|
|
463
|
+
|
|
464
|
+
```bash
|
|
465
|
+
npx joycraft init-autofix --scenarios-repo my-project-scenarios --app-id YOUR_APP_ID
|
|
466
|
+
```
|
|
467
|
+
|
|
468
|
+
Replace `YOUR_APP_ID` with the App ID from Step 1. This installs the four workflow files in your main repo and copies scenario templates to `docs/templates/scenarios/`.
|
|
469
|
+
|
|
470
|
+
#### Step 3: Add secrets to your main repo
|
|
471
|
+
|
|
472
|
+
Go to your repo's **Settings > Secrets and variables > Actions** and add:
|
|
473
|
+
|
|
474
|
+
| Secret | Value |
|
|
475
|
+
|--------|-------|
|
|
476
|
+
| `JOYCRAFT_APP_PRIVATE_KEY` | The full contents of the `.pem` file from Step 1 |
|
|
477
|
+
| `ANTHROPIC_API_KEY` | Your Anthropic API key (used by the autofix workflow to run Claude) |
|
|
478
|
+
|
|
479
|
+
#### Step 4: Create the scenarios repo
|
|
480
|
+
|
|
481
|
+
```bash
|
|
482
|
+
# Create a private repo for holdout tests
|
|
483
|
+
gh repo create my-project-scenarios --private
|
|
484
|
+
|
|
485
|
+
# Copy the scenario templates into it
|
|
486
|
+
cp -r docs/templates/scenarios/* ../my-project-scenarios/
|
|
487
|
+
cd ../my-project-scenarios
|
|
488
|
+
git add -A && git commit -m "init: scaffold scenarios repo from Joycraft"
|
|
489
|
+
git push
|
|
490
|
+
```
|
|
491
|
+
|
|
492
|
+
Then add the **same two secrets** (`JOYCRAFT_APP_PRIVATE_KEY` and `ANTHROPIC_API_KEY`) to the scenarios repo's Settings > Secrets.
|
|
493
|
+
|
|
494
|
+
#### Step 5: Verify
|
|
495
|
+
|
|
496
|
+
```bash
|
|
497
|
+
# Check workflow files exist in your main repo
|
|
498
|
+
ls .github/workflows/autofix.yml .github/workflows/scenarios-dispatch.yml \
|
|
499
|
+
.github/workflows/spec-dispatch.yml .github/workflows/scenarios-rerun.yml
|
|
500
|
+
|
|
501
|
+
# Check scenario templates in the scenarios repo
|
|
502
|
+
ls ../my-project-scenarios/workflows/run.yml ../my-project-scenarios/workflows/generate.yml \
|
|
503
|
+
../my-project-scenarios/prompts/scenario-agent.md ../my-project-scenarios/example-scenario.test.ts
|
|
504
|
+
```
|
|
505
|
+
|
|
506
|
+
#### Step 6: Test it
|
|
507
|
+
|
|
508
|
+
1. Push a spec to `docs/specs/` on main -- this triggers scenario generation in the scenarios repo
|
|
509
|
+
2. Open a PR with a small change -- when CI passes, scenarios run against the PR
|
|
510
|
+
3. Watch for the scenario test results posted as a PR comment
|
|
511
|
+
|
|
512
|
+
Or deliberately break something in a PR to test the autofix loop.
|
|
441
513
|
|
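The Step 5 check can also be expressed as a small function that reports which expected files are missing. The file paths below are the ones from the `ls` commands in Step 5. The function name `missingFiles` and the idea of passing in a list of present files are assumptions made for this sketch.

```typescript
// Workflow files expected in the main repo (from Step 5).
const MAIN_REPO_WORKFLOWS: string[] = [
  ".github/workflows/autofix.yml",
  ".github/workflows/scenarios-dispatch.yml",
  ".github/workflows/spec-dispatch.yml",
  ".github/workflows/scenarios-rerun.yml",
];

// Template files expected in the scenarios repo (from Step 5).
const SCENARIOS_REPO_FILES: string[] = [
  "workflows/run.yml",
  "workflows/generate.yml",
  "prompts/scenario-agent.md",
  "example-scenario.test.ts",
];

// Return every expected path that is not in the list of files actually present.
function missingFiles(expected: string[], present: string[]): string[] {
  return expected.filter((path) => present.indexOf(path) === -1);
}
```

An empty result for both lists means the scaffolding is in place. It says nothing about whether the secrets from Steps 3 and 4 were added.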
|
442
514
|
### Cost
|
|
443
515
|
|
|
@@ -454,12 +526,12 @@ When `/joycraft-tune` runs for the first time, it does two things:
|
|
|
454
526
|
|
|
455
527
|
### Risk interview
|
|
456
528
|
|
|
457
|
-
3-5 targeted questions about what's dangerous in your project
|
|
529
|
+
3-5 targeted questions about what's dangerous in your project (production databases, live APIs, secrets, files that should be off-limits). From your answers, Joycraft generates:
|
|
458
530
|
|
|
459
531
|
- **NEVER rules** for CLAUDE.md (e.g., "NEVER connect to production DB")
|
|
460
532
|
- **Deny patterns** for `.claude/settings.json` (blocks dangerous bash commands)
|
|
461
|
-
- **`docs/context/production-map.md`**
|
|
462
|
-
- **`docs/context/dangerous-assumptions.md`**
|
|
533
|
+
- **`docs/context/production-map.md`** documenting what's real vs. safe to touch
|
|
534
|
+
- **`docs/context/dangerous-assumptions.md`** documenting "Agent might assume X, but actually Y"
|
|
463
535
|
|
|
464
536
|
This takes 2-3 minutes and dramatically reduces the chance of your agent doing something catastrophic.
|
|
465
537
|
|
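As a rough illustration of how interview answers could turn into the first two outputs above, here is a sketch that maps each answer to a NEVER rule and a set of deny patterns. Everything in it is an assumption: the `RiskAnswer` shape and `generateRules` are hypothetical, and the `Bash(command:*)` pattern style is assumed rather than taken from Joycraft's output. Check the Claude Code settings documentation for the exact deny-rule syntax.

```typescript
// Hypothetical shape of one risk-interview answer.
interface RiskAnswer {
  never: string;             // e.g. "connect to production DB"
  blockedCommands: string[]; // bash command prefixes to deny, e.g. "psql"
}

interface GeneratedRules {
  neverRules: string[];   // lines intended for CLAUDE.md
  denyPatterns: string[]; // entries intended for .claude/settings.json
}

// Turn interview answers into rule text and deny patterns.
// The "Bash(<cmd>:*)" format is an assumed pattern style, not confirmed by the README.
function generateRules(answers: RiskAnswer[]): GeneratedRules {
  const neverRules: string[] = [];
  const denyPatterns: string[] = [];
  for (const answer of answers) {
    neverRules.push("NEVER " + answer.never);
    for (const cmd of answer.blockedCommands) {
      denyPatterns.push("Bash(" + cmd + ":*)");
    }
  }
  return { neverRules, denyPatterns };
}
```

The point of producing both outputs is defense in depth: the NEVER rule tells the agent what not to do, and the deny pattern blocks the command if the agent tries anyway.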
|
@@ -467,8 +539,8 @@ This takes 2-3 minutes and dramatically reduces the chance of your agent doing s
|
|
|
467
539
|
|
|
468
540
|
One question: **how autonomous should git be?**
|
|
469
541
|
|
|
470
|
-
- **Cautious** (default)
|
|
471
|
-
- **Autonomous**
|
|
542
|
+
- **Cautious** (default) commits freely but asks before pushing or opening PRs. Good for learning the workflow.
|
|
543
|
+
- **Autonomous** commits, pushes to feature branches, and opens PRs without asking. Good for spec-driven development where you want full send.
|
|
472
544
|
|
|
473
545
|
Either way, Joycraft generates explicit git boundaries in your CLAUDE.md: commit message format (`verb: message`), specific file staging (no `git add -A`), no secrets in commits, no force-pushing.
|
|
474
546
|
|
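The two git modes and the `verb: message` commit format can be sketched as a pair of small checks. This is an interpretation, not Joycraft's implementation: the names are hypothetical, the regular expression assumes "verb" means a single lowercase word, and the sketch ignores the detail that Autonomous mode pushes to feature branches only.

```typescript
type GitMode = "cautious" | "autonomous";
type GitAction = "commit" | "push" | "open-pr";

// Cautious: commits freely, asks before pushing or opening PRs.
// Autonomous: commits, pushes, and opens PRs without asking.
function needsApproval(mode: GitMode, action: GitAction): boolean {
  if (mode === "autonomous") return false;
  return action !== "commit";
}

// Assumed reading of `verb: message`: lowercase verb, colon, space, non-empty message.
const COMMIT_FORMAT = /^[a-z]+: \S.*$/;

function isValidCommitMessage(subject: string): boolean {
  return COMMIT_FORMAT.test(subject);
}
```

Under this reading, a subject such as `add: Joycraft harness` passes, while `Fixed stuff` does not.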
|
@@ -476,9 +548,9 @@ Either way, Joycraft generates explicit git boundaries in your CLAUDE.md: commit
|
|
|
476
548
|
|
|
477
549
|
**Claude Code** reads `CLAUDE.md` automatically and discovers skills in `.claude/skills/`. The behavioral boundaries guide every action. The skills provide structured workflows accessible via `/slash-commands`.
|
|
478
550
|
|
|
479
|
-
**Codex** reads `AGENTS.md
|
|
551
|
+
**Codex** reads `AGENTS.md`, which provides the same boundaries and commands in a concise format optimized for smaller context windows.
|
|
480
552
|
|
|
481
|
-
Both agents get the same guardrails and the same development workflow. Joycraft doesn't write your project code
|
|
553
|
+
Both agents get the same guardrails and the same development workflow. Joycraft doesn't write your project code. It builds the *system* that makes AI-assisted development reliable.
|
|
482
554
|
|
|
483
555
|
### Team Sharing
|
|
484
556
|
|
|
@@ -489,13 +561,13 @@ git add .claude/skills/ docs/
|
|
|
489
561
|
git commit -m "add: Joycraft harness"
|
|
490
562
|
```
|
|
491
563
|
|
|
492
|
-
Joycraft also installs a session-start hook that checks for updates
|
|
564
|
+
Joycraft also installs a session-start hook that checks for updates. If your templates are outdated, you'll see a one-line nudge when Claude Code starts.
|
|
493
565
|
|
|
494
566
|
## Why This Exists
|
|
495
567
|
|
|
496
|
-
Most developers using AI tools are at Level 2
|
|
568
|
+
Most developers using AI tools are at Level 2. They prompt, they iterate, they feel productive. But [METR's randomized controlled trial](https://metr.org/) found that experienced developers using AI tools actually completed tasks **19% slower**, while *believing* they were 24% faster. The problem isn't the tools. It's the absence of structure around them.
|
|
497
569
|
|
|
498
|
-
The teams seeing transformative results
|
|
570
|
+
The teams seeing transformative results ([StrongDM](https://factory.strongdm.ai/) shipping an entire product with 3 engineers, [Spotify Honk](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) merging 1,000 PRs every 10 days, Anthropic generating effectively 100% of their code with AI) all share the same pattern: **they don't prompt AI to write code. They write specs and let AI execute them.**
|
|
499
571
|
|
|
500
572
|
Joycraft packages that pattern into something anyone can install.
|
|
501
573
|
|
|
@@ -503,15 +575,15 @@ Joycraft packages that pattern into something anyone can install.
|
|
|
503
575
|
|
|
504
576
|
Joycraft's approach is synthesized from several sources:
|
|
505
577
|
|
|
506
|
-
**Spec-driven development.** Instead of prompting AI in conversation, you write structured specifications
|
|
578
|
+
**Spec-driven development.** Instead of prompting AI in conversation, you write structured specifications. Feature Briefs capture the *what* and *why*, then Atomic Specs break work into small, testable, independently executable units. Each spec is self-contained: an agent can pick it up without reading anything else. This follows [Addy Osmani's](https://addyosmani.com/blog/good-spec/) principles for AI-consumable specs and [GitHub's Spec Kit](https://github.blog/ai-and-ml/generative-ai/spec-driven-development-with-ai-get-started-with-a-new-open-source-toolkit/) 4-phase process (Specify → Plan → Tasks → Implement).
|
|
507
579
|
|
|
508
580
|
**Context isolation.** [Boris Cherny](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens) (Head of Claude Code at Anthropic) recommends: interview in one session, write the spec, then execute in a *fresh session* with clean context. Joycraft's `/joycraft-new-feature` → `/joycraft-decompose` → execute workflow enforces this naturally. The interview session captures intent; the execution session has only the spec.
|
|
509
581
|
|
|
510
|
-
**Behavioral boundaries.** CLAUDE.md isn't a suggestion box
|
|
582
|
+
**Behavioral boundaries.** CLAUDE.md isn't a suggestion box, it's a contract. Joycraft installs a three-tier boundary framework (Always / Ask First / Never) that prevents the most common AI development failures: overwriting user files, skipping tests, pushing without approval, hardcoding secrets. This is [Addy Osmani's](https://addyosmani.com/blog/good-spec/) "boundaries" principle made concrete.
|
|
511
583
|
|
|
512
|
-
**Knowledge capture over session notes.** Most session notes are never re-read. Joycraft's `/joycraft-session-end` skill captures only *discoveries
|
|
584
|
+
**Knowledge capture over session notes.** Most session notes are never re-read. Joycraft's `/joycraft-session-end` skill captures only *discoveries*: assumptions that were wrong, APIs that behaved unexpectedly, decisions made during implementation that aren't in the spec. If nothing surprising happened, you capture nothing. This keeps the signal-to-noise ratio high.
|
|
513
585
|
|
|
514
|
-
**External holdout scenarios.** [StrongDM's Software Factory](https://factory.strongdm.ai/) proved that AI agents will [actively game visible test suites](https://palisaderesearch.org/blog/specification-gaming). Their solution: scenarios that live *outside* the codebase, invisible to the agent during development. Like a holdout set in ML, this prevents overfitting. Joycraft now implements this directly
|
|
586
|
+
**External holdout scenarios.** [StrongDM's Software Factory](https://factory.strongdm.ai/) proved that AI agents will [actively game visible test suites](https://palisaderesearch.org/blog/specification-gaming). Their solution: scenarios that live *outside* the codebase, invisible to the agent during development. Like a holdout set in ML, this prevents overfitting. Joycraft now implements this directly. `init-autofix` sets up the holdout wall, the scenario agent, and the GitHub App integration.
|
|
515
587
|
|
|
516
588
|
**The 5-level framework.** [Dan Shapiro's levels](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) give you a map. Level 2 (Junior Developer) is where most teams plateau. Level 3 (Developer as Manager) means your life is diffs. Level 4 (Developer as PM) means you write specs, not code. Level 5 (Dark Factory) means specs in, software out. Joycraft's `/joycraft-tune` assessment tells you where you are and what to do next.
|
|
517
589
|
|
|
@@ -519,14 +591,14 @@ Joycraft's approach is synthesized from several sources:
|
|
|
519
591
|
|
|
520
592
|
Joycraft synthesizes ideas and patterns from people doing extraordinary work in AI-assisted software development:
|
|
521
593
|
|
|
522
|
-
- **[Dan Shapiro](https://x.com/danshapiro)**
|
|
523
|
-
- **[StrongDM](https://www.strongdm.com/)** / **[Justin McCarthy](https://x.com/BuiltByJustin)**
|
|
524
|
-
- **[Boris Cherny](https://x.com/bcherny)
|
|
525
|
-
- **[Addy Osmani](https://x.com/addyosmani)**
|
|
526
|
-
- **[METR](https://metr.org/)**
|
|
527
|
-
- **[Nate B Jones](https://x.com/natebjones)**
|
|
528
|
-
- **[Simon Willison](https://x.com/simonw)**
|
|
529
|
-
- **[Anthropic](https://www.anthropic.com/)**
|
|
594
|
+
- **[Dan Shapiro](https://x.com/danshapiro)** for the [5 Levels of Vibe Coding](https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/) framework that Joycraft's assessment and level system is built on
|
|
595
|
+
- **[StrongDM](https://www.strongdm.com/)** / **[Justin McCarthy](https://x.com/BuiltByJustin)** for the [Software Factory](https://factory.strongdm.ai/): spec-driven autonomous development, NLSpec, external holdout scenarios, and the proof that 3 engineers can outproduce 30
|
|
596
|
+
- **[Boris Cherny](https://x.com/bcherny)**, Head of Claude Code at Anthropic, for the interview → spec → fresh session → execute pattern and the insight that [context isolation produces better results](https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens)
|
|
597
|
+
- **[Addy Osmani](https://x.com/addyosmani)** for [What makes a good spec for AI](https://addyosmani.com/blog/good-spec/): commands, testing, project structure, code style, git workflow, and boundaries
|
|
598
|
+
- **[METR](https://metr.org/)** for the [randomized controlled trial](https://metr.org/) that found unstructured AI use made experienced developers slower, validating the need for harnesses
|
|
599
|
+
- **[Nate B Jones](https://x.com/natebjones)** whose [video on the 5 Levels of Vibe Coding](https://www.youtube.com/watch?v=bDcgHzCBgmQ) made this research accessible and inspired turning Joycraft into a tool anyone can use
|
|
600
|
+
- **[Simon Willison](https://x.com/simonw)** for his [analysis of the Software Factory](https://simonwillison.net/2026/Feb/7/software-factory/) that helped contextualize StrongDM's approach for the broader community
|
|
601
|
+
- **[Anthropic](https://www.anthropic.com/)** for Claude Code's skills, hooks, and CLAUDE.md system that makes tool-native AI development possible, and the [harness patterns for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)
|
|
530
602
|
|
|
531
603
|
## Contributing
|
|
532
604
|
|
|
@@ -538,10 +610,10 @@ The short version:
|
|
|
538
610
|
2. `pnpm install && pnpm test --run` to verify your setup
|
|
539
611
|
3. Write tests first, then implement
|
|
540
612
|
4. `pnpm test --run && pnpm typecheck && pnpm build`
|
|
541
|
-
5. Open a PR
|
|
613
|
+
5. Open a PR (one approval required)
|
|
542
614
|
|
|
543
615
|
Look for [`good first issue`](https://github.com/maksutovic/joycraft/labels/good%20first%20issue) labels if you're new. Areas we'd especially love help with: stack detection for new languages, skill improvements, documentation, and Codex integration.
|
|
544
616
|
|
|
545
617
|
## License
|
|
546
618
|
|
|
547
|
-
MIT
|
|
619
|
+
MIT. See [LICENSE](LICENSE) for details.
|