odd-studio 3.5.0 → 3.6.0
- package/.claude-plugin/plugin.json +5 -1
- package/README.md +28 -1
- package/bin/commands/status.js +8 -0
- package/bin/commands/upgrade.js +10 -0
- package/bin/odd-studio.js +1 -1
- package/codex-plugin/.codex-plugin/plugin.json +1 -1
- package/codex-plugin/hooks.json +16 -0
- package/hooks/odd-studio.sh +93 -0
- package/package.json +1 -1
- package/plugins/plugin-gates.js +34 -3
- package/plugins/plugin-quality-checks.js +20 -0
- package/scripts/command-definitions.js +5 -0
- package/scripts/scaffold-project.js +3 -2
- package/scripts/setup-hooks.js +4 -0
- package/scripts/state-schema.js +48 -0
- package/skill/SKILL.md +86 -9
- package/skill/docs/build/build-protocol.md +34 -0
- package/skill/docs/build/code-excellence.md +37 -1
- package/skill/docs/build/debug-protocol.md +141 -0
- package/skill/docs/chapters/chapter-10.md +4 -4
- package/skill/docs/planning/build-planner.md +32 -9
- package/skill/odd-debug/SKILL.md +60 -0
- package/templates/.odd/state.json +11 -1
- package/templates/AGENTS.md +16 -1
- package/templates/CLAUDE.md +27 -0
package/skill/SKILL.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: "odd"
|
|
3
|
-
version: "3.
|
|
3
|
+
version: "3.6.0"
|
|
4
4
|
description: "Outcome-Driven Development planning and build coach. Use /odd to start or resume an ODD project — building personas, writing outcomes, mapping contracts, creating a Master Implementation Plan, and directing a odd-flow-powered build. Designed for domain experts who are not developers. Works with Claude Code, OpenCode, and Codex."
|
|
5
5
|
metadata:
|
|
6
6
|
priority: 10
|
|
@@ -23,6 +23,7 @@ metadata:
|
|
|
23
23
|
- "begin odd"
|
|
24
24
|
- "resume odd"
|
|
25
25
|
- "continue odd"
|
|
26
|
+
- "odd debug"
|
|
26
27
|
- "odd studio"
|
|
27
28
|
- "outcome-driven development"
|
|
28
29
|
- "odd status"
|
|
@@ -33,6 +34,7 @@ metadata:
|
|
|
33
34
|
allOf:
|
|
34
35
|
- [odd, status]
|
|
35
36
|
- [odd, build]
|
|
37
|
+
- [odd, debug]
|
|
36
38
|
- [odd, plan]
|
|
37
39
|
- [outcome, driven]
|
|
38
40
|
anyOf:
|
|
@@ -42,6 +44,7 @@ metadata:
|
|
|
42
44
|
- "outcome"
|
|
43
45
|
- "contract map"
|
|
44
46
|
- "phase brief"
|
|
47
|
+
- "debug"
|
|
45
48
|
noneOf: []
|
|
46
49
|
minScore: 5
|
|
47
50
|
retrieval:
|
|
@@ -96,7 +99,7 @@ Display this when no existing state is found:
|
|
|
96
99
|
|
|
97
100
|
---
|
|
98
101
|
|
|
99
|
-
Welcome to ODD Studio v3.
|
|
102
|
+
Welcome to ODD Studio v3.6.0.
|
|
100
103
|
|
|
101
104
|
You are about to plan and build something real — using a methodology called Outcome-Driven Development. Before we write a single line of code, we are going to get precise about three things:
|
|
102
105
|
|
|
@@ -120,7 +123,7 @@ Display this when existing state is found. Replace the bracketed values with act
|
|
|
120
123
|
|
|
121
124
|
---
|
|
122
125
|
|
|
123
|
-
Welcome back to ODD Studio v3.
|
|
126
|
+
Welcome back to ODD Studio v3.6.0.
|
|
124
127
|
|
|
125
128
|
**Project:** [project.name]
|
|
126
129
|
**Current Phase:** [state.currentPhase]
|
|
@@ -136,7 +139,7 @@ Welcome back to ODD Studio v3.5.0.
|
|
|
136
139
|
|
|
137
140
|
**What's next:** [state.nextStep]
|
|
138
141
|
|
|
139
|
-
Type `*plan` to continue planning, `*build` to enter build mode, or `*status` for full detail.
|
|
142
|
+
Type `*plan` to continue planning, `*build` to enter build mode, `*debug` to investigate a failing outcome without leaving the ODD flow, or `*status` for full detail.
|
|
140
143
|
|
|
141
144
|
---
|
|
142
145
|
|
|
@@ -219,6 +222,36 @@ Enter build mode. This command runs the following checks in order before beginni
|
|
|
219
222
|
- Do NOT run the brief generation and build agents "in parallel" — the brief MUST be confirmed BEFORE any build work begins
|
|
220
223
|
- This is a hard sequential gate. There are no exceptions.
|
|
221
224
|
|
|
225
|
+
### `*debug`
|
|
226
|
+
|
|
227
|
+
Enter controlled debug mode for the current outcome.
|
|
228
|
+
|
|
229
|
+
This command must keep the work inside the ODD flow. It is not a free-form detour.
|
|
230
|
+
|
|
231
|
+
Execute these steps in order:
|
|
232
|
+
|
|
233
|
+
1. Read `.odd/state.json` and confirm `currentPhase` is `"build"`. If not, explain that debugging only exists inside build work and route back to `*build`.
|
|
234
|
+
2. Read the latest failure in domain language from the current conversation and identify the active outcome.
|
|
235
|
+
3. Read `docs/build/debug-protocol.md` and choose exactly one debug strategy before inspecting code:
|
|
236
|
+
- `ui-behaviour`
|
|
237
|
+
- `full-stack`
|
|
238
|
+
- `auth-security`
|
|
239
|
+
- `integration-contract`
|
|
240
|
+
- `background-process`
|
|
241
|
+
- `performance-state`
|
|
242
|
+
4. Update `.odd/state.json`:
|
|
243
|
+
- set `buildMode` to `"debug"`
|
|
244
|
+
- set `verificationConfirmed` to `false`
|
|
245
|
+
- set `debugStartedAt` to the current timestamp
|
|
246
|
+
- set `debugStrategy`, `debugTarget`, and `debugSummary`
|
|
247
|
+
5. Call `mcp__odd-flow__memory_store` with key `odd-project-state`, namespace `odd-project`, value set to the full updated `.odd/state.json`
|
|
248
|
+
6. Run the investigation and fix strictly according to the chosen strategy. Do not guess. Do not apply quick fixes. Reproduce first, identify the failing boundary, then fix.
|
|
249
|
+
7. When the fix is ready, update `.odd/state.json` again:
|
|
250
|
+
- set `buildMode` to `"verify"`
|
|
251
|
+
- keep `debugStrategy`, `debugTarget`, and `debugSummary` as the latest resolved context
|
|
252
|
+
8. Call `mcp__odd-flow__memory_store` again with the full updated `.odd/state.json`
|
|
253
|
+
9. Return to the verification walkthrough from step one. A debug session ends only when verification passes.
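The two state updates in steps 4 and 7 can be sketched as pure functions. This is a minimal sketch, not ODD Studio's actual implementation: the `OddState` shape, the function names `enterDebug` and `exitDebugToVerify`, and the ISO timestamp format are assumptions made for illustration. Only the field names and values come from the steps above.

```typescript
// The six strategy names listed in step 3.
type DebugStrategy =
  | "ui-behaviour"
  | "full-stack"
  | "auth-security"
  | "integration-contract"
  | "background-process"
  | "performance-state";

// Hypothetical shape: only the fields named in the *debug steps are modelled.
interface OddState {
  currentPhase: string;
  buildMode?: string;
  verificationConfirmed?: boolean;
  debugStartedAt?: string;
  debugStrategy?: DebugStrategy;
  debugTarget?: string;
  debugSummary?: string;
}

// Steps 1 and 4: refuse to debug outside build work, otherwise enter debug mode.
function enterDebug(
  state: OddState,
  strategy: DebugStrategy,
  target: string,
  summary: string,
  now: Date = new Date(),
): OddState {
  if (state.currentPhase !== "build") {
    throw new Error("Debugging only exists inside build work. Use *build first.");
  }
  return {
    ...state,
    buildMode: "debug",
    verificationConfirmed: false,
    debugStartedAt: now.toISOString(), // timestamp format is an assumption
    debugStrategy: strategy,
    debugTarget: target,
    debugSummary: summary,
  };
}

// Step 7: hand the work back to verification, keeping the debug context.
function exitDebugToVerify(state: OddState, resolvedSummary: string): OddState {
  return { ...state, buildMode: "verify", debugSummary: resolvedSummary };
}
```

Returning a new object rather than mutating the input keeps the "full updated state" that steps 5 and 8 store to odd-flow easy to serialise in one piece.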
|
|
254
|
+
|
|
222
255
|
**If the brief exists but `briefConfirmed` is not true in state.json:**
|
|
223
256
|
- Present it to the domain expert: "Session Brief [N] exists. Review it at docs/session-brief-[N].md and confirm before we build."
|
|
224
257
|
- Wait for confirmation, then set `briefConfirmed: true` in `.odd/state.json`
|
|
@@ -508,6 +541,7 @@ You can use either format:
|
|
|
508
541
|
|---|---|---|
|
|
509
542
|
| `*plan` | `/odd-plan` | Continue from where you left off in planning |
|
|
510
543
|
| `*build` | `/odd-build` | Enter build mode and initialise odd-flow swarm |
|
|
544
|
+
| `*debug` | `/odd-debug` | Keep debugging inside the active outcome and force an explicit debug strategy before fixing |
|
|
511
545
|
| `*status` | `/odd-status` | Show full project state and progress |
|
|
512
546
|
| `*swarm` | `/odd-swarm` | Build all independent outcomes in the current phase simultaneously |
|
|
513
547
|
| `*deploy` | `/odd-deploy` | Deploy the current verified build to production |
|
|
@@ -566,11 +600,54 @@ Enforce this sequence — do not proceed to a later step without the earlier one
|
|
|
566
600
|
Run when `*build` is called and `servicesConfigured` is false.
|
|
567
601
|
|
|
568
602
|
1. **Scaffold.** If `package.json` exists, skip to step 2. If not: `create-next-app` rejects non-empty directories — scaffold into a sibling dir (`${PROJECT_DIR}-scaffold`) then rsync across excluding `.git`, `docs/`, `node_modules/`. Fix `package.json name` after rsync. Tell user they can delete the sibling dir.
|
|
569
|
-
2. **Install deps.** `
|
|
570
|
-
|
|
571
|
-
|
|
572
|
-
|
|
573
|
-
|
|
603
|
+
2. **Install deps.** Read `testingFramework` from `.odd/state.json` (default "Vitest"). Install the chosen testing stack:
|
|
604
|
+
- **Vitest (default):** `npm install --save-dev vitest @testing-library/react @vitejs/plugin-react @testing-library/jest-dom jsdom`
|
|
605
|
+
- **Jest:** `npm install --save-dev jest @testing-library/react @testing-library/jest-dom ts-jest @types/jest jest-environment-jsdom`
|
|
606
|
+
- **Playwright:** `npm install --save-dev @playwright/test` then `npx playwright install`
|
|
607
|
+
- Also install production deps: `npm install drizzle-orm drizzle-kit`
|
|
608
|
+
3. **Scaffold test harness.** Read `testingFramework` from `.odd/state.json` and scaffold the appropriate config. For **Vitest** (the default):
|
|
609
|
+
- Create `vitest.config.ts`:
|
|
610
|
+
```typescript
|
|
611
|
+
import { defineConfig } from "vitest/config"
|
|
612
|
+
import react from "@vitejs/plugin-react"
|
|
613
|
+
import path from "path"
|
|
614
|
+
|
|
615
|
+
export default defineConfig({
|
|
616
|
+
plugins: [react()],
|
|
617
|
+
test: {
|
|
618
|
+
environment: "jsdom",
|
|
619
|
+
globals: true,
|
|
620
|
+
setupFiles: ["./tests/setup.ts"],
|
|
621
|
+
include: ["tests/**/*.test.{ts,tsx}"],
|
|
622
|
+
},
|
|
623
|
+
resolve: {
|
|
624
|
+
alias: {
|
|
625
|
+
"@": path.resolve(__dirname, "."),
|
|
626
|
+
},
|
|
627
|
+
},
|
|
628
|
+
})
|
|
629
|
+
```
|
|
630
|
+
- Create `tests/setup.ts`:
|
|
631
|
+
```typescript
|
|
632
|
+
import "@testing-library/jest-dom/vitest"
|
|
633
|
+
```
|
|
634
|
+
- Create `tests/setup.test.ts` (smoke test):
|
|
635
|
+
```typescript
|
|
636
|
+
import { describe, it, expect } from "vitest"
|
|
637
|
+
|
|
638
|
+
describe("vitest setup", () => {
|
|
639
|
+
it("runs", () => {
|
|
640
|
+
expect(true).toBe(true)
|
|
641
|
+
})
|
|
642
|
+
})
|
|
643
|
+
```
|
|
644
|
+
- Add scripts to `package.json`: `"test": "vitest run"` and `"test:watch": "vitest"`
|
|
645
|
+
- Run `npm test` to confirm the harness works. If the smoke test fails, diagnose and fix before proceeding.
|
|
646
|
+
- Display: "Test harness configured. `npm test` runs the suite. `npm run test:watch` runs in watch mode."
|
|
647
|
+
4. **Generate `.env.local`.** Write a placeholder file with every credential the chosen stack needs. Each line must have a comment pointing to exactly where to find the real value in the service dashboard. Include a note: never commit this file, use test keys for payment services.
|
|
648
|
+
5. **Wait.** Display the credential list. Wait for the user to confirm they've filled everything in.
|
|
649
|
+
6. **Verify.** Kill port 3000 (`lsof -ti:3000 | xargs kill 2>/dev/null || true`), run `npm run dev`. Translate any connection errors into plain language. Repeat until server starts cleanly.
|
|
650
|
+
7. **Mark done.** Set `servicesConfigured: true` in `.odd/state.json`. Confirm: "All services connected. Development server running at http://localhost:3000. Test harness verified."
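The framework lookup in step 2 can be sketched as a small table. This is an illustrative sketch only: the `installCommands` helper, the `TestingFramework` type, and the stored value `"Playwright"` are assumptions, as is the fallback to Vitest for an unrecognised value. The command strings are copied from step 2.

```typescript
type TestingFramework = "Vitest" | "Jest" | "Playwright";

// Dev-dependency commands copied from step 2; the lookup table itself is hypothetical.
const DEV_INSTALL: Record<TestingFramework, string[]> = {
  Vitest: [
    "npm install --save-dev vitest @testing-library/react @vitejs/plugin-react @testing-library/jest-dom jsdom",
  ],
  Jest: [
    "npm install --save-dev jest @testing-library/react @testing-library/jest-dom ts-jest @types/jest jest-environment-jsdom",
  ],
  Playwright: ["npm install --save-dev @playwright/test", "npx playwright install"],
};

// Narrow an arbitrary string read from state.json to a known framework name.
function isFramework(value: string): value is TestingFramework {
  return Object.prototype.hasOwnProperty.call(DEV_INSTALL, value);
}

// Return the commands to run, in order, for the recorded framework.
function installCommands(testingFramework?: string): string[] {
  // Missing field defaults to Vitest (per step 2); unknown values also fall back (assumption).
  const framework: TestingFramework =
    testingFramework !== undefined && isFramework(testingFramework)
      ? testingFramework
      : "Vitest";
  return [...DEV_INSTALL[framework], "npm install drizzle-orm drizzle-kit"];
}
```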
|
|
574
651
|
|
|
575
652
|
---
|
|
576
653
|
|
|
@@ -32,6 +32,40 @@ The domain expert does not re-brief the AI, paste context, identify shared infra
|
|
|
32
32
|
|
|
33
33
|
---
|
|
34
34
|
|
|
35
|
+
### Step 2b — Test
|
|
36
|
+
|
|
37
|
+
After the build completes and before verification begins, the build agent runs the test suite automatically.
|
|
38
|
+
|
|
39
|
+
**What the build agent tests:**
|
|
40
|
+
|
|
41
|
+
Every outcome produces code. Some of that code is pure logic — functions that take inputs and return outputs without touching databases, APIs, or the browser. These functions MUST have tests written alongside the implementation. The build agent writes tests for:
|
|
42
|
+
|
|
43
|
+
- **Business rules** — access control, pricing, eligibility, classification logic
|
|
44
|
+
- **Data transformations** — formatting, aggregation, filtering, sorting
|
|
45
|
+
- **Validation** — input parsing, CSV import, form validation, regex matching
|
|
46
|
+
- **Calculations** — mastery scores, scheduling, priority ordering, time-based logic
|
|
47
|
+
- **Safety-critical logic** — safeguarding detection, content filtering, concern routing
|
|
48
|
+
|
|
49
|
+
**What is NOT tested at this stage:**
|
|
50
|
+
|
|
51
|
+
- Database queries (tested via verification walkthrough)
|
|
52
|
+
- UI rendering (tested via verification walkthrough)
|
|
53
|
+
- External API calls (tested via verification walkthrough)
|
|
54
|
+
- LLM prompt/response cycles (tested via verification walkthrough)
|
|
55
|
+
|
|
56
|
+
**The test gate:**
|
|
57
|
+
|
|
58
|
+
After the build completes, run `npm test`. If any tests fail:
|
|
59
|
+
1. The build agent fixes the failures immediately — do not proceed to verification with failing tests
|
|
60
|
+
2. Re-run `npm test` until all tests pass
|
|
61
|
+
3. Display to the domain expert: "All [n] tests passing. Ready for verification."
|
|
62
|
+
|
|
63
|
+
If no testable pure logic was produced by this outcome (e.g., a purely UI outcome), display: "No new business logic tests required for this outcome. Ready for verification."
|
|
64
|
+
|
|
65
|
+
Tests are committed alongside the implementation code. They live in `tests/` mirroring the source structure. Test files must never be deleted — they are regression guards for every future outcome.
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
35
69
|
### Step 3 — Verify
|
|
36
70
|
|
|
37
71
|
When ODD Studio reports the build is complete, the verification checklist is on screen.
|
|
@@ -191,6 +191,40 @@ If a build agent finds itself exceeding these limits, it stops and restructures
|
|
|
191
191
|
|
|
192
192
|
---
|
|
193
193
|
|
|
194
|
+
## Testing Standard
|
|
195
|
+
|
|
196
|
+
Every pure-logic module — any function that takes inputs and returns outputs without side effects — MUST have a corresponding test file in `tests/` that mirrors the source path. If the source is `lib/psm/mastery.ts`, the test is `tests/lib/psm/mastery.test.ts`.
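The mirroring rule can be sketched as a one-line path mapping. The helper name `testPathFor` is hypothetical, and handling `.tsx` sources is an assumption that follows from the `tests/**/*.test.{ts,tsx}` include pattern in the Vitest config.

```typescript
// Map a source file to its mirrored test file under tests/.
function testPathFor(sourcePath: string): string {
  // Drop a leading "./" so the result is always rooted at tests/.
  const normalised = sourcePath.replace(/^\.\//, "");
  // Insert ".test" before the .ts or .tsx extension.
  const withSuffix = normalised.replace(/\.(ts|tsx)$/, ".test.$1");
  return `tests/${withSuffix}`;
}
```

A check like this could be used to flag a new pure-logic module whose mirrored test file does not exist yet.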
|
|
197
|
+
|
|
198
|
+
### What MUST be tested
|
|
199
|
+
|
|
200
|
+
- **Business rules:** access control checks, eligibility logic, classification functions, routing decisions
|
|
201
|
+
- **Data transformations:** formatting, aggregation, filtering, score calculations
|
|
202
|
+
- **Parsing and validation:** CSV import, regex patterns, input sanitisation, form validation
|
|
203
|
+
- **Safety-critical logic:** safeguarding keyword detection, content filtering, concern classification and routing
|
|
204
|
+
- **State machines:** plant growth levels, unlock gates, engagement level thresholds
|
|
205
|
+
|
|
206
|
+
### What is NOT unit tested
|
|
207
|
+
|
|
208
|
+
- Database queries — these are tested via the verification walkthrough
|
|
209
|
+
- React components — these are tested via the verification walkthrough and design verification
|
|
210
|
+
- LLM calls — these are tested via the verification walkthrough
|
|
211
|
+
- External API integrations — these are tested via the verification walkthrough
|
|
212
|
+
|
|
213
|
+
### Test quality rules
|
|
214
|
+
|
|
215
|
+
- Test the behaviour, not the implementation. Test what the function returns, not how it computes it.
|
|
216
|
+
- Use `it.each` for data-driven tests with multiple inputs against the same assertion.
|
|
217
|
+
- Use `vi.useFakeTimers()` for time-dependent logic. Clean up with `vi.useRealTimers()`.
|
|
218
|
+
- Use `vi.stubEnv()` for environment variable tests. Clean up with `vi.unstubAllEnvs()`.
|
|
219
|
+
- No mocks for things that can be tested directly. If a function is pure, test it directly.
|
|
220
|
+
- Every test file must pass independently — no shared state between test files.
|
|
221
|
+
|
|
222
|
+
### The testing gate
|
|
223
|
+
|
|
224
|
+
`npm test` runs before every verification walkthrough. Failing tests block verification. The domain expert never sees a system with broken business logic.
|
|
225
|
+
|
|
226
|
+
---
|
|
227
|
+
|
|
194
228
|
## How This Standard Is Enforced
|
|
195
229
|
|
|
196
230
|
1. **At build time.** The build agent reads this document before writing any code. It applies the Design-It-Twice protocol internally and outputs only the minimal version.
|
|
@@ -199,4 +233,6 @@ If a build agent finds itself exceeding these limits, it stops and restructures
|
|
|
199
233
|
|
|
200
234
|
3. **At verification time.** When the domain expert verifies an outcome, the code behind it has already been through two passes. The domain expert does not review code — but the code they are relying on is clean, minimal, and maintainable.
|
|
201
235
|
|
|
202
|
-
4. **At
|
|
236
|
+
4. **At test time.** `npm test` runs after every build and before every verification. Failing tests block verification. New pure-logic functions without corresponding test files are flagged.
|
|
237
|
+
|
|
238
|
+
5. **At refactor time.** If an outcome is rebuilt after a verification failure, the rebuild starts from scratch against this standard — it does not patch the previous attempt. Existing tests must still pass after the rebuild.
|
|
@@ -0,0 +1,141 @@
|
|
|
1
|
+
# ODD Debug Protocol
|
|
2
|
+
|
|
3
|
+
Debugging does not sit outside Outcome-Driven Development. It is a controlled sub-mode of the current build.
|
|
4
|
+
|
|
5
|
+
When something fails during verification or during an in-progress build, use `*debug`. Do not abandon the active outcome. Do not start free-form fixing. Do not guess.
|
|
6
|
+
|
|
7
|
+
## Purpose
|
|
8
|
+
|
|
9
|
+
`*debug` exists to keep failure analysis inside the ODD flow:
|
|
10
|
+
|
|
11
|
+
- The failing outcome remains the active unit of work
|
|
12
|
+
- The investigation approach is chosen deliberately, not guessed
|
|
13
|
+
- The fix stays tied to the outcome walkthrough and contracts
|
|
14
|
+
- The work returns to verification when the defect is resolved
|
|
15
|
+
|
|
16
|
+
## Entry Conditions
|
|
17
|
+
|
|
18
|
+
Before debugging:
|
|
19
|
+
|
|
20
|
+
1. Read `.odd/state.json`
|
|
21
|
+
2. Confirm `currentPhase` is `"build"`
|
|
22
|
+
3. Identify the active outcome and the latest failure in domain language
|
|
23
|
+
4. Set these fields in `.odd/state.json`:
|
|
24
|
+
- `buildMode: "debug"`
|
|
25
|
+
- `verificationConfirmed: false`
|
|
26
|
+
- `debugStartedAt: <timestamp>`
|
|
27
|
+
- `debugSummary: <one-sentence failure in domain language>`
|
|
28
|
+
5. Store the updated state to odd-flow with key `odd-project-state`
|
|
29
|
+
|
|
30
|
+
Debugging must never mark an outcome verified, complete, or committed.
|
|
31
|
+
|
|
32
|
+
## Strategy Selection
|
|
33
|
+
|
|
34
|
+
Choose exactly one debug strategy before inspecting code. State the chosen strategy and the reason.
|
|
35
|
+
|
|
36
|
+
Use this routing rule so the coding tool does not guess:
|
|
37
|
+
|
|
38
|
+
- Choose `ui-behaviour` when the problem is visible in the interface and you do not yet have evidence of a backend or data fault
|
|
39
|
+
- Choose `full-stack` when the failure crosses a user action, server boundary, and persisted state
|
|
40
|
+
- Choose `auth-security` when access, identity, trust, or validation boundaries might be wrong
|
|
41
|
+
- Choose `integration-contract` when one part of the system expects data or sequencing another part does not produce
|
|
42
|
+
- Choose `background-process` when the failure depends on async handoff, jobs, retries, or event delivery
|
|
43
|
+
- Choose `performance-state` when the issue depends on timing, staleness, cache invalidation, or repeated actions
|
|
44
|
+
|
|
45
|
+
If more than one strategy seems plausible, do not fix anything yet. Gather one more piece of evidence, then choose the narrowest strategy that still explains the failure.
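The routing rule above can be sketched as a lookup from evidence to candidate strategies. This is a sketch under stated assumptions: the `FailureEvidence` flags, the `route` function, and the `Routing` result type are hypothetical names, and treating "no rule matched" the same as "several matched" is an assumption. Picking the narrowest strategy is left to the person or agent once more evidence is in.

```typescript
type DebugStrategy =
  | "ui-behaviour"
  | "full-stack"
  | "auth-security"
  | "integration-contract"
  | "background-process"
  | "performance-state";

// Hypothetical evidence flags, one per routing rule.
interface FailureEvidence {
  visibleInInterfaceOnly: boolean;
  crossesActionServerAndPersistedState: boolean;
  accessOrTrustBoundarySuspect: boolean;
  producerConsumerMismatch: boolean;
  dependsOnAsyncHandoff: boolean;
  dependsOnTimingOrStaleness: boolean;
}

type Routing =
  | { kind: "chosen"; strategy: DebugStrategy }
  | { kind: "gather-more-evidence"; candidates: DebugStrategy[] };

// Collect every strategy whose rule matches the evidence so far.
function candidateStrategies(e: FailureEvidence): DebugStrategy[] {
  const rules: Array<[boolean, DebugStrategy]> = [
    [e.visibleInInterfaceOnly, "ui-behaviour"],
    [e.crossesActionServerAndPersistedState, "full-stack"],
    [e.accessOrTrustBoundarySuspect, "auth-security"],
    [e.producerConsumerMismatch, "integration-contract"],
    [e.dependsOnAsyncHandoff, "background-process"],
    [e.dependsOnTimingOrStaleness, "performance-state"],
  ];
  return rules.filter(([matches]) => matches).map(([, strategy]) => strategy);
}

// Exactly one match is a decision; anything else means do not fix yet.
function route(e: FailureEvidence): Routing {
  const [only, ...rest] = candidateStrategies(e);
  if (only !== undefined && rest.length === 0) {
    return { kind: "chosen", strategy: only };
  }
  return { kind: "gather-more-evidence", candidates: candidateStrategies(e) };
}
```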
|
|
46
|
+
|
|
47
|
+
### 1. `ui-behaviour`
|
|
48
|
+
|
|
49
|
+
Use when the failure is visible in the interface only:
|
|
50
|
+
- layout is wrong
|
|
51
|
+
- a button does nothing
|
|
52
|
+
- a view does not update
|
|
53
|
+
- a message or validation state is missing
|
|
54
|
+
|
|
55
|
+
Approach:
|
|
56
|
+
- reproduce in browser first
|
|
57
|
+
- inspect the rendered path backwards to the triggering action
|
|
58
|
+
- verify whether the contract is correct and the rendering is wrong
|
|
59
|
+
|
|
60
|
+
### 2. `full-stack`
|
|
61
|
+
|
|
62
|
+
Use when the failure spans browser, route, service, and data:
|
|
63
|
+
- a form submits but the result is missing
|
|
64
|
+
- a saved change does not appear
|
|
65
|
+
- a payment or enrolment looks complete but data is inconsistent
|
|
66
|
+
|
|
67
|
+
Approach:
|
|
68
|
+
- trace the full request path
|
|
69
|
+
- identify the first boundary where expected data diverges
|
|
70
|
+
- fix the smallest broken boundary, not the symptom
|
|
71
|
+
|
|
72
|
+
### 3. `auth-security`
|
|
73
|
+
|
|
74
|
+
Use when the defect touches access, trust, or sensitive behaviour:
|
|
75
|
+
- the wrong person can see or do something
|
|
76
|
+
- a protected route is open
|
|
77
|
+
- a webhook or upload path behaves unsafely
|
|
78
|
+
- a session, role, or permission check is wrong
|
|
79
|
+
|
|
80
|
+
Approach:
|
|
81
|
+
- verify actor, boundary, and expected restriction first
|
|
82
|
+
- inspect authentication, authorisation, validation, and side-effect points in order
|
|
83
|
+
- prefer the fix that narrows access and restores explicit checks
|
|
84
|
+
|
|
85
|
+
### 4. `integration-contract`
|
|
86
|
+
|
|
87
|
+
Use when two outcomes disagree about shared data or sequencing:
|
|
88
|
+
- one screen expects data another workflow never produces
|
|
89
|
+
- a downstream step fails because an upstream assumption changed
|
|
90
|
+
|
|
91
|
+
Approach:
|
|
92
|
+
- inspect the contract map and the active outcome contracts
|
|
93
|
+
- find the first producer/consumer mismatch
|
|
94
|
+
- fix the contract implementation or update the outcome if the specification is wrong
|
|
95
|
+
|
|
96
|
+
### 5. `background-process`
|
|
97
|
+
|
|
98
|
+
Use when the failure depends on queues, jobs, webhooks, scheduled work, or async delivery.
|
|
99
|
+
|
|
100
|
+
Approach:
|
|
101
|
+
- identify the triggering event
|
|
102
|
+
- confirm the worker/task started
|
|
103
|
+
- inspect the persisted state before and after the async boundary
|
|
104
|
+
- fix the handoff, retry, or idempotency break
|
|
105
|
+
|
|
106
|
+
### 6. `performance-state`
|
|
107
|
+
|
|
108
|
+
Use when the issue is stale data, race conditions, repeated actions, caching, or timing-sensitive state.
|
|
109
|
+
|
|
110
|
+
Approach:
|
|
111
|
+
- reproduce twice
|
|
112
|
+
- confirm whether the fault is deterministic or timing-sensitive
|
|
113
|
+
- inspect cache/state invalidation boundaries before changing business logic
|
|
114
|
+
|
|
115
|
+
## Non-Negotiable Rules
|
|
116
|
+
|
|
117
|
+
- Never use “quick fix” or “patch it” reasoning
|
|
118
|
+
- Never change multiple layers at once before reproducing the fault
|
|
119
|
+
- Never skip the reproduction step
|
|
120
|
+
- Never jump to a fix before naming the failing boundary
|
|
121
|
+
- Never broaden the strategy after starting unless new evidence proves the original classification wrong
|
|
122
|
+
- Never leave `buildMode: "debug"` active after the fix is complete
|
|
123
|
+
|
|
124
|
+
## Fix Protocol
|
|
125
|
+
|
|
126
|
+
After choosing the strategy:
|
|
127
|
+
|
|
128
|
+
1. Reproduce the failure
|
|
129
|
+
2. Name the failing boundary
|
|
130
|
+
3. Inspect only the layers required by the chosen strategy
|
|
131
|
+
4. Apply the smallest fix that restores the specified behaviour
|
|
132
|
+
5. Run the relevant automated checks
|
|
133
|
+
6. Set these fields in `.odd/state.json`:
|
|
134
|
+
- `buildMode: "verify"`
|
|
135
|
+
- `debugStrategy: <chosen strategy>`
|
|
136
|
+
- `debugTarget: <affected outcome/surface>`
|
|
137
|
+
- `debugSummary: <resolved failure summary>`
|
|
138
|
+
7. Store the updated state to odd-flow with key `odd-project-state`
|
|
139
|
+
8. Return to the verification walkthrough from step one
|
|
140
|
+
|
|
141
|
+
If the investigation reveals that the specification is wrong, stop debugging and update the outcome instead. That is not a bug fix. That is a specification correction.
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Chapter 10: The Build Protocol
|
|
2
2
|
|
|
3
|
-
ODD Studio handles all mechanics — context loading, contract validation, re-briefing, committing. You do three things: type /odd, type *build, verify the result. The tool handles the rest.
|
|
3
|
+
ODD Studio handles all mechanics — context loading, contract validation, re-briefing, committing, and controlled debugging. You do three things: type /odd, type *build, verify the result. If verification fails, type *debug. The tool handles the rest.
|
|
4
4
|
|
|
5
5
|
The Build Protocol is the repeating rhythm of every session. It is deliberately simple because the complexity belongs in the specification, not in the process. If the specification is precise, the build protocol is almost mechanical. If the build goes wrong, the cause is almost always in the specification, not in the process.
|
|
6
6
|
|
|
@@ -8,7 +8,7 @@ The Build Protocol is the repeating rhythm of every session. It is deliberately
|
|
|
8
8
|
|
|
9
9
|
- **The tool handles the mechanics. You handle the judgment.** ODD Studio loads context from odd-flow, reads the contract map, identifies the next outcome to build, briefs the AI, waits for the result, and presents you with a verification checklist. Your job is to follow that checklist as the persona and judge whether the result is correct.
|
|
10
10
|
|
|
11
|
-
- **The session rhythm is: /odd, *build, verify, confirm.** `/odd` loads the skill and restores project state from odd-flow. `*build` starts the next outcome. You verify the result against the checklist. `confirm` runs Checkpoint (a security scan of what was just built), commits the verified outcome, and advances to the next one. That is it.
|
|
11
|
+
- **The session rhythm is: /odd, *build, verify, confirm. If verification fails: *debug, then verify again.** `/odd` loads the skill and restores project state from odd-flow. `*build` starts the next outcome. You verify the result against the checklist. If it fails, `*debug` classifies the failure and keeps the fix inside the active outcome. `confirm` runs Checkpoint (a security scan of what was just built), commits the verified outcome, and advances to the next one. That is it.
|
|
12
12
|
|
|
13
13
|
- **Re-briefing is automatic.** You do not need to remind the AI what your project is, what has been built, or what comes next. odd-flow stores all of this. ODD Studio reads it at the start of every session. If you find yourself explaining context, something is wrong with the state — not with the process.
|
|
14
14
|
|
|
@@ -22,12 +22,12 @@ The Build Protocol is the repeating rhythm of every session. It is deliberately
|
|
|
22
22
|
|
|
23
23
|
- Building without verifying. Typing `confirm` without following the verification checklist is the fastest way to accumulate hidden defects. Every unverified outcome is a risk to every outcome that depends on it.
|
|
24
24
|
|
|
25
|
-
- Worrying about security implementation. Checkpoint runs automatically
|
|
25
|
+
- Worrying about security implementation. Checkpoint runs automatically before verification is completed and before commit. It scans for exposed secrets, missing authentication checks, insecure session shortcuts, and dangerous rendering or transport shortcuts in what was just built. If it finds something, the build returns to controlled debugging instead of drifting into ad hoc fixes.
|
|
26
26
|
|
|
27
27
|
- Batching multiple outcomes into one build. Each outcome has its own verification checklist for a reason. Mixing them makes it impossible to know which outcome caused a failure.
|
|
28
28
|
|
|
29
29
|
## What This Means for You
|
|
30
30
|
|
|
31
|
-
Your next session: type `/odd`. Read what ODD Studio tells you about where you are. Type `*build`. Follow the checklist. If it passes, type `confirm`. If it fails, describe the failure in your own words — not in technical terms
|
|
31
|
+
Your next session: type `/odd`. Read what ODD Studio tells you about where you are. Type `*build`. Follow the checklist. If it passes, type `confirm`. If it fails, describe the failure in your own words — not in technical terms — then type `*debug`. ODD Studio classifies the failure, keeps the work inside the active outcome, and routes the fix back into verification.
|
|
32
32
|
|
|
33
33
|
Next: Chapter 11 explains why verification is your job and no tool can replace your judgment.
|
|
@@ -267,19 +267,42 @@ If the domain expert raises a concern:
|
|
|
267
267
|
- Always tie reasoning to the specification. "This fits because Outcome 2.1 requires real-time updates for 90 students" — not "This is popular."
|
|
268
268
|
- If a previous choice constrains the next (e.g., choosing Supabase for database means Supabase Auth is available as an auth option), mention it as context but do not force it.
|
|
269
269
|
|
|
270
|
-
**Phase 3:
|
|
270
|
+
**Phase 3: Fixed Layer and Testing Decision**
|
|
271
271
|
|
|
272
|
-
After all choices are made, explain the
|
|
272
|
+
After all choices are made, explain the fixed ORM layer:
|
|
273
273
|
|
|
274
274
|
**Drizzle ORM** — the database layer that keeps the AI honest.
|
|
275
275
|
|
|
276
276
|
"Drizzle is the tool that ensures the build agents always know the exact shape of your data. Every field, every type, every relationship lives in your codebase as versioned migrations. When something goes wrong, we can reverse the last change precisely — the same way git lets us reverse code changes. Without Drizzle, agents are guessing about your database. With it, they know."
|
|
277
277
|
|
|
278
|
-
|
|
278
|
+
Drizzle is not negotiable. It exists because the build agents need it, not because of preference.
|
|
279
279
|
|
|
280
|
-
|
|
280
|
+
**Then present the testing decision.**
|
|
281
281
|
|
|
282
|
-
|
|
282
|
+
"Now we choose your testing framework. Automated tests run the business rules and calculations you cannot verify by clicking — access control logic, pricing calculations, workflow state transitions. Every outcome built triggers the test suite automatically. If a rule breaks because of a change somewhere else, the tests catch it before you reach the verification step."
|
|
283
|
+
|
|
284
|
+
Present the testing options:
|
|
285
|
+
|
|
286
|
+
**Decision: Testing Framework**
|
|
287
|
+
|
|
288
|
+
**Vitest** (recommended)
|
|
289
|
+
- What it is: A fast, modern test runner built for the JavaScript/TypeScript ecosystem, with native ESM support and a jsdom browser environment for component testing.
|
|
290
|
+
- Why it fits: Vitest understands your project's TypeScript and path aliases out of the box. It runs in under 2 seconds for most test suites. It includes everything needed — assertions, mocking, fake timers, coverage — with zero extra configuration.
|
|
291
|
+
- Trade-off: None significant. This is the default because it works best with the ODD build process.
|
|
292
|
+
|
|
293
|
+
**Jest**
|
|
294
|
+
- What it is: The most widely-used JavaScript test runner. Battle-tested, enormous ecosystem.
|
|
295
|
+
- Why it fits: If your team already uses Jest and has existing test patterns, consistency may matter more than speed.
|
|
296
|
+
- Trade-off: Slower than Vitest, requires additional configuration for TypeScript and ESM, heavier setup.
|
|
297
|
+
|
|
298
|
+
**Playwright Test** (for E2E-only projects)
|
|
299
|
+
- What it is: A browser-based test runner. Tests run against a real browser, not jsdom.
|
|
300
|
+
- Why it fits: If your project is almost entirely UI with minimal business logic, browser-level testing may be more valuable than unit tests.
|
|
301
|
+
- Trade-off: Much slower per test, requires a running dev server, not suitable for testing pure business logic in isolation.
|
|
302
|
+
|
|
303
|
+
"Which testing framework do you prefer? Vitest is the default because it integrates cleanly with the build process — but if you have a strong preference, we will use it."
|
|
304
|
+
|
|
305
|
+
If the domain expert has no preference or chooses Vitest, proceed with Vitest. Record the choice.
|
|
283
306
|
|
|
284
307
|
**Phase 4: Summarise and Confirm**
|
|
285
308
|
|
|
@@ -292,7 +315,7 @@ After all decisions are made, present the complete stack as a summary:
|
|
|
292
315
|
- **ORM**: Drizzle (fixed — build agent requirement)
|
|
293
316
|
- **Auth**: [chosen] — because [reason from their decision]
|
|
294
317
|
- **Hosting**: [chosen] — because [reason from their decision]
|
|
295
|
-
- **Testing**: Vitest
|
|
318
|
+
- **Testing**: [chosen — default Vitest] — because [reason]
|
|
296
319
|
- [Any specialist services]: [chosen] — because [reason from their decision]
|
|
297
320
|
|
|
298
321
|
Does this look right? Any second thoughts before I record it?"
|
|
@@ -313,7 +336,7 @@ Append a technical decisions section to `CLAUDE.md`:
|
|
|
313
336
|
- Database: [chosen]
|
|
314
337
|
- ORM: Drizzle
|
|
315
338
|
- Auth: [chosen]
|
|
316
|
-
- Testing: Vitest
|
|
339
|
+
- Testing: [chosen — default Vitest]
|
|
317
340
|
- Hosting: [chosen]
|
|
318
341
|
- [Other services]: [chosen]
|
|
319
342
|
|
|
@@ -323,7 +346,7 @@ Append a technical decisions section to `CLAUDE.md`:
|
|
|
323
346
|
- Auth: [why, with reference to specific outcome or persona]
|
|
324
347
|
- Hosting: [why, with reference to specific outcome or persona]
|
|
325
348
|
- Drizzle: type-safe database layer with versioned migrations — build agents always know the exact shape of the data and every change is tracked alongside code changes
|
|
326
|
-
-
|
|
349
|
+
- Testing: [chosen framework — default Vitest] — automated testing for invisible business rules — catches regressions before verification
|
|
327
350
|
|
|
328
351
|
**Alternatives considered (per layer):**
|
|
329
352
|
- Framework: [rejected options and why — specific constraint from the specification]
|
|
@@ -346,7 +369,7 @@ Call `mcp__odd-flow__memory_store`:
|
|
|
346
369
|
- Set `techStackDecided: true`
|
|
347
370
|
- Set `techStack` to the chosen stack description (e.g., "Next.js 16 + Supabase + NextAuth + Vercel")
|
|
348
371
|
- Set `orm` to "Drizzle"
|
|
349
|
-
- Set `testingFramework` to "Vitest"
|
|
372
|
+
- Set `testingFramework` to the chosen testing framework (default "Vitest")
|
|
350
373
|
- Update `nextStep` to "Choose the design approach, then generate architecture and design system documents"
|
|
351
374
|
|
|
352
375
|
Confirm to the user: "Technical stack chosen and recorded. Every build agent will read this before writing a line of code."
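The change above replaces the hard-coded `Vitest` value with the domain expert's choice, falling back to Vitest when no preference is given. A minimal sketch of that fallback, assuming a hypothetical helper named `resolveTestingFramework` and a hypothetical `recordStack` wrapper (neither appears in the package; only the field names `techStackDecided`, `orm`, and `testingFramework` come from the diff):

```typescript
// Hypothetical helpers for illustration only; not part of odd-studio.
const DEFAULT_TESTING_FRAMEWORK = "Vitest";

// Returns the framework to record. A missing or blank answer means
// "no preference", which the protocol treats as choosing Vitest.
function resolveTestingFramework(answer?: string | null): string {
  const trimmed = (answer ?? "").trim();
  return trimmed === "" ? DEFAULT_TESTING_FRAMEWORK : trimmed;
}

// Only the three state fields named in the diff are modelled here.
interface StackDecision {
  techStackDecided: boolean;
  orm: string;
  testingFramework: string;
}

function recordStack(answer?: string | null): StackDecision {
  return {
    techStackDecided: true,
    orm: "Drizzle", // fixed by the protocol, not a user choice
    testingFramework: resolveTestingFramework(answer),
  };
}
```

Under these assumptions, `recordStack("Jest")` records Jest, while `recordStack()` and `recordStack("  ")` both record Vitest.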
|
|
@@ -0,0 +1,60 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "odd-debug"
|
|
3
|
+
version: "1.0.0"
|
|
4
|
+
description: "ODD Studio debug command. Keeps debugging inside the active outcome, selects the correct debug strategy, and routes back to verification instead of drifting outside the ODD flow."
|
|
5
|
+
metadata:
|
|
6
|
+
priority: 9
|
|
7
|
+
pathPatterns:
|
|
8
|
+
- '.odd/state.json'
|
|
9
|
+
- 'docs/plan.md'
|
|
10
|
+
- 'docs/outcomes/**'
|
|
11
|
+
- 'docs/session-brief*.md'
|
|
12
|
+
promptSignals:
|
|
13
|
+
phrases:
|
|
14
|
+
- "odd debug"
|
|
15
|
+
- "start odd debug"
|
|
16
|
+
- "continue odd debug"
|
|
17
|
+
- "resume odd debug"
|
|
18
|
+
- "debug this in odd"
|
|
19
|
+
allOf:
|
|
20
|
+
- [odd, debug]
|
|
21
|
+
anyOf:
|
|
22
|
+
- "debug"
|
|
23
|
+
- "fix"
|
|
24
|
+
- "broken"
|
|
25
|
+
- "verification failed"
|
|
26
|
+
- "regression"
|
|
27
|
+
noneOf: []
|
|
28
|
+
minScore: 5
|
|
29
|
+
retrieval:
|
|
30
|
+
aliases:
|
|
31
|
+
- odd debug
|
|
32
|
+
- debug with odd
|
|
33
|
+
intents:
|
|
34
|
+
- start odd debug
|
|
35
|
+
- continue odd debug
|
|
36
|
+
entities:
|
|
37
|
+
- debug strategy
|
|
38
|
+
- failing outcome
|
|
39
|
+
- verification failure
|
|
40
|
+
---
|
|
41
|
+
|
|
42
|
+
# /odd-debug
|
|
43
|
+
|
|
44
|
+
You are executing the ODD Studio `*debug` command.
|
|
45
|
+
|
|
46
|
+
Read these two files now:
|
|
47
|
+
1. `.claude/skills/odd/SKILL.md` — the full ODD Studio coach and build protocol
|
|
48
|
+
2. `.claude/skills/odd/docs/build/debug-protocol.md` — the Debug Protocol detail
|
|
49
|
+
|
|
50
|
+
Then execute the `*debug` protocol exactly as documented in those files, starting from the state check and selecting the explicit debug strategy before any fix is attempted.
|
|
51
|
+
|
|
52
|
+
You must classify the failure into exactly one debug strategy before reading implementation details:
|
|
53
|
+
- `ui-behaviour`
|
|
54
|
+
- `full-stack`
|
|
55
|
+
- `auth-security`
|
|
56
|
+
- `integration-contract`
|
|
57
|
+
- `background-process`
|
|
58
|
+
- `performance-state`
|
|
59
|
+
|
|
60
|
+
If the correct strategy is not yet clear, gather evidence first. Never guess. Never jump straight to a fix.
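The "exactly one strategy, never guess" rule can be sketched as a small type guard plus a selector that refuses to pick when the evidence is ambiguous. The six strategy names are taken from the list above; `isDebugStrategy` and `selectStrategy` are hypothetical names used only for illustration, and returning `null` to mean "gather more evidence" is an assumption.

```typescript
// Strategy names copied from the skill file; helper names are hypothetical.
const DEBUG_STRATEGIES = [
  "ui-behaviour",
  "full-stack",
  "auth-security",
  "integration-contract",
  "background-process",
  "performance-state",
] as const;

type DebugStrategy = (typeof DEBUG_STRATEGIES)[number];

// Narrows an arbitrary value to one of the six known strategy names.
function isDebugStrategy(value: unknown): value is DebugStrategy {
  return (
    typeof value === "string" &&
    (DEBUG_STRATEGIES as readonly string[]).indexOf(value) !== -1
  );
}

// Returns a strategy only when exactly one valid candidate remains.
// Zero or several candidates yield null: gather evidence, do not guess.
function selectStrategy(candidates: readonly string[]): DebugStrategy | null {
  const valid = candidates.filter(isDebugStrategy);
  const [only] = valid;
  return valid.length === 1 && only !== undefined ? only : null;
}
```

With this sketch, `selectStrategy(["auth-security"])` returns `"auth-security"`, while an empty list or two competing candidates return `null`.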
|
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
{
|
|
2
|
-
"version": "2.
|
|
2
|
+
"version": "2.1.0",
|
|
3
3
|
"projectName": "{{PROJECT_NAME}}",
|
|
4
4
|
"initialisedAt": null,
|
|
5
5
|
"lastSaved": null,
|
|
@@ -14,6 +14,7 @@
|
|
|
14
14
|
"planApproved": false,
|
|
15
15
|
"techStackDecided": false,
|
|
16
16
|
"designApproachDecided": false,
|
|
17
|
+
"architectureDocGenerated": false,
|
|
17
18
|
"servicesConfigured": false,
|
|
18
19
|
"sessionBriefExported": false,
|
|
19
20
|
"sessionBriefCount": 0,
|
|
@@ -22,5 +23,14 @@
|
|
|
22
23
|
"swarmActive": false,
|
|
23
24
|
"buildPhase": null,
|
|
24
25
|
"currentBuildPhase": null,
|
|
26
|
+
"buildMode": "idle",
|
|
27
|
+
"debugStrategy": null,
|
|
28
|
+
"debugTarget": null,
|
|
29
|
+
"debugSummary": null,
|
|
30
|
+
"debugStartedAt": null,
|
|
31
|
+
"checkpointStatus": "unknown",
|
|
32
|
+
"lastCheckpointAt": null,
|
|
33
|
+
"checkpointFindings": 0,
|
|
34
|
+
"securityBaselineVersion": "2026-04-12",
|
|
25
35
|
"notes": ""
|
|
26
36
|
}
|
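For readers tracking the new state fields, the additions to `templates/.odd/state.json` can be sketched as a TypeScript shape. The field names and default values are copied from the diff above; the value types, the `"debug"` mode value, the ISO timestamp format, and the `startDebug` helper are all assumptions made for illustration, since the diff only shows the template defaults.

```typescript
// Field names and defaults come from the template diff. Types are inferred
// from those defaults and may be narrower in the real schema.
interface DebugStateFields {
  buildMode: string;
  debugStrategy: string | null;
  debugTarget: string | null;
  debugSummary: string | null;
  debugStartedAt: string | null;
  checkpointStatus: string;
  lastCheckpointAt: string | null;
  checkpointFindings: number;
  securityBaselineVersion: string;
}

const templateDefaults: DebugStateFields = {
  buildMode: "idle",
  debugStrategy: null,
  debugTarget: null,
  debugSummary: null,
  debugStartedAt: null,
  checkpointStatus: "unknown",
  lastCheckpointAt: null,
  checkpointFindings: 0,
  securityBaselineVersion: "2026-04-12",
};

// Hypothetical transition: returns a new state object marked as debugging.
// The "debug" mode value and ISO timestamp are assumptions, not package facts.
function startDebug(
  state: DebugStateFields,
  strategy: string,
  target: string,
  now: Date,
): DebugStateFields {
  return {
    ...state,
    buildMode: "debug",
    debugStrategy: strategy,
    debugTarget: target,
    debugStartedAt: now.toISOString(),
  };
}
```

The helper copies the state rather than mutating it, so the template defaults stay reusable for new projects.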
package/templates/AGENTS.md
CHANGED
|
@@ -6,7 +6,7 @@
|
|
|
6
6
|
This project uses ODD Studio for planning and building. To activate the ODD coach:
|
|
7
7
|
|
|
8
8
|
**In Codex:** Type `use ODD`, `start ODD`, or `begin ODD` to start.
|
|
9
|
-
Use `ODD status` to check the current state before resuming,
|
|
9
|
+
Use `ODD status` to check the current state before resuming, `ODD build` to continue the build flow, or `ODD debug` to investigate a failing outcome without leaving ODD.
|
|
10
10
|
|
|
11
11
|
**In OpenCode:** Type `/odd` to start.
|
|
12
12
|
|
|
@@ -124,6 +124,21 @@ export function canAccess(user: User): boolean {
|
|
|
124
124
|
### Security Baseline
|
|
125
125
|
- No hardcoded secrets, API keys, or credentials — use environment variables
|
|
126
126
|
- Validate user input at system boundaries
|
|
127
|
+
- Authenticate and authorise every protected route, action, webhook, and admin surface
|
|
128
|
+
- Verify webhooks, uploads, and third-party callbacks before trusting payloads
|
|
129
|
+
- Use secure session defaults — no localStorage auth/session tokens, no JWT-by-default shortcuts
|
|
130
|
+
- Rate-limit auth, admin, upload, payment, and public write surfaces
|
|
131
|
+
- Record audit trails for admin and security-sensitive actions
|
|
132
|
+
- Never disable TLS, CSRF, origin, or certificate verification in production code
|
|
133
|
+
- Treat any security scan finding as release-blocking until fixed
|
|
134
|
+
|
|
135
|
+
## Debugging Inside ODD
|
|
136
|
+
- Use `ODD debug` or `*debug` when verification fails or a build breaks
|
|
137
|
+
- Debugging stays inside the current outcome — it is not a free-form detour
|
|
138
|
+
- Choose an explicit debug strategy before touching code: `ui-behaviour`, `full-stack`, `auth-security`, `integration-contract`, `background-process`, or `performance-state`
|
|
139
|
+
- Reproduce first, identify the failing boundary second, fix third
|
|
140
|
+
- Never apply a “quick fix” without naming the failing boundary
|
|
141
|
+
- After a fix, return to the verification walkthrough from step one
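The ordering rule in the list above (reproduce, then identify the failing boundary, then fix, then re-verify) can be sketched as a guard that blocks a step until every earlier step is complete. The step identifiers and the `canStart` function are hypothetical; the package may enforce this ordering differently or only through the written protocol.

```typescript
// Hypothetical step identifiers mirroring the order described in the list.
const DEBUG_STEPS = [
  "reproduce",
  "identify-boundary",
  "fix",
  "verify-from-step-one",
] as const;

type DebugStep = (typeof DEBUG_STEPS)[number];

// A step may start only when all steps before it have been completed.
function canStart(step: DebugStep, completed: readonly DebugStep[]): boolean {
  const index = DEBUG_STEPS.indexOf(step);
  return DEBUG_STEPS.slice(0, index).every(
    (earlier) => completed.indexOf(earlier) !== -1,
  );
}
```

Under this sketch, `canStart("fix", ["reproduce"])` is `false` because the failing boundary has not been named yet.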
|
|
127
142
|
|
|
128
143
|
## UI Standards (Every UI Outcome)
|
|
129
144
|
- Use shadcn/ui components as the default component library
|