odd-studio 3.5.1 → 3.6.0

package/skill/SKILL.md CHANGED
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: "odd"
3
- version: "3.5.1"
3
+ version: "3.6.0"
4
4
  description: "Outcome-Driven Development planning and build coach. Use /odd to start or resume an ODD project — building personas, writing outcomes, mapping contracts, creating a Master Implementation Plan, and directing an odd-flow-powered build. Designed for domain experts who are not developers. Works with Claude Code, OpenCode, and Codex."
5
5
  metadata:
6
6
  priority: 10
@@ -23,6 +23,7 @@ metadata:
23
23
  - "begin odd"
24
24
  - "resume odd"
25
25
  - "continue odd"
26
+ - "odd debug"
26
27
  - "odd studio"
27
28
  - "outcome-driven development"
28
29
  - "odd status"
@@ -33,6 +34,7 @@ metadata:
33
34
  allOf:
34
35
  - [odd, status]
35
36
  - [odd, build]
37
+ - [odd, debug]
36
38
  - [odd, plan]
37
39
  - [outcome, driven]
38
40
  anyOf:
@@ -42,6 +44,7 @@ metadata:
42
44
  - "outcome"
43
45
  - "contract map"
44
46
  - "phase brief"
47
+ - "debug"
45
48
  noneOf: []
46
49
  minScore: 5
47
50
  retrieval:
@@ -96,7 +99,7 @@ Display this when no existing state is found:
96
99
 
97
100
  ---
98
101
 
99
- Welcome to ODD Studio v3.5.1.
102
+ Welcome to ODD Studio v3.6.0.
100
103
 
101
104
  You are about to plan and build something real — using a methodology called Outcome-Driven Development. Before we write a single line of code, we are going to get precise about three things:
102
105
 
@@ -120,7 +123,7 @@ Display this when existing state is found. Replace the bracketed values with act
120
123
 
121
124
  ---
122
125
 
123
- Welcome back to ODD Studio v3.5.1.
126
+ Welcome back to ODD Studio v3.6.0.
124
127
 
125
128
  **Project:** [project.name]
126
129
  **Current Phase:** [state.currentPhase]
@@ -136,7 +139,7 @@ Welcome back to ODD Studio v3.5.1.
136
139
 
137
140
  **What's next:** [state.nextStep]
138
141
 
139
- Type `*plan` to continue planning, `*build` to enter build mode, or `*status` for full detail.
142
+ Type `*plan` to continue planning, `*build` to enter build mode, `*debug` to investigate a failing outcome without leaving the ODD flow, or `*status` for full detail.
140
143
 
141
144
  ---
142
145
 
@@ -219,6 +222,36 @@ Enter build mode. This command runs the following checks in order before beginni
219
222
  - Do NOT run the brief generation and build agents "in parallel" — the brief MUST be confirmed BEFORE any build work begins
220
223
  - This is a hard sequential gate. There are no exceptions.
221
224
 
225
+ ### `*debug`
226
+
227
+ Enter controlled debug mode for the current outcome.
228
+
229
+ This command must keep the work inside the ODD flow. It is not a free-form detour.
230
+
231
+ Execute these steps in order:
232
+
233
+ 1. Read `.odd/state.json` and confirm `currentPhase` is `"build"`. If not, explain that debugging only exists inside build work and route back to `*build`.
234
+ 2. Read the latest failure in domain language from the current conversation and identify the active outcome.
235
+ 3. Read `docs/build/debug-protocol.md` and choose exactly one debug strategy before inspecting code:
236
+ - `ui-behaviour`
237
+ - `full-stack`
238
+ - `auth-security`
239
+ - `integration-contract`
240
+ - `background-process`
241
+ - `performance-state`
242
+ 4. Update `.odd/state.json`:
243
+ - set `buildMode` to `"debug"`
244
+ - set `verificationConfirmed` to `false`
245
+ - set `debugStartedAt` to the current timestamp
246
+ - set `debugStrategy`, `debugTarget`, and `debugSummary`
247
+ 5. Call `mcp__odd-flow__memory_store` with key `odd-project-state`, namespace `odd-project`, value set to the full updated `.odd/state.json`
248
+ 6. Run the investigation and fix strictly according to the chosen strategy. Do not guess. Do not apply quick fixes. Reproduce first, identify the failing boundary, then fix.
249
+ 7. When the fix is ready, update `.odd/state.json` again:
250
+ - set `buildMode` to `"verify"`
251
+ - keep `debugStrategy`, `debugTarget`, and `debugSummary` as the latest resolved context
252
+ 8. Call `mcp__odd-flow__memory_store` again with the full updated `.odd/state.json`
253
+ 9. Return to the verification walkthrough from step one. A debug session ends only when verification passes.
254
+
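The state update in step 4 of `*debug` can be sketched as a pure function. This is a minimal illustration, not the package's implementation: the `OddState` interface, the `enterDebugMode` name, and the ISO-8601 timestamp format are assumptions; only the field names and values come from the steps above.

```typescript
// The six strategies listed in step 3.
type DebugStrategy =
  | "ui-behaviour"
  | "full-stack"
  | "auth-security"
  | "integration-contract"
  | "background-process"
  | "performance-state";

// Hypothetical shape: only the fields named in the steps above are modelled.
interface OddState {
  currentPhase: string;
  buildMode?: string;
  verificationConfirmed?: boolean;
  debugStartedAt?: string;
  debugStrategy?: DebugStrategy;
  debugTarget?: string;
  debugSummary?: string;
  [key: string]: unknown;
}

// Returns a new state object; the caller is still responsible for writing
// .odd/state.json and storing the result to odd-flow (steps 4 and 5).
function enterDebugMode(
  state: OddState,
  strategy: DebugStrategy,
  target: string,
  summary: string,
  now: Date = new Date(),
): OddState {
  // Step 1: debugging only exists inside build work.
  if (state.currentPhase !== "build") {
    throw new Error("Debugging only exists inside build work; use *build first.");
  }
  return {
    ...state,
    buildMode: "debug",
    verificationConfirmed: false,
    debugStartedAt: now.toISOString(), // assumed timestamp format
    debugStrategy: strategy,
    debugTarget: target,
    debugSummary: summary,
  };
}
```

Keeping the transition pure makes the phase guard easy to check in isolation before any file or memory write happens.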
222
255
  **If the brief exists but `briefConfirmed` is not true in state.json:**
223
256
  - Present it to the domain expert: "Session Brief [N] exists. Review it at docs/session-brief-[N].md and confirm before we build."
224
257
  - Wait for confirmation, then set `briefConfirmed: true` in `.odd/state.json`
@@ -508,6 +541,7 @@ You can use either format:
508
541
  |---|---|---|
509
542
  | `*plan` | `/odd-plan` | Continue from where you left off in planning |
510
543
  | `*build` | `/odd-build` | Enter build mode and initialise odd-flow swarm |
544
+ | `*debug` | `/odd-debug` | Keep debugging inside the active outcome and force an explicit debug strategy before fixing |
511
545
  | `*status` | `/odd-status` | Show full project state and progress |
512
546
  | `*swarm` | `/odd-swarm` | Build all independent outcomes in the current phase simultaneously |
513
547
  | `*deploy` | `/odd-deploy` | Deploy the current verified build to production |
@@ -566,11 +600,54 @@ Enforce this sequence — do not proceed to a later step without the earlier one
566
600
  Run when `*build` is called and `servicesConfigured` is false.
567
601
 
568
602
  1. **Scaffold.** If `package.json` exists, skip to step 2. If not: `create-next-app` rejects non-empty directories — scaffold into a sibling dir (`${PROJECT_DIR}-scaffold`) then rsync across excluding `.git`, `docs/`, `node_modules/`. Fix `package.json name` after rsync. Tell user they can delete the sibling dir.
569
- 2. **Install deps.** `npm install drizzle-orm drizzle-kit vitest @testing-library/react @vitejs/plugin-react`
570
- 3. **Generate `.env.local`.** Write a placeholder file with every credential the chosen stack needs. Each line must have a comment pointing to exactly where to find the real value in the service dashboard. Include a note: never commit this file, use test keys for payment services.
571
- 4. **Wait.** Display the credential list. Wait for the user to confirm they've filled everything in.
572
- 5. **Verify.** Kill port 3000 (`lsof -ti:3000 | xargs kill 2>/dev/null || true`), run `npm run dev`. Translate any connection errors into plain language. Repeat until server starts cleanly.
573
- 6. **Mark done.** Set `servicesConfigured: true` in `.odd/state.json`. Confirm: "All services connected. Development server running at http://localhost:3000."
603
+ 2. **Install deps.** Read `testingFramework` from `.odd/state.json` (default "Vitest"). Install the chosen testing stack:
604
+ - **Vitest (default):** `npm install --save-dev vitest @testing-library/react @vitejs/plugin-react @testing-library/jest-dom jsdom`
605
+ - **Jest:** `npm install --save-dev jest @testing-library/react @testing-library/jest-dom ts-jest @types/jest jest-environment-jsdom`
606
+ - **Playwright:** `npm install --save-dev @playwright/test` then `npx playwright install`
607
+ - Also install production deps: `npm install drizzle-orm drizzle-kit`
608
+ 3. **Scaffold test harness.** Read `testingFramework` from `.odd/state.json` and scaffold the appropriate config. For **Vitest** (the default):
609
+ - Create `vitest.config.ts`:
610
+ ```typescript
611
+ import { defineConfig } from "vitest/config"
612
+ import react from "@vitejs/plugin-react"
613
+ import path from "path"
614
+
615
+ export default defineConfig({
616
+ plugins: [react()],
617
+ test: {
618
+ environment: "jsdom",
619
+ globals: true,
620
+ setupFiles: ["./tests/setup.ts"],
621
+ include: ["tests/**/*.test.{ts,tsx}"],
622
+ },
623
+ resolve: {
624
+ alias: {
625
+ "@": path.resolve(__dirname, "."),
626
+ },
627
+ },
628
+ })
629
+ ```
630
+ - Create `tests/setup.ts`:
631
+ ```typescript
632
+ import "@testing-library/jest-dom/vitest"
633
+ ```
634
+ - Create `tests/setup.test.ts` (smoke test):
635
+ ```typescript
636
+ import { describe, it, expect } from "vitest"
637
+
638
+ describe("vitest setup", () => {
639
+   it("runs", () => {
640
+     expect(true).toBe(true)
641
+   })
642
+ })
643
+ ```
644
+ - Add scripts to `package.json`: `"test": "vitest run"` and `"test:watch": "vitest"`
645
+ - Run `npm test` to confirm the harness works. If the smoke test fails, diagnose and fix before proceeding.
646
+ - Display: "Test harness configured. `npm test` runs the suite. `npm run test:watch` runs in watch mode."
647
+ 4. **Generate `.env.local`.** Write a placeholder file with every credential the chosen stack needs. Each line must have a comment pointing to exactly where to find the real value in the service dashboard. Include a note: never commit this file, use test keys for payment services.
648
+ 5. **Wait.** Display the credential list. Wait for the user to confirm they've filled everything in.
649
+ 6. **Verify.** Kill port 3000 (`lsof -ti:3000 | xargs kill 2>/dev/null || true`), run `npm run dev`. Translate any connection errors into plain language. Repeat until server starts cleanly.
650
+ 7. **Mark done.** Set `servicesConfigured: true` in `.odd/state.json`. Confirm: "All services connected. Development server running at http://localhost:3000. Test harness verified."
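Step 2 of the list above amounts to a lookup from `testingFramework` to install commands. The sketch below assumes the stored values are exactly `"Vitest"`, `"Jest"`, and `"Playwright"`; the `installCommands` helper and `DEV_INSTALL` table are hypothetical names, while the command strings are copied from the step text.

```typescript
type TestingFramework = "Vitest" | "Jest" | "Playwright";

// Dev-dependency commands per framework, as listed in step 2.
const DEV_INSTALL: Record<TestingFramework, string[]> = {
  Vitest: [
    "npm install --save-dev vitest @testing-library/react @vitejs/plugin-react @testing-library/jest-dom jsdom",
  ],
  Jest: [
    "npm install --save-dev jest @testing-library/react @testing-library/jest-dom ts-jest @types/jest jest-environment-jsdom",
  ],
  Playwright: ["npm install --save-dev @playwright/test", "npx playwright install"],
};

// Resolves the commands to run, defaulting to Vitest when the field is absent.
function installCommands(state: { testingFramework?: string }): string[] {
  const chosen = state.testingFramework ?? "Vitest";
  if (!Object.prototype.hasOwnProperty.call(DEV_INSTALL, chosen)) {
    throw new Error(`Unknown testingFramework in .odd/state.json: ${chosen}`);
  }
  // Production deps are installed regardless of the testing choice.
  return [...DEV_INSTALL[chosen as TestingFramework], "npm install drizzle-orm drizzle-kit"];
}
```

Failing loudly on an unrecognised value is a design choice made for this sketch; the skill text itself does not say what should happen in that case.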
574
651
 
575
652
  ---
576
653
 
@@ -32,6 +32,40 @@ The domain expert does not re-brief the AI, paste context, identify shared infra
32
32
 
33
33
  ---
34
34
 
35
+ ### Step 2b — Test
36
+
37
+ After the build completes and before verification begins, the build agent runs the test suite automatically.
38
+
39
+ **What the build agent tests:**
40
+
41
+ Every outcome produces code. Some of that code is pure logic — functions that take inputs and return outputs without touching databases, APIs, or the browser. These functions MUST have tests written alongside the implementation. The build agent writes tests for:
42
+
43
+ - **Business rules** — access control, pricing, eligibility, classification logic
44
+ - **Data transformations** — formatting, aggregation, filtering, sorting
45
+ - **Validation** — input parsing, CSV import, form validation, regex matching
46
+ - **Calculations** — mastery scores, scheduling, priority ordering, time-based logic
47
+ - **Safety-critical logic** — safeguarding detection, content filtering, concern routing
48
+
49
+ **What is NOT tested at this stage:**
50
+
51
+ - Database queries (tested via verification walkthrough)
52
+ - UI rendering (tested via verification walkthrough)
53
+ - External API calls (tested via verification walkthrough)
54
+ - LLM prompt/response cycles (tested via verification walkthrough)
55
+
56
+ **The test gate:**
57
+
58
+ After the build completes, run `npm test`. If any tests fail:
59
+ 1. The build agent fixes the failures immediately — do not proceed to verification with failing tests
60
+ 2. Re-run `npm test` until all tests pass
61
+ 3. Display to the domain expert: "All [n] tests passing. Ready for verification."
62
+
63
+ If no testable pure logic was produced by this outcome (e.g., a purely UI outcome), display: "No new business logic tests required for this outcome. Ready for verification."
64
+
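The gate described above reduces to a small decision over the result of `npm test`. This is a sketch under assumptions: the `TestRun` shape (including a `newTests` count for the current outcome) and the `testGate` name are hypothetical; the two display strings are the ones quoted in the text.

```typescript
// Hypothetical summary of one `npm test` run for the current outcome.
interface TestRun {
  total: number; // tests executed across the whole suite
  failed: number; // tests that failed
  newTests: number; // tests written for this outcome
}

type GateResult =
  | { proceed: false; failed: number }
  | { proceed: true; message: string };

// Blocks verification while anything fails; otherwise returns the message
// shown to the domain expert.
function testGate(run: TestRun): GateResult {
  if (run.failed > 0) {
    return { proceed: false, failed: run.failed };
  }
  const message =
    run.newTests === 0
      ? "No new business logic tests required for this outcome. Ready for verification."
      : `All ${run.total} tests passing. Ready for verification.`;
  return { proceed: true, message };
}
```

A build agent would call this after each `npm test` run and loop on fixes until `proceed` is `true`.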
65
+ Tests are committed alongside the implementation code. They live in `tests/` mirroring the source structure. Test files must never be deleted — they are regression guards for every future outcome.
66
+
67
+ ---
68
+
35
69
  ### Step 3 — Verify
36
70
 
37
71
  When ODD Studio reports the build is complete, the verification checklist is on screen.
@@ -191,6 +191,40 @@ If a build agent finds itself exceeding these limits, it stops and restructures
191
191
 
192
192
  ---
193
193
 
194
+ ## Testing Standard
195
+
196
+ Every pure-logic module — any function that takes inputs and returns outputs without side effects — MUST have a corresponding test file in `tests/` that mirrors the source path. If the source is `lib/psm/mastery.ts`, the test is `tests/lib/psm/mastery.test.ts`.
197
+
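The mirroring rule can be expressed as a one-line path mapping. The helper name `testPathFor` is hypothetical, and the sketch assumes source paths are relative to the project root and end in `.ts` or `.tsx`, matching the `include` pattern in the Vitest config.

```typescript
// Maps a source file to the test file that must accompany it.
function testPathFor(sourcePath: string): string {
  const normalised = sourcePath.replace(/^\.\//, ""); // tolerate a leading "./"
  const match = /^(.*)\.(ts|tsx)$/.exec(normalised);
  if (!match) {
    throw new Error(`Not a TypeScript source file: ${sourcePath}`);
  }
  return `tests/${match[1]}.test.${match[2]}`;
}
```

For the example in the paragraph above, `testPathFor("lib/psm/mastery.ts")` yields `tests/lib/psm/mastery.test.ts`.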
198
+ ### What MUST be tested
199
+
200
+ - **Business rules:** access control checks, eligibility logic, classification functions, routing decisions
201
+ - **Data transformations:** formatting, aggregation, filtering, score calculations
202
+ - **Parsing and validation:** CSV import, regex patterns, input sanitisation, form validation
203
+ - **Safety-critical logic:** safeguarding keyword detection, content filtering, concern classification and routing
204
+ - **State machines:** plant growth levels, unlock gates, engagement level thresholds
205
+
206
+ ### What is NOT unit tested
207
+
208
+ - Database queries — these are tested via the verification walkthrough
209
+ - React components — these are tested via the verification walkthrough and design verification
210
+ - LLM calls — these are tested via the verification walkthrough
211
+ - External API integrations — these are tested via the verification walkthrough
212
+
213
+ ### Test quality rules
214
+
215
+ - Test the behaviour, not the implementation. Test what the function returns, not how it computes it.
216
+ - Use `it.each` for data-driven tests with multiple inputs against the same assertion.
217
+ - Use `vi.useFakeTimers()` for time-dependent logic. Clean up with `vi.useRealTimers()`.
218
+ - Use `vi.stubEnv()` for environment variable tests. Clean up with `vi.unstubAllEnvs()`.
219
+ - No mocks for things that can be tested directly. If a function is pure, test it directly.
220
+ - Every test file must pass independently — no shared state between test files.
221
+
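The first two rules (test behaviour, use a data table) can be illustrated with a pure function and one table of cases. Everything here is invented for illustration: `engagementLevel` and its thresholds are hypothetical, and the table is driven by a plain loop so the sketch runs without a test runner.

```typescript
type Level = "low" | "medium" | "high";

// Hypothetical pure function; the thresholds are placeholders, not project rules.
function engagementLevel(activeDays: number): Level {
  if (activeDays >= 5) return "high";
  if (activeDays >= 2) return "medium";
  return "low";
}

// One row per case: [input, expected output]. Boundary values are included
// because threshold logic usually breaks at the edges.
const cases: Array<[number, Level]> = [
  [0, "low"],
  [1, "low"],
  [2, "medium"],
  [4, "medium"],
  [5, "high"],
];

// Checks only what the function returns, never how it computes it.
function failingCases(): string[] {
  return cases
    .filter(([input, expected]) => engagementLevel(input) !== expected)
    .map(([input, expected]) => `engagementLevel(${input}) should be ${expected}`);
}
```

In a Vitest file the same `cases` array would typically be handed to `it.each(cases)` with a single `expect(engagementLevel(input)).toBe(expected)` assertion in place of the loop.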
222
+ ### The testing gate
223
+
224
+ `npm test` runs before every verification walkthrough. Failing tests block verification. The domain expert never sees a system with broken business logic.
225
+
226
+ ---
227
+
194
228
  ## How This Standard Is Enforced
195
229
 
196
230
  1. **At build time.** The build agent reads this document before writing any code. It applies the Design-It-Twice protocol internally and outputs only the minimal version.
@@ -199,4 +233,6 @@ If a build agent finds itself exceeding these limits, it stops and restructures
199
233
 
200
234
  3. **At verification time.** When the domain expert verifies an outcome, the code behind it has already been through two passes. The domain expert does not review code — but the code they are relying on is clean, minimal, and maintainable.
201
235
 
202
- 4. **At refactor time.** If an outcome is rebuilt after a verification failure, the rebuild starts from scratch against this standard — it does not patch the previous attempt.
236
+ 4. **At test time.** `npm test` runs after every build and before every verification. Failing tests block verification. New pure-logic functions without corresponding test files are flagged.
237
+
238
+ 5. **At refactor time.** If an outcome is rebuilt after a verification failure, the rebuild starts from scratch against this standard — it does not patch the previous attempt. Existing tests must still pass after the rebuild.
@@ -0,0 +1,141 @@
1
+ # ODD Debug Protocol
2
+
3
+ Debugging does not sit outside Outcome-Driven Development. It is a controlled sub-mode of the current build.
4
+
5
+ When something fails during verification or during an in-progress build, use `*debug`. Do not abandon the active outcome. Do not start free-form fixing. Do not guess.
6
+
7
+ ## Purpose
8
+
9
+ `*debug` exists to keep failure analysis inside the ODD flow:
10
+
11
+ - The failing outcome remains the active unit of work
12
+ - The investigation approach is chosen deliberately, not guessed
13
+ - The fix stays tied to the outcome walkthrough and contracts
14
+ - The work returns to verification when the defect is resolved
15
+
16
+ ## Entry Conditions
17
+
18
+ Before debugging:
19
+
20
+ 1. Read `.odd/state.json`
21
+ 2. Confirm `currentPhase` is `"build"`
22
+ 3. Identify the active outcome and the latest failure in domain language
23
+ 4. Set these fields in `.odd/state.json`:
24
+ - `buildMode: "debug"`
25
+ - `verificationConfirmed: false`
26
+ - `debugStartedAt: <timestamp>`
27
+ - `debugSummary: <one-sentence failure in domain language>`
28
+ 5. Store the updated state to odd-flow with key `odd-project-state`
29
+
30
+ Debugging must never mark an outcome verified, complete, or committed.
31
+
32
+ ## Strategy Selection
33
+
34
+ Choose exactly one debug strategy before inspecting code. State the chosen strategy and the reason.
35
+
36
+ Use this routing rule so the coding tool does not guess:
37
+
38
+ - Choose `ui-behaviour` when the problem is visible in the interface and you do not yet have evidence of a backend or data fault
39
+ - Choose `full-stack` when the failure crosses a user action, server boundary, and persisted state
40
+ - Choose `auth-security` when access, identity, trust, or validation boundaries might be wrong
41
+ - Choose `integration-contract` when one part of the system expects data or sequencing another part does not produce
42
+ - Choose `background-process` when the failure depends on async handoff, jobs, retries, or event delivery
43
+ - Choose `performance-state` when the issue depends on timing, staleness, cache invalidation, or repeated actions
44
+
45
+ If more than one strategy seems plausible, do not fix anything yet. Gather one more piece of evidence, then choose the narrowest strategy that still explains the failure.
46
+
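The "exactly one strategy" rule can be enforced in code with a closed set of values. This is a sketch, assuming the strategy arrives as a free-form string; `DEBUG_STRATEGIES` and `parseDebugStrategy` are hypothetical names, and the six values are the ones listed in the routing rule.

```typescript
// The closed set of strategies from the routing rule above.
const DEBUG_STRATEGIES = [
  "ui-behaviour",
  "full-stack",
  "auth-security",
  "integration-contract",
  "background-process",
  "performance-state",
] as const;

type DebugStrategy = (typeof DEBUG_STRATEGIES)[number];

// Accepts a single candidate and rejects anything outside the set,
// so a vague or combined strategy can never reach state.json.
function parseDebugStrategy(value: string): DebugStrategy {
  const found = DEBUG_STRATEGIES.find((strategy) => strategy === value);
  if (!found) {
    throw new Error(`Choose exactly one of: ${DEBUG_STRATEGIES.join(", ")}`);
  }
  return found;
}
```

Deriving the type from the array keeps the list of allowed values in one place.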
47
+ ### 1. `ui-behaviour`
48
+
49
+ Use when the failure is visible in the interface only:
50
+ - layout is wrong
51
+ - a button does nothing
52
+ - a view does not update
53
+ - a message or validation state is missing
54
+
55
+ Approach:
56
+ - reproduce in browser first
57
+ - inspect the rendered path backwards to the triggering action
58
+ - verify whether the contract is correct and the rendering is wrong
59
+
60
+ ### 2. `full-stack`
61
+
62
+ Use when the failure spans browser, route, service, and data:
63
+ - a form submits but the result is missing
64
+ - a saved change does not appear
65
+ - a payment or enrolment looks complete but data is inconsistent
66
+
67
+ Approach:
68
+ - trace the full request path
69
+ - identify the first boundary where expected data diverges
70
+ - fix the smallest broken boundary, not the symptom
71
+
72
+ ### 3. `auth-security`
73
+
74
+ Use when the defect touches access, trust, or sensitive behaviour:
75
+ - the wrong person can see or do something
76
+ - a protected route is open
77
+ - a webhook or upload path behaves unsafely
78
+ - a session, role, or permission check is wrong
79
+
80
+ Approach:
81
+ - verify actor, boundary, and expected restriction first
82
+ - inspect authentication, authorisation, validation, and side-effect points in order
83
+ - prefer the fix that narrows access and restores explicit checks
84
+
85
+ ### 4. `integration-contract`
86
+
87
+ Use when two outcomes disagree about shared data or sequencing:
88
+ - one screen expects data another workflow never produces
89
+ - a downstream step fails because an upstream assumption changed
90
+
91
+ Approach:
92
+ - inspect the contract map and the active outcome contracts
93
+ - find the first producer/consumer mismatch
94
+ - fix the contract implementation or update the outcome if the specification is wrong
95
+
96
+ ### 5. `background-process`
97
+
98
+ Use when the failure depends on queues, jobs, webhooks, scheduled work, or async delivery.
99
+
100
+ Approach:
101
+ - identify the triggering event
102
+ - confirm the worker/task started
103
+ - inspect the persisted state before and after the async boundary
104
+ - fix the handoff, retry, or idempotency break
105
+
106
+ ### 6. `performance-state`
107
+
108
+ Use when the issue is stale data, race conditions, repeated actions, caching, or timing-sensitive state.
109
+
110
+ Approach:
111
+ - reproduce twice
112
+ - confirm whether the fault is deterministic or timing-sensitive
113
+ - inspect cache/state invalidation boundaries before changing business logic
114
+
115
+ ## Non-Negotiable Rules
116
+
117
+ - Never use “quick fix” or “patch it” reasoning
118
+ - Never change multiple layers at once before reproducing the fault
119
+ - Never skip the reproduction step
120
+ - Never jump to a fix before naming the failing boundary
121
+ - Never broaden the strategy after starting unless new evidence proves the original classification wrong
122
+ - Never leave `buildMode: "debug"` active after the fix is complete
123
+
124
+ ## Fix Protocol
125
+
126
+ After choosing the strategy:
127
+
128
+ 1. Reproduce the failure
129
+ 2. Name the failing boundary
130
+ 3. Inspect only the layers required by the chosen strategy
131
+ 4. Apply the smallest fix that restores the specified behaviour
132
+ 5. Run the relevant automated checks
133
+ 6. Set these fields in `.odd/state.json`:
134
+ - `buildMode: "verify"`
135
+ - `debugStrategy: <chosen strategy>`
136
+ - `debugTarget: <affected outcome/surface>`
137
+ - `debugSummary: <resolved failure summary>`
138
+ 7. Store the updated state to odd-flow with key `odd-project-state`
139
+ 8. Return to the verification walkthrough from step one
140
+
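Step 6 of the Fix Protocol is the mirror image of the entry update. The sketch below is an assumption-laden illustration: the `DebugState` shape and `finishDebug` name are hypothetical, and keeping `verificationConfirmed` at `false` follows the rule that debugging must never mark an outcome verified.

```typescript
// Hypothetical subset of .odd/state.json relevant to leaving debug mode.
interface DebugState {
  buildMode: string;
  verificationConfirmed?: boolean;
  debugStrategy?: string;
  debugTarget?: string;
  debugSummary?: string;
  [key: string]: unknown;
}

// Moves the state from "debug" to "verify" and records the resolved context.
function finishDebug(
  state: DebugState,
  strategy: string,
  target: string,
  resolvedSummary: string,
): DebugState {
  if (state.buildMode !== "debug") {
    throw new Error("finishDebug called while not in debug mode");
  }
  return {
    ...state,
    buildMode: "verify",
    verificationConfirmed: false, // only the walkthrough may confirm
    debugStrategy: strategy,
    debugTarget: target,
    debugSummary: resolvedSummary,
  };
}
```

The returned object is what would then be stored to odd-flow under `odd-project-state` in step 7.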
141
+ If the investigation reveals that the specification is wrong, stop debugging and update the outcome instead. That is not a bug fix. That is a specification correction.
@@ -1,6 +1,6 @@
1
1
  # Chapter 10: The Build Protocol
2
2
 
3
- ODD Studio handles all mechanics — context loading, contract validation, re-briefing, committing. You do three things: type /odd, type *build, verify the result. The tool handles the rest.
3
+ ODD Studio handles all mechanics — context loading, contract validation, re-briefing, committing, and controlled debugging. You do three things: type /odd, type *build, verify the result. If verification fails, type *debug. The tool handles the rest.
4
4
 
5
5
  The Build Protocol is the repeating rhythm of every session. It is deliberately simple because the complexity belongs in the specification, not in the process. If the specification is precise, the build protocol is almost mechanical. If the build goes wrong, the cause is almost always in the specification, not in the process.
6
6
 
@@ -8,7 +8,7 @@ The Build Protocol is the repeating rhythm of every session. It is deliberately
8
8
 
9
9
  - **The tool handles the mechanics. You handle the judgment.** ODD Studio loads context from odd-flow, reads the contract map, identifies the next outcome to build, briefs the AI, waits for the result, and presents you with a verification checklist. Your job is to follow that checklist as the persona and judge whether the result is correct.
10
10
 
11
- - **The session rhythm is: /odd, *build, verify, confirm.** `/odd` loads the skill and restores project state from odd-flow. `*build` starts the next outcome. You verify the result against the checklist. `confirm` runs Checkpoint (a security scan of what was just built), commits the verified outcome, and advances to the next one. That is it.
11
+ - **The session rhythm is: /odd, *build, verify, confirm. If verification fails: *debug, then verify again.** `/odd` loads the skill and restores project state from odd-flow. `*build` starts the next outcome. You verify the result against the checklist. If it fails, `*debug` classifies the failure and keeps the fix inside the active outcome. `confirm` runs Checkpoint (a security scan of what was just built), commits the verified outcome, and advances to the next one. That is it.
12
12
 
13
13
  - **Re-briefing is automatic.** You do not need to remind the AI what your project is, what has been built, or what comes next. odd-flow stores all of this. ODD Studio reads it at the start of every session. If you find yourself explaining context, something is wrong with the state — not with the process.
14
14
 
@@ -22,12 +22,12 @@ The Build Protocol is the repeating rhythm of every session. It is deliberately
22
22
 
23
23
  - Building without verifying. Typing `confirm` without following the verification checklist is the fastest way to accumulate hidden defects. Every unverified outcome is a risk to every outcome that depends on it.
24
24
 
25
- - Worrying about security implementation. Checkpoint runs automatically when you type `confirm`. It scans for exposed secrets, missing authentication checks, and injection vulnerabilities in what was just built. If it finds something, the build agent fixes it before the commit. You do not need to think about this — it happens in the background.
25
+ - Worrying about security implementation. Checkpoint runs automatically before verification is completed and before commit. It scans for exposed secrets, missing authentication checks, insecure session shortcuts, and dangerous rendering or transport shortcuts in what was just built. If it finds something, the build returns to controlled debugging instead of drifting into ad hoc fixes.
26
26
 
27
27
  - Batching multiple outcomes into one build. Each outcome has its own verification checklist for a reason. Mixing them makes it impossible to know which outcome caused a failure.
28
28
 
29
29
  ## What This Means for You
30
30
 
31
- Your next session: type `/odd`. Read what ODD Studio tells you about where you are. Type `*build`. Follow the checklist. If it passes, type `confirm`. If it fails, describe the failure in your own words — not in technical terms. ODD Studio handles the fix.
31
+ Your next session: type `/odd`. Read what ODD Studio tells you about where you are. Type `*build`. Follow the checklist. If it passes, type `confirm`. If it fails, describe the failure in your own words — not in technical terms — then type `*debug`. ODD Studio classifies the failure, keeps the work inside the active outcome, and routes the fix back into verification.
32
32
 
33
33
  Next: Chapter 11 explains why verification is your job and no tool can replace your judgment.
@@ -267,19 +267,42 @@ If the domain expert raises a concern:
267
267
  - Always tie reasoning to the specification. "This fits because Outcome 2.1 requires real-time updates for 90 students" — not "This is popular."
268
268
  - If a previous choice constrains the next (e.g., choosing Supabase for database means Supabase Auth is available as an auth option), mention it as context but do not force it.
269
269
 
270
- **Phase 3: Confirm the Fixed Layers**
270
+ **Phase 3: Fixed Layer and Testing Decision**
271
271
 
272
- After all choices are made, explain the two components that are included in every ODD Studio project regardless of choice:
272
+ After all choices are made, explain the fixed ORM layer:
273
273
 
274
274
  **Drizzle ORM** — the database layer that keeps the AI honest.
275
275
 
276
276
  "Drizzle is the tool that ensures the build agents always know the exact shape of your data. Every field, every type, every relationship lives in your codebase as versioned migrations. When something goes wrong, we can reverse the last change precisely — the same way git lets us reverse code changes. Without Drizzle, agents are guessing about your database. With it, they know."
277
277
 
278
- **Vitest** — automated checks for invisible behaviours.
278
+ Drizzle is not negotiable. It exists because the build agents need it, not because of preference.
279
279
 
280
- "Vitest runs the business rules and calculations you cannot verify by clicking — access control logic, pricing calculations, workflow state transitions. Every outcome built triggers Vitest automatically. If a rule breaks because of a change somewhere else, Vitest catches it before you reach the verification step."
280
+ **Then present the testing decision.**
281
281
 
282
- These are not negotiable. They exist because the build agents need them, not because of preference.
282
+ "Now we choose your testing framework. Automated tests run the business rules and calculations you cannot verify by clicking — access control logic, pricing calculations, workflow state transitions. Every outcome built triggers the test suite automatically. If a rule breaks because of a change somewhere else, the tests catch it before you reach the verification step."
283
+
284
+ Present the testing options:
285
+
286
+ **Decision: Testing Framework**
287
+
288
+ **Vitest** (recommended)
289
+ - What it is: A fast, modern test runner built for the JavaScript/TypeScript ecosystem, with native ESM support and a jsdom browser environment for component testing.
290
+ - Why it fits: Vitest understands your project's TypeScript and path aliases out of the box. It runs in under 2 seconds for most test suites. It includes everything needed — assertions, mocking, fake timers, coverage — with zero extra configuration.
291
+ - Trade-off: None significant. This is the default because it works best with the ODD build process.
292
+
293
+ **Jest**
294
+ - What it is: The most widely-used JavaScript test runner. Battle-tested, enormous ecosystem.
295
+ - Why it fits: If your team already uses Jest and has existing test patterns, consistency may matter more than speed.
296
+ - Trade-off: Slower than Vitest, requires additional configuration for TypeScript and ESM, heavier setup.
297
+
298
+ **Playwright Test** (for E2E-only projects)
299
+ - What it is: A browser-based test runner. Tests run against a real browser, not jsdom.
300
+ - Why it fits: If your project is almost entirely UI with minimal business logic, browser-level testing may be more valuable than unit tests.
301
+ - Trade-off: Much slower per test, requires a running dev server, not suitable for testing pure business logic in isolation.
302
+
303
+ "Which testing framework do you prefer? Vitest is the default because it integrates cleanly with the build process — but if you have a strong preference, we will use it."
304
+
305
+ If the domain expert has no preference or chooses Vitest, proceed with Vitest. Record the choice.
283
306
 
284
307
  **Phase 4: Summarise and Confirm**
285
308
 
@@ -292,7 +315,7 @@ After all decisions are made, present the complete stack as a summary:
292
315
  - **ORM**: Drizzle (fixed — build agent requirement)
293
316
  - **Auth**: [chosen] — because [reason from their decision]
294
317
  - **Hosting**: [chosen] — because [reason from their decision]
295
- - **Testing**: Vitest (fixed — build agent requirement)
318
+ - **Testing**: [chosen — default Vitest] — because [reason]
296
319
  - [Any specialist services]: [chosen] — because [reason from their decision]
297
320
 
298
321
  Does this look right? Any second thoughts before I record it?"
@@ -313,7 +336,7 @@ Append a technical decisions section to `CLAUDE.md`:
313
336
  - Database: [chosen]
314
337
  - ORM: Drizzle
315
338
  - Auth: [chosen]
316
- - Testing: Vitest
339
+ - Testing: [chosen — default Vitest]
317
340
  - Hosting: [chosen]
318
341
  - [Other services]: [chosen]
319
342
 
@@ -323,7 +346,7 @@ Append a technical decisions section to `CLAUDE.md`:
323
346
  - Auth: [why, with reference to specific outcome or persona]
324
347
  - Hosting: [why, with reference to specific outcome or persona]
325
348
  - Drizzle: type-safe database layer with versioned migrations — build agents always know the exact shape of the data and every change is tracked alongside code changes
326
- - Vitest: automated testing for invisible business rules — catches regressions before verification
349
+ - Testing: [chosen framework — default Vitest] — automated testing for invisible business rules — catches regressions before verification
327
350
 
328
351
  **Alternatives considered (per layer):**
329
352
  - Framework: [rejected options and why — specific constraint from the specification]
@@ -346,7 +369,7 @@ Call `mcp__odd-flow__memory_store`:
346
369
  - Set `techStackDecided: true`
347
370
  - Set `techStack` to the chosen stack description (e.g., "Next.js 16 + Supabase + NextAuth + Vercel")
348
371
  - Set `orm` to "Drizzle"
349
- - Set `testingFramework` to "Vitest"
372
+ - Set `testingFramework` to the chosen testing framework (default "Vitest")
350
373
  - Update `nextStep` to "Choose the design approach, then generate architecture and design system documents"
351
374
 
352
375
  Confirm to the user: "Technical stack chosen and recorded. Every build agent will read this before writing a line of code."
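
The state update in this hunk can be sketched as a small function. This is a minimal illustration only: `TechStackState` and `recordTechStack` are hypothetical names, the field names are the ones listed in the hunk, and the real update is performed through `mcp__odd-flow__memory_store`, whose payload shape is not shown in this diff.

```typescript
// Hypothetical sketch: field names come from the hunk above, everything else is assumed.
interface TechStackState {
  techStackDecided: boolean;
  techStack: string;
  orm: "Drizzle";
  testingFramework: string;
  nextStep: string;
}

// Builds the values to record once the stack is confirmed.
// An empty or missing testing choice falls back to the default, "Vitest".
function recordTechStack(techStack: string, testingChoice?: string): TechStackState {
  const chosen = testingChoice?.trim();
  return {
    techStackDecided: true,
    techStack,
    orm: "Drizzle", // fixed in both 3.5.1 and 3.6.0
    testingFramework: chosen ? chosen : "Vitest",
    nextStep:
      "Choose the design approach, then generate architecture and design system documents",
  };
}
```

The only behavioural change from 3.5.1 in this sketch is that `testingFramework` is no longer hardcoded; Drizzle stays fixed.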
@@ -0,0 +1,60 @@
1
+ ---
2
+ name: "odd-debug"
3
+ version: "1.0.0"
4
+ description: "ODD Studio debug command. Keeps debugging inside the active outcome, selects the correct debug strategy, and routes back to verification instead of drifting outside the ODD flow."
5
+ metadata:
6
+ priority: 9
7
+ pathPatterns:
8
+ - '.odd/state.json'
9
+ - 'docs/plan.md'
10
+ - 'docs/outcomes/**'
11
+ - 'docs/session-brief*.md'
12
+ promptSignals:
13
+ phrases:
14
+ - "odd debug"
15
+ - "start odd debug"
16
+ - "continue odd debug"
17
+ - "resume odd debug"
18
+ - "debug this in odd"
19
+ allOf:
20
+ - [odd, debug]
21
+ anyOf:
22
+ - "debug"
23
+ - "fix"
24
+ - "broken"
25
+ - "verification failed"
26
+ - "regression"
27
+ noneOf: []
28
+ minScore: 5
29
+ retrieval:
30
+ aliases:
31
+ - odd debug
32
+ - debug with odd
33
+ intents:
34
+ - start odd debug
35
+ - continue odd debug
36
+ entities:
37
+ - debug strategy
38
+ - failing outcome
39
+ - verification failure
40
+ ---
41
+
42
+ # /odd-debug
43
+
44
+ You are executing the ODD Studio `*debug` command.
45
+
46
+ Read these two files now:
47
+ 1. `.claude/skills/odd/SKILL.md` — the full ODD Studio coach and build protocol
48
+ 2. `.claude/skills/odd/docs/build/debug-protocol.md` — the Debug Protocol detail
49
+
50
+ Then execute the `*debug` protocol exactly as documented in those files, starting from the state check and selecting the explicit debug strategy before any fix is attempted.
51
+
52
+ You must classify the failure into exactly one debug strategy before reading implementation details:
53
+ - `ui-behaviour`
54
+ - `full-stack`
55
+ - `auth-security`
56
+ - `integration-contract`
57
+ - `background-process`
58
+ - `performance-state`
59
+
60
+ If the correct strategy is not yet clear, gather evidence first. Never guess. Never jump straight to a fix.
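
The classification rule in the new `odd-debug` skill (exactly one of six strategies, and evidence gathering when none is clear) could be modelled roughly as follows. This is a sketch under stated assumptions: the six strategy strings are taken from the skill text, while `DEBUG_STRATEGIES`, `parseDebugStrategy`, and `nextDebugStep` are hypothetical names that do not appear in the package.

```typescript
// The six strategy names are from the skill file; all identifiers are hypothetical.
const DEBUG_STRATEGIES = [
  "ui-behaviour",
  "full-stack",
  "auth-security",
  "integration-contract",
  "background-process",
  "performance-state",
] as const;

type DebugStrategy = (typeof DEBUG_STRATEGIES)[number];

// Returns the strategy if the value is one of the six known names, otherwise null.
function parseDebugStrategy(value: string | null): DebugStrategy | null {
  return (DEBUG_STRATEGIES as readonly string[]).includes(value ?? "")
    ? (value as DebugStrategy)
    : null;
}

// No valid strategy yet means the next step is evidence gathering, never a fix.
function nextDebugStep(candidate: string | null): "gather-evidence" | "investigate" {
  return parseDebugStrategy(candidate) === null ? "gather-evidence" : "investigate";
}
```

Deriving the union type from a single constant array keeps the list of valid strategies in one place, so an unrecognised value such as a free-form "quick fix" label is rejected rather than silently accepted.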
@@ -1,5 +1,5 @@
1
1
  {
2
- "version": "2.0.0",
2
+ "version": "2.1.0",
3
3
  "projectName": "{{PROJECT_NAME}}",
4
4
  "initialisedAt": null,
5
5
  "lastSaved": null,
@@ -14,6 +14,7 @@
14
14
  "planApproved": false,
15
15
  "techStackDecided": false,
16
16
  "designApproachDecided": false,
17
+ "architectureDocGenerated": false,
17
18
  "servicesConfigured": false,
18
19
  "sessionBriefExported": false,
19
20
  "sessionBriefCount": 0,
@@ -22,5 +23,14 @@
22
23
  "swarmActive": false,
23
24
  "buildPhase": null,
24
25
  "currentBuildPhase": null,
26
+ "buildMode": "idle",
27
+ "debugStrategy": null,
28
+ "debugTarget": null,
29
+ "debugSummary": null,
30
+ "debugStartedAt": null,
31
+ "checkpointStatus": "unknown",
32
+ "lastCheckpointAt": null,
33
+ "checkpointFindings": 0,
34
+ "securityBaselineVersion": "2026-04-12",
25
35
  "notes": ""
26
36
  }
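
For readers consuming the 2.1.0 state template, the new debug fields might be typed as below. The field names and default values are taken from the hunk above; the types are inferred from those defaults, and both the `"debug"` value for `buildMode` and the ISO 8601 timestamp format are assumptions, since the diff only shows the `"idle"` and `null` defaults. `OddDebugState` and `enterDebug` are hypothetical names.

```typescript
// Types inferred from the template defaults; not the package's own schema.
interface OddDebugState {
  buildMode: string; // template default: "idle"
  debugStrategy: string | null;
  debugTarget: string | null;
  debugSummary: string | null;
  debugStartedAt: string | null; // assumed to hold an ISO 8601 timestamp
}

// Defaults exactly as they appear in the 2.1.0 template.
const templateDefaults: OddDebugState = {
  buildMode: "idle",
  debugStrategy: null,
  debugTarget: null,
  debugSummary: null,
  debugStartedAt: null,
};

// Hypothetical helper: returns a new state object marking a debug session as started.
// The "debug" mode value is an assumption, not confirmed by the diff.
function enterDebug(
  state: OddDebugState,
  strategy: string,
  target: string,
  now: Date,
): OddDebugState {
  return {
    ...state,
    buildMode: "debug",
    debugStrategy: strategy,
    debugTarget: target,
    debugStartedAt: now.toISOString(),
  };
}
```

The helper copies the state rather than mutating it, so the template defaults remain usable for resetting after the fix returns to verification.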
@@ -6,7 +6,7 @@
6
6
  This project uses ODD Studio for planning and building. To activate the ODD coach:
7
7
 
8
8
  **In Codex:** Type `use ODD`, `start ODD`, or `begin ODD` to start.
9
- Use `ODD status` to check the current state before resuming, or `ODD build` to continue the build flow.
9
+ Use `ODD status` to check the current state before resuming, `ODD build` to continue the build flow, or `ODD debug` to investigate a failing outcome without leaving ODD.
10
10
 
11
11
  **In OpenCode:** Type `/odd` to start.
12
12
 
@@ -124,6 +124,21 @@ export function canAccess(user: User): boolean {
124
124
  ### Security Baseline
125
125
  - No hardcoded secrets, API keys, or credentials — use environment variables
126
126
  - Validate user input at system boundaries
127
+ - Authenticate and authorise every protected route, action, webhook, and admin surface
128
+ - Verify webhooks, uploads, and third-party callbacks before trusting payloads
129
+ - Use secure session defaults — no localStorage auth/session tokens, no JWT-by-default shortcuts
130
+ - Rate-limit auth, admin, upload, payment, and public write surfaces
131
+ - Record audit trails for admin and security-sensitive actions
132
+ - Never disable TLS, CSRF, origin, or certificate verification in production code
133
+ - Treat any security scan finding as release-blocking until fixed
134
+
135
+ ## Debugging Inside ODD
136
+ - Use `ODD debug` or `*debug` when verification fails or a build breaks
137
+ - Debugging stays inside the current outcome — it is not a free-form detour
138
+ - Choose an explicit debug strategy before touching code: `ui-behaviour`, `full-stack`, `auth-security`, `integration-contract`, `background-process`, or `performance-state`
139
+ - Reproduce first, identify the failing boundary second, fix third
140
+ - Never apply a “quick fix” without naming the failing boundary
141
+ - After a fix, return to the verification walkthrough from step one
127
142
 
128
143
  ## UI Standards (Every UI Outcome)
129
144
  - Use shadcn/ui components as the default component library