claude-devkit-cli 1.2.5 → 1.3.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # claude-devkit-cli
2
2
 
3
- A lightweight, spec-first development toolkit for [Claude Code](https://claude.ai/code). It enforces the cycle **spec → test plan → code + tests → build pass** through custom commands, automatic hooks, and a universal test runner.
3
+ A lightweight, spec-first development toolkit for [Claude Code](https://claude.ai/code). It enforces the cycle **spec (with acceptance scenarios) → code + tests → build pass** through custom commands, automatic hooks, and a universal test runner.
4
4
 
5
5
  **Works with:** Swift, TypeScript/JavaScript, Python, Rust, Go, Java/Kotlin, C#, Ruby.
6
6
  **Dependencies:** None (requires only Claude Code CLI, Node.js, Git, and Bash).
@@ -16,7 +16,7 @@ A lightweight, spec-first development toolkit for [Claude Code](https://claude.a
16
16
  5. [Commands Reference](#5-commands-reference)
17
17
  6. [Automatic Guards (Hooks)](#6-automatic-guards-hooks)
18
18
  7. [Build Test Script](#7-build-test-script)
19
- 8. [Specs & Test Plans Format](#8-specs--test-plans-format)
19
+ 8. [Spec Format](#8-spec-format)
20
20
  9. [Customization](#9-customization)
21
21
  10. [Token Cost Guide](#10-token-cost-guide)
22
22
  11. [Troubleshooting](#11-troubleshooting)
@@ -29,16 +29,16 @@ A lightweight, spec-first development toolkit for [Claude Code](https://claude.a
29
29
  ### The Core Cycle
30
30
 
31
31
  ```
32
- SPEC → TEST PLAN → CODE + TESTS → BUILD PASS
32
+ SPEC (with acceptance scenarios) → CODE + TESTS → BUILD PASS
33
33
  ```
34
34
 
35
- Every code change — feature, fix, or removal — follows this cycle. The spec is the source of truth. If code contradicts the spec, the code is wrong.
35
+ Every code change — feature, fix, or removal — follows this cycle. The spec is the source of truth. Acceptance scenarios (Given/When/Then) are embedded directly in the spec — no separate test plan file. If code contradicts the spec, the code is wrong.
36
36
 
37
37
  ### Why Spec-First?
38
38
 
39
- - **Prevents drift.** Without a spec, code becomes the only documentation and diverges from intent over time.
40
- - **Tests have purpose.** Test plans derived from specs test behavior, not implementation details. This means tests survive refactoring.
41
- - **AI writes better code.** When Claude Code has a spec to reference, it generates more accurate implementations and more meaningful tests.
39
+ - **Prevents drift.** Acceptance scenarios live inside the spec, so there is no separate test plan to fall out of sync.
40
+ - **Tests have purpose.** Scenarios derived from specs test behavior, not implementation details. This means tests survive refactoring.
41
+ - **AI writes better code.** When Claude Code has a spec with concrete Given/When/Then scenarios, it generates more accurate implementations and more meaningful tests.
42
42
  - **Reviews are grounded.** Reviewers can check code against the spec rather than guessing at intent.
43
43
 
44
44
  ### Principles
@@ -62,7 +62,7 @@ npx claude-devkit-cli init .
62
62
  # 2. Open your project in Claude Code
63
63
  claude
64
64
 
65
- # 3. Create your first spec and test plan
65
+ # 3. Create your first spec
66
66
  /mf-plan "describe your feature here"
67
67
 
68
68
  # 4. Write code, then test
@@ -145,8 +145,10 @@ your-project/
145
145
  ├── scripts/
146
146
  │ └── build-test.sh ← Universal test runner
147
147
  └── docs/
148
- ├── specs/ ← Your specs go here
149
- ├── test-plans/ ← Generated test plans
148
+ ├── specs/ ← Your specs (folder-per-feature)
149
+ │ └── <feature>/
150
+ │ ├── <feature>.md ← Spec with acceptance scenarios
151
+ │ └── snapshots/ ← Version history (managed by /mf-plan)
150
152
  └── WORKFLOW.md ← Process reference
151
153
  ```
152
154
 
@@ -185,7 +187,7 @@ npx claude-devkit-cli list
185
187
  npx claude-devkit-cli remove
186
188
  ```
187
189
 
188
- This removes hooks, commands, settings, and build-test.sh. It preserves `CLAUDE.md` (which you may have customized) and `docs/` (which contains your specs and plans).
190
+ This removes hooks, commands, settings, and build-test.sh. It preserves `CLAUDE.md` (which you may have customized) and `docs/` (which contains your specs).
189
191
 
190
192
  ---
191
193
 
@@ -197,7 +199,7 @@ This removes hooks, commands, settings, and build-test.sh. It preserves `CLAUDE.
197
199
 
198
200
  ```
199
201
  1. /mf-plan "description of the feature"
200
- → Generates spec + test plan. Review both.
202
+ → Generates spec with acceptance scenarios at docs/specs/<feature>/<feature>.md.
201
203
 
202
204
  2. Implement code in chunks.
203
205
  After each chunk: /mf-test
@@ -218,16 +220,15 @@ This removes hooks, commands, settings, and build-test.sh. It preserves `CLAUDE.
218
220
  > When: Changing behavior of something that already exists.
219
221
 
220
222
  ```
221
- 1. Edit the spec first: docs/specs/<feature>.md
223
+ 1. /mf-plan docs/specs/<feature>/<feature>.md "description of changes"
224
+ → Mode C handles everything: snapshot → classification → change report → apply.
225
+ Do NOT manually edit the spec before running /mf-plan.
222
226
 
223
- 2. /mf-plan docs/specs/<feature>.md
224
- → Updates the test plan with new/modified/removed cases.
225
-
226
- 3. Implement the code change.
227
+ 2. Implement the code change.
227
228
  /mf-test
228
229
  Fix until green.
229
230
 
230
- 4. /mf-review → /mf-commit
231
+ 3. /mf-review → /mf-commit
231
232
  ```
232
233
 
233
234
  ### Bug Fix
@@ -251,7 +252,8 @@ This removes hooks, commands, settings, and build-test.sh. It preserves `CLAUDE.
251
252
  > When: Deleting code, removing deprecated functionality.
252
253
 
253
254
  ```
254
- 1. Mark spec sections as removed in docs/specs/<feature>.md
255
+ 1. /mf-plan docs/specs/<feature>/<feature>.md "remove stories S-XXX"
256
+ → Mode C creates a snapshot (removing stories = Major), then marks as removed.
255
257
 
256
258
  2. Delete production code + related tests.
257
259
 
@@ -265,62 +267,69 @@ This removes hooks, commands, settings, and build-test.sh. It preserves `CLAUDE.
265
267
 
266
268
  ## 5. Commands Reference
267
269
 
268
- ### /mf-plan — Generate Spec + Test Plan
270
+ ### /mf-plan — Generate Spec with Acceptance Scenarios
269
271
 
270
272
  **Usage:**
271
273
  ```
272
- /mf-plan docs/specs/auth.md # Mode A: from existing spec
273
- /mf-plan "user authentication with OAuth2" # Mode B: from description
274
- /mf-plan docs/specs/auth.md (after spec edit) # Mode C: update existing plan
274
+ /mf-plan "user authentication with OAuth2" # Mode A: new spec from description
275
+ /mf-plan docs/specs/auth/auth.md # Mode B: add scenarios to existing spec
276
+ /mf-plan docs/specs/auth/auth.md "add password reset flow" # Mode C: update existing spec
275
277
  ```
276
278
 
277
279
  **Modes:**
278
- - **Mode A** — Reads an existing spec, generates a test plan.
279
- - **Mode B** — Drafts a spec from your description, asks for confirmation, then generates the test plan.
280
- - **Mode C** — Updates an existing spec + test plan when requirements change.
280
+ - **Mode A** — Creates a new spec with stories and acceptance scenarios from your description.
281
+ - **Mode B** — Reads an existing spec that has no acceptance scenarios yet, adds them.
282
+ - **Mode C** — Updates an existing spec: creates a snapshot before Major changes, shows a change report, waits for confirmation, then applies.
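
The three invocation shapes above can be told apart from the arguments alone. The sketch below is only an illustration of that dispatch: `pick_mode` is a hypothetical helper, and treating a first argument that ends in `.md` as a spec path is an assumption. The actual command is a Claude Code prompt, not Python.

```python
def pick_mode(args: list[str]) -> str:
    """Classify an /mf-plan invocation (hypothetical helper, not part of the CLI)."""
    if not args:
        raise ValueError("expected a description or a spec path")
    first_is_spec = args[0].endswith(".md")  # assumption: spec paths end in .md
    if not first_is_spec:
        return "A"  # description only: create a new spec
    if len(args) == 1:
        return "B"  # spec path only: add scenarios to an existing spec
    return "C"      # spec path plus change description: update the spec


print(pick_mode(["docs/specs/auth/auth.md", "add password reset flow"]))
```

Mode B also requires that the spec has no acceptance scenarios yet, which this argument-only check cannot see.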
281
283
 
282
284
  **How it works:**
283
285
 
284
- 1. **Phase 0: Codebase Awareness** — Scans existing code, `docs/specs/`, `docs/test-plans/`, and project patterns before planning. Prevents plans that conflict with existing implementations.
285
- 2. **Phase 1: Write/Update Spec** — Generates a structured spec with sections: Overview, Data Model, Use Cases, State Machine, Constraints & Invariants, Error Handling, Security Considerations. Depth is proportional to complexity — simple CRUD gets 1 paragraph + 3 use cases, complex auth gets the full template.
286
- 3. **Phase 2: Clarify Ambiguities** — Systematically finds gaps across behavioral, data, auth, non-functional, integration, and concurrency dimensions. Asks 3-5 targeted questions with 2-4 options each before proceeding. If the spec is clear and complete, 0 questions is valid.
287
- 4. **Phase 3: Generate Test Plan** — Derives test cases from the spec (never from code).
288
-
289
- **Traceability IDs:** Every requirement gets a traceable ID:
290
- - `UC-NNN` — Use Cases
291
- - `FR-NNN` — Functional Requirements
292
- - `SC-NNN` — Security Constraints
293
- - `TC-NNN` — Test Cases (each must reference at least one FR or SC)
286
+ 1. **Phase 0: Codebase Awareness** — Scans existing code, `docs/specs/`, and project patterns before planning. Prevents specs that conflict with existing implementations.
287
+ 2. **Phase 1: Scope & Split Assessment** — Evaluates feature size. Features with >7 stories or >20 acceptance scenarios must be split into sub-specs.
288
+ 3. **Phase 2: Draft Spec** — Generates a structured spec with stories and acceptance scenarios (Given/When/Then). Depth scales by priority: P0 gets full GWT + test data, P1 gets GWT, P2 gets 1-2 line descriptions. Runs consistency checks (CC1-CC6) before showing draft.
289
+ 4. **Phase 3: Clarify Ambiguities** — Systematically finds gaps across behavioral, data, auth, non-functional, integration, and concurrency dimensions. Asks 3-5 targeted questions. Waits for user answers before continuing.
290
+ 5. **Phase 4: Summary** — Shows story counts, AS counts, implementation order, next steps.
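
The split rule in Phase 1 is a plain threshold check. A minimal sketch, assuming the story and scenario counts are already known; the names `MAX_STORIES`, `MAX_SCENARIOS`, and `must_split` are made up for this example.

```python
MAX_STORIES = 7     # README: more than 7 stories requires a split
MAX_SCENARIOS = 20  # README: more than 20 acceptance scenarios requires a split


def must_split(story_count: int, scenario_count: int) -> bool:
    """Return True when a feature should be split into sub-specs."""
    return story_count > MAX_STORIES or scenario_count > MAX_SCENARIOS


print(must_split(story_count=8, scenario_count=12))
```

Both limits are exclusive: a spec with exactly 7 stories and 20 scenarios still fits in one file.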
294
291
 
295
- **Test plan table format:**
292
+ **Mode C (Update) adds:**
293
+ - **Classification** — Walks through M1-M6 checklist to determine Major vs Minor change.
294
+ - **Snapshot** — Major changes trigger an automatic snapshot (`cp`, bit-perfect) before editing.
295
+ - **Change report** — Shows what will change, waits for user confirmation.
296
+ - **Consistency check** — Runs CC1-CC6 after every update.
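
The Major/Minor decision can be pictured as a set intersection over the Major criteria that the Snapshots section of this README lists. This is a sketch under assumptions: the change labels below are invented identifiers, and how they map onto the M1-M6 checklist is not stated in the source.

```python
# Major criteria as listed under "Snapshots (Version History)".
# The string labels are hypothetical names chosen for this example.
MAJOR_CHANGES = {
    "new_story",
    "removed_story",
    "priority_change",
    "flow_change",
    "p0_behavior_change",
    "constraint_change",
}


def needs_snapshot(changes: set[str]) -> bool:
    """A snapshot is taken when at least one detected change is Major."""
    return bool(changes & MAJOR_CHANGES)


print(needs_snapshot({"typo_fix", "removed_story"}))
```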
296
297
 
297
- | ID | Priority | Type | UC | FR/SC | Description | Expected |
298
- |----|----------|------|----|-------|-------------|----------|
299
- | TC-001 | P0 | unit | UC-001 | FR-001 | ... | ... |
298
+ **Traceability IDs:**
299
+ - `S-NNN` — Stories (with priority P0/P1/P2)
300
+ - `AS-NNN` — Acceptance Scenarios (Given/When/Then, embedded in stories)
301
+ - `FR-NNN` — Functional Requirements (if needed)
302
+ - `SC-NNN` — Success Criteria (if needed)
303
+ - IDs are immutable — deleted IDs are never reused.
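
One way to honor "deleted IDs are never reused" is to allocate from the highest number ever issued rather than from the IDs currently present. A minimal sketch, assuming the caller can supply every ID that has ever existed (for example by also reading snapshots); `next_id` is a hypothetical helper.

```python
import re


def next_id(prefix: str, ever_used: set[str]) -> str:
    """Return the next free ID, e.g. S-004. ever_used must include deleted IDs."""
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d+)$")
    numbers = [int(m.group(1)) for item in ever_used if (m := pattern.match(item))]
    n = max(numbers, default=0) + 1  # never fill gaps left by deleted IDs
    return f"{prefix}-{n:03d}"


print(next_id("S", {"S-001", "S-002", "S-005"}))
```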
300
304
 
301
- Priorities: **P0** (must have), **P1** (should have), **P2** (nice to have).
305
+ **Directory structure:**
306
+ ```
307
+ docs/specs/<feature>/
308
+ <feature>.md # single source of truth — always read this file
309
+ snapshots/ # version history (managed by mf-plan, not developers)
310
+ YYYY-MM-DD.md
311
+ YYYY-MM-DD-<REF>.md
312
+ ```
302
313
 
303
314
  **Output:**
304
- - Spec: `docs/specs/<feature>.md`
305
- - Test plan: `docs/test-plans/<feature>.md`
315
+ - Spec with acceptance scenarios: `docs/specs/<feature>/<feature>.md`
306
316
 
307
317
  ### /mf-challenge — Adversarial Plan Review
308
318
 
309
319
  **Usage:**
310
320
  ```
311
- /mf-challenge docs/test-plans/auth.md # challenge a test plan
312
- /mf-challenge docs/specs/auth.md # challenge a spec
321
+ /mf-challenge docs/specs/auth/auth.md # challenge a spec
313
322
  /mf-challenge "user authentication" # challenge by feature name
314
323
  ```
315
324
 
316
325
  **How it works (7 phases):**
317
326
 
318
- 1. **Read & Map** — Reads the plan/spec and maps: decisions made, assumptions (stated AND implied), dependencies, scope boundaries, risk acknowledgments, spec-plan consistency.
327
+ 1. **Read & Map** — Reads the spec (including acceptance scenarios) and maps: decisions made, assumptions (stated AND implied), dependencies, scope boundaries, risk acknowledgments, story-AS consistency.
319
328
  2. **Scale Reviewers** — Assesses complexity and selects reviewers:
320
329
 
321
330
  | Complexity | Signals | Reviewers |
322
331
  |------------|---------|-----------|
323
- | Simple | 1 spec section, <20 test cases, no auth/data | 2 |
332
+ | Simple | 1 spec section, <20 acceptance scenarios, no auth/data | 2 |
324
333
  | Standard | Multiple sections, auth or data involved | 3 |
325
334
  | Complex | Multiple integrations, concurrency, migrations, 6+ phases | 4 |
326
335
 
@@ -394,8 +403,8 @@ Priorities: **P0** (must have), **P1** (should have), **P2** (nice to have).
394
403
 
395
404
  **How it works:**
396
405
 
397
- 1. **Phase 0: Build Context** — Finds changed files vs base branch, reads matching test plan/spec, reads existing tests for patterns, fixtures, and naming conventions. Doesn't duplicate what already exists.
398
- 2. **Phase 1: Write Tests** — Creates or updates tests based on the test plan. Each test covers one concept, is independent, deterministic (no random, no time-dependent, no external calls), and has a clear name.
406
+ 1. **Phase 0: Build Context** — Finds changed files vs base branch, reads the spec (acceptance scenarios in `## Stories` section are the roadmap), reads existing tests for patterns, fixtures, and naming conventions. Doesn't duplicate what already exists.
407
+ 2. **Phase 1: Write Tests** — Creates or updates tests based on acceptance scenarios. Each test covers one concept, is independent, deterministic (no random, no time-dependent, no external calls), and has a clear name.
399
408
  3. **Phase 2: Compile First** — Runs typecheck/compile before executing tests. Catches syntax errors early.
400
409
  4. **Phase 3: Run Tests** — Executes the test suite.
401
410
  5. **Phase 4: Fix Loop** — If tests fail, fixes **test code only** (max 3 attempts, then hard stop and report). If tests expect X but code does Y, asks you whether to fix production code or adjust the test.
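
The bounded fix loop can be sketched as follows. The two callables stand in for "run the suite" and "repair test code"; they are placeholders for this example, not functions the toolkit exposes.

```python
from typing import Callable

MAX_FIX_ATTEMPTS = 3  # README: max 3 attempts, then hard stop and report


def fix_loop(run_tests: Callable[[], bool], fix_test_code: Callable[[], None]) -> bool:
    """Return True once the suite passes, False after the attempts are used up."""
    if run_tests():
        return True
    for _ in range(MAX_FIX_ATTEMPTS):
        fix_test_code()  # only test code is touched, never production code
        if run_tests():
            return True
    return False  # hard stop: report to the user instead of retrying
```

A `False` result is the point at which the command would stop and report rather than keep editing.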
@@ -422,7 +431,7 @@ Priorities: **P0** (must have), **P1** (should have), **P2** (nice to have).
422
431
  2. **Phase 1: Write Failing Test** — Creates a regression test that reproduces the bug. Test includes a comment: `// Regression: <bug description> — <expected> vs <actual>`.
423
432
  3. **Phase 2: Confirm Failure** — Runs the test to verify it fails for the right reason.
424
433
  4. **Phase 3: Fix** — Minimal change to production code. If other tests break, the fix is wrong — never weakens existing tests.
425
- 5. **Phase 4: Root Cause Analysis** — Documents: Symptom, Root cause, Gap (why wasn't this caught earlier?), Prevention (suggests one: type constraint, validation, lint rule, spec update, or test plan update). Non-optional for serious bugs; for trivial bugs, the fix summary is enough.
434
+ 5. **Phase 4: Root Cause Analysis** — Documents: Symptom, Root cause, Gap (why wasn't this caught earlier?), Prevention (suggests one: type constraint, validation, lint rule, spec update including acceptance scenarios). Non-optional for serious bugs; for trivial bugs, the fix summary is enough.
426
435
  6. **Phase 5: Full Suite** — Runs all tests to catch regressions.
427
436
 
428
437
  **Multiple bugs:** Triages by severity, fixes one at a time, commits each separately.
@@ -437,7 +446,7 @@ Priorities: **P0** (must have), **P1** (should have), **P2** (nice to have).
437
446
 
438
447
  **How it works:**
439
448
 
440
- 1. **Phase 0: Understand Intent** — Reads commit messages and checks for related spec/test plan. Understands *why* the change was made before reviewing *how*.
449
+ 1. **Phase 0: Understand Intent** — Reads commit messages and checks for related spec. Understands *why* the change was made before reviewing *how*.
441
450
  2. **Phase 1: Smart Focus** — Auto-detects what to focus on based on the diff:
442
451
 
443
452
  | Diff contains | Focus on |
@@ -459,7 +468,7 @@ Priorities: **P0** (must have), **P1** (should have), **P2** (nice to have).
459
468
  **Rules:**
460
469
  - At least 1 positive note — reinforces good patterns, not just problems
461
470
  - Never auto-fixes code — report only
462
- - Checks spec-test alignment: code changed → spec/tests also changed? Vague requirements without metrics ("fast", "secure") get flagged with a suggestion to add concrete numbers
471
+ - Checks spec-test alignment: code changed → spec/acceptance scenarios/tests also changed? Vague requirements without metrics ("fast", "secure") get flagged with a suggestion to add concrete numbers
463
472
 
464
473
  ### /mf-commit — Smart Git Commit
465
474
 
@@ -708,77 +717,95 @@ Edit `scripts/build-test.sh`:
708
717
 
709
718
  ---
710
719
 
711
- ## 8. Specs & Test Plans Format
720
+ ## 8. Spec Format
712
721
 
713
- ### Business Logic Spec Template
722
+ ### Spec Template
714
723
 
715
- Create specs at `docs/specs/<feature-name>.md`:
724
+ Create specs at `docs/specs/<feature>/<feature>.md`:
716
725
 
717
726
  ```markdown
718
727
  # Spec: <Feature Name>
719
728
 
729
+ **Created:** 2026-04-02
730
+ **Last updated:** 2026-04-02
731
+ **Status:** Draft | Active | Deprecated
732
+
720
733
  ## Overview
721
734
  What this feature does, why it exists, who uses it. 2-3 sentences.
722
735
 
723
736
  ## Data Model
724
- Entities, attributes, relationships (table format).
737
+ Entities, attributes, relationships (if applicable).
738
+
739
+ ## Stories
740
+
741
+ ### S-001: <Story name> (P0)
742
+
743
+ **Description:** [user story]
744
+ **Source:** [optional: ticket/issue ref]
745
+
746
+ **Acceptance Scenarios:**
747
+
748
+ AS-001: <short description>
749
+ - **Given:** [state]
750
+ - **When:** [action]
751
+ - **Then:** [expected]
752
+ - **Data:** [test data]
725
753
 
726
- ## Use Cases
754
+ AS-002: <short description>
755
+ - **Given:** [error state]
756
+ - **When:** [action]
757
+ - **Then:** [error handling]
727
758
 
728
- ### UC-001: <Name>
729
- - **Actor:** User / System
730
- - **Preconditions:** ...
731
- - **Flow:** 1. ... 2. ...
732
- - **Postconditions:** ...
733
- - **Error cases:** ...
759
+ ### S-002: <Story name> (P1)
734
760
 
735
- ## Settings / Configuration
736
- Configurable behavior and defaults.
761
+ AS-003: <short description>
762
+ - **Given:** [state]
763
+ - **When:** [action]
764
+ - **Then:** [expected]
765
+
766
+ ### S-003: <Story name> (P2)
767
+
768
+ AS-004: <short description>
769
+ - [flow description + expected behavior]
737
770
 
738
771
  ## Constraints & Invariants
739
772
  Rules that must always hold.
740
773
 
741
- ## Error Handling
742
- How errors surface to users and are logged.
774
+ ## Change Log
743
775
 
744
- ## Security Considerations
745
- Auth, authorization, data sensitivity.
776
+ | Date | Change | Ref |
777
+ |------|--------|-----|
778
+ | 2026-04-02 | Initial creation | -- |
746
779
  ```
747
780
 
748
781
  Skip sections that don't apply. Match depth to feature complexity.
749
782
 
750
- ### Test Plan Format
751
-
752
- Generated by `/mf-plan` at `docs/test-plans/<feature-name>.md`:
753
-
754
- ```markdown
755
- # Test Plan: <Feature Name>
783
+ **Acceptance Scenario depth by priority:**
784
+ - **P0:** Full Given + When + Then + Data + Setup. At least 1 happy path + 1 error path.
785
+ - **P1:** Given + When + Then. At least 1 happy path.
786
+ - **P2:** 1-2 line flow description. At least 1 scenario.
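
The depth rules above amount to a required-field set per priority. The sketch below assumes a scenario has already been parsed into a dict with lowercase keys; that representation and the helper name `missing_fields` are assumptions made for illustration. It does not check the happy-path and error-path minimums.

```python
# Required fields per priority, taken from the depth list above.
REQUIRED_FIELDS = {
    "P0": {"given", "when", "then", "data", "setup"},
    "P1": {"given", "when", "then"},
    "P2": {"description"},
}


def missing_fields(priority: str, scenario: dict[str, str]) -> set[str]:
    """Return the required fields that are absent or blank for this priority."""
    present = {key for key, value in scenario.items() if value.strip()}
    return REQUIRED_FIELDS[priority] - present


print(missing_fields("P0", {"given": "a user", "when": "login", "then": "token"}))
```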
756
787
 
757
- **Spec:** docs/specs/<feature-name>.md
758
- **Generated:** 2026-04-01
788
+ ### Snapshots (Version History)
759
789
 
760
- ## Test Cases
790
+ When `/mf-plan` Mode C detects a Major change (new story, removed story, priority change, flow change, behavior change for P0, or constraint change), it automatically creates a snapshot before updating:
761
791
 
762
- | ID | Priority | Type | UC | FR/SC | Description | Expected |
763
- |----|----------|------|----|-------|-------------|----------|
764
- | TC-001 | P0 | unit | UC-001 | FR-001 | Valid login returns token | 200 + JWT |
765
- | TC-002 | P0 | unit | UC-001 | FR-002 | Wrong password returns 401 | 401 + error msg |
766
- | TC-003 | P1 | integration | UC-002 | FR-003 | Session expires after 24h | Auto-logout |
767
-
768
- ## Coverage Notes
769
- - Highest risk areas: ...
770
- - Edge cases needing attention: ...
792
+ ```
793
+ docs/specs/<feature>/snapshots/
794
+ 2026-04-02.md ← full copy at that point in time
795
+ 2026-04-05-BILL-101.md ← with ticket reference
771
796
  ```
772
797
 
773
- ### Naming Conventions
798
+ Snapshots are immutable, managed by mf-plan (not developers), and capped at 5 most recent.
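
Snapshot naming and the cap of 5 can be sketched like this. It is an illustration only: the README does not say how `/mf-plan` enforces the cap, so dropping the oldest files is an assumption, as are the helper names.

```python
from datetime import date
from typing import Optional

MAX_SNAPSHOTS = 5  # README: capped at 5 most recent


def snapshot_name(day: date, ref: Optional[str] = None) -> str:
    """Build YYYY-MM-DD.md, or YYYY-MM-DD-<REF>.md when a ticket ref is given."""
    stem = day.isoformat()
    return f"{stem}-{ref}.md" if ref else f"{stem}.md"


def snapshots_to_prune(names: list[str]) -> list[str]:
    """Return the oldest snapshot names beyond the cap (assumed policy)."""
    ordered = sorted(names)  # ISO dates sort chronologically as plain strings
    return ordered[:-MAX_SNAPSHOTS] if len(ordered) > MAX_SNAPSHOTS else []


print(snapshot_name(date(2026, 4, 5), "BILL-101"))
```

Two snapshots taken on the same day sort by their suffix, not by creation time, so a real implementation would need a tie-breaker.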
774
799
 
800
+ ### Naming Conventions
775
801
  | Item | Convention | Example |
776
802
  |------|-----------|---------|
777
- | Spec file | `<feature-name>.md` in `docs/specs/` | `user-auth.md` |
778
- | Test plan | Same name as spec, in `docs/test-plans/` | `user-auth.md` |
779
- | Test case ID | `TC-NNN` sequential | `TC-001`, `TC-042` |
780
- | Priority | `P0` (critical), `P1` (important), `P2` (nice-to-have) | — |
781
- | Type | `unit`, `integration`, `e2e`, `snapshot`, `performance` | — |
803
+ | Spec directory | `docs/specs/<feature>/` | `docs/specs/user-auth/` |
804
+ | Spec file | `<feature>.md` in feature directory | `user-auth.md` |
805
+ | Story ID | `S-NNN` sequential per spec | `S-001`, `S-005` |
806
+ | Scenario ID | `AS-NNN` sequential across all stories | `AS-001`, `AS-042` |
807
+ | Priority | `P0` (critical), `P1` (important), `P2` (nice-to-have) — per story | — |
808
+ | Snapshot | `YYYY-MM-DD.md` or `YYYY-MM-DD-<REF>.md` in `snapshots/` | `2026-04-02.md` |
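
The conventions in this table translate into simple patterns. The regular expressions below are a sketch, not something shipped with the toolkit; they assume exactly three digits for `NNN` and lowercase kebab-case feature names.

```python
import re

STORY_ID = re.compile(r"^S-\d{3}$")
SCENARIO_ID = re.compile(r"^AS-\d{3}$")
SNAPSHOT = re.compile(r"^\d{4}-\d{2}-\d{2}(-[A-Za-z0-9-]+)?\.md$")
# The backreference \1 requires the file name to repeat the directory name.
SPEC_PATH = re.compile(r"^docs/specs/([a-z0-9-]+)/\1\.md$")

print(bool(SPEC_PATH.match("docs/specs/user-auth/user-auth.md")))
```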
782
809
 
783
810
  ---
784
811
 
@@ -910,13 +937,13 @@ A: Only for external services you can't run locally (third-party APIs, email ser
910
937
  A: This usually means the spec is ambiguous. Clarify the spec first, then re-run `/mf-test`. Good specs produce good tests.
911
938
 
912
939
  **Q: Can I use this with other AI coding tools?**
913
- A: The commands and hooks are Claude Code-specific. The specs, test plans, workflow, and `build-test.sh` work with any tool or manual workflow.
940
+ A: The commands and hooks are Claude Code-specific. The specs, workflow, and `build-test.sh` work with any tool or manual workflow.
914
941
 
915
942
  **Q: When should I use `/mf-challenge`?**
916
943
  A: After `/mf-plan`, for complex features involving authentication, payments, data pipelines, or multi-service integration. It spawns parallel hostile reviewers that find security holes, failure modes, and false assumptions BEFORE you write code. Skip it for simple CRUD or small features — the overhead isn't worth it.
917
944
 
918
945
  **Q: How do I do a full coverage audit?**
919
- A: This is intentionally not a command (it's expensive and rare). When needed, prompt Claude directly: "Audit test coverage for feature X against docs/test-plans/X.md. Identify gaps and write missing tests."
946
+ A: This is intentionally not a command (it's expensive and rare). When needed, prompt Claude directly: "Audit test coverage for feature X against docs/specs/X/X.md acceptance scenarios. Identify gaps and write missing tests."
920
947
 
921
948
  **Q: What if my project uses multiple languages?**
922
949
  A: `build-test.sh` detects the first match. For monorepos, you may need to run it from each sub-project directory or customize the script.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-devkit-cli",
3
- "version": "1.2.5",
3
+ "version": "1.3.0",
4
4
  "description": "CLI toolkit for spec-first development with Claude Code — hooks, commands, guards, and test runners",
5
5
  "bin": {
6
6
  "claude-devkit": "./bin/devkit.js",
@@ -2,10 +2,10 @@
2
2
 
3
3
  ## Spec-First Development
4
4
 
5
- Every change follows this cycle: **SPEC → TEST PLAN → CODE + TESTS → BUILD PASS**.
5
+ Every change follows this cycle: **SPEC (with acceptance scenarios) → CODE + TESTS → BUILD PASS**.
6
6
 
7
- - Business logic specs live in `docs/specs/`
8
- - Test plans live in `docs/test-plans/`
7
+ - Business logic specs live in `docs/specs/<feature>/<feature>.md`
8
+ - Acceptance scenarios (Given/When/Then) are embedded in the spec under `## Stories`
9
9
  - Never write code before the spec exists. Never auto-modify specs from code.
10
10
  - Specs are the source of truth. If code contradicts the spec, the code is wrong.
11
11
 
@@ -14,9 +14,9 @@ Every change follows this cycle: **SPEC → TEST PLAN → CODE + TESTS → BUILD
14
14
  | Trigger | Commands | Details |
15
15
  |---------|----------|---------|
16
16
  | New feature | `/mf-plan` → `/mf-challenge` (optional) → code in chunks → `/mf-test` each chunk | Start with spec or description |
17
- | Update feature | Update spec first → `/mf-plan` → code → `/mf-test` | Spec changes before code changes |
17
+ | Update feature | `/mf-plan <spec-path> "changes"` → code → `/mf-test` | Do NOT manually edit spec before /mf-plan |
18
18
  | Bug fix | `/mf-fix "description"` | Test-first: write failing test → fix → green |
19
- | Remove feature | Mark spec as removed → delete code + tests → build pass | Run full suite after removal |
19
+ | Remove feature | `/mf-plan <spec-path> "remove stories"` → delete code + tests → build pass | /mf-plan handles snapshot before removal |
20
20
  | Pre-merge check | `/mf-review` | Diff-based quality gate |
21
21
  | Commit changes | `/mf-commit` | Secret scan + conventional commit |
22
22
 
@@ -49,11 +49,11 @@ For detailed workflow steps, templates, and decision trees, see `docs/WORKFLOW.m
49
49
  - **File naming:** Descriptive enough that AI tools understand the purpose from the path alone.
50
50
  Prefer kebab-case for new files (e.g., `user-authentication-service.ts`).
51
51
  - **Dates in filenames:** Use `$(date +%Y-%m-%d)` — never guess dates.
52
- - **Specs & test plans naming:**
53
- - kebab-case, lowercase: `user-auth.md`, `file-sync.md`
54
- - Feature name, not module name: `user-auth.md` not `AuthService.md`
55
- - Spec and test plan share the SAME name: `docs/specs/user-auth.md` ↔ `docs/test-plans/user-auth.md`
56
- - Short (2-3 words): `payment-flow.md` not `payment-processing-with-stripe-integration.md`
52
+ - **Spec naming:**
53
+ - kebab-case, lowercase: `user-auth/user-auth.md`, `file-sync/file-sync.md`
54
+ - Feature name, not module name: `user-auth/` not `AuthService/`
55
+ - Each feature gets its own directory: `docs/specs/<feature>/<feature>.md`
56
+ - Short (2-3 words): `payment-flow/` not `payment-processing-with-stripe-integration/`
57
57
  - No prefix/suffix: `user-auth.md` not `spec-user-auth.md`
58
58
 
59
59
  ## Forbidden
@@ -5,14 +5,12 @@ Adversarial review — spawn hostile reviewers to break the plan before coding.
5
5
  Target: $ARGUMENTS
6
6
 
7
7
  If argument is a file path → use that.
8
- If argument is a feature name → search `docs/test-plans/` and `docs/specs/` for matches.
9
- If no argument → list recent files in both dirs, ask user which to challenge.
8
+ If argument is a feature name → search `docs/specs/` for matches.
9
+ If no argument → list recent files in `docs/specs/`, ask user which to challenge.
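
The three argument cases can be sketched as one lookup. This is a simplified illustration: it works on a list of known spec paths instead of the file system, ignores the "recent files" ordering, and matches feature names by substring, all of which are assumptions.

```python
from pathlib import PurePosixPath


def resolve_target(argument: str, known_specs: list[str]) -> list[str]:
    """Return the spec(s) to challenge, or every spec when the user must choose."""
    if not argument:
        return list(known_specs)  # no argument: list the specs and ask
    if argument in known_specs:
        return [argument]         # argument is a file path: use it directly
    # Otherwise treat the argument as a feature name and search by file stem.
    needle = argument.strip().lower().replace(" ", "-")
    return [p for p in known_specs if needle in PurePosixPath(p).stem.lower()]


specs = ["docs/specs/auth/auth.md", "docs/specs/user-auth/user-auth.md"]
print(resolve_target("user auth", specs))
```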
10
10
 
11
11
  ## Phase 1: Read and Map
12
12
 
13
- Read the ENTIRE target file. Also read related files:
14
- - If target is a test plan → also read the corresponding spec in `docs/specs/`
15
- - If target is a spec → also read the corresponding test plan in `docs/test-plans/`
13
+ Read the ENTIRE target file. The spec contains both the feature definition and acceptance scenarios (in `## Stories` section).
16
14
 
17
15
  Map the plan's attack surface:
18
16
  - Decisions made (and what was rejected)
@@ -20,7 +18,7 @@ Map the plan's attack surface:
20
18
  - Dependencies (external services, APIs, libraries, infra)
21
19
  - Scope boundaries (in/out/suspiciously unmentioned)
22
20
  - Risk acknowledgments (mentioned vs. conspicuously absent)
23
- - Spec↔plan consistency (use cases without test cases? contradictions?)
21
+ - Story↔AS consistency (stories without acceptance scenarios? contradictions?)
24
22
 
25
23
  Collect all file paths the reviewers will need to read.
26
24
 
@@ -30,7 +28,7 @@ Assess plan complexity and select which lenses to deploy:
30
28
 
31
29
  | Complexity Signal | Reviewers | Lenses |
32
30
  |-------------------|-----------|--------|
33
- | Simple (1 spec section, <20 test cases, no auth/data) | 2 | Assumptions + Scope |
31
+ | Simple (1 spec section, <20 acceptance scenarios, no auth/data) | 2 | Assumptions + Scope |
34
32
  | Standard (multiple sections, auth or data involved) | 3 | + Security |
35
33
  | Complex (multiple integrations, concurrency, migrations, 6+ phases) | 4 | + Failure Modes |
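
The table above can be read as a lookup from complexity level to lenses. In the sketch below the lens lists come from the table, while the `classify` thresholds (for example, "multiple integrations" meaning more than one) are assumptions, since the command leaves that judgment to the model.

```python
LENSES_BY_COMPLEXITY = {
    "simple": ["Assumptions", "Scope"],
    "standard": ["Assumptions", "Scope", "Security"],
    "complex": ["Assumptions", "Scope", "Security", "Failure Modes"],
}


def classify(sections: int, scenarios: int, auth_or_data: bool,
             integrations: int = 0, concurrency: bool = False,
             migrations: bool = False) -> str:
    """Map rough complexity signals to a level (thresholds are assumed)."""
    if integrations > 1 or concurrency or migrations:
        return "complex"
    if sections == 1 and scenarios < 20 and not auth_or_data:
        return "simple"
    return "standard"


level = classify(sections=3, scenarios=12, auth_or_data=True)
print(level, LENSES_BY_COMPLEXITY[level])
```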
36
34
 
@@ -158,7 +156,7 @@ After all reviewers complete:
158
156
 
159
157
  4. **Sort** by severity: Critical → High → Medium → Low
160
158
  5. **Cap** at 15 findings: keep all Critical, top High by specificity, note how many Medium were dropped
161
- 6. **Cross-reference check** (you, not reviewers): If both spec and test plan exist, flag any use cases in spec without test cases, and any test cases that contradict the spec
159
+ 6. **Cross-reference check** (you, not reviewers): Flag any stories without acceptance scenarios, and any AS that contradicts the story description
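
The first half of the cross-reference check, stories without acceptance scenarios, is mechanical and can be sketched with two patterns. The sketch assumes the heading and scenario formats shown in the spec template (`### S-NNN:` and `AS-NNN:` at the start of a line). Detecting a scenario that contradicts its story still needs a reader.

```python
import re

STORY_HEADING = re.compile(r"^### (S-\d+):", re.MULTILINE)
SCENARIO_LINE = re.compile(r"^(AS-\d+):", re.MULTILINE)


def stories_without_scenarios(spec_text: str) -> list[str]:
    """Return IDs of stories whose section contains no AS-NNN line."""
    headings = list(STORY_HEADING.finditer(spec_text))
    missing = []
    for i, heading in enumerate(headings):
        # A story's section runs until the next story heading or end of file.
        end = headings[i + 1].start() if i + 1 < len(headings) else len(spec_text)
        if not SCENARIO_LINE.search(spec_text[heading.end():end]):
            missing.append(heading.group(1))
    return missing


spec = "### S-001: Login (P0)\n\nAS-001: valid login\n\n### S-002: Logout (P1)\n"
print(stories_without_scenarios(spec))
```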
162
160
 
163
161
  ## Phase 5: Adjudicate
164
162
 
@@ -62,7 +62,7 @@ After fixing, document:
62
62
  Symptom: <what the user saw>
63
63
  Root cause: <why it happened>
64
64
  Gap: <why not caught earlier — missing test? wrong assumption? missing spec?>
65
- Prevention: <suggest one: type constraint, validation, lint rule, spec update, or test plan update>
65
+ Prevention: <suggest one: type constraint, validation, lint rule, spec update (including acceptance scenarios)>
66
66
  ```
67
67
 
68
68
  This is non-optional for serious bugs. For trivial bugs, the fix summary is enough.
@@ -81,7 +81,7 @@ Prevention: <suggestion>
81
81
  Full suite: All passing ✓
82
82
  ```
83
83
 
84
- If the bug reveals an undocumented edge case: "Consider updating the spec at docs/specs/<feature>.md."
84
+ If the bug reveals an undocumented edge case: "Consider updating the spec at docs/specs/<feature>/<feature>.md."
85
85
 
86
86
  ## Multiple Bugs
87
87