@haposoft/cafekit 0.8.7 → 0.8.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -7,43 +7,39 @@
7
7
  **Dependencies:** {{DEPENDENCIES}}
8
8
  **Spec:** specs/{{FEATURE_NAME}}/
9
9
 
10
- ## Objective
10
+ ## Context
11
11
 
12
- {{Brief 1-2 sentence objective detailing WHAT to accomplish, not HOW. Must relate directly to requirement R{{REQ_NUMBER}}.}}
12
+ - **Why**: {{Business/user reason this task exists}}
13
+ - **Current state**: {{Relevant existing files, route, model, API, screen, or "greenfield"}}
14
+ - **Target outcome**: {{Observable behavior after this task is done}}
13
15
 
14
16
  ## Constraints
15
17
 
16
18
  - **MUST**: {{Non-negotiable requirement or technical constraint}}
17
19
  - **SHOULD**: {{Recommended approach or optimization}}
18
20
  - **MUST NOT**: {{Explicitly forbidden action or approach}}
21
+ - **SCOPE**: Implement only the behavior mapped to R{{REQ_NUMBER}} and the approved `scope_lock`; do not add out-of-scope features or leave scoped acceptance criteria unwired.
19
22
 
20
- ## Implementation Steps
21
-
22
- - [ ] 1. {{MAJOR_STEP_1}}
23
- - [ ] 1.1 {{Sub-task describing specific behavior/action}}
24
- - {{Detail: business logic, behavior, target validation}}
25
- - {{Detail: edge case or constraint}}
26
- - _Requirements: {{REQ_NUMBER}}.{{X}}_
27
- - [ ] 1.2 {{Next sub-task}}
28
- - {{Detail items}}
29
- - _Requirements: {{REQ_NUMBER}}.{{Y}}_
30
-
31
- - [ ] 2. {{MAJOR_STEP_2}}
32
- - [ ] 2.1 {{Sub-task}}
33
- - {{Details}}
34
- - _Requirements: {{REQ_NUMBER}}.{{Z}}_
35
- - [ ] 2.2 {{Sub-task}}
36
- - {{Details}}
37
- - _Requirements: {{REQ_NUMBER}}.{{W}}_
38
-
39
- - [ ] 3. Test coverage for R{{REQ_NUMBER}}
40
- - [ ] 3.1 Unit tests
41
- - {{Test case 1: target behavior to verify}}
42
- - {{Test case 2: edge case / error case}}
43
- - _Requirements: {{REQ_NUMBER}}_
44
- - [ ]* 3.2 Integration tests (optional for MVP)
45
- - {{Describe end-to-end flow to verify}}
46
- - _Requirements: {{REQ_NUMBER}}_
23
+ ## Steps
24
+
25
+ - [ ] 1. {{Actionable step with exact file/path/contract}}
26
+ - {{Business intent: what user/system behavior this enables}}
27
+ - {{Code detail: schema/API/component/function/route and validation rules}}
28
+ - _Requirements: {{REQ_NUMBER}}.{{X}}_
29
+
30
+ - [ ] 2. {{Next actionable step}}
31
+ - {{Business intent}}
32
+ - {{Code detail, edge case, or integration contract}}
33
+ - _Requirements: {{REQ_NUMBER}}.{{Y}}_
34
+
35
+ - [ ] 3. Verification implementation
36
+ - {{Unit/integration/e2e test or explicit manual verification hook}}
37
+ - _Requirements: {{REQ_NUMBER}}_
38
+
39
+ ## Requirements
40
+
41
+ - {{REQ_NUMBER}}.{{X}} — {{Acceptance criterion or requirement covered}}
42
+ - {{REQ_NUMBER}}.{{Y}} {{Acceptance criterion or requirement covered}}
47
43
 
48
44
  ## Related Files
49
45
 
@@ -57,17 +53,31 @@
57
53
  - [ ] {{Criteria 1 — observable output or artifact, maps to acceptance criteria R{{REQ_NUMBER}}}}
58
54
  - [ ] {{Criteria 2 — measurable behavior or negative-path outcome}}
59
55
  - [ ] {{Criteria 3 — maps directly to acceptance criteria from requirements.md and can be proven below}}
56
+ - [ ] {{Criteria 4 — no orphaned component/service/route/command; created runtime-facing work is reachable from the declared entrypoint or explicitly deferred to a named integration task}}
57
+
58
+ ## Evidence
60
59
 
61
- ## Task Test Plan & Verification Evidence
60
+ This section is both the task-level test plan and the proof checklist. Keep it short, exact, and executable.
61
+ Select the proof by task risk; do not run every test type for every task.
62
62
 
63
- This section is the task-level test plan. It names the exact commands, observable runtime/artifact proof, and negative-path checks required before this task can be marked done.
63
+ - Logic/data/validator task: include unit tests.
64
+ - Stateful UI/component task: include component or integration tests.
65
+ - Cross-module/API/state flow task: include integration tests.
66
+ - User-facing end-to-end workflow: include E2E/UI flow verification.
67
+ - Layout/theme/responsive task: include visual/runtime viewport checks.
68
+ - Interactive UI task: include accessibility checks when keyboard, focus, labels, or ARIA can regress.
69
+ - Scaffold/release task: include smoke build/test/dev-server checks.
70
+ - Performance/security checks are required only when the requirement, risk, or touched surface calls for them.
64
71
 
65
- - [ ] Automated verification
72
+ - [ ] Automated verification (unit/component/integration/E2E as applicable)
66
73
  - Command(s): `{{TYPECHECK / TEST / BUILD COMMANDS OR N/A}}`
67
74
  - Expected proof: {{What output, exit code, or report proves success}}
68
75
  - [ ] Artifact / runtime verification
69
76
  - Inspect: `{{artifact path | route | UI state | DB object | manifest entry}}`
70
77
  - Expect: {{Observable result that proves the task is really wired}}
78
+ - [ ] Runtime reachability verification
79
+ - Entrypoint/caller: `{{App.tsx | route file | CLI command | worker registration | manifest | API consumer}}`
80
+ - Expect: {{Created component/service/route/worker/loader is imported, mounted, registered, or invoked from the runtime path; if deferred, name the later integration task}}
71
81
  - [ ] Contract / negative-path verification
72
82
  - Check: {{Unauthorized path, validation error, permission omission, missing env behavior, deletion effect, etc.}}
73
83
  - Expect: {{Concrete failure mode or contract-preserving behavior}}
@@ -84,4 +94,4 @@ This section is the task-level test plan. It names the exact commands, observabl
84
94
  > **Parallel marker**: Append `(P)` to the title if this task can run concurrently with another (usually when serving different requirements).
85
95
  > **Test note**: If a test coverage sub-task can be deferred post-MVP, mark it with `- [ ]*`.
86
96
  > **Requirement mapping**: Every sub-task MUST end with `_Requirements: X.X_`. No mapping = invalid task file.
87
- > **Verification rule**: No `## Task Test Plan & Verification Evidence` section = invalid task file. Existing specs may use legacy `## Verification & Evidence`; agents must support both headings.
97
+ > **Evidence rule**: No `## Evidence` section = invalid task file. Existing specs may use `## Task Test Plan & Verification Evidence` or legacy `## Verification & Evidence`; agents must support all three headings.
@@ -34,8 +34,8 @@ Scans the `spec.json` against all physical `task-R*.md` files to detect mismatch
34
34
 
35
35
  1. **Precision Edits:** Never overwrite the entire `spec.json` string blindly. Update only the required keys, while keeping JSON valid.
36
36
  2. **Machine + Human Sync:** Every task status update MUST modify both `spec.json.task_registry[...]` and the matching markdown task file header/status section.
37
- 3. **Markdown Integrity:** When marking a task `done`, only then turn `[ ]` into `[x]` inside `## Implementation Steps` and relevant `Completion Criteria` / `Task Test Plan & Verification Evidence` checkboxes that have actual proof. Legacy `Verification & Evidence` sections are supported.
38
- 4. **Verification Receipt Rule:** `done` is illegal without a human-readable verification receipt already present in `## Task Test Plan & Verification Evidence` or legacy `## Verification & Evidence` (commands executed, artifact/runtime proof, or equivalent concrete evidence). If proof is missing, keep the task `in_progress` or `blocked`.
37
+ 3. **Markdown Integrity:** When marking a task `done`, only then turn `[ ]` into `[x]` inside `## Steps` / `## Implementation Steps` and relevant `Completion Criteria` / `Evidence` checkboxes that have actual proof. `Task Test Plan & Verification Evidence` and legacy `Verification & Evidence` sections are supported.
38
+ 4. **Verification Receipt Rule:** `done` is illegal without a human-readable verification receipt already present in `## Evidence`, `## Task Test Plan & Verification Evidence`, or legacy `## Verification & Evidence` (commands executed, artifact/runtime proof, or equivalent concrete evidence). If proof is missing, keep the task `in_progress` or `blocked`.
39
39
  5. **Task Docs Hook:** Every time `hapo:sync` marks a task as `done`, it must flag that a task-level docs checkpoint is now due for that verified task.
40
40
  6. **Phase Prompt Rule:** When `hapo:sync` marks the final pending task in the whole feature as `done`, it should automatically prompt the user if they'd like to advance the phase, but only after the docs checkpoint for that last completed task has been considered.
41
41
 
@@ -15,7 +15,7 @@ When requested to update a phase or change task configuration, `spec.json` must
15
15
  - full relative path like `tasks/task-R0-02-extension-shell.md`
16
16
  * **Status Update:** If a task changes to `blocked`, the matching `task_registry[path].status` must become `"blocked"`, `task_registry[path].blocker` must record the reason, and `spec.json.status` / `spec.json.blocker` must reflect the top-level block if work is globally blocked.
17
17
  * **Timestamp Rule:** Update `task_registry[path].started_at`, `completed_at`, and `last_updated_at` consistently with the new state. Also refresh `spec.json.updated_at`.
18
- * **Done-State Rule:** Never set `task_registry[path].status = "done"` unless the matching markdown task file already contains a verification receipt in `## Task Test Plan & Verification Evidence` or legacy `## Verification & Evidence`, or the caller explicitly provides proof that can be written there first.
18
+ * **Done-State Rule:** Never set `task_registry[path].status = "done"` unless the matching markdown task file already contains a verification receipt in `## Evidence`, `## Task Test Plan & Verification Evidence`, or legacy `## Verification & Evidence`, or the caller explicitly provides proof that can be written there first.
19
19
  * **Receipt Integrity Rule:** A valid verification receipt must include the exact commands run, their outcomes, and artifact/runtime proof. Receipts containing `PRECHECK_FAIL`, `FAIL`, `UNVERIFIED`, or explicit "placeholder / simplified for MVP / production later" contract deviations are not eligible for `done`.
20
20
  * **Contract Fidelity Rule:** If the task file notes or evidence show that a named framework/auth/runtime choice from the spec was silently replaced, sync MUST refuse `done` until the spec is amended or the implementation is corrected.
21
21
  * **Task Docs Rule:** After a task is moved to `done`, emit a short alert that a task-level docs checkpoint is due for this verified task.
@@ -27,12 +27,12 @@ The structure of `tasks/task.md` relies heavily on exact keyword markers. Follow
27
27
  ### A. Completing a Task
28
28
  When `/hapo:sync <feature> <task-id> done`:
29
29
  1. Find: `**Status:** pending` (or `in_progress` / `blocked`).
30
- 2. Inspect `## Task Test Plan & Verification Evidence` first. If the task uses legacy `## Verification & Evidence`, inspect that section instead. If it has no explicit proof lines (commands run, artifact proof, runtime proof, or blockers cleared), STOP and refuse to mark the task done.
30
+ 2. Inspect `## Evidence` first. If the task uses `## Task Test Plan & Verification Evidence` or legacy `## Verification & Evidence`, inspect that section instead. If it has no explicit proof lines (commands run, artifact proof, runtime proof, or blockers cleared), STOP and refuse to mark the task done.
31
31
  3. Refuse completion if the receipt contains any non-passing marker such as `PRECHECK_FAIL`, `FAIL`, `UNVERIFIED`, or an explicit note that the implementation substituted a named contract with a placeholder/custom simplification.
32
32
  4. Replace with: `**Status:** done`.
33
- 5. Locate block: `## Implementation Steps`.
33
+ 5. Locate block: `## Steps` or `## Implementation Steps`.
34
34
  6. Convert `- [ ]` into `- [x]` strictly within that section.
35
- 7. Update relevant checkboxes in `## Completion Criteria` and `## Task Test Plan & Verification Evidence` only when the caller provides or the file already contains real proof. For legacy task files, update `## Verification & Evidence` instead.
35
+ 7. Update relevant checkboxes in `## Completion Criteria` and `## Evidence` only when the caller provides or the file already contains real proof. For v0.8 or legacy task files, update `## Task Test Plan & Verification Evidence` or `## Verification & Evidence` instead.
36
36
  8. Surface a note such as: `Docs checkpoint due: task Rn-mm just completed`.
37
37
 
38
38
  ### B. Blocking a Task
@@ -59,7 +59,7 @@ When `/hapo:sync audit <feature>` is activated:
59
59
  - Missing disk file referenced in registry → remove or flag it
60
60
  - Markdown says `done` but registry not done → registry wins only if evidence already exists; otherwise downgrade markdown or flag conflict
61
61
  - Registry says `done` but markdown still pending → update markdown only if evidence exists
62
- - Either side says `done` but `## Task Test Plan & Verification Evidence` / legacy `## Verification & Evidence` has no concrete proof → downgrade to `in_progress` or flag conflict instead of preserving fake completion
62
+ - Either side says `done` but `## Evidence` / `## Task Test Plan & Verification Evidence` / legacy `## Verification & Evidence` has no concrete proof → downgrade to `in_progress` or flag conflict instead of preserving fake completion
63
63
  - Either side says `done` but the receipt contains `PRECHECK_FAIL`, `FAIL`, `UNVERIFIED`, or explicit contract-substitution notes → downgrade to `in_progress` or flag conflict
64
64
  5. **Correction Alert:** Output a brief markdown alert detailing mismatches fixed and any unresolved conflicts requiring manual review.
65
65
  6. **Task Docs Alert:** If audit reveals tasks newly marked `done`, include whether task-level docs sync appears still due or already accounted for in the current run summary.
@@ -18,6 +18,8 @@ Designed to work **after `hapo:develop`**. Standalone `/hapo:test` uses the same
18
18
  /hapo:test # Blast-radius mode: only tests affected by recent changes
19
19
  /hapo:test --full # Run full test suite regardless of changes
20
20
  /hapo:test <scope> # Test a specific module or path
21
+ /hapo:test <feature-name> # Spec-aware test: load specs/<feature-name> and verify scope/task evidence
22
+ /hapo:test specs/<feature> # Spec-aware test by spec directory
21
23
  /hapo:test --ui <url> # UI verification via chrome-devtools (public pages)
22
24
  /hapo:test --ui-auth <url> # UI verification with auth injection (protected pages)
23
25
  /hapo:test --ui-flow <url> # UI testing with User Journey (form fill/submit simulation)
@@ -31,6 +33,13 @@ If a test command exits 0 but runs 0 tests, report NO_TESTS — this is a green
31
33
  If tests fail, list every failure explicitly — do not summarize failures away.
32
34
  </HARD-GATE>
33
35
 
36
+ <SCOPE-GATE>
37
+ When a feature name or `specs/<feature>` path is supplied, testing is spec-aware.
38
+ Load `spec.json`, `requirements.md`, `design.md`, active/recent task files, and task `Evidence` / test-plan proof.
39
+ The verdict MUST compare executed/reachable behavior against `scope_lock`, requirements, design contracts, task Completion Criteria, and runtime reachability obligations.
40
+ Build/typecheck success without scoped runtime proof is not PASS.
41
+ </SCOPE-GATE>
42
+
34
43
  ## 4-Phase Execution
35
44
 
36
45
  ```mermaid
@@ -61,6 +70,12 @@ Auto-detect the test runner from project files:
61
70
  Unless `--full` is specified: apply **Blast Radius scoping** to run only tests
62
71
  affected by recent file changes. See `references/execution-strategy.md` Phase A.
63
72
 
73
+ If the argument resolves to `specs/<feature>` or a feature directory under `specs/`, enter **Spec-Aware Mode**:
74
+ - Load `spec.json`, `requirements.md`, `design.md`, and task files referenced by `task_registry`
75
+ - Identify tasks marked `done`, `in_progress`, or recently changed
76
+ - Extract exact commands, runtime/artifact proof, runtime reachability proof, and negative-path checks
77
+ - Scope test selection by affected task files, but do not skip any mandatory task evidence
78
+
64
79
  ### Phase 2 — Execute
65
80
 
66
81
  **Code testing (default):**
@@ -68,6 +83,16 @@ affected by recent file changes. See `references/execution-strategy.md` Phase A.
68
83
  2. Execute test command with coverage flags
69
84
  3. Collect test counts, coverage percentages, and fail stack traces
70
85
  4. Treat 0 executed tests as `NO_TESTS`, even if the command exits 0
86
+ 5. In Spec-Aware Mode, inspect runtime reachability from declared entrypoints/callers and fail if scoped surfaces are missing or orphaned
87
+
88
+ **Spec-aware test type escalation:**
89
+ - Unit tests are mandatory when task evidence covers pure logic, transforms, validators, sorting/filtering, or regressions.
90
+ - Component/integration tests are expected when task evidence covers stateful UI, context/store wiring, API/service boundaries, or persistence.
91
+ - E2E/UI flow tests are expected once a complete user-facing workflow exists, not for isolated foundation tasks.
92
+ - Visual/responsive checks are expected for layout, theme, dashboard, and style tasks.
93
+ - Accessibility checks are expected for interactive UI surfaces where focus, roles, labels, keyboard navigation, or ARIA can regress.
94
+ - Smoke checks are enough for scaffold/config tasks unless the task requires deeper proof.
95
+ - Performance/security checks are only mandatory when the requirement, design risk, or touched runtime boundary calls for them.
71
96
 
72
97
  **UI verification (`--ui` / `--ui-auth` / `--ui-flow`):**
73
98
  Execute multi-page discovery, then spawn **Parallel UI Subagents** (test-runner instances) to handle Smoke, Core-Vitals, Accessibility, SEO, Security, and User Flows simultaneously.
@@ -76,7 +101,7 @@ See `references/execution-strategy.md` Phase C for full phase breakdown.
76
101
  Delegate execution to `test-runner` agent:
77
102
  ```
78
103
  Agent(subagent_type="test-runner",
79
- prompt="Run tests. Scope: [blast-radius|full|ui]. Target: [path|url]. Return structured verdict.",
104
+ prompt="Run tests. Scope: [blast-radius|full|ui|spec-aware]. Target: [path|url|feature]. Load specs when target is a feature. Return structured verdict with scope/spec coverage and runtime reachability.",
80
105
  description="Test [feature]")
81
106
  ```
82
107
 
@@ -111,6 +136,12 @@ Return a **structured verdict** (required format — not free-form prose):
111
136
  - Accessibility issues: N found | none
112
137
  - Screenshots: [paths]
113
138
 
139
+ ### Scope / Spec Coverage (if feature scope)
140
+ - Requirements covered: N/N
141
+ - Task evidence checks: PASS | FAIL | UNVERIFIED
142
+ - Runtime reachability: PASS | FAIL | UNVERIFIED
143
+ - Out-of-scope behavior detected: none | [list]
144
+
114
145
  ### Test Regression Check
115
146
  - **Comparison:** Compare current test count and assertion depth against previous runs.
116
147
  - **Result:** OK | REGRESSION (tests deleted/weakened)