@haposoft/cafekit 0.7.24 → 0.7.26
- package/package.json +1 -1
- package/src/claude/agents/code-auditor.md +8 -3
- package/src/claude/agents/docs-keeper.md +12 -1
- package/src/claude/agents/god-developer.md +3 -3
- package/src/claude/agents/spec-maker.md +14 -7
- package/src/claude/agents/test-runner.md +14 -0
- package/src/claude/hooks/privacy-block.cjs +133 -80
- package/src/claude/hooks/spec-state.cjs +21 -2
- package/src/claude/hooks/state.cjs +164 -122
- package/src/claude/rules/manage-docs.md +6 -5
- package/src/claude/rules/state-sync.md +7 -6
- package/src/claude/settings/settings.json +10 -1
- package/src/claude/skills/develop/SKILL.md +51 -9
- package/src/claude/skills/develop/references/quality-gate.md +13 -4
- package/src/claude/skills/specs/SKILL.md +33 -6
- package/src/claude/skills/specs/references/review.md +32 -3
- package/src/claude/skills/specs/references/task-hydration.md +18 -16
- package/src/claude/skills/specs/rules/tasks-generation.md +4 -0
- package/src/claude/skills/specs/templates/design.md +2 -1
- package/src/claude/skills/specs/templates/init.json +3 -1
- package/src/claude/skills/sync/SKILL.md +13 -10
- package/src/claude/skills/sync/references/sync-protocols.md +39 -13
- package/src/claude/skills/test/SKILL.md +2 -2
- package/src/claude/skills/test/references/execution-strategy.md +2 -3
|
@@ -7,6 +7,13 @@ Green tests are NOT enough. The gate requires three proofs:
|
|
|
7
7
|
2. Code/spec review
|
|
8
8
|
3. Task evidence (completion criteria + runtime/artifact proof from the task file)
|
|
9
9
|
|
|
10
|
+
## Automation Semantics
|
|
11
|
+
|
|
12
|
+
- If the task names exact commands in `Verification & Evidence`, those exact commands are mandatory and must run before any fallback repo defaults.
|
|
13
|
+
- `NO_TESTS` is never an automatic PASS.
|
|
14
|
+
- `NO_TESTS` is acceptable only when the task does **not** require a dedicated test suite command and every other required automated command/evidence item passes.
|
|
15
|
+
- If the task explicitly requires tests and the repo has no such test command or suite, the task is FAIL or BLOCKED, not done.
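The `NO_TESTS` rules above can be read as a small decision function. The following is a minimal sketch, not the package's implementation: the `Verification` record, its field names, and the choice to return `FAIL` (the text allows FAIL or BLOCKED) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Verification:
    """Hypothetical summary of one task's automated verification run."""
    test_status: str         # "PASS", "FAIL", or "NO_TESTS"
    tests_required: bool     # the task names a dedicated test suite command
    other_checks_pass: bool  # every other required command/evidence item passed


def automation_verdict(v: Verification) -> str:
    """Apply the NO_TESTS semantics: it is never an automatic PASS."""
    if v.test_status == "FAIL":
        return "FAIL"
    if v.test_status == "NO_TESTS" and v.tests_required:
        # Tests were demanded but none exist: the task is not done.
        return "FAIL"
    # Either tests passed, or NO_TESTS is tolerated because no suite was required;
    # in both cases every other required check must still pass.
    return "PASS" if v.other_checks_pass else "FAIL"
```

Under this reading, `NO_TESTS` only passes when no test suite was required and everything else is green.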
|
|
16
|
+
|
|
10
17
|
## Parallel Quality Cycle
|
|
11
18
|
|
|
12
19
|
Maximum retry counter: **3 attempts**. Reaching 3 triggers a collapse warning.
|
|
@@ -17,6 +24,7 @@ Variable: retry_count = 0
|
|
|
17
24
|
Before START_LOOP:
|
|
18
25
|
- Read the active task file(s)
|
|
19
26
|
- Extract Related Files, Completion Criteria, Verification & Evidence
|
|
27
|
+
- Extract the exact executable verification commands in declaration order
|
|
20
28
|
- Extract relevant design contracts/invariants for the touched area
|
|
21
29
|
- If any of these are missing or too vague to verify, FAIL immediately and route back to spec correction
|
|
22
30
|
|
|
@@ -25,11 +33,11 @@ START_LOOP:
|
|
|
25
33
|
PARALLEL GATE: Spawn BOTH agents simultaneously
|
|
26
34
|
---------------------------------------------------------------
|
|
27
35
|
→ Task(subagent_type="test-runner",
|
|
28
|
-
prompt="Run task-aware verification for the recently implemented code. Read the active task file(s) and execute
|
|
36
|
+
prompt="Run task-aware verification for the recently implemented code. Read the active task file(s) and execute the exact verification commands named there first, in order. After that, run any additional repo-level typecheck/test/build checks needed for confidence. Inspect named artifacts/runtime outputs. Return PASS only if automated checks and task evidence both pass. Mark anything unexecuted as UNVERIFIED. Treat NO_TESTS as non-passing unless the task did not require a dedicated test suite.",
|
|
29
37
|
description="Test [feature]")
|
|
30
38
|
|
|
31
39
|
→ Task(subagent_type="code-auditor",
|
|
32
|
-
prompt="Review all recently written code against the active task file(s), referenced requirements, and design contracts. Missing deliverables, placeholder-only wiring, missing runtime entrypoints, or contract drift are Critical even if build/tests pass. Check security, logic, architecture, YAGNI/KISS/DRY. Return score (X/10), critical count, warning list, and evidence gaps.",
|
|
40
|
+
prompt="Review all recently written code against the active task file(s), referenced requirements, and design contracts. Missing deliverables, placeholder-only wiring, missing runtime entrypoints, overscope edits outside the task packet, or contract drift are Critical even if build/tests pass. Check security, logic, architecture, YAGNI/KISS/DRY. Return score (X/10), critical count, warning list, and evidence gaps.",
|
|
33
41
|
description="Review [feature]")
|
|
34
42
|
|
|
35
43
|
Wait for BOTH to return results.
|
|
@@ -38,7 +46,7 @@ START_LOOP:
|
|
|
38
46
|
COMBINE RESULTS
|
|
39
47
|
---------------------------------------------------------------
|
|
40
48
|
|
|
41
|
-
CASE 1 —
|
|
49
|
+
CASE 1 — Automated FAIL OR required command missing OR Evidence FAIL / UNVERIFIED OR NO_TESTS when tests were required:
|
|
42
50
|
- Increment retry_count++
|
|
43
51
|
- If retry_count >= 3:
|
|
44
52
|
→ COLLAPSE! AskUserQuestion: "Quality gate cannot prove this task is complete! User intervention required!"
|
|
@@ -56,7 +64,7 @@ START_LOOP:
|
|
|
56
64
|
|
|
57
65
|
CASE 3 — Test PASS + Evidence PASS + Review PASS (Score >= 9.5 AND Critical = 0):
|
|
58
66
|
→ PASS! Auto-approved.
|
|
59
|
-
→ PROCEED to completion report.
|
|
67
|
+
→ PROCEED to completion report with a verification receipt summarizing exact commands executed, artifact/runtime proof, and review result.
|
|
60
68
|
|
|
61
69
|
REVIEW_ONLY:
|
|
62
70
|
---------------------------------------------------------------
|
|
@@ -77,6 +85,7 @@ REVIEW_ONLY:
|
|
|
77
85
|
- **Architecture:** Breaking MVC boundaries, cross-module coupling, convention violations.
|
|
78
86
|
- **Principles:** YAGNI violations, KISS violations, DRY violations (excessive code duplication).
|
|
79
87
|
- **Evidence / Done-Criteria Drift:** Missing required artifacts, placeholder-only wiring, missing entrypoints, unproven completion criteria, or runtime contract mismatches.
|
|
88
|
+
- **Overscope Delivery Drift:** Implementing later-task deliverables or editing out-of-scope files without direct justification for the active task.
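The CASE 1 and CASE 3 branches of the quality cycle in this diff, together with the retry ceiling, could be modeled roughly as follows. This is a sketch under stated assumptions: `GateInput`, `classify`, and `next_action` are hypothetical names, the `"RETRY"` action is assumed because the steps between the collapse check and CASE 3 are not shown in this excerpt, and any other branch is reported as unspecified rather than guessed.

```python
from dataclasses import dataclass

MAX_RETRIES = 3   # "Maximum retry counter: 3 attempts"
PASS_SCORE = 9.5  # review threshold named in CASE 3


@dataclass
class GateInput:
    """Hypothetical combined result of the test-runner and code-auditor agents."""
    automated: str                  # "PASS", "FAIL", or "NO_TESTS"
    evidence: str                   # "PASS", "FAIL", or "UNVERIFIED"
    required_command_missing: bool
    tests_required: bool
    review_score: float
    critical_count: int


def classify(g: GateInput) -> str:
    """Map a combined result onto the cases visible in this excerpt."""
    case_1 = (
        g.automated == "FAIL"
        or g.required_command_missing
        or g.evidence in ("FAIL", "UNVERIFIED")
        or (g.automated == "NO_TESTS" and g.tests_required)
    )
    if case_1:
        return "CASE_1"
    tests_ok = g.automated == "PASS" or (
        g.automated == "NO_TESTS" and not g.tests_required
    )
    review_ok = g.review_score >= PASS_SCORE and g.critical_count == 0
    if tests_ok and g.evidence == "PASS" and review_ok:
        return "CASE_3"
    return "OTHER"  # branches not shown in this excerpt


def next_action(g: GateInput, retry_count: int):
    """Return (action, updated retry_count)."""
    case = classify(g)
    if case == "CASE_1":
        retry_count += 1
        if retry_count >= MAX_RETRIES:
            return "COLLAPSE", retry_count
        return "RETRY", retry_count  # assumed continuation of the loop
    if case == "CASE_3":
        return "PROCEED", retry_count
    return "UNSPECIFIED", retry_count
```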
|
|
80
89
|
|
|
81
90
|
## Terminal Log Format
|
|
82
91
|
|
|
@@ -44,6 +44,7 @@ Analyze → Dependency Scan → Complexity Assessment → Init → Requirements
|
|
|
44
44
|
- Canonical active status string is `in_progress`. Legacy `in-progress` may be READ for compatibility but MUST NOT be generated in new specs.
|
|
45
45
|
- `current_phase` is required for live work and must track the active phase (`init`, `requirements`, `design`, `tasks`, `develop`, `test`, `review`).
|
|
46
46
|
- `task_files` in `spec.json` MUST exactly match the real files under `tasks/` after Step 7.
|
|
47
|
+
- `task_registry` in `spec.json` MUST exist once task files are generated and MUST contain one entry per task file, keyed by relative path.
|
|
47
48
|
- `ready_for_implementation` is a hard gate, not a convenience flag. Never set it before the finalization audit passes.
|
|
48
49
|
|
|
49
50
|
### Output Criteria
|
|
@@ -132,7 +133,7 @@ flowchart TD
|
|
|
132
133
|
P --> Q["Step 6: Design — pick discovery mode"]
|
|
133
134
|
Q --> R["Write design.md"]
|
|
134
135
|
R --> S["Step 7: Tasks — split into individual files"]
|
|
135
|
-
S --> T["Create tasks/task-
|
|
136
|
+
S --> T["Create tasks/task-R*.md + task_registry"]
|
|
136
137
|
T --> U["Step 8: Hydrate Claude Tasks if >= 3 task files"]
|
|
137
138
|
U --> V{Review?}
|
|
138
139
|
V -->|Yes| W["Run review — auto-pick red team or validation"]
|
|
@@ -218,6 +219,15 @@ Load: `references/scope-inquiry.md`
|
|
|
218
219
|
- Load `rules/tasks-parallel-analysis.md` for parallel markers (default: enabled)
|
|
219
220
|
- Each task file follows template `templates/task.md`
|
|
220
221
|
- Each task file MUST include `Completion Criteria` and `Verification & Evidence` sections detailed enough that a downstream quality gate can prove the task is truly done.
|
|
222
|
+
- Build `spec.json.task_registry` alongside `task_files`. For each task file, register at minimum:
|
|
223
|
+
- `id`
|
|
224
|
+
- `title`
|
|
225
|
+
- `status` (`pending` by default)
|
|
226
|
+
- `dependencies` (relative task paths, not prose labels)
|
|
227
|
+
- `blocker`
|
|
228
|
+
- `started_at`
|
|
229
|
+
- `completed_at`
|
|
230
|
+
- `last_updated_at`
|
|
221
231
|
- Update `spec.json` phase + task metadata
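As an illustration of the registry shape described above, one entry per task file could be built like this. It is a minimal sketch: the helper name `new_registry_entry`, the `None` defaults for the unset fields, and the `R1-01` style of `id` are assumptions; only the field names, the `pending` default, and the relative-path keys come from the text.

```python
from __future__ import annotations


def new_registry_entry(task_id: str, title: str,
                       dependencies: list[str] | None = None) -> dict:
    """Build one task_registry value with the minimum required fields."""
    return {
        "id": task_id,
        "title": title,
        "status": "pending",                        # default status per the spec text
        "dependencies": list(dependencies or []),   # relative task paths, not prose labels
        "blocker": None,                            # assumed empty value
        "started_at": None,
        "completed_at": None,
        "last_updated_at": None,
    }


# Keys are relative task paths, matching the entries in task_files.
task_registry = {
    "tasks/task-R1-01-example.md": new_registry_entry("R1-01", "Example task"),
    "tasks/task-R1-02-followup.md": new_registry_entry(
        "R1-02", "Follow-up task", ["tasks/task-R1-01-example.md"]
    ),
}
```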
|
|
222
232
|
|
|
223
233
|
#### Requirement-Driven Task Grouping (MANDATORY)
|
|
@@ -237,7 +247,7 @@ tasks/
|
|
|
237
247
|
├── task-R1-02-gap-marker-detector.md
|
|
238
248
|
├── task-R1-03-chunk-api-endpoint.md
|
|
239
249
|
├── task-R2-01-summarize-orchestrator.md
|
|
240
|
-
├── task-R2-02-
|
|
250
|
+
├── task-R2-02-provider-integration.md
|
|
241
251
|
├── task-R3-01-consent-onboarding.md
|
|
242
252
|
...
|
|
243
253
|
```
|
|
@@ -291,13 +301,24 @@ Load: `references/review.md` + `rules/design-review.md`
|
|
|
291
301
|
- When both run: Red Team ALWAYS before Validate (red team may change the spec)
|
|
292
302
|
- **PROHIBITION:** The system MUST NOT skip Red Team because of a prior code-auditor review. Code review ≠ Spec review.
|
|
293
303
|
- **PROHIBITION:** The system MUST NOT create `.ts`, `.js`, `.py` or any implementation files during validation. Spec-only outputs.
|
|
304
|
+
- **Reconciliation Rule:** `validation.status = "completed"` is forbidden until all accepted findings and validation decisions are physically propagated into `requirements.md`, `design.md`, `tasks/*.md`, and `spec.json` where applicable.
|
|
294
305
|
|
|
295
306
|
### Step 9.5: Finalization Audit (MANDATORY)
|
|
296
307
|
- Re-scan the `tasks/` directory and rebuild `spec.json.task_files` from the real filesystem (sorted, relative paths)
|
|
308
|
+
- Rebuild `spec.json.task_registry` from the real filesystem if it is missing, stale, or missing keys. Preserve task status fields when the path still matches.
|
|
297
309
|
- FAIL if any task file exists on disk but is missing from `task_files`
|
|
298
310
|
- FAIL if any path in `task_files` does not exist on disk
|
|
311
|
+
- FAIL if any task file exists on disk but is missing from `task_registry`
|
|
312
|
+
- FAIL if any path in `task_registry` does not exist on disk
|
|
299
313
|
- FAIL if any requirement or NFR mapping uses non-numeric labels (`NFR-1`, `SEC-1`, etc.)
|
|
300
314
|
- FAIL if a task lacks `Completion Criteria` or `Verification & Evidence`
|
|
315
|
+
- FAIL if accepted validation decisions exist in reports but are not reflected in the implementation-facing sections of affected artifacts (`Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, `Verification & Evidence`, canonical contracts, or requirements text).
|
|
316
|
+
- FAIL if the spec scope/provider was switched away from Anthropic/Claude but `requirements.md`, `design.md`, or `tasks/*.md` still contain stale provider-specific strings such as `Claude API`, `Haiku`, or `haiku_reachable`. `research.md` is the only allowed place for historical cost comparisons.
|
|
317
|
+
- FAIL if privacy/delete-data work lacks a single canonical deletion policy. The design MUST explicitly choose either:
|
|
318
|
+
1. hard-delete with no re-registration lock, or
|
|
319
|
+
2. privacy-preserving re-registration lock using a non-raw identifier (for example `email_hash` / `email_fingerprint`) with a retention period.
|
|
320
|
+
Tasks and requirements must reuse the same policy verbatim; mixed policies are invalid.
|
|
321
|
+
- FAIL if `validation.status = "completed"` but `timestamps.validation_done` / `timestamps.review_done`, `updated_at`, and report metadata are not synchronized with the final reviewed state.
|
|
301
322
|
- If `validation_recommended = true` and `validation.status` is not `completed` (or an explicit accepted-risk state recorded by the user), `ready_for_implementation` MUST remain `false`
|
|
302
323
|
- Only after this audit passes may the system mark `progress.tasks = "done"` and `ready_for_implementation = true`
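The four inventory checks in this audit (task files versus `task_files` and `task_registry`) can be sketched as a set comparison. The function names and message wording below are hypothetical, and the remaining checks (label format, provider strings, deletion policy, timestamps) are deliberately left out.

```python
from __future__ import annotations

from pathlib import Path


def list_task_files(spec_dir: Path) -> list[str]:
    """Return sorted task paths relative to the spec directory."""
    tasks_dir = spec_dir / "tasks"
    return sorted(p.relative_to(spec_dir).as_posix() for p in tasks_dir.glob("task-*.md"))


def audit_inventory(on_disk: list[str], spec: dict) -> list[str]:
    """Return one failure message per mismatch; an empty list means this part passes."""
    failures = []
    disk = set(on_disk)
    for key in ("task_files", "task_registry"):
        # task_files is a list of paths; task_registry is a dict keyed by path.
        listed = set(spec.get(key) or [])
        for path in sorted(disk - listed):
            failures.append(f"{path} exists on disk but is missing from {key}")
        for path in sorted(listed - disk):
            failures.append(f"{path} is in {key} but does not exist on disk")
    return failures
```

A caller would pass `list_task_files(spec_dir)` together with the parsed `spec.json` and keep `ready_for_implementation` at `false` whenever the returned list is non-empty.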
|
|
303
324
|
|
|
@@ -329,7 +350,8 @@ When user calls `hapo:specs`, system checks `specs/`:
|
|
|
329
350
|
| `init` done | "Next: write requirements" |
|
|
330
351
|
| `requirements` done | "Next: architectural design" |
|
|
331
352
|
| `design` done | "Next: break into tasks" |
|
|
332
|
-
| `tasks` done | "Next: `/hapo:
|
|
353
|
+
| `tasks` done + validation recommended but incomplete | "Next: `/hapo:specs --validate <feature>`" |
|
|
354
|
+
| `tasks` done + ready_for_implementation = true | "Next: `/hapo:develop <feature>`" |
|
|
333
355
|
| Spec is `blocked` | "Warning: spec `X` is blocking this spec" |
|
|
334
356
|
|
|
335
357
|
**State persistence:** Update `spec.json` `phase` field on each transition. `spec.json` is the single source of truth.
|
|
@@ -338,7 +360,7 @@ When user calls `hapo:specs`, system checks `specs/`:
|
|
|
338
360
|
|
|
339
361
|
**Canonical status vocabulary:** Use `in_progress`, `blocked`, `done`, and `archived`. New specs MUST NOT emit `in-progress`.
|
|
340
362
|
|
|
341
|
-
**Timestamps:** Each `timestamps.*_done` field MUST use the **actual current time** (ISO 8601 with timezone) when that specific phase completes. Do NOT reuse the `init` timestamp for later phases. If running the full pipeline end-to-end, capture a fresh timestamp at each phase transition.
|
|
363
|
+
**Timestamps:** Each `timestamps.*_done` field MUST use the **actual current time** (ISO 8601 with timezone) when that specific phase completes. This includes `review_done` and `validation_done` after review/validate workflows. Do NOT reuse the `init` timestamp for later phases. If running the full pipeline end-to-end, capture a fresh timestamp at each phase transition.
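A fresh, timezone-aware ISO 8601 timestamp per phase transition could be captured as in the sketch below. The helper names are hypothetical, and using UTC is an assumption; the text only requires that a timezone be present.

```python
from datetime import datetime, timezone


def phase_timestamp() -> str:
    """Return the current time as ISO 8601 with an explicit UTC offset."""
    return datetime.now(timezone.utc).isoformat()


def mark_phase_done(spec: dict, phase: str) -> dict:
    """Write timestamps.<phase>_done using the time of this call, never a reused value."""
    spec.setdefault("timestamps", {})[f"{phase}_done"] = phase_timestamp()
    return spec
```

Calling `mark_phase_done(spec, "validation")` at the moment validation completes keeps `validation_done` independent of the `init` timestamp.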
|
|
342
364
|
|
|
343
365
|
**Approvals (auto-approval behavior):**
|
|
344
366
|
- When running the **full pipeline end-to-end** (init → tasks in one session): set `approvals.{phase}.generated = true` AND `approvals.{phase}.approved = true` for each completed phase before proceeding to the next.
|
|
@@ -346,6 +368,8 @@ When user calls `hapo:specs`, system checks `specs/`:
|
|
|
346
368
|
|
|
347
369
|
**Task inventory:** `task_files` MUST be present and MUST list every real task file exactly once using relative paths like `tasks/task-R1-01-example.md`.
|
|
348
370
|
|
|
371
|
+
**Task machine-state:** `task_registry` MUST be present after Step 7. Each key is a relative task path, and each value MUST contain `id`, `title`, `status`, `dependencies`, `blocker`, `started_at`, `completed_at`, and `last_updated_at`.
|
|
372
|
+
|
|
349
373
|
**Validation recommendation:** `design_context.validation_recommended` MUST be `true` for auth, privacy, delete-data, migration, schema-change, browser-extension-permission, external-provider, or 5+ task file specs.
|
|
350
374
|
|
|
351
375
|
**`ready_for_implementation`:** This field MUST only be set to `true` when ALL of the following conditions are met:
|
|
@@ -354,7 +378,8 @@ When user calls `hapo:specs`, system checks `specs/`:
|
|
|
354
378
|
3. `approvals.tasks.approved = true`
|
|
355
379
|
4. `progress.tasks = "done"`
|
|
356
380
|
5. `task_files` matches the real filesystem
|
|
357
|
-
6.
|
|
381
|
+
6. `task_registry` matches the real filesystem and does not omit any task file
|
|
382
|
+
7. If `design_context.validation_recommended = true`, `validation.status = "completed"` (or another explicit user-accepted risk state that is recorded)
|
|
358
383
|
|
|
359
384
|
If any approval is `false`, `ready_for_implementation` MUST remain `false`.
|
|
360
385
|
|
|
@@ -363,7 +388,7 @@ If any approval is `false`, `ready_for_implementation` MUST remain `false`.
|
|
|
363
388
|
```
|
|
364
389
|
specs/
|
|
365
390
|
└── <feature-name>/
|
|
366
|
-
├── spec.json # Metadata, state, scope_lock, dependencies
|
|
391
|
+
├── spec.json # Metadata, state, scope_lock, dependencies, task_registry
|
|
367
392
|
├── requirements.md # Technical requirements (EARS format)
|
|
368
393
|
├── research.md # Research notes
|
|
369
394
|
├── design.md # Architectural design
|
|
@@ -427,7 +452,9 @@ Before finalizing any specification, assert all the following:
|
|
|
427
452
|
- [ ] **Risk matrix filled**: likelihood × impact, with mitigation for High items
|
|
428
453
|
- [ ] **Test strategy defined**: what gets unit tested, integration tested, e2e validated
|
|
429
454
|
- [ ] **task_files inventory synced**: no missing or orphaned task references
|
|
455
|
+
- [ ] **task_registry synced**: every task file has exactly one machine-state entry with valid status + dependencies
|
|
430
456
|
- [ ] **Validation gate consistent**: validation_recommended and validation.status agree with spec risk
|
|
457
|
+
- [ ] **Provider wording clean**: no stale vendor/provider strings outside allowed research context
|
|
431
458
|
- [ ] **spec.json fully updated**: phase, current_phase, progress, timestamps, approvals, design_context
|
|
432
459
|
|
|
433
460
|
## When TO Use
|
|
@@ -40,6 +40,9 @@ These rules override any self-reasoning or optimization the system may attempt:
|
|
|
40
40
|
2. **No implementation code files.** This workflow produces ONLY `.md` files. If a finding requires a new shared module or config file, describe it inside the relevant `task-*.md` file. Do NOT create `.ts`, `.js`, `.py`, or any source code file.
|
|
41
41
|
3. **Findings must use the exact format** defined in Part A Step 5 below. No shortened or custom formats.
|
|
42
42
|
4. **Apply YAGNI to fixes.** When the user says "configure later" or "decide later", add a single note to the task file. Do NOT generate multiple concrete implementations (e.g., 4 provider files when the user only asked for an abstraction).
|
|
43
|
+
5. **No false completion.** You MUST NOT set `validation.status = "completed"` or `ready_for_implementation = true` until a reconciliation audit proves the accepted findings and validation decisions are reflected in the physical spec artifacts.
|
|
44
|
+
6. **Provider drift is a real defect.** If the scope changed away from Claude/Anthropic, stale strings like `Claude API`, `Haiku`, or `haiku_reachable` in `requirements.md`, `design.md`, or `tasks/*.md` are validation failures. `research.md` may mention them only as historical comparison.
|
|
45
|
+
7. **Implementation-facing propagation is mandatory.** A decision that affects implementation is NOT considered applied if it only appears in `Risk Assessment`, `validate-log.md`, or `red-team-report.md`. It must update at least one of: `requirements.md`, `Canonical Contracts & Invariants`, `Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, or `Verification & Evidence`.
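The provider-drift rule (rule 6) amounts to a string scan that exempts `research.md`. A minimal sketch, assuming the file contents are already loaded into a dict keyed by relative path; the function name and the case-sensitive matching are illustrative choices, not something the source specifies.

```python
from __future__ import annotations

# Stale strings named in the rule; only relevant after the scope moved away from Claude/Anthropic.
STALE_PROVIDER_STRINGS = ("Claude API", "Haiku", "haiku_reachable")


def find_stale_provider_strings(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line number, matched string) for every stale mention."""
    hits = []
    for path, text in files.items():
        if path.endswith("research.md"):
            continue  # historical comparisons are allowed here
        for lineno, line in enumerate(text.splitlines(), start=1):
            for needle in STALE_PROVIDER_STRINGS:
                if needle in line:
                    hits.append((path, lineno, needle))
    return hits
```

Any non-empty result would count as a validation failure for `requirements.md`, `design.md`, or `tasks/*.md`.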
|
|
43
46
|
|
|
44
47
|
---
|
|
45
48
|
|
|
@@ -110,6 +113,7 @@ If "Review each one": For each finding, ask: "Apply" | "Reject" | "Modify sugges
|
|
|
110
113
|
#### Step 8: Apply to Spec
|
|
111
114
|
- Edit design.md / task files directly for accepted findings
|
|
112
115
|
- Create `reports/red-team-report.md` documenting the full review session
|
|
116
|
+
- Mark a finding as `Applied To = ...` only after the physical file really contains the change
|
|
113
117
|
|
|
114
118
|
### Finding Output Format
|
|
115
119
|
|
|
@@ -134,14 +138,14 @@ If "Review each one": For each finding, ask: "Apply" | "Reject" | "Modify sugges
|
|
|
134
138
|
|
|
135
139
|
| # | Finding | Severity | Disposition | Applied To |
|
|
136
140
|
|---|---------|----------|-------------|------------|
|
|
137
|
-
| 1 | {title} | Critical | Accept |
|
|
141
|
+
| 1 | {title} | Critical | Accept | task-R0-02-... |
|
|
138
142
|
```
|
|
139
143
|
|
|
140
144
|
---
|
|
141
145
|
|
|
142
146
|
## Part B: Validation Interview
|
|
143
147
|
|
|
144
|
-
###
|
|
148
|
+
### 8-Step Workflow
|
|
145
149
|
|
|
146
150
|
#### Step 1: Read Spec
|
|
147
151
|
- `requirements.md` — technical requirements
|
|
@@ -211,6 +215,31 @@ Save to `reports/validate-log.md`:
|
|
|
211
215
|
| Risk | Task files (Risk Assessment section) |
|
|
212
216
|
| Unknown | `design.md` (add new subsection) |
|
|
213
217
|
|
|
218
|
+
**Additional propagation rules:**
|
|
219
|
+
- If the decision changes implementation behavior, update an implementation-facing section, not only `Risk Assessment`.
|
|
220
|
+
- If the decision changes scope or provider choice, scan `requirements.md`, `design.md`, and `tasks/*.md` for stale wording and normalize it.
|
|
221
|
+
- If the decision changes deletion/privacy behavior, update `Canonical Contracts & Invariants` first, then tasks that inherit that contract.
|
|
222
|
+
|
|
223
|
+
#### Step 7: Reconciliation Audit (MANDATORY)
|
|
224
|
+
Before declaring validation complete:
|
|
225
|
+
1. Re-read `spec.json`, `requirements.md`, `design.md`, and all `tasks/task-*.md`
|
|
226
|
+
2. Verify every accepted red-team finding and every validation action item is reflected in the correct physical file(s)
|
|
227
|
+
3. Fail the audit if:
|
|
228
|
+
- a report says "applied" but the file still contains the old text
|
|
229
|
+
- stale provider strings remain after a provider change
|
|
230
|
+
- delete-data/privacy artifacts mix multiple canonical policies
|
|
231
|
+
- `spec.json.updated_at`, `timestamps.review_done`, or `timestamps.validation_done` do not reflect the final reviewed state
|
|
232
|
+
4. Only after the audit passes may you:
|
|
233
|
+
- set `spec.json.validation.status = "completed"`
|
|
234
|
+
- set `spec.json.timestamps.validation_done`
|
|
235
|
+
- set `spec.json.timestamps.review_done`
|
|
236
|
+
- set `spec.json.ready_for_implementation = true` when all other gates are satisfied
|
|
237
|
+
|
|
238
|
+
#### Step 8: Final Status Write-Back
|
|
239
|
+
- Update `spec.json.updated_at` to the reconciliation time
|
|
240
|
+
- Ensure `red-team-report.md` and `validate-log.md` do not contradict `spec.json`
|
|
241
|
+
- If reconciliation fails, keep `validation.status` as `not-run` or `in_progress` and list blockers explicitly
|
|
242
|
+
|
|
214
243
|
---
|
|
215
244
|
|
|
216
245
|
## Combined Output
|
|
@@ -225,5 +254,5 @@ Validate: {Q} questions asked, {D} decisions confirmed
|
|
|
225
254
|
|
|
226
255
|
Files modified: {list}
|
|
227
256
|
|
|
228
|
-
📌 Next step: /hapo:develop <feature>
|
|
257
|
+
📌 Next step: /hapo:develop <feature> (ONLY if reconciliation audit passed)
|
|
229
258
|
```
|
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
## Purpose
|
|
4
4
|
|
|
5
|
-
Convert task files (persistent storage) into Claude Tasks (session-scoped only), enabling real-time progress tracking and automatic unblocking when predecessor tasks complete.
|
|
5
|
+
Convert task files (persistent storage) into Claude Tasks (session-scoped only), enabling real-time progress tracking and automatic unblocking when predecessor tasks complete. `spec.json.task_registry` is the machine-state bridge between markdown task files and session-scoped Claude Tasks.
|
|
6
6
|
|
|
7
7
|
## When to Run
|
|
8
8
|
|
|
@@ -16,16 +16,16 @@ Convert task files (persistent storage) into Claude Tasks (session-scoped only),
|
|
|
16
16
|
┌──────────────────┐ Hydrate ┌───────────────────┐
|
|
17
17
|
│ Task Files │ ─────────► │ Claude Tasks │
|
|
18
18
|
│ (persistent) │ │ (session-scoped) │
|
|
19
|
-
│ [ ]
|
|
20
|
-
│ [ ]
|
|
19
|
+
│ [ ] task-R0-01 │ │ ◼ pending │
|
|
20
|
+
│ [ ] task-R0-02 │ │ ◼ pending │
|
|
21
21
|
└──────────────────┘ └───────────────────┘
|
|
22
22
|
│ Work
|
|
23
23
|
▼
|
|
24
24
|
┌──────────────────┐ Sync-back ┌───────────────────┐
|
|
25
25
|
│ Task Files │ ◄───────── │ Task Updates │
|
|
26
26
|
│ (updated) │ │ (completed) │
|
|
27
|
-
│ [x]
|
|
28
|
-
│ [ ]
|
|
27
|
+
│ [x] task-R0-01 │ │ ✓ completed │
|
|
28
|
+
│ [ ] task-R0-02 │ │ ◼ in_progress │
|
|
29
29
|
└──────────────────┘ └───────────────────┘
|
|
30
30
|
```
|
|
31
31
|
|
|
@@ -37,23 +37,24 @@ Convert task files (persistent storage) into Claude Tasks (session-scoped only),
|
|
|
37
37
|
|
|
38
38
|
| Field | Description | Example |
|
|
39
39
|
|---|---|---|
|
|
40
|
-
| `taskNumber` | Task sequence number | `
|
|
40
|
+
| `taskNumber` | Task sequence number | `R0-02` |
|
|
41
41
|
| `priority` | Priority level | `P1`, `P2`, `P3` |
|
|
42
42
|
| `effort` | Time estimate | `2h` |
|
|
43
43
|
| `specDir` | Spec directory path | `specs/mobile-app/` |
|
|
44
|
-
| `taskFile` | Task file name | `task-
|
|
44
|
+
| `taskFile` | Task file name | `task-R0-02-extension-shell.md` |
|
|
45
45
|
|
|
46
46
|
## Hydration Workflow
|
|
47
47
|
|
|
48
|
-
1. Read all `tasks/task
|
|
49
|
-
2.
|
|
48
|
+
1. Read all `tasks/task-R*.md` files → filter incomplete tasks
|
|
49
|
+
2. Load `spec.json.task_registry` and prefer its task status/dependencies when present
|
|
50
|
+
3. Create `TaskCreate` for each major task, attach `addBlockedBy` per dependency chain:
|
|
50
51
|
```
|
|
51
|
-
Task 01 (no blockers) ← start here
|
|
52
|
-
Task 02 (addBlockedBy: [task-01-id])
|
|
53
|
-
Task
|
|
52
|
+
Task R0-01 (no blockers) ← start here
|
|
53
|
+
Task R0-02 (addBlockedBy: [task-R0-01-id])
|
|
54
|
+
Task R1-01 (addBlockedBy: [task-R0-02-id])
|
|
54
55
|
```
|
|
55
|
-
|
|
56
|
-
|
|
56
|
+
4. Create additional `TaskCreate` for high-risk steps within tasks (if any)
|
|
57
|
+
5. Verify: no dependency cycles, all tasks have complete metadata, and every hydrated task has a corresponding `task_registry` entry
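The "no dependency cycles" check in step 5 can be sketched as a depth-first search over the `dependencies` lists in `task_registry`. The function name is hypothetical, and skipping unknown dependency paths is an assumption; a stricter audit might report them instead.

```python
from __future__ import annotations


def find_dependency_cycle(registry: dict[str, dict]) -> list[str] | None:
    """Return one cycle as a list of task paths, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = {path: WHITE for path in registry}
    stack: list[str] = []

    def visit(path: str) -> list[str] | None:
        color[path] = GREY
        stack.append(path)
        for dep in registry[path].get("dependencies", []):
            if dep not in registry:
                continue  # unknown path: ignored in this sketch
            if color[dep] == GREY:
                # dep is already on the current path, so we closed a loop.
                return stack[stack.index(dep):] + [dep]
            if color[dep] == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[path] = BLACK
        return None

    for path in registry:
        if color[path] == WHITE:
            cycle = visit(path)
            if cycle:
                return cycle
    return None
```

Hydration would only attach `addBlockedBy` chains when this returns `None`.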
|
|
57
58
|
|
|
58
59
|
## Fallback
|
|
59
60
|
|
|
@@ -78,8 +79,9 @@ Convert task files (persistent storage) into Claude Tasks (session-scoped only),
|
|
|
78
79
|
### Sync-back (after code completes)
|
|
79
80
|
1. `TaskUpdate` marks tasks as completed
|
|
80
81
|
2. Update `[ ]` → `[x]` in task files
|
|
81
|
-
3. Update `spec.json
|
|
82
|
-
4.
|
|
82
|
+
3. Update `spec.json.task_registry[path].status`, timestamps, blocker, and last_updated_at
|
|
83
|
+
4. Update `spec.json` progress for the corresponding phase
|
|
84
|
+
5. Git commit captures the state transition
|
|
83
85
|
|
|
84
86
|
## Quality Checks
|
|
85
87
|
|
|
@@ -33,6 +33,7 @@ Detail bullets must include:
|
|
|
33
33
|
- Respect architecture boundaries defined in design.md (Architecture Pattern & Boundary Map)
|
|
34
34
|
- Honor interface contracts documented in design.md
|
|
35
35
|
- Translate completion criteria into concrete proof (commands, artifacts, routes, manifests, schema objects, UI states)
|
|
36
|
+
- Reuse canonical contracts from `design.md` verbatim; never invent alternate auth/provider/deletion policies in task prose
|
|
36
37
|
- Use major task summaries sparingly—omit detail bullets if the work is fully captured by child tasks.
|
|
37
38
|
|
|
38
39
|
**End with integration tasks** to wire everything together.
|
|
@@ -113,6 +114,7 @@ Every task file MUST contain the Risk Assessment table, even if no risks are ide
|
|
|
113
114
|
- `_Requirements: X.X, Y.Y_` listing **only numeric requirement IDs** (comma-separated). Never append descriptive text, parentheses, translations, or free-form labels.
|
|
114
115
|
- For cross-cutting requirements, list every relevant requirement ID. All requirements MUST have numeric IDs in requirements.md. If an ID is missing, stop and correct requirements.md before generating tasks.
|
|
115
116
|
- Reference components/interfaces from design.md when helpful (e.g., `_Contracts: AuthService API`)
|
|
117
|
+
- If a validation interview or red-team finding changes implementation behavior, update the sub-task itself. Do NOT hide the decision only inside `Risk Assessment`.
|
|
116
118
|
|
|
117
119
|
### 5. Code-Only Focus
|
|
118
120
|
|
|
@@ -148,6 +150,8 @@ Rules:
|
|
|
148
150
|
- If the task wires entrypoints (popup, content script, route, worker, CLI command), name the exact runtime surface that must exist after implementation.
|
|
149
151
|
- If verification depends on environment or manual setup, document the blocker explicitly instead of implying success.
|
|
150
152
|
- Build success alone is NEVER enough evidence for a completed task.
|
|
153
|
+
- For provider-sensitive work, use provider-neutral wording unless the scope lock explicitly names a vendor.
|
|
154
|
+
- For delete-data/privacy work, task text MUST match the single deletion/retention policy chosen in `design.md`. Mixed policies are invalid.
|
|
151
155
|
|
|
152
156
|
## Task Hierarchy Rules
|
|
153
157
|
|
|
@@ -74,9 +74,10 @@ Capture only the contracts whose inconsistency would break downstream implementa
|
|
|
74
74
|
| Auth / session | | | |
|
|
75
75
|
| Transport / entrypoints | | | |
|
|
76
76
|
| Data / persistence | | | |
|
|
77
|
+
| Deletion / retention policy | | | |
|
|
77
78
|
| Generated artifacts / runtime outputs | | | |
|
|
78
79
|
|
|
79
|
-
> Task files must reuse the same contract wording. If implementation later needs a different contract, update this section first before generating or editing tasks.
|
|
80
|
+
> Task files must reuse the same contract wording. If the feature touches delete-data or privacy retention, explicitly decide whether re-registration is blocked and how that lock works without keeping raw PII. If implementation later needs a different contract, update this section first before generating or editing tasks.
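One way to implement a re-registration lock without keeping raw PII is a keyed hash of the normalized email address. The sketch below is an assumption-heavy illustration: the source only names `email_hash` / `email_fingerprint` as examples, so the normalization, the HMAC-SHA256 construction, and the `pepper` secret are all choices made here, not part of the template.

```python
import hashlib
import hmac


def email_fingerprint(email: str, pepper: bytes) -> str:
    """Derive a stable, non-reversible identifier for an email address.

    `pepper` is a hypothetical server-side secret; without it the fingerprint
    could be brute-forced from a list of candidate addresses.
    """
    normalized = email.strip().lower()
    return hmac.new(pepper, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

Only the fingerprint and its retention deadline would be stored after deletion, so a later sign-up attempt can be compared against it without retaining the address itself.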
|
|
80
81
|
|
|
81
82
|
## System Flows
|
|
82
83
|
|
|
@@ -34,6 +34,7 @@
|
|
|
34
34
|
"blocks": []
|
|
35
35
|
},
|
|
36
36
|
"task_files": [],
|
|
37
|
+
"task_registry": {},
|
|
37
38
|
"blocker": null,
|
|
38
39
|
"progress": {
|
|
39
40
|
"init": "done",
|
|
@@ -53,7 +54,8 @@
|
|
|
53
54
|
"tasks_done": null,
|
|
54
55
|
"code_done": null,
|
|
55
56
|
"test_done": null,
|
|
56
|
-
"review_done": null
|
|
57
|
+
"review_done": null,
|
|
58
|
+
"validation_done": null
|
|
57
59
|
},
|
|
58
60
|
"design_context": {
|
|
59
61
|
"discovery_mode": null,
|
|
@@ -1,22 +1,22 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: hapo:sync
|
|
3
|
-
description: "Dumb-proof status tracker and file synchronizer. Updates spec.json and tasks/*.md without breaking structural schemas. Includes Auto-Audit."
|
|
3
|
+
description: "Dumb-proof status tracker and file synchronizer. Updates spec.json, task_registry, and tasks/*.md without breaking structural schemas. Includes Auto-Audit."
|
|
4
4
|
version: 1.0.0
|
|
5
|
-
argument-hint: "<feature_name> <task_id> <status> [blocker] | phase <feature_name> <next_phase> | audit <feature_name>"
|
|
5
|
+
argument-hint: "<feature_name> <task_id|task-file> <status> [blocker] | phase <feature_name> <next_phase> | audit <feature_name>"
|
|
6
6
|
---
|
|
7
7
|
|
|
8
8
|
# Sync (State Tracking Protocol)
|
|
9
9
|
|
|
10
|
-
This skill safely bridges the gap between active development state and physical documentation files (`spec.json`
|
|
10
|
+
This skill safely bridges the gap between active development state and physical documentation files (`spec.json` + `task_registry` + `tasks/task-R*.md`). Instead of relying on risky raw AI edits, this skill executes precise contextual replacements.
|
|
11
11
|
|
|
12
12
|
## Supported Commands
|
|
13
13
|
|
|
14
14
|
### 1. Task Synchronization
|
|
15
15
|
Update a specific task's status and automatically check its relevant sub-checkboxes.
|
|
16
16
|
|
|
17
|
-
**Usage:** `/hapo:sync <feature_name> <task_id> <status> ["optional blocker msg"]`
|
|
18
|
-
- Example 1: `/hapo:sync auth
|
|
19
|
-
- Example 2: `/hapo:sync payment task-03 blocked "API Endpoint Down"`
|
|
17
|
+
**Usage:** `/hapo:sync <feature_name> <task_id|task-file> <status> ["optional blocker msg"]`
|
|
18
|
+
- Example 1: `/hapo:sync auth R0-02 done`
|
|
19
|
+
- Example 2: `/hapo:sync payment task-R1-03-chunks-api.md blocked "API Endpoint Down"`
|
|
20
20
|
|
|
21
21
|
### 2. Phase Advancement
|
|
22
22
|
Advance the entire project to the next logical phase.
|
|
@@ -25,16 +25,19 @@ Advance the entire project to the next logical phase.
|
|
|
25
25
|
- Example: `/hapo:sync phase shopping_cart test`
|
|
26
26
|
|
|
27
27
|
### 3. State Audit
|
|
28
|
-
Scans the `spec.json` against all physical `task-
|
|
28
|
+
Scans the `spec.json` against all physical `task-R*.md` files to detect mismatches between `task_files`, `task_registry`, and markdown task headers, then repairs them.
|
|
29
29
|
|
|
30
30
|
**Usage:** `/hapo:sync audit <feature_name>`
|
|
31
31
|
- Example: `/hapo:sync audit auth`
|
|
32
32
|
|
|
33
33
|
## Directives
|
|
34
34
|
|
|
35
|
-
1. **Precision Edits:** Never overwrite the entire `spec.json` string.
|
|
36
|
-
2. **
|
|
37
|
-
3. **
|
|
35
|
+
1. **Precision Edits:** Never overwrite the entire `spec.json` string blindly. Update only the required keys, while keeping JSON valid.
|
|
36
|
+
2. **Machine + Human Sync:** Every task status update MUST modify both `spec.json.task_registry[...]` and the matching markdown task file header/status section.
|
|
37
|
+
3. **Markdown Integrity:** When marking a task `done`, only then turn `[ ]` into `[x]` inside `## Implementation Steps` and relevant `Completion Criteria` / `Verification & Evidence` checkboxes that have actual proof.
|
|
38
|
+
4. **Verification Receipt Rule:** `done` is illegal without a human-readable verification receipt already present in `## Verification & Evidence` (commands executed, artifact/runtime proof, or equivalent concrete evidence). If proof is missing, keep the task `in_progress` or `blocked`.
|
|
39
|
+
5. **Task Docs Hook:** Every time `hapo:sync` marks a task as `done`, it must flag that a task-level docs checkpoint is now due for that verified task.
|
|
40
|
+
6. **Phase Prompt Rule:** When `hapo:sync` marks the final pending task in the whole feature as `done`, it should automatically prompt the user if they'd like to advance the phase, but only after the docs checkpoint for that last completed task has been considered.
|
|
38
41
|
|
|
39
42
|
## References
|
|
40
43
|
Read `references/sync-protocols.md` for exact Search/Replace regex patterns and JSON schema expectations before acting on the files.
|
|
@@ -2,34 +2,60 @@
|
|
|
2
2
|
|
|
3
3
|
The following guidelines dictate exactly how `hapo:sync` should interact with files to prevent data corruption.
|
|
4
4
|
|
|
5
|
+
**Canonical task status vocabulary:** `pending`, `in_progress`, `blocked`, `done`
|
|
6
|
+
|
|
5
7
|
## 1. Updating `spec.json`
|
|
6
8
|
|
|
7
9
|
When requested to update a phase or change task configuration, `spec.json` must maintain its strict schema (defined in `hapo:specs/templates/init.json`).
|
|
8
10
|
|
|
9
|
-
* **JSON Modification Rule:** Do not output whole files. Instead, load the JSON structure, apply the update to `status`, `current_phase`, `blocker` (if any), and overwrite the file cleanly.
|
|
10
|
-
* **
|
|
11
|
+
* **JSON Modification Rule:** Do not output whole files. Instead, load the JSON structure, apply the update to `status`, `current_phase`, `blocker` (if any), `task_files`, and the relevant `task_registry` entry, then overwrite the file cleanly.
|
|
12
|
+
* **Task Registry Rule:** Resolve the incoming task reference to a single relative path in `task_registry`. Accept either:
|
|
13
|
+
- compact task ID like `R0-02`
|
|
14
|
+
- full filename like `task-R0-02-extension-shell.md`
|
|
15
|
+
- full relative path like `tasks/task-R0-02-extension-shell.md`
|
|
16
|
+
* **Status Update:** If a task changes to `blocked`, the matching `task_registry[path].status` must become `"blocked"`, `task_registry[path].blocker` must record the reason, and `spec.json.status` / `spec.json.blocker` must reflect the top-level block if work is globally blocked.
|
|
17
|
+
* **Timestamp Rule:** Update `task_registry[path].started_at`, `completed_at`, and `last_updated_at` consistently with the new state. Also refresh `spec.json.updated_at`.
|
|
18
|
+
* **Done-State Rule:** Never set `task_registry[path].status = "done"` unless the matching markdown task file already contains a verification receipt in `## Verification & Evidence`, or the caller explicitly provides proof that can be written there first.
|
|
19
|
+
* **Task Docs Rule:** After a task is moved to `done`, emit a short alert that a task-level docs checkpoint is due for this verified task.
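As an illustration of the registry rules added in this hunk, the following is a minimal Python sketch, not the package's implementation. It assumes `task_registry` is a JSON object keyed by relative paths such as `tasks/task-R0-02-extension-shell.md`, and that each entry carries the `status`, `blocker`, `started_at`, `completed_at`, and `last_updated_at` fields named above. The function names `resolve_task_path` and `apply_status` are hypothetical. The sketch does not enforce the Done-State Rule or the top-level `spec.json.status` / `spec.json.blocker` update; a caller would have to handle both.

```python
import re
from datetime import datetime, timezone

VALID_STATUSES = {"pending", "in_progress", "blocked", "done"}


def resolve_task_path(task_ref, task_registry):
    """Map a compact ID, filename, or relative path to exactly one registry key."""
    ref = task_ref.strip()
    if ref in task_registry:                 # full relative path
        return ref
    if f"tasks/{ref}" in task_registry:      # bare filename
        return f"tasks/{ref}"
    # Compact ID such as R0-02 -> tasks/task-R0-02-<slug>.md (assumed naming scheme)
    pattern = re.compile(rf"(^|/)task-{re.escape(ref)}(-[^/]*)?\.md$")
    matches = [path for path in task_registry if pattern.search(path)]
    if len(matches) != 1:
        raise ValueError(f"{task_ref!r} resolved to {len(matches)} registry entries")
    return matches[0]


def apply_status(spec, task_ref, status, blocker=None, now=None):
    """Update one task_registry entry inside an already-loaded spec dict."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    now = now or datetime.now(timezone.utc).isoformat()
    path = resolve_task_path(task_ref, spec["task_registry"])
    entry = spec["task_registry"][path]
    entry["status"] = status
    # Keep the blocker reason only while the task is actually blocked.
    entry["blocker"] = blocker if status == "blocked" else None
    if status == "in_progress" and not entry.get("started_at"):
        entry["started_at"] = now
    if status == "done":
        entry["completed_at"] = now
    entry["last_updated_at"] = now
    spec["updated_at"] = now
    return path
```

Under these assumptions, a caller would load `spec.json` with `json.load`, call `apply_status(spec, "R0-02", "in_progress")`, and write the whole structure back with `json.dump`, so the file stays valid JSON while only the required keys change.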
|
|
11
20
|
|
|
12
21
|
## 2. Updating `tasks/task-**.md`
|
|
13
22
|
|
|
14
|
-
The structure of `tasks/task.md` relies heavily on exact keyword markers. Follow these surgical
|
|
23
|
+
The structure of `tasks/task.md` relies heavily on exact keyword markers. Follow these surgical protocols against `tasks/task-R*.md`:
|
|
15
24
|
|
|
16
25
|
### A. Completing a Task
|
|
17
|
-
When `/hapo:sync <feature> <task-id>
|
|
18
|
-
1. Find: `**
|
|
19
|
-
2.
|
|
20
|
-
3.
|
|
21
|
-
4.
|
|
26
|
+
When `/hapo:sync <feature> <task-id> done`:
|
|
27
|
+
1. Find: `**Status:** pending` (or `in_progress` / `blocked`).
|
|
28
|
+
2. Inspect `## Verification & Evidence` first. If it has no explicit proof lines (commands run, artifact proof, runtime proof, or blockers cleared), STOP and refuse to mark the task done.
|
|
29
|
+
3. Replace with: `**Status:** done`.
|
|
30
|
+
4. Locate block: `## Implementation Steps`.
|
|
31
|
+
5. Convert `- [ ]` into `- [x]` strictly within that section.
|
|
32
|
+
6. Update relevant checkboxes in `## Completion Criteria` and `## Verification & Evidence` only when the caller provides or the file already contains real proof.
|
|
33
|
+
7. Surface a note such as: `Docs checkpoint due: task Rn-mm just completed`.
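The completion steps above can be sketched in Python as follows. This is a hedged illustration rather than the skill's actual logic: the helper names (`section_body`, `has_receipt`, `mark_done`) are hypothetical, and the receipt heuristic (any non-blank line under `## Verification & Evidence` that is not an unchecked `- [ ]` box) is an assumption about what counts as "explicit proof". Steps 6 and 7 are left to the caller.

```python
import re

# Matches one "## <title>" section up to the next "## " heading or end of file.
SECTION_RE = r"(?ms)^## {title}[ \t]*\n(.*?)(?=^## |\Z)"


def section_body(markdown, title):
    """Return the text under a '## <title>' heading, or None if it is absent."""
    match = re.search(SECTION_RE.format(title=re.escape(title)), markdown)
    return match.group(1) if match else None


def has_receipt(markdown):
    """Assumed heuristic: some line in the evidence section is more than an empty checkbox."""
    body = section_body(markdown, "Verification & Evidence")
    if body is None:
        return False
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    return any(not line.startswith("- [ ]") for line in lines)


def mark_done(markdown):
    """Flip the status header and tick boxes only inside '## Implementation Steps'."""
    if not has_receipt(markdown):
        raise ValueError("refusing to mark done: no verification receipt found")
    updated, count = re.subn(
        r"(?m)^\*\*Status:\*\*[ \t]*(pending|in_progress|blocked)[ \t]*$",
        "**Status:** done",
        markdown,
        count=1,
    )
    if count == 0:
        raise ValueError("no replaceable **Status:** line found")

    def tick(match):
        # Only the matched section text is rewritten; other sections stay untouched.
        return match.group(0).replace("- [ ]", "- [x]")

    pattern = SECTION_RE.format(title=re.escape("Implementation Steps"))
    return re.sub(pattern, tick, updated, count=1)
```

Checking the receipt before touching the status line mirrors step 2: a refusal leaves the file unchanged, so a failed `done` request cannot produce a half-updated task.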
|
|
22
34
|
|
|
23
35
|
### B. Blocking a Task
|
|
24
36
|
When `/hapo:sync <feature> <task-id> blocked "API error"`:
|
|
25
|
-
1. Find: `**
|
|
26
|
-
2. Replace with: `**
|
|
27
|
-
3. Ensure that an entry under `##
|
|
37
|
+
1. Find: `**Status:** <anything>`.
|
|
38
|
+
2. Replace with: `**Status:** blocked`.
|
|
39
|
+
3. Ensure that an entry under `## Blocker Log` exists recording the explicit reason (e.g. `API error`) and timestamp.
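A minimal Python sketch of the blocking steps above, under stated assumptions: the function name `mark_blocked` is hypothetical, the log entry format `- <timestamp>: <reason>` is invented for illustration, and new entries are placed directly under the heading (newest first). The diff does not specify any of these details.

```python
import re


def mark_blocked(markdown, reason, timestamp):
    """Set the status header to blocked and record the reason under '## Blocker Log'."""
    updated, count = re.subn(
        r"(?m)^\*\*Status:\*\*.*$", "**Status:** blocked", markdown, count=1
    )
    if count == 0:
        raise ValueError("no **Status:** line found")

    entry = f"- {timestamp}: {reason}"  # assumed entry format
    if re.search(r"(?m)^## Blocker Log[ \t]*$", updated):
        # Insert the new entry right below the existing heading.
        return re.sub(
            r"(?m)^(## Blocker Log[ \t]*)$",
            lambda m: f"{m.group(1)}\n{entry}",
            updated,
            count=1,
        )
    # No log section yet: append one at the end of the file.
    return updated.rstrip("\n") + f"\n\n## Blocker Log\n{entry}\n"
```

Using a callable as the replacement keeps backslashes or group-like text in the blocker reason from being interpreted by `re.sub`.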
|
|
40
|
+
|
|
41
|
+
### C. Starting / Resuming a Task
|
|
42
|
+
When `/hapo:sync <feature> <task-id> in_progress`:
|
|
43
|
+
1. Find: `**Status:** pending` (or `blocked`).
|
|
44
|
+
2. Replace with: `**Status:** in_progress`.
|
|
45
|
+
3. Do NOT pre-check completion boxes.
|
|
46
|
+
4. Stamp `task_registry[path].started_at` if missing and refresh `last_updated_at`.
|
|
28
47
|
|
|
29
48
|
## 3. Audit Protocol
|
|
30
49
|
|
|
31
50
|
When `/hapo:sync audit <feature>` is activated:
|
|
32
51
|
1. **Load Truth:** Read `specs/<feature>/spec.json`.
|
|
33
52
|
2. **Scan Directory:** Loop through `specs/<feature>/tasks/`.
|
|
34
|
-
3. **Compare Constraints:**
|
|
35
|
-
4. **
|
|
53
|
+
3. **Compare Constraints:** Rebuild `task_files` from disk, ensure every file exists in `task_registry`, and compare markdown `**Status:**` headers against `task_registry[path].status`.
|
|
54
|
+
4. **Reconciliation Rules:**
|
|
55
|
+
- Missing registry entry → create it
|
|
56
|
+
- Missing disk file referenced in registry → remove or flag it
|
|
57
|
+
- Markdown says `done` but registry not done → registry wins only if evidence already exists; otherwise downgrade markdown or flag conflict
|
|
58
|
+
- Registry says `done` but markdown still pending → update markdown only if evidence exists
|
|
59
|
+
- Either side says `done` but `## Verification & Evidence` has no concrete proof → downgrade to `in_progress` or flag conflict instead of preserving fake completion
|
|
60
|
+
5. **Correction Alert:** Output a brief markdown alert detailing mismatches fixed and any unresolved conflicts requiring manual review.
|
|
61
|
+
6. **Task Docs Alert:** If audit reveals tasks newly marked `done`, include whether task-level docs sync appears still due or already accounted for in the current run summary.
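The reconciliation rules above leave some room for interpretation, so the following Python sketch fixes one reading and should be treated as an assumption, not the package's behavior. Under that reading, whenever either side says `done`, both sides end up `done` if evidence exists and `in_progress` otherwise. Registry entries whose file is missing on disk are flagged, not removed, and any other status mismatch is reported as a conflict for manual review. The function name `audit` and the three input mappings are hypothetical.

```python
def audit(registry, markdown, evidence):
    """Reconcile statuses.

    registry: task path -> status recorded in spec.json task_registry
    markdown: task path -> status parsed from the **Status:** header on disk
    evidence: task path -> True if '## Verification & Evidence' holds concrete proof
    Returns (fixed_registry, fixed_markdown, notes).
    """
    fixed_registry, fixed_markdown, notes = dict(registry), dict(markdown), []
    for path in sorted(set(registry) | set(markdown)):
        if path not in markdown:
            notes.append(f"flag: {path} is in task_registry but missing on disk")
            continue
        if path not in registry:
            fixed_registry[path] = markdown[path]
            notes.append(f"fixed: created registry entry for {path}")
        reg, md = fixed_registry[path], markdown[path]
        if "done" in (reg, md):
            # Completion is kept only when proof exists; otherwise downgrade both sides.
            target = "done" if evidence.get(path, False) else "in_progress"
            if (reg, md) != (target, target):
                fixed_registry[path] = fixed_markdown[path] = target
                notes.append(f"fixed: {path} set to {target}")
        elif reg != md:
            notes.append(f"conflict: {path} registry={reg} markdown={md}")
    return fixed_registry, fixed_markdown, notes
```

The returned `notes` list is the raw material for the Correction Alert in step 5: `fixed:` lines describe repairs, while `flag:` and `conflict:` lines mark items that still need manual review.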
|
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: hapo:test
|
|
3
3
|
description: "Run and verify project tests across all scopes: unit, integration, e2e, and UI. Blast-radius scoping for speed, chrome-devtools for UI verification, structured verdicts for downstream automation."
|
|
4
|
-
argument-hint: "[scope|--full|--ui <url>|--ui-auth <url>]"
|
|
4
|
+
argument-hint: "[scope|--full|--ui <url>|--ui-auth <url>|--ui-flow <url>]"
|
|
5
5
|
version: 2.0.0
|
|
6
6
|
---
|
|
7
7
|
|
|
8
8
|
# Test — Verify Implementation Quality
|
|
9
9
|
|
|
10
10
|
Run the project's test suite, analyze results, and return a structured verdict.
|
|
11
|
-
Designed to work **after `hapo:
|
|
11
|
+
Designed to work **after `hapo:develop`**. Standalone `/hapo:test` uses the same `test-runner` contract that `hapo:develop` relies on during its Quality Gate, and may run **in parallel with `hapo:code-review`**.
|
|
12
12
|
|
|
13
13
|
**Principles:** Fail-fast | Blast-radius scoping | Zero hidden failures | No mocking to pass
|
|
14
14
|
|
|
@@ -38,7 +38,7 @@ When `--full` is NOT specified, narrow the test scope to only what changed:
|
|
|
38
38
|
### Pre-flight Checks (always run first)
|
|
39
39
|
|
|
40
40
|
Catch compile errors before spending time on tests.
|
|
41
|
-
**HARD RULE:** If a pre-flight tool (like `eslint`, `flake8`, `tsc`) is missing,
|
|
41
|
+
**HARD RULE:** Do NOT auto-install missing tooling during verification. If a required pre-flight tool (like `eslint`, `flake8`, `tsc`) is missing, stop and report it as an environment gap or missing project setup.
|
|
42
42
|
|
|
43
43
|
```bash
|
|
44
44
|
# JavaScript / TypeScript
|
|
@@ -59,7 +59,7 @@ cargo check
|
|
|
59
59
|
flutter analyze
|
|
60
60
|
```
|
|
61
61
|
|
|
62
|
-
If pre-flight fails → report `Compile Error`, do NOT proceed to test execution.
|
|
62
|
+
If pre-flight fails or a required tool is missing → report `Compile Error` / `Environment Gap`, do NOT proceed to test execution.
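The "do not auto-install" rule can be illustrated with a small Python sketch. It assumes the required tools are expected on `PATH` and uses the standard-library `shutil.which` for detection; the function names and the shape of the returned dictionary are hypothetical and not part of the `test-runner` contract.

```python
import shutil


def preflight_gaps(required_tools):
    """Return the tools that cannot be found on PATH. Nothing is installed."""
    return [tool for tool in required_tools if shutil.which(tool) is None]


def preflight_verdict(required_tools):
    """Stop before test execution when a required tool is missing."""
    missing = preflight_gaps(required_tools)
    if missing:
        return {"verdict": "Environment Gap", "missing": missing, "run_tests": False}
    return {"verdict": "ok", "missing": [], "run_tests": True}
```

A plain `PATH` lookup will not see project-local binaries such as those under `node_modules/.bin`, so a real check would likely need to account for how the project invokes its tools.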
|
|
63
63
|
|
|
64
64
|
### Test Execution by Language
|
|
65
65
|
|
|
@@ -361,4 +361,3 @@ Flag as `Security Warning` if:
|
|
|
361
361
|
- Mixed content (HTTP resources on HTTPS page) detected via network audit
|
|
362
362
|
- `autocomplete="off"` missing on password fields
|
|
363
363
|
|
|
364
|
-
|