@haposoft/cafekit 0.7.23 → 0.7.25

Files changed (68)
  1. package/README.md +81 -862
  2. package/bin/install.js +4 -3
  3. package/package.json +2 -3
  4. package/src/claude/agents/code-auditor.md +25 -1
  5. package/src/claude/agents/god-developer.md +1 -1
  6. package/src/claude/agents/spec-maker.md +25 -3
  7. package/src/claude/agents/test-runner.md +22 -3
  8. package/src/claude/hooks/spec-state.cjs +24 -5
  9. package/src/claude/migration-manifest.json +1 -1
  10. package/src/claude/rules/manage-docs.md +5 -5
  11. package/src/claude/rules/state-sync.md +11 -8
  12. package/src/claude/skills/code-review/references/spec-compliance-review.md +8 -1
  13. package/src/claude/skills/develop/SKILL.md +32 -4
  14. package/src/claude/skills/develop/references/quality-gate.md +23 -13
  15. package/src/claude/skills/generate-graph/LICENSE +21 -0
  16. package/src/claude/skills/generate-graph/README.md +523 -0
  17. package/src/claude/skills/generate-graph/SKILL.md +427 -0
  18. package/src/claude/skills/generate-graph/agentloop-core.svg +101 -0
  19. package/src/claude/skills/generate-graph/agents/openai.yaml +4 -0
  20. package/src/claude/skills/generate-graph/assets/samples/sample-style1-flat.png +0 -0
  21. package/src/claude/skills/generate-graph/assets/samples/sample-style2-dark.png +0 -0
  22. package/src/claude/skills/generate-graph/assets/samples/sample-style3-blueprint.png +0 -0
  23. package/src/claude/skills/generate-graph/assets/samples/sample-style4-notion.png +0 -0
  24. package/src/claude/skills/generate-graph/assets/samples/sample-style5-glass.png +0 -0
  25. package/src/claude/skills/generate-graph/assets/samples/sample-style6-claude.png +0 -0
  26. package/src/claude/skills/generate-graph/assets/samples/sample-style7-openai.png +0 -0
  27. package/src/claude/skills/generate-graph/fixtures/agent-memory-types-style4.json +181 -0
  28. package/src/claude/skills/generate-graph/fixtures/api-flow-style7.json +40 -0
  29. package/src/claude/skills/generate-graph/fixtures/mem0-style1.json +297 -0
  30. package/src/claude/skills/generate-graph/fixtures/microservices-style3.json +64 -0
  31. package/src/claude/skills/generate-graph/fixtures/multi-agent-style5.json +45 -0
  32. package/src/claude/skills/generate-graph/fixtures/system-architecture-style6.json +48 -0
  33. package/src/claude/skills/generate-graph/fixtures/tool-call-style2.json +182 -0
  34. package/src/claude/skills/generate-graph/package.json +42 -0
  35. package/src/claude/skills/generate-graph/references/icons.md +281 -0
  36. package/src/claude/skills/generate-graph/references/style-1-flat-icon.md +108 -0
  37. package/src/claude/skills/generate-graph/references/style-2-dark-terminal.md +107 -0
  38. package/src/claude/skills/generate-graph/references/style-3-blueprint.md +113 -0
  39. package/src/claude/skills/generate-graph/references/style-4-notion-clean.md +94 -0
  40. package/src/claude/skills/generate-graph/references/style-5-glassmorphism.md +125 -0
  41. package/src/claude/skills/generate-graph/references/style-6-claude-official.md +209 -0
  42. package/src/claude/skills/generate-graph/references/style-7-openai.md +215 -0
  43. package/src/claude/skills/generate-graph/references/style-diagram-matrix.md +135 -0
  44. package/src/claude/skills/generate-graph/references/svg-layout-best-practices.md +100 -0
  45. package/src/claude/skills/generate-graph/scripts/generate-diagram.sh +157 -0
  46. package/src/claude/skills/generate-graph/scripts/generate-from-template.py +1556 -0
  47. package/src/claude/skills/generate-graph/scripts/test-all-styles.sh +135 -0
  48. package/src/claude/skills/generate-graph/scripts/validate-svg.sh +292 -0
  49. package/src/claude/skills/generate-graph/templates/agent-architecture.svg +28 -0
  50. package/src/claude/skills/generate-graph/templates/architecture.svg +23 -0
  51. package/src/claude/skills/generate-graph/templates/comparison-matrix.svg +14 -0
  52. package/src/claude/skills/generate-graph/templates/data-flow.svg +28 -0
  53. package/src/claude/skills/generate-graph/templates/er-diagram.svg +21 -0
  54. package/src/claude/skills/generate-graph/templates/flowchart.svg +21 -0
  55. package/src/claude/skills/generate-graph/templates/sequence.svg +20 -0
  56. package/src/claude/skills/generate-graph/templates/state-machine.svg +20 -0
  57. package/src/claude/skills/generate-graph/templates/timeline.svg +19 -0
  58. package/src/claude/skills/generate-graph/templates/use-case.svg +21 -0
  59. package/src/claude/skills/specs/SKILL.md +67 -10
  60. package/src/claude/skills/specs/references/review.md +32 -3
  61. package/src/claude/skills/specs/references/task-hydration.md +14 -12
  62. package/src/claude/skills/specs/rules/tasks-generation.md +21 -0
  63. package/src/claude/skills/specs/templates/design.md +14 -0
  64. package/src/claude/skills/specs/templates/init.json +7 -2
  65. package/src/claude/skills/specs/templates/requirements.md +21 -8
  66. package/src/claude/skills/specs/templates/task.md +16 -3
  67. package/src/claude/skills/sync/SKILL.md +11 -10
  68. package/src/claude/skills/sync/references/sync-protocols.md +33 -13
@@ -19,7 +19,7 @@ Analyze → Dependency Scan → Complexity Assessment → Init → Requirements
 
  **CRITICAL:** Before starting, the system MUST:
  1. Scan `specs/` directory for incomplete specs
- 2. If any spec is `in-progress` → ask user whether to continue or create new
+ 2. If any spec is `in_progress` (accept legacy `in-progress` when reading) → ask user whether to continue or create new
  3. Detect cross-spec dependencies (see `references/cross-spec-dependency.md`)
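Reviewer note (illustrative, not part of the package): step 2 now treats `in_progress` as the canonical active status while still accepting the legacy `in-progress` spelling on read. A minimal Python sketch of that read-tolerant behavior could look like the following. The names `normalize_status`, `is_active`, and `LEGACY_STATUS_ALIASES` are hypothetical and chosen only for this example.

```python
# Hypothetical helpers for illustration; they do not exist in @haposoft/cafekit.
LEGACY_STATUS_ALIASES = {"in-progress": "in_progress"}


def normalize_status(raw: str) -> str:
    """Map a status string read from spec.json to its canonical spelling."""
    value = raw.strip()
    return LEGACY_STATUS_ALIASES.get(value, value)


def is_active(spec: dict) -> bool:
    """Return True when the spec carries the canonical active status."""
    return normalize_status(spec.get("status", "")) == "in_progress"
```

Normalizing only at read time keeps old spec files usable, while any code that writes a status would emit the canonical value.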
 
  ## Core Responsibilities & Rules
@@ -40,6 +40,13 @@ Analyze → Dependency Scan → Complexity Assessment → Init → Requirements
  - Never silently expand or shrink scope
  - If scope change needed → ask user, record reason in `spec.json`
 
+ ### State & Integrity Rules
+ - Canonical active status string is `in_progress`. Legacy `in-progress` may be READ for compatibility but MUST NOT be generated in new specs.
+ - `current_phase` is required for live work and must track the active phase (`init`, `requirements`, `design`, `tasks`, `develop`, `test`, `review`).
+ - `task_files` in `spec.json` MUST exactly match the real files under `tasks/` after Step 7.
+ - `task_registry` in `spec.json` MUST exist once task files are generated and MUST contain one entry per task file, keyed by relative path.
+ - `ready_for_implementation` is a hard gate, not a convenience flag. Never set it before the finalization audit passes.
+
  ### Output Criteria
  - Never implement code — only create spec documents
  - Return file paths and a brief summary
@@ -66,7 +73,7 @@ Display selection menu via `AskUserQuestion`:
  "options": [
  { "label": "Create new spec", "description": "Initialize spec from a feature description" },
  { "label": "status", "description": "View status of all specs in specs/" },
- { "label": "resume", "description": "Continue an in-progress spec" },
+ { "label": "resume", "description": "Continue an active spec" },
  { "label": "--validate", "description": "Review spec (auto-decides: red team or validation)" },
  { "label": "archive", "description": "Archive completed specs + write journal" }
  ],
@@ -126,7 +133,7 @@ flowchart TD
  P --> Q["Step 6: Design — pick discovery mode"]
  Q --> R["Write design.md"]
  R --> S["Step 7: Tasks — split into individual files"]
- S --> T["Create tasks/task-01.md, task-02.md..."]
+ S --> T["Create tasks/task-R*.md + task_registry"]
  T --> U["Step 8: Hydrate Claude Tasks if >= 3 task files"]
  U --> V{Review?}
  V -->|Yes| W["Run review — auto-pick red team or validation"]
@@ -202,6 +209,7 @@ Load: `references/scope-inquiry.md`
  - Record findings in `research.md` before finalizing design
  - Write `design.md` from template `templates/design.md` (see `rules/design-principles.md`)
  - Add diagrams only when design has multi-step or cross-boundary flows
+ - For auth/session, transport/entrypoint, persistence/schema, generated-artifact, or runtime-sensitive work, the design MUST fill the `Canonical Contracts & Invariants` section and tasks MUST inherit the same decisions verbatim.
  - Update `spec.json` phase, timestamps, discovery mode
 
  ### Step 7: Task Breakdown
@@ -210,6 +218,16 @@ Load: `references/scope-inquiry.md`
  - Load `rules/tasks-generation.md` for core principles
  - Load `rules/tasks-parallel-analysis.md` for parallel markers (default: enabled)
  - Each task file follows template `templates/task.md`
+ - Each task file MUST include `Completion Criteria` and `Verification & Evidence` sections detailed enough that a downstream quality gate can prove the task is truly done.
+ - Build `spec.json.task_registry` alongside `task_files`. For each task file, register at minimum:
+ - `id`
+ - `title`
+ - `status` (`pending` by default)
+ - `dependencies` (relative task paths, not prose labels)
+ - `blocker`
+ - `started_at`
+ - `completed_at`
+ - `last_updated_at`
  - Update `spec.json` phase + task metadata
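Reviewer note (illustrative, not part of the package): the registry fields listed above can be sketched as a small Python factory. The diff names the keys and the `pending` default but not the value types, so using `None` for unset fields and an ISO 8601 string for `last_updated_at` are assumptions made here, as is the function name `new_registry_entry`.

```python
from datetime import datetime, timezone

# Keys taken from the Step 7 bullet list in the diff.
REQUIRED_KEYS = (
    "id", "title", "status", "dependencies",
    "blocker", "started_at", "completed_at", "last_updated_at",
)


def new_registry_entry(task_id, title, dependencies=(), now=None):
    """Build one task_registry value (hypothetical helper, assumed value types)."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "id": task_id,
        "title": title,
        "status": "pending",                 # default named in the diff
        "dependencies": list(dependencies),  # relative task paths, not prose labels
        "blocker": None,                     # assumption: None means "no blocker"
        "started_at": None,
        "completed_at": None,
        "last_updated_at": stamp,
    }


# The registry is keyed by relative task path.
registry = {
    "tasks/task-R1-03-chunk-api-endpoint.md": new_registry_entry(
        "R1-03",
        "Chunk API endpoint",
        dependencies=["tasks/task-R1-02-gap-marker-detector.md"],
    )
}
```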
 
  #### Requirement-Driven Task Grouping (MANDATORY)
@@ -229,7 +247,7 @@ tasks/
  ├── task-R1-02-gap-marker-detector.md
  ├── task-R1-03-chunk-api-endpoint.md
  ├── task-R2-01-summarize-orchestrator.md
- ├── task-R2-02-haiku-integration.md
+ ├── task-R2-02-provider-integration.md
  ├── task-R3-01-consent-onboarding.md
  ...
  ```
@@ -279,9 +297,30 @@ Load: `references/review.md` + `rules/design-review.md`
  - **< 3 task files, no security concerns** → Validate only (lightweight interview)
  - **>= 5 task files OR security/migration keywords** → Red Team first, then Validate
  - **User explicit request** → respect user's intent
+ - Set `design_context.validation_recommended = true` if the spec includes any of: auth/session, privacy, deletion, migration, schema change, external AI/provider switching, browser extension permissions, or 5+ task files.
  - When both run: Red Team ALWAYS before Validate (red team may change the spec)
  - **PROHIBITION:** The system MUST NOT skip Red Team because of a prior code-auditor review. Code review ≠ Spec review.
  - **PROHIBITION:** The system MUST NOT create `.ts`, `.js`, `.py` or any implementation files during validation. Spec-only outputs.
+ - **Reconciliation Rule:** `validation.status = "completed"` is forbidden until all accepted findings and validation decisions are physically propagated into `requirements.md`, `design.md`, `tasks/*.md`, and `spec.json` where applicable.
+
+ ### Step 9.5: Finalization Audit (MANDATORY)
+ - Re-scan the `tasks/` directory and rebuild `spec.json.task_files` from the real filesystem (sorted, relative paths)
+ - Rebuild `spec.json.task_registry` from the real filesystem if it is missing, stale, or missing keys. Preserve task status fields when the path still matches.
+ - FAIL if any task file exists on disk but is missing from `task_files`
+ - FAIL if any path in `task_files` does not exist on disk
+ - FAIL if any task file exists on disk but is missing from `task_registry`
+ - FAIL if any path in `task_registry` does not exist on disk
+ - FAIL if any requirement or NFR mapping uses non-numeric labels (`NFR-1`, `SEC-1`, etc.)
+ - FAIL if a task lacks `Completion Criteria` or `Verification & Evidence`
+ - FAIL if accepted validation decisions exist in reports but are not reflected in the implementation-facing sections of affected artifacts (`Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, `Verification & Evidence`, canonical contracts, or requirements text).
+ - FAIL if the spec scope/provider was switched away from Anthropic/Claude but `requirements.md`, `design.md`, or `tasks/*.md` still contain stale provider-specific strings such as `Claude API`, `Haiku`, or `haiku_reachable`. `research.md` is the only allowed place for historical cost comparisons.
+ - FAIL if privacy/delete-data work lacks a single canonical deletion policy. The design MUST explicitly choose either:
+ 1. hard-delete with no re-registration lock, or
+ 2. privacy-preserving re-registration lock using a non-raw identifier (for example `email_hash` / `email_fingerprint`) with a retention period.
+ Tasks and requirements must reuse the same policy verbatim; mixed policies are invalid.
+ - FAIL if `validation.status = "completed"` but `timestamps.validation_done` / `timestamps.review_done`, `updated_at`, and report metadata are not synchronized with the final reviewed state.
+ - If `validation_recommended = true` and `validation.status` is not `completed` (or an explicit accepted-risk state recorded by the user), `ready_for_implementation` MUST remain `false`
+ - Only after this audit passes may the system mark `progress.tasks = "done"` and `ready_for_implementation = true`
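Reviewer note (illustrative, not part of the package): the four filesystem checks in Step 9.5 (on disk but missing from `task_files`, listed but missing on disk, and the same pair for `task_registry`) can be sketched in Python as below. The function name `audit_task_inventory` is hypothetical, and treating every `*.md` file under `tasks/` as a task file is an assumption.

```python
from pathlib import Path


def audit_task_inventory(spec_dir, spec):
    """Return failure messages; an empty list means these four checks passed."""
    root = Path(spec_dir)
    # Sorted, relative paths, as the audit step asks for.
    on_disk = sorted(
        p.relative_to(root).as_posix() for p in (root / "tasks").glob("*.md")
    )
    listed = list(spec.get("task_files", []))
    registry = spec.get("task_registry", {})

    failures = []
    for path in on_disk:
        if path not in listed:
            failures.append(f"{path}: on disk but missing from task_files")
        if path not in registry:
            failures.append(f"{path}: on disk but missing from task_registry")
    for path in listed:
        if path not in on_disk:
            failures.append(f"{path}: in task_files but not on disk")
    for path in registry:
        if path not in on_disk:
            failures.append(f"{path}: in task_registry but not on disk")
    return failures
```

Returning a list of messages rather than a boolean keeps every mismatch visible in one pass, which fits the audit's FAIL-per-condition wording. The remaining Step 9.5 conditions (requirement labels, section presence, provider strings, deletion policy, timestamps) are outside this sketch.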
 
  ### Step 10: Completion — Context Reminder (MANDATORY)
  After completing the spec, output a short summary of what was generated, then you MUST output the following block EXACTLY as written. DO NOT use awkward translations like "Điểm đã phản ánh đúng quyết định của bạn", keep it professional or just output the block directly:
@@ -300,7 +339,7 @@ When user calls `hapo:specs`, system checks `specs/`:
 
  | Situation | Action |
  |---|---|
- | A spec is `in-progress` | Ask: "You have spec `<name>` at phase `<phase>`. Continue? [Y/n]" |
+ | A spec is `in_progress` | Ask: "You have spec `<name>` at phase `<phase>`. Continue? [Y/n]" |
  | A spec matches current git branch | Ask: "Branch `feature/X` has spec `X`. Activate or create new?" |
  | Nothing found | Create new spec or show menu |
 
@@ -311,24 +350,36 @@ When user calls `hapo:specs`, system checks `specs/`:
  | `init` done | "Next: write requirements" |
  | `requirements` done | "Next: architectural design" |
  | `design` done | "Next: break into tasks" |
- | `tasks` done | "Next: `/hapo:develop <feature>`" |
+ | `tasks` done + validation recommended but incomplete | "Next: `/hapo:specs --validate <feature>`" |
+ | `tasks` done + ready_for_implementation = true | "Next: `/hapo:develop <feature>`" |
  | Spec is `blocked` | "Warning: spec `X` is blocking this spec" |
 
  **State persistence:** Update `spec.json` `phase` field on each transition. `spec.json` is the single source of truth.
 
  ### spec.json Update Rules (MANDATORY)
 
- **Timestamps:** Each `timestamps.*_done` field MUST use the **actual current time** (ISO 8601 with timezone) when that specific phase completes. Do NOT reuse the `init` timestamp for later phases. If running the full pipeline end-to-end, capture a fresh timestamp at each phase transition.
+ **Canonical status vocabulary:** Use `in_progress`, `blocked`, `done`, and `archived`. New specs MUST NOT emit `in-progress`.
+
+ **Timestamps:** Each `timestamps.*_done` field MUST use the **actual current time** (ISO 8601 with timezone) when that specific phase completes. This includes `review_done` and `validation_done` after review/validate workflows. Do NOT reuse the `init` timestamp for later phases. If running the full pipeline end-to-end, capture a fresh timestamp at each phase transition.
 
  **Approvals (auto-approval behavior):**
  - When running the **full pipeline end-to-end** (init → tasks in one session): set `approvals.{phase}.generated = true` AND `approvals.{phase}.approved = true` for each completed phase before proceeding to the next.
  - When running a **single phase**: set `generated = true` but leave `approved = false` — user must explicitly approve before continuing.
 
+ **Task inventory:** `task_files` MUST be present and MUST list every real task file exactly once using relative paths like `tasks/task-R1-01-example.md`.
+
+ **Task machine-state:** `task_registry` MUST be present after Step 7. Each key is a relative task path, and each value MUST contain `id`, `title`, `status`, `dependencies`, `blocker`, `started_at`, `completed_at`, and `last_updated_at`.
+
+ **Validation recommendation:** `design_context.validation_recommended` MUST be `true` for auth, privacy, delete-data, migration, schema-change, browser-extension-permission, external-provider, or 5+ task file specs.
+
  **`ready_for_implementation`:** This field MUST only be set to `true` when ALL of the following conditions are met:
  1. `approvals.requirements.approved = true`
  2. `approvals.design.approved = true`
  3. `approvals.tasks.approved = true`
  4. `progress.tasks = "done"`
+ 5. `task_files` matches the real filesystem
+ 6. `task_registry` matches the real filesystem and does not omit any task file
+ 7. If `design_context.validation_recommended = true`, `validation.status = "completed"` (or another explicit user-accepted risk state that is recorded)
 
  If any approval is `false`, `ready_for_implementation` MUST remain `false`.
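Reviewer note (illustrative, not part of the package): the seven `ready_for_implementation` conditions read naturally as one predicate. The sketch below assumes the `spec.json` shape implied by the field names in the diff and takes the list of real task paths as an argument. The function name is hypothetical, and the "explicit user-accepted risk state" is left out because the diff does not name its value.

```python
def is_ready_for_implementation(spec, files_on_disk):
    """Evaluate conditions 1-7 from the diff against a parsed spec.json dict."""
    approvals = spec.get("approvals", {})
    approved = all(
        approvals.get(phase, {}).get("approved") is True
        for phase in ("requirements", "design", "tasks")
    )
    tasks_done = spec.get("progress", {}).get("tasks") == "done"

    disk = sorted(files_on_disk)
    inventory_ok = sorted(spec.get("task_files", [])) == disk
    registry_ok = sorted(spec.get("task_registry", {})) == disk

    needs_validation = (
        spec.get("design_context", {}).get("validation_recommended") is True
    )
    # Assumption: only "completed" satisfies the gate in this sketch.
    validation_ok = (not needs_validation) or (
        spec.get("validation", {}).get("status") == "completed"
    )
    return approved and tasks_done and inventory_ok and registry_ok and validation_ok
```

Comparing sorted lists also catches duplicate entries in `task_files`, which matches the "exactly once" wording of the task inventory rule.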
 
@@ -337,7 +388,7 @@ If any approval is `false`, `ready_for_implementation` MUST remain `false`.
  ```
  specs/
  └── <feature-name>/
- ├── spec.json # Metadata, state, scope_lock, dependencies
+ ├── spec.json # Metadata, state, scope_lock, dependencies, task_registry
  ├── requirements.md # Technical requirements (EARS format)
  ├── research.md # Research notes
  ├── design.md # Architectural design
@@ -359,7 +410,7 @@ specs/
  | Command | Purpose | Reference |
  |---|---|---|
  | `/hapo:specs status` | View status of all specs | — |
- | `/hapo:specs resume <feature>` | Continue an in-progress spec | — |
+ | `/hapo:specs resume <feature>` | Continue an active spec | — |
  | `/hapo:specs --validate <feature>` | Validate spec (auto: red team + validate based on complexity) | `references/review.md` |
  | `/hapo:specs archive` | Archive completed specs + write journal | `references/archive-workflow.md` |
 
@@ -393,12 +444,18 @@ Before finalizing any specification, assert all the following:
  - [ ] **Numeric requirement IDs** assigned to every requirement
  - [ ] **Discovery mode** selected and recorded in spec.json.design_context
  - [ ] **Requirements traceability** matrix present in design.md
+ - [ ] **Canonical Contracts & Invariants** filled for auth/transport/persistence/artifact-sensitive work
  - [ ] **Every task file** maps to at least 1 valid in-scope requirement ID
+ - [ ] **Every task file** includes `Verification & Evidence` with executable or inspectable proof
  - [ ] **State Machine Blueprint:** design.md contains Mermaid diagrams for non-trivial flows
  - [ ] **Dependency graph complete**: no task can start before its blockers are listed
  - [ ] **Risk matrix filled**: likelihood × impact, with mitigation for High items
  - [ ] **Test strategy defined**: what gets unit tested, integration tested, e2e validated
- - [ ] **spec.json fully updated**: phase, progress, timestamps, approvals, design_context
+ - [ ] **task_files inventory synced**: no missing or orphaned task references
+ - [ ] **task_registry synced**: every task file has exactly one machine-state entry with valid status + dependencies
+ - [ ] **Validation gate consistent**: validation_recommended and validation.status agree with spec risk
+ - [ ] **Provider wording clean**: no stale vendor/provider strings outside allowed research context
+ - [ ] **spec.json fully updated**: phase, current_phase, progress, timestamps, approvals, design_context
 
  ## When TO Use
 
@@ -7,7 +7,7 @@ Review a spec before implementation. The system auto-decides the review depth ba
  ## Spec Resolution
 
  1. If `<feature>` argument provided → use `specs/<feature>/`
- 2. If not → check active spec (spec with `in-progress` status)
+ 2. If not → check active spec (spec with `in_progress` status; accept legacy `in-progress` when reading existing files)
  3. If nothing found → ask user to specify path
 
  ## Auto-Decision: When to Red Team vs Validate
@@ -40,6 +40,9 @@ These rules override any self-reasoning or optimization the system may attempt:
  2. **No implementation code files.** This workflow produces ONLY `.md` files. If a finding requires a new shared module or config file, describe it inside the relevant `task-*.md` file. Do NOT create `.ts`, `.js`, `.py`, or any source code file.
  3. **Findings must use the exact format** defined in Part A Step 5 below. No shortened or custom formats.
  4. **Apply YAGNI to fixes.** When user says "configure later" or "decide later", add a single note to the task file. Do NOT generate multiple concrete implementations (e.g., 4 provider files when user only asked for abstraction).
+ 5. **No false completion.** You MUST NOT set `validation.status = "completed"` or `ready_for_implementation = true` until a reconciliation audit proves the accepted findings and validation decisions are reflected in the physical spec artifacts.
+ 6. **Provider drift is a real defect.** If the scope changed away from Claude/Anthropic, stale strings like `Claude API`, `Haiku`, or `haiku_reachable` in `requirements.md`, `design.md`, or `tasks/*.md` are validation failures. `research.md` may mention them only as historical comparison.
+ 7. **Implementation-facing propagation is mandatory.** A decision that affects implementation is NOT considered applied if it only appears in `Risk Assessment`, `validate-log.md`, or `red-team-report.md`. It must update at least one of: `requirements.md`, `Canonical Contracts & Invariants`, `Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, or `Verification & Evidence`.
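Reviewer note (illustrative, not part of the package): rule 6 amounts to a string scan over the implementation-facing files, with `research.md` exempt. The Python sketch below uses the three example strings named in the rule. The function name `find_provider_drift` is hypothetical, and the case-sensitive substring match is an assumption, since the diff does not say how matching should work.

```python
from pathlib import Path

# Example strings taken from rule 6; a real list may be longer.
STALE_PROVIDER_STRINGS = ("Claude API", "Haiku", "haiku_reachable")


def find_provider_drift(spec_dir):
    """Return (relative_path, line_number, matched_string) for each stale hit."""
    root = Path(spec_dir)
    # research.md is deliberately not scanned: historical comparison is allowed there.
    targets = [
        root / "requirements.md",
        root / "design.md",
        *sorted((root / "tasks").glob("*.md")),
    ]
    hits = []
    for path in targets:
        if not path.is_file():
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            for needle in STALE_PROVIDER_STRINGS:
                if needle in line:  # assumption: case-sensitive substring match
                    hits.append((path.relative_to(root).as_posix(), number, needle))
    return hits
```

An empty result would only be meaningful after a provider change; for specs whose scope lock still names the vendor, these strings are expected and the scan should not be run as a failure check.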
 
  ---
 
@@ -110,6 +113,7 @@ If "Review each one": For each finding, ask: "Apply" | "Reject" | "Modify sugges
  #### Step 8: Apply to Spec
  - Edit design.md / task files directly for accepted findings
  - Create `reports/red-team-report.md` documenting the full review session
+ - Mark a finding as `Applied To = ...` only after the physical file really contains the change
 
  ### Finding Output Format
 
@@ -141,7 +145,7 @@ If "Review each one": For each finding, ask: "Apply" | "Reject" | "Modify sugges
 
  ## Part B: Validation Interview
 
- ### 6-Step Workflow
+ ### 8-Step Workflow
 
  #### Step 1: Read Spec
  - `requirements.md` — technical requirements
@@ -211,6 +215,31 @@ Save to `reports/validate-log.md`:
  | Risk | Task files (Risk Assessment section) |
  | Unknown | `design.md` (add new subsection) |
 
+ **Additional propagation rules:**
+ - If the decision changes implementation behavior, update an implementation-facing section, not only `Risk Assessment`.
+ - If the decision changes scope or provider choice, scan `requirements.md`, `design.md`, and `tasks/*.md` for stale wording and normalize it.
+ - If the decision changes deletion/privacy behavior, update `Canonical Contracts & Invariants` first, then tasks that inherit that contract.
+
+ #### Step 7: Reconciliation Audit (MANDATORY)
+ Before declaring validation complete:
+ 1. Re-read `spec.json`, `requirements.md`, `design.md`, and all `tasks/task-*.md`
+ 2. Verify every accepted red-team finding and every validation action item is reflected in the correct physical file(s)
+ 3. Fail the audit if:
+ - a report says "applied" but the file still contains the old text
+ - stale provider strings remain after a provider change
+ - delete-data/privacy artifacts mix multiple canonical policies
+ - `spec.json.updated_at`, `timestamps.review_done`, or `timestamps.validation_done` do not reflect the final reviewed state
+ 4. Only after the audit passes may you:
+ - set `spec.json.validation.status = "completed"`
+ - set `spec.json.timestamps.validation_done`
+ - set `spec.json.timestamps.review_done`
+ - set `spec.json.ready_for_implementation = true` when all other gates are satisfied
+
+ #### Step 8: Final Status Write-Back
+ - Update `spec.json.updated_at` to the reconciliation time
+ - Ensure `red-team-report.md` and `validate-log.md` do not contradict `spec.json`
+ - If reconciliation fails, keep `validation.status` as `not-run` or `in_progress` and list blockers explicitly
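Reviewer note (illustrative, not part of the package): Steps 7 and 8 describe a write-back that only marks validation complete when the audit produced no failures. A minimal Python sketch follows. The function name and the `validation.blockers` key are hypothetical: the diff says to "list blockers explicitly" but does not say where they are recorded.

```python
from datetime import datetime, timezone


def write_back_validation(spec, audit_failures, now=None):
    """Apply the Step 8 write-back to a parsed spec.json dict, in place."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    validation = spec.setdefault("validation", {})

    if audit_failures:
        # Keep a non-completed status; never report completion on a failed audit.
        if validation.get("status") not in ("not-run", "in_progress"):
            validation["status"] = "in_progress"
        validation["blockers"] = list(audit_failures)  # hypothetical key
        return spec

    validation["status"] = "completed"
    validation["blockers"] = []
    timestamps = spec.setdefault("timestamps", {})
    timestamps["validation_done"] = stamp
    timestamps["review_done"] = stamp
    spec["updated_at"] = stamp  # reconciliation time, per Step 8
    return spec
```

Writing all three timestamps from a single captured value keeps `updated_at`, `review_done`, and `validation_done` synchronized, which is one of the conditions the reconciliation audit checks.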
+
  ---
 
  ## Combined Output
@@ -225,5 +254,5 @@ Validate: {Q} questions asked, {D} decisions confirmed
 
  Files modified: {list}
 
- 📌 Next step: /hapo:develop <feature>
+ 📌 Next step: /hapo:develop <feature> (ONLY if reconciliation audit passed)
  ```
@@ -2,7 +2,7 @@
 
  ## Purpose
 
- Convert task files (persistent storage) into Claude Tasks (session-scoped only), enabling real-time progress tracking and automatic unblocking when predecessor tasks complete.
+ Convert task files (persistent storage) into Claude Tasks (session-scoped only), enabling real-time progress tracking and automatic unblocking when predecessor tasks complete. `spec.json.task_registry` is the machine-state bridge between markdown task files and session-scoped Claude Tasks.
 
  ## When to Run
 
@@ -37,23 +37,24 @@ Convert task files (persistent storage) into Claude Tasks (session-scoped only),
 
  | Field | Description | Example |
  |---|---|---|
- | `taskNumber` | Task sequence number | `1` |
+ | `taskNumber` | Task sequence number | `R0-02` |
  | `priority` | Priority level | `P1`, `P2`, `P3` |
  | `effort` | Time estimate | `2h` |
  | `specDir` | Spec directory path | `specs/mobile-app/` |
- | `taskFile` | Task file name | `task-01-setup.md` |
+ | `taskFile` | Task file name | `task-R0-02-extension-shell.md` |
 
  ## Hydration Workflow
 
- 1. Read all `tasks/task-*.md` files → filter incomplete tasks
- 2. Create `TaskCreate` for each major task, attach `addBlockedBy` per dependency chain:
+ 1. Read all `tasks/task-R*.md` files → filter incomplete tasks
+ 2. Load `spec.json.task_registry` and prefer its task status/dependencies when present
+ 3. Create `TaskCreate` for each major task, attach `addBlockedBy` per dependency chain:
  ```
- Task 01 (no blockers) ← start here
- Task 02 (addBlockedBy: [task-01-id])
- Task 03 (addBlockedBy: [task-02-id])
+ Task R0-01 (no blockers) ← start here
+ Task R0-02 (addBlockedBy: [task-R0-01-id])
+ Task R1-01 (addBlockedBy: [task-R0-02-id])
  ```
- 3. Create additional `TaskCreate` for high-risk steps within tasks (if any)
- 4. Verify: no dependency cycles, all tasks have complete metadata
+ 4. Create additional `TaskCreate` for high-risk steps within tasks (if any)
+ 5. Verify: no dependency cycles, all tasks have complete metadata, and every hydrated task has a corresponding `task_registry` entry
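Reviewer note (illustrative, not part of the package): the "no dependency cycles" check in step 5 can be done with a depth-first search over `task_registry`, where each entry's `dependencies` list holds relative task paths. The sketch below is one common way to do it, not the package's implementation. The function name `find_dependency_cycle` is hypothetical.

```python
def find_dependency_cycle(registry):
    """Return one cycle as a list of task paths, or None if the graph is acyclic."""
    UNVISITED, IN_STACK, FINISHED = 0, 1, 2
    state = {path: UNVISITED for path in registry}
    stack = []

    def visit(path):
        state[path] = IN_STACK
        stack.append(path)
        for dep in registry[path].get("dependencies", []):
            if dep not in state:
                continue  # unknown path: reported by a separate inventory check
            if state[dep] == IN_STACK:
                # Back edge: the cycle runs from dep's position to the top of the stack.
                return stack[stack.index(dep):] + [dep]
            if state[dep] == UNVISITED:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[path] = FINISHED
        return None

    for path in registry:
        if state[path] == UNVISITED:
            cycle = visit(path)
            if cycle:
                return cycle
    return None
```

Returning the cycle itself, rather than a boolean, gives the hydration step something concrete to show the user when it refuses to create tasks.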
 
  ## Fallback
 
@@ -78,8 +79,9 @@ Convert task files (persistent storage) into Claude Tasks (session-scoped only),
  ### Sync-back (after code completes)
  1. `TaskUpdate` marks tasks as completed
  2. Update `[ ]` → `[x]` in task files
- 3. Update `spec.json` progress for the corresponding phase
- 4. Git commit captures the state transition
+ 3. Update `spec.json.task_registry[path].status`, timestamps, blocker, and last_updated_at
+ 4. Update `spec.json` progress for the corresponding phase
+ 5. Git commit captures the state transition
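Reviewer note (illustrative, not part of the package): step 3 of the sync-back updates one registry entry by path. A minimal Python sketch is below. The function name `mark_task_done` is hypothetical, and `"done"` as the completed task status is an assumption: the diff names `pending` as the task default and `done` in the spec-level vocabulary, but does not list the task-level values.

```python
from datetime import datetime, timezone


def mark_task_done(spec, task_path, now=None):
    """Update one task_registry entry after its task completes (in place)."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    # A KeyError here means the task was never registered, which the
    # hydration check is meant to prevent.
    entry = spec["task_registry"][task_path]
    entry["status"] = "done"        # assumed task-level status value
    entry["blocker"] = None         # a finished task has no open blocker
    entry["completed_at"] = stamp
    entry["last_updated_at"] = stamp
    return entry
```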
 
  ## Quality Checks
 
@@ -32,6 +32,8 @@ Detail bullets must include:
  - Validate core functionality early in sequence
  - Respect architecture boundaries defined in design.md (Architecture Pattern & Boundary Map)
  - Honor interface contracts documented in design.md
+ - Translate completion criteria into concrete proof (commands, artifacts, routes, manifests, schema objects, UI states)
+ - Reuse canonical contracts from `design.md` verbatim; never invent alternate auth/provider/deletion policies in task prose
  - Use major task summaries sparingly—omit detail bullets if the work is fully captured by child tasks.
 
  **End with integration tasks** to wire everything together.
@@ -112,6 +114,7 @@ Every task file MUST contain the Risk Assessment table, even if no risks are ide
  - `_Requirements: X.X, Y.Y_` listing **only numeric requirement IDs** (comma-separated). Never append descriptive text, parentheses, translations, or free-form labels.
  - For cross-cutting requirements, list every relevant requirement ID. All requirements MUST have numeric IDs in requirements.md. If an ID is missing, stop and correct requirements.md before generating tasks.
  - Reference components/interfaces from design.md when helpful (e.g., `_Contracts: AuthService API`)
+ - If a validation interview or red-team finding changes implementation behavior, update the sub-task itself. Do NOT hide the decision only inside `Risk Assessment`.
 
  ### 5. Code-Only Focus
 
 
@@ -131,6 +134,24 @@ Every task file MUST contain the Risk Assessment table, even if no risks are ide
131
134
  - When the design already guarantees functional coverage and rapid MVP delivery is prioritized, mark purely test-oriented follow-up work (e.g., baseline rendering/unit tests) as **optional** using the `- [ ]*` checkbox form.
132
135
  - Only apply the optional marker when the sub-task directly references acceptance criteria from requirements.md in its detail bullets.
133
136
  - Never mark implementation work or integration-critical verification as optional—reserve `*` for auxiliary/deferrable test coverage that can be revisited post-MVP.
137
+ - Never mark auth, permissions, privacy, data deletion, migration, schema, or contract verification work as optional.
138
+
139
+ ### Mandatory Verification & Evidence
140
+
141
+ Every task file MUST include a `## Verification & Evidence` section.
142
+
143
+ That section MUST contain:
144
+ 1. **Automated proof** — exact command(s) for typecheck, tests, build, or explicit `N/A`
145
+ 2. **Artifact/runtime proof** — exact files, routes, UI surfaces, generated outputs, or persisted state to inspect
146
+ 3. **Contract/negative-path proof** — at least one contract-preserving check for unauthorized, invalid, missing-permission, rollback, or failure-path behavior when relevant
147
+
148
+ Rules:
149
+ - If the task produces a build artifact or generated file, name the exact artifact path to inspect.
150
+ - If the task wires entrypoints (popup, content script, route, worker, CLI command), name the exact runtime surface that must exist after implementation.
151
+ - If verification depends on environment or manual setup, document the blocker explicitly instead of implying success.
152
+ - Build success alone is NEVER enough evidence for a completed task.
153
+ - For provider-sensitive work, use provider-neutral wording unless the scope lock explicitly names a vendor.
154
+ - For delete-data/privacy work, task text MUST match the single deletion/retention policy chosen in `design.md`. Mixed policies are invalid.
134
155
 
135
156
  ## Task Hierarchy Rules
136
157
 
@@ -65,6 +65,20 @@ When modifying existing systems:
 
  > Keep rationale concise here and, when more depth is required (trade-offs, benchmarks), add a short summary plus pointer to the Supporting References section and `research.md` for raw investigation notes.
 
+ ## Canonical Contracts & Invariants
+
+ Capture only the contracts whose inconsistency would break downstream implementation or verification. If the feature touches auth/session, transport/entrypoints, persistence/schema, generated artifacts, or runtime outputs, this section is mandatory.
+
+ | Contract Area | Canonical Decision | Applies To | Must Stay Consistent In |
+ |---------------|--------------------|------------|-------------------------|
+ | Auth / session | | | |
+ | Transport / entrypoints | | | |
+ | Data / persistence | | | |
+ | Deletion / retention policy | | | |
+ | Generated artifacts / runtime outputs | | | |
+
+ > Task files must reuse the same contract wording. If the feature touches delete-data or privacy retention, explicitly decide whether re-registration is blocked and how that lock works without keeping raw PII. If implementation later needs a different contract, update this section first before generating or editing tasks.
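Because the template ships the contracts table with empty cells, it may help to detect rows that were never filled in. This is a minimal sketch, assuming the table keeps the `Contract Area` header shown above and uses one pipe-delimited row per line; `unfilled_contracts` is a hypothetical helper, not something cafekit provides.

```python
def unfilled_contracts(design_markdown: str) -> list[str]:
    """Return contract areas whose 'Canonical Decision' cell is empty."""
    missing = []
    in_table = False
    for line in design_markdown.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            in_table = False  # any non-table line ends the current table
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if cells[0] == "Contract Area":
            in_table = True  # header row of the contracts table
            continue
        if not in_table or set(cells[0]) <= {"-", ":"}:
            continue  # a different table, or the |---| separator row
        if len(cells) < 2 or not cells[1]:
            missing.append(cells[0])
    return missing
```

Rows for areas the feature does not touch would also be reported, so a caller would need its own convention (such as writing `N/A`) to mark them as decided.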
+
  ## System Flows
 
  Provide only the diagrams needed to explain non-trivial flows. Use pure Mermaid syntax. Common patterns:
@@ -3,8 +3,9 @@
  "created_at": "{{TIMESTAMP}}",
  "updated_at": "{{TIMESTAMP}}",
  "language": "vi",
- "status": "pending",
+ "status": "in_progress",
  "phase": "initialized",
+ "current_phase": "init",
  "priority": "P2",
  "effort": null,
  "tags": [],
@@ -32,6 +33,9 @@
  "blockedBy": [],
  "blocks": []
  },
+ "task_files": [],
+ "task_registry": {},
+ "blocker": null,
  "progress": {
  "init": "done",
  "requirements": "pending",
@@ -50,7 +54,8 @@
  "tasks_done": null,
  "code_done": null,
  "test_done": null,
- "review_done": null
+ "review_done": null,
+ "validation_done": null
  },
  "design_context": {
  "discovery_mode": null,
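A `spec.json` created from the older template will lack the keys added in this hunk. The sketch below shows one way a reader could backfill them. It assumes `current_phase`, `task_files`, `task_registry`, and `blocker` sit at the top level (the hunk does not show nesting), and it leaves out `validation_done` because its parent object is not visible in the diff. `backfill_spec` is a hypothetical helper; whether cafekit migrates existing files itself is not stated here.

```python
import copy

# Keys and defaults as they appear in the updated template; assumed top-level.
NEW_TOP_LEVEL_DEFAULTS = {
    "current_phase": "init",
    "task_files": [],
    "task_registry": {},
    "blocker": None,
}


def backfill_spec(spec: dict) -> dict:
    """Return a copy of `spec` with any missing new keys set to template defaults."""
    upgraded = copy.deepcopy(spec)
    for key, default in NEW_TOP_LEVEL_DEFAULTS.items():
        # setdefault keeps values that an existing spec already carries.
        upgraded.setdefault(key, copy.deepcopy(default))
    return upgraded
```

The function deliberately leaves `status` alone: an existing spec's `pending` or `in_progress` value reflects real state rather than a template default.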
@@ -25,14 +25,27 @@
 
  <!-- Additional requirements follow the same pattern -->
 
- ## Non-Functional Requirements (NFRs)
+ ## Non-Functional Requirements
 
- ### Performance & Scalability
- - The [system] shall [measurable performance metric, e.g. "respond within 500ms"]
- - The [system] shall [measurable scale metric, e.g. "support 100 concurrent users"]
+ <!-- Continue the SAME numeric sequence as functional requirements. Do NOT switch to labels like NFR-1, SEC-1, PERF-1. -->
 
- ### Security & Privacy
- - The [system] shall [measurable security behavior, e.g. "encrypt data at rest using AES-256"]
+ ### Requirement {{NEXT_REQ_NUMBER}}: Performance & Scalability
+ **Objective:** As a system owner, I want predictable performance characteristics, so that the feature remains usable under expected load.
 
- ### Reliability & Availability
- - If [failure condition], the [system] shall [recovery behavior]
+ #### Acceptance Criteria
+ {{NEXT_REQ_NUMBER}}.1 The [system] shall [measurable performance metric, e.g. "respond within 500ms"]
+ {{NEXT_REQ_NUMBER}}.2 The [system] shall [measurable scale metric, e.g. "support 100 concurrent users"]
+
+ ### Requirement {{NEXT_REQ_NUMBER_PLUS_ONE}}: Security & Privacy
+ **Objective:** As a security/compliance stakeholder, I want the feature to protect sensitive data and enforce access boundaries, so that the system is safe to ship.
+
+ #### Acceptance Criteria
+ {{NEXT_REQ_NUMBER_PLUS_ONE}}.1 The [system] shall [measurable security behavior, e.g. "encrypt data at rest using AES-256"]
+ {{NEXT_REQ_NUMBER_PLUS_ONE}}.2 If [unauthorized or invalid condition], the [system] shall [deny or recover with explicit behavior]
+
+ ### Requirement {{NEXT_REQ_NUMBER_PLUS_TWO}}: Reliability & Availability
+ **Objective:** As an operator, I want predictable failure handling, so that incidents remain diagnosable and recoverable.
+
+ #### Acceptance Criteria
+ {{NEXT_REQ_NUMBER_PLUS_TWO}}.1 If [failure condition], the [system] shall [recovery behavior]
+ {{NEXT_REQ_NUMBER_PLUS_TWO}}.2 The [system] shall [durability / retry / fallback expectation]
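The three placeholders continue the functional numbering, so a spec whose last functional requirement is 4 would get non-functional requirements 5, 6, and 7. A minimal sketch of that substitution follows, assuming the placeholders are filled by plain text replacement; `number_nfr_template` is a hypothetical helper, and the diff does not say how cafekit itself resolves them.

```python
def number_nfr_template(template: str, last_functional_req: int) -> str:
    """Fill the NFR placeholders so numbering continues after the functional requirements."""
    first = last_functional_req + 1
    replacements = {
        "{{NEXT_REQ_NUMBER}}": str(first),
        "{{NEXT_REQ_NUMBER_PLUS_ONE}}": str(first + 1),
        "{{NEXT_REQ_NUMBER_PLUS_TWO}}": str(first + 2),
    }
    # The closing braces are part of each key, so the shortest placeholder
    # cannot accidentally match inside the longer ones.
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template
```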
@@ -54,9 +54,21 @@
 
  ## Completion Criteria
 
- - [ ] {{Criteria 1 — observable, testable, maps to acceptance criteria R{{REQ_NUMBER}}}}
- - [ ] {{Criteria 2 — measurable, objective}}
- - [ ] {{Criteria 3 — maps directly to acceptance criteria from requirements.md}}
+ - [ ] {{Criteria 1 — observable output or artifact, maps to acceptance criteria R{{REQ_NUMBER}}}}
+ - [ ] {{Criteria 2 — measurable behavior or negative-path outcome}}
+ - [ ] {{Criteria 3 — maps directly to acceptance criteria from requirements.md and can be proven below}}
+
+ ## Verification & Evidence
+
+ - [ ] Automated verification
+   - Command(s): `{{TYPECHECK / TEST / BUILD COMMANDS OR N/A}}`
+   - Expected proof: {{What output, exit code, or report proves success}}
+ - [ ] Artifact / runtime verification
+   - Inspect: `{{artifact path | route | UI state | DB object | manifest entry}}`
+   - Expect: {{Observable result that proves the task is really wired}}
+ - [ ] Contract / negative-path verification
+   - Check: {{Unauthorized path, validation error, permission omission, missing env behavior, deletion effect, etc.}}
+   - Expect: {{Concrete failure mode or contract-preserving behavior}}
 
  ## Risk Assessment
 
@@ -70,3 +82,4 @@
  > **Parallel marker**: Append `(P)` to the title if this task can run concurrently with another (usually when serving different requirements).
  > **Test note**: If a test coverage sub-task can be deferred post-MVP, mark it with `- [ ]*`.
  > **Requirement mapping**: Every sub-task MUST end with `_Requirements: X.X_`. No mapping = invalid task file.
+ > **Verification rule**: No `## Verification & Evidence` section = invalid task file.
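The verification rule above can be checked without an agent in the loop. The sketch below assumes task files keep the three checkbox labels exactly as written in this template and use `## ` for top-level sections; `section_body` and `verification_problems` are hypothetical names for illustration.

```python
import re
from typing import Optional

# Checkbox labels copied from the task template above.
REQUIRED_CHECKS = (
    "Automated verification",
    "Artifact / runtime verification",
    "Contract / negative-path verification",
)


def section_body(markdown: str, heading: str) -> Optional[str]:
    """Return the text under `## heading`, up to the next `## ` heading or end of file."""
    pattern = rf"^## {re.escape(heading)}[ \t]*$(.*?)(?=^## |\Z)"
    match = re.search(pattern, markdown, flags=re.MULTILINE | re.DOTALL)
    return match.group(1) if match else None


def verification_problems(task_markdown: str) -> list[str]:
    """List what is missing from the Verification & Evidence section (empty list = valid)."""
    body = section_body(task_markdown, "Verification & Evidence")
    if body is None:
        return ["missing `## Verification & Evidence` section"]
    return [f"missing checkbox: {label}" for label in REQUIRED_CHECKS if label not in body]
```

This only confirms that the structure exists. Whether the commands and artifact paths inside it are real still needs a human or a test run.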
@@ -1,22 +1,22 @@
  ---
  name: hapo:sync
- description: "Dumb-proof status tracker and file synchronizer. Updates spec.json and tasks/*.md without breaking structural schemas. Includes Auto-Audit."
+ description: "Dumb-proof status tracker and file synchronizer. Updates spec.json, task_registry, and tasks/*.md without breaking structural schemas. Includes Auto-Audit."
  version: 1.0.0
- argument-hint: "<feature_name> <task_id> <status> [blocker] | phase <feature_name> <next_phase> | audit <feature_name>"
+ argument-hint: "<feature_name> <task_id|task-file> <status> [blocker] | phase <feature_name> <next_phase> | audit <feature_name>"
  ---
 
  # Sync (State Tracking Protocol)
 
- This skill safely bridges the gap between active development state and physical documentation files (`spec.json` & `task-0*.md`). Instead of relying on risky raw AI edits, this skill executes precise contextual replacements.
+ This skill safely bridges the gap between active development state and physical documentation files (`spec.json` + `task_registry` + `tasks/task-R*.md`). Instead of relying on risky raw AI edits, this skill executes precise contextual replacements.
 
  ## Supported Commands
 
  ### 1. Task Synchronization
  Update a specific task's status and automatically check its relevant sub-checkboxes.
 
- **Usage:** `/hapo:sync <feature_name> <task_id> <status> ["optional blocker msg"]`
- - Example 1: `/hapo:sync auth task-01 completed`
- - Example 2: `/hapo:sync payment task-03 blocked "API Endpoint Down"`
+ **Usage:** `/hapo:sync <feature_name> <task_id|task-file> <status> ["optional blocker msg"]`
+ - Example 1: `/hapo:sync auth R0-02 done`
+ - Example 2: `/hapo:sync payment task-R1-03-chunks-api.md blocked "API Endpoint Down"`
 
  ### 2. Phase Advancement
  Advance the entire project to the next logical phase.
@@ -25,16 +25,17 @@ Advance the entire project to the next logical phase.
  - Example: `/hapo:sync phase shopping_cart test`
 
  ### 3. State Audit
- Scans the `spec.json` against all physical `task-0*.md` files to detect mismatches or un-checked boxes and repairs them.
+ Scans the `spec.json` against all physical `task-R*.md` files to detect mismatches between `task_files`, `task_registry`, and markdown task headers, then repairs them.
 
  **Usage:** `/hapo:sync audit <feature_name>`
  - Example: `/hapo:sync audit auth`
 
  ## Directives
 
- 1. **Precision Edits:** Never overwrite the entire `spec.json` string. Provide surgical JSON modification protocols.
- 2. **Markdown Integrity:** When marking a task "completed", use Regex to turn `[ ]` into `[x]` ONLY inside the `## Các bước thực hiện` section.
- 3. **Task Completion Hook:** When `hapo:sync` marks the final pending task as `completed`, it should automatically prompt the user if they'd like to advance the Phase via `hapo:sync phase`.
+ 1. **Precision Edits:** Never overwrite the entire `spec.json` string blindly. Update only the required keys, while keeping JSON valid.
+ 2. **Machine + Human Sync:** Every task status update MUST modify both `spec.json.task_registry[...]` and the matching markdown task file header/status section.
+ 3. **Markdown Integrity:** Turn `[ ]` into `[x]` only when marking a task `done`, and only inside `## Implementation Steps` plus the `Completion Criteria` / `Verification & Evidence` checkboxes that have actual proof.
+ 4. **Task Completion Hook:** When `hapo:sync` marks the final pending task as `done`, it should automatically prompt the user if they'd like to advance the phase.
 
  ## References
  Read `references/sync-protocols.md` for exact Search/Replace regex patterns and JSON schema expectations before acting on the files.
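The three command shapes in the `argument-hint` can be told apart from the arguments alone. This is a minimal sketch, assuming the arguments arrive as one string and that task statuses are limited to `pending`, `in_progress`, `blocked`, and `done`; `SyncCommand` and `parse_sync_args` are hypothetical names, since the skill itself is executed by an agent rather than by a parser.

```python
import shlex
from dataclasses import dataclass
from typing import Optional

TASK_STATUSES = {"pending", "in_progress", "blocked", "done"}


@dataclass
class SyncCommand:
    kind: str                      # "task", "phase", or "audit"
    feature: str
    target: Optional[str] = None   # task reference or next phase
    status: Optional[str] = None
    blocker: Optional[str] = None


def parse_sync_args(argument_string: str) -> SyncCommand:
    """Parse the text that follows `/hapo:sync` into a structured command."""
    args = shlex.split(argument_string)  # keeps a quoted blocker message as one argument
    if len(args) == 2 and args[0] == "audit":
        return SyncCommand("audit", feature=args[1])
    if len(args) == 3 and args[0] == "phase":
        return SyncCommand("phase", feature=args[1], target=args[2])
    if len(args) in (3, 4) and args[2] in TASK_STATUSES:
        blocker = args[3] if len(args) == 4 else None
        return SyncCommand("task", feature=args[0], target=args[1], status=args[2], blocker=blocker)
    raise ValueError(f"unrecognised /hapo:sync arguments: {argument_string!r}")
```

Rejecting unknown statuses up front means an old-style call such as `auth task-01 completed` fails loudly instead of writing a status the registry does not recognise.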
@@ -2,34 +2,54 @@
 
  The following guidelines dictate exactly how `hapo:sync` should interact with files to prevent data corruption.
 
+ **Canonical task status vocabulary:** `pending`, `in_progress`, `blocked`, `done`
+
  ## 1. Updating `spec.json`
 
  When requested to update a phase or change task configuration, `spec.json` must maintain its strict schema (defined in `hapo:specs/templates/init.json`).
 
- * **JSON Modification Rule:** Do not output whole files. Instead, load the JSON structure, apply the update to `status`, `current_phase`, `blocker` (if any), and overwrite the file cleanly.
- * **Status Update:** If a task changes to `blocked`, `spec.json`'s main `status` must transition to `"blocked"`, and the `"blocker"` string must record the task ID & reason.
+ * **JSON Modification Rule:** Do not output whole files. Instead, load the JSON structure, apply the update to `status`, `current_phase`, `blocker` (if any), `task_files`, and the relevant `task_registry` entry, then overwrite the file cleanly.
+ * **Task Registry Rule:** Resolve the incoming task reference to a single relative path in `task_registry`. Accept either:
+   - compact task ID like `R0-02`
+   - full filename like `task-R0-02-extension-shell.md`
+   - full relative path like `tasks/task-R0-02-extension-shell.md`
+ * **Status Update:** If a task changes to `blocked`, the matching `task_registry[path].status` must become `"blocked"`, `task_registry[path].blocker` must record the reason, and `spec.json.status` / `spec.json.blocker` must reflect the top-level block if work is globally blocked.
+ * **Timestamp Rule:** Update `task_registry[path].started_at`, `completed_at`, and `last_updated_at` consistently with the new state. Also refresh `spec.json.updated_at`.
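The Task Registry Rule accepts three reference shapes that must all land on one registry key. The sketch below is one way to do that, assuming registry keys are relative paths such as `tasks/task-R0-02-extension-shell.md` and that a compact ID maps to a `task-<ID>-<slug>.md` filename; `resolve_task_ref` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def resolve_task_ref(ref: str, registry_paths: list[str]) -> str:
    """Map a task reference to exactly one relative path from `task_registry`."""
    name = PurePosixPath(ref).name
    if name.endswith(".md"):
        # Full filename or full relative path: compare on the filename.
        matches = [p for p in registry_paths if PurePosixPath(p).name == name]
    else:
        # Compact ID such as `R0-02`: match `task-R0-02-<slug>.md` or `task-R0-02.md`.
        prefix = f"task-{ref}"
        matches = [
            p for p in registry_paths
            if PurePosixPath(p).name.startswith((prefix + "-", prefix + "."))
        ]
    if len(matches) != 1:
        raise LookupError(f"{ref!r} resolved to {len(matches)} task files, expected 1")
    return matches[0]
```

Requiring exactly one match keeps the "single relative path" guarantee: an ambiguous or unknown reference stops the update instead of touching the wrong task.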
 
  ## 2. Updating `tasks/task-**.md`
 
- The structure of `tasks/task.md` relies heavily on exact keyword markers. Follow these surgical regex protocols:
+ The structure of `tasks/task.md` relies heavily on exact keyword markers. Follow these surgical protocols against `tasks/task-R*.md`:
 
  ### A. Completing a Task
- When `/hapo:sync <feature> <task-id> completed`:
- 1. Find: `**Trạng thái:** pending` (or `in_progress`).
- 2. Replace with: `**Trạng thái:** completed`.
- 3. Locate block: `## Các bước thực hiện`.
- 4. Convert every `- [ ]` into `- [x]` strictly within that section. Ignore checkboxes elsewhere in the document.
+ When `/hapo:sync <feature> <task-id> done`:
+ 1. Find: `**Status:** pending` (or `in_progress` / `blocked`).
+ 2. Replace with: `**Status:** done`.
+ 3. Locate block: `## Implementation Steps`.
+ 4. Convert `- [ ]` into `- [x]` strictly within that section.
+ 5. Update relevant checkboxes in `## Completion Criteria` and `## Verification & Evidence` only when the caller provides or the file already contains real proof.
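Steps 1 through 4 above can be expressed as two scoped regex substitutions. The following is a minimal sketch, assuming the status marker appears once per file and that sections start with `## `; `mark_task_done` is a hypothetical helper, and step 5 is intentionally left out because it depends on proof the function cannot see.

```python
import re

STATUS_RE = re.compile(r"(\*\*Status:\*\* )(?:pending|in_progress|blocked)\b")
STEPS_RE = re.compile(
    r"(^## Implementation Steps[ \t]*$)(.*?)(?=^## |\Z)",
    re.MULTILINE | re.DOTALL,
)


def mark_task_done(task_markdown: str) -> str:
    """Set the status to done and check boxes inside `## Implementation Steps` only."""
    updated, count = STATUS_RE.subn(r"\1done", task_markdown, count=1)
    if count == 0:
        raise ValueError("no `**Status:**` marker in a state that can move to done")

    def check_boxes(match: re.Match) -> str:
        heading, body = match.group(1), match.group(2)
        # Only the captured section body is rewritten; other sections are untouched.
        return heading + re.sub(r"^([ \t]*)- \[ \]", r"\1- [x]", body, flags=re.MULTILINE)

    return STEPS_RE.sub(check_boxes, updated, count=1)
```

Capturing the section body first and substituting inside it is what keeps `## Completion Criteria` and `## Verification & Evidence` checkboxes unchecked until proof exists.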
 
  ### B. Blocking a Task
  When `/hapo:sync <feature> <task-id> blocked "API error"`:
- 1. Find: `**Trạng thái:** <anything>`.
- 2. Replace with: `**Trạng thái:** blocked`.
- 3. Ensure that an entry under `## Đánh giá Rủi ro` or a new section `## Blocker Log` is injected recording the explicit reason (e.g. `API error`).
+ 1. Find: `**Status:** <anything>`.
+ 2. Replace with: `**Status:** blocked`.
+ 3. Ensure that an entry under `## Blocker Log` exists recording the explicit reason (e.g. `API error`) and timestamp.
+
+ ### C. Starting / Resuming a Task
+ When `/hapo:sync <feature> <task-id> in_progress`:
+ 1. Find: `**Status:** pending` (or `blocked`).
+ 2. Replace with: `**Status:** in_progress`.
+ 3. Do NOT pre-check completion boxes.
+ 4. Stamp `task_registry[path].started_at` if missing and refresh `last_updated_at`.
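The registry side of these transitions is mostly timestamp bookkeeping. The sketch below shows one consistent policy for a single `task_registry[path]` entry. The field names come from the protocol above, while ISO 8601 timestamps, clearing `completed_at` on reopen, and clearing `blocker` when a task leaves `blocked` are assumptions of this example. `apply_status` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional

STATUSES = ("pending", "in_progress", "blocked", "done")


def apply_status(entry: dict, status: str, blocker: Optional[str] = None,
                 now: Optional[datetime] = None) -> dict:
    """Return a copy of one registry entry moved to `status`."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    updated = dict(entry, status=status, last_updated_at=stamp)
    if status == "in_progress" and not updated.get("started_at"):
        updated["started_at"] = stamp  # stamp only if missing
    # Assumption: leaving `done` clears the completion time.
    updated["completed_at"] = stamp if status == "done" else None
    # Assumption: a blocker reason is kept only while the task is blocked.
    updated["blocker"] = blocker if status == "blocked" else None
    return updated
```

Returning a copy instead of mutating in place makes it easy to write the new entry and refresh `spec.json.updated_at` in the same step.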
 
  ## 3. Audit Protocol
 
  When `/hapo:sync audit <feature>` is activated:
  1. **Load Truth:** Read `specs/<feature>/spec.json`.
  2. **Scan Directory:** Loop through `specs/<feature>/tasks/`.
- 3. **Compare Constraints:** If parsing `task-01.md` reveals `Trạng thái: completed` but `spec.json` is missing this accounting, update the JSON.
- 4. **Correction Alert:** Output a brief markdown alert detailing mismatches fixed.
+ 3. **Compare Constraints:** Rebuild `task_files` from disk, ensure every file exists in `task_registry`, and compare markdown `**Status:**` headers against `task_registry[path].status`.
+ 4. **Reconciliation Rules:**
+    - Missing registry entry → create it
+    - Missing disk file referenced in registry → remove or flag it
+    - Markdown says `done` but registry not done → update the registry to `done` only if evidence already exists; otherwise downgrade markdown or flag conflict
+    - Registry says `done` but markdown still pending → update markdown only if evidence exists
+ 5. **Correction Alert:** Output a brief markdown alert detailing mismatches fixed and any unresolved conflicts requiring manual review.
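The reconciliation rules can be sketched as a pure planning function that reports what an audit would do without writing anything. This example makes several assumptions: a registry entry with no file on disk is flagged rather than removed, a `done` claim on either side is accepted only when evidence exists, and mismatches that do not involve `done` are flagged for manual review. `plan_audit` and the `has_evidence` callback are hypothetical.

```python
from typing import Callable


def plan_audit(disk_status: dict[str, str], registry: dict[str, dict],
               has_evidence: Callable[[str], bool]) -> list[tuple[str, str]]:
    """Return (path, action) pairs describing the repairs an audit would make.

    `disk_status` maps each task file found on disk to its markdown `**Status:**`.
    """
    actions = []
    for path in sorted(set(disk_status) | set(registry)):
        if path not in registry:
            actions.append((path, "create registry entry"))
        elif path not in disk_status:
            actions.append((path, "flag: registry entry has no file on disk"))
        else:
            md, reg = disk_status[path], registry[path].get("status")
            if md == reg:
                continue
            if "done" in (md, reg) and not has_evidence(path):
                actions.append((path, "flag: `done` without evidence, manual review"))
            elif md == "done":
                actions.append((path, "set registry status to done"))
            elif reg == "done":
                actions.append((path, "set markdown status to done"))
            else:
                actions.append((path, f"flag: markdown={md}, registry={reg}"))
    return actions
```

The returned list doubles as the content of the correction alert: applied repairs and unresolved flags can be printed straight from it.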