contract-driven-delivery 1.12.0 → 2.0.0

Files changed (38)
  1. package/CHANGELOG.md +169 -0
  2. package/README.md +58 -38
  3. package/assets/CLAUDE.template.md +4 -12
  4. package/assets/agents/backend-engineer.md +5 -26
  5. package/assets/agents/change-classifier.md +87 -27
  6. package/assets/agents/ci-cd-gatekeeper.md +4 -25
  7. package/assets/agents/contract-reviewer.md +4 -25
  8. package/assets/agents/dependency-security-reviewer.md +4 -24
  9. package/assets/agents/e2e-resilience-engineer.md +4 -25
  10. package/assets/agents/frontend-engineer.md +4 -25
  11. package/assets/agents/monkey-test-engineer.md +4 -25
  12. package/assets/agents/qa-reviewer.md +4 -25
  13. package/assets/agents/repo-context-scanner.md +4 -24
  14. package/assets/agents/spec-architect.md +4 -25
  15. package/assets/agents/spec-drift-auditor.md +4 -24
  16. package/assets/agents/stress-soak-engineer.md +4 -25
  17. package/assets/agents/test-strategist.md +4 -25
  18. package/assets/agents/ui-ux-reviewer.md +4 -24
  19. package/assets/agents/visual-reviewer.md +4 -24
  20. package/assets/cdd/model-policy.json +20 -1
  21. package/assets/hooks/post-tool-use-files-read.sh +55 -0
  22. package/assets/skills/cdd-close/SKILL.md +9 -9
  23. package/assets/skills/cdd-new/SKILL.md +201 -198
  24. package/assets/skills/cdd-resume/SKILL.md +16 -16
  25. package/assets/skills/contract-driven-delivery/SKILL.md +6 -0
  26. package/assets/skills/contract-driven-delivery/references/agent-log-protocol.md +147 -0
  27. package/assets/skills/contract-driven-delivery/scripts/generate_change_scaffold.py +1 -1
  28. package/assets/skills/contract-driven-delivery/scripts/validate_spec_traceability.py +1 -1
  29. package/assets/skills/contract-driven-delivery/templates/agent-log.example.yml +14 -0
  30. package/assets/skills/contract-driven-delivery/templates/change-classification.md +1 -1
  31. package/assets/skills/contract-driven-delivery/templates/tasks.yml +39 -0
  32. package/assets/specs-templates/change-classification.md +1 -1
  33. package/assets/specs-templates/context-manifest.md +8 -13
  34. package/assets/specs-templates/tasks.yml +39 -0
  35. package/dist/cli/index.js +11057 -829
  36. package/package.json +7 -3
  37. package/assets/skills/contract-driven-delivery/templates/tasks.md +0 -50
  38. package/assets/specs-templates/tasks.md +0 -52
@@ -0,0 +1,55 @@
+ #!/bin/sh
+ # cdd-kit PostToolUse hook (B3): append actual Read/Grep/Glob targets to a
+ # runtime audit log so `cdd-kit gate` can reconcile them against the agent-log
+ # self-report. This turns Context Governance from a trust contract into a
+ # verified contract.
+ #
+ # Wire into Claude Code (~/.claude/settings.json):
+ #
+ # {
+ #   "hooks": {
+ #     "PostToolUse": [
+ #       { "matcher": "Read|Grep|Glob", "command": "/path/to/hooks/post-tool-use-files-read.sh" }
+ #     ]
+ #   }
+ # }
+ #
+ # The hook receives the tool-call payload as JSON on stdin. We extract the
+ # best-effort path candidate and append one JSON record with the keys
+ # ts, change, and path to a JSONL audit file. CURRENT_CHANGE_ID is read from
+ # the environment (cdd-new sets it on every agent invocation as of v1.10.0+).
+
+ set -eu
+
+ CDD_RUNTIME_DIR="${CDD_RUNTIME_DIR:-./.cdd/runtime}"
+ CHANGE_ID="${CURRENT_CHANGE_ID:-unknown}"
+
+ mkdir -p "$CDD_RUNTIME_DIR"
+ LOG_FILE="$CDD_RUNTIME_DIR/${CHANGE_ID}-files-read.jsonl"
+
+ # Read JSON payload from stdin without choking if jq is missing.
+ payload="$(cat || true)"
+ [ -z "$payload" ] && exit 0
+
+ # Try to extract the path field. Common Claude Code tool inputs:
+ # Read → tool_input.file_path
+ # Grep → tool_input.path / glob / pattern
+ # Glob → tool_input.path / pattern
+ # We try jq first when it is available, then fall back to grep (file_path only).
+ path_value=""
+ if command -v jq >/dev/null 2>&1; then
+   path_value="$(printf '%s' "$payload" | jq -r '
+     .tool_input.file_path
+     // .tool_input.path
+     // .tool_input.pattern
+     // empty
+   ' 2>/dev/null || true)"
+ fi
+ if [ -z "$path_value" ]; then
+   path_value="$(printf '%s' "$payload" | grep -oE '"file_path"[[:space:]]*:[[:space:]]*"[^"]+"' | head -n1 | sed -E 's/.*"file_path"[[:space:]]*:[[:space:]]*"([^"]+)".*/\1/')"
+ fi
+
+ [ -z "$path_value" ] && exit 0
+
+ timestamp="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
+ printf '{"ts":"%s","change":"%s","path":"%s"}\n' "$timestamp" "$CHANGE_ID" "$path_value" >> "$LOG_FILE"
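The hook header says `cdd-kit gate` reconciles this audit log against the agent-log self-report, but the diff does not show how. A minimal Python sketch of one possible reconciliation follows. The function names `read_audited_paths` and `reconcile`, the sample change id, and the sample paths are all hypothetical; only the record keys (`ts`, `change`, `path`) come from the hook above.

```python
import json


def read_audited_paths(jsonl_text, change_id):
    """Collect the paths the hook recorded for one change id."""
    paths = set()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # The hook does not escape quotes or backslashes in paths,
            # so a malformed line is possible; skip it rather than fail.
            continue
        if isinstance(record, dict) and record.get("change") == change_id:
            path = record.get("path")
            if isinstance(path, str):
                paths.add(path)
    return paths


def reconcile(audited, self_reported):
    """Return (read but not reported, reported but never observed)."""
    audited, self_reported = set(audited), set(self_reported)
    return sorted(audited - self_reported), sorted(self_reported - audited)


# Hypothetical sample data in the shape the hook writes.
sample_log = "\n".join([
    '{"ts":"2024-01-01T00:00:00Z","change":"add-2fa","path":"src/auth/login.ts"}',
    '{"ts":"2024-01-01T00:00:01Z","change":"add-2fa","path":"src/auth/totp.ts"}',
    '{"ts":"2024-01-01T00:00:02Z","change":"other","path":"README.md"}',
    "not json",
])
unreported, unobserved = reconcile(
    read_audited_paths(sample_log, "add-2fa"),
    ["src/auth/login.ts", "docs/auth.md"],
)
print(unreported, unobserved)
```

Treating both directions as separate lists is a design choice of this sketch: a path that was read but not reported suggests a scope violation, while a path reported but never observed may only mean the hook was not wired in.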
@@ -38,7 +38,7 @@ If the user wants to **abandon** this change (not close as complete):
38
38
  cdd-kit abandon <change-id> --reason "<reason>"
39
39
  ```
40
40
 
41
- This marks `tasks.md` as `status: abandoned` and records it in `specs/archive/INDEX.md`. The directory is preserved for git history. Do NOT run the rest of this skill after abandoning.
41
+ This marks `tasks.yml` as `status: abandoned` and records it in `specs/archive/INDEX.md`. The directory is preserved for git history. Do NOT run the rest of this skill after abandoning.
42
42
 
43
43
  ---
44
44
 
@@ -48,13 +48,13 @@ Run: `cdd-kit gate <change-id>`
48
48
 
49
49
  If gate fails: stop and report failures. Do NOT archive a change that hasn't passed gate.
50
50
 
51
- Exception: if `tasks.md` contains `status: gate-blocked`, ask the user: "This change was gate-blocked. Abandon it? (yes/no)". If yes, run `cdd-kit abandon <change-id> --reason "gate-blocked after 3 attempts"` and stop.
51
+ Exception: if `tasks.yml` contains `status: gate-blocked`, ask the user: "This change was gate-blocked. Abandon it? (yes/no)". If yes, run `cdd-kit abandon <change-id> --reason "gate-blocked after 3 attempts"` and stop.
52
52
 
53
53
  ---
54
54
 
55
- ## Step 2: Review tasks.md section 7
55
+ ## Step 2: Review tasks.yml section 7
56
56
 
57
- Read `specs/changes/<change-id>/tasks.md`.
57
+ Read `specs/changes/<change-id>/tasks.yml`.
58
58
 
59
59
  Check section 7:
60
60
  - `7.1 Archive change` — will be ticked after Step 4
@@ -71,7 +71,7 @@ Read only active evidence for this change:
71
71
  - `specs/changes/<change-id>/qa-report.md` (if exists)
72
72
  - `specs/changes/<change-id>/ci-gates.md`
73
73
  - `specs/changes/<change-id>/context-manifest.md`
74
- - `specs/changes/<change-id>/tasks.md`
74
+ - `specs/changes/<change-id>/tasks.yml`
75
75
 
76
76
  Do not read `specs/archive/` while closing a change. Historical archives are cold data and must not be used as current requirements.
77
77
 
@@ -110,9 +110,9 @@ After contract-reviewer responds:
110
110
  3. Run `cdd-kit validate --contracts` to confirm contract format is preserved
111
111
  4. Run `cdd-kit context-scan` so future classifiers see updated hot context indexes
112
112
  5. Fill in `## Lessons Promoted to Standards` in archive.md with what was promoted, where, and evidence path
113
- 6. Tick `7.2` in tasks.md
113
+ 6. Set task `7.2` to `status: done` in tasks.yml
114
114
 
115
- If there are no lessons to promote, mark `[-]` for 7.2 with rationale.
115
+ If there are no lessons to promote, mark `7.2` as `status: skipped` with rationale.
116
116
 
117
117
  ---
118
118
 
@@ -120,8 +120,8 @@ If there are no lessons to promote, mark `[-]` for 7.2 with rationale.
120
120
 
121
121
  Run: `cdd-kit archive <change-id>`
122
122
 
123
- If successful, tick `7.1` in tasks.md (the file is now in specs/archive/, update it there):
124
- `specs/archive/<year>/<change-id>/tasks.md` — change `7.1` from `[ ]` to `[x]`.
123
+ If successful, set task `7.1` to `status: done` in tasks.yml (the file is now in specs/archive/, update it there):
124
+ `specs/archive/<year>/<change-id>/tasks.yml` — change `7.1` from `status: pending` to `status: done`.
125
125
 
126
126
  ---
127
127
 
@@ -49,16 +49,52 @@ If no description is provided, ask the user: "Please describe the change you wan
49
49
 
50
50
  ---
51
51
 
52
+ ## Step 0: Request quality check (BEFORE classifier)
53
+
54
+ Non-engineers often submit ambiguous requests like "fix the slow report" or
55
+ "make it nicer". These cost a full classifier round-trip when the right move is
56
+ to ask back. Before scaffolding anything, verify the request contains all
57
+ three elements below. Rephrase the request internally in this shape:
58
+
59
+ | Element | Example | Required? |
60
+ |---|---|---|
61
+ | 1. Affected surface | "the order export page", "the JWT login flow" | always |
62
+ | 2. Desired behavior change | "complete in <10s", "support 2FA via TOTP" | always |
63
+ | 3. Observable success criterion | "1000-row export finishes without timeout", "user with 2FA can log in end-to-end" | always |
64
+
65
+ If any element is missing or ambiguous, **STOP. Do NOT call `cdd-kit new` or
66
+ the classifier.** Ask the user back in this exact shape:
67
+
68
+ ```
69
+ Before I start a tracked change, I need to lock down three things:
70
+
71
+ Affected surface: <best guess from request, or empty>
72
+ Desired behavior: <best guess, or empty>
73
+ Success criterion: <empty — please fill>
74
+
75
+ Could you confirm or fill in the missing pieces?
76
+ ```
77
+
78
+ Only proceed to Step 1 once all three are answered or the user explicitly says
79
+ "proceed without success criterion". Record the user's clarifications verbatim
80
+ in `change-request.md` § Original Request.
81
+
82
+ The cost of this step: 1 short message round-trip. The cost of skipping it:
83
+ one full classifier+contract-reviewer cycle, often 5-10× more tokens, plus an
84
+ inevitable re-classification when the agents discover the ambiguity.
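The three-element check in Step 0 can be sketched as a small Python data structure. This is an illustration only: `RequestShape`, `missing_elements`, and `may_proceed` are hypothetical names, and the sketch only detects empty elements. Judging whether a filled element is still ambiguous remains a human or model decision.

```python
from dataclasses import dataclass
from typing import List, Optional

# Phrase taken from Step 0; matching it case-insensitively is an assumption.
OVERRIDE = "proceed without success criterion"


@dataclass
class RequestShape:
    surface: Optional[str] = None    # 1. Affected surface
    behavior: Optional[str] = None   # 2. Desired behavior change
    criterion: Optional[str] = None  # 3. Observable success criterion


def missing_elements(shape: RequestShape) -> List[str]:
    """Names of the elements that are absent or blank."""
    fields = (
        ("surface", shape.surface),
        ("behavior", shape.behavior),
        ("criterion", shape.criterion),
    )
    return [name for name, value in fields if not (value and value.strip())]


def may_proceed(shape: RequestShape, user_reply: str = "") -> bool:
    """True when Step 1 may start under the Step 0 rule."""
    missing = missing_elements(shape)
    if not missing:
        return True
    # Only the success criterion can be waived, and only explicitly.
    return missing == ["criterion"] and OVERRIDE in user_reply.lower()


partial = RequestShape("the order export page", "complete in <10s")
print(missing_elements(partial), may_proceed(partial))
```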
85
+
86
+ ---
87
+
52
88
  ## Write Responsibilities
53
89
 
54
90
  **This distinction is critical — follow it for every step:**
55
91
 
56
- | Agent type | Who writes artifact files | Who writes agent-log | Who ticks tasks.md |
57
- |------------|--------------------------|----------------------|--------------------|
92
+ | Agent type | Who writes artifact files | Who writes agent-log | Who updates tasks.yml |
93
+ |------------|--------------------------|----------------------|----------------------|
58
94
  | Read-only agents (no Edit tool): `change-classifier`, `contract-reviewer`, `qa-reviewer`, `visual-reviewer`, `dependency-security-reviewer`, `ui-ux-reviewer` | YOU (main Claude) | YOU (main Claude) | YOU (main Claude) |
59
95
  | Write-capable agents (have Edit): `backend-engineer`, `frontend-engineer`, `e2e-resilience-engineer`, `monkey-test-engineer`, `stress-soak-engineer`, `ci-cd-gatekeeper`, `test-strategist`, `spec-architect` | The agent itself | The agent itself | YOU (main Claude) |
60
96
 
61
- **Rule**: After EVERY agent completes (whether it writes itself or you write for it), YOU must update the relevant `tasks.md` checkbox(es) from `[ ]` to `[x]`.
97
+ **Rule**: After EVERY agent completes (whether it writes itself or you write for it), YOU must update the relevant `tasks.yml` task `status:` from `pending` to `done`.
62
98
 
63
99
  ---
64
100
 
@@ -70,7 +106,7 @@ Note: `archive.md` is created during `/cdd-close`, not during `/cdd-new` — it
70
106
 
71
107
  If the classifier marks an artifact as `no` or leaves it blank, **do not create the file** — even if a review agent could contribute to it.
72
108
 
73
- The 5 always-required artifacts are: `change-request.md`, `change-classification.md`, `test-plan.md`, `ci-gates.md`, `tasks.md`.
109
+ The 5 always-required artifacts are: `change-request.md`, `change-classification.md`, `test-plan.md`, `ci-gates.md`, `tasks.yml`.
74
110
 
75
111
  ## Step 1: Generate change-id, scaffold, and scan context
76
112
 
@@ -99,176 +135,21 @@ Verify these files exist:
99
135
 
100
136
  Do not use broad search or ad hoc reads to classify the change before `context-scan` has completed.
101
137
 
102
- The generated scaffold contains the artifacts below. Fill `change-request.md` with the user's request before invoking the classifier.
103
-
104
- Update `specs/changes/<change-id>/change-request.md` with the user's description filled in:
105
- ```
106
- # Change Request: <change-id>
107
-
108
- ## Original Request
109
- <user's exact description, verbatim>
110
-
111
- ## Business / User Goal
112
- <infer from the description>
113
-
114
- ## Non-goals
115
-
116
- ## Constraints
117
-
118
- ## Known Context
138
+ The generated scaffold contains the artifacts listed in the table below. **All
139
+ templates are written from disk by `cdd-kit new` — do not paste template bodies
140
+ into this prompt.** The on-disk source of truth lives in `specs/templates/` of
141
+ the kit and is bundled into every install.
119
142
 
120
- ## Open Questions
121
-
122
- ## Requested Delivery Date / Priority
123
- as soon as possible
124
- ```
125
-
126
- `specs/changes/<change-id>/change-classification.md` starts from this blank template:
127
- ```
128
- # Change Classification
129
-
130
- ## Change Types
131
- - primary:
132
- - secondary:
133
-
134
- ## Risk Level
135
- - low / medium / high / critical
136
-
137
- ## Impact Radius
138
- - isolated / module-level / cross-module / system-wide
139
-
140
- ## Tier
141
- - 0 / 1 / 2 / 3 / 4 / 5
142
-
143
- ## Architecture Review Required
144
- - yes / no
145
- - reason: (fill only if yes)
146
-
147
- ## Required Artifacts
148
- Always required: change-request.md, change-classification.md, test-plan.md, ci-gates.md, tasks.md
149
-
150
- ## Optional Artifacts (default: no — set yes only with explicit reason)
151
- | artifact | create? | reason |
143
+ | File | Source | Your job |
152
144
  |---|---|---|
153
- | current-behavior.md | no | |
154
- | proposal.md | no | |
155
- | spec.md | no | |
156
- | design.md | no | |
157
- | qa-report.md | no | |
158
- | regression-report.md | no | |
159
-
160
- ## Required Contracts
161
- - API:
162
- - CSS/UI:
163
- - Env:
164
- - Data shape:
165
- - Business logic:
166
- - CI/CD:
167
-
168
- ## Required Test Families
169
- - unit:
170
- - contract:
171
- - integration:
172
- - E2E:
173
- - visual:
174
- - data-boundary:
175
- - resilience:
176
- - fuzz/monkey:
177
- - stress:
178
- - soak:
179
-
180
- ## Required Agents
181
-
182
- ## Assumptions / Clarifications
183
- ```
184
-
185
- `specs/changes/<change-id>/test-plan.md` starts from this blank template:
186
- ```
187
- # Test Plan: <change-id>
188
-
189
- ## Acceptance Criteria → Test Mapping
190
- | criterion id | test family | test file path | tier |
191
- |---|---|---|---|
192
-
193
- ## Test Families Required
194
- | family | tier | notes |
195
- |---|---|---|
196
- | (unit / contract / integration / e2e / data-boundary / resilience / monkey / stress / soak) | | |
197
-
198
- ## Out of Scope
145
+ | `change-request.md` | `specs/templates/change-request.md` | Fill the `## Original Request` section with the user's exact description before invoking the classifier; leave the rest blank |
146
+ | `change-classification.md` | `specs/templates/change-classification.md` | Replace blank template with classifier output (Step 2) |
147
+ | `test-plan.md` | `specs/templates/test-plan.md` | `test-strategist` writes this directly |
148
+ | `ci-gates.md` | `specs/templates/ci-gates.md` | `ci-cd-gatekeeper` writes this directly |
149
+ | `tasks.yml` | `specs/templates/tasks.yml` | Set each task's `status:` as agents complete; backfill `tier:` frontmatter from classifier (Step 2.4) |
150
+ | `context-manifest.md` | `specs/templates/context-manifest.md` | Replace from classifier `## Context Manifest Draft` (Step 2) |
199
151
 
200
- ## Notes
201
- (Keep under 10 lines. Implementation detail belongs in the test files themselves.)
202
- ```
203
-
204
- `specs/changes/<change-id>/ci-gates.md` starts from this blank template:
205
- ```
206
- # CI Gates: <change-id>
207
-
208
- ## Required Gates (block merge if failing)
209
-
210
- ## Informational Gates (report only)
211
-
212
- ## Nightly / Weekly / Manual Gates
213
-
214
- ## Promotion Policy
215
- ```
216
-
217
- `specs/changes/<change-id>/tasks.md` starts with ALL checkboxes unchecked:
218
- ```
219
- ---
220
- change-id: <change-id>
221
- status: in-progress
222
- context-governance: v1
223
- depends-on: []
224
- ---
225
-
226
- <!-- [x]=done [-]=N/A [ ]=pending -->
227
-
228
- # Tasks: <change-id>
229
-
230
- ## 1. Preparation
231
- - [ ] 1.1 Confirm classification and required artifacts
232
- - [ ] 1.2 Confirm contracts to update
233
- - [ ] 1.3 Confirm CI/CD gate plan
234
-
235
- ## 2. Contract Updates
236
- - [ ] 2.1 API contract
237
- - [ ] 2.2 CSS/UI contract
238
- - [ ] 2.3 Env contract
239
- - [ ] 2.4 Data shape contract
240
- - [ ] 2.5 Business logic contract
241
- - [ ] 2.6 CI/CD contract
242
-
243
- ## 3. Tests First
244
- - [ ] 3.1 Unit/contract tests
245
- - [ ] 3.2 Integration tests
246
- - [ ] 3.3 E2E/resilience tests
247
- - [ ] 3.4 Data-boundary/monkey tests
248
- - [ ] 3.5 Stress/soak tests if required
249
-
250
- ## 4. Implementation
251
- - [ ] 4.1 Backend
252
- - [ ] 4.2 Frontend
253
- - [ ] 4.3 Env/deploy
254
- - [ ] 4.4 CI/CD workflows
255
-
256
- ## 5. Review
257
- - [ ] 5.1 UI/UX review
258
- - [ ] 5.2 Visual review
259
- - [ ] 5.3 Contract review
260
- - [ ] 5.4 QA review
261
-
262
- ## 6. Verification
263
- - [ ] 6.1 Local gates
264
- - [ ] 6.2 PR required gates
265
- - [ ] 6.3 Informational gates
266
- - [ ] 6.4 Nightly/weekly/manual gates if required
267
-
268
- ## 7. Archive
269
- - [ ] 7.1 Archive change
270
- - [ ] 7.2 Promote durable learnings to contracts or CLAUDE.md
271
- ```
152
+ If `cdd-kit new` reports a missing template, run `cdd-kit upgrade --yes`.
272
153
 
273
154
  ---
274
155
 
@@ -290,16 +171,50 @@ The classifier must include a `## Context Manifest Draft` section with:
290
171
  - required tests
291
172
  - any context expansion requests that must be approved before implementation
292
173
 
293
- **change-classifier is read-only** — it will return its output as text. After it responds:
174
+ **change-classifier is read-only** — it will return its output as text.
175
+
176
+ ### If the classifier returns `## Atomic Split Proposal`
177
+
178
+ The classifier has decided this request is too big for a single change. Do
179
+ NOT proceed with the rest of `/cdd-new`. Instead:
180
+
181
+ 1. Show the user the full `## Atomic Split Proposal` table verbatim.
182
+ 2. Ask: "Run these as separate changes (recommended), or force a single
183
+ monolithic change?"
184
+ 3. If user picks "separate":
185
+ - For each row in the proposal table, run `cdd-kit new <change-id>` with
186
+ the listed `--depends-on`.
187
+ - Then say: "I created N change directories. Want me to run `/cdd-new`
188
+ against the first one now?" — wait for confirmation; do not auto-loop.
189
+ 4. If user picks "force monolithic":
190
+ - Re-invoke change-classifier with `force-monolithic` appended to the
191
+ change-request and proceed with whatever Tier the classifier returns.
192
+ 5. Delete the partially-scaffolded change directory you created in Step 1
193
+ if the user picked "separate" and the originally-derived change-id is
194
+ not in the proposal — it would otherwise sit empty and confuse `cdd-kit
195
+ list`.
196
+
197
+ ### Classifier output lint (B8): refuse stub responses
198
+
199
+ Before writing any files, verify the classifier response contains:
200
+
201
+ - `## Tier` followed by `- N` where N is a single digit 0-5 (NOT `0 / 1 / 2 / 3 / 4 / 5` — that is the unfilled placeholder).
202
+ - `## Required Agents` with at least one agent name.
203
+ - `## Inferred Acceptance Criteria` with at least one filled `AC-1: …` line.
204
+
205
+ If any of these are missing or still hold the literal placeholder text, STOP. Re-prompt the classifier with the missing pieces named explicitly. Do NOT write `change-classification.md`; the gate will reject it as a stub anyway and you will have wasted the round-trip.
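The three lint conditions above can be sketched in Python as follows. This is not the kit's own implementation: `section_body` and `lint_classifier_output` are hypothetical helpers, and two details are assumptions, namely that any non-blank line under `## Required Agents` counts as an agent name, and that `…` or `...` after `AC-N:` is the unfilled placeholder.

```python
import re


def section_body(text, heading):
    """Lines under a `## heading`, up to the next `## ` heading."""
    body, inside = [], False
    for line in text.splitlines():
        if line.startswith("## "):
            inside = line.strip() == f"## {heading}"
            continue
        if inside:
            body.append(line)
    return body


def lint_classifier_output(text):
    """Return the names of sections that are missing or still stubs."""
    problems = []

    # `## Tier` must hold `- N` with a single digit 0-5, not the placeholder.
    tier_lines = [l.strip() for l in section_body(text, "Tier") if l.strip()]
    if not any(re.fullmatch(r"- [0-5]", l) for l in tier_lines):
        problems.append("Tier")

    # Assumption: any non-blank line counts as an agent name.
    if not any(l.strip() for l in section_body(text, "Required Agents")):
        problems.append("Required Agents")

    # At least one `AC-N:` line with real text after the colon.
    filled = False
    for line in section_body(text, "Inferred Acceptance Criteria"):
        match = re.match(r"\s*-?\s*AC-\d+:\s*(.*)", line)
        if match and match.group(1).strip() not in ("", "…", "..."):
            filled = True
    if not filled:
        problems.append("Inferred Acceptance Criteria")

    return problems


good = (
    "## Tier\n- 2\n\n## Required Agents\n- contract-reviewer\n\n"
    "## Inferred Acceptance Criteria\n- AC-1: 1000-row export finishes without timeout\n"
)
stub = (
    "## Tier\n- 0 / 1 / 2 / 3 / 4 / 5\n\n## Required Agents\n\n"
    "## Inferred Acceptance Criteria\n- AC-1: …\n"
)
print(lint_classifier_output(good), lint_classifier_output(stub))
```

Returning the failing section names, rather than a boolean, lets the re-prompt name the missing pieces explicitly, as the step requires.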
206
+
207
+ ### When the classifier output passes lint
294
208
 
295
209
  1. **YOU write** `specs/changes/<change-id>/change-classification.md` — replace the blank template with the classifier's classification output.
296
- 2. **YOU write** `specs/changes/<change-id>/agent-log/change-classifier.md` — copy the Agent Log block from the classifier's response.
210
+ 2. **YOU write** `specs/changes/<change-id>/agent-log/change-classifier.yml` — copy the Agent Log block from the classifier's response.
297
211
  3. **YOU update** `specs/changes/<change-id>/context-manifest.md` from the classifier's `## Context Manifest Draft`.
298
- 4. **YOU tick** `tasks.md` item `1.1`.
212
+ 4. **YOU update** `tasks.yml` frontmatter: set `tier: <N>` to the classifier's tier digit. This is now the authoritative source for `cdd-kit gate` tier-based agent enforcement (the classification.md `## Tier` section is fallback only).
213
+ 5. **YOU tick** `tasks.yml` item `1.1`.
299
214
 
300
- Wait until these four writes are done before continuing.
215
+ Wait until these five writes are done before continuing.
301
216
 
302
- **After writing change-classification.md**: read the classifier's `## Tasks Not Applicable` list. For each listed task ID (e.g., `2.2`, `4.2`), update `tasks.md` to change that item from `[ ]` to `[-]`. Do this before invoking any other agent.
217
+ **After writing change-classification.md**: read the classifier's `## Tasks Not Applicable` list. For each listed task ID (e.g., `2.2`, `4.2`), update `tasks.yml` to change that item's `status:` from `pending` to `skipped`. Do this before invoking any other agent.
303
218
 
304
219
  ---
305
220
 
@@ -307,9 +222,9 @@ Wait until these four writes are done before continuing.
307
222
 
308
223
  Read `change-classification.md` to determine the tier. Then invoke agents **in the exact order below**.
309
224
 
310
- **For each read-only agent**: wait for its text response → YOU write its artifact file(s) → YOU write its agent-log → YOU tick relevant tasks.md item(s).
225
+ **For each read-only agent**: wait for its text response → YOU write its artifact file(s) → YOU write its agent-log → YOU tick relevant tasks.yml item(s).
311
226
 
312
- **For each write-capable agent**: wait for it to confirm completion → YOU tick relevant tasks.md item(s).
227
+ **For each write-capable agent**: wait for it to confirm completion → YOU tick relevant tasks.yml item(s).
313
228
 
314
229
  If any agent sets `status: blocked` in its log, halt immediately and report the agent's `next-action` to the user — do not proceed to subsequent agents.
315
230
 
@@ -320,16 +235,68 @@ Change directory: specs/changes/<change-id>/
320
235
  ```
321
236
  This ensures the agent's Read scope restriction points to the correct directory.
322
237
 
238
+ ### Agent stage badges (UI v1)
239
+
240
+ When you announce that you are about to invoke an agent, prefix the
241
+ announcement with the matching emoji + role tag from the table below. This
242
+ helps a non-engineer scanning the chat stream tell what stage they are in
243
+ without reading the full prompt. Use the badges only in your own narration to
244
+ the user; do not put them inside the prompt sent to the agent.
245
+
246
+ | Stage | Agent | Badge |
247
+ |---|---|---|
248
+ | Decision | `change-classifier` | 🟣 `[classifier]` |
249
+ | Decision | `spec-architect` | 🟣 `[architect]` |
250
+ | Implementation | `backend-engineer` | 🔵 `[backend]` |
251
+ | Implementation | `frontend-engineer` | 🔵 `[frontend]` |
252
+ | Implementation | `ci-cd-gatekeeper` | 🔵 `[ci-cd]` |
253
+ | Implementation | `test-strategist` | 🟡 `[test-plan]` |
254
+ | Heavy testing (Tier 0–1 only) | `e2e-resilience-engineer` | 🟠 `[e2e]` |
255
+ | Heavy testing (Tier 0–1 only) | `monkey-test-engineer` | 🟠 `[monkey]` |
256
+ | Heavy testing (Tier 0–1 only) | `stress-soak-engineer` | 🟠 `[stress]` |
257
+ | Review | `contract-reviewer` | 🟢 `[contracts]` |
258
+ | Review | `qa-reviewer` | 🟢 `[qa]` |
259
+ | Review | `ui-ux-reviewer` | 🟢 `[ui-ux]` |
260
+ | Review | `visual-reviewer` | 🟢 `[visual]` |
261
+ | Review | `dependency-security-reviewer` | 🟢 `[deps-sec]` |
262
+ | Audit | `spec-drift-auditor` | ⚫ `[drift]` |
263
+ | Audit | `repo-context-scanner` | ⚫ `[repo-scan]` |
264
+
265
+ Color semantics:
266
+ - 🟣 purple: deciding what we will do (heavy model, opus-class)
267
+ - 🔵 blue: writing code (sonnet-class implementation)
268
+ - 🟡 yellow: planning tests (sonnet-class)
269
+ - 🟠 orange: heavy testing — only appears for Tier 0–1, signals high-risk scope
270
+ - 🟢 green: reviewing what was done (no code writes; just verdicts)
271
+ - ⚫ neutral: audits and scans (read-only background work)
272
+
273
+ Format: emoji is followed by a single space, then the bracket-tag, then the
274
+ human-readable narration.
275
+
276
+ Examples:
277
+
278
+ ```
279
+ 🟣 [classifier] Reading the request and project map…
280
+ 🟢 [contracts] Confirming the API contract is unchanged. (read-only)
281
+ 🔵 [backend] Implementing the JWT issuance endpoint and writing failing
282
+ tests first per TDD policy.
283
+ 🟠 [stress] Tier 1 high-risk change — running soak test for 30 min.
284
+ ```
285
+
286
+ These badges are pure narration. They MUST NOT be sent inside the agent's
287
+ prompt; the agent's behavior is defined by the agent prompt files in
288
+ `.claude/agents/<name>.md`, not by this badge.
289
+
323
290
  ---
324
291
 
325
292
  ### Tier 4–5 (low risk: docs, prompts, config-only, no behavior change)
326
293
 
327
294
  1. **`contract-reviewer`** (read-only) — confirm no contracts are touched or all touched ones are already updated.
328
- - YOU write: `agent-log/contract-reviewer.md`
295
+ - YOU write: `agent-log/contract-reviewer.yml`
329
296
  - YOU tick: `1.2`, applicable items in section 2
330
297
 
331
298
  2. **`qa-reviewer`** (read-only) — confirm release readiness.
332
- - YOU write: `agent-log/qa-reviewer.md`
299
+ - YOU write: `agent-log/qa-reviewer.yml`
333
300
  - YOU tick: `5.4`
334
301
 
335
302
  ---
@@ -337,7 +304,7 @@ This ensures the agent's Read scope restriction points to the correct directory.
337
304
  ### Tier 2–3 (normal: feature, enhancement, bug fix with behavior change)
338
305
 
339
306
  1. **`contract-reviewer`** (read-only) — update or create contracts in `contracts/` before any implementation starts.
340
- - YOU write: `agent-log/contract-reviewer.md`
307
+ - YOU write: `agent-log/contract-reviewer.yml`
341
308
  - YOU tick: `1.2`, applicable items in section 2
342
309
 
343
310
  2. **`test-strategist`** (write-capable) — writes `specs/changes/<change-id>/test-plan.md` directly.
@@ -349,31 +316,31 @@ This ensures the agent's Read scope restriction points to the correct directory.
349
316
 
350
317
  4. **`backend-engineer`** (write-capable) — if the change touches server, API, data, or business logic. Writes implementation and its own agent-log.
351
318
  - YOU tick: `4.1` and/or `4.3` based on scope
352
- - Note: `tasks.md` items 3.1–3.2 (unit/contract/integration tests) are written by `backend-engineer` and/or `frontend-engineer` in TDD fashion — failing tests first, implementation second. Items 3.3–3.5 are written by dedicated test engineers (Tier 0–1 only or when classifier explicitly requires them).
319
+ - Note: `tasks.yml` items 3.1–3.2 (unit/contract/integration tests) are written by `backend-engineer` and/or `frontend-engineer` in TDD fashion — failing tests first, implementation second. Items 3.3–3.5 are written by dedicated test engineers (Tier 0–1 only or when classifier explicitly requires them).
353
320
 
354
321
  5. **`frontend-engineer`** (write-capable) — if the change touches UI, components, or client-side behavior. Writes implementation and its own agent-log.
355
322
  - YOU tick: `4.2`
356
323
 
357
324
  6. **`dependency-security-reviewer`** (read-only) — if the change touches lockfiles, package manifests, or DB migrations.
358
325
  - **Only invoke if** `change-classification.md` lists lockfiles, package manifests, or DB migrations as affected.
359
- - YOU write: `agent-log/dependency-security-reviewer.md`
326
+ - YOU write: `agent-log/dependency-security-reviewer.yml`
360
327
  - YOU tick: applicable security-related items
361
328
 
362
329
  7. **`ui-ux-reviewer`** (read-only) — if any UI change (run alongside or after frontend-engineer).
363
330
  - **Only invoke if** classifier marks UI/CSS as affected.
364
- - YOU write: `agent-log/ui-ux-reviewer.md`
331
+ - YOU write: `agent-log/ui-ux-reviewer.yml`
365
332
  - YOU tick: `5.1`
366
333
 
367
334
  8. **`visual-reviewer`** (read-only) — if any UI change (run after ui-ux-reviewer).
368
335
  - **Only invoke if** classifier marks UI/CSS as affected.
369
- - YOU write: `agent-log/visual-reviewer.md`
336
+ - YOU write: `agent-log/visual-reviewer.yml`
370
337
  - YOU tick: `5.2`
371
338
 
372
339
  9. **`ci-cd-gatekeeper`** (write-capable) — writes `specs/changes/<change-id>/ci-gates.md` directly.
373
340
  - YOU tick: `1.3`, `4.4`, applicable items in section 6
374
341
 
375
342
  10. **`qa-reviewer`** (read-only) — release readiness decision (always last).
376
- - YOU write: `agent-log/qa-reviewer.md`
343
+ - YOU write: `agent-log/qa-reviewer.yml`
377
344
  - YOU tick: `5.4`
378
345
 
379
346
  ---
@@ -404,24 +371,60 @@ All agents from Tier 2–3, plus insert these after `frontend-engineer` / `backe
404
371
 
405
372
  ## Step 4: Run the gate
406
373
 
407
- After all required agents have completed and all tasks.md items for their sections are ticked:
374
+ After all required agents have completed and all tasks.yml items for their sections are ticked:
408
375
 
409
376
  ```
410
377
  cdd-kit gate <change-id>
411
378
  ```
412
379
 
413
380
  **If gate passes**:
414
- - YOU tick: `tasks.md` item `6.1`
381
+ - YOU tick: `tasks.yml` item `6.1`
415
382
  - Proceed to Step 5.
416
383
 
417
- **If gate fails**:
418
- 1. Read the gate error output carefully
419
- 2. Identify which artifact is missing, stub, or invalid
420
- 3. Re-invoke the specific agent responsible for that artifact with the exact fix required
421
- 4. Re-run `cdd-kit gate <change-id>`
422
- 5. Repeat until gate passes (max 3 iterations; if still failing after 3, report to user)
384
+ **If gate fails — structured fix-back routing**:
385
+
386
+ Capture gate's full stderr verbatim. Parse error lines and route each to the
387
+ right owner. The patterns below are exhaustive every gate error message
388
+ matches one of them.
389
+
390
+ | Error pattern | Route to | Re-invocation prompt seed |
391
+ |---|---|---|
392
+ | `agent-log/<name>.yml: …` | the named agent | "PREVIOUS GATE FAILURE FOR THIS AGENT: <full error line>. Fix only your `agent-log/<name>.yml`. Re-output your Agent Log block." |
393
+ | `change-classification.md: …` | `change-classifier` | "PREVIOUS CLASSIFICATION FAILED GATE: <error>. Re-emit only the failing section." |
394
+ | `context-manifest.md: …` | `change-classifier` | "PREVIOUS MANIFEST FAILED GATE: <error>. Re-emit `## Context Manifest Draft`." |
395
+ | `tasks.yml: …` (frontmatter / pending) | YOU (main Claude) — direct edit | n/a — fix `tasks.yml` yourself. Don't re-invoke an agent for a file you own. |
396
+ | `Tier <N> change requires agent-log/<X>.yml` | invoke the missing agent `<X>` | "TIER <N> REQUIRES THIS LOG. Run your full work, not just the log." |
397
+ | `dependency <id>: upstream change is not completed` | n/a — STOP | Tell user: "Upstream change `<id>` must complete before this change can gate. Run `/cdd-new <id>` first or run `cdd-kit archive <id>` if it's already done." |
398
+ | `validators returned non-zero` | `contract-reviewer` | "PREVIOUS CONTRACT VALIDATION FAILED: <last 10 lines of validator stderr>. Reconcile contracts." |
399
+
400
+ **Re-invocation prompt template** (always use this exact prefix when re-invoking an agent for fix-back):
401
+
402
+ ```
403
+ CURRENT_CHANGE_ID: <change-id>
404
+ Change directory: specs/changes/<change-id>/
405
+
406
+ PREVIOUS GATE FAILURE FOR THIS AGENT (re-invocation):
407
+ <the exact gate error line(s) tied to this agent>
408
+
409
+ FIX TARGET:
410
+ <the specific file or section that needs to change>
411
+
412
+ REFERENCES:
413
+ - references/agent-log-protocol.md (log format)
414
+ - references/<agent-specific-standard>.md (if applicable)
415
+
416
+ Fix this exact issue without re-doing your prior work. Re-output only the
417
+ section that changed plus your updated Agent Log block.
418
+ ```
419
+
420
+ After re-invoking, re-run `cdd-kit gate <change-id>`. Repeat up to **3 times**. Each
421
+ iteration must be on a strictly smaller error set — if the same error returns
422
+ twice, halt and surface to user (an agent stuck in a loop is more expensive
423
+ than a human read).
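The routing table and retry rule above can be sketched in Python. The regular expressions mirror the table's patterns, but the exact text of real gate errors is not shown in this diff, so the sample lines are invented for illustration. `route`, `group_by_owner`, and `should_halt` are hypothetical names, and reading "strictly smaller error set" as a proper subset is one interpretation.

```python
import re

# Ordered most-specific first; the owner labels "main" and "STOP" are
# shorthand for "YOU (main Claude)" and "stop and tell the user".
ROUTES = [
    (re.compile(r"Tier \d change requires agent-log/([\w-]+)\.yml"), lambda m: m.group(1)),
    (re.compile(r"agent-log/([\w-]+)\.yml:"), lambda m: m.group(1)),
    (re.compile(r"change-classification\.md:"), lambda m: "change-classifier"),
    (re.compile(r"context-manifest\.md:"), lambda m: "change-classifier"),
    (re.compile(r"tasks\.yml:"), lambda m: "main"),
    (re.compile(r"dependency .+: upstream change is not completed"), lambda m: "STOP"),
    (re.compile(r"validators returned non-zero"), lambda m: "contract-reviewer"),
]


def route(error_line):
    """Return the owner for one gate error line, or 'unrouted'."""
    for pattern, owner in ROUTES:
        match = pattern.search(error_line)
        if match:
            return owner(match)
    return "unrouted"


def group_by_owner(stderr_text):
    """Group non-blank stderr lines by responsible owner."""
    grouped = {}
    for line in stderr_text.splitlines():
        line = line.strip()
        if line:
            grouped.setdefault(route(line), []).append(line)
    return grouped


def should_halt(previous_errors, current_errors, attempt, max_attempts=3):
    """Halt when attempts run out or the error set did not strictly shrink."""
    shrank = set(current_errors) < set(previous_errors)
    return attempt >= max_attempts or not shrank


# Invented sample lines that follow the table's patterns.
sample_stderr = "\n".join([
    "agent-log/qa-reviewer.yml: missing status",
    "Tier 1 change requires agent-log/stress-soak-engineer.yml",
    "tasks.yml: frontmatter missing tier",
    "validators returned non-zero",
])
grouped = group_by_owner(sample_stderr)
print(sorted(grouped))
```

The `unrouted` bucket exists because the sketch cannot verify the claim that the table is exhaustive; anything landing there would be surfaced to the user rather than guessed at.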
423
424
 
424
- **Terminal state after 3 failures**: Add a line at the top of `tasks.md` reading `status: gate-blocked` and report all blocking items to the user. The change is paused — do not proceed to Step 5.
425
+ **Terminal state after 3 failures**: Update `tasks.yml` frontmatter with
426
+ `status: gate-blocked` and report all remaining errors to the user, grouped
427
+ by responsible agent, so they know who to manually direct next.
425
428
 
426
429
  ---
427
430
 
@@ -438,7 +441,7 @@ Agents invoked: <list in order>
438
441
  Gate: PASSED
439
442
 
440
443
  Tasks completed:
441
- - [x] all applicable items checked in specs/changes/<change-id>/tasks.md
444
+ - [x] all applicable items have status: done in specs/changes/<change-id>/tasks.yml
442
445
 
443
446
  All artifacts written to: specs/changes/<change-id>/
444
447
 
@@ -470,8 +473,8 @@ Please review the above items and re-run: cdd-kit gate <change-id>
470
473
  - Never start implementation (backend/frontend-engineer) before `contract-reviewer` has completed for Tier 0–3 changes
471
474
  - Never skip `test-plan.md` for Tier 0–3 changes
472
475
  - Never skip `ci-gates.md` for any implementation change
473
- - Every agent must have its `agent-log/<name>.md` written — YOU write it for read-only agents after receiving their response; write-capable agents write their own
474
- - Tick the relevant `tasks.md` checkbox immediately after each agent completes — do not batch
476
+ - Every agent must have its `agent-log/<name>.yml` written — YOU write it for read-only agents after receiving their response; write-capable agents write their own
477
+ - Update the relevant `tasks.yml` task `status:` immediately after each agent completes; do not batch
475
478
  - `qa-reviewer` always runs last and makes the release-readiness decision
476
479
 
477
480
  ---