codex-workflows 0.6.5 → 0.6.7
- package/.agents/skills/documentation-criteria/references/plan-template.md +27 -0
- package/.agents/skills/documentation-criteria/references/task-template.md +12 -0
- package/.agents/skills/recipe-build/SKILL.md +20 -6
- package/.agents/skills/recipe-front-build/SKILL.md +20 -6
- package/.agents/skills/recipe-front-plan/SKILL.md +12 -0
- package/.agents/skills/recipe-front-review/SKILL.md +2 -1
- package/.agents/skills/recipe-fullstack-build/SKILL.md +20 -6
- package/.agents/skills/recipe-implement/SKILL.md +1 -1
- package/.agents/skills/recipe-plan/SKILL.md +14 -1
- package/.agents/skills/recipe-review/SKILL.md +2 -1
- package/.agents/skills/subagents-orchestration-guide/SKILL.md +14 -3
- package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md +12 -4
- package/.codex/agents/acceptance-test-generator.toml +25 -5
- package/.codex/agents/code-reviewer.toml +5 -1
- package/.codex/agents/document-reviewer.toml +18 -1
- package/.codex/agents/integration-test-reviewer.toml +14 -2
- package/.codex/agents/task-decomposer.toml +16 -0
- package/.codex/agents/task-executor-frontend.toml +12 -15
- package/.codex/agents/task-executor.toml +12 -15
- package/.codex/agents/technical-designer-frontend.toml +10 -81
- package/.codex/agents/technical-designer.toml +10 -94
- package/.codex/agents/work-planner.toml +28 -1
- package/package.json +1 -1
|
@@ -5,8 +5,16 @@ Type: feature|fix|refactor
|
|
|
5
5
|
Estimated Duration: X days
|
|
6
6
|
Estimated Impact: X files
|
|
7
7
|
Related Issue/PR: #XXX (if any)
|
|
8
|
+
Review Scope: [planned-files scope derived from Design Doc and task targets; for a revision plan over existing work, base branch + diff range]
|
|
8
9
|
Implementation Readiness: pending
|
|
9
10
|
|
|
11
|
+
## WorkPlan Review
|
|
12
|
+
|
|
13
|
+
This section records the review gate state for the exact plan content. Set `Status: pending` when the plan is created or materially updated. The orchestrator treats only `Status: approved` with `Conditions: none` as reviewed.
|
|
14
|
+
|
|
15
|
+
- **Status**: pending|approved
|
|
16
|
+
- **Conditions**: none
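
The gate described in the WorkPlan Review section can be illustrated with a minimal sketch. It assumes the plan is a Markdown string whose section uses exactly the `## WorkPlan Review` heading and the `- **Status**:` / `- **Conditions**:` bullets from the template. The function name `is_plan_reviewed` is hypothetical and not part of the package.

```python
import re


def is_plan_reviewed(plan_markdown: str) -> bool:
    """Return True only for `Status: approved` together with `Conditions: none`."""
    # Isolate the section body, up to the next level-2 heading or end of text.
    section = re.search(
        r"^## WorkPlan Review[ \t]*$(.*?)(?=^## |\Z)",
        plan_markdown,
        re.M | re.S,
    )
    if section is None:
        return False  # an absent section counts as unreviewed

    # Collect the two bullet fields defined by the template.
    fields = dict(
        re.findall(
            r"^- \*\*(Status|Conditions)\*\*:[ \t]*(.+?)[ \t]*$",
            section.group(1),
            re.M,
        )
    )
    return fields.get("Status") == "approved" and fields.get("Conditions") == "none"
```

Under these assumptions, the unfilled template value `pending|approved` evaluates as unreviewed, which matches the rule that only an exact `approved` with `none` passes the gate.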
|
|
17
|
+
|
|
10
18
|
## Related Documents
|
|
11
19
|
- Design Doc(s):
|
|
12
20
|
- [docs/design/XXX.md]
|
|
@@ -35,6 +43,10 @@ Repeat this block for each Design Doc when multiple Design Docs exist. Preserve
|
|
|
35
43
|
- **Success criteria**: [extracted from Design Doc]
|
|
36
44
|
- **Failure response**: [extracted from Design Doc]
|
|
37
45
|
|
|
46
|
+
### Proof Strategy
|
|
47
|
+
- **Proof obligation source**: [test skeleton annotations (`Primary failure mode`, `Proof obligation`) when skeletons exist; otherwise each acceptance criterion's primary failure mode derived from the Design Doc]
|
|
48
|
+
- **Per-task propagation**: every task that implements or verifies a claim records the AC ID or claim identifier in Proof Obligations (see task template) so downstream review can judge whether tests prove the claim, not merely run
|
|
49
|
+
|
|
38
50
|
## Quality Assurance Mechanisms (from Design Docs)
|
|
39
51
|
|
|
40
52
|
Adopted quality gates for the change area. Each task in this plan must satisfy the applicable mechanisms.
|
|
@@ -69,6 +81,21 @@ Map each Design Doc technical requirement to the task or phase that covers it. U
|
|
|
69
81
|
- Merge duplicate restatements of the same obligation from multiple DD sections into one row and cite the primary section in `DD Section`
|
|
70
82
|
- Keep `scope-boundary` rows concrete: name the protected file group, component boundary, contract, or workflow that must remain unchanged
|
|
71
83
|
|
|
84
|
+
## Failure Mode Checklist
|
|
85
|
+
|
|
86
|
+
Domain-independent failure categories this implementation must guard against. Enumerate all eight categories, mark which apply, and list a covering task for each that applies; keep category names generic and place project-specific detail in task descriptions or notes.
|
|
87
|
+
|
|
88
|
+
| Category | Applies? | Covered By Task(s) |
|
|
89
|
+
|----------|----------|--------------------|
|
|
90
|
+
| same-value | yes/no | [P1-T1] |
|
|
91
|
+
| no-op | yes/no | |
|
|
92
|
+
| empty input | yes/no | |
|
|
93
|
+
| invalid option | yes/no | |
|
|
94
|
+
| missing config | yes/no | |
|
|
95
|
+
| unavailable boundary | yes/no | |
|
|
96
|
+
| shared-state dependency | yes/no | |
|
|
97
|
+
| rollback-only visibility | yes/no | |
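
One way to check the Failure Mode Checklist mechanically is sketched below. It assumes the table keeps the three-column layout and the eight category names shown above. The helper name `check_failure_mode_table` and the wording of its messages are illustrative only.

```python
REQUIRED_CATEGORIES = [
    "same-value", "no-op", "empty input", "invalid option", "missing config",
    "unavailable boundary", "shared-state dependency", "rollback-only visibility",
]


def check_failure_mode_table(table_markdown: str) -> list[str]:
    """Return a list of problems; an empty list means the checklist is complete."""
    rows = {}
    for line in table_markdown.splitlines():
        # Drop the outer pipes, then split the row into its cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 3 and cells[0] in REQUIRED_CATEGORIES:
            rows[cells[0]] = (cells[1].lower(), cells[2])

    problems = []
    for category in REQUIRED_CATEGORIES:
        if category not in rows:
            problems.append(f"missing category: {category}")
            continue
        applies, tasks = rows[category]
        if applies not in ("yes", "no"):
            problems.append(f"{category}: 'Applies?' must be yes or no")
        elif applies == "yes" and not tasks:
            problems.append(f"{category}: applies but no covering task listed")
    return problems
```

The unfilled template placeholder `yes/no` is reported as a problem, so a plan passes only after every category has been decided.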
|
|
98
|
+
|
|
72
99
|
## UI Spec Component -> Task Mapping
|
|
73
100
|
|
|
74
101
|
Include this section when a UI Spec is among the inputs. Map each UI component section to the task(s) that implement it so task-decomposer can pass the exact UI Spec context to executor tasks. Omit this section when no UI Spec exists.
|
|
@@ -58,9 +58,21 @@ Brief observations recorded after reading Investigation Targets:
|
|
|
58
58
|
- **Failure response**: [What to do if verification fails]
|
|
59
59
|
- **Verification level**: [L1 unit/local verification, L2 integration verification, or L3 end-to-end verification]
|
|
60
60
|
|
|
61
|
+
## Proof Obligations
|
|
62
|
+
(Include one entry per acceptance criterion, user journey, boundary, or state transition this task implements or verifies. Derive from test skeleton annotations when present; otherwise derive from the acceptance criterion's primary failure mode.)
|
|
63
|
+
- **AC / Claim ID**: [AC-XXX, user journey identifier, boundary identifier, or task claim identifier]
|
|
64
|
+
- **Claim**: [behavior the acceptance criterion or task promises]
|
|
65
|
+
- **Primary failure mode**: [regression the test should turn red on]
|
|
66
|
+
- **Boundary to exercise**: [public/integration/browser/process/service/persistence boundary, or "in-process unit"]
|
|
67
|
+
- **State assertion**: [observable state before -> action -> after for state-changing claims; "N/A" otherwise]
|
|
68
|
+
- **Mock boundary rationale**: [which external boundaries may be mocked and why; "none" when all real]
|
|
69
|
+
- **Residual**: [what this task-level proof leaves unestablished, and which later task or phase closes it]
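
The Proof Obligations entry can be modeled as a small record, which makes the matching completion criterion easy to evaluate. This is a sketch under stated assumptions: the class, its field names, and `unmet_obligations` are hypothetical, and the two evidence sets stand in for whatever a reviewer or tool has actually confirmed.

```python
from dataclasses import dataclass


@dataclass
class ProofObligation:
    claim_id: str                          # AC / Claim ID
    claim: str                             # behavior the criterion or task promises
    primary_failure_mode: str              # regression the test should turn red on
    boundary: str                          # boundary to exercise, or "in-process unit"
    state_assertion: str = "N/A"           # before -> action -> after, when state changes
    mock_boundary_rationale: str = "none"  # "none" when all boundaries are real
    residual: str = ""                     # what stays unproven, and who closes it


def unmet_obligations(obligations, red_under_failure, boundary_exercised):
    """Return claim IDs that lack either kind of confirmation.

    Both evidence arguments are sets of claim IDs: tests confirmed to turn red
    under the primary failure mode, and tests confirmed to exercise the stated
    boundary.
    """
    return [
        o.claim_id
        for o in obligations
        if o.claim_id not in red_under_failure or o.claim_id not in boundary_exercised
    ]
```

An empty result corresponds to the "Each Proof Obligation is met" checkbox. A passing test suite alone does not produce that result.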
|
|
70
|
+
|
|
61
71
|
## Completion Criteria
|
|
72
|
+
- [ ] All listed AC / Claim IDs are implemented or verified by this task
|
|
62
73
|
- [ ] All added tests pass
|
|
63
74
|
- [ ] Operation verified per Operation Verification Methods above
|
|
75
|
+
- [ ] Each Proof Obligation is met: the test turns red under its primary failure mode and exercises the stated boundary
|
|
64
76
|
- [ ] Deliverables created (for research/design tasks)
|
|
65
77
|
- [ ] When Binding Decisions exist, every Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes
|
|
66
78
|
|
|
@@ -55,14 +55,28 @@ Analyze task file existence state and determine the action required:
|
|
|
55
55
|
| State | Criteria | Next Action |
|
|
56
56
|
|-------|----------|-------------|
|
|
57
57
|
| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval -> Enter autonomous execution immediately |
|
|
58
|
-
| No tasks + plan exists | Consumed Task Set is empty
|
|
58
|
+
| No tasks + plan exists + reviewed plan | Consumed Task Set is empty and WorkPlan Review records `Status: approved`, `Conditions: none` | Confirm with user -> spawn task-decomposer |
|
|
59
|
+
| No tasks + small simplified plan | Consumed Task Set is empty, plan exists, and the plan references no Design Doc | Confirm with user -> spawn task-decomposer |
|
|
60
|
+
| No tasks + plan exists + unreviewed plan | Consumed Task Set is empty, the plan references a Design Doc, and WorkPlan Review is absent, pending, conditional, or not approved | Run work plan review, then confirm with user -> spawn task-decomposer |
|
|
59
61
|
| Neither exists | No plan or task files | Error: Prerequisites not met |
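
The revised state table reduces to a short decision function. The sketch below assumes the four inputs have already been computed elsewhere. The `NextAction` enum and the parameter names are illustrative, not identifiers from the skill.

```python
from enum import Enum


class NextAction(Enum):
    EXECUTE = "enter autonomous execution"
    CONFIRM_THEN_DECOMPOSE = "confirm with user -> spawn task-decomposer"
    REVIEW_THEN_CONFIRM = "run work plan review, then confirm with user -> spawn task-decomposer"
    ERROR = "error: prerequisites not met"


def next_action(task_count: int, plan_exists: bool,
                references_design_doc: bool, plan_reviewed: bool) -> NextAction:
    """Map the task-file and plan state to the action in the table above."""
    if task_count > 0:
        return NextAction.EXECUTE  # Consumed Task Set is non-empty
    if not plan_exists:
        return NextAction.ERROR    # neither tasks nor a plan
    if plan_reviewed or not references_design_doc:
        # Reviewed plan, or a small simplified plan with no Design Doc to trace against.
        return NextAction.CONFIRM_THEN_DECOMPOSE
    return NextAction.REVIEW_THEN_CONFIRM
```

Here `plan_reviewed` means the WorkPlan Review section records `Status: approved` and `Conditions: none`. An absent, pending, or conditional section counts as `False`.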
|
|
60
62
|
|
|
61
63
|
## Task Decomposition Phase (Conditional)
|
|
62
64
|
|
|
63
|
-
When task files don't exist:
|
|
65
|
+
When task files don't exist, the plan references a Design Doc, and the WorkPlan Review section is absent, pending, conditional, or not approved:
|
|
64
66
|
|
|
65
|
-
### 1.
|
|
67
|
+
### 1. Work Plan Review
|
|
68
|
+
|
|
69
|
+
Spawn document-reviewer agent: "Review the work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
|
|
70
|
+
|
|
71
|
+
Branch on `verdict.decision`:
|
|
72
|
+
- `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation
|
|
73
|
+
- `approved_with_conditions` -> stop before task decomposition and report that the work plan needs update via recipe-plan
|
|
74
|
+
- `needs_revision` -> stop before task decomposition and report that the work plan needs update via recipe-plan
|
|
75
|
+
- `rejected` -> stop before task decomposition and present the blocking findings to the user
|
|
76
|
+
|
|
77
|
+
When task files don't exist and the WorkPlan Review section records `Status: approved` and `Conditions: none`, skip Work Plan Review and continue to user confirmation.
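
The branch on `verdict.decision` in this build recipe can be sketched as a lookup. The action labels are hypothetical names chosen for the example. Raising on an unknown value is an assumption, in line with the guide's rule that status values outside the vocabulary are a violation.

```python
def handle_build_review_verdict(decision: str) -> str:
    """Map a document-reviewer decision to the build recipe's next step."""
    actions = {
        # Only a clean approval continues toward task decomposition.
        "approved": "record_approval_then_confirm_with_user",
        # Conditional approval and revision requests both stop the build recipe.
        "approved_with_conditions": "stop_and_report_plan_needs_update",
        "needs_revision": "stop_and_report_plan_needs_update",
        "rejected": "stop_and_present_blocking_findings",
    }
    if decision not in actions:
        raise ValueError(f"unknown verdict.decision: {decision!r}")
    return actions[decision]
```

Unlike the planning recipes, the build recipe does not loop back into revision. Anything short of a clean approval hands control back to the planning flow.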
|
|
78
|
+
|
|
79
|
+
### 2. User Confirmation
|
|
66
80
|
```
|
|
67
81
|
No task files found.
|
|
68
82
|
Work plan: docs/plans/[plan-name].md
|
|
@@ -70,10 +84,10 @@ Work plan: docs/plans/[plan-name].md
|
|
|
70
84
|
Generate tasks from the work plan? (y/n):
|
|
71
85
|
```
|
|
72
86
|
|
|
73
|
-
###
|
|
87
|
+
### 3. Task Decomposition (if approved)
|
|
74
88
|
Spawn task-decomposer agent: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. Granularity: 1 task = 1 commit = independently executable."
|
|
75
89
|
|
|
76
|
-
###
|
|
90
|
+
### 4. Verify Generation
|
|
77
91
|
Recompute the Consumed Task Set and verify it is non-empty.
|
|
78
92
|
|
|
79
93
|
## Pre-execution Checklist
|
|
@@ -123,7 +137,7 @@ VERIFY approval status before proceeding. Once confirmed, INITIATE autonomous ex
|
|
|
123
137
|
## Post-Implementation Verification (After All Tasks Complete)
|
|
124
138
|
|
|
125
139
|
After all task cycles finish, collect all `filesModified` from every task-executor response (deduplicated), then run both verification agents before the completion report:
|
|
126
|
-
1. Spawn code-verifier agent: "Verify implementation consistency against the Design Doc. `doc_type: design-doc`. `document_path`: [path]. `code_paths`: [collected filesModified list]."
|
|
140
|
+
1. Spawn code-verifier agent: "Verify implementation consistency against the Design Doc. `doc_type: design-doc`. `document_path`: [path]. `code_paths`: [collected filesModified list]. Work Plan Review Scope: [Review Scope value from the active work plan, used only to confirm the collected file set is complete]."
|
|
127
141
|
2. Spawn security-reviewer agent: "Design Doc: [path]. Implementation files: [collected filesModified list]. Review security compliance."
|
|
128
142
|
3. Consolidate results:
|
|
129
143
|
- code-verifier passes when `summary.status` is `consistent` or `mostly_consistent`
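
The collection and pass rule for post-implementation verification might look like the following sketch. It assumes each executor response is a dict with an optional `filesModified` list and that the verifier report nests `status` under `summary`. The function names are illustrative.

```python
def collect_files_modified(responses):
    """Merge `filesModified` across responses, dropping duplicates in first-seen order."""
    seen = {}
    for response in responses:
        for path in response.get("filesModified", []):
            seen.setdefault(path, None)  # dict keys preserve insertion order
    return list(seen)


def code_verifier_passes(report):
    """Apply the pass rule: `summary.status` is consistent or mostly_consistent."""
    status = report.get("summary", {}).get("status")
    return status in {"consistent", "mostly_consistent"}
```

The plan's Review Scope is passed to code-verifier only as a completeness check on this collected list, so it does not change how the list itself is built.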
|
|
@@ -55,14 +55,28 @@ Analyze task file existence state and determine the action required:
|
|
|
55
55
|
| State | Criteria | Next Action |
|
|
56
56
|
|-------|----------|-------------|
|
|
57
57
|
| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval -> Enter autonomous execution immediately |
|
|
58
|
-
| No tasks + plan exists | Consumed Task Set is empty
|
|
58
|
+
| No tasks + plan exists + reviewed plan | Consumed Task Set is empty and WorkPlan Review records `Status: approved`, `Conditions: none` | Confirm with user -> spawn task-decomposer |
|
|
59
|
+
| No tasks + small simplified plan | Consumed Task Set is empty, plan exists, and the plan references no Design Doc | Confirm with user -> spawn task-decomposer |
|
|
60
|
+
| No tasks + plan exists + unreviewed plan | Consumed Task Set is empty, the plan references a Design Doc, and WorkPlan Review is absent, pending, conditional, or not approved | Run work plan review, then confirm with user -> spawn task-decomposer |
|
|
59
61
|
| Neither exists | No plan or task files | Error: Prerequisites not met |
|
|
60
62
|
|
|
61
63
|
## Task Decomposition Phase (Conditional)
|
|
62
64
|
|
|
63
|
-
When task files don't exist:
|
|
65
|
+
When task files don't exist, the plan references a Design Doc, and the WorkPlan Review section is absent, pending, conditional, or not approved:
|
|
64
66
|
|
|
65
|
-
### 1.
|
|
67
|
+
### 1. Work Plan Review
|
|
68
|
+
|
|
69
|
+
Spawn document-reviewer agent: "Review the frontend work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
|
|
70
|
+
|
|
71
|
+
Branch on `verdict.decision`:
|
|
72
|
+
- `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation
|
|
73
|
+
- `approved_with_conditions` -> stop before task decomposition and report that the work plan needs update via recipe-front-plan
|
|
74
|
+
- `needs_revision` -> stop before task decomposition and report that the work plan needs update via recipe-front-plan
|
|
75
|
+
- `rejected` -> stop before task decomposition and present the blocking findings to the user
|
|
76
|
+
|
|
77
|
+
When task files don't exist and the WorkPlan Review section records `Status: approved` and `Conditions: none`, skip Work Plan Review and continue to user confirmation.
|
|
78
|
+
|
|
79
|
+
### 2. User Confirmation
|
|
66
80
|
```
|
|
67
81
|
No task files found.
|
|
68
82
|
Work plan: docs/plans/[plan-name].md
|
|
@@ -70,10 +84,10 @@ Work plan: docs/plans/[plan-name].md
|
|
|
70
84
|
Generate tasks from the work plan? (y/n):
|
|
71
85
|
```
|
|
72
86
|
|
|
73
|
-
###
|
|
87
|
+
### 3. Task Decomposition (if approved)
|
|
74
88
|
Spawn task-decomposer agent: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. Granularity: 1 task = 1 commit = independently executable"
|
|
75
89
|
|
|
76
|
-
###
|
|
90
|
+
### 4. Verify Generation
|
|
77
91
|
Recompute the Consumed Task Set and verify it is non-empty.
|
|
78
92
|
|
|
79
93
|
## Pre-execution Checklist
|
|
@@ -131,7 +145,7 @@ VERIFY approval status before proceeding. Once confirmed, INITIATE autonomous ex
|
|
|
131
145
|
## Post-Implementation Verification (After All Tasks Complete)
|
|
132
146
|
|
|
133
147
|
After all task cycles finish, collect all `filesModified` from every task-executor-frontend response (deduplicated), then run both verification agents before the completion report:
|
|
134
|
-
1. Spawn code-verifier agent: "Verify implementation consistency against the Design Doc. `doc_type: design-doc`. `document_path`: [path]. `code_paths`: [collected filesModified list]."
|
|
148
|
+
1. Spawn code-verifier agent: "Verify implementation consistency against the Design Doc. `doc_type: design-doc`. `document_path`: [path]. `code_paths`: [collected filesModified list]. Work Plan Review Scope: [Review Scope value from the active work plan, used only to confirm the collected file set is complete]."
|
|
135
149
|
2. Spawn security-reviewer agent: "Design Doc: [path]. Implementation files: [collected filesModified list]. Review security compliance."
|
|
136
150
|
3. Consolidate results:
|
|
137
151
|
- code-verifier passes when `summary.status` is `consistent` or `mostly_consistent`
|
|
@@ -20,6 +20,7 @@ description: "Create frontend work plan from design document with test skeleton
|
|
|
20
20
|
**Execution Method**:
|
|
21
21
|
- Test skeleton generation -> performed by acceptance-test-generator
|
|
22
22
|
- Work plan creation -> performed by work-planner
|
|
23
|
+
- Work plan review -> performed by document-reviewer
|
|
23
24
|
|
|
24
25
|
Orchestrator spawns agents and passes structured data between them.
|
|
25
26
|
|
|
@@ -29,6 +30,7 @@ Orchestrator spawns agents and passes structured data between them.
|
|
|
29
30
|
- Design document selection
|
|
30
31
|
- Test skeleton generation with acceptance-test-generator
|
|
31
32
|
- Work plan creation with work-planner
|
|
33
|
+
- Work plan review with document-reviewer
|
|
32
34
|
- Plan approval obtainment
|
|
33
35
|
|
|
34
36
|
**Responsibility Boundary**: This skill completes with work plan approval.
|
|
@@ -50,6 +52,15 @@ Spawn acceptance-test-generator agent: "Generate test skeletons from Design Doc
|
|
|
50
52
|
### Step 3: Work Plan Creation
|
|
51
53
|
Spawn work-planner agent: "Create work plan from Design Doc at [path]. Integration test file: [path from step 2]. fixture-e2e test file: [path from step 2 or null]. service-integration-e2e test file: [path from step 2 or null]. E2E absence reasons by lane: [values from step 2 when an E2E lane is null]. Integration tests are created with each phase implementation, fixture-e2e runs alongside UI implementation, service-integration-e2e runs only in the final phase when a service E2E file exists. Include `Implementation Readiness: pending` in the work plan header."
|
|
52
54
|
|
|
55
|
+
### Step 4: Work Plan Review
|
|
56
|
+
Spawn document-reviewer agent: "Review the frontend work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
|
|
57
|
+
|
|
58
|
+
Branch on `verdict.decision`:
|
|
59
|
+
- `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then proceed to Step 5
|
|
60
|
+
- `approved_with_conditions` or `needs_revision` -> spawn work-planner in update mode with the findings or conditions, then repeat Step 4. Use max 2 revision iterations as defined by the `needs_revision` row in subagents-orchestration-guide Approval Status Vocabulary.
|
|
61
|
+
- `rejected` -> stop and present the blocking findings to the user.
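
The review loop in Step 4 can be sketched with two injected callables. `review` and `revise` are hypothetical stand-ins for spawning document-reviewer and work-planner in update mode. Returning `"escalate"` once the two-iteration cap is reached is an assumption, since the text here only states the cap.

```python
MAX_REVISIONS = 2  # cap taken from the `needs_revision` rule referenced above


def run_plan_review(review, revise):
    """Review the plan, revising at most MAX_REVISIONS times.

    review() returns a dict with "decision" and optional "findings";
    revise(findings) updates the plan in place.
    """
    revisions = 0
    while True:
        verdict = review()
        decision = verdict["decision"]
        if decision == "approved":
            return "approved"  # caller then records Status/Conditions and proceeds to Step 5
        if decision == "rejected":
            return "rejected"  # caller presents the blocking findings to the user
        if decision in ("approved_with_conditions", "needs_revision"):
            if revisions == MAX_REVISIONS:
                return "escalate"  # assumed behaviour once the cap is reached
            revise(verdict.get("findings", []))
            revisions += 1
            continue
        raise ValueError(f"unknown verdict.decision: {decision!r}")
```

Injecting the two callables keeps the loop testable without spawning real agents.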
|
|
62
|
+
|
|
63
|
+
### Step 5: Plan Approval
|
|
53
64
|
**[STOP -- BLOCKING]** Interact with user to complete plan and obtain approval for plan content. Clarify specific implementation steps and risks.
|
|
54
65
|
**CANNOT proceed until user explicitly approves the work plan.**
|
|
55
66
|
|
|
@@ -60,6 +71,7 @@ ENFORCEMENT: Plan content MUST be approved before declaring completion. Unapprov
|
|
|
60
71
|
- [ ] Design document selected
|
|
61
72
|
- [ ] Test skeletons generated
|
|
62
73
|
- [ ] Work plan created
|
|
74
|
+
- [ ] Work plan reviewed via document-reviewer
|
|
63
75
|
- [ ] User approved plan content
|
|
64
76
|
|
|
65
77
|
## Output Example
|
|
@@ -31,12 +31,13 @@ Design Doc (uses most recent if omitted): $ARGUMENTS
|
|
|
31
31
|
|
|
32
32
|
### 1. Prerequisite Check
|
|
33
33
|
Identify the Design Doc in docs/design/ and check implementation files changed from the default branch (detect via `git symbolic-ref refs/remotes/origin/HEAD` or fall back to current branch diff).
|
|
34
|
+
If a single active work plan is explicitly provided or unambiguously resolved for that Design Doc, read its `Review Scope` line. Otherwise set `Work Plan: none` and `Review Scope: none`; do not infer.
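
The "do not infer" rule for `Review Scope` can be sketched as below. The input is assumed to be a mapping from candidate work plan paths to their text, already filtered to plans for the selected Design Doc. `resolve_review_scope` is a hypothetical helper. Returning `none` for a resolved plan that lacks the line is also an assumption, as the recipe does not cover that case.

```python
import re


def resolve_review_scope(candidate_plans: dict[str, str]) -> tuple[str, str]:
    """Return (work_plan, review_scope), using "none" whenever resolution is not unambiguous."""
    if len(candidate_plans) != 1:
        return ("none", "none")  # zero or several candidates: do not guess

    path, text = next(iter(candidate_plans.items()))
    # Read the literal header line; the value is passed through unchanged.
    match = re.search(r"^Review Scope:[ \t]*(.+?)[ \t]*$", text, re.M)
    if match is None:
        return (path, "none")  # assumed handling for a plan without the line
    return (path, match.group(1))
```

The returned pair fills the `Work Plan:` and `Review Scope:` fields of the code-reviewer prompt as literal values.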
|
|
34
35
|
|
|
35
36
|
**[STOP -- BLOCKING]** If no Design Doc or implementation files found, notify user and halt.
|
|
36
37
|
**CANNOT proceed without both a Design Doc and implementation files.**
|
|
37
38
|
|
|
38
39
|
### 2. Execute code-reviewer
|
|
39
|
-
Spawn code-reviewer agent: "Validate Design Doc compliance for [design-doc-path]. Implementation files: [git diff file list]. Review mode: full. Return structured JSON report per your Output Format specification."
|
|
40
|
+
Spawn code-reviewer agent: "Validate Design Doc compliance for [design-doc-path]. Work Plan: [resolved work plan path or none]. Review Scope: [literal Review Scope value or none]. Implementation files: [git diff file list]. Review mode: full. Return structured JSON report per your Output Format specification."
|
|
40
41
|
|
|
41
42
|
**Store output as**: `$STEP_2_OUTPUT`
|
|
42
43
|
|
|
@@ -65,14 +65,28 @@ Analyze task file existence state and determine the action required:
|
|
|
65
65
|
| State | Criteria | Next Action |
|
|
66
66
|
|-------|----------|-------------|
|
|
67
67
|
| Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval -> Enter autonomous execution immediately |
|
|
68
|
-
| No tasks + plan exists | Consumed Task Set is empty
|
|
68
|
+
| No tasks + plan exists + reviewed plan | Consumed Task Set is empty and WorkPlan Review records `Status: approved`, `Conditions: none` | Confirm with user -> spawn task-decomposer |
|
|
69
|
+
| No tasks + small simplified plan | Consumed Task Set is empty, plan exists, and the plan references no Design Doc | Confirm with user -> spawn task-decomposer |
|
|
70
|
+
| No tasks + plan exists + unreviewed plan | Consumed Task Set is empty, the plan references a Design Doc, and WorkPlan Review is absent, pending, conditional, or not approved | Run work plan review, then confirm with user -> spawn task-decomposer |
|
|
69
71
|
| Neither exists | No plan or task files | Error: Prerequisites not met |
|
|
70
72
|
|
|
71
73
|
## Task Decomposition Phase (Conditional)
|
|
72
74
|
|
|
73
|
-
When task files don't exist:
|
|
75
|
+
When task files don't exist, the plan references a Design Doc, and the WorkPlan Review section is absent, pending, conditional, or not approved:
|
|
74
76
|
|
|
75
|
-
### 1.
|
|
77
|
+
### 1. Work Plan Review
|
|
78
|
+
|
|
79
|
+
Spawn document-reviewer agent: "Review the fullstack work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
|
|
80
|
+
|
|
81
|
+
Branch on `verdict.decision`:
|
|
82
|
+
- `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation
|
|
83
|
+
- `approved_with_conditions` -> stop before task decomposition and report that the work plan needs update via recipe-plan or the fullstack planning flow
|
|
84
|
+
- `needs_revision` -> stop before task decomposition and report that the work plan needs update via recipe-plan or the fullstack planning flow
|
|
85
|
+
- `rejected` -> stop before task decomposition and present the blocking findings to the user
|
|
86
|
+
|
|
87
|
+
When task files don't exist and the WorkPlan Review section records `Status: approved` and `Conditions: none`, skip Work Plan Review and continue to user confirmation.
|
|
88
|
+
|
|
89
|
+
### 2. User Confirmation
|
|
76
90
|
```
|
|
77
91
|
No task files found.
|
|
78
92
|
Work plan: docs/plans/[plan-name].md
|
|
@@ -80,10 +94,10 @@ Work plan: docs/plans/[plan-name].md
|
|
|
80
94
|
Generate tasks from the work plan? (y/n):
|
|
81
95
|
```
|
|
82
96
|
|
|
83
|
-
###
|
|
97
|
+
### 3. Task Decomposition (if approved)
|
|
84
98
|
Spawn task-decomposer agent: "Read work plan at docs/plans/[plan-name].md and decompose into atomic tasks. Output: Individual task files in docs/plans/tasks/. Granularity: 1 task = 1 commit = independently executable. Use layer-aware naming: {plan}-backend-task-{n}.md, {plan}-frontend-task-{n}.md based on target file paths."
|
|
85
99
|
|
|
86
|
-
###
|
|
100
|
+
### 4. Verify Generation
|
|
87
101
|
Recompute the Consumed Task Set and verify it is non-empty.
|
|
88
102
|
|
|
89
103
|
## Pre-execution Checklist
|
|
@@ -141,7 +155,7 @@ VERIFY approval status before proceeding. Once confirmed, INITIATE autonomous ex
|
|
|
141
155
|
## Post-Implementation Verification (After All Tasks Complete)
|
|
142
156
|
|
|
143
157
|
After all task cycles finish, collect all `filesModified` from every task-executor/task-executor-frontend response (deduplicated), then run both verification agents before the completion report:
|
|
144
|
-
1. Spawn code-verifier once per Design Doc: "Verify implementation consistency against the Design Doc. `doc_type: design-doc`. `document_path`: [single design doc path]. `code_paths`: [collected filesModified list]."
|
|
158
|
+
1. Spawn code-verifier once per Design Doc: "Verify implementation consistency against the Design Doc. `doc_type: design-doc`. `document_path`: [single design doc path]. `code_paths`: [collected filesModified list]. Work Plan Review Scope: [Review Scope value from the active work plan, used only to confirm the collected file set is complete]."
|
|
145
159
|
2. Spawn security-reviewer agent: "Design Doc: [path(s)]. Implementation files: [collected filesModified list]. Review security compliance."
|
|
146
160
|
3. Consolidate results:
|
|
147
161
|
- each code-verifier run passes when `summary.status` is `consistent` or `mostly_consistent`
|
|
@@ -69,7 +69,7 @@ Follow subagents-orchestration-guide skill Large/Medium/Small scale flow exactly
|
|
|
69
69
|
**STEP 3**: Spawn technical-designer-frontend agent → spawn document-reviewer agent → spawn design-sync agent.
|
|
70
70
|
**[STOP — BLOCKING]** Present Frontend Design Doc for user approval. **CANNOT proceed until user explicitly confirms.**
|
|
71
71
|
|
|
72
|
-
**STEP 4**: Spawn acceptance-test-generator agent → spawn work-planner agent
|
|
72
|
+
**STEP 4**: Spawn acceptance-test-generator agent → spawn work-planner agent → spawn document-reviewer agent with `doc_type: WorkPlan`.
|
|
73
73
|
**[STOP — BLOCKING]** Present Work Plan for user approval. **CANNOT proceed until user explicitly confirms.**
|
|
74
74
|
|
|
75
75
|
**STEP 5**: Run implementation readiness preflight.
|
|
@@ -33,6 +33,7 @@ ENFORCEMENT: Work-planner spawned without test skeleton data (when tests were re
|
|
|
33
33
|
- Design document selection
|
|
34
34
|
- Test skeleton generation with acceptance-test-generator
|
|
35
35
|
- Work plan creation with work-planner
|
|
36
|
+
- Work plan review with document-reviewer
|
|
36
37
|
- Plan approval obtainment
|
|
37
38
|
|
|
38
39
|
**Responsibility Boundary**: This skill completes with work plan approval.
|
|
@@ -53,7 +54,18 @@ Present options if multiple exist (can be specified with $ARGUMENTS).
|
|
|
53
54
|
|
|
54
55
|
### Step 3: Work Plan Creation
|
|
55
56
|
- Spawn work-planner agent: "Create work plan from design document at [design-doc-path]. Include deliverables from previous process according to subagents-orchestration-guide skill coordination specification. If `generatedFiles.fixtureE2e` or `generatedFiles.serviceE2e` is null, use the corresponding `e2eAbsenceReason` and accept the null E2E lane as a valid planning input. Include `Implementation Readiness: pending` in the work plan header."
|
|
56
|
-
|
|
57
|
+
|
|
58
|
+
### Step 4: Work Plan Review
|
|
59
|
+
Spawn document-reviewer agent: "Review the work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
|
|
60
|
+
|
|
61
|
+
Branch on `verdict.decision`:
|
|
62
|
+
- `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then proceed to Step 5
|
|
63
|
+
- `approved_with_conditions` or `needs_revision` -> spawn work-planner in update mode with the findings or conditions, then repeat Step 4. Use max 2 revision iterations as defined by the `needs_revision` row in subagents-orchestration-guide Approval Status Vocabulary.
|
|
64
|
+
- `rejected` -> stop and present the blocking findings to the user.
|
|
65
|
+
|
|
66
|
+
### Step 5: Plan Approval
|
|
67
|
+
- Present the reviewed work plan to the user for batch approval
|
|
68
|
+
- If the user requests changes, spawn work-planner in update mode and re-run Step 4
|
|
57
69
|
- Clarify specific implementation steps and risks
|
|
58
70
|
|
|
59
71
|
**Scope**: Up to work plan creation and obtaining approval for plan content.
|
|
@@ -63,6 +75,7 @@ Present options if multiple exist (can be specified with $ARGUMENTS).
|
|
|
63
75
|
- [ ] Design document identified and selected
|
|
64
76
|
- [ ] Integration/E2E test skeleton generation confirmed with user (generated if requested)
|
|
65
77
|
- [ ] Work plan created via work-planner
|
|
78
|
+
- [ ] Work plan reviewed via document-reviewer
|
|
66
79
|
- [ ] Plan content approved by user
|
|
67
80
|
- [ ] All stopping points honored with user confirmation
|
|
68
81
|
|
|
@@ -36,9 +36,10 @@ Design Doc (uses most recent if omitted): $ARGUMENTS
|
|
|
36
36
|
|
|
37
37
|
### Step 1: Prerequisite Check
|
|
38
38
|
Identify Design Doc in docs/design/ and check implementation files via git diff.
|
|
39
|
+
If a single active work plan is explicitly provided or unambiguously resolved for that Design Doc, read its `Review Scope` line. Otherwise set `Work Plan: none` and `Review Scope: none`; do not infer.
|
|
39
40
|
|
|
40
41
|
### Step 2: Execute code-reviewer
|
|
41
|
-
Spawn code-reviewer agent: "Validate Design Doc compliance for the implementation. Design Doc path: [path]. Implementation files: [git diff file list]. Review mode: full. Return structured JSON report per your Output Format specification."
|
|
42
|
+
Spawn code-reviewer agent: "Validate Design Doc compliance for the implementation. Design Doc path: [path]. Work Plan: [resolved work plan path or none]. Review Scope: [literal Review Scope value or none]. Implementation files: [git diff file list]. Review mode: full. Return structured JSON report per your Output Format specification."
|
|
42
43
|
|
|
43
44
|
**Store output as**: `$STEP_2_OUTPUT`
|
|
44
45
|
|
|
@@ -140,7 +140,7 @@ Autonomous execution MUST stop and wait for user input at these points.
|
|
|
140
140
|
| UI Spec | After document-reviewer completes UI Spec review (frontend/fullstack) | Approve UI Spec |
|
|
141
141
|
| ADR | After document-reviewer completes ADR review (if ADR created) | Approve ADR |
|
|
142
142
|
| Design | After design-sync completes consistency verification | Approve Design Doc |
|
|
143
|
-
| Work Plan | After
|
|
143
|
+
| Work Plan | After document-reviewer completes WorkPlan review for Medium/Large, or after simplified plan creation for Small | Batch approval for implementation phase |
|
|
144
144
|
|
|
145
145
|
**ENFORCEMENT**: After batch approval, autonomous execution proceeds without stops until completion or escalation. Skipping stop points is a CRITICAL VIOLATION.
|
|
146
146
|
|
|
@@ -164,6 +164,16 @@ Handling rules:
|
|
|
164
164
|
|
|
165
165
|
**ENFORCEMENT**: Using any status value outside this vocabulary is a VIOLATION.
|
|
166
166
|
|
|
167
|
+
### WorkPlan Review State [MANDATORY]
|
|
168
|
+
|
|
169
|
+
Medium and Large work plans must contain a `WorkPlan Review` section. Small simplified plans are exempt because they have no Design Doc to trace against. The plan is reviewed only when that section records `Status: approved` and `Conditions: none`.
|
|
170
|
+
|
|
171
|
+
Handling rules:
|
|
172
|
+
- After WorkPlan review returns `approved`, invoke work-planner in update mode once to record the review section, without changing implementation content.
|
|
173
|
+
- Treat WorkPlan `approved_with_conditions` the same as `needs_revision`: return to work-planner in update mode with the conditions, then re-review. Conditions must not be carried into task decomposition or implementation readiness.
|
|
174
|
+
- A material work plan update resets `WorkPlan Review` to `Status: pending`.
|
|
175
|
+
- Standalone build recipes apply WorkPlan review only before task decomposition, not after task files already exist.
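As a reading aid, the WorkPlan Review State rules above can be sketched as a few small predicates. This is a minimal illustration, not part of the package: `WorkPlanReview`, `requires_review_section`, `is_reviewed`, `effective_status`, and `after_material_update` are hypothetical names, and the scale strings are assumed to be lowercase.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WorkPlanReview:
    """Hypothetical in-memory view of a plan's `WorkPlan Review` section."""
    status: str = "pending"
    conditions: str = "none"


def requires_review_section(scale: str) -> bool:
    # Medium and Large plans need the section; Small simplified plans are exempt.
    return scale in {"medium", "large"}


def is_reviewed(review: WorkPlanReview) -> bool:
    # Reviewed only when the section records both `approved` and `none`.
    return review.status == "approved" and review.conditions == "none"


def effective_status(reviewer_status: str) -> str:
    # For WorkPlan reviews, approved_with_conditions is handled like needs_revision.
    if reviewer_status == "approved_with_conditions":
        return "needs_revision"
    return reviewer_status


def after_material_update(review: WorkPlanReview) -> WorkPlanReview:
    # A material work plan update resets the recorded status to pending.
    return replace(review, status="pending")
```

Under these assumptions, a plan whose section reads `Status: approved` with any condition other than `none` is still not reviewed, which matches the rule that conditions must not be carried into task decomposition.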
|
|
176
|
+
|
|
167
177
|
## Scale Determination and Document Requirements
|
|
168
178
|
|
|
169
179
|
| Scale | File Count | PRD | ADR | Design Doc | Work Plan |
|
|
@@ -242,8 +252,8 @@ Always start with `requirement-analyzer`, then follow the minimum flow required
|
|
|
242
252
|
|
|
243
253
|
| Scale | Required flow |
|
|
244
254
|
|-------|---------------|
|
|
245
|
-
| Large | `requirement-analyzer` **[Stop]** -> `prd-creator` -> `document-reviewer` **[Stop]** -> optional `ui-spec-designer` + `document-reviewer` **[Stop]** -> optional ADR + `document-reviewer` **[Stop]** -> `codebase-analyzer` -> `technical-designer*` -> `code-verifier` -> `document-reviewer` -> `design-sync` **[Stop]** -> `acceptance-test-generator` -> `work-planner` **[Stop]** -> `task-decomposer` |
|
|
246
|
-
| Medium | `requirement-analyzer` **[Stop]** -> `codebase-analyzer` -> optional `ui-spec-designer` + `document-reviewer` **[Stop]** -> `technical-designer*` -> `code-verifier` -> `document-reviewer` -> `design-sync` **[Stop]** -> `acceptance-test-generator` -> `work-planner` **[Stop]** -> `task-decomposer` |
|
|
255
|
+
| Large | `requirement-analyzer` **[Stop]** -> `prd-creator` -> `document-reviewer` **[Stop]** -> optional `ui-spec-designer` + `document-reviewer` **[Stop]** -> optional ADR + `document-reviewer` **[Stop]** -> `codebase-analyzer` -> `technical-designer*` -> `code-verifier` -> `document-reviewer` -> `design-sync` **[Stop]** -> `acceptance-test-generator` -> `work-planner` -> `document-reviewer` (doc_type: WorkPlan) **[Stop]** -> `task-decomposer` |
|
|
256
|
+
| Medium | `requirement-analyzer` **[Stop]** -> `codebase-analyzer` -> optional `ui-spec-designer` + `document-reviewer` **[Stop]** -> `technical-designer*` -> `code-verifier` -> `document-reviewer` -> `design-sync` **[Stop]** -> `acceptance-test-generator` -> `work-planner` -> `document-reviewer` (doc_type: WorkPlan) **[Stop]** -> `task-decomposer` |
|
|
247
257
|
| Small | `requirement-analyzer` **[Stop]** -> simplified plan **[Stop: Batch approval]** -> direct implementation |
|
|
248
258
|
|
|
249
259
|
Flow rules:
|
|
@@ -253,6 +263,7 @@ Flow rules:
|
|
|
253
263
|
- Pass `codebase-analyzer` output to the designer as `Codebase Analysis`
|
|
254
264
|
- Pass Design Doc path to `code-verifier`, then pass `code_verification` to `document-reviewer`
|
|
255
265
|
- Fullstack layer sequencing is defined in `references/monorepo-flow.md`
|
|
266
|
+
- Run WorkPlan review after every Medium/Large work plan creation or update and before batch approval. On `needs_revision` or WorkPlan `approved_with_conditions`, return to `work-planner` in update mode and re-review for max 2 revision iterations as defined by the `needs_revision` row in Approval Status Vocabulary. On `rejected`, halt and escalate to the user.
|
|
256
267
|
|
|
257
268
|
## Autonomous Execution Mode
|
|
258
269
|
|
|
@@ -10,7 +10,7 @@ This reference defines the orchestration flow for projects spanning multiple lay
|
|
|
10
10
|
|
|
11
11
|
## Design Phase
|
|
12
12
|
|
|
13
|
-
### Large Scale Fullstack (6+ Files) -
|
|
13
|
+
### Large Scale Fullstack (6+ Files) - 16 Steps
|
|
14
14
|
|
|
15
15
|
| Step | Agent | Purpose | Output |
|
|
16
16
|
|------|-------|---------|--------|
|
|
@@ -28,9 +28,10 @@ This reference defines the orchestration flow for projects spanning multiple lay
|
|
|
28
28
|
| 12 | document-reviewer x2 | Review each Design Doc with verification evidence | Reviews |
|
|
29
29
|
| 13 | design-sync | Cross-layer consistency verification (source: frontend Design Doc) **[Stop]** | Sync status |
|
|
30
30
|
| 14 | acceptance-test-generator | Integration/E2E test skeleton from cross-layer contracts | Test skeletons |
|
|
31
|
-
| 15 | work-planner | Work plan from all Design Docs
|
|
31
|
+
| 15 | work-planner | Work plan from all Design Docs | Work plan |
|
|
32
|
+
| 16 | document-reviewer | WorkPlan review **[Stop: Batch approval]** | Approval |
|
|
32
33
|
|
|
33
|
-
### Medium Scale Fullstack (3-5 Files) -
|
|
34
|
+
### Medium Scale Fullstack (3-5 Files) - 14 Steps
|
|
34
35
|
|
|
35
36
|
| Step | Agent | Purpose | Output |
|
|
36
37
|
|------|-------|---------|--------|
|
|
@@ -46,7 +47,8 @@ This reference defines the orchestration flow for projects spanning multiple lay
|
|
|
46
47
|
| 10 | document-reviewer x2 | Review each Design Doc with verification evidence | Reviews |
|
|
47
48
|
| 11 | design-sync | Cross-layer consistency verification (source: frontend Design Doc) **[Stop]** | Sync status |
|
|
48
49
|
| 12 | acceptance-test-generator | Integration/E2E test skeleton from cross-layer contracts | Test skeletons |
|
|
49
|
-
| 13 | work-planner | Work plan from all Design Docs
|
|
50
|
+
| 13 | work-planner | Work plan from all Design Docs | Work plan |
|
|
51
|
+
| 14 | document-reviewer | WorkPlan review **[Stop: Batch approval]** | Approval |
|
|
50
52
|
|
|
51
53
|
### Parallelization in Multi-Agent Steps
|
|
52
54
|
|
|
@@ -101,6 +103,12 @@ Spawn work-planner with all Design Docs:
|
|
|
101
103
|
|
|
102
104
|
work-planner's existing Integration Complete criteria naturally cover cross-layer verification when given multiple Design Docs.
|
|
103
105
|
|
|
106
|
+
After work-planner creates or updates the plan, spawn document-reviewer:
|
|
107
|
+
|
|
108
|
+
> "Review the fullstack work plan. doc_type: WorkPlan. target: [work plan path]. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
|
|
109
|
+
|
|
110
|
+
On `needs_revision` or `approved_with_conditions`, return to work-planner in update mode and re-review for max 2 revision iterations as defined by the `needs_revision` row in Approval Status Vocabulary. On `rejected`, halt and escalate to the user. Stop for batch approval only after WorkPlan review returns `approved` and the plan's `WorkPlan Review` section records `Status: approved` with `Conditions: none`.
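The review loop described above can be sketched as follows. This is an illustrative sketch under stated assumptions: `review` and `revise` are hypothetical callables standing in for spawning document-reviewer and work-planner in update mode, and the `revision_limit_reached` outcome is an assumption, because the behavior after the second revision is defined in the Approval Status Vocabulary, which is not shown here.

```python
from typing import Callable

MAX_REVISIONS = 2  # "max 2 revision iterations" from the rule above
REVISION_STATUSES = {"needs_revision", "approved_with_conditions"}


def run_workplan_review(review: Callable[[], str], revise: Callable[[], None]) -> str:
    """Return 'approved', 'rejected', or 'revision_limit_reached' (assumed name)."""
    revisions = 0
    while True:
        status = review()
        if status == "approved":
            # The caller may now stop for batch approval.
            return "approved"
        if status == "rejected":
            # Halt; the caller escalates to the user.
            return "rejected"
        if status not in REVISION_STATUSES:
            # Values outside the approval vocabulary are treated as errors here.
            raise ValueError(f"unknown review status: {status}")
        if revisions == MAX_REVISIONS:
            # Assumption: surface the exhausted budget instead of looping again.
            return "revision_limit_reached"
        revise()  # work-planner in update mode, then re-review
        revisions += 1
```

With two revisions allowed, the reviewer runs at most three times: the initial review plus one re-review per revision.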
|
|
111
|
+
|
|
104
112
|
## Task Decomposition Phase
|
|
105
113
|
|
|
106
114
|
task-decomposer follows standard decomposition from the work plan. The key addition is the **layer-aware naming convention**:
|
|
@@ -111,6 +111,7 @@ For each valid AC from Phase 1:
|
|
|
111
111
|
- Happy path (1 test mandatory)
|
|
112
112
|
- Error handling (only if user-visible error)
|
|
113
113
|
- Edge cases (only if high business impact)
|
|
114
|
+
- Boundary path (behavior-changing AC only): when the AC can hold on the main path while a distinct branch, state, input class, lifecycle step, or fallback regresses, capture that boundary as a proof obligation. Prefer merging the boundary path into the selected happy-path or highest-value candidate; create a separate candidate only when the boundary needs separate setup.
|
|
114
115
|
|
|
115
116
|
2. **Classify test level**:
|
|
116
117
|
- Integration test candidate (feature-level interaction)
|
|
@@ -167,7 +168,8 @@ Value score and E2E selection rules are defined in **integration-e2e-testing ski
|
|
|
167
168
|
4. Reserve 1 service-integration-e2e slot only when the journey needs real cross-service verification
|
|
168
169
|
5. Fill remaining fixture-e2e budget with candidates that satisfy `Value Score >= 20`
|
|
169
170
|
6. Fill remaining service-integration-e2e budget with candidates that satisfy `Value Score > 50`
|
|
170
|
-
7.
|
|
171
|
+
7. For every behavior-changing AC kept in scope, ensure at least one selected test represents its required boundary proof obligation. Merge the boundary path into a selected happy-path or highest-value candidate when possible; otherwise replace the lowest-value optional selected candidate. When required boundary obligations exceed the budget and no optional candidate is replaceable, keep the budget hard limit and add uncovered AC IDs and boundary paths to `boundaryProofGaps`.
|
|
172
|
+
8. If a lane emits no tests, return its generated file as `null` with a concrete lane-specific absence reason
|
|
171
173
|
```
|
|
172
174
|
|
|
173
175
|
**Output**: Final test set
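Step 7 of the selection procedure can be illustrated with a small sketch. It is a simplification under stated assumptions: `Candidate`, `Obligation`, and `place_boundary_obligations` are hypothetical names, a boundary path is only merged into a selected candidate for the same AC, and the value score of a newly required boundary candidate is not modeled. Only the `boundaryProofGaps` entry shape and the reason string come from the output format in this diff.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Candidate:
    ac_id: str
    value: int
    optional: bool
    boundary_paths: List[str] = field(default_factory=list)


@dataclass
class Obligation:
    ac_id: str
    boundary_path: str
    mergeable: bool  # False when the boundary needs separate setup


def place_boundary_obligations(
    selected: List[Candidate], obligations: List[Obligation]
) -> List[Dict[str, str]]:
    """Mutate `selected` in place without growing it; return boundaryProofGaps."""
    gaps: List[Dict[str, str]] = []
    for ob in obligations:
        hosts = [c for c in selected if c.ac_id == ob.ac_id]
        if ob.mergeable and hosts:
            # Prefer merging into the highest-value selected candidate.
            max(hosts, key=lambda c: c.value).boundary_paths.append(ob.boundary_path)
            continue
        optional_idx = [i for i, c in enumerate(selected) if c.optional]
        if optional_idx:
            # Otherwise replace the lowest-value optional candidate.
            victim = min(optional_idx, key=lambda i: selected[i].value)
            selected[victim] = Candidate(ob.ac_id, 0, False, [ob.boundary_path])
            continue
        # Budget is a hard limit: record the gap instead of adding a test.
        gaps.append({
            "acId": ob.ac_id,
            "boundaryPath": ob.boundary_path,
            "reason": "budget_insufficient_for_boundary_proof",
        })
    return gaps
```

The list length never changes, which is how the sketch keeps the budget fixed while still reporting every uncovered boundary path.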
|
|
@@ -192,6 +194,8 @@ Adapt comment syntax to the project's language when generating annotations.
|
|
|
192
194
|
// @dependency: PaymentService, OrderRepository, Database
|
|
193
195
|
// @real-dependency: OrderRepository, Database
|
|
194
196
|
// @complexity: high
|
|
197
|
+
// Primary failure mode: payment succeeds but the order row is absent or unpersisted
|
|
198
|
+
// Proof obligation: assert order persistence after successful payment while keeping OrderRepository and Database real; only the external payment gateway may be mocked
|
|
195
199
|
[Test: 'AC1: Successful payment creates persisted order with correct status']
|
|
196
200
|
|
|
197
201
|
// AC1-error: "Payment failure shows user-friendly error message"
|
|
@@ -200,6 +204,8 @@ Adapt comment syntax to the project's language when generating annotations.
|
|
|
200
204
|
// @category: core-functionality
|
|
201
205
|
// @dependency: PaymentService, ErrorHandler
|
|
202
206
|
// @complexity: medium
|
|
207
|
+
// Primary failure mode: payment failure still creates an order or hides the user-facing error
|
|
208
|
+
// Proof obligation: assert the visible error and the unchanged order state after a failed payment; mock only the external payment gateway failure
|
|
203
209
|
[Test: 'AC1: Failed payment displays error without creating order']
|
|
204
210
|
```
|
|
205
211
|
|
|
@@ -221,6 +227,8 @@ Adapt comment syntax to the project's language when generating annotations.
|
|
|
221
227
|
// @lane: fixture-e2e
|
|
222
228
|
// @dependency: full-ui (mocked backend)
|
|
223
229
|
// @complexity: medium
|
|
230
|
+
// Primary failure mode: undo banner appears but the dismissed card is not restored
|
|
231
|
+
// Proof obligation: assert browser-visible state before dismissal, after dismissal, and after undo using fixture-controlled backend state
|
|
224
232
|
[Test: 'User Journey: Dismiss and undo restores the card']
|
|
225
233
|
```
|
|
226
234
|
|
|
@@ -242,6 +250,8 @@ Adapt comment syntax to the project's language when generating annotations.
|
|
|
242
250
|
// @lane: service-integration-e2e
|
|
243
251
|
// @dependency: full-system
|
|
244
252
|
// @complexity: high
|
|
253
|
+
// Primary failure mode: checkout appears successful but the persisted order or confirmation event is missing
|
|
254
|
+
// Proof obligation: exercise the full local service stack and assert persisted order state plus confirmation event after checkout
|
|
245
255
|
[Test: 'User Journey: Complete product purchase persists order and emits confirmation']
|
|
246
256
|
```
|
|
247
257
|
|
|
@@ -264,7 +274,8 @@ Adapt comment syntax to the project's language when generating annotations.
|
|
|
264
274
|
"e2eAbsenceReason": {
|
|
265
275
|
"fixtureE2e": "all_e2e_candidates_below_threshold",
|
|
266
276
|
"serviceE2e": "no_real_service_dependency"
|
|
267
|
-
}
|
|
277
|
+
},
|
|
278
|
+
"boundaryProofGaps": []
|
|
268
279
|
}
|
|
269
280
|
```
|
|
270
281
|
|
|
@@ -285,7 +296,14 @@ Adapt comment syntax to the project's language when generating annotations.
|
|
|
285
296
|
"e2eAbsenceReason": {
|
|
286
297
|
"fixtureE2e": null,
|
|
287
298
|
"serviceE2e": null
|
|
288
|
-
}
|
|
299
|
+
},
|
|
300
|
+
"boundaryProofGaps": [
|
|
301
|
+
{
|
|
302
|
+
"acId": "[AC-XXX]",
|
|
303
|
+
"boundaryPath": "[branch/state/input/lifecycle/fallback/visibility path]",
|
|
304
|
+
"reason": "budget_insufficient_for_boundary_proof"
|
|
305
|
+
}
|
|
306
|
+
]
|
|
289
307
|
}
|
|
290
308
|
```
|
|
291
309
|
|
|
@@ -297,13 +315,15 @@ Each test case MUST have the following standard annotations for test implementat
|
|
|
297
315
|
- **@lane**: integration | fixture-e2e | service-integration-e2e
|
|
298
316
|
- **@dependency**: none | [component names] | full-ui (mocked backend) | full-system
|
|
299
317
|
- **@complexity**: low | medium | high
|
|
318
|
+
- **Primary failure mode**: the specific regression that should make the implemented test fail
|
|
319
|
+
- **Proof obligation**: what the implemented test must assert to prove the claim, including the boundary to exercise, before/action/after state for state-changing claims, and which boundaries may be mocked with rationale. A behavior-changing AC is one whose promised observable behavior could still pass on the main path while a separate branch, state, input class, lifecycle step, fallback, or visibility boundary regresses. For behavior-changing ACs, name the boundary path the test must traverse when the main path alone would stay green through the regression
|
|
300
320
|
|
|
301
|
-
These annotations are used when planning and prioritizing test implementation.
|
|
321
|
+
These annotations are used when planning and prioritizing test implementation. Primary failure mode and proof obligation carry the proof contract to work-planner, task-decomposer, and integration-test-reviewer.
|
|
302
322
|
|
|
303
323
|
## Constraints and Quality Standards
|
|
304
324
|
|
|
305
325
|
**Mandatory Compliance**:
|
|
306
|
-
- Output test skeletons only: verification points, expected results, and
|
|
326
|
+
- Output test skeletons only: verification points, expected results, pass criteria, primary failure mode, and proof obligation
|
|
307
327
|
- Downstream consumers treat these skeletons as design artifacts rather than runnable tests
|
|
308
328
|
- Clearly state verification points, expected results, and pass criteria for each test
|
|
309
329
|
- Preserve original AC statements in comments (ensure traceability)
|
|
@@ -53,7 +53,7 @@ Skill Status:
|
|
|
53
53
|
## Input Parameters
|
|
54
54
|
|
|
55
55
|
- **designDoc**: Path to the Design Doc (or multiple paths for fullstack features)
|
|
56
|
-
- **implementationFiles**: List of files to review (or git diff range)
|
|
56
|
+
- **implementationFiles**: List of files to review (or git diff range). When a Work Plan is provided and implementationFiles is omitted or ambiguous, derive the review file set from the plan's `Review Scope` value; for revision plans, use the recorded base branch plus diff range.
|
|
57
57
|
- **reviewMode**: `full` (default) | `acceptance` | `architecture`
|
|
58
58
|
|
|
59
59
|
## Workflow
|
|
@@ -75,6 +75,9 @@ For each acceptance criterion extracted in Step 1:
|
|
|
75
75
|
- Determine status: fulfilled / partially fulfilled / unfulfilled
|
|
76
76
|
- Record the file path and relevant code location
|
|
77
77
|
- Note any deviations from the Design Doc specification
|
|
78
|
+
- For behavior-changing ACs, confirm the evidence covers main and boundary paths. Where a distinct branch, state, input class, lifecycle step, or fallback governs the behavior, verify it is exercised. Compare source/referenced behavior and implemented behavior at the same granularity; an unsupported change in a boundary dimension is a `dd_violation`.
|
|
79
|
+
- Confirm the implementation keeps the core mechanism the AC, Design Doc, or referenced materials require. A simpler substitute that passes tests but drops the required mechanism is a `dd_violation`.
|
|
80
|
+
- For changes to persisted, shared, or externally observable state, identify the publication boundary where the new state becomes observable to another process, component, user, or later step. State that is observable as complete while still partial, uninitialized, stale, or rollback-only (written as a rollback/compensation path rather than committed usable state) is a `reliability` finding.
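One common way to make a publication boundary explicit, shown here for a file-backed state as an assumed example, is to write the new state to a temporary file and rename it into place only once it is complete. `publish_json` is a hypothetical helper, not something the package defines; readers of `path` see either the previous state or the fully written new one.

```python
import json
import os
import tempfile


def publish_json(path: str, payload: dict) -> None:
    """Write `payload` so that `path` never exposes a partially written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Stage the new state next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(payload, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Publication boundary: the new state becomes observable here, in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the staged file; the previously published state is untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A reviewer applying the rule above would look for the opposite pattern: code that writes directly to the shared location, so that another process can read the state while it is still partial.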
|
|
78
81
|
|
|
79
82
|
#### 2-2. Identifier Verification
|
|
80
83
|
For each identifier specification extracted in Step 1:
|
|
@@ -124,6 +127,7 @@ Read error paths and boundary handling directly in the code:
|
|
|
124
127
|
- Meaningful coverage: at least one assertion exercises the AC's observable behavior
|
|
125
128
|
- Coverage gap: `skip`/`xit` on tests that should run, TODO/placeholder-only bodies, always-true assertions (for example `expect(true).toBe(true)` or `expect(arr.length).toBeGreaterThanOrEqual(0)`), 0-match runner reports, or grep-only matches without behavior verification
|
|
126
129
|
- Intentional absence: meaningful when absence is the AC expectation
|
|
130
|
+
- Proof adequacy: a covered test should fail under the AC's primary failure mode and should exercise the claimed boundary rather than a substitute input that bypasses it. A test that would stay green if the claimed behavior regressed is a `coverage_gap` with rationale naming the unproven failure mode.
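The proof adequacy rule can be illustrated with a toy example. Everything here is hypothetical: `apply_discount` stands in for the behavior an AC promises, and `apply_discount_regressed` simulates its primary failure mode. The weak test uses an input that bypasses the discount branch, so it stays green under the regression; the adequate test exercises the claimed boundary and fails.

```python
from typing import Callable, Optional

DiscountFn = Callable[[int, Optional[str]], int]


def apply_discount(total_cents: int, code: Optional[str]) -> int:
    """Hypothetical behavior under test: 10% off with code 'SAVE10'."""
    if code == "SAVE10":
        return total_cents * 90 // 100
    return total_cents


def apply_discount_regressed(total_cents: int, code: Optional[str]) -> int:
    """Simulated primary failure mode: the discount branch is lost."""
    return total_cents


def weak_test(fn: DiscountFn) -> bool:
    # Substitute input that never reaches the discount branch.
    return fn(10000, None) == 10000


def adequate_test(fn: DiscountFn) -> bool:
    # Traverses the boundary path the AC actually claims.
    return fn(10000, "SAVE10") == 9000
```

In the terms used above, `weak_test` would be reported as a `coverage_gap` whose rationale names the unproven failure mode (the discount not being applied).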
|
|
127
131
|
|
|
128
132
|
Classify each quality finding into one of:
|
|
129
133
|
- `dd_violation`: implementation deviates from the Design Doc
|