codex-workflows 0.6.9 → 0.7.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.agents/skills/coding-rules/SKILL.md +2 -1
- package/.agents/skills/coding-rules/references/typescript.md +3 -2
- package/.agents/skills/documentation-criteria/SKILL.md +3 -3
- package/.agents/skills/documentation-criteria/references/design-template.md +16 -5
- package/.agents/skills/documentation-criteria/references/plan-template.md +18 -4
- package/.agents/skills/documentation-criteria/references/task-template.md +11 -1
- package/.agents/skills/recipe-build/SKILL.md +1 -1
- package/.agents/skills/recipe-front-build/SKILL.md +1 -1
- package/.agents/skills/recipe-front-plan/SKILL.md +1 -1
- package/.agents/skills/recipe-front-review/SKILL.md +3 -2
- package/.agents/skills/recipe-fullstack-build/SKILL.md +1 -1
- package/.agents/skills/recipe-plan/SKILL.md +1 -1
- package/.agents/skills/recipe-prepare-implementation/SKILL.md +2 -1
- package/.agents/skills/recipe-review/SKILL.md +3 -2
- package/.agents/skills/subagents-orchestration-guide/SKILL.md +2 -2
- package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md +1 -1
- package/.agents/skills/testing/references/typescript.md +3 -2
- package/.codex/agents/code-reviewer.toml +9 -1
- package/.codex/agents/document-reviewer.toml +5 -3
- package/.codex/agents/task-decomposer.toml +29 -4
- package/.codex/agents/task-executor-frontend.toml +21 -2
- package/.codex/agents/task-executor.toml +21 -2
- package/.codex/agents/technical-designer-frontend.toml +8 -0
- package/.codex/agents/technical-designer.toml +10 -1
- package/.codex/agents/work-planner.toml +30 -9
- package/package.json +1 -1
@@ -93,7 +93,8 @@ Nearby code is a starting point for investigation, not a sufficient basis for ad
 
 ## Commenting Principles
 
--
+- Prefer names, types, and structure over comments
+- Add comments only for why, limitations, edge cases, or public API contracts
 - No historical information — use version control
 - Remove commented-out code
 - Keep comments concise and timeless
@@ -6,7 +6,8 @@
 - **No Unused "Just in Case" Code** - Violates YAGNI principle (Kent Beck)
 
 ## Comment Writing Rules
-- **
+- **Code-First Default**: Use names, types, and structure to show what the code does
+- **Intent Focus**: Use comments only for why, limitations, edge cases, or public API contracts
 - **History in Version Control**: Record development history in commits and PRs instead of code comments
 - **Timeless**: Write only content that remains valid whenever read
 - **Conciseness**: Keep explanations to necessary minimum
@@ -147,7 +148,7 @@ const response = await fetch('/api/data') // Backend handles API key authenticat
 - Delete unused code immediately
 - Delete debug `console.log()`
 - No commented-out code (manage history with version control)
-- Comments explain
+- Comments explain intent, constraints, or contracts that code cannot express directly
 
 ## Error Handling
 
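The revised comment rules in the hunks above can be illustrated with a minimal sketch. The `RetryPolicy` type and `backoffDelayMs` function are hypothetical and do not come from the package; only the commenting style is the point: names and types carry the "what", and the single comment inside the function records a "why".

```typescript
// Hypothetical example; nothing here is part of codex-workflows.
type RetryPolicy = { maxAttempts: number; baseDelayMs: number };

// Code-first: the name and the types already say what this computes.
function backoffDelayMs(policy: RetryPolicy, attempt: number): number {
  // Why: capping the exponent bounds the delay at baseDelayMs * 2 ** maxAttempts.
  const cappedAttempt = Math.min(attempt, policy.maxAttempts);
  return policy.baseDelayMs * 2 ** cappedAttempt;
}
```

A comment such as "multiply the base delay by a power of two" would only restate the code, which is what the new rules ask authors to leave out.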
@@ -21,7 +21,7 @@ description: "Documentation creation criteria for PRD, ADR, Design Doc, UI Spec,
 | New Feature Addition (backend) | PRD -> [ADR] -> Design Doc -> Work Plan | After PRD approval |
 | New Feature Addition (frontend/fullstack) | PRD -> **UI Spec** -> [ADR] -> Design Doc -> Work Plan | UI Spec before Design Doc |
 | ADR Conditions Met (see below) | ADR -> Design Doc -> Work Plan | Start immediately |
-| 6+ Files | ADR -> Design Doc -> Work Plan (REQUIRED) | Start immediately |
+| 6+ Files | [ADR if conditions apply] -> Design Doc -> Work Plan (Design Doc + Work Plan REQUIRED) | Start immediately |
 | 3-5 Files | Design Doc -> Work Plan (REQUIRED) | Start immediately |
 | 1-2 Files | None | Direct implementation |
 
@@ -81,7 +81,7 @@ description: "Documentation creation criteria for PRD, ADR, Design Doc, UI Spec,
 
 ### Work Plan
 **Purpose**: Implementation task management and progress tracking
-**Scope**: Task breakdown, dependencies, schedule estimates, test skeleton file paths, Verification Strategy summaries from each Design Doc, Design-to-Plan Traceability mapping for implementation-relevant technical requirements, ADR Bindings for implementation-binding ADR decisions, final Quality Assurance phase, and progress tracking only. Technical rationale belongs in ADR and design details belong in Design Doc.
+**Scope**: Task breakdown, dependencies, schedule estimates, test skeleton file paths, Verification Strategy summaries from each Design Doc, Design-to-Plan Traceability mapping for implementation-relevant technical requirements, Reference Contract Values for binding observable Design Doc values, ADR Bindings for implementation-binding ADR decisions, final Quality Assurance phase, and progress tracking only. Technical rationale belongs in ADR and design details belong in Design Doc.
 
 **Phase Division Criteria**:
 
@@ -124,7 +124,7 @@ description: "Documentation creation criteria for PRD, ADR, Design Doc, UI Spec,
 `Proposed` -> `Accepted` -> `Deprecated`/`Superseded`/`Rejected`
 
 ## AI Automation Rules [MANDATORY]
--
+- 6+ files: MUST evaluate ADR conditions
 - Contract/data flow change detected: ADR REQUIRED
 - Check existing ADRs before implementation — ALWAYS verify alignment
 
@@ -237,11 +237,12 @@ Rejected Alternatives Log is element-level. Future Extensibility below is design
 // Record major contract/interface definitions here
 ```
 
-### Data
+### Data Contracts
 
-#### Component
+#### [Component or Boundary] (repeat per component/boundary)
 
 ```yaml
+Contract: [interface / function / API / schema name]
 Input:
 Type: [Data shape, contract, or schema]
 Preconditions: [Required items, format constraints]
@@ -256,6 +257,14 @@ Invariants:
 - [Conditions that remain unchanged before and after processing]
 ```
 
+### Observable Contract Values (When Applicable)
+
+Use this section when the design defines observable values the implementation must reproduce exactly. Omit it when the Design Doc has no such values.
+
+| Contract Type | Required Observable Value |
+|---------------|---------------------------|
+| structure-order / derived-display / state-lifecycle-negative | [Exact column/field/label set and order, derived display rule, or condition where persisted/restored/cached/derived state remains unused] |
+
 ### Test Boundaries
 
 #### Mock Boundary Decisions
@@ -274,9 +283,11 @@ Invariants:
 
 ### Field Propagation Map (When Fields Cross Boundaries)
 
-
-
-
+A boundary includes a serialized boundary: a value encoded on one side and parsed on the other through a medium such as a query string, CLI argument, environment variable, config entry, message payload, storage key, or file. For those rows, record the exact encoded representation and how the consumer parses it. Use "-" only when the row is not a serialized boundary.
+
+| Field | Boundary | Status | Serialized Format | Consumer Parse Rule | Detail |
+|-------|----------|--------|-------------------|---------------------|--------|
+| [field name] | [Component A to B] | preserved / transformed / dropped | [exact representation the producer emits when serialized; "-" otherwise] | [how the consumer decodes and validates it; "-" otherwise] | [logic or reason] |
 
 ## Verification Strategy
 
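To make the two new Field Propagation Map columns concrete, here is a minimal sketch of one serialized boundary. The `ids` field and both function names are hypothetical, not part of the package. It uses the standard `URLSearchParams` API and shows why the template asks for the exact encoded representation: the comma separator is percent-encoded on the wire.

```typescript
// Hypothetical field: a list of numeric ids crossing a query-string boundary.

// Serialized Format: "ids=1%2C2%2C3"
// (comma-joined decimal integers; URLSearchParams percent-encodes the commas).
function encodeIds(ids: number[]): string {
  const params = new URLSearchParams();
  params.set("ids", ids.join(","));
  return params.toString();
}

// Consumer Parse Rule: decode the "ids" parameter, split on ",",
// and reject any part that is not a non-negative decimal integer.
function parseIds(query: string): number[] {
  const raw = new URLSearchParams(query).get("ids");
  if (raw === null || raw === "") return [];
  return raw.split(",").map((part) => {
    if (!/^\d+$/.test(part)) throw new Error(`invalid id: ${part}`);
    return Number(part);
  });
}
```

A Field Propagation Map row for this boundary would carry the comment text of `encodeIds` in Serialized Format and the comment text of `parseIds` in Consumer Parse Rule.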
@@ -81,6 +81,18 @@ Map each Design Doc technical requirement to the task or phase that covers it. U
 - Merge duplicate restatements of the same obligation from multiple DD sections into one row and cite the primary section in `DD Section`
 - Keep `scope-boundary` rows concrete: name the protected file group, component boundary, contract, or workflow that must remain unchanged
 
+## Reference Contract Values
+
+Include this section when a Traceability row's DD Item encodes a binding observable value the implementation must reproduce exactly: a column/label set and order, a derived-display rule where one field determines another display value, or a state-lifecycle negative that states when persisted or derived state must stay unused. Serialized boundaries belong in the Connection Map / Field Propagation Map. When a value qualifies for both this table and a serialized boundary, record it only in the Connection Map. ADR-derived structural decisions belong in ADR Bindings.
+
+The Traceability table records coverage. This table carries the required value verbatim so the covering task can check the exact contract.
+
+| Design Doc (section) | Contract Type | Required Observable Value (verbatim) | Covered By Task(s) | Gap Status | Notes |
+|----------------------|---------------|--------------------------------------|--------------------|------------|-------|
+| docs/design/xxx-design.md (Section name) | structure-order / derived-display / state-lifecycle-negative | [Exact value copied from the Design Doc] | [P1-T1] | covered | |
+
+**Gap Status values**: `covered` (mapped to one or more tasks), `gap` (no task exists yet; set Covered By Task(s) to `-`, include justification in Notes, and require user confirmation before plan approval)
+
 ## Failure Mode Checklist
 
 Domain-independent failure categories this implementation must guard against. Enumerate all eight categories, mark which apply, and list a covering task for each that applies; keep category names generic and place project-specific detail in task descriptions or notes.
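The Gap Status rules added in the hunk above lend themselves to a mechanical check. The sketch below is one possible reading, not something the package ships: the `ReferenceContractRow` shape and `rowProblems` function are hypothetical, and the user-confirmation step for `gap` rows is a human action that this check does not model.

```typescript
// Hypothetical row shape mirroring the Reference Contract Values table columns.
interface ReferenceContractRow {
  contractType: "structure-order" | "derived-display" | "state-lifecycle-negative";
  requiredValue: string; // copied verbatim from the Design Doc
  coveredBy: string;     // task IDs such as "P1-T1", or "-" for a gap
  gapStatus: "covered" | "gap";
  notes: string;
}

// Returns the rule violations for one row; an empty list means the row is consistent.
function rowProblems(row: ReferenceContractRow): string[] {
  const problems: string[] = [];
  const coveredBy = row.coveredBy.trim();
  if (row.gapStatus === "covered" && (coveredBy === "" || coveredBy === "-")) {
    problems.push("covered row must name at least one task");
  }
  if (row.gapStatus === "gap") {
    if (coveredBy !== "-") problems.push('gap row must set Covered By Task(s) to "-"');
    if (row.notes.trim() === "") problems.push("gap row must justify the gap in Notes");
  }
  return problems;
}
```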
@@ -125,11 +137,13 @@ One row represents one independently checkable binding decision. A single ADR ca
 
 ## Connection Map
 
-Include this section when implementation crosses runtime, process, deployment, or service boundaries
+Include this section when implementation crosses runtime, process, deployment, or service boundaries, or when a value is serialized and parsed across a boundary within one runtime through a query string, route parameter, form post, CLI argument, environment variable, config entry, message payload, storage key, or file.
+
+For serialized boundaries, fill Serialized Format and Consumer Parse Rule with concrete values. Use "-" only for non-serialized external signals where the Expected Signal fully captures the boundary contract.
 
-| Boundary | Caller / Producer | Callee / Consumer | Expected Signal | Covered By Task(s) |
-
-| [
+| Boundary | Caller / Producer | Callee / Consumer | Serialized Format | Consumer Parse Rule | Expected Signal | Covered By Task(s) |
+|----------|-------------------|-------------------|-------------------|---------------------|-----------------|--------------------|
+| [producing side -> consuming side] | [module/package initiating request or message] | [module/package receiving request or message] | [exact representation the producer emits, or "-"] | [how the consumer decodes and validates it, or "-"] | [Observable evidence, e.g. HTTP 200 matching schema X] | [P1-T1, P1-T2] |
 
 ## Objective
 [Why this change is necessary, what problem it solves]
@@ -33,10 +33,19 @@ Each row is an ADR decision the implementation in this task must comply with.
 |--------|------|----------|------------------|
 | docs/adr/ADR-XXXX-title.md (§ <Source Section>) | [Axis value copied verbatim from the work plan's ADR Bindings row] | [Binding decision copied from the work plan's ADR Bindings row] | [Y/N-answerable positive predicate that evaluates whether the planned and final implementation satisfy the decision] |
 
+## Reference Contracts
+(Include this section when the work plan's Reference Contract Values table covers this task. Omit otherwise.)
+
+Each row is a Design Doc-derived observable contract the implementation in this task must reproduce exactly. Serialized boundaries are carried by Boundary Context from the work plan's Connection Map. ADR-derived structural decisions are carried by Binding Decisions above.
+
+| Source | Contract Type | Required Observable Value | Compliance Check |
+|--------|---------------|---------------------------|------------------|
+| docs/design/xxx-design.md (§ Section name) | structure-order / derived-display / state-lifecycle-negative | [Required Observable Value copied verbatim from the work plan row] | [Y/N-answerable positive predicate that evaluates whether the planned and final implementation reproduces the value] |
+
 ## Investigation Notes
 Brief observations recorded after reading Investigation Targets:
 - [path] - [interfaces, control/data flow, state transitions, side effects relevant to this task]
-- When Binding Decisions exist, record the planned implementation approach and each Compliance Check result here.
+- When Binding Decisions or Reference Contracts exist, record the planned implementation approach and each Compliance Check result here.
 
 ## Implementation Steps (TDD: Red-Green-Refactor)
 ### 1. Red Phase
@@ -83,6 +92,7 @@ Brief observations recorded after reading Investigation Targets:
 - [ ] Each Proof Obligation is met: the test turns red under its primary failure mode and exercises the stated boundary
 - [ ] Deliverables created (for research/design tasks)
 - [ ] When Binding Decisions exist, every Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes
+- [ ] When Reference Contracts exist, every Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes
 
 ## Notes
 - Impact scope: [Areas where changes may propagate]
@@ -73,7 +73,7 @@ When task files don't exist, the plan references a Design Doc, and the WorkPlan
 
 ### 1. Work Plan Review
 
-Spawn document-reviewer agent: "Review the work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, Reference Contract Values fidelity, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation
@@ -73,7 +73,7 @@ When task files don't exist, the plan references a Design Doc, and the WorkPlan
 
 ### 1. Work Plan Review
 
-Spawn document-reviewer agent: "Review the frontend work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the frontend work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, Reference Contract Values fidelity, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation
@@ -53,7 +53,7 @@ Spawn acceptance-test-generator agent: "Generate test skeletons from Design Doc
 Spawn work-planner agent: "Create work plan from Design Doc at [path]. Integration test file: [path from step 2]. fixture-e2e test file: [path from step 2 or null]. service-integration-e2e test file: [path from step 2 or null]. E2E absence reasons by lane: [values from step 2 when an E2E lane is null]. Integration tests are created with each phase implementation, fixture-e2e runs alongside UI implementation, service-integration-e2e runs only in the final phase when a service E2E file exists. Include `Implementation Readiness: pending` in the work plan header."
 
 ### Step 4: Work Plan Review
-Spawn document-reviewer agent: "Review the frontend work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the frontend work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, Reference Contract Values fidelity, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then proceed to Step 5
@@ -51,8 +51,9 @@ Spawn security-reviewer agent: "Design Doc: [path]. Implementation files: [file
 **If security-reviewer returned `blocked`**: Stop immediately. Report the blocked finding and escalate to user. Do not proceed to fix steps.
 
 **Code compliance criteria (considering project stage)**:
--
--
+- `code-reviewer` verdict is `pass`
+- Coverage thresholds pass only when configured by the project, task file, work plan, or Design Doc
+- Determine pass/fail from the `code-reviewer` verdict and configured coverage thresholds; treat `complianceRate` as diagnostic context only
 
 **Security criteria**:
 - `approved` or `approved_with_notes` -> Pass
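One way to read the new code compliance criteria is as a small decision function. This is a sketch under stated assumptions: the `ReviewResult` and `CoverageCheck` shapes are hypothetical, and treating coverage as non-gating when no threshold is configured is an interpretation of "only when configured", not confirmed behavior.

```typescript
// Hypothetical shapes; only `verdict`, `pass`, and `complianceRate` are named in the diff.
interface ReviewResult {
  verdict: string;         // the diff names `pass`; other values are not enumerated there
  complianceRate?: number; // diagnostic context only, never read for the decision
}

interface CoverageCheck {
  configuredThreshold?: number; // undefined when no source configures a threshold
  measured?: number;
}

function codeCompliancePasses(review: ReviewResult, coverage: CoverageCheck): boolean {
  if (review.verdict !== "pass") return false;
  // Assumption: with no configured threshold, coverage does not gate the result.
  if (coverage.configuredThreshold === undefined) return true;
  return coverage.measured !== undefined && coverage.measured >= coverage.configuredThreshold;
}
```

Note that `complianceRate` is deliberately absent from the function body, which mirrors the instruction to treat it as diagnostic context.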
@@ -83,7 +83,7 @@ When task files don't exist, the plan references a Design Doc, and the WorkPlan
 
 ### 1. Work Plan Review
 
-Spawn document-reviewer agent: "Review the fullstack work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the fullstack work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, Reference Contract Values fidelity, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation
@@ -56,7 +56,7 @@ Present options if multiple exist (can be specified with $ARGUMENTS).
 - Spawn work-planner agent: "Create work plan from design document at [design-doc-path]. Include deliverables from previous process according to subagents-orchestration-guide skill coordination specification. If `generatedFiles.fixtureE2e` or `generatedFiles.serviceE2e` is null, use the corresponding `e2eAbsenceReason` and accept the null E2E lane as a valid planning input. Include `Implementation Readiness: pending` in the work plan header."
 
 ### Step 4: Work Plan Review
-Spawn document-reviewer agent: "Review the work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, Reference Contract Values fidelity, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then proceed to Step 5
@@ -31,7 +31,7 @@ Each criterion produces `pass`, `fail`, or `not_applicable`, with file:line evid
 
 | ID | Criterion | Pass Evidence |
 |----|-----------|---------------|
-| R1 | Verification Strategy and
+| R1 | Verification Strategy and binding references resolve | Every command, file path, function, endpoint, fixture, seed, and test reference in the work plan's Verification Strategies either exists now or is the deliverable of a task in the plan; every Reference Contract Values `covered` row references existing task IDs; every Reference Contract Values `gap` row has Notes with user-confirmation handling; every ADR Bindings source path resolves; every ADR Bindings `covered` row references existing task IDs |
 | R2 | E2E prerequisites are addressed | For each fixture-e2e or service-integration-e2e skeleton, every noted precondition is present in the codebase or covered by a Phase 0 task |
 | R3 | Phase 1 observability exists | The first implementation phase includes at least one operation verification method executable at task completion using existing files, prior Phase 0 deliverables, or the task's own output |
 | R4 | UI rendering surface exists | When the plan implements UI components, a fixture entry, dev route, Storybook story, preview harness, or equivalent render surface exists or is covered by a Phase 0 task |
@@ -47,6 +47,7 @@ Read the work plan passed in `$ARGUMENTS`; if absent, select the most recent non
 - Verification Strategies
 - Quality Assurance Mechanisms
 - Design-to-Plan Traceability
+- Reference Contract Values
 - ADR Bindings
 - UI Spec Component -> Task Mapping
 - Connection Map
@@ -53,8 +53,9 @@ Spawn security-reviewer agent: "Design Doc: [path]. Implementation files: [file
 **If security-reviewer returned `blocked`**: Stop immediately. Report the blocked finding and escalate to user. Do not proceed to fix steps.
 
 **Code compliance criteria (considering project stage)**:
--
--
+- `code-reviewer` verdict is `pass`
+- Coverage thresholds pass only when configured by the project, task file, work plan, or Design Doc
+- Determine pass/fail from the `code-reviewer` verdict and configured coverage thresholds; treat `complianceRate` as diagnostic context only
 
 **Security criteria**:
 - `approved` or `approved_with_notes` -> Pass
@@ -219,9 +219,9 @@ Work plans use the header line `Implementation Readiness: <status>`.
 
 Use this procedure after work-plan approval and before autonomous task execution when the flow needs to verify implementation readiness. The procedure supplies the evidence needed for user decisions; prompts for approval only after concrete failing criteria and proposed prep tasks are known.
 
-1. Load the approved work plan exact path and extract Verification Strategies, Quality Assurance Mechanisms, Design-to-Plan Traceability, ADR Bindings, UI Spec Component -> Task Mapping, Connection Map, test skeleton references, E2E absence reasons, phase structure, referenced Design Docs, ADRs, and UI Specs.
+1. Load the approved work plan exact path and extract Verification Strategies, Quality Assurance Mechanisms, Design-to-Plan Traceability, Reference Contract Values, ADR Bindings, UI Spec Component -> Task Mapping, Connection Map, test skeleton references, E2E absence reasons, phase structure, referenced Design Docs, ADRs, and UI Specs.
 2. Evaluate these criteria with evidence:
-- R1 Verification Strategy and
+- R1 Verification Strategy and binding references resolve
 - R2 E2E prerequisites are addressed
 - R3 Phase 1 observability exists
 - R4 UI rendering surface exists when UI work is present
@@ -105,7 +105,7 @@ work-planner's existing Integration Complete criteria naturally covers cross-lay
 
 After work-planner creates or updates the plan, spawn document-reviewer:
 
-> "Review the fullstack work plan. doc_type: WorkPlan. target: [work plan path]. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+> "Review the fullstack work plan. doc_type: WorkPlan. target: [work plan path]. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, Reference Contract Values fidelity, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 
 On `needs_revision` or `approved_with_conditions`, return to work-planner in update mode and re-review for max 2 revision iterations as defined by the `needs_revision` row in Approval Status Vocabulary. On `rejected`, halt and escalate to the user. Stop for batch approval only after WorkPlan review returns `approved` and the plan's `WorkPlan Review` section records `Status: approved` with `Conditions: none`.
 
@@ -207,8 +207,9 @@ export const test = base.extend<{ authenticatedPage: Page }>({
 
 ### E2E Budget
 
--
--
+- Follow `integration-e2e-testing` lane limits: fixture-e2e MAX 3 and service-integration-e2e MAX 1-2 per feature
+- Generate the reserved fixture-e2e user journey when eligible
+- Only generate additional non-reserved E2E tests when the lane threshold is met (`Value Score >= 20` for fixture-e2e, `Value Score > 50` for service-integration-e2e)
 - Prefer fewer comprehensive journey tests over many granular tests
 
 ### Test Isolation
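The lane thresholds in the new E2E Budget bullets differ in whether the boundary value is included, which is easy to get wrong. A minimal sketch, assuming a hypothetical `meetsLaneThreshold` helper and leaving out the per-feature lane limits and the reserved-journey rule:

```typescript
type Lane = "fixture-e2e" | "service-integration-e2e";

// Thresholds as stated in the diff: fixture-e2e is inclusive (>= 20),
// service-integration-e2e is exclusive (> 50).
function meetsLaneThreshold(lane: Lane, valueScore: number): boolean {
  return lane === "fixture-e2e" ? valueScore >= 20 : valueScore > 50;
}
```

Under this reading, a score of exactly 20 qualifies an additional fixture-e2e test, while a score of exactly 50 does not qualify a service-integration-e2e test.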
@@ -64,6 +64,7 @@ Read the Design Doc in full and extract:
 - Architecture design and data flow
 - Interface contracts (function signatures, API endpoints, data structures)
 - Identifier specifications explicitly written in the Design Doc as exact values, literals, labels, or named fields (resource names, endpoint paths, configuration keys, error codes, schema/model names)
+- Binding observable contracts: use the Design Doc's `Observable Contract Values` table as the primary source when present; otherwise extract column/field/label sets and order, derived-display rules, and state-lifecycle negatives from the Design Doc. Also extract Field Propagation Map rows that carry a Serialized Format and Consumer Parse Rule
 - Error handling policy
 - Non-functional requirements
 
@@ -78,7 +79,7 @@ For each acceptance criterion extracted in Step 1:
 - For behavior-changing ACs, confirm the evidence covers main and boundary paths. Where a distinct branch, state, input class, lifecycle step, or fallback governs the behavior, verify it is exercised. Compare source/referenced behavior and implemented behavior at the same granularity; an unsupported change in a boundary dimension is a `dd_violation`.
 - Confirm the implementation keeps the core mechanism the AC, Design Doc, or referenced materials require. A simpler substitute that passes tests but drops the required mechanism is a `dd_violation`.
 - For changes to persisted, shared, or externally observable state, identify the publication boundary where the new state becomes observable to another process, component, user, or later step. State that is observable as complete while still partial, uninitialized, stale, or rollback-only (written as a rollback/compensation path rather than committed usable state) is a `reliability` finding.
-- When the reviewed task has `Change Category` set to `bug-fix`, `regression`, `state-change`, or `boundary-change`, check cases sharing its path, contract, persisted state, or external boundary. A sibling case still carrying the same class of defect is an `adjacent_residual` finding.
+- When the reviewed task has `Change Category` set to `bug-fix`, `regression`, `state-change`, or `boundary-change`, check cases sharing its path, contract, persisted state, or external boundary. When no task field is present, classify the change from the diff itself. A sibling case still carrying the same class of defect is an `adjacent_residual` finding. When the task file is in scope, also read Investigation Notes for residuals the executor recorded as outside Target Files; verify each recorded residual and report in-scope unresolved residuals as `adjacent_residual`.
 
 #### 2-2. Identifier Verification
 For each identifier specification extracted in Step 1:
@@ -106,6 +107,13 @@ Assign confidence based on evidence count:
|
|
|
106
107
|
- medium: 2 agreeing sources
|
|
107
108
|
- low: 1 source only
|
|
108
109
|
|
|
110
|
+
#### 2-4. Reference Contract and Boundary Verification
|
|
111
|
+
|
|
112
|
+
Run this independently of the AC loop so observable contracts without dedicated ACs are verified.
|
|
113
|
+
|
|
114
|
+
1. For each binding observable value extracted in Step 1 (column/field/label set and order, derived-display rule, state-lifecycle negative), verify the implementation reproduces it exactly. A deviation is a `dd_violation` whose rationale names it a reference contract gap and states the required observable value versus the implemented value.
|
|
115
|
+
2. For each Field Propagation Map serialized boundary extracted in Step 1 (Serialized Format and Consumer Parse Rule), verify the producer emits the recorded representation and the consumer parses it by the recorded rule. A mismatch is a `dd_violation` whose rationale names it a boundary contract gap and states what the producer emits versus what the consumer parses.
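The boundary check in item 2 can be pictured as a roundtrip test. The sketch below is illustrative only: `serializeIds`, `parseIds`, `checkRoundtrip`, and the comma-separated format are hypothetical stand-ins for whatever Serialized Format and Consumer Parse Rule a Design Doc actually records.

```typescript
// Hypothetical boundary: a list of numeric IDs carried through a query string.
// Assumed Serialized Format: comma-separated decimal integers, e.g. "3,7,12".
// Assumed Consumer Parse Rule: split on "," and parse each part as a base-10 integer.

function serializeIds(ids: number[]): string {
  // Producer side: emits the recorded representation.
  return ids.join(",");
}

function parseIds(raw: string): number[] {
  // Consumer side: parses by the recorded rule.
  if (raw === "") return [];
  return raw.split(",").map((part) => Number.parseInt(part, 10));
}

interface BoundaryFinding {
  category: "dd_violation";
  rationale: string;
}

// Returns a finding when the consumer does not recover what the producer was given.
function checkRoundtrip(ids: number[]): BoundaryFinding | null {
  const emitted = serializeIds(ids);
  const parsed = parseIds(emitted);
  const matches =
    parsed.length === ids.length && parsed.every((value, i) => value === ids[i]);
  if (matches) return null;
  return {
    category: "dd_violation",
    rationale: `boundary contract gap: producer emits "${emitted}", consumer parses ${JSON.stringify(parsed)}`,
  };
}
```

In this sketch a non-integer input such as `1.5` surfaces a finding, because the assumed parse rule truncates it. That is the kind of producer/consumer mismatch the rationale is meant to state.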
|
|
116
|
+
|
|
109
117
|
### 3. Assess Code Quality
|
|
110
118
|
Read each implementation file and evaluate:
|
|
111
119
|
|
|
@@ -84,7 +84,7 @@ Skill Status:
|
|
|
84
84
|
- When `codebase_analysis` is provided, use `analysisScope`, `existingElements`, `constraints`, `qualityAssurance`, `focusAreas`, and `limitations` as source evidence for scope, feasibility, and completeness checks
|
|
85
85
|
- When `ui_analysis` is provided, use `componentStructure`, `propsPatterns`, `cssLayout`, `stateDisplay`, `displayConditions`, `accessibility`, and `candidateWriteSet` as source evidence for UI scope, feasibility, and completeness checks
|
|
86
86
|
- When `code_verification` is provided, use its discrepancies and reverse coverage as pre-verified evidence during review
|
|
87
|
-
- For WorkPlan: confirm the plan carries the artifacts the semantic gate is judged against: WorkPlan Review, Review Scope, Design-to-Plan Traceability, Verification Strategy summary, Proof Strategy, Failure Mode Checklist, and Quality Assurance Mechanisms. Read the referenced Design Doc(s), UI Spec, ADRs, and test skeletons when listed so coverage can be checked against source artifacts.
|
|
87
|
+
- For WorkPlan: confirm the plan carries the artifacts the semantic gate is judged against: WorkPlan Review, Review Scope, Design-to-Plan Traceability, Reference Contract Values when binding observable values apply, Verification Strategy summary, Proof Strategy, Failure Mode Checklist, and Quality Assurance Mechanisms. Read the referenced Design Doc(s), including `Observable Contract Values` tables when present, UI Spec, ADRs, and test skeletons when listed so coverage can be checked against source artifacts.
|
|
88
88
|
|
|
89
89
|
### Step 2: Target Document Collection
|
|
90
90
|
- Load document specified by target
|
|
@@ -134,11 +134,13 @@ For WorkPlan, additionally verify:
|
|
|
134
134
|
- **Output comparison check**: When the Design Doc changes existing observable behavior, an external contract, or a persisted data shape, verify that a concrete output comparison method is defined with identical input, expected output fields or format, and diff method. When upstream analysis includes `dataTransformationPipelines`, each listed step must be mapped to the comparison that verifies it; steps excluded because data passes through unchanged must include rationale. Missing mappings or rationale → `important` issue (category: `completeness`)
|
|
135
135
|
- **Minimal Surface Alternatives check**: Applies when the Design Doc proposes new in-scope elements as defined by coding-rules "Minimum Surface Terms". Reverse-engineer/as-is Design Docs are exempt. Missing or empty section when the trigger fires → `critical` issue (category: `completeness`). For each entry verify: (1) Step 1 lists at least one AC ID or accepted technical constraint from the Design Doc or referenced UI Spec; speculative-only linkage → `critical` issue (category: `compliance`). (2) Steps 2-3 include at least one subtractive alternative such as derive, compute on demand, keep at caller, reuse existing, or do not introduce new state/mode/abstraction; missing subtractive alternative → `important` issue (category: `compliance`). (3) Step 4 selects the smallest alternative or names a current requirement smaller alternatives fail to satisfy; primary rationale based on coding-rules subjective-only rationales → `critical` issue (category: `compliance`). (4) Step 5 records rejected alternatives with brief rationale; missing rejected alternatives log → `important` issue (category: `completeness`)
|
|
136
136
|
- **WorkPlan semantic gate**:
|
|
137
|
-
- Coverage is checked where each item lives in the plan: each acceptance criterion is covered by a task whose Completion Criteria or Proof Obligations reference the AC ID or claim identifier; each data contract, state transition, boundary, prerequisite, and protected scope item has a Design-to-Plan Traceability row mapped to a task or an explicit out-of-scope entry. Missing coverage is a `critical` issue (category: `completeness`).
|
|
137
|
+
- Coverage is checked where each item lives in the plan: each acceptance criterion is covered by a task whose Completion Criteria or Proof Obligations reference the AC ID or claim identifier; each data contract, state transition, boundary, prerequisite, and protected scope item has a Design-to-Plan Traceability row mapped to a task or an explicit out-of-scope entry; each non-serialized binding observable Design Doc value is copied into Reference Contract Values when applicable; each serialized binding value is recorded in Connection Map with concrete Serialized Format and Consumer Parse Rule. Missing coverage is a `critical` issue (category: `completeness`).
|
|
138
138
|
- Distinguish the cause for an uncovered acceptance criterion: when the source Design Doc supports it but no task maps to it, classify as a plan omission (`critical`, fixable by re-planning); when the source document or inputs give it no basis, classify as `rejected` because re-planning cannot invent the missing source requirement.
|
|
139
139
|
- Early verification must sit in an early phase rather than only the final phase. Deferral to final phase without rationale is an `important` issue (category: `consistency`).
|
|
140
140
|
- Each cross-boundary, public-boundary, browser-boundary, or persisted-state change names a task that verifies it through the real boundary. Missing real-boundary coverage is an `important` issue (category: `completeness`).
|
|
141
|
-
- Each traceability table present (Design-to-Plan, UI Spec Component, Connection Map, ADR Bindings) is filled to the granularity needed to resolve the target task. Under-specified rows are `important` issues (category: `completeness`).
|
|
141
|
+
- Each traceability table present (Design-to-Plan, Reference Contract Values, UI Spec Component, Connection Map, ADR Bindings) is filled to the granularity needed to resolve the target task. Under-specified rows are `important` issues (category: `completeness`).
|
|
142
|
+
- Reference Contract Values uses explicit `covered` or `gap` status. A `covered` row without task IDs, or a `gap` row without Notes explaining the gap and user-confirmation handling, is an `important` issue (category: `completeness`).
|
|
143
|
+
- Binding observable values are carried with content fidelity: for each Design Doc observable contract that encodes a non-serialized binding value (column/field/label set and order, derived-display rule, or state-lifecycle negative), the plan's Reference Contract Values table carries the value verbatim from the Design Doc and maps it to a covering task. A non-serialized value reduced to a label, summarized, or absent while the Design Doc specifies it is a content-fidelity gap: `critical` issue (category: `completeness`). When the value is serialized across a boundary, verify it is recorded in Connection Map instead; missing concrete Serialized Format or Consumer Parse Rule for that serialized value is a `critical` issue (category: `completeness`).
|
|
142
144
|
- The Failure Mode Checklist covers applicable domain-independent categories: same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility. Missing applicable categories are `recommended` issues (category: `completeness`).
|
|
143
145
|
- Verdict mapping: any WorkPlan semantic-gate `critical` issue forces `needs_revision`, except a coverage gap traceable to missing or contradictory source documents or inputs forces `rejected`. Important-only issues may return `approved_with_conditions`, but orchestration must route WorkPlan conditions back through work-planner update before batch approval or task decomposition.
|
|
144
146
|
- **Undetermined items review** [MANDATORY]: Every TBD, unknown, or open item MUST include: (1) **owner** — who resolves it, (2) **due** — when it gets resolved (which phase or milestone), (3) **next-phase handling** — how the next phase treats this gap. Missing any of these three → `important` issue
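The `covered` / `gap` status rule for Reference Contract Values stated in the semantic gate above lends itself to a mechanical pre-check. A minimal sketch follows, assuming a hypothetical in-memory row shape (`ReferenceContractRow`, `coveredByTasks`, `notes`); the real table columns are defined by the plan template, not by this code.

```typescript
type RowStatus = "covered" | "gap";

// Hypothetical row shape for illustration; field names are assumptions.
interface ReferenceContractRow {
  requiredObservableValue: string;
  status: RowStatus;
  coveredByTasks: string[]; // task IDs, expected to be non-empty when status is "covered"
  notes: string; // expected to explain the gap when status is "gap"
}

interface ReviewIssue {
  severity: "important";
  category: "completeness";
  message: string;
}

function checkStatusRows(rows: ReferenceContractRow[]): ReviewIssue[] {
  const issues: ReviewIssue[] = [];
  for (const row of rows) {
    if (row.status === "covered" && row.coveredByTasks.length === 0) {
      issues.push({
        severity: "important",
        category: "completeness",
        message: `covered row has no task IDs: ${row.requiredObservableValue}`,
      });
    }
    // Only detects empty Notes. Whether the Notes actually explain the gap and
    // the user-confirmation handling still needs a reviewer's judgment.
    if (row.status === "gap" && row.notes.trim() === "") {
      issues.push({
        severity: "important",
        category: "completeness",
        message: `gap row has no Notes: ${row.requiredObservableValue}`,
      });
    }
  }
  return issues;
}
```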
|
|
@@ -67,6 +67,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
|
|
|
67
67
|
- Document concrete executable procedures
|
|
68
68
|
- Include task-level Quality Assurance Mechanisms when the work plan defines them
|
|
69
69
|
- Include task-level Binding Decisions when ADR Bindings cover the task
|
|
70
|
+
- Include task-level Reference Contracts when Reference Contract Values cover the task
|
|
70
71
|
- Include task-level Proof Obligations when the work plan defines Proof Strategy, test skeleton proof annotations, or acceptance-criterion primary failure modes
|
|
71
72
|
- **Always include operation verification methods**
|
|
72
73
|
- Define clear completion criteria (within executor's scope of responsibility)
|
|
@@ -94,6 +95,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
|
|
|
94
95
|
- Extract task list
|
|
95
96
|
- Identify dependencies
|
|
96
97
|
- Read Design-to-Plan Traceability rows and verify every `covered` item has a corresponding generated task or phase completion task
|
|
98
|
+
- Read Reference Contract Values rows and verify every `covered` value has a corresponding generated task
|
|
97
99
|
- Read ADR Bindings rows and verify every `covered` item has a corresponding generated task
|
|
98
100
|
- **Overall Optimization Considerations**
|
|
99
101
|
- Identify common processing (prevent redundant implementation)
|
|
@@ -139,7 +141,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
|
|
|
139
141
|
| Integration test work | Test skeleton file, target implementation under test, existing fixture/auth/setup patterns |
|
|
140
142
|
| fixture-e2e environment/setup work | Existing fixture data, API mock layer, browser harness configuration |
|
|
141
143
|
| service-integration-e2e environment/setup work | Current environment config, startup scripts, seed scripts, auth flow references, external service stubs |
|
|
142
|
-
| Cross-boundary implementation | Connection Map rows touching the task target files, caller/producer module, callee/consumer module, expected signal, contract definition |
|
|
144
|
+
| Cross-boundary implementation | Connection Map rows touching the task target files, caller/producer module, callee/consumer module, serialized format, consumer parse rule, expected signal, contract definition |
|
|
143
145
|
| Task constrained by an ADR | ADR file with section hint matching the ADR Bindings row's Source Section value |
|
|
144
146
|
| Bug fix/refactor | Affected code paths, failing tests, reproduction-related files |
|
|
145
147
|
| Behavior replacement/rewrite | Existing implementation being replaced, observable outputs, Verification Strategy section in the Design Doc |
|
|
@@ -151,6 +153,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
|
|
|
151
153
|
- When test skeletons exist, include them explicitly
|
|
152
154
|
- When the work plan contains a UI Spec Component -> Task Mapping table, propagate matching component sections to every task listed in the row
|
|
153
155
|
- When the work plan contains a Connection Map, propagate boundary rows touching the task's target files to every task on either side of the boundary
|
|
156
|
+
- When the work plan contains a Reference Contract Values table, propagate matching rows to every covered task
|
|
154
157
|
- When the work plan contains an ADR Bindings table, propagate matching binding rows to every covered task
|
|
155
158
|
- When a task matches multiple natures, include Investigation Targets from all matching rows and deduplicate overlaps
|
|
156
159
|
|
|
@@ -244,6 +247,25 @@ When the work plan includes an `ADR Bindings` section:
|
|
|
244
247
|
- `persistence`: `User records are persisted through the UsersRepository interface`
|
|
245
248
|
6. **Validation**: Treat missing Axis, unknown Axis value, or non-checkable Compliance Check as an incomplete task file. Non-checkable means the implementation cannot be observed and answered as `Y` or `N`, or the predicate is written as a negative or compound condition.
|
|
246
249
|
|
|
250
|
+
## Reference Contract Propagation
|
|
251
|
+
|
|
252
|
+
When the work plan includes a `Reference Contract Values` section:
|
|
253
|
+
|
|
254
|
+
1. **Coverage preservation**: For each row marked `covered`, locate every task ID listed in `Covered By Task(s)`.
|
|
255
|
+
2. **Gap handling**: Preserve each `gap` row as a planning issue and surface it to the caller when task generation would otherwise assume the observable value is implemented.
|
|
256
|
+
3. **Investigation Targets**: Add the row's Design Doc path with section hint to each matched task. Deduplicate against Design-to-Plan Traceability targets.
|
|
257
|
+
4. **Reference Contracts table**: Add one row to each matched task's `Reference Contracts` section:
|
|
258
|
+
- `Source`: Design Doc path with section hint
|
|
259
|
+
- `Contract Type`: value copied verbatim from the work plan row
|
|
260
|
+
- `Required Observable Value`: value copied verbatim from the work plan row, preserving exact wording, field order, labels, and conditions
|
|
261
|
+
- `Compliance Check`: a Y/N-answerable positive predicate that evaluates whether the planned and final implementation reproduces the value
|
|
262
|
+
5. **Predicate shape**: Write each Compliance Check as a concrete positive statement. Examples:
|
|
263
|
+
- `The listed columns render in the specified order`
|
|
264
|
+
- `The label shows the looked-up name in place of the raw code`
|
|
265
|
+
- `The restored state is applied only when the explicit restore signal is present`
|
|
266
|
+
6. **Boundary ownership**: Connection Map Propagation carries serialized boundaries. Reference Contract rows carry non-serialized observable values. When a work plan records the same value in both places, keep Connection Map propagation and surface the duplicate Reference Contract row as a planning issue.
|
|
267
|
+
7. **Validation**: Treat missing Contract Type, missing Required Observable Value, non-checkable Compliance Check, or `covered` row without task IDs as an incomplete task file.
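The propagation steps above can be sketched as a small function. This is a sketch under stated assumptions, not the decomposer's actual implementation: the type names, the `coveredByTasks` field, and the issue strings are hypothetical, and the Compliance Check column is left out because it has to be written per task as a positive predicate.

```typescript
type PlanRowStatus = "covered" | "gap";

// Hypothetical shape of a work plan `Reference Contract Values` row.
interface PlanReferenceContractRow {
  source: string; // Design Doc path with section hint
  contractType: string;
  requiredObservableValue: string;
  status: PlanRowStatus;
  coveredByTasks: string[];
}

// Hypothetical shape of a task-level `Reference Contracts` row (without Compliance Check).
interface TaskReferenceContract {
  source: string;
  contractType: string;
  requiredObservableValue: string;
}

interface PropagationResult {
  byTask: Map<string, TaskReferenceContract[]>;
  planningIssues: string[];
}

function propagateReferenceContracts(rows: PlanReferenceContractRow[]): PropagationResult {
  const byTask = new Map<string, TaskReferenceContract[]>();
  const planningIssues: string[] = [];
  for (const row of rows) {
    if (row.status === "gap") {
      // Gap handling: surface the row instead of assuming the value is implemented.
      planningIssues.push(`gap: ${row.source}`);
      continue;
    }
    if (row.coveredByTasks.length === 0) {
      planningIssues.push(`covered row without task IDs: ${row.source}`);
      continue;
    }
    for (const taskId of row.coveredByTasks) {
      const existing = byTask.get(taskId) ?? [];
      // Values are copied as-is so wording, field order, labels, and conditions are preserved.
      existing.push({
        source: row.source,
        contractType: row.contractType,
        requiredObservableValue: row.requiredObservableValue,
      });
      byTask.set(taskId, existing);
    }
  }
  return { byTask, planningIssues };
}
```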
|
|
268
|
+
|
|
247
269
|
## UI Spec Propagation
|
|
248
270
|
|
|
249
271
|
When the work plan includes a `UI Spec Component -> Task Mapping` section:
|
|
@@ -259,9 +281,11 @@ When the work plan includes a `UI Spec Component -> Task Mapping` section:
|
|
|
259
281
|
When the work plan includes a `Connection Map` section:
|
|
260
282
|
|
|
261
283
|
1. For each boundary row, locate all tasks listed in `Covered By Task(s)`.
|
|
262
|
-
2. Add the caller/producer module, callee/consumer module, serialized
|
|
263
|
-
3. For
|
|
264
|
-
4.
|
|
284
|
+
2. Add the caller/producer module, callee/consumer module, serialized format, consumer parse rule, and expected signal to each listed task's Investigation Targets or Notes.
|
|
285
|
+
3. For serialized boundaries, require concrete serialized format and consumer parse rule values. Treat `-` in either column as an incomplete plan row unless the row is a non-serialized external signal whose Expected Signal fully captures the contract.
|
|
286
|
+
4. For serialized in-runtime boundaries, include a Boundary Context note that states the producer value, the consumer parse rule, and the roundtrip check: the emitted value parses to the value the consumer expects.
|
|
287
|
+
5. For tasks on one side of a boundary, include an Operation Verification Method that observes the expected signal from the other side.
|
|
288
|
+
6. Propagate only boundary rows explicitly mapped in the work plan.
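The `-` rule in step 3 can be expressed as a predicate. The sketch below assumes a hypothetical row shape, including a `serialized` flag that the real Connection Map may not carry as a column. "Expected Signal fully captures the contract" is approximated as "Expected Signal is filled in", which is weaker than the judgment the rule asks for.

```typescript
// Hypothetical Connection Map row; field names are assumptions for illustration.
interface ConnectionMapRow {
  boundary: string;
  serialized: boolean; // assumed flag: does the value cross an encoded medium?
  serializedFormat: string;
  consumerParseRule: string;
  expectedSignal: string;
}

function isIncompleteRow(row: ConnectionMapRow): boolean {
  const hasPlaceholder =
    row.serializedFormat.trim() === "-" || row.consumerParseRule.trim() === "-";
  if (!hasPlaceholder) return false;
  // `-` is tolerated only for a non-serialized external signal with a filled Expected Signal.
  const signal = row.expectedSignal.trim();
  const exempt = !row.serialized && signal !== "" && signal !== "-";
  return !exempt;
}
```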
|
|
265
289
|
|
|
266
290
|
## Change Category Classification
|
|
267
291
|
|
|
@@ -368,6 +392,7 @@ Please execute decomposed tasks according to the order.
|
|
|
368
392
|
- [ ] Investigation Targets specified for every task
|
|
369
393
|
- [ ] Change Category set for bug-fix, regression, state-change, or boundary-change tasks, with adjacent path/boundary owners added to Investigation Targets
|
|
370
394
|
- [ ] Quality Assurance Mechanisms propagated to relevant tasks when present in the plan header
|
|
395
|
+
- [ ] Reference Contract Values rows propagated to relevant tasks when present in the work plan
|
|
371
396
|
- [ ] ADR Bindings rows propagated to relevant tasks when present in the work plan
|
|
372
397
|
- [ ] ADR source includes section hint
|
|
373
398
|
- [ ] Axis copied verbatim from the work plan row
|
|
@@ -181,6 +181,24 @@ Run this check after Pre-implementation Verification and before behavior-first i
|
|
|
181
181
|
- `N`: stop implementation and return `status: "escalation_needed"` with `escalation_type: "binding_decision_violation"` and `phase: "pre_implementation"`
|
|
182
182
|
- `Unknown`: record the row as deferred in Investigation Notes and proceed to behavior-first implementation. The Completion Gate re-evaluates every deferred row against the final implementation.
|
|
183
183
|
|
|
184
|
+
#### Reference Contract Check (Required when the task file has a Reference Contracts section)
|
|
185
|
+
|
|
186
|
+
Run this check after Pre-implementation Verification and before behavior-first implementation when the task file contains a Reference Contracts section with one or more rows.
|
|
187
|
+
|
|
188
|
+
1. Confirm each Source in the Reference Contracts table has been read. Sources should also appear in Investigation Targets.
|
|
189
|
+
2. Verify source fidelity before planning: locate the Required Observable Value in the Source and confirm it matches verbatim. If the value is not found or the task row differs by summary, omission, reordered fields, changed labels, or changed conditions, stop implementation and return `status: "escalation_needed"` with `escalation_type: "design_compliance_violation"`. Set `details.design_doc_expectation` to the Source value or "source value not found" and `details.actual_situation` to the task row's Required Observable Value.
|
|
190
|
+
3. Use the Investigation Notes format below while recording the planned approach and evaluation results.
|
|
191
|
+
- `### Reference Contracts Evaluation`
|
|
192
|
+
- `[source] planned: [one sentence planned approach]`
|
|
193
|
+
- `[source] source fidelity -> Y|N - [one-line rationale]`
|
|
194
|
+
- `[source] [Compliance Check] -> Y|N|Unknown - [one-line rationale]`
|
|
195
|
+
4. Record the planned implementation approach in Investigation Notes, one sentence per row.
|
|
196
|
+
5. Evaluate each row's Compliance Check against the planned approach. Record the result as `Y`, `N`, or `Unknown` with a one-line rationale.
|
|
197
|
+
6. Branch per row:
|
|
198
|
+
- `Y`: proceed
|
|
199
|
+
- `N`: stop implementation and return `status: "escalation_needed"` with `escalation_type: "design_compliance_violation"`. Set `details.design_doc_expectation` to the row's Required Observable Value and `details.actual_situation` to the planned approach.
|
|
200
|
+
- `Unknown`: record the row as deferred in Investigation Notes and proceed to behavior-first implementation. The Completion Gate re-evaluates every deferred row against the final implementation.
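The per-row branching in step 6 can be sketched as follows. The `status`, `escalation_type`, and `details` keys come from the text above; everything else (`EvaluatedRow`, `CheckOutcome`, `branchReferenceContractCheck`) is a hypothetical name, and the escalation object shows only the fields this section mentions, not the full response format.

```typescript
type CheckResult = "Y" | "N" | "Unknown";

// Hypothetical record of one evaluated Reference Contracts row.
interface EvaluatedRow {
  source: string;
  requiredObservableValue: string;
  plannedApproach: string;
  result: CheckResult;
}

// Partial escalation payload: only the fields named in the check description.
interface Escalation {
  status: "escalation_needed";
  escalation_type: "design_compliance_violation";
  details: { design_doc_expectation: string; actual_situation: string };
}

type CheckOutcome =
  | { kind: "proceed"; deferredSources: string[] }
  | { kind: "escalate"; escalation: Escalation };

function branchReferenceContractCheck(rows: EvaluatedRow[]): CheckOutcome {
  const deferredSources: string[] = [];
  for (const row of rows) {
    if (row.result === "N") {
      // Stop at the first failing row and report expectation versus plan.
      return {
        kind: "escalate",
        escalation: {
          status: "escalation_needed",
          escalation_type: "design_compliance_violation",
          details: {
            design_doc_expectation: row.requiredObservableValue,
            actual_situation: row.plannedApproach,
          },
        },
      };
    }
    if (row.result === "Unknown") {
      // Deferred rows are re-evaluated at the Completion Gate.
      deferredSources.push(row.source);
    }
  }
  return { kind: "proceed", deferredSources };
}
```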
|
|
201
|
+
|
|
184
202
|
#### Reference Representativeness (Applied During Implementation)
|
|
185
203
|
|
|
186
204
|
During implementation, apply coding-rules Reference Representativeness before adopting existing patterns, UI composition, or dependency versions. Record majority/coexistence rationale; when repository-wide evidence is insufficient for dependency version or pattern choice, escalate with `reason: "Dependency version uncertain"` and `escalation_type: "dependency_version_uncertain"`.
|
|
@@ -240,7 +258,7 @@ Report in the following JSON format upon task completion (**without executing qu
|
|
|
240
258
|
#### 2-1. Design Doc Deviation Escalation
|
|
241
259
|
When unable to implement per Design Doc, escalate in the following JSON format:
|
|
242
260
|
|
|
243
|
-
Use Binding Decision Violation Escalation instead when the task has a Binding Decisions row covering the same issue.
|
|
261
|
+
Use Binding Decision Violation Escalation instead when the task has a Binding Decisions row covering the same issue. Use this Design Doc Deviation Escalation for Reference Contracts failures.
|
|
244
262
|
For task/AC/UI Spec/reference core-mechanism sources, set `details.design_doc_expectation` to `[source type] [location]: [cited expectation]`.
|
|
245
263
|
For core-mechanism violations, put the substitute in `details.actual_situation`, the behavior change in `details.why_cannot_implement`, and the unblock condition in `recommendation`.
|
|
246
264
|
|
|
@@ -298,12 +316,13 @@ Triggered when the Test Environment Check finds the project-configured test tool
|
|
|
298
316
|
☐ Implementation is consistent with the observations recorded in Investigation Notes
|
|
299
317
|
☐ Final implementation preserves the required core mechanism from the task, AC, Design Doc, UI Spec, or referenced materials, with evidence recorded in Investigation Notes or runnableCheck.reason
|
|
300
318
|
☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section)
|
|
319
|
+
☐ Every Reference Contracts row has source fidelity `Y` and Compliance Check `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Reference Contracts section)
|
|
301
320
|
☐ When test runs are cited as `runnableCheck` evidence, they are substantive per the `runnableCheck.result` field spec; non-test verification is evaluated by command success
|
|
302
321
|
☐ Output format validated (JSON response with all required fields)
|
|
303
322
|
☐ Quality standards satisfied (tests pass, progress updated)
|
|
304
323
|
☐ Final response is a single JSON with status `completed` or `escalation_needed`
|
|
305
324
|
|
|
306
|
-
**ENFORCEMENT**: HALT if any gate unchecked. Return `status: "escalation_needed"` to caller. Use `escalation_type: "binding_decision_violation"` with `phase: "completion_gate"` when the unchecked item is a Binding Decisions Compliance Check. Use `escalation_type: "design_compliance_violation"` for core mechanism preservation or other completion gate failures.
|
|
325
|
+
**ENFORCEMENT**: HALT if any gate unchecked. Return `status: "escalation_needed"` to caller. Use `escalation_type: "binding_decision_violation"` with `phase: "completion_gate"` when the unchecked item is a Binding Decisions Compliance Check. Use `escalation_type: "design_compliance_violation"` for Reference Contracts, core mechanism preservation, or other completion gate failures.
|
|
307
326
|
|
|
308
327
|
"""
|
|
309
328
|
|
|
@@ -181,6 +181,24 @@ Run this check after Pre-implementation Verification and before the TDD cycle wh
|
|
|
181
181
|
- `N`: stop implementation and return `status: "escalation_needed"` with `escalation_type: "binding_decision_violation"` and `phase: "pre_implementation"`
|
|
182
182
|
- `Unknown`: record the row as deferred in Investigation Notes and proceed to the TDD cycle. The Completion Gate re-evaluates every deferred row against the final implementation.
|
|
183
183
|
|
|
184
|
+
#### Reference Contract Check (Required when the task file has a Reference Contracts section)
|
|
185
|
+
|
|
186
|
+
Run this check after Pre-implementation Verification and before the TDD cycle when the task file contains a Reference Contracts section with one or more rows.
|
|
187
|
+
|
|
188
|
+
1. Confirm each Source in the Reference Contracts table has been read. Sources should also appear in Investigation Targets.
|
|
189
|
+
2. Verify source fidelity before planning: locate the Required Observable Value in the Source and confirm it matches verbatim. If the value is not found or the task row differs by summary, omission, reordered fields, changed labels, or changed conditions, stop implementation and return `status: "escalation_needed"` with `escalation_type: "design_compliance_violation"`. Set `details.design_doc_expectation` to the Source value or "source value not found" and `details.actual_situation` to the task row's Required Observable Value.
|
|
190
|
+
3. Use the Investigation Notes format below while recording the planned approach and evaluation results.
|
|
191
|
+
- `### Reference Contracts Evaluation`
|
|
192
|
+
- `[source] planned: [one sentence planned approach]`
|
|
193
|
+
- `[source] source fidelity -> Y|N - [one-line rationale]`
|
|
194
|
+
- `[source] [Compliance Check] -> Y|N|Unknown - [one-line rationale]`
|
|
195
|
+
4. Record the planned implementation approach in Investigation Notes, one sentence per row.
|
|
196
|
+
5. Evaluate each row's Compliance Check against the planned approach. Record the result as `Y`, `N`, or `Unknown` with a one-line rationale.
|
|
197
|
+
6. Branch per row:
|
|
198
|
+
- `Y`: proceed
|
|
199
|
+
- `N`: stop implementation and return `status: "escalation_needed"` with `escalation_type: "design_compliance_violation"`. Set `details.design_doc_expectation` to the row's Required Observable Value and `details.actual_situation` to the planned approach.
|
|
200
|
+
- `Unknown`: record the row as deferred in Investigation Notes and proceed to the TDD cycle. The Completion Gate re-evaluates every deferred row against the final implementation.
|
|
201
|
+
|
|
184
202
|
#### Reference Representativeness (Applied During Implementation)
|
|
185
203
|
|
|
186
204
|
During implementation, apply coding-rules Reference Representativeness before adopting existing patterns, API usage, or dependency versions. Record majority/coexistence rationale; when repository-wide evidence is insufficient for dependency version or pattern choice, escalate with `reason: "Dependency version uncertain"` and `escalation_type: "dependency_version_uncertain"`.
|
|
@@ -239,7 +257,7 @@ Report in the following JSON format upon task completion (**without executing qu
|
|
|
239
257
|
#### 2-1. Design Doc Deviation Escalation
|
|
240
258
|
When unable to implement per Design Doc, escalate in the following JSON format:
|
|
241
259
|
|
|
242
|
-
Use Binding Decision Violation Escalation instead when the task has a Binding Decisions row covering the same issue.
|
|
260
|
+
Use Binding Decision Violation Escalation instead when the task has a Binding Decisions row covering the same issue. Use this Design Doc Deviation Escalation for Reference Contracts failures.
|
|
243
261
|
For task/AC/reference core-mechanism sources, set `details.design_doc_expectation` to `[source type] [location]: [cited expectation]`.
|
|
244
262
|
For core-mechanism violations, put the substitute in `details.actual_situation`, the behavior change in `details.why_cannot_implement`, and the unblock condition in `recommendation`.
|
|
245
263
|
|
|
@@ -297,12 +315,13 @@ Triggered when the Test Environment Check finds the project-configured test tool
|
|
|
297
315
|
☐ Implementation is consistent with the observations recorded in Investigation Notes
|
|
298
316
|
☐ Final implementation preserves the required core mechanism from the task, AC, Design Doc, or referenced materials, with evidence recorded in Investigation Notes or runnableCheck.reason
|
|
299
317
|
☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section)
|
|
318
|
+
☐ Every Reference Contracts row has source fidelity `Y` and Compliance Check `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Reference Contracts section)
|
|
300
319
|
☐ When test runs are cited as `runnableCheck` evidence, they are substantive per the `runnableCheck.result` field spec; non-test verification is evaluated by command success
|
|
301
320
|
☐ Output format validated (JSON response with all required fields)
|
|
302
321
|
☐ Quality standards satisfied (tests pass, progress updated)
|
|
303
322
|
☐ Final response is a single JSON with status `completed` or `escalation_needed`
|
|
304
323
|
|
|
305
|
-
**ENFORCEMENT**: HALT if any gate unchecked. Return `status: "escalation_needed"` to caller. Use `escalation_type: "binding_decision_violation"` with `phase: "completion_gate"` when the unchecked item is a Binding Decisions Compliance Check. Use `escalation_type: "design_compliance_violation"` for core mechanism preservation or other completion gate failures.
|
|
324
|
+
**ENFORCEMENT**: HALT if any gate unchecked. Return `status: "escalation_needed"` to caller. Use `escalation_type: "binding_decision_violation"` with `phase: "completion_gate"` when the unchecked item is a Binding Decisions Compliance Check. Use `escalation_type: "design_compliance_violation"` for Reference Contracts, core mechanism preservation, or other completion gate failures.
|
|
306
325
|
|
|
307
326
|
"""
|
|
308
327
|
|
|
@@ -85,6 +85,14 @@ For each integration boundary, define:
|
|
|
85
85
|
- Input props or consumed context
|
|
86
86
|
- Output events or effects
|
|
87
87
|
- On Error behavior
|
|
88
|
+
- Serialized Format and Consumer Parse Rule when a value crosses through a query string, route parameter, form post, storage key, config entry, message payload, file, or similar encoded medium
|
|
89
|
+
|
|
90
|
+
When the design contains observable values the implementation must reproduce exactly, record them explicitly in the Design Doc using the `Observable Contract Values` table:
|
|
91
|
+
- `structure-order`: column sets, field sets, label sets, or display order
|
|
92
|
+
- `derived-display`: display value derived from another field, lookup, state, or configuration
|
|
93
|
+
- `state-lifecycle-negative`: condition where persisted, restored, cached, or derived state must stay unused
|
|
94
|
+
|
|
95
|
+
Write each Required Observable Value as a copyable exact value, not a summary. If a value is serialized and parsed across a boundary, record it in Field Propagation Map / Connection Map instead of this table.
|
|
88
96
|
|
|
89
97
|
### Minimal Surface Alternatives【Required when introducing maintenance-surface elements】
|
|
90
98
|
|
|
@@ -160,9 +160,18 @@ Record direct impact, indirect impact, and explicitly unaffected components in t
 ### Field Propagation Map【Required】
 When new or changed fields cross component boundaries:
 
-Document each field's status (preserved / transformed / dropped) at each boundary with rationale.
+Document each field's status (preserved / transformed / dropped) at each boundary with rationale. When a value is serialized and parsed through a query string, route parameter, form post, CLI argument, environment variable, config entry, message payload, storage key, or file, record the exact Serialized Format and Consumer Parse Rule.
 Skip if no fields cross component boundaries.
 
+### Observable Contract Values【Required when applicable】
+When the design contains observable values the implementation must reproduce exactly, record them explicitly in the Design Doc using the `Observable Contract Values` table:
+
+- `structure-order`: column sets, field sets, label sets, or display order
+- `derived-display`: display value derived from another field, lookup, state, or configuration
+- `state-lifecycle-negative`: condition where persisted, restored, cached, or derived state must stay unused
+
+Write each Required Observable Value as a copyable exact value, not a summary. If a value is serialized and parsed across a boundary, record it in Field Propagation Map / Connection Map instead of this table.
+
 ### Interface Change Impact Analysis【Required】
 
 Record existing operation, new operation, conversion need, adapter/wrapper need, and compatibility method. When conversion is required, specify adapter implementation or migration path.
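A minimal sketch, not part of the package, of a Serialized Format paired with a Consumer Parse Rule for a value that crosses a query-string boundary. The `status` parameter, the comma-joined format, and both function names are assumptions made for illustration; the calls used are from Python's standard `urllib.parse` module.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical contract for a `status` filter crossing a query-string boundary.
# Serialized Format:   status=<value>[,<value>...]  (lowercase, comma-joined)
# Consumer Parse Rule: split on ",", drop empty items, keep order


def serialize_status_filter(statuses: list) -> str:
    """Producer side: encode the filter according to the Serialized Format."""
    return urlencode({"status": ",".join(s.lower() for s in statuses)})


def parse_status_filter(query: str) -> list:
    """Consumer side: decode the filter according to the Consumer Parse Rule."""
    raw = parse_qs(query).get("status", [""])[0]
    return [item for item in raw.split(",") if item]


query = serialize_status_filter(["Open", "Closed"])
round_tripped = parse_status_filter(query)
```

Writing both halves down keeps producer and consumer from drifting apart, for example one side joining with `,` while the other splits on `;`.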
@@ -141,20 +141,38 @@ Rules:
 - Record the mapping in the `UI Spec Component -> Task Mapping` table from the plan template.
 - Mark components with no covering task as `gap` with justification and user confirmation before approval.
 
-### 5b. Map
+### 5b. Map Reference Contract Values to Tasks
 
-
+After Design-to-Plan Traceability is complete, create a `Reference Contract Values` table when any traced Design Doc item contains a binding observable value the implementation must reproduce exactly. When the Design Doc has an `Observable Contract Values` table, use that table as the primary source and copy each applicable row into the work plan.
+
+Qualifying observable values:
+- `structure-order`: column sets, field sets, label sets, or display order
+- `derived-display`: display value derived from another field, lookup, state, or configuration
+- `state-lifecycle-negative`: condition where persisted, restored, cached, or derived state must stay unused
+
+For each qualifying value:
+1. Record the Design Doc path and section.
+2. Classify the Contract Type using exactly one value above.
+3. Copy the Required Observable Value verbatim from the Design Doc. Preserve field names, labels, order, and conditions.
+4. Map it to the task IDs that implement or verify the value.
+5. Mark `covered` when concrete task IDs cover the value. Mark `gap` only when no concrete task covers it, set Covered By Task(s) to `-`, add justification in Notes, and flag it for user confirmation before plan approval.
+
+Serialized boundaries belong in the Connection Map. ADR-derived structural decisions belong in ADR Bindings. When a value qualifies both as an observable value and as a serialized boundary, record it in the Connection Map and omit a duplicate Reference Contract Values row.
+
+### 5c. Map Runtime Boundaries to Tasks
+
+When implementation crosses runtime, process, deployment, or service boundaries, or when a value is serialized and parsed within one runtime, create a `Connection Map`.
 
 A boundary qualifies only when all of the following hold:
--
--
-- A
+- Two sides exchange a value through a boundary. This includes separate processes, services, runtimes, or deployed artifacts, and same-runtime serialized media such as query strings, route parameters, form posts, CLI arguments, environment variables, config entries, message payloads, storage keys, or files.
+- Producer and consumer depend on a shared representation or parse rule.
+- A mismatch creates an observable signal, such as a status code, timeout, missing field, dropped message, invalid route state, parse failure, or persisted row difference.
 
-Map only boundaries satisfying all three qualifications above
+Map only boundaries satisfying all three qualifications above.
 
-For each boundary, record the caller/producer, callee/consumer, expected signal, and covering tasks in the `Connection Map` table.
+For each boundary, record the caller/producer, callee/consumer, serialized format, consumer parse rule, expected signal, and covering tasks in the `Connection Map` table. Serialized boundaries must have concrete Serialized Format and Consumer Parse Rule values. Use `-` only for non-serialized external signals where the Expected Signal fully captures the contract.
 
-###
+### 5d. Map ADR Decisions to Tasks
 
 When ADRs are provided as input or listed in a Design Doc's "Prerequisite ADRs" section, create an `ADR Bindings` table before finalizing tasks.
 
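The three-part qualification test for a `Connection Map` entry in section 5c can be sketched as a simple predicate. This is an illustration only, not code from the package; the `BoundaryCandidate` class, its field names, and the sample boundaries are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoundaryCandidate:
    """Hypothetical record describing a candidate boundary."""

    name: str
    exchanges_value: bool            # two sides exchange a value through a boundary
    shared_representation: bool      # both sides depend on a shared format or parse rule
    mismatch_signal: Optional[str]   # observable signal on mismatch, e.g. "parse failure"


def qualifies_for_connection_map(candidate: BoundaryCandidate) -> bool:
    """A boundary qualifies only when all three conditions hold."""
    return (
        candidate.exchanges_value
        and candidate.shared_representation
        and bool(candidate.mismatch_signal)
    )


# Assumed examples: a same-runtime query-string boundary and an in-memory call.
route_param = BoundaryCandidate("list page -> detail page query string", True, True, "invalid route state")
helper_call = BoundaryCandidate("in-memory helper call", True, False, None)
```

Under these assumptions the query-string boundary would be mapped even though it stays within one runtime, while the plain function call would not.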
@@ -177,7 +195,7 @@ Mapping rules:
 - Acceptance criteria and required user-visible behaviors belong in `Design-to-Plan Traceability`; `ADR Bindings` covers structural implementation constraints.
 - If an ADR decision constrains the design but no task covers it, add a justified gap and flag it for user confirmation before plan approval.
 
-###
+### 5e. Build Failure Mode Checklist
 
 Populate the plan template's `Failure Mode Checklist` before finalizing tasks. Enumerate all eight domain-independent categories, mark whether each applies, and list the task IDs that cover applicable categories. Keep category names generic and put project-specific details in task descriptions or notes.
 
@@ -390,6 +408,9 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia
 - [ ] Scope-boundary items mapped explicitly when the Design Doc defines protected no-change areas
 - [ ] Covered By Task(s) uses only normalized task IDs
 - [ ] No unjustified `gap` entries remain
+- [ ] Reference Contract Values table completed when binding observable values appear in traced Design Doc items
+- [ ] Each row mapped to covering task(s) or justified `gap`
+- [ ] No unjustified `gap` entries remain
 - [ ] ADR Bindings table completed when ADRs are provided or listed in Design Doc prerequisites
 - [ ] ADR references resolved with exact path or single `docs/adr/ADR-XXXX-*.md` match
 - [ ] Each binding row has one valid Axis value
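As a rough illustration of the `covered` / `gap` checks in the checklist above, the sketch below validates rows of a hypothetical in-memory `Reference Contract Values` table. The row class, field names, and task ID format are assumptions and do not come from the package.

```python
from dataclasses import dataclass


@dataclass
class ReferenceContractRow:
    """Hypothetical row of a Reference Contract Values table."""

    required_value: str
    covered_by: str   # comma-separated task IDs, or "-" for a gap
    status: str       # "covered" or "gap"
    notes: str = ""


def row_problems(row: ReferenceContractRow) -> list:
    """Return the checklist violations found in one row (empty list if none)."""
    problems = []
    if row.status == "covered":
        if row.covered_by.strip() in ("", "-"):
            problems.append("covered row lists no task IDs")
    elif row.status == "gap":
        if row.covered_by.strip() != "-":
            problems.append("gap row must set Covered By Task(s) to '-'")
        if not row.notes.strip():
            problems.append("gap row needs a justification in Notes")
    else:
        problems.append(f"unknown status: {row.status}")
    return problems


# Assumed sample rows; "T-03" is a made-up task ID.
ok_row = ReferenceContractRow("Order ID, Customer, Total", "T-03", "covered")
bad_gap = ReferenceContractRow("cached draft must stay unused", "-", "gap")
```

A justified gap passes the check once its Notes field explains why no task covers the value; user confirmation remains a manual step.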
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "codex-workflows",
-"version": "0.
+"version": "0.7.1",
 "description": "Task-oriented agentic coding framework for OpenAI Codex CLI — skills, recipes, and subagents for structured development workflows",
 "license": "MIT",
 "author": "Shinsuke Kagawa",