codex-workflows 0.2.2 → 0.2.4
- package/.agents/skills/documentation-criteria/SKILL.md +3 -3
- package/.agents/skills/documentation-criteria/references/design-template.md +1 -26
- package/.agents/skills/documentation-criteria/references/plan-template.md +3 -18
- package/.agents/skills/recipe-add-integration-tests/SKILL.md +58 -18
- package/.agents/skills/recipe-diagnose/SKILL.md +20 -4
- package/.agents/skills/recipe-reverse-engineer/SKILL.md +13 -5
- package/.codex/agents/code-verifier.toml +53 -20
- package/.codex/agents/investigator.toml +14 -15
- package/.codex/agents/prd-creator.toml +39 -24
- package/.codex/agents/scope-discoverer.toml +23 -27
- package/.codex/agents/task-decomposer.toml +1 -1
- package/.codex/agents/technical-designer-frontend.toml +70 -117
- package/.codex/agents/technical-designer.toml +72 -116
- package/.codex/agents/verifier.toml +5 -12
- package/.codex/agents/work-planner.toml +7 -6
- package/package.json +1 -1
|
package/.agents/skills/documentation-criteria/SKILL.md
@@ -64,16 +64,16 @@ description: "Documentation creation criteria for PRD, ADR, Design Doc, UI Spec,
|
|
|
64
64
|
### UI Specification
|
|
65
65
|
**Purpose**: Define UI structure, screen transitions, component decomposition, and interaction design
|
|
66
66
|
**Includes**: Screen list and transitions, component state x display matrix, interaction definitions, AC traceability, existing component reuse map, accessibility requirements
|
|
67
|
-
**Excludes**: Technical implementation details, API contracts, test implementation, implementation schedule
|
|
67
|
+
**Excludes**: Technical implementation details, API contracts, test implementation (generated by acceptance-test-generator), implementation schedule
|
|
68
68
|
|
|
69
69
|
### Design Document
|
|
70
70
|
**Purpose**: Define technical implementation methods in detail
|
|
71
71
|
**Includes**: Existing codebase analysis, technical approach, dependencies and constraints, interface/contract definitions, data flow, acceptance criteria, change impact map, code inspection evidence
|
|
72
|
-
**Excludes**: Why that technology was chosen (reference ADR), when/who to implement (reference Work Plan)
|
|
72
|
+
**Excludes**: Why that technology was chosen (reference ADR), when/who to implement (reference Work Plan), detailed test strategy and test case selection (generated by acceptance-test-generator from acceptance criteria)
|
|
73
73
|
|
|
74
74
|
### Work Plan
|
|
75
75
|
**Purpose**: Implementation task management and progress tracking
|
|
76
|
-
**Includes**: Task breakdown, schedule estimates,
|
|
76
|
+
**Includes**: Task breakdown, schedule estimates, test skeleton file paths, Phase 4 Quality Assurance Phase (required), progress records
|
|
77
77
|
**Excludes**: Technical rationale, design details
|
|
78
78
|
|
|
79
79
|
**Phase Division Criteria**:
|
|
package/.agents/skills/documentation-criteria/references/design-template.md
@@ -259,40 +259,15 @@ System Invariants:
|
|
|
259
259
|
- Prerequisites: [Required pre-implementations]
|
|
260
260
|
|
|
261
261
|
### Integration Points
|
|
262
|
-
Each integration point requires E2E verification:
|
|
263
262
|
|
|
264
263
|
**Integration Point 1: [Name]**
|
|
265
264
|
- Components: [Component A] to [Component B]
|
|
266
|
-
-
|
|
265
|
+
- Contract: [Interface/API contract between components]
|
|
267
266
|
|
|
268
267
|
### Migration Strategy
|
|
269
268
|
|
|
270
269
|
[Technical migration approach, ensuring backward compatibility]
|
|
271
270
|
|
|
272
|
-
## Test Strategy
|
|
273
|
-
|
|
274
|
-
### Basic Test Design Policy
|
|
275
|
-
|
|
276
|
-
Automatically derive test cases from acceptance criteria:
|
|
277
|
-
- Create at least one test case for each acceptance criterion
|
|
278
|
-
- Implement measurable standards from acceptance criteria as assertions
|
|
279
|
-
|
|
280
|
-
### Unit Tests
|
|
281
|
-
|
|
282
|
-
[Unit testing policy and coverage goals]
|
|
283
|
-
|
|
284
|
-
### Integration Tests
|
|
285
|
-
|
|
286
|
-
[Integration testing policy and important test cases]
|
|
287
|
-
|
|
288
|
-
### E2E Tests
|
|
289
|
-
|
|
290
|
-
[E2E testing policy]
|
|
291
|
-
|
|
292
|
-
### Performance Tests
|
|
293
|
-
|
|
294
|
-
[Performance testing methods and standards]
|
|
295
|
-
|
|
296
271
|
## Security Considerations
|
|
297
272
|
|
|
298
273
|
Evaluate the following for this feature's trust boundaries and data flow:
|
|
package/.agents/skills/documentation-criteria/references/plan-template.md
@@ -48,11 +48,6 @@ Related Issue/PR: #XXX (if any)
|
|
|
48
48
|
- [ ] [Functional completion criteria]
|
|
49
49
|
- [ ] [Quality completion criteria]
|
|
50
50
|
|
|
51
|
-
#### Operational Verification Procedures
|
|
52
|
-
1. [Operation verification steps]
|
|
53
|
-
2. [Expected result verification]
|
|
54
|
-
3. [Performance verification (when applicable)]
|
|
55
|
-
|
|
56
51
|
### Phase 2: [Phase Name] (Estimated commits: X)
|
|
57
52
|
**Purpose**: [What this phase aims to achieve]
|
|
58
53
|
|
|
@@ -66,11 +61,6 @@ Related Issue/PR: #XXX (if any)
|
|
|
66
61
|
- [ ] [Functional completion criteria]
|
|
67
62
|
- [ ] [Quality completion criteria]
|
|
68
63
|
|
|
69
|
-
#### Operational Verification Procedures
|
|
70
|
-
1. [Operation verification steps]
|
|
71
|
-
2. [Expected result verification]
|
|
72
|
-
3. [Performance verification (when applicable)]
|
|
73
|
-
|
|
74
64
|
### Phase 3: [Phase Name] (Estimated commits: X)
|
|
75
65
|
**Purpose**: [What this phase aims to achieve]
|
|
76
66
|
|
|
@@ -84,9 +74,6 @@ Related Issue/PR: #XXX (if any)
|
|
|
84
74
|
- [ ] [Functional completion criteria]
|
|
85
75
|
- [ ] [Quality completion criteria]
|
|
86
76
|
|
|
87
|
-
#### Operational Verification Procedures
|
|
88
|
-
[Copy relevant integration point operational verification from Design Doc]
|
|
89
|
-
|
|
90
77
|
### Final Phase: Quality Assurance (Required) (Estimated commits: 1)
|
|
91
78
|
**Purpose**: Overall quality assurance and Design Doc consistency verification
|
|
92
79
|
|
|
@@ -94,13 +81,10 @@ Related Issue/PR: #XXX (if any)
|
|
|
94
81
|
- [ ] Verify all Design Doc acceptance criteria achieved
|
|
95
82
|
- [ ] Security review: Verify security considerations from Design Doc are implemented
|
|
96
83
|
- [ ] Quality checks (types, lint, format)
|
|
97
|
-
- [ ] Execute all tests
|
|
84
|
+
- [ ] Execute all tests (including integration/E2E from test skeletons, when provided)
|
|
98
85
|
- [ ] Coverage 70%+
|
|
99
86
|
- [ ] Document updates
|
|
100
87
|
|
|
101
|
-
#### Operational Verification Procedures
|
|
102
|
-
[Copy operational verification procedures from Design Doc]
|
|
103
|
-
|
|
104
88
|
### Quality Assurance
|
|
105
89
|
- [ ] Implement staged quality checks (details: refer to ai-development-guide skill)
|
|
106
90
|
- [ ] All tests pass
|
|
@@ -110,7 +94,8 @@ Related Issue/PR: #XXX (if any)
|
|
|
110
94
|
|
|
111
95
|
## Completion Criteria
|
|
112
96
|
- [ ] All phases completed
|
|
113
|
-
- [ ]
|
|
97
|
+
- [ ] All integration/E2E tests passing (when test skeletons provided)
|
|
98
|
+
- [ ] Acceptance criteria manually verified (when test skeletons are not provided)
|
|
114
99
|
- [ ] Design Doc acceptance criteria satisfied
|
|
115
100
|
- [ ] Staged quality checks completed (zero errors)
|
|
116
101
|
- [ ] All tests pass
|
|
package/.agents/skills/recipe-add-integration-tests/SKILL.md
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: recipe-add-integration-tests
|
|
3
|
-
description: "Add integration/E2E tests to existing codebase using Design
|
|
3
|
+
description: "Add integration/E2E tests to existing codebase using Design Docs."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
## Required Skills [LOAD BEFORE EXECUTION]
|
|
@@ -26,11 +26,11 @@ description: "Add integration/E2E tests to existing codebase using Design Doc ac
|
|
|
26
26
|
- Test review -> Spawn integration-test-reviewer agent
|
|
27
27
|
- Quality checks -> Spawn quality-fixer agent
|
|
28
28
|
|
|
29
|
-
|
|
29
|
+
Document paths: $ARGUMENTS
|
|
30
30
|
|
|
31
31
|
## Prerequisites
|
|
32
32
|
|
|
33
|
-
- Design Doc must exist (created manually or via reverse-engineer)
|
|
33
|
+
- At least one Design Doc must exist (created manually or via reverse-engineer)
|
|
34
34
|
- Existing implementation to test
|
|
35
35
|
|
|
36
36
|
## Execution Flow
|
|
@@ -39,27 +39,59 @@ Design Doc path: $ARGUMENTS
|
|
|
39
39
|
|
|
40
40
|
Reference documentation-criteria skill for task file template in Step 3.
|
|
41
41
|
|
|
42
|
-
### Step 1: Validate
|
|
42
|
+
### Step 1: Discover and Validate Documents
|
|
43
43
|
|
|
44
|
-
|
|
44
|
+
```bash
|
|
45
|
+
# Verify at least one document path was provided
|
|
46
|
+
test -n "$ARGUMENTS" || { echo "ERROR: No document paths provided"; exit 1; }
|
|
47
|
+
|
|
48
|
+
# Verify provided paths exist
|
|
49
|
+
ls $ARGUMENTS
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
Use only the user-provided paths in `$ARGUMENTS`. Do not auto-discover additional Design Docs or UI Specs.
|
|
53
|
+
|
|
54
|
+
Classify provided documents by path and filename, using first-match-wins:
|
|
55
|
+
- Path matches `docs/ui-spec/*.md` -> **UI Spec**
|
|
56
|
+
- Path matches `docs/design/*-backend-*.md` or `docs/design/*backend*.md` -> **Design Doc (backend)**
|
|
57
|
+
- Path matches `docs/design/*-frontend-*.md` or `docs/design/*frontend*.md` -> **Design Doc (frontend)**
|
|
58
|
+
- Path matches `docs/design/*.md` and none of the above -> **single-layer Design Doc**
|
|
59
|
+
|
|
60
|
+
If a filename appears to match both backend and frontend, halt and ask the user which layer it belongs to.
|
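The classification rules added above can be read as a small first-match-wins function. The following is a minimal sketch under stated assumptions, not code from the package: the function name, the returned labels, and the error raised for unrecognized paths are all hypothetical, and `fnmatch` is used only for illustration (its `*` also matches `/`, unlike a shell glob).

```python
from fnmatch import fnmatchcase


class AmbiguousLayerError(ValueError):
    """Raised when a Design Doc filename matches both backend and frontend."""


def classify_document(path: str) -> str:
    """Classify one user-provided document path, first match wins.

    The returned labels are hypothetical names chosen for this sketch.
    """
    if fnmatchcase(path, "docs/ui-spec/*.md"):
        return "ui-spec"
    # "*-backend-*.md" is a subset of "*backend*.md", so one pattern per layer
    # is enough here. Note that fnmatch's "*" also matches "/".
    is_backend = fnmatchcase(path, "docs/design/*backend*.md")
    is_frontend = fnmatchcase(path, "docs/design/*frontend*.md")
    if is_backend and is_frontend:
        # The recipe says to halt and ask the user in this case.
        raise AmbiguousLayerError(f"{path}: ask the user which layer this belongs to")
    if is_backend:
        return "design-doc-backend"
    if is_frontend:
        return "design-doc-frontend"
    if fnmatchcase(path, "docs/design/*.md"):
        return "design-doc-single-layer"
    # Assumption: the recipe does not say how to treat any other path.
    raise ValueError(f"{path}: not a recognized Design Doc or UI Spec path")
```

Checking for the ambiguous case before applying first-match-wins keeps a name such as `backend-for-frontend.md` from being silently routed to the backend layer.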
|
45
61
|
|
|
46
62
|
### Step 2: Skeleton Generation
|
|
47
63
|
|
|
48
|
-
Spawn acceptance-test-generator agent
|
|
64
|
+
Spawn acceptance-test-generator agent with only the documents that exist from Step 1:
|
|
65
|
+
```text
|
|
66
|
+
Generate test skeletons from the following documents:
|
|
67
|
+
- Design Doc (backend): [path] <- include only if exists
|
|
68
|
+
- Design Doc (frontend): [path] <- include only if exists
|
|
69
|
+
- UI Spec: [path] <- include only if exists
|
|
70
|
+
```
|
|
49
71
|
|
|
50
|
-
**Expected output**: `generatedFiles`
|
|
72
|
+
**Expected output**: `generatedFiles` as a structured object grouped by layer, for example:
|
|
73
|
+
```json
|
|
74
|
+
{
|
|
75
|
+
"backend": ["path/to/backend.int.test.ts"],
|
|
76
|
+
"frontend": ["path/to/frontend.int.test.ts"],
|
|
77
|
+
"e2e": ["path/to/flow.e2e.test.ts"]
|
|
78
|
+
}
|
|
79
|
+
```
|
|
51
80
|
|
|
52
|
-
### Step 3: Create Task
|
|
81
|
+
### Step 3: Create Task Files [GATE]
|
|
53
82
|
|
|
54
83
|
**[STOP — BLOCKING]** Present task file content to user for confirmation before proceeding to implementation.
|
|
55
84
|
**CANNOT proceed until user explicitly confirms.**
|
|
56
85
|
|
|
57
|
-
Create task file
|
|
86
|
+
Create one task file per layer, using the monorepo-flow.md naming convention for deterministic agent routing:
|
|
87
|
+
- Backend skeletons exist -> `docs/plans/tasks/integration-tests-backend-task-YYYYMMDD.md`
|
|
88
|
+
- Frontend skeletons exist -> `docs/plans/tasks/integration-tests-frontend-task-YYYYMMDD.md`
|
|
89
|
+
- Single-layer (no backend/frontend distinction) -> `docs/plans/tasks/integration-tests-backend-task-YYYYMMDD.md`
|
|
58
90
|
|
|
59
|
-
**Template
|
|
91
|
+
**Template** (per task file):
|
|
60
92
|
```markdown
|
|
61
93
|
---
|
|
62
|
-
name: Implement integration tests for [feature name]
|
|
94
|
+
name: Implement [layer] integration tests for [feature name]
|
|
63
95
|
type: test-implementation
|
|
64
96
|
---
|
|
65
97
|
|
|
@@ -69,8 +101,8 @@ Implement test cases defined in skeleton files.
|
|
|
69
101
|
|
|
70
102
|
## Target Files
|
|
71
103
|
|
|
72
|
-
- Skeleton: [
|
|
73
|
-
- Design Doc: [
|
|
104
|
+
- Skeleton: [layer-specific paths from Step 2 generatedFiles]
|
|
105
|
+
- Design Doc: [layer-specific Design Doc from Step 1]
|
|
74
106
|
|
|
75
107
|
## Tasks
|
|
76
108
|
|
|
@@ -85,17 +117,22 @@ Implement test cases defined in skeleton files.
|
|
|
85
117
|
- No quality issues
|
|
86
118
|
```
|
|
87
119
|
|
|
88
|
-
**Output**: "Task file created at [path]. Ready for Step 4."
|
|
120
|
+
**Output**: "Task file(s) created at [path(s)]. Ready for Step 4."
|
|
89
121
|
|
|
90
122
|
### Step 4: Test Implementation
|
|
91
123
|
|
|
92
|
-
|
|
124
|
+
For each task file from Step 3, invoke task-executor routed by filename pattern:
|
|
125
|
+
- `*-backend-task-*` -> Spawn `task-executor`
|
|
126
|
+
- `*-frontend-task-*` -> Spawn `task-executor-frontend`
|
|
127
|
+
- Prompt: "Task file: [task file path from Step 3]. Implement tests following the task file."
|
|
128
|
+
|
|
129
|
+
Execute one task file at a time through Steps 4 -> 5 -> 6 -> 7 before starting the next.
|
|
93
130
|
|
|
94
131
|
**Expected output**: `status`, `testsAdded`
|
|
95
132
|
|
|
96
133
|
### Step 5: Test Review
|
|
97
134
|
|
|
98
|
-
Spawn integration-test-reviewer agent: "Review test quality. Test files: [paths from Step 4 testsAdded]. Skeleton files: [paths from Step 2 generatedFiles]."
|
|
135
|
+
Spawn integration-test-reviewer agent: "Review test quality. Test files: [paths from Step 4 testsAdded]. Skeleton files: [layer-specific paths from Step 2 generatedFiles matching current task's layer]."
|
|
99
136
|
|
|
100
137
|
**Expected output**: `status` (approved/needs_revision), `requiredFixes`
|
|
101
138
|
|
|
@@ -103,11 +140,14 @@ Spawn integration-test-reviewer agent: "Review test quality. Test files: [paths
|
|
|
103
140
|
|
|
104
141
|
Check Step 5 result:
|
|
105
142
|
- `status: approved` -> Mark complete, proceed to Step 7
|
|
106
|
-
- `status: needs_revision` -> Spawn
|
|
143
|
+
- `status: needs_revision` -> Spawn the layer-appropriate executor with: "Fix the following issues in test files: [requiredFixes from Step 5]." Then return to Step 5. Maximum 2 revision cycles per task file; if still `needs_revision`, escalate to the user.
|
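The bounded review loop described above (at most 2 revision cycles per task file, then escalation) could look roughly like this. It is a sketch only: `review` and `fix` are hypothetical callables standing in for the integration-test-reviewer and the layer-appropriate executor, and the returned strings are illustrative.

```python
from typing import Any, Callable, Dict, List

MAX_REVISION_CYCLES = 2  # from the recipe: maximum 2 revision cycles per task file


def review_with_revisions(
    review: Callable[[], Dict[str, Any]],
    fix: Callable[[List[str]], None],
) -> str:
    """Run Step 5, apply fixes on needs_revision, and escalate after 2 cycles.

    `review` is assumed to return {"status": ..., "requiredFixes": [...]}.
    """
    for cycle in range(MAX_REVISION_CYCLES + 1):
        result = review()
        if result["status"] == "approved":
            return "approved"
        if cycle < MAX_REVISION_CYCLES:
            # Hand the reviewer's required fixes to the executor, then re-review.
            fix(list(result.get("requiredFixes", [])))
    # Still not approved after the allowed revision cycles.
    return "escalate_to_user"
```

With this shape the reviewer runs at most three times per task file: once initially and once after each of the two permitted revision cycles.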
|
107
144
|
|
|
108
145
|
### Step 7: Quality Check
|
|
109
146
|
|
|
110
|
-
Spawn quality-fixer
|
|
147
|
+
Spawn quality-fixer routed by task filename pattern:
|
|
148
|
+
- `*-backend-task-*` -> Spawn `quality-fixer`
|
|
149
|
+
- `*-frontend-task-*` -> Spawn `quality-fixer-frontend`
|
|
150
|
+
- Prompt: "Final quality assurance for test files added in this workflow. Run all tests and verify coverage."
|
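Steps 4 and 7 both route by task filename pattern. A minimal sketch of that routing follows; the agent names and patterns come from the diff, while the `Route` type, the function name, and the error for unmatched filenames are assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple


class Route(NamedTuple):
    executor: str
    quality_fixer: str


# Patterns and agent names as listed in Steps 4 and 7.
ROUTES = (
    ("*-backend-task-*", Route("task-executor", "quality-fixer")),
    ("*-frontend-task-*", Route("task-executor-frontend", "quality-fixer-frontend")),
)


def route_task_file(task_path: str) -> Route:
    """Pick the executor and quality-fixer agents for one task file."""
    for pattern, route in ROUTES:
        if fnmatchcase(task_path, pattern):
            return route
    # Assumption: the recipe does not define a fallback for other names.
    raise ValueError(f"{task_path}: no routing pattern matched")
```

Because single-layer Design Docs are written to the `-backend-task-` filename in Step 3, they fall through to the backend agents without a separate rule.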
|
111
151
|
|
|
112
152
|
**Expected output**: `status` (`approved`/`blocked`)
|
|
113
153
|
|
|
package/.agents/skills/recipe-diagnose/SKILL.md
@@ -83,7 +83,21 @@ Register the following and execute:
|
|
|
83
83
|
|
|
84
84
|
### Step 1: Investigation (investigator)
|
|
85
85
|
|
|
86
|
-
Spawn investigator agent
|
|
86
|
+
Spawn investigator agent with the following prompt:
|
|
87
|
+
|
|
88
|
+
```text
|
|
89
|
+
Comprehensively collect information related to the following phenomenon.
|
|
90
|
+
|
|
91
|
+
Phenomenon: [Problem reported by user]
|
|
92
|
+
Problem essence: [taskEssence]
|
|
93
|
+
Investigation focus: [investigationFocus]
|
|
94
|
+
Applicable rules: [selectedRules summary]
|
|
95
|
+
|
|
96
|
+
For change failures, also include:
|
|
97
|
+
- what changed
|
|
98
|
+
- what broke
|
|
99
|
+
- what both areas share
|
|
100
|
+
```
|
|
87
101
|
|
|
88
102
|
**Expected output**: Evidence matrix, comparison analysis results, causal tracking results, list of unexplored areas, investigation limitations
|
|
89
103
|
|
|
@@ -92,12 +106,14 @@ Spawn investigator agent: "Comprehensively collect information related to the fo
|
|
|
92
106
|
Review investigation output:
|
|
93
107
|
|
|
94
108
|
**Quality Check** (verify output contains the following):
|
|
95
|
-
- [ ] comparisonAnalysis
|
|
96
|
-
- [ ] causalChain for each hypothesis
|
|
109
|
+
- [ ] `comparisonAnalysis` is present and `normalImplementation` is non-null, or explicitly states that no working implementation was found
|
|
110
|
+
- [ ] causalChain for each hypothesis reaches a stop condition
|
|
97
111
|
- [ ] causeCategory for each hypothesis
|
|
112
|
+
- [ ] `investigationSources` covers at least 3 distinct source types
|
|
113
|
+
- [ ] each hypothesis has supporting evidence with a concrete source
|
|
98
114
|
- [ ] Investigation covering investigationFocus items (when provided)
|
|
99
115
|
|
|
100
|
-
**If quality insufficient**: MUST re-spawn investigator agent specifying missing items
|
|
116
|
+
**If quality insufficient**: MUST re-spawn investigator agent specifying the missing items and include the previous investigation output for context
|
|
101
117
|
ENFORCEMENT: Proceeding to verifier with incomplete investigation data produces unreliable conclusions.
|
|
102
118
|
|
|
103
119
|
**design_gap Escalation**:
|
|
package/.agents/skills/recipe-reverse-engineer/SKILL.md
@@ -69,6 +69,7 @@ Spawn scope-discoverer agent: "Discover functional scope targets in the codebase
|
|
|
69
69
|
- No units discovered -> ask user for hints
|
|
70
70
|
- `$STEP_1_OUTPUT.prdUnits` exists
|
|
71
71
|
- All `sourceUnits` across `prdUnits` (flattened, deduplicated) match the set of `discoveredUnits` IDs — no unit missing, no unit duplicated
|
|
72
|
+
- Each discovered unit's `unitInventory` has at least one non-empty category. If all categories are empty, re-run discovery with focus on that unit
|
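The new `unitInventory` check can be sketched as a filter over the discovered units. This assumes each unit is a dictionary with an `id` key and that every inventory category is a list, as in the JSON sample later in this file; neither assumption is confirmed by the diff, and the function name is hypothetical.

```python
from typing import Any, Dict, List


def units_with_empty_inventory(discovered_units: List[Dict[str, Any]]) -> List[str]:
    """Return IDs of units whose unitInventory has no non-empty category.

    Assumes each unit has an "id" key and list-valued inventory categories.
    """
    needs_rediscovery = []
    for unit in discovered_units:
        inventory = unit.get("unitInventory") or {}
        # A unit passes if at least one category (routes, testFiles, ...) has items.
        if not any(inventory.values()):
            needs_rediscovery.append(unit["id"])
    return needs_rediscovery
```

Any ID returned here would be a candidate for re-running discovery with focus on that unit.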
|
72
73
|
|
|
73
74
|
**[STOP — BLOCKING]** If human review enabled: Present `$STEP_1_OUTPUT.prdUnits` with their source unit mapping to user for confirmation.
|
|
74
75
|
**CANNOT proceed until user explicitly confirms.**
|
|
@@ -79,7 +80,7 @@ Spawn scope-discoverer agent: "Discover functional scope targets in the codebase
|
|
|
79
80
|
|
|
80
81
|
#### Step 2: PRD Generation
|
|
81
82
|
|
|
82
|
-
Spawn prd-creator agent: "Create reverse-engineered PRD for the following feature. Operation Mode: reverse-engineer. External Scope Provided: true. Feature: $PRD_UNIT_NAME. Description: $PRD_UNIT_DESCRIPTION. Related Files: $PRD_UNIT_COMBINED_RELATED_FILES. Entry Points: $PRD_UNIT_COMBINED_ENTRY_POINTS. Source Units: $PRD_UNIT_SOURCE_UNITS.
|
|
83
|
+
Spawn prd-creator agent: "Create reverse-engineered PRD for the following feature. Operation Mode: reverse-engineer. External Scope Provided: true. Feature: $PRD_UNIT_NAME. Description: $PRD_UNIT_DESCRIPTION. Related Files: $PRD_UNIT_COMBINED_RELATED_FILES. Entry Points: $PRD_UNIT_COMBINED_ENTRY_POINTS. Source Units: $PRD_UNIT_SOURCE_UNITS. Use provided scope as an investigation starting point. If tracing entry points reveals directly connected files outside this scope, include them. Create final version PRD based on thorough code investigation."
|
|
83
84
|
|
|
84
85
|
**Store output as**: `$STEP_2_OUTPUT` (PRD path)
|
|
85
86
|
|
|
@@ -87,12 +88,13 @@ Spawn prd-creator agent: "Create reverse-engineered PRD for the following featur
|
|
|
87
88
|
|
|
88
89
|
**Prerequisite**: $STEP_2_OUTPUT (PRD path from Step 2)
|
|
89
90
|
|
|
90
|
-
Spawn code-verifier agent: "Verify consistency between PRD and code implementation. doc_type: prd. document_path: $STEP_2_OUTPUT.
|
|
91
|
+
Spawn code-verifier agent: "Verify consistency between PRD and code implementation. doc_type: prd. document_path: $STEP_2_OUTPUT. verbose: false."
|
|
91
92
|
|
|
92
93
|
**Store output as**: `$STEP_3_OUTPUT`
|
|
93
94
|
|
|
94
95
|
**Quality Gate**:
|
|
95
|
-
- consistencyScore >= 70 -> proceed to review
|
|
96
|
+
- consistencyScore >= 70 and verifiableClaimCount >= 20 -> proceed to review (guards against shallow verification passes with too few extracted claims)
|
|
97
|
+
- consistencyScore >= 70 and verifiableClaimCount < 20 -> re-run verifier because investigation depth is insufficient
|
|
96
98
|
- consistencyScore < 70 -> flag for detailed review
|
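The three quality-gate branches above reduce to a small decision function. The thresholds (70 and 20) are taken from the diff; the function name and the returned labels are hypothetical.

```python
MIN_CONSISTENCY_SCORE = 70
MIN_VERIFIABLE_CLAIMS = 20


def prd_quality_gate(consistency_score: float, verifiable_claim_count: int) -> str:
    """Decide the next action after code-verifier returns for a PRD."""
    if consistency_score < MIN_CONSISTENCY_SCORE:
        return "flag_for_detailed_review"
    if verifiable_claim_count < MIN_VERIFIABLE_CLAIMS:
        # A passing score over too few claims suggests shallow verification.
        return "rerun_verifier"
    return "proceed_to_review"
```

For example, a score of 85 with 12 claims re-runs the verifier, while the same score with 24 claims proceeds to review.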
|
97
99
|
|
|
98
100
|
#### Step 4: Review
|
|
@@ -151,6 +153,7 @@ Map PRD units to Design Doc generation targets by resolving each PRD unit's `sou
|
|
|
151
153
|
- `technicalProfile.publicInterfaces` -> Public Interfaces
|
|
152
154
|
- `dependencies` -> Dependencies
|
|
153
155
|
- `relatedFiles` -> Scope boundary
|
|
156
|
+
- `unitInventory` -> Unit Inventory
|
|
154
157
|
|
|
155
158
|
**Store output as**: `$STEP_6_OUTPUT`
|
|
156
159
|
|
|
@@ -168,6 +171,11 @@ Map PRD units to Design Doc generation targets by resolving each PRD unit's `sou
|
|
|
168
171
|
"publicInterfaces": ["AuthService.login()", "AuthController.handleLogin()"],
|
|
169
172
|
"dependencies": ["UNIT-003"],
|
|
170
173
|
"scopeBoundary": ["src/auth/*"],
|
|
174
|
+
"unitInventory": {
|
|
175
|
+
"routes": [],
|
|
176
|
+
"testFiles": [],
|
|
177
|
+
"publicExports": []
|
|
178
|
+
},
|
|
171
179
|
"mappingRationale": "Default 1:1 mapping from PRD unit because technical scope is cohesive"
|
|
172
180
|
}
|
|
173
181
|
]
|
|
@@ -186,13 +194,13 @@ Map PRD units to Design Doc generation targets by resolving each PRD unit's `sou
|
|
|
186
194
|
|
|
187
195
|
**Scope**: Document current architecture as-is. This is a documentation task, not a design improvement task.
|
|
188
196
|
|
|
189
|
-
Spawn technical-designer agent: "Create Design Doc for the following feature based on existing code. Operation Mode:
|
|
197
|
+
Spawn technical-designer agent: "Create Design Doc for the following feature based on existing code. Operation Mode: reverse-engineer. Feature: $UNIT_NAME. Description: $UNIT_DESCRIPTION. Primary Files: $UNIT_PRIMARY_MODULES. Public Interfaces: $UNIT_PUBLIC_INTERFACES. Dependencies: $UNIT_DEPENDENCIES. Unit Inventory: $UNIT_INVENTORY. Parent PRD: $APPROVED_PRD_PATH. Document current architecture as-is. Use Unit Inventory as the completeness baseline."
|
|
190
198
|
|
|
191
199
|
**Store output as**: `$STEP_7_OUTPUT`
|
|
192
200
|
|
|
193
201
|
#### Step 8: Code Verification
|
|
194
202
|
|
|
195
|
-
Spawn code-verifier agent: "Verify consistency between Design Doc and code implementation. doc_type: design-doc. document_path: $STEP_7_OUTPUT.
|
|
203
|
+
Spawn code-verifier agent: "Verify consistency between Design Doc and code implementation. doc_type: design-doc. document_path: $STEP_7_OUTPUT. verbose: false."
|
|
196
204
|
|
|
197
205
|
**Store output as**: `$STEP_8_OUTPUT`
|
|
198
206
|
|
|
package/.codex/agents/code-verifier.toml
@@ -52,13 +52,6 @@ Skill Status:
|
|
|
52
52
|
This agent outputs **verification results and discrepancy findings only**.
|
|
53
53
|
Document modification and solution proposals are out of scope for this agent.
|
|
54
54
|
|
|
55
|
-
## Core Responsibilities
|
|
56
|
-
|
|
57
|
-
1. **Claim Extraction** - Extract verifiable claims from document
|
|
58
|
-
2. **Multi-source Evidence Collection** - Gather evidence from code, tests, and config
|
|
59
|
-
3. **Consistency Classification** - Classify each claim's implementation status
|
|
60
|
-
4. **Coverage Assessment** - Identify undocumented code and unimplemented specifications
|
|
61
|
-
|
|
62
55
|
## Verification Framework
|
|
63
56
|
|
|
64
57
|
### Claim Categories
|
|
@@ -97,28 +90,38 @@ For each claim, classify as one of:
|
|
|
97
90
|
|
|
98
91
|
## Execution Steps
|
|
99
92
|
|
|
100
|
-
### Step 1: Document Analysis
|
|
93
|
+
### Step 1: Document Analysis — Section-by-Section Claim Extraction
|
|
101
94
|
|
|
102
|
-
1. Read the target document
|
|
103
|
-
2.
|
|
95
|
+
1. Read the target document in full
|
|
96
|
+
2. Process each section individually:
|
|
97
|
+
- Extract all statements that make verifiable claims about code behavior, data structures, file paths, API contracts, or system behavior
|
|
98
|
+
- Record `{ sectionName, claimCount, claims[] }`
|
|
99
|
+
- If a section contains factual statements but yields zero claims, record that explicitly for review
|
|
104
100
|
3. Categorize each claim
|
|
105
101
|
4. Note ambiguous claims that cannot be verified
|
|
102
|
+
5. Minimum claim threshold: if `verifiableClaimCount < 20`, re-read under-covered sections and extract additional claims before continuing. Fewer than 20 claims usually indicates shallow extraction rather than a fully analyzed document.
|
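The per-section records from Step 1 can be rolled up into the `claimCoverage` summary and the minimum-claim check. This is a sketch under assumptions: it treats the sum of per-section `claimCount` values as `verifiableClaimCount` (the diff does not say whether ambiguous claims are excluded first), and `needsReExtraction` is a hypothetical field name.

```python
from typing import Any, Dict, List

MIN_VERIFIABLE_CLAIMS = 20


def summarize_claim_coverage(sections: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate {sectionName, claimCount, claims[]} records from Step 1."""
    total = sum(section["claimCount"] for section in sections)
    zero = [s["sectionName"] for s in sections if s["claimCount"] == 0]
    return {
        # Assumption: every extracted claim counts as verifiable.
        "verifiableClaimCount": total,
        "claimCoverage": {
            "sectionsAnalyzed": len(sections),
            "sectionsWithClaims": len(sections) - len(zero),
            "sectionsWithZeroClaims": zero,
        },
        # Hypothetical flag: re-read under-covered sections before continuing.
        "needsReExtraction": total < MIN_VERIFIABLE_CLAIMS,
    }
```

Sections listed under `sectionsWithZeroClaims` are the natural starting point when the total falls below 20.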
|
106
103
|
|
|
107
104
|
### Step 2: Code Scope Identification
|
|
108
105
|
|
|
109
|
-
1.
|
|
110
|
-
2.
|
|
111
|
-
3.
|
|
106
|
+
1. If `code_paths` are provided, use them as a starting point, not a ceiling
|
|
107
|
+
2. If `code_paths` are not provided, extract file paths from the document and expand scope by searching for referenced identifiers
|
|
108
|
+
3. Infer additional relevant paths from context
|
|
109
|
+
4. Build and record the final verification target list
|
|
112
110
|
|
|
113
111
|
### Step 3: Evidence Collection
|
|
114
112
|
|
|
115
113
|
For each claim:
|
|
116
114
|
|
|
117
|
-
1. **Primary Search**: Find direct implementation
|
|
115
|
+
1. **Primary Search**: Find direct implementation with Read/Grep
|
|
118
116
|
2. **Secondary Search**: Check test files for expected behavior
|
|
119
117
|
3. **Tertiary Search**: Review config and type definitions
|
|
120
118
|
|
|
121
|
-
|
|
119
|
+
Evidence rules:
|
|
120
|
+
- Record source location and evidence strength for each finding
|
|
121
|
+
- Existence claims must be verified with Grep or file enumeration before reporting
|
|
122
|
+
- Behavioral claims must be backed by reading the implementation, not by naming alone
|
|
123
|
+
- Identifier claims must compare exact strings from code against the document
|
|
124
|
+
- Single-source findings remain low confidence
|
|
122
125
|
|
|
123
126
|
### Step 4: Consistency Classification
|
|
124
127
|
|
|
@@ -130,11 +133,15 @@ For each claim with collected evidence:
|
|
|
130
133
|
- medium: 2 sources agree
|
|
131
134
|
- low: 1 source only
|
|
132
135
|
|
|
133
|
-
### Step 5: Coverage Assessment
|
|
136
|
+
### Step 5: Reverse Coverage Assessment — Code-to-Document Direction
|
|
137
|
+
|
|
138
|
+
Perform this step with actual tool-backed enumeration, not memory:
|
|
134
139
|
|
|
135
|
-
1.
|
|
136
|
-
2.
|
|
137
|
-
3.
|
|
140
|
+
1. Enumerate routes/endpoints in scope and record whether each is documented
|
|
141
|
+
2. Enumerate test files in scope and record whether their existence is documented
|
|
142
|
+
3. Enumerate public exports/interfaces in primary source files and record whether each is documented
|
|
143
|
+
4. Compile undocumented code items from the enumerations
|
|
144
|
+
5. Compile unimplemented document items from earlier claim verification
|
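Once the enumerations in Step 5 are collected, the code-to-document comparison is essentially a set difference. The sketch below assumes routes and exports are represented as plain strings and that "documented" means the same string appears in the document; the field names mirror the `reverseCoverage` sample in the output format, while the function names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple


def _compare(in_code: Iterable[str], documented: Iterable[str]) -> Tuple[int, int, List[str]]:
    """Return (items in code, items in code that are documented, undocumented items)."""
    code = set(in_code)
    covered = code & set(documented)
    return len(code), len(covered), sorted(code - covered)


def reverse_coverage(
    routes_in_code: Iterable[str],
    routes_documented: Iterable[str],
    exports_in_code: Iterable[str],
    exports_documented: Iterable[str],
) -> Dict[str, Any]:
    """Build the route and export parts of reverseCoverage from enumerations."""
    routes = _compare(routes_in_code, routes_documented)
    exports = _compare(exports_in_code, exports_documented)
    return {
        "routesInCode": routes[0],
        "routesDocumented": routes[1],
        "undocumentedRoutes": routes[2],
        "exportsInCode": exports[0],
        "exportsDocumented": exports[1],
        "undocumentedExports": exports[2],
    }
```

The inputs are meant to come from tool-backed enumeration, as the step requires; the function only does the bookkeeping and does not enumerate anything itself.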
|
138
145
|
|
|
139
146
|
### Step 6: Return JSON Result
|
|
140
147
|
|
|
@@ -151,9 +158,16 @@ Return the JSON result as the final response. See Output Format for the schema.
|
|
|
151
158
|
"summary": {
|
|
152
159
|
"docType": "prd|design-doc",
|
|
153
160
|
"documentPath": "/path/to/document.md",
|
|
161
|
+
"verifiableClaimCount": 24,
|
|
162
|
+
"matchCount": 20,
|
|
154
163
|
"consistencyScore": 85,
|
|
155
164
|
"status": "consistent|mostly_consistent|needs_review|inconsistent"
|
|
156
165
|
},
|
|
166
|
+
"claimCoverage": {
|
|
167
|
+
"sectionsAnalyzed": 8,
|
|
168
|
+
"sectionsWithClaims": 7,
|
|
169
|
+
"sectionsWithZeroClaims": ["Appendix"]
|
|
170
|
+
},
|
|
157
171
|
"discrepancies": [
|
|
158
172
|
{
|
|
159
173
|
"id": "D001",
|
|
@@ -162,9 +176,20 @@ Return the JSON result as the final response. See Output Format for the schema.
|
|
|
162
176
|
"claim": "Brief claim description",
|
|
163
177
|
"documentLocation": "PRD.md:45",
|
|
164
178
|
"codeLocation": "src/auth.ts:120",
|
|
179
|
+
"evidence": "Observed implementation or enumeration result",
|
|
165
180
|
"classification": "What was found"
|
|
166
181
|
}
|
|
167
182
|
],
|
|
183
|
+
"reverseCoverage": {
|
|
184
|
+
"routesInCode": 6,
|
|
185
|
+
"routesDocumented": 5,
|
|
186
|
+
"undocumentedRoutes": ["POST /admin/reindex (src/routes/admin.ts:42)"],
|
|
187
|
+
"testFilesFound": 4,
|
|
188
|
+
"testFilesDocumented": 2,
|
|
189
|
+
"exportsInCode": 12,
|
|
190
|
+
"exportsDocumented": 10,
|
|
191
|
+
"undocumentedExports": ["rebuildSearchIndex (src/search/index.ts:18)"]
|
|
192
|
+
},
|
|
168
193
|
"coverage": {
|
|
169
194
|
"documented": ["Feature areas with documentation"],
|
|
170
195
|
"undocumented": ["Code features lacking documentation"],
|
|
@@ -190,6 +215,8 @@ consistencyScore = (matchCount / verifiableClaimCount) * 100
|
|
|
190
215
|
- (minorDiscrepancies * 2)
|
|
191
216
|
```
|
|
192
217
|
|
|
218
|
+
If `verifiableClaimCount < 20`, treat the score as unstable and return to Step 1 before finalizing. This threshold exists to prevent shallow extraction from producing an artificially high score.
|
|
219
|
+
|
|
193
220
|
| Score | Status | Interpretation |
|
|
194
221
|
|-------|--------|----------------|
|
|
195
222
|
| 85-100 | consistent | Document accurately reflects code |
|
|
@@ -199,9 +226,11 @@ consistencyScore = (matchCount / verifiableClaimCount) * 100
|
|
|
199
226
|
|
|
200
227
|
## Completion Criteria
|
|
201
228
|
|
|
202
|
-
- [ ] Extracted
|
|
229
|
+
- [ ] Extracted claims section-by-section with per-section counts recorded
|
|
230
|
+
- [ ] `verifiableClaimCount >= 20`
|
|
203
231
|
- [ ] Collected evidence from multiple sources for each claim
|
|
204
232
|
- [ ] Classified each claim (match/drift/gap/conflict)
|
|
233
|
+
- [ ] Performed reverse coverage with route, test file, and public export enumeration
|
|
205
234
|
- [ ] Identified undocumented features in code
|
|
206
235
|
- [ ] Identified unimplemented specifications
|
|
207
236
|
- [ ] Calculated consistency score
|
|
@@ -209,9 +238,13 @@ consistencyScore = (matchCount / verifiableClaimCount) * 100
|
|
|
209
238
|
|
|
210
239
|
## Output Self-Check
|
|
211
240
|
- [ ] All findings are based on verification evidence (no modifications proposed)
|
|
241
|
+
- [ ] Existence claims are backed by Grep or enumeration evidence
|
|
242
|
+
- [ ] Behavioral claims are backed by reading the actual implementation
|
|
243
|
+
- [ ] Identifier comparisons use exact strings from code
|
|
212
244
|
- [ ] Each classification cites multiple sources (not single-source)
|
|
213
245
|
- [ ] Low-confidence classifications are explicitly noted
|
|
214
246
|
- [ ] Contradicting evidence is documented, not ignored
|
|
247
|
+
- [ ] `reverseCoverage` includes concrete counts from tool-backed enumeration
|
|
215
248
|
|
|
216
249
|
## Completion Gate [BLOCKING]
|
|
217
250
|
|
|
package/.codex/agents/investigator.toml
@@ -47,14 +47,6 @@ Skill Status:
|
|
|
47
47
|
This agent outputs **evidence matrix and factual observations only**.
|
|
48
48
|
Solution derivation is out of scope for this agent.
|
|
49
49
|
|
|
50
|
-
## Core Responsibilities
|
|
51
|
-
|
|
52
|
-
1. **Multi-source information collection (Triangulation)** - Collect data from multiple sources without depending on a single source
|
|
53
|
-
2. **External information collection (web search)** - Search official documentation, community, and known library issues
|
|
54
|
-
3. **Hypothesis enumeration and causal tracking** - List multiple causal relationship candidates and trace to root cause
|
|
55
|
-
4. **Impact scope identification** - Identify locations implemented with the same pattern
|
|
56
|
-
5. **Unexplored areas disclosure** - Honestly report areas that could not be investigated
|
|
57
|
-
|
|
58
50
|
## Execution Steps
|
|
59
51
|
|
|
60
52
|
### Step 1: Problem Understanding and Investigation Strategy
|
|
@@ -70,9 +62,18 @@ Solution derivation is out of scope for this agent.
|
|
|
70
62
|
|
|
71
63
|
### Step 2: Information Collection
|
|
72
64
|
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
65
|
+
Investigate each source type below and record findings even when empty:
|
|
66
|
+
|
|
67
|
+
| Source | Minimum Investigation Action |
|
|
68
|
+
|--------|------------------------------|
|
|
69
|
+
| Code | Read directly related files and search for the reported symbols, errors, or messages |
|
|
70
|
+
| git history | Review recent history for affected files and compare working/broken states when applicable |
|
|
71
|
+
| Dependencies | Inspect package manifests and relevant package versions or changelogs |
|
|
72
|
+
| Configuration | Read relevant config files and search for related keys across the project |
|
|
73
|
+
| Design Doc or ADR | Search for matching docs and read them. Record findings or explicitly record that none were found |
|
|
74
|
+
| External | Search official documentation for the primary technology and for the reported error text. Record findings or explicitly record that no relevant result was found |
|
|
75
|
+
|
|
76
|
+
**Comparison analysis**: Differences between working implementation and problematic area (call order, initialization timing, configuration values)
|
|
76
77
|
|
|
77
78
|
Information source priority:
|
|
78
79
|
1. Comparison with "working implementation" in project
|
|
@@ -86,9 +87,7 @@ Information source priority:
|
|
|
86
87
|
- Collect supporting and contradicting evidence for each hypothesis
|
|
87
88
|
- Determine causeCategory: typo / logic_error / missing_constraint / design_gap / external_factor
|
|
88
89
|
|
|
89
|
-
**
|
|
90
|
-
- Stopping at "~ is not configured" → without tracing why it's not configured
|
|
91
|
-
- Stopping at technical element names → without tracing why that state occurred
|
|
90
|
+
**Tracking depth check**: Each causal chain must reach a stop condition. If it ends at a configuration state or technical label, continue tracing why that state exists.
|
|
92
91
|
|
|
93
92
|
### Step 4: Impact Scope Identification
|
|
94
93
|
|
|
@@ -172,7 +171,7 @@ Return the JSON result as the final response. See Output Format for the schema.
|
|
|
172
171
|
|
|
173
172
|
- [ ] Determined problem type and executed diff analysis for change failures
|
|
174
173
|
- [ ] Output comparisonAnalysis
|
|
175
|
-
- [ ] Investigated
|
|
174
|
+
- [ ] Investigated each source type or recorded that it had no relevant findings
|
|
176
175
|
- [ ] Enumerated 2+ hypotheses with causal tracking, evidence collection, and causeCategory determination for each
|
|
177
176
|
- [ ] Determined impactScope and recurrenceRisk
|
|
178
177
|
- [ ] Documented unexplored areas and investigation limitations
|