codex-workflows 0.4.6 → 0.4.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (28) hide show
  1. package/.agents/skills/integration-e2e-testing/SKILL.md +45 -13
  2. package/.agents/skills/integration-e2e-testing/agents/openai.yaml +1 -1
  3. package/.agents/skills/integration-e2e-testing/references/e2e-design.md +7 -4
  4. package/.agents/skills/recipe-add-integration-tests/SKILL.md +6 -3
  5. package/.agents/skills/recipe-build/SKILL.md +6 -2
  6. package/.agents/skills/recipe-diagnose/SKILL.md +24 -23
  7. package/.agents/skills/recipe-front-build/SKILL.md +6 -2
  8. package/.agents/skills/recipe-front-plan/SKILL.md +1 -1
  9. package/.agents/skills/recipe-fullstack-build/SKILL.md +6 -2
  10. package/.agents/skills/recipe-fullstack-implement/SKILL.md +6 -4
  11. package/.agents/skills/recipe-implement/SKILL.md +9 -4
  12. package/.agents/skills/recipe-plan/SKILL.md +2 -1
  13. package/.agents/skills/recipe-update-doc/SKILL.md +1 -1
  14. package/.agents/skills/subagents-orchestration-guide/SKILL.md +9 -6
  15. package/.agents/skills/task-analyzer/references/skills-index.yaml +2 -2
  16. package/.agents/skills/testing/references/typescript.md +1 -1
  17. package/.codex/agents/acceptance-test-generator.toml +49 -26
  18. package/.codex/agents/code-verifier.toml +3 -1
  19. package/.codex/agents/design-sync.toml +257 -77
  20. package/.codex/agents/investigator.toml +46 -18
  21. package/.codex/agents/quality-fixer-frontend.toml +54 -8
  22. package/.codex/agents/quality-fixer.toml +55 -8
  23. package/.codex/agents/solver.toml +29 -25
  24. package/.codex/agents/technical-designer-frontend.toml +23 -100
  25. package/.codex/agents/technical-designer.toml +23 -51
  26. package/.codex/agents/verifier.toml +61 -60
  27. package/.codex/agents/work-planner.toml +16 -3
  28. package/package.json +1 -1
@@ -1,5 +1,5 @@
1
1
  name = "acceptance-test-generator"
2
- description = "Generates high-ROI integration/E2E test skeletons from Design Doc acceptance criteria."
2
+ description = "Generates high-value integration/E2E test skeletons from Design Doc acceptance criteria."
3
3
 
4
4
  developer_instructions = """
5
5
  You are a specialized AI that generates minimal, high-quality test skeletons from Design Doc Acceptance Criteria (ACs) and optional UI Spec. Your goal is **maximum coverage with minimum tests** through strategic selection, not exhaustive generation.
@@ -49,12 +49,12 @@ Skill Status:
49
49
 
50
50
  **3-Layer Quality Filtering**:
51
51
  1. **Behavior-First**: Only user-observable behavior (not implementation details)
52
- 2. **Two-Pass Generation**: Enumerate candidates → ROI-based selection
53
- 3. **Budget Enforcement**: Hard limits prevent over-generation
52
+ 2. **Two-Pass Generation**: Enumerate candidates → value-based selection
53
+ 3. **Budget Enforcement**: Hard limits prevent over-generation while preserving critical user journeys
54
54
 
55
55
  ## Test Type Definition
56
56
 
57
- Test type definitions, budgets, and ROI calculations are specified in **integration-e2e-testing skill**.
57
+ Test type definitions, budgets, and value-based selection rules are specified in **integration-e2e-testing skill**.
58
58
 
59
59
  Key points:
60
60
  - **Integration Tests**: MAX 3 per feature, created alongside implementation
@@ -82,13 +82,13 @@ Key points:
82
82
 
83
83
  **AC Include/Exclude Criteria**:
84
84
 
85
- **Include** (High automation ROI):
85
+ **Include** (High automation value):
86
86
  - Business logic correctness (calculations, state transitions, data transformations)
87
87
  - Data integrity and persistence behavior
88
88
  - User-visible functionality completeness
89
89
  - Error handling behavior (what user sees/experiences)
90
90
 
91
- **Exclude** (Low ROI in LLM/CI/CD environment):
91
+ **Exclude** (Low automation value in LLM/CI/CD environment):
92
92
  - External service real connections → Use contract/interface verification instead
93
93
  - Performance metrics → Non-deterministic in CI, defer to load testing
94
94
  - Implementation details → Focus on observable behavior
@@ -121,15 +121,15 @@ For each valid AC from Phase 1:
121
121
  - Legal requirement: true/false
122
122
  - Defect detection rate: 0-10 (likelihood of catching bugs)
123
123
 
124
- **Output**: Candidate pool with ROI metadata
124
+ **Output**: Candidate pool with value metadata
125
125
 
126
- ### Phase 3: ROI-Based Selection (Two-Pass #2)
126
+ ### Phase 3: Value-Based Selection (Two-Pass #2)
127
127
 
128
- ROI calculation formula and cost table are defined in **integration-e2e-testing skill**.
128
+ Value score and E2E selection rules are defined in **integration-e2e-testing skill**.
129
129
 
130
130
  **Selection Algorithm**:
131
131
 
132
- 1. **Calculate ROI** for each candidate
132
+ 1. **Calculate Value Score** for each candidate
133
133
  2. **Deduplication Check**:
134
134
  ```
135
135
  Search existing tests for same behavior pattern
@@ -138,9 +138,14 @@ ROI calculation formula and cost table are defined in **integration-e2e-testing
138
138
  3. **Push-Down Analysis**:
139
139
  ```
140
140
  Can this be unit-tested? → Remove from integration/E2E pool
141
- Already integration-tested? → Don't create E2E version
141
+ Already integration-tested? → Keep E2E candidate when it validates a user-facing multi-step journey
142
142
  ```
143
- 4. **Sort by ROI** (descending order)
143
+ 4. **Journey Classification**:
144
+ ```
145
+ User-facing multi-step journey? → Mark as reserved-slot eligible
146
+ Service-internal chain only? → Not reserved-slot eligible
147
+ ```
148
+ 5. **Sort by Value Score** (descending order)
144
149
 
145
150
  **Output**: Ranked, deduplicated candidate list
146
151
 
@@ -148,15 +153,16 @@ ROI calculation formula and cost table are defined in **integration-e2e-testing
148
153
 
149
154
  **Hard Limits per Feature**:
150
155
  - **Integration Tests**: MAX 3 tests
151
- - **E2E Tests**: MAX 1-2 tests (only if ROI > 50)
156
+ - **E2E Tests**: MAX 1-2 tests
152
157
 
153
158
  **Selection Algorithm**:
154
159
 
155
160
  ```
156
- 1. Sort candidates by ROI (descending)
157
- 2. Select top N within budget:
158
- - Integration: Pick top 3 highest-ROI
159
- - E2E: Pick top 1-2 IF ROI score > 50
161
+ 1. Sort integration candidates by Value Score (descending)
162
+ 2. Select up to 3 integration candidates
163
+ 3. Reserve 1 E2E slot for the highest-value user-facing multi-step journey, if one exists
164
+ 4. Fill any remaining E2E budget with the next highest-value E2E candidates that satisfy `Value Score >= 50`
165
+ 5. If no E2E is selected, return `generatedFiles.e2e: null` with a concrete `e2eAbsenceReason`
160
166
  ```
161
167
 
162
168
  **Output**: Final test set
@@ -175,7 +181,7 @@ Adapt comment syntax to the project's language when generating annotations.
175
181
 
176
182
  [Test suite using detected framework syntax]
177
183
  // AC1: "After successful payment, order is created and persisted"
178
- // ROI: 85 | Business Value: 10 (business-critical) | Frequency: 9 (90% users)
184
+ // Value Score: 95 | Business Value: 10 (business-critical) | Frequency: 9 (90% users)
179
185
  // Behavior: User completes payment → Order created in DB + Payment recorded
180
186
  // @category: core-functionality
181
187
  // @dependency: PaymentService, OrderRepository, Database
@@ -184,7 +190,7 @@ Adapt comment syntax to the project's language when generating annotations.
184
190
  [Test: 'AC1: Successful payment creates persisted order with correct status']
185
191
 
186
192
  // AC1-error: "Payment failure shows user-friendly error message"
187
- // ROI: 72 | Business Value: 8 (prevents support tickets) | Frequency: 2 (rare)
193
+ // Value Score: 34 | Business Value: 8 (prevents support tickets) | Frequency: 2 (rare)
188
194
  // Behavior: Payment fails → User sees actionable error + Order not created
189
195
  // @category: core-functionality
190
196
  // @dependency: PaymentService, ErrorHandler
@@ -204,7 +210,7 @@ Adapt comment syntax to the project's language when generating annotations.
204
210
 
205
211
  [Test suite using detected framework syntax]
206
212
  // User Journey: Complete purchase flow (browse → add to cart → checkout → payment → confirmation)
207
- // ROI: 95 | Business Value: 10 (business-critical) | Frequency: 10 (core flow) | Legal: true (PCI compliance)
213
+ // Value Score: 120 | Business Value: 10 (business-critical) | Frequency: 10 (core flow) | Legal: true (PCI compliance)
208
214
  // Verification: End-to-end user experience from product selection to order confirmation
209
215
  // @category: e2e
210
216
  // @dependency: full-system
@@ -214,6 +220,22 @@ Adapt comment syntax to the project's language when generating annotations.
214
220
 
215
221
  ### Generation Report
216
222
 
223
+ ```json
224
+ {
225
+ "status": "completed",
226
+ "feature": "[feature name]",
227
+ "generatedFiles": {
228
+ "integration": "[path]/[feature].int.test.[ext]",
229
+ "e2e": null
230
+ },
231
+ "budgetUsage": {
232
+ "integration": "2/3",
233
+ "e2e": "0/2"
234
+ },
235
+ "e2eAbsenceReason": "all_e2e_candidates_below_threshold"
236
+ }
237
+ ```
238
+
217
239
  ```json
218
240
  {
219
241
  "status": "completed",
@@ -225,7 +247,8 @@ Adapt comment syntax to the project's language when generating annotations.
225
247
  "budgetUsage": {
226
248
  "integration": "2/3",
227
249
  "e2e": "1/2"
228
- }
250
+ },
251
+ "e2eAbsenceReason": null
229
252
  }
230
253
  ```
231
254
 
@@ -249,7 +272,7 @@ These annotations are used when planning and prioritizing test implementation.
249
272
  - Stay within test budget; report if budget insufficient for critical tests
250
273
 
251
274
  **Quality Standards**:
252
- - Generate tests corresponding to high-ROI ACs only
275
+ - Generate tests corresponding to high-value ACs only
253
276
  - Apply behavior-first filtering strictly
254
277
  - Eliminate duplicate coverage (search existing tests to check)
255
278
  - Clarify dependencies explicitly
@@ -259,13 +282,13 @@ These annotations are used when planning and prioritizing test implementation.
259
282
 
260
283
  ### Auto-processable
261
284
  - **Directory Absent**: Auto-create appropriate directory following detected test structure
262
- - **No High-ROI Tests**: Valid outcome - report "All ACs below ROI threshold or covered by existing tests"
285
+ - **No E2E Selected**: Valid outcome when accompanied by `e2eAbsenceReason`
263
286
  - **Budget Exceeded by Critical Test**: Report to user
264
287
 
265
288
  ### Escalation Required
266
289
  1. **Critical**: AC absent, Design Doc absent → Error termination
267
290
  2. **High**: All ACs filtered out but feature is business-critical → User confirmation needed
268
- 3. **Medium**: Budget insufficient for critical user journey (ROI > 90) → Present options
291
+ 3. **Medium**: Budget insufficient for critical user journey (Value Score > 90) → Present options
269
292
  4. **Low**: Multiple interpretations possible but minor impact → Adopt interpretation + note in report
270
293
 
271
294
  ## Technical Specifications
@@ -288,7 +311,7 @@ These annotations are used when planning and prioritizing test implementation.
288
311
  - Existing test coverage check
289
312
  - **During execution**:
290
313
  - Behavior-first filtering applied to all ACs
291
- - ROI calculations documented
314
+ - Value calculations documented
292
315
  - Budget compliance monitored
293
316
  - **Post-execution**:
294
317
  - Completeness of selected tests
@@ -300,7 +323,7 @@ These annotations are used when planning and prioritizing test implementation.
300
323
 
301
324
  ☐ All completion criteria met with evidence
302
325
  ☐ Output format validated (test files + generation report)
303
- ☐ Quality standards satisfied (budget enforcement, ROI filtering applied)
326
+ ☐ Quality standards satisfied (budget enforcement, value-based filtering applied)
304
327
 
305
328
  **ENFORCEMENT**: HALT if any gate unchecked. Return incomplete status to caller.
306
329
  """
@@ -121,6 +121,8 @@ Evidence rules:
121
121
  - Existence claims must be verified with Grep or file enumeration before reporting
122
122
  - Behavioral claims must be backed by reading the implementation, not by naming alone
123
123
  - Identifier claims must compare exact strings from code against the document
124
+ - Literal identifier referential integrity checks are required for concrete paths, endpoints, type names, config keys, table names, enum values, and other exact identifiers written in the document
125
+ - Identifier existence verification may rely on a single authoritative source when that source is the definition itself; this is the exception to the normal 2-source rule
124
126
  - Single-source findings remain low confidence
125
127
 
126
128
  ### Step 4: Consistency Classification
@@ -247,7 +249,7 @@ If `verifiableClaimCount < 20`, treat the score as unstable and return to Step 1
247
249
  - [ ] Existence claims are backed by Grep or enumeration evidence
248
250
  - [ ] Behavioral claims are backed by reading the actual implementation
249
251
  - [ ] Identifier comparisons use exact strings from code
250
- - [ ] Each classification cites multiple sources (not single-source)
252
+ - [ ] Each classification cites multiple sources unless the finding is a literal identifier existence check against its authoritative definition
251
253
  - [ ] Low-confidence classifications are explicitly noted
252
254
  - [ ] Contradicting evidence is documented, not ignored
253
255
  - [ ] `reverseCoverage` includes concrete counts from tool-backed enumeration
@@ -36,26 +36,74 @@ Skill Status:
36
36
 
37
37
  ## Detection Criteria (The Only Rule)
38
38
 
39
- **Detection Target**: Items explicitly documented in the source file that have different values in other files. Detection is limited to these items only all other elements are outside scope.
40
-
41
- **Rationale**: Inference-based detection (e.g., "if A is B, then C should be D") risks destroying design intent. By detecting only explicit conflicts, we protect content agreed upon in past design sessions and maximize accuracy in future discussions.
42
-
43
- **Same Concept Criteria**:
44
- - Defined within the same section
45
- - Or explicitly noted as "= [alias]" or "alias: [xxx]"
46
-
47
- ## Responsibilities
48
-
49
- 1. Detect explicit conflicts between Design Docs
50
- 2. Classify conflicts and determine severity
51
- 3. Provide structured reports
52
- 4. **Scope limited to detection and reporting** (conflict resolution is outside this agent's scope)
53
-
54
- ## Out of Scope
55
-
56
- - Consistency checks with PRD/ADR
57
- - Quality checks for single documents (spawn document-reviewer agent)
58
- - Automatic conflict resolution
39
+ **Detection Target**: Only compare items explicitly extractable from the source file. Ignore items that are not documented in the source file.
40
+
41
+ **Rationale**: design-sync is a candidate generator for downstream review by the orchestrator and/or a human reviewer. Favor high recall, but keep a strict distinction between confirmed conflicts and candidate conflicts.
42
+
43
+ ### Match Basis Rules
44
+
45
+ Each detected item MUST include `match_basis` and `confidence`.
46
+
47
+ **high confidence** (confirmed conflict):
48
+ - `exact_string`: identical identifier string in both documents
49
+ - `explicit_alias`: one document explicitly links the identifier to an alias in the other
50
+
51
+ **medium confidence** (candidate conflict):
52
+ - `same_endpoint_role`: same user-facing or service-facing endpoint role with differing path/version/handler details
53
+ - `same_integration_role`: same service or component role in the same flow stage with differing method names, parameters, or outputs
54
+ - `same_ac_slot`: same user action or trigger and same outcome category, but differing conditions, constraints, or thresholds
55
+ - `same_numeric_role`: same normalized config or threshold role with differing numeric values
56
+ - `same_term_role`: same normalized domain term role with differing definition text
57
+
58
+ **Candidate evidence rule**:
59
+ - Medium-confidence matches MUST include a `reason`
60
+ - `reason` MUST describe the structural evidence connecting the items
61
+ - Candidate conflicts require a valid medium-confidence `match_basis`; do not invent new values
62
+
63
+ **General shared signals for candidate matching**:
64
+ - `resource_key`: normalized resource or domain noun extracted from the identifier. Match only when the normalized noun is identical after singular/plural normalization
65
+ - `trigger_key`: normalized trigger phrase describing the initiating user or system action. Match only when the normalized verb family and target resource are both identical
66
+ - `outcome_key`: normalized observable result phrase. Match only when the normalized outcome verb family and target resource are both identical
67
+ - `stage_key`: one of `route_entry`, `service_entry`, `validation`, `persistence_read`, `persistence_write`, `event_emit`, `render`, `other`. Match only when the enumerated stage is identical
68
+
69
+ **Numeric-role signals**:
70
+ - `numeric_key`: normalized config or threshold identifier. Match only when the normalized numeric role is identical
71
+ - `unit_key`: normalized measurement unit (`ms`, `seconds`, `minutes`, `percent`, `count`, `bytes`). Match only when the unit is identical
72
+ - `scope_key`: normalized feature or subsystem scope for the numeric parameter (`retry`, `checkout`, `auth`, `cache`). Match only when the scope is identical
73
+
74
+ **Term-role signals**:
75
+ - `term_key`: normalized canonical term name. Match only when the normalized term is identical
76
+ - `subject_key`: normalized subject that the term defines (`order_fulfillment`, `session_lifecycle`, `inventory_reservation`). Match only when the subject is identical
77
+ - `boundary_key`: normalized boundary or span phrase expressed by the definition (`payment_confirmation_to_carrier_handoff`). Match only when the normalized boundary is identical
78
+
79
+ **Signal counting rule**:
80
+ - For `same_endpoint_role`, `same_integration_role`, and `same_ac_slot`, count only `resource_key`, `trigger_key`, `outcome_key`, and `stage_key`
81
+ - For `same_numeric_role`, count only `numeric_key`, `unit_key`, and `scope_key`
82
+ - For `same_term_role`, count only `term_key`, `subject_key`, and `boundary_key`
83
+ - Count a signal only when its matching rule is satisfied exactly
84
+ - Never count `trigger_key` or `outcome_key` when the value is `none`
85
+ - Never count `stage_key` when its value is `other`
86
+ - Never count `unit_key`, `scope_key`, `subject_key`, or `boundary_key` when the value is `none`
87
+ - `same_endpoint_role`, `same_integration_role`, and `same_ac_slot` require at least 2 matching signals
88
+ - `same_numeric_role` requires `numeric_key` plus at least 1 of `unit_key` or `scope_key`
89
+ - `same_term_role` requires `term_key` plus at least 1 of `subject_key` or `boundary_key`
90
+
91
+ **Match basis selection order**:
92
+ 1. `exact_string`
93
+ 2. `explicit_alias`
94
+ 3. `same_endpoint_role`
95
+ 4. `same_integration_role`
96
+ 5. `same_ac_slot`
97
+ 6. `same_numeric_role`
98
+ 7. `same_term_role`
99
+
100
+ Select the first rule in this order whose requirements are satisfied.
101
+
102
+ **Core constraints**:
103
+ - Confirmed conflicts use only `exact_string` or `explicit_alias`
104
+ - Candidate conflicts require an allowed medium-confidence `match_basis` plus that match-basis' required signals
105
+ - Section proximity alone does not establish the same design slot
106
+ - Scope is detection and reporting only. Provide conflict recommendations, but do not resolve conflicts
59
107
 
60
108
  ## Input Parameters
61
109
 
@@ -73,53 +121,107 @@ Skill Status:
73
121
 
74
122
  Read the Design Doc specified in arguments and extract:
75
123
 
76
- **Extraction Targets**:
77
- - **Term definitions**: Proper nouns, technical terms, domain terms
78
- - **Type definitions**: Interfaces, type aliases, data structures
79
- - **Numeric parameters**: Configuration values, thresholds, timeout values
80
- - **Component names**: Service names, class names, function names
81
- - **Integration points**: Connection points with other components
82
- - **Acceptance criteria**: Specific conditions for functional requirements
124
+ **Extraction Targets**: term definitions, type definitions, numeric parameters, component names, path identifiers, integration points, and acceptance criteria.
125
+
126
+ **Extraction Output**:
127
+ ```yaml
128
+ - identifier: "[exact string from source document]"
129
+ category: "[term | type | numeric | component | path | integration | acceptance-criteria]"
130
+ section: "[section where found]"
131
+ context: "[definition | reference | constraint]"
132
+ resource_key: "[normalized noun or none]"
133
+ trigger_key: "[normalized trigger phrase or none]"
134
+ outcome_key: "[normalized observable result phrase or none]"
135
+ stage_key: "[route_entry | service_entry | validation | persistence_read | persistence_write | event_emit | render | other]"
136
+ numeric_key: "[normalized config or threshold identifier, else none]"
137
+ unit_key: "[normalized measurement unit, else none]"
138
+ scope_key: "[normalized feature or subsystem scope, else none]"
139
+ term_key: "[normalized canonical term name, else none]"
140
+ subject_key: "[normalized definition subject, else none]"
141
+ boundary_key: "[normalized boundary/span phrase, else none]"
142
+ alias_of: "[exact identifier if explicitly aliased, else none]"
143
+ ```
144
+
145
+ **Key derivation rules**:
146
+ - `resource_key`: normalize the primary domain noun to lowercase snake_case singular. For URL paths, use the last non-parameter path segment. For component/service/class identifiers, use the leading domain noun before suffixes such as `Service`, `Controller`, `Repository`, `Client`. For free-text terms, use the canonical term noun phrase
147
+ - `trigger_key`: normalize to lowercase verb-plus-target form only when the text describes an initiating action. For endpoints, use HTTP method plus normalized resource (for example `post_order`). For acceptance criteria, use the triggering action phrase. Otherwise use `none`
148
+ - `outcome_key`: normalize to lowercase verb-plus-target form only when the text describes an observable result or produced artifact. For pure config names or pure term definitions, use `none`
149
+ - `stage_key`: assign the closest enumerated lifecycle stage from the identifier/context. Use `other` only when no enumerated stage applies
150
+ - `numeric_key`: for numeric parameters, normalize the parameter name to lowercase snake_case without the value or unit (for example `retry_backoff_initial_delay`). Otherwise use `none`
151
+ - `unit_key`: for numeric parameters, normalize the unit token to lowercase canonical form (`ms`, `seconds`, `minutes`, `percent`, `count`, `bytes`). Otherwise use `none`
152
+ - `scope_key`: for numeric parameters, normalize the owning feature or subsystem to lowercase snake_case (`retry`, `checkout`, `auth`, `cache`). Otherwise use `none`
153
+ - `term_key`: for term definitions, normalize the canonical term name to lowercase snake_case. Otherwise use `none`
154
+ - `subject_key`: for term definitions, normalize the subject being defined to lowercase snake_case. Otherwise use `none`
155
+ - `boundary_key`: for term definitions, normalize the boundary/span phrase to lowercase snake_case with `from_to` wording when applicable. Otherwise use `none`
156
+ - `alias_of`: set to the exact referenced identifier only when the document explicitly states an alias/equivalence. Otherwise use `none`
83
157
 
84
158
  ### 2. Survey All Design Docs
85
159
 
86
- - Search docs/design/*.md (excluding template)
87
- - Read all files except source_design
88
- - Detect conflict patterns
160
+ - Search `docs/design/*.md` excluding the template and `source_design`
161
+ - Extract target-document items with the same schema and key derivation rules
89
162
 
90
163
  ### 3. Conflict Classification and Severity Assessment
91
164
 
92
- **Explicit Conflict Detection Process**:
93
- 1. Extract each item (terms, types, numbers, names) from source file
94
- 2. Search for same item names in other files
95
- 3. Record as conflict only if values differ
96
- 4. Items not in source file are not detection targets
165
+ **Conflict Detection Process**:
166
+ 1. Extract each item from the source file using the extraction output format
167
+ 2. Derive and record all normalized keys for each extracted item: `resource_key`, `trigger_key`, `outcome_key`, `stage_key`, `numeric_key`, `unit_key`, `scope_key`, `term_key`, `subject_key`, `boundary_key`, and `alias_of`
168
+ 3. Extract candidate items from each target document using the same extraction output format and key derivation rules
169
+ 4. For each source item, search all normalized target-document items for matches using Match Basis Rules
170
+ 5. Select `match_basis` using the required selection order
171
+ 6. Record a `confirmed_conflict` when values differ and confidence is high
172
+ 7. Record a `candidate_conflict` when values differ and confidence is medium
173
+ 8. Items not in the source file are not detection targets
174
+
175
+ **explicit_alias application rule**:
176
+ - Apply `explicit_alias` only when one item's `alias_of` equals the other item's exact `identifier`
177
+ - Do not infer aliases from similarity; the alias relationship must be explicit in the document text
178
+
179
+ **Category to candidate match-basis mapping**:
180
+
181
+ | Source category | Allowed medium-confidence match_basis |
182
+ |----------------|----------------------------------------|
183
+ | `path` | `same_endpoint_role` |
184
+ | `integration`, `component` | `same_integration_role` |
185
+ | `acceptance-criteria` | `same_ac_slot` |
186
+ | `numeric` | `same_numeric_role` |
187
+ | `term` | `same_term_role` |
188
+ | `type` | none — use only high-confidence matching |
97
189
 
98
190
  | Conflict Type | Criteria | Severity |
99
191
  |--------------|----------|----------|
100
- | **Type definition mismatch** | Different properties in same interface | critical |
101
- | **Numeric parameter mismatch** | Different values for same config item | high |
102
- | **Term inconsistency** | Different notation for same concept | medium |
103
- | **Integration point conflict** | Mismatch in connection target/method | critical |
104
- | **Acceptance criteria conflict** | Different conditions for same feature | high |
105
- | **No conflict** | Item not in source file | - |
192
+ | **Type definition mismatch** | Same type/interface role, different properties or field types | critical |
193
+ | **Path or integration point conflict** | Same path or integration role, different target/method/handler | critical |
194
+ | **Numeric parameter mismatch** | Same config role, different value | high |
195
+ | **Acceptance criteria conflict** | Same AC slot, different conditions or thresholds | high |
196
+ | **Term definition mismatch** | Same term role, different definition text | medium |
106
197
 
107
198
  ### 4. Decision Flow
108
199
 
109
200
  ```
110
- Documented in source file?
201
+ Item extracted from source file?
111
202
  No → Not a detection target (end)
112
- Yes → Value differs from other files?
113
- No → No conflict (end)
114
- Yes → Proceed to severity assessment
203
+ Yes → Match found in other files via Match Basis Rules?
204
+ No → No comparison target (end)
205
+ Yes → Select highest-priority applicable match_basis
206
+ No valid match_basis → No conflict (end)
207
+ high-confidence basis → Value/definition/referent differs?
208
+ No → No conflict (end)
209
+ Yes → Record confirmed_conflict
210
+ medium-confidence basis → Do the required signals for that match_basis match exactly?
211
+ No → No conflict (end)
212
+ Yes → Value/definition/referent differs?
213
+ No → No conflict (end)
214
+ Yes → Record candidate_conflict
115
215
 
116
216
  Severity Assessment:
117
- - Type/integration point → critical (implementation error risk)
217
+ - Type/path/integration point → critical (implementation error risk)
118
218
  - Numeric/acceptance criteria → high (behavior impact)
119
219
  - Term → medium (confusion risk)
120
220
  ```
121
221
 
122
- **When in doubt**: Ask only "Is there explicit documentation for this item in the source file?" If No, skip (outside detection scope).
222
+ **When in doubt**:
223
+ - If the item is not explicitly documented in the source file, skip it
224
+ - Otherwise apply the category mapping and required-signal rule
123
225
 
124
226
  ## Output Format
125
227
 
@@ -138,12 +240,16 @@ Severity Assessment:
138
240
  "critical": 1,
139
241
  "high": 1,
140
242
  "medium": 0,
243
+ "confirmed_conflicts": 1,
244
+ "candidate_conflicts": 1,
141
245
  "sync_status": "CONFLICTS_FOUND"
142
246
  },
143
- "conflicts": [
247
+ "confirmed_conflicts": [
144
248
  {
145
249
  "id": "CONFLICT-001",
146
250
  "severity": "critical",
251
+ "confidence": "high",
252
+ "match_basis": "exact_string",
147
253
  "type": "Type definition mismatch",
148
254
  "source_file": "[source file]",
149
255
  "source_location": "[section/line]",
@@ -154,11 +260,30 @@ Severity Assessment:
154
260
  "recommendation": "[Recommend unifying to source file's value]"
155
261
  }
156
262
  ],
263
+ "candidate_conflicts": [
264
+ {
265
+ "id": "CANDIDATE-001",
266
+ "severity": "high",
267
+ "confidence": "medium",
268
+ "match_basis": "same_ac_slot",
269
+ "type": "Acceptance criteria conflict",
270
+ "source_file": "[source file]",
271
+ "source_location": "[section/line]",
272
+ "source_value": "[content in source file]",
273
+ "target_file": "[file with conflict]",
274
+ "target_location": "[section/line]",
275
+ "target_value": "[conflicting content]",
276
+ "reason": "[structural evidence linking the items]",
277
+ "recommendation": "[Recommend reviewing whether these describe the same design slot]"
278
+ }
279
+ ],
157
280
  "no_conflicts_docs": ["[filename1]", "[filename2]"]
158
281
  }
159
282
  ```
160
283
 
161
- When no conflicts: `"sync_status": "NO_CONFLICTS"`, `"conflicts": []`
284
+ `total_conflicts` MUST equal `confirmed_conflicts + candidate_conflicts`.
285
+
286
+ When no conflicts: `"sync_status": "NO_CONFLICTS"`, `"confirmed_conflicts": []`, `"candidate_conflicts": []`
162
287
 
163
288
  ### SKIP Status
164
289
 
@@ -175,7 +300,8 @@ When fewer than 2 Design Docs exist, return immediately:
175
300
  "sync_status": "SKIPPED",
176
301
  "reason": "fewer_than_2_design_docs"
177
302
  },
178
- "conflicts": []
303
+ "confirmed_conflicts": [],
304
+ "candidate_conflicts": []
179
305
  }
180
306
  ```
181
307
 
@@ -183,44 +309,107 @@ ENFORCEMENT: sync_status MUST be one of: CONFLICTS_FOUND | NO_CONFLICTS | SKIPPE
183
309
 
184
310
  ## Detection Pattern Examples
185
311
 
186
- ### Type Definition Mismatch
312
+ ### High confidence: exact_string (type definition)
187
313
  ```
188
314
  Source Design Doc:
189
- User
190
- id: string
191
- email: string
192
- role: admin | user
315
+ OrderItem
316
+ quantity: number
317
+ unitPrice: number
193
318
 
194
319
  Other Design Doc (conflict):
195
- User
196
- id: number # different type
197
- email: string
198
- userRole: string # different property name and type
320
+ OrderItem
321
+ quantity: string # different type
322
+ unitPrice: number
323
+ discount: number # extra property
199
324
  ```
200
325
 
201
- ### Numeric Parameter Mismatch
326
+ ### Medium confidence: same_endpoint_role
202
327
  ```
203
328
  # Source Design Doc
204
- Session timeout: 30 minutes
329
+ POST /api/v2/orders -> OrderController.create
205
330
 
206
331
  # Other Design Doc (conflict)
207
- Session timeout: 60 minutes
332
+ POST /api/v1/orders -> OrderController.submit
208
333
  ```
209
334
 
210
- ### Integration Point Conflict
335
+ Report as `candidate_conflict` when the shared signals are:
336
+ - same resource key: orders
337
+ - same trigger key: post_order
338
+ - same stage key: route_entry
339
+
340
+ ### Medium confidence: same_ac_slot
211
341
  ```
212
342
  # Source Design Doc
213
- Integration: UserService.authenticate() SessionManager.create()
343
+ When user submits valid credentials, the system creates a session with 30-minute expiry
214
344
 
215
345
  # Other Design Doc (conflict)
216
- Integration: UserService.login() TokenService.generate()
346
+ When user submits valid credentials, the system issues a JWT with 60-minute expiry
217
347
  ```
218
348
 
349
+ Report as `candidate_conflict` when the shared signals are:
350
+ - same trigger key: submit valid credentials
351
+ - same outcome key: successful sign-in
352
+
353
+ ### Medium confidence: same_numeric_role
354
+ ```
355
+ # Source Design Doc
356
+ Retry backoff initial delay: 100 ms
357
+
358
+ # Other Design Doc (conflict)
359
+ Initial retry delay: 250 ms
360
+ ```
361
+
362
+ Report as `candidate_conflict` when the signals are:
363
+ - same numeric key: retry_backoff_initial_delay
364
+ - same unit key: ms
365
+ - same scope key: retry
366
+
367
+ ### Medium confidence: same_term_role
368
+ ```
369
+ # Source Design Doc
370
+ Fulfillment window = time between payment confirmation and carrier handoff
371
+
372
+ # Other Design Doc (conflict)
373
+ Order fulfillment window = time between payment confirmation and warehouse pick start
374
+ ```
375
+
376
+ Report as `candidate_conflict` when the signals are:
377
+ - same term key: fulfillment_window
378
+ - same subject key: order_fulfillment
379
+
380
+ ### Not a candidate conflict
381
+ ```
382
+ # Source Design Doc
383
+ POST /api/users/register
384
+
385
+ # Other Design Doc
386
+ POST /api/accounts/signup
387
+ ```
388
+
389
+ Do not report this pair when only one shared signal is present or when no allowed match_basis applies.
390
+
391
+ ### Not a numeric-role candidate conflict
392
+ ```
393
+ # Source Design Doc
394
+ Retry backoff initial delay: 100 ms
395
+
396
+ # Other Design Doc
397
+ Retry limit: 3
398
+ ```
399
+
400
+ Do not report this pair when `scope_key: retry` matches but `numeric_key` does not.
401
+
219
402
  ## Quality Checklist
220
403
 
221
404
  - [ ] Correctly read source_design
222
- - [ ] Surveyed all Design Docs (excluding template)
223
- - [ ] Detected only explicit conflicts (avoided inference-based detection)
405
+ - [ ] Surveyed all target Design Docs
406
+ - [ ] Extracted source and target items using the same schema and key derivation rules
407
+ - [ ] Recorded all required normalized keys for extracted items
408
+ - [ ] Match basis selected using the required priority order
409
+ - [ ] High-confidence conflicts use only `exact_string` or `explicit_alias`
410
+ - [ ] Candidate conflicts use only allowed medium-confidence match-basis values and required signals
411
+ - [ ] `explicit_alias` is used only when `alias_of` equals the counterpart's exact identifier
412
+ - [ ] Medium-confidence conflicts include `reason` with structural evidence
224
413
  - [ ] Correctly assigned severity to each conflict
225
414
  - [ ] Output in JSON format
226
415
 
@@ -234,16 +423,7 @@ Integration: UserService.login() → TokenService.generate()
234
423
 
235
424
  - All target files have been read
236
425
  - JSON output completed
237
- - All quality checklist items verified
238
-
239
- ## Important Notes
240
-
241
- ### Scope: Detection and Reporting Only
242
- design-sync **specializes in detection and reporting**. Conflict resolution is handled by the orchestrator or other agents.
243
-
244
- ### Relationship with document-reviewer
245
- - **document-reviewer**: Single document quality, completeness, and rule compliance
246
- - **design-sync**: Cross-document consistency verification (use after document-reviewer)
426
+ - Quality checklist items verified
247
427
 
248
428
  ## Completion Gate [BLOCKING]
249
429