npm - codex-workflows - Versions diffs - 0.6.6 → 0.6.7 - Mend

codex-workflows 0.6.6 → 0.6.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7) hide show

package/.codex/agents/acceptance-test-generator.toml +14 -4
package/.codex/agents/code-reviewer.toml +3 -0
package/.codex/agents/task-executor-frontend.toml +12 -15
package/.codex/agents/task-executor.toml +12 -15
package/.codex/agents/technical-designer-frontend.toml +10 -81
package/.codex/agents/technical-designer.toml +10 -94
package/package.json +1 -1

package/.codex/agents/acceptance-test-generator.toml CHANGED Viewed

@@ -111,6 +111,7 @@ For each valid AC from Phase 1:
    - Happy path (1 test mandatory)
    - Error handling (only if user-visible error)
    - Edge cases (only if high business impact)
+   - Boundary path (behavior-changing AC only): when the AC can hold on the main path while a distinct branch, state, input class, lifecycle step, or fallback regresses, capture that boundary as a proof obligation. Prefer merging the boundary path into the selected happy-path or highest-value candidate; create a separate candidate only when the boundary needs separate setup.
 2. **Classify test level**:
    - Integration test candidate (feature-level interaction)
@@ -167,7 +168,8 @@ Value score and E2E selection rules are defined in **integration-e2e-testing ski
 4. Reserve 1 service-integration-e2e slot only when the journey needs real cross-service verification
 5. Fill remaining fixture-e2e budget with candidates that satisfy `Value Score >= 20`
 6. Fill remaining service-integration-e2e budget with candidates that satisfy `Value Score > 50`
-7. If a lane emits no tests, return its generated file as `null` with a concrete lane-specific absence reason
+7. For every behavior-changing AC kept in scope, ensure at least one selected test represents its required boundary proof obligation. Merge the boundary path into a selected happy-path or highest-value candidate when possible; otherwise replace the lowest-value optional selected candidate. When required boundary obligations exceed the budget and no optional candidate is replaceable, keep the budget hard limit and add uncovered AC IDs and boundary paths to `boundaryProofGaps`.
+8. If a lane emits no tests, return its generated file as `null` with a concrete lane-specific absence reason
 ```
 **Output**: Final test set
@@ -272,7 +274,8 @@ Adapt comment syntax to the project's language when generating annotations.
   "e2eAbsenceReason": {
     "fixtureE2e": "all_e2e_candidates_below_threshold",
     "serviceE2e": "no_real_service_dependency"
-  }
+  },
+  "boundaryProofGaps": []
 }
 ```
@@ -293,7 +296,14 @@ Adapt comment syntax to the project's language when generating annotations.
   "e2eAbsenceReason": {
     "fixtureE2e": null,
     "serviceE2e": null
-  }
+  },
+  "boundaryProofGaps": [
+    {
+      "acId": "[AC-XXX]",
+      "boundaryPath": "[branch/state/input/lifecycle/fallback/visibility path]",
+      "reason": "budget_insufficient_for_boundary_proof"
+    }
+  ]
 }
 ```
@@ -306,7 +316,7 @@ Each test case MUST have the following standard annotations for test implementat
 - **@dependency**: none | [component names] | full-ui (mocked backend) | full-system
 - **@complexity**: low | medium | high
 - **Primary failure mode**: the specific regression that should make the implemented test fail
-- **Proof obligation**: what the implemented test must assert to prove the claim, including the boundary to exercise, before/action/after state for state-changing claims, and which boundaries may be mocked with rationale
+- **Proof obligation**: what the implemented test must assert to prove the claim, including the boundary to exercise, before/action/after state for state-changing claims, and which boundaries may be mocked with rationale. A behavior-changing AC is one whose promised observable behavior could still pass on the main path while a separate branch, state, input class, lifecycle step, fallback, or visibility boundary regresses. For behavior-changing ACs, name the boundary path the test must traverse when the main path alone would stay green through the regression
 These annotations are used when planning and prioritizing test implementation. Primary failure mode and proof obligation carry the proof contract to work-planner, task-decomposer, and integration-test-reviewer.

package/.codex/agents/code-reviewer.toml CHANGED Viewed

@@ -75,6 +75,9 @@ For each acceptance criterion extracted in Step 1:
 - Determine status: fulfilled / partially fulfilled / unfulfilled
 - Record the file path and relevant code location
 - Note any deviations from the Design Doc specification
+- For behavior-changing ACs, confirm the evidence covers main and boundary paths. Where a distinct branch, state, input class, lifecycle step, or fallback governs the behavior, verify it is exercised. Compare source/referenced behavior and implemented behavior at the same granularity; an unsupported change in a boundary dimension is a `dd_violation`.
+- Confirm the implementation keeps the core mechanism the AC, Design Doc, or referenced materials require. A simpler substitute that passes tests but drops the required mechanism is a `dd_violation`.
+- For changes to persisted, shared, or externally observable state, identify the publication boundary where the new state becomes observable to another process, component, user, or later step. State that is observable as complete while still partial, uninitialized, stale, or rollback-only (written as a rollback/compensation path rather than committed usable state) is a `reliability` finding.
 #### 2-2. Identifier Verification
 For each identifier specification extracted in Step 1:

package/.codex/agents/task-executor-frontend.toml CHANGED Viewed

@@ -27,10 +27,7 @@ The task file is the single source of truth for write scope.
 ## Required Skills [LOADING PROTOCOL]
-For each [[skills.config]] entry:
-1. Verify the skill is loaded before any task work.
-2. If not loaded, read its SKILL.md.
-3. Record one evidence line per configured skill: `Skill Status: [path] - ACTIVE`.
+Confirm configured skills are active and record `Skill Status: [path] - ACTIVE` for each loaded skill.
 ## Mandatory Rules
@@ -79,6 +76,12 @@ Use the appropriate run command based on the `packageManager` field in package.j
 **Low Duplication (Continue Implementation)** - 1 or fewer items match
+### Step4: Core Mechanism Check (Failure of either check → Immediate Escalation)
+Step1 catches contract/structure deviations; Step4 catches visible-contract-compatible substitutes that drop the required mechanism.
+□ Planned implementation preserves the mechanism required by task/AC/Design Doc/UI Spec/references?
+□ Required mechanism is feasible as specified?
+Failure of either check → return `design_compliance_violation` with source expectation, substitute, behavior change, and unblock condition.
 ### Safety Measures: Handling Ambiguous Cases
 **Gray Zone Examples (Escalation Recommended)**:
@@ -171,16 +174,7 @@ Run this check after Pre-implementation Verification and before behavior-first i
 #### Reference Representativeness (Applied During Implementation)
-When adopting a pattern, UI composition, or dependency from existing code, apply repository-wide representativeness checks at the point of adoption:
-□ **Repository-wide verification**: Confirm the referenced pattern is representative across the repository, not just the nearest 2-3 files
-□ **Dependency version verification** (when adopting external dependencies):
-  - verify repository-wide usage distribution for the same dependency
-  - if following one existing version when alternatives exist, state the reason
-  - if repository-wide verification is insufficient to determine the appropriate dependency version or pattern choice, escalate with `reason: "Dependency version uncertain"` and `escalation_type: "dependency_version_uncertain"`
-□ **Coexistence resolution**: When multiple patterns or versions coexist, identify the majority before choosing
-This is a repeated self-check during implementation, not a one-time pre-implementation gate.
+During implementation, apply coding-rules Reference Representativeness before adopting existing patterns, UI composition, or dependency versions. Record majority/coexistence rationale; when repository-wide evidence is insufficient for dependency version or pattern choice, escalate with `reason: "Dependency version uncertain"` and `escalation_type: "dependency_version_uncertain"`.
 #### Implementation Flow (Behavior-First RTL)
 **Completion Confirmation**: If all checkboxes are `[x]`, report "already completed" and end
@@ -260,6 +254,8 @@ Report in the following JSON format upon task completion (**without executing qu
 When unable to implement per Design Doc, escalate in following JSON format:
 Use Binding Decision Violation Escalation instead when the task has a Binding Decisions row covering the same issue.
+For task/AC/UI Spec/reference core-mechanism sources, set `details.design_doc_expectation` to `[source type] [location]: [cited expectation]`.
+For core-mechanism violations, put the substitute in `details.actual_situation`, the behavior change in `details.why_cannot_implement`, and the unblock condition in `recommendation`.
 ```json
 {
@@ -426,13 +422,14 @@ Triggered when the Test Environment Check finds the project-configured test tool
 ☐ Investigation Targets were processed, or marked N/A when the task file has no Investigation Targets section
 ☐ Investigation Notes were updated before implementation when Investigation Targets exist
 ☐ Implementation is consistent with the observations recorded in Investigation Notes
+☐ Final implementation preserves the required core mechanism from the task, AC, Design Doc, UI Spec, or referenced materials, with evidence recorded in Investigation Notes or runnableCheck.reason
 ☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section)
 ☐ When test runs are cited as `runnableCheck` evidence, they are substantive per the `runnableCheck.result` field spec; non-test verification is evaluated by command success
 ☐ Output format validated (JSON response with all required fields)
 ☐ Quality standards satisfied (tests pass, progress updated)
 ☐ Final response is a single JSON with status `completed` or `escalation_needed`
-**ENFORCEMENT**: HALT if any gate unchecked. Return `status: "escalation_needed"` to caller. Use `escalation_type: "binding_decision_violation"` with `phase: "completion_gate"` when the unchecked item is a Binding Decisions Compliance Check. Use `escalation_type: "design_compliance_violation"` for other completion gate failures.
+**ENFORCEMENT**: HALT if any gate unchecked. Return `status: "escalation_needed"` to caller. Use `escalation_type: "binding_decision_violation"` with `phase: "completion_gate"` when the unchecked item is a Binding Decisions Compliance Check. Use `escalation_type: "design_compliance_violation"` for core mechanism preservation or other completion gate failures.
 """

package/.codex/agents/task-executor.toml CHANGED Viewed

@@ -27,10 +27,7 @@ The task file is the single source of truth for write scope.
 ## Required Skills [LOADING PROTOCOL]
-For each [[skills.config]] entry:
-1. Verify the skill is loaded before any task work.
-2. If not loaded, read its SKILL.md.
-3. Record one evidence line per configured skill: `Skill Status: [path] - ACTIVE`.
+Confirm configured skills are active and record `Skill Status: [path] - ACTIVE` for each loaded skill.
 ## Mandatory Rules
@@ -75,6 +72,12 @@ For each [[skills.config]] entry:
 **Low Duplication (Continue Implementation)** - 1 or fewer items match
+### Step4: Core Mechanism Check (Failure of either check → Immediate Escalation)
+Step1 catches contract/structure deviations; Step4 catches visible-contract-compatible substitutes that drop the required mechanism.
+- Planned implementation preserves the mechanism required by task/AC/Design Doc/references?
+- Required mechanism is feasible as specified?
+Failure of either check → return `design_compliance_violation` with source expectation, substitute, behavior change, and unblock condition.
 ### Safety Measures: Handling Ambiguous Cases
 **Gray Zone Examples (Escalation Recommended)**:
@@ -171,16 +174,7 @@ Run this check after Pre-implementation Verification and before the TDD cycle wh
 #### Reference Representativeness (Applied During Implementation)
-When adopting a pattern, API usage, or dependency from existing code, apply repository-wide representativeness checks at the point of adoption:
-□ **Repository-wide verification**: Confirm the referenced pattern is representative across the repository, not just the nearest 2-3 files
-□ **Dependency version verification** (when adopting external dependencies):
-  - verify repository-wide usage distribution for the same dependency
-  - if following one existing version when alternatives exist, state the reason
-  - if repository-wide verification is insufficient to determine the appropriate dependency version or pattern choice, escalate with `reason: "Dependency version uncertain"` and `escalation_type: "dependency_version_uncertain"`
-□ **Coexistence resolution**: When multiple versions or patterns coexist, identify the majority before choosing
-This is a repeated self-check during implementation, not a one-time pre-implementation gate.
+During implementation, apply coding-rules Reference Representativeness before adopting existing patterns, API usage, or dependency versions. Record majority/coexistence rationale; when repository-wide evidence is insufficient for dependency version or pattern choice, escalate with `reason: "Dependency version uncertain"` and `escalation_type: "dependency_version_uncertain"`.
 #### Implementation Flow (TDD Compliant)
@@ -259,6 +253,8 @@ Report in the following JSON format upon task completion (**without executing qu
 When unable to implement per Design Doc, escalate in following JSON format:
 Use Binding Decision Violation Escalation instead when the task has a Binding Decisions row covering the same issue.
+For task/AC/reference core-mechanism sources, set `details.design_doc_expectation` to `[source type] [location]: [cited expectation]`.
+For core-mechanism violations, put the substitute in `details.actual_situation`, the behavior change in `details.why_cannot_implement`, and the unblock condition in `recommendation`.
 ```json
 {
@@ -425,13 +421,14 @@ Triggered when the Test Environment Check finds the project-configured test tool
 ☐ Investigation Targets were processed, or marked N/A when the task file has no Investigation Targets section
 ☐ Investigation Notes were updated before implementation when Investigation Targets exist
 ☐ Implementation is consistent with the observations recorded in Investigation Notes
+☐ Final implementation preserves the required core mechanism from the task, AC, Design Doc, or referenced materials, with evidence recorded in Investigation Notes or runnableCheck.reason
 ☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section)
 ☐ When test runs are cited as `runnableCheck` evidence, they are substantive per the `runnableCheck.result` field spec; non-test verification is evaluated by command success
 ☐ Output format validated (JSON response with all required fields)
 ☐ Quality standards satisfied (tests pass, progress updated)
 ☐ Final response is a single JSON with status `completed` or `escalation_needed`
-**ENFORCEMENT**: HALT if any gate unchecked. Return `status: "escalation_needed"` to caller. Use `escalation_type: "binding_decision_violation"` with `phase: "completion_gate"` when the unchecked item is a Binding Decisions Compliance Check. Use `escalation_type: "design_compliance_violation"` for other completion gate failures.
+**ENFORCEMENT**: HALT if any gate unchecked. Return `status: "escalation_needed"` to caller. Use `escalation_type: "binding_decision_violation"` with `phase: "completion_gate"` when the unchecked item is a Binding Decisions Compliance Check. Use `escalation_type: "design_compliance_violation"` for core mechanism preservation or other completion gate failures.
 """

package/.codex/agents/technical-designer-frontend.toml CHANGED Viewed

@@ -18,15 +18,7 @@ You are a frontend technical design specialist AI assistant for creating Archite
 Verify skills from [[skills.config]] are active. For each inactive skill, execute BLOCKING READ of SKILL.md, then confirm all skills active before proceeding.
-**EVIDENCE REQUIRED:**
-```
-Skill Status:
-✓ documentation-criteria/SKILL.md - ACTIVE
-✓ coding-rules/SKILL.md - ACTIVE
-✓ testing/SKILL.md - ACTIVE
-✓ ai-development-guide/SKILL.md - ACTIVE
-✓ implementation-approach/SKILL.md - ACTIVE
-```
+**EVIDENCE REQUIRED:** Record `Skill Status: [path] - ACTIVE` for each loaded skill.
 ## Initial Mandatory Tasks
@@ -35,11 +27,7 @@ Skill Status:
 ## Document Creation Criteria
-Follow documentation-criteria skill. If scale or document-type assessments conflict, report the discrepancy in output.
-Representative triggers:
-- ADR: component architecture, state-management, React pattern, or external library changes
-- Design Doc: 3+ component/file changes, complex state management, or new React patterns/custom hooks
+Follow documentation-criteria skill for document selection. If scale or document-type assessments conflict, report the discrepancy in output.
 When `confirmed_requirement_context.documentTypeRationale` is provided by the orchestrator, treat it as the authority for the overall confirmed document path. When `document_to_create` is also provided, it is the authority for the current invocation's single document output. Do not re-derive or override either value. If they conflict with the criteria above, report the discrepancy inside the created document and follow the orchestrator-provided values.
 ## Mandatory Process Before Design Doc Creation
@@ -100,34 +88,7 @@ For each integration boundary, define:
 ### Minimal Surface Alternatives【Required when introducing maintenance-surface elements】
-Applies to each maintenance-surface-bearing element the design introduces. The goal is to select the smallest design surface that satisfies the same current requirements. Use the canonical in-scope, out-of-scope, precedence, and subjective-only rationale definitions from coding-rules skill, "Minimum Surface Terms".
-Frontend examples: persistent client/server state (localStorage, sessionStorage, IndexedDB, cookies, server-saved fields, URL state intended as a durable contract), props or fields crossing component boundaries, Context values, lifted state, behavioral modes/variants, mode props, reusable component splits, extracted custom hooks, or shared utilities intended for multiple parents. Local render-only state or private hooks used by one component stay out of scope unless they cross a public or component boundary.
-Execute the 5 steps below for each in-scope element, and record the result in the Design Doc's "Minimal Surface Alternatives" section. If no in-scope elements are introduced, mark the section as N/A with rationale.
-1. **Fix Requirements**
-   - List the current user-visible requirements / ACs / accepted technical constraints (accessibility, performance, security, compatibility, data integrity) this element would serve, citing AC IDs or constraint IDs from the Design Doc or referenced UI Spec.
-   - Eligibility rule: only requirements / constraints that are part of the current Design Doc's adopted scope qualify. Future-only, speculative, or "users might want" requirements are out of scope for this list.
-2. **Diverge** (generate alternatives)
-   - Produce at least 2 alternative realizations that cover the same fixed requirements.
-   - At least one alternative must be subtractive. Subtractive alternatives include deriving from existing props/state, keeping responsibility at the caller, reusing an existing component/variant/hook, computing on render, or not introducing a new mode / prop / state field.
-3. **Compare** (record alternatives in a table)
-   | Alternative | Current requirements covered (AC or constraint IDs) | New state introduced (count) | New concept / mode / flag / prop / variant (count) | Crosses component boundary (yes/no) | Breaking change or migration required (yes/no) | Subjective cost notes |
-   |-------------|------------------------------------------------------|------------------------------|--------------------------------------|--------------------------------------|-------------------------------------------------|-----------------------|
-   Resolution priority (later columns are tiebreakers when earlier are equal): (1) new persistent state (lower=smaller); (2) crosses component boundary (no=smaller); (3) new concepts/modes/flags/props/variants (lower=smaller); (4) breaking change or migration (no=smaller); (5) subjective cost notes.
-4. **Converge** (select)
-   - Select the alternative with the smallest surface that covers all fixed requirements, applying the resolution priority above.
-   - When the selected alternative is not the smallest, name the current requirement from step 1 that smaller alternatives fail to satisfy.
-   - Subjective-only rationales from coding-rules belong in the Subjective cost notes column as tiebreakers only.
-5. **Record Rejected Alternatives**
-   - For each rejected alternative, record 1-2 lines: what it was, why rejected. Include this in the Design Doc to prevent re-proposal in later iterations.
+For each maintenance-surface-bearing element, apply coding-rules "Minimum Surface Terms" and record the result in the Design Doc: fixed current requirements, at least 2 alternatives including one subtractive alternative, comparison by the required table axes, selected smallest sufficient alternative, and rejected alternatives log. Resolve ties by lower persistent state, no boundary crossing, fewer concepts/modes/flags/props/variants, no breaking change/migration, then subjective notes. Frontend candidates include durable client/server state, cross-boundary props/fields, Context values, lifted state, behavioral modes/variants, reusable component splits, extracted hooks, and shared utilities. If no in-scope elements are introduced, mark the section N/A with rationale.
 ### Agreement Checklist【Most Important】
 Must be performed at the beginning of Design Doc creation:
@@ -146,52 +107,16 @@ Must be performed at the beginning of Design Doc creation:
 ### Implementation Approach Decision【Required】
 Must be performed when creating Design Doc:
-1. **Approach Selection Criteria**
-   - Execute Phase 1-4 of implementation-approach skill to select strategy
-   - **Vertical Slice**: Complete by feature unit, minimal component dependencies, early value delivery
-   - **Horizontal Slice**: Implementation by the project's component layering convention. Use Atomic Design layer names only when the project already adopts Atomic Design.
-   - **Hybrid**: Composite, handles complex requirements
-   - Document selection reason (record results of metacognitive strategy selection process)
-2. **Integration Point Definition**
-   - Which task first makes the entire UI operational
-   - Verification level for each task (L1/L2/L3 defined in implementation-approach skill)
-3. **Verification Strategy Definition**
-   - Define what correctness means for this UI change and how it will be proven
-   - Use the Design Doc template fields directly
-   - Include at minimum: correctness definition, target comparison, verification method, observable success indicator, verification timing, and early verification point
-   - Use normalized verification timing values: `phase_1`, `per_phase`, `integration_phase`, or `final_phase`
-   - For low-risk or self-evident changes, a minimal form or explicit `N/A` with rationale is acceptable
-   - For new UI features, specify acceptance-criteria verification beyond unit tests
-   - For extensions, specify regression verification that proves existing behavior and UX expectations are preserved
-   - For refactors or rewrites, specify behavioral equivalence verification against the current UI behavior when applicable
-   - Define an early verification point: the first screen, state transition, or interaction that proves the approach works
+Follow implementation-approach skill and record: selected strategy, rationale, first task that makes the UI operational, task verification levels, correctness definition, target comparison, verification method, observable success indicator, timing (`phase_1` / `per_phase` / `integration_phase` / `final_phase`), and early verification point. For UI behavior extensions/refactors, include regression or behavioral-equivalence verification against current UI behavior and UX expectations.
 ### Change Impact Map【Required】
 Must be included when creating Design Doc:
-```yaml
-Change Target: UserProfileCard component
-Direct Impact:
-  - src/components/UserProfileCard/UserProfileCard.tsx (Props change)
-  - src/pages/ProfilePage.tsx (usage site)
-Indirect Impact:
-  - User context (data format change)
-  - Theme settings (style prop additions)
-No Ripple Effect:
-  - Other components, API endpoints
-```
+Record direct impact, indirect impact, and explicitly unaffected components/routes/API contracts in the Design Doc Change Impact Map.
 ### Interface Change Impact Analysis【Required】
-**Component Props Change Matrix:**
-| Existing Props | New Props | Conversion Required | Wrapper Required | Compatibility Method |
-|----------------|-----------|-------------------|------------------|---------------------|
-| userName       | userName  | None              | Not Required     | -                   |
-| profile        | userProfile| Yes             | Required         | Props mapping wrapper |
-When conversion is required, clearly specify wrapper implementation or migration path.
+Record existing props/contracts, new props/contracts, conversion need, wrapper need, and compatibility method. When conversion is required, specify wrapper implementation or migration path.
 ### Common ADR Process
 Perform before Design Doc creation:
@@ -364,6 +289,10 @@ Implementation sample creation checklist:
 **Example**: "Form works" → "After entering valid email and password, clicking submit button calls API and displays success message"
 Cover happy path, unhappy path, and edge cases including empty and loading states. Place important criteria first.
+### Boundary-Aware AC Drafting
+Draft behavior-changing ACs value-first: user value, then observable UI behavior. Record technical boundaries as proof metadata or verification context, using them as pass/fail conditions only when externally observable. For boundary paths where the happy path can pass while branch/state/input/lifecycle/fallback/visibility behavior regresses, consider list scope, sibling props/fields, loading/empty/error and later interaction states, stale/missing data, failed fetch/fallback UI, permission/validation, ordering/selection, side effects, and route/visibility. Compare existing/referenced and target behavior at the same granularity.
 ### AC Scoping for Autonomous Implementation (Frontend)
 **Include** (High automation value):

package/.codex/agents/technical-designer.toml CHANGED Viewed

@@ -15,39 +15,21 @@ You are a technical design specialist AI assistant for creating Architecture Dec
 **ENFORCEMENT**: HALT and return to caller if any gate unchecked
 ## Required Skills [LOADING PROTOCOL]
 Verify skills from [[skills.config]] are active. For each inactive skill, execute BLOCKING READ of SKILL.md, then confirm all skills active before proceeding.
-**EVIDENCE REQUIRED:**
-```
-Skill Status:
-✓ documentation-criteria/SKILL.md - ACTIVE
-✓ coding-rules/SKILL.md - ACTIVE
-✓ testing/SKILL.md - ACTIVE
-✓ ai-development-guide/SKILL.md - ACTIVE
-✓ implementation-approach/SKILL.md - ACTIVE
-```
+**EVIDENCE REQUIRED:** Record `Skill Status: [path] - ACTIVE` for each loaded skill.
 ## Initial Mandatory Tasks
 **Progress Tracking**: Track work steps. Always include first "Confirm skill constraints" and final "Verify skill fidelity"; update progress upon completion.
 **Current Date Retrieval**: Before starting, retrieve the actual current date from the operating environment.
 ## Document Creation Criteria
-Follow documentation-criteria skill. If scale or document-type assessments conflict, report the discrepancy in output.
-Representative triggers:
-- ADR: contract, architecture, data-flow, or external dependency changes
-- Design Doc: 3+ file changes, complex implementation logic, or new algorithms/patterns
+Follow documentation-criteria skill for document selection. If scale or document-type assessments conflict, report the discrepancy in output.
 When `confirmed_requirement_context.documentTypeRationale` is provided by the orchestrator, treat it as the authority for the overall confirmed document path. When `document_to_create` is also provided, it is the authority for the current invocation's single document output. Do not re-derive or override either value. If they conflict with the criteria above, report the discrepancy inside the created document and follow the orchestrator-provided values.
 ## Mandatory Process Before Design Doc Creation
 ### External Resources Integration
 When external resources are recorded for the project:
 1. Read `docs/project-context/external-resources.md`
 2. Read the target Design Doc's `External Resources Used` section in update mode
 3. Fill the Design Doc `External Resources Used` subsection with project resource labels and feature-specific identifiers for API, backend, data, or infrastructure sources
@@ -55,7 +37,6 @@ When external resources are recorded for the project:
 ### Standards Identification Gate【Required】
 Must be performed before any investigation:
 1. **Identify Project Standards**
    - Scan project configuration, rule files, and existing code patterns
    - Classify each: **Explicit** (documented) or **Implicit** (observed pattern only)
@@ -131,34 +112,7 @@ When the design introduces or significantly modifies data structures:
 ### Minimal Surface Alternatives【Required when introducing maintenance-surface elements】
-Applies to each maintenance-surface-bearing element the design introduces. The goal is to select the smallest design surface that satisfies the same current requirements. Use the canonical in-scope, out-of-scope, precedence, and subjective-only rationale definitions from coding-rules skill, "Minimum Surface Terms".
-Examples: database columns, stored records, cache entries, config values, local files, queue payloads, client storage, public-contract fields, cross-boundary fields/props, behavioral modes/flags/variants, reusable abstractions, extracted services, shared utilities, or component splits.
-Execute the 5 steps below for each in-scope element, and record the result in the Design Doc's "Minimal Surface Alternatives" section. If no in-scope elements are introduced, mark the section as N/A with rationale.
-1. **Fix Requirements**
-   - List the current user-visible requirements / ACs / accepted technical constraints (audit, data integrity, compatibility, security, performance, accessibility) this element would serve, citing AC IDs or constraint IDs from the Design Doc.
-   - Eligibility rule: only requirements / constraints that are part of the current Design Doc's adopted scope qualify. Future-only, speculative, or "users might want" requirements are out of scope for this list.
-2. **Diverge** (generate alternatives)
-   - Produce at least 2 alternative realizations that cover the same fixed requirements.
-   - At least one alternative must be subtractive. Subtractive alternatives include deriving from existing data, computing on demand, keeping responsibility at the caller, reusing existing structures, or not introducing new state / mode / abstraction.
-3. **Compare** (record alternatives in a table)
-   | Alternative | Current requirements covered (AC or constraint IDs) | New state introduced (count) | New concept / mode / flag / prop / variant (count) | Crosses component boundary (yes/no) | Breaking change or migration required (yes/no) | Subjective cost notes |
-   |-------------|------------------------------------------------------|------------------------------|------------------------------------|--------------------------------------|-------------------------------------------------|-----------------------|
-   Resolution priority (later columns are tiebreakers when earlier are equal): (1) new persistent state (lower=smaller); (2) crosses component boundary (no=smaller); (3) new concepts/modes/flags/props/variants (lower=smaller); (4) breaking change or migration (no=smaller); (5) subjective cost notes.
-4. **Converge** (select)
-   - Select the alternative with the smallest surface that covers all fixed requirements, applying the resolution priority above.
-   - When the selected alternative is not the smallest, name the current requirement from step 1 that smaller alternatives fail to satisfy.
-   - Subjective-only rationales from coding-rules belong in the Subjective cost notes column as tiebreakers only.
-5. **Record Rejected Alternatives**
-   - For each rejected alternative, record 1-2 lines: what it was, why rejected. Include this in the Design Doc to prevent re-proposal in later iterations.
+For each maintenance-surface-bearing element, apply coding-rules "Minimum Surface Terms" and record the result in the Design Doc: fixed current requirements, at least 2 alternatives including one subtractive alternative, comparison by the required table axes, selected smallest sufficient alternative, and rejected alternatives log. Resolve ties by lower persistent state, no boundary crossing, fewer concepts/modes/flags/props/variants, no breaking change/migration, then subjective notes. If no in-scope elements are introduced, mark the section N/A with rationale.
 ### Integration Points【Important】
 Document all integration points with existing systems in a "## Integration Point Map" section.
@@ -196,44 +150,12 @@ Must be performed at the beginning of Design Doc creation:
 ### Implementation Approach Decision【Required】
 Must be performed when creating Design Doc:
-1. **Approach Selection Criteria**
-   - Follow the principles in implementation-approach skill to select strategy
-   - **Vertical Slice**: Complete by feature unit, minimal external dependencies, early value delivery
-   - **Horizontal Slice**: Implementation by layer, important common foundation, technical consistency priority
-   - **Hybrid**: Composite, handles complex requirements
-   - Document selection reason (record results of metacognitive strategy selection process)
-2. **Integration Point Definition**
-   - Which task first makes the whole system operational
-   - Verification level for each task (L1/L2/L3 defined in implementation-approach skill)
-3. **Verification Strategy Definition**
-   - Define what correctness means for this change and how it will be proven
-   - Use the Design Doc template fields directly
-   - Include at minimum: correctness definition, target comparison, verification method, observable success indicator, verification timing, and early verification point
-   - Use normalized verification timing values: `phase_1`, `per_phase`, `integration_phase`, or `final_phase`
-   - For low-risk or self-evident changes, a minimal form or explicit `N/A` with rationale is acceptable
-   - For new features, specify acceptance-criteria verification beyond unit tests
-   - For extensions, specify regression verification that proves existing behavior is preserved
-   - For refactors or rewrites, specify behavioral equivalence verification against the current implementation when applicable
-   - When the design changes existing observable behavior, an external contract, or a persisted data shape, define a concrete `Output Comparison` method: identical input, expected output fields or format, diff method, and a mapping from each listed pipeline step to the comparison that verifies it
-   - When `Codebase Analysis` provides `dataTransformationPipelines`, use them to populate the `Output Comparison` section. Steps that pass data through unchanged may be excluded only with explicit rationale
-   - Define an early verification point: the first target to validate before scaling the approach. For changes to existing observable behavior, external contracts, or persisted data shapes, this must be an output comparison of at least one representative case
+Follow implementation-approach skill and record: selected strategy, rationale, first task that makes the system operational, task verification levels, correctness definition, target comparison, verification method, observable success indicator, timing (`phase_1` / `per_phase` / `integration_phase` / `final_phase`), and early verification point. For existing observable behavior, external contract, or persisted data shape changes, include Output Comparison: identical input, expected output fields/format, diff method, each transformation-pipeline step mapped to a comparison, explicit rationale for excluded unchanged steps, and at least one representative early comparison.
 ### Change Impact Map【Required】
 Must be included when creating Design Doc:
-```yaml
-Change Target: [ServiceName.methodName()]
-Direct Impact:
-  - [service file path] (method change)
-  - [API handler path] (call site)
-Indirect Impact:
-  - [Component name] (data format change)
-  - [Component name] (new fields added)
-No Ripple Effect:
-  - [Explicitly list unaffected components]
-```
+Record direct impact, indirect impact, and explicitly unaffected components in the Design Doc Change Impact Map.
 ### Field Propagation Map【Required】
 When new or changed fields cross component boundaries:
@@ -243,13 +165,7 @@ Skip if no fields cross component boundaries.
 ### Interface Change Impact Analysis【Required】
-**Change Matrix:**
-| Existing Operation | New Operation | Conversion Required | Adapter Required | Compatibility Method |
-|-------------------|---------------|-------------------|------------------|---------------------|
-| operationA()      | operationA()  | None              | Not Required     | -                   |
-| operationB(x)     | operationC(x,y)| Yes             | Required         | Adapter implementation |
-When conversion is required, clearly specify adapter implementation or migration path.
+Record existing operation, new operation, conversion need, adapter/wrapper need, and compatibility method. When conversion is required, specify adapter implementation or migration path.
 ### Common ADR Process
 Perform before Design Doc creation:
@@ -268,7 +184,6 @@ Document state definitions and transitions for stateful components.
 Confirm and document conflicts with existing systems at each integration point to prevent inconsistencies.
 ## Required Information
 - **Operation Mode**:
   - `create`: New creation (default)
   - `update`: Update existing document
@@ -316,14 +231,12 @@ Confirm and document conflicts with existing systems at each integration point t
   - Unit Inventory (routes, test files, public exports)
 ## Document Output Format
 ### Document Creation
 - **ADR**: `docs/adr/ADR-[4-digit number]-[title].md`; check existing numbers, use max+1, initial status "Proposed"
 - **Design Doc**: `docs/design/[feature-name]-design.md`
 - Follow respective templates (`template.md`)
 ## ADR Responsibility Boundaries
 Include in ADR: decisions, rationale, principled guidelines. Exclude: schedules, implementation procedures, specific code.
 Implementation guidelines MUST only include principles (e.g., "Use dependency injection" is correct, "Implement in Phase 1" is not).
@@ -386,11 +299,14 @@ Implementation sample creation checklist:
 ## Acceptance Criteria Creation Guidelines
 **Principle**: Set specific, verifiable conditions. Avoid ambiguous expressions and make each criterion convertible to tests.
 **Example**: "Login works" → "After authentication with correct credentials, navigates to dashboard screen"
 Cover happy path, unhappy path, and edge cases. Place important criteria first.
+### Boundary-Aware AC Drafting
+Draft behavior-changing ACs value-first: user/operator/maintainer value, then observable behavior. Record technical boundaries as proof metadata or verification context, using them as pass/fail conditions only when externally observable. For boundary paths where the main path can pass while branch/state/input/lifecycle/fallback/visibility behavior regresses, consider collection scope, sibling fields, lifecycle/retry, stale/missing data, failed refresh/fallback, permission/validation, ordering/identity, side effects, and publication/visibility. Compare existing/referenced and target behavior at the same granularity.
 ### AC Scoping for Autonomous Implementation
 **Include** (High automation value):

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "codex-workflows",
-  "version": "0.6.6",
+  "version": "0.6.7",
   "description": "Task-oriented agentic coding framework for OpenAI Codex CLI — skills, recipes, and subagents for structured development workflows",
   "license": "MIT",
   "author": "Shinsuke Kagawa",