npm - codex-workflows - Versions diffs - 0.6.8 → 0.7.0 - Mend

codex-workflows 0.6.8 → 0.7.0


Files changed (41)

package/.agents/skills/ai-development-guide/SKILL.md +5 -3
package/.agents/skills/ai-development-guide/references/frontend.md +11 -19
package/.agents/skills/coding-rules/references/typescript.md +17 -12
package/.agents/skills/documentation-criteria/SKILL.md +1 -1
package/.agents/skills/documentation-criteria/references/design-template.md +16 -5
package/.agents/skills/documentation-criteria/references/plan-template.md +19 -5
package/.agents/skills/documentation-criteria/references/task-template.md +19 -1
package/.agents/skills/recipe-build/SKILL.md +1 -1
package/.agents/skills/recipe-front-build/SKILL.md +1 -1
package/.agents/skills/recipe-front-plan/SKILL.md +1 -1
package/.agents/skills/recipe-fullstack-build/SKILL.md +1 -1
package/.agents/skills/recipe-plan/SKILL.md +1 -1
package/.agents/skills/recipe-prepare-implementation/SKILL.md +2 -1
package/.agents/skills/subagents-orchestration-guide/SKILL.md +2 -2
package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md +1 -1
package/.agents/skills/testing/SKILL.md +5 -5
package/.agents/skills/testing/references/typescript.md +2 -6
package/.codex/agents/acceptance-test-generator.toml +2 -44
package/.codex/agents/code-reviewer.toml +12 -57
package/.codex/agents/code-verifier.toml +1 -47
package/.codex/agents/codebase-analyzer.toml +1 -106
package/.codex/agents/design-sync.toml +2 -64
package/.codex/agents/document-reviewer.toml +8 -81
package/.codex/agents/integration-test-reviewer.toml +1 -26
package/.codex/agents/investigator.toml +1 -73
package/.codex/agents/quality-fixer-frontend.toml +4 -105
package/.codex/agents/quality-fixer.toml +4 -122
package/.codex/agents/requirement-analyzer.toml +1 -29
package/.codex/agents/rule-advisor.toml +1 -79
package/.codex/agents/scope-discoverer.toml +1 -70
package/.codex/agents/security-reviewer.toml +1 -19
package/.codex/agents/solver.toml +5 -54
package/.codex/agents/task-decomposer.toml +47 -4
package/.codex/agents/task-executor-frontend.toml +37 -144
package/.codex/agents/task-executor.toml +37 -144
package/.codex/agents/technical-designer-frontend.toml +8 -0
package/.codex/agents/technical-designer.toml +10 -1
package/.codex/agents/ui-analyzer.toml +1 -157
package/.codex/agents/verifier.toml +2 -65
package/.codex/agents/work-planner.toml +30 -9
package/package.json +1 -1

package/.agents/skills/ai-development-guide/SKILL.md CHANGED

@@ -165,7 +165,9 @@ To isolate problems, attempt reproduction with minimal code:
 - Replace external dependencies with mocks
 - Create minimal configuration that reproduces problem
-### 4. Debug Log Output
+### 4. Debug Log Output (temporary)
+Add structured debug logs to isolate the issue, then remove them before commit.
 ```
 Pattern: Structured logging with context
 {
@@ -204,7 +206,7 @@ Universal quality assurance phases applicable to all languages:
 ### Phase 3: Testing
 1. **Unit Tests**: Run all unit tests
 2. **Integration Tests**: Run integration tests
-3. **Test Coverage**: Measure and verify coverage meets standards
+3. **Test Coverage**: Measure coverage when configured and use it to find gaps
 4. **E2E Tests**: Run end-to-end tests
 ### Phase 4: Final Quality Gate [MANDATORY]
@@ -212,7 +214,7 @@ All checks MUST pass before proceeding:
 - Zero static analysis errors
 - Build succeeds
 - All tests pass
-- Coverage meets threshold
+- Coverage threshold passes when the project, task file, work plan, or Design Doc defines one. When no threshold is configured, use coverage output only to identify untested critical paths.
 **ENFORCEMENT**: Cannot proceed with ANY quality check failures — fix ALL errors before marking task complete
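
The revised coverage rule in the two hunks above can be read as a small decision function. The following is a minimal sketch, not code from the package: the `evaluateCoverageGate` helper and its result shape are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper; names and shapes are illustrative, not part of codex-workflows.
type CoverageGateResult =
  | { kind: "pass"; measured: number; threshold: number }
  | { kind: "fail"; measured: number; threshold: number }
  | { kind: "diagnostic-only"; measured: number }

// A threshold applies only when the project, task file, work plan, or Design Doc defines one.
function evaluateCoverageGate(measured: number, threshold?: number): CoverageGateResult {
  if (threshold === undefined) {
    // No configured threshold: coverage output only helps spot untested critical paths.
    return { kind: "diagnostic-only", measured }
  }
  return measured >= threshold
    ? { kind: "pass", measured, threshold }
    : { kind: "fail", measured, threshold }
}
```

Under this reading, a run with no configured threshold never blocks the final quality gate on coverage alone, while a configured threshold still does.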

package/.agents/skills/ai-development-guide/references/frontend.md CHANGED

@@ -75,28 +75,19 @@ console.log('DEBUG:', {
 ## Frontend Quality Check Workflow
-Use the appropriate run command based on the `packageManager` field in package.json.
+Read `package.json` scripts and run them with the project's package manager from the `packageManager` field. Map the phases below using the script names declared in `package.json`.
-### Common Commands
-- `dev` - Development server
-- `build` - Production build
-- `preview` - Preview production build
-- `type-check` - Type check (no emit)
-### Quality Check Phases
-**Phase 1-3: Basic Checks**
-- `check` - Linter + formatter (Biome, ESLint, Prettier, etc.)
-- `build` - TypeScript build
-**Phase 4-5: Tests and Final Confirmation**
-- `test` - Test execution
-- `test:coverage:fresh` - Coverage measurement (fresh cache)
-- `check:all` - Overall integrated check
+### Phases
+1. **Lint/format** - the project's formatter and linter, such as Biome or ESLint plus Prettier
+2. **Type check** - type check without emit when the project has a dedicated command
+3. **Build** - production build
+4. **Test** - unit and integration tests
+5. **Coverage** - coverage run when configured or when the task added or changed behavior
 ### Troubleshooting
-- **Port in use error**: Run `cleanup:processes` script if available
-- **Cache issues**: Run tests with fresh cache option
-- **Dependency errors**: Clean reinstall dependencies
+- **Port already in use**: stop the stale dev, preview, or test process holding the port
+- **Stale cache**: re-run with the project's fresh or clean-cache option
+- **Dependency errors**: clean reinstall dependencies
 ## Frontend Technical Decisions
@@ -108,6 +99,7 @@ Use the appropriate run command based on the `packageManager` field in package.j
 ### Performance vs Readability
 - Prioritize readability unless clear bottleneck exists
 - Measure before optimizing (use React DevTools Profiler, not guesses)
+- When React Compiler is enabled, routine memoization is automatic. Use manual memoization only for a measured bottleneck or stable reference identity required by third-party APIs or effect dependencies.
 - Document reason with comments when optimizing
 ## Frontend Impact Analysis
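
The new workflow text replaces a fixed command list with "read `package.json` and run its declared scripts". One way that lookup might be sketched is shown here; the `runCommand` helper, the `PackageJson` shape, and the fallback to `npm` when `packageManager` is absent are assumptions for illustration, not behavior defined by the package.

```typescript
// Hypothetical helper: build a run command from package.json's `packageManager` field.
interface PackageJson {
  packageManager?: string
  scripts?: Record<string, string>
}

function runCommand(pkg: PackageJson, script: string): string | null {
  // Only run scripts the project actually declares.
  if (typeof pkg.scripts?.[script] !== "string") return null
  // `packageManager` has the form "name@version"; falling back to npm is an assumption.
  const manager = pkg.packageManager?.split("@")[0] || "npm"
  return `${manager} run ${script}`
}
```

A phase whose script is missing returns `null`, which matches the hunk's wording that type check and coverage run only when the project has a command for them.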

package/.agents/skills/coding-rules/references/typescript.md CHANGED

@@ -62,6 +62,11 @@ function isUser(value: unknown): value is User {
 - **Component Hierarchy**: Follow the project's existing component architecture. Use Atoms > Molecules > Organisms > Templates > Pages only when the project adopts Atomic Design.
 - **Co-location**: Place tests, styles, and related files alongside components
+**Server/Client Boundary (RSC frameworks only, such as Next.js App Router)**
+- Default to server components for data fetching and rendering. Isolate interactivity behind a `"use client"` boundary at the smallest scope that needs it.
+- Keep browser-only APIs such as `window`, `localStorage`, and event handlers inside client components.
+- Skip this section for client-only SPAs with no server-component runtime.
 **State Management Patterns**
 - **Local State**: `useState` for component-specific state
 - **Context API**: For sharing state across component tree (theme, auth, etc.)
@@ -95,19 +100,18 @@ setUsers(users)
 - Type-safe: Always define Props type explicitly
 **Environment Variables**
-- **Use build tool's environment variable system**: `process.env` does not work in browsers
-- Centrally manage environment variables through configuration layer
-- Implement proper type safety and default value handling
+- **Use the build tool's env accessor**: read client-side env through the bundler's exposed accessor, such as Vite `import.meta.env` or Next.js/CRA prefixed `process.env`.
+- **Only prefixed vars reach the client**: build tools expose only vars carrying their public prefix. Match the project's bundler, such as Vite `VITE_`, Next.js `NEXT_PUBLIC_`, or CRA `REACT_APP_`.
+- Centrally manage environment variables through a typed configuration layer with defaults.
 ```typescript
-// Build tool environment variables (public values only)
+// Client-exposed env must carry the bundler's public prefix, or it is undefined in the browser.
+// Vite:    import.meta.env.VITE_API_URL
+// Next.js: process.env.NEXT_PUBLIC_API_URL
 const config = {
-  apiUrl: import.meta.env.API_URL || 'http://localhost:3000',
-  appName: import.meta.env.APP_NAME || 'My App'
+  apiUrl: import.meta.env.VITE_API_URL || 'http://localhost:3000',
+  appName: import.meta.env.VITE_APP_NAME || 'My App'
 }
-// Does not work in frontend
-// const apiUrl = process.env.API_URL
 ```
 **Security (Client-side Constraints)**
@@ -118,7 +122,7 @@ const config = {
 ```typescript
 // Bad: API key exposed in browser
-// const apiKey = import.meta.env.API_KEY
+// const apiKey = import.meta.env.VITE_API_KEY
 // const response = await fetch(`https://api.example.com/data?key=${apiKey}`)
 // Good: Backend manages secrets, frontend accesses via proxy
@@ -132,6 +136,7 @@ const response = await fetch('/api/data') // Backend handles API key authenticat
 - Promise Handling: Always use `async/await`
 - Error Handling: Always handle with `try-catch` or Error Boundary
 - Type Definition: Explicitly define return value types (e.g., `Promise<Result>`)
+- Effect race/cleanup: guard `useEffect` data fetches against out-of-order responses and post-unmount state updates with `AbortController`, a mounted/stale flag, or a server-state library such as React Query or SWR.
 **Format Rules**
 - Semicolon omission (follow project formatter settings)
@@ -209,10 +214,10 @@ Never include sensitive information (password, token, apiKey, secret, creditCard
 ## Performance Optimization
-- Component Memoization: Use React.memo for expensive components
+- Automatic memoization: when React Compiler is enabled, rely on it. Use manual `React.memo`, `useMemo`, or `useCallback` only for a profiler-confirmed bottleneck or stable reference identity required by third-party APIs or effect dependencies.
 - State Optimization: Minimize re-renders with proper state structure
 - Lazy Loading: Use React.lazy and Suspense for code splitting
-- Bundle Size: Monitor with the `build` script and keep under 500KB
+- Bundle Size: Monitor with the build script against the project's budget
 ## Non-functional Requirements
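
The added "Effect race/cleanup" rule names `AbortController` and a stale flag as guards. A framework-free sketch of that idea follows; `createLatestOnly` is a hypothetical helper, not an API of React or of this package.

```typescript
// Hypothetical helper: only the most recent request may deliver its result.
function createLatestOnly<T>(onResult: (value: T) => void) {
  let current: AbortController | null = null
  return {
    // Start a new request, aborting any request still in flight.
    start(): { signal: AbortSignal; deliver: (value: T) => void } {
      current?.abort()
      const controller = new AbortController()
      current = controller
      return {
        signal: controller.signal,
        // Ignore responses that arrive after a newer request or after cancel().
        deliver: (value) => {
          if (!controller.signal.aborted) onResult(value)
        },
      }
    },
    // Intended for an effect cleanup, so late responses stop updating state.
    cancel(): void {
      current?.abort()
      current = null
    },
  }
}
```

Inside a `useEffect`, the returned `signal` could be passed to `fetch` and `cancel` returned as the cleanup function. A server-state library such as React Query or SWR, which the rule also lists, makes a hand-written guard like this unnecessary.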

package/.agents/skills/documentation-criteria/SKILL.md CHANGED

@@ -81,7 +81,7 @@ description: "Documentation creation criteria for PRD, ADR, Design Doc, UI Spec,
 ### Work Plan
 **Purpose**: Implementation task management and progress tracking
-**Scope**: Task breakdown, dependencies, schedule estimates, test skeleton file paths, Verification Strategy summaries from each Design Doc, Design-to-Plan Traceability mapping for implementation-relevant technical requirements, ADR Bindings for implementation-binding ADR decisions, final Quality Assurance phase, and progress tracking only. Technical rationale belongs in ADR and design details belong in Design Doc.
+**Scope**: Task breakdown, dependencies, schedule estimates, test skeleton file paths, Verification Strategy summaries from each Design Doc, Design-to-Plan Traceability mapping for implementation-relevant technical requirements, Reference Contract Values for binding observable Design Doc values, ADR Bindings for implementation-binding ADR decisions, final Quality Assurance phase, and progress tracking only. Technical rationale belongs in ADR and design details belong in Design Doc.
 **Phase Division Criteria**:

package/.agents/skills/documentation-criteria/references/design-template.md CHANGED

@@ -237,11 +237,12 @@ Rejected Alternatives Log is element-level. Future Extensibility below is design
 // Record major contract/interface definitions here
 ```
-### Data Contract
+### Data Contracts
-#### Component 1
+#### [Component or Boundary] (repeat per component/boundary)
 ```yaml
+Contract: [interface / function / API / schema name]
 Input:
   Type: [Data shape, contract, or schema]
   Preconditions: [Required items, format constraints]
@@ -256,6 +257,14 @@ Invariants:
   - [Conditions that remain unchanged before and after processing]
 ```
+### Observable Contract Values (When Applicable)
+Use this section when the design defines observable values the implementation must reproduce exactly. Omit it when the Design Doc has no such values.
+| Contract Type | Required Observable Value |
+|---------------|---------------------------|
+| structure-order / derived-display / state-lifecycle-negative | [Exact column/field/label set and order, derived display rule, or condition where persisted/restored/cached/derived state remains unused] |
 ### Test Boundaries
 #### Mock Boundary Decisions
@@ -274,9 +283,11 @@ Invariants:
 ### Field Propagation Map (When Fields Cross Boundaries)
-| Field | Boundary | Status | Detail |
-|-------|----------|--------|--------|
-| [field name] | [Component A to B] | preserved / transformed / dropped | [logic or reason] |
+A boundary includes a serialized boundary: a value encoded on one side and parsed on the other through a medium such as a query string, CLI argument, environment variable, config entry, message payload, storage key, or file. For those rows, record the exact encoded representation and how the consumer parses it. Use "-" only when the row is not a serialized boundary.
+| Field | Boundary | Status | Serialized Format | Consumer Parse Rule | Detail |
+|-------|----------|--------|-------------------|---------------------|--------|
+| [field name] | [Component A to B] | preserved / transformed / dropped | [exact representation the producer emits when serialized; "-" otherwise] | [how the consumer decodes and validates it; "-" otherwise] | [logic or reason] |
 ## Verification Strategy
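
The new Serialized Format and Consumer Parse Rule columns ask for the exact encoded form and the decoding rule of a serialized boundary. As an illustration only, a query-string boundary might be recorded and implemented as below; the `pageSize` field and both function names are hypothetical.

```typescript
// Hypothetical field: a page size carried across a query-string boundary.
// Serialized Format: "pageSize=<positive decimal integer>"
function encodePageSize(pageSize: number): string {
  return new URLSearchParams({ pageSize: String(pageSize) }).toString()
}

// Consumer Parse Rule: require the key, accept digits only, reject zero and unsafe integers.
function parsePageSize(query: string): number | null {
  const raw = new URLSearchParams(query).get("pageSize")
  if (raw === null || !/^[0-9]+$/.test(raw)) return null
  const value = Number(raw)
  return Number.isSafeInteger(value) && value > 0 ? value : null
}
```

Writing both sides down makes a mismatch visible in review, for example a producer that emits `pageSize=25` and a consumer that expects a different key.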

package/.agents/skills/documentation-criteria/references/plan-template.md CHANGED

@@ -81,6 +81,18 @@ Map each Design Doc technical requirement to the task or phase that covers it. U
 - Merge duplicate restatements of the same obligation from multiple DD sections into one row and cite the primary section in `DD Section`
 - Keep `scope-boundary` rows concrete: name the protected file group, component boundary, contract, or workflow that must remain unchanged
+## Reference Contract Values
+Include this section when a Traceability row's DD Item encodes a binding observable value the implementation must reproduce exactly: a column/label set and order, a derived-display rule where one field determines another display value, or a state-lifecycle negative that states when persisted or derived state must stay unused. Serialized boundaries belong in the Connection Map / Field Propagation Map. When a value qualifies for both this table and a serialized boundary, record it only in the Connection Map. ADR-derived structural decisions belong in ADR Bindings.
+The Traceability table records coverage. This table carries the required value verbatim so the covering task can check the exact contract.
+| Design Doc (section) | Contract Type | Required Observable Value (verbatim) | Covered By Task(s) | Gap Status | Notes |
+|----------------------|---------------|--------------------------------------|--------------------|------------|-------|
+| docs/design/xxx-design.md (Section name) | structure-order / derived-display / state-lifecycle-negative | [Exact value copied from the Design Doc] | [P1-T1] | covered | |
+**Gap Status values**: `covered` (mapped to one or more tasks), `gap` (no task exists yet; set Covered By Task(s) to `-`, include justification in Notes, and require user confirmation before plan approval)
 ## Failure Mode Checklist
 Domain-independent failure categories this implementation must guard against. Enumerate all eight categories, mark which apply, and list a covering task for each that applies; keep category names generic and place project-specific detail in task descriptions or notes.
@@ -125,11 +137,13 @@ One row represents one independently checkable binding decision. A single ADR ca
 ## Connection Map
-Include this section when implementation crosses runtime, process, deployment, or service boundaries. Omit it when the change stays inside one runtime or only uses in-process package imports.
+Include this section when implementation crosses runtime, process, deployment, or service boundaries, or when a value is serialized and parsed across a boundary within one runtime through a query string, route parameter, form post, CLI argument, environment variable, config entry, message payload, storage key, or file.
+For serialized boundaries, fill Serialized Format and Consumer Parse Rule with concrete values. Use "-" only for non-serialized external signals where the Expected Signal fully captures the boundary contract.
-| Boundary | Caller / Producer | Callee / Consumer | Expected Signal | Covered By Task(s) |
-|----------|-------------------|-------------------|-----------------|--------------------|
-| [e.g. "web client -> API"] | [module/package initiating request or message] | [module/package receiving request or message] | [Observable evidence, e.g. HTTP 200 matching schema X] | [P1-T1, P1-T2] |
+| Boundary | Caller / Producer | Callee / Consumer | Serialized Format | Consumer Parse Rule | Expected Signal | Covered By Task(s) |
+|----------|-------------------|-------------------|-------------------|---------------------|-----------------|--------------------|
+| [producing side -> consuming side] | [module/package initiating request or message] | [module/package receiving request or message] | [exact representation the producer emits, or "-"] | [how the consumer decodes and validates it, or "-"] | [Observable evidence, e.g. HTTP 200 matching schema X] | [P1-T1, P1-T2] |
 ## Objective
 [Why this change is necessary, what problem it solves]
@@ -243,7 +257,7 @@ This phase is required for all implementation approaches.
 - [ ] Security review: Verify security considerations from Design Doc are implemented
 - [ ] Quality checks (types, lint, format)
 - [ ] Execute all tests (including integration/E2E from test skeletons, when provided)
-- [ ] Coverage 70%+
+- [ ] Coverage threshold passes when configured
 - [ ] Document updates
 ### Quality Assurance
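
The Gap Status rules for the new Reference Contract Values table are mechanical enough to check in code. The sketch below is one possible reading, combined with the R1 requirement elsewhere in this diff that `covered` rows reference existing task IDs; the `ReferenceContractRow` type and `checkRow` function are hypothetical.

```typescript
// Hypothetical row shape mirroring the Reference Contract Values table columns.
interface ReferenceContractRow {
  contractType: "structure-order" | "derived-display" | "state-lifecycle-negative"
  requiredValue: string
  coveredBy: string[] // task IDs such as "P1-T1"; an empty array stands for "-"
  gapStatus: "covered" | "gap"
  notes: string
}

// Returns one message per rule violation; an empty array means the row is consistent.
function checkRow(row: ReferenceContractRow, knownTaskIds: Set<string>): string[] {
  const problems: string[] = []
  if (row.gapStatus === "covered") {
    if (row.coveredBy.length === 0) problems.push("covered row lists no task")
    for (const id of row.coveredBy) {
      if (!knownTaskIds.has(id)) problems.push(`unknown task ${id}`)
    }
  } else {
    if (row.coveredBy.length > 0) problems.push("gap row must set Covered By Task(s) to -")
    if (row.notes.trim() === "") problems.push("gap row needs justification in Notes")
  }
  return problems
}
```

The user-confirmation step required for `gap` rows is a workflow action and is not modeled here.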

package/.agents/skills/documentation-criteria/references/task-template.md CHANGED

@@ -17,6 +17,13 @@ Metadata:
 Files to read before starting implementation. Use concrete file paths, optionally with a section/function hint:
 - [e.g., src/orders/checkout.ts (processOrder function)]
+## Change Category
+(Include this field only when the task is a bug fix, regression, state-change, or boundary-change. Omit otherwise.)
+`Change Category: <one or more of bug-fix, regression, state-change, boundary-change, comma-separated>`
+When present, sweep cases sharing the same path, contract, persisted state, or external boundary for the same class of defect during the Red Phase.
 ## Binding Decisions
 (Include this section when the work plan's ADR Bindings table covers this task. Omit otherwise.)
@@ -26,14 +33,24 @@ Each row is an ADR decision the implementation in this task must comply with.
 |--------|------|----------|------------------|
 | docs/adr/ADR-XXXX-title.md (§ <Source Section>) | [Axis value copied verbatim from the work plan's ADR Bindings row] | [Binding decision copied from the work plan's ADR Bindings row] | [Y/N-answerable positive predicate that evaluates whether the planned and final implementation satisfy the decision] |
+## Reference Contracts
+(Include this section when the work plan's Reference Contract Values table covers this task. Omit otherwise.)
+Each row is a Design Doc-derived observable contract the implementation in this task must reproduce exactly. Serialized boundaries are carried by Boundary Context from the work plan's Connection Map. ADR-derived structural decisions are carried by Binding Decisions above.
+| Source | Contract Type | Required Observable Value | Compliance Check |
+|--------|---------------|---------------------------|------------------|
+| docs/design/xxx-design.md (§ Section name) | structure-order / derived-display / state-lifecycle-negative | [Required Observable Value copied verbatim from the work plan row] | [Y/N-answerable positive predicate that evaluates whether the planned and final implementation reproduces the value] |
 ## Investigation Notes
 Brief observations recorded after reading Investigation Targets:
 - [path] - [interfaces, control/data flow, state transitions, side effects relevant to this task]
-- When Binding Decisions exist, record the planned implementation approach and each Compliance Check result here.
+- When Binding Decisions or Reference Contracts exist, record the planned implementation approach and each Compliance Check result here.
 ## Implementation Steps (TDD: Red-Green-Refactor)
 ### 1. Red Phase
 - [ ] Read all Investigation Targets and update Investigation Notes
+- [ ] (When Change Category is set) Sweep adjacent cases sharing the same path, contract, persisted state, or external boundary for the same class of defect; fold any in-scope residual into failing tests
 - [ ] Review dependency deliverables (if any)
 - [ ] Verify/create contract definitions
 - [ ] Write failing tests
@@ -75,6 +92,7 @@ Brief observations recorded after reading Investigation Targets:
 - [ ] Each Proof Obligation is met: the test turns red under its primary failure mode and exercises the stated boundary
 - [ ] Deliverables created (for research/design tasks)
 - [ ] When Binding Decisions exist, every Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes
+- [ ] When Reference Contracts exist, every Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes
 ## Notes
 - Impact scope: [Areas where changes may propagate]
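
The new Change Category field is a single comma-separated line that decides whether the Red Phase sweep applies. A minimal parsing sketch, assuming the field appears on its own line with or without surrounding backticks; `parseChangeCategories` is a hypothetical helper, not part of the package.

```typescript
const CHANGE_CATEGORIES = ["bug-fix", "regression", "state-change", "boundary-change"] as const
type ChangeCategory = (typeof CHANGE_CATEGORIES)[number]

// Hypothetical parser for the `Change Category: ...` line of a task file.
// Returns [] when the field is absent, which means no Red Phase sweep is required.
function parseChangeCategories(taskFile: string): ChangeCategory[] {
  const match = taskFile.match(/^`?Change Category:[ \t]*([^`\n]*)`?[ \t]*$/m)
  if (!match) return []
  return (match[1] ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part): part is ChangeCategory =>
      (CHANGE_CATEGORIES as readonly string[]).includes(part),
    )
}
```

Unknown values are dropped rather than rejected in this sketch; a stricter reader might instead report them.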

package/.agents/skills/recipe-build/SKILL.md CHANGED

@@ -73,7 +73,7 @@ When task files don't exist, the plan references a Design Doc, and the WorkPlan
 ### 1. Work Plan Review
-Spawn document-reviewer agent: "Review the work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, Reference Contract Values fidelity, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation

package/.agents/skills/recipe-front-build/SKILL.md CHANGED

@@ -73,7 +73,7 @@ When task files don't exist, the plan references a Design Doc, and the WorkPlan
 ### 1. Work Plan Review
-Spawn document-reviewer agent: "Review the frontend work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the frontend work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, Reference Contract Values fidelity, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation

package/.agents/skills/recipe-front-plan/SKILL.md CHANGED

@@ -53,7 +53,7 @@ Spawn acceptance-test-generator agent: "Generate test skeletons from Design Doc
 Spawn work-planner agent: "Create work plan from Design Doc at [path]. Integration test file: [path from step 2]. fixture-e2e test file: [path from step 2 or null]. service-integration-e2e test file: [path from step 2 or null]. E2E absence reasons by lane: [values from step 2 when an E2E lane is null]. Integration tests are created with each phase implementation, fixture-e2e runs alongside UI implementation, service-integration-e2e runs only in the final phase when a service E2E file exists. Include `Implementation Readiness: pending` in the work plan header."
 ### Step 4: Work Plan Review
-Spawn document-reviewer agent: "Review the frontend work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the frontend work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc and UI Spec, Reference Contract Values fidelity, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then proceed to Step 5

package/.agents/skills/recipe-fullstack-build/SKILL.md CHANGED

@@ -83,7 +83,7 @@ When task files don't exist, the plan references a Design Doc, and the WorkPlan
 ### 1. Work Plan Review
-Spawn document-reviewer agent: "Review the fullstack work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the fullstack work plan before task decomposition. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, Reference Contract Values fidelity, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then continue to user confirmation

package/.agents/skills/recipe-plan/SKILL.md CHANGED

@@ -56,7 +56,7 @@ Present options if multiple exist (can be specified with $ARGUMENTS).
 - Spawn work-planner agent: "Create work plan from design document at [design-doc-path]. Include deliverables from previous process according to subagents-orchestration-guide skill coordination specification. If `generatedFiles.fixtureE2e` or `generatedFiles.serviceE2e` is null, use the corresponding `e2eAbsenceReason` and accept the null E2E lane as a valid planning input. Include `Implementation Readiness: pending` in the work plan header."
 ### Step 4: Work Plan Review
-Spawn document-reviewer agent: "Review the work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+Spawn document-reviewer agent: "Review the work plan. doc_type: WorkPlan. target: docs/plans/[plan-name].md. mode: composite. Review semantic traceability to the Design Doc, Reference Contract Values fidelity, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 Branch on `verdict.decision`:
 - `approved` -> spawn work-planner in update mode once to record `Status: approved` and `Conditions: none` in WorkPlan Review, then proceed to Step 5

package/.agents/skills/recipe-prepare-implementation/SKILL.md CHANGED

@@ -31,7 +31,7 @@ Each criterion produces `pass`, `fail`, or `not_applicable`, with file:line evid
 | ID | Criterion | Pass Evidence |
 |----|-----------|---------------|
-| R1 | Verification Strategy and ADR Binding references resolve | Every command, file path, function, endpoint, fixture, seed, and test reference in the work plan's Verification Strategies either exists now or is the deliverable of a task in the plan; every ADR Bindings source path resolves; every ADR Bindings `covered` row references existing task IDs |
+| R1 | Verification Strategy and binding references resolve | Every command, file path, function, endpoint, fixture, seed, and test reference in the work plan's Verification Strategies either exists now or is the deliverable of a task in the plan; every Reference Contract Values `covered` row references existing task IDs; every Reference Contract Values `gap` row has Notes with user-confirmation handling; every ADR Bindings source path resolves; every ADR Bindings `covered` row references existing task IDs |
 | R2 | E2E prerequisites are addressed | For each fixture-e2e or service-integration-e2e skeleton, every noted precondition is present in the codebase or covered by a Phase 0 task |
 | R3 | Phase 1 observability exists | The first implementation phase includes at least one operation verification method executable at task completion using existing files, prior Phase 0 deliverables, or the task's own output |
 | R4 | UI rendering surface exists | When the plan implements UI components, a fixture entry, dev route, Storybook story, preview harness, or equivalent render surface exists or is covered by a Phase 0 task |
@@ -47,6 +47,7 @@ Read the work plan passed in `$ARGUMENTS`; if absent, select the most recent non
 - Verification Strategies
 - Quality Assurance Mechanisms
 - Design-to-Plan Traceability
+- Reference Contract Values
 - ADR Bindings
 - UI Spec Component -> Task Mapping
 - Connection Map

package/.agents/skills/subagents-orchestration-guide/SKILL.md CHANGED

@@ -219,9 +219,9 @@ Work plans use the header line `Implementation Readiness: <status>`.
 Use this procedure after work-plan approval and before autonomous task execution when the flow needs to verify implementation readiness. The procedure supplies the evidence needed for user decisions; prompts for approval only after concrete failing criteria and proposed prep tasks are known.
-1. Load the approved work plan exact path and extract Verification Strategies, Quality Assurance Mechanisms, Design-to-Plan Traceability, ADR Bindings, UI Spec Component -> Task Mapping, Connection Map, test skeleton references, E2E absence reasons, phase structure, referenced Design Docs, ADRs, and UI Specs.
+1. Load the approved work plan exact path and extract Verification Strategies, Quality Assurance Mechanisms, Design-to-Plan Traceability, Reference Contract Values, ADR Bindings, UI Spec Component -> Task Mapping, Connection Map, test skeleton references, E2E absence reasons, phase structure, referenced Design Docs, ADRs, and UI Specs.
 2. Evaluate these criteria with evidence:
-   - R1 Verification Strategy and ADR Binding references resolve
+   - R1 Verification Strategy and binding references resolve
    - R2 E2E prerequisites are addressed
    - R3 Phase 1 observability exists
    - R4 UI rendering surface exists when UI work is present
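
The criteria evaluation in step 2 can be sketched as a small filter over per-criterion results. This is a minimal illustration, not code from the package: `CriterionResult` and `failingCriteria` are hypothetical names, and treating a pass without evidence as a failure is an assumption drawn from the instruction to evaluate "with evidence".

```typescript
// Hypothetical shapes for illustration only; the package defines no such types.
type CriterionId = "R1" | "R2" | "R3" | "R4";

interface CriterionResult {
  id: CriterionId;
  passed: boolean;
  // Plan sections or file paths that support the result.
  evidence: string[];
}

// Returns the criteria that should be reported before asking for approval.
// Assumption: a result marked as passed but carrying no evidence is reported
// as failing, because the procedure asks for evidence on every criterion.
function failingCriteria(results: CriterionResult[]): CriterionId[] {
  return results
    .filter((r) => !r.passed || r.evidence.length === 0)
    .map((r) => r.id);
}

// Example input with made-up evidence strings.
const readinessSample: CriterionResult[] = [
  { id: "R1", passed: true, evidence: ["work plan: Verification Strategies"] },
  { id: "R2", passed: true, evidence: ["work plan: E2E absence reasons"] },
  { id: "R3", passed: false, evidence: ["Phase 1 has no verification method"] },
  { id: "R4", passed: true, evidence: [] },
];

const readinessFailures = failingCriteria(readinessSample);
```

With this sample, `readinessFailures` holds `R3` (explicit failure) and `R4` (no evidence), which are the concrete failing criteria the procedure wants identified before it prompts for approval.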

package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md CHANGED

@@ -105,7 +105,7 @@ work-planner's existing Integration Complete criteria naturally covers cross-lay
 After work-planner creates or updates the plan, spawn document-reviewer:
-> "Review the fullstack work plan. doc_type: WorkPlan. target: [work plan path]. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
+> "Review the fullstack work plan. doc_type: WorkPlan. target: [work plan path]. mode: composite. Review semantic traceability to all Design Docs, UI Spec when present, Reference Contract Values fidelity, cross-layer boundary coverage, early verification placement, real-boundary verification coverage, Proof Strategy, Failure Mode Checklist, Review Scope, and Quality Assurance coverage."
 On `needs_revision` or `approved_with_conditions`, return to work-planner in update mode and re-review for max 2 revision iterations as defined by the `needs_revision` row in Approval Status Vocabulary. On `rejected`, halt and escalate to the user. Stop for batch approval only after WorkPlan review returns `approved` and the plan's `WorkPlan Review` section records `Status: approved` with `Conditions: none`.

package/.agents/skills/testing/SKILL.md CHANGED

@@ -52,11 +52,11 @@ For language-specific testing patterns, also read:
 ## Quality Requirements [MANDATORY]
-### Coverage Standards
+### Coverage
-- **Minimum 80% code coverage** for production code
-- Prioritize critical paths and business logic
-- Use coverage as a guide, not a goal
+- Treat coverage as a diagnostic signal for finding untested areas, not a target. Targets get gamed into trivial tests.
+- Concentrate tests on critical paths, business logic, and behavior whose regression would matter.
+- Prioritize meaningful assertions over the coverage number. Any required threshold comes from the project's CI, task file, work plan, or Design Doc.
 ### Test Characteristics
@@ -279,7 +279,7 @@ Always test:
 ☐ All tests pass
 ☐ No tests skipped or commented
 ☐ No debug code left in tests
-☐ Test coverage meets standards (≥ 80%)
+☐ Coverage threshold passes when the project, task file, work plan, or Design Doc defines one
 ☐ Tests run in reasonable time
 ### Zero Tolerance Policy

package/.agents/skills/testing/references/typescript.md CHANGED

@@ -10,13 +10,9 @@ import { render, screen } from '@testing-library/react'
 import userEvent from '@testing-library/user-event'
 ```
-### Coverage Requirements
+### Where to Concentrate Test Rigor
-- **Overall minimum**: 60%
-- **Atomic Design projects**: Atoms 70%+, Molecules 65%+, Organisms 60%+
-- **Other component architectures**: Keep 60% as the baseline and raise foundational or highly reused components to 70%+
-- **Custom Hooks**: 65%+
-- **Utils**: 70%+
+Test foundational, high-reuse units the hardest: shared components, custom hooks, utilities, and business rules reused across features carry the widest blast radius. Higher-composition surfaces such as pages and organisms lean more on integration or E2E coverage. Any numeric threshold is the project's CI, task file, work plan, or Design Doc config.
 ### Test Types

package/.codex/agents/acceptance-test-generator.toml CHANGED

@@ -265,53 +265,11 @@ A skeleton is committed before its implementation exists, so its committed form
 ### Generation Report
 ```json
-{
-  "status": "completed",
-  "feature": "[feature name]",
-  "generatedFiles": {
-    "integration": "[path]/[feature].int.test.[ext]",
-    "fixtureE2e": null,
-    "serviceE2e": null
-  },
-  "budgetUsage": {
-    "integration": "2/3",
-    "fixtureE2e": "0/3",
-    "serviceE2e": "0/2"
-  },
-  "e2eAbsenceReason": {
-    "fixtureE2e": "all_e2e_candidates_below_threshold",
-    "serviceE2e": "no_real_service_dependency"
-  },
-  "boundaryProofGaps": []
-}
+{"status":"completed","feature":"[feature name]","generatedFiles":{"integration":"[path]/[feature].int.test.[ext]","fixtureE2e":null,"serviceE2e":null},"budgetUsage":{"integration":"2/3","fixtureE2e":"0/3","serviceE2e":"0/2"},"e2eAbsenceReason":{"fixtureE2e":"all_e2e_candidates_below_threshold","serviceE2e":"no_real_service_dependency"},"boundaryProofGaps":[]}
 ```
 ```json
-{
-  "status": "completed",
-  "feature": "[feature name]",
-  "generatedFiles": {
-    "integration": "[path]/[feature].int.test.[ext]",
-    "fixtureE2e": "[path]/[feature].fixture.e2e.test.[ext]",
-    "serviceE2e": "[path]/[feature].service.e2e.test.[ext]"
-  },
-  "budgetUsage": {
-    "integration": "2/3",
-    "fixtureE2e": "1/3",
-    "serviceE2e": "1/2"
-  },
-  "e2eAbsenceReason": {
-    "fixtureE2e": null,
-    "serviceE2e": null
-  },
-  "boundaryProofGaps": [
-    {
-      "acId": "[AC-XXX]",
-      "boundaryPath": "[branch/state/input/lifecycle/fallback/visibility path]",
-      "reason": "budget_insufficient_for_boundary_proof"
-    }
-  ]
-}
+{"status":"completed","feature":"[feature name]","generatedFiles":{"integration":"[path]/[feature].int.test.[ext]","fixtureE2e":"[path]/[feature].fixture.e2e.test.[ext]","serviceE2e":"[path]/[feature].service.e2e.test.[ext]"},"budgetUsage":{"integration":"2/3","fixtureE2e":"1/3","serviceE2e":"1/2"},"e2eAbsenceReason":{"fixtureE2e":null,"serviceE2e":null},"boundaryProofGaps":[{"acId":"[AC-XXX]","boundaryPath":"[branch/state/input/lifecycle/fallback/visibility path]","reason":"budget_insufficient_for_boundary_proof"}]}
 ```
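
Because the minified form is harder to scan, the report shape can be restated as a TypeScript interface. This is a sketch inferred from the two example payloads above, not a schema published by the package. `GenerationReport`, `parseBudget`, and `e2eConsistencyErrors` are hypothetical names, and the rule that each E2E kind has either a generated file or an absence reason (never both, never neither) is an assumption read off the two examples.

```typescript
// Field names are taken from the example payloads; the types are inferred.
interface GenerationReport {
  status: string;
  feature: string;
  generatedFiles: {
    integration: string | null;
    fixtureE2e: string | null;
    serviceE2e: string | null;
  };
  budgetUsage: { integration: string; fixtureE2e: string; serviceE2e: string };
  e2eAbsenceReason: { fixtureE2e: string | null; serviceE2e: string | null };
  boundaryProofGaps: { acId: string; boundaryPath: string; reason: string }[];
}

// Splits a "used/limit" string such as "2/3" into numbers.
function parseBudget(usage: string): { used: number; limit: number } {
  const parts = usage.split("/");
  return { used: Number(parts[0]), limit: Number(parts[1]) };
}

// Assumed rule: for each E2E kind, exactly one of "file" or "absence reason" is set.
function e2eConsistencyErrors(report: GenerationReport): string[] {
  const errors: string[] = [];
  for (const kind of ["fixtureE2e", "serviceE2e"] as const) {
    const hasFile = report.generatedFiles[kind] !== null;
    const hasReason = report.e2eAbsenceReason[kind] !== null;
    if (hasFile === hasReason) {
      errors.push(kind + ": expected exactly one of file or absence reason");
    }
  }
  return errors;
}

// The first example payload from the diff, written as an object literal.
const integrationOnlyReport: GenerationReport = {
  status: "completed",
  feature: "[feature name]",
  generatedFiles: {
    integration: "[path]/[feature].int.test.[ext]",
    fixtureE2e: null,
    serviceE2e: null,
  },
  budgetUsage: { integration: "2/3", fixtureE2e: "0/3", serviceE2e: "0/2" },
  e2eAbsenceReason: {
    fixtureE2e: "all_e2e_candidates_below_threshold",
    serviceE2e: "no_real_service_dependency",
  },
  boundaryProofGaps: [],
};
```

Both example payloads satisfy the assumed rule, so `e2eConsistencyErrors` returns an empty array for each of them.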
 ## Test Meta Information Assignment

package/.codex/agents/code-reviewer.toml CHANGED

@@ -64,6 +64,7 @@ Read the Design Doc in full and extract:
 - Architecture design and data flow
 - Interface contracts (function signatures, API endpoints, data structures)
 - Identifier specifications explicitly written in the Design Doc as exact values, literals, labels, or named fields (resource names, endpoint paths, configuration keys, error codes, schema/model names)
+- Binding observable contracts: use the Design Doc's `Observable Contract Values` table as the primary source when present; otherwise extract column/field/label sets and order, derived-display rules, and state-lifecycle negatives from the Design Doc. Also extract Field Propagation Map rows that carry a Serialized Format and Consumer Parse Rule
 - Error handling policy
 - Non-functional requirements
@@ -78,6 +79,7 @@ For each acceptance criterion extracted in Step 1:
 - For behavior-changing ACs, confirm the evidence covers main and boundary paths. Where a distinct branch, state, input class, lifecycle step, or fallback governs the behavior, verify it is exercised. Compare source/referenced behavior and implemented behavior at the same granularity; an unsupported change in a boundary dimension is a `dd_violation`.
 - Confirm the implementation keeps the core mechanism the AC, Design Doc, or referenced materials require. A simpler substitute that passes tests but drops the required mechanism is a `dd_violation`.
 - For changes to persisted, shared, or externally observable state, identify the publication boundary where the new state becomes observable to another process, component, user, or later step. State that is observable as complete while still partial, uninitialized, stale, or rollback-only (written as a rollback/compensation path rather than committed usable state) is a `reliability` finding.
+- When the reviewed task has `Change Category` set to `bug-fix`, `regression`, `state-change`, or `boundary-change`, check cases sharing its path, contract, persisted state, or external boundary. When no task field is present, classify the change from the diff itself. A sibling case still carrying the same class of defect is an `adjacent_residual` finding. When the task file is in scope, also read Investigation Notes for residuals the executor recorded as outside Target Files; verify each recorded residual and report in-scope unresolved residuals as `adjacent_residual`.
 #### 2-2. Identifier Verification
 For each identifier specification extracted in Step 1:
@@ -105,6 +107,13 @@ Assign confidence based on evidence count:
 - medium: 2 agreeing sources
 - low: 1 source only
+#### 2-4. Reference Contract and Boundary Verification
+Run this independently of the AC loop so observable contracts without dedicated ACs are verified.
+1. For each binding observable value extracted in Step 1 (column/field/label set and order, derived-display rule, state-lifecycle negative), verify the implementation reproduces it exactly. A deviation is a `dd_violation` whose rationale names it a reference contract gap and states the required observable value versus the implemented value.
+2. For each Field Propagation Map serialized boundary extracted in Step 1 (Serialized Format and Consumer Parse Rule), verify the producer emits the recorded representation and the consumer parses it by the recorded rule. A mismatch is a `dd_violation` whose rationale names it a boundary contract gap and states what the producer emits versus what the consumer parses.
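
The exact-match check in item 1 of the new step 2-4 can be illustrated for an ordered column or label set. This is a minimal sketch, not the reviewer agent's implementation. `ContractDeviation` and `compareOrderedLabels` are hypothetical names, and the column labels in the example are made up.

```typescript
// One entry per position where the implemented value differs from the required one.
interface ContractDeviation {
  index: number;
  required: string | undefined; // undefined when the implementation has extra entries
  implemented: string | undefined; // undefined when the implementation is missing entries
}

// Compares set and order position by position, so a reordering counts as a deviation.
function compareOrderedLabels(
  required: string[],
  implemented: string[],
): ContractDeviation[] {
  const deviations: ContractDeviation[] = [];
  const length = Math.max(required.length, implemented.length);
  for (let i = 0; i < length; i++) {
    if (required[i] !== implemented[i]) {
      deviations.push({ index: i, required: required[i], implemented: implemented[i] });
    }
  }
  return deviations;
}

// Made-up example: same labels, but the last two are swapped.
const columnDeviations = compareOrderedLabels(
  ["Name", "Status", "Updated"],
  ["Name", "Updated", "Status"],
);
```

Each deviation carries the required and implemented values side by side, which is the information the step asks a `dd_violation` rationale to state.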
 ### 3. Assess Code Quality
 Read each implementation file and evaluate:
@@ -134,12 +143,14 @@ Classify each quality finding into one of:
 - `maintainability`: code structure impedes change or comprehension
 - `reliability`: missing safeguards could cause runtime failure
 - `coverage_gap`: acceptance criteria lack meaningful test verification
+- `adjacent_residual`: a case sharing the change's path, contract, persisted state, or external boundary still carries the same class of defect
 Each finding MUST include a rationale:
 - `dd_violation`: what the Design Doc says vs what code does
 - `maintainability`: the concrete maintenance or comprehension risk
 - `reliability`: the failure scenario and triggering conditions
 - `coverage_gap`: the untested AC and why coverage matters
+- `adjacent_residual`: which adjacent case shares the path, contract, persisted state, or external boundary and how it still exhibits the defect class
 ### 4. Check Architecture Compliance
 Verify against the Design Doc architecture:
@@ -161,63 +172,7 @@ Return the JSON result as the final response. See Output Format for the schema.
 ## Output Format
 ```json
-{
-  "complianceRate": "[X]%",
-  "identifierMatchRate": "[X]%",
-  "verdict": "[pass/needs-improvement/needs-redesign]",
-  "acceptanceCriteria": [
-    {
-      "item": "[acceptance criteria name]",
-      "status": "fulfilled|partially_fulfilled|unfulfilled",
-      "confidence": "high|medium|low",
-      "location": "[file:line, if implemented]",
-      "evidence": ["[source1: file:line]", "[source2: test file:line]"],
-      "gap": "[what is missing or deviating, if not fully fulfilled]",
-      "suggestion": "[specific fix, if not fully fulfilled]"
-    }
-  ],
-  "identifierVerification": [
-    {
-      "identifier": "[identifier name]",
-      "designDocValue": "[value specified in Design Doc]",
-      "codeValue": "[value found in code, or 'not found']",
-      "location": "[file:line]",
-      "confidence": "high|medium|low",
-      "evidence": ["[source1: file:line]", "[source2: config file:line]"],
-      "match": false
-    }
-  ],
-  "qualityFindings": [
-    {
-      "category": "dd_violation|maintainability|reliability|coverage_gap",
-      "location": "[filename:function or file:line]",
-      "description": "[specific issue]",
-      "rationale": "[why this matters]",
-      "suggestion": "[specific improvement]"
-    }
-  ],
-  "summary": {
-    "acsTotal": 0,
-    "acsFulfilled": 0,
-    "acsPartial": 0,
-    "acsUnfulfilled": 0,
-    "identifiersTotal": 0,
-    "identifiersMatched": 0,
-    "lowConfidenceItems": 0,
-    "findingsByCategory": {
-      "dd_violation": 0,
-      "maintainability": 0,
-      "reliability": 0,
-      "coverage_gap": 0
-    }
-  },
-  "nextAction": "[highest priority action needed]"
-}
+{"complianceRate":"[X]%","identifierMatchRate":"[X]%","verdict":"[pass/needs-improvement/needs-redesign]","acceptanceCriteria":[{"item":"[acceptance criteria name]","status":"fulfilled|partially_fulfilled|unfulfilled","confidence":"high|medium|low","location":"[file:line, if implemented]","evidence":["[source1: file:line]","[source2: test file:line]"],"gap":"[what is missing or deviating, if not fully fulfilled]","suggestion":"[specific fix, if not fully fulfilled]"}],"identifierVerification":[{"identifier":"[identifier name]","designDocValue":"[value specified in Design Doc]","codeValue":"[value found in code, or 'not found']","location":"[file:line]","confidence":"high|medium|low","evidence":["[source1: file:line]","[source2: config file:line]"],"match":false}],"qualityFindings":[{"category":"dd_violation|maintainability|reliability|coverage_gap|adjacent_residual","location":"[filename:function or file:line]","description":"[specific issue]","rationale":"[why this matters]","suggestion":"[specific improvement]"}],"summary":{"acsTotal":0,"acsFulfilled":0,"acsPartial":0,"acsUnfulfilled":0,"identifiersTotal":0,"identifiersMatched":0,"lowConfidenceItems":0,"findingsByCategory":{"dd_violation":0,"maintainability":0,"reliability":0,"coverage_gap":0,"adjacent_residual":0}},"nextAction":"[highest priority action needed]"}
 ```
 `identifierVerification` MUST include mismatches only. Use `summary.identifiersTotal` and `summary.identifiersMatched` for overall counts.
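
A consumer of this output would need to handle the new `adjacent_residual` category in both `qualityFindings` and `summary.findingsByCategory`. The following is a minimal sketch, assuming the category names shown in the minified schema above. `FindingCategory`, `QualityFinding`, and `countFindingsByCategory` are hypothetical names, and the sample findings are made up.

```typescript
// Category names as listed in the 0.7.0 output schema.
type FindingCategory =
  | "dd_violation"
  | "maintainability"
  | "reliability"
  | "coverage_gap"
  | "adjacent_residual";

interface QualityFinding {
  category: FindingCategory;
  location: string;
  description: string;
  rationale: string;
  suggestion: string;
}

// Builds the findingsByCategory object, including categories with zero findings.
function countFindingsByCategory(
  findings: QualityFinding[],
): Record<FindingCategory, number> {
  const counts: Record<FindingCategory, number> = {
    dd_violation: 0,
    maintainability: 0,
    reliability: 0,
    coverage_gap: 0,
    adjacent_residual: 0,
  };
  for (const finding of findings) {
    counts[finding.category] += 1;
  }
  return counts;
}

// Made-up findings for illustration.
const sampleFindings: QualityFinding[] = [
  {
    category: "adjacent_residual",
    location: "[file:line]",
    description: "sibling handler still carries the same defect",
    rationale: "shares the same persisted state",
    suggestion: "apply the same fix",
  },
  {
    category: "dd_violation",
    location: "[file:line]",
    description: "column order differs from the Design Doc",
    rationale: "reference contract gap",
    suggestion: "restore the documented order",
  },
];

const sampleCounts = countFindingsByCategory(sampleFindings);
```

Typing the result as `Record<FindingCategory, number>` makes the compiler flag any tally object that omits the new category.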