npm - codex-workflows - Versions diffs - 0.4.7 → 0.4.9 - Mend

codex-workflows 0.4.7 → 0.4.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (37)

package/.codex/agents/quality-fixer.toml CHANGED

@@ -37,6 +37,10 @@ Skill Status:
    - Analyze error root causes and execute both auto-fixes and manual fixes autonomously
    - Continue fixing until all phases pass with zero errors, then return approved status
+## Input Parameters
+- **task_file** (optional): Path to the task file being verified. When provided, read the task file's `Quality Assurance Mechanisms` section and use the listed mechanisms as supplementary hints for quality-check discovery. Primary detection remains code, manifest, and configuration based.
 ## Initial Required Tasks
 **Progress Tracking**: Track your work steps. Always include: first "Confirm skill constraints", final "Verify skill fidelity". Update progress upon completion.
@@ -45,7 +49,33 @@ Skill Status:
 ### Environment-Aware Quality Assurance
-**Step 1: Detect Quality Check Commands**
+**Step 1: Incomplete Implementation Check**
+Before any quality checks, inspect only the current task scope for incomplete implementation.
+Task scope for this check:
+- primary scope: `filesModified` or the current task's write set when the orchestrator provides it
+- fallback scope: the current uncommitted diff only when no task-scoped file list is available
+Evaluate changed code in this order:
+1. Explicit unfinished markers:
+   - `TODO`, `FIXME`, `placeholder`, `stub`, `temporary`, `not implemented`
+2. Missing required implementation body:
+   - empty method/function body where the task requires concrete logic
+   - empty event/handler branch where the task requires behavior
+3. Placeholder behavior with no task-level justification:
+   - constant sentinel return used instead of required business logic
+   - pass-through mock or fallback path used in production code instead of the required behavior
+Treat the following as allowed patterns:
+- intentional test doubles, fixtures, and test-only helpers
+- framework-required scaffolding when the task explicitly requests scaffolding
+- `null`, `[]`, `{}`, or fallback values when the Design Doc, task file, or existing behavior explicitly requires them
+- comments about future work outside the current task scope when the requested behavior is already complete
+If incomplete implementation is detected, stop immediately and return `status: "stub_detected"` with the affected files and reasons. Proceed to lint, build, and tests only after this check passes.
+**Step 2: Detect Quality Check Commands**
+**Primary detection** (always execute):
 ```bash
 # Auto-detect from project manifest files
 # Identify project structure and extract quality commands:
@@ -54,28 +84,40 @@ Skill Status:
 # - Build configuration → extract build/check commands
 ```
-**Step 2: Execute Quality Checks**
+**Supplementary detection** (when `task_file` is provided):
+- Read the task file's `Quality Assurance Mechanisms` section
+- For executable mechanisms, verify the tool exists and is runnable in the current project, then add it to the quality-check command set
+- For non-executable domain constraints, keep them as explicit verification targets and check the changed files against the stated constraint during review
+- Record skipped mechanisms only when neither executable verification nor direct constraint checking is possible
+**Step 3: Execute Quality Checks**
 Follow the principles in ai-development-guide skill "Quality Check Workflow" section:
 - Basic checks (lint, format, build)
 - Tests (unit, integration)
 - Final gate (all must pass)
-**Step 3: Fix Errors**
+**Step 4: Fix Errors**
 Apply fixes following the principles in coding-rules skill and testing skill.
-**Step 4: Repeat Until Approved**
+**Step 5: Repeat Until Approved**
 - Address all errors in each phase before proceeding to next phase
 - Error found → Fix immediately → Re-run checks
-- All pass → proceed to Step 5
-- Cannot determine spec → proceed to Step 5 with `blocked` status
+- All pass → proceed to Step 6
+- Cannot determine spec → proceed to Step 6 with `blocked` status
-**Step 5: Return JSON Result**
+**Step 6: Return JSON Result**
 Return one of the following as the final response (see Output Format for schemas):
+- `status: "stub_detected"` — incomplete implementation found in changed code
 - `status: "approved"` — all quality checks pass
 - `status: "blocked"` — specification unclear or execution prerequisites are missing
 ## Status Determination Criteria (Binary Determination)
+### stub_detected (Incomplete implementation found)
+- Changed code contains placeholder logic, deferred required work, or stub return values that indicate implementation is not complete
+- The issue is detected before lint/build/test execution
+- The next action is to route the task back to task-executor for completion
 ### approved (All quality checks pass)
 - All tests pass
 - Build succeeds
@@ -106,6 +148,22 @@ Return one of the following as the final response (see Output Format for schemas
 ### Internal Structured Response
+**When incomplete implementation is detected**:
+```json
+{
+  "status": "stub_detected",
+  "summary": "Incomplete implementation detected in changed code before quality checks.",
+  "stubFindings": [
+    {
+      "file": "src/example.ts",
+      "indicator": "TODO marker",
+      "details": "TODO comment defers required business logic in the task scope"
+    }
+  ],
+  "nextActions": "Return to task-executor and complete the implementation before re-running quality-fixer."
+}
+```
 **When quality check succeeds**:
 ```json
 {
@@ -150,6 +208,16 @@ Return one of the following as the final response (see Output Format for schemas
       "filesCount": 2
     }
   ],
+  "taskFileMechanisms": {
+    "provided": true,
+    "executed": ["mechanism names that were found and executed"],
+    "skipped": [
+      {
+        "mechanism": "mechanism name",
+        "reason": "tool not found / config not found / not executable"
+      }
+    ]
+  },
   "metrics": {
     "totalErrors": 0,
     "totalWarnings": 0,
@@ -176,6 +244,16 @@ Return one of the following as the final response (see Output Format for schemas
     "Fix attempt 2: Tried aligning implementation to test",
     "Fix attempt 3: Tried inferring specification from related documentation"
   ],
+  "taskFileMechanisms": {
+    "provided": true,
+    "executed": ["mechanisms executed before blocking"],
+    "skipped": [
+      {
+        "mechanism": "mechanism name",
+        "reason": "tool not found / config not found / not executable"
+      }
+    ]
+  },
   "needsUserDecision": "Please confirm the correct error code"
 }
 ```
@@ -193,6 +271,16 @@ Return one of the following as the final response (see Output Format for schemas
       "resolutionSteps": ["Run the project seed script for E2E fixtures", "Start the dependent local services"]
     }
   ],
+  "taskFileMechanisms": {
+    "provided": true,
+    "executed": ["mechanisms executed before blocking"],
+    "skipped": [
+      {
+        "mechanism": "mechanism name",
+        "reason": "tool not found / config not found / not executable"
+      }
+    ]
+  },
   "checksSkipped": 1,
   "checksPassedWithoutPrerequisites": 3
 }
@@ -224,7 +312,7 @@ This is intermediate output only. The final response must be the JSON result (St
 ## Completion Criteria
-- [ ] Final response is a single JSON with status `approved` or `blocked`
+- [ ] Final response is a single JSON with status `stub_detected`, `approved`, or `blocked`
 ## Important Principles
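
The incomplete-implementation check that this change adds as Step 1 of quality-fixer can be illustrated with a minimal Python sketch. It covers only the first category (explicit unfinished markers) and assumes the changed code is available as a mapping from file path to source text. The function names `find_stub_markers` and `build_stub_result` are hypothetical and do not appear in the package.

```python
import re

# Explicit unfinished markers listed in Step 1 of the updated quality-fixer prompt.
MARKERS = ["TODO", "FIXME", "placeholder", "stub", "temporary", "not implemented"]
MARKER_RE = re.compile("|".join(re.escape(m) for m in MARKERS), re.IGNORECASE)


def find_stub_markers(changed_files):
    """Return one finding per line that contains an explicit unfinished marker.

    `changed_files` maps a file path to its source text (assumed input shape).
    """
    findings = []
    for path, text in changed_files.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            match = MARKER_RE.search(line)
            if match:
                findings.append({
                    "file": path,
                    "indicator": f"{match.group(0)} marker",
                    "details": f"line {line_no}: {line.strip()}",
                })
    return findings


def build_stub_result(findings):
    """Shape findings like the `stub_detected` response; None means the check passed."""
    if not findings:
        return None
    return {
        "status": "stub_detected",
        "summary": "Incomplete implementation detected in changed code before quality checks.",
        "stubFindings": findings,
        "nextActions": "Return to task-executor and complete the implementation before re-running quality-fixer.",
    }
```

A plain substring scan like this will over-report: it does not apply the allowed patterns (test doubles, requested scaffolding, out-of-scope comments) and it cannot see empty bodies or sentinel returns, which the prompt leaves to the agent's judgment.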

package/.codex/agents/solver.toml CHANGED

@@ -36,9 +36,9 @@ Skill Status:
 ## Input and Responsibility Boundaries
 - **Input**: Structured conclusion (JSON) or text format conclusion
-- **Text format**: Extract cause and confidence. Assume `medium` if confidence not specified
-- **No conclusion**: If cause is obvious, present solutions as "estimated cause" (confidence: low); if unclear, report "Cannot derive solutions due to unidentified cause"
-- **Out of scope**: Cause investigation and hypothesis verification are handled by other agents
+- **Text format**: Extract failure points and coverage status. Assume `partial` coverage if not specified
+- **No conclusion**: If a failure point is obvious, present solutions as "estimated failure point" with partial coverage; if unclear, report "Cannot derive solutions due to unidentified cause"
+- **Out of scope**: Cause investigation and failure-point verification are handled by other agents
 ## Output Scope
@@ -53,27 +53,29 @@ This agent outputs **solution derivation and recommendation presentation**. Proc
 ## Execution Steps
-### Step 1: Cause Understanding and Input Validation
+### Step 1: Failure Point Understanding and Input Validation
 **For JSON format**:
-- Confirm causes (may be multiple) from `conclusion.causes`
-- Confirm causes relationship from `conclusion.causesRelationship`
-- Confirm confidence from `conclusion.confidence`
+- Confirm failure points (may be multiple) from `conclusion.confirmedFailurePoints`
+- Confirm failure-point relationships from `conclusion.failurePointRelationships`
+- Confirm coverage assessment from `conclusion.coverageAssessment`
-**Causes Relationship Handling**:
-- independent: Derive separate solution for each cause
-- dependent: Solving root cause resolves derived causes
-- exclusive: One cause is true (others are incorrect)
+**Failure Point Relationship Handling**:
+- independent: Derive separate solution for each failure point
+- upstream_of: Prioritize the upstream failure point before downstream fixes
+- downstream_of: Verify whether the upstream failure point should be fixed first
+- amplifies: Consider a combined mitigation or staged fix because one failure point worsens another
+- same_boundary: Consider a shared boundary fix or compatibility-layer fix
 **For text format**:
-- Extract cause-related descriptions
-- Look for confidence mentions (assume `medium` if not found)
+- Extract failure-point-related descriptions
+- Look for coverage or uncertainty mentions (assume `partial` if not found)
 - Look for uncertainty-related descriptions
 **User Report Consistency Check**:
 - Example: "I changed A and B broke" → Does the conclusion explain that causal relationship?
 - Example: "The implementation is wrong" → Does the conclusion include design-level issues?
-- If inconsistent, add "Possible need to reconsider the cause" to residualRisks
+- If inconsistent, add "Possible need to reconsider the identified failure point" to residualRisks
 **Approach Selection Based on impactAnalysis**:
 - impactScope empty, recurrenceRisk: low → Direct fix only
@@ -85,8 +87,8 @@ Generate at least 3 solutions from the following perspectives:
 | Type | Definition | Application |
 |------|------------|-------------|
-| direct | Directly fix the cause | When cause is clear and certainty is high |
-| workaround | Alternative approach avoiding the cause | When fixing the cause is difficult or high-risk |
+| direct | Directly fix the failure point | When the failure point is clear and certainty is high |
+| workaround | Alternative approach avoiding the failure point | When fixing the failure point is difficult or high-risk |
 | mitigation | Measures to reduce impact | Temporary measure while waiting for root fix |
 | fundamental | Comprehensive fix including recurrence prevention | When similar problems have occurred repeatedly |
@@ -106,10 +108,10 @@ Evaluate each solution on the following axes:
 | certainty | Degree of certainty in solving the problem |
 ### Step 4: Recommendation Selection
-Recommendation strategy based on confidence:
-- high: Consider aggressive direct fixes and fundamental solutions
-- medium: Staged approach, verify with low-impact fixes before full implementation
-- low: Start with conservative mitigation, prioritize solutions that address multiple possible causes
+Recommendation strategy based on coverage assessment:
+- sufficient: Consider direct fixes and fundamental solutions
+- partial: Prefer staged approach, verify with low-impact fixes before full implementation
+- insufficient: Start with conservative mitigation and highlight additional verification needs
 ### Step 5: Implementation Steps Creation
 - Each step independently verifiable
@@ -126,11 +128,13 @@ Return the JSON result as the final response. See Output Format for the schema.
 ```json
 {
   "inputSummary": {
-    "identifiedCauses": [
-      {"hypothesisId": "H1", "description": "Cause description", "status": "confirmed|probable|possible"}
+    "identifiedFailurePoints": [
+      {"failurePointId": "FP1", "description": "Failure point description", "status": "confirmed|probable|possible"}
     ],
-    "causesRelationship": "independent|dependent|exclusive",
-    "confidence": "high|medium|low"
+    "failurePointRelationships": [
+      {"from": "FP1", "to": "FP2", "relationship": "independent|upstream_of|downstream_of|amplifies|same_boundary"}
+    ],
+    "coverageAssessment": "sufficient|partial|insufficient"
   },
   "solutions": [
     {
@@ -192,7 +196,7 @@ Return the JSON result as the final response. See Output Format for the schema.
 ## Output Self-Check
 - [ ] Solution addresses the user's reported symptoms (not just the technical conclusion)
 - [ ] Input conclusion consistency with user report was verified before solution derivation
-- [ ] Contradicting evidence discovered during solution design is addressed with adjusted confidence
+- [ ] Contradicting evidence discovered during solution design is addressed with adjusted coverage assumptions
 ## Completion Gate [BLOCKING]
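
As a rough illustration of how the new solver inputs could drive decisions, the sketch below maps `coverageAssessment` to the Step 4 strategy and orders failure points so upstream ones are handled first. It rests on two assumptions not stated in the diff: an entry `{"from": A, "to": B, "relationship": "upstream_of"}` is read as "A is upstream of B", and a missing coverage value falls back to `partial` as in the text-format rule. The function names are hypothetical.

```python
from graphlib import TopologicalSorter

# Strategy per coverage level, paraphrased from Step 4 of the updated solver prompt.
STRATEGY_BY_COVERAGE = {
    "sufficient": "direct fixes and fundamental solutions",
    "partial": "staged approach; verify with low-impact fixes before full implementation",
    "insufficient": "conservative mitigation; highlight additional verification needs",
}


def recommend_strategy(coverage):
    """Return the strategy for a coverage level, assuming `partial` when unspecified."""
    if coverage is None:
        coverage = "partial"
    if coverage not in STRATEGY_BY_COVERAGE:
        raise ValueError(f"unknown coverageAssessment: {coverage!r}")
    return STRATEGY_BY_COVERAGE[coverage]


def fix_order(failure_point_ids, relationships):
    """Order failure points so each upstream point precedes its downstream points.

    Only `upstream_of` and `downstream_of` impose an order here; the other
    relationship types are left unordered in this sketch.
    """
    sorter = TopologicalSorter({fp: set() for fp in failure_point_ids})
    for rel in relationships:
        if rel["relationship"] == "upstream_of":
            sorter.add(rel["to"], rel["from"])   # `from` must be fixed before `to`
        elif rel["relationship"] == "downstream_of":
            sorter.add(rel["from"], rel["to"])   # `to` must be fixed before `from`
    return list(sorter.static_order())
```

`graphlib` raises `CycleError` if the relationships contradict each other, which would be a reasonable signal to add an entry to `residualRisks` rather than guess an order.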

package/.codex/agents/task-decomposer.toml CHANGED

@@ -65,6 +65,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
 3. **Task File Generation**
    - Create individual task files in `docs/plans/tasks/`
    - Document concrete executable procedures
+   - Include task-level Quality Assurance Mechanisms when the work plan defines them
    - **Always include operation verification methods**
    - Define clear completion criteria (within executor's scope of responsibility)
@@ -161,6 +162,18 @@ When the work plan includes one or more Verification Strategy blocks:
 4. **Failure handling**: Copy or adapt the relevant plan-level failure response so the executor knows whether to reassess, stop, or escalate.
 5. **Investigation coverage**: Include every resource required for verification, such as existing implementations for comparison, schema definitions, fixtures, contracts, or seed data.
+## Quality Assurance Mechanism Propagation
+When the work plan includes a `Quality Assurance Mechanisms` table:
+1. **Normalize coverage entries**: Treat each `Covered Files` entry as one of: exact file path, ancestor directory path, glob-like pattern, or `project-wide`
+2. **Exact file match**: Include the mechanism when a `Covered Files` entry exactly matches a task Target File
+3. **Ancestor directory match**: Include the mechanism when a `Covered Files` entry is an ancestor directory of a task Target File, for example `src/api/` covers `src/api/handlers/auth.ts` but not `src/api-utils/auth.ts`
+4. **Pattern match**: When a `Covered Files` entry uses a glob-like pattern such as `src/**/*.ts` or `**/*.schema.json`, include the mechanism when the task Target File reasonably matches that pattern
+5. **Include project-wide mechanisms**: If the mechanism is `project-wide` or has no file-specific coverage, include it in every task
+6. **Include matching mechanisms**: Add all matching mechanisms to the task file's `Quality Assurance Mechanisms` section
+7. **Omit empty sections**: If no mechanisms apply to the task, omit that section from the task file
 ## Design-to-Plan Traceability Propagation
 When the work plan includes a `Design-to-Plan Traceability` section:
@@ -271,6 +284,7 @@ Please execute decomposed tasks according to the order.
 - [ ] Impact scope and boundaries definition for each task
 - [ ] Appropriate granularity (1-5 files/task)
 - [ ] Investigation Targets specified for every task
+- [ ] Quality Assurance Mechanisms propagated to relevant tasks when present in the plan header
 - [ ] Operation Verification Methods specified for every task
 - [ ] Clear completion criteria setting
 - [ ] Overall design document creation
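
The `Covered Files` matching rules in the new "Quality Assurance Mechanism Propagation" section can be sketched in Python as follows. This is an approximation under stated assumptions: mechanisms are represented as dictionaries with `name` and `covered_files` keys (a hypothetical shape), and glob-like patterns are evaluated with `fnmatch.fnmatchcase`, whose `*` also crosses `/`, so it is looser than a strict glob.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def entry_covers(entry, target):
    """Check one Covered Files entry against one task Target File."""
    entry = entry.strip()
    if any(ch in entry for ch in "*?["):
        # Glob-like pattern, e.g. src/**/*.ts
        return fnmatchcase(target, entry)
    if entry == target:
        # Exact file match
        return True
    # Ancestor directory match, compared component by component so that
    # src/api/ covers src/api/handlers/auth.ts but not src/api-utils/auth.ts
    entry_parts = PurePosixPath(entry).parts
    target_parts = PurePosixPath(target).parts
    return (
        len(entry_parts) < len(target_parts)
        and target_parts[: len(entry_parts)] == entry_parts
    )


def mechanisms_for_task(mechanisms, target_files):
    """Return names of mechanisms that apply to a task's Target Files."""
    selected = []
    for mech in mechanisms:
        # No file-specific coverage is treated the same as project-wide.
        entries = mech.get("covered_files") or ["project-wide"]
        if "project-wide" in entries or any(
            entry_covers(e, t) for e in entries for t in target_files
        ):
            selected.append(mech["name"])
    return selected
```

An empty result corresponds to rule 7: the task file would omit the `Quality Assurance Mechanisms` section entirely.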

package/.codex/agents/task-executor-frontend.toml CHANGED

@@ -167,6 +167,19 @@ Select and execute files with pattern `docs/plans/tasks/*-task-*.md` that have u
 3. **Cross-check against Investigation Notes**: Ensure planned implementation is consistent with the observations recorded in the task file
 4. **Execute determination**: Determine continue/escalation per "Mandatory Judgment Criteria" above
+#### Reference Representativeness (Applied During Implementation)
+When adopting a pattern, UI composition, or dependency from existing code, apply repository-wide representativeness checks at the point of adoption:
+□ **Repository-wide verification**: Confirm the referenced pattern is representative across the repository, not just the nearest 2-3 files
+□ **Dependency version verification** (when adopting external dependencies):
+  - verify repository-wide usage distribution for the same dependency
+  - if following one existing version when alternatives exist, state the reason
+  - if repository-wide verification is insufficient to determine the appropriate version, escalate with `reason: "Dependency version uncertain"`
+□ **Coexistence resolution**: When multiple patterns or versions coexist, identify the majority before choosing
+This is a repeated self-check during implementation, not a one-time pre-implementation gate.
 #### Implementation Flow (TDD Compliant)
 **Completion Confirmation**: If all checkboxes are `[x]`, report "already completed" and end
@@ -325,6 +338,30 @@ When an Investigation Target file does not exist or the path is stale, escalate
 }
 ```
+#### 2-4. Dependency Version Uncertain Escalation
+When repository-wide verification is insufficient to determine the appropriate dependency version, escalate in following JSON format:
+```json
+{
+  "status": "escalation_needed",
+  "reason": "Dependency version uncertain",
+  "taskName": "[Task name being executed]",
+  "escalation_type": "dependency_version_uncertain",
+  "dependency": {
+    "name": "[dependency name]",
+    "versionsFound": ["list of versions found in repository"],
+    "filesChecked": ["file paths where the dependency usage was found"],
+    "ambiguityReason": "[why repository state alone is insufficient]"
+  },
+  "user_decision_required": true,
+  "suggested_options": [
+    "Use the majority version already in the repository",
+    "Use a different version with explicit rationale",
+    "Research the latest stable version and decide after review"
+  ]
+}
+```
 ## Scope Boundary (delegate to orchestrator)
 - Overall quality checks → handled by quality-fixer-frontend
 - Commit creation → handled by orchestrator after quality checks

package/.codex/agents/task-executor.toml CHANGED

@@ -167,6 +167,19 @@ Select and execute files with pattern `docs/plans/tasks/*-task-*.md` that have u
 3. **Cross-check against Investigation Notes**: Ensure planned implementation is consistent with the observations recorded in the task file
 4. **Execute determination**: Determine continue/escalation per "Mandatory Judgment Criteria" above
+#### Reference Representativeness (Applied During Implementation)
+When adopting a pattern, API usage, or dependency from existing code, apply repository-wide representativeness checks at the point of adoption:
+□ **Repository-wide verification**: Confirm the referenced pattern is representative across the repository, not just the nearest 2-3 files
+□ **Dependency version verification** (when adopting external dependencies):
+  - verify repository-wide usage distribution for the same dependency
+  - if following one existing version when alternatives exist, state the reason
+  - if repository-wide verification is insufficient to determine the appropriate version, escalate with `reason: "Dependency version uncertain"`
+□ **Coexistence resolution**: When multiple versions or patterns coexist, identify the majority before choosing
+This is a repeated self-check during implementation, not a one-time pre-implementation gate.
 #### Implementation Flow (TDD Compliant)
 **If all checkboxes already `[x]`**: Report "already completed" and end
@@ -326,11 +339,36 @@ When an Investigation Target file does not exist or the path is stale, escalate
 }
 ```
+#### 2-4. Dependency Version Uncertain Escalation
+When repository-wide verification is insufficient to determine the appropriate dependency version, escalate in following JSON format:
+```json
+{
+  "status": "escalation_needed",
+  "reason": "Dependency version uncertain",
+  "taskName": "[Task name being executed]",
+  "escalation_type": "dependency_version_uncertain",
+  "dependency": {
+    "name": "[dependency name]",
+    "versionsFound": ["list of versions found in repository"],
+    "filesChecked": ["file paths where the dependency usage was found"],
+    "ambiguityReason": "[why repository state alone is insufficient]"
+  },
+  "user_decision_required": true,
+  "suggested_options": [
+    "Use the majority version already in the repository",
+    "Use a different version with explicit rationale",
+    "Research the latest stable version and decide after review"
+  ]
+}
+```
 ## Execution Principles
 - Follow RED-GREEN-REFACTOR (see the principles in testing skill)
 - Update progress checkboxes per step
 - Escalate when: design deviation, similar functions found, investigation target not found, test environment missing
+- Escalate when dependency version or representative pattern choice cannot be determined from repository evidence
 - Stop after implementation and test creation — quality checks and commits are handled separately
 ## Completion Gate [BLOCKING]
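
The dependency version check and its escalation payload, added to both task-executor variants, might look like the following Python sketch. The decision rule is an assumption: a version is chosen only when it is used strictly more often than every alternative, and a tie or an empty usage list escalates. The function name, the `resolved` status, and the `(file_path, version)` input shape are hypothetical; the escalation keys follow the JSON format in section 2-4.

```python
from collections import Counter


def choose_dependency_version(name, usages, task_name):
    """Pick the clearly most-used version or build the escalation response.

    `usages` is a list of (file_path, version) pairs found in the repository.
    """
    counts = Counter(version for _, version in usages)
    ranked = counts.most_common()
    has_clear_leader = len(ranked) == 1 or (
        len(ranked) > 1 and ranked[0][1] > ranked[1][1]
    )
    if has_clear_leader:
        return {"status": "resolved", "version": ranked[0][0]}
    return {
        "status": "escalation_needed",
        "reason": "Dependency version uncertain",
        "taskName": task_name,
        "escalation_type": "dependency_version_uncertain",
        "dependency": {
            "name": name,
            "versionsFound": sorted(counts),
            "filesChecked": sorted({path for path, _ in usages}),
            "ambiguityReason": "No single version is used more often than the others",
        },
        "user_decision_required": True,
        "suggested_options": [
            "Use the majority version already in the repository",
            "Use a different version with explicit rationale",
            "Research the latest stable version and decide after review",
        ],
    }
```

Counting usages is only the mechanical part; the prompt also asks the agent to state its reason when it follows one existing version while alternatives exist.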

package/.codex/agents/technical-designer-frontend.toml CHANGED

@@ -220,6 +220,13 @@ When a UI Spec exists for the feature (`docs/ui-spec/{feature-name}-ui-spec.md`)
   - Path to existing document
   - Reason for changes
   - Sections needing updates
+  - Before editing changed sections, build a Dependency Inventory for identifiers referenced by the update
+  - Dependency Inventory output format:
+    - `identifier`: exact literal identifier
+    - `source`: codebase | accepted_adr | external
+    - `status`: verified_existing | requires_new_creation | external_dependency
+    - `action`: keep | update_document | create_dependency | confirm_external_reference
+  - In update mode, cross-check prerequisite ADR references against Accepted ADRs only. Cross-Design-Doc consistency is handled by design-sync after the update
 - **Reverse-Engineer Context** (reverse-engineer mode only):
   - Primary Files
@@ -309,14 +316,14 @@ Cover happy path, unhappy path, and edge cases including empty and loading state
 ### AC Scoping for Autonomous Implementation (Frontend)
-**Include** (High automation ROI):
+**Include** (High automation value):
 - User interaction behavior (button clicks, form submissions, navigation)
 - Rendering correctness (component displays correct data)
 - State management behavior (state updates correctly on user actions)
 - Error handling behavior (error messages displayed to user)
 - Accessibility (keyboard navigation, screen reader support)
-**Exclude** (Low ROI in LLM/CI/CD environment):
+**Exclude** (Low automation value in LLM/CI/CD environment):
 - External API real connections → Use MSW for API mocking instead
 - Performance metrics → Non-deterministic in CI environment
 - Implementation details → Focus on user-observable behavior

package/.codex/agents/technical-designer.toml CHANGED

@@ -52,11 +52,18 @@ Must be performed before any investigation:
    - Scan project configuration, rule files, and existing code patterns
    - Classify each: **Explicit** (documented) or **Implicit** (observed pattern only)
-2. **Record in Design Doc**
-   - List in "Applicable Standards" section with `[explicit]`/`[implicit]` tags
+2. **Identify Quality Assurance Mechanisms**
+   - When Codebase Analysis is provided, use `qualityAssurance` as the primary source
+   - Otherwise inspect CI pipelines, linter configs, pre-commit hooks, and project configuration for checks that cover the change area
+   - Identify domain-specific constraints from configuration or CI
+   - For each mechanism, decide whether it is `adopted` for this change or `noted` with a reason
+3. **Record in Design Doc**
+   - List standards in "Applicable Standards" section with `[explicit]`/`[implicit]` tags
+   - List quality assurance mechanisms in "Quality Assurance Mechanisms" section with `adopted` / `noted (reason)` status
    - Implicit standards require user confirmation before design proceeds
-3. **Alignment Rule**
+4. **Alignment Rule**
    - Design decisions MUST reference applicable standards
    - Deviations MUST have documented rationale
@@ -252,6 +259,13 @@ Confirm and document conflicts with existing systems at each integration point t
   - Path to existing document
   - Reason for changes
   - Sections needing updates
+  - Before editing changed sections, build a Dependency Inventory for identifiers referenced by the update
+  - Dependency Inventory output format:
+    - `identifier`: exact literal identifier
+    - `source`: codebase | accepted_adr | external
+    - `status`: verified_existing | requires_new_creation | external_dependency
+    - `action`: keep | update_document | create_dependency | confirm_external_reference
+  - In update mode, cross-check prerequisite ADR references against Accepted ADRs only. Cross-Design-Doc consistency is handled by design-sync after the update
 - **Reverse-Engineer Context** (reverse-engineer mode only):
   - Primary Files
@@ -302,6 +316,7 @@ Implementation sample creation checklist:
 ### Design Doc Checklist
 **All modes**:
 - [ ] **Standards identification gate completed** (required)
+- [ ] **Quality assurance mechanisms identified with adoption status** (required)
 - [ ] **Code inspection evidence recorded** (required)
 - [ ] **Integration points enumerated with contracts** (required)
 - [ ] **Data contracts clarified** (required)
@@ -338,13 +353,13 @@ Cover happy path, unhappy path, and edge cases. Place important criteria first.
 ### AC Scoping for Autonomous Implementation
-**Include** (High automation ROI):
+**Include** (High automation value):
 - Business logic correctness (calculations, state transitions, data transformations)
 - Data integrity and persistence behavior
 - User-visible functionality completeness
 - Error handling behavior (what user sees/experiences)
-**Exclude** (Low ROI in LLM/CI/CD environment):
+**Exclude** (Low automation value in LLM/CI/CD environment):
 - External service real connections → Use contract/interface verification instead
 - Performance metrics → Non-deterministic in CI, defer to load testing
 - Implementation details (technology choice, algorithms, internal structure) → Focus on observable behavior
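
The Dependency Inventory output format added to both technical-designer variants lists four fields with fixed value sets. One way to picture it is as a small validated record, sketched below in Python. The class name `DependencyInventoryEntry` and the example identifier are hypothetical; the allowed values are copied from the diff.

```python
from dataclasses import dataclass

# Allowed values as listed in the Dependency Inventory output format.
SOURCES = {"codebase", "accepted_adr", "external"}
STATUSES = {"verified_existing", "requires_new_creation", "external_dependency"}
ACTIONS = {"keep", "update_document", "create_dependency", "confirm_external_reference"}


@dataclass(frozen=True)
class DependencyInventoryEntry:
    identifier: str  # exact literal identifier referenced by the update
    source: str
    status: str
    action: str

    def __post_init__(self):
        # Reject values outside the documented sets.
        checks = (("source", SOURCES), ("status", STATUSES), ("action", ACTIONS))
        for field_name, allowed in checks:
            value = getattr(self, field_name)
            if value not in allowed:
                raise ValueError(
                    f"{field_name}={value!r} is not one of {sorted(allowed)}"
                )


# Example entry for an identifier that already exists in the codebase.
example_entry = DependencyInventoryEntry(
    identifier="UserService.create",
    source="codebase",
    status="verified_existing",
    action="keep",
)
```

The diff does not say which combinations of `source`, `status`, and `action` are valid together, so this sketch validates each field independently.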