codex-workflows (npm) 0.2.1 → 0.2.3

Files changed (31)

package/.agents/skills/recipe-add-integration-tests/SKILL.md +2 -2
package/.agents/skills/recipe-build/SKILL.md +1 -1
package/.agents/skills/recipe-diagnose/SKILL.md +20 -4
package/.agents/skills/recipe-front-build/SKILL.md +2 -2
package/.agents/skills/recipe-fullstack-build/SKILL.md +1 -1
package/.agents/skills/recipe-fullstack-implement/SKILL.md +1 -1
package/.agents/skills/recipe-implement/SKILL.md +1 -1
package/.agents/skills/recipe-reverse-engineer/SKILL.md +56 -12
package/.agents/skills/recipe-update-doc/SKILL.md +10 -5
package/.agents/skills/subagents-orchestration-guide/SKILL.md +3 -3
package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md +2 -2
package/.codex/agents/code-reviewer.toml +11 -1
package/.codex/agents/code-verifier.toml +58 -21
package/.codex/agents/document-reviewer.toml +4 -2
package/.codex/agents/integration-test-reviewer.toml +4 -0
package/.codex/agents/investigator.toml +20 -17
package/.codex/agents/prd-creator.toml +39 -24
package/.codex/agents/quality-fixer-frontend.toml +15 -7
package/.codex/agents/quality-fixer.toml +15 -7
package/.codex/agents/requirement-analyzer.toml +4 -0
package/.codex/agents/rule-advisor.toml +9 -0
package/.codex/agents/scope-discoverer.toml +67 -29
package/.codex/agents/security-reviewer.toml +4 -0
package/.codex/agents/solver.toml +6 -2
package/.codex/agents/task-executor-frontend.toml +9 -0
package/.codex/agents/task-executor.toml +9 -0
package/.codex/agents/technical-designer-frontend.toml +68 -115
package/.codex/agents/technical-designer.toml +70 -114
package/.codex/agents/verifier.toml +11 -13
package/README.md +2 -2
package/package.json +1 -1

package/.codex/agents/prd-creator.toml CHANGED

@@ -106,7 +106,7 @@ Output in the following structured format:
 ### For Final Version
 Storage location and naming convention follow the principles in documentation-criteria skill.
-**Handling Undetermined Items**: When information is insufficient, do not speculate. Instead, list questions in an "Undetermined Items" section.
+**Handling Undetermined Items**: When a claim cannot be confirmed directly from code, tests, or configuration, list the unresolved question in an "Undetermined Items" section.
 ## Output Policy
 Execute file output immediately. Final approval is managed by the orchestrator recipe.
@@ -116,10 +116,9 @@ Execute file output immediately. Final approval is managed by the orchestrator r
 - Understand and describe intent of each section
 - Limit questions to 3-5 in interactive mode
-## PRD Boundaries: Do Not Include Implementation Phases
+## PRD Boundaries
 **ENFORCEMENT**: PRDs MUST focus solely on "what to build" — implementation phases and task decomposition are out of scope.
-These are outside the scope of this document. PRDs MUST focus solely on "what to build."
 ## PRD Creation Best Practices
@@ -183,58 +182,74 @@ Mode for extracting specifications from existing implementation to create PRD. U
 ### External Scope Handling
 When `External Scope Provided: true` is specified:
-- Skip independent scope discovery (Step 1)
-- Use provided scope data: Feature, Description, Related Files, Entry Points
-- Focus investigation within the provided scope boundaries
+- Use provided scope data as an investigation starting point: Feature, Description, Related Files, Entry Points
+- If entry point tracing reveals directly connected files or routes outside the provided scope, include them and report that expansion
 When external scope is NOT provided:
 - Execute full scope discovery independently
 ### Reverse PRD Execution Policy
 **Create high-quality PRD through thorough investigation**
-- Investigate until code implementation is fully understood
-- Comprehensively confirm related files, tests, and configurations
-- Write specifications with confidence (minimize speculation and assumptions)
 **Language Standard**: Code is the single source of truth. Describe observable behavior in definitive form. When uncertain about a behavior, investigate the code further to confirm — move the claim to "Undetermined Items" only when the behavior genuinely cannot be determined from code alone (e.g., business intent behind a design choice).
+**Literal Transcription Rule**: Identifiers, URLs, parameter names, field names, component names, and string literals MUST be copied exactly as written in code. If code contains a typo, document the actual identifier and note the typo separately when needed.
 ### Confidence Gating
 Before documenting any claim, assess confidence level:
 | Confidence | Evidence | Output Format |
 |------------|----------|---------------|
-| Verified | Direct code observation, test confirmation | State as fact |
+| Verified | Direct code observation via Read/Grep, test confirmation | State as fact |
 | Inferred | Indirect evidence, pattern matching | Mark with context |
 | Unverified | No direct evidence, speculation | Add to "Undetermined Items" section |
 **Rules**:
-- Never document Unverified claims as facts
+- Unverified claims go to "Undetermined Items" only
 - Inferred claims require explicit rationale
 - Prioritize Verified claims in core requirements
 - Before classifying as Inferred, attempt to verify by reading the relevant code — classify as Inferred only after confirming the code is inaccessible or ambiguous
-### Reverse PRD Process
-1. **Investigation Phase** (skip if External Scope Provided)
-   - Analyze all files of target feature
-   - Understand expected behavior from test cases
-   - Collect related documentation and comments
-   - Fully grasp data flow and processing logic
+### Reverse PRD Investigation Protocol
+1. **Route and Entry Point Enumeration**
+   - Enumerate routes, endpoints, commands, or other entry points in the feature area
+   - Record each with exact method/path/name and handler as written in code
+2. **Entry Point Tracing**
+   - Read each handler or entry point implementation
+   - Trace invoked services, helpers, and downstream calls by reading their implementations
+   - Record actual behavior, parameters, and key branching
+3. **Data Model Investigation**
+   - Read schemas, types, migrations, or constants referenced by the traced flow
+   - Record field names, types, nullability, validation rules, and enum values exactly as written
+4. **Test File Discovery**
+   - Enumerate test files matching the feature area
+   - Read each discovered test and record tested behaviors
+   - If no tests are found for a traced handler or service, record that explicitly
+5. **Role and Permission Discovery**
+   - Search for middleware, guards, and role checks tied to the feature
+   - Record all observed roles and permissions
-2. **Specification Documentation**
-   - Apply Confidence Gating to each claim
-   - Accurately document specifications extracted from current implementation
-   - Only describe specifications clearly readable from code
+6. **Specification Documentation**
+   - Apply Confidence Gating to every claim
+   - Describe only behavior readable from code and tests
+   - Base core sections on the entry point list, traced flow, data model, and discovered tests
-3. **Minimal Confirmation Items**
-   - Only ask about truly undecidable important matters (maximum 3)
-   - Only parts related to business decisions, not implementation details
+7. **Minimal Confirmation Items**
+   - Ask only truly undecidable business questions
+   - Limit to 3 items maximum
 ### Quality Standards
 - Verified content: 80%+ of core requirements
 - Inferred content: 15% maximum with rationale
 - Unverified content: Listed in "Undetermined Items" only
 - Specification document with implementable specificity
+- All discovered entry points are accounted for in the PRD
+- Data model details match the code-level source of truth
 ## Completion Gate [BLOCKING]
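
The Quality Standards thresholds in the hunk above (80%+ Verified, 15% maximum Inferred, Unverified only in "Undetermined Items") could be checked mechanically. The following is a minimal sketch, not part of the package: the `(text, confidence)` pair representation and the function name `check_confidence_mix` are assumptions made for illustration.

```python
from collections import Counter

# Thresholds taken from the "Quality Standards" list in the diff above.
MIN_VERIFIED_RATIO = 0.80
MAX_INFERRED_RATIO = 0.15


def check_confidence_mix(claims):
    """Return a list of problems found in a set of core-requirement claims.

    `claims` is a list of (text, confidence) pairs where confidence is
    "verified", "inferred", or "unverified". This representation is
    hypothetical; the agent definition does not prescribe one.
    """
    total = len(claims)
    if total == 0:
        return ["no claims to check"]

    counts = Counter(confidence for _, confidence in claims)
    problems = []
    if counts["unverified"] > 0:
        problems.append("unverified claims belong in 'Undetermined Items'")
    if counts["verified"] / total < MIN_VERIFIED_RATIO:
        problems.append("verified share is below 80%")
    if counts["inferred"] / total > MAX_INFERRED_RATIO:
        problems.append("inferred share is above 15%")
    return problems
```

An empty result means the claim mix satisfies all three thresholds; any returned message names the rule that was missed.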

package/.codex/agents/quality-fixer-frontend.toml CHANGED

@@ -69,8 +69,13 @@ Apply fixes following the principles in coding-rules skill and testing skill.
 **Step 4: Repeat Until Approved**
 - Address all errors in each phase before proceeding to next phase
 - Error found → Fix immediately → Re-run checks
-- All pass → Return `approved: true`
-- Cannot determine spec → Return `blocked`
+- All pass → proceed to Step 5
+- Cannot determine spec → proceed to Step 5 with `blocked` status
+**Step 5: Return JSON Result**
+Return one of the following as the final response (see Output Format for schemas):
+- `status: "approved"` — all quality checks pass
+- `status: "blocked"` — specification unclear, business judgment required
 ## Frontend-Specific Quality Criteria
@@ -174,7 +179,6 @@ Before setting status to blocked, confirm specifications in this order:
     "totalWarnings": 0,
     "executionTime": "3m 30s"
   },
-  "approved": true,
   "nextActions": "Ready to commit"
 }
 ```
@@ -200,11 +204,9 @@ Before setting status to blocked, confirm specifications in this order:
 }
 ```
-### User Report (Mandatory)
-Summarize quality check results in an understandable way for users
+## Intermediate Progress Report
-### Phase-by-phase Report (Detailed Information)
+During execution, report progress between tool calls using this format:
 ```markdown
 Phase [Number]: [Phase Name]
@@ -222,6 +224,12 @@ Issues requiring fixes:
 Phase [Number] Complete! Proceeding to next phase.
 ```
+This is intermediate output only. The final response must be the JSON result (Step 5).
+## Completion Criteria
+- [ ] Final response is a single JSON with status `approved` or `blocked`
 ## Important Principles
 MUST follow these principles to maintain high-quality React code:
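
The new Completion Criteria require the final response to be a single JSON with status `approved` or `blocked`, with progress reports treated as intermediate output only. A caller might enforce that roughly as follows. This is a sketch under assumptions: `parse_final_response` is a hypothetical helper, and how an orchestrator actually consumes the response is not shown in this diff.

```python
import json

# Status values named in Step 5 of the diff above.
ALLOWED_STATUSES = {"approved", "blocked"}


def parse_final_response(raw):
    """Parse a final response and confirm it is one JSON object with an allowed status."""
    try:
        # json.loads rejects trailing text, so a progress report appended
        # to the JSON (or sent instead of it) fails here.
        result = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"final response is not a single JSON document: {exc}") from exc

    if not isinstance(result, dict):
        raise ValueError("final response must be a JSON object")

    status = result.get("status")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return result
```

The same shape would apply to the task executors further down by swapping the allowed set for `completed` and `escalation_needed`.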

package/.codex/agents/quality-fixer.toml CHANGED

@@ -66,8 +66,13 @@ Apply fixes following the principles in coding-rules skill and testing skill.
 **Step 4: Repeat Until Approved**
 - Address all errors in each phase before proceeding to next phase
 - Error found → Fix immediately → Re-run checks
-- All pass → Return `approved: true`
-- Cannot determine spec → Return `blocked`
+- All pass → proceed to Step 5
+- Cannot determine spec → proceed to Step 5 with `blocked` status
+**Step 5: Return JSON Result**
+Return one of the following as the final response (see Output Format for schemas):
+- `status: "approved"` — all quality checks pass
+- `status: "blocked"` — specification unclear, business judgment required
 ## Status Determination Criteria (Binary Determination)
@@ -144,7 +149,6 @@ Apply fixes following the principles in coding-rules skill and testing skill.
     "totalWarnings": 0,
     "executionTime": "2m 15s"
   },
-  "approved": true,
   "nextActions": "Ready to commit"
 }
 ```
@@ -170,11 +174,9 @@ Apply fixes following the principles in coding-rules skill and testing skill.
 }
 ```
-### User Report (Mandatory)
-Summarize quality check results in an understandable way for users
+## Intermediate Progress Report
-### Phase-by-phase Report (Detailed Information)
+During execution, report progress between tool calls using this format:
 ```markdown
 Phase [Number]: [Phase Name]
@@ -192,6 +194,12 @@ Issues requiring fixes:
 Phase [Number] Complete! Proceeding to next phase.
 ```
+This is intermediate output only. The final response must be the JSON result (Step 5).
+## Completion Criteria
+- [ ] Final response is a single JSON with status `approved` or `blocked`
 ## Important Principles
 MUST follow these principles to maintain high-quality code:

package/.codex/agents/requirement-analyzer.toml CHANGED

@@ -112,6 +112,9 @@ Identify constraints, risks, and dependencies. Use web search to verify current
 ### 6. Formulate Questions
 Identify any ambiguities that affect scale determination (scopeDependencies) or require user confirmation before proceeding.
+### 7. Return JSON Result
+Return the JSON result as the final response. See Output Format for the schema.
 ## Output Format
 **JSON format is mandatory.**
@@ -161,6 +164,7 @@ Identify any ambiguities that affect scale determination (scopeDependencies) or
 - [ ] Have I correctly determined ADR necessity?
 - [ ] Have I not overlooked technical risks?
 - [ ] Have I listed scopeDependencies for uncertain scale?
+- [ ] Final response is the JSON output
 ## Completion Gate [BLOCKING]

package/.codex/agents/rule-advisor.toml CHANGED

@@ -65,6 +65,9 @@ From each skill:
 - Prioritize concrete procedures over abstract principles
 - Include checklists and actionable items
+### 4. Return JSON Result
+Return the JSON result as the final response. See Output Format for the schema.
 ## Output Format
 Return structured JSON:
@@ -172,6 +175,12 @@ Return structured JSON:
 - MUST include enough context for standalone understanding
 - Prioritize actionable guidance over theory
+## Completion Criteria
+- [ ] Task analysis completed with type, scale, and tags
+- [ ] Relevant skills loaded and sections extracted
+- [ ] Final response is the JSON output
 ## Completion Gate [BLOCKING]
 ☐ All completion criteria met with evidence

package/.codex/agents/scope-discoverer.toml CHANGED

@@ -49,36 +49,18 @@ Skill Status:
 ## Output Scope
-This agent outputs **scope discovery results and evidence only**.
-Document generation is out of scope for this agent.
-## Core Responsibilities
-1. **Multi-source Discovery** - Collect evidence from routing, tests, directory structure, docs, modules, interfaces
-2. **Boundary Identification** - Identify logical boundaries between functional units
-3. **Relationship Mapping** - Map dependencies and relationships between discovered units
-4. **Confidence Assessment** - Assess confidence level with triangulation strength
-## Discovery Approach
-### When reference_architecture is provided (Top-Down)
-1. Apply RA layer definitions as initial classification framework
-2. Map code directories to RA layers
-3. Discover units within each layer
-4. Validate boundaries against RA expectations
-### When reference_architecture is none (Bottom-Up)
-1. Scan all discovery sources
-2. Identify natural boundaries from code structure
-3. Group related components into units
-4. Validate through cross-source confirmation
+This agent outputs **scope discovery results, evidence, and PRD unit grouping**.
+Document generation (PRD content, Design Doc content) is out of scope for this agent.
 ## Unified Scope Discovery
 Explore the codebase from both user-value and technical perspectives simultaneously, then synthesize results into functional units.
+When `reference_architecture` is provided:
+- Use its layer definitions to classify discovered code into layers
+- Validate unit boundaries against those expectations
+- Record deviations in `uncertainAreas`
 ### Discovery Sources
 | Source | Priority | Perspective | What to Look For |
@@ -115,19 +97,39 @@ Explore the codebase from both user-value and technical perspectives simultaneou
    - Identify interface contracts
 4. **Synthesis into Functional Units**
-   - Merge user-value groups and technical boundaries into functional units
+   - Combine user-value groups and technical boundaries into functional units
    - Each unit MUST represent a coherent feature with identifiable technical scope
+   - For each unit, identify its `valueProfile`: who uses it, what goal it serves, and what high-level capability it belongs to
+   - Also assign normalized grouping keys in `valueProfile.groupingKey` for persona, goal, and category; use short stable slugs (`kebab-case`) rather than free-form prose
    - Apply Granularity Criteria (see below)
-5. **Boundary Validation**
+5. **Unit Inventory Enumeration**
+   - For each discovered unit, enumerate:
+     - Routes or entry points in the unit's related files
+     - Test files covering the unit
+     - Public exports or interfaces in primary modules
+   - Store the results as `unitInventory`
+   - Use actual enumeration, not inferred summaries
+6. **Boundary Validation**
    - Verify each unit delivers distinct user value
    - Check for minimal overlap between units
    - Identify shared dependencies and cross-cutting concerns
-6. **Saturation Check**
+7. **Saturation Check**
    - Stop discovery when 3 consecutive new sources yield no new units
    - Mark discovery as saturated in output
+8. **PRD Unit Grouping** (execute only after steps 1-7 are fully complete)
+   - Using the finalized `discoveredUnits` and their `valueProfile` metadata, group units into PRD-appropriate units
+   - Grouping logic: units with the same `groupingKey.valueCategory` AND the same `groupingKey.userGoal` AND the same `groupingKey.targetPersona` belong to one PRD unit. If any of the three differs, the units become separate PRD units
+   - Free-text fields (`targetPersona`, `userGoal`, `valueCategory`) are explanatory only and MUST NOT be used as grouping keys
+   - Every discovered unit must appear in exactly one PRD unit's `sourceUnits`
+   - Output as `prdUnits` alongside `discoveredUnits` (see Output Format)
+9. **Return JSON Result**
+   - Return the JSON result as the final response. See Output Format for the schema.
 ## Granularity Criteria
 Each discovered unit MUST represent a Vertical Slice — a coherent functional unit that spans all relevant layers — and satisfy:
@@ -138,11 +140,13 @@ Each discovered unit MUST represent a Vertical Slice — a coherent functional u
 - Multiple independent user journeys within one unit
 - Multiple distinct data domains with no shared state
-**Merge signals** (units may be too granular):
+**Cohesion signals** (units that may belong together):
 - Units share >50% of related files
 - One unit cannot function without the other
 - Combined scope is still under 10 files
+Note: These signals are informational only during steps 1-6. Keep all discovered units separate and capture accurate value metadata (see `valueProfile` in Output Format). PRD-level grouping is performed in step 7 after discovery is complete, using normalized grouping keys rather than free-text descriptions.
 ## Confidence Assessment
 | Level | Triangulation Strength | Criteria |
@@ -174,11 +178,26 @@ Each discovered unit MUST represent a Vertical Slice — a coherent functional u
       "entryPoints": ["/path1", "/path2"],
       "relatedFiles": ["src/feature/*"],
       "dependencies": ["UNIT-002"],
+      "valueProfile": {
+        "targetPersona": "Who this feature serves (e.g., 'end user', 'admin', 'developer')",
+        "userGoal": "What the user is trying to accomplish with this feature",
+        "valueCategory": "High-level capability this belongs to (e.g., 'Authentication', 'Content Management', 'Reporting')",
+        "groupingKey": {
+          "targetPersona": "end-user",
+          "userGoal": "sign-in",
+          "valueCategory": "authentication"
+        }
+      },
       "technicalProfile": {
         "primaryModules": ["src/auth/service.ts", "src/auth/controller.ts"],
         "publicInterfaces": ["AuthService.login()", "AuthController.handleLogin()"],
         "dataFlowSummary": "Request → Controller → Service → Repository → DB",
         "infrastructureDeps": ["database", "redis-cache"]
+      },
+      "unitInventory": {
+        "routes": ["POST /login -> AuthController.handleLogin (src/auth/controller.ts:24)"],
+        "testFiles": ["src/auth/service.test.ts"],
+        "publicExports": ["AuthService (src/auth/service.ts:10)"]
       }
     }
   ],
@@ -196,6 +215,21 @@ Each discovered unit MUST represent a Vertical Slice — a coherent functional u
       "suggestedAction": "What to do"
     }
   ],
+  "prdUnits": [
+    {
+      "id": "PRD-001",
+      "name": "PRD unit name (user-value level)",
+      "description": "What this capability delivers to the user",
+      "groupingKey": {
+        "targetPersona": "end-user",
+        "userGoal": "sign-in",
+        "valueCategory": "authentication"
+      },
+      "sourceUnits": ["UNIT-001", "UNIT-003"],
+      "combinedRelatedFiles": ["src/feature-a/*", "src/feature-b/*"],
+      "combinedEntryPoints": ["/path1", "/path2", "/path3"]
+    }
+  ],
   "limitations": ["What could not be discovered and why"]
 }
 ```
@@ -207,13 +241,17 @@ Each discovered unit MUST represent a Vertical Slice — a coherent functional u
 - [ ] Reviewed test structure for feature organization
 - [ ] Detected module/service boundaries
 - [ ] Mapped public interfaces
+- [ ] Enumerated per-unit inventory for routes, test files, and public exports
 - [ ] Analyzed dependency graph
 - [ ] Applied granularity criteria (split/merge as needed)
+- [ ] Identified value profile (persona, goal, category) for each unit
 - [ ] Mapped discovered units to evidence sources
 - [ ] Assessed triangulation strength for each unit
 - [ ] Documented relationships between units
 - [ ] Reached saturation or documented why not
 - [ ] Listed uncertain areas and limitations
+- [ ] Grouped discovered units into PRD units (step 7, after all discovery steps complete)
+- [ ] Final response is the JSON output
 ## Output Self-Check
 - [ ] Output is limited to scope discovery (no PRD or Design Doc content generated)
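
The PRD Unit Grouping rule added in this file (units belong to one PRD unit only when all three `groupingKey` values match) amounts to grouping by a three-part key. The sketch below illustrates that rule using the field names from the Output Format shown in the diff. The function name `group_into_prd_units` and the sequential ID scheme are assumptions, and the prose fields `name` and `description` are left out because they cannot be derived mechanically.

```python
from collections import defaultdict


def group_into_prd_units(discovered_units):
    """Group discovered units by their normalized groupingKey triple."""
    groups = defaultdict(list)
    for unit in discovered_units:
        key = unit["valueProfile"]["groupingKey"]
        # Only the normalized slugs are used; the free-text valueProfile
        # fields are explanatory and never take part in grouping.
        triple = (key["targetPersona"], key["userGoal"], key["valueCategory"])
        groups[triple].append(unit)

    prd_units = []
    # Dicts preserve insertion order, so IDs follow first appearance.
    for index, (triple, units) in enumerate(groups.items(), start=1):
        persona, goal, category = triple
        prd_units.append({
            "id": f"PRD-{index:03d}",
            "groupingKey": {
                "targetPersona": persona,
                "userGoal": goal,
                "valueCategory": category,
            },
            "sourceUnits": [u["id"] for u in units],
            "combinedRelatedFiles": sorted({f for u in units for f in u["relatedFiles"]}),
            "combinedEntryPoints": sorted({e for u in units for e in u["entryPoints"]}),
        })
    return prd_units
```

Because each unit is appended to exactly one group, every discovered unit appears in exactly one `sourceUnits` list, which matches the constraint stated in the grouping step.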

package/.codex/agents/security-reviewer.toml CHANGED

@@ -101,6 +101,9 @@ Each finding must include a `rationale` field whose content depends on the categ
 | **hardening** | Why the current state is acceptable, and what improvement would add |
 | **policy** | Why this is not a technical vulnerability (what mitigates the technical risk) |
+### 6. Return JSON Result
+Return the JSON result as the final response. See Output Format for the schema.
 ## Output Format
 ```json
@@ -155,6 +158,7 @@ Each finding must include a `rationale` field whose content depends on the categ
 - [ ] Each finding classified into confirmed_risk / defense_gap / hardening / policy
 - [ ] False positives excluded considering runtime environment and existing mitigations
 - [ ] Committed secrets checked (blocked status if found)
+- [ ] Final response is the JSON output
 ## Completion Gate [BLOCKING]

package/.codex/agents/solver.toml CHANGED

@@ -111,12 +111,15 @@ Recommendation strategy based on confidence:
 - medium: Staged approach, verify with low-impact fixes before full implementation
 - low: Start with conservative mitigation, prioritize solutions that address multiple possible causes
-### Step 5: Implementation Steps Creation and Output
+### Step 5: Implementation Steps Creation
 - Each step independently verifiable
 - Explicitly state dependencies between steps
 - Define completion conditions for each step
 - Include rollback procedures
-- Output structured report in JSON format
+### Step 6: Return JSON Result
+Return the JSON result as the final response. See Output Format for the schema.
 ## Output Format
@@ -184,6 +187,7 @@ Recommendation strategy based on confidence:
 - [ ] Documented residual risks
 - [ ] Verified solutions align with project rules or best practices
 - [ ] Verified input consistency with user report
+- [ ] Final response is the JSON output
 ## Output Self-Check
 - [ ] Solution addresses the user's reported symptoms (not just the technical conclusion)

package/.codex/agents/task-executor-frontend.toml CHANGED

@@ -184,6 +184,11 @@ Select and execute files with pattern `docs/plans/tasks/*-task-*.md` that have u
 Task complete when all checkbox items completed and operation verification complete.
 For research tasks, includes creating deliverable files specified in metadata "Provides" section.
+### 5. Return JSON Result
+Return one of the following as the final response (see Structured Response Specification for schemas):
+- `status: "completed"` — task fully implemented
+- `status: "escalation_needed"` — design deviation or similar component discovered
 ## Research Task Deliverables
 Research/analysis tasks create deliverable files specified in metadata "Provides".
@@ -291,6 +296,10 @@ When discovering similar components/hooks during existing code investigation, es
 - Design Doc deviation → escalate to orchestrator immediately
 - Component patterns → use functional components exclusively (React standard)
+## Completion Criteria
+- [ ] Final response is a single JSON with status `completed` or `escalation_needed`
 ## Completion Gate [BLOCKING]
 ☐ All completion criteria met with evidence

package/.codex/agents/task-executor.toml CHANGED

@@ -185,6 +185,11 @@ Select and execute files with pattern `docs/plans/tasks/*-task-*.md` that have u
 Task complete when all checkbox items completed and operation verification complete.
 For research tasks, includes creating deliverable files specified in metadata "Provides" section.
+### 5. Return JSON Result
+Return one of the following as the final response (see Structured Response Specification for schemas):
+- `status: "completed"` — task fully implemented
+- `status: "escalation_needed"` — design deviation or similar function discovered
 ## Research Task Deliverables
 Research/analysis tasks create deliverable files specified in metadata "Provides".
@@ -293,6 +298,10 @@ When discovering similar functions during existing code investigation, escalate
 - Escalate when: design deviation, similar functions found, test environment missing
 - Stop after implementation and test creation — quality checks and commits are handled separately
+## Completion Criteria
+- [ ] Final response is a single JSON with status `completed` or `escalation_needed`
 ## Completion Gate [BLOCKING]
 ☐ All completion criteria met with evidence