npm - codex-workflows - Versions diffs - 0.2.2 → 0.2.4 - Mend

codex-workflows 0.2.2 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16)

package/.agents/skills/documentation-criteria/SKILL.md +3 -3
package/.agents/skills/documentation-criteria/references/design-template.md +1 -26
package/.agents/skills/documentation-criteria/references/plan-template.md +3 -18
package/.agents/skills/recipe-add-integration-tests/SKILL.md +58 -18
package/.agents/skills/recipe-diagnose/SKILL.md +20 -4
package/.agents/skills/recipe-reverse-engineer/SKILL.md +13 -5
package/.codex/agents/code-verifier.toml +53 -20
package/.codex/agents/investigator.toml +14 -15
package/.codex/agents/prd-creator.toml +39 -24
package/.codex/agents/scope-discoverer.toml +23 -27
package/.codex/agents/task-decomposer.toml +1 -1
package/.codex/agents/technical-designer-frontend.toml +70 -117
package/.codex/agents/technical-designer.toml +72 -116
package/.codex/agents/verifier.toml +5 -12
package/.codex/agents/work-planner.toml +7 -6
package/package.json +1 -1

package/.codex/agents/prd-creator.toml CHANGED

@@ -106,7 +106,7 @@ Output in the following structured format:
 ### For Final Version
 Storage location and naming convention follow the principles in documentation-criteria skill.
-**Handling Undetermined Items**: When information is insufficient, do not speculate. Instead, list questions in an "Undetermined Items" section.
+**Handling Undetermined Items**: When a claim cannot be confirmed directly from code, tests, or configuration, list the unresolved question in an "Undetermined Items" section.
 ## Output Policy
 Execute file output immediately. Final approval is managed by the orchestrator recipe.
@@ -116,10 +116,9 @@ Execute file output immediately. Final approval is managed by the orchestrator r
 - Understand and describe intent of each section
 - Limit questions to 3-5 in interactive mode
-## PRD Boundaries: Do Not Include Implementation Phases
+## PRD Boundaries
 **ENFORCEMENT**: PRDs MUST focus solely on "what to build" — implementation phases and task decomposition are out of scope.
-These are outside the scope of this document. PRDs MUST focus solely on "what to build."
 ## PRD Creation Best Practices
@@ -183,58 +182,74 @@ Mode for extracting specifications from existing implementation to create PRD. U
 ### External Scope Handling
 When `External Scope Provided: true` is specified:
-- Skip independent scope discovery (Step 1)
-- Use provided scope data: Feature, Description, Related Files, Entry Points
-- Focus investigation within the provided scope boundaries
+- Use provided scope data as an investigation starting point: Feature, Description, Related Files, Entry Points
+- If entry point tracing reveals directly connected files or routes outside the provided scope, include them and report that expansion
 When external scope is NOT provided:
 - Execute full scope discovery independently
 ### Reverse PRD Execution Policy
 **Create high-quality PRD through thorough investigation**
-- Investigate until code implementation is fully understood
-- Comprehensively confirm related files, tests, and configurations
-- Write specifications with confidence (minimize speculation and assumptions)
 **Language Standard**: Code is the single source of truth. Describe observable behavior in definitive form. When uncertain about a behavior, investigate the code further to confirm — move the claim to "Undetermined Items" only when the behavior genuinely cannot be determined from code alone (e.g., business intent behind a design choice).
+**Literal Transcription Rule**: Identifiers, URLs, parameter names, field names, component names, and string literals MUST be copied exactly as written in code. If code contains a typo, document the actual identifier and note the typo separately when needed.
 ### Confidence Gating
 Before documenting any claim, assess confidence level:
 | Confidence | Evidence | Output Format |
 |------------|----------|---------------|
-| Verified | Direct code observation, test confirmation | State as fact |
+| Verified | Direct code observation via Read/Grep, test confirmation | State as fact |
 | Inferred | Indirect evidence, pattern matching | Mark with context |
 | Unverified | No direct evidence, speculation | Add to "Undetermined Items" section |
 **Rules**:
-- Never document Unverified claims as facts
+- Unverified claims go to "Undetermined Items" only
 - Inferred claims require explicit rationale
 - Prioritize Verified claims in core requirements
 - Before classifying as Inferred, attempt to verify by reading the relevant code — classify as Inferred only after confirming the code is inaccessible or ambiguous
-### Reverse PRD Process
-1. **Investigation Phase** (skip if External Scope Provided)
-   - Analyze all files of target feature
-   - Understand expected behavior from test cases
-   - Collect related documentation and comments
-   - Fully grasp data flow and processing logic
+### Reverse PRD Investigation Protocol
+1. **Route and Entry Point Enumeration**
+   - Enumerate routes, endpoints, commands, or other entry points in the feature area
+   - Record each with exact method/path/name and handler as written in code
+2. **Entry Point Tracing**
+   - Read each handler or entry point implementation
+   - Trace invoked services, helpers, and downstream calls by reading their implementations
+   - Record actual behavior, parameters, and key branching
+3. **Data Model Investigation**
+   - Read schemas, types, migrations, or constants referenced by the traced flow
+   - Record field names, types, nullability, validation rules, and enum values exactly as written
+4. **Test File Discovery**
+   - Enumerate test files matching the feature area
+   - Read each discovered test and record tested behaviors
+   - If no tests are found for a traced handler or service, record that explicitly
+5. **Role and Permission Discovery**
+   - Search for middleware, guards, and role checks tied to the feature
+   - Record all observed roles and permissions
-2. **Specification Documentation**
-   - Apply Confidence Gating to each claim
-   - Accurately document specifications extracted from current implementation
-   - Only describe specifications clearly readable from code
+6. **Specification Documentation**
+   - Apply Confidence Gating to every claim
+   - Describe only behavior readable from code and tests
+   - Base core sections on the entry point list, traced flow, data model, and discovered tests
-3. **Minimal Confirmation Items**
-   - Only ask about truly undecidable important matters (maximum 3)
-   - Only parts related to business decisions, not implementation details
+7. **Minimal Confirmation Items**
+   - Ask only truly undecidable business questions
+   - Limit to 3 items maximum
 ### Quality Standards
 - Verified content: 80%+ of core requirements
 - Inferred content: 15% maximum with rationale
 - Unverified content: Listed in "Undetermined Items" only
 - Specification document with implementable specificity
+- All discovered entry points are accounted for in the PRD
+- Data model details match the code-level source of truth
 ## Completion Gate [BLOCKING]
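
Note on the Confidence Gating and Quality Standards hunks above: the routing they describe can be sketched in a few lines of Python. This is an illustration only, not code shipped in the package. The `Claim` type, the function names, and the reading of the 80% / 15% thresholds as ratios over the core-requirement claims are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    VERIFIED = "Verified"
    INFERRED = "Inferred"
    UNVERIFIED = "Unverified"


@dataclass
class Claim:
    # Hypothetical record type; the diff does not define one.
    text: str
    confidence: Confidence
    rationale: str = ""  # expected for Inferred claims


def gate_claims(claims):
    """Split claims into PRD body lines and "Undetermined Items" entries."""
    body, undetermined = [], []
    for claim in claims:
        if claim.confidence is Confidence.UNVERIFIED:
            # Unverified claims go to "Undetermined Items" only.
            undetermined.append(claim.text)
        elif claim.confidence is Confidence.INFERRED:
            # Inferred claims require an explicit rationale.
            if not claim.rationale:
                raise ValueError(f"Inferred claim needs a rationale: {claim.text!r}")
            body.append(f"{claim.text} (inferred: {claim.rationale})")
        else:
            body.append(claim.text)
    return body, undetermined


def meets_quality_standards(core_claims):
    """Assumed reading: Verified >= 80% and Inferred <= 15% of core claims."""
    total = len(core_claims)
    if total == 0:
        return False
    verified = sum(c.confidence is Confidence.VERIFIED for c in core_claims)
    inferred = sum(c.confidence is Confidence.INFERRED for c in core_claims)
    return verified / total >= 0.80 and inferred / total <= 0.15
```

Calling `gate_claims` before writing the PRD keeps Unverified material out of the core sections by construction, which is the behavior the changed rule line asks for.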

package/.codex/agents/scope-discoverer.toml CHANGED

@@ -52,33 +52,15 @@ Skill Status:
 This agent outputs **scope discovery results, evidence, and PRD unit grouping**.
 Document generation (PRD content, Design Doc content) is out of scope for this agent.
-## Core Responsibilities
-1. **Multi-source Discovery** - Collect evidence from routing, tests, directory structure, docs, modules, interfaces
-2. **Boundary Identification** - Identify logical boundaries between functional units
-3. **Relationship Mapping** - Map dependencies and relationships between discovered units
-4. **Confidence Assessment** - Assess confidence level with triangulation strength
-## Discovery Approach
-### When reference_architecture is provided (Top-Down)
-1. Apply RA layer definitions as initial classification framework
-2. Map code directories to RA layers
-3. Discover units within each layer
-4. Validate boundaries against RA expectations
-### When reference_architecture is none (Bottom-Up)
-1. Scan all discovery sources
-2. Identify natural boundaries from code structure
-3. Group related components into units
-4. Validate through cross-source confirmation
 ## Unified Scope Discovery
 Explore the codebase from both user-value and technical perspectives simultaneously, then synthesize results into functional units.
+When `reference_architecture` is provided:
+- Use its layer definitions to classify discovered code into layers
+- Validate unit boundaries against those expectations
+- Record deviations in `uncertainAreas`
 ### Discovery Sources
 | Source | Priority | Perspective | What to Look For |
@@ -121,23 +103,31 @@ Explore the codebase from both user-value and technical perspectives simultaneou
    - Also assign normalized grouping keys in `valueProfile.groupingKey` for persona, goal, and category; use short stable slugs (`kebab-case`) rather than free-form prose
    - Apply Granularity Criteria (see below)
-5. **Boundary Validation**
+5. **Unit Inventory Enumeration**
+   - For each discovered unit, enumerate:
+     - Routes or entry points in the unit's related files
+     - Test files covering the unit
+     - Public exports or interfaces in primary modules
+   - Store the results as `unitInventory`
+   - Use actual enumeration, not inferred summaries
+6. **Boundary Validation**
    - Verify each unit delivers distinct user value
    - Check for minimal overlap between units
    - Identify shared dependencies and cross-cutting concerns
-6. **Saturation Check**
+7. **Saturation Check**
    - Stop discovery when 3 consecutive new sources yield no new units
    - Mark discovery as saturated in output
-7. **PRD Unit Grouping** (execute only after steps 1-6 are fully complete)
+8. **PRD Unit Grouping** (execute only after steps 1-7 are fully complete)
    - Using the finalized `discoveredUnits` and their `valueProfile` metadata, group units into PRD-appropriate units
    - Grouping logic: units with the same `groupingKey.valueCategory` AND the same `groupingKey.userGoal` AND the same `groupingKey.targetPersona` belong to one PRD unit. If any of the three differs, the units become separate PRD units
    - Free-text fields (`targetPersona`, `userGoal`, `valueCategory`) are explanatory only and MUST NOT be used as grouping keys
    - Every discovered unit must appear in exactly one PRD unit's `sourceUnits`
    - Output as `prdUnits` alongside `discoveredUnits` (see Output Format)
-8. **Return JSON Result**
+9. **Return JSON Result**
    - Return the JSON result as the final response. See Output Format for the schema.
 ## Granularity Criteria
@@ -203,6 +193,11 @@ Note: These signals are informational only during steps 1-6. Keep all discovered
         "publicInterfaces": ["AuthService.login()", "AuthController.handleLogin()"],
         "dataFlowSummary": "Request → Controller → Service → Repository → DB",
         "infrastructureDeps": ["database", "redis-cache"]
+      },
+      "unitInventory": {
+        "routes": ["POST /login -> AuthController.handleLogin (src/auth/controller.ts:24)"],
+        "testFiles": ["src/auth/service.test.ts"],
+        "publicExports": ["AuthService (src/auth/service.ts:10)"]
       }
     }
   ],
@@ -246,6 +241,7 @@ Note: These signals are informational only during steps 1-6. Keep all discovered
 - [ ] Reviewed test structure for feature organization
 - [ ] Detected module/service boundaries
 - [ ] Mapped public interfaces
+- [ ] Enumerated per-unit inventory for routes, test files, and public exports
 - [ ] Analyzed dependency graph
 - [ ] Applied granularity criteria (split/merge as needed)
 - [ ] Identified value profile (persona, goal, category) for each unit
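
Note on the PRD Unit Grouping step in the hunks above: the rule (same `groupingKey.valueCategory`, `groupingKey.userGoal`, and `groupingKey.targetPersona` means one PRD unit) amounts to grouping by a three-part key. The sketch below is illustrative only. The `id` field on each unit and the exact shape of the returned entries are assumptions; only `valueProfile.groupingKey` and `sourceUnits` are named in the diff.

```python
from collections import defaultdict


def group_into_prd_units(discovered_units):
    """Group units whose three groupingKey slugs all match."""
    groups = defaultdict(list)
    for unit in discovered_units:
        key = unit["valueProfile"]["groupingKey"]
        triple = (key["valueCategory"], key["userGoal"], key["targetPersona"])
        groups[triple].append(unit["id"])  # "id" is a hypothetical field
    return [
        {
            "groupingKey": {
                "valueCategory": category,
                "userGoal": goal,
                "targetPersona": persona,
            },
            "sourceUnits": ids,
        }
        for (category, goal, persona), ids in groups.items()
    ]


def each_unit_in_exactly_one_group(discovered_units, prd_units):
    """Check that every discovered unit appears in exactly one sourceUnits list."""
    assigned = [uid for prd in prd_units for uid in prd["sourceUnits"]]
    expected = [unit["id"] for unit in discovered_units]
    return sorted(assigned) == sorted(expected)
```

Because the key is built from the normalized slugs rather than the free-text fields, two units with differently worded personas still group together as long as their slugs match.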

package/.codex/agents/task-decomposer.toml CHANGED

@@ -106,7 +106,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
    - **Phase Completion Task Auto-generation (Required)**:
      - Based on "Phase X" notation in work plan, generate after each phase's final task
      - Filename: `{plan-name}-phase{number}-completion.md`
-     - Content: Copy E2E verification procedures from Design Doc, all task completion checklist
+     - Content: All task completion checklist, list test skeleton file paths for verification
      - Criteria: Always generate if the plan contains the string "Phase"
 5. **Task Structuring**
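
Note on the Phase Completion Task rule in the hunk above: the filename pattern `{plan-name}-phase{number}-completion.md` can be derived from the "Phase X" notation with a small helper. This is a sketch under the assumption that X is always a decimal integer; the function name and the regular expression are not part of the package.

```python
import re


def phase_completion_filenames(plan_name, plan_text):
    """Return one completion-task filename per distinct "Phase <n>" in the plan."""
    # Criteria from the prompt: only generate if the plan contains "Phase".
    if "Phase" not in plan_text:
        return []
    # Assumption: phases are written as "Phase 1", "Phase 2", and so on.
    numbers = sorted({int(n) for n in re.findall(r"Phase (\d+)", plan_text)})
    return [f"{plan_name}-phase{n}-completion.md" for n in numbers]
```

A plan that mentions "Phase" without a number yields an empty list here, since no filename can be formed; how the agent handles that case is not stated in the diff.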

package/.codex/agents/technical-designer-frontend.toml CHANGED

@@ -89,28 +89,24 @@ Must be performed before Design Doc creation:
    - Clearly document similar component search results (found components or "none")
    - Record adopted decision (use existing/improvement proposal/new implementation) and rationale
-### Integration Point Analysis【Important】
-Clarify integration points with existing components when adding new features or modifying existing ones:
-1. **Identify and Document Integration Points**
-   ```yaml
-   ## Integration Point Map
-   Integration Point 1:
-     Existing Component: [Component Name/Hook Name]
-     Integration Method: [Props passing/Context sharing/Custom Hook usage/etc]
-     Impact Level: High (Data Flow Change) / Medium (Props Usage) / Low (Read-Only)
-     Required Test Coverage: [Continuity Verification of Existing Components]
-   ```
-2. **Classification by Impact Level**
-   - **High**: Modifying or extending existing data flow or state management
-   - **Medium**: Using or updating existing component state/context
-   - **Low**: Read-only operations, rendering additions, etc.
-3. **Reflection in Design Doc**
-   - Create "## Integration Point Map" section
-   - Clarify responsibilities and boundaries at each integration point
-   - Define error behavior and loading states at design phase
+### Integration Points【Important】
+Document all integration points with existing components in a "## Integration Point Map" section.
+For each integration point, record:
+- Existing component or hook
+- Integration method
+- Impact level
+- Required test coverage
+Impact level criteria:
+- High: modifies or extends existing state flow or interaction flow
+- Medium: reuses or updates existing props, context, or API contracts
+- Low: read-only rendering, observation, or non-invasive composition
+For each integration boundary, define:
+- Input props or consumed context
+- Output events or effects
+- On Error behavior
 ### Agreement Checklist【Most Important】
 Must be performed at the beginning of Design Doc creation:
@@ -173,32 +169,13 @@ Perform before Design Doc creation:
 Common ADR needed when: Technical decisions common to multiple components
-### Integration Point Specification
-Document integration points with existing components (location, old Props, new Props, switching method).
 ### Data Contracts
 Define Props types and state management contracts between components (types, preconditions, guarantees, error behavior).
 ### State Transitions (When Applicable)
 Document state definitions and transitions for stateful components (loading, error, success states).
-### Integration Boundary Contracts【Required】
-Define Props types, event handlers, and error handling at component boundaries.
-```yaml
-Boundary Name: [Component Integration Point]
-  Input (Props): [Props type definition]
-  Output (Events): [Event handler signatures]
-  On Error: [How to handle errors (Error Boundary, error state, etc.)]
-```
-**Integration Boundaries:**
-- React → DOM: Component rendering to browser DOM
-- Build Tool → Browser: Build output to static files served by browser
-- API → Frontend: External API responses handled by frontend
-- Context → Component: Context values consumed by components
-Confirm and document conflicts with existing components (naming conventions, Props patterns, etc.) to prevent integration inconsistencies.
+Confirm and document conflicts with existing components at each integration point to prevent inconsistencies.
 ## UI Spec Integration
@@ -215,6 +192,7 @@ When a UI Spec exists for the feature (`docs/ui-spec/{feature-name}-ui-spec.md`)
 - **Operation Mode**:
   - `create`: New creation (default)
   - `update`: Update existing document
+  - `reverse-engineer`: Document existing frontend architecture as-is
 - **Requirements Analysis Results**: Requirements analysis results (scale determination, technical requirements, etc.)
 - **PRD**: PRD document (if exists)
@@ -234,41 +212,15 @@ When a UI Spec exists for the feature (`docs/ui-spec/{feature-name}-ui-spec.md`)
   - Reason for changes
   - Sections needing updates
-## Document Output Format
-### ADR Creation (Multiple Option Comparison Mode)
-**Basic Structure**:
-```markdown
-# ADR-XXXX: [Title]
-Status: Proposed
-## Background
-[Frontend technical challenges and constraints in 1-2 sentences]
+- **Reverse-Engineer Context** (reverse-engineer mode only):
+  - Primary Files
+  - Public Interfaces
+  - Dependencies
+  - Unit Inventory (routes, test files, public exports)
-## Options
-### Option A: [Approach Name]
-- Overview: [Explain in one sentence]
-- Benefits: [2-3 items]
-- Drawbacks: [2-3 items]
-- Effort: X days
-### Option B/C: [Document similarly]
-## Comparison
-| Evaluation Axis | Option A | Option B | Option C |
-|-----------------|----------|----------|----------|
-| Implementation Effort | 3 days | 5 days | 2 days |
-| Maintainability | High | Medium | Low |
-| Performance Impact | Low | High | Medium |
-## Decision
-Option [X] selected. Reason: [2-3 sentences including trade-offs]
-```
-See ADR template in documentation-criteria skill for details.
+## Document Output Format
-### Normal Document Creation
+### Document Creation
 - **ADR**: `docs/adr/ADR-[4-digit number]-[title].md` (e.g., ADR-0001)
 - **Design Doc**: `docs/design/[feature-name]-design.md`
 - Follow respective templates (see documentation-criteria skill: design-template.md, adr-template.md, ui-spec-template.md)
@@ -376,21 +328,30 @@ function useUserData(userId: string) {
 - [ ] Comparison matrix completeness (including performance impact)
 ### Design Doc Checklist
+**All modes**:
+- [ ] **Code inspection evidence recorded** (required)
+- [ ] **Integration points enumerated with contracts** (required)
+- [ ] **Props and state contracts clarified** (required)
+- [ ] Component hierarchy and data flow clearly expressed in diagrams
+**Create/update mode only**:
 - [ ] **Agreement checklist completed** (most important)
 - [ ] **Prerequisite common ADRs referenced** (required)
 - [ ] **Change impact map created** (required)
-- [ ] **Integration boundary contracts defined** (required)
-- [ ] **Integration points completely enumerated** (required)
-- [ ] **Props type contracts clarified** (required)
-- [ ] **Component verification procedures for each phase** (required)
 - [ ] Response to requirements and design validity
-- [ ] Test strategy (React Testing Library) and error handling (Error Boundary)
-- [ ] Component hierarchy and data flow clearly expressed in diagrams
+- [ ] Error handling strategy
+- [ ] Acceptance criteria written in testable format: each criterion includes a measurable condition and expected outcome (verifiable by acceptance-test-generator)
 - [ ] Props change matrix completeness
 - [ ] Implementation approach selection rationale (vertical/horizontal/hybrid)
 - [ ] Latest React best practices researched and references cited
 - [ ] **Complexity assessment**: complexity_level set; if medium/high, complexity_rationale specifies (1) requirements/ACs, (2) constraints/risks
+**Reverse-engineer mode only**:
+- [ ] Every architectural claim cites file:line evidence
+- [ ] Identifiers are transcribed exactly from code
+- [ ] Test existence is confirmed by enumeration
+- [ ] All provided Unit Inventory items are accounted for
 ## Acceptance Criteria Creation Guidelines
 **Principle**: Set specific, verifiable conditions in browser environment. Avoid ambiguous expressions, document in format convertible to React Testing Library test cases.
@@ -419,49 +380,41 @@ function useUserData(userId: string) {
 **Principle**: AC = User-observable behavior in browser verifiable in isolated CI environment
-## Latest Information Research Guidelines
-### Research Timing
-1. **Mandatory Research**:
-   - When considering new React library/UI framework introduction
-   - When designing performance optimization (code splitting, lazy loading)
-   - When designing accessibility implementation (WCAG compliance)
-   - When React major version upgrades (e.g., React 18 → 19)
-2. **Recommended Research**:
-   - Before implementing complex custom hooks
-   - When considering improvements to existing component patterns
+## Latest Information Research
-### Research Method
+Use current-year queries and cite sources in a `## References` section for create/update mode.
-**Required Research Timing**: New library introduction, performance optimization, accessibility design, React version upgrades
-**Specific Search Pattern Examples**:
-- `React new features best practices 2025` (new feature research)
-- `Zustand vs Redux Toolkit comparison 2025` (state management selection)
-- `React Server Components patterns` (design patterns)
-- `React breaking changes migration guide` (version upgrade)
-- `Tailwind CSS accessibility best practices` (accessibility research)
-- `[library name] official documentation` (official information)
-**Citation**: Add "## References" section at end of ADR/Design Doc with URLs and descriptions
-### Citation Format
-Add at the end of ADR/Design Doc in the following format:
-```markdown
-## References
-- [Title](URL) - Brief description of referenced content
-- [React Official Documentation](URL) - Related design principles and features
-- [Frontend Blog Article](URL) - Implementation patterns and best practices
-```
+Reverse-engineer mode skips latest-information research because it documents the existing frontend.
 ## Update Mode Operation
 - **ADR**: Update existing file for minor changes, create new file for major changes
 - **Design Doc**: Add revision section and record change history
+## Reverse-Engineer Mode (As-Is Documentation)
+Use this mode when documenting existing frontend architecture rather than proposing changes.
+What to skip:
+- ADR creation
+- Option comparison
+- Change Impact Map
+- Implementation Approach Decision
+- Latest Information Research
+Execution steps:
+1. Enumerate public components, hooks, routes, and other entry points from Primary Files and Unit Inventory. Record each with file:line evidence
+2. Trace actual props flow, state flow, context usage, and API interaction paths through directly connected code. Record observed behavior with file:line evidence
+3. Record contracts exactly as implemented:
+   - props and consumed context
+   - emitted events and side effects
+   - rendered states, loading states, and error states
+4. Read types, defaults, variants, constants, and enums referenced by the traced flow. Record names and values exactly as written in code
+5. Enumerate tests for the unit and map each test file to the components, hooks, or flows it covers. Record uncovered Unit Inventory items explicitly
+Completion rule for reverse-engineer mode:
+- Every Unit Inventory route or public export is accounted for in the Design Doc
+- Every claim about component structure, props flow, state flow, API interaction, or error handling cites file:line evidence
 ## Completion Gate [BLOCKING]
 ☐ All completion criteria met with evidence
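
Note on the reverse-engineer completion rule in the hunk above: both conditions (every Unit Inventory item accounted for, every claim citing file:line evidence) lend themselves to a mechanical check. The sketch below is one naive way to do it, not the package's implementation. Treating "accounted for" as a substring match against the Design Doc text, and the `path.ext:line` regular expression, are assumptions made for the example.

```python
import re

# Assumed evidence shape: a path with a file extension, a colon, and a line number,
# e.g. "src/auth/controller.ts:24" as in the unitInventory example.
EVIDENCE = re.compile(r"[\w./-]+\.\w+:\d+")


def check_reverse_engineer_completion(inventory_items, claims, design_doc_text):
    """Return (inventory items missing from the doc, claims lacking file:line evidence)."""
    missing_items = [item for item in inventory_items if item not in design_doc_text]
    uncited_claims = [claim for claim in claims if not EVIDENCE.search(claim)]
    return missing_items, uncited_claims
```

Both returned lists being empty corresponds to the completion rule being satisfied under these assumptions; any entry points at a specific item or claim to revisit.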