create-ai-project 1.17.0 → 1.18.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (74)

package/.claude/agents-en/code-reviewer.md +54 -91
package/.claude/agents-en/code-verifier.md +5 -5
package/.claude/agents-en/investigator.md +4 -4
package/.claude/agents-en/requirement-analyzer.md +27 -25
package/.claude/agents-en/security-reviewer.md +139 -0
package/.claude/agents-en/skill-creator.md +5 -5
package/.claude/agents-en/skill-reviewer.md +5 -5
package/.claude/agents-en/solver.md +3 -2
package/.claude/agents-en/task-executor-frontend.md +5 -0
package/.claude/agents-en/task-executor.md +5 -0
package/.claude/agents-en/verifier.md +3 -3
package/.claude/agents-en/work-planner.md +35 -35
package/.claude/agents-ja/code-reviewer.md +58 -95
package/.claude/agents-ja/code-verifier.md +5 -5
package/.claude/agents-ja/investigator.md +4 -4
package/.claude/agents-ja/requirement-analyzer.md +27 -23
package/.claude/agents-ja/security-reviewer.md +139 -0
package/.claude/agents-ja/skill-creator.md +5 -5
package/.claude/agents-ja/skill-reviewer.md +5 -5
package/.claude/agents-ja/solver.md +3 -2
package/.claude/agents-ja/task-executor-frontend.md +5 -0
package/.claude/agents-ja/task-executor.md +5 -0
package/.claude/agents-ja/verifier.md +3 -3
package/.claude/agents-ja/work-planner.md +35 -33
package/.claude/commands-en/add-integration-tests.md +5 -5
package/.claude/commands-en/build.md +12 -3
package/.claude/commands-en/create-skill.md +2 -2
package/.claude/commands-en/design.md +1 -1
package/.claude/commands-en/diagnose.md +4 -4
package/.claude/commands-en/front-build.md +14 -5
package/.claude/commands-en/front-design.md +3 -3
package/.claude/commands-en/front-plan.md +2 -2
package/.claude/commands-en/front-review.md +87 -26
package/.claude/commands-en/implement.md +11 -2
package/.claude/commands-en/plan.md +1 -1
package/.claude/commands-en/refine-skill.md +1 -1
package/.claude/commands-en/reverse-engineer.md +3 -3
package/.claude/commands-en/review.md +89 -28
package/.claude/commands-en/update-doc.md +1 -1
package/.claude/commands-ja/add-integration-tests.md +5 -5
package/.claude/commands-ja/build.md +12 -3
package/.claude/commands-ja/create-skill.md +2 -2
package/.claude/commands-ja/design.md +1 -1
package/.claude/commands-ja/diagnose.md +4 -4
package/.claude/commands-ja/front-build.md +14 -5
package/.claude/commands-ja/front-design.md +3 -3
package/.claude/commands-ja/front-plan.md +2 -2
package/.claude/commands-ja/front-review.md +92 -31
package/.claude/commands-ja/implement.md +11 -2
package/.claude/commands-ja/plan.md +1 -1
package/.claude/commands-ja/refine-skill.md +1 -1
package/.claude/commands-ja/reverse-engineer.md +3 -3
package/.claude/commands-ja/review.md +87 -26
package/.claude/commands-ja/update-doc.md +1 -1
package/.claude/skills-en/coding-standards/SKILL.md +25 -0
package/.claude/skills-en/coding-standards/references/security-checks.md +62 -0
package/.claude/skills-en/documentation-criteria/SKILL.md +1 -1
package/.claude/skills-en/documentation-criteria/references/design-template.md +7 -1
package/.claude/skills-en/documentation-criteria/references/plan-template.md +1 -0
package/.claude/skills-en/frontend-typescript-testing/SKILL.md +30 -8
package/.claude/skills-en/skill-optimization/SKILL.md +1 -1
package/.claude/skills-en/subagents-orchestration-guide/SKILL.md +19 -10
package/.claude/skills-en/task-analyzer/references/skills-index.yaml +1 -0
package/.claude/skills-ja/coding-standards/SKILL.md +25 -0
package/.claude/skills-ja/coding-standards/references/security-checks.md +62 -0
package/.claude/skills-ja/documentation-criteria/SKILL.md +1 -1
package/.claude/skills-ja/documentation-criteria/references/design-template.md +7 -1
package/.claude/skills-ja/documentation-criteria/references/plan-template.md +1 -0
package/.claude/skills-ja/frontend-typescript-testing/SKILL.md +21 -17
package/.claude/skills-ja/skill-optimization/SKILL.md +1 -1
package/.claude/skills-ja/subagents-orchestration-guide/SKILL.md +19 -10
package/.claude/skills-ja/task-analyzer/references/skills-index.yaml +1 -0
package/CHANGELOG.md +42 -0
package/package.json +1 -1

package/.claude/agents-en/code-reviewer.md CHANGED

@@ -36,98 +36,67 @@ Operates in an independent context without CLAUDE.md principles, executing auton
    - Clear identification of gaps
    - Concrete improvement suggestions
-## Required Information
-- **Design Doc Path**: Design Document path for validation baseline
-- **Implementation Files**: List of files to review
-- **Work Plan Path** (optional): For completed task verification
-- **Review Mode**:
-  - `full`: Complete validation (default)
-  - `acceptance`: Acceptance criteria only
-  - `architecture`: Architecture compliance only
-## Validation Process
-### 1. Load Baseline Documents
-```
-1. Load Design Doc and extract:
-   - Functional requirements and acceptance criteria
-   - Architecture design
-   - Data flow
-   - Error handling policy
-```
-### 2. Implementation Validation
-```
-2. Validate each implementation file:
-   - Acceptance criteria implementation
-   - Interface compliance
-   - Error handling implementation
-   - Test case existence
-```
-### 3. Code Quality Check
-```
-3. Check key quality metrics:
-   - Function length (ideal: <50 lines, max: 200 lines)
-   - Nesting depth (ideal: ≤3 levels, max: 4 levels)
-   - Single responsibility principle
-   - Appropriate error handling
-```
-### 4. Compliance Calculation
-```
-4. Overall evaluation:
-   Compliance rate = (fulfilled items / total acceptance criteria) × 100
-   *Critical items flagged separately
-```
-## Validation Checklist
-### Functional Requirements
-- [ ] All acceptance criteria have corresponding implementations
-- [ ] Happy path scenarios implemented
-- [ ] Error scenarios handled
-- [ ] Edge cases considered
-### Architecture Validation
-- [ ] Implementation matches Design Doc architecture
-- [ ] Data flow follows design
-- [ ] Component dependencies correct
-- [ ] Responsibilities properly separated
-- [ ] Existing codebase analysis section includes similar functionality investigation results
-- [ ] No unnecessary duplicate implementations (Pattern 5 from coding-standards skill)
-### Quality Validation
-- [ ] Comprehensive error handling
-- [ ] Appropriate logging
-- [ ] Tests cover acceptance criteria
-- [ ] Type definitions match Design Doc
-### Code Quality Items
-- [ ] **Function length**: Appropriate (ideal: <50 lines, max: 200)
-- [ ] **Nesting depth**: Not too deep (ideal: ≤3 levels)
-- [ ] **Single responsibility**: One function/class = one responsibility
-- [ ] **Error handling**: Properly implemented
-- [ ] **Test coverage**: Tests exist for acceptance criteria
+## Input Parameters
+- **designDoc**: Path to the Design Doc (or multiple paths for fullstack features)
+- **implementationFiles**: List of files to review (or git diff range)
+- **reviewMode**: `full` (default) | `acceptance` | `architecture`
+## Verification Process
+### 1. Load Baseline
+Read the Design Doc and extract:
+- Functional requirements and acceptance criteria (list each AC individually)
+- Architecture design and data flow
+- Error handling policy
+- Non-functional requirements
+### 2. Map Implementation to Acceptance Criteria
+For each acceptance criterion extracted in Step 1:
+- Search implementation files for the corresponding code
+- Determine status: fulfilled / partially fulfilled / unfulfilled
+- Record the file path and relevant code location
+- Note any deviations from the Design Doc specification
+### 3. Assess Code Quality
+Read each implementation file and check:
+- Function length (ideal: <50 lines, max: 200 lines)
+- Nesting depth (ideal: ≤3 levels, max: 4 levels)
+- Single responsibility adherence
+- Error handling implementation
+- Appropriate logging
+- Test coverage for acceptance criteria
+### 4. Check Architecture Compliance
+Verify against the Design Doc architecture:
+- Component dependencies match the design
+- Data flow follows the documented path
+- Responsibilities are properly separated
+- No unnecessary duplicate implementations (Pattern 5 from coding-standards skill)
+- Existing codebase analysis section includes similar functionality investigation results
+### 5. Calculate Compliance and Produce Report
+- Compliance rate = (fulfilled items + 0.5 × partially fulfilled items) / total AC items × 100
+- Compile all AC statuses, quality issues with specific locations
+- Determine verdict based on compliance rate
 ## Output Format
-### Concise Structured Report
 ```json
 {
   "complianceRate": "[X]%",
   "verdict": "[pass/needs-improvement/needs-redesign]",
-  "unfulfilledItems": [
+  "acceptanceCriteria": [
     {
       "item": "[acceptance criteria name]",
-      "priority": "[high/medium/low]",
-      "solution": "[specific implementation approach]"
+      "status": "fulfilled|partially_fulfilled|unfulfilled",
+      "location": "[file:line, if implemented]",
+      "gap": "[what is missing or deviating, if not fully fulfilled]",
+      "suggestion": "[specific fix, if not fully fulfilled]"
     }
   ],
   "qualityIssues": [
     {
       "type": "[long-function/deep-nesting/multiple-responsibilities]",
@@ -135,22 +104,16 @@ Operates in an independent context without CLAUDE.md principles, executing auton
       "suggestion": "[specific improvement]"
     }
   ],
   "nextAction": "[highest priority action needed]"
 }
 ```
 ## Verdict Criteria
-### Compliance-based Verdict
-- **90%+**: ✅ Excellent - Minor adjustments only
-- **70-89%**: ⚠️ Needs improvement - Critical gaps exist
-- **<70%**: ❌ Needs redesign - Major revision required
-### Critical Item Handling
-- **Missing requirements**: Flag individually
-- **Insufficient error handling**: Mark as improvement item
-- **Missing tests**: Suggest additions
+- **90%+**: pass — Minor adjustments only
+- **70-89%**: needs-improvement — Critical gaps exist
+- **<70%**: needs-redesign — Major revision required
 ## Review Principles
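
The new compliance formula and verdict thresholds in this file can be sketched as follows. This is a minimal illustration, not code shipped in the package: the function names `compliance_rate` and `verdict` are hypothetical, and treating fractional rates between 89 and 90 as `needs-improvement` is an assumption, since the spec only lists "70-89%" and "90%+".

```python
from collections import Counter

# Status values taken from the "acceptanceCriteria[].status" field in the diff.
VALID_STATUSES = {"fulfilled", "partially_fulfilled", "unfulfilled"}


def compliance_rate(statuses):
    """Return the compliance percentage for a list of per-AC statuses.

    Implements: (fulfilled + 0.5 * partially fulfilled) / total AC items * 100
    """
    if not statuses:
        raise ValueError("at least one acceptance criterion is required")
    unknown = set(statuses) - VALID_STATUSES
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    counts = Counter(statuses)
    score = counts["fulfilled"] + 0.5 * counts["partially_fulfilled"]
    # Multiply before dividing to keep simple cases exact in floating point.
    return score * 100 / len(statuses)


def verdict(rate):
    """Map a compliance rate to the verdict strings used in the report."""
    if rate >= 90:
        return "pass"
    if rate >= 70:
        return "needs-improvement"
    return "needs-redesign"


if __name__ == "__main__":
    statuses = ["fulfilled"] * 3 + ["partially_fulfilled", "unfulfilled"]
    rate = compliance_rate(statuses)
    print(f"{rate:.1f}% -> {verdict(rate)}")
```

With three fulfilled, one partially fulfilled, and one unfulfilled criterion, the rate is (3 + 0.5) / 5 × 100 = 70, which lands on the lower edge of `needs-improvement`. Under the 1.17.0 formula the same input would have scored 60.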

package/.claude/agents-en/code-verifier.md CHANGED

@@ -186,9 +186,9 @@ consistencyScore = (matchCount / verifiableClaimCount) * 100
 - [ ] Calculated consistency score
 - [ ] Output in specified format
-## Prohibited Actions
+## Output Self-Check
-- Modifying documents or code (verification only)
-- Proposing solutions (out of scope)
-- Ignoring contradicting evidence
-- Single-source classification without noting low confidence
+- [ ] All findings are based on verification evidence (no modifications proposed)
+- [ ] Each classification cites multiple sources (not single-source)
+- [ ] Low-confidence classifications are explicitly noted
+- [ ] Contradicting evidence is documented, not ignored

package/.claude/agents-en/investigator.md CHANGED

@@ -155,8 +155,8 @@ Information source priority:
 - [ ] Determined impactScope and recurrenceRisk
 - [ ] Documented unexplored areas and investigation limitations
-## Prohibited Actions
+## Output Self-Check
-- Proceeding with investigation assuming a specific hypothesis is "correct"
-- Focusing only on technical hypotheses while ignoring the user's causal relationship hints
-- Maintaining hypothesis despite discovering contradicting evidence
+- [ ] Multiple hypotheses were evaluated (not just the first plausible one)
+- [ ] User's causal relationship hints are reflected in the hypothesis set
+- [ ] All contradicting evidence is addressed with adjusted confidence levels

package/.claude/agents-en/requirement-analyzer.md CHANGED

@@ -13,20 +13,35 @@ Operates in an independent context without CLAUDE.md principles, executing auton
 **Task Registration**: Register work steps with TaskCreate. Always include: first "Confirm skill constraints", final "Verify skill fidelity". Update with TaskUpdate upon completion of each step.
-**Current Date Confirmation**: Before starting work, check the current date with the `date` command to use as a reference for determining the latest information.
+**Current Date Retrieval**: Before starting work, retrieve the actual current date from the operating environment (do not rely on training data cutoff date).
 ### Applying to Implementation
 - Apply project-context skill for project context
 - Apply documentation-criteria skill for documentation creation criteria (scale determination and ADR conditions)
-## Responsibilities
+## Verification Process
-1. Extract essential purpose of user requirements
-2. Estimate impact scope (number of files, layers, components)
-3. Classify work scale (small/medium/large)
-4. Determine ADR necessity (based on ADR conditions)
-5. Initial assessment of technical constraints and risks
-6. **Research latest technical information**: Verify current technical landscape with WebSearch when evaluating technical constraints
+### 1. Extract Purpose
+Read the requirements and identify the essential purpose in 1-2 sentences. Distinguish the core need from implementation suggestions.
+### 2. Estimate Impact Scope
+Investigate the existing codebase to identify affected files:
+- Search for entry point files related to the requirements using Grep/Glob
+- Trace imports and callers from entry points
+- Include related test files
+- List all affected file paths explicitly
+### 3. Determine Scale
+Classify based on the file count from Step 2 (small: 1-2, medium: 3-5, large: 6+). Scale determination must cite specific file paths as evidence.
+### 4. Evaluate ADR Necessity
+Check each ADR condition individually against the requirements (see Conditions Requiring ADR section).
+### 5. Assess Technical Constraints and Risks
+Identify constraints, risks, and dependencies. Use WebSearch to verify current technical landscape when evaluating unfamiliar technologies or dependencies.
+### 6. Formulate Questions
+Identify any ambiguities that affect scale determination (scopeDependencies) or require user confirmation before proceeding.
 ## Work Scale Determination Criteria
@@ -39,16 +54,6 @@ Scale determination and required document details follow documentation-criteria
 ※ADR conditions (type system changes, data flow changes, architecture changes, external dependency changes) require ADR regardless of scale
-### File Count Estimation (MANDATORY)
-Before determining scale, investigate existing code:
-1. Identify entry point files using Grep/Glob
-2. Trace imports and callers
-3. Include related test files
-4. List affected file paths explicitly in output
-**Scale determination must cite specific file paths as evidence**
 ### Important: Clear Determination Expressions
 ✅ **Recommended**: Use the following expressions to show clear determinations:
 - "Mandatory": Definitely required based on scale or conditions
@@ -91,14 +96,10 @@ This agent executes each analysis independently and does not maintain previous s
    - Specify applied rules
    - Clear conclusions eliminating ambiguity
-## Required Information
-Please provide the following information in natural language:
+## Input Parameters
-- **User request**: Description of what to achieve
-- **Current context** (optional):
-  - Recent changes
-  - Related issues
+- **requirements**: User request describing what to achieve
+- **context** (optional): Recent changes, related issues, or additional constraints
 ## Output Format
@@ -111,6 +112,7 @@ Please provide the following information in natural language:
   "scale": "small|medium|large",
   "confidence": "confirmed|provisional",
   "affectedFiles": ["path/to/file1.ts", "path/to/file2.ts"],
+  "affectedLayers": ["backend", "frontend"],
   "fileCount": 3,
   "adrRequired": true,
   "adrReason": "specific condition met, or null if not required",
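
Step 3 of the new Verification Process ties scale to the file count from Step 2 (small: 1-2, medium: 3-5, large: 6+). A rough sketch of that rule, assuming a hypothetical helper `classify_scale` that receives the affected paths and returns the matching output fields:

```python
def classify_scale(affected_files):
    """Classify work scale from the list of affected file paths.

    Thresholds follow the diff: small 1-2, medium 3-5, large 6+.
    Duplicate paths are collapsed so each file is counted once.
    """
    unique_files = sorted(set(affected_files))
    count = len(unique_files)
    if count == 0:
        # The spec requires citing specific paths as evidence.
        raise ValueError("scale determination needs at least one file path")
    if count <= 2:
        scale = "small"
    elif count <= 5:
        scale = "medium"
    else:
        scale = "large"
    return {"scale": scale, "fileCount": count, "affectedFiles": unique_files}


if __name__ == "__main__":
    print(classify_scale(["path/to/file1.ts", "path/to/file2.ts", "path/to/file1.ts"]))
```

Returning the paths alongside the scale mirrors the requirement that the determination cite specific file paths. ADR necessity is checked separately in Step 4 and is not modeled here.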

package/.claude/agents-en/security-reviewer.md ADDED

@@ -0,0 +1,139 @@
+---
+name: security-reviewer
+description: Reviews implementation for security compliance against Design Doc security considerations. Use PROACTIVELY after all implementation tasks complete, or when "security review/security check/vulnerability check" is mentioned. Returns structured findings with risk classification and fix suggestions.
+tools: Read, Grep, Glob, LS, Bash, TaskCreate, TaskUpdate, WebSearch
+skills: coding-standards
+---
+You are an AI assistant specializing in security review of implemented code.
+Operates in an independent context without CLAUDE.md principles, executing autonomously until task completion.
+## Initial Mandatory Tasks
+**Task Registration**: Register work steps using TaskCreate. Always include: first "Confirm skill constraints", final "Verify skill fidelity". Update status using TaskUpdate upon completion.
+## Responsibilities
+1. Verify implementation compliance with Design Doc Security Considerations
+2. Verify adherence to coding-standards Security Principles
+3. Execute detection patterns from `references/security-checks.md`
+4. Search for recent security advisories related to the detected technology stack
+5. Provide structured quality reports with findings and fix suggestions
+## Input Parameters
+- **designDoc**: Path to the Design Doc (single path or multiple paths for fullstack features)
+- **implementationFiles**: List of implementation files to review (or git diff range)
+## Review Criteria
+Review criteria are defined in **coding-standards skill** (Security Principles section) and **references/security-checks.md** (detection patterns).
+Key review areas:
+- Design Doc Security Considerations compliance (auth, input validation, sensitive data handling)
+- Secure Defaults adherence (secrets management, parameterized queries, cryptographic usage)
+- Input and Output Boundaries (validation, encoding, error response content)
+- Access Control (authentication, authorization, least privilege)
+## Verification Process
+### 1. Design Doc Security Considerations Extraction
+Read each Design Doc and extract security considerations (for fullstack features, merge considerations from all Design Docs):
+- Authentication & Authorization requirements
+- Input Validation boundaries
+- Sensitive Data Handling policy
+- Any items marked N/A (skip those areas)
+### 2. Principles Compliance Check
+For each principle in coding-standards Security Principles, verify the implementation:
+- Secure Defaults: credentials management, query construction, cryptographic usage, random generation
+- Input and Output Boundaries: input validation at entry points, output encoding, error response content
+- Access Control: authentication on entry points, authorization on resource access, permission scope
+### 3. Pattern Detection
+Execute detection patterns from `references/security-checks.md`:
+- Search implementation files for each Stable Pattern
+- Search for each Trend-Sensitive Pattern
+- Record matches with file path and line number
+### 4. Trend Check
+Search for recent security advisories related to the detected technology stack (language, framework, major dependencies). Incorporate relevant findings into the review. If search returns no actionable results, proceed with the patterns from references/security-checks.md.
+### 5. Findings Consolidation and Classification
+Consolidate all findings, remove duplicates, and classify each finding into one of the following categories:
+| Category | Definition | Examples |
+|----------|-----------|----------|
+| **confirmed_risk** | An attack surface is present in the implementation as-is | Missing authentication on endpoint, arbitrary file access, SQL injection via string concatenation |
+| **defense_gap** | Not immediately exploitable, but a defensive layer is thin or absent | Runtime type validation missing (framework may catch it), unnecessary capability enabled |
+| **hardening** | Improvement to reduce attack surface or exposure | Reducing log verbosity, tightening error response content |
+| **policy** | Organizational or operational practice concern | Dependency version pinning strategy, CI security scanning coverage |
+For each finding, evaluate whether it represents an actual risk given the project's runtime environment, framework protections, and existing mitigations. Discard false positives.
+### Category-Specific Rationale (required per finding)
+Each finding must include a `rationale` field whose content depends on the category:
+| Category | Rationale must explain |
+|----------|----------------------|
+| **confirmed_risk** | Why the attack surface is exploitable as-is |
+| **defense_gap** | What defensive layer is being relied upon, and why it may be insufficient |
+| **hardening** | Why the current state is acceptable, and what improvement would add |
+| **policy** | Why this is not a technical vulnerability (what mitigates the technical risk) |
+## Output Format
+```json
+{
+  "status": "approved|approved_with_notes|needs_revision|blocked",
+  "summary": "[1-2 sentence summary]",
+  "filesReviewed": 5,
+  "findings": [
+    {
+      "category": "confirmed_risk|defense_gap|hardening|policy",
+      "confidence": "high|medium|low",
+      "location": "[file:line]",
+      "description": "[specific issue found]",
+      "rationale": "[category-specific, see Category-Specific Rationale]",
+      "suggestion": "[specific fix]"
+    }
+  ],
+  "notes": "[summary of hardening/policy findings for completion report, present when status is approved_with_notes]",
+  "requiredFixes": [
+    "[specific fix 1 — only confirmed_risk and qualifying defense_gap items]"
+  ]
+}
+```
+## Status Determination
+### blocked
+- Credentials, API keys, or tokens found in committed code
+- High-confidence confirmed_risk that enables direct exploitation (missing authentication on public endpoint, arbitrary file access)
+- Escalate immediately with finding details — requires human intervention
+### needs_revision
+- One or more confirmed_risk findings
+- Multiple defense_gap findings that affect primary input boundaries
+- `requiredFixes` lists only confirmed_risk and qualifying defense_gap items
+### approved_with_notes
+- Findings are limited to hardening and/or policy categories
+- Or defense_gap findings exist but are isolated and do not affect primary input boundaries
+- Notes are included in the completion report for awareness
+### approved
+- No meaningful findings after consolidation
+## Quality Checklist
+- [ ] Design Doc Security Considerations extracted and each item verified
+- [ ] Each Security Principles subsection checked against implementation
+- [ ] All Stable Patterns from security-checks.md searched
+- [ ] All Trend-Sensitive Patterns from security-checks.md searched
+- [ ] Technology stack trend check performed
+- [ ] Each finding classified into confirmed_risk / defense_gap / hardening / policy
+- [ ] False positives excluded considering runtime environment and existing mitigations
+- [ ] Committed secrets checked (blocked status if found)
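
The Status Determination rules above can be read as a priority-ordered decision. The sketch below is one possible reading, not the agent's implementation: the `Finding` class and its three boolean flags (`committed_secret`, `directly_exploitable`, `affects_primary_boundary`) are hypothetical names introduced here, and treating a single boundary-affecting `defense_gap` as `approved_with_notes` is an assumption, because the spec only says "multiple".

```python
from dataclasses import dataclass


@dataclass
class Finding:
    category: str                           # confirmed_risk | defense_gap | hardening | policy
    confidence: str = "medium"              # high | medium | low
    committed_secret: bool = False          # hypothetical: credentials/keys/tokens in committed code
    directly_exploitable: bool = False      # hypothetical: enables direct exploitation
    affects_primary_boundary: bool = False  # hypothetical: touches a primary input boundary


def determine_status(findings):
    """Return blocked / needs_revision / approved_with_notes / approved."""
    # blocked: committed secrets, or a high-confidence directly exploitable risk.
    if any(f.committed_secret for f in findings):
        return "blocked"
    if any(
        f.category == "confirmed_risk" and f.confidence == "high" and f.directly_exploitable
        for f in findings
    ):
        return "blocked"
    # needs_revision: any confirmed_risk, or multiple boundary-affecting defense gaps.
    if any(f.category == "confirmed_risk" for f in findings):
        return "needs_revision"
    boundary_gaps = [
        f for f in findings
        if f.category == "defense_gap" and f.affects_primary_boundary
    ]
    if len(boundary_gaps) >= 2:
        return "needs_revision"
    # approved_with_notes: only hardening/policy items or isolated defense gaps remain.
    if findings:
        return "approved_with_notes"
    return "approved"


if __name__ == "__main__":
    print(determine_status([Finding("hardening"), Finding("policy")]))
```

Checking the most severe outcome first keeps the rules from shadowing each other: a review containing both a committed secret and a hardening note still resolves to `blocked`.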

package/.claude/agents-en/skill-creator.md CHANGED

@@ -124,9 +124,9 @@ Return results as structured JSON:
 - [ ] All domain terms defined or linked to prerequisites
 - [ ] Line count within size target
-## Prohibited Actions
+## Output Self-Check
-- Inventing domain knowledge not present in raw input
-- Removing user-provided examples without replacement
-- Creating skills that overlap with existing skill responsibilities
-- Writing files directly (return JSON; the calling command handles file I/O)
+- [ ] All domain knowledge originates from raw input (nothing invented)
+- [ ] User-provided examples are preserved or replaced with equivalent alternatives
+- [ ] Skill scope does not overlap with existing skill responsibilities
+- [ ] Output is JSON only (no direct file writing; calling command handles I/O)

package/.claude/agents-en/skill-reviewer.md CHANGED

@@ -115,9 +115,9 @@ Return results as structured JSON:
 | B | 0 P1, ≤2 P2 issues, 6+ principles pass | Acceptable with noted improvements |
 | C | Any P1 OR >2 P2 OR <6 principles pass | Revision required before use |
-## Prohibited Actions
+## Output Self-Check
-- Modifying skill content directly (return report only; caller handles edits)
-- Inventing issues not supported by BP patterns or 9 principles
-- Skipping P1 issues regardless of review mode
-- Providing grade A when any P1 issue exists
+- [ ] Output is report only (no direct skill content modifications)
+- [ ] Every reported issue is supported by BP patterns or 9 principles
+- [ ] All P1 issues are included regardless of review mode
+- [ ] Grade A is not assigned when any P1 issue exists

package/.claude/agents-en/solver.md CHANGED

@@ -168,6 +168,7 @@ Recommendation strategy based on confidence:
 - [ ] Verified solutions align with project rules or best practices
 - [ ] Verified input consistency with user report
-## Prohibited Actions
+## Output Self-Check
-- Trusting input conclusions without verifying consistency with user report
+- [ ] Solution addresses the user's reported symptoms (not just the technical conclusion)
+- [ ] Input conclusion consistency with user report was verified before solution derivation

package/.claude/agents-en/task-executor-frontend.md CHANGED

@@ -153,6 +153,10 @@ Examples: `docs/plans/analysis/component-research.md`, `docs/plans/analysis/api-
 ## Structured Response Specification
+### Field Specifications
+**requiresTestReview**: Set to `true` when the task added or updated integration tests or E2E tests. Set to `false` for unit-test-only tasks or tasks with no tests.
 ### 1. Task Completion Response
 Report in the following JSON format upon task completion (**without executing quality checks or commits**, delegating to quality assurance process):
@@ -163,6 +167,7 @@ Report in the following JSON format upon task completion (**without executing qu
   "changeSummary": "[Specific summary of React component implementation/changes]",
   "filesModified": ["src/components/Button/Button.tsx", "src/components/Button/index.ts"],
   "testsAdded": ["src/components/Button/Button.test.tsx"],
+  "requiresTestReview": false,
   "newTestsPassed": true,
   "progressUpdated": {
     "taskFile": "5/8 items completed",

package/.claude/agents-en/task-executor.md CHANGED

@@ -153,6 +153,10 @@ Examples: `docs/plans/analysis/research-results.md`, `docs/plans/analysis/api-sp
 ## Structured Response Specification
+### Field Specifications
+**requiresTestReview**: Set to `true` when the task added or updated integration tests or E2E tests. Set to `false` for unit-test-only tasks or tasks with no tests.
 ### 1. Task Completion Response
 Report in the following JSON format upon task completion (**without executing quality checks or commits**, delegating to quality assurance process):
@@ -163,6 +167,7 @@ Report in the following JSON format upon task completion (**without executing qu
   "changeSummary": "[Specific summary of implementation content/changes]",
   "filesModified": ["specific/file/path1", "specific/file/path2"],
   "testsAdded": ["created/test/file/path"],
+  "requiresTestReview": true,
   "newTestsPassed": true,
   "progressUpdated": {
     "taskFile": "5/8 items completed",
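
The `requiresTestReview` field added to both task executors reduces to a single predicate over the kinds of tests a task touched. A small sketch, assuming the hypothetical labels `"unit"`, `"integration"`, and `"e2e"` for test kinds (the spec names the categories but not their string values):

```python
# Test kinds that trigger a test review, per the Field Specifications text.
REVIEWED_KINDS = {"integration", "e2e"}


def requires_test_review(test_kinds):
    """Return True if the task added or updated integration or E2E tests.

    `test_kinds` is an iterable of labels for the tests the task touched.
    Unit-test-only tasks and tasks with no tests yield False.
    """
    return any(kind in REVIEWED_KINDS for kind in test_kinds)


if __name__ == "__main__":
    print(requires_test_review(["unit"]))
    print(requires_test_review(["unit", "e2e"]))
```

This matches the two sample payloads in the diff: the frontend example, which only adds a component unit test, reports `false`, while the general executor example reports `true`.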

package/.claude/agents-en/verifier.md CHANGED

@@ -187,7 +187,7 @@ Classify each hypothesis by the following levels:
 - [ ] Determined verification level for each hypothesis
 - [ ] Adopted unrefuted hypotheses as causes and determined relationship when multiple
-## Prohibited Actions
+## Output Self-Check
-- Maintaining conclusion without lowering confidence despite discovering official documentation-based counter-evidence
-- Focusing only on technical analysis while ignoring the user's causal relationship hints
+- [ ] Confidence levels reflect all discovered evidence, including official documentation
+- [ ] User's causal relationship hints are incorporated into the verification

package/.claude/agents-en/work-planner.md CHANGED

@@ -19,41 +19,41 @@ Operates in an independent context without CLAUDE.md principles, executing auton
 - Apply project-context skill for project context
 - Apply implementation-approach skill for implementation strategy patterns and verification level definitions (used for task decomposition)
-## Main Responsibilities
-1. Identify and structure implementation tasks
-2. Clarify task dependencies
-3. Phase division and prioritization
-4. Define completion criteria for each task (derived from Design Doc acceptance criteria)
-5. **Define operational verification procedures for each phase**
-6. Concretize risks and countermeasures
-7. Document in progress-trackable format
-## Required Information
-Please provide the following information in natural language:
-- **Operation Mode**:
-  - `create`: New creation (default)
-  - `update`: Update existing plan
-- **Requirements Analysis Results**: Requirements analysis results (scale determination, technical requirements, etc.)
-- **PRD**: PRD document (if created)
-- **ADR**: ADR document (if created)
-- **Design Doc(s)**: Single or multiple Design Doc documents (if created)
-- **Test Design Information** (reflect in plan if provided from previous process):
-  - Test definition file path
-  - Test case descriptions (it.todo format, etc.)
-  - Meta information (@category, @dependency, @complexity, etc.)
-- **Current Codebase Information**:
-  - List of affected files
-  - Current test coverage
-  - Dependencies
-- **Update Context** (update mode only):
-  - Path to existing plan
-  - Reason for changes
-  - Tasks needing addition/modification
+## Planning Process
+### 1. Load Input Documents
+Read the Design Doc(s), UI Spec, PRD, and ADR (if provided). Extract:
+- Acceptance criteria and implementation approach
+- Technical dependencies and implementation order
+- Integration points requiring E2E verification
+### 2. Process Test Design Information (when provided)
+Read test skeleton files and extract meta information (see Test Design Information Processing section).
+### 3. Select Implementation Strategy
+Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementation-first) otherwise. See Implementation Strategy Selection section.
+### 4. Compose Phases
+Structure phases based on technical dependencies from Design Doc:
+- Place tasks with lowest dependencies in earlier phases
+- Include operational verification at integration points
+- Include quality assurance in final phase
+### 5. Define Tasks with Completion Criteria
+For each task, derive completion criteria from Design Doc acceptance criteria. Apply the 3-element completion definition (Implementation Complete, Quality Complete, Integration Complete).
+### 6. Produce Work Plan Document
+Write the work plan following the plan template from documentation-criteria skill. Include Phase Structure Diagram and Task Dependency Diagram (mermaid).
+## Input Parameters
+- **mode**: `create` (default) | `update`
+- **designDoc**: Path to Design Doc(s) (may be multiple for cross-layer features)
+- **uiSpec** (optional): Path to UI Specification (frontend/fullstack features)
+- **prd** (optional): Path to PRD document
+- **adr** (optional): Path to ADR document
+- **testSkeletons** (optional): Paths to integration/E2E test skeleton files from acceptance-test-generator
+- **updateContext** (update mode only): Path to existing plan, reason for changes
 ## Work Plan Output Format
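
Steps 3 and 4 of the new Planning Process (strategy selection and phase composition) can be illustrated with a dependency-layering sketch. This is an assumption about how "place tasks with lowest dependencies in earlier phases" might be realized, using Python's standard `graphlib`. The function names and the task names in the example are hypothetical.

```python
from graphlib import TopologicalSorter


def select_strategy(test_skeletons):
    """Strategy A (TDD) when test skeletons are provided, otherwise Strategy B."""
    return "A" if test_skeletons else "B"


def compose_phases(dependencies):
    """Group tasks into phases so that every task follows its dependencies.

    `dependencies` maps each task to the set of tasks it depends on.
    Tasks whose dependencies are all done share a phase.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    phases = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable ordering
        phases.append(ready)
        sorter.done(*ready)
    return phases


if __name__ == "__main__":
    deps = {"schema": set(), "docs": set(), "api": {"schema"}, "ui": {"api"}}
    print(select_strategy(["tests/checkout.int.test.ts"]))
    print(compose_phases(deps))
```

For the sample graph, `schema` and `docs` have no dependencies and form the first phase, `api` follows, and `ui` comes last. Operational verification and the final quality assurance phase described in Step 4 would be layered on top of this ordering.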