npm - agentic-swe - Versions diffs - 1.0.0 - Mend

agentic-swe 1.0.0


Files changed (191)

package/.claude/commands/lint.md ADDED

@@ -0,0 +1,86 @@
+# /lint
+Detect and run the target repository's linter(s) and formatter check(s) with structured result reporting.
+## Prompt
+You are running lint and format checks for the target repository. Detect the tools, execute them, and return structured results.
+Arguments: `$ARGUMENTS`
+- If arguments are provided, treat them as a scope filter (e.g., file path, directory, or glob pattern).
+- If no arguments, run against the full codebase using the project's default configuration.
+### Procedure
+1. **Detect linting tools**:
+   - Check for tool indicators in order:
+     - `package.json` → scripts.lint, scripts."lint:fix", devDependencies (eslint, biome, oxlint)
+     - `.eslintrc*`, `eslint.config.*` → ESLint
+     - `biome.json`, `biome.jsonc` → Biome
+     - `pyproject.toml` [tool.ruff], `ruff.toml` → Ruff
+     - `pyproject.toml` [tool.flake8], `.flake8` → Flake8
+     - `pyproject.toml` [tool.mypy], `mypy.ini` → mypy (typecheck)
+     - `.golangci.yml` → golangci-lint
+     - `Cargo.toml` → `cargo clippy`
+     - `.rubocop.yml` → RuboCop
+     - `Makefile` / `Justfile` → lint targets
+   - Collect all applicable tools; run each separately.
+2. **Detect formatters** (check mode only):
+   - `.prettierrc*`, `prettier.config.*` → `prettier --check`
+   - `pyproject.toml` [tool.ruff.format] → `ruff format --check`; `pyproject.toml` [tool.black] → `black --check`
+   - `rustfmt.toml` → `cargo fmt --check`
+   - `gofmt` / `goimports` → list-only mode (`-l`), which reports unformatted files without rewriting them
+   - `biome.json` → `biome check` (covers both lint and format)
+3. **Pre-flight checks**:
+   - Verify each tool is available.
+   - If a tool is missing, report it as a blocker rather than installing it.
+4. **Execute each tool**:
+   - Run in check/report mode — do NOT auto-fix.
+   - Apply scope filter if provided.
+   - Capture stdout and stderr.
+   - Set a 3-minute timeout per tool.
+5. **Parse results**:
+   - For each tool: extract total issues, errors, warnings, fixable count.
+   - For each issue: file, line, rule/code, severity, message.
+   - Cap detailed issue listing at 50 items; note if truncated.
+### Output Format
+```markdown
+# Lint Results
+## Summary
+- **Status**: CLEAN | ISSUES | ERROR | BLOCKED
+- **Scope**: <full codebase | scoped to: X>
+## Tools Run
+### <Tool Name>
+- **Command**: `<exact command>`
+- **Exit code**: <code>
+- **Issues**: <count> (<errors> errors, <warnings> warnings)
+- **Fixable**: <count>
+| File | Line | Rule | Severity | Message |
+|------|------|------|----------|---------|
+| ... | ... | ... | ... | ... |
+<!-- Repeat for each tool -->
+## Aggregate
+| Tool | Errors | Warnings | Fixable |
+|------|--------|----------|---------|
+| ... | ... | ... | ... |
+| **Total** | ... | ... | ... |
+```
+### Failure Protocol
+- If no linting tool is detected, report `BLOCKED` with reason "no linter configured".
+- If a tool is missing or not installed, report `BLOCKED` for that tool and continue with others.
+- Do not auto-fix, install packages, or modify configuration. Report findings only.
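The detection step in `lint.md` can be sketched as a lookup from indicator files to tool names. This is a minimal sketch, not the package's implementation: `INDICATORS` and `detect_linters` are hypothetical names, and only the file-presence indicators are covered (the `package.json` scripts and `pyproject.toml` `[tool.*]` tables listed in the procedure would need real parsing).

```python
from pathlib import Path

# Hypothetical table: indicator glob -> tool name, in the order the procedure lists them.
INDICATORS = [
    (".eslintrc*", "eslint"),
    ("eslint.config.*", "eslint"),
    ("biome.json", "biome"),
    ("biome.jsonc", "biome"),
    ("ruff.toml", "ruff"),
    (".flake8", "flake8"),
    ("mypy.ini", "mypy"),
    (".golangci.yml", "golangci-lint"),
    ("Cargo.toml", "cargo clippy"),
    (".rubocop.yml", "rubocop"),
]


def detect_linters(root):
    """Return tool names whose indicator files exist directly under root."""
    root = Path(root)
    found = []
    for pattern, tool in INDICATORS:
        # any() stops at the first match; duplicates (e.g. two ESLint configs) collapse.
        if any(root.glob(pattern)) and tool not in found:
            found.append(tool)
    return found
```

An empty result corresponds to the `BLOCKED` case in the failure protocol ("no linter configured"); each returned tool would then be run separately in check mode.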

package/.claude/commands/plan-only.md ADDED

@@ -0,0 +1,28 @@
+# /plan-only
+Plan work without implementing. Stops after the planning stages.
+## Prompt
+You are planning work for: `$ARGUMENTS`
+Instructions:
+1. If you are in a target repository and `.claude/CLAUDE.md` or the required `.claude/templates/` files are missing, run `/install` first.
+2. Create `.claude/.work/<id>/state.json` from `.claude/templates/state.json`.
+3. Fill in `work_id`, `task`, and keep `current_state: "initialized"`.
+4. Execute only:
+   - `feasibility`
+   - `fast-path-check`
+   - if full path is required (fast-path-check verdict is `complex`), `design`
+5. If ambiguity is found, stop at `ambiguity-wait`.
+6. If the task is fast-path eligible, stop after `fast-path-check` and record the recommendation instead of implementing.
+7. Do not proceed into:
+   - `verification`
+   - `test`
+   - `fast-implementation`
+   - `implementation`
+   - `code-review`
+   - `validation`
+   - `pr-created`
+8. Return the work id, current state, and recommended next state.
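The stage gating in `plan-only.md` reduces to a short rule. The sketch below is illustrative only: `plan_only_stages` is a hypothetical helper, and it assumes any verdict other than `complex` means the task is fast-path eligible, since `complex` is the only verdict value the file names.

```python
# Stages the command must never enter, copied from step 7 of the instructions.
EXCLUDED_STAGES = (
    "verification",
    "test",
    "fast-implementation",
    "implementation",
    "code-review",
    "validation",
    "pr-created",
)


def plan_only_stages(fast_path_verdict):
    """Return the planning stages to execute for a given fast-path-check verdict."""
    stages = ["feasibility", "fast-path-check"]
    # Only the full path adds a design stage; fast-path tasks stop after the check.
    if fast_path_verdict == "complex":
        stages.append("design")
    return stages
```

Ambiguity handling (stopping at `ambiguity-wait`) is left out because the file does not say at which stage ambiguity is detected.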

package/.claude/commands/repo-scan.md ADDED

@@ -0,0 +1,96 @@
+# /repo-scan
+Produce a structured snapshot of the target repository for rapid task analysis.
+## Prompt
+You are scanning the repository to produce a structured overview. This is a read-only operation — do not modify any files.
+### Procedure
+1. **Language and framework detection**:
+   - Inspect root-level config files: `package.json`, `Cargo.toml`, `go.mod`, `pyproject.toml`, `Gemfile`, `pom.xml`, `build.gradle`, `Makefile`, `CMakeLists.txt`, etc.
+   - Identify primary language(s), framework(s), and package manager(s).
+   - Note language version constraints if declared.
+2. **Project structure**:
+   - List top-level directories with a one-line purpose for each.
+   - Identify source roots (e.g., `src/`, `lib/`, `app/`), test roots (e.g., `tests/`, `__tests__/`, `spec/`), and config roots.
+   - Note monorepo indicators (workspaces, Lerna, Nx, Turborepo).
+3. **Test infrastructure**:
+   - Identify test framework(s): jest, pytest, go test, cargo test, rspec, junit, etc.
+   - Locate test config files (e.g., `jest.config.*`, `pytest.ini`, `conftest.py`, `.mocharc.*`).
+   - Note test commands from `package.json` scripts, `Makefile` targets, or CI config.
+   - Report approximate test file count.
+4. **CI/CD configuration**:
+   - Check for: `.github/workflows/`, `.gitlab-ci.yml`, `Jenkinsfile`, `.circleci/`, `buildkite/`, `.travis.yml`, `azure-pipelines.yml`.
+   - Summarize what CI runs: build, test, lint, typecheck, deploy, etc.
+   - Note required checks or branch protection indicators.
+5. **Linting and formatting**:
+   - Identify linter configs: `.eslintrc*`, `ruff.toml`, `.golangci.yml`, `.rubocop.yml`, `biome.json`, `.prettierrc*`, etc.
+   - Note lint/format commands from scripts or CI.
+6. **Entry points and exports**:
+   - Identify main entry points: `main.*`, `index.*`, `app.*`, `server.*`, CLI entry points.
+   - Note public API surface if applicable (exports, `__init__.py`, barrel files).
+7. **Dependencies overview**:
+   - Count direct vs dev dependencies.
+   - Flag notable dependencies (ORMs, HTTP frameworks, auth libraries, cloud SDKs).
+   - Note lockfile presence and type.
+### Output Format
+Write output as structured markdown:
+```markdown
+# Repository Scan
+## Identity
+- **Languages**: <list>
+- **Frameworks**: <list>
+- **Package manager**: <name> (lockfile: yes/no)
+## Structure
+| Directory | Purpose |
+|-----------|---------|
+| ... | ... |
+- **Source root(s)**: <paths>
+- **Test root(s)**: <paths>
+- **Monorepo**: yes/no (tool: <if applicable>)
+## Test Infrastructure
+- **Framework(s)**: <list>
+- **Config**: <files>
+- **Run command**: <command>
+- **Approximate test count**: <number>
+## CI/CD
+- **Platform**: <name>
+- **Pipelines**: <summary of what runs>
+- **Required checks**: <list or "none detected">
+## Linting & Formatting
+- **Linter(s)**: <list with config files>
+- **Formatter(s)**: <list with config files>
+- **Run command**: <command>
+## Entry Points
+- <path>: <description>
+## Notable Dependencies
+- <name>: <purpose>
+## Observations
+- <anything unusual, missing, or noteworthy>
+```
+### Failure Protocol
+- If the repository is empty or has no recognizable structure, report that fact.
+- Do not guess framework details — report only what config files and code confirm.
+- If a section has no findings, write "None detected" rather than omitting it.
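Two of the rules in `repo-scan.md` are easy to illustrate: noting lockfile presence and type, and writing "None detected" for empty sections. The following is a sketch under stated assumptions; `LOCKFILES`, `detect_lockfiles`, and `render_section` are hypothetical names, and the lockfile table is a small sample rather than an exhaustive list.

```python
from pathlib import Path

# Well-known lockfile names and the package manager that writes each one.
LOCKFILES = {
    "package-lock.json": "npm",
    "yarn.lock": "yarn",
    "pnpm-lock.yaml": "pnpm",
    "Cargo.lock": "cargo",
    "poetry.lock": "poetry",
    "Gemfile.lock": "bundler",
}


def detect_lockfiles(root):
    """Map each lockfile present directly under root to its package manager."""
    root = Path(root)
    return {name: mgr for name, mgr in LOCKFILES.items() if (root / name).is_file()}


def render_section(title, items):
    """Render a report section; empty sections say 'None detected' instead of vanishing."""
    lines = [f"## {title}"]
    lines += [f"- {item}" for item in items] or ["None detected"]
    return "\n".join(lines)
```

Keeping empty sections in the output lets a reader distinguish "checked and found nothing" from "not checked", which appears to be the intent of the failure protocol.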

package/.claude/commands/security-scan.md ADDED

@@ -0,0 +1,98 @@
+# /security-scan
+Run baseline security checks against the target repository with structured findings.
+## Prompt
+You are performing a security scan of the repository. Execute available checks and report findings with evidence.
+Arguments: `$ARGUMENTS`
+- If arguments specify a scope (file path, directory, or "dependencies-only"), limit the scan.
+- If no arguments, scan the full repository.
+### Procedure
+1. **Dependency audit**:
+   - Detect package manager and run its audit command:
+     - `npm`: `npm audit --json`
+     - `yarn`: `yarn audit --json`
+     - `pnpm`: `pnpm audit --json`
+     - `pip`: `pip-audit` (if available) or check `safety` output
+     - `cargo`: `cargo audit` (if available)
+     - `go`: `govulncheck ./...` (if available)
+     - `bundler`: `bundle audit check`
+   - If the audit tool is not installed, note it as unavailable rather than installing.
+   - Parse: vulnerability count by severity (critical, high, medium, low), affected packages, fix availability.
+2. **Secret scanning**:
+   - Search for common secret patterns in tracked files (not in `.git/`):
+     - API keys: patterns like `AKIA[0-9A-Z]{16}`, `sk-[a-zA-Z0-9]{48}`, `ghp_[a-zA-Z0-9]{36}`
+     - Private keys: `-----BEGIN (RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----`
+     - Connection strings: `postgres://`, `mysql://`, `mongodb://`, `redis://` with credentials
+     - Generic high-entropy strings in assignment context (be conservative — flag only high-confidence matches)
+   - Check `.gitignore` for common exclusions: `.env`, `*.pem`, `*.key`, `credentials.*`
+   - Report missing `.gitignore` entries for sensitive patterns.
+3. **Configuration review**:
+   - Check for overly permissive CORS, disabled security headers, debug modes left on.
+   - Check for hardcoded `0.0.0.0` bindings, disabled TLS verification, `NODE_ENV=development` in production configs.
+   - Check Dockerfile (if present) for: running as root, using `latest` tag, copying secrets into image.
+   - This is a surface-level scan — flag obvious issues, not deep application logic.
+4. **Dangerous patterns in code** (scope to changed files when scoped):
+   - `eval()`, `exec()`, `subprocess` with `shell=True`, `child_process.exec` with unsanitized input
+   - SQL string concatenation (vs parameterized queries)
+   - `dangerouslySetInnerHTML`, `innerHTML` assignments with dynamic content
+   - Deserialization of untrusted data (`pickle.loads`, `yaml.load` without SafeLoader, `JSON.parse` on user input passed to eval)
+### Output Format
+```markdown
+# Security Scan
+## Summary
+- **Status**: CLEAN | FINDINGS | BLOCKED
+- **Scope**: <full repo | scoped to: X>
+- **Critical**: <count>
+- **High**: <count>
+- **Medium**: <count>
+- **Low**: <count>
+## Dependency Audit
+- **Tool**: <command run>
+- **Vulnerabilities**: <count>
+| Package | Severity | CVE/Advisory | Fix Available | Description |
+|---------|----------|-------------|---------------|-------------|
+| ... | ... | ... | ... | ... |
+## Secret Scan
+| Pattern | File | Line | Confidence |
+|---------|------|------|------------|
+| ... | ... | ... | high/medium |
+### Missing .gitignore Entries
+- <pattern that should be ignored>
+## Configuration Issues
+| File | Issue | Severity |
+|------|-------|----------|
+| ... | ... | ... |
+## Dangerous Code Patterns
+| File | Line | Pattern | Risk |
+|------|------|---------|------|
+| ... | ... | ... | ... |
+## Recommendations
+- <prioritized list of actions>
+```
+### Failure Protocol
+- If no package manager is detected, skip dependency audit and note it.
+- If audit tools are not installed, report them as unavailable — do not install them.
+- Be conservative with secret detection — false positives erode trust. Only flag high-confidence matches.
+- Do not modify any files. Report findings only.
+- Apply `.claude/templates/evidence-standard.md` to all findings.
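The secret-scanning step in `security-scan.md` lists concrete regular expressions, so it can be sketched directly. This is a minimal illustration, not the package's scanner: `SECRET_PATTERNS` labels and `scan_text` are hypothetical, and only the four patterns quoted in the file are included (connection strings and entropy checks are omitted).

```python
import re

# Patterns copied from the procedure; the dictionary keys are illustrative labels.
SECRET_PATTERNS = {
    "aws-access-key-id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "sk-prefixed-key": re.compile(r"sk-[a-zA-Z0-9]{48}"),
    "github-token": re.compile(r"ghp_[a-zA-Z0-9]{36}"),
    "private-key": re.compile(r"-----BEGIN (RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text, filename="<memory>"):
    """Return (label, filename, line number) for each line matching a secret pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                # Record the location only, so the report never echoes the secret itself.
                findings.append((label, filename, lineno))
    return findings
```

Reporting the pattern label and location rather than the matched value fits the "Pattern | File | Line | Confidence" table in the output format.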

package/.claude/commands/subagent.md ADDED

@@ -0,0 +1,109 @@
+---
+name: subagent
+description: "Browse, search, and invoke specialized subagents for specific development tasks."
+---
+# /subagent — Discover and Invoke Specialized Subagents
+Browse, search, and invoke any of 135+ specialized subagents across 10 categories of development expertise.
+## Input: $ARGUMENTS
+Usage:
+- `/subagent` — list all categories and agents
+- `/subagent search <query>` — find agents by keyword
+- `/subagent info <name>` — show full agent definition
+- `/subagent invoke <name> <task description>` — spawn agent for a task
+## Categories
+| # | Category | Count | Examples |
+|---|----------|-------|----------|
+| 01 | Core Development | 10 | api-designer, backend-developer, frontend-developer, fullstack-developer |
+| 02 | Language Specialists | 29 | python-pro, typescript-pro, rust-engineer, golang-pro, react-developer |
+| 03 | Infrastructure | 16 | cloud-architect, devops-engineer, kubernetes-specialist, terraform-engineer |
+| 04 | Quality & Security | 14 | code-reviewer, security-auditor, debugger, performance-engineer |
+| 05 | Data & AI | 13 | data-engineer, ml-engineer, llm-architect, prompt-engineer |
+| 06 | Developer Experience | 13 | documentation-engineer, cli-developer, refactoring-specialist, mcp-developer |
+| 07 | Specialized Domains | 12 | blockchain-developer, fintech-developer, gaming-developer, iot-engineer |
+| 08 | Business & Product | 11 | product-manager, project-manager, technical-writer, ux-researcher |
+| 09 | Meta & Orchestration | 10 | multi-agent-coordinator, workflow-orchestrator, context-manager |
+| 10 | Research & Analysis | 7 | research-analyst, competitive-analyst, trend-analyst |
+## Instructions
+### Action: list (no arguments or `list`)
+List all categories with their agents:
+```bash
+source .claude/tools/subagent-catalog/config.sh
+for category in $(subagent_catalog_list_categories); do
+  echo "### $category"
+  subagent_catalog_list_agents_in "$category" | tr '\n' ', ' | sed 's/,$//'
+  echo ""
+done
+```
+### Action: search <query>
+```bash
+source .claude/tools/subagent-catalog/config.sh
+subagent_catalog_search "<query>"
+```
+Display results as a table with category, name, and description.
+### Action: info <name>
+1. Find the agent file:
+```bash
+source .claude/tools/subagent-catalog/config.sh
+subagent_catalog_find "<name>"
+```
+2. Read and display the full agent definition with parsed frontmatter (name, description, tools, model).
+3. Show invocation example:
+```
+Agent(prompt="agents/subagents/<category>/<name>.md", model="<model>")
+```
+### Action: invoke <name> <task>
+1. Find the agent file path.
+2. Read the agent definition to determine the recommended model.
+3. Spawn the agent:
+```
+Agent(
+  prompt="<read agent file content>\n\n---\n\nTask: <task description>",
+  model="<model from frontmatter>",
+  description="<name>: <short task summary>"
+)
+```
+**Model routing from agent frontmatter:**
+- `opus` — deep reasoning (security audits, architecture reviews)
+- `sonnet` — everyday coding (most developers and specialists)
+- `haiku` — quick tasks (documentation, searches, dependency checks)
+## When to Use Subagents
+The orchestrator should consider invoking a subagent when:
+1. **Language-specific expertise needed** — e.g., Rust borrow checker issues, Python async patterns
+2. **Domain specialist required** — e.g., Kubernetes deployment, database optimization, security audit
+3. **Parallel review** — spawn multiple reviewers (code-reviewer + security-auditor) simultaneously
+4. **Research tasks** — competitive analysis, trend research, literature review
+5. **Infrastructure work** — Terraform plans, Docker optimization, CI/CD pipelines
+## Integration with Pipeline
+Subagents complement the core pipeline agents:
+- **developer.md** handles general implementation; language specialists handle language-specific expertise
+- **panel/*.md** handles design review; quality-security subagents provide deeper audits
+- **git-ops.md** handles git workflow; devops subagents handle infrastructure concerns
+The orchestrator delegates to subagents the same way as core agents — with explicit scope, evidence requirements, and audit logging.
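The `invoke` action in `subagent.md` reads a model name from the agent file's frontmatter and builds a prompt from the file content plus the task. A rough sketch of those two steps follows; `parse_frontmatter` and `build_agent_prompt` are hypothetical helpers, and the parser assumes flat `key: value` frontmatter rather than full YAML.

```python
def parse_frontmatter(text):
    """Parse flat `key: value` pairs between leading `---` delimiters."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            # Strip optional double quotes around values such as the description.
            fields[key.strip()] = value.strip().strip('"')
    return {}  # no closing delimiter: treat the file as having no frontmatter


def build_agent_prompt(agent_text, task):
    """Join the agent definition and the task using the separator shown in the file."""
    return f"{agent_text}\n\n---\n\nTask: {task}"
```

The `model` field returned by `parse_frontmatter` would then be passed as the `model` argument of the `Agent(...)` call shown in the file.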

package/.claude/commands/test-runner.md ADDED

@@ -0,0 +1,85 @@
+# /test-runner
+Discover and execute the target repository's test suite with structured result reporting.
+## Prompt
+You are running tests for the target repository. Detect the test framework, execute tests, and return structured results.
+Arguments: `$ARGUMENTS`
+- If arguments are provided, treat them as a test scope filter (e.g., file path, test name pattern, or directory).
+- If no arguments, run the full default test suite.
+### Procedure
+1. **Detect test framework and command**:
+   - Check for framework indicators in order:
+     - `package.json` → scripts.test, scripts."test:unit", scripts."test:integration"
+     - `Makefile` / `Justfile` → test targets
+     - `pyproject.toml` / `setup.cfg` / `pytest.ini` → pytest configuration
+     - `go.mod` → `go test ./...`
+     - `Cargo.toml` → `cargo test`
+     - `Gemfile` → `bundle exec rspec` or `bundle exec rake test`
+     - `.github/workflows/*.yml` → extract test commands from CI
+   - If multiple test commands exist, prefer the most specific match for the scope requested.
+2. **Pre-flight checks**:
+   - Verify the test command is available (e.g., `npx jest --version`, `pytest --version`).
+   - If dependencies appear uninstalled, report that as a blocker rather than installing them silently.
+   - Check for required environment setup (`.env.test`, database fixtures, docker-compose test services).
+3. **Execute tests**:
+   - Run the detected command with the scope filter applied.
+   - If scoped to a specific file or pattern, pass it as an argument to the test runner.
+   - Capture both stdout and stderr.
+   - Set a reasonable timeout (5 minutes for full suite, 2 minutes for scoped).
+4. **Parse results**:
+   - Extract: total tests, passed, failed, skipped, errored.
+   - For failures: capture test name, assertion message, and file location.
+   - Note execution time.
+### Output Format
+```markdown
+# Test Results
+## Summary
+- **Framework**: <name>
+- **Command**: `<exact command run>`
+- **Scope**: <full suite | scoped to: X>
+- **Status**: PASS | FAIL | ERROR | BLOCKED
+## Counts
+| Total | Passed | Failed | Skipped | Errored |
+|-------|--------|--------|---------|---------|
+| ... | ... | ... | ... | ... |
+## Failures
+<!-- Only if failures exist -->
+| Test | Location | Message |
+|------|----------|---------|
+| ... | ... | ... |
+## Execution
+- **Duration**: <time>
+- **Exit code**: <code>
+## Raw Output
+<details>
+<summary>Full output</summary>
+```
+<stdout + stderr>
+```
+</details>
+```
+### Failure Protocol
+- If no test framework is detected, report `BLOCKED` with reason "no test framework found".
+- If dependencies are missing, report `BLOCKED` with the missing dependency.
+- If tests time out, report `ERROR` with partial results if available.
+- Do not install packages, start services, or modify configuration. Report blockers instead.
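The execute step in `test-runner.md` (capture both streams, 5-minute or 2-minute timeout, status reporting) could look roughly like this. It is a sketch under assumptions: `run_tests` is a hypothetical helper, a nonzero exit code is mapped to `FAIL`, and a missing executable is mapped to `BLOCKED`; the file itself does not spell out those mappings.

```python
import subprocess

FULL_SUITE_TIMEOUT_S = 5 * 60  # full suite, per the procedure
SCOPED_TIMEOUT_S = 2 * 60      # scoped run, per the procedure


def run_tests(command, scope=None, timeout_s=None):
    """Run a test command, capturing stdout/stderr and classifying the outcome."""
    argv = list(command) + ([scope] if scope else [])
    if timeout_s is None:
        timeout_s = SCOPED_TIMEOUT_S if scope else FULL_SUITE_TIMEOUT_S
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        # Partial output, if any, is kept so the report can include it.
        return {"status": "ERROR", "exit_code": None, "stdout": exc.stdout, "stderr": exc.stderr}
    except FileNotFoundError as exc:
        # The runner is not installed: report a blocker instead of installing it.
        return {"status": "BLOCKED", "exit_code": None, "stdout": "", "stderr": str(exc)}
    status = "PASS" if proc.returncode == 0 else "FAIL"
    return {"status": status, "exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}
```

A call such as `run_tests(["pytest"], scope="tests/test_api.py")` would append the scope as an argument and use the 2-minute timeout; parsing counts out of the captured output is framework-specific and not shown.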

package/.claude/commands/work.md ADDED

@@ -0,0 +1,76 @@
+# /work
+Start a new work item or resume an existing one.
+## Prompt
+You are running `/work`.
+**Dispatch:** If `$ARGUMENTS` matches an existing work ID in `.claude/.work/`, resume that work item. Otherwise, treat `$ARGUMENTS` as a new task description.
+### Start Mode (new task)
+1. If you are in a target repository and `.claude/CLAUDE.md` or the required `.claude/templates/` files are missing, run `/install` first.
+2. Generate a short kebab-case work ID from the task description (e.g., `add-retry-logic`).
+3. Create `.claude/.work/<id>/`.
+4. Create `.claude/.work/<id>/state.json` from `.claude/templates/state.json`, filling in:
+   - `work_id`, `task`, `current_state: "initialized"`
+   - `created_at` and `updated_at` with current ISO-8601 timestamp
+   - `timeout_at` with current timestamp + 24 hours
+5. Create `.claude/.work/<id>/audit.log` and `.claude/.work/<id>/progress.md` from the templates.
+6. Branch setup (delegate to `.claude/agents/git-ops.md` if complex):
+   a. Detect base branch: check for `main`, then `master`, then current. Record in `state.json.git.base_branch`.
+   b. `git fetch origin <base> && git checkout <base> && git pull origin <base>`
+   c. `git checkout -b work/<work-id>` (check `git branch --list work/<work-id>` first for collision).
+   d. Record `state.json.git.working_branch = "work/<work-id>"`.
+   e. Do NOT push yet — no remote branch until commits exist.
+7. Read `CLAUDE.md` for the authoritative state machine and operating loop.
+8. Execute the operating loop (see below).
+### Resume Mode (existing work ID)
+1. Read the current `state.json`.
+2. Set `resume.detected` to `true`, `resume.detected_at` to current timestamp.
+3. Git state validation:
+   a. `git status --short` — if dirty worktree with unrelated changes, STOP and surface to user.
+   b. `git branch --show-current` — if not on `state.json.git.working_branch`, checkout it.
+   c. If working branch does not exist locally: `git fetch origin && git checkout -b <branch> origin/<branch>`. If not on remote either: STOP — branch was deleted, escalate.
+   d. `git fetch origin <base_branch>`
+   e. `git log HEAD..origin/<base_branch> --oneline` — if base has advanced: `git rebase origin/<base_branch>`. If conflicts: STOP, surface to user with conflict details.
+   f. Set `resume.branch_validated = true`, `resume.rebase_performed = (true if rebased)`.
+   g. Record git state (branch, SHA, rebase status) in history entry.
+4. Check `timeout_at` — if the work item has been idle beyond the timeout, warn the user before continuing.
+5. Read any existing artifacts in `.claude/.work/<id>/`.
+6. If the current state is a human gate (`ambiguity-wait`, `approval-wait`, `escalate-code`, `escalate-validation`), do not proceed without user input.
+7. Respect loop budgets already recorded in `state.json`.
+8. Update `timeout_at` to current timestamp + 24 hours on resume.
+9. Continue the operating loop (see below).
+### Operating Loop
+- Before each phase: invoke `/check budget`
+- Before each transition: invoke `/check transition`
+- After artifact creation: invoke `/check artifacts`
+- Execute phases using the matching prompt in `.claude/phases/`
+- Follow either:
+  - fast path: `fast-implementation → validation`
+  - full path: `design → design-review → verification → test → implementation → code-review → permissions → validation`
+- If validation succeeds, execute `pr-created` and stop at `approval-wait`.
+- When resuming at `approval-wait`:
+  - If `approvals.pr_approved == true`: execute `.claude/phases/completion.md`, transition to `completed`.
+  - If `approvals.changes_requested == true`:
+    a. Fetch PR feedback: `gh pr view --json reviews,comments`
+    b. Write `.claude/.work/<id>/approval-feedback.md` with structured review findings
+    c. Increment `counters.approval_iter`
+    d. Transition `approval-wait → implementation`
+    e. Implementation phase reads `approval-feedback.md` as mandatory additional input
+- Stop only at: `ambiguity-wait`, `approval-wait`, `escalate-code`, `escalate-validation`, `completed`, `failed`
+## Output
+Return:
+- work id
+- current state
+- artifacts written
+- whether human input is required
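Start Mode in `work.md` derives a kebab-case work ID and fills timestamps into `state.json`. The sketch below shows one way to do that; `make_work_id` and `initial_state` are hypothetical names, the four-word cap is an assumption, and only the fields the file names are included (the real template presumably has more).

```python
import re
from datetime import datetime, timedelta, timezone


def make_work_id(task, max_words=4):
    """Build a short kebab-case ID from the first few words of the task."""
    words = re.findall(r"[a-z0-9]+", task.lower())
    return "-".join(words[:max_words])


def initial_state(task, now=None):
    """Return the Start Mode fields: identity, state, and ISO-8601 timestamps."""
    now = now or datetime.now(timezone.utc)
    stamp = now.isoformat()
    return {
        "work_id": make_work_id(task),
        "task": task,
        "current_state": "initialized",
        "created_at": stamp,
        "updated_at": stamp,
        # Work items time out 24 hours after creation (refreshed on resume).
        "timeout_at": (now + timedelta(hours=24)).isoformat(),
    }
```

Passing `now` explicitly keeps the function deterministic, which makes the 24-hour `timeout_at` rule easy to check; Resume Mode would recompute `timeout_at` the same way.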

package/.claude/phases/code-review.md ADDED

@@ -0,0 +1,92 @@
+# Code Review
+## Mission
+Approve only when correctness, complexity, and test adequacy are acceptable. Review code changes after `implementation` or `fast-implementation`.
+## Persona
+Demanding senior reviewer — correctness outranks style, regressions outrank elegance, complexity must earn its keep.
+## Procedure
+1. Read `feasibility.md`, `design.md` (when present), `implementation.md`, `self-review.md` (when present), and the actual changed files.
+### Parallel Specialist Review (Auto-Selection)
+2. Read `## Subagent Signals` from `feasibility.md`. If `subagent_auto_select` is enabled and `subagent-mode` is `full`:
+   - Select 0-2 review-oriented subagents based on domain signals per `.claude/phases/subagent-selection.md` (Parallel Review Mode). Typical selections:
+     - Security-sensitive code → `security-auditor` (background)
+     - Performance-sensitive code → `performance-engineer` (background)
+     - Accessibility changes → `accessibility-tester` (background)
+     - Infrastructure changes → `security-engineer` (background)
+   - Spawn selected subagent(s) in **background** with the changed files and implementation context.
+   - They run in parallel with the main review — do NOT wait for them before proceeding.
+   - If `budget_remaining` < 3, skip specialist spawning.
+### Main Review
+3. Invoke `/diff-review` against the changed files for structured, evidence-backed findings.
+4. Compare the code against required behavior, design intent, repo conventions, and likely edge cases.
+5. Review with a defect-oriented mindset:
+   - wrong or partial behavior
+   - missing failure case handling
+   - unjustified complexity
+   - poor test coverage for risky branches
+6. Separate blocking findings, non-blocking concerns, and speculative risks.
+7. Score the implementation against each dimension (1=fail, 2=acceptable, 3=strong):
+   - **Correctness**: behavior vs. spec (1=wrong behavior, 2=correct happy path with edge gaps, 3=edge cases handled)
+   - **Safety**: failure modes and error handling (1=unsafe paths, 2=major paths covered, 3=defensive throughout)
+   - **Test adequacy**: regression coverage (1=no/trivial tests, 2=happy path tested, 3=edge+error paths tested)
+   - **Design conformance**: match to approved design (1=significant deviation, 2=minor deviations documented, 3=faithful)
+   - **Complexity**: proportionality to problem (1=unnecessary complexity, 2=acceptable, 3=simplest viable)
+8. Verdict rule: any dimension scored 1 → rejected. All 2+ → approved.
+### Integrate Specialist Findings
+9. If background specialist reviewers have returned, append their findings to the review artifact under `## Specialist Review Findings`.
+10. If any specialist finding is severity `high` or `critical`, flag it in the verdict rationale — but the main rubric scores still control the transition.
+11. If specialists have not returned yet, proceed with main verdict. Specialist findings arriving later are noted in the artifact but do not retroactively change the verdict.
+12. Log all specialist spawns and results in `audit.log`.
+## Reflection on Rejection
+When verdict is `rejected`, also append a structured entry to `.claude/.work/<id>/reflection-log.md`:
+- **What failed**: concrete file:line evidence of the deficiency
+- **Root cause**: the underlying reason (not just the symptom)
+- **Strategy change**: specific approach the next implementation attempt should take
+## Inputs
+- `.claude/.work/<id>/feasibility.md`
+- `.claude/.work/<id>/design.md` (if exists)
+- `.claude/.work/<id>/implementation.md`
+- `.claude/.work/<id>/self-review.md` (if exists)
+- Actual changed source files
+## Required Output
+Write one of:
+- `.claude/.work/<id>/review-pass.md` (when approved)
+- `.claude/.work/<id>/review-feedback.md` (when rejected or escalated)
+Following `.claude/templates/artifact-format.md`, include:
+- verdict: `approved`, `rejected`, or `escalated`
+- rubric scores (5 dimensions, 1-3 scale) with evidence for each score
+- findings by severity, confidence, evidence basis
+- recommended next state
+Apply `.claude/templates/evidence-standard.md` throughout.
+## Failure Protocol
+- do not hide weak evidence behind "looks good"
+- if a finding depends on a specific input or state, describe it concretely
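The verdict rule in `code-review.md` (any dimension scored 1 means rejected, all scores 2 or higher mean approved) is simple enough to state as code. A minimal sketch, assuming scores arrive as a dictionary keyed by hypothetical dimension names; the `escalated` verdict is not covered because the file gives no rule for it.

```python
# Hypothetical keys for the five rubric dimensions listed in step 7.
DIMENSIONS = (
    "correctness",
    "safety",
    "test_adequacy",
    "design_conformance",
    "complexity",
)


def review_verdict(scores):
    """Apply the rubric rule: any score of 1 rejects, otherwise approve."""
    for dim in DIMENSIONS:
        if dim not in scores:
            raise ValueError(f"missing score for {dim}")
        if scores[dim] not in (1, 2, 3):
            raise ValueError(f"score for {dim} must be 1, 2, or 3")
    return "rejected" if any(scores[dim] == 1 for dim in DIMENSIONS) else "approved"
```

Specialist findings do not feed into this function, which matches the statement that the main rubric scores control the transition.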