npm - @lvlup-sw/exarchos - Versions diffs - 2.8.1 → 2.8.3 - Mend

@lvlup-sw/exarchos 2.8.1 → 2.8.3


Files changed (15)

package/.claude-plugin/plugin.json +1 -1
package/dist/exarchos.js +278 -277
package/package.json +1 -1
package/skills/claude/shepherd/SKILL.md +21 -2
package/skills/claude/shepherd/references/fix-strategies.md +26 -53
package/skills/codex/shepherd/SKILL.md +21 -2
package/skills/codex/shepherd/references/fix-strategies.md +26 -53
package/skills/copilot/shepherd/SKILL.md +21 -2
package/skills/copilot/shepherd/references/fix-strategies.md +26 -53
package/skills/cursor/shepherd/SKILL.md +21 -2
package/skills/cursor/shepherd/references/fix-strategies.md +26 -53
package/skills/generic/shepherd/SKILL.md +21 -2
package/skills/generic/shepherd/references/fix-strategies.md +26 -53
package/skills/opencode/shepherd/SKILL.md +21 -2
package/skills/opencode/shepherd/references/fix-strategies.md +26 -53

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@lvlup-sw/exarchos",
-  "version": "2.8.1",
+  "version": "2.8.3",
   "description": "Structure for agentic development — durable SDLC workflows, convergence gates, agent teams, and audit trails for Claude Code",
   "type": "module",
   "bin": {

package/skills/claude/shepherd/SKILL.md CHANGED

@@ -93,7 +93,26 @@ Review the returned `actionItems` and `recommendation`:
 ### Step 2 — Fix
-Address each blocking action item from the assessment. Consult `references/fix-strategies.md` for detailed strategies per issue type.
+Before iterating over individual action items, classify them so the loop
+knows which to fix inline vs. delegate. Call `classify_review_items` on
+the assessment's `actionItems` (the comment-reply subset is what the
+classifier groups by file; CI-fix and review-address items are passed
+through unchanged):
+```typescript
+mcp__plugin_exarchos_exarchos__exarchos_orchestrate({
+  action: "classify_review_items",
+  featureId: "<id>",
+  actionItems: <actionItems from assess_stack>
+})
+```
+The result returns `groups: ClassificationGroup[]` with a `recommendation`
+per group: `direct` (handle inline), `delegate-fixer` (spawn the fixer
+subagent for batched/HIGH-severity work), or `delegate-scaffolder`
+(cheap subagent for doc nits). Iterate the groups in order, applying
+per-group strategy, then consult `references/fix-strategies.md` for
+detailed per-issue-type instructions.
 **Remediation event protocol (FLYWHEEL):**
@@ -130,7 +149,7 @@ These events feed `selfCorrectionRate` and `avgRemediationAttempts` metrics in C
 | Type | Strategy |
 |------|----------|
 | `ci-fix` | Read logs, reproduce locally, fix, commit to stack branch |
-| `comment-reply` | Read context from `actionItem.context`, compose response, post via GitHub MCP |
+| `comment-reply` | Use `actionItem.reviewer`, `normalizedSeverity`, `file`, `line`, and `raw` (full original comment) to compose a response. Provider adapters under `servers/exarchos-mcp/src/review/providers/` populate the input fields per #1159 — no manual tier parsing needed. **Posting:** PR-level summary comments use the provider-agnostic `add_pr_comment` orchestrate action; per-thread inline replies currently require the platform-specific MCP (e.g. `mcp__plugin_github_github__add_reply_to_pull_request_comment` for GitHub) until `VcsProvider` gains a thread-reply primitive — see [#1165](https://github.com/lvlup-sw/exarchos/issues/1165) for tracking. |
 | `review-address` | Fix code for CHANGES_REQUESTED, reply to each thread |
 | `restack` | Run `git rebase origin/<base>`, verify with `exarchos_orchestrate({ action: "list_prs" })` |
 | `escalate` | Consult `references/escalation-criteria.md` |
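
The Step 2 hunk above says to iterate the returned groups in order and apply a per-group strategy. A minimal sketch of that dispatch follows. Only `groups`, `recommendation`, and the three recommendation values come from the diff. The `items` field, the `PlanHandler` type, and the `planFixes` helper are hypothetical names introduced for illustration, and the handlers return descriptions instead of calling any real tool.

```typescript
// Recommendation values as listed in the SKILL.md hunk.
type GroupRecommendation = "direct" | "delegate-fixer" | "delegate-scaffolder";

// Assumed shape: the diff names ClassificationGroup but does not show its fields.
interface SketchGroup {
  recommendation: GroupRecommendation;
  items: { type: string; file?: string }[]; // hypothetical field
}

type PlanHandler = (group: SketchGroup) => string;

// One placeholder handler per recommendation; real handlers would fix code
// inline or spawn the relevant subagent.
const planHandlers: Record<GroupRecommendation, PlanHandler> = {
  direct: (g) => `fix ${g.items.length} item(s) inline`,
  "delegate-fixer": (g) => `spawn fixer subagent for ${g.items.length} item(s)`,
  "delegate-scaffolder": (g) =>
    `spawn scaffolder subagent for ${g.items.length} item(s)`,
};

// Walk the groups in the order the classifier returned them.
function planFixes(groups: SketchGroup[]): string[] {
  return groups.map((group) => planHandlers[group.recommendation](group));
}

const plan = planFixes([
  { recommendation: "direct", items: [{ type: "comment-reply", file: "a.ts" }] },
  {
    recommendation: "delegate-fixer",
    items: [{ type: "comment-reply" }, { type: "comment-reply" }],
  },
]);
console.log(plan);
```

Keying the handlers by a `Record` over the recommendation union means the compiler flags a missing handler if a new recommendation value is ever added to the type.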

package/skills/claude/shepherd/references/fix-strategies.md CHANGED

@@ -4,14 +4,16 @@ How to address common issues found during shepherd assessment.
 ## Decision: Fix Directly vs. Delegate
-| Condition | Approach |
-|-----------|----------|
-| Single file, < 20 lines changed | Fix directly in the stack branch |
-| Multiple files, contained concern | Fix directly if < 5 files |
-| Cross-cutting or architectural | Route to `/exarchos:delegate --fixes` for subagent dispatch |
-| Test changes needed | Fix directly (keep TDD cycle tight) |
+The `classify_review_items` orchestrate action owns this decision (#1159).
+Pass it the `actionItems` from `assess_stack` and consume the
+`recommendation` field on each returned group:
-**Default to fixing directly** — delegation adds overhead. Only delegate when the fix scope warrants it.
+- `direct` — handle inline in the shepherd loop
+- `delegate-fixer` — spawn the fixer subagent (batched / HIGH severity)
+- `delegate-scaffolder` — cheap scaffolder dispatch for doc nits
+Test changes still warrant inline handling regardless of recommendation —
+keep the TDD cycle tight rather than delegating test edits.
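
The rule that test changes are handled inline regardless of the classifier's recommendation can be sketched as a small override. This is an illustration under stated assumptions: `effectiveStrategy` and `TEST_FILE_PATTERN` are hypothetical, and the filename heuristic for spotting test files is a guess, not something the package documents.

```typescript
type Strategy = "direct" | "delegate-fixer" | "delegate-scaffolder";

// Assumed heuristic for recognising test files; the package may use another rule.
const TEST_FILE_PATTERN = /(\.test\.|\.spec\.|__tests__\/)/;

// Return "direct" whenever any touched file looks like a test file,
// otherwise keep the classifier's recommendation.
function effectiveStrategy(recommendation: Strategy, files: string[]): Strategy {
  const touchesTests = files.some((file) => TEST_FILE_PATTERN.test(file));
  return touchesTests ? "direct" : recommendation;
}

console.log(effectiveStrategy("delegate-fixer", ["src/review/parse.test.ts"]));
console.log(effectiveStrategy("delegate-scaffolder", ["docs/README.md"]));
```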
 ## Remediation Event Emission
@@ -154,52 +156,23 @@ For each comment, determine the appropriate response:
 | Already fixed (outdated) | Reply confirming | "Fixed in [commit/PR description] — [brief explanation]." |
 | False positive | Reply explaining | "[Explanation of why this doesn't apply in this context]." |
-### Sentry Comments
-Sentry's `[bot]` leaves **bug predictions** — AI-generated analysis of potential runtime issues. These appear as inline review comments with severity tags (CRITICAL, MEDIUM, etc.).
-**Sentry comments deserve careful attention because they often identify real bugs** (field name mismatches, type coercion issues, null reference risks).
-How to handle:
-1. Read the full comment body — Sentry includes a "Suggested Fix" section
-2. Evaluate whether the bug is real:
-   - Check if the code path is actually reachable
-   - Check if the field names/types match what the data actually provides
-   - Check existing tests — does any test exercise this path?
-3. If real: fix the bug, add a test, reply confirming
-4. If false positive: reply explaining why (e.g., "This path is guarded by X" or "The field is validated at Y before reaching this code")
-**Common Sentry findings:**
-- Field name mismatches between producers and consumers
-- Missing null checks on optional fields
-- Type mismatches (string vs. enum, array vs. object)
-- Unreachable error paths due to upstream validation
-### CodeRabbit Comments
-CodeRabbit leaves detailed code review suggestions with severity indicators. It re-reviews automatically on push, so code fixes may auto-resolve threads.
-How to handle:
-1. Read all CodeRabbit comments, noting severity (Critical, Major, Minor)
-2. Critical/Major: Must address — fix or provide strong rationale for not fixing
-3. Minor: Fix if low-effort, otherwise acknowledge
-4. CodeRabbit marks threads as "Addressed in commits" when it detects the code changed — but always verify with a reply
-**Common CodeRabbit findings:**
-- Error handling gaps (missing try/catch, bare catches)
-- Code duplication (DRY violations)
-- Style/naming suggestions
-- Performance optimizations
-- Security concerns
-### Human Reviewer Comments
-Human comments require the most careful handling:
-1. Read comments carefully — understand the full context
-2. For required changes: fix the code, reply confirming
-3. For questions: answer directly on the PR
-4. For suggestions: discuss or implement, reply with decision
-5. For approval with minor nits: fix nits, note the approval
+### Per-reviewer parsing (Sentry, CodeRabbit, Human, GitHub-Copilot)
+Severity normalization and per-reviewer comment parsing live in the
+provider adapters under `servers/exarchos-mcp/src/review/providers/` (#1159).
+`assess_stack` dispatches each PR comment through the adapter registry
+and attaches a normalized `ActionItem` (with `normalizedSeverity` and
+`reviewer` fields) to each unresolved comment. Use that signal when
+deciding response strategy below; you do not need to re-parse tier
+markers in the shepherd loop.
+If a *recognised* reviewer (e.g. CodeRabbit) ships a new severity tier
+that the adapter does not match, the `provider.unknown-tier` event
+surfaces the unrecognised tier marker for follow-up — the comment is
+processed as MEDIUM in the meantime. Unknown *reviewers* (authors that
+don't match any typed adapter) are routed silently to the `unknown`
+adapter and never trigger this event; their comments are also processed
+as MEDIUM by default.
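
The two fallback paths described in the added paragraph (unknown tier from a recognised reviewer versus unknown reviewer) can be sketched as follows. This is not the adapter code under `servers/exarchos-mcp/src/review/providers/`. The `knownTiers` table, the `normalizeSeverity` function, the `LOW` level, and the CodeRabbit tier mapping are all assumptions made for illustration. Only the MEDIUM default and the `provider.unknown-tier` event name come from the diff.

```typescript
// HIGH and MEDIUM appear in the diff; LOW is an assumed third level.
type NormalizedSeverity = "HIGH" | "MEDIUM" | "LOW";

interface NormalizeResult {
  severity: NormalizedSeverity;
  events: string[]; // names of events that would be emitted
}

// Hypothetical tier table keyed by reviewer; the mapping itself is a guess.
const knownTiers = new Map<string, Map<string, NormalizedSeverity>>([
  [
    "coderabbit",
    new Map<string, NormalizedSeverity>([
      ["critical", "HIGH"],
      ["major", "MEDIUM"],
      ["minor", "LOW"],
    ]),
  ],
]);

function normalizeSeverity(reviewer: string, tier: string): NormalizeResult {
  const tiers = knownTiers.get(reviewer);
  if (tiers === undefined) {
    // Unknown reviewer: processed as MEDIUM, and no event is raised.
    return { severity: "MEDIUM", events: [] };
  }
  const severity = tiers.get(tier.toLowerCase());
  if (severity === undefined) {
    // Recognised reviewer, unrecognised tier: MEDIUM plus the follow-up event.
    return { severity: "MEDIUM", events: ["provider.unknown-tier"] };
  }
  return { severity, events: [] };
}

console.log(normalizeSeverity("coderabbit", "Critical"));
console.log(normalizeSeverity("coderabbit", "Blocker"));
console.log(normalizeSeverity("some-human", "Critical"));
```

Both fallbacks yield MEDIUM, so the only observable difference between them is whether the event is emitted.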
 ### GitHub Actions Bot Comments

package/skills/codex/shepherd/SKILL.md CHANGED

@@ -93,7 +93,26 @@ Review the returned `actionItems` and `recommendation`:
 ### Step 2 — Fix
-Address each blocking action item from the assessment. Consult `references/fix-strategies.md` for detailed strategies per issue type.
+Before iterating over individual action items, classify them so the loop
+knows which to fix inline vs. delegate. Call `classify_review_items` on
+the assessment's `actionItems` (the comment-reply subset is what the
+classifier groups by file; CI-fix and review-address items are passed
+through unchanged):
+```typescript
+mcp__exarchos__exarchos_orchestrate({
+  action: "classify_review_items",
+  featureId: "<id>",
+  actionItems: <actionItems from assess_stack>
+})
+```
+The result returns `groups: ClassificationGroup[]` with a `recommendation`
+per group: `direct` (handle inline), `delegate-fixer` (spawn the fixer
+subagent for batched/HIGH-severity work), or `delegate-scaffolder`
+(cheap subagent for doc nits). Iterate the groups in order, applying
+per-group strategy, then consult `references/fix-strategies.md` for
+detailed per-issue-type instructions.
 **Remediation event protocol (FLYWHEEL):**
@@ -130,7 +149,7 @@ These events feed `selfCorrectionRate` and `avgRemediationAttempts` metrics in C
 | Type | Strategy |
 |------|----------|
 | `ci-fix` | Read logs, reproduce locally, fix, commit to stack branch |
-| `comment-reply` | Read context from `actionItem.context`, compose response, post via GitHub MCP |
+| `comment-reply` | Use `actionItem.reviewer`, `normalizedSeverity`, `file`, `line`, and `raw` (full original comment) to compose a response. Provider adapters under `servers/exarchos-mcp/src/review/providers/` populate the input fields per #1159 — no manual tier parsing needed. **Posting:** PR-level summary comments use the provider-agnostic `add_pr_comment` orchestrate action; per-thread inline replies currently require the platform-specific MCP (e.g. `mcp__plugin_github_github__add_reply_to_pull_request_comment` for GitHub) until `VcsProvider` gains a thread-reply primitive — see [#1165](https://github.com/lvlup-sw/exarchos/issues/1165) for tracking. |
 | `review-address` | Fix code for CHANGES_REQUESTED, reply to each thread |
 | `restack` | Run `git rebase origin/<base>`, verify with `exarchos_orchestrate({ action: "list_prs" })` |
 | `escalate` | Consult `references/escalation-criteria.md` |

package/skills/codex/shepherd/references/fix-strategies.md CHANGED

@@ -4,14 +4,16 @@ How to address common issues found during shepherd assessment.
 ## Decision: Fix Directly vs. Delegate
-| Condition | Approach |
-|-----------|----------|
-| Single file, < 20 lines changed | Fix directly in the stack branch |
-| Multiple files, contained concern | Fix directly if < 5 files |
-| Cross-cutting or architectural | Route to `/exarchos:delegate --fixes` for subagent dispatch |
-| Test changes needed | Fix directly (keep TDD cycle tight) |
+The `classify_review_items` orchestrate action owns this decision (#1159).
+Pass it the `actionItems` from `assess_stack` and consume the
+`recommendation` field on each returned group:
-**Default to fixing directly** — delegation adds overhead. Only delegate when the fix scope warrants it.
+- `direct` — handle inline in the shepherd loop
+- `delegate-fixer` — spawn the fixer subagent (batched / HIGH severity)
+- `delegate-scaffolder` — cheap scaffolder dispatch for doc nits
+Test changes still warrant inline handling regardless of recommendation —
+keep the TDD cycle tight rather than delegating test edits.
 ## Remediation Event Emission
@@ -154,52 +156,23 @@ For each comment, determine the appropriate response:
 | Already fixed (outdated) | Reply confirming | "Fixed in [commit/PR description] — [brief explanation]." |
 | False positive | Reply explaining | "[Explanation of why this doesn't apply in this context]." |
-### Sentry Comments
-Sentry's `[bot]` leaves **bug predictions** — AI-generated analysis of potential runtime issues. These appear as inline review comments with severity tags (CRITICAL, MEDIUM, etc.).
-**Sentry comments deserve careful attention because they often identify real bugs** (field name mismatches, type coercion issues, null reference risks).
-How to handle:
-1. Read the full comment body — Sentry includes a "Suggested Fix" section
-2. Evaluate whether the bug is real:
-   - Check if the code path is actually reachable
-   - Check if the field names/types match what the data actually provides
-   - Check existing tests — does any test exercise this path?
-3. If real: fix the bug, add a test, reply confirming
-4. If false positive: reply explaining why (e.g., "This path is guarded by X" or "The field is validated at Y before reaching this code")
-**Common Sentry findings:**
-- Field name mismatches between producers and consumers
-- Missing null checks on optional fields
-- Type mismatches (string vs. enum, array vs. object)
-- Unreachable error paths due to upstream validation
-### CodeRabbit Comments
-CodeRabbit leaves detailed code review suggestions with severity indicators. It re-reviews automatically on push, so code fixes may auto-resolve threads.
-How to handle:
-1. Read all CodeRabbit comments, noting severity (Critical, Major, Minor)
-2. Critical/Major: Must address — fix or provide strong rationale for not fixing
-3. Minor: Fix if low-effort, otherwise acknowledge
-4. CodeRabbit marks threads as "Addressed in commits" when it detects the code changed — but always verify with a reply
-**Common CodeRabbit findings:**
-- Error handling gaps (missing try/catch, bare catches)
-- Code duplication (DRY violations)
-- Style/naming suggestions
-- Performance optimizations
-- Security concerns
-### Human Reviewer Comments
-Human comments require the most careful handling:
-1. Read comments carefully — understand the full context
-2. For required changes: fix the code, reply confirming
-3. For questions: answer directly on the PR
-4. For suggestions: discuss or implement, reply with decision
-5. For approval with minor nits: fix nits, note the approval
+### Per-reviewer parsing (Sentry, CodeRabbit, Human, GitHub-Copilot)
+Severity normalization and per-reviewer comment parsing live in the
+provider adapters under `servers/exarchos-mcp/src/review/providers/` (#1159).
+`assess_stack` dispatches each PR comment through the adapter registry
+and attaches a normalized `ActionItem` (with `normalizedSeverity` and
+`reviewer` fields) to each unresolved comment. Use that signal when
+deciding response strategy below; you do not need to re-parse tier
+markers in the shepherd loop.
+If a *recognised* reviewer (e.g. CodeRabbit) ships a new severity tier
+that the adapter does not match, the `provider.unknown-tier` event
+surfaces the unrecognised tier marker for follow-up — the comment is
+processed as MEDIUM in the meantime. Unknown *reviewers* (authors that
+don't match any typed adapter) are routed silently to the `unknown`
+adapter and never trigger this event; their comments are also processed
+as MEDIUM by default.
 ### GitHub Actions Bot Comments

package/skills/copilot/shepherd/SKILL.md CHANGED

@@ -93,7 +93,26 @@ Review the returned `actionItems` and `recommendation`:
 ### Step 2 — Fix
-Address each blocking action item from the assessment. Consult `references/fix-strategies.md` for detailed strategies per issue type.
+Before iterating over individual action items, classify them so the loop
+knows which to fix inline vs. delegate. Call `classify_review_items` on
+the assessment's `actionItems` (the comment-reply subset is what the
+classifier groups by file; CI-fix and review-address items are passed
+through unchanged):
+```typescript
+mcp__exarchos__exarchos_orchestrate({
+  action: "classify_review_items",
+  featureId: "<id>",
+  actionItems: <actionItems from assess_stack>
+})
+```
+The result returns `groups: ClassificationGroup[]` with a `recommendation`
+per group: `direct` (handle inline), `delegate-fixer` (spawn the fixer
+subagent for batched/HIGH-severity work), or `delegate-scaffolder`
+(cheap subagent for doc nits). Iterate the groups in order, applying
+per-group strategy, then consult `references/fix-strategies.md` for
+detailed per-issue-type instructions.
 **Remediation event protocol (FLYWHEEL):**
@@ -130,7 +149,7 @@ These events feed `selfCorrectionRate` and `avgRemediationAttempts` metrics in C
 | Type | Strategy |
 |------|----------|
 | `ci-fix` | Read logs, reproduce locally, fix, commit to stack branch |
-| `comment-reply` | Read context from `actionItem.context`, compose response, post via GitHub MCP |
+| `comment-reply` | Use `actionItem.reviewer`, `normalizedSeverity`, `file`, `line`, and `raw` (full original comment) to compose a response. Provider adapters under `servers/exarchos-mcp/src/review/providers/` populate the input fields per #1159 — no manual tier parsing needed. **Posting:** PR-level summary comments use the provider-agnostic `add_pr_comment` orchestrate action; per-thread inline replies currently require the platform-specific MCP (e.g. `mcp__plugin_github_github__add_reply_to_pull_request_comment` for GitHub) until `VcsProvider` gains a thread-reply primitive — see [#1165](https://github.com/lvlup-sw/exarchos/issues/1165) for tracking. |
 | `review-address` | Fix code for CHANGES_REQUESTED, reply to each thread |
 | `restack` | Run `git rebase origin/<base>`, verify with `exarchos_orchestrate({ action: "list_prs" })` |
 | `escalate` | Consult `references/escalation-criteria.md` |

package/skills/copilot/shepherd/references/fix-strategies.md CHANGED

@@ -4,14 +4,16 @@ How to address common issues found during shepherd assessment.
 ## Decision: Fix Directly vs. Delegate
-| Condition | Approach |
-|-----------|----------|
-| Single file, < 20 lines changed | Fix directly in the stack branch |
-| Multiple files, contained concern | Fix directly if < 5 files |
-| Cross-cutting or architectural | Route to `/exarchos:delegate --fixes` for subagent dispatch |
-| Test changes needed | Fix directly (keep TDD cycle tight) |
+The `classify_review_items` orchestrate action owns this decision (#1159).
+Pass it the `actionItems` from `assess_stack` and consume the
+`recommendation` field on each returned group:
-**Default to fixing directly** — delegation adds overhead. Only delegate when the fix scope warrants it.
+- `direct` — handle inline in the shepherd loop
+- `delegate-fixer` — spawn the fixer subagent (batched / HIGH severity)
+- `delegate-scaffolder` — cheap scaffolder dispatch for doc nits
+Test changes still warrant inline handling regardless of recommendation —
+keep the TDD cycle tight rather than delegating test edits.
 ## Remediation Event Emission
@@ -154,52 +156,23 @@ For each comment, determine the appropriate response:
 | Already fixed (outdated) | Reply confirming | "Fixed in [commit/PR description] — [brief explanation]." |
 | False positive | Reply explaining | "[Explanation of why this doesn't apply in this context]." |
-### Sentry Comments
-Sentry's `[bot]` leaves **bug predictions** — AI-generated analysis of potential runtime issues. These appear as inline review comments with severity tags (CRITICAL, MEDIUM, etc.).
-**Sentry comments deserve careful attention because they often identify real bugs** (field name mismatches, type coercion issues, null reference risks).
-How to handle:
-1. Read the full comment body — Sentry includes a "Suggested Fix" section
-2. Evaluate whether the bug is real:
-   - Check if the code path is actually reachable
-   - Check if the field names/types match what the data actually provides
-   - Check existing tests — does any test exercise this path?
-3. If real: fix the bug, add a test, reply confirming
-4. If false positive: reply explaining why (e.g., "This path is guarded by X" or "The field is validated at Y before reaching this code")
-**Common Sentry findings:**
-- Field name mismatches between producers and consumers
-- Missing null checks on optional fields
-- Type mismatches (string vs. enum, array vs. object)
-- Unreachable error paths due to upstream validation
-### CodeRabbit Comments
-CodeRabbit leaves detailed code review suggestions with severity indicators. It re-reviews automatically on push, so code fixes may auto-resolve threads.
-How to handle:
-1. Read all CodeRabbit comments, noting severity (Critical, Major, Minor)
-2. Critical/Major: Must address — fix or provide strong rationale for not fixing
-3. Minor: Fix if low-effort, otherwise acknowledge
-4. CodeRabbit marks threads as "Addressed in commits" when it detects the code changed — but always verify with a reply
-**Common CodeRabbit findings:**
-- Error handling gaps (missing try/catch, bare catches)
-- Code duplication (DRY violations)
-- Style/naming suggestions
-- Performance optimizations
-- Security concerns
-### Human Reviewer Comments
-Human comments require the most careful handling:
-1. Read comments carefully — understand the full context
-2. For required changes: fix the code, reply confirming
-3. For questions: answer directly on the PR
-4. For suggestions: discuss or implement, reply with decision
-5. For approval with minor nits: fix nits, note the approval
+### Per-reviewer parsing (Sentry, CodeRabbit, Human, GitHub-Copilot)
+Severity normalization and per-reviewer comment parsing live in the
+provider adapters under `servers/exarchos-mcp/src/review/providers/` (#1159).
+`assess_stack` dispatches each PR comment through the adapter registry
+and attaches a normalized `ActionItem` (with `normalizedSeverity` and
+`reviewer` fields) to each unresolved comment. Use that signal when
+deciding response strategy below; you do not need to re-parse tier
+markers in the shepherd loop.
+If a *recognised* reviewer (e.g. CodeRabbit) ships a new severity tier
+that the adapter does not match, the `provider.unknown-tier` event
+surfaces the unrecognised tier marker for follow-up — the comment is
+processed as MEDIUM in the meantime. Unknown *reviewers* (authors that
+don't match any typed adapter) are routed silently to the `unknown`
+adapter and never trigger this event; their comments are also processed
+as MEDIUM by default.
 ### GitHub Actions Bot Comments

package/skills/cursor/shepherd/SKILL.md CHANGED

@@ -93,7 +93,26 @@ Review the returned `actionItems` and `recommendation`:
 ### Step 2 — Fix
-Address each blocking action item from the assessment. Consult `references/fix-strategies.md` for detailed strategies per issue type.
+Before iterating over individual action items, classify them so the loop
+knows which to fix inline vs. delegate. Call `classify_review_items` on
+the assessment's `actionItems` (the comment-reply subset is what the
+classifier groups by file; CI-fix and review-address items are passed
+through unchanged):
+```typescript
+mcp__exarchos__exarchos_orchestrate({
+  action: "classify_review_items",
+  featureId: "<id>",
+  actionItems: <actionItems from assess_stack>
+})
+```
+The result returns `groups: ClassificationGroup[]` with a `recommendation`
+per group: `direct` (handle inline), `delegate-fixer` (spawn the fixer
+subagent for batched/HIGH-severity work), or `delegate-scaffolder`
+(cheap subagent for doc nits). Iterate the groups in order, applying
+per-group strategy, then consult `references/fix-strategies.md` for
+detailed per-issue-type instructions.
 **Remediation event protocol (FLYWHEEL):**
@@ -130,7 +149,7 @@ These events feed `selfCorrectionRate` and `avgRemediationAttempts` metrics in C
 | Type | Strategy |
 |------|----------|
 | `ci-fix` | Read logs, reproduce locally, fix, commit to stack branch |
-| `comment-reply` | Read context from `actionItem.context`, compose response, post via GitHub MCP |
+| `comment-reply` | Use `actionItem.reviewer`, `normalizedSeverity`, `file`, `line`, and `raw` (full original comment) to compose a response. Provider adapters under `servers/exarchos-mcp/src/review/providers/` populate the input fields per #1159 — no manual tier parsing needed. **Posting:** PR-level summary comments use the provider-agnostic `add_pr_comment` orchestrate action; per-thread inline replies currently require the platform-specific MCP (e.g. `mcp__plugin_github_github__add_reply_to_pull_request_comment` for GitHub) until `VcsProvider` gains a thread-reply primitive — see [#1165](https://github.com/lvlup-sw/exarchos/issues/1165) for tracking. |
 | `review-address` | Fix code for CHANGES_REQUESTED, reply to each thread |
 | `restack` | Run `git rebase origin/<base>`, verify with `exarchos_orchestrate({ action: "list_prs" })` |
 | `escalate` | Consult `references/escalation-criteria.md` |

package/skills/cursor/shepherd/references/fix-strategies.md CHANGED

@@ -4,14 +4,16 @@ How to address common issues found during shepherd assessment.
 ## Decision: Fix Directly vs. Delegate
-| Condition | Approach |
-|-----------|----------|
-| Single file, < 20 lines changed | Fix directly in the stack branch |
-| Multiple files, contained concern | Fix directly if < 5 files |
-| Cross-cutting or architectural | Route to `/exarchos:delegate --fixes` for subagent dispatch |
-| Test changes needed | Fix directly (keep TDD cycle tight) |
+The `classify_review_items` orchestrate action owns this decision (#1159).
+Pass it the `actionItems` from `assess_stack` and consume the
+`recommendation` field on each returned group:
-**Default to fixing directly** — delegation adds overhead. Only delegate when the fix scope warrants it.
+- `direct` — handle inline in the shepherd loop
+- `delegate-fixer` — spawn the fixer subagent (batched / HIGH severity)
+- `delegate-scaffolder` — cheap scaffolder dispatch for doc nits
+Test changes still warrant inline handling regardless of recommendation —
+keep the TDD cycle tight rather than delegating test edits.
 ## Remediation Event Emission
@@ -154,52 +156,23 @@ For each comment, determine the appropriate response:
 | Already fixed (outdated) | Reply confirming | "Fixed in [commit/PR description] — [brief explanation]." |
 | False positive | Reply explaining | "[Explanation of why this doesn't apply in this context]." |
-### Sentry Comments
-Sentry's `[bot]` leaves **bug predictions** — AI-generated analysis of potential runtime issues. These appear as inline review comments with severity tags (CRITICAL, MEDIUM, etc.).
-**Sentry comments deserve careful attention because they often identify real bugs** (field name mismatches, type coercion issues, null reference risks).
-How to handle:
-1. Read the full comment body — Sentry includes a "Suggested Fix" section
-2. Evaluate whether the bug is real:
-   - Check if the code path is actually reachable
-   - Check if the field names/types match what the data actually provides
-   - Check existing tests — does any test exercise this path?
-3. If real: fix the bug, add a test, reply confirming
-4. If false positive: reply explaining why (e.g., "This path is guarded by X" or "The field is validated at Y before reaching this code")
-**Common Sentry findings:**
-- Field name mismatches between producers and consumers
-- Missing null checks on optional fields
-- Type mismatches (string vs. enum, array vs. object)
-- Unreachable error paths due to upstream validation
-### CodeRabbit Comments
-CodeRabbit leaves detailed code review suggestions with severity indicators. It re-reviews automatically on push, so code fixes may auto-resolve threads.
-How to handle:
-1. Read all CodeRabbit comments, noting severity (Critical, Major, Minor)
-2. Critical/Major: Must address — fix or provide strong rationale for not fixing
-3. Minor: Fix if low-effort, otherwise acknowledge
-4. CodeRabbit marks threads as "Addressed in commits" when it detects the code changed — but always verify with a reply
-**Common CodeRabbit findings:**
-- Error handling gaps (missing try/catch, bare catches)
-- Code duplication (DRY violations)
-- Style/naming suggestions
-- Performance optimizations
-- Security concerns
-### Human Reviewer Comments
-Human comments require the most careful handling:
-1. Read comments carefully — understand the full context
-2. For required changes: fix the code, reply confirming
-3. For questions: answer directly on the PR
-4. For suggestions: discuss or implement, reply with decision
-5. For approval with minor nits: fix nits, note the approval
+### Per-reviewer parsing (Sentry, CodeRabbit, Human, GitHub-Copilot)
+Severity normalization and per-reviewer comment parsing live in the
+provider adapters under `servers/exarchos-mcp/src/review/providers/` (#1159).
+`assess_stack` dispatches each PR comment through the adapter registry
+and attaches a normalized `ActionItem` (with `normalizedSeverity` and
+`reviewer` fields) to each unresolved comment. Use that signal when
+deciding response strategy below; you do not need to re-parse tier
+markers in the shepherd loop.
+If a *recognised* reviewer (e.g. CodeRabbit) ships a new severity tier
+that the adapter does not match, the `provider.unknown-tier` event
+surfaces the unrecognised tier marker for follow-up — the comment is
+processed as MEDIUM in the meantime. Unknown *reviewers* (authors that
+don't match any typed adapter) are routed silently to the `unknown`
+adapter and never trigger this event; their comments are also processed
+as MEDIUM by default.
 ### GitHub Actions Bot Comments

package/skills/generic/shepherd/SKILL.md CHANGED

@@ -93,7 +93,26 @@ Review the returned `actionItems` and `recommendation`:
 ### Step 2 — Fix
-Address each blocking action item from the assessment. Consult `references/fix-strategies.md` for detailed strategies per issue type.
+Before iterating over individual action items, classify them so the loop
+knows which to fix inline vs. delegate. Call `classify_review_items` on
+the assessment's `actionItems` (the comment-reply subset is what the
+classifier groups by file; CI-fix and review-address items are passed
+through unchanged):
+```typescript
+mcp__exarchos__exarchos_orchestrate({
+  action: "classify_review_items",
+  featureId: "<id>",
+  actionItems: <actionItems from assess_stack>
+})
+```
+The result returns `groups: ClassificationGroup[]` with a `recommendation`
+per group: `direct` (handle inline), `delegate-fixer` (spawn the fixer
+subagent for batched/HIGH-severity work), or `delegate-scaffolder`
+(cheap subagent for doc nits). Iterate the groups in order, applying
+per-group strategy, then consult `references/fix-strategies.md` for
+detailed per-issue-type instructions.
 **Remediation event protocol (FLYWHEEL):**
@@ -130,7 +149,7 @@ These events feed `selfCorrectionRate` and `avgRemediationAttempts` metrics in C
 | Type | Strategy |
 |------|----------|
 | `ci-fix` | Read logs, reproduce locally, fix, commit to stack branch |
-| `comment-reply` | Read context from `actionItem.context`, compose response, post via GitHub MCP |
+| `comment-reply` | Use `actionItem.reviewer`, `normalizedSeverity`, `file`, `line`, and `raw` (full original comment) to compose a response. Provider adapters under `servers/exarchos-mcp/src/review/providers/` populate the input fields per #1159 — no manual tier parsing needed. **Posting:** PR-level summary comments use the provider-agnostic `add_pr_comment` orchestrate action; per-thread inline replies currently require the platform-specific MCP (e.g. `mcp__plugin_github_github__add_reply_to_pull_request_comment` for GitHub) until `VcsProvider` gains a thread-reply primitive — see [#1165](https://github.com/lvlup-sw/exarchos/issues/1165) for tracking. |
 | `review-address` | Fix code for CHANGES_REQUESTED, reply to each thread |
 | `restack` | Run `git rebase origin/<base>`, verify with `exarchos_orchestrate({ action: "list_prs" })` |
 | `escalate` | Consult `references/escalation-criteria.md` |