npm - hatch3r - Versions diffs - 1.3.0 → 1.5.0 - Mend

hatch3r 1.3.0 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (175) hide show

package/README.md +12 -7
package/agents/hatch3r-a11y-auditor.md +18 -11
package/agents/hatch3r-architect.md +27 -12
package/agents/hatch3r-ci-watcher.md +30 -9
package/agents/hatch3r-context-rules.md +18 -8
package/agents/hatch3r-dependency-auditor.md +30 -15
package/agents/hatch3r-devops.md +18 -13
package/agents/hatch3r-docs-writer.md +33 -12
package/agents/hatch3r-fixer.md +46 -9
package/agents/hatch3r-implementer.md +21 -9
package/agents/hatch3r-learnings-loader.md +24 -7
package/agents/hatch3r-lint-fixer.md +18 -9
package/agents/hatch3r-perf-profiler.md +26 -10
package/agents/hatch3r-researcher.md +57 -919
package/agents/hatch3r-reviewer.md +29 -10
package/agents/hatch3r-security-auditor.md +25 -10
package/agents/hatch3r-test-writer.md +29 -9
package/agents/modes/architecture.md +1 -0
package/agents/modes/boundary-analysis.md +2 -1
package/agents/modes/codebase-impact.md +1 -0
package/agents/modes/complexity-risk.md +1 -0
package/agents/modes/coverage-analysis.md +1 -0
package/agents/modes/current-state.md +1 -0
package/agents/modes/feature-design.md +1 -0
package/agents/modes/impact-analysis.md +1 -0
package/agents/modes/library-docs.md +2 -1
package/agents/modes/migration-path.md +1 -0
package/agents/modes/prior-art.md +1 -0
package/agents/modes/refactoring-strategy.md +1 -0
package/agents/modes/regression.md +1 -0
package/agents/modes/requirements-elicitation.md +1 -0
package/agents/modes/risk-assessment.md +1 -0
package/agents/modes/risk-prioritization.md +1 -0
package/agents/modes/root-cause.md +1 -0
package/agents/modes/similar-implementation.md +2 -1
package/agents/modes/symptom-trace.md +1 -0
package/agents/modes/test-pattern.md +2 -1
package/agents/shared/external-knowledge.md +31 -0
package/agents/shared/quality-charter.md +96 -0
package/checks/README.md +1 -0
package/checks/accessibility.md +55 -0
package/commands/board/pickup-azure-devops.md +5 -0
package/commands/board/pickup-delegation-multi.md +9 -1
package/commands/board/pickup-delegation.md +4 -0
package/commands/board/pickup-github.md +5 -0
package/commands/board/pickup-gitlab.md +5 -0
package/commands/board/pickup-modes.md +1 -0
package/commands/board/pickup-post-impl.md +9 -1
package/commands/board/shared-azure-devops.md +14 -3
package/commands/board/shared-board-overview.md +1 -0
package/commands/board/shared-github.md +2 -0
package/commands/board/shared-gitlab.md +10 -2
package/commands/hatch3r-agent-customize.md +6 -1
package/commands/hatch3r-api-spec.md +1 -0
package/commands/hatch3r-benchmark.md +4 -3
package/commands/hatch3r-board-fill.md +52 -9
package/commands/hatch3r-board-groom.md +124 -7
package/commands/hatch3r-board-init.md +7 -3
package/commands/hatch3r-board-pickup.md +1 -0
package/commands/hatch3r-board-refresh.md +1 -0
package/commands/hatch3r-board-shared.md +71 -5
package/commands/hatch3r-bug-plan.md +2 -1
package/commands/hatch3r-codebase-map.md +4 -3
package/commands/hatch3r-command-customize.md +6 -1
package/commands/hatch3r-context-health.md +1 -0
package/commands/hatch3r-cost-tracking.md +1 -0
package/commands/hatch3r-debug.md +4 -3
package/commands/hatch3r-dep-audit.md +3 -0
package/commands/hatch3r-feature-plan.md +3 -2
package/commands/hatch3r-healthcheck.md +1 -0
package/commands/hatch3r-hooks.md +6 -1
package/commands/hatch3r-learn.md +1 -0
package/commands/hatch3r-migration-plan.md +3 -2
package/commands/hatch3r-onboard.md +2 -1
package/commands/hatch3r-project-spec.md +4 -3
package/commands/hatch3r-quick-change.md +31 -3
package/commands/hatch3r-recipe.md +1 -0
package/commands/hatch3r-refactor-plan.md +2 -1
package/commands/hatch3r-release.md +4 -1
package/commands/hatch3r-revision.md +138 -17
package/commands/hatch3r-roadmap.md +5 -4
package/commands/hatch3r-rule-customize.md +5 -0
package/commands/hatch3r-security-audit.md +1 -0
package/commands/hatch3r-skill-customize.md +5 -0
package/commands/hatch3r-test-plan.md +3 -2
package/commands/hatch3r-workflow.md +15 -1
package/dist/cli/index.js +7595 -4548
package/dist/cli/index.js.map +1 -1
package/hooks/hatch3r-ci-failure.md +1 -0
package/hooks/hatch3r-file-save.md +1 -0
package/hooks/hatch3r-post-merge.md +1 -0
package/hooks/hatch3r-pre-commit.md +1 -0
package/hooks/hatch3r-pre-push.md +1 -0
package/hooks/hatch3r-session-start.md +1 -0
package/package.json +30 -12
package/rules/hatch3r-accessibility-standards.md +2 -1
package/rules/hatch3r-accessibility-standards.mdc +1 -1
package/rules/hatch3r-agent-orchestration-detail.md +207 -0
package/rules/hatch3r-agent-orchestration-detail.mdc +202 -0
package/rules/hatch3r-agent-orchestration.md +161 -318
package/rules/hatch3r-agent-orchestration.mdc +212 -154
package/rules/hatch3r-api-design.md +2 -1
package/rules/hatch3r-api-design.mdc +1 -1
package/rules/hatch3r-browser-verification.md +4 -2
package/rules/hatch3r-browser-verification.mdc +1 -0
package/rules/hatch3r-ci-cd.md +2 -1
package/rules/hatch3r-ci-cd.mdc +1 -1
package/rules/hatch3r-code-standards.md +15 -2
package/rules/hatch3r-code-standards.mdc +22 -2
package/rules/hatch3r-component-conventions.md +2 -1
package/rules/hatch3r-component-conventions.mdc +1 -1
package/rules/hatch3r-data-classification.md +2 -1
package/rules/hatch3r-data-classification.mdc +1 -1
package/rules/hatch3r-deep-context.md +26 -1
package/rules/hatch3r-deep-context.mdc +54 -8
package/rules/hatch3r-dependency-management.md +2 -1
package/rules/hatch3r-dependency-management.mdc +17 -5
package/rules/hatch3r-feature-flags.md +2 -0
package/rules/hatch3r-feature-flags.mdc +1 -0
package/rules/hatch3r-git-conventions.md +2 -1
package/rules/hatch3r-git-conventions.mdc +2 -1
package/rules/hatch3r-i18n.md +2 -1
package/rules/hatch3r-i18n.mdc +1 -1
package/rules/hatch3r-learning-consult.md +11 -1
package/rules/hatch3r-learning-consult.mdc +11 -1
package/rules/hatch3r-migrations.md +2 -1
package/rules/hatch3r-migrations.mdc +12 -1
package/rules/hatch3r-observability-logging.md +34 -0
package/rules/hatch3r-observability-logging.mdc +30 -0
package/rules/hatch3r-observability-metrics.md +74 -0
package/rules/hatch3r-observability-metrics.mdc +70 -0
package/rules/hatch3r-observability-tracing-detail.md +160 -0
package/rules/hatch3r-observability-tracing-detail.mdc +63 -0
package/rules/hatch3r-observability-tracing.md +86 -0
package/rules/hatch3r-observability-tracing.mdc +77 -0
package/rules/hatch3r-observability.md +9 -448
package/rules/hatch3r-observability.mdc +7 -159
package/rules/hatch3r-performance-budgets.md +2 -0
package/rules/hatch3r-performance-budgets.mdc +1 -0
package/rules/hatch3r-secrets-management.md +2 -1
package/rules/hatch3r-secrets-management.mdc +1 -1
package/rules/hatch3r-security-patterns.md +3 -2
package/rules/hatch3r-security-patterns.mdc +12 -1
package/rules/hatch3r-testing.md +12 -2
package/rules/hatch3r-testing.mdc +11 -2
package/rules/hatch3r-theming.md +3 -2
package/rules/hatch3r-theming.mdc +1 -1
package/rules/hatch3r-tooling-hierarchy.md +3 -2
package/rules/hatch3r-tooling-hierarchy.mdc +19 -5
package/skills/hatch3r-a11y-audit/SKILL.md +11 -4
package/skills/hatch3r-agent-customize/SKILL.md +5 -72
package/skills/hatch3r-api-spec/SKILL.md +9 -2
package/skills/hatch3r-architecture-review/SKILL.md +7 -0
package/skills/hatch3r-bug-fix/SKILL.md +16 -7
package/skills/hatch3r-ci-pipeline/SKILL.md +8 -1
package/skills/hatch3r-command-customize/SKILL.md +5 -62
package/skills/hatch3r-context-health/SKILL.md +23 -2
package/skills/hatch3r-cost-tracking/SKILL.md +16 -6
package/skills/hatch3r-customize/SKILL.md +124 -0
package/skills/hatch3r-dep-audit/SKILL.md +9 -2
package/skills/hatch3r-feature/SKILL.md +12 -4
package/skills/hatch3r-gh-agentic-workflows/SKILL.md +7 -0
package/skills/hatch3r-incident-response/SKILL.md +7 -0
package/skills/hatch3r-issue-workflow/SKILL.md +8 -1
package/skills/hatch3r-logical-refactor/SKILL.md +8 -1
package/skills/hatch3r-migration/SKILL.md +7 -0
package/skills/hatch3r-perf-audit/SKILL.md +9 -2
package/skills/hatch3r-pr-creation/SKILL.md +8 -1
package/skills/hatch3r-qa-validation/SKILL.md +8 -1
package/skills/hatch3r-recipe/SKILL.md +8 -1
package/skills/hatch3r-refactor/SKILL.md +10 -2
package/skills/hatch3r-release/SKILL.md +8 -1
package/skills/hatch3r-rule-customize/SKILL.md +5 -65
package/skills/hatch3r-skill-customize/SKILL.md +5 -62
package/skills/hatch3r-visual-refactor/SKILL.md +12 -5

package/hooks/hatch3r-ci-failure.md CHANGED Viewed

@@ -5,6 +5,7 @@ event: ci-failure
 agent: ci-watcher
 description: Diagnose CI pipeline failures
 tags: [core]
+quality_charter: agents/shared/quality-charter.md
 ---
 # Hook: ci-failure → ci-watcher

package/hooks/hatch3r-file-save.md CHANGED Viewed

@@ -6,6 +6,7 @@ agent: context-rules
 description: Activate context-specific rules on file save
 globs: "**/*.ts, **/*.tsx, **/*.js, **/*.jsx"
 tags: [core]
+quality_charter: agents/shared/quality-charter.md
 ---
 # Hook: file-save → context-rules

package/hooks/hatch3r-post-merge.md CHANGED Viewed

@@ -5,6 +5,7 @@ event: post-merge
 agent: ci-watcher
 description: Check CI pipeline status after merge
 tags: [core]
+quality_charter: agents/shared/quality-charter.md
 ---
 # Hook: post-merge → ci-watcher

package/hooks/hatch3r-pre-commit.md CHANGED Viewed

@@ -6,6 +6,7 @@ agent: lint-fixer
 description: Auto-fix lint and formatting issues before commit
 globs: "**/*.ts, **/*.tsx, **/*.js, **/*.jsx"
 tags: [core]
+quality_charter: agents/shared/quality-charter.md
 ---
 # Hook: pre-commit → lint-fixer

package/hooks/hatch3r-pre-push.md CHANGED Viewed

@@ -5,6 +5,7 @@ event: pre-push
 agent: security-auditor
 description: Scan for secrets and security issues before push
 tags: [core]
+quality_charter: agents/shared/quality-charter.md
 ---
 # Hook: pre-push → security-auditor

package/hooks/hatch3r-session-start.md CHANGED Viewed

@@ -5,6 +5,7 @@ event: session-start
 agent: learnings-loader
 description: Load relevant learnings at session start
 tags: [core]
+quality_charter: agents/shared/quality-charter.md
 ---
 # Hook: session-start → learnings-loader

package/package.json CHANGED Viewed

@@ -1,8 +1,14 @@
 {
   "name": "hatch3r",
-  "version": "1.3.0",
+  "version": "1.5.0",
   "description": "Battle-tested agentic coding setup framework. One command to hatch your agent stack -- agents, skills, rules, commands, and MCP for every major AI coding tool.",
   "type": "module",
+  "exports": {
+    ".": {
+      "import": "./dist/cli/index.js",
+      "types": "./dist/cli/index.d.ts"
+    }
+  },
   "bin": {
     "hatch3r": "./dist/cli/index.js"
   },
@@ -13,7 +19,8 @@
     "typecheck": "tsc --noEmit",
     "prepublishOnly": "npm run build",
     "test": "vitest run",
-    "test:watch": "vitest"
+    "test:watch": "vitest",
+    "lockfile:check": "lockfile-lint --path package-lock.json --type npm --allowed-hosts npm --validate-https"
   },
   "keywords": [
     "agents",
@@ -44,11 +51,15 @@
   "bugs": {
     "url": "https://github.com/hatch3r/hatch3r/issues"
   },
+  "publishConfig": {
+    "access": "public",
+    "registry": "https://registry.npmjs.org"
+  },
   "engines": {
     "node": ">=22.0.0"
   },
   "files": [
-    "dist/",
+    "dist/cli/",
     "agents/",
     "checks/",
     "commands/",
@@ -64,18 +75,25 @@
   "dependencies": {
     "boxen": "^8.0.1",
     "chalk": "^5.4.0",
-    "commander": "^13.0.0",
-    "inquirer": "^12.0.0",
+    "commander": "^14.0.3",
+    "inquirer": "^13.3.2",
     "ora": "^9.3.0",
-    "yaml": "^2.7.0"
+    "yaml": "^2.8.3"
   },
   "devDependencies": {
-    "@types/node": "^25.3.0",
-    "@vitest/coverage-v8": "^3.2.4",
-    "eslint": "^9.0.0",
+    "@types/node": "^25.5.0",
+    "@vitest/coverage-v8": "^4.1.2",
+    "eslint": "^10.1.0",
+    "lockfile-lint": "^5.0.0",
     "tsup": "^8.0.0",
-    "typescript": "^5.7.0",
-    "typescript-eslint": "^8.56.0",
-    "vitest": "^3.0.0"
+    "typescript": "^6.0.2",
+    "typescript-eslint": "^8.57.2",
+    "vitest": "^4.1.2"
+  },
+  "overrides": {
+    "flatted": "^3.4.2"
+  },
+  "comments": {
+    "overrides/flatted": "Pinned to >=3.4.2 to resolve security advisory in transitive eslint > flat-cache > flatted dependency"
   }
 }

package/rules/hatch3r-accessibility-standards.md CHANGED Viewed

@@ -2,8 +2,9 @@
 id: hatch3r-accessibility-standards
 type: rule
 description: Accessibility standards covering WCAG 2.2 AA compliance, keyboard navigation, screen readers, and ARIA patterns
-scope: always
+scope: "**/*.vue,**/*.jsx,**/*.tsx,**/*.svelte,**/components/**,**/*.html,**/*a11y*,**/*accessibility*"
 tags: [a11y]
+quality_charter: agents/shared/quality-charter.md
 ---
 # Accessibility Standards

package/rules/hatch3r-accessibility-standards.mdc CHANGED Viewed

@@ -1,6 +1,6 @@
 ---
 description: Accessibility standards covering WCAG 2.2 AA compliance, keyboard navigation, screen readers, and ARIA patterns
-alwaysApply: true
+globs: ["**/*.vue", "**/*.jsx", "**/*.tsx", "**/*.svelte", "**/components/**", "**/*.html", "**/*a11y*", "**/*accessibility*"]
 ---
 # Accessibility Standards

package/rules/hatch3r-agent-orchestration-detail.md ADDED Viewed

@@ -0,0 +1,207 @@
+---
+id: hatch3r-agent-orchestration-detail
+type: rule
+description: Extended orchestration reference — PipelineContext schemas, resilience protocols, observability integration, and auto-mode guardrails
+scope: conditional
+globs: "**/.agents/**,**/pipeline/**,**/*orchestrat*,**/*agent*"
+tags: [core]
+quality_charter: agents/shared/quality-charter.md
+---
+# Agent Orchestration — Extended Reference
+This is the on-demand companion to `hatch3r-agent-orchestration`. Load when you need detailed schemas, failure handling protocols, or guardrail specifications.
+## PipelineContext Schema
+The `PipelineContext` is the structured handoff object passed between pipeline phases. Each phase reads its inputs and writes its outputs to this context.
+```
+PipelineContext {
+  correlationId: string          // UUID v4, generated before Phase 1
+  taskType: "bug" | "feature" | "refactor" | "qa"
+  issueRef: string | null        // Issue number or null for plain chat
+  deepContextTier: 1 | 2 | 3    // From hatch3r-deep-context scoring
+  // Detected project type for specialist selection (Finding #56)
+  projectType?: {
+    languages: string[]          // From repo analysis (e.g., "typescript", "python", "go")
+    frameworks: string[]         // Detected frameworks (e.g., "next", "express")
+    isMonorepo: boolean
+    packageManager: string       // "npm" | "yarn" | "pnpm" | "bun" | "unknown"
+  }
+  // Phase 1 outputs (Research)
+  researchFindings: {
+    modes: string[]              // Researcher modes used
+    affectedFiles: string[]      // Files to create/modify/delete
+    blastRadius: string[]        // Downstream consumers
+    existingTests: string[]      // Test files covering affected code
+    dependencies: string[]       // Internal + external dependencies
+    conventions: object | null   // From similar-implementation mode
+    resolvedRequirements: object | null  // From requirements-elicitation
+  }
+  // Research gap flags from mid-implementation checkpoint (Finding #52)
+  researchGaps?: string[]        // Gaps identified during Phase 2
+  // Phase 2 outputs (Implementation)
+  implementationResult: {
+    filesChanged: string[]
+    testsWritten: string[]
+    status: "SUCCESS" | "PARTIAL" | "FAILED" | "SKIPPED" | "TIMEOUT"
+    reason: string | null
+  }
+  // Phase 3 outputs (Review)
+  reviewResult: {
+    iterations: number           // 1-3
+    finalVerdict: "CLEAN" | "UNRESOLVED"
+    findings: ReviewFinding[]
+    confirmationPassResult: "PASS" | "FAIL"
+  }
+  // Phase 4 outputs (Quality)
+  qualityResults: {
+    specialists: SpecialistResult[]
+    validationPass: {
+      testsPass: boolean
+      typecheckPass: boolean
+      fixAttempts: number
+      regressionsPersist: boolean
+    }
+  }
+  // Metadata
+  startedAt: string              // ISO-8601
+  completedAt: string | null
+  totalDuration: number | null   // milliseconds
+}
+```
+The TypeScript implementation of this schema with runtime validation is in `src/pipeline/pipelineContext.ts`. Use `validatePhaseTransition()` to verify context completeness before advancing between phases.
+## Resilience and Failure Handling
+### Phase Failure Protocols
+| Phase | Failure Mode | Protocol |
+|-------|-------------|----------|
+| Phase 1 (Research) | Researcher timeout | Proceed with partial findings; flag missing modes. |
+| Phase 1 (Research) | No relevant findings | Surface to user; ask whether to proceed with implementation. |
+| Phase 2 (Implementation) | Build/test failure | Attempt self-fix (max 2 iterations). Escalate to user if unresolved. |
+| Phase 2 (Implementation) | Scope creep detected | Halt. Surface deviation to user. Resume only with approval. |
+| Phase 3 (Review) | Max iterations (3) | Surface unresolved findings to user. Do not merge. |
+| Phase 3 (Review) | DESIGN_OBJECTION verdict | Terminate review loop immediately. Surface the objection and alternative approaches to the user for an architectural decision. Do not spawn fixer. |
+| Phase 3 (Review) | Fixer introduces regressions | Revert fixer changes. Surface original findings + regression to user. |
+| Phase 4 (Quality) | Specialist timeout | Log timeout. Continue with available results. Flag in output. |
+| Phase 4 (Quality) | Validation pass fails | Spawn fixer (max 2 attempts). Surface if unresolved. |
+### Subagent Error Recovery
+1. **Timeout:** Forward partial output. Mark status `TIMEOUT`. Continue pipeline.
+2. **Crash/no output:** Mark status `FAILED`. Log reason. Continue if non-blocking.
+3. **Conflicting outputs:** When two specialists disagree (e.g., security vs performance), escalate to user with both positions.
+4. **Resource exhaustion:** If context window is exhausted, summarize prior context and continue with summary.
+### Retry Policies
+- Subagent retries: 0 (spawn a new agent with adjusted prompt instead).
+- Phase retries: Phase 3 review loop retries up to 3 iterations. All other phases: 0 retries (escalate to user).
+- Never retry the same failed operation identically — adjust the prompt or approach.
+## Observability Integration
+### Structured Logging
+All pipeline events should produce structured log entries when the project has observability infrastructure:
+```
+{
+  "event": "pipeline.phase.start" | "pipeline.phase.end" | "subagent.spawn" | "subagent.complete",
+  "correlationId": "...",
+  "phase": 1-4,
+  "agent": "hatch3r-implementer",
+  "status": "SUCCESS" | "PARTIAL" | "FAILED" | "TIMEOUT",
+  "duration": 12345,
+  "metadata": {}
+}
+```
+### Metrics to Track
+| Metric | Description |
+|--------|-------------|
+| Pipeline duration | Total time from Phase 1 start to Phase 4 end |
+| Phase duration | Time per phase |
+| Review iterations | Number of Phase 3 review cycles |
+| Specialist invocations | Count of Phase 4 specialists launched |
+| Fix attempts | Number of fixer invocations across all phases |
+| Failure rate | Proportion of tasks not reaching SUCCESS |
+### Correlation ID Propagation
+The correlation ID generated before Phase 1 MUST be:
+- Included in every subagent prompt
+- Included in every structured log entry
+- Included in every status report and output
+- Used as the key for cross-referencing pipeline artifacts
+## Auto-Mode Guardrails
+When operating in unattended/auto mode (no human in the loop), enforce these guardrails after each phase:
+### Scope Containment
+- **File scope:** Only modify files identified in Phase 1 research + files discovered during implementation that are direct dependencies. No drive-by refactors.
+- **Dependency scope:** Do not add new external dependencies without explicit approval.
+- **Destructive operations:** Never execute `rm -rf`, `DROP TABLE`, force push, or other destructive operations in auto mode. Queue for human review.
+### Output Schema Compliance
+After each phase, validate that the output conforms to the expected PipelineContext schema fields. Missing required fields trigger a HALT.
+### Escalation Triggers
+Auto-mode MUST halt and surface to user when:
+1. A CRITICAL finding is detected in Phase 3.
+2. Phase 4 validation pass fails after 2 fix attempts.
+3. Any specialist reports FAILED status.
+4. Scope containment violation detected.
+5. Implementation touches more than 20 files (may indicate scope creep).
+### Budget Guards
+- **Token budget:** If cumulative subagent token usage exceeds 80% of estimated budget, surface to user before spawning additional agents.
+- **Time budget:** If pipeline duration exceeds 2x the estimated time (based on deep context tier), surface status and request continuation approval.
+## Adaptive Pipeline Behavior
+### Complexity-Driven Adaptation
+The pipeline should adapt its behavior based on observed task complexity, not just the initial tier assignment:
+| Signal During Execution | Adaptation |
+|------------------------|------------|
+| Phase 1 research finds >10 affected files (initial estimate was <5) | Upgrade tier to 3 if currently 2. Re-run `codebase-impact` at `deep` depth before Phase 2. |
+| Phase 2 implementer reports >3 research gaps | Pause Phase 2. Run targeted researcher with gap-specific modes before continuing. |
+| Phase 3 review loop reaches iteration 2 with increasing Critical count | Classify as complexity underestimate. Surface to user with recommendation to break the task into smaller sub-tasks. |
+| Phase 4 validation pass fails on first attempt | Check whether failure is in test-writer's new tests (expected -- fix test) or in pre-existing tests (regression -- fix implementation). Route to appropriate fixer. |
+### Post-Pipeline Learning
+After pipeline completion, the orchestrator should capture lessons for future runs:
+1. **Tier accuracy:** Was the initial tier correct? If the pipeline needed adaptation (above), log the mismatch for the learnings system.
+2. **Phase duration ratios:** Record time spent per phase. Anomalous ratios (e.g., Phase 3 taking 5x Phase 2) indicate systemic issues worth investigating.
+3. **Specialist value:** Record which Phase 4 specialists produced actionable findings vs. clean reports. Over time, this data informs smarter specialist dispatch.
+## Context Token Optimization
+When pipeline context exceeds 50% of the available context window, apply these compression strategies in order:
+1. **Summarize Phase 1 output.** Replace full research findings with a structured summary: affected files (list), blast radius (count + top 3), key conventions (bullet points). Keep raw data only for the fields the current phase needs.
+2. **Prune resolved findings.** After Phase 3 review loop, remove findings that were fixed and confirmed. Only carry forward unresolved findings.
+3. **Collapse specialist results.** In the final output, summarize specialist results as a single status table rather than including full specialist reports. Full reports are available on request.
+4. **Never truncate security findings.** Security auditor output is always included in full regardless of context pressure.
+These strategies preserve decision-critical information while reducing token overhead for long pipelines.

package/rules/hatch3r-agent-orchestration-detail.mdc ADDED Viewed

@@ -0,0 +1,202 @@
+---
+description: Extended orchestration reference — PipelineContext schemas, resilience protocols, observability integration, and auto-mode guardrails
+globs: ["**/.agents/**", "**/pipeline/**", "**/*orchestrat*", "**/*agent*"]
+alwaysApply: false
+---
+# Agent Orchestration — Extended Reference
+This is the on-demand companion to `hatch3r-agent-orchestration`. Load when you need detailed schemas, failure handling protocols, or guardrail specifications.
+## PipelineContext Schema
+The `PipelineContext` is the structured handoff object passed between pipeline phases. Each phase reads its inputs and writes its outputs to this context.
+```
+PipelineContext {
+  correlationId: string          // UUID v4, generated before Phase 1
+  taskType: "bug" | "feature" | "refactor" | "qa"
+  issueRef: string | null        // Issue number or null for plain chat
+  deepContextTier: 1 | 2 | 3    // From hatch3r-deep-context scoring
+  // Detected project type for specialist selection (Finding #56)
+  projectType?: {
+    languages: string[]          // From repo analysis (e.g., "typescript", "python", "go")
+    frameworks: string[]         // Detected frameworks (e.g., "next", "express")
+    isMonorepo: boolean
+    packageManager: string       // "npm" | "yarn" | "pnpm" | "bun" | "unknown"
+  }
+  // Phase 1 outputs (Research)
+  researchFindings: {
+    modes: string[]              // Researcher modes used
+    affectedFiles: string[]      // Files to create/modify/delete
+    blastRadius: string[]        // Downstream consumers
+    existingTests: string[]      // Test files covering affected code
+    dependencies: string[]       // Internal + external dependencies
+    conventions: object | null   // From similar-implementation mode
+    resolvedRequirements: object | null  // From requirements-elicitation
+  }
+  // Research gap flags from mid-implementation checkpoint (Finding #52)
+  researchGaps?: string[]        // Gaps identified during Phase 2
+  // Phase 2 outputs (Implementation)
+  implementationResult: {
+    filesChanged: string[]
+    testsWritten: string[]
+    status: "SUCCESS" | "PARTIAL" | "FAILED" | "SKIPPED" | "TIMEOUT"
+    reason: string | null
+  }
+  // Phase 3 outputs (Review)
+  reviewResult: {
+    iterations: number           // 1-3
+    finalVerdict: "CLEAN" | "UNRESOLVED"
+    findings: ReviewFinding[]
+    confirmationPassResult: "PASS" | "FAIL"
+  }
+  // Phase 4 outputs (Quality)
+  qualityResults: {
+    specialists: SpecialistResult[]
+    validationPass: {
+      testsPass: boolean
+      typecheckPass: boolean
+      fixAttempts: number
+      regressionsPersist: boolean
+    }
+  }
+  // Metadata
+  startedAt: string              // ISO-8601
+  completedAt: string | null
+  totalDuration: number | null   // milliseconds
+}
+```
+The TypeScript implementation of this schema with runtime validation is in `src/pipeline/pipelineContext.ts`. Use `validatePhaseTransition()` to verify context completeness before advancing between phases.
+## Resilience and Failure Handling
+### Phase Failure Protocols
+| Phase | Failure Mode | Protocol |
+|-------|-------------|----------|
+| Phase 1 (Research) | Researcher timeout | Proceed with partial findings; flag missing modes. |
+| Phase 1 (Research) | No relevant findings | Surface to user; ask whether to proceed with implementation. |
+| Phase 2 (Implementation) | Build/test failure | Attempt self-fix (max 2 iterations). Escalate to user if unresolved. |
+| Phase 2 (Implementation) | Scope creep detected | Halt. Surface deviation to user. Resume only with approval. |
+| Phase 3 (Review) | Max iterations (3) | Surface unresolved findings to user. Do not merge. |
+| Phase 3 (Review) | Fixer introduces regressions | Revert fixer changes. Surface original findings + regression to user. |
+| Phase 4 (Quality) | Specialist timeout | Log timeout. Continue with available results. Flag in output. |
+| Phase 4 (Quality) | Validation pass fails | Spawn fixer (max 2 attempts). Surface if unresolved. |
+### Subagent Error Recovery
+1. **Timeout:** Forward partial output. Mark status `TIMEOUT`. Continue pipeline.
+2. **Crash/no output:** Mark status `FAILED`. Log reason. Continue if non-blocking.
+3. **Conflicting outputs:** When two specialists disagree (e.g., security vs performance), escalate to user with both positions.
+4. **Resource exhaustion:** If context window is exhausted, summarize prior context and continue with summary.
+### Retry Policies
+- Subagent retries: 0 (spawn a new agent with adjusted prompt instead).
+- Phase retries: Phase 3 review loop retries up to 3 iterations. All other phases: 0 retries (escalate to user).
+- Never retry the same failed operation identically — adjust the prompt or approach.
+## Observability Integration
+### Structured Logging
+All pipeline events should produce structured log entries when the project has observability infrastructure:
+```
+{
+  "event": "pipeline.phase.start" | "pipeline.phase.end" | "subagent.spawn" | "subagent.complete",
+  "correlationId": "...",
+  "phase": 1-4,
+  "agent": "hatch3r-implementer",
+  "status": "SUCCESS" | "PARTIAL" | "FAILED" | "TIMEOUT",
+  "duration": 12345,
+  "metadata": {}
+}
+```
+### Metrics to Track
+| Metric | Description |
+|--------|-------------|
+| Pipeline duration | Total time from Phase 1 start to Phase 4 end |
+| Phase duration | Time per phase |
+| Review iterations | Number of Phase 3 review cycles |
+| Specialist invocations | Count of Phase 4 specialists launched |
+| Fix attempts | Number of fixer invocations across all phases |
+| Failure rate | Proportion of tasks not reaching SUCCESS |
+### Correlation ID Propagation
+The correlation ID generated before Phase 1 MUST be:
+- Included in every subagent prompt
+- Included in every structured log entry
+- Included in every status report and output
+- Used as the key for cross-referencing pipeline artifacts
+## Auto-Mode Guardrails
+When operating in unattended/auto mode (no human in the loop), enforce these guardrails after each phase:
+### Scope Containment
+- **File scope:** Only modify files identified in Phase 1 research + files discovered during implementation that are direct dependencies. No drive-by refactors.
+- **Dependency scope:** Do not add new external dependencies without explicit approval.
+- **Destructive operations:** Never execute `rm -rf`, `DROP TABLE`, force push, or other destructive operations in auto mode. Queue for human review.
+### Output Schema Compliance
+After each phase, validate that the output conforms to the expected PipelineContext schema fields. Missing required fields trigger a HALT.
+### Escalation Triggers
+Auto-mode MUST halt and surface to user when:
+1. A CRITICAL finding is detected in Phase 3.
+2. Phase 4 validation pass fails after 2 fix attempts.
+3. Any specialist reports FAILED status.
+4. Scope containment violation detected.
+5. Implementation touches more than 20 files (may indicate scope creep).
+### Budget Guards
+- **Token budget:** If cumulative subagent token usage exceeds 80% of estimated budget, surface to user before spawning additional agents.
+- **Time budget:** If pipeline duration exceeds 2x the estimated time (based on deep context tier), surface status and request continuation approval.
+## Adaptive Pipeline Behavior
+### Complexity-Driven Adaptation
+The pipeline should adapt its behavior based on observed task complexity, not just the initial tier assignment:
+| Signal During Execution | Adaptation |
+|------------------------|------------|
+| Phase 1 research finds >10 affected files (initial estimate was <5) | Upgrade tier to 3 if currently 2. Re-run `codebase-impact` at `deep` depth before Phase 2. |
+| Phase 2 implementer reports >3 research gaps | Pause Phase 2. Run targeted researcher with gap-specific modes before continuing. |
+| Phase 3 review loop reaches iteration 2 with increasing Critical count | Classify as complexity underestimate. Surface to user with recommendation to break the task into smaller sub-tasks. |
+| Phase 4 validation pass fails on first attempt | Check whether failure is in test-writer's new tests (expected -- fix test) or in pre-existing tests (regression -- fix implementation). Route to appropriate fixer. |
+### Post-Pipeline Learning
+After pipeline completion, the orchestrator should capture lessons for future runs:
+1. **Tier accuracy:** Was the initial tier correct? If the pipeline needed adaptation (above), log the mismatch for the learnings system.
+2. **Phase duration ratios:** Record time spent per phase. Anomalous ratios (e.g., Phase 3 taking 5x Phase 2) indicate systemic issues worth investigating.
+3. **Specialist value:** Record which Phase 4 specialists produced actionable findings vs. clean reports. Over time, this data informs smarter specialist dispatch.
+## Context Token Optimization
+When pipeline context exceeds 50% of the available context window, apply these compression strategies in order:
+1. **Summarize Phase 1 output.** Replace full research findings with a structured summary: affected files (list), blast radius (count + top 3), key conventions (bullet points). Keep raw data only for the fields the current phase needs.
+2. **Prune resolved findings.** After Phase 3 review loop, remove findings that were fixed and confirmed. Only carry forward unresolved findings.
+3. **Collapse specialist results.** In the final output, summarize specialist results as a single status table rather than including full specialist reports. Full reports are available on request.
+4. **Never truncate security findings.** Security auditor output is always included in full regardless of context pressure.
+These strategies preserve decision-critical information while reducing token overhead for long pipelines.