hatch3r 1.3.0 → 1.5.0
- package/README.md +12 -7
- package/agents/hatch3r-a11y-auditor.md +18 -11
- package/agents/hatch3r-architect.md +27 -12
- package/agents/hatch3r-ci-watcher.md +30 -9
- package/agents/hatch3r-context-rules.md +18 -8
- package/agents/hatch3r-dependency-auditor.md +30 -15
- package/agents/hatch3r-devops.md +18 -13
- package/agents/hatch3r-docs-writer.md +33 -12
- package/agents/hatch3r-fixer.md +46 -9
- package/agents/hatch3r-implementer.md +21 -9
- package/agents/hatch3r-learnings-loader.md +24 -7
- package/agents/hatch3r-lint-fixer.md +18 -9
- package/agents/hatch3r-perf-profiler.md +26 -10
- package/agents/hatch3r-researcher.md +57 -919
- package/agents/hatch3r-reviewer.md +29 -10
- package/agents/hatch3r-security-auditor.md +25 -10
- package/agents/hatch3r-test-writer.md +29 -9
- package/agents/modes/architecture.md +1 -0
- package/agents/modes/boundary-analysis.md +2 -1
- package/agents/modes/codebase-impact.md +1 -0
- package/agents/modes/complexity-risk.md +1 -0
- package/agents/modes/coverage-analysis.md +1 -0
- package/agents/modes/current-state.md +1 -0
- package/agents/modes/feature-design.md +1 -0
- package/agents/modes/impact-analysis.md +1 -0
- package/agents/modes/library-docs.md +2 -1
- package/agents/modes/migration-path.md +1 -0
- package/agents/modes/prior-art.md +1 -0
- package/agents/modes/refactoring-strategy.md +1 -0
- package/agents/modes/regression.md +1 -0
- package/agents/modes/requirements-elicitation.md +1 -0
- package/agents/modes/risk-assessment.md +1 -0
- package/agents/modes/risk-prioritization.md +1 -0
- package/agents/modes/root-cause.md +1 -0
- package/agents/modes/similar-implementation.md +2 -1
- package/agents/modes/symptom-trace.md +1 -0
- package/agents/modes/test-pattern.md +2 -1
- package/agents/shared/external-knowledge.md +31 -0
- package/agents/shared/quality-charter.md +96 -0
- package/checks/README.md +1 -0
- package/checks/accessibility.md +55 -0
- package/commands/board/pickup-azure-devops.md +5 -0
- package/commands/board/pickup-delegation-multi.md +9 -1
- package/commands/board/pickup-delegation.md +4 -0
- package/commands/board/pickup-github.md +5 -0
- package/commands/board/pickup-gitlab.md +5 -0
- package/commands/board/pickup-modes.md +1 -0
- package/commands/board/pickup-post-impl.md +9 -1
- package/commands/board/shared-azure-devops.md +14 -3
- package/commands/board/shared-board-overview.md +1 -0
- package/commands/board/shared-github.md +2 -0
- package/commands/board/shared-gitlab.md +10 -2
- package/commands/hatch3r-agent-customize.md +6 -1
- package/commands/hatch3r-api-spec.md +1 -0
- package/commands/hatch3r-benchmark.md +4 -3
- package/commands/hatch3r-board-fill.md +52 -9
- package/commands/hatch3r-board-groom.md +124 -7
- package/commands/hatch3r-board-init.md +7 -3
- package/commands/hatch3r-board-pickup.md +1 -0
- package/commands/hatch3r-board-refresh.md +1 -0
- package/commands/hatch3r-board-shared.md +71 -5
- package/commands/hatch3r-bug-plan.md +2 -1
- package/commands/hatch3r-codebase-map.md +4 -3
- package/commands/hatch3r-command-customize.md +6 -1
- package/commands/hatch3r-context-health.md +1 -0
- package/commands/hatch3r-cost-tracking.md +1 -0
- package/commands/hatch3r-debug.md +4 -3
- package/commands/hatch3r-dep-audit.md +3 -0
- package/commands/hatch3r-feature-plan.md +3 -2
- package/commands/hatch3r-healthcheck.md +1 -0
- package/commands/hatch3r-hooks.md +6 -1
- package/commands/hatch3r-learn.md +1 -0
- package/commands/hatch3r-migration-plan.md +3 -2
- package/commands/hatch3r-onboard.md +2 -1
- package/commands/hatch3r-project-spec.md +4 -3
- package/commands/hatch3r-quick-change.md +31 -3
- package/commands/hatch3r-recipe.md +1 -0
- package/commands/hatch3r-refactor-plan.md +2 -1
- package/commands/hatch3r-release.md +4 -1
- package/commands/hatch3r-revision.md +138 -17
- package/commands/hatch3r-roadmap.md +5 -4
- package/commands/hatch3r-rule-customize.md +5 -0
- package/commands/hatch3r-security-audit.md +1 -0
- package/commands/hatch3r-skill-customize.md +5 -0
- package/commands/hatch3r-test-plan.md +3 -2
- package/commands/hatch3r-workflow.md +15 -1
- package/dist/cli/index.js +7595 -4548
- package/dist/cli/index.js.map +1 -1
- package/hooks/hatch3r-ci-failure.md +1 -0
- package/hooks/hatch3r-file-save.md +1 -0
- package/hooks/hatch3r-post-merge.md +1 -0
- package/hooks/hatch3r-pre-commit.md +1 -0
- package/hooks/hatch3r-pre-push.md +1 -0
- package/hooks/hatch3r-session-start.md +1 -0
- package/package.json +30 -12
- package/rules/hatch3r-accessibility-standards.md +2 -1
- package/rules/hatch3r-accessibility-standards.mdc +1 -1
- package/rules/hatch3r-agent-orchestration-detail.md +207 -0
- package/rules/hatch3r-agent-orchestration-detail.mdc +202 -0
- package/rules/hatch3r-agent-orchestration.md +161 -318
- package/rules/hatch3r-agent-orchestration.mdc +212 -154
- package/rules/hatch3r-api-design.md +2 -1
- package/rules/hatch3r-api-design.mdc +1 -1
- package/rules/hatch3r-browser-verification.md +4 -2
- package/rules/hatch3r-browser-verification.mdc +1 -0
- package/rules/hatch3r-ci-cd.md +2 -1
- package/rules/hatch3r-ci-cd.mdc +1 -1
- package/rules/hatch3r-code-standards.md +15 -2
- package/rules/hatch3r-code-standards.mdc +22 -2
- package/rules/hatch3r-component-conventions.md +2 -1
- package/rules/hatch3r-component-conventions.mdc +1 -1
- package/rules/hatch3r-data-classification.md +2 -1
- package/rules/hatch3r-data-classification.mdc +1 -1
- package/rules/hatch3r-deep-context.md +26 -1
- package/rules/hatch3r-deep-context.mdc +54 -8
- package/rules/hatch3r-dependency-management.md +2 -1
- package/rules/hatch3r-dependency-management.mdc +17 -5
- package/rules/hatch3r-feature-flags.md +2 -0
- package/rules/hatch3r-feature-flags.mdc +1 -0
- package/rules/hatch3r-git-conventions.md +2 -1
- package/rules/hatch3r-git-conventions.mdc +2 -1
- package/rules/hatch3r-i18n.md +2 -1
- package/rules/hatch3r-i18n.mdc +1 -1
- package/rules/hatch3r-learning-consult.md +11 -1
- package/rules/hatch3r-learning-consult.mdc +11 -1
- package/rules/hatch3r-migrations.md +2 -1
- package/rules/hatch3r-migrations.mdc +12 -1
- package/rules/hatch3r-observability-logging.md +34 -0
- package/rules/hatch3r-observability-logging.mdc +30 -0
- package/rules/hatch3r-observability-metrics.md +74 -0
- package/rules/hatch3r-observability-metrics.mdc +70 -0
- package/rules/hatch3r-observability-tracing-detail.md +160 -0
- package/rules/hatch3r-observability-tracing-detail.mdc +63 -0
- package/rules/hatch3r-observability-tracing.md +86 -0
- package/rules/hatch3r-observability-tracing.mdc +77 -0
- package/rules/hatch3r-observability.md +9 -448
- package/rules/hatch3r-observability.mdc +7 -159
- package/rules/hatch3r-performance-budgets.md +2 -0
- package/rules/hatch3r-performance-budgets.mdc +1 -0
- package/rules/hatch3r-secrets-management.md +2 -1
- package/rules/hatch3r-secrets-management.mdc +1 -1
- package/rules/hatch3r-security-patterns.md +3 -2
- package/rules/hatch3r-security-patterns.mdc +12 -1
- package/rules/hatch3r-testing.md +12 -2
- package/rules/hatch3r-testing.mdc +11 -2
- package/rules/hatch3r-theming.md +3 -2
- package/rules/hatch3r-theming.mdc +1 -1
- package/rules/hatch3r-tooling-hierarchy.md +3 -2
- package/rules/hatch3r-tooling-hierarchy.mdc +19 -5
- package/skills/hatch3r-a11y-audit/SKILL.md +11 -4
- package/skills/hatch3r-agent-customize/SKILL.md +5 -72
- package/skills/hatch3r-api-spec/SKILL.md +9 -2
- package/skills/hatch3r-architecture-review/SKILL.md +7 -0
- package/skills/hatch3r-bug-fix/SKILL.md +16 -7
- package/skills/hatch3r-ci-pipeline/SKILL.md +8 -1
- package/skills/hatch3r-command-customize/SKILL.md +5 -62
- package/skills/hatch3r-context-health/SKILL.md +23 -2
- package/skills/hatch3r-cost-tracking/SKILL.md +16 -6
- package/skills/hatch3r-customize/SKILL.md +124 -0
- package/skills/hatch3r-dep-audit/SKILL.md +9 -2
- package/skills/hatch3r-feature/SKILL.md +12 -4
- package/skills/hatch3r-gh-agentic-workflows/SKILL.md +7 -0
- package/skills/hatch3r-incident-response/SKILL.md +7 -0
- package/skills/hatch3r-issue-workflow/SKILL.md +8 -1
- package/skills/hatch3r-logical-refactor/SKILL.md +8 -1
- package/skills/hatch3r-migration/SKILL.md +7 -0
- package/skills/hatch3r-perf-audit/SKILL.md +9 -2
- package/skills/hatch3r-pr-creation/SKILL.md +8 -1
- package/skills/hatch3r-qa-validation/SKILL.md +8 -1
- package/skills/hatch3r-recipe/SKILL.md +8 -1
- package/skills/hatch3r-refactor/SKILL.md +10 -2
- package/skills/hatch3r-release/SKILL.md +8 -1
- package/skills/hatch3r-rule-customize/SKILL.md +5 -65
- package/skills/hatch3r-skill-customize/SKILL.md +5 -62
- package/skills/hatch3r-visual-refactor/SKILL.md +12 -5
|
@@ -4,30 +4,23 @@ type: rule
|
|
|
4
4
|
description: Mandatory agent delegation, skill loading, and subagent usage directives for ALL tasks in ALL contexts
|
|
5
5
|
scope: always
|
|
6
6
|
tags: [core]
|
|
7
|
+
quality_charter: agents/shared/quality-charter.md
|
|
7
8
|
---
|
|
8
9
|
# Agent Orchestration
|
|
9
10
|
|
|
10
|
-
This rule governs when and how to delegate work to hatch3r agents, load skills, and spawn subagents. These directives are mandatory — not suggestions.
|
|
11
|
+
This rule governs when and how to delegate work to hatch3r agents, load skills, and spawn subagents. These directives are mandatory — not suggestions. For extended reference on pipeline context schemas, resilience/failure handling, and observability, see `hatch3r-agent-orchestration-detail`.
|
|
11
12
|
|
|
12
13
|
## Orchestration Differentiation
|
|
13
14
|
|
|
14
|
-
Hatch3r's orchestration
|
|
15
|
+
Hatch3r's orchestration uses a **phase-gated pipeline** (Research, Implement, Review, Quality) with **structured handoffs** via `PipelineContext` and a **mandatory review gate** before the quality phase. This is not free-form agent chat.
|
|
15
16
|
|
|
16
17
|
## Universal Applicability
|
|
17
18
|
|
|
18
|
-
This rule applies to EVERY context without exception:
|
|
19
|
-
|
|
20
|
-
- **Board-pickup** (epic, sub-issue, standalone, batch)
|
|
21
|
-
- **Workflow command** (full mode and quick mode)
|
|
22
|
-
- **Plain chat** (single task or multiple tasks)
|
|
23
|
-
- **Issue references** (e.g., "implement #5")
|
|
24
|
-
- **Natural language requests** (e.g., "add a dark mode toggle")
|
|
25
|
-
|
|
26
|
-
Whether the user invokes a command or simply asks for a task in conversation, the full sub-agent pipeline defined below is mandatory. There is no context where implementing code inline (without sub-agents) is acceptable.
|
|
19
|
+
This rule applies to EVERY context without exception: board-pickup (epic, sub-issue, standalone, batch), workflow command (full/quick), plain chat, issue references, and natural language requests. The full sub-agent pipeline is mandatory — never implement code inline without sub-agents.
|
|
27
20
|
|
|
28
21
|
## Universal Sub-Agent Pipeline
|
|
29
22
|
|
|
30
|
-
Every task MUST follow this four-phase pipeline: **Phase 1 — Research** (
|
|
23
|
+
Every task MUST follow this four-phase pipeline: **Phase 1 — Research** (`hatch3r-researcher`), **Phase 2 — Implement** (`hatch3r-implementer`), **Phase 3 — Review Loop** (`hatch3r-reviewer` + `hatch3r-fixer`), **Phase 4 — Final Quality** (parallel specialists). See Mandatory Delegation Directives below.
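The phase gating described above can be sketched as a small ordering check. This is a minimal illustration, assuming a hypothetical `completePhase` helper and `PipelinePhase` type; neither name comes from hatch3r itself.

```typescript
// Hypothetical sketch: phases may only complete in pipeline order.
type PipelinePhase = "research" | "implement" | "review" | "quality";

const PHASE_ORDER: readonly PipelinePhase[] = ["research", "implement", "review", "quality"];

// Returns the completed list extended by `finished`, or throws when
// `finished` is not the next phase in order (for example, quality before review).
function completePhase(
  completed: readonly PipelinePhase[],
  finished: PipelinePhase,
): PipelinePhase[] {
  const expected = PHASE_ORDER[completed.length];
  if (expected !== finished) {
    throw new Error(`phase gate: expected ${String(expected)}, got ${finished}`);
  }
  return [...completed, finished];
}
```

Under this sketch, skipping the review phase fails loudly instead of letting Phase 4 start early.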
|
|
31
24
|
|
|
32
25
|
## Agent Roster
|
|
33
26
|
|
|
@@ -50,114 +43,140 @@ Every task MUST follow this four-phase pipeline: **Phase 1 — Research** (conte
|
|
|
50
43
|
|
|
51
44
|
## Deep Context Integration
|
|
52
45
|
|
|
53
|
-
Score task complexity per the `hatch3r-deep-context` rule
|
|
46
|
+
Score task complexity per the `hatch3r-deep-context` rule before Phase 1. Apply the resulting tier:
|
|
54
47
|
|
|
55
|
-
- **Tier 2 (Standard):** Present elicitation questions
|
|
56
|
-
- **Tier 3 (Deep):** Present
|
|
48
|
+
- **Tier 2 (Standard):** Present elicitation questions inline. Await answers before Phase 2.
|
|
49
|
+
- **Tier 3 (Deep):** Present Pre-Implementation Summary and ASK for confirmation. Do NOT proceed until all unresolved questions are answered.
|
|
57
50
|
|
|
58
51
|
## Mandatory Delegation Directives
|
|
59
52
|
|
|
60
53
|
### Context Gathering (Before Implementation)
|
|
61
54
|
|
|
62
|
-
|
|
55
|
+
Spawn `hatch3r-researcher` before implementing any task. Skip only for trivial single-line edits. Select modes by task type, then add tier-appropriate modes per Deep Context Integration:
|
|
63
56
|
|
|
64
|
-
- **`type:bug`**:
|
|
65
|
-
- **`type:feature`**:
|
|
66
|
-
- **`type:refactor`**:
|
|
67
|
-
- **`type:qa`**:
|
|
57
|
+
- **`type:bug`**: `symptom-trace`, `root-cause`, `codebase-impact` + tier modes
|
|
58
|
+
- **`type:feature`**: `codebase-impact`, `feature-design`, `architecture` + tier modes
|
|
59
|
+
- **`type:refactor`**: `current-state`, `refactoring-strategy`, `migration-path` + tier modes
|
|
60
|
+
- **`type:qa`**: `codebase-impact` + tier modes
|
|
68
61
|
|
|
69
|
-
Use depth `quick` for low-risk
|
|
62
|
+
Use depth `quick` for low-risk, `standard` for medium-risk, `deep` for high-risk. Tier 3 always uses `deep` depth.
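As a rough sketch, the mode and depth selection above could look like the following. The base modes are copied from the list; `planResearch`, the `tier` parameter shape, and the `tierModes` argument are assumptions, since the tier-specific modes are defined in the `hatch3r-deep-context` rule and not shown here.

```typescript
// Hypothetical sketch of researcher mode and depth selection.
type TaskType = "type:bug" | "type:feature" | "type:refactor" | "type:qa";
type Risk = "low" | "medium" | "high";
type Depth = "quick" | "standard" | "deep";

const BASE_MODES: Record<TaskType, readonly string[]> = {
  "type:bug": ["symptom-trace", "root-cause", "codebase-impact"],
  "type:feature": ["codebase-impact", "feature-design", "architecture"],
  "type:refactor": ["current-state", "refactoring-strategy", "migration-path"],
  "type:qa": ["codebase-impact"],
};

const DEPTH_BY_RISK: Record<Risk, Depth> = { low: "quick", medium: "standard", high: "deep" };

// `tierModes` stands in for the tier-appropriate modes (assumed to be supplied by the caller).
function planResearch(
  task: TaskType,
  risk: Risk,
  tier: 1 | 2 | 3,
  tierModes: readonly string[] = [],
): { modes: string[]; depth: Depth } {
  // Tier 3 always uses deep depth, regardless of risk.
  const depth: Depth = tier === 3 ? "deep" : DEPTH_BY_RISK[risk];
  // De-duplicate while preserving order (base modes first).
  const modes = Array.from(new Set([...BASE_MODES[task], ...tierModes]));
  return { modes, depth };
}
```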
|
|
70
63
|
|
|
71
64
|
### Research Completeness Checklist
|
|
72
65
|
|
|
73
|
-
Before
|
|
66
|
+
Before Phase 1 to Phase 2 handoff, verify:
|
|
74
67
|
|
|
75
|
-
- [ ] **All affected files identified** —
|
|
76
|
-
- [ ] **Blast radius assessed** — downstream consumers
|
|
77
|
-
- [ ] **Existing tests located** —
|
|
78
|
-
- [ ] **Dependencies mapped** — internal
|
|
68
|
+
- [ ] **All affected files identified** — files to be created, modified, or deleted are listed.
|
|
69
|
+
- [ ] **Blast radius assessed** — downstream consumers and integration points documented.
|
|
70
|
+
- [ ] **Existing tests located** — test files covering affected code identified (or absence noted).
|
|
71
|
+
- [ ] **Dependencies mapped** — internal and external dependencies enumerated.
|
|
79
72
|
|
|
80
|
-
If any item
|
|
73
|
+
If any item is unconfirmed, re-run researcher with additional modes or surface to user.
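The checklist above maps naturally onto a boolean record. A minimal sketch, assuming hypothetical field names (`ResearchChecklist` and `unconfirmedItems` are illustrative, not part of the rule):

```typescript
// Hypothetical sketch of the Phase 1 to Phase 2 handoff check.
interface ResearchChecklist {
  affectedFilesIdentified: boolean;
  blastRadiusAssessed: boolean;
  existingTestsLocated: boolean; // also true when the absence of tests was explicitly noted
  dependenciesMapped: boolean;
}

// Returns the checklist items that are still unconfirmed.
function unconfirmedItems(checklist: ResearchChecklist): (keyof ResearchChecklist)[] {
  const keys = Object.keys(checklist) as (keyof ResearchChecklist)[];
  return keys.filter((key) => !checklist[key]);
}
```

An empty result means the handoff may proceed; otherwise the orchestrator re-runs the researcher or surfaces the open items to the user.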
|
|
81
74
|
|
|
82
75
|
### Implementation Delegation
|
|
83
76
|
|
|
84
|
-
|
|
77
|
+
Spawn `hatch3r-implementer` via Task tool for ALL code changes. Never implement inline.
|
|
85
78
|
|
|
86
|
-
- **Single
|
|
87
|
-
- **Plain chat
|
|
88
|
-
- **Epics
|
|
89
|
-
- **
|
|
79
|
+
- **Single issue**: One implementer. Orchestrator owns git/PR/board.
|
|
80
|
+
- **Plain chat task**: One implementer. Create synthetic issue context first.
|
|
81
|
+
- **Epics**: One implementer per sub-issue, level-by-level respecting dependency order.
|
|
82
|
+
- **Batch**: Group by dependency level, one implementer per issue, shared branch + combined PR.
|
|
90
83
|
|
|
91
|
-
**Implementer prompt enrichment
|
|
92
|
-
- `similar-implementation` findings as "Reference Conventions" (triggers the implementer's Convention Lock step)
|
|
93
|
-
- Resolved `requirements-elicitation` answers as "Resolved Requirements"
|
|
94
|
-
- Enhanced `codebase-impact` blast radius data (Tier 3 only)
|
|
84
|
+
**Implementer prompt enrichment (Tier 2+):** Include `similar-implementation` findings as "Reference Conventions", resolved `requirements-elicitation` answers as "Resolved Requirements", and blast radius data (Tier 3 only).
|
|
95
85
|
|
|
96
|
-
###
|
|
86
|
+
### Mid-Implementation Research Gap Checkpoint
|
|
97
87
|
|
|
98
|
-
|
|
88
|
+
At the midpoint of Phase 2 (after initial files are modified but before completion), the implementer MUST evaluate whether research gaps exist. This prevents discovering missing context too late in the pipeline.
|
|
99
89
|
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
90
|
+
**Checkpoint triggers:**
|
|
91
|
+
1. Implementation requires modifying a file not listed in `researchFindings.affectedFiles`.
|
|
92
|
+
2. An undocumented dependency or integration point is discovered.
|
|
93
|
+
3. The implementer's confidence drops below "medium" for any sub-task.
|
|
94
|
+
4. A test file expected from research does not exist or covers different behavior.
|
|
104
95
|
|
|
105
|
-
|
|
96
|
+
**Actions when gaps are detected:**
|
|
97
|
+
- Log the gap in `PipelineContext.researchGaps`.
|
|
98
|
+
- If the gap is blocking (cannot proceed without the missing context): pause implementation, surface the gap to the orchestrator, and request a targeted re-run of `hatch3r-researcher` with the specific modes needed.
|
|
99
|
+
- If the gap is non-blocking (can proceed with assumptions): document the assumption, continue implementation, and flag for reviewer attention in Phase 3.
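The blocking versus non-blocking handling above might be sketched as follows. Only `researchGaps` comes from the rule text; the `ResearchGap` shape, the `GapAction` union, and `handleGap` are assumptions made for illustration.

```typescript
// Hypothetical sketch of the mid-implementation gap checkpoint.
interface ResearchGap {
  description: string;
  blocking: boolean;
  assumption?: string; // recorded when the implementer proceeds anyway
}

type GapAction =
  | { kind: "pause-and-re-research"; gap: ResearchGap }
  | { kind: "continue-and-flag"; gap: ResearchGap; flagForReviewer: true };

function handleGap(context: { researchGaps: ResearchGap[] }, gap: ResearchGap): GapAction {
  // Every gap is logged, whether or not it blocks.
  context.researchGaps.push(gap);
  if (gap.blocking) {
    // Blocking: stop and ask the orchestrator for a targeted researcher re-run.
    return { kind: "pause-and-re-research", gap };
  }
  // Non-blocking: keep going, but make sure the reviewer sees the assumption in Phase 3.
  return { kind: "continue-and-flag", gap, flagForReviewer: true };
}
```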
|
|
106
100
|
|
|
107
|
-
###
|
|
101
|
+
### Per-Task Mini-Review
|
|
102
|
+
|
|
103
|
+
For multi-sub-task implementations, the implementer performs a lightweight mini-review after each sub-task: verify correctness, check interface contracts, validate no regressions, gate progression. Mini-reviews are internal (no separate reviewer agent).
|
|
108
104
|
|
|
109
|
-
|
|
105
|
+
### Post-Implementation Quality Pipeline
|
|
110
106
|
|
|
111
107
|
**Phase 3 — Review Loop:**
|
|
112
108
|
|
|
113
|
-
1. Spawn `hatch3r-reviewer`
|
|
114
|
-
2.
|
|
115
|
-
3.
|
|
116
|
-
4.
|
|
117
|
-
5.
|
|
118
|
-
6.
|
|
109
|
+
1. Spawn `hatch3r-reviewer` with diff and acceptance criteria. Reviewer includes blast radius summary.
|
|
110
|
+
2. Critical/Warning findings: spawn `hatch3r-fixer` with full reviewer output.
|
|
111
|
+
3. Re-review after fixes. Repeat until 0 Critical + 0 Warning, or max 3 iterations.
|
|
112
|
+
4. **Confirmation pass** after clean review: lightweight re-review for fix-driven regressions and acceptance criteria completeness. The confirmation pass checks only: (a) no new test failures compared to Phase 2 baseline, (b) no type errors introduced, (c) acceptance criteria from the issue are still met. It does not re-run the full review checklist.
|
|
113
|
+
5. Max iterations reached: surface to user with a structured summary: iteration count, remaining Critical findings (with file:line), remaining Warning findings, and a recommendation (fix manually vs. accept risk). Never present raw reviewer output without summarization.
|
|
114
|
+
6. **Review gate confidence signal:** When the review loop exits with a clean verdict, record the iteration count in `PipelineContext.reviewResult.iterations`. Clean-on-first-pass (iteration 1) signals higher confidence than clean-after-multiple-iterations (iteration 2-3). Phase 4 specialists and the orchestrator should factor this into their risk assessment.
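The loop in steps 1 to 3 and the iteration count from step 6 can be sketched like this. It is a simplified, synchronous illustration: `runReviewLoop`, `ReviewFindings`, and the callback signatures are assumptions, real agent spawns would be asynchronous, and the confirmation pass from step 4 is left out. The sketch also assumes that one iteration is one review pass and that the fixer is not spawned after the final review.

```typescript
// Hypothetical sketch of the Phase 3 review loop.
interface ReviewFindings {
  critical: number;
  warning: number;
}

interface ReviewOutcome {
  verdict: "clean" | "unresolved";
  iterations: number; // would be recorded as PipelineContext.reviewResult.iterations
}

function runReviewLoop(
  review: () => ReviewFindings,
  fix: (findings: ReviewFindings) => void,
  maxIterations = 3,
): ReviewOutcome {
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const findings = review();
    if (findings.critical === 0 && findings.warning === 0) {
      return { verdict: "clean", iterations: iteration };
    }
    // Assumption: no fixer run after the last allowed review.
    if (iteration < maxIterations) {
      fix(findings);
    }
  }
  return { verdict: "unresolved", iterations: maxIterations };
}
```

A `clean` verdict with `iterations: 1` corresponds to the higher-confidence signal described in step 6.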
|
|
119
115
|
|
|
120
|
-
**Phase 4 — Final Quality** (
|
|
116
|
+
**Phase 4 — Final Quality** (after review loop is clean):
|
|
121
117
|
|
|
122
|
-
Launch
|
|
118
|
+
Launch parallel subagents -- no artificial concurrency limit.
|
|
123
119
|
|
|
124
|
-
**Always
|
|
120
|
+
- **Always:** `hatch3r-test-writer`, `hatch3r-security-auditor`
|
|
121
|
+
- **Evaluate:** `hatch3r-docs-writer` (when APIs/architecture/UX affected)
|
|
122
|
+
- **Conditional:** `hatch3r-lint-fixer`, `hatch3r-a11y-auditor`, `hatch3r-perf-profiler`, `hatch3r-dependency-auditor`, `hatch3r-architect`, `hatch3r-devops`
|
|
125
123
|
|
|
126
|
-
|
|
127
|
-
|
|
124
|
+
**Specialist Prompt Enrichment:** When spawning Phase 4 specialists, include:
|
|
125
|
+
- The `filesChanged` list from Phase 2 so specialists focus on affected code.
|
|
126
|
+
- The review verdict summary from Phase 3 so specialists do not re-flag already-reviewed issues.
|
|
127
|
+
- The `researchFindings.blastRadius` so specialists can assess downstream impact of their changes.
|
|
128
128
|
|
|
129
|
-
**
|
|
129
|
+
**Phase 4 Specialist Trigger Table:**
|
|
130
130
|
|
|
131
|
-
|
|
131
|
+
| Specialist | Mode | Trigger Conditions |
|
|
132
|
+
|-----------|------|--------------------|
|
|
133
|
+
| `hatch3r-test-writer` | Always | Any code change |
|
|
134
|
+
| `hatch3r-security-auditor` | Always | Any code change |
|
|
135
|
+
| `hatch3r-docs-writer` | Evaluate | Public API, architecture, or UX changes |
|
|
136
|
+
| `hatch3r-lint-fixer` | Conditional | Lint/type errors present |
|
|
137
|
+
| `hatch3r-a11y-auditor` | Conditional | UI/accessibility changes |
|
|
138
|
+
| `hatch3r-perf-profiler` | Conditional | Performance-sensitive changes |
|
|
139
|
+
| `hatch3r-dependency-auditor` | Conditional | Dependency files modified (package.json, go.mod, Cargo.toml, requirements.txt, Gemfile, pom.xml, pubspec.yaml, mix.exs, composer.json, and their lockfiles) |
|
|
140
|
+
| `hatch3r-architect` | Conditional | Architectural decisions, new modules/services |
|
|
141
|
+
| `hatch3r-devops` | Conditional | CI/CD or infrastructure changes |
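The trigger table above can be read as a selection function. In this sketch the specialist names and manifest file names come from the table; `ChangeSignals`, its fields, and `selectSpecialists` are hypothetical, lockfile names are omitted, and the function is assumed to be called only when code has changed.

```typescript
// Hypothetical sketch of Phase 4 specialist selection.
interface ChangeSignals {
  filesChanged: readonly string[];
  touchesPublicSurface: boolean; // public API, architecture, or UX changes
  lintOrTypeErrors: boolean;
  uiChanges: boolean;
  perfSensitive: boolean;
  architecturalDecision: boolean;
  ciOrInfraChanges: boolean;
}

// Manifest names from the table; their lockfiles are left out of this sketch.
const DEPENDENCY_MANIFESTS: readonly string[] = [
  "package.json", "go.mod", "Cargo.toml", "requirements.txt", "Gemfile",
  "pom.xml", "pubspec.yaml", "mix.exs", "composer.json",
];

function selectSpecialists(signals: ChangeSignals): string[] {
  // "Always" specialists run for any code change.
  const picked = ["hatch3r-test-writer", "hatch3r-security-auditor"];
  const touchesDependencies = signals.filesChanged.some(
    (file) => DEPENDENCY_MANIFESTS.indexOf(file.split("/").pop() ?? "") !== -1,
  );
  if (signals.touchesPublicSurface) picked.push("hatch3r-docs-writer");
  if (signals.lintOrTypeErrors) picked.push("hatch3r-lint-fixer");
  if (signals.uiChanges) picked.push("hatch3r-a11y-auditor");
  if (signals.perfSensitive) picked.push("hatch3r-perf-profiler");
  if (touchesDependencies) picked.push("hatch3r-dependency-auditor");
  if (signals.architecturalDecision) picked.push("hatch3r-architect");
  if (signals.ciOrInfraChanges) picked.push("hatch3r-devops");
  return picked;
}
```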
|
|
132
142
|
|
|
133
|
-
**
|
|
143
|
+
**Project-Type-Aware Specialist Selection:**
|
|
134
144
|
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
9. `hatch3r-devops` — when CI/CD, deployment, or infrastructure tasks are involved.
|
|
145
|
+
When `PipelineContext.projectType` is available (populated from repo analysis), use the detected languages and frameworks to enrich specialist prompts with language-specific hints. For example:
|
|
146
|
+
- **TypeScript/JavaScript:** Include strict mode checks for lint-fixer, framework-specific test patterns for test-writer.
|
|
147
|
+
- **Python:** Include ruff/mypy hints for lint-fixer, pytest patterns for test-writer, SSTI/SQLi checks for security-auditor.
|
|
148
|
+
- **Go:** Include golangci-lint for lint-fixer, govulncheck for security-auditor, table-driven test patterns for test-writer.
|
|
149
|
+
- **Rust:** Include clippy lints for lint-fixer, cargo-audit for security-auditor.
|
|
141
150
|
|
|
142
|
-
|
|
151
|
+
See `src/pipeline/pipelineContext.ts` for the full `LANGUAGE_SPECIALIST_CONFIGS` mapping.
|
|
152
|
+
|
|
153
|
+
### Phase 4 Validation Pass
|
|
143
154
|
|
|
144
|
-
|
|
155
|
+
After all Phase 4 specialists complete, run a validation pass to catch regressions:
|
|
156
|
+
|
|
157
|
+
1. Run test suite and type checker. Compare against Phase 3 baseline cached in `PipelineContext`.
|
|
158
|
+
2. No new failures: proceed to completion.
|
|
159
|
+
3. New failures: identify causing specialist, spawn `hatch3r-fixer`, re-validate (max 2 iterations).
|
|
160
|
+
4. Persistent regressions: surface to user. Do not silently accept.
|
|
161
|
+
5. If any specialist produced code fixes (not just findings), spawn a lightweight `hatch3r-reviewer` re-review scoped to files modified by Phase 4 specialists. This prevents specialist fixes from bypassing the Phase 3 review gate. Max 1 re-review iteration; Critical findings trigger a single fixer pass.
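Steps 1 to 4 above amount to a baseline comparison with a bounded fix loop. A minimal sketch, assuming test failures are identified by name and that `runSuite` and `fix` are caller-supplied callbacks (all names here are hypothetical); the scoped re-review from step 5 is not modelled.

```typescript
// Hypothetical sketch of the Phase 4 validation pass.

// Failures present now that were not in the cached baseline.
function newFailures(baseline: readonly string[], current: readonly string[]): string[] {
  const known = new Set(baseline);
  return current.filter((test) => !known.has(test));
}

function validatePhase4(
  baseline: readonly string[],
  runSuite: () => string[],
  fix: (regressions: string[]) => void,
  maxFixIterations = 2,
): { status: "clean" | "surface-to-user"; regressions: string[] } {
  let regressions = newFailures(baseline, runSuite());
  for (let attempt = 0; attempt < maxFixIterations && regressions.length > 0; attempt++) {
    fix(regressions);
    regressions = newFailures(baseline, runSuite());
  }
  // Persistent regressions are surfaced, never silently accepted.
  return { status: regressions.length === 0 ? "clean" : "surface-to-user", regressions };
}
```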
|
|
162
|
+
|
|
163
|
+
### Specialist Success Criteria
|
|
145
164
|
|
|
146
165
|
| Specialist | Success Criterion |
|
|
147
166
|
|-----------|-------------------|
|
|
148
|
-
| `hatch3r-test-writer` | All new
|
|
149
|
-
| `hatch3r-security-auditor` | No HIGH
|
|
150
|
-
| `hatch3r-docs-writer` |
|
|
151
|
-
| `hatch3r-lint-fixer` | Zero lint
|
|
152
|
-
| `hatch3r-a11y-auditor` |
|
|
153
|
-
| `hatch3r-perf-profiler` | No performance regressions
|
|
154
|
-
| `hatch3r-dependency-auditor` | No known CVEs
|
|
155
|
-
| `hatch3r-architect` |
|
|
156
|
-
| `hatch3r-devops` | CI/CD
|
|
167
|
+
| `hatch3r-test-writer` | All new/modified code paths have tests; no untested branches in changed files. |
|
|
168
|
+
| `hatch3r-security-auditor` | No HIGH/CRITICAL findings unresolved; MEDIUM findings documented with plan. |
|
|
169
|
+
| `hatch3r-docs-writer` | Affected APIs, architecture, and UX changes reflected in docs. |
|
|
170
|
+
| `hatch3r-lint-fixer` | Zero lint/type errors in changed files. |
|
|
171
|
+
| `hatch3r-a11y-auditor` | WCAG AA compliance; no new a11y violations. |
|
|
172
|
+
| `hatch3r-perf-profiler` | No performance regressions; new hot paths benchmarked. |
|
|
173
|
+
| `hatch3r-dependency-auditor` | No known CVEs; license compatibility verified. |
|
|
174
|
+
| `hatch3r-architect` | ADRs documented; design aligns with patterns or divergence justified. |
|
|
175
|
+
| `hatch3r-devops` | CI/CD passes end-to-end; deployment config validated. |
|
|
157
176
|
|
|
158
177
|
## Skill Loading Directives
|
|
159
178
|
|
|
160
|
-
|
|
179
|
+
Load the matching skill before implementation:
|
|
161
180
|
|
|
162
181
|
| Task Type | Skill |
|
|
163
182
|
|-----------|-------|
|
|
@@ -168,278 +187,102 @@ Before implementing any task, you MUST read and follow the matching hatch3r skil
|
|
|
168
187
|
| `type:refactor` (other) | `hatch3r-refactor` |
|
|
169
188
|
| `type:qa` | `hatch3r-qa-validation` |
|
|
170
189
|
|
|
171
|
-
|
|
190
|
+
Skill-referenced agent delegations are mandatory.
|
|
172
191
|
|
|
173
192
|
## Subagent Spawning Protocol
|
|
174
193
|
|
|
175
|
-
|
|
194
|
+
1. Use `subagent_type: "generalPurpose"` for all delegations.
|
|
195
|
+
2. Include: agent protocol, applicable `scope: always` rules, tooling hierarchy, relevant learnings.
|
|
196
|
+
3. Launch independent subagents in parallel — maximum parallelism.
|
|
197
|
+
4. Await and review results. Surface BLOCKED or PARTIAL to user.
|
|
176
198
|
|
|
177
|
-
|
|
178
|
-
2. **Include in every subagent prompt**:
|
|
179
|
-
- The agent protocol to follow (e.g., "Follow the hatch3r-implementer agent protocol").
|
|
180
|
-
- All `scope: always` rules from `.agents/rules/` that apply.
|
|
181
|
-
- The project's tooling hierarchy (Context7 MCP for library docs, web research for current context).
|
|
182
|
-
- Relevant learnings from `.agents/learnings/` if the directory exists.
|
|
183
|
-
3. **Launch as many independent subagents in parallel as the platform supports.** Do not impose an artificial concurrency limit. Use maximum parallelism for independent work.
|
|
184
|
-
4. **Await and review results** before proceeding. If a subagent reports BLOCKED or PARTIAL, surface to the user.
|
|
199
|
+
## Cross-Phase Error Propagation
|
|
185
200
|
|
|
186
|
-
|
|
201
|
+
When a phase produces a non-SUCCESS status, the orchestrator must propagate error context to downstream phases rather than silently dropping it:
|
|
202
|
+
|
|
203
|
+
1. **Phase 1 PARTIAL** (incomplete research): Include the `researchGaps` list in the implementer prompt so the implementer knows which areas lack verified context. Set implementer confidence expectations accordingly.
|
|
204
|
+
2. **Phase 2 PARTIAL** (incomplete implementation): Include the `reason` field and list of unimplemented acceptance criteria in the reviewer prompt. The reviewer must distinguish between "not done yet" and "done incorrectly."
|
|
205
|
+
3. **Phase 3 UNRESOLVED** (review loop exhausted): Include the unresolved findings list in the Phase 4 specialist prompts. Specialists must not introduce changes that conflict with known unresolved issues.
|
|
206
|
+
4. **Phase 4 specialist FAILED**: Include the failure reason when surfacing to the user. Never report "Phase 4 failed" without specifying which specialist failed and why.
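One way to picture these four propagation rules is a function from a non-SUCCESS outcome to a note for the next recipient. The status names follow the list above; the `PhaseOutcome` union, its field names other than `researchGaps` and `reason`, and `propagate` are assumptions for illustration.

```typescript
// Hypothetical sketch of cross-phase error propagation.
type PhaseOutcome =
  | { phase: 1; status: "PARTIAL"; researchGaps: string[] }
  | { phase: 2; status: "PARTIAL"; reason: string; unimplementedCriteria: string[] }
  | { phase: 3; status: "UNRESOLVED"; unresolvedFindings: string[] }
  | { phase: 4; status: "FAILED"; specialist: string; reason: string };

// Returns who must receive the error context and what it should say.
function propagate(outcome: PhaseOutcome): { target: string; note: string } {
  switch (outcome.phase) {
    case 1:
      return {
        target: "hatch3r-implementer",
        note: `Unverified research areas: ${outcome.researchGaps.join("; ")}`,
      };
    case 2:
      return {
        target: "hatch3r-reviewer",
        note: `Incomplete implementation (${outcome.reason}). Not yet implemented: ${outcome.unimplementedCriteria.join("; ")}`,
      };
    case 3:
      return {
        target: "phase-4-specialists",
        note: `Known unresolved findings: ${outcome.unresolvedFindings.join("; ")}`,
      };
    case 4:
      return { target: "user", note: `${outcome.specialist} failed: ${outcome.reason}` };
  }
}
```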
|
|
187
207
|
|
|
188
|
-
|
|
208
|
+
## Correlation ID
|
|
189
209
|
|
|
190
|
-
|
|
191
|
-
2. **Propagation**: Include the correlation ID in every subagent prompt — researchers, implementers, reviewers, fixers, and all Phase 4 specialists. Pass it as a top-level field: `correlation_id: "<value>"`.
|
|
192
|
-
3. **Usage in subagents**: All subagents MUST include the correlation ID in any logs, error messages, structured outputs, or status reports they produce. This applies to both success and failure paths.
|
|
193
|
-
4. **Scope**: One correlation ID per top-level task. Epic sub-issues each get their own correlation ID. Batch tasks share one correlation ID per batch but include a sub-task index (e.g., `correlation_id: "<uuid>", sub_task: 2`).
|
|
210
|
+
Generate a UUID v4 per top-level task before Phase 1. Include in every subagent prompt as `correlation_id`. All subagents include it in logs, outputs, and status reports. Epic sub-issues get individual IDs; batch tasks share one ID with a sub-task index.
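The scoping rule (individual IDs for epic sub-issues, one shared ID plus a sub-task index for batches) might be sketched as below. `correlationsFor` is hypothetical, the ID generator is injected (in Node.js, `crypto.randomUUID()` would supply a v4 UUID), and the 1-based `sub_task` index is an assumption.

```typescript
// Hypothetical sketch of correlation ID scoping.
interface Correlation {
  correlation_id: string;
  sub_task?: number; // only set for batch tasks; 1-based in this sketch
}

function correlationsFor(
  kind: "single" | "epic" | "batch",
  taskCount: number,
  newId: () => string,
): Correlation[] {
  if (kind === "batch") {
    // One shared ID for the whole batch, distinguished by sub-task index.
    const shared = newId();
    return Array.from({ length: taskCount }, (_, index) => ({
      correlation_id: shared,
      sub_task: index + 1,
    }));
  }
  // A single task gets one ID; each epic sub-issue gets its own.
  const count = kind === "single" ? 1 : taskCount;
  return Array.from({ length: count }, () => ({ correlation_id: newId() }));
}
```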
|
|
194
211
|
|
|
195
212
|
## Severity Scale
|
|
196
213
|
|
|
197
|
-
All agents across the pipeline MUST use this canonical severity scale when classifying findings, issues, or audit results. This ensures consistent triage and gating across phases.
|
|
198
|
-
|
|
199
214
|
| Severity | Definition | Pipeline Action |
|
|
200
215
|
|----------|-----------|-----------------|
|
|
201
|
-
| **CRITICAL** | Blocks merge
|
|
202
|
-
| **HIGH** | Should fix before merge. Significant bugs, performance regressions
|
|
203
|
-
| **MEDIUM** | Fix in same sprint. Code quality
|
|
204
|
-
| **LOW** | Track for future. Style nits, minor
|
|
205
|
-
| **INFO** | Informational
|
|
206
|
-
|
|
207
|
-
All subagents — reviewers, security auditors, test writers, and other specialists — MUST map their findings to this scale. When a subagent uses a different internal scale, it MUST translate to this canonical scale in its output.
|
|
208
|
-
|
|
209
|
-
## Pipeline Context
|
|
210
|
-
|
|
211
|
-
The orchestrator MUST maintain a `PipelineContext` object throughout the pipeline lifecycle. This object serves as the data contract between pipeline phases, ensuring structured handoff of findings, decisions, and artifacts.
|
|
212
|
-
|
|
213
|
-
### PipelineContext Schema
|
|
214
|
-
|
|
215
|
-
```
|
|
216
|
-
PipelineContext {
|
|
217
|
-
correlationId: string // UUID v4 from the Correlation ID directive
|
|
218
|
-
phase: "research" | "implement" | "review" | "quality" // Current active phase
|
|
219
|
-
findings: Finding[] // Accumulated findings from all phases
|
|
220
|
-
decisions: Decision[] // Decisions made during the pipeline (user answers, trade-offs, overrides)
|
|
221
|
-
artifacts: string[] // File paths created or modified during the pipeline
|
|
222
|
-
}
|
|
223
|
-
|
|
224
|
-
Finding {
|
|
225
|
-
id: string // Unique finding identifier (e.g., "F-001")
|
|
226
|
-
phase: string // Phase that produced the finding
|
|
227
|
-
agent: string // Agent that produced the finding
|
|
228
|
-
severity: "CRITICAL" | "HIGH" | "MEDIUM" | "LOW" | "INFO" // Per Severity Scale
|
|
229
|
-
description: string // Human-readable finding description
|
|
230
|
-
filePath?: string // Affected file, if applicable
|
|
231
|
-
resolved: boolean // Whether the finding has been addressed
|
|
232
|
-
}
|
|
233
|
-
|
|
234
|
-
Decision {
|
|
235
|
-
id: string // Unique decision identifier (e.g., "D-001")
|
|
236
|
-
phase: string // Phase where the decision was made
|
|
237
|
-
description: string // What was decided
|
|
238
|
-
rationale: string // Why this option was chosen
|
|
239
|
-
madeBy: "user" | "agent" // Who made the decision
|
|
240
|
-
}
|
|
241
|
-
```
|
|
242
|
-
|
|
243
|
-
### Phase Handoff Metadata
|
|
244
|
-
|
|
245
|
-
When transitioning between pipeline phases, the orchestrator MUST include the following metadata fields in each handoff to enable traceability and performance analysis:
|
|
246
|
-
|
|
247
|
-
- `timestamp` -- ISO 8601 timestamp of the handoff event
|
|
248
|
-
- `agentId` -- identifier of the agent completing the phase (e.g., `hatch3r-researcher`, `hatch3r-implementer`)
|
|
249
|
-
- `phase` -- the phase being completed (e.g., `research`, `implement`, `review`, `quality`)
|
|
250
|
-
- `duration` -- elapsed time in seconds for the completed phase
|
|
251
|
-
- `filesModified` -- list of file paths created, modified, or deleted during the phase
|
|
252
|
-
|
|
253
|
-
These fields are appended to the `PipelineContext` at each phase transition, providing a structured audit trail of which agent did what, when, and for how long.
|
|
254
|
-
|
|
255
|
-
### Context Caching
|
|
256
|
-
|
|
257
|
-
When multiple agents need the same context (e.g., project structure, test results, blast radius data, reference conventions), cache it in the shared `PipelineContext` rather than having each agent re-read or re-compute it independently. Specifically:
|
|
258
|
-
|
|
259
|
-
- Research output from Phase 1 (file lists, dependency maps, convention extractions) should be stored once and passed by reference to the implementer, reviewer, and any Phase 4 specialists that need it.
|
|
260
|
-
- Test suite results captured during implementation verification should be cached and forwarded to the reviewer and test-writer rather than re-running the full suite in each phase.
|
|
261
|
-
- This reduces redundant file reads, avoids inconsistencies from reading files at different points in time, and conserves token budget across subagent prompts.
|
|
262
|
-
|
|
263
|
-
### Cache Enforcement
|
|
264
|
-
|
|
265
|
-
The orchestrator MUST enforce caching at each phase transition. Caching is not optional guidance -- it is a pipeline invariant.
|
|
266
|
-
|
|
267
|
-
1. **Pre-handoff cache check**: Before spawning any downstream subagent, the orchestrator MUST verify that all cacheable outputs from prior phases are stored in `PipelineContext`. If a cacheable output is missing, the orchestrator MUST populate it before proceeding. Cacheable outputs include:
|
|
268
|
-
- Phase 1: file lists, dependency maps, convention extractions, blast radius data
|
|
269
|
-
- Phase 2: test suite results, modified file list, build output
|
|
270
|
-
- Phase 3: reviewer findings, fixer diffs, resolved/unresolved finding status
|
|
271
|
-
2. **No redundant reads**: If a subagent prompt would include context that exists in the cache, the orchestrator MUST pass the cached version. Subagents MUST NOT re-read files or re-run commands whose results are already cached and fresh (per Cache Verification above).
|
|
272
|
-
3. **Cache population logging**: Log every cache write with the key and size: `"Cache WRITE <cache_key>: <token_estimate> tokens"`. This provides visibility into which data is being cached and its cost.
|
|
273
|
-
4. **Enforcement violation**: If a subagent re-reads or re-computes data that was available in the cache, log a warning: `"Cache BYPASS detected: <agent> re-computed <cache_key> instead of using cached value"`. This warning is informational (severity INFO) and does not block the pipeline, but it flags an optimization gap for future runs.
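
The four enforcement steps above can be sketched as a small cache wrapper. The `ContextCache` class, the cache key names, and the four-characters-per-token estimate are all assumptions made for this example; only the two log message formats come from the text.

```typescript
// Hypothetical in-memory cache; in hatch3r the cached data lives in PipelineContext.
class ContextCache {
  private entries = new Map<string, string>();
  readonly log: string[] = [];

  // Rough estimate of ~4 characters per token (an assumption, not a hatch3r rule).
  private estimateTokens(value: string): number {
    return Math.ceil(value.length / 4);
  }

  // Step 3: every write is logged with its key and estimated size.
  write(key: string, value: string): void {
    this.entries.set(key, value);
    this.log.push(`Cache WRITE ${key}: ${this.estimateTokens(value)} tokens`);
  }

  // Step 2: downstream prompts read the cached version instead of re-reading files.
  read(key: string): string | undefined {
    return this.entries.get(key);
  }

  // Step 1: pre-handoff check, returning the cacheable keys that are still missing.
  missing(requiredKeys: string[]): string[] {
    return requiredKeys.filter((key) => !this.entries.has(key));
  }

  // Step 4: log a bypass warning only when the re-computed value was already cached.
  reportRecompute(agent: string, key: string): void {
    if (this.entries.has(key)) {
      this.log.push(
        `Cache BYPASS detected: ${agent} re-computed ${key} instead of using cached value`,
      );
    }
  }
}
```

This sketch omits freshness checks, so a real implementation would also need to invalidate stale entries.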
|
|
274
|
-
|
|
275
|
-
### PipelineContext Usage
|
|
276
|
-
|
|
277
|
-
1. **Initialization**: The orchestrator creates a `PipelineContext` at the start of Phase 1 with the `correlationId` and `phase` set to `"research"`. All other fields are initialized as empty arrays.
|
|
278
|
-
2. **Phase transitions**: When moving between phases, update the `phase` field. Do not clear previous phase data — findings and decisions accumulate across the full pipeline.
|
|
279
|
-
3. **Subagent input**: Pass the current `PipelineContext` (or relevant subsets) to each subagent so it has full pipeline history.
|
|
280
|
-
4. **Subagent output**: Each subagent appends its findings and decisions to the context. The orchestrator merges subagent outputs back into the canonical `PipelineContext`.
|
|
281
|
-
5. **Final output**: The completed `PipelineContext` is included in the task summary, giving the user full traceability from research through quality.
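
The lifecycle above could look roughly like this in code. The types are a reduced sketch containing only the fields named in this section; the `*Sketch` type names and the three helper functions are hypothetical, and the real `PipelineContext` may carry additional fields.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW" | "INFO";

interface FindingSketch {
  phase: string;
  agent: string;
  severity: Severity;
  description: string;
  filePath?: string;
  resolved: boolean;
}

interface DecisionSketch {
  id: string;
  phase: string;
  description: string;
  rationale: string;
  madeBy: "user" | "agent";
}

// Reduced context: only the fields mentioned in the surrounding text.
interface ContextSketch {
  correlationId: string;
  phase: string;
  findings: FindingSketch[];
  decisions: DecisionSketch[];
}

// Step 1: create the context at the start of Phase 1 with empty arrays.
function initContext(correlationId: string): ContextSketch {
  return { correlationId, phase: "research", findings: [], decisions: [] };
}

// Step 2: update the phase without clearing accumulated findings or decisions.
function advancePhase(ctx: ContextSketch, nextPhase: string): ContextSketch {
  return { ...ctx, phase: nextPhase };
}

// Step 4: merge one subagent's output back into the canonical context.
function mergeOutput(
  ctx: ContextSketch,
  output: { findings: FindingSketch[]; decisions: DecisionSketch[] },
): ContextSketch {
  return {
    ...ctx,
    findings: [...ctx.findings, ...output.findings],
    decisions: [...ctx.decisions, ...output.decisions],
  };
}
```

Returning a new object from each helper, instead of mutating in place, makes it easier for the orchestrator to hand a stable snapshot to each subagent.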
|
|
216
|
+
| **CRITICAL** | Blocks merge. Security vulnerabilities, data loss, broken core functionality. | Must resolve before Phase 3 exit. |
|
|
217
|
+
| **HIGH** | Should fix before merge. Significant bugs, performance regressions. | Fix or escalate to user. |
|
|
218
|
+
| **MEDIUM** | Fix in same sprint. Code quality, minor bugs. | Document with remediation plan. |
|
|
219
|
+
| **LOW** | Track for future. Style nits, minor improvements. | Log only. No merge gate. |
|
|
220
|
+
| **INFO** | Informational. Observations, suggestions. | Awareness only. |
|
|
282
221
|
|
|
283
|
-
|
|
222
|
+
All subagents MUST map findings to this scale.
|
|
284
223
|
|
|
285
|
-
|
|
286
|
-
|
|
287
|
-
When a subagent fails (error, timeout, or BLOCKED status), apply the following retry-and-fallback protocol:
|
|
288
|
-
|
|
289
|
-
1. **Retry once**: Re-send the same prompt to the same agent type exactly once. Do not modify the prompt on retry.
|
|
290
|
-
2. **Fallback on second failure**: If the retry also fails, fall back to degraded mode for that phase:
|
|
291
|
-
- **Researcher failure** → Proceed to Phase 2 (Implement) without research context. Add a warning to the implementer prompt: `"WARNING: Research phase failed. Proceeding without research context. Exercise extra caution with assumptions."` The orchestrator should note this gap in the final output.
|
|
292
|
-
- **Reviewer failure** → Surface the raw diff to the user for manual review. Do not proceed to Phase 4 automatically.
|
|
293
|
-
- **Test-writer failure** → Flag the deliverable as "untested" in the PR description. Add label `needs-tests` if the platform supports it.
|
|
294
|
-
- **Fixer failure** → Surface the original reviewer findings to the user. Do not re-enter the review loop.
|
|
295
|
-
- **Security-auditor failure** → Flag as "security-unaudited" in the PR description. Add label `needs-security-review` if the platform supports it.
|
|
296
|
-
- **Other specialist failure** → Skip that specialist, document the gap in the final output (e.g., "docs-writer skipped due to failure").
|
|
297
|
-
3. **Retry budget**: Maximum 3 total retries across all subagents per top-level task. Once the budget is exhausted, any subsequent failures go directly to fallback without retry.
|
|
298
|
-
4. **Reporting**: Include all failures and fallbacks in the task summary so the user has full visibility into degraded phases.
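
The interaction between the single retry and the shared budget of three retries can be sketched as below. `RetryBudget` and `AttemptOutcome` are hypothetical names, and the attempt is modeled as a synchronous callback for brevity; a real orchestrator would invoke agents asynchronously.

```typescript
type AttemptOutcome = "ok" | "fail";

// Hypothetical tracker for the retry budget shared across all subagents in one task.
class RetryBudget {
  private used = 0;
  private readonly max: number;

  constructor(max: number = 3) {
    this.max = max;
  }

  // Runs the attempt and retries it at most once, unchanged, while budget remains.
  // Returns "fallback" when the phase should switch to its degraded mode.
  run(attempt: () => AttemptOutcome): "ok" | "fallback" {
    if (attempt() === "ok") return "ok";
    if (this.used >= this.max) return "fallback"; // budget exhausted: skip the retry
    this.used += 1;
    return attempt() === "ok" ? "ok" : "fallback";
  }

  get remaining(): number {
    return this.max - this.used;
  }
}
```

Because the same callback is invoked for the retry, the "do not modify the prompt" rule is enforced by construction.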
|
|
299
|
-
|
|
300
|
-
### Circuit Breaker Tracking
|
|
301
|
-
|
|
302
|
-
The orchestrator MUST track consecutive failures per agent type and per pipeline phase to prevent repeated invocations of persistently failing agents.
|
|
303
|
-
|
|
304
|
-
1. **Tracking**: Maintain a per-agent failure counter that increments on each consecutive failure (error, timeout, or BLOCKED) and resets to zero on any success.
|
|
305
|
-
2. **Trip threshold**: After **3 consecutive failures** for the same agent type within a single pipeline run, mark that agent as **"tripped"** and skip all subsequent invocations of it for the remainder of the task.
|
|
306
|
-
3. **State transitions**: Log every circuit breaker state change with the correlation ID, agent type, and transition:
|
|
307
|
-
- `CLOSED → OPEN` — agent tripped after 3 consecutive failures. Log: `"Circuit breaker OPEN for <agent>: <failure_count> consecutive failures"`.
|
|
308
|
-
- `OPEN → HALF-OPEN` — cooldown period elapsed or manual reset issued. Log: `"Circuit breaker HALF-OPEN for <agent>: attempting probe"`.
|
|
309
|
-
- `HALF-OPEN → CLOSED` — probe invocation succeeded. Log: `"Circuit breaker CLOSED for <agent>: probe succeeded"`.
|
|
310
|
-
- `HALF-OPEN → OPEN` — probe invocation failed. Log: `"Circuit breaker re-OPEN for <agent>: probe failed"`.
|
|
311
|
-
4. **Skipping tripped agents**: When an agent is tripped, apply its fallback behavior from the Resilience Directives above and note `"Skipped: circuit breaker OPEN"` in the task summary.
|
|
312
|
-
5. **Reset policy**: A tripped agent can be re-enabled by either:
|
|
313
|
-
- **Manual reset** — the user explicitly requests retrying the agent (e.g., "retry the reviewer").
|
|
314
|
-
- **Cooldown period** — if the pipeline spans multiple top-level tasks in a session, a tripped agent automatically transitions to HALF-OPEN after **10 minutes** of inactivity. The next invocation is a probe: success closes the breaker; failure re-opens it.
|
|
315
|
-
6. **Cross-task persistence**: Circuit breaker state persists within a session. If an agent trips during task A, it remains tripped for task B unless manually reset or the cooldown period has elapsed.
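
The state transitions above form a small state machine, sketched here for a single agent type. `AgentBreaker` is a hypothetical class; the log strings follow the formats quoted in the list. As a simplification, the cooldown is measured from the moment the breaker opened rather than from the last activity.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF-OPEN";

class AgentBreaker {
  state: BreakerState = "CLOSED";
  failures = 0;
  readonly log: string[] = [];
  private openedAtMs = 0;
  private readonly agent: string;
  private readonly threshold = 3;               // consecutive failures before tripping
  private readonly cooldownMs = 10 * 60 * 1000; // 10-minute cooldown

  constructor(agent: string) {
    this.agent = agent;
  }

  private halfOpen(): void {
    this.state = "HALF-OPEN";
    this.log.push(`Circuit breaker HALF-OPEN for ${this.agent}: attempting probe`);
  }

  // Returns true when the agent may be invoked at time nowMs.
  canInvoke(nowMs: number): boolean {
    if (this.state === "OPEN" && nowMs - this.openedAtMs >= this.cooldownMs) {
      this.halfOpen();
    }
    return this.state !== "OPEN";
  }

  // Manual reset: the user explicitly asks to retry a tripped agent.
  manualReset(): void {
    if (this.state === "OPEN") this.halfOpen();
  }

  recordSuccess(): void {
    if (this.state === "HALF-OPEN") {
      this.log.push(`Circuit breaker CLOSED for ${this.agent}: probe succeeded`);
    }
    this.state = "CLOSED";
    this.failures = 0;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.state === "HALF-OPEN") {
      this.state = "OPEN";
      this.openedAtMs = nowMs;
      this.log.push(`Circuit breaker re-OPEN for ${this.agent}: probe failed`);
    } else if (this.state === "CLOSED" && this.failures >= this.threshold) {
      this.state = "OPEN";
      this.openedAtMs = nowMs;
      this.log.push(
        `Circuit breaker OPEN for ${this.agent}: ${this.failures} consecutive failures`,
      );
    }
  }
}
```

Keeping one breaker instance per agent type for the whole session gives the cross-task persistence described in item 6.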
|
|
316
|
-
|
|
317
|
-
### Stall Detection
|
|
318
|
-
|
|
319
|
-
If an agent produces no output for 2 minutes, consider it stalled. The orchestrator MUST:
|
|
320
|
-
|
|
321
|
-
1. **Log the stall** with the correlation ID, agent type, phase, and elapsed idle time: `"STALL detected for <agent> in <phase>: <elapsed>s with no output"`.
|
|
322
|
-
2. **Terminate the stalled agent** and capture any partial output produced before the stall.
|
|
323
|
-
3. **Retry once** by re-spawning the same agent type with the same prompt. If the retry also stalls, skip the agent and apply the relevant fallback from the Resilience Directives (e.g., proceed without research context, flag as untested).
|
|
324
|
-
4. **Include a warning** in the `PipelineContext` noting the stall and whether the retry succeeded or the agent was skipped.
|
|
325
|
-
|
|
326
|
-
A stalled invocation counts as a failure for both the retry budget and the circuit breaker failure counter.
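
A stall check along these lines could run on a timer while an agent is active. The `checkStall` function and its parameters are hypothetical; only the two-minute threshold and the log format come from the text above.

```typescript
const STALL_THRESHOLD_S = 120; // 2 minutes without output

// Returns the stall log line when the agent has been idle too long, otherwise null.
// Both timestamps are epoch milliseconds.
function checkStall(
  agent: string,
  phase: string,
  lastOutputMs: number,
  nowMs: number,
): string | null {
  const idleSeconds = Math.floor((nowMs - lastOutputMs) / 1000);
  if (idleSeconds < STALL_THRESHOLD_S) return null;
  return `STALL detected for ${agent} in ${phase}: ${idleSeconds}s with no output`;
}
```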
|
|
327
|
-
|
|
328
|
-
### Timeout Policy
|
|
329
|
-
|
|
330
|
-
Each pipeline phase has an explicit time budget. If a phase exceeds its timeout, capture partial results and move to the next phase. Do not block the pipeline indefinitely.
|
|
331
|
-
|
|
332
|
-
| Phase / Activity | Per-Item Timeout | Phase Total Timeout |
|
|
333
|
-
|-----------------|-----------------|-------------------|
|
|
334
|
-
| **Phase 1 — Research** | 5 minutes per file | 30 minutes total |
|
|
335
|
-
| **Phase 2 — Implement** | 10 minutes per task | — |
|
|
336
|
-
| **Phase 3 — Review Loop** | 5 minutes per review cycle | — |
|
|
337
|
-
| **Phase 4 — Final Quality** | 5 minutes per specialist | — |
|
|
338
|
-
|
|
339
|
-
**Timeout behavior:**
|
|
340
|
-
|
|
341
|
-
1. **Partial capture**: When a timeout fires, the orchestrator MUST capture whatever output the subagent has produced so far. Partial research context, partial reviews, or partial test suites are preferable to no output.
|
|
342
|
-
2. **Logging**: Log the timeout with the correlation ID, phase, agent, elapsed time, and whether partial results were captured: `"TIMEOUT in <phase> for <agent>: <elapsed>s elapsed, partial results captured: <yes/no>"`.
|
|
343
|
-
3. **Phase advancement**: After capturing partial results, proceed to the next phase. Include a warning in downstream prompts: `"WARNING: <phase> timed out. Partial results only. Exercise extra caution."`.
|
|
344
|
-
4. **Retry interaction**: A timed-out invocation counts as a failure for both the retry budget and the circuit breaker failure counter.
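
The timeout table and the logging rule can be expressed as data plus two small helpers. This is a sketch under stated assumptions: the phase keys reuse the phase names from the handoff metadata examples, and `isTimedOut` and `timeoutLog` are hypothetical helpers.

```typescript
// Per-item timeouts from the table above, in seconds.
const PER_ITEM_TIMEOUT_S: Record<string, number> = {
  research: 5 * 60,   // per file
  implement: 10 * 60, // per task
  review: 5 * 60,     // per review cycle
  quality: 5 * 60,    // per specialist
};

// Only Phase 1 lists a phase-total timeout.
const PHASE_TOTAL_TIMEOUT_S: Record<string, number> = {
  research: 30 * 60,
};

// True when either the per-item or the phase-total budget has been exceeded.
function isTimedOut(phase: string, itemElapsedS: number, phaseElapsedS: number): boolean {
  const perItem = PER_ITEM_TIMEOUT_S[phase];
  const total = PHASE_TOTAL_TIMEOUT_S[phase];
  return (
    (perItem !== undefined && itemElapsedS > perItem) ||
    (total !== undefined && phaseElapsedS > total)
  );
}

// Formats the timeout log line described in step 2.
function timeoutLog(phase: string, agent: string, elapsedS: number, partial: boolean): string {
  const captured = partial ? "yes" : "no";
  return `TIMEOUT in ${phase} for ${agent}: ${elapsedS}s elapsed, partial results captured: ${captured}`;
}
```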
|
|
345
|
-
|
|
346
|
-
### Observability Span Naming
|
|
347
|
-
|
|
348
|
-
For observability, name tracing spans consistently using the pattern `hatch3r.{phase}.{agent}`. This convention enables filtering and aggregation across pipeline runs in any OpenTelemetry-compatible backend.
|
|
349
|
-
|
|
350
|
-
Examples:
|
|
351
|
-
- `hatch3r.research.researcher`
|
|
352
|
-
- `hatch3r.implement.implementer`
|
|
353
|
-
- `hatch3r.review.reviewer`
|
|
354
|
-
- `hatch3r.review.fixer`
|
|
355
|
-
- `hatch3r.quality.test-writer`
|
|
356
|
-
- `hatch3r.quality.security-auditor`
|
|
357
|
-
|
|
358
|
-
The orchestrator creates a root span `hatch3r.pipeline` for the full task, with child spans for each phase and grandchild spans for each agent invocation within that phase. Include the `correlationId` as a span attribute on every span.
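
A helper that produces names in this pattern might look as follows. `spanName` and `spanAttributes` are hypothetical helpers; the sketch assumes agent IDs carry the `hatch3r-` prefix, which is dropped to match the examples above, and it does not call any tracing library.

```typescript
const ROOT_SPAN = "hatch3r.pipeline";

// Builds "hatch3r.{phase}.{agent}", removing the "hatch3r-" prefix from the agent ID.
function spanName(phase: string, agentId: string): string {
  const shortAgent = agentId.replace(/^hatch3r-/, "");
  return `hatch3r.${phase}.${shortAgent}`;
}

// Attributes attached to every span so runs can be correlated.
function spanAttributes(correlationId: string): Record<string, string> {
  return { correlationId };
}
```

For example, `spanName("quality", "hatch3r-test-writer")` yields `hatch3r.quality.test-writer`.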
|
|
224
|
+
## Status Codes
|
|
359
225
|
|
|
360
|
-
|
|
226
|
+
| Status | Meaning |
|
|
227
|
+
|--------|---------|
|
|
228
|
+
| **SUCCESS** | Fully completed, all criteria met. |
|
|
229
|
+
| **PARTIAL** | Partially completed; include `reason` field. |
|
|
230
|
+
| **FAILED** | No usable output; include `reason` field. |
|
|
231
|
+
| **SKIPPED** | Intentionally not executed. |
|
|
232
|
+
| **TIMEOUT** | Time budget exceeded; forward partial output. |
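
One way to enforce the `reason` requirement from this table is a small validator. The `AgentResult` shape, the `partialOutput` field, and `validateResult` are assumptions made for illustration; only the five status names and the `reason` field come from the table.

```typescript
type Status = "SUCCESS" | "PARTIAL" | "FAILED" | "SKIPPED" | "TIMEOUT";

interface AgentResult {
  status: Status;
  reason?: string;        // required for PARTIAL and FAILED per the table
  partialOutput?: string; // hypothetical field for forwarded TIMEOUT output
}

// Returns an error message when a result violates the table, otherwise null.
function validateResult(result: AgentResult): string | null {
  const needsReason = result.status === "PARTIAL" || result.status === "FAILED";
  if (needsReason && (result.reason === undefined || result.reason.trim() === "")) {
    return `${result.status} result is missing its reason field`;
  }
  return null;
}
```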
|
|
361
233
|
|
|
362
|
-
|
|
234
|
+
## Phase Skip Criteria
|
|
363
235
|
|
|
364
|
-
|
|
365
|
-
2. **Create synthetic issue context** — title, acceptance criteria, and type — from the user's instruction.
|
|
366
|
-
3. **Run the Universal Sub-Agent Pipeline**: Phase 1 (Research) → Phase 2 (Implement) → Phase 3 (Review Loop) → Phase 4 (Final Quality).
|
|
367
|
-
4. For issue references in chat (e.g., "fix #5"), fetch issue details using the platform CLI (check `platform` in `.agents/hatch.json`) and use them as the task context instead of creating synthetic context:
|
|
368
|
-
- **GitHub:** `gh issue view`
|
|
369
|
-
- **Azure DevOps:** `az boards work-item show --id`
|
|
370
|
-
- **GitLab:** `glab issue view`
|
|
236
|
+
These criteria define when each pipeline phase can be safely skipped. All commands that use the pipeline MUST reference these criteria and must not invent command-specific skip rules.
|
|
371
237
|
|
|
372
|
-
|
|
238
|
+
| Phase | Can Skip When | Mandatory Minimum (even when skipped) |
|
|
239
|
+
|-------|--------------|--------------------------------------|
|
|
240
|
+
| **Phase 1 (Research)** | Trivial single-line edit (typo, comment, single-value config); Tier 1 single-file change with no cross-module impact; Research already cached in PipelineContext | Affected files identified (even via quick scan); existing tests noted |
|
|
241
|
+
| **Phase 2 (Implement)** | Never — implementation is always required for code changes | All changes via hatch3r-implementer (never inline except trivial items in quick-change) |
|
|
242
|
+
| **Phase 3 (Review)** | All items trivial (quick-change only); documentation-only change with no code | Quality checks (lint/typecheck/test) must pass; acceptance criteria verified |
|
|
243
|
+
| **Phase 4 (Quality)** | Review loop unresolved AND user chose manual resolution; documentation-only; all trivial + quality checks pass (quick-change only) | test-writer + security-auditor always required for code changes; quality checks must pass |
|
|
373
244
|
|
|
374
|
-
|
|
245
|
+
See `src/pipeline/pipelineContext.ts` for the programmatic `PHASE_SKIP_CRITERIA` constant.
|
|
375
246
|
|
|
376
|
-
|
|
247
|
+
## Root-Cause Depth Requirements
|
|
377
248
|
|
|
378
|
-
|
|
379
|
-
2. **Classify** each task by type (bug/feature/refactor/QA/other) based on context or explicit labels.
|
|
380
|
-
3. **Build a dependency graph** among the tasks. Independent tasks share the same level and run in parallel.
|
|
381
|
-
4. **Spawn one `hatch3r-researcher` subagent per task** (skip for trivial single-line edits only). Launch in parallel.
|
|
382
|
-
5. **Spawn one `hatch3r-implementer` subagent per task** per dependency level.
|
|
383
|
-
6. **For issue references**: fetch issue details using the platform CLI (check `platform` in `.agents/hatch.json`):
|
|
384
|
-
- **GitHub:** `gh issue view`
|
|
385
|
-
- **Azure DevOps:** `az boards work-item show --id`
|
|
386
|
-
- **GitLab:** `glab issue view`
|
|
387
|
-
7. **For natural language tasks**: create synthetic issue context (title, acceptance criteria, type) from the instruction. Pass this context to the implementer subagent.
|
|
388
|
-
8. **Run the review loop** (Phase 3) after all implementations complete: spawn reviewer, then fixer for Critical/Warning findings, re-review, repeat until clean (max 3 iterations).
|
|
389
|
-
9. **Spawn final quality subagents** (Phase 4, after review loop is clean): test-writer + security-auditor (always), plus docs-writer, auditors as applicable.
|
|
249
|
+
When a pipeline phase reports a failure or unexpected result, the orchestrator must perform root-cause classification before deciding the next action:
|
|
390
250
|
|
|
391
|
-
|
|
251
|
+
| Symptom | Shallow Fix (avoid) | Root-Cause Fix (required) |
|
|
252
|
+
|---------|---------------------|---------------------------|
|
|
253
|
+
| Test failure after Phase 2 | Disable or skip the failing test | Identify why the implementation breaks the test -- fix the code or update the test with justification |
|
|
254
|
+
| Lint errors after Phase 4 | Add `eslint-disable` comments | Fix the underlying code pattern that triggers the lint rule |
|
|
255
|
+
| Type errors after fixer changes | Cast with `as any` | Trace the type mismatch to its source and fix the type definition or usage |
|
|
256
|
+
| Review loop not converging | Surface to user after 3 iterations without analysis | Classify whether findings are oscillating (fixer A breaks what fixer B fixed) and surface the conflict pattern |
|
|
392
257
|
|
|
393
|
-
|
|
258
|
+
The orchestrator must reject superficial fixes from any subagent. If a fixer's output contains suppression patterns (disable comments, `any` casts, test skips without linked issues), classify as PARTIAL and re-run with an adjusted prompt that requests a root-cause fix.
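
Detecting suppression patterns in a fixer's added lines could be approximated with a few regular expressions. This is a rough sketch: the pattern list and function names are assumptions, the `.skip(` pattern assumes a Jest-, Vitest-, or Mocha-style test API, and the check for a linked issue on skipped tests is left out.

```typescript
// Hypothetical pattern list; extend it to match the project's linters and test runner.
const SUPPRESSION_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "lint disable comment", pattern: /eslint-disable/ },
  { name: "any cast", pattern: /\bas any\b/ },
  { name: "skipped test", pattern: /\b(?:it|test|describe)\.skip\(/ },
];

// Returns one entry per suppression found in the lines a fixer added.
function findSuppressions(addedLines: string[]): string[] {
  const hits: string[] = [];
  for (const line of addedLines) {
    for (const { name, pattern } of SUPPRESSION_PATTERNS) {
      if (pattern.test(line)) hits.push(`${name}: ${line.trim()}`);
    }
  }
  return hits;
}

// Fixer output containing any suppression is downgraded to PARTIAL.
function classifyFixerOutput(addedLines: string[]): "SUCCESS" | "PARTIAL" {
  return findSuppressions(addedLines).length > 0 ? "PARTIAL" : "SUCCESS";
}
```

A text match like this will produce false positives (for instance, the string "as any" inside a comment), so it is better suited to triggering a re-run than to blocking a merge on its own.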
|
|
394
259
|
|
|
395
|
-
|
|
260
|
+
## Task Context Protocols
|
|
396
261
|
|
|
397
|
-
|
|
398
|
-
2. **No destructive operations without prior approval** — verify the agent did not perform destructive operations (file deletions, database migrations, force-pushes, dependency removals) unless those operations were explicitly listed in the task prompt as approved actions. Any destructive operation not pre-approved MUST be flagged and rolled back before proceeding.
|
|
399
|
-
3. **Output schema compliance** — verify all agent outputs match their expected schemas. Researcher output must contain the required sections for its modes. Implementer output must include changed file paths and acceptance criteria status. Reviewer output must use the canonical severity scale. Malformed outputs MUST trigger a retry or escalation, not silent acceptance.
|
|
262
|
+
**Single-task plain chat:** Classify task type, create synthetic issue context, run full pipeline. For issue references, fetch details via platform CLI.
|
|
400
263
|
|
|
401
|
-
|
|
264
|
+
**Multi-task plain chat:** Parse into discrete tasks, classify each, build dependency graph, parallelize researchers and implementers per dependency level, run review loop after all implementations, then Phase 4 specialists. When parallel implementers modify the same file: accept disjoint region edits, merge overlapping regions using the larger-scope change as base, and halt on semantic conflicts (contradictory interface/contract changes) for user resolution.
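
Grouping tasks into dependency levels, so that each level can be parallelized, can be sketched with a simple layering pass. `dependencyLevels` and the task IDs are hypothetical; the input maps each task to the tasks it depends on.

```typescript
// Groups tasks into levels: every task in a level depends only on earlier levels.
function dependencyLevels(deps: Record<string, string[]>): string[][] {
  let remaining = Object.keys(deps).sort();
  const done: Record<string, boolean> = {};
  const levels: string[][] = [];

  while (remaining.length > 0) {
    // A task is ready once all of its dependencies have completed.
    const ready = remaining.filter((task) =>
      (deps[task] ?? []).every((dep) => done[dep] === true),
    );
    if (ready.length === 0) {
      throw new Error("Dependency cycle or unknown dependency among: " + remaining.join(", "));
    }
    for (const task of ready) done[task] = true;
    remaining = remaining.filter((task) => done[task] !== true);
    levels.push(ready);
  }
  return levels;
}
```

Tasks within one inner array have no dependencies on each other and can have their researchers and implementers launched in parallel. Failing fast on a cycle avoids an orchestrator that waits forever on tasks that can never become ready.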
|
|
402
265
|
|
|
403
|
-
|
|
266
|
+
**Auto-mode guardrails:** In unattended execution, verify scope containment, no unapproved destructive operations, and output schema compliance after each phase. Halt on violation. See `hatch3r-agent-orchestration-detail` for full guardrail specifications.
|
|
404
267
|
|
|
405
|
-
|
|
268
|
+
## Rule Application
|
|
406
269
|
|
|
407
|
-
|
|
408
|
-
|--------|---------|
|
|
409
|
-
| **SUCCESS** | Task completed fully, all acceptance criteria met. |
|
|
410
|
-
| **PARTIAL** | Task partially completed; some acceptance criteria met, others remain open or degraded. |
|
|
411
|
-
| **FAILED** | Task could not be completed; no usable output produced. |
|
|
412
|
-
| **SKIPPED** | Task was intentionally not executed (e.g., non-applicable phase, trivial edit bypass). |
|
|
413
|
-
| **TIMEOUT** | Task exceeded its time budget; partial results may be available. |
|
|
270
|
+
All `scope: always` rules apply to every task including subagent work. Include rule directives in subagent prompts.
|
|
414
271
|
|
|
415
|
-
|
|
272
|
+
### Tiered Rule Inclusion
|
|
416
273
|
|
|
417
|
-
|
|
274
|
+
**Tier 1 -- Always include (every subagent):**
|
|
275
|
+
- `hatch3r-security-patterns` -- security invariants
|
|
276
|
+
- `hatch3r-code-standards` -- code quality
|
|
418
277
|
|
|
419
|
-
|
|
278
|
+
**Tier 2 -- Include by phase:**
|
|
279
|
+
- `hatch3r-testing` -- test-writer, implementer, reviewer
|
|
280
|
+
- `hatch3r-accessibility-standards` -- a11y-auditor, reviewer (UI)
|
|
281
|
+
- `hatch3r-git-conventions` -- orchestrator git ops
|
|
282
|
+
- `hatch3r-ci-cd` -- ci-watcher, devops
|
|
283
|
+
- `hatch3r-dependency-management` -- dependency-auditor
|
|
420
284
|
|
|
421
|
-
|
|
285
|
+
**Tier 3 -- On-demand:**
|
|
286
|
+
- `hatch3r-api-design`, `hatch3r-secrets-management`, `hatch3r-data-classification`, `hatch3r-performance-budgets`, `hatch3r-browser-verification`, `hatch3r-component-conventions`, `hatch3r-i18n`, `hatch3r-theming`, `hatch3r-migrations`, `hatch3r-feature-flags`, `hatch3r-observability-logging`, `hatch3r-observability-metrics`, `hatch3r-observability-tracing`, `hatch3r-observability-tracing-detail`
|
|
422
287
|
|
|
423
|
-
|
|
424
|
-
|
|
425
|
-
**Tier 1 -- Always include (every subagent prompt):**
|
|
426
|
-
- `hatch3r-security-patterns` -- security invariants apply to all code changes
|
|
427
|
-
- `hatch3r-code-standards` -- code quality conventions apply universally
|
|
428
|
-
|
|
429
|
-
**Tier 2 -- Include by phase (match to the active agent):**
|
|
430
|
-
- `hatch3r-testing` -- include for `hatch3r-test-writer`, `hatch3r-implementer`, `hatch3r-reviewer`
|
|
431
|
-
- `hatch3r-accessibility-standards` -- include for `hatch3r-a11y-auditor`, `hatch3r-reviewer` (UI changes)
|
|
432
|
-
- `hatch3r-git-conventions` -- include for orchestrator git operations
|
|
433
|
-
- `hatch3r-ci-cd` -- include for `hatch3r-ci-watcher`, `hatch3r-devops`
|
|
434
|
-
- `hatch3r-dependency-management` -- include for `hatch3r-dependency-auditor`
|
|
435
|
-
|
|
436
|
-
**Tier 3 -- On-demand (reference only when the task context requires it):**
|
|
437
|
-
- `hatch3r-api-design` -- when designing or reviewing API contracts
|
|
438
|
-
- `hatch3r-secrets-management` -- when handling credentials or environment config
|
|
439
|
-
- `hatch3r-data-classification` -- when handling PII or sensitive data flows
|
|
440
|
-
- `hatch3r-performance-budgets` -- when profiling or reviewing performance
|
|
441
|
-
- `hatch3r-browser-verification` -- when verifying UI in browser
|
|
442
|
-
- `hatch3r-component-conventions` -- when writing UI components
|
|
443
|
-
- `hatch3r-i18n`, `hatch3r-theming`, `hatch3r-migrations`, `hatch3r-feature-flags`, `hatch3r-observability` -- when the task specifically touches these areas
|
|
444
|
-
|
|
445
|
-
For tools with limited context windows, Tier 1 rules are mandatory. Tier 2 and Tier 3 rules should be included selectively based on the subagent's role and the task scope to avoid exceeding token budgets.
|
|
288
|
+
For tools with limited context windows, Tier 1 is mandatory. Tier 2 and Tier 3 rules are included selectively by agent role and task scope.
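
Selecting rules for a subagent prompt under this tiering could be sketched as below. The `rulesFor` function and its parameters are hypothetical; the rule-to-agent mapping is copied from the Tier 2 list, with `hatch3r-git-conventions` left out because it applies to the orchestrator rather than to a subagent.

```typescript
const TIER1_RULES: string[] = ["hatch3r-security-patterns", "hatch3r-code-standards"];

// Tier 2 rules mapped to the agents that receive them.
const TIER2_AGENTS_BY_RULE: Record<string, string[]> = {
  "hatch3r-testing": ["hatch3r-test-writer", "hatch3r-implementer", "hatch3r-reviewer"],
  "hatch3r-accessibility-standards": ["hatch3r-a11y-auditor", "hatch3r-reviewer"],
  "hatch3r-ci-cd": ["hatch3r-ci-watcher", "hatch3r-devops"],
  "hatch3r-dependency-management": ["hatch3r-dependency-auditor"],
};

// Returns the rules to include in one subagent prompt.
// onDemand carries any Tier 3 rules the task context calls for.
function rulesFor(agent: string, uiChange: boolean = false, onDemand: string[] = []): string[] {
  const tier2 = Object.keys(TIER2_AGENTS_BY_RULE).filter((rule) => {
    const agents = TIER2_AGENTS_BY_RULE[rule] ?? [];
    if (agents.indexOf(agent) === -1) return false;
    // The reviewer only receives accessibility rules for UI changes.
    if (rule === "hatch3r-accessibility-standards" && agent === "hatch3r-reviewer") {
      return uiChange;
    }
    return true;
  });
  return [...TIER1_RULES, ...tier2, ...onDemand];
}
```

Because Tier 1 is prepended unconditionally, a caller trimming the list for a small context window can drop entries from the end without losing the mandatory rules.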
|