@crewpilot/agent 2.0.0 → 3.0.0
- package/README.md +131 -131
- package/dist-npm/cli.js +5 -5
- package/dist-npm/index.js +100 -100
- package/package.json +69 -69
- package/prompts/agent.md +282 -282
- package/prompts/copilot-instructions.md +36 -36
- package/prompts/{catalyst.config.json → crewpilot.config.json} +72 -72
- package/prompts/skills/assure-code-quality/SKILL.md +112 -112
- package/prompts/skills/assure-pr-intelligence/SKILL.md +148 -148
- package/prompts/skills/assure-review-functional/SKILL.md +114 -114
- package/prompts/skills/assure-review-standards/SKILL.md +106 -106
- package/prompts/skills/assure-threat-model/SKILL.md +182 -182
- package/prompts/skills/assure-vulnerability-scan/SKILL.md +146 -146
- package/prompts/skills/autopilot-meeting/SKILL.md +434 -434
- package/prompts/skills/autopilot-worker/SKILL.md +737 -737
- package/prompts/skills/daily-digest/SKILL.md +188 -188
- package/prompts/skills/deliver-change-management/SKILL.md +132 -132
- package/prompts/skills/deliver-deploy-guard/SKILL.md +144 -144
- package/prompts/skills/deliver-doc-governance/SKILL.md +130 -130
- package/prompts/skills/engineer-feature-builder/SKILL.md +270 -270
- package/prompts/skills/engineer-root-cause-analysis/SKILL.md +150 -150
- package/prompts/skills/engineer-test-first/SKILL.md +148 -148
- package/prompts/skills/insights-knowledge-base/SKILL.md +202 -202
- package/prompts/skills/insights-pattern-detection/SKILL.md +142 -142
- package/prompts/skills/strategize-architecture-planner/SKILL.md +141 -141
- package/prompts/skills/strategize-solution-design/SKILL.md +118 -118
- package/scripts/postinstall.js +108 -108
@@ -1,737 +1,737 @@
# Autopilot Worker

> **Pillar**: Orchestrate | **ID**: `autopilot-worker`

## Purpose

Single-command pipeline that creates a board issue, plans implementation, writes code + tests, applies the full Deliver pipeline (change-management → doc-governance → deploy-guard), opens a reviewed PR, and updates the board. One human gate: approve the plan. Everything else is automatic chaining through 12 skills. Includes label-gated design/architecture phases, bug-triggered root-cause analysis, and a continuous self-improvement loop via pattern detection + knowledge base.

## Activation Triggers

- autopilot, auto, pick up, work on, do this, implement and ship, end to end, full pipeline
- Routed from `feature-builder` Phase 0 when complexity is moderate or complex
- User provides a board issue number ("#42", "issue 42")

## Session Role Exception

This pipeline chains 12 skills across role boundaries (e.g. code-quality and vulnerability-scan in Phase 6 are Review skills, but run inside the Builder pipeline). **All skills invoked internally by this pipeline are unrestricted by the session role.** Role scoping only applies to user-initiated requests, not pipeline steps.

## Tools Required

- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `
- `mcp_workiq_ask_work_iq` — (optional, requires Work IQ extension) fetch M365 context (emails, docs, meetings) related to the task

## Methodology

### Process Flow

```dot
digraph autopilot_worker {
  rankdir=TB;
  node [shape=box];

  intake [label="Phase 1\nIntake & Issue Creation"];
  analysis [label="Phase 2\nCodebase Analysis & Planning"];
  design [label="Phase 2.5\nDesign & Architecture\n(label-gated)", style=dashed];
  rca [label="Phase 2.5c\nRoot Cause Analysis\n(bug label-gated)", style=dashed];
  threat [label="Phase 2.5d\nThreat Model\n(security label-gated)", style=dashed];
  plan_gate [label="Phase 3\nHUMAN GATE: Plan Approval", shape=diamond, style=filled, fillcolor="#ffcccc"];
  implement [label="Phase 4\nBranch & Implementation"];
  change_mgmt [label="Phase 5\nChange Management"];
  doc_gov [label="Phase 5b\nDoc Governance"];
  pr_review [label="Phase 6\nPR Creation & Auto-Review\n(5-stage)"];
  deploy_guard [label="Phase 7\nDeploy Guard\n(6 gates)"];
  complete [label="Phase 8\nCompletion & Learning", shape=doublecircle];
  fail [label="FAIL\nCircuit Breaker", shape=octagon, style=filled, fillcolor="#ff9999"];

  intake -> analysis;
  analysis -> design [label="needs-design\nor needs-architecture"];
  analysis -> rca [label="bug/defect/\nregression"];
  analysis -> threat [label="needs-threat-model\nor security-sensitive"];
  analysis -> plan_gate [label="no special labels"];
  design -> plan_gate;
  rca -> plan_gate;
  threat -> plan_gate;
  plan_gate -> implement [label="approved"];
  plan_gate -> fail [label="cancelled"];
  implement -> change_mgmt;
  implement -> fail [label="3 failures"];
  change_mgmt -> doc_gov;
  doc_gov -> pr_review;
  pr_review -> pr_review [label="issues found\nfix & re-run"];
  pr_review -> deploy_guard;
  deploy_guard -> complete [label="GO"];
  deploy_guard -> pr_review [label="NO-GO\nfix blockers"];
  complete -> complete [label="store knowledge\nself-improvement loop"];
}
```
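
As a rough illustration of the label-gated edges in the graph, the routing decision could be sketched as follows. This is a minimal sketch, not the package's implementation: `gatedPhasesFor` and the phase identifiers are hypothetical names, and it assumes an issue carrying several gating labels runs every matching phase in the order listed.

```typescript
// Hypothetical phase identifiers; the label names are taken from the graph.
type GatedPhase = "design" | "architecture" | "rca" | "threat-model";

// Returns the optional pre-plan phases that an issue's labels would trigger.
function gatedPhasesFor(labels: string[]): GatedPhase[] {
  const has = (...names: string[]): boolean =>
    names.some((n) => labels.indexOf(n) !== -1);

  const phases: GatedPhase[] = [];
  if (has("needs-design")) phases.push("design");
  if (has("needs-architecture")) phases.push("architecture");
  if (has("bug", "defect", "regression")) phases.push("rca");
  if (has("needs-threat-model", "security-sensitive")) phases.push("threat-model");
  return phases; // an empty list means: go straight to the Phase 3 plan gate
}
```

An issue labelled only `feature` would yield an empty list and proceed directly to plan approval.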

### Phase 1 — Intake & Issue Creation

**First interaction hint:** If this is the first interaction in the session, start with:
> 💡 *Running

**Entry mode detection** — the worker can be entered four ways:

| Entry Mode | How to Detect | Behavior |
|---|---|---|
| **Direct** | User says "autopilot", "full pipeline", etc. | Run full pipeline from Phase 1 |
| **Routed from feature-builder** | feature-builder's Phase 0 classified as moderate/complex | Skip re-analyzing complexity — it's already assessed. Use the context feature-builder gathered. |
| **Mid-build escalation** | feature-builder discovered more complexity during Phase 4 | Accept the partial context (files already touched, patterns found). Start from Phase 2 (planning) with what's already known. |
| **Session resume** | User says "resume", "continue", "pick up where I left off" | Call `

**Session resume flow**: When resuming, the agent should:
1. Call `
2. Call `
3. Read relevant artifacts with `
4. **(Optional) Calendar-aware context refresh**: If `mcp_workiq_ask_work_iq` is available and significant time has passed since the session was saved (overnight, weekend, or >4 hours):
   - Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent)
   - **Check for new context**: `mcp_workiq_ask_work_iq` → "What meetings, emails, or Teams messages about {issue title / feature} happened since {saved_at timestamp}? Summarize any new decisions, requirement changes, or blockers."
   - **Check calendar conflicts**: `mcp_workiq_ask_work_iq` → "Do I have any meetings in the next 2 hours that might affect my availability?"
   - If new decisions or requirement changes are found, flag them to the user before continuing:
     ```
     📅 Context Update (since session was saved {age} ago):
     - {new decision / requirement change / blocker}
     → Continue with current plan? (yes / re-plan)
     ```
   - If unavailable, skip — resume proceeds without M365 context refresh.
5. Continue from the first pending action in the saved state
6. Do NOT re-run phases that have already completed (check artifacts_written)
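
The two checks in the resume flow (has enough time passed to refresh context, and which phases are still pending) can be sketched roughly as below. The function names are hypothetical, only the ">4 hours" rule is modelled (not the overnight or weekend heuristics), and the sketch assumes each phase writes one artifact named after it, which the source does not state.

```typescript
const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

// True when the saved session is older than four hours.
function needsContextRefresh(savedAtIso: string, nowIso: string): boolean {
  return Date.parse(nowIso) - Date.parse(savedAtIso) > FOUR_HOURS_MS;
}

// Phases with no written artifact yet, in pipeline order.
// Assumes artifact names match phase names (an illustrative simplification).
function pendingPhases(pipeline: string[], artifactsWritten: string[]): string[] {
  return pipeline.filter((phase) => artifactsWritten.indexOf(phase) === -1);
}
```

Resuming would then continue from the first entry of `pendingPhases(...)` rather than re-running completed work.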

**Complexity check (direct entry only):** If the user enters autopilot directly, quickly assess if the request warrants the full pipeline:
- If the request is trivial (single file, obvious change) → suggest: *"This is a small change. I can implement it directly without the full pipeline. Want me to do that instead?"*
- If the user says "just do it" → hand off to `feature-builder` (which will handle it as trivial/simple tier).
- Otherwise → continue with the full pipeline below.

**If user provides a task description (not an existing issue number):**

1. Parse the user's request to extract:
   - Title (concise, action-oriented)
   - Description (what needs to be built)
   - Acceptance criteria (bullet list — infer from description if not explicit)
   - Labels (feature, bug, chore — infer from context)

<HARD-GATE>
2. **HUMAN GATE — Task Creation Confirmation**: Present the inferred task summary to the user BEFORE creating the board issue:

   ```
   📋 Before I start, here's what I'll create as a board issue:

   Title: {title}
   Description: {description}

   Acceptance Criteria:
   - [ ] {criterion 1}
   - [ ] {criterion 2}
   - [ ] {criterion 3}

   Labels: {labels}

   → Create this task and start the pipeline? (yes / edit / no)
   ```

   - If **yes** → call `
   - If **edit** → user provides corrections, update and re-present
   - If **no** → stop the pipeline. Ask the user what they'd like to do instead.
   - Do NOT create the board issue without explicit user confirmation.
</HARD-GATE>

3. Call `
4. Note the created issue ID

**If user provides an existing issue number (e.g., "#42"):**

1. Call `
2. Use its title, description, and acceptance criteria as-is
3. No confirmation needed — the task already exists

### Phase 2 — Codebase Analysis & Planning

1. Read the project structure — scan key files (package.json, tsconfig, src/ layout, existing patterns)
2. Identify:
   - Which files need to be **created**
   - Which files need to be **modified**
   - What patterns/conventions the codebase follows (naming, directory structure, test style)
   - What dependencies might be needed
3. Check issue labels for `needs-design`, `needs-architecture`, `bug`/`defect`/`regression`, and `needs-threat-model`/`security-sensitive`
4. **Query pattern knowledge** via `
   - Search for known patterns and anti-patterns in the files being modified
   - Search for past root causes in the same area of the codebase
   - Collect any "repeat offender" warnings from previous runs
   - Feed this context into the plan so the worker avoids known mistakes
5. **(Optional) Fetch M365 requirements context**: First call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent), then use **focused queries** to surface requirements context before planning:
   - **Requirements & specs**: `mcp_workiq_ask_work_iq` → "Find emails, documents, and Teams messages about: {issue title}. Summarize relevant discussions, specs, and design docs."
   - **Meeting decisions**: `mcp_workiq_ask_work_iq` → "What decisions were made about {issue title / feature name} in recent meetings? What requirements were stated?"
   - **Stakeholder expectations**: `mcp_workiq_ask_work_iq` → "What did stakeholders or customers say about {feature} in recent emails or meetings? What was promised or committed?"
   - Feed the M365 context into the analysis artifact so Phase 3's plan addresses stated requirements, not just the issue description.
   - If `mcp_workiq_ask_work_iq` is unavailable, skip — this step is optional.
6. Call `
7. **Write artifact**: Call `
   - Files to create/modify
   - Codebase patterns discovered
   - Dependencies needed
   - Label-gated phases to run
   - Known patterns/anti-patterns from knowledge search

### Phase 2.5 — Design & Architecture (label-gated)

**Skip this phase entirely if the issue has neither `needs-design` nor `needs-architecture` label.**

Check the issue labels (from `

#### If issue has `needs-design` label:

**Load and follow** `.github/skills/strategize-solution-design/SKILL.md`:

1. Frame the problem — restate in one sentence with constraints
2. Generate 3-4 distinct approaches with strengths, risks, and effort
3. Build a trade-off matrix comparing all options
4. Present to user:

   ```
   📐 Design Phase for: "{issue title}"

   {trade-off matrix}

   Recommendation: {option} (Confidence: {N}/10)
   Reversal cost: {Low/Medium/High}

   → Which approach? (A / B / C / edit)
   ```

5. **HUMAN GATE**: User picks an approach
6. Store the decision via `
7. Write the design document to `docs/design/{issue_id}-{slug}.md`:
   ```markdown
   # Design: {issue title}

   **Issue**: #{id}
   **Date**: {date}
   **Decision**: {chosen option}

   ## Problem
   {one-sentence problem statement}

   ## Options Considered
   {options with strengths/risks/effort}

   ## Trade-off Matrix
   {matrix}

   ## Decision
   {chosen option with rationale}
   Confidence: {N}/10 | Reversal cost: {Low/Medium/High}
   ```
8. Stage the design doc — it will be committed alongside the code in Phase 5
9. **Write artifact**: Call `

#### If issue has `needs-architecture` label:

**Load and follow** `.github/skills/strategize-architecture-planner/SKILL.md`:

1. Define scope — system boundaries, actors, quality attributes
2. Decompose into components with responsibilities and interfaces
3. Trace the primary data flow through the system
4. Create an implementation roadmap with milestones
5. Present to user:

   ```
   📐 Architecture for: "{issue title}"

   Components:
   | Component | Responsibility | Interface | Dependencies |
   |-----------|---------------|-----------|-------------|
   | ... | ... | ... | ... |

   Data Flow:
   1. {step} → {step} → {step}

   → Approve architecture? (yes / edit)
   ```

6. **HUMAN GATE**: User approves the architecture
7. Store as knowledge (type: decision)
8. Write the ADR to `docs/adr/{NNN}-{slug}.md`:
   ```markdown
   # ADR-{NNN}: {title}

   ## Status: Accepted
   ## Context
   {why this design was needed}
   ## Decision
   {what was decided — components, data flow, interfaces}
   ## Consequences
   {positive and negative trade-offs}
   ## Alternatives Considered
   {rejected options and why}
   ```
9. Stage the ADR — it will be committed alongside the code in Phase 5
10. **Write artifact**: Call `

#### If issue has BOTH labels:

Run `needs-design` first (pick the approach), then `needs-architecture` (detail the design).
The design decision feeds into the architecture — e.g., "we chose Redis" → architecture shows CacheService component, middleware chain, config interface.

### Phase 2.5c — Root Cause Analysis (label-gated)

**Skip if the issue does NOT have a `bug`, `defect`, or `regression` label.**

**Load and follow** `.github/skills/engineer-root-cause-analysis/SKILL.md` methodology:

1. **Symptom collection**:
   - Extract error message, stack trace, steps to reproduce from the issue description
   - Run `
   - Query `
2. **Hypothesis generation** — generate 2-3 ranked hypotheses:

   ```
   🔍 RCA for: "{issue title}"

   | # | Hypothesis | Likelihood | Evidence | Test Strategy |
   |---|---|---|---|---|
   | H1 | {most likely} | High | {evidence} | {how to test} |
   | H2 | {alternative} | Medium | {evidence} | {how to test} |
   | H3 | {edge case} | Low | {evidence} | {how to test} |
   ```

3. **Systematic elimination** — for each hypothesis (highest first):
   - Run `
   - Record result: confirmed / eliminated / narrowed
   - Max 5 attempts total (circuit breaker — same as Phase 4)
4. **Root cause identification**:
   - State in one sentence
   - Causal chain: trigger → intermediate effects → symptom
   - Design gap: WHY the code was vulnerable
5. **Feed into Phase 3 plan**:
   - The plan must fix the root cause, not just the symptom
   - Include a regression test that fails without the fix
   - Phase 5 commit footer: `Root-cause: {one-sentence description}`
6. **Store root cause** via `
   - What: the root cause description
   - Where: affected files/modules
   - Why: the design gap
   - Prevention: what would have caught this earlier
7. **Write artifact**: Call `
8. **If root cause reveals a systemic issue**, flag it for pattern detection in Phase 6:
   - Add note: `systemic:{description}` for Phase 6 to pick up
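
The elimination loop in step 3 could look roughly like the sketch below. All names here are hypothetical, and the choice to re-queue a "narrowed" hypothesis for another test is an assumption made for illustration; the source only states the ordering (highest likelihood first), the three outcomes, and the cap of 5 attempts.

```typescript
type Likelihood = "High" | "Medium" | "Low";
type TestOutcome = "confirmed" | "eliminated" | "narrowed";

interface RcaHypothesis {
  id: string;
  likelihood: Likelihood;
}

interface RcaResult {
  confirmed: string | null; // id of the confirmed hypothesis, if any
  attempts: number;
}

const LIKELIHOOD_RANK: Record<Likelihood, number> = { High: 0, Medium: 1, Low: 2 };

function eliminateHypotheses(
  hypotheses: RcaHypothesis[],
  runTest: (h: RcaHypothesis) => TestOutcome,
  maxAttempts: number = 5
): RcaResult {
  // Test the most likely hypothesis first; slice() leaves the input untouched.
  const queue = hypotheses
    .slice()
    .sort((a, b) => LIKELIHOOD_RANK[a.likelihood] - LIKELIHOOD_RANK[b.likelihood]);
  let attempts = 0;
  while (queue.length > 0 && attempts < maxAttempts) {
    const current = queue.shift() as RcaHypothesis;
    attempts += 1;
    const outcome = runTest(current);
    if (outcome === "confirmed") return { confirmed: current.id, attempts };
    if (outcome === "narrowed") queue.push(current); // assumption: retest later
  }
  return { confirmed: null, attempts }; // cap reached or all hypotheses eliminated
}
```

A `null` result after the cap is the point at which the pipeline would stop and report to the user instead of guessing.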

### Phase 2.5d — Threat Modeling (label-gated)

**Skip if the issue does NOT have a `needs-threat-model` or `security-sensitive` label.**

**Load and follow** `.github/skills/assure-threat-model/SKILL.md` methodology:

1. **Read prior artifacts**: Load the `analysis` artifact (and `architecture` if it exists) to understand the system being built
2. **Scope the model**: Define the trust boundaries and data flows for the feature being implemented
3. **STRIDE analysis**: For each component and data flow crossing a trust boundary, evaluate all 6 STRIDE categories
4. **Risk assessment**: Score each threat (Likelihood × Impact = Risk)
5. **Mitigation planning**: For threats with risk ≥ 7, propose specific mitigations with effort and implementation phase
6. **Present to user**:

   ```
   🛡️ Threat Model for: "{issue title}"

   | ID | STRIDE | Component | Threat | Risk Score | Mitigation |
   |----|--------|-----------|--------|------------|------------|
   | T1 | ... | ... | ... | ... | ... |

   Critical threats: {count}
   Required mitigations before implementation: {list}

   → Approve threat model? (yes / edit)
   ```

7. **HUMAN GATE**: User approves the threat model
8. Store via `
9. **Write artifact**: Call `
10. Feed critical/high-risk mitigations into Phase 3 plan as mandatory implementation steps
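
Steps 4 and 5 amount to a multiply-and-filter. A minimal sketch is shown below, assuming likelihood and impact are small integer ratings; the source does not state the rating scale, and `ThreatEntry` and both function names are hypothetical.

```typescript
interface ThreatEntry {
  id: string;
  likelihood: number; // assumed integer rating; scale not given in the source
  impact: number;     // assumed integer rating; scale not given in the source
}

// Risk = Likelihood x Impact, as stated in step 4.
function riskScore(t: ThreatEntry): number {
  return t.likelihood * t.impact;
}

// Threats at or above the threshold (7 in step 5), highest risk first.
function threatsNeedingMitigation(
  threats: ThreatEntry[],
  threshold: number = 7
): ThreatEntry[] {
  return threats
    .filter((t) => riskScore(t) >= threshold)
    .sort((a, b) => riskScore(b) - riskScore(a));
}
```

The returned list corresponds to the "Required mitigations before implementation" line of the threat model summary.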

#### After design/architecture/RCA/threat-model phases:

The design documents, RCA findings, and threat model inform the implementation plan. Phase 3's plan should reference:
- Which approach was chosen (from design doc)
- Which components to build (from architecture)
- Which interfaces to implement (from ADR)
- What root cause was found (from RCA) and what fix addresses it
- What threats were identified (from threat model) and what mitigations are required

**Read prior artifacts**: Call `

### Phase 3 — HUMAN GATE: Plan Approval

<HARD-GATE>
Do NOT proceed to implementation until the user has explicitly approved the plan.
Do NOT skip this gate for any reason, regardless of perceived simplicity.
If the user says "just do it" without seeing the plan, present the plan anyway.
</HARD-GATE>

**STOP HERE. Present the plan to the user:**

```
📋 Autopilot Plan for: "{issue title}"

Issue: #{id} on {board provider}
{if design doc exists: "Design: docs/design/{file}.md"}
{if ADR exists: "Architecture: docs/adr/{file}.md"}

Steps:
1. {step description}
2. {step description}
...

Files to change:
- {path} (create/modify)
- {path} (create/modify)

Complexity: {trivial|simple|moderate|complex}

Approve? (yes / edit / cancel)
```

- If **yes** → call `
- If **edit** → user provides changes, update plan, re-present
- If **cancel** → call `

**Write artifact**: After approval, call `

**Session checkpoint**: After plan approval, call `

### Phase 4 — Branch & Implementation

**Read prior artifacts**: Call `

1. Call `
2. Call `
3. **For each step in the plan:**
   a. Implement the code change (create/modify files)
   b. Follow existing codebase patterns discovered in Phase 2
   c. After each logical unit, run `
   d. If tests fail, diagnose and fix (max 3 attempts per step — circuit breaker)
4. Write tests for new code:
   - Match existing test framework and conventions
   - Cover happy path + key edge cases
   - Run tests to confirm they pass

**Circuit breaker:** If any step fails 3 times consecutively:
- Call `
- Call `
- Tell the user what went wrong and which step is stuck
- STOP. Do not continue.
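
The per-step circuit breaker can be sketched as a bounded retry loop. This is an illustrative sketch only: `runStepWithBreaker` and `StepOutcome` are hypothetical names, and the `attempt` callback stands in for "apply the fix and run the tests", returning whether the tests passed.

```typescript
interface StepOutcome {
  ok: boolean;
  attempts: number;
}

// Runs one plan step, retrying after each failure, up to maxAttempts times.
function runStepWithBreaker(
  attempt: () => boolean,
  maxAttempts: number = 3
): StepOutcome {
  for (let n = 1; n <= maxAttempts; n++) {
    if (attempt()) return { ok: true, attempts: n };
  }
  // Breaker tripped: the caller should report the stuck step and stop.
  return { ok: false, attempts: maxAttempts };
}
```

When `ok` is `false`, the pipeline would update the board, tell the user which step is stuck, and halt rather than move on to the next step.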

### Phase 5 — Change Management (Deliver Skill #1)

**Load and follow** `.github/skills/deliver-change-management/SKILL.md` methodology:

1. Run `
2. Categorize changes by type: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
3. **If changes span multiple logical units** (e.g., new feature + test + config):
   - Split into separate commits with `
   - Each commit gets its own conventional message
   - Example:
     ```
     git add src/feature.ts
     → feat(scope): add feature X (closes #ID)

     git add tests/feature.test.ts
     → test(scope): add tests for feature X

     git add docs/api.md
     → docs(scope): update API docs for feature X
     ```
4. **If changes are a single logical unit**, create one commit:
   - Format: `feat(scope): description (closes #ID)`
   - Body: what was implemented and why
   - Footer: `Closes #ID`
5. Call `
6. **Write artifact**: Call `
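
To make the commit conventions concrete, here is a small sketch that builds the subject line in the `type(scope): description (closes #ID)` format and guesses a commit type from a file path. Both helpers are hypothetical, and the path rules (test and docs directories, falling back to `feat`) are assumptions for illustration; a real run would categorize changes from the diff itself.

```typescript
type CommitType = "feat" | "fix" | "refactor" | "test" | "docs" | "chore";

// Builds a conventional commit subject, optionally closing an issue.
function commitSubject(
  type: CommitType,
  scope: string,
  description: string,
  issueId?: number
): string {
  const suffix = issueId === undefined ? "" : ` (closes #${issueId})`;
  return `${type}(${scope}): ${description}${suffix}`;
}

// Assumed heuristic: classify a changed file by its path.
function commitTypeForPath(filePath: string): CommitType {
  if (/(^|\/)tests?\//.test(filePath) || /\.test\.[jt]sx?$/.test(filePath)) return "test";
  if (/(^|\/)docs\//.test(filePath) || /\.md$/.test(filePath)) return "docs";
  return "feat"; // fallback only; could equally be fix, refactor, or chore
}
```

Grouping changed files by `commitTypeForPath` gives one candidate commit per logical unit, mirroring the three-commit example in step 3.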

### Phase 5b — Doc Governance (Deliver Skill #2)

**Load and follow** `.github/skills/deliver-doc-governance/SKILL.md` methodology:

1. Check if the changes affect any **public interfaces**:
   - New/changed API endpoints
   - New/changed CLI commands
   - New/changed configuration options
   - New/changed tool signatures
   - New/changed exports or public functions
2. If public interfaces changed, run drift detection:
   - Compare README against actual project structure and features
   - Compare API docs against actual function signatures
   - Check if code examples still work
   - Verify install/setup instructions are still accurate
3. **If drift found:**
   - Fix the documentation directly (same branch)
   - Stage and commit: `docs(scope): sync docs with implementation changes`
   - Add to the PR body: `### Documentation Updated` section listing what was synced
4. **If no public interfaces changed**, skip — note "No doc changes needed" in the PR body

### Phase 6 — PR Creation & Auto-Review

1. Call `
   - Title: primary commit message
   - Body: markdown with sections:
     - **What**: summary of changes
     - **Why**: linked to issue #{ID}
     - **Changes**: list of commits with descriptions
     - **Documentation Updated**: what docs were synced (or "N/A")
     - **How to test**: steps to verify
     - **Checklist**: tests pass, lint clean, types clean, docs synced
<HARD-GATE>
2. **HUMAN GATE**: User reviews the preview — do NOT create the PR until the user approves.
If the user requests changes, apply them and re-preview. Never skip this gate.
</HARD-GATE>
3. Call `
4. **Run PR Intelligence** (read `.github/skills/assure-pr-intelligence/SKILL.md`):
   - **Change inventory**: categorize changed files (core, api, test, config, docs)
   - **Risk assessment**: evaluate scope, complexity, blast radius, test coverage, reversibility → Low/Medium/High/Critical risk score
   - **Reviewer guidance**: order files by review priority, flag lines needing attention, list questions the reviewer should ask, note what's missing from the PR
   - **Merge readiness checklist**: tests pass, security clean, breaking changes documented, PR description matches changes
   - Post the full PR Intelligence report as a **comment on the PR** so the assigned reviewer sees it immediately
5. Read the diff of the PR
6. **Subagent delegation (recommended for moderate/complex changes):** Use `
   - Delegate `code-reviewer` role with the diff and file list — receives correctness, security, and performance findings
   - Delegate `standards-reviewer` role with the diff and codebase conventions — receives standards compliance findings
   - Delegate `security-auditor` role with source files and architecture context — receives STRIDE/OWASP findings
   - Each subagent writes its output as an artifact (e.g. `review-functional`, `review-standards`) for traceability
   - Merge subagent findings using `

**Fallback (simple changes):** Run reviews inline without subagent delegation:
7. Run **code-quality** review internally (read `.github/skills/assure-code-quality/SKILL.md`):
   - Correctness: does the code do what the acceptance criteria say?
   - Security: any obvious vulnerabilities (SQL injection, XSS, secrets)?
   - Performance: any N+1 queries, await-in-loops, unnecessary re-renders?
   - Style: does it match codebase conventions?
7. Run **vulnerability-scan** internally (read `.github/skills/assure-vulnerability-scan/SKILL.md`):
   - OWASP Top 10 quick check on new code
   - Dependency audit: `npm audit` or `pip-audit`
8. Run `
8b. **(Optional) Requirements alignment validation**: If M365 context was fetched in Phase 2, validate the implementation against meeting-stated requirements:
    - Read the `analysis` artifact to retrieve the M365 requirements context captured earlier
    - If the analysis artifact contains meeting decisions or stakeholder expectations, call `mcp_workiq_ask_work_iq` → "What specific requirements and acceptance criteria were stated for {feature} in meetings and emails?"
    - Cross-reference each stated requirement against the implementation diff:
      - **Covered**: the requirement is addressed by the code changes ✓
      - **Partial**: the requirement is partially addressed — flag what's missing
      - **Missing**: the requirement is not addressed at all — flag as a review finding
    - Include requirements alignment in the PR comment:
      ```
      📋 Requirements Alignment:
      Meeting requirements checked: {N}
      Covered: {count} ✓ | Partial: {count} ⚠️ | Missing: {count} ❌
      {list any partial/missing items}
      ```
    - If critical requirements are missing, flag as a review issue that must be addressed before merge
9. **Run diff-scoped pattern detection** (read `.github/skills/insights-pattern-detection/SKILL.md`):
   - Scope: only scan files changed in the diff (NOT full codebase)
   - Check for **consistency** with existing codebase patterns:
     - Error handling style matches project conventions?
     - Data access patterns match?
     - Naming conventions followed?
     - Test structure matches existing tests?
   - Check for **anti-patterns** in changed files:
     - God object/file (single file > 500 lines with mixed responsibilities)
     - Copy-paste (near-duplicate code blocks)
     - Shotgun surgery (small change touching too many files)
     - Primitive obsession (strings/numbers where domain types belong)
   - **Query knowledge base for repeat offenses**:
     - `
     - If a repeat offense is found, flag prominently:
       ```
       ⚠️ Recurring Pattern Issue: {description}
       Previously flagged in: {previous context}
       Suggestion: Consider a structural fix.
       ```
   - Run `
   - Include pattern findings in the PR comment:
     ```
     🔎 Pattern Detection Results:
     Consistency: {✓ follows codebase patterns | ⚠️ deviations found}
     Anti-patterns: {✓ none | ⚠️ {list}}
     Repeat issues: {✓ none | ⚠️ {count} recurring}
     Complexity: {✓ within threshold | ⚠️ {files} above limit}
     ```
10. **If issues found (review, security, or pattern):**
    - Fix them directly
    - Re-commit: `fix(scope): address review findings`
    - Re-push
    - Re-run pattern detection on the fix to confirm resolution
11. **Write artifact**: Call `
12. Call `
12. Call `
13. Call `
|
|
593
|
-
|
|
594
|
-
### Phase 7 — Deploy Guard (Deliver Skill #3)
|
|
595
|
-
|
|
596
|
-
**Load and follow** `.github/skills/deliver-deploy-guard/SKILL.md` methodology:
|
|
597
|
-
|
|
598
|
-
Before marking ready to merge, run the 6-gate checklist:
|
|
599
|
-
|
|
600
|
-
1. **Code Quality Gate**: No leftover TODOs, console.logs, or commented-out code in changed files
|
|
601
|
-
2. **Test Integrity Gate**: All tests pass, coverage meets threshold, no `.skip` tests
|
|
602
|
-
3. **Security Gate**: No hardcoded secrets, no critical CVEs, no unsafe patterns
|
|
603
|
-
4. **Configuration Gate**: Env vars documented, no dev config in prod paths
|
|
604
|
-
5. **Breaking Changes Gate**: API contracts backward-compatible, no dropped exports
|
|
605
|
-
6. **Operational Readiness Gate**: Health endpoints, logging, error handling
|
|
606
|
-
|
|
607
|
-
Produce a verdict and include in the PR comment:
|
|
608
|
-
|
|
609
|
-
```
|
|
610
|
-
🛡️ Deploy Guard Results:
|
|
611
|
-
Code Quality: ✓ pass
|
|
612
|
-
Test Integrity: ✓ pass (coverage: 86%)
|
|
613
|
-
Security: ✓ pass
|
|
614
|
-
Configuration: ✓ pass
|
|
615
|
-
Breaking Changes: ✓ pass
|
|
616
|
-
Operational: ✓ pass
|
|
617
|
-
|
|
618
|
-
Verdict: GO ✅
|
|
619
|
-
```
|
|
620
|
-
|
|
621
|
-
- If **GO** → proceed to Phase 8
|
|
622
|
-
- If **CONDITIONAL** → list warnings in PR comment, proceed (human decides)
|
|
623
|
-
- If **NO-GO** → fix blockers, re-run until GO or escalate to user
|
|
624
|
-
|
|
625
|
-
**Write artifact**: Call `
|
|
626
|
-
|
|
627
|
-
### Phase 8 — Completion & Learning
|
|
628
|
-
|
|
629
|
-
1. Call `
|
|
630
|
-
2. **Store knowledge** via `
|
|
631
|
-
- Decisions made during implementation (type: `decision`)
|
|
632
|
-
- Root cause findings, if this was a bug fix (type: `root-cause`)
|
|
633
|
-
- **Pattern findings** from Phase 6 (type: `pattern`):
|
|
634
|
-
- What patterns were followed or violated
|
|
635
|
-
- Any anti-patterns found and fixed
|
|
636
|
-
- Any repeat offenses detected
|
|
637
|
-
- Complexity hotspots
|
|
638
|
-
- This creates the **self-improvement loop**: future runs query this data in Phase 2 to avoid repeating the same mistakes
|
|
639
|
-
3. Present final summary to user:
|
|
640
|
-
|
|
641
|
-
```
|
|
642
|
-
✅ Autopilot Complete
|
|
643
|
-
|
|
644
|
-
Issue: #{id} — {title}
|
|
645
|
-
Branch: {branch_name}
|
|
646
|
-
PR: #{pr_number}
|
|
647
|
-
Status: Ready to merge
|
|
648
|
-
|
|
649
|
-
Changes:
|
|
650
|
-
- {N} commits across {M} files
|
|
651
|
-
- {file} (created/modified) — {what changed}
|
|
652
|
-
|
|
653
|
-
Deliver Pipeline:
|
|
654
|
-
Change Mgmt: {N} conventional commits (feat/fix/test/docs)
|
|
655
|
-
Doc Sync: {updated | no changes needed}
|
|
656
|
-
Deploy Guard: {GO | CONDITIONAL — warnings}
|
|
657
|
-
|
|
658
|
-
{if bug fix:}
|
|
659
|
-
Root Cause: {one-sentence root cause}
|
|
660
|
-
Design Gap: {why it was vulnerable}
|
|
661
|
-
Prevention: {what would catch this earlier}
|
|
662
|
-
|
|
663
|
-
Tests: {X} passing | Coverage: {Y}%
|
|
664
|
-
Review: Auto-reviewed — code-quality + vulnerability-scan
|
|
665
|
-
Security: No issues found
|
|
666
|
-
Patterns: {✓ clean | ⚠️ {count} findings — stored for future runs}
|
|
667
|
-
Repeat Issues: {none | {count} recurring patterns detected}
|
|
668
|
-
|
|
669
|
-
→ Merge when ready. Board will auto-update on close.
|
|
670
|
-
```
|
|
671
|
-
|
|
672
|
-
4. **Write artifact**: Call `
|
|
673
|
-
5. Call `
|
|
674
|
-
|
|
675
|
-
### Capability Hints (on completion)
|
|
676
|
-
|
|
677
|
-
After presenting the final summary, append **one** contextual hint based on the session. Show each hint at most once per session.
|
|
678
|
-
|
|
679
|
-
| Context | Hint |
|
|
680
|
-
|---|---|
|
|
681
|
-
| First time user ran autopilot | 💡 *I can also parse meeting transcripts into user stories and epics — say "parse meeting" with your notes.* |
|
|
682
|
-
| Multiple autopilot runs completed | 💡 *I can generate a daily digest summarizing all your work — say "daily digest" or "eod report".* |
|
|
683
|
-
| Knowledge was stored during this run | 💡 *I remember decisions across sessions. Ask "what did we decide about X" anytime to recall.* |
|
|
684
|
-
| Pattern issues were detected | 💡 *I can run a full codebase health scan for anti-patterns and tech debt — say "codebase health".* |
|
|
685
|
-
|
|
686
|
-
## Output Format
|
|
687
|
-
|
|
688
|
-
Always use the structured format shown in each phase. Lead with the status emoji:
|
|
689
|
-
- 📋 = planning
|
|
690
|
-
- ⚠️ = waiting for approval
|
|
691
|
-
- 🔨 = implementing
|
|
692
|
-
- 🔍 = reviewing
|
|
693
|
-
- ✅ = done
|
|
694
|
-
- ✗ = failed
|
|
695
|
-
|
|
696
|
-
## Anti-Patterns
|
|
697
|
-
|
|
698
|
-
<HARD-GATE>
|
|
699
|
-
- Do NOT skip the human gate (Phase 3). The plan MUST be shown and approved.
|
|
700
|
-
- Do NOT auto-merge the PR. Only humans merge.
|
|
701
|
-
- Do NOT bypass the PR preview gate (Phase 6). The user MUST see the preview.
|
|
702
|
-
</HARD-GATE>
|
|
703
|
-
- Do NOT continue after 3 consecutive failures on a step. Escalate to human.
|
|
704
|
-
- Do NOT install new dependencies without mentioning them in the plan.
|
|
705
|
-
- Do NOT modify files outside the scope of the plan without asking.
|
|
706
|
-
- Do NOT generate placeholder/stub code. Every file must be functional.
|
|
707
|
-
- Do NOT skip tests. If the project has a test framework, write tests.
|
|
708
|
-
|
|
709
|
-
## No Placeholders
|
|
710
|
-
|
|
711
|
-
Every step in the Phase 3 plan and every file produced in Phase 4 must contain real, working content. The following are **plan failures** — never write them:
|
|
712
|
-
|
|
713
|
-
| Forbidden Pattern | Why It Fails |
|
|
714
|
-
|---|---|
|
|
715
|
-
| "TBD", "TODO", "implement later" | Defers work that should be done now |
|
|
716
|
-
| "Add appropriate error handling" | Vague — specify which errors and how to handle them |
|
|
717
|
-
| "Add validation" | Which inputs? What rules? What error messages? |
|
|
718
|
-
| "Handle edge cases" | Name the edge cases or don't mention them |
|
|
719
|
-
| "Write tests for the above" | Show the actual test code |
|
|
720
|
-
| "Similar to Phase N" | Repeat the details — context resets between phases |
|
|
721
|
-
| Steps without code blocks | If a step changes code, show the code |
|
|
722
|
-
| References to undefined types/functions | Every symbol must trace back to an earlier step |
|
|
723
|
-
|
|
724
|
-
## Chains To
|
|
725
|
-
|
|
726
|
-
- `solution-design` — Phase 2.5: generate solution design doc when `needs-design` label detected
|
|
727
|
-
- `architecture-planner` — Phase 2.5: generate ADR when `needs-architecture` label detected
|
|
728
|
-
- `root-cause-analysis` — Phase 2.5c: systematic RCA when `bug`/`defect`/`regression` label detected
|
|
729
|
-
- `threat-model` — Phase 2.5d: STRIDE threat modeling when `needs-threat-model`/`security-sensitive` label detected
|
|
730
|
-
- `change-management` — Phase 5: proper conventional commits with multi-commit splitting
|
|
731
|
-
- `doc-governance` — Phase 5b: auto-detect and fix documentation drift
|
|
732
|
-
- `pr-intelligence` — Phase 6: risk assessment + reviewer guidance posted on PR
|
|
733
|
-
- `code-quality` — Phase 6: multi-pass review of the PR
|
|
734
|
-
- `vulnerability-scan` — Phase 6: security audit of new code
|
|
735
|
-
- `pattern-detection` — Phase 2 (query known patterns) + Phase 6 (diff-scoped scan) + Phase 8 (store findings)
|
|
736
|
-
- `deploy-guard` — Phase 7: 6-gate safety check before marking ready to merge
|
|
737
|
-
- `knowledge-base` — Phase 2, 2.5c, 6, 8: the memory hub that powers the self-improvement loop
|
|
1
|
+
# Autopilot Worker
|
|
2
|
+
|
|
3
|
+
> **Pillar**: Orchestrate | **ID**: `autopilot-worker`
|
|
4
|
+
|
|
5
|
+
## Purpose
|
|
6
|
+
|
|
7
|
+
Single-command pipeline that creates a board issue, plans implementation, writes code + tests, applies the full Deliver pipeline (change-management → doc-governance → deploy-guard), opens a reviewed PR, and updates the board. The main human gate is plan approval (Phase 3); task creation, the label-gated design, architecture, and threat-model phases, and the PR preview also pause for user confirmation. Everything else chains automatically through 12 skills. Includes label-gated design/architecture phases, bug-triggered root-cause analysis, and a continuous self-improvement loop via pattern detection + knowledge base.
|
|
8
|
+
|
|
9
|
+
## Activation Triggers
|
|
10
|
+
|
|
11
|
+
- autopilot, auto, pick up, work on, do this, implement and ship, end to end, full pipeline
|
|
12
|
+
- Routed from `feature-builder` Phase 0 when complexity is moderate or complex
|
|
13
|
+
- User provides a board issue number ("#42", "issue 42")
|
|
14
|
+
|
|
15
|
+
## Session Role Exception
|
|
16
|
+
|
|
17
|
+
This pipeline chains 12 skills across role boundaries (e.g. code-quality and vulnerability-scan in Phase 6 are Review skills, but run inside the Builder pipeline). **All skills invoked internally by this pipeline are unrestricted by the session role.** Role scoping only applies to user-initiated requests, not pipeline steps.
|
|
18
|
+
|
|
19
|
+
## Tools Required
|
|
20
|
+
|
|
21
|
+
- `crewpilot_board_connect` — connect to board provider
|
|
22
|
+
- `crewpilot_board_create` — create issue on board
|
|
23
|
+
- `crewpilot_board_move` — update issue status
|
|
24
|
+
- `crewpilot_board_comment` — log progress on the issue
|
|
25
|
+
- `crewpilot_worker_start` — start orchestrator workflow
|
|
26
|
+
- `crewpilot_worker_plan` — set execution plan
|
|
27
|
+
- `crewpilot_worker_approve` — human approval gate
|
|
28
|
+
- `crewpilot_worker_branch` — create feature branch
|
|
29
|
+
- `crewpilot_worker_pr` — push + open PR
|
|
30
|
+
- `crewpilot_worker_review_done` — record review verdict
|
|
31
|
+
- `crewpilot_worker_complete` — mark workflow done
|
|
32
|
+
- `crewpilot_worker_fail` — circuit breaker on failure
|
|
33
|
+
- `crewpilot_git_stage` — stage files
|
|
34
|
+
- `crewpilot_git_commit` — commit changes
|
|
35
|
+
- `crewpilot_exec` — run commands (tests, lint, build)
|
|
36
|
+
- `crewpilot_knowledge_store` — store decisions made during implementation
|
|
37
|
+
- `crewpilot_git_diff` — analyze changes for change-management
|
|
38
|
+
- `crewpilot_git_log` — commit history for release notes
|
|
39
|
+
- `crewpilot_metrics_coverage` — coverage check for deploy-guard
|
|
40
|
+
- `crewpilot_metrics_complexity` — complexity check for deploy-guard and pattern detection
|
|
41
|
+
- `crewpilot_worker_preview_pr` — preview changes before PR creation
|
|
42
|
+
- `crewpilot_worker_push_fixes` — push fixes to existing PR branch (no new PR)
|
|
43
|
+
- `crewpilot_board_pr_comments` — fetch review comments from a PR
|
|
44
|
+
- `crewpilot_knowledge_search` — query known patterns, anti-patterns, and past root causes
|
|
45
|
+
- `crewpilot_artifact_write` — persist phase outputs (analysis, plans, reviews) so downstream phases can read them
|
|
46
|
+
- `crewpilot_artifact_read` — read artifacts from prior phases (e.g. analysis → plan, plan → implementation)
|
|
47
|
+
- `crewpilot_artifact_list` — list all artifacts for the current workflow
|
|
48
|
+
- `crewpilot_dispatch_subagent` — delegate focused work (code review, test writing, security audit) to specialized sub-agents
|
|
49
|
+
- `crewpilot_session_save` — save session state for long-running tasks (enables resume across conversations)
|
|
50
|
+
- `crewpilot_session_restore` — restore a previously saved session to continue work
|
|
51
|
+
- `crewpilot_session_list` — list all saved sessions
|
|
52
|
+
- `mcp_workiq_ask_work_iq` — (optional, requires Work IQ extension) fetch M365 context (emails, docs, meetings) related to the task
|
|
53
|
+
|
|
54
|
+
## Methodology
|
|
55
|
+
|
|
56
|
+
### Process Flow
|
|
57
|
+
|
|
58
|
+
```dot
|
|
59
|
+
digraph autopilot_worker {
|
|
60
|
+
rankdir=TB;
|
|
61
|
+
node [shape=box];
|
|
62
|
+
|
|
63
|
+
intake [label="Phase 1\nIntake & Issue Creation"];
|
|
64
|
+
analysis [label="Phase 2\nCodebase Analysis & Planning"];
|
|
65
|
+
design [label="Phase 2.5\nDesign & Architecture\n(label-gated)", style=dashed];
|
|
66
|
+
rca [label="Phase 2.5c\nRoot Cause Analysis\n(bug label-gated)", style=dashed];
|
|
67
|
+
threat [label="Phase 2.5d\nThreat Model\n(security label-gated)", style=dashed];
|
|
68
|
+
plan_gate [label="Phase 3\nHUMAN GATE: Plan Approval", shape=diamond, style=filled, fillcolor="#ffcccc"];
|
|
69
|
+
implement [label="Phase 4\nBranch & Implementation"];
|
|
70
|
+
change_mgmt [label="Phase 5\nChange Management"];
|
|
71
|
+
doc_gov [label="Phase 5b\nDoc Governance"];
|
|
72
|
+
pr_review [label="Phase 6\nPR Creation & Auto-Review\n(5-stage)"];
|
|
73
|
+
deploy_guard [label="Phase 7\nDeploy Guard\n(6 gates)"];
|
|
74
|
+
complete [label="Phase 8\nCompletion & Learning", shape=doublecircle];
|
|
75
|
+
fail [label="FAIL\nCircuit Breaker", shape=octagon, style=filled, fillcolor="#ff9999"];
|
|
76
|
+
|
|
77
|
+
intake -> analysis;
|
|
78
|
+
analysis -> design [label="needs-design\nor needs-architecture"];
|
|
79
|
+
analysis -> rca [label="bug/defect/\nregression"];
|
|
80
|
+
analysis -> threat [label="needs-threat-model\nor security-sensitive"];
|
|
81
|
+
analysis -> plan_gate [label="no special labels"];
|
|
82
|
+
design -> plan_gate;
|
|
83
|
+
rca -> plan_gate;
|
|
84
|
+
threat -> plan_gate;
|
|
85
|
+
plan_gate -> implement [label="approved"];
|
|
86
|
+
plan_gate -> fail [label="cancelled"];
|
|
87
|
+
implement -> change_mgmt;
|
|
88
|
+
implement -> fail [label="3 failures"];
|
|
89
|
+
change_mgmt -> doc_gov;
|
|
90
|
+
doc_gov -> pr_review;
|
|
91
|
+
pr_review -> pr_review [label="issues found\nfix & re-run"];
|
|
92
|
+
pr_review -> deploy_guard;
|
|
93
|
+
deploy_guard -> complete [label="GO"];
|
|
94
|
+
deploy_guard -> pr_review [label="NO-GO\nfix blockers"];
|
|
95
|
+
complete -> complete [label="store knowledge\nself-improvement loop"];
|
|
96
|
+
}
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
### Phase 1 — Intake & Issue Creation
|
|
100
|
+
|
|
101
|
+
**First interaction hint:** If this is the first interaction in the session, start with:
|
|
102
|
+
> 💡 *Running CrewPilot Autopilot — I'll summarize the task, confirm with you before creating a board issue, plan the work, get your approval, implement, test, review, and open a PR.*
|
|
103
|
+
|
|
104
|
+
**Entry mode detection** — the worker can be entered four ways:
|
|
105
|
+
|
|
106
|
+
| Entry Mode | How to Detect | Behavior |
|
|
107
|
+
|---|---|---|
|
|
108
|
+
| **Direct** | User says "autopilot", "full pipeline", etc. | Run full pipeline from Phase 1 |
|
|
109
|
+
| **Routed from feature-builder** | feature-builder's Phase 0 classified as moderate/complex | Skip re-analyzing complexity — it's already assessed. Use the context feature-builder gathered. |
|
|
110
|
+
| **Mid-build escalation** | feature-builder discovered more complexity during Phase 4 | Accept the partial context (files already touched, patterns found). Start from Phase 2 (planning) with what's already known. |
|
|
111
|
+
| **Session resume** | User says "resume", "continue", "pick up where I left off" | Call `crewpilot_session_restore` with the workflow ID. Read the saved state, load associated artifacts, and resume from the last pending action. |
|
|
112
|
+
|
|
113
|
+
**Session resume flow**: When resuming, the agent should:
|
|
114
|
+
1. Call `crewpilot_session_restore` to get the saved state
|
|
115
|
+
2. Call `crewpilot_artifact_list` to see what artifacts exist
|
|
116
|
+
3. Read relevant artifacts with `crewpilot_artifact_read`
|
|
117
|
+
4. **(Optional) Calendar-aware context refresh**: If `mcp_workiq_ask_work_iq` is available and significant time has passed since the session was saved (overnight, weekend, or >4 hours):
|
|
118
|
+
- Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent)
|
|
119
|
+
- **Check for new context**: `mcp_workiq_ask_work_iq` → "What meetings, emails, or Teams messages about {issue title / feature} happened since {saved_at timestamp}? Summarize any new decisions, requirement changes, or blockers."
|
|
120
|
+
- **Check calendar conflicts**: `mcp_workiq_ask_work_iq` → "Do I have any meetings in the next 2 hours that might affect my availability?"
|
|
121
|
+
- If new decisions or requirement changes are found, flag them to the user before continuing:
|
|
122
|
+
```
|
|
123
|
+
📅 Context Update (since session was saved {age} ago):
|
|
124
|
+
- {new decision / requirement change / blocker}
|
|
125
|
+
→ Continue with current plan? (yes / re-plan)
|
|
126
|
+
```
|
|
127
|
+
- If unavailable, skip — resume proceeds without M365 context refresh.
|
|
128
|
+
5. Continue from the first pending action in the saved state
|
|
129
|
+
6. Do NOT re-run phases that have already completed (check `artifacts_written`)
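The resume rule in steps 5 and 6 can be sketched as follows. This is a minimal illustration, not the package's implementation: the `SavedSession` shape, the `pending_actions` field, and the assumption that pending actions are named after the phase whose artifact they produce are all hypothetical. Only `artifacts_written` is taken from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SavedSession:
    """Assumed shape of a restored session (hypothetical, for illustration)."""
    workflow_id: str
    artifacts_written: List[str] = field(default_factory=list)
    pending_actions: List[str] = field(default_factory=list)


def next_action(session: SavedSession) -> Optional[str]:
    """Return the first pending action that has no artifact yet.

    Actions whose artifact already exists are treated as completed and
    skipped, so a resumed run never repeats a finished phase.
    """
    completed = set(session.artifacts_written)
    for action in session.pending_actions:
        if action not in completed:
            return action
    return None  # nothing left to do


session = SavedSession(
    workflow_id="42",
    artifacts_written=["analysis", "plan"],
    pending_actions=["plan", "implementation", "review"],
)
print(next_action(session))
```

Here `plan` is skipped because its artifact is already recorded, so the run resumes at `implementation`.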
|
|
130
|
+
|
|
131
|
+
**Complexity check (direct entry only):** If the user enters autopilot directly, quickly assess if the request warrants the full pipeline:
|
|
132
|
+
- If the request is trivial (single file, obvious change) → suggest: *"This is a small change. I can implement it directly without the full pipeline. Want me to do that instead?"*
|
|
133
|
+
- If the user says "just do it" → hand off to `feature-builder` (which will handle it as trivial/simple tier).
|
|
134
|
+
- Otherwise → continue with the full pipeline below.
|
|
135
|
+
|
|
136
|
+
**If user provides a task description (not an existing issue number):**
|
|
137
|
+
|
|
138
|
+
1. Parse the user's request to extract:
|
|
139
|
+
- Title (concise, action-oriented)
|
|
140
|
+
- Description (what needs to be built)
|
|
141
|
+
- Acceptance criteria (bullet list — infer from description if not explicit)
|
|
142
|
+
- Labels (feature, bug, chore — infer from context)
|
|
143
|
+
|
|
144
|
+
<HARD-GATE>
|
|
145
|
+
2. **HUMAN GATE — Task Creation Confirmation**: Present the inferred task summary to the user BEFORE creating the board issue:
|
|
146
|
+
|
|
147
|
+
```
|
|
148
|
+
📋 Before I start, here's what I'll create as a board issue:
|
|
149
|
+
|
|
150
|
+
Title: {title}
|
|
151
|
+
Description: {description}
|
|
152
|
+
|
|
153
|
+
Acceptance Criteria:
|
|
154
|
+
- [ ] {criterion 1}
|
|
155
|
+
- [ ] {criterion 2}
|
|
156
|
+
- [ ] {criterion 3}
|
|
157
|
+
|
|
158
|
+
Labels: {labels}
|
|
159
|
+
|
|
160
|
+
→ Create this task and start the pipeline? (yes / edit / no)
|
|
161
|
+
```
|
|
162
|
+
|
|
163
|
+
- If **yes** → proceed to step 3 (create the issue via `crewpilot_board_create`), then continue to Phase 2
|
|
164
|
+
- If **edit** → user provides corrections, update and re-present
|
|
165
|
+
- If **no** → stop the pipeline. Ask the user what they'd like to do instead.
|
|
166
|
+
- Do NOT create the board issue without explicit user confirmation.
|
|
167
|
+
</HARD-GATE>
|
|
168
|
+
|
|
169
|
+
3. Call `crewpilot_board_create` with title, description, acceptance criteria
|
|
170
|
+
4. Note the created issue ID
|
|
171
|
+
|
|
172
|
+
**If user provides an existing issue number (e.g., "#42"):**
|
|
173
|
+
|
|
174
|
+
1. Call `crewpilot_board_get` to read the existing issue
|
|
175
|
+
2. Use its title, description, and acceptance criteria as-is
|
|
176
|
+
3. No confirmation needed — the task already exists
|
|
177
|
+
|
|
178
|
+
### Phase 2 — Codebase Analysis & Planning
|
|
179
|
+
|
|
180
|
+
1. Read the project structure — scan key files (package.json, tsconfig, src/ layout, existing patterns)
|
|
181
|
+
2. Identify:
|
|
182
|
+
- Which files need to be **created**
|
|
183
|
+
- Which files need to be **modified**
|
|
184
|
+
- What patterns/conventions the codebase follows (naming, directory structure, test style)
|
|
185
|
+
- What dependencies might be needed
|
|
186
|
+
3. Check issue labels for `needs-design`, `needs-architecture`, `bug`/`defect`/`regression`, and `needs-threat-model`/`security-sensitive`
|
|
187
|
+
4. **Query pattern knowledge** via `crewpilot_knowledge_search` (type: `pattern`):
|
|
188
|
+
- Search for known patterns and anti-patterns in the files being modified
|
|
189
|
+
- Search for past root causes in the same area of the codebase
|
|
190
|
+
- Collect any "repeat offender" warnings from previous runs
|
|
191
|
+
- Feed this context into the plan so the worker avoids known mistakes
|
|
192
|
+
5. **(Optional) Fetch M365 requirements context**: First call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent), then use **focused queries** to surface requirements context before planning:
|
|
193
|
+
- **Requirements & specs**: `mcp_workiq_ask_work_iq` → "Find emails, documents, and Teams messages about: {issue title}. Summarize relevant discussions, specs, and design docs."
|
|
194
|
+
- **Meeting decisions**: `mcp_workiq_ask_work_iq` → "What decisions were made about {issue title / feature name} in recent meetings? What requirements were stated?"
|
|
195
|
+
- **Stakeholder expectations**: `mcp_workiq_ask_work_iq` → "What did stakeholders or customers say about {feature} in recent emails or meetings? What was promised or committed?"
|
|
196
|
+
- Feed the M365 context into the analysis artifact so Phase 3's plan addresses stated requirements, not just the issue description.
|
|
197
|
+
- If `mcp_workiq_ask_work_iq` is unavailable, skip — this step is optional.
|
|
198
|
+
6. Call `crewpilot_worker_start` with the issue ID and title
|
|
199
|
+
7. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="analysis"` containing:
|
|
200
|
+
- Files to create/modify
|
|
201
|
+
- Codebase patterns discovered
|
|
202
|
+
- Dependencies needed
|
|
203
|
+
- Label-gated phases to run
|
|
204
|
+
- Known patterns/anti-patterns from knowledge search
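As a rough sketch of the label check in step 3, the mapping from issue labels to label-gated phases might look like the following. The label names and phase names come from this document; the `LABEL_GATES` structure and the `gated_phases` helper are hypothetical and only illustrate the routing.

```python
# Labels per phase as listed in step 3. The list order mirrors the document's
# rule that design runs before architecture when both labels are present.
LABEL_GATES = [
    ("design", {"needs-design"}),
    ("architecture", {"needs-architecture"}),
    ("rca", {"bug", "defect", "regression"}),
    ("threat-model", {"needs-threat-model", "security-sensitive"}),
]


def gated_phases(labels):
    """Return the label-gated phases to run for the given issue labels."""
    present = {label.lower() for label in labels}
    return [phase for phase, triggers in LABEL_GATES if present & triggers]


print(gated_phases(["bug", "needs-design"]))
print(gated_phases(["feature"]))
```

An empty result corresponds to the "no special labels" edge in the process flow, which goes straight to plan approval.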
|
|
205
|
+
|
|
206
|
+
### Phase 2.5 — Design & Architecture (label-gated)
|
|
207
|
+
|
|
208
|
+
**Skip this phase entirely if the issue has neither `needs-design` nor `needs-architecture` label.**
|
|
209
|
+
|
|
210
|
+
Check the issue labels (from `crewpilot_board_get`). Run the applicable skills:
|
|
211
|
+
|
|
212
|
+
#### If issue has `needs-design` label:
|
|
213
|
+
|
|
214
|
+
**Load and follow** `.github/skills/strategize-solution-design/SKILL.md`:
|
|
215
|
+
|
|
216
|
+
1. Frame the problem — restate in one sentence with constraints
|
|
217
|
+
2. Generate 3-4 distinct approaches with strengths, risks, and effort
|
|
218
|
+
3. Build a trade-off matrix comparing all options
|
|
219
|
+
4. Present to user:
|
|
220
|
+
|
|
221
|
+
```
|
|
222
|
+
📐 Design Phase for: "{issue title}"
|
|
223
|
+
|
|
224
|
+
{trade-off matrix}
|
|
225
|
+
|
|
226
|
+
Recommendation: {option} (Confidence: {N}/10)
|
|
227
|
+
Reversal cost: {Low/Medium/High}
|
|
228
|
+
|
|
229
|
+
→ Which approach? (A / B / C / edit)
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
5. **HUMAN GATE**: User picks an approach
|
|
233
|
+
6. Store the decision via `crewpilot_knowledge_store` (type: `decision`)
|
|
234
|
+
7. Write the design document to `docs/design/{issue_id}-{slug}.md`:
|
|
235
|
+
```markdown
|
|
236
|
+
# Design: {issue title}
|
|
237
|
+
|
|
238
|
+
**Issue**: #{id}
|
|
239
|
+
**Date**: {date}
|
|
240
|
+
**Decision**: {chosen option}
|
|
241
|
+
|
|
242
|
+
## Problem
|
|
243
|
+
{one-sentence problem statement}
|
|
244
|
+
|
|
245
|
+
## Options Considered
|
|
246
|
+
{options with strengths/risks/effort}
|
|
247
|
+
|
|
248
|
+
## Trade-off Matrix
|
|
249
|
+
{matrix}
|
|
250
|
+
|
|
251
|
+
## Decision
|
|
252
|
+
{chosen option with rationale}
|
|
253
|
+
Confidence: {N}/10 | Reversal cost: {Low/Medium/High}
|
|
254
|
+
```
|
|
255
|
+
8. Stage the design doc — it will be committed alongside the code in Phase 5
|
|
256
|
+
9. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="design"` containing the chosen approach, trade-off summary, and design document path
|
|
257
|
+
|
|
258
|
+
#### If issue has `needs-architecture` label:
|
|
259
|
+
|
|
260
|
+
**Load and follow** `.github/skills/strategize-architecture-planner/SKILL.md`:
|
|
261
|
+
|
|
262
|
+
1. Define scope — system boundaries, actors, quality attributes
|
|
263
|
+
2. Decompose into components with responsibilities and interfaces
|
|
264
|
+
3. Trace the primary data flow through the system
|
|
265
|
+
4. Create an implementation roadmap with milestones
|
|
266
|
+
5. Present to user:
|
|
267
|
+
|
|
268
|
+
```
|
|
269
|
+
📐 Architecture for: "{issue title}"
|
|
270
|
+
|
|
271
|
+
Components:
|
|
272
|
+
| Component | Responsibility | Interface | Dependencies |
|
|
273
|
+
|-----------|---------------|-----------|-------------|
|
|
274
|
+
| ... | ... | ... | ... |
|
|
275
|
+
|
|
276
|
+
Data Flow:
|
|
277
|
+
1. {step} → {step} → {step}
|
|
278
|
+
|
|
279
|
+
→ Approve architecture? (yes / edit)
|
|
280
|
+
```
|
|
281
|
+
|
|
282
|
+
6. **HUMAN GATE**: User approves the architecture
|
|
283
|
+
7. Store the decision via `crewpilot_knowledge_store` (type: `decision`)
|
|
284
|
+
8. Write the ADR to `docs/adr/{NNN}-{slug}.md`:
|
|
285
|
+
```markdown
|
|
286
|
+
# ADR-{NNN}: {title}
|
|
287
|
+
|
|
288
|
+
## Status: Accepted
|
|
289
|
+
## Context
|
|
290
|
+
{why this design was needed}
|
|
291
|
+
## Decision
|
|
292
|
+
{what was decided — components, data flow, interfaces}
|
|
293
|
+
## Consequences
|
|
294
|
+
{positive and negative trade-offs}
|
|
295
|
+
## Alternatives Considered
|
|
296
|
+
{rejected options and why}
|
|
297
|
+
```
|
|
298
|
+
9. Stage the ADR — it will be committed alongside the code in Phase 5
|
|
299
|
+
10. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="architecture"` containing the component decomposition, data flow, interfaces, and ADR path
|
|
300
|
+
|
|
301
|
+
#### If issue has BOTH labels:
|
|
302
|
+
|
|
303
|
+
Run `needs-design` first (pick the approach), then `needs-architecture` (detail the design).
|
|
304
|
+
The design decision feeds into the architecture — e.g., "we chose Redis" → architecture shows CacheService component, middleware chain, config interface.
|
|
305
|
+
|
|
306
|
+
### Phase 2.5c — Root Cause Analysis (label-gated)
|
|
307
|
+
|
|
308
|
+
**Skip if the issue does NOT have a `bug`, `defect`, or `regression` label.**
|
|
309
|
+
|
|
310
|
+
**Load and follow** `.github/skills/engineer-root-cause-analysis/SKILL.md` methodology:
|
|
311
|
+
|
|
312
|
+
1. **Symptom collection**:
|
|
313
|
+
- Extract error message, stack trace, steps to reproduce from the issue description
|
|
314
|
+
- Run `crewpilot_git_log` on the affected files to check recent changes
|
|
315
|
+
- Query `crewpilot_knowledge_search` for previous root causes in the same area
|
|
316
|
+
2. **Hypothesis generation** — generate 2-3 ranked hypotheses:
|
|
317
|
+
|
|
318
|
+
```
|
|
319
|
+
🔍 RCA for: "{issue title}"
|
|
320
|
+
|
|
321
|
+
| # | Hypothesis | Likelihood | Evidence | Test Strategy |
|
|
322
|
+
|---|---|---|---|---|
|
|
323
|
+
| H1 | {most likely} | High | {evidence} | {how to test} |
|
|
324
|
+
| H2 | {alternative} | Medium | {evidence} | {how to test} |
|
|
325
|
+
| H3 | {edge case} | Low | {evidence} | {how to test} |
|
|
326
|
+
```
|
|
327
|
+
|
|
328
|
+
3. **Systematic elimination** — for each hypothesis (highest first):
|
|
329
|
+
- Run `crewpilot_exec` to test (add logging, reproduce, check state)
|
|
330
|
+
- Record result: confirmed / eliminated / narrowed
|
|
331
|
+
- Max 5 attempts total (circuit breaker — same as Phase 4)
|
|
332
|
+
4. **Root cause identification**:
|
|
333
|
+
- State in one sentence
|
|
334
|
+
- Causal chain: trigger → intermediate effects → symptom
|
|
335
|
+
- Design gap: WHY the code was vulnerable
|
|
336
|
+
5. **Feed into Phase 3 plan**:
|
|
337
|
+
- The plan must fix the root cause, not just the symptom
|
|
338
|
+
- Include a regression test that fails without the fix
|
|
339
|
+
- Phase 5 commit footer: `Root-cause: {one-sentence description}`
|
|
340
|
+
6. **Store root cause** via `crewpilot_knowledge_store` (type: `root-cause`):
|
|
341
|
+
- What: the root cause description
|
|
342
|
+
- Where: affected files/modules
|
|
343
|
+
- Why: the design gap
|
|
344
|
+
- Prevention: what would have caught this earlier
|
|
345
|
+
7. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="rca"` containing the root cause, causal chain, design gap, prevention strategy, and affected files
|
|
346
|
+
8. **If root cause reveals a systemic issue**, flag it for pattern detection in Phase 6:
|
|
347
|
+
- Add note: `systemic:{description}` for Phase 6 to pick up
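The systematic elimination in step 3 can be sketched as a loop with a shared attempt budget. This is an illustrative assumption about how the steps fit together: the `eliminate` function and the callable-based checks are hypothetical stand-ins for tests run through `crewpilot_exec`, and only the three result values and the limit of 5 attempts come from the text.

```python
from typing import Callable, List, Optional, Tuple

MAX_ATTEMPTS = 5  # "Max 5 attempts total" from step 3

Check = Callable[[], str]  # returns "confirmed", "eliminated", or "narrowed"


def eliminate(
    hypotheses: List[Tuple[str, Check]],
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Test hypotheses in order (most likely first) under one attempt budget.

    Returns the confirmed hypothesis name (or None) plus the result log.
    """
    log: List[Tuple[str, str]] = []
    attempts = 0
    for name, check in hypotheses:
        while attempts < MAX_ATTEMPTS:
            attempts += 1
            result = check()
            log.append((name, result))
            if result == "confirmed":
                return name, log
            if result == "eliminated":
                break  # move on to the next hypothesis
            # "narrowed": re-test the same hypothesis with refined evidence
        if attempts >= MAX_ATTEMPTS:
            break  # circuit breaker: stop and escalate
    return None, log


root, log = eliminate([
    ("H1", lambda: "eliminated"),
    ("H2", lambda: "confirmed"),
])
print(root, log)
```

A `None` result after the budget is spent corresponds to the escalation path rather than a silent retry.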
|
|
348
|
+
|
|
349
|
+
### Phase 2.5d — Threat Modeling (label-gated)
|
|
350
|
+
|
|
351
|
+
**Skip if the issue does NOT have a `needs-threat-model` or `security-sensitive` label.**
|
|
352
|
+
|
|
353
|
+
**Load and follow** `.github/skills/assure-threat-model/SKILL.md` methodology:
|
|
354
|
+
|
|
355
|
+
1. **Read prior artifacts**: Load the `analysis` artifact (and `architecture` if it exists) to understand the system being built
|
|
356
|
+
2. **Scope the model**: Define the trust boundaries and data flows for the feature being implemented
|
|
357
|
+
3. **STRIDE analysis**: For each component and data flow crossing a trust boundary, evaluate all 6 STRIDE categories
|
|
358
|
+
4. **Risk assessment**: Score each threat (Likelihood × Impact = Risk)
|
|
359
|
+
5. **Mitigation planning**: For threats with risk ≥ 7, propose specific mitigations with effort and implementation phase
|
|
360
|
+
6. **Present to user**:
|
|
361
|
+
|
|
362
|
+
```
|
|
363
|
+
🛡️ Threat Model for: "{issue title}"
|
|
364
|
+
|
|
365
|
+
| ID | STRIDE | Component | Threat | Risk Score | Mitigation |
|
|
366
|
+
|----|--------|-----------|--------|------------|------------|
|
|
367
|
+
| T1 | ... | ... | ... | ... | ... |
|
|
368
|
+
|
|
369
|
+
Critical threats: {count}
|
|
370
|
+
Required mitigations before implementation: {list}
|
|
371
|
+
|
|
372
|
+
→ Approve threat model? (yes / edit)
|
|
373
|
+
```
|
|
374
|
+
|
|
375
|
+
7. **HUMAN GATE**: User approves the threat model
|
|
376
|
+
8. Store via `crewpilot_knowledge_store` (type: `threat-model`)
|
|
377
|
+
9. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="threat-model"` containing the full threat register
|
|
378
|
+
10. Feed critical/high-risk mitigations into Phase 3 plan as mandatory implementation steps
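The scoring rule in steps 4 and 5 (Likelihood × Impact = Risk, mitigate at risk ≥ 7) can be illustrated with a short sketch. The 1 to 5 scale for likelihood and impact is an assumption, since the scale is not stated here, and the `Threat` class and `needs_mitigation` helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

MITIGATION_THRESHOLD = 7  # "risk ≥ 7" from step 5


@dataclass
class Threat:
    threat_id: str
    stride: str
    likelihood: int  # assumed 1-5 scale (not specified in the source)
    impact: int      # assumed 1-5 scale (not specified in the source)

    @property
    def risk(self) -> int:
        # Risk = Likelihood x Impact, as in step 4
        return self.likelihood * self.impact


def needs_mitigation(threats: List[Threat]) -> List[Threat]:
    """Return threats at or above the threshold, highest risk first."""
    flagged = [t for t in threats if t.risk >= MITIGATION_THRESHOLD]
    return sorted(flagged, key=lambda t: t.risk, reverse=True)


register = [
    Threat("T1", "Spoofing", likelihood=3, impact=3),
    Threat("T2", "Tampering", likelihood=2, impact=3),
    Threat("T3", "Elevation of Privilege", likelihood=4, impact=5),
]
for threat in needs_mitigation(register):
    print(threat.threat_id, threat.risk)
```

Under these assumed scores, T3 (20) and T1 (9) require mitigations in the Phase 3 plan, while T2 (6) falls below the threshold.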

#### After design/architecture/RCA/threat-model phases:

The design documents, RCA findings, and threat model inform the implementation plan. Phase 3's plan should reference:
- Which approach was chosen (from design doc)
- Which components to build (from architecture)
- Which interfaces to implement (from ADR)
- What root cause was found (from RCA) and what fix addresses it
- What threats were identified (from threat model) and what mitigations are required

**Read prior artifacts**: Call `crewpilot_artifact_read` to load the `analysis`, `design`, `architecture`, `rca`, and/or `threat-model` artifacts. These contain the full context from earlier phases — do not rely on chat history alone.

### Phase 3 — HUMAN GATE: Plan Approval

<HARD-GATE>
Do NOT proceed to implementation until the user has explicitly approved the plan.
Do NOT skip this gate for any reason, regardless of perceived simplicity.
If the user says "just do it" without seeing the plan, present the plan anyway.
</HARD-GATE>

**STOP HERE. Present the plan to the user:**

```
📋 Autopilot Plan for: "{issue title}"

Issue: #{id} on {board provider}
{if design doc exists: "Design: docs/design/{file}.md"}
{if ADR exists: "Architecture: docs/adr/{file}.md"}

Steps:
1. {step description}
2. {step description}
...

Files to change:
- {path} (create/modify)
- {path} (create/modify)

Complexity: {trivial|simple|moderate|complex}

Approve? (yes / edit / cancel)
```

- If **yes** → call `crewpilot_worker_approve`, continue to Phase 4
- If **edit** → user provides changes, update plan, re-present
- If **cancel** → call `crewpilot_worker_fail`, stop
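
The three replies map naturally onto a small decision function. This is a minimal sketch under assumptions: `GateAction` and `resolvePlanGate` are hypothetical names, and only the tool names `crewpilot_worker_approve` and `crewpilot_worker_fail` come from this document. Treating every unrecognized reply as "show the plan again" is one way to honor the hard gate.

```typescript
// Hypothetical result type; only the two tool names are taken from the text above.
type GateAction =
  | { kind: "approve"; tool: "crewpilot_worker_approve" }
  | { kind: "revise" }
  | { kind: "cancel"; tool: "crewpilot_worker_fail" }
  | { kind: "re-present" };

// Map the user's reply at the Phase 3 gate to the next action.
function resolvePlanGate(reply: string): GateAction {
  switch (reply.trim().toLowerCase()) {
    case "yes":
      return { kind: "approve", tool: "crewpilot_worker_approve" };
    case "edit":
      return { kind: "revise" };
    case "cancel":
      return { kind: "cancel", tool: "crewpilot_worker_fail" };
    default:
      // Anything else, including "just do it", is not an approval: present the plan again.
      return { kind: "re-present" };
  }
}

console.log(resolvePlanGate("yes").kind, resolvePlanGate("just do it").kind);
```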

**Write artifact**: After approval, call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="plan"` containing the approved plan (steps, files, complexity).

**Session checkpoint**: After plan approval, call `crewpilot_session_save` with status="checkpoint", phase="phase-3-approved", and the current context. This ensures the approved plan can be resumed if the session is interrupted.

### Phase 4 — Branch & Implementation

**Read prior artifacts**: Call `crewpilot_artifact_read` for `plan` (and `analysis`, `design`, `architecture`, `rca` if they exist) to load the full execution context.

1. Call `crewpilot_worker_branch` to create feature branch
2. Call `crewpilot_board_move` to set issue status to "in-progress"
3. **For each step in the plan:**
   a. Implement the code change (create/modify files)
   b. Follow existing codebase patterns discovered in Phase 2
   c. After each logical unit, run `crewpilot_exec("npm test")` or equivalent to verify nothing is broken
   d. If tests fail, diagnose and fix (max 3 attempts per step — circuit breaker)
4. Write tests for new code:
   - Match existing test framework and conventions
   - Cover happy path + key edge cases
   - Run tests to confirm they pass

**Circuit breaker:** If any step fails 3 times consecutively:
- Call `crewpilot_board_comment` with details of the failure
- Call `crewpilot_worker_fail` with reason
- Tell the user what went wrong and which step is stuck
- STOP. Do not continue.
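
The circuit-breaker rule amounts to a bounded retry loop. The sketch below assumes a hypothetical `attempt` callback that implements the step, runs the tests, and returns an error message on failure or `null` on success; it is kept synchronous for brevity and does not call any Crewpilot tool itself.

```typescript
const MAX_ATTEMPTS = 3; // limit taken from the circuit-breaker rule above

type StepResult =
  | { status: "ok"; attempts: number }
  | { status: "tripped"; attempts: number; lastError: string };

// `attempt` is a hypothetical callback: it returns null on success or an error message on failure.
function runWithCircuitBreaker(attempt: (n: number) => string | null): StepResult {
  let lastError = "";
  for (let n = 1; n <= MAX_ATTEMPTS; n++) {
    const error = attempt(n);
    if (error === null) return { status: "ok", attempts: n };
    lastError = error;
  }
  // At this point the caller would comment on the board, call crewpilot_worker_fail, and stop.
  return { status: "tripped", attempts: MAX_ATTEMPTS, lastError };
}

// A step that fails twice and then passes.
const flakyStep = runWithCircuitBreaker((n) => (n < 3 ? `tests failed on attempt ${n}` : null));
// A step that never passes.
const stuckStep = runWithCircuitBreaker(() => "tests failed");
console.log(flakyStep.status, stuckStep.status);
```

Returning a result object instead of throwing keeps the "tell the user which step is stuck" message easy to build from `lastError`.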

### Phase 5 — Change Management (Deliver Skill #1)

**Load and follow** `.github/skills/deliver-change-management/SKILL.md` methodology:

1. Run `crewpilot_git_diff` to analyze all changes
2. Categorize changes by type: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
3. **If changes span multiple logical units** (e.g., new feature + test + config):
   - Split into separate commits with `crewpilot_git_stage` per group
   - Each commit gets its own conventional message
   - Example:
     ```
     git add src/feature.ts
     → feat(scope): add feature X (closes #ID)

     git add tests/feature.test.ts
     → test(scope): add tests for feature X

     git add docs/api.md
     → docs(scope): update API docs for feature X
     ```
4. **If changes are a single logical unit**, create one commit:
   - Format: `feat(scope): description (closes #ID)`
   - Body: what was implemented and why
   - Footer: `Closes #ID`
5. Call `crewpilot_git_stage` and `crewpilot_git_commit` for each logical commit
6. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="change-mgmt"` containing the list of commits created (hash, type, scope, message)
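
Grouping changed files before staging could look like the sketch below. The path heuristics, `guessCommitType`, and `groupByCommitType` are assumptions made for illustration; the actual categorization rules live in the change-management skill and are not shown here. In particular, a path alone cannot distinguish `feat` from `fix` or `refactor`, so source files default to `feat` in this example.

```typescript
type CommitType = "feat" | "fix" | "refactor" | "test" | "docs" | "chore";

// Assumed heuristics: test files, docs, and package manifests are recognized by path;
// everything else falls back to "feat" and would need the diff itself to refine.
function guessCommitType(path: string): CommitType {
  if (/(^|\/)tests?\//.test(path) || /\.(test|spec)\.[jt]sx?$/.test(path)) return "test";
  if (/(^|\/)docs\//.test(path) || /\.md$/i.test(path)) return "docs";
  if (/(^|\/)package(-lock)?\.json$/.test(path)) return "chore";
  return "feat";
}

// Bucket paths so that each bucket can be staged and committed separately.
function groupByCommitType(paths: string[]): Partial<Record<CommitType, string[]>> {
  const groups: Partial<Record<CommitType, string[]>> = {};
  for (const p of paths) {
    const type = guessCommitType(p);
    const bucket = groups[type] ?? [];
    bucket.push(p);
    groups[type] = bucket;
  }
  return groups;
}

const commitGroups = groupByCommitType([
  "src/feature.ts",
  "tests/feature.test.ts",
  "docs/api.md",
]);
console.log(commitGroups);
```

Each bucket would then correspond to one `crewpilot_git_stage` plus `crewpilot_git_commit` pair with its own conventional message.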

### Phase 5b — Doc Governance (Deliver Skill #2)

**Load and follow** `.github/skills/deliver-doc-governance/SKILL.md` methodology:

1. Check if the changes affect any **public interfaces**:
   - New/changed API endpoints
   - New/changed CLI commands
   - New/changed configuration options
   - New/changed tool signatures
   - New/changed exports or public functions
2. If public interfaces changed, run drift detection:
   - Compare README against actual project structure and features
   - Compare API docs against actual function signatures
   - Check if code examples still work
   - Verify install/setup instructions are still accurate
3. **If drift found:**
   - Fix the documentation directly (same branch)
   - Stage and commit: `docs(scope): sync docs with implementation changes`
   - Add to the PR body: `### Documentation Updated` section listing what was synced
4. **If no public interfaces changed**, skip — note "No doc changes needed" in the PR body

### Phase 6 — PR Creation & Auto-Review

1. Call `crewpilot_worker_preview_pr` with:
   - Title: primary commit message
   - Body: markdown with sections:
     - **What**: summary of changes
     - **Why**: linked to issue #{ID}
     - **Changes**: list of commits with descriptions
     - **Documentation Updated**: what docs were synced (or "N/A")
     - **How to test**: steps to verify
     - **Checklist**: tests pass, lint clean, types clean, docs synced
<HARD-GATE>
2. **HUMAN GATE**: User reviews the preview — do NOT create the PR until the user approves.
   If the user requests changes, apply them and re-preview. Never skip this gate.
</HARD-GATE>
3. Call `crewpilot_worker_pr` to create the PR
4. **Run PR Intelligence** (read `.github/skills/assure-pr-intelligence/SKILL.md`):
   - **Change inventory**: categorize changed files (core, api, test, config, docs)
   - **Risk assessment**: evaluate scope, complexity, blast radius, test coverage, reversibility → Low/Medium/High/Critical risk score
   - **Reviewer guidance**: order files by review priority, flag lines needing attention, list questions the reviewer should ask, note what's missing from the PR
   - **Merge readiness checklist**: tests pass, security clean, breaking changes documented, PR description matches changes
   - Post the full PR Intelligence report as a **comment on the PR** so the assigned reviewer sees it immediately
5. Read the diff of the PR
6. **Subagent delegation (recommended for moderate/complex changes):** Use `crewpilot_dispatch_subagent` to delegate review work in parallel:
   - Delegate `code-reviewer` role with the diff and file list — receives correctness, security, and performance findings
   - Delegate `standards-reviewer` role with the diff and codebase conventions — receives standards compliance findings
   - Delegate `security-auditor` role with source files and architecture context — receives STRIDE/OWASP findings
   - Each subagent writes its output as an artifact (e.g. `review-functional`, `review-standards`) for traceability
   - Merge subagent findings using `crewpilot_dispatch_consensus` to identify high-confidence vs disputed issues

**Fallback (simple changes):** Run reviews inline without subagent delegation:
7. Run **code-quality** review internally (read `.github/skills/assure-code-quality/SKILL.md`):
   - Correctness: does the code do what the acceptance criteria say?
   - Security: any obvious vulnerabilities (SQL injection, XSS, secrets)?
   - Performance: any N+1 queries, await-in-loops, unnecessary re-renders?
   - Style: does it match codebase conventions?
8. Run **vulnerability-scan** internally (read `.github/skills/assure-vulnerability-scan/SKILL.md`):
   - OWASP Top 10 quick check on new code
   - Dependency audit: `npm audit` or `pip-audit`
9. Run `crewpilot_exec("npm run lint")` and `crewpilot_exec("npm run typecheck")` if available
9b. **(Optional) Requirements alignment validation**: If M365 context was fetched in Phase 2, validate the implementation against meeting-stated requirements:
   - Read the `analysis` artifact to retrieve the M365 requirements context captured earlier
   - If the analysis artifact contains meeting decisions or stakeholder expectations, call `mcp_workiq_ask_work_iq` → "What specific requirements and acceptance criteria were stated for {feature} in meetings and emails?"
   - Cross-reference each stated requirement against the implementation diff:
     - **Covered**: the requirement is addressed by the code changes ✓
     - **Partial**: the requirement is partially addressed — flag what's missing
     - **Missing**: the requirement is not addressed at all — flag as a review finding
   - Include requirements alignment in the PR comment:
     ```
     📋 Requirements Alignment:
     Meeting requirements checked: {N}
     Covered: {count} ✓ | Partial: {count} ⚠️ | Missing: {count} ❌
     {list any partial/missing items}
     ```
   - If critical requirements are missing, flag as a review issue that must be addressed before merge
10. **Run diff-scoped pattern detection** (read `.github/skills/insights-pattern-detection/SKILL.md`):
   - Scope: only scan files changed in the diff (NOT full codebase)
   - Check for **consistency** with existing codebase patterns:
     - Error handling style matches project conventions?
     - Data access patterns match?
     - Naming conventions followed?
     - Test structure matches existing tests?
   - Check for **anti-patterns** in changed files:
     - God object/file (single file > 500 lines with mixed responsibilities)
     - Copy-paste (near-duplicate code blocks)
     - Shotgun surgery (small change touching too many files)
     - Primitive obsession (strings/numbers where domain types belong)
   - **Query knowledge base for repeat offenses**:
     - `crewpilot_knowledge_search` type: `pattern` — "has this same anti-pattern been flagged before?"
     - If a repeat offense is found, flag prominently:
       ```
       ⚠️ Recurring Pattern Issue: {description}
       Previously flagged in: {previous context}
       Suggestion: Consider a structural fix.
       ```
   - Run `crewpilot_metrics_complexity` on changed files — flag any function with complexity > threshold
   - Include pattern findings in the PR comment:
     ```
     🔎 Pattern Detection Results:
     Consistency: {✓ follows codebase patterns | ⚠️ deviations found}
     Anti-patterns: {✓ none | ⚠️ {list}}
     Repeat issues: {✓ none | ⚠️ {count} recurring}
     Complexity: {✓ within threshold | ⚠️ {files} above limit}
     ```
11. **If issues found (review, security, or pattern):**
   - Fix them directly
   - Re-commit: `fix(scope): address review findings`
   - Re-push
   - Re-run pattern detection on the fix to confirm resolution
12. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="review-merged"` containing the combined review results (code-quality, vulnerability-scan, pattern detection findings, and fix iterations)
13. Call `crewpilot_worker_review_done` with verdict: "approved" and summary
14. Call `crewpilot_board_move` to set issue status to "in-review"
15. Call `crewpilot_board_comment`: "PR #{pr_number} opened. Ready for review."
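
The optional requirements-alignment check in this phase boils down to tallying three coverage states and listing the gaps. The following is a minimal sketch: `RequirementCheck`, `renderAlignment`, and the sample requirements are hypothetical, and only the output layout follows the template shown in this phase.

```typescript
type Coverage = "covered" | "partial" | "missing";

// Hypothetical record for one meeting-stated requirement.
interface RequirementCheck {
  requirement: string;
  coverage: Coverage;
}

// Build the "Requirements Alignment" block for the PR comment.
function renderAlignment(checks: RequirementCheck[]): string {
  const count = (c: Coverage): number => checks.filter((x) => x.coverage === c).length;
  const gaps = checks
    .filter((x) => x.coverage !== "covered")
    .map((x) => `- [${x.coverage}] ${x.requirement}`);
  return [
    "📋 Requirements Alignment:",
    `Meeting requirements checked: ${checks.length}`,
    `Covered: ${count("covered")} ✓ | Partial: ${count("partial")} ⚠️ | Missing: ${count("missing")} ❌`,
    ...gaps,
  ].join("\n");
}

// Made-up sample data, for illustration only.
const alignmentReport = renderAlignment([
  { requirement: "Export supports CSV", coverage: "covered" },
  { requirement: "Export supports date filtering", coverage: "partial" },
  { requirement: "Export is audited", coverage: "missing" },
]);
console.log(alignmentReport);
```

Whether a missing item blocks the merge is a separate judgment; the sketch only produces the summary text.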

### Phase 7 — Deploy Guard (Deliver Skill #3)

**Load and follow** `.github/skills/deliver-deploy-guard/SKILL.md` methodology:

Before marking ready to merge, run the 6-gate checklist:

1. **Code Quality Gate**: No leftover TODOs, console.logs, or commented-out code in changed files
2. **Test Integrity Gate**: All tests pass, coverage meets threshold, no `.skip` tests
3. **Security Gate**: No hardcoded secrets, no critical CVEs, no unsafe patterns
4. **Configuration Gate**: Env vars documented, no dev config in prod paths
5. **Breaking Changes Gate**: API contracts backward-compatible, no dropped exports
6. **Operational Readiness Gate**: Health endpoints, logging, error handling

Produce a verdict and include in the PR comment:

```
🛡️ Deploy Guard Results:
Code Quality: ✓ pass
Test Integrity: ✓ pass (coverage: 86%)
Security: ✓ pass
Configuration: ✓ pass
Breaking Changes: ✓ pass
Operational: ✓ pass

Verdict: GO ✅
```

- If **GO** → proceed to Phase 8
- If **CONDITIONAL** → list warnings in PR comment, proceed (human decides)
- If **NO-GO** → fix blockers, re-run until GO or escalate to user
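
This document does not spell out how the six gate results combine into a verdict. One plausible reading, sketched below, is that any failing gate yields NO-GO and any warning without a failure yields CONDITIONAL. The `GateStatus` values, the `deployVerdict` helper, and that mapping are assumptions; the deploy-guard skill may define it differently.

```typescript
type GateStatus = "pass" | "warn" | "fail";
type Verdict = "GO" | "CONDITIONAL" | "NO-GO";

// Assumed mapping (not stated in this document): a failure blocks, a warning makes
// the verdict conditional, and only an all-pass run is a GO.
function deployVerdict(gates: Record<string, GateStatus>): Verdict {
  const statuses = Object.keys(gates).map((name) => gates[name]);
  if (statuses.some((s) => s === "fail")) return "NO-GO";
  if (statuses.some((s) => s === "warn")) return "CONDITIONAL";
  return "GO";
}

const allPass = deployVerdict({
  "Code Quality": "pass",
  "Test Integrity": "pass",
  "Security": "pass",
  "Configuration": "pass",
  "Breaking Changes": "pass",
  "Operational": "pass",
});
const oneWarning = deployVerdict({ "Code Quality": "pass", "Configuration": "warn" });
const oneFailure = deployVerdict({ "Security": "fail", "Configuration": "warn" });
console.log(allPass, oneWarning, oneFailure);
```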

**Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="deploy-guard"` containing the full 6-gate results and verdict.

### Phase 8 — Completion & Learning

1. Call `crewpilot_board_comment` with deploy guard results: "All checks passed. Ready to merge."
2. **Store knowledge** via `crewpilot_knowledge_store`:
   - Decisions made during implementation (type: `decision`)
   - Root cause findings, if this was a bug fix (type: `root-cause`)
   - **Pattern findings** from Phase 6 (type: `pattern`):
     - What patterns were followed or violated
     - Any anti-patterns found and fixed
     - Any repeat offenses detected
     - Complexity hotspots
   - This creates the **self-improvement loop**: future runs query this data in Phase 2 to avoid repeating the same mistakes
3. Present final summary to user:

```
✅ Autopilot Complete

Issue: #{id} — {title}
Branch: {branch_name}
PR: #{pr_number}
Status: Ready to merge

Changes:
- {N} commits across {M} files
- {file} (created/modified) — {what changed}

Deliver Pipeline:
Change Mgmt: {N} conventional commits (feat/fix/test/docs)
Doc Sync: {updated | no changes needed}
Deploy Guard: {GO | CONDITIONAL — warnings}

{if bug fix:}
Root Cause: {one-sentence root cause}
Design Gap: {why it was vulnerable}
Prevention: {what would catch this earlier}

Tests: {X} passing | Coverage: {Y}%
Review: Auto-reviewed — code-quality + vulnerability-scan
Security: No issues found
Patterns: {✓ clean | ⚠️ {count} findings — stored for future runs}
Repeat Issues: {none | {count} recurring patterns detected}

→ Merge when ready. Board will auto-update on close.
```

4. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="completion"` containing the final summary (PR number, branch, commits, review/deploy-guard results, knowledge stored)
5. Call `crewpilot_worker_complete`

### Capability Hints (on completion)

After presenting the final summary, append **one** contextual hint based on the session. Show each hint at most once per session.

| Context | Hint |
|---|---|
| First time user ran autopilot | 💡 *I can also parse meeting transcripts into user stories and epics — say "parse meeting" with your notes.* |
| Multiple autopilot runs completed | 💡 *I can generate a daily digest summarizing all your work — say "daily digest" or "eod report".* |
| Knowledge was stored during this run | 💡 *I remember decisions across sessions. Ask "what did we decide about X" anytime to recall.* |
| Pattern issues were detected | 💡 *I can run a full codebase health scan for anti-patterns and tech debt — say "codebase health".* |

## Output Format

Always use the structured format shown in each phase. Lead with the status emoji:
- 📋 = planning
- ⚠️ = waiting for approval
- 🔨 = implementing
- 🔍 = reviewing
- ✅ = done
- ✗ = failed

## Anti-Patterns

<HARD-GATE>
- Do NOT skip the human gate (Phase 3). The plan MUST be shown and approved.
- Do NOT auto-merge the PR. Only humans merge.
- Do NOT bypass the PR preview gate (Phase 6). The user MUST see the preview.
</HARD-GATE>
- Do NOT continue after 3 consecutive failures on a step. Escalate to human.
- Do NOT install new dependencies without mentioning them in the plan.
- Do NOT modify files outside the scope of the plan without asking.
- Do NOT generate placeholder/stub code. Every file must be functional.
- Do NOT skip tests. If the project has a test framework, write tests.

## No Placeholders

Every step in the Phase 3 plan and every file produced in Phase 4 must contain real, working content. The following are **plan failures** — never write them:

| Forbidden Pattern | Why It Fails |
|---|---|
| "TBD", "TODO", "implement later" | Defers work that should be done now |
| "Add appropriate error handling" | Vague — specify which errors and how to handle them |
| "Add validation" | Which inputs? What rules? What error messages? |
| "Handle edge cases" | Name the edge cases or don't mention them |
| "Write tests for the above" | Show the actual test code |
| "Similar to Phase N" | Repeat the details — context resets between phases |
| Steps without code blocks | If a step changes code, show the code |
| References to undefined types/functions | Every symbol must trace back to an earlier step |
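
The phrase-based rows of the table lend themselves to a simple self-check before a plan is presented. The sketch below is an assumption-laden illustration: `FORBIDDEN_PHRASES` and `findPlaceholders` are hypothetical names, and plain substring matching is crude (it will also flag a specific step that happens to begin with "Add validation"). The last two rows of the table need structural checks and are not covered.

```typescript
// Phrases taken from the table above; the matching strategy is an assumption.
const FORBIDDEN_PHRASES: string[] = [
  "tbd",
  "todo",
  "implement later",
  "add appropriate error handling",
  "add validation",
  "handle edge cases",
  "write tests for the above",
  "similar to phase",
];

// Return every forbidden phrase that appears in the text of one plan step.
function findPlaceholders(stepText: string): string[] {
  const lower = stepText.toLowerCase();
  return FORBIDDEN_PHRASES.filter((phrase) => lower.indexOf(phrase) !== -1);
}

const vagueStep = findPlaceholders("Add validation and handle edge cases");
const concreteStep = findPlaceholders("Reject an empty `email` field with HTTP 400");
console.log(vagueStep, concreteStep);
```

A non-empty result would be a signal to rewrite the step with concrete inputs, rules, and code before showing the plan at the Phase 3 gate.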

## Chains To

- `solution-design` — Phase 2.5: generate solution design doc when `needs-design` label detected
- `architecture-planner` — Phase 2.5: generate ADR when `needs-architecture` label detected
- `root-cause-analysis` — Phase 2.5c: systematic RCA when `bug`/`defect`/`regression` label detected
- `threat-model` — Phase 2.5d: STRIDE threat modeling when `needs-threat-model`/`security-sensitive` label detected
- `change-management` — Phase 5: proper conventional commits with multi-commit splitting
- `doc-governance` — Phase 5b: auto-detect and fix documentation drift
- `pr-intelligence` — Phase 6: risk assessment + reviewer guidance posted on PR
- `code-quality` — Phase 6: multi-pass review of the PR
- `vulnerability-scan` — Phase 6: security audit of new code
- `pattern-detection` — Phase 2 (query known patterns) + Phase 6 (diff-scoped scan) + Phase 8 (store findings)
- `deploy-guard` — Phase 7: 6-gate safety check before marking ready to merge
- `knowledge-base` — Phase 2, 2.5c, 6, 8: the memory hub that powers the self-improvement loop