@crewpilot/agent 1.0.0

@@ -0,0 +1,623 @@
1
+ # Autopilot Worker
2
+
3
+ > **Pillar**: Orchestrate | **ID**: `autopilot-worker`
4
+
5
+ ## Purpose
6
+
7
+ Single-command pipeline that creates a board issue, plans implementation, writes code + tests, applies the full Deliver pipeline (change-management → doc-governance → deploy-guard), opens a reviewed PR, and updates the board. The main human gate is plan approval (Phase 3); the pipeline also pauses for confirmation before creating a board issue (Phase 1), for label-gated design and architecture choices (Phase 2.5), and for the PR preview (Phase 6). Everything else is automatic chaining through 12 skills. Includes label-gated design/architecture phases, bug-triggered root-cause analysis, and a continuous self-improvement loop via pattern detection + knowledge base.
8
+
9
+ ## Activation Triggers
10
+
11
+ - autopilot, auto, pick up, work on, do this, implement and ship, end to end, full pipeline
12
+ - Routed from `feature-builder` Phase 0 when complexity is moderate or complex
13
+ - User provides a board issue number ("#42", "issue 42")
14
+
15
+ ## Session Role Exception
16
+
17
+ This pipeline chains 12 skills across role boundaries (e.g. code-quality and vulnerability-scan in Phase 6 are Review skills, but run inside the Builder pipeline). **All skills invoked internally by this pipeline are unrestricted by the session role.** Role scoping only applies to user-initiated requests, not pipeline steps.
18
+
19
+ ## Tools Required
20
+
21
+ - `catalyst_board_connect` — connect to board provider
22
+ - `catalyst_board_create` — create issue on board
23
+ - `catalyst_board_move` — update issue status
24
+ - `catalyst_board_comment` — log progress on the issue
25
+ - `catalyst_worker_start` — start orchestrator workflow
26
+ - `catalyst_worker_plan` — set execution plan
27
+ - `catalyst_worker_approve` — human approval gate
28
+ - `catalyst_worker_branch` — create feature branch
29
+ - `catalyst_worker_pr` — push + open PR
30
+ - `catalyst_worker_review_done` — record review verdict
31
+ - `catalyst_worker_complete` — mark workflow done
32
+ - `catalyst_worker_fail` — circuit breaker on failure
33
+ - `catalyst_git_stage` — stage files
34
+ - `catalyst_git_commit` — commit changes
35
+ - `catalyst_exec` — run commands (tests, lint, build)
36
+ - `catalyst_knowledge_store` — store decisions made during implementation
37
+ - `catalyst_git_diff` — analyze changes for change-management
38
+ - `catalyst_git_log` — commit history for release notes
39
+ - `catalyst_metrics_coverage` — coverage check for deploy-guard
40
+ - `catalyst_metrics_complexity` — complexity check for deploy-guard and pattern detection
41
+ - `catalyst_worker_preview_pr` — preview changes before PR creation
42
+ - `catalyst_worker_push_fixes` — push fixes to existing PR branch (no new PR)
43
+ - `catalyst_board_pr_comments` — fetch review comments from a PR
44
+ - `catalyst_knowledge_search` — query known patterns, anti-patterns, and past root causes
45
+
46
+ ## Methodology
47
+
48
+ ### Process Flow
49
+
50
+ ```dot
51
+ digraph autopilot_worker {
52
+ rankdir=TB;
53
+ node [shape=box];
54
+
55
+ intake [label="Phase 1\nIntake & Issue Creation"];
56
+ analysis [label="Phase 2\nCodebase Analysis & Planning"];
57
+ design [label="Phase 2.5\nDesign & Architecture\n(label-gated)", style=dashed];
58
+ rca [label="Phase 2.5c\nRoot Cause Analysis\n(bug label-gated)", style=dashed];
59
+ plan_gate [label="Phase 3\nHUMAN GATE: Plan Approval", shape=diamond, style=filled, fillcolor="#ffcccc"];
60
+ implement [label="Phase 4\nBranch & Implementation"];
61
+ change_mgmt [label="Phase 5\nChange Management"];
62
+ doc_gov [label="Phase 5b\nDoc Governance"];
63
+ pr_review [label="Phase 6\nPR Creation & Auto-Review\n(5-stage)"];
64
+ deploy_guard [label="Phase 7\nDeploy Guard\n(6 gates)"];
65
+ complete [label="Phase 8\nCompletion & Learning", shape=doublecircle];
66
+ fail [label="FAIL\nCircuit Breaker", shape=octagon, style=filled, fillcolor="#ff9999"];
67
+
68
+ intake -> analysis;
69
+ analysis -> design [label="needs-design\nor needs-architecture"];
70
+ analysis -> rca [label="bug/defect/\nregression"];
71
+ analysis -> plan_gate [label="no special labels"];
72
+ design -> plan_gate;
73
+ rca -> plan_gate;
74
+ plan_gate -> implement [label="approved"];
75
+ plan_gate -> fail [label="cancelled"];
76
+ implement -> change_mgmt;
77
+ implement -> fail [label="3 failures"];
78
+ change_mgmt -> doc_gov;
79
+ doc_gov -> pr_review;
80
+ pr_review -> pr_review [label="issues found\nfix & re-run"];
81
+ pr_review -> deploy_guard;
82
+ deploy_guard -> complete [label="GO"];
83
+ deploy_guard -> pr_review [label="NO-GO\nfix blockers"];
84
+ complete -> complete [label="store knowledge\nself-improvement loop"];
85
+ }
86
+ ```
87
+
88
+ ### Phase 1 — Intake & Issue Creation
89
+
90
+ **First interaction hint:** If this is the first interaction in the session, start with:
91
+ > 💡 *Running Catalyst Autopilot — I'll summarize the task, confirm with you before creating a board issue, plan the work, get your approval, implement, test, review, and open a PR.*
92
+
93
+ **Entry mode detection** — the worker can be entered three ways:
94
+
95
+ | Entry Mode | How to Detect | Behavior |
96
+ |---|---|---|
97
+ | **Direct** | User says "autopilot", "full pipeline", etc. | Run full pipeline from Phase 1 |
98
+ | **Routed from feature-builder** | feature-builder's Phase 0 classified as moderate/complex | Skip re-analyzing complexity — it's already assessed. Use the context feature-builder gathered. |
99
+ | **Mid-build escalation** | feature-builder discovered more complexity during Phase 4 | Accept the partial context (files already touched, patterns found). Start from Phase 2 (planning) with what's already known. |
100
+
101
+ **Complexity check (direct entry only):** If the user enters autopilot directly, quickly assess if the request warrants the full pipeline:
102
+ - If the request is trivial (single file, obvious change) → suggest: *"This is a small change. I can implement it directly without the full pipeline. Want me to do that instead?"*
103
+ - If the user says "just do it" → hand off to `feature-builder` (which will handle it as trivial/simple tier).
104
+ - Otherwise → continue with the full pipeline below.
105
+
106
+ **If user provides a task description (not an existing issue number):**
107
+
108
+ 1. Parse the user's request to extract:
109
+ - Title (concise, action-oriented)
110
+ - Description (what needs to be built)
111
+ - Acceptance criteria (bullet list — infer from description if not explicit)
112
+ - Labels (feature, bug, chore — infer from context)
113
+
114
+ <HARD-GATE>
115
+ 2. **HUMAN GATE — Task Creation Confirmation**: Present the inferred task summary to the user BEFORE creating the board issue:
116
+
117
+ ```
118
+ 📋 Before I start, here's what I'll create as a board issue:
119
+
120
+ Title: {title}
121
+ Description: {description}
122
+
123
+ Acceptance Criteria:
124
+ - [ ] {criterion 1}
125
+ - [ ] {criterion 2}
126
+ - [ ] {criterion 3}
127
+
128
+ Labels: {labels}
129
+
130
+ → Create this task and start the pipeline? (yes / edit / no)
131
+ ```
132
+
133
+ - If **yes** → create the issue (steps 3 and 4 below), then continue to Phase 2
134
+ - If **edit** → user provides corrections, update and re-present
135
+ - If **no** → stop the pipeline. Ask the user what they'd like to do instead.
136
+ - Do NOT create the board issue without explicit user confirmation.
137
+ </HARD-GATE>
138
+
139
+ 3. Call `catalyst_board_create` with title, description, acceptance criteria
140
+ 4. Note the created issue ID
141
+
142
+ **If user provides an existing issue number (e.g., "#42"):**
143
+
144
+ 1. Call `catalyst_board_get` to read the existing issue
145
+ 2. Use its title, description, and acceptance criteria as-is
146
+ 3. No confirmation needed — the task already exists
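
The two intake paths above depend on whether the request contains an issue reference. The sketch below shows one way to make that distinction. `parseIssueRef` and `intakePath` are hypothetical helpers, not part of the Catalyst tool set, and the accepted formats ("#42", "issue 42", "issue #42") are assumptions based on the examples in this document.

```typescript
type IntakePath = "read-existing-issue" | "confirm-then-create";

// Hypothetical helper: extracts a board issue number from user input such as
// "#42" or "issue 42". Returns null when no issue reference is present.
function parseIssueRef(input: string): number | null {
  // Matches "#42", "issue 42", or "issue #42" (case-insensitive).
  const match = /(?:#|\bissue\s+#?)(\d+)\b/i.exec(input);
  if (match === null) {
    return null;
  }
  return Number(match[1]);
}

// An existing issue skips the confirmation gate; a fresh task description
// must be confirmed by the user before the board issue is created.
function intakePath(input: string): IntakePath {
  return parseIssueRef(input) === null
    ? "confirm-then-create"
    : "read-existing-issue";
}
```

For example, `intakePath("pick up #42")` selects the existing-issue path, while a plain description such as "add rate limiting to login" goes through the confirmation gate.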
147
+
148
+ ### Phase 2 — Codebase Analysis & Planning
149
+
150
+ 1. Read the project structure — scan key files (package.json, tsconfig, src/ layout, existing patterns)
151
+ 2. Identify:
152
+ - Which files need to be **created**
153
+ - Which files need to be **modified**
154
+ - What patterns/conventions the codebase follows (naming, directory structure, test style)
155
+ - What dependencies might be needed
156
+ 3. Check issue labels for `needs-design`, `needs-architecture`, and `bug`/`defect`/`regression`
157
+ 4. **Query pattern knowledge** via `catalyst_knowledge_search` (type: `pattern`):
158
+ - Search for known patterns and anti-patterns in the files being modified
159
+ - Search for past root causes in the same area of the codebase
160
+ - Collect any "repeat offender" warnings from previous runs
161
+ - Feed this context into the plan so the worker avoids known mistakes
162
+ 5. Call `catalyst_worker_start` with the issue ID and title
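
The label check in step 3 decides which optional phases run before plan approval. A minimal sketch of that routing follows, assuming labels arrive as plain strings. `gatedPhases` is a hypothetical name. Running design before architecture follows this document; placing root-cause analysis last when bug and design labels coexist is an assumption, since the document does not specify that case.

```typescript
type GatedPhase = "design" | "architecture" | "root-cause-analysis";

function hasLabel(labels: string[], wanted: string): boolean {
  return labels.indexOf(wanted) !== -1;
}

// Returns the label-gated phases to run before Phase 3, in order.
// An empty array means the pipeline goes straight to plan approval.
function gatedPhases(labels: string[]): GatedPhase[] {
  const phases: GatedPhase[] = [];
  if (hasLabel(labels, "needs-design")) {
    phases.push("design");
  }
  if (hasLabel(labels, "needs-architecture")) {
    phases.push("architecture");
  }
  const bugLabels = ["bug", "defect", "regression"];
  if (bugLabels.some(function (l) { return hasLabel(labels, l); })) {
    // Assumed ordering: RCA runs after any design/architecture work.
    phases.push("root-cause-analysis");
  }
  return phases;
}
```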
163
+
164
+ ### Phase 2.5 — Design & Architecture (label-gated)
165
+
166
+ **Skip this phase entirely if the issue has neither `needs-design` nor `needs-architecture` label.**
167
+
168
+ Check the issue labels (from `catalyst_board_get`). Run the applicable skills:
169
+
170
+ #### If issue has `needs-design` label:
171
+
172
+ **Load and follow** `.github/skills/strategize-solution-design/SKILL.md`:
173
+
174
+ 1. Frame the problem — restate in one sentence with constraints
175
+ 2. Generate 3-4 distinct approaches with strengths, risks, and effort
176
+ 3. Build a trade-off matrix comparing all options
177
+ 4. Present to user:
178
+
179
+ ```
180
+ 📐 Design Phase for: "{issue title}"
181
+
182
+ {trade-off matrix}
183
+
184
+ Recommendation: {option} (Confidence: {N}/10)
185
+ Reversal cost: {Low/Medium/High}
186
+
187
+ → Which approach? (A / B / C / edit)
188
+ ```
189
+
190
+ 5. **HUMAN GATE**: User picks an approach
191
+ 6. Store the decision via `catalyst_knowledge_store` (type: decision)
192
+ 7. Write the design document to `docs/design/{issue_id}-{slug}.md`:
193
+ ```markdown
194
+ # Design: {issue title}
195
+
196
+ **Issue**: #{id}
197
+ **Date**: {date}
198
+ **Decision**: {chosen option}
199
+
200
+ ## Problem
201
+ {one-sentence problem statement}
202
+
203
+ ## Options Considered
204
+ {options with strengths/risks/effort}
205
+
206
+ ## Trade-off Matrix
207
+ {matrix}
208
+
209
+ ## Decision
210
+ {chosen option with rationale}
211
+ Confidence: {N}/10 | Reversal cost: {Low/Medium/High}
212
+ ```
213
+ 8. Stage the design doc — it will be committed alongside the code in Phase 5
214
+
215
+ #### If issue has `needs-architecture` label:
216
+
217
+ **Load and follow** `.github/skills/strategize-architecture-planner/SKILL.md`:
218
+
219
+ 1. Define scope — system boundaries, actors, quality attributes
220
+ 2. Decompose into components with responsibilities and interfaces
221
+ 3. Trace the primary data flow through the system
222
+ 4. Create an implementation roadmap with milestones
223
+ 5. Present to user:
224
+
225
+ ```
226
+ 📐 Architecture for: "{issue title}"
227
+
228
+ Components:
229
+ | Component | Responsibility | Interface | Dependencies |
230
+ |-----------|---------------|-----------|-------------|
231
+ | ... | ... | ... | ... |
232
+
233
+ Data Flow:
234
+ 1. {step} → {step} → {step}
235
+
236
+ → Approve architecture? (yes / edit)
237
+ ```
238
+
239
+ 6. **HUMAN GATE**: User approves the architecture
240
+ 7. Store as knowledge (type: decision)
241
+ 8. Write the ADR to `docs/adr/{NNN}-{slug}.md`:
242
+ ```markdown
243
+ # ADR-{NNN}: {title}
244
+
245
+ ## Status: Accepted
246
+ ## Context
247
+ {why this design was needed}
248
+ ## Decision
249
+ {what was decided — components, data flow, interfaces}
250
+ ## Consequences
251
+ {positive and negative trade-offs}
252
+ ## Alternatives Considered
253
+ {rejected options and why}
254
+ ```
255
+ 9. Stage the ADR — it will be committed alongside the code in Phase 5
256
+
257
+ #### If issue has BOTH labels:
258
+
259
+ Run `needs-design` first (pick the approach), then `needs-architecture` (detail the design).
260
+ The design decision feeds into the architecture — e.g., "we chose Redis" → architecture shows CacheService component, middleware chain, config interface.
261
+
262
+ ### Phase 2.5c — Root Cause Analysis (label-gated)
263
+
264
+ **Skip if the issue does NOT have a `bug`, `defect`, or `regression` label.**
265
+
266
+ **Load and follow** `.github/skills/engineer-root-cause-analysis/SKILL.md` methodology:
267
+
268
+ 1. **Symptom collection**:
269
+ - Extract error message, stack trace, steps to reproduce from the issue description
270
+ - Run `catalyst_git_log` on the affected files to check recent changes
271
+ - Query `catalyst_knowledge_search` for previous root causes in the same area
272
+ 2. **Hypothesis generation** — generate 2-3 ranked hypotheses:
273
+
274
+ ```
275
+ 🔍 RCA for: "{issue title}"
276
+
277
+ | # | Hypothesis | Likelihood | Evidence | Test Strategy |
278
+ |---|---|---|---|---|
279
+ | H1 | {most likely} | High | {evidence} | {how to test} |
280
+ | H2 | {alternative} | Medium | {evidence} | {how to test} |
281
+ | H3 | {edge case} | Low | {evidence} | {how to test} |
282
+ ```
283
+
284
+ 3. **Systematic elimination** — for each hypothesis (highest first):
285
+ - Run `catalyst_exec` to test (add logging, reproduce, check state)
286
+ - Record result: confirmed / eliminated / narrowed
287
+ - Max 5 attempts total across all hypotheses (circuit breaker; Phase 4 applies the same mechanism with a limit of 3 attempts per step)
288
+ 4. **Root cause identification**:
289
+ - State in one sentence
290
+ - Causal chain: trigger → intermediate effects → symptom
291
+ - Design gap: WHY the code was vulnerable
292
+ 5. **Feed into Phase 3 plan**:
293
+ - The plan must fix the root cause, not just the symptom
294
+ - Include a regression test that fails without the fix
295
+ - Phase 5 commit footer: `Root-cause: {one-sentence description}`
296
+ 6. **Store root cause** via `catalyst_knowledge_store` (type: `root-cause`):
297
+ - What: the root cause description
298
+ - Where: affected files/modules
299
+ - Why: the design gap
300
+ - Prevention: what would have caught this earlier
301
+ 7. **If root cause reveals a systemic issue**, flag it for pattern detection in Phase 6:
302
+ - Add note: `systemic:{description}` for Phase 6 to pick up
303
+
304
+ #### After design/architecture/RCA phases:
305
+
306
+ The design documents and RCA findings inform the implementation plan. Phase 3's plan should reference:
307
+ - Which approach was chosen (from design doc)
308
+ - Which components to build (from architecture)
309
+ - Which interfaces to implement (from ADR)
310
+ - What root cause was found (from RCA) and what fix addresses it
311
+
312
+ ### Phase 3 — HUMAN GATE: Plan Approval
313
+
314
+ <HARD-GATE>
315
+ Do NOT proceed to implementation until the user has explicitly approved the plan.
316
+ Do NOT skip this gate for any reason, regardless of perceived simplicity.
317
+ If the user says "just do it" without seeing the plan, present the plan anyway.
318
+ </HARD-GATE>
319
+
320
+ **STOP HERE. Present the plan to the user:**
321
+
322
+ ```
323
+ 📋 Autopilot Plan for: "{issue title}"
324
+
325
+ Issue: #{id} on {board provider}
326
+ {if design doc exists: "Design: docs/design/{file}.md"}
327
+ {if ADR exists: "Architecture: docs/adr/{file}.md"}
328
+
329
+ Steps:
330
+ 1. {step description}
331
+ 2. {step description}
332
+ ...
333
+
334
+ Files to change:
335
+ - {path} (create/modify)
336
+ - {path} (create/modify)
337
+
338
+ Complexity: {trivial|simple|moderate|complex}
339
+
340
+ Approve? (yes / edit / cancel)
341
+ ```
342
+
343
+ - If **yes** → call `catalyst_worker_approve`, continue to Phase 4
344
+ - If **edit** → user provides changes, update plan, re-present
345
+ - If **cancel** → call `catalyst_worker_fail`, stop
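
The three responses above, together with the hard gate, can be read as a small decision table. The sketch below is one possible encoding. `handlePlanResponse` and `GateOutcome` are hypothetical names, and treating any unrecognized reply (including "just do it") as a request to re-present the plan is an assumption drawn from the hard-gate rules.

```typescript
interface GateOutcome {
  toolCall: "catalyst_worker_approve" | "catalyst_worker_fail" | null;
  next: "phase-4" | "re-present-plan" | "stop";
}

function handlePlanResponse(reply: string): GateOutcome {
  const normalized = reply.trim().toLowerCase();
  if (normalized === "yes") {
    return { toolCall: "catalyst_worker_approve", next: "phase-4" };
  }
  if (normalized === "cancel") {
    return { toolCall: "catalyst_worker_fail", next: "stop" };
  }
  // "edit", "just do it", or anything else: show the (updated) plan again.
  // Implementation never starts without an explicit "yes".
  return { toolCall: null, next: "re-present-plan" };
}
```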
346
+
347
+ ### Phase 4 — Branch & Implementation
348
+
349
+ 1. Call `catalyst_worker_branch` to create feature branch
350
+ 2. Call `catalyst_board_move` to set issue status to "in-progress"
351
+ 3. **For each step in the plan:**
352
+ a. Implement the code change (create/modify files)
353
+ b. Follow existing codebase patterns discovered in Phase 2
354
+ c. After each logical unit, run `catalyst_exec("npm test")` or equivalent to verify nothing is broken
355
+ d. If tests fail, diagnose and fix (max 3 attempts per step — circuit breaker)
356
+ 4. Write tests for new code:
357
+ - Match existing test framework and conventions
358
+ - Cover happy path + key edge cases
359
+ - Run tests to confirm they pass
360
+
361
+ **Circuit breaker:** If any step fails 3 times consecutively:
362
+ - Call `catalyst_board_comment` with details of the failure
363
+ - Call `catalyst_worker_fail` with reason
364
+ - Tell the user what went wrong and which step is stuck
365
+ - STOP. Do not continue.
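
The circuit breaker amounts to a bounded retry loop. The following sketch illustrates the idea under stated assumptions: `runWithCircuitBreaker` and its `attempt` callback are hypothetical, and the callback stands in for "apply a fix, then re-run the tests", returning `null` on success or an error message on failure.

```typescript
interface StepResult {
  ok: boolean;
  attempts: number;
  lastError: string | null;
}

// Runs `attempt` until it succeeds or the attempt limit is reached.
function runWithCircuitBreaker(
  attempt: (attemptNumber: number) => string | null,
  maxAttempts: number = 3
): StepResult {
  let lastError: string | null = null;
  for (let n = 1; n <= maxAttempts; n++) {
    lastError = attempt(n);
    if (lastError === null) {
      return { ok: true, attempts: n, lastError: null };
    }
  }
  // Breaker tripped: the caller is expected to comment on the issue,
  // call catalyst_worker_fail, tell the user which step is stuck, and stop.
  return { ok: false, attempts: maxAttempts, lastError: lastError };
}
```

Passing `5` as `maxAttempts` would model the Phase 2.5c limit with the same loop.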
366
+
367
+ ### Phase 5 — Change Management (Deliver Skill #1)
368
+
369
+ **Load and follow** `.github/skills/deliver-change-management/SKILL.md` methodology:
370
+
371
+ 1. Run `catalyst_git_diff` to analyze all changes
372
+ 2. Categorize changes by type: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
373
+ 3. **If changes span multiple logical units** (e.g., new feature + test + config):
374
+ - Split into separate commits with `catalyst_git_stage` per group
375
+ - Each commit gets its own conventional message
376
+ - Example:
377
+ ```
378
+ git add src/feature.ts
379
+ → feat(scope): add feature X (closes #ID)
380
+
381
+ git add tests/feature.test.ts
382
+ → test(scope): add tests for feature X
383
+
384
+ git add docs/api.md
385
+ → docs(scope): update API docs for feature X
386
+ ```
387
+ 4. **If changes are a single logical unit**, create one commit:
388
+ - Format: `feat(scope): description (closes #ID)`
389
+ - Body: what was implemented and why
390
+ - Footer: `Closes #ID`
391
+ 5. Call `catalyst_git_stage` and `catalyst_git_commit` for each logical commit
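
The commit-splitting step can be sketched as a grouping function. Everything here is illustrative: `classify` uses a path-based heuristic that is an assumption (the actual skill categorizes from the diff), `planCommits` and `PHRASING` are hypothetical names, and only four of the six commit types are modelled.

```typescript
type CommitType = "feat" | "test" | "docs" | "chore";

interface PlannedCommit {
  type: CommitType;
  files: string[];
  message: string;
}

// Path-based heuristic, assumed for illustration only.
function classify(path: string): CommitType {
  if (/(^|\/)tests?\//.test(path) || /\.(test|spec)\.[a-z]+$/.test(path)) {
    return "test";
  }
  if (/(^|\/)docs\//.test(path) || /\.md$/.test(path)) {
    return "docs";
  }
  if (/(^|\/)src\//.test(path)) {
    return "feat";
  }
  return "chore";
}

const PHRASING: Record<CommitType, string> = {
  feat: "add ",
  test: "add tests for ",
  docs: "update docs for ",
  chore: "update tooling for ",
};

// Groups changed files into one conventional commit per type.
function planCommits(files: string[], scope: string, summary: string, issueId: number): PlannedCommit[] {
  const order: CommitType[] = ["feat", "test", "docs", "chore"];
  const commits: PlannedCommit[] = [];
  order.forEach(function (kind) {
    const group = files.filter(function (f) { return classify(f) === kind; });
    if (group.length === 0) {
      return;
    }
    // Only the feat commit carries the issue-closing reference.
    const closes = kind === "feat" ? " (closes #" + issueId + ")" : "";
    commits.push({
      type: kind,
      files: group,
      message: kind + "(" + scope + "): " + PHRASING[kind] + summary + closes,
    });
  });
  return commits;
}
```

Each returned entry corresponds to one `catalyst_git_stage` plus `catalyst_git_commit` pair.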
392
+
393
+ ### Phase 5b — Doc Governance (Deliver Skill #2)
394
+
395
+ **Load and follow** `.github/skills/deliver-doc-governance/SKILL.md` methodology:
396
+
397
+ 1. Check if the changes affect any **public interfaces**:
398
+ - New/changed API endpoints
399
+ - New/changed CLI commands
400
+ - New/changed configuration options
401
+ - New/changed tool signatures
402
+ - New/changed exports or public functions
403
+ 2. If public interfaces changed, run drift detection:
404
+ - Compare README against actual project structure and features
405
+ - Compare API docs against actual function signatures
406
+ - Check if code examples still work
407
+ - Verify install/setup instructions are still accurate
408
+ 3. **If drift found:**
409
+ - Fix the documentation directly (same branch)
410
+ - Stage and commit: `docs(scope): sync docs with implementation changes`
411
+ - Add to the PR body: `### Documentation Updated` section listing what was synced
412
+ 4. **If no public interfaces changed**, skip — note "No doc changes needed" in the PR body
413
+
414
+ ### Phase 6 — PR Creation & Auto-Review
415
+
416
+ 1. Call `catalyst_worker_preview_pr` with:
417
+ - Title: primary commit message
418
+ - Body: markdown with sections:
419
+ - **What**: summary of changes
420
+ - **Why**: linked to issue #{ID}
421
+ - **Changes**: list of commits with descriptions
422
+ - **Documentation Updated**: what docs were synced (or "N/A")
423
+ - **How to test**: steps to verify
424
+ - **Checklist**: tests pass, lint clean, types clean, docs synced
425
+ <HARD-GATE>
426
+ 2. **HUMAN GATE**: User reviews the preview — do NOT create the PR until the user approves.
427
+ If the user requests changes, apply them and re-preview. Never skip this gate.
428
+ </HARD-GATE>
429
+ 3. Call `catalyst_worker_pr` to create the PR
430
+ 4. **Run PR Intelligence** (read `.github/skills/assure-pr-intelligence/SKILL.md`):
431
+ - **Change inventory**: categorize changed files (core, api, test, config, docs)
432
+ - **Risk assessment**: evaluate scope, complexity, blast radius, test coverage, reversibility → Low/Medium/High/Critical risk score
433
+ - **Reviewer guidance**: order files by review priority, flag lines needing attention, list questions the reviewer should ask, note what's missing from the PR
434
+ - **Merge readiness checklist**: tests pass, security clean, breaking changes documented, PR description matches changes
435
+ - Post the full PR Intelligence report as a **comment on the PR** so the assigned reviewer sees it immediately
436
+ 5. Read the diff of the PR
437
+ 6. Run **code-quality** review internally (read `.github/skills/assure-code-quality/SKILL.md`):
438
+ - Correctness: does the code do what the acceptance criteria say?
439
+ - Security: any obvious vulnerabilities (SQL injection, XSS, secrets)?
440
+ - Performance: any N+1 queries, await-in-loops, unnecessary re-renders?
441
+ - Style: does it match codebase conventions?
442
+ 7. Run **vulnerability-scan** internally (read `.github/skills/assure-vulnerability-scan/SKILL.md`):
443
+ - OWASP Top 10 quick check on new code
444
+ - Dependency audit: `npm audit` or `pip-audit`
445
+ 8. Run `catalyst_exec("npm run lint")` and `catalyst_exec("npm run typecheck")` if available
446
+ 9. **Run diff-scoped pattern detection** (read `.github/skills/insights-pattern-detection/SKILL.md`):
447
+ - Scope: only scan files changed in the diff (NOT full codebase)
448
+ - Check for **consistency** with existing codebase patterns:
449
+ - Error handling style matches project conventions?
450
+ - Data access patterns match?
451
+ - Naming conventions followed?
452
+ - Test structure matches existing tests?
453
+ - Check for **anti-patterns** in changed files:
454
+ - God object/file (single file > 500 lines with mixed responsibilities)
455
+ - Copy-paste (near-duplicate code blocks)
456
+ - Shotgun surgery (small change touching too many files)
457
+ - Primitive obsession (strings/numbers where domain types belong)
458
+ - **Query knowledge base for repeat offenses**:
459
+ - `catalyst_knowledge_search` type: `pattern` — "has this same anti-pattern been flagged before?"
460
+ - If a repeat offense is found, flag prominently:
461
+ ```
462
+ ⚠️ Recurring Pattern Issue: {description}
463
+ Previously flagged in: {previous context}
464
+ Suggestion: Consider a structural fix.
465
+ ```
466
+ - Run `catalyst_metrics_complexity` on changed files — flag any function with complexity > threshold
467
+ - Include pattern findings in the PR comment:
468
+ ```
469
+ 🔎 Pattern Detection Results:
470
+ Consistency: {✓ follows codebase patterns | ⚠️ deviations found}
471
+ Anti-patterns: {✓ none | ⚠️ {list}}
472
+ Repeat issues: {✓ none | ⚠️ {count} recurring}
473
+ Complexity: {✓ within threshold | ⚠️ {files} above limit}
474
+ ```
475
+ 10. **If issues found (review, security, or pattern):**
476
+ - Fix them directly
477
+ - Re-commit: `fix(scope): address review findings`
478
+ - Re-push
479
+ - Re-run pattern detection on the fix to confirm resolution
480
+ 11. Call `catalyst_worker_review_done` with verdict: "approved" and summary
481
+ 12. Call `catalyst_board_move` to set issue status to "in-review"
482
+ 13. Call `catalyst_board_comment`: "PR #{pr_number} opened. Ready for review."
483
+
484
+ ### Phase 7 — Deploy Guard (Deliver Skill #3)
485
+
486
+ **Load and follow** `.github/skills/deliver-deploy-guard/SKILL.md` methodology:
487
+
488
+ Before marking ready to merge, run the 6-gate checklist:
489
+
490
+ 1. **Code Quality Gate**: No leftover TODOs, console.logs, or commented-out code in changed files
491
+ 2. **Test Integrity Gate**: All tests pass, coverage meets threshold, no `.skip` tests
492
+ 3. **Security Gate**: No hardcoded secrets, no critical CVEs, no unsafe patterns
493
+ 4. **Configuration Gate**: Env vars documented, no dev config in prod paths
494
+ 5. **Breaking Changes Gate**: API contracts backward-compatible, no dropped exports
495
+ 6. **Operational Readiness Gate**: Health endpoints, logging, error handling
496
+
497
+ Produce a verdict and include in the PR comment:
498
+
499
+ ```
500
+ 🛡️ Deploy Guard Results:
501
+ Code Quality: ✓ pass
502
+ Test Integrity: ✓ pass (coverage: 86%)
503
+ Security: ✓ pass
504
+ Configuration: ✓ pass
505
+ Breaking Changes: ✓ pass
506
+ Operational: ✓ pass
507
+
508
+ Verdict: GO ✅
509
+ ```
510
+
511
+ - If **GO** → proceed to Phase 8
512
+ - If **CONDITIONAL** → list warnings in PR comment, proceed (human decides)
513
+ - If **NO-GO** → fix blockers, re-run until GO or escalate to user
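
One way to derive the verdict from the six gates is sketched below. The mapping is an assumption, since the document does not define it precisely: any failed or missing gate yields NO-GO, any warning yields CONDITIONAL, and otherwise the verdict is GO. `deployVerdict`, `GATES`, and the `"pass" | "warn" | "fail"` status values are hypothetical.

```typescript
type GateStatus = "pass" | "warn" | "fail";
type Verdict = "GO" | "CONDITIONAL" | "NO-GO";

// Gate names as listed in the checklist above.
const GATES: string[] = [
  "Code Quality",
  "Test Integrity",
  "Security",
  "Configuration",
  "Breaking Changes",
  "Operational Readiness",
];

function deployVerdict(results: { [gate: string]: GateStatus }): Verdict {
  let verdict: Verdict = "GO";
  for (const gate of GATES) {
    const gateStatus = results[gate];
    // A gate that was never evaluated is treated as a blocker.
    if (gateStatus === undefined || gateStatus === "fail") {
      return "NO-GO";
    }
    if (gateStatus === "warn") {
      verdict = "CONDITIONAL";
    }
  }
  return verdict;
}
```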
514
+
515
+ ### Phase 8 — Completion & Learning
516
+
517
+ 1. Call `catalyst_board_comment` with deploy guard results: "All checks passed. Ready to merge."
518
+ 2. **Store knowledge** via `catalyst_knowledge_store`:
519
+ - Decisions made during implementation (type: `decision`)
520
+ - Root cause findings, if this was a bug fix (type: `root-cause`)
521
+ - **Pattern findings** from Phase 6 (type: `pattern`):
522
+ - What patterns were followed or violated
523
+ - Any anti-patterns found and fixed
524
+ - Any repeat offenses detected
525
+ - Complexity hotspots
526
+ - This creates the **self-improvement loop**: future runs query this data in Phase 2 to avoid repeating the same mistakes
527
+ 3. Present final summary to user:
528
+
529
+ ```
530
+ ✅ Autopilot Complete
531
+
532
+ Issue: #{id} — {title}
533
+ Branch: {branch_name}
534
+ PR: #{pr_number}
535
+ Status: Ready to merge
536
+
537
+ Changes:
538
+ - {N} commits across {M} files
539
+ - {file} (created/modified) — {what changed}
540
+
541
+ Deliver Pipeline:
542
+ Change Mgmt: {N} conventional commits (feat/fix/test/docs)
543
+ Doc Sync: {updated | no changes needed}
544
+ Deploy Guard: {GO | CONDITIONAL — warnings}
545
+
546
+ {if bug fix:}
547
+ Root Cause: {one-sentence root cause}
548
+ Design Gap: {why it was vulnerable}
549
+ Prevention: {what would catch this earlier}
550
+
551
+ Tests: {X} passing | Coverage: {Y}%
552
+ Review: Auto-reviewed — code-quality + vulnerability-scan
553
+ Security: No issues found
554
+ Patterns: {✓ clean | ⚠️ {count} findings — stored for future runs}
555
+ Repeat Issues: {none | {count} recurring patterns detected}
556
+
557
+ → Merge when ready. Board will auto-update on close.
558
+ ```
559
+
560
+ 4. Call `catalyst_worker_complete`
561
+
562
+ ### Capability Hints (on completion)
563
+
564
+ After presenting the final summary, append **one** contextual hint based on the session. Show each hint at most once per session.
565
+
566
+ | Context | Hint |
567
+ |---|---|
568
+ | First time user ran autopilot | 💡 *I can also parse meeting transcripts into user stories and epics — say "parse meeting" with your notes.* |
569
+ | Multiple autopilot runs completed | 💡 *I can generate a daily digest summarizing all your work — say "daily digest" or "eod report".* |
570
+ | Knowledge was stored during this run | 💡 *I remember decisions across sessions. Ask "what did we decide about X" anytime to recall.* |
571
+ | Pattern issues were detected | 💡 *I can run a full codebase health scan for anti-patterns and tech debt — say "codebase health".* |
572
+
573
+ ## Output Format
574
+
575
+ Always use the structured format shown in each phase. Lead with the status emoji:
576
+ - 📋 = planning
577
+ - ⚠️ = waiting for approval
578
+ - 🔨 = implementing
579
+ - 🔍 = reviewing
580
+ - ✅ = done
581
+ - ✗ = failed
582
+
583
+ ## Anti-Patterns
584
+
585
+ <HARD-GATE>
586
+ - Do NOT skip the human gate (Phase 3). The plan MUST be shown and approved.
587
+ - Do NOT auto-merge the PR. Only humans merge.
588
+ - Do NOT bypass the PR preview gate (Phase 6). The user MUST see the preview.
589
+ </HARD-GATE>
590
+ - Do NOT continue after 3 consecutive failures on a step. Escalate to human.
591
+ - Do NOT install new dependencies without mentioning them in the plan.
592
+ - Do NOT modify files outside the scope of the plan without asking.
593
+ - Do NOT generate placeholder/stub code. Every file must be functional.
594
+ - Do NOT skip tests. If the project has a test framework, write tests.
595
+
596
+ ## No Placeholders
597
+
598
+ Every step in the Phase 3 plan and every file produced in Phase 4 must contain real, working content. The following are **plan failures** — never write them:
599
+
600
+ | Forbidden Pattern | Why It Fails |
601
+ |---|---|
602
+ | "TBD", "TODO", "implement later" | Defers work that should be done now |
603
+ | "Add appropriate error handling" | Vague — specify which errors and how to handle them |
604
+ | "Add validation" | Which inputs? What rules? What error messages? |
605
+ | "Handle edge cases" | Name the edge cases or don't mention them |
606
+ | "Write tests for the above" | Show the actual test code |
607
+ | "Similar to Phase N" | Repeat the details — context resets between phases |
608
+ | Steps without code blocks | If a step changes code, show the code |
609
+ | References to undefined types/functions | Every symbol must trace back to an earlier step |
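
The phrase-based rows of the table lend themselves to a simple text scan before a plan is presented. The sketch below is a coarse check only: `findPlaceholders` and `FORBIDDEN` are hypothetical names, the scan can flag lines that are in fact specific (for example a step that starts with "Add validation" and then spells out the rules), and it does not cover the last two rows, which need structural checks.

```typescript
// Phrases taken from the table above; matching is case-insensitive.
const FORBIDDEN: RegExp[] = [
  /\bTBD\b/i,
  /\bTODO\b/i,
  /\bimplement later\b/i,
  /\badd appropriate error handling\b/i,
  /\badd validation\b/i,
  /\bhandle edge cases\b/i,
  /\bwrite tests for the above\b/i,
  /\bsimilar to phase \d+(\.\d+)?[a-z]?\b/i,
];

interface PlaceholderHit {
  line: number; // 1-based line number within the plan text
  text: string; // the offending line, trimmed
}

function findPlaceholders(plan: string): PlaceholderHit[] {
  const hits: PlaceholderHit[] = [];
  plan.split("\n").forEach(function (text, index) {
    if (FORBIDDEN.some(function (re) { return re.test(text); })) {
      hits.push({ line: index + 1, text: text.trim() });
    }
  });
  return hits;
}
```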
610
+
611
+ ## Chains To
612
+
613
+ - `solution-design` — Phase 2.5: generate solution design doc when `needs-design` label detected
614
+ - `architecture-planner` — Phase 2.5: generate ADR when `needs-architecture` label detected
615
+ - `root-cause-analysis` — Phase 2.5c: systematic RCA when `bug`/`defect`/`regression` label detected
616
+ - `change-management` — Phase 5: proper conventional commits with multi-commit splitting
617
+ - `doc-governance` — Phase 5b: auto-detect and fix documentation drift
618
+ - `pr-intelligence` — Phase 6: risk assessment + reviewer guidance posted on PR
619
+ - `code-quality` — Phase 6: multi-pass review of the PR
620
+ - `vulnerability-scan` — Phase 6: security audit of new code
621
+ - `pattern-detection` — Phase 2 (query known patterns) + Phase 6 (diff-scoped scan) + Phase 8 (store findings)
622
+ - `deploy-guard` — Phase 7: 6-gate safety check before marking ready to merge
623
+ - `knowledge-base` — Phase 2, 2.5, 2.5c, 6, 8: the memory hub that powers the self-improvement loop