@crewpilot/agent 1.0.0 → 3.0.0

Files changed (27)
  1. package/README.md +35 -11
  2. package/dist-npm/cli.js +5 -5
  3. package/dist-npm/index.js +171 -138
  4. package/package.json +2 -2
  5. package/prompts/agent.md +38 -22
  6. package/prompts/copilot-instructions.md +8 -8
  7. package/prompts/{catalyst.config.json → crewpilot.config.json} +1 -1
  8. package/prompts/skills/assure-code-quality/SKILL.md +3 -3
  9. package/prompts/skills/assure-pr-intelligence/SKILL.md +4 -4
  10. package/prompts/skills/assure-review-functional/SKILL.md +114 -0
  11. package/prompts/skills/assure-review-standards/SKILL.md +106 -0
  12. package/prompts/skills/assure-threat-model/SKILL.md +182 -0
  13. package/prompts/skills/assure-vulnerability-scan/SKILL.md +1 -1
  14. package/prompts/skills/autopilot-meeting/SKILL.md +43 -16
  15. package/prompts/skills/autopilot-worker/SKILL.md +177 -63
  16. package/prompts/skills/daily-digest/SKILL.md +35 -14
  17. package/prompts/skills/deliver-change-management/SKILL.md +6 -6
  18. package/prompts/skills/deliver-deploy-guard/SKILL.md +6 -6
  19. package/prompts/skills/deliver-doc-governance/SKILL.md +2 -2
  20. package/prompts/skills/engineer-feature-builder/SKILL.md +3 -3
  21. package/prompts/skills/engineer-root-cause-analysis/SKILL.md +3 -3
  22. package/prompts/skills/engineer-test-first/SKILL.md +2 -2
  23. package/prompts/skills/insights-knowledge-base/SKILL.md +32 -11
  24. package/prompts/skills/insights-pattern-detection/SKILL.md +5 -5
  25. package/prompts/skills/strategize-architecture-planner/SKILL.md +2 -2
  26. package/prompts/skills/strategize-solution-design/SKILL.md +2 -2
  27. package/scripts/postinstall.js +4 -4
@@ -18,30 +18,38 @@ This pipeline chains 12 skills across role boundaries (e.g. code-quality and vul
 
  ## Tools Required
 
- - `catalyst_board_connect` — connect to board provider
- - `catalyst_board_create` — create issue on board
- - `catalyst_board_move` — update issue status
- - `catalyst_board_comment` — log progress on the issue
- - `catalyst_worker_start` — start orchestrator workflow
- - `catalyst_worker_plan` — set execution plan
- - `catalyst_worker_approve` — human approval gate
- - `catalyst_worker_branch` — create feature branch
- - `catalyst_worker_pr` — push + open PR
- - `catalyst_worker_review_done` — record review verdict
- - `catalyst_worker_complete` — mark workflow done
- - `catalyst_worker_fail` — circuit breaker on failure
- - `catalyst_git_stage` — stage files
- - `catalyst_git_commit` — commit changes
- - `catalyst_exec` — run commands (tests, lint, build)
- - `catalyst_knowledge_store` — store decisions made during implementation
- - `catalyst_git_diff` — analyze changes for change-management
- - `catalyst_git_log` — commit history for release notes
- - `catalyst_metrics_coverage` — coverage check for deploy-guard
- - `catalyst_metrics_complexity` — complexity check for deploy-guard and pattern detection
- - `catalyst_worker_preview_pr` — preview changes before PR creation
- - `catalyst_worker_push_fixes` — push fixes to existing PR branch (no new PR)
- - `catalyst_board_pr_comments` — fetch review comments from a PR
- - `catalyst_knowledge_search` — query known patterns, anti-patterns, and past root causes
+ - `crewpilot_board_connect` — connect to board provider
+ - `crewpilot_board_create` — create issue on board
+ - `crewpilot_board_move` — update issue status
+ - `crewpilot_board_comment` — log progress on the issue
+ - `crewpilot_worker_start` — start orchestrator workflow
+ - `crewpilot_worker_plan` — set execution plan
+ - `crewpilot_worker_approve` — human approval gate
+ - `crewpilot_worker_branch` — create feature branch
+ - `crewpilot_worker_pr` — push + open PR
+ - `crewpilot_worker_review_done` — record review verdict
+ - `crewpilot_worker_complete` — mark workflow done
+ - `crewpilot_worker_fail` — circuit breaker on failure
+ - `crewpilot_git_stage` — stage files
+ - `crewpilot_git_commit` — commit changes
+ - `crewpilot_exec` — run commands (tests, lint, build)
+ - `crewpilot_knowledge_store` — store decisions made during implementation
+ - `crewpilot_git_diff` — analyze changes for change-management
+ - `crewpilot_git_log` — commit history for release notes
+ - `crewpilot_metrics_coverage` — coverage check for deploy-guard
+ - `crewpilot_metrics_complexity` — complexity check for deploy-guard and pattern detection
+ - `crewpilot_worker_preview_pr` — preview changes before PR creation
+ - `crewpilot_worker_push_fixes` — push fixes to existing PR branch (no new PR)
+ - `crewpilot_board_pr_comments` — fetch review comments from a PR
+ - `crewpilot_knowledge_search` — query known patterns, anti-patterns, and past root causes
+ - `crewpilot_artifact_write` — persist phase outputs (analysis, plans, reviews) so downstream phases can read them
+ - `crewpilot_artifact_read` — read artifacts from prior phases (e.g. analysis → plan, plan → implementation)
+ - `crewpilot_artifact_list` — list all artifacts for the current workflow
+ - `crewpilot_dispatch_subagent` — delegate focused work (code review, test writing, security audit) to specialized sub-agents
+ - `crewpilot_session_save` — save session state for long-running tasks (enables resume across conversations)
+ - `crewpilot_session_restore` — restore a previously saved session to continue work
+ - `crewpilot_session_list` — list all saved sessions
+ - `mcp_workiq_ask_work_iq` — (optional, requires Work IQ extension) fetch M365 context (emails, docs, meetings) related to the task
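The 24 pre-existing tools in the list above keep their suffixes and descriptions; only the `catalyst_` prefix becomes `crewpilot_`. A minimal TypeScript sketch of that mapping, which might help when migrating custom prompts or allow-lists, follows. The names `renameTool` and `migratePrompt` are hypothetical helpers and are not part of the package.

```typescript
// Hypothetical migration helpers; not part of @crewpilot/agent.
const OLD_PREFIX = "catalyst_";
const NEW_PREFIX = "crewpilot_";

// Maps a 1.0.0 tool name to its 3.0.0 name; any other name passes through unchanged.
function renameTool(name: string): string {
  return name.indexOf(OLD_PREFIX) === 0
    ? NEW_PREFIX + name.slice(OLD_PREFIX.length)
    : name;
}

// Rewrites every catalyst_* identifier found inside a prompt or config string.
function migratePrompt(text: string): string {
  return text.replace(
    /\bcatalyst_([a-z_]+)\b/g,
    (_match: string, suffix: string) => NEW_PREFIX + suffix
  );
}
```

Tools that never carried the old prefix, such as `mcp_workiq_ask_work_iq`, are left untouched by both helpers.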
 
  ## Methodology
 
@@ -56,6 +64,7 @@ digraph autopilot_worker {
  analysis [label="Phase 2\nCodebase Analysis & Planning"];
  design [label="Phase 2.5\nDesign & Architecture\n(label-gated)", style=dashed];
  rca [label="Phase 2.5c\nRoot Cause Analysis\n(bug label-gated)", style=dashed];
+ threat [label="Phase 2.5d\nThreat Model\n(security label-gated)", style=dashed];
  plan_gate [label="Phase 3\nHUMAN GATE: Plan Approval", shape=diamond, style=filled, fillcolor="#ffcccc"];
  implement [label="Phase 4\nBranch & Implementation"];
  change_mgmt [label="Phase 5\nChange Management"];
@@ -68,9 +77,11 @@ digraph autopilot_worker {
  intake -> analysis;
  analysis -> design [label="needs-design\nor needs-architecture"];
  analysis -> rca [label="bug/defect/\nregression"];
+ analysis -> threat [label="needs-threat-model\nor security-sensitive"];
  analysis -> plan_gate [label="no special labels"];
  design -> plan_gate;
  rca -> plan_gate;
+ threat -> plan_gate;
  plan_gate -> implement [label="approved"];
  plan_gate -> fail [label="cancelled"];
  implement -> change_mgmt;
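The label-gated edges out of `analysis` in the graph above can be read as a small routing function. The TypeScript sketch below is one possible reading, assuming that several gated phases may apply to the same issue; `gatedPhases` and the `GatedPhase` type are hypothetical names, not package APIs.

```typescript
// Hypothetical routing sketch based on the edge labels in the digraph.
type GatedPhase = "design" | "rca" | "threat";

// Returns the label-gated phases an issue should pass through before plan_gate.
function gatedPhases(labels: string[]): GatedPhase[] {
  const has = (...names: string[]): boolean =>
    names.some((n) => labels.indexOf(n) !== -1);

  const phases: GatedPhase[] = [];
  if (has("needs-design", "needs-architecture")) phases.push("design");
  if (has("bug", "defect", "regression")) phases.push("rca");
  if (has("needs-threat-model", "security-sensitive")) phases.push("threat");
  return phases; // an empty list corresponds to the "no special labels" edge
}
```

An issue with no special labels yields an empty list, which matches the direct `analysis -> plan_gate` edge.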
@@ -88,15 +99,34 @@ digraph autopilot_worker {
  ### Phase 1 — Intake & Issue Creation
 
  **First interaction hint:** If this is the first interaction in the session, start with:
- > 💡 *Running Catalyst Autopilot — I'll summarize the task, confirm with you before creating a board issue, plan the work, get your approval, implement, test, review, and open a PR.*
+ > 💡 *Running CrewPilot Autopilot — I'll summarize the task, confirm with you before creating a board issue, plan the work, get your approval, implement, test, review, and open a PR.*
 
- **Entry mode detection** — the worker can be entered three ways:
+ **Entry mode detection** — the worker can be entered four ways:
 
  | Entry Mode | How to Detect | Behavior |
  |---|---|---|
  | **Direct** | User says "autopilot", "full pipeline", etc. | Run full pipeline from Phase 1 |
  | **Routed from feature-builder** | feature-builder's Phase 0 classified as moderate/complex | Skip re-analyzing complexity — it's already assessed. Use the context feature-builder gathered. |
  | **Mid-build escalation** | feature-builder discovered more complexity during Phase 4 | Accept the partial context (files already touched, patterns found). Start from Phase 2 (planning) with what's already known. |
+ | **Session resume** | User says "resume", "continue", "pick up where I left off" | Call `crewpilot_session_restore` with the workflow ID. Read the saved state, load associated artifacts, and resume from the last pending action. |
+
+ **Session resume flow**: When resuming, the agent should:
+ 1. Call `crewpilot_session_restore` to get the saved state
+ 2. Call `crewpilot_artifact_list` to see what artifacts exist
+ 3. Read relevant artifacts with `crewpilot_artifact_read`
+ 4. **(Optional) Calendar-aware context refresh**: If `mcp_workiq_ask_work_iq` is available and significant time has passed since the session was saved (overnight, weekend, or >4 hours):
+ - Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent)
+ - **Check for new context**: `mcp_workiq_ask_work_iq` → "What meetings, emails, or Teams messages about {issue title / feature} happened since {saved_at timestamp}? Summarize any new decisions, requirement changes, or blockers."
+ - **Check calendar conflicts**: `mcp_workiq_ask_work_iq` → "Do I have any meetings in the next 2 hours that might affect my availability?"
+ - If new decisions or requirement changes are found, flag them to the user before continuing:
+ ```
+ 📅 Context Update (since session was saved {age} ago):
+ - {new decision / requirement change / blocker}
+ → Continue with current plan? (yes / re-plan)
+ ```
+ - If unavailable, skip — resume proceeds without M365 context refresh.
+ 5. Continue from the first pending action in the saved state
+ 6. Do NOT re-run phases that have already completed (check artifacts_written)
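Two decisions in the resume flow above are mechanical enough to sketch: whether enough time has passed to justify the optional context refresh, and which phases still need to run. The TypeScript below is a minimal sketch under stated assumptions. It models only the ">4 hours" rule (the "overnight" and "weekend" cases are not modeled separately), and the function names and epoch-millisecond inputs are hypothetical rather than the saved-state schema used by `crewpilot_session_restore`.

```typescript
// Hypothetical helpers for the resume flow; the real saved-state shape is not shown in this diff.
const REFRESH_THRESHOLD_MS = 4 * 60 * 60 * 1000; // the ">4 hours" rule from step 4

// True when the optional M365 refresh should run: the tool is available and the session is stale.
function needsContextRefresh(
  savedAtMs: number,
  nowMs: number,
  workIqAvailable: boolean
): boolean {
  return workIqAvailable && nowMs - savedAtMs > REFRESH_THRESHOLD_MS;
}

// Returns the phases still to run, skipping any already listed in artifacts_written (step 6).
function pendingPhases(allPhases: string[], artifactsWritten: string[]): string[] {
  return allPhases.filter((phase) => artifactsWritten.indexOf(phase) === -1);
}
```

Keeping the availability check inside `needsContextRefresh` mirrors the "If unavailable, skip" rule, so callers need no separate branch for a missing Work IQ extension.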
 
  **Complexity check (direct entry only):** If the user enters autopilot directly, quickly assess if the request warrants the full pipeline:
  - If the request is trivial (single file, obvious change) → suggest: *"This is a small change. I can implement it directly without the full pipeline. Want me to do that instead?"*
@@ -130,18 +160,18 @@ Labels: {labels}
  → Create this task and start the pipeline? (yes / edit / no)
  ```
 
- - If **yes** → call `catalyst_board_create`, continue to Phase 2
+ - If **yes** → call `crewpilot_board_create`, continue to Phase 2
  - If **edit** → user provides corrections, update and re-present
  - If **no** → stop the pipeline. Ask the user what they'd like to do instead.
  - Do NOT create the board issue without explicit user confirmation.
  </HARD-GATE>
 
- 3. Call `catalyst_board_create` with title, description, acceptance criteria
+ 3. Call `crewpilot_board_create` with title, description, acceptance criteria
  4. Note the created issue ID
 
  **If user provides an existing issue number (e.g., "#42"):**
 
- 1. Call `catalyst_board_get` to read the existing issue
+ 1. Call `crewpilot_board_get` to read the existing issue
  2. Use its title, description, and acceptance criteria as-is
  3. No confirmation needed — the task already exists
 
@@ -153,19 +183,31 @@ Labels: {labels}
  - Which files need to be **modified**
  - What patterns/conventions the codebase follows (naming, directory structure, test style)
  - What dependencies might be needed
- 3. Check issue labels for `needs-design`, `needs-architecture`, and `bug`/`defect`/`regression`
- 4. **Query pattern knowledge** via `catalyst_knowledge_search` (type: `pattern`):
+ 3. Check issue labels for `needs-design`, `needs-architecture`, `bug`/`defect`/`regression`, and `needs-threat-model`/`security-sensitive`
+ 4. **Query pattern knowledge** via `crewpilot_knowledge_search` (type: `pattern`):
  - Search for known patterns and anti-patterns in the files being modified
  - Search for past root causes in the same area of the codebase
  - Collect any "repeat offender" warnings from previous runs
  - Feed this context into the plan so the worker avoids known mistakes
- 5. Call `catalyst_worker_start` with the issue ID and title
+ 5. **(Optional) Fetch M365 requirements context**: First call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent), then use **focused queries** to surface requirements context before planning:
+ - **Requirements & specs**: `mcp_workiq_ask_work_iq` → "Find emails, documents, and Teams messages about: {issue title}. Summarize relevant discussions, specs, and design docs."
+ - **Meeting decisions**: `mcp_workiq_ask_work_iq` → "What decisions were made about {issue title / feature name} in recent meetings? What requirements were stated?"
+ - **Stakeholder expectations**: `mcp_workiq_ask_work_iq` → "What did stakeholders or customers say about {feature} in recent emails or meetings? What was promised or committed?"
+ - Feed the M365 context into the analysis artifact so Phase 3's plan addresses stated requirements, not just the issue description.
+ - If `mcp_workiq_ask_work_iq` is unavailable, skip — this step is optional.
+ 6. Call `crewpilot_worker_start` with the issue ID and title
+ 7. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="analysis"` containing:
+ - Files to create/modify
+ - Codebase patterns discovered
+ - Dependencies needed
+ - Label-gated phases to run
+ - Known patterns/anti-patterns from knowledge search
 
  ### Phase 2.5 — Design & Architecture (label-gated)
 
  **Skip this phase entirely if the issue has neither `needs-design` nor `needs-architecture` label.**
 
- Check the issue labels (from `catalyst_board_get`). Run the applicable skills:
+ Check the issue labels (from `crewpilot_board_get`). Run the applicable skills:
 
  #### If issue has `needs-design` label:
 
@@ -188,7 +230,7 @@ Reversal cost: {Low/Medium/High}
  ```
 
  5. **HUMAN GATE**: User picks an approach
- 6. Store the decision via `catalyst_knowledge_store` (type: decision)
+ 6. Store the decision via `crewpilot_knowledge_store` (type: decision)
  7. Write the design document to `docs/design/{issue_id}-{slug}.md`:
  ```markdown
  # Design: {issue title}
@@ -211,6 +253,7 @@ Reversal cost: {Low/Medium/High}
  Confidence: {N}/10 | Reversal cost: {Low/Medium/High}
  ```
  8. Stage the design doc — it will be committed alongside the code in Phase 5
+ 9. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="design"` containing the chosen approach, trade-off summary, and design document path
 
  #### If issue has `needs-architecture` label:
 
@@ -253,6 +296,7 @@ Data Flow:
  {rejected options and why}
  ```
  9. Stage the ADR — it will be committed alongside the code in Phase 5
+ 10. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="architecture"` containing the component decomposition, data flow, interfaces, and ADR path
 
  #### If issue has BOTH labels:
 
@@ -267,8 +311,8 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
 
  1. **Symptom collection**:
  - Extract error message, stack trace, steps to reproduce from the issue description
- - Run `catalyst_git_log` on the affected files to check recent changes
- - Query `catalyst_knowledge_search` for previous root causes in the same area
+ - Run `crewpilot_git_log` on the affected files to check recent changes
+ - Query `crewpilot_knowledge_search` for previous root causes in the same area
  2. **Hypothesis generation** — generate 2-3 ranked hypotheses:
 
  ```
@@ -282,7 +326,7 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
  ```
 
  3. **Systematic elimination** — for each hypothesis (highest first):
- - Run `catalyst_exec` to test (add logging, reproduce, check state)
+ - Run `crewpilot_exec` to test (add logging, reproduce, check state)
  - Record result: confirmed / eliminated / narrowed
  - Max 5 attempts total (circuit breaker — same as Phase 4)
  4. **Root cause identification**:
@@ -293,21 +337,56 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
  - The plan must fix the root cause, not just the symptom
  - Include a regression test that fails without the fix
  - Phase 5 commit footer: `Root-cause: {one-sentence description}`
- 6. **Store root cause** via `catalyst_knowledge_store` (type: `root-cause`):
+ 6. **Store root cause** via `crewpilot_knowledge_store` (type: `root-cause`):
  - What: the root cause description
  - Where: affected files/modules
  - Why: the design gap
  - Prevention: what would have caught this earlier
- 7. **If root cause reveals a systemic issue**, flag it for pattern detection in Phase 6:
+ 7. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="rca"` containing the root cause, causal chain, design gap, prevention strategy, and affected files
+ 8. **If root cause reveals a systemic issue**, flag it for pattern detection in Phase 6:
  - Add note: `systemic:{description}` for Phase 6 to pick up
 
- #### After design/architecture/RCA phases:
+ ### Phase 2.5d — Threat Modeling (label-gated)
+
+ **Skip if the issue does NOT have a `needs-threat-model` or `security-sensitive` label.**
+
+ **Load and follow** `.github/skills/assure-threat-model/SKILL.md` methodology:
+
+ 1. **Read prior artifacts**: Load the `analysis` artifact (and `architecture` if it exists) to understand the system being built
+ 2. **Scope the model**: Define the trust boundaries and data flows for the feature being implemented
+ 3. **STRIDE analysis**: For each component and data flow crossing a trust boundary, evaluate all 6 STRIDE categories
+ 4. **Risk assessment**: Score each threat (Likelihood × Impact = Risk)
+ 5. **Mitigation planning**: For threats with risk ≥ 7, propose specific mitigations with effort and implementation phase
+ 6. **Present to user**:
+
+ ```
+ 🛡️ Threat Model for: "{issue title}"
+
+ | ID | STRIDE | Component | Threat | Risk Score | Mitigation |
+ |----|--------|-----------|--------|------------|------------|
+ | T1 | ... | ... | ... | ... | ... |
+
+ Critical threats: {count}
+ Required mitigations before implementation: {list}
+
+ → Approve threat model? (yes / edit)
+ ```
 
- The design documents and RCA findings inform the implementation plan. Phase 3's plan should reference:
+ 7. **HUMAN GATE**: User approves the threat model
+ 8. Store via `crewpilot_knowledge_store` (type: `threat-model`)
+ 9. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="threat-model"` containing the full threat register
+ 10. Feed critical/high-risk mitigations into Phase 3 plan as mandatory implementation steps
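Steps 4 and 5 of the threat-modeling phase amount to a product and a threshold. The TypeScript sketch below shows that arithmetic only. The numeric scale for likelihood and impact is an assumption, since the skill's own scale is not visible in this diff, and the `Threat` shape and function names are hypothetical.

```typescript
// Hypothetical threat-register entry; the real register format is not shown in this diff.
interface Threat {
  id: string;
  likelihood: number; // scale is an assumption
  impact: number;     // scale is an assumption
}

const MITIGATION_THRESHOLD = 7; // "risk ≥ 7" from step 5

// Step 4: Likelihood × Impact = Risk.
function riskScore(threat: Threat): number {
  return threat.likelihood * threat.impact;
}

// Step 5: IDs of the threats that need a proposed mitigation.
function needsMitigation(threats: Threat[]): string[] {
  return threats
    .filter((t) => riskScore(t) >= MITIGATION_THRESHOLD)
    .map((t) => t.id);
}
```

The comparison is inclusive, so a threat scoring exactly 7 still requires a mitigation.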
+
+ #### After design/architecture/RCA/threat-model phases:
+
+ The design documents, RCA findings, and threat model inform the implementation plan. Phase 3's plan should reference:
  - Which approach was chosen (from design doc)
  - Which components to build (from architecture)
  - Which interfaces to implement (from ADR)
  - What root cause was found (from RCA) and what fix addresses it
+ - What threats were identified (from threat model) and what mitigations are required
+
+ **Read prior artifacts**: Call `crewpilot_artifact_read` to load the `analysis`, `design`, `architecture`, `rca`, and/or `threat-model` artifacts. These contain the full context from earlier phases — do not rely on chat history alone.
 
  ### Phase 3 — HUMAN GATE: Plan Approval
 
@@ -340,18 +419,24 @@ Complexity: {trivial|simple|moderate|complex}
  Approve? (yes / edit / cancel)
  ```
 
- - If **yes** → call `catalyst_worker_approve`, continue to Phase 4
+ - If **yes** → call `crewpilot_worker_approve`, continue to Phase 4
  - If **edit** → user provides changes, update plan, re-present
- - If **cancel** → call `catalyst_worker_fail`, stop
+ - If **cancel** → call `crewpilot_worker_fail`, stop
+
+ **Write artifact**: After approval, call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="plan"` containing the approved plan (steps, files, complexity).
+
+ **Session checkpoint**: After plan approval, call `crewpilot_session_save` with status="checkpoint", phase="phase-3-approved", and the current context. This ensures the approved plan can be resumed if the session is interrupted.
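The write, read, and list pattern behind the artifact tools can be pictured with a small in-memory stand-in. This TypeScript sketch is illustrative only: the real `crewpilot_artifact_*` tools are invoked by the agent and their storage format is not shown in this diff, so the `ArtifactStore` class and its method signatures are hypothetical.

```typescript
// Hypothetical in-memory stand-in for the artifact tools; not the package's implementation.
interface Artifact {
  workflowId: string;
  phase: string;
  content: string;
}

class ArtifactStore {
  private items: { [key: string]: Artifact } = {};

  // Mirrors crewpilot_artifact_write: one artifact per (workflow_id, phase) pair.
  write(workflowId: string, phase: string, content: string): void {
    this.items[workflowId + "/" + phase] = { workflowId, phase, content };
  }

  // Mirrors crewpilot_artifact_read: returns undefined when the phase was never written.
  read(workflowId: string, phase: string): string | undefined {
    const hit = this.items[workflowId + "/" + phase];
    return hit ? hit.content : undefined;
  }

  // Mirrors crewpilot_artifact_list: phases written so far for one workflow.
  list(workflowId: string): string[] {
    return Object.keys(this.items)
      .filter((key) => this.items[key].workflowId === workflowId)
      .map((key) => this.items[key].phase);
  }
}
```

Keying by workflow and phase means that re-writing a phase replaces its earlier artifact; whether the real tool overwrites or versions artifacts is not stated in this diff.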
 
  ### Phase 4 — Branch & Implementation
 
- 1. Call `catalyst_worker_branch` to create feature branch
- 2. Call `catalyst_board_move` to set issue status to "in-progress"
+ **Read prior artifacts**: Call `crewpilot_artifact_read` for `plan` (and `analysis`, `design`, `architecture`, `rca` if they exist) to load the full execution context.
+
+ 1. Call `crewpilot_worker_branch` to create feature branch
+ 2. Call `crewpilot_board_move` to set issue status to "in-progress"
  3. **For each step in the plan:**
  a. Implement the code change (create/modify files)
  b. Follow existing codebase patterns discovered in Phase 2
- c. After each logical unit, run `catalyst_exec("npm test")` or equivalent to verify nothing is broken
+ c. After each logical unit, run `crewpilot_exec("npm test")` or equivalent to verify nothing is broken
  d. If tests fail, diagnose and fix (max 3 attempts per step — circuit breaker)
  4. Write tests for new code:
  - Match existing test framework and conventions
@@ -359,8 +444,8 @@ Approve? (yes / edit / cancel)
  - Run tests to confirm they pass
 
  **Circuit breaker:** If any step fails 3 times consecutively:
- - Call `catalyst_board_comment` with details of the failure
- - Call `catalyst_worker_fail` with reason
+ - Call `crewpilot_board_comment` with details of the failure
+ - Call `crewpilot_worker_fail` with reason
  - Tell the user what went wrong and which step is stuck
  - STOP. Do not continue.
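The circuit breaker above is a bounded retry loop. A minimal TypeScript sketch, assuming each attempt can be modeled as a function that returns `true` on success, is shown below; `runWithCircuitBreaker` and `StepResult` are hypothetical names, and the board comment and `crewpilot_worker_fail` call are left to the caller.

```typescript
// Hypothetical sketch of the "max 3 attempts per step" rule.
interface StepResult {
  ok: boolean;
  attempts: number;
}

// Calls `attempt` until it succeeds or the attempt budget is spent.
function runWithCircuitBreaker(
  attempt: () => boolean,
  maxAttempts: number = 3
): StepResult {
  for (let i = 1; i <= maxAttempts; i++) {
    if (attempt()) {
      return { ok: true, attempts: i };
    }
  }
  // Breaker tripped: the caller should comment on the board, fail the workflow, and stop.
  return { ok: false, attempts: maxAttempts };
}
```

Returning a result instead of throwing keeps the stop decision with the caller, which is where the user-facing explanation of the stuck step belongs.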
 
@@ -368,10 +453,10 @@ Approve? (yes / edit / cancel)
 
  **Load and follow** `.github/skills/deliver-change-management/SKILL.md` methodology:
 
- 1. Run `catalyst_git_diff` to analyze all changes
+ 1. Run `crewpilot_git_diff` to analyze all changes
  2. Categorize changes by type: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
  3. **If changes span multiple logical units** (e.g., new feature + test + config):
- - Split into separate commits with `catalyst_git_stage` per group
+ - Split into separate commits with `crewpilot_git_stage` per group
  - Each commit gets its own conventional message
  - Example:
  ```
@@ -388,7 +473,8 @@ Approve? (yes / edit / cancel)
  - Format: `feat(scope): description (closes #ID)`
  - Body: what was implemented and why
  - Footer: `Closes #ID`
- 5. Call `catalyst_git_stage` and `catalyst_git_commit` for each logical commit
+ 5. Call `crewpilot_git_stage` and `crewpilot_git_commit` for each logical commit
+ 6. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="change-mgmt"` containing the list of commits created (hash, type, scope, message)
 
  ### Phase 5b — Doc Governance (Deliver Skill #2)
 
@@ -413,7 +499,7 @@ Approve? (yes / edit / cancel)
 
  ### Phase 6 — PR Creation & Auto-Review
 
- 1. Call `catalyst_worker_preview_pr` with:
+ 1. Call `crewpilot_worker_preview_pr` with:
  - Title: primary commit message
  - Body: markdown with sections:
  - **What**: summary of changes
@@ -426,7 +512,7 @@ Approve? (yes / edit / cancel)
  2. **HUMAN GATE**: User reviews the preview — do NOT create the PR until the user approves.
  If the user requests changes, apply them and re-preview. Never skip this gate.
  </HARD-GATE>
- 3. Call `catalyst_worker_pr` to create the PR
+ 3. Call `crewpilot_worker_pr` to create the PR
  4. **Run PR Intelligence** (read `.github/skills/assure-pr-intelligence/SKILL.md`):
  - **Change inventory**: categorize changed files (core, api, test, config, docs)
  - **Risk assessment**: evaluate scope, complexity, blast radius, test coverage, reversibility → Low/Medium/High/Critical risk score
@@ -434,7 +520,15 @@ Approve? (yes / edit / cancel)
  - **Merge readiness checklist**: tests pass, security clean, breaking changes documented, PR description matches changes
  - Post the full PR Intelligence report as a **comment on the PR** so the assigned reviewer sees it immediately
  5. Read the diff of the PR
- 6. Run **code-quality** review internally (read `.github/skills/assure-code-quality/SKILL.md`):
+ 6. **Subagent delegation (recommended for moderate/complex changes):** Use `crewpilot_dispatch_subagent` to delegate review work in parallel:
+ - Delegate `code-reviewer` role with the diff and file list — receives correctness, security, and performance findings
+ - Delegate `standards-reviewer` role with the diff and codebase conventions — receives standards compliance findings
+ - Delegate `security-auditor` role with source files and architecture context — receives STRIDE/OWASP findings
+ - Each subagent writes its output as an artifact (e.g. `review-functional`, `review-standards`) for traceability
+ - Merge subagent findings using `crewpilot_dispatch_consensus` to identify high-confidence vs disputed issues
+
+ **Fallback (simple changes):** Run reviews inline without subagent delegation:
+ 7. Run **code-quality** review internally (read `.github/skills/assure-code-quality/SKILL.md`):
  - Correctness: does the code do what the acceptance criteria say?
  - Security: any obvious vulnerabilities (SQL injection, XSS, secrets)?
  - Performance: any N+1 queries, await-in-loops, unnecessary re-renders?
@@ -442,7 +536,22 @@ Approve? (yes / edit / cancel)
  7. Run **vulnerability-scan** internally (read `.github/skills/assure-vulnerability-scan/SKILL.md`):
  - OWASP Top 10 quick check on new code
  - Dependency audit: `npm audit` or `pip audit`
- 8. Run `catalyst_exec("npm run lint")` and `catalyst_exec("npm run typecheck")` if available
+ 8. Run `crewpilot_exec("npm run lint")` and `crewpilot_exec("npm run typecheck")` if available
+ 8b. **(Optional) Requirements alignment validation**: If M365 context was fetched in Phase 2, validate the implementation against meeting-stated requirements:
+ - Read the `analysis` artifact to retrieve the M365 requirements context captured earlier
+ - If the analysis artifact contains meeting decisions or stakeholder expectations, call `mcp_workiq_ask_work_iq` → "What specific requirements and acceptance criteria were stated for {feature} in meetings and emails?"
+ - Cross-reference each stated requirement against the implementation diff:
+ - **Covered**: the requirement is addressed by the code changes ✓
+ - **Partial**: the requirement is partially addressed — flag what's missing
+ - **Missing**: the requirement is not addressed at all — flag as a review finding
+ - Include requirements alignment in the PR comment:
+ ```
+ 📋 Requirements Alignment:
+ Meeting requirements checked: {N}
+ Covered: {count} ✓ | Partial: {count} ⚠️ | Missing: {count} ❌
+ {list any partial/missing items}
+ ```
+ - If critical requirements are missing, flag as a review issue that must be addressed before merge
  9. **Run diff-scoped pattern detection** (read `.github/skills/insights-pattern-detection/SKILL.md`):
  - Scope: only scan files changed in the diff (NOT full codebase)
  - Check for **consistency** with existing codebase patterns:
@@ -456,14 +565,14 @@ Approve? (yes / edit / cancel)
  - Shotgun surgery (small change touching too many files)
  - Primitive obsession (strings/numbers where domain types belong)
  - **Query knowledge base for repeat offenses**:
- - `catalyst_knowledge_search` type: `pattern` — "has this same anti-pattern been flagged before?"
+ - `crewpilot_knowledge_search` type: `pattern` — "has this same anti-pattern been flagged before?"
  - If a repeat offense is found, flag prominently:
  ```
  ⚠️ Recurring Pattern Issue: {description}
  Previously flagged in: {previous context}
  Suggestion: Consider a structural fix.
  ```
- - Run `catalyst_metrics_complexity` on changed files — flag any function with complexity > threshold
+ - Run `crewpilot_metrics_complexity` on changed files — flag any function with complexity > threshold
  - Include pattern findings in the PR comment:
  ```
  🔎 Pattern Detection Results:
@@ -477,9 +586,10 @@ Approve? (yes / edit / cancel)
  - Re-commit: `fix(scope): address review findings`
  - Re-push
  - Re-run pattern detection on the fix to confirm resolution
- 11. Call `catalyst_worker_review_done` with verdict: "approved" and summary
- 12. Call `catalyst_board_move` to set issue status to "in-review"
- 13. Call `catalyst_board_comment`: "PR #{pr_number} opened. Ready for review."
+ 11. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="review-merged"` containing the combined review results (code-quality, vulnerability-scan, pattern detection findings, and fix iterations)
+ 12. Call `crewpilot_worker_review_done` with verdict: "approved" and summary
+ 12. Call `crewpilot_board_move` to set issue status to "in-review"
+ 13. Call `crewpilot_board_comment`: "PR #{pr_number} opened. Ready for review."
 
  ### Phase 7 — Deploy Guard (Deliver Skill #3)
 
@@ -512,10 +622,12 @@ Produce a verdict and include in the PR comment:
  - If **CONDITIONAL** → list warnings in PR comment, proceed (human decides)
  - If **NO-GO** → fix blockers, re-run until GO or escalate to user
 
+ **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="deploy-guard"` containing the full 6-gate results and verdict.
+
  ### Phase 8 — Completion & Learning
 
- 1. Call `catalyst_board_comment` with deploy guard results: "All checks passed. Ready to merge."
- 2. **Store knowledge** via `catalyst_knowledge_store`:
+ 1. Call `crewpilot_board_comment` with deploy guard results: "All checks passed. Ready to merge."
+ 2. **Store knowledge** via `crewpilot_knowledge_store`:
  - Decisions made during implementation (type: `decision`)
  - Root cause findings, if this was a bug fix (type: `root-cause`)
  - **Pattern findings** from Phase 6 (type: `pattern`):
@@ -557,7 +669,8 @@ Repeat Issues: {none | {count} recurring patterns detected}
  → Merge when ready. Board will auto-update on close.
  ```

- 4. Call `catalyst_worker_complete`
+ 4. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="completion"` containing the final summary (PR number, branch, commits, review/deploy-guard results, knowledge stored)
+ 5. Call `crewpilot_worker_complete`

  ### Capability Hints (on completion)

@@ -613,6 +726,7 @@ Every step in the Phase 3 plan and every file produced in Phase 4 must contain r
  - `solution-design` — Phase 2.5: generate solution design doc when `needs-design` label detected
  - `architecture-planner` — Phase 2.5: generate ADR when `needs-architecture` label detected
  - `root-cause-analysis` — Phase 2.5c: systematic RCA when `bug`/`defect`/`regression` label detected
+ - `threat-model` — Phase 2.5d: STRIDE threat modeling when `needs-threat-model`/`security-sensitive` label detected
  - `change-management` — Phase 5: proper conventional commits with multi-commit splitting
  - `doc-governance` — Phase 5b: auto-detect and fix documentation drift
  - `pr-intelligence` — Phase 6: risk assessment + reviewer guidance posted on PR
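The three `crewpilot_artifact_write` steps added in these hunks share one shape: a `workflow_id` taken from the issue and a `phase` label. A minimal TypeScript sketch of assembling those arguments follows. Only `workflow_id` and the phase names `review-merged`, `deploy-guard`, and `completion` come from the diff; the `content` field, the `ArtifactWriteArgs` type, the `buildArtifactArgs` helper, and the sample values are hypothetical, because the tool's real input schema is not shown here.

```typescript
// Phase names are taken from the diff; everything else is illustrative.
type ArtifactPhase = "review-merged" | "deploy-guard" | "completion";

// Hypothetical argument shape: the diff names `workflow_id` and `phase`,
// while `content` is an assumed field holding the artifact body.
interface ArtifactWriteArgs {
  workflow_id: string;
  phase: ArtifactPhase;
  content: string;
}

function buildArtifactArgs(
  issueId: string | number,
  phase: ArtifactPhase,
  sections: Record<string, string>,
): ArtifactWriteArgs {
  // Render each named section as a small markdown block.
  const content = Object.entries(sections)
    .map(([title, body]) => `## ${title}\n${body.trim()}`)
    .join("\n\n");
  return { workflow_id: String(issueId), phase, content };
}

// Made-up sample values, for illustration only.
const completionArgs = buildArtifactArgs(42, "completion", {
  "Pull request": "PR #17 on branch feature/login",
  "Deploy guard": "GO",
});
```

Keeping the phase as a closed union means a typo such as `"deploy_guard"` would be caught at compile time rather than producing a stray artifact.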
@@ -12,12 +12,15 @@ Generate a comprehensive daily/weekly work summary by aggregating git activity,

  ## Tools Required

- - `catalyst_git_log` — get commits for the time period
- - `catalyst_board_my_items` — get board items (opened, closed, in-progress)
- - `catalyst_worker_dashboard` — workflow completions and stats
- - `catalyst_knowledge_timeline` — decisions made in the period
- - `catalyst_exec` — run git/gh commands for additional data
- - `catalyst_notify_send` — deliver the report via email
+ - `crewpilot_git_log` — get commits for the time period
+ - `crewpilot_board_my_items` — get board items (opened, closed, in-progress)
+ - `crewpilot_worker_dashboard` — workflow completions and stats
+ - `crewpilot_knowledge_timeline` — decisions made in the period
+ - `crewpilot_exec` — run git/gh commands for additional data
+ - `crewpilot_notify_send` — deliver the report via email
+ - `mcp_workiq_accept_eula` — (optional) accept Work IQ EULA before first query
+ - `mcp_workiq_ask_work_iq` — (optional, requires Work IQ extension) fetch M365 activity (emails, meetings, docs, Teams) for a full work-surface report
+ - `crewpilot_artifact_write` — persist the digest as an artifact

  ## Methodology

@@ -42,26 +45,44 @@ digraph daily_digest {
  Gather from all sources for the requested time period (default: today):

  **Git Activity:**
- 1. Call `catalyst_git_log` with `--since="today 00:00"` (or requested range)
+ 1. Call `crewpilot_git_log` with `--since="today 00:00"` (or requested range)
  2. Extract: commit count, files changed, insertions/deletions, branches touched
  3. Group commits by scope/type (feat, fix, refactor, test, docs)

  **Board Activity:**
- 1. Call `catalyst_exec` with `gh issue list --author=@me --state=all --json number,title,state,updatedAt,labels`
+ 1. Call `crewpilot_exec` with `gh issue list --author=@me --state=all --json number,title,state,updatedAt,labels`
  2. Filter to items updated in the time period
  3. Categorize: created, moved to in-progress, closed/done, commented on

  **PR Activity:**
- 1. Call `catalyst_exec` with `gh pr list --author=@me --state=all --json number,title,state,createdAt,mergedAt,reviewDecision`
+ 1. Call `crewpilot_exec` with `gh pr list --author=@me --state=all --json number,title,state,createdAt,mergedAt,reviewDecision`
  2. Filter to time period
  3. Categorize: opened, merged, review pending, changes requested

  **Workflow Activity:**
- 1. Call `catalyst_worker_dashboard` for digital worker stats
+ 1. Call `crewpilot_worker_dashboard` for digital worker stats
  2. Filter completed/failed workflows in the period

  **Knowledge:**
- 1. Call `catalyst_knowledge_timeline` for decisions and lessons stored today
+ 1. Call `crewpilot_knowledge_timeline` for decisions and lessons stored today
+
+ **M365 Activity (optional — requires Work IQ MCP server):**
+ 1. Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent — safe to call every time)
+ 2. Use **multiple focused queries** for comprehensive coverage (targeted queries return better results than one broad question):
+ - **Emails**: `mcp_workiq_ask_work_iq` → "What emails did I send and receive on {date}? Summarize key threads and any action items."
+ - **Meetings**: `mcp_workiq_ask_work_iq` → "What meetings did I attend on {date}? What decisions were made and what action items were assigned to me?"
+ - **Documents**: `mcp_workiq_ask_work_iq` → "What documents did I edit or view in SharePoint and OneDrive on {date}?"
+ - **Teams**: `mcp_workiq_ask_work_iq` → "What Teams channel messages and chats was I active in on {date}? What mentions did I receive?"
+ - **Tasks**: `mcp_workiq_ask_work_iq` → "What Planner or To-Do tasks did I complete or get assigned on {date}?"
+ 3. If Work IQ is available, parse all responses and include the full work surface:
+ - **Emails**: sent/received count, key threads, action items from emails
+ - **Meetings**: attended meetings, decisions made, action items assigned, linked documents
+ - **Documents**: files edited/viewed in SharePoint/OneDrive, co-authoring activity
+ - **Teams**: active channel conversations, 1:1 chats, mentions, and responses
+ - **Tasks**: Planner/To-Do items completed, created, or updated
+ 4. If `mcp_workiq_ask_work_iq` is unavailable or errors, skip this section — the digest works without it (git + board + PRs is the baseline)
+
+ > **Query budget**: Work IQ queries have a ~30/session budget. The 5 queries above are a reasonable investment for a full daily digest. For weekly summaries, combine into broader date-range queries to conserve budget.

  ### Phase 2 — Report Generation

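The PR Activity steps (filter to the period, then categorize) can be sketched as below. This assumes the JSON fields requested in the `gh pr list` call and the uppercase `state` and `reviewDecision` values the GitHub CLI normally emits. The `PullRequest` type, the `categorizePrs` helper, and the rule that only open PRs count as review pending or changes requested are illustrative choices, not part of the skill.

```typescript
// Fields match the `--json` list passed to `gh pr list` in the skill.
interface PullRequest {
  number: number;
  title: string;
  state: "OPEN" | "CLOSED" | "MERGED";
  createdAt: string; // ISO 8601 timestamp
  mergedAt: string | null;
  reviewDecision: "APPROVED" | "CHANGES_REQUESTED" | "REVIEW_REQUIRED" | "";
}

type Category = "opened" | "merged" | "review pending" | "changes requested";

// Returns PR numbers per category for the half-open period [start, end).
function categorizePrs(
  prs: PullRequest[],
  start: Date,
  end: Date,
): Record<Category, number[]> {
  const inPeriod = (iso: string | null): boolean => {
    if (iso === null) return false;
    const t = Date.parse(iso);
    return t >= start.getTime() && t < end.getTime();
  };
  const result: Record<Category, number[]> = {
    opened: [],
    merged: [],
    "review pending": [],
    "changes requested": [],
  };
  for (const pr of prs) {
    if (inPeriod(pr.createdAt)) result.opened.push(pr.number);
    if (inPeriod(pr.mergedAt)) result.merged.push(pr.number);
    if (pr.state === "OPEN") {
      // Assumed rule: open PRs are reported by their current review state.
      if (pr.reviewDecision === "CHANGES_REQUESTED") {
        result["changes requested"].push(pr.number);
      } else if (pr.reviewDecision !== "APPROVED") {
        result["review pending"].push(pr.number);
      }
    }
  }
  return result;
}
```

A PR can land in more than one bucket (for example, opened today and still pending review), which matches how a digest would normally report it.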
@@ -112,9 +133,9 @@ Tomorrow's focus:
  Based on notification configuration:

  **Email (default when recipients configured):**
- 1. Call `catalyst_notify_send` with subject: "Daily Digest — {date} — {project name}", body: full report
+ 1. Call `crewpilot_notify_send` with subject: "Daily Digest — {date} — {project name}", body: full report
  2. Email sent automatically via SMTP (no manual interaction needed)
- 3. Requires SMTP env vars or `catalyst_notify_configure` to be set up
+ 3. Requires SMTP env vars or `crewpilot_notify_configure` to be set up

  **Console (fallback when no recipients configured):**
  1. Just display the report in chat
@@ -160,7 +181,7 @@ When triggered with "weekly summary" or "weekly digest":
  - Do NOT include sensitive data (secrets, tokens, passwords found in code)
  - Do NOT fabricate activity — if nothing happened, say "quiet day"
  - Do NOT include full commit messages — summarize by category
- - Do NOT send to recipients not configured via catalyst_notify_configure
+ - Do NOT send to recipients not configured via crewpilot_notify_configure

  ## Chains To

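Grouping commits by type, as the digest's Git Activity step asks, could look like the sketch below. It assumes commit subjects follow the `type(scope): subject` convention seen in `fix(scope): address review findings`. `parseCommitSubject` and `groupByType` are hypothetical helper names, and subjects that do not match the convention are skipped.

```typescript
interface ParsedCommit {
  type: string;
  scope: string | null;
  subject: string;
}

// Parses "type(scope): subject"; returns null for non-conforming subjects.
function parseCommitSubject(line: string): ParsedCommit | null {
  const match = /^(\w+)(?:\(([^)]+)\))?!?:\s+(.+)$/.exec(line);
  if (match === null) return null;
  const [, type = "", scope, subject = ""] = match;
  return { type, scope: scope ?? null, subject };
}

// Buckets parsed commits by their type (feat, fix, refactor, test, docs, ...).
function groupByType(subjects: string[]): Map<string, ParsedCommit[]> {
  const groups = new Map<string, ParsedCommit[]>();
  for (const line of subjects) {
    const parsed = parseCommitSubject(line);
    if (parsed === null) continue; // skip subjects outside the convention
    const bucket = groups.get(parsed.type) ?? [];
    bucket.push(parsed);
    groups.set(parsed.type, bucket);
  }
  return groups;
}
```

The per-type counts from `groupByType` are enough for a category summary, which fits the rule above about not including full commit messages.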
@@ -84,16 +84,16 @@ If multiple logical changes are staged:
  ## Tools Required

  - `terminal` — Run git commands
- - `catalyst_git_status` — Get current state
- - `catalyst_git_diff` — Get detailed diff
- - `catalyst_git_log` — Parse commit history
- - `catalyst_git_stage` — Stage files
- - `catalyst_git_commit` — Execute commit
+ - `crewpilot_git_status` — Get current state
+ - `crewpilot_git_diff` — Get detailed diff
+ - `crewpilot_git_log` — Parse commit history
+ - `crewpilot_git_stage` — Stage files
+ - `crewpilot_git_commit` — Execute commit

  ## Output Format

  ```
- ## [Catalyst → Change Management]
+ ## [CrewPilot → Change Management]

  ### Changes Detected
  | Type | Scope | Files |
@@ -49,14 +49,14 @@ digraph deploy_guard {
  - [ ] No `TODO`/`FIXME`/`HACK` in files changed since last deploy
  - [ ] No `console.log`/`print`/debug statements in production paths
  - [ ] No commented-out code blocks
- - Run: `catalyst_metrics_complexity` on changed files — flag any high-complexity additions
+ - Run: `crewpilot_metrics_complexity` on changed files — flag any high-complexity additions

  ### Gate 2 — Test Integrity
  - [ ] All tests pass
  - [ ] Test coverage meets minimum threshold
  - [ ] No skipped tests (`.skip`, `@disabled`, `@pytest.mark.skip`)
  - [ ] No test files with zero assertions
- - Run: `catalyst_metrics_coverage` to validate
+ - Run: `crewpilot_metrics_coverage` to validate

  ### Gate 3 — Security
  - [ ] No new vulnerabilities from `vulnerability-scan`
@@ -100,14 +100,14 @@ Produce a clear GO / NO-GO / CONDITIONAL decision:

  - `terminal` — Run tests, linters, audit tools
  - `codebase` — Scan for anti-patterns, secrets, debug statements
- - `catalyst_metrics_coverage` — Coverage report
- - `catalyst_metrics_complexity` — Complexity scores
- - `catalyst_git_diff` — Changes since last deploy/tag
+ - `crewpilot_metrics_coverage` — Coverage report
+ - `crewpilot_metrics_complexity` — Complexity scores
+ - `crewpilot_git_diff` — Changes since last deploy/tag

  ## Output Format

  ```
- ## [Catalyst → Deploy Guard]
+ ## [CrewPilot → Deploy Guard]

  ### Gate Results

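One way to reduce the gate results to the single GO / CONDITIONAL / NO-GO decision is sketched below. The three-level `pass`/`warn`/`fail` status and the rule (any failure blocks, any warning makes the verdict conditional) are assumptions for illustration; the skill text does not spell out how the verdict is computed.

```typescript
type GateStatus = "pass" | "warn" | "fail"; // hypothetical status levels
type Verdict = "GO" | "CONDITIONAL" | "NO-GO";

interface GateResult {
  gate: string; // e.g. "Test Integrity" or "Security"
  status: GateStatus;
  notes: string[];
}

// Assumed rule: a failing gate blocks the deploy, a warning makes it conditional.
function decideVerdict(gates: GateResult[]): Verdict {
  if (gates.some((g) => g.status === "fail")) return "NO-GO";
  if (gates.some((g) => g.status === "warn")) return "CONDITIONAL";
  return "GO";
}

// Collects the notes a PR comment would list for a non-GO verdict.
function collectNotes(gates: GateResult[]): string[] {
  return gates
    .filter((g) => g.status !== "pass")
    .flatMap((g) => g.notes.map((note) => `${g.gate}: ${note}`));
}
```

Under this rule a CONDITIONAL verdict still proceeds with its warnings listed in the PR comment, while NO-GO loops back to fixing blockers, as the worker's deploy-guard phase describes.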
@@ -89,12 +89,12 @@ Verify minimum documentation exists:

  - `codebase` — Read source code and documentation files
  - `terminal` — Verify install steps, run examples
- - `catalyst_knowledge_search` — Check if documentation decisions were previously recorded
+ - `crewpilot_knowledge_search` — Check if documentation decisions were previously recorded

  ## Output Format

  ```
- ## [Catalyst → Doc Governance]
+ ## [CrewPilot → Doc Governance]

  ### Documentation Map
  | Doc File | Covers | Status |