@crewpilot/agent 1.0.0 → 3.0.0
- package/README.md +35 -11
- package/dist-npm/cli.js +5 -5
- package/dist-npm/index.js +171 -138
- package/package.json +2 -2
- package/prompts/agent.md +38 -22
- package/prompts/copilot-instructions.md +8 -8
- package/prompts/{catalyst.config.json → crewpilot.config.json} +1 -1
- package/prompts/skills/assure-code-quality/SKILL.md +3 -3
- package/prompts/skills/assure-pr-intelligence/SKILL.md +4 -4
- package/prompts/skills/assure-review-functional/SKILL.md +114 -0
- package/prompts/skills/assure-review-standards/SKILL.md +106 -0
- package/prompts/skills/assure-threat-model/SKILL.md +182 -0
- package/prompts/skills/assure-vulnerability-scan/SKILL.md +1 -1
- package/prompts/skills/autopilot-meeting/SKILL.md +43 -16
- package/prompts/skills/autopilot-worker/SKILL.md +177 -63
- package/prompts/skills/daily-digest/SKILL.md +35 -14
- package/prompts/skills/deliver-change-management/SKILL.md +6 -6
- package/prompts/skills/deliver-deploy-guard/SKILL.md +6 -6
- package/prompts/skills/deliver-doc-governance/SKILL.md +2 -2
- package/prompts/skills/engineer-feature-builder/SKILL.md +3 -3
- package/prompts/skills/engineer-root-cause-analysis/SKILL.md +3 -3
- package/prompts/skills/engineer-test-first/SKILL.md +2 -2
- package/prompts/skills/insights-knowledge-base/SKILL.md +32 -11
- package/prompts/skills/insights-pattern-detection/SKILL.md +5 -5
- package/prompts/skills/strategize-architecture-planner/SKILL.md +2 -2
- package/prompts/skills/strategize-solution-design/SKILL.md +2 -2
- package/scripts/postinstall.js +4 -4
|
@@ -18,30 +18,38 @@ This pipeline chains 12 skills across role boundaries (e.g. code-quality and vul
|
|
|
18
18
|
|
|
19
19
|
## Tools Required
|
|
20
20
|
|
|
21
|
-
- `
|
|
22
|
-
- `
|
|
23
|
-
- `
|
|
24
|
-
- `
|
|
25
|
-
- `
|
|
26
|
-
- `
|
|
27
|
-
- `
|
|
28
|
-
- `
|
|
29
|
-
- `
|
|
30
|
-
- `
|
|
31
|
-
- `
|
|
32
|
-
- `
|
|
33
|
-
- `
|
|
34
|
-
- `
|
|
35
|
-
- `
|
|
36
|
-
- `
|
|
37
|
-
- `
|
|
38
|
-
- `
|
|
39
|
-
- `
|
|
40
|
-
- `
|
|
41
|
-
- `
|
|
42
|
-
- `
|
|
43
|
-
- `
|
|
44
|
-
- `
|
|
21
|
+
- `crewpilot_board_connect` — connect to board provider
|
|
22
|
+
- `crewpilot_board_create` — create issue on board
|
|
23
|
+
- `crewpilot_board_move` — update issue status
|
|
24
|
+
- `crewpilot_board_comment` — log progress on the issue
|
|
25
|
+
- `crewpilot_worker_start` — start orchestrator workflow
|
|
26
|
+
- `crewpilot_worker_plan` — set execution plan
|
|
27
|
+
- `crewpilot_worker_approve` — human approval gate
|
|
28
|
+
- `crewpilot_worker_branch` — create feature branch
|
|
29
|
+
- `crewpilot_worker_pr` — push + open PR
|
|
30
|
+
- `crewpilot_worker_review_done` — record review verdict
|
|
31
|
+
- `crewpilot_worker_complete` — mark workflow done
|
|
32
|
+
- `crewpilot_worker_fail` — circuit breaker on failure
|
|
33
|
+
- `crewpilot_git_stage` — stage files
|
|
34
|
+
- `crewpilot_git_commit` — commit changes
|
|
35
|
+
- `crewpilot_exec` — run commands (tests, lint, build)
|
|
36
|
+
- `crewpilot_knowledge_store` — store decisions made during implementation
|
|
37
|
+
- `crewpilot_git_diff` — analyze changes for change-management
|
|
38
|
+
- `crewpilot_git_log` — commit history for release notes
|
|
39
|
+
- `crewpilot_metrics_coverage` — coverage check for deploy-guard
|
|
40
|
+
- `crewpilot_metrics_complexity` — complexity check for deploy-guard and pattern detection
|
|
41
|
+
- `crewpilot_worker_preview_pr` — preview changes before PR creation
|
|
42
|
+
- `crewpilot_worker_push_fixes` — push fixes to existing PR branch (no new PR)
|
|
43
|
+
- `crewpilot_board_pr_comments` — fetch review comments from a PR
|
|
44
|
+
- `crewpilot_knowledge_search` — query known patterns, anti-patterns, and past root causes
|
|
45
|
+
- `crewpilot_artifact_write` — persist phase outputs (analysis, plans, reviews) so downstream phases can read them
|
|
46
|
+
- `crewpilot_artifact_read` — read artifacts from prior phases (e.g. analysis → plan, plan → implementation)
|
|
47
|
+
- `crewpilot_artifact_list` — list all artifacts for the current workflow
|
|
48
|
+
- `crewpilot_dispatch_subagent` — delegate focused work (code review, test writing, security audit) to specialized sub-agents
|
|
49
|
+
- `crewpilot_session_save` — save session state for long-running tasks (enables resume across conversations)
|
|
50
|
+
- `crewpilot_session_restore` — restore a previously saved session to continue work
|
|
51
|
+
- `crewpilot_session_list` — list all saved sessions
|
|
52
|
+
- `mcp_workiq_ask_work_iq` — (optional, requires Work IQ extension) fetch M365 context (emails, docs, meetings) related to the task
|
|
45
53
|
|
|
46
54
|
## Methodology
|
|
47
55
|
|
|
@@ -56,6 +64,7 @@ digraph autopilot_worker {
|
|
|
56
64
|
analysis [label="Phase 2\nCodebase Analysis & Planning"];
|
|
57
65
|
design [label="Phase 2.5\nDesign & Architecture\n(label-gated)", style=dashed];
|
|
58
66
|
rca [label="Phase 2.5c\nRoot Cause Analysis\n(bug label-gated)", style=dashed];
|
|
67
|
+
threat [label="Phase 2.5d\nThreat Model\n(security label-gated)", style=dashed];
|
|
59
68
|
plan_gate [label="Phase 3\nHUMAN GATE: Plan Approval", shape=diamond, style=filled, fillcolor="#ffcccc"];
|
|
60
69
|
implement [label="Phase 4\nBranch & Implementation"];
|
|
61
70
|
change_mgmt [label="Phase 5\nChange Management"];
|
|
@@ -68,9 +77,11 @@ digraph autopilot_worker {
|
|
|
68
77
|
intake -> analysis;
|
|
69
78
|
analysis -> design [label="needs-design\nor needs-architecture"];
|
|
70
79
|
analysis -> rca [label="bug/defect/\nregression"];
|
|
80
|
+
analysis -> threat [label="needs-threat-model\nor security-sensitive"];
|
|
71
81
|
analysis -> plan_gate [label="no special labels"];
|
|
72
82
|
design -> plan_gate;
|
|
73
83
|
rca -> plan_gate;
|
|
84
|
+
threat -> plan_gate;
|
|
74
85
|
plan_gate -> implement [label="approved"];
|
|
75
86
|
plan_gate -> fail [label="cancelled"];
|
|
76
87
|
implement -> change_mgmt;
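The label-gated edges out of `analysis` in the graph above can be read as a small routing table. The following is a minimal sketch of that routing, not the package's implementation: the function names and the `LABEL_GATES` table are hypothetical, and it assumes that an issue carrying labels from several groups runs every matching optional phase before the plan gate.

```python
# Hypothetical routing table; phase names mirror the graph node names.
LABEL_GATES = {
    "design": {"needs-design", "needs-architecture"},
    "rca": {"bug", "defect", "regression"},
    "threat": {"needs-threat-model", "security-sensitive"},
}


def gated_phases(labels):
    """Return the optional phases triggered by an issue's labels, in graph order."""
    present = {label.lower() for label in labels}
    return [phase for phase, triggers in LABEL_GATES.items() if present & triggers]


def next_after_analysis(labels):
    """With no special labels, analysis goes straight to the plan gate."""
    return gated_phases(labels) or ["plan_gate"]
```

For example, an issue labelled `bug` and `security-sensitive` would pass through `rca` and `threat` under this assumption, while an unlabelled issue goes directly to `plan_gate`.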
|
|
@@ -88,15 +99,34 @@ digraph autopilot_worker {
|
|
|
88
99
|
### Phase 1 — Intake & Issue Creation
|
|
89
100
|
|
|
90
101
|
**First interaction hint:** If this is the first interaction in the session, start with:
|
|
91
|
-
> 💡 *Running
|
|
102
|
+
> 💡 *Running CrewPilot Autopilot — I'll summarize the task, confirm with you before creating a board issue, plan the work, get your approval, implement, test, review, and open a PR.*
|
|
92
103
|
|
|
93
|
-
**Entry mode detection** — the worker can be entered
|
|
104
|
+
**Entry mode detection** — the worker can be entered four ways:
|
|
94
105
|
|
|
95
106
|
| Entry Mode | How to Detect | Behavior |
|
|
96
107
|
|---|---|---|
|
|
97
108
|
| **Direct** | User says "autopilot", "full pipeline", etc. | Run full pipeline from Phase 1 |
|
|
98
109
|
| **Routed from feature-builder** | feature-builder's Phase 0 classified as moderate/complex | Skip re-analyzing complexity — it's already assessed. Use the context feature-builder gathered. |
|
|
99
110
|
| **Mid-build escalation** | feature-builder discovered more complexity during Phase 4 | Accept the partial context (files already touched, patterns found). Start from Phase 2 (planning) with what's already known. |
|
|
111
|
+
| **Session resume** | User says "resume", "continue", "pick up where I left off" | Call `crewpilot_session_restore` with the workflow ID. Read the saved state, load associated artifacts, and resume from the last pending action. |
|
|
112
|
+
|
|
113
|
+
**Session resume flow**: When resuming, the agent should:
|
|
114
|
+
1. Call `crewpilot_session_restore` to get the saved state
|
|
115
|
+
2. Call `crewpilot_artifact_list` to see what artifacts exist
|
|
116
|
+
3. Read relevant artifacts with `crewpilot_artifact_read`
|
|
117
|
+
4. **(Optional) Calendar-aware context refresh**: If `mcp_workiq_ask_work_iq` is available and significant time has passed since the session was saved (overnight, weekend, or >4 hours):
|
|
118
|
+
- Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent)
|
|
119
|
+
- **Check for new context**: `mcp_workiq_ask_work_iq` → "What meetings, emails, or Teams messages about {issue title / feature} happened since {saved_at timestamp}? Summarize any new decisions, requirement changes, or blockers."
|
|
120
|
+
- **Check calendar conflicts**: `mcp_workiq_ask_work_iq` → "Do I have any meetings in the next 2 hours that might affect my availability?"
|
|
121
|
+
- If new decisions or requirement changes are found, flag them to the user before continuing:
|
|
122
|
+
```
|
|
123
|
+
📅 Context Update (since session was saved {age} ago):
|
|
124
|
+
- {new decision / requirement change / blocker}
|
|
125
|
+
→ Continue with current plan? (yes / re-plan)
|
|
126
|
+
```
|
|
127
|
+
- If unavailable, skip — resume proceeds without M365 context refresh.
|
|
128
|
+
5. Continue from the first pending action in the saved state
|
|
129
|
+
6. Do NOT re-run phases that have already completed (check artifacts_written)
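The two decisions in the resume flow above (whether to refresh M365 context, and which phases still need to run) can be sketched as plain functions. This is an illustrative sketch only: the function names are hypothetical, "overnight" is approximated as "saved on an earlier calendar day", and the shape of `artifacts_written` (a list of phase names) is an assumption.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=4)  # threshold taken from the skill text


def needs_context_refresh(saved_at, now):
    """True when the session is older than 4 hours or was saved on an earlier day."""
    return (now - saved_at) > STALE_AFTER or saved_at.date() < now.date()


def pending_phases(all_phases, artifacts_written):
    """Phases still to run, in order, skipping those that already wrote an artifact."""
    done = set(artifacts_written)
    return [phase for phase in all_phases if phase not in done]
```

Skipping completed phases by artifact name keeps the resume idempotent: restoring the same session twice yields the same list of pending phases.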
|
|
100
130
|
|
|
101
131
|
**Complexity check (direct entry only):** If the user enters autopilot directly, quickly assess if the request warrants the full pipeline:
|
|
102
132
|
- If the request is trivial (single file, obvious change) → suggest: *"This is a small change. I can implement it directly without the full pipeline. Want me to do that instead?"*
|
|
@@ -130,18 +160,18 @@ Labels: {labels}
|
|
|
130
160
|
→ Create this task and start the pipeline? (yes / edit / no)
|
|
131
161
|
```
|
|
132
162
|
|
|
133
|
-
- If **yes** → call `
|
|
163
|
+
- If **yes** → call `crewpilot_board_create`, continue to Phase 2
|
|
134
164
|
- If **edit** → user provides corrections, update and re-present
|
|
135
165
|
- If **no** → stop the pipeline. Ask the user what they'd like to do instead.
|
|
136
166
|
- Do NOT create the board issue without explicit user confirmation.
|
|
137
167
|
</HARD-GATE>
|
|
138
168
|
|
|
139
|
-
3. Call `
|
|
169
|
+
3. Call `crewpilot_board_create` with title, description, acceptance criteria
|
|
140
170
|
4. Note the created issue ID
|
|
141
171
|
|
|
142
172
|
**If user provides an existing issue number (e.g., "#42"):**
|
|
143
173
|
|
|
144
|
-
1. Call `
|
|
174
|
+
1. Call `crewpilot_board_get` to read the existing issue
|
|
145
175
|
2. Use its title, description, and acceptance criteria as-is
|
|
146
176
|
3. No confirmation needed — the task already exists
|
|
147
177
|
|
|
@@ -153,19 +183,31 @@ Labels: {labels}
|
|
|
153
183
|
- Which files need to be **modified**
|
|
154
184
|
- What patterns/conventions the codebase follows (naming, directory structure, test style)
|
|
155
185
|
- What dependencies might be needed
|
|
156
|
-
3. Check issue labels for `needs-design`, `needs-architecture`,
|
|
157
|
-
4. **Query pattern knowledge** via `
|
|
186
|
+
3. Check issue labels for `needs-design`, `needs-architecture`, `bug`/`defect`/`regression`, and `needs-threat-model`/`security-sensitive`
|
|
187
|
+
4. **Query pattern knowledge** via `crewpilot_knowledge_search` (type: `pattern`):
|
|
158
188
|
- Search for known patterns and anti-patterns in the files being modified
|
|
159
189
|
- Search for past root causes in the same area of the codebase
|
|
160
190
|
- Collect any "repeat offender" warnings from previous runs
|
|
161
191
|
- Feed this context into the plan so the worker avoids known mistakes
|
|
162
|
-
5.
|
|
192
|
+
5. **(Optional) Fetch M365 requirements context**: First call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent), then use **focused queries** to surface requirements context before planning:
|
|
193
|
+
- **Requirements & specs**: `mcp_workiq_ask_work_iq` → "Find emails, documents, and Teams messages about: {issue title}. Summarize relevant discussions, specs, and design docs."
|
|
194
|
+
- **Meeting decisions**: `mcp_workiq_ask_work_iq` → "What decisions were made about {issue title / feature name} in recent meetings? What requirements were stated?"
|
|
195
|
+
- **Stakeholder expectations**: `mcp_workiq_ask_work_iq` → "What did stakeholders or customers say about {feature} in recent emails or meetings? What was promised or committed?"
|
|
196
|
+
- Feed the M365 context into the analysis artifact so Phase 3's plan addresses stated requirements, not just the issue description.
|
|
197
|
+
- If `mcp_workiq_ask_work_iq` is unavailable, skip — this step is optional.
|
|
198
|
+
6. Call `crewpilot_worker_start` with the issue ID and title
|
|
199
|
+
7. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="analysis"` containing:
|
|
200
|
+
- Files to create/modify
|
|
201
|
+
- Codebase patterns discovered
|
|
202
|
+
- Dependencies needed
|
|
203
|
+
- Label-gated phases to run
|
|
204
|
+
- Known patterns/anti-patterns from knowledge search
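As a rough illustration of the analysis artifact described in step 7, the payload might be assembled as below. Only `workflow_id` and `phase` come from the skill text; the `content` key, its field names, and the helper name are assumptions made for this sketch, and the actual schema accepted by `crewpilot_artifact_write` is not shown in this diff.

```python
import json


def build_analysis_artifact(issue_id, files, patterns, dependencies,
                            gated_phases, known_patterns):
    """Assemble the Phase 2 artifact; field names under 'content' are hypothetical."""
    return {
        "workflow_id": issue_id,
        "phase": "analysis",
        "content": {
            "files": files,                    # files to create/modify
            "patterns": patterns,              # codebase conventions discovered
            "dependencies": dependencies,      # dependencies needed
            "gated_phases": gated_phases,      # label-gated phases to run
            "known_patterns": known_patterns,  # results of the knowledge search
        },
    }


artifact = build_analysis_artifact(
    issue_id="42",
    files=["src/auth/login.ts"],
    patterns=["tests live next to sources"],
    dependencies=[],
    gated_phases=["threat"],
    known_patterns=[],
)
serialized = json.dumps(artifact, indent=2)
```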
|
|
163
205
|
|
|
164
206
|
### Phase 2.5 — Design & Architecture (label-gated)
|
|
165
207
|
|
|
166
208
|
**Skip this phase entirely if the issue has neither `needs-design` nor `needs-architecture` label.**
|
|
167
209
|
|
|
168
|
-
Check the issue labels (from `
|
|
210
|
+
Check the issue labels (from `crewpilot_board_get`). Run the applicable skills:
|
|
169
211
|
|
|
170
212
|
#### If issue has `needs-design` label:
|
|
171
213
|
|
|
@@ -188,7 +230,7 @@ Reversal cost: {Low/Medium/High}
|
|
|
188
230
|
```
|
|
189
231
|
|
|
190
232
|
5. **HUMAN GATE**: User picks an approach
|
|
191
|
-
6. Store the decision via `
|
|
233
|
+
6. Store the decision via `crewpilot_knowledge_store` (type: decision)
|
|
192
234
|
7. Write the design document to `docs/design/{issue_id}-{slug}.md`:
|
|
193
235
|
```markdown
|
|
194
236
|
# Design: {issue title}
|
|
@@ -211,6 +253,7 @@ Reversal cost: {Low/Medium/High}
|
|
|
211
253
|
Confidence: {N}/10 | Reversal cost: {Low/Medium/High}
|
|
212
254
|
```
|
|
213
255
|
8. Stage the design doc — it will be committed alongside the code in Phase 5
|
|
256
|
+
9. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="design"` containing the chosen approach, trade-off summary, and design document path
|
|
214
257
|
|
|
215
258
|
#### If issue has `needs-architecture` label:
|
|
216
259
|
|
|
@@ -253,6 +296,7 @@ Data Flow:
|
|
|
253
296
|
{rejected options and why}
|
|
254
297
|
```
|
|
255
298
|
9. Stage the ADR — it will be committed alongside the code in Phase 5
|
|
299
|
+
10. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="architecture"` containing the component decomposition, data flow, interfaces, and ADR path
|
|
256
300
|
|
|
257
301
|
#### If issue has BOTH labels:
|
|
258
302
|
|
|
@@ -267,8 +311,8 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
|
|
|
267
311
|
|
|
268
312
|
1. **Symptom collection**:
|
|
269
313
|
- Extract error message, stack trace, steps to reproduce from the issue description
|
|
270
|
-
- Run `
|
|
271
|
-
- Query `
|
|
314
|
+
- Run `crewpilot_git_log` on the affected files to check recent changes
|
|
315
|
+
- Query `crewpilot_knowledge_search` for previous root causes in the same area
|
|
272
316
|
2. **Hypothesis generation** — generate 2-3 ranked hypotheses:
|
|
273
317
|
|
|
274
318
|
```
|
|
@@ -282,7 +326,7 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
|
|
|
282
326
|
```
|
|
283
327
|
|
|
284
328
|
3. **Systematic elimination** — for each hypothesis (highest first):
|
|
285
|
-
- Run `
|
|
329
|
+
- Run `crewpilot_exec` to test (add logging, reproduce, check state)
|
|
286
330
|
- Record result: confirmed / eliminated / narrowed
|
|
287
331
|
- Max 5 attempts total (circuit breaker — same as Phase 4)
|
|
288
332
|
4. **Root cause identification**:
|
|
@@ -293,21 +337,56 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
|
|
|
293
337
|
- The plan must fix the root cause, not just the symptom
|
|
294
338
|
- Include a regression test that fails without the fix
|
|
295
339
|
- Phase 5 commit footer: `Root-cause: {one-sentence description}`
|
|
296
|
-
6. **Store root cause** via `
|
|
340
|
+
6. **Store root cause** via `crewpilot_knowledge_store` (type: `root-cause`):
|
|
297
341
|
- What: the root cause description
|
|
298
342
|
- Where: affected files/modules
|
|
299
343
|
- Why: the design gap
|
|
300
344
|
- Prevention: what would have caught this earlier
|
|
301
|
-
7. **
|
|
345
|
+
7. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="rca"` containing the root cause, causal chain, design gap, prevention strategy, and affected files
|
|
346
|
+
8. **If root cause reveals a systemic issue**, flag it for pattern detection in Phase 6:
|
|
302
347
|
- Add note: `systemic:{description}` for Phase 6 to pick up
|
|
303
348
|
|
|
304
|
-
|
|
349
|
+
### Phase 2.5d — Threat Modeling (label-gated)
|
|
350
|
+
|
|
351
|
+
**Skip if the issue does NOT have a `needs-threat-model` or `security-sensitive` label.**
|
|
352
|
+
|
|
353
|
+
**Load and follow** `.github/skills/assure-threat-model/SKILL.md` methodology:
|
|
354
|
+
|
|
355
|
+
1. **Read prior artifacts**: Load the `analysis` artifact (and `architecture` if it exists) to understand the system being built
|
|
356
|
+
2. **Scope the model**: Define the trust boundaries and data flows for the feature being implemented
|
|
357
|
+
3. **STRIDE analysis**: For each component and data flow crossing a trust boundary, evaluate all 6 STRIDE categories
|
|
358
|
+
4. **Risk assessment**: Score each threat (Likelihood × Impact = Risk)
|
|
359
|
+
5. **Mitigation planning**: For threats with risk ≥ 7, propose specific mitigations with effort and implementation phase
|
|
360
|
+
6. **Present to user**:
|
|
361
|
+
|
|
362
|
+
```
|
|
363
|
+
🛡️ Threat Model for: "{issue title}"
|
|
364
|
+
|
|
365
|
+
| ID | STRIDE | Component | Threat | Risk Score | Mitigation |
|
|
366
|
+
|----|--------|-----------|--------|------------|------------|
|
|
367
|
+
| T1 | ... | ... | ... | ... | ... |
|
|
368
|
+
|
|
369
|
+
Critical threats: {count}
|
|
370
|
+
Required mitigations before implementation: {list}
|
|
371
|
+
|
|
372
|
+
→ Approve threat model? (yes / edit)
|
|
373
|
+
```
|
|
305
374
|
|
|
306
|
-
|
|
375
|
+
7. **HUMAN GATE**: User approves the threat model
|
|
376
|
+
8. Store via `crewpilot_knowledge_store` (type: `threat-model`)
|
|
377
|
+
9. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="threat-model"` containing the full threat register
|
|
378
|
+
10. Feed critical/high-risk mitigations into Phase 3 plan as mandatory implementation steps
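The scoring rule in steps 4 and 5 (Risk = Likelihood × Impact, mitigate at risk ≥ 7) can be sketched as follows. The skill text does not state the scoring scale, so the 1 to 5 range used here is an assumption, as are the `Threat` class and the `needing_mitigation` helper.

```python
from dataclasses import dataclass

MITIGATION_THRESHOLD = 7  # from the skill text: mitigate threats with risk >= 7


@dataclass
class Threat:
    id: str
    stride: str
    component: str
    likelihood: int  # assumed 1-5 scale
    impact: int      # assumed 1-5 scale

    @property
    def risk(self):
        """Risk score as Likelihood x Impact."""
        return self.likelihood * self.impact


def needing_mitigation(threats):
    """Threats at or above the threshold, highest risk first."""
    flagged = [t for t in threats if t.risk >= MITIGATION_THRESHOLD]
    return sorted(flagged, key=lambda t: t.risk, reverse=True)
```

Sorting by descending risk puts the threats that must become mandatory Phase 3 plan steps at the top of the register.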
|
|
379
|
+
|
|
380
|
+
#### After design/architecture/RCA/threat-model phases:
|
|
381
|
+
|
|
382
|
+
The design documents, RCA findings, and threat model inform the implementation plan. Phase 3's plan should reference:
|
|
307
383
|
- Which approach was chosen (from design doc)
|
|
308
384
|
- Which components to build (from architecture)
|
|
309
385
|
- Which interfaces to implement (from ADR)
|
|
310
386
|
- What root cause was found (from RCA) and what fix addresses it
|
|
387
|
+
- What threats were identified (from threat model) and what mitigations are required
|
|
388
|
+
|
|
389
|
+
**Read prior artifacts**: Call `crewpilot_artifact_read` to load the `analysis`, `design`, `architecture`, `rca`, and/or `threat-model` artifacts. These contain the full context from earlier phases — do not rely on chat history alone.
|
|
311
390
|
|
|
312
391
|
### Phase 3 — HUMAN GATE: Plan Approval
|
|
313
392
|
|
|
@@ -340,18 +419,24 @@ Complexity: {trivial|simple|moderate|complex}
|
|
|
340
419
|
Approve? (yes / edit / cancel)
|
|
341
420
|
```
|
|
342
421
|
|
|
343
|
-
- If **yes** → call `
|
|
422
|
+
- If **yes** → call `crewpilot_worker_approve`, continue to Phase 4
|
|
344
423
|
- If **edit** → user provides changes, update plan, re-present
|
|
345
|
-
- If **cancel** → call `
|
|
424
|
+
- If **cancel** → call `crewpilot_worker_fail`, stop
|
|
425
|
+
|
|
426
|
+
**Write artifact**: After approval, call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="plan"` containing the approved plan (steps, files, complexity).
|
|
427
|
+
|
|
428
|
+
**Session checkpoint**: After plan approval, call `crewpilot_session_save` with status="checkpoint", phase="phase-3-approved", and the current context. This ensures the approved plan can be resumed if the session is interrupted.
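The plan gate above amounts to a three-way dispatch on the user's answer. A minimal sketch, assuming the answer arrives as free text and that the tool calls are issued in the order the skill lists them; the function name is hypothetical and the returned strings are only the tool names, not real invocations.

```python
def plan_gate_actions(response):
    """Map the user's answer at the Phase 3 gate to the tool calls that follow."""
    answer = response.strip().lower()
    if answer == "yes":
        # Approve, persist the approved plan, then checkpoint the session.
        return [
            "crewpilot_worker_approve",
            "crewpilot_artifact_write",
            "crewpilot_session_save",
        ]
    if answer == "cancel":
        return ["crewpilot_worker_fail"]
    if answer == "edit":
        return []  # no tool call: update the plan and re-present it
    raise ValueError(f"unrecognized answer: {response!r}")
```

Raising on anything other than the three expected answers keeps the gate strict, so an ambiguous reply never counts as approval.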
|
|
346
429
|
|
|
347
430
|
### Phase 4 — Branch & Implementation
|
|
348
431
|
|
|
349
|
-
|
|
350
|
-
|
|
432
|
+
**Read prior artifacts**: Call `crewpilot_artifact_read` for `plan` (and `analysis`, `design`, `architecture`, `rca` if they exist) to load the full execution context.
|
|
433
|
+
|
|
434
|
+
1. Call `crewpilot_worker_branch` to create feature branch
|
|
435
|
+
2. Call `crewpilot_board_move` to set issue status to "in-progress"
|
|
351
436
|
3. **For each step in the plan:**
|
|
352
437
|
a. Implement the code change (create/modify files)
|
|
353
438
|
b. Follow existing codebase patterns discovered in Phase 2
|
|
354
|
-
c. After each logical unit, run `
|
|
439
|
+
c. After each logical unit, run `crewpilot_exec("npm test")` or equivalent to verify nothing is broken
|
|
355
440
|
d. If tests fail, diagnose and fix (max 3 attempts per step — circuit breaker)
|
|
356
441
|
4. Write tests for new code:
|
|
357
442
|
- Match existing test framework and conventions
|
|
@@ -359,8 +444,8 @@ Approve? (yes / edit / cancel)
|
|
|
359
444
|
- Run tests to confirm they pass
|
|
360
445
|
|
|
361
446
|
**Circuit breaker:** If any step fails 3 times consecutively:
|
|
362
|
-
- Call `
|
|
363
|
-
- Call `
|
|
447
|
+
- Call `crewpilot_board_comment` with details of the failure
|
|
448
|
+
- Call `crewpilot_worker_fail` with reason
|
|
364
449
|
- Tell the user what went wrong and which step is stuck
|
|
365
450
|
- STOP. Do not continue.
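The circuit breaker described above can be illustrated with a small retry loop. This is a sketch under stated assumptions: `attempt_fn` stands in for "implement, then run the tests" and returns `True` on success, and `CircuitOpen` and `run_step` are hypothetical names, not part of the package.

```python
MAX_ATTEMPTS = 3  # from the skill text: max 3 attempts per step


class CircuitOpen(Exception):
    """Raised when a step has failed the maximum number of consecutive times."""


def run_step(step_name, attempt_fn, max_attempts=MAX_ATTEMPTS):
    """Call attempt_fn until it succeeds; return the attempt number that passed.

    After max_attempts consecutive failures, raise CircuitOpen so the caller
    can comment on the board, mark the workflow failed, and stop.
    """
    for attempt in range(1, max_attempts + 1):
        if attempt_fn():
            return attempt
    raise CircuitOpen(f"step '{step_name}' failed {max_attempts} times; stopping")
```

The caller would catch `CircuitOpen` at the pipeline level and perform the `crewpilot_board_comment` and `crewpilot_worker_fail` calls listed above before stopping.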
|
|
366
451
|
|
|
@@ -368,10 +453,10 @@ Approve? (yes / edit / cancel)
|
|
|
368
453
|
|
|
369
454
|
**Load and follow** `.github/skills/deliver-change-management/SKILL.md` methodology:
|
|
370
455
|
|
|
371
|
-
1. Run `
|
|
456
|
+
1. Run `crewpilot_git_diff` to analyze all changes
|
|
372
457
|
2. Categorize changes by type: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
|
|
373
458
|
3. **If changes span multiple logical units** (e.g., new feature + test + config):
|
|
374
|
-
- Split into separate commits with `
|
|
459
|
+
- Split into separate commits with `crewpilot_git_stage` per group
|
|
375
460
|
- Each commit gets its own conventional message
|
|
376
461
|
- Example:
|
|
377
462
|
```
|
|
@@ -388,7 +473,8 @@ Approve? (yes / edit / cancel)
|
|
|
388
473
|
- Format: `feat(scope): description (closes #ID)`
|
|
389
474
|
- Body: what was implemented and why
|
|
390
475
|
- Footer: `Closes #ID`
|
|
391
|
-
5. Call `
|
|
476
|
+
5. Call `crewpilot_git_stage` and `crewpilot_git_commit` for each logical commit
|
|
477
|
+
6. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="change-mgmt"` containing the list of commits created (hash, type, scope, message)
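The commit format in this phase (subject `feat(scope): description (closes #ID)`, a body, and a `Closes #ID` footer) can be sketched as a pair of formatting helpers. The helper names are hypothetical; the allowed types are the six listed in step 2.

```python
ALLOWED_TYPES = ("feat", "fix", "refactor", "test", "docs", "chore")


def format_subject(commit_type, scope, description, issue_id):
    """Build the subject line in the form 'type(scope): description (closes #ID)'."""
    if commit_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown commit type: {commit_type}")
    return f"{commit_type}({scope}): {description} (closes #{issue_id})"


def format_commit(commit_type, scope, description, body, issue_id):
    """Full message: subject, blank line, body, blank line, footer."""
    subject = format_subject(commit_type, scope, description, issue_id)
    return f"{subject}\n\n{body}\n\nCloses #{issue_id}"
```

When changes are split into several commits, each group would call `format_commit` with its own type and scope.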
|
|
392
478
|
|
|
393
479
|
### Phase 5b — Doc Governance (Deliver Skill #2)
|
|
394
480
|
|
|
@@ -413,7 +499,7 @@ Approve? (yes / edit / cancel)
|
|
|
413
499
|
|
|
414
500
|
### Phase 6 — PR Creation & Auto-Review
|
|
415
501
|
|
|
416
|
-
1. Call `
|
|
502
|
+
1. Call `crewpilot_worker_preview_pr` with:
|
|
417
503
|
- Title: primary commit message
|
|
418
504
|
- Body: markdown with sections:
|
|
419
505
|
- **What**: summary of changes
|
|
@@ -426,7 +512,7 @@ Approve? (yes / edit / cancel)
|
|
|
426
512
|
2. **HUMAN GATE**: User reviews the preview — do NOT create the PR until the user approves.
|
|
427
513
|
If the user requests changes, apply them and re-preview. Never skip this gate.
|
|
428
514
|
</HARD-GATE>
|
|
429
|
-
3. Call `
|
|
515
|
+
3. Call `crewpilot_worker_pr` to create the PR
|
|
430
516
|
4. **Run PR Intelligence** (read `.github/skills/assure-pr-intelligence/SKILL.md`):
|
|
431
517
|
- **Change inventory**: categorize changed files (core, api, test, config, docs)
|
|
432
518
|
- **Risk assessment**: evaluate scope, complexity, blast radius, test coverage, reversibility → Low/Medium/High/Critical risk score
|
|
@@ -434,7 +520,15 @@ Approve? (yes / edit / cancel)
|
|
|
434
520
|
- **Merge readiness checklist**: tests pass, security clean, breaking changes documented, PR description matches changes
|
|
435
521
|
- Post the full PR Intelligence report as a **comment on the PR** so the assigned reviewer sees it immediately
|
|
436
522
|
5. Read the diff of the PR
|
|
437
|
-
6.
|
|
523
|
+
6. **Subagent delegation (recommended for moderate/complex changes):** Use `crewpilot_dispatch_subagent` to delegate review work in parallel:
|
|
524
|
+
- Delegate `code-reviewer` role with the diff and file list — receives correctness, security, and performance findings
|
|
525
|
+
- Delegate `standards-reviewer` role with the diff and codebase conventions — receives standards compliance findings
|
|
526
|
+
- Delegate `security-auditor` role with source files and architecture context — receives STRIDE/OWASP findings
|
|
527
|
+
- Each subagent writes its output as an artifact (e.g. `review-functional`, `review-standards`) for traceability
|
|
528
|
+
- Merge subagent findings using `crewpilot_dispatch_consensus` to identify high-confidence vs disputed issues
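How `crewpilot_dispatch_consensus` separates high-confidence from disputed findings is not shown in this diff. One plausible reading, sketched here purely as an assumption, is a quorum rule: a finding reported by at least two reviewer roles counts as high-confidence, and anything reported by a single role is disputed. The `merge_findings` function and its `quorum` parameter are hypothetical.

```python
from collections import defaultdict


def merge_findings(findings_by_role, quorum=2):
    """Split findings into (high_confidence, disputed) by how many roles reported them.

    This quorum rule is an assumed interpretation, not the tool's documented behavior.
    """
    reporters = defaultdict(set)
    for role, findings in findings_by_role.items():
        for finding in findings:
            reporters[finding].add(role)
    high = sorted(f for f, roles in reporters.items() if len(roles) >= quorum)
    disputed = sorted(f for f, roles in reporters.items() if len(roles) < quorum)
    return high, disputed
```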
|
|
529
|
+
|
|
530
|
+
**Fallback (simple changes):** Run reviews inline without subagent delegation:
|
|
531
|
+
7. Run **code-quality** review internally (read `.github/skills/assure-code-quality/SKILL.md`):
|
|
438
532
|
- Correctness: does the code do what the acceptance criteria say?
|
|
439
533
|
- Security: any obvious vulnerabilities (SQL injection, XSS, secrets)?
|
|
440
534
|
- Performance: any N+1 queries, await-in-loops, unnecessary re-renders?
|
|
@@ -442,7 +536,22 @@ Approve? (yes / edit / cancel)
|
|
|
442
536
|
7. Run **vulnerability-scan** internally (read `.github/skills/assure-vulnerability-scan/SKILL.md`):
|
|
443
537
|
- OWASP Top 10 quick check on new code
|
|
444
538
|
- Dependency audit: `npm audit` or `pip audit`
|
|
445
|
-
8. Run `
|
|
539
|
+
8. Run `crewpilot_exec("npm run lint")` and `crewpilot_exec("npm run typecheck")` if available
|
|
540
|
+
8b. **(Optional) Requirements alignment validation**: If M365 context was fetched in Phase 2, validate the implementation against meeting-stated requirements:
|
|
541
|
+
- Read the `analysis` artifact to retrieve the M365 requirements context captured earlier
|
|
542
|
+
- If the analysis artifact contains meeting decisions or stakeholder expectations, call `mcp_workiq_ask_work_iq` → "What specific requirements and acceptance criteria were stated for {feature} in meetings and emails?"
|
|
543
|
+
- Cross-reference each stated requirement against the implementation diff:
|
|
544
|
+
- **Covered**: the requirement is addressed by the code changes ✓
|
|
545
|
+
- **Partial**: the requirement is partially addressed — flag what's missing
|
|
546
|
+
- **Missing**: the requirement is not addressed at all — flag as a review finding
|
|
547
|
+
- Include requirements alignment in the PR comment:
|
|
548
|
+
```
|
|
549
|
+
📋 Requirements Alignment:
|
|
550
|
+
Meeting requirements checked: {N}
|
|
551
|
+
Covered: {count} ✓ | Partial: {count} ⚠️ | Missing: {count} ❌
|
|
552
|
+
{list any partial/missing items}
|
|
553
|
+
```
|
|
554
|
+
- If critical requirements are missing, flag as a review issue that must be addressed before merge
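The requirements alignment block above can be produced from a simple tally. A minimal sketch, assuming each requirement has already been classified as `covered`, `partial`, or `missing`; the function name and the input shape (a list of text and status pairs) are hypothetical.

```python
from collections import Counter


def alignment_summary(requirements):
    """Render the PR-comment block from (requirement_text, status) pairs."""
    counts = Counter(status for _, status in requirements)
    lines = [
        "📋 Requirements Alignment:",
        f"Meeting requirements checked: {len(requirements)}",
        f"Covered: {counts['covered']} ✓ | Partial: {counts['partial']} ⚠️ "
        f"| Missing: {counts['missing']} ❌",
    ]
    # List only the items that still need attention.
    lines += [f"- [{status}] {text}" for text, status in requirements
              if status != "covered"]
    return "\n".join(lines)
```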
|
|
446
555
|
9. **Run diff-scoped pattern detection** (read `.github/skills/insights-pattern-detection/SKILL.md`):
|
|
447
556
|
- Scope: only scan files changed in the diff (NOT full codebase)
|
|
448
557
|
- Check for **consistency** with existing codebase patterns:
|
|
@@ -456,14 +565,14 @@ Approve? (yes / edit / cancel)
|
|
|
456
565
|
- Shotgun surgery (small change touching too many files)
|
|
457
566
|
- Primitive obsession (strings/numbers where domain types belong)
|
|
458
567
|
- **Query knowledge base for repeat offenses**:
|
|
459
|
-
- `
|
|
568
|
+
- `crewpilot_knowledge_search` type: `pattern` — "has this same anti-pattern been flagged before?"
|
|
460
569
|
- If a repeat offense is found, flag prominently:
|
|
461
570
|
```
|
|
462
571
|
⚠️ Recurring Pattern Issue: {description}
|
|
463
572
|
Previously flagged in: {previous context}
|
|
464
573
|
Suggestion: Consider a structural fix.
|
|
465
574
|
```
|
|
466
|
-
- Run `
|
|
575
|
+
- Run `crewpilot_metrics_complexity` on changed files — flag any function with complexity > threshold
|
|
467
576
|
- Include pattern findings in the PR comment:
|
|
468
577
|
```
|
|
469
578
|
🔎 Pattern Detection Results:
|
|
@@ -477,9 +586,10 @@ Approve? (yes / edit / cancel)
|
|
|
477
586
|
- Re-commit: `fix(scope): address review findings`
|
|
478
587
|
- Re-push
|
|
479
588
|
- Re-run pattern detection on the fix to confirm resolution
|
|
480
|
-
11. Call `
|
|
481
|
-
12. Call `
|
|
482
|
-
|
|
589
|
+
11. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="review-merged"` containing the combined review results (code-quality, vulnerability-scan, pattern detection findings, and fix iterations)
|
|
590
|
+
12. Call `crewpilot_worker_review_done` with verdict: "approved" and summary
|
|
591
|
+
12. Call `crewpilot_board_move` to set issue status to "in-review"
|
|
592
|
+
13. Call `crewpilot_board_comment`: "PR #{pr_number} opened. Ready for review."
|
|
483
593
|
|
|
484
594
|
### Phase 7 — Deploy Guard (Deliver Skill #3)
|
|
485
595
|
|
|
@@ -512,10 +622,12 @@ Produce a verdict and include in the PR comment:
|
|
|
512
622
|
- If **CONDITIONAL** → list warnings in PR comment, proceed (human decides)
|
|
513
623
|
- If **NO-GO** → fix blockers, re-run until GO or escalate to user
|
|
514
624
|
|
|
625
|
+
**Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="deploy-guard"` containing the full 6-gate results and verdict.
|
|
626
|
+
|
|
515
627
|
### Phase 8 — Completion & Learning
|
|
516
628
|
|
|
517
|
-
1. Call `
|
|
518
|
-
2. **Store knowledge** via `
|
|
629
|
+
1. Call `crewpilot_board_comment` with deploy guard results: "All checks passed. Ready to merge."
|
|
630
|
+
2. **Store knowledge** via `crewpilot_knowledge_store`:
|
|
519
631
|
- Decisions made during implementation (type: `decision`)
|
|
520
632
|
- Root cause findings, if this was a bug fix (type: `root-cause`)
|
|
521
633
|
- **Pattern findings** from Phase 6 (type: `pattern`):
|
|
@@ -557,7 +669,8 @@ Repeat Issues: {none | {count} recurring patterns detected}
|
|
|
557
669
|
→ Merge when ready. Board will auto-update on close.
|
|
558
670
|
```
|
|
559
671
|
|
|
560
|
-
4. Call `
|
|
672
|
+
4. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="completion"` containing the final summary (PR number, branch, commits, review/deploy-guard results, knowledge stored)
|
|
673
|
+
5. Call `crewpilot_worker_complete`
|
|
561
674
|
|
|
562
675
|
### Capability Hints (on completion)
|
|
563
676
|
|
|
@@ -613,6 +726,7 @@ Every step in the Phase 3 plan and every file produced in Phase 4 must contain r
|
|
|
613
726
|
- `solution-design` — Phase 2.5: generate solution design doc when `needs-design` label detected
|
|
614
727
|
- `architecture-planner` — Phase 2.5: generate ADR when `needs-architecture` label detected
|
|
615
728
|
- `root-cause-analysis` — Phase 2.5c: systematic RCA when `bug`/`defect`/`regression` label detected
|
|
729
|
+
- `threat-model` — Phase 2.5d: STRIDE threat modeling when `needs-threat-model`/`security-sensitive` label detected
|
|
616
730
|
- `change-management` — Phase 5: proper conventional commits with multi-commit splitting
|
|
617
731
|
- `doc-governance` — Phase 5b: auto-detect and fix documentation drift
|
|
618
732
|
- `pr-intelligence` — Phase 6: risk assessment + reviewer guidance posted on PR
|
|
@@ -12,12 +12,15 @@ Generate a comprehensive daily/weekly work summary by aggregating git activity,
|
|
|
12
12
|
|
|
13
13
|
## Tools Required
|
|
14
14
|
|
|
15
|
-
- `
|
|
16
|
-
- `
|
|
17
|
-
- `
|
|
18
|
-
- `
|
|
19
|
-
- `
|
|
20
|
-
- `
|
|
15
|
+
- `crewpilot_git_log` — get commits for the time period
|
|
16
|
+
- `crewpilot_board_my_items` — get board items (opened, closed, in-progress)
|
|
17
|
+
- `crewpilot_worker_dashboard` — workflow completions and stats
|
|
18
|
+
- `crewpilot_knowledge_timeline` — decisions made in the period
|
|
19
|
+
- `crewpilot_exec` — run git/gh commands for additional data
|
|
20
|
+
- `crewpilot_notify_send` — deliver the report via email
|
|
21
|
+
- `mcp_workiq_accept_eula` — (optional) accept Work IQ EULA before first query
|
|
22
|
+
- `mcp_workiq_ask_work_iq` — (optional, requires Work IQ extension) fetch M365 activity (emails, meetings, docs, Teams) for a full work-surface report
|
|
23
|
+
- `crewpilot_artifact_write` — persist the digest as an artifact
|
|
21
24
|
|
|
22
25
|
## Methodology
|
|
23
26
|
|
|
@@ -42,26 +45,44 @@ digraph daily_digest {
|
|
|
42
45
|
Gather from all sources for the requested time period (default: today):
|
|
43
46
|
|
|
44
47
|
**Git Activity:**
|
|
45
|
-
1. Call `
|
|
48
|
+
1. Call `crewpilot_git_log` with `--since="today 00:00"` (or requested range)
|
|
46
49
|
2. Extract: commit count, files changed, insertions/deletions, branches touched
|
|
47
50
|
3. Group commits by scope/type (feat, fix, refactor, test, docs)
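As a rough illustration of step 3, commit subjects could be grouped by their type prefix. This is a minimal sketch, not the skill's implementation: `group_commits_by_type` and the sample subjects are hypothetical, and it assumes subjects follow the Conventional Commits `type(scope): description` shape. Anything that does not match falls into an `other` bucket.

```python
import re
from collections import defaultdict

# Matches "type(scope)!: description"; the scope and "!" are optional.
CONVENTIONAL = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?!?:\s+(?P<desc>.+)$"
)


def group_commits_by_type(subjects):
    """Group commit subject lines by conventional-commit type (hypothetical helper)."""
    groups = defaultdict(list)
    for subject in subjects:
        subject = subject.strip()
        match = CONVENTIONAL.match(subject)
        key = match.group("type") if match else "other"
        groups[key].append(subject)
    return dict(groups)


# Illustrative input only; real subjects would come from the git log output.
sample = [
    "feat(board): add swimlane filter",
    "fix: handle empty digest",
    "docs(readme): update install steps",
    "Merge branch 'main'",
]
grouped = group_commits_by_type(sample)
```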
|
|
48
51
|
|
|
49
52
|
**Board Activity:**
|
|
50
|
-
1. Call `
|
|
53
|
+
1. Call `crewpilot_exec` with `gh issue list --author=@me --state=all --json number,title,state,updatedAt,labels`
|
|
51
54
|
2. Filter to items updated in the time period
|
|
52
55
|
3. Categorize: created, moved to in-progress, closed/done, commented on
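Step 2 (filtering to the time period) might look like the sketch below. It assumes the `gh issue list --json` output is a JSON array whose `updatedAt` values are ISO 8601 timestamps ending in `Z`. The helper names and sample records are hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp; 'Z' is rewritten for Python versions before 3.11."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_updated_in_period(items, start, end):
    """Keep items whose updatedAt falls in the half-open interval [start, end)."""
    return [i for i in items if start <= parse_timestamp(i["updatedAt"]) < end]


# Illustrative payload only; real data would come from the gh command above.
raw = json.dumps([
    {"number": 1, "title": "Fix login", "state": "CLOSED",
     "updatedAt": "2024-05-01T09:30:00Z", "labels": []},
    {"number": 2, "title": "Old idea", "state": "OPEN",
     "updatedAt": "2024-04-20T17:00:00Z", "labels": []},
])
start = datetime(2024, 5, 1, tzinfo=timezone.utc)
end = datetime(2024, 5, 2, tzinfo=timezone.utc)
in_period = filter_updated_in_period(json.loads(raw), start, end)
```

The half-open interval means consecutive daily digests never count the same update twice.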
|
|
53
56
|
|
|
54
57
|
**PR Activity:**
|
|
55
|
-
1. Call `
|
|
58
|
+
1. Call `crewpilot_exec` with `gh pr list --author=@me --state=all --json number,title,state,createdAt,mergedAt,reviewDecision`
|
|
56
59
|
2. Filter to time period
|
|
57
60
|
3. Categorize: opened, merged, review pending, changes requested
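The four PR categories in step 3 could be derived from the requested JSON fields roughly as follows. This is a sketch under assumptions: `categorize_pr` is a hypothetical helper, and the enum strings (`MERGED`, `CHANGES_REQUESTED`, `REVIEW_REQUIRED`) are assumed to match what `gh` reports for `state` and `reviewDecision`.

```python
def categorize_pr(pr):
    """Map one PR record to a digest category (hypothetical helper)."""
    if pr.get("mergedAt") or pr.get("state") == "MERGED":
        return "merged"
    decision = pr.get("reviewDecision") or ""
    if decision == "CHANGES_REQUESTED":
        return "changes requested"
    if decision == "REVIEW_REQUIRED":
        return "review pending"
    return "opened"


# Illustrative records only.
prs = [
    {"number": 10, "state": "MERGED", "mergedAt": "2024-05-01T10:00:00Z", "reviewDecision": "APPROVED"},
    {"number": 11, "state": "OPEN", "mergedAt": None, "reviewDecision": "CHANGES_REQUESTED"},
    {"number": 12, "state": "OPEN", "mergedAt": None, "reviewDecision": "REVIEW_REQUIRED"},
    {"number": 13, "state": "OPEN", "mergedAt": None, "reviewDecision": ""},
]
categories = {pr["number"]: categorize_pr(pr) for pr in prs}
```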
|
|
58
61
|
|
|
59
62
|
**Workflow Activity:**
|
|
60
|
-
1. Call `
|
|
63
|
+
1. Call `crewpilot_worker_dashboard` for digital worker stats
|
|
61
64
|
2. Filter completed/failed workflows in the period
|
|
62
65
|
|
|
63
66
|
**Knowledge:**
|
|
64
|
-
1. Call `
|
|
67
|
+
1. Call `crewpilot_knowledge_timeline` for decisions and lessons stored today
|
|
68
|
+
|
|
69
|
+
**M365 Activity (optional — requires Work IQ MCP server):**
|
|
70
|
+
1. Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent — safe to call every time)
|
|
71
|
+
2. Use **multiple focused queries** for comprehensive coverage (targeted queries return better results than one broad question):
|
|
72
|
+
- **Emails**: `mcp_workiq_ask_work_iq` → "What emails did I send and receive on {date}? Summarize key threads and any action items."
|
|
73
|
+
- **Meetings**: `mcp_workiq_ask_work_iq` → "What meetings did I attend on {date}? What decisions were made and what action items were assigned to me?"
|
|
74
|
+
- **Documents**: `mcp_workiq_ask_work_iq` → "What documents did I edit or view in SharePoint and OneDrive on {date}?"
|
|
75
|
+
- **Teams**: `mcp_workiq_ask_work_iq` → "What Teams channel messages and chats was I active in on {date}? What mentions did I receive?"
|
|
76
|
+
- **Tasks**: `mcp_workiq_ask_work_iq` → "What Planner or To-Do tasks did I complete or get assigned on {date}?"
|
|
77
|
+
3. If Work IQ is available, parse all responses and include the full work surface:
|
|
78
|
+
- **Emails**: sent/received count, key threads, action items from emails
|
|
79
|
+
- **Meetings**: attended meetings, decisions made, action items assigned, linked documents
|
|
80
|
+
- **Documents**: files edited/viewed in SharePoint/OneDrive, co-authoring activity
|
|
81
|
+
- **Teams**: active channel conversations, 1:1 chats, mentions, and responses
|
|
82
|
+
- **Tasks**: Planner/To-Do items completed, created, or updated
|
|
83
|
+
4. If `mcp_workiq_ask_work_iq` is unavailable or errors, skip this section — the digest works without it (git + board + PRs is the baseline)
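The optional nature of this section (step 4) can be sketched as a guarded gather step. Everything here is illustrative: `gather_m365_section` is a hypothetical helper, `ask_work_iq` stands in for however the agent invokes the `mcp_workiq_ask_work_iq` tool, and the prompts are abbreviated versions of the ones listed above.

```python
# Abbreviated versions of the five focused queries; {date} is filled in per run.
QUERIES = {
    "emails": "What emails did I send and receive on {date}?",
    "meetings": "What meetings did I attend on {date}?",
    "documents": "What documents did I edit or view on {date}?",
    "teams": "What Teams messages and chats was I active in on {date}?",
    "tasks": "What Planner or To-Do tasks did I complete on {date}?",
}


def gather_m365_section(ask_work_iq, date):
    """Return per-category answers, or None so the digest can skip the section."""
    if ask_work_iq is None:  # tool not installed or not exposed
        return None
    results = {}
    try:
        for name, question in QUERIES.items():
            results[name] = ask_work_iq(question.format(date=date))
    except Exception:
        # Any tool error drops the whole section; git + board + PRs remain.
        return None
    return results
```

Returning `None` rather than a partial result keeps the report from presenting an incomplete work surface as if it were complete. That is a choice made for this sketch, not something the source prescribes.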
|
|
84
|
+
|
|
85
|
+
> **Query budget**: Work IQ queries have a ~30/session budget. The 5 queries above are a reasonable investment for a full daily digest. For weekly summaries, combine into broader date-range queries to conserve budget.
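The arithmetic behind this note, taking the approximate figure of 30 queries per session at face value (the variable names are illustrative):

```python
SESSION_BUDGET = 30      # approximate per-session budget quoted in the note
QUERIES_PER_DIGEST = 5   # emails, meetings, documents, Teams, tasks

# Daily digests that fit in one session.
daily_digests_per_session = SESSION_BUDGET // QUERIES_PER_DIGEST

# A weekly summary built from per-day queries would overshoot the budget...
weekly_cost_per_day_queries = QUERIES_PER_DIGEST * 7

# ...whereas one date-range query per category costs the same as a single day.
weekly_cost_combined = QUERIES_PER_DIGEST

over_budget = weekly_cost_per_day_queries > SESSION_BUDGET
```

With these numbers, six daily digests fit in a session. A week of per-day queries would need 35 calls, so the combined date-range form is the one that stays within budget.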
|
|
65
86
|
|
|
66
87
|
### Phase 2 — Report Generation
|
|
67
88
|
|
|
@@ -112,9 +133,9 @@ Tomorrow's focus:
|
|
|
112
133
|
Based on notification configuration:
|
|
113
134
|
|
|
114
135
|
**Email (default when recipients configured):**
|
|
115
|
-
1. Call `
|
|
136
|
+
1. Call `crewpilot_notify_send` with subject: "Daily Digest — {date} — {project name}", body: full report
|
|
116
137
|
2. Email sent automatically via SMTP (no manual interaction needed)
|
|
117
|
-
3. Requires SMTP env vars or `
|
|
138
|
+
3. Requires SMTP env vars or `crewpilot_notify_configure` to be set up
|
|
118
139
|
|
|
119
140
|
**Console (fallback when no recipients configured):**
|
|
120
141
|
1. Just display the report in chat
|
|
@@ -160,7 +181,7 @@ When triggered with "weekly summary" or "weekly digest":
|
|
|
160
181
|
- Do NOT include sensitive data (secrets, tokens, passwords found in code)
|
|
161
182
|
- Do NOT fabricate activity — if nothing happened, say "quiet day"
|
|
162
183
|
- Do NOT include full commit messages — summarize by category
|
|
163
|
-
- Do NOT send to recipients not configured via
|
|
184
|
+
- Do NOT send to recipients not configured via crewpilot_notify_configure
|
|
164
185
|
|
|
165
186
|
## Chains To
|
|
166
187
|
|
|
@@ -84,16 +84,16 @@ If multiple logical changes are staged:
|
|
|
84
84
|
## Tools Required
|
|
85
85
|
|
|
86
86
|
- `terminal` — Run git commands
|
|
87
|
-
- `
|
|
88
|
-
- `
|
|
89
|
-
- `
|
|
90
|
-
- `
|
|
91
|
-
- `
|
|
87
|
+
- `crewpilot_git_status` — Get current state
|
|
88
|
+
- `crewpilot_git_diff` — Get detailed diff
|
|
89
|
+
- `crewpilot_git_log` — Parse commit history
|
|
90
|
+
- `crewpilot_git_stage` — Stage files
|
|
91
|
+
- `crewpilot_git_commit` — Execute commit
|
|
92
92
|
|
|
93
93
|
## Output Format
|
|
94
94
|
|
|
95
95
|
```
|
|
96
|
-
## [
|
|
96
|
+
## [CrewPilot → Change Management]
|
|
97
97
|
|
|
98
98
|
### Changes Detected
|
|
99
99
|
| Type | Scope | Files |
|
|
@@ -49,14 +49,14 @@ digraph deploy_guard {
|
|
|
49
49
|
- [ ] No `TODO`/`FIXME`/`HACK` in files changed since last deploy
|
|
50
50
|
- [ ] No `console.log`/`print`/debug statements in production paths
|
|
51
51
|
- [ ] No commented-out code blocks
|
|
52
|
-
- Run: `
|
|
52
|
+
- Run: `crewpilot_metrics_complexity` on changed files — flag any high-complexity additions
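One way to approximate the first two checklist items is to scan only the lines a diff adds. This is a simplification and an assumption: the checklist speaks of files changed since the last deploy, which is broader than added lines. `flag_added_lines` is a hypothetical helper, and the pattern covers only the JavaScript `console.log(` form of debug output.

```python
import re

# Leftover markers and one common debug call; extend per language as needed.
MARKER = re.compile(r"\b(?:TODO|FIXME|HACK)\b|console\.log\(")


def flag_added_lines(diff_text):
    """Return added lines of a unified diff that contain a marker."""
    flagged = []
    for line in diff_text.splitlines():
        # "+" marks an added line; "+++" is the new-file header, not content.
        if line.startswith("+") and not line.startswith("+++"):
            if MARKER.search(line[1:]):
                flagged.append(line[1:].strip())
    return flagged


# Illustrative diff only.
diff_text = """\
--- a/src/app.js
+++ b/src/app.js
@@ -1,3 +1,5 @@
 function run() {
+  console.log("debug");
+  // TODO: remove before release
+  return compute();
 }
"""
flagged = flag_added_lines(diff_text)
```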
|
|
53
53
|
|
|
54
54
|
### Gate 2 — Test Integrity
|
|
55
55
|
- [ ] All tests pass
|
|
56
56
|
- [ ] Test coverage meets minimum threshold
|
|
57
57
|
- [ ] No skipped tests (`.skip`, `@disabled`, `@pytest.mark.skip`)
|
|
58
58
|
- [ ] No test files with zero assertions
|
|
59
|
-
- Run: `
|
|
59
|
+
- Run: `crewpilot_metrics_coverage` to validate
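The skipped-test check in this gate could be sketched as a line scan for the three markers named in the checklist. `find_skipped_tests` and the sample files are hypothetical. Matching is case-insensitive so that JUnit's `@Disabled` spelling is caught as well, which is an assumption beyond the source.

```python
import re

# Markers from the checklist; \b keeps "skipif" and similar names from matching.
SKIP_MARKERS = re.compile(
    r"\.skip\b|@disabled\b|@pytest\.mark\.skip\b", re.IGNORECASE
)


def find_skipped_tests(files):
    """files maps path -> source text; returns (path, line number, line) hits."""
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if SKIP_MARKERS.search(line):
                hits.append((path, lineno, line.strip()))
    return hits


# Illustrative inputs only.
files = {
    "a.test.js": "it('works', () => {});\nit.skip('later', () => {});\n",
    "test_b.py": "import pytest\n@pytest.mark.skip(reason='flaky')\ndef test_x():\n    assert True\n",
    "test_c.py": "import pytest\n@pytest.mark.skipif(True, reason='os')\ndef test_y():\n    assert True\n",
}
hits = find_skipped_tests(files)
```

Conditional skips such as `skipif` are deliberately left unflagged here. Whether they should count against the gate is a policy decision the source does not settle.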
|
|
60
60
|
|
|
61
61
|
### Gate 3 — Security
|
|
62
62
|
- [ ] No new vulnerabilities from `vulnerability-scan`
|
|
@@ -100,14 +100,14 @@ Produce a clear GO / NO-GO / CONDITIONAL decision:
|
|
|
100
100
|
|
|
101
101
|
- `terminal` — Run tests, linters, audit tools
|
|
102
102
|
- `codebase` — Scan for anti-patterns, secrets, debug statements
|
|
103
|
-
- `
|
|
104
|
-
- `
|
|
105
|
-
- `
|
|
103
|
+
- `crewpilot_metrics_coverage` — Coverage report
|
|
104
|
+
- `crewpilot_metrics_complexity` — Complexity scores
|
|
105
|
+
- `crewpilot_git_diff` — Changes since last deploy/tag
|
|
106
106
|
|
|
107
107
|
## Output Format
|
|
108
108
|
|
|
109
109
|
```
|
|
110
|
-
## [
|
|
110
|
+
## [CrewPilot → Deploy Guard]
|
|
111
111
|
|
|
112
112
|
### Gate Results
|
|
113
113
|
|
|
@@ -89,12 +89,12 @@ Verify minimum documentation exists:
|
|
|
89
89
|
|
|
90
90
|
- `codebase` — Read source code and documentation files
|
|
91
91
|
- `terminal` — Verify install steps, run examples
|
|
92
|
-
- `
|
|
92
|
+
- `crewpilot_knowledge_search` — Check if documentation decisions were previously recorded
|
|
93
93
|
|
|
94
94
|
## Output Format
|
|
95
95
|
|
|
96
96
|
```
|
|
97
|
-
## [
|
|
97
|
+
## [CrewPilot → Doc Governance]
|
|
98
98
|
|
|
99
99
|
### Documentation Map
|
|
100
100
|
| Doc File | Covers | Status |
|