@crewpilot/agent 1.0.0 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/README.md +35 -11
package/dist-npm/cli.js +5 -5
package/dist-npm/index.js +171 -138
package/package.json +2 -2
package/prompts/agent.md +38 -22
package/prompts/copilot-instructions.md +8 -8
package/prompts/{catalyst.config.json → crewpilot.config.json} +1 -1
package/prompts/skills/assure-code-quality/SKILL.md +3 -3
package/prompts/skills/assure-pr-intelligence/SKILL.md +4 -4
package/prompts/skills/assure-review-functional/SKILL.md +114 -0
package/prompts/skills/assure-review-standards/SKILL.md +106 -0
package/prompts/skills/assure-threat-model/SKILL.md +182 -0
package/prompts/skills/assure-vulnerability-scan/SKILL.md +1 -1
package/prompts/skills/autopilot-meeting/SKILL.md +43 -16
package/prompts/skills/autopilot-worker/SKILL.md +177 -63
package/prompts/skills/daily-digest/SKILL.md +35 -14
package/prompts/skills/deliver-change-management/SKILL.md +6 -6
package/prompts/skills/deliver-deploy-guard/SKILL.md +6 -6
package/prompts/skills/deliver-doc-governance/SKILL.md +2 -2
package/prompts/skills/engineer-feature-builder/SKILL.md +3 -3
package/prompts/skills/engineer-root-cause-analysis/SKILL.md +3 -3
package/prompts/skills/engineer-test-first/SKILL.md +2 -2
package/prompts/skills/insights-knowledge-base/SKILL.md +32 -11
package/prompts/skills/insights-pattern-detection/SKILL.md +5 -5
package/prompts/skills/strategize-architecture-planner/SKILL.md +2 -2
package/prompts/skills/strategize-solution-design/SKILL.md +2 -2
package/scripts/postinstall.js +4 -4

package/prompts/skills/autopilot-worker/SKILL.md

@@ -18,30 +18,38 @@ This pipeline chains 12 skills across role boundaries (e.g. code-quality and vul
 ## Tools Required
-- `catalyst_board_connect` — connect to board provider
-- `catalyst_board_create` — create issue on board
-- `catalyst_board_move` — update issue status
-- `catalyst_board_comment` — log progress on the issue
-- `catalyst_worker_start` — start orchestrator workflow
-- `catalyst_worker_plan` — set execution plan
-- `catalyst_worker_approve` — human approval gate
-- `catalyst_worker_branch` — create feature branch
-- `catalyst_worker_pr` — push + open PR
-- `catalyst_worker_review_done` — record review verdict
-- `catalyst_worker_complete` — mark workflow done
-- `catalyst_worker_fail` — circuit breaker on failure
-- `catalyst_git_stage` — stage files
-- `catalyst_git_commit` — commit changes
-- `catalyst_exec` — run commands (tests, lint, build)
-- `catalyst_knowledge_store` — store decisions made during implementation
-- `catalyst_git_diff` — analyze changes for change-management
-- `catalyst_git_log` — commit history for release notes
-- `catalyst_metrics_coverage` — coverage check for deploy-guard
-- `catalyst_metrics_complexity` — complexity check for deploy-guard and pattern detection
-- `catalyst_worker_preview_pr` — preview changes before PR creation
-- `catalyst_worker_push_fixes` — push fixes to existing PR branch (no new PR)
-- `catalyst_board_pr_comments` — fetch review comments from a PR
-- `catalyst_knowledge_search` — query known patterns, anti-patterns, and past root causes
+- `crewpilot_board_connect` — connect to board provider
+- `crewpilot_board_create` — create issue on board
+- `crewpilot_board_move` — update issue status
+- `crewpilot_board_comment` — log progress on the issue
+- `crewpilot_worker_start` — start orchestrator workflow
+- `crewpilot_worker_plan` — set execution plan
+- `crewpilot_worker_approve` — human approval gate
+- `crewpilot_worker_branch` — create feature branch
+- `crewpilot_worker_pr` — push + open PR
+- `crewpilot_worker_review_done` — record review verdict
+- `crewpilot_worker_complete` — mark workflow done
+- `crewpilot_worker_fail` — circuit breaker on failure
+- `crewpilot_git_stage` — stage files
+- `crewpilot_git_commit` — commit changes
+- `crewpilot_exec` — run commands (tests, lint, build)
+- `crewpilot_knowledge_store` — store decisions made during implementation
+- `crewpilot_git_diff` — analyze changes for change-management
+- `crewpilot_git_log` — commit history for release notes
+- `crewpilot_metrics_coverage` — coverage check for deploy-guard
+- `crewpilot_metrics_complexity` — complexity check for deploy-guard and pattern detection
+- `crewpilot_worker_preview_pr` — preview changes before PR creation
+- `crewpilot_worker_push_fixes` — push fixes to existing PR branch (no new PR)
+- `crewpilot_board_pr_comments` — fetch review comments from a PR
+- `crewpilot_knowledge_search` — query known patterns, anti-patterns, and past root causes
+- `crewpilot_artifact_write` — persist phase outputs (analysis, plans, reviews) so downstream phases can read them
+- `crewpilot_artifact_read` — read artifacts from prior phases (e.g. analysis → plan, plan → implementation)
+- `crewpilot_artifact_list` — list all artifacts for the current workflow
+- `crewpilot_dispatch_subagent` — delegate focused work (code review, test writing, security audit) to specialized sub-agents
+- `crewpilot_session_save` — save session state for long-running tasks (enables resume across conversations)
+- `crewpilot_session_restore` — restore a previously saved session to continue work
+- `crewpilot_session_list` — list all saved sessions
+- `mcp_workiq_ask_work_iq` — (optional, requires Work IQ extension) fetch M365 context (emails, docs, meetings) related to the task
 ## Methodology
@@ -56,6 +64,7 @@ digraph autopilot_worker {
     analysis [label="Phase 2\nCodebase Analysis & Planning"];
     design [label="Phase 2.5\nDesign & Architecture\n(label-gated)", style=dashed];
     rca [label="Phase 2.5c\nRoot Cause Analysis\n(bug label-gated)", style=dashed];
+    threat [label="Phase 2.5d\nThreat Model\n(security label-gated)", style=dashed];
     plan_gate [label="Phase 3\nHUMAN GATE: Plan Approval", shape=diamond, style=filled, fillcolor="#ffcccc"];
     implement [label="Phase 4\nBranch & Implementation"];
     change_mgmt [label="Phase 5\nChange Management"];
@@ -68,9 +77,11 @@ digraph autopilot_worker {
     intake -> analysis;
     analysis -> design [label="needs-design\nor needs-architecture"];
     analysis -> rca [label="bug/defect/\nregression"];
+    analysis -> threat [label="needs-threat-model\nor security-sensitive"];
     analysis -> plan_gate [label="no special labels"];
     design -> plan_gate;
     rca -> plan_gate;
+    threat -> plan_gate;
     plan_gate -> implement [label="approved"];
     plan_gate -> fail [label="cancelled"];
     implement -> change_mgmt;
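
The label-gated edges in this graph can be read as a small routing function. The sketch below is only an illustration of that reading: the function name and return format are hypothetical and not part of `@crewpilot/agent`, and it assumes that an issue carrying labels from several groups runs each matching phase before the plan gate. Label names and node ids are taken from the graph.

```python
# Hypothetical helper for illustration; not part of @crewpilot/agent.
# Label names and node ids come from the digraph in this hunk.
DESIGN_LABELS = {"needs-design", "needs-architecture"}
RCA_LABELS = {"bug", "defect", "regression"}
THREAT_LABELS = {"needs-threat-model", "security-sensitive"}


def gated_phases(labels):
    """Return the label-gated nodes to visit before plan_gate (Phase 3)."""
    present = set(labels)
    phases = []
    if present & DESIGN_LABELS:
        phases.append("design")  # Phase 2.5
    if present & RCA_LABELS:
        phases.append("rca")     # Phase 2.5c
    if present & THREAT_LABELS:
        phases.append("threat")  # Phase 2.5d, new in 3.0.0
    # An empty list corresponds to the "no special labels" edge:
    # analysis -> plan_gate directly.
    return phases


print(gated_phases(["bug", "security-sensitive"]))  # ['rca', 'threat']
print(gated_phases(["enhancement"]))                # []
```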
@@ -88,15 +99,34 @@ digraph autopilot_worker {
 ### Phase 1 — Intake & Issue Creation
 **First interaction hint:** If this is the first interaction in the session, start with:
-> 💡 *Running Catalyst Autopilot — I'll summarize the task, confirm with you before creating a board issue, plan the work, get your approval, implement, test, review, and open a PR.*
+> 💡 *Running CrewPilot Autopilot — I'll summarize the task, confirm with you before creating a board issue, plan the work, get your approval, implement, test, review, and open a PR.*
-**Entry mode detection** — the worker can be entered three ways:
+**Entry mode detection** — the worker can be entered four ways:
 | Entry Mode | How to Detect | Behavior |
 |---|---|---|
 | **Direct** | User says "autopilot", "full pipeline", etc. | Run full pipeline from Phase 1 |
 | **Routed from feature-builder** | feature-builder's Phase 0 classified as moderate/complex | Skip re-analyzing complexity — it's already assessed. Use the context feature-builder gathered. |
 | **Mid-build escalation** | feature-builder discovered more complexity during Phase 4 | Accept the partial context (files already touched, patterns found). Start from Phase 2 (planning) with what's already known. |
+| **Session resume** | User says "resume", "continue", "pick up where I left off" | Call `crewpilot_session_restore` with the workflow ID. Read the saved state, load associated artifacts, and resume from the last pending action. |
+**Session resume flow**: When resuming, the agent should:
+1. Call `crewpilot_session_restore` to get the saved state
+2. Call `crewpilot_artifact_list` to see what artifacts exist
+3. Read relevant artifacts with `crewpilot_artifact_read`
+4. **(Optional) Calendar-aware context refresh**: If `mcp_workiq_ask_work_iq` is available and significant time has passed since the session was saved (overnight, weekend, or >4 hours):
+   - Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent)
+   - **Check for new context**: `mcp_workiq_ask_work_iq` → "What meetings, emails, or Teams messages about {issue title / feature} happened since {saved_at timestamp}? Summarize any new decisions, requirement changes, or blockers."
+   - **Check calendar conflicts**: `mcp_workiq_ask_work_iq` → "Do I have any meetings in the next 2 hours that might affect my availability?"
+   - If new decisions or requirement changes are found, flag them to the user before continuing:
+     ```
+     📅 Context Update (since session was saved {age} ago):
+       - {new decision / requirement change / blocker}
+       → Continue with current plan? (yes / re-plan)
+     ```
+   - If unavailable, skip — resume proceeds without M365 context refresh.
+5. Continue from the first pending action in the saved state
+6. Do NOT re-run phases that have already completed (check artifacts_written)
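
The resume steps above amount to two checks: whether enough time has passed to justify a context refresh, and which phases still lack an artifact. The following is a minimal sketch of both. The saved-state shape (a dict with `saved_at` and `artifacts_written`), the function names, and the phase ordering are assumptions made for illustration; only the 4-hour threshold, the `artifacts_written` field name, and the artifact phase names come from the skill text.

```python
from datetime import datetime, timedelta

# Artifact phase names as they appear in this skill; the ordering is assumed.
PHASE_ORDER = [
    "analysis", "design", "architecture", "rca", "threat-model",
    "plan", "change-mgmt", "review-merged", "deploy-guard", "completion",
]
REFRESH_AFTER = timedelta(hours=4)  # ">4 hours" from step 4


def needs_context_refresh(saved_at, now):
    """True when the session is old enough to warrant an M365 context refresh."""
    return now - saved_at > REFRESH_AFTER


def pending_phases(saved_state, required):
    """Phases that are required for this workflow but have no artifact yet."""
    done = set(saved_state.get("artifacts_written", []))
    return [p for p in PHASE_ORDER if p in required and p not in done]


# Hypothetical saved state for a bug-fix workflow.
state = {
    "saved_at": datetime(2025, 1, 6, 17, 0),
    "artifacts_written": ["analysis", "rca"],
}
required = {"analysis", "rca", "plan", "change-mgmt",
            "review-merged", "deploy-guard", "completion"}

print(needs_context_refresh(state["saved_at"], datetime(2025, 1, 7, 9, 0)))  # True
print(pending_phases(state, required)[0])                                     # plan
```

Deriving the pending list from artifacts rather than from chat history matches the instruction not to re-run completed phases.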
 **Complexity check (direct entry only):** If the user enters autopilot directly, quickly assess if the request warrants the full pipeline:
 - If the request is trivial (single file, obvious change) → suggest: *"This is a small change. I can implement it directly without the full pipeline. Want me to do that instead?"*
@@ -130,18 +160,18 @@ Labels: {labels}
 → Create this task and start the pipeline? (yes / edit / no)
 ```
-- If **yes** → call `catalyst_board_create`, continue to Phase 2
+- If **yes** → call `crewpilot_board_create`, continue to Phase 2
 - If **edit** → user provides corrections, update and re-present
 - If **no** → stop the pipeline. Ask the user what they'd like to do instead.
 - Do NOT create the board issue without explicit user confirmation.
 </HARD-GATE>
-3. Call `catalyst_board_create` with title, description, acceptance criteria
+3. Call `crewpilot_board_create` with title, description, acceptance criteria
 4. Note the created issue ID
 **If user provides an existing issue number (e.g., "#42"):**
-1. Call `catalyst_board_get` to read the existing issue
+1. Call `crewpilot_board_get` to read the existing issue
 2. Use its title, description, and acceptance criteria as-is
 3. No confirmation needed — the task already exists
@@ -153,19 +183,31 @@ Labels: {labels}
    - Which files need to be **modified**
    - What patterns/conventions the codebase follows (naming, directory structure, test style)
    - What dependencies might be needed
-3. Check issue labels for `needs-design`, `needs-architecture`, and `bug`/`defect`/`regression`
-4. **Query pattern knowledge** via `catalyst_knowledge_search` (type: `pattern`):
+3. Check issue labels for `needs-design`, `needs-architecture`, `bug`/`defect`/`regression`, and `needs-threat-model`/`security-sensitive`
+4. **Query pattern knowledge** via `crewpilot_knowledge_search` (type: `pattern`):
    - Search for known patterns and anti-patterns in the files being modified
    - Search for past root causes in the same area of the codebase
    - Collect any "repeat offender" warnings from previous runs
    - Feed this context into the plan so the worker avoids known mistakes
-5. Call `catalyst_worker_start` with the issue ID and title
+5. **(Optional) Fetch M365 requirements context**: First call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent), then use **focused queries** to surface requirements context before planning:
+   - **Requirements & specs**: `mcp_workiq_ask_work_iq` → "Find emails, documents, and Teams messages about: {issue title}. Summarize relevant discussions, specs, and design docs."
+   - **Meeting decisions**: `mcp_workiq_ask_work_iq` → "What decisions were made about {issue title / feature name} in recent meetings? What requirements were stated?"
+   - **Stakeholder expectations**: `mcp_workiq_ask_work_iq` → "What did stakeholders or customers say about {feature} in recent emails or meetings? What was promised or committed?"
+   - Feed the M365 context into the analysis artifact so Phase 3's plan addresses stated requirements, not just the issue description.
+   - If `mcp_workiq_ask_work_iq` is unavailable, skip — this step is optional.
+6. Call `crewpilot_worker_start` with the issue ID and title
+7. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="analysis"` containing:
+   - Files to create/modify
+   - Codebase patterns discovered
+   - Dependencies needed
+   - Label-gated phases to run
+   - Known patterns/anti-patterns from knowledge search
 ### Phase 2.5 — Design & Architecture (label-gated)
 **Skip this phase entirely if the issue has neither `needs-design` nor `needs-architecture` label.**
-Check the issue labels (from `catalyst_board_get`). Run the applicable skills:
+Check the issue labels (from `crewpilot_board_get`). Run the applicable skills:
 #### If issue has `needs-design` label:
@@ -188,7 +230,7 @@ Reversal cost: {Low/Medium/High}
 ```
 5. **HUMAN GATE**: User picks an approach
-6. Store the decision via `catalyst_knowledge_store` (type: decision)
+6. Store the decision via `crewpilot_knowledge_store` (type: decision)
 7. Write the design document to `docs/design/{issue_id}-{slug}.md`:
    ```markdown
    # Design: {issue title}
@@ -211,6 +253,7 @@ Reversal cost: {Low/Medium/High}
    Confidence: {N}/10 | Reversal cost: {Low/Medium/High}
    ```
 8. Stage the design doc — it will be committed alongside the code in Phase 5
+9. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="design"` containing the chosen approach, trade-off summary, and design document path
 #### If issue has `needs-architecture` label:
@@ -253,6 +296,7 @@ Data Flow:
    {rejected options and why}
    ```
 9. Stage the ADR — it will be committed alongside the code in Phase 5
+10. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="architecture"` containing the component decomposition, data flow, interfaces, and ADR path
 #### If issue has BOTH labels:
@@ -267,8 +311,8 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
 1. **Symptom collection**:
    - Extract error message, stack trace, steps to reproduce from the issue description
-   - Run `catalyst_git_log` on the affected files to check recent changes
-   - Query `catalyst_knowledge_search` for previous root causes in the same area
+   - Run `crewpilot_git_log` on the affected files to check recent changes
+   - Query `crewpilot_knowledge_search` for previous root causes in the same area
 2. **Hypothesis generation** — generate 2-3 ranked hypotheses:
 ```
@@ -282,7 +326,7 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
 ```
 3. **Systematic elimination** — for each hypothesis (highest first):
-   - Run `catalyst_exec` to test (add logging, reproduce, check state)
+   - Run `crewpilot_exec` to test (add logging, reproduce, check state)
    - Record result: confirmed / eliminated / narrowed
    - Max 5 attempts total (circuit breaker — same as Phase 4)
 4. **Root cause identification**:
@@ -293,21 +337,56 @@ The design decision feeds into the architecture — e.g., "we chose Redis" → a
    - The plan must fix the root cause, not just the symptom
    - Include a regression test that fails without the fix
    - Phase 5 commit footer: `Root-cause: {one-sentence description}`
-6. **Store root cause** via `catalyst_knowledge_store` (type: `root-cause`):
+6. **Store root cause** via `crewpilot_knowledge_store` (type: `root-cause`):
    - What: the root cause description
    - Where: affected files/modules
    - Why: the design gap
    - Prevention: what would have caught this earlier
-7. **If root cause reveals a systemic issue**, flag it for pattern detection in Phase 6:
+7. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="rca"` containing the root cause, causal chain, design gap, prevention strategy, and affected files
+8. **If root cause reveals a systemic issue**, flag it for pattern detection in Phase 6:
    - Add note: `systemic:{description}` for Phase 6 to pick up
-#### After design/architecture/RCA phases:
+### Phase 2.5d — Threat Modeling (label-gated)
+**Skip if the issue does NOT have a `needs-threat-model` or `security-sensitive` label.**
+**Load and follow** `.github/skills/assure-threat-model/SKILL.md` methodology:
+1. **Read prior artifacts**: Load the `analysis` artifact (and `architecture` if it exists) to understand the system being built
+2. **Scope the model**: Define the trust boundaries and data flows for the feature being implemented
+3. **STRIDE analysis**: For each component and data flow crossing a trust boundary, evaluate all 6 STRIDE categories
+4. **Risk assessment**: Score each threat (Likelihood × Impact = Risk)
+5. **Mitigation planning**: For threats with risk ≥ 7, propose specific mitigations with effort and implementation phase
+6. **Present to user**:
+```
+🛡️ Threat Model for: "{issue title}"
+| ID | STRIDE | Component | Threat | Risk Score | Mitigation |
+|----|--------|-----------|--------|------------|------------|
+| T1 | ...    | ...       | ...    | ...        | ...        |
+Critical threats: {count}
+Required mitigations before implementation: {list}
+→ Approve threat model? (yes / edit)
+```
-The design documents and RCA findings inform the implementation plan. Phase 3's plan should reference:
+7. **HUMAN GATE**: User approves the threat model
+8. Store via `crewpilot_knowledge_store` (type: `threat-model`)
+9. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="threat-model"` containing the full threat register
+10. Feed critical/high-risk mitigations into Phase 3 plan as mandatory implementation steps
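
Steps 4 and 5 above describe a simple scoring rule: risk is likelihood times impact, and threats scoring 7 or higher need a mitigation. A minimal sketch of that rule follows. The `Threat` class, its field names, and the example values are hypothetical, and the rating scale is an assumption because this hunk does not state it.

```python
from dataclasses import dataclass

MITIGATION_THRESHOLD = 7  # from step 5: mitigations for risk >= 7


@dataclass
class Threat:
    id: str
    stride: str
    component: str
    likelihood: int  # rating scale is an assumption; not given in this hunk
    impact: int

    @property
    def risk(self):
        # Step 4: Likelihood x Impact = Risk
        return self.likelihood * self.impact


def requiring_mitigation(threats):
    """IDs of threats whose risk score meets the mitigation threshold."""
    return [t.id for t in threats if t.risk >= MITIGATION_THRESHOLD]


# Illustrative register entries, not taken from the package.
register = [
    Threat("T1", "Spoofing", "login API", likelihood=3, impact=3),
    Threat("T2", "Repudiation", "audit log", likelihood=2, impact=2),
]
print([(t.id, t.risk) for t in register])  # [('T1', 9), ('T2', 4)]
print(requiring_mitigation(register))      # ['T1']
```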
+#### After design/architecture/RCA/threat-model phases:
+The design documents, RCA findings, and threat model inform the implementation plan. Phase 3's plan should reference:
 - Which approach was chosen (from design doc)
 - Which components to build (from architecture)
 - Which interfaces to implement (from ADR)
 - What root cause was found (from RCA) and what fix addresses it
+- What threats were identified (from threat model) and what mitigations are required
+**Read prior artifacts**: Call `crewpilot_artifact_read` to load the `analysis`, `design`, `architecture`, `rca`, and/or `threat-model` artifacts. These contain the full context from earlier phases — do not rely on chat history alone.
 ### Phase 3 — HUMAN GATE: Plan Approval
@@ -340,18 +419,24 @@ Complexity: {trivial|simple|moderate|complex}
 Approve? (yes / edit / cancel)
 ```
-- If **yes** → call `catalyst_worker_approve`, continue to Phase 4
+- If **yes** → call `crewpilot_worker_approve`, continue to Phase 4
 - If **edit** → user provides changes, update plan, re-present
-- If **cancel** → call `catalyst_worker_fail`, stop
+- If **cancel** → call `crewpilot_worker_fail`, stop
+**Write artifact**: After approval, call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="plan"` containing the approved plan (steps, files, complexity).
+**Session checkpoint**: After plan approval, call `crewpilot_session_save` with status="checkpoint", phase="phase-3-approved", and the current context. This ensures the approved plan can be resumed if the session is interrupted.
 ### Phase 4 — Branch & Implementation
-1. Call `catalyst_worker_branch` to create feature branch
-2. Call `catalyst_board_move` to set issue status to "in-progress"
+**Read prior artifacts**: Call `crewpilot_artifact_read` for `plan` (and `analysis`, `design`, `architecture`, `rca` if they exist) to load the full execution context.
+1. Call `crewpilot_worker_branch` to create feature branch
+2. Call `crewpilot_board_move` to set issue status to "in-progress"
 3. **For each step in the plan:**
    a. Implement the code change (create/modify files)
    b. Follow existing codebase patterns discovered in Phase 2
-   c. After each logical unit, run `catalyst_exec("npm test")` or equivalent to verify nothing is broken
+   c. After each logical unit, run `crewpilot_exec("npm test")` or equivalent to verify nothing is broken
    d. If tests fail, diagnose and fix (max 3 attempts per step — circuit breaker)
 4. Write tests for new code:
    - Match existing test framework and conventions
@@ -359,8 +444,8 @@ Approve? (yes / edit / cancel)
    - Run tests to confirm they pass
 **Circuit breaker:** If any step fails 3 times consecutively:
-- Call `catalyst_board_comment` with details of the failure
-- Call `catalyst_worker_fail` with reason
+- Call `crewpilot_board_comment` with details of the failure
+- Call `crewpilot_worker_fail` with reason
 - Tell the user what went wrong and which step is stuck
 - STOP. Do not continue.
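
The circuit breaker described here is a bounded retry loop that stops the pipeline instead of retrying indefinitely. The sketch below shows that control flow only. `run_step`, `CircuitOpen`, and the `attempt_fn` callback are hypothetical names; in the skill itself the failure path calls `crewpilot_board_comment` and `crewpilot_worker_fail` rather than raising a Python exception.

```python
MAX_ATTEMPTS = 3  # "max 3 attempts per step" from Phase 4


class CircuitOpen(Exception):
    """Raised when a step exhausts its attempts and the pipeline must stop."""


def run_step(step_name, attempt_fn, max_attempts=MAX_ATTEMPTS):
    """Run attempt_fn until it returns True; return the attempt number used.

    attempt_fn stands in for "implement, then run the tests" and receives the
    1-based attempt number.
    """
    for attempt in range(1, max_attempts + 1):
        if attempt_fn(attempt):
            return attempt
    raise CircuitOpen(f"step '{step_name}' failed {max_attempts} times")


# A step that passes on the second try, and one that never passes.
print(run_step("add validation", lambda n: n >= 2))  # 2
try:
    run_step("migrate schema", lambda n: False)
except CircuitOpen as err:
    print(err)  # step 'migrate schema' failed 3 times
```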
@@ -368,10 +453,10 @@ Approve? (yes / edit / cancel)
 **Load and follow** `.github/skills/deliver-change-management/SKILL.md` methodology:
-1. Run `catalyst_git_diff` to analyze all changes
+1. Run `crewpilot_git_diff` to analyze all changes
 2. Categorize changes by type: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
 3. **If changes span multiple logical units** (e.g., new feature + test + config):
-   - Split into separate commits with `catalyst_git_stage` per group
+   - Split into separate commits with `crewpilot_git_stage` per group
    - Each commit gets its own conventional message
    - Example:
      ```
@@ -388,7 +473,8 @@ Approve? (yes / edit / cancel)
    - Format: `feat(scope): description (closes #ID)`
    - Body: what was implemented and why
    - Footer: `Closes #ID`
-5. Call `catalyst_git_stage` and `catalyst_git_commit` for each logical commit
+5. Call `crewpilot_git_stage` and `crewpilot_git_commit` for each logical commit
+6. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="change-mgmt"` containing the list of commits created (hash, type, scope, message)
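
Categorizing changes by conventional-commit type, as Phase 5 asks, can be sketched with a small parser. The regular expression below covers only the six types listed in step 2 and the `type(scope): description` format shown in step 4. The function name and the `other` bucket are assumptions for illustration, not package behavior.

```python
import re
from collections import defaultdict

# Types from step 2; format from step 4: feat(scope): description
COMMIT_RE = re.compile(
    r"^(?P<type>feat|fix|refactor|test|docs|chore)"
    r"(?:\((?P<scope>[^)]+)\))?: (?P<description>.+)$"
)


def group_by_type(subjects):
    """Group commit subject lines by conventional-commit type."""
    groups = defaultdict(list)
    for subject in subjects:
        match = COMMIT_RE.match(subject)
        # Subjects that do not follow the convention land in "other".
        key = match.group("type") if match else "other"
        groups[key].append(subject)
    return dict(groups)


subjects = [
    "feat(auth): add token refresh (closes #42)",
    "test(auth): cover token refresh",
    "update readme",
]
for kind, items in group_by_type(subjects).items():
    print(kind, len(items))
# feat 1
# test 1
# other 1
```

The same grouping applies to the daily-digest skill, which groups commits by scope and type.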
 ### Phase 5b — Doc Governance (Deliver Skill #2)
@@ -413,7 +499,7 @@ Approve? (yes / edit / cancel)
 ### Phase 6 — PR Creation & Auto-Review
-1. Call `catalyst_worker_preview_pr` with:
+1. Call `crewpilot_worker_preview_pr` with:
    - Title: primary commit message
    - Body: markdown with sections:
      - **What**: summary of changes
@@ -426,7 +512,7 @@ Approve? (yes / edit / cancel)
 2. **HUMAN GATE**: User reviews the preview — do NOT create the PR until the user approves.
    If the user requests changes, apply them and re-preview. Never skip this gate.
 </HARD-GATE>
-3. Call `catalyst_worker_pr` to create the PR
+3. Call `crewpilot_worker_pr` to create the PR
 4. **Run PR Intelligence** (read `.github/skills/assure-pr-intelligence/SKILL.md`):
    - **Change inventory**: categorize changed files (core, api, test, config, docs)
    - **Risk assessment**: evaluate scope, complexity, blast radius, test coverage, reversibility → Low/Medium/High/Critical risk score
@@ -434,7 +520,15 @@ Approve? (yes / edit / cancel)
    - **Merge readiness checklist**: tests pass, security clean, breaking changes documented, PR description matches changes
    - Post the full PR Intelligence report as a **comment on the PR** so the assigned reviewer sees it immediately
 5. Read the diff of the PR
-6. Run **code-quality** review internally (read `.github/skills/assure-code-quality/SKILL.md`):
+6. **Subagent delegation (recommended for moderate/complex changes):** Use `crewpilot_dispatch_subagent` to delegate review work in parallel:
+   - Delegate `code-reviewer` role with the diff and file list — receives correctness, security, and performance findings
+   - Delegate `standards-reviewer` role with the diff and codebase conventions — receives standards compliance findings
+   - Delegate `security-auditor` role with source files and architecture context — receives STRIDE/OWASP findings
+   - Each subagent writes its output as an artifact (e.g. `review-functional`, `review-standards`) for traceability
+   - Merge subagent findings using `crewpilot_dispatch_consensus` to identify high-confidence vs disputed issues
+   **Fallback (simple changes):** Run reviews inline without subagent delegation:
+7. Run **code-quality** review internally (read `.github/skills/assure-code-quality/SKILL.md`):
    - Correctness: does the code do what the acceptance criteria say?
    - Security: any obvious vulnerabilities (SQL injection, XSS, secrets)?
    - Performance: any N+1 queries, await-in-loops, unnecessary re-renders?
@@ -442,7 +536,22 @@ Approve? (yes / edit / cancel)
 7. Run **vulnerability-scan** internally (read `.github/skills/assure-vulnerability-scan/SKILL.md`):
    - OWASP Top 10 quick check on new code
    - Dependency audit: `npm audit` or `pip audit`
-8. Run `catalyst_exec("npm run lint")` and `catalyst_exec("npm run typecheck")` if available
+8. Run `crewpilot_exec("npm run lint")` and `crewpilot_exec("npm run typecheck")` if available
+8b. **(Optional) Requirements alignment validation**: If M365 context was fetched in Phase 2, validate the implementation against meeting-stated requirements:
+   - Read the `analysis` artifact to retrieve the M365 requirements context captured earlier
+   - If the analysis artifact contains meeting decisions or stakeholder expectations, call `mcp_workiq_ask_work_iq` → "What specific requirements and acceptance criteria were stated for {feature} in meetings and emails?"
+   - Cross-reference each stated requirement against the implementation diff:
+     - **Covered**: the requirement is addressed by the code changes ✓
+     - **Partial**: the requirement is partially addressed — flag what's missing
+     - **Missing**: the requirement is not addressed at all — flag as a review finding
+   - Include requirements alignment in the PR comment:
+     ```
+     📋 Requirements Alignment:
+       Meeting requirements checked: {N}
+       Covered: {count} ✓ | Partial: {count} ⚠️ | Missing: {count} ❌
+       {list any partial/missing items}
+     ```
+   - If critical requirements are missing, flag as a review issue that must be addressed before merge
 9. **Run diff-scoped pattern detection** (read `.github/skills/insights-pattern-detection/SKILL.md`):
    - Scope: only scan files changed in the diff (NOT full codebase)
    - Check for **consistency** with existing codebase patterns:
@@ -456,14 +565,14 @@ Approve? (yes / edit / cancel)
      - Shotgun surgery (small change touching too many files)
      - Primitive obsession (strings/numbers where domain types belong)
    - **Query knowledge base for repeat offenses**:
-     - `catalyst_knowledge_search` type: `pattern` — "has this same anti-pattern been flagged before?"
+     - `crewpilot_knowledge_search` type: `pattern` — "has this same anti-pattern been flagged before?"
      - If a repeat offense is found, flag prominently:
        ```
        ⚠️ Recurring Pattern Issue: {description}
        Previously flagged in: {previous context}
        Suggestion: Consider a structural fix.
        ```
-   - Run `catalyst_metrics_complexity` on changed files — flag any function with complexity > threshold
+   - Run `crewpilot_metrics_complexity` on changed files — flag any function with complexity > threshold
    - Include pattern findings in the PR comment:
      ```
      🔎 Pattern Detection Results:
@@ -477,9 +586,10 @@ Approve? (yes / edit / cancel)
    - Re-commit: `fix(scope): address review findings`
    - Re-push
    - Re-run pattern detection on the fix to confirm resolution
-11. Call `catalyst_worker_review_done` with verdict: "approved" and summary
-12. Call `catalyst_board_move` to set issue status to "in-review"
-13. Call `catalyst_board_comment`: "PR #{pr_number} opened. Ready for review."
+11. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="review-merged"` containing the combined review results (code-quality, vulnerability-scan, pattern detection findings, and fix iterations)
+12. Call `crewpilot_worker_review_done` with verdict: "approved" and summary
+12. Call `crewpilot_board_move` to set issue status to "in-review"
+13. Call `crewpilot_board_comment`: "PR #{pr_number} opened. Ready for review."
 ### Phase 7 — Deploy Guard (Deliver Skill #3)
@@ -512,10 +622,12 @@ Produce a verdict and include in the PR comment:
 - If **CONDITIONAL** → list warnings in PR comment, proceed (human decides)
 - If **NO-GO** → fix blockers, re-run until GO or escalate to user
+**Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="deploy-guard"` containing the full 6-gate results and verdict.
 ### Phase 8 — Completion & Learning
-1. Call `catalyst_board_comment` with deploy guard results: "All checks passed. Ready to merge."
-2. **Store knowledge** via `catalyst_knowledge_store`:
+1. Call `crewpilot_board_comment` with deploy guard results: "All checks passed. Ready to merge."
+2. **Store knowledge** via `crewpilot_knowledge_store`:
    - Decisions made during implementation (type: `decision`)
    - Root cause findings, if this was a bug fix (type: `root-cause`)
    - **Pattern findings** from Phase 6 (type: `pattern`):
@@ -557,7 +669,8 @@ Repeat Issues: {none | {count} recurring patterns detected}
 → Merge when ready. Board will auto-update on close.
 ```
-4. Call `catalyst_worker_complete`
+4. **Write artifact**: Call `crewpilot_artifact_write` with `workflow_id={issue_id}`, `phase="completion"` containing the final summary (PR number, branch, commits, review/deploy-guard results, knowledge stored)
+5. Call `crewpilot_worker_complete`
 ### Capability Hints (on completion)
@@ -613,6 +726,7 @@ Every step in the Phase 3 plan and every file produced in Phase 4 must contain r
 - `solution-design` — Phase 2.5: generate solution design doc when `needs-design` label detected
 - `architecture-planner` — Phase 2.5: generate ADR when `needs-architecture` label detected
 - `root-cause-analysis` — Phase 2.5c: systematic RCA when `bug`/`defect`/`regression` label detected
+- `threat-model` — Phase 2.5d: STRIDE threat modeling when `needs-threat-model`/`security-sensitive` label detected
 - `change-management` — Phase 5: proper conventional commits with multi-commit splitting
 - `doc-governance` — Phase 5b: auto-detect and fix documentation drift
 - `pr-intelligence` — Phase 6: risk assessment + reviewer guidance posted on PR
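
Most removed and added lines in this file differ only in the tool prefix, `catalyst_` versus `crewpilot_`. For anyone maintaining their own prompt files that still reference the old names, a rewrite along these lines might help. This is a sketch based only on the rename pattern visible in this diff. `migrate_tool_names` is a hypothetical helper, not something the package ships, and it does not cover the new tools added in 3.0.0.

```python
import re

# Matches identifiers such as catalyst_board_create; the word boundary keeps
# longer names that merely contain "catalyst_" from being rewritten.
TOOL_RE = re.compile(r"\bcatalyst_([a-z][a-z0-9_]*)")


def migrate_tool_names(text):
    """Rewrite catalyst_* tool references to the crewpilot_* prefix."""
    return TOOL_RE.sub(r"crewpilot_\1", text)


old = 'Run `catalyst_exec("npm test")`, then call `catalyst_worker_pr`.'
print(migrate_tool_names(old))
# Run `crewpilot_exec("npm test")`, then call `crewpilot_worker_pr`.
```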

package/prompts/skills/daily-digest/SKILL.md

@@ -12,12 +12,15 @@ Generate a comprehensive daily/weekly work summary by aggregating git activity,
 ## Tools Required
-- `catalyst_git_log` — get commits for the time period
-- `catalyst_board_my_items` — get board items (opened, closed, in-progress)
-- `catalyst_worker_dashboard` — workflow completions and stats
-- `catalyst_knowledge_timeline` — decisions made in the period
-- `catalyst_exec` — run git/gh commands for additional data
-- `catalyst_notify_send` — deliver the report via email
+- `crewpilot_git_log` — get commits for the time period
+- `crewpilot_board_my_items` — get board items (opened, closed, in-progress)
+- `crewpilot_worker_dashboard` — workflow completions and stats
+- `crewpilot_knowledge_timeline` — decisions made in the period
+- `crewpilot_exec` — run git/gh commands for additional data
+- `crewpilot_notify_send` — deliver the report via email
+- `mcp_workiq_accept_eula` — (optional) accept Work IQ EULA before first query
+- `mcp_workiq_ask_work_iq` — (optional, requires Work IQ extension) fetch M365 activity (emails, meetings, docs, Teams) for a full work-surface report
+- `crewpilot_artifact_write` — persist the digest as an artifact
 ## Methodology
@@ -42,26 +45,44 @@ digraph daily_digest {
 Gather from all sources for the requested time period (default: today):
 **Git Activity:**
-1. Call `catalyst_git_log` with `--since="today 00:00"` (or requested range)
+1. Call `crewpilot_git_log` with `--since="today 00:00"` (or requested range)
 2. Extract: commit count, files changed, insertions/deletions, branches touched
 3. Group commits by scope/type (feat, fix, refactor, test, docs)
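
The grouping in step 3 might look like the sketch below, assuming commit subjects follow the Conventional Commits shape `type(scope): message`. The function name `groupCommitsByType` and the `other` bucket are assumptions for illustration; the skill file does not specify how unmatched commits are handled.

```typescript
// Types named in the step above; anything else falls into "other" (an assumption).
const DIGEST_TYPES = ["feat", "fix", "refactor", "test", "docs"];

// Groups commit subject lines by their conventional-commit type.
function groupCommitsByType(subjects: string[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  // Matches "type: msg", "type(scope): msg", and the optional "!" breaking marker.
  const pattern = /^(\w+)(?:\([^)]*\))?!?:\s/;
  for (const subject of subjects) {
    const match = pattern.exec(subject);
    const type =
      match && DIGEST_TYPES.indexOf(match[1]) !== -1 ? match[1] : "other";
    (groups[type] = groups[type] || []).push(subject);
  }
  return groups;
}
```

The per-type counts can then feed the summary without reproducing full commit messages.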
 **Board Activity:**
-1. Call `catalyst_exec` with `gh issue list --author=@me --state=all --json number,title,state,updatedAt,labels`
+1. Call `crewpilot_exec` with `gh issue list --author=@me --state=all --json number,title,state,updatedAt,labels`
 2. Filter to items updated in the time period
 3. Categorize: created, moved to in-progress, closed/done, commented on
 **PR Activity:**
-1. Call `catalyst_exec` with `gh pr list --author=@me --state=all --json number,title,state,createdAt,mergedAt,reviewDecision`
+1. Call `crewpilot_exec` with `gh pr list --author=@me --state=all --json number,title,state,createdAt,mergedAt,reviewDecision`
 2. Filter to time period
 3. Categorize: opened, merged, review pending, changes requested
 **Workflow Activity:**
-1. Call `catalyst_worker_dashboard` for digital worker stats
+1. Call `crewpilot_worker_dashboard` for digital worker stats
 2. Filter completed/failed workflows in the period
 **Knowledge:**
-1. Call `catalyst_knowledge_timeline` for decisions and lessons stored today
+1. Call `crewpilot_knowledge_timeline` for decisions and lessons stored today
+**M365 Activity (optional — requires Work IQ MCP server):**
+1. Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent — safe to call every time)
+2. Use **multiple focused queries** for comprehensive coverage (targeted queries return better results than one broad question):
+   - **Emails**: `mcp_workiq_ask_work_iq` → "What emails did I send and receive on {date}? Summarize key threads and any action items."
+   - **Meetings**: `mcp_workiq_ask_work_iq` → "What meetings did I attend on {date}? What decisions were made and what action items were assigned to me?"
+   - **Documents**: `mcp_workiq_ask_work_iq` → "What documents did I edit or view in SharePoint and OneDrive on {date}?"
+   - **Teams**: `mcp_workiq_ask_work_iq` → "What Teams channel messages and chats was I active in on {date}? What mentions did I receive?"
+   - **Tasks**: `mcp_workiq_ask_work_iq` → "What Planner or To-Do tasks did I complete or get assigned on {date}?"
+3. If Work IQ is available, parse all responses and include the full work surface:
+   - **Emails**: sent/received count, key threads, action items from emails
+   - **Meetings**: attended meetings, decisions made, action items assigned, linked documents
+   - **Documents**: files edited/viewed in SharePoint/OneDrive, co-authoring activity
+   - **Teams**: active channel conversations, 1:1 chats, mentions, and responses
+   - **Tasks**: Planner/To-Do items completed, created, or updated
+4. If `mcp_workiq_ask_work_iq` is unavailable or errors, skip this section — the digest works without it (git + board + PRs is the baseline)
+> **Query budget**: Work IQ queries have a ~30/session budget. The 5 queries above are a reasonable investment for a full daily digest. For weekly summaries, combine into broader date-range queries to conserve budget.
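
The five focused queries and the budget check can be sketched as below. This only builds the prompt strings; it does not call `mcp_workiq_ask_work_iq`, whose invocation shape is not shown in this diff. The names `buildWorkIqQueries` and `fitsBudget` are hypothetical, and the limit of 30 is the approximate figure from the note above.

```typescript
// Approximate per-session budget taken from the note above (assumed to be a hard cap here).
const WORK_IQ_SESSION_BUDGET = 30;

// Prompt templates copied from the five focused queries listed above.
const QUERY_TEMPLATES: Record<string, string> = {
  emails:
    "What emails did I send and receive on {date}? Summarize key threads and any action items.",
  meetings:
    "What meetings did I attend on {date}? What decisions were made and what action items were assigned to me?",
  documents:
    "What documents did I edit or view in SharePoint and OneDrive on {date}?",
  teams:
    "What Teams channel messages and chats was I active in on {date}? What mentions did I receive?",
  tasks:
    "What Planner or To-Do tasks did I complete or get assigned on {date}?",
};

// Fills the {date} placeholder in each template.
function buildWorkIqQueries(date: string): { topic: string; prompt: string }[] {
  return Object.keys(QUERY_TEMPLATES).map((topic) => ({
    topic,
    prompt: QUERY_TEMPLATES[topic].replace("{date}", date),
  }));
}

// True when the planned queries still fit in the remaining session budget.
function fitsBudget(queriesUsed: number, queriesPlanned: number): boolean {
  return queriesUsed + queriesPlanned <= WORK_IQ_SESSION_BUDGET;
}
```

For a weekly summary, the same templates could be given a date range instead of a single day so that one pass still costs five queries.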
 ### Phase 2 — Report Generation
@@ -112,9 +133,9 @@ Tomorrow's focus:
 Based on notification configuration:
 **Email (default when recipients configured):**
-1. Call `catalyst_notify_send` with subject: "Daily Digest — {date} — {project name}", body: full report
+1. Call `crewpilot_notify_send` with subject: "Daily Digest — {date} — {project name}", body: full report
 2. Email sent automatically via SMTP (no manual interaction needed)
-3. Requires SMTP env vars or `catalyst_notify_configure` to be set up
+3. Requires SMTP env vars or `crewpilot_notify_configure` to be set up
 **Console (fallback when no recipients configured):**
 1. Just display the report in chat
@@ -160,7 +181,7 @@ When triggered with "weekly summary" or "weekly digest":
 - Do NOT include sensitive data (secrets, tokens, passwords found in code)
 - Do NOT fabricate activity — if nothing happened, say "quiet day"
 - Do NOT include full commit messages — summarize by category
-- Do NOT send to recipients not configured via catalyst_notify_configure
+- Do NOT send to recipients not configured via crewpilot_notify_configure
 ## Chains To

package/prompts/skills/deliver-change-management/SKILL.md CHANGED

@@ -84,16 +84,16 @@ If multiple logical changes are staged:
 ## Tools Required
 - `terminal` — Run git commands
-- `catalyst_git_status` — Get current state
-- `catalyst_git_diff` — Get detailed diff
-- `catalyst_git_log` — Parse commit history
-- `catalyst_git_stage` — Stage files
-- `catalyst_git_commit` — Execute commit
+- `crewpilot_git_status` — Get current state
+- `crewpilot_git_diff` — Get detailed diff
+- `crewpilot_git_log` — Parse commit history
+- `crewpilot_git_stage` — Stage files
+- `crewpilot_git_commit` — Execute commit
 ## Output Format
 ```
-## [Catalyst → Change Management]
+## [CrewPilot → Change Management]
 ### Changes Detected
 | Type | Scope | Files |

package/prompts/skills/deliver-deploy-guard/SKILL.md CHANGED

@@ -49,14 +49,14 @@ digraph deploy_guard {
 - [ ] No `TODO`/`FIXME`/`HACK` in files changed since last deploy
 - [ ] No `console.log`/`print`/debug statements in production paths
 - [ ] No commented-out code blocks
-- Run: `catalyst_metrics_complexity` on changed files — flag any high-complexity additions
+- Run: `crewpilot_metrics_complexity` on changed files — flag any high-complexity additions
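
The first two Gate 1 checks amount to a line-level pattern scan, sketched below. This is a rough heuristic under stated assumptions, not how the skill or `crewpilot_metrics_complexity` works: `scanGate1` and `GateFinding` are hypothetical names, and a plain regex scan will also flag matches inside strings or comments.

```typescript
interface GateFinding {
  line: number; // 1-based line number within the scanned text
  kind: "marker" | "debug";
  text: string;
}

// Flags TODO/FIXME/HACK markers and console.log/print calls, line by line.
function scanGate1(source: string): GateFinding[] {
  const markerPattern = /\b(TODO|FIXME|HACK)\b/;
  const debugPattern = /\bconsole\.log\s*\(|\bprint\s*\(/;
  const findings: GateFinding[] = [];
  source.split("\n").forEach((text, index) => {
    if (markerPattern.test(text)) {
      findings.push({ line: index + 1, kind: "marker", text: text.trim() });
    } else if (debugPattern.test(text)) {
      findings.push({ line: index + 1, kind: "debug", text: text.trim() });
    }
  });
  return findings;
}
```

In practice the input would be restricted to lines changed since the last deploy, which is what the checklist asks for.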
 ### Gate 2 — Test Integrity
 - [ ] All tests pass
 - [ ] Test coverage meets minimum threshold
 - [ ] No skipped tests (`.skip`, `@disabled`, `@pytest.mark.skip`)
 - [ ] No test files with zero assertions
-- Run: `catalyst_metrics_coverage` to validate
+- Run: `crewpilot_metrics_coverage` to validate
 ### Gate 3 — Security
 - [ ] No new vulnerabilities from `vulnerability-scan`
@@ -100,14 +100,14 @@ Produce a clear GO / NO-GO / CONDITIONAL decision:
 - `terminal` — Run tests, linters, audit tools
 - `codebase` — Scan for anti-patterns, secrets, debug statements
-- `catalyst_metrics_coverage` — Coverage report
-- `catalyst_metrics_complexity` — Complexity scores
-- `catalyst_git_diff` — Changes since last deploy/tag
+- `crewpilot_metrics_coverage` — Coverage report
+- `crewpilot_metrics_complexity` — Complexity scores
+- `crewpilot_git_diff` — Changes since last deploy/tag
 ## Output Format
 ```
-## [Catalyst → Deploy Guard]
+## [CrewPilot → Deploy Guard]
 ### Gate Results

package/prompts/skills/deliver-doc-governance/SKILL.md CHANGED

@@ -89,12 +89,12 @@ Verify minimum documentation exists:
 - `codebase` — Read source code and documentation files
 - `terminal` — Verify install steps, run examples
-- `catalyst_knowledge_search` — Check if documentation decisions were previously recorded
+- `crewpilot_knowledge_search` — Check if documentation decisions were previously recorded
 ## Output Format
 ```
-## [Catalyst → Doc Governance]
+## [CrewPilot → Doc Governance]
 ### Documentation Map
 | Doc File | Covers | Status |