@crewpilot/agent 1.0.0 → 2.0.0

Files changed (27)
  1. package/README.md +131 -107
  2. package/dist-npm/cli.js +0 -0
  3. package/dist-npm/index.js +160 -127
  4. package/package.json +69 -69
  5. package/prompts/agent.md +282 -266
  6. package/prompts/catalyst.config.json +72 -72
  7. package/prompts/copilot-instructions.md +36 -36
  8. package/prompts/skills/assure-code-quality/SKILL.md +112 -112
  9. package/prompts/skills/assure-pr-intelligence/SKILL.md +148 -148
  10. package/prompts/skills/assure-review-functional/SKILL.md +114 -0
  11. package/prompts/skills/assure-review-standards/SKILL.md +106 -0
  12. package/prompts/skills/assure-threat-model/SKILL.md +182 -0
  13. package/prompts/skills/assure-vulnerability-scan/SKILL.md +146 -146
  14. package/prompts/skills/autopilot-meeting/SKILL.md +434 -407
  15. package/prompts/skills/autopilot-worker/SKILL.md +737 -623
  16. package/prompts/skills/daily-digest/SKILL.md +188 -167
  17. package/prompts/skills/deliver-change-management/SKILL.md +132 -132
  18. package/prompts/skills/deliver-deploy-guard/SKILL.md +144 -144
  19. package/prompts/skills/deliver-doc-governance/SKILL.md +130 -130
  20. package/prompts/skills/engineer-feature-builder/SKILL.md +270 -270
  21. package/prompts/skills/engineer-root-cause-analysis/SKILL.md +150 -150
  22. package/prompts/skills/engineer-test-first/SKILL.md +148 -148
  23. package/prompts/skills/insights-knowledge-base/SKILL.md +202 -181
  24. package/prompts/skills/insights-pattern-detection/SKILL.md +142 -142
  25. package/prompts/skills/strategize-architecture-planner/SKILL.md +141 -141
  26. package/prompts/skills/strategize-solution-design/SKILL.md +118 -118
  27. package/scripts/postinstall.js +108 -108
@@ -0,0 +1,182 @@
1
+ # Threat Model — STRIDE
2
+
3
+ > **Pillar**: Assure | **ID**: `assure-threat-model`
4
+
5
+ ## Purpose
6
+
7
+ Systematic threat modeling using the STRIDE framework. Identifies threats across Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege. Produces a threat register with risk scores and mitigations that informs design decisions and security reviews.
8
+
9
+ ## Activation Triggers
10
+
11
+ - "threat model", "stride", "threat analysis", "security architecture"
12
+ - "what could go wrong", "attack vectors", "threat register"
13
+ - Label-gated: automatically invoked by autopilot-worker Phase 2.5d when `needs-threat-model` or `security-sensitive` label detected
14
+ - Routed from `security-auditor` subagent role for architecture-level security analysis
15
+
16
+ ## Methodology
17
+
18
+ ### Process Flow
19
+
20
+ ```dot
21
+ digraph threat_model {
22
+ rankdir=TB;
23
+ node [shape=box];
24
+
25
+ scope [label="Phase 1\nScope & Data Flow"];
26
+ decompose [label="Phase 2\nComponent Decomposition"];
27
+ stride [label="Phase 3\nSTRIDE Analysis"];
28
+ risk [label="Phase 4\nRisk Assessment"];
29
+ mitigate [label="Phase 5\nMitigation Planning"];
30
+ register [label="Phase 6\nThreat Register", shape=doublecircle];
31
+
32
+ scope -> decompose;
33
+ decompose -> stride;
34
+ stride -> risk;
35
+ risk -> mitigate;
36
+ mitigate -> register;
37
+ }
38
+ ```
39
+
40
+ ### Phase 1 — Scope & Data Flow
41
+
42
+ 1. Define the system boundary — what's being threat-modeled (entire system, single feature, or API surface)
43
+ 2. Identify actors: end users, admins, external services, background jobs
44
+ 3. Map data flows:
45
+ - User input → processing → storage → output
46
+ - Service-to-service communication
47
+ - External API calls
48
+ 4. Identify trust boundaries:
49
+ - Authenticated vs unauthenticated zones
50
+ - Internal vs external network
51
+ - Client-side vs server-side
52
+ - Different privilege levels
53
+ 5. **(Optional) Fetch security context from M365**: If `mcp_workiq_ask_work_iq` is available, query for relevant compliance and security context:
54
+ - Call `mcp_workiq_accept_eula` with `eulaUrl: "https://github.com/microsoft/work-iq-mcp"` (idempotent)
55
+ - **Compliance requirements**: `mcp_workiq_ask_work_iq` → "What compliance requirements, security policies, or regulatory constraints apply to {system/feature}? Check emails, docs, and Teams messages."
56
+ - **Past security discussions**: `mcp_workiq_ask_work_iq` → "What security concerns or vulnerabilities have been discussed about {system/feature} in recent emails and meetings?"
57
+ - **Architecture decisions**: `mcp_workiq_ask_work_iq` → "What architecture or security design decisions were made about {system/feature} in meetings or design docs?"
58
+ - Feed this context into the STRIDE analysis to ensure threats are evaluated against the organization's actual compliance posture and known security concerns.
59
+ - If unavailable, proceed without — the threat model works from code analysis alone.
60
+
61
+ ### Phase 2 — Component Decomposition
62
+
63
+ 1. List each component in the data flow:
64
+ - Frontend (SPA, mobile app, CLI)
65
+ - API gateway / load balancer
66
+ - Application server(s)
67
+ - Database(s)
68
+ - Cache layer
69
+ - Message queue / event bus
70
+ - External services / third-party APIs
71
+ - File storage / CDN
72
+ 2. For each component, note:
73
+ - Technology stack
74
+ - Authentication mechanism
75
+ - Data stored/processed
76
+ - Network exposure (public, internal, VPN)
77
+
78
+ ### Phase 3 — STRIDE Analysis
79
+
80
+ For each component and each data flow crossing a trust boundary, evaluate all six STRIDE categories:
81
+
82
+ | Category | Threat | Key Questions |
83
+ |----------|--------|---------------|
84
+ | **S**poofing | Identity impersonation | Can an attacker pretend to be another user/service? Is authentication enforced at every entry point? Are tokens/sessions properly validated? |
85
+ | **T**ampering | Data modification | Can data be modified in transit or at rest? Are inputs validated? Is there integrity checking (HMAC, checksums)? Can request parameters be manipulated? |
86
+ | **R**epudiation | Deniability of actions | Are actions logged with sufficient detail? Can a user deny performing an action? Are audit logs tamper-proof? |
87
+ | **I**nformation Disclosure | Data exposure | Can sensitive data leak through error messages, logs, API responses, or side channels? Is PII/secrets encrypted at rest and in transit? |
88
+ | **D**enial of Service | Availability threats | Are there rate limits? Can a single request exhaust resources (memory, CPU, disk)? Are there circuit breakers? Can an attacker trigger expensive operations? |
89
+ | **E**levation of Privilege | Unauthorized access | Can a regular user access admin functions? Are authorization checks at every layer (not just frontend)? Can parameters be manipulated to bypass access controls? |
90
+
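Reviewer sketch (not part of the package): Phase 3 asks for every component and every boundary-crossing data flow to be checked against all six STRIDE categories. One way to make sure no pair is skipped is to generate the full worklist up front. The function name `stride_worklist` and the sample targets are hypothetical and chosen only for illustration.

```python
from itertools import product

# The six STRIDE categories, in the order the skill lists them.
STRIDE = (
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information Disclosure",
    "Denial of Service",
    "Elevation of Privilege",
)


def stride_worklist(targets):
    """Return every (target, category) pair that still needs evaluating.

    `targets` is any iterable of component or data-flow names.
    """
    return [(target, category) for target, category in product(targets, STRIDE)]


if __name__ == "__main__":
    for target, category in stride_worklist(["Auth API", "Browser -> API gateway"]):
        print(f"{target}: {category}")
```

Two targets yield twelve checks, so the worklist grows linearly with the number of components and flows identified in Phases 1 and 2.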
91
+ ### Phase 4 — Risk Assessment
92
+
93
+ For each identified threat, assess:
94
+
95
+ 1. **Likelihood** (1-5): How easy is this to exploit?
96
+ - 1 = Requires deep insider knowledge + sophisticated tools
97
+ - 3 = Moderately skilled attacker with publicly available tools
98
+ - 5 = Trivial exploitation, automated scanners can find it
99
+ 2. **Impact** (1-5): What's the damage if exploited?
100
+ - 1 = Minor inconvenience, no data loss
101
+ - 3 = Service disruption, limited data exposure
102
+ - 5 = Full data breach, system compromise, regulatory impact
103
+ 3. **Risk Score** = Likelihood × Impact (1-25)
104
+ - 1-6: Low → Accept or monitor
105
+ - 7-14: Medium → Mitigate within normal development
106
+ - 15-25: High/Critical → Block release until mitigated
107
+
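Reviewer sketch (not part of the package): the Phase 4 scoring rule is plain arithmetic, so it can be written down directly. The function names `risk_score` and `risk_band` are hypothetical; the 1-5 ranges and the band boundaries (1-6, 7-14, 15-25) are taken from the list above.

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Risk Score = Likelihood x Impact, each on the 1-5 scale."""
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return likelihood * impact


def risk_band(score: int) -> str:
    """Map a 1-25 score onto the three bands used by the skill."""
    if not 1 <= score <= 25:
        raise ValueError(f"score must be between 1 and 25, got {score}")
    if score >= 15:
        return "High/Critical"  # block release until mitigated
    if score >= 7:
        return "Medium"  # mitigate within normal development
    return "Low"  # accept or monitor


if __name__ == "__main__":
    score = risk_score(likelihood=4, impact=5)
    print(score, risk_band(score))
```

Because the score is a product of two integers from 1 to 5, some values (7, 11, 13, 14 and others) can never occur; the bands still cover the whole 1-25 range.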
108
+ ### Phase 5 — Mitigation Planning
109
+
110
+ For each threat with risk score ≥ 7:
111
+
112
+ 1. Propose a specific mitigation (not generic "add security")
113
+ 2. Classify the mitigation:
114
+ - **Prevent**: Eliminate the threat entirely (e.g., parameterized queries for SQLi)
115
+ - **Detect**: Monitor and alert (e.g., anomaly detection for DoS)
116
+ - **Respond**: Limit damage (e.g., circuit breakers, rate limits)
117
+ - **Transfer**: Shift risk (e.g., use managed service with SLA)
118
+ 3. Estimate implementation effort: Low / Medium / High
119
+ 4. Identify which phase of the worker pipeline should implement the mitigation:
120
+ - Phase 4 (Implementation): Code-level fixes
121
+ - Phase 5 (Change Mgmt): Configuration changes
122
+ - Phase 7 (Deploy Guard): Operational checks
123
+
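Reviewer sketch (not part of the package): Phase 5 only plans mitigations for threats scoring 7 or higher. The selection step might look like the following. The `ScoredThreat` type and `needs_mitigation` function are hypothetical, and ordering by descending score is an assumption; the skill only says "Priority Order" in its output template.

```python
from dataclasses import dataclass

MITIGATION_THRESHOLD = 7  # from "risk score >= 7" in Phase 5


@dataclass(frozen=True)
class ScoredThreat:
    threat_id: str
    likelihood: int  # 1-5
    impact: int  # 1-5

    @property
    def risk(self) -> int:
        return self.likelihood * self.impact


def needs_mitigation(threats):
    """Keep threats at or above the threshold, highest risk first."""
    selected = [t for t in threats if t.risk >= MITIGATION_THRESHOLD]
    return sorted(selected, key=lambda t: t.risk, reverse=True)


if __name__ == "__main__":
    threats = [
        ScoredThreat("T2", 3, 3),
        ScoredThreat("T3", 2, 3),
        ScoredThreat("T1", 4, 5),
    ]
    for t in needs_mitigation(threats):
        print(t.threat_id, t.risk)
```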
124
+ ### Phase 6 — Threat Register
125
+
126
+ Compile all findings into a structured threat register and:
127
+ 1. Store via `catalyst_knowledge_store` (type: `threat-model`) for future reference
128
+ 2. Write as artifact via `catalyst_artifact_write` (phase: `threat-model`)
129
+ 3. Feed high-risk items into the Phase 3 plan as mandatory implementation steps
130
+
131
+ ## Tools Required
132
+
133
+ - `catalyst_knowledge_store` — Store threat model for future reference
134
+ - `catalyst_knowledge_search` — Query past threat models and security findings
135
+ - `catalyst_artifact_write` — Persist threat register as workflow artifact
136
+ - `catalyst_artifact_read` — Read analysis/architecture artifacts for context
137
+ - `catalyst_metrics_complexity` — Identify complex code that may have more attack surface
138
+ - `mcp_workiq_accept_eula` — (optional) Accept Work IQ EULA before first query
139
+ - `mcp_workiq_ask_work_iq` — (optional) Query M365 for compliance requirements, security discussions, and architecture decisions
140
+
141
+ ## Output Format
142
+
143
+ ```
144
+ ## [Catalyst → Threat Model (STRIDE)]
145
+
146
+ ### Scope
147
+ **System**: {what's being modeled}
148
+ **Actors**: {user types}
149
+ **Trust Boundaries**: {boundary list}
150
+
151
+ ### Data Flow Diagram
152
+ ```
153
+ {text-based data flow: Actor → Component → Data Store → Output}
154
+ ```
155
+
156
+ ### Threat Register
157
+
158
+ | ID | STRIDE | Component | Threat | Likelihood | Impact | Risk | Mitigation | Effort |
159
+ |----|--------|-----------|--------|------------|--------|------|------------|--------|
160
+ | T1 | S | Auth API | ... | 4 | 5 | 20 | ... | Medium |
161
+ | T2 | T | ... | ... | 3 | 3 | 9 | ... | Low |
162
+ | ...| ... | ... | ... | ... | ... | ... | ... | ... |
163
+
164
+ ### Risk Summary
165
+ - **Critical** (15-25): {count} threats → Must mitigate before release
166
+ - **Medium** (7-14): {count} threats → Mitigate within sprint
167
+ - **Low** (1-6): {count} threats → Accept/monitor
168
+
169
+ ### Recommended Mitigations (Priority Order)
170
+ 1. {T-ID}: {mitigation} — {effort} — Phase {N}
171
+ 2. {T-ID}: {mitigation} — {effort} — Phase {N}
172
+ 3. ...
173
+
174
+ ### Confidence: {N}/10
175
+ ```
176
+
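Reviewer sketch (not part of the package): the Threat Register table in the output template has a fixed column layout, so it could be rendered from structured records. The `Threat` dataclass and `render_register` function are hypothetical; the column names and divider row are copied from the template, and sorting by descending risk is an assumption.

```python
from dataclasses import dataclass

HEADER = "| ID | STRIDE | Component | Threat | Likelihood | Impact | Risk | Mitigation | Effort |"
DIVIDER = "|----|--------|-----------|--------|------------|--------|------|------------|--------|"


@dataclass
class Threat:
    id: str
    stride: str  # single-letter STRIDE category, e.g. "S"
    component: str
    threat: str
    likelihood: int
    impact: int
    mitigation: str
    effort: str  # Low / Medium / High

    @property
    def risk(self) -> int:
        return self.likelihood * self.impact


def render_register(threats) -> str:
    """Render threats as a Markdown table, highest risk first."""
    rows = [HEADER, DIVIDER]
    for t in sorted(threats, key=lambda t: t.risk, reverse=True):
        rows.append(
            f"| {t.id} | {t.stride} | {t.component} | {t.threat} | "
            f"{t.likelihood} | {t.impact} | {t.risk} | {t.mitigation} | {t.effort} |"
        )
    return "\n".join(rows)


if __name__ == "__main__":
    print(render_register([
        Threat("T1", "S", "Auth API", "Token replay", 4, 5, "Short-lived tokens", "Medium"),
    ]))
```

Deriving the Risk column from Likelihood and Impact, rather than storing it, keeps the three columns from drifting apart when a score is revised.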
177
+ ## Chains To
178
+
179
+ - `assure-vulnerability-scan` — Complements STRIDE with OWASP/CWE code-level scanning
180
+ - `assure-review-functional` — Security pass covers code-level implementation of mitigations
181
+ - `strategize-architecture-planner` — Architecture decisions should reference the threat model
182
+ - `insights-knowledge-base` — Past threat models inform future analysis
@@ -1,146 +1,146 @@
1
- # Vulnerability Scan
2
-
3
- > **Pillar**: Assure | **ID**: `assure-vulnerability-scan`
4
-
5
- ## Purpose
6
-
7
- Security-focused code analysis mapping findings to OWASP Top 10 and CWE Top 25. Provides actionable remediation with severity scoring, not just warnings.
8
-
9
- ## Activation Triggers
10
-
11
- - "security review", "vulnerability scan", "is this secure", "owasp check"
12
- - "audit for security", "cwe check", "pentest this code"
13
- - Automatically chained when `code-quality` detects security-adjacent patterns
14
-
15
- ## Methodology
16
-
17
- ### Process Flow
18
-
19
- ```dot
20
- digraph vulnerability_scan {
21
- rankdir=TB;
22
- node [shape=box];
23
-
24
- surface [label="Phase 1\nAttack Surface Mapping"];
25
- owasp [label="Phase 2\nOWASP Top 10 Scan"];
26
- cwe [label="Phase 3\nCWE Pattern Matching"];
27
- remediate [label="Phase 4\nRemediation"];
28
- deps [label="Phase 5\nDependency Audit"];
29
- report [label="Report", shape=doublecircle];
30
-
31
- surface -> owasp;
32
- owasp -> cwe;
33
- cwe -> remediate;
34
- remediate -> deps;
35
- deps -> report;
36
- }
37
- ```
38
-
39
- ### Phase 1 — Attack Surface Mapping
40
- 1. Identify all entry points: API endpoints, user inputs, file uploads, URL params
41
- 2. Map data flow from input → processing → storage → output
42
- 3. Identify trust boundaries (authenticated vs. unauthenticated, internal vs. external)
43
- 4. List dependencies and their known vulnerability status
44
-
45
- ### Phase 2 — OWASP Top 10 Scan
46
- Check each applicable category:
47
-
48
- | ID | Category | What to Look For |
49
- |---|---|---|
50
- | A01 | Broken Access Control | Missing auth checks, IDOR, privilege escalation |
51
- | A02 | Cryptographic Failures | Weak hashing, plaintext secrets, poor TLS config |
52
- | A03 | Injection | SQL/NoSQL/OS/LDAP injection, template injection |
53
- | A04 | Insecure Design | Missing rate limits, business logic flaws |
54
- | A05 | Security Misconfiguration | Default creds, verbose errors, unnecessary features |
55
- | A06 | Vulnerable Components | Known CVEs in dependencies |
56
- | A07 | Auth Failures | Weak passwords, missing MFA, session fixation |
57
- | A08 | Data Integrity Failures | Insecure deserialization, unsigned updates |
58
- | A09 | Logging Failures | Insufficient logging, log injection, PII in logs |
59
- | A10 | SSRF | Unvalidated URLs, internal network access |
60
-
61
- ### Phase 3 — CWE Pattern Matching
62
- Map findings to specific CWE entries (e.g., CWE-79 for XSS, CWE-89 for SQL injection). Include CWE ID in every finding.
63
-
64
- ### Phase 4 — Remediation
65
- For each finding:
66
- 1. Explain the vulnerability in plain language
67
- 2. Show the vulnerable code
68
- 3. Provide the fixed code
69
- 4. Explain why the fix works
70
- 5. Rate exploitability: `trivial / moderate / complex`
71
-
72
- ### Phase 5 — Dependency Audit
73
- 1. Parse dependency manifests (package.json, requirements.txt, go.mod, etc.)
74
- 2. Flag dependencies with known CVEs
75
- 3. Suggest version upgrades with breaking change warnings
76
-
77
- ## Tools Required
78
-
79
- - `codebase` — Read source code and dependency files
80
- - `terminal` — Run `npm audit`, `pip audit`, or equivalent
81
- - `fetch` — Check CVE databases for dependency vulnerabilities
82
-
83
- ## Severity Scoring
84
-
85
- <HARD-GATE>
86
- Do NOT mark a scan as "clean" or "no issues" if any Critical or High severity findings exist.
87
- Do NOT downgrade severity to avoid blocking a deployment.
88
- Critical findings MUST be remediated before code is shipped.
89
- </HARD-GATE>
90
-
91
- | Level | Criteria |
92
- |---|---|
93
- | **Critical** | Remote code execution, auth bypass, data exfiltration — exploit is trivial |
94
- | **High** | Significant data exposure, privilege escalation — exploit is moderate |
95
- | **Medium** | Information disclosure, denial of service — exploit requires chaining |
96
- | **Low** | Best practice violation with no direct exploit path |
97
-
98
- ## Output Format
99
-
100
- ```
101
- ## [Catalyst → Vulnerability Scan]
102
-
103
- ### Attack Surface
104
- - Entry points: {N}
105
- - Trust boundaries: {list}
106
- - Dependencies: {N} total, {N} flagged
107
-
108
- ### Findings
109
-
110
- #### [{severity}] {OWASP-ID} — {title} (CWE-{NNN})
111
- **File**: {path}:{line}
112
- **Vulnerability**: {plain language explanation}
113
- **Exploitability**: {trivial/moderate/complex}
114
- **Vulnerable code**:
115
- \`\`\`{lang}
116
- {code}
117
- \`\`\`
118
- **Remediation**:
119
- \`\`\`{lang}
120
- {fixed code}
121
- \`\`\`
122
- **Why this fixes it**: {explanation}
123
-
124
- ---
125
- (repeat per finding)
126
-
127
- ### Dependency Alerts
128
- | Package | Current | Vulnerable | Fixed In | CVE |
129
- |---|---|---|---|---|
130
- | | | | | |
131
-
132
- ### Summary
133
- {critical}/{high}/{medium}/{low} findings | Exploitability: {overall risk}
134
- ```
135
-
136
- ## Chains To
137
-
138
- - `code-quality` — For non-security improvements found during scan
139
- - `deploy-guard` — Security findings should block deployment
140
-
141
- ## Anti-Patterns
142
-
143
- - Do NOT report theoretical vulnerabilities in unreachable code
144
- - Do NOT flag every dependency without checking actual CVE relevance
145
- - Do NOT provide fixes that break functionality to achieve security
146
- - Do NOT skip the "why this fixes it" explanation — it's educational
1
+ # Vulnerability Scan
2
+
3
+ > **Pillar**: Assure | **ID**: `assure-vulnerability-scan`
4
+
5
+ ## Purpose
6
+
7
+ Security-focused code analysis mapping findings to OWASP Top 10 and CWE Top 25. Provides actionable remediation with severity scoring, not just warnings.
8
+
9
+ ## Activation Triggers
10
+
11
+ - "security review", "vulnerability scan", "is this secure", "owasp check"
12
+ - "audit for security", "cwe check", "pentest this code"
13
+ - Automatically chained when `code-quality` detects security-adjacent patterns
14
+
15
+ ## Methodology
16
+
17
+ ### Process Flow
18
+
19
+ ```dot
20
+ digraph vulnerability_scan {
21
+ rankdir=TB;
22
+ node [shape=box];
23
+
24
+ surface [label="Phase 1\nAttack Surface Mapping"];
25
+ owasp [label="Phase 2\nOWASP Top 10 Scan"];
26
+ cwe [label="Phase 3\nCWE Pattern Matching"];
27
+ remediate [label="Phase 4\nRemediation"];
28
+ deps [label="Phase 5\nDependency Audit"];
29
+ report [label="Report", shape=doublecircle];
30
+
31
+ surface -> owasp;
32
+ owasp -> cwe;
33
+ cwe -> remediate;
34
+ remediate -> deps;
35
+ deps -> report;
36
+ }
37
+ ```
38
+
39
+ ### Phase 1 — Attack Surface Mapping
40
+ 1. Identify all entry points: API endpoints, user inputs, file uploads, URL params
41
+ 2. Map data flow from input → processing → storage → output
42
+ 3. Identify trust boundaries (authenticated vs. unauthenticated, internal vs. external)
43
+ 4. List dependencies and their known vulnerability status
44
+
45
+ ### Phase 2 — OWASP Top 10 Scan
46
+ Check each applicable category:
47
+
48
+ | ID | Category | What to Look For |
49
+ |---|---|---|
50
+ | A01 | Broken Access Control | Missing auth checks, IDOR, privilege escalation |
51
+ | A02 | Cryptographic Failures | Weak hashing, plaintext secrets, poor TLS config |
52
+ | A03 | Injection | SQL/NoSQL/OS/LDAP injection, template injection |
53
+ | A04 | Insecure Design | Missing rate limits, business logic flaws |
54
+ | A05 | Security Misconfiguration | Default creds, verbose errors, unnecessary features |
55
+ | A06 | Vulnerable Components | Known CVEs in dependencies |
56
+ | A07 | Auth Failures | Weak passwords, missing MFA, session fixation |
57
+ | A08 | Data Integrity Failures | Insecure deserialization, unsigned updates |
58
+ | A09 | Logging Failures | Insufficient logging, log injection, PII in logs |
59
+ | A10 | SSRF | Unvalidated URLs, internal network access |
60
+
61
+ ### Phase 3 — CWE Pattern Matching
62
+ Map findings to specific CWE entries (e.g., CWE-79 for XSS, CWE-89 for SQL injection). Include CWE ID in every finding.
63
+
64
+ ### Phase 4 — Remediation
65
+ For each finding:
66
+ 1. Explain the vulnerability in plain language
67
+ 2. Show the vulnerable code
68
+ 3. Provide the fixed code
69
+ 4. Explain why the fix works
70
+ 5. Rate exploitability: `trivial / moderate / complex`
71
+
72
+ ### Phase 5 — Dependency Audit
73
+ 1. Parse dependency manifests (package.json, requirements.txt, go.mod, etc.)
74
+ 2. Flag dependencies with known CVEs
75
+ 3. Suggest version upgrades with breaking change warnings
76
+
77
+ ## Tools Required
78
+
79
+ - `codebase` — Read source code and dependency files
80
+ - `terminal` — Run `npm audit`, `pip audit`, or equivalent
81
+ - `fetch` — Check CVE databases for dependency vulnerabilities
82
+
83
+ ## Severity Scoring
84
+
85
+ <HARD-GATE>
86
+ Do NOT mark a scan as "clean" or "no issues" if any Critical or High severity findings exist.
87
+ Do NOT downgrade severity to avoid blocking a deployment.
88
+ Critical findings MUST be remediated before code is shipped.
89
+ </HARD-GATE>
90
+
91
+ | Level | Criteria |
92
+ |---|---|
93
+ | **Critical** | Remote code execution, auth bypass, data exfiltration — exploit is trivial |
94
+ | **High** | Significant data exposure, privilege escalation — exploit is moderate |
95
+ | **Medium** | Information disclosure, denial of service — exploit requires chaining |
96
+ | **Low** | Best practice violation with no direct exploit path |
97
+
98
+ ## Output Format
99
+
100
+ ```
101
+ ## [Catalyst → Vulnerability Scan]
102
+
103
+ ### Attack Surface
104
+ - Entry points: {N}
105
+ - Trust boundaries: {list}
106
+ - Dependencies: {N} total, {N} flagged
107
+
108
+ ### Findings
109
+
110
+ #### [{severity}] {OWASP-ID} — {title} (CWE-{NNN})
111
+ **File**: {path}:{line}
112
+ **Vulnerability**: {plain language explanation}
113
+ **Exploitability**: {trivial/moderate/complex}
114
+ **Vulnerable code**:
115
+ \`\`\`{lang}
116
+ {code}
117
+ \`\`\`
118
+ **Remediation**:
119
+ \`\`\`{lang}
120
+ {fixed code}
121
+ \`\`\`
122
+ **Why this fixes it**: {explanation}
123
+
124
+ ---
125
+ (repeat per finding)
126
+
127
+ ### Dependency Alerts
128
+ | Package | Current | Vulnerable | Fixed In | CVE |
129
+ |---|---|---|---|---|
130
+ | | | | | |
131
+
132
+ ### Summary
133
+ {critical}/{high}/{medium}/{low} findings | Exploitability: {overall risk}
134
+ ```
135
+
136
+ ## Chains To
137
+
138
+ - `code-quality` — For non-security improvements found during scan
139
+ - `deploy-guard` — Security findings should block deployment
140
+
141
+ ## Anti-Patterns
142
+
143
+ - Do NOT report theoretical vulnerabilities in unreachable code
144
+ - Do NOT flag every dependency without checking actual CVE relevance
145
+ - Do NOT provide fixes that break functionality to achieve security
146
+ - Do NOT skip the "why this fixes it" explanation — it's educational