pi-crew 0.5.2 → 0.5.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (137) hide show
  1. package/CHANGELOG.md +183 -0
  2. package/README.md +17 -1
  3. package/docs/architecture.md +2 -0
  4. package/docs/bugs/cross-session-notification-leakage.md +82 -0
  5. package/docs/coding-agent-optimization.md +268 -0
  6. package/docs/deep-review-report.md +384 -0
  7. package/docs/distillation/cybersecurity-patterns.md +294 -0
  8. package/docs/migration-v0.4-v0.5.md +208 -0
  9. package/docs/optimization-plan.md +642 -0
  10. package/docs/pi-crew-v0.5.5-audit-fix-plan.md +133 -0
  11. package/docs/pi-mono-opportunities.md +969 -0
  12. package/docs/pi-mono-review.md +291 -0
  13. package/docs/skills/REFERENCE.md +144 -0
  14. package/package.json +12 -9
  15. package/skills/artifact-analysis-loop/SKILL.md +302 -0
  16. package/skills/async-worker-recovery/SKILL.md +19 -1
  17. package/skills/child-pi-spawning/SKILL.md +19 -6
  18. package/skills/context-artifact-hygiene/SKILL.md +19 -2
  19. package/skills/delegation-patterns/SKILL.md +68 -3
  20. package/skills/detection-pipeline-design/SKILL.md +285 -0
  21. package/skills/event-log-tracing/SKILL.md +20 -6
  22. package/skills/git-master/SKILL.md +20 -6
  23. package/skills/hunting-investigation-loop/SKILL.md +401 -0
  24. package/skills/incident-playbook-construction/SKILL.md +383 -0
  25. package/skills/live-agent-lifecycle/SKILL.md +20 -6
  26. package/skills/mailbox-interactive/SKILL.md +19 -6
  27. package/skills/model-routing-context/SKILL.md +19 -1
  28. package/skills/multi-perspective-review/SKILL.md +19 -4
  29. package/skills/observability-reliability/SKILL.md +19 -2
  30. package/skills/orchestration/SKILL.md +20 -2
  31. package/skills/ownership-session-security/SKILL.md +20 -2
  32. package/skills/pi-extension-lifecycle/SKILL.md +20 -2
  33. package/skills/post-mortem/SKILL.md +7 -2
  34. package/skills/read-only-explorer/SKILL.md +20 -6
  35. package/skills/requirements-to-task-packet/SKILL.md +23 -3
  36. package/skills/resource-discovery-config/SKILL.md +20 -2
  37. package/skills/runtime-state-reader/SKILL.md +20 -2
  38. package/skills/safe-bash/SKILL.md +21 -6
  39. package/skills/scrutinize/SKILL.md +20 -2
  40. package/skills/secure-agent-orchestration-review/SKILL.md +29 -2
  41. package/skills/security-review/SKILL.md +560 -0
  42. package/skills/state-mutation-locking/SKILL.md +22 -2
  43. package/skills/systematic-debugging/SKILL.md +8 -6
  44. package/skills/threat-hypothesis-framework/SKILL.md +175 -0
  45. package/skills/ui-render-performance/SKILL.md +20 -2
  46. package/skills/verification-before-done/SKILL.md +17 -2
  47. package/skills/widget-rendering/SKILL.md +21 -6
  48. package/skills/workspace-isolation/SKILL.md +20 -6
  49. package/skills/worktree-isolation/SKILL.md +20 -6
  50. package/src/agents/agent-config.ts +40 -1
  51. package/src/benchmark/benchmark-runner.ts +45 -0
  52. package/src/benchmark/feedback-loop.ts +5 -0
  53. package/src/config/config.ts +32 -5
  54. package/src/config/role-tools.ts +82 -0
  55. package/src/config/suggestions.ts +8 -0
  56. package/src/config/types.ts +4 -0
  57. package/src/extension/async-notifier.ts +10 -1
  58. package/src/extension/crew-cleanup.ts +114 -0
  59. package/src/extension/cross-extension-rpc.ts +1 -1
  60. package/src/extension/notification-router.ts +18 -0
  61. package/src/extension/register.ts +27 -19
  62. package/src/extension/registration/subagent-tools.ts +1 -1
  63. package/src/extension/team-tool/anchor.ts +201 -0
  64. package/src/extension/team-tool/api.ts +2 -1
  65. package/src/extension/team-tool/auto-summarize.ts +154 -0
  66. package/src/extension/team-tool/run.ts +42 -7
  67. package/src/extension/team-tool.ts +44 -2
  68. package/src/hooks/registry.ts +1 -3
  69. package/src/observability/event-bus.ts +69 -0
  70. package/src/observability/event-to-metric.ts +0 -2
  71. package/src/runtime/anchor-manager.ts +473 -0
  72. package/src/runtime/async-runner.ts +8 -4
  73. package/src/runtime/auto-summarize.ts +350 -0
  74. package/src/runtime/background-runner.ts +10 -3
  75. package/src/runtime/budget-tracker.ts +354 -0
  76. package/src/runtime/chain-runner.ts +507 -0
  77. package/src/runtime/child-pi.ts +123 -35
  78. package/src/runtime/crash-recovery.ts +5 -4
  79. package/src/runtime/crew-agent-runtime.ts +1 -0
  80. package/src/runtime/custom-tools/irc-tool.ts +13 -0
  81. package/src/runtime/custom-tools/submit-result-tool.ts +3 -2
  82. package/src/runtime/delivery-coordinator.ts +10 -3
  83. package/src/runtime/dynamic-script-runner.ts +482 -0
  84. package/src/runtime/foreground-control.ts +87 -17
  85. package/src/runtime/handoff-manager.ts +589 -0
  86. package/src/runtime/hidden-handoff.ts +424 -0
  87. package/src/runtime/live-agent-manager.ts +20 -4
  88. package/src/runtime/live-session-runtime.ts +39 -4
  89. package/src/runtime/manifest-cache.ts +2 -1
  90. package/src/runtime/model-resolver.ts +16 -4
  91. package/src/runtime/phase-tracker.ts +373 -0
  92. package/src/runtime/pi-args.ts +11 -1
  93. package/src/runtime/pi-json-output.ts +31 -0
  94. package/src/runtime/pipeline-runner.ts +514 -0
  95. package/src/runtime/progress-tracker.ts +124 -0
  96. package/src/runtime/retry-runner.ts +354 -0
  97. package/src/runtime/sandbox.ts +252 -0
  98. package/src/runtime/scheduler.ts +7 -2
  99. package/src/runtime/skill-effectiveness.ts +473 -0
  100. package/src/runtime/skill-instructions.ts +37 -3
  101. package/src/runtime/subagent-manager.ts +1 -1
  102. package/src/runtime/task-graph.ts +11 -1
  103. package/src/runtime/task-runner.ts +92 -18
  104. package/src/runtime/team-runner.ts +13 -12
  105. package/src/runtime/tool-progress.ts +10 -3
  106. package/src/runtime/verification-gates.ts +367 -0
  107. package/src/schema/team-tool-schema.ts +37 -0
  108. package/src/skills/discover-skills.ts +5 -0
  109. package/src/state/active-run-registry.ts +9 -2
  110. package/src/state/contracts.ts +9 -0
  111. package/src/state/crew-init.ts +3 -3
  112. package/src/state/decision-ledger.ts +98 -55
  113. package/src/state/event-log-rotation.ts +2 -2
  114. package/src/state/event-log.ts +144 -10
  115. package/src/state/hook-instinct-bridge.ts +5 -5
  116. package/src/state/mailbox.ts +10 -0
  117. package/src/state/run-cache.ts +18 -8
  118. package/src/state/state-store.ts +3 -1
  119. package/src/state/types.ts +4 -0
  120. package/src/tools/safe-bash-extension.ts +1 -0
  121. package/src/tools/safe-bash.ts +152 -20
  122. package/src/types/new-api-types.ts +34 -0
  123. package/src/ui/agent-management-overlay.ts +5 -1
  124. package/src/ui/crew-widget.ts +29 -15
  125. package/src/ui/overlays/mailbox-detail-overlay.ts +13 -2
  126. package/src/ui/powerbar-publisher.ts +101 -7
  127. package/src/ui/tool-render.ts +15 -15
  128. package/src/ui/transcript-cache.ts +13 -0
  129. package/src/utils/bm25-search.ts +16 -8
  130. package/src/utils/env-filter.ts +8 -5
  131. package/src/utils/redaction.ts +169 -15
  132. package/src/utils/session-utils.ts +52 -0
  133. package/src/utils/sse-parser.ts +10 -1
  134. package/src/worktree/cleanup.ts +6 -1
  135. package/src/worktree/worktree-manager.ts +32 -13
  136. package/workflows/chain.workflow.md +252 -0
  137. package/workflows/pipeline.workflow.md +27 -0
@@ -0,0 +1,560 @@
1
+ ---
2
+ name: security-review
3
+ description: "Security review patterns with audit and detection authoring."
4
+ triggers:
5
+ - "security review"
6
+ - "vulnerability scan"
7
+ - "audit"
8
+ - "pen test"
9
+ - "build detection rule"
10
+ ---
11
+ # Security Review Skill
12
+
13
+ **Version:** 1.0.0
14
+ **Author:** pi-crew team
15
+ **Source:** `source/Anthropic-Cybersecurity-Skills/` distillation
16
+
17
+ ## Overview
18
+
19
+ Security review patterns for pi-crew multi-agent orchestration.
20
+ Based on MITRE ATLAS v5.4, NIST AI RMF, and Anthropic Cybersecurity Skills.
21
+
22
+ ## TRIGGERS
23
+
24
+ Trigger this skill when:
25
+ - User requests: "security review", "vulnerability scan", "audit", "pen test"
26
+ - Keywords: security, vulnerability, auth, owasp, injection, xss, csrf, exploit
27
+ - Actions: `team action='run', team='review'`
28
+ - High-risk tasks routed by autonomous policy
29
+
30
+ ## Audit Finding Prioritization (from benchmark-based auditing)
31
+
32
+ Use audit frameworks to systematically assess security posture:
33
+
34
+ ### Audit Workflow
35
+
36
+ ```markdown
37
+ ## Audit Process
38
+
39
+ 1. **Select Standard** → [CIS, OWASP-Top10, SOC2, NIST, custom]
40
+ 2. **Identify Controls** → Which security controls to check
41
+ 3. **Run Checks** → Automated (Semgrep, npm audit) or manual inspection
42
+ 4. **Document Findings** → [passed, failed, warning, info]
43
+ 5. **Calculate Score** → Compliance: [X/Y passed as percentage]
44
+ 6. **Prioritize** → By risk: [Critical, High, Medium, Low]
45
+ 7. **Remediate** → Fix in priority order
46
+ 8. **Verify** → Confirm fixes resolved findings
47
+ ```
48
+
49
+ ### Audit Finding Structure
50
+
51
+ ```yaml
52
+ audit_finding:
53
+ control_id: string # e.g., "CIS-1.1", "OWASP-A1"
54
+ control_name: string # Human-readable name
55
+ status: [pass|fail|warning|info|not_applicable]
56
+ severity: [critical|high|medium|low]
57
+ evidence:
58
+ - file: string # File with issue
59
+ line: number
60
+ description: string
61
+ recommendation: string # How to fix
62
+ effort: [low|medium|high] # Implementation effort
63
+ benefit: [low|medium|high] # Security improvement
64
+ ```
65
+
66
+ ### Prioritization Matrix
67
+
68
+ | Severity \ Effort | Low | Medium | High |
69
+ |-------------------|-----|--------|------|
70
+ | **Critical** | Fix now | Fix now | Priority |
71
+ | **High** | Fix now | Priority | Schedule |
72
+ | **Medium** | Priority | Schedule | Later |
73
+ | **Low** | Schedule | Later | Later |
74
+
75
+ Sort by: severity DESC, effort ASC, benefit DESC
76
+
77
+ ### Audit Check Examples
78
+
79
+ ```bash
80
+ # OWASP Top 10 check
81
+ semgrep --config=owasp-top-10 .
82
+
83
+ # Dependency audit
84
+ npm audit --audit-level=high
85
+
86
+ # Secrets detection
87
+ trufflehog3 filesystem .
88
+
89
+ # CIS Benchmark (cloud)
90
+ prowler aws --output-format json
91
+ ```
92
+
93
+ ## Detection Signature Authoring (from detection rule building)
94
+
95
+ Build detection rules to identify vulnerabilities and attack patterns:
96
+
97
+ ### Detection Workflow
98
+
99
+ ```markdown
100
+ ## Detection Authoring Process
101
+
102
+ 1. **Identify Target** → What to detect: [vulnerability, pattern, anomaly]
103
+ 2. **Define Source** → Where to look: [files, logs, events, network]
104
+ 3. **Create Pattern** → Match logic: [regex, AST pattern, rule]
105
+ 4. **Tune** → Adjust thresholds: [reduce noise, increase sensitivity]
106
+ 5. **Test** → Validate: [true positive, false positive rate]
107
+ 6. **Deploy** → Activate monitoring: [hook, alert, block]
108
+ 7. **Monitor** → Track: [alerts triggered, quality]
109
+ ```
110
+
111
+ ### Detection Rule Structure
112
+
113
+ ```yaml
114
+ detection:
115
+ name: string
116
+ description: string
117
+ severity: [critical|high|medium|low]
118
+ target:
119
+ type: [vulnerability|pattern|anomaly]
120
+ techniques: [MITRE ATT&CK IDs]
121
+ source:
122
+ - type: [file|log|network|event]
123
+ locations: [path, glob, endpoint]
124
+ pattern:
125
+ type: [regex|AST|signature|heuristic]
126
+ match: string_or_structure
127
+ exclude: [false_positive_patterns]
128
+ threshold:
129
+ count: int
130
+ time_window: duration
131
+ response:
132
+ alert: [severity, message]
133
+ block: boolean
134
+ log: boolean
135
+ tuning:
136
+ false_positives: [known_noise]
137
+ sensitivity: [high|medium|low]
138
+ validation:
139
+ test_cases:
140
+ - input: string
141
+ expected: [match|no_match]
142
+ true_positive_rate: float
143
+ false_positive_rate: float
144
+ ```
145
+
146
+ ### Detection Rule Examples
147
+
148
+ ```yaml
149
+ # SQL Injection Detection
150
+ detection:
151
+ name: sql-injection-pattern
152
+ severity: critical
153
+ pattern:
154
+ type: regex
155
+ match: '(union|select|insert|update|delete).*from'
156
+ exclude:
157
+ - '// comment with select'
158
+ - 'userProvidedQuery = "safe_value"'
159
+
160
+ # Log4j Detection (CVE-2021-44228)
161
+ pattern:
162
+ type: regex
163
+ match: '\$\{jndi:ldap://'
164
+
165
+ # Sensitive Data in Logs
166
+ pattern:
167
+ type: regex
168
+ match: '(password|secret|token|key)\s*[=:]\s*["\']?[\w+/]{20,}'
169
+ ```
170
+
171
+ ## ENFORCE
172
+
173
+ ### Gate 1: PATH TRAVERSAL (RED → GREEN)
174
+ ```
175
+ RED: Any unvalidated path operation (read/write/exec)
176
+ YELLOW: Path validated but without symlink check
177
+ GREEN: Path validated with assertSafePathId() + resolveRealContainedPath()
178
+ ```
179
+
180
+ ### Gate 2: PROMPT INJECTION (RED → Green)
181
+ ```
182
+ RED: Untrusted input passed to model without sanitization
183
+ YELLOW: Partial sanitization (regex only, no context markers)
184
+ GREEN: Full sanitization with injection markers + context isolation
185
+ ```
186
+
187
+ ### Gate 3: SECRET EXPOSURE (RED → Green)
188
+ ```
189
+ RED: *** values visible in logs/artifacts/transcripts
190
+ YELLOW: Partial redaction (logs only, not artifacts)
191
+ GREEN: Full redaction via redactEvent(), sanitizeEnvSecrets()
192
+ ```
193
+
194
+ ### Gate 4: SUPPLY CHAIN (RED → Green)
195
+ ```
196
+ RED: Dependencies from untrusted sources without verification
197
+ YELLOW: Lockfile checked but package integrity not verified
198
+ GREEN: Package integrity verified + npm audit + typosquatting check
199
+ ```
200
+
201
+ ## PATTERNS
202
+
203
+ ### Pattern 1: Agent Context Poisoning Detection
204
+
205
+ **MITRE ATLAS:** AML.T0051 (Prompt Injection), AML.T0054 (Jailbreak)
206
+
207
+ ```typescript
208
+ // Check for injection markers in user input
209
+ const INJECTION_PATTERNS = [
210
+ /\b(ignore|disregard|forget)\s+(previous|all|above)\s+(instructions|prompts)/i,
211
+ /\b(you\s+are\s+now|act\s+as|pretend)\s+\w+/i,
212
+ /<\s*script\s*>/i,
213
+ /\{\{.*?\}\}/, // Template injection
214
+ /\$\{.*?\}/, // Variable injection
215
+ /\[\s*system\s*\]/i,
216
+ /\[\s*assistant\s*\]/i,
217
+ ];
218
+
219
+ function detectInjection(input: string): boolean {
220
+ return INJECTION_PATTERNS.some(p => p.test(input));
221
+ }
222
+
223
+ // Check task packet for poisoned context
224
+ function validateTaskPacket(packet: TaskPacket): ValidationResult {
225
+ const injections = detectInjection(packet.prompt);
226
+ if (injections) {
227
+ return {
228
+ severity: 'critical',
229
+ category: 'prompt-injection',
230
+ evidence: packet.prompt,
231
+ recommendation: 'Sanitize input with injection markers',
232
+ };
233
+ }
234
+ return { severity: 'pass', category: 'context-integrity', evidence: null };
235
+ }
236
+ ```
237
+
238
+ ### Pattern 2: Path Traversal Prevention
239
+
240
+ **MITRE ATLAS:** ATT&CK T1059 (Command & Scripting Interpreter)
241
+
242
+ ```typescript
243
+ import { assertSafePathId, resolveContainedPath } from '../utils/safe-paths.ts';
244
+
245
+ function safeFileOperation(path: string, cwd: string): SafePathResult {
246
+ // Step 1: Validate path ID format
247
+ assertSafePathId(path);
248
+
249
+ // Step 2: Resolve to absolute with containment
250
+ const resolved = resolveContainedPath(path, cwd);
251
+
252
+ // Step 3: Verify resolved path is within cwd
253
+ if (!resolved.startsWith(cwd)) {
254
+ return {
255
+ safe: false,
256
+ reason: 'Path escapes working directory',
257
+ resolved: undefined,
258
+ };
259
+ }
260
+
261
+ return {
262
+ safe: true,
263
+ resolved,
264
+ reason: 'Path validated and contained',
265
+ };
266
+ }
267
+ ```
268
+
269
+ ### Pattern 3: Supply Chain Security
270
+
271
+ **MITRE ATLAS:** AML.T0010 (Supply Chain), AML.T0104 (Software Supply Chain)
272
+
273
+ ```typescript
274
+ const TRUSTED_NPM_SOURCES = [
275
+ 'registry.npmjs.org',
276
+ 'registry.npmmirror.com',
277
+ ];
278
+
279
+ function validateNpmPackage(manifest: PackageManifest): ValidationResult {
280
+ // Check for typosquatting
281
+ const suspiciousNames = detectTyposquatting(manifest.name);
282
+ if (suspiciousNames.length > 0) {
283
+ return {
284
+ severity: 'high',
285
+ category: 'typosquatting',
286
+ evidence: `Package name similar to: ${suspiciousNames.join(', ')}`,
287
+ };
288
+ }
289
+
290
+ // Check for post-install scripts
291
+ if (manifest.scripts?.postinstall && !isTrustedSource(manifest)) {
292
+ return {
293
+ severity: 'medium',
294
+ category: 'supply-chain',
295
+ evidence: 'Post-install script detected',
296
+ };
297
+ }
298
+
299
+ // Check dependencies
300
+ const dangerousDeps = findDangerousDependencies(manifest.dependencies);
301
+ if (dangerousDeps.length > 0) {
302
+ return {
303
+ severity: 'high',
304
+ category: 'dependency-confusion',
305
+ evidence: dangerousDeps,
306
+ };
307
+ }
308
+
309
+ return { severity: 'pass', category: 'supply-chain', evidence: null };
310
+ }
311
+ ```
312
+
313
+ ### Pattern 4: Secret Redaction
314
+
315
+ **MITRE ATLAS:** AML.T0067 (Exfiltrate Training Data)
316
+
317
+ ```typescript
318
+ import { redactEvent } from '../state/event-log.ts';
319
+
320
+ const SECRET_PATTERNS = [
321
+ /\b(?:api[_-]?key|secret|token|password|credential)["\s]*[=:]["\s]*[A-Za-z0-9+/]{20,}/gi,
322
+ /\b(?:ghp|github)_[A-Za-z0-9]{36,}/g,
323
+ /\bBearer\s+[A-Za-z0-9+/=_.-]{20,}/g,
324
+ /\b[A-Za-z0-9+/]{40,}={0,2}\b/g, // Generic long base64
325
+ ];
326
+
327
+ function redactSecrets(content: string): string {
328
+ let redacted = content;
329
+ for (const pattern of SECRET_PATTERNS) {
330
+ redacted = redacted.replace(pattern, '***REDACTED***');
331
+ }
332
+ return redacted;
333
+ }
334
+
335
+ // Apply to all event types
336
+ function safeLogEvent(event: CrewEvent): void {
337
+ const redacted = redactEvent(event); // Built-in redaction
338
+ appendEvent(redacted);
339
+ }
340
+ ```
341
+
342
+ ### Pattern 5: Race Condition Detection
343
+
344
+ **MITRE ATLAS:** AML.T0054 (Privilege Escalation via Race)
345
+
346
+ ```typescript
347
+ // Detect timing attacks and race conditions
348
+ const RACE_CONDITION_PATTERNS = [
349
+ { pattern: /appendFileSync.*race/i, severity: 'medium' },
350
+ { pattern: /readFileSync.*writeFileSync.*race/i, severity: 'high' },
351
+ { pattern: /mkdirSync.*mkdir.*race/i, severity: 'medium' },
352
+ ];
353
+
354
+ function detectRaceConditions(code: string): Finding[] {
355
+ const findings: Finding[] = [];
356
+
357
+ // Check for file operation races
358
+ for (const { pattern, severity } of RACE_CONDITION_PATTERNS) {
359
+ if (pattern.test(code)) {
360
+ findings.push({
361
+ severity,
362
+ category: 'race-condition',
363
+ pattern: pattern.source,
364
+ recommendation: 'Use atomic write or filesystem locking',
365
+ });
366
+ }
367
+ }
368
+
369
+ // Check for timing-sensitive operations
370
+ if (code.includes('setTimeout') && code.includes('auth')) {
371
+ findings.push({
372
+ severity: 'medium',
373
+ category: 'timing-attack',
374
+ recommendation: 'Add constant-time comparison for auth checks',
375
+ });
376
+ }
377
+
378
+ return findings;
379
+ }
380
+ ```
381
+
382
+ ### Pattern 6: Authentication Anomaly Detection
383
+
384
+ **MITRE ATLAS:** AML.T0043 (Auth Failure), AML.T0018 (Token Theft)
385
+
386
+ ```typescript
387
+ interface AuthPattern {
388
+ sessionId: string;
389
+ timestamp: number;
390
+ failures: number;
391
+ source: string;
392
+ }
393
+
394
+ function detectAuthAnomalies(sessions: AuthPattern[]): Finding[] {
395
+ const findings: Finding[] = [];
396
+
397
+ // Brute force detection
398
+ for (const session of sessions) {
399
+ if (session.failures > 5) {
400
+ findings.push({
401
+ severity: 'high',
402
+ category: 'brute-force',
403
+ evidence: `${session.failures} auth failures from ${session.source}`,
404
+ });
405
+ }
406
+
407
+ // Token reuse detection
408
+ if (session.timestamp < Date.now() - 3600000) {
409
+ findings.push({
410
+ severity: 'medium',
411
+ category: 'token-reuse',
412
+ evidence: 'Stale session token used',
413
+ });
414
+ }
415
+ }
416
+
417
+ // Session fixation
418
+ const predictableIds = sessions.filter(s =>
419
+ /^(session|team|run)_[a-z0-9]{8}$/i.test(s.sessionId)
420
+ );
421
+ if (predictableIds.length > 0) {
422
+ findings.push({
423
+ severity: 'medium',
424
+ category: 'session-fixation',
425
+ evidence: 'Predictable session ID pattern detected',
426
+ });
427
+ }
428
+
429
+ return findings;
430
+ }
431
+ ```
432
+
433
+ ### Pattern 7: Tool Invocation Abuse Detection
434
+
435
+ **MITRE ATLAS:** AML.T0051 (Prompt Injection)
436
+
437
+ ```typescript
438
+ interface ToolMetrics {
439
+ toolName: string;
440
+ callCount: number;
441
+ timeWindow: number;
442
+ anomalies: string[];
443
+ }
444
+
445
+ function detectToolAbuse(metrics: ToolMetrics[]): Finding[] {
446
+ const findings: Finding[] = [];
447
+ const RATE_THRESHOLD = 10; // calls per minute
448
+ const BURST_THRESHOLD = 20; // calls in 30 seconds
449
+
450
+ for (const metric of metrics) {
451
+ // Rate limiting
452
+ const rate = metric.callCount / (metric.timeWindow / 60000);
453
+ if (rate > RATE_THRESHOLD) {
454
+ findings.push({
455
+ severity: 'high',
456
+ category: 'tool-abuse',
457
+ evidence: `${metric.toolName}: ${rate.toFixed(1)} calls/min (threshold: ${RATE_THRESHOLD})`,
458
+ recommendation: 'Implement rate limiting or throttling',
459
+ });
460
+ }
461
+
462
+ // Burst detection
463
+ if (metric.callCount > BURST_THRESHOLD && metric.timeWindow < 30000) {
464
+ findings.push({
465
+ severity: 'critical',
466
+ category: 'tool-burst',
467
+ evidence: `${metric.toolName}: ${metric.callCount} calls in <30s`,
468
+ recommendation: 'Block tool and investigate source',
469
+ });
470
+ }
471
+ }
472
+
473
+ return findings;
474
+ }
475
+ ```
476
+
477
+ ### Pattern 8: Malicious Skill Loading Detection
478
+
479
+ **MITRE ATLAS:** AML.T0062 (Exfiltrate Data via ML)
480
+
481
+ ```typescript
482
+ const UNSAFE_SKILL_PATTERNS = [
483
+ /(^|\/)\.\.(\/|$)/, // Path traversal
484
+ /^[A-Z]:/i, // Windows absolute path
485
+ /^\//, // Unix absolute path
486
+ /\.exe$|\.dll$|\.so$/i, // Binary files
487
+ /<script|SQL|SELECT.*FROM/i, // Script injection
488
+ ];
489
+
490
+ function validateSkillPath(path: string): ValidationResult {
491
+ if (!path || path.includes('\0')) {
492
+ return {
493
+ safe: false,
494
+ reason: 'Null byte or empty path',
495
+ category: 'malicious-skill',
496
+ };
497
+ }
498
+
499
+ for (const pattern of UNSAFE_SKILL_PATTERNS) {
500
+ if (pattern.test(path)) {
501
+ return {
502
+ safe: false,
503
+ reason: `Path matches unsafe pattern: ${pattern}`,
504
+ category: 'malicious-skill',
505
+ };
506
+ }
507
+ }
508
+
509
+ // Check if skill exists and is readable
510
+ if (!existsSync(path)) {
511
+ return {
512
+ safe: false,
513
+ reason: 'Skill file does not exist',
514
+ category: 'missing-skill',
515
+ };
516
+ }
517
+
518
+ return {
519
+ safe: true,
520
+ reason: 'Skill path validated',
521
+ category: 'skill-path',
522
+ };
523
+ }
524
+ ```
525
+
526
+ ---
527
+
528
+ ## TOOLS
529
+
530
+ | Tool | Purpose |
531
+ |------|---------|
532
+ | `assertSafePathId()` | Path ID format validation |
533
+ | `resolveContainedPath()` | Path containment resolution |
534
+ | `redactEvent()` | Event log redaction |
535
+ | `sanitizeEnvSecrets()` | Environment variable sanitization |
536
+ | `sanitizeTaskPacket()` | Task packet sanitization |
537
+ | `atomicWriteJson()` | Atomic file writes |
538
+
539
+ ---
540
+
541
+ ## METRICS
542
+
543
+ | Metric | Target |
544
+ |--------|--------|
545
+ | Path traversal findings | 0 critical |
546
+ | Secret exposure | 0 in any artifact |
547
+ | Supply chain issues | <5 medium |
548
+ | Race conditions | <2 medium |
549
+ | Tool abuse detection | 100% coverage |
550
+
551
+ ---
552
+
553
+ *See also: `docs/distillation/cybersecurity-patterns.md`*
554
+ ## Anti-Patterns
555
+
556
+ - **Don't** skip path traversal checks when dealing with user input
557
+ - **Don't** trust agent output without auditing artifacts
558
+ - **Don't** run security review without knowing the data flow boundaries
559
+ - **Don't** skip secrets detection in configuration and environment files
560
+ - **Don't** skip supply chain checks when using external packages
@@ -1,8 +1,15 @@
1
1
  ---
2
2
  name: state-mutation-locking
3
- description: Durable state mutation and locking workflow. Use when changing manifests, tasks, mailbox, claims, events, stale reconciliation, recovery, cancel/respond/resume, or retry logic.
4
- ---
3
+ description: "Durable state mutation and locking workflow."
4
+ triggers:
5
+ - "modify manifest"
6
+ - "update tasks"
7
+ - "stale reconciliation"
8
+ - "cancel run"
9
+ - "respond to task"
10
+
5
11
 
12
+ ---
6
13
  # state-mutation-locking
7
14
 
8
15
  Use this skill before modifying pi-crew run state.
@@ -25,6 +32,19 @@ Use this skill before modifying pi-crew run state.
25
32
  - In retry/resume paths, reload fresh task status immediately before execution and skip if the task is no longer retryable/runnable.
26
33
  - Include event-log entries for externally visible state changes.
27
34
 
35
+ ## Enforcement — State Mutation Locking Gate
36
+
37
+ **Before mutating run state, verify:**
38
+
39
+ - [ ] Run lock acquired before mutation (concurrent actions possible)
40
+ - [ ] Manifest/tasks re-read inside the lock before decision
41
+ - [ ] Atomic write helpers used (atomicWriteJson or state-store helpers)
42
+ - [ ] Status contracts respected (no terminal transitions without force semantics)
43
+ - [ ] Event-log entries emitted for externally visible state changes
44
+ - [ ] Retry paths reload fresh task status before execution
45
+
46
+ If ANY answer is NO → Stop. Verify locking and atomicity before mutating.
47
+
28
48
  ## Anti-patterns
29
49
 
30
50
  - Reading state, waiting/doing async work, then writing the old copy.
@@ -1,8 +1,14 @@
1
1
  ---
2
2
  name: systematic-debugging
3
- description: "Four-phase debugging discipline with refuse gates. Use when encountering a bug, test failure, blocked run, provider error, stale state, crash, or unexpected behavior. Triggers: debug this, investigate, fix this bug, something is broken, crash, error, test failed, it broke, not working, unexpected."
4
- ---
3
+ description: "Four-phase debugging discipline with refuse gates."
4
+ triggers:
5
+ - "debug this"
6
+ - "investigate"
7
+ - "fix this bug"
8
+ - "test failed"
9
+ - "crash"
5
10
 
11
+ ---
6
12
  # systematic-debugging
7
13
 
8
14
  Core principle: no fixes without root-cause investigation first. Symptom patches create new bugs and hide the real failure.
@@ -20,8 +26,6 @@ Before beginning any debug session, recite these four steps:
20
26
 
21
27
  If the user says "skip the ritual" → skip the recitation but still apply the four phases silently.
22
28
 
23
- ---
24
-
25
29
  ## Refuse Gate — Do NOT Proceed Without These
26
30
 
27
31
  Before proposing ANY fix:
@@ -37,8 +41,6 @@ If ANY answer is NO:
37
41
 
38
42
  Exception: if the user explicitly says "just patch the symptom" — proceed but flag it as a symptom patch, not a root-cause fix.
39
43
 
40
- ---
41
-
42
44
  ## Four Phases
43
45
 
44
46
  ### 1. Root Cause Investigation