pi-crew 0.5.2 → 0.5.5

Files changed (80)
  1. package/CHANGELOG.md +67 -0
  2. package/docs/bugs/cross-session-notification-leakage.md +82 -0
  3. package/docs/coding-agent-optimization.md +268 -0
  4. package/docs/deep-review-report.md +384 -0
  5. package/docs/distillation/cybersecurity-patterns.md +294 -0
  6. package/docs/migration-v0.4-v0.5.md +191 -0
  7. package/docs/optimization-plan.md +642 -0
  8. package/docs/pi-mono-opportunities.md +969 -0
  9. package/docs/pi-mono-review.md +291 -0
  10. package/docs/skills/REFERENCE.md +144 -0
  11. package/package.json +7 -6
  12. package/skills/artifact-analysis-loop/SKILL.md +302 -0
  13. package/skills/async-worker-recovery/SKILL.md +19 -1
  14. package/skills/child-pi-spawning/SKILL.md +19 -6
  15. package/skills/context-artifact-hygiene/SKILL.md +19 -2
  16. package/skills/delegation-patterns/SKILL.md +68 -3
  17. package/skills/detection-pipeline-design/SKILL.md +285 -0
  18. package/skills/event-log-tracing/SKILL.md +20 -6
  19. package/skills/git-master/SKILL.md +20 -6
  20. package/skills/hunting-investigation-loop/SKILL.md +401 -0
  21. package/skills/incident-playbook-construction/SKILL.md +383 -0
  22. package/skills/live-agent-lifecycle/SKILL.md +20 -6
  23. package/skills/mailbox-interactive/SKILL.md +19 -6
  24. package/skills/model-routing-context/SKILL.md +19 -1
  25. package/skills/multi-perspective-review/SKILL.md +19 -4
  26. package/skills/observability-reliability/SKILL.md +19 -2
  27. package/skills/orchestration/SKILL.md +20 -2
  28. package/skills/ownership-session-security/SKILL.md +20 -2
  29. package/skills/pi-extension-lifecycle/SKILL.md +20 -2
  30. package/skills/post-mortem/SKILL.md +7 -2
  31. package/skills/read-only-explorer/SKILL.md +20 -6
  32. package/skills/requirements-to-task-packet/SKILL.md +23 -3
  33. package/skills/resource-discovery-config/SKILL.md +20 -2
  34. package/skills/runtime-state-reader/SKILL.md +20 -2
  35. package/skills/safe-bash/SKILL.md +21 -6
  36. package/skills/scrutinize/SKILL.md +20 -2
  37. package/skills/secure-agent-orchestration-review/SKILL.md +29 -2
  38. package/skills/security-review/SKILL.md +560 -0
  39. package/skills/state-mutation-locking/SKILL.md +22 -2
  40. package/skills/systematic-debugging/SKILL.md +8 -6
  41. package/skills/threat-hypothesis-framework/SKILL.md +175 -0
  42. package/skills/ui-render-performance/SKILL.md +20 -2
  43. package/skills/verification-before-done/SKILL.md +17 -2
  44. package/skills/widget-rendering/SKILL.md +21 -6
  45. package/skills/workspace-isolation/SKILL.md +20 -6
  46. package/skills/worktree-isolation/SKILL.md +20 -6
  47. package/src/agents/agent-config.ts +40 -1
  48. package/src/config/config.ts +22 -5
  49. package/src/config/role-tools.ts +82 -0
  50. package/src/config/types.ts +4 -0
  51. package/src/extension/crew-cleanup.ts +114 -0
  52. package/src/extension/register.ts +15 -3
  53. package/src/extension/team-tool/run.ts +7 -7
  54. package/src/observability/event-bus.ts +60 -0
  55. package/src/runtime/background-runner.ts +8 -2
  56. package/src/runtime/child-pi.ts +122 -34
  57. package/src/runtime/crew-agent-runtime.ts +1 -0
  58. package/src/runtime/foreground-control.ts +87 -17
  59. package/src/runtime/pi-args.ts +11 -1
  60. package/src/runtime/pi-json-output.ts +31 -0
  61. package/src/runtime/progress-tracker.ts +124 -0
  62. package/src/runtime/skill-effectiveness.ts +473 -0
  63. package/src/runtime/skill-instructions.ts +37 -3
  64. package/src/runtime/task-runner.ts +91 -17
  65. package/src/runtime/team-runner.ts +11 -11
  66. package/src/runtime/tool-progress.ts +10 -3
  67. package/src/runtime/verification-gates.ts +367 -0
  68. package/src/schema/team-tool-schema.ts +7 -0
  69. package/src/state/decision-ledger.ts +92 -43
  70. package/src/state/event-log.ts +136 -10
  71. package/src/state/hook-instinct-bridge.ts +5 -5
  72. package/src/state/state-store.ts +3 -1
  73. package/src/state/types.ts +4 -0
  74. package/src/types/new-api-types.ts +34 -0
  75. package/src/ui/agent-management-overlay.ts +5 -1
  76. package/src/ui/crew-widget.ts +29 -15
  77. package/src/ui/powerbar-publisher.ts +100 -7
  78. package/src/ui/tool-render.ts +15 -15
  79. package/src/utils/session-utils.ts +52 -0
  80. package/src/worktree/worktree-manager.ts +32 -13
@@ -0,0 +1,383 @@
1
+ ---
2
+ name: incident-playbook-construction
3
+ description: "Build structured incident response playbooks and runbooks."
4
+ triggers:
5
+ - "build playbook"
6
+ - "create runbook"
7
+ - "document procedure"
8
+ - "IR automation"
9
+ - "SOAR design"
10
+ ---
11
+ # incident-playbook-construction
12
+
13
+ Use this skill when building incident response playbooks and structured procedures.
14
+
15
+ ## Source
16
+
17
+ Distilled from `building-incident-response-playbook` (Anthropic Cybersecurity Skills) and generalized for software/team context.
18
+
19
+ ## When to Use
20
+
21
+ - Creating new incident response procedures
22
+ - Documenting procedures for a specific incident type
23
+ - Automating response workflows
24
+ - Preparing for compliance audits (SOC 2, HIPAA, PCI-DSS)
25
+ - Conducting gap analysis of existing response capabilities
26
+
27
+ ## Playbook Structure
28
+
29
+ ```yaml
30
+ playbook:
31
+   name: string # e.g., "data-breach-response"
32
+   version: string # e.g., "1.0.0"
33
+   trigger:
34
+     conditions:
35
+       - name: string
36
+         description: string
37
+   scope:
38
+     affected:
39
+       - [files, systems, teams]
40
+     not_in_scope:
41
+       - [what this playbook doesn't cover]
42
+   steps:
43
+     - id: number
44
+       name: string
45
+       action: string # What to do
46
+       verify: string # How to confirm success
47
+       on_success: next_step_id # or "close"
48
+       on_failure: escalation_id # or "abort"
49
+   decision_tree:
50
+     - condition: string # e.g., "data_encrypted == true"
51
+       branches:
52
+         yes: step_id
53
+         no: step_id
54
+   escalation:
55
+     - id: number
56
+       condition: string
57
+       action: string # notify, escalate, abort
58
+       notify: [roles]
59
+   raci:
60
+     responsible: [role]
61
+     accountable: [role]
62
+     consulted: [role]
63
+     informed: [role]
64
+   sla:
65
+     detection: duration # e.g., "15m"
66
+     containment: duration
67
+     eradication: duration
68
+     recovery: duration
69
+     lessons_learned: duration
70
+ ```
71
+
72
+ ## Playbook Workflow
73
+
74
+ ```markdown
75
+ ## Playbook Construction Process
76
+
77
+ 1. **Identify Type** → [bug-fix, security-incident, outage, data-breach, supply-chain]
78
+ 2. **Define Scope** → [affected files, teams, systems]
79
+ 3. **Document Steps** → Numbered procedure: [step → action → verification]
80
+ 4. **Add Decisions** → Branch points: [if X then Y else Z]
81
+ 5. **Specify Roles** → RACI: [Responsible, Accountable, Consulted, Informed]
82
+ 6. **Set SLAs** → Time-based thresholds: [P1: 1hr, P2: 4hr, P3: 24hr]
83
+ 7. **Add Automation** → Auto-trigger conditions: [if X then run Y]
84
+ 8. **Test** → Validate with [scenario simulation]
85
+ ```
86
+
87
+ ## Step Definition
88
+
89
+ Each step follows this structure:
90
+
91
+ ```yaml
92
+ step:
93
+   id: 1 # Sequential ID
94
+   name: "Contain the incident" # Human-readable name
95
+   action: |
96
+     # Concrete actions to take
97
+     1. Isolate affected system
98
+     2. Preserve evidence
99
+     3. Notify team
100
+   verify: |
101
+     # How to confirm step completed
102
+     - System isolated: check network connections
103
+     - Evidence preserved: snapshot taken
104
+     - Team notified: ack received
105
+   tools:
106
+     - name: string
107
+       command: string
108
+   artifacts:
109
+     output: [files created by this step]
110
+   rollback: |
111
+     # How to undo this step if needed
112
+     reconnect_system()
113
+   next:
114
+     success: 2 # Next step ID on success
115
+     failure: "escalate" # Or step ID, or "abort"
116
+ ```
117
+
118
+ ## Decision Tree Patterns
119
+
120
+ ### Branch by Severity
121
+
122
+ ```yaml
123
+ decision:
124
+   name: severity-assessment
125
+   condition: "incident.severity"
126
+   branches:
127
+     P1:
128
+       - step: 2 # Immediate containment
129
+         notify: [lead, manager]
130
+       - step: 5 # War room
131
+     P2:
132
+       - step: 3 # Standard response
133
+         notify: [lead]
134
+     P3:
135
+       - step: 4 # Low priority
136
+         notify: [team]
137
+     P4:
138
+       - step: 6 # Backlog
139
+         notify: []
140
+ ```
141
+
142
+ ### Branch by Type
143
+
144
+ ```yaml
145
+ decision:
146
+   name: incident-type-assessment
147
+   condition: "incident.type"
148
+   branches:
149
+     data-breach:
150
+       - step: data_containment
151
+       - step: legal_notification
152
+       - step: affected_users_notification
153
+     security-compromise:
154
+       - step: isolate_system
155
+       - step: preserve_evidence
156
+       - step: forensic_investigation
157
+     outage:
158
+       - step: assess_impact
159
+       - step: restore_service
160
+       - step: post-mortem
161
+     bug:
162
+       - step: reproduce
163
+       - step: fix
164
+       - step: verify
165
+ ```
166
+
167
+ ## RACI Matrix
168
+
169
+ ```yaml
170
+ raci:
171
+   roles:
172
+     - name: orchestrator
173
+       responsible: [coordinate, dispatch]
174
+       accountable: [decisions, outcomes]
175
+     - name: executor
176
+       responsible: [implement, investigate]
177
+       accountable: []
178
+     - name: verifier
179
+       responsible: [test, validate]
180
+       accountable: []
181
+     - name: lead
182
+       responsible: []
183
+       accountable: [escalation, resource_allocation]
184
+     - name: manager
185
+       responsible: []
186
+       accountable: [approval, external_communication]
187
+ ```
188
+
189
+ ## SLA Definition
190
+
191
+ ```yaml
192
+ sla:
193
+   phases:
194
+     - name: detection
195
+       target: "15m" # Time to detect/acknowledge
196
+       measured_from: incident_start
197
+     - name: containment
198
+       target: "1h"
199
+       measured_from: detection_complete
200
+     - name: eradication
201
+       target: "4h"
202
+       measured_from: containment_complete
203
+     - name: recovery
204
+       target: "24h"
205
+       measured_from: eradication_complete
206
+     - name: lessons_learned
207
+       target: "72h"
208
+       measured_from: recovery_complete
209
+   escalation:
210
+     - phase: detection
211
+       exceeded: notify_lead
212
+     - phase: containment
213
+       exceeded: notify_manager
214
+     - phase: recovery
215
+       exceeded: notify_executive
216
+ ```
217
+
218
+ ## Playbook Examples
219
+
220
+ ### Example 1: Security Incident Playbook
221
+
222
+ ```yaml
223
+ playbook:
224
+   name: security-compromise-response
225
+   trigger:
226
+     conditions:
227
+       - name: unauthorized_access
228
+         description: Access by unauthorized user
229
+       - name: malware_detection
230
+         description: Malware or suspicious process
231
+       - name: data_exfiltration
232
+         description: Abnormal data transfer
233
+   steps:
234
+     - id: 1
235
+       name: Detect and confirm
236
+       action: |
237
+         - Review logs for unauthorized access
238
+         - Confirm malware detection with secondary tool
239
+         - Identify scope of compromise
240
+       verify: |
241
+         - Logs show compromise indicators
242
+         - Malware confirmed by 2+ tools
243
+       next:
244
+         success: 2
245
+         failure: abort
246
+     - id: 2
247
+       name: Contain
248
+       action: |
249
+         - Isolate affected system from network
250
+         - Preserve evidence (memory dump, disk image)
251
+         - Block malicious IPs/domains
252
+       verify: |
253
+         - System isolated: no network connections
254
+         - Evidence preserved: hash verified
255
+       next:
256
+         success: 3
257
+         failure: escalate
258
+     - id: 3
259
+       name: Investigate
260
+       action: |
261
+         - Determine attack vector
262
+         - Identify affected systems
263
+         - Timeline reconstruction
264
+       next:
265
+         success: 4
266
+     - id: 4
267
+       name: Eradicate
268
+       action: |
269
+         - Remove malware/backdoor
270
+         - Patch vulnerability
271
+         - Reset compromised credentials
272
+       next:
273
+         success: 5
274
+     - id: 5
275
+       name: Recover
276
+       action: |
277
+         - Restore from clean backup
278
+         - Verify system integrity
279
+         - Monitor for recurrence
280
+       next:
281
+         success: close
282
+   sla:
283
+     detection: "15m"
284
+     containment: "1h"
285
+     eradication: "4h"
286
+     recovery: "24h"
287
+ ```
288
+
289
+ ### Example 2: Bug Fix Playbook
290
+
291
+ ```yaml
292
+ playbook:
293
+   name: bug-fix-response
294
+   trigger:
295
+     conditions:
296
+       - name: regression
297
+         description: Existing feature broken
298
+       - name: new_bug
299
+         description: Newly reported issue
300
+   steps:
301
+     - id: 1
302
+       name: Reproduce
303
+       action: |
304
+         - Get reliable repro steps
305
+         - Verify bug exists
306
+         - Document environment
307
+       verify: |
308
+         - Bug reproduced consistently OR
309
+         - Bug confirmed as flaky (>50% reproduction)
310
+       next:
311
+         success: 2
312
+     - id: 2
313
+       name: Investigate
314
+       action: |
315
+         - Find root cause
316
+         - Identify affected code
317
+         - Determine fix approach
318
+       verify: |
319
+         - Root cause identified (not hypothesis)
320
+       next:
321
+         success: 3
322
+     - id: 3
323
+       name: Fix
324
+       action: |
325
+         - Implement fix
326
+         - Write regression test
327
+         - Update documentation
328
+       next:
329
+         success: 4
330
+     - id: 4
331
+       name: Verify
332
+       action: |
333
+         - Run original repro
334
+         - Run full test suite
335
+         - Code review
336
+       next:
337
+         success: close
338
+   sla:
339
+     detection: "30m"
340
+     containment: "1h"
341
+     eradication: "4h"
342
+     recovery: "24h"
343
+ ```
344
+
345
+ ## Enforcement — Incident Playbook Construction Gate
346
+
347
+ **Before publishing a playbook, verify:**
348
+
349
+ - [ ] Trigger conditions defined (knowing when to use this playbook)
350
+ - [ ] Each step has action + verification + next (success/failure) defined
351
+ - [ ] Decision points have explicit branches (if X then Y else Z)
352
+ - [ ] RACI matrix assigns responsible/accountable roles
353
+ - [ ] SLA phases defined with targets and escalation conditions
354
+ - [ ] Rollback procedures documented for critical steps
355
+
356
+ If ANY answer is NO → Stop. Complete playbook structure before publishing.
357
+
358
+ ## Anti-Patterns
359
+
360
+ - **Don't** create playbooks without trigger conditions (don't know when to use)
361
+ - **Don't** skip verification steps (can't confirm success)
362
+ - **Don't** skip rollback procedures (can't undo mistakes)
363
+ - **Don't** skip decision points (linear playbooks miss branches)
364
+ - **Don't** skip SLA definition (no accountability for timing)
365
+
366
+ ## Tools
367
+
368
+ | Tool | Purpose |
369
+ |------|---------|
370
+ | `post-mortem` | Post-incident documentation |
371
+ | `systematic-debugging` | Root cause investigation |
372
+ | `verification-before-done` | Step verification |
373
+
374
+ ## Verification
375
+
376
+ For playbook changes:
377
+ ```bash
378
+ cd pi-crew
379
+ npx tsc --noEmit
380
+ node --experimental-strip-types --test test/unit/playbook-validation.test.ts
381
+ ```
382
+
383
+ *See also: `post-mortem` skill for post-incident documentation, `delegation-patterns` for escalation matrix.*
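Reviewer note (not part of the package): the "Incident Playbook Construction Gate" checklist in this file can be checked mechanically. The TypeScript below is a minimal sketch that assumes a playbook has already been parsed into an object shaped like the YAML examples. The `Playbook` and `Step` types and the `validatePlaybook` function are hypothetical names introduced for illustration; they are not pi-crew APIs. Under a check like this, several steps in the two example playbooks (for instance steps 3 to 5 of Example 1) would be flagged, because they omit `verify` or `next.failure`.

```typescript
// Hypothetical types mirroring the YAML examples in this skill; not pi-crew APIs.
type NextTarget = number | "close" | "abort" | "escalate";

interface Step {
  id: number;
  name: string;
  action?: string;
  verify?: string;
  next?: { success?: NextTarget; failure?: NextTarget };
}

interface Playbook {
  name: string;
  trigger?: { conditions: { name: string; description: string }[] };
  steps: Step[];
  sla?: Record<string, string>;
}

// Returns human-readable gate violations; an empty array means the gate passes.
function validatePlaybook(pb: Playbook): string[] {
  const problems: string[] = [];
  const ids = new Set(pb.steps.map((s) => s.id));

  if (!pb.trigger || pb.trigger.conditions.length === 0) {
    problems.push("no trigger conditions defined");
  }
  for (const step of pb.steps) {
    if (!step.action?.trim()) problems.push(`step ${step.id}: missing action`);
    if (!step.verify?.trim()) problems.push(`step ${step.id}: missing verify`);
    for (const key of ["success", "failure"] as const) {
      const target = step.next?.[key];
      if (target === undefined) {
        problems.push(`step ${step.id}: missing next.${key}`);
      } else if (typeof target === "number" && !ids.has(target)) {
        problems.push(`step ${step.id}: next.${key} points at unknown step ${target}`);
      }
    }
  }
  if (!pb.sla || Object.keys(pb.sla).length === 0) {
    problems.push("no SLA phases defined");
  }
  return problems;
}
```

The sketch covers only the trigger, step, and SLA items of the checklist. RACI and rollback checks would follow the same pattern.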
@@ -1,8 +1,14 @@
1
1
  ---
2
2
  name: live-agent-lifecycle
3
- description: Live agent registration, workspace isolation, termination, and eviction workflow. Use when tracking live agents, debugging ghost agents, or understanding workspace boundaries.
4
- ---
3
+ description: "Live agent registration, workspace isolation, termination, and eviction workflow."
4
+ triggers:
5
+ - "register agent"
6
+ - "terminate agent"
7
+ - "evict stale"
8
+ - "ghost agent"
9
+ - "workspace isolation"
5
10
 
11
+ ---
6
12
  # live-agent-lifecycle
7
13
 
8
14
  Live agents are real-time, in-memory worker sessions managed by `LiveAgentManager` (`src/runtime/live-agent-manager.ts`). They are distinct from `CrewAgentRecord` files on disk — live agents provide real-time activity (tool names, response text, turn count) while agent records are durable snapshots.
@@ -33,8 +39,6 @@ interface LiveAgentHandle {
33
39
 
34
40
  The in-memory `liveAgents` Map stores all active handles. It is never persisted — on Pi restart, the Map is empty and agents are re-created from agent records.
35
41
 
36
- ---
37
-
38
42
  ## Registration
39
43
 
40
44
  `registerLiveAgent(input, eventLogFn?, eventsPath?)` is called when a live session worker starts. It:
@@ -49,8 +53,6 @@ Key caller sites:
49
53
  - `live-executor.ts` — when spawning a live task
50
54
  - (workspaceId is passed through the entire call chain)
51
55
 
52
- ---
53
-
54
56
  ## Workspace Isolation
55
57
 
56
58
  **`workspaceId: string`** field is the workspace boundary. Set to `manifest.cwd` at registration time.
@@ -159,6 +161,18 @@ task.completed → upsertCrewAgent → agents.json updated
159
161
 
160
162
  ---
161
163
 
164
+ ## Enforcement — Live Agent Lifecycle Gate
165
+
166
+ **Before terminating or evicting live agents, verify:**
167
+
168
+ - [ ] Agent handle status is terminal (not running/queued/waiting) for eviction
169
+ - [ ] Handle age exceeds STALE_HANDLE_MS (10 minutes) for eviction
170
+ - [ ] workspaceId matches current workspace for cross-workspace prevention
171
+ - [ ] Agent record sync completed before handle eviction (upsertCrewAgent called)
172
+ - [ ] Termination called in all exit paths (finally blocks, crash-recovery)
173
+
174
+ If ANY answer is NO → Stop. Verify lifecycle state before mutation.
175
+
162
176
  ## Anti-patterns
163
177
 
164
178
  - **Missing termination on error path**: If a live agent crashes and `terminateLiveAgent` is not called, the handle stays in the Map forever with status "running". Use `finally` blocks or crash-recovery to ensure termination.
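Reviewer note (not part of the package): the first three items of the "Live Agent Lifecycle Gate" amount to a single eligibility predicate. The sketch below is an illustration only. The `HandleLike` shape and its `lastActivityMs` field are assumptions, since the real `LiveAgentHandle` fields are not visible in this diff, and `canEvict` is a hypothetical name. The 10-minute value for `STALE_HANDLE_MS` and the three non-terminal statuses are taken from the checklist.

```typescript
// Hypothetical shape; the real LiveAgentHandle in live-agent-manager.ts is not shown in this diff.
interface HandleLike {
  agentId: string;
  workspaceId: string;
  status: string;
  lastActivityMs: number; // assumed timestamp field, epoch milliseconds
}

const STALE_HANDLE_MS = 10 * 60 * 1000; // 10 minutes, per the gate checklist
const NON_TERMINAL = new Set(["running", "queued", "waiting"]);

// True only when the handle belongs to this workspace, is terminal, and is stale.
function canEvict(handle: HandleLike, currentWorkspaceId: string, nowMs: number): boolean {
  if (handle.workspaceId !== currentWorkspaceId) return false; // never touch another workspace
  if (NON_TERMINAL.has(handle.status)) return false; // live work is never evicted
  return nowMs - handle.lastActivityMs > STALE_HANDLE_MS; // age must exceed the threshold
}
```

Passing `nowMs` in explicitly, instead of calling `Date.now()` inside, keeps the predicate deterministic under test.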
@@ -1,8 +1,13 @@
1
1
  ---
2
2
  name: mailbox-interactive
3
- description: Interactive waiting-task and mailbox workflow. Use when implementing or operating respond/nudge/ack/replay/supervisor-contact behavior.
3
+ description: "Interactive waiting-task and mailbox workflow."
4
+ triggers:
5
+ - "respond to worker"
6
+ - "nudge agent"
7
+ - "mailbox message"
8
+ - "supervisor contact"
9
+ - "waiting task"
4
10
  ---
5
-
6
11
  # mailbox-interactive
7
12
 
8
13
  Use this skill for live coordination between leader and workers. Mailbox provides an asynchronous message protocol for steer, follow-up, respond, and nudge operations.
@@ -276,6 +281,18 @@ function verifyRunOwnership(manifest: TeamRunManifest, sessionId: string, force
276
281
  }
277
282
  ```
278
283
 
284
+ ## Enforcement — Mailbox Interactive Gate
285
+
286
+ **Before responding to or mutating mailbox state, verify:**
287
+
288
+ - [ ] Target task status is "waiting" (respond only works on waiting tasks)
289
+ - [ ] ownerSessionId matches current session (ownership verified)
290
+ - [ ] Run status is not terminal (do not respond to completed/failed/cancelled)
291
+ - [ ] Corrupt JSONL handled gracefully (skip malformed lines)
292
+ - [ ] Backpressure respected (queue depth below MAX_PENDING limits)
293
+
294
+ If ANY answer is NO → Stop. Verify mailbox state before mutating.
295
+
279
296
  ## Anti-patterns
280
297
 
281
298
  - **Resuming non-waiting tasks**: `respond` only works on `waiting` tasks. Resuming `running` tasks corrupts state.
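Reviewer note (not part of the package): the gate item "Corrupt JSONL handled gracefully (skip malformed lines)" can be illustrated with a tolerant reader. This is a minimal sketch, not the implementation of `readMailboxMessages`, whose body does not appear in this diff. `parseJsonlTolerant` is a hypothetical name.

```typescript
// Parses JSONL text, skipping blank lines and counting lines that are not valid JSON.
function parseJsonlTolerant(text: string): { records: unknown[]; skipped: number } {
  const records: unknown[] = [];
  let skipped = 0;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // blank lines are not errors
    try {
      records.push(JSON.parse(trimmed));
    } catch {
      skipped += 1; // malformed line: keep reading instead of failing the whole file
    }
  }
  return { records, skipped };
}
```

Returning the skipped count lets a caller log or surface corruption without aborting the read.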
@@ -285,8 +302,6 @@ function verifyRunOwnership(manifest: TeamRunManifest, sessionId: string, force
285
302
  - **Not handling corrupt JSONL**: Skip malformed lines; don't fail the whole read.
286
303
  - **Losing pending messages on session switch**: Pending steers/followups are stored in-memory in the handle. They survive session fork but not session death.
287
304
 
288
- ---
289
-
290
305
  ## Source patterns
291
306
 
292
307
  - `src/state/mailbox.ts` — appendInboxMessage, appendSteeringMessage, readMailboxMessages
@@ -297,8 +312,6 @@ function verifyRunOwnership(manifest: TeamRunManifest, sessionId: string, force
297
312
  - `src/runtime/supervisor-contact.ts` — parseSupervisorContactFromLine, recordSupervisorContact
298
313
  - `src/ui/overlays/mailbox-detail-overlay.ts` — mailbox UI
299
314
 
300
- ---
301
-
302
315
  ## Verification
303
316
 
304
317
  ```bash
@@ -1,8 +1,14 @@
1
1
  ---
2
2
  name: model-routing-context
3
3
  description: Model routing, parent context, thinking level, and prompt construction workflow. Use when changing model fallback, child Pi args, inherited context, task prompts, or compact-read behavior.
4
- ---
4
+ triggers:
5
+ - "change model"
6
+ - "parent context"
7
+ - "thinking level"
8
+ - "task prompts"
9
+ - "compact read"
5
10
 
11
+ ---
6
12
  # model-routing-context
7
13
 
8
14
  Use this skill when working on model/context propagation.
@@ -22,6 +28,18 @@ Use this skill when working on model/context propagation.
22
28
  - When changing model precedence, add tests for undefined, empty, whitespace, agent, task, parent, and explicit tool override cases.
23
29
  - Redact secrets in context snippets and child prompts where logs/artifacts may persist them.
24
30
 
31
+ ## Enforcement — Model Routing Context Gate
32
+
33
+ **Before changing model precedence or building task prompts, verify:**
34
+
35
+ - [ ] Empty/whitespace model values treated as absent (not as explicit overrides)
36
+ - [ ] Model precedence chain understood: tool override → step model → team role → agent model → parent → registry default
37
+ - [ ] Thinking level suffix applied correctly (or intentionally omitted)
38
+ - [ ] Secrets redacted in context snippets and child prompts
39
+ - [ ] Tests cover: undefined, empty, whitespace, agent, task, parent, and explicit tool override cases
40
+
41
+ If ANY answer is NO → Stop. Verify model routing before proceeding.
42
+
25
43
  ## Anti-patterns
26
44
 
27
45
  - Letting `agentModel: ""` block parent model fallback.
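Reviewer note (not part of the package): the precedence chain named in the gate (tool override → step model → team role → agent model → parent → registry default), combined with the rule that empty or whitespace-only values count as absent, can be sketched as below. The `ModelCandidates` field names and the `resolveModel` function are hypothetical. pi-crew's actual types and resolver are not shown in this diff.

```typescript
// Hypothetical input shape; field names are illustrative, not pi-crew's actual types.
interface ModelCandidates {
  toolOverride?: string;
  stepModel?: string;
  teamRoleModel?: string;
  agentModel?: string;
  parentModel?: string;
  registryDefault: string;
}

// Treats undefined, empty, and whitespace-only strings as "not set".
function present(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Walks the precedence chain from most specific to least specific.
function resolveModel(c: ModelCandidates): string {
  return (
    present(c.toolOverride) ??
    present(c.stepModel) ??
    present(c.teamRoleModel) ??
    present(c.agentModel) ??
    present(c.parentModel) ??
    c.registryDefault
  );
}
```

With this shape, `agentModel: ""` falls through to the parent model, which is the failure the first anti-pattern warns about.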
@@ -1,8 +1,13 @@
1
1
  ---
2
2
  name: multi-perspective-review
3
- description: "Multi-perspective code review with simpler-alternative pass. Use when reviewing a plan, diff, implementation, worker output, release candidate, or external feedback. Triggers: review this, look at this, LGTM check, sanity check, audit this, get a second opinion, check this PR, examine this code."
3
+ description: "Multi-perspective code review with simpler-alternative pass."
4
+ triggers:
5
+ - "review this"
6
+ - "look at this"
7
+ - "LGTM check"
8
+ - "sanity check"
9
+ - "check this PR"
4
10
  ---
5
-
6
11
  # multi-perspective-review
7
12
 
8
13
  Core principle: review early, review often, and separate concerns. Reviewer output is evidence to evaluate, not an instruction to obey blindly.
@@ -23,8 +28,6 @@ Before running any review passes, ask:
23
28
 
24
29
  This is the most valuable finding you can produce — surfacing unnecessary complexity before reviewing its details.
25
30
 
26
- ---
27
-
28
31
  ## Review Passes
29
32
 
30
33
  Run relevant passes separately:
@@ -154,6 +157,18 @@ When receiving feedback:
154
157
  5. Test each fix and verify no regressions.
155
158
  6. Push back with evidence if the suggestion is wrong, out of scope, or violates user decisions.
156
159
 
160
+ ## Enforcement — Multi-Perspective Review Gate
161
+
162
+ **Before reporting review findings, verify:**
163
+
164
+ - [ ] Simpler-alternative pass completed first (delete, use existing, smaller change, different layer)
165
+ - [ ] Findings include: severity, path/symbol, evidence, impact, fix, verification
166
+ - [ ] No rubber-stamps (if nothing found, state what was traced)
167
+ - [ ] Critical/high findings have actionable fixes before proceeding
168
+ - [ ] Verdict stated: ship / fix-then-ship / rework / reject
169
+
170
+ If ANY answer is NO → Stop. Complete review requirements before reporting.
171
+
157
172
  ## Rules
158
173
 
159
174
  - Do not use performative agreement; act or give technical reasoning.
@@ -1,8 +1,13 @@
1
1
  ---
2
2
  name: observability-reliability
3
- description: Metrics, diagnostics, correlation, retry, deadletter, and recovery evidence workflow. Use when adding reliability features or investigating failures.
3
+ description: "Metrics, diagnostics, correlation, retry, deadletter, and recovery evidence workflow."
4
+ triggers:
5
+ - "add metrics"
6
+ - "diagnose failure"
7
+ - "retry logic"
8
+ - "deadletter"
9
+ - "recovery evidence"
4
10
  ---
5
-
6
11
  # observability-reliability
7
12
 
8
13
  Use this skill for reliability and observability work.
@@ -24,6 +29,18 @@ Use this skill for reliability and observability work.
24
29
  - Heartbeat classification should be threshold-based and should ignore terminal tasks/runs.
25
30
  - Overflow recovery should track phase progression and terminal states without repeatedly alerting on completed work.
26
31
 
32
+ ## Enforcement — Observability Reliability Gate
33
+
34
+ **Before emitting metrics or implementing retry, verify:**
35
+
36
+ - [ ] Metric labels are low-cardinality (no raw paths, prompts, or secrets)
37
+ - [ ] Secrets redacted before writing logs, events, diagnostics, or bundles
38
+ - [ ] Retry records attempts and deadletters on exhaustion
39
+ - [ ] Diagnostics are safe to share (no secrets, no raw sensitive data)
40
+ - [ ] Heartbeat thresholds ignore terminal tasks/runs
41
+
42
+ If ANY answer is NO → Stop. Fix observability issues before proceeding.
43
+
27
44
  ## Anti-patterns
28
45
 
29
46
  - High-cardinality Prometheus labels.
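Reviewer note (not part of the package): the gate item "Retry records attempts and deadletters on exhaustion" describes a small, testable contract. The sketch below is a simplified synchronous illustration with no backoff; a production version would normally be asynchronous. All names (`runWithRetry`, `AttemptRecord`, `DeadletterEntry`) are hypothetical and not taken from pi-crew.

```typescript
interface AttemptRecord {
  attempt: number;
  error: string;
}

interface DeadletterEntry {
  taskId: string;
  attempts: AttemptRecord[];
}

type RetryResult<T> =
  | { ok: true; value: T; attempts: AttemptRecord[] }
  | { ok: false; attempts: AttemptRecord[] };

// Runs `op` up to `maxAttempts` times, recording every failure.
// When all attempts fail, the full attempt history is pushed to `deadletter`.
function runWithRetry<T>(
  taskId: string,
  op: (attempt: number) => T,
  maxAttempts: number,
  deadletter: DeadletterEntry[],
): RetryResult<T> {
  const attempts: AttemptRecord[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { ok: true, value: op(attempt), attempts };
    } catch (err) {
      attempts.push({ attempt, error: err instanceof Error ? err.message : String(err) });
    }
  }
  deadletter.push({ taskId, attempts });
  return { ok: false, attempts };
}
```

Error messages recorded here may contain sensitive data, so a real implementation would redact them before writing, as the second gate item requires.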
@@ -1,8 +1,13 @@
1
1
  ---
2
2
  name: orchestration
3
- description: "Multi-phase orchestration for planners and executors. Use when decomposing complex tasks into parallel phases, dispatching workers, verifying gates, and iterating to closure. Triggers: orchestrate this, coordinate these tasks, run this multi-phase, dispatch workers, coordinate team."
3
+ description: "Multi-phase orchestration for planners and executors."
4
+ triggers:
5
+ - "orchestrate this"
6
+ - "coordinate tasks"
7
+ - "run this multi-phase"
8
+ - "dispatch workers"
9
+ - "coordinate team"
4
10
  ---
5
-
6
11
  # orchestration
7
12
 
8
13
  Use this skill when orchestrating multi-phase tasks across pi-crew teams and workers.
@@ -96,6 +101,19 @@ Maintain the original scope exactly. Do not expand scope because you "noticed more
96
101
  - This is the final safety net — typecheck, tests, lint, everything.
97
102
  - Only report DONE when final verification is green.
98
103
 
104
+ ## Enforcement — Orchestration Gate
105
+
106
+ **Before launching a new phase, verify:**
107
+
108
+ - [ ] Full work surface enumerated (all files, symbols, subsystems known)
109
+ - [ ] Phase tasks are independent (disjoint file scope, no edit conflicts)
110
+ - [ ] Each worker has explicit file ownership (no two workers same file)
111
+ - [ ] Verification gates defined for phase completion
112
+ - [ ] Phase gate passed (typecheck, tests, lint green) before advancing
113
+ - [ ] Respawn workers for broken work (do not absorb/fix yourself)
114
+
115
+ If ANY answer is NO → Stop. Complete planning before dispatching.
116
+
99
117
  ## Anti-patterns
100
118
 
101
119
  These are the behaviours that kill orchestration quality; stay well away from them:
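Reviewer note (not part of the package): the gate items "Phase tasks are independent" and "no two workers same file" can be checked before dispatch by inverting the worker-to-files assignment. The sketch below assumes assignments are available as a plain map from worker name to file paths. `findOwnershipConflicts` is a hypothetical helper, not a pi-crew function.

```typescript
// Inverts worker -> files into file -> workers and returns only files claimed by more than one worker.
function findOwnershipConflicts(assignments: Record<string, string[]>): Map<string, string[]> {
  const owners = new Map<string, Set<string>>();
  for (const [worker, files] of Object.entries(assignments)) {
    for (const file of files) {
      const set = owners.get(file) ?? new Set<string>();
      set.add(worker); // a Set ignores the same worker listing a file twice
      owners.set(file, set);
    }
  }
  const conflicts = new Map<string, string[]>();
  owners.forEach((workers, file) => {
    if (workers.size > 1) conflicts.set(file, Array.from(workers));
  });
  return conflicts;
}
```

An empty result means the phase has disjoint file scope. Paths are compared as literal strings, so callers would need to normalize them first.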
@@ -1,8 +1,14 @@
1
1
  ---
2
2
  name: ownership-session-security
3
- description: Session ownership and authorization workflow. Use when implementing cancel, respond, steer, run ownership, cwd overrides, imported runs, or cross-session actions.
4
- ---
3
+ description: "Session ownership and authorization workflow."
4
+ triggers:
5
+ - "cancel run"
6
+ - "respond to task"
7
+ - "cross-session action"
8
+ - "ownership verify"
9
+ - "session security"
5
10
 
11
+ ---
6
12
  # ownership-session-security
7
13
 
8
14
  Use this skill for cross-session safety and trust-boundary work.
@@ -24,6 +30,18 @@ Use this skill for cross-session safety and trust-boundary work.
24
30
  - Use `resolveContainedPath`, `resolveRealContainedPath`, `assertSafePathId`, and symlink checks rather than ad-hoc `startsWith` checks.
25
31
  - Destructive management actions must require `confirm: true`; referenced resource deletes must require `force: true` where applicable.
26
32
 
33
+ ## Enforcement — Ownership Session Security Gate
34
+
35
+ **Before mutating run state or cross-session operations, verify:**
36
+
37
+ - [ ] Session ID propagated into TeamContext for production paths
38
+ - [ ] ownerSessionId verified before respond/cancel/mutate operations
39
+ - [ ] Path fields (cwd, import, artifact) normalized and contained under allowed base
40
+ - [ ] Safe path helpers used (resolveContainedPath, assertSafePathId) not startsWith checks
41
+ - [ ] Destructive actions require explicit confirm/force parameters
42
+
43
+ If ANY answer is NO → Stop. Verify ownership before mutating state.
44
+
27
45
  ## Anti-patterns
28
46
 
29
47
  - Assuming `ctx.sessionId` exists directly on Pi context.
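Reviewer note (not part of the package): the gate prefers safe path helpers over `startsWith` checks because a prefix test accepts sibling directories, for example `/srv/runs-evil` passes a `startsWith("/srv/runs")` check. The sketch below shows one common containment test using Node's `path.resolve` and `path.relative`. It is not the implementation of `resolveContainedPath`, which is not shown in this diff, and `resolveInside` is a hypothetical name. It does not resolve symlinks. A real check would also compare real paths, as the skill's mention of `resolveRealContainedPath` suggests.

```typescript
import * as path from "node:path";

// Returns the resolved path if `candidate` stays inside `baseDir`, otherwise throws.
// Symlinks are NOT resolved here; this is a lexical check only.
function resolveInside(baseDir: string, candidate: string): string {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, candidate);
  const rel = path.relative(base, target);
  // `rel` climbs out of `base` if it is ".." or starts with "../"; on Windows a
  // different drive yields an absolute `rel`.
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`path escapes base directory: ${candidate}`);
  }
  return target;
}
```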