playbook-ai 1.4.0 → 1.4.1
- package/CHANGELOG.md +5 -0
- package/VERSION +1 -1
- package/commands/chess.md +65 -13
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,11 @@
 
 All notable updates to Playbook are documented here. Only impactful changes are listed — new commands, upgraded behavior, and things that make your workflow better. Cosmetic fixes and internal housekeeping are omitted.
 
+## [1.4.1] — 2026-05-03
+
+### Strategy
+
+- **`/chess` System Mode** — `/chess` now has two modes. Human Mode is unchanged (parallel Opus 4.6 session, move-tree analysis, adversary modeling). New System Mode handles the case where there's no human adversary but a technical plan needs rigorous stress-testing. System Mode runs inline on Sonnet, enumerates every assumption in the plan, attacks each one, and produces verdicts (✅ / ⚠️ / ❌) plus a minimal list of required changes. Escalates to Opus 4.6 inline (no parallel session) for genuinely complex systems. Pre-flight now routes to three paths: Human Mode, System Mode, or `/plan`.
+
 ## [1.4.0] — 2026-05-01
 
 ### Strategy
package/VERSION
CHANGED
@@ -1 +1 @@
-1.4.
+1.4.1
package/commands/chess.md
CHANGED
@@ -1,28 +1,28 @@
-Adversarial strategy analysis
+Adversarial strategy analysis and technical stress-testing. Two modes: Human Mode (opponent-modeled, branch-traced) for situations with a real adversary; System Mode (assumption-attack, failure-traced) for technical plans with no human counterparty. Intake runs in the primary session (Sonnet). Human Mode generates a parallel Opus 4.6 handoff; System Mode runs inline.
 
 ---
 
-## Pre-flight:
+## Pre-flight: Route the situation
 
-
+Three routes. Assess before doing anything else.
 
-
-- Does the outcome depend on what that adversary does in response to your moves?
-- Are the stakes material — money, a key relationship, legal exposure, or a make-or-break decision?
+**Route 1 — Human Mode:** There's a real adversary — a person or party with competing interests and their own move set. The outcome depends on what they do in response to your moves. Stakes are material (money, a key relationship, legal exposure, a make-or-break decision). → Confirm to the user and proceed to Human Mode intake.
 
-**
+**Route 2 — System Mode:** No human adversary, but there IS a technical plan, implementation, or system being challenged. The question is: what could break, and how does the system respond to each move? → Confirm to the user and proceed to System Mode intake.
 
-**
+**Route 3 — /plan:** No adversary, no system to stress-test. This is a pure tradeoffs decision or planning question. → Tell the user: *"/plan is the right tool here — this is a tradeoffs decision, not a strategic scenario."* Offer to invoke /plan instead.
 
 **Borderline:** Name what makes it ambiguous. Ask the user to confirm before proceeding.
 
-The pre-flight runs on Sonnet. Do not begin intake until the
+The pre-flight runs on Sonnet. Do not begin intake until the route is confirmed.
 
 ---
 
-##
+## Human Mode
 
-
+### Intake: Build the chess brief
+
+Run in the primary session on Sonnet. Ask these questions conversationally — not as a numbered list. Group related questions naturally. Use follow-ups where the answer is thin. The goal is a complete picture before the chess engine runs.
 
 **The situation**
 - What is the decision, negotiation, or situation you're navigating?
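Reviewer note: the three-way pre-flight routing added in the hunk above can be read as a small decision function. The sketch below is only an illustration of that reading, not part of the package. The `Situation` fields and the `route` function are hypothetical names, and treating "adversary present but the other Route 1 criteria unmet" as Borderline is an assumption; the command file leaves that judgment to the model.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    HUMAN_MODE = "Human Mode"
    SYSTEM_MODE = "System Mode"
    PLAN = "/plan"
    BORDERLINE = "Borderline"


@dataclass(frozen=True)
class Situation:
    # Hypothetical field names; the command file states these checks in prose.
    has_adversary: bool            # a person or party with competing interests
    outcome_depends_on_them: bool  # the result hinges on their response to your moves
    stakes_material: bool          # money, a key relationship, legal exposure, ...
    has_system_under_test: bool    # a technical plan, implementation, or system


def route(s: Situation) -> Route:
    """Pick a pre-flight route; intake should not start until the user confirms it."""
    if s.has_adversary:
        if s.outcome_depends_on_them and s.stakes_material:
            return Route.HUMAN_MODE
        # Assumption: a partial match on Route 1 is ambiguous, so ask the user.
        return Route.BORDERLINE
    if s.has_system_under_test:
        return Route.SYSTEM_MODE
    return Route.PLAN


if __name__ == "__main__":
    migration_plan = Situation(False, False, False, True)
    print(route(migration_plan).value)
```

In practice the routing is a judgment made in conversation, so a boolean model like this is at best a checklist for reviewing whether the three routes are mutually exclusive.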
@@ -53,7 +53,7 @@ Once intake is complete, read the summary back to the user and confirm it's accu
 
 ---
 
-
+### Generate the handoff prompt
 
 Compile everything from intake into a self-contained chess brief. Fill in all [BRACKETS] with real values — current date, actual repo path (use the current working directory), and a short topic slug derived from the situation (e.g., `hilldun-negotiation`, `board-vote`, `vendor-renewal`).
 
@@ -65,7 +65,7 @@ Present the completed handoff prompt to the user in a clearly labeled block:
 
 ---
 
-
+#### Handoff prompt (embed verbatim — this is what runs in the parallel session)
 
 ---
 
@@ -166,3 +166,55 @@ End of handoff prompt.
 After presenting the handoff prompt to the user, say:
 
 > *"Open a new terminal, paste the block above, and watch the chess engine work. When it's done, use the return prompt to bring the debrief back into this session."*
+
+---
+
+## System Mode
+
+### Intake: Build the stress-test brief
+
+If the system, plan, and constraints are already established in the conversation, skip directly to the stress-test. Only ask for information that's genuinely missing.
+
+If context is thin, ask conversationally — not as a numbered list. Use follow-ups where the answer is thin. The goal is a complete picture before the stress-test runs.
+
+**The system**
+- What are you trying to build, fix, or change?
+- What does the current system look like? (components, dependencies, environment, constraints)
+- What's already been decided or built that the plan must work around?
+
+**The plan**
+- What's the sequence of moves you're planning?
+- What outcome are you trying to guarantee?
+- What's an acceptable failure mode vs. an unacceptable one?
+
+Once you have sufficient context, proceed directly — no need to read it back unless something is ambiguous.
+
+---
+
+### System stress-test framework
+
+Default: run inline on Sonnet. Escalate to Opus 4.6 inline (no parallel session) when the system is complex enough that a shallow pass would miss real risks — multiple interacting services, deep dependency chains, complex state machines. Surface your reasoning in chat as you go.
+
+**Step 1 — Map the assumptions**
+
+List every assumption embedded in the plan. These are the places where "this works if..." is implicit. Be exhaustive — surface assumptions about environment, dependencies, timing, state, permissions, behavior under failure. Name them all before attacking any.
+
+**Step 2 — Attack each vector**
+
+Work through the assumption list. For each one:
+- **State the risk:** what breaks if this assumption is wrong?
+- **Reason through the system's response:** how does the environment, dependency, or component actually behave in this case? Trace the actual mechanics, not the happy path.
+- **Verdict:**
+- ✅ — assumption holds, no change needed
+- ⚠️ → [specific mitigation] — risk is real but manageable; state the required change
+- ❌ — plan-breaker; must resolve before building
+
+Don't pad. If an assumption clearly holds, say so and move on. Depth where there's actual risk.
+
+**Step 3 — Surface what changed**
+
+List only the changes the stress-test produced. Not every risk — only the ones with a ⚠️ or ❌ verdict that require action. For each: what changes, and why.
+
+**Step 4 — Ready to build?**
+
+State clearly: the plan is sound as-is / the plan needs these N changes before it's sound. If changes are needed, offer to incorporate them and proceed.
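Reviewer note: Steps 2 through 4 of the System Mode framework describe a simple data flow: each assumption gets a verdict, only ⚠️ and ❌ findings carry forward as required changes, and the final statement depends on how many remain. The sketch below shows one way to model that flow, purely as an illustration. `Verdict`, `Finding`, `required_changes`, and `ready_to_build` are hypothetical names, and the sample findings are invented; none of this appears in the package.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Verdict(Enum):
    HOLDS = "✅"     # assumption holds, no change needed
    MITIGATE = "⚠️"  # risk is real but manageable
    BREAKER = "❌"   # plan-breaker; must resolve before building


@dataclass
class Finding:
    assumption: str
    risk: str
    verdict: Verdict
    change: Optional[str] = None  # the required change, for MITIGATE or BREAKER


def required_changes(findings: List[Finding]) -> List[Finding]:
    """Step 3: keep only findings whose verdict requires action."""
    return [f for f in findings if f.verdict is not Verdict.HOLDS]


def ready_to_build(findings: List[Finding]) -> str:
    """Step 4: state whether the plan is sound as-is or needs N changes."""
    n = len(required_changes(findings))
    if n == 0:
        return "The plan is sound as-is."
    plural = "" if n == 1 else "s"
    return f"The plan needs {n} change{plural} before it's sound."


# Invented example findings, for illustration only.
findings = [
    Finding("Config file is always present", "startup crash", Verdict.HOLDS),
    Finding("Retries are idempotent", "duplicate writes", Verdict.MITIGATE,
            change="add a request ID and deduplicate on it"),
    Finding("Schema change is backward compatible", "old readers fail",
            Verdict.BREAKER, change="ship the reader update first"),
]

if __name__ == "__main__":
    for f in required_changes(findings):
        print(f"{f.verdict.value} {f.assumption}: {f.change}")
    print(ready_to_build(findings))
```

Keeping `change` optional mirrors the framework's instruction not to pad: a ✅ finding records the assumption and nothing else.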