playbook-ai 1.3.5 → 1.4.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +11 -0
- package/VERSION +1 -1
- package/commands/chess.md +220 -0
- package/commands/plan.md +0 -7
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,17 @@
 
 All notable updates to Playbook are documented here. Only impactful changes are listed — new commands, upgraded behavior, and things that make your workflow better. Cosmetic fixes and internal housekeeping are omitted.
 
+## [1.4.1] — 2026-05-03
+
+### Strategy
+- **`/chess` System Mode** — `/chess` now has two modes. Human Mode is unchanged (parallel Opus 4.6 session, move-tree analysis, adversary modeling). New System Mode handles the case where there's no human adversary but a technical plan needs rigorous stress-testing. System Mode runs inline on Sonnet, enumerates every assumption in the plan, attacks each one, and produces verdicts (✅ / ⚠️ / ❌) plus a minimal list of required changes. Escalates to Opus 4.6 inline (no parallel session) for genuinely complex systems. Pre-flight now routes to three paths: Human Mode, System Mode, or `/plan`.
+
+## [1.4.0] — 2026-05-01
+
+### Strategy
+- **New `/chess` command** — adversarial strategy analysis with full opponent modeling and multi-move branch tracing. Designed for negotiations, competitive decisions, legal disputes, and any high-stakes scenario with a real counterparty. Runs a structured intake in the primary session (Sonnet), then generates a self-contained handoff prompt for a parallel Opus 4.6 session that does the reasoning. The chess session delivers a structured debrief artifact and a return prompt to bring findings back into the primary session. Closes with a clean "Chess session complete. You can close this window." — no /end needed in the parallel session.
+- **`/plan` lightened** — branch trace removed. It now lives in `/chess` where adversarial forward-tracing belongs. `/plan` retains its three-phase structure (assess → harden → steps) without the overhead of tracing implementation options forward multiple moves. For decisions involving a counterparty, reach for `/chess`; for implementation decisions, `/plan` is leaner and faster.
+
 ## [1.3.5] — 2026-04-28
 
 ### Planning
package/VERSION
CHANGED
@@ -1 +1 @@
-1.
+1.4.1
package/commands/chess.md
@@ -0,0 +1,220 @@
+Adversarial strategy analysis and technical stress-testing. Two modes: Human Mode (opponent-modeled, branch-traced) for situations with a real adversary; System Mode (assumption-attack, failure-traced) for technical plans with no human counterparty. Intake runs in the primary session (Sonnet). Human Mode generates a parallel Opus 4.6 handoff; System Mode runs inline.
+
+---
+
+## Pre-flight: Route the situation
+
+Three routes. Assess before doing anything else.
+
+**Route 1 — Human Mode:** There's a real adversary — a person or party with competing interests and their own move set. The outcome depends on what they do in response to your moves. Stakes are material (money, a key relationship, legal exposure, a make-or-break decision). → Confirm to the user and proceed to Human Mode intake.
+
+**Route 2 — System Mode:** No human adversary, but there IS a technical plan, implementation, or system being challenged. The question is: what could break, and how does the system respond to each move? → Confirm to the user and proceed to System Mode intake.
+
+**Route 3 — /plan:** No adversary, no system to stress-test. This is a pure tradeoffs decision or planning question. → Tell the user: *"/plan is the right tool here — this is a tradeoffs decision, not a strategic scenario."* Offer to invoke /plan instead.
+
+**Borderline:** Name what makes it ambiguous. Ask the user to confirm before proceeding.
+
+The pre-flight runs on Sonnet. Do not begin intake until the route is confirmed.
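
The three routes and the borderline case above amount to a small decision function. What follows is a minimal Python sketch under stated assumptions: `Situation`, its three flags, and `route` are hypothetical names (the command makes this assessment conversationally, not from booleans), and the tie-break order (adversary checked first, then technical plan) is an assumption the command file does not spell out.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    HUMAN_MODE = "Human Mode"
    SYSTEM_MODE = "System Mode"
    PLAN = "/plan"
    BORDERLINE = "Borderline"


@dataclass
class Situation:
    # Hypothetical flags; the real pre-flight judges these from conversation.
    has_real_adversary: bool
    stakes_are_material: bool
    has_technical_plan: bool


def route(s: Situation) -> Route:
    """Pick a pre-flight route; ambiguous cases go back to the user."""
    if s.has_real_adversary and s.stakes_are_material:
        return Route.HUMAN_MODE
    if s.has_real_adversary:
        # Assumption: an adversary without material stakes fits no route
        # cleanly, so it is treated as borderline and confirmed with the user.
        return Route.BORDERLINE
    if s.has_technical_plan:
        return Route.SYSTEM_MODE
    return Route.PLAN
```

Returning `BORDERLINE` for an adversary with non-material stakes is one reading of the Borderline rule, chosen so that the user is asked before intake begins.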
+
+---
+
+## Human Mode
+
+### Intake: Build the chess brief
+
+Run in the primary session on Sonnet. Ask these questions conversationally — not as a numbered list. Group related questions naturally. Use follow-ups where the answer is thin. The goal is a complete picture before the chess engine runs.
+
+**The situation**
+- What is the decision, negotiation, or situation you're navigating?
+- What outcome are you trying to achieve?
+- What's the time horizon? (One-time event, ongoing relationship, hard deadline?)
+- What's already been said or done that constrains the situation?
+
+**For each adversary**
+- Who are they? (Name, role, organization, relationship to you)
+- What do they want most — and what are their secondary interests?
+- What are their motivations and incentives? (Financial, ego, power, relationship, legal, other)
+- How aggressive or passive are they likely to be?
+- What's their BATNA — their best alternative if this doesn't go their way?
+- What constraints do they operate under? (Budget, authority limits, approvals needed, external pressures)
+- What do they likely know about your position?
+
+**Your position**
+- Are you stronger, equal, or weaker than the adversary right now?
+- What's your BATNA?
+- What are you not willing to concede?
+- What information are you holding back, and what do they probably assume about you?
+
+**The decision**
+- What moves are you currently considering?
+- What counts as a win? An acceptable outcome? A loss?
+
+Once intake is complete, read the summary back to the user and confirm it's accurate before generating the handoff prompt.
+
+---
+
+### Generate the handoff prompt
+
+Compile everything from intake into a self-contained chess brief. Fill in all [BRACKETS] with real values — current date, actual repo path (use the current working directory), and a short topic slug derived from the situation (e.g., `hilldun-negotiation`, `board-vote`, `vendor-renewal`).
+
+Create the `docs/chess/` directory in the current project if it doesn't already exist.
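
The bracket values named above (date, repo path, topic slug) combine into the debrief path that the handoff prompt uses, `[REPO_PATH]/docs/chess/[YYYY-MM-DD]-[topic-slug].md`. Here is a minimal Python sketch of how they might be derived. The helper names `topic_slug`, `debrief_path`, and `ensure_chess_dir` are hypothetical, and the slug rule (lowercase, alphanumeric runs joined by hyphens) is an assumption inferred from the example slugs, not something the command file defines.

```python
import re
from datetime import date
from pathlib import Path


def topic_slug(topic: str) -> str:
    """Lowercase the topic and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", topic.lower()))


def debrief_path(repo_path: Path, topic: str, today: date) -> Path:
    """Build [REPO_PATH]/docs/chess/[YYYY-MM-DD]-[topic-slug].md."""
    filename = f"{today.isoformat()}-{topic_slug(topic)}.md"
    return repo_path / "docs" / "chess" / filename


def ensure_chess_dir(repo_path: Path) -> Path:
    """Create docs/chess/ under the repo if it does not already exist."""
    chess_dir = repo_path / "docs" / "chess"
    chess_dir.mkdir(parents=True, exist_ok=True)
    return chess_dir
```

Passing `Path.cwd()` as `repo_path` would match the instruction to use the current working directory. Taking `today` as a parameter instead of reading the clock inside the function keeps the path reproducible.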
+
+Present the completed handoff prompt to the user in a clearly labeled block:
+
+> **Open a new terminal window and paste this entire block:**
+
+---
+
+#### Handoff prompt (embed verbatim — this is what runs in the parallel session)
+
+---
+
+You are running a chess-style adversarial analysis. Intake is complete — all context is below. Your job is to think several moves ahead, model each adversary's rational behavior, and produce a debrief the user can act on. Do not ask questions. Do not invoke any slash commands. Run straight through.
+
+Run on Opus 4.6 (`claude-opus-4-6`). Do not switch models.
+
+---
+
+**SITUATION**
+[Paste situation summary]
+
+**OBJECTIVE**
+[What the user is trying to achieve]
+
+**TIME HORIZON**
+[One-time / ongoing / deadline: YYYY-MM-DD]
+
+**YOUR POSITION**
+[Standing (stronger/equal/weaker), BATNA, non-negotiables, information held back, what adversary likely assumes]
+
+**ADVERSARIES**
+[For each: name/role, primary goal, secondary interests, motivations, aggression level, BATNA, constraints, what they know about the user]
+
+**THE BOARD**
+[Current state, moves already made or said, moves under consideration, definition of win / acceptable / loss]
+
+---
+
+**CHESS REASONING FRAMEWORK**
+
+Surface your reasoning in chat as you go — the user is watching in real time.
+
+**Step 1 — Inhabit each adversary**
+
+For each adversary, build their internal model: what do they see, what do they want most, what are they afraid of? Then for each move the user is considering, think through what that adversary would rationally do in response.
+
+Stay strictly within rational human motivation. Model what a reasonable person in their position — with their specific incentives, constraints, and information — would actually do. No mind-reading. No irrational escalations. No assuming they'll cooperate without reason. Where your assumptions about them are weak, flag it explicitly.
+
+**Step 2 — Trace the branches**
+
+For each move the user is considering, trace the likely sequence of responses 3–4 moves deep, or until the path reaches a clearly labeled terminal state. Prune branches that are implausible given the adversary's model — don't trace everything, trace what matters.
+
+Format each branch as:
+
+```
+Your move → Adversary response → Your counter → Their counter → Terminal: [favorable / acceptable / problematic]
+→ Risk branch: if [X] instead → Terminal: [recovery cost]
+```
+
+**Step 3 — Stress the assumptions**
+
+For each branch, name the assumptions it rests on. Which are well-grounded? Which are guesses? Where is the analysis most fragile? Flag any adversary behavior you're most uncertain about.
+
+**Step 4 — Identify leverage**
+
+Where in the move tree can you shift a terminal state? What moves create future options vs. close them down? What piece of information — if you had it — would change the picture most?
+
+**Step 5 — Recommended line**
+
+State the recommended opening move clearly. Explain why it scores better than the alternatives across the terminal states. Include the top 2 contingency responses for the most likely adversary deviation from the expected path.
+
+Be specific: *"Open with X. If they respond with Y, do Z. If they respond with W instead, do V."*
+
+---
+
+**OUTPUT**
+
+When the analysis is complete:
+
+1. Write the full debrief to: `[REPO_PATH]/docs/chess/[YYYY-MM-DD]-[topic-slug].md`
+- Create the directory if it doesn't exist
+- Structure: Situation → Adversary models → Move tree → Recommended line → Contingencies → Assumption flags
+
+2. Display the full debrief in chat so the user can read through it.
+
+3. Display this block at the end — formatted exactly as shown:
+
+---
+**Return prompt — paste this into your original session:**
+
+> Chess debrief complete. Read the analysis at `[REPO_PATH]/docs/chess/[YYYY-MM-DD]-[topic-slug].md` and pull the recommended line and top contingencies into our working context so we can decide next steps.
+---
+
+4. Close with a clearly formatted terminal closer — display exactly this as the final output:
+
+```
+Chess session complete. You can close this window.
+(Do not run /end — the primary session handles closeout.)
+```
+
+---
+
+End of handoff prompt.
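
The branch format in Step 2 of the handoff prompt is regular enough to model as data. This is a minimal Python sketch, assuming hypothetical `Branch` and `RiskBranch` types. Only the three terminal labels and the arrow layout come from the prompt; everything else is illustrative and is not part of the command.

```python
from dataclasses import dataclass, field

# Terminal labels taken from the Step 2 branch format.
TERMINALS = ("favorable", "acceptable", "problematic")


@dataclass
class RiskBranch:
    deviation: str      # the "[X]" in "if [X] instead"
    recovery_cost: str  # free-text cost of recovering from the deviation


@dataclass
class Branch:
    moves: list[str]    # alternating moves: yours, theirs, yours, ...
    terminal: str       # must be one of TERMINALS
    risks: list[RiskBranch] = field(default_factory=list)

    def render(self) -> str:
        """Render the branch in the arrow-separated Step 2 layout."""
        if self.terminal not in TERMINALS:
            raise ValueError(f"unknown terminal state: {self.terminal}")
        lines = [" → ".join(self.moves) + f" → Terminal: {self.terminal}"]
        for risk in self.risks:
            lines.append(
                f"→ Risk branch: if {risk.deviation} instead "
                f"→ Terminal: {risk.recovery_cost}"
            )
        return "\n".join(lines)
```

Rejecting unknown terminal labels in `render` is a design choice of this sketch. It enforces the "clearly labeled terminal state" requirement at the point where a branch is written out.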
+
+---
+
+After presenting the handoff prompt to the user, say:
+
+> *"Open a new terminal, paste the block above, and watch the chess engine work. When it's done, use the return prompt to bring the debrief back into this session."*
+
+---
+
+## System Mode
+
+### Intake: Build the stress-test brief
+
+If the system, plan, and constraints are already established in the conversation, skip directly to the stress-test. Only ask for information that's genuinely missing.
+
+If context is thin, ask conversationally — not as a numbered list. Use follow-ups where the answer is thin. The goal is a complete picture before the stress-test runs.
+
+**The system**
+- What are you trying to build, fix, or change?
+- What does the current system look like? (components, dependencies, environment, constraints)
+- What's already been decided or built that the plan must work around?
+
+**The plan**
+- What's the sequence of moves you're planning?
+- What outcome are you trying to guarantee?
+- What's an acceptable failure mode vs. an unacceptable one?
+
+Once you have sufficient context, proceed directly — no need to read it back unless something is ambiguous.
+
+---
+
+### System stress-test framework
+
+Default: run inline on Sonnet. Escalate to Opus 4.6 inline (no parallel session) when the system is complex enough that a shallow pass would miss real risks — multiple interacting services, deep dependency chains, complex state machines. Surface your reasoning in chat as you go.
+
+**Step 1 — Map the assumptions**
+
+List every assumption embedded in the plan. These are the places where "this works if..." is implicit. Be exhaustive — surface assumptions about environment, dependencies, timing, state, permissions, behavior under failure. Name them all before attacking any.
+
+**Step 2 — Attack each vector**
+
+Work through the assumption list. For each one:
+- **State the risk:** what breaks if this assumption is wrong?
+- **Reason through the system's response:** how does the environment, dependency, or component actually behave in this case? Trace the actual mechanics, not the happy path.
+- **Verdict:**
+- ✅ — assumption holds, no change needed
+- ⚠️ → [specific mitigation] — risk is real but manageable; state the required change
+- ❌ — plan-breaker; must resolve before building
+
+Don't pad. If an assumption clearly holds, say so and move on. Depth where there's actual risk.
+
+**Step 3 — Surface what changed**
+
+List only the changes the stress-test produced. Not every risk — only the ones with a ⚠️ or ❌ verdict that require action. For each: what changes, and why.
+
+**Step 4 — Ready to build?**
+
+State clearly: the plan is sound as-is / the plan needs these N changes before it's sound. If changes are needed, offer to incorporate them and proceed.
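
Steps 2 through 4 describe a filter over per-assumption verdicts followed by a one-line readiness statement. This is a minimal Python sketch under stated assumptions: `Verdict`, `Finding`, `required_changes`, and `readiness` are hypothetical names, and listing plan-breakers before mitigations is an ordering chosen here, not one the command file prescribes.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    HOLDS = "✅"      # assumption holds, no change needed
    MITIGATE = "⚠️"   # real but manageable risk; a change is required
    BREAKER = "❌"    # plan-breaker; must resolve before building


@dataclass
class Finding:
    assumption: str
    risk: str
    verdict: Verdict
    change: str = ""  # required change; left empty when the assumption holds


def required_changes(findings: list[Finding]) -> list[Finding]:
    """Step 3: keep only the ⚠️ and ❌ findings, plan-breakers first."""
    actionable = [f for f in findings if f.verdict is not Verdict.HOLDS]
    # sorted() is stable, so findings keep their order within each group.
    return sorted(actionable, key=lambda f: f.verdict is not Verdict.BREAKER)


def readiness(findings: list[Finding]) -> str:
    """Step 4: state whether the plan is sound or how many changes it needs."""
    n = len(required_changes(findings))
    if n == 0:
        return "The plan is sound as-is."
    noun = "change" if n == 1 else "changes"
    return f"The plan needs {n} {noun} before it's sound."
```

Keeping ✅ findings in the input but out of `required_changes` mirrors the instruction to list only the changes the stress-test produced.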
package/commands/plan.md
CHANGED
@@ -6,13 +6,6 @@ Read all relevant files first. Then determine:
 - **Is this straightforward?** (One obvious approach, clear requirements) → Skip to Phase 2.
 - **Are there meaningful tradeoffs?** (Multiple approaches, architectural choices, unclear requirements) → Brainstorm first:
 - Present 2-3 approaches with pros/cons in short, scannable sections
-- **For high-stakes decisions** (architecture choices, irreversible actions, multi-system impact, high cost-of-being-wrong): run a **branch trace** — trace each option forward 2-3 moves to its terminal state. Compare endpoints, not just opening positions:
-```
-Option A → requires X → leads to Y → Terminal: [favorable / acceptable / problematic]
-→ Risk branch: if Z happens instead → Terminal: [recovery cost]
-Option B → ...
-```
-Max 3 branches, max 3 moves deep. Label each terminal state. Recommend based on where each path actually *lands*, not just how it starts.
 - Flag which approach you'd recommend and why
 - Wait for approval of a direction before proceeding
 - Save the decision to `docs/decisions/YYYY-MM-DD-<topic>.md` if it's non-trivial