@runtypelabs/cli 1.9.4 → 2.0.1

package/README.md CHANGED
@@ -1,4 +1,5 @@
  <p align="center" style="background:white;border-radius:4px;padding: 12px;margin-bottom:16px;">
+ <br>
  <img
  src="https://www.runtype.com/runtype-text-only.svg"
  alt="Runtype: The Intelligent Product Company"
@@ -6,6 +7,10 @@
  />
  </p>

+ 👋 Hey there! We're here to help you build intelligent products.
+
+ You are looking at...
+
  # The Runtype CLI

  This is our command-line interface for the platform, which includes _Marathon_, our harness for long-running tasks and deep workflow analysis.
@@ -120,6 +125,144 @@ runtype marathon "Code Builder" --goal "Build it" --no-runner --no-finish
  | `--no-runner` | | Hide the runner emoji from the header border |
  | `--no-finish` | | Hide the finish line emoji from the header border |

+ #### Custom Playbooks
+
+ Playbooks let you define custom workflows with milestones, model overrides, verification settings, and rules in a single YAML file.
+
+ ```bash
+ # Run with a custom playbook
+ runtype marathon "Minimal design library" \
+   --playbook design-library \
+   --goal "Create a clean, modern component library with a blue/slate palette" \
+   --sandbox daytona
+
+ # Playbooks are loaded from .runtype/marathons/playbooks/ (first checks the folder the CLI is run from, then ~/.runtype/)
+ ```
+
+ From the folder where you want to run the CLI, create the playbook with:
+
+ ```bash
+ mkdir -p .runtype/marathons/playbooks
+
+ cat > .runtype/marathons/playbooks/design-library.yaml <<'EOF'
+ name: design-library
+ description: Generate a styled component library (headers, buttons, forms) and deploy a live preview
+
+ rules: |
+   IMPORTANT: Before doing anything else, create a new directory for this project
+   (e.g. "design-library/" or a name derived from the goal). All files must be
+   created inside that directory — do NOT write files in the current working directory.
+   Use vanilla HTML + CSS only — no frameworks, no build tools.
+   Every component must be responsive and accessible (aria labels, focus states).
+   Use CSS custom properties for all colors, spacing, and typography so the
+   entire theme can be changed by editing a single :root block.
+   The final index.html should be a showcase page displaying every component variant.
+
+ milestones:
+   - name: research
+     description: Analyze the style direction and plan the component set
+     model: gpt-5-mini
+     instructions: |
+       FIRST: Create a new project directory (e.g. "design-library/" or a name
+       derived from the goal). All files for this project go inside that directory.
+       Do NOT look at or use any existing files in the current working directory.
+
+       Then, read the goal to understand the desired visual style.
+       Decide on a color palette, typography scale, and spacing system.
+       List the exact components to build:
+       - Header: full-width navbar with logo area, nav links, and mobile menu
+       - Buttons: primary, secondary, outline, ghost, destructive — in sm/md/lg sizes
+       - Forms: text input, textarea, select, checkbox, radio, toggle — with labels and validation states
+       Write design-tokens.md inside the project directory capturing your decisions.
+     completionCriteria:
+       type: evidence
+       minReadFiles: 1
+
+   - name: build
+     description: Create the HTML, CSS, and showcase page
+     model: claude-sonnet-4-6
+     instructions: |
+       Create the project files:
+       1. styles/tokens.css — CSS custom properties for the entire theme
+       2. styles/components.css — all component styles using the tokens
+       3. index.html — showcase page that renders every component and variant
+       Make sure:
+       - The showcase page has sections for Headers, Buttons, and Forms
+       - Each section shows all variants side by side
+       - The page looks good at mobile (375px) and desktop (1200px) widths
+       - All interactive elements have hover/focus/active states
+       If a sandbox is available, use `deploy_sandbox` to deploy the showcase
+       as a live preview so the user can see it in their browser.
+     completionCriteria:
+       type: sessions
+       minSessions: 1
+
+   - name: polish
+     description: Refine details, test responsiveness, and deploy
+     model: gemini-3-flash
+     instructions: |
+       Review the showcase page in the browser preview.
+       Fix any visual issues, alignment problems, or missing states.
+       Ensure the page title and meta description reflect the library name.
+       Verify the deployed preview URL is accessible.
+     completionCriteria:
+       type: never
+     canAcceptCompletion: true
+ EOF
+ ```
+
+ **Search order**: Exact path → `.runtype/marathons/playbooks/<name>.yaml|yml|json` (repo) → `~/.runtype/marathons/playbooks/<name>.yaml|yml|json` (user).
+
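The search order above can be sketched as a small shell function. This is an illustration only, not the CLI's actual resolver: the function name `resolve_playbook` is hypothetical, and trying the extensions in the order `yaml`, `yml`, `json` is an assumption based on the order in which they are listed.

```bash
# Hypothetical helper that mirrors the documented search order.
# It is not part of the runtype CLI.
resolve_playbook() {
  local name="$1" dir ext

  # 1. Exact path
  if [ -f "$name" ]; then
    printf '%s\n' "$name"
    return 0
  fi

  # 2. Repo-level directory, then 3. user-level directory
  for dir in ".runtype/marathons/playbooks" "$HOME/.runtype/marathons/playbooks"; do
    # Assumed extension order: yaml, yml, json
    for ext in yaml yml json; do
      if [ -f "$dir/$name.$ext" ]; then
        printf '%s\n' "$dir/$name.$ext"
        return 0
      fi
    done
  done

  # Nothing matched
  return 1
}
```

Calling `resolve_playbook design-library` from a project folder would print the first matching file, or return a non-zero status if none exists.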
+ **Completion criteria types**:
+
+ - `evidence` — advances when enough files have been read (`minReadFiles`)
+ - `sessions` — advances after N sessions (`minSessions`)
+ - `planWritten` — advances when the agent writes its plan artifact
+ - `never` — only the agent's `TASK_COMPLETE` signal can advance (if `canAcceptCompletion: true`)
+
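To make the four criteria types concrete, here is a rough shell sketch of the advance decision. It is not the CLI's implementation: `should_advance` and its positional arguments are hypothetical, and it assumes `minReadFiles` and `minSessions` act as simple greater-or-equal thresholds.

```bash
# Hypothetical: should_advance TYPE THRESHOLD FILES_READ SESSIONS PLAN_WRITTEN
# Returns 0 when the milestone would advance automatically, 1 when it would not,
# and 2 for an unrecognized criteria type.
should_advance() {
  local type="$1" threshold="$2" files_read="$3" sessions="$4" plan_written="$5"
  case "$type" in
    evidence)    [ "$files_read" -ge "$threshold" ] ;;   # assumed: files read >= minReadFiles
    sessions)    [ "$sessions" -ge "$threshold" ] ;;     # assumed: sessions >= minSessions
    planWritten) [ "$plan_written" = "true" ] ;;
    never)       return 1 ;;  # only the agent's TASK_COMPLETE signal advances; not modeled here
    *)           echo "unknown criteria type: $type" >&2; return 2 ;;
  esac
}
```

For example, `should_advance evidence 1 3 0 false` succeeds because three files read meets a threshold of one, while `should_advance never 0 3 5 true` never succeeds on its own.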
+ #### Marathon Anatomy
+
+ ```
+ ┌─ marathon ──────────────────────────────────────────────────────┐
+ │                                                                 │
+ │  ┌─ playbook (optional) ─────────────────────────────┐          │
+ │  │  Defines milestones, models, verification, rules  │          │
+ │  │  .runtype/marathons/playbooks/tdd.yaml            │          │
+ │  └───────────────────────────────────────────────────┘          │
+ │                         │                                       │
+ │                         ▼                                       │
+ │  ┌─ milestone 1 ──┐  ┌─ milestone 2 ──┐  ┌─ milestone 3 ─────┐  │
+ │  │  research      │  │  test-design   │  │  execution        │  │
+ │  │                │  │                │  │                   │  │
+ │  │  run 1         │  │  run 3         │  │  run 5            │  │
+ │  │  checkpoint ───   │  checkpoint ───   │  checkpoint ───      │
+ │  │  run 2         │  │  run 4         │  │  run 6            │  │
+ │  │  checkpoint ───   │  checkpoint ──┘   │  checkpoint ──┘      │
+ │  │  (advances) ────┘ │                │  │                   │  │
+ │  └────────────────┘  └────────────────┘  └──── finish ───────┘  │
+ │                                                                 │
+ │  ┌─ rules (optional) ────────────────────────────────┐          │
+ │  │  Repo-wide standards applied to ALL milestones    │          │
+ │  │  .runtype/marathons/rules/*.md                    │          │
+ │  └───────────────────────────────────────────────────┘          │
+ └─────────────────────────────────────────────────────────────────┘
+
+ Vocabulary:
+   marathon     The overall autonomous task runner
+   playbook     Custom workflow definition
+                (optional, default: research→planning→execution)
+   milestones   Behavioral phases the agent works through
+   runs         Individual agent sessions within a milestone
+   checkpoint   Pause between runs for human review/steering
+   finish       Task completion
+
+ What's optional:
+   ✓ Playbook       Without one, uses default workflow (research→planning→execution)
+   ✓ Rules          Without them, agent follows only playbook/milestone instructions
+   ✓ Models         Without overrides, uses CLI --model flag or default
+   ✓ Verification   Without it, no verification gate between milestones
+ ```
+
  #### Tool Context Modes

  When a marathon runs multiple sessions, tool call/result pairs from previous sessions are preserved in the conversation history. The `--tool-context` flag controls how older tool results are stored to balance cost and re-readability: