npm - sparkecoder - Versions diffs - 0.1.113 → 0.1.115 - Mend

sparkecoder 0.1.113 → 0.1.115

Files changed (106)

package/dist/skills/default/computer-use.md CHANGED

@@ -6,9 +6,36 @@ platforms: ["darwin"]
 # Computer Use Skill
-Anthropic's `computer` tool gives Claude direct control of the actual macOS desktop. You take screenshots, then click/type/scroll at pixel coordinates anywhere on screen — Finder, Safari, native apps, anything visible. This is real desktop automation, not a sandboxed browser.
+Anthropic's `computer` tool gives Claude direct control of the actual macOS desktop — take screenshots, click/type/scroll at pixel coordinates, drive any app. Real desktop automation.
-For most browser work, **prefer `agent-browser` with refs from `snapshot -i`** (see the Browser Automation skill). Refs are deterministic, much cheaper in tokens, and don't depend on visual coordinates that shift between screenshots. Reach for computer use when the task involves a native macOS app, a non-browser GUI, complex visual verification, or cross-app workflows.
+## ⚠️ Computer use is the LAST RESORT — read this first
+Computer use is the slowest, most token-expensive, most error-prone tool you have. Almost every task you're tempted to use it for has a better alternative:
+| If you're trying to … | Use this instead |
+|---|---|
+| Read / write / search / edit files | `read_file`, `write_file`, `bash` (`grep`, `sed`, `cat`, etc.) |
+| Run a command, build, test, script | `bash` |
+| Use git / npm / brew / any CLI | `bash` |
+| **Anything in a web browser** | **`agent-browser` skill (`load_skill browser`)** — refs from `snapshot -i` are deterministic and ~100× cheaper in tokens than pixel-coordinate clicks |
+| Look up info on the web | `web_fetch` / `web_search` |
+| Call an HTTP API | `bash` (`curl`) |
+| Render an HTML page, screenshot it, scrape it | `agent-browser` (`load_skill browser`) |
+| Drive a Slack/Linear/Jira/etc. workflow | The integration's API via `bash` + `curl`, OR an MCP server (`load_skill manage-mcp`) |
+**Decision tree**:
+1. Is there a CLI for this? → `bash`.
+2. Is it in a web browser? → `load_skill browser`, then drive `agent-browser` via refs.
+3. Is it something visible only in a native macOS GUI app with no CLI / API / accessibility shortcut? → **only now** consider computer use.
+Computer use is appropriate for:
+- Native macOS apps with no CLI (e.g. driving System Settings, Calculator, Finder for tasks that can't be done via `defaults`/`open`/`launchctl`).
+- Complex visual verification ("does this app look right?").
+- Cross-app workflows that genuinely require the desktop (drag a file from app A to app B's window).
+- Demo-style tasks where the user wants to *see* the screen action.
+Computer use is **NOT** appropriate for browser work — `agent-browser` snapshots + refs are deterministic, don't depend on pixel coordinates that shift between screenshots, work cross-platform (no macOS dependency), don't need Accessibility/Screen Recording permissions, and use a fraction of the tokens. Always reach for the browser skill first.
 ## Requirements (macOS only)
@@ -125,6 +152,47 @@ Coordinates are in the **logical points** of the primary display — same coordi
 `enable_computer_use` auto-detects the size and stores it. Always look at the most recent screenshot to find positions before clicking — windows move, apps quit, the desktop re-flows.
+## Record what you're doing (default ON)
+Computer-use sessions are *visual* — the user can't see the screen you're driving, only your text summary. **Record almost every computer-use task** so the user can replay it. Use the built-in `sparkecoder record` helper which manages start/stop properly so the recording covers the **entire task**, not just a fixed-time window:
+```bash
+# 1. At the START of the task, BEFORE your first `computer` action:
+REC=$(sparkecoder record start --name "calculator-demo")
+# REC is JSON: {"id":"rec-abc123","path":"~/recordings/rec-abc123-calculator-demo.mov","pid":12345}
+REC_ID=$(echo "$REC" | jq -r .id)
+REC_PATH=$(echo "$REC" | jq -r .path)
+echo "Recording started: $REC_PATH (id=$REC_ID)"
+# 2. ... do all your computer-use actions (open apps, click, type, etc.) ...
+# 3. At the END (success OR failure), stop the recording:
+sparkecoder record stop "$REC_ID"
+# → JSON: {"id":"...","path":"...","durationSec":42,"sizeMb":18.4,"ok":true}
+```
+Why use `sparkecoder record` instead of `screencapture -V 60` directly:
+- `-V <seconds>` is a fixed timeout — if your task takes longer than the guess, the recording ends mid-task and you get a partial. The helper has no timeout; it records until you explicitly stop it.
+- The helper tracks PIDs in state so `sparkecoder record stop-all` can clean up if something crashes.
+- Works on both macOS (screencapture) and Linux (ffmpeg x11grab) with the same command.
+- Returns the file path as JSON, so you can include it in your final result without guessing.
+Default behavior:
+- **Always record** short / visually interesting tasks (open an app, click around, drag/drop, fill a form, demos, "show me X working").
+- **Always announce** before starting: "Starting a screen recording — I'll send you the file when I'm done."
+- **Include the file path in your final summary** (and in your `outputSchema` if you're a worker) so the orchestrator can post the file back via Slack/whatever channel.
+Skip recording when:
+- The task is **long-running and boring** (e.g. "every 15 minutes for the next hour, click Refresh"). A 60-minute screen video at full resolution gets huge fast.
+- The screen contains **sensitive content** the user hasn't explicitly approved recording (1Password vaults, customer dashboards, banking, private DMs). Ask first.
+- The user explicitly says "no recording" or "just describe what you did."
+- The task is **purely keyboard-driven CLI work** that didn't need computer-use in the first place — that should be `bash`, not `computer`.
+The auto-stop flag `-V 180` (3 minutes) is a generous default that prevents runaway recordings if you crash or forget to kill the process. For longer tasks, bump it or use a separate `asciinema` recording for the terminal half.
 ## Best practices
 1. **Screenshot before AND after each action.** Computer use is blind without screenshots. After clicking, re-screenshot to confirm the click had the expected effect before doing the next thing.
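
The start/stop bracket in the "Record what you're doing" hunk above could be wrapped so that the stop call cannot be skipped. The following is a minimal TypeScript sketch under stated assumptions: `Recorder`, `RecordingHandle`, `RecordingSummary`, and `withRecording` are hypothetical names that do not come from sparkecoder; only the JSON field names (`id`, `path`, `durationSec`, `sizeMb`, `ok`) are taken from the diff.

```typescript
// Hypothetical shapes, modelled on the JSON printed by the commands in the diff.
interface RecordingHandle {
  id: string;
  path: string;
}

interface RecordingSummary extends RecordingHandle {
  durationSec: number;
  sizeMb: number;
  ok: boolean;
}

// Hypothetical recorder abstraction (an assumption, not a sparkecoder API).
interface Recorder {
  start(name: string): Promise<RecordingHandle>;
  stop(id: string): Promise<RecordingSummary>;
}

// Runs `task` between start and stop. The recorder is stopped exactly once,
// whether the task resolves or throws.
async function withRecording<T>(
  recorder: Recorder,
  name: string,
  task: (handle: RecordingHandle) => Promise<T>,
): Promise<{ result: T; recording: RecordingSummary }> {
  const handle = await recorder.start(name);
  let result: T;
  try {
    result = await task(handle);
  } catch (error) {
    // Failure path: stop the recording first, then surface the task error.
    await recorder.stop(handle.id);
    throw error;
  }
  // Success path: stop and return the summary alongside the task result.
  const recording = await recorder.stop(handle.id);
  return { result, recording };
}
```

A real `Recorder` might shell out to `sparkecoder record start` and `sparkecoder record stop <id>`; injecting it as a parameter keeps the bracket logic testable without a screen.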

package/dist/skills/default/recording.md ADDED

@@ -0,0 +1,196 @@
+---
+name: Recording
+description: Record terminal sessions (asciinema, anywhere) or screen video (macOS / Linux with X) while a task runs so the user can replay what happened. Useful for demos, debugging visual bugs, capturing computer-use sessions, and producing evidence the task actually did what it claims.
+platforms: ["darwin", "linux"]
+---
+# Recording Skill
+Sometimes a task is best explained by *showing* it. This skill teaches you to record a **terminal session** or **screen video** while you work, then hand the file to the user.
+## When to record
+Reach for this when:
+- The user explicitly asks for a recording / demo / video / proof.
+- You're using **computer use** to drive the desktop and the user will want to see what you did.
+- You're running a long-running command (build, test suite, deploy) and the user might want to scrub through it later.
+- You're debugging a flaky visual issue — recording lets you and the user re-watch frames.
+Skip recording when:
+- The task is purely textual (reading/writing files, code analysis). The chat transcript already has everything.
+- Recording would capture sensitive content (passwords on screen, private DMs, customer data). Ask first.
+## Screen recording — use `sparkecoder record` (preferred)
+The built-in helper handles start/stop reliably so the recording covers the WHOLE task, not a fixed time window. It tracks PIDs in state, works on macOS + Linux with the same interface, and returns JSON you can parse.
+```bash
+# Start: returns {id, path, pid}
+REC=$(sparkecoder record start --name "calculator-demo")
+REC_ID=$(echo "$REC" | jq -r .id)
+REC_PATH=$(echo "$REC" | jq -r .path)
+# ... do work ...
+# Stop: returns {id, path, durationSec, sizeMb, ok}
+sparkecoder record stop "$REC_ID"
+```
+Other operations:
+```bash
+sparkecoder record list           # list active recordings (auto-prunes dead pids)
+sparkecoder record stop-all       # stop everything (cleanup)
+sparkecoder record start --dir /custom/path --name my-task
+```
+Storage: defaults to `~/recordings/`. Files are named `rec-<id>[-<name>].<mov|mp4>`.
+### Why not `screencapture -V N` or raw `ffmpeg` backgrounded?
+You can — but you have to:
+- guess the right `-V` timeout (too short = partial, too long = wasted disk),
+- track the PID yourself for `kill -INT`,
+- pick the right tool per OS,
+- and clean up zombie processes if you crash mid-task.
+`sparkecoder record` does all of that. Reach for raw `screencapture` / `ffmpeg` only if you need specific flags the helper doesn't expose (custom display, rectangle crop, audio mux).
+## Terminal recording — asciinema
+`asciinema` records the **content of a terminal session** as a tiny `.cast` text file. It's a fraction of the size of video, plays in any browser via [asciinema-player](https://github.com/asciinema/asciinema-player), and the output is also greppable.
+### Install (one-time per machine)
+```bash
+# macOS
+brew install asciinema
+# Debian / Ubuntu
+sudo apt-get install -y asciinema
+# Other Linux
+pip install asciinema
+```
+### Record
+```bash
+# Start recording; everything you type and every command output goes into the cast.
+asciinema rec ~/recordings/demo-$(date +%s).cast
+# Run whatever you want to demo. Then exit the recording with Ctrl-D or `exit`.
+# Replay locally:
+asciinema play ~/recordings/demo-1700000000.cast
+# Upload to asciinema.org (public, optional):
+asciinema upload ~/recordings/demo-1700000000.cast
+```
+### Non-interactive recording (script-driven)
+If you (the agent) are running a sequence of commands and want to record without sitting in an interactive prompt:
+```bash
+# Record a single command's output:
+asciinema rec --command "bash -c 'npm test; echo exit=\$?'" out.cast
+# Or record into a file while a script runs:
+asciinema rec --idle-time-limit 2 --title "test run $(date)" out.cast \
+  --command "./scripts/run-the-thing.sh"
+```
+`--idle-time-limit 2` collapses dead-air pauses longer than 2s so the playback isn't boring.
+### Hand the file to the user
+When you're done:
+1. Move the `.cast` to a stable path under `~/recordings/` or the session workspace.
+2. Tell the user the path. If `agent-remote-server` is configured, you can upload via the storage API and give them a signed URL — see the bash example below.
+3. Mention they can either `asciinema play <file>` locally or open the URL.
+```bash
+# Optional: upload to GCS via the storage route if configured
+RECORDING=~/recordings/demo-1700000000.cast   # path of the cast file you already recorded
+curl -fsS -X POST "${SPARKECODER_REMOTE_URL}/storage/upload" \
+  -H "Authorization: Bearer ${SPARKECODER_AUTH_KEY}" \
+  -F "file=@${RECORDING}" | jq .url
+```
+## Advanced: raw `screencapture` / `ffmpeg` (use only when the helper isn't enough)
+Prefer `sparkecoder record` (above) for almost everything. Reach for raw tools only when you need flags the helper doesn't expose: rectangle crop, specific display, audio mux, custom frame rate.
+### Requirements
+- macOS: **Screen Recording permission** for the agent's parent process. `sparkecoder check-permissions` verifies; `sparkecoder request-permissions` opens the right System Settings pane.
+- Linux: `ffmpeg` installed (`sudo apt-get install -y ffmpeg`), and a running X server with the right `DISPLAY` env var.
+### macOS `screencapture` flags
+```bash
+# Crop to a rectangle (x,y,w,h):
+screencapture -v -R 100,100,800,600 -C ~/recordings/clip.mov
+# Pick a specific display (`-D 1` is the main display, `-D 2` the secondary, and so on; `-G` selects an audio source, not a display):
+screencapture -v -D 2 -C ~/recordings/clip.mov
+# Auto-stop after N seconds (avoid this for task-bounded recordings; use the
+# helper instead so the recording covers the whole task):
+screencapture -v -V 30 -C ~/recordings/clip.mov
+```
+### Linux `ffmpeg` x11grab
+```bash
+# Capture the X display (size matches the screen):
+ffmpeg -y -f x11grab -framerate 25 -video_size 1920x1080 -i :0.0 \
+  -c:v libx264 -preset veryfast -pix_fmt yuv420p \
+  ~/recordings/screen-$(date +%s).mp4
+# Stop with Ctrl-C or kill the ffmpeg pid.
+```
+On a headless server with no X, screen recording isn't meaningful — record the terminal with `asciinema` instead.
+## Quick recipes
+| You want to … | Use |
+|---|---|
+| Show what a computer-use task did on the desktop | `sparkecoder record start` → work → `record stop <id>` |
+| Capture a CLI demo, share a link | `asciinema rec → upload` |
+| Capture a single command's output verbatim | `asciinema rec --command "cmd" out.cast` |
+| Debug a flaky GUI test (any OS) | `sparkecoder record start` around the test run |
+| Both terminal AND screen at once | `sparkecoder record start` + `asciinema rec` in parallel |
+## Best practices
+1. **Always announce.** Before starting a recording, post a short message to the user: "Starting screen recording for the Calculator demo." That way they can intervene if it would capture something sensitive.
+2. **Use `sparkecoder record` for screen video.** It removes every footgun the raw tools have (no fixed timeout that cuts the task off mid-way, no PID juggling, no per-OS branching). Reach for raw `screencapture` / `ffmpeg` only when you need flags the helper doesn't expose (region crop, custom display, audio).
+3. **Bracket the WHOLE task.** `record start` *before* your first work command; `record stop <id>` *after* the very last one. Don't bracket each step individually — you'll lose context between clips.
+4. **Stop on exit, success OR failure.** Always run `record stop <id>` at the end of your task, even on errors. Otherwise the recorder keeps writing until `record stop-all` (or a reboot) catches it. The duration / size returned by `stop` is also useful in your final summary.
+5. **Don't record secrets.** Refuse to record if the screen contains visible passwords, 1Password vaults, customer PII, etc. Ask the user to close those windows first.
+6. **Clean up after yourself.** If a recording is no longer needed, `rm` it — videos are big.
+7. **Always mention the file in your final message.** "Recording saved to `<path>`" or, better, upload via `/storage/upload` and post the `/f/<id>` URL so Slack/etc. unfurls it inline.
+## Returning the recording in your final task summary
+For a **task worker** that captured a recording, include the structured fields below in your `result` so the orchestrator can route it back to the user:
+```ts
+{
+  "summary": "Opened Calculator, typed 2+2, hit Return. Got 4.",
+  "recording": {
+    "path": "/Users/me/recordings/rec-abc123-calculator-demo.mov",
+    "durationSec": 42,
+    "sizeMb": 18.4,
+    "description": "Full screen recording of the Calculator interaction"
+  }
+}
+```
+The orchestrator can then upload that path via `/storage/upload` and post the `/f/<id>` URL into Slack (or whatever channel the original request came in on) — Slack unfurls it inline as a video preview.
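
As an illustration of the result shape above, the JSON printed by `record stop` could be validated and mapped into the `recording` field before it is returned. This is a sketch only: `parseRecordStop`, `buildTaskResult`, and the two interfaces are hypothetical helpers, and the field names are assumed to match the sample output shown in the diff.

```typescript
// Field names follow the sample `record stop` output shown in the diff.
interface RecordStopResult {
  id: string;
  path: string;
  durationSec: number;
  sizeMb: number;
  ok: boolean;
}

// Mirrors the structured result example for a task worker.
interface TaskResult {
  summary: string;
  recording: {
    path: string;
    durationSec: number;
    sizeMb: number;
    description: string;
  };
}

// Parses stdout from the stop command and rejects anything that does not
// carry the expected fields with the expected primitive types.
function parseRecordStop(stdout: string): RecordStopResult {
  const data: unknown = JSON.parse(stdout);
  if (typeof data !== "object" || data === null) {
    throw new Error("record stop: expected a JSON object");
  }
  const { id, path, durationSec, sizeMb, ok } = data as Record<string, unknown>;
  if (
    typeof id !== "string" ||
    typeof path !== "string" ||
    typeof durationSec !== "number" ||
    typeof sizeMb !== "number" ||
    typeof ok !== "boolean"
  ) {
    throw new Error("record stop: missing or mistyped field");
  }
  return { id, path, durationSec, sizeMb, ok };
}

// Builds the worker result; `description` is free text supplied by the caller.
function buildTaskResult(
  summary: string,
  stop: RecordStopResult,
  description: string,
): TaskResult {
  return {
    summary,
    recording: {
      path: stop.path,
      durationSec: stop.durationSec,
      sizeMb: stop.sizeMb,
      description,
    },
  };
}
```

Validating before building means a malformed or empty stop output fails loudly instead of producing a result with an undefined path.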

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "sparkecoder",
-  "version": "0.1.113",
+  "version": "0.1.115",
   "description": "A powerful coding agent CLI with HTTP API for development environments",
   "type": "module",
   "main": "./dist/index.js",

package/src/skills/default/computer-use.md CHANGED

@@ -6,9 +6,36 @@ platforms: ["darwin"]
 # Computer Use Skill
-Anthropic's `computer` tool gives Claude direct control of the actual macOS desktop. You take screenshots, then click/type/scroll at pixel coordinates anywhere on screen — Finder, Safari, native apps, anything visible. This is real desktop automation, not a sandboxed browser.
+Anthropic's `computer` tool gives Claude direct control of the actual macOS desktop — take screenshots, click/type/scroll at pixel coordinates, drive any app. Real desktop automation.
-For most browser work, **prefer `agent-browser` with refs from `snapshot -i`** (see the Browser Automation skill). Refs are deterministic, much cheaper in tokens, and don't depend on visual coordinates that shift between screenshots. Reach for computer use when the task involves a native macOS app, a non-browser GUI, complex visual verification, or cross-app workflows.
+## ⚠️ Computer use is the LAST RESORT — read this first
+Computer use is the slowest, most token-expensive, most error-prone tool you have. Almost every task you're tempted to use it for has a better alternative:
+| If you're trying to … | Use this instead |
+|---|---|
+| Read / write / search / edit files | `read_file`, `write_file`, `bash` (`grep`, `sed`, `cat`, etc.) |
+| Run a command, build, test, script | `bash` |
+| Use git / npm / brew / any CLI | `bash` |
+| **Anything in a web browser** | **`agent-browser` skill (`load_skill browser`)** — refs from `snapshot -i` are deterministic and ~100× cheaper in tokens than pixel-coordinate clicks |
+| Look up info on the web | `web_fetch` / `web_search` |
+| Call an HTTP API | `bash` (`curl`) |
+| Render an HTML page, screenshot it, scrape it | `agent-browser` (`load_skill browser`) |
+| Drive a Slack/Linear/Jira/etc. workflow | The integration's API via `bash` + `curl`, OR an MCP server (`load_skill manage-mcp`) |
+**Decision tree**:
+1. Is there a CLI for this? → `bash`.
+2. Is it in a web browser? → `load_skill browser`, then drive `agent-browser` via refs.
+3. Is it something visible only in a native macOS GUI app with no CLI / API / accessibility shortcut? → **only now** consider computer use.
+Computer use is appropriate for:
+- Native macOS apps with no CLI (e.g. driving System Settings, Calculator, Finder for tasks that can't be done via `defaults`/`open`/`launchctl`).
+- Complex visual verification ("does this app look right?").
+- Cross-app workflows that genuinely require the desktop (drag a file from app A to app B's window).
+- Demo-style tasks where the user wants to *see* the screen action.
+Computer use is **NOT** appropriate for browser work — `agent-browser` snapshots + refs are deterministic, don't depend on pixel coordinates that shift between screenshots, work cross-platform (no macOS dependency), don't need Accessibility/Screen Recording permissions, and use a fraction of the tokens. Always reach for the browser skill first.
 ## Requirements (macOS only)
@@ -125,6 +152,47 @@ Coordinates are in the **logical points** of the primary display — same coordi
 `enable_computer_use` auto-detects the size and stores it. Always look at the most recent screenshot to find positions before clicking — windows move, apps quit, the desktop re-flows.
+## Record what you're doing (default ON)
+Computer-use sessions are *visual* — the user can't see the screen you're driving, only your text summary. **Record almost every computer-use task** so the user can replay it. Use the built-in `sparkecoder record` helper which manages start/stop properly so the recording covers the **entire task**, not just a fixed-time window:
+```bash
+# 1. At the START of the task, BEFORE your first `computer` action:
+REC=$(sparkecoder record start --name "calculator-demo")
+# REC is JSON: {"id":"rec-abc123","path":"~/recordings/rec-abc123-calculator-demo.mov","pid":12345}
+REC_ID=$(echo "$REC" | jq -r .id)
+REC_PATH=$(echo "$REC" | jq -r .path)
+echo "Recording started: $REC_PATH (id=$REC_ID)"
+# 2. ... do all your computer-use actions (open apps, click, type, etc.) ...
+# 3. At the END (success OR failure), stop the recording:
+sparkecoder record stop "$REC_ID"
+# → JSON: {"id":"...","path":"...","durationSec":42,"sizeMb":18.4,"ok":true}
+```
+Why use `sparkecoder record` instead of `screencapture -V 60` directly:
+- `-V <seconds>` is a fixed timeout — if your task takes longer than the guess, the recording ends mid-task and you get a partial. The helper has no timeout; it records until you explicitly stop it.
+- The helper tracks PIDs in state so `sparkecoder record stop-all` can clean up if something crashes.
+- Works on both macOS (screencapture) and Linux (ffmpeg x11grab) with the same command.
+- Returns the file path as JSON, so you can include it in your final result without guessing.
+Default behavior:
+- **Always record** short / visually interesting tasks (open an app, click around, drag/drop, fill a form, demos, "show me X working").
+- **Always announce** before starting: "Starting a screen recording — I'll send you the file when I'm done."
+- **Include the file path in your final summary** (and in your `outputSchema` if you're a worker) so the orchestrator can post the file back via Slack/whatever channel.
+Skip recording when:
+- The task is **long-running and boring** (e.g. "every 15 minutes for the next hour, click Refresh"). A 60-minute screen video at full resolution gets huge fast.
+- The screen contains **sensitive content** the user hasn't explicitly approved recording (1Password vaults, customer dashboards, banking, private DMs). Ask first.
+- The user explicitly says "no recording" or "just describe what you did."
+- The task is **purely keyboard-driven CLI work** that didn't need computer-use in the first place — that should be `bash`, not `computer`.
+The auto-stop flag `-V 180` (3 minutes) is a generous default that prevents runaway recordings if you crash or forget to kill the process. For longer tasks, bump it or use a separate `asciinema` recording for the terminal half.
 ## Best practices
 1. **Screenshot before AND after each action.** Computer use is blind without screenshots. After clicking, re-screenshot to confirm the click had the expected effect before doing the next thing.

package/src/skills/default/recording.md ADDED

@@ -0,0 +1,196 @@
+---
+name: Recording
+description: Record terminal sessions (asciinema, anywhere) or screen video (macOS / Linux with X) while a task runs so the user can replay what happened. Useful for demos, debugging visual bugs, capturing computer-use sessions, and producing evidence the task actually did what it claims.
+platforms: ["darwin", "linux"]
+---
+# Recording Skill
+Sometimes a task is best explained by *showing* it. This skill teaches you to record a **terminal session** or **screen video** while you work, then hand the file to the user.
+## When to record
+Reach for this when:
+- The user explicitly asks for a recording / demo / video / proof.
+- You're using **computer use** to drive the desktop and the user will want to see what you did.
+- You're running a long-running command (build, test suite, deploy) and the user might want to scrub through it later.
+- You're debugging a flaky visual issue — recording lets you and the user re-watch frames.
+Skip recording when:
+- The task is purely textual (reading/writing files, code analysis). The chat transcript already has everything.
+- Recording would capture sensitive content (passwords on screen, private DMs, customer data). Ask first.
+## Screen recording — use `sparkecoder record` (preferred)
+The built-in helper handles start/stop reliably so the recording covers the WHOLE task, not a fixed time window. It tracks PIDs in state, works on macOS + Linux with the same interface, and returns JSON you can parse.
+```bash
+# Start: returns {id, path, pid}
+REC=$(sparkecoder record start --name "calculator-demo")
+REC_ID=$(echo "$REC" | jq -r .id)
+REC_PATH=$(echo "$REC" | jq -r .path)
+# ... do work ...
+# Stop: returns {id, path, durationSec, sizeMb, ok}
+sparkecoder record stop "$REC_ID"
+```
+Other operations:
+```bash
+sparkecoder record list           # list active recordings (auto-prunes dead pids)
+sparkecoder record stop-all       # stop everything (cleanup)
+sparkecoder record start --dir /custom/path --name my-task
+```
+Storage: defaults to `~/recordings/`. Files are named `rec-<id>[-<name>].<mov|mp4>`.
+### Why not `screencapture -V N` or raw `ffmpeg` backgrounded?
+You can — but you have to:
+- guess the right `-V` timeout (too short = partial, too long = wasted disk),
+- track the PID yourself for `kill -INT`,
+- pick the right tool per OS,
+- and clean up zombie processes if you crash mid-task.
+`sparkecoder record` does all of that. Reach for raw `screencapture` / `ffmpeg` only if you need specific flags the helper doesn't expose (custom display, rectangle crop, audio mux).
+## Terminal recording — asciinema
+`asciinema` records the **content of a terminal session** as a tiny `.cast` text file. It's a fraction of the size of video, plays in any browser via [asciinema-player](https://github.com/asciinema/asciinema-player), and the output is also greppable.
+### Install (one-time per machine)
+```bash
+# macOS
+brew install asciinema
+# Debian / Ubuntu
+sudo apt-get install -y asciinema
+# Other Linux
+pip install asciinema
+```
+### Record
+```bash
+# Start recording; everything you type and every command output goes into the cast.
+asciinema rec ~/recordings/demo-$(date +%s).cast
+# Run whatever you want to demo. Then exit the recording with Ctrl-D or `exit`.
+# Replay locally:
+asciinema play ~/recordings/demo-1700000000.cast
+# Upload to asciinema.org (public, optional):
+asciinema upload ~/recordings/demo-1700000000.cast
+```
+### Non-interactive recording (script-driven)
+If you (the agent) are running a sequence of commands and want to record without sitting in an interactive prompt:
+```bash
+# Record a single command's output:
+asciinema rec --command "bash -c 'npm test; echo exit=\$?'" out.cast
+# Or record into a file while a script runs:
+asciinema rec --idle-time-limit 2 --title "test run $(date)" out.cast \
+  --command "./scripts/run-the-thing.sh"
+```
+`--idle-time-limit 2` collapses dead-air pauses longer than 2s so the playback isn't boring.
+### Hand the file to the user
+When you're done:
+1. Move the `.cast` to a stable path under `~/recordings/` or the session workspace.
+2. Tell the user the path. If `agent-remote-server` is configured, you can upload via the storage API and give them a signed URL — see the bash example below.
+3. Mention they can either `asciinema play <file>` locally or open the URL.
+```bash
+# Optional: upload to GCS via the storage route if configured
+RECORDING=~/recordings/demo-1700000000.cast   # path of the cast file you already recorded
+curl -fsS -X POST "${SPARKECODER_REMOTE_URL}/storage/upload" \
+  -H "Authorization: Bearer ${SPARKECODER_AUTH_KEY}" \
+  -F "file=@${RECORDING}" | jq .url
+```
+## Advanced: raw `screencapture` / `ffmpeg` (use only when the helper isn't enough)
+Prefer `sparkecoder record` (above) for almost everything. Reach for raw tools only when you need flags the helper doesn't expose: rectangle crop, specific display, audio mux, custom frame rate.
+### Requirements
+- macOS: **Screen Recording permission** for the agent's parent process. `sparkecoder check-permissions` verifies; `sparkecoder request-permissions` opens the right System Settings pane.
+- Linux: `ffmpeg` installed (`sudo apt-get install -y ffmpeg`), and a running X server with the right `DISPLAY` env var.
+### macOS `screencapture` flags
+```bash
+# Crop to a rectangle (x,y,w,h):
+screencapture -v -R 100,100,800,600 -C ~/recordings/clip.mov
+# Pick a specific display (`-D 1` is the main display, `-D 2` the secondary, and so on; `-G` selects an audio source, not a display):
+screencapture -v -D 2 -C ~/recordings/clip.mov
+# Auto-stop after N seconds (avoid this for task-bounded recordings; use the
+# helper instead so the recording covers the whole task):
+screencapture -v -V 30 -C ~/recordings/clip.mov
+```
+### Linux `ffmpeg` x11grab
+```bash
+# Capture the X display (size matches the screen):
+ffmpeg -y -f x11grab -framerate 25 -video_size 1920x1080 -i :0.0 \
+  -c:v libx264 -preset veryfast -pix_fmt yuv420p \
+  ~/recordings/screen-$(date +%s).mp4
+# Stop with Ctrl-C or kill the ffmpeg pid.
+```
+On a headless server with no X, screen recording isn't meaningful — record the terminal with `asciinema` instead.
+## Quick recipes
+| You want to … | Use |
+|---|---|
+| Show what a computer-use task did on the desktop | `sparkecoder record start` → work → `record stop <id>` |
+| Capture a CLI demo, share a link | `asciinema rec → upload` |
+| Capture a single command's output verbatim | `asciinema rec --command "cmd" out.cast` |
+| Debug a flaky GUI test (any OS) | `sparkecoder record start` around the test run |
+| Both terminal AND screen at once | `sparkecoder record start` + `asciinema rec` in parallel |
+## Best practices
+1. **Always announce.** Before starting a recording, post a short message to the user: "Starting screen recording for the Calculator demo." That way they can intervene if it would capture something sensitive.
+2. **Use `sparkecoder record` for screen video.** It removes every footgun the raw tools have (no fixed timeout that cuts the task off mid-way, no PID juggling, no per-OS branching). Reach for raw `screencapture` / `ffmpeg` only when you need flags the helper doesn't expose (region crop, custom display, audio).
+3. **Bracket the WHOLE task.** `record start` *before* your first work command; `record stop <id>` *after* the very last one. Don't bracket each step individually — you'll lose context between clips.
+4. **Stop on exit, success OR failure.** Always run `record stop <id>` at the end of your task, even on errors. Otherwise the recorder keeps writing until `record stop-all` (or a reboot) catches it. The duration / size returned by `stop` is also useful in your final summary.
+5. **Don't record secrets.** Refuse to record if the screen contains visible passwords, 1Password vaults, customer PII, etc. Ask the user to close those windows first.
+6. **Clean up after yourself.** If a recording is no longer needed, `rm` it — videos are big.
+7. **Always mention the file in your final message.** "Recording saved to `<path>`" or, better, upload via `/storage/upload` and post the `/f/<id>` URL so Slack/etc. unfurls it inline.
+## Returning the recording in your final task summary
+For a **task worker** that captured a recording, include the structured fields below in your `result` so the orchestrator can route it back to the user:
+```ts
+{
+  "summary": "Opened Calculator, typed 2+2, hit Return. Got 4.",
+  "recording": {
+    "path": "/Users/me/recordings/rec-abc123-calculator-demo.mov",
+    "durationSec": 42,
+    "sizeMb": 18.4,
+    "description": "Full screen recording of the Calculator interaction"
+  }
+}
+```
+The orchestrator can then upload that path via `/storage/upload` and post the `/f/<id>` URL into Slack (or whatever channel the original request came in on) — Slack unfurls it inline as a video preview.

package/web/.next/BUILD_ID CHANGED

@@ -1 +1 @@
-mC7Yp8aJpCL22mrYVzIEG
+pSGgtxBjN5_o0YHZgQzXm

package/web/.next/standalone/web/.next/BUILD_ID CHANGED

@@ -1 +1 @@
-mC7Yp8aJpCL22mrYVzIEG
+pSGgtxBjN5_o0YHZgQzXm

package/web/.next/standalone/web/.next/build-manifest.json CHANGED

@@ -7,8 +7,8 @@
     "static/chunks/a6dad97d9634a72d.js"
   ],
   "lowPriorityFiles": [
-    "static/mC7Yp8aJpCL22mrYVzIEG/_ssgManifest.js",
-    "static/mC7Yp8aJpCL22mrYVzIEG/_buildManifest.js"
+    "static/pSGgtxBjN5_o0YHZgQzXm/_ssgManifest.js",
+    "static/pSGgtxBjN5_o0YHZgQzXm/_buildManifest.js"
   ],
   "rootMainFiles": [
     "static/chunks/651e187cc15d66de.js",

package/web/.next/standalone/web/.next/prerender-manifest.json CHANGED

@@ -367,8 +367,8 @@
   "dynamicRoutes": {},
   "notFoundRoutes": [],
   "preview": {
-    "previewModeId": "79c2ff3b3e0e05351c87d5ce707e28a6",
-    "previewModeSigningKey": "c1e0783b8b9f89caf7a9a27d7b457e7b897d5ad98f3e75e27d39ed1ae0f2078e",
-    "previewModeEncryptionKey": "8bb1e0f83531b92db30d127bc55ce5b3ab0ad1a4ac2728cda33424894dec3a1b"
+    "previewModeId": "4d4df0b08ac51f255c1fba959eff551a",
+    "previewModeSigningKey": "61a87433a4783d0b199c61d12ecf05a6503595b50fb2863e5e86ea9db1544061",
+    "previewModeEncryptionKey": "b4d0ec3479d75253ad48871374201d5a399ead8edb8cc23a194a5b2375eddfa9"
   }
 }