npm - claudeboss - Versions diffs - 0.1.0 → 0.2.0 - Mend

claudeboss 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12) hide show

package/CHANGELOG.md +18 -0
package/README.md +77 -19
package/bin/claudeboss.js +43 -13
package/package.json +5 -3
package/scripts/install-opencode.sh +2 -16
package/scripts/run-task.js +113 -0
package/scripts/select-model.sh +24 -18
package/scripts/stats-report.js +61 -0
package/skills/boss/SKILL.md +7 -1
package/skills/plan/SKILL.md +33 -0
package/skills/team/SKILL.md +1 -1
package/scripts/claudeboss-run.sh +0 -30

package/CHANGELOG.md ADDED Viewed

@@ -0,0 +1,18 @@
+# Changelog
+All notable changes to ClaudeBoss are documented here.
+## 0.2.0
+- Added `/claudeboss:plan` mode, routing design/architecture questions to opencode's dedicated `plan` agent before any implementation is delegated.
+- Added `claudeboss stats` -- reports estimated Claude-context tokens saved across past runs, overall and by task type.
+- Added `claudeboss update` -- updates both the `claudeboss` npm package and the `opencode` CLI to their latest versions in one command.
+- Added `claudeboss --version` / `-v`.
+- Replaced the jq-based run pipeline with `scripts/run-task.js` (Node): drops the `jq` system dependency, and records a token-savings estimate per run.
+- Boss mode now reads a project's `.claudeboss.yml` `deploy:` key for the deployment command, asking only once per project if it's missing.
+- Added CI (`.github/workflows/ci.yml`): validates JSON manifests, syntax-checks all JS files, and lints all shell scripts.
+- Rewrote README for clarity and a measured (not invented) token-savings example.
+## 0.1.0
+- Initial release: `/claudeboss:team` and `/claudeboss:boss` modes, `claudeboss install`/`model`/`run` commands, free-model selection via live `opencode models` queries, manual (non-automated) OpenRouter fallback instructions.

package/README.md CHANGED Viewed

@@ -1,8 +1,24 @@
 # ClaudeBoss
-Delegate tasks to **free** opencode agents, straight from Claude Code — as a teammate you review, or as a worker you manage.
+**Cut Claude Code's own token consumption by up to 90% — delegate the heavy lifting to free opencode agents, keep Claude in the reviewer's or manager's seat.**
-Claude Code can spend a lot of its own context/tokens doing every step of a task itself. ClaudeBoss routes the actual work to [opencode](https://opencode.ai)'s free-tier models (OpenCode Zen, no login required), while Claude reviews the result, fixes what's wrong, and — in Boss mode — supervises deployment. You keep Claude's judgment; the free model does the token-heavy grinding.
+[![npm version](https://img.shields.io/npm/v/claudeboss.svg)](https://www.npmjs.com/package/claudeboss)
+[![license: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
+[![node](https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg)](package.json)
+Every step Claude Code takes itself — reading files, writing code, running commands, iterating — burns its own context window. ClaudeBoss routes that grinding work to [opencode](https://opencode.ai)'s **free-tier models** (OpenCode Zen, zero login required) and hands Claude back only the finished result. Claude still reviews, verifies, and decides — it just stops paying token-rate for every intermediate step.
+## The number, measured
+On a real multi-step coding task (write a function, save it, run it, verify the output) delegated through ClaudeBoss:
+| | Tokens (approx.) |
+|---|---|
+| Full opencode reasoning/tool-call trace | ~3,726 |
+| What Claude actually reads (final answer only) | ~44 |
+| **Reduction** | **99%** |
+That's one real, reproducible run — see `claudeboss stats` below to measure your own. We publish **"up to 90%"** as the honest, conservative baseline; plenty of real tasks land higher.
 ## Install (one command)
@@ -11,7 +27,7 @@ npm install -g claudeboss
 claudeboss install
 ```
-`claudeboss install` checks for the `opencode` CLI (installs it via npm if missing), confirms at least one free model is reachable, and prints manual OpenRouter fallback steps if not — **it never creates third-party accounts on your behalf.**
+`claudeboss install` checks for the `opencode` CLI (installs it via npm if missing) and confirms at least one free model is reachable. **It never creates third-party accounts on your behalf** — if OpenCode Zen's free tier is ever unavailable, you get exact manual steps for a free OpenRouter key instead of silent browser automation.
 ### Via the Claude Code plugin marketplace
@@ -20,39 +36,81 @@ claudeboss install
 /plugin install claudeboss@claudeboss
 ```
-Either install path gives you the same two skills inside Claude Code:
+## The three modes
-| Command | What it does |
-|---|---|
-| `/claudeboss:team` | You and opencode work as a pair. Opencode drafts, Claude reviews like a second engineer and implements/finishes it. |
-| `/claudeboss:boss` | Claude manages, doesn't code. Delegates to opencode, requests revisions (max 3 rounds), shows you a summary, waits for your approval, then supervises deployment end-to-end. |
+| Command | Role | What it does |
+|---|---|---|
+| `/claudeboss:plan` | Architect | Delegates a design/scoping question to opencode's dedicated `plan` agent before anything gets built. |
+| `/claudeboss:team` | Pair programmer | Opencode drafts, Claude reviews like a second engineer and finishes the implementation. Two failed drafts → Claude just does it. |
+| `/claudeboss:boss` | Manager | Claude doesn't write code. It delegates, reviews, requests revisions (max 3 rounds), shows you a summary, waits for your explicit approval, then supervises deployment end-to-end. |
-## Why this exists
+## Track your own savings
-Spawning a full Claude subagent for routine work still burns Claude's own context/tokens. Opencode's free tier (OpenCode Zen — DeepSeek V4 Flash Free, Nemotron 3 Ultra Free, North Mini Code Free, MiMo V2.5 Free, and more) does real work at zero API cost. ClaudeBoss:
+```bash
+claudeboss stats
+```
+```
+ClaudeBoss stats
+================
+Runs recorded:            1
+Est. tokens w/o delegation: ~3,726
+Est. tokens w/ delegation:  ~44
+Est. tokens saved:          ~3,682
+Avg. reduction per run:     ~99%
+```
-- **Picks the model live, per task.** `claudeboss model code` and `claudeboss model write` query `opencode models` at run time and rank by keyword heuristics (coder/code-tuned models for coding tasks, larger general models for prose/research). Free-tier catalogs change without notice — nothing here is hardcoded to a specific model ID.
-- **Extracts only the final answer.** Opencode's own reasoning/tool-call trace never reaches Claude's context — only the last text output does.
-- **Never auto-creates accounts.** If OpenCode Zen's free tier is ever unavailable, you get exact manual steps for a free OpenRouter key — no silent browser automation signing you up for a third-party service.
+Every `claudeboss run` records its estimated before/after token footprint locally (`~/.claudeboss/stats.json`) — nothing leaves your machine.
+## Stay up to date
+```bash
+claudeboss update
+```
+Updates both the `claudeboss` npm package and the `opencode` CLI to their latest versions in one command. Check your installed version any time with `claudeboss --version`. See [CHANGELOG.md](CHANGELOG.md) for release notes.
+## How the model gets picked
+`claudeboss model <task-type>` queries `opencode models` live and ranks by task type — coding tasks prefer code-tuned free models (OpenCode Zen's own first, since they've proven more reliable for multi-step agentic work; OpenRouter free tier as fallback), prose/research prefer larger general-reasoning free models. Nothing is hardcoded to a specific model ID — free-tier catalogs change without notice, so every run re-checks what's actually available.
+```bash
+claudeboss model code     # -> e.g. opencode/north-mini-code-free
+claudeboss model write    # -> e.g. opencode/nemotron-3-ultra-free
+```
 ## Manual CLI usage (outside Claude Code)
 ```bash
-claudeboss model code                          # -> best free model id for a coding task
 claudeboss run code "add input validation to the signup form"
+claudeboss run plan "should this feature be a new service or extend the existing API?"
 claudeboss run write "draft a 3-paragraph changelog entry for v0.2.0"
 ```
+## Boss mode + deployment
+Boss mode looks for a `.claudeboss.yml` in your project root to know how to deploy:
+```yaml
+deploy: ./scripts/deploy.sh
+```
+If it's missing, Claude asks once what "deploy" means for your project and suggests saving it there so future runs don't ask again.
 ## Requirements
-- Node.js >= 18 (for the npm install path)
-- `jq` (installed automatically via Homebrew/apt if missing, or install manually: https://jqlang.org/download/)
+- Node.js >= 18
 ## Limitations
-- Free models are weaker than Claude — don't expect them to solve hard/ambiguous problems unsupervised. Both modes cap revision loops (2 in team mode, 3 in boss mode) and escalate to you instead of grinding forever.
-- Boss mode's deployment step assumes you've already told Claude what "deploy" means for your project (push to a branch, run a script, SSH to a server, etc.) — it doesn't invent a deploy pipeline for you.
-- Free-tier model catalogs are provider-controlled and can shrink or disappear. `claudeboss install` re-checks live, every time.
+- Free models are weaker than Claude — they're not meant to solve hard, ambiguous problems unsupervised. Both modes cap revision loops (2 in team mode, 3 in boss mode) and escalate to you instead of grinding forever.
+- Runs use opencode's `--auto` flag to auto-approve tool permissions (file writes, bash execution) inside the task's working directory, since there's no human present during headless delegation to click "allow." This is what makes non-interactive delegation possible at all — know that opencode's worker model can read/write files and run commands unattended, scoped to wherever Claude invokes it.
+- Free-tier model catalogs are provider-controlled and can shrink or change without notice. `claudeboss install` and every `claudeboss run` re-check live.
+- Boss mode's deployment step assumes you've told Claude (or `.claudeboss.yml`) what "deploy" means for your project — it doesn't invent a deploy pipeline for you.
+## Contributing
+Issues and PRs welcome: https://github.com/xWael9/ClaudeBoss/issues
 ## License

package/bin/claudeboss.js CHANGED Viewed

@@ -1,34 +1,64 @@
 #!/usr/bin/env node
 const { spawnSync } = require('child_process');
 const path = require('path');
+const fs = require('fs');
 const SCRIPTS_DIR = path.join(__dirname, '..', 'scripts');
+const PKG = require('../package.json');
 const [, , cmd, ...rest] = process.argv;
-function run(script, args) {
+function runShell(script, args) {
   const res = spawnSync('bash', [path.join(SCRIPTS_DIR, script), ...args], { stdio: 'inherit' });
   process.exit(res.status === null ? 1 : res.status);
 }
+function runNode(script, args) {
+  const res = spawnSync(process.execPath, [path.join(SCRIPTS_DIR, script), ...args], { stdio: 'inherit' });
+  process.exit(res.status === null ? 1 : res.status);
+}
+function printHelp() {
+  console.log(`ClaudeBoss v${PKG.version} -- delegate tasks to free opencode agents from Claude Code.
+Usage:
+  claudeboss install                    Check/install opencode CLI, verify a free model is reachable
+  claudeboss model <task-type>          Print the best free model id for a task type (code|write|research|general|plan)
+  claudeboss run <task-type> "<task>"   Run a task through the selected free model, print only the final answer
+  claudeboss stats                      Show estimated token savings from past runs
+  claudeboss update                     Update claudeboss (npm) and opencode to their latest versions
+  claudeboss --version, -v              Print the installed ClaudeBoss version
+Claude Code usage: install this as a plugin, then use /claudeboss:team, /claudeboss:boss, or
+/claudeboss:plan inside a session.
+Repo: https://github.com/xWael9/ClaudeBoss`);
+}
 switch (cmd) {
   case 'install':
-    run('install-opencode.sh', rest);
+    runShell('install-opencode.sh', rest);
     break;
   case 'model':
-    run('select-model.sh', rest);
+    runShell('select-model.sh', rest);
     break;
   case 'run':
-    run('claudeboss-run.sh', rest);
+    runNode('run-task.js', rest);
+    break;
+  case 'stats':
+    runNode('stats-report.js', rest);
+    break;
+  case 'update': {
+    console.log('Updating claudeboss...');
+    const npmRes = spawnSync('npm', ['install', '-g', 'claudeboss@latest'], { stdio: 'inherit' });
+    console.log('');
+    console.log('Updating opencode...');
+    const ocRes = spawnSync('opencode', ['upgrade'], { stdio: 'inherit' });
+    process.exit((npmRes.status || 0) || (ocRes.status || 0));
+  }
+  case '--version':
+  case '-v':
+    console.log(PKG.version);
     break;
   default:
-    console.log(`ClaudeBoss -- delegate tasks to free opencode agents from Claude Code.
-Usage:
-  claudeboss install                    Check/install opencode CLI, verify a free model is reachable
-  claudeboss model <task-type>          Print the best free model id for a task type (code|write|research|general)
-  claudeboss run <task-type> "<task>"   Run a task through the selected free model, print only the final answer
-Claude Code usage: install this as a plugin, then use /claudeboss:team or /claudeboss:boss inside a session.
-Repo: https://github.com/xWael9/ClaudeBoss`);
+    printHelp();
     process.exit(cmd ? 1 : 0);
 }

package/package.json CHANGED Viewed

@@ -1,7 +1,7 @@
 {
   "name": "claudeboss",
-  "version": "0.1.0",
-  "description": "Delegate tasks to free opencode agents and supervise them from Claude Code -- Team mode (pair review) or Boss mode (manage, approve, deploy).",
+  "version": "0.2.0",
+  "description": "Cut Claude Code's own token consumption by delegating routine work to free opencode agents -- Team mode (pair review), Boss mode (manage, approve, deploy), or Plan mode (design first).",
   "bin": {
     "claudeboss": "bin/claudeboss.js"
   },
@@ -11,6 +11,7 @@
     "skills",
     ".claude-plugin",
     "README.md",
+    "CHANGELOG.md",
     "LICENSE"
   ],
   "keywords": [
@@ -19,7 +20,8 @@
     "ai-agent",
     "plugin",
     "delegation",
-    "free-llm"
+    "free-llm",
+    "token-savings"
   ],
   "author": "xWael9",
   "license": "MIT",

package/scripts/install-opencode.sh CHANGED Viewed

@@ -1,24 +1,10 @@
 #!/usr/bin/env bash
-# ClaudeBoss setup: ensure opencode CLI + jq are present, verify a free model is reachable.
+# ClaudeBoss setup: ensure opencode CLI is present, verify a free model is reachable.
 # Never creates third-party accounts automatically -- prints manual steps instead.
 set -uo pipefail
 echo "== ClaudeBoss: opencode setup =="
-if ! command -v jq >/dev/null 2>&1; then
-  echo "jq not found (needed to parse opencode's JSON output)." >&2
-  if command -v brew >/dev/null 2>&1; then
-    echo "Installing jq via Homebrew..."
-    brew install jq
-  elif command -v apt-get >/dev/null 2>&1; then
-    echo "Installing jq via apt-get..."
-    sudo apt-get update -y && sudo apt-get install -y jq
-  else
-    echo "Install jq manually: https://jqlang.org/download/" >&2
-    exit 1
-  fi
-fi
 if command -v opencode >/dev/null 2>&1; then
   echo "opencode CLI found: $(opencode --version 2>/dev/null)"
 else
@@ -43,7 +29,7 @@ if [ -n "$FREE_MODELS" ]; then
   echo "Free models available (no login needed):"
   echo "$FREE_MODELS" | sed 's/^/  - /'
   echo ""
-  echo "ClaudeBoss is ready. Use /claudeboss:team or /claudeboss:boss in Claude Code."
+  echo "ClaudeBoss is ready. Use /claudeboss:team, /claudeboss:boss, or /claudeboss:plan in Claude Code."
   exit 0
 fi

package/scripts/run-task.js ADDED Viewed

@@ -0,0 +1,113 @@
+#!/usr/bin/env node
+// Runs one task through opencode, prints ONLY the final answer text, and
+// records a token-savings estimate to ~/.claudeboss/stats.json.
+//
+// Why this exists instead of a jq one-liner: it lets us (a) drop the jq
+// system dependency, and (b) measure the actual reduction between opencode's
+// full reasoning/tool-call stream and the final text Claude actually reads --
+// that gap is where ClaudeBoss's token savings come from.
+const { spawnSync } = require('child_process');
+const path = require('path');
+const fs = require('fs');
+const os = require('os');
+const SCRIPTS_DIR = __dirname;
+const STATS_DIR = path.join(os.homedir(), '.claudeboss');
+const STATS_FILE = path.join(STATS_DIR, 'stats.json');
+const [, , taskTypeArg, ...taskParts] = process.argv;
+const taskType = taskTypeArg || 'general';
+const task = taskParts.join(' ');
+if (!task) {
+  console.error('Usage: claudeboss run <task-type> "<task text>"');
+  console.error('task-type: code | write | research | general | plan');
+  process.exit(1);
+}
+function selectModel(type) {
+  const res = spawnSync('bash', [path.join(SCRIPTS_DIR, 'select-model.sh'), type], { encoding: 'utf8' });
+  if (res.status !== 0 || !res.stdout.trim()) {
+    console.error(res.stderr || 'No usable free model found. Run: claudeboss install');
+    process.exit(1);
+  }
+  return res.stdout.trim();
+}
+const agent = taskType === 'plan' ? 'plan' : 'build';
+const model = selectModel(taskType);
+console.error(`[claudeboss] agent: ${agent} | model: ${model}`);
+const run = spawnSync(
+  'opencode',
+  ['run', '--agent', agent, '--model', model, '--format', 'json', '--auto', task],
+  { encoding: 'utf8', maxBuffer: 64 * 1024 * 1024 }
+);
+const rawStream = (run.stdout || '') + (run.stderr || '');
+if (run.status !== 0 && !rawStream.trim()) {
+  console.error('opencode run failed with no output.');
+  process.exit(run.status || 1);
+}
+const lines = rawStream.split('\n').filter(Boolean);
+let finalText = '';
+let openencodeTokens = 0;
+for (const line of lines) {
+  let evt;
+  try {
+    evt = JSON.parse(line);
+  } catch {
+    continue; // non-JSON lines (warnings etc.) are ignored, not surfaced
+  }
+  if (evt.type === 'text' && evt.part && typeof evt.part.text === 'string') {
+    finalText = evt.part.text;
+  }
+  if (evt.type === 'step_finish' && evt.part && evt.part.tokens && typeof evt.part.tokens.total === 'number') {
+    openencodeTokens += evt.part.tokens.total;
+  }
+}
+if (!finalText) {
+  console.error('No final text found in opencode output.');
+  console.error(rawStream.slice(0, 2000));
+  process.exit(1);
+}
+// Rough approximation: ~4 chars per token (standard heuristic, not exact for
+// any specific tokenizer). Used only to make the savings estimate legible --
+// never presented as an exact count.
+const CHARS_PER_TOKEN = 4;
+const streamChars = rawStream.length;
+const finalChars = finalText.length;
+const estimatedClaudeTokensWithoutDelegation = Math.round(streamChars / CHARS_PER_TOKEN);
+const estimatedClaudeTokensWithDelegation = Math.round(finalChars / CHARS_PER_TOKEN);
+const reductionPct = streamChars > 0
+  ? Math.round((1 - finalChars / streamChars) * 100)
+  : 0;
+try {
+  fs.mkdirSync(STATS_DIR, { recursive: true });
+  let stats = { runs: [] };
+  if (fs.existsSync(STATS_FILE)) {
+    try { stats = JSON.parse(fs.readFileSync(STATS_FILE, 'utf8')); } catch { stats = { runs: [] }; }
+  }
+  stats.runs.push({
+    taskType,
+    agent,
+    model,
+    openencodeTokens,
+    estimatedClaudeTokensWithoutDelegation,
+    estimatedClaudeTokensWithDelegation,
+    reductionPct,
+  });
+  fs.writeFileSync(STATS_FILE, JSON.stringify(stats, null, 2));
+} catch (e) {
+  // stats logging must never break the actual task output
+  console.error(`[claudeboss] stats logging skipped: ${e.message}`);
+}
+process.stdout.write(finalText + '\n');

package/scripts/select-model.sh CHANGED Viewed

@@ -18,34 +18,40 @@ if [ -z "$FREE_MODELS" ]; then
   exit 1
 fi
-pick() {
-  echo "$FREE_MODELS" | grep -i -- "$1" | head -1 || true
+# Prefer OpenCode Zen's own free models first -- they've proven reliable for
+# multi-step agentic tasks (file writes, bash, verification loops). OpenRouter
+# free-tier models are tried only if no Zen model matches any priority
+# keyword, since availability/latency on a shared free tier varies more.
+ZEN_MODELS=$(echo "$FREE_MODELS" | grep -i '^opencode/' || true)
+pick_from() {
+  local pool="$1"; shift
+  local keyword
+  for keyword in "$@"; do
+    local hit
+    hit=$(echo "$pool" | grep -i -- "$keyword" | head -1 || true)
+    if [ -n "$hit" ]; then
+      echo "$hit"
+      return 0
+    fi
+  done
+  return 1
 }
-MODEL=""
 case "$TASK_TYPE" in
   code|coding)
-    for kw in coder code-free deepseek qwen3 nemotron; do
-      MODEL=$(pick "$kw")
-      [ -n "$MODEL" ] && break
-    done
+    KEYWORDS=(coder code-free deepseek qwen3 nemotron)
     ;;
   write|writing|research)
-    for kw in ultra nemotron gpt-oss-120b deepseek qwen3; do
-      MODEL=$(pick "$kw")
-      [ -n "$MODEL" ] && break
-    done
+    KEYWORDS=(ultra nemotron gpt-oss-120b deepseek qwen3)
     ;;
   *)
-    for kw in deepseek ultra nemotron gpt-oss-120b qwen3; do
-      MODEL=$(pick "$kw")
-      [ -n "$MODEL" ] && break
-    done
+    KEYWORDS=(deepseek ultra nemotron gpt-oss-120b qwen3)
     ;;
 esac
-if [ -z "$MODEL" ]; then
-  MODEL=$(echo "$FREE_MODELS" | head -1)
-fi
+MODEL=$(pick_from "$ZEN_MODELS" "${KEYWORDS[@]}") || \
+MODEL=$(pick_from "$FREE_MODELS" "${KEYWORDS[@]}") || \
+MODEL=$(echo "$FREE_MODELS" | head -1)
 echo "$MODEL"

package/scripts/stats-report.js ADDED Viewed

@@ -0,0 +1,61 @@
+#!/usr/bin/env node
+// Summarizes ~/.claudeboss/stats.json -- the estimated Claude-context tokens
+// saved by delegating runs to opencode instead of Claude doing the full
+// reasoning/tool-call loop itself.
+const fs = require('fs');
+const path = require('path');
+const os = require('os');
+const STATS_FILE = path.join(os.homedir(), '.claudeboss', 'stats.json');
+if (!fs.existsSync(STATS_FILE)) {
+  console.log('No ClaudeBoss runs recorded yet. Run `claudeboss run <type> "<task>"` at least once.');
+  process.exit(0);
+}
+let stats;
+try {
+  stats = JSON.parse(fs.readFileSync(STATS_FILE, 'utf8'));
+} catch {
+  console.log('Stats file is unreadable/corrupted:', STATS_FILE);
+  process.exit(1);
+}
+const runs = stats.runs || [];
+if (runs.length === 0) {
+  console.log('No ClaudeBoss runs recorded yet.');
+  process.exit(0);
+}
+const totalWithout = runs.reduce((sum, r) => sum + (r.estimatedClaudeTokensWithoutDelegation || 0), 0);
+const totalWith = runs.reduce((sum, r) => sum + (r.estimatedClaudeTokensWithDelegation || 0), 0);
+const totalSaved = totalWithout - totalWith;
+const avgReductionPct = runs.length
+  ? Math.round(runs.reduce((sum, r) => sum + (r.reductionPct || 0), 0) / runs.length)
+  : 0;
+const byType = {};
+for (const r of runs) {
+  const t = r.taskType || 'general';
+  byType[t] = byType[t] || { count: 0, saved: 0 };
+  byType[t].count += 1;
+  byType[t].saved += (r.estimatedClaudeTokensWithoutDelegation || 0) - (r.estimatedClaudeTokensWithDelegation || 0);
+}
+console.log('ClaudeBoss stats');
+console.log('================');
+console.log(`Runs recorded:            ${runs.length}`);
+console.log(`Est. tokens w/o delegation: ~${totalWithout.toLocaleString()}`);
+console.log(`Est. tokens w/ delegation:  ~${totalWith.toLocaleString()}`);
+console.log(`Est. tokens saved:          ~${totalSaved.toLocaleString()}`);
+console.log(`Avg. reduction per run:     ~${avgReductionPct}%`);
+console.log('');
+console.log('By task type:');
+for (const [type, v] of Object.entries(byType)) {
+  console.log(`  ${type.padEnd(10)} runs: ${String(v.count).padEnd(4)} est. saved: ~${v.saved.toLocaleString()} tokens`);
+}
+console.log('');
+console.log('(Estimates use a ~4 chars/token approximation of opencode\'s full');
+console.log('reasoning/tool-call stream vs. the final text Claude actually reads --');
+console.log('not an exact token count from any specific tokenizer.)');

package/skills/boss/SKILL.md CHANGED Viewed

@@ -13,7 +13,7 @@ You are the boss. The free opencode agent is the worker who writes code and does
 ## Prerequisites
-`claudeboss install` must have succeeded at least once on this machine (see the `team` skill for what it checks).
+`claudeboss install` must have succeeded at least once on this machine (see the `team` skill for what it checks). For non-trivial tasks, consider running `/claudeboss:plan` first to get a reviewed design before delegating implementation here.
 ## Workflow
@@ -34,6 +34,12 @@ You are the boss. The free opencode agent is the worker who writes code and does
 4. **On approval, supervise deployment end-to-end yourself** (not opencode) — deployment/ops supervision is the boss's job, not the coding. Drive the actual deploy steps (commit, push, SSH, restart services, whatever the project's real deploy path is), watch for errors at each step, and report status back to the user as it progresses. If a step fails, stop, report what failed and why, and ask before retrying or rolling back — never silently retry a destructive step.
+   **Where the deploy steps come from:** check the project root for a `.claudeboss.yml` file with a `deploy:` key first:
+   ```yaml
+   deploy: ./scripts/deploy.sh
+   ```
+   If it exists, that's the deploy command — run it (with the same error-handling/reporting rules above). If it doesn't exist, ask the user once what "deploy" means for this project, then suggest adding a `.claudeboss.yml` with that command so future boss-mode runs don't need to ask again.
 5. **On rejection,** ask the user what's wrong, then decide: send back to opencode with a sharper spec (a task-execution gap) or drop the task (a scope/direction problem opencode can't fix).
 ## What this mode explicitly forbids

package/skills/plan/SKILL.md ADDED Viewed

@@ -0,0 +1,33 @@
+---
+name: plan
+description: Delegate architecture/planning work to opencode's dedicated planning agent before any code gets written. Use when the user says "/claudeboss:plan", wants a plan or design reviewed before implementation, or asks for a task to be scoped before building it.
+metadata:
+  origin: ClaudeBoss
+---
+# ClaudeBoss — Plan Mode
+Opencode ships a dedicated `plan` primary agent, separate from `build`. Use it when the task is "figure out the right approach" rather than "write the code" -- it's a distinct pass from `/claudeboss:team` and `/claudeboss:boss`, meant to run *before* either of them on non-trivial work.
+## Prerequisites
+`claudeboss install` must have succeeded at least once on this machine.
+## Workflow
+1. **Delegate the planning question**, not the implementation:
+   ```bash
+   claudeboss run plan "<the design question: what needs to change, what constraints apply, what are the tradeoffs>"
+   ```
+   `claudeboss run plan` automatically routes to opencode's `plan` agent (not `build`) and a general-purpose free model.
+2. **Review the plan like you'd review a design doc.** Does it account for the actual codebase's conventions and constraints (you may need to check the real code yourself -- the free model has no memory of this specific project beyond what you told it)? Is it missing an edge case, a migration step, a rollback path?
+3. **Hand off.** Once the plan is sound, either:
+   - implement it yourself,
+   - delegate implementation to `/claudeboss:team` (you review + finish), or
+   - delegate to `/claudeboss:boss` (you manage, opencode's `build` agent implements, you approve before deploy).
+   Tell the user which handoff you're using and why.
+4. **If the plan is weak** (vague, ignores a stated constraint, unworkable), don't pass it downstream -- refine the prompt and re-run, or just write the plan yourself and say so. A bad plan compounds into worse code if it flows straight into `/claudeboss:boss`.

package/skills/team/SKILL.md CHANGED Viewed

@@ -17,7 +17,7 @@ Run once per machine: `claudeboss install` — checks/installs the `opencode` CL
 ## Workflow
-1. **Classify the task** into one of: `code`, `write`, `research`, `general`. Coding tasks (features, fixes, refactors) → `code`. Prose/docs/copy → `write`. Fact-finding/comparison → `research`. Anything else → `general`.
+1. **Classify the task** into one of: `code`, `write`, `research`, `general` (use `/claudeboss:plan` instead, first, if the task itself is "figure out the right approach"). Coding tasks (features, fixes, refactors) → `code`. Prose/docs/copy → `write`. Fact-finding/comparison → `research`. Anything else → `general`.
 2. **Delegate.** Run:
    ```bash

package/scripts/claudeboss-run.sh DELETED Viewed

@@ -1,30 +0,0 @@
-#!/usr/bin/env bash
-# Delegates one task to the best free opencode model and prints ONLY the
-# final answer text -- opencode's internal tool calls/reasoning never reach
-# the caller's context.
-set -uo pipefail
-TASK_TYPE="${1:-general}"
-shift || true
-TASK="$*"
-if [ -z "$TASK" ]; then
-  echo "Usage: claudeboss run <task-type> \"<task text>\"" >&2
-  echo "task-type: code | write | research | general" >&2
-  exit 1
-fi
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-MODEL=$("$SCRIPT_DIR/select-model.sh" "$TASK_TYPE")
-STATUS=$?
-if [ $STATUS -ne 0 ] || [ -z "$MODEL" ]; then
-  echo "No usable free model found. Run: claudeboss install" >&2
-  exit 1
-fi
-echo "[claudeboss] model: $MODEL" >&2
-opencode run --agent build --model "$MODEL" --format json "$TASK" 2>&1 \
-  | jq -r 'select(.type=="text") | .part.text' \
-  | tail -1