npm - botverse-mcp - Versions diffs - 1.0.4 → 1.2.0 - Mend

botverse-mcp 1.0.4 → 1.2.0

This diff shows the contents of publicly available package versions as released to a supported registry. It is provided for informational purposes only and reflects the changes between those versions as they appear in the public registry.

Files changed (4)

package/README.md CHANGED

@@ -1,6 +1,6 @@
 # botverse-mcp
-MCP server for [Botverse](https://botverse.cloud) — video transcoding and document conversion for AI agents.
+MCP server **and command-line tool** for [Botverse](https://botverse.cloud) — video transcoding and document conversion for AI agents and the humans who configure them.
 [![npm](https://img.shields.io/npm/v/botverse-mcp)](https://www.npmjs.com/package/botverse-mcp)
@@ -8,12 +8,13 @@ MCP server for [Botverse](https://botverse.cloud) — video transcoding and docu
 - **Video transcoding** — MP4 (H.264), WebM (VP9), ProRes 422, GIF, MP3 extraction · $0.25/job
 - **Document conversion** — Markdown ↔ DOCX ↔ PDF ↔ HTML ↔ XLSX · $0.05/file
+- **Transcription** — speaker-labelled transcripts (diarization + AI speaker naming) → txt/srt/vtt/docx/pdf · ~$5/hour
-Three tool calls. No AWS. No FFmpeg. No infrastructure.
+Two ways to use it: an **MCP server** for your AI agents, and a **`botverse` CLI** for the shell — evaluation, CI/CD, cron, scripts, and local coding agents. No AWS. No FFmpeg. No infrastructure.
 ## Setup
-1. Sign up at [botverse.cloud](https://botverse.cloud) — $5 minimum top-up, no monthly fees
+1. Sign up at [botverse.cloud](https://botverse.cloud) — **free to try: $1 credit on signup, no card required.** A card + 2FA are only needed at your first top-up ($5 min). No monthly fees.
 2. Get an API key or connector URL from your dashboard
 3. Add to your MCP client config
@@ -51,16 +52,41 @@ Or with a connector URL (recommended for claude.ai):
 }
 ```
-## Tools
+## Command line (`botverse`)
+The same package ships a `botverse` CLI for the shell — it reads files from disk and
+streams them straight to the API (no content goes through an LLM), so it's the fast
+path for evaluation, automation, and local coding agents.
+```bash
+export BOTVERSE_API_KEY=bv_live_…        # or BOTVERSE_CONNECTOR_URL=…?token=bv_sess_…
+npx botverse convert report.md --to pdf
+npx botverse convert *.md --to docx,pdf -o ./out
+npx botverse transcode clip.mov --to mp4 -o ./out
+npx botverse transcribe call.mp4 --to docx --attendees "Sarah Chen,Mike Torres"
+npx botverse balance
+```
+Each job uploads → polls → downloads the finished file to `-o` (default: current dir).
+Globs and multiple `--to` formats run as a batch.
+> **Sandbox note:** the CLI needs outbound network to `botverse.cloud` and S3, so it does
+> **not** run inside sandboxed agent environments (claude.ai / Claude Desktop), whose
+> egress is allowlisted. There, use the MCP tools (`convert_content` / `get_output_content`).
+## Tools (MCP)
 | Tool | Description |
 |---|---|
 | `transcode_from_url` | Transcode video from a public URL |
 | `transcode_video` | Transcode an uploaded video file |
-| `convert_content` | Convert document content inline |
+| `convert_content` | Convert document content inline (up to 4 MB; sandbox-safe) |
 | `convert_from_url` | Convert a document from a public URL |
+| `convert_file` | Convert an uploaded document |
 | `get_job_status` | Poll a job until complete |
 | `get_download_url` | Get the signed download URL |
+| `get_output_content` | Get finished output bytes inline (sandbox-safe download) |
 | `get_wallet_balance` | Check wallet balance |
 ## Pricing
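
In the README changes above, the CLI section says that globs and multiple `--to` formats run as a batch, with each finished file written to the `-o` directory. A minimal sketch of how that input-by-format expansion could be planned, assuming the shell has already expanded any glob into a file list; `planBatch` is a hypothetical helper written for illustration, not a function exported by the package:

```javascript
"use strict";
const path = require("path");

// Hypothetical helper: lists the (input, format, output path) jobs a batch would run.
// Output naming (<basename>.<format> inside outDir) follows the pattern used in cli.js.
function planBatch(files, toFlag, outDir = ".") {
  const formats = String(toFlag).split(",").map((s) => s.trim()).filter(Boolean);
  const jobs = [];
  for (const file of files) {
    const base = path.basename(file, path.extname(file));
    for (const fmt of formats) {
      jobs.push({ input: file, format: fmt, output: path.join(outDir, `${base}.${fmt}`) });
    }
  }
  return jobs;
}

// Two inputs and two formats give four jobs.
for (const job of planBatch(["report.md", "notes.md"], "docx,pdf", "./out")) {
  console.log(`${job.input} -> ${job.output}`);
}
```

Each planned job corresponds to one upload, poll, and download cycle, so a batch of N files and M formats is billed as N × M jobs.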

package/cli.js ADDED

@@ -0,0 +1,282 @@
+#!/usr/bin/env node
+/**
+ * botverse — command-line interface for Botverse.
+ *
+ * For humans and shell-capable automation (CI, cron, local coding agents) — it
+ * reads files from disk and streams them to the API directly, so it never serializes
+ * content through an LLM the way the in-chat MCP route must. Talks to the same
+ * hosted endpoint (botverse.cloud/mcp) with a bv_live_ key.
+ *
+ *   export BOTVERSE_API_KEY=bv_live_xxx
+ *   botverse convert report.md --to pdf
+ *   botverse convert *.md --to docx,pdf -o ./out
+ *   botverse transcode clip.mov --to mp4 -o ./out
+ *   botverse transcribe call.mp4 --to docx --attendees "Sarah Chen,Mike Torres"
+ *   botverse balance
+ *
+ * NOTE: this needs outbound network to botverse.cloud and S3. It does NOT work inside
+ * sandboxed agent environments (claude.ai / Claude Desktop) whose egress is allowlisted —
+ * there, use the MCP tools (convert_content / get_output_content) instead.
+ */
+"use strict";
+const fs = require("fs");
+const path = require("path");
+const https = require("https");
+const { URL } = require("url");
+const VERSION = "1.1.0";
+const BASE_URL = process.env.BOTVERSE_MCP_URL || "https://botverse.cloud/mcp";
+// ── tiny ANSI helpers ─────────────────────────────────────────────────────────
+const useColor = process.stdout.isTTY && !process.env.NO_COLOR;
+const c = (code, s) => (useColor ? `\x1b[${code}m${s}\x1b[0m` : s);
+const dim = (s) => c("2", s), bold = (s) => c("1", s), green = (s) => c("32", s), red = (s) => c("31", s), cyan = (s) => c("36", s);
+const log = (...a) => process.stderr.write(a.join(" ") + "\n");
+function die(msg) { log(red("error: ") + msg); process.exit(1); }
+// ── format maps ───────────────────────────────────────────────────────────────
+const CONTENT_TYPES = {
+  md: "text/markdown", markdown: "text/markdown", html: "text/html", htm: "text/html",
+  rst: "text/x-rst", txt: "text/plain", doc: "application/msword",
+  docx: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
+  mp4: "video/mp4", mov: "video/quicktime", webm: "video/webm", avi: "video/x-msvideo",
+  mkv: "video/x-matroska", m4v: "video/x-m4v", wav: "audio/wav", m4a: "audio/mp4",
+  mp3: "audio/mpeg", flac: "audio/flac", wma: "audio/x-ms-wma",
+};
+const TEXT_INPUTS = new Set(["md", "markdown", "html", "htm", "rst", "txt"]);
+const EXT_OF = (f) => (path.extname(f).slice(1) || "").toLowerCase();
+const MAX_INLINE = 4 * 1024 * 1024; // proxy inline ceiling
+// ── HTTP / JSON-RPC ───────────────────────────────────────────────────────────
+// Auth: either a bv_live_ API key (Authorization: Bearer) or a full connector URL
+// containing ?token=bv_sess_… (BOTVERSE_CONNECTOR_URL). Returns the endpoint + headers.
+function authTarget(contentLength) {
+  const connector = argv.flags["connector-url"] || process.env.BOTVERSE_CONNECTOR_URL;
+  const key = argv.flags["api-key"] || process.env.BOTVERSE_API_KEY;
+  if (!connector && !key) {
+    die("no credentials. Set BOTVERSE_API_KEY=bv_live_… (or BOTVERSE_CONNECTOR_URL=…?token=bv_sess_…). Get a key at https://botverse.cloud/dashboard/api-keys");
+  }
+  const headers = { "Content-Type": "application/json", "Content-Length": contentLength, "User-Agent": "botverse-cli/" + VERSION };
+  if (!connector && key) headers["Authorization"] = `Bearer ${key}`;
+  return { url: connector || BASE_URL, headers };
+}
+function request(urlStr, { method = "POST", headers = {}, body }) {
+  return new Promise((resolve, reject) => {
+    const u = new URL(urlStr);
+    const req = https.request(
+      { hostname: u.hostname, port: u.port || 443, path: u.pathname + u.search, method, headers },
+      (res) => {
+        const chunks = [];
+        res.on("data", (d) => chunks.push(d));
+        res.on("end", () => resolve({ status: res.statusCode, buffer: Buffer.concat(chunks) }));
+      }
+    );
+    req.on("error", reject);
+    req.setTimeout(180000, () => req.destroy(new Error("request timed out")));
+    if (body) req.write(body);
+    req.end();
+  });
+}
+let RPC_ID = 0;
+async function mcp(tool, args) {
+  const body = JSON.stringify({ jsonrpc: "2.0", id: ++RPC_ID, method: "tools/call", params: { name: tool, arguments: args } });
+  const { url, headers } = authTarget(Buffer.byteLength(body));
+  const { status, buffer } = await request(url, { headers, body });
+  let json;
+  try { json = JSON.parse(buffer.toString()); }
+  catch { throw new Error(`HTTP ${status}: ${buffer.toString().slice(0, 200)}`); }
+  if (json.error) throw new Error(json.error.message || JSON.stringify(json.error));
+  const text = json.result?.structuredContent ?? json.result?.content?.[0]?.text;
+  if (text == null) throw new Error("unexpected response shape");
+  return typeof text === "string" ? JSON.parse(text) : text;
+}
+// ── S3 multipart upload (presigned POST) ──────────────────────────────────────
+async function uploadFile(filePath) {
+  const filename = path.basename(filePath);
+  const ext = EXT_OF(filename);
+  const ct = CONTENT_TYPES[ext] || "application/octet-stream";
+  const up = await mcp("get_upload_url", { filename, content_type: ct });
+  const fields = up.upload_fields || {};
+  const fileBuf = fs.readFileSync(filePath);
+  const boundary = "----botverse" + Math.random().toString(16).slice(2);
+  const pre = [];
+  for (const [k, v] of Object.entries(fields)) {
+    pre.push(`--${boundary}\r\nContent-Disposition: form-data; name="${k}"\r\n\r\n${v}\r\n`);
+  }
+  pre.push(`--${boundary}\r\nContent-Disposition: form-data; name="file"; filename="${filename}"\r\nContent-Type: ${fields["Content-Type"] || ct}\r\n\r\n`);
+  const body = Buffer.concat([Buffer.from(pre.join("")), fileBuf, Buffer.from(`\r\n--${boundary}--\r\n`)]);
+  const { status, buffer } = await request(up.upload_url, {
+    headers: { "Content-Type": `multipart/form-data; boundary=${boundary}`, "Content-Length": body.length },
+    body,
+  });
+  if (status !== 204 && status !== 201 && status !== 200) {
+    throw new Error(`S3 upload failed (HTTP ${status}): ${buffer.toString().slice(0, 200)}`);
+  }
+  return up.object_key;
+}
+// ── job polling + download ────────────────────────────────────────────────────
+async function poll(jobId) {
+  const start = Date.now();
+  for (;;) {
+    const s = await mcp("get_job_status", { job_id: jobId });
+    if (s.status === "complete") return s;
+    if (s.status === "failed") throw new Error(s.error || "job failed");
+    if (Date.now() - start > 30 * 60 * 1000) throw new Error("timed out waiting for job");
+    if (s.stage_message) process.stderr.write("\r" + dim("  " + s.stage_message.padEnd(48)));
+    await new Promise((r) => setTimeout(r, 3000));
+  }
+}
+async function downloadOutput(jobId, outPath) {
+  const dl = await mcp("get_download_url", { job_id: jobId });
+  const { status, buffer } = await request(dl.download_url, { method: "GET" });
+  if (status !== 200) throw new Error(`download failed (HTTP ${status})`);
+  fs.mkdirSync(path.dirname(outPath), { recursive: true });
+  fs.writeFileSync(outPath, buffer);
+  return buffer.length;
+}
+// ── submit helpers ────────────────────────────────────────────────────────────
+async function submitConvert(filePath, outFmt) {
+  const ext = EXT_OF(filePath);
+  const size = fs.statSync(filePath).size;
+  // Small text files go inline (no upload round-trip); large or binary go via S3.
+  if (TEXT_INPUTS.has(ext) && size <= MAX_INLINE) {
+    const r = await mcp("convert_content", {
+      content: fs.readFileSync(filePath, "utf8"),
+      input_format: ext === "markdown" ? "md" : ext === "htm" ? "html" : ext,
+      output_format: outFmt,
+    });
+    return r.job_id;
+  }
+  const key = await uploadFile(filePath);
+  const r = await mcp("convert_file", { object_key: key, output_format: outFmt });
+  return r.job_id;
+}
+async function submitTranscode(filePath, outFmt, opts) {
+  const key = await uploadFile(filePath);
+  const r = await mcp("transcode_video", { object_key: key, output_format: outFmt, ...(opts ? { options: opts } : {}) });
+  return r.job_id;
+}
+async function submitTranscribe(filePath, outFmt, opts) {
+  const key = await uploadFile(filePath);
+  const r = await mcp("transcribe_media", { object_key: key, output_format: outFmt, ...(opts ? { options: opts } : {}) });
+  return r.job_id;
+}
+// ── commands ──────────────────────────────────────────────────────────────────
+async function runBatch(files, formats, submit, outDir) {
+  if (!files.length) die("no input files");
+  let failures = 0;
+  for (const file of files) {
+    if (!fs.existsSync(file)) { log(red("✗ ") + file + dim(" — not found")); failures++; continue; }
+    for (const fmt of formats) {
+      const t0 = Date.now();
+      const base = path.basename(file, path.extname(file));
+      const outPath = path.join(outDir, `${base}.${fmt}`);
+      process.stderr.write(dim(`· ${path.basename(file)} → ${fmt} …`));
+      try {
+        const jobId = await submit(file, fmt);
+        await poll(jobId);
+        const bytes = await downloadOutput(jobId, outPath);
+        process.stderr.write("\r" + green("✓ ") + outPath + dim(`  (${(bytes / 1024).toFixed(0)} KB, ${((Date.now() - t0) / 1000).toFixed(1)}s)`).padEnd(20) + "\n");
+      } catch (e) {
+        process.stderr.write("\r" + red("✗ ") + `${path.basename(file)} → ${fmt}` + dim("  " + (e.message || e)) + "\n");
+        failures++;
+      }
+    }
+  }
+  if (failures) process.exitCode = 1;
+}
+function parseFormats(flag, allowed, label) {
+  if (!flag) die(`--to is required (${label}). e.g. --to ${allowed[0]}`);
+  const fmts = String(flag).split(",").map((s) => s.trim()).filter(Boolean);
+  for (const f of fmts) if (!allowed.includes(f)) die(`unsupported --to "${f}". Allowed: ${allowed.join(", ")}`);
+  return fmts;
+}
+const COMMANDS = {
+  async convert() {
+    const fmts = parseFormats(argv.flags.to, ["docx", "pdf", "html", "txt", "md", "rst", "xlsx"], "convert");
+    await runBatch(argv.files, fmts, (f, fmt) => submitConvert(f, fmt), argv.flags.o || argv.flags.out || ".");
+  },
+  async transcode() {
+    const fmts = parseFormats(argv.flags.to, ["mp4", "webm", "mov_prores", "mp3", "gif"], "transcode");
+    const opts = {};
+    if (argv.flags.resolution) opts.height = ({ "4k": 2160, "1080p": 1080, "720p": 720, "480p": 480, "360p": 360 }[argv.flags.resolution]) || undefined;
+    await runBatch(argv.files, fmts, (f, fmt) => submitTranscode(f, fmt, Object.keys(opts).length ? opts : null), argv.flags.o || argv.flags.out || ".");
+  },
+  async transcribe() {
+    const fmts = parseFormats(argv.flags.to, ["txt", "json", "srt", "vtt", "docx", "pdf"], "transcribe");
+    const opts = {};
+    if (argv.flags.attendees) opts.attendees = String(argv.flags.attendees).split(",").map((n) => ({ name: n.trim() })).filter((a) => a.name);
+    if (argv.flags.language) opts.language = argv.flags.language;
+    await runBatch(argv.files, fmts, (f, fmt) => submitTranscribe(f, fmt, Object.keys(opts).length ? opts : null), argv.flags.o || argv.flags.out || ".");
+  },
+  async balance() {
+    const r = await mcp("get_wallet_balance", {});
+    log(bold("Wallet: ") + green(`$${Number(r.balance_usd).toFixed(2)}`) + (r.auto_refill_enabled ? dim("  (auto-refill on)") : ""));
+  },
+};
+function usage() {
+  log(`${bold("botverse")} ${dim("v" + VERSION)} — Botverse from the command line
+${bold("Usage:")}
+  botverse convert    <files…> --to <fmt[,fmt]> [-o dir]
+  botverse transcode  <files…> --to <fmt> [--resolution 1080p] [-o dir]
+  botverse transcribe <files…> --to <fmt> [--attendees "A,B"] [--language en-US] [-o dir]
+  botverse balance
+${bold("Auth:")}  export BOTVERSE_API_KEY=bv_live_…   (or --api-key)
+${bold("Examples:")}
+  ${cyan("botverse convert report.md --to pdf")}
+  ${cyan("botverse convert *.md --to docx,pdf -o ./out")}
+  ${cyan("botverse transcode clip.mov --to mp4 -o ./out")}
+  ${cyan("botverse transcribe call.mp4 --to docx --attendees \"Sarah Chen,Mike Torres\"")}
+Docs: https://botverse.cloud/docs/cli`);
+}
+// ── arg parsing ───────────────────────────────────────────────────────────────
+function parseArgs(args) {
+  const flags = {}; const files = [];
+  for (let i = 0; i < args.length; i++) {
+    const a = args[i];
+    if (a.startsWith("--")) {
+      const key = a.slice(2);
+      const next = args[i + 1];
+      if (next !== undefined && !next.startsWith("-")) { flags[key] = next; i++; } else flags[key] = true;
+    } else if (a.startsWith("-")) {
+      const key = a.slice(1);
+      const next = args[i + 1];
+      if (next !== undefined && !next.startsWith("-")) { flags[key] = next; i++; } else flags[key] = true;
+    } else files.push(a);
+  }
+  return { flags, files };
+}
+const rawArgs = process.argv.slice(2);
+const command = rawArgs[0];
+const argv = parseArgs(rawArgs.slice(1));
+(async () => {
+  if (!command || command === "help" || argv.flags.help || argv.flags.h) return usage();
+  if (command === "version" || argv.flags.version || argv.flags.v) return log("botverse " + VERSION);
+  const fn = COMMANDS[command];
+  if (!fn) { log(red(`unknown command: ${command}`)); usage(); process.exit(1); }
+  try { await fn(); }
+  catch (e) { die(e.message || String(e)); }
+})();
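
The `mcp()` helper in the file above wraps every tool call in a JSON-RPC 2.0 `tools/call` request, then unwraps either `result.structuredContent` or the text of the first content item. The sketch below reproduces those two steps offline, without any network access, assuming only the response shapes that `cli.js` itself handles. The names `buildToolCall` and `unwrapToolResult` are illustrative and do not appear in the package:

```javascript
"use strict";

// Hypothetical helper: builds the JSON-RPC 2.0 envelope used for a tool call.
function buildToolCall(id, tool, args) {
  return JSON.stringify({
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  });
}

// Hypothetical helper: applies the same unwrapping order as mcp() in cli.js.
// structuredContent wins; otherwise the first content item's text is parsed as JSON.
function unwrapToolResult(json) {
  if (json.error) throw new Error(json.error.message || JSON.stringify(json.error));
  const payload = json.result?.structuredContent ?? json.result?.content?.[0]?.text;
  if (payload == null) throw new Error("unexpected response shape");
  return typeof payload === "string" ? JSON.parse(payload) : payload;
}

// The job id and the response object below are made-up sample values.
console.log(buildToolCall(1, "get_job_status", { job_id: "job_123" }));
console.log(unwrapToolResult({
  result: { content: [{ type: "text", text: '{"status":"complete"}' }] },
}));
```

Keeping the envelope and the unwrapping as pure functions makes them easy to exercise in CI, where the real endpoint may not be reachable.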

package/package.json CHANGED

@@ -1,11 +1,12 @@
 {
   "name": "botverse-mcp",
-  "version": "1.0.4",
+  "version": "1.2.0",
   "mcpName": "io.github.MkTurner74/botverse",
-  "description": "MCP server for Botverse — video transcoding and document conversion for AI agents. $0.25/transcode · $0.05/convert · No AWS required.",
+  "description": "Botverse for AI agents and the command line — video transcoding and document conversion. MCP server + `botverse` CLI. $0.25/transcode · $0.05/convert · No AWS required.",
   "main": "index.js",
   "bin": {
-    "botverse-mcp": "index.js"
+    "botverse-mcp": "index.js",
+    "botverse": "cli.js"
   },
   "scripts": {
     "start": "node index.js"

package/tools.json CHANGED

@@ -152,7 +152,7 @@
   },
   {
     "name": "transcode_video",
-    "description": "Offload a video transcode to Botverse — encoding runs server-side so you can continue with other tasks. Returns a job_id immediately. Source must be ≤ 10 minutes and ≤ 5 GB. Poll get_job_status every 5 seconds until 'complete', then get_download_url. Wallet debited on completion.",
+    "description": "Offload a video transcode to Botverse — encoding runs server-side so you can continue with other tasks. Returns a job_id immediately. Source must be ≤ 60 minutes and ≤ 2 GB. Poll get_job_status every 5 seconds until 'complete', then get_download_url. Wallet debited on completion.",
     "inputSchema": {
       "type": "object",
       "properties": {
@@ -254,13 +254,13 @@
   },
   {
     "name": "get_job_status",
-    "description": "Poll the status of a transcode or convert job. Call every 5 seconds until status is 'complete' or 'failed'. Status 'queued' or 'processing' is normal — large files take 5–15 minutes. Keep polling indefinitely until a terminal status is reached. Do not stop polling after a fixed number of attempts.",
+    "description": "Poll the status of a transcode, convert, or transcribe job. Call every 5 seconds until status is 'complete' or 'failed'. Status 'queued' or 'processing' is normal — large files take 5–15 minutes; transcribe reports a live stage (converting audio → transcribing → AI augmenting → rendering). Keep polling indefinitely until a terminal status is reached. Do not stop polling after a fixed number of attempts.",
     "inputSchema": {
       "type": "object",
       "properties": {
         "job_id": {
           "type": "string",
-          "description": "Job ID returned by transcode_video, transcode_from_url, convert_file, convert_from_url, or convert_content."
+          "description": "Job ID returned by transcode_video, transcode_from_url, convert_file, convert_from_url, convert_content, transcribe_from_url, or transcribe_media."
         }
       },
       "required": [
@@ -321,7 +321,7 @@
       "properties": {
         "job_id": {
           "type": "string",
-          "description": "Job ID from transcode_video, transcode_from_url, or any convert tool."
+          "description": "Job ID from transcode_video, transcode_from_url, any convert tool, or any transcribe tool."
         }
       },
       "required": [
@@ -525,7 +525,7 @@
   },
   {
     "name": "submit_workflow",
-    "description": "Submit a multi-step workflow to the Botverse workflow engine. Steps execute in dependency order; parallel branches (multiple steps with the same depends_on) run simultaneously. Returns a workflow_id immediately — poll get_workflow_status every 5–10 seconds until terminal. Requires auto-refill to be enabled at botverse.cloud/dashboard/billing to prevent mid-workflow balance failures. Workflow definition uses BWDL (Botverse Workflow Definition Language) — schema at botverse.cloud/schemas/workflow/v1.json.",
+    "description": "Submit a multi-step workflow to the Botverse workflow engine. Steps execute in dependency order; parallel branches (multiple steps with the same depends_on) run simultaneously. Returns a workflow_id immediately — poll get_workflow_status every 5–10 seconds until terminal. INTER-STEP REFERENCES: pass a prior step's output into a later step with the string \"$.steps.<step_id>.output_key\" (e.g. a docx→pdf chain: step to_pdf has depends_on: [\"to_docx\"] and inputs {\"source_url\": \"$.steps.to_docx.output_key\", \"input_format\": \"docx\", \"output_format\": \"pdf\"} using tool convert_from_url). Workflow params are referenced as \"$.params.<name>\". No other template syntax (${...} etc.) is supported. BILLING: convert-only workflows run on wallet balance ($0.05/step). Workflows containing transcode or transcribe steps require auto-refill to be enabled at botverse.cloud/dashboard/billing (their cost scales with source duration). Workflow definition uses BWDL (Botverse Workflow Definition Language) — schema at botverse.cloud/schemas/workflow/v1.json.",
     "inputSchema": {
       "type": "object",
       "properties": {
@@ -746,5 +746,125 @@
       "idempotentHint": false,
       "openWorldHint": true
     }
+  },
+  {
+    "name": "transcribe_from_url",
+    "description": "Transcribe a video or audio file from a public HTTPS URL into a speaker-labelled transcript — ONE call does everything. Source can be a direct HTTPS URL or a Dropbox / Google Drive / Box share link (auto-resolved); OneDrive and SharePoint share links are unreliable — use a direct download URL, or upload via get_upload_url + transcribe_media. Internally: converts to audio, runs speech-to-text with speaker diarization, uses AI to name the speakers from your attendee list, and renders the document. Pass options.attendees (names, optional gender/role) and it tags who said what. Output formats: txt, json, srt, vtt, docx, pdf. CONSENT: you must have all parties' consent to record/transcribe. Returns a job_id immediately — report it to the user, then poll get_job_status (it reports a live stage: converting audio → transcribing → AI augmenting → rendering) until 'complete', then get_download_url. ~$0.08/audio-minute (~$5/hour), diarization + naming included.",
+    "inputSchema": {
+      "type": "object",
+      "properties": {
+        "source_url": {
+          "type": "string",
+          "description": "Public HTTPS URL of the source video or audio file."
+        },
+        "output_format": {
+          "type": "string",
+          "enum": [
+            "txt",
+            "json",
+            "srt",
+            "vtt",
+            "docx",
+            "pdf"
+          ],
+          "description": "Primary deliverable format."
+        },
+        "options": {
+          "type": "object",
+          "description": "Optional. attendees: [{name, gender?, role?}] to name speakers; language (BCP-47 or 'auto'); diarize (default true); max_speakers; title; include_timestamps; also_deliver: extra formats in the same job."
+        }
+      },
+      "required": [
+        "source_url",
+        "output_format"
+      ]
+    },
+    "outputSchema": {
+      "type": "object",
+      "properties": {
+        "job_id": {
+          "type": "string",
+          "description": "Unique identifier for this job. Pass to get_job_status and get_download_url."
+        },
+        "status": {
+          "type": "string",
+          "enum": [
+            "queued",
+            "processing"
+          ],
+          "description": "Initial job state."
+        }
+      },
+      "required": [
+        "job_id",
+        "status"
+      ]
+    },
+    "annotations": {
+      "readOnlyHint": false,
+      "destructiveHint": false,
+      "idempotentHint": false,
+      "openWorldHint": true
+    }
+  },
+  {
+    "name": "transcribe_media",
+    "description": "Transcribe an already-uploaded video/audio file (from get_upload_url) into a speaker-labelled transcript. Same one-call pipeline and options as transcribe_from_url (attendee naming, srt/vtt, formatted docx/pdf). Use for local files or files larger than a URL fetch allows (up to 2 GB). CONSENT: you must have all parties' consent. Poll get_job_status (live stage) until complete, then get_download_url. ~$0.08/audio-minute (~$5/hour).",
+    "inputSchema": {
+      "type": "object",
+      "properties": {
+        "object_key": {
+          "type": "string",
+          "description": "The object_key returned by get_upload_url."
+        },
+        "output_format": {
+          "type": "string",
+          "enum": [
+            "txt",
+            "json",
+            "srt",
+            "vtt",
+            "docx",
+            "pdf"
+          ],
+          "description": "Primary deliverable format."
+        },
+        "options": {
+          "type": "object",
+          "description": "Same options object as transcribe_from_url (attendees, language, diarize, max_speakers, title, include_timestamps, also_deliver)."
+        }
+      },
+      "required": [
+        "object_key",
+        "output_format"
+      ]
+    },
+    "outputSchema": {
+      "type": "object",
+      "properties": {
+        "job_id": {
+          "type": "string",
+          "description": "Unique identifier for this job. Pass to get_job_status and get_download_url."
+        },
+        "status": {
+          "type": "string",
+          "enum": [
+            "queued",
+            "processing"
+          ],
+          "description": "Initial job state."
+        }
+      },
+      "required": [
+        "job_id",
+        "status"
+      ]
+    },
+    "annotations": {
+      "readOnlyHint": false,
+      "destructiveHint": false,
+      "idempotentHint": false,
+      "openWorldHint": true
+    }
   }
 ]
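
The `transcribe_from_url` schema added above requires `source_url` and `output_format`, and accepts an optional `options` object whose `attendees` entries are `{name, gender?, role?}`. A minimal sketch of assembling those arguments from a comma-separated attendee string, in the same way the CLI's `--attendees` flag is split; `buildTranscribeArgs` is a hypothetical helper and the example URL is a placeholder:

```javascript
"use strict";

// Formats copied from the output_format enum in the tool schema.
const TRANSCRIBE_FORMATS = ["txt", "json", "srt", "vtt", "docx", "pdf"];

// Hypothetical helper: builds the arguments object for transcribe_from_url.
function buildTranscribeArgs(sourceUrl, outputFormat, attendeeList, language) {
  if (!TRANSCRIBE_FORMATS.includes(outputFormat)) {
    throw new Error(`unsupported output_format "${outputFormat}"`);
  }
  const args = { source_url: sourceUrl, output_format: outputFormat };
  const options = {};
  // Split "A, B" into [{ name: "A" }, { name: "B" }], dropping empty entries.
  const attendees = String(attendeeList || "")
    .split(",")
    .map((n) => ({ name: n.trim() }))
    .filter((a) => a.name);
  if (attendees.length) options.attendees = attendees;
  if (language) options.language = language;
  // Only attach options when at least one was supplied.
  if (Object.keys(options).length) args.options = options;
  return args;
}

console.log(JSON.stringify(
  buildTranscribeArgs("https://example.com/call.mp4", "docx", "Sarah Chen, Mike Torres", "en-US"),
  null,
  2
));
```

Validating the format locally avoids submitting a job that the server would reject against the schema's enum.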