botverse-mcp 1.0.4 → 1.2.0

Files changed (4)
  1. package/README.md +31 -5
  2. package/cli.js +282 -0
  3. package/package.json +4 -3
  4. package/tools.json +125 -5
package/README.md CHANGED
@@ -1,6 +1,6 @@
 # botverse-mcp
 
-MCP server for [Botverse](https://botverse.cloud) — video transcoding and document conversion for AI agents.
+MCP server **and command-line tool** for [Botverse](https://botverse.cloud) — video transcoding and document conversion for AI agents and the humans who configure them.
 
 [![npm](https://img.shields.io/npm/v/botverse-mcp)](https://www.npmjs.com/package/botverse-mcp)
 
@@ -8,12 +8,13 @@ MCP server for [Botverse](https://botverse.cloud) — video transcoding and docu
 
 - **Video transcoding** — MP4 (H.264), WebM (VP9), ProRes 422, GIF, MP3 extraction · $0.25/job
 - **Document conversion** — Markdown ↔ DOCX ↔ PDF ↔ HTML ↔ XLSX · $0.05/file
+- **Transcription** — speaker-labelled transcripts (diarization + AI speaker naming) → txt/srt/vtt/docx/pdf · ~$5/hour
 
-Three tool calls. No AWS. No FFmpeg. No infrastructure.
+Two ways to use it: an **MCP server** for your AI agents, and a **`botverse` CLI** for the shell — evaluation, CI/CD, cron, scripts, and local coding agents. No AWS. No FFmpeg. No infrastructure.
 
 ## Setup
 
-1. Sign up at [botverse.cloud](https://botverse.cloud) — $5 minimum top-up, no monthly fees
+1. Sign up at [botverse.cloud](https://botverse.cloud) — **free to try: $1 credit on signup, no card required.** A card + 2FA are only needed at your first top-up ($5 min). No monthly fees.
 2. Get an API key or connector URL from your dashboard
 3. Add to your MCP client config
 
@@ -51,16 +52,41 @@ Or with a connector URL (recommended for claude.ai):
 }
 ```
 
-## Tools
+## Command line (`botverse`)
+
+The same package ships a `botverse` CLI for the shell — it reads files from disk and
+streams them straight to the API (no content goes through an LLM), so it's the fast
+path for evaluation, automation, and local coding agents.
+
+```bash
+export BOTVERSE_API_KEY=bv_live_… # or BOTVERSE_CONNECTOR_URL=…?token=bv_sess_…
+
+npx botverse convert report.md --to pdf
+npx botverse convert *.md --to docx,pdf -o ./out
+npx botverse transcode clip.mov --to mp4 -o ./out
+npx botverse transcribe call.mp4 --to docx --attendees "Sarah Chen,Mike Torres"
+npx botverse balance
+```
+
+Each job uploads → polls → downloads the finished file to `-o` (default: current dir).
+Globs and multiple `--to` formats run as a batch.
+
+> **Sandbox note:** the CLI needs outbound network to `botverse.cloud` and S3, so it does
+> **not** run inside sandboxed agent environments (claude.ai / Claude Desktop), whose
+> egress is allowlisted. There, use the MCP tools (`convert_content` / `get_output_content`).
+
+## Tools (MCP)
 
 | Tool | Description |
 |---|---|
 | `transcode_from_url` | Transcode video from a public URL |
 | `transcode_video` | Transcode an uploaded video file |
-| `convert_content` | Convert document content inline |
+| `convert_content` | Convert document content inline (up to 4 MB; sandbox-safe) |
 | `convert_from_url` | Convert a document from a public URL |
+| `convert_file` | Convert an uploaded document |
 | `get_job_status` | Poll a job until complete |
 | `get_download_url` | Get the signed download URL |
+| `get_output_content` | Get finished output bytes inline (sandbox-safe download) |
 | `get_wallet_balance` | Check wallet balance |
 
 ## Pricing
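
The README's statement that globs and multiple `--to` formats run as a batch can be read as a file × format cross product. The sketch below mirrors how `runBatch` in cli.js derives each output path (`<outDir>/<basename>.<fmt>`). The helper names `planBatch` and `findCollisions` are hypothetical and not part of the package; this is an illustration under that assumption, not the package's own code.

```javascript
"use strict";
const path = require("path");

// Hypothetical helper (not in the package): expand files x formats into the
// output paths the CLI's batch loop would write, i.e. <outDir>/<base>.<fmt>.
function planBatch(files, formats, outDir = ".") {
  const jobs = [];
  for (const file of files) {
    const base = path.basename(file, path.extname(file));
    for (const fmt of formats) {
      jobs.push({ input: file, format: fmt, output: path.join(outDir, `${base}.${fmt}`) });
    }
  }
  return jobs;
}

// Inputs that differ only by extension map to the same output path.
function findCollisions(jobs) {
  const seen = new Map();
  for (const job of jobs) {
    seen.set(job.output, (seen.get(job.output) || 0) + 1);
  }
  return [...seen].filter(([, n]) => n > 1).map(([out]) => out);
}

const jobs = planBatch(["notes.md", "notes.html"], ["docx", "pdf"], "out");
console.log(jobs.map((j) => j.output));
console.log(findCollisions(jobs));
```

Because the output name keeps only the input's basename, two inputs such as `notes.md` and `notes.html` converted to the same format would target one file, so the later download would overwrite the earlier one.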
package/cli.js ADDED
@@ -0,0 +1,282 @@
+#!/usr/bin/env node
+/**
+ * botverse — command-line interface for Botverse.
+ *
+ * For humans and shell-capable automation (CI, cron, local coding agents) — it
+ * reads files from disk and streams them to the API directly, so it never serializes
+ * content through an LLM the way the in-chat MCP route must. Talks to the same
+ * hosted endpoint (botverse.cloud/mcp) with a bv_live_ key.
+ *
+ *   export BOTVERSE_API_KEY=bv_live_xxx
+ *   botverse convert report.md --to pdf
+ *   botverse convert *.md --to docx,pdf -o ./out
+ *   botverse transcode clip.mov --to mp4 -o ./out
+ *   botverse transcribe call.mp4 --to docx --attendees "Sarah Chen,Mike Torres"
+ *   botverse balance
+ *
+ * NOTE: this needs outbound network to botverse.cloud and S3. It does NOT work inside
+ * sandboxed agent environments (claude.ai / Claude Desktop) whose egress is allowlisted —
+ * there, use the MCP tools (convert_content / get_output_content) instead.
+ */
+
+"use strict";
+const fs = require("fs");
+const path = require("path");
+const https = require("https");
+const { URL } = require("url");
+
+const VERSION = "1.1.0";
+const BASE_URL = process.env.BOTVERSE_MCP_URL || "https://botverse.cloud/mcp";
+
+// ── tiny ANSI helpers ─────────────────────────────────────────────────────────
+const useColor = process.stdout.isTTY && !process.env.NO_COLOR;
+const c = (code, s) => (useColor ? `\x1b[${code}m${s}\x1b[0m` : s);
+const dim = (s) => c("2", s), bold = (s) => c("1", s), green = (s) => c("32", s), red = (s) => c("31", s), cyan = (s) => c("36", s);
+const log = (...a) => process.stderr.write(a.join(" ") + "\n");
+
+function die(msg) { log(red("error: ") + msg); process.exit(1); }
+
+// ── format maps ───────────────────────────────────────────────────────────────
+const CONTENT_TYPES = {
+  md: "text/markdown", markdown: "text/markdown", html: "text/html", htm: "text/html",
+  rst: "text/x-rst", txt: "text/plain", doc: "application/msword",
+  docx: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
+  mp4: "video/mp4", mov: "video/quicktime", webm: "video/webm", avi: "video/x-msvideo",
+  mkv: "video/x-matroska", m4v: "video/x-m4v", wav: "audio/wav", m4a: "audio/mp4",
+  mp3: "audio/mpeg", flac: "audio/flac", wma: "audio/x-ms-wma",
+};
+const TEXT_INPUTS = new Set(["md", "markdown", "html", "htm", "rst", "txt"]);
+const EXT_OF = (f) => (path.extname(f).slice(1) || "").toLowerCase();
+const MAX_INLINE = 4 * 1024 * 1024; // proxy inline ceiling
+
+// ── HTTP / JSON-RPC ───────────────────────────────────────────────────────────
+// Auth: either a bv_live_ API key (Authorization: Bearer) or a full connector URL
+// containing ?token=bv_sess_… (BOTVERSE_CONNECTOR_URL). Returns the endpoint + headers.
+function authTarget(contentLength) {
+  const connector = argv.flags["connector-url"] || process.env.BOTVERSE_CONNECTOR_URL;
+  const key = argv.flags["api-key"] || process.env.BOTVERSE_API_KEY;
+  if (!connector && !key) {
+    die("no credentials. Set BOTVERSE_API_KEY=bv_live_… (or BOTVERSE_CONNECTOR_URL=…?token=bv_sess_…). Get a key at https://botverse.cloud/dashboard/api-keys");
+  }
+  const headers = { "Content-Type": "application/json", "Content-Length": contentLength, "User-Agent": "botverse-cli/" + VERSION };
+  if (!connector && key) headers["Authorization"] = `Bearer ${key}`;
+  return { url: connector || BASE_URL, headers };
+}
+
+function request(urlStr, { method = "POST", headers = {}, body }) {
+  return new Promise((resolve, reject) => {
+    const u = new URL(urlStr);
+    const req = https.request(
+      { hostname: u.hostname, port: u.port || 443, path: u.pathname + u.search, method, headers },
+      (res) => {
+        const chunks = [];
+        res.on("data", (d) => chunks.push(d));
+        res.on("end", () => resolve({ status: res.statusCode, buffer: Buffer.concat(chunks) }));
+      }
+    );
+    req.on("error", reject);
+    req.setTimeout(180000, () => req.destroy(new Error("request timed out")));
+    if (body) req.write(body);
+    req.end();
+  });
+}
+
+let RPC_ID = 0;
+async function mcp(tool, args) {
+  const body = JSON.stringify({ jsonrpc: "2.0", id: ++RPC_ID, method: "tools/call", params: { name: tool, arguments: args } });
+  const { url, headers } = authTarget(Buffer.byteLength(body));
+  const { status, buffer } = await request(url, { headers, body });
+  let json;
+  try { json = JSON.parse(buffer.toString()); }
+  catch { throw new Error(`HTTP ${status}: ${buffer.toString().slice(0, 200)}`); }
+  if (json.error) throw new Error(json.error.message || JSON.stringify(json.error));
+  const text = json.result?.structuredContent ?? json.result?.content?.[0]?.text;
+  if (text == null) throw new Error("unexpected response shape");
+  return typeof text === "string" ? JSON.parse(text) : text;
+}
+
+// ── S3 multipart upload (presigned POST) ──────────────────────────────────────
+async function uploadFile(filePath) {
+  const filename = path.basename(filePath);
+  const ext = EXT_OF(filename);
+  const ct = CONTENT_TYPES[ext] || "application/octet-stream";
+  const up = await mcp("get_upload_url", { filename, content_type: ct });
+  const fields = up.upload_fields || {};
+  const fileBuf = fs.readFileSync(filePath);
+
+  const boundary = "----botverse" + Math.random().toString(16).slice(2);
+  const pre = [];
+  for (const [k, v] of Object.entries(fields)) {
+    pre.push(`--${boundary}\r\nContent-Disposition: form-data; name="${k}"\r\n\r\n${v}\r\n`);
+  }
+  pre.push(`--${boundary}\r\nContent-Disposition: form-data; name="file"; filename="${filename}"\r\nContent-Type: ${fields["Content-Type"] || ct}\r\n\r\n`);
+  const body = Buffer.concat([Buffer.from(pre.join("")), fileBuf, Buffer.from(`\r\n--${boundary}--\r\n`)]);
+
+  const { status, buffer } = await request(up.upload_url, {
+    headers: { "Content-Type": `multipart/form-data; boundary=${boundary}`, "Content-Length": body.length },
+    body,
+  });
+  if (status !== 204 && status !== 201 && status !== 200) {
+    throw new Error(`S3 upload failed (HTTP ${status}): ${buffer.toString().slice(0, 200)}`);
+  }
+  return up.object_key;
+}
+
+// ── job polling + download ────────────────────────────────────────────────────
+async function poll(jobId) {
+  const start = Date.now();
+  for (;;) {
+    const s = await mcp("get_job_status", { job_id: jobId });
+    if (s.status === "complete") return s;
+    if (s.status === "failed") throw new Error(s.error || "job failed");
+    if (Date.now() - start > 30 * 60 * 1000) throw new Error("timed out waiting for job");
+    if (s.stage_message) process.stderr.write("\r" + dim(" " + s.stage_message.padEnd(48)));
+    await new Promise((r) => setTimeout(r, 3000));
+  }
+}
+
+async function downloadOutput(jobId, outPath) {
+  const dl = await mcp("get_download_url", { job_id: jobId });
+  const { status, buffer } = await request(dl.download_url, { method: "GET" });
+  if (status !== 200) throw new Error(`download failed (HTTP ${status})`);
+  fs.mkdirSync(path.dirname(outPath), { recursive: true });
+  fs.writeFileSync(outPath, buffer);
+  return buffer.length;
+}
+
+// ── submit helpers ────────────────────────────────────────────────────────────
+async function submitConvert(filePath, outFmt) {
+  const ext = EXT_OF(filePath);
+  const size = fs.statSync(filePath).size;
+  // Small text files go inline (no upload round-trip); large or binary go via S3.
+  if (TEXT_INPUTS.has(ext) && size <= MAX_INLINE) {
+    const r = await mcp("convert_content", {
+      content: fs.readFileSync(filePath, "utf8"),
+      input_format: ext === "markdown" ? "md" : ext === "htm" ? "html" : ext,
+      output_format: outFmt,
+    });
+    return r.job_id;
+  }
+  const key = await uploadFile(filePath);
+  const r = await mcp("convert_file", { object_key: key, output_format: outFmt });
+  return r.job_id;
+}
+
+async function submitTranscode(filePath, outFmt, opts) {
+  const key = await uploadFile(filePath);
+  const r = await mcp("transcode_video", { object_key: key, output_format: outFmt, ...(opts ? { options: opts } : {}) });
+  return r.job_id;
+}
+
+async function submitTranscribe(filePath, outFmt, opts) {
+  const key = await uploadFile(filePath);
+  const r = await mcp("transcribe_media", { object_key: key, output_format: outFmt, ...(opts ? { options: opts } : {}) });
+  return r.job_id;
+}
+
+// ── commands ──────────────────────────────────────────────────────────────────
+async function runBatch(files, formats, submit, outDir) {
+  if (!files.length) die("no input files");
+  let failures = 0;
+  for (const file of files) {
+    if (!fs.existsSync(file)) { log(red("✗ ") + file + dim(" — not found")); failures++; continue; }
+    for (const fmt of formats) {
+      const t0 = Date.now();
+      const base = path.basename(file, path.extname(file));
+      const outPath = path.join(outDir, `${base}.${fmt}`);
+      process.stderr.write(dim(`· ${path.basename(file)} → ${fmt} …`));
+      try {
+        const jobId = await submit(file, fmt);
+        await poll(jobId);
+        const bytes = await downloadOutput(jobId, outPath);
+        process.stderr.write("\r" + green("✓ ") + outPath + dim(` (${(bytes / 1024).toFixed(0)} KB, ${((Date.now() - t0) / 1000).toFixed(1)}s)`).padEnd(20) + "\n");
+      } catch (e) {
+        process.stderr.write("\r" + red("✗ ") + `${path.basename(file)} → ${fmt}` + dim(" " + (e.message || e)) + "\n");
+        failures++;
+      }
+    }
+  }
+  if (failures) process.exitCode = 1;
+}
+
+function parseFormats(flag, allowed, label) {
+  if (!flag) die(`--to is required (${label}). e.g. --to ${allowed[0]}`);
+  const fmts = String(flag).split(",").map((s) => s.trim()).filter(Boolean);
+  for (const f of fmts) if (!allowed.includes(f)) die(`unsupported --to "${f}". Allowed: ${allowed.join(", ")}`);
+  return fmts;
+}
+
+const COMMANDS = {
+  async convert() {
+    const fmts = parseFormats(argv.flags.to, ["docx", "pdf", "html", "txt", "md", "rst", "xlsx"], "convert");
+    await runBatch(argv.files, fmts, (f, fmt) => submitConvert(f, fmt), argv.flags.o || argv.flags.out || ".");
+  },
+  async transcode() {
+    const fmts = parseFormats(argv.flags.to, ["mp4", "webm", "mov_prores", "mp3", "gif"], "transcode");
+    const opts = {};
+    if (argv.flags.resolution) opts.height = ({ "4k": 2160, "1080p": 1080, "720p": 720, "480p": 480, "360p": 360 }[argv.flags.resolution]) || undefined;
+    await runBatch(argv.files, fmts, (f, fmt) => submitTranscode(f, fmt, Object.keys(opts).length ? opts : null), argv.flags.o || argv.flags.out || ".");
+  },
+  async transcribe() {
+    const fmts = parseFormats(argv.flags.to, ["txt", "json", "srt", "vtt", "docx", "pdf"], "transcribe");
+    const opts = {};
+    if (argv.flags.attendees) opts.attendees = String(argv.flags.attendees).split(",").map((n) => ({ name: n.trim() })).filter((a) => a.name);
+    if (argv.flags.language) opts.language = argv.flags.language;
+    await runBatch(argv.files, fmts, (f, fmt) => submitTranscribe(f, fmt, Object.keys(opts).length ? opts : null), argv.flags.o || argv.flags.out || ".");
+  },
+  async balance() {
+    const r = await mcp("get_wallet_balance", {});
+    log(bold("Wallet: ") + green(`$${Number(r.balance_usd).toFixed(2)}`) + (r.auto_refill_enabled ? dim(" (auto-refill on)") : ""));
+  },
+};
+
+function usage() {
+  log(`${bold("botverse")} ${dim("v" + VERSION)} — Botverse from the command line
+
+${bold("Usage:")}
+botverse convert <files…> --to <fmt[,fmt]> [-o dir]
+botverse transcode <files…> --to <fmt> [--resolution 1080p] [-o dir]
+botverse transcribe <files…> --to <fmt> [--attendees "A,B"] [--language en-US] [-o dir]
+botverse balance
+
+${bold("Auth:")} export BOTVERSE_API_KEY=bv_live_… (or --api-key)
+
+${bold("Examples:")}
+${cyan("botverse convert report.md --to pdf")}
+${cyan("botverse convert *.md --to docx,pdf -o ./out")}
+${cyan("botverse transcode clip.mov --to mp4 -o ./out")}
+${cyan("botverse transcribe call.mp4 --to docx --attendees \"Sarah Chen,Mike Torres\"")}
+
+Docs: https://botverse.cloud/docs/cli`);
+}
+
+// ── arg parsing ───────────────────────────────────────────────────────────────
+function parseArgs(args) {
+  const flags = {}; const files = [];
+  for (let i = 0; i < args.length; i++) {
+    const a = args[i];
+    if (a.startsWith("--")) {
+      const key = a.slice(2);
+      const next = args[i + 1];
+      if (next !== undefined && !next.startsWith("-")) { flags[key] = next; i++; } else flags[key] = true;
+    } else if (a.startsWith("-")) {
+      const key = a.slice(1);
+      const next = args[i + 1];
+      if (next !== undefined && !next.startsWith("-")) { flags[key] = next; i++; } else flags[key] = true;
+    } else files.push(a);
+  }
+  return { flags, files };
+}
+
+const rawArgs = process.argv.slice(2);
+const command = rawArgs[0];
+const argv = parseArgs(rawArgs.slice(1));
+
+(async () => {
+  if (!command || command === "help" || argv.flags.help || argv.flags.h) return usage();
+  if (command === "version" || argv.flags.version || argv.flags.v) return log("botverse " + VERSION);
+  const fn = COMMANDS[command];
+  if (!fn) { log(red(`unknown command: ${command}`)); usage(); process.exit(1); }
+  try { await fn(); }
+  catch (e) { die(e.message || String(e)); }
+})();
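
The `mcp()` helper in cli.js wraps every tool invocation in a JSON-RPC 2.0 `tools/call` request, then unwraps either `structuredContent` or the first text content item of the result. The sketch below isolates those two steps with no network I/O. `buildToolCall` and `unwrapToolResult` are hypothetical names, and the sample response is invented for illustration rather than captured from the service.

```javascript
"use strict";

let nextId = 0;

// Build the request body for a single MCP tool call (JSON-RPC 2.0).
function buildToolCall(tool, args) {
  return JSON.stringify({
    jsonrpc: "2.0",
    id: ++nextId,
    method: "tools/call",
    params: { name: tool, arguments: args },
  });
}

// Unwrap a parsed JSON-RPC response the way cli.js does: prefer
// structuredContent, else parse the first content item's text as JSON.
function unwrapToolResult(json) {
  if (json.error) throw new Error(json.error.message || JSON.stringify(json.error));
  const payload = json.result?.structuredContent ?? json.result?.content?.[0]?.text;
  if (payload == null) throw new Error("unexpected response shape");
  return typeof payload === "string" ? JSON.parse(payload) : payload;
}

const body = buildToolCall("get_job_status", { job_id: "job_123" });
console.log(body);

// Invented sample response, shaped like a text-content tool result.
const sample = {
  jsonrpc: "2.0",
  id: 1,
  result: { content: [{ type: "text", text: '{"status":"complete"}' }] },
};
console.log(unwrapToolResult(sample).status);
```

One consequence of this shape: a text item that is not valid JSON makes `JSON.parse` throw, so a caller that expects free-text tool output would need to catch that case.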
package/package.json CHANGED
@@ -1,11 +1,12 @@
 {
   "name": "botverse-mcp",
-  "version": "1.0.4",
+  "version": "1.2.0",
   "mcpName": "io.github.MkTurner74/botverse",
-  "description": "MCP server for Botverse — video transcoding and document conversion for AI agents. $0.25/transcode · $0.05/convert · No AWS required.",
+  "description": "Botverse for AI agents and the command line — video transcoding and document conversion. MCP server + `botverse` CLI. $0.25/transcode · $0.05/convert · No AWS required.",
   "main": "index.js",
   "bin": {
-    "botverse-mcp": "index.js"
+    "botverse-mcp": "index.js",
+    "botverse": "cli.js"
   },
   "scripts": {
     "start": "node index.js"
package/tools.json CHANGED
@@ -152,7 +152,7 @@
   },
   {
     "name": "transcode_video",
-    "description": "Offload a video transcode to Botverse — encoding runs server-side so you can continue with other tasks. Returns a job_id immediately. Source must be ≤ 10 minutes and ≤ 5 GB. Poll get_job_status every 5 seconds until 'complete', then get_download_url. Wallet debited on completion.",
+    "description": "Offload a video transcode to Botverse — encoding runs server-side so you can continue with other tasks. Returns a job_id immediately. Source must be ≤ 60 minutes and ≤ 2 GB. Poll get_job_status every 5 seconds until 'complete', then get_download_url. Wallet debited on completion.",
     "inputSchema": {
       "type": "object",
       "properties": {
@@ -254,13 +254,13 @@
   },
   {
     "name": "get_job_status",
-    "description": "Poll the status of a transcode or convert job. Call every 5 seconds until status is 'complete' or 'failed'. Status 'queued' or 'processing' is normal — large files take 5–15 minutes. Keep polling indefinitely until a terminal status is reached. Do not stop polling after a fixed number of attempts.",
+    "description": "Poll the status of a transcode, convert, or transcribe job. Call every 5 seconds until status is 'complete' or 'failed'. Status 'queued' or 'processing' is normal — large files take 5–15 minutes; transcribe reports a live stage (converting audio → transcribing → AI augmenting → rendering). Keep polling indefinitely until a terminal status is reached. Do not stop polling after a fixed number of attempts.",
     "inputSchema": {
       "type": "object",
       "properties": {
         "job_id": {
           "type": "string",
-          "description": "Job ID returned by transcode_video, transcode_from_url, convert_file, convert_from_url, or convert_content."
+          "description": "Job ID returned by transcode_video, transcode_from_url, convert_file, convert_from_url, convert_content, transcribe_from_url, or transcribe_media."
         }
       },
       "required": [
@@ -321,7 +321,7 @@
       "properties": {
         "job_id": {
           "type": "string",
-          "description": "Job ID from transcode_video, transcode_from_url, or any convert tool."
+          "description": "Job ID from transcode_video, transcode_from_url, any convert tool, or any transcribe tool."
         }
       },
       "required": [
@@ -525,7 +525,7 @@
   },
   {
     "name": "submit_workflow",
-    "description": "Submit a multi-step workflow to the Botverse workflow engine. Steps execute in dependency order; parallel branches (multiple steps with the same depends_on) run simultaneously. Returns a workflow_id immediately — poll get_workflow_status every 5–10 seconds until terminal. Requires auto-refill to be enabled at botverse.cloud/dashboard/billing to prevent mid-workflow balance failures. Workflow definition uses BWDL (Botverse Workflow Definition Language) — schema at botverse.cloud/schemas/workflow/v1.json.",
+    "description": "Submit a multi-step workflow to the Botverse workflow engine. Steps execute in dependency order; parallel branches (multiple steps with the same depends_on) run simultaneously. Returns a workflow_id immediately — poll get_workflow_status every 5–10 seconds until terminal. INTER-STEP REFERENCES: pass a prior step's output into a later step with the string \"$.steps.<step_id>.output_key\" (e.g. a docx→pdf chain: step to_pdf has depends_on: [\"to_docx\"] and inputs {\"source_url\": \"$.steps.to_docx.output_key\", \"input_format\": \"docx\", \"output_format\": \"pdf\"} using tool convert_from_url). Workflow params are referenced as \"$.params.<name>\". No other template syntax (${...} etc.) is supported. BILLING: convert-only workflows run on wallet balance ($0.05/step). Workflows containing transcode or transcribe steps require auto-refill to be enabled at botverse.cloud/dashboard/billing (their cost scales with source duration). Workflow definition uses BWDL (Botverse Workflow Definition Language) — schema at botverse.cloud/schemas/workflow/v1.json.",
     "inputSchema": {
       "type": "object",
       "properties": {
@@ -746,5 +746,125 @@
       "idempotentHint": false,
       "openWorldHint": true
     }
+  },
+  {
+    "name": "transcribe_from_url",
+    "description": "Transcribe a video or audio file from a public HTTPS URL into a speaker-labelled transcript — ONE call does everything. Source can be a direct HTTPS URL or a Dropbox / Google Drive / Box share link (auto-resolved); OneDrive and SharePoint share links are unreliable — use a direct download URL, or upload via get_upload_url + transcribe_media. Internally: converts to audio, runs speech-to-text with speaker diarization, uses AI to name the speakers from your attendee list, and renders the document. Pass options.attendees (names, optional gender/role) and it tags who said what. Output formats: txt, json, srt, vtt, docx, pdf. CONSENT: you must have all parties' consent to record/transcribe. Returns a job_id immediately — report it to the user, then poll get_job_status (it reports a live stage: converting audio → transcribing → AI augmenting → rendering) until 'complete', then get_download_url. ~$0.08/audio-minute (~$5/hour), diarization + naming included.",
+    "inputSchema": {
+      "type": "object",
+      "properties": {
+        "source_url": {
+          "type": "string",
+          "description": "Public HTTPS URL of the source video or audio file."
+        },
+        "output_format": {
+          "type": "string",
+          "enum": [
+            "txt",
+            "json",
+            "srt",
+            "vtt",
+            "docx",
+            "pdf"
+          ],
+          "description": "Primary deliverable format."
+        },
+        "options": {
+          "type": "object",
+          "description": "Optional. attendees: [{name, gender?, role?}] to name speakers; language (BCP-47 or 'auto'); diarize (default true); max_speakers; title; include_timestamps; also_deliver: extra formats in the same job."
+        }
+      },
+      "required": [
+        "source_url",
+        "output_format"
+      ]
+    },
+    "outputSchema": {
+      "type": "object",
+      "properties": {
+        "job_id": {
+          "type": "string",
+          "description": "Unique identifier for this job. Pass to get_job_status and get_download_url."
+        },
+        "status": {
+          "type": "string",
+          "enum": [
+            "queued",
+            "processing"
+          ],
+          "description": "Initial job state."
+        }
+      },
+      "required": [
+        "job_id",
+        "status"
+      ]
+    },
+    "annotations": {
+      "readOnlyHint": false,
+      "destructiveHint": false,
+      "idempotentHint": false,
+      "openWorldHint": true
+    }
+  },
+  {
+    "name": "transcribe_media",
+    "description": "Transcribe an already-uploaded video/audio file (from get_upload_url) into a speaker-labelled transcript. Same one-call pipeline and options as transcribe_from_url (attendee naming, srt/vtt, formatted docx/pdf). Use for local files or files larger than a URL fetch allows (up to 2 GB). CONSENT: you must have all parties' consent. Poll get_job_status (live stage) until complete, then get_download_url. ~$0.08/audio-minute (~$5/hour).",
+    "inputSchema": {
+      "type": "object",
+      "properties": {
+        "object_key": {
+          "type": "string",
+          "description": "The object_key returned by get_upload_url."
+        },
+        "output_format": {
+          "type": "string",
+          "enum": [
+            "txt",
+            "json",
+            "srt",
+            "vtt",
+            "docx",
+            "pdf"
+          ],
+          "description": "Primary deliverable format."
+        },
+        "options": {
+          "type": "object",
+          "description": "Same options object as transcribe_from_url (attendees, language, diarize, max_speakers, title, include_timestamps, also_deliver)."
+        }
+      },
+      "required": [
+        "object_key",
+        "output_format"
+      ]
+    },
+    "outputSchema": {
+      "type": "object",
+      "properties": {
+        "job_id": {
+          "type": "string",
+          "description": "Unique identifier for this job. Pass to get_job_status and get_download_url."
+        },
+        "status": {
+          "type": "string",
+          "enum": [
+            "queued",
+            "processing"
+          ],
+          "description": "Initial job state."
+        }
+      },
+      "required": [
+        "job_id",
+        "status"
+      ]
+    },
+    "annotations": {
+      "readOnlyHint": false,
+      "destructiveHint": false,
+      "idempotentHint": false,
+      "openWorldHint": true
+    }
   }
 ]
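
The new transcription tools take an `options` object whose `attendees` entries name the speakers. The sketch below assembles `transcribe_from_url` arguments from a comma-separated attendee string, using the same split, trim, and filter steps that cli.js applies to `--attendees`. `buildTranscribeArgs` is a hypothetical helper and the URL is a placeholder. Only `name` is set on each attendee, although the schema description also mentions optional `gender` and `role`.

```javascript
"use strict";

// Output formats listed in the tools.json enum for both transcribe tools.
const FORMATS = ["txt", "json", "srt", "vtt", "docx", "pdf"];

function buildTranscribeArgs(sourceUrl, outputFormat, { attendees, language } = {}) {
  if (!FORMATS.includes(outputFormat)) {
    throw new Error(`unsupported output_format "${outputFormat}"`);
  }
  const options = {};
  if (attendees) {
    options.attendees = String(attendees)
      .split(",")
      .map((n) => ({ name: n.trim() }))
      .filter((a) => a.name); // drop empty names, e.g. from a trailing comma
  }
  if (language) options.language = language;

  const args = { source_url: sourceUrl, output_format: outputFormat };
  if (Object.keys(options).length) args.options = options;
  return args;
}

const transcribeArgs = buildTranscribeArgs("https://example.com/call.mp4", "docx", {
  attendees: "Sarah Chen, Mike Torres,",
});
console.log(JSON.stringify(transcribeArgs, null, 2));
```

For `transcribe_media` the same options object applies, with `object_key` from `get_upload_url` in place of `source_url`.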