reelforge 0.5.5 → 0.7.0

package/README.md CHANGED
@@ -1,222 +1,240 @@
1
- # reelforge
2
-
3
- > CLI for [ReelForge Studio](https://github.com/puke3615/ReelForge) — every REST API exposed as a command, with `--help` available at every level.
4
-
5
- ## Install
6
-
7
- ```bash
8
- npm install -g reelforge
9
- ```
10
-
11
- Or use directly without install:
12
-
13
- ```bash
14
- npx reelforge <command>
15
- ```
16
-
17
- After install, two binaries are on your `PATH` — `reelforge` and the short alias `rf`. Both behave identically; the docs use `rf` from here on.
18
-
19
- ```bash
20
- rf --version # same as `reelforge --version`
21
- ```
22
-
23
- ## Quick start
24
-
25
- The CLI ships pointing at the hosted instance (`https://reelforge.timor419.com`). Log in once, then call:
26
-
27
- ```bash
28
- npm install -g reelforge
29
- rf login # opens browser; headless? rf login <api_key>
30
- rf whoami # balance + api_keys
31
- rf create "为什么我们还没找到外星文明?" # auto-saves to ./<title>-<id>.mp4 in cwd
32
- ```
33
-
34
- That's the whole story — no server to run.
35
-
36
- ### Output behavior
37
-
38
- | invocation | result |
39
- |---|---|
40
- | `rf create "..."` | Saves to `./<sanitized-title>-<task_id_short>.mp4`, prints the path |
41
- | `rf create "..." -o ./videos/space.mp4` | Saves to that exact path (must include filename, not just a directory) |
42
- | `rf create "..." --no-download` | Skips local save, prints JSON result with `video_url` |
43
- | `rf create "..." \| jq .video_url` | When stdout is piped, download is skipped automatically |
44
-
45
- ### Self-hosting
46
-
47
- To run your own ReelForge Studio (your own RelayX key and pricing), clone the upstream repo, run `pnpm dev`, then point the CLI at it:
48
-
49
- ```bash
50
- rf --server http://localhost:8501 health
51
- # or persist:
52
- export REELFORGE_SERVER=http://localhost:8501
53
- # or via `rf login <key> --server http://localhost:8501`
54
- ```
55
-
56
- ## Global options
57
-
58
- | flag | description |
59
- |---|---|
60
- | `-s, --server <url>` | ReelForge server URL (overrides `$REELFORGE_SERVER`; default `https://reelforge.timor419.com`) |
61
- | `-k, --api-key <key>` | API key (overrides `$REELFORGE_API_KEY` and `reelforge login` saved key) |
62
- | `--json` | Output raw JSON instead of pretty text — pipe-friendly |
63
- | `--quiet` | Suppress informational messages on stderr |
64
- | `-v, --version` | Show CLI version |
65
- | `-h, --help` | Show help (works on every sub-command) |
66
-
67
- ## Command map
68
-
69
- Run `rf <command> --help` for full details on any of these.
70
-
71
- ### Core capabilities
72
-
73
- | command | what it does |
74
- |---|---|
75
- | `llm chat -p <text>` | Send one prompt to the configured LLM (RelayX gateway by default) |
76
- | `llm presets` | List built-in RelayX model presets |
77
- | `tts edge -t <text> -o out.mp3` | Local Edge TTS synthesis (free) |
78
- | `tts relayx -t <text> -o out.mp3` | RelayX TTS (vox/index-tts-2, 149 built-in voices) |
79
- | `tts voices [--locale zh]` | List supported Edge TTS voices |
80
- | `images generate -p <prompt> -m rx-image-flux` | Image generation via RelayX (rx-image-z / rx-image-flux / rx-image-qwen) |
81
-
82
- ### Content generation
83
-
84
- | command | what it does |
85
- |---|---|
86
- | `content narration -t <topic>` | Generate N narration sentences from a topic |
87
- | `content split -s <script>` | Split a fixed script into narrations |
88
- | `content image-prompts -i <file>` | English image prompts from narration list |
89
- | `content title -c <content>` | Generate a short video title |
90
- | `content asset-script --intent ... --assets <file>` | Asset-based scene script |
91
-
92
- ### Composition
93
-
94
- | command | what it does |
95
- |---|---|
96
- | `templates list [--size 1080x1920] [--type image]` | List HTML frame templates |
97
- | `templates preview <keyOrPath> [-o out.png]` | Render a preview from a preset key **or your own local .html file** |
98
- | `templates show <key> [-o file.html]` | Print or save the source HTML of any preset; copy it as a starting point for a custom template |
99
- | `frames render -t <keyOrPath> --title ... --text ...` | Render a single composed frame to PNG. `-t` accepts a preset key **or a local .html path** |
100
- | `compositions concat <v1> <v2> -o out.mp4` | FFmpeg concat (+ optional BGM) |
101
- | `compositions bgm -i video.mp4 --bgm bgm.mp3 -o out.mp4` | Add background music |
102
- | `compositions image-to-video -i img.png -a aud.mp3 -o out.mp4` | Build video from image + audio |
103
- | `compositions overlay -v video.mp4 --overlay overlay.png -o out.mp4` | Overlay PNG on video |
104
-
105
- ### End-to-end pipelines
106
-
107
- All `pipelines *` commands submit an **async task** and (by default) poll until it finishes with a live progress indicator on stderr. Use `--no-wait` to return immediately with a `task_id`, then `rf tasks wait <id>` later.
108
-
109
- | command | what it does |
110
- |---|---|
111
- | `pipelines standard -t <topic\|script>` | Topic / script → narration → frames → final MP4 |
112
-
113
- ### Resources
114
-
115
- | command | what it does |
116
- |---|---|
117
- | `bgm list / upload <file> / delete <name>` | Manage background music |
118
- | `files list / upload <file> / download <path> / delete <path>` | Manage user assets |
119
-
120
- ### System
121
-
122
- | command | what it does |
123
- |---|---|
124
- | `config get` | Read server config (keys masked) |
125
- | `config set <key> <value>` | Update a dotted-path setting (e.g. `llm.api_key sk-xxx`) |
126
- | `config patch <file>` | Apply a JSON-merge patch |
127
- | `tasks list [--status running]` | List recent tasks |
128
- | `tasks get <id>` / `tasks wait <id>` / `tasks cancel <id>` | Task lifecycle |
129
- | `history list / get <id> / delete <id>` | Browse / delete completed runs |
130
- | `health` | Server health + capability check |
131
-
132
- ## Examples
133
-
134
- ```bash
135
- # 1. Generate a video with one command (auto-saves to ./<title>-<id>.mp4 in cwd)
136
- rf create "为什么我们还没找到外星文明?"
137
-
138
- # 2. Same, but with a fixed script and explicit output path
139
- rf pipelines standard \
140
- -t "Hello world. This is scene one.\n\nThis is scene two." \
141
- --mode fixed --title "Smoke Test" \
142
- --frame-template 1080x1920/static_default.html \
143
- --tts-voice en-US-AriaNeural -o smoke.mp4
144
-
145
- # 3. Inspect existing tasks & redownload a finished video
146
- rf tasks list --limit 5
147
- rf history get <task-id> --download recovered.mp4
148
-
149
- # 4. JSON pipe for automation
150
- rf llm presets --json | jq '.[].defaultModel'
151
-
152
- # 5. Configure & test LLM (self-hosted)
153
- rf config set llm.api_key rx-xxxxx # RelayX key (or your own provider key)
154
- rf config set llm.base_url https://relayx.timor419.com/v1
155
- rf config set llm.model anthropic/claude-4-7-sonnet
156
- rf llm chat -p 'one-sentence summary of antifragile'
157
-
158
- # 6. Use your own HTML template (no PR/release needed)
159
- # Any of -t / --frame-template that points to a local .html file is read and
160
- # sent inline. Declare size inside the file via
161
- # <meta name="template:width" content="1080">
162
- # <meta name="template:height" content="1920">
163
- # or pass --size 1080x1920 on the CLI.
164
- rf templates show 1080x1920/image_default.html -o my-brand.html # copy a preset
165
- # ...edit my-brand.html to suit your style...
166
- rf templates preview ./my-brand.html --title "Hello" -o preview.png
167
- rf frames render -t ./my-brand.html --values '{"author":"Alice"}' -o frame.png
168
- rf pipelines standard -t "宠物" --frame-template ./my-brand.html -o final.mp4
169
- ```
170
-
171
- ### Custom HTML templates
172
-
173
- Easiest way to start: grab a preset as a reference.
174
-
175
- ```bash
176
- rf templates list # see all keys
177
- rf templates show 1080x1920/static_default.html # print to stdout
178
- rf templates show 1080x1920/image_default.html -o my-brand.html # save and edit
179
- ```
180
-
181
- `{{title}}`, `{{text}}`, `{{image}}`, `{{index}}` are reserved built-ins; everything else uses the `{{name:type=default}}` DSL (`type` ∈ `text|number|color|bool`). Pass extras through `--values '{"author":"Alice"}'` (or `template_params` on the pipeline API).
182
-
183
- #### Template type: does the pipeline generate an AI image per scene?
184
-
185
- When you ship an inline template through `rf create` / `rf pipelines standard`, ReelForge needs to know whether each scene should kick off RelayX image generation. Resolution priority (high → low):
186
-
187
- 1. Explicit flag — `--frame-template-type image|static|asset` (or `frame_template_type` in the API body).
188
- 2. Inside the HTML: `<meta name="template:type" content="image">` (or `static` / `asset`).
189
- 3. **Default: `image`** — best practice for zero-config users. If your template doesn't reference scene imagery (pure-text card, etc.), declare `static` explicitly to skip image generation and its cost.
190
-
191
- The placeholder `{{image}}` no longer doubles as a type signal — declare type explicitly.
192
-
193
- Limits and safety:
194
-
195
- - Max 2 MB per inline HTML.
196
- - The render sandbox blocks `file://`, loopback / private / link-local IPs, CGNAT range, cloud-metadata, and `*.local` / `*.internal` hostnames. So your template can only reference public `https`/`http` resources or `data:` URIs.
197
- - If the CLI is talking to a hosted server, a local-path `--image` won't reach the server; either upload the file with `rf files upload` first or use an HTTPS URL / `data:` URI.
198
-
199
- #### API field reference
200
-
201
- | endpoint | inline HTML field | size field | type field |
202
- |---|---|---|---|
203
- | `POST /api/v1/frames/render` | `template_html` | `size` | (n/a, no image generation) |
204
- | `POST /api/v1/templates/preview` | `template_html` | `size` | — |
205
- | `POST /api/v1/pipelines/standard` | `frame_template_inline` | `frame_template_size` | `frame_template_type` |
206
-
207
- The pipeline endpoint uses the `frame_template_*` prefix because it already has a `frame_template` field (preset key). The single-frame endpoints use the shorter `template_html` because they don't.
208
-
209
- ## Tip: getting unstuck
210
-
211
- Every level has `--help`:
212
-
213
- ```bash
214
- rf --help # top-level overview
215
- rf pipelines --help # list of pipelines
216
- rf pipelines standard --help # full option reference
217
- rf tts edge --help # one specific command
218
- ```
219
-
220
- ## License
221
-
222
- Apache-2.0
1
+ # reelforge
2
+
3
+ > CLI for [ReelForge Studio](https://github.com/puke3615/ReelForge) — every REST API exposed as a command, with `--help` available at every level.
4
+
5
+ ## Install
6
+
7
+ ```bash
8
+ npm install -g reelforge
9
+ ```
10
+
11
+ Or use directly without install:
12
+
13
+ ```bash
14
+ npx reelforge <command>
15
+ ```
16
+
17
+ After install, two binaries are on your `PATH` — `reelforge` and the short alias `rf`. Both behave identically; the docs use `rf` from here on.
18
+
19
+ ```bash
20
+ rf --version # same as `reelforge --version`
21
+ ```
22
+
23
+ ## Quick start
24
+
25
+ The CLI ships pointing at the hosted instance (`https://reelforge.timor419.com`). Log in once, then call:
26
+
27
+ ```bash
28
+ npm install -g reelforge
29
+ rf login # opens browser; headless? rf login <api_key>
30
+ rf whoami # balance + api_keys
31
+ rf create "为什么我们还没找到外星文明?" # auto-saves to ./<title>-<id>.mp4 in cwd
32
+ ```
33
+
34
+ That's the whole story — no server to run.
35
+
36
+ ### Output behavior
37
+
38
+ | invocation | result |
39
+ |---|---|
40
+ | `rf create "..."` | Saves to `./<sanitized-title>-<task_id_short>.mp4`, prints the path |
41
+ | `rf create "..." -o ./videos/space.mp4` | Saves to that exact path (must include filename, not just a directory) |
42
+ | `rf create "..." --no-download` | Skips local save, prints JSON result with `video_url` |
43
+ | `rf create "..." \| jq .video_url` | When stdout is piped, download is skipped automatically |
44
+
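The four rows of the output table can be read as a small decision function. The following is a minimal sketch of that reading, not code from the package: `planOutput`, the title-sanitising rule, the 8-character short id, and the choice that an explicit `-o` wins over a piped stdout are all assumptions made for illustration.

```javascript
// Hypothetical helper; the names and the sanitising rule are assumptions,
// not reelforge internals.
function planOutput({ title, taskId, output, noDownload = false, stdoutIsTTY = true }) {
  // --no-download: never save locally, only print the JSON result.
  if (noDownload) return { download: false };
  // -o <file>: save to exactly that path (assumed to win over a piped stdout).
  if (output) return { download: true, path: output };
  // Piped stdout (for example `| jq .video_url`): skip the download.
  if (!stdoutIsTTY) return { download: false };
  // Default: ./<sanitized-title>-<task_id_short>.mp4 in the current directory.
  const safeTitle = title.replace(/[\\\/:*?"<>|\s]+/g, "_");
  const shortId = taskId.slice(0, 8);
  return { download: true, path: `./${safeTitle}-${shortId}.mp4` };
}
```

In a real CLI the `stdoutIsTTY` value would typically come from `process.stdout.isTTY`.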
45
+ ### Self-hosting
46
+
47
+ To run your own ReelForge Studio (your own RelayX key and pricing), clone the upstream repo, run `pnpm dev`, then point the CLI at it:
48
+
49
+ ```bash
50
+ rf --server http://localhost:8501 health
51
+ # or persist:
52
+ export REELFORGE_SERVER=http://localhost:8501
53
+ # or via `rf login <key> --server http://localhost:8501`
54
+ ```
55
+
56
+ ## Global options
57
+
58
+ | flag | description |
59
+ |---|---|
60
+ | `-s, --server <url>` | ReelForge server URL (overrides `$REELFORGE_SERVER`; default `https://reelforge.timor419.com`) |
61
+ | `-k, --api-key <key>` | API key (overrides `$REELFORGE_API_KEY` and `reelforge login` saved key) |
62
+ | `--json` | Output raw JSON instead of pretty text — pipe-friendly |
63
+ | `--quiet` | Suppress informational messages on stderr |
64
+ | `-v, --version` | Show CLI version |
65
+ | `-h, --help` | Show help (works on every sub-command) |
66
+
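The `--server` row describes a three-level precedence: flag, then `$REELFORGE_SERVER`, then the hosted default. A minimal sketch of that lookup, assuming a hypothetical `resolveServer` helper; a server saved by `rf login --server` is left out because the README does not say where it ranks.

```javascript
// Hypothetical helper illustrating the documented precedence only.
const DEFAULT_SERVER = "https://reelforge.timor419.com";

function resolveServer(flagValue, env = process.env) {
  // 1. An explicit -s/--server flag wins.
  if (flagValue) return flagValue;
  // 2. Otherwise use $REELFORGE_SERVER when it is set and non-empty.
  if (env.REELFORGE_SERVER) return env.REELFORGE_SERVER;
  // 3. Otherwise fall back to the hosted instance.
  return DEFAULT_SERVER;
}
```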
67
+ ## Command map
68
+
69
+ Run `rf <command> --help` for full details on any of these.
70
+
71
+ ### Core capabilities
72
+
73
+ | command | what it does |
74
+ |---|---|
75
+ | `llm chat -p <text>` | Send one prompt to the configured LLM (RelayX gateway by default) |
76
+ | `llm presets` | List built-in RelayX model presets |
77
+ | `tts edge -t <text> -o out.mp3` | Local Edge TTS synthesis (free) |
78
+ | `tts relayx -t <text> -o out.mp3` | RelayX TTS (vox/index-tts-2, 149 built-in voices) |
79
+ | `tts voices [--locale zh]` | List supported Edge TTS voices |
80
+ | `images generate -p <prompt> -m rx-image-flux` | Image generation via RelayX (rx-image-z / rx-image-flux / rx-image-qwen) |
81
+
82
+ ### Content / audio / subtitle atomics
83
+
84
+ | command | what it does |
85
+ |---|---|
86
+ | `content scene-plan -t <topic>` | Single LLM call: title + master script + per-scene image prompts (replaces the old narration / split / image-prompts / title commands) |
87
+ | `content scene-plan --script <text-or-@file>` | Same, but the user supplies the script verbatim; the LLM only segments it and writes image prompts |
88
+ | `audio transcribe -f <file>` / `--url <url>` | RelayX paraformer-v2 ASR with word + segment timestamps |
89
+ | `subtitles split -t <text-or-@file>` | Deterministic tiered-punctuation subtitle line splitter (pure function, zero billing) |
90
+
91
+ ### Composition
92
+
93
+ | command | what it does |
94
+ |---|---|
95
+ | `templates list [--size 1080x1920] [--type image]` | List HTML frame templates |
96
+ | `templates preview <keyOrPath> [-o out.png]` | Render a preview from a preset key **or your own local .html file** |
97
+ | `templates show <key> [-o file.html]` | Print or save the source HTML of any preset; copy it as a starting point for a custom template |
98
+ | `frames render -t <keyOrPath> --title ... --text ...` | Render a single composed frame to PNG. `-t` accepts a preset key **or a local .html path** |
99
+ | `compositions concat <v1> <v2> -o out.mp4` | FFmpeg concat (+ optional BGM) |
100
+ | `compositions bgm -i video.mp4 --bgm bgm.mp3 -o out.mp4` | Add background music |
101
+ | `compositions image-to-video -i img.png -a aud.mp3 -o out.mp4` | Build video from image + audio |
102
+ | `compositions overlay -v video.mp4 --overlay overlay.png -o out.mp4` | Overlay PNG on video |
103
+
104
+ ### End-to-end pipelines
105
+
106
+ All `pipelines *` commands submit an **async task** and (by default) poll until it finishes with a live progress indicator on stderr. Use `--no-wait` to return immediately with a `task_id`, then `rf tasks wait <id>` later.
107
+
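The submit-then-poll behaviour described above can be sketched as a small loop. This is an illustration under stated assumptions, not the CLI's implementation: `waitForTask`, the injected `getTask` function, the terminal status names, and the interval and timeout defaults are all hypothetical (only `running` appears in the README, via `tasks list --status running`).

```javascript
// Hypothetical polling loop; status names and defaults are assumptions.
const TERMINAL_STATUSES = new Set(["completed", "failed", "cancelled"]);

async function waitForTask(getTask, taskId, { intervalMs = 2000, timeoutMs = 600000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // getTask is injected so the loop is independent of any HTTP client.
    const task = await getTask(taskId);
    if (TERMINAL_STATUSES.has(task.status)) return task;
    if (Date.now() > deadline) {
      throw new Error(`task ${taskId} still ${task.status} after ${timeoutMs} ms`);
    }
    // Sleep before the next poll.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A `--no-wait` style caller would skip this loop and only print the `task_id`.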
108
+ The standard pipeline is **audio-first**: scene-plan → one-shot TTS → ASR alignment → per-scene image generation → per-subtitle-line frame rendering → ffmpeg mux. One continuous master audio track; image cuts at scene boundaries; subtitle cuts at line boundaries.
109
+
110
+ | command | what it does |
111
+ |---|---|
112
+ | `pipelines standard -t <topic>` (or `--script <text>`) | Audio-first pipeline; `-d/--duration` and `-p/--pace` are the two main knobs |
113
+
114
+ ### Resources
115
+
116
+ | command | what it does |
117
+ |---|---|
118
+ | `bgm list / upload <file> / delete <name>` | Manage background music |
119
+ | `files list / upload <file> / download <path> / delete <path>` | Manage user assets |
120
+
121
+ ### System
122
+
123
+ | command | what it does |
124
+ |---|---|
125
+ | `config get` | Read server config (keys masked) |
126
+ | `config set <key> <value>` | Update a dotted-path setting (e.g. `llm.api_key sk-xxx`) |
127
+ | `config patch <file>` | Apply a JSON-merge patch |
128
+ | `tasks list [--status running]` | List recent tasks |
129
+ | `tasks get <id>` / `tasks wait <id>` / `tasks cancel <id>` | Task lifecycle |
130
+ | `history list / get <id> / delete <id>` | Browse / delete completed runs |
131
+ | `health` | Server health + capability check |
132
+
133
+ ## Examples
134
+
135
+ ```bash
136
+ # 1. Generate a video with one command (45s default, AI writes the script)
137
+ rf create "为什么我们还没找到外星文明?"
138
+
139
+ # 2. Longer video with a slower visual rhythm
140
+ rf create "深夜便利店的灯光" -d 90 -p slow
141
+
142
+ # 3. Your own script — no narration-splitting on your side, the pipeline handles it
143
+ rf create --script @./my-script.txt
144
+ rf create --script "雨水缓缓滑落在玻璃窗上,像是无声的泪珠。"
145
+
146
+ # 4. Pick a built-in visual style preset
147
+ rf create "美食教程" --style photorealistic
148
+
149
+ # 5. Pipeline form with explicit output path
150
+ rf pipelines standard \
151
+ --script @./script.txt \
152
+ --frame-template 1080x1920/image_default.html \
153
+ -p normal -o smoke.mp4
154
+
155
+ # 6. Inspect existing tasks & redownload a finished video
156
+ rf tasks list --limit 5
157
+ rf history get <task-id> --download recovered.mp4
158
+
159
+ # 7. Atomics for stand-alone use
160
+ rf content scene-plan -t "雨天的玻璃窗" -d 45 --json | jq .scenes
161
+ rf audio transcribe -f narration.mp3 --json | jq '.words[:5]'
162
+ rf subtitles split -t @./narration.txt --min 10 --hard-max 24
163
+
164
+ # 8. JSON pipe for automation
165
+ rf llm presets --json | jq '.[].defaultModel'
166
+
167
+ # 9. Configure & test LLM (self-hosted)
168
+ rf config set llm.api_key rx-xxxxx # RelayX key (or your own provider key)
169
+ rf config set llm.base_url https://relayx.timor419.com/v1
170
+ rf config set llm.model anthropic/claude-4-7-sonnet
171
+ rf llm chat -p 'one-sentence summary of antifragile'
172
+
173
+ # 10. Use your own HTML template (no PR/release needed)
174
+ # Any --frame-template that points to a local .html file is read and sent
175
+ # inline. Declare size inside the file via
176
+ # <meta name="template:width" content="1080">
177
+ # <meta name="template:height" content="1920">
178
+ # or pass --frame-template-size 1080x1920.
179
+ rf templates show 1080x1920/image_default.html -o my-brand.html # copy a preset
180
+ # ...edit my-brand.html to suit your style...
181
+ rf templates preview ./my-brand.html --title "Hello" -o preview.png
182
+ rf frames render -t ./my-brand.html --values '{"author":"Alice"}' -o frame.png
183
+ rf pipelines standard -t "宠物" --frame-template ./my-brand.html -o final.mp4
184
+ ```
185
+
186
+ ### Custom HTML templates
187
+
188
+ Easiest way to start: grab a preset as a reference.
189
+
190
+ ```bash
191
+ rf templates list # see all keys
192
+ rf templates show 1080x1920/static_default.html # print to stdout
193
+ rf templates show 1080x1920/image_default.html -o my-brand.html # save and edit
194
+ ```
195
+
196
+ `{{title}}`, `{{text}}`, `{{image}}`, `{{index}}`, `{{total}}` are reserved built-ins auto-injected by the pipeline; everything else uses the `{{name:type=default}}` DSL (`type` ∈ `text|number|color|bool`). Pass extras through `--values '{"author":"Alice"}'` (or `template_params` on the pipeline API).
197
+
198
+ - `{{index}}`: current scene number, 1-based
199
+ - `{{total}}`: scene count the LLM actually produced (use this for "scene N of M" badges; don't hardcode it in `template_params`, because the scene count is decided at runtime)
200
+
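To make the `{{name:type=default}}` DSL concrete, here is a minimal sketch that lists the custom parameters a template declares. It is an assumption-laden illustration rather than the real parser: `listCustomParams`, the tolerance for whitespace inside the braces, and defaulting an omitted type to `text` are all hypothetical choices.

```javascript
// Hypothetical parser sketch; the real template engine may differ.
const RESERVED = new Set(["title", "text", "image", "index", "total"]);
const PLACEHOLDER =
  /\{\{\s*([A-Za-z_]\w*)\s*(?::\s*(text|number|color|bool))?\s*(?:=([^}]*))?\}\}/g;

function listCustomParams(html) {
  const params = new Map();
  for (const match of html.matchAll(PLACEHOLDER)) {
    // Groups that did not participate are undefined, so the defaults apply.
    const [, name, type = "text", fallback = ""] = match;
    if (RESERVED.has(name)) continue; // built-ins are injected by the pipeline
    if (!params.has(name)) params.set(name, { type, default: fallback.trim() });
  }
  return [...params].map(([name, spec]) => ({ name, ...spec }));
}
```

Each entry returned here corresponds to a key that could be supplied through `--values`.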
201
+ #### Template type: does the pipeline generate an AI image per scene?
202
+
203
+ When you ship an inline template through `rf create` / `rf pipelines standard`, ReelForge needs to know whether each scene should kick off RelayX image generation. Resolution priority (high → low):
204
+
205
+ 1. Explicit flag: `--frame-template-type image|static|asset` (or `frame_template_type` in the API body).
206
+ 2. Inside the HTML: `<meta name="template:type" content="image">` (or `static` / `asset`).
207
+ 3. **Default: `image`**, the best practice for zero-config users. If your template doesn't reference scene imagery (pure-text card, etc.), declare `static` explicitly to skip image generation and its cost.
208
+
209
+ The placeholder `{{image}}` no longer doubles as a type signal; declare the type explicitly.
210
+
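The resolution priority above (flag, then `<meta>` tag, then the `image` default) can be sketched as follows. `resolveTemplateType` is a hypothetical name, and the regular expression is a simplification that assumes the `name` attribute comes before `content`; the server's actual HTML handling is not shown in this diff.

```javascript
// Hypothetical sketch of the documented priority order.
const VALID_TYPES = new Set(["image", "static", "asset"]);

function resolveTemplateType(html, flagValue) {
  // 1. Explicit --frame-template-type / frame_template_type.
  if (flagValue !== undefined) {
    if (!VALID_TYPES.has(flagValue)) throw new Error(`invalid template type: ${flagValue}`);
    return flagValue;
  }
  // 2. <meta name="template:type" content="..."> inside the HTML.
  const meta = html.match(
    /<meta\s+name=["']template:type["']\s+content=["']([^"']+)["']\s*\/?>/i
  );
  if (meta && VALID_TYPES.has(meta[1])) return meta[1];
  // 3. Default: image. Note that {{image}} in the body is deliberately ignored.
  return "image";
}
```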
211
+ Limits and safety:
212
+
213
+ - Max 2 MB per inline HTML.
214
+ - The render sandbox blocks `file://`, loopback / private / link-local IPs, CGNAT range, cloud-metadata, and `*.local` / `*.internal` hostnames. So your template can only reference public `https`/`http` resources or `data:` URIs.
215
+ - If the CLI is talking to a hosted server, a local-path `--image` won't reach the server; either upload the file with `rf files upload` first or use an HTTPS URL / `data:` URI.
216
+
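As a rough pre-flight check, the sandbox rules above can be approximated on the client before sending a template. This is only a sketch under assumptions: `isAllowedTemplateUrl` is a hypothetical helper, it inspects the literal hostname without resolving DNS, and it covers IPv6 only for the `::1` loopback, so the server-side sandbox remains the authority.

```javascript
// Hypothetical client-side approximation of the documented sandbox rules.
function isAllowedTemplateUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return false; // not an absolute URL
  }
  if (url.protocol === "data:") return true;
  if (url.protocol !== "http:" && url.protocol !== "https:") return false; // e.g. file://

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host === "[::1]") return false;
  if (host.endsWith(".local") || host.endsWith(".internal")) return false;

  const ipv4 = host.match(/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/);
  if (ipv4) {
    const a = Number(ipv4[1]);
    const b = Number(ipv4[2]);
    if (a === 127) return false;                         // loopback
    if (a === 10) return false;                          // private 10.0.0.0/8
    if (a === 172 && b >= 16 && b <= 31) return false;   // private 172.16.0.0/12
    if (a === 192 && b === 168) return false;            // private 192.168.0.0/16
    if (a === 169 && b === 254) return false;            // link-local, incl. cloud metadata
    if (a === 100 && b >= 64 && b <= 127) return false;  // CGNAT 100.64.0.0/10
  }
  return true;
}
```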
217
+ #### API field reference
218
+
219
+ | endpoint | inline HTML field | size field | type field |
220
+ |---|---|---|---|
221
+ | `POST /api/v1/frames/render` | `template_html` | `size` | — (n/a, no image generation) |
222
+ | `POST /api/v1/templates/preview` | `template_html` | `size` | — |
223
+ | `POST /api/v1/pipelines/standard` | `frame_template_inline` | `frame_template_size` | `frame_template_type` |
224
+
225
+ The pipeline endpoint uses the `frame_template_*` prefix because it already has a `frame_template` field (preset key). The single-frame endpoints use the shorter `template_html` because they don't.
226
+
227
+ ## Tip — getting unstuck
228
+
229
+ Every level has `--help`:
230
+
231
+ ```bash
232
+ rf --help # top-level overview
233
+ rf pipelines --help # list of pipelines
234
+ rf pipelines standard --help # full option reference
235
+ rf tts edge --help # one specific command
236
+ ```
237
+
238
+ ## License
239
+
240
+ Apache-2.0
@@ -0,0 +1,73 @@
1
+ import fs from "node:fs/promises";
2
+ import path from "node:path";
3
+ import { uploadMultipart, post } from "../client.js";
4
+ import { print } from "../utils/output.js";
5
+ export function registerAudio(program) {
6
+ const audio = program
7
+ .command("audio")
8
+ .description("Audio atomics — transcription / forced alignment")
9
+ .helpOption("-h, --help", "show help");
10
+ audio
11
+ .command("transcribe")
12
+ .description("Transcribe an audio file to text + word-level timestamps (RelayX paraformer-v2)")
13
+ .helpOption("-h, --help", "show help")
14
+ .option("-f, --file <path>", "local audio file (mp3/wav/m4a). Use this OR --url.")
15
+ .option("-u, --url <url>", "remote audio URL — server downloads and transcribes.")
16
+ .option("-l, --language <code>", "language hint (e.g. zh, en). Optional — paraformer-v2 auto-detects.")
17
+ .option("-m, --model <id>", "override ASR model id (default alibaba/paraformer-v2)")
18
+ .option("-o, --output <file>", "write the full JSON response to this file as well as stdout")
19
+ .addHelpText("after", [
20
+ "",
21
+ "Examples:",
22
+ " rf audio transcribe -f ./narration.mp3",
23
+ " rf audio transcribe --url https://example.com/clip.mp3 --language zh",
24
+ " rf audio transcribe -f ./voice.wav --json | jq '.words[:5]'",
25
+ ].join("\n"))
26
+ .action(async (opts) => {
27
+ if (!opts.file && !opts.url) {
28
+ throw new Error("either --file or --url is required");
29
+ }
30
+ if (opts.file && opts.url) {
31
+ throw new Error("--file and --url are mutually exclusive");
32
+ }
33
+ let r;
34
+ if (opts.file) {
35
+ const buf = await fs.readFile(opts.file);
36
+ const filename = path.basename(opts.file);
37
+ const ext = path.extname(filename).toLowerCase();
38
+ // Map the extension to a MIME type; anything unrecognised falls back to audio/mpeg.
+ const mime = ext === ".wav" ? "audio/wav" :
39
+ ext === ".m4a" ? "audio/mp4" :
40
+ ext === ".flac" ? "audio/flac" :
41
+ ext === ".ogg" ? "audio/ogg" :
42
+ "audio/mpeg";
43
+ const fileBlob = new File([new Uint8Array(buf)], filename, { type: mime });
44
+ const fields = { file: fileBlob };
45
+ if (opts.language)
46
+ fields.language = opts.language;
47
+ if (opts.model)
48
+ fields.model = opts.model;
49
+ r = await uploadMultipart("/api/v1/audio/transcribe", fields);
50
+ }
51
+ else {
52
+ const body = { audio_url: opts.url };
53
+ if (opts.language)
54
+ body.language = opts.language;
55
+ if (opts.model)
56
+ body.model = opts.model;
57
+ r = await post("/api/v1/audio/transcribe", body);
58
+ }
59
+ if (opts.output) {
60
+ await fs.writeFile(opts.output, JSON.stringify(r, null, 2), "utf-8");
61
+ }
62
+ print({
63
+ model: r.model,
64
+ language: r.language,
65
+ duration: r.duration,
66
+ text: r.text,
67
+ n_segments: r.segments.length,
68
+ n_words: r.words.length,
69
+ segments: r.segments,
70
+ words: r.words,
71
+ });
72
+ });
73
+ }
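The chained ternary that picks the upload MIME type in `transcribe` could also be written as a lookup table. The sketch below is an alternative formulation, not part of the published file: `audioMimeFor` and `AUDIO_MIME_BY_EXT` are hypothetical names, and the function expects a base filename (no directory part), as produced by `path.basename` in the original.

```javascript
// Hypothetical refactor of the extension-to-MIME mapping used for uploads.
const AUDIO_MIME_BY_EXT = new Map([
  [".wav", "audio/wav"],
  [".m4a", "audio/mp4"],
  [".flac", "audio/flac"],
  [".ogg", "audio/ogg"],
]);

function audioMimeFor(filename) {
  // Take everything from the last dot; a leading dot alone is not an extension.
  const dot = filename.lastIndexOf(".");
  const ext = dot > 0 ? filename.slice(dot).toLowerCase() : "";
  // Same fallback as the original code: unknown extensions become audio/mpeg.
  return AUDIO_MIME_BY_EXT.get(ext) ?? "audio/mpeg";
}
```

A table keeps the supported extensions in one place, which makes it easier to keep the `--file` help text in sync with what is actually mapped.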