miriad-viz 0.4.1 → 0.4.2
package/dist-cli/index.js
CHANGED
@@ -627,18 +627,21 @@ function computeNext(progress, existingFiles, logTails, flags = {}, dataSummary,
 output.push(" Each line has an id, speaker, text, and style.");
 output.push("");
 output.push(" WORKFLOW:");
-output.push(
-
-
+output.push(
+" 1. List available voices: GET /v1/voices (header: xi-api-key: $ELEVENLABS_API_KEY)"
+);
+output.push(" 2. Read data/script.json \u2014 get speakers and line counts");
+output.push(' 3. Suggest a voice for each speaker (be proactive, not "what do you want?")');
+output.push(" 4. Generate one test clip per speaker for human approval");
+output.push(" 5. Generate all clips: audio/{lineId}.mp3");
 output.push(" \u26A0\uFE0F Filename MUST match script line ID exactly");
 output.push(" narrator-01 \u2192 audio/narrator-01.mp3");
-output.push("
-output.push("
+output.push(" 6. Human listens and gives feedback");
+output.push(" 7. Regenerate specific lines as needed");
 output.push("");
-output.push("
-output.push(" \u{
-output.push("
-output.push(" Voices: https://elevenlabs.io/voice-library");
+output.push(" API:");
+output.push(" \u{1F50D} List voices: GET /v1/voices");
+output.push(" \u{1F5E3}\uFE0F Generate clip: POST /v1/text-to-speech/{voice_id}");
 output.push("");
 output.push(" When approved: npx miriad-viz next --voices-approved");
 output.push("");
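
The workflow text in this hunk requires each clip to be saved as `audio/{lineId}.mp3`, with the filename matching the script line ID exactly. A minimal sketch of how that rule could be checked mechanically follows. `audioPathFor` and `checkClips` are hypothetical helpers, not functions from miriad-viz, and the sketch assumes the existing files are available as a flat list of relative paths.

```javascript
// Hypothetical helper: the audio path a script line ID must map to.
function audioPathFor(lineId) {
  return `audio/${lineId}.mp3`;
}

// Compare expected clip paths against the files that actually exist.
// The comparison is case-sensitive, so "Lead-03" does not satisfy "lead-03".
function checkClips(lineIds, existingFiles) {
  const expected = new Set(lineIds.map(audioPathFor));
  const present = new Set(existingFiles);
  const isClip = (p) => p.startsWith("audio/") && p.endsWith(".mp3");
  return {
    // Script lines that have no clip yet.
    missing: [...expected].filter((p) => !present.has(p)),
    // Clips whose filename matches no script line ID.
    unexpected: existingFiles.filter((p) => isClip(p) && !expected.has(p)),
  };
}

// Illustrative data only.
const report = checkClips(
  ["narrator-01", "lead-03"],
  ["audio/narrator-01.mp3", "audio/Lead-03.mp3", "data/script.json"]
);
console.log(report);
```

In this example the mis-capitalized file is reported twice over: `audio/lead-03.mp3` is listed as missing and `audio/Lead-03.mp3` as unexpected, which is the kind of mismatch the warning in the workflow text is about.
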
@@ -6,6 +6,26 @@ You write the narrative backbone of the video: **who says what, in what order**.
 
 This step also proposes **phases, milestones, and featured quotes** — the visual curation that structures the timeline. Script + curation happen together because they're the same creative act: deciding what story to tell.
 
+## Two Speech Bubble Systems
+
+The visualization has **two separate text systems** — you write both during this step:
+
+### 1. Script Lines → Narration Lane (the story backbone)
+These are the lines in `data/script.json`. They appear sequentially in the narration lane at the bottom of the screen, get voiced by TTS, and drive the entire pacing of the video. This is the primary creative output.
+
+### 2. Chat Pills → Speech Bubbles Above Agents (color and authenticity)
+These are curated real messages from the channel, saved in `data/retro-page-quotes.json`. They appear as floating speech bubbles above the agent who said them, at the moment they said it. They add color, humor, and authenticity — the audience sees what people actually typed.
+
+**The script tells the story. The chat pills show the texture of the collaboration.**
+
+When writing the script, also volunteer to curate 10-20 standout chat messages as pills. Look for:
+- Emotional beats ("finally!", "everything is broken", "ship it")
+- Humor and personality
+- Key decisions or breakthroughs
+- Messages that capture the vibe of a moment
+
+The script is primary. Chat pills are secondary enrichment — they make the visualization feel alive.
+
 ## Files
 
 | File | Read/Write | Description |
@@ -39,15 +39,30 @@ audio file: audio/lead-03.mp3
 
 ## Workflow
 
-### 1.
+### 1. List Available Voices from the API
 
-Before
+Before presenting anything to the human, fetch the available voices:
 
-
-
--
+```bash
+curl -s "https://api.elevenlabs.io/v1/voices" \
+-H "xi-api-key: $ELEVENLABS_API_KEY" | jq '.voices[] | {voice_id, name, labels}'
+```
+
+This returns all voices available to the account (default library + any cloned voices). Note the `voice_id`, `name`, and `labels` (accent, age, gender, use case) for each.
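
The same request can be made from Node instead of `curl` and `jq`. The sketch below assumes Node 18+ (for the global `fetch`) and assumes the response has the shape the `jq` filter above implies: a top-level `voices` array whose entries carry `voice_id`, `name`, and `labels`. `summarizeVoices` and `listVoices` are hypothetical names, and the sample data is invented for illustration.

```javascript
// Mirrors the jq filter `.voices[] | {voice_id, name, labels}`:
// keep only the three fields the agent needs for its suggestions.
// Missing labels become an empty object here (jq would emit null).
function summarizeVoices(response) {
  const voices = Array.isArray(response?.voices) ? response.voices : [];
  return voices.map(({ voice_id, name, labels }) => ({
    voice_id,
    name,
    labels: labels ?? {},
  }));
}

// Hypothetical wrapper around the request shown in the diff above.
// Not called here, because it needs network access and a real API key.
async function listVoices(apiKey = process.env.ELEVENLABS_API_KEY) {
  const res = await fetch("https://api.elevenlabs.io/v1/voices", {
    headers: { "xi-api-key": apiKey },
  });
  if (!res.ok) throw new Error(`GET /v1/voices failed: ${res.status}`);
  return summarizeVoices(await res.json());
}

// Invented sample response; the IDs and labels are placeholders.
const sampleResponse = {
  voices: [
    { voice_id: "voice-id-1", name: "Rachel", labels: { gender: "female" }, extra: "dropped" },
    { voice_id: "voice-id-2", name: "Adam" },
  ],
};
console.log(summarizeVoices(sampleResponse));
```

Keeping the field selection in a pure function means it can be exercised with sample data, without an API key.
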
+
+### 2. Suggest Voice Assignments
+
+Read `data/script.json` to get the speaker list and line counts. Then **proactively suggest a voice for each speaker** based on what's available:
+
+*"Your script has 4 speakers. Here's what I'd suggest:*
+- *NARRATOR (12 lines) → Rachel — warm, clear, documentary feel*
+- *@lead (5 lines) → Adam — professional, authoritative*
+- *@snorre (4 lines) → Antoni — conversational, energetic*
+- *@edge (3 lines) → Bella — thoughtful, measured*
+
+*Want me to generate test samples with these, or swap any voices?"*
 
-
+**Be proactive.** Don't ask "what do you want?" — suggest assignments and let the human approve or tweak. The agent knows the speakers, knows the available voices, and should make the creative connection.
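
The "speaker list and line counts" step can be sketched in a few lines. This assumes the script parses to an array of line objects, each with a `speaker` field. The diff says each line has an id, speaker, text, and style, but the file's top-level layout is not shown, so treat the array shape as an assumption. `countLinesBySpeaker`, `formatSuggestions`, and the voice picks are hypothetical.

```javascript
// Count script lines per speaker, most lines first.
function countLinesBySpeaker(lines) {
  const counts = new Map();
  for (const { speaker } of lines) {
    counts.set(speaker, (counts.get(speaker) ?? 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Render a proposal in the style of the example message above.
// `picks` maps speaker -> suggested voice name, chosen by the agent.
function formatSuggestions(counts, picks) {
  const rows = counts.map(([speaker, n]) => {
    const voice = picks[speaker] ?? "(no suggestion yet)";
    return `- ${speaker} (${n} ${n === 1 ? "line" : "lines"}) → ${voice}`;
  });
  const header = `Your script has ${counts.length} speakers. Here's what I'd suggest:`;
  return [header, ...rows].join("\n");
}

// Illustrative script lines only.
const scriptLines = [
  { id: "narrator-01", speaker: "NARRATOR", text: "...", style: "calm" },
  { id: "lead-01", speaker: "@lead", text: "...", style: "firm" },
  { id: "narrator-02", speaker: "NARRATOR", text: "...", style: "calm" },
];
const proposal = formatSuggestions(countLinesBySpeaker(scriptLines), {
  NARRATOR: "Rachel",
  "@lead": "Adam",
});
console.log(proposal);
```

The voice choice itself stays with the agent. The code only gathers the counts and formats the proposal so the human can approve or tweak it.
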
 
 ### 2. Generate Test Samples First
 