agentvibes-avatars 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,34 @@
+ MIT License
+
+ Copyright (c) 2026 Paul Preibisch
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ ---
+
+ This package loads, at runtime from CDN, third-party libraries it does NOT
+ bundle or redistribute:
+
+ - TalkingHead by met4citizen — MIT License
+ https://github.com/met4citizen/TalkingHead
+ - three.js by mrdoob and contributors — MIT License
+ https://github.com/mrdoob/three.js
+
+ Avatar models are loaded from the TalkingHead repository at runtime and remain
+ the property of their respective authors. See their repositories for terms.
package/README.md ADDED
@@ -0,0 +1,77 @@
+ # agentvibes-avatars
+
+ A **talking-head avatar window** on a galaxy stage that lip-syncs **any TTS audio you POST to it**. Run it, leave the window open, and `POST` audio + text to `/speak` — an avatar speaks it with synchronized lip movement, captions, and a chat log. Multiple sources (projects / remote servers) appear as their own tabs.
+
+ It's a thin, zero-dependency Node HTTP server plus a browser page. The heavy 3D — the [TalkingHead](https://github.com/met4citizen/TalkingHead) library, [three.js](https://github.com/mrdoob/three.js), and the avatar models — is **loaded from CDN at runtime, not bundled**, so this package stays tiny and ships only its own code.
+
+ ## Install
+
+ ```bash
+ npm install -g agentvibes-avatars
+ ```
+
+ ## Run
+
+ ```bash
+ agentvibes-avatars # start the server + open the avatar window
+ agentvibes-avatars --port 4000
+ agentvibes-avatars --view gallery.html
+ ```
+
+ This starts the receiver on `http://localhost:3747` and opens a dedicated Chrome app window (with autoplay enabled so audio plays hands-free). If Chrome isn't found it falls back to your default browser. If a window is already open, it's refreshed in place instead of duplicated.
+
+ > First run needs internet (to fetch the TalkingHead library + three.js from CDN). After that the browser caches them.
+
+ ## Speak to it
+
+ POST JSON to `/speak`. `audioBase64` is a base64-encoded WAV; the other fields drive the labels.
+
+ ```bash
+ curl -X POST http://localhost:3747/speak \
+   -H "Content-Type: application/json" \
+   -d '{
+     "audioBase64": "<base64 WAV>",
+     "text": "Hello from my app.",
+     "voice": "en_US-amy-medium",
+     "project": "my-app",
+     "origin": "local"
+   }'
+ ```
+
+ | field | meaning |
+ |---|---|
+ | `audioBase64` | base64 WAV to play (required for audio) |
+ | `text` | caption + chat bubble + lip-sync timing |
+ | `voice` | logical voice name (drives avatar colour/mapping) |
+ | `project` | session/tab label |
+ | `origin` | source label badge: `local`, `remote`, or a server name |
+
+ The window assigns a consistent avatar per voice, groups messages by `project` into tabs (with unread badges), and lets you click any chat bubble to replay it.
+
+ ## Endpoints
+
+ | method | path | purpose |
+ |---|---|---|
+ | `POST` | `/speak` | play audio + show text |
+ | `GET` | `/has-browser` | `{ "connected": bool }` |
+ | `GET` | `/health` | liveness |
+ | `POST` | `/reload` | refresh open windows |
+
+ ## Security
+
+ The server binds **`127.0.0.1` only**. State-changing endpoints reject requests carrying a non-loopback `Origin` header (blocks CSRF from a stray browser tab); local tools that POST without an `Origin` work normally. Request bodies are size-capped.
+
+ ## How it fits with AgentVibes
+
+ [AgentVibes](https://github.com/paulpreibisch/AgentVibes) forwards its TTS to this window automatically. But the receiver is generic — anything that can synthesize a WAV and POST it can drive the avatars.
+
+ ## Credits
+
+ - [TalkingHead](https://github.com/met4citizen/TalkingHead) by met4citizen — MIT
+ - [three.js](https://github.com/mrdoob/three.js) — MIT
+
+ Loaded from CDN at runtime; not bundled or redistributed by this package. See [LICENSE](./LICENSE).
+
+ ## License
+
+ MIT © Paul Preibisch
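Reviewer sketch (not part of the package): the README's `/speak` contract can also be exercised from Node instead of `curl`. The field names and the default port 3747 come from the README; the helper names `buildSpeakPayload` and `postSpeak` are hypothetical, and the default option values are only placeholders copied from the README's example. No `Origin` header is sent, which the README says local tools are allowed to omit.

```javascript
'use strict';
const http = require('http');

// Hypothetical helper: build the JSON body described in the README's field table.
function buildSpeakPayload(wavBuffer, opts = {}) {
  const { text = '', voice = 'en_US-amy-medium', project = 'my-app', origin = 'local' } = opts;
  return JSON.stringify({
    audioBase64: Buffer.from(wavBuffer).toString('base64'),
    text,
    voice,
    project,
    origin,
  });
}

// Hypothetical helper: POST the body to the receiver and resolve with the HTTP status code.
function postSpeak(body, { host = '127.0.0.1', port = 3747 } = {}) {
  return new Promise((resolve, reject) => {
    const req = http.request({
      host,
      port,
      path: '/speak',
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(body),
      },
    }, res => {
      res.resume(); // drain the response so the socket is released
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end(body);
  });
}
```

A caller would read a WAV file into a buffer, pass it to `buildSpeakPayload` along with the caption text, and hand the result to `postSpeak`. Keeping the payload builder separate from the network call makes the body easy to inspect without a running server.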
package/avatars.json ADDED
@@ -0,0 +1,35 @@
+ {
+   "_comment": "The original models.readyplayer.me URLs no longer resolve (NXDOMAIN), so all entries point to the TalkingHead demo avatar 'brunette.glb' served via jsDelivr — it has ARKit+Oculus visemes baked in for lip sync. Replace any entry's url with your own GLB (e.g. a fresh ReadyPlayerMe export with ?morphTargets=ARKit,Oculus+Visemes) to get distinct avatars per gender.",
+   "avatars": {
+     "female-1": {
+       "url": "https://cdn.jsdelivr.net/gh/met4citizen/TalkingHead@1.3/avatars/brunette.glb",
+       "body": "F",
+       "label": "Female 1"
+     },
+     "female-2": {
+       "url": "https://cdn.jsdelivr.net/gh/met4citizen/TalkingHead@1.3/avatars/brunette.glb",
+       "body": "F",
+       "label": "Female 2"
+     },
+     "male-1": {
+       "url": "https://cdn.jsdelivr.net/gh/met4citizen/TalkingHead@1.3/avatars/brunette.glb",
+       "body": "M",
+       "label": "Male 1"
+     },
+     "male-2": {
+       "url": "https://cdn.jsdelivr.net/gh/met4citizen/TalkingHead@1.3/avatars/brunette.glb",
+       "body": "M",
+       "label": "Male 2"
+     },
+     "neutral": {
+       "url": "https://cdn.jsdelivr.net/gh/met4citizen/TalkingHead@1.3/avatars/brunette.glb",
+       "body": "F",
+       "label": "Neutral"
+     }
+   },
+   "genderOrder": {
+     "female": ["female-1", "female-2", "neutral"],
+     "male": ["male-1", "male-2", "neutral"],
+     "neutral": ["neutral", "female-1", "male-1"]
+   }
+ }
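Reviewer sketch (not part of the package): the README says the window assigns a consistent avatar per voice, and `genderOrder` lists candidate avatar keys per gender. The page code that does the mapping is not in this diff, so the following is only one plausible reading, assuming a stable string hash over the voice name selects an entry from the gender's list. `hashString` and `pickAvatar` are hypothetical names.

```javascript
'use strict';

// The genderOrder table copied from avatars.json.
const genderOrder = {
  female: ['female-1', 'female-2', 'neutral'],
  male: ['male-1', 'male-2', 'neutral'],
  neutral: ['neutral', 'female-1', 'male-1'],
};

// Hypothetical: small deterministic string hash (djb2-style), kept in the unsigned 32-bit range.
function hashString(s) {
  let h = 5381;
  for (let i = 0; i < s.length; i++) {
    h = ((h * 33) ^ s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Hypothetical: the same voice always maps to the same avatar key.
// Unknown genders fall back to the neutral list.
function pickAvatar(voice, gender = 'neutral') {
  const known = Object.prototype.hasOwnProperty.call(genderOrder, gender);
  const candidates = known ? genderOrder[gender] : genderOrder.neutral;
  return candidates[hashString(voice) % candidates.length];
}
```

Because every entry in `avatars.json` currently points at the same `brunette.glb`, any mapping of this kind only becomes visible once distinct GLB URLs are configured.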
package/bin/cli.js ADDED
@@ -0,0 +1,109 @@
+ #!/usr/bin/env node
+ 'use strict';
+ //
+ // agentvibes-avatars — start the receiver server and open the avatar window.
+ //
+ // Usage:
+ // agentvibes-avatars # start server + open the avatar window
+ // agentvibes-avatars --port 4000 # custom port
+ // agentvibes-avatars --view gallery.html
+ //
+ // The window speaks any TTS audio POSTed to http://localhost:<port>/speak
+ // { audioBase64, text, voice, project, origin }
+ //
+ // The TalkingHead library + three.js + avatars load from CDN at runtime — nothing
+ // is bundled here. First run needs internet; after that the page is cached by the
+ // browser.
+ //
+
+ const { spawn } = require('child_process');
+ const http = require('http');
+ const path = require('path');
+ const os = require('os');
+ const fs = require('fs');
+
+ const argv = process.argv.slice(2);
+ function flag(name, def) { const i = argv.indexOf(name); return (i >= 0 && argv[i + 1]) ? argv[i + 1] : def; }
+
+ const PORT = parseInt(process.env.AVATARS_PORT || flag('--port', '3747'), 10);
+ const VIEW = flag('--view', 'cosmic.html');
+ const BASE = `http://localhost:${PORT}`;
+
+ function get(p) {
+   return new Promise(resolve => {
+     const req = http.get(`${BASE}${p}`, res => { let b = ''; res.on('data', c => b += c); res.on('end', () => resolve({ status: res.statusCode, body: b })); });
+     req.on('error', () => resolve(null));
+     req.setTimeout(1500, () => { req.destroy(); resolve(null); });
+   });
+ }
+ function post(p) {
+   return new Promise(resolve => {
+     const req = http.request(`${BASE}${p}`, { method: 'POST' }, res => { res.on('data', () => {}); res.on('end', () => resolve(true)); });
+     req.on('error', () => resolve(false));
+     req.setTimeout(1500, () => { req.destroy(); resolve(false); });
+     req.end();
+   });
+ }
+
+ function findChrome() {
+   const c = [];
+   if (process.platform === 'win32') {
+     c.push(`${process.env['PROGRAMFILES']}\\Google\\Chrome\\Application\\chrome.exe`,
+       `${process.env['PROGRAMFILES(X86)']}\\Google\\Chrome\\Application\\chrome.exe`,
+       `${process.env['LOCALAPPDATA']}\\Google\\Chrome\\Application\\chrome.exe`);
+   } else if (process.platform === 'darwin') {
+     c.push('/Applications/Google Chrome.app/Contents/MacOS/Google Chrome');
+   } else {
+     c.push('/usr/bin/google-chrome', '/usr/bin/google-chrome-stable', '/usr/bin/chromium', '/usr/bin/chromium-browser', '/snap/bin/chromium');
+   }
+   return c.find(p => { try { return p && fs.existsSync(p); } catch { return false; } });
+ }
+
+ function openWindow(url) {
+   const chrome = findChrome();
+   if (chrome) {
+     // Dedicated app window with autoplay enabled so audio plays hands-free.
+     const profile = path.join(os.homedir(), '.agentvibes-avatars', 'chrome-profile');
+     spawn(chrome, [`--app=${url}`, '--autoplay-policy=no-user-gesture-required',
+       `--user-data-dir=${profile}`, '--no-first-run', '--no-default-browser-check'],
+       { detached: true, stdio: 'ignore' }).unref();
+     return 'chrome app window';
+   }
+   // Fallback: the OS default browser (autoplay may require one click).
+   const opener = process.platform === 'win32' ? ['cmd', ['/c', 'start', '', url]]
+     : process.platform === 'darwin' ? ['open', [url]]
+     : ['xdg-open', [url]];
+   spawn(opener[0], opener[1], { detached: true, stdio: 'ignore' }).unref();
+   return 'default browser';
+ }
+
+ (async () => {
+   // 1. Ensure the receiver server is running.
+   let health = await get('/health');
+   if (!health) {
+     const serverPath = path.join(__dirname, '..', 'server.js');
+     const child = spawn(process.execPath, [serverPath], {
+       detached: true, stdio: 'ignore',
+       env: { ...process.env, TALKING_HEAD_PORT: String(PORT) },
+     });
+     child.unref();
+     for (let i = 0; i < 25 && !health; i++) { await new Promise(r => setTimeout(r, 150)); health = await get('/health'); }
+     if (!health) { console.error(`agentvibes-avatars: server failed to start on ${BASE}`); process.exit(1); }
+     console.log(`agentvibes-avatars: server started on ${BASE}`);
+   } else {
+     console.log(`agentvibes-avatars: server already running on ${BASE}`);
+   }
+
+   // 2. Idempotent: if a window is already connected, refresh it instead of duplicating.
+   const hb = await get('/has-browser');
+   if (hb && /"connected":true/.test(hb.body)) {
+     await post('/reload');
+     console.log('agentvibes-avatars: existing window refreshed (no duplicate).');
+     return;
+   }
+
+   // 3. Open the window.
+   const how = openWindow(`${BASE}/${VIEW}`);
+   console.log(`agentvibes-avatars: opened ${BASE}/${VIEW} via ${how}`);
+   console.log(`POST TTS audio to ${BASE}/speak { audioBase64, text, voice, project, origin }`);
+ })();
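Reviewer sketch (not part of the package): step 2 of the CLI tests the `/has-browser` body with the regex `/"connected":true/`, which only matches compact JSON. A body spaced like the README's `{ "connected": bool }` would not match, and the CLI would then open a duplicate window. Since `server.js` is not in this diff, the actual response format is unknown; assuming it is a JSON object with a boolean `connected` field, a more tolerant check could parse it. `isConnected` is a hypothetical name.

```javascript
'use strict';

// Hypothetical helper: true only when the body is a JSON object with connected === true.
function isConnected(body) {
  try {
    const parsed = JSON.parse(body);
    return parsed !== null && typeof parsed === 'object' && parsed.connected === true;
  } catch {
    return false; // not valid JSON (or an empty body): treat as "no window connected"
  }
}
```

In the CLI this would stand in for the regex test, for example `if (hb && isConnected(hb.body))`, with the rest of the step unchanged.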
package/package.json ADDED
@@ -0,0 +1,40 @@
+ {
+   "name": "agentvibes-avatars",
+   "version": "0.1.0",
+   "description": "A talking-head avatar window on a galaxy stage that lip-syncs any TTS audio POSTed to it. The met4citizen TalkingHead library and three.js are loaded from CDN at runtime (not bundled).",
+   "bin": {
+     "agentvibes-avatars": "bin/cli.js"
+   },
+   "main": "server.js",
+   "scripts": {
+     "start": "node server.js"
+   },
+   "files": [
+     "bin/",
+     "server.js",
+     "avatars.json",
+     "public/*.html",
+     "README.md",
+     "LICENSE"
+   ],
+   "engines": {
+     "node": ">=16"
+   },
+   "keywords": [
+     "avatar",
+     "talking-head",
+     "tts",
+     "text-to-speech",
+     "lip-sync",
+     "three.js",
+     "agentvibes"
+   ],
+   "author": "Paul Preibisch",
+   "license": "MIT",
+   "repository": {
+     "type": "git",
+     "url": "git+https://github.com/paulpreibisch/AgentVibes.git",
+     "directory": "agentvibes-avatars"
+   },
+   "homepage": "https://github.com/paulpreibisch/AgentVibes#readme"
+ }