talking-head-studio 0.4.9 → 0.4.11

Files changed (69)
  1. package/README.md +227 -351
  2. package/dist/TalkingHead.d.ts +16 -25
  3. package/dist/TalkingHead.web.d.ts +6 -0
  4. package/dist/TalkingHead.web.js +18 -8
  5. package/dist/api/studioApi.js +25 -26
  6. package/dist/appearance/apply.js +2 -3
  7. package/dist/appearance/matchers.js +1 -2
  8. package/dist/appearance/schema.js +1 -2
  9. package/dist/core/avatar/backend.d.ts +130 -0
  10. package/dist/core/avatar/backend.js +4 -0
  11. package/dist/core/avatar/backends/gaussian.d.ts +49 -0
  12. package/dist/core/avatar/backends/gaussian.js +291 -0
  13. package/dist/core/avatar/backends/index.d.ts +3 -0
  14. package/dist/core/avatar/backends/index.js +7 -0
  15. package/dist/core/avatar/backends/morphTarget.d.ts +39 -0
  16. package/dist/core/avatar/backends/morphTarget.js +179 -0
  17. package/dist/core/avatar/faceControls.d.ts +40 -0
  18. package/dist/core/avatar/faceControls.js +138 -0
  19. package/dist/core/avatar/schema.d.ts +50 -0
  20. package/dist/core/avatar/schema.js +134 -0
  21. package/dist/core/avatar/visemes.d.ts +64 -0
  22. package/dist/core/avatar/visemes.js +72 -0
  23. package/dist/editor/AvatarCanvas.js +1 -2
  24. package/dist/editor/AvatarEditor.native.js +18 -9
  25. package/dist/editor/AvatarModel.js +1 -2
  26. package/dist/editor/FaceSqueezeEditor.js +19 -9
  27. package/dist/editor/FaceSqueezeEditor.web.js +2 -2
  28. package/dist/editor/RigidAccessory.js +1 -2
  29. package/dist/editor/SkinnedClothing.js +18 -9
  30. package/dist/editor/boneSnap.js +22 -12
  31. package/dist/editor/studioTheme.js +2 -2
  32. package/dist/html.js +1 -2
  33. package/dist/index.d.ts +15 -1
  34. package/dist/index.js +28 -5
  35. package/dist/platform/api/types.d.ts +10 -0
  36. package/dist/platform/api/types.js +2 -0
  37. package/dist/platform/marketplace/types.d.ts +32 -0
  38. package/dist/platform/marketplace/types.js +2 -0
  39. package/dist/platform/sdk/unity.d.ts +27 -0
  40. package/dist/platform/sdk/unity.js +2 -0
  41. package/dist/platform/sdk/unreal.d.ts +23 -0
  42. package/dist/platform/sdk/unreal.js +2 -0
  43. package/dist/platform/sdk/web.d.ts +16 -0
  44. package/dist/platform/sdk/web.js +2 -0
  45. package/dist/sketchfab/api.js +4 -5
  46. package/dist/sketchfab/useSketchfabSearch.js +1 -2
  47. package/dist/tts/useDirectVisemeStream.d.ts +2 -6
  48. package/dist/tts/useDirectVisemeStream.js +1 -2
  49. package/dist/tts/useMotionMarkers.d.ts +0 -1
  50. package/dist/tts/useMotionMarkers.js +1 -2
  51. package/dist/utils/avatarUtils.js +2 -3
  52. package/dist/utils/faceLandmarkerToShapeWeights.js +19 -10
  53. package/dist/voice/convertToWav.js +1 -2
  54. package/dist/voice/index.d.ts +3 -0
  55. package/dist/voice/index.js +6 -1
  56. package/dist/voice/useAudioPlayer.js +1 -2
  57. package/dist/voice/useAudioRecording.js +1 -2
  58. package/dist/voice/useFaceControls.d.ts +14 -0
  59. package/dist/voice/useFaceControls.js +81 -0
  60. package/dist/voice/useVoicePreview.d.ts +7 -0
  61. package/dist/voice/useVoicePreview.js +81 -0
  62. package/dist/wardrobe/index.d.ts +2 -0
  63. package/dist/wardrobe/index.js +3 -1
  64. package/dist/wardrobe/useAvatarWardrobeHydration.js +1 -2
  65. package/dist/wardrobe/useStudioAvatar.d.ts +29 -0
  66. package/dist/wardrobe/useStudioAvatar.js +177 -0
  67. package/dist/wgpu/WgpuAvatar.js +17 -7
  68. package/dist/wgpu/useAuthedModelUri.js +18 -9
  69. package/package.json +8 -4
package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # talking-head-studio
2
2
 
3
- **The missing UI layer for AI Agents. Drop-in, lip-syncing 3D avatars for Web, React, and React Native.**
3
+ **Open-source avatar platform for Web, React Native, Unity, and Unreal. Any GLB model. Full lip-sync with or without blend shapes.**
4
4
 
5
5
  [![npm version](https://img.shields.io/npm/v/talking-head-studio.svg)](https://www.npmjs.com/package/talking-head-studio)
6
6
  [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
@@ -8,56 +8,64 @@
8
8
 
9
9
  ---
10
10
 
11
- ## Why this?
11
+ ## What this is
12
12
 
13
- - **Zero-Jank React Native & Web:** True cross-platform rendering. React Native gets a blazing fast wgpu-accelerated native render loop, skipping WebView bridge latency entirely. React on web gets a robust `react-three-fiber` setup. Same API, same props.
14
- - **Universal GLB Compatibility:** Bring any GLB. Out-of-the-box support for standard ARKit blendshapes. Rigged models get full phoneme-based lip-sync. Non-rigged models get an amplitude-driven jaw animation fallback.
15
- - **Built for AI & Voice Pipelines:** Wire `sendAmplitude` or visemes directly to LiveKit, Web Audio, ElevenLabs, OpenAI Realtime, or any audio source.
16
- - **Always Alive:** Procedural idle animations (breathing, nodding, swaying) keep your avatar from feeling like a static doll.
17
- - **Dynamic Wardrobe & Accessories:** Swap hair, skin, and eye colors on the fly. Attach hats, glasses, or backpacks to any bone at runtime.
13
+ A drop-in avatar runtime and platform SDK built to be a self-hostable replacement for Ready Player Me. The core problem it solves: **any arbitrary 3D model should be able to talk, emote, and respond to a voice pipeline** regardless of whether the artist baked in blend shapes, visemes, or any face rig at all.
14
+
15
+ The library ships a renderer (web iframe + React Native wgpu), a backend-agnostic face control contract, and a growing set of adapters that map TTS/audio/AI output onto whatever rendering mechanism the model actually supports.
18
16
 
19
17
  ---
20
18
 
21
- ## Table of Contents
22
-
23
- - [Installation](#installation)
24
- - [Quick Start](#quick-start)
25
- - [Subpath Exports](#subpath-exports)
26
- - [Props](#props)
27
- - [Ref API](#ref-api)
28
- - [Accessories](#accessories)
29
- - [Color Customization](#color-customization)
30
- - [Voice Pipeline Integration](#voice-pipeline-integration)
31
- - [GLB Compatibility](#glb-compatibility)
32
- - [Plain React / Next.js](#plain-react--nextjs)
33
- - [MotionEngine (Upcoming)](#motionengine-upcoming)
34
- - [Contributing](#contributing)
35
- - [Credits](#credits)
36
- - [License](#license)
19
+ ## Lip-sync tiers (any model works)
37
20
 
38
- ---
21
+ | Model type | Lip-sync method | Quality |
22
+ |---|---|---|
23
+ | GLB with Oculus viseme morphs | Direct morph drive via `MorphTargetBackend` | Excellent |
24
+ | GLB with ARKit blend shapes | `remapArkitToOculus()` → morph drive | Good |
25
+ | GLB with only `jawOpen` / `mouthOpen` | Amplitude fallback | Acceptable |
26
+ | Any other GLB | Gaussian splat backend *(roadmap)* | Excellent |
39
27
 
40
- ## Installation
28
+ The last row is the goal: **scan any model into a Gaussian representation, generate per-viseme deltas via FLAME-based transfer, and drive it from the same `FaceControl` contract everything else uses.** No blend shapes required. No artist work required.
41
29
 
42
- ### React Native / Expo
30
+ ---
31
+
32
+ ## Architecture
43
33
 
44
- ```bash
45
- npm install talking-head-studio react-native-webview
34
+ ```
35
+ TTS / audio / face tracking
36
+
37
+ AgentVisemePayload ← canonical wire format for lip-sync schedules
38
+
39
+ FaceControl ← pose (HeadPose) + expression (ExpressionState) + gaze (EyeGaze)
40
+
41
+ AvatarBackend ←────────────── swap without changing anything upstream
42
+ ├── MorphTargetBackend ← Three.js morph targets (GLB with blend shapes)
43
+ ├── GaussianBackend ← [roadmap] Gaussian splat + FLAME delta transfer
44
+ └── (your backend) ← implement AvatarBackend, plug in
45
+
46
+ Renderer
47
+ ├── Web iframe ← TalkingHead.web.tsx (any React app)
48
+ ├── React Native wgpu ← WgpuAvatar (native GPU, no WebView latency)
49
+ └── Unity / Unreal ← [roadmap] SDK plugins consuming same contracts
46
50
  ```
47
51
 
48
- `react-native-webview` is a peer dependency. If you are using Expo, it is available as a built-in package.
52
+ Everything above `AvatarBackend` is renderer-agnostic. Everything above `FaceControl` is model-agnostic.
49
53
 
50
- ### Web only (React, Next.js, Vite)
54
+ ---
55
+
56
+ ## Installation
51
57
 
52
58
  ```bash
59
+ # React Native / Expo
60
+ npm install talking-head-studio react-native-webview
61
+
62
+ # Web (React, Next.js, Vite)
53
63
  npm install talking-head-studio
54
64
  ```
55
65
 
56
- No `react-native` or WebView runtime dependency needed. The package ships a web entry point that renders via `<iframe srcdoc>` automatically when bundled for browser targets.
57
-
58
66
  ---
59
67
 
60
- ## Quick Start
68
+ ## Quick start
61
69
 
62
70
  ```tsx
63
71
  import { useRef } from 'react';
@@ -74,19 +82,16 @@ export default function Avatar() {
74
82
  cameraView="upper"
75
83
  hairColor="#1a1a2e"
76
84
  skinColor="#e0a370"
77
- accessories={[
78
- {
79
- id: 'sunglasses',
80
- url: 'https://example.com/sunglasses.glb',
81
- bone: 'Head',
82
- position: [0, 0.08, 0.12],
83
- rotation: [0, 0, 0],
84
- scale: 1.0,
85
- },
86
- ]}
85
+ accessories={[{
86
+ id: 'sunglasses',
87
+ url: 'https://example.com/sunglasses.glb',
88
+ bone: 'Head',
89
+ position: [0, 0.08, 0.12],
90
+ rotation: [0, 0, 0],
91
+ scale: 1.0,
92
+ }]}
87
93
  style={{ width: 400, height: 600 }}
88
- onReady={() => console.log('Avatar loaded')}
89
- onError={(msg) => console.error('Load failed:', msg)}
94
+ onReady={() => console.log('ready')}
90
95
  />
91
96
  );
92
97
  }
@@ -94,372 +99,243 @@ export default function Avatar() {
94
99
 
95
100
  ---
96
101
 
97
- ## Subpath Exports
98
-
99
- The package ships five independent entry points. Import only what you need — each subpath has its own optional peer dependencies.
102
+ ## FaceControl — the core contract
100
103
 
101
- ### `talking-head-studio` Live talking avatar
102
- ```tsx
103
- import { TalkingHead } from 'talking-head-studio';
104
- // Peer deps: react
105
- // Native-only peers: react-native (optional), react-native-webview (optional)
106
- ```
104
+ The `FaceControl` type is the single value that flows between your voice pipeline and any avatar backend. If you're building a custom backend or integrating with a game engine, this is what you implement against.
107
105
 
108
- ### `talking-head-studio/editor` — 3D editor with gizmo (web)
109
- R3F-based canvas with PivotControls gizmo for placing accessories on an avatar. Web only.
110
- ```tsx
111
- import { AvatarCanvas } from 'talking-head-studio/editor';
112
- // Peer deps: @react-three/fiber, @react-three/drei, three
106
+ ```ts
107
+ import type { FaceControl, ExpressionState, HeadPose, EyeGaze } from 'talking-head-studio';
108
+
109
+ type HeadPose = {
110
+ yaw: number; // -1..1, left..right
111
+ pitch: number; // -1..1, down..up
112
+ roll: number; // -1..1, tilt
113
+ };
114
+
115
+ type EyeGaze = {
116
+ x: number; // -1..1, left..right
117
+ y: number; // -1..1, down..up
118
+ };
119
+
120
+ type ExpressionState = {
121
+ jawOpen: number; // 0..1
122
+ mouthSmile: number;
123
+ mouthFunnel: number;
124
+ mouthPucker: number;
125
+ mouthWide: number;
126
+ upperLipRaise: number;
127
+ lowerLipDepress: number;
128
+ cheekRaise: number;
129
+ blinkLeft: number;
130
+ blinkRight: number;
131
+ browInnerUp: number;
132
+ browDownLeft: number;
133
+ browDownRight: number;
134
+ eyeGazeLeft: EyeGaze;
135
+ eyeGazeRight: EyeGaze;
136
+ };
113
137
  ```
114
138
 
115
- ### `talking-head-studio/appearance` Material color system
116
- Apply skin/hair/eye colors to any GLB avatar. Works in both the live view and the 3D editor.
117
- ```tsx
118
- import { applyAppearanceToObject3D, type AvatarAppearance } from 'talking-head-studio/appearance';
119
- // No extra peer deps
120
- ```
139
+ ### Driving FaceControl from a viseme schedule
121
140
 
122
- ### `talking-head-studio/voice` — Audio recording hooks
123
- Headless hooks for recording voice samples (WebM→WAV conversion included). Backend-agnostic — send audio wherever you want (Qwen3-TTS, ElevenLabs, Groq, etc).
124
- ```tsx
125
- import { useAudioRecording, useAudioPlayer } from 'talking-head-studio/voice';
126
- // No extra peer deps (browser APIs only)
127
- ```
141
+ ```ts
142
+ import { useFaceControlsFromVisemes } from 'talking-head-studio';
128
143
 
129
- ### `talking-head-studio/sketchfab` Sketchfab search & download
130
- Headless hooks and utilities for searching and downloading GLB models from Sketchfab. Bring your own UI and API key.
131
- ```tsx
132
- import { useSketchfabSearch, ACCESSORY_CATEGORIES, downloadModel } from 'talking-head-studio/sketchfab';
133
- // No extra peer deps
144
+ // schedule: AgentVisemePayload from your TTS backend
145
+ const faceControl = useFaceControlsFromVisemes(schedule);
146
+ // → { pose: { yaw:0, pitch:0, roll:0 }, expr: { jawOpen: 0.7, ... } }
134
147
  ```
135
148
 
136
- ---
137
-
138
- ## Props
139
-
140
- | Prop | Type | Default | Description |
141
- |------|------|---------|-------------|
142
- | `avatarUrl` | `string` | **required** | URL to any `.glb` model. Rigged or non-rigged. |
143
- | `authToken` | `string \| null` | `null` | Bearer token sent when fetching the model URL. CDN URLs are excluded automatically. |
144
- | `mood` | `TalkingHeadMood` | `'neutral'` | Avatar expression. See [Moods](#moods) below. |
145
- | `cameraView` | `'head' \| 'upper' \| 'full'` | `'upper'` | Camera framing preset. |
146
- | `cameraDistance` | `number` | `-0.5` | Camera zoom offset. Negative values zoom in. |
147
- | `hairColor` | `string` | -- | CSS color applied to materials whose name contains `hair` or `fur`. |
148
- | `skinColor` | `string` | -- | CSS color applied to materials whose name contains `skin`, `body`, or `face`. |
149
- | `eyeColor` | `string` | -- | CSS color applied to materials whose name contains `eye` or `iris`. |
150
- | `accessories` | `TalkingHeadAccessory[]` | `[]` | Array of GLB items to attach to bones. See [Accessories](#accessories). |
151
- | `onReady` | `() => void` | -- | Fires once the avatar and scene are fully loaded. |
152
- | `onError` | `(message: string) => void` | -- | Fires on load failure. |
153
- | `style` | `ViewStyle` | -- | Container style (works on both native and web). |
154
-
155
- ### Moods
156
-
157
- The `mood` prop accepts one of:
149
+ ### Implementing a custom backend
158
150
 
159
- ```
160
- neutral | happy | sad | angry | excited | thinking | concerned | surprised
151
+ ```ts
152
+ import type { AvatarBackend, AvatarRenderTarget, FaceControl } from 'talking-head-studio';
153
+
154
+ class MyGaussianBackend implements AvatarBackend {
155
+ initialize() { /* load splat data, FLAME weights */ }
156
+ attach(target: AvatarRenderTarget) { /* bind to canvas/surface */ }
157
+ setControl(control: FaceControl) { /* map ExpressionState → splat coefficients */ }
158
+ renderFrame() { /* rasterize */ }
159
+ dispose() { /* cleanup */ }
160
+ }
161
161
  ```
162
162
 
163
- Mood can be changed at any time via props or the ref API. On rigged models, mood maps to blend shape expressions. On non-rigged models, mood is a no-op.
164
-
165
163
  ---
166
164
 
167
- ## Ref API
168
-
169
- Access runtime controls through a React ref. Every method is safe to call at any time -- calls made before the avatar is ready are silently dropped.
165
+ ## MorphTargetBackend — Three.js GLB adapter
170
166
 
171
- ```tsx
172
- const ref = useRef<TalkingHeadRef>(null);
167
+ The first concrete `AvatarBackend` implementation. Give it any loaded Three.js scene and it will find morph targets, build a lookup cache, and drive them from `FaceControl`.
173
168
 
174
- // Drive lip-sync from an audio amplitude value (0..1)
175
- ref.current?.sendAmplitude(0.7);
176
-
177
- // Change expression
178
- ref.current?.setMood('excited');
179
-
180
- // Change colors at runtime
181
- ref.current?.setHairColor('#ff0000');
182
- ref.current?.setSkinColor('#8d5524');
183
- ref.current?.setEyeColor('#2e86de');
184
-
185
- // Swap accessories without re-mounting the component
186
- ref.current?.setAccessories([
187
- {
188
- id: 'crown',
189
- url: 'https://example.com/crown.glb',
190
- bone: 'Head',
191
- position: [0, 0.22, 0],
192
- rotation: [0, 0, 0],
193
- scale: 0.8,
169
+ ```ts
170
+ import * as THREE from 'three';
171
+ import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader';
172
+ import { MorphTargetBackend } from 'talking-head-studio';
173
+
174
+ const loader = new GLTFLoader();
175
+ const gltf = await loader.loadAsync('/avatar.glb');
176
+
177
+ const backend = new MorphTargetBackend(gltf.scene, {
178
+ mood: 'neutral',
179
+ expressionScale: 1.0,
180
+ calibration: {
181
+ neutral: { pose: { yaw: 0, pitch: 0, roll: 0 }, expr: createNeutralExpression() },
182
+ ranges: { jawOpen: { min: 0, max: 0.85 } }, // clamp jaw for this model
183
+ gazeLimits: { x: { min: -0.6, max: 0.6 } },
194
184
  },
195
- ]);
196
- ```
185
+ });
197
186
 
198
- ### Ref Methods
187
+ // Each frame:
188
+ backend.setControl(faceControl);
189
+ backend.renderFrame();
199
190
 
200
- | Method | Signature | Description |
201
- |--------|-----------|-------------|
202
- | `sendAmplitude` | `(amplitude: number) => void` | Feed audio amplitude (0 to 1) for jaw animation. |
203
- | `setMood` | `(mood: TalkingHeadMood) => void` | Change avatar expression at runtime. |
204
- | `setHairColor` | `(color: string) => void` | Update hair material color. |
205
- | `setSkinColor` | `(color: string) => void` | Update skin material color. |
206
- | `setEyeColor` | `(color: string) => void` | Update eye/iris material color. |
207
- | `setAccessories` | `(accessories: TalkingHeadAccessory[]) => void` | Replace the entire accessory set. Handles loading, diffing, and cleanup automatically. |
191
+ // Debug: what morphs does this model actually have?
192
+ console.log(backend.availableChannels);
193
+ // { visemes: ['aa','PP','oh',...], expressions: ['jawOpen','blinkLeft',...], gaze: ['lookLeft','lookUp'] }
194
+ ```
208
195
 
209
196
  ---
210
197
 
211
- ## Accessories
212
-
213
- Attach any GLB model to any bone on the avatar skeleton. The system handles loading, disposal, and transform updates.
198
+ ## ARKit → Oculus remap
214
199
 
215
- ### Accessory shape
200
+ Models with ARKit blend shapes (52 facial action units) but no Oculus viseme morphs can be remapped analytically — no ML, no FLAME, no artist work.
216
201
 
217
202
  ```ts
218
- interface TalkingHeadAccessory {
219
- id: string; // Unique identifier for diffing
220
- url: string; // URL to a .glb file
221
- bone: string; // Target bone name (e.g. "Head", "RightHand", "Spine")
222
- position: [number, number, number]; // Offset from the bone origin
223
- rotation: [number, number, number]; // Euler rotation in radians
224
- scale: number; // Uniform scale factor
225
- }
203
+ import { remapArkitToOculus, getArkitWeightsForViseme } from 'talking-head-studio';
204
+
205
+ // Runtime: face tracking data → Oculus viseme weights
206
+ const oculusWeights = remapArkitToOculus({
207
+ jawOpen: 0.7,
208
+ mouthLowerDownLeft: 0.4,
209
+ mouthLowerDownRight: 0.4,
210
+ });
211
+ // → { aa: 0.68, PP: 0.03, oh: 0.12, ... }
212
+
213
+ // Bake-time: get the ARKit recipe for a specific viseme
214
+ const recipe = getArkitWeightsForViseme('ou');
215
+ // → { mouthPucker: 0.9, mouthRollLower: 0.3 }
226
216
  ```
227
217
 
228
- ### Example: hat + glasses + backpack
229
-
230
- ```tsx
231
- <TalkingHead
232
- avatarUrl="https://example.com/avatar.glb"
233
- accessories={[
234
- {
235
- id: 'cowboy-hat',
236
- url: '/models/cowboy-hat.glb',
237
- bone: 'Head',
238
- position: [0, 0.18, 0],
239
- rotation: [0, 0, 0],
240
- scale: 1.2,
241
- },
242
- {
243
- id: 'aviators',
244
- url: '/models/aviator-glasses.glb',
245
- bone: 'Head',
246
- position: [0, 0.06, 0.11],
247
- rotation: [0, 0, 0],
248
- scale: 1.0,
249
- },
250
- {
251
- id: 'backpack',
252
- url: '/models/backpack.glb',
253
- bone: 'Spine1',
254
- position: [0, 0, -0.15],
255
- rotation: [0, Math.PI, 0],
256
- scale: 0.9,
257
- },
258
- ]}
259
- />
260
- ```
261
-
262
- ### Common bone names
263
-
264
- Mixamo-rigged models typically expose these bones:
265
-
266
- ```
267
- Head, Neck, Spine, Spine1, Spine2,
268
- LeftShoulder, LeftArm, LeftForeArm, LeftHand,
269
- RightShoulder, RightArm, RightForeArm, RightHand,
270
- LeftUpLeg, LeftLeg, LeftFoot,
271
- RightUpLeg, RightLeg, RightFoot
272
- ```
273
-
274
- Bone matching is flexible -- if an exact match is not found, the component tries a prefix match (useful for Sketchfab exports like `Head_5`). If no bone matches, the accessory falls back to the scene root.
275
-
276
- ### Runtime accessory swaps
277
-
278
- ```tsx
279
- // Remove all accessories
280
- ref.current?.setAccessories([]);
281
-
282
- // Swap glasses for a monocle
283
- ref.current?.setAccessories([
284
- { id: 'monocle', url: '/models/monocle.glb', bone: 'Head', position: [0.03, 0.07, 0.11], rotation: [0, 0, 0], scale: 0.6 },
285
- ]);
286
- ```
287
-
288
- Accessories that were previously loaded but are absent from the new array are automatically disposed (geometry, materials, textures).
218
+ The full `ARKIT_TO_OCULUS` coefficient table is exported so you can build your own bake pipeline.
289
219
 
290
220
  ---
291
221
 
292
- ## Color Customization
293
-
294
- Colors can be set via props (applied on initial load) or via the ref API (applied at runtime without reloading the model).
222
+ ## TalkingHead component — props & ref
295
223
 
296
- The system matches material names against known keywords:
224
+ ### Props
297
225
 
298
- | Target | Material name keywords |
299
- |--------|----------------------|
300
- | Hair | `hair`, `fur` |
301
- | Skin | `skin`, `body`, `face` |
302
- | Eyes | `eye`, `iris` |
303
-
304
- ```tsx
305
- // Via props
306
- <TalkingHead hairColor="#2d1b00" skinColor="#f0c8a0" eyeColor="#3d6b4f" />
226
+ | Prop | Type | Default | Description |
227
+ |------|------|---------|-------------|
228
+ | `avatarUrl` | `string` | required | Any `.glb`. Rigged or not. |
229
+ | `authToken` | `string \| null` | `null` | Bearer token for authenticated GLB URLs. |
230
+ | `mood` | `TalkingHeadMood` | `'neutral'` | `neutral \| happy \| sad \| angry \| excited \| thinking \| concerned \| surprised` |
231
+ | `cameraView` | `'head' \| 'upper' \| 'full'` | `'upper'` | Framing preset. |
232
+ | `cameraDistance` | `number` | `-0.5` | Zoom offset. Negative = closer. |
233
+ | `hairColor` | `string` | — | Hex color. Applied to materials named `hair`, `fur`. |
234
+ | `skinColor` | `string` | — | Applied to `skin`, `body`, `face`. |
235
+ | `eyeColor` | `string` | — | Applied to `eye`, `iris`. |
236
+ | `accessories` | `TalkingHeadAccessory[]` | `[]` | Bone-attached GLB items. |
237
+ | `onReady` | `() => void` | — | Fired when fully loaded. |
238
+ | `onError` | `(msg: string) => void` | — | Fired on load failure. |
239
+ | `style` | `ViewStyle / CSSProperties` | — | Container style. |
240
+
241
+ ### Ref methods
307
242
 
308
- // Via ref (runtime)
309
- ref.current?.setHairColor('#ff4500');
310
- ref.current?.setSkinColor('#c68642');
311
- ref.current?.setEyeColor('#1abc9c');
243
+ ```ts
244
+ ref.current?.sendAmplitude(0.7); // amplitude 0..1 → jaw
245
+ ref.current?.scheduleVisemes(payload); // AgentVisemePayload → full lip-sync schedule
246
+ ref.current?.clearVisemes();
247
+ ref.current?.setMood('excited');
248
+ ref.current?.setHairColor('#ff0000');
249
+ ref.current?.setSkinColor('#8d5524');
250
+ ref.current?.setEyeColor('#2e86de');
251
+ ref.current?.setAccessories([...]);
252
+ ref.current?.dispatchMotion('nod');
312
253
  ```
313
254
 
314
- This works on both rigged and non-rigged models -- any GLB with appropriately named materials will respond to color changes.
315
-
316
255
  ---
317
256
 
318
- ## Voice Pipeline Integration
319
-
320
- The component is designed to sit at the end of a voice pipeline. Feed it audio amplitude and it handles the rest.
321
-
322
- ### Primary: HeadAudio phoneme lip-sync
323
-
324
- On rigged models in browser contexts with Web Audio available, [HeadAudio](https://github.com/met4citizen/HeadAudio) provides phoneme-level lip-sync automatically. Audio elements in the page are intercepted and routed through the lip-sync engine -- no wiring required on your end.
325
-
326
- ### Fallback: amplitude-driven jaw
327
-
328
- When phoneme-level lip-sync is unavailable (React Native WebView, non-rigged models, or missing blend shapes), `sendAmplitude` drives jaw movement directly via morph targets.
329
-
330
- ### LiveKit integration
331
-
332
- ```tsx
333
- import { useDataChannel } from '@livekit/components-react';
334
-
335
- function AvatarWithLiveKit() {
336
- const ref = useRef<TalkingHeadRef>(null);
257
+ ## Accessories
337
258
 
338
- useDataChannel('agent_speaking', (data) => {
339
- if (data.amplitude !== undefined) {
340
- ref.current?.sendAmplitude(data.amplitude);
341
- }
342
- });
259
+ Any GLB attached to any skeleton bone. Placement is editable at runtime via the 3D editor.
343
260
 
344
- return <TalkingHead ref={ref} avatarUrl="..." />;
261
+ ```ts
262
+ interface TalkingHeadAccessory {
263
+ id: string;
264
+ url: string;
265
+ bone: string; // 'Head' | 'Spine' | 'RightHand' | ...
266
+ position: [number, number, number];
267
+ rotation: [number, number, number]; // Euler, radians
268
+ scale: number;
345
269
  }
346
270
  ```
347
271
 
348
- ### Web Audio analyser
349
-
350
- ```tsx
351
- const audioCtx = new AudioContext();
352
- const analyser = audioCtx.createAnalyser();
353
- const buf = new Uint8Array(analyser.frequencyBinCount);
354
-
355
- // Connect your audio source to the analyser
356
- source.connect(analyser);
357
-
358
- // Poll amplitude and feed the avatar
359
- const interval = setInterval(() => {
360
- analyser.getByteFrequencyData(buf);
361
- const amplitude = buf.reduce((a, b) => a + b, 0) / buf.length / 255;
362
- ref.current?.sendAmplitude(amplitude);
363
- }, 50);
364
- ```
365
-
366
- ### Any audio source
272
+ Common Mixamo bones: `Head, Neck, Spine, Spine1, Spine2, LeftHand, RightHand, LeftFoot, RightFoot, Hips`
367
273
 
368
- The only contract is a number between 0 and 1, called at roughly 20 Hz. This works with ElevenLabs, OpenAI Realtime, Deepgram, Whisper, or any other TTS/STT pipeline.
274
+ The 3D editor (`talking-head-studio/editor`) provides a gizmo for live placement with front/top/side views. LLM-assisted placement is available via the companion backend.
369
275
 
370
276
  ---
371
277
 
372
- ## GLB Compatibility
373
-
374
- ### Rigged models (full feature set)
375
-
376
- For the complete experience -- phoneme lip-sync, expressions, moods, gestures -- your GLB should have:
377
-
378
- - A **Mixamo-compatible armature** (the component expects standard bone names)
379
- - **ARKit blend shapes** and/or **Oculus viseme blend shapes** for lip-sync
380
- - Standard Three.js-compatible GLB format
381
-
382
- Models from [Avaturn](https://avaturn.me/) or any Mixamo-rigged source work out of the box.
383
-
384
- ### Non-rigged models (static fallback)
385
-
386
- Any valid GLB loads successfully. Non-rigged models get:
387
-
388
- - Auto-framing and centering in the viewport
389
- - Orbit controls for rotation
390
- - Embedded animation playback (walk cycles, idle loops, etc.)
391
- - Amplitude-driven jaw via morph targets (if the model has `jawOpen`, `mouthOpen`, or `viseme_aa` blend shapes)
392
- - Color customization (if materials are named appropriately)
393
- - Accessory attachment (falls back to scene root if no bones exist)
394
-
395
- ### Upstream documentation
396
-
397
- For detailed model authoring guidance, see the [TalkingHead documentation](https://github.com/met4citizen/TalkingHead).
398
-
399
- ---
400
-
401
- ## Plain React / Next.js
402
-
403
- This works on the web without `react-native` or `react-native-webview` installed at runtime.
404
-
405
- On web, the component renders an `<iframe>` with `srcdoc` containing the full Three.js scene. No WebView, no native modules, no build plugins.
406
-
407
- ```tsx
408
- // Works in any React 18+ web app
409
- import { TalkingHead } from 'talking-head-studio';
410
-
411
- export default function Page() {
412
- return (
413
- <TalkingHead
414
- avatarUrl="/models/avatar.glb"
415
- mood="happy"
416
- style={{ width: 600, height: 800 }}
417
- />
418
- );
419
- }
420
- ```
421
-
422
- Metro and Expo use the native entry backed by `react-native-webview`. Standard web bundlers use the browser entry backed by a plain `<iframe>`. The API is identical.
278
+ ## Packages
279
+
280
+ | Path | Description |
281
+ |------|-------------|
282
+ | `talking-head-studio` | Live avatar renderer + FaceControl contracts |
283
+ | `talking-head-studio/editor` | R3F-based 3D editor with gizmo (web only) |
284
+ | `talking-head-studio/appearance` | Material color system for any GLB |
285
+ | `talking-head-studio/voice` | Audio recording + WAV conversion hooks |
286
+ | `talking-head-studio/sketchfab` | Sketchfab search + download hooks |
287
+ | `talking-head-studio/api` | Studio API client (avatar CRUD, voice profiles) |
288
+ | `talking-head-studio/wardrobe` | Accessory + outfit state management |
289
+ | `talking-head-studio/wgpu` | React Native wgpu renderer |
290
+ | `packages/avatar-creator` | Embeddable avatar creator widget |
291
+ | `packages/agent-avatar` | LiveKit agent + MCP integration |
423
292
 
424
293
  ---
425
294
 
426
- ## MotionEngine (Upcoming)
427
-
428
- [MotionEngine](https://github.com/lhupyn/motion-engine) integration is in development. This will add real-time body tracking and gesture replay to the avatar, driven by webcam or motion capture data.
429
-
430
- Stay tuned.
295
+ ## Roadmap
296
+
297
+ ### Now shipped
298
+ - `FaceControl` canonical face control space (pose + expression + gaze)
299
+ - `AvatarBackend` interface — swap renderers without changing upstream code
300
+ - `MorphTargetBackend` — Three.js GLB adapter with morph target discovery and mood layering
301
+ - ARKit → Oculus analytical remap (`remapArkitToOculus`, full coefficient table)
302
+ - `useFaceControlsFromVisemes` — rAF-sampled hook from `AgentVisemePayload`
303
+ - `AgentVisemePayload` canonical TTS → lip-sync wire format
304
+ - `AvatarGlbParams` — typed API contract for quality/compression/morph group selection
305
+ - `CalibrationProfile` — per-avatar range remapping and gaze limits
306
+ - Platform type stubs: SDK (web/Unity/Unreal), marketplace catalog, avatar GLB API
307
+ - `packages/avatar-creator` — embeddable creator widget with preset catalog
308
+ - `packages/agent-avatar` — LiveKit agent + MCP tool integration
309
+
310
+ ### Next
311
+ - **GLB schema walker** — scan any loaded GLB and report: morph target coverage, skeleton bones, LODs, viseme tier. Prerequisite for the validator and import pipeline.
312
+ - **`GET /avatars/{id}.glb` with `AvatarGlbParams`** — extend the companion backend to serve quality/compression/morph-group variants on the existing endpoint.
313
+ - **Creator postMessage bridge** — let partners embed the avatar creator in an iframe and receive avatar IDs back, like RPM's WebView creator.
314
+
315
+ ### Medium term
316
+ - **`GaussianBackend`** — Gaussian splat renderer implementing `AvatarBackend`. Takes any model, scans it, drives expression via FLAME-based per-viseme delta transfer. No artist work, no blend shapes required. This is the zero-prerequisite lip-sync path.
317
+ - **FLAME viseme transfer pipeline** (Python, companion backend) — fit FLAME to a face screenshot, generate Oculus viseme deltas, bake back into the GLB as morph targets. Background task on upload for any avatar missing viseme morphs.
318
+ - **Unity SDK** — C# plugin implementing the `AvatarBackend` contract. Blueprint-friendly API for loading GLBs, driving morphs, consuming `AgentVisemePayload`.
319
+ - **Unreal plugin** — UE5 plugin with Blueprint-accessible `UAvatarDescriptor` and a sample Quickstart map.
320
+
321
+ ### Longer term
322
+ - Avatar marketplace — `CatalogItem`, `AvatarAsset`, `RarityLevel` types are already defined. Backend + web store + in-creator purchasing.
323
+ - RPM migration tools — import existing RPM avatars where technically possible.
324
+ - SLA + deprecation policy — for teams that need a reliability guarantee as they move off RPM.
431
325
 
432
326
  ---
433
327
 
434
328
  ## Contributing
435
329
 
436
- Contributions are welcome. Please open an issue to discuss your idea before submitting a pull request.
437
-
438
330
  ```bash
439
331
  git clone https://github.com/sitebay/talking-head-studio.git
440
332
  cd talking-head-studio
441
333
  npm install
442
- npm run typecheck
334
+ npm run typecheck # must be clean (excluding known expo-audio peer dep warnings)
443
335
  npm test
444
336
  ```
445
337
 
446
- ---
447
-
448
- ## Credits
449
-
450
- This project builds on excellent open-source work:
451
-
452
- - [met4citizen/TalkingHead](https://github.com/met4citizen/TalkingHead) -- The 3D avatar engine powering model loading, rigging, and expression systems.
453
- - [met4citizen/HeadAudio](https://github.com/met4citizen/HeadAudio) -- Phoneme-based lip-sync from audio streams using AudioWorklet.
454
- - [lhupyn/motion-engine](https://github.com/lhupyn/motion-engine) -- Real-time body motion tracking (upcoming integration).
455
- - [Three.js](https://threejs.org/) -- 3D rendering, loaded via CDN at runtime.
456
-
457
- ---
458
-
459
- ## License
460
-
461
- MIT
462
- at runtime.
338
+ The repo is a monorepo with `packages/*` as npm workspaces. The main library is the root package.
463
339
 
464
340
  ---
465
341