talking-head-studio 0.4.10 → 0.4.11
- package/README.md +227 -351
- package/dist/TalkingHead.d.ts +16 -25
- package/dist/TalkingHead.web.d.ts +6 -0
- package/dist/TalkingHead.web.js +17 -7
- package/dist/api/studioApi.js +25 -26
- package/dist/appearance/apply.js +2 -3
- package/dist/appearance/matchers.js +1 -2
- package/dist/appearance/schema.js +1 -2
- package/dist/core/avatar/backend.d.ts +130 -0
- package/dist/core/avatar/backend.js +4 -0
- package/dist/core/avatar/backends/gaussian.d.ts +49 -0
- package/dist/core/avatar/backends/gaussian.js +291 -0
- package/dist/core/avatar/backends/index.d.ts +3 -0
- package/dist/core/avatar/backends/index.js +7 -0
- package/dist/core/avatar/backends/morphTarget.d.ts +39 -0
- package/dist/core/avatar/backends/morphTarget.js +179 -0
- package/dist/core/avatar/faceControls.d.ts +40 -0
- package/dist/core/avatar/faceControls.js +138 -0
- package/dist/core/avatar/schema.d.ts +50 -0
- package/dist/core/avatar/schema.js +134 -0
- package/dist/core/avatar/visemes.d.ts +31 -0
- package/dist/core/avatar/visemes.js +67 -1
- package/dist/editor/AvatarCanvas.js +1 -2
- package/dist/editor/AvatarEditor.native.js +18 -9
- package/dist/editor/AvatarModel.js +1 -2
- package/dist/editor/FaceSqueezeEditor.js +19 -9
- package/dist/editor/FaceSqueezeEditor.web.js +2 -2
- package/dist/editor/RigidAccessory.js +1 -2
- package/dist/editor/SkinnedClothing.js +18 -9
- package/dist/editor/boneSnap.js +22 -12
- package/dist/editor/studioTheme.js +2 -2
- package/dist/html.js +1 -2
- package/dist/index.d.ts +15 -1
- package/dist/index.js +28 -5
- package/dist/platform/api/types.d.ts +10 -0
- package/dist/platform/api/types.js +2 -0
- package/dist/platform/marketplace/types.d.ts +32 -0
- package/dist/platform/marketplace/types.js +2 -0
- package/dist/platform/sdk/unity.d.ts +27 -0
- package/dist/platform/sdk/unity.js +2 -0
- package/dist/platform/sdk/unreal.d.ts +23 -0
- package/dist/platform/sdk/unreal.js +2 -0
- package/dist/platform/sdk/web.d.ts +16 -0
- package/dist/platform/sdk/web.js +2 -0
- package/dist/sketchfab/api.js +4 -5
- package/dist/sketchfab/useSketchfabSearch.js +1 -2
- package/dist/tts/useDirectVisemeStream.d.ts +2 -6
- package/dist/tts/useDirectVisemeStream.js +1 -2
- package/dist/tts/useMotionMarkers.d.ts +0 -1
- package/dist/tts/useMotionMarkers.js +1 -2
- package/dist/utils/avatarUtils.js +2 -3
- package/dist/utils/faceLandmarkerToShapeWeights.js +19 -10
- package/dist/voice/convertToWav.js +1 -2
- package/dist/voice/index.d.ts +3 -0
- package/dist/voice/index.js +6 -1
- package/dist/voice/useAudioPlayer.js +1 -2
- package/dist/voice/useAudioRecording.js +1 -2
- package/dist/voice/useFaceControls.d.ts +14 -0
- package/dist/voice/useFaceControls.js +81 -0
- package/dist/voice/useVoicePreview.d.ts +7 -0
- package/dist/voice/useVoicePreview.js +81 -0
- package/dist/wardrobe/index.d.ts +2 -0
- package/dist/wardrobe/index.js +3 -1
- package/dist/wardrobe/useAvatarWardrobeHydration.js +1 -2
- package/dist/wardrobe/useStudioAvatar.d.ts +29 -0
- package/dist/wardrobe/useStudioAvatar.js +177 -0
- package/dist/wgpu/WgpuAvatar.js +17 -7
- package/dist/wgpu/useAuthedModelUri.js +18 -9
- package/package.json +8 -4
package/README.md
CHANGED
@@ -1,6 +1,6 @@
 # talking-head-studio
 
-**
+**Open-source avatar platform for Web, React Native, Unity, and Unreal. Any GLB model. Full lip-sync — with or without blend shapes.**
 
 [](https://www.npmjs.com/package/talking-head-studio)
 [](https://opensource.org/licenses/MIT)
@@ -8,56 +8,64 @@
 
 ---
 
-##
+## What this is
 
-
-
-
-- **Always Alive:** Procedural idle animations (breathing, nodding, swaying) keep your avatar from feeling like a static doll.
-- **Dynamic Wardrobe & Accessories:** Swap hair, skin, and eye colors on the fly. Attach hats, glasses, or backpacks to any bone at runtime.
+A drop-in avatar runtime and platform SDK built to be a self-hostable replacement for Ready Player Me. The core problem it solves: **any arbitrary 3D model should be able to talk, emote, and respond to a voice pipeline** — regardless of whether the artist baked in blend shapes, visemes, or any face rig at all.
+
+The library ships a renderer (web iframe + React Native wgpu), a backend-agnostic face control contract, and a growing set of adapters that map TTS/audio/AI output onto whatever rendering mechanism the model actually supports.
 
 ---
 
-##
-
-- [Installation](#installation)
-- [Quick Start](#quick-start)
-- [Subpath Exports](#subpath-exports)
-- [Props](#props)
-- [Ref API](#ref-api)
-- [Accessories](#accessories)
-- [Color Customization](#color-customization)
-- [Voice Pipeline Integration](#voice-pipeline-integration)
-- [GLB Compatibility](#glb-compatibility)
-- [Plain React / Next.js](#plain-react--nextjs)
-- [MotionEngine (Upcoming)](#motionengine-upcoming)
-- [Contributing](#contributing)
-- [Credits](#credits)
-- [License](#license)
+## Lip-sync tiers (any model works)
 
-
+| Model type | Lip-sync method | Quality |
+|---|---|---|
+| GLB with Oculus viseme morphs | Direct morph drive via `MorphTargetBackend` | Excellent |
+| GLB with ARKit blend shapes | `remapArkitToOculus()` → morph drive | Good |
+| GLB with only `jawOpen` / `mouthOpen` | Amplitude fallback | Acceptable |
+| Any other GLB | Gaussian splat backend *(roadmap)* | Excellent |
 
-
+The last row is the goal: **scan any model into a Gaussian representation, generate per-viseme deltas via FLAME-based transfer, and drive it from the same `FaceControl` contract everything else uses.** No blend shapes required. No artist work required.
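
The tier table can be read as a decision procedure over a model's morph target names. Below is a minimal sketch of that idea, assuming Oculus visemes are exported with a `viseme_` prefix and ARKit shapes use their standard names. `classifyLipSyncTier`, the tier labels, and the list of ARKit mouth shapes are hypothetical and not part of the package.

```typescript
// Hypothetical helper: pick a lip-sync tier from a model's morph target names.
type LipSyncTier = 'oculus-visemes' | 'arkit-remap' | 'amplitude-jaw' | 'needs-gaussian';

// A few ARKit mouth shapes; finding any of them next to jawOpen is treated
// here as evidence of an ARKit-style rig (an assumption for this sketch).
const ARKIT_MOUTH_SHAPES = ['mouthFunnel', 'mouthPucker', 'mouthSmileLeft', 'mouthSmileRight'];

function classifyLipSyncTier(morphNames: string[]): LipSyncTier {
  const has = (name: string): boolean => morphNames.indexOf(name) !== -1;

  // Tier 1: dedicated viseme morphs can be driven directly.
  if (morphNames.some((name) => /^viseme_/.test(name))) return 'oculus-visemes';

  // Tier 2: ARKit-style shapes can be remapped to visemes.
  if (has('jawOpen') && ARKIT_MOUTH_SHAPES.some(has)) return 'arkit-remap';

  // Tier 3: a single jaw or mouth morph only supports amplitude-driven motion.
  if (has('jawOpen') || has('mouthOpen')) return 'amplitude-jaw';

  // Tier 4: no usable morphs, which is the case the roadmap backend targets.
  return 'needs-gaussian';
}

const tier = classifyLipSyncTier(['jawOpen', 'mouthPucker', 'blinkLeft']);
console.log(tier);
```

The order of the checks mirrors the table: the highest-quality path that the model supports wins.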
 
-
+---
+
+## Architecture
 
-```
-
+```
+TTS / audio / face tracking
+↓
+AgentVisemePayload ← canonical wire format for lip-sync schedules
+↓
+FaceControl ← pose (HeadPose) + expression (ExpressionState) + gaze (EyeGaze)
+↓
+AvatarBackend ←────────────── swap without changing anything upstream
+├── MorphTargetBackend ← Three.js morph targets (GLB with blend shapes)
+├── GaussianBackend ← [roadmap] Gaussian splat + FLAME delta transfer
+└── (your backend) ← implement AvatarBackend, plug in
+↓
+Renderer
+├── Web iframe ← TalkingHead.web.tsx (any React app)
+├── React Native wgpu ← WgpuAvatar (native GPU, no WebView latency)
+└── Unity / Unreal ← [roadmap] SDK plugins consuming same contracts
 ```
 
-`
+Everything above `AvatarBackend` is renderer-agnostic. Everything above `FaceControl` is model-agnostic.
 
-
+---
+
+## Installation
 
 ```bash
+# React Native / Expo
+npm install talking-head-studio react-native-webview
+
+# Web (React, Next.js, Vite)
 npm install talking-head-studio
 ```
 
-No `react-native` or WebView runtime dependency needed. The package ships a web entry point that renders via `<iframe srcdoc>` automatically when bundled for browser targets.
-
 ---
 
-## Quick
+## Quick start
 
 ```tsx
 import { useRef } from 'react';
@@ -74,19 +82,16 @@ export default function Avatar() {
 cameraView="upper"
 hairColor="#1a1a2e"
 skinColor="#e0a370"
-accessories={[
-
-
-
-
-
-
-
-},
-]}
+accessories={[{
+id: 'sunglasses',
+url: 'https://example.com/sunglasses.glb',
+bone: 'Head',
+position: [0, 0.08, 0.12],
+rotation: [0, 0, 0],
+scale: 1.0,
+}]}
 style={{ width: 400, height: 600 }}
-onReady={() => console.log('
-onError={(msg) => console.error('Load failed:', msg)}
+onReady={() => console.log('ready')}
 />
 );
 }
@@ -94,372 +99,243 @@ export default function Avatar() {
 
 ---
 
-##
-
-The package ships five independent entry points. Import only what you need — each subpath has its own optional peer dependencies.
+## FaceControl — the core contract
 
-
-```tsx
-import { TalkingHead } from 'talking-head-studio';
-// Peer deps: react
-// Native-only peers: react-native (optional), react-native-webview (optional)
-```
+The `FaceControl` type is the single value that flows between your voice pipeline and any avatar backend. If you're building a custom backend or integrating with a game engine, this is what you implement against.
 
-
-
-
-
-
+```ts
+import type { FaceControl, ExpressionState, HeadPose, EyeGaze } from 'talking-head-studio';
+
+type HeadPose = {
+yaw: number; // -1..1, left..right
+pitch: number; // -1..1, down..up
+roll: number; // -1..1, tilt
+};
+
+type EyeGaze = {
+x: number; // -1..1, left..right
+y: number; // -1..1, down..up
+};
+
+type ExpressionState = {
+jawOpen: number; // 0..1
+mouthSmile: number;
+mouthFunnel: number;
+mouthPucker: number;
+mouthWide: number;
+upperLipRaise: number;
+lowerLipDepress: number;
+cheekRaise: number;
+blinkLeft: number;
+blinkRight: number;
+browInnerUp: number;
+browDownLeft: number;
+browDownRight: number;
+eyeGazeLeft: EyeGaze;
+eyeGazeRight: EyeGaze;
+};
 ```
 
-###
-Apply skin/hair/eye colors to any GLB avatar. Works in both the live view and the 3D editor.
-```tsx
-import { applyAppearanceToObject3D, type AvatarAppearance } from 'talking-head-studio/appearance';
-// No extra peer deps
-```
+### Driving FaceControl from a viseme schedule
 
-
-
-```tsx
-import { useAudioRecording, useAudioPlayer } from 'talking-head-studio/voice';
-// No extra peer deps (browser APIs only)
-```
+```ts
+import { useFaceControlsFromVisemes } from 'talking-head-studio';
 
-
-
-
-import { useSketchfabSearch, ACCESSORY_CATEGORIES, downloadModel } from 'talking-head-studio/sketchfab';
-// No extra peer deps
+// schedule: AgentVisemePayload from your TTS backend
+const faceControl = useFaceControlsFromVisemes(schedule);
+// → { pose: { yaw:0, pitch:0, roll:0 }, expr: { jawOpen: 0.7, ... } }
 ```
 
-
-
-## Props
-
-| Prop | Type | Default | Description |
-|------|------|---------|-------------|
-| `avatarUrl` | `string` | **required** | URL to any `.glb` model. Rigged or non-rigged. |
-| `authToken` | `string \| null` | `null` | Bearer token sent when fetching the model URL. CDN URLs are excluded automatically. |
-| `mood` | `TalkingHeadMood` | `'neutral'` | Avatar expression. See [Moods](#moods) below. |
-| `cameraView` | `'head' \| 'upper' \| 'full'` | `'upper'` | Camera framing preset. |
-| `cameraDistance` | `number` | `-0.5` | Camera zoom offset. Negative values zoom in. |
-| `hairColor` | `string` | -- | CSS color applied to materials whose name contains `hair` or `fur`. |
-| `skinColor` | `string` | -- | CSS color applied to materials whose name contains `skin`, `body`, or `face`. |
-| `eyeColor` | `string` | -- | CSS color applied to materials whose name contains `eye` or `iris`. |
-| `accessories` | `TalkingHeadAccessory[]` | `[]` | Array of GLB items to attach to bones. See [Accessories](#accessories). |
-| `onReady` | `() => void` | -- | Fires once the avatar and scene are fully loaded. |
-| `onError` | `(message: string) => void` | -- | Fires on load failure. |
-| `style` | `ViewStyle` | -- | Container style (works on both native and web). |
-
-### Moods
-
-The `mood` prop accepts one of:
+### Implementing a custom backend
 
-```
-
+```ts
+import type { AvatarBackend, AvatarRenderTarget, FaceControl } from 'talking-head-studio';
+
+class MyGaussianBackend implements AvatarBackend {
+initialize() { /* load splat data, FLAME weights */ }
+attach(target: AvatarRenderTarget) { /* bind to canvas/surface */ }
+setControl(control: FaceControl) { /* map ExpressionState → splat coefficients */ }
+renderFrame() { /* rasterize */ }
+dispose() { /* cleanup */ }
+}
 ```
 
-Mood can be changed at any time via props or the ref API. On rigged models, mood maps to blend shape expressions. On non-rigged models, mood is a no-op.
-
 ---
 
-##
-
-Access runtime controls through a React ref. Every method is safe to call at any time -- calls made before the avatar is ready are silently dropped.
+## MorphTargetBackend — Three.js GLB adapter
 
-
-const ref = useRef<TalkingHeadRef>(null);
+The first concrete `AvatarBackend` implementation. Give it any loaded Three.js scene and it will find morph targets, build a lookup cache, and drive them from `FaceControl`.
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-url: 'https://example.com/crown.glb',
-bone: 'Head',
-position: [0, 0.22, 0],
-rotation: [0, 0, 0],
-scale: 0.8,
+```ts
+import * as THREE from 'three';
+import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader';
+import { MorphTargetBackend } from 'talking-head-studio';
+
+const loader = new GLTFLoader();
+const gltf = await loader.loadAsync('/avatar.glb');
+
+const backend = new MorphTargetBackend(gltf.scene, {
+mood: 'neutral',
+expressionScale: 1.0,
+calibration: {
+neutral: { pose: { yaw: 0, pitch: 0, roll: 0 }, expr: createNeutralExpression() },
+ranges: { jawOpen: { min: 0, max: 0.85 } }, // clamp jaw for this model
+gazeLimits: { x: { min: -0.6, max: 0.6 } },
 },
-
-```
+});
 
-
+// Each frame:
+backend.setControl(faceControl);
+backend.renderFrame();
 
-
-
-
-
-| `setHairColor` | `(color: string) => void` | Update hair material color. |
-| `setSkinColor` | `(color: string) => void` | Update skin material color. |
-| `setEyeColor` | `(color: string) => void` | Update eye/iris material color. |
-| `setAccessories` | `(accessories: TalkingHeadAccessory[]) => void` | Replace the entire accessory set. Handles loading, diffing, and cleanup automatically. |
+// Debug: what morphs does this model actually have?
+console.log(backend.availableChannels);
+// → { visemes: ['aa','PP','oh',...], expressions: ['jawOpen','blinkLeft',...], gaze: ['lookLeft','lookUp'] }
+```
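
The `ranges` entry in the calibration block reads as a per-channel limit on expression values. The package's own calibration logic is not shown here, so the following is only a sketch of what clamping to such ranges could look like. `clampToRanges` and the flat `Expression` record are hypothetical simplifications, not the library's API.

```typescript
// Hypothetical, simplified shapes for illustration only.
type ChannelRange = { min: number; max: number };
type Expression = Record<string, number>;

// Clamp every channel that has a configured range; pass the rest through unchanged.
function clampToRanges(expr: Expression, ranges: Record<string, ChannelRange>): Expression {
  const out: Expression = {};
  for (const channel of Object.keys(expr)) {
    const value = expr[channel];
    const range = ranges[channel];
    out[channel] = range ? Math.min(range.max, Math.max(range.min, value)) : value;
  }
  return out;
}

// A wide-open jaw is limited to 0.85 for this model; mouthSmile has no range and is untouched.
const calibrated = clampToRanges(
  { jawOpen: 1.0, mouthSmile: 0.5 },
  { jawOpen: { min: 0, max: 0.85 } },
);
console.log(calibrated);
```

A real calibration profile might rescale values into the range instead of clamping them; the sketch picks clamping because that is what the comment in the example above describes.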
 
 ---
 
-##
-
-Attach any GLB model to any bone on the avatar skeleton. The system handles loading, disposal, and transform updates.
+## ARKit → Oculus remap
 
-
+Models with ARKit blend shapes (52 facial action units) but no Oculus viseme morphs can be remapped analytically — no ML, no FLAME, no artist work.
 
 ```ts
-
-
-
-
-
-
-
-}
+import { remapArkitToOculus, getArkitWeightsForViseme } from 'talking-head-studio';
+
+// Runtime: face tracking data → Oculus viseme weights
+const oculusWeights = remapArkitToOculus({
+jawOpen: 0.7,
+mouthLowerDownLeft: 0.4,
+mouthLowerDownRight: 0.4,
+});
+// → { aa: 0.68, PP: 0.03, oh: 0.12, ... }
+
+// Bake-time: get the ARKit recipe for a specific viseme
+const recipe = getArkitWeightsForViseme('ou');
+// → { mouthPucker: 0.9, mouthRollLower: 0.3 }
 ```
 
-
-
-```tsx
-<TalkingHead
-avatarUrl="https://example.com/avatar.glb"
-accessories={[
-{
-id: 'cowboy-hat',
-url: '/models/cowboy-hat.glb',
-bone: 'Head',
-position: [0, 0.18, 0],
-rotation: [0, 0, 0],
-scale: 1.2,
-},
-{
-id: 'aviators',
-url: '/models/aviator-glasses.glb',
-bone: 'Head',
-position: [0, 0.06, 0.11],
-rotation: [0, 0, 0],
-scale: 1.0,
-},
-{
-id: 'backpack',
-url: '/models/backpack.glb',
-bone: 'Spine1',
-position: [0, 0, -0.15],
-rotation: [0, Math.PI, 0],
-scale: 0.9,
-},
-]}
-/>
-```
-
-### Common bone names
-
-Mixamo-rigged models typically expose these bones:
-
-```
-Head, Neck, Spine, Spine1, Spine2,
-LeftShoulder, LeftArm, LeftForeArm, LeftHand,
-RightShoulder, RightArm, RightForeArm, RightHand,
-LeftUpLeg, LeftLeg, LeftFoot,
-RightUpLeg, RightLeg, RightFoot
-```
-
-Bone matching is flexible -- if an exact match is not found, the component tries a prefix match (useful for Sketchfab exports like `Head_5`). If no bone matches, the accessory falls back to the scene root.
-
-### Runtime accessory swaps
-
-```tsx
-// Remove all accessories
-ref.current?.setAccessories([]);
-
-// Swap glasses for a monocle
-ref.current?.setAccessories([
-{ id: 'monocle', url: '/models/monocle.glb', bone: 'Head', position: [0.03, 0.07, 0.11], rotation: [0, 0, 0], scale: 0.6 },
-]);
-```
-
-Accessories that were previously loaded but are absent from the new array are automatically disposed (geometry, materials, textures).
+The full `ARKIT_TO_OCULUS` coefficient table is exported so you can build your own bake pipeline.
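
An analytical remap of this kind can be pictured as a weighted sum: each viseme weight is a fixed linear combination of ARKit blend shape weights, clamped to 0..1. The sketch below assumes that form. `EXAMPLE_TABLE` and `remapLinear` are hypothetical, and the coefficients are made up for illustration; the package's real `ARKIT_TO_OCULUS` values and function internals may differ.

```typescript
type Weights = Record<string, number>;

// Hypothetical coefficients, chosen only to make the example concrete.
const EXAMPLE_TABLE: Record<string, Weights> = {
  aa: { jawOpen: 0.8, mouthLowerDownLeft: 0.1, mouthLowerDownRight: 0.1 },
  ou: { mouthPucker: 0.9, mouthFunnel: 0.1 },
};

// For each viseme, sum coefficient * ARKit weight over the shapes in its recipe.
function remapLinear(arkit: Weights, table: Record<string, Weights>): Weights {
  const out: Weights = {};
  for (const viseme of Object.keys(table)) {
    const recipe = table[viseme];
    let sum = 0;
    for (const shape of Object.keys(recipe)) {
      sum += recipe[shape] * (arkit[shape] ?? 0); // missing shapes count as 0
    }
    out[viseme] = Math.min(1, Math.max(0, sum)); // keep weights in 0..1
  }
  return out;
}

const visemeWeights = remapLinear({ jawOpen: 0.5, mouthPucker: 1 }, EXAMPLE_TABLE);
console.log(visemeWeights);
```

Because the mapping is a plain table lookup plus arithmetic, it can run per frame at negligible cost, and the same table can be read in the other direction at bake time to find which ARKit shapes make up a given viseme.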
 
 ---
 
-##
-
-Colors can be set via props (applied on initial load) or via the ref API (applied at runtime without reloading the model).
+## TalkingHead component — props & ref
 
-
+### Props
 
-|
-
-|
-|
-|
-
-
-
-
+| Prop | Type | Default | Description |
+|------|------|---------|-------------|
+| `avatarUrl` | `string` | required | Any `.glb`. Rigged or not. |
+| `authToken` | `string \| null` | `null` | Bearer token for authenticated GLB URLs. |
+| `mood` | `TalkingHeadMood` | `'neutral'` | `neutral \| happy \| sad \| angry \| excited \| thinking \| concerned \| surprised` |
+| `cameraView` | `'head' \| 'upper' \| 'full'` | `'upper'` | Framing preset. |
+| `cameraDistance` | `number` | `-0.5` | Zoom offset. Negative = closer. |
+| `hairColor` | `string` | — | Hex color. Applied to materials named `hair`, `fur`. |
+| `skinColor` | `string` | — | Applied to `skin`, `body`, `face`. |
+| `eyeColor` | `string` | — | Applied to `eye`, `iris`. |
+| `accessories` | `TalkingHeadAccessory[]` | `[]` | Bone-attached GLB items. |
+| `onReady` | `() => void` | — | Fired when fully loaded. |
+| `onError` | `(msg: string) => void` | — | Fired on load failure. |
+| `style` | `ViewStyle / CSSProperties` | — | Container style. |
+
+### Ref methods
 
-
-ref.current?.
-ref.current?.
-ref.current?.
+```ts
+ref.current?.sendAmplitude(0.7); // amplitude 0..1 → jaw
+ref.current?.scheduleVisemes(payload); // AgentVisemePayload → full lip-sync schedule
+ref.current?.clearVisemes();
+ref.current?.setMood('excited');
+ref.current?.setHairColor('#ff0000');
+ref.current?.setSkinColor('#8d5524');
+ref.current?.setEyeColor('#2e86de');
+ref.current?.setAccessories([...]);
+ref.current?.dispatchMotion('nod');
 ```
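
`sendAmplitude` expects a value in 0..1. One common way to derive such a value from a chunk of PCM samples is the RMS level with a gain factor, capped at 1. The helper below is a minimal sketch of that approach. `rmsAmplitude` and its `gain` default are hypothetical and not part of the package.

```typescript
// Hypothetical helper: reduce a chunk of PCM samples (range -1..1) to a 0..1 amplitude.
function rmsAmplitude(samples: Float32Array, gain = 4): number {
  if (samples.length === 0) return 0;

  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }

  // Speech RMS is usually well below full scale, so boost it and cap at 1.
  const rms = Math.sqrt(sumSquares / samples.length);
  return Math.min(1, rms * gain);
}

const silence = rmsAmplitude(new Float32Array(256));
const loud = rmsAmplitude(new Float32Array(256).fill(1));
console.log(silence, loud);

// In a component, the result would be passed on each audio chunk:
// ref.current?.sendAmplitude(rmsAmplitude(chunk));
```

The gain value is a tuning knob: too low and the jaw barely moves on quiet speech, too high and it saturates at fully open.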
 
-This works on both rigged and non-rigged models -- any GLB with appropriately named materials will respond to color changes.
-
 ---
 
-##
-
-The component is designed to sit at the end of a voice pipeline. Feed it audio amplitude and it handles the rest.
-
-### Primary: HeadAudio phoneme lip-sync
-
-On rigged models in browser contexts with Web Audio available, [HeadAudio](https://github.com/met4citizen/HeadAudio) provides phoneme-level lip-sync automatically. Audio elements in the page are intercepted and routed through the lip-sync engine -- no wiring required on your end.
-
-### Fallback: amplitude-driven jaw
-
-When phoneme-level lip-sync is unavailable (React Native WebView, non-rigged models, or missing blend shapes), `sendAmplitude` drives jaw movement directly via morph targets.
-
-### LiveKit integration
-
-```tsx
-import { useDataChannel } from '@livekit/components-react';
-
-function AvatarWithLiveKit() {
-const ref = useRef<TalkingHeadRef>(null);
+## Accessories
 
-
-if (data.amplitude !== undefined) {
-ref.current?.sendAmplitude(data.amplitude);
-}
-});
+Any GLB attached to any skeleton bone. Placement is editable at runtime via the 3D editor.
 
-
+```ts
+interface TalkingHeadAccessory {
+id: string;
+url: string;
+bone: string; // 'Head' | 'Spine' | 'RightHand' | ...
+position: [number, number, number];
+rotation: [number, number, number]; // Euler, radians
+scale: number;
 }
 ```
 
-
-
-```tsx
-const audioCtx = new AudioContext();
-const analyser = audioCtx.createAnalyser();
-const buf = new Uint8Array(analyser.frequencyBinCount);
-
-// Connect your audio source to the analyser
-source.connect(analyser);
-
-// Poll amplitude and feed the avatar
-const interval = setInterval(() => {
-analyser.getByteFrequencyData(buf);
-const amplitude = buf.reduce((a, b) => a + b, 0) / buf.length / 255;
-ref.current?.sendAmplitude(amplitude);
-}, 50);
-```
-
-### Any audio source
+Common Mixamo bones: `Head, Neck, Spine, Spine1, Spine2, LeftHand, RightHand, LeftFoot, RightFoot, Hips`
 
-The
+The 3D editor (`talking-head-studio/editor`) provides a gizmo for live placement with front/top/side views. LLM-assisted placement is available via the companion backend.
 
 ---
 
-##
-
-
-
-
-
--
--
--
-
-
-
-
-
-Any valid GLB loads successfully. Non-rigged models get:
-
-- Auto-framing and centering in the viewport
-- Orbit controls for rotation
-- Embedded animation playback (walk cycles, idle loops, etc.)
-- Amplitude-driven jaw via morph targets (if the model has `jawOpen`, `mouthOpen`, or `viseme_aa` blend shapes)
-- Color customization (if materials are named appropriately)
-- Accessory attachment (falls back to scene root if no bones exist)
-
-### Upstream documentation
-
-For detailed model authoring guidance, see the [TalkingHead documentation](https://github.com/met4citizen/TalkingHead).
-
----
-
-## Plain React / Next.js
-
-This works on the web without `react-native` or `react-native-webview` installed at runtime.
-
-On web, the component renders an `<iframe>` with `srcdoc` containing the full Three.js scene. No WebView, no native modules, no build plugins.
-
-```tsx
-// Works in any React 18+ web app
-import { TalkingHead } from 'talking-head-studio';
-
-export default function Page() {
-return (
-<TalkingHead
-avatarUrl="/models/avatar.glb"
-mood="happy"
-style={{ width: 600, height: 800 }}
-/>
-);
-}
-```
-
-Metro and Expo use the native entry backed by `react-native-webview`. Standard web bundlers use the browser entry backed by a plain `<iframe>`. The API is identical.
+## Packages
+
+| Path | Description |
+|------|-------------|
+| `talking-head-studio` | Live avatar renderer + FaceControl contracts |
+| `talking-head-studio/editor` | R3F-based 3D editor with gizmo (web only) |
+| `talking-head-studio/appearance` | Material color system for any GLB |
+| `talking-head-studio/voice` | Audio recording + WAV conversion hooks |
+| `talking-head-studio/sketchfab` | Sketchfab search + download hooks |
+| `talking-head-studio/api` | Studio API client (avatar CRUD, voice profiles) |
+| `talking-head-studio/wardrobe` | Accessory + outfit state management |
+| `talking-head-studio/wgpu` | React Native wgpu renderer |
+| `packages/avatar-creator` | Embeddable avatar creator widget |
+| `packages/agent-avatar` | LiveKit agent + MCP integration |
 
 ---
 
-##
-
-
-
-
+## Roadmap
+
+### Now — shipped
+- `FaceControl` canonical face control space (pose + expression + gaze)
+- `AvatarBackend` interface — swap renderers without changing upstream code
+- `MorphTargetBackend` — Three.js GLB adapter with morph target discovery and mood layering
+- ARKit → Oculus analytical remap (`remapArkitToOculus`, full coefficient table)
+- `useFaceControlsFromVisemes` — rAF-sampled hook from `AgentVisemePayload`
+- `AgentVisemePayload` canonical TTS → lip-sync wire format
+- `AvatarGlbParams` — typed API contract for quality/compression/morph group selection
+- `CalibrationProfile` — per-avatar range remapping and gaze limits
+- Platform type stubs: SDK (web/Unity/Unreal), marketplace catalog, avatar GLB API
+- `packages/avatar-creator` — embeddable creator widget with preset catalog
+- `packages/agent-avatar` — LiveKit agent + MCP tool integration
+
+### Next
+- **GLB schema walker** — scan any loaded GLB and report: morph target coverage, skeleton bones, LODs, viseme tier. Prerequisite for the validator and import pipeline.
+- **`GET /avatars/{id}.glb` with `AvatarGlbParams`** — extend the companion backend to serve quality/compression/morph-group variants on the existing endpoint.
+- **Creator postMessage bridge** — let partners embed the avatar creator in an iframe and receive avatar IDs back, like RPM's WebView creator.
+
+### Medium term
+- **`GaussianBackend`** — Gaussian splat renderer implementing `AvatarBackend`. Takes any model, scans it, drives expression via FLAME-based per-viseme delta transfer. No artist work, no blend shapes required. This is the zero-prerequisite lip-sync path.
+- **FLAME viseme transfer pipeline** (Python, companion backend) — fit FLAME to a face screenshot, generate Oculus viseme deltas, bake back into the GLB as morph targets. Background task on upload for any avatar missing viseme morphs.
+- **Unity SDK** — C# plugin implementing the `AvatarBackend` contract. Blueprint-friendly API for loading GLBs, driving morphs, consuming `AgentVisemePayload`.
+- **Unreal plugin** — UE5 plugin with Blueprint-accessible `UAvatarDescriptor` and a sample Quickstart map.
+
+### Longer term
+- Avatar marketplace — `CatalogItem`, `AvatarAsset`, `RarityLevel` types are already defined. Backend + web store + in-creator purchasing.
+- RPM migration tools — import existing RPM avatars where technically possible.
+- SLA + deprecation policy — for teams that need a reliability guarantee as they move off RPM.
 
 ---
 
 ## Contributing
 
-Contributions are welcome. Please open an issue to discuss your idea before submitting a pull request.
-
 ```bash
 git clone https://github.com/sitebay/talking-head-studio.git
 cd talking-head-studio
 npm install
-npm run typecheck
+npm run typecheck # must be clean (excluding known expo-audio peer dep warnings)
 npm test
 ```
 
-
-
-## Credits
-
-This project builds on excellent open-source work:
-
-- [met4citizen/TalkingHead](https://github.com/met4citizen/TalkingHead) -- The 3D avatar engine powering model loading, rigging, and expression systems.
-- [met4citizen/HeadAudio](https://github.com/met4citizen/HeadAudio) -- Phoneme-based lip-sync from audio streams using AudioWorklet.
-- [lhupyn/motion-engine](https://github.com/lhupyn/motion-engine) -- Real-time body motion tracking (upcoming integration).
-- [Three.js](https://threejs.org/) -- 3D rendering, loaded via CDN at runtime.
-
----
-
-## License
-
-MIT
-at runtime.
+The repo is a monorepo with `packages/*` as npm workspaces. The main library is the root package.
 
 ---
 