@convbased/sdk 0.1.0
- package/LICENSE +21 -0
- package/README.md +235 -0
- package/dist/cjs/client.js +635 -0
- package/dist/cjs/client.js.map +1 -0
- package/dist/cjs/endpoints.js +10 -0
- package/dist/cjs/endpoints.js.map +1 -0
- package/dist/cjs/events.js +39 -0
- package/dist/cjs/events.js.map +1 -0
- package/dist/cjs/graphql.js +40 -0
- package/dist/cjs/graphql.js.map +1 -0
- package/dist/cjs/index.js +24 -0
- package/dist/cjs/index.js.map +1 -0
- package/dist/cjs/package.json +3 -0
- package/dist/cjs/rtcServers.js +35 -0
- package/dist/cjs/rtcServers.js.map +1 -0
- package/dist/cjs/sdp.js +37 -0
- package/dist/cjs/sdp.js.map +1 -0
- package/dist/cjs/signaling.js +146 -0
- package/dist/cjs/signaling.js.map +1 -0
- package/dist/cjs/tts.js +227 -0
- package/dist/cjs/tts.js.map +1 -0
- package/dist/cjs/types.js +26 -0
- package/dist/cjs/types.js.map +1 -0
- package/dist/cjs/upload.js +87 -0
- package/dist/cjs/upload.js.map +1 -0
- package/dist/client.d.ts +169 -0
- package/dist/client.d.ts.map +1 -0
- package/dist/client.js +631 -0
- package/dist/client.js.map +1 -0
- package/dist/convbased-sdk.global.js +1291 -0
- package/dist/endpoints.d.ts +3 -0
- package/dist/endpoints.d.ts.map +1 -0
- package/dist/endpoints.js +7 -0
- package/dist/endpoints.js.map +1 -0
- package/dist/events.d.ts +9 -0
- package/dist/events.d.ts.map +1 -0
- package/dist/events.js +35 -0
- package/dist/events.js.map +1 -0
- package/dist/graphql.d.ts +18 -0
- package/dist/graphql.d.ts.map +1 -0
- package/dist/graphql.js +37 -0
- package/dist/graphql.js.map +1 -0
- package/dist/index.d.ts +9 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +9 -0
- package/dist/index.js.map +1 -0
- package/dist/rtcServers.d.ts +13 -0
- package/dist/rtcServers.d.ts.map +1 -0
- package/dist/rtcServers.js +31 -0
- package/dist/rtcServers.js.map +1 -0
- package/dist/sdp.d.ts +6 -0
- package/dist/sdp.d.ts.map +1 -0
- package/dist/sdp.js +34 -0
- package/dist/sdp.js.map +1 -0
- package/dist/signaling.d.ts +33 -0
- package/dist/signaling.d.ts.map +1 -0
- package/dist/signaling.js +142 -0
- package/dist/signaling.js.map +1 -0
- package/dist/tts.d.ts +111 -0
- package/dist/tts.d.ts.map +1 -0
- package/dist/tts.js +223 -0
- package/dist/tts.js.map +1 -0
- package/dist/types.d.ts +194 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +23 -0
- package/dist/types.js.map +1 -0
- package/dist/upload.d.ts +46 -0
- package/dist/upload.d.ts.map +1 -0
- package/dist/upload.js +82 -0
- package/dist/upload.js.map +1 -0
- package/package.json +57 -0
- package/src/client.ts +839 -0
- package/src/endpoints.ts +8 -0
- package/src/events.ts +38 -0
- package/src/graphql.ts +58 -0
- package/src/index.ts +50 -0
- package/src/rtcServers.ts +38 -0
- package/src/sdp.ts +45 -0
- package/src/signaling.ts +172 -0
- package/src/tts.ts +364 -0
- package/src/types.ts +201 -0
- package/src/upload.ts +132 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Convbased

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

package/README.md
ADDED
@@ -0,0 +1,235 @@
# @convbased/sdk

Browser SDK for the Convbased voice services. Authenticate with an API key and:

- **Real-time voice conversion**: capture the microphone and stream the
  converted audio back over WebRTC (`ConvbasedClient`).
- **File inference (voice-to-voice)**: upload an audio file and convert it as
  a discrete task within a live session (`ConvbasedClient.runFileInference`).
- **Text-to-speech**: synthesize speech from text with a reference voice over
  GraphQL (`TtsClient`).

The SDK speaks the same signaling protocol as the official Convbased Web
console (`/signaling/ws` on `ServerAPI/signaling`) and is the easiest way to
embed Convbased voice features in third-party web apps.

## Demo

Two ready-made examples ship in `examples/`:

- **`examples/h5/`** is a single-file HTML demo that loads the SDK from
  the official CDN:
  ```html
  <script src="https://cdn.weights.chat/sdk/convbased-sdk.global.js"></script>
  ```
- **`examples/vanilla-ts/main.ts`** is a framework-free TypeScript snippet you
  can drop into a Vite / Vue / React project to wire the SDK to an existing
  `<audio>` tag. See `tts.ts` and `file-inference.ts` alongside it for the
  text-to-speech and voice-to-voice flows.

## Install

```bash
bun add @convbased/sdk
# or
npm install @convbased/sdk
```

## Quick start

```ts
import { ConvbasedClient } from "@convbased/sdk";

const client = new ConvbasedClient({
  apiKey: "sk_********************************",
  // Signaling + GraphQL endpoints default to the production Convbased
  // service (wss://api.weights.chat/api/signaling/ws), so an API key is
  // the only required field. Override `signalingUrl` / `graphqlUrl` for
  // self-hosted deployments.
});

client.on("track", ({ stream }) => {
  // `stream` is your own voice after conversion; play it to hear the result.
  const audio = document.querySelector<HTMLAudioElement>("#converted")!;
  audio.srcObject = stream;
  void audio.play();
});

client.on("state", ({ state }) => console.log("state:", state));
client.on("error", (err) => console.error(err));

await client.connect({
  modelId: "model_xxx",
  preferences: {
    pitch: 0,
    rms_mix_rate: 0.25,
    f0_autotune: false,
  },
});

// Adjust pitch live:
client.updateConfig({ pitch: 2 });

// End the session:
await client.disconnect();
```

## Endpoints

The production endpoints are baked into the SDK and exposed as constants:

```ts
import {
  DEFAULT_SIGNALING_URL, // wss://api.weights.chat/api/signaling/ws
  DEFAULT_GRAPHQL_URL, // https://api.weights.chat/api/v1/graphql
} from "@convbased/sdk";
```

- For the official Convbased service, **don't pass `signalingUrl` /
  `graphqlUrl`**; the defaults are correct.
- For a self-hosted deployment, override either or both. `signalingUrl`
  accepts a full final URL (ending in `/ws`), the bare `…/signaling` path,
  or just a host (`ws://localhost:3010` → SDK appends `/signaling/ws`).
- To disable the TURN auto-fetch entirely, pass `graphqlUrl: false`. The SDK
  will then use `iceServers` if provided, or public Google STUN as a last
  resort.

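The three accepted `signalingUrl` shapes and the ICE fallback can be pictured with a small sketch. This is not the SDK's actual code: `normalizeSignalingUrl`, `fallbackIceServers`, the `IceServer` type, and the specific STUN address are all assumptions made for illustration, and the handling of a host that already carries a path prefix is a guess.

```ts
// Hypothetical helper: approximates the documented `signalingUrl` rules.
function normalizeSignalingUrl(input: string): string {
  const url = new URL(input);
  // Drop trailing slashes so "ws://host/" and "ws://host" behave the same.
  const path = url.pathname.replace(/\/+$/, "");
  if (path.endsWith("/ws")) {
    url.pathname = path; // already a full final URL
  } else if (path.endsWith("/signaling")) {
    url.pathname = `${path}/ws`; // bare signaling path
  } else {
    url.pathname = `${path}/signaling/ws`; // host only (assumed for prefixes too)
  }
  return url.toString();
}

// Local stand-in for the browser's RTCIceServer shape.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// Assumed address; the README only says "public Google STUN".
const PUBLIC_STUN: IceServer[] = [{ urls: "stun:stun.l.google.com:19302" }];

// With `graphqlUrl: false`, explicit servers win and STUN is the last resort.
function fallbackIceServers(iceServers?: IceServer[]): IceServer[] {
  return iceServers && iceServers.length > 0 ? iceServers : PUBLIC_STUN;
}

console.log(normalizeSignalingUrl("ws://localhost:3010"));
// → ws://localhost:3010/signaling/ws
```

In practice you would pass the raw value straight to `new ConvbasedClient({ signalingUrl })` and let the SDK normalize it; the sketch only makes the rules concrete.
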
## Authentication

Pass either:

- `apiKey`: a long-lived key created in the Convbased dashboard. Sent as
  `?api_key=…` on the WebSocket and `x-api-key` on the GraphQL fetch.
- `accessToken`: a short-lived JWT obtained via `/auth/login`. Sent as `?token=…`
  and `Authorization: Bearer …`.

The signaling server rejects connections with code `1008` if the credential is
invalid or the account has zero balance.

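To make the two credential modes concrete, here is a minimal sketch of how each one could be attached to the signaling URL and to GraphQL request headers. `buildAuth`, `Credential`, `AuthParts`, and `isPolicyRejection` are hypothetical names invented for this example; only the query parameter names, the header names, and the `1008` close code come from the description above.

```ts
// Hypothetical types: exactly one of the two credentials is supplied.
type Credential = { apiKey: string } | { accessToken: string };

interface AuthParts {
  wsUrl: string; // signaling URL with the credential in the query string
  headers: Record<string, string>; // headers for the GraphQL fetch
}

function buildAuth(signalingUrl: string, cred: Credential): AuthParts {
  const url = new URL(signalingUrl);
  if ("apiKey" in cred) {
    url.searchParams.set("api_key", cred.apiKey);
    return { wsUrl: url.toString(), headers: { "x-api-key": cred.apiKey } };
  }
  url.searchParams.set("token", cred.accessToken);
  return {
    wsUrl: url.toString(),
    headers: { Authorization: `Bearer ${cred.accessToken}` },
  };
}

// 1008 is the standard WebSocket "policy violation" close code; per the
// text above it covers both bad credentials and an empty balance.
function isPolicyRejection(closeCode: number): boolean {
  return closeCode === 1008;
}

const auth = buildAuth("wss://api.weights.chat/api/signaling/ws", { apiKey: "sk_test" });
console.log(auth.wsUrl);
// → wss://api.weights.chat/api/signaling/ws?api_key=sk_test
```

Because `1008` covers two different causes, a handler for the `closed` event cannot tell them apart from the code alone; the close reason or error message is the place to look.
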
## Reusing an existing MediaStream

Pass a captured stream to skip `getUserMedia` (useful with custom effect
pipelines like noise suppression or pitch-shift workers):

```ts
const raw = await navigator.mediaDevices.getUserMedia({ audio: true });
// `applyEffects` stands in for your own processing pipeline.
const processed = await applyEffects(raw);
await client.connect({ modelId, audio: processed });
```

## File inference (voice-to-voice)

Convert a whole audio file through the model instead of the live mic. This runs
as a discrete task **inside an existing live session**, so you must `connect()`
first (that's what provisions the inference node). The converted result is
returned as a presigned download URL, not as a live `track`.

```ts
await client.connect({ modelId: "model_xxx" });

// One-call helper: upload, start the task, wait for completion.
const result = await client.runFileInference({
  audio: fileInput.files[0], // a File/Blob, or pass `audioKey` if already uploaded
  preferences: { pitch: 2, f0_method: "rmvpe" },
  onProgress: ({ progress }) => console.log(`progress: ${progress}`),
});
console.log("converted audio:", result.downloadUrl);
```

Prefer the lower-level API when you want to drive the task lifecycle yourself:

```ts
const { key } = await client.uploadAudio(file);
const taskId = client.startTask({ audioKey: key, preferences: { pitch: 2 } });

client.on("taskAck", ({ status, queuePosition }) => { /* queued | started */ });
client.on("taskProgress", ({ progress }) => { /* 0..1 */ });
client.on("taskFinished", ({ status, downloadUrl, error }) => { /* … */ });

client.stopTask(taskId); // cancel
```

> **Must be connected first.** The inference node enforces that a task only runs
> once the live session is `SERVICE_READY` (offer → answer → model loaded).
> Calling `startTask` / `runFileInference` before `connect()` resolves throws;
> sending a task to a not-yet-ready node fails it with `"service not ready"`.

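If you use the lower-level API but still want a single awaitable result, the `taskFinished` event can be wrapped in a promise. The sketch below is one possible shape, not how `runFileInference` is implemented. `waitForTask`, `TaskEvents`, and `TaskFinished` are hypothetical names, and it assumes `taskId` and `error` are strings, which the README does not state. Rejecting on `failure` / `cancelled` is a design choice of this sketch.

```ts
// Payload fields and status values follow the Events table; the string
// types for `taskId` and `error` are assumptions.
interface TaskFinished {
  taskId: string;
  status: "success" | "failure" | "cancelled";
  resultKey?: string;
  downloadUrl?: string;
  error?: string;
}

// Minimal structural view of the client: only `on` is assumed to exist.
interface TaskEvents {
  on(event: "taskFinished", handler: (e: TaskFinished) => void): unknown;
}

function waitForTask(client: TaskEvents, taskId: string): Promise<TaskFinished> {
  return new Promise((resolve, reject) => {
    let settled = false;
    client.on("taskFinished", (e) => {
      // Ignore events for other tasks and any repeat after settling.
      if (settled || e.taskId !== taskId) return;
      settled = true;
      if (e.status === "success") resolve(e);
      else reject(new Error(e.error ?? `task ${e.status}`));
    });
  });
}
```

Usage would look like `const done = await waitForTask(client, client.startTask({ audioKey: key }))`, after `connect()` has resolved. The handler is never unregistered here because the README does not document an unsubscribe method.
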
## Text-to-speech

`TtsClient` is a standalone, GraphQL-only client: it does not open a WebSocket
or a peer connection. Synthesis is asynchronous (submit → poll), wrapped by a
one-call `synthesize()`:

```ts
import { TtsClient } from "@convbased/sdk";

const tts = new TtsClient({ apiKey: "sk_********************************" });

// `referenceAudio` is the voice to clone; pass a previously uploaded
// `referenceKey` instead to skip the upload.
const result = await tts.synthesize({
  referenceAudio: referenceFile, // a File/Blob
  text: "你好,这是一段合成语音。", // "Hello, this is a piece of synthesized speech."
  params: { temperature: 0.8 },
  onJob: (job) => console.log(job.status, "queue:", job.position),
});

const audio = document.querySelector<HTMLAudioElement>("#tts")!;
audio.src = result.url!; // presigned, valid ~1h
```

Lower-level pieces are available too: `tts.uploadReferenceAudio(file)`,
`tts.submit({ referenceKey, text, params })`, `tts.getJob(jobId)`,
`tts.cancel(jobId)`, and `tts.getPricing()`.

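The submit → poll pattern can be sketched as a generic polling loop. This is an illustration of the idea rather than what `synthesize()` does internally. `pollUntilDone` and `JobLike` are hypothetical, and because the README does not list the job status values, the caller supplies the `isTerminal` check instead of the sketch hard-coding any status names.

```ts
// Only the fields shown in the README example are assumed on a job.
interface JobLike {
  status: string;
  position?: number;
  url?: string;
}

interface PollOptions<J> {
  isTerminal: (job: J) => boolean; // caller decides which statuses end the job
  intervalMs?: number; // delay between polls (default 1000)
  timeoutMs?: number; // overall budget (default 60000)
  onJob?: (job: J) => void; // called with every polled snapshot
}

async function pollUntilDone<J extends JobLike>(
  getJob: () => Promise<J>,
  opts: PollOptions<J>,
): Promise<J> {
  const intervalMs = opts.intervalMs ?? 1000;
  const deadline = Date.now() + (opts.timeoutMs ?? 60_000);
  for (;;) {
    const job = await getJob();
    opts.onJob?.(job);
    if (opts.isTerminal(job)) return job;
    // Give up if the next poll would land past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error("TTS job polling timed out");
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the real client this might be driven as `pollUntilDone(() => tts.getJob(jobId), { isTerminal: (j) => j.status === "completed" })`, where `"completed"` is a placeholder for whatever terminal status the service actually reports.
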
> **Audio upload limits.** Both `client.uploadAudio` and
> `tts.uploadReferenceAudio` are validated by the service on **filename
> extension** (one of `mp3`, `wav`, `ogg`, `flac`, `m4a`, `aac`) and a max
> size of **100 MB**. When uploading a bare `Blob` (no filename), pass
> `{ filename: "source.wav" }` so the extension check passes.

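Since the service validates uploads by extension and size, a client-side pre-check can fail fast before any bytes are sent. The following is a sketch under stated assumptions: `checkUpload` is a hypothetical helper, "100 MB" is read as 100 MiB, and extensions are compared case-insensitively, which the service may or may not do.

```ts
// Extensions listed in the note above.
const ALLOWED_EXTENSIONS: readonly string[] = ["mp3", "wav", "ogg", "flac", "m4a", "aac"];

// Assumption: "100 MB" is interpreted as 100 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 100 * 1024 * 1024;

// Returns null when the upload looks acceptable, otherwise a reason string.
function checkUpload(filename: string, sizeBytes: number): string | null {
  const dot = filename.lastIndexOf(".");
  // Lowercasing is an assumption; the server-side check may be stricter.
  const ext = dot >= 0 ? filename.slice(dot + 1).toLowerCase() : "";
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    return `unsupported extension "${ext}"`;
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return "file exceeds 100 MB";
  }
  return null;
}

console.log(checkUpload("source.wav", 2048)); // → null
console.log(checkUpload("recording", 2048)); // → unsupported extension ""
```

For a bare `Blob`, run the check against the name you intend to pass in `{ filename }`, since that is the name the service will see.
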
## Events

| Event          | Payload                                                | Notes |
| -------------- | ------------------------------------------------------ | ----- |
| `state`        | `{ state, previous }`                                  | `idle → signaling → negotiating → connecting → connected → closed` |
| `message`      | `{ code?, message?, raw }`                             | Every JSON frame the server sends. |
| `ready`        | `{ code: 3009, message? }`                             | Inference node has loaded the model and converted audio is flowing. |
| `track`        | `{ stream, track }`                                    | The converted audio track to wire into an `<audio>` element. |
| `taskAck`      | `{ taskId, status, queuePosition?, code? }`            | File-inference task accepted (`queued` / `started`). |
| `taskProgress` | `{ taskId, progress, code? }`                          | File-inference progress, `progress` in `[0, 1]`. |
| `taskFinished` | `{ taskId, status, resultKey?, downloadUrl?, error? }` | File-inference terminal state (`success` / `failure` / `cancelled`). |
| `error`        | `Error`                                                | Connection or server-reported failure. |
| `closed`       | `{ code?, reason? }`                                   | The signaling socket has gone away. |

## API surface

**`ConvbasedClient`: live conversion + file inference**

- `new ConvbasedClient(options)`
- `client.connect({ modelId, audio?, preferences?, sampleRate? })`
- `client.updateConfig(preferences)`
- `client.setMuted(muted)`
- `client.replaceLocalStream(stream)`
- `client.getStats()`: `{ rttMs, jitter, packetsLost }`
- `client.getConvertedStream()` / `client.getPeerConnection()`
- `client.uploadAudio(file, opts?)`: `{ key }`
- `client.runFileInference({ audio? | audioKey?, preferences?, ... })`: `Promise<TaskFinishedEvent>`
- `client.startTask({ audioKey, preferences?, ... })` / `client.stopTask(taskId?)`
- `client.disconnect()`

**`TtsClient`: text-to-speech (GraphQL only)**

- `new TtsClient(options)`
- `tts.synthesize({ referenceAudio? | referenceKey?, text, params?, ... })`: `Promise<TtsResult>`
- `tts.uploadReferenceAudio(file, opts?)`: `{ key }`
- `tts.submit({ referenceKey, text, params? })` / `tts.getJob(jobId)` / `tts.cancel(jobId)`
- `tts.getPricing()`: `{ pricePerToken, minCharge }`

## Notes

- The SDK is **browser-only**. WebRTC support in pure Node requires `wrtc` or
  `aiortc`, which are out of scope here.
- One client = one session. Create a new instance after `disconnect()`.
- The server rejects upfront if the wallet is empty, so a thrown error during
  `connect()` may indicate insufficient balance; check `error.message`.