@convbased/sdk 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (82)
  1. package/LICENSE +21 -0
  2. package/README.md +235 -0
  3. package/dist/cjs/client.js +635 -0
  4. package/dist/cjs/client.js.map +1 -0
  5. package/dist/cjs/endpoints.js +10 -0
  6. package/dist/cjs/endpoints.js.map +1 -0
  7. package/dist/cjs/events.js +39 -0
  8. package/dist/cjs/events.js.map +1 -0
  9. package/dist/cjs/graphql.js +40 -0
  10. package/dist/cjs/graphql.js.map +1 -0
  11. package/dist/cjs/index.js +24 -0
  12. package/dist/cjs/index.js.map +1 -0
  13. package/dist/cjs/package.json +3 -0
  14. package/dist/cjs/rtcServers.js +35 -0
  15. package/dist/cjs/rtcServers.js.map +1 -0
  16. package/dist/cjs/sdp.js +37 -0
  17. package/dist/cjs/sdp.js.map +1 -0
  18. package/dist/cjs/signaling.js +146 -0
  19. package/dist/cjs/signaling.js.map +1 -0
  20. package/dist/cjs/tts.js +227 -0
  21. package/dist/cjs/tts.js.map +1 -0
  22. package/dist/cjs/types.js +26 -0
  23. package/dist/cjs/types.js.map +1 -0
  24. package/dist/cjs/upload.js +87 -0
  25. package/dist/cjs/upload.js.map +1 -0
  26. package/dist/client.d.ts +169 -0
  27. package/dist/client.d.ts.map +1 -0
  28. package/dist/client.js +631 -0
  29. package/dist/client.js.map +1 -0
  30. package/dist/convbased-sdk.global.js +1291 -0
  31. package/dist/endpoints.d.ts +3 -0
  32. package/dist/endpoints.d.ts.map +1 -0
  33. package/dist/endpoints.js +7 -0
  34. package/dist/endpoints.js.map +1 -0
  35. package/dist/events.d.ts +9 -0
  36. package/dist/events.d.ts.map +1 -0
  37. package/dist/events.js +35 -0
  38. package/dist/events.js.map +1 -0
  39. package/dist/graphql.d.ts +18 -0
  40. package/dist/graphql.d.ts.map +1 -0
  41. package/dist/graphql.js +37 -0
  42. package/dist/graphql.js.map +1 -0
  43. package/dist/index.d.ts +9 -0
  44. package/dist/index.d.ts.map +1 -0
  45. package/dist/index.js +9 -0
  46. package/dist/index.js.map +1 -0
  47. package/dist/rtcServers.d.ts +13 -0
  48. package/dist/rtcServers.d.ts.map +1 -0
  49. package/dist/rtcServers.js +31 -0
  50. package/dist/rtcServers.js.map +1 -0
  51. package/dist/sdp.d.ts +6 -0
  52. package/dist/sdp.d.ts.map +1 -0
  53. package/dist/sdp.js +34 -0
  54. package/dist/sdp.js.map +1 -0
  55. package/dist/signaling.d.ts +33 -0
  56. package/dist/signaling.d.ts.map +1 -0
  57. package/dist/signaling.js +142 -0
  58. package/dist/signaling.js.map +1 -0
  59. package/dist/tts.d.ts +111 -0
  60. package/dist/tts.d.ts.map +1 -0
  61. package/dist/tts.js +223 -0
  62. package/dist/tts.js.map +1 -0
  63. package/dist/types.d.ts +194 -0
  64. package/dist/types.d.ts.map +1 -0
  65. package/dist/types.js +23 -0
  66. package/dist/types.js.map +1 -0
  67. package/dist/upload.d.ts +46 -0
  68. package/dist/upload.d.ts.map +1 -0
  69. package/dist/upload.js +82 -0
  70. package/dist/upload.js.map +1 -0
  71. package/package.json +57 -0
  72. package/src/client.ts +839 -0
  73. package/src/endpoints.ts +8 -0
  74. package/src/events.ts +38 -0
  75. package/src/graphql.ts +58 -0
  76. package/src/index.ts +50 -0
  77. package/src/rtcServers.ts +38 -0
  78. package/src/sdp.ts +45 -0
  79. package/src/signaling.ts +172 -0
  80. package/src/tts.ts +364 -0
  81. package/src/types.ts +201 -0
  82. package/src/upload.ts +132 -0
package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Convbased
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,235 @@
+ # @convbased/sdk
+
+ Browser SDK for the Convbased voice services. Authenticate with an API key and:
+
+ - **Real-time voice conversion** — capture the microphone and stream the
+   converted audio back over WebRTC (`ConvbasedClient`).
+ - **File inference (voice-to-voice)** — upload an audio file and convert it as
+   a discrete task within a live session (`ConvbasedClient.runFileInference`).
+ - **Text-to-speech** — synthesize speech from text with a reference voice over
+   GraphQL (`TtsClient`).
+
+ The SDK speaks the same signaling protocol as the official Convbased Web
+ console (`/signaling/ws` on `ServerAPI/signaling`) and is the easiest way to
+ embed Convbased voice features in third-party web apps.
+
+ ## Demo
+
+ Two ready-made examples ship in `examples/`:
+
+ - **`examples/h5/`** — single-file HTML demo that loads the SDK from
+   the official CDN:
+   ```html
+   <script src="https://cdn.weights.chat/sdk/convbased-sdk.global.js"></script>
+   ```
+ - **`examples/vanilla-ts/main.ts`** — a framework-free TypeScript snippet you
+   can drop into a Vite / Vue / React project to wire the SDK to an existing
+   `<audio>` tag. See `tts.ts` and `file-inference.ts` alongside it for the
+   text-to-speech and voice-to-voice flows.
+
+ ## Install
+
+ ```bash
+ bun add @convbased/sdk
+ # or
+ npm install @convbased/sdk
+ ```
+
+ ## Quick start
+
+ ```ts
+ import { ConvbasedClient } from "@convbased/sdk";
+
+ const client = new ConvbasedClient({
+   apiKey: "sk_********************************",
+   // Signaling + GraphQL endpoints default to the production Convbased
+   // service (wss://api.weights.chat/api/signaling/ws), so an API key is
+   // the only required field. Override `signalingUrl` / `graphqlUrl` for
+   // self-hosted deployments.
+ });
+
+ client.on("track", ({ stream }) => {
+   // `stream` is your own voice after conversion — play it to hear the result.
+   const audio = document.querySelector<HTMLAudioElement>("#converted")!;
+   audio.srcObject = stream;
+   void audio.play();
+ });
+
+ client.on("state", ({ state }) => console.log("state:", state));
+ client.on("error", (err) => console.error(err));
+
+ await client.connect({
+   modelId: "model_xxx",
+   preferences: {
+     pitch: 0,
+     rms_mix_rate: 0.25,
+     f0_autotune: false,
+   },
+ });
+
+ // Adjust pitch live:
+ client.updateConfig({ pitch: 2 });
+
+ // End the session:
+ await client.disconnect();
+ ```
+
+ ## Endpoints
+
+ The production endpoints are baked into the SDK and exposed as constants:
+
+ ```ts
+ import {
+   DEFAULT_SIGNALING_URL, // wss://api.weights.chat/api/signaling/ws
+   DEFAULT_GRAPHQL_URL, // https://api.weights.chat/api/v1/graphql
+ } from "@convbased/sdk";
+ ```
+
+ - For the official Convbased service, **don't pass `signalingUrl` /
+   `graphqlUrl`** — the defaults are correct.
+ - For a self-hosted deployment, override either or both. `signalingUrl`
+   accepts a full final URL (ending in `/ws`), the bare `…/signaling` path,
+   or just a host (`ws://localhost:3010` → SDK appends `/signaling/ws`).
+ - To disable the TURN auto-fetch entirely, pass `graphqlUrl: false`. The SDK
+   will then use `iceServers` if provided, or public Google STUN as a last
+   resort.
+
+ ## Authentication
+
+ Pass either:
+
+ - `apiKey` — long-lived key created in the Convbased dashboard. Sent as
+   `?api_key=…` on the WebSocket and `x-api-key` on the GraphQL fetch.
+ - `accessToken` — short-lived JWT obtained via `/auth/login`. Sent as `?token=…`
+   and `Authorization: Bearer …`.
+
+ The signaling server rejects connections with code `1008` if the credential is
+ invalid or the account has zero balance.
+
+ ## Reusing an existing MediaStream
+
+ Pass a captured stream to skip `getUserMedia` (useful with custom effect
+ pipelines like noise suppression or pitch-shift workers):
+
+ ```ts
+ const raw = await navigator.mediaDevices.getUserMedia({ audio: true });
+ const processed = await applyEffects(raw);
+ await client.connect({ modelId, audio: processed });
+ ```
+
+ ## File inference (voice-to-voice)
+
+ Convert a whole audio file through the model instead of the live mic. This runs
+ as a discrete task **inside an existing live session**, so you must `connect()`
+ first (that's what provisions the inference node). The converted result is
+ returned as a presigned download URL, not as a live `track`.
+
+ ```ts
+ await client.connect({ modelId: "model_xxx" });
+
+ // One-call helper: upload, start the task, wait for completion.
+ const result = await client.runFileInference({
+   audio: fileInput.files[0], // a File/Blob, or pass `audioKey` if already uploaded
+   preferences: { pitch: 2, f0_method: "rmvpe" },
+   onProgress: ({ progress }) => console.log(`progress: ${progress}`),
+ });
+ console.log("converted audio:", result.downloadUrl);
+ ```
+
+ Prefer the lower-level API when you want to drive the task lifecycle yourself:
+
+ ```ts
+ const { key } = await client.uploadAudio(file);
+ const taskId = client.startTask({ audioKey: key, preferences: { pitch: 2 } });
+
+ client.on("taskAck", ({ status, queuePosition }) => { /* queued | started */ });
+ client.on("taskProgress", ({ progress }) => { /* 0..1 */ });
+ client.on("taskFinished", ({ status, downloadUrl, error }) => { /* … */ });
+
+ client.stopTask(taskId); // cancel
+ ```
+
+ > **Must be connected first.** The inference node enforces that a task only runs
+ > once the live session is `SERVICE_READY` (offer → answer → model loaded).
+ > Calling `startTask` / `runFileInference` before `connect()` resolves throws;
+ > sending a task to a not-yet-ready node fails it with `"service not ready"`.
+
+ ## Text-to-speech
+
+ `TtsClient` is a standalone, GraphQL-only client — it does not open a WebSocket
+ or a peer connection. Synthesis is asynchronous (submit → poll), wrapped by a
+ one-call `synthesize()`:
+
+ ```ts
+ import { TtsClient } from "@convbased/sdk";
+
+ const tts = new TtsClient({ apiKey: "sk_********************************" });
+
+ // `referenceAudio` is the voice to clone; pass a previously uploaded
+ // `referenceKey` instead to skip the upload.
+ const result = await tts.synthesize({
+   referenceAudio: referenceFile, // a File/Blob
+   text: "你好,这是一段合成语音。",
+   params: { temperature: 0.8 },
+   onJob: (job) => console.log(job.status, "queue:", job.position),
+ });
+
+ const audio = document.querySelector<HTMLAudioElement>("#tts")!;
+ audio.src = result.url!; // presigned, valid ~1h
+ ```
+
+ Lower-level pieces are available too: `tts.uploadReferenceAudio(file)`,
+ `tts.submit({ referenceKey, text, params })`, `tts.getJob(jobId)`,
+ `tts.cancel(jobId)`, and `tts.getPricing()`.
+
+ > **Audio upload limits.** Both `client.uploadAudio` and
+ > `tts.uploadReferenceAudio` are validated by the service on **filename
+ > extension** — one of `mp3`, `wav`, `ogg`, `flac`, `m4a`, `aac` — and a max
+ > size of **100 MB**. When uploading a bare `Blob` (no filename), pass
+ > `{ filename: "source.wav" }` so the extension check passes.
+
+ ## Events
+
+ | Event | Payload | Notes |
+ | -------------- | -------------------------------------------------- | --------------------------------------------------------------------- |
+ | `state` | `{ state, previous }` | `idle → signaling → negotiating → connecting → connected → closed` |
+ | `message` | `{ code?, message?, raw }` | Every JSON frame the server sends. |
+ | `ready` | `{ code: 3009, message? }` | Inference node has loaded the model and converted audio is flowing. |
+ | `track` | `{ stream, track }` | The converted audio track to wire into an `<audio>` element. |
+ | `taskAck` | `{ taskId, status, queuePosition?, code? }` | File-inference task accepted (`queued` / `started`). |
+ | `taskProgress` | `{ taskId, progress, code? }` | File-inference progress, `progress` in `[0, 1]`. |
+ | `taskFinished` | `{ taskId, status, resultKey?, downloadUrl?, error? }` | File-inference terminal state (`success` / `failure` / `cancelled`). |
+ | `error` | `Error` | Connection or server-reported failure. |
+ | `closed` | `{ code?, reason? }` | The signaling socket has gone away. |
+
+ ## API surface
+
+ **`ConvbasedClient` — live conversion + file inference**
+
+ - `new ConvbasedClient(options)`
+ - `client.connect({ modelId, audio?, preferences?, sampleRate? })`
+ - `client.updateConfig(preferences)`
+ - `client.setMuted(muted)`
+ - `client.replaceLocalStream(stream)`
+ - `client.getStats()` — `{ rttMs, jitter, packetsLost }`
+ - `client.getConvertedStream()` / `client.getPeerConnection()`
+ - `client.uploadAudio(file, opts?)` — `{ key }`
+ - `client.runFileInference({ audio? | audioKey?, preferences?, ... })` — `Promise<TaskFinishedEvent>`
+ - `client.startTask({ audioKey, preferences?, ... })` / `client.stopTask(taskId?)`
+ - `client.disconnect()`
+
+ **`TtsClient` — text-to-speech (GraphQL only)**
+
+ - `new TtsClient(options)`
+ - `tts.synthesize({ referenceAudio? | referenceKey?, text, params?, ... })` — `Promise<TtsResult>`
+ - `tts.uploadReferenceAudio(file, opts?)` — `{ key }`
+ - `tts.submit({ referenceKey, text, params? })` / `tts.getJob(jobId)` / `tts.cancel(jobId)`
+ - `tts.getPricing()` — `{ pricePerToken, minCharge }`
+
+ ## Notes
+
+ - The SDK is **browser-only**. WebRTC support in pure Node requires `wrtc` or
+   `aiortc`, which are out of scope here.
+ - One client = one session. Create a new instance after `disconnect()`.
+ - The server rejects upfront if the wallet is empty, so a thrown error during
+   `connect()` may indicate insufficient balance — check `error.message`.
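
The Endpoints section of the README above says `signalingUrl` may be a full final URL ending in `/ws`, a bare `…/signaling` path, or just a host to which `/signaling/ws` is appended. The SDK's own normalization code is not shown in this part of the diff, so the following is only a sketch of how those three cases could be told apart. `normalizeSignalingUrl` is a hypothetical name, not an SDK export, and the real implementation may handle edge cases differently.

```typescript
// Hypothetical helper illustrating the three accepted `signalingUrl` shapes
// described in the README. This is an assumption-based sketch, not SDK code.
const SIGNALING_SEGMENT = "/signaling";
const WS_SEGMENT = "/ws";

function normalizeSignalingUrl(input: string): string {
  // The WHATWG URL parser handles ws:// and wss:// as special schemes.
  const url = new URL(input);

  // Drop trailing slashes so "ws://host" and "ws://host/" behave the same.
  const path = url.pathname.replace(/\/+$/, "");

  if (path.endsWith(WS_SEGMENT)) {
    // Case 1: already a full final URL, keep it as given.
    url.pathname = path;
  } else if (path.endsWith(SIGNALING_SEGMENT)) {
    // Case 2: bare …/signaling path, only "/ws" is missing.
    url.pathname = path + WS_SEGMENT;
  } else {
    // Case 3: host only (or some other prefix), append "/signaling/ws".
    url.pathname = path + SIGNALING_SEGMENT + WS_SEGMENT;
  }
  return url.toString();
}

// Usage with the examples from the README:
console.log(normalizeSignalingUrl("ws://localhost:3010"));
console.log(normalizeSignalingUrl("wss://api.weights.chat/api/signaling"));
console.log(normalizeSignalingUrl("wss://api.weights.chat/api/signaling/ws"));
```

Under these assumptions the first call yields `ws://localhost:3010/signaling/ws`, and the last two both yield the production default `wss://api.weights.chat/api/signaling/ws`. Matching on the path suffix keeps any query string intact, which matters if credentials such as `?api_key=…` are attached before normalization.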