@data-netmonk/mona-chat-widget 2.6.0 → 2.6.2
- package/README.md +95 -14
- package/dist/index.cjs +92 -94
- package/dist/index.js +16071 -14741
- package/dist/style.css +1 -1
- package/package.json +1 -1
- package/dist/index.d.ts +0 -30
- package/dist/netmonk-logo.ico +0 -0
- package/dist/phoneme-mona/A.jpg.png +0 -0
- package/dist/phoneme-mona/BP.jpg.png +0 -0
- package/dist/phoneme-mona/ChJ.png +0 -0
- package/dist/phoneme-mona/E.jpg.png +0 -0
- package/dist/phoneme-mona/FV.png +0 -0
- package/dist/phoneme-mona/I.jpg.png +0 -0
- package/dist/phoneme-mona/KG.png +0 -0
- package/dist/phoneme-mona/L.jpg.png +0 -0
- package/dist/phoneme-mona/M.jpg.png +0 -0
- package/dist/phoneme-mona/O.jpg.png +0 -0
- package/dist/phoneme-mona/SZ.png +0 -0
- package/dist/phoneme-mona/U.jpg.png +0 -0
- package/dist/vite.svg +0 -1
package/README.md
CHANGED
|
@@ -8,7 +8,7 @@ Chat widget package developed by Netmonk data & solution team to be imported in
|
|
|
8
8
|
|
|
9
9
|
---
|
|
10
10
|
|
|
11
|
-
**Latest Version Changes (`v2.6.
|
|
11
|
+
**Latest Version Changes (`v2.6.1`):**
|
|
12
12
|
|
|
13
13
|
✅ **Non-breaking Changes:**
|
|
14
14
|
1. **Built-in voice mode button** - Chat widget now includes a microphone button to toggle voice mode directly from the input area
|
|
@@ -104,12 +104,35 @@ Chat widget package developed by Netmonk data & solution team to be imported in
|
|
|
104
104
|
|
|
105
105
|
To enable voice mode and phoneme-based avatar animation, set these environment variables in your `.env` file:
|
|
106
106
|
```
|
|
107
|
-
VITE_STT_ENDPOINT=https://
|
|
107
|
+
VITE_STT_ENDPOINT=https://voice.netmonk-ai.tech/stt
|
|
108
|
+
VITE_STT_API_KEY=
|
|
109
|
+
VITE_STT_API_KEY_HEADER=x-api-key
|
|
110
|
+
VITE_STT_API_KEY_PREFIX=
|
|
108
111
|
VITE_TTS_ENDPOINT=https://your-tts-service.example.com/synthesize
|
|
112
|
+
VITE_TTS_API_KEY=
|
|
113
|
+
VITE_TTS_API_KEY_HEADER=Authorization
|
|
114
|
+
VITE_TTS_API_KEY_PREFIX=Bearer
|
|
115
|
+
VITE_PREFER_WEBHOOK_TTS=true
|
|
116
|
+
VITE_TTS_MINIO_OBJECT_ENDPOINT=http://localhost:8000/minio/object
|
|
117
|
+
VITE_TTS_MINIO_API_KEY=
|
|
118
|
+
VITE_TTS_MINIO_API_KEY_HEADER=X-API-Key
|
|
119
|
+
VITE_TTS_MINIO_API_KEY_PREFIX=
|
|
120
|
+
VITE_TTS_MINIO_BUCKET=chatbot-tts
|
|
121
|
+
VITE_TTS_MINIO_DOWNLOAD=false
|
|
109
122
|
```
|
|
110
123
|
|
|
111
124
|
`VITE_STT_ENDPOINT` is used by the built-in mic button to transcribe recorded audio.
|
|
112
|
-
|
|
125
|
+
For the Netmonk STT service, set `VITE_STT_API_KEY_HEADER=x-api-key` and leave `VITE_STT_API_KEY_PREFIX` empty so the widget sends the raw key value without `Bearer`.
|
|
126
|
+
`VITE_PREFER_WEBHOOK_TTS=true` makes the widget prioritize TTS assets from the webhook response (for example `tts_assets.items[].audio_object_key`) and use `VITE_TTS_ENDPOINT` only as a fallback.
|
|
127
|
+
The widget no longer connects to MinIO directly. Objects are fetched through the voice-engine endpoint:
|
|
128
|
+
`GET /minio/object?object_key=<key>&bucket=<bucket>&download=<true|false>`.
|
|
129
|
+
`VITE_TTS_MINIO_API_KEY` is optional. When set, the widget sends an `X-API-Key` header (or a custom header via `VITE_TTS_MINIO_API_KEY_HEADER`).
|
|
130
|
+
`VITE_TTS_MINIO_BUCKET` is used as the default bucket when the webhook payload does not include one.
|
|
131
|
+
Set `VITE_TTS_MINIO_DOWNLOAD=true` to force download mode when fetching objects.
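As a rough illustration of the `GET /minio/object` query format described above, the sketch below assembles the request URL from an object key, an optional bucket, and the download flag. The `buildMinioObjectUrl` helper and the `MinioObjectRequest` type are hypothetical names used only for this example and are not part of the package API.

```typescript
// Hypothetical helper (not part of the widget API): builds the voice-engine
// object URL from the query parameters documented above.
interface MinioObjectRequest {
  endpoint: string;      // e.g. the value of VITE_TTS_MINIO_OBJECT_ENDPOINT
  objectKey: string;     // e.g. tts_assets.items[].audio_object_key
  bucket?: string;       // bucket sent in the webhook payload, if any
  defaultBucket: string; // e.g. the value of VITE_TTS_MINIO_BUCKET
  download?: boolean;    // e.g. the value of VITE_TTS_MINIO_DOWNLOAD
}

function buildMinioObjectUrl(req: MinioObjectRequest): string {
  const params = new URLSearchParams({
    object_key: req.objectKey,
    // Fall back to the configured default when the payload has no bucket.
    bucket: req.bucket ?? req.defaultBucket,
    download: String(req.download ?? false),
  });
  return `${req.endpoint}?${params.toString()}`;
}
```

Using `URLSearchParams` keeps object keys that contain `/` or other reserved characters correctly percent-encoded.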
|
|
132
|
+
`VITE_TTS_ENDPOINT` is used for fallback TTS playback.
|
|
133
|
+
The widget expects the TTS response body to contain raw audio binary such as `audio/wav`, and reads lip-sync metadata from `x-tts-visemes-b64`, `x-tts-phonemes-b64`, and `x-tts-phoneme-timeline-b64` response headers.
|
|
134
|
+
If your TTS endpoint is called cross-origin, the server must expose those headers with `Access-Control-Expose-Headers`.
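On the server side, exposing the metadata headers could look roughly like the sketch below, which only computes the CORS response headers a TTS service would attach. The `corsHeadersForTts` function is a hypothetical name, and how the headers are attached depends on your server framework.

```typescript
// The three metadata headers the widget reads from the TTS response.
const TTS_METADATA_HEADERS = [
  "x-tts-visemes-b64",
  "x-tts-phonemes-b64",
  "x-tts-phoneme-timeline-b64",
] as const;

// Hypothetical helper: returns the CORS headers a TTS server would add so
// that browser code on `allowedOrigin` can read the metadata headers.
function corsHeadersForTts(allowedOrigin: string): Record<string, string> {
  return {
    "Access-Control-Allow-Origin": allowedOrigin,
    "Access-Control-Expose-Headers": TTS_METADATA_HEADERS.join(", "),
  };
}
```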
|
|
135
|
+
If your TTS service uses a different header such as `x-api-key`, set `VITE_TTS_API_KEY_HEADER=x-api-key` and leave the prefix empty.
|
|
113
136
|
5. Optional TTS debug logging
|
|
114
137
|
|
|
115
138
|
To inspect TTS queue and playback lifecycle in browser console, set `VITE_DEBUG_TTS=true` in your `.env` file.
|
|
@@ -307,7 +330,22 @@ For responses with buttons:
|
|
|
307
330
|
|
|
308
331
|
The widget now includes a built-in microphone button in the input area. Clicking the button toggles voice mode, requests microphone access, records speech, sends the recorded audio to `VITE_STT_ENDPOINT`, and forwards the returned transcription as a regular user message.
|
|
309
332
|
|
|
310
|
-
|
|
333
|
+
If `VITE_STT_API_KEY` is set, the STT request also includes the configured auth header. For `https://voice.netmonk-ai.tech/stt`, use `x-api-key` without any prefix. Other providers can still use `Authorization: Bearer <key>` or any custom header via `.env`.
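To make the key, header, and prefix combination concrete, here is a minimal sketch of how the three settings might be turned into a request header. `buildAuthHeaders` is a hypothetical helper, not the widget's actual implementation, and joining the prefix and key with a single space is an assumption.

```typescript
// Hypothetical helper: combines the *_API_KEY, *_API_KEY_HEADER and
// *_API_KEY_PREFIX settings into a headers object.
function buildAuthHeaders(
  apiKey: string | undefined,
  headerName: string,
  prefix: string | undefined,
): Record<string, string> {
  const key = apiKey?.trim();
  if (!key) return {}; // no key configured: send no auth header
  const trimmedPrefix = prefix?.trim();
  // Assumption: a non-empty prefix is joined to the key with one space.
  return { [headerName]: trimmedPrefix ? `${trimmedPrefix} ${key}` : key };
}
```

With an empty prefix the raw key is sent, which matches the `x-api-key` setup described for the Netmonk STT service; with `Authorization` and `Bearer` the result is the usual `Authorization: Bearer <key>` header.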
|
|
334
|
+
|
|
335
|
+
**Expected STT request/response shape:**
|
|
336
|
+
|
|
337
|
+
- Request body: `multipart/form-data`
|
|
338
|
+
- Audio field name: `audio`
|
|
339
|
+
- Request headers: `Accept: application/json`, plus `x-api-key: <key>` when configured for the Netmonk STT service
|
|
340
|
+
- Expected JSON response:
|
|
341
|
+
|
|
342
|
+
```json
|
|
343
|
+
{
|
|
344
|
+
"text": "Halo, saya mau tanya status tiket saya"
|
|
345
|
+
}
|
|
346
|
+
```
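The request and response shape above could be exercised with a sketch like the following. The function names, the `recording.webm` file name, and the error handling are assumptions made for illustration; only the `audio` field name, the `Accept` header, and the `text` response field come from this README.

```typescript
// Builds the multipart/form-data body with the documented `audio` field.
function buildSttForm(recording: Blob): FormData {
  const form = new FormData();
  // The file name is an assumption; the README only fixes the field name.
  form.append("audio", recording, "recording.webm");
  return form;
}

// Validates the documented `{ "text": "..." }` response shape.
function parseSttResponse(payload: unknown): string {
  if (typeof payload === "object" && payload !== null) {
    const text = (payload as { text?: unknown }).text;
    if (typeof text === "string") return text;
  }
  throw new Error("STT response is missing a string `text` field");
}

// Hypothetical end-to-end call; `authHeaders` would carry e.g. `x-api-key`.
async function transcribe(
  endpoint: string,
  recording: Blob,
  authHeaders: Record<string, string> = {},
): Promise<string> {
  const response = await fetch(endpoint, {
    method: "POST",
    // Do not set Content-Type manually: fetch adds the multipart boundary.
    headers: { Accept: "application/json", ...authHeaders },
    body: buildSttForm(recording),
  });
  if (!response.ok) {
    throw new Error(`STT request failed with status ${response.status}`);
  }
  return parseSttResponse(await response.json());
}
```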
|
|
347
|
+
|
|
348
|
+
During TTS playback, the header avatar can switch between phoneme images using the metadata returned by the TTS service. The widget currently supports these phoneme IDs:
|
|
311
349
|
|
|
312
350
|
- `A`
|
|
313
351
|
- `BP`
|
|
@@ -324,22 +362,65 @@ During TTS playback, the header avatar can switch between phoneme images using t
|
|
|
324
362
|
|
|
325
363
|
Phoneme IDs are normalized case-insensitively in the widget, so values such as `ChJ` and `CHJ` resolve to the same avatar image.
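Case-insensitive normalization of this kind can be sketched as a lookup keyed by the upper-cased ID. The ID list below is an assumption taken from the image file names shipped under `dist/phoneme-mona/`, and `normalizePhonemeId` is a hypothetical name rather than the widget's own function.

```typescript
// Assumed canonical IDs, taken from the phoneme image file names.
const PHONEME_IDS = [
  "A", "BP", "ChJ", "E", "FV", "I", "KG", "L", "M", "O", "SZ", "U",
] as const;
type PhonemeId = (typeof PHONEME_IDS)[number];

// Map from upper-cased ID to its canonical spelling.
const PHONEME_LOOKUP = new Map<string, PhonemeId>(
  PHONEME_IDS.map((id): [string, PhonemeId] => [id.toUpperCase(), id]),
);

// Returns the canonical ID, or undefined for an unknown phoneme.
function normalizePhonemeId(raw: string): PhonemeId | undefined {
  return PHONEME_LOOKUP.get(raw.trim().toUpperCase());
}
```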
|
|
326
364
|
|
|
327
|
-
**
|
|
365
|
+
**Preferred webhook TTS asset shape (MinIO-first):**
|
|
328
366
|
|
|
329
367
|
```json
|
|
330
368
|
{
|
|
331
|
-
"
|
|
332
|
-
"
|
|
333
|
-
|
|
334
|
-
|
|
335
|
-
|
|
336
|
-
|
|
337
|
-
|
|
338
|
-
|
|
369
|
+
"messages": [{ "type": "text", "text": "Halo, ada yang bisa saya bantu?" }],
|
|
370
|
+
"tts_assets": {
|
|
371
|
+
"items": [
|
|
372
|
+
{
|
|
373
|
+
"bucket": "chatbot-tts",
|
|
374
|
+
"response_index": 1,
|
|
375
|
+
"audio_object_key": "tts/...wav",
|
|
376
|
+
"phoneme_object_key": "tts/...phonemes.txt",
|
|
377
|
+
"phoneme_timeline_object_key": "tts/...phoneme-timeline.json",
|
|
378
|
+
"viseme_timeline_object_key": "tts/...viseme-timeline.json"
|
|
379
|
+
}
|
|
380
|
+
]
|
|
381
|
+
}
|
|
339
382
|
}
|
|
340
383
|
```
|
|
341
384
|
|
|
342
|
-
|
|
385
|
+
When this payload is available, the widget resolves MinIO object keys first for audio and timeline metadata. If no usable asset is found, it falls back to `VITE_TTS_ENDPOINT`.
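The MinIO-first decision can be pictured with the sketch below. It is a simplified assumption of the selection logic: it takes the first item with a non-empty `audio_object_key` and does not show any matching on `response_index`. The types and the `pickTtsSource` name are hypothetical.

```typescript
// Shape of the webhook payload fields used for TTS, per the example above.
interface TtsAssetItem {
  bucket?: string;
  response_index?: number;
  audio_object_key?: string;
  phoneme_object_key?: string;
  phoneme_timeline_object_key?: string;
  viseme_timeline_object_key?: string;
}
interface WebhookPayload {
  tts_assets?: { items?: TtsAssetItem[] };
}

type TtsSource =
  | { kind: "minio"; audioObjectKey: string; bucket: string }
  | { kind: "endpoint" }; // fall back to VITE_TTS_ENDPOINT

function pickTtsSource(
  payload: WebhookPayload,
  preferWebhook: boolean, // e.g. VITE_PREFER_WEBHOOK_TTS
  defaultBucket: string,  // e.g. VITE_TTS_MINIO_BUCKET
): TtsSource {
  if (preferWebhook) {
    for (const item of payload.tts_assets?.items ?? []) {
      const key = item.audio_object_key?.trim();
      if (key) {
        return { kind: "minio", audioObjectKey: key, bucket: item.bucket ?? defaultBucket };
      }
    }
  }
  return { kind: "endpoint" };
}
```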
|
|
386
|
+
|
|
387
|
+
**Expected TTS response shape:**
|
|
388
|
+
|
|
389
|
+
```text
|
|
390
|
+
body: <raw WAV or other audio binary>
|
|
391
|
+
content-type: audio/wav
|
|
392
|
+
x-tts-visemes-b64: W3siaWQiOiJNIiwic3RhcnRNcyI6MCwiZW5kTXMiOjEyMH1d
|
|
393
|
+
x-tts-phonemes-b64: TSBBCg==
|
|
394
|
+
x-tts-phoneme-timeline-b64: W3sicGhvbmVtZSI6Ik0iLCJzdGFydE1zIjowLCJlbmRNcyI6MTIwfV0=
|
|
395
|
+
```
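Reading the base64 metadata headers from such a response might look like the sketch below. The field shapes follow the decoded examples in this README; splitting the legacy phoneme string on whitespace and the `readTtsMetadata` name are assumptions.

```typescript
interface TimedPhoneme { phoneme: string; startMs: number; endMs: number }
interface TimedViseme { id: string; startMs: number; endMs: number }

// Decodes a base64 string into UTF-8 text.
function decodeBase64Utf8(value: string): string {
  const binary = atob(value);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return new TextDecoder().decode(bytes);
}

// Accepts anything with a Headers-like `get`, e.g. `response.headers`.
function readTtsMetadata(headers: { get(name: string): string | null }) {
  const visemes = headers.get("x-tts-visemes-b64");
  const phonemes = headers.get("x-tts-phonemes-b64");
  const timeline = headers.get("x-tts-phoneme-timeline-b64");
  return {
    visemes: visemes ? (JSON.parse(decodeBase64Utf8(visemes)) as TimedViseme[]) : [],
    // Assumption: the legacy string is a whitespace-separated phoneme list.
    phonemes: phonemes ? decodeBase64Utf8(phonemes).trim().split(/\s+/) : [],
    phonemeTimeline: timeline
      ? (JSON.parse(decodeBase64Utf8(timeline)) as TimedPhoneme[])
      : [],
  };
}
```

In a browser this would be called as `readTtsMetadata(response.headers)` after the `fetch` to the TTS endpoint resolves.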
|
|
396
|
+
|
|
397
|
+
`x-tts-phonemes-b64` is the legacy phoneme string, while `x-tts-phoneme-timeline-b64` is the timed phoneme timeline. The base64 headers should decode as:
|
|
398
|
+
|
|
399
|
+
Decoded `x-tts-phonemes-b64`:
|
|
400
|
+
|
|
401
|
+
```text
|
|
402
|
+
M A
|
|
403
|
+
```
|
|
404
|
+
|
|
405
|
+
Decoded `x-tts-phoneme-timeline-b64`:
|
|
406
|
+
|
|
407
|
+
```json
|
|
408
|
+
[
|
|
409
|
+
{ "phoneme": "M", "startMs": 0, "endMs": 120 }
|
|
410
|
+
]
|
|
411
|
+
```
|
|
412
|
+
|
|
413
|
+
Decoded `x-tts-visemes-b64` (a longer illustrative timeline; the sample header value above encodes only the first entry):
|
|
414
|
+
|
|
415
|
+
```json
|
|
416
|
+
[
|
|
417
|
+
{ "id": "M", "startMs": 0, "endMs": 120 },
|
|
418
|
+
{ "id": "A", "startMs": 121, "endMs": 260 },
|
|
419
|
+
{ "id": "SZ", "startMs": 261, "endMs": 420 }
|
|
420
|
+
]
|
|
421
|
+
```
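Given a viseme timeline like the one above, choosing the avatar image for the current playback position can be sketched as a simple range lookup. Treating both `startMs` and `endMs` as inclusive is an assumption, and `activeVisemeAt` is a hypothetical name.

```typescript
interface TimedViseme { id: string; startMs: number; endMs: number }

// Returns the viseme ID active at `positionMs`, or undefined when the
// position falls outside every entry (for example in a gap or after the end).
function activeVisemeAt(timeline: TimedViseme[], positionMs: number): string | undefined {
  return timeline.find(
    (entry) => positionMs >= entry.startMs && positionMs <= entry.endMs,
  )?.id;
}
```

A caller would typically pass `audio.currentTime * 1000` as `positionMs` and keep the previous or a neutral avatar image when the result is `undefined`.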
|
|
422
|
+
|
|
423
|
+
The widget still supports the older JSON body format as a fallback, but header-based metadata is now the preferred format for binary audio responses.
|
|
343
424
|
|
|
344
425
|
---
|
|
345
426
|
|