testaugnitorecorder4 1.0.46

package/README.md ADDED
@@ -0,0 +1,179 @@
1
+ # Async JS Client for Voice Backend
2
+
3
+ **Contents**
4
+
5
+ - [Intent](#intent)
6
+ - [Components](#components)
7
+ - [Usage](#usage)
8
+ - [Running an Example](#running-an-example)
9
+
10
+ ## Intent
11
+
12
+ The rationale behind an async implementation of the JS voice client is to decouple the following sets of tasks...
13
+
14
+ - generating an audio stream
15
+ - processing this audio stream (resample, conversion, etc.)
16
+ - streaming to server over a websocket connection
17
+ - handling ASR returned from server
18
+
19
+ ...such that each of the above runs independently in its own thread, insulated from the others, and communicates via message passing.
20
+
21
+ ## Components
22
+
23
+ The async JS client is composed of the following components, each with a well-defined and exclusive role.
24
+
25
+ ### Streamer
26
+
27
+ Streamer is the entrypoint of the async JS client. Its role is two-fold:
28
+
29
+ - initialise the **Worklet** and the **Executor** components
30
+ - manage communication and control between all components
31
+
32
+ When a new dictation session is started by the end-user, the following steps are executed:
33
+
34
+ 1. Streamer is initialised along with the Worklet and the Executor
35
+ 2. It initialises the audio context, after which audio generation begins
36
+ 3. It sends the audio packets to the Worklet for further processing
37
+ 4. It receives the processed and buffered packets from the Worklet and sends them to the Executor, which streams them to the server and handles ASR reception, processing and presentation
38
+ 5. It repeats from step 3
39
+
40
+ When a dictation session is stopped by the end-user, the following steps are executed:
41
+
42
+ 1. Streamer stops the audio context
43
+ 2. It sends DONE messages to Worklet and Executor asking them to close gracefully
44
+
45
+ _NOTE: The audio source in the existing implementation is a recording played in a loop. To support audio capture from a microphone, uncomment the pertinent code in Streamer._
46
+
47
+ ### Worklet
48
+
49
+ Worklet handles audio processing using [AudioWorklets](https://developer.mozilla.org/en-US/docs/Web/API/AudioWorklet). In this context, it can resample the incoming audio stream before it is sent to the server. It also buffers the processed audio packets for a short period before sending them back to the Streamer, which reduces the overhead of message passing between the OS-level threads running the Worklet and the Streamer.
50
+
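The buffering described above can be illustrated with a minimal sketch. `PacketBuffer` and its parameters are hypothetical names, not part of the library; the sketch only assumes that audio arrives as `Float32Array` chunks and that one batch is handed back once roughly `bufferInterval` seconds of samples have accumulated.

```javascript
// Hypothetical helper (not the library's code): accumulates audio chunks
// and hands them back in one batch, so fewer messages cross threads.
class PacketBuffer {
  constructor(sampleRate, bufferIntervalSec, onFlush) {
    // Number of samples to collect before flushing.
    this.threshold = sampleRate * bufferIntervalSec;
    this.onFlush = onFlush;
    this.chunks = [];
    this.buffered = 0;
  }

  // Store one chunk; flush automatically once the threshold is reached.
  push(chunk) {
    this.chunks.push(chunk);
    this.buffered += chunk.length;
    if (this.buffered >= this.threshold) {
      this.flush();
    }
  }

  // Merge all stored chunks into a single Float32Array and emit it.
  flush() {
    if (this.buffered === 0) return;
    const merged = new Float32Array(this.buffered);
    let offset = 0;
    for (const chunk of this.chunks) {
      merged.set(chunk, offset);
      offset += chunk.length;
    }
    this.chunks = [];
    this.buffered = 0;
    this.onFlush(merged);
  }
}
```

Inside an `AudioWorkletProcessor`, the `onFlush` callback would plausibly post the merged packet through the processor's `port`; how the library wires this up is not shown in this README.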
51
+ ### Executor
52
+
53
+ This component deals with the following tasks:
54
+
55
+ - obtain processed audio packets from the Streamer and stream them to the server over a websocket connection
56
+ - obtain ASR from the server and process it before pasting it to an editor screen
57
+
58
+ Executor manages a websocket connection with the server in a [Web Worker](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers). Audio packets received from the Streamer are buffered in a read-queue to be sent to the server. It houses three daemons that run periodically:
59
+
60
+ 1. **Consumer**: It consumes audio packets from the read-queue and streams them to the server over a websocket connection.
61
+ 2. **Healthcheck**: It oversees the websocket connection and closes it forcefully when the server is unreachable.
62
+ 3. **IdleThread**: It tracks whether data is being sent regularly over the websocket connection. If the connection remains idle for a given period of time, this daemon closes the connection gracefully, freeing up network and server resources.
63
+
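As a rough sketch of how the Consumer and IdleThread roles above could interact, the following uses hypothetical `ReadQueue`, `IdleTracker` and `consumerTick` names. It is not the library's implementation; the clock is passed in explicitly so the logic can be followed without a real socket or timer.

```javascript
// Hypothetical sketch (not the library's code) of a read-queue consumer
// combined with idle tracking.
class ReadQueue {
  constructor() {
    this.packets = [];
  }

  enqueue(packet) {
    this.packets.push(packet);
  }

  // Sends every queued packet in FIFO order; returns how many were sent.
  drain(send) {
    let sent = 0;
    while (this.packets.length > 0) {
      send(this.packets.shift());
      sent++;
    }
    return sent;
  }
}

class IdleTracker {
  constructor(timeoutMs, now = Date.now()) {
    this.timeoutMs = timeoutMs;
    this.lastSentAt = now;
  }

  // Record that data was sent at time `now`.
  touch(now = Date.now()) {
    this.lastSentAt = now;
  }

  // True once nothing has been sent for at least `timeoutMs`.
  isIdle(now = Date.now()) {
    return now - this.lastSentAt >= this.timeoutMs;
  }
}

// One periodic tick: drain the queue, then decide whether the
// connection should be closed because it has been idle too long.
function consumerTick(queue, tracker, send, now = Date.now()) {
  const sent = queue.drain(send);
  if (sent > 0) tracker.touch(now);
  return tracker.isIdle(now) ? "close" : "keep-open";
}
```

In a Web Worker, a tick like this could be scheduled with `setInterval`, with `send` wrapping the websocket's `send` method; both choices are assumptions made for illustration.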
64
+ ## Usage
65
+
66
+ ### Initialise Audio Stream
67
+
68
+ Use this when the user begins a new dictation session to create a clinical document.
69
+
70
+ ```js
71
+ const recorderInstance = new AugnitoRecorder(
72
+ {
73
+ serverURL: WS_URL,
74
+ enableLogs: false,
75
+ isDebug: false,
76
+ bufferInterval: 1,
77
+ EOS_Message: "EOS",
78
+ socketTimeoutInterval: 10000,
79
+ shouldSendAudioDataSequence: false,
80
+ },
81
+ heavyOp
82
+ );
83
+ recorderInstance.toggleStartStopAudioStream(); //if you want to start/stop audio on button click
84
+ // or
85
+ recorderInstance.togglePauseResumeAudioStream(); //if you want to pause/resume audio on button click
86
+ ```
87
+
88
+ - `WS_URL` is the server websocket endpoint to which the client connects to stream audio.
89
+ - `enableLogs`: set to true to see logs in the console, otherwise false.
90
+ - `isDebug`: set to true to save the recorded audio to a file, otherwise false.
91
+ - `bufferInterval`: a non-negative integer giving the buffer interval in seconds; the default is 1.
92
+ - `EOS_Message`: a string that indicates EOS, or undefined; the default is `"EOS"`.
93
+ - `socketTimeoutInterval`: the socket timeout interval; the default is 10000.
94
+ - `shouldSendAudioDataSequence`: set to true to send the audio packet sequence as a header with every packet, otherwise false.
95
+ - `heavyOp` is a CPU-intensive operation which is run on the received ASR before displaying it in the editor. A dummy example is as follows:
96
+
97
+ ```js
98
+ const heavyOp = (text) => {
99
+ let num = -1;
100
+ let iters = 0;
101
+ let sum = 0;
102
+ for (let i = 0; i < text.length; i++) {
103
+ sum += text.charCodeAt(i);
104
+ }
105
+ console.debug(`Running Heavy Operation for "${text}" with sum: ${sum}`);
106
+ while (num !== sum % 100000) { // modulo keeps the target reachable by the random draw below
107
+ num = Math.floor(Math.random() * 100000);
108
+ iters++;
109
+ }
110
+ console.log(`Iterations completed for sum(${sum}): ${iters}`);
111
+ return text;
112
+ };
113
+ ```
114
+
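The defaults listed for the optional settings can be summarised as a plain object merge. `DOCUMENTED_DEFAULTS` and `buildRecorderConfig` are hypothetical names used only for illustration; the library applies its own defaults internally, and this sketch merely mirrors the values stated in this README.

```javascript
// Defaults as stated in the option descriptions of this README.
const DOCUMENTED_DEFAULTS = {
  bufferInterval: 1,
  EOS_Message: "EOS",
  socketTimeoutInterval: 10000,
};

// Hypothetical helper (not part of the library): merges caller overrides
// over the documented defaults and checks the one documented constraint.
function buildRecorderConfig(overrides) {
  const config = { ...DOCUMENTED_DEFAULTS, ...overrides };
  if (!Number.isInteger(config.bufferInterval) || config.bufferInterval < 0) {
    throw new RangeError("bufferInterval must be a non-negative integer");
  }
  return config;
}
```

The resulting object could then be passed as the first argument to `new AugnitoRecorder(...)`, with `heavyOp` as the second.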
115
+ ### Audio Stream Pause and Resume
116
+
117
+ Use this when the user pauses and resumes the dictation session, emulating a toggle of the microphone button.
118
+
119
+ ```js
120
+ recorderInstance.pauseAudio();
121
+
122
+ /*
123
+ Do something interesting here...
124
+ */
125
+
126
+ recorderInstance.resumeAudio();
127
+ ```
128
+
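One way to reason about the start, pause, resume and stop calls is as a small state machine. The states, action names and `togglePauseResume` helper below are assumptions made for illustration; they do not describe how the library tracks its session internally.

```javascript
// Hypothetical session states and the actions allowed in each of them.
const TRANSITIONS = {
  idle: { start: "recording" },
  recording: { pause: "paused", stop: "idle" },
  paused: { resume: "recording", stop: "idle" },
};

// Returns the state reached by applying `action`, or throws if the
// action is not allowed in the current state.
function nextState(state, action) {
  const next = TRANSITIONS[state] && TRANSITIONS[state][action];
  if (!next) {
    throw new Error(`Cannot ${action} while ${state}`);
  }
  return next;
}

// A single button that pauses when recording and resumes when paused.
function togglePauseResume(state) {
  return state === "paused"
    ? nextState(state, "resume")
    : nextState(state, "pause");
}
```

Keeping the allowed transitions in one table makes it easier to guard UI buttons against calls such as resuming a session that was never started.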
129
+ ### Stop Audio Stream
130
+
131
+ Use this when the user has completed their dictation session and finished compiling the medical document.
132
+
133
+ ```js
134
+ recorderInstance.stopAudio();
135
+ ```
136
+
137
+ ### Library Callbacks
138
+
139
+ Use these callbacks to handle socket connection state changes, session events, speech text output and errors reported by the library.
140
+
141
+ ```js
142
+ recorderInstance.onStateChanged = (connected) => {
143
+ //connected is true when socket is opened and false when socket is closed
144
+ };
145
+ recorderInstance.onSessionEvent = (response) => {
146
+ //handle all meta events
147
+ };
148
+ recorderInstance.onError = (error) => {
149
+ //handle any error message
150
+ };
151
+ recorderInstance.onPartialResult = (text) => {
152
+ // hypothesis text output generated as the user speaks
153
+ };
154
+ recorderInstance.onFinalResult = (textResponse) => {
155
+ //text response can be parsed to read the speech output json
156
+ };
157
+ ```
158
+
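Since the final result is described as parseable JSON, a defensive parse step can keep a malformed payload from throwing inside the callback. `parseFinalResult` is a hypothetical helper, and no field names of the speech output JSON are assumed here.

```javascript
// Hypothetical helper (not part of the library): parses the final-result
// payload without throwing. Non-string payloads are passed through as-is.
function parseFinalResult(textResponse) {
  if (typeof textResponse !== "string") {
    return { ok: true, value: textResponse };
  }
  try {
    return { ok: true, value: JSON.parse(textResponse) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}
```

Inside `onFinalResult`, the handler could call `parseFinalResult(textResponse)` and only update the editor when `ok` is true.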
159
+ ## Compile library
160
+
161
+ Run the steps below to compile the library; the output is generated in the `dist` folder.
162
+
163
+ ```sh
164
+ cd AugnitoRecorderJS
165
+ npm run build
166
+ ```
167
+
168
+ ## Running an Example
169
+
170
+ ```sh
171
+ cd AugnitoRecorderJS
172
+ python3 -m http.server
173
+ ```
174
+
175
+ Once done, open the webpage at http://localhost:8000. The webpage has the following controls for speech:
176
+
177
+ - **Start Stream**: This begins playing the recorded audio which is streamed to the server and the ASR obtained pertaining to it is pasted on screen live. This control marks the start of a new session with the server and so a new WS connection is created for audio streaming and ASR reception. The recording and the ensuing ASR can be paused and resumed using the same control.
178
+ - **Stop Stream**: This stops the recorded audio from playing marking the end of the client session. The connection with the server is severed when this control is engaged.
179
+ - **Go Crazy...**: This control randomly starts, pauses, resumes and stops the audio streaming session in a loop. It uses the above two controls as its building blocks and is a hands-free way to communicate with the server as a virtual user.
@@ -0,0 +1,54 @@
1
+ export declare class AugnitoRecorder {
2
+ private WebsocketURL;
3
+ private enableLogs;
4
+ private isDebug;
5
+ private streamer;
6
+ private heavyOp;
7
+ private bufferInterval;
8
+ private eosMessage;
9
+ private socketTimeoutInterval;
10
+ onSessionEvent: (data: any) => void;
11
+ onStateChanged: (isRecording: boolean) => void;
12
+ onError: (errorMessage: string) => void;
13
+ onPartialResult: (hype: string) => void;
14
+ onFinalResult: (recipe: any) => void;
15
+ onOtherResults: (message: string) => void;
16
+ onIntensity: (intensity: number) => void;
17
+ showLog: (event: string) => void;
18
+ constructor(
19
+ config: {
20
+ serverURL?: string;
21
+ enableLogs: boolean;
22
+ isDebug: boolean;
23
+ bufferInterval?: number;
24
+ pausedBufferInterval?: number;
25
+ EOS_Message?: string;
26
+ socketTimeoutInterval?: number;
27
+ shouldSendAudioDataSequence?: boolean;
28
+ shouldPreIntialiseRecorder?: boolean;
29
+ shouldReadIntensity?: boolean;
30
+ debounceDelay?: number;
31
+ switchToRegularSpeechProfile?: boolean
32
+ },
33
+ heavyOp?: any
34
+ );
35
+ togglePauseResumeAudioStream(
36
+ audioDuration?: number,
37
+ socketURL?: string
38
+ ): void;
39
+ toggleStartStopAudioStream(audioDuration?: number, socketURL?: string): void;
40
+ startAudio(): void;
41
+ pauseAudio(): void;
42
+ resumeAudio(): void;
43
+ stopAudio(shouldSendEOS?: boolean, forceStopForPausedState?: boolean): void;
44
+ getBlob(): Blob;
45
+ log(event: any): void;
46
+ dispose(): void;
47
+ onSessionEventCallback(data: any): void;
48
+ onStateChangedCallback(isRecording: any): void;
49
+ onErrorCallback(errorMessage: any): void;
50
+ onPartialResultCallback(hype: any): void;
51
+ onFinalResultCallback(recipe: any): void;
52
+ showLogCallback(event: any): void;
53
+ onIntensityCallback(intensity: number): void;
54
+ }