@aituber-onair/core 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (104)
  1. package/README.md +723 -0
  2. package/dist/constants/api.d.ts +4 -0
  3. package/dist/constants/api.js +13 -0
  4. package/dist/constants/api.js.map +1 -0
  5. package/dist/constants/index.d.ts +23 -0
  6. package/dist/constants/index.js +25 -0
  7. package/dist/constants/index.js.map +1 -0
  8. package/dist/constants/openaiApi.d.ts +15 -0
  9. package/dist/constants/openaiApi.js +15 -0
  10. package/dist/constants/openaiApi.js.map +1 -0
  11. package/dist/constants/prompts.d.ts +2 -0
  12. package/dist/constants/prompts.js +13 -0
  13. package/dist/constants/prompts.js.map +1 -0
  14. package/dist/core/AITuberOnAirCore.d.ts +142 -0
  15. package/dist/core/AITuberOnAirCore.js +316 -0
  16. package/dist/core/AITuberOnAirCore.js.map +1 -0
  17. package/dist/core/ChatProcessor.d.ts +86 -0
  18. package/dist/core/ChatProcessor.js +246 -0
  19. package/dist/core/ChatProcessor.js.map +1 -0
  20. package/dist/core/EventEmitter.d.ts +35 -0
  21. package/dist/core/EventEmitter.js +72 -0
  22. package/dist/core/EventEmitter.js.map +1 -0
  23. package/dist/core/MemoryManager.d.ts +98 -0
  24. package/dist/core/MemoryManager.js +208 -0
  25. package/dist/core/MemoryManager.js.map +1 -0
  26. package/dist/index.d.ts +24 -0
  27. package/dist/index.js +22 -0
  28. package/dist/index.js.map +1 -0
  29. package/dist/services/chat/ChatService.d.ts +21 -0
  30. package/dist/services/chat/ChatService.js +2 -0
  31. package/dist/services/chat/ChatService.js.map +1 -0
  32. package/dist/services/chat/ChatServiceFactory.d.ts +38 -0
  33. package/dist/services/chat/ChatServiceFactory.js +55 -0
  34. package/dist/services/chat/ChatServiceFactory.js.map +1 -0
  35. package/dist/services/chat/OpenAIChatService.d.ts +38 -0
  36. package/dist/services/chat/OpenAIChatService.js +166 -0
  37. package/dist/services/chat/OpenAIChatService.js.map +1 -0
  38. package/dist/services/chat/OpenAISummarizer.d.ts +25 -0
  39. package/dist/services/chat/OpenAISummarizer.js +70 -0
  40. package/dist/services/chat/OpenAISummarizer.js.map +1 -0
  41. package/dist/services/chat/providers/ChatServiceProvider.d.ts +44 -0
  42. package/dist/services/chat/providers/ChatServiceProvider.js +2 -0
  43. package/dist/services/chat/providers/ChatServiceProvider.js.map +1 -0
  44. package/dist/services/chat/providers/OpenAIChatServiceProvider.d.ts +33 -0
  45. package/dist/services/chat/providers/OpenAIChatServiceProvider.js +44 -0
  46. package/dist/services/chat/providers/OpenAIChatServiceProvider.js.map +1 -0
  47. package/dist/services/voice/VoiceEngineAdapter.d.ts +46 -0
  48. package/dist/services/voice/VoiceEngineAdapter.js +173 -0
  49. package/dist/services/voice/VoiceEngineAdapter.js.map +1 -0
  50. package/dist/services/voice/VoiceService.d.ts +55 -0
  51. package/dist/services/voice/VoiceService.js +2 -0
  52. package/dist/services/voice/VoiceService.js.map +1 -0
  53. package/dist/services/voice/engines/AivisSpeechEngine.d.ts +10 -0
  54. package/dist/services/voice/engines/AivisSpeechEngine.js +70 -0
  55. package/dist/services/voice/engines/AivisSpeechEngine.js.map +1 -0
  56. package/dist/services/voice/engines/NijiVoiceEngine.d.ts +12 -0
  57. package/dist/services/voice/engines/NijiVoiceEngine.js +105 -0
  58. package/dist/services/voice/engines/NijiVoiceEngine.js.map +1 -0
  59. package/dist/services/voice/engines/OpenAiEngine.d.ts +9 -0
  60. package/dist/services/voice/engines/OpenAiEngine.js +34 -0
  61. package/dist/services/voice/engines/OpenAiEngine.js.map +1 -0
  62. package/dist/services/voice/engines/VoiceEngine.d.ts +21 -0
  63. package/dist/services/voice/engines/VoiceEngine.js +2 -0
  64. package/dist/services/voice/engines/VoiceEngine.js.map +1 -0
  65. package/dist/services/voice/engines/VoiceEngineFactory.d.ts +14 -0
  66. package/dist/services/voice/engines/VoiceEngineFactory.js +34 -0
  67. package/dist/services/voice/engines/VoiceEngineFactory.js.map +1 -0
  68. package/dist/services/voice/engines/VoicePeakEngine.d.ts +13 -0
  69. package/dist/services/voice/engines/VoicePeakEngine.js +46 -0
  70. package/dist/services/voice/engines/VoicePeakEngine.js.map +1 -0
  71. package/dist/services/voice/engines/VoiceVoxEngine.d.ts +13 -0
  72. package/dist/services/voice/engines/VoiceVoxEngine.js +67 -0
  73. package/dist/services/voice/engines/VoiceVoxEngine.js.map +1 -0
  74. package/dist/services/voice/engines/index.d.ts +7 -0
  75. package/dist/services/voice/engines/index.js +7 -0
  76. package/dist/services/voice/engines/index.js.map +1 -0
  77. package/dist/services/voice/messages.d.ts +38 -0
  78. package/dist/services/voice/messages.js +49 -0
  79. package/dist/services/voice/messages.js.map +1 -0
  80. package/dist/services/youtube/YouTubeDataApiService.d.ts +69 -0
  81. package/dist/services/youtube/YouTubeDataApiService.js +255 -0
  82. package/dist/services/youtube/YouTubeDataApiService.js.map +1 -0
  83. package/dist/services/youtube/YouTubeService.d.ts +63 -0
  84. package/dist/services/youtube/YouTubeService.js +2 -0
  85. package/dist/services/youtube/YouTubeService.js.map +1 -0
  86. package/dist/types/index.d.ts +82 -0
  87. package/dist/types/index.js +5 -0
  88. package/dist/types/index.js.map +1 -0
  89. package/dist/types/nijiVoice.d.ts +27 -0
  90. package/dist/types/nijiVoice.js +2 -0
  91. package/dist/types/nijiVoice.js.map +1 -0
  92. package/dist/utils/index.d.ts +5 -0
  93. package/dist/utils/index.js +6 -0
  94. package/dist/utils/index.js.map +1 -0
  95. package/dist/utils/screenplay.d.ts +19 -0
  96. package/dist/utils/screenplay.js +42 -0
  97. package/dist/utils/screenplay.js.map +1 -0
  98. package/dist/utils/screenshot.d.ts +19 -0
  99. package/dist/utils/screenshot.js +44 -0
  100. package/dist/utils/screenshot.js.map +1 -0
  101. package/dist/utils/storage.d.ts +44 -0
  102. package/dist/utils/storage.js +103 -0
  103. package/dist/utils/storage.js.map +1 -0
  104. package/package.json +33 -0
package/README.md ADDED
@@ -0,0 +1,723 @@
1
+ # AITuber OnAir Core
2
+
3
+ ![AITuber OnAir Core - logo](./images/aituber-onair-core.png)
4
+
5
+ **AITuber OnAir Core** is a TypeScript library developed to provide functionality for the [AITuber OnAir](https://aituberonair.com) web service, designed for AI-based virtual streaming (AITuber).
6
+
7
+ [日本語版はこちら](./README_ja.md)
8
+
9
+ While it is primarily intended for use within [AITuber OnAir](https://aituberonair.com), this project is available as open-source software under the MIT License and can be used freely.
10
+
11
+ It specializes in generating response text and audio from text or image inputs, and is designed to easily integrate with other parts of an application (storage, YouTube integration, avatar control, etc.).
12
+
13
+ ## Table of Contents
14
+
15
+ - [Overview](#overview)
16
+ - [Installation](#installation)
17
+ - [Main Features](#main-features)
18
+ - [Basic Usage](#basic-usage)
19
+ - [Architecture](#architecture)
20
+ - [Main Components](#main-components)
21
+ - [Event System](#event-system)
22
+ - [Supported Speech Engines](#supported-speech-engines)
23
+ - [AI Provider System](#ai-provider-system)
24
+ - [Memory & Persistence](#memory--persistence)
25
+ - [Examples](#examples)
26
+ - [Integration with Existing Applications](#integration-with-existing-applications)
27
+ - [Testing & Development](#testing--development)
28
+
29
+ ## Overview
30
+
31
+ **AITuberOnAirCore** is the central module that provides core features for AITubers and sits at the heart of the AITuber OnAir application. It encapsulates complex AI response generation, conversation context management, speech synthesis, and more, and exposes these features through a simple API.
32
+
33
+ ## Installation
34
+
35
+ You can install AITuber OnAir Core using npm:
36
+
37
+ ```bash
38
+ npm install @aituber-onair/core
39
+ ```
40
+
41
+ Or using yarn:
42
+
43
+ ```bash
44
+ yarn add @aituber-onair/core
45
+ ```
46
+
47
+ Or using pnpm:
48
+
49
+ ```bash
50
+ pnpm add @aituber-onair/core
51
+ ```
52
+
53
+ ## Main Features
54
+
55
+ - **AI Response Generation from Text Input**
56
+ Generates natural responses to user text input using OpenAI GPT models.
57
+ - **AI Response Generation from Images (Vision)**
58
+ Generates AI responses based on recognized content from images (e.g., live broadcast screens).
59
+ - **Conversation Context Management & Memory**
60
+ Maintains long-running conversation context via short-, mid-, and long-term memory systems.
61
+ - **Text-to-Speech Conversion**
62
+ Compatible with multiple speech engines (VOICEVOX, VoicePeak, NijiVoice, AivisSpeech, OpenAI TTS).
63
+ - **Emotion Extraction & Processing**
64
+ Extracts emotion from AI responses and utilizes it for speech synthesis or avatar expressions.
65
+ - **Event-Driven Architecture**
66
+ Emits events at each stage of processing to simplify external integrations.
67
+ - **Customizable Prompts**
68
+ Allows customization of prompts for vision processing and conversation summarization.
69
+ - **Pluggable Persistence**
70
+ Memory features can be persisted via LocalStorage, IndexedDB, or other customizable methods.
71
+
72
+ ## Basic Usage
73
+
74
+ Below is a simplified example of how to use **AITuber OnAir Core**:
75
+
76
+ ```typescript
77
+ import {
78
+ AITuberOnAirCore,
79
+ AITuberOnAirCoreEvent,
80
+ AITuberOnAirCoreOptions
81
+ } from '@aituber-onair/core';
82
+
83
+ // 1. Define options
84
+ const options: AITuberOnAirCoreOptions = {
85
+ openAiKey: 'YOUR_OPENAI_API_KEY',
86
+ chatOptions: {
87
+ systemPrompt: 'You are an AI streamer. Act as a cheerful and friendly live broadcaster.',
88
+ visionSystemPrompt: 'Please comment like a streamer on what is shown on screen.',
89
+ visionPrompt: 'Look at the broadcast screen and provide commentary suited to the situation.', // Prompt for image input
90
+ },
91
+ memoryOptions: {
92
+ enableSummarization: true,
93
+ shortTermDuration: 60 * 1000, // 1 minute
94
+ midTermDuration: 4 * 60 * 1000, // 4 minutes
95
+ longTermDuration: 9 * 60 * 1000, // 9 minutes
96
+ maxMessagesBeforeSummarization: 20,
97
+ maxSummaryLength: 256,
98
+ // You can specify a custom summarization prompt
99
+ summaryPromptTemplate: 'Please summarize the following conversation in under {maxLength} characters. Include important points.'
100
+ },
101
+ voiceOptions: {
102
+ engineType: 'voicevox', // Speech engine type
103
+ speaker: '1', // Speaker ID
104
+ apiKey: 'ENGINE_SPECIFIC_API_KEY', // If required (e.g., NijiVoice)
105
+ },
106
+ debug: true, // Enable debug output
107
+ };
108
+
109
+ // 2. Create an instance
110
+ const aituber = new AITuberOnAirCore(options);
111
+
112
+ // 3. Set up event listeners
113
+ aituber.on(AITuberOnAirCoreEvent.PROCESSING_START, () => {
114
+ console.log('Processing started');
115
+ });
116
+
117
+ aituber.on(AITuberOnAirCoreEvent.ASSISTANT_PARTIAL, (text) => {
118
+ // Receive streaming responses and display in UI
119
+ console.log(`Partial response: ${text}`);
120
+ });
121
+
122
+ aituber.on(AITuberOnAirCoreEvent.ASSISTANT_RESPONSE, (data) => {
123
+ const { message, screenplay, rawText } = data;
124
+ console.log(`Complete response: ${message.content}`);
125
+ console.log(`Original text with emotion tags: ${rawText}`);
126
+ if (screenplay.emotion) {
127
+ console.log(`Emotion: ${screenplay.emotion}`);
128
+ }
129
+ });
130
+
131
+ aituber.on(AITuberOnAirCoreEvent.SPEECH_START, (data) => {
132
+ // The SPEECH_START event includes the screenplay object and rawText
133
+ if (data && data.screenplay) {
134
+ console.log(`Speech playback started: emotion = ${data.screenplay.emotion || 'neutral'}`);
135
+ console.log(`Original text with emotion tags: ${data.rawText}`);
136
+ } else {
137
+ console.log('Speech playback started');
138
+ }
139
+ });
140
+
141
+ aituber.on(AITuberOnAirCoreEvent.SPEECH_END, () => {
142
+ console.log('Speech playback finished');
143
+ });
144
+
145
+ aituber.on(AITuberOnAirCoreEvent.ERROR, (error) => {
146
+ console.error('Error occurred:', error);
147
+ });
148
+
149
+ // 4. Process text input
150
+ await aituber.processChat('Hello, how is the weather today?');
151
+
152
+ // 5. Clear event listeners if needed
153
+ aituber.offAll();
154
+ ```
155
+
156
+ ## Architecture
157
+
158
+ **AITuberOnAirCore** is designed with the following layered structure:
159
+
160
+ ```
161
+ AITuberOnAirCore (Integration Layer)
162
+ ├── ChatProcessor (Conversation handling)
163
+ │   └── ChatService (AI Chat)
164
+ ├── MemoryManager (Memory handling)
165
+ │   └── Summarizer (Summarization)
166
+ └── VoiceService (Speech processing)
167
+     └── VoiceEngineAdapter (Speech Engine Interface)
168
+         └── Various Speech Engines (VOICEVOX, NijiVoice, etc.)
169
+ ```
170
+
171
+ ## Main Components
172
+
173
+ ### AITuberOnAirCore
174
+
175
+ This is the overall integration class, responsible for initializing and coordinating other components. It extends `EventEmitter` and emits events at various processing stages. In most cases, you will interact primarily with this class to use its features.
176
+
177
+ **Main methods** include:
178
+
179
+ - `processChat(text)` – Process text input
180
+ - `processVisionChat(imageDataUrl, visionPrompt?)` – Process image input (optionally pass a custom prompt)
181
+ - `stopSpeech()` – Stop speech playback
182
+ - `getChatHistory()` – Retrieve chat history
183
+ - `clearChatHistory()` – Clear chat history
184
+ - `updateVoiceService(options)` – Update speech settings
185
+ - `isMemoryEnabled()` – Check if memory functionality is enabled
186
+ - `offAll()` – Remove all event listeners
187
+
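+ A minimal sketch of a few of these methods in use, assuming the `aituber` instance from the Basic Usage example above (and that `getChatHistory()` returns an array of message objects):
+
+ ```typescript
+ // Check whether the memory feature is active
+ if (aituber.isMemoryEnabled()) {
+   console.log('Memory/summarization is enabled');
+ }
+
+ // Inspect the current conversation history
+ const history = aituber.getChatHistory();
+ console.log(`Messages so far: ${history.length}`);
+
+ // Stop any ongoing speech playback and reset the conversation
+ aituber.stopSpeech();
+ aituber.clearChatHistory();
+ ```
+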
188
+ ### ChatProcessor
189
+
190
+ The component that sends text input to an AI model (e.g., OpenAI GPT) and receives responses. It manages the conversation flow and supports streaming responses. It also handles emotion extraction from responses.
191
+
192
+ - `updateOptions(newOptions)` – Allows you to update settings at runtime
193
+
194
+ ### MemoryManager
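+ As a rough sketch, you can reach the internal `ChatProcessor` instance the same way as in the ChatProcessor Events section below and adjust its options at runtime. The exact fields accepted by `updateOptions` are an assumption here; they are presumed to mirror `chatOptions`:
+
+ ```typescript
+ // Accessing the internal component (same pattern as the chatLogUpdated example below)
+ const chatProcessor = aituber['chatProcessor'];
+
+ // Hypothetical runtime update, assuming chatOptions-style fields are accepted
+ chatProcessor.updateOptions({
+   systemPrompt: 'You are a calm and thoughtful AI streamer.',
+ });
+ ```
+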
195
+
196
+ Handles conversational context. In long conversations, older messages are summarized and maintained as short-term (1 min), mid-term (4 min), and long-term (9 min) memory. This helps maintain consistency in AI responses.
197
+
198
+ - **Custom Settings**:
199
+ - `summaryPromptTemplate` can be customized for summarization (it uses a `{maxLength}` placeholder).
200
+
201
+ ### VoiceService
202
+
203
+ Converts text to speech. It integrates with multiple external speech synthesis engines through the `VoiceEngineAdapter`.
204
+
205
+ #### speakTextWithOptions Method
206
+
207
+ The `AITuberOnAirCore` class provides a flexible `speakTextWithOptions` method for speech playback:
208
+
209
+ ```typescript
210
+ // Example of speaking text with temporary settings
211
+ await aituberOnairCore.speakTextWithOptions('[happy] Hello, everyone watching!', {
212
+ // Enable or disable avatar animation
213
+ enableAnimation: true,
214
+
215
+ // Temporarily override current speech settings
216
+ temporaryVoiceOptions: {
217
+ engineType: 'voicevox',
218
+ speaker: '8',
219
+ apiKey: 'YOUR_API_KEY' // If required
220
+ },
221
+
222
+ // Specify the ID of the HTML audio element for playback
223
+ audioElementId: 'custom-audio-player'
224
+ });
225
+ ```
226
+
227
+ **Key Features**:
228
+
229
+ 1. **Temporary Voice Settings**: Override current speech settings without permanently changing them.
230
+ 2. **Animation Control**: Control avatar animation with the `enableAnimation` option.
231
+ 3. **Flexible Audio Playback**: Play audio in a specified HTML audio element.
232
+ 4. **Automatic Emotion Extraction**: Extract emotion tags (e.g., `[happy]`) from text and provide them in the `SPEECH_START` event.
233
+
234
+ ## Event System
235
+
236
+ **AITuberOnAirCore** emits the following events:
237
+
238
+ - `PROCESSING_START`: When processing begins
239
+ - `PROCESSING_END`: When processing finishes
240
+ - `ASSISTANT_PARTIAL`: Upon receiving partial responses from the assistant (streaming)
241
+ - `ASSISTANT_RESPONSE`: Upon receiving a complete response (includes a screenplay object and rawText with emotion tags)
242
+ - `SPEECH_START`: When speech playback starts (includes a screenplay object with emotion and rawText with emotion tags)
243
+ - `SPEECH_END`: When speech playback ends
244
+ - `ERROR`: When an error occurs
245
+
246
+ ### Safely Handling Event Data
247
+
248
+ In particular, when implementing a listener for the `SPEECH_START` event, it is recommended to check if data is present:
249
+
250
+ ```typescript
251
+ // Safe handling of SPEECH events
252
+ aituber.on(AITuberOnAirCoreEvent.SPEECH_START, (data) => {
253
+ if (!data) {
254
+ console.log('No data available');
255
+ return;
256
+ }
257
+
258
+ const screenplay = data.screenplay;
259
+ if (!screenplay) {
260
+ console.log('No screenplay object');
261
+ return;
262
+ }
263
+
264
+ const emotion = screenplay.emotion || 'neutral';
265
+ console.log(`Speech started: Emotion = ${emotion}`);
266
+
267
+ // Get original text with emotion tags
268
+ console.log(`Original text: ${data.rawText}`);
269
+
270
+ // Update UI or avatar animation
271
+ updateUIWithEmotion(emotion);
272
+ });
273
+ ```
274
+
275
+ ### Emotion Handling
276
+
277
+ In a React application, you might use `useRef` to store the latest emotion data for immediate access:
278
+
279
+ ```typescript
280
+ // Example in a React component
281
+ const [currentEmotion, setCurrentEmotion] = useState('neutral');
282
+ const emotionRef = useRef({ emotion: 'neutral', text: '' });
283
+
284
+ useEffect(() => {
285
+ if (aituberOnairCore) {
286
+ aituberOnairCore.on(AITuberOnAirCoreEvent.SPEECH_START, (data) => {
287
+ if (data?.screenplay?.emotion) {
288
+ setCurrentEmotion(data.screenplay.emotion);
289
+ emotionRef.current = data.screenplay;
290
+ }
291
+ });
292
+ }
293
+ }, [aituberOnairCore]);
294
+
295
+ // Use the ref for animation callbacks
296
+ const handleAnimation = () => {
297
+ const emotion = emotionRef.current.emotion || 'neutral';
298
+ // Perform animation based on emotion
299
+ };
300
+ ```
301
+
302
+ ### ChatProcessor Events
303
+
304
+ The internal `ChatProcessor` emits additional events:
305
+
306
+ - `chatLogUpdated`: Fired when the chat log is updated (e.g., when new messages are added or history is cleared).
307
+
308
+ You can access this event by referencing the `ChatProcessor` instance directly:
309
+
310
+ ```typescript
311
+ // Example: using the chatLogUpdated event in ChatProcessor
312
+ const aituber = new AITuberOnAirCore(options);
313
+ const chatProcessor = aituber['chatProcessor']; // Accessing internal component
314
+
315
+ chatProcessor.on('chatLogUpdated', (chatLog) => {
316
+ console.log('Chat log updated:', chatLog);
317
+
318
+ // Example: Update UI
319
+ updateChatDisplay(chatLog);
320
+
321
+ // Example: Sync with an external system
322
+ syncChatToExternalSystem(chatLog);
323
+ });
324
+ ```
325
+
326
+ Possible use cases for `chatLogUpdated` include:
327
+
328
+ 1. **Real-Time Chat UI Updates**
329
+ Reflect new messages or cleared logs in the UI immediately.
330
+ 2. **External System Integration**
331
+ Save chat logs to a database or send them to an analytics service.
332
+ 3. **Debugging & Monitoring**
333
+ Monitor changes in the chat log during development.
334
+
335
+ ## Supported Speech Engines
336
+
337
+ **AITuberOnAirCore** supports the following speech engines:
338
+
339
+ - **VOICEVOX**: High-quality Japanese speech synthesis engine.
340
+ - **VoicePeak**: Speech synthesis engine with rich emotional expression.
341
+ - **NijiVoice**: AI-based speech synthesis service (requires an API key).
342
+ - **AivisSpeech**: Speech synthesis using AI technology.
343
+ - **OpenAI TTS**: Text-to-speech API from OpenAI.
344
+
345
+ You can dynamically switch the speech engine via `updateVoiceService`:
346
+
347
+ ```typescript
348
+ // Example of switching speech engines
349
+ aituber.updateVoiceService({
350
+ engineType: 'nijivoice',
351
+ speaker: 'some-speaker-id',
352
+ apiKey: 'YOUR_NIJIVOICE_API_KEY'
353
+ });
354
+ ```
355
+
356
+ ## AI Provider System
357
+
358
+ AITuber OnAir Core uses an extensible provider system to accommodate various AI APIs. By default, it uses the OpenAI API, but other providers (Gemini, Claude, etc.) can be added.
359
+
360
+ ### Available Providers
361
+
362
+ Currently, the following AI provider is built-in:
363
+
364
+ - **OpenAI**: Supports models like GPT-4o, GPT-4o-mini, GPT-4 Turbo, etc.
365
+
366
+ ### Specifying a Provider
367
+
368
+ You can specify the provider when instantiating `AITuberOnAirCore`:
369
+
370
+ ```typescript
371
+ const aituberCore = new AITuberOnAirCore({
372
+ chatProvider: 'openai', // Provider name
373
+ apiKey: 'your-api-key',
374
+ model: 'gpt-4o-mini', // Optional (if omitted, the default model is used)
375
+ // Other options...
376
+ });
377
+ ```
378
+
379
+ ### Retrieving Providers & Models
380
+
381
+ You can programmatically retrieve available providers and their supported models:
382
+
383
+ ```typescript
384
+ // Get all available providers
385
+ const providers = AITuberOnAirCore.getAvailableProviders();
386
+
387
+ // Get supported models for a specific provider
388
+ const models = AITuberOnAirCore.getSupportedModels('openai');
389
+ ```
390
+
391
+ ### Creating a Custom Provider
392
+
393
+ To add a new AI provider, implement the `ChatServiceProvider` interface in a custom class and register it with the `ChatServiceFactory`:
394
+
395
+ ```typescript
396
+ import { AITuberOnAirCore, ChatServiceFactory } from '@aituber-onair/core';
397
+ import { MyCustomProvider } from './MyCustomProvider';
398
+
399
+ // Register the custom provider
400
+ ChatServiceFactory.registerProvider(new MyCustomProvider());
401
+
402
+ // Use the registered provider
403
+ const aituberCore = new AITuberOnAirCore({
404
+ chatProvider: 'myCustomProvider',
405
+ apiKey: 'your-api-key',
406
+ // Other options...
407
+ });
408
+ ```
409
+
410
+ ## Memory & Persistence
411
+
412
+ **AITuberOnAirCore** includes a memory feature that maintains the context of long-running conversations. The AI summarizes older messages, preserving short-, mid-, and long-term context for more coherent responses.
413
+
414
+ ### Memory Types
415
+
416
+ There are three types of memory:
417
+
418
+ 1. **Short-Term Memory**
419
+ - Generated **1 minute** after the conversation starts
420
+ - Holds recent conversation details
421
+
422
+ 2. **Mid-Term Memory**
423
+ - Generated **4 minutes** after the conversation starts
424
+ - Holds slightly broader summaries of the conversation
425
+
426
+ 3. **Long-Term Memory**
427
+ - Generated **9 minutes** after the conversation starts
428
+ - Holds key themes and important information from the overall conversation
429
+
430
+ These memory records are automatically included in the AI prompts, helping the AI respond consistently over time.
431
+
432
+ ### Memory Persistence
433
+
434
+ AITuberOnAirCore has a pluggable design for memory persistence, so that the conversation context can be retained even if the application is restarted.
435
+
436
+ #### MemoryStorage Interface
437
+
438
+ Persistence is provided through the abstract `MemoryStorage` interface:
439
+
440
+ ```typescript
441
+ interface MemoryStorage {
442
+ load(): Promise<MemoryRecord[]>;
443
+ save(records: MemoryRecord[]): Promise<void>;
444
+ clear(): Promise<void>;
445
+ }
446
+ ```
447
+
448
+ #### Default Implementations
449
+
450
+ 1. **LocalStorageMemoryStorage**
451
+ - Uses the browser's LocalStorage
452
+ - Simple solution (subject to storage limits)
453
+
454
+ 2. **IndexedDBMemoryStorage** (Planned)
455
+ - Uses the browser's IndexedDB
456
+ - Supports larger capacity and more complex data structures
457
+
458
+ #### Custom Storage Implementations
459
+
460
+ To create your own storage implementation, simply implement the `MemoryStorage` interface:
461
+
462
+ ```typescript
463
+ class CustomMemoryStorage implements MemoryStorage {
464
+ async load(): Promise<MemoryRecord[]> {
465
+ // Load records from a custom storage
466
+ return customStorage.getItems();
467
+ }
468
+
469
+ async save(records: MemoryRecord[]): Promise<void> {
470
+ // Save records to a custom storage
471
+ await customStorage.setItems(records);
472
+ }
473
+
474
+ async clear(): Promise<void> {
475
+ // Clear records in a custom storage
476
+ await customStorage.clear();
477
+ }
478
+ }
479
+ ```
480
+
481
+ ### Configuring the Memory Feature
482
+
483
+ Enable the memory feature and set up persistence when initializing **AITuberOnAirCore**:
484
+
485
+ ```typescript
486
+ import { AITuberOnAirCore } from './lib/aituber-onair-core';
487
+ import { createMemoryStorage } from './lib/aituber-onair-core/utils/storage';
488
+
489
+ // Create a memory storage (LocalStorage example)
490
+ const memoryStorage = createMemoryStorage('myapp.aiMemoryRecords');
491
+
492
+ // Initialize AITuberOnAirCore
493
+ const aiTuber = new AITuberOnAirCore({
494
+ // Other options...
495
+
496
+ // Memory options
497
+ memoryOptions: {
498
+ enableSummarization: true,
499
+ shortTermDuration: 60 * 1000, // 1 minute (ms)
500
+ midTermDuration: 4 * 60 * 1000, // 4 minutes
501
+ longTermDuration: 9 * 60 * 1000, // 9 minutes
502
+ maxMessagesBeforeSummarization: 20,
503
+ maxSummaryLength: 256,
504
+ memoryRetentionPeriod: 60 * 60 * 1000, // 1 hour
505
+ },
506
+
507
+ // Memory storage
508
+ memoryStorage: memoryStorage,
509
+ });
510
+ ```
511
+
512
+ ### Memory-Related Events
513
+
514
+ The memory feature triggers the following events:
515
+
516
+ - `memoriesLoaded`: When memory is loaded from storage
517
+ - `memoryCreated`: When a new memory record is created
518
+ - `memoriesRemoved`: When a memory record is deleted
519
+ - `memoriesSaved`: When memory records are saved to storage
520
+ - `storageCleared`: When the storage is cleared
521
+
522
+ These events are emitted by the `MemoryManager` instance internally, so you typically need a reference to the internal component to use them.
523
+
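+ A hedged sketch of listening for these events, assuming `MemoryManager` exposes the same `on` listener API as the other internal components (direct access to the internal instance is shown for illustration only):
+
+ ```typescript
+ // Accessing the internal component (not recommended for production)
+ const memoryManager = aiTuber['memoryManager'];
+
+ if (memoryManager) {
+   memoryManager.on('memoryCreated', (record) => {
+     console.log('New memory record created:', record);
+   });
+
+   memoryManager.on('memoriesSaved', () => {
+     console.log('Memory records saved to storage');
+   });
+ }
+ ```
+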
524
+ ### Memory Cleanup
525
+
526
+ Over time, memory records may grow and consume storage space. **AITuberOnAirCore** automatically removes old memories beyond the set retention period (default is 1 hour).
527
+
528
+ - `cleanupOldMemories` is invoked automatically during user input processing.
529
+ - You can manually trigger a cleanup if necessary.
530
+
531
+ ```typescript
532
+ // Clear both chat history and memory
533
+ aiTuber.clearChatHistory();
534
+
535
+ // Or access the memory manager directly (not recommended for production)
536
+ const memoryManager = aiTuber['memoryManager'];
537
+ if (memoryManager) {
538
+ await memoryManager.cleanupOldMemories();
539
+ }
540
+ ```
541
+
542
+ ## Examples
543
+
544
+ ### Vision (Image) Input Processing
545
+
546
+ ```typescript
547
+ // Obtain image data URL (e.g., via camera capture)
548
+ const imageDataUrl = captureScreenshot();
549
+
550
+ // Basic vision processing with default prompt
551
+ await aituber.processVisionChat(imageDataUrl);
552
+
553
+ // Vision processing with a custom prompt
554
+ await aituber.processVisionChat(
555
+ imageDataUrl,
556
+ 'Analyze the broadcast screen and provide entertaining comments for viewers.'
557
+ );
558
+ ```
559
+
560
+ ### Custom Summarization Prompts
561
+
562
+ ```typescript
563
+ // Using a custom summarization prompt
564
+ const aiTuberCore = new AITuberOnAirCore({
565
+ openAiKey: 'your_api_key',
566
+ chatOptions: { /* ... */ },
567
+ memoryOptions: {
568
+ enableSummarization: true,
569
+ // Other memory settings
570
+ summaryPromptTemplate: 'Please summarize the following conversation in under {maxLength} characters, highlighting the key points.',
571
+ },
572
+ });
573
+ ```
574
+
575
+ ### Synchronized Speech Playback
576
+
577
+ ```typescript
578
+ // Example of waiting for speech playback to finish (handleSpeakAi is an application-level helper, not part of this library)
579
+ async function playSequentially() {
580
+ // Wait for the listener's speech playback
581
+ await handleSpeakAi(
582
+ listenerScreenplay,
583
+ listenerVoiceType,
584
+ listenerSpeaker,
585
+ openAiKey
586
+ );
587
+
588
+ console.log('Listener speech playback has finished');
589
+
590
+ // AI avatar response
591
+ await aituber.processChat('Hello, any updates on the show so far?');
592
+ }
593
+ ```
594
+
595
+ ## Integration with Existing Applications
596
+
597
+ AITuberOnAirCore can be integrated into existing applications relatively easily. For example:
598
+
599
+ 1. Initialize with relevant API keys or settings at application startup.
600
+ 2. Set up event listeners to handle various stages of processing.
601
+ 3. Call the appropriate methods (`processChat`, `processVisionChat`, etc.) when a user or vision input occurs.
602
+
603
+ ```typescript
604
+ // Example in App.tsx
605
+ useEffect(() => {
606
+ // If AITuberOnAirCore is already initialized, set up event listeners
607
+ if (aituberOnairCore) {
608
+ // Clear old listeners
609
+ aituberOnairCore.offAll();
610
+
611
+ // Register new listeners
612
+ aituberOnairCore.on(AITuberOnAirCoreEvent.PROCESSING_START, () => {
613
+ setChatProcessing(true);
614
+ setAssistantMessage('Loading...');
615
+ });
616
+
617
+ aituberOnairCore.on(AITuberOnAirCoreEvent.ASSISTANT_PARTIAL, (text) => {
618
+ setAssistantMessage((prev) => {
619
+ if (prev === 'Loading...') return text;
620
+ return prev + text;
621
+ });
622
+ });
623
+
624
+ // Other event listeners...
625
+ }
626
+ }, [aituberOnairCore]);
627
+ ```
628
+
629
+ In real-world applications, you might update the speech engine settings when the user changes preferences, toggle the memory feature on or off, and so on. Though optimized for AITuber OnAir, it's flexible enough to be embedded into custom AITuber apps.
630
+
631
+ ## Testing & Development
632
+
633
+ **AITuberOnAirCore** includes a comprehensive test suite to ensure quality and stability.
634
+
635
+ ### Test Structure
636
+
637
+ Tests are organized in the following directory structure:
638
+
639
+ ```
640
+ tests/
641
+ ├── core/ # Tests for core components
642
+ ├── services/ # Tests for services (speech, chat, etc.)
643
+ ├── utils/ # Tests for utility functions
644
+ └── README.md # Detailed info on the test structure
645
+ ```
646
+
647
+ ### Naming Conventions
648
+
649
+ - Test files use the `.test.ts` suffix (e.g., `AITuberOnAirCore.test.ts`).
650
+ - There should be a corresponding test file for each source file.
651
+
652
+ ### Running Tests
653
+
654
+ The test framework uses **Vitest**:
655
+
656
+ ```bash
657
+ # Navigate to the AITuberOnAirCore root directory
658
+ cd src/lib/aituber-onair-core
659
+
660
+ # Run all tests
661
+ npm test
662
+
663
+ # Watch mode (automatically reruns tests on file changes)
664
+ npm run test:watch
665
+
666
+ # Generate coverage report
667
+ npm run test:coverage
668
+ ```
669
+
670
+ ### Writing Tests
671
+
672
+ Follow these guidelines:
673
+
674
+ 1. Use the Arrange-Act-Assert pattern.
675
+ 2. Properly mock external dependencies.
676
+ 3. Keep tests isolated and independent.
677
+ 4. Test both success and error cases.
678
+
679
+ **Example**:
680
+
681
+ ```typescript
682
+ import { describe, it, expect } from 'vitest';
683
+ import { AITuberOnAirCore } from '../../core/AITuberOnAirCore';
684
+
685
+ describe('AITuberOnAirCore', () => {
686
+ describe('constructor', () => {
687
+ it('initializes properly with valid options', () => {
688
+ // Arrange
689
+ const options = { /* ... */ };
690
+
691
+ // Act
692
+ const instance = new AITuberOnAirCore(options);
693
+
694
+ // Assert
695
+ expect(instance).toBeDefined();
696
+ });
697
+ });
698
+ });
699
+ ```
700
+
701
+ ### Coverage Requirements
702
+
703
+ Particularly high test coverage is sought for:
704
+
705
+ - Core functionality
706
+ - Public APIs
707
+ - Edge cases
708
+ - Error handling
709
+
710
+ ### Setting Up the Development Environment
711
+
712
+ You will need:
713
+
714
+ 1. Node.js (version 20 or higher)
715
+ 2. npm (version 10 or higher)
716
+
717
+ ```bash
718
+ # Install dependencies
719
+ npm install
720
+
721
+ # Run the test suite
722
+ npm test
723
+ ```