symposium 1.0.0 → 1.0.2
package/README.md
CHANGED
@@ -13,10 +13,22 @@ Symposium is a powerful and flexible Node.js framework for building Large Langua
 
 ## Installation
 
+Requires Node.js v18 or higher.
+
 ```bash
 npm install symposium
 ```
 
+## Configuration
+
+Symposium uses environment variables to configure access to various services. You can set these in a `.env` file at the root of your project.
+
+- `OPENAI_API_KEY`: Required for using OpenAI models and for Real-time Voice Sessions.
+- `ANTHROPIC_API_KEY`: Required for using Anthropic models.
+- `GROQ_API_KEY`: Required for using Groq models.
+- `DEEPSEEK_API_KEY`: Required for using DeepSeek models.
+- `TRANSCRIPTION_MODEL`: (Optional) The name of the model to use for audio transcription (currently, only `gpt-4o-transcribe` is supported).
+
 ## Core Concepts
 
 The framework is built around a few core components:
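Note on the Configuration section added in the hunk above: since each provider key is only needed for the provider actually in use, a startup check that reports unset keys may be more useful than one that fails hard. The following is a minimal sketch under that assumption. `missingProviderKeys` is a hypothetical helper, not part of Symposium, and the sketch does not assume anything about how Symposium itself reads the `.env` file.

```javascript
// Hypothetical helper (not part of Symposium): list provider keys that are unset.
const PROVIDER_KEYS = [
  'OPENAI_API_KEY',
  'ANTHROPIC_API_KEY',
  'GROQ_API_KEY',
  'DEEPSEEK_API_KEY',
];

function missingProviderKeys(env = process.env) {
  // Unset and empty-string values are both treated as missing.
  return PROVIDER_KEYS.filter((key) => !env[key]);
}

const missing = missingProviderKeys();
if (missing.length > 0) {
  // Warn instead of throwing: a key only matters if its provider is used.
  console.warn(`Providers without an API key: ${missing.join(', ')}`);
}
```

Passing a plain object instead of `process.env` makes the check easy to exercise in isolation, for example `missingProviderKeys({ OPENAI_API_KEY: 'sk-test' })`.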
@@ -26,6 +38,9 @@ The framework is built around a few core components:
 - **`Thread`**: Represents a single conversation with an agent. It maintains the message history and the agent's state for that conversation. Each thread has a unique ID.
 - **`Tool`**: A base class for creating tools that an `Agent` can use. Tools expose functions that the LLM can call to interact with external APIs or data.
 - **`Message`**: A wrapper for messages within a `Thread`, containing the role (`user`, `assistant`, `system`, `tool`), content, and other metadata.
+- **`MemoryHandler`**: A class for managing an agent's long-term memory. It can be extended to create custom memory strategies.
+- **`Summarizer`**: A utility agent for summarizing text or conversations.
+- **`Logger`**: A simple logging utility that can be passed to an agent to log its activity.
 
 ## Getting Started
 
@@ -85,12 +100,25 @@ async function main() {
   emitter.on('output', (content) => {
     process.stdout.write(content);
   });
+
+  emitter.on('error', (error) => {
+    console.error(`\nAn error occurred: ${error.message}`);
+  });
+
+  emitter.on('partial', (text) => {
+    console.log(`\n> ${text}\n`);
+  });
 }
 
 main();
 ```
 
-When you run this, the agent will respond to your message, and the response will be streamed to the console. The `message` method returns an `EventEmitter` that emits
+When you run this, the agent will respond to your message, and the response will be streamed to the console. The `message` method returns an `EventEmitter` that emits several events:
+
+- `start`: Emitted when the agent begins processing the message. The `thread` object is passed as an argument.
+- `output`: Emitted for each chunk of text in the response stream.
+- `partial`: Emitted to provide insight into the agent's internal state, like when it decides to use a tool.
+- `error`: Emitted if an error occurs during processing.
 
 ## Advanced Usage
 
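Note on the event list added in the hunk above: the four documented events can be handled together and the `output` chunks joined into the full reply. The sketch below uses a plain Node.js `EventEmitter` as a stand-in for the emitter returned by `agent.message()`. The event order, the payload values, and the `id` property on the thread object are assumptions made for illustration; `recordEvents` is a hypothetical helper.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: attach one handler per documented event and
// record what arrives, in order.
function recordEvents(emitter) {
  const log = [];
  emitter.on('start', (thread) => log.push(['start', thread.id]));
  emitter.on('partial', (text) => log.push(['partial', text]));
  emitter.on('output', (chunk) => log.push(['output', chunk]));
  emitter.on('error', (error) => log.push(['error', error.message]));
  return log;
}

// Stand-in emitter, NOT Symposium's implementation: payloads are made up.
const emitter = new EventEmitter();
const log = recordEvents(emitter);

emitter.emit('start', { id: 'thread-1' }); // assumed thread shape
emitter.emit('partial', 'calling get_weather');
emitter.emit('output', 'Sunny in ');
emitter.emit('output', 'Paris.');

// Join the streamed chunks back into the complete response text.
const reply = log
  .filter(([name]) => name === 'output')
  .map(([, chunk]) => chunk)
  .join('');
console.log(reply);
```

Registering an `error` listener is worth doing in any case: a Node.js `EventEmitter` throws when an `error` event is emitted and no listener is attached.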
@@ -158,6 +186,43 @@ const emitter = await agent.message("What's the weather like in Paris?");
 
 The agent's underlying LLM will now be able to see the `get_weather` function and will call it when appropriate, passing the result back into the conversation.
 
+### Real-time Voice and Transcription
+
+Symposium has built-in support for audio transcription and real-time voice sessions, currently powered by OpenAI.
+
+#### Audio Transcription
+
+You can send audio content directly in a message. If the model doesn't support audio input, Symposium will automatically transcribe it to text.
+
+```javascript
+// Transcribing an audio file from a URL
+const emitter = await agent.message([
+  {
+    type: 'audio',
+    content: {
+      type: 'url',
+      data: 'http://example.com/audio.mp3'
+    }
+  }
+]);
+```
+
+You can also use the static `Symposium.transcribe()` method for standalone transcription.
+
+#### Real-time Voice Sessions
+
+For interactive voice conversations, you can create a real-time session. This is useful for building voice bots.
+
+```javascript
+// (inside an async function)
+const { response, thread } = await agent.createRealtimeSession();
+const sessionId = response.id;
+const clientSecret = response.client_secret.value;
+
+// You would then use this session ID and client secret on the client-side
+// to connect to the real-time session endpoint.
+```
+
 ### Switching Models
 
 You can set a default model for an agent or change it on a per-thread basis.
@@ -271,6 +336,12 @@ This is a high-level overview. For details, please refer to the source code.
|
|
|
271
336
|
- `getFunctions()`: **Abstract**. Must return an array of function definitions that the LLM can call.
|
|
272
337
|
- `callFunction(thread, name, payload)`: **Abstract**. Called when the LLM decides to use one of the tool's functions.
|
|
273
338
|
|
|
339
|
+
### Other Classes
|
|
340
|
+
|
|
341
|
+
- **`MemoryHandler`**: Provides a base for implementing long-term memory for an agent.
|
|
342
|
+
- **`Summarizer`**: A utility agent for text summarization.
|
|
343
|
+
- **`Logger`**: A simple logger for agent activity.
|
|
344
|
+
|
|
274
345
|
## License
|
|
275
346
|
|
|
276
347
|
ISC
|
|
@@ -1,8 +1,8 @@
 import OpenAIModel from "./OpenAIModel.js";
 
-export default class
+export default class Gpt4oTranscribe extends OpenAIModel {
   type = 'stt';
-  name = '
+  name = 'gpt-4o-transcribe';
 
   async transcribe(file, prompt = null) {
     const response = await this.getOpenAi().audio.transcriptions.create({