@flink-app/openai-adapter 2.0.0-alpha.48

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
The MIT License

Copyright (c) Frost Experience AB https://www.frost.se

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
package/README.md ADDED
# @flink-app/openai-adapter

OpenAI adapter for the Flink AI framework, built on the **Responses API** - OpenAI's modern API that provides step-aware reasoning, explicit tool invocation, and better performance.

## Why Responses API?

This adapter uses OpenAI's **Responses API** instead of the older Chat Completions API, providing:

- **🔧 Step-aware reasoning**: The model returns multiple tool calls as explicit, typed items in a single response
- **⚡ Better performance**: 3% improvement on SWE-bench and 5% on TAUBench versus Chat Completions
- **💰 Lower costs**: 40-80% better cache utilization via response persistence
- **📦 First-class tool steps**: Tool calls and results are structured items, not message hacks
- **🎯 Future-proof**: New OpenAI features land in the Responses API first

## Important: Understanding the Agent Loop

**The Responses API does NOT run your agent loop.**

What it provides:
- Multiple tool calls in **one response** (you still execute them)
- Structured step types (`message`, `function_call`, `function_call_output`)
- Optional response persistence for caching

What **Flink handles** (via AgentRunner):
- Tool execution
- Multi-turn loops (API → execute tools → API → execute tools → done)
- Deciding when to stop
- Managing conversation state

This separation is intentional: it gives you full control over agent behavior while leveraging better API primitives.
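The multi-turn loop described above can be sketched in a few lines of TypeScript. Everything here (`mockResponsesApi`, `executeTool`, `runAgent`) is an illustrative stand-in, not the adapter's or AgentRunner's actual internals:

```typescript
// Sketch of the loop Flink's AgentRunner drives. All names are hypothetical.
type Item =
  | { type: "message"; role: string; content: string }
  | { type: "function_call"; name: string; call_id: string; arguments: string }
  | { type: "function_call_output"; call_id: string; output: string };

// Stand-in for one Responses API call: requests a tool on the first turn,
// answers once a function_call_output is present in the input.
function mockResponsesApi(input: Item[]): Item[] {
  const toolDone = input.some((item) => item.type === "function_call_output");
  if (!toolDone) {
    return [
      { type: "function_call", name: "get_order_status", call_id: "c1", arguments: '{"orderId":"42"}' },
    ];
  }
  return [{ type: "message", role: "assistant", content: "Your order has shipped." }];
}

// Pretend tool execution -- in Flink this is your registered tool handler.
function executeTool(call: { name: string; arguments: string }): string {
  return JSON.stringify({ status: "shipped" });
}

// One agent run: call the API, execute any requested tools, append their
// outputs to the input, and repeat until the response has no tool calls.
function runAgent(userMessage: string): string {
  let input: Item[] = [{ type: "message", role: "user", content: userMessage }];
  for (let turn = 0; turn < 5; turn++) {
    const output = mockResponsesApi(input);
    input = input.concat(output);
    const calls = output.filter(
      (item): item is Extract<Item, { type: "function_call" }> => item.type === "function_call"
    );
    if (calls.length === 0) {
      const message = output.find((item) => item.type === "message");
      return message && message.type === "message" ? message.content : "";
    }
    for (const call of calls) {
      input.push({ type: "function_call_output", call_id: call.call_id, output: executeTool(call) });
    }
  }
  throw new Error("Max turns exceeded");
}

const answer = runAgent("Where is my order?");
```

Flink's AgentRunner performs exactly this orchestration for you; the sketch only shows why tool execution and the stop condition stay on your side of the API.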
## Installation

```bash
npm install @flink-app/openai-adapter
# or
pnpm add @flink-app/openai-adapter
```

The `openai` package is included as a dependency, so you don't need to install it separately.

## Usage

### Basic Setup

```typescript
import { OpenAIAdapter } from "@flink-app/openai-adapter";
import { FlinkApp } from "@flink-app/flink";

const app = new FlinkApp({
  ai: {
    llms: {
      default: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-5"
      }),
    },
  },
});

await app.start();
```

**Legacy API (still supported):**
```typescript
// Backward-compatible constructor
new OpenAIAdapter(process.env.OPENAI_API_KEY!, "gpt-5")
```
### Agent Instructions

Define your agent's behavior using the `instructions` property:

```typescript
// src/agents/support_agent.ts
export const Agent: FlinkAgentProps = {
  name: "support_agent",
  instructions: "You are a helpful customer support agent.",
  tools: ["get_order_status"],
  model: { adapterId: "default" },
};
```

**How it works:**
- Instructions are prepended as a system message to every conversation
- Follows the Vercel AI SDK pattern for consistency
- Provides stable agent behavior across all interactions

#### Dynamic Context with System Messages

For per-request context, add system messages to the conversation:

```typescript
const result = await ctx.agents.myAgent.execute({
  message: [
    { role: "system", content: "Current user tier: Premium" },
    { role: "user", content: "What can I do?" }
  ]
});
```

**Order of messages sent to OpenAI:**
1. Agent `instructions` (as a system message)
2. User-provided system messages (if any)
3. Conversation messages

This gives you both static agent behavior and dynamic per-request context.
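Assembled concretely, the ordering above means the adapter ends up sending something like the following (a sketch with illustrative field names, not the adapter's actual wire format):

```typescript
// Hypothetical assembly of the outgoing message list.
const instructions = "You are a helpful customer support agent."; // from the agent definition

const outgoing = [
  { role: "system", content: instructions },                 // 1. agent instructions
  { role: "system", content: "Current user tier: Premium" }, // 2. per-request system messages
  { role: "user", content: "What can I do?" },               // 3. conversation messages
];

const roleOrder = outgoing.map((m) => m.role).join(",");
```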
### Structured Outputs

Enable structured outputs with a JSON schema for fully reliable output formatting:

```typescript
import { OpenAIAdapter } from "@flink-app/openai-adapter";

const adapter = new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-5",
  structuredOutput: {
    type: "json_schema",
    name: "car_analysis",
    description: "Analysis of car specifications",
    schema: {
      type: "object",
      properties: {
        brand: { type: "string" },
        model: { type: "string" },
        year: { type: "number" },
        features: {
          type: "array",
          items: { type: "string" }
        }
      },
      required: ["brand", "model", "year"],
      additionalProperties: false
    },
    strict: true // Enforces 100% schema adherence
  }
});
```

**Benefits of Structured Outputs:**
- 100% reliability (vs ~95% with JSON mode)
- No need for retry logic or manual validation
- Automatic schema validation during generation
- Supported on all modern models: `gpt-4.1`, `gpt-5`, `o4-mini`, `o3`
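Consuming the structured result is then a plain `JSON.parse`. In this sketch the reply is mocked, and the `content` field name is an assumption for illustration, not a documented part of the adapter's response:

```typescript
// Mocked reply text -- with strict: true the model's output is guaranteed
// to match the schema, so JSON.parse needs no retry or validation layer.
// The `content` field carrying the JSON text is an assumption in this sketch.
const result = { content: '{"brand":"Volvo","model":"XC90","year":2024}' };

interface CarAnalysis {
  brand: string;
  model: string;
  year: number;
  features?: string[];
}

const analysis: CarAnalysis = JSON.parse(result.content);
```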
### Zero Data Retention (ZDR)

For organizations with compliance or data retention requirements:

```typescript
const adapter = new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-5",
  persistResponse: false // Don't store responses on OpenAI servers
});
```

**What `persistResponse` does:**
- `true` (default): OpenAI stores the response for caching and retrieval via `response_id`
- `false`: No data stored on OpenAI servers (ZDR compliance)

**What it does NOT do:**
- It does NOT automatically manage conversation state
- You still need to pass the full conversation history in messages
- It's purely about server-side response persistence

**Note**: OpenAI automatically enforces `persistResponse: false` for Zero Data Retention organizations.
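Because `persistResponse` is purely about server-side persistence, conversation state still accumulates in your code. A minimal sketch, with the adapter call mocked out:

```typescript
// Conversation state lives in your application, whatever persistResponse is set to.
type Msg = { role: "user" | "assistant"; content: string };
const history: Msg[] = [];

// Each turn sends the FULL history, not just the newest message.
function userTurn(text: string): void {
  history.push({ role: "user", content: text });
  // ...here the whole `history` array would go to the adapter...
  history.push({ role: "assistant", content: `reply to: ${text}` }); // mocked reply
}

userTurn("Hello");
userTurn("What did I just say?");
```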
## Multiple Adapters

You can register multiple OpenAI adapters with different configurations:

```typescript
import { OpenAIAdapter } from "@flink-app/openai-adapter";
import { FlinkApp } from "@flink-app/flink";

const app = new FlinkApp({
  ai: {
    llms: {
      // Default GPT-5 - best for general tasks
      default: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-5"
      }),

      // Fast reasoning model - cost-efficient
      fast: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "o4-mini"
      }),

      // Maximum intelligence for complex reasoning
      smart: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "o3"
      }),

      // With structured output
      structured: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-5",
        structuredOutput: {
          type: "json_schema",
          name: "response",
          schema: { /* your schema */ },
          strict: true
        }
      }),
    },
  },
});
```
## Using in Agents

Reference the adapter by its registered ID in your agent configuration:

```typescript
// src/agents/support_agent.ts
import { FlinkAgentProps } from "@flink-app/flink";

export const Agent: FlinkAgentProps = {
  name: "support_agent",
  description: "Customer support assistant",
  instructions: "You are a helpful customer support agent.",
  tools: ["get_order_status", "search_knowledge_base"],
  model: {
    adapterId: "default", // Uses the "default" adapter
    maxTokens: 2000,
    temperature: 0.7,
  },
};
```

## Supported Models

This adapter works with all OpenAI models available via the Responses API. The latest models (as of 2026) offer significant improvements:

### GPT-5 Series (Recommended)

- **GPT-5**: `gpt-5`
  - Latest and most capable model
  - Best for general-purpose applications
  - Excellent at coding, reasoning, and agentic tasks

### GPT-4.1 Series

- **GPT-4.1**: `gpt-4.1`
  - Smartest non-reasoning model
  - Excellent at coding tasks
  - Strong at precise instruction following
  - Best for web development and technical tasks

- **GPT-4.1 mini**: `gpt-4.1-mini`
  - Smaller, faster, and more cost-efficient
  - Good balance of capability and cost

- **GPT-4.1 nano**: `gpt-4.1-nano`
  - Ultra-fast and cost-efficient
  - Best for simple, high-volume tasks

### O-Series Reasoning Models

- **o4-mini**: `o4-mini` (recommended for reasoning tasks)
  - Fast, cost-efficient reasoning model
  - Best-performing on AIME 2024 and 2025 benchmarks
  - Optimized for mathematical and logical reasoning

- **o3**: `o3`
  - Advanced reasoning model for complex tasks
  - State-of-the-art performance on coding, math, and science
  - Excellent at Codeforces, SWE-bench, and MMMU

- **o3-pro**: `o3-pro`
  - Premium reasoning model (Pro users only)
  - Designed to think longer and provide the most reliable responses

### Legacy Models

For backwards compatibility:

- **GPT-4 Turbo**: `gpt-4-turbo`
- **GPT-4**: `gpt-4`
- **GPT-3.5 Turbo**: `gpt-3.5-turbo`

**Note**: Some older models (GPT-4o, early GPT-4.1 variants) are being retired in 2026. Migrate to the latest models for continued support.
## Model Selection Guide

| Use Case | Recommended Model | Why |
|----------|------------------|-----|
| General development | `gpt-5` | Latest and most capable |
| Coding & technical | `gpt-4.1` | Best instruction following |
| High-volume tasks | `gpt-4.1-mini` | Cost-efficient with good performance |
| Mathematical reasoning | `o4-mini` | Optimized for math; fast and cost-efficient |
| Complex problem-solving | `o3` | State-of-the-art reasoning |
| Mission-critical | `o3-pro` | Maximum reliability (Pro users) |

## Features

- ✅ **Step-based tool calling** - Multiple tool calls in one response as typed items
- ✅ **Event-based streaming** - A proper event taxonomy, not just token streaming
- ✅ **Structured outputs** with JSON schema (100% reliability)
- ✅ **Response persistence** for better caching (optional)
- ✅ **First-class tool steps** - `function_call` and `function_call_output` as explicit types
- ✅ **Zero Data Retention** mode for compliance
- ✅ Support for all OpenAI models
- ✅ 40-80% cost savings via better caching
- ✅ 3-5% performance improvement over Chat Completions
### What Makes Responses API Different

The key difference is **how tool calls are represented**, not who executes them:

**Chat Completions (old):**
```
Response: {
  message: {
    content: "...",
    tool_calls: [...] // Tool calls embedded in the message
  }
}

You: Extract tool calls, execute them, create new messages, call the API again
```

**Responses API (new):**
```
Response: {
  output: [
    { type: "message", content: "..." },
    { type: "function_call", name: "...", call_id: "..." },
    { type: "function_call", name: "...", call_id: "..." } // Multiple calls!
  ]
}

You: Extract tool calls, execute them, create function_call_output items, call the API again
```

**Key improvements:**
- **Multiple tool calls per response**: The model can request several tools at once
- **Explicit step types**: No more message role gymnastics
- **Better structure**: Dedicated `function_call_output` items instead of results crammed into user messages
- **Clearer semantics**: Steps are first-class, not message metadata

**Still your responsibility:**
- Executing the tools
- Deciding when to stop the loop
- Managing conversation history
## API

### `OpenAIAdapter`

```typescript
interface OpenAIAdapterOptions {
  apiKey: string;
  model: string;
  structuredOutput?: {
    type: "json_schema";
    name: string;
    description?: string;
    schema: Record<string, any>;
    strict?: boolean;
  };
  persistResponse?: boolean; // Default: true
}

class OpenAIAdapter implements LLMAdapter {
  constructor(options: OpenAIAdapterOptions);
  constructor(apiKey: string, model: string); // Legacy
}
```

#### Parameters

- `apiKey`: Your OpenAI API key
- `model`: The OpenAI model to use (e.g., "gpt-5", "o4-mini")
- `structuredOutput`: Optional JSON schema for structured outputs
- `persistResponse`: Whether to persist responses on OpenAI servers for caching (default: `true`)
## Architecture Notes

### Responses API vs Chat Completions

This adapter uses OpenAI's **Responses API**, which differs from Chat Completions in several ways:

**Request Format:**
- Chat Completions: `messages` array with system/user/assistant roles
- Responses API: `input` array with typed items (messages, function_call_outputs, etc.) plus a separate `instructions` field

**Response Format:**
- Chat Completions: `choices[0].message.content`
- Responses API: `output` array of items with type `message`, `function_call`, etc.

**Tool/Function Format:**
- Chat Completions: Externally-tagged `{ type: "function", function: {...} }`
- Responses API: Internally-tagged `{ type: "function", name: "...", ... }` (strict by default)

**Structured Outputs:**
- Chat Completions: `response_format: { type: "json_schema", json_schema: {...} }`
- Responses API: `text: { format: { type: "json_schema", ... } }`

**State Management:**
- Chat Completions: Manual conversation state management
- Responses API: Optional response persistence with `persistResponse: true` (for caching, not automatic state replay)
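Put side by side, the two request shapes discussed above look roughly like this (trimmed sketches of the JSON bodies; exact fields may vary by SDK version):

```typescript
// Chat Completions request body (old shape, trimmed)
const chatCompletionsBody = {
  model: "gpt-5",
  messages: [
    { role: "system", content: "You are a helpful agent." },
    { role: "user", content: "Hi" },
  ],
  // Externally-tagged tool definition
  tools: [{ type: "function", function: { name: "get_order_status", parameters: {} } }],
  response_format: { type: "json_schema", json_schema: { name: "reply", schema: {} } },
};

// Responses API request body (new shape, trimmed)
const responsesBody = {
  model: "gpt-5",
  instructions: "You are a helpful agent.", // separate instructions field
  input: [{ role: "user", content: "Hi" }],
  // Internally-tagged tool definition, strict by default
  tools: [{ type: "function", name: "get_order_status", parameters: {}, strict: true }],
  text: { format: { type: "json_schema", name: "reply", schema: {}, strict: true } },
  store: true, // response persistence (persistResponse)
};
```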
### Flink Integration

The adapter integrates with Flink's `LLMAdapter` interface:

- Flink's `instructions` → Responses API `instructions`
- Flink's `messages` → Converted to Responses API `input` items (typed steps)
- Flink's tool schema → Converted to the Responses API function format (internally-tagged, strict by default)
- Responses `output` → Extracted into Flink's `LLMResponse` format

**Each API call is one turn:**
- Flink calls `adapter.execute()` → One Responses API request
- The response may contain multiple tool calls (as separate items)
- Flink's AgentRunner executes those tools
- Flink calls `adapter.execute()` again with the tool results → Another Responses API request
- Repeat until there are no more tool calls

This is the standard agent loop architecture used by modern frameworks (LangGraph, Vercel AI SDK, etc.).
## Migration from Chat Completions

If you're coming from the Chat Completions API, the good news is: **no code changes are needed.**

The adapter handles all the differences internally:

```typescript
// This works the same way with both APIs
const app = new FlinkApp({
  ai: {
    llms: {
      default: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-5" // Just update the model
      }),
    },
  },
});
```

Benefits of upgrading:
- ✅ Step-aware reasoning (multiple tool calls per response)
- ✅ Better performance (3-5% improvement on benchmarks)
- ✅ Lower costs (40-80% better caching via response persistence)
- ✅ First-class tool steps (cleaner than message role hacks)
- ✅ Future-proof (new features land here first)
## Requirements

- Node.js >= 18
- @flink-app/flink >= 1.0.0
- openai >= 4.77.0 (with Responses API support)

## License

MIT

## Resources

- [OpenAI Responses API Documentation](https://platform.openai.com/docs/api-reference/responses)
- [Migrate to the Responses API Guide](https://platform.openai.com/docs/guides/migrate-to-responses)
- [Why we built the Responses API](https://developers.openai.com/blog/responses-api/)
- [Structured Outputs Guide](https://platform.openai.com/docs/guides/structured-outputs)