@makaio/adapter-openai-node 1.0.0-dev-1779051654000 → 1.0.0-dev-1781449862362

package/LICENSE CHANGED
@@ -1,6 +1,6 @@
1
1
  MIT License
2
2
 
3
- Copyright (c) 2026-present Makaio GmbH
3
+ Copyright (c) Makaio GmbH
4
4
 
5
5
  Permission is hereby granted, free of charge, to any person obtaining a copy
6
6
  of this software and associated documentation files (the "Software"), to deal
@@ -0,0 +1,206 @@
1
+ # OpenAI-Compatible API Fixtures
2
+
3
+ Real streaming chunks captured from various OSS models via OpenAI-compatible APIs (NanoGPT, OpenRouter).
4
+ These fixtures document the quirks and variations in how different models implement tool calling.
5
+
6
+ ## Models Overview
7
+
8
+ | Model | Tool Call Method | finish_reason | Quirks |
9
+ |-------|-----------------|---------------|--------|
10
+ | Qwen-3-VL | `tool_calls` array | `tool_calls` | **Baseline** - standard behavior |
11
+ | GLM-4.6 | `tool_calls` array | `tool_calls` | Trailing `{}` in args |
12
+ | GLM-4.5-Air | XML in content + empty `tool_calls` | `tool_calls` | Custom XML format |
13
+ | DeepSeek-v3.2 | XML in content only | `stop` | No tool_calls array |
14
+ | Nemotron-Nano | `tool_calls` array | `tool_calls` | Wrong param names, extra fields, repeated final args |
15
+
16
+ ---
17
+
18
+ ## Detailed Analysis
19
+
20
+ ### Qwen-3-VL (`qwen3-vl-235b-a22b-thinking`) - BASELINE
21
+
22
+ **Standard OpenAI-compatible behavior.** Use this as reference for expected format.
23
+
24
+ ```
25
+ Chunk flow:
26
+ 1. role: 'assistant'
27
+ 2. reasoning (multiple chunks)
28
+ 3. content: '\n\n'
29
+ 4. tool_calls[0]: { id, name: 'write_file', arguments: '' }
30
+ 5. tool_calls[0]: { arguments: '{"path": "...' } (streamed)
31
+ 6. tool_calls[0]: { arguments: '..."}' }
32
+ 7. finish_reason: 'tool_calls' + usage
33
+ ```
34
+
35
+ **Aggregated tool call:**
36
+ ```json
37
+ {
38
+ "id": "call_1910c9d8f6224ad591709f28",
39
+ "type": "function",
40
+ "function": {
41
+ "name": "write_file",
42
+ "arguments": "{\"path\": \"...\", \"content\": \"HELLO\"}"
43
+ }
44
+ }
45
+ ```
46
+
47
+ ---
48
+
49
+ ### GLM-4.6 (`z-ai/glm-4.6:thinking`)
50
+
51
+ **Quirk: Sends tool arguments in TWO chunks, second chunk is just `{}`**
52
+
53
+ ```
54
+ Chunk flow:
55
+ 1. role: 'assistant'
56
+ 2. reasoning (multiple chunks)
57
+ 3. tool_calls[0]: { id, name, arguments: '{"path":"...","content":"HELLO"}' } <- FULL JSON
58
+ 4. tool_calls[0]: { id, arguments: '{}' } <- EXTRA CHUNK
59
+ 5. finish_reason: 'tool_calls' + usage
60
+ ```
61
+
62
+ **Problem:** Naive concatenation produces malformed JSON:
63
+ ```
64
+ {"path":"/tmp/test.txt","content":"HELLO"}{}
65
+ ```
66
+
67
+ **Detection:** `arguments.match(/\}\s*\{\s*\}$/)`
68
+
69
+ **Fix:** Strip trailing `{}` from concatenated arguments.
70
+
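The GLM-4.6 fix above can be sketched as a small post-aggregation helper. This is a minimal sketch, not the package's implementation: the name `stripTrailingEmptyObject` is hypothetical, and the extra `JSON.parse` guard is an added assumption so that the strip is only accepted when it yields valid JSON.

```typescript
// Hypothetical helper (not part of the package): remove a trailing `{}` that
// follows an already complete JSON object in aggregated tool arguments.
function stripTrailingEmptyObject(args: string): string {
  // Same pattern as the detection regex, plus tolerance for trailing whitespace.
  const stripped = args.replace(/\}\s*\{\s*\}\s*$/, '}');
  if (stripped === args) {
    return args; // Nothing matched; leave the arguments untouched.
  }
  try {
    // Assumption: only accept the stripped form when it parses as JSON.
    JSON.parse(stripped);
    return stripped;
  } catch {
    return args;
  }
}
```

A lone `{}` (a tool call with no arguments) and nested empty objects such as `{"a":{}}` do not match the pattern, so they pass through unchanged.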
71
+ ---
72
+
73
+ ### GLM-4.5-Air (`zai-org/GLM-4.5-Air:thinking`)
74
+
75
+ **Quirk: Outputs tool calls as custom XML in `content`, sends empty `tool_calls` array**
76
+
77
+ ```
78
+ Chunk flow:
79
+ 1. role: 'assistant'
80
+ 2. reasoning (multiple chunks)
81
+ 3. content: '\n'
82
+ 4. content: '<tool_call>write_file\n<arg_key>content</arg_key>\n<arg_value>HELLO</arg_value>\n'
83
+ 5. content: '<arg_key>path</arg_key>\n<arg_value>/path/to/file</arg_value>\n</tool_call>'
84
+ 6. tool_calls[0]: { id: 'call_0_0', arguments: '{}' } <- EMPTY
85
+ 7. finish_reason: 'tool_calls' + usage
86
+ ```
87
+
88
+ **XML Format (different from DeepSeek!):**
89
+ ```xml
90
+ <tool_call>write_file
91
+ <arg_key>content</arg_key>
92
+ <arg_value>HELLO</arg_value>
93
+ <arg_key>path</arg_key>
94
+ <arg_value>/path/to/file</arg_value>
95
+ </tool_call>
96
+ ```
97
+
98
+ **Detection:** `content.includes('<tool_call>') && toolCalls[0]?.function.arguments === '{}'`
99
+
100
+ **Fix:** Parse XML from content, ignore empty tool_calls array.
101
+
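One way the `<tool_call>` format above could be parsed is sketched here. The function and type names are hypothetical (the README only references `parseGlmAirXml` by name), and the sketch assumes every argument value is a plain string, since the fixture shows no other value types.

```typescript
// Hypothetical result shape; all argument values are kept as raw strings.
interface GlmAirToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Hypothetical parser for the GLM-4.5-Air XML shown in the fixture.
function parseGlmAirToolCalls(content: string): GlmAirToolCall[] {
  const calls: GlmAirToolCall[] = [];
  const callPattern = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let call: RegExpExecArray | null;
  while ((call = callPattern.exec(content)) !== null) {
    const body = call[1] ?? '';
    // The function name is the text between <tool_call> and the first <arg_key>.
    const name = (body.split('<arg_key>')[0] ?? '').trim();
    const args: Record<string, string> = {};
    // A fresh regex per call so lastIndex starts at 0 for each body.
    const argPattern = /<arg_key>([\s\S]*?)<\/arg_key>\s*<arg_value>([\s\S]*?)<\/arg_value>/g;
    let arg: RegExpExecArray | null;
    while ((arg = argPattern.exec(body)) !== null) {
      args[(arg[1] ?? '').trim()] = arg[2] ?? '';
    }
    calls.push({ name, arguments: args });
  }
  return calls;
}
```

To feed the result back into an OpenAI-style `tool_calls` entry, the `arguments` record would be serialized with `JSON.stringify`; converting string values to numbers or booleans is left to schema validation.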
102
+ ---
103
+
104
+ ### DeepSeek-v3.2 (`deepseek/deepseek-v3.2:thinking`)
105
+
106
+ **Quirk: Outputs tool calls as XML in `content`, NO tool_calls array, finish_reason is `stop`**
107
+
108
+ ```
109
+ Chunk flow:
110
+ 1. role: 'assistant'
111
+ 2. reasoning (multiple chunks)
112
+ 3. content: '\n\n<function_calls>\n<invoke name="write_file">\n'
113
+ 4. content: '<parameter name="path" string="true">/path/to/file</parameter>\n'
114
+ 5. content: '<parameter name="content" string="true">HELLO</parameter>\n'
115
+ 6. content: '</invoke>\n</function_calls>'
116
+ 7. finish_reason: 'stop' + usage <- NOT 'tool_calls'!
117
+ ```
118
+
119
+ **XML Format:**
120
+ ```xml
121
+ <function_calls>
122
+ <invoke name="write_file">
123
+ <parameter name="path" string="true">/path/to/file</parameter>
124
+ <parameter name="content" string="true">HELLO</parameter>
125
+ </invoke>
126
+ </function_calls>
127
+ ```
128
+
129
+ **Detection:** `content.includes('<function_calls>') && finishReason === 'stop' && toolCalls.length === 0`
130
+
131
+ **Fix:** Parse XML, synthesize tool_calls array, change finish_reason to `tool_calls`.
132
+
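The DeepSeek fix above might look roughly like the following. This is a sketch under stated assumptions: the names are hypothetical (the README only mentions `parseDeepSeekXml`), the `string="true"` attribute is ignored because the fixture does not show any other value for it, and all parameter values are kept as raw strings.

```typescript
// Hypothetical result shape; parameter values are kept as raw strings.
interface DeepSeekToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Hypothetical parser for the <function_calls> XML shown in the fixture.
function parseDeepSeekFunctionCalls(content: string): DeepSeekToolCall[] {
  const calls: DeepSeekToolCall[] = [];
  if (!content.includes('<function_calls>')) {
    return calls;
  }
  const invokePattern = /<invoke name="([^"]+)">([\s\S]*?)<\/invoke>/g;
  let invoke: RegExpExecArray | null;
  while ((invoke = invokePattern.exec(content)) !== null) {
    const name = invoke[1] ?? '';
    const body = invoke[2] ?? '';
    const args: Record<string, string> = {};
    // Extra attributes such as string="true" are skipped by [^>]*.
    const paramPattern = /<parameter name="([^"]+)"[^>]*>([\s\S]*?)<\/parameter>/g;
    let param: RegExpExecArray | null;
    while ((param = paramPattern.exec(body)) !== null) {
      args[param[1] ?? ''] = param[2] ?? '';
    }
    calls.push({ name, arguments: args });
  }
  return calls;
}
```

A caller would then synthesize `tool_calls` entries from the result and rewrite `finish_reason` from `stop` to `tool_calls`, but only when at least one call was actually parsed.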
133
+ ---
134
+
135
+ ### Nemotron-Nano (`nvidia/nemotron-nano-12b-v2-vl:free`)
136
+
137
+ **Multiple quirks: wrong param names, extra fields, repeated args in final chunk**
138
+
139
+ ```
140
+ Chunk flow:
141
+ 1. role + content: '' + reasoning: null + reasoning_details: []
142
+ 2. reasoning + reasoning_details (multiple chunks)
143
+ 3. reasoning: null (end of reasoning)
144
+ 4. tool_calls[0]: { id, name: 'write_file' } <- name only
145
+ 5. tool_calls[0]: { arguments: '{"file": "/var/...' } <- WRONG PARAM NAME
146
+ 6. tool_calls[0]: { arguments: '..."}' }
147
+ 7. tool_calls[0]: { arguments: '<FULL JSON>' } + finish_reason: 'tool_calls' <- REPEATED
148
+ 8. usage chunk
149
+ ```
150
+
151
+ **Quirks:**
152
+ 1. Uses `file` instead of `path` parameter (model hallucination)
153
+ 2. Has `reasoning_details` array with structured format
154
+ 3. Provider-specific fields: `provider`, `native_finish_reason`
155
+ 4. Final tool_calls chunk repeats FULL arguments (not delta)
156
+ 5. Usage comes in separate chunk after finish_reason
157
+
158
+ **Detection:** Prefer structural detection over model-name checks. The fixture is identified by
159
+ a final `tool_calls` chunk that repeats a full JSON object after arguments have already closed;
160
+ `reasoning_details` and `provider` are useful supporting signals, not required identifiers.
161
+
162
+ **Fix:**
163
+ - For repeated args: During aggregation, if accumulated ends with `}` and incoming starts with `{` (but is not just `{}`), replace instead of concatenate
164
+ - For wrong params: Application-level concern (tool schema validation)
165
+
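The repeated-args rule above can be expressed as a pure function applied to each incoming delta. This is a minimal sketch with a hypothetical name (`appendArguments`); it relies on the observation that `}` directly followed by `{` never occurs in valid JSON outside a string value, so the heuristic could still misfire if a chunk boundary falls inside a string containing `}{`.

```typescript
// Hypothetical aggregation step: returns the new accumulated arguments string.
function appendArguments(accumulated: string, incoming: string): string {
  const looksLikeRepeat =
    accumulated.trim().endsWith('}') && // arguments already closed
    incoming.trim().startsWith('{') && // incoming restarts a JSON object
    incoming.trim() !== '{}'; // GLM-4.6 trailing {} is handled separately
  // Replace on a repeated full payload, otherwise concatenate the delta.
  return looksLikeRepeat ? incoming : accumulated + incoming;
}

// Usage: fold the streamed argument fragments in arrival order.
const nemotronChunks = ['', '{"file": "/var/x', '"}', '{"file": "/var/x"}'];
const aggregated = nemotronChunks.reduce(appendArguments, '');
```

With the fragments above, the final repeated chunk replaces the accumulated value instead of being appended, so `aggregated` is a single JSON object.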
166
+ ---
167
+
168
+ ## Normalization Strategy
169
+
170
+ Stream-bridge should apply normalizations 1-3 **post-aggregation**; normalization 4 runs during aggregation:
171
+
172
+ ```typescript
173
+ // 1. GLM-4.6: Strip trailing {}
174
+ if (/\}\s*\{\s*\}$/.test(args)) {
175
+ args = args.replace(/\}\s*\{\s*\}$/, '}');
176
+ }
177
+
178
+ // 2. DeepSeek: Extract from <function_calls> XML
179
+ if (content.includes('<function_calls>') && finishReason === 'stop' && toolCalls.length === 0) {
180
+ toolCalls = parseDeepSeekXml(content);
181
+ finishReason = 'tool_calls';
182
+ }
183
+
184
+ // 3. GLM-4.5-Air: Extract from <tool_call> XML
185
+ if (content.includes('<tool_call>') && toolCalls[0]?.function.arguments === '{}') {
186
+ toolCalls = parseGlmAirXml(content);
187
+ }
188
+
189
+ // 4. Nemotron: Handle repeated full args in final chunk (during aggregation)
190
+ // If accumulated ends with '}' and incoming starts with '{' but isn't '{}', replace
191
+ if (accumulator.arguments.endsWith('}') && incoming.startsWith('{') && incoming !== '{}') {
192
+ accumulator.arguments = incoming; // Replace, don't concatenate
193
+ }
194
+ ```
195
+
196
+ ---
197
+
198
+ ## Key Differences Summary
199
+
200
+ | Aspect | Standard | GLM-4.6 | GLM-4.5-Air | DeepSeek | Nemotron |
201
+ |--------|----------|---------|-------------|----------|----------|
202
+ | Tool call location | `tool_calls` | `tool_calls` | content XML | content XML | `tool_calls` |
203
+ | finish_reason | `tool_calls` | `tool_calls` | `tool_calls` | `stop` | `tool_calls` |
204
+ | Args streaming | Concatenate | + trailing `{}` | Empty | N/A | + repeated final |
205
+ | XML format | N/A | N/A | `<tool_call>` | `<function_calls>` | N/A |
206
+ | Extra fields | None | None | None | None | `reasoning_details`, `provider`, `native_finish_reason` |
package/dist/index.d.mts CHANGED
@@ -4,11 +4,11 @@ import { BaseStreamAgent, BaseStreamConnector, BaseStreamSession, StreamAdapterS
4
4
  import { z } from "zod";
5
5
  import * as _$_makaio_core0 from "@makaio/framework/core";
6
6
  import { ExtractSubjectPayload, ExtractSubjectResponse, ScopedSubjectDefinition } from "@makaio/framework/core";
7
+ import { DiscoveredAIModel, ResponseSchemaDescriptor, SessionMessageBlock, ToolListItem } from "@makaio/framework/contracts";
7
8
  import * as _$_makaio_bus_core0 from "@makaio/framework/bus";
8
9
  import { ScopedBus } from "@makaio/framework/bus";
9
10
  import { ChatCompletionChunk } from "openai/resources";
10
- import { ChatCompletionTool } from "openai/resources/index.js";
11
- import { DiscoveredAIModel, SessionMessageBlock, ToolListItem } from "@makaio/framework/contracts";
11
+ import { ChatCompletionMessageParam, ChatCompletionTool } from "openai/resources/index.js";
12
12
 
13
13
  //#region src/schemas.d.ts
14
14
  /**
@@ -1740,7 +1740,7 @@ declare class OpenAIConnectorTurn extends ProceduralConnectorTurn<StreamSessionT
1740
1740
  * Each turn rebuilds the messages[] array with full history.
1741
1741
  */
1742
1742
  declare class OpenAIConnectorSession extends BaseStreamSession<OpenAISessionConfig, OpenAIConnectorTurn, MessageCompleteEvent> {
1743
- private messages;
1743
+ protected messages: ChatCompletionMessageParam[];
1744
1744
  /**
1745
1745
  * Mutable tool list for this session.
1746
1746
  * Rebuilt whenever either `nativeTools` or `mcpTools` changes.
@@ -1759,6 +1759,13 @@ declare class OpenAIConnectorSession extends BaseStreamSession<OpenAISessionConf
1759
1759
  * Initialized from `config.reasoningEffort` at session construction.
1760
1760
  */
1761
1761
  private currentReasoningEffort;
1762
+ /**
1763
+ * Per-turn structured-output schema descriptor, captured from the active
1764
+ * {@link MessageHandle} in {@link buildMessages} and consumed by
1765
+ * {@link executeApiCall} when building the `response_format` payload.
1766
+ * Reset to `undefined` at the start of each new turn.
1767
+ */
1768
+ private currentResponseSchema;
1762
1769
  /**
1763
1770
  * Create an OpenAI connector session.
1764
1771
  * @param config - OpenAI session configuration (bus identity, model/cwd, SDK client, and lifecycle hooks).
@@ -1799,6 +1806,38 @@ declare class OpenAIConnectorSession extends BaseStreamSession<OpenAISessionConf
1799
1806
  * @returns Ordered list of tool names matching the next API request
1800
1807
  */
1801
1808
  protected getEffectiveToolNames(): string[];
1809
+ /**
1810
+ * Return true when OpenAI-compatible tool calling must carry the structured
1811
+ * result because the active provider rejects `response_format` alongside tools.
1812
+ * @returns Whether this turn should use the internal finalizer tool
1813
+ */
1814
+ private shouldUseStructuredOutputFinalizer;
1815
+ /**
1816
+ * Build the internal terminal tool used when structured output and normal
1817
+ * tool calling are both active.
1818
+ * @returns OpenAI function tool that submits the final structured result
1819
+ */
1820
+ private createStructuredOutputFinalizerTool;
1821
+ /**
1822
+ * Build the tool list for the next OpenAI request.
1823
+ * @returns Normal tools plus the internal finalizer when needed
1824
+ */
1825
+ protected buildRequestTools(): ChatCompletionTool[];
1826
+ /**
1827
+ * Return the response schema to send via OpenAI `response_format`.
1828
+ *
1829
+ * The schema is intentionally omitted when the finalizer tool is active
1830
+ * because some OpenAI-compatible providers reject `response_format` and
1831
+ * function tools in the same request.
1832
+ * @returns Direct response schema for schema-only turns, otherwise undefined
1833
+ */
1834
+ protected getRequestResponseSchema(): ResponseSchemaDescriptor | undefined;
1835
+ /**
1836
+ * Instruction prepended to the current user turn when the internal finalizer
1837
+ * tool is required.
1838
+ * @returns Internal turn instruction or an empty string
1839
+ */
1840
+ private getStructuredOutputFinalizerInstruction;
1802
1841
  /**
1803
1842
  * Create an OpenAI connector turn for the given message handle.
1804
1843
  * @param handle - The message handle this turn will process
@@ -1818,10 +1857,26 @@ declare class OpenAIConnectorSession extends BaseStreamSession<OpenAISessionConf
1818
1857
  *
1819
1858
  * Ensures the configured system prompt is always at index 0 without
1820
1859
  * duplicating existing history system entries.
1860
+ *
1861
+ * Also snapshots {@link MessageHandle.responseSchema} into
1862
+ * {@link currentResponseSchema} so {@link executeApiCall} can forward it to
1863
+ * the OpenAI `response_format` field without re-reading the handle.
1821
1864
  * @param handle - The message handle containing history
1822
1865
  * @param mergedContent - Optional content from superseded/merged messages
1823
1866
  */
1824
1867
  protected buildMessages(handle: MessageHandle, mergedContent?: string[]): void;
1868
+ /**
1869
+ * Return the current OpenAI chat history length.
1870
+ * @returns Number of messages currently staged for the next request
1871
+ */
1872
+ protected getConversationHistoryLength(): number;
1873
+ /**
1874
+ * Compact provisional assistant/retry blocks to the canonical assistant turn.
1875
+ * @param startIndex - History index immediately after the user turn input
1876
+ * @param endIndex - Exclusive history boundary for the provisional blocks
1877
+ * @param assistantMessage - Canonical assistant content to persist
1878
+ */
1879
+ protected replaceAssistantTurnHistory(startIndex: number, endIndex: number, assistantMessage: string): void;
1825
1880
  /**
1826
1881
  * Execute the OpenAI streaming API call.
1827
1882
  *
@@ -1855,6 +1910,12 @@ declare class OpenAIConnectorSession extends BaseStreamSession<OpenAISessionConf
1855
1910
  * @returns A classified `Error` instance
1856
1911
  */
1857
1912
  protected classifyError(error: unknown): Error;
1913
+ /**
1914
+ * Convert an internal structured-output finalizer call into terminal content.
1915
+ * @param toolCalls - Tool calls from the current message_complete event
1916
+ * @returns Serialized final structured output, or undefined when absent
1917
+ */
1918
+ private extractStructuredOutputFinalizerMessage;
1858
1919
  }
1859
1920
  //#endregion
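The doc comments added to `OpenAIConnectorSession` describe a split between `buildRequestTools` and `getRequestResponseSchema`: when a structured-output schema and function tools are both active and the provider rejects that combination, an internal finalizer tool carries the result and `response_format` is omitted. The following is a sketch of that decision only. Every name, the finalizer tool name, the `SchemaDescriptor` shape, and the `providerRejectsCombination` flag are assumptions for illustration; the diff does not show the real implementation or the shape of `ResponseSchemaDescriptor`.

```typescript
// Hypothetical stand-ins for ChatCompletionTool and ResponseSchemaDescriptor.
interface FunctionTool {
  type: 'function';
  function: { name: string; description?: string; parameters: Record<string, unknown> };
}
interface SchemaDescriptor {
  name: string;
  schema: Record<string, unknown>;
}
interface RequestPlan {
  tools: FunctionTool[];
  responseSchema: SchemaDescriptor | undefined;
}

// Assumed tool name; the real internal name is not visible in the diff.
const FINALIZER_TOOL_NAME = 'submit_final_result';

function planRequest(
  tools: FunctionTool[],
  schema: SchemaDescriptor | undefined,
  providerRejectsCombination: boolean,
): RequestPlan {
  const needsFinalizer =
    schema !== undefined && tools.length > 0 && providerRejectsCombination;
  if (schema === undefined || !needsFinalizer) {
    // Schema-only or tools-only turns: pass everything through unchanged.
    return { tools, responseSchema: schema };
  }
  const finalizer: FunctionTool = {
    type: 'function',
    function: {
      name: FINALIZER_TOOL_NAME,
      description: 'Submit the final structured result.',
      parameters: schema.schema,
    },
  };
  // The schema travels as the finalizer's parameters, so response_format is omitted.
  return { tools: tools.concat(finalizer), responseSchema: undefined };
}
```

In this sketch a call to the finalizer tool would later be converted back into terminal content, which is the role the diff assigns to `extractStructuredOutputFinalizerMessage`.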
1860
1921
  //#region src/connector.d.ts