bedrock-wrapper 2.4.1 → 2.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -1,6 +1,61 @@
  # Changelog
  All notable changes to this project will be documented in this file.
 
+ ## [2.4.3] - 2025-07-31 (Stop Sequences Fixes)
+ ### Fixed
+ - **Critical Discovery**: Removed stop sequences support from Llama models
+ - AWS Bedrock does not support stop sequences for Llama models (confirmed via official AWS documentation)
+ - Llama models only support: `prompt`, `temperature`, `top_p`, `max_gen_len`, `images`
+ - This is an AWS Bedrock limitation, not a wrapper limitation
+ - Fixed Nova model configuration conflicts that were causing stop sequence inconsistencies
+ - Removed conflicting empty `inferenceConfig: {}` from Nova model configurations
+ - Improved error handling for empty responses when stop sequences trigger early
+
+ ### Updated
+ - **Documentation corrections**
+ - Corrected stop sequences support claims (removed "all models support" language)
+ - Added accurate model-specific support matrix with sequence limits
+ - Added comprehensive stop sequences support table with AWS documentation references
+ - **Model Support Matrix** now clearly documented:
+ - ✅ Claude models: Full support (up to 8,191 sequences)
+ - ✅ Nova models: Full support (up to 4 sequences)
+ - ✅ Mistral models: Full support (up to 10 sequences)
+ - ❌ Llama models: Not supported (AWS Bedrock limitation)
+
+ ### Technical Details
+ - Based on comprehensive research of official AWS Bedrock documentation
+ - All changes maintain full backward compatibility
+ - Test results show significant improvements in stop sequences reliability for supported models
+ - Added detailed explanations to help users understand AWS Bedrock's actual capabilities
+
+ ## [2.4.2] - 2025-07-31 (Stop Sequences Support)
+ ### Added
+ - Stop sequences support for compatible models
+ - OpenAI-compatible `stop` and `stop_sequences` parameters
+ - Automatic string-to-array conversion for compatibility
+ - Model-specific parameter mapping (stop_sequences for Claude, stopSequences for Nova, stop for Mistral)
+ - Enhanced request building logic to include stop sequences in appropriate API formats
+ - Comprehensive stop sequences testing and validation with `npm run test-stop`
+
+ ### Fixed
+ - **Critical Discovery**: Removed stop sequences support from Llama models
+ - AWS Bedrock does not support stop sequences for Llama models (confirmed via official documentation)
+ - Llama models only support: `prompt`, `temperature`, `top_p`, `max_gen_len`, `images`
+ - This is an AWS Bedrock limitation, not a wrapper limitation
+ - Fixed Nova model configuration conflicts that were causing stop sequence inconsistencies
+ - Improved error handling for empty responses when stop sequences trigger early
+
+ ### Technical Details
+ - **Model Support Matrix**:
+ - ✅ Claude models: Full support (up to 8,191 sequences)
+ - ✅ Nova models: Full support (up to 4 sequences)
+ - ✅ Mistral models: Full support (up to 10 sequences)
+ - ❌ Llama models: Not supported (AWS Bedrock limitation)
+ - Updated request construction for both messages API and prompt-based models
+ - Supports both single string and array formats for stop sequences
+ - Maintains full backward compatibility with existing API usage
+ - Added comprehensive documentation in README.md and CLAUDE.md explaining support limitations
+
  ## [2.4.0] - 2025-07-24 (AWS Nova Models)
  ### Added
  - Support for AWS Nova models
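The "automatic string-to-array conversion" and dual `stop`/`stop_sequences` parameters described in the 2.4.2 notes can be sketched as follows. This is an illustrative helper, not a package export; the `stop_sequences || stop` precedence mirrors the wrapper code later in this diff.

```javascript
// Sketch of the OpenAI-compatible stop-parameter normalization described in
// the changelog; normalizeStop is an illustrative name, not the package's API.
function normalizeStop(stop, stop_sequences) {
  const value = stop_sequences || stop;          // explicit stop_sequences wins
  if (!value) return undefined;                  // nothing to send
  return Array.isArray(value) ? value : [value]; // string-to-array conversion
}
```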
package/README.md CHANGED
@@ -44,6 +44,7 @@ Bedrock Wrapper is an npm package that simplifies the integration of existing Op
  "stream": true,
  "temperature": LLM_TEMPERATURE,
  "top_p": LLM_TOP_P,
+ "stop_sequences": ["STOP", "END"], // Optional: sequences that will stop generation
  };
  ```
 
@@ -189,6 +190,48 @@ You can include multiple images in a single message by adding more image_url obj
 
  ---
 
+ ### Stop Sequences
+
+ Stop sequences are custom text sequences that cause the model to stop generating text. This is useful for controlling where the model stops its response.
+
+ ```javascript
+ const openaiChatCompletionsCreateObject = {
+ "messages": messages,
+ "model": "Claude-3-5-Sonnet",
+ "max_tokens": 100,
+ "stop_sequences": ["STOP", "END", "\n\n"], // Array of stop sequences
+ // OR use single string format:
+ // "stop": "STOP"
+ };
+ ```
+
+ **Model Support:**
+ - ✅ **Claude models**: Fully supported (up to 8,191 sequences)
+ - ✅ **Nova models**: Fully supported (up to 4 sequences)
+ - ✅ **Mistral models**: Fully supported (up to 10 sequences)
+ - ❌ **Llama models**: Not supported (AWS Bedrock limitation)
+
+ **Features:**
+ - Compatible with OpenAI's `stop` parameter (single string or array)
+ - Also accepts `stop_sequences` parameter for explicit usage
+ - Automatic conversion between string and array formats
+ - Model-specific parameter mapping handled automatically
+
+ **Example Usage:**
+ ```javascript
+ // Stop generation when model tries to output "7"
+ const result = await bedrockWrapper(awsCreds, {
+ messages: [{ role: "user", content: "Count from 1 to 10" }],
+ model: "Claude-3-5-Sonnet", // Use Claude, Nova, or Mistral models
+ stop_sequences: ["7"]
+ });
+ // Response: "1, 2, 3, 4, 5, 6," (stops before "7")
+
+ // Note: Llama models will ignore stop sequences due to AWS Bedrock limitations
+ ```
+
+ ---
+
  ### 📢 P.S.
 
  In case you missed it at the beginning of this doc, for an even easier setup, use the 🔀 [Bedrock Proxy Endpoint](https://github.com/jparkerweb/bedrock-proxy-endpoint) project to spin up your own custom OpenAI server endpoint (using the standard `baseUrl`, and `apiKey` params).
package/bedrock-models.js CHANGED
@@ -19,6 +19,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 131072,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "thinking_response_chunk_element": "delta.thinking",
@@ -46,6 +47,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 131072,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "thinking_response_chunk_element": "delta.thinking",
@@ -77,6 +79,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 131072,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "thinking_response_chunk_element": "delta.thinking",
@@ -104,6 +107,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 131072,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "thinking_response_chunk_element": "delta.thinking",
@@ -135,6 +139,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 131072,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "thinking_response_chunk_element": "delta.thinking",
@@ -166,6 +171,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 131072,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "special_request_schema": {
@@ -190,6 +196,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 8192,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "special_request_schema": {
@@ -213,6 +220,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 8192,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "special_request_schema": {
@@ -236,6 +244,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 8192,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "special_request_schema": {
@@ -254,6 +263,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 8192,
+ "stop_sequences_param_name": "stop_sequences",
  "response_chunk_element": "delta.text",
  "response_nonchunk_element": "content[0].text",
  "special_request_schema": {
@@ -552,11 +562,11 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "maxTokens",
  "max_supported_response_tokens": 5000,
+ "stop_sequences_param_name": "stopSequences",
  "response_chunk_element": "contentBlockDelta.delta.text",
  "response_nonchunk_element": "output.message.content[0].text",
  "special_request_schema": {
- "schemaVersion": "messages-v1",
- "inferenceConfig": {}
+ "schemaVersion": "messages-v1"
  },
  "image_support": {
  "max_image_size": 5242880, // 5MB per image
@@ -576,11 +586,11 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "maxTokens",
  "max_supported_response_tokens": 5000,
+ "stop_sequences_param_name": "stopSequences",
  "response_chunk_element": "contentBlockDelta.delta.text",
  "response_nonchunk_element": "output.message.content[0].text",
  "special_request_schema": {
- "schemaVersion": "messages-v1",
- "inferenceConfig": {}
+ "schemaVersion": "messages-v1"
  },
  "image_support": {
  "max_image_size": 5242880, // 5MB per image
@@ -600,6 +610,7 @@ export const bedrock_models = [
  "display_role_names": true,
  "max_tokens_param_name": "maxTokens",
  "max_supported_response_tokens": 5000,
+ "stop_sequences_param_name": "stopSequences",
  "response_chunk_element": "contentBlockDelta.delta.text",
  "response_nonchunk_element": "output.message.content[0].text",
  "special_request_schema": {
@@ -632,6 +643,7 @@ export const bedrock_models = [
  "display_role_names": false,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 8192,
+ "stop_sequences_param_name": "stop",
  "response_chunk_element": "outputs[0].text"
  },
  {
@@ -659,6 +671,7 @@ export const bedrock_models = [
  "display_role_names": false,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 4096,
+ "stop_sequences_param_name": "stop",
  "response_chunk_element": "outputs[0].text"
  },
  {
@@ -686,6 +699,7 @@ export const bedrock_models = [
  "display_role_names": false,
  "max_tokens_param_name": "max_tokens",
  "max_supported_response_tokens": 8192,
+ "stop_sequences_param_name": "stop",
  "response_chunk_element": "outputs[0].text"
  },
  ];
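Each model entry's new `stop_sequences_param_name` keys the field name in the outgoing request, and entries without it (the Llama models) simply get no stop field. A minimal sketch of that lookup, under those assumptions; `buildStopField` is our illustrative name, not part of the package:

```javascript
// How a model entry's stop_sequences_param_name (added throughout the diff
// above) selects the outgoing request field. Models without the property,
// e.g. Llama, yield an empty object so no stop field is sent.
function buildStopField(awsModel, stopValue) {
  if (!awsModel.stop_sequences_param_name || !stopValue) return {};
  return {
    [awsModel.stop_sequences_param_name]:
      Array.isArray(stopValue) ? stopValue : [stopValue],
  };
}
```

Spread into a request body (`...buildStopField(awsModel, value)`), this reproduces the conditional-spread pattern the wrapper diff uses in each of its three request-building branches.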
@@ -64,7 +64,7 @@ async function processImage(imageInput) {
 
  export async function* bedrockWrapper(awsCreds, openaiChatCompletionsCreateObject, { logging = false } = {} ) {
  const { region, accessKeyId, secretAccessKey } = awsCreds;
- let { messages, model, max_tokens, stream, temperature, top_p, include_thinking_data } = openaiChatCompletionsCreateObject;
+ let { messages, model, max_tokens, stream, temperature, top_p, include_thinking_data, stop, stop_sequences } = openaiChatCompletionsCreateObject;
 
 
  let {awsModelId, awsModel} = findAwsModelWithId(model);
@@ -269,13 +269,17 @@ export async function* bedrockWrapper(awsCreds, openaiChatCompletionsCreateObjec
  };
  });
 
+ const stopSequencesValue = stop_sequences || stop;
  const novaRequest = {
  ...awsModel.special_request_schema,
  messages: novaMessages,
  inferenceConfig: {
  [awsModel.max_tokens_param_name]: max_gen_tokens,
  temperature: temperature,
- topP: top_p
+ topP: top_p,
+ ...(awsModel.stop_sequences_param_name && stopSequencesValue && {
+ [awsModel.stop_sequences_param_name]: Array.isArray(stopSequencesValue) ? stopSequencesValue : [stopSequencesValue]
+ })
  }
  };
 
@@ -287,12 +291,16 @@ export async function* bedrockWrapper(awsCreds, openaiChatCompletionsCreateObjec
  return novaRequest;
  } else {
  // Standard messages API format (Claude, etc.)
+ const stopSequencesValue = stop_sequences || stop;
  return {
  messages: prompt,
  ...(awsModel.system_as_separate_field && system_message && { system: system_message }),
  [awsModel.max_tokens_param_name]: max_gen_tokens,
  temperature: temperature,
  top_p: top_p,
+ ...(awsModel.stop_sequences_param_name && stopSequencesValue && {
+ [awsModel.stop_sequences_param_name]: Array.isArray(stopSequencesValue) ? stopSequencesValue : [stopSequencesValue]
+ }),
  ...awsModel.special_request_schema
  };
  }
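Per the Nova branch above, stop sequences now land inside `inferenceConfig`, and the schema no longer spreads a conflicting empty `inferenceConfig: {}`. A request built that way looks roughly like the following; the field names follow the diff, while the message/content shape and all values are illustrative assumptions:

```javascript
// Approximate Nova request body for stop_sequences: ["7"]. Field names
// mirror the diff above; values and message content shape are examples only.
const novaRequest = {
  schemaVersion: "messages-v1",
  messages: [{ role: "user", content: [{ text: "Count from 1 to 10" }] }],
  inferenceConfig: {
    maxTokens: 100,       // Nova's max_tokens_param_name
    temperature: 0.7,
    topP: 0.9,
    stopSequences: ["7"], // Nova's stop_sequences_param_name (camelCase)
  },
};
```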
@@ -311,6 +319,12 @@ export async function* bedrockWrapper(awsCreds, openaiChatCompletionsCreateObjec
  [awsModel.max_tokens_param_name]: max_gen_tokens,
  temperature: temperature,
  top_p: top_p,
+ ...(() => {
+ const stopSequencesValue = stop_sequences || stop;
+ return awsModel.stop_sequences_param_name && stopSequencesValue ? {
+ [awsModel.stop_sequences_param_name]: Array.isArray(stopSequencesValue) ? stopSequencesValue : [stopSequencesValue]
+ } : {};
+ })(),
  ...awsModel.special_request_schema
  };
 
@@ -392,7 +406,23 @@ export async function* bedrockWrapper(awsCreds, openaiChatCompletionsCreateObjec
  }
  }
 
+ // Handle case where stop sequences cause empty content array
+ if (!text_result && decodedBodyResponse.stop_reason === "stop_sequence") {
+ // If stopped by sequence but no content, return empty string instead of undefined
+ text_result = "";
+ }
+
+ // Ensure text_result is a string to prevent 'undefined' from being part of the response
+ if (text_result === null || text_result === undefined) {
+ text_result = "";
+ }
+
  let result = thinking_result ? `<think>${thinking_result}</think>\n\n${text_result}` : text_result;
+
+ // Ensure final result is a string, in case thinking_result was also empty
+ if (result === null || result === undefined) {
+ result = "";
+ }
  yield result;
  }
  }
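The guards added above all defend against `null`/`undefined` leaking into the streamed string when a stop sequence fires before any content is produced. Their combined effect can be expressed compactly with nullish coalescing; this is an equivalent sketch, not the package's code:

```javascript
// Equivalent of the empty-response guards above: coalesce missing text to ""
// before concatenation, so the literal string "undefined" never reaches the
// yielded result.
function assembleResult(thinking_result, text_result) {
  const text = text_result ?? ""; // covers null/undefined and stop-sequence empties
  return thinking_result ? `<think>${thinking_result}</think>\n\n${text}` : text;
}
```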
@@ -428,7 +458,10 @@ function findAwsModelWithId(model) {
  export async function listBedrockWrapperSupportedModels() {
  let supported_models = [];
  for (let i = 0; i < bedrock_models.length; i++) {
- supported_models.push(`{"modelName": ${bedrock_models[i].modelName}, "modelId": ${bedrock_models[i].modelId}}`);
+ supported_models.push(JSON.stringify({
+ modelName: bedrock_models[i].modelName,
+ modelId: bedrock_models[i].modelId
+ }));
  }
  return supported_models;
  }
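The `listBedrockWrapperSupportedModels` change above matters because the old template string interpolated values without quotes, producing strings that are not valid JSON; `JSON.stringify` emits parseable entries. A quick demonstration with a hypothetical model entry (the real values come from `bedrock_models`):

```javascript
// Hypothetical entry for illustration only.
const m = { modelName: "Example-Model", modelId: "example.model-v1:0" };

// Old style: '{"modelName": Example-Model, ...}' -- unquoted values, invalid JSON.
const oldEntry = `{"modelName": ${m.modelName}, "modelId": ${m.modelId}}`;
let oldParses = true;
try { JSON.parse(oldEntry); } catch { oldParses = false; }

// Fixed style: a valid JSON string.
const fixedEntry = JSON.stringify({ modelName: m.modelName, modelId: m.modelId });
```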
@@ -0,0 +1,51 @@
+ [
+ {
+ "session_id": "e0b34b2c-ee9a-4813-893a-82d47d3d5141",
+ "transcript_path": "C:\\Users\\Justin.Parker\\.claude\\projects\\C--git-bedrock-wrapper\\e0b34b2c-ee9a-4813-893a-82d47d3d5141.jsonl",
+ "cwd": "C:\\git\\bedrock-wrapper",
+ "hook_event_name": "Notification",
+ "message": "Claude is waiting for your input"
+ },
+ {
+ "session_id": "e0b34b2c-ee9a-4813-893a-82d47d3d5141",
+ "transcript_path": "C:\\Users\\Justin.Parker\\.claude\\projects\\C--git-bedrock-wrapper\\e0b34b2c-ee9a-4813-893a-82d47d3d5141.jsonl",
+ "cwd": "C:\\git\\bedrock-wrapper",
+ "hook_event_name": "Notification",
+ "message": "Claude is waiting for your input"
+ },
+ {
+ "session_id": "e0b34b2c-ee9a-4813-893a-82d47d3d5141",
+ "transcript_path": "C:\\Users\\Justin.Parker\\.claude\\projects\\C--git-bedrock-wrapper\\e0b34b2c-ee9a-4813-893a-82d47d3d5141.jsonl",
+ "cwd": "C:\\git\\bedrock-wrapper",
+ "hook_event_name": "Notification",
+ "message": "Claude is waiting for your input"
+ },
+ {
+ "session_id": "e0b34b2c-ee9a-4813-893a-82d47d3d5141",
+ "transcript_path": "C:\\Users\\Justin.Parker\\.claude\\projects\\C--git-bedrock-wrapper\\e0b34b2c-ee9a-4813-893a-82d47d3d5141.jsonl",
+ "cwd": "C:\\git\\bedrock-wrapper",
+ "hook_event_name": "Notification",
+ "message": "Claude is waiting for your input"
+ },
+ {
+ "session_id": "e0b34b2c-ee9a-4813-893a-82d47d3d5141",
+ "transcript_path": "C:\\Users\\Justin.Parker\\.claude\\projects\\C--git-bedrock-wrapper\\e0b34b2c-ee9a-4813-893a-82d47d3d5141.jsonl",
+ "cwd": "C:\\git\\bedrock-wrapper",
+ "hook_event_name": "Notification",
+ "message": "Claude is waiting for your input"
+ },
+ {
+ "session_id": "e0b34b2c-ee9a-4813-893a-82d47d3d5141",
+ "transcript_path": "C:\\Users\\Justin.Parker\\.claude\\projects\\C--git-bedrock-wrapper\\e0b34b2c-ee9a-4813-893a-82d47d3d5141.jsonl",
+ "cwd": "C:\\git\\bedrock-wrapper",
+ "hook_event_name": "Notification",
+ "message": "Claude is waiting for your input"
+ },
+ {
+ "session_id": "e0b34b2c-ee9a-4813-893a-82d47d3d5141",
+ "transcript_path": "C:\\Users\\Justin.Parker\\.claude\\projects\\C--git-bedrock-wrapper\\e0b34b2c-ee9a-4813-893a-82d47d3d5141.jsonl",
+ "cwd": "C:\\git\\bedrock-wrapper",
+ "hook_event_name": "Notification",
+ "message": "Claude is waiting for your input"
+ }
+ ]