bedrock-wrapper 2.8.0 → 2.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/AGENTS.md ADDED
@@ -0,0 +1,123 @@
1
+ # AGENTS.md
2
+
3
+ This file provides guidance to AI coding agents such as Claude Code (claude.ai/code), Cursor AI, Codex, Gemini CLI, and GitHub Copilot when working with code in this repository.
4
+
5
+ ## Project Purpose
6
+
7
+ Bedrock Wrapper translates OpenAI-compatible API objects into requests for AWS Bedrock's serverless inference LLMs. It acts as an adapter layer that lets applications built against the OpenAI API format call AWS Bedrock models seamlessly.
8
+
9
+ ## Development Commands
10
+
11
+ ```bash
12
+ npm install # Install dependencies
13
+ npm run clean # Clean reinstall (removes node_modules and package-lock.json)
14
+ npm run test # Test all models with both Invoke and Converse APIs
15
+ npm run test:invoke # Test with Invoke API only
16
+ npm run test:converse # Test with Converse API only
17
+ npm run test-vision # Test vision capabilities
18
+ npm run test-stop # Test stop sequences
19
+ npm run interactive # Interactive CLI for testing specific models
20
+ ```
21
+
22
+ ## Architecture Overview
23
+
24
+ ```
25
+ bedrock-wrapper.js (main entry)
26
+
27
+ ├── Converse API Path (useConverseAPI: true)
28
+ │   └── Unified format for all models
29
+ 
30
+ └── Invoke API Path (default)
31
+     └── Model-specific request/response handling
32
+ 
33
+     └── bedrock-models.js
34
+         └── Model configurations registry
35
+ ```
36
+
37
+ ### Key Functions in bedrock-wrapper.js
38
+
39
+ | Function | Line | Purpose |
40
+ |----------|------|---------|
41
+ | `bedrockWrapper()` | ~501 | Main entry point, async generator |
42
+ | `convertToConverseFormat()` | ~86 | OpenAI messages → Converse API format |
43
+ | `processMessagesForInvoke()` | ~168 | Model-specific message processing |
44
+ | `buildInvokePrompt()` | ~234 | Constructs model-specific prompts |
45
+ | `buildInvokeRequest()` | ~300 | Creates model-specific request objects |
46
+ | `executeInvokeAPI()` | ~409 | Handles streaming and non-streaming |
47
+ | `findAwsModelWithId()` | ~763 | Model lookup by name or ID |
48
+
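The lookup in `findAwsModelWithId()` amounts to a scan over the registry by either name or ID. The sketch below is illustrative only (the real implementation lives in bedrock-wrapper.js, and the two entries are trimmed stand-ins for bedrock-models.js configs):

```javascript
// Illustrative sketch of a name-or-ID lookup over the model registry.
// Entries here are trimmed stand-ins, not the full configs.
const bedrock_models = [
  { modelName: "Claude-Sonnet-4-6", modelId: "us.anthropic.claude-sonnet-4-6" },
  { modelName: "Llama-4-Scout-17b", modelId: "us.meta.llama4-scout-17b-instruct-v1:0" },
];

function findAwsModelWithId(nameOrId) {
  // Match on the consumer-facing name first, then the AWS Bedrock ID.
  return bedrock_models.find(
    (m) => m.modelName === nameOrId || m.modelId === nameOrId
  );
}

console.log(findAwsModelWithId("Llama-4-Scout-17b").modelId);
// → us.meta.llama4-scout-17b-instruct-v1:0
```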
49
+ ### Model Configuration Schema (bedrock-models.js)
50
+
51
+ Each model entry requires:
52
+ - `modelName`: Consumer-facing name (e.g., "Claude-4-5-Sonnet")
53
+ - `modelId`: AWS Bedrock identifier
54
+ - `vision`: Boolean for image support
55
+ - `messages_api`: Boolean (true = structured messages, false = prompt string)
56
+ - `response_chunk_element`: JSON path for streaming response extraction
57
+ - `response_nonchunk_element`: JSON path for non-streaming response
58
+
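A minimal entry satisfying this schema might look like the following sketch (the model name, ID, and field values are hypothetical, not a real registry entry):

```javascript
// Hypothetical minimal entry following the schema above; not a real model.
const exampleModel = {
  modelName: "Example-Model",
  modelId: "us.example.example-model-v1:0",
  vision: false,
  messages_api: true,
  response_chunk_element: "delta.text",
  response_nonchunk_element: "content[0].text",
};

// Quick sanity check that every required field is present.
const required = [
  "modelName", "modelId", "vision", "messages_api",
  "response_chunk_element", "response_nonchunk_element",
];
const missing = required.filter((k) => !(k in exampleModel));
console.log(missing.length === 0 ? "ok" : `missing: ${missing.join(", ")}`);
// → ok
```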
59
+ ### Two API Paths
60
+
61
+ 1. **Converse API** (`useConverseAPI: true`): Unified format, handles all models consistently
62
+ 2. **Invoke API** (default): Model-specific formatting required
63
+
64
+ Some models (e.g., DeepSeek-V3.1) have `converse_api_only: true` and automatically use the Converse API.
65
+
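The choice between the two paths can be pictured as a small routing helper. This is an illustrative sketch (the helper name `selectApiPath` is invented; the real branching lives inside `bedrockWrapper()`):

```javascript
// Illustrative routing: the Converse API is used when explicitly requested
// via useConverseAPI, or when the model config is marked converse_api_only;
// otherwise the default Invoke API path applies.
function selectApiPath(modelConfig, options = {}) {
  if (options.useConverseAPI || modelConfig.converse_api_only) {
    return "converse";
  }
  return "invoke";
}

console.log(selectApiPath({ converse_api_only: true }));  // → converse
console.log(selectApiPath({}, { useConverseAPI: true })); // → converse
console.log(selectApiPath({}));                           // → invoke
```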
66
+ ## Model Family Patterns
67
+
68
+ | Family | API Type | Special Handling |
69
+ |--------|----------|------------------|
70
+ | Claude | Messages API | Thinking tags: `<think>`, anthropic_version required |
71
+ | Nova | Messages API | Content as array `[{text: content}]`, schemaVersion: "messages-v1" |
72
+ | Llama | Prompt-based | Role tags: `<\|begin_of_text\|>`, `<\|start_header_id\|>` |
73
+ | Mistral | Prompt-based (older) / Messages (v3+) | `[INST]`/`[/INST]` tags for older models |
74
+ | GPT-OSS | Messages API | Reasoning tags: `<reasoning>`, streaming not supported |
75
+ | Qwen | Messages API | Standard messages format |
76
+ | DeepSeek | Messages API | V3.1 requires Converse API only |
77
+ | Gemma | Messages API | Standard messages format with vision |
78
+ | Kimi | Messages API | preserve_reasoning for thinking models |
79
+
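For the prompt-based Llama family, the role tags above assemble into one flat prompt string. A simplified sketch (the helper name `buildLlamaPrompt` is invented; the real construction is `buildInvokePrompt()`, driven by the per-model config):

```javascript
// Simplified Llama-style prompt assembly from the role tags listed above.
function buildLlamaPrompt(messages) {
  let prompt = "<|begin_of_text|>";
  for (const { role, content } of messages) {
    prompt += `<|start_header_id|>${role}<|end_header_id|>\n${content}<|eot_id|>`;
  }
  // Leave the assistant header open so the model continues from here.
  prompt += "<|start_header_id|>assistant<|end_header_id|>\n";
  return prompt;
}

const p = buildLlamaPrompt([
  { role: "system", content: "You are concise." },
  { role: "user", content: "Hi" },
]);
console.log(p.startsWith("<|begin_of_text|>")); // → true
```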
80
+ ## Adding a New Model
81
+
82
+ 1. Add entry to `bedrock_models` array in `bedrock-models.js`
83
+ 2. For prompt-based models, define all role prefix/suffix tokens
84
+ 3. For vision models, set `vision: true` and add `image_support` config
85
+ 4. For thinking models, add `thinking` config in `special_request_schema`
86
+ 5. Test with `npm run test` to verify both API paths
87
+
88
+ ## Key Implementation Details
89
+
90
+ ### Image Processing
91
+ - Uses Sharp library to resize images to max 2048x2048
92
+ - Converts all formats to JPEG for consistency
93
+ - Handles base64, data URLs, and HTTP URLs
94
+
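The resize step amounts to fitting the image inside a 2048×2048 box while preserving aspect ratio (what Sharp's `resize(2048, 2048, { fit: "inside" })` does). The arithmetic can be sketched without Sharp itself; `fitInside` is an invented name for illustration:

```javascript
// Aspect-preserving fit inside a 2048x2048 box, never upscaling,
// mirroring the behavior described above.
function fitInside(width, height, max = 2048) {
  const scale = Math.min(1, max / width, max / height);
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

console.log(fitInside(4096, 1024)); // → { width: 2048, height: 512 }
console.log(fitInside(800, 600));   // → { width: 800, height: 600 } (no upscale)
```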
95
+ ### Thinking Mode
96
+ - Claude: `<think>` tags, budget_tokens in special_request_schema
97
+ - GPT-OSS: `<reasoning>` tags, preserve_reasoning flag
98
+ - Temperature auto-set to 1.0, budget_tokens constrained to 80% of max_tokens
99
+
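The budget constraint reduces to a small clamp. A sketch of the arithmetic, assuming the `max_tokens`/`budget_tokens` parameter names used above (`applyThinkingConstraints` is an invented helper name):

```javascript
// Clamp a requested thinking budget to 80% of max_tokens and force
// temperature to 1.0, as described above. Illustrative helper only.
function applyThinkingConstraints(params) {
  const cap = Math.floor(params.max_tokens * 0.8);
  return {
    ...params,
    temperature: 1.0,
    budget_tokens: Math.min(params.budget_tokens, cap),
  };
}

console.log(applyThinkingConstraints({ max_tokens: 1024, budget_tokens: 4096, temperature: 0.2 }));
// → { max_tokens: 1024, budget_tokens: 819, temperature: 1 }
```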
100
+ ### Stop Sequences
101
+ - Claude: `stop_sequences` (up to 8,191)
102
+ - Nova: `stopSequences` (up to 4)
103
+ - Mistral: `stop` (up to 10)
104
+ - Llama: Not supported by AWS Bedrock
105
+
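Since each family exposes a different parameter name and cap, applying stop sequences is essentially a rename-and-truncate. A hedged sketch (`applyStopSequences` and the `cap` argument are invented for illustration; `stop_sequences_param_name` is the config field used in bedrock-models.js, and the caps mirror the list above):

```javascript
// Map an OpenAI-style stop list onto the model-specific parameter name,
// truncating to the family's cap. Illustrative only.
function applyStopSequences(request, modelConfig, stops, cap) {
  if (!modelConfig.stop_sequences_param_name || !stops?.length) return request;
  return {
    ...request,
    [modelConfig.stop_sequences_param_name]: stops.slice(0, cap),
  };
}

const req = applyStopSequences(
  {},
  { stop_sequences_param_name: "stopSequences" }, // Nova-style field name
  ["END", "STOP", "DONE", "HALT", "EXTRA"],
  4 // Nova allows up to 4
);
console.log(req.stopSequences.length); // → 4
```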
106
+ ## Environment Setup
107
+
108
+ Create `.env` file:
109
+ ```
110
+ AWS_REGION=us-west-2
111
+ AWS_ACCESS_KEY_ID=your_key
112
+ AWS_SECRET_ACCESS_KEY=your_secret
113
+ LLM_MAX_GEN_TOKENS=1024
114
+ LLM_TEMPERATURE=0.2
115
+ ```
116
+
117
+ ## Test Output Files
118
+
119
+ After running tests, check these files for results:
120
+ - `test-models-output.txt`
121
+ - `test-vision-models-output.txt`
122
+ - `test-stop-sequences-output.txt`
123
+ - `test-converse-api-output.txt`
package/CHANGELOG.md CHANGED
@@ -2,6 +2,40 @@
2
2
 
3
3
  All notable changes to this project will be documented in this file.
4
4
 
5
+ ## [2.10.0] - 2026-02-20 (Claude Opus 4.6 & Sonnet 4.6)
6
+
7
+ ### ✨ Added
8
+
9
+ - Support for Claude Opus 4.6 and Claude Sonnet 4.6 models
10
+ - Claude-Opus-4-6 (128K max output tokens, vision support, cross-region inference profile)
11
+ - Claude-Sonnet-4-6 (64K max output tokens, vision support, cross-region inference profile)
12
+
13
+ ### ⚙️ Technical Details
14
+
15
+ - **Model IDs**: `us.anthropic.claude-opus-4-6-v1` and `us.anthropic.claude-sonnet-4-6`
16
+ - **Vision Support**: Both models support image inputs
17
+ - **Extended Output**: Both models use the `output-128k-2025-02-19` beta header
18
+ - **API Compatibility**: Both Invoke API and Converse API paths supported
19
+
20
+ ---
21
+
22
+ ## [2.9.0] - 2026-01-08 (Llama 4 Models)
23
+
24
+ ### ✨ Added
25
+
26
+ - Support for Llama 4 Scout and Maverick models
27
+ - Llama-4-Scout-17b (vision support, 2K max output tokens)
28
+ - Llama-4-Maverick-17b (vision support, 2K max output tokens)
29
+ - First Llama models with multimodal/vision capabilities in this wrapper
30
+ - Cross-region inference profile IDs (us.meta.llama4-*)
31
+
32
+ ### ⚙️ Technical Details
33
+
34
+ - **Vision Support**: Both models support image inputs (first Llama models with vision)
35
+ - **API Compatibility**: Both Invoke API and Converse API paths supported
36
+ - **Streaming**: Full streaming and non-streaming support
37
+ - **Stop Sequences**: Not supported (AWS Bedrock limitation for all Llama models)
38
+
5
39
  ## [2.8.0] - 2025-12-05 (New Models: Claude Opus 4.5, Gemma, Kimi, MiniMax, Mistral, Nova)
6
40
 
7
41
  ### ✨ Added
package/README.md CHANGED
@@ -142,6 +142,8 @@ Bedrock Wrapper is an npm package that simplifies the integration of existing Op
142
142
  | Claude-4-5-Opus-Thinking | global.anthropic.claude-opus-4-5-20251101-v1:0 | ✅ |
143
143
  | Claude-4-5-Sonnet | us.anthropic.claude-sonnet-4-5-20250929-v1:0 | ✅ |
144
144
  | Claude-4-5-Sonnet-Thinking | us.anthropic.claude-sonnet-4-5-20250929-v1:0 | ✅ |
145
+ | Claude-Opus-4-6 | us.anthropic.claude-opus-4-6-v1 | ✅ |
146
+ | Claude-Sonnet-4-6 | us.anthropic.claude-sonnet-4-6 | ✅ |
145
147
  | DeepSeek-R1 | us.deepseek.r1-v1:0 | ❌ |
146
148
  | DeepSeek-V3.1 | deepseek.v3-v1:0 | ❌ |
147
149
  | Gemma-3-4b | google.gemma-3-4b-it | ✅ |
@@ -163,6 +165,8 @@ Bedrock Wrapper is an npm package that simplifies the integration of existing Op
163
165
  | Llama-3-2-11b | us.meta.llama3-2-11b-instruct-v1:0 | ❌ |
164
166
  | Llama-3-2-90b | us.meta.llama3-2-90b-instruct-v1:0 | ❌ |
165
167
  | Llama-3-3-70b | us.meta.llama3-3-70b-instruct-v1:0 | ❌ |
168
+ | Llama-4-Scout-17b | us.meta.llama4-scout-17b-instruct-v1:0 | ✅ |
169
+ | Llama-4-Maverick-17b | us.meta.llama4-maverick-17b-instruct-v1:0 | ✅ |
166
170
  | Magistral-Small-2509 | mistral.magistral-small-2509 | ❌ |
167
171
  | MiniMax-M2 | minimax.minimax-m2 | ❌ |
168
172
  | Ministral-3-3b | mistral.ministral-3-3b-instruct | ✅ |
@@ -229,7 +233,7 @@ for await (const chunk of bedrockWrapper(awsCreds, openaiChatCompletionsCreateOb
229
233
 
230
234
  ### Image Support
231
235
 
232
- For models with image support (Claude 4+ series including Claude 4.5 Opus, Claude 4.5 Sonnet, Claude 4.5 Haiku, Claude 3.7 Sonnet, Claude 3.5 Sonnet, Claude 3 Haiku, Nova Pro, Nova Lite, Nova 2 Lite, Mistral Large 3, Ministral 3 series, and Gemma 3 series), you can include images in your messages using the following format (not all models support system prompts):
236
+ For models with image support (Claude 4+ series including Claude Opus 4.6, Claude Sonnet 4.6, Claude 4.5 Opus, Claude 4.5 Sonnet, Claude 4.5 Haiku, Claude 3.7 Sonnet, Claude 3.5 Sonnet, Claude 3 Haiku, Nova Pro, Nova Lite, Nova 2 Lite, Mistral Large 3, Ministral 3 series, Gemma 3 series, and Llama 4 series), you can include images in your messages using the following format (not all models support system prompts):
233
237
 
234
238
  ```javascript
235
239
  messages = [
@@ -329,6 +333,8 @@ Some AWS Bedrock models have specific parameter restrictions that are automatica
329
333
  #### Claude 4+ Models (Temperature/Top-P Mutual Exclusion)
330
334
 
331
335
  **Affected Models:**
336
+ - Claude-Opus-4-6
337
+ - Claude-Sonnet-4-6
332
338
  - Claude-4-5-Opus & Claude-4-5-Opus-Thinking
333
339
  - Claude-4-5-Sonnet & Claude-4-5-Sonnet-Thinking
334
340
  - Claude-4-5-Haiku & Claude-4-5-Haiku-Thinking
package/bedrock-models.js CHANGED
@@ -6,6 +6,62 @@
6
6
  // https://us-west-2.console.aws.amazon.com/bedrock/home?region=us-west-2#/cross-region-inference
7
7
 
8
8
  export const bedrock_models = [
9
+ {
10
+ // =====================
11
+ // == Claude Opus 4.6 ==
12
+ // =====================
13
+ "modelName": "Claude-Opus-4-6",
14
+ // "modelId": "anthropic.claude-opus-4-6-v1", // single-region
15
+ "modelId": "us.anthropic.claude-opus-4-6-v1", // cross-region inference profile
16
+ "vision": true,
17
+ "messages_api": true,
18
+ "system_as_separate_field": true,
19
+ "display_role_names": true,
20
+ "max_tokens_param_name": "max_tokens",
21
+ "max_supported_response_tokens": 131072,
22
+ "stop_sequences_param_name": "stop_sequences",
23
+ "response_chunk_element": "delta.text",
24
+ "response_nonchunk_element": "content[0].text",
25
+ "thinking_response_chunk_element": "delta.thinking",
26
+ "thinking_response_nonchunk_element": "content[0].thinking",
27
+ "special_request_schema": {
28
+ "anthropic_version": "bedrock-2023-05-31",
29
+ "anthropic_beta": ["output-128k-2025-02-19"],
30
+ },
31
+ "image_support": {
32
+ "max_image_size": 20971520, // 20MB
33
+ "supported_formats": ["jpeg", "png", "gif", "webp"],
34
+ "max_images_per_request": 10
35
+ }
36
+ },
37
+ {
38
+ // =======================
39
+ // == Claude Sonnet 4.6 ==
40
+ // =======================
41
+ "modelName": "Claude-Sonnet-4-6",
42
+ // "modelId": "anthropic.claude-sonnet-4-6", // single-region
43
+ "modelId": "us.anthropic.claude-sonnet-4-6", // cross-region inference profile
44
+ "vision": true,
45
+ "messages_api": true,
46
+ "system_as_separate_field": true,
47
+ "display_role_names": true,
48
+ "max_tokens_param_name": "max_tokens",
49
+ "max_supported_response_tokens": 65536,
50
+ "stop_sequences_param_name": "stop_sequences",
51
+ "response_chunk_element": "delta.text",
52
+ "response_nonchunk_element": "content[0].text",
53
+ "thinking_response_chunk_element": "delta.thinking",
54
+ "thinking_response_nonchunk_element": "content[0].thinking",
55
+ "special_request_schema": {
56
+ "anthropic_version": "bedrock-2023-05-31",
57
+ "anthropic_beta": ["output-128k-2025-02-19"],
58
+ },
59
+ "image_support": {
60
+ "max_image_size": 20971520, // 20MB
61
+ "supported_formats": ["jpeg", "png", "gif", "webp"],
62
+ "max_images_per_request": 10
63
+ }
64
+ },
9
65
  {
10
66
  // ======================
11
67
  // == Claude 4.5 Opus ==
@@ -547,6 +603,62 @@ export const bedrock_models = [
547
603
  "max_supported_response_tokens": 2048,
548
604
  "response_chunk_element": "generation"
549
605
  },
606
+ {
607
+ // ======================
608
+ // == Llama 4 Scout 17b ==
609
+ // ======================
610
+ "modelName": "Llama-4-Scout-17b",
611
+ // "modelId": "meta.llama4-scout-17b-instruct-v1:0",
612
+ "modelId": "us.meta.llama4-scout-17b-instruct-v1:0",
613
+ "vision": true,
614
+ "messages_api": false,
615
+ "bos_text": "<|begin_of_text|>",
616
+ "role_system_message_prefix": "",
617
+ "role_system_message_suffix": "",
618
+ "role_system_prefix": "<|start_header_id|>",
619
+ "role_system_suffix": "<|end_header_id|>",
620
+ "role_user_message_prefix": "",
621
+ "role_user_message_suffix": "",
622
+ "role_user_prefix": "<|start_header_id|>",
623
+ "role_user_suffix": "<|end_header_id|>",
624
+ "role_assistant_message_prefix": "",
625
+ "role_assistant_message_suffix": "",
626
+ "role_assistant_prefix": "<|start_header_id|>",
627
+ "role_assistant_suffix": "<|end_header_id|>",
628
+ "eom_text": "<|eot_id|>",
629
+ "display_role_names": true,
630
+ "max_tokens_param_name": "max_gen_len",
631
+ "max_supported_response_tokens": 2048,
632
+ "response_chunk_element": "generation"
633
+ },
634
+ {
635
+ // ========================
636
+ // == Llama 4 Maverick 17b ==
637
+ // ========================
638
+ "modelName": "Llama-4-Maverick-17b",
639
+ // "modelId": "meta.llama4-maverick-17b-instruct-v1:0",
640
+ "modelId": "us.meta.llama4-maverick-17b-instruct-v1:0",
641
+ "vision": true,
642
+ "messages_api": false,
643
+ "bos_text": "<|begin_of_text|>",
644
+ "role_system_message_prefix": "",
645
+ "role_system_message_suffix": "",
646
+ "role_system_prefix": "<|start_header_id|>",
647
+ "role_system_suffix": "<|end_header_id|>",
648
+ "role_user_message_prefix": "",
649
+ "role_user_message_suffix": "",
650
+ "role_user_prefix": "<|start_header_id|>",
651
+ "role_user_suffix": "<|end_header_id|>",
652
+ "role_assistant_message_prefix": "",
653
+ "role_assistant_message_suffix": "",
654
+ "role_assistant_prefix": "<|start_header_id|>",
655
+ "role_assistant_suffix": "<|end_header_id|>",
656
+ "eom_text": "<|eot_id|>",
657
+ "display_role_names": true,
658
+ "max_tokens_param_name": "max_gen_len",
659
+ "max_supported_response_tokens": 2048,
660
+ "response_chunk_element": "generation"
661
+ },
550
662
  {
551
663
  // ==================
552
664
  // == Llama 3.2 1b ==
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "bedrock-wrapper",
3
- "version": "2.8.0",
3
+ "version": "2.10.0",
4
4
  "description": "🪨 Bedrock Wrapper is an npm package that simplifies the integration of existing OpenAI-compatible API objects with AWS Bedrock's serverless inference LLMs.",
5
5
  "homepage": "https://www.equilllabs.com/projects/bedrock-wrapper",
6
6
  "repository": {
@@ -41,8 +41,8 @@
41
41
  "author": "",
42
42
  "license": "ISC",
43
43
  "dependencies": {
44
- "@aws-sdk/client-bedrock-runtime": "^3.943.0",
45
- "dotenv": "^17.2.3",
44
+ "@aws-sdk/client-bedrock-runtime": "^3.994.0",
45
+ "dotenv": "^17.3.1",
46
46
  "sharp": "^0.34.5"
47
47
  },
48
48
  "devDependencies": {
@@ -0,0 +1,241 @@
1
+ # Llama 4 Model Support - Implementation Plan
2
+
3
+ **Created:** 2026-01-08
4
+ **Status:** Complete (All Phases Verified)
5
+ **Confidence:** 92% (Requirements: 25/25, Feasibility: 24/25, Integration: 23/25, Risk: 20/25)
6
+
7
+ ## 1. Executive Summary
8
+
9
+ Add support for two new AWS Bedrock models: Llama 4 Scout 17B and Llama 4 Maverick 17B. Both models are multimodal (vision + text) and will be the first Llama models in this wrapper with vision enabled.
10
+
11
+ ## 2. Requirements
12
+
13
+ ### 2.1 Functional Requirements
14
+ - [x] FR-1: Add `Llama-4-Scout-17b` model configuration to bedrock-models.js
15
+ - [x] FR-2: Add `Llama-4-Maverick-17b` model configuration to bedrock-models.js
16
+ - [x] FR-3: Enable vision/image input support for both models
17
+ - [x] FR-4: Support both Invoke API and Converse API paths
18
+ - [x] FR-5: Use cross-region inference profile IDs (following existing pattern)
19
+
20
+ ### 2.2 Non-Functional Requirements
21
+ - [x] NFR-1: Follow existing Llama model configuration patterns
22
+ - [x] NFR-2: Maintain backward compatibility with existing code
23
+
24
+ ### 2.3 Out of Scope
25
+ - New test frameworks or test file creation
26
+ - Changes to core wrapper logic
27
+ - Tool use / function calling support
28
+
29
+ ### 2.4 Testing Strategy
30
+ | Preference | Selection |
31
+ |------------|-----------|
32
+ | Test Types | Expand existing model tests |
33
+ | Phase Testing | Run after implementation |
34
+ | Coverage Target | Same coverage as existing Llama models |
35
+
36
+ ## 3. Tech Stack
37
+
38
+ | Category | Technology | Version | Justification |
39
+ |----------|------------|---------|---------------|
40
+ | Language | JavaScript (ES Modules) | N/A | Existing project standard |
41
+ | Runtime | Node.js | Existing | No changes |
42
+ | Config Format | JSON objects in JS | N/A | Existing pattern |
43
+
44
+ No new technologies required - configuration addition only.
45
+
46
+ ## 4. Architecture
47
+
48
+ ### 4.1 Architecture Pattern
49
+ **Configuration Registry Pattern** - Models are defined as entries in the `bedrock_models` array in `bedrock-models.js`. New models follow the existing Llama configuration schema.
50
+
51
+ ### 4.2 System Context Diagram
52
+ ```
53
+ ┌─────────────────────────────────────────────────────────────┐
54
+ │ bedrock-wrapper.js │
55
+ │ │
56
+ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
57
+ │ │ Converse API │ │ Invoke API │ │ Image Process│ │
58
+ │ │ Path │ │ Path │ │ (Sharp) │ │
59
+ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
60
+ │ │ │ │ │
61
+ │ └─────────┬─────────┴────────────────────┘ │
62
+ │ │ │
63
+ │ ┌─────────▼─────────┐ │
64
+ │ │ bedrock-models.js │◄── ADD NEW MODELS HERE │
65
+ │ │ (model registry) │ │
66
+ │ └───────────────────┘ │
67
+ └─────────────────────────────────────────────────────────────┘
68
+
69
+
70
+ ┌─────────────────┐
71
+ │ AWS Bedrock │
72
+ │ - Llama 4 Scout│
73
+ │ - Llama 4 Maverick
74
+ └─────────────────┘
75
+ ```
76
+
77
+ ### 4.3 Component Overview
78
+
79
+ | Component | Responsibility | Dependencies |
80
+ |-----------|----------------|--------------|
81
+ | bedrock-models.js | Model configuration registry | None |
82
+ | test-models.js | General model testing | bedrock-models.js |
83
+ | test-vision.js | Vision capability testing | bedrock-models.js |
84
+
85
+ ### 4.4 Data Model
86
+
87
+ **Model Configuration Schema (per existing Llama pattern):**
88
+
89
+ | Field | Type | Value (Scout) | Value (Maverick) |
90
+ |-------|------|---------------|------------------|
91
+ | modelName | string | "Llama-4-Scout-17b" | "Llama-4-Maverick-17b" |
92
+ | modelId | string | "us.meta.llama4-scout-17b-instruct-v1:0" | "us.meta.llama4-maverick-17b-instruct-v1:0" |
93
+ | vision | boolean | true | true |
94
+ | messages_api | boolean | false | false |
95
+ | bos_text | string | "<\|begin_of_text\|>" | "<\|begin_of_text\|>" |
96
+ | role_system_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
97
+ | role_system_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
98
+ | role_user_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
99
+ | role_user_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
100
+ | role_assistant_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
101
+ | role_assistant_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
102
+ | eom_text | string | "<\|eot_id\|>" | "<\|eot_id\|>" |
103
+ | display_role_names | boolean | true | true |
104
+ | max_tokens_param_name | string | "max_gen_len" | "max_gen_len" |
105
+ | max_supported_response_tokens | number | 2048 | 2048 |
106
+ | response_chunk_element | string | "generation" | "generation" |
107
+
108
+ ### 4.5 API Design
109
+
110
+ No new APIs. Models integrate with existing:
111
+ - `bedrockWrapper()` - main entry point
112
+ - `listBedrockWrapperSupportedModels()` - model listing
113
+
114
+ ## 5. Implementation Phases
115
+
116
+ ### Phase 1: Add Model Configurations
117
+ **Goal:** Add both Llama 4 model configurations to bedrock-models.js
118
+ **Dependencies:** None
119
+
120
+ - [x] Task 1.1: Add Llama 4 Scout 17B configuration after existing Llama models (~line 520)
121
+ - [x] Task 1.2: Add Llama 4 Maverick 17B configuration after Scout
122
+ - [x] Task 1.3: Include commented single-region model IDs as fallback reference
123
+
124
+ **Insertion Point:** After the Llama 3.3 70b model block (around line 520)
125
+
126
+ **Configuration Template:**
127
+ ```javascript
128
+ {
129
+ // ======================
130
+ // == Llama 4 Scout 17b ==
131
+ // ======================
132
+ "modelName": "Llama-4-Scout-17b",
133
+ // "modelId": "meta.llama4-scout-17b-instruct-v1:0",
134
+ "modelId": "us.meta.llama4-scout-17b-instruct-v1:0",
135
+ "vision": true,
136
+ "messages_api": false,
137
+ "bos_text": "<|begin_of_text|>",
138
+ "role_system_message_prefix": "",
139
+ "role_system_message_suffix": "",
140
+ "role_system_prefix": "<|start_header_id|>",
141
+ "role_system_suffix": "<|end_header_id|>",
142
+ "role_user_message_prefix": "",
143
+ "role_user_message_suffix": "",
144
+ "role_user_prefix": "<|start_header_id|>",
145
+ "role_user_suffix": "<|end_header_id|>",
146
+ "role_assistant_message_prefix": "",
147
+ "role_assistant_message_suffix": "",
148
+ "role_assistant_prefix": "<|start_header_id|>",
149
+ "role_assistant_suffix": "<|end_header_id|>",
150
+ "eom_text": "<|eot_id|>",
151
+ "display_role_names": true,
152
+ "max_tokens_param_name": "max_gen_len",
153
+ "max_supported_response_tokens": 2048,
154
+ "response_chunk_element": "generation"
155
+ },
156
+ {
157
+ // ========================
158
+ // == Llama 4 Maverick 17b ==
159
+ // ========================
160
+ "modelName": "Llama-4-Maverick-17b",
161
+ // "modelId": "meta.llama4-maverick-17b-instruct-v1:0",
162
+ "modelId": "us.meta.llama4-maverick-17b-instruct-v1:0",
163
+ "vision": true,
164
+ "messages_api": false,
165
+ "bos_text": "<|begin_of_text|>",
166
+ "role_system_message_prefix": "",
167
+ "role_system_message_suffix": "",
168
+ "role_system_prefix": "<|start_header_id|>",
169
+ "role_system_suffix": "<|end_header_id|>",
170
+ "role_user_message_prefix": "",
171
+ "role_user_message_suffix": "",
172
+ "role_user_prefix": "<|start_header_id|>",
173
+ "role_user_suffix": "<|end_header_id|>",
174
+ "role_assistant_message_prefix": "",
175
+ "role_assistant_message_suffix": "",
176
+ "role_assistant_prefix": "<|start_header_id|>",
177
+ "role_assistant_suffix": "<|end_header_id|>",
178
+ "eom_text": "<|eot_id|>",
179
+ "display_role_names": true,
180
+ "max_tokens_param_name": "max_gen_len",
181
+ "max_supported_response_tokens": 2048,
182
+ "response_chunk_element": "generation"
183
+ },
184
+ ```
185
+
186
+ ### Phase 2: Verification Testing
187
+ **Goal:** Verify both models work correctly with existing tests
188
+ **Dependencies:** Phase 1 complete
189
+
190
+ - [x] Task 2.1: Run `npm run test` - verify both models pass streaming and non-streaming tests
191
+ - [x] Task 2.2: Run `npm run test-vision` - verify vision capability works for both models
192
+ - [x] Task 2.3: Run `npm run test:converse` - verify Converse API works
193
+ - [x] Task 2.4: Review output files for any errors or warnings
194
+
195
+ ## 6. Risks and Mitigations
196
+
197
+ | Risk | Likelihood | Impact | Mitigation |
198
+ |------|------------|--------|------------|
199
+ | Cross-region profile ID not yet available | Low | Medium | Single-region ID included as commented fallback |
200
+ | Llama 4 uses different special tokens | Low | High | Test both APIs; adjust tokens if tests fail |
201
+ | Vision handling differs from expected | Low | Medium | Existing Llama image format in codebase; Converse API as fallback |
202
+ | Model not available in user's region | Medium | Low | Document supported regions (us-east-1, us-east-2, us-west-1, us-west-2) |
203
+
204
+ ## 7. Success Criteria
205
+
206
+ - [x] Both models appear in `listBedrockWrapperSupportedModels()` output
207
+ - [x] `npm run test` passes for both Llama 4 models (streaming & non-streaming)
208
+ - [x] `npm run test-vision` includes and passes for both models
209
+ - [x] Both Invoke API and Converse API paths work correctly
210
+ - [x] No regression in existing model tests
211
+
212
+ ## 8. Open Questions
213
+
214
+ None - all requirements clarified during planning.
215
+
216
+ ## 9. Assumptions
217
+
218
+ - AWS Bedrock cross-region inference profiles are available for Llama 4 models (following pattern `us.meta.llama4-*`)
219
+ - Llama 4 models use the same special tokens as Llama 3.x family
220
+ - Vision handling follows the existing Llama image format in the codebase
221
+ - Max response token limit of 2048 is appropriate (same as other Llama models)
222
+
223
+ ## 10. Completion Summary
224
+
225
+ **Completed:** 2026-01-08 | **Completion Rate:** 100% (7/7 tasks)
226
+
227
+ ### What Was Built
228
+ Added support for two new AWS Bedrock models: Llama 4 Scout 17B and Llama 4 Maverick 17B. These are the first Llama models in the wrapper with vision/multimodal support enabled. Both models work with Invoke API and Converse API paths, streaming and non-streaming modes.
229
+
230
+ ### Files Modified
231
+ | File | Changes |
232
+ |------|---------|
233
+ | `bedrock-models.js` | Added Llama-4-Scout-17b config (lines 550-577), Llama-4-Maverick-17b config (lines 578-605) |
234
+
235
+ ### Test Results
236
+ - `npm run test --both`: Both models pass all 4 test modes (Invoke/Converse × Streaming/Non-streaming)
237
+ - `npm run test-vision --both`: Both models successfully describe images
238
+ - `npm run test:converse`: Both models pass Converse API tests
239
+
240
+ ### Known Limitations
241
+ - Llama 4 models do not support stop sequences (AWS Bedrock limitation, consistent with other Llama models)