bedrock-wrapper 2.8.0 → 2.9.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +123 -0
- package/CHANGELOG.md +17 -0
- package/README.md +3 -1
- package/bedrock-models.js +56 -0
- package/package.json +1 -1
- package/specs--completed/llama-4-model-support/PLAN-DRAFT-20260108-130000.md +241 -0
package/AGENTS.md
ADDED
@@ -0,0 +1,123 @@
# AGENTS.md

This file provides guidance to AI coding agents like Claude Code (claude.ai/code), Cursor AI, Codex, Gemini CLI, GitHub Copilot, and other AI coding assistants when working with code in this repository.

## Project Purpose

Bedrock Wrapper translates OpenAI-compatible API objects to AWS Bedrock's serverless inference LLMs. It acts as an adapter layer allowing applications using the OpenAI API format to seamlessly call AWS Bedrock models.

## Development Commands

```bash
npm install           # Install dependencies
npm run clean         # Clean reinstall (removes node_modules and package-lock.json)
npm run test          # Test all models with both Invoke and Converse APIs
npm run test:invoke   # Test with Invoke API only
npm run test:converse # Test with Converse API only
npm run test-vision   # Test vision capabilities
npm run test-stop     # Test stop sequences
npm run interactive   # Interactive CLI for testing specific models
```

## Architecture Overview

```
bedrock-wrapper.js (main entry)
│
├── Converse API Path (useConverseAPI: true)
│   └── Unified format for all models
│
└── Invoke API Path (default)
    └── Model-specific request/response handling
        │
        └── bedrock-models.js
            └── Model configurations registry
```

### Key Functions in bedrock-wrapper.js

| Function | Line | Purpose |
|----------|------|---------|
| `bedrockWrapper()` | ~501 | Main entry point, async generator |
| `convertToConverseFormat()` | ~86 | OpenAI messages → Converse API format |
| `processMessagesForInvoke()` | ~168 | Model-specific message processing |
| `buildInvokePrompt()` | ~234 | Constructs model-specific prompts |
| `buildInvokeRequest()` | ~300 | Creates model-specific request objects |
| `executeInvokeAPI()` | ~409 | Handles streaming and non-streaming |
| `findAwsModelWithId()` | ~763 | Model lookup by name or ID |

### Model Configuration Schema (bedrock-models.js)

Each model entry requires:
- `modelName`: Consumer-facing name (e.g., "Claude-4-5-Sonnet")
- `modelId`: AWS Bedrock identifier
- `vision`: Boolean for image support
- `messages_api`: Boolean (true = structured messages, false = prompt string)
- `response_chunk_element`: JSON path for streaming response extraction
- `response_nonchunk_element`: JSON path for non-streaming response
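A minimal sketch of the registry lookup that `findAwsModelWithId()` performs: matching a request's model string against either `modelName` or `modelId`. The two entries are abbreviated from bedrock-models.js; the function body itself is illustrative, not the wrapper's actual implementation.

```javascript
// Abbreviated registry entries (copied values from bedrock-models.js).
const bedrock_models = [
  { modelName: "Llama-4-Scout-17b", modelId: "us.meta.llama4-scout-17b-instruct-v1:0" },
  { modelName: "Llama-4-Maverick-17b", modelId: "us.meta.llama4-maverick-17b-instruct-v1:0" },
];

// Illustrative lookup: accepts either the consumer-facing name or the AWS ID.
function findAwsModelWithId(model) {
  return bedrock_models.find((m) => m.modelName === model || m.modelId === model);
}
```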

### Two API Paths

1. **Converse API** (`useConverseAPI: true`): Unified format, handles all models consistently
2. **Invoke API** (default): Model-specific formatting required

Some models (e.g., DeepSeek-V3.1) have `converse_api_only: true` and automatically use the Converse API.
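The path selection described above can be sketched as a small dispatch function. This is an illustration of the rule, not the wrapper's actual code; `useConverseAPI` comes from the caller's options and `converse_api_only` from the model's registry entry.

```javascript
// Illustrative sketch: a model with converse_api_only always takes the
// Converse path; otherwise the caller's useConverseAPI option decides.
function selectApiPath(modelConfig, options = {}) {
  if (options.useConverseAPI || modelConfig.converse_api_only) {
    return "converse"; // unified format for all models
  }
  return "invoke"; // model-specific request/response handling
}
```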

## Model Family Patterns

| Family | API Type | Special Handling |
|--------|----------|------------------|
| Claude | Messages API | Thinking tags: `<think>`, anthropic_version required |
| Nova | Messages API | Content as array `[{text: content}]`, schemaVersion: "messages-v1" |
| Llama | Prompt-based | Role tags: `<\|begin_of_text\|>`, `<\|start_header_id\|>` |
| Mistral | Prompt-based (older) / Messages (v3+) | `[INST]`/`[/INST]` tags for older models |
| GPT-OSS | Messages API | Reasoning tags: `<reasoning>`, streaming not supported |
| Qwen | Messages API | Standard messages format |
| DeepSeek | Messages API | V3.1 requires Converse API only |
| Gemma | Messages API | Standard messages format with vision |
| Kimi | Messages API | preserve_reasoning for thinking models |
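For the prompt-based Llama family, the role tags above are assembled into a single prompt string. The token values below are copied from the Llama configurations in bedrock-models.js; `buildPrompt` is a simplified illustration of what `buildInvokePrompt()` does, not its actual implementation.

```javascript
// Token values from the Llama entries in bedrock-models.js.
const llamaTokens = {
  bos_text: "<|begin_of_text|>",
  role_prefix: "<|start_header_id|>",
  role_suffix: "<|end_header_id|>",
  eom_text: "<|eot_id|>",
};

// Illustrative assembly: wrap each message in role header tokens, then
// leave an open assistant header so the model continues the conversation.
function buildPrompt(messages, t) {
  let prompt = t.bos_text;
  for (const { role, content } of messages) {
    prompt += `${t.role_prefix}${role}${t.role_suffix}\n\n${content}${t.eom_text}`;
  }
  prompt += `${t.role_prefix}assistant${t.role_suffix}\n\n`;
  return prompt;
}
```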

## Adding a New Model

1. Add entry to `bedrock_models` array in `bedrock-models.js`
2. For prompt-based models, define all role prefix/suffix tokens
3. For vision models, set `vision: true` and add `image_support` config
4. For thinking models, add `thinking` config in `special_request_schema`
5. Test with `npm run test` to verify both API paths

## Key Implementation Details

### Image Processing
- Uses Sharp library to resize images to max 2048x2048
- Converts all formats to JPEG for consistency
- Handles base64, data URLs, and HTTP URLs

### Thinking Mode
- Claude: `<think>` tags, budget_tokens in special_request_schema
- GPT-OSS: `<reasoning>` tags, preserve_reasoning flag
- Temperature auto-set to 1.0, budget_tokens constrained to 80% of max_tokens
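The 80% constraint above can be expressed as a one-line clamp. The function name and the flooring behavior are assumptions for illustration; only the 80%-of-max_tokens ceiling comes from the text.

```javascript
// Illustrative: budget_tokens may not exceed 80% of max_tokens.
function clampThinkingBudget(budgetTokens, maxTokens) {
  const ceiling = Math.floor(maxTokens * 0.8);
  return Math.min(budgetTokens, ceiling);
}
```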

### Stop Sequences
- Claude: `stop_sequences` (up to 8,191)
- Nova: `stopSequences` (up to 4)
- Mistral: `stop` (up to 10)
- Llama: Not supported by AWS Bedrock
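The per-family parameter names and limits above can be sketched as a small normalization table. The `normalizeStopSequences` helper and the drop-silently behavior for Llama are assumptions for illustration, not the wrapper's actual code.

```javascript
// Parameter name and maximum count per family, mirroring the bullets above.
const stopLimits = {
  claude: { param: "stop_sequences", max: 8191 },
  nova: { param: "stopSequences", max: 4 },
  mistral: { param: "stop", max: 10 },
  llama: null, // not supported by AWS Bedrock
};

// Illustrative: emit the family-specific parameter, truncated to its limit.
function normalizeStopSequences(family, sequences) {
  const limit = stopLimits[family];
  if (!limit) return {}; // assumed behavior: drop for unsupported families
  return { [limit.param]: sequences.slice(0, limit.max) };
}
```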

## Environment Setup

Create `.env` file:
```
AWS_REGION=us-west-2
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
LLM_MAX_GEN_TOKENS=1024
LLM_TEMPERATURE=0.2
```

## Test Output Files

After running tests, check these files for results:
- `test-models-output.txt`
- `test-vision-models-output.txt`
- `test-stop-sequences-output.txt`
- `test-converse-api-output.txt`
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,23 @@
 
 All notable changes to this project will be documented in this file.
 
+## [2.9.0] - 2026-01-08 (Llama 4 Models)
+
+### ✨ Added
+
+- Support for Llama 4 Scout and Maverick models
+  - Llama-4-Scout-17b (vision support, 2K max output tokens)
+  - Llama-4-Maverick-17b (vision support, 2K max output tokens)
+  - First Llama models with multimodal/vision capabilities in this wrapper
+  - Cross-region inference profile IDs (us.meta.llama4-*)
+
+### ⚙️ Technical Details
+
+- **Vision Support**: Both models support image inputs (first Llama models with vision)
+- **API Compatibility**: Both Invoke API and Converse API paths supported
+- **Streaming**: Full streaming and non-streaming support
+- **Stop Sequences**: Not supported (AWS Bedrock limitation for all Llama models)
+
 ## [2.8.0] - 2025-12-05 (New Models: Claude Opus 4.5, Gemma, Kimi, MiniMax, Mistral, Nova)
 
 ### ✨ Added
package/README.md
CHANGED
@@ -163,6 +163,8 @@ Bedrock Wrapper is an npm package that simplifies the integration of existing Op
 | Llama-3-2-11b | us.meta.llama3-2-11b-instruct-v1:0 | ❌ |
 | Llama-3-2-90b | us.meta.llama3-2-90b-instruct-v1:0 | ❌ |
 | Llama-3-3-70b | us.meta.llama3-3-70b-instruct-v1:0 | ❌ |
+| Llama-4-Scout-17b | us.meta.llama4-scout-17b-instruct-v1:0 | ✅ |
+| Llama-4-Maverick-17b | us.meta.llama4-maverick-17b-instruct-v1:0 | ✅ |
 | Magistral-Small-2509 | mistral.magistral-small-2509 | ❌ |
 | MiniMax-M2 | minimax.minimax-m2 | ❌ |
 | Ministral-3-3b | mistral.ministral-3-3b-instruct | ✅ |
@@ -229,7 +231,7 @@ for await (const chunk of bedrockWrapper(awsCreds, openaiChatCompletionsCreateOb
 
 ### Image Support
 
-For models with image support (Claude 4+ series including Claude 4.5 Opus, Claude 4.5 Sonnet, Claude 4.5 Haiku, Claude 3.7 Sonnet, Claude 3.5 Sonnet, Claude 3 Haiku, Nova Pro, Nova Lite, Nova 2 Lite, Mistral Large 3, Ministral 3 series,
+For models with image support (Claude 4+ series including Claude 4.5 Opus, Claude 4.5 Sonnet, Claude 4.5 Haiku, Claude 3.7 Sonnet, Claude 3.5 Sonnet, Claude 3 Haiku, Nova Pro, Nova Lite, Nova 2 Lite, Mistral Large 3, Ministral 3 series, Gemma 3 series, and Llama 4 series), you can include images in your messages using the following format (not all models support system prompts):
 
 ```javascript
 messages = [
package/bedrock-models.js
CHANGED
@@ -547,6 +547,62 @@ export const bedrock_models = [
         "max_supported_response_tokens": 2048,
         "response_chunk_element": "generation"
     },
+    {
+        // =======================
+        // == Llama 4 Scout 17b ==
+        // =======================
+        "modelName": "Llama-4-Scout-17b",
+        // "modelId": "meta.llama4-scout-17b-instruct-v1:0",
+        "modelId": "us.meta.llama4-scout-17b-instruct-v1:0",
+        "vision": true,
+        "messages_api": false,
+        "bos_text": "<|begin_of_text|>",
+        "role_system_message_prefix": "",
+        "role_system_message_suffix": "",
+        "role_system_prefix": "<|start_header_id|>",
+        "role_system_suffix": "<|end_header_id|>",
+        "role_user_message_prefix": "",
+        "role_user_message_suffix": "",
+        "role_user_prefix": "<|start_header_id|>",
+        "role_user_suffix": "<|end_header_id|>",
+        "role_assistant_message_prefix": "",
+        "role_assistant_message_suffix": "",
+        "role_assistant_prefix": "<|start_header_id|>",
+        "role_assistant_suffix": "<|end_header_id|>",
+        "eom_text": "<|eot_id|>",
+        "display_role_names": true,
+        "max_tokens_param_name": "max_gen_len",
+        "max_supported_response_tokens": 2048,
+        "response_chunk_element": "generation"
+    },
+    {
+        // ==========================
+        // == Llama 4 Maverick 17b ==
+        // ==========================
+        "modelName": "Llama-4-Maverick-17b",
+        // "modelId": "meta.llama4-maverick-17b-instruct-v1:0",
+        "modelId": "us.meta.llama4-maverick-17b-instruct-v1:0",
+        "vision": true,
+        "messages_api": false,
+        "bos_text": "<|begin_of_text|>",
+        "role_system_message_prefix": "",
+        "role_system_message_suffix": "",
+        "role_system_prefix": "<|start_header_id|>",
+        "role_system_suffix": "<|end_header_id|>",
+        "role_user_message_prefix": "",
+        "role_user_message_suffix": "",
+        "role_user_prefix": "<|start_header_id|>",
+        "role_user_suffix": "<|end_header_id|>",
+        "role_assistant_message_prefix": "",
+        "role_assistant_message_suffix": "",
+        "role_assistant_prefix": "<|start_header_id|>",
+        "role_assistant_suffix": "<|end_header_id|>",
+        "eom_text": "<|eot_id|>",
+        "display_role_names": true,
+        "max_tokens_param_name": "max_gen_len",
+        "max_supported_response_tokens": 2048,
+        "response_chunk_element": "generation"
+    },
     {
         // ==================
         // == Llama 3.2 1b ==
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "bedrock-wrapper",
-  "version": "2.8.0",
+  "version": "2.9.0",
   "description": "🪨 Bedrock Wrapper is an npm package that simplifies the integration of existing OpenAI-compatible API objects with AWS Bedrock's serverless inference LLMs.",
   "homepage": "https://www.equilllabs.com/projects/bedrock-wrapper",
   "repository": {
package/specs--completed/llama-4-model-support/PLAN-DRAFT-20260108-130000.md
ADDED
@@ -0,0 +1,241 @@
# Llama 4 Model Support - Implementation Plan

**Created:** 2026-01-08
**Status:** Complete (All Phases Verified)
**Confidence:** 92% (Requirements: 25/25, Feasibility: 24/25, Integration: 23/25, Risk: 20/25)

## 1. Executive Summary

Add support for two new AWS Bedrock models: Llama 4 Scout 17B and Llama 4 Maverick 17B. Both models are multimodal (vision + text) and will be the first Llama models in this wrapper with vision enabled.

## 2. Requirements

### 2.1 Functional Requirements
- [x] FR-1: Add `Llama-4-Scout-17b` model configuration to bedrock-models.js
- [x] FR-2: Add `Llama-4-Maverick-17b` model configuration to bedrock-models.js
- [x] FR-3: Enable vision/image input support for both models
- [x] FR-4: Support both Invoke API and Converse API paths
- [x] FR-5: Use cross-region inference profile IDs (following existing pattern)

### 2.2 Non-Functional Requirements
- [x] NFR-1: Follow existing Llama model configuration patterns
- [x] NFR-2: Maintain backward compatibility with existing code

### 2.3 Out of Scope
- New test frameworks or test file creation
- Changes to core wrapper logic
- Tool use / function calling support

### 2.4 Testing Strategy
| Preference | Selection |
|------------|-----------|
| Test Types | Expand existing model tests |
| Phase Testing | Run after implementation |
| Coverage Target | Same coverage as existing Llama models |

## 3. Tech Stack

| Category | Technology | Version | Justification |
|----------|------------|---------|---------------|
| Language | JavaScript (ES Modules) | N/A | Existing project standard |
| Runtime | Node.js | Existing | No changes |
| Config Format | JSON objects in JS | N/A | Existing pattern |

No new technologies required - configuration addition only.

## 4. Architecture

### 4.1 Architecture Pattern
**Configuration Registry Pattern** - Models are defined as entries in the `bedrock_models` array in `bedrock-models.js`. New models follow the existing Llama configuration schema.

### 4.2 System Context Diagram
```
┌─────────────────────────────────────────────────────────────┐
│                     bedrock-wrapper.js                      │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ Converse API │  │  Invoke API  │  │ Image Process│       │
│  │     Path     │  │     Path     │  │   (Sharp)    │       │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
│         │                 │                 │               │
│         └─────────┬───────┴─────────────────┘               │
│                   │                                         │
│         ┌─────────▼─────────┐                               │
│         │ bedrock-models.js │◄── ADD NEW MODELS HERE        │
│         │ (model registry)  │                               │
│         └───────────────────┘                               │
└─────────────────────────────────────────────────────────────┘
                    │
                    ▼
          ┌─────────────────────┐
          │ AWS Bedrock         │
          │  - Llama 4 Scout    │
          │  - Llama 4 Maverick │
          └─────────────────────┘
```

### 4.3 Component Overview

| Component | Responsibility | Dependencies |
|-----------|----------------|--------------|
| bedrock-models.js | Model configuration registry | None |
| test-models.js | General model testing | bedrock-models.js |
| test-vision.js | Vision capability testing | bedrock-models.js |

### 4.4 Data Model

**Model Configuration Schema (per existing Llama pattern):**

| Field | Type | Value (Scout) | Value (Maverick) |
|-------|------|---------------|------------------|
| modelName | string | "Llama-4-Scout-17b" | "Llama-4-Maverick-17b" |
| modelId | string | "us.meta.llama4-scout-17b-instruct-v1:0" | "us.meta.llama4-maverick-17b-instruct-v1:0" |
| vision | boolean | true | true |
| messages_api | boolean | false | false |
| bos_text | string | "<\|begin_of_text\|>" | "<\|begin_of_text\|>" |
| role_system_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
| role_system_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
| role_user_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
| role_user_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
| role_assistant_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
| role_assistant_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
| eom_text | string | "<\|eot_id\|>" | "<\|eot_id\|>" |
| display_role_names | boolean | true | true |
| max_tokens_param_name | string | "max_gen_len" | "max_gen_len" |
| max_supported_response_tokens | number | 2048 | 2048 |
| response_chunk_element | string | "generation" | "generation" |

### 4.5 API Design

No new APIs. Models integrate with existing:
- `bedrockWrapper()` - main entry point
- `listBedrockWrapperSupportedModels()` - model listing
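Since `bedrockWrapper()` is an async generator, callers consume it with a `for await` loop. The sketch below shows that consumption pattern against a mock generator; the mock and the `collect` helper are illustrative stand-ins, because a real call needs AWS credentials and a request object.

```javascript
// Mock stand-in for the bedrockWrapper() async generator:
// yields completion text chunk by chunk, like the streaming path.
async function* mockBedrockWrapper() {
  yield "Hello";
  yield ", ";
  yield "world";
}

// Accumulate streamed chunks into the full completion string.
async function collect() {
  let completion = "";
  for await (const chunk of mockBedrockWrapper()) {
    completion += chunk;
  }
  return completion;
}
```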

## 5. Implementation Phases

### Phase 1: Add Model Configurations
**Goal:** Add both Llama 4 model configurations to bedrock-models.js
**Dependencies:** None

- [x] Task 1.1: Add Llama 4 Scout 17B configuration after existing Llama models (~line 520)
- [x] Task 1.2: Add Llama 4 Maverick 17B configuration after Scout
- [x] Task 1.3: Include commented single-region model IDs as fallback reference

**Insertion Point:** After the Llama 3.3 70b model block (around line 520)

**Configuration Template:**
```javascript
{
    // =======================
    // == Llama 4 Scout 17b ==
    // =======================
    "modelName": "Llama-4-Scout-17b",
    // "modelId": "meta.llama4-scout-17b-instruct-v1:0",
    "modelId": "us.meta.llama4-scout-17b-instruct-v1:0",
    "vision": true,
    "messages_api": false,
    "bos_text": "<|begin_of_text|>",
    "role_system_message_prefix": "",
    "role_system_message_suffix": "",
    "role_system_prefix": "<|start_header_id|>",
    "role_system_suffix": "<|end_header_id|>",
    "role_user_message_prefix": "",
    "role_user_message_suffix": "",
    "role_user_prefix": "<|start_header_id|>",
    "role_user_suffix": "<|end_header_id|>",
    "role_assistant_message_prefix": "",
    "role_assistant_message_suffix": "",
    "role_assistant_prefix": "<|start_header_id|>",
    "role_assistant_suffix": "<|end_header_id|>",
    "eom_text": "<|eot_id|>",
    "display_role_names": true,
    "max_tokens_param_name": "max_gen_len",
    "max_supported_response_tokens": 2048,
    "response_chunk_element": "generation"
},
{
    // ==========================
    // == Llama 4 Maverick 17b ==
    // ==========================
    "modelName": "Llama-4-Maverick-17b",
    // "modelId": "meta.llama4-maverick-17b-instruct-v1:0",
    "modelId": "us.meta.llama4-maverick-17b-instruct-v1:0",
    "vision": true,
    "messages_api": false,
    "bos_text": "<|begin_of_text|>",
    "role_system_message_prefix": "",
    "role_system_message_suffix": "",
    "role_system_prefix": "<|start_header_id|>",
    "role_system_suffix": "<|end_header_id|>",
    "role_user_message_prefix": "",
    "role_user_message_suffix": "",
    "role_user_prefix": "<|start_header_id|>",
    "role_user_suffix": "<|end_header_id|>",
    "role_assistant_message_prefix": "",
    "role_assistant_message_suffix": "",
    "role_assistant_prefix": "<|start_header_id|>",
    "role_assistant_suffix": "<|end_header_id|>",
    "eom_text": "<|eot_id|>",
    "display_role_names": true,
    "max_tokens_param_name": "max_gen_len",
    "max_supported_response_tokens": 2048,
    "response_chunk_element": "generation"
},
```

### Phase 2: Verification Testing
**Goal:** Verify both models work correctly with existing tests
**Dependencies:** Phase 1 complete

- [x] Task 2.1: Run `npm run test` - verify both models pass streaming and non-streaming tests
- [x] Task 2.2: Run `npm run test-vision` - verify vision capability works for both models
- [x] Task 2.3: Run `npm run test:converse` - verify Converse API works
- [x] Task 2.4: Review output files for any errors or warnings

## 6. Risks and Mitigations

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Cross-region profile ID not yet available | Low | Medium | Single-region ID included as commented fallback |
| Llama 4 uses different special tokens | Low | High | Test both APIs; adjust tokens if tests fail |
| Vision handling differs from expected | Low | Medium | Existing Llama image format in codebase; Converse API as fallback |
| Model not available in user's region | Medium | Low | Document supported regions (us-east-1, us-east-2, us-west-1, us-west-2) |

## 7. Success Criteria

- [x] Both models appear in `listBedrockWrapperSupportedModels()` output
- [x] `npm run test` passes for both Llama 4 models (streaming & non-streaming)
- [x] `npm run test-vision` includes and passes for both models
- [x] Both Invoke API and Converse API paths work correctly
- [x] No regression in existing model tests

## 8. Open Questions

None - all requirements clarified during planning.

## 9. Assumptions

- AWS Bedrock cross-region inference profiles are available for Llama 4 models (following pattern `us.meta.llama4-*`)
- Llama 4 models use the same special tokens as the Llama 3.x family
- Vision handling follows the existing Llama image format in the codebase
- Max response token limit of 2048 is appropriate (same as other Llama models)

## 10. Completion Summary

**Completed:** 2026-01-08 | **Completion Rate:** 100% (7/7 tasks)

### What Was Built
Added support for two new AWS Bedrock models: Llama 4 Scout 17B and Llama 4 Maverick 17B. These are the first Llama models in the wrapper with vision/multimodal support enabled. Both models work with the Invoke API and Converse API paths, in streaming and non-streaming modes.

### Files Modified
| File | Changes |
|------|---------|
| `bedrock-models.js` | Added Llama-4-Scout-17b config (lines 550-577), Llama-4-Maverick-17b config (lines 578-605) |

### Test Results
- `npm run test --both`: Both models pass all 4 test modes (Invoke/Converse × Streaming/Non-streaming)
- `npm run test-vision --both`: Both models successfully describe images
- `npm run test:converse`: Both models pass Converse API tests

### Known Limitations
- Llama 4 models do not support stop sequences (AWS Bedrock limitation, consistent with other Llama models)