bedrock-wrapper 2.7.2 → 2.9.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +123 -0
- package/CHANGELOG.md +101 -4
- package/README.md +54 -32
- package/bedrock-models.js +409 -11
- package/bedrock-wrapper.js +22 -6
- package/package.json +2 -2
- package/specs--completed/llama-4-model-support/PLAN-DRAFT-20260108-130000.md +241 -0
- package/test-converse-output.txt +7477 -0
- package/test-final-output.txt +7577 -0
- package/test-run-output.txt +7629 -0
@@ -0,0 +1,241 @@
# Llama 4 Model Support - Implementation Plan

**Created:** 2026-01-08
**Status:** Complete (All Phases Verified)
**Confidence:** 92% (Requirements: 25/25, Feasibility: 24/25, Integration: 23/25, Risk: 20/25)

## 1. Executive Summary

Add support for two new AWS Bedrock models: Llama 4 Scout 17B and Llama 4 Maverick 17B. Both models are multimodal (vision + text) and will be the first Llama models in this wrapper with vision enabled.

## 2. Requirements

### 2.1 Functional Requirements
- [x] FR-1: Add `Llama-4-Scout-17b` model configuration to bedrock-models.js
- [x] FR-2: Add `Llama-4-Maverick-17b` model configuration to bedrock-models.js
- [x] FR-3: Enable vision/image input support for both models
- [x] FR-4: Support both Invoke API and Converse API paths
- [x] FR-5: Use cross-region inference profile IDs (following existing pattern)

### 2.2 Non-Functional Requirements
- [x] NFR-1: Follow existing Llama model configuration patterns
- [x] NFR-2: Maintain backward compatibility with existing code

### 2.3 Out of Scope
- New test frameworks or test file creation
- Changes to core wrapper logic
- Tool use / function calling support

### 2.4 Testing Strategy
| Preference | Selection |
|------------|-----------|
| Test Types | Expand existing model tests |
| Phase Testing | Run after implementation |
| Coverage Target | Same coverage as existing Llama models |

## 3. Tech Stack

| Category | Technology | Version | Justification |
|----------|------------|---------|---------------|
| Language | JavaScript (ES Modules) | N/A | Existing project standard |
| Runtime | Node.js | Existing | No changes |
| Config Format | JSON objects in JS | N/A | Existing pattern |

No new technologies required - configuration addition only.

## 4. Architecture

### 4.1 Architecture Pattern
**Configuration Registry Pattern** - Models are defined as entries in the `bedrock_models` array in `bedrock-models.js`. New models follow the existing Llama configuration schema.
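A minimal sketch of the registry pattern, under stated assumptions: the sample entries and the `findModel` helper below are illustrative only (the real registry lives in `bedrock-models.js` and the wrapper resolves `modelName` internally).

```javascript
// Illustrative miniature of the bedrock-models.js registry.
// Entries and helper are hypothetical, not the package's actual code.
const bedrock_models = [
    { modelName: "Llama-3-3-70b", modelId: "us.meta.llama3-3-70b-instruct-v1:0", vision: false },
    { modelName: "Llama-4-Scout-17b", modelId: "us.meta.llama4-scout-17b-instruct-v1:0", vision: true },
];

// Hypothetical lookup helper: resolve a config entry by its public modelName.
function findModel(models, name) {
    const model = models.find((m) => m.modelName === name);
    if (!model) throw new Error(`Unsupported model: ${name}`);
    return model;
}

const scout = findModel(bedrock_models, "Llama-4-Scout-17b");
console.log(scout.modelId); // us.meta.llama4-scout-17b-instruct-v1:0
```

Adding a model is then purely additive: a new object in the array, with no changes to lookup or dispatch logic.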
### 4.2 System Context Diagram
```
┌─────────────────────────────────────────────────────────────┐
│                     bedrock-wrapper.js                      │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ Converse API │  │  Invoke API  │  │ Image Process│       │
│  │     Path     │  │     Path     │  │   (Sharp)    │       │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
│         │                 │                 │               │
│         └─────────┬───────┴─────────────────┘               │
│                   │                                         │
│         ┌─────────▼─────────┐                               │
│         │ bedrock-models.js │◄── ADD NEW MODELS HERE        │
│         │ (model registry)  │                               │
│         └───────────────────┘                               │
└─────────────────────────────────────────────────────────────┘
                    │
                    ▼
          ┌─────────────────────┐
          │     AWS Bedrock     │
          │  - Llama 4 Scout    │
          │  - Llama 4 Maverick │
          └─────────────────────┘
```

### 4.3 Component Overview

| Component | Responsibility | Dependencies |
|-----------|----------------|--------------|
| bedrock-models.js | Model configuration registry | None |
| test-models.js | General model testing | bedrock-models.js |
| test-vision.js | Vision capability testing | bedrock-models.js |

### 4.4 Data Model

**Model Configuration Schema (per existing Llama pattern):**

| Field | Type | Value (Scout) | Value (Maverick) |
|-------|------|---------------|------------------|
| modelName | string | "Llama-4-Scout-17b" | "Llama-4-Maverick-17b" |
| modelId | string | "us.meta.llama4-scout-17b-instruct-v1:0" | "us.meta.llama4-maverick-17b-instruct-v1:0" |
| vision | boolean | true | true |
| messages_api | boolean | false | false |
| bos_text | string | "<\|begin_of_text\|>" | "<\|begin_of_text\|>" |
| role_system_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
| role_system_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
| role_user_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
| role_user_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
| role_assistant_prefix | string | "<\|start_header_id\|>" | "<\|start_header_id\|>" |
| role_assistant_suffix | string | "<\|end_header_id\|>" | "<\|end_header_id\|>" |
| eom_text | string | "<\|eot_id\|>" | "<\|eot_id\|>" |
| display_role_names | boolean | true | true |
| max_tokens_param_name | string | "max_gen_len" | "max_gen_len" |
| max_supported_response_tokens | number | 2048 | 2048 |
| response_chunk_element | string | "generation" | "generation" |
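Because `messages_api` is false, these template fields drive plain-text prompt assembly on the Invoke path. A sketch of how such fields could compose a Llama-3.x-style prompt; the `buildPrompt` helper is illustrative, not the wrapper's actual implementation:

```javascript
// Illustrative subset of the Scout config's prompt-template fields.
const cfg = {
    bos_text: "<|begin_of_text|>",
    role_system_prefix: "<|start_header_id|>",
    role_system_suffix: "<|end_header_id|>",
    role_user_prefix: "<|start_header_id|>",
    role_user_suffix: "<|end_header_id|>",
    role_assistant_prefix: "<|start_header_id|>",
    role_assistant_suffix: "<|end_header_id|>",
    eom_text: "<|eot_id|>",
    display_role_names: true,
};

// Hypothetical helper: wrap each message in its role header and
// close it with the end-of-message token.
function buildPrompt(cfg, messages) {
    let prompt = cfg.bos_text;
    for (const { role, content } of messages) {
        const name = cfg.display_role_names ? role : "";
        prompt += `${cfg[`role_${role}_prefix`]}${name}${cfg[`role_${role}_suffix`]}\n\n${content}${cfg.eom_text}`;
    }
    // Open an assistant header so the model continues from here.
    prompt += `${cfg.role_assistant_prefix}assistant${cfg.role_assistant_suffix}\n\n`;
    return prompt;
}

const prompt = buildPrompt(cfg, [
    { role: "system", content: "You are concise." },
    { role: "user", content: "Say hi." },
]);
```

This is why the "Llama 4 uses different special tokens" risk matters: if the token strings change, only these config values need to change, not the assembly logic.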
### 4.5 API Design

No new APIs. Models integrate with existing:
- `bedrockWrapper()` - main entry point
- `listBedrockWrapperSupportedModels()` - model listing

## 5. Implementation Phases

### Phase 1: Add Model Configurations
**Goal:** Add both Llama 4 model configurations to bedrock-models.js
**Dependencies:** None

- [x] Task 1.1: Add Llama 4 Scout 17B configuration after existing Llama models (~line 520)
- [x] Task 1.2: Add Llama 4 Maverick 17B configuration after Scout
- [x] Task 1.3: Include commented single-region model IDs as fallback reference

**Insertion Point:** After the Llama 3.3 70b model block (around line 520)

**Configuration Template:**
```javascript
{
    // =======================
    // == Llama 4 Scout 17b ==
    // =======================
    "modelName": "Llama-4-Scout-17b",
    // "modelId": "meta.llama4-scout-17b-instruct-v1:0",
    "modelId": "us.meta.llama4-scout-17b-instruct-v1:0",
    "vision": true,
    "messages_api": false,
    "bos_text": "<|begin_of_text|>",
    "role_system_message_prefix": "",
    "role_system_message_suffix": "",
    "role_system_prefix": "<|start_header_id|>",
    "role_system_suffix": "<|end_header_id|>",
    "role_user_message_prefix": "",
    "role_user_message_suffix": "",
    "role_user_prefix": "<|start_header_id|>",
    "role_user_suffix": "<|end_header_id|>",
    "role_assistant_message_prefix": "",
    "role_assistant_message_suffix": "",
    "role_assistant_prefix": "<|start_header_id|>",
    "role_assistant_suffix": "<|end_header_id|>",
    "eom_text": "<|eot_id|>",
    "display_role_names": true,
    "max_tokens_param_name": "max_gen_len",
    "max_supported_response_tokens": 2048,
    "response_chunk_element": "generation"
},
{
    // ==========================
    // == Llama 4 Maverick 17b ==
    // ==========================
    "modelName": "Llama-4-Maverick-17b",
    // "modelId": "meta.llama4-maverick-17b-instruct-v1:0",
    "modelId": "us.meta.llama4-maverick-17b-instruct-v1:0",
    "vision": true,
    "messages_api": false,
    "bos_text": "<|begin_of_text|>",
    "role_system_message_prefix": "",
    "role_system_message_suffix": "",
    "role_system_prefix": "<|start_header_id|>",
    "role_system_suffix": "<|end_header_id|>",
    "role_user_message_prefix": "",
    "role_user_message_suffix": "",
    "role_user_prefix": "<|start_header_id|>",
    "role_user_suffix": "<|end_header_id|>",
    "role_assistant_message_prefix": "",
    "role_assistant_message_suffix": "",
    "role_assistant_prefix": "<|start_header_id|>",
    "role_assistant_suffix": "<|end_header_id|>",
    "eom_text": "<|eot_id|>",
    "display_role_names": true,
    "max_tokens_param_name": "max_gen_len",
    "max_supported_response_tokens": 2048,
    "response_chunk_element": "generation"
},
```
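Before wiring a new entry in, it can be worth sanity-checking it against the schema in section 4.4. A hypothetical validator (not part of the package; field list mirrors the schema table):

```javascript
// Fields the wrapper's Invoke path reads from each registry entry,
// per the schema table above. List and validator are illustrative.
const REQUIRED_FIELDS = [
    "modelName", "modelId", "vision", "messages_api", "bos_text",
    "role_system_prefix", "role_system_suffix",
    "role_user_prefix", "role_user_suffix",
    "role_assistant_prefix", "role_assistant_suffix",
    "eom_text", "display_role_names", "max_tokens_param_name",
    "max_supported_response_tokens", "response_chunk_element",
];

// Hypothetical validator: throw if a config entry is missing any field.
function validateModelConfig(config) {
    const missing = REQUIRED_FIELDS.filter((f) => !(f in config));
    if (missing.length > 0) {
        throw new Error(`${config.modelName ?? "unknown"} missing: ${missing.join(", ")}`);
    }
    return true;
}
```

Running such a check over both new entries would catch a dropped field before any AWS round trip.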
### Phase 2: Verification Testing
**Goal:** Verify both models work correctly with existing tests
**Dependencies:** Phase 1 complete

- [x] Task 2.1: Run `npm run test` - verify both models pass streaming and non-streaming tests
- [x] Task 2.2: Run `npm run test-vision` - verify vision capability works for both models
- [x] Task 2.3: Run `npm run test:converse` - verify Converse API works
- [x] Task 2.4: Review output files for any errors or warnings

## 6. Risks and Mitigations

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Cross-region profile ID not yet available | Low | Medium | Single-region ID included as commented fallback |
| Llama 4 uses different special tokens | Low | High | Test both APIs; adjust tokens if tests fail |
| Vision handling differs from expected | Low | Medium | Existing Llama image format in codebase; Converse API as fallback |
| Model not available in user's region | Medium | Low | Document supported regions (us-east-1, us-east-2, us-west-1, us-west-2) |
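The first risk's mitigation (commented single-region ID as fallback) can also be expressed as a tiny runtime helper. A sketch, assuming cross-region inference profile IDs simply prepend a region-group prefix such as `us.` to the base model ID; the helper itself is hypothetical, not shipped by the package:

```javascript
// Hypothetical fallback: derive the single-region base model ID from a
// cross-region inference profile ID by stripping the region-group prefix.
function baseModelId(modelId) {
    // Assumed region groups; profile IDs look like
    // "us.meta.llama4-scout-17b-instruct-v1:0".
    const match = modelId.match(/^(us|eu|apac)\.(.+)$/);
    return match ? match[2] : modelId;
}

console.log(baseModelId("us.meta.llama4-scout-17b-instruct-v1:0"));
// meta.llama4-scout-17b-instruct-v1:0
```

IDs that carry no region-group prefix pass through unchanged, so the helper is safe to apply to the whole registry.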
## 7. Success Criteria

- [x] Both models appear in `listBedrockWrapperSupportedModels()` output
- [x] `npm run test` passes for both Llama 4 models (streaming & non-streaming)
- [x] `npm run test-vision` includes and passes for both models
- [x] Both Invoke API and Converse API paths work correctly
- [x] No regression in existing model tests

## 8. Open Questions

None - all requirements clarified during planning.

## 9. Assumptions

- AWS Bedrock cross-region inference profiles are available for Llama 4 models (following pattern `us.meta.llama4-*`)
- Llama 4 models use the same special tokens as the Llama 3.x family
- Vision handling follows the existing Llama image format in the codebase
- Max response token limit of 2048 is appropriate (same as other Llama models)

## 10. Completion Summary

**Completed:** 2026-01-08 | **Completion Rate:** 100% (7/7 tasks)

### What Was Built
Added support for two new AWS Bedrock models: Llama 4 Scout 17B and Llama 4 Maverick 17B. These are the first Llama models in the wrapper with vision/multimodal support enabled. Both models work through the Invoke API and Converse API paths, in both streaming and non-streaming modes.

### Files Modified
| File | Changes |
|------|---------|
| `bedrock-models.js` | Added Llama-4-Scout-17b config (lines 550-577), Llama-4-Maverick-17b config (lines 578-605) |

### Test Results
- `npm run test --both`: Both models pass all 4 test modes (Invoke/Converse × Streaming/Non-streaming)
- `npm run test-vision --both`: Both models successfully describe images
- `npm run test:converse`: Both models pass Converse API tests

### Known Limitations
- Llama 4 models do not support stop sequences (AWS Bedrock limitation, consistent with other Llama models)
|