@aws/ml-container-creator 0.13.3 → 0.13.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +23 -5
- package/infra/ci-harness/package-lock.json +1 -5
- package/package.json +5 -3
- package/pyproject.toml +21 -0
- package/requirements.txt +19 -0
- package/servers/instance-sizer/lib/model-resolver.js +127 -185
- package/servers/instance-sizer/lib/vram-estimator.js +86 -0
- package/servers/lib/catalogs/instances.json +0 -27
- package/src/app.js +2 -0
- package/src/lib/bootstrap-command-handler.js +35 -25
- package/src/lib/generated/cli-options.js +1 -1
- package/src/lib/generated/parameter-matrix.js +1 -1
- package/src/lib/generated/validation-rules.js +1 -1
- package/src/lib/prompt-runner.js +14 -31
- package/templates/IAM_PERMISSIONS.md +64 -13
- package/templates/do/.adapter_helper.py +451 -0
- package/templates/do/.benchmark_writer.py +13 -0
- package/templates/do/.stage_helper.py +419 -0
- package/templates/do/.tune_helper.py +218 -67
- package/templates/do/README.md +50 -604
- package/templates/do/__pycache__/.adapter_helper.cpython-312.pyc +0 -0
- package/templates/do/__pycache__/.benchmark_writer.cpython-312.pyc +0 -0
- package/templates/do/__pycache__/.tune_helper.cpython-312.pyc +0 -0
- package/templates/do/adapter +109 -4
- package/templates/do/benchmark +150 -12
- package/templates/do/build +2 -5
- package/templates/do/clean.d/async-inference.ejs +2 -5
- package/templates/do/clean.d/batch-transform.ejs +2 -5
- package/templates/do/clean.d/hyperpod-eks.ejs +2 -5
- package/templates/do/clean.d/managed-inference.ejs +2 -5
- package/templates/do/config +4 -0
- package/templates/do/deploy.d/async-inference.ejs +6 -9
- package/templates/do/deploy.d/batch-transform.ejs +4 -7
- package/templates/do/deploy.d/hyperpod-eks.ejs +1 -4
- package/templates/do/deploy.d/managed-inference.ejs +15 -6
- package/templates/do/lib/profile.sh +24 -15
- package/templates/do/push +2 -5
- package/templates/do/register +2 -5
- package/templates/do/stage +114 -292
- package/templates/do/submit +1 -4
- package/templates/do/tune +64 -10
- package/templates/MIGRATION.md +0 -488
- package/templates/TEMPLATE_SYSTEM.md +0 -243
package/templates/do/README.md CHANGED
@@ -1,611 +1,57 @@
-# do
-
-
+# do/ Scripts
+
+Standardized lifecycle scripts for your ML deployment project, following [do-framework](https://github.com/iankoulski/do-framework) conventions.
+
+> Full documentation: [awslabs.github.io/ml-container-creator](https://awslabs.github.io/ml-container-creator/)
+
+## Configuration
+
+- **Framework**: `<%= framework %>`
+- **Model Server**: `<%= modelServer %>`
+- **Region**: `<%= awsRegion %>`
+- **Instance**: `<%= instanceType %>`
+- **Build Target**: `<%= buildTarget %>`
+
+All settings centralized in `do/config`. Override via environment variables.
+
+## Available Scripts
+
+| Script | Description |
+|--------|-------------|
+| `./do/build` | Build Docker image locally |
+| `./do/push` | Push image to Amazon ECR |
+| `./do/submit` | Submit remote build to CodeBuild |
+| `./do/stage` | Download and stage model artifacts (HuggingFace → S3) |
+| `./do/deploy` | Deploy endpoint to SageMaker |
+| `./do/add-ic` | Add an inference component to an existing endpoint |
+| `./do/test` | Test local container or live endpoint |
+| `./do/benchmark` | Run SageMaker AI Benchmarking job |
+| `./do/optimize` | Apply model optimizations (quantization, compilation) |
+| `./do/tune` | Run hyperparameter tuning job |
+| `./do/train` | Launch SageMaker training job |
+| `./do/adapter` | Deploy/manage LoRA adapters on a base endpoint |
+| `./do/register` | Register model in SageMaker Model Registry |
+| `./do/manifest` | Generate deployment manifest (multi-model, multi-region) |
+| `./do/export` | Export project configuration as portable archive |
+| `./do/validate` | Validate project structure and configuration |
+| `./do/run` | Run container locally (docker run) |
+| `./do/logs` | Tail CloudWatch logs for endpoint |
+| `./do/status` | Show endpoint/IC status and health |
+| `./do/clean` | Clean up AWS resources (endpoint, ECR, CodeBuild) |
+| `./do/ci` | Run full CI pipeline (build → push → deploy → test) |
+| `./do/config` | Display resolved configuration |
 
 ## Quick Start
 
 ```bash
-# Build
-./do/
-
-# Test
-./do/
-
-# Push to Amazon ECR
-./do/push
-
-# Deploy to SageMaker
-export ROLE_ARN=arn:aws:iam::ACCOUNT_ID:role/YOUR_ROLE
-./do/deploy
-
-# Test the endpoint
-./do/test <endpoint-name>
-
-# Clean up resources
-./do/clean all
-```
-
-## Project Configuration
-
-**Deployment Configuration**: `<%= deploymentConfig %>`
-- Framework: `<%= framework %>`
-- Model Server: `<%= modelServer %>`
-- AWS Region: `<%= awsRegion %>`
-- Instance Type: `<%= instanceType %>`
-- Build Target: `<%= buildTarget %>`
-
-All configuration is centralized in `do/config`. You can override any setting by exporting environment variables before running scripts.
-
-## Available Commands
-
-### `./do/build`
-
-Build the Docker image for your ML model.
-
-**What it does:**
-- Validates Docker is installed
-- Handles framework-specific authentication (e.g., NGC for TensorRT-LLM)
-- Builds Docker image with appropriate base image (CPU or GPU)
-- Tags image with project name and timestamp
-
-**Usage:**
-```bash
-./do/build
-```
-
-<% if (modelServer === 'tensorrt-llm') { %>
-**TensorRT-LLM Requirements:**
-```bash
-# Set NGC API key before building
-export NGC_API_KEY=your_ngc_api_key
-./do/build
-```
-
-Get your NGC API key from [NVIDIA NGC](https://ngc.nvidia.com/).
-<% } %>
-
-**Output:**
-- Docker image: `<%= projectName %>:latest`
-- Tagged image: `<%= projectName %>:YYYYMMDD-HHMMSS`
-
----
-
-### `./do/push`
-
-Push the Docker image to Amazon Elastic Container Registry (ECR).
-
-**What it does:**
-- Validates AWS credentials
-- Authenticates with ECR
-- Creates ECR repository if it doesn't exist
-- Pushes all image tags to ECR
-- Displays pushed image URIs
-
-**Prerequisites:**
-- AWS credentials configured (`aws configure`)
-- Docker image built (`./do/build`)
-- IAM permissions for ECR operations
-
-**Usage:**
-```bash
-./do/push
-```
-
-**Output:**
-- Image URI: `ACCOUNT_ID.dkr.ecr.<%= awsRegion %>.amazonaws.com/ml-container-creator:<%= projectName %>-latest`
-
----
-
-### `./do/deploy`
-
-Deploy the container to AWS SageMaker as a managed endpoint.
-
-**What it does:**
-- Validates AWS credentials and execution role
-- Verifies ECR image exists
-- Creates SageMaker model
-- Creates endpoint configuration
-- Creates and waits for endpoint to reach InService status
-- Displays endpoint details and test command
-
-**Prerequisites:**
-- AWS credentials configured
-- Docker image pushed to ECR (`./do/push`<% if (buildTarget === 'codebuild') { %> or `./do/submit`<% } %>)
-- SageMaker execution role ARN
-
-**Usage:**
-```bash
-export ROLE_ARN=arn:aws:iam::ACCOUNT_ID:role/YOUR_SAGEMAKER_ROLE
-./do/deploy
-```
-
-Or set `ROLE_ARN` in `do/config` to avoid exporting each time.
-
-**Required IAM Permissions:**
-
-The execution role must have:
-- SageMaker model and endpoint management
-- ECR image access
-- S3 access (if using model artifacts)
-- CloudWatch Logs write access
-
-**Output:**
-- Endpoint name: `<%= projectName %>-endpoint-TIMESTAMP`
-- Endpoint status: InService
-- Test command: `./do/test <endpoint-name>`
-
-**Deployment Time:** Typically 5-10 minutes for endpoint to reach InService status.
-
----
-
-### `./do/run`
-
-Run the container locally for testing before deployment.
-
-**What it does:**
-- Detects if GPU support is needed based on deployment configuration
-- Starts Docker container with port 8080 exposed
-- Mounts model directory if specified
-- Streams container logs to console
-
-**Prerequisites:**
-- Docker image built (`./do/build`)
-<% if (framework === 'transformers') { %>- NVIDIA Docker runtime (for GPU support)
-<% } %>
-
-**Usage:**
-```bash
-./do/run
-```
-
-<% if (framework === 'transformers') { %>
-**GPU Requirements:**
-This deployment configuration requires GPU support. Ensure you have:
-- NVIDIA GPU with appropriate drivers
-- NVIDIA Container Toolkit installed
-- Docker configured to use NVIDIA runtime
-<% } %>
-
-**Testing the local container:**
-```bash
-# In another terminal, test the endpoints
-./do/test
-```
-
-**Stop the container:** Press `Ctrl+C`
-
----
-
-### `./do/test`
-
-Test the container or SageMaker endpoint with sample requests.
-
-**What it does:**
-- Sends health check request to `/ping` endpoint
-- Sends sample inference request to `/invocations` endpoint
-- Validates responses and displays results
-- Supports both local container and SageMaker endpoint testing
-
-**Usage:**
-
-Test local container:
-```bash
-./do/test
-```
-
-Test SageMaker endpoint:
-```bash
-./do/test <endpoint-name>
-```
-
-**Test Payloads:**
-
-<% if (framework === 'sklearn' || framework === 'xgboost' || framework === 'tensorflow') { %>
-Traditional ML models expect JSON with feature vectors:
-```json
-{
-"instances": [[1.0, 2.0, 3.0, 4.0]]
-}
-```
-<% } else if (framework === 'transformers') { %>
-Transformer models expect text generation requests:
-```json
-{
-"inputs": "What is machine learning?",
-"parameters": {
-"max_new_tokens": 50,
-"temperature": 0.7
-}
-}
-```
-<% } %>
-
-**Exit Codes:**
-- `0`: All tests passed
-- `1`: Test failed (connection error, HTTP error, or validation error)
-
----
-
-### `./do/clean`
-
-Clean up Docker images and AWS resources.
-
-**What it does:**
-- Removes local Docker images
-- Deletes images from ECR
-- Deletes SageMaker endpoints, configurations, and models
-- Prompts for confirmation before destructive operations
-
-**Usage:**
-
-Clean local Docker images:
-```bash
-./do/clean local
-```
-
-Clean ECR images:
-```bash
-./do/clean ecr
-```
-
-Clean SageMaker endpoint and related resources:
-```bash
-./do/clean endpoint
-```
-
-Clean everything:
-```bash
-./do/clean all
-```
-
-**Warning:** Cleaning operations are destructive and cannot be undone. Always confirm you want to delete resources.
-
----
-
-### `./do/stage`
-
-Pre-stage model weights from HuggingFace to S3 for faster builds and deploys.
-
-**What it does:**
-- Downloads model weights from HuggingFace using `huggingface-cli`
-- Uses `hf_transfer` for accelerated parallel downloads
-- Syncs downloaded weights to S3 (regional, fast access)
-- Records the staged S3 URI in `.mlcc/staged-assets.json`
-- Idempotent: skips if model is already staged (use `--force` to re-stage)
-
-**Prerequisites:**
-- AWS credentials configured
-- `huggingface-cli` installed (`pip install huggingface_hub[cli] hf_transfer`)
-- Bootstrap profile configured (`ml-container-creator bootstrap`)
-
-**Usage:**
-```bash
-# Stage model to S3
-./do/stage
-
-# Force re-stage even if already present
-./do/stage --force
-
-# Stage and update MODEL_NAME in do/config
-./do/stage --update-config
-
-# Submit as SageMaker Processing Job (for models >500GB)
-./do/stage --submit
-```
-
-**Output:**
-- Staged model S3 URI
-- Updated `.mlcc/staged-assets.json` tracking file
-
----
-
-<% if (typeof includeBenchmark !== 'undefined' && includeBenchmark) { %>
-### `./do/benchmark`
-
-Run SageMaker AI Benchmark against deployed endpoint.
-
-**What it does:**
-- Verifies endpoint is InService
-- Ensures S3 output bucket exists
-- Creates AI workload configuration
-- Creates and monitors AI benchmark job
-- Displays performance results (throughput, latency P50/P90/P99, TTFT, ITL)
-
-**Prerequisites:**
-- Endpoint deployed and InService (`./do/deploy`)
-- AWS credentials configured
-
-**Usage:**
-```bash
-./do/benchmark
-```
-
-**Clean up benchmark resources:**
-```bash
-./do/benchmark --clean
-```
-
-**Output:**
-- Benchmark results summary table
-- Detailed results in S3
-
----
-
-<% } %>
-<% if (buildTarget === 'codebuild') { %>
-### `./do/submit`
-
-Submit a build job to AWS CodeBuild (CodeBuild deployment only).
-
-**What it does:**
-- Creates CodeBuild project if it doesn't exist
-- Creates IAM service role for CodeBuild if needed
-- Uploads source code to S3
-- Starts CodeBuild job that builds AND pushes image to ECR
-- Monitors build progress
-- Displays ECR image URI on success
-
-**Prerequisites:**
-- AWS credentials configured
-- IAM permissions for CodeBuild, S3, and IAM operations
-
-**Usage:**
-```bash
-./do/submit
-```
-
-**Important:** When using CodeBuild deployment, `./do/submit` replaces both `./do/build` and `./do/push`. The buildspec.yml handles building the Docker image and pushing it to ECR in the AWS environment.
-
-**Workflow Comparison:**
-
-Local/SageMaker deployment:
-```bash
-./do/build # Build locally
-./do/push # Push to ECR
-./do/deploy # Deploy to SageMaker
-```
-
-CodeBuild deployment:
-```bash
-./do/submit # Build + push via CodeBuild
-./do/deploy # Deploy to SageMaker
-```
-
-**Build Time:** Typically 5-15 minutes depending on image size and complexity.
-
----
-
-<% } %>
-## Configuration Reference
-
-All scripts source configuration from `do/config`. Key variables:
-
-| Variable | Description | Current Value |
-|----------|-------------|---------------|
-| `PROJECT_NAME` | Project identifier | `<%= projectName %>` |
-| `DEPLOYMENT_CONFIG` | Framework-server combination | `<%= deploymentConfig %>` |
-| `FRAMEWORK` | ML framework | `<%= framework %>` |
-| `MODEL_SERVER` | Model serving framework | `<%= modelServer %>` |
-| `AWS_REGION` | AWS region for deployment | `<%= awsRegion %>` |
-| `ECR_REPOSITORY_NAME` | ECR repository name | `ml-container-creator` |
-| `INSTANCE_TYPE` | SageMaker instance type | `<%= instanceType %>` |
-| `BUILD_TARGET` | Build target | `<%= buildTarget %>` |
-<% if (framework === 'transformers') { %>| `MODEL_NAME` | HuggingFace model name | `<%= modelName %>` |
-<% } %><% if (modelFormat) { %>| `MODEL_FORMAT` | Model file format | `<%= modelFormat %>` |
-<% } %>
-
-### Environment Variable Overrides
-
-You can override any configuration variable by exporting it before running scripts:
-
-```bash
-# Override AWS region
-export AWS_REGION=us-west-2
-./do/deploy
-
-# Override instance type
-export INSTANCE_TYPE=ml.g5.2xlarge
-./do/deploy
-
-# Override ECR repository name
-export ECR_REPOSITORY_NAME=my-custom-repo
-./do/push
+./do/build # Build image
+./do/push # Push to ECR
+./do/deploy # Deploy endpoint
+./do/test <endpoint> # Test inference
+./do/clean all # Tear down
 ```
 
-##
-
-### Build Issues
-
-**Docker not found:**
-```
-❌ Docker is not installed
-```
-Install Docker from [https://docs.docker.com/get-docker/](https://docs.docker.com/get-docker/)
-
-<% if (modelServer === 'tensorrt-llm') { %>
-**NGC authentication failed:**
-```
-❌ NGC_API_KEY environment variable not set
-```
-Get your NGC API key from [https://ngc.nvidia.com/](https://ngc.nvidia.com/) and export it:
-```bash
-export NGC_API_KEY=your_key_here
-```
-<% } %>
-
-### Push Issues
-
-**AWS credentials not configured:**
-```
-❌ AWS credentials not configured
-```
-Run `aws configure` or set `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables.
-
-**ECR authentication failed:**
-```
-❌ Failed to authenticate with ECR
-```
-Ensure your IAM user/role has `ecr:GetAuthorizationToken` permission.
-
-### Deploy Issues
-
-**Execution role not provided:**
-```
-❌ Execution role ARN not provided
-```
-Export the role ARN:
-```bash
-export ROLE_ARN=arn:aws:iam::ACCOUNT_ID:role/YOUR_ROLE
-```
-
-**ECR image not found:**
-```
-❌ ECR image not found
-```
-<% if (buildTarget === 'codebuild') { %>Run `./do/submit` to build and push the image via CodeBuild.
-<% } else { %>Run `./do/build` and `./do/push` to build and push the image.
-<% } %>
-
-**Endpoint creation failed:**
-```
-❌ Failed to create endpoint
-```
-Check:
-- Instance type is available in your region
-- You have sufficient service quota for the instance type
-- The execution role has correct permissions
-- CloudWatch Logs for detailed error messages
-
-### Test Issues
-
-**Local container not responding:**
-```
-❌ Could not connect to local container
-```
-Ensure the container is running: `./do/run`
-
-**SageMaker endpoint not InService:**
-```
-❌ Endpoint is not InService
-```
-Wait for endpoint to finish deploying. Check status:
-```bash
-aws sagemaker describe-endpoint --endpoint-name <endpoint-name> --region <%= awsRegion %>
-```
-
-<% if (framework === 'transformers') { %>
-### GPU Issues
-
-**NVIDIA runtime not found:**
-```
-❌ NVIDIA Container Toolkit not installed
-```
-Install NVIDIA Container Toolkit: [https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html)
-
-**Out of GPU memory:**
-```
-❌ CUDA out of memory
-```
-Try:
-- Using a larger instance type with more GPU memory
-- Reducing batch size or model size
-- Using model quantization
-<% } %>
-
-## Workflow Examples
-
-### Development Workflow
-
-1. **Build and test locally:**
-```bash
-./do/build
-./do/run &
-./do/test
-```
-
-2. **Deploy to SageMaker:**
-```bash
-<% if (buildTarget === 'codebuild') { %>./do/submit<% } else { %>./do/push<% } %>
-export ROLE_ARN=arn:aws:iam::ACCOUNT_ID:role/YOUR_ROLE
-./do/deploy
-```
-
-3. **Test the endpoint:**
-```bash
-./do/test <endpoint-name>
-```
-
-4. **Clean up when done:**
-```bash
-./do/clean endpoint
-```
-
-### CI/CD Workflow
-
-<% if (buildTarget === 'codebuild') { %>
-```bash
-# In your CI/CD pipeline
-./do/submit # Build and push via CodeBuild
-./do/deploy # Deploy to SageMaker
-./do/test <endpoint-name> # Validate deployment
-```
-<% } else { %>
-```bash
-# In your CI/CD pipeline
-./do/build # Build image
-./do/push # Push to ECR
-./do/deploy # Deploy to SageMaker
-./do/test <endpoint-name> # Validate deployment
-```
-<% } %>
-
-### Iterative Development
-
-```bash
-# Make code changes
-vim code/model_handler.py
-
-# Rebuild and test
-./do/build
-./do/run &
-./do/test
-
-# Deploy updated version
-<% if (buildTarget === 'codebuild') { %>./do/submit<% } else { %>./do/push<% } %>
-./do/deploy
-```
-
-## Relationship to Legacy Scripts
-
-The `deploy/` directory contains legacy wrapper scripts for backward compatibility:
-
-| Legacy Script | do-framework Equivalent | Status |
-|---------------|------------------------|--------|
-| `deploy/build_and_push.sh` | `./do/build && ./do/push` | Deprecated |
-| `deploy/deploy.sh` | `./do/deploy` | Deprecated |
-<% if (buildTarget === 'codebuild') { %>| `deploy/submit_build.sh` | `./do/submit` | Deprecated |
-<% } %>
-
-**Migration:** The legacy scripts display deprecation warnings and forward to do-framework scripts. Update your workflows to use `do/` scripts directly.
-
-See [MIGRATION.md](../MIGRATION.md) for detailed migration instructions.
-
-## Additional Resources
-
-- **Main Project README**: [../README.md](../README.md)
-- **Migration Guide**: [../MIGRATION.md](../MIGRATION.md)
-- **do-framework**: [https://github.com/iankoulski/do-framework](https://github.com/iankoulski/do-framework)
-- **AWS SageMaker BYOC**: [https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html)
-- **Docker Documentation**: [https://docs.docker.com/](https://docs.docker.com/)
-
-## Getting Help
-
-If you encounter issues:
-
-1. Check the troubleshooting section above
-2. Review CloudWatch Logs for SageMaker endpoints
-3. Verify IAM permissions and AWS credentials
-4. Ensure prerequisites are installed and configured
-5. Check the main project README for additional guidance
+## See Also
 
-
+- [`IAM_PERMISSIONS.md`](../IAM_PERMISSIONS.md) — Required IAM permissions per script
+- [Full docs](https://awslabs.github.io/ml-container-creator/) — Architecture, tutorials, API reference
Binary file
Binary file
Binary file