@aws/ml-container-creator 0.2.0
- package/LICENSE +202 -0
- package/LICENSE-THIRD-PARTY +68620 -0
- package/NOTICE +2 -0
- package/README.md +106 -0
- package/bin/cli.js +365 -0
- package/config/defaults.json +32 -0
- package/config/presets/transformers-djl.json +26 -0
- package/config/presets/transformers-gpu.json +24 -0
- package/config/presets/transformers-lmi.json +27 -0
- package/package.json +129 -0
- package/servers/README.md +419 -0
- package/servers/base-image-picker/catalogs/model-servers.json +1191 -0
- package/servers/base-image-picker/catalogs/python-slim.json +38 -0
- package/servers/base-image-picker/catalogs/triton-backends.json +51 -0
- package/servers/base-image-picker/catalogs/triton.json +38 -0
- package/servers/base-image-picker/index.js +495 -0
- package/servers/base-image-picker/manifest.json +17 -0
- package/servers/base-image-picker/package.json +15 -0
- package/servers/hyperpod-cluster-picker/LICENSE +202 -0
- package/servers/hyperpod-cluster-picker/index.js +424 -0
- package/servers/hyperpod-cluster-picker/manifest.json +14 -0
- package/servers/hyperpod-cluster-picker/package.json +17 -0
- package/servers/instance-recommender/LICENSE +202 -0
- package/servers/instance-recommender/catalogs/instances.json +852 -0
- package/servers/instance-recommender/index.js +284 -0
- package/servers/instance-recommender/manifest.json +16 -0
- package/servers/instance-recommender/package.json +15 -0
- package/servers/lib/LICENSE +202 -0
- package/servers/lib/bedrock-client.js +160 -0
- package/servers/lib/custom-validators.js +46 -0
- package/servers/lib/dynamic-resolver.js +36 -0
- package/servers/lib/package.json +11 -0
- package/servers/lib/schemas/image-catalog.schema.json +185 -0
- package/servers/lib/schemas/instances.schema.json +124 -0
- package/servers/lib/schemas/manifest.schema.json +64 -0
- package/servers/lib/schemas/model-catalog.schema.json +91 -0
- package/servers/lib/schemas/regions.schema.json +26 -0
- package/servers/lib/schemas/triton-backends.schema.json +51 -0
- package/servers/model-picker/catalogs/jumpstart-public.json +66 -0
- package/servers/model-picker/catalogs/popular-diffusors.json +88 -0
- package/servers/model-picker/catalogs/popular-transformers.json +226 -0
- package/servers/model-picker/index.js +1693 -0
- package/servers/model-picker/manifest.json +18 -0
- package/servers/model-picker/package.json +20 -0
- package/servers/region-picker/LICENSE +202 -0
- package/servers/region-picker/catalogs/regions.json +263 -0
- package/servers/region-picker/index.js +230 -0
- package/servers/region-picker/manifest.json +16 -0
- package/servers/region-picker/package.json +15 -0
- package/src/app.js +1007 -0
- package/src/copy-tpl.js +77 -0
- package/src/lib/accelerator-validator.js +39 -0
- package/src/lib/asset-manager.js +385 -0
- package/src/lib/aws-profile-parser.js +181 -0
- package/src/lib/bootstrap-command-handler.js +1647 -0
- package/src/lib/bootstrap-config.js +238 -0
- package/src/lib/ci-register-helpers.js +124 -0
- package/src/lib/ci-report-helpers.js +158 -0
- package/src/lib/ci-stage-helpers.js +268 -0
- package/src/lib/cli-handler.js +529 -0
- package/src/lib/comment-generator.js +544 -0
- package/src/lib/community-reports-validator.js +91 -0
- package/src/lib/config-manager.js +2106 -0
- package/src/lib/configuration-exporter.js +204 -0
- package/src/lib/configuration-manager.js +695 -0
- package/src/lib/configuration-matcher.js +221 -0
- package/src/lib/cpu-validator.js +36 -0
- package/src/lib/cuda-validator.js +57 -0
- package/src/lib/deployment-config-resolver.js +103 -0
- package/src/lib/deployment-entry-schema.js +125 -0
- package/src/lib/deployment-registry.js +598 -0
- package/src/lib/docker-introspection-validator.js +51 -0
- package/src/lib/engine-prefix-resolver.js +60 -0
- package/src/lib/huggingface-client.js +172 -0
- package/src/lib/key-value-parser.js +37 -0
- package/src/lib/known-flags-validator.js +200 -0
- package/src/lib/manifest-cli.js +280 -0
- package/src/lib/mcp-client.js +303 -0
- package/src/lib/mcp-command-handler.js +532 -0
- package/src/lib/neuron-validator.js +80 -0
- package/src/lib/parameter-schema-validator.js +284 -0
- package/src/lib/prompt-runner.js +1349 -0
- package/src/lib/prompts.js +1138 -0
- package/src/lib/registry-command-handler.js +519 -0
- package/src/lib/registry-loader.js +198 -0
- package/src/lib/rocm-validator.js +80 -0
- package/src/lib/schema-validator.js +157 -0
- package/src/lib/sensitive-redactor.js +59 -0
- package/src/lib/template-engine.js +156 -0
- package/src/lib/template-manager.js +341 -0
- package/src/lib/validation-engine.js +314 -0
- package/src/prompt-adapter.js +63 -0
- package/templates/Dockerfile +300 -0
- package/templates/IAM_PERMISSIONS.md +84 -0
- package/templates/MIGRATION.md +488 -0
- package/templates/PROJECT_README.md +439 -0
- package/templates/TEMPLATE_SYSTEM.md +243 -0
- package/templates/buildspec.yml +64 -0
- package/templates/code/chat_template.jinja +1 -0
- package/templates/code/flask/gunicorn_config.py +35 -0
- package/templates/code/flask/wsgi.py +10 -0
- package/templates/code/model_handler.py +387 -0
- package/templates/code/serve +300 -0
- package/templates/code/serve.py +175 -0
- package/templates/code/serving.properties +105 -0
- package/templates/code/start_server.py +39 -0
- package/templates/code/start_server.sh +39 -0
- package/templates/diffusors/Dockerfile +72 -0
- package/templates/diffusors/patch_image_api.py +35 -0
- package/templates/diffusors/serve +115 -0
- package/templates/diffusors/start_server.sh +114 -0
- package/templates/do/.gitkeep +1 -0
- package/templates/do/README.md +541 -0
- package/templates/do/build +83 -0
- package/templates/do/ci +681 -0
- package/templates/do/clean +811 -0
- package/templates/do/config +260 -0
- package/templates/do/deploy +1560 -0
- package/templates/do/export +306 -0
- package/templates/do/logs +319 -0
- package/templates/do/manifest +12 -0
- package/templates/do/push +119 -0
- package/templates/do/register +580 -0
- package/templates/do/run +113 -0
- package/templates/do/submit +417 -0
- package/templates/do/test +1147 -0
- package/templates/hyperpod/configmap.yaml +24 -0
- package/templates/hyperpod/deployment.yaml +71 -0
- package/templates/hyperpod/pvc.yaml +42 -0
- package/templates/hyperpod/service.yaml +17 -0
- package/templates/nginx-diffusors.conf +74 -0
- package/templates/nginx-predictors.conf +47 -0
- package/templates/nginx-tensorrt.conf +74 -0
- package/templates/requirements.txt +61 -0
- package/templates/sample_model/test_inference.py +123 -0
- package/templates/sample_model/train_abalone.py +252 -0
- package/templates/test/test_endpoint.sh +79 -0
- package/templates/test/test_local_image.sh +80 -0
- package/templates/test/test_model_handler.py +180 -0
- package/templates/triton/Dockerfile +128 -0
- package/templates/triton/config.pbtxt +163 -0
- package/templates/triton/model.py +130 -0
- package/templates/triton/requirements.txt +11 -0
@@ -0,0 +1,439 @@
+# <%= projectName %>
+
+SageMaker-compatible ML container for deploying <%= framework %> models using <%= modelServer %>.
+
+Generated on <%= buildTimestamp %> using [ML Container Creator](https://github.com/yourusername/ml-container-creator).
+
+## Quick Start
+
+### 1. Build the Container
+
+```bash
+./do/build
+```
+
+Builds a Docker image tagged as `<%= projectName %>:latest`.
+
+### 2. Test Locally
+
+```bash
+# Start the container
+./do/run
+
+# In another terminal, test the endpoints
+./do/test
+```
+
+### 3. Push to ECR
+
+```bash
+./do/push
+```
+
+Pushes the image to Amazon ECR in the `<%= awsRegion %>` region.
+
+### 4. Deploy to SageMaker
+
+```bash
+./do/deploy <your-sagemaker-execution-role-arn>
+```
+
+Creates a SageMaker endpoint named `<%= projectName %>-endpoint`.
+
+### 5. Test the Endpoint
+
+```bash
+./do/test <%= projectName %>-endpoint
+```
+
+## Project Structure
+
+```
+<%= projectName %>/
+├── do/                       # do-framework lifecycle scripts
+│   ├── build                 # Build Docker image
+│   ├── push                  # Push to Amazon ECR
+│   ├── deploy                # Deploy to SageMaker
+│   ├── run                   # Run container locally
+│   ├── test                  # Test container or endpoint
+│   ├── clean                 # Clean up resources
+<% if (buildTarget === 'codebuild') { %>│   ├── submit                # Submit build to CodeBuild
+<% } %>│   ├── config                # Configuration variables
+│   └── README.md             # Detailed do-framework documentation
+├── code/                     # Model serving code
+<% if (framework === 'transformers') { %>│   └── serve                 # <%= modelServer %> entrypoint script
+<% } else { %>│   ├── model_handler.py      # Model loading and inference
+│   └── serve.py              # <%= modelServer %> server
+<% } %>├── deploy/                   # Legacy scripts (deprecated)
+│   ├── build_and_push.sh     # Use ./do/build && ./do/push instead
+│   └── deploy.sh             # Use ./do/deploy instead
+<% if (includeSampleModel) { %>├── sample_model/             # Sample training code
+│   ├── train_abalone.py      # Train sample model
+│   └── test_inference.py     # Test inference
+<% } %>
+<% if (includeTesting) { %>├── test/                     # Test suite
+│   ├── test_endpoint.sh      # Test SageMaker endpoint
+│   └── test_local_image.sh   # Test local container
+<% } %>
+├── Dockerfile                # Container definition
+├── requirements.txt          # Python dependencies
+└── README.md                 # This file
+```
+
+## Configuration
+
+All deployment configuration is centralized in `do/config`:
+
+```bash
+# Project identification
+PROJECT_NAME="<%= projectName %>"
+DEPLOYMENT_CONFIG="<%= deploymentConfig %>"
+
+# AWS configuration
+AWS_REGION="<%= awsRegion %>"
+INSTANCE_TYPE="<%= instanceType %>"
+
+# Framework configuration
+FRAMEWORK="<%= framework %>"
+MODEL_SERVER="<%= modelServer %>"
+<% if (framework === 'transformers') { %>
+# Model configuration
+MODEL_NAME="<%= modelName %>"
+<% } %>
+```
+
+You can override these values by setting environment variables before running do scripts.
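
The override rule can be pictured as a lookup in which the environment wins over the generated default. The sketch below is illustrative only: it assumes each variable is resolved independently and that an empty value counts as unset (as with the shell's `${VAR:-default}`). The `DEFAULTS` values and the `resolve_config` helper are hypothetical and are not part of the generated project.

```python
import os

# Hypothetical defaults, standing in for the values rendered into do/config.
DEFAULTS = {
    "PROJECT_NAME": "my-ml-model",
    "AWS_REGION": "us-east-1",
    "INSTANCE_TYPE": "ml.m5.large",
}


def resolve_config(defaults, environ=None):
    """Return defaults with any same-named, non-empty environment variable applied."""
    environ = os.environ if environ is None else environ
    # `or` falls back to the default when the variable is missing or empty.
    return {key: environ.get(key) or value for key, value in defaults.items()}


if __name__ == "__main__":
    # Example: AWS_REGION=us-west-2 python resolve_config.py
    for key, value in resolve_config(DEFAULTS).items():
        print(f"{key}={value}")
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real process environment.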
+
+## Deployment Workflows
+
+### Local Development Workflow
+
+```bash
+# Build and test locally
+./do/build
+./do/run &
+./do/test
+
+# When satisfied, push to ECR
+./do/push
+```
+
+<% if (buildTarget === 'codebuild') { %>### CodeBuild Workflow
+
+```bash
+# Submit build to CodeBuild (builds and pushes to ECR)
+./do/submit
+
+# Deploy to SageMaker
+./do/deploy <role-arn>
+
+# Test the endpoint
+./do/test <%= projectName %>-endpoint
+```
+
+<% } else { %>### SageMaker Deployment Workflow
+
+```bash
+# Build, push, and deploy
+./do/build
+./do/push
+./do/deploy <role-arn>
+
+# Test the endpoint
+./do/test <%= projectName %>-endpoint
+```
+
+<% } %>### Cleanup
+
+```bash
+# Remove local images
+./do/clean local
+
+# Remove ECR images
+./do/clean ecr
+
+# Delete SageMaker endpoint
+./do/clean endpoint
+
+# Clean everything
+./do/clean all
+```
+
+## do-framework Commands
+
+This project uses the [do-framework](https://github.com/iankoulski/do-framework) for standardized container lifecycle management.
+
+### Available Commands
+
+| Command | Description |
+|---------|-------------|
+| `./do/build` | Build Docker image locally |
+| `./do/push` | Push image to Amazon ECR |
+| `./do/deploy <role-arn>` | Deploy to SageMaker endpoint |
+| `./do/run` | Run container locally on port 8080 |
+| `./do/test [endpoint]` | Test local container or SageMaker endpoint |
+| `./do/clean <target>` | Clean up resources (local/ecr/endpoint/all) |
+<% if (buildTarget === 'codebuild') { %>| `./do/submit` | Submit build to AWS CodeBuild |
+<% } %>
+For detailed documentation on each command, see `do/README.md`.
+
+## Framework-Specific Information
+
+<% if (framework === 'sklearn') { %>### scikit-learn
+
+This container serves scikit-learn models using <%= modelServer %>.
+
+**Model Format**: <%= modelFormat %>
+
+**Loading**: Models are loaded from `/opt/ml/model/model.<%= modelFormat %>`
+
+**Inference**: Send JSON requests to `/invocations` endpoint
+
+<% } else if (framework === 'xgboost') { %>### XGBoost
+
+This container serves XGBoost models using <%= modelServer %>.
+
+**Model Format**: <%= modelFormat %>
+
+**Loading**: Models are loaded from `/opt/ml/model/model.<%= modelFormat %>`
+
+**Inference**: Send JSON requests to `/invocations` endpoint
+
+<% } else if (framework === 'tensorflow') { %>### TensorFlow
+
+This container serves TensorFlow/Keras models using <%= modelServer %>.
+
+**Model Format**: <%= modelFormat %>
+
+**Loading**: Models are loaded from `/opt/ml/model/`
+
+**Inference**: Send JSON requests to `/invocations` endpoint
+
+<% } else if (framework === 'transformers') { %>### Transformers (<%= modelServer %>)
+
+This container serves transformer models using <%= modelServer %>.
+
+**Model**: <%= modelName %>
+
+<% if (modelServer === 'vllm') { %>**Server**: vLLM - High-throughput LLM serving with PagedAttention
+
+**Features**:
+- Continuous batching
+- Optimized CUDA kernels
+- OpenAI-compatible API
+
+<% } else if (modelServer === 'sglang') { %>**Server**: SGLang - Fast serving with RadixAttention
+
+**Features**:
+- Structured generation
+- Radix attention for prefix caching
+- OpenAI-compatible API
+
+<% } else if (modelServer === 'tensorrt-llm') { %>**Server**: TensorRT-LLM - NVIDIA optimized LLM serving
+
+**Features**:
+- TensorRT optimizations
+- Multi-GPU support
+- OpenAI-compatible API via nginx proxy
+
+**Note**: Requires NGC API key for building. Set `NGC_API_KEY` environment variable.
+
+<% } else if (modelServer === 'lmi') { %>**Server**: LMI (Large Model Inference) - AWS optimized serving
+
+**Features**:
+- AWS-optimized inference
+- Multiple backend support
+- DJL Serving integration
+
+<% } else if (modelServer === 'djl') { %>**Server**: DJL (Deep Java Library) - Multi-framework serving
+
+**Features**:
+- Multi-framework support
+- Production-ready serving
+- AWS integration
+
+<% } %>
+**Inference**: Send requests to `/invocations` endpoint with:
+```json
+{
+  "inputs": "Your prompt here",
+  "parameters": {
+    "max_new_tokens": 100,
+    "temperature": 0.7
+  }
+}
+```
+
+<% } %>
+
+## SageMaker Endpoints
+
+### Health Check
+
+SageMaker calls the `/ping` endpoint to verify container health:
+
+```bash
+curl http://localhost:8080/ping
+```
+
+Expected response: `200 OK`
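
A freshly started container may need a moment before `/ping` answers, so a script that runs tests right after starting it can poll first. The following is a minimal sketch using only the Python standard library. The helper names `ping` and `wait_until_healthy`, the retry count, and the delay are all assumptions made for illustration; the generated `do/test` script may already handle this differently.

```python
import time
import urllib.error
import urllib.request


def ping(url, timeout=2.0):
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status raised as HTTPError.
        return False


def wait_until_healthy(url, attempts=30, delay=1.0, probe=ping, sleep=time.sleep):
    """Poll `probe(url)` up to `attempts` times; return True on the first success."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no need to sleep after the final attempt
    return False


if __name__ == "__main__":
    ok = wait_until_healthy("http://localhost:8080/ping")
    print("healthy" if ok else "not healthy")
```

The `probe` and `sleep` parameters exist only so the loop can be tested without a running container.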
+
+### Inference
+
+Send prediction requests to the `/invocations` endpoint:
+
+<% if (framework === 'transformers') { %>```bash
+curl -X POST http://localhost:8080/invocations \
+  -H "Content-Type: application/json" \
+  -d '{
+    "inputs": "What is machine learning?",
+    "parameters": {
+      "max_new_tokens": 100,
+      "temperature": 0.7
+    }
+  }'
+```
+
+<% } else { %>```bash
+curl -X POST http://localhost:8080/invocations \
+  -H "Content-Type: application/json" \
+  -d '{
+    "instances": [[1.0, 2.0, 3.0, 4.0]]
+  }'
+```
+
+<% } %>
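
The two request shapes above can also be built programmatically. This is a hedged sketch rather than a client shipped with the project: `build_payload` and `invoke` are hypothetical helpers, and `invoke` assumes the container replies with a JSON body, which the template does not state.

```python
import json
import urllib.request


def build_payload(framework, prompt=None, rows=None, max_new_tokens=100, temperature=0.7):
    """Build the JSON body for /invocations, mirroring the curl examples."""
    if framework == "transformers":
        return {
            "inputs": prompt,
            "parameters": {"max_new_tokens": max_new_tokens, "temperature": temperature},
        }
    # sklearn, xgboost and tensorflow use a list of feature rows.
    return {"instances": rows}


def invoke(base_url, payload, timeout=30.0):
    """POST `payload` to <base_url>/invocations and decode the reply as JSON (assumed)."""
    request = urllib.request.Request(
        f"{base_url}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    body = build_payload("sklearn", rows=[[1.0, 2.0, 3.0, 4.0]])
    print(json.dumps(body))
    # With a container listening locally: invoke("http://localhost:8080", body)
```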
+## AWS Requirements
+
+### IAM Permissions
+
+The SageMaker execution role needs these permissions:
+
+- `ecr:GetAuthorizationToken`
+- `ecr:BatchCheckLayerAvailability`
+- `ecr:GetDownloadUrlForLayer`
+- `ecr:BatchGetImage`
+- `s3:GetObject` (if using S3 for model artifacts)
+- `logs:CreateLogGroup`
+- `logs:CreateLogStream`
+- `logs:PutLogEvents`
+
+See `IAM_PERMISSIONS.md` for detailed permission requirements.
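
For orientation, the actions listed above could be assembled into a policy document along these lines. The sketch only shows the JSON shape: the `build_policy` helper is hypothetical, and the wildcard `Resource` is an assumption made for brevity that should be narrowed where the action supports it. `IAM_PERMISSIONS.md` remains the authoritative reference.

```python
import json

# Actions copied from the list above.
ACTIONS = [
    "ecr:GetAuthorizationToken",
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage",
    "s3:GetObject",  # only needed when model artifacts live in S3
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
]


def build_policy(actions, resource="*"):
    """Return an IAM policy document allowing `actions` on `resource`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": resource}
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(ACTIONS), indent=2))
```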
+
+### AWS CLI Configuration
+
+Ensure AWS CLI is configured with appropriate credentials:
+
+```bash
+aws configure
+```
+
+Or use environment variables:
+
+```bash
+export AWS_ACCESS_KEY_ID=your-access-key
+export AWS_SECRET_ACCESS_KEY=your-secret-key
+export AWS_DEFAULT_REGION=<%= awsRegion %>
+```
+
+## Troubleshooting
+
+### Build Issues
+
+<% if (modelServer === 'tensorrt-llm') { %>**NGC Authentication Failed**
+
+Set your NGC API key:
+```bash
+export NGC_API_KEY=your-ngc-api-key
+./do/build
+```
+
+<% } %>**Docker Not Found**
+
+Install Docker: https://docs.docker.com/get-docker/
+
+**Permission Denied**
+
+Add your user to the docker group, then log out and back in so the new group membership takes effect:
+```bash
+sudo usermod -aG docker "$USER"
+```
+
+### Deployment Issues
+
+**ECR Push Failed**
+
+Check AWS credentials and IAM permissions:
+```bash
+aws sts get-caller-identity
+```
+
+**Endpoint Creation Failed**
+
+- Verify the execution role ARN is correct
+- Check IAM permissions
+- Ensure the instance type is available in your region
+
+**Endpoint Stuck in Creating**
+
+Check CloudWatch logs:
+```bash
+aws logs tail /aws/sagemaker/Endpoints/<%= projectName %>-endpoint --follow
+```
+
+### Runtime Issues
+
+**Container Exits Immediately**
+
+Check the logs of the most recent matching container:
+```bash
+# head -n 1 keeps a single ID; docker logs accepts exactly one container
+docker logs "$(docker ps -a | grep <%= projectName %> | awk '{print $1}' | head -n 1)"
+```
+
+**Out of Memory**
+
+Increase instance size or optimize model:
+```bash
+# Edit do/config
+INSTANCE_TYPE="ml.m5.2xlarge"  # Larger instance
+```
+
+## Migration from Legacy Scripts
+
+If you're familiar with the old `deploy/` scripts, see `MIGRATION.md` for a command mapping guide.
+
+**Quick Reference**:
+
+| Legacy Command | do-framework Command |
+|----------------|---------------------|
+| `./deploy/build_and_push.sh` | `./do/build && ./do/push` |
+| `./deploy/deploy.sh <role>` | `./do/deploy <role>` |
+<% if (buildTarget === 'codebuild') { %>| `./deploy/submit_build.sh` | `./do/submit` |
+<% } %>
+The legacy scripts are still available but deprecated. They will display warnings and forward to do-framework commands.
+
+## Additional Resources
+
+- [do-framework Documentation](https://github.com/iankoulski/do-framework)
+- [AWS SageMaker Documentation](https://docs.aws.amazon.com/sagemaker/)
+- [SageMaker BYOC Guide](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html)
+<% if (framework === 'transformers') { %>
+<% if (modelServer === 'vllm') { %>- [vLLM Documentation](https://docs.vllm.ai/)
+<% } else if (modelServer === 'sglang') { %>- [SGLang Documentation](https://sgl-project.github.io/)
+<% } else if (modelServer === 'tensorrt-llm') { %>- [TensorRT-LLM Documentation](https://github.com/NVIDIA/TensorRT-LLM)
+<% } else if (modelServer === 'lmi') { %>- [LMI Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/large-model-inference.html)
+<% } else if (modelServer === 'djl') { %>- [DJL Documentation](https://docs.djl.ai/)
+<% } %>
+<% } %>
+## Support
+
+For issues or questions:
+
+1. Check `do/README.md` for detailed command documentation
+2. Review CloudWatch logs for deployment issues
+3. See `MIGRATION.md` if migrating from legacy scripts
+4. Open an issue on the [ML Container Creator repository](https://github.com/yourusername/ml-container-creator)
+
+## License
+
+This generated project is provided as starter code. Modify as needed for your use case.
@@ -0,0 +1,243 @@
+# Template System Documentation
+
+This directory contains EJS templates that are processed and copied to generate complete ML container projects.
+
+## How Templates Work
+
+### Template Processing
+All files in this directory are processed using [EJS (Embedded JavaScript)](https://ejs.co/) templating:
+
+```ejs
+<%%= variable %>    <%# Outputs escaped value %>
+<%%- variable %>    <%# Outputs unescaped value %>
+<%% if (condition) { %>
+  Conditional content
+<%% } %>
+```
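
The examples in this file are written with `<%%` because the file is itself rendered by EJS, and `<%%` is the EJS escape that emits a literal `<%`. The toy renderer below is a Python stand-in, not EJS and not the generator's code. It models only three behaviours (`<%= name %>` with HTML escaping, `<%- name %>` raw, and the `<%%` escape) so the difference can be seen in isolation. The `render` function and its regular expression are assumptions made for illustration, and the exact escape sequences EJS emits may differ from those of `html.escape`.

```python
import html
import re

# Matches <%= name %> and <%- name %> where name is a plain identifier.
_TAG = re.compile(r"<%([=-])\s*(\w+)\s*%>")


def render(template, context):
    """Toy stand-in for EJS: handles only <%= name %>, <%- name %> and <%%."""
    placeholder = "\x00"
    # Protect literal delimiters so they are not treated as tags.
    text = template.replace("<%%", placeholder)

    def substitute(match):
        kind, name = match.group(1), match.group(2)
        value = str(context[name])
        # "=" escapes HTML-special characters, "-" emits the value verbatim.
        return html.escape(value) if kind == "=" else value

    text = _TAG.sub(substitute, text)
    return text.replace(placeholder, "<%")


if __name__ == "__main__":
    print(render('echo "<%= msg %>"', {"msg": "build && push"}))
    print(render('echo "<%- msg %>"', {"msg": "build && push"}))
    print(render("<%%= projectName %>", {"projectName": "demo"}))
```

In this model the escaped form rewrites `&` as an HTML entity, which is worth keeping in mind when a value is rendered into a shell script or Python file rather than into HTML.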
+
+### Available Variables
+
+All user answers from the prompting phase are available in templates:
+
+| Variable | Type | Description | Example Values |
+|----------|------|-------------|----------------|
+| `projectName` | string | Project name | `my-ml-model` |
+| `destinationDir` | string | Output directory | `./my-ml-model-2024-12-02` |
+| `framework` | string | ML framework | `sklearn`, `xgboost`, `tensorflow`, `transformers` |
+| `modelFormat` | string | Model serialization format | `pkl`, `joblib`, `json`, `keras`, `h5` |
+| `modelServer` | string | Model serving framework | `flask`, `fastapi`, `vllm`, `sglang`, `lmi`, `djl` |
+| `includeSampleModel` | boolean | Include sample model | `true`, `false` |
+| `includeTesting` | boolean | Include test suite | `true`, `false` |
+| `testTypes` | string[] | Selected test types | `['local-model-cli', 'hosted-model-endpoint']` |
+| `buildTarget` | string | Build target | `codebuild` |
+| `instanceType` | string | Instance configuration | `cpu-optimized`, `gpu-enabled`, `custom` |
+| `customInstanceType` | string | Custom AWS instance type | `ml.m5.large`, `ml.g4dn.xlarge` |
+| `awsRegion` | string | AWS region | `us-east-1` |
+| `buildTimestamp` | string | Generation timestamp | `2024-12-02T15-30-45` |
+
+## Directory Structure
+
+```
+templates/
+├── code/                       # Model serving code
+│   ├── flask/                  # Flask-specific implementation
+│   ├── model_handler.py        # Model loading and inference (traditional ML)
+│   ├── serve.py                # Flask/FastAPI server (traditional ML)
+│   ├── serve                   # vLLM/SGLang/TensorRT-LLM entrypoint (transformers)
+│   ├── serving.properties      # LMI/DJL configuration (transformers with lmi/djl)
+│   └── start_server.py         # Server startup script (traditional ML)
+├── deploy/                     # Deployment scripts
+│   ├── build_and_push.sh       # Build Docker image and push to ECR
+│   ├── deploy.sh               # Deploy to SageMaker endpoint
+│   └── upload_to_s3.sh         # Upload model to S3 (transformers only)
+├── sample_model/               # Optional sample training code
+│   ├── train_abalone.py        # Sample model training
+│   └── test_inference.py       # Sample inference testing
+├── test/                       # Optional test suite
+│   ├── test_endpoint.sh        # Test hosted SageMaker endpoint
+│   ├── test_local_image.sh     # Test local Docker container
+│   └── test_model_handler.py   # Unit tests for model handler
+├── Dockerfile                  # Container definition
+├── nginx-predictors.conf       # Nginx configuration (traditional ML only)
+└── requirements.txt            # Python dependencies
+```
+
+## Conditional File Inclusion
+
+Files are conditionally included based on user configuration:
+
+### Transformers Configuration
+When `framework === 'transformers'`:
+- **Excluded**: Traditional ML serving files
+  - `code/model_handler.py`
+  - `code/serve.py`
+  - `code/start_server.py`
+  - `nginx-predictors.conf`
+  - `requirements.txt` (uses transformer-specific version)
+  - `test/test_local_image.sh`
+  - `test/test_model_handler.py`
+- **Included**: Transformer-specific files
+  - `code/serve` (vLLM/SGLang/TensorRT-LLM entrypoint)
+  - `deploy/upload_to_s3.sh`
+
+### TensorRT-LLM Specific Files
+When `modelServer === 'tensorrt-llm'`:
+- **Additional Included Files**:
+  - `nginx-tensorrt.conf` - Nginx reverse proxy for OpenAI API compatibility
+  - `code/start_server.sh` - Startup script that launches TensorRT-LLM and nginx
+- **Architecture**: TensorRT-LLM runs on port 8081, nginx proxies SageMaker endpoints on port 8080
+
+### LMI/DJL Specific Files
+When `modelServer === 'lmi'` or `modelServer === 'djl'`:
+- **Additional Included Files**:
+  - `code/serving.properties` - Configuration file for LMI/DJL serving
+- **Architecture**: Uses AWS pre-built containers with DJL Serving
+- **Configuration**: Model and serving parameters defined in serving.properties instead of environment variables
+
+### Traditional ML Configuration
+When `framework !== 'transformers'`:
+- **Excluded**: Transformer-specific files
+  - `code/serve`
+  - `deploy/upload_to_s3.sh`
+- **Included**: Traditional ML serving files
+  - All Flask/FastAPI serving code
+  - Nginx configuration
+  - Model handler
+
+### Optional Modules
+- **Sample Model**: Excluded if `includeSampleModel === false`
+- **Test Suite**: Excluded if `includeTesting === false`
+- **Flask Code**: Excluded if `modelServer !== 'flask'`
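
The inclusion rules above can be read as a function from the user's answers to a list of glob patterns to skip. The sketch below is a Python restatement of those rules, not the generator's actual code. `build_ignore_patterns` is a hypothetical name and the pattern strings are illustrative guesses at a glob style. It assumes that server-specific files are skipped whenever their server is not selected, and it leaves out `requirements.txt` because the text says a transformer-specific version replaces it without giving details.

```python
def build_ignore_patterns(answers):
    """Derive glob patterns to skip, following the inclusion rules listed above."""
    patterns = []
    framework = answers.get("framework")
    server = answers.get("modelServer")

    if framework == "transformers":
        # Traditional ML serving files are not needed.
        patterns += [
            "**/code/model_handler.py",
            "**/code/serve.py",
            "**/code/start_server.py",
            "**/nginx-predictors.conf",
            "**/test/test_local_image.sh",
            "**/test/test_model_handler.py",
        ]
    else:
        # Transformer-specific files are not needed.
        patterns += ["**/code/serve", "**/deploy/upload_to_s3.sh"]

    # Assumption: server-specific extras are skipped unless that server is chosen.
    if server != "tensorrt-llm":
        patterns += ["**/nginx-tensorrt.conf", "**/code/start_server.sh"]
    if server not in ("lmi", "djl"):
        patterns.append("**/code/serving.properties")
    if server != "flask":
        patterns.append("**/code/flask/**")

    # Optional modules.
    if not answers.get("includeSampleModel"):
        patterns.append("**/sample_model/**")
    if not answers.get("includeTesting"):
        patterns.append("**/test/**")
    return patterns


if __name__ == "__main__":
    example = {"framework": "sklearn", "modelServer": "flask",
               "includeSampleModel": True, "includeTesting": False}
    for pattern in build_ignore_patterns(example):
        print(pattern)
```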
+
+## Template Examples
+
+### Using Variables in Shell Scripts
+```bash
+#!/bin/bash
+# deploy/build_and_push.sh
+
+PROJECT_NAME="<%%= projectName %>"
+REGION="<%%= awsRegion %>"
+
+echo "Building ${PROJECT_NAME} for region ${REGION}"
+```
+
+### Conditional Content in Python
+```python
+# code/model_handler.py
+
+<%% if (framework === 'sklearn') { %>
+import joblib
+model = joblib.load(model_path)
+<%% } else if (framework === 'xgboost') { %>
+import xgboost as xgb
+model = xgb.Booster()
+model.load_model(model_path)
+<%% } %>
+```
+
+### Using Arrays
+```bash
+# test/test_endpoint.sh
+
+<%% if (testTypes.includes('hosted-model-endpoint')) { %>
+echo "Testing hosted endpoint..."
+aws sagemaker-runtime invoke-endpoint \
+  --endpoint-name <%%= projectName %>-endpoint \
+  --body file://test_data.json \
+  output.json
+<%% } %>
+```
+
+## Adding New Templates
+
+### 1. Create Template File
+Add your template file in the appropriate directory:
+```bash
+templates/code/my_new_file.py
+```
+
+### 2. Use EJS Syntax
+```python
+# templates/code/my_new_file.py
+"""
+Generated for <%%= projectName %>
+Framework: <%%= framework %>
+"""
+
+<%% if (framework === 'sklearn') { %>
+# sklearn-specific code
+<%% } %>
+```
+
+### 3. Add Conditional Exclusion (if needed)
+In `generators/app/index.js`, add to `ignorePatterns`:
+```javascript
+if (someCondition) {
+    ignorePatterns.push('**/code/my_new_file.py');
+}
+```
+
+### 4. Update This Documentation
+Document the new template and when it's included/excluded.
+
+## Best Practices
+
+### Template Design
+- **Keep templates simple** - Complex logic belongs in the generator
+- **Use descriptive variable names** - Make templates self-documenting
+- **Add comments** - Explain why conditional logic exists
+- **Test all paths** - Verify templates work for all configurations
+
+### Variable Usage
+- **Escape output by default** - Use `<%%= %>` unless the value must be emitted verbatim, in which case use `<%%- %>`
+- **Validate in generator** - Don't assume variables exist in templates
+- **Provide defaults** - Use `<%%= variable || 'default' %>`
+
+### File Organization
+- **Group related files** - Keep similar templates together
+- **Use subdirectories** - Organize by feature or component
+- **Name clearly** - File names should indicate purpose
+
+## Testing Templates
+
+### Manual Testing
+```bash
+# Link generator locally
+npm link
+
+# Run generator
+ml-container-creator
+
+# Test different configurations
+# - sklearn + flask
+# - xgboost + fastapi
+# - transformers + vllm
+# - With/without sample model
+# - With/without tests
+```
+
+### Automated Testing
+See `test/` directory for generator tests that verify template generation.
+
+## Troubleshooting
+
+### Template Not Copied
+- Check if file matches an ignore pattern
+- Verify file is in templates directory
+- Check for EJS syntax errors
+
+### Variables Not Replaced
+- Ensure variable exists in `this.answers`
+- Check EJS syntax: `<%%= variable %>` not `{{ variable }}`
+- Verify template is processed with `copyTpl` not `copy`
+
+### Conditional Logic Not Working
+- Test condition in generator first
+- Use `console.log()` to debug values
+- Check for typos in variable names
+
+## Related Documentation
+
+- [EJS Documentation](https://ejs.co/)
+- [Project Steering Files](../../../.kiro/steering/)