@aws/ml-container-creator 0.2.0

Files changed (143)
  1. package/LICENSE +202 -0
  2. package/LICENSE-THIRD-PARTY +68620 -0
  3. package/NOTICE +2 -0
  4. package/README.md +106 -0
  5. package/bin/cli.js +365 -0
  6. package/config/defaults.json +32 -0
  7. package/config/presets/transformers-djl.json +26 -0
  8. package/config/presets/transformers-gpu.json +24 -0
  9. package/config/presets/transformers-lmi.json +27 -0
  10. package/package.json +129 -0
  11. package/servers/README.md +419 -0
  12. package/servers/base-image-picker/catalogs/model-servers.json +1191 -0
  13. package/servers/base-image-picker/catalogs/python-slim.json +38 -0
  14. package/servers/base-image-picker/catalogs/triton-backends.json +51 -0
  15. package/servers/base-image-picker/catalogs/triton.json +38 -0
  16. package/servers/base-image-picker/index.js +495 -0
  17. package/servers/base-image-picker/manifest.json +17 -0
  18. package/servers/base-image-picker/package.json +15 -0
  19. package/servers/hyperpod-cluster-picker/LICENSE +202 -0
  20. package/servers/hyperpod-cluster-picker/index.js +424 -0
  21. package/servers/hyperpod-cluster-picker/manifest.json +14 -0
  22. package/servers/hyperpod-cluster-picker/package.json +17 -0
  23. package/servers/instance-recommender/LICENSE +202 -0
  24. package/servers/instance-recommender/catalogs/instances.json +852 -0
  25. package/servers/instance-recommender/index.js +284 -0
  26. package/servers/instance-recommender/manifest.json +16 -0
  27. package/servers/instance-recommender/package.json +15 -0
  28. package/servers/lib/LICENSE +202 -0
  29. package/servers/lib/bedrock-client.js +160 -0
  30. package/servers/lib/custom-validators.js +46 -0
  31. package/servers/lib/dynamic-resolver.js +36 -0
  32. package/servers/lib/package.json +11 -0
  33. package/servers/lib/schemas/image-catalog.schema.json +185 -0
  34. package/servers/lib/schemas/instances.schema.json +124 -0
  35. package/servers/lib/schemas/manifest.schema.json +64 -0
  36. package/servers/lib/schemas/model-catalog.schema.json +91 -0
  37. package/servers/lib/schemas/regions.schema.json +26 -0
  38. package/servers/lib/schemas/triton-backends.schema.json +51 -0
  39. package/servers/model-picker/catalogs/jumpstart-public.json +66 -0
  40. package/servers/model-picker/catalogs/popular-diffusors.json +88 -0
  41. package/servers/model-picker/catalogs/popular-transformers.json +226 -0
  42. package/servers/model-picker/index.js +1693 -0
  43. package/servers/model-picker/manifest.json +18 -0
  44. package/servers/model-picker/package.json +20 -0
  45. package/servers/region-picker/LICENSE +202 -0
  46. package/servers/region-picker/catalogs/regions.json +263 -0
  47. package/servers/region-picker/index.js +230 -0
  48. package/servers/region-picker/manifest.json +16 -0
  49. package/servers/region-picker/package.json +15 -0
  50. package/src/app.js +1007 -0
  51. package/src/copy-tpl.js +77 -0
  52. package/src/lib/accelerator-validator.js +39 -0
  53. package/src/lib/asset-manager.js +385 -0
  54. package/src/lib/aws-profile-parser.js +181 -0
  55. package/src/lib/bootstrap-command-handler.js +1647 -0
  56. package/src/lib/bootstrap-config.js +238 -0
  57. package/src/lib/ci-register-helpers.js +124 -0
  58. package/src/lib/ci-report-helpers.js +158 -0
  59. package/src/lib/ci-stage-helpers.js +268 -0
  60. package/src/lib/cli-handler.js +529 -0
  61. package/src/lib/comment-generator.js +544 -0
  62. package/src/lib/community-reports-validator.js +91 -0
  63. package/src/lib/config-manager.js +2106 -0
  64. package/src/lib/configuration-exporter.js +204 -0
  65. package/src/lib/configuration-manager.js +695 -0
  66. package/src/lib/configuration-matcher.js +221 -0
  67. package/src/lib/cpu-validator.js +36 -0
  68. package/src/lib/cuda-validator.js +57 -0
  69. package/src/lib/deployment-config-resolver.js +103 -0
  70. package/src/lib/deployment-entry-schema.js +125 -0
  71. package/src/lib/deployment-registry.js +598 -0
  72. package/src/lib/docker-introspection-validator.js +51 -0
  73. package/src/lib/engine-prefix-resolver.js +60 -0
  74. package/src/lib/huggingface-client.js +172 -0
  75. package/src/lib/key-value-parser.js +37 -0
  76. package/src/lib/known-flags-validator.js +200 -0
  77. package/src/lib/manifest-cli.js +280 -0
  78. package/src/lib/mcp-client.js +303 -0
  79. package/src/lib/mcp-command-handler.js +532 -0
  80. package/src/lib/neuron-validator.js +80 -0
  81. package/src/lib/parameter-schema-validator.js +284 -0
  82. package/src/lib/prompt-runner.js +1349 -0
  83. package/src/lib/prompts.js +1138 -0
  84. package/src/lib/registry-command-handler.js +519 -0
  85. package/src/lib/registry-loader.js +198 -0
  86. package/src/lib/rocm-validator.js +80 -0
  87. package/src/lib/schema-validator.js +157 -0
  88. package/src/lib/sensitive-redactor.js +59 -0
  89. package/src/lib/template-engine.js +156 -0
  90. package/src/lib/template-manager.js +341 -0
  91. package/src/lib/validation-engine.js +314 -0
  92. package/src/prompt-adapter.js +63 -0
  93. package/templates/Dockerfile +300 -0
  94. package/templates/IAM_PERMISSIONS.md +84 -0
  95. package/templates/MIGRATION.md +488 -0
  96. package/templates/PROJECT_README.md +439 -0
  97. package/templates/TEMPLATE_SYSTEM.md +243 -0
  98. package/templates/buildspec.yml +64 -0
  99. package/templates/code/chat_template.jinja +1 -0
  100. package/templates/code/flask/gunicorn_config.py +35 -0
  101. package/templates/code/flask/wsgi.py +10 -0
  102. package/templates/code/model_handler.py +387 -0
  103. package/templates/code/serve +300 -0
  104. package/templates/code/serve.py +175 -0
  105. package/templates/code/serving.properties +105 -0
  106. package/templates/code/start_server.py +39 -0
  107. package/templates/code/start_server.sh +39 -0
  108. package/templates/diffusors/Dockerfile +72 -0
  109. package/templates/diffusors/patch_image_api.py +35 -0
  110. package/templates/diffusors/serve +115 -0
  111. package/templates/diffusors/start_server.sh +114 -0
  112. package/templates/do/.gitkeep +1 -0
  113. package/templates/do/README.md +541 -0
  114. package/templates/do/build +83 -0
  115. package/templates/do/ci +681 -0
  116. package/templates/do/clean +811 -0
  117. package/templates/do/config +260 -0
  118. package/templates/do/deploy +1560 -0
  119. package/templates/do/export +306 -0
  120. package/templates/do/logs +319 -0
  121. package/templates/do/manifest +12 -0
  122. package/templates/do/push +119 -0
  123. package/templates/do/register +580 -0
  124. package/templates/do/run +113 -0
  125. package/templates/do/submit +417 -0
  126. package/templates/do/test +1147 -0
  127. package/templates/hyperpod/configmap.yaml +24 -0
  128. package/templates/hyperpod/deployment.yaml +71 -0
  129. package/templates/hyperpod/pvc.yaml +42 -0
  130. package/templates/hyperpod/service.yaml +17 -0
  131. package/templates/nginx-diffusors.conf +74 -0
  132. package/templates/nginx-predictors.conf +47 -0
  133. package/templates/nginx-tensorrt.conf +74 -0
  134. package/templates/requirements.txt +61 -0
  135. package/templates/sample_model/test_inference.py +123 -0
  136. package/templates/sample_model/train_abalone.py +252 -0
  137. package/templates/test/test_endpoint.sh +79 -0
  138. package/templates/test/test_local_image.sh +80 -0
  139. package/templates/test/test_model_handler.py +180 -0
  140. package/templates/triton/Dockerfile +128 -0
  141. package/templates/triton/config.pbtxt +163 -0
  142. package/templates/triton/model.py +130 -0
  143. package/templates/triton/requirements.txt +11 -0
@@ -0,0 +1,439 @@
1
+ # <%= projectName %>
2
+
3
+ SageMaker-compatible ML container for deploying <%= framework %> models using <%= modelServer %>.
4
+
5
+ Generated on <%= buildTimestamp %> using [ML Container Creator](https://github.com/yourusername/ml-container-creator).
6
+
7
+ ## Quick Start
8
+
9
+ ### 1. Build the Container
10
+
11
+ ```bash
12
+ ./do/build
13
+ ```
14
+
15
+ Builds a Docker image tagged as `<%= projectName %>:latest`.
16
+
17
+ ### 2. Test Locally
18
+
19
+ ```bash
20
+ # Start the container
21
+ ./do/run
22
+
23
+ # In another terminal, test the endpoints
24
+ ./do/test
25
+ ```
26
+
27
+ ### 3. Push to ECR
28
+
29
+ ```bash
30
+ ./do/push
31
+ ```
32
+
33
+ Pushes the image to Amazon ECR in the `<%= awsRegion %>` region.
34
+
35
+ ### 4. Deploy to SageMaker
36
+
37
+ ```bash
38
+ ./do/deploy <your-sagemaker-execution-role-arn>
39
+ ```
40
+
41
+ Creates a SageMaker endpoint named `<%= projectName %>-endpoint`.
42
+
43
+ ### 5. Test the Endpoint
44
+
45
+ ```bash
46
+ ./do/test <%= projectName %>-endpoint
47
+ ```
48
+
49
+ ## Project Structure
50
+
51
+ ```
52
+ <%= projectName %>/
53
+ ├── do/ # do-framework lifecycle scripts
54
+ │ ├── build # Build Docker image
55
+ │ ├── push # Push to Amazon ECR
56
+ │ ├── deploy # Deploy to SageMaker
57
+ │ ├── run # Run container locally
58
+ │ ├── test # Test container or endpoint
59
+ │ ├── clean # Clean up resources
60
+ <% if (buildTarget === 'codebuild') { %>│ ├── submit # Submit build to CodeBuild
61
+ <% } %>│ ├── config # Configuration variables
62
+ │ └── README.md # Detailed do-framework documentation
63
+ ├── code/ # Model serving code
64
+ <% if (framework === 'transformers') { %>│ └── serve # <%= modelServer %> entrypoint script
65
+ <% } else { %>│ ├── model_handler.py # Model loading and inference
66
+ │ └── serve.py # <%= modelServer %> server
67
+ <% } %>├── deploy/ # Legacy scripts (deprecated)
68
+ │ ├── build_and_push.sh # Use ./do/build && ./do/push instead
69
+ │ └── deploy.sh # Use ./do/deploy instead
70
+ <% if (includeSampleModel) { %>├── sample_model/ # Sample training code
71
+ │ ├── train_abalone.py # Train sample model
72
+ │ └── test_inference.py # Test inference
73
+ <% } %>
74
+ <% if (includeTesting) { %>├── test/ # Test suite
75
+ │ ├── test_endpoint.sh # Test SageMaker endpoint
76
+ │ └── test_local_image.sh # Test local container
77
+ <% } %>
78
+ ├── Dockerfile # Container definition
79
+ ├── requirements.txt # Python dependencies
80
+ └── README.md # This file
81
+ ```
82
+
83
+ ## Configuration
84
+
85
+ All deployment configuration is centralized in `do/config`:
86
+
87
+ ```bash
88
+ # Project identification
89
+ PROJECT_NAME="<%= projectName %>"
90
+ DEPLOYMENT_CONFIG="<%= deploymentConfig %>"
91
+
92
+ # AWS configuration
93
+ AWS_REGION="<%= awsRegion %>"
94
+ INSTANCE_TYPE="<%= instanceType %>"
95
+
96
+ # Framework configuration
97
+ FRAMEWORK="<%= framework %>"
98
+ MODEL_SERVER="<%= modelServer %>"
99
+ <% if (framework === 'transformers') { %>
100
+ # Model configuration
101
+ MODEL_NAME="<%= modelName %>"
102
+ <% } %>
103
+ ```
104
+
105
+ You can override these values by setting environment variables before running do scripts.
106
+
107
+ ## Deployment Workflows
108
+
109
+ ### Local Development Workflow
110
+
111
+ ```bash
112
+ # Build and test locally
113
+ ./do/build
114
+ ./do/run &
115
+ ./do/test
116
+
117
+ # When satisfied, push to ECR
118
+ ./do/push
119
+ ```
120
+
121
+ <% if (buildTarget === 'codebuild') { %>### CodeBuild Workflow
122
+
123
+ ```bash
124
+ # Submit build to CodeBuild (builds and pushes to ECR)
125
+ ./do/submit
126
+
127
+ # Deploy to SageMaker
128
+ ./do/deploy <role-arn>
129
+
130
+ # Test the endpoint
131
+ ./do/test <%= projectName %>-endpoint
132
+ ```
133
+
134
+ <% } else { %>### SageMaker Deployment Workflow
135
+
136
+ ```bash
137
+ # Build, push, and deploy
138
+ ./do/build
139
+ ./do/push
140
+ ./do/deploy <role-arn>
141
+
142
+ # Test the endpoint
143
+ ./do/test <%= projectName %>-endpoint
144
+ ```
145
+
146
+ <% } %>### Cleanup
147
+
148
+ ```bash
149
+ # Remove local images
150
+ ./do/clean local
151
+
152
+ # Remove ECR images
153
+ ./do/clean ecr
154
+
155
+ # Delete SageMaker endpoint
156
+ ./do/clean endpoint
157
+
158
+ # Clean everything
159
+ ./do/clean all
160
+ ```
161
+
162
+ ## do-framework Commands
163
+
164
+ This project uses the [do-framework](https://github.com/iankoulski/do-framework) for standardized container lifecycle management.
165
+
166
+ ### Available Commands
167
+
168
+ | Command | Description |
169
+ |---------|-------------|
170
+ | `./do/build` | Build Docker image locally |
171
+ | `./do/push` | Push image to Amazon ECR |
172
+ | `./do/deploy <role-arn>` | Deploy to SageMaker endpoint |
173
+ | `./do/run` | Run container locally on port 8080 |
174
+ | `./do/test [endpoint]` | Test local container or SageMaker endpoint |
175
+ | `./do/clean <target>` | Clean up resources (local/ecr/endpoint/all) |
176
+ <% if (buildTarget === 'codebuild') { %>| `./do/submit` | Submit build to AWS CodeBuild |
177
+ <% } %>
178
+ For detailed documentation on each command, see `do/README.md`.
179
+
180
+ ## Framework-Specific Information
181
+
182
+ <% if (framework === 'sklearn') { %>### scikit-learn
183
+
184
+ This container serves scikit-learn models using <%= modelServer %>.
185
+
186
+ **Model Format**: <%= modelFormat %>
187
+
188
+ **Loading**: Models are loaded from `/opt/ml/model/model.<%= modelFormat %>`
189
+
190
+ **Inference**: Send JSON requests to `/invocations` endpoint
191
+
192
+ <% } else if (framework === 'xgboost') { %>### XGBoost
193
+
194
+ This container serves XGBoost models using <%= modelServer %>.
195
+
196
+ **Model Format**: <%= modelFormat %>
197
+
198
+ **Loading**: Models are loaded from `/opt/ml/model/model.<%= modelFormat %>`
199
+
200
+ **Inference**: Send JSON requests to `/invocations` endpoint
201
+
202
+ <% } else if (framework === 'tensorflow') { %>### TensorFlow
203
+
204
+ This container serves TensorFlow/Keras models using <%= modelServer %>.
205
+
206
+ **Model Format**: <%= modelFormat %>
207
+
208
+ **Loading**: Models are loaded from `/opt/ml/model/`
209
+
210
+ **Inference**: Send JSON requests to `/invocations` endpoint
211
+
212
+ <% } else if (framework === 'transformers') { %>### Transformers (<%= modelServer %>)
213
+
214
+ This container serves transformer models using <%= modelServer %>.
215
+
216
+ **Model**: <%= modelName %>
217
+
218
+ <% if (modelServer === 'vllm') { %>**Server**: vLLM - High-throughput LLM serving with PagedAttention
219
+
220
+ **Features**:
221
+ - Continuous batching
222
+ - Optimized CUDA kernels
223
+ - OpenAI-compatible API
224
+
225
+ <% } else if (modelServer === 'sglang') { %>**Server**: SGLang - Fast serving with RadixAttention
226
+
227
+ **Features**:
228
+ - Structured generation
229
+ - Radix attention for prefix caching
230
+ - OpenAI-compatible API
231
+
232
+ <% } else if (modelServer === 'tensorrt-llm') { %>**Server**: TensorRT-LLM - NVIDIA optimized LLM serving
233
+
234
+ **Features**:
235
+ - TensorRT optimizations
236
+ - Multi-GPU support
237
+ - OpenAI-compatible API via nginx proxy
238
+
239
+ **Note**: Requires NGC API key for building. Set `NGC_API_KEY` environment variable.
240
+
241
+ <% } else if (modelServer === 'lmi') { %>**Server**: LMI (Large Model Inference) - AWS optimized serving
242
+
243
+ **Features**:
244
+ - AWS-optimized inference
245
+ - Multiple backend support
246
+ - DJL Serving integration
247
+
248
+ <% } else if (modelServer === 'djl') { %>**Server**: DJL (Deep Java Library) - Multi-framework serving
249
+
250
+ **Features**:
251
+ - Multi-framework support
252
+ - Production-ready serving
253
+ - AWS integration
254
+
255
+ <% } %>
256
+ **Inference**: Send requests to `/invocations` endpoint with:
257
+ ```json
258
+ {
259
+ "inputs": "Your prompt here",
260
+ "parameters": {
261
+ "max_new_tokens": 100,
262
+ "temperature": 0.7
263
+ }
264
+ }
265
+ ```
266
+
267
+ <% } %>
268
+
269
+ ## SageMaker Endpoints
270
+
271
+ ### Health Check
272
+
273
+ SageMaker calls the `/ping` endpoint to verify container health:
274
+
275
+ ```bash
276
+ curl http://localhost:8080/ping
277
+ ```
278
+
279
+ Expected response: `200 OK`
280
+
281
+ ### Inference
282
+
283
+ Send prediction requests to the `/invocations` endpoint:
284
+
285
+ <% if (framework === 'transformers') { %>```bash
286
+ curl -X POST http://localhost:8080/invocations \
287
+ -H "Content-Type: application/json" \
288
+ -d '{
289
+ "inputs": "What is machine learning?",
290
+ "parameters": {
291
+ "max_new_tokens": 100,
292
+ "temperature": 0.7
293
+ }
294
+ }'
295
+ ```
296
+
297
+ <% } else { %>```bash
298
+ curl -X POST http://localhost:8080/invocations \
299
+ -H "Content-Type: application/json" \
300
+ -d '{
301
+ "instances": [[1.0, 2.0, 3.0, 4.0]]
302
+ }'
303
+ ```
304
+
305
+ <% } %>
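For callers that prefer a script over `curl`, the two request shapes above can be produced by one small helper. This is a sketch under stated assumptions: it targets the local container on port 8080, relies on the global `fetch` available in Node.js 18 and later, and `buildInvocationBody` and `invoke` are hypothetical names rather than part of the generated project.

```javascript
// Hypothetical helper: builds the JSON body for POST /invocations.
// Payload shapes follow the curl examples in this README.
function buildInvocationBody(framework, input) {
  if (framework === 'transformers') {
    return {
      inputs: input,
      parameters: { max_new_tokens: 100, temperature: 0.7 },
    };
  }
  // Traditional ML servers take a batch of feature rows.
  return { instances: input };
}

// Sends the request to a locally running container (assumed on port 8080).
// The response is returned as text because its format depends on the server.
async function invoke(framework, input, baseUrl = 'http://localhost:8080') {
  const response = await fetch(`${baseUrl}/invocations`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildInvocationBody(framework, input)),
  });
  if (!response.ok) {
    throw new Error(`Invocation failed with HTTP ${response.status}`);
  }
  return response.text();
}

// Print the body that would be sent for a traditional ML model.
console.log(JSON.stringify(buildInvocationBody('sklearn', [[1.0, 2.0, 3.0, 4.0]])));
```

Calling `invoke('sklearn', [[1.0, 2.0, 3.0, 4.0]])` only succeeds while the container started by `./do/run` is listening.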
306
+ ## AWS Requirements
307
+
308
+ ### IAM Permissions
309
+
310
+ The SageMaker execution role needs these permissions:
311
+
312
+ - `ecr:GetAuthorizationToken`
313
+ - `ecr:BatchCheckLayerAvailability`
314
+ - `ecr:GetDownloadUrlForLayer`
315
+ - `ecr:BatchGetImage`
316
+ - `s3:GetObject` (if using S3 for model artifacts)
317
+ - `logs:CreateLogGroup`
318
+ - `logs:CreateLogStream`
319
+ - `logs:PutLogEvents`
320
+
321
+ See `IAM_PERMISSIONS.md` for detailed permission requirements.
322
+
323
+ ### AWS CLI Configuration
324
+
325
+ Ensure AWS CLI is configured with appropriate credentials:
326
+
327
+ ```bash
328
+ aws configure
329
+ ```
330
+
331
+ Or use environment variables:
332
+
333
+ ```bash
334
+ export AWS_ACCESS_KEY_ID=your-access-key
335
+ export AWS_SECRET_ACCESS_KEY=your-secret-key
336
+ export AWS_DEFAULT_REGION=<%= awsRegion %>
337
+ ```
338
+
339
+ ## Troubleshooting
340
+
341
+ ### Build Issues
342
+
343
+ <% if (modelServer === 'tensorrt-llm') { %>**NGC Authentication Failed**
344
+
345
+ Set your NGC API key:
346
+ ```bash
347
+ export NGC_API_KEY=your-ngc-api-key
348
+ ./do/build
349
+ ```
350
+
351
+ <% } %>**Docker Not Found**
352
+
353
+ Install Docker: https://docs.docker.com/get-docker/
354
+
355
+ **Permission Denied**
356
+
357
+ Add your user to the docker group:
358
+ ```bash
359
+ sudo usermod -aG docker $USER
360
+ ```
361
+
362
+ ### Deployment Issues
363
+
364
+ **ECR Push Failed**
365
+
366
+ Check AWS credentials and IAM permissions:
367
+ ```bash
368
+ aws sts get-caller-identity
369
+ ```
370
+
371
+ **Endpoint Creation Failed**
372
+
373
+ - Verify the execution role ARN is correct
374
+ - Check IAM permissions
375
+ - Ensure the instance type is available in your region
376
+
377
+ **Endpoint Stuck in Creating**
378
+
379
+ Check CloudWatch logs:
380
+ ```bash
381
+ aws logs tail /aws/sagemaker/Endpoints/<%= projectName %>-endpoint --follow
382
+ ```
383
+
384
+ ### Runtime Issues
385
+
386
+ **Container Exits Immediately**
387
+
388
+ Check container logs:
389
+ ```bash
390
+ docker logs $(docker ps -a | grep <%= projectName %> | awk '{print $1}' | head -n 1)
391
+ ```
392
+
393
+ **Out of Memory**
394
+
395
+ Increase instance size or optimize model:
396
+ ```bash
397
+ # Edit do/config
398
+ INSTANCE_TYPE="ml.m5.2xlarge" # Larger instance
399
+ ```
400
+
401
+ ## Migration from Legacy Scripts
402
+
403
+ If you're familiar with the old `deploy/` scripts, see `MIGRATION.md` for a command mapping guide.
404
+
405
+ **Quick Reference**:
406
+
407
+ | Legacy Command | do-framework Command |
408
+ |----------------|---------------------|
409
+ | `./deploy/build_and_push.sh` | `./do/build && ./do/push` |
410
+ | `./deploy/deploy.sh <role>` | `./do/deploy <role>` |
411
+ <% if (buildTarget === 'codebuild') { %>| `./deploy/submit_build.sh` | `./do/submit` |
412
+ <% } %>
413
+ The legacy scripts are still available but deprecated. They will display warnings and forward to do-framework commands.
414
+
415
+ ## Additional Resources
416
+
417
+ - [do-framework Documentation](https://github.com/iankoulski/do-framework)
418
+ - [AWS SageMaker Documentation](https://docs.aws.amazon.com/sagemaker/)
419
+ - [SageMaker BYOC Guide](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html)
420
+ <% if (framework === 'transformers') { %>
421
+ <% if (modelServer === 'vllm') { %>- [vLLM Documentation](https://docs.vllm.ai/)
422
+ <% } else if (modelServer === 'sglang') { %>- [SGLang Documentation](https://sgl-project.github.io/)
423
+ <% } else if (modelServer === 'tensorrt-llm') { %>- [TensorRT-LLM Documentation](https://github.com/NVIDIA/TensorRT-LLM)
424
+ <% } else if (modelServer === 'lmi') { %>- [LMI Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/large-model-inference.html)
425
+ <% } else if (modelServer === 'djl') { %>- [DJL Documentation](https://docs.djl.ai/)
426
+ <% } %>
427
+ <% } %>
428
+ ## Support
429
+
430
+ For issues or questions:
431
+
432
+ 1. Check `do/README.md` for detailed command documentation
433
+ 2. Review CloudWatch logs for deployment issues
434
+ 3. See `MIGRATION.md` if migrating from legacy scripts
435
+ 4. Open an issue on the [ML Container Creator repository](https://github.com/yourusername/ml-container-creator)
436
+
437
+ ## License
438
+
439
+ This generated project is provided as starter code. Modify as needed for your use case.
@@ -0,0 +1,243 @@
1
+ # Template System Documentation
2
+
3
+ This directory contains EJS templates that are processed and copied to generate complete ML container projects.
4
+
5
+ ## How Templates Work
6
+
7
+ ### Template Processing
8
+ All files in this directory are processed using [EJS (Embedded JavaScript)](https://ejs.co/) templating:
9
+
10
+ ```ejs
11
+ <%%= variable %> <%# Outputs escaped value %>
12
+ <%%- variable %> <%# Outputs unescaped value %>
13
+ <%% if (condition) { %>
14
+ Conditional content
15
+ <%% } %>
16
+ ```
17
+
18
+ ### Available Variables
19
+
20
+ All user answers from the prompting phase are available in templates:
21
+
22
+ | Variable | Type | Description | Example Values |
23
+ |----------|------|-------------|----------------|
24
+ | `projectName` | string | Project name | `my-ml-model` |
25
+ | `destinationDir` | string | Output directory | `./my-ml-model-2024-12-02` |
26
+ | `framework` | string | ML framework | `sklearn`, `xgboost`, `tensorflow`, `transformers` |
27
+ | `modelFormat` | string | Model serialization format | `pkl`, `joblib`, `json`, `keras`, `h5` |
28
+ | `modelServer` | string | Model serving framework | `flask`, `fastapi`, `vllm`, `sglang`, `lmi`, `djl` |
29
+ | `includeSampleModel` | boolean | Include sample model | `true`, `false` |
30
+ | `includeTesting` | boolean | Include test suite | `true`, `false` |
31
+ | `testTypes` | string[] | Selected test types | `['local-model-cli', 'hosted-model-endpoint']` |
32
+ | `buildTarget` | string | Build target | `codebuild` |
33
+ | `instanceType` | string | Instance configuration | `cpu-optimized`, `gpu-enabled`, `custom` |
34
+ | `customInstanceType` | string | Custom AWS instance type | `ml.m5.large`, `ml.g4dn.xlarge` |
35
+ | `awsRegion` | string | AWS region | `us-east-1` |
36
+ | `buildTimestamp` | string | Generation timestamp | `2024-12-02T15-30-45` |
37
+
38
+ ## Directory Structure
39
+
40
+ ```
41
+ templates/
42
+ ├── code/ # Model serving code
43
+ │ ├── flask/ # Flask-specific implementation
44
+ │ ├── model_handler.py # Model loading and inference (traditional ML)
45
+ │ ├── serve.py # Flask/FastAPI server (traditional ML)
46
+ │ ├── serve # vLLM/SGLang/TensorRT-LLM entrypoint (transformers)
47
+ │ ├── serving.properties # LMI/DJL configuration (transformers with lmi/djl)
48
+ │ └── start_server.py # Server startup script (traditional ML)
49
+ ├── deploy/ # Deployment scripts
50
+ │ ├── build_and_push.sh # Build Docker image and push to ECR
51
+ │ ├── deploy.sh # Deploy to SageMaker endpoint
52
+ │ └── upload_to_s3.sh # Upload model to S3 (transformers only)
53
+ ├── sample_model/ # Optional sample training code
54
+ │ ├── train_abalone.py # Sample model training
55
+ │ └── test_inference.py # Sample inference testing
56
+ ├── test/ # Optional test suite
57
+ │ ├── test_endpoint.sh # Test hosted SageMaker endpoint
58
+ │ ├── test_local_image.sh # Test local Docker container
59
+ │ └── test_model_handler.py # Unit tests for model handler
60
+ ├── Dockerfile # Container definition
61
+ ├── nginx-predictors.conf # Nginx configuration (traditional ML only)
62
+ └── requirements.txt # Python dependencies
63
+ ```
64
+
65
+ ## Conditional File Inclusion
66
+
67
+ Files are conditionally included based on user configuration:
68
+
69
+ ### Transformers Configuration
70
+ When `framework === 'transformers'`:
71
+ - **Excluded**: Traditional ML serving files
72
+ - `code/model_handler.py`
73
+ - `code/serve.py`
74
+ - `code/start_server.py`
75
+ - `nginx-predictors.conf`
76
+ - `requirements.txt` (uses transformer-specific version)
77
+ - `test/test_local_image.sh`
78
+ - `test/test_model_handler.py`
79
+ - **Included**: Transformer-specific files
80
+ - `code/serve` (vLLM/SGLang/TensorRT-LLM entrypoint)
81
+ - `deploy/upload_to_s3.sh`
82
+
83
+ ### TensorRT-LLM Specific Files
84
+ When `modelServer === 'tensorrt-llm'`:
85
+ - **Additional Included Files**:
86
+ - `nginx-tensorrt.conf` - Nginx reverse proxy for OpenAI API compatibility
87
+ - `code/start_server.sh` - Startup script that launches TensorRT-LLM and nginx
88
+ - **Architecture**: TensorRT-LLM runs on port 8081, nginx proxies SageMaker endpoints on port 8080
89
+
90
+ ### LMI/DJL Specific Files
91
+ When `modelServer === 'lmi'` or `modelServer === 'djl'`:
92
+ - **Additional Included Files**:
93
+ - `code/serving.properties` - Configuration file for LMI/DJL serving
94
+ - **Architecture**: Uses AWS pre-built containers with DJL Serving
95
+ - **Configuration**: Model and serving parameters defined in serving.properties instead of environment variables
96
+
97
+ ### Traditional ML Configuration
98
+ When `framework !== 'transformers'`:
99
+ - **Excluded**: Transformer-specific files
100
+ - `code/serve`
101
+ - `deploy/upload_to_s3.sh`
102
+ - **Included**: Traditional ML serving files
103
+ - All Flask/FastAPI serving code
104
+ - Nginx configuration
105
+ - Model handler
106
+
107
+ ### Optional Modules
108
+ - **Sample Model**: Excluded if `includeSampleModel === false`
109
+ - **Test Suite**: Excluded if `includeTesting === false`
110
+ - **Flask Code**: Excluded if `modelServer !== 'flask'`
111
+
112
+ ## Template Examples
113
+
114
+ ### Using Variables in Shell Scripts
115
+ ```bash
116
+ #!/bin/bash
117
+ # deploy/build_and_push.sh
118
+
119
+ PROJECT_NAME="<%%= projectName %>"
120
+ REGION="<%%= awsRegion %>"
121
+
122
+ echo "Building ${PROJECT_NAME} for region ${REGION}"
123
+ ```
124
+
125
+ ### Conditional Content in Python
126
+ ```python
127
+ # code/model_handler.py
128
+
129
+ <%% if (framework === 'sklearn') { %>
130
+ import joblib
131
+ model = joblib.load(model_path)
132
+ <%% } else if (framework === 'xgboost') { %>
133
+ import xgboost as xgb
134
+ model = xgb.Booster()
135
+ model.load_model(model_path)
136
+ <%% } %>
137
+ ```
138
+
139
+ ### Using Arrays
140
+ ```python
141
+ # test/test_endpoint.sh
142
+
143
+ <%% if (testTypes.includes('hosted-model-endpoint')) { %>
144
+ echo "Testing hosted endpoint..."
145
+ aws sagemaker-runtime invoke-endpoint \
146
+ --endpoint-name <%%= projectName %>-endpoint \
147
+ --body file://test_data.json \
148
+ output.json
149
+ <%% } %>
150
+ ```
151
+
152
+ ## Adding New Templates
153
+
154
+ ### 1. Create Template File
155
+ Add your template file in the appropriate directory:
156
+ ```bash
157
+ templates/code/my_new_file.py
158
+ ```
159
+
160
+ ### 2. Use EJS Syntax
161
+ ```python
162
+ # templates/code/my_new_file.py
163
+ """
164
+ Generated for <%%= projectName %>
165
+ Framework: <%%= framework %>
166
+ """
167
+
168
+ <%% if (framework === 'sklearn') { %>
169
+ # sklearn-specific code
170
+ <%% } %>
171
+ ```
172
+
173
+ ### 3. Add Conditional Exclusion (if needed)
174
+ In `generators/app/index.js`, add to `ignorePatterns`:
175
+ ```javascript
176
+ if (someCondition) {
177
+ ignorePatterns.push('**/code/my_new_file.py');
178
+ }
179
+ ```
180
+
181
+ ### 4. Update This Documentation
182
+ Document the new template and when it's included/excluded.
183
+
184
+ ## Best Practices
185
+
186
+ ### Template Design
187
+ - **Keep templates simple** - Complex logic belongs in the generator
188
+ - **Use descriptive variable names** - Make templates self-documenting
189
+ - **Add comments** - Explain why conditional logic exists
190
+ - **Test all paths** - Verify templates work for all configurations
191
+
192
+ ### Variable Usage
193
+ - **Escape output by default** - Use `<%%= %>` unless you need HTML
194
+ - **Validate in generator** - Don't assume variables exist in templates
195
+ - **Provide defaults** - Use `<%%= variable || 'default' %>`
196
+
197
+ ### File Organization
198
+ - **Group related files** - Keep similar templates together
199
+ - **Use subdirectories** - Organize by feature or component
200
+ - **Name clearly** - File names should indicate purpose
201
+
202
+ ## Testing Templates
203
+
204
+ ### Manual Testing
205
+ ```bash
206
+ # Link generator locally
207
+ npm link
208
+
209
+ # Run generator
210
+ ml-container-creator
211
+
212
+ # Test different configurations
213
+ # - sklearn + flask
214
+ # - xgboost + fastapi
215
+ # - transformers + vllm
216
+ # - With/without sample model
217
+ # - With/without tests
218
+ ```
219
+
220
+ ### Automated Testing
221
+ See `test/` directory for generator tests that verify template generation.
222
+
223
+ ## Troubleshooting
224
+
225
+ ### Template Not Copied
226
+ - Check if file matches an ignore pattern
227
+ - Verify file is in templates directory
228
+ - Check for EJS syntax errors
229
+
230
+ ### Variables Not Replaced
231
+ - Ensure variable exists in `this.answers`
232
+ - Check EJS syntax: `<%%= variable %>` not `{{ variable }}`
233
+ - Verify template is processed with `copyTpl` not `copy`
234
+
235
+ ### Conditional Logic Not Working
236
+ - Test condition in generator first
237
+ - Use `console.log()` to debug values
238
+ - Check for typos in variable names
239
+
240
+ ## Related Documentation
241
+
242
+ - [EJS Documentation](https://ejs.co/)
243
+ - [Project Steering Files](../../../.kiro/steering/)