@aws/ml-container-creator 0.13.3 → 0.13.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (43)
  1. package/README.md +23 -5
  2. package/infra/ci-harness/package-lock.json +1 -5
  3. package/package.json +5 -3
  4. package/pyproject.toml +21 -0
  5. package/requirements.txt +19 -0
  6. package/servers/instance-sizer/lib/model-resolver.js +127 -185
  7. package/servers/instance-sizer/lib/vram-estimator.js +86 -0
  8. package/servers/lib/catalogs/instances.json +0 -27
  9. package/src/app.js +2 -0
  10. package/src/lib/bootstrap-command-handler.js +35 -25
  11. package/src/lib/generated/cli-options.js +1 -1
  12. package/src/lib/generated/parameter-matrix.js +1 -1
  13. package/src/lib/generated/validation-rules.js +1 -1
  14. package/src/lib/prompt-runner.js +14 -31
  15. package/templates/IAM_PERMISSIONS.md +64 -13
  16. package/templates/do/.adapter_helper.py +451 -0
  17. package/templates/do/.benchmark_writer.py +13 -0
  18. package/templates/do/.stage_helper.py +419 -0
  19. package/templates/do/.tune_helper.py +218 -67
  20. package/templates/do/README.md +50 -604
  21. package/templates/do/__pycache__/.adapter_helper.cpython-312.pyc +0 -0
  22. package/templates/do/__pycache__/.benchmark_writer.cpython-312.pyc +0 -0
  23. package/templates/do/__pycache__/.tune_helper.cpython-312.pyc +0 -0
  24. package/templates/do/adapter +109 -4
  25. package/templates/do/benchmark +150 -12
  26. package/templates/do/build +2 -5
  27. package/templates/do/clean.d/async-inference.ejs +2 -5
  28. package/templates/do/clean.d/batch-transform.ejs +2 -5
  29. package/templates/do/clean.d/hyperpod-eks.ejs +2 -5
  30. package/templates/do/clean.d/managed-inference.ejs +2 -5
  31. package/templates/do/config +4 -0
  32. package/templates/do/deploy.d/async-inference.ejs +6 -9
  33. package/templates/do/deploy.d/batch-transform.ejs +4 -7
  34. package/templates/do/deploy.d/hyperpod-eks.ejs +1 -4
  35. package/templates/do/deploy.d/managed-inference.ejs +15 -6
  36. package/templates/do/lib/profile.sh +24 -15
  37. package/templates/do/push +2 -5
  38. package/templates/do/register +2 -5
  39. package/templates/do/stage +114 -292
  40. package/templates/do/submit +1 -4
  41. package/templates/do/tune +64 -10
  42. package/templates/MIGRATION.md +0 -488
  43. package/templates/TEMPLATE_SYSTEM.md +0 -243
@@ -1,243 +0,0 @@
- # Template System Documentation
-
- This directory contains EJS templates that are processed and copied to generate complete ML container projects.
-
- ## How Templates Work
-
- ### Template Processing
- All files in this directory are processed using [EJS (Embedded JavaScript)](https://ejs.co/) templating:
-
- ```ejs
- <%%= variable %> <%# Outputs escaped value %>
- <%%- variable %> <%# Outputs unescaped value %>
- <%% if (condition) { %>
-   Conditional content
- <%% } %>
- ```
-
- ### Available Variables
-
- All user answers from the prompting phase are available in templates:
-
- | Variable | Type | Description | Example Values |
- |----------|------|-------------|----------------|
- | `projectName` | string | Project name | `my-ml-model` |
- | `destinationDir` | string | Output directory | `./my-ml-model-2024-12-02` |
- | `framework` | string | ML framework | `sklearn`, `xgboost`, `tensorflow`, `transformers` |
- | `modelFormat` | string | Model serialization format | `pkl`, `joblib`, `json`, `keras`, `h5` |
- | `modelServer` | string | Model serving framework | `flask`, `fastapi`, `vllm`, `sglang`, `lmi`, `djl` |
- | `includeSampleModel` | boolean | Include sample model | `true`, `false` |
- | `includeTesting` | boolean | Include test suite | `true`, `false` |
- | `testTypes` | string[] | Selected test types | `['local-model-cli', 'hosted-model-endpoint']` |
- | `buildTarget` | string | Build target | `codebuild` |
- | `instanceType` | string | Instance configuration | `cpu-optimized`, `gpu-enabled`, `custom` |
- | `customInstanceType` | string | Custom AWS instance type | `ml.m5.large`, `ml.g4dn.xlarge` |
- | `awsRegion` | string | AWS region | `us-east-1` |
- | `buildTimestamp` | string | Generation timestamp | `2024-12-02T15-30-45` |
-
- ## Directory Structure
-
- ```
- templates/
- ├── code/                      # Model serving code
- │   ├── flask/                 # Flask-specific implementation
- │   ├── model_handler.py       # Model loading and inference (traditional ML)
- │   ├── serve.py               # Flask/FastAPI server (traditional ML)
- │   ├── serve                  # vLLM/SGLang/TensorRT-LLM entrypoint (transformers)
- │   ├── serving.properties     # LMI/DJL configuration (transformers with lmi/djl)
- │   └── start_server.py        # Server startup script (traditional ML)
- ├── deploy/                    # Deployment scripts
- │   ├── build_and_push.sh      # Build Docker image and push to ECR
- │   ├── deploy.sh              # Deploy to SageMaker endpoint
- │   └── upload_to_s3.sh        # Upload model to S3 (transformers only)
- ├── sample_model/              # Optional sample training code
- │   ├── train_abalone.py       # Sample model training
- │   └── test_inference.py      # Sample inference testing
- ├── test/                      # Optional test suite
- │   ├── test_endpoint.sh       # Test hosted SageMaker endpoint
- │   ├── test_local_image.sh    # Test local Docker container
- │   └── test_model_handler.py  # Unit tests for model handler
- ├── Dockerfile                 # Container definition
- ├── nginx-predictors.conf      # Nginx configuration (traditional ML only)
- └── requirements.txt           # Python dependencies
- ```
-
- ## Conditional File Inclusion
-
- Files are conditionally included based on user configuration:
-
- ### Transformers Configuration
- When `framework === 'transformers'`:
- - **Excluded**: Traditional ML serving files
-   - `code/model_handler.py`
-   - `code/serve.py`
-   - `code/start_server.py`
-   - `nginx-predictors.conf`
-   - `requirements.txt` (uses transformer-specific version)
-   - `test/test_local_image.sh`
-   - `test/test_model_handler.py`
- - **Included**: Transformer-specific files
-   - `code/serve` (vLLM/SGLang/TensorRT-LLM entrypoint)
-   - `deploy/upload_to_s3.sh`
-
- ### TensorRT-LLM Specific Files
- When `modelServer === 'tensorrt-llm'`:
- - **Additional Included Files**:
-   - `nginx-tensorrt.conf` - Nginx reverse proxy for OpenAI API compatibility
-   - `code/start_server.sh` - Startup script that launches TensorRT-LLM and nginx
- - **Architecture**: TensorRT-LLM runs on port 8081, nginx proxies SageMaker endpoints on port 8080
-
- ### LMI/DJL Specific Files
- When `modelServer === 'lmi'` or `modelServer === 'djl'`:
- - **Additional Included Files**:
-   - `code/serving.properties` - Configuration file for LMI/DJL serving
- - **Architecture**: Uses AWS pre-built containers with DJL Serving
- - **Configuration**: Model and serving parameters defined in serving.properties instead of environment variables
-
- ### Traditional ML Configuration
- When `framework !== 'transformers'`:
- - **Excluded**: Transformer-specific files
-   - `code/serve`
-   - `deploy/upload_to_s3.sh`
- - **Included**: Traditional ML serving files
-   - All Flask/FastAPI serving code
-   - Nginx configuration
-   - Model handler
-
- ### Optional Modules
- - **Sample Model**: Excluded if `includeSampleModel === false`
- - **Test Suite**: Excluded if `includeTesting === false`
- - **Flask Code**: Excluded if `modelServer !== 'flask'`
-
- ## Template Examples
-
- ### Using Variables in Shell Scripts
- ```bash
- #!/bin/bash
- # deploy/build_and_push.sh
-
- PROJECT_NAME="<%%= projectName %>"
- REGION="<%%= awsRegion %>"
-
- echo "Building ${PROJECT_NAME} for region ${REGION}"
- ```
-
- ### Conditional Content in Python
- ```python
- # code/model_handler.py
-
- <%% if (framework === 'sklearn') { %>
- import joblib
- model = joblib.load(model_path)
- <%% } else if (framework === 'xgboost') { %>
- import xgboost as xgb
- model = xgb.Booster()
- model.load_model(model_path)
- <%% } %>
- ```
-
- ### Using Arrays
- ```python
- # test/test_endpoint.sh
-
- <%% if (testTypes.includes('hosted-model-endpoint')) { %>
- echo "Testing hosted endpoint..."
- aws sagemaker-runtime invoke-endpoint \
-     --endpoint-name <%%= projectName %>-endpoint \
-     --body file://test_data.json \
-     output.json
- <%% } %>
- ```
-
- ## Adding New Templates
-
- ### 1. Create Template File
- Add your template file in the appropriate directory:
- ```bash
- templates/code/my_new_file.py
- ```
-
- ### 2. Use EJS Syntax
- ```python
- # templates/code/my_new_file.py
- """
- Generated for <%%= projectName %>
- Framework: <%%= framework %>
- """
-
- <%% if (framework === 'sklearn') { %>
- # sklearn-specific code
- <%% } %>
- ```
-
- ### 3. Add Conditional Exclusion (if needed)
- In `generators/app/index.js`, add to `ignorePatterns`:
- ```javascript
- if (someCondition) {
-     ignorePatterns.push('**/code/my_new_file.py');
- }
- ```
-
- ### 4. Update This Documentation
- Document the new template and when it's included/excluded.
-
- ## Best Practices
-
- ### Template Design
- - **Keep templates simple** - Complex logic belongs in the generator
- - **Use descriptive variable names** - Make templates self-documenting
- - **Add comments** - Explain why conditional logic exists
- - **Test all paths** - Verify templates work for all configurations
-
- ### Variable Usage
- - **Escape output by default** - Use `<%%= %>` unless you need HTML
- - **Validate in generator** - Don't assume variables exist in templates
- - **Provide defaults** - Use `<%%= variable || 'default' %>`
-
- ### File Organization
- - **Group related files** - Keep similar templates together
- - **Use subdirectories** - Organize by feature or component
- - **Name clearly** - File names should indicate purpose
-
- ## Testing Templates
-
- ### Manual Testing
- ```bash
- # Link generator locally
- npm link
-
- # Run generator
- ml-container-creator
-
- # Test different configurations
- # - sklearn + flask
- # - xgboost + fastapi
- # - transformers + vllm
- # - With/without sample model
- # - With/without tests
- ```
-
- ### Automated Testing
- See `test/` directory for generator tests that verify template generation.
-
- ## Troubleshooting
-
- ### Template Not Copied
- - Check if file matches an ignore pattern
- - Verify file is in templates directory
- - Check for EJS syntax errors
-
- ### Variables Not Replaced
- - Ensure variable exists in `this.answers`
- - Check EJS syntax: `<%%= variable %>` not `{{ variable }}`
- - Verify template is processed with `copyTpl` not `copy`
-
- ### Conditional Logic Not Working
- - Test condition in generator first
- - Use `console.log()` to debug values
- - Check for typos in variable names
-
- ## Related Documentation
-
- - [EJS Documentation](https://ejs.co/)
- - [Project Steering Files](../../../.kiro/steering/)
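
Note on the removed file: its "Conditional File Inclusion" rules can be read as a mapping from the prompt answers to a list of glob ignore patterns. The following is a minimal sketch of that mapping, not the package's actual generator code. The function name `buildIgnorePatterns` and the directory globs (`**/code/flask/**`, `**/sample_model/**`, `**/test/**`) are hypothetical; only the individual file paths and the `ignorePatterns.push('**/...')` pattern style come from the removed document. The TensorRT-LLM files and the `requirements.txt` swap are left out because the document does not spell out how they are handled.

```javascript
// Hypothetical helper (not from the package): derives ignore patterns from the
// generator answers, following the rules listed in the removed TEMPLATE_SYSTEM.md.
function buildIgnorePatterns(answers) {
  const patterns = [];

  if (answers.framework === 'transformers') {
    // Transformers projects skip the traditional ML serving files.
    patterns.push(
      '**/code/model_handler.py',
      '**/code/serve.py',
      '**/code/start_server.py',
      '**/nginx-predictors.conf',
      '**/test/test_local_image.sh',
      '**/test/test_model_handler.py'
    );
  } else {
    // Traditional ML projects skip the transformer-specific files.
    patterns.push('**/code/serve', '**/deploy/upload_to_s3.sh');
  }

  // serving.properties is described as LMI/DJL only.
  if (answers.modelServer !== 'lmi' && answers.modelServer !== 'djl') {
    patterns.push('**/code/serving.properties');
  }

  // Optional modules (directory globs are assumptions for illustration).
  if (answers.modelServer !== 'flask') patterns.push('**/code/flask/**');
  if (answers.includeSampleModel === false) patterns.push('**/sample_model/**');
  if (answers.includeTesting === false) patterns.push('**/test/**');

  return patterns;
}

// Example: an sklearn + flask project without the test suite.
console.log(buildIgnorePatterns({
  framework: 'sklearn',
  modelServer: 'flask',
  includeSampleModel: true,
  includeTesting: false,
}));
```

Keeping the rules in one pure function like this matches the removed document's advice that complex logic belongs in the generator rather than in the templates, and it makes each configuration path easy to unit test.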