npm - @aws/ml-container-creator - Versions diffs - 0.13.3 → 0.13.5 - Mend

@aws/ml-container-creator 0.13.3 → 0.13.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (43)

package/README.md +23 -5
package/infra/ci-harness/package-lock.json +1 -5
package/package.json +5 -3
package/pyproject.toml +21 -0
package/requirements.txt +19 -0
package/servers/instance-sizer/lib/model-resolver.js +127 -185
package/servers/instance-sizer/lib/vram-estimator.js +86 -0
package/servers/lib/catalogs/instances.json +0 -27
package/src/app.js +2 -0
package/src/lib/bootstrap-command-handler.js +35 -25
package/src/lib/generated/cli-options.js +1 -1
package/src/lib/generated/parameter-matrix.js +1 -1
package/src/lib/generated/validation-rules.js +1 -1
package/src/lib/prompt-runner.js +14 -31
package/templates/IAM_PERMISSIONS.md +64 -13
package/templates/do/.adapter_helper.py +451 -0
package/templates/do/.benchmark_writer.py +13 -0
package/templates/do/.stage_helper.py +419 -0
package/templates/do/.tune_helper.py +218 -67
package/templates/do/README.md +50 -604
package/templates/do/__pycache__/.adapter_helper.cpython-312.pyc +0 -0
package/templates/do/__pycache__/.benchmark_writer.cpython-312.pyc +0 -0
package/templates/do/__pycache__/.tune_helper.cpython-312.pyc +0 -0
package/templates/do/adapter +109 -4
package/templates/do/benchmark +150 -12
package/templates/do/build +2 -5
package/templates/do/clean.d/async-inference.ejs +2 -5
package/templates/do/clean.d/batch-transform.ejs +2 -5
package/templates/do/clean.d/hyperpod-eks.ejs +2 -5
package/templates/do/clean.d/managed-inference.ejs +2 -5
package/templates/do/config +4 -0
package/templates/do/deploy.d/async-inference.ejs +6 -9
package/templates/do/deploy.d/batch-transform.ejs +4 -7
package/templates/do/deploy.d/hyperpod-eks.ejs +1 -4
package/templates/do/deploy.d/managed-inference.ejs +15 -6
package/templates/do/lib/profile.sh +24 -15
package/templates/do/push +2 -5
package/templates/do/register +2 -5
package/templates/do/stage +114 -292
package/templates/do/submit +1 -4
package/templates/do/tune +64 -10
package/templates/MIGRATION.md +0 -488
package/templates/TEMPLATE_SYSTEM.md +0 -243

package/templates/TEMPLATE_SYSTEM.md DELETED

@@ -1,243 +0,0 @@
-# Template System Documentation
-This directory contains EJS templates that are processed and copied to generate complete ML container projects.
-## How Templates Work
-### Template Processing
-All files in this directory are processed using [EJS (Embedded JavaScript)](https://ejs.co/) templating:
-```ejs
-<%%= variable %>     <%# Outputs escaped value %>
-<%%- variable %>     <%# Outputs unescaped value %>
-<%% if (condition) { %>
-    Conditional content
-<%% } %>
-```
-### Available Variables
-All user answers from the prompting phase are available in templates:
-| Variable | Type | Description | Example Values |
-|----------|------|-------------|----------------|
-| `projectName` | string | Project name | `my-ml-model` |
-| `destinationDir` | string | Output directory | `./my-ml-model-2024-12-02` |
-| `framework` | string | ML framework | `sklearn`, `xgboost`, `tensorflow`, `transformers` |
-| `modelFormat` | string | Model serialization format | `pkl`, `joblib`, `json`, `keras`, `h5` |
-| `modelServer` | string | Model serving framework | `flask`, `fastapi`, `vllm`, `sglang`, `lmi`, `djl` |
-| `includeSampleModel` | boolean | Include sample model | `true`, `false` |
-| `includeTesting` | boolean | Include test suite | `true`, `false` |
-| `testTypes` | string[] | Selected test types | `['local-model-cli', 'hosted-model-endpoint']` |
-| `buildTarget` | string | Build target | `codebuild` |
-| `instanceType` | string | Instance configuration | `cpu-optimized`, `gpu-enabled`, `custom` |
-| `customInstanceType` | string | Custom AWS instance type | `ml.m5.large`, `ml.g4dn.xlarge` |
-| `awsRegion` | string | AWS region | `us-east-1` |
-| `buildTimestamp` | string | Generation timestamp | `2024-12-02T15-30-45` |
-## Directory Structure
-```
-templates/
-├── code/                    # Model serving code
-│   ├── flask/              # Flask-specific implementation
-│   ├── model_handler.py    # Model loading and inference (traditional ML)
-│   ├── serve.py            # Flask/FastAPI server (traditional ML)
-│   ├── serve               # vLLM/SGLang/TensorRT-LLM entrypoint (transformers)
-│   ├── serving.properties  # LMI/DJL configuration (transformers with lmi/djl)
-│   └── start_server.py     # Server startup script (traditional ML)
-├── deploy/                 # Deployment scripts
-│   ├── build_and_push.sh   # Build Docker image and push to ECR
-│   ├── deploy.sh           # Deploy to SageMaker endpoint
-│   └── upload_to_s3.sh     # Upload model to S3 (transformers only)
-├── sample_model/           # Optional sample training code
-│   ├── train_abalone.py    # Sample model training
-│   └── test_inference.py   # Sample inference testing
-├── test/                   # Optional test suite
-│   ├── test_endpoint.sh    # Test hosted SageMaker endpoint
-│   ├── test_local_image.sh # Test local Docker container
-│   └── test_model_handler.py # Unit tests for model handler
-├── Dockerfile              # Container definition
-├── nginx-predictors.conf   # Nginx configuration (traditional ML only)
-└── requirements.txt        # Python dependencies
-```
-## Conditional File Inclusion
-Files are conditionally included based on user configuration:
-### Transformers Configuration
-When `framework === 'transformers'`:
-- **Excluded**: Traditional ML serving files
-  - `code/model_handler.py`
-  - `code/serve.py`
-  - `code/start_server.py`
-  - `nginx-predictors.conf`
-  - `requirements.txt` (uses transformer-specific version)
-  - `test/test_local_image.sh`
-  - `test/test_model_handler.py`
-- **Included**: Transformer-specific files
-  - `code/serve` (vLLM/SGLang/TensorRT-LLM entrypoint)
-  - `deploy/upload_to_s3.sh`
-### TensorRT-LLM Specific Files
-When `modelServer === 'tensorrt-llm'`:
-- **Additional Included Files**:
-  - `nginx-tensorrt.conf` - Nginx reverse proxy for OpenAI API compatibility
-  - `code/start_server.sh` - Startup script that launches TensorRT-LLM and nginx
-- **Architecture**: TensorRT-LLM runs on port 8081, nginx proxies SageMaker endpoints on port 8080
-### LMI/DJL Specific Files
-When `modelServer === 'lmi'` or `modelServer === 'djl'`:
-- **Additional Included Files**:
-  - `code/serving.properties` - Configuration file for LMI/DJL serving
-- **Architecture**: Uses AWS pre-built containers with DJL Serving
-- **Configuration**: Model and serving parameters defined in serving.properties instead of environment variables
-### Traditional ML Configuration
-When `framework !== 'transformers'`:
-- **Excluded**: Transformer-specific files
-  - `code/serve`
-  - `deploy/upload_to_s3.sh`
-- **Included**: Traditional ML serving files
-  - All Flask/FastAPI serving code
-  - Nginx configuration
-  - Model handler
-### Optional Modules
-- **Sample Model**: Excluded if `includeSampleModel === false`
-- **Test Suite**: Excluded if `includeTesting === false`
-- **Flask Code**: Excluded if `modelServer !== 'flask'`
-## Template Examples
-### Using Variables in Shell Scripts
-```bash
-#!/bin/bash
-# deploy/build_and_push.sh
-PROJECT_NAME="<%%= projectName %>"
-REGION="<%%= awsRegion %>"
-echo "Building ${PROJECT_NAME} for region ${REGION}"
-```
-### Conditional Content in Python
-```python
-# code/model_handler.py
-<%% if (framework === 'sklearn') { %>
-import joblib
-model = joblib.load(model_path)
-<%% } else if (framework === 'xgboost') { %>
-import xgboost as xgb
-model = xgb.Booster()
-model.load_model(model_path)
-<%% } %>
-```
-### Using Arrays
-```python
-# test/test_endpoint.sh
-<%% if (testTypes.includes('hosted-model-endpoint')) { %>
-echo "Testing hosted endpoint..."
-aws sagemaker-runtime invoke-endpoint \
-  --endpoint-name <%%= projectName %>-endpoint \
-  --body file://test_data.json \
-  output.json
-<%% } %>
-```
-## Adding New Templates
-### 1. Create Template File
-Add your template file in the appropriate directory:
-```bash
-templates/code/my_new_file.py
-```
-### 2. Use EJS Syntax
-```python
-# templates/code/my_new_file.py
-"""
-Generated for <%%= projectName %>
-Framework: <%%= framework %>
-"""
-<%% if (framework === 'sklearn') { %>
-# sklearn-specific code
-<%% } %>
-```
-### 3. Add Conditional Exclusion (if needed)
-In `generators/app/index.js`, add to `ignorePatterns`:
-```javascript
-if (someCondition) {
-    ignorePatterns.push('**/code/my_new_file.py');
-}
-```
-### 4. Update This Documentation
-Document the new template and when it's included/excluded.
-## Best Practices
-### Template Design
-- **Keep templates simple** - Complex logic belongs in the generator
-- **Use descriptive variable names** - Make templates self-documenting
-- **Add comments** - Explain why conditional logic exists
-- **Test all paths** - Verify templates work for all configurations
-### Variable Usage
-- **Escape output by default** - Use `<%%= %>` unless you need HTML
-- **Validate in generator** - Don't assume variables exist in templates
-- **Provide defaults** - Use `<%%= variable || 'default' %>`
-### File Organization
-- **Group related files** - Keep similar templates together
-- **Use subdirectories** - Organize by feature or component
-- **Name clearly** - File names should indicate purpose
-## Testing Templates
-### Manual Testing
-```bash
-# Link generator locally
-npm link
-# Run generator
-ml-container-creator
-# Test different configurations
-# - sklearn + flask
-# - xgboost + fastapi
-# - transformers + vllm
-# - With/without sample model
-# - With/without tests
-```
-### Automated Testing
-See `test/` directory for generator tests that verify template generation.
-## Troubleshooting
-### Template Not Copied
-- Check if file matches an ignore pattern
-- Verify file is in templates directory
-- Check for EJS syntax errors
-### Variables Not Replaced
-- Ensure variable exists in `this.answers`
-- Check EJS syntax: `<%%= variable %>` not `{{ variable }}`
-- Verify template is processed with `copyTpl` not `copy`
-### Conditional Logic Not Working
-- Test condition in generator first
-- Use `console.log()` to debug values
-- Check for typos in variable names
-## Related Documentation
-- [EJS Documentation](https://ejs.co/)
-- [Project Steering Files](../../../.kiro/steering/)
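
For reference, the "Conditional File Inclusion" rules in the removed TEMPLATE_SYSTEM.md can be sketched as a plain JavaScript function. This is a minimal illustration, not the package's actual generator code: the function name `buildIgnorePatterns` and the glob strings are hypothetical, modeled on the `ignorePatterns.push(...)` snippet in the removed file. The removed doc only says the TensorRT-LLM and LMI/DJL files are additionally included for those servers. Excluding them for every other server is an assumption.

```javascript
// Hypothetical sketch: derive ignore globs from the prompt answers,
// following the rules listed in the removed TEMPLATE_SYSTEM.md.
function buildIgnorePatterns(answers) {
    const ignorePatterns = [];

    if (answers.framework === 'transformers') {
        // Traditional ML serving files are excluded for transformers.
        ignorePatterns.push(
            '**/code/model_handler.py',
            '**/code/serve.py',
            '**/code/start_server.py',
            '**/nginx-predictors.conf',
            '**/requirements.txt',
            '**/test/test_local_image.sh',
            '**/test/test_model_handler.py'
        );
    } else {
        // Transformer-specific files are excluded for traditional ML.
        ignorePatterns.push('**/code/serve', '**/deploy/upload_to_s3.sh');
    }

    // Assumption: server-specific files are dropped for all other servers.
    if (answers.modelServer !== 'tensorrt-llm') {
        ignorePatterns.push('**/nginx-tensorrt.conf', '**/code/start_server.sh');
    }
    if (answers.modelServer !== 'lmi' && answers.modelServer !== 'djl') {
        ignorePatterns.push('**/code/serving.properties');
    }

    // Optional modules.
    if (answers.includeSampleModel === false) {
        ignorePatterns.push('**/sample_model/**');
    }
    if (answers.includeTesting === false) {
        ignorePatterns.push('**/test/**');
    }
    if (answers.modelServer !== 'flask') {
        ignorePatterns.push('**/code/flask/**');
    }

    return ignorePatterns;
}
```

Keeping the rules in one pure function means every configuration listed under "Manual Testing" (sklearn + flask, transformers + vllm, and so on) can be checked without running the generator.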