@aws/ml-container-creator 0.2.0

Files changed (143)
  1. package/LICENSE +202 -0
  2. package/LICENSE-THIRD-PARTY +68620 -0
  3. package/NOTICE +2 -0
  4. package/README.md +106 -0
  5. package/bin/cli.js +365 -0
  6. package/config/defaults.json +32 -0
  7. package/config/presets/transformers-djl.json +26 -0
  8. package/config/presets/transformers-gpu.json +24 -0
  9. package/config/presets/transformers-lmi.json +27 -0
  10. package/package.json +129 -0
  11. package/servers/README.md +419 -0
  12. package/servers/base-image-picker/catalogs/model-servers.json +1191 -0
  13. package/servers/base-image-picker/catalogs/python-slim.json +38 -0
  14. package/servers/base-image-picker/catalogs/triton-backends.json +51 -0
  15. package/servers/base-image-picker/catalogs/triton.json +38 -0
  16. package/servers/base-image-picker/index.js +495 -0
  17. package/servers/base-image-picker/manifest.json +17 -0
  18. package/servers/base-image-picker/package.json +15 -0
  19. package/servers/hyperpod-cluster-picker/LICENSE +202 -0
  20. package/servers/hyperpod-cluster-picker/index.js +424 -0
  21. package/servers/hyperpod-cluster-picker/manifest.json +14 -0
  22. package/servers/hyperpod-cluster-picker/package.json +17 -0
  23. package/servers/instance-recommender/LICENSE +202 -0
  24. package/servers/instance-recommender/catalogs/instances.json +852 -0
  25. package/servers/instance-recommender/index.js +284 -0
  26. package/servers/instance-recommender/manifest.json +16 -0
  27. package/servers/instance-recommender/package.json +15 -0
  28. package/servers/lib/LICENSE +202 -0
  29. package/servers/lib/bedrock-client.js +160 -0
  30. package/servers/lib/custom-validators.js +46 -0
  31. package/servers/lib/dynamic-resolver.js +36 -0
  32. package/servers/lib/package.json +11 -0
  33. package/servers/lib/schemas/image-catalog.schema.json +185 -0
  34. package/servers/lib/schemas/instances.schema.json +124 -0
  35. package/servers/lib/schemas/manifest.schema.json +64 -0
  36. package/servers/lib/schemas/model-catalog.schema.json +91 -0
  37. package/servers/lib/schemas/regions.schema.json +26 -0
  38. package/servers/lib/schemas/triton-backends.schema.json +51 -0
  39. package/servers/model-picker/catalogs/jumpstart-public.json +66 -0
  40. package/servers/model-picker/catalogs/popular-diffusors.json +88 -0
  41. package/servers/model-picker/catalogs/popular-transformers.json +226 -0
  42. package/servers/model-picker/index.js +1693 -0
  43. package/servers/model-picker/manifest.json +18 -0
  44. package/servers/model-picker/package.json +20 -0
  45. package/servers/region-picker/LICENSE +202 -0
  46. package/servers/region-picker/catalogs/regions.json +263 -0
  47. package/servers/region-picker/index.js +230 -0
  48. package/servers/region-picker/manifest.json +16 -0
  49. package/servers/region-picker/package.json +15 -0
  50. package/src/app.js +1007 -0
  51. package/src/copy-tpl.js +77 -0
  52. package/src/lib/accelerator-validator.js +39 -0
  53. package/src/lib/asset-manager.js +385 -0
  54. package/src/lib/aws-profile-parser.js +181 -0
  55. package/src/lib/bootstrap-command-handler.js +1647 -0
  56. package/src/lib/bootstrap-config.js +238 -0
  57. package/src/lib/ci-register-helpers.js +124 -0
  58. package/src/lib/ci-report-helpers.js +158 -0
  59. package/src/lib/ci-stage-helpers.js +268 -0
  60. package/src/lib/cli-handler.js +529 -0
  61. package/src/lib/comment-generator.js +544 -0
  62. package/src/lib/community-reports-validator.js +91 -0
  63. package/src/lib/config-manager.js +2106 -0
  64. package/src/lib/configuration-exporter.js +204 -0
  65. package/src/lib/configuration-manager.js +695 -0
  66. package/src/lib/configuration-matcher.js +221 -0
  67. package/src/lib/cpu-validator.js +36 -0
  68. package/src/lib/cuda-validator.js +57 -0
  69. package/src/lib/deployment-config-resolver.js +103 -0
  70. package/src/lib/deployment-entry-schema.js +125 -0
  71. package/src/lib/deployment-registry.js +598 -0
  72. package/src/lib/docker-introspection-validator.js +51 -0
  73. package/src/lib/engine-prefix-resolver.js +60 -0
  74. package/src/lib/huggingface-client.js +172 -0
  75. package/src/lib/key-value-parser.js +37 -0
  76. package/src/lib/known-flags-validator.js +200 -0
  77. package/src/lib/manifest-cli.js +280 -0
  78. package/src/lib/mcp-client.js +303 -0
  79. package/src/lib/mcp-command-handler.js +532 -0
  80. package/src/lib/neuron-validator.js +80 -0
  81. package/src/lib/parameter-schema-validator.js +284 -0
  82. package/src/lib/prompt-runner.js +1349 -0
  83. package/src/lib/prompts.js +1138 -0
  84. package/src/lib/registry-command-handler.js +519 -0
  85. package/src/lib/registry-loader.js +198 -0
  86. package/src/lib/rocm-validator.js +80 -0
  87. package/src/lib/schema-validator.js +157 -0
  88. package/src/lib/sensitive-redactor.js +59 -0
  89. package/src/lib/template-engine.js +156 -0
  90. package/src/lib/template-manager.js +341 -0
  91. package/src/lib/validation-engine.js +314 -0
  92. package/src/prompt-adapter.js +63 -0
  93. package/templates/Dockerfile +300 -0
  94. package/templates/IAM_PERMISSIONS.md +84 -0
  95. package/templates/MIGRATION.md +488 -0
  96. package/templates/PROJECT_README.md +439 -0
  97. package/templates/TEMPLATE_SYSTEM.md +243 -0
  98. package/templates/buildspec.yml +64 -0
  99. package/templates/code/chat_template.jinja +1 -0
  100. package/templates/code/flask/gunicorn_config.py +35 -0
  101. package/templates/code/flask/wsgi.py +10 -0
  102. package/templates/code/model_handler.py +387 -0
  103. package/templates/code/serve +300 -0
  104. package/templates/code/serve.py +175 -0
  105. package/templates/code/serving.properties +105 -0
  106. package/templates/code/start_server.py +39 -0
  107. package/templates/code/start_server.sh +39 -0
  108. package/templates/diffusors/Dockerfile +72 -0
  109. package/templates/diffusors/patch_image_api.py +35 -0
  110. package/templates/diffusors/serve +115 -0
  111. package/templates/diffusors/start_server.sh +114 -0
  112. package/templates/do/.gitkeep +1 -0
  113. package/templates/do/README.md +541 -0
  114. package/templates/do/build +83 -0
  115. package/templates/do/ci +681 -0
  116. package/templates/do/clean +811 -0
  117. package/templates/do/config +260 -0
  118. package/templates/do/deploy +1560 -0
  119. package/templates/do/export +306 -0
  120. package/templates/do/logs +319 -0
  121. package/templates/do/manifest +12 -0
  122. package/templates/do/push +119 -0
  123. package/templates/do/register +580 -0
  124. package/templates/do/run +113 -0
  125. package/templates/do/submit +417 -0
  126. package/templates/do/test +1147 -0
  127. package/templates/hyperpod/configmap.yaml +24 -0
  128. package/templates/hyperpod/deployment.yaml +71 -0
  129. package/templates/hyperpod/pvc.yaml +42 -0
  130. package/templates/hyperpod/service.yaml +17 -0
  131. package/templates/nginx-diffusors.conf +74 -0
  132. package/templates/nginx-predictors.conf +47 -0
  133. package/templates/nginx-tensorrt.conf +74 -0
  134. package/templates/requirements.txt +61 -0
  135. package/templates/sample_model/test_inference.py +123 -0
  136. package/templates/sample_model/train_abalone.py +252 -0
  137. package/templates/test/test_endpoint.sh +79 -0
  138. package/templates/test/test_local_image.sh +80 -0
  139. package/templates/test/test_model_handler.py +180 -0
  140. package/templates/triton/Dockerfile +128 -0
  141. package/templates/triton/config.pbtxt +163 -0
  142. package/templates/triton/model.py +130 -0
  143. package/templates/triton/requirements.txt +11 -0
@@ -0,0 +1,63 @@
+ // Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ // SPDX-License-Identifier: Apache-2.0
+
+ import { select, input, confirm, checkbox, number, Separator } from '@inquirer/prompts'
+
+ /**
+  * Maps Yeoman prompt type names to @inquirer/prompts runner functions.
+  */
+ const runners = { list: select, select, input, confirm, checkbox, number }
+
+ /**
+  * Runs a sequence of Yeoman-style prompt definitions using @inquirer/prompts.
+  *
+  * Handles:
+  * - Type mapping (list → select)
+  * - Conditional prompts via `when` function
+  * - Dynamic `choices`, `default`, and `message` (functions resolved with current answers)
+  * - Separator mapping from Yeoman format to @inquirer/prompts Separator
+  * - Validate function passthrough
+  *
+  * @param {Array<object>} prompts - Array of Yeoman-style prompt definitions
+  * @param {object} [previousAnswers={}] - Answers from prior prompt phases
+  * @param {object} [options={}] - Options for dependency injection
+  * @param {object} [options.runners] - Override prompt runners (useful for testing)
+  * @returns {Promise<object>} Accumulated answers keyed by prompt name
+  */
+ export async function runPrompts(prompts, previousAnswers = {}, options = {}) {
+     const promptRunners = options.runners || runners
+     const answers = { ...previousAnswers }
+
+     for (const prompt of prompts) {
+         if (prompt.when && !prompt.when(answers)) continue
+
+         const type = prompt.type === 'list' ? 'select' : prompt.type
+         const runner = promptRunners[type]
+
+         if (!runner) {
+             throw new Error(`Unsupported prompt type: "${prompt.type}"`)
+         }
+
+         const message = typeof prompt.message === 'function'
+             ? prompt.message(answers) : prompt.message
+         const choices = typeof prompt.choices === 'function'
+             ? prompt.choices(answers) : prompt.choices
+         const defaultVal = typeof prompt.default === 'function'
+             ? prompt.default(answers) : prompt.default
+
+         const mappedChoices = choices?.map(c =>
+             c && c.type === 'separator'
+                 ? new Separator(c.separator || c.line)
+                 : c
+         )
+
+         const config = { message }
+         if (mappedChoices !== undefined) config.choices = mappedChoices
+         if (defaultVal !== undefined) config.default = defaultVal
+         if (prompt.validate) config.validate = prompt.validate
+
+         answers[prompt.name] = await runner(config)
+     }
+
+     return answers
+ }
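The per-prompt resolution step in `runPrompts` above can be read in isolation. The following is a minimal sketch, not code from the package: `buildConfig` is a hypothetical helper that mirrors the loop body, and the local `Separator` class is a stand-in for the one exported by `@inquirer/prompts`, so the snippet runs without dependencies. The prompt definition and answer values are illustrative only.

```javascript
// Stand-in for the Separator class from @inquirer/prompts (assumption: the
// real class is not imported here so the sketch has no dependencies).
class Separator {
  constructor(text) {
    this.separator = text
  }
}

// Hypothetical helper: turns one Yeoman-style prompt definition into the
// config object passed to a runner, following the same rules as runPrompts.
function buildConfig(prompt, answers) {
  // Function-valued fields are resolved against the answers collected so far.
  const resolve = (value) => (typeof value === 'function' ? value(answers) : value)

  const choices = resolve(prompt.choices)
  const defaultVal = resolve(prompt.default)

  const config = { message: resolve(prompt.message) }
  if (choices !== undefined) {
    // Yeoman-style separator entries become Separator instances.
    config.choices = choices.map((c) =>
      c && c.type === 'separator' ? new Separator(c.separator || c.line) : c
    )
  }
  if (defaultVal !== undefined) config.default = defaultVal
  if (prompt.validate) config.validate = prompt.validate
  return config
}

// Illustrative prompt definition; the field values are made up for the demo.
const config = buildConfig(
  {
    type: 'list',
    name: 'modelServer',
    message: (a) => `Model server for ${a.framework}?`,
    choices: () => ['vllm', { type: 'separator', line: '--' }, 'sglang'],
    default: 'vllm'
  },
  { framework: 'transformers' }
)

console.log(config)
```

Keeping this step free of I/O is what makes the `options.runners` injection point useful: a test can pass stub runners and inspect the config each one receives.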
@@ -0,0 +1,300 @@
+ # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ # SPDX-License-Identifier: Apache-2.0
+
+ <% if (comments && comments.acceleratorInfo) { %>
+ <%= comments.acceleratorInfo %>
+ <% } %>
+
+ <% if (comments && comments.validationInfo) { %>
+ <%= comments.validationInfo %>
+ <% } %>
+
+ <% if (framework !== 'transformers') { %>
+ FROM <%= baseImage || 'python:3.12-slim' %>
+
+ # Set a Docker label to name this project, suffixed with the build time
+ LABEL project.name="<%= projectName %>-<%= buildTimestamp %>" \
+     project.base-name="<%= projectName %>" \
+     project.build-time="<%= buildTimestamp %>"
+
+ # Set a Docker label to advertise multi-model support on the container
+ LABEL com.amazonaws.sagemaker.capabilities.multi-models=true
+ # Set a Docker label to let the container use the SAGEMAKER_BIND_TO_PORT environment variable if present
+ LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
+
+ # Set working directory
+ WORKDIR /opt/ml
+
+ RUN apt-get update && \
+     apt-get upgrade -y && \
+     apt-get clean
+
+ # Install system dependencies
+ RUN apt-get install -y --no-install-recommends \
+     build-essential \
+     ca-certificates \
+     curl \
+     git \
+     nginx \
+     && rm -rf /var/lib/apt/lists/* \
+     && apt-get clean
+
+ # Install Python dependencies
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir pip==24.2 && \
+     pip install --no-cache-dir -r requirements.txt
+
+ # Copy model serving code
+ COPY code/serve.py code/
+ COPY code/start_server.py code/
+ COPY code/model_handler.py code/
+
+ <% if (modelServer === 'flask') { %>
+ COPY code/flask/gunicorn_config.py code/
+ COPY code/flask/wsgi.py code/
+ <% } %>
+
+
+ # Set up SageMaker directories
+ RUN mkdir -p /opt/ml/input/data \
+     && mkdir -p /opt/ml/output/data \
+     && mkdir -p /opt/ml/model
+
+ COPY nginx-predictors.conf /etc/nginx/nginx.conf
+
+ # Model files will be provided at runtime via SageMaker model artifacts
+ <% if (includeSampleModel) { %>
+ # Copy the generated sample model
+ <% if (modelFormat === 'SavedModel') { %>
+ COPY sample_model/abalone_model /opt/ml/model/
+ <% } else { %>
+ COPY sample_model/abalone_model.<%= modelFormat %> /opt/ml/model/
+ <% } %>
+ # Also copy training script for reference
+ COPY sample_model/ /opt/ml/sample_model/
+ <% } else { %>
+ # COPY your_model_files /opt/ml/model/
+ <% } %>
+
+ <% if (comments && comments.envVarExplanations && Object.keys(comments.envVarExplanations).length > 0) { %>
+ # Environment Variables Configuration
+ <% for (const [category, comment] of Object.entries(comments.envVarExplanations)) { %>
+ <%= comment %>
+ <% } %>
+ <% } %>
+
+ # Set environment variables for SageMaker
+ ENV PYTHONPATH=/opt/ml/code
+ ENV SAGEMAKER_BIND_TO_PORT=8080
+
+ <% if (orderedEnvVars && orderedEnvVars.length > 0) { %>
+ # Additional environment variables from configuration
+ <% orderedEnvVars.forEach(({ key, value }) => { %>
+ ENV <%= key %>=<%= value %>
+ <% }); %>
+ <% } %>
+
+ # Expose port 8080 for SageMaker inference
+ EXPOSE 8080
+
+ <% if (comments && comments.troubleshooting) { %>
+ <%= comments.troubleshooting %>
+ <% } %>
+
+ # Set the inference script as the entry point
+ RUN chmod +x code/start_server.py
+ ENTRYPOINT ["python", "/opt/ml/code/start_server.py"]
+ <% } else { %>
+ <% if (comments && comments.acceleratorInfo) { %>
+ <%= comments.acceleratorInfo %>
+ <% } %>
+
+ <% if (comments && comments.validationInfo) { %>
+ <%= comments.validationInfo %>
+ <% } %>
+
+ <% if (modelServer === 'vllm') { %>
+ # https://github.com/aws-samples/sagemaker-genai-hosting-examples/tree/main/OpenAI/gpt-oss/deploy/docker
+ ARG BASE_IMAGE=<%= baseImage || 'vllm/vllm-openai:v0.10.1' %>
+ <% } else if (modelServer === 'sglang') { %>
+ ARG BASE_IMAGE=<%= baseImage || 'lmsysorg/sglang:v0.5.4.post1' %>
+ <% } else if (modelServer === 'tensorrt-llm') { %>
+ # TensorRT-LLM requires NVIDIA NGC authentication
+ # Before building, authenticate with NGC:
+ # 1. Create NGC account: https://ngc.nvidia.com/signup
+ # 2. Generate API key: https://ngc.nvidia.com/setup/api-key
+ # 3. Login: docker login nvcr.io
+ #    Username: $oauthtoken
+ #    Password: <your-ngc-api-key>
+ # Using a stable release for better SageMaker compatibility
+ ARG BASE_IMAGE=<%= baseImage || 'nvcr.io/nvidia/tensorrt-llm/release:1.2.0rc8' %>
+ <% } else if (modelServer === 'lmi') { %>
+ # AWS Large Model Inference (LMI) Container
+ # LMI containers are pre-built by AWS and include DJL Serving with optimized inference libraries
+ # Available backends: vLLM, TensorRT-LLM, LMI-Dist (DeepSpeed), Transformers NeuronX
+ # Documentation: https://docs.djl.ai/master/docs/serving/serving/docs/lmi/index.html
+ ARG BASE_IMAGE=<%= baseImage || '763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.32.0-lmi14.0.0-cu126' %>
+ <% } else if (modelServer === 'djl') { %>
+ # DJL Serving Container
+ # Deep Java Library serving with support for multiple inference backends
+ # Documentation: https://djl.ai/
+ ARG BASE_IMAGE=<%= baseImage || 'deepjavalibrary/djl-serving:0.36.0-pytorch-gpu' %>
+ <% } %>
+
+ FROM ${BASE_IMAGE}
+
+ <% if (comments && comments.chatTemplate) { %>
+ <%= comments.chatTemplate %>
+ <% } %>
+
+ # Model source metadata
+ ENV MODEL_SOURCE="<%= (typeof modelSource !== 'undefined' && modelSource) ? modelSource : 'huggingface' %>"
+ <% if (typeof artifactUri !== 'undefined' && artifactUri) { %>
+ ENV MODEL_ARTIFACT_URI="<%= artifactUri %>"
+ <% } %>
+
+ # Set the model name for the transformer model
+ <% if (modelServer === 'vllm') { %>
+ <% if (typeof modelLoadStrategy !== 'undefined' && modelLoadStrategy === 'build-time' && (typeof modelSource === 'undefined' || !modelSource || modelSource === 'huggingface')) { %>
+ ENV VLLM_MODEL="/opt/ml/model"
+ <% } else { %>
+ ENV VLLM_MODEL="<%= modelName %>"
+ <% } %>
+ <% if (typeof modelSource !== 'undefined' && modelSource && modelSource !== 'huggingface') { %>
+ # Model will be resolved at container startup by the serve script
+ <% } %>
+ <% } else if (modelServer === 'sglang') { %>
+ <% if (typeof modelLoadStrategy !== 'undefined' && modelLoadStrategy === 'build-time' && (typeof modelSource === 'undefined' || !modelSource || modelSource === 'huggingface')) { %>
+ ENV SGLANG_MODEL_PATH="/opt/ml/model"
+ <% } else { %>
+ ENV SGLANG_MODEL_PATH="<%= modelName %>"
+ <% } %>
+ <% if (typeof modelSource !== 'undefined' && modelSource && modelSource !== 'huggingface') { %>
+ # Model will be resolved at container startup by the serve script
+ <% } %>
+ <% } else if (modelServer === 'tensorrt-llm') { %>
+ <% if (typeof modelLoadStrategy !== 'undefined' && modelLoadStrategy === 'build-time' && (typeof modelSource === 'undefined' || !modelSource || modelSource === 'huggingface')) { %>
+ ENV TRTLLM_MODEL="/opt/ml/model"
+ <% } else { %>
+ ENV TRTLLM_MODEL="<%= modelName %>"
+ <% } %>
+ <% if (typeof modelSource !== 'undefined' && modelSource && modelSource !== 'huggingface') { %>
+ # Model will be resolved at container startup by the serve script
+ <% } %>
+
+ # Disable UCX CUDA transport to avoid symbol lookup errors
+ # The UCX CUDA library has compatibility issues with some CUDA versions in SageMaker
+ RUN if [ -f /usr/local/ucx/lib/ucx/libuct_cuda.so.0 ]; then \
+     mv /usr/local/ucx/lib/ucx/libuct_cuda.so.0 /usr/local/ucx/lib/ucx/libuct_cuda.so.0.disabled; \
+     fi
+
+ ENV UCX_TLS=tcp,self,sm
+ ENV UCX_NET_DEVICES=all
+ ENV NCCL_IB_DISABLE=1
+ ENV NCCL_P2P_DISABLE=1
+ <% } else if (modelServer === 'lmi' || modelServer === 'djl') { %>
+ # LMI/DJL Configuration
+ # Model configuration is done via serving.properties file
+ # The model will be loaded from HuggingFace Hub or S3
+ ENV HF_MODEL_ID="<%= modelName %>"
+ <% if (typeof modelSource !== 'undefined' && modelSource && modelSource !== 'huggingface') { %>
+ # Model will be resolved at container startup by the serve script
+ <% } %>
+
+ # DJL Serving listens on port 8080 by default (SageMaker requirement)
+ ENV SERVING_PORT=8080
+ <% } %>
+
+ <% if (hfToken && (!modelSource || modelSource === 'huggingface')) { %>
+ # Set HuggingFace authentication token
+ ENV HF_TOKEN="<%= hfToken %>"
+ <% } %>
+
+ <% if (chatTemplate) { %>
+ # Chat template configuration
+ # This template formats chat messages for the model
+ # Writing to file to avoid shell escaping issues with multi-line Jinja2 templates
+ COPY code/chat_template.jinja /opt/ml/chat_template.jinja
+ ENV SGLANG_CHAT_TEMPLATE="/opt/ml/chat_template.jinja"
+ <% } %>
+
+ <% if (comments && comments.envVarExplanations && Object.keys(comments.envVarExplanations).length > 0) { %>
+ # Environment Variables Configuration
+ <% for (const [category, comment] of Object.entries(comments.envVarExplanations)) { %>
+ <%= comment %>
+ <% } %>
+ <% } %>
+
+ <% if (orderedEnvVars && orderedEnvVars.length > 0) { %>
+ # Additional environment variables from configuration
+ <% orderedEnvVars.forEach(({ key, value }) => { %>
+ ENV <%= key %>=<%= value %>
+ <% }); %>
+ <% } %>
+
+ <% if (typeof modelSource !== 'undefined' && modelSource && modelSource !== 'huggingface' && modelServer !== 'lmi' && modelServer !== 'djl') { %>
+ # Install AWS CLI for S3 model downloads
+ RUN pip install --no-cache-dir awscli
+
+ <% } %>
+ <% if (typeof modelLoadStrategy !== 'undefined' && modelLoadStrategy === 'build-time') { %>
+ # Build-time model download
+ # ⚠️ Credentials required during docker build.
+ # Build with: DOCKER_BUILDKIT=1 docker build --secret id=aws,src=$HOME/.aws/credentials .
+ # Or in CodeBuild: pass AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN as build args.
+ <% if (typeof modelSource === 'undefined' || !modelSource || modelSource === 'huggingface') { %>
+ ARG HF_TOKEN
+ RUN huggingface-cli download <%= modelName %> --local-dir /opt/ml/model
+ <% } else if (typeof artifactUri !== 'undefined' && artifactUri) { %>
+ ARG AWS_ACCESS_KEY_ID
+ ARG AWS_SECRET_ACCESS_KEY
+ ARG AWS_SESSION_TOKEN
+ ARG AWS_DEFAULT_REGION=<%= (typeof awsRegion !== 'undefined' && awsRegion) ? awsRegion : 'us-east-1' %>
+ RUN AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
+     AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
+     AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN} \
+     AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION} \
+     aws s3 sync <%= artifactUri %> /opt/ml/model
+ <% } %>
+ <% } %>
+ <% if (modelServer === 'tensorrt-llm') { %>
+ # Install nginx and curl for reverse proxy and health checks
+ RUN apt-get update && \
+     apt-get install -y nginx curl && \
+     rm -rf /var/lib/apt/lists/*
+
+ # Copy nginx configuration for TensorRT-LLM
+ COPY nginx-tensorrt.conf /etc/nginx/nginx.conf
+
+ # Copy TensorRT-LLM serve script
+ COPY code/serve /usr/bin/serve_trtllm
+ RUN chmod +x /usr/bin/serve_trtllm
+
+ # Copy startup script
+ COPY code/start_server.sh /usr/bin/start_server.sh
+ RUN chmod +x /usr/bin/start_server.sh
+
+ ENTRYPOINT [ "/usr/bin/start_server.sh" ]
+ <% } else if (modelServer === 'lmi' || modelServer === 'djl') { %>
+ # Create serving.properties configuration file for LMI/DJL
+ RUN mkdir -p /opt/ml/model
+ COPY code/serving.properties /opt/ml/model/serving.properties
+
+ <% if (comments && comments.troubleshooting) { %>
+ <%= comments.troubleshooting %>
+ <% } %>
+
+ # LMI/DJL containers use their own entrypoint
+ # The container will automatically start DJL Serving with the configuration
+ <% } else { %>
+ COPY code/serve /usr/bin/serve
+ RUN chmod 777 /usr/bin/serve
+
+ <% if (comments && comments.troubleshooting) { %>
+ <%= comments.troubleshooting %>
+ <% } %>
+
+ ENTRYPOINT [ "/usr/bin/serve" ]
+ <% } %>
+
+ <% } %>
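The branching in the Dockerfile template above is easier to follow when the two main decisions are written as plain functions. This is a sketch under stated assumptions, not code from the package: `resolveBaseImage` and `resolveModelPath` are hypothetical names, the fallback image tags are copied from the template's `ARG BASE_IMAGE` lines, and the framework and model names in the usage lines are placeholders.

```javascript
// Fallback images copied from the template's ARG BASE_IMAGE lines.
const DEFAULT_BASE_IMAGES = {
  vllm: 'vllm/vllm-openai:v0.10.1',
  sglang: 'lmsysorg/sglang:v0.5.4.post1',
  'tensorrt-llm': 'nvcr.io/nvidia/tensorrt-llm/release:1.2.0rc8',
  lmi: '763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.32.0-lmi14.0.0-cu126',
  djl: 'deepjavalibrary/djl-serving:0.36.0-pytorch-gpu'
}

// Hypothetical helper: the image the rendered Dockerfile would build FROM.
// Returns undefined for an unrecognized transformers model server, which
// corresponds to the template emitting no ARG BASE_IMAGE line at all.
function resolveBaseImage({ framework, modelServer, baseImage }) {
  if (baseImage) return baseImage
  if (framework !== 'transformers') return 'python:3.12-slim'
  return DEFAULT_BASE_IMAGES[modelServer]
}

// Hypothetical helper: the value the template assigns to VLLM_MODEL,
// SGLANG_MODEL_PATH, or TRTLLM_MODEL. Only a build-time download from the
// Hugging Face Hub points the server at the local /opt/ml/model directory.
function resolveModelPath({ modelLoadStrategy, modelSource, modelName }) {
  const fromHub = !modelSource || modelSource === 'huggingface'
  return modelLoadStrategy === 'build-time' && fromHub ? '/opt/ml/model' : modelName
}

// Placeholder inputs for illustration.
console.log(resolveBaseImage({ framework: 'transformers', modelServer: 'sglang' }))
console.log(resolveModelPath({ modelLoadStrategy: 'build-time', modelName: 'example-org/example-model' }))
```

An explicit `baseImage` always wins, and a non-Hub model source keeps the model name in the environment variable even with `build-time` loading. Per the template's own comment, the model is then resolved by the serve script at container startup.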
@@ -0,0 +1,84 @@
+ # IAM Permissions: <%= projectName %>
+
+ ## Overview
+
+ This project uses three sets of IAM permissions:
+
+ 1. **SageMaker Execution Role**: created automatically by `bootstrap` via CloudFormation
+ 2. **CodeBuild Service Role**: created automatically by `./do/submit`
+ 3. **User/CI Permissions**: your AWS user or CI system needs these to run the do-scripts
+
+ ## SageMaker Execution Role
+
+ The bootstrap command creates an IAM role (`mlcc-sagemaker-execution-role`) with permissions for:
+
+ - **SageMaker**: Create, update, delete, and invoke endpoints, endpoint configs, models, and inference components
+ - **ECR**: Pull images from the `ml-container-creator` repository
+ - **CloudWatch Logs**: Write container logs
+ - **S3**: Read model artifacts from `ml-container-creator-*` buckets
+
+ The role is defined in the CloudFormation stack template (`config/bootstrap-stack.json`) and updated automatically when you re-run bootstrap after upgrading.
+
+ If you use a custom role (`--role-arn`), ensure it has at minimum:
+
+ | Permission | Purpose |
+ |-----------|---------|
+ | `sagemaker:CreateEndpoint`, `CreateEndpointConfig`, `CreateModel`, `CreateInferenceComponent` | Deploy |
+ | `sagemaker:DeleteEndpoint`, `DeleteEndpointConfig`, `DeleteModel`, `DeleteInferenceComponent` | Clean up |
+ | `sagemaker:DescribeEndpoint`, `DescribeEndpointConfig`, `DescribeModel`, `DescribeInferenceComponent` | Status checks |
+ | `sagemaker:InvokeEndpoint`, `InvokeEndpointAsync` | Inference |
+ | `sagemaker:UpdateEndpoint`, `UpdateEndpointWeightsAndCapacities`, `UpdateInferenceComponent` | Updates |
+ | `ecr:GetAuthorizationToken`, `BatchGetImage`, `GetDownloadUrlForLayer`, `BatchCheckLayerAvailability` | Pull container image |
+ | `logs:CreateLogGroup`, `CreateLogStream`, `PutLogEvents` | Container logging |
+ | `s3:GetObject`, `s3:ListBucket` on `ml-container-creator-*` | Model artifact access |
+
+ Trust policy must allow `sagemaker.amazonaws.com` to assume the role.
+
+ ## CodeBuild Service Role
+
+ Created automatically by `./do/submit` as `<%= codebuildProjectName %>-service-role`. Permissions:
+
+ - **CloudWatch Logs**: Write build logs to `/aws/codebuild/<%= codebuildProjectName %>*`
+ - **ECR**: Push images to `ml-container-creator` repository
+ - **S3**: Read source archives from `codebuild-source-*` buckets
+
+ ## User/CI Permissions
+
+ Your AWS user or CI system needs these permissions to run the do-scripts:
+
+ | Script | Permissions Needed |
+ |--------|-------------------|
+ | `./do/push` | `ecr:GetAuthorizationToken`, `ecr:PutImage`, `ecr:InitiateLayerUpload`, `ecr:UploadLayerPart`, `ecr:CompleteLayerUpload`, `ecr:BatchCheckLayerAvailability` |
+ | `./do/submit` | `codebuild:CreateProject`, `codebuild:StartBuild`, `codebuild:BatchGetBuilds`, `iam:CreateRole`, `iam:PutRolePolicy`, `iam:PassRole`, `s3:PutObject`, `s3:CreateBucket` |
+ | `./do/deploy` | `sagemaker:CreateEndpointConfig`, `sagemaker:CreateEndpoint`, `sagemaker:CreateInferenceComponent`, `sagemaker:DescribeEndpoint`, `iam:PassRole` |
+ | `./do/clean` | `sagemaker:DeleteEndpoint`, `sagemaker:DeleteEndpointConfig`, `sagemaker:DeleteInferenceComponent`, `codebuild:DeleteProject`, `iam:DeleteRole`, `iam:DeleteRolePolicy` |
+ | `./do/test` | `sagemaker:InvokeEndpoint` |
+ | `bootstrap` | `cloudformation:*`, `iam:CreateRole`, `iam:PutRolePolicy`, `iam:TagRole`, `ecr:CreateRepository`, `s3:CreateBucket` (and `sts:GetCallerIdentity`) |
+
+ <% if (framework === 'transformers' && hfToken) { %>
+ ## HuggingFace Token Security
+
+ This project includes a HuggingFace token baked into the Docker image. Key practices:
+
+ - **Use read-only tokens**: never bake write tokens into containers
+ - **Rotate regularly**: every 30–90 days, or immediately if compromised
+ - **Restrict ECR access**: limit who can pull images containing the token
+ - **Consider runtime injection**: pass `HF_TOKEN` as a SageMaker environment variable instead of baking it in (avoids token in image layers, enables rotation without rebuild)
+
+ To rotate: generate a new token on [HuggingFace](https://huggingface.co/settings/tokens), rebuild with `./do/submit`, then revoke the old token.
+
+ If compromised: revoke the token immediately, delete the ECR image (`aws ecr batch-delete-image`), rebuild, and review CloudTrail logs.
+ <% } %>
+
+ ## Security Best Practices
+
+ - **Least privilege**: All roles are scoped to specific resources where possible
+ - **Resource scoping**: CodeBuild permissions scoped to `<%= codebuildProjectName %>`, SageMaker to `<%= projectName %>*`
+ - **Audit**: Enable CloudTrail for IAM, SageMaker, ECR, and CodeBuild events
+ - **Separate environments**: Consider per-environment roles (dev/prod)
+
+ ## References
+
+ - [SageMaker Execution Roles](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html)
+ - [CodeBuild Service Role](https://docs.aws.amazon.com/codebuild/latest/userguide/setting-up.html#setting-up-service-role)
+ - [ECR Permissions](https://docs.aws.amazon.com/AmazonECR/latest/userguide/security_iam_service-with-iam.html)
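For the custom-role case described in the IAM document above, the trust relationship and part of the permission table can be written out as policy documents. The sketch below is illustrative and is not the policy shipped in `config/bootstrap-stack.json`: the `Sid` values are made up, the S3 ARNs are derived from the `ml-container-creator-*` pattern, and `Resource: '*'` on the ECR and CloudWatch Logs statements is a simplifying assumption that a real deployment would likely narrow.

```javascript
// Trust policy: lets the SageMaker service assume the role.
const trustPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: { Service: 'sagemaker.amazonaws.com' },
      Action: 'sts:AssumeRole'
    }
  ]
}

// Permissions policy covering three rows of the table: image pull,
// container logging, and model artifact access. The sagemaker:* rows are
// omitted here for brevity.
const permissionsPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Sid: 'PullContainerImage', // illustrative Sid
      Effect: 'Allow',
      Action: [
        'ecr:GetAuthorizationToken',
        'ecr:BatchGetImage',
        'ecr:GetDownloadUrlForLayer',
        'ecr:BatchCheckLayerAvailability'
      ],
      Resource: '*' // assumption: left broad to keep the sketch short
    },
    {
      Sid: 'ContainerLogging', // illustrative Sid
      Effect: 'Allow',
      Action: ['logs:CreateLogGroup', 'logs:CreateLogStream', 'logs:PutLogEvents'],
      Resource: '*' // assumption: left broad to keep the sketch short
    },
    {
      Sid: 'ModelArtifactAccess', // illustrative Sid
      Effect: 'Allow',
      Action: ['s3:GetObject', 's3:ListBucket'],
      // ListBucket applies to the bucket ARN, GetObject to the object ARN.
      Resource: ['arn:aws:s3:::ml-container-creator-*', 'arn:aws:s3:::ml-container-creator-*/*']
    }
  ]
}

console.log(JSON.stringify({ trustPolicy, permissionsPolicy }, null, 2))
```

The printed JSON can be split into two files and attached to a role with whatever tooling the project already uses. The abbreviated action names in the table (for example `BatchGetImage`) need their service prefix when written into a real policy, as shown here.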