sagemaker-core 1.0.47__py3-none-any.whl → 1.0.62__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -66,7 +66,7 @@ class InvokeEndpointAsyncOutput(Base):
66
66
 
67
67
  Attributes
68
68
  ----------------------
69
- inference_id: Identifier for an inference request. This will be the same as the InferenceId specified in the input. Amazon SageMaker will generate an identifier for you if you do not specify one.
69
+ inference_id: Identifier for an inference request. This will be the same as the InferenceId specified in the input. Amazon SageMaker AI will generate an identifier for you if you do not specify one.
70
70
  output_location: The Amazon S3 URI where the inference response payload is stored.
71
71
  failure_location: The Amazon S3 URI where the inference failure response payload is stored.
72
72
  """
@@ -85,7 +85,7 @@ class InvokeEndpointOutput(Base):
85
85
  body: Includes the inference provided by the model. For information about the format of the response body, see Common Data Formats-Inference. If the explainer is activated, the body includes the explanations provided by the model. For more information, see the Response section under Invoke the Endpoint in the Developer Guide.
86
86
  content_type: The MIME type of the inference returned from the model container.
87
87
  invoked_production_variant: Identifies the production variant that was invoked.
88
- custom_attributes: Provides additional information in the response about the inference returned by a model hosted at an Amazon SageMaker endpoint. The information is an opaque value that is forwarded verbatim. You could use this value, for example, to return an ID received in the CustomAttributes header of a request or other metadata that a service endpoint was programmed to produce. The value must consist of no more than 1024 visible US-ASCII characters as specified in Section 3.3.6. Field Value Components of the Hypertext Transfer Protocol (HTTP/1.1). If the customer wants the custom attribute returned, the model must set the custom attribute to be included on the way back. The code in your model is responsible for setting or updating any custom attributes in the response. If your code does not set this value in the response, an empty value is returned. For example, if a custom attribute represents the trace ID, your model can prepend the custom attribute with Trace ID: in your post-processing function. This feature is currently supported in the Amazon Web Services SDKs but not in the Amazon SageMaker Python SDK.
88
+ custom_attributes: Provides additional information in the response about the inference returned by a model hosted at an Amazon SageMaker AI endpoint. The information is an opaque value that is forwarded verbatim. You could use this value, for example, to return an ID received in the CustomAttributes header of a request or other metadata that a service endpoint was programmed to produce. The value must consist of no more than 1024 visible US-ASCII characters as specified in Section 3.3.6. Field Value Components of the Hypertext Transfer Protocol (HTTP/1.1). If the customer wants the custom attribute returned, the model must set the custom attribute to be included on the way back. The code in your model is responsible for setting or updating any custom attributes in the response. If your code does not set this value in the response, an empty value is returned. For example, if a custom attribute represents the trace ID, your model can prepend the custom attribute with Trace ID: in your post-processing function. This feature is currently supported in the Amazon Web Services SDKs but not in the Amazon SageMaker AI Python SDK.
89
89
  new_session_id: If you created a stateful session with your request, the ID and expiration time that the model assigns to that session.
90
90
  closed_session_id: If you closed a stateful session with your request, the ID of that session.
91
91
  """
@@ -114,12 +114,12 @@ class PayloadPart(Base):
114
114
  class ModelStreamError(Base):
115
115
  """
116
116
  ModelStreamError
117
- An error occurred while streaming the response body. This error can have the following error codes: ModelInvocationTimeExceeded The model failed to finish sending the response within the timeout period allowed by Amazon SageMaker. StreamBroken The Transmission Control Protocol (TCP) connection between the client and the model was reset or closed.
117
+ An error occurred while streaming the response body. This error can have the following error codes: ModelInvocationTimeExceeded The model failed to finish sending the response within the timeout period allowed by Amazon SageMaker AI. StreamBroken The Transmission Control Protocol (TCP) connection between the client and the model was reset or closed.
118
118
 
119
119
  Attributes
120
120
  ----------------------
121
121
  message
122
- error_code: This error can have the following error codes: ModelInvocationTimeExceeded The model failed to finish sending the response within the timeout period allowed by Amazon SageMaker. StreamBroken The Transmission Control Protocol (TCP) connection between the client and the model was reset or closed.
122
+ error_code: This error can have the following error codes: ModelInvocationTimeExceeded The model failed to finish sending the response within the timeout period allowed by Amazon SageMaker AI. StreamBroken The Transmission Control Protocol (TCP) connection between the client and the model was reset or closed.
123
123
  """
124
124
 
125
125
  message: Optional[str] = Unassigned()
@@ -134,7 +134,7 @@ class ResponseStream(Base):
134
134
  Attributes
135
135
  ----------------------
136
136
  payload_part: A wrapper for pieces of the payload that's returned in response to a streaming inference request. A streaming inference response consists of one or more payload parts.
137
- model_stream_error: An error occurred while streaming the response body. This error can have the following error codes: ModelInvocationTimeExceeded The model failed to finish sending the response within the timeout period allowed by Amazon SageMaker. StreamBroken The Transmission Control Protocol (TCP) connection between the client and the model was reset or closed.
137
+ model_stream_error: An error occurred while streaming the response body. This error can have the following error codes: ModelInvocationTimeExceeded The model failed to finish sending the response within the timeout period allowed by Amazon SageMaker AI. StreamBroken The Transmission Control Protocol (TCP) connection between the client and the model was reset or closed.
138
138
  internal_stream_failure: The stream processing failed because of an unknown error, exception or failure. Try your request again.
139
139
  """
140
140
 
@@ -152,7 +152,7 @@ class InvokeEndpointWithResponseStreamOutput(Base):
152
152
  body
153
153
  content_type: The MIME type of the inference returned from the model container.
154
154
  invoked_production_variant: Identifies the production variant that was invoked.
155
- custom_attributes: Provides additional information in the response about the inference returned by a model hosted at an Amazon SageMaker endpoint. The information is an opaque value that is forwarded verbatim. You could use this value, for example, to return an ID received in the CustomAttributes header of a request or other metadata that a service endpoint was programmed to produce. The value must consist of no more than 1024 visible US-ASCII characters as specified in Section 3.3.6. Field Value Components of the Hypertext Transfer Protocol (HTTP/1.1). If the customer wants the custom attribute returned, the model must set the custom attribute to be included on the way back. The code in your model is responsible for setting or updating any custom attributes in the response. If your code does not set this value in the response, an empty value is returned. For example, if a custom attribute represents the trace ID, your model can prepend the custom attribute with Trace ID: in your post-processing function. This feature is currently supported in the Amazon Web Services SDKs but not in the Amazon SageMaker Python SDK.
155
+ custom_attributes: Provides additional information in the response about the inference returned by a model hosted at an Amazon SageMaker AI endpoint. The information is an opaque value that is forwarded verbatim. You could use this value, for example, to return an ID received in the CustomAttributes header of a request or other metadata that a service endpoint was programmed to produce. The value must consist of no more than 1024 visible US-ASCII characters as specified in Section 3.3.6. Field Value Components of the Hypertext Transfer Protocol (HTTP/1.1). If the customer wants the custom attribute returned, the model must set the custom attribute to be included on the way back. The code in your model is responsible for setting or updating any custom attributes in the response. If your code does not set this value in the response, an empty value is returned. For example, if a custom attribute represents the trace ID, your model can prepend the custom attribute with Trace ID: in your post-processing function. This feature is currently supported in the Amazon Web Services SDKs but not in the Amazon SageMaker AI Python SDK.
156
156
  """
157
157
 
158
158
  body: ResponseStream
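The `ResponseStream` attributes above imply a consumer loop with three cases: payload parts, a `model_stream_error`, and an `internal_stream_failure`. The following is a minimal sketch of that loop using plain stand-in dataclasses, not the real sagemaker-core classes. `StreamEvent`, `StreamFailed`, `collect_stream`, and the `payload` field are hypothetical names, and the retry policy is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


# Stand-ins for illustration only; these are NOT the sagemaker-core classes.
@dataclass
class ModelStreamError:
    message: Optional[str] = None
    error_code: Optional[str] = None


@dataclass
class StreamEvent:
    payload: Optional[bytes] = None  # hypothetical: raw bytes of one payload part
    model_stream_error: Optional[ModelStreamError] = None
    internal_stream_failure: Optional[str] = None


class StreamFailed(Exception):
    """Raised when the stream yields an error event."""

    def __init__(self, code: str, retryable: bool, message: str = "") -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.retryable = retryable


def collect_stream(events: Iterable[StreamEvent]) -> bytes:
    """Concatenate payload parts, raising on the first error event."""
    chunks: List[bytes] = []
    for event in events:
        if event.model_stream_error is not None:
            err = event.model_stream_error
            # Assumption: a reset TCP connection (StreamBroken) is worth retrying,
            # while ModelInvocationTimeExceeded is not.
            retryable = err.error_code == "StreamBroken"
            raise StreamFailed(err.error_code or "Unknown", retryable, err.message or "")
        if event.internal_stream_failure is not None:
            # The docstring advises trying the request again.
            raise StreamFailed("InternalStreamFailure", True, event.internal_stream_failure)
        if event.payload is not None:
            chunks.append(event.payload)
    return b"".join(chunks)
```

A caller could catch `StreamFailed` and inspect `retryable` before deciding whether to resend the request.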
@@ -494,6 +494,21 @@ class ActionSummary(Base):
494
494
  last_modified_time: Optional[datetime.datetime] = Unassigned()
495
495
 
496
496
 
497
+ class AddClusterNodeSpecification(Base):
498
+ """
499
+ AddClusterNodeSpecification
500
+ Specifies an instance group and the number of nodes to add to it.
501
+
502
+ Attributes
503
+ ----------------------
504
+ instance_group_name: The name of the instance group to which you want to add nodes.
505
+ increment_target_count_by: The number of nodes to add to the specified instance group. The total number of nodes across all instance groups in a single request cannot exceed 50.
506
+ """
507
+
508
+ instance_group_name: str
509
+ increment_target_count_by: int
510
+
511
+
497
512
  class Tag(Base):
498
513
  """
499
514
  Tag
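The `increment_target_count_by` docstring added in this hunk states that one request may add at most 50 nodes across all instance groups. The sketch below shows a client-side check of that limit before a request is sent. It uses a stand-in dataclass with the same two fields as the new class, `check_node_additions` is a hypothetical helper, and the requirement that each increment be at least 1 is an assumption.

```python
from dataclasses import dataclass
from typing import List

MAX_NODES_PER_REQUEST = 50  # limit quoted in the docstring above


# Stand-in with the same fields as the new class; not the sagemaker-core type.
@dataclass
class AddClusterNodeSpecification:
    instance_group_name: str
    increment_target_count_by: int


def check_node_additions(specs: List[AddClusterNodeSpecification]) -> int:
    """Return the total number of nodes requested, or raise ValueError."""
    total = 0
    for spec in specs:
        # Assumption: an increment below 1 is never meaningful.
        if spec.increment_target_count_by < 1:
            raise ValueError(
                f"{spec.instance_group_name}: increment must be at least 1"
            )
        total += spec.increment_target_count_by
    if total > MAX_NODES_PER_REQUEST:
        raise ValueError(
            f"request adds {total} nodes; the limit is {MAX_NODES_PER_REQUEST}"
        )
    return total
```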
@@ -509,6 +524,19 @@ class Tag(Base):
509
524
  value: str
510
525
 
511
526
 
527
+ class AdditionalEnis(Base):
528
+ """
529
+ AdditionalEnis
530
+ Information about additional Elastic Network Interfaces (ENIs) associated with an instance.
531
+
532
+ Attributes
533
+ ----------------------
534
+ efa_enis: A list of Elastic Fabric Adapter (EFA) ENIs associated with the instance.
535
+ """
536
+
537
+ efa_enis: Optional[List[str]] = Unassigned()
538
+
539
+
512
540
  class ModelAccessConfig(Base):
513
541
  """
514
542
  ModelAccessConfig
@@ -992,6 +1020,36 @@ class InstanceGroup(Base):
992
1020
  instance_group_name: str
993
1021
 
994
1022
 
1023
+ class PlacementSpecification(Base):
1024
+ """
1025
+ PlacementSpecification
1026
+ Specifies how instances should be placed on a specific UltraServer.
1027
+
1028
+ Attributes
1029
+ ----------------------
1030
+ ultra_server_id: The unique identifier of the UltraServer where instances should be placed.
1031
+ instance_count: The number of ML compute instances required to be placed together on the same UltraServer. Minimum value of 1.
1032
+ """
1033
+
1034
+ instance_count: int
1035
+ ultra_server_id: Optional[str] = Unassigned()
1036
+
1037
+
1038
+ class InstancePlacementConfig(Base):
1039
+ """
1040
+ InstancePlacementConfig
1041
+ Configuration for how instances are placed and allocated within UltraServers. This is only applicable for UltraServer capacity.
1042
+
1043
+ Attributes
1044
+ ----------------------
1045
+ enable_multiple_jobs: If set to true, allows multiple jobs to share the same UltraServer instances. If set to false, ensures this job's instances are placed on an UltraServer exclusively, with no other jobs sharing the same UltraServer. Default is false.
1046
+ placement_specifications: A list of specifications for how instances should be placed on specific UltraServers. Maximum of 10 items is supported.
1047
+ """
1048
+
1049
+ enable_multiple_jobs: Optional[bool] = Unassigned()
1050
+ placement_specifications: Optional[List[PlacementSpecification]] = Unassigned()
1051
+
1052
+
995
1053
  class ResourceConfig(Base):
996
1054
  """
997
1055
  ResourceConfig
@@ -1006,6 +1064,7 @@ class ResourceConfig(Base):
1006
1064
  keep_alive_period_in_seconds: The duration of time in seconds to retain configured resources in a warm pool for subsequent training jobs.
1007
1065
  instance_groups: The configuration of a heterogeneous cluster in JSON format.
1008
1066
  training_plan_arn: The Amazon Resource Name (ARN) of the training plan to use for this resource configuration.
1067
+ instance_placement_config: Configuration for how training job instances are placed and allocated within UltraServers. Only applicable for UltraServer capacity.
1009
1068
  """
1010
1069
 
1011
1070
  volume_size_in_gb: int
@@ -1015,6 +1074,7 @@ class ResourceConfig(Base):
1015
1074
  keep_alive_period_in_seconds: Optional[int] = Unassigned()
1016
1075
  instance_groups: Optional[List[InstanceGroup]] = Unassigned()
1017
1076
  training_plan_arn: Optional[str] = Unassigned()
1077
+ instance_placement_config: Optional[InstancePlacementConfig] = Unassigned()
1018
1078
 
1019
1079
 
1020
1080
  class StoppingCondition(Base):
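The new `PlacementSpecification` and `InstancePlacementConfig` docstrings state two limits: `instance_count` has a minimum of 1, and `placement_specifications` holds at most 10 items. The sketch below checks both on stand-in dataclasses. The real classes default unset fields to `Unassigned()`; using `None` here is an assumption made to keep the example self-contained, and `validate_placement` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_PLACEMENT_SPECIFICATIONS = 10  # "Maximum of 10 items is supported."


# Stand-ins mirroring the fields in the diff; not the sagemaker-core types.
@dataclass
class PlacementSpecification:
    instance_count: int
    ultra_server_id: Optional[str] = None


@dataclass
class InstancePlacementConfig:
    enable_multiple_jobs: Optional[bool] = None
    placement_specifications: Optional[List[PlacementSpecification]] = None


def validate_placement(config: InstancePlacementConfig) -> int:
    """Return the total instances requested across all specifications."""
    specs = config.placement_specifications or []
    if len(specs) > MAX_PLACEMENT_SPECIFICATIONS:
        raise ValueError(
            f"{len(specs)} placement specifications; "
            f"at most {MAX_PLACEMENT_SPECIFICATIONS} are supported"
        )
    for index, spec in enumerate(specs):
        if spec.instance_count < 1:
            raise ValueError(f"specification {index}: instance_count must be >= 1")
    return sum(spec.instance_count for spec in specs)
```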
@@ -2400,6 +2460,42 @@ class Autotune(Base):
2400
2460
  mode: str
2401
2461
 
2402
2462
 
2463
+ class BatchAddClusterNodesError(Base):
2464
+ """
2465
+ BatchAddClusterNodesError
2466
+ Information about an error that occurred during the node addition operation.
2467
+
2468
+ Attributes
2469
+ ----------------------
2470
+ instance_group_name: The name of the instance group for which the error occurred.
2471
+ error_code: The error code associated with the failure. Possible values include InstanceGroupNotFound and InvalidInstanceGroupState.
2472
+ failed_count: The number of nodes that failed to be added to the specified instance group.
2473
+ message: A descriptive message providing additional details about the error.
2474
+ """
2475
+
2476
+ instance_group_name: str
2477
+ error_code: str
2478
+ failed_count: int
2479
+ message: Optional[str] = Unassigned()
2480
+
2481
+
2482
+ class NodeAdditionResult(Base):
2483
+ """
2484
+ NodeAdditionResult
2485
+ Information about a node that was successfully added to the cluster.
2486
+
2487
+ Attributes
2488
+ ----------------------
2489
+ node_logical_id: A unique identifier assigned to the node that can be used to track its provisioning status through the DescribeClusterNode operation.
2490
+ instance_group_name: The name of the instance group to which the node was added.
2491
+ status: The current status of the node. Possible values include Pending, Running, Failed, ShuttingDown, SystemUpdating, DeepHealthCheckInProgress, and NotFound.
2492
+ """
2493
+
2494
+ node_logical_id: str
2495
+ instance_group_name: str
2496
+ status: str
2497
+
2498
+
2403
2499
  class BatchDataCaptureConfig(Base):
2404
2500
  """
2405
2501
  BatchDataCaptureConfig
@@ -2417,6 +2513,23 @@ class BatchDataCaptureConfig(Base):
2417
2513
  generate_inference_id: Optional[bool] = Unassigned()
2418
2514
 
2419
2515
 
2516
+ class BatchDeleteClusterNodeLogicalIdsError(Base):
2517
+ """
2518
+ BatchDeleteClusterNodeLogicalIdsError
2519
+ Information about an error that occurred when attempting to delete a node identified by its NodeLogicalId.
2520
+
2521
+ Attributes
2522
+ ----------------------
2523
+ code: The error code associated with the failure. Possible values include NodeLogicalIdNotFound, InvalidNodeStatus, and InternalError.
2524
+ message: A descriptive message providing additional details about the error.
2525
+ node_logical_id: The NodeLogicalId of the node that could not be deleted.
2526
+ """
2527
+
2528
+ code: str
2529
+ message: str
2530
+ node_logical_id: str
2531
+
2532
+
2420
2533
  class BatchDeleteClusterNodesError(Base):
2421
2534
  """
2422
2535
  BatchDeleteClusterNodesError
@@ -2442,10 +2555,14 @@ class BatchDeleteClusterNodesResponse(Base):
2442
2555
  ----------------------
2443
2556
  failed: A list of errors encountered when deleting the specified nodes.
2444
2557
  successful: A list of node IDs that were successfully deleted from the specified cluster.
2558
+ failed_node_logical_ids: A list of NodeLogicalIds that could not be deleted, along with error information explaining why the deletion failed.
2559
+ successful_node_logical_ids: A list of NodeLogicalIds that were successfully deleted from the cluster.
2445
2560
  """
2446
2561
 
2447
2562
  failed: Optional[List[BatchDeleteClusterNodesError]] = Unassigned()
2448
2563
  successful: Optional[List[str]] = Unassigned()
2564
+ failed_node_logical_ids: Optional[List[BatchDeleteClusterNodeLogicalIdsError]] = Unassigned()
2565
+ successful_node_logical_ids: Optional[List[str]] = Unassigned()
2449
2566
 
2450
2567
 
2451
2568
  class BatchDescribeModelPackageError(Base):
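With this change, `BatchDeleteClusterNodesResponse` reports logical-ID deletions in `successful_node_logical_ids` and `failed_node_logical_ids`, alongside the existing `failed` and `successful` fields. The sketch below groups the logical-ID failures by error code, for example to decide which ones to retry. The dataclasses are stand-ins that use `None` where the real classes use `Unassigned()`, and `group_failures_by_code` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


# Stand-ins mirroring the fields in the diff; not the sagemaker-core types.
@dataclass
class BatchDeleteClusterNodeLogicalIdsError:
    code: str
    message: str
    node_logical_id: str


@dataclass
class BatchDeleteClusterNodesResponse:
    successful: Optional[List[str]] = None
    failed_node_logical_ids: Optional[List[BatchDeleteClusterNodeLogicalIdsError]] = None
    successful_node_logical_ids: Optional[List[str]] = None


def group_failures_by_code(
    response: BatchDeleteClusterNodesResponse,
) -> Dict[str, List[str]]:
    """Map each error code to the logical IDs that failed with it."""
    grouped: Dict[str, List[str]] = {}
    for error in response.failed_node_logical_ids or []:
        grouped.setdefault(error.code, []).append(error.node_logical_id)
    return grouped
```

Of the documented codes (`NodeLogicalIdNotFound`, `InvalidNodeStatus`, `InternalError`), only `InternalError` looks like an obvious retry candidate, though that reading is an interpretation and not something the docstring states.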
@@ -2901,6 +3018,21 @@ class CanvasAppSettings(Base):
2901
3018
  emr_serverless_settings: Optional[EmrServerlessSettings] = Unassigned()
2902
3019
 
2903
3020
 
3021
+ class CapacityReservation(Base):
3022
+ """
3023
+ CapacityReservation
3024
+ Information about the Capacity Reservation used by an instance or instance group.
3025
+
3026
+ Attributes
3027
+ ----------------------
3028
+ arn: The Amazon Resource Name (ARN) of the Capacity Reservation.
3029
+ type: The type of Capacity Reservation. Valid values are ODCR (On-Demand Capacity Reservation) or CRG (Capacity Reservation Group).
3030
+ """
3031
+
3032
+ arn: Optional[str] = Unassigned()
3033
+ type: Optional[str] = Unassigned()
3034
+
3035
+
2904
3036
  class CapacitySizeConfig(Base):
2905
3037
  """
2906
3038
  CapacitySizeConfig
@@ -3274,6 +3406,40 @@ class ClarifyExplainerConfig(Base):
3274
3406
  inference_config: Optional[ClarifyInferenceConfig] = Unassigned()
3275
3407
 
3276
3408
 
3409
+ class ClusterAutoScalingConfig(Base):
3410
+ """
3411
+ ClusterAutoScalingConfig
3412
+ Specifies the autoscaling configuration for a HyperPod cluster.
3413
+
3414
+ Attributes
3415
+ ----------------------
3416
+ mode: Describes whether autoscaling is enabled or disabled for the cluster. Valid values are Enable and Disable.
3417
+ auto_scaler_type: The type of autoscaler to use. Currently supported value is Karpenter.
3418
+ """
3419
+
3420
+ mode: str
3421
+ auto_scaler_type: Optional[str] = Unassigned()
3422
+
3423
+
3424
+ class ClusterAutoScalingConfigOutput(Base):
3425
+ """
3426
+ ClusterAutoScalingConfigOutput
3427
+ The autoscaling configuration and status information for a HyperPod cluster.
3428
+
3429
+ Attributes
3430
+ ----------------------
3431
+ mode: Describes whether autoscaling is enabled or disabled for the cluster.
3432
+ auto_scaler_type: The type of autoscaler configured for the cluster.
3433
+ status: The current status of the autoscaling configuration. Valid values are InService, Failed, Creating, and Deleting.
3434
+ failure_message: If the autoscaling status is Failed, this field contains a message describing the failure.
3435
+ """
3436
+
3437
+ mode: str
3438
+ status: str
3439
+ auto_scaler_type: Optional[str] = Unassigned()
3440
+ failure_message: Optional[str] = Unassigned()
3441
+
3442
+
3277
3443
  class ClusterEbsVolumeConfig(Base):
3278
3444
  """
3279
3445
  ClusterEbsVolumeConfig
@@ -3282,9 +3448,181 @@ class ClusterEbsVolumeConfig(Base):
3282
3448
  Attributes
3283
3449
  ----------------------
3284
3450
  volume_size_in_gb: The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
3451
+ volume_kms_key_id: The ID of a KMS key to encrypt the Amazon EBS volume.
3452
+ root_volume: Specifies whether the configuration is for the cluster's root or secondary Amazon EBS volume. You can specify two ClusterEbsVolumeConfig fields to configure both the root and secondary volumes. Set the value to True if you'd like to provide your own customer managed Amazon Web Services KMS key to encrypt the root volume. When True: The configuration is applied to the root volume. You can't specify the VolumeSizeInGB field. The size of the root volume is determined for you. You must specify a KMS key ID for VolumeKmsKeyId to encrypt the root volume with your own KMS key instead of an Amazon Web Services owned KMS key. Otherwise, by default, the value is False, and the following applies: The configuration is applied to the secondary volume, while the root volume is encrypted with an Amazon Web Services owned key. You must specify the VolumeSizeInGB field. You can optionally specify the VolumeKmsKeyId to encrypt the secondary volume with your own KMS key instead of an Amazon Web Services owned KMS key.
3285
3453
  """
3286
3454
 
3287
3455
  volume_size_in_gb: Optional[int] = Unassigned()
3456
+ volume_kms_key_id: Optional[str] = Unassigned()
3457
+ root_volume: Optional[bool] = Unassigned()
3458
+
3459
+
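The `root_volume` docstring above describes two mutually exclusive modes for `ClusterEbsVolumeConfig`. The sketch below encodes those rules as a client-side check on a stand-in dataclass. It uses `None` where the real class uses `Unassigned()`, and `validate_ebs_config` is a hypothetical helper, not part of sagemaker-core.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in mirroring the fields in the diff; not the sagemaker-core type.
@dataclass
class ClusterEbsVolumeConfig:
    volume_size_in_gb: Optional[int] = None
    volume_kms_key_id: Optional[str] = None
    root_volume: Optional[bool] = None


def validate_ebs_config(config: ClusterEbsVolumeConfig) -> str:
    """Return which volume the config targets, or raise ValueError."""
    if config.root_volume:
        # Root volume: the size is determined for you, and a KMS key is required.
        if config.volume_size_in_gb is not None:
            raise ValueError("root volume: volume_size_in_gb must not be set")
        if not config.volume_kms_key_id:
            raise ValueError("root volume: volume_kms_key_id is required")
        return "root"
    # Secondary volume (root_volume is False or unset): the size is required,
    # and the KMS key is optional.
    if config.volume_size_in_gb is None:
        raise ValueError("secondary volume: volume_size_in_gb is required")
    return "secondary"
```

Because the docstring allows two `ClusterEbsVolumeConfig` entries, one per volume, a caller would run this check on each entry separately.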
3460
+ class ClusterMetadata(Base):
3461
+ """
3462
+ ClusterMetadata
3463
+ Metadata information about a HyperPod cluster showing information about the cluster level operations, such as creating, updating, and deleting.
3464
+
3465
+ Attributes
3466
+ ----------------------
3467
+ failure_message: An error message describing why the cluster level operation (such as creating, updating, or deleting) failed.
3468
+ eks_role_access_entries: A list of Amazon EKS IAM role ARNs associated with the cluster. This is created by HyperPod on your behalf and only applies for EKS orchestrated clusters.
3469
+ slr_access_entry: The Service-Linked Role (SLR) associated with the cluster. This is created by HyperPod on your behalf and only applies for EKS orchestrated clusters.
3470
+ """
3471
+
3472
+ failure_message: Optional[str] = Unassigned()
3473
+ eks_role_access_entries: Optional[List[str]] = Unassigned()
3474
+ slr_access_entry: Optional[str] = Unassigned()
3475
+
3476
+
3477
+ class InstanceGroupMetadata(Base):
3478
+ """
3479
+ InstanceGroupMetadata
3480
+ Metadata information about an instance group in a HyperPod cluster.
3481
+
3482
+ Attributes
3483
+ ----------------------
3484
+ failure_message: An error message describing why the instance group level operation (such as creating, scaling, or deleting) failed.
3485
+ availability_zone_id: The ID of the Availability Zone where the instance group is located.
3486
+ capacity_reservation: Information about the Capacity Reservation used by the instance group.
3487
+ subnet_id: The ID of the subnet where the instance group is located.
3488
+ security_group_ids: A list of security group IDs associated with the instance group.
3489
+ ami_override: If you use a custom Amazon Machine Image (AMI) for the instance group, this field shows the ID of the custom AMI.
3490
+ """
3491
+
3492
+ failure_message: Optional[str] = Unassigned()
3493
+ availability_zone_id: Optional[str] = Unassigned()
3494
+ capacity_reservation: Optional[CapacityReservation] = Unassigned()
3495
+ subnet_id: Optional[str] = Unassigned()
3496
+ security_group_ids: Optional[List[str]] = Unassigned()
3497
+ ami_override: Optional[str] = Unassigned()
3498
+
3499
+
3500
+ class InstanceGroupScalingMetadata(Base):
3501
+ """
3502
+ InstanceGroupScalingMetadata
3503
+ Metadata information about scaling operations for an instance group.
3504
+
3505
+ Attributes
3506
+ ----------------------
3507
+ instance_count: The current number of instances in the group.
3508
+ target_count: The desired number of instances for the group after scaling.
3509
+ failure_message: An error message describing why the scaling operation failed, if applicable.
3510
+ """
3511
+
3512
+ instance_count: Optional[int] = Unassigned()
3513
+ target_count: Optional[int] = Unassigned()
3514
+ failure_message: Optional[str] = Unassigned()
3515
+
3516
+
3517
+ class InstanceMetadata(Base):
3518
+ """
3519
+ InstanceMetadata
3520
+ Metadata information about an instance in a HyperPod cluster.
3521
+
3522
+ Attributes
3523
+ ----------------------
3524
+ customer_eni: The ID of the customer-managed Elastic Network Interface (ENI) associated with the instance.
3525
+ additional_enis: Information about additional Elastic Network Interfaces (ENIs) associated with the instance.
3526
+ capacity_reservation: Information about the Capacity Reservation used by the instance.
3527
+ failure_message: An error message describing why the instance creation or update failed, if applicable.
3528
+ lcs_execution_state: The execution state of the Lifecycle Script (LCS) for the instance.
3529
+ node_logical_id: The unique logical identifier of the node within the cluster. The ID used here is the same object as in the BatchAddClusterNodes API.
3530
+ """
3531
+
3532
+ customer_eni: Optional[str] = Unassigned()
3533
+ additional_enis: Optional[AdditionalEnis] = Unassigned()
3534
+ capacity_reservation: Optional[CapacityReservation] = Unassigned()
3535
+ failure_message: Optional[str] = Unassigned()
3536
+ lcs_execution_state: Optional[str] = Unassigned()
3537
+ node_logical_id: Optional[str] = Unassigned()
3538
+
3539
+
3540
+ class EventMetadata(Base):
3541
+ """
3542
+ EventMetadata
3543
+ Metadata associated with a cluster event, which may include details about various resource types.
3544
+
3545
+ Attributes
3546
+ ----------------------
3547
+ cluster: Metadata specific to cluster-level events.
3548
+ instance_group: Metadata specific to instance group-level events.
3549
+ instance_group_scaling: Metadata related to instance group scaling events.
3550
+ instance: Metadata specific to instance-level events.
3551
+ """
3552
+
3553
+ cluster: Optional[ClusterMetadata] = Unassigned()
3554
+ instance_group: Optional[InstanceGroupMetadata] = Unassigned()
3555
+ instance_group_scaling: Optional[InstanceGroupScalingMetadata] = Unassigned()
3556
+ instance: Optional[InstanceMetadata] = Unassigned()
3557
+
3558
+
3559
+ class EventDetails(Base):
3560
+ """
3561
+ EventDetails
3562
+ Detailed information about a specific event, including event metadata.
3563
+
3564
+ Attributes
3565
+ ----------------------
3566
+ event_metadata: Metadata specific to the event, which may include information about the cluster, instance group, or instance involved.
3567
+ """
3568
+
3569
+ event_metadata: Optional[EventMetadata] = Unassigned()
3570
+
3571
+
3572
+ class ClusterEventDetail(Base):
3573
+ """
3574
+ ClusterEventDetail
3575
+ Detailed information about a specific event in a HyperPod cluster.
3576
+
3577
+ Attributes
3578
+ ----------------------
3579
+ event_id: The unique identifier (UUID) of the event.
3580
+ cluster_arn: The Amazon Resource Name (ARN) of the HyperPod cluster associated with the event.
3581
+ cluster_name: The name of the HyperPod cluster associated with the event.
3582
+ instance_group_name: The name of the instance group associated with the event, if applicable.
3583
+ instance_id: The EC2 instance ID associated with the event, if applicable.
3584
+ resource_type: The type of resource associated with the event. Valid values are Cluster, InstanceGroup, or Instance.
3585
+ event_time: The timestamp when the event occurred.
3586
+ event_details: Additional details about the event, including event-specific metadata.
3587
+ description: A human-readable description of the event.
3588
+ """
3589
+
3590
+ event_id: str
3591
+ cluster_arn: str
3592
+ cluster_name: Union[str, object]
3593
+ resource_type: str
3594
+ event_time: datetime.datetime
3595
+ instance_group_name: Optional[str] = Unassigned()
3596
+ instance_id: Optional[str] = Unassigned()
3597
+ event_details: Optional[EventDetails] = Unassigned()
3598
+ description: Optional[str] = Unassigned()
3599
+
3600
+
3601
+ class ClusterEventSummary(Base):
3602
+ """
3603
+ ClusterEventSummary
3604
+ A summary of an event in a HyperPod cluster.
3605
+
3606
+ Attributes
3607
+ ----------------------
3608
+ event_id: The unique identifier (UUID) of the event.
3609
+ cluster_arn: The Amazon Resource Name (ARN) of the HyperPod cluster associated with the event.
3610
+ cluster_name: The name of the HyperPod cluster associated with the event.
3611
+ instance_group_name: The name of the instance group associated with the event, if applicable.
3612
+ instance_id: The Amazon Elastic Compute Cloud (EC2) instance ID associated with the event, if applicable.
3613
+ resource_type: The type of resource associated with the event. Valid values are Cluster, InstanceGroup, or Instance.
3614
+ event_time: The timestamp when the event occurred.
3615
+ description: A brief, human-readable description of the event.
3616
+ """
3617
+
3618
+ event_id: str
3619
+ cluster_arn: str
3620
+ cluster_name: Union[str, object]
3621
+ resource_type: str
3622
+ event_time: datetime.datetime
3623
+ instance_group_name: Optional[str] = Unassigned()
3624
+ instance_id: Optional[str] = Unassigned()
3625
+ description: Optional[str] = Unassigned()
3288
3626
 
3289
3627
 
3290
3628
  class ClusterLifeCycleConfig(Base):
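The event classes added in this hunk nest as `EventDetails` → `EventMetadata` → one metadata object per resource level, and each of those carries an optional `failure_message`. The sketch below walks that structure to find the first failure message. The stand-in classes keep only the `failure_message` field and use `None` where the real classes use `Unassigned()`. `LevelMetadata` and `first_failure_message` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


# Simplified stand-in for ClusterMetadata, InstanceGroupMetadata,
# InstanceGroupScalingMetadata and InstanceMetadata; only the shared
# failure_message field is kept.
@dataclass
class LevelMetadata:
    failure_message: Optional[str] = None


@dataclass
class EventMetadata:
    cluster: Optional[LevelMetadata] = None
    instance_group: Optional[LevelMetadata] = None
    instance_group_scaling: Optional[LevelMetadata] = None
    instance: Optional[LevelMetadata] = None


@dataclass
class EventDetails:
    event_metadata: Optional[EventMetadata] = None


LEVELS = ("cluster", "instance_group", "instance_group_scaling", "instance")


def first_failure_message(details: Optional[EventDetails]) -> Optional[Tuple[str, str]]:
    """Return (level, message) for the first populated failure, else None."""
    if details is None or details.event_metadata is None:
        return None
    for level in LEVELS:
        metadata = getattr(details.event_metadata, level)
        if metadata is not None and metadata.failure_message:
            return level, metadata.failure_message
    return None
```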
@@ -3383,6 +3721,8 @@ class ClusterInstanceGroupDetails(Base):
3383
3721
  training_plan_status: The current status of the training plan associated with this cluster instance group.
3384
3722
  override_vpc_config: The customized Amazon VPC configuration at the instance group level that overrides the default Amazon VPC configuration of the SageMaker HyperPod cluster.
3385
3723
  scheduled_update_config: The configuration object of the schedule that SageMaker follows when updating the AMI.
3724
+ current_image_id: The ID of the Amazon Machine Image (AMI) currently in use by the instance group.
3725
+ desired_image_id: The ID of the Amazon Machine Image (AMI) desired for the instance group.
3386
3726
  """
3387
3727
 
3388
3728
  current_count: Optional[int] = Unassigned()
@@ -3399,6 +3739,8 @@ class ClusterInstanceGroupDetails(Base):
3399
3739
  training_plan_status: Optional[str] = Unassigned()
3400
3740
  override_vpc_config: Optional[VpcConfig] = Unassigned()
3401
3741
  scheduled_update_config: Optional[ScheduledUpdateConfig] = Unassigned()
3742
+ current_image_id: Optional[str] = Unassigned()
3743
+ desired_image_id: Optional[str] = Unassigned()
3402
3744
 
3403
3745
 
3404
3746
  class ClusterInstanceGroupSpecification(Base):
@@ -3419,6 +3761,7 @@ class ClusterInstanceGroupSpecification(Base):
3419
3761
  training_plan_arn: The Amazon Resource Name (ARN) of the training plan to use for this cluster instance group. For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
3420
3762
  override_vpc_config: To configure multi-AZ deployments, customize the Amazon VPC configuration at the instance group level. You can specify different subnets and security groups across different AZs in the instance group specification to override a SageMaker HyperPod cluster's default Amazon VPC configuration. For more information about deploying a cluster in multiple AZs, see Setting up SageMaker HyperPod clusters across multiple AZs. When your Amazon VPC and subnets support IPv6, network communications differ based on the cluster orchestration platform: Slurm-orchestrated clusters automatically configure nodes with dual IPv6 and IPv4 addresses, allowing immediate IPv6 network communications. In Amazon EKS-orchestrated clusters, nodes receive dual-stack addressing, but pods can only use IPv6 when the Amazon EKS cluster is explicitly IPv6-enabled. For information about deploying an IPv6 Amazon EKS cluster, see Amazon EKS IPv6 Cluster Deployment. Additional resources for IPv6 configuration: For information about adding IPv6 support to your VPC, see IPv6 Support for VPC. For information about creating a new IPv6-compatible VPC, see Amazon VPC Creation Guide. To configure SageMaker HyperPod with a custom Amazon VPC, see Custom Amazon VPC Setup for SageMaker HyperPod.
3421
3763
  scheduled_update_config: The configuration object of the schedule that SageMaker uses to update the AMI.
3764
+ image_id: When configuring your HyperPod cluster, you can specify an image ID using one of the following options: HyperPodPublicAmiId: Use a HyperPod public AMI CustomAmiId: Use your custom AMI default: Use the default latest system image If you choose to use a custom AMI (CustomAmiId), ensure it meets the following requirements: Encryption: The custom AMI must be unencrypted. Ownership: The custom AMI must be owned by the same Amazon Web Services account that is creating the HyperPod cluster. Volume support: Only the primary AMI snapshot volume is supported; additional AMI volumes are not supported. When updating the instance group's AMI through the UpdateClusterSoftware operation, if an instance group uses a custom AMI, you must provide an ImageId or use the default as input. Note that if you don't specify an instance group in your UpdateClusterSoftware request, then all of the instance groups are patched with the specified image.
3422
3765
  """
3423
3766
 
3424
3767
  instance_count: int
@@ -3432,6 +3775,7 @@ class ClusterInstanceGroupSpecification(Base):
3432
3775
  training_plan_arn: Optional[str] = Unassigned()
3433
3776
  override_vpc_config: Optional[VpcConfig] = Unassigned()
3434
3777
  scheduled_update_config: Optional[ScheduledUpdateConfig] = Unassigned()
3778
+ image_id: Optional[str] = Unassigned()
3435
3779
 
3436
3780
 
3437
3781
  class ClusterInstancePlacement(Base):
@@ -3464,6 +3808,19 @@ class ClusterInstanceStatusDetails(Base):
3464
3808
  message: Optional[str] = Unassigned()
3465
3809
 
3466
3810
 
3811
+ class UltraServerInfo(Base):
3812
+ """
3813
+ UltraServerInfo
3814
+ Contains information about the UltraServer object.
3815
+
3816
+ Attributes
3817
+ ----------------------
3818
+ id: The unique identifier of the UltraServer.
3819
+ """
3820
+
3821
+ id: Optional[str] = Unassigned()
3822
+
3823
+
3467
3824
  class ClusterNodeDetails(Base):
3468
3825
  """
3469
3826
  ClusterNodeDetails
@@ -3473,6 +3830,7 @@ class ClusterNodeDetails(Base):
3473
3830
  ----------------------
3474
3831
  instance_group_name: The instance group name in which the instance is.
3475
3832
  instance_id: The ID of the instance.
3833
+ node_logical_id: A unique identifier for the node that persists throughout its lifecycle, from provisioning request to termination. This identifier can be used to track the node even before it has an assigned InstanceId.
3476
3834
  instance_status: The status of the instance.
3477
3835
  instance_type: The type of the instance.
3478
3836
  launch_time: The time when the instance is launched.
@@ -3485,10 +3843,14 @@ class ClusterNodeDetails(Base):
3485
3843
  private_primary_ipv6: The private primary IPv6 address of the SageMaker HyperPod cluster node when configured with an Amazon VPC that supports IPv6 and includes subnets with IPv6 addressing enabled in either the cluster Amazon VPC configuration or the instance group Amazon VPC configuration.
3486
3844
  private_dns_hostname: The private DNS hostname of the SageMaker HyperPod cluster node.
3487
3845
  placement: The placement details of the SageMaker HyperPod cluster node.
3846
+ current_image_id: The ID of the Amazon Machine Image (AMI) currently in use by the node.
3847
+ desired_image_id: The ID of the Amazon Machine Image (AMI) desired for the node.
3848
+ ultra_server_info: Contains information about the UltraServer.
3488
3849
  """
3489
3850
 
3490
3851
  instance_group_name: Optional[str] = Unassigned()
3491
3852
  instance_id: Optional[str] = Unassigned()
3853
+ node_logical_id: Optional[str] = Unassigned()
3492
3854
  instance_status: Optional[ClusterInstanceStatusDetails] = Unassigned()
3493
3855
  instance_type: Optional[str] = Unassigned()
3494
3856
  launch_time: Optional[datetime.datetime] = Unassigned()
@@ -3501,6 +3863,9 @@ class ClusterNodeDetails(Base):
3501
3863
  private_primary_ipv6: Optional[str] = Unassigned()
3502
3864
  private_dns_hostname: Optional[str] = Unassigned()
3503
3865
  placement: Optional[ClusterInstancePlacement] = Unassigned()
3866
+ current_image_id: Optional[str] = Unassigned()
3867
+ desired_image_id: Optional[str] = Unassigned()
3868
+ ultra_server_info: Optional[UltraServerInfo] = Unassigned()
3504
3869
 
3505
3870
 
3506
3871
  class ClusterNodeSummary(Base):
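
Reviewer note on the hunk above: the new current_image_id and desired_image_id fields on ClusterNodeDetails suggest a simple way to spot nodes whose AMI has not yet caught up with the requested one. The sketch below is a minimal illustration under stated assumptions, not sagemaker-core code: NodeSketch is a hypothetical stand-in dataclass rather than the real ClusterNodeDetails, the AMI and node IDs are made up, and reading a mismatch as "update pending" is an interpretation that the docstrings do not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeSketch:
    """Hypothetical stand-in for the image-related fields of ClusterNodeDetails."""

    instance_id: Optional[str] = None
    node_logical_id: Optional[str] = None
    current_image_id: Optional[str] = None
    desired_image_id: Optional[str] = None


def nodes_pending_image_change(nodes: List[NodeSketch]) -> List[NodeSketch]:
    """Return nodes whose current AMI differs from the desired AMI.

    Assumption: a mismatch means an image update is still pending.
    Nodes that report only one (or neither) of the two IDs are skipped.
    """
    return [
        node
        for node in nodes
        if node.current_image_id
        and node.desired_image_id
        and node.current_image_id != node.desired_image_id
    ]


# Made-up sample data; node-3 has no InstanceId yet and is tracked by its logical ID.
nodes = [
    NodeSketch("i-aaa", "node-1", "ami-old", "ami-new"),
    NodeSketch("i-bbb", "node-2", "ami-new", "ami-new"),
    NodeSketch(None, "node-3", None, "ami-new"),
]
pending = nodes_pending_image_change(nodes)
print([node.node_logical_id for node in pending])
```

Keying the report on node_logical_id rather than instance_id follows the docstring's point that the logical ID exists before an InstanceId is assigned.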
@@ -3512,10 +3877,12 @@ class ClusterNodeSummary(Base):
3512
3877
  ----------------------
3513
3878
  instance_group_name: The name of the instance group in which the instance is.
3514
3879
  instance_id: The ID of the instance.
3880
+ node_logical_id: A unique identifier for the node that persists throughout its lifecycle, from provisioning request to termination. This identifier can be used to track the node even before it has an assigned InstanceId. This field is only included when IncludeNodeLogicalIds is set to True in the ListClusterNodes request.
3515
3881
  instance_type: The type of the instance.
3516
3882
  launch_time: The time when the instance is launched.
3517
3883
  last_software_update_time: The time when SageMaker last updated the software of the instances in the cluster.
3518
3884
  instance_status: The status of the instance.
3885
+ ultra_server_info: Contains information about the UltraServer.
3519
3886
  """
3520
3887
 
3521
3888
  instance_group_name: str
@@ -3523,7 +3890,9 @@ class ClusterNodeSummary(Base):
3523
3890
  instance_type: str
3524
3891
  launch_time: datetime.datetime
3525
3892
  instance_status: ClusterInstanceStatusDetails
3893
+ node_logical_id: Optional[str] = Unassigned()
3526
3894
  last_software_update_time: Optional[datetime.datetime] = Unassigned()
3895
+ ultra_server_info: Optional[UltraServerInfo] = Unassigned()
3527
3896
 
3528
3897
 
3529
3898
  class ClusterOrchestratorEksConfig(Base):
@@ -3715,6 +4084,21 @@ class ClusterSummary(Base):
3715
4084
  training_plan_arns: Optional[List[str]] = Unassigned()
3716
4085
 
3717
4086
 
4087
+ class ClusterTieredStorageConfig(Base):
4088
+ """
4089
+ ClusterTieredStorageConfig
4090
+ Defines the configuration for managed tier checkpointing in a HyperPod cluster. Managed tier checkpointing uses multiple storage tiers, including cluster CPU memory, to provide faster checkpoint operations and improved fault tolerance for large-scale model training. The system automatically saves checkpoints at high frequency to memory and periodically persists them to durable storage, like Amazon S3.
4091
+
4092
+ Attributes
4093
+ ----------------------
4094
+ mode: Specifies whether managed tier checkpointing is enabled or disabled for the HyperPod cluster. When set to Enable, the system installs a memory management daemon that provides disaggregated memory as a service for checkpoint storage. When set to Disable, the feature is turned off and the memory management daemon is removed from the cluster.
4095
+ instance_memory_allocation_percentage: The percentage (int) of cluster memory to allocate for checkpointing.
4096
+ """
4097
+
4098
+ mode: str
4099
+ instance_memory_allocation_percentage: Optional[int] = Unassigned()
4100
+
4101
+
3718
4102
  class CustomImage(Base):
3719
4103
  """
3720
4104
  CustomImage
@@ -3919,10 +4303,16 @@ class ComputeQuotaResourceConfig(Base):
3919
4303
  ----------------------
3920
4304
  instance_type: The instance type of the instance group for the cluster.
3921
4305
  count: The number of instances to add to the instance group of a SageMaker HyperPod cluster.
4306
+ accelerators: The number of accelerators to allocate. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the number of accelerators you provide. For example, if you allocate 16 out of 32 total accelerators, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU and MemoryInGiB.
4307
+ v_cpu: The number of vCPU to allocate. If you specify a value only for vCPU, SageMaker AI automatically allocates ratio-based values for MemoryInGiB based on this vCPU parameter. For example, if you allocate 20 out of 40 total vCPU, SageMaker AI uses the ratio of 0.5 and allocates values to MemoryInGiB. Accelerators are set to 0.
4308
+ memory_in_gi_b: The amount of memory in GiB to allocate. If you specify a value only for this parameter, SageMaker AI automatically allocates a ratio-based value for vCPU based on this memory that you provide. For example, if you allocate 200 out of 400 total memory in GiB, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU. Accelerators are set to 0.
3922
4309
  """
3923
4310
 
3924
4311
  instance_type: str
3925
4312
  count: Optional[int] = Unassigned()
4313
+ accelerators: Optional[int] = Unassigned()
4314
+ v_cpu: Optional[float] = Unassigned()
4315
+ memory_in_gi_b: Optional[float] = Unassigned()
3926
4316
 
3927
4317
 
3928
4318
  class ResourceSharingConfig(Base):
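
Reviewer note on the hunk above: the ratio-based allocation described for accelerators, v_cpu, and memory_in_gi_b can be worked through as plain arithmetic. This is a sketch only: ratio_allocate is a hypothetical helper, the instance totals (32 accelerators, 192 vCPU, 2048 GiB) are invented for illustration, and any rounding that SageMaker AI applies is not described in the docstrings and is therefore not modeled.

```python
from typing import Dict, Tuple


def ratio_allocate(
    requested: float, total: float, other_totals: Dict[str, float]
) -> Tuple[float, Dict[str, float]]:
    """Scale the remaining resources by the ratio requested / total.

    Mirrors the docstring example: 16 of 32 accelerators gives a ratio of 0.5,
    which is then applied to the totals of the unspecified resources.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    ratio = requested / total
    derived = {name: value * ratio for name, value in other_totals.items()}
    return ratio, derived


# Hypothetical instance totals; only the 16-of-32 ratio comes from the docstring.
ratio, derived = ratio_allocate(
    requested=16,
    total=32,
    other_totals={"v_cpu": 192.0, "memory_in_gi_b": 2048.0},
)
print(ratio, derived)
```

The same helper covers the other two cases in the docstrings (specifying only v_cpu, or only memory_in_gi_b), where accelerators are set to 0 instead of being scaled.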
@@ -4895,8 +5285,8 @@ class S3FileSystemConfig(Base):
4895
5285
  s3_uri: The Amazon S3 URI of the S3 file system configuration.
4896
5286
  """
4897
5287
 
5288
+ s3_uri: str
4898
5289
  mount_path: Optional[str] = Unassigned()
4899
- s3_uri: Optional[str] = Unassigned()
4900
5290
 
4901
5291
 
4902
5292
  class CustomFileSystemConfig(Base):
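
Reviewer note on the hunk above: s3_uri changes from Optional[str] with an Unassigned() default to a required str, so callers that previously omitted it should expect construction to fail after upgrading. The same change is applied to S3FileSystem later in this diff. The sketch below uses a stand-in dataclass with the new field layout, not the real S3FileSystemConfig; the real class derives from Base and may raise a different exception type than the TypeError shown here. The bucket name and mount path are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class S3FileSystemConfigSketch:
    """Stand-in mirroring the 1.0.62 field layout: s3_uri required, mount_path optional."""

    s3_uri: str
    mount_path: Optional[str] = None


def try_build(**kwargs) -> Tuple[Optional[S3FileSystemConfigSketch], Optional[str]]:
    """Attempt construction and return (config, error message)."""
    try:
        return S3FileSystemConfigSketch(**kwargs), None
    except TypeError as exc:  # the stand-in dataclass raises TypeError for a missing field
        return None, str(exc)


# Supplying s3_uri works as before.
ok, err = try_build(s3_uri="s3://example-bucket/prefix/")

# Omitting s3_uri was accepted by the old signature but is rejected by the new one.
missing, missing_err = try_build(mount_path="/mnt/data")
print(ok, err, missing, missing_err)
```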
@@ -5016,6 +5406,19 @@ class RStudioServerProDomainSettings(Base):
5016
5406
  default_resource_spec: Optional[ResourceSpec] = Unassigned()
5017
5407
 
5018
5408
 
5409
+ class TrustedIdentityPropagationSettings(Base):
5410
+ """
5411
+ TrustedIdentityPropagationSettings
5412
+ The Trusted Identity Propagation (TIP) settings for the SageMaker domain. These settings determine how user identities from IAM Identity Center are propagated through the domain to TIP enabled Amazon Web Services services.
5413
+
5414
+ Attributes
5415
+ ----------------------
5416
+ status: The status of Trusted Identity Propagation (TIP) at the SageMaker domain level. When disabled, standard IAM role-based access is used. When enabled: User identities from IAM Identity Center are propagated through the application to TIP enabled Amazon Web Services services. New applications, or existing applications that are automatically patched, will use the domain level configuration.
5417
+ """
5418
+
5419
+ status: str
5420
+
5421
+
5019
5422
  class DockerSettings(Base):
5020
5423
  """
5021
5424
  DockerSettings
@@ -5025,10 +5428,12 @@ class DockerSettings(Base):
5025
5428
  ----------------------
5026
5429
  enable_docker_access: Indicates whether the domain can access Docker.
5027
5430
  vpc_only_trusted_accounts: The list of Amazon Web Services accounts that are trusted when the domain is created in VPC-only mode.
5431
+ rootless_docker: Indicates whether to use rootless Docker.
5028
5432
  """
5029
5433
 
5030
5434
  enable_docker_access: Optional[str] = Unassigned()
5031
5435
  vpc_only_trusted_accounts: Optional[List[str]] = Unassigned()
5436
+ rootless_docker: Optional[str] = Unassigned()
5032
5437
 
5033
5438
 
5034
5439
  class UnifiedStudioSettings(Base):
@@ -5045,7 +5450,7 @@ class UnifiedStudioSettings(Base):
5045
5450
  project_id: The ID of the Amazon SageMaker Unified Studio project that corresponds to the domain.
5046
5451
  environment_id: The ID of the environment that Amazon SageMaker Unified Studio associates with the domain.
5047
5452
  project_s3_path: The location where Amazon S3 stores temporary execution data and other artifacts for the project that corresponds to the domain.
5048
- single_sign_on_application_arn: The ARN of the application managed by SageMaker AI and SageMaker Unified Studio in the Amazon Web Services IAM Identity Center.
5453
+ single_sign_on_application_arn: The ARN of the Amazon DataZone application managed by Amazon SageMaker Unified Studio in the Amazon Web Services IAM Identity Center.
5049
5454
  """
5050
5455
 
5051
5456
  studio_web_portal_access: Optional[str] = Unassigned()
@@ -5068,17 +5473,23 @@ class DomainSettings(Base):
5068
5473
  security_group_ids: The security groups for the Amazon Virtual Private Cloud that the Domain uses for communication between Domain-level apps and user apps.
5069
5474
  r_studio_server_pro_domain_settings: A collection of settings that configure the RStudioServerPro Domain-level app.
5070
5475
  execution_role_identity_config: The configuration for attaching a SageMaker AI user profile name to the execution role as a sts:SourceIdentity key.
5476
+ trusted_identity_propagation_settings: The Trusted Identity Propagation (TIP) settings for the SageMaker domain. These settings determine how user identities from IAM Identity Center are propagated through the domain to TIP enabled Amazon Web Services services.
5071
5477
  docker_settings: A collection of settings that configure the domain's Docker interaction.
5072
5478
  amazon_q_settings: A collection of settings that configure the Amazon Q experience within the domain. The AuthMode that you use to create the domain must be SSO.
5073
5479
  unified_studio_settings: The settings that apply to a SageMaker AI domain when you use it in Amazon SageMaker Unified Studio.
5480
+ ip_address_type: The IP address type for the domain. Specify ipv4 for IPv4-only connectivity or dualstack for both IPv4 and IPv6 connectivity. When you specify dualstack, the subnet must support IPv6 CIDR blocks. If not specified, defaults to ipv4.
5074
5481
  """
5075
5482
 
5076
5483
  security_group_ids: Optional[List[str]] = Unassigned()
5077
5484
  r_studio_server_pro_domain_settings: Optional[RStudioServerProDomainSettings] = Unassigned()
5078
5485
  execution_role_identity_config: Optional[str] = Unassigned()
5486
+ trusted_identity_propagation_settings: Optional[TrustedIdentityPropagationSettings] = (
5487
+ Unassigned()
5488
+ )
5079
5489
  docker_settings: Optional[DockerSettings] = Unassigned()
5080
5490
  amazon_q_settings: Optional[AmazonQSettings] = Unassigned()
5081
5491
  unified_studio_settings: Optional[UnifiedStudioSettings] = Unassigned()
5492
+ ip_address_type: Optional[str] = Unassigned()
5082
5493
 
5083
5494
 
5084
5495
  class DefaultSpaceSettings(Base):
@@ -5966,6 +6377,19 @@ class InferenceComponentComputeResourceRequirements(Base):
5966
6377
  max_memory_required_in_mb: Optional[int] = Unassigned()
5967
6378
 
5968
6379
 
6380
+ class InferenceComponentDataCacheConfig(Base):
6381
+ """
6382
+ InferenceComponentDataCacheConfig
6383
+ Settings that affect how the inference component caches data.
6384
+
6385
+ Attributes
6386
+ ----------------------
6387
+ enable_caching: Sets whether the endpoint that hosts the inference component caches the model artifacts and container image. With caching enabled, the endpoint caches this data in each instance that it provisions for the inference component. That way, the inference component deploys faster during the auto scaling process. If caching isn't enabled, the inference component takes longer to deploy because of the time it spends downloading the data.
6388
+ """
6389
+
6390
+ enable_caching: bool
6391
+
6392
+
5969
6393
  class InferenceComponentSpecification(Base):
5970
6394
  """
5971
6395
  InferenceComponentSpecification
@@ -5978,6 +6402,7 @@ class InferenceComponentSpecification(Base):
5978
6402
  startup_parameters: Settings that take effect while the model container starts up.
5979
6403
  compute_resource_requirements: The compute resources allocated to run the model, plus any adapter models, that you assign to the inference component. Omit this parameter if your request is meant to create an adapter inference component. An adapter inference component is loaded by a base inference component, and it uses the compute resources of the base inference component.
5980
6404
  base_inference_component_name: The name of an existing inference component that is to contain the inference component that you're creating with your request. Specify this parameter only if your request is meant to create an adapter inference component. An adapter inference component contains the path to an adapter model. The purpose of the adapter model is to tailor the inference output of a base foundation model, which is hosted by the base inference component. The adapter inference component uses the compute resources that you assigned to the base inference component. When you create an adapter inference component, use the Container parameter to specify the location of the adapter artifacts. In the parameter value, use the ArtifactUrl parameter of the InferenceComponentContainerSpecification data type. Before you can create an adapter inference component, you must have an existing inference component that contains the foundation model that you want to adapt.
6405
+ data_cache_config: Settings that affect how the inference component caches data.
5981
6406
  """
5982
6407
 
5983
6408
  model_name: Optional[Union[str, object]] = Unassigned()
@@ -5987,6 +6412,7 @@ class InferenceComponentSpecification(Base):
5987
6412
  Unassigned()
5988
6413
  )
5989
6414
  base_inference_component_name: Optional[str] = Unassigned()
6415
+ data_cache_config: Optional[InferenceComponentDataCacheConfig] = Unassigned()
5990
6416
 
5991
6417
 
5992
6418
  class InferenceComponentRuntimeConfig(Base):
@@ -7392,7 +7818,7 @@ class ProcessingS3Input(Base):
7392
7818
  local_path: The local path in your container where you want Amazon SageMaker to write input data to. LocalPath is an absolute path to the input data and must begin with /opt/ml/processing/. LocalPath is a required parameter when AppManaged is False (default).
7393
7819
  s3_data_type: Whether you use an S3Prefix or a ManifestFile for the data type. If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for the processing job. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for the processing job.
7394
7820
  s3_input_mode: Whether to use File or Pipe input mode. In File mode, Amazon SageMaker copies the data from the input source onto the local ML storage volume before starting your processing container. This is the most commonly used input mode. In Pipe mode, Amazon SageMaker streams input data from the source directly to your processing container into named pipes without using the ML storage volume.
7395
- s3_data_distribution_type: Whether to distribute the data from Amazon S3 to all processing instances with FullyReplicated, or whether the data from Amazon S3 is shared by Amazon S3 key, downloading one shard of data to each processing instance.
7821
+ s3_data_distribution_type: Whether to distribute the data from Amazon S3 to all processing instances with FullyReplicated, or whether the data from Amazon S3 is sharded by Amazon S3 key, downloading one shard of data to each processing instance.
7396
7822
  s3_compression_type: Whether to GZIP-decompress the data in Amazon S3 as it is streamed into the processing container. Gzip can only be used when Pipe mode is specified as the S3InputMode. In Pipe mode, Amazon SageMaker streams input data from the source directly to your container without using the EBS volume.
7397
7823
  """
7398
7824
 
@@ -7768,7 +8194,7 @@ class S3FileSystem(Base):
7768
8194
  s3_uri: The Amazon S3 URI that specifies the location in S3 where files are stored, which is mounted within the Studio environment. For example: s3://<bucket-name>/<prefix>/.
7769
8195
  """
7770
8196
 
7771
- s3_uri: Optional[str] = Unassigned()
8197
+ s3_uri: str
7772
8198
 
7773
8199
 
7774
8200
  class CustomFileSystem(Base):
@@ -8873,6 +9299,19 @@ class InferenceComponentContainerSpecificationSummary(Base):
8873
9299
  environment: Optional[Dict[str, str]] = Unassigned()
8874
9300
 
8875
9301
 
9302
+ class InferenceComponentDataCacheConfigSummary(Base):
9303
+ """
9304
+ InferenceComponentDataCacheConfigSummary
9305
+ Settings that affect how the inference component caches data.
9306
+
9307
+ Attributes
9308
+ ----------------------
9309
+ enable_caching: Indicates whether the inference component caches model artifacts as part of the auto scaling process.
9310
+ """
9311
+
9312
+ enable_caching: bool
9313
+
9314
+
8876
9315
  class InferenceComponentSpecificationSummary(Base):
8877
9316
  """
8878
9317
  InferenceComponentSpecificationSummary
@@ -8885,6 +9324,7 @@ class InferenceComponentSpecificationSummary(Base):
8885
9324
  startup_parameters: Settings that take effect while the model container starts up.
8886
9325
  compute_resource_requirements: The compute resources allocated to run the model, plus any adapter models, that you assign to the inference component.
8887
9326
  base_inference_component_name: The name of the base inference component that contains this inference component.
9327
+ data_cache_config: Settings that affect how the inference component caches data.
8888
9328
  """
8889
9329
 
8890
9330
  model_name: Optional[Union[str, object]] = Unassigned()
@@ -8894,6 +9334,7 @@ class InferenceComponentSpecificationSummary(Base):
8894
9334
  Unassigned()
8895
9335
  )
8896
9336
  base_inference_component_name: Optional[str] = Unassigned()
9337
+ data_cache_config: Optional[InferenceComponentDataCacheConfigSummary] = Unassigned()
8897
9338
 
8898
9339
 
8899
9340
  class InferenceComponentRuntimeConfigSummary(Base):
@@ -9356,6 +9797,27 @@ class TemplateProviderDetail(Base):
9356
9797
  cfn_template_provider_detail: Optional[CfnTemplateProviderDetail] = Unassigned()
9357
9798
 
9358
9799
 
9800
+ class UltraServerSummary(Base):
9801
+ """
9802
+ UltraServerSummary
9803
+ A summary of UltraServer resources and their current status.
9804
+
9805
+ Attributes
9806
+ ----------------------
9807
+ ultra_server_type: The type of UltraServer, such as ml.u-p6e-gb200x72.
9808
+ instance_type: The Amazon EC2 instance type used in the UltraServer.
9809
+ ultra_server_count: The number of UltraServers of this type.
9810
+ available_spare_instance_count: The number of available spare instances in the UltraServers.
9811
+ unhealthy_instance_count: The total number of instances across all UltraServers of this type that are currently in an unhealthy state.
9812
+ """
9813
+
9814
+ ultra_server_type: str
9815
+ instance_type: str
9816
+ ultra_server_count: Optional[int] = Unassigned()
9817
+ available_spare_instance_count: Optional[int] = Unassigned()
9818
+ unhealthy_instance_count: Optional[int] = Unassigned()
9819
+
9820
+
9359
9821
  class SubscribedWorkteam(Base):
9360
9822
  """
9361
9823
  SubscribedWorkteam
@@ -9459,6 +9921,9 @@ class ReservedCapacitySummary(Base):
9459
9921
  Attributes
9460
9922
  ----------------------
9461
9923
  reserved_capacity_arn: The Amazon Resource Name (ARN) of the reserved capacity.
9924
+ reserved_capacity_type: The type of reserved capacity.
9925
+ ultra_server_type: The type of UltraServer included in this reserved capacity, such as ml.u-p6e-gb200x72.
9926
+ ultra_server_count: The number of UltraServers included in this reserved capacity.
9462
9927
  instance_type: The instance type for the reserved capacity.
9463
9928
  total_instance_count: The total number of instances in the reserved capacity.
9464
9929
  status: The current status of the reserved capacity.
@@ -9473,6 +9938,9 @@ class ReservedCapacitySummary(Base):
9473
9938
  instance_type: str
9474
9939
  total_instance_count: int
9475
9940
  status: str
9941
+ reserved_capacity_type: Optional[str] = Unassigned()
9942
+ ultra_server_type: Optional[str] = Unassigned()
9943
+ ultra_server_count: Optional[int] = Unassigned()
9476
9944
  availability_zone: Optional[str] = Unassigned()
9477
9945
  duration_hours: Optional[int] = Unassigned()
9478
9946
  duration_minutes: Optional[int] = Unassigned()
@@ -9871,9 +10339,11 @@ class DomainSettingsForUpdate(Base):
9871
10339
  r_studio_server_pro_domain_settings_for_update: A collection of RStudioServerPro Domain-level app settings to update. A single RStudioServerPro application is created for a domain.
9872
10340
  execution_role_identity_config: The configuration for attaching a SageMaker AI user profile name to the execution role as a sts:SourceIdentity key. This configuration can only be modified if there are no apps in the InService or Pending state.
9873
10341
  security_group_ids: The security groups for the Amazon Virtual Private Cloud that the Domain uses for communication between Domain-level apps and user apps.
10342
+ trusted_identity_propagation_settings: The Trusted Identity Propagation (TIP) settings for the SageMaker domain. These settings determine how user identities from IAM Identity Center are propagated through the domain to TIP enabled Amazon Web Services services.
9874
10343
  docker_settings: A collection of settings that configure the domain's Docker interaction.
9875
10344
  amazon_q_settings: A collection of settings that configure the Amazon Q experience within the domain.
9876
10345
  unified_studio_settings: The settings that apply to a SageMaker AI domain when you use it in Amazon SageMaker Unified Studio.
10346
+ ip_address_type: The IP address type for the domain. Specify ipv4 for IPv4-only connectivity or dualstack for both IPv4 and IPv6 connectivity. When you specify dualstack, the subnet must support IPv6 CIDR blocks.
9877
10347
  """
9878
10348
 
9879
10349
  r_studio_server_pro_domain_settings_for_update: Optional[
@@ -9881,9 +10351,13 @@ class DomainSettingsForUpdate(Base):
9881
10351
  ] = Unassigned()
9882
10352
  execution_role_identity_config: Optional[str] = Unassigned()
9883
10353
  security_group_ids: Optional[List[str]] = Unassigned()
10354
+ trusted_identity_propagation_settings: Optional[TrustedIdentityPropagationSettings] = (
10355
+ Unassigned()
10356
+ )
9884
10357
  docker_settings: Optional[DockerSettings] = Unassigned()
9885
10358
  amazon_q_settings: Optional[AmazonQSettings] = Unassigned()
9886
10359
  unified_studio_settings: Optional[UnifiedStudioSettings] = Unassigned()
10360
+ ip_address_type: Optional[str] = Unassigned()
9887
10361
 
9888
10362
 
9889
10363
  class PredefinedMetricSpecification(Base):
@@ -11925,6 +12399,7 @@ class TrainingPlanSummary(Base):
11925
12399
  total_instance_count: The total number of instances reserved in this training plan.
11926
12400
  available_instance_count: The number of instances currently available for use in this training plan.
11927
12401
  in_use_instance_count: The number of instances currently in use from this training plan.
12402
+ total_ultra_server_count: The total number of UltraServers allocated to this training plan.
11928
12403
  target_resources: The target resources (e.g., training jobs, HyperPod clusters) that can use this training plan. Training plans are specific to their target resource. A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs. A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.
11929
12404
  reserved_capacity_summaries: A list of reserved capacities associated with this training plan, including details such as instance types, counts, and availability zones.
11930
12405
  """
@@ -11942,6 +12417,7 @@ class TrainingPlanSummary(Base):
11942
12417
  total_instance_count: Optional[int] = Unassigned()
11943
12418
  available_instance_count: Optional[int] = Unassigned()
11944
12419
  in_use_instance_count: Optional[int] = Unassigned()
12420
+ total_ultra_server_count: Optional[int] = Unassigned()
11945
12421
  target_resources: Optional[List[str]] = Unassigned()
11946
12422
  reserved_capacity_summaries: Optional[List[ReservedCapacitySummary]] = Unassigned()
11947
12423
 
@@ -12027,6 +12503,39 @@ class TrialSummary(Base):
12027
12503
  last_modified_time: Optional[datetime.datetime] = Unassigned()
12028
12504
 
12029
12505
 
12506
+ class UltraServer(Base):
12507
+ """
12508
+ UltraServer
12509
+ Represents a high-performance compute server used for distributed training in SageMaker AI. An UltraServer consists of multiple instances within a shared NVLink interconnect domain.
12510
+
12511
+ Attributes
12512
+ ----------------------
12513
+ ultra_server_id: The unique identifier for the UltraServer.
12514
+ ultra_server_type: The type of UltraServer, such as ml.u-p6e-gb200x72.
12515
+ availability_zone: The name of the Availability Zone where the UltraServer is provisioned.
12516
+ instance_type: The Amazon EC2 instance type used in the UltraServer.
12517
+ total_instance_count: The total number of instances in this UltraServer.
12518
+ configured_spare_instance_count: The number of spare instances configured for this UltraServer to provide enhanced resiliency.
12519
+ available_instance_count: The number of instances currently available for use in this UltraServer.
12520
+ in_use_instance_count: The number of instances currently in use in this UltraServer.
12521
+ available_spare_instance_count: The number of available spare instances in the UltraServer.
12522
+ unhealthy_instance_count: The number of instances in this UltraServer that are currently in an unhealthy state.
12523
+ health_status: The overall health status of the UltraServer.
12524
+ """
12525
+
12526
+ ultra_server_id: str
12527
+ ultra_server_type: str
12528
+ availability_zone: str
12529
+ instance_type: str
12530
+ total_instance_count: int
12531
+ configured_spare_instance_count: Optional[int] = Unassigned()
12532
+ available_instance_count: Optional[int] = Unassigned()
12533
+ in_use_instance_count: Optional[int] = Unassigned()
12534
+ available_spare_instance_count: Optional[int] = Unassigned()
12535
+ unhealthy_instance_count: Optional[int] = Unassigned()
12536
+ health_status: Optional[str] = Unassigned()
12537
+
12538
+
12030
12539
  class UserProfileDetails(Base):
12031
12540
  """
12032
12541
  UserProfileDetails
@@ -12746,6 +13255,9 @@ class ReservedCapacityOffering(Base):
12746
13255
 
12747
13256
  Attributes
12748
13257
  ----------------------
13258
+ reserved_capacity_type: The type of reserved capacity offering.
13259
+ ultra_server_type: The type of UltraServer included in this reserved capacity offering, such as ml.u-p6e-gb200x72.
13260
+ ultra_server_count: The number of UltraServers included in this reserved capacity offering.
12749
13261
  instance_type: The instance type for the reserved capacity offering.
12750
13262
  instance_count: The number of instances in the reserved capacity offering.
12751
13263
  availability_zone: The availability zone for the reserved capacity offering.
@@ -12757,6 +13269,9 @@ class ReservedCapacityOffering(Base):
12757
13269
 
12758
13270
  instance_type: str
12759
13271
  instance_count: int
13272
+ reserved_capacity_type: Optional[str] = Unassigned()
13273
+ ultra_server_type: Optional[str] = Unassigned()
13274
+ ultra_server_count: Optional[int] = Unassigned()
12760
13275
  availability_zone: Optional[str] = Unassigned()
12761
13276
  duration_hours: Optional[int] = Unassigned()
12762
13277
  duration_minutes: Optional[int] = Unassigned()