aws-cdk-neuronx-patterns 0.2.0 → 0.3.0

Files changed (34)
  1. package/.jsii +754 -117
  2. package/API.md +1044 -158
  3. package/README.ja.md +18 -6
  4. package/README.md +16 -5
  5. package/lib/base/aws-batch/neuronx-batch-compute-environment.js +1 -1
  6. package/lib/base/aws-batch/neuronx-batch-ecs-job-definition.js +1 -1
  7. package/lib/base/aws-batch/neuronx-batch.js +1 -1
  8. package/lib/base/aws-ecs-patterns/application-load-balanced-neuronx-service.js +4 -4
  9. package/lib/base/neuronx/calculator.test.js +61 -1
  10. package/lib/base/neuronx/deep-learning-containers.js +3 -3
  11. package/lib/base/neuronx/model.js +2 -2
  12. package/lib/base/neuronx/neuron-optimized-machine-image.js +1 -1
  13. package/lib/base/neuronx/neuronx-instance-type.d.ts +18 -0
  14. package/lib/base/neuronx/neuronx-instance-type.js +60 -7
  15. package/lib/base/neuronx/neuronx-instance-type.test.js +80 -1
  16. package/lib/base/neuronx-compiler/index.d.ts +3 -1
  17. package/lib/base/neuronx-compiler/index.js +4 -2
  18. package/lib/base/neuronx-compiler/{neuronx-compiler.d.ts → neuronx-compiler-base.d.ts} +74 -32
  19. package/lib/base/neuronx-compiler/neuronx-compiler-base.js +129 -0
  20. package/lib/base/neuronx-compiler/neuronx-cross-compiler.d.ts +30 -0
  21. package/lib/base/neuronx-compiler/neuronx-cross-compiler.js +83 -0
  22. package/lib/base/neuronx-compiler/neuronx-native-compiler.d.ts +18 -0
  23. package/lib/base/neuronx-compiler/neuronx-native-compiler.js +69 -0
  24. package/lib/base/server-engine/vllm-engine/vllm-engine-argments.js +1 -1
  25. package/lib/sagemaker-inference-toolkit-tnx/sagemaker-inference-toolkit-tnx-compiler.js +2 -2
  26. package/lib/sagemaker-inference-toolkit-tnx/sagemaker-inference-toolkit-tnx-sagemaker.d.ts +1 -1
  27. package/lib/sagemaker-inference-toolkit-tnx/sagemaker-inference-toolkit-tnx-sagemaker.js +2 -2
  28. package/lib/vllm-nxd-inference/vllm-nxd-inference-compiler.d.ts +8 -0
  29. package/lib/vllm-nxd-inference/vllm-nxd-inference-compiler.js +32 -4
  30. package/lib/vllm-nxd-inference/vllm-nxd-inference-ecs-patterns.js +6 -6
  31. package/package.json +7 -7
  32. package/scripts/compile/vllm-nxd-inference/Dockerfile +5 -0
  33. package/scripts/compile/vllm-nxd-inference/entrypoint.sh +39 -14
  34. package/lib/base/neuronx-compiler/neuronx-compiler.js +0 -166
package/API.md CHANGED
@@ -1434,43 +1434,342 @@ Batch terminates your jobs if they aren't finished.
1434
1434
  ---
1435
1435
 
1436
1436
 
1437
- ### NeuronxCompiler <a name="NeuronxCompiler" id="aws-cdk-neuronx-patterns.NeuronxCompiler"></a>
1437
+ ### NeuronxCompilerBase <a name="NeuronxCompilerBase" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase"></a>
1438
+
1439
+ - *Implements:* <a href="#aws-cdk-neuronx-patterns.INeuronxCompiler">INeuronxCompiler</a>
1440
+
1441
+ Abstract base class for Neuronx compilers.
1442
+
1443
+ Provides the common orchestration logic (Lambda, CustomResource, WaitCondition)
1444
+ while subclasses define how to create the Batch compute environment and job definition.
1445
+
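The split described above (shared orchestration in the base class, resource creation in subclasses) is a template-method arrangement. The following is a minimal sketch of that shape only. Every name in it (`CompilerBaseSketch`, `createComputeEnvironment`, `createJobDefinition`, the plain-object types) is hypothetical and stands in for the real CDK constructs; the actual hook names and signatures in `NeuronxCompilerBase` are not shown in this diff.

```typescript
// Hypothetical stand-ins for the Batch resources; not types from the package.
interface ComputeEnvironmentSketch {
  name: string;
  instanceRole: string;
}
interface JobDefinitionSketch {
  image: string;
}

abstract class CompilerBaseSketch {
  // Hooks that each subclass must supply (assumed names).
  protected abstract createComputeEnvironment(): ComputeEnvironmentSketch;
  protected abstract createJobDefinition(): JobDefinitionSketch;

  // Shared orchestration: the base class only calls the hooks and wires the results.
  public compile(): string {
    const env = this.createComputeEnvironment();
    const job = this.createJobDefinition();
    return `run ${job.image} on ${env.name} as ${env.instanceRole}`;
  }
}

class NativeSketch extends CompilerBaseSketch {
  protected createComputeEnvironment(): ComputeEnvironmentSketch {
    return { name: 'neuron-instances', instanceRole: 'native-role' };
  }
  protected createJobDefinition(): JobDefinitionSketch {
    return { image: 'native-compile-image' };
  }
}

class CrossSketch extends CompilerBaseSketch {
  protected createComputeEnvironment(): ComputeEnvironmentSketch {
    return { name: 'cpu-instances', instanceRole: 'cross-role' };
  }
  protected createJobDefinition(): JobDefinitionSketch {
    return { image: 'cross-compile-image' };
  }
}

const nativePlan = new NativeSketch().compile();
const crossPlan = new CrossSketch().compile();
```

Both subclasses reuse the same `compile()` body; only the two hook implementations differ, which is presumably why the former single `NeuronxCompiler` could be split into native and cross variants without duplicating the orchestration.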
1446
+ #### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer"></a>
1447
+
1448
+ ```typescript
1449
+ import { NeuronxCompilerBase } from 'aws-cdk-neuronx-patterns'
1450
+
1451
+ new NeuronxCompilerBase(scope: Construct, id: string, props: NeuronxCompilerBaseProps)
1452
+ ```
1453
+
1454
+ | **Name** | **Type** | **Description** |
1455
+ | --- | --- | --- |
1456
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.scope">scope</a></code> | <code>constructs.Construct</code> | *No description.* |
1457
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.id">id</a></code> | <code>string</code> | *No description.* |
1458
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.props">props</a></code> | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps">NeuronxCompilerBaseProps</a></code> | *No description.* |
1459
+
1460
+ ---
1461
+
1462
+ ##### `scope`<sup>Required</sup> <a name="scope" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.scope"></a>
1463
+
1464
+ - *Type:* constructs.Construct
1465
+
1466
+ ---
1467
+
1468
+ ##### `id`<sup>Required</sup> <a name="id" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.id"></a>
1469
+
1470
+ - *Type:* string
1471
+
1472
+ ---
1473
+
1474
+ ##### `props`<sup>Required</sup> <a name="props" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.props"></a>
1475
+
1476
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps">NeuronxCompilerBaseProps</a>
1477
+
1478
+ ---
1479
+
1480
+ #### Methods <a name="Methods" id="Methods"></a>
1481
+
1482
+ | **Name** | **Description** |
1483
+ | --- | --- |
1484
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.toString">toString</a></code> | Returns a string representation of this construct. |
1485
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.with">with</a></code> | Applies one or more mixins to this construct. |
1486
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.compile">compile</a></code> | *No description.* |
1487
+
1488
+ ---
1489
+
1490
+ ##### `toString` <a name="toString" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.toString"></a>
1491
+
1492
+ ```typescript
1493
+ public toString(): string
1494
+ ```
1495
+
1496
+ Returns a string representation of this construct.
1497
+
1498
+ ##### `with` <a name="with" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.with"></a>
1499
+
1500
+ ```typescript
1501
+ public with(mixins: ...IMixin[]): IConstruct
1502
+ ```
1503
+
1504
+ Applies one or more mixins to this construct.
1505
+
1506
+ Mixins are applied in order. The list of constructs is captured at the
1507
+ start of the call, so constructs added by a mixin will not be visited.
1508
+ Use multiple `with()` calls if subsequent mixins should apply to added
1509
+ constructs.
1510
+
1511
+ ###### `mixins`<sup>Required</sup> <a name="mixins" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.with.parameter.mixins"></a>
1512
+
1513
+ - *Type:* ...constructs.IMixin[]
1514
+
1515
+ The mixins to apply.
1516
+
1517
+ ---
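The snapshot behaviour described for `with()` can be illustrated with a small self-contained sketch. `NodeSketch`, `applyMixins`, and the function-style mixins below are hypothetical and are not the `constructs` library's API; they only model the stated rule that the list of constructs is captured once at the start of the call.

```typescript
// Hypothetical minimal tree node; not the `constructs` Node class.
class NodeSketch {
  readonly children: NodeSketch[] = [];
  readonly tags: string[] = [];
  constructor(readonly id: string) {}

  // Collect this node and all of its descendants, depth-first.
  findAll(): NodeSketch[] {
    const all: NodeSketch[] = [this];
    for (const child of this.children) {
      all.push(...child.findAll());
    }
    return all;
  }

  // Apply mixins in order to a snapshot taken once, before any mixin runs.
  applyMixins(...mixins: Array<(node: NodeSketch) => void>): this {
    const snapshot = this.findAll();
    for (const mixin of mixins) {
      for (const node of snapshot) {
        mixin(node);
      }
    }
    return this;
  }
}

const root = new NodeSketch('root');
root.children.push(new NodeSketch('child'));

// First mixin adds a node under the root; second mixin tags every visited node.
const added = new NodeSketch('added');
const addChild = (node: NodeSketch): void => {
  if (node.id === 'root') node.children.push(added);
};
const tag = (node: NodeSketch): void => {
  node.tags.push('tagged');
};

root.applyMixins(addChild, tag);
const tagsAfterFirstCall = added.tags.length; // 'added' was not in the snapshot

root.applyMixins(tag);
const tagsAfterSecondCall = added.tags.length; // a fresh snapshot now includes 'added'
```

In this model the node added by the first mixin is only reached by the second, separate call, matching the advice to use multiple `with()` calls.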
1518
+
1519
+ ##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.compile"></a>
1520
+
1521
+ ```typescript
1522
+ public compile(): NeuronxCompiledModel
1523
+ ```
1524
+
1525
+ #### Static Functions <a name="Static Functions" id="Static Functions"></a>
1526
+
1527
+ | **Name** | **Description** |
1528
+ | --- | --- |
1529
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.isConstruct">isConstruct</a></code> | Checks if `x` is a construct. |
1530
+
1531
+ ---
1532
+
1533
+ ##### `isConstruct` <a name="isConstruct" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.isConstruct"></a>
1534
+
1535
+ ```typescript
1536
+ import { NeuronxCompilerBase } from 'aws-cdk-neuronx-patterns'
1537
+
1538
+ NeuronxCompilerBase.isConstruct(x: any)
1539
+ ```
1540
+
1541
+ Checks if `x` is a construct.
1542
+
1543
+ Use this method instead of `instanceof` to properly detect `Construct`
1544
+ instances, even when the construct library is symlinked.
1545
+
1546
+ Explanation: in JavaScript, multiple copies of the `constructs` library on
1547
+ disk are seen as independent, completely different libraries. As a
1548
+ consequence, the class `Construct` in each copy of the `constructs` library
1549
+ is seen as a different class, and an instance of one class will not test as
1550
+ `instanceof` the other class. `npm install` will not create installations
1551
+ like this, but users may manually symlink construct libraries together or
1552
+ use a monorepo tool: in those cases, multiple copies of the `constructs`
1553
+ library can be accidentally installed, and `instanceof` will behave
1554
+ unpredictably. It is safest to avoid using `instanceof`, and using
1555
+ this type-testing method instead.
1556
+
1557
+ ###### `x`<sup>Required</sup> <a name="x" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.isConstruct.parameter.x"></a>
1558
+
1559
+ - *Type:* any
1560
+
1561
+ Any object.
1562
+
1563
+ ---
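The `instanceof` pitfall explained above can be reproduced without any library. The sketch below simulates two independently loaded copies of a class and compares `instanceof` with a marker-based check. The marker key `sketch.Construct` and the helper `loadLibraryCopy` are made up for illustration; this shows the general technique, not the actual implementation in `constructs`.

```typescript
// Symbol.for returns the same symbol for the same key across the whole runtime,
// so every copy of the "library" agrees on the marker.
const MARKER = Symbol.for('sketch.Construct');

// Each call defines a brand-new class, like a second copy of a package on disk.
function loadLibraryCopy() {
  class Construct {
    constructor() {
      Object.defineProperty(this, MARKER, { value: true });
    }

    // Type test that relies on the shared marker instead of the prototype chain.
    static isConstruct(x: unknown): boolean {
      return x !== null && typeof x === 'object' && MARKER in x;
    }
  }
  return Construct;
}

const CopyA = loadLibraryCopy();
const CopyB = loadLibraryCopy();
const fromA = new CopyA();

// Different class objects, so the prototype-chain test fails across copies.
const viaInstanceof = fromA instanceof CopyB;

// The marker check succeeds regardless of which copy created the object.
const viaMarker = CopyB.isConstruct(fromA);
const plainObject = CopyB.isConstruct({});
```

Here `viaInstanceof` is `false` while `viaMarker` is `true`, which is the situation the documentation warns about for symlinked or monorepo installs.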
1564
+
1565
+ #### Properties <a name="Properties" id="Properties"></a>
1566
+
1567
+ | **Name** | **Type** | **Description** |
1568
+ | --- | --- | --- |
1569
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.property.node">node</a></code> | <code>constructs.Node</code> | The tree node. |
1570
+
1571
+ ---
1572
+
1573
+ ##### `node`<sup>Required</sup> <a name="node" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.property.node"></a>
1574
+
1575
+ ```typescript
1576
+ public readonly node: Node;
1577
+ ```
1578
+
1579
+ - *Type:* constructs.Node
1580
+
1581
+ The tree node.
1582
+
1583
+ ---
1584
+
1585
+
1586
+ ### NeuronxCrossCompiler <a name="NeuronxCrossCompiler" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler"></a>
1587
+
1588
+ Neuronx cross-compiler construct.
1589
+
1590
+ Compile the model on a non-Neuron instance and upload the artifacts to an S3 bucket.
1591
+ This avoids the need for expensive Neuron instances during the compilation phase.
1592
+
1593
+ The compilation uses `vllm serve` which performs model tracing and neuronx-cc compilation
1594
+ entirely on CPU. The resulting artifacts are compatible with Neuron instances for inference.
1595
+
1596
+ #### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer"></a>
1597
+
1598
+ ```typescript
1599
+ import { NeuronxCrossCompiler } from 'aws-cdk-neuronx-patterns'
1600
+
1601
+ new NeuronxCrossCompiler(scope: Construct, id: string, props: NeuronxCrossCompilerProps)
1602
+ ```
1603
+
1604
+ | **Name** | **Type** | **Description** |
1605
+ | --- | --- | --- |
1606
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.scope">scope</a></code> | <code>constructs.Construct</code> | *No description.* |
1607
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.id">id</a></code> | <code>string</code> | *No description.* |
1608
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.props">props</a></code> | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps">NeuronxCrossCompilerProps</a></code> | *No description.* |
1609
+
1610
+ ---
1611
+
1612
+ ##### `scope`<sup>Required</sup> <a name="scope" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.scope"></a>
1613
+
1614
+ - *Type:* constructs.Construct
1615
+
1616
+ ---
1617
+
1618
+ ##### `id`<sup>Required</sup> <a name="id" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.id"></a>
1619
+
1620
+ - *Type:* string
1621
+
1622
+ ---
1623
+
1624
+ ##### `props`<sup>Required</sup> <a name="props" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.props"></a>
1625
+
1626
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps">NeuronxCrossCompilerProps</a>
1627
+
1628
+ ---
1629
+
1630
+ #### Methods <a name="Methods" id="Methods"></a>
1631
+
1632
+ | **Name** | **Description** |
1633
+ | --- | --- |
1634
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.toString">toString</a></code> | Returns a string representation of this construct. |
1635
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.with">with</a></code> | Applies one or more mixins to this construct. |
1636
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.compile">compile</a></code> | *No description.* |
1637
+
1638
+ ---
1639
+
1640
+ ##### `toString` <a name="toString" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.toString"></a>
1641
+
1642
+ ```typescript
1643
+ public toString(): string
1644
+ ```
1645
+
1646
+ Returns a string representation of this construct.
1647
+
1648
+ ##### `with` <a name="with" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.with"></a>
1649
+
1650
+ ```typescript
1651
+ public with(mixins: ...IMixin[]): IConstruct
1652
+ ```
1653
+
1654
+ Applies one or more mixins to this construct.
1655
+
1656
+ Mixins are applied in order. The list of constructs is captured at the
1657
+ start of the call, so constructs added by a mixin will not be visited.
1658
+ Use multiple `with()` calls if subsequent mixins should apply to added
1659
+ constructs.
1660
+
1661
+ ###### `mixins`<sup>Required</sup> <a name="mixins" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.with.parameter.mixins"></a>
1662
+
1663
+ - *Type:* ...constructs.IMixin[]
1664
+
1665
+ The mixins to apply.
1666
+
1667
+ ---
1668
+
1669
+ ##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.compile"></a>
1670
+
1671
+ ```typescript
1672
+ public compile(): NeuronxCompiledModel
1673
+ ```
1674
+
1675
+ #### Static Functions <a name="Static Functions" id="Static Functions"></a>
1676
+
1677
+ | **Name** | **Description** |
1678
+ | --- | --- |
1679
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.isConstruct">isConstruct</a></code> | Checks if `x` is a construct. |
1680
+
1681
+ ---
1682
+
1683
+ ##### `isConstruct` <a name="isConstruct" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.isConstruct"></a>
1684
+
1685
+ ```typescript
1686
+ import { NeuronxCrossCompiler } from 'aws-cdk-neuronx-patterns'
1687
+
1688
+ NeuronxCrossCompiler.isConstruct(x: any)
1689
+ ```
1690
+
1691
+ Checks if `x` is a construct.
1692
+
1693
+ Use this method instead of `instanceof` to properly detect `Construct`
1694
+ instances, even when the construct library is symlinked.
1695
+
1696
+ Explanation: in JavaScript, multiple copies of the `constructs` library on
1697
+ disk are seen as independent, completely different libraries. As a
1698
+ consequence, the class `Construct` in each copy of the `constructs` library
1699
+ is seen as a different class, and an instance of one class will not test as
1700
+ `instanceof` the other class. `npm install` will not create installations
1701
+ like this, but users may manually symlink construct libraries together or
1702
+ use a monorepo tool: in those cases, multiple copies of the `constructs`
1703
+ library can be accidentally installed, and `instanceof` will behave
1704
+ unpredictably. It is safest to avoid using `instanceof`, and using
1705
+ this type-testing method instead.
1706
+
1707
+ ###### `x`<sup>Required</sup> <a name="x" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.isConstruct.parameter.x"></a>
1708
+
1709
+ - *Type:* any
1710
+
1711
+ Any object.
1712
+
1713
+ ---
1714
+
1715
+ #### Properties <a name="Properties" id="Properties"></a>
1716
+
1717
+ | **Name** | **Type** | **Description** |
1718
+ | --- | --- | --- |
1719
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.property.node">node</a></code> | <code>constructs.Node</code> | The tree node. |
1720
+
1721
+ ---
1722
+
1723
+ ##### `node`<sup>Required</sup> <a name="node" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.property.node"></a>
1724
+
1725
+ ```typescript
1726
+ public readonly node: Node;
1727
+ ```
1728
+
1729
+ - *Type:* constructs.Node
1730
+
1731
+ The tree node.
1732
+
1733
+ ---
1734
+
1735
+
1736
+ ### NeuronxNativeCompiler <a name="NeuronxNativeCompiler" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler"></a>
1438
1737
 
1439
1738
  Neuronx compiler construct.
1440
1739
 
1441
1740
  Compile the model to work with Inferentia2 and Trainium1 and upload it to an S3 bucket.
1442
1741
 
1443
- #### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.NeuronxCompiler.Initializer"></a>
1742
+ #### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer"></a>
1444
1743
 
1445
1744
  ```typescript
1446
- import { NeuronxCompiler } from 'aws-cdk-neuronx-patterns'
1745
+ import { NeuronxNativeCompiler } from 'aws-cdk-neuronx-patterns'
1447
1746
 
1448
- new NeuronxCompiler(scope: Construct, id: string, props: NeuronxCompilerProps)
1747
+ new NeuronxNativeCompiler(scope: Construct, id: string, props: NeuronxNativeCompilerProps)
1449
1748
  ```
1450
1749
 
1451
1750
  | **Name** | **Type** | **Description** |
1452
1751
  | --- | --- | --- |
1453
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiler.Initializer.parameter.scope">scope</a></code> | <code>constructs.Construct</code> | *No description.* |
1454
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiler.Initializer.parameter.id">id</a></code> | <code>string</code> | *No description.* |
1455
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiler.Initializer.parameter.props">props</a></code> | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps">NeuronxCompilerProps</a></code> | *No description.* |
1752
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.scope">scope</a></code> | <code>constructs.Construct</code> | *No description.* |
1753
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.id">id</a></code> | <code>string</code> | *No description.* |
1754
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.props">props</a></code> | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps">NeuronxNativeCompilerProps</a></code> | *No description.* |
1456
1755
 
1457
1756
  ---
1458
1757
 
1459
- ##### `scope`<sup>Required</sup> <a name="scope" id="aws-cdk-neuronx-patterns.NeuronxCompiler.Initializer.parameter.scope"></a>
1758
+ ##### `scope`<sup>Required</sup> <a name="scope" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.scope"></a>
1460
1759
 
1461
1760
  - *Type:* constructs.Construct
1462
1761
 
1463
1762
  ---
1464
1763
 
1465
- ##### `id`<sup>Required</sup> <a name="id" id="aws-cdk-neuronx-patterns.NeuronxCompiler.Initializer.parameter.id"></a>
1764
+ ##### `id`<sup>Required</sup> <a name="id" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.id"></a>
1466
1765
 
1467
1766
  - *Type:* string
1468
1767
 
1469
1768
  ---
1470
1769
 
1471
- ##### `props`<sup>Required</sup> <a name="props" id="aws-cdk-neuronx-patterns.NeuronxCompiler.Initializer.parameter.props"></a>
1770
+ ##### `props`<sup>Required</sup> <a name="props" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.props"></a>
1472
1771
 
1473
- - *Type:* <a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps">NeuronxCompilerProps</a>
1772
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps">NeuronxNativeCompilerProps</a>
1474
1773
 
1475
1774
  ---
1476
1775
 
@@ -1478,13 +1777,13 @@ new NeuronxCompiler(scope: Construct, id: string, props: NeuronxCompilerProps)
1478
1777
 
1479
1778
  | **Name** | **Description** |
1480
1779
  | --- | --- |
1481
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiler.toString">toString</a></code> | Returns a string representation of this construct. |
1482
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiler.with">with</a></code> | Applies one or more mixins to this construct. |
1483
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiler.compile">compile</a></code> | *No description.* |
1780
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.toString">toString</a></code> | Returns a string representation of this construct. |
1781
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.with">with</a></code> | Applies one or more mixins to this construct. |
1782
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.compile">compile</a></code> | *No description.* |
1484
1783
 
1485
1784
  ---
1486
1785
 
1487
- ##### `toString` <a name="toString" id="aws-cdk-neuronx-patterns.NeuronxCompiler.toString"></a>
1786
+ ##### `toString` <a name="toString" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.toString"></a>
1488
1787
 
1489
1788
  ```typescript
1490
1789
  public toString(): string
@@ -1492,7 +1791,7 @@ public toString(): string
1492
1791
 
1493
1792
  Returns a string representation of this construct.
1494
1793
 
1495
- ##### `with` <a name="with" id="aws-cdk-neuronx-patterns.NeuronxCompiler.with"></a>
1794
+ ##### `with` <a name="with" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.with"></a>
1496
1795
 
1497
1796
  ```typescript
1498
1797
  public with(mixins: ...IMixin[]): IConstruct
@@ -1505,7 +1804,7 @@ start of the call, so constructs added by a mixin will not be visited.
1505
1804
  Use multiple `with()` calls if subsequent mixins should apply to added
1506
1805
  constructs.
1507
1806
 
1508
- ###### `mixins`<sup>Required</sup> <a name="mixins" id="aws-cdk-neuronx-patterns.NeuronxCompiler.with.parameter.mixins"></a>
1807
+ ###### `mixins`<sup>Required</sup> <a name="mixins" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.with.parameter.mixins"></a>
1509
1808
 
1510
1809
  - *Type:* ...constructs.IMixin[]
1511
1810
 
@@ -1513,7 +1812,7 @@ The mixins to apply.
1513
1812
 
1514
1813
  ---
1515
1814
 
1516
- ##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.NeuronxCompiler.compile"></a>
1815
+ ##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.compile"></a>
1517
1816
 
1518
1817
  ```typescript
1519
1818
  public compile(): NeuronxCompiledModel
@@ -1523,16 +1822,16 @@ public compile(): NeuronxCompiledModel
1523
1822
 
1524
1823
  | **Name** | **Description** |
1525
1824
  | --- | --- |
1526
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiler.isConstruct">isConstruct</a></code> | Checks if `x` is a construct. |
1825
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.isConstruct">isConstruct</a></code> | Checks if `x` is a construct. |
1527
1826
 
1528
1827
  ---
1529
1828
 
1530
- ##### `isConstruct` <a name="isConstruct" id="aws-cdk-neuronx-patterns.NeuronxCompiler.isConstruct"></a>
1829
+ ##### `isConstruct` <a name="isConstruct" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.isConstruct"></a>
1531
1830
 
1532
1831
  ```typescript
1533
- import { NeuronxCompiler } from 'aws-cdk-neuronx-patterns'
1832
+ import { NeuronxNativeCompiler } from 'aws-cdk-neuronx-patterns'
1534
1833
 
1535
- NeuronxCompiler.isConstruct(x: any)
1834
+ NeuronxNativeCompiler.isConstruct(x: any)
1536
1835
  ```
1537
1836
 
1538
1837
  Checks if `x` is a construct.
@@ -1551,7 +1850,7 @@ library can be accidentally installed, and `instanceof` will behave
1551
1850
  unpredictably. It is safest to avoid using `instanceof`, and using
1552
1851
  this type-testing method instead.
1553
1852
 
1554
- ###### `x`<sup>Required</sup> <a name="x" id="aws-cdk-neuronx-patterns.NeuronxCompiler.isConstruct.parameter.x"></a>
1853
+ ###### `x`<sup>Required</sup> <a name="x" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.isConstruct.parameter.x"></a>
1555
1854
 
1556
1855
  - *Type:* any
1557
1856
 
@@ -1563,11 +1862,11 @@ Any object.
1563
1862
 
1564
1863
  | **Name** | **Type** | **Description** |
1565
1864
  | --- | --- | --- |
1566
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiler.property.node">node</a></code> | <code>constructs.Node</code> | The tree node. |
1865
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.property.node">node</a></code> | <code>constructs.Node</code> | The tree node. |
1567
1866
 
1568
1867
  ---
1569
1868
 
1570
- ##### `node`<sup>Required</sup> <a name="node" id="aws-cdk-neuronx-patterns.NeuronxCompiler.property.node"></a>
1869
+ ##### `node`<sup>Required</sup> <a name="node" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.property.node"></a>
1571
1870
 
1572
1871
  ```typescript
1573
1872
  public readonly node: Node;
@@ -4652,43 +4951,88 @@ The task definition to use for tasks in the service. TaskDefinition or TaskImage
4652
4951
 
4653
4952
  ---
4654
4953
 
4655
- ### ModelConfig <a name="ModelConfig" id="aws-cdk-neuronx-patterns.ModelConfig"></a>
4954
+ ### ComputeEnvironmentResult <a name="ComputeEnvironmentResult" id="aws-cdk-neuronx-patterns.ComputeEnvironmentResult"></a>
4656
4955
 
4657
- #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.ModelConfig.Initializer"></a>
4956
+ Result of creating a compute environment.
4957
+
4958
+ #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.ComputeEnvironmentResult.Initializer"></a>
4658
4959
 
4659
4960
  ```typescript
4660
- import { ModelConfig } from 'aws-cdk-neuronx-patterns'
4961
+ import { ComputeEnvironmentResult } from 'aws-cdk-neuronx-patterns'
4661
4962
 
4662
- const modelConfig: ModelConfig = { ... }
4963
+ const computeEnvironmentResult: ComputeEnvironmentResult = { ... }
4663
4964
  ```
4664
4965
 
4665
4966
  #### Properties <a name="Properties" id="Properties"></a>
4666
4967
 
4667
4968
  | **Name** | **Type** | **Description** |
4668
4969
  | --- | --- | --- |
4669
- | <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.attentionHeads">attentionHeads</a></code> | <code>number</code> | *No description.* |
4670
- | <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.embeddingDimension">embeddingDimension</a></code> | <code>number</code> | *No description.* |
4671
- | <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.layers">layers</a></code> | <code>number</code> | *No description.* |
4970
+ | <code><a href="#aws-cdk-neuronx-patterns.ComputeEnvironmentResult.property.computeEnvironment">computeEnvironment</a></code> | <code>aws-cdk-lib.aws_batch.IComputeEnvironment</code> | The compute environment. |
4971
+ | <code><a href="#aws-cdk-neuronx-patterns.ComputeEnvironmentResult.property.instanceRole">instanceRole</a></code> | <code>aws-cdk-lib.aws_iam.IRole</code> | The instance role associated with the compute environment. |
4672
4972
 
4673
4973
  ---
4674
4974
 
4675
- ##### `attentionHeads`<sup>Required</sup> <a name="attentionHeads" id="aws-cdk-neuronx-patterns.ModelConfig.property.attentionHeads"></a>
4975
+ ##### `computeEnvironment`<sup>Required</sup> <a name="computeEnvironment" id="aws-cdk-neuronx-patterns.ComputeEnvironmentResult.property.computeEnvironment"></a>
4676
4976
 
4677
4977
  ```typescript
4678
- public readonly attentionHeads: number;
4978
+ public readonly computeEnvironment: IComputeEnvironment;
4679
4979
  ```
4680
4980
 
4681
- - *Type:* number
4981
+ - *Type:* aws-cdk-lib.aws_batch.IComputeEnvironment
4982
+
4983
+ The compute environment.
4682
4984
 
4683
4985
  ---
4684
4986
 
4685
- ##### `embeddingDimension`<sup>Required</sup> <a name="embeddingDimension" id="aws-cdk-neuronx-patterns.ModelConfig.property.embeddingDimension"></a>
4987
+ ##### `instanceRole`<sup>Required</sup> <a name="instanceRole" id="aws-cdk-neuronx-patterns.ComputeEnvironmentResult.property.instanceRole"></a>
4686
4988
 
4687
4989
  ```typescript
4688
- public readonly embeddingDimension: number;
4990
+ public readonly instanceRole: IRole;
4689
4991
  ```
4690
4992
 
4691
- - *Type:* number
4993
+ - *Type:* aws-cdk-lib.aws_iam.IRole
4994
+
4995
+ The instance role associated with the compute environment.
4996
+
4997
+ ---
4998
+
4999
+ ### ModelConfig <a name="ModelConfig" id="aws-cdk-neuronx-patterns.ModelConfig"></a>
5000
+
5001
+ #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.ModelConfig.Initializer"></a>
5002
+
5003
+ ```typescript
5004
+ import { ModelConfig } from 'aws-cdk-neuronx-patterns'
5005
+
5006
+ const modelConfig: ModelConfig = { ... }
5007
+ ```
5008
+
5009
+ #### Properties <a name="Properties" id="Properties"></a>
5010
+
5011
+ | **Name** | **Type** | **Description** |
5012
+ | --- | --- | --- |
5013
+ | <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.attentionHeads">attentionHeads</a></code> | <code>number</code> | *No description.* |
5014
+ | <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.embeddingDimension">embeddingDimension</a></code> | <code>number</code> | *No description.* |
5015
+ | <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.layers">layers</a></code> | <code>number</code> | *No description.* |
5016
+
5017
+ ---
5018
+
5019
+ ##### `attentionHeads`<sup>Required</sup> <a name="attentionHeads" id="aws-cdk-neuronx-patterns.ModelConfig.property.attentionHeads"></a>
5020
+
5021
+ ```typescript
5022
+ public readonly attentionHeads: number;
5023
+ ```
5024
+
5025
+ - *Type:* number
5026
+
5027
+ ---
5028
+
5029
+ ##### `embeddingDimension`<sup>Required</sup> <a name="embeddingDimension" id="aws-cdk-neuronx-patterns.ModelConfig.property.embeddingDimension"></a>
5030
+
5031
+ ```typescript
5032
+ public readonly embeddingDimension: number;
5033
+ ```
5034
+
5035
+ - *Type:* number
4692
5036
 
4693
5037
  ---
4694
5038
 
@@ -5846,234 +6190,627 @@ Automatically added to the job definition.
5846
6190
  ##### `gpu`<sup>Optional</sup> <a name="gpu" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.gpu"></a>
5847
6191
 
5848
6192
  ```typescript
5849
- public readonly gpu: number;
6193
+ public readonly gpu: number;
6194
+ ```
6195
+
6196
+ - *Type:* number
6197
+ - *Default:* no gpus
6198
+
6199
+ The number of physical GPUs to reserve for the container.
6200
+
6201
+ Make sure that the number of GPUs reserved for all containers in a job doesn't exceed
6202
+ the number of available GPUs on the compute resource that the job is launched on.
6203
+
6204
+ ---
6205
+
6206
+ ##### `privileged`<sup>Optional</sup> <a name="privileged" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.privileged"></a>
6207
+
6208
+ ```typescript
6209
+ public readonly privileged: boolean;
6210
+ ```
6211
+
6212
+ - *Type:* boolean
6213
+ - *Default:* false
6214
+
6215
+ When this parameter is true, the container is given elevated permissions on the host container instance (similar to the root user).
6216
+
6217
+ ---
6218
+
6219
+ ##### `ulimits`<sup>Optional</sup> <a name="ulimits" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.ulimits"></a>
6220
+
6221
+ ```typescript
6222
+ public readonly ulimits: Ulimit[];
6223
+ ```
6224
+
6225
+ - *Type:* aws-cdk-lib.aws_batch.Ulimit[]
6226
+ - *Default:* no ulimits
6227
+
6228
+ Limits to set for the user this docker container will run as.
6229
+
6230
+ ---
6231
+
6232
+ ##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.neuronxInstanceType"></a>
6233
+
6234
+ ```typescript
6235
+ public readonly neuronxInstanceType: INeuronxInstanceType;
6236
+ ```
6237
+
6238
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
6239
+
6240
+ The instance type of worker instance.
6241
+
6242
+ ---
6243
+
6244
+ ##### `volumeSize`<sup>Required</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.volumeSize"></a>
6245
+
6246
+ ```typescript
6247
+ public readonly volumeSize: Size;
6248
+ ```
6249
+
6250
+ - *Type:* aws-cdk-lib.Size
6251
+ - *Default:* N billion parameters * 5GiB EBS
6252
+
6253
+ The root volume of worker instance.
6254
+
6255
+ ---
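As a worked example of the documented `volumeSize` default (5 GiB of EBS per billion parameters), the helper below does the arithmetic. The function name and constant are hypothetical, and the package's real computation may round or clamp differently; only the multiplication rule comes from the default stated above.

```typescript
// Documented rule of thumb: 5 GiB of root volume per billion model parameters.
const GIB_PER_BILLION_PARAMETERS = 5;

// Hypothetical helper: returns the default root volume size in GiB.
function defaultVolumeSizeGiB(parametersInBillions: number): number {
  if (!(parametersInBillions > 0)) {
    throw new RangeError('parametersInBillions must be a positive number');
  }
  return parametersInBillions * GIB_PER_BILLION_PARAMETERS;
}

const sizeFor8B = defaultVolumeSizeGiB(8); // 8 * 5 = 40 GiB
const sizeFor70B = defaultVolumeSizeGiB(70); // 70 * 5 = 350 GiB
```

With `aws-cdk-lib`, a value like this would typically be wrapped as `Size.gibibytes(40)` before being passed as `volumeSize`.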
6256
+
6257
+ ##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.vpc"></a>
6258
+
6259
+ ```typescript
6260
+ public readonly vpc: IVpc;
6261
+ ```
6262
+
6263
+ - *Type:* aws-cdk-lib.aws_ec2.IVpc
6264
+
6265
+ VPC in which this will launch worker instance.
6266
+
6267
+ ---
6268
+
6269
+ ##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.spot"></a>
6270
+
6271
+ ```typescript
6272
+ public readonly spot: boolean;
6273
+ ```
6274
+
6275
+ - *Type:* boolean
6276
+ - *Default:* false
6277
+
6278
+ Whether or not to use spot instances.
6279
+
6280
+ Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at any time;
6281
+ your job will be given two minutes of notice before reclamation.
6282
+
6283
+ ---
6284
+
6285
+ ##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.vpcSubnets"></a>
6286
+
6287
+ ```typescript
6288
+ public readonly vpcSubnets: SubnetSelection;
6289
+ ```
6290
+
6291
+ - *Type:* aws-cdk-lib.aws_ec2.SubnetSelection
6292
+ - *Default:* new subnets will be created
6293
+
6294
+ The VPC Subnets this Compute Environment will launch instances in.
6295
+
6296
+ ---
6297
+
6298
+ ### NeuronxCompiledModel <a name="NeuronxCompiledModel" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel"></a>
6299
+
6300
+ The model compiled by Neuronx compiler.
6301
+
6302
+ #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.Initializer"></a>
6303
+
6304
+ ```typescript
6305
+ import { NeuronxCompiledModel } from 'aws-cdk-neuronx-patterns'
6306
+
6307
+ const neuronxCompiledModel: NeuronxCompiledModel = { ... }
6308
+ ```
6309
+
6310
+ #### Properties <a name="Properties" id="Properties"></a>
6311
+
6312
+ | **Name** | **Type** | **Description** |
6313
+ | --- | --- | --- |
6314
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
6315
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.modelName">modelName</a></code> | <code>string</code> | The model name. |
6316
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.recommendedInstanceType">recommendedInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The recommended Neuron instance type for running inference with this compiled model. |
6317
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Prefix">s3Prefix</a></code> | <code>string</code> | S3 prefix under which the compiled artifact is uploaded. |
6318
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Uri">s3Uri</a></code> | <code>string</code> | S3 URI to which the compiled artifact is uploaded. |
6319
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.weightSize">weightSize</a></code> | <code>aws-cdk-lib.Size</code> | The weight size of the model. |
6320
+
6321
+ ---
6322
+
6323
+ ##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.bucket"></a>
6324
+
6325
+ ```typescript
6326
+ public readonly bucket: IBucket;
6327
+ ```
6328
+
6329
+ - *Type:* aws-cdk-lib.aws_s3.IBucket
6330
+
6331
+ The bucket to upload compiled artifacts.
6332
+
6333
+ ---
6334
+
6335
+ ##### `modelName`<sup>Required</sup> <a name="modelName" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.modelName"></a>
6336
+
6337
+ ```typescript
6338
+ public readonly modelName: string;
6339
+ ```
6340
+
6341
+ - *Type:* string
6342
+
6343
+ The model name.
6344
+
6345
+ ---
6346
+
6347
+ ##### `recommendedInstanceType`<sup>Required</sup> <a name="recommendedInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.recommendedInstanceType"></a>
6348
+
6349
+ ```typescript
6350
+ public readonly recommendedInstanceType: INeuronxInstanceType;
6351
+ ```
6352
+
6353
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
6354
+
6355
+ The recommended Neuron instance type for running inference with this compiled model.
6356
+
6357
+ ---
6358
+
6359
+ ##### `s3Prefix`<sup>Required</sup> <a name="s3Prefix" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Prefix"></a>
6360
+
6361
+ ```typescript
6362
+ public readonly s3Prefix: string;
6363
+ ```
6364
+
6365
+ - *Type:* string
6366
+
6367
+ S3 prefix under which the compiled artifact is uploaded.
6368
+
6369
+ ---
6370
+
6371
+ ##### `s3Uri`<sup>Required</sup> <a name="s3Uri" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Uri"></a>
6372
+
6373
+ ```typescript
6374
+ public readonly s3Uri: string;
6375
+ ```
6376
+
6377
+ - *Type:* string
6378
+
6379
+ S3 URI to which the compiled artifact is uploaded.
6380
+
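The relationship between `bucket`, `s3Prefix`, and `s3Uri` can be sketched as follows. This is only an illustration and assumes the URI takes the usual `s3://<bucket-name>/<prefix>` form. The helper `toS3Uri` is hypothetical and not part of `aws-cdk-neuronx-patterns`; the library may build the value differently.

```typescript
// Hypothetical helper (not part of aws-cdk-neuronx-patterns):
// joins a bucket name and a key prefix into an S3 URI.
// Assumption: the URI has the form s3://<bucket-name>/<prefix>.
function toS3Uri(bucketName: string, s3Prefix: string): string {
  if (bucketName.length === 0) {
    throw new RangeError("bucketName must not be empty");
  }
  // Drop leading slashes so the URI never contains "//" after the bucket name.
  const prefix = s3Prefix.replace(/^\/+/, "");
  return `s3://${bucketName}/${prefix}`;
}

console.log(toS3Uri("my-artifact-bucket", "compiled/my-model"));
```

In a real CDK app the bucket name is typically a deploy-time token, so string handling like the slash trimming above would only apply to the prefix, which is known at synthesis time.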
6381
+ ---
6382
+
6383
+ ##### `weightSize`<sup>Required</sup> <a name="weightSize" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.weightSize"></a>
6384
+
6385
+ ```typescript
6386
+ public readonly weightSize: Size;
6387
+ ```
6388
+
6389
+ - *Type:* aws-cdk-lib.Size
6390
+
6391
+ The weight size of the model.
6392
+
6393
+ ---
6394
+
6395
+ ### NeuronxCompilerBaseProps <a name="NeuronxCompilerBaseProps" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps"></a>
6396
+
6397
+ Common props for NeuronxCompilerBase.
6398
+
6399
+ #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.Initializer"></a>
6400
+
6401
+ ```typescript
6402
+ import { NeuronxCompilerBaseProps } from 'aws-cdk-neuronx-patterns'
6403
+
6404
+ const neuronxCompilerBaseProps: NeuronxCompilerBaseProps = { ... }
6405
+ ```
6406
+
6407
+ #### Properties <a name="Properties" id="Properties"></a>
6408
+
6409
+ | **Name** | **Type** | **Description** |
6410
+ | --- | --- | --- |
6411
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.artifactS3Prefix">artifactS3Prefix</a></code> | <code>string</code> | S3 prefix under which the compiled artifact is uploaded. |
6412
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
6413
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
6414
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
6415
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of compile worker instance. |
6416
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | VPC in which this will launch compile worker instance. |
6417
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.command">command</a></code> | <code>string[]</code> | The command to run in the container. |
6418
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
6419
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.secrets">secrets</a></code> | <code>{[ key: string ]: aws-cdk-lib.aws_batch.Secret}</code> | Secrets to pass to the container. |
6420
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.spot">spot</a></code> | <code>boolean</code> | Whether or not to use spot instances. |
6421
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.volumeSize">volumeSize</a></code> | <code>aws-cdk-lib.Size</code> | The root volume of worker instance. |
6422
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.vpcSubnets">vpcSubnets</a></code> | <code>aws-cdk-lib.aws_ec2.SubnetSelection</code> | The VPC Subnets this Compute Environment will launch instances in. |
6423
+
6424
+ ---
6425
+
6426
+ ##### `artifactS3Prefix`<sup>Required</sup> <a name="artifactS3Prefix" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.artifactS3Prefix"></a>
6427
+
6428
+ ```typescript
6429
+ public readonly artifactS3Prefix: string;
6430
+ ```
6431
+
6432
+ - *Type:* string
6433
+
6434
+ S3 prefix under which the compiled artifact is uploaded.
6435
+
6436
+ This property does not depend on the compile job finishing.
6437
+
6438
+ ---
6439
+
6440
+ ##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.bucket"></a>
6441
+
6442
+ ```typescript
6443
+ public readonly bucket: IBucket;
6444
+ ```
6445
+
6446
+ - *Type:* aws-cdk-lib.aws_s3.IBucket
6447
+
6448
+ The bucket to upload compiled artifacts.
6449
+
6450
+ ---
6451
+
6452
+ ##### `image`<sup>Required</sup> <a name="image" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.image"></a>
6453
+
6454
+ ```typescript
6455
+ public readonly image: INeuronxContainerImage;
6456
+ ```
6457
+
6458
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a>
6459
+
6460
+ An image of the container where the compile job is executed.
6461
+
6462
+ ---
6463
+
6464
+ ##### `model`<sup>Required</sup> <a name="model" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.model"></a>
6465
+
6466
+ ```typescript
6467
+ public readonly model: Model;
6468
+ ```
6469
+
6470
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.Model">Model</a>
6471
+
6472
+ The model to be compiled.
6473
+
6474
+ ---
6475
+
6476
+ ##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.neuronxInstanceType"></a>
6477
+
6478
+ ```typescript
6479
+ public readonly neuronxInstanceType: INeuronxInstanceType;
6480
+ ```
6481
+
6482
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
6483
+
6484
+ The instance type of compile worker instance.
6485
+
6486
+ ---
6487
+
6488
+ ##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.vpc"></a>
6489
+
6490
+ ```typescript
6491
+ public readonly vpc: IVpc;
6492
+ ```
6493
+
6494
+ - *Type:* aws-cdk-lib.aws_ec2.IVpc
6495
+
6496
+ VPC in which this will launch compile worker instance.
6497
+
6498
+ ---
6499
+
6500
+ ##### `command`<sup>Optional</sup> <a name="command" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.command"></a>
6501
+
6502
+ ```typescript
6503
+ public readonly command: string[];
6504
+ ```
6505
+
6506
+ - *Type:* string[]
6507
+
6508
+ The command to run in the container.
6509
+
6510
+ ---
6511
+
6512
+ ##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.environment"></a>
6513
+
6514
+ ```typescript
6515
+ public readonly environment: {[ key: string ]: string};
6516
+ ```
6517
+
6518
+ - *Type:* {[ key: string ]: string}
6519
+ - *Default:* No environment variables.
6520
+
6521
+ The environment variables to pass to the container.
6522
+
6523
+ This is only applicable when using container runtime.
6524
+
6525
+ ---
6526
+
6527
+ ##### `secrets`<sup>Optional</sup> <a name="secrets" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.secrets"></a>
6528
+
6529
+ ```typescript
6530
+ public readonly secrets: {[ key: string ]: Secret};
6531
+ ```
6532
+
6533
+ - *Type:* {[ key: string ]: aws-cdk-lib.aws_batch.Secret}
6534
+
6535
+ Secrets to pass to the container.
6536
+
6537
+ ---
6538
+
6539
+ ##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.spot"></a>
6540
+
6541
+ ```typescript
6542
+ public readonly spot: boolean;
6543
+ ```
6544
+
6545
+ - *Type:* boolean
6546
+ - *Default:* false
6547
+
6548
+ Whether or not to use spot instances.
6549
+
6550
+ Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at any time; your job will be given two minutes of notice before reclamation.
6551
+
6552
+ ---
6553
+
6554
+ ##### `volumeSize`<sup>Optional</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.volumeSize"></a>
6555
+
6556
+ ```typescript
6557
+ public readonly volumeSize: Size;
6558
+ ```
6559
+
6560
+ - *Type:* aws-cdk-lib.Size
6561
+ - *Default:* N billion parameters * 5GiB EBS
6562
+
6563
+ The root volume of worker instance.
6564
+
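The documented default, "N billion parameters * 5GiB EBS", amounts to simple arithmetic. The sketch below is a minimal illustration of that rule; `defaultVolumeSizeGiB` and the rounding up to a whole GiB are assumptions, not the library's actual implementation.

```typescript
// Hypothetical helper (not part of aws-cdk-neuronx-patterns).
// Documented rule: default volume size = N billion parameters * 5 GiB.
const GIB_PER_BILLION_PARAMETERS = 5;

function defaultVolumeSizeGiB(parametersInBillions: number): number {
  if (!Number.isFinite(parametersInBillions) || parametersInBillions <= 0) {
    throw new RangeError("parametersInBillions must be a positive number");
  }
  // Assumption: round up to a whole GiB, since EBS volumes are sized in whole GiB.
  return Math.ceil(parametersInBillions * GIB_PER_BILLION_PARAMETERS);
}

// A 7-billion-parameter model would default to a 35 GiB root volume.
console.log(defaultVolumeSizeGiB(7));
```

A caller who needs more headroom can pass an explicit `volumeSize` instead of relying on the default.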
6565
+ ---
6566
+
6567
+ ##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.vpcSubnets"></a>
6568
+
6569
+ ```typescript
6570
+ public readonly vpcSubnets: SubnetSelection;
5850
6571
  ```
5851
6572
 
5852
- - *Type:* number
5853
- - *Default:* no gpus
5854
-
5855
- The number of physical GPUs to reserve for the container.
6573
+ - *Type:* aws-cdk-lib.aws_ec2.SubnetSelection
6574
+ - *Default:* new subnets will be created
5856
6575
 
5857
- Make sure that the number of GPUs reserved for all containers in a job doesn't exceed
5858
- the number of available GPUs on the compute resource that the job is launched on.
6576
+ The VPC Subnets this Compute Environment will launch instances in.
5859
6577
 
5860
6578
  ---
5861
6579
 
5862
- ##### `privileged`<sup>Optional</sup> <a name="privileged" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.privileged"></a>
6580
+ ### NeuronxCrossCompilerProps <a name="NeuronxCrossCompilerProps" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps"></a>
6581
+
6582
+ Props of NeuronxCrossCompiler.
6583
+
6584
+ #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.Initializer"></a>
5863
6585
 
5864
6586
  ```typescript
5865
- public readonly privileged: boolean;
6587
+ import { NeuronxCrossCompilerProps } from 'aws-cdk-neuronx-patterns'
6588
+
6589
+ const neuronxCrossCompilerProps: NeuronxCrossCompilerProps = { ... }
5866
6590
  ```
5867
6591
 
5868
- - *Type:* boolean
5869
- - *Default:* false
6592
+ #### Properties <a name="Properties" id="Properties"></a>
5870
6593
 
5871
- When this parameter is true, the container is given elevated permissions on the host container instance (similar to the root user).
6594
+ | **Name** | **Type** | **Description** |
6595
+ | --- | --- | --- |
6596
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.artifactS3Prefix">artifactS3Prefix</a></code> | <code>string</code> | S3 prefix under which the compiled artifact is uploaded. |
6597
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
6598
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
6599
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
6600
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of compile worker instance. |
6601
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | VPC in which this will launch compile worker instance. |
6602
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.command">command</a></code> | <code>string[]</code> | The command to run in the container. |
6603
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
6604
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.secrets">secrets</a></code> | <code>{[ key: string ]: aws-cdk-lib.aws_batch.Secret}</code> | Secrets to pass to the container. |
6605
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.spot">spot</a></code> | <code>boolean</code> | Whether or not to use spot instances. |
6606
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.volumeSize">volumeSize</a></code> | <code>aws-cdk-lib.Size</code> | The root volume of worker instance. |
6607
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.vpcSubnets">vpcSubnets</a></code> | <code>aws-cdk-lib.aws_ec2.SubnetSelection</code> | The VPC Subnets this Compute Environment will launch instances in. |
6608
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.compileInstanceType">compileInstanceType</a></code> | <code>aws-cdk-lib.aws_ec2.InstanceType</code> | The EC2 instance type to use for cross-compilation. |
5872
6609
 
5873
6610
  ---
5874
6611
 
5875
- ##### `ulimits`<sup>Optional</sup> <a name="ulimits" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.ulimits"></a>
6612
+ ##### `artifactS3Prefix`<sup>Required</sup> <a name="artifactS3Prefix" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.artifactS3Prefix"></a>
5876
6613
 
5877
6614
  ```typescript
5878
- public readonly ulimits: Ulimit[];
6615
+ public readonly artifactS3Prefix: string;
5879
6616
  ```
5880
6617
 
5881
- - *Type:* aws-cdk-lib.aws_batch.Ulimit[]
5882
- - *Default:* no ulimits
6618
+ - *Type:* string
5883
6619
 
5884
- Limits to set for the user this docker container will run as.
6620
+ S3 prefix under which the compiled artifact is uploaded.
6621
+
6622
+ This property does not depend on the compile job finishing.
5885
6623
 
5886
6624
  ---
5887
6625
 
5888
- ##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.neuronxInstanceType"></a>
6626
+ ##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.bucket"></a>
5889
6627
 
5890
6628
  ```typescript
5891
- public readonly neuronxInstanceType: INeuronxInstanceType;
6629
+ public readonly bucket: IBucket;
5892
6630
  ```
5893
6631
 
5894
- - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
6632
+ - *Type:* aws-cdk-lib.aws_s3.IBucket
5895
6633
 
5896
- The instance type of worker instance.
6634
+ The bucket to upload compiled artifacts.
5897
6635
 
5898
6636
  ---
5899
6637
 
5900
- ##### `volumeSize`<sup>Required</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.volumeSize"></a>
6638
+ ##### `image`<sup>Required</sup> <a name="image" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.image"></a>
5901
6639
 
5902
6640
  ```typescript
5903
- public readonly volumeSize: Size;
6641
+ public readonly image: INeuronxContainerImage;
5904
6642
  ```
5905
6643
 
5906
- - *Type:* aws-cdk-lib.Size
5907
- - *Default:* N bilion parameters * 5GiB EBS
6644
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a>
5908
6645
 
5909
- The root volume of worker instance.
6646
+ An image of the container where the compile job is executed.
5910
6647
 
5911
6648
  ---
5912
6649
 
5913
- ##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.vpc"></a>
6650
+ ##### `model`<sup>Required</sup> <a name="model" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.model"></a>
5914
6651
 
5915
6652
  ```typescript
5916
- public readonly vpc: IVpc;
6653
+ public readonly model: Model;
5917
6654
  ```
5918
6655
 
5919
- - *Type:* aws-cdk-lib.aws_ec2.IVpc
6656
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.Model">Model</a>
5920
6657
 
5921
- VPC in which this will launch worker instance.
6658
+ The model to be compiled.
5922
6659
 
5923
6660
  ---
5924
6661
 
5925
- ##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.spot"></a>
6662
+ ##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.neuronxInstanceType"></a>
5926
6663
 
5927
6664
  ```typescript
5928
- public readonly spot: boolean;
6665
+ public readonly neuronxInstanceType: INeuronxInstanceType;
5929
6666
  ```
5930
6667
 
5931
- - *Type:* boolean
5932
- - *Default:* false
5933
-
5934
- Whether or not to use spot instances.
6668
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
5935
6669
 
5936
- Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at any time;
5937
- your job will be given two minutes of notice before reclamation.
6670
+ The instance type of compile worker instance.
5938
6671
 
5939
6672
  ---
5940
6673
 
5941
- ##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.vpcSubnets"></a>
6674
+ ##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.vpc"></a>
5942
6675
 
5943
6676
  ```typescript
5944
- public readonly vpcSubnets: SubnetSelection;
6677
+ public readonly vpc: IVpc;
5945
6678
  ```
5946
6679
 
5947
- - *Type:* aws-cdk-lib.aws_ec2.SubnetSelection
5948
- - *Default:* new subnets will be created
6680
+ - *Type:* aws-cdk-lib.aws_ec2.IVpc
5949
6681
 
5950
- The VPC Subnets this Compute Environment will launch instances in.
6682
+ VPC in which this will launch compile worker instance.
5951
6683
 
5952
6684
  ---
5953
6685
 
5954
- ### NeuronxCompiledModel <a name="NeuronxCompiledModel" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel"></a>
5955
-
5956
- #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.Initializer"></a>
6686
+ ##### `command`<sup>Optional</sup> <a name="command" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.command"></a>
5957
6687
 
5958
6688
  ```typescript
5959
- import { NeuronxCompiledModel } from 'aws-cdk-neuronx-patterns'
5960
-
5961
- const neuronxCompiledModel: NeuronxCompiledModel = { ... }
6689
+ public readonly command: string[];
5962
6690
  ```
5963
6691
 
5964
- #### Properties <a name="Properties" id="Properties"></a>
6692
+ - *Type:* string[]
5965
6693
 
5966
- | **Name** | **Type** | **Description** |
5967
- | --- | --- | --- |
5968
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
5969
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.compileTimeInstanceType">compileTimeInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | *No description.* |
5970
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.modelName">modelName</a></code> | <code>string</code> | The model name. |
5971
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Prefix">s3Prefix</a></code> | <code>string</code> | S3 prefix that compiled artifact uploaded. |
5972
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Uri">s3Uri</a></code> | <code>string</code> | S3 URL that compiled artifact uploaded. |
5973
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.weightSize">weightSize</a></code> | <code>aws-cdk-lib.Size</code> | *No description.* |
6694
+ The command to run in the container.
5974
6695
 
5975
6696
  ---
5976
6697
 
5977
- ##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.bucket"></a>
6698
+ ##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.environment"></a>
5978
6699
 
5979
6700
  ```typescript
5980
- public readonly bucket: IBucket;
6701
+ public readonly environment: {[ key: string ]: string};
5981
6702
  ```
5982
6703
 
5983
- - *Type:* aws-cdk-lib.aws_s3.IBucket
6704
+ - *Type:* {[ key: string ]: string}
6705
+ - *Default:* No environment variables.
5984
6706
 
5985
- The bucket to upload compiled artifacts.
6707
+ The environment variables to pass to the container.
6708
+
6709
+ This is only applicable when using container runtime.
5986
6710
 
5987
6711
  ---
5988
6712
 
5989
- ##### `compileTimeInstanceType`<sup>Required</sup> <a name="compileTimeInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.compileTimeInstanceType"></a>
6713
+ ##### `secrets`<sup>Optional</sup> <a name="secrets" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.secrets"></a>
5990
6714
 
5991
6715
  ```typescript
5992
- public readonly compileTimeInstanceType: INeuronxInstanceType;
6716
+ public readonly secrets: {[ key: string ]: Secret};
5993
6717
  ```
5994
6718
 
5995
- - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
6719
+ - *Type:* {[ key: string ]: aws-cdk-lib.aws_batch.Secret}
6720
+
6721
+ Secrets to pass to the container.
5996
6722
 
5997
6723
  ---
5998
6724
 
5999
- ##### `modelName`<sup>Required</sup> <a name="modelName" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.modelName"></a>
6725
+ ##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.spot"></a>
6000
6726
 
6001
6727
  ```typescript
6002
- public readonly modelName: string;
6728
+ public readonly spot: boolean;
6003
6729
  ```
6004
6730
 
6005
- - *Type:* string
6731
+ - *Type:* boolean
6732
+ - *Default:* false
6006
6733
 
6007
- The model name.
6734
+ Whether or not to use spot instances.
6735
+
6736
+ Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at any time; your job will be given two minutes of notice before reclamation.
6008
6737
 
6009
6738
  ---
6010
6739
 
6011
- ##### `s3Prefix`<sup>Required</sup> <a name="s3Prefix" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Prefix"></a>
6740
+ ##### `volumeSize`<sup>Optional</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.volumeSize"></a>
6012
6741
 
6013
6742
  ```typescript
6014
- public readonly s3Prefix: string;
6743
+ public readonly volumeSize: Size;
6015
6744
  ```
6016
6745
 
6017
- - *Type:* string
6746
+ - *Type:* aws-cdk-lib.Size
6747
+ - *Default:* N billion parameters * 5GiB EBS
6018
6748
 
6019
- S3 prefix that compiled artifact uploaded.
6749
+ The root volume of worker instance.
6020
6750
 
6021
6751
  ---
6022
6752
 
6023
- ##### `s3Uri`<sup>Required</sup> <a name="s3Uri" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Uri"></a>
6753
+ ##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.vpcSubnets"></a>
6024
6754
 
6025
6755
  ```typescript
6026
- public readonly s3Uri: string;
6756
+ public readonly vpcSubnets: SubnetSelection;
6027
6757
  ```
6028
6758
 
6029
- - *Type:* string
6759
+ - *Type:* aws-cdk-lib.aws_ec2.SubnetSelection
6760
+ - *Default:* new subnets will be created
6030
6761
 
6031
- S3 URL that compiled artifact uploaded.
6762
+ The VPC Subnets this Compute Environment will launch instances in.
6032
6763
 
6033
6764
  ---
6034
6765
 
6035
- ##### `weightSize`<sup>Required</sup> <a name="weightSize" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.weightSize"></a>
6766
+ ##### `compileInstanceType`<sup>Optional</sup> <a name="compileInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.compileInstanceType"></a>
6036
6767
 
6037
6768
  ```typescript
6038
- public readonly weightSize: Size;
6769
+ public readonly compileInstanceType: InstanceType;
6039
6770
  ```
6040
6771
 
6041
- - *Type:* aws-cdk-lib.Size
6772
+ - *Type:* aws-cdk-lib.aws_ec2.InstanceType
6773
+ - *Default:* ec2.InstanceType.of(ec2.InstanceClass.C7I, ec2.InstanceSize.XLARGE4)
6774
+
6775
+ The EC2 instance type to use for cross-compilation.
6776
+
6777
+ This should be a non-Neuron instance type with sufficient memory and CPU
6778
+ for model compilation.
6042
6779
 
6043
6780
  ---
6044
6781
 
6045
- ### NeuronxCompilerProps <a name="NeuronxCompilerProps" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps"></a>
6782
+ ### NeuronxNativeCompilerProps <a name="NeuronxNativeCompilerProps" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps"></a>
6046
6783
 
6047
- Props of NeuronxCompiler.
6784
+ Props of NeuronxNativeCompiler.
6048
6785
 
6049
- #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.Initializer"></a>
6786
+ #### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.Initializer"></a>
6050
6787
 
6051
6788
  ```typescript
6052
- import { NeuronxCompilerProps } from 'aws-cdk-neuronx-patterns'
6789
+ import { NeuronxNativeCompilerProps } from 'aws-cdk-neuronx-patterns'
6053
6790
 
6054
- const neuronxCompilerProps: NeuronxCompilerProps = { ... }
6791
+ const neuronxNativeCompilerProps: NeuronxNativeCompilerProps = { ... }
6055
6792
  ```
6056
6793
 
6057
6794
  #### Properties <a name="Properties" id="Properties"></a>
6058
6795
 
6059
6796
  | **Name** | **Type** | **Description** |
6060
6797
  | --- | --- | --- |
6061
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.artifactS3Prefix">artifactS3Prefix</a></code> | <code>string</code> | S3 Prefix that compiled artifact uploaded. |
6062
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
6063
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
6064
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
6065
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of compile worker instance. |
6066
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | VPC in which this will launch compile worker instance. |
6067
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.command">command</a></code> | <code>string[]</code> | *No description.* |
6068
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
6069
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.secrets">secrets</a></code> | <code>{[ key: string ]: aws-cdk-lib.aws_batch.Secret}</code> | Secrets to pass to the container. |
6070
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.spot">spot</a></code> | <code>boolean</code> | Whether or not to use spot instances. |
6071
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.volumeSize">volumeSize</a></code> | <code>aws-cdk-lib.Size</code> | The root volume of worker instance. |
6072
- | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.vpcSubnets">vpcSubnets</a></code> | <code>aws-cdk-lib.aws_ec2.SubnetSelection</code> | The VPC Subnets this Compute Environment will launch instances in. |
6798
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.artifactS3Prefix">artifactS3Prefix</a></code> | <code>string</code> | S3 Prefix that compiled artifact uploaded. |
6799
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
6800
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
6801
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
6802
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of compile worker instance. |
6803
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | VPC in which this will launch compile worker instance. |
6804
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.command">command</a></code> | <code>string[]</code> | The command to run in the container. |
6805
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
6806
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.secrets">secrets</a></code> | <code>{[ key: string ]: aws-cdk-lib.aws_batch.Secret}</code> | Secrets to pass to the container. |
6807
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.spot">spot</a></code> | <code>boolean</code> | Whether or not to use spot instances. |
6808
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.volumeSize">volumeSize</a></code> | <code>aws-cdk-lib.Size</code> | The root volume of worker instance. |
6809
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.vpcSubnets">vpcSubnets</a></code> | <code>aws-cdk-lib.aws_ec2.SubnetSelection</code> | The VPC Subnets this Compute Environment will launch instances in. |
6073
6810
 
6074
6811
  ---
6075
6812
 
6076
- ##### `artifactS3Prefix`<sup>Required</sup> <a name="artifactS3Prefix" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.artifactS3Prefix"></a>
6813
+ ##### `artifactS3Prefix`<sup>Required</sup> <a name="artifactS3Prefix" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.artifactS3Prefix"></a>
6077
6814
 
6078
6815
  ```typescript
6079
6816
  public readonly artifactS3Prefix: string;
@@ -6087,7 +6824,7 @@ This property is not depends on compile job finish.
6087
6824
 
6088
6825
  ---
6089
6826
 
6090
- ##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.bucket"></a>
6827
+ ##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.bucket"></a>
6091
6828
 
6092
6829
  ```typescript
6093
6830
  public readonly bucket: IBucket;
@@ -6099,7 +6836,7 @@ The bucket to upload compiled artifacts.
6099
6836
 
6100
6837
  ---
6101
6838
 
6102
- ##### `image`<sup>Required</sup> <a name="image" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.image"></a>
6839
+ ##### `image`<sup>Required</sup> <a name="image" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.image"></a>
6103
6840
 
6104
6841
  ```typescript
6105
6842
  public readonly image: INeuronxContainerImage;
@@ -6111,7 +6848,7 @@ An image of the container where the compile job is executed.
6111
6848
 
6112
6849
  ---
6113
6850
 
6114
- ##### `model`<sup>Required</sup> <a name="model" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.model"></a>
6851
+ ##### `model`<sup>Required</sup> <a name="model" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.model"></a>
6115
6852
 
6116
6853
  ```typescript
6117
6854
  public readonly model: Model;
@@ -6123,7 +6860,7 @@ The model to be compiled.
6123
6860
 
6124
6861
  ---
6125
6862
 
6126
- ##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.neuronxInstanceType"></a>
6863
+ ##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.neuronxInstanceType"></a>
6127
6864
 
6128
6865
  ```typescript
6129
6866
  public readonly neuronxInstanceType: INeuronxInstanceType;
@@ -6135,7 +6872,7 @@ The instance type of compile worker instance.
6135
6872
 
6136
6873
  ---
6137
6874
 
6138
- ##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.vpc"></a>
6875
+ ##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.vpc"></a>
6139
6876
 
6140
6877
  ```typescript
6141
6878
  public readonly vpc: IVpc;
@@ -6147,7 +6884,7 @@ VPC in which this will launch compile worker instance.
6147
6884
 
6148
6885
  ---
6149
6886
 
6150
- ##### `command`<sup>Optional</sup> <a name="command" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.command"></a>
6887
+ ##### `command`<sup>Optional</sup> <a name="command" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.command"></a>
6151
6888
 
6152
6889
  ```typescript
6153
6890
  public readonly command: string[];
@@ -6155,9 +6892,11 @@ public readonly command: string[];
6155
6892
 
6156
6893
  - *Type:* string[]
6157
6894
 
6895
+ The command to run in the container.
6896
+
6158
6897
  ---
6159
6898
 
6160
- ##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.environment"></a>
6899
+ ##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.environment"></a>
6161
6900
 
6162
6901
  ```typescript
6163
6902
  public readonly environment: {[ key: string ]: string};
@@ -6172,7 +6911,7 @@ This is only applicable when using container runtime.
6172
6911
 
6173
6912
  ---
6174
6913
 
6175
- ##### `secrets`<sup>Optional</sup> <a name="secrets" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.secrets"></a>
6914
+ ##### `secrets`<sup>Optional</sup> <a name="secrets" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.secrets"></a>
6176
6915
 
6177
6916
  ```typescript
6178
6917
  public readonly secrets: {[ key: string ]: Secret};
@@ -6184,7 +6923,7 @@ Secrets to pass to the container.
6184
6923
 
6185
6924
  ---
6186
6925
 
6187
- ##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.spot"></a>
6926
+ ##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.spot"></a>
6188
6927
 
6189
6928
  ```typescript
6190
6929
  public readonly spot: boolean;
@@ -6199,20 +6938,20 @@ Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at
6199
6938
 
6200
6939
  ---
6201
6940
 
6202
- ##### `volumeSize`<sup>Optional</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.volumeSize"></a>
6941
+ ##### `volumeSize`<sup>Optional</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.volumeSize"></a>
6203
6942
 
6204
6943
  ```typescript
6205
6944
  public readonly volumeSize: Size;
6206
6945
  ```
6207
6946
 
6208
6947
  - *Type:* aws-cdk-lib.Size
6209
- - *Default:* N bilion parameters * 5GiB EBS
6948
+ - *Default:* N billion parameters * 5GiB EBS
6210
6949
 
6211
6950
  The root volume of worker instance.
6212
6951
 
6213
6952
  ---
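The `volumeSize` default above reads as a simple product: 5 GiB of root EBS volume per billion model parameters. A minimal sketch of that arithmetic follows. The helper name `defaultVolumeSizeGiB` and the round-up behaviour are assumptions made for illustration, not code taken from the package.

```typescript
// Hypothetical helper mirroring the documented default
// "N billion parameters * 5GiB EBS". Not part of aws-cdk-neuronx-patterns.
const GIB_PER_BILLION_PARAMETERS = 5;

function defaultVolumeSizeGiB(billionParameters: number): number {
  if (!Number.isFinite(billionParameters) || billionParameters <= 0) {
    throw new RangeError('billionParameters must be a positive, finite number');
  }
  // Assumption: fractional results are rounded up to a whole GiB.
  return Math.ceil(billionParameters * GIB_PER_BILLION_PARAMETERS);
}

console.log(defaultVolumeSizeGiB(8)); // 8 * 5 = 40
```

Under this reading, an 8-billion-parameter model would get a 40 GiB root volume unless `volumeSize` is set explicitly.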
6214
6953
 
6215
- ##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxCompilerProps.property.vpcSubnets"></a>
6954
+ ##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.vpcSubnets"></a>
6216
6955
 
6217
6956
  ```typescript
6218
6957
  public readonly vpcSubnets: SubnetSelection;
@@ -10554,11 +11293,11 @@ const vllmNxdInferenceCompiledModel: VllmNxdInferenceCompiledModel = { ... }
10554
11293
  | **Name** | **Type** | **Description** |
10555
11294
  | --- | --- | --- |
10556
11295
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
10557
- | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.compileTimeInstanceType">compileTimeInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | *No description.* |
10558
11296
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.modelName">modelName</a></code> | <code>string</code> | The model name. |
11297
+ | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.recommendedInstanceType">recommendedInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The recommended Neuron instance type for running inference with this compiled model. |
10559
11298
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.s3Prefix">s3Prefix</a></code> | <code>string</code> | S3 prefix that compiled artifact uploaded. |
10560
11299
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.s3Uri">s3Uri</a></code> | <code>string</code> | S3 URL that compiled artifact uploaded. |
10561
- | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.weightSize">weightSize</a></code> | <code>aws-cdk-lib.Size</code> | *No description.* |
11300
+ | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.weightSize">weightSize</a></code> | <code>aws-cdk-lib.Size</code> | The weight size of the model. |
10562
11301
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.vllmArgs">vllmArgs</a></code> | <code><a href="#aws-cdk-neuronx-patterns.VllmEngineArguments">VllmEngineArguments</a></code> | Passed to the vllm engine at compile time. |
10563
11302
 
10564
11303
  ---
@@ -10575,25 +11314,27 @@ The bucket to upload compiled artifacts.
10575
11314
 
10576
11315
  ---
10577
11316
 
10578
- ##### `compileTimeInstanceType`<sup>Required</sup> <a name="compileTimeInstanceType" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.compileTimeInstanceType"></a>
11317
+ ##### `modelName`<sup>Required</sup> <a name="modelName" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.modelName"></a>
10579
11318
 
10580
11319
  ```typescript
10581
- public readonly compileTimeInstanceType: INeuronxInstanceType;
11320
+ public readonly modelName: string;
10582
11321
  ```
10583
11322
 
10584
- - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
11323
+ - *Type:* string
11324
+
11325
+ The model name.
10585
11326
 
10586
11327
  ---
10587
11328
 
10588
- ##### `modelName`<sup>Required</sup> <a name="modelName" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.modelName"></a>
11329
+ ##### `recommendedInstanceType`<sup>Required</sup> <a name="recommendedInstanceType" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.recommendedInstanceType"></a>
10589
11330
 
10590
11331
  ```typescript
10591
- public readonly modelName: string;
11332
+ public readonly recommendedInstanceType: INeuronxInstanceType;
10592
11333
  ```
10593
11334
 
10594
- - *Type:* string
11335
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
10595
11336
 
10596
- The model name.
11337
+ The recommended Neuron instance type for running inference with this compiled model.
10597
11338
 
10598
11339
  ---
10599
11340
 
@@ -10629,6 +11370,8 @@ public readonly weightSize: Size;
10629
11370
 
10630
11371
  - *Type:* aws-cdk-lib.Size
10631
11372
 
11373
+ The weight size of the model.
11374
+
10632
11375
  ---
10633
11376
 
10634
11377
  ##### `vllmArgs`<sup>Required</sup> <a name="vllmArgs" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.vllmArgs"></a>
@@ -10662,6 +11405,7 @@ const vllmNxdInferenceCompileProps: VllmNxdInferenceCompileProps = { ... }
10662
11405
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
10663
11406
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
10664
11407
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | VPC in which this will launch compile worker instance. |
11408
+ | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.compileInstanceType">compileInstanceType</a></code> | <code>aws-cdk-lib.aws_ec2.InstanceType</code> | The EC2 instance type to use for cross-compilation. |
10665
11409
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
10666
11410
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
10667
11411
  | <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of compile worker instance. |
@@ -10708,6 +11452,21 @@ VPC in which this will launch compile worker instance.
10708
11452
 
10709
11453
  ---
10710
11454
 
11455
+ ##### `compileInstanceType`<sup>Optional</sup> <a name="compileInstanceType" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.compileInstanceType"></a>
11456
+
11457
+ ```typescript
11458
+ public readonly compileInstanceType: InstanceType;
11459
+ ```
11460
+
11461
+ - *Type:* aws-cdk-lib.aws_ec2.InstanceType
11462
+ - *Default:* Automatically selected based on model size
11463
+
11464
+ The EC2 instance type to use for cross-compilation.
11465
+
11466
+ This should be a non-Neuron instance type with sufficient memory for model compilation.
11467
+
11468
+ ---
11469
+
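The `compileInstanceType` entry above describes two behaviours: an explicit, non-Neuron instance type, or an automatic choice based on model size. The sketch below shows one way such a resolution could work. The selection rule the package actually uses is not visible in this diff, so the tier table, the thresholds, the instance type strings, and the function names are all hypothetical, and plain strings stand in for `aws-cdk-lib.aws_ec2.InstanceType`.

```typescript
// Local stand-in for the optional prop. The real type is
// aws-cdk-lib.aws_ec2.InstanceType; a string is used to keep this self-contained.
interface CompileInstanceChoice {
  readonly compileInstanceType?: string;
}

// Hypothetical size tiers. The package's actual thresholds are not shown in this diff.
const HYPOTHETICAL_TIERS: ReadonlyArray<{ maxBillionParameters: number; instanceType: string }> = [
  { maxBillionParameters: 8, instanceType: 'm7i.4xlarge' },
  { maxBillionParameters: 70, instanceType: 'r7i.8xlarge' },
  { maxBillionParameters: Infinity, instanceType: 'r7i.16xlarge' },
];

// Neuron instance families start with "inf" or "trn" (for example inf2, trn1, trn2u).
function isNeuronInstanceType(instanceType: string): boolean {
  return /^(inf|trn)\d/.test(instanceType);
}

function resolveCompileInstanceType(
  billionParameters: number,
  choice: CompileInstanceChoice = {},
): string {
  const explicit = choice.compileInstanceType;
  if (explicit !== undefined) {
    // Assumption: this sketch rejects Neuron types outright, which is stricter
    // than the documented "should be a non-Neuron instance type".
    if (isNeuronInstanceType(explicit)) {
      throw new Error(`${explicit} is a Neuron instance type; pick a non-Neuron type`);
    }
    return explicit;
  }
  // No explicit choice: fall back to the first tier that fits the model size.
  const tier = HYPOTHETICAL_TIERS.find((t) => billionParameters <= t.maxBillionParameters);
  if (tier === undefined) {
    throw new RangeError('billionParameters must be a number');
  }
  return tier.instanceType;
}
```

An explicit value always wins in this sketch; the size-based fallback only applies when the prop is omitted, which matches the `Optional` marker and the documented default.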
10711
11470
  ##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.environment"></a>
10712
11471
 
10713
11472
  ```typescript
@@ -11788,6 +12547,9 @@ new NeuronxInstanceType()
11788
12547
  | <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.INF2_XLARGE">INF2_XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | inf2.xlarge. |
11789
12548
  | <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN1_2XLARGE">TRN1_2XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn1.2xlarge. |
11790
12549
  | <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN1_32XLARGE">TRN1_32XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn1.32xlarge. |
12550
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2_3XLARGE">TRN2_3XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn2.3xlarge. |
12551
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2_48XLARGE">TRN2_48XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn2.48xlarge. |
12552
+ | <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2U_48XLARGE">TRN2U_48XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn2u.48xlarge. |
11791
12553
 
11792
12554
  ---
11793
12555
 
@@ -11863,6 +12625,42 @@ trn1.32xlarge.
11863
12625
 
11864
12626
  ---
11865
12627
 
12628
+ ##### `TRN2_3XLARGE`<sup>Required</sup> <a name="TRN2_3XLARGE" id="aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2_3XLARGE"></a>
12629
+
12630
+ ```typescript
12631
+ public readonly TRN2_3XLARGE: INeuronxInstanceType;
12632
+ ```
12633
+
12634
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
12635
+
12636
+ trn2.3xlarge.
12637
+
12638
+ ---
12639
+
12640
+ ##### `TRN2_48XLARGE`<sup>Required</sup> <a name="TRN2_48XLARGE" id="aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2_48XLARGE"></a>
12641
+
12642
+ ```typescript
12643
+ public readonly TRN2_48XLARGE: INeuronxInstanceType;
12644
+ ```
12645
+
12646
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
12647
+
12648
+ trn2.48xlarge.
12649
+
12650
+ ---
12651
+
12652
+ ##### `TRN2U_48XLARGE`<sup>Required</sup> <a name="TRN2U_48XLARGE" id="aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2U_48XLARGE"></a>
12653
+
12654
+ ```typescript
12655
+ public readonly TRN2U_48XLARGE: INeuronxInstanceType;
12656
+ ```
12657
+
12658
+ - *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
12659
+
12660
+ trn2u.48xlarge.
12661
+
12662
+ ---
12663
+
11866
12664
  ### Parameters <a name="Parameters" id="aws-cdk-neuronx-patterns.Parameters"></a>
11867
12665
 
11868
12666
  Represents the amount of parameters.
@@ -12679,6 +13477,73 @@ public readonly neuronxCores: number;
12679
13477
  ---
12680
13478
 
12681
13479
 
13480
+ ### Trainium2Chips <a name="Trainium2Chips" id="aws-cdk-neuronx-patterns.Trainium2Chips"></a>
13481
+
13482
+ - *Implements:* <a href="#aws-cdk-neuronx-patterns.IAcceleratorChips">IAcceleratorChips</a>
13483
+
13484
+ #### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.Trainium2Chips.Initializer"></a>
13485
+
13486
+ ```typescript
13487
+ import { Trainium2Chips } from 'aws-cdk-neuronx-patterns'
13488
+
13489
+ new Trainium2Chips(chips: number)
13490
+ ```
13491
+
13492
+ | **Name** | **Type** | **Description** |
13493
+ | --- | --- | --- |
13494
+ | <code><a href="#aws-cdk-neuronx-patterns.Trainium2Chips.Initializer.parameter.chips">chips</a></code> | <code>number</code> | *No description.* |
13495
+
13496
+ ---
13497
+
13498
+ ##### `chips`<sup>Required</sup> <a name="chips" id="aws-cdk-neuronx-patterns.Trainium2Chips.Initializer.parameter.chips"></a>
13499
+
13500
+ - *Type:* number
13501
+
13502
+ ---
13503
+
13504
+
13505
+
13506
+ #### Properties <a name="Properties" id="Properties"></a>
13507
+
13508
+ | **Name** | **Type** | **Description** |
13509
+ | --- | --- | --- |
13510
+ | <code><a href="#aws-cdk-neuronx-patterns.Trainium2Chips.property.acceleratorMemory">acceleratorMemory</a></code> | <code>aws-cdk-lib.Size</code> | *No description.* |
13511
+ | <code><a href="#aws-cdk-neuronx-patterns.Trainium2Chips.property.chips">chips</a></code> | <code>number</code> | *No description.* |
13512
+ | <code><a href="#aws-cdk-neuronx-patterns.Trainium2Chips.property.neuronxCores">neuronxCores</a></code> | <code>number</code> | *No description.* |
13513
+
13514
+ ---
13515
+
13516
+ ##### `acceleratorMemory`<sup>Required</sup> <a name="acceleratorMemory" id="aws-cdk-neuronx-patterns.Trainium2Chips.property.acceleratorMemory"></a>
13517
+
13518
+ ```typescript
13519
+ public readonly acceleratorMemory: Size;
13520
+ ```
13521
+
13522
+ - *Type:* aws-cdk-lib.Size
13523
+
13524
+ ---
13525
+
13526
+ ##### `chips`<sup>Required</sup> <a name="chips" id="aws-cdk-neuronx-patterns.Trainium2Chips.property.chips"></a>
13527
+
13528
+ ```typescript
13529
+ public readonly chips: number;
13530
+ ```
13531
+
13532
+ - *Type:* number
13533
+
13534
+ ---
13535
+
13536
+ ##### `neuronxCores`<sup>Required</sup> <a name="neuronxCores" id="aws-cdk-neuronx-patterns.Trainium2Chips.property.neuronxCores"></a>
13537
+
13538
+ ```typescript
13539
+ public readonly neuronxCores: number;
13540
+ ```
13541
+
13542
+ - *Type:* number
13543
+
13544
+ ---
13545
+
13546
+
12682
13547
  ### VllmEngineArgumentsParser <a name="VllmEngineArgumentsParser" id="aws-cdk-neuronx-patterns.VllmEngineArgumentsParser"></a>
12683
13548
 
12684
13549
  #### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.VllmEngineArgumentsParser.Initializer"></a>
@@ -13117,7 +13982,7 @@ The neuronx SDK version.
13117
13982
 
13118
13983
  ### IAcceleratorChips <a name="IAcceleratorChips" id="aws-cdk-neuronx-patterns.IAcceleratorChips"></a>
13119
13984
 
13120
- - *Implemented By:* <a href="#aws-cdk-neuronx-patterns.Inferentia2Chips">Inferentia2Chips</a>, <a href="#aws-cdk-neuronx-patterns.Trainium1Chips">Trainium1Chips</a>, <a href="#aws-cdk-neuronx-patterns.IAcceleratorChips">IAcceleratorChips</a>
13985
+ - *Implemented By:* <a href="#aws-cdk-neuronx-patterns.Inferentia2Chips">Inferentia2Chips</a>, <a href="#aws-cdk-neuronx-patterns.Trainium1Chips">Trainium1Chips</a>, <a href="#aws-cdk-neuronx-patterns.Trainium2Chips">Trainium2Chips</a>, <a href="#aws-cdk-neuronx-patterns.IAcceleratorChips">IAcceleratorChips</a>
13121
13986
 
13122
13987
 
13123
13988
  #### Properties <a name="Properties" id="Properties"></a>
@@ -13160,6 +14025,27 @@ public readonly neuronxCores: number;
13160
14025
 
13161
14026
  ---
13162
14027
 
14028
+ ### INeuronxCompiler <a name="INeuronxCompiler" id="aws-cdk-neuronx-patterns.INeuronxCompiler"></a>
14029
+
14030
+ - *Implemented By:* <a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase">NeuronxCompilerBase</a>, <a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler">NeuronxCrossCompiler</a>, <a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler">NeuronxNativeCompiler</a>, <a href="#aws-cdk-neuronx-patterns.INeuronxCompiler">INeuronxCompiler</a>
14031
+
14032
+ Interface for Neuronx compilers.
14033
+
14034
+ #### Methods <a name="Methods" id="Methods"></a>
14035
+
14036
+ | **Name** | **Description** |
14037
+ | --- | --- |
14038
+ | <code><a href="#aws-cdk-neuronx-patterns.INeuronxCompiler.compile">compile</a></code> | *No description.* |
14039
+
14040
+ ---
14041
+
14042
+ ##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.INeuronxCompiler.compile"></a>
14043
+
14044
+ ```typescript
14045
+ public compile(): NeuronxCompiledModel
14046
+ ```
14047
+
14048
+
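The `INeuronxCompiler` entry above lists three implementers behind a single `compile()` method, which suggests callers can stay independent of whether compilation runs natively or as a cross-compile. The sketch below illustrates that pattern with local stand-ins. The class names, the `compiledBy` field, and the `describeDeployment` helper are hypothetical, and the shape of the real `NeuronxCompiledModel` return type is not shown in this part of the diff.

```typescript
// Local mirror of the compile result; both field names are hypothetical.
interface CompiledModelSketch {
  readonly s3Uri: string;
  readonly compiledBy: 'native' | 'cross';
}

// Local mirror of the documented interface: a single compile() method.
interface NeuronxCompilerSketch {
  compile(): CompiledModelSketch;
}

// Stand-in for a compiler that presumably runs on a Neuron instance.
class NativeCompilerSketch implements NeuronxCompilerSketch {
  private readonly s3Uri: string;

  constructor(s3Uri: string) {
    this.s3Uri = s3Uri;
  }

  compile(): CompiledModelSketch {
    return { s3Uri: this.s3Uri, compiledBy: 'native' };
  }
}

// Stand-in for a compiler that presumably runs on a non-Neuron instance.
class CrossCompilerSketch implements NeuronxCompilerSketch {
  private readonly s3Uri: string;

  constructor(s3Uri: string) {
    this.s3Uri = s3Uri;
  }

  compile(): CompiledModelSketch {
    return { s3Uri: this.s3Uri, compiledBy: 'cross' };
  }
}

// Callers depend only on the interface, so either compiler can be passed in.
function describeDeployment(compiler: NeuronxCompilerSketch): string {
  const model = compiler.compile();
  return `serve ${model.s3Uri} (${model.compiledBy} compile)`;
}

console.log(describeDeployment(new CrossCompilerSketch('s3://example-bucket/model/')));
```

Swapping `CrossCompilerSketch` for `NativeCompilerSketch` changes nothing in `describeDeployment`, which is the benefit a shared interface is meant to provide.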
13163
14049
  ### INeuronxContainerImage <a name="INeuronxContainerImage" id="aws-cdk-neuronx-patterns.INeuronxContainerImage"></a>
13164
14050
 
13165
14051
  - *Implemented By:* <a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileImage">VllmNxdInferenceCompileImage</a>, <a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceEcsImage">VllmNxdInferenceEcsImage</a>, <a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceEcsImageBase">VllmNxdInferenceEcsImageBase</a>, <a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a>