aws-cdk-neuronx-patterns 0.2.0 → 0.3.0
- package/.jsii +754 -117
- package/API.md +1044 -158
- package/README.ja.md +18 -6
- package/README.md +16 -5
- package/lib/base/aws-batch/neuronx-batch-compute-environment.js +1 -1
- package/lib/base/aws-batch/neuronx-batch-ecs-job-definition.js +1 -1
- package/lib/base/aws-batch/neuronx-batch.js +1 -1
- package/lib/base/aws-ecs-patterns/application-load-balanced-neuronx-service.js +4 -4
- package/lib/base/neuronx/calculator.test.js +61 -1
- package/lib/base/neuronx/deep-learning-containers.js +3 -3
- package/lib/base/neuronx/model.js +2 -2
- package/lib/base/neuronx/neuron-optimized-machine-image.js +1 -1
- package/lib/base/neuronx/neuronx-instance-type.d.ts +18 -0
- package/lib/base/neuronx/neuronx-instance-type.js +60 -7
- package/lib/base/neuronx/neuronx-instance-type.test.js +80 -1
- package/lib/base/neuronx-compiler/index.d.ts +3 -1
- package/lib/base/neuronx-compiler/index.js +4 -2
- package/lib/base/neuronx-compiler/{neuronx-compiler.d.ts → neuronx-compiler-base.d.ts} +74 -32
- package/lib/base/neuronx-compiler/neuronx-compiler-base.js +129 -0
- package/lib/base/neuronx-compiler/neuronx-cross-compiler.d.ts +30 -0
- package/lib/base/neuronx-compiler/neuronx-cross-compiler.js +83 -0
- package/lib/base/neuronx-compiler/neuronx-native-compiler.d.ts +18 -0
- package/lib/base/neuronx-compiler/neuronx-native-compiler.js +69 -0
- package/lib/base/server-engine/vllm-engine/vllm-engine-argments.js +1 -1
- package/lib/sagemaker-inference-toolkit-tnx/sagemaker-inference-toolkit-tnx-compiler.js +2 -2
- package/lib/sagemaker-inference-toolkit-tnx/sagemaker-inference-toolkit-tnx-sagemaker.d.ts +1 -1
- package/lib/sagemaker-inference-toolkit-tnx/sagemaker-inference-toolkit-tnx-sagemaker.js +2 -2
- package/lib/vllm-nxd-inference/vllm-nxd-inference-compiler.d.ts +8 -0
- package/lib/vllm-nxd-inference/vllm-nxd-inference-compiler.js +32 -4
- package/lib/vllm-nxd-inference/vllm-nxd-inference-ecs-patterns.js +6 -6
- package/package.json +7 -7
- package/scripts/compile/vllm-nxd-inference/Dockerfile +5 -0
- package/scripts/compile/vllm-nxd-inference/entrypoint.sh +39 -14
- package/lib/base/neuronx-compiler/neuronx-compiler.js +0 -166
package/API.md
CHANGED
|
@@ -1434,43 +1434,342 @@ Batch terminates your jobs if they aren't finished.
|
|
|
1434
1434
|
---
|
|
1435
1435
|
|
|
1436
1436
|
|
|
1437
|
-
###
|
|
1437
|
+
### NeuronxCompilerBase <a name="NeuronxCompilerBase" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase"></a>
|
|
1438
|
+
|
|
1439
|
+
- *Implements:* <a href="#aws-cdk-neuronx-patterns.INeuronxCompiler">INeuronxCompiler</a>
|
|
1440
|
+
|
|
1441
|
+
Abstract base class for Neuronx compilers.
|
|
1442
|
+
|
|
1443
|
+
Provides the common orchestration logic (Lambda, CustomResource, WaitCondition)
|
|
1444
|
+
while subclasses define how to create the Batch compute environment and job definition.
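
The split described above (shared orchestration in the base class, resource creation in subclasses) resembles the template-method pattern. The sketch below is only an illustration of that pattern: every name in it (`CompilerBaseSketch`, `createComputeEnvironment`, `createJobDefinition`, the step strings) is hypothetical and is not taken from the package, and it creates no AWS resources.

```typescript
// Hypothetical stand-ins for the Batch resources a real compiler would create.
interface ComputeEnvironmentSketch { name: string }
interface JobDefinitionSketch { image: string }

abstract class CompilerBaseSketch {
  // Steps are recorded so the orchestration order can be observed.
  readonly steps: string[] = [];

  // Subclasses decide which resources to create.
  protected abstract createComputeEnvironment(): ComputeEnvironmentSketch;
  protected abstract createJobDefinition(): JobDefinitionSketch;

  // The shared orchestration lives in the base class.
  compile(): string {
    const env = this.createComputeEnvironment();
    this.steps.push(`compute-environment:${env.name}`);
    const job = this.createJobDefinition();
    this.steps.push(`job-definition:${job.image}`);
    this.steps.push("submit-and-wait");
    return `compiled on ${env.name} with ${job.image}`;
  }
}

// A subclass only supplies the two resource factories.
class CpuCompilerSketch extends CompilerBaseSketch {
  protected createComputeEnvironment(): ComputeEnvironmentSketch {
    return { name: "cpu" };
  }
  protected createJobDefinition(): JobDefinitionSketch {
    return { image: "compile-image" };
  }
}

const cpuCompiler = new CpuCompilerSketch();
const compileResult = cpuCompiler.compile();
```

Under this assumption, a cross compiler and a native compiler would differ only in what the two factory methods return, while `compile()` stays identical.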
|
|
1445
|
+
|
|
1446
|
+
#### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer"></a>
|
|
1447
|
+
|
|
1448
|
+
```typescript
|
|
1449
|
+
import { NeuronxCompilerBase } from 'aws-cdk-neuronx-patterns'
|
|
1450
|
+
|
|
1451
|
+
new NeuronxCompilerBase(scope: Construct, id: string, props: NeuronxCompilerBaseProps)
|
|
1452
|
+
```
|
|
1453
|
+
|
|
1454
|
+
| **Name** | **Type** | **Description** |
|
|
1455
|
+
| --- | --- | --- |
|
|
1456
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.scope">scope</a></code> | <code>constructs.Construct</code> | *No description.* |
|
|
1457
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.id">id</a></code> | <code>string</code> | *No description.* |
|
|
1458
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.props">props</a></code> | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps">NeuronxCompilerBaseProps</a></code> | *No description.* |
|
|
1459
|
+
|
|
1460
|
+
---
|
|
1461
|
+
|
|
1462
|
+
##### `scope`<sup>Required</sup> <a name="scope" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.scope"></a>
|
|
1463
|
+
|
|
1464
|
+
- *Type:* constructs.Construct
|
|
1465
|
+
|
|
1466
|
+
---
|
|
1467
|
+
|
|
1468
|
+
##### `id`<sup>Required</sup> <a name="id" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.id"></a>
|
|
1469
|
+
|
|
1470
|
+
- *Type:* string
|
|
1471
|
+
|
|
1472
|
+
---
|
|
1473
|
+
|
|
1474
|
+
##### `props`<sup>Required</sup> <a name="props" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.Initializer.parameter.props"></a>
|
|
1475
|
+
|
|
1476
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps">NeuronxCompilerBaseProps</a>
|
|
1477
|
+
|
|
1478
|
+
---
|
|
1479
|
+
|
|
1480
|
+
#### Methods <a name="Methods" id="Methods"></a>
|
|
1481
|
+
|
|
1482
|
+
| **Name** | **Description** |
|
|
1483
|
+
| --- | --- |
|
|
1484
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.toString">toString</a></code> | Returns a string representation of this construct. |
|
|
1485
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.with">with</a></code> | Applies one or more mixins to this construct. |
|
|
1486
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.compile">compile</a></code> | *No description.* |
|
|
1487
|
+
|
|
1488
|
+
---
|
|
1489
|
+
|
|
1490
|
+
##### `toString` <a name="toString" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.toString"></a>
|
|
1491
|
+
|
|
1492
|
+
```typescript
|
|
1493
|
+
public toString(): string
|
|
1494
|
+
```
|
|
1495
|
+
|
|
1496
|
+
Returns a string representation of this construct.
|
|
1497
|
+
|
|
1498
|
+
##### `with` <a name="with" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.with"></a>
|
|
1499
|
+
|
|
1500
|
+
```typescript
|
|
1501
|
+
public with(mixins: ...IMixin[]): IConstruct
|
|
1502
|
+
```
|
|
1503
|
+
|
|
1504
|
+
Applies one or more mixins to this construct.
|
|
1505
|
+
|
|
1506
|
+
Mixins are applied in order. The list of constructs is captured at the
|
|
1507
|
+
start of the call, so constructs added by a mixin will not be visited.
|
|
1508
|
+
Use multiple `with()` calls if subsequent mixins should apply to added
|
|
1509
|
+
constructs.
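
The snapshot behavior described above can be modeled with a small tree. This is a sketch under assumptions, not the `constructs` implementation: `NodeSketch`, `MixinSketch`, `collect`, and `applyMixins` are hypothetical names, and the snapshot is assumed to cover the node and its descendants.

```typescript
interface NodeSketch { name: string; children: NodeSketch[] }
type MixinSketch = (node: NodeSketch) => void;

// Collect the node and all of its descendants, depth-first.
function collect(node: NodeSketch): NodeSketch[] {
  const out: NodeSketch[] = [node];
  for (const child of node.children) out.push(...collect(child));
  return out;
}

function applyMixins(root: NodeSketch, ...mixins: MixinSketch[]): NodeSketch {
  const snapshot = collect(root); // captured once, before any mixin runs
  for (const mixin of mixins) {
    for (const node of snapshot) mixin(node);
  }
  return root;
}

const visited: string[] = [];
const tree: NodeSketch = { name: "root", children: [] };

// First mixin adds a child to the root; second mixin records what it visits.
const addChild: MixinSketch = (n) => {
  if (n.name === "root") n.children.push({ name: "added", children: [] });
};
const record: MixinSketch = (n) => { visited.push(n.name); };

applyMixins(tree, addChild, record); // "added" is not part of this snapshot
const firstPass = visited.slice();
applyMixins(tree, record);           // a separate call takes a fresh snapshot
```

In this model the first call records only `root`, and the second call also reaches `added`, which is why chaining separate `with()` calls is suggested when later mixins should see constructs created by earlier ones.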
|
|
1510
|
+
|
|
1511
|
+
###### `mixins`<sup>Required</sup> <a name="mixins" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.with.parameter.mixins"></a>
|
|
1512
|
+
|
|
1513
|
+
- *Type:* ...constructs.IMixin[]
|
|
1514
|
+
|
|
1515
|
+
The mixins to apply.
|
|
1516
|
+
|
|
1517
|
+
---
|
|
1518
|
+
|
|
1519
|
+
##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.compile"></a>
|
|
1520
|
+
|
|
1521
|
+
```typescript
|
|
1522
|
+
public compile(): NeuronxCompiledModel
|
|
1523
|
+
```
|
|
1524
|
+
|
|
1525
|
+
#### Static Functions <a name="Static Functions" id="Static Functions"></a>
|
|
1526
|
+
|
|
1527
|
+
| **Name** | **Description** |
|
|
1528
|
+
| --- | --- |
|
|
1529
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.isConstruct">isConstruct</a></code> | Checks if `x` is a construct. |
|
|
1530
|
+
|
|
1531
|
+
---
|
|
1532
|
+
|
|
1533
|
+
##### `isConstruct` <a name="isConstruct" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.isConstruct"></a>
|
|
1534
|
+
|
|
1535
|
+
```typescript
|
|
1536
|
+
import { NeuronxCompilerBase } from 'aws-cdk-neuronx-patterns'
|
|
1537
|
+
|
|
1538
|
+
NeuronxCompilerBase.isConstruct(x: any)
|
|
1539
|
+
```
|
|
1540
|
+
|
|
1541
|
+
Checks if `x` is a construct.
|
|
1542
|
+
|
|
1543
|
+
Use this method instead of `instanceof` to properly detect `Construct`
|
|
1544
|
+
instances, even when the construct library is symlinked.
|
|
1545
|
+
|
|
1546
|
+
Explanation: in JavaScript, multiple copies of the `constructs` library on
|
|
1547
|
+
disk are seen as independent, completely different libraries. As a
|
|
1548
|
+
consequence, the class `Construct` in each copy of the `constructs` library
|
|
1549
|
+
is seen as a different class, and an instance of one class will not test as
|
|
1550
|
+
`instanceof` the other class. `npm install` will not create installations
|
|
1551
|
+
like this, but users may manually symlink construct libraries together or
|
|
1552
|
+
use a monorepo tool: in those cases, multiple copies of the `constructs`
|
|
1553
|
+
library can be accidentally installed, and `instanceof` will behave
|
|
1554
|
+
unpredictably. It is safest to avoid using `instanceof` and to use
|
|
1555
|
+
this type-testing method instead.
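
The failure mode described above can be reproduced without any packages. In this sketch, `loadLibraryCopy` is a hypothetical stand-in for loading two copies of a library, and the marker key `sketch.Construct` is invented for the example. The marker-based check shown here is one common way to implement such a test and is not claimed to be the library's exact code.

```typescript
// A registry-wide symbol: every copy of the "library" resolves the same key.
const MARKER = Symbol.for("sketch.Construct");

// Each call yields a distinct class, as two on-disk copies of a library would.
function loadLibraryCopy() {
  class Construct {
    constructor() {
      // Tag every instance with the shared marker.
      Object.defineProperty(this, MARKER, { value: true });
    }
    static isConstruct(x: unknown): boolean {
      return x !== null && typeof x === "object" && MARKER in x;
    }
  }
  return Construct;
}

const CopyA = loadLibraryCopy();
const CopyB = loadLibraryCopy();
const instanceFromA = new CopyA();

// The classes are different objects, so the prototype-chain test fails.
const viaInstanceof = instanceFromA instanceof CopyB;
// The marker is shared across copies, so the type test succeeds.
const viaMarker = CopyB.isConstruct(instanceFromA);
```

Here `viaInstanceof` is `false` while `viaMarker` is `true`, which is the reason the type-testing method is preferred over `instanceof`.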
|
|
1556
|
+
|
|
1557
|
+
###### `x`<sup>Required</sup> <a name="x" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.isConstruct.parameter.x"></a>
|
|
1558
|
+
|
|
1559
|
+
- *Type:* any
|
|
1560
|
+
|
|
1561
|
+
Any object.
|
|
1562
|
+
|
|
1563
|
+
---
|
|
1564
|
+
|
|
1565
|
+
#### Properties <a name="Properties" id="Properties"></a>
|
|
1566
|
+
|
|
1567
|
+
| **Name** | **Type** | **Description** |
|
|
1568
|
+
| --- | --- | --- |
|
|
1569
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase.property.node">node</a></code> | <code>constructs.Node</code> | The tree node. |
|
|
1570
|
+
|
|
1571
|
+
---
|
|
1572
|
+
|
|
1573
|
+
##### `node`<sup>Required</sup> <a name="node" id="aws-cdk-neuronx-patterns.NeuronxCompilerBase.property.node"></a>
|
|
1574
|
+
|
|
1575
|
+
```typescript
|
|
1576
|
+
public readonly node: Node;
|
|
1577
|
+
```
|
|
1578
|
+
|
|
1579
|
+
- *Type:* constructs.Node
|
|
1580
|
+
|
|
1581
|
+
The tree node.
|
|
1582
|
+
|
|
1583
|
+
---
|
|
1584
|
+
|
|
1585
|
+
|
|
1586
|
+
### NeuronxCrossCompiler <a name="NeuronxCrossCompiler" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler"></a>
|
|
1587
|
+
|
|
1588
|
+
Neuronx cross-compiler construct.
|
|
1589
|
+
|
|
1590
|
+
Compile the model on a non-Neuron instance and upload the artifacts to an S3 bucket.
|
|
1591
|
+
This avoids the need for expensive Neuron instances during the compilation phase.
|
|
1592
|
+
|
|
1593
|
+
The compilation uses `vllm serve` which performs model tracing and neuronx-cc compilation
|
|
1594
|
+
entirely on CPU. The resulting artifacts are compatible with Neuron instances for inference.
|
|
1595
|
+
|
|
1596
|
+
#### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer"></a>
|
|
1597
|
+
|
|
1598
|
+
```typescript
|
|
1599
|
+
import { NeuronxCrossCompiler } from 'aws-cdk-neuronx-patterns'
|
|
1600
|
+
|
|
1601
|
+
new NeuronxCrossCompiler(scope: Construct, id: string, props: NeuronxCrossCompilerProps)
|
|
1602
|
+
```
|
|
1603
|
+
|
|
1604
|
+
| **Name** | **Type** | **Description** |
|
|
1605
|
+
| --- | --- | --- |
|
|
1606
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.scope">scope</a></code> | <code>constructs.Construct</code> | *No description.* |
|
|
1607
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.id">id</a></code> | <code>string</code> | *No description.* |
|
|
1608
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.props">props</a></code> | <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps">NeuronxCrossCompilerProps</a></code> | *No description.* |
|
|
1609
|
+
|
|
1610
|
+
---
|
|
1611
|
+
|
|
1612
|
+
##### `scope`<sup>Required</sup> <a name="scope" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.scope"></a>
|
|
1613
|
+
|
|
1614
|
+
- *Type:* constructs.Construct
|
|
1615
|
+
|
|
1616
|
+
---
|
|
1617
|
+
|
|
1618
|
+
##### `id`<sup>Required</sup> <a name="id" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.id"></a>
|
|
1619
|
+
|
|
1620
|
+
- *Type:* string
|
|
1621
|
+
|
|
1622
|
+
---
|
|
1623
|
+
|
|
1624
|
+
##### `props`<sup>Required</sup> <a name="props" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.Initializer.parameter.props"></a>
|
|
1625
|
+
|
|
1626
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps">NeuronxCrossCompilerProps</a>
|
|
1627
|
+
|
|
1628
|
+
---
|
|
1629
|
+
|
|
1630
|
+
#### Methods <a name="Methods" id="Methods"></a>
|
|
1631
|
+
|
|
1632
|
+
| **Name** | **Description** |
|
|
1633
|
+
| --- | --- |
|
|
1634
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.toString">toString</a></code> | Returns a string representation of this construct. |
|
|
1635
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.with">with</a></code> | Applies one or more mixins to this construct. |
|
|
1636
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.compile">compile</a></code> | *No description.* |
|
|
1637
|
+
|
|
1638
|
+
---
|
|
1639
|
+
|
|
1640
|
+
##### `toString` <a name="toString" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.toString"></a>
|
|
1641
|
+
|
|
1642
|
+
```typescript
|
|
1643
|
+
public toString(): string
|
|
1644
|
+
```
|
|
1645
|
+
|
|
1646
|
+
Returns a string representation of this construct.
|
|
1647
|
+
|
|
1648
|
+
##### `with` <a name="with" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.with"></a>
|
|
1649
|
+
|
|
1650
|
+
```typescript
|
|
1651
|
+
public with(mixins: ...IMixin[]): IConstruct
|
|
1652
|
+
```
|
|
1653
|
+
|
|
1654
|
+
Applies one or more mixins to this construct.
|
|
1655
|
+
|
|
1656
|
+
Mixins are applied in order. The list of constructs is captured at the
|
|
1657
|
+
start of the call, so constructs added by a mixin will not be visited.
|
|
1658
|
+
Use multiple `with()` calls if subsequent mixins should apply to added
|
|
1659
|
+
constructs.
|
|
1660
|
+
|
|
1661
|
+
###### `mixins`<sup>Required</sup> <a name="mixins" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.with.parameter.mixins"></a>
|
|
1662
|
+
|
|
1663
|
+
- *Type:* ...constructs.IMixin[]
|
|
1664
|
+
|
|
1665
|
+
The mixins to apply.
|
|
1666
|
+
|
|
1667
|
+
---
|
|
1668
|
+
|
|
1669
|
+
##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.compile"></a>
|
|
1670
|
+
|
|
1671
|
+
```typescript
|
|
1672
|
+
public compile(): NeuronxCompiledModel
|
|
1673
|
+
```
|
|
1674
|
+
|
|
1675
|
+
#### Static Functions <a name="Static Functions" id="Static Functions"></a>
|
|
1676
|
+
|
|
1677
|
+
| **Name** | **Description** |
|
|
1678
|
+
| --- | --- |
|
|
1679
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.isConstruct">isConstruct</a></code> | Checks if `x` is a construct. |
|
|
1680
|
+
|
|
1681
|
+
---
|
|
1682
|
+
|
|
1683
|
+
##### `isConstruct` <a name="isConstruct" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.isConstruct"></a>
|
|
1684
|
+
|
|
1685
|
+
```typescript
|
|
1686
|
+
import { NeuronxCrossCompiler } from 'aws-cdk-neuronx-patterns'
|
|
1687
|
+
|
|
1688
|
+
NeuronxCrossCompiler.isConstruct(x: any)
|
|
1689
|
+
```
|
|
1690
|
+
|
|
1691
|
+
Checks if `x` is a construct.
|
|
1692
|
+
|
|
1693
|
+
Use this method instead of `instanceof` to properly detect `Construct`
|
|
1694
|
+
instances, even when the construct library is symlinked.
|
|
1695
|
+
|
|
1696
|
+
Explanation: in JavaScript, multiple copies of the `constructs` library on
|
|
1697
|
+
disk are seen as independent, completely different libraries. As a
|
|
1698
|
+
consequence, the class `Construct` in each copy of the `constructs` library
|
|
1699
|
+
is seen as a different class, and an instance of one class will not test as
|
|
1700
|
+
`instanceof` the other class. `npm install` will not create installations
|
|
1701
|
+
like this, but users may manually symlink construct libraries together or
|
|
1702
|
+
use a monorepo tool: in those cases, multiple copies of the `constructs`
|
|
1703
|
+
library can be accidentally installed, and `instanceof` will behave
|
|
1704
|
+
unpredictably. It is safest to avoid using `instanceof` and to use
|
|
1705
|
+
this type-testing method instead.
|
|
1706
|
+
|
|
1707
|
+
###### `x`<sup>Required</sup> <a name="x" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.isConstruct.parameter.x"></a>
|
|
1708
|
+
|
|
1709
|
+
- *Type:* any
|
|
1710
|
+
|
|
1711
|
+
Any object.
|
|
1712
|
+
|
|
1713
|
+
---
|
|
1714
|
+
|
|
1715
|
+
#### Properties <a name="Properties" id="Properties"></a>
|
|
1716
|
+
|
|
1717
|
+
| **Name** | **Type** | **Description** |
|
|
1718
|
+
| --- | --- | --- |
|
|
1719
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler.property.node">node</a></code> | <code>constructs.Node</code> | The tree node. |
|
|
1720
|
+
|
|
1721
|
+
---
|
|
1722
|
+
|
|
1723
|
+
##### `node`<sup>Required</sup> <a name="node" id="aws-cdk-neuronx-patterns.NeuronxCrossCompiler.property.node"></a>
|
|
1724
|
+
|
|
1725
|
+
```typescript
|
|
1726
|
+
public readonly node: Node;
|
|
1727
|
+
```
|
|
1728
|
+
|
|
1729
|
+
- *Type:* constructs.Node
|
|
1730
|
+
|
|
1731
|
+
The tree node.
|
|
1732
|
+
|
|
1733
|
+
---
|
|
1734
|
+
|
|
1735
|
+
|
|
1736
|
+
### NeuronxNativeCompiler <a name="NeuronxNativeCompiler" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler"></a>
|
|
1438
1737
|
|
|
1439
1738
|
Neuronx compiler construct.
|
|
1440
1739
|
|
|
1441
1740
|
Compile the model to work with Inferentia2 and Trainium1 and upload it to an S3 bucket.
|
|
1442
1741
|
|
|
1443
|
-
#### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.
|
|
1742
|
+
#### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer"></a>
|
|
1444
1743
|
|
|
1445
1744
|
```typescript
|
|
1446
|
-
import {
|
|
1745
|
+
import { NeuronxNativeCompiler } from 'aws-cdk-neuronx-patterns'
|
|
1447
1746
|
|
|
1448
|
-
new
|
|
1747
|
+
new NeuronxNativeCompiler(scope: Construct, id: string, props: NeuronxNativeCompilerProps)
|
|
1449
1748
|
```
|
|
1450
1749
|
|
|
1451
1750
|
| **Name** | **Type** | **Description** |
|
|
1452
1751
|
| --- | --- | --- |
|
|
1453
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
1454
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
1455
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
1752
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.scope">scope</a></code> | <code>constructs.Construct</code> | *No description.* |
|
|
1753
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.id">id</a></code> | <code>string</code> | *No description.* |
|
|
1754
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.props">props</a></code> | <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps">NeuronxNativeCompilerProps</a></code> | *No description.* |
|
|
1456
1755
|
|
|
1457
1756
|
---
|
|
1458
1757
|
|
|
1459
|
-
##### `scope`<sup>Required</sup> <a name="scope" id="aws-cdk-neuronx-patterns.
|
|
1758
|
+
##### `scope`<sup>Required</sup> <a name="scope" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.scope"></a>
|
|
1460
1759
|
|
|
1461
1760
|
- *Type:* constructs.Construct
|
|
1462
1761
|
|
|
1463
1762
|
---
|
|
1464
1763
|
|
|
1465
|
-
##### `id`<sup>Required</sup> <a name="id" id="aws-cdk-neuronx-patterns.
|
|
1764
|
+
##### `id`<sup>Required</sup> <a name="id" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.id"></a>
|
|
1466
1765
|
|
|
1467
1766
|
- *Type:* string
|
|
1468
1767
|
|
|
1469
1768
|
---
|
|
1470
1769
|
|
|
1471
|
-
##### `props`<sup>Required</sup> <a name="props" id="aws-cdk-neuronx-patterns.
|
|
1770
|
+
##### `props`<sup>Required</sup> <a name="props" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.Initializer.parameter.props"></a>
|
|
1472
1771
|
|
|
1473
|
-
- *Type:* <a href="#aws-cdk-neuronx-patterns.
|
|
1772
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps">NeuronxNativeCompilerProps</a>
|
|
1474
1773
|
|
|
1475
1774
|
---
|
|
1476
1775
|
|
|
@@ -1478,13 +1777,13 @@ new NeuronxCompiler(scope: Construct, id: string, props: NeuronxCompilerProps)
|
|
|
1478
1777
|
|
|
1479
1778
|
| **Name** | **Description** |
|
|
1480
1779
|
| --- | --- |
|
|
1481
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
1482
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
1483
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
1780
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.toString">toString</a></code> | Returns a string representation of this construct. |
|
|
1781
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.with">with</a></code> | Applies one or more mixins to this construct. |
|
|
1782
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.compile">compile</a></code> | *No description.* |
|
|
1484
1783
|
|
|
1485
1784
|
---
|
|
1486
1785
|
|
|
1487
|
-
##### `toString` <a name="toString" id="aws-cdk-neuronx-patterns.
|
|
1786
|
+
##### `toString` <a name="toString" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.toString"></a>
|
|
1488
1787
|
|
|
1489
1788
|
```typescript
|
|
1490
1789
|
public toString(): string
|
|
@@ -1492,7 +1791,7 @@ public toString(): string
|
|
|
1492
1791
|
|
|
1493
1792
|
Returns a string representation of this construct.
|
|
1494
1793
|
|
|
1495
|
-
##### `with` <a name="with" id="aws-cdk-neuronx-patterns.
|
|
1794
|
+
##### `with` <a name="with" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.with"></a>
|
|
1496
1795
|
|
|
1497
1796
|
```typescript
|
|
1498
1797
|
public with(mixins: ...IMixin[]): IConstruct
|
|
@@ -1505,7 +1804,7 @@ start of the call, so constructs added by a mixin will not be visited.
|
|
|
1505
1804
|
Use multiple `with()` calls if subsequent mixins should apply to added
|
|
1506
1805
|
constructs.
|
|
1507
1806
|
|
|
1508
|
-
###### `mixins`<sup>Required</sup> <a name="mixins" id="aws-cdk-neuronx-patterns.
|
|
1807
|
+
###### `mixins`<sup>Required</sup> <a name="mixins" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.with.parameter.mixins"></a>
|
|
1509
1808
|
|
|
1510
1809
|
- *Type:* ...constructs.IMixin[]
|
|
1511
1810
|
|
|
@@ -1513,7 +1812,7 @@ The mixins to apply.
|
|
|
1513
1812
|
|
|
1514
1813
|
---
|
|
1515
1814
|
|
|
1516
|
-
##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.
|
|
1815
|
+
##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.compile"></a>
|
|
1517
1816
|
|
|
1518
1817
|
```typescript
|
|
1519
1818
|
public compile(): NeuronxCompiledModel
|
|
@@ -1523,16 +1822,16 @@ public compile(): NeuronxCompiledModel
|
|
|
1523
1822
|
|
|
1524
1823
|
| **Name** | **Description** |
|
|
1525
1824
|
| --- | --- |
|
|
1526
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
1825
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.isConstruct">isConstruct</a></code> | Checks if `x` is a construct. |
|
|
1527
1826
|
|
|
1528
1827
|
---
|
|
1529
1828
|
|
|
1530
|
-
##### `isConstruct` <a name="isConstruct" id="aws-cdk-neuronx-patterns.
|
|
1829
|
+
##### `isConstruct` <a name="isConstruct" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.isConstruct"></a>
|
|
1531
1830
|
|
|
1532
1831
|
```typescript
|
|
1533
|
-
import {
|
|
1832
|
+
import { NeuronxNativeCompiler } from 'aws-cdk-neuronx-patterns'
|
|
1534
1833
|
|
|
1535
|
-
|
|
1834
|
+
NeuronxNativeCompiler.isConstruct(x: any)
|
|
1536
1835
|
```
|
|
1537
1836
|
|
|
1538
1837
|
Checks if `x` is a construct.
|
|
@@ -1551,7 +1850,7 @@ library can be accidentally installed, and `instanceof` will behave
|
|
|
1551
1850
|
unpredictably. It is safest to avoid using `instanceof` and to use
|
|
1552
1851
|
this type-testing method instead.
|
|
1553
1852
|
|
|
1554
|
-
###### `x`<sup>Required</sup> <a name="x" id="aws-cdk-neuronx-patterns.
|
|
1853
|
+
###### `x`<sup>Required</sup> <a name="x" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.isConstruct.parameter.x"></a>
|
|
1555
1854
|
|
|
1556
1855
|
- *Type:* any
|
|
1557
1856
|
|
|
@@ -1563,11 +1862,11 @@ Any object.
|
|
|
1563
1862
|
|
|
1564
1863
|
| **Name** | **Type** | **Description** |
|
|
1565
1864
|
| --- | --- | --- |
|
|
1566
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
1865
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler.property.node">node</a></code> | <code>constructs.Node</code> | The tree node. |
|
|
1567
1866
|
|
|
1568
1867
|
---
|
|
1569
1868
|
|
|
1570
|
-
##### `node`<sup>Required</sup> <a name="node" id="aws-cdk-neuronx-patterns.
|
|
1869
|
+
##### `node`<sup>Required</sup> <a name="node" id="aws-cdk-neuronx-patterns.NeuronxNativeCompiler.property.node"></a>
|
|
1571
1870
|
|
|
1572
1871
|
```typescript
|
|
1573
1872
|
public readonly node: Node;
|
|
@@ -4652,43 +4951,88 @@ The task definition to use for tasks in the service. TaskDefinition or TaskImage
|
|
|
4652
4951
|
|
|
4653
4952
|
---
|
|
4654
4953
|
|
|
4655
|
-
###
|
|
4954
|
+
### ComputeEnvironmentResult <a name="ComputeEnvironmentResult" id="aws-cdk-neuronx-patterns.ComputeEnvironmentResult"></a>
|
|
4656
4955
|
|
|
4657
|
-
|
|
4956
|
+
Result of creating a compute environment.
|
|
4957
|
+
|
|
4958
|
+
#### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.ComputeEnvironmentResult.Initializer"></a>
|
|
4658
4959
|
|
|
4659
4960
|
```typescript
|
|
4660
|
-
import {
|
|
4961
|
+
import { ComputeEnvironmentResult } from 'aws-cdk-neuronx-patterns'
|
|
4661
4962
|
|
|
4662
|
-
const
|
|
4963
|
+
const computeEnvironmentResult: ComputeEnvironmentResult = { ... }
|
|
4663
4964
|
```
|
|
4664
4965
|
|
|
4665
4966
|
#### Properties <a name="Properties" id="Properties"></a>
|
|
4666
4967
|
|
|
4667
4968
|
| **Name** | **Type** | **Description** |
|
|
4668
4969
|
| --- | --- | --- |
|
|
4669
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
4670
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
4671
|
-
| <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.layers">layers</a></code> | <code>number</code> | *No description.* |
|
|
4970
|
+
| <code><a href="#aws-cdk-neuronx-patterns.ComputeEnvironmentResult.property.computeEnvironment">computeEnvironment</a></code> | <code>aws-cdk-lib.aws_batch.IComputeEnvironment</code> | The compute environment. |
|
|
4971
|
+
| <code><a href="#aws-cdk-neuronx-patterns.ComputeEnvironmentResult.property.instanceRole">instanceRole</a></code> | <code>aws-cdk-lib.aws_iam.IRole</code> | The instance role associated with the compute environment. |
|
|
4672
4972
|
|
|
4673
4973
|
---
|
|
4674
4974
|
|
|
4675
|
-
##### `
|
|
4975
|
+
##### `computeEnvironment`<sup>Required</sup> <a name="computeEnvironment" id="aws-cdk-neuronx-patterns.ComputeEnvironmentResult.property.computeEnvironment"></a>
|
|
4676
4976
|
|
|
4677
4977
|
```typescript
|
|
4678
|
-
public readonly
|
|
4978
|
+
public readonly computeEnvironment: IComputeEnvironment;
|
|
4679
4979
|
```
|
|
4680
4980
|
|
|
4681
|
-
- *Type:*
|
|
4981
|
+
- *Type:* aws-cdk-lib.aws_batch.IComputeEnvironment
|
|
4982
|
+
|
|
4983
|
+
The compute environment.
|
|
4682
4984
|
|
|
4683
4985
|
---
|
|
4684
4986
|
|
|
4685
|
-
##### `
|
|
4987
|
+
##### `instanceRole`<sup>Required</sup> <a name="instanceRole" id="aws-cdk-neuronx-patterns.ComputeEnvironmentResult.property.instanceRole"></a>
|
|
4686
4988
|
|
|
4687
4989
|
```typescript
|
|
4688
|
-
public readonly
|
|
4990
|
+
public readonly instanceRole: IRole;
|
|
4689
4991
|
```
|
|
4690
4992
|
|
|
4691
|
-
- *Type:*
|
|
4993
|
+
- *Type:* aws-cdk-lib.aws_iam.IRole
|
|
4994
|
+
|
|
4995
|
+
The instance role associated with the compute environment.
|
|
4996
|
+
|
|
4997
|
+
---
|
|
4998
|
+
|
|
4999
|
+
### ModelConfig <a name="ModelConfig" id="aws-cdk-neuronx-patterns.ModelConfig"></a>
|
|
5000
|
+
|
|
5001
|
+
#### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.ModelConfig.Initializer"></a>
|
|
5002
|
+
|
|
5003
|
+
```typescript
|
|
5004
|
+
import { ModelConfig } from 'aws-cdk-neuronx-patterns'
|
|
5005
|
+
|
|
5006
|
+
const modelConfig: ModelConfig = { ... }
|
|
5007
|
+
```
|
|
5008
|
+
|
|
5009
|
+
#### Properties <a name="Properties" id="Properties"></a>
|
|
5010
|
+
|
|
5011
|
+
| **Name** | **Type** | **Description** |
|
|
5012
|
+
| --- | --- | --- |
|
|
5013
|
+
| <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.attentionHeads">attentionHeads</a></code> | <code>number</code> | *No description.* |
|
|
5014
|
+
| <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.embeddingDimension">embeddingDimension</a></code> | <code>number</code> | *No description.* |
|
|
5015
|
+
| <code><a href="#aws-cdk-neuronx-patterns.ModelConfig.property.layers">layers</a></code> | <code>number</code> | *No description.* |
|
|
5016
|
+
|
|
5017
|
+
---
|
|
5018
|
+
|
|
5019
|
+
##### `attentionHeads`<sup>Required</sup> <a name="attentionHeads" id="aws-cdk-neuronx-patterns.ModelConfig.property.attentionHeads"></a>
|
|
5020
|
+
|
|
5021
|
+
```typescript
|
|
5022
|
+
public readonly attentionHeads: number;
|
|
5023
|
+
```
|
|
5024
|
+
|
|
5025
|
+
- *Type:* number
|
|
5026
|
+
|
|
5027
|
+
---
|
|
5028
|
+
|
|
5029
|
+
##### `embeddingDimension`<sup>Required</sup> <a name="embeddingDimension" id="aws-cdk-neuronx-patterns.ModelConfig.property.embeddingDimension"></a>
|
|
5030
|
+
|
|
5031
|
+
```typescript
|
|
5032
|
+
public readonly embeddingDimension: number;
|
|
5033
|
+
```
|
|
5034
|
+
|
|
5035
|
+
- *Type:* number
|
|
4692
5036
|
|
|
4693
5037
|
---
|
|
4694
5038
|
|
|
@@ -5846,234 +6190,627 @@ Automatically added to the job definition.
|
|
|
5846
6190
|
##### `gpu`<sup>Optional</sup> <a name="gpu" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.gpu"></a>
|
|
5847
6191
|
|
|
5848
6192
|
```typescript
|
|
5849
|
-
public readonly gpu: number;
|
|
6193
|
+
public readonly gpu: number;
|
|
6194
|
+
```
|
|
6195
|
+
|
|
6196
|
+
- *Type:* number
|
|
6197
|
+
- *Default:* no gpus
|
|
6198
|
+
|
|
6199
|
+
The number of physical GPUs to reserve for the container.
|
|
6200
|
+
|
|
6201
|
+
Make sure that the number of GPUs reserved for all containers in a job doesn't exceed
|
|
6202
|
+
the number of available GPUs on the compute resource that the job is launched on.
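
The constraint above is a simple sum check. The helper below is a hypothetical illustration (`gpusFit` is not part of the package or of AWS Batch); it only shows the comparison that has to hold.

```typescript
// Hypothetical helper: compares the GPUs reserved by all containers in a job
// with the GPUs available on the compute resource.
function gpusFit(reservedPerContainer: number[], availableGpus: number): boolean {
  const totalReserved = reservedPerContainer.reduce((sum, n) => sum + n, 0);
  return totalReserved <= availableGpus;
}

const fits = gpusFit([1, 2], 4);          // 3 reserved, 4 available
const overcommitted = gpusFit([2, 3], 4); // 5 reserved, 4 available
```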
|
|
6203
|
+
|
|
6204
|
+
---
|
|
6205
|
+
|
|
6206
|
+
##### `privileged`<sup>Optional</sup> <a name="privileged" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.privileged"></a>
|
|
6207
|
+
|
|
6208
|
+
```typescript
|
|
6209
|
+
public readonly privileged: boolean;
|
|
6210
|
+
```
|
|
6211
|
+
|
|
6212
|
+
- *Type:* boolean
|
|
6213
|
+
- *Default:* false
|
|
6214
|
+
|
|
6215
|
+
When this parameter is true, the container is given elevated permissions on the host container instance (similar to the root user).
|
|
6216
|
+
|
|
6217
|
+
---
|
|
6218
|
+
|
|
6219
|
+
##### `ulimits`<sup>Optional</sup> <a name="ulimits" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.ulimits"></a>
|
|
6220
|
+
|
|
6221
|
+
```typescript
|
|
6222
|
+
public readonly ulimits: Ulimit[];
|
|
6223
|
+
```
|
|
6224
|
+
|
|
6225
|
+
- *Type:* aws-cdk-lib.aws_batch.Ulimit[]
|
|
6226
|
+
- *Default:* no ulimits
|
|
6227
|
+
|
|
6228
|
+
Limits to set for the user this docker container will run as.
|
|
6229
|
+
|
|
6230
|
+
---
|
|
6231
|
+
|
|
6232
|
+
##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.neuronxInstanceType"></a>
|
|
6233
|
+
|
|
6234
|
+
```typescript
|
|
6235
|
+
public readonly neuronxInstanceType: INeuronxInstanceType;
|
|
6236
|
+
```
|
|
6237
|
+
|
|
6238
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
|
|
6239
|
+
|
|
6240
|
+
The instance type of the worker instance.
|
|
6241
|
+
|
|
6242
|
+
---
|
|
6243
|
+
|
|
6244
|
+
##### `volumeSize`<sup>Required</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.volumeSize"></a>
|
|
6245
|
+
|
|
6246
|
+
```typescript
|
|
6247
|
+
public readonly volumeSize: Size;
|
|
6248
|
+
```
|
|
6249
|
+
|
|
6250
|
+
- *Type:* aws-cdk-lib.Size
|
|
6251
|
+
- *Default:* N billion parameters * 5GiB EBS
|
|
6252
|
+
|
|
6253
|
+
The root volume size of the worker instance.
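
The documented default (5 GiB of EBS per billion parameters) can be illustrated with a small sketch. The helper name `defaultVolumeSizeGiB` is hypothetical and not part of `aws-cdk-neuronx-patterns`; rounding up to a whole GiB is also an assumption, since the source only states the multiplication rule.

```typescript
// Hypothetical helper illustrating the documented default; not the library's implementation.
const GIB_PER_BILLION_PARAMETERS = 5;

function defaultVolumeSizeGiB(parametersInBillions: number): number {
  if (!Number.isFinite(parametersInBillions) || parametersInBillions <= 0) {
    throw new RangeError('parametersInBillions must be a positive, finite number');
  }
  // Assumption: round up so a fractional parameter count yields a whole number of GiB.
  return Math.ceil(parametersInBillions * GIB_PER_BILLION_PARAMETERS);
}

// A 7-billion-parameter model maps to 35 GiB under this rule.
console.log(defaultVolumeSizeGiB(7));
```

Passing an explicit `volumeSize` is presumably the way to override this default when a model needs more scratch space than the rule provides.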
|
|
6254
|
+
|
|
6255
|
+
---
|
|
6256
|
+
|
|
6257
|
+
##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.vpc"></a>
|
|
6258
|
+
|
|
6259
|
+
```typescript
|
|
6260
|
+
public readonly vpc: IVpc;
|
|
6261
|
+
```
|
|
6262
|
+
|
|
6263
|
+
- *Type:* aws-cdk-lib.aws_ec2.IVpc
|
|
6264
|
+
|
|
6265
|
+
The VPC in which the worker instance is launched.
|
|
6266
|
+
|
|
6267
|
+
---
|
|
6268
|
+
|
|
6269
|
+
##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.spot"></a>
|
|
6270
|
+
|
|
6271
|
+
```typescript
|
|
6272
|
+
public readonly spot: boolean;
|
|
6273
|
+
```
|
|
6274
|
+
|
|
6275
|
+
- *Type:* boolean
|
|
6276
|
+
- *Default:* false
|
|
6277
|
+
|
|
6278
|
+
Whether or not to use spot instances.
|
|
6279
|
+
|
|
6280
|
+
Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at any time;
|
|
6281
|
+
your job will be given two minutes of notice before reclamation.
|
|
6282
|
+
|
|
6283
|
+
---
|
|
6284
|
+
|
|
6285
|
+
##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxBatchProps.property.vpcSubnets"></a>
|
|
6286
|
+
|
|
6287
|
+
```typescript
|
|
6288
|
+
public readonly vpcSubnets: SubnetSelection;
|
|
6289
|
+
```
|
|
6290
|
+
|
|
6291
|
+
- *Type:* aws-cdk-lib.aws_ec2.SubnetSelection
|
|
6292
|
+
- *Default:* new subnets will be created
|
|
6293
|
+
|
|
6294
|
+
The VPC Subnets this Compute Environment will launch instances in.
|
|
6295
|
+
|
|
6296
|
+
---
|
|
6297
|
+
|
|
6298
|
+
### NeuronxCompiledModel <a name="NeuronxCompiledModel" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel"></a>
|
|
6299
|
+
|
|
6300
|
+
The model compiled by Neuronx compiler.
|
|
6301
|
+
|
|
6302
|
+
#### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.Initializer"></a>
|
|
6303
|
+
|
|
6304
|
+
```typescript
|
|
6305
|
+
import { NeuronxCompiledModel } from 'aws-cdk-neuronx-patterns'
|
|
6306
|
+
|
|
6307
|
+
const neuronxCompiledModel: NeuronxCompiledModel = { ... }
|
|
6308
|
+
```
|
|
6309
|
+
|
|
6310
|
+
#### Properties <a name="Properties" id="Properties"></a>
|
|
6311
|
+
|
|
6312
|
+
| **Name** | **Type** | **Description** |
|
|
6313
|
+
| --- | --- | --- |
|
|
6314
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
|
|
6315
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.modelName">modelName</a></code> | <code>string</code> | The model name. |
|
|
6316
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.recommendedInstanceType">recommendedInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The recommended Neuron instance type for running inference with this compiled model. |
|
|
6317
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Prefix">s3Prefix</a></code> | <code>string</code> | S3 prefix under which the compiled artifact is uploaded. |
|
|
6318
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Uri">s3Uri</a></code> | <code>string</code> | S3 URL to which the compiled artifact is uploaded. |
|
|
6319
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.weightSize">weightSize</a></code> | <code>aws-cdk-lib.Size</code> | The weight size of the model. |
|
|
6320
|
+
|
|
6321
|
+
---
|
|
6322
|
+
|
|
6323
|
+
##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.bucket"></a>
|
|
6324
|
+
|
|
6325
|
+
```typescript
|
|
6326
|
+
public readonly bucket: IBucket;
|
|
6327
|
+
```
|
|
6328
|
+
|
|
6329
|
+
- *Type:* aws-cdk-lib.aws_s3.IBucket
|
|
6330
|
+
|
|
6331
|
+
The bucket to upload compiled artifacts.
|
|
6332
|
+
|
|
6333
|
+
---
|
|
6334
|
+
|
|
6335
|
+
##### `modelName`<sup>Required</sup> <a name="modelName" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.modelName"></a>
|
|
6336
|
+
|
|
6337
|
+
```typescript
|
|
6338
|
+
public readonly modelName: string;
|
|
6339
|
+
```
|
|
6340
|
+
|
|
6341
|
+
- *Type:* string
|
|
6342
|
+
|
|
6343
|
+
The model name.
|
|
6344
|
+
|
|
6345
|
+
---
|
|
6346
|
+
|
|
6347
|
+
##### `recommendedInstanceType`<sup>Required</sup> <a name="recommendedInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.recommendedInstanceType"></a>
|
|
6348
|
+
|
|
6349
|
+
```typescript
|
|
6350
|
+
public readonly recommendedInstanceType: INeuronxInstanceType;
|
|
6351
|
+
```
|
|
6352
|
+
|
|
6353
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
|
|
6354
|
+
|
|
6355
|
+
The recommended Neuron instance type for running inference with this compiled model.
|
|
6356
|
+
|
|
6357
|
+
---
|
|
6358
|
+
|
|
6359
|
+
##### `s3Prefix`<sup>Required</sup> <a name="s3Prefix" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Prefix"></a>
|
|
6360
|
+
|
|
6361
|
+
```typescript
|
|
6362
|
+
public readonly s3Prefix: string;
|
|
6363
|
+
```
|
|
6364
|
+
|
|
6365
|
+
- *Type:* string
|
|
6366
|
+
|
|
6367
|
+
S3 prefix under which the compiled artifact is uploaded.
|
|
6368
|
+
|
|
6369
|
+
---
|
|
6370
|
+
|
|
6371
|
+
##### `s3Uri`<sup>Required</sup> <a name="s3Uri" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Uri"></a>
|
|
6372
|
+
|
|
6373
|
+
```typescript
|
|
6374
|
+
public readonly s3Uri: string;
|
|
6375
|
+
```
|
|
6376
|
+
|
|
6377
|
+
- *Type:* string
|
|
6378
|
+
|
|
6379
|
+
S3 URL to which the compiled artifact is uploaded.
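
How `bucket`, `s3Prefix`, and `s3Uri` relate is not spelled out here. The sketch below assumes the conventional `s3://<bucket>/<prefix>` layout; the function `toS3Uri` and the sample names are hypothetical and not taken from the library.

```typescript
// Hypothetical helper; assumes s3Uri follows the conventional s3://<bucket>/<prefix> form.
function toS3Uri(bucketName: string, s3Prefix: string): string {
  // Drop leading slashes so the URI never contains "//" after the bucket name.
  const normalizedPrefix = s3Prefix.replace(/^\/+/, '');
  return `s3://${bucketName}/${normalizedPrefix}`;
}

// Example with made-up bucket and prefix names.
console.log(toS3Uri('example-artifacts', 'compiled/example-model'));
```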
|
|
6380
|
+
|
|
6381
|
+
---
|
|
6382
|
+
|
|
6383
|
+
##### `weightSize`<sup>Required</sup> <a name="weightSize" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.weightSize"></a>
|
|
6384
|
+
|
|
6385
|
+
```typescript
|
|
6386
|
+
public readonly weightSize: Size;
|
|
6387
|
+
```
|
|
6388
|
+
|
|
6389
|
+
- *Type:* aws-cdk-lib.Size
|
|
6390
|
+
|
|
6391
|
+
The weight size of the model.
|
|
6392
|
+
|
|
6393
|
+
---
|
|
6394
|
+
|
|
6395
|
+
### NeuronxCompilerBaseProps <a name="NeuronxCompilerBaseProps" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps"></a>
|
|
6396
|
+
|
|
6397
|
+
Common props for NeuronxCompilerBase.
|
|
6398
|
+
|
|
6399
|
+
#### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.Initializer"></a>
|
|
6400
|
+
|
|
6401
|
+
```typescript
|
|
6402
|
+
import { NeuronxCompilerBaseProps } from 'aws-cdk-neuronx-patterns'
|
|
6403
|
+
|
|
6404
|
+
const neuronxCompilerBaseProps: NeuronxCompilerBaseProps = { ... }
|
|
6405
|
+
```
|
|
6406
|
+
|
|
6407
|
+
#### Properties <a name="Properties" id="Properties"></a>
|
|
6408
|
+
|
|
6409
|
+
| **Name** | **Type** | **Description** |
|
|
6410
|
+
| --- | --- | --- |
|
|
6411
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.artifactS3Prefix">artifactS3Prefix</a></code> | <code>string</code> | S3 prefix under which the compiled artifact is uploaded. |
|
|
6412
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
|
|
6413
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
|
|
6414
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
|
|
6415
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of the compile worker instance. |
|
|
6416
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | The VPC in which the compile worker instance is launched. |
|
|
6417
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.command">command</a></code> | <code>string[]</code> | The command to run in the container. |
|
|
6418
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
|
|
6419
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.secrets">secrets</a></code> | <code>{[ key: string ]: aws-cdk-lib.aws_batch.Secret}</code> | Secrets to pass to the container. |
|
|
6420
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.spot">spot</a></code> | <code>boolean</code> | Whether or not to use spot instances. |
|
|
6421
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.volumeSize">volumeSize</a></code> | <code>aws-cdk-lib.Size</code> | The root volume size of the worker instance. |
|
|
6422
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.vpcSubnets">vpcSubnets</a></code> | <code>aws-cdk-lib.aws_ec2.SubnetSelection</code> | The VPC Subnets this Compute Environment will launch instances in. |
|
|
6423
|
+
|
|
6424
|
+
---
|
|
6425
|
+
|
|
6426
|
+
##### `artifactS3Prefix`<sup>Required</sup> <a name="artifactS3Prefix" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.artifactS3Prefix"></a>
|
|
6427
|
+
|
|
6428
|
+
```typescript
|
|
6429
|
+
public readonly artifactS3Prefix: string;
|
|
6430
|
+
```
|
|
6431
|
+
|
|
6432
|
+
- *Type:* string
|
|
6433
|
+
|
|
6434
|
+
S3 prefix under which the compiled artifact is uploaded.
|
|
6435
|
+
|
|
6436
|
+
This property does not depend on the compile job finishing.
|
|
6437
|
+
|
|
6438
|
+
---
|
|
6439
|
+
|
|
6440
|
+
##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.bucket"></a>
|
|
6441
|
+
|
|
6442
|
+
```typescript
|
|
6443
|
+
public readonly bucket: IBucket;
|
|
6444
|
+
```
|
|
6445
|
+
|
|
6446
|
+
- *Type:* aws-cdk-lib.aws_s3.IBucket
|
|
6447
|
+
|
|
6448
|
+
The bucket to upload compiled artifacts.
|
|
6449
|
+
|
|
6450
|
+
---
|
|
6451
|
+
|
|
6452
|
+
##### `image`<sup>Required</sup> <a name="image" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.image"></a>
|
|
6453
|
+
|
|
6454
|
+
```typescript
|
|
6455
|
+
public readonly image: INeuronxContainerImage;
|
|
6456
|
+
```
|
|
6457
|
+
|
|
6458
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a>
|
|
6459
|
+
|
|
6460
|
+
An image of the container where the compile job is executed.
|
|
6461
|
+
|
|
6462
|
+
---
|
|
6463
|
+
|
|
6464
|
+
##### `model`<sup>Required</sup> <a name="model" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.model"></a>
|
|
6465
|
+
|
|
6466
|
+
```typescript
|
|
6467
|
+
public readonly model: Model;
|
|
6468
|
+
```
|
|
6469
|
+
|
|
6470
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.Model">Model</a>
|
|
6471
|
+
|
|
6472
|
+
The model to be compiled.
|
|
6473
|
+
|
|
6474
|
+
---
|
|
6475
|
+
|
|
6476
|
+
##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.neuronxInstanceType"></a>
|
|
6477
|
+
|
|
6478
|
+
```typescript
|
|
6479
|
+
public readonly neuronxInstanceType: INeuronxInstanceType;
|
|
6480
|
+
```
|
|
6481
|
+
|
|
6482
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
|
|
6483
|
+
|
|
6484
|
+
The instance type of the compile worker instance.
|
|
6485
|
+
|
|
6486
|
+
---
|
|
6487
|
+
|
|
6488
|
+
##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.vpc"></a>
|
|
6489
|
+
|
|
6490
|
+
```typescript
|
|
6491
|
+
public readonly vpc: IVpc;
|
|
6492
|
+
```
|
|
6493
|
+
|
|
6494
|
+
- *Type:* aws-cdk-lib.aws_ec2.IVpc
|
|
6495
|
+
|
|
6496
|
+
The VPC in which the compile worker instance is launched.
|
|
6497
|
+
|
|
6498
|
+
---
|
|
6499
|
+
|
|
6500
|
+
##### `command`<sup>Optional</sup> <a name="command" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.command"></a>
|
|
6501
|
+
|
|
6502
|
+
```typescript
|
|
6503
|
+
public readonly command: string[];
|
|
6504
|
+
```
|
|
6505
|
+
|
|
6506
|
+
- *Type:* string[]
|
|
6507
|
+
|
|
6508
|
+
The command to run in the container.
|
|
6509
|
+
|
|
6510
|
+
---
|
|
6511
|
+
|
|
6512
|
+
##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.environment"></a>
|
|
6513
|
+
|
|
6514
|
+
```typescript
|
|
6515
|
+
public readonly environment: {[ key: string ]: string};
|
|
6516
|
+
```
|
|
6517
|
+
|
|
6518
|
+
- *Type:* {[ key: string ]: string}
|
|
6519
|
+
- *Default:* No environment variables.
|
|
6520
|
+
|
|
6521
|
+
The environment variables to pass to the container.
|
|
6522
|
+
|
|
6523
|
+
This is only applicable when using container runtime.
|
|
6524
|
+
|
|
6525
|
+
---
|
|
6526
|
+
|
|
6527
|
+
##### `secrets`<sup>Optional</sup> <a name="secrets" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.secrets"></a>
|
|
6528
|
+
|
|
6529
|
+
```typescript
|
|
6530
|
+
public readonly secrets: {[ key: string ]: Secret};
|
|
6531
|
+
```
|
|
6532
|
+
|
|
6533
|
+
- *Type:* {[ key: string ]: aws-cdk-lib.aws_batch.Secret}
|
|
6534
|
+
|
|
6535
|
+
Secrets to pass to the container.
|
|
6536
|
+
|
|
6537
|
+
---
|
|
6538
|
+
|
|
6539
|
+
##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.spot"></a>
|
|
6540
|
+
|
|
6541
|
+
```typescript
|
|
6542
|
+
public readonly spot: boolean;
|
|
6543
|
+
```
|
|
6544
|
+
|
|
6545
|
+
- *Type:* boolean
|
|
6546
|
+
- *Default:* false
|
|
6547
|
+
|
|
6548
|
+
Whether or not to use spot instances.
|
|
6549
|
+
|
|
6550
|
+
Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at any time; your job will be given two minutes of notice before reclamation.
|
|
6551
|
+
|
|
6552
|
+
---
|
|
6553
|
+
|
|
6554
|
+
##### `volumeSize`<sup>Optional</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.volumeSize"></a>
|
|
6555
|
+
|
|
6556
|
+
```typescript
|
|
6557
|
+
public readonly volumeSize: Size;
|
|
6558
|
+
```
|
|
6559
|
+
|
|
6560
|
+
- *Type:* aws-cdk-lib.Size
|
|
6561
|
+
- *Default:* N billion parameters * 5GiB EBS
|
|
6562
|
+
|
|
6563
|
+
The root volume size of the worker instance.
|
|
6564
|
+
|
|
6565
|
+
---
|
|
6566
|
+
|
|
6567
|
+
##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxCompilerBaseProps.property.vpcSubnets"></a>
|
|
6568
|
+
|
|
6569
|
+
```typescript
|
|
6570
|
+
public readonly vpcSubnets: SubnetSelection;
|
|
5850
6571
|
```
|
|
5851
6572
|
|
|
5852
|
-
- *Type:*
|
|
5853
|
-
- *Default:*
|
|
5854
|
-
|
|
5855
|
-
The number of physical GPUs to reserve for the container.
|
|
6573
|
+
- *Type:* aws-cdk-lib.aws_ec2.SubnetSelection
|
|
6574
|
+
- *Default:* new subnets will be created
|
|
5856
6575
|
|
|
5857
|
-
|
|
5858
|
-
the number of available GPUs on the compute resource that the job is launched on.
|
|
6576
|
+
The VPC Subnets this Compute Environment will launch instances in.
|
|
5859
6577
|
|
|
5860
6578
|
---
|
|
5861
6579
|
|
|
5862
|
-
|
|
6580
|
+
### NeuronxCrossCompilerProps <a name="NeuronxCrossCompilerProps" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps"></a>
|
|
6581
|
+
|
|
6582
|
+
Props of NeuronxCrossCompiler.
|
|
6583
|
+
|
|
6584
|
+
#### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.Initializer"></a>
|
|
5863
6585
|
|
|
5864
6586
|
```typescript
|
|
5865
|
-
|
|
6587
|
+
import { NeuronxCrossCompilerProps } from 'aws-cdk-neuronx-patterns'
|
|
6588
|
+
|
|
6589
|
+
const neuronxCrossCompilerProps: NeuronxCrossCompilerProps = { ... }
|
|
5866
6590
|
```
|
|
5867
6591
|
|
|
5868
|
-
|
|
5869
|
-
- *Default:* false
|
|
6592
|
+
#### Properties <a name="Properties" id="Properties"></a>
|
|
5870
6593
|
|
|
5871
|
-
|
|
6594
|
+
| **Name** | **Type** | **Description** |
|
|
6595
|
+
| --- | --- | --- |
|
|
6596
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.artifactS3Prefix">artifactS3Prefix</a></code> | <code>string</code> | S3 prefix under which the compiled artifact is uploaded. |
|
|
6597
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
|
|
6598
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
|
|
6599
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
|
|
6600
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of the compile worker instance. |
|
|
6601
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | The VPC in which the compile worker instance is launched. |
|
|
6602
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.command">command</a></code> | <code>string[]</code> | The command to run in the container. |
|
|
6603
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
|
|
6604
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.secrets">secrets</a></code> | <code>{[ key: string ]: aws-cdk-lib.aws_batch.Secret}</code> | Secrets to pass to the container. |
|
|
6605
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.spot">spot</a></code> | <code>boolean</code> | Whether or not to use spot instances. |
|
|
6606
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.volumeSize">volumeSize</a></code> | <code>aws-cdk-lib.Size</code> | The root volume size of the worker instance. |
|
|
6607
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.vpcSubnets">vpcSubnets</a></code> | <code>aws-cdk-lib.aws_ec2.SubnetSelection</code> | The VPC Subnets this Compute Environment will launch instances in. |
|
|
6608
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.compileInstanceType">compileInstanceType</a></code> | <code>aws-cdk-lib.aws_ec2.InstanceType</code> | The EC2 instance type to use for cross-compilation. |
|
|
5872
6609
|
|
|
5873
6610
|
---
|
|
5874
6611
|
|
|
5875
|
-
##### `
|
|
6612
|
+
##### `artifactS3Prefix`<sup>Required</sup> <a name="artifactS3Prefix" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.artifactS3Prefix"></a>
|
|
5876
6613
|
|
|
5877
6614
|
```typescript
|
|
5878
|
-
public readonly
|
|
6615
|
+
public readonly artifactS3Prefix: string;
|
|
5879
6616
|
```
|
|
5880
6617
|
|
|
5881
|
-
- *Type:*
|
|
5882
|
-
- *Default:* no ulimits
|
|
6618
|
+
- *Type:* string
|
|
5883
6619
|
|
|
5884
|
-
|
|
6620
|
+
S3 prefix under which the compiled artifact is uploaded.
|
|
6621
|
+
|
|
6622
|
+
This property does not depend on the compile job finishing.
|
|
5885
6623
|
|
|
5886
6624
|
---
|
|
5887
6625
|
|
|
5888
|
-
##### `
|
|
6626
|
+
##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.bucket"></a>
|
|
5889
6627
|
|
|
5890
6628
|
```typescript
|
|
5891
|
-
public readonly
|
|
6629
|
+
public readonly bucket: IBucket;
|
|
5892
6630
|
```
|
|
5893
6631
|
|
|
5894
|
-
- *Type:*
|
|
6632
|
+
- *Type:* aws-cdk-lib.aws_s3.IBucket
|
|
5895
6633
|
|
|
5896
|
-
The
|
|
6634
|
+
The bucket to upload compiled artifacts.
|
|
5897
6635
|
|
|
5898
6636
|
---
|
|
5899
6637
|
|
|
5900
|
-
##### `
|
|
6638
|
+
##### `image`<sup>Required</sup> <a name="image" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.image"></a>
|
|
5901
6639
|
|
|
5902
6640
|
```typescript
|
|
5903
|
-
public readonly
|
|
6641
|
+
public readonly image: INeuronxContainerImage;
|
|
5904
6642
|
```
|
|
5905
6643
|
|
|
5906
|
-
- *Type:* aws-cdk-
|
|
5907
|
-
- *Default:* N bilion parameters * 5GiB EBS
|
|
6644
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a>
|
|
5908
6645
|
|
|
5909
|
-
|
|
6646
|
+
An image of the container where the compile job is executed.
|
|
5910
6647
|
|
|
5911
6648
|
---
|
|
5912
6649
|
|
|
5913
|
-
##### `
|
|
6650
|
+
##### `model`<sup>Required</sup> <a name="model" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.model"></a>
|
|
5914
6651
|
|
|
5915
6652
|
```typescript
|
|
5916
|
-
public readonly
|
|
6653
|
+
public readonly model: Model;
|
|
5917
6654
|
```
|
|
5918
6655
|
|
|
5919
|
-
- *Type:* aws-cdk-
|
|
6656
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.Model">Model</a>
|
|
5920
6657
|
|
|
5921
|
-
|
|
6658
|
+
The model to be compiled.
|
|
5922
6659
|
|
|
5923
6660
|
---
|
|
5924
6661
|
|
|
5925
|
-
##### `
|
|
6662
|
+
##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.neuronxInstanceType"></a>
|
|
5926
6663
|
|
|
5927
6664
|
```typescript
|
|
5928
|
-
public readonly
|
|
6665
|
+
public readonly neuronxInstanceType: INeuronxInstanceType;
|
|
5929
6666
|
```
|
|
5930
6667
|
|
|
5931
|
-
- *Type:*
|
|
5932
|
-
- *Default:* false
|
|
5933
|
-
|
|
5934
|
-
Whether or not to use spot instances.
|
|
6668
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
|
|
5935
6669
|
|
|
5936
|
-
|
|
5937
|
-
your job will be given two minutes of notice before reclamation.
|
|
6670
|
+
The instance type of the compile worker instance.
|
|
5938
6671
|
|
|
5939
6672
|
---
|
|
5940
6673
|
|
|
5941
|
-
##### `
|
|
6674
|
+
##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.vpc"></a>
|
|
5942
6675
|
|
|
5943
6676
|
```typescript
|
|
5944
|
-
public readonly
|
|
6677
|
+
public readonly vpc: IVpc;
|
|
5945
6678
|
```
|
|
5946
6679
|
|
|
5947
|
-
- *Type:* aws-cdk-lib.aws_ec2.
|
|
5948
|
-
- *Default:* new subnets will be created
|
|
6680
|
+
- *Type:* aws-cdk-lib.aws_ec2.IVpc
|
|
5949
6681
|
|
|
5950
|
-
|
|
6682
|
+
The VPC in which the compile worker instance is launched.
|
|
5951
6683
|
|
|
5952
6684
|
---
|
|
5953
6685
|
|
|
5954
|
-
|
|
5955
|
-
|
|
5956
|
-
#### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxCompiledModel.Initializer"></a>
|
|
6686
|
+
##### `command`<sup>Optional</sup> <a name="command" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.command"></a>
|
|
5957
6687
|
|
|
5958
6688
|
```typescript
|
|
5959
|
-
|
|
5960
|
-
|
|
5961
|
-
const neuronxCompiledModel: NeuronxCompiledModel = { ... }
|
|
6689
|
+
public readonly command: string[];
|
|
5962
6690
|
```
|
|
5963
6691
|
|
|
5964
|
-
|
|
6692
|
+
- *Type:* string[]
|
|
5965
6693
|
|
|
5966
|
-
|
|
5967
|
-
| --- | --- | --- |
|
|
5968
|
-
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
|
|
5969
|
-
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.compileTimeInstanceType">compileTimeInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | *No description.* |
|
|
5970
|
-
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.modelName">modelName</a></code> | <code>string</code> | The model name. |
|
|
5971
|
-
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Prefix">s3Prefix</a></code> | <code>string</code> | S3 prefix that compiled artifact uploaded. |
|
|
5972
|
-
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.s3Uri">s3Uri</a></code> | <code>string</code> | S3 URL that compiled artifact uploaded. |
|
|
5973
|
-
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxCompiledModel.property.weightSize">weightSize</a></code> | <code>aws-cdk-lib.Size</code> | *No description.* |
|
|
6694
|
+
The command to run in the container.
|
|
5974
6695
|
|
|
5975
6696
|
---
|
|
5976
6697
|
|
|
5977
|
-
##### `
|
|
6698
|
+
##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.environment"></a>
|
|
5978
6699
|
|
|
5979
6700
|
```typescript
|
|
5980
|
-
public readonly
|
|
6701
|
+
public readonly environment: {[ key: string ]: string};
|
|
5981
6702
|
```
|
|
5982
6703
|
|
|
5983
|
-
- *Type:*
|
|
6704
|
+
- *Type:* {[ key: string ]: string}
|
|
6705
|
+
- *Default:* No environment variables.
|
|
5984
6706
|
|
|
5985
|
-
The
|
|
6707
|
+
The environment variables to pass to the container.
|
|
6708
|
+
|
|
6709
|
+
This is only applicable when using container runtime.
|
|
5986
6710
|
|
|
5987
6711
|
---
|
|
5988
6712
|
|
|
5989
|
-
##### `
|
|
6713
|
+
##### `secrets`<sup>Optional</sup> <a name="secrets" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.secrets"></a>
|
|
5990
6714
|
|
|
5991
6715
|
```typescript
|
|
5992
|
-
public readonly
|
|
6716
|
+
public readonly secrets: {[ key: string ]: Secret};
|
|
5993
6717
|
```
|
|
5994
6718
|
|
|
5995
|
-
- *Type:*
|
|
6719
|
+
- *Type:* {[ key: string ]: aws-cdk-lib.aws_batch.Secret}
|
|
6720
|
+
|
|
6721
|
+
Secrets to pass to the container.
|
|
5996
6722
|
|
|
5997
6723
|
---
|
|
5998
6724
|
|
|
5999
|
-
##### `
|
|
6725
|
+
##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.spot"></a>
|
|
6000
6726
|
|
|
6001
6727
|
```typescript
|
|
6002
|
-
public readonly
|
|
6728
|
+
public readonly spot: boolean;
|
|
6003
6729
|
```
|
|
6004
6730
|
|
|
6005
|
-
- *Type:*
|
|
6731
|
+
- *Type:* boolean
|
|
6732
|
+
- *Default:* false
|
|
6006
6733
|
|
|
6007
|
-
|
|
6734
|
+
Whether or not to use spot instances.
|
|
6735
|
+
|
|
6736
|
+
Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at any time; your job will be given two minutes of notice before reclamation.
|
|
6008
6737
|
|
|
6009
6738
|
---
|
|
6010
6739
|
|
|
6011
|
-
##### `
|
|
6740
|
+
##### `volumeSize`<sup>Optional</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.volumeSize"></a>
|
|
6012
6741
|
|
|
6013
6742
|
```typescript
|
|
6014
|
-
public readonly
|
|
6743
|
+
public readonly volumeSize: Size;
|
|
6015
6744
|
```
|
|
6016
6745
|
|
|
6017
|
-
- *Type:*
|
|
6746
|
+
- *Type:* aws-cdk-lib.Size
|
|
6747
|
+
- *Default:* N billion parameters * 5GiB EBS
|
|
6018
6748
|
|
|
6019
|
-
|
|
6749
|
+
The root volume size of the worker instance.
|
|
6020
6750
|
|
|
6021
6751
|
---
|
|
6022
6752
|
|
|
6023
|
-
##### `
|
|
6753
|
+
##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.vpcSubnets"></a>
|
|
6024
6754
|
|
|
6025
6755
|
```typescript
|
|
6026
|
-
public readonly
|
|
6756
|
+
public readonly vpcSubnets: SubnetSelection;
|
|
6027
6757
|
```
|
|
6028
6758
|
|
|
6029
|
-
- *Type:*
|
|
6759
|
+
- *Type:* aws-cdk-lib.aws_ec2.SubnetSelection
|
|
6760
|
+
- *Default:* new subnets will be created
|
|
6030
6761
|
|
|
6031
|
-
|
|
6762
|
+
The VPC Subnets this Compute Environment will launch instances in.
|
|
6032
6763
|
|
|
6033
6764
|
---
|
|
6034
6765
|
|
|
6035
|
-
##### `
|
|
6766
|
+
##### `compileInstanceType`<sup>Optional</sup> <a name="compileInstanceType" id="aws-cdk-neuronx-patterns.NeuronxCrossCompilerProps.property.compileInstanceType"></a>
|
|
6036
6767
|
|
|
6037
6768
|
```typescript
|
|
6038
|
-
public readonly
|
|
6769
|
+
public readonly compileInstanceType: InstanceType;
|
|
6039
6770
|
```
|
|
6040
6771
|
|
|
6041
|
-
- *Type:* aws-cdk-lib.
|
|
6772
|
+
- *Type:* aws-cdk-lib.aws_ec2.InstanceType
|
|
6773
|
+
- *Default:* ec2.InstanceType.of(ec2.InstanceClass.C7I, ec2.InstanceSize.XLARGE4)
|
|
6774
|
+
|
|
6775
|
+
The EC2 instance type to use for cross-compilation.
|
|
6776
|
+
|
|
6777
|
+
This should be a non-Neuron instance type with sufficient memory and CPU
|
|
6778
|
+
for model compilation.
|
|
6042
6779
|
|
|
6043
6780
|
---
|
|
6044
6781
|
|
|
6045
|
-
###
|
|
6782
|
+
### NeuronxNativeCompilerProps <a name="NeuronxNativeCompilerProps" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps"></a>
|
|
6046
6783
|
|
|
6047
|
-
Props of
|
|
6784
|
+
Props of NeuronxNativeCompiler.
|
|
6048
6785
|
|
|
6049
|
-
#### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.
|
|
6786
|
+
#### Initializer <a name="Initializer" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.Initializer"></a>
|
|
6050
6787
|
|
|
6051
6788
|
```typescript
|
|
6052
|
-
import {
|
|
6789
|
+
import { NeuronxNativeCompilerProps } from 'aws-cdk-neuronx-patterns'
|
|
6053
6790
|
|
|
6054
|
-
const
|
|
6791
|
+
const neuronxNativeCompilerProps: NeuronxNativeCompilerProps = { ... }
|
|
6055
6792
|
```
|
|
6056
6793
|
|
|
6057
6794
|
#### Properties <a name="Properties" id="Properties"></a>
|
|
6058
6795
|
|
|
6059
6796
|
| **Name** | **Type** | **Description** |
|
|
6060
6797
|
| --- | --- | --- |
|
|
6061
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6062
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6063
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6064
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6065
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6066
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6067
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6068
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6069
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6070
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6071
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6072
|
-
| <code><a href="#aws-cdk-neuronx-patterns.
|
|
6798
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.artifactS3Prefix">artifactS3Prefix</a></code> | <code>string</code> | S3 prefix under which the compiled artifact is uploaded. |
|
|
6799
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
|
|
6800
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
|
|
6801
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
|
|
6802
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of the compile worker instance. |
|
|
6803
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | VPC in which the compile worker instance is launched. |
|
|
6804
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.command">command</a></code> | <code>string[]</code> | The command to run in the container. |
|
|
6805
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
|
|
6806
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.secrets">secrets</a></code> | <code>{[ key: string ]: aws-cdk-lib.aws_batch.Secret}</code> | Secrets to pass to the container. |
|
|
6807
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.spot">spot</a></code> | <code>boolean</code> | Whether or not to use spot instances. |
|
|
6808
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.volumeSize">volumeSize</a></code> | <code>aws-cdk-lib.Size</code> | The root volume size of the worker instance. |
|
|
6809
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.vpcSubnets">vpcSubnets</a></code> | <code>aws-cdk-lib.aws_ec2.SubnetSelection</code> | The VPC Subnets this Compute Environment will launch instances in. |
|
|
6073
6810
|
|
|
6074
6811
|
---
|
|
6075
6812
|
|
|
6076
|
-
##### `artifactS3Prefix`<sup>Required</sup> <a name="artifactS3Prefix" id="aws-cdk-neuronx-patterns.
|
|
6813
|
+
##### `artifactS3Prefix`<sup>Required</sup> <a name="artifactS3Prefix" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.artifactS3Prefix"></a>
|
|
6077
6814
|
|
|
6078
6815
|
```typescript
|
|
6079
6816
|
public readonly artifactS3Prefix: string;
|
|
@@ -6087,7 +6824,7 @@ This property does not depend on the compile job finishing.
|
|
|
6087
6824
|
|
|
6088
6825
|
---
|
|
6089
6826
|
|
|
6090
|
-
##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.
|
|
6827
|
+
##### `bucket`<sup>Required</sup> <a name="bucket" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.bucket"></a>
|
|
6091
6828
|
|
|
6092
6829
|
```typescript
|
|
6093
6830
|
public readonly bucket: IBucket;
|
|
@@ -6099,7 +6836,7 @@ The bucket to upload compiled artifacts.
|
|
|
6099
6836
|
|
|
6100
6837
|
---
|
|
6101
6838
|
|
|
6102
|
-
##### `image`<sup>Required</sup> <a name="image" id="aws-cdk-neuronx-patterns.
|
|
6839
|
+
##### `image`<sup>Required</sup> <a name="image" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.image"></a>
|
|
6103
6840
|
|
|
6104
6841
|
```typescript
|
|
6105
6842
|
public readonly image: INeuronxContainerImage;
|
|
@@ -6111,7 +6848,7 @@ An image of the container where the compile job is executed.
|
|
|
6111
6848
|
|
|
6112
6849
|
---
|
|
6113
6850
|
|
|
6114
|
-
##### `model`<sup>Required</sup> <a name="model" id="aws-cdk-neuronx-patterns.
|
|
6851
|
+
##### `model`<sup>Required</sup> <a name="model" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.model"></a>
|
|
6115
6852
|
|
|
6116
6853
|
```typescript
|
|
6117
6854
|
public readonly model: Model;
|
|
@@ -6123,7 +6860,7 @@ The model to be compiled.
|
|
|
6123
6860
|
|
|
6124
6861
|
---
|
|
6125
6862
|
|
|
6126
|
-
##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.
|
|
6863
|
+
##### `neuronxInstanceType`<sup>Required</sup> <a name="neuronxInstanceType" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.neuronxInstanceType"></a>
|
|
6127
6864
|
|
|
6128
6865
|
```typescript
|
|
6129
6866
|
public readonly neuronxInstanceType: INeuronxInstanceType;
|
|
@@ -6135,7 +6872,7 @@ The instance type of the compile worker instance.
|
|
|
6135
6872
|
|
|
6136
6873
|
---
|
|
6137
6874
|
|
|
6138
|
-
##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.
|
|
6875
|
+
##### `vpc`<sup>Required</sup> <a name="vpc" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.vpc"></a>
|
|
6139
6876
|
|
|
6140
6877
|
```typescript
|
|
6141
6878
|
public readonly vpc: IVpc;
|
|
@@ -6147,7 +6884,7 @@ VPC in which the compile worker instance is launched.
|
|
|
6147
6884
|
|
|
6148
6885
|
---
|
|
6149
6886
|
|
|
6150
|
-
##### `command`<sup>Optional</sup> <a name="command" id="aws-cdk-neuronx-patterns.
|
|
6887
|
+
##### `command`<sup>Optional</sup> <a name="command" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.command"></a>
|
|
6151
6888
|
|
|
6152
6889
|
```typescript
|
|
6153
6890
|
public readonly command: string[];
|
|
@@ -6155,9 +6892,11 @@ public readonly command: string[];
|
|
|
6155
6892
|
|
|
6156
6893
|
- *Type:* string[]
|
|
6157
6894
|
|
|
6895
|
+
The command to run in the container.
|
|
6896
|
+
|
|
6158
6897
|
---
|
|
6159
6898
|
|
|
6160
|
-
##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.
|
|
6899
|
+
##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.environment"></a>
|
|
6161
6900
|
|
|
6162
6901
|
```typescript
|
|
6163
6902
|
public readonly environment: {[ key: string ]: string};
|
|
@@ -6172,7 +6911,7 @@ This is only applicable when using container runtime.
|
|
|
6172
6911
|
|
|
6173
6912
|
---
|
|
6174
6913
|
|
|
6175
|
-
##### `secrets`<sup>Optional</sup> <a name="secrets" id="aws-cdk-neuronx-patterns.
|
|
6914
|
+
##### `secrets`<sup>Optional</sup> <a name="secrets" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.secrets"></a>
|
|
6176
6915
|
|
|
6177
6916
|
```typescript
|
|
6178
6917
|
public readonly secrets: {[ key: string ]: Secret};
|
|
@@ -6184,7 +6923,7 @@ Secrets to pass to the container.
|
|
|
6184
6923
|
|
|
6185
6924
|
---
|
|
6186
6925
|
|
|
6187
|
-
##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.
|
|
6926
|
+
##### `spot`<sup>Optional</sup> <a name="spot" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.spot"></a>
|
|
6188
6927
|
|
|
6189
6928
|
```typescript
|
|
6190
6929
|
public readonly spot: boolean;
|
|
@@ -6199,20 +6938,20 @@ Spot instances are less expensive EC2 instances that can be reclaimed by EC2 at
|
|
|
6199
6938
|
|
|
6200
6939
|
---
|
|
6201
6940
|
|
|
6202
|
-
##### `volumeSize`<sup>Optional</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.
|
|
6941
|
+
##### `volumeSize`<sup>Optional</sup> <a name="volumeSize" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.volumeSize"></a>
|
|
6203
6942
|
|
|
6204
6943
|
```typescript
|
|
6205
6944
|
public readonly volumeSize: Size;
|
|
6206
6945
|
```
|
|
6207
6946
|
|
|
6208
6947
|
- *Type:* aws-cdk-lib.Size
|
|
6209
|
-
- *Default:* N
|
|
6948
|
+
- *Default:* 5 GiB of EBS storage per billion model parameters (N billion parameters × 5 GiB)
|
|
6210
6949
|
|
|
6211
6950
|
The root volume size of the worker instance.
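
As a rough illustration of the default stated above, the following sketch computes a volume size from a parameter count. It is a standalone helper, not code from `aws-cdk-neuronx-patterns`: the function name `defaultVolumeSizeGiB`, the constant name, and the round-up behaviour for fractional counts are all assumptions made for this example.

```typescript
// Hypothetical helper; NOT part of aws-cdk-neuronx-patterns.
// It only mirrors the documented default: N billion parameters * 5 GiB.
const GIB_PER_BILLION_PARAMETERS = 5;

function defaultVolumeSizeGiB(billionParameters: number): number {
  if (!Number.isFinite(billionParameters) || billionParameters <= 0) {
    throw new RangeError('billionParameters must be a positive number');
  }
  // Assumption: round up so a fractional parameter count never
  // under-provisions the volume.
  return Math.ceil(billionParameters * GIB_PER_BILLION_PARAMETERS);
}

// An 8-billion-parameter model maps to 8 * 5 = 40 GiB.
const eightBillionGiB = defaultVolumeSizeGiB(8);
console.log(`8B parameters -> ${eightBillionGiB} GiB`);
```

When overriding `volumeSize` in a CDK app, a number like this would typically be wrapped with `Size.gibibytes(...)` from `aws-cdk-lib`.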
|
|
6212
6951
|
|
|
6213
6952
|
---
|
|
6214
6953
|
|
|
6215
|
-
##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.
|
|
6954
|
+
##### `vpcSubnets`<sup>Optional</sup> <a name="vpcSubnets" id="aws-cdk-neuronx-patterns.NeuronxNativeCompilerProps.property.vpcSubnets"></a>
|
|
6216
6955
|
|
|
6217
6956
|
```typescript
|
|
6218
6957
|
public readonly vpcSubnets: SubnetSelection;
|
|
@@ -10554,11 +11293,11 @@ const vllmNxdInferenceCompiledModel: VllmNxdInferenceCompiledModel = { ... }
|
|
|
10554
11293
|
| **Name** | **Type** | **Description** |
|
|
10555
11294
|
| --- | --- | --- |
|
|
10556
11295
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
|
|
10557
|
-
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.compileTimeInstanceType">compileTimeInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | *No description.* |
|
|
10558
11296
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.modelName">modelName</a></code> | <code>string</code> | The model name. |
|
|
11297
|
+
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.recommendedInstanceType">recommendedInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The recommended Neuron instance type for running inference with this compiled model. |
|
|
10559
11298
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.s3Prefix">s3Prefix</a></code> | <code>string</code> | S3 prefix under which the compiled artifact is uploaded. |
|
|
10560
11299
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.s3Uri">s3Uri</a></code> | <code>string</code> | S3 URL to which the compiled artifact is uploaded. |
|
|
10561
|
-
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.weightSize">weightSize</a></code> | <code>aws-cdk-lib.Size</code> |
|
|
11300
|
+
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.weightSize">weightSize</a></code> | <code>aws-cdk-lib.Size</code> | The weight size of the model. |
|
|
10562
11301
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.vllmArgs">vllmArgs</a></code> | <code><a href="#aws-cdk-neuronx-patterns.VllmEngineArguments">VllmEngineArguments</a></code> | Passed to the vllm engine at compile time. |
|
|
10563
11302
|
|
|
10564
11303
|
---
|
|
@@ -10575,25 +11314,27 @@ The bucket to upload compiled artifacts.
|
|
|
10575
11314
|
|
|
10576
11315
|
---
|
|
10577
11316
|
|
|
10578
|
-
##### `
|
|
11317
|
+
##### `modelName`<sup>Required</sup> <a name="modelName" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.modelName"></a>
|
|
10579
11318
|
|
|
10580
11319
|
```typescript
|
|
10581
|
-
public readonly
|
|
11320
|
+
public readonly modelName: string;
|
|
10582
11321
|
```
|
|
10583
11322
|
|
|
10584
|
-
- *Type:*
|
|
11323
|
+
- *Type:* string
|
|
11324
|
+
|
|
11325
|
+
The model name.
|
|
10585
11326
|
|
|
10586
11327
|
---
|
|
10587
11328
|
|
|
10588
|
-
##### `
|
|
11329
|
+
##### `recommendedInstanceType`<sup>Required</sup> <a name="recommendedInstanceType" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.recommendedInstanceType"></a>
|
|
10589
11330
|
|
|
10590
11331
|
```typescript
|
|
10591
|
-
public readonly
|
|
11332
|
+
public readonly recommendedInstanceType: INeuronxInstanceType;
|
|
10592
11333
|
```
|
|
10593
11334
|
|
|
10594
|
-
- *Type:*
|
|
11335
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
|
|
10595
11336
|
|
|
10596
|
-
The model
|
|
11337
|
+
The recommended Neuron instance type for running inference with this compiled model.
|
|
10597
11338
|
|
|
10598
11339
|
---
|
|
10599
11340
|
|
|
@@ -10629,6 +11370,8 @@ public readonly weightSize: Size;
|
|
|
10629
11370
|
|
|
10630
11371
|
- *Type:* aws-cdk-lib.Size
|
|
10631
11372
|
|
|
11373
|
+
The weight size of the model.
|
|
11374
|
+
|
|
10632
11375
|
---
|
|
10633
11376
|
|
|
10634
11377
|
##### `vllmArgs`<sup>Required</sup> <a name="vllmArgs" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompiledModel.property.vllmArgs"></a>
|
|
@@ -10662,6 +11405,7 @@ const vllmNxdInferenceCompileProps: VllmNxdInferenceCompileProps = { ... }
|
|
|
10662
11405
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.bucket">bucket</a></code> | <code>aws-cdk-lib.aws_s3.IBucket</code> | The bucket to upload compiled artifacts. |
|
|
10663
11406
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.model">model</a></code> | <code><a href="#aws-cdk-neuronx-patterns.Model">Model</a></code> | The model to be compiled. |
|
|
10664
11407
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.vpc">vpc</a></code> | <code>aws-cdk-lib.aws_ec2.IVpc</code> | VPC in which the compile worker instance is launched. |
|
|
11408
|
+
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.compileInstanceType">compileInstanceType</a></code> | <code>aws-cdk-lib.aws_ec2.InstanceType</code> | The EC2 instance type to use for cross-compilation. |
|
|
10665
11409
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.environment">environment</a></code> | <code>{[ key: string ]: string}</code> | The environment variables to pass to the container. |
|
|
10666
11410
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.image">image</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a></code> | An image of the container where the compile job is executed. |
|
|
10667
11411
|
| <code><a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.neuronxInstanceType">neuronxInstanceType</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | The instance type of the compile worker instance. |
|
|
@@ -10708,6 +11452,21 @@ VPC in which the compile worker instance is launched.
|
|
|
10708
11452
|
|
|
10709
11453
|
---
|
|
10710
11454
|
|
|
11455
|
+
##### `compileInstanceType`<sup>Optional</sup> <a name="compileInstanceType" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.compileInstanceType"></a>
|
|
11456
|
+
|
|
11457
|
+
```typescript
|
|
11458
|
+
public readonly compileInstanceType: InstanceType;
|
|
11459
|
+
```
|
|
11460
|
+
|
|
11461
|
+
- *Type:* aws-cdk-lib.aws_ec2.InstanceType
|
|
11462
|
+
- *Default:* Automatically selected based on model size
|
|
11463
|
+
|
|
11464
|
+
The EC2 instance type to use for cross-compilation.
|
|
11465
|
+
|
|
11466
|
+
This should be a non-Neuron instance type with sufficient memory for model compilation.
|
|
11467
|
+
|
|
11468
|
+
---
|
|
11469
|
+
|
|
10711
11470
|
##### `environment`<sup>Optional</sup> <a name="environment" id="aws-cdk-neuronx-patterns.VllmNxdInferenceCompileProps.property.environment"></a>
|
|
10712
11471
|
|
|
10713
11472
|
```typescript
|
|
@@ -11788,6 +12547,9 @@ new NeuronxInstanceType()
|
|
|
11788
12547
|
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.INF2_XLARGE">INF2_XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | inf2.xlarge. |
|
|
11789
12548
|
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN1_2XLARGE">TRN1_2XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn1.2xlarge. |
|
|
11790
12549
|
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN1_32XLARGE">TRN1_32XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn1.32xlarge. |
|
|
12550
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2_3XLARGE">TRN2_3XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn2.3xlarge. |
|
|
12551
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2_48XLARGE">TRN2_48XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn2.48xlarge. |
|
|
12552
|
+
| <code><a href="#aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2U_48XLARGE">TRN2U_48XLARGE</a></code> | <code><a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a></code> | trn2u.48xlarge. |
|
|
11791
12553
|
|
|
11792
12554
|
---
|
|
11793
12555
|
|
|
@@ -11863,6 +12625,42 @@ trn1.32xlarge.
|
|
|
11863
12625
|
|
|
11864
12626
|
---
|
|
11865
12627
|
|
|
12628
|
+
##### `TRN2_3XLARGE`<sup>Required</sup> <a name="TRN2_3XLARGE" id="aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2_3XLARGE"></a>
|
|
12629
|
+
|
|
12630
|
+
```typescript
|
|
12631
|
+
public readonly TRN2_3XLARGE: INeuronxInstanceType;
|
|
12632
|
+
```
|
|
12633
|
+
|
|
12634
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
|
|
12635
|
+
|
|
12636
|
+
trn2.3xlarge.
|
|
12637
|
+
|
|
12638
|
+
---
|
|
12639
|
+
|
|
12640
|
+
##### `TRN2_48XLARGE`<sup>Required</sup> <a name="TRN2_48XLARGE" id="aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2_48XLARGE"></a>
|
|
12641
|
+
|
|
12642
|
+
```typescript
|
|
12643
|
+
public readonly TRN2_48XLARGE: INeuronxInstanceType;
|
|
12644
|
+
```
|
|
12645
|
+
|
|
12646
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
|
|
12647
|
+
|
|
12648
|
+
trn2.48xlarge.
|
|
12649
|
+
|
|
12650
|
+
---
|
|
12651
|
+
|
|
12652
|
+
##### `TRN2U_48XLARGE`<sup>Required</sup> <a name="TRN2U_48XLARGE" id="aws-cdk-neuronx-patterns.NeuronxInstanceType.property.TRN2U_48XLARGE"></a>
|
|
12653
|
+
|
|
12654
|
+
```typescript
|
|
12655
|
+
public readonly TRN2U_48XLARGE: INeuronxInstanceType;
|
|
12656
|
+
```
|
|
12657
|
+
|
|
12658
|
+
- *Type:* <a href="#aws-cdk-neuronx-patterns.INeuronxInstanceType">INeuronxInstanceType</a>
|
|
12659
|
+
|
|
12660
|
+
trn2u.48xlarge.
|
|
12661
|
+
|
|
12662
|
+
---
|
|
12663
|
+
|
|
11866
12664
|
### Parameters <a name="Parameters" id="aws-cdk-neuronx-patterns.Parameters"></a>
|
|
11867
12665
|
|
|
11868
12666
|
Represents the amount of parameters.
|
|
@@ -12679,6 +13477,73 @@ public readonly neuronxCores: number;
|
|
|
12679
13477
|
---
|
|
12680
13478
|
|
|
12681
13479
|
|
|
13480
|
+
### Trainium2Chips <a name="Trainium2Chips" id="aws-cdk-neuronx-patterns.Trainium2Chips"></a>
|
|
13481
|
+
|
|
13482
|
+
- *Implements:* <a href="#aws-cdk-neuronx-patterns.IAcceleratorChips">IAcceleratorChips</a>
|
|
13483
|
+
|
|
13484
|
+
#### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.Trainium2Chips.Initializer"></a>
|
|
13485
|
+
|
|
13486
|
+
```typescript
|
|
13487
|
+
import { Trainium2Chips } from 'aws-cdk-neuronx-patterns'
|
|
13488
|
+
|
|
13489
|
+
new Trainium2Chips(chips: number)
|
|
13490
|
+
```
|
|
13491
|
+
|
|
13492
|
+
| **Name** | **Type** | **Description** |
|
|
13493
|
+
| --- | --- | --- |
|
|
13494
|
+
| <code><a href="#aws-cdk-neuronx-patterns.Trainium2Chips.Initializer.parameter.chips">chips</a></code> | <code>number</code> | *No description.* |
|
|
13495
|
+
|
|
13496
|
+
---
|
|
13497
|
+
|
|
13498
|
+
##### `chips`<sup>Required</sup> <a name="chips" id="aws-cdk-neuronx-patterns.Trainium2Chips.Initializer.parameter.chips"></a>
|
|
13499
|
+
|
|
13500
|
+
- *Type:* number
|
|
13501
|
+
|
|
13502
|
+
---
|
|
13503
|
+
|
|
13504
|
+
|
|
13505
|
+
|
|
13506
|
+
#### Properties <a name="Properties" id="Properties"></a>
|
|
13507
|
+
|
|
13508
|
+
| **Name** | **Type** | **Description** |
|
|
13509
|
+
| --- | --- | --- |
|
|
13510
|
+
| <code><a href="#aws-cdk-neuronx-patterns.Trainium2Chips.property.acceleratorMemory">acceleratorMemory</a></code> | <code>aws-cdk-lib.Size</code> | *No description.* |
|
|
13511
|
+
| <code><a href="#aws-cdk-neuronx-patterns.Trainium2Chips.property.chips">chips</a></code> | <code>number</code> | *No description.* |
|
|
13512
|
+
| <code><a href="#aws-cdk-neuronx-patterns.Trainium2Chips.property.neuronxCores">neuronxCores</a></code> | <code>number</code> | *No description.* |
|
|
13513
|
+
|
|
13514
|
+
---
|
|
13515
|
+
|
|
13516
|
+
##### `acceleratorMemory`<sup>Required</sup> <a name="acceleratorMemory" id="aws-cdk-neuronx-patterns.Trainium2Chips.property.acceleratorMemory"></a>
|
|
13517
|
+
|
|
13518
|
+
```typescript
|
|
13519
|
+
public readonly acceleratorMemory: Size;
|
|
13520
|
+
```
|
|
13521
|
+
|
|
13522
|
+
- *Type:* aws-cdk-lib.Size
|
|
13523
|
+
|
|
13524
|
+
---
|
|
13525
|
+
|
|
13526
|
+
##### `chips`<sup>Required</sup> <a name="chips" id="aws-cdk-neuronx-patterns.Trainium2Chips.property.chips"></a>
|
|
13527
|
+
|
|
13528
|
+
```typescript
|
|
13529
|
+
public readonly chips: number;
|
|
13530
|
+
```
|
|
13531
|
+
|
|
13532
|
+
- *Type:* number
|
|
13533
|
+
|
|
13534
|
+
---
|
|
13535
|
+
|
|
13536
|
+
##### `neuronxCores`<sup>Required</sup> <a name="neuronxCores" id="aws-cdk-neuronx-patterns.Trainium2Chips.property.neuronxCores"></a>
|
|
13537
|
+
|
|
13538
|
+
```typescript
|
|
13539
|
+
public readonly neuronxCores: number;
|
|
13540
|
+
```
|
|
13541
|
+
|
|
13542
|
+
- *Type:* number
|
|
13543
|
+
|
|
13544
|
+
---
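
The three properties above carry no description, so the sketch below shows one plausible way the totals could relate to the `chips` constructor argument. Everything in it is an assumption for illustration: `AcceleratorChipsLike`, `PerChipSpec`, and `ChipsSketch` are local stand-ins rather than library types, memory is a plain number of GiB instead of `aws-cdk-lib.Size`, and the per-chip figures are placeholders supplied by the caller rather than values taken from `Trainium2Chips`.

```typescript
// Local stand-in for the property shape listed above. The real interface is
// IAcceleratorChips in aws-cdk-neuronx-patterns, which exposes
// acceleratorMemory as an aws-cdk-lib Size rather than a number.
interface AcceleratorChipsLike {
  readonly chips: number;
  readonly neuronxCores: number;
  readonly acceleratorMemoryGiB: number;
}

// Hypothetical per-chip specification, supplied by the caller.
interface PerChipSpec {
  readonly neuronxCores: number;
  readonly memoryGiB: number;
}

class ChipsSketch implements AcceleratorChipsLike {
  readonly chips: number;
  readonly neuronxCores: number;
  readonly acceleratorMemoryGiB: number;

  constructor(chips: number, perChip: PerChipSpec) {
    if (!Number.isInteger(chips) || chips <= 0) {
      throw new RangeError('chips must be a positive integer');
    }
    this.chips = chips;
    // Assumption: totals scale linearly with the number of chips.
    this.neuronxCores = chips * perChip.neuronxCores;
    this.acceleratorMemoryGiB = chips * perChip.memoryGiB;
  }
}

// Placeholder per-chip figures for illustration only.
const chipsSketch = new ChipsSketch(16, { neuronxCores: 8, memoryGiB: 96 });
console.log(chipsSketch.neuronxCores, chipsSketch.acceleratorMemoryGiB);
```

Passing the per-chip specification in explicitly keeps the sketch independent of whatever constants the library actually uses.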
|
|
13545
|
+
|
|
13546
|
+
|
|
12682
13547
|
### VllmEngineArgumentsParser <a name="VllmEngineArgumentsParser" id="aws-cdk-neuronx-patterns.VllmEngineArgumentsParser"></a>
|
|
12683
13548
|
|
|
12684
13549
|
#### Initializers <a name="Initializers" id="aws-cdk-neuronx-patterns.VllmEngineArgumentsParser.Initializer"></a>
|
|
@@ -13117,7 +13982,7 @@ The neuronx SDK version.
|
|
|
13117
13982
|
|
|
13118
13983
|
### IAcceleratorChips <a name="IAcceleratorChips" id="aws-cdk-neuronx-patterns.IAcceleratorChips"></a>
|
|
13119
13984
|
|
|
13120
|
-
- *Implemented By:* <a href="#aws-cdk-neuronx-patterns.Inferentia2Chips">Inferentia2Chips</a>, <a href="#aws-cdk-neuronx-patterns.Trainium1Chips">Trainium1Chips</a>, <a href="#aws-cdk-neuronx-patterns.IAcceleratorChips">IAcceleratorChips</a>
|
|
13985
|
+
- *Implemented By:* <a href="#aws-cdk-neuronx-patterns.Inferentia2Chips">Inferentia2Chips</a>, <a href="#aws-cdk-neuronx-patterns.Trainium1Chips">Trainium1Chips</a>, <a href="#aws-cdk-neuronx-patterns.Trainium2Chips">Trainium2Chips</a>, <a href="#aws-cdk-neuronx-patterns.IAcceleratorChips">IAcceleratorChips</a>
|
|
13121
13986
|
|
|
13122
13987
|
|
|
13123
13988
|
#### Properties <a name="Properties" id="Properties"></a>
|
|
@@ -13160,6 +14025,27 @@ public readonly neuronxCores: number;
|
|
|
13160
14025
|
|
|
13161
14026
|
---
|
|
13162
14027
|
|
|
14028
|
+
### INeuronxCompiler <a name="INeuronxCompiler" id="aws-cdk-neuronx-patterns.INeuronxCompiler"></a>
|
|
14029
|
+
|
|
14030
|
+
- *Implemented By:* <a href="#aws-cdk-neuronx-patterns.NeuronxCompilerBase">NeuronxCompilerBase</a>, <a href="#aws-cdk-neuronx-patterns.NeuronxCrossCompiler">NeuronxCrossCompiler</a>, <a href="#aws-cdk-neuronx-patterns.NeuronxNativeCompiler">NeuronxNativeCompiler</a>, <a href="#aws-cdk-neuronx-patterns.INeuronxCompiler">INeuronxCompiler</a>
|
|
14031
|
+
|
|
14032
|
+
Interface for Neuronx compilers.
|
|
14033
|
+
|
|
14034
|
+
#### Methods <a name="Methods" id="Methods"></a>
|
|
14035
|
+
|
|
14036
|
+
| **Name** | **Description** |
|
|
14037
|
+
| --- | --- |
|
|
14038
|
+
| <code><a href="#aws-cdk-neuronx-patterns.INeuronxCompiler.compile">compile</a></code> | *No description.* |
|
|
14039
|
+
|
|
14040
|
+
---
|
|
14041
|
+
|
|
14042
|
+
##### `compile` <a name="compile" id="aws-cdk-neuronx-patterns.INeuronxCompiler.compile"></a>
|
|
14043
|
+
|
|
14044
|
+
```typescript
|
|
14045
|
+
public compile(): NeuronxCompiledModel
|
|
14046
|
+
```
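
Because `compile()` is the only method on the interface, calling code can be written against the interface and stay agnostic about which of the listed implementations it receives. The sketch below illustrates that with local stand-ins: `CompiledModelLike`, `CompilerLike`, `StubCompiler`, and `artifactLocation` are hypothetical names invented here, and the `s3Uri` field on the stand-in result is an assumption rather than a confirmed member of `NeuronxCompiledModel`.

```typescript
// Local stand-ins so the sketch runs without the library installed.
// The real types are INeuronxCompiler and NeuronxCompiledModel.
interface CompiledModelLike {
  readonly s3Uri: string; // assumed field, for illustration only
}

interface CompilerLike {
  compile(): CompiledModelLike;
}

// Hypothetical stub standing in for a native or cross compiler construct.
class StubCompiler implements CompilerLike {
  private readonly bucketName: string;
  private readonly prefix: string;

  constructor(bucketName: string, prefix: string) {
    this.bucketName = bucketName;
    this.prefix = prefix;
  }

  compile(): CompiledModelLike {
    // A real compiler would define compile-job resources here; the stub
    // only reports where the artifact would be written.
    return { s3Uri: `s3://${this.bucketName}/${this.prefix}` };
  }
}

// Depends only on the interface, so any implementation can be passed in.
function artifactLocation(compiler: CompilerLike): string {
  return compiler.compile().s3Uri;
}

console.log(artifactLocation(new StubCompiler('my-bucket', 'compiled/model')));
```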
|
|
14047
|
+
|
|
14048
|
+
|
|
13163
14049
|
### INeuronxContainerImage <a name="INeuronxContainerImage" id="aws-cdk-neuronx-patterns.INeuronxContainerImage"></a>
|
|
13164
14050
|
|
|
13165
14051
|
- *Implemented By:* <a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceCompileImage">VllmNxdInferenceCompileImage</a>, <a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceEcsImage">VllmNxdInferenceEcsImage</a>, <a href="#aws-cdk-neuronx-patterns.VllmNxdInferenceEcsImageBase">VllmNxdInferenceEcsImageBase</a>, <a href="#aws-cdk-neuronx-patterns.INeuronxContainerImage">INeuronxContainerImage</a>
|