aws-bootstrap-g4dn 0.3.0.tar.gz → 0.5.0.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/.pre-commit-config.yaml +1 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/CLAUDE.md +7 -1
- {aws_bootstrap_g4dn-0.3.0/aws_bootstrap_g4dn.egg-info → aws_bootstrap_g4dn-0.5.0}/PKG-INFO +35 -6
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/README.md +34 -5
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/cli.py +20 -8
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/resources/gpu_benchmark.py +15 -5
- aws_bootstrap_g4dn-0.5.0/aws_bootstrap/resources/launch.json +42 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/resources/remote_setup.sh +83 -5
- aws_bootstrap_g4dn-0.5.0/aws_bootstrap/resources/saxpy.cu +49 -0
- aws_bootstrap_g4dn-0.5.0/aws_bootstrap/resources/tasks.json +48 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/ssh.py +64 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/tests/test_cli.py +53 -1
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/tests/test_ssh_config.py +76 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0/aws_bootstrap_g4dn.egg-info}/PKG-INFO +35 -6
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap_g4dn.egg-info/SOURCES.txt +5 -1
- aws_bootstrap_g4dn-0.5.0/docs/nsight-remote-profiling.md +245 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/pyproject.toml +1 -1
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/.github/ISSUE_TEMPLATE/bug_report.md +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/.github/ISSUE_TEMPLATE/feature_request.md +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/.github/workflows/ci.yml +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/.github/workflows/publish-to-pypi.yml +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/.gitignore +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/CODE_OF_CONDUCT.md +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/CONTRIBUTING.md +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/LICENSE +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/SECURITY.md +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/__init__.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/config.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/ec2.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/gpu.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/resources/__init__.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/resources/gpu_smoke_test.ipynb +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/resources/requirements.txt +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/tests/__init__.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/tests/test_config.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/tests/test_ec2.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/tests/test_gpu.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/tests/test_ssh_gpu.py +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap_g4dn.egg-info/dependency_links.txt +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap_g4dn.egg-info/entry_points.txt +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap_g4dn.egg-info/requires.txt +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap_g4dn.egg-info/top_level.txt +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/setup.cfg +0 -0
- {aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/uv.lock +0 -0
{aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/CLAUDE.md
RENAMED
@@ -39,6 +39,9 @@ aws_bootstrap/
         __init__.py
         gpu_benchmark.py      # GPU throughput benchmark (CNN + Transformer), copied to ~/gpu_benchmark.py on instance
         gpu_smoke_test.ipynb  # Interactive Jupyter notebook for GPU verification, copied to ~/gpu_smoke_test.ipynb
+        launch.json           # VSCode CUDA debug config template (deployed to ~/workspace/.vscode/launch.json)
+        saxpy.cu              # Example CUDA SAXPY source (deployed to ~/workspace/saxpy.cu)
+        tasks.json            # VSCode CUDA build tasks template (deployed to ~/workspace/.vscode/tasks.json)
         remote_setup.sh       # Uploaded & run on instance post-boot (GPU verify, Jupyter, etc.)
         requirements.txt      # Python dependencies installed on the remote instance
     tests/                    # Unit tests (pytest)
@@ -49,6 +52,7 @@ aws_bootstrap/
         test_ssh_config.py
         test_ssh_gpu.py
 docs/
+    nsight-remote-profiling.md   # Nsight Compute, Nsight Systems, and Nsight VSCE remote profiling guide
     spot-request-lifecycle.md    # Research notes on spot request cleanup
 ```
 
@@ -58,7 +62,7 @@ Entry point: `aws-bootstrap = "aws_bootstrap.cli:main"` (installed via `uv sync`
 
 - **`launch`** — provisions an EC2 instance (spot by default, falls back to on-demand on capacity errors); adds SSH config alias (e.g. `aws-gpu1`) to `~/.ssh/config`; `--python-version` controls which Python `uv` installs in the remote venv; `--ssh-port` overrides the default SSH port (22) for security group ingress, connection checks, and SSH config
 - **`status`** — lists all non-terminated instances (including `shutting-down`) with type, IP, SSH alias, pricing (spot price/hr or on-demand), uptime, and estimated cost for running spot instances; `--gpu` flag queries GPU info via SSH, reporting both CUDA toolkit version (from `nvcc`) and driver-supported max (from `nvidia-smi`); `--instructions` (default: on) prints connection commands (SSH, Jupyter tunnel, VSCode Remote SSH, GPU benchmark) for each running instance; suppress with `--no-instructions`
-- **`terminate`** — terminates instances by ID or all aws-bootstrap instances in the region; removes SSH config aliases
+- **`terminate`** — terminates instances by ID or SSH alias (e.g. `aws-gpu1`, resolved via `~/.ssh/config`), or all aws-bootstrap instances in the region if no arguments given; removes SSH config aliases
 - **`list instance-types`** — lists EC2 instance types matching a family prefix (default: `g4dn`), showing vCPUs, memory, and GPU info
 - **`list amis`** — lists available AMIs matching a name pattern (default: Deep Learning Base OSS Nvidia Driver GPU AMIs), sorted newest-first
 
@@ -99,8 +103,10 @@ The `KNOWN_CUDA_TAGS` array in `remote_setup.sh` lists the CUDA wheel tags publi
 
 `remote_setup.sh` also:
 - Creates `~/venv` and appends `source ~/venv/bin/activate` to `~/.bashrc` so the venv is auto-activated on SSH login. When `--python-version` is passed to `launch`, the CLI sets `PYTHON_VERSION` as an inline env var on the SSH command; `remote_setup.sh` reads it to run `uv python install` and `uv venv --python` with the requested version
+- Adds NVIDIA Nsight Systems (`nsys`) to PATH if installed under `/opt/nvidia/nsight-systems/` (pre-installed on Deep Learning AMIs but not on PATH by default). Fixes directory permissions, finds the latest version, and prepends its `bin/` to PATH in `~/.bashrc`
 - Runs a quick CUDA smoke test (`torch.cuda.is_available()` + GPU matmul) after PyTorch installation to verify the GPU stack; prints a WARNING on failure but does not abort
 - Copies `gpu_benchmark.py` to `~/gpu_benchmark.py` and `gpu_smoke_test.ipynb` to `~/gpu_smoke_test.ipynb`
+- Sets up `~/workspace/.vscode/` with `launch.json` and `tasks.json` for CUDA debugging. Detects `cuda-gdb` path and GPU SM architecture (via `nvidia-smi --query-gpu=compute_cap`) at deploy time, replacing `__CUDA_GDB_PATH__` and `__GPU_ARCH__` placeholders in the template files via `sed`
 
 ## GPU Benchmark
 
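As a side note on the `--python-version` plumbing described in the CLAUDE.md hunk above: the CLI passes the version as an inline `PYTHON_VERSION` environment variable on the SSH command, and `remote_setup.sh` hands it to `uv`. A minimal sketch of that flow, assuming the script is uploaded to `/tmp/remote_setup.sh` (the exact invocation the CLI builds is not shown in this diff):

```bash
# Hedged sketch, not the tool's verbatim command: run the setup script with an
# inline PYTHON_VERSION, the way `aws-bootstrap launch --python-version 3.12` would.
ssh aws-gpu1 'PYTHON_VERSION=3.12 bash /tmp/remote_setup.sh'

# Inside remote_setup.sh the requested version drives uv roughly like this:
#   uv python install "$PYTHON_VERSION"
#   uv venv --python "$PYTHON_VERSION" ~/venv
```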
{aws_bootstrap_g4dn-0.3.0/aws_bootstrap_g4dn.egg-info → aws_bootstrap_g4dn-0.5.0}/PKG-INFO
RENAMED
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: aws-bootstrap-g4dn
-Version: 0.3.0
+Version: 0.5.0
 Summary: Bootstrap AWS EC2 GPU instances for hybrid local-remote development
 Author: Adam Ever-Hadani
 License-Expression: MIT
@@ -49,7 +49,7 @@ ssh aws-gpu1 # You're in, venv activated, PyTorch works
 ### 🎯 Target Workflows
 
 1. **Jupyter server-client** — Jupyter runs on the instance, connect from your local browser
-2. **VSCode Remote SSH** — `
+2. **VSCode Remote SSH** — opens `~/workspace` with pre-configured CUDA debug/build tasks and an example `.cu` file
 3. **NVIDIA Nsight remote debugging** — GPU debugging over SSH
 
 ---
@@ -162,6 +162,7 @@ The setup script runs automatically on the instance after SSH becomes available:
 | **GPU smoke test notebook** | Copies `gpu_smoke_test.ipynb` to `~/gpu_smoke_test.ipynb` (open in JupyterLab) |
 | **Jupyter** | Configures and starts JupyterLab as a systemd service on port 8888 |
 | **SSH keepalive** | Configures server-side keepalive to prevent idle disconnects |
+| **VSCode workspace** | Creates `~/workspace/.vscode/` with `launch.json` and `tasks.json` (auto-detected `cuda-gdb` path and GPU arch), plus an example `saxpy.cu` |
 
 ### 📊 GPU Benchmark
 
@@ -200,6 +201,28 @@ ssh -i ~/.ssh/id_ed25519 -NL 8888:localhost:8888 ubuntu@<public-ip>
 
 A **GPU smoke test notebook** (`~/gpu_smoke_test.ipynb`) is pre-installed on every instance. Open it in JupyterLab to interactively verify the CUDA stack, run FP32/FP16 matmuls, train a small CNN on MNIST, and visualise training loss and GPU memory usage.
 
+### 🖥️ VSCode Remote SSH
+
+The remote setup creates a `~/workspace` folder with pre-configured CUDA debug and build tasks:
+
+```
+~/workspace/
+├── .vscode/
+│   ├── launch.json   # CUDA debug configs (cuda-gdb path auto-detected)
+│   └── tasks.json    # nvcc build tasks (GPU arch auto-detected, e.g. sm_75)
+└── saxpy.cu          # Example CUDA source — open and press F5 to debug
+```
+
+Connect directly from your terminal:
+
+```bash
+code --folder-uri vscode-remote://ssh-remote+aws-gpu1/home/ubuntu/workspace
+```
+
+Then install the [Nsight VSCE extension](https://marketplace.visualstudio.com/items?itemName=NVIDIA.nsight-vscode-edition) on the remote when prompted. Open `saxpy.cu`, set a breakpoint, and press F5.
+
+See [Nsight remote profiling guide](docs/nsight-remote-profiling.md) for more details on CUDA debugging and profiling workflows.
+
 ### 📋 Listing Resources
 
 ```bash
@@ -238,8 +261,14 @@ aws-bootstrap status --region us-east-1
 # Terminate all aws-bootstrap instances (with confirmation prompt)
 aws-bootstrap terminate
 
-# Terminate
-aws-bootstrap terminate
+# Terminate by SSH alias (resolved via ~/.ssh/config)
+aws-bootstrap terminate aws-gpu1
+
+# Terminate by instance ID
+aws-bootstrap terminate i-abc123
+
+# Mix aliases and instance IDs
+aws-bootstrap terminate aws-gpu1 i-def456
 
 # Skip confirmation prompt
 aws-bootstrap terminate --yes
@@ -251,7 +280,7 @@ aws-bootstrap terminate --yes
    CUDA: 12.8 (driver supports up to 13.0)
 ```
 
-SSH aliases are managed automatically — they're created on `launch`, shown in `status`, and cleaned up on `terminate`. Aliases use sequential numbering (`aws-gpu1`, `aws-gpu2`, etc.) and never reuse numbers from previous instances.
+SSH aliases are managed automatically — they're created on `launch`, shown in `status`, and cleaned up on `terminate`. Aliases use sequential numbering (`aws-gpu1`, `aws-gpu2`, etc.) and never reuse numbers from previous instances. You can use aliases anywhere you'd use an instance ID, e.g. `aws-bootstrap terminate aws-gpu1`.
 
 ## EC2 vCPU Quotas
 
@@ -322,7 +351,7 @@ aws-bootstrap launch --instance-type t3.medium --ami-filter "ubuntu/images/hvm-s
 | GPU instance pricing | [instances.vantage.sh](https://instances.vantage.sh/aws/ec2/g4dn.xlarge) |
 | Spot instance quotas | [AWS docs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-limits.html) |
 | Deep Learning AMIs | [AWS docs](https://docs.aws.amazon.com/dlami/latest/devguide/what-is-dlami.html) |
-
+| Nsight remote GPU profiling | [Guide](docs/nsight-remote-profiling.md) — Nsight Compute, Nsight Systems, and Nsight VSCE on EC2 |
 
 Tutorials on setting up a CUDA environment on EC2 GPU instances:
 
{aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/README.md
RENAMED
@@ -30,7 +30,7 @@ ssh aws-gpu1 # You're in, venv activated, PyTorch works
 ### 🎯 Target Workflows
 
 1. **Jupyter server-client** — Jupyter runs on the instance, connect from your local browser
-2. **VSCode Remote SSH** — `
+2. **VSCode Remote SSH** — opens `~/workspace` with pre-configured CUDA debug/build tasks and an example `.cu` file
 3. **NVIDIA Nsight remote debugging** — GPU debugging over SSH
 
 ---
@@ -143,6 +143,7 @@ The setup script runs automatically on the instance after SSH becomes available:
 | **GPU smoke test notebook** | Copies `gpu_smoke_test.ipynb` to `~/gpu_smoke_test.ipynb` (open in JupyterLab) |
 | **Jupyter** | Configures and starts JupyterLab as a systemd service on port 8888 |
 | **SSH keepalive** | Configures server-side keepalive to prevent idle disconnects |
+| **VSCode workspace** | Creates `~/workspace/.vscode/` with `launch.json` and `tasks.json` (auto-detected `cuda-gdb` path and GPU arch), plus an example `saxpy.cu` |
 
 ### 📊 GPU Benchmark
 
@@ -181,6 +182,28 @@ ssh -i ~/.ssh/id_ed25519 -NL 8888:localhost:8888 ubuntu@<public-ip>
 
 A **GPU smoke test notebook** (`~/gpu_smoke_test.ipynb`) is pre-installed on every instance. Open it in JupyterLab to interactively verify the CUDA stack, run FP32/FP16 matmuls, train a small CNN on MNIST, and visualise training loss and GPU memory usage.
 
+### 🖥️ VSCode Remote SSH
+
+The remote setup creates a `~/workspace` folder with pre-configured CUDA debug and build tasks:
+
+```
+~/workspace/
+├── .vscode/
+│   ├── launch.json   # CUDA debug configs (cuda-gdb path auto-detected)
+│   └── tasks.json    # nvcc build tasks (GPU arch auto-detected, e.g. sm_75)
+└── saxpy.cu          # Example CUDA source — open and press F5 to debug
+```
+
+Connect directly from your terminal:
+
+```bash
+code --folder-uri vscode-remote://ssh-remote+aws-gpu1/home/ubuntu/workspace
+```
+
+Then install the [Nsight VSCE extension](https://marketplace.visualstudio.com/items?itemName=NVIDIA.nsight-vscode-edition) on the remote when prompted. Open `saxpy.cu`, set a breakpoint, and press F5.
+
+See [Nsight remote profiling guide](docs/nsight-remote-profiling.md) for more details on CUDA debugging and profiling workflows.
+
 ### 📋 Listing Resources
 
 ```bash
@@ -219,8 +242,14 @@ aws-bootstrap status --region us-east-1
 # Terminate all aws-bootstrap instances (with confirmation prompt)
 aws-bootstrap terminate
 
-# Terminate
-aws-bootstrap terminate
+# Terminate by SSH alias (resolved via ~/.ssh/config)
+aws-bootstrap terminate aws-gpu1
+
+# Terminate by instance ID
+aws-bootstrap terminate i-abc123
+
+# Mix aliases and instance IDs
+aws-bootstrap terminate aws-gpu1 i-def456
 
 # Skip confirmation prompt
 aws-bootstrap terminate --yes
@@ -232,7 +261,7 @@ aws-bootstrap terminate --yes
    CUDA: 12.8 (driver supports up to 13.0)
 ```
 
-SSH aliases are managed automatically — they're created on `launch`, shown in `status`, and cleaned up on `terminate`. Aliases use sequential numbering (`aws-gpu1`, `aws-gpu2`, etc.) and never reuse numbers from previous instances.
+SSH aliases are managed automatically — they're created on `launch`, shown in `status`, and cleaned up on `terminate`. Aliases use sequential numbering (`aws-gpu1`, `aws-gpu2`, etc.) and never reuse numbers from previous instances. You can use aliases anywhere you'd use an instance ID, e.g. `aws-bootstrap terminate aws-gpu1`.
 
 ## EC2 vCPU Quotas
 
@@ -303,7 +332,7 @@ aws-bootstrap launch --instance-type t3.medium --ami-filter "ubuntu/images/hvm-s
 | GPU instance pricing | [instances.vantage.sh](https://instances.vantage.sh/aws/ec2/g4dn.xlarge) |
 | Spot instance quotas | [AWS docs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-limits.html) |
 | Deep Learning AMIs | [AWS docs](https://docs.aws.amazon.com/dlami/latest/devguide/what-is-dlami.html) |
-
+| Nsight remote GPU profiling | [Guide](docs/nsight-remote-profiling.md) — Nsight Compute, Nsight Systems, and Nsight VSCE on EC2 |
 
 Tutorials on setting up a CUDA environment on EC2 GPU instances:
 
{aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/cli.py
RENAMED
@@ -29,6 +29,7 @@ from .ssh import (
     private_key_path,
     query_gpu_info,
     remove_ssh_host,
+    resolve_instance_id,
     run_remote_setup,
     wait_for_ssh,
 )
@@ -277,7 +278,7 @@ def launch(
     click.echo()
     click.secho("  VSCode Remote SSH:", fg="cyan")
     click.secho(
-        f"    code --folder-uri vscode-remote://ssh-remote+{alias}/home/{config.ssh_user}",
+        f"    code --folder-uri vscode-remote://ssh-remote+{alias}/home/{config.ssh_user}/workspace",
         bold=True,
     )
 
@@ -288,7 +289,7 @@ def launch(
 
     click.echo()
     click.secho("  Terminate:", fg="cyan")
-    click.secho(f"    aws-bootstrap terminate {
+    click.secho(f"    aws-bootstrap terminate {alias} --region {config.region}", bold=True)
     click.echo()
 
 
@@ -410,7 +411,7 @@ def status(region, profile, gpu, instructions):
 
         click.secho("  VSCode Remote SSH:", fg="cyan")
         click.secho(
-            f"    code --folder-uri vscode-remote://ssh-remote+{alias}/home/{user}",
+            f"    code --folder-uri vscode-remote://ssh-remote+{alias}/home/{user}/workspace",
             bold=True,
         )
 
@@ -419,7 +420,8 @@ def status(region, profile, gpu, instructions):
 
     click.echo()
     first_id = instances[0]["InstanceId"]
-
+    first_ref = ssh_hosts.get(first_id, first_id)
+    click.echo("  To terminate: " + click.style(f"aws-bootstrap terminate {first_ref}", bold=True))
     click.echo()
 
 
@@ -427,18 +429,28 @@ def status(region, profile, gpu, instructions):
 @click.option("--region", default="us-west-2", show_default=True, help="AWS region.")
 @click.option("--profile", default=None, help="AWS profile override.")
 @click.option("--yes", "-y", is_flag=True, default=False, help="Skip confirmation prompt.")
-@click.argument("instance_ids", nargs=-1)
+@click.argument("instance_ids", nargs=-1, metavar="[INSTANCE_ID_OR_ALIAS]...")
 def terminate(region, profile, yes, instance_ids):
     """Terminate instances created by aws-bootstrap.
 
-    Pass specific instance IDs
-    aws-bootstrap instances in the region.
+    Pass specific instance IDs or SSH aliases (e.g. aws-gpu1) to terminate,
+    or omit to terminate all aws-bootstrap instances in the region.
     """
     session = boto3.Session(profile_name=profile, region_name=region)
     ec2 = session.client("ec2")
 
     if instance_ids:
-        targets =
+        targets = []
+        for value in instance_ids:
+            resolved = resolve_instance_id(value)
+            if resolved is None:
+                raise CLIError(
+                    f"Could not resolve '{value}' to an instance ID.\n\n"
+                    "  It is not a valid instance ID or a known SSH alias."
+                )
+            if resolved != value:
+                info(f"Resolved alias '{value}' -> {resolved}")
+            targets.append(resolved)
     else:
        instances = find_tagged_instances(ec2, "aws-bootstrap-g4dn")
        if not instances:
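The hunk above wires alias resolution into `terminate`; the `resolve_instance_id` helper itself lives in `aws_bootstrap/ssh.py`, which is not shown in this excerpt. A short usage sketch of the resulting CLI behaviour (the instance ID in the output is illustrative):

```bash
# Terminate by SSH alias; the CLI resolves it against ~/.ssh/config first
aws-bootstrap terminate aws-gpu1 --region us-west-2
# Expected to log something like (the ID is a placeholder):
#   Resolved alias 'aws-gpu1' -> i-0123456789abcdef0
```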
{aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/resources/gpu_benchmark.py
RENAMED
@@ -628,7 +628,9 @@ def configure_precision(device: torch.device, requested: PrecisionMode) -> Preci
     return PrecisionMode.FP32
 
 
-def print_system_info(
+def print_system_info(
+    requested_precision: PrecisionMode, force_cpu: bool = False
+) -> tuple[torch.device, PrecisionMode]:
     """Print system and CUDA information, return device and actual precision mode."""
     print("\n" + "=" * 60)
     print("System Information")
@@ -636,7 +638,7 @@ def print_system_info(requested_precision: PrecisionMode) -> tuple[torch.device,
     print(f"PyTorch version: {torch.__version__}")
     print(f"Python version: {sys.version.split()[0]}")
 
-    if torch.cuda.is_available():
+    if torch.cuda.is_available() and not force_cpu:
         device = torch.device("cuda")
         print("CUDA available: Yes")
         print(f"CUDA version: {torch.version.cuda}")
@@ -666,8 +668,11 @@ def print_system_info(requested_precision: PrecisionMode) -> tuple[torch.device,
     else:
         device = torch.device("cpu")
         actual_precision = PrecisionMode.FP32
-
-
+        if force_cpu:
+            print("CPU-only mode requested (--cpu flag)")
+        else:
+            print("CUDA available: No (running on CPU)")
+        print("Running on CPU for benchmarking")
 
     print("=" * 60)
     return device, actual_precision
@@ -724,10 +729,15 @@ def main() -> None:
         action="store_true",
         help="Run CUDA/cuBLAS diagnostic tests before benchmarking",
     )
+    parser.add_argument(
+        "--cpu",
+        action="store_true",
+        help="Force CPU-only execution (for CPU vs GPU comparison)",
+    )
     args = parser.parse_args()
 
     requested_precision = PrecisionMode(args.precision)
-    device, actual_precision = print_system_info(requested_precision)
+    device, actual_precision = print_system_info(requested_precision, force_cpu=args.cpu)
 
     # Run diagnostics if requested
     if args.diagnose:
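With the new `--cpu` flag, a CPU-vs-GPU comparison run of the benchmark looks roughly like this (the script is copied to `~/gpu_benchmark.py` on the instance, and the remote venv is auto-activated on SSH login):

```bash
# Default run: uses CUDA when torch.cuda.is_available()
python ~/gpu_benchmark.py

# CPU baseline for comparison: forces CPU even when a GPU is present
python ~/gpu_benchmark.py --cpu
```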
aws_bootstrap_g4dn-0.5.0/aws_bootstrap/resources/launch.json
ADDED
@@ -0,0 +1,42 @@
+{
+  // CUDA debug configurations for VSCode
+  // Deployed to: ~/workspace/.vscode/launch.json
+  //
+  // Usage: Open any .cu file, press F5 to build and debug
+  "version": "0.2.0",
+  "configurations": [
+    {
+      "name": "CUDA: Build and Debug Active File",
+      "type": "cuda-gdb",
+      "request": "launch",
+      "program": "${fileDirname}/${fileBasenameNoExtension}",
+      "args": [],
+      "cwd": "${fileDirname}",
+      "miDebuggerPath": "__CUDA_GDB_PATH__",
+      "stopAtEntry": false,
+      "preLaunchTask": "nvcc: build active file (debug)"
+    },
+    {
+      "name": "CUDA: Build and Debug (stop at main)",
+      "type": "cuda-gdb",
+      "request": "launch",
+      "program": "${fileDirname}/${fileBasenameNoExtension}",
+      "args": [],
+      "cwd": "${fileDirname}",
+      "miDebuggerPath": "__CUDA_GDB_PATH__",
+      "stopAtEntry": true,
+      "preLaunchTask": "nvcc: build active file (debug)"
+    },
+    {
+      "name": "CUDA: Run Active File (no debug)",
+      "type": "cuda-gdb",
+      "request": "launch",
+      "program": "${fileDirname}/${fileBasenameNoExtension}",
+      "args": [],
+      "cwd": "${fileDirname}",
+      "miDebuggerPath": "__CUDA_GDB_PATH__",
+      "stopAtEntry": false,
+      "preLaunchTask": "nvcc: build active file (release)"
+    }
+  ]
+}
{aws_bootstrap_g4dn-0.3.0 → aws_bootstrap_g4dn-0.5.0}/aws_bootstrap/resources/remote_setup.sh
RENAMED
@@ -7,7 +7,7 @@ echo "=== aws-bootstrap-g4dn remote setup ==="
 
 # 1. Verify GPU
 echo ""
-echo "[1/
+echo "[1/6] Verifying GPU and CUDA..."
 if command -v nvidia-smi &>/dev/null; then
     nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
 else
@@ -20,15 +20,40 @@ else
     echo "WARNING: nvcc not found (CUDA toolkit may not be installed)"
 fi
 
+# Make Nsight Systems (nsys) available on PATH if installed under /opt/nvidia
+if ! command -v nsys &>/dev/null; then
+    NSIGHT_DIR="/opt/nvidia/nsight-systems"
+    if [ -d "$NSIGHT_DIR" ]; then
+        # Fix permissions — the parent dir is often root-only (drwx------)
+        sudo chmod o+rx "$NSIGHT_DIR"
+        # Find the latest version directory (lexicographic sort)
+        NSYS_VERSION=$(ls -1 "$NSIGHT_DIR" | sort -V | tail -1)
+        if [ -n "$NSYS_VERSION" ] && [ -x "$NSIGHT_DIR/$NSYS_VERSION/bin/nsys" ]; then
+            NSYS_BIN="$NSIGHT_DIR/$NSYS_VERSION/bin"
+            if ! grep -q "nsight-systems" ~/.bashrc 2>/dev/null; then
+                echo "export PATH=\"$NSYS_BIN:\$PATH\"" >> ~/.bashrc
+            fi
+            export PATH="$NSYS_BIN:$PATH"
+            echo " Nsight Systems $NSYS_VERSION added to PATH ($NSYS_BIN)"
+        else
+            echo " WARNING: Nsight Systems directory found but no nsys binary"
+        fi
+    else
+        echo " Nsight Systems not found at $NSIGHT_DIR"
+    fi
+else
+    echo " nsys already on PATH: $(command -v nsys)"
+fi
+
 # 2. Install utilities
 echo ""
-echo "[2/
+echo "[2/6] Installing utilities..."
 sudo apt-get update -qq
 sudo apt-get install -y -qq htop tmux tree jq
 
 # 3. Set up Python environment with uv
 echo ""
-echo "[3/
+echo "[3/6] Setting up Python environment with uv..."
 if ! command -v uv &>/dev/null; then
     curl -LsSf https://astral.sh/uv/install.sh | sh
 fi
@@ -153,7 +178,7 @@ echo " Jupyter config written to $JUPYTER_CONFIG_DIR/jupyter_lab_config.py"
 
 # 4. Jupyter systemd service
 echo ""
-echo "[4/
+echo "[4/6] Setting up Jupyter systemd service..."
 LOGIN_USER=$(whoami)
 
 sudo tee /etc/systemd/system/jupyter.service > /dev/null << SVCEOF
@@ -180,7 +205,7 @@ echo " Jupyter service started (port 8888)"
 
 # 5. SSH keepalive
 echo ""
-echo "[5/
+echo "[5/6] Configuring SSH keepalive..."
 if ! grep -q "ClientAliveInterval" /etc/ssh/sshd_config; then
     echo "ClientAliveInterval 60" | sudo tee -a /etc/ssh/sshd_config > /dev/null
     echo "ClientAliveCountMax 10" | sudo tee -a /etc/ssh/sshd_config > /dev/null
@@ -190,5 +215,58 @@ else
     echo " SSH keepalive already configured"
 fi
 
+# 6. VSCode workspace setup
+echo ""
+echo "[6/6] Setting up VSCode workspace..."
+mkdir -p ~/workspace/.vscode
+
+# Detect cuda-gdb path
+CUDA_GDB_PATH=""
+if command -v cuda-gdb &>/dev/null; then
+    CUDA_GDB_PATH=$(command -v cuda-gdb)
+elif [ -x /usr/local/cuda/bin/cuda-gdb ]; then
+    CUDA_GDB_PATH="/usr/local/cuda/bin/cuda-gdb"
+else
+    # Try glob for versioned CUDA installs
+    for p in /usr/local/cuda-*/bin/cuda-gdb; do
+        if [ -x "$p" ]; then
+            CUDA_GDB_PATH="$p"
+        fi
+    done
+fi
+if [ -z "$CUDA_GDB_PATH" ]; then
+    echo " WARNING: cuda-gdb not found — using placeholder in launch.json"
+    CUDA_GDB_PATH="cuda-gdb"
+else
+    echo " cuda-gdb: $CUDA_GDB_PATH"
+fi
+
+# Detect GPU SM architecture
+GPU_ARCH=""
+if command -v nvidia-smi &>/dev/null; then
+    COMPUTE_CAP=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -1 | tr -d '[:space:]')
+    if [ -n "$COMPUTE_CAP" ]; then
+        GPU_ARCH="sm_$(echo "$COMPUTE_CAP" | tr -d '.')"
+    fi
+fi
+if [ -z "$GPU_ARCH" ]; then
+    echo " WARNING: Could not detect GPU arch — defaulting to sm_75"
+    GPU_ARCH="sm_75"
+else
+    echo " GPU arch: $GPU_ARCH"
+fi
+
+# Copy example CUDA source into workspace
+cp /tmp/saxpy.cu ~/workspace/saxpy.cu
+echo " Deployed saxpy.cu"
+
+# Deploy launch.json with cuda-gdb path
+sed "s|__CUDA_GDB_PATH__|${CUDA_GDB_PATH}|g" /tmp/launch.json > ~/workspace/.vscode/launch.json
+echo " Deployed launch.json"
+
+# Deploy tasks.json with GPU architecture
+sed "s|__GPU_ARCH__|${GPU_ARCH}|g" /tmp/tasks.json > ~/workspace/.vscode/tasks.json
+echo " Deployed tasks.json"
+
 echo ""
 echo "=== Remote setup complete ==="
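The detection steps above can be sanity-checked by hand on the instance, using only commands the script itself relies on; the values in the comments are what a g4dn (T4, compute capability 7.5) is expected to report:

```bash
# Compute capability as reported by the driver; 7.5 maps to -arch=sm_75
nvidia-smi --query-gpu=compute_cap --format=csv,noheader

# Confirm the placeholders were substituted in the deployed templates
grep -- '-arch=' ~/workspace/.vscode/tasks.json
grep miDebuggerPath ~/workspace/.vscode/launch.json

# nsys should resolve in new login shells after the ~/.bashrc change
command -v nsys
```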
aws_bootstrap_g4dn-0.5.0/aws_bootstrap/resources/saxpy.cu
ADDED
@@ -0,0 +1,49 @@
+/**
+ * SAXPY Example, CUDA Style
+ * Source: https://developer.nvidia.com/blog/easy-introduction-cuda-c-and-c/
+ *
+ * This is included as an example CUDA C++ source file to try out the VS Code launch configuration we include on the host machine.
+ *
+ */
+#include <stdio.h>
+
+__global__
+void saxpy(int n, float a, float *x, float *y)
+{
+  int i = blockIdx.x*blockDim.x + threadIdx.x;
+  if (i < n) y[i] = a*x[i] + y[i];
+}
+
+int main(void)
+{
+  int N = 1<<20;
+  float *x, *y, *d_x, *d_y;
+  x = (float*)malloc(N*sizeof(float));
+  y = (float*)malloc(N*sizeof(float));
+
+  cudaMalloc(&d_x, N*sizeof(float));
+  cudaMalloc(&d_y, N*sizeof(float));
+
+  for (int i = 0; i < N; i++) {
+    x[i] = 1.0f;
+    y[i] = 2.0f;
+  }
+
+  cudaMemcpy(d_x, x, N*sizeof(float), cudaMemcpyHostToDevice);
+  cudaMemcpy(d_y, y, N*sizeof(float), cudaMemcpyHostToDevice);
+
+  // Perform SAXPY on 1M elements
+  saxpy<<<(N+255)/256, 256>>>(N, 2.0f, d_x, d_y);
+
+  cudaMemcpy(y, d_y, N*sizeof(float), cudaMemcpyDeviceToHost);
+
+  float maxError = 0.0f;
+  for (int i = 0; i < N; i++)
+    maxError = max(maxError, abs(y[i]-4.0f));
+  printf("Max error: %f\n", maxError);
+
+  cudaFree(d_x);
+  cudaFree(d_y);
+  free(x);
+  free(y);
+}
aws_bootstrap_g4dn-0.5.0/aws_bootstrap/resources/tasks.json
ADDED
@@ -0,0 +1,48 @@
+{
+  // CUDA build tasks for VSCode
+  // Deployed to: ~/workspace/.vscode/tasks.json
+  "version": "2.0.0",
+  "tasks": [
+    {
+      "label": "nvcc: build active file (debug)",
+      "type": "shell",
+      "command": "nvcc",
+      "args": [
+        "-g",                  // Host debug symbols
+        "-G",                  // Device (GPU) debug symbols
+        "-O0",                 // No optimization
+        "-arch=__GPU_ARCH__",  // GPU arch (auto-detected)
+        "-o",
+        "${fileDirname}/${fileBasenameNoExtension}",
+        "${file}"
+      ],
+      "options": {
+        "cwd": "${fileDirname}"
+      },
+      "problemMatcher": ["$nvcc"],
+      "group": {
+        "kind": "build",
+        "isDefault": true
+      },
+      "detail": "Compile active .cu file with debug symbols (-g -G)"
+    },
+    {
+      "label": "nvcc: build active file (release)",
+      "type": "shell",
+      "command": "nvcc",
+      "args": [
+        "-O3",
+        "-arch=__GPU_ARCH__",
+        "-o",
+        "${fileDirname}/${fileBasenameNoExtension}",
+        "${file}"
+      ],
+      "options": {
+        "cwd": "${fileDirname}"
+      },
+      "problemMatcher": ["$nvcc"],
+      "group": "build",
+      "detail": "Compile active .cu file optimized (no debug)"
+    }
+  ]
+}
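The two tasks above are what the `launch.json` configurations trigger via `preLaunchTask`. Outside VSCode, the debug task is roughly equivalent to the following manual session with the bundled example; `sm_75` and a `cuda-gdb` on PATH are assumptions matching the defaults detected on a g4dn:

```bash
cd ~/workspace

# Manual equivalent of "nvcc: build active file (debug)"
nvcc -g -G -O0 -arch=sm_75 -o saxpy saxpy.cu

# Debug on the GPU with cuda-gdb (what launch.json's miDebuggerPath points at)
cuda-gdb ./saxpy
#   (cuda-gdb) break saxpy
#   (cuda-gdb) run

# Or just run it; a correct run prints "Max error: 0.000000"
./saxpy
```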