@sweet-search/native-linux-arm64-gnu-cuda 2.4.1
- package/README.md +67 -0
- package/manifest.json +5 -0
- package/package.json +20 -0
- package/sweet-search +0 -0
- package/sweet-search-native.node +0 -0
package/README.md
ADDED
@@ -0,0 +1,67 @@
# @sweet-search/native-linux-arm64-gnu-cuda

NVIDIA CUDA-enabled native binaries for [sweet-search](https://github.com/panonitorg/sweet-search) on Linux arm64 (glibc): **Jetson Orin**, **Grace Hopper**, and arm64 SBSA server GPUs.

This package is a platform-scoped `optionalDependency` of the main `sweet-search`
package. npm installs it automatically on matching hosts:

- `os === 'linux'`
- `cpu === 'arm64'`
- `libc === 'glibc'` (Ubuntu/Debian/RHEL on arm64, not Alpine/musl)

On non-matching platforms, npm silently skips this package.
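
The three conditions above can also be checked by hand. The following is a minimal sketch, not npm's own matching logic: `matchesTarget` and `describeHost` are hypothetical names, and reading `glibcVersionRuntime` from Node's diagnostic report is one assumed way to tell glibc from musl.

```javascript
// Hypothetical helper: true when a host description satisfies the
// os / cpu / libc constraints listed above.
function matchesTarget(host) {
  return host.platform === 'linux'
    && host.arch === 'arm64'
    && typeof host.glibcVersion === 'string';
}

// Describe the current host. On glibc-based Linux, the header of Node's
// diagnostic report carries `glibcVersionRuntime`; elsewhere it is absent.
function describeHost() {
  const header = process.report ? process.report.getReport().header : {};
  return {
    platform: process.platform,
    arch: process.arch,
    glibcVersion: header.glibcVersionRuntime, // undefined when not glibc
  };
}

console.log(matchesTarget(describeHost()));
```

Keeping the predicate separate from host detection makes it easy to test against descriptions of other machines.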

## What's inside

- `sweet-search-native.node`: napi-rs addon built with the `cuda,flash-attn`
  Cargo features for `aarch64-unknown-linux-gnu`. Embedding + late-interaction
  inference dispatch to candle-cuda when `libcuda.so.1` is present at runtime;
  otherwise the addon fails to load and sweet-search falls back to the
  CPU-only `@sweet-search/native-linux-arm64-gnu` variant.
- `sweet-search`: Rust CLI binary (identical to the non-CUDA Linux arm64
  package).
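
The load-or-fall-back behaviour described for the addon can be pictured as a try-in-order loader. This is a sketch under stated assumptions, not the project's actual resolver: `loadFirstAvailable` and `CANDIDATES` are hypothetical names, and the module paths simply combine the two package names with the addon file name listed above.

```javascript
// Hypothetical loader: try each candidate addon in order and return the first
// one that loads. A CUDA addon that cannot find libcuda.so.1 throws when it is
// required, which moves the loop on to the CPU-only package.
function loadFirstAvailable(candidates, load = require) {
  const errors = [];
  for (const name of candidates) {
    try {
      return { name, addon: load(name) };
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`no native addon could be loaded:\n${errors.join('\n')}`);
}

// Preference order assumed here: CUDA build first, CPU-only build second.
const CANDIDATES = [
  '@sweet-search/native-linux-arm64-gnu-cuda/sweet-search-native.node',
  '@sweet-search/native-linux-arm64-gnu/sweet-search-native.node',
];
```

A caller would use it as `const { name, addon } = loadFirstAvailable(CANDIDATES);` and could log `name` to see which variant was picked. Collecting the per-candidate error messages keeps the reason for a fallback visible instead of swallowing it.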

## Runtime requirements

- Linux arm64 (aarch64) with glibc ≥ 2.31
- NVIDIA driver providing `libcuda.so.1`
- Compute capability ≥ 7.0 (Jetson Orin is SM 8.7, Grace Hopper is SM 9.0;
  older Jetson Xavier at SM 7.2 also qualifies but is EOL)
- CUDA Toolkit 12.x runtime components; the binary is linked against CUDA
  12.2 at build time
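
The numeric thresholds in this list lend themselves to a small pre-flight check. The sketch below is illustrative only: `compareVersions` and `unmetRequirements` are hypothetical helpers, and the host values are assumed to have been gathered separately (the sketch does not query the driver or the C library itself).

```javascript
// Compare dotted numeric versions such as "2.31" and "2.4" component-wise,
// so that 2.31 counts as newer than 2.4 (a float comparison would get this wrong).
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// Hypothetical check: returns the list of unmet requirements (empty when OK).
// The thresholds 2.31 and 7.0 are the ones stated in the list above.
function unmetRequirements(host) {
  const problems = [];
  if (!host.glibc || compareVersions(host.glibc, '2.31') < 0) {
    problems.push('glibc >= 2.31 required');
  }
  if (!host.computeCap || compareVersions(host.computeCap, '7.0') < 0) {
    problems.push('compute capability >= 7.0 required');
  }
  return problems;
}

// Example input shaped like a Jetson Orin class host (SM 8.7).
console.log(unmetRequirements({ glibc: '2.35', computeCap: '8.7' }));
```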

Flash-attention kernels are compiled for `CUDA_COMPUTE_CAP=87` (Jetson Orin)
with forward compatibility to SM 9.0 (Grace Hopper). On SM 7.x hardware the
flash-attn path is skipped at runtime and candle's naive attention is used
instead; the binary supports both paths.
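
As a rough illustration of that runtime choice, the selector below maps a compute capability to an attention path. It is a sketch, not the addon's code: `attentionPath` is a hypothetical name, and the cut-off at major version 8 is an assumption, since the text above only states the behaviour for SM 7.x, SM 8.7 and SM 9.0.

```javascript
// Hypothetical selector. Assumption: any device with major version >= 8 takes
// the flash-attn path; the README documents only SM 7.x (naive) and
// SM 8.7 / 9.0 (flash-attn).
function attentionPath(computeCap) {
  const [major] = computeCap.split('.').map(Number);
  return major >= 8 ? 'flash-attn' : 'naive';
}

for (const cap of ['7.2', '8.7', '9.0']) {
  console.log(`SM ${cap}: ${attentionPath(cap)}`);
}
```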

## Install

Automatic via the main `sweet-search` package:

```bash
npm install sweet-search
```

On a Linux arm64 host with a working NVIDIA driver, npm pulls this
CUDA-enabled addon and `core/infrastructure/native-resolver.js` prefers it
over the plain `@sweet-search/native-linux-arm64-gnu` variant.

## Detecting whether CUDA is armed

Runtime detection is authoritative. Run `sweet-search init`; its report
prints either `NVIDIA GPU: <name> ... candle-cuda armed` or a warning that
the standard CPU-only package should be installed. `.sweet-search/config.json`
records the decision under `runtime.hardware`. See
`docs/INIT_STRATEGY.md` → "CUDA Backend" for the full contract.
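
The recorded decision can also be read back from a script. This is a minimal sketch with two assumptions labeled in the code: that `.sweet-search/config.json` lives under the current working directory, and that nothing is known about the shape of `runtime.hardware` beyond the key itself. `hardwareDecision` is a hypothetical helper name.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Extract the recorded hardware decision from the config text. Only the
// `runtime.hardware` key comes from the README; its inner shape is not
// documented here, so the value is returned as-is (or null when absent).
function hardwareDecision(configText) {
  const config = JSON.parse(configText);
  return config.runtime?.hardware ?? null;
}

// Assumption: the config directory sits in the current working directory.
const configPath = path.join(process.cwd(), '.sweet-search', 'config.json');
if (fs.existsSync(configPath)) {
  console.log(hardwareDecision(fs.readFileSync(configPath, 'utf8')));
} else {
  console.log(`${configPath} not found; run \`sweet-search init\` first`);
}
```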

## Troubleshooting

Force-disable CUDA without uninstalling:

```bash
export SWEET_SEARCH_CUDA=0   # or pass --skip-cuda to `sweet-search init`
```
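
For scripts that wrap sweet-search, the same override can be inspected before choosing a code path. The helper below is a hypothetical sketch of how such a check might look; only the value `0` is documented above as disabling CUDA, so every other value (or an unset variable) is treated as leaving CUDA enabled.

```javascript
// Hypothetical reading of the override: "0" disables CUDA, anything else
// (including an unset variable) leaves it enabled. How sweet-search itself
// parses the variable is not specified in this README.
function cudaAllowed(env = process.env) {
  return env.SWEET_SEARCH_CUDA !== '0';
}

console.log(cudaAllowed() ? 'CUDA path permitted' : 'CUDA path disabled by SWEET_SEARCH_CUDA=0');
```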

Parity between the CUDA path and the CPU reference is validated pre-release
with `node scripts/parity-cuda.js` on a real Jetson Orin or Grace host.
package/manifest.json
ADDED
package/package.json
ADDED
@@ -0,0 +1,20 @@
{
  "name": "@sweet-search/native-linux-arm64-gnu-cuda",
  "version": "2.4.1",
  "description": "Sweet Search native binaries for Linux arm64 (glibc) with NVIDIA CUDA backend (candle-cuda + flash-attn) — Jetson Orin, Grace Hopper, and arm64 server GPUs",
  "os": ["linux"],
  "cpu": ["arm64"],
  "libc": ["glibc"],
  "files": [
    "manifest.json",
    "sweet-search-native.node",
    "sweet-search",
    "README.md"
  ],
  "repository": {
    "type": "git",
    "url": "https://github.com/panonitorg/sweet-search"
  },
  "license": "Apache-2.0",
  "author": "Marko Sladojevic <marko@panonit.com> (https://panonit.com)"
}
package/sweet-search
ADDED
Binary file
package/sweet-search-native.node
ADDED
Binary file