jetfit 0.1.0__tar.gz
- jetfit-0.1.0/LICENSE +21 -0
- jetfit-0.1.0/PKG-INFO +264 -0
- jetfit-0.1.0/README.md +235 -0
- jetfit-0.1.0/jetfit/__init__.py +3 -0
- jetfit-0.1.0/jetfit/cli.py +220 -0
- jetfit-0.1.0/jetfit/fit.py +244 -0
- jetfit-0.1.0/jetfit/hardware.py +235 -0
- jetfit-0.1.0/jetfit/profiles.py +258 -0
- jetfit-0.1.0/jetfit/tui.py +1544 -0
- jetfit-0.1.0/jetfit.egg-info/PKG-INFO +264 -0
- jetfit-0.1.0/jetfit.egg-info/SOURCES.txt +19 -0
- jetfit-0.1.0/jetfit.egg-info/dependency_links.txt +1 -0
- jetfit-0.1.0/jetfit.egg-info/entry_points.txt +2 -0
- jetfit-0.1.0/jetfit.egg-info/requires.txt +7 -0
- jetfit-0.1.0/jetfit.egg-info/top_level.txt +1 -0
- jetfit-0.1.0/pyproject.toml +49 -0
- jetfit-0.1.0/setup.cfg +4 -0
- jetfit-0.1.0/tests/test_calibration.py +8 -0
- jetfit-0.1.0/tests/test_fit.py +164 -0
- jetfit-0.1.0/tests/test_hardware.py +236 -0
- jetfit-0.1.0/tests/test_ros2.py +8 -0
jetfit-0.1.0/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 mannsub
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
jetfit-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,264 @@
+Metadata-Version: 2.4
+Name: jetfit
+Version: 0.1.0
+Summary: LLM model advisor for NVIDIA Jetson and DGX Spark unified-memory devices
+Author-email: mannsub <akstjq0511@gmail.com>
+License: MIT
+Project-URL: Homepage, https://github.com/mannsub/jetfit
+Project-URL: Repository, https://github.com/mannsub/jetfit
+Project-URL: Issues, https://github.com/mannsub/jetfit/issues
+Keywords: jetson,llm,nvidia,tui,quantization,dgx
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: Science/Research
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Requires-Python: >=3.11
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: click>=8.1
+Requires-Dist: rich>=13.0
+Requires-Dist: textual>=0.60
+Provides-Extra: dev
+Requires-Dist: pytest>=8.0; extra == "dev"
+Requires-Dist: pytest-cov>=5.0; extra == "dev"
+Dynamic: license-file
+
+# jetfit
+
+**LLM model advisor for NVIDIA Jetson and DGX Spark unified-memory devices.**
+
+Detects your Jetson hardware, scores LLM models across quality, speed, and memory fit, and tells you exactly which quantization level will run well on your device.
+
+Ships with an interactive TUI (default) and a CLI mode. Supports hardware simulation, calibration, compare view, and plan mode.
+
+---
+
+## Install
+
+```bash
+pip install jetfit
+```
+
+or with [uv](https://github.com/astral-sh/uv):
+
+```bash
+uv tool install jetfit # install globally
+uvx jetfit # run without installing
+```
+
+---
+
+## Usage
+
+### TUI (default)
+
+```bash
+jetfit
+```
+
+Launches the interactive terminal UI. The top bar shows your detected platform, available RAM, accelerator type, and minimum JetPack version. Models are listed in a scrollable table sorted by params, with composite score, estimated tok/s, best quantization, memory %, and fit grade per row.
+
+#### Normal mode
+
+| Key | Action |
+|-----|--------|
+| `j` / `k` | Navigate models |
+| `g` | Jump to top / bottom (toggle) |
+| `Enter` | Open detail view |
+| `p` | Open plan mode |
+| `m` | Mark / unmark model for compare |
+| `c` | Open compare view (marked vs selected) |
+| `x` | Clear all marks |
+| `v` | Enter visual select mode |
+| `/` | Focus search bar |
+| `r` | Cycle provider (family) filter |
+| `b` | Cycle size filter |
+| `f` | Cycle fit filter |
+| `s` | Cycle sort column |
+| `-` | Flip sort direction |
+| `F` | Open advanced filter popup |
+| `S` | Open hardware simulation |
+| `A` | Open advanced config (tune efficiency) |
+| `t` | Cycle theme |
+| `h` | Open help |
+| `q` | Quit |
+
+#### Visual mode (`v`)
+
+Select a contiguous range of models for bulk comparison.
+
+| Key | Action |
+|-----|--------|
+| `j` / `k` | Extend selection |
+| `m` | Mark selected model |
+| `c` | Open compare view for selection |
+| `v` / `Esc` | Exit visual mode |
+
+#### Detail view (`Enter`)
+
+Shows full quant ladder for the selected model — size, KV cache, total memory, memory %, estimated tok/s, and fit grade for every quantization level. Navigate rows with `j`/`k`; the left panel updates to show specs for the highlighted quant.
+
+#### Plan mode (`p`)
+
+Estimates hardware requirements for a model config. Edit Context, Quant, and Target TPS fields. Shows minimum and recommended RAM, feasibility per run path, and upgrade deltas.
+
+| Key | Action |
+|-----|--------|
+| `Tab` / `j` / `k` | Move between fields |
+| Type | Edit current field |
+| `Backspace` | Remove characters |
+| `Esc` / `q` | Exit plan mode |
+
+#### Compare view (`c`)
+
+Side-by-side comparison of marked models. Rows are attributes (Score, tok/s, Fit, Mem%, Params, Quant, Context); columns are models. Best values are highlighted.
+
+#### Hardware simulation (`S`)
+
+Override the active hardware profile to preview recommendations for any supported Jetson or DGX Spark device without leaving the TUI. The system bar shows `(sim)` when active.
+
+#### Advanced config (`A`)
+
+Tune the efficiency factor used for tok/s estimation. Changes apply immediately and all scores are recalculated.
+
+#### Advanced filter (`F`)
+
+Set numeric bounds on parameter count and memory utilization %.
+
+---
+
+### CLI
+
+```bash
+# Detect hardware
+jetfit system
+
+# Detect hardware (JSON)
+jetfit system --json
+
+# Recommend models for current hardware
+jetfit recommend
+
+# Filter by model name
+jetfit recommend --model llama
+
+# Fix a specific quant level
+jetfit recommend --quant Q4_K_M
+
+# Show all quant levels per model
+jetfit recommend --all-quants
+
+# Override available memory
+jetfit recommend --available-gb 12.0
+
+# Target a specific hardware profile
+jetfit recommend --profile jetson_agx_orin_64gb
+
+# Minimum tok/s threshold
+jetfit recommend --min-tps 5.0
+
+# JSON output
+jetfit recommend --json
+```
+
+---
+
+## Supported Hardware
+
+| Device | RAM | Bandwidth | Accelerator | JetPack |
+|--------|-----|-----------|-------------|---------|
+| Jetson Nano | 4 GB | 25.6 GB/s | DLA+CUDA | 4.x |
+| Jetson TX2 NX | 4 GB | 51.2 GB/s | CUDA | 5.x |
+| Jetson TX2 4GB | 4 GB | 51.2 GB/s | CUDA | 4.x |
+| Jetson TX2 | 8 GB | 59.7 GB/s | CUDA | 4.x |
+| Jetson TX2i | 8 GB | 51.2 GB/s | CUDA | 4.x |
+| Jetson Xavier NX 8GB | 8 GB | 59.7 GB/s | DLA+CUDA | 5.x |
+| Jetson Xavier NX 16GB | 16 GB | 59.7 GB/s | DLA+CUDA | 5.x |
+| Jetson AGX Xavier 16GB | 16 GB | 136.5 GB/s | DLA+CUDA | 5.x |
+| Jetson AGX Xavier 32GB | 32 GB | 136.5 GB/s | DLA+CUDA | 5.x |
+| Jetson AGX Xavier 64GB | 64 GB | 136.5 GB/s | DLA+CUDA | 5.x |
+| Jetson AGX Xavier Industrial | 64 GB | 136.5 GB/s | DLA+CUDA | 5.x |
+| Jetson Orin Nano 4GB | 4 GB | 51.2 GB/s | CUDA | 6.x |
+| Jetson Orin Nano 8GB | 8 GB | 102.4 GB/s | CUDA | 6.x |
+| Jetson Orin NX 8GB | 8 GB | 102.4 GB/s | DLA+CUDA | 6.x |
+| Jetson Orin NX 16GB | 16 GB | 102.4 GB/s | DLA+CUDA | 6.x |
+| Jetson AGX Orin 32GB | 32 GB | 204.8 GB/s | DLA+CUDA | 6.x |
+| Jetson AGX Orin 64GB | 64 GB | 204.8 GB/s | DLA+CUDA | 6.x |
+| Jetson AGX Orin Industrial | 64 GB | 204.8 GB/s | DLA+CUDA | 6.x |
+| Jetson AGX Thor T4000 | 64 GB | 273 GB/s | FP4+CUDA | 6.x |
+| Jetson AGX Thor T5000 | 128 GB | 273 GB/s | FP4+CUDA | 6.x |
+| DGX Spark (GB10) | 128 GB | 273 GB/s | FP4+CUDA | — |
+
+On macOS or Linux dev machines, jetfit runs in simulation mode — pick any profile with `S` to preview recommendations.
+
+---
+
+## How it works
+
+1. **Hardware detection** — Reads device-tree model and compatible strings (`/proc/device-tree/`), tegra release (`/etc/nv_tegra_release`), and available RAM via `tegrastats`, `jtop`, or `/proc/meminfo` (priority order). On non-Jetson machines, falls back to simulation mode with a selectable profile.
+
+2. **Model database** — 67 models embedded directly in `fit.py`. Each entry has a parameter count and real context length sourced from HuggingFace. Memory requirements are computed across a 6-level quantization ladder (Q8_0 through Q2_K) using per-quant bytes-per-parameter values that account for k-quant codebook overhead.
+
+3. **KV cache accounting** — Memory estimates include a fp16 KV cache (`0.000008 × params_b × 4096 GB`) and 0.5 GB runtime overhead, so "fits" means the model will actually load at a typical 4K inference context.
+
+4. **FP4 halving** — On devices with FP4 support (Thor, DGX Spark), effective model size is halved before all memory and speed calculations.
+
+5. **Fit levels** — Based on `(weights + KV cache + overhead) / available_memory`:
+
+| Level | Utilization |
+|-------|-------------|
+| Perfect | ≤ 70% |
+| Good | 71–90% |
+| Marginal | 91–100% |
+| TooTight | > 100% |
+
+6. **Speed estimation** — Token generation is memory-bandwidth-bound. Estimated tok/s:
+
+`(bandwidth_GB_s / effective_size_GB) × efficiency × quant_speed_multiplier`
+
+Default efficiency is 0.50–0.55 per profile, tunable via `A`. Quant multipliers range from 1.00× (Q8_0) to 1.80× (Q2_K).
+
+7. **Composite score** — Each model gets a 0–100 score combining normalized speed (45%), fit level (35%), and quantization quality (20%). Used for sorting and the score column.
+
+8. **Calibration** — Run `jetfit calibrate` to measure real tok/s on your device and save a per-profile efficiency factor to `~/.config/jetfit/calibration.json`. Calibrated profiles show a `✓ cal` badge in the system bar.
+
+---
+
+## Project structure
+
+```
+jetfit/
+__init__.py -- version
+cli.py -- Click CLI entry point, TUI launch
+hardware.py -- Jetson/DGX hardware detection
+profiles.py -- Hardware profile database (22 devices)
+fit.py -- Scoring engine, quantization ladder, model catalog
+tui.py -- Textual TUI (app state, rendering, keyboard events)
+tests/
+test_hardware.py -- Hardware detection and TUI markup regression tests
+test_fit.py -- Scoring engine unit tests
+test_calibration.py
+test_ros2.py
+pyproject.toml
+LICENSE
+```
+
+---
+
+## Dependencies
+
+| Package | Purpose |
+|---------|---------|
+| `click` | CLI argument parsing |
+| `rich` | CLI table and colored output |
+| `textual` | Terminal UI framework |
+
+---
+
+## License
+
+MIT
jetfit-0.1.0/README.md
ADDED
@@ -0,0 +1,235 @@
+# jetfit
+
+**LLM model advisor for NVIDIA Jetson and DGX Spark unified-memory devices.**
+
+Detects your Jetson hardware, scores LLM models across quality, speed, and memory fit, and tells you exactly which quantization level will run well on your device.
+
+Ships with an interactive TUI (default) and a CLI mode. Supports hardware simulation, calibration, compare view, and plan mode.
+
+---
+
+## Install
+
+```bash
+pip install jetfit
+```
+
+or with [uv](https://github.com/astral-sh/uv):
+
+```bash
+uv tool install jetfit # install globally
+uvx jetfit # run without installing
+```
+
+---
+
+## Usage
+
+### TUI (default)
+
+```bash
+jetfit
+```
+
+Launches the interactive terminal UI. The top bar shows your detected platform, available RAM, accelerator type, and minimum JetPack version. Models are listed in a scrollable table sorted by params, with composite score, estimated tok/s, best quantization, memory %, and fit grade per row.
+
+#### Normal mode
+
+| Key | Action |
+|-----|--------|
+| `j` / `k` | Navigate models |
+| `g` | Jump to top / bottom (toggle) |
+| `Enter` | Open detail view |
+| `p` | Open plan mode |
+| `m` | Mark / unmark model for compare |
+| `c` | Open compare view (marked vs selected) |
+| `x` | Clear all marks |
+| `v` | Enter visual select mode |
+| `/` | Focus search bar |
+| `r` | Cycle provider (family) filter |
+| `b` | Cycle size filter |
+| `f` | Cycle fit filter |
+| `s` | Cycle sort column |
+| `-` | Flip sort direction |
+| `F` | Open advanced filter popup |
+| `S` | Open hardware simulation |
+| `A` | Open advanced config (tune efficiency) |
+| `t` | Cycle theme |
+| `h` | Open help |
+| `q` | Quit |
+
+#### Visual mode (`v`)
+
+Select a contiguous range of models for bulk comparison.
+
+| Key | Action |
+|-----|--------|
+| `j` / `k` | Extend selection |
+| `m` | Mark selected model |
+| `c` | Open compare view for selection |
+| `v` / `Esc` | Exit visual mode |
+
+#### Detail view (`Enter`)
+
+Shows full quant ladder for the selected model — size, KV cache, total memory, memory %, estimated tok/s, and fit grade for every quantization level. Navigate rows with `j`/`k`; the left panel updates to show specs for the highlighted quant.
+
+#### Plan mode (`p`)
+
+Estimates hardware requirements for a model config. Edit Context, Quant, and Target TPS fields. Shows minimum and recommended RAM, feasibility per run path, and upgrade deltas.
+
+| Key | Action |
+|-----|--------|
+| `Tab` / `j` / `k` | Move between fields |
+| Type | Edit current field |
+| `Backspace` | Remove characters |
+| `Esc` / `q` | Exit plan mode |
+
+#### Compare view (`c`)
+
+Side-by-side comparison of marked models. Rows are attributes (Score, tok/s, Fit, Mem%, Params, Quant, Context); columns are models. Best values are highlighted.
+
+#### Hardware simulation (`S`)
+
+Override the active hardware profile to preview recommendations for any supported Jetson or DGX Spark device without leaving the TUI. The system bar shows `(sim)` when active.
+
+#### Advanced config (`A`)
+
+Tune the efficiency factor used for tok/s estimation. Changes apply immediately and all scores are recalculated.
+
+#### Advanced filter (`F`)
+
+Set numeric bounds on parameter count and memory utilization %.
+
+---
+
+### CLI
+
+```bash
+# Detect hardware
+jetfit system
+
+# Detect hardware (JSON)
+jetfit system --json
+
+# Recommend models for current hardware
+jetfit recommend
+
+# Filter by model name
+jetfit recommend --model llama
+
+# Fix a specific quant level
+jetfit recommend --quant Q4_K_M
+
+# Show all quant levels per model
+jetfit recommend --all-quants
+
+# Override available memory
+jetfit recommend --available-gb 12.0
+
+# Target a specific hardware profile
+jetfit recommend --profile jetson_agx_orin_64gb
+
+# Minimum tok/s threshold
+jetfit recommend --min-tps 5.0
+
+# JSON output
+jetfit recommend --json
+```
|
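Reviewer note: the `--json` flag suggests the CLI is meant to be scripted. The following is a minimal sketch of driving `jetfit recommend` from Python, using only flags that appear in the README above. The helper names `build_recommend_command` and `run_recommend` are hypothetical, and the shape of the JSON output is not documented in this diff, so the result is returned undecoded beyond `json.loads`.

```python
import json
import subprocess


def build_recommend_command(profile=None, min_tps=None, quant=None, available_gb=None):
    """Assemble an argv list for `jetfit recommend --json`.

    Only flags shown in the README are used; this helper itself is
    hypothetical and not part of the package.
    """
    cmd = ["jetfit", "recommend", "--json"]
    if profile is not None:
        cmd += ["--profile", profile]
    if min_tps is not None:
        cmd += ["--min-tps", str(min_tps)]
    if quant is not None:
        cmd += ["--quant", quant]
    if available_gb is not None:
        cmd += ["--available-gb", str(available_gb)]
    return cmd


def run_recommend(**kwargs):
    """Run the CLI and parse stdout as JSON (schema is not assumed here)."""
    result = subprocess.run(
        build_recommend_command(**kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Calling `run_recommend(profile="jetson_agx_orin_64gb", min_tps=5.0)` requires `jetfit` to be installed and on `PATH`; building the argv list separately keeps the command easy to inspect or log before it runs.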
+
+---
+
+## Supported Hardware
+
+| Device | RAM | Bandwidth | Accelerator | JetPack |
+|--------|-----|-----------|-------------|---------|
+| Jetson Nano | 4 GB | 25.6 GB/s | DLA+CUDA | 4.x |
+| Jetson TX2 NX | 4 GB | 51.2 GB/s | CUDA | 5.x |
+| Jetson TX2 4GB | 4 GB | 51.2 GB/s | CUDA | 4.x |
+| Jetson TX2 | 8 GB | 59.7 GB/s | CUDA | 4.x |
+| Jetson TX2i | 8 GB | 51.2 GB/s | CUDA | 4.x |
+| Jetson Xavier NX 8GB | 8 GB | 59.7 GB/s | DLA+CUDA | 5.x |
+| Jetson Xavier NX 16GB | 16 GB | 59.7 GB/s | DLA+CUDA | 5.x |
+| Jetson AGX Xavier 16GB | 16 GB | 136.5 GB/s | DLA+CUDA | 5.x |
+| Jetson AGX Xavier 32GB | 32 GB | 136.5 GB/s | DLA+CUDA | 5.x |
+| Jetson AGX Xavier 64GB | 64 GB | 136.5 GB/s | DLA+CUDA | 5.x |
+| Jetson AGX Xavier Industrial | 64 GB | 136.5 GB/s | DLA+CUDA | 5.x |
+| Jetson Orin Nano 4GB | 4 GB | 51.2 GB/s | CUDA | 6.x |
+| Jetson Orin Nano 8GB | 8 GB | 102.4 GB/s | CUDA | 6.x |
+| Jetson Orin NX 8GB | 8 GB | 102.4 GB/s | DLA+CUDA | 6.x |
+| Jetson Orin NX 16GB | 16 GB | 102.4 GB/s | DLA+CUDA | 6.x |
+| Jetson AGX Orin 32GB | 32 GB | 204.8 GB/s | DLA+CUDA | 6.x |
+| Jetson AGX Orin 64GB | 64 GB | 204.8 GB/s | DLA+CUDA | 6.x |
+| Jetson AGX Orin Industrial | 64 GB | 204.8 GB/s | DLA+CUDA | 6.x |
+| Jetson AGX Thor T4000 | 64 GB | 273 GB/s | FP4+CUDA | 6.x |
+| Jetson AGX Thor T5000 | 128 GB | 273 GB/s | FP4+CUDA | 6.x |
+| DGX Spark (GB10) | 128 GB | 273 GB/s | FP4+CUDA | — |
+
+On macOS or Linux dev machines, jetfit runs in simulation mode — pick any profile with `S` to preview recommendations.
|
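Reviewer note: a rough sketch of how the hardware-versus-simulation decision could be made, based on the README's mention of `/proc/device-tree/` and `/proc/meminfo`. This is not the code in `hardware.py`. All three function names are hypothetical, and the "model string contains Jetson" rule is a simplifying assumption that would, for example, not recognise a DGX Spark.

```python
from pathlib import Path


def read_device_tree_model(path="/proc/device-tree/model"):
    """Return the device-tree model string, or None when the file is absent."""
    p = Path(path)
    if not p.exists():
        return None
    # Device-tree string properties are NUL-terminated.
    raw = p.read_bytes().rstrip(b"\x00")
    return raw.decode("utf-8", errors="replace").strip()


def parse_mem_available_gb(meminfo_text):
    """Extract MemAvailable from /proc/meminfo text and convert kB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])
            return kib / (1024 * 1024)
    return None


def detect_mode(model):
    """Assumed rule: a model string mentioning 'jetson' means real hardware."""
    if model and "jetson" in model.lower():
        return "hardware"
    return "simulation"
```

On a macOS or generic Linux machine `read_device_tree_model()` returns `None`, so `detect_mode` falls through to `"simulation"`, which matches the behaviour the README describes.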
+
+---
+
+## How it works
+
+1. **Hardware detection** — Reads device-tree model and compatible strings (`/proc/device-tree/`), tegra release (`/etc/nv_tegra_release`), and available RAM via `tegrastats`, `jtop`, or `/proc/meminfo` (priority order). On non-Jetson machines, falls back to simulation mode with a selectable profile.
+
+2. **Model database** — 67 models embedded directly in `fit.py`. Each entry has a parameter count and real context length sourced from HuggingFace. Memory requirements are computed across a 6-level quantization ladder (Q8_0 through Q2_K) using per-quant bytes-per-parameter values that account for k-quant codebook overhead.
+
+3. **KV cache accounting** — Memory estimates include a fp16 KV cache (`0.000008 × params_b × 4096 GB`) and 0.5 GB runtime overhead, so "fits" means the model will actually load at a typical 4K inference context.
+
+4. **FP4 halving** — On devices with FP4 support (Thor, DGX Spark), effective model size is halved before all memory and speed calculations.
+
+5. **Fit levels** — Based on `(weights + KV cache + overhead) / available_memory`:
+
+| Level | Utilization |
+|-------|-------------|
+| Perfect | ≤ 70% |
+| Good | 71–90% |
+| Marginal | 91–100% |
+| TooTight | > 100% |
|
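Reviewer note: steps 2 to 5 can be put together as a short sketch. It uses the KV-cache constant, the 0.5 GB overhead, the FP4 halving, and the utilization bands quoted in the README. Several things are assumptions rather than facts from `fit.py`: that `params_b` means billions of parameters, that FP4 halving applies to the weights only, that the `4096` in the KV formula can be treated as a `context` parameter, and that the band edges are 0.70, 0.90 and 1.00. The per-quant bytes-per-parameter values are not shown in this diff, so they are passed in by the caller.

```python
def estimate_total_gb(params_b, bytes_per_param, context=4096, fp4=False, overhead_gb=0.5):
    """Estimate weights + KV cache + runtime overhead in GB.

    params_b        -- parameter count in billions (assumed meaning)
    bytes_per_param -- per-quant value supplied by the caller (not from the README)
    """
    weights_gb = params_b * bytes_per_param
    if fp4:
        # README: effective model size is halved on FP4-capable devices.
        weights_gb /= 2
    # README formula: 0.000008 x params_b x 4096 GB for the fp16 KV cache.
    kv_gb = 0.000008 * params_b * context
    return weights_gb + kv_gb + overhead_gb


def fit_level(total_gb, available_gb):
    """Map memory utilization onto the README's four fit levels."""
    if available_gb <= 0:
        raise ValueError("available_gb must be positive")
    utilization = total_gb / available_gb
    if utilization <= 0.70:
        return "Perfect"
    if utilization <= 0.90:
        return "Good"
    if utilization <= 1.00:
        return "Marginal"
    return "TooTight"
```

For instance, an 8B model at a hypothetical 0.5 bytes per parameter needs about 4.76 GB in total, which is roughly 60% of 8 GB and therefore lands in the Perfect band.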
+
+6. **Speed estimation** — Token generation is memory-bandwidth-bound. Estimated tok/s:
+
+`(bandwidth_GB_s / effective_size_GB) × efficiency × quant_speed_multiplier`
+
+Default efficiency is 0.50–0.55 per profile, tunable via `A`. Quant multipliers range from 1.00× (Q8_0) to 1.80× (Q2_K).
|
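Reviewer note: the speed formula above translates directly into code. The sketch below is an illustration and not the function in `fit.py`. The name `estimate_tps` is hypothetical, and `effective_size_gb` is assumed to be the weight size after any FP4 halving. Only the two multipliers the README states (Q8_0 and Q2_K) are used, since the intermediate values are not given.

```python
def estimate_tps(bandwidth_gb_s, effective_size_gb, efficiency=0.5, quant_multiplier=1.0):
    """Bandwidth-bound decode estimate: each token reads the weights once.

    tok/s = (bandwidth / effective size) x efficiency x quant multiplier
    """
    if effective_size_gb <= 0:
        raise ValueError("effective_size_gb must be positive")
    return (bandwidth_gb_s / effective_size_gb) * efficiency * quant_multiplier


# Worked example: 204.8 GB/s (Jetson AGX Orin row in the table above),
# a 4.0 GB model, efficiency 0.50, multiplier 1.00 for Q8_0.
example_q8 = estimate_tps(204.8, 4.0, efficiency=0.5, quant_multiplier=1.00)

# Same inputs with the 1.80 multiplier the README gives for Q2_K.
example_q2 = estimate_tps(204.8, 4.0, efficiency=0.5, quant_multiplier=1.80)
```

With these inputs the first estimate is 25.6 tok/s (204.8 / 4.0 = 51.2, times 0.5) and the second is 46.08 tok/s. In practice a Q2_K file would also be smaller than a Q8_0 one, so the size argument would change too; it is held fixed here only to isolate the multiplier.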
+
+7. **Composite score** — Each model gets a 0–100 score combining normalized speed (45%), fit level (35%), and quantization quality (20%). Used for sorting and the score column.
+
+8. **Calibration** — Run `jetfit calibrate` to measure real tok/s on your device and save a per-profile efficiency factor to `~/.config/jetfit/calibration.json`. Calibrated profiles show a `✓ cal` badge in the system bar.
|
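Reviewer note: steps 7 and 8 can be sketched under some loudly stated assumptions. The README gives the 45/35/20 weights but not how each component is normalized, so the sketch takes three inputs already scaled to the range 0 to 1. For calibration, the efficiency factor is assumed to be measured tok/s divided by the bandwidth-bound ceiling from step 6, and the JSON layout (`{profile: {"efficiency": value}}`) is invented for illustration; the real file format of `calibration.json` is not shown in this diff.

```python
import json
from pathlib import Path


def composite_score(speed_norm, fit_score, quant_quality):
    """Weighted 0-100 score; each input is assumed to lie in [0, 1]."""
    for value in (speed_norm, fit_score, quant_quality):
        if not 0.0 <= value <= 1.0:
            raise ValueError("components must be normalized to [0, 1]")
    return 100.0 * (0.45 * speed_norm + 0.35 * fit_score + 0.20 * quant_quality)


def calibrated_efficiency(measured_tps, bandwidth_gb_s, effective_size_gb, quant_multiplier=1.0):
    """Back out the efficiency factor from a measured tok/s (assumed method)."""
    ceiling = (bandwidth_gb_s / effective_size_gb) * quant_multiplier
    return measured_tps / ceiling


def save_calibration(path, profile, efficiency):
    """Merge one profile's factor into a JSON file (hypothetical schema)."""
    path = Path(path)
    data = json.loads(path.read_text()) if path.exists() else {}
    data[profile] = {"efficiency": efficiency}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2))
    return data
```

Keeping the measured factor per profile means a calibration taken on one device does not leak into the estimates for another profile selected in simulation mode.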
+
+---
+
+## Project structure
+
+```
+jetfit/
+__init__.py -- version
+cli.py -- Click CLI entry point, TUI launch
+hardware.py -- Jetson/DGX hardware detection
+profiles.py -- Hardware profile database (22 devices)
+fit.py -- Scoring engine, quantization ladder, model catalog
+tui.py -- Textual TUI (app state, rendering, keyboard events)
+tests/
+test_hardware.py -- Hardware detection and TUI markup regression tests
+test_fit.py -- Scoring engine unit tests
+test_calibration.py
+test_ros2.py
+pyproject.toml
+LICENSE
+```
+
+---
+
+## Dependencies
+
+| Package | Purpose |
+|---------|---------|
+| `click` | CLI argument parsing |
+| `rich` | CLI table and colored output |
+| `textual` | Terminal UI framework |
+
+---
+
+## License
+
+MIT