offgrid-ai 0.8.9 → 0.8.11

package/README.md CHANGED
@@ -2,105 +2,95 @@

  # offgrid-ai

- **Privacy-first CLI for running local LLMs. Your AI, your machine, nothing leaves.**
+ **Privacy-first CLI for running local AI models on your own machine.**

  [![node](https://img.shields.io/badge/node-20%2B-3c873a)](package.json)
  [![platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux-blue)]()

- Install • Run • Done.
-
+ Install • Pick a model • Start chatting
  ```bash
  curl -fsSL https://raw.githubusercontent.com/eeshansrivastava89/offgrid-ai/main/install.sh | bash
  ```

  </div>

- ## What it does
+ ## What is offgrid-ai?

- You run `offgrid-ai`. It finds your local models, auto-configures everything, starts the server, and launches Pi. Zero configuration. No parameter tuning. No presets.
+ offgrid-ai is a command-line tool that lets you run AI models locally. Everything stays on your computer. No API keys, no remote servers, no data leaving your machine.

- **First run** walks you through installing anything missing. For GGUF models, offgrid-ai installs a managed `llama.cpp` runtime under `~/.offgrid-ai/runtime`; Homebrew is only used if you choose Homebrew-installed apps like LM Studio, Ollama, or oMLX.
+ It works with:

- ```bash
- offgrid-ai # pick a model and run it
- offgrid-ai status # show running servers (from another terminal)
- offgrid-ai stop # stop a running server
- ```
+ - Models from **LM Studio**
+ - **Ollama** models
+ - **oMLX** models on Apple Silicon
+ - GGUF models from **Hugging Face** or other sources

- ## Install
+ ## Quick start

- ### Recommended: one command installer
+ ### 1. Install

- Installs Node.js if you don't have it, then installs offgrid-ai and adds it to your PATH. Prints a welcome message so you know it worked.
+ Open your terminal and run:

  ```bash
  curl -fsSL https://raw.githubusercontent.com/eeshansrivastava89/offgrid-ai/main/install.sh | bash
  ```

- Or review the install script first:
+ This installs offgrid-ai and anything else it needs. Then open a new terminal window and run:

  ```bash
- curl -fsSL https://raw.githubusercontent.com/eeshansrivastava89/offgrid-ai/main/install.sh | less
+ offgrid-ai
  ```

- ### Already have Node.js?
+ If you already have Node.js installed, you can also use:

  ```bash
  npm install -g offgrid-ai@latest --prefer-online
  ```

- This works without extra flags, but npm hides postinstall output by default, so you won't see the welcome message. Open a new terminal window or run `source ~/.zshrc` and then `offgrid-ai`.
+ ### 2. Pick a model

- ## How it works
+ The first time you run offgrid-ai, it looks for models already on your machine. If it does not find any, it tells you how to get one.

- 1. **Auto-detect everything.** Scans for GGUF models in LM Studio and Hugging Face cache directories, and checks managed backends like Ollama/oMLX through their local APIs. Reads model metadata (quantization, context size, vision, thinking mode) directly from GGUF files. No presets, no manual configuration.
+ Supported ways to get models:

- 2. **One command to run.** `offgrid-ai` → pick a model → confirm context/KV memory settings on first setup → it starts llama-server, syncs Pi config, and launches Pi.
+ | Source | Example command |
+ |---|---|
+ | LM Studio | `lms get qwen/qwen3.5-9b` |
+ | Ollama | `ollama pull gemma3:4b` |
+ | oMLX | Use `omlx start` |
+ | Hugging Face | Download a GGUF file |

- 3. **One model at a time.** Laptops have limited RAM. One server, one model, no confusion.
+ ### 3. Start chatting

- ## Supported backends
-
- | Backend | Type | Auto-detected |
- |---|---|---|
- | **LM Studio** | Visual model browser + CLI (`lms`) | ✓ models in `~/.lmstudio/models/` |
- | **llama.cpp** | Managed local runtime | ✓ GGUF models in `~/.lmstudio/models/` and Hugging Face cache |
- | **llama.cpp MTP** | Managed local runtime (speculative decoding) | ✓ MTP detected from model metadata |
- | **Ollama** | Managed server | ✓ via `localhost:11434` |
- | **oMLX** | Managed server | ✓ via `127.0.0.1:8000` |
+ ```bash
+ offgrid-ai
+ ```

- ## First run onboarding
+ Pick a model from the list and press Enter. offgrid-ai configures the rest and opens the Pi coding agent.

- When you run `offgrid-ai` for the first time on a fresh machine:
+ ## Everyday commands

- 1. **llama.cpp runtime** — Required for GGUF models. Offered as an offgrid-ai managed runtime from official `llama.cpp` release binaries.
- 2. **Pi** Required to chat from the Pi coding agent. Offered to install via npm if missing.
- 3. **Model backend** At least one is needed (LM Studio recommended):
- - **LM Studio** visual model browser + `lms` CLI, download models with `lms get qwen/qwen3.5-9b`
- - **Ollama** models download on demand with `ollama pull`
- - **oMLX** Apple Silicon optimized
- 4. **Models** — If no models found, tells you where to get them.
+ ```bash
+ offgrid-ai # start a model
+ offgrid-ai status # see what's running
+ offgrid-ai stop # stop the running model
+ offgrid-ai benchmark # run a benchmark
+ offgrid-ai uninstall # remove offgrid-ai
+ ```

- Homebrew is optional and only prompted when you choose a Homebrew-based backend install. Subsequent runs skip everything that's already installed. When a GGUF model is set up for the first time, offgrid-ai asks only for the memory-impacting choices: context window and KV cache precision. Sampling defaults are shown but not forced into a tuning wizard.
+ ## What can I do with it?

- ## Data directory
+ - **Chat with local models** — no internet required after setup.
+ - **Run benchmarks** — compare how different models perform on creative or data-science tasks.
+ - **Keep data private** — everything happens on your machine.

- ```
- ~/.offgrid-ai/
- config.json # auto-detected paths, editable for overrides
- profiles/ # one per model, auto-created on first run
- <id>/
- profile.json # model metadata + auto-detected settings
- command.json # llama-server flags (auto-generated, hand-editable)
- notes.md # scratch notes
- logs/
- run/ # PID state files
- runtime/ # managed llama.cpp binaries
- ```
+ ## Need help?

- ## Benchmark (coming soon)
+ Run any command with `--help`:

- "Benchmark" is always shown as an option in the CLI. If the [local-llm-visual-benchmark](https://github.com/eeshansrivastava89/local-llm-visual-benchmark) repo is found locally, it works. If not, it offers to clone it. Model management works standalone; benchmarking is the upsell.
+ ```bash
+ offgrid-ai --help
+ ```

  ## Development

@@ -113,4 +103,4 @@ node bin/offgrid-ai.mjs

  ## License

- Personal project by [Eeshan Srivastava](https://eeshans.com).
+ Personal project by [Eeshan Srivastava](https://eeshans.com).
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "offgrid-ai",
-   "version": "0.8.9",
+   "version": "0.8.11",
    "description": "Privacy-first CLI for running local LLMs — discover, configure, run, benchmark",
    "author": "Eeshan Srivastava (https://eeshans.com)",
    "type": "module",
package/src/benchmark.mjs CHANGED
@@ -306,7 +306,7 @@ function renderStreamEvent(parsed, state, opts = {}) {
    state.status.mode = "thinking";
    state.status.toolName = null;
    state.status.bytes = 0;
-   state.status.tokens = 0;
+   state.status.text = "";
    printFinalLine(BENCH_COLORS.info(`[turn ${state.turn}]`));
    break;
  }
@@ -357,7 +357,7 @@ function renderStreamEvent(parsed, state, opts = {}) {
    state.status.mode = "exec";
    state.status.toolName = parsed.toolName;
    state.status.bytes = 0;
-   state.status.tokens = 0;
+   state.status.text = "";
    printFinalLine(BENCH_COLORS.tool(`[exec] ${parsed.toolName}`));
    break;
  case "tool_execution_update": {
@@ -398,7 +398,8 @@ function renderStreamEvent(parsed, state, opts = {}) {
  function updateStatusFromDelta(state, delta) {
    if (!delta) return;
    state.status.bytes += Buffer.byteLength(delta, "utf8");
-   state.status.tokens = estimatedTokensFromText(String(state.status.bytes));
+   state.status.text = (state.status.text || "") + delta;
+   state.status.tokens = estimatedTokensFromText(state.status.text);
    const label = state.status.toolName ? ` · ${state.status.toolName}` : "";
    const modeLabel = state.status.mode === "thinking" ? "thinking" : state.status.mode === "text" ? "text" : state.status.mode === "tool" ? "tool" : "exec";
    const bytes = formatBytes(state.status.bytes);
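
The hunk above changes what `estimatedTokensFromText` receives: 0.8.9 passed the byte counter rendered as a string, while 0.8.11 passes the accumulated streamed text. The sketch below illustrates why that matters. The diff does not show `estimatedTokensFromText`, so the version here is a hypothetical stand-in that assumes roughly four characters per token; `makeStatus`, `updateOld`, and `updateNew` are illustrative names, not part of the package.

```javascript
// Hypothetical stand-in: the real estimatedTokensFromText is not shown in this diff.
// Assumes a rough heuristic of about 4 characters per token.
function estimatedTokensFromText(text) {
  return Math.ceil(text.length / 4);
}

function makeStatus() {
  return { bytes: 0, text: "", tokens: 0 };
}

// 0.8.9 logic: the estimator sees the byte count as a string such as "1000".
function updateOld(status, delta) {
  if (!delta) return;
  status.bytes += Buffer.byteLength(delta, "utf8");
  status.tokens = estimatedTokensFromText(String(status.bytes));
}

// 0.8.11 logic: the estimator sees the text accumulated so far.
function updateNew(status, delta) {
  if (!delta) return;
  status.bytes += Buffer.byteLength(delta, "utf8");
  status.text = (status.text || "") + delta;
  status.tokens = estimatedTokensFromText(status.text);
}

const oldStatus = makeStatus();
const newStatus = makeStatus();
const delta = "x".repeat(100);
for (let i = 0; i < 10; i++) {
  updateOld(oldStatus, delta);
  updateNew(newStatus, delta);
}

console.log(oldStatus.tokens); // 1   ("1000" is only 4 characters long)
console.log(newStatus.tokens); // 250 (1000 characters of streamed text)
```

Under this assumed heuristic, the old estimate grows with the number of digits in the byte count rather than with the amount of text streamed. The new approach trades a little memory, since it keeps the text seen so far, for an estimate that tracks the actual output.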
@@ -462,7 +463,7 @@ export async function runBenchmarkInPi(profile, runDirectory, { signal } = {}) {
    const stderrHandle = await openFileHandle(stderrPath, "w");

    const verbose = Boolean(process.env.OFFGRID_BENCHMARK_VERBOSE);
-   const renderState = { turn: 0, status: { mode: "idle", toolName: null, bytes: 0, tokens: 0 } };
+   const renderState = { turn: 0, status: { mode: "idle", toolName: null, bytes: 0, text: "", tokens: 0 } };

    function appendResponse(text) {
      responseBuffer += text;
@@ -866,7 +867,21 @@ export async function finalizeBenchmarkRun(runDirectory, runResult, speedMetrics

    const success = existsSync(requiredPath) && (await readFile(requiredPath, "utf8")).trim().length > 0;
    const hasTurns = runResult.agentTurns > 0;
-   const failed = runResult.error || !success || !hasTurns;
+
+   let failureReason = null;
+   if (runResult.error) {
+     failureReason = typeof runResult.error === "string" ? runResult.error : (runResult.error.message ?? "Unknown error");
+   } else if (!hasTurns) {
+     failureReason = "The model did not produce any response turns.";
+   } else if (!success) {
+     if (runResult.toolCalls === 0) {
+       failureReason = `The model finished without writing the required output file (${requiredFile}). It may have returned the response as chat text instead of using the write tool.`;
+     } else {
+       failureReason = `The required output file (${requiredFile}) was missing or empty after the run.`;
+     }
+   }
+
+   const failed = failureReason !== null;

    metadata.status = failed ? "failed" : "completed";
    metadata.updatedAt = timestamp;
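
The hunk above replaces a single boolean with an ordered classification: a run error first, then missing turns, then a missing or empty output file. As a rough illustration, the same branch order can be written as a pure function. The name `classifyFailure`, the options object, and the sample inputs (including `index.html`) are assumptions made for this sketch and do not appear in the package.

```javascript
// Sketch only: mirrors the branch order of the hunk above as a pure function.
// classifyFailure and its sample inputs are illustrative, not part of offgrid-ai.
function classifyFailure(runResult, { success, requiredFile }) {
  const hasTurns = runResult.agentTurns > 0;
  if (runResult.error) {
    return typeof runResult.error === "string"
      ? runResult.error
      : (runResult.error.message ?? "Unknown error");
  }
  if (!hasTurns) {
    return "The model did not produce any response turns.";
  }
  if (!success) {
    return runResult.toolCalls === 0
      ? `The model finished without writing the required output file (${requiredFile}). It may have returned the response as chat text instead of using the write tool.`
      : `The required output file (${requiredFile}) was missing or empty after the run.`;
  }
  return null; // null means the run counts as completed
}

const missing = { success: false, requiredFile: "index.html" };
const present = { success: true, requiredFile: "index.html" };

const fromError = classifyFailure({ error: new Error("spawn failed"), agentTurns: 0, toolCalls: 0 }, missing);
const noTurns = classifyFailure({ agentTurns: 0, toolCalls: 0 }, missing);
const noToolUse = classifyFailure({ agentTurns: 3, toolCalls: 0 }, missing);
const emptyFile = classifyFailure({ agentTurns: 3, toolCalls: 2 }, missing);
const completed = classifyFailure({ agentTurns: 3, toolCalls: 2 }, present);

console.log([fromError, noTurns, noToolUse, emptyFile, completed]);
```

Because the checks are ordered, a run that both errored and produced no file reports the error message, which is usually the more specific cause.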
@@ -898,7 +913,9 @@ export async function finalizeBenchmarkRun(runDirectory, runResult, speedMetrics
      perTurn: runResult.perTurn,
    };

-   if (runResult.error) {
+   if (failureReason) {
+     metadata.error = { message: failureReason, ...(typeof runResult.error === "object" && runResult.error?.stack ? { stack: runResult.error.stack } : {}) };
+   } else if (runResult.error) {
      metadata.error = typeof runResult.error === "string"
        ? { message: runResult.error }
        : { message: runResult.error.message ?? "Unknown error", ...(runResult.error.stack ? { stack: runResult.error.stack } : {}) };
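
To make the resulting `metadata.error` shape easier to see, here is a minimal sketch that isolates the logic of this hunk. `buildErrorMetadata` is a hypothetical helper name introduced for illustration; the package assigns to `metadata.error` inline.

```javascript
// Sketch only: buildErrorMetadata is a hypothetical helper, not part of offgrid-ai.
function buildErrorMetadata(failureReason, error) {
  if (failureReason) {
    // Keep the stack only when the original error is an object that carries one.
    const hasStack = typeof error === "object" && error?.stack;
    return { message: failureReason, ...(hasStack ? { stack: error.stack } : {}) };
  }
  if (error) {
    // Fallback branch kept from 0.8.9.
    return typeof error === "string"
      ? { message: error }
      : { message: error.message ?? "Unknown error", ...(error.stack ? { stack: error.stack } : {}) };
  }
  return undefined;
}

const fromReasonOnly = buildErrorMetadata("The model did not produce any response turns.", undefined);
const fromThrown = buildErrorMetadata("spawn failed", new Error("spawn failed"));
const fromEmptyMessage = buildErrorMetadata("", new Error(""));

console.log(Object.keys(fromReasonOnly)); // [ 'message' ]
console.log(Object.keys(fromThrown));     // [ 'message', 'stack' ]
```

With the earlier hunk in place, `failureReason` is set whenever `runResult.error` is truthy, so the fallback branch appears to matter mainly when the derived reason is an empty string, as in the third call.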
@@ -1048,8 +1065,30 @@ export function renderBenchmarkSummary(metadata) {
      ];
      console.log(renderSection("Speed Metrics", renderRows(speedRows)));
    } else if (error) {
-     console.log(renderSection("Error", pc.red(error.message ?? "Unknown error")));
+     const wrappedError = wrapText(error.message ?? "Unknown error");
+     console.log(renderSection("Error", pc.red(wrappedError)));
+     if (error.message?.includes("write tool") || error.message?.includes("required output file")) {
+       const tip = wrapText("Tip: This usually means the model returned the answer as chat text instead of writing the file. Try a model with stronger tool-use support, or run the prompt manually.", 64);
+       console.log(pc.dim("\n" + tip));
+     }
+   }
+ }
+
+ function wrapText(text, width = 64) {
+   if (!text) return "";
+   const words = text.split(/\s+/);
+   const lines = [];
+   let current = "";
+   for (const word of words) {
+     if ((current + " " + word).trim().length > width) {
+       if (current) lines.push(current.trim());
+       current = word;
+     } else {
+       current = current ? `${current} ${word}` : word;
+     }
    }
+   if (current) lines.push(current.trim());
+   return lines.join("\n");
  }

  function benchmarkModelSource(profile) {
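
The `wrapText` helper added in this hunk is a greedy word wrapper. The standalone copy below reproduces it from the diff and runs it on short inputs chosen for illustration; the sample strings and widths are not from the package.

```javascript
// wrapText as added in 0.8.11, reproduced standalone so it can be run directly.
function wrapText(text, width = 64) {
  if (!text) return "";
  const words = text.split(/\s+/);
  const lines = [];
  let current = "";
  for (const word of words) {
    // Start a new line when appending the next word would exceed the width.
    if ((current + " " + word).trim().length > width) {
      if (current) lines.push(current.trim());
      current = word;
    } else {
      current = current ? `${current} ${word}` : word;
    }
  }
  if (current) lines.push(current.trim());
  return lines.join("\n");
}

const wrapped = wrapText("one two three four", 9);
console.log(wrapped);
// one two
// three
// four

// A single word longer than the width is kept whole rather than split.
const longWord = wrapText("abcdefghijkl", 5);
console.log(longWord); // abcdefghijkl
```

Since long words are never broken, a line can exceed `width` when the input contains an unbroken token such as a long file path.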