npm - @burtson-labs/bandit-stealth-cli - Versions diffs - 1.7.80 → 1.7.84 - Mend

@burtson-labs/bandit-stealth-cli 1.7.80 → 1.7.84

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -185,6 +185,44 @@ Workspace config overrides user config. Secrets belong in the user-level file, n
 Running a bigger model on a remote Ollama instance? Point `OLLAMA_URL` at the remote endpoint and set `BANDIT_MODEL` to the bigger model. Requests route to the remote node; everything else stays local.
+#### Rented GPU (RunPod / Vast.ai / Lambda)
+When you need to run a model your local hardware can't fit, Bandit talks to any remote Ollama endpoint — including rented GPU pods. Same shape on every provider: spin up a pod with Ollama on port 11434, copy the proxy URL, point `OLLAMA_URL` at it.
+**RunPod** (recommended — simplest UX):
+```bash
+# 1. From the RunPod template gallery, pick any Ollama template.
+#    H100 SXM is the right pick for 27-32B models; multi-GPU only
+#    needed for 70B+. Network volume optional but useful if you want
+#    model weights to persist across pod restarts.
+# 2. Once the pod boots, copy its proxy URL from the dashboard.
+#    Format: https://<pod-id>-11434.proxy.runpod.net
+# 3. SSH into the pod and pull a model:
+ollama pull qwen3.6:27b
+# 4. Locally, point Bandit at it:
+export OLLAMA_URL="https://<pod-id>-11434.proxy.runpod.net"
+export BANDIT_MODEL="qwen3.6:27b"
+bandit
+```
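Before launching `bandit`, it can help to confirm that the remote endpoint is reachable and has the model pulled. The following is a minimal sketch, assuming the pod's proxy forwards Ollama's standard `/api/tags` endpoint (which lists installed models). `has_model` is a hypothetical helper name, not part of Bandit or Ollama.

```bash
# Hypothetical helper (not part of Bandit or Ollama): reads an Ollama
# /api/tags JSON response on stdin and succeeds if it mentions the
# given model tag as a quoted string.
has_model() {
  grep -Fq "\"$1\""
}

# Usage sketch (assumes OLLAMA_URL and BANDIT_MODEL are exported as above):
#   curl -fsS "$OLLAMA_URL/api/tags" | has_model "$BANDIT_MODEL" \
#     && echo "model ready" || echo "model missing or endpoint unreachable"
```

If `curl` fails, the helper sees empty input and reports failure, so one check covers both an unreachable pod and a missing model.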
+Tear the pod down when you're done. At ~$2/hr for an H100 SXM, a 15-20 minute agent session costs roughly $0.50-$0.67, so under $1.
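The cost estimate above is hourly rate times session length in hours. As a rough sketch, with `session_cost` as a hypothetical helper and the $2/hr rate taken from the text rather than from any provider's current pricing:

```bash
# Hypothetical helper: estimated session cost in dollars,
# given an hourly rate and a duration in minutes.
session_cost() {
  awk -v rate="$1" -v minutes="$2" 'BEGIN { printf "%.2f\n", rate * minutes / 60 }'
}

session_cost 2 15   # prints 0.50
session_cost 2 20   # prints 0.67
```

Time spent pulling model weights is billed too, so a real session may run somewhat longer than the agent work alone.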
+**Vast.ai / Lambda Labs**: same pattern. Find an Ollama-preloaded image (or install Ollama yourself with its official Linux install script), expose port 11434, and set `OLLAMA_URL` to the host URL.
+**Recommended models for rented GPU:**
+| Model | Size | What it's good at |
+|---|---|---|
+| `qwen3.6:27b` | ~17 GB | Same model as `bandit-logic`. Native tool calling, vision, 256K context. Best general-purpose pick. |
+| `qwen2.5-coder:32b` | ~20 GB | Code-specialist post-train. Strongest on file edits and refactors. |
+| `qwen3.6:35b` | ~24 GB | Bigger Qwen 3.6 variant — slower, marginally better reasoning. |
+**Avoid for agent work:** `gpt-oss:120b` and similar reasoning-tuned models. They're post-trained for OpenAI's harmony tool-call format, not the XML protocol Bandit uses for non-native models — they tend to narrate intent without emitting tool calls. Great for math/proofs in chat, poor for filesystem agent loops.
 ---
 ## Security & privacy