npm - @burtson-labs/bandit-stealth-cli - Versions diffs - 1.7.273 → 1.7.275 - Mend

@burtson-labs/bandit-stealth-cli 1.7.273 → 1.7.275

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -51,7 +51,7 @@ That's it. No API keys. No cloud services. The agent reads your code, searches,
 ## What it does
 - **Agentic tool use** — reads files, searches code, runs commands, writes changes
-- **Unified-diff approval gate** — every `write_file` / `apply_edit` shows a colored diff before touching disk
+- **Auditable approval gate** — writes show a colored diff, shell commands show the full command/cwd/risk, and `Allow once` / `Allow session` / `Always for target` scopes are recorded in turn traces
 - **Pre-write validation** — TypeScript, Python, JSON, C# syntax-checked before the agent can write
 - **Post-write validation** — JSON edits are re-parsed after write; failure feeds back to the agent on the next turn so it self-corrects without you flagging it
 - **Skills system** — agent activates specialized skills based on your prompt, and can create its own
@@ -63,11 +63,14 @@ That's it. No API keys. No cloud services. The agent reads your code, searches,
 - **Interactive scaffolders work** — `create-vite`, `create-react-app`, `ng new`, etc. detect a non-TTY stdin and self-abort. Bandit recognizes the pattern and surfaces a clear *"run this with `!`"* recovery hint so the model doesn't loop on a "command appeared to succeed" misread
 - **Live command output** — `npm install`, `pip install`, `watch_command npm run dev` stream their output to your terminal as it arrives, dimmed, while the spinner keeps animating. No more wondering if a 20-second install is hung
 - **Interrupt + queue** — press **Esc** mid-turn to cancel the agent and clear your queue. Type a follow-up + Enter to queue it (`queued: N · sends after this turn` in the status row). The next turn picks it up automatically
+- **Opt-in notifications** — `/notify on` enables desktop notifications for approvals, failures, background-task completion, and long turns; `/notify sound on` adds a terminal bell
 - **`?` shortcuts overlay** — type `?` at an empty prompt for a live cheatsheet that disappears the moment you backspace it
 - **`!`-prefix shell escape** — `!cmd` runs straight in your shell with full TTY access. First-use confirmation gate; per-call yellow box every time after so you can't miss the bypass. Catastrophic patterns (`rm -rf`, `mkfs`, `dd if=`) blocked even here
 - **Plan execution** — structured multi-step plans for complex refactors
 - **Session persistence** — every REPL session saved as JSONL under `~/.bandit/sessions/` for later resume
+- **Turn traces** — every agent turn writes a JSONL trace under `.bandit/turns`; `/trace` turns it into a readable timeline of prompts, permission requests/decisions, tool calls, retries, native-tool fallbacks, errors, and final output
 - **`/insights` HTML report** — local-only activity report: tool stats, top-touched files, languages, longest streak, peak day, error patterns, optional AI summary, mailto share
+- **Model behavior profiles** — `/profile` shows how Bandit treats the active model: native vs text tools, fallback policy, safe context budget, thinking default, parallel-tool limits, and known failure modes
 - **Project memory** — drop a `BANDIT.md` or `CLAUDE.md` at your workspace root and it's auto-loaded into the system prompt
 - **File + image mentions** — `@path` auto-inlines files; images are either sent multimodally or OCR'd locally (Apple Vision / tesseract)
 - **Clipboard paste** — `Ctrl+V` in the REPL pastes an image straight from your clipboard
@@ -90,6 +93,8 @@ Type `?` on an empty prompt for the at-a-glance overlay; `/help` for the full li
 | `/model [name]` | Switch model mid-session |
 | `/ollama [url]` | Show or set the Ollama endpoint — `/ollama default` resets to `http://localhost:11434` |
 | `/think on`, `/think off`, `/think auto` | Override per-model thinking-mode default |
+| `/profile [model]` | Show the active model behavior profile (tool protocol, fallback, context budget, known failure modes) |
+| `/notify status` | Configure desktop/bell notifications for approvals, failures, background tasks, and long turns |
 | `/theme [name]` | Pick a color palette (`/theme` lists; saved to global config) |
 | `/skills` | List loaded skills |
 | `/session list`, `/session resume <id>`, `/session new` | Manage sessions |
@@ -98,6 +103,7 @@ Type `?` on an empty prompt for the at-a-glance overlay; `/help` for the full li
 | `/clear` | Reset conversation (keeps session id) |
 | `/compact` | Trim old tool results to fit the context window |
 | `/rewind [id]` | Restore a file from a per-edit checkpoint |
+| `/trace`, `/trace list`, `/trace <id>` | Inspect turn traces from `.bandit/turns` |
 | `/tasks` | List background subagent tasks (`/tasks <id>` drill-down, `/tasks cancel <id>`) |
 | `/plan <goal>` | Heuristic plan first, y/N to execute |
 | `/init` | Scaffold `BANDIT.md` from a repo scan |
@@ -199,6 +205,25 @@ If you want to test models outside the recommended list, expect the reasoning-on
 - **Native tool calling** — Qwen 3.6, Qwen 2.5 Coder, Llama 3.1+, Devstral, DeepSeek-Coder-V2+. Tool schemas go in Ollama's `tools:` field. Saves ~1500–3000 tokens per turn.
 - **Text-parsing fallback** — Gemma 3/4 and anything else. XML-style tool block lives in the system prompt with the full mitigation stack armed.
+**Behavior profiles** sit beside capability detection. Capabilities answer "can this model do native tools or vision?" Behavior profiles answer "what should the harness do with it?" For example, Qwen 3.6 starts on native tools and degrades to text tools on retryable native-parser failures; Gemma-family models use compact text-tool prompting and earlier compaction; unknown models default to serialized text tools. Inspect the active profile with `/profile`.
+Workspace overrides load from `.bandit/model-profiles.json`:
+```jsonc
+{
+  "version": 1,
+  "profiles": {
+    "my-qwen": {
+      "match": ["my-qwen:14b"],
+      "protocol": { "preferred": "text-tools", "fallback": null, "envelope": "xml-json" },
+      "context": { "safeInputTokens": 12000, "outputBudgetTokens": 2048, "compaction": "early" },
+      "prompting": { "template": "qwen-agent", "examples": "strict", "thinking": "off" },
+      "reliability": { "maxParallelTools": 1, "knownFailureModes": ["custom parser drift"] }
+    }
+  }
+}
+```
 Any Ollama model works — capabilities auto-detect via `/api/show`.
 ---
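The workspace override file added in this hunk can be illustrated with a small sketch of how such a file might be parsed and matched against a model name. This is not Bandit's implementation. The type names, `parseProfiles`, and `findProfile` are hypothetical, the field shapes are inferred only from the README example, and exact string matching on `match` entries is an assumption (the README does not say whether globs or prefixes are supported). The sketch also assumes the file contains plain JSON without comments.

```typescript
// Hypothetical types mirroring the fields shown in the README example.
// Everything except "match" is optional here because the README does not
// state which fields are required.
interface ModelProfile {
  match: string[];
  protocol?: { preferred: string; fallback: string | null; envelope?: string };
  context?: { safeInputTokens: number; outputBudgetTokens: number; compaction?: string };
  prompting?: { template?: string; examples?: string; thinking?: string };
  reliability?: { maxParallelTools?: number; knownFailureModes?: string[] };
}

interface ModelProfilesFile {
  version: number;
  profiles: { [name: string]: ModelProfile };
}

// Parse the text of an override file and check its top-level shape.
// Assumption: the text is plain JSON (JSON.parse rejects comments).
function parseProfiles(text: string): ModelProfilesFile {
  const data = JSON.parse(text) as ModelProfilesFile;
  if (typeof data.version !== "number" || typeof data.profiles !== "object" || data.profiles === null) {
    throw new Error("expected an object with 'version' and 'profiles'");
  }
  const names = Object.keys(data.profiles);
  for (let i = 0; i < names.length; i++) {
    if (!Array.isArray(data.profiles[names[i]].match)) {
      throw new Error("profile '" + names[i] + "' needs a 'match' array");
    }
  }
  return data;
}

// Return the name of the first profile whose "match" list contains the model.
// Assumption: exact string comparison; the real matching rules are not documented here.
function findProfile(file: ModelProfilesFile, model: string): string | undefined {
  const names = Object.keys(file.profiles);
  for (let i = 0; i < names.length; i++) {
    if (file.profiles[names[i]].match.indexOf(model) !== -1) {
      return names[i];
    }
  }
  return undefined;
}

// A trimmed copy of the README example, used as sample input.
const sample = '{ "version": 1, "profiles": { "my-qwen": { "match": ["my-qwen:14b"], ' +
  '"reliability": { "maxParallelTools": 1 } } } }';
```

Under these assumptions, `findProfile(parseProfiles(sample), "my-qwen:14b")` returns `"my-qwen"`, and a model name that appears in no `match` list returns `undefined`, which corresponds to the case where the README says unknown models fall back to default behavior.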