npm - llm-cli-gateway - Versions diffs - 1.4.0 → 1.5.13 - Mend

llm-cli-gateway 1.4.0 → 1.5.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (62)

package/CHANGELOG.md +135 -1
package/README.md +358 -15
package/dist/approval-manager.d.ts +1 -1
package/dist/async-job-manager.d.ts +32 -2
package/dist/async-job-manager.js +101 -16
package/dist/auth.d.ts +15 -0
package/dist/auth.js +46 -0
package/dist/cli-updater.d.ts +19 -2
package/dist/cli-updater.js +110 -7
package/dist/codex-json-parser.d.ts +34 -0
package/dist/codex-json-parser.js +105 -0
package/dist/config.d.ts +30 -0
package/dist/config.js +167 -0
package/dist/doctor.d.ts +110 -0
package/dist/doctor.js +280 -0
package/dist/endpoint-exposure.d.ts +22 -0
package/dist/endpoint-exposure.js +231 -0
package/dist/entrypoint-url.d.ts +1 -0
package/dist/entrypoint-url.js +5 -0
package/dist/executor.d.ts +9 -1
package/dist/executor.js +52 -17
package/dist/flight-recorder.d.ts +3 -1
package/dist/flight-recorder.js +31 -2
package/dist/gateway-server.d.ts +2 -0
package/dist/gateway-server.js +1 -0
package/dist/gemini-json-parser.d.ts +21 -0
package/dist/gemini-json-parser.js +47 -0
package/dist/health.d.ts +7 -0
package/dist/health.js +22 -0
package/dist/http-transport.d.ts +22 -0
package/dist/http-transport.js +164 -0
package/dist/index.d.ts +186 -2
package/dist/index.js +2761 -1454
package/dist/job-store.d.ts +118 -2
package/dist/job-store.js +176 -5
package/dist/logger.d.ts +9 -0
package/dist/logger.js +14 -0
package/dist/model-registry.js +40 -6
package/dist/provider-login-guidance.d.ts +21 -0
package/dist/provider-login-guidance.js +98 -0
package/dist/provider-status.d.ts +41 -0
package/dist/provider-status.js +203 -0
package/dist/request-helpers.d.ts +484 -4
package/dist/request-helpers.js +613 -0
package/dist/resources.js +44 -0
package/dist/session-manager-pg.js +1 -0
package/dist/session-manager.d.ts +1 -1
package/dist/session-manager.js +2 -1
package/dist/upstream-contracts.d.ts +62 -0
package/dist/upstream-contracts.js +620 -0
package/dist/validation-normalizer.d.ts +23 -0
package/dist/validation-normalizer.js +79 -0
package/dist/validation-orchestrator.d.ts +47 -0
package/dist/validation-orchestrator.js +145 -0
package/dist/validation-prompts.d.ts +15 -0
package/dist/validation-prompts.js +52 -0
package/dist/validation-report.d.ts +57 -0
package/dist/validation-report.js +129 -0
package/dist/validation-tools.d.ts +7 -0
package/dist/validation-tools.js +198 -0
package/package.json +25 -10
package/setup/status.schema.json +271 -0

package/CHANGELOG.md CHANGED

@@ -2,7 +2,141 @@
 All notable changes to the llm-cli-gateway project.
-## Unreleased
+## [1.5.13] - 2026-05-24
+### Fixed
+- Report missing provider CLI launches as a clear command-not-found error instead of leaking Windows/libuv codes such as `-4058`.
+- Preserve async provider launch errors in job stderr/result output so sync MCP tools can return actionable setup guidance.
+- Replace `irm | iex` Windows install guidance and generated release manifest commands with direct binary download plus SHA256 verification.
+## [1.5.12] - 2026-05-24
+### Fixed
+- Stop detaching provider CLI processes on Windows so `ask_model` and async requests do not flash visible cmd/conhost windows.
+- Use hidden Windows process creation for the bootstrapper's managed Node gateway process and status checks.
+- Keep Windows process cleanup by killing provider process trees with hidden `taskkill.exe` instead of Unix process-group signals.
+## [1.5.11] - 2026-05-24
+### Fixed
+- Install a stable Windows `llm-cli-gateway.exe` command alongside the versioned bootstrapper and add the install directory to the user PATH.
+- Make the Windows one-command installer stop any running gateway before replacing the managed bundle, then start and doctor through the stable command.
+- Fix bootstrapper `status` and `stop` behavior on Windows so they do not depend on Unix-style PID probing.
+## [1.5.10] - 2026-05-24
+### Fixed
+- Hide Windows console windows when the gateway spawns provider CLIs for synchronous and asynchronous requests.
+## [1.5.9] - 2026-05-24
+### Fixed
+- Fix the Node entrypoint direct-run guard on Windows by using `pathToFileURL(realpathSync(...))` instead of constructing a POSIX-style file URL manually.
+- Make the Windows one-command installer stop when bootstrapper commands fail by checking native process exit codes.
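The 1.5.9 direct-run fix can be illustrated with a short sketch. `isDirectRun` is a hypothetical helper name, not taken from the package; only `pathToFileURL` and `realpathSync` come from the changelog entry. A hand-built URL such as `"file://" + scriptPath` produces `file://C:\...` on Windows, which never equals the `file:///C:/...` form that `import.meta.url` uses, so a guard built that way would never match.

```typescript
import { realpathSync } from "node:fs";
import { pathToFileURL } from "node:url";

// Hypothetical helper: true when the module at `moduleUrl` (typically
// import.meta.url) is the script Node was started with (process.argv[1]).
function isDirectRun(moduleUrl: string, scriptPath: string | undefined): boolean {
  if (!scriptPath) return false;
  try {
    // realpathSync resolves symlinks (for example npm bin shims);
    // pathToFileURL handles drive letters, backslashes, and
    // percent-encoding consistently on every platform.
    return moduleUrl === pathToFileURL(realpathSync(scriptPath)).href;
  } catch {
    // The path does not exist or cannot be resolved.
    return false;
  }
}
```

A typical call site would be `if (isDirectRun(import.meta.url, process.argv[1])) { /* start the server */ }`, assuming an ES module entry point.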
+## [1.5.8] - 2026-05-24
+### Fixed
+- Make `start` wait for the local HTTP health endpoint before reporting success.
+- Write gateway stdout/stderr to local log files so startup failures are diagnosable instead of returning a misleading PID.
+## [1.5.7] - 2026-05-24
+### Fixed
+- Add a release-pinned `install-windows.ps1` asset so Windows users can install with one PowerShell command while still verifying the downloaded bootstrapper and platform bundle against `SHA256SUMS`.
+- Add the Windows one-liner to `release-manifest.json` and upload the installer script as part of the desktop release workflow.
+## [1.5.6] - 2026-05-24
+### Fixed
+- Replace the host-Node installer path with platform-specific verified bundles that include the compiled gateway, production dependencies, setup assets, and a managed Node runtime.
+- Make the bootstrapper start the managed runtime from the installed bundle and require `RVWR_ALLOW_HOST_NODE=1` for the developer host-Node fallback.
+- Update release packaging metadata and docs so Windows/macOS/Linux install instructions use `llm-cli-gateway-bundle-<version>-<os>-<arch>.tar.gz`.
+- Update production dependencies (`@modelcontextprotocol/sdk`, `better-sqlite3`, and transitive Hono/AJV packages) so `npm audit --omit=dev` reports zero vulnerabilities while pinning `type-is` and `content-type` away from Socket-flagged latest releases.
+## [1.5.5] - 2026-05-24
+### Fixed
+- Build desktop installer binaries on local self-hosted Linux, Windows, and macOS runners, then publish combined release metadata from the Linux packaging job.
+- Make `installer/build-release.sh` default to the host target for local runs, with `--all-targets` / `RVWR_RELEASE_ALL_TARGETS=1` reserved for local full-matrix testing.
+- Package setup UI/provider assets into the verified gateway bundle and let the setup UI resolve installed bundle assets from the managed gateway directory.
+## [1.5.4] - 2026-05-19
+### Fixed
+- Disable the default shared SQLite flight recorder during Vitest runs so parallel test workers do not race on `~/.llm-cli-gateway/logs.db` in GitHub Actions.
+- Keep the npm publish job under the public mirror's hosted-runner limit by installing without lifecycle scripts/audit, building once, verifying package contents, and leaving the full suite to CI.
+## [1.5.3] - 2026-05-19
+### Fixed
+- Align npm and PyPI release versions at 1.5.3.
+- Publish npm from the build already verified by CI instead of re-running `prepublishOnly` inside `npm publish`, which was causing the release publish step to be cancelled.
+- Add a PyPI tag/version guard so future release jobs fail before upload when `integrations/llm-plugin/pyproject.toml` does not match the release tag.
+## [1.5.2] - 2026-05-19
+### Fixed
+- **CI publish workflows fixed.** Both v1.5.0 and v1.5.1 npm + PyPI publish workflows failed; this release unblocks them:
+  - **`src/__tests__/session-manager.test.ts:437` — "should update lastUsedAt but not createdAt" was a broken test.** It used `setTimeout(...)` without awaiting it: the inner assertions never ran, AND the timer fired after `afterEach` removed the tmpdir, causing `FileSessionManager.updateSessionUsage` → `saveStorage` → `writeFileSync` to throw an unhandled `ENOENT`. Local vitest happened to exit 0 anyway; CI vitest correctly exits 1 on unhandled errors, so `npm test` failed every publish job. The test now `await`s the timer and snapshots `originalLastUsed` as a string (the original code compared against `session.lastUsedAt`, which is a live reference into the storage map and mutates when `updateSessionUsage` runs).
+  - **`.github/workflows/publish.yml` (PyPI) missing `contents: read`.** Declaring `permissions: { id-token: write }` shrinks `GITHUB_TOKEN` to only that scope, so `actions/checkout@v4` couldn't authenticate to fetch the release tag and failed with `fatal: could not read Username for 'https://github.com': terminal prompts disabled`. Permission now explicitly includes `contents: read`.
+No package-code changes vs 1.5.0 (functional surface) or 1.5.1 (installer workflow). This patch is the test + workflow correctness fix that lets the npm + PyPI artifacts actually publish.
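The live-reference half of the 1.5.2 test fix is easy to reproduce in isolation. The sketch below is an assumption-laden stand-in, not the package's test: `storage`, the timestamps, and this `updateSessionUsage` are hypothetical, and the un-awaited timer half of the bug is not shown.

```typescript
interface Session {
  id: string;
  createdAt: string;
  lastUsedAt: string;
}

// Hypothetical stand-in for the file-backed session manager's storage map.
const storage = new Map<string, Session>();
storage.set("s1", {
  id: "s1",
  createdAt: "2026-05-19T00:00:00.000Z",
  lastUsedAt: "2026-05-19T00:00:00.000Z",
});

// Mutates the stored object in place, as the changelog entry describes.
function updateSessionUsage(id: string, now: string): void {
  const stored = storage.get(id);
  if (stored) stored.lastUsedAt = now;
}

const session = storage.get("s1")!;           // live reference into the map
const originalLastUsed = session.lastUsedAt;  // string snapshot, copied by value

updateSessionUsage("s1", "2026-05-19T00:00:01.000Z");

// Comparing against the live reference can never observe the change,
// because both sides read the same mutated object.
const liveComparisonSeesChange = session.lastUsedAt !== storage.get("s1")!.lastUsedAt;

// The snapshot taken before the update does observe it.
const snapshotSeesChange = originalLastUsed !== storage.get("s1")!.lastUsedAt;
```

Here `liveComparisonSeesChange` is `false` and `snapshotSeesChange` is `true`, which is why the repaired test compares against a snapshot.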
+## [1.5.1] - 2026-05-19
+### Changed
+- **Desktop installer artifacts now built and uploaded automatically on release.** New `.github/workflows/release-installer.yml` triggers on `release: published`, cross-compiles the Go bootstrapper for 5 OS/arch targets (`darwin/{arm64,amd64}`, `linux/{amd64,arm64}`, `windows/amd64`), packages the Node gateway bundle (`llm-cli-gateway-bundle-<ver>.tar.gz`), generates `SHA256SUMS` + `release-manifest.json` with the repo-relative `RVWR_RELEASE_PUBLIC_BASE`, verifies checksums, and uploads everything as release assets via `gh release upload --clobber`. `workflow_dispatch` is supported so a missed run can be rebuilt for an existing tag. No package-code changes vs 1.5.0; this is purely the build/distribution pipeline that lets users install the desktop integration without git/npm/docker.
+## [1.5.0] - 2026-05-19
+Lands DAG layers 6-12 — the personal-MCP MVP terminal plus all of Phase 0-3 provider modernisation. Codex round-2 unconditional SHIP across U22-U27 (correlation `517700e1`). 523 tests passing (+184 from 1.4.0).
+### Added
+- **U19 / U20 — Early LLM-assisted setup validation + automated MVP test harness.** New `doctor.ts`, `http-transport.ts`, `validation-orchestrator.ts`, `validation-report.ts`, `validation-normalizer.ts`, `validation-prompts.ts`, `validation-tools.ts`, `endpoint-exposure.ts`, `auth.ts`, `provider-status.ts`, `provider-login-guidance.ts`, and `gateway-server.ts`. Prompt-pack tightenings driven by real LLM dogfooding (Gemini chat-only + Codex command-capable). 35 new tests across the four matching `__tests__/` files.
+- **U13 / U16 — Release packaging + dogfood readiness.** `installer/build-release.sh` cross-compiles 5 OS/arch targets (linux/{amd64,arm64}, darwin/{amd64,arm64}, windows/amd64) + Node bundle + `SHA256SUMS` + `release-manifest.json`. New `cli_upgrade --uninstall` (idempotent, dry-run by default) and `cli_upgrade --check`. New `Dockerfile.personal` + `docker-compose.personal.yml` for the personal-MCP container path. New `installer/packaging/README.md`. New `package.json` scripts `release:build`, `release:checksums`, `release:docker`. Comprehensive `docs/personal-mcp/{DOGFOODING_RESULTS,RELEASE_READINESS,SINGLE_BINARY_INSTALLER,ENDPOINT_EXPOSURE,PRODUCT_CONTRACT,PROVIDER_SUPPORT_MATRIX,VALIDATION_REPORT_FORMAT}.md` + per-provider `connect-*.md` guides + `setup/assistants/*-install-prompt.md` install-prompt corpus.
+- **U21 — Phase-0 parity fixes.** `SESSION_PROVIDER_VALUES` / `SESSION_PROVIDER_ENUM` now expose the full provider set (grok was previously absent from `session_create`/`session_list`/`session_clear_all` Zod enums despite the storage layer supporting it). `prepareGeminiRequest` emits `["-p", prompt, ...]` instead of a positional prompt, eliminating the dependency on Gemini's TTY/mode-detection heuristics. 6 new tests pin both fixes.
+- **U22 — Mistral Vibe is the fifth supported provider.** New `mistral_request` and `mistral_request_async` MCP tools register alongside the four incumbents and route through the same async job manager, dedup store, flight recorder, approval manager, and validation orchestrator. Five Vibe-specific divergences are documented in `docs/personal-mcp/PROVIDER_MODERNISATION_AUDIT.md`:
+  - **No `--model` flag** — model selection is via the `VIBE_ACTIVE_MODEL` environment variable (default alias: `devstral-medium`); the executor and async job manager forward an `env` override.
+  - **Session-logging is opt-in** in `~/.vibe/config.toml` — `doctor --json` probes `[session_logging] enabled = true` (read-only) and surfaces an actionable `next_actions` entry when the toggle is missing.
+  - **`--agent` enum** replaces Grok's `--always-approve` (`default | plan | accept-edits | auto-approve | chat | explore | lean`); the gateway always emits `--agent` explicitly and defaults to `auto-approve` for programmatic callers.
+  - **`--enabled-tools` allow-list only** — `allowedTools` emits one `--enabled-tools <tool>` per entry; `disallowedTools` is accepted in the schema for caller parity but silently ignored at the CLI boundary (a logged warning records the no-op).
+  - **No self-update** — `cli_upgrade --cli mistral` detects pip / uv / brew via probes and dispatches to `pip install -U vibe-cli`, `uv tool upgrade vibe-cli`, or `brew upgrade mistral-vibe`. Unknown installations return an actionable error rather than running a non-existent `vibe update`.
+  Other surfaces extended: `SESSION_PROVIDER_VALUES` now includes `"mistral"`; `list_models`, `cli_versions`, `cli_upgrade`, `approval_list`, `session_create`, `session_list`, and `session_clear_all` accept the fifth provider; new MCP resources `sessions://mistral` and `models://mistral` are registered; `validate_with_models` / `consensus_check` / `red_team_review` can route to Mistral.
+- **U23 — JSON output + token/cost parity across providers.** New `src/codex-json-parser.ts` parses the Codex `--json` JSONL event stream (`thread.started`, `turn.started`/`completed`/`failed`, `item.*`, `error`); lenient against partial streams and garbage preamble. New `src/gemini-json-parser.ts` parses `gemini -o json` output and maps `usageMetadata.{promptTokenCount, candidatesTokenCount, cachedContentTokenCount}`. `extractUsageAndCost` is now a thin per-provider dispatcher returning `{inputTokens, outputTokens, cacheReadTokens?, cacheCreationTokens?, costUsd?}` for every provider that supports JSON; Claude `cache_read_input_tokens` / `cache_creation_input_tokens` are now plumbed through instead of being discarded. `codex_request`, `codex_request_async`, `gemini_request`, and `gemini_request_async` now expose `outputFormat: enum("text","json")` — set to `"json"` and the gateway emits `--json` (Codex) or `-o json` (Gemini) and forwards parsed usage/cost into the flight recorder. Flight-recorder schema gains `cache_read_tokens` and `cache_creation_tokens` columns via idempotent migration (`PRAGMA table_info` → `ALTER TABLE ADD COLUMN`); existing `logs.db` files are upgraded in place. 15 new tests.
+- **U24 — Permission/approval-mode parity across providers.** Claude `permissionMode` enum (`default | acceptEdits | plan | auto | dontAsk | bypassPermissions`) replaces the boolean `dangerouslySkipPermissions` (the boolean still works and now maps to `permissionMode: "bypassPermissions"`; setting both logs a warning, `permissionMode` wins). Gemini `approvalMode` gains `plan`. Codex splits `--full-auto` into `sandboxMode: enum("read-only","workspace-write","danger-full-access")` and `askForApproval: enum("untrusted","on-request","never")`, emitting `--sandbox <mode>` and `--ask-for-approval <mode>` independently; legacy `fullAuto: true` still works and expands to `--sandbox workspace-write --ask-for-approval never` by default, with `useLegacyFullAutoFlag: true` as an explicit escape hatch to emit `--full-auto` directly. Codex resume mode filters all three flags (`--full-auto`, `--sandbox`, `--ask-for-approval`) since `codex exec resume` inherits the session's policy. 26 new tests.
+- **U25 — Claude high-impact features.** `claude_request` / `claude_request_async` schemas gain `agent?: string` (single sub-agent dispatch), `agents?: Record<string, object>` (multi-agent JSON, validated against `CLAUDE_AGENT_DEFINITION_SCHEMA` before emit), `forkSession?: boolean`, `systemPrompt?: string`, `appendSystemPrompt?: string` (mutually exclusive at the schema + tool-callback boundary), `maxBudgetUsd?: number`, `maxTurns?: number`, `effort?: enum("low","medium","high","xhigh","max")`, and `excludeDynamicSystemPromptSections?: boolean`. Each emits the documented `--<flag>` form. 25 new tests in `src/__tests__/claude-handler.test.ts`.
+- **U26 — Codex high-impact features.** `codex_request` / `codex_request_async` gain `outputSchema?: string | object` (object form is materialised to an `0o600` temp file under `os.tmpdir()` and cleaned via the AsyncJobManager `onComplete` contract — see post-review fixes below), `search?: boolean`, `profile?: string`, `configOverrides?: Record<string,string>` (keys validated against `/^[a-zA-Z0-9._]+$/`, values reject `\r`/`\n` via Zod refinement; emitted as repeated `-c key=value`), `ephemeral?: boolean`, `images?: string[]` (each path existence-validated; missing paths fail fast), `ignoreUserConfig?: boolean`, `ignoreRules?: boolean`. New top-level tool `codex_fork_session` wraps `codex fork <UUID> <prompt>` and `codex fork --last <prompt>` (sessionId XOR forkLast via Zod refinement). Codex default model alias is now `gpt-5.5` (the prior `gpt-5.3-codex` alias still resolves). Codex resume filter list extended with `--add-dir`, `-C`, `--output-schema`, and `--search`. 28 new tests across `codex-handler.test.ts` and `codex-fork.test.ts`.
+- **U27 — Gemini high-impact features.** `gemini_request` / `gemini_request_async` gain `sandbox?: boolean` (emits `-s`), `policyFiles?: string[]` and `adminPolicyFiles?: string[]` (each path existence-validated; missing paths fail fast), and `attachments?: string[]` (absolute paths only, validated and prepended to the prompt as `@<abs-path>` tokens before the `-p` pair — U21 ordering invariant preserved). For fresh sessions (`createNewSession: true` or no sessionId), the gateway now emits `--session-id <uuid-v4>` instead of `--resume`, mapping the gateway session 1:1 to Gemini's authoritative store; `gw-*` prefixed IDs are rejected via strict UUID-v4 regex. `doctor --json` probes `./GEMINI.md`, `~/.gemini/GEMINI.md`, and `~/.gemini/settings.json` (parses `mcpServers` and reconciles against the gateway's `--allowed-mcp-server-names` whitelist; surfaces `next_actions` for missing registrations). `provider-status.ts` `geminiAuthStatus()` recognises four auth methods: OAuth file, `GEMINI_API_KEY`, `GOOGLE_API_KEY`, and `GOOGLE_CLOUD_PROJECT` + `GOOGLE_GENAI_USE_VERTEXAI=true`. 41 new tests across `gemini-handler.test.ts`, `provider-status.test.ts`, and the extended `doctor.test.ts`.
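The U24 Codex flag split described above can be sketched as a small argument builder. This is a minimal illustration, not the gateway's `prepareCodexRequest`: `codexPolicyArgs` and the `resume` field are hypothetical names, and the rule that explicit `sandboxMode` / `askForApproval` values override the `fullAuto` expansion is an assumption the changelog does not state.

```typescript
type SandboxMode = "read-only" | "workspace-write" | "danger-full-access";
type AskForApproval = "untrusted" | "on-request" | "never";

interface CodexPolicyOptions {
  sandboxMode?: SandboxMode;
  askForApproval?: AskForApproval;
  fullAuto?: boolean;
  useLegacyFullAutoFlag?: boolean;
  resume?: boolean; // hypothetical marker for `codex exec resume`
}

function codexPolicyArgs(o: CodexPolicyOptions): string[] {
  // Resume inherits the session's policy, so none of the three flags is emitted.
  if (o.resume) return [];

  // Escape hatch: pass the legacy flag through untouched.
  if (o.fullAuto && o.useLegacyFullAutoFlag) return ["--full-auto"];

  // Assumption: explicit values take precedence over the fullAuto expansion.
  const sandbox = o.sandboxMode ?? (o.fullAuto ? "workspace-write" : undefined);
  const approval = o.askForApproval ?? (o.fullAuto ? "never" : undefined);

  const args: string[] = [];
  if (sandbox) args.push("--sandbox", sandbox);
  if (approval) args.push("--ask-for-approval", approval);
  return args;
}

// codexPolicyArgs({ fullAuto: true })
//   -> ["--sandbox", "workspace-write", "--ask-for-approval", "never"]
// codexPolicyArgs({ fullAuto: true, useLegacyFullAutoFlag: true })
//   -> ["--full-auto"]
```

Keeping the two flags independent lets a caller tighten the sandbox without changing the approval policy, which a single `--full-auto` switch cannot express.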
+### Fixed
+Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 unconditional SHIP. Locked in by `src/__tests__/post-review-fixes.test.ts` (14 tests, no mocks).
+- **U22 dedup key now reflects env vars.** `AsyncJobManager.buildRequestKey(cli, args, env)` hashes a `canonicaliseEnvForKey(env)` payload (sorted-keys JSON) via the existing `computeRequestKey(cli, args, extra)` API. Two Mistral requests with the same argv but different `VIBE_ACTIVE_MODEL` no longer collide on dedup. Empty/undefined env collapses to `""` so pre-U22 callers retain the same key shape and previously-stored entries remain hit-able.
+- **U23 JSON parsers are now reachable.** The newly-added Codex JSONL parser and Gemini JSON parser were dead code because `codex_request` / `gemini_request` exposed no `outputFormat` parameter and the gateway never emitted `--json` / `-o json`. Both tool schemas (sync + async) now expose `outputFormat: enum("text","json")`. `prepareCodexRequest` emits `--json`; `prepareGeminiRequest` emits a contiguous `-o json` pair after the U21 `-p` prompt pair. The success paths for `codex_request` and `gemini_request` now run `extractUsageAndCost(cli, stdout, outputFormat)` and forward `inputTokens`, `outputTokens`, `cacheReadTokens`, `cacheCreationTokens`, and `costUsd` into the flight recorder.
+- **U26 `outputSchema` temp-file lifecycle now correct on every exit path.** `AsyncJobRecord` gains `onComplete?: () => void` and an `onCompleteFired?: boolean` guard. `fireOnComplete(job)` is wired into every site that calls `persistComplete(job)` (8 total: close handler, cancel, idle-timeout, output overflow, dead-process recovery, exited-flag mismatch, process-monitor expiry, persistence-recovery). The dedup path also fires the new request's `onComplete` immediately so a deduped request never leaves its own materialised temp file orphaned. `awaitJobOrDefer` now takes `onComplete` as a trailing arg and guarantees exactly-once consumption across direct-execution, deferred, and `startJobWithDedup`-throws branches. The sync `codex_request` `finally` block no longer runs cleanup (which would have deleted the temp file while the deferred CLI process was still reading it); the async `codex_request_async` no longer leaks the temp file on successful start.
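The U22 dedup-key fix above can be sketched as follows. The function names match the changelog, but the bodies are assumptions: the package's actual hash algorithm and payload layout are not shown in this diff, so SHA-256 over a JSON array is used purely for illustration.

```typescript
import { createHash } from "node:crypto";

type Env = Record<string, string>;

// Sorted-keys JSON, so key order never changes the dedup key.
// Empty or undefined env collapses to "" (per the changelog entry).
function canonicaliseEnvForKey(env?: Env): string {
  if (!env || Object.keys(env).length === 0) return "";
  const sortedEntries = Object.entries(env).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0
  );
  return JSON.stringify(sortedEntries);
}

// Assumed shape of the pre-existing API: hash of cli + argv + an extra string.
function computeRequestKey(cli: string, args: string[], extra = ""): string {
  return createHash("sha256")
    .update(JSON.stringify([cli, args, extra]))
    .digest("hex");
}

function buildRequestKey(cli: string, args: string[], env?: Env): string {
  return computeRequestKey(cli, args, canonicaliseEnvForKey(env));
}
```

With this shape, two requests that share argv but differ in `VIBE_ACTIVE_MODEL` produce different keys, while a call with no env produces the same key as `computeRequestKey(cli, args, "")`, which is the backward-compatibility property the entry describes.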
+### Changed
+- Codex default model alias is now `gpt-5.5` (legacy `gpt-5.3-codex` alias preserved).
+- Default `model-registry` fallback chain order updated for new aliases.
+- Skills (`.agents/skills/*` and `skills/*`) extended from four-provider to five-provider lists, with Mistral notes on auto-approve default and session-logging requirement.
 ## [1.4.0] - 2026-05-16

package/README.md CHANGED

@@ -5,6 +5,77 @@
 A Model Context Protocol (MCP) server providing unified access to Claude Code, Codex, Gemini, and Grok CLIs with session management, retry logic, and async job orchestration.
+## Personal MCP Appliance MVP
+`llm-cli-gateway` is being packaged as a single-user personal MCP appliance for cross-LLM validation. The intended workflow is: connect one MCP endpoint, then ask any client for cross-LLM validation.
+The product contract is documented in [docs/personal-mcp/PRODUCT_CONTRACT.md](docs/personal-mcp/PRODUCT_CONTRACT.md). It defines the single-user scope, security posture, target support matrix, and provider-support verification gates. Public setup guides must not claim ChatGPT, Claude web, Claude Desktop, Codex, Gemini CLI, Gemini web, or Grok inbound support until the corresponding provider/client path has been verified.
+This project does not provide hosted multi-tenant credential custody. Provider credentials stay on the user's machine or user-owned deployment volume.
+MVP release readiness is tracked in [docs/personal-mcp/RELEASE_READINESS.md](docs/personal-mcp/RELEASE_READINESS.md). Dogfooding evidence (which target LLMs guided setup, what unsafe suggestions were captured, which findings are deferred to post-MVP work) is in [docs/personal-mcp/DOGFOODING_RESULTS.md](docs/personal-mcp/DOGFOODING_RESULTS.md).
+Current personal-appliance artifacts include:
+- Streamable HTTP startup: `LLM_GATEWAY_AUTH_TOKEN=<token> npm run start:http`
+- Machine-readable diagnostics: `npm run doctor`
+- Go bootstrapper scaffold: `installer/` with `setup`, `doctor --json`, `start`, `stop`, `status`, `repair`, `upgrade`, `uninstall`, `print-client-config`, and verified bundle download commands.
+- Release packaging: the release workflow builds Linux binaries on the local self-hosted runner, builds Windows/macOS binaries on GitHub-hosted runners, then publishes checksummed platform bundles with the gateway, production dependencies, and a managed Node runtime; see [installer/packaging/README.md](installer/packaging/README.md).
+- Docker Compose fallback: [docker-compose.personal.yml](docker-compose.personal.yml) + [Dockerfile.personal](Dockerfile.personal) for users who already manage containers.
+- Local setup UI artifact: [setup/ui/index.html](setup/ui/index.html)
+- Provider setup snippets: [setup/providers/](setup/providers/)
+- Cross-validation tools: `validate_with_models`, `second_opinion`, `compare_answers`, `red_team_review`, `consensus_check`, `ask_model`, `synthesize_validation`, `job_status`, and `job_result`.
+### Install / Upgrade / Uninstall (single binary)
+Windows PowerShell:
+```powershell
+$Version = '<version>'
+$Base = "https://github.com/verivus-oss/llm-cli-gateway/releases/download/v$Version"
+$InstallDir = Join-Path (Join-Path $env:LOCALAPPDATA 'Programs') 'llm-cli-gateway'
+$Exe = Join-Path $InstallDir 'llm-cli-gateway.exe'
+New-Item -ItemType Directory -Force $InstallDir | Out-Null
+Invoke-WebRequest -UseBasicParsing "$Base/llm-cli-gateway-$Version-windows-amd64.exe" -OutFile $Exe
+$env:RVWR_GATEWAY_BUNDLE_URL = "$Base/llm-cli-gateway-bundle-$Version-windows-amd64.tar.gz"
+$env:RVWR_GATEWAY_BUNDLE_SHA256 = '<bundle-sha256-from-SHA256SUMS>'
+& $Exe setup
+& $Exe stop
+& $Exe install-bundle
+& $Exe start
+& $Exe status
+& $Exe doctor
+```
+The Windows installer keeps a stable `llm-cli-gateway.exe` command in
+`%LOCALAPPDATA%\Programs\llm-cli-gateway` and adds that directory to the user
+PATH. Do not script against release-versioned exe names after install.
+```bash
+# After downloading the binary that matches your OS/arch from a release:
+sha256sum --check SHA256SUMS            # verify before run (or `shasum -a 256 --check` on macOS)
+chmod +x llm-cli-gateway-<ver>-<os>-<arch>
+./llm-cli-gateway-<ver>-<os>-<arch> setup
+./llm-cli-gateway-<ver>-<os>-<arch> install-bundle    # uses the platform bundle URL/SHA256
+./llm-cli-gateway-<ver>-<os>-<arch> start
+./llm-cli-gateway-<ver>-<os>-<arch> doctor
+# Upgrade: replace the binary, set the new bundle env vars, run upgrade.
+./llm-cli-gateway-<new>-<os>-<arch> upgrade
+# Uninstall: dry-run first, then run with --yes.
+./llm-cli-gateway-<ver>-<os>-<arch> uninstall
+./llm-cli-gateway-<ver>-<os>-<arch> uninstall --yes
+```
+Docker fallback:
+```bash
+LLM_GATEWAY_AUTH_TOKEN=$(openssl rand -hex 32) \
+  docker compose -f docker-compose.personal.yml up -d
+docker compose -f docker-compose.personal.yml run --rm doctor
+```
 ## Features
 ### Core Capabilities
@@ -63,6 +134,36 @@ grok login   # OAuth flow, or set GROK_CODE_XAI_API_KEY
 # Docs: https://docs.x.ai/build/cli
 ```
+### Mistral Vibe CLI
+```bash
+# Pick one — the gateway's cli_upgrade auto-detects which one you used.
+pip install vibe-cli
+uv tool install vibe-cli
+brew install mistral-vibe
+vibe auth login
+# Required for `mistral_request --resume` / `--continue` to persist sessions:
+vibe config set session_logging.enabled true   # or edit ~/.vibe/config.toml
+```
+Vibe-specific notes:
+- **Model selection is via the `VIBE_ACTIVE_MODEL` environment variable** —
+  Vibe has no `--model` flag. The gateway resolves the requested model alias
+  (default: `devstral-medium`) and injects it as `VIBE_ACTIVE_MODEL` when
+  spawning `vibe`.
+- **`permissionMode` accepts** `default | plan | accept-edits | auto-approve |
+  chat | explore | lean` and emits `--agent <mode>`. The gateway's
+  programmatic-mode default is `auto-approve`; pick a stricter mode
+  explicitly if you need approval gates.
+- **`allowedTools` is allow-list only** — the gateway emits one
+  `--enabled-tools <tool>` flag per entry. `disallowedTools` is accepted in
+  the schema for caller-side parity but is silently ignored at the CLI
+  boundary (a `logger.info` warning records the no-op).
+- **No self-update**: `cli_upgrade --cli mistral` detects whether you used
+  pip / uv / brew and dispatches the matching upgrade command. Vibe has no
+  `vibe update` command.
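The Vibe-specific notes above translate into a spawn plan roughly like the sketch below. `planVibeSpawn`, `MistralRequest`, and `SpawnPlan` are hypothetical names; the model value is passed through verbatim here, whereas the gateway resolves aliases first, and how the prompt itself reaches `vibe` is not covered by this README excerpt, so it is left out.

```typescript
type VibeAgentMode =
  | "default" | "plan" | "accept-edits" | "auto-approve"
  | "chat" | "explore" | "lean";

interface MistralRequest {
  model?: string;
  permissionMode?: VibeAgentMode;
  allowedTools?: string[];
  disallowedTools?: string[];
}

interface SpawnPlan {
  args: string[];
  env: Record<string, string>;
  warnings: string[];
}

function planVibeSpawn(req: MistralRequest): SpawnPlan {
  // --agent is always emitted; programmatic callers default to auto-approve.
  const args: string[] = ["--agent", req.permissionMode ?? "auto-approve"];

  // Allow-list only: one --enabled-tools flag per entry.
  for (const tool of req.allowedTools ?? []) {
    args.push("--enabled-tools", tool);
  }

  // disallowedTools has no CLI equivalent, so it is recorded and dropped.
  const warnings: string[] = [];
  if (req.disallowedTools?.length) {
    warnings.push("disallowedTools is ignored for Vibe (allow-list only)");
  }

  // Vibe has no --model flag; the model travels in the environment.
  const env = { VIBE_ACTIVE_MODEL: req.model ?? "devstral-medium" };
  return { args, env, warnings };
}
```

Because the model lives in `env` rather than in `args`, anything that keys on argv alone will treat requests for different models as identical.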
 ## Installation
 ### As an MCP server (npm)
@@ -94,7 +195,7 @@ npm run build
 ### As an MCP Server
-Add to your MCP client configuration (e.g., Claude Desktop):
+For clients that already support local stdio MCP servers, add a configuration like:
 ```json
 {
@@ -107,8 +208,24 @@ Add to your MCP client configuration (e.g., Claude Desktop):
 }
 ```
+This generic stdio example is not provider-support verification for the Personal MCP Appliance MVP. Client-specific setup guides for ChatGPT, Claude web, Claude Desktop, Codex, Gemini CLI, Gemini web, and Grok remain gated by the provider-support matrix in [docs/personal-mcp/PRODUCT_CONTRACT.md](docs/personal-mcp/PRODUCT_CONTRACT.md).
 ### Available Tools
+#### Cross-LLM Validation Tools
+The personal-appliance surface exposes simplified validation tools for non-developer clients. These tools start provider CLI jobs through the durable async job manager and return normalized provider status plus raw job references.
+- `validate_with_models`: ask two or more providers to independently validate a question.
+- `second_opinion`: ask one provider to review an answer.
+- `red_team_review`: challenge a plan, answer, or document for risks and failure modes.
+- `consensus_check`: check whether providers agree with a claim.
+- `ask_model`: ask one provider through the simplified surface.
+- `synthesize_validation`: run an explicit judge model after provider results have been collected.
+- `job_status` and `job_result`: poll and collect validation job outputs.
+The validation report preserves per-provider disagreement. Optional judge synthesis is explicit about which provider produced the judge job.
 #### LLM Request Tools
 ##### `claude_request`
@@ -246,20 +363,245 @@ Execute a Grok CLI (xAI) request with session support.
 #### Durable job results & automatic dedup
-Every async job is persisted to a `jobs` table in `~/.llm-cli-gateway/logs.db` as it transitions through running → completed/failed/canceled. This makes the gateway a durable collection layer:
+Every async job is persisted to a job store as it transitions through running → completed/failed/canceled. This makes the gateway a durable collection layer:
 - **Re-issuing a request is safe.** Identical `*_request` / `*_request_async` calls within the dedup window (default 1 hour) short-circuit onto the existing running or completed job — the caller gets back the same job ID instead of starting a duplicate run. This directly fixes the "agent times out polling, re-issues, and the whole job starts over" failure mode.
 - **`llm_job_status` and `llm_job_result` work across gateway restarts.** Job rows live for 30 days by default; callers can fetch results long after the in-memory cache has evicted them.
 - **Jobs running at shutdown are marked `orphaned`** on the next gateway boot (the detached child can't be reattached to). Their captured partial output remains readable.
 - **Pass `forceRefresh: true`** on any request tool to bypass dedup and force a fresh CLI run.
-Environment variables:
-- `LLM_GATEWAY_JOB_RETENTION_DAYS` — how long completed jobs stay queryable. Default `30`.
-- `LLM_GATEWAY_DEDUP_WINDOW_MS` — how recent an existing job must be to dedup against. Default `3600000` (1 hour). Set `0` to disable dedup.
-- `LLM_GATEWAY_JOBS_DB` — override the sqlite path. Defaults to the value of `LLM_GATEWAY_LOGS_DB`, then `~/.llm-cli-gateway/logs.db`. Set to `none` to disable durability entirely (in-memory only).
+##### Persistence configuration
+The job-store backend is configured by `~/.llm-cli-gateway/config.toml` (override with `LLM_GATEWAY_CONFIG=/path/to/config.toml`). Example:
+```toml
+[persistence]
+backend = "sqlite"                          # "sqlite" | "memory" | "postgres" | "none"
+path = "~/.llm-cli-gateway/logs.db"         # for sqlite
+# dsn = "postgresql://user:pw@host/db"      # for postgres (interface only — impl not yet shipped)
+retentionDays = 30
+dedupWindowMs = 3600000
+acknowledgeEphemeral = false                # required to enable async tools with memory backend
+```
+Backends:
+- **`sqlite`** (default) — durable, file-backed. Safe for single-instance deployments.
+- **`memory`** — in-process Map. Lost on gateway exit. Requires `acknowledgeEphemeral = true` to be loaded. Suitable for tests and ephemeral CI gateways.
+- **`postgres`** — interface only, implementation not yet shipped. Selecting this backend throws at startup.
+- **`none`** — no store. **`*_request_async`, `llm_job_status`, `llm_job_result`, and `llm_job_cancel` are NOT registered on the gateway.** This is a structural invariant: agents that try to call async tools against a gateway with `backend = "none"` get a clean "tool not found" at connect time instead of silent in-memory loss after the 1-hour TTL. Use `llm_process_health` to inspect the resolved persistence state programmatically.
+Legacy environment variables (deprecated; emit a warning at startup):
+- `LLM_GATEWAY_LOGS_DB` / `LLM_GATEWAY_JOBS_DB` — `none` selects `backend = "none"`; any other value selects `backend = "sqlite"` with that path.
+- `LLM_GATEWAY_JOB_RETENTION_DAYS` — overrides `retentionDays`.
+- `LLM_GATEWAY_DEDUP_WINDOW_MS` — overrides `dedupWindowMs`.
+- `LLM_GATEWAY_ACKNOWLEDGE_EPHEMERAL` — `1`/`true`/`yes` sets `acknowledgeEphemeral = true`.
+##### Per-project isolation
+By default, **all gateway data is global per user**, not per project. With no overrides, every Claude Code window, in every repo, spawns its own gateway subprocess, but all of them read and write the same files:
+- `~/.llm-cli-gateway/logs.db` (async jobs + flight recorder)
+- `~/.llm-cli-gateway/sessions.json` (CLI sessions)
+- `~/.llm-cli-gateway/config.toml` (resolved config)
+This is usually what you want — `session_list` from repo A shows sessions from repo B, an async job started in window A can be polled from window B, and the 1-hour dedup window catches re-issues across windows. SQLite WAL mode makes concurrent access from multiple gateway subprocesses safe.
+If you instead want **per-project isolation** (e.g. unrelated repos shouldn't share session lists or risk false dedup hits), point each project at its own config file. In `.claude/settings.local.json` for the project:
+```json
+{
+  "mcpServers": {
+    "llm-gateway": {
+      "env": {
+        "LLM_GATEWAY_CONFIG": "${workspaceFolder}/.gateway/config.toml"
+      }
+    }
+  }
+}
+```
+…and put a per-project `config.toml` in the repo:
+```toml
+[persistence]
+backend = "sqlite"
+path = "/srv/repos/.../my-repo/.gateway/logs.db"
+```
+Now every gateway subprocess spawned for *this* repo's Claude Code window reads its own config and writes to its own SQLite file; sessions, jobs, and dedup state are scoped to the repo. Other repos keep using the global default. `llm_process_health.persistence.sources.configFile` lets an agent confirm which config it's actually running under.
+###### Agent-executable spec (DAG-TOML)
+If you want an LLM agent to perform this setup deterministically — rather than reading the prose above and guessing — copy the following DAG-TOML into the repo (e.g. `docs/planning/per-project-gateway-isolation.toml`) and point your agent at it. The schema is [`agent-assurance`](https://github.com/verivus-oss/agent-assurance/blob/main/SPEC.md) `template_kind = "implementation-dag"`. The agent MUST execute units in layer order, must not skip the verification unit, and must treat any failed gate as blocking.
+```toml
+[meta]
+schema_version    = "1.0.0"
+template_kind     = "implementation-dag"
+docs              = "https://github.com/verivus-oss/agent-assurance/blob/main/SPEC.md"
+confidentiality   = "public"
+title             = "Per-project llm-cli-gateway persistence isolation"
+spec              = "https://github.com/verivusai-labs/llm-cli-gateway#per-project-isolation"
+created           = "YYYY-MM-DD"
+total_units       = 5
+tier1_units       = ["U01","U02","U03","U04","U05"]
+tier2_units       = []
+tier3_units       = []
+# ============================================================================
+# [policy.agent] — persona for the agent performing the configuration.
+# ============================================================================
+[policy.agent]
+name                 = "Gateway Persistence Isolator"
+role                 = "Configuration Engineer"
+purpose              = "Configure the llm-cli-gateway MCP server so its async job store, sessions, dedup state, and flight recorder are scoped to THIS repository instead of the per-user default at ~/.llm-cli-gateway/."
+validation_type      = "Structural + Runtime Verification"
+workflow_initiator   = false
+description          = "Writes a repo-local config.toml, registers an LLM_GATEWAY_CONFIG override in .claude/settings.local.json, restarts the MCP server, and confirms via llm_process_health that the gateway is now reading the repo-local config and writing to the repo-local SQLite path."
+[policy.agent.orchestration]
+consumes_events      = ["PerProjectIsolationRequested"]
+produces_events      = ["PerProjectIsolationComplete"]
+[policy.agent.responsibilities]
+items = [
+  "Create the repo-local gateway data directory and add it to .gitignore.",
+  "Write a config.toml that pins backend=sqlite to a repo-local path.",
+  "Register the LLM_GATEWAY_CONFIG env override in .claude/settings.local.json (NOT .mcp.json — that file is committed and shared).",
+  "Trigger an MCP server reconnect.",
+  "Verify via llm_process_health that the resolved configFile and path are the repo-local values.",
+]
+# ============================================================================
+# [policy.instance] — concrete paths the agent fills in for THIS repo.
+# Agent MUST replace <REPO_ABS_PATH> with the absolute path to the repo
+# before emitting any artefact. Relative paths in config.toml MUST be
+# expanded to absolute — the gateway does not re-resolve them per cwd.
+# ============================================================================
+[policy.instance]
+repo_abs_path                  = "<REPO_ABS_PATH>"           # e.g. /srv/repos/me/my-project
+gateway_data_dir_relative      = ".gateway"                  # repo-relative directory
+config_toml_relative           = ".gateway/config.toml"
+sqlite_db_relative             = ".gateway/logs.db"
+claude_local_settings_relative = ".claude/settings.local.json"
+gitignore_relative             = ".gitignore"
+mcp_server_name                = "llm-gateway"               # must match the entry in .mcp.json
+# ============================================================================
+# [policy.gates] — blocking checks. Any failure stops the workflow.
+# ============================================================================
+[policy.gates]
+gate_repo_abs_path_resolved    = "policy.instance.repo_abs_path must NOT be the literal string '<REPO_ABS_PATH>' when U01 starts."
+gate_config_is_committed       = "policy.instance.config_toml_relative MAY be committed. policy.instance.claude_local_settings_relative MUST NOT be committed (it is per-developer). Agent MUST verify that .gitignore covers .claude/settings.local.json and add the entry if it is absent."
+gate_no_legacy_env_leak        = "Agent MUST grep the shell init files for LLM_GATEWAY_LOGS_DB / LLM_GATEWAY_JOBS_DB. If set, the legacy env var will override the new config and the deprecation warning will fire at every gateway boot. The agent reports this as a finding and asks the operator to unset before proceeding."
+gate_health_confirms_isolation = "U05 MUST observe llm_process_health.persistence.sources.configFile == policy.instance.repo_abs_path + '/' + policy.instance.config_toml_relative AND llm_process_health.persistence.path == policy.instance.repo_abs_path + '/' + policy.instance.sqlite_db_relative. Anything else means the override did not take effect."
+# ============================================================================
+# [policy.evidence] — what each unit must emit so the work is auditable.
+# ============================================================================
+[policy.evidence]
+per_unit_required_fields = [
+  "unit_id",                  # U01..U05
+  "status",                   # "completed" | "failed"
+  "artefact_paths",           # files written / modified
+  "stdout_tail",              # last 20 lines of any command output
+  "verification_quote",       # for U05, the verbatim llm_process_health.persistence block
+]
+findings_required_fields = [
+  "gate_id",                  # which gate failed
+  "observed",
+  "expected",
+  "remediation",
+]
+# ============================================================================
+# Units. Execute in layer order. U01..U03 modify the working tree; U04
+# triggers a reconnect; U05 is the verification gate that decides success.
+# ============================================================================
+[units.U01]
+name           = "create-repo-local-data-dir"
+summary        = "mkdir -p <repo>/.gateway and append /.gateway/ to .gitignore (creating .gitignore if missing). The gateway will write logs.db, logs.db-wal, logs.db-shm here — none should be committed."
+layer          = 0
+tier           = 1
+status         = "pending"
+depends_on     = []
+blocks         = ["U02"]
+estimated_loc  = 5
+files_modify   = [".gitignore"]
+produces       = ["ART:gateway-data-dir"]
+consumes       = []
+[units.U02]
+name           = "write-config-toml"
+summary        = "Write <repo>/.gateway/config.toml with [persistence] backend='sqlite' and path=<absolute-path-to-repo>/.gateway/logs.db. Path MUST be absolute. Do NOT use ~: the gateway expands only a leading ~/, any other form of [persistence].path is read literally, and Claude Code may launch the gateway with a HOME that surprises you."
+layer          = 1
+tier           = 1
+status         = "pending"
+depends_on     = ["U01"]
+blocks         = ["U03"]
+estimated_loc  = 10
+files_modify   = [".gateway/config.toml"]
+produces       = ["ART:gateway-config"]
+consumes       = ["ART:gateway-data-dir"]
+[units.U03]
+name           = "register-llm-gateway-config-env-in-claude-local-settings"
+summary        = "Add (or merge) an mcpServers.<mcp_server_name>.env entry in .claude/settings.local.json that sets LLM_GATEWAY_CONFIG to the absolute path of .gateway/config.toml. Do NOT modify .mcp.json — that file is committed and the path would be wrong for every other developer. If .claude/settings.local.json already has an mcpServers.<mcp_server_name> entry, the agent MUST merge into the existing env map (preserving other keys), not overwrite the whole entry."
+layer          = 2
+tier           = 1
+status         = "pending"
+depends_on     = ["U02"]
+blocks         = ["U04"]
+estimated_loc  = 20
+files_modify   = [".claude/settings.local.json"]
+produces       = ["ART:claude-local-settings"]
+consumes       = ["ART:gateway-config"]
+[units.U04]
+name           = "trigger-mcp-reconnect"
+summary        = "Ask the operator to run /mcp in Claude Code (or restart Claude Code) so the gateway subprocess is re-spawned under the new env. The agent cannot do this itself — MCP server lifecycle is owned by the host."
+layer          = 3
+tier           = 1
+status         = "pending"
+depends_on     = ["U03"]
+blocks         = ["U05"]
+estimated_loc  = 0
+files_modify   = []
+produces       = ["OUT:mcp-reconnected"]
+consumes       = ["ART:claude-local-settings"]
+[units.U05]
+name           = "verify-via-llm-process-health"
+summary        = "Call llm_process_health and assert the returned persistence block satisfies policy.gates.gate_health_confirms_isolation. Quote the verbatim persistence block in evidence. If the assertion fails, the agent MUST NOT mark the workflow complete — it must emit a finding under policy.evidence.findings_required_fields, naming the observed vs. expected configFile/path, and stop."
+layer          = 4
+tier           = 1
+status         = "pending"
+depends_on     = ["U04"]
+blocks         = []
+estimated_loc  = 5
+files_modify   = []
+produces       = ["ART:isolation-verification","OUT:per-project-isolation-complete"]
+consumes       = ["OUT:mcp-reconnected"]
+```
+**Why this matters for agents:** the gateway has multiple configuration surfaces (TOML file, env-var overrides, two different MCP settings files) and one easy mistake — editing the committed `.mcp.json` instead of the local-only `.claude/settings.local.json` — will silently break the per-project scope for every other developer on the repo. The DAG above encodes the correct sequence, the verification gate, and the failure modes explicitly so an agent can execute it without inference.
+##### `mistral_request`
+Run a Mistral Vibe agentic coding request. The parameters follow the same shape as `grok_request`, with these Vibe-specific differences:
+- `model` (string, optional): Resolved alias (e.g. `devstral-medium`, `devstral-large`, `latest`). The resolved value is injected via the `VIBE_ACTIVE_MODEL` environment variable — Vibe has no `--model` flag.
+- `permissionMode`: `default | plan | accept-edits | auto-approve | chat | explore | lean` — emitted as `--agent <mode>`. Defaults to `auto-approve` in programmatic mode.
+- `allowedTools` (string[], optional): One `--enabled-tools <tool>` flag per entry (allow-list only).
+- `disallowedTools` (string[], optional): Accepted for parity with the other providers; ignored at the CLI boundary with a logged warning.
+- `sessionId` / `resumeLatest` / `createNewSession`: standard session controls. Continuity requires `[session_logging] enabled = true` in `~/.vibe/config.toml`; `doctor --json` surfaces a next action when the toggle is missing.
-##### `claude_request_async` / `codex_request_async` / `gemini_request_async` / `grok_request_async`
-Start a long-running Claude, Codex, Gemini, or Grok request without waiting for completion in the same MCP call.
+##### `claude_request_async` / `codex_request_async` / `gemini_request_async` / `grok_request_async` / `mistral_request_async`
+Start a long-running Claude, Codex, Gemini, Grok, or Mistral request without waiting for completion in the same MCP call.
 Use this flow when analysis/runtime can exceed client tool-call limits:
 1. Start job with `*_request_async`
@@ -297,7 +639,7 @@ Approval records are persisted to `~/.llm-cli-gateway/approvals.jsonl`.
 Create a new session for a specific CLI.
 **Parameters:**
-- `cli` (string, required): CLI to create session for ("claude", "codex", "gemini", "grok")
+- `cli` (string, required): CLI to create session for ("claude", "codex", "gemini", "grok", "mistral")
 - `description` (string, optional): Description for the session
 - `setAsActive` (boolean, optional): Set as active session, default: true
@@ -314,7 +656,7 @@ Create a new session for a specific CLI.
 List all sessions, optionally filtered by CLI.
 **Parameters:**
-- `cli` (string, optional): Filter by CLI ("claude", "codex", "gemini", "grok")
+- `cli` (string, optional): Filter by CLI ("claude", "codex", "gemini", "grok", "mistral")
 **Response includes:**
 - Total session count
@@ -352,7 +694,7 @@ Clear all sessions, optionally for a specific CLI.
 List available models for each CLI.
 **Parameters:**
-- `cli` (string, optional): Specific CLI to list models for ("claude", "codex", "gemini", "grok")
+- `cli` (string, optional): Specific CLI to list models for ("claude", "codex", "gemini", "grok", "mistral")
 **Response includes:**
 - Model names and descriptions
@@ -394,13 +736,13 @@ LLM_GATEWAY_DISABLE_MODEL_DISCOVERY=1
 Report installed CLI versions.
 **Parameters:**
-- `cli` (string, optional): Specific CLI to inspect ("claude", "codex", "gemini", "grok")
+- `cli` (string, optional): Specific CLI to inspect ("claude", "codex", "gemini", "grok", "mistral")
 ##### `cli_upgrade`
 Plan or run an upgrade for one CLI.
 **Parameters:**
-- `cli` (string, required): CLI to upgrade ("claude", "codex", "gemini", "grok")
+- `cli` (string, required): CLI to upgrade ("claude", "codex", "gemini", "grok", "mistral")
 - `target` (string, optional): Package tag/version/target, default: `latest`
 - `dryRun` (boolean, optional): Return the upgrade plan without running it, default: `true`
 - `timeoutMs` (number, optional): Upgrade timeout when `dryRun=false`
@@ -479,11 +821,12 @@ await callTool("session_delete", {
   ```bash
   LLM_GATEWAY_APPROVAL_POLICY=strict node dist/index.js
   ```
-- `LLM_GATEWAY_LOGS_DB`: Path to SQLite flight recorder database. Default: `~/.llm-cli-gateway/logs.db`. Set to empty string or `none` to disable logging.
+- `LLM_GATEWAY_CONFIG`: Path to the gateway TOML config (default: `~/.llm-cli-gateway/config.toml`). See **Persistence configuration** above for the `[persistence]` schema.
+- `LLM_GATEWAY_LOGS_DB`: **Deprecated** — overrides `[persistence].path` and selects `backend = "sqlite"` (or `backend = "none"` when set to `none`). Emits a deprecation warning at startup; migrate to `config.toml`.
   ```bash
   # Custom path
   LLM_GATEWAY_LOGS_DB=/var/log/gateway/logs.db node dist/index.js
-  # Disable flight recorder
+  # Disable durable persistence (also disables *_request_async tools)
   LLM_GATEWAY_LOGS_DB=none node dist/index.js
   ```
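
The legacy-variable mapping described in the README diff above (`LLM_GATEWAY_LOGS_DB` / `LLM_GATEWAY_JOBS_DB`, `LLM_GATEWAY_JOB_RETENTION_DAYS`, `LLM_GATEWAY_DEDUP_WINDOW_MS`, `LLM_GATEWAY_ACKNOWLEDGE_EPHEMERAL`) can be pictured with a minimal TypeScript sketch. This is not the package's code: `PersistenceConfig`, `DEFAULTS`, and `applyLegacyEnv` are hypothetical names. The precedence of `LLM_GATEWAY_JOBS_DB` over `LLM_GATEWAY_LOGS_DB`, the treatment of empty values as unset, and the skipping of non-numeric overrides are assumptions that the README does not state.

```typescript
// Illustrative sketch only. PersistenceConfig, DEFAULTS, and applyLegacyEnv are
// hypothetical names, not identifiers taken from llm-cli-gateway.
type Backend = "sqlite" | "memory" | "postgres" | "none";

interface PersistenceConfig {
  backend: Backend;
  path?: string;
  retentionDays: number;
  dedupWindowMs: number;
  acknowledgeEphemeral: boolean;
}

// Values mirror the example config.toml shown in the README.
const DEFAULTS: PersistenceConfig = {
  backend: "sqlite",
  path: "~/.llm-cli-gateway/logs.db",
  retentionDays: 30,
  dedupWindowMs: 3600000,
  acknowledgeEphemeral: false,
};

type Env = Record<string, string | undefined>;

function applyLegacyEnv(
  base: PersistenceConfig,
  env: Env,
): { config: PersistenceConfig; warnings: string[] } {
  const config: PersistenceConfig = { ...base }; // never mutate the caller's object
  const warnings: string[] = [];
  const deprecated = (name: string) =>
    warnings.push(`${name} is deprecated; migrate to config.toml`);

  // Assumption: JOBS_DB is checked before LOGS_DB, and an empty value counts as unset.
  const dbVar = ["LLM_GATEWAY_JOBS_DB", "LLM_GATEWAY_LOGS_DB"].find((name) => env[name]);
  if (dbVar !== undefined) {
    const value = env[dbVar] as string;
    if (value === "none") {
      config.backend = "none"; // `none` selects backend = "none"
      delete config.path;
    } else {
      config.backend = "sqlite"; // any other value is a sqlite path
      config.path = value;
    }
    deprecated(dbVar);
  }

  // Assumption: overrides that are not non-negative numbers are skipped.
  const numeric: Array<[string, "retentionDays" | "dedupWindowMs"]> = [
    ["LLM_GATEWAY_JOB_RETENTION_DAYS", "retentionDays"],
    ["LLM_GATEWAY_DEDUP_WINDOW_MS", "dedupWindowMs"],
  ];
  for (const [name, key] of numeric) {
    const raw = env[name];
    if (raw === undefined || raw === "") continue;
    const parsed = Number(raw);
    if (Number.isFinite(parsed) && parsed >= 0) {
      config[key] = parsed;
      deprecated(name);
    }
  }

  // `1` / `true` / `yes` set acknowledgeEphemeral = true.
  const ack = env["LLM_GATEWAY_ACKNOWLEDGE_EPHEMERAL"];
  if (ack !== undefined && ["1", "true", "yes"].includes(ack)) {
    config.acknowledgeEphemeral = true;
    deprecated("LLM_GATEWAY_ACKNOWLEDGE_EPHEMERAL");
  }

  return { config, warnings };
}
```

Calling `applyLegacyEnv(DEFAULTS, process.env)` in this sketch returns the overridden config together with one warning per legacy variable that took effect, which matches the README's statement that the deprecated variables emit a warning at startup.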

package/dist/approval-manager.d.ts CHANGED

@@ -2,7 +2,7 @@ import type { Logger } from "./logger.js";
 import type { ReviewIntegrityResult } from "./review-integrity.js";
 export type ApprovalPolicy = "strict" | "balanced" | "permissive";
 export type ApprovalStrategy = "legacy" | "mcp_managed";
-export type ApprovalCli = "claude" | "codex" | "gemini" | "grok";
+export type ApprovalCli = "claude" | "codex" | "gemini" | "grok" | "mistral";
 export type ApprovalStatus = "approved" | "denied";
 export interface ApprovalRequest {
     cli: ApprovalCli;