consult-llm-mcp 2.7.4 → 2.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/CHANGELOG.md +19 -0
  2. package/README.md +88 -67
  3. package/package.json +5 -5
package/CHANGELOG.md CHANGED
@@ -1,5 +1,24 @@
  # Changelog
 
+ ## v2.8.0 (2026-03-13)
+
+ - Replaced the hardcoded model enum with abstract selectors (`gemini`, `openai`,
+   `deepseek`) that resolve to the best available model at query time. This
+   avoids the need to hardcode a specific model on the caller side.
+ - Responses now include a `[model:xxx]` prefix showing which concrete model was
+   used
+ - Default Codex reasoning effort to "high" (was previously unset)
+ - Monitor: added Task column to active and history tables
+ - Monitor: show task mode and reasoning effort in detail view header
+ - Monitor: press `s` in detail view to toggle system prompt display
+ - Monitor: system prompt is now recorded in sidecar event files for viewing in
+   the TUI
+
+ ## v2.7.4 (2026-03-13)
+
+ - Fixed Linux prebuilt binaries failing on older distros due to glibc version
+   mismatch by switching to musl static linking
+
  ## v2.7.1 (2026-03-09)
 
  - Monitor: show "Thinking..." spinner when thinking events are streaming
package/README.md CHANGED
@@ -29,12 +29,10 @@ to bring in the heavy artillery. Supports multi-turn conversations.
  - Query powerful AI models (GPT-5.4, Gemini 3.1 Pro, DeepSeek Reasoner) with
    relevant files as context
- - Direct queries with optional file context
- - Include git changes for code review and analysis
- - Comprehensive logging with cost estimation
+ - Include git changes for code review
+ - Comprehensive logging with cost estimation (if using API)
  - [Monitor TUI](#monitor): Real-time dashboard for watching active consultations
- - [Gemini CLI backend](#gemini-cli): Use the `gemini` CLI to take advantage of
-   [free quota](https://developers.google.com/gemini-code-assist/resources/quotas#quotas-for-agent-mode-gemini-cli)
+ - [Gemini CLI backend](#gemini-cli): Use the `gemini` CLI for Gemini models
  - [Codex CLI backend](#codex-cli): Use the `codex` CLI for OpenAI models
  - [Cursor CLI backend](#cursor-cli): Use the `cursor-agent` CLI to route GPT and
    Gemini models through a single tool
@@ -55,37 +53,40 @@ to bring in the heavy artillery. Supports multi-turn conversations.
  ```bash
  claude mcp add consult-llm \
-   -e OPENAI_API_KEY=your_key \
-   -e GEMINI_API_KEY=your_key \
+   -e CONSULT_LLM_GEMINI_BACKEND=gemini-cli \
+   -e CONSULT_LLM_OPENAI_BACKEND=codex-cli \
    -- npx -y consult-llm-mcp
  ```
 
+ This is the recommended setup. Uses [Gemini CLI](#gemini-cli) and
+ [Codex CLI](#codex-cli). No API keys required, just `gemini login` and
+ `codex login`.
+
  **With binary** (no Node.js required):
 
  ```bash
  curl -fsSL https://raw.githubusercontent.com/raine/consult-llm-mcp/main/scripts/install.sh | bash
+ ```
+
+ ```bash
  claude mcp add consult-llm \
-   -e OPENAI_API_KEY=your_key \
-   -e GEMINI_API_KEY=your_key \
+   -e CONSULT_LLM_GEMINI_BACKEND=gemini-cli \
+   -e CONSULT_LLM_OPENAI_BACKEND=codex-cli \
    -- consult-llm-mcp
  ```
 
  For global availability across projects, add `--scope user`.
 
- <details>
- <summary>Using multiple API keys or CLI backends</summary>
+ **Using API keys instead of CLI backends:**
 
  ```bash
  claude mcp add consult-llm \
    -e OPENAI_API_KEY=your_openai_key \
    -e GEMINI_API_KEY=your_gemini_key \
    -e DEEPSEEK_API_KEY=your_deepseek_key \
-   -e CONSULT_LLM_GEMINI_BACKEND=gemini-cli \
    -- npx -y consult-llm-mcp
  ```
 
- </details>
-
  2. **Verify connection** with `/mcp`:
 
  ```
@@ -380,8 +381,9 @@ claude mcp add consult-llm -e CONSULT_LLM_OPENAI_BACKEND=codex-cli -- npx -y con
  <!-- prettier-ignore -->
  > [!TIP]
- > Set reasoning effort with `-e CONSULT_LLM_CODEX_REASONING_EFFORT=high`. Options:
- > `none`, `minimal`, `low`, `medium`, `high`, `xhigh`.
+ > Reasoning effort defaults to `high`. Override with
+ > `-e CONSULT_LLM_CODEX_REASONING_EFFORT=xhigh`. Options: `none`, `minimal`,
+ > `low`, `medium`, `high`, `xhigh`.
 
  #### Cursor CLI
 
@@ -405,8 +407,7 @@ claude mcp add consult-llm -e CONSULT_LLM_GEMINI_BACKEND=cursor-cli -- npx -y co
  claude mcp add consult-llm \
    -e CONSULT_LLM_OPENAI_BACKEND=cursor-cli \
    -e CONSULT_LLM_GEMINI_BACKEND=cursor-cli \
-   -e CONSULT_LLM_CODEX_REASONING_EFFORT=high \
-   -e CONSULT_LLM_ALLOWED_MODELS="gemini-3-pro-preview,gpt-5.3-codex" \
+   -e CONSULT_LLM_ALLOWED_MODELS="gemini-3.1-pro-preview,gpt-5.3-codex" \
    -- npx -y consult-llm-mcp
  ```
 
@@ -419,11 +420,7 @@ review), allow them in `~/.cursor/cli-config.json`:
  ```json
  {
    "permissions": {
-     "allow": [
-       "Shell(git diff*)",
-       "Shell(git log*)",
-       "Shell(git show*)"
-     ],
+     "allow": ["Shell(git diff*)", "Shell(git log*)", "Shell(git show*)"],
      "deny": []
    }
  }
@@ -495,31 +492,38 @@ See the "Using web mode..." example above for a concrete transcript.
    mode)
  - `DEEPSEEK_API_KEY` - Your DeepSeek API key (required for DeepSeek models)
  - `CONSULT_LLM_DEFAULT_MODEL` - Override the default model (optional)
-   - Options: `gpt-5.2` (default), `gpt-5.4`, `gemini-2.5-pro`,
-     `gemini-3-pro-preview`, `gemini-3.1-pro-preview`, `deepseek-reasoner`,
-     `gpt-5.3-codex`, `gpt-5.2-codex`
+   - Accepts selectors (`gemini`, `openai`, `deepseek`) or exact model IDs
+     (`gpt-5.4`, `gemini-3.1-pro-preview`, etc.)
+   - Selectors are resolved to the best available model at startup
  - `CONSULT_LLM_GEMINI_BACKEND` - Backend for Gemini models (optional)
    - Options: `api` (default), `gemini-cli`, `cursor-cli`
  - `CONSULT_LLM_OPENAI_BACKEND` - Backend for OpenAI models (optional)
    - Options: `api` (default), `codex-cli`, `cursor-cli`
- - `CONSULT_LLM_CODEX_REASONING_EFFORT` - Configure reasoning effort for Codex
-   CLI (optional)
-   - See [Codex CLI](#codex-cli) for details and available options
+ - `CONSULT_LLM_ALLOWED_MODELS` - Restrict which concrete models can be used
+   (optional)
+   - Comma-separated list, e.g., `gpt-5.4,gemini-3.1-pro-preview`
+   - Selectors resolve against this list: e.g., if only `gemini-2.5-pro` is
+     allowed, the `gemini` selector resolves to it
+   - Useful when a backend doesn't support all models (e.g., Cursor CLI)
+   - See [Tips](#controlling-which-models-are-used) for usage examples
  - `CONSULT_LLM_EXTRA_MODELS` - Add models not in the built-in list (optional)
    - Comma-separated list, e.g., `grok-3,kimi-k2.5`
    - Merged with built-in models and included in the tool schema
    - Useful for newly released models with a known provider prefix (`gpt-`,
      `gemini-`, `deepseek-`)
- - `CONSULT_LLM_ALLOWED_MODELS` - List of models to advertise (optional)
-   - Comma-separated list, e.g., `gpt-5.2,gemini-3-pro-preview`
-   - When set, only these models appear in the tool schema
-   - Filters the combined catalog (built-in + extra models)
-   - If `CONSULT_LLM_DEFAULT_MODEL` is set, it must be in this list
-   - See [Tips](#controlling-which-models-claude-uses) for usage examples
+ - `CONSULT_LLM_CODEX_REASONING_EFFORT` - Configure reasoning effort for Codex
+   CLI (optional, default: `high`)
+   - See [Codex CLI](#codex-cli) for details and available options
  - `CONSULT_LLM_SYSTEM_PROMPT_PATH` - Custom path to system prompt file
    (optional)
    - Overrides the default `~/.consult-llm-mcp/SYSTEM_PROMPT.md` location
    - Useful for project-specific prompts
+ - `CONSULT_LLM_NO_UPDATE_CHECK` - Disable automatic update checking on server
+   startup (optional)
+   - Set to `1` to disable
+   - By default, the server checks for new versions in the background every 24
+     hours and logs a notice when an update is available
+   - Only applies to binary installs; npm installs are never checked
  - `MCP_DEBUG_STDIN` - Log raw JSON-RPC messages received on stdin (optional)
    - Set to `1` to enable
    - Logs every message as `RAW RECV` entries and poll timing gaps as
@@ -558,30 +562,33 @@ claude mcp add consult-llm \
 
  ## Tips
 
- ### Controlling which models Claude uses
+ ### Controlling which models are used
 
- When you ask Claude to "consult an LLM" without specifying a model, it picks one
- from the available options in the tool schema. The `CONSULT_LLM_DEFAULT_MODEL`
- only affects the fallback when no model is specified in the tool call.
+ The `model` parameter accepts **selectors** (`gemini`, `openai`, `deepseek`)
+ that the server resolves to the best available concrete model. When no model is
+ specified, the server uses `CONSULT_LLM_DEFAULT_MODEL` or its built-in fallback.
 
- To control which models Claude can choose from, use
- `CONSULT_LLM_ALLOWED_MODELS`:
+ **Selector resolution order** (first available wins):
 
- ```bash
- claude mcp add consult-llm \
-   -e GEMINI_API_KEY=your_key \
-   -e CONSULT_LLM_ALLOWED_MODELS='gemini-3-pro-preview,gpt-5.2-codex' \
-   -- npx -y consult-llm-mcp
- ```
+ | Selector   | Priority                                                       |
+ | ---------- | -------------------------------------------------------------- |
+ | `gemini`   | gemini-3.1-pro-preview → gemini-3-pro-preview → gemini-2.5-pro |
+ | `openai`   | gpt-5.4 → gpt-5.3-codex → gpt-5.2 → gpt-5.2-codex              |
+ | `deepseek` | deepseek-reasoner                                              |
 
- This restricts the tool schema to only advertise these models. For example, to
- ensure Claude always uses Gemini 3 Pro:
+ **Restricting models with `CONSULT_LLM_ALLOWED_MODELS`:**
+
+ If your backend doesn't support all models (e.g., Cursor CLI can't use
+ `gpt-5.4`), use `CONSULT_LLM_ALLOWED_MODELS` to filter. Selectors will
+ automatically resolve to the best model within the allowed list:
 
  ```bash
+ # Restrict to models supported by Cursor CLI
  claude mcp add consult-llm \
-   -e GEMINI_API_KEY=your_key \
-   -e CONSULT_LLM_ALLOWED_MODELS='gemini-3-pro-preview' \
+   -e CONSULT_LLM_OPENAI_BACKEND=cursor-cli \
+   -e CONSULT_LLM_ALLOWED_MODELS='gpt-5.3-codex,gemini-3.1-pro-preview' \
    -- npx -y consult-llm-mcp
+ # "openai" selector → gpt-5.3-codex (gpt-5.4 filtered out)
  ```
 
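The selector resolution and allow-list filtering described in the Tips changes can be sketched roughly as follows. The priority lists come from the README's resolution table, but `resolveModel` is an illustrative helper, not the package's actual implementation:

```typescript
// Priority lists per selector, as given in the README's resolution table.
const PRIORITY: Record<string, string[]> = {
  gemini: ['gemini-3.1-pro-preview', 'gemini-3-pro-preview', 'gemini-2.5-pro'],
  openai: ['gpt-5.4', 'gpt-5.3-codex', 'gpt-5.2', 'gpt-5.2-codex'],
  deepseek: ['deepseek-reasoner'],
};

// Resolve a selector to a concrete model, honoring an optional allow-list
// (CONSULT_LLM_ALLOWED_MODELS). Exact model IDs pass through unchanged.
function resolveModel(model: string, allowed?: string[]): string {
  const candidates = PRIORITY[model];
  if (!candidates) return model; // exact model ID, e.g. 'gpt-5.4'
  const pool = allowed
    ? candidates.filter((m) => allowed.includes(m))
    : candidates;
  if (pool.length === 0) {
    throw new Error(`no allowed model for selector "${model}"`);
  }
  return pool[0]; // first available wins
}
```

Under this sketch, `resolveModel('openai', ['gpt-5.3-codex', 'gemini-3.1-pro-preview'])` yields `gpt-5.3-codex`, matching the Cursor CLI example above.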
  ## MCP tool: consult_llm
@@ -596,10 +603,12 @@ models complex questions.
  - **files** (optional): Array of file paths to include as context
    - All files are added as context with file paths and code blocks
 
- - **model** (optional): LLM model to use
-   - Options: `gpt-5.2` (default), `gpt-5.4`, `gemini-2.5-pro`,
-     `gemini-3-pro-preview`, `gemini-3.1-pro-preview`, `deepseek-reasoner`,
-     `gpt-5.3-codex`, `gpt-5.2-codex`
+ - **model** (optional): Model selector or exact model ID
+   - Selectors: `gemini`, `openai`, `deepseek`; the server resolves to the best
+     available model for each family
+   - Exact model IDs (`gpt-5.4`, `gemini-3.1-pro-preview`, etc.) are also
+     accepted as an advanced override
+   - When omitted, the server uses the configured default
 
  - **task_mode** (optional): Controls the system prompt persona. The calling LLM
    should choose based on the task:
@@ -631,15 +640,12 @@ models complex questions.
 
  ## Supported models
 
- - **gemini-2.5-pro**: Google's Gemini 2.5 Pro ($1.25/$10 per million tokens)
- - **gemini-3-pro-preview**: Google's Gemini 3 Pro Preview ($2/$12 per million
-   tokens for prompts ≤200k tokens, $4/$18 for prompts >200k tokens)
- - **gemini-3.1-pro-preview**: Google's Gemini 3.1 Pro Preview ($2/$12 per
-   million tokens for prompts ≤200k tokens, $4/$18 for prompts >200k tokens)
- - **deepseek-reasoner**: DeepSeek's reasoning model ($0.55/$2.19 per million
-   tokens)
- - **gpt-5.4**: OpenAI's GPT-5.4 model ($2.50/$15 per million tokens)
- - **gpt-5.2**: OpenAI's GPT-5.2 model ($1.75/$14 per million tokens)
+ - **gemini-2.5-pro**: Google's Gemini 2.5 Pro
+ - **gemini-3-pro-preview**: Google's Gemini 3 Pro Preview
+ - **gemini-3.1-pro-preview**: Google's Gemini 3.1 Pro Preview
+ - **deepseek-reasoner**: DeepSeek's reasoning model
+ - **gpt-5.4**: OpenAI's GPT-5.4 model
+ - **gpt-5.2**: OpenAI's GPT-5.2 model
  - **gpt-5.3-codex**: OpenAI's Codex model based on GPT-5.3
  - **gpt-5.2-codex**: OpenAI's Codex model based on GPT-5.2
 
@@ -739,7 +745,9 @@ always reliably triggered. See the [consult skill](#consult) below.
  **Recommendation:** Start with no custom activation. Use skills if you need
  custom instructions for how the MCP is invoked.
 
- ## Installing skills
+ ## Skills
+
+ ### Installing skills
 
  Install all skills globally with a single command:
 
@@ -759,8 +767,6 @@ To uninstall:
  curl -fsSL https://raw.githubusercontent.com/raine/consult-llm-mcp/main/scripts/install-skills | bash -s uninstall
  ```
 
- ## Skills
-
  ### consult
 
  An example [Claude Code skill](https://code.claude.com/docs/en/skills) that uses
@@ -816,6 +822,21 @@ forth before synthesizing and implementing. See
  > /debate-vs --gemini design the multi-tenant isolation strategy
  ```
 
+ ## Updating
+
+ **Binary installs:**
+
+ ```bash
+ consult-llm-mcp update
+ ```
+
+ Downloads the latest release from GitHub with SHA-256 checksum verification. If
+ `consult-llm-monitor` is found alongside the binary, it's updated too.
+
+ The server also checks for updates in the background on startup (every 24 hours)
+ and logs a notice when a newer version is available. Disable with
+ `CONSULT_LLM_NO_UPDATE_CHECK=1`.
+
  ## Development
 
  To work on the MCP server locally and use your development version:
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "consult-llm-mcp",
-   "version": "2.7.4",
+   "version": "2.9.0",
    "description": "MCP server for consulting powerful AI models",
    "repository": {
      "type": "git",
@@ -31,9 +31,9 @@
      "ai"
    ],
    "optionalDependencies": {
-     "consult-llm-mcp-darwin-arm64": "2.7.4",
-     "consult-llm-mcp-darwin-x64": "2.7.4",
-     "consult-llm-mcp-linux-x64": "2.7.4",
-     "consult-llm-mcp-linux-arm64": "2.7.4"
+     "consult-llm-mcp-darwin-arm64": "2.9.0",
+     "consult-llm-mcp-darwin-x64": "2.9.0",
+     "consult-llm-mcp-linux-x64": "2.9.0",
+     "consult-llm-mcp-linux-arm64": "2.9.0"
    }
  }