nodebench-mcp 3.0.1 → 3.1.0

@@ -21,24 +21,24 @@ Add to `~/.claude/settings.json`:
  }
  ```

- Restart Claude Code. 175 tools available immediately.
+ Restart Claude Code. The default lane exposes 9 tools immediately.

- ### Preset Selection
-
- By default all toolsets are enabled. Use `--preset` to start with a scoped subset:
+ ### Preset Selection
+
+ By default NodeBench starts in the workflow-first lane. Use `--preset` when you want a deeper or more specialized surface:

  ```json
  {
  "mcpServers": {
  "nodebench": {
  "command": "npx",
- "args": ["-y", "nodebench-mcp", "--preset", "meta"]
- }
- }
- }
- ```
-
- The **meta** preset is the recommended front door for new agents: start with just 5 discovery tools, use `discover_tools` to find what you need, then self-escalate to a larger preset. See [Toolset Gating & Presets](#toolset-gating--presets) for the full breakdown.
+ "args": ["-y", "nodebench-mcp", "--preset", "power"]
+ }
+ }
+ }
+ ```
+
+ The **default** preset is the recommended front door for new agents: start with the 9-tool workflow lane, use `discover_tools` only when the task needs more, then expand with `load_toolset` or a larger preset. See [Toolset Gating & Presets](#toolset-gating--presets) for the full breakdown.

  **→ Quick Refs:** After setup, run `getMethodology("overview")` | First task? See [Verification Cycle](#verification-cycle-workflow) | New to codebase? See [Environment Setup](#environment-setup) | Preset options: See [Toolset Gating & Presets](#toolset-gating--presets)
 
@@ -287,71 +287,78 @@ Use `getMethodology("overview")` to see all available workflows.
  | **Session Memory** | `save_session_note`, `load_session_notes`, `refresh_task_context` | Compaction-resilient notes, attention refresh |
  | **Discovery** | `discover_tools`, `get_tool_quick_ref`, `get_workflow_chain` | Hybrid search, quick refs, workflow chains |

- Meta + Discovery tools (5 total) are **always included** regardless of preset. See [Toolset Gating & Presets](#toolset-gating--presets).
-
- **→ Quick Refs:** Find tools by keyword: `findTools({ query: "verification" })` | Hybrid search: `discover_tools({ query: "security" })` | Get workflow guide: `getMethodology({ topic: "..." })` | See [Methodology Topics](#methodology-topics) for all topics
+ The current front door is the workflow-first default lane. Agents start small, prove value quickly, and only expand into heavier toolsets when the task demands it.
+
+ **→ Quick Refs:** Start small: `investigate({ topic: "..." })` | Expand only when needed: `discover_tools({ query: "security" })` | Load a deeper lane: `load_toolset({ toolset: "recon" })` | See [Methodology Topics](#methodology-topics) for verification and eval guidance

  ---

  ## Toolset Gating & Presets

- NodeBench MCP supports 4 presets that control which domain toolsets are loaded at startup. Meta + Discovery tools (5 total) are **always included** on top of any preset.
+ NodeBench MCP now uses a workflow-first default and keeps the larger surfaces behind explicit presets.

  ### Preset Table

- | Preset | Domain Toolsets | Domain Tools | Total (with meta+discovery) | Use Case |
- |--------|----------------|-------------|----------------------------|----------|
- | **meta** | 0 | 0 | 5 | Discovery-only front door. Agents start here and self-escalate. |
- | **lite** | 8 | 38 | 43 | Lightweight verification-focused workflows. CI bots, quick checks. |
- | **core** | 23 | 105 | 110 | Full development workflow. Most agent sessions. |
- | **full** | 31 | 170 | 175 | Everything enabled. Benchmarking, exploration, advanced use. |
+ | Preset | Surface | Use Case |
+ |--------|---------|----------|
+ | **default** | 9 visible tools | Workflow-first lane: investigate, compare, track, summarize, search, report, ask_context, discover_tools, load_toolset |
+ | **power** | Expanded workflow surface | Founder, recon, packets, and web-heavy research without admin runtime |
+ | **admin** | Operator surface | Profiling, observability, dashboards, eval, and debug workflows |
+ | **core** | Full methodology lane | Verification, eval, learning, recon, execution trace, mission harness |
+ | **founder** | Compatibility preset | Existing founder-oriented setups |
+ | **full** | All loaded domains | Maximum coverage when you explicitly want everything |

  ### Usage

- ```bash
- npx nodebench-mcp --preset meta # Discovery-only (5 tools)
- npx nodebench-mcp --preset lite # Verification + eval + recon + security
- npx nodebench-mcp --preset core # Full dev workflow without vision/parallel
- npx nodebench-mcp --preset full # All toolsets (default)
- npx nodebench-mcp --toolsets verification,eval,recon # Custom selection
- npx nodebench-mcp --exclude vision,ui_capture # Exclude specific toolsets
- ```
-
- ### The Meta Preset — Discovery-Only Front Door
-
- The **meta** preset loads zero domain tools. Agents start with only 5 tools:
-
- | Tool | Purpose |
- |------|---------|
- | `findTools` | Keyword search across all registered tools |
- | `getMethodology` | Get workflow guides by topic |
- | `discover_tools` | Hybrid search with relevance scoring (richer than findTools) |
- | `get_tool_quick_ref` | Quick reference card for any specific tool |
- | `get_workflow_chain` | Recommended tool sequence for common workflows |
-
- This is the recommended starting point for autonomous agents. The self-escalation pattern:
-
- ```
- 1. Start with --preset meta (5 tools)
- 2. discover_tools({ query: "what I need to do" }) // Find relevant tools
- 3. get_workflow_chain({ workflow: "verification" }) // Get the tool sequence
- 4. If needed tools are not loaded:
- → Restart with --preset core or --preset full
- Or use --toolsets to add specific domains
- 5. Proceed with full workflow
- ```
-
- ### Preset Domain Breakdown
-
- **meta** (0 domains): No domain tools. Meta + Discovery only.
-
- **lite** (8 domains): `verification`, `eval`, `quality_gate`, `learning`, `flywheel`, `recon`, `security`, `boilerplate`
-
- **core** (22 domains): Everything in lite plus `bootstrap`, `self_eval`, `llm`, `platform`, `research_writing`, `flicker_detection`, `figma_flow`, `benchmark`, `session_memory`, `toon`, `pattern`, `git_workflow`, `seo`, `voice_bridge`
-
- **full** (30 domains): All toolsets in TOOLSET_MAP including `ui_capture`, `vision`, `local_file`, `web`, `github`, `docs`, `parallel`, `gaia_solvers`, and everything in core.
-
- **→ Quick Refs:** Check current toolset: `findTools({ query: "*" })` | Self-escalate: restart with `--preset core` | See [MCP Tool Categories](#mcp-tool-categories) | CLI help: `npx nodebench-mcp --help`
+ ```bash
+ npx nodebench-mcp # Default workflow lane
+ npx nodebench-mcp --preset power # Expanded founder / recon / packet workflows
+ npx nodebench-mcp --preset admin # Profiling, dashboards, and operator tooling
+ npx nodebench-mcp --preset core # Full dev workflow without all optional domains
+ npx nodebench-mcp --preset full # All toolsets
+ npx nodebench-mcp --toolsets verification,eval,recon # Custom selection
+ npx nodebench-mcp --exclude vision,ui_capture # Exclude specific toolsets
+ ```
+
+ ### The Default Lane
+
+ The **default** preset is intentionally small:
+
+ | Tool | Purpose |
+ |------|---------|
+ | `investigate` | Produce a sourced report on a topic, company, person, link, or messy note |
+ | `compare` | Side-by-side compare 2 to 4 entities |
+ | `track` | Add, inspect, or list tracked entities |
+ | `summarize` | Compress raw context into a compact brief |
+ | `search` | Search live web and saved knowledge in one step |
+ | `report` | Turn findings into a readable report artifact |
+ | `ask_context` | Query saved NodeBench context |
+ | `discover_tools` | Find the next deeper lane when the default surface is not enough |
+ | `load_toolset` | Expand the session with a specific toolset only when needed |
+
+ Recommended escalation pattern:
+
+ ```
+ 1. Start with the default workflow lane
+ 2. Try investigate / compare / search / report first
+ 3. If the task needs deeper capability, call discover_tools(...)
+ 4. Load exactly one relevant toolset with load_toolset(...)
+ 5. Continue the workflow with the newly loaded tools
+ ```
+
+ ### Preset Domain Breakdown
+
+ **default**: `core_workflow`
+
+ **power**: `core_workflow`, `deep_sim`, `founder`, `recon`, `web`, `shared_context`, `sync_bridge`, `session_memory`, `entity_lookup`, `delta`, `site_map`
+
+ **admin**: `core_workflow`, `observability`, `profiler`, `local_dashboard`, `benchmark`, `longitudinal_benchmark`, `dogfood_judge`, `execution_trace`, `qa_orchestration`, `mission_harness`, `quality_gate`, `eval`, `verification`
+
+ **core**: verification, eval, quality_gate, learning, flywheel, autonomous_delivery, sync_bridge, shared_context, recon, security, boilerplate, skill_update, context_sandbox, observability, execution_trace, mission_harness, deep_sim, founder, scenario_compiler, packet_compiler, entity_temporal
+
+ **full**: every registered domain in the package
+
+ **→ Quick Refs:** Check current lane with `discover_tools({ query: "..." })` | Self-escalate with `load_toolset({ toolset: "recon" })` or restart with `--preset core` | See [MCP Tool Categories](#mcp-tool-categories) | CLI help: `npx nodebench-mcp --help`

  ---
 
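Reviewer note: the Usage flags and the Preset Domain Breakdown in the hunk above can be read together as one small resolution step. The sketch below is illustrative only. `PRESET_DOMAINS` copies just the `default` and `power` lists from the breakdown, while `resolveToolsets`, `flagValue`, `splitList`, and the precedence they apply (explicit `--toolsets` replaces the preset, `--exclude` is applied last, unknown presets fall back to `default`) are assumptions made for this example and are not taken from the package's actual parser.

```javascript
// Hypothetical sketch: how --preset, --toolsets and --exclude could combine.
// Domain lists are copied from the README breakdown; the precedence is assumed.
const PRESET_DOMAINS = {
  default: ["core_workflow"],
  power: [
    "core_workflow", "deep_sim", "founder", "recon", "web", "shared_context",
    "sync_bridge", "session_memory", "entity_lookup", "delta", "site_map",
  ],
};

// Return the value that follows a flag, or undefined when the flag is absent.
function flagValue(args, flag) {
  const idx = args.indexOf(flag);
  return idx >= 0 && idx + 1 < args.length ? args[idx + 1] : undefined;
}

// Split "a,b,c" into a clean list of names.
function splitList(value) {
  return value ? value.split(",").map((s) => s.trim()).filter(Boolean) : [];
}

function resolveToolsets(args) {
  const explicit = splitList(flagValue(args, "--toolsets"));
  const preset = flagValue(args, "--preset") ?? "default";
  // Assumed precedence: an explicit list wins, otherwise use the preset's domains.
  const base = explicit.length > 0
    ? explicit
    : PRESET_DOMAINS[preset] ?? PRESET_DOMAINS.default;
  const excluded = new Set(splitList(flagValue(args, "--exclude")));
  return base.filter((name) => !excluded.has(name));
}

console.log(resolveToolsets([]));
console.log(resolveToolsets(["--preset", "power", "--exclude", "web,delta"]));
```

With no flags the sketch resolves to the single `core_workflow` domain, which matches the "default workflow lane" line in the Usage block.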
@@ -706,12 +713,12 @@ Available via `getMethodology({ topic: "..." })`:
  | `autonomous_maintenance` | Risk-tiered execution | [Autonomous Maintenance](#autonomous-self-maintenance-system) |
  | `parallel_agent_teams` | Multi-agent coordination, task locking, oracle testing | [Parallel Agent Teams](#parallel-agent-teams) |
  | `self_reinforced_learning` | Trajectory analysis, self-eval, improvement recs | [Self-Reinforced Learning](#self-reinforced-learning-loop) |
- | `toolset_gating` | 4 presets (meta, lite, core, full) and self-escalation | [Toolset Gating & Presets](#toolset-gating--presets) |
+ | `toolset_gating` | Default, power, admin, core, founder, and full preset strategy | [Toolset Gating & Presets](#toolset-gating--presets) |
  | `toon_format` | TOON encoding — ~40% token savings vs JSON | TOON is on by default since v2.14.1 |
  | `seo_audit` | Full SEO audit workflow (technical + performance + content) | `seo_audit_url`, `check_page_performance`, `analyze_seo_content` |
  | `voice_bridge` | Voice pipeline design, config analysis, scaffolding | `design_voice_pipeline`, `analyze_voice_config` |

- **→ Quick Refs:** Find tools: `findTools({ query: "..." })` | Get any methodology: `getMethodology({ topic: "..." })` | See [MCP Tool Categories](#mcp-tool-categories)
+ **→ Quick Refs:** Find deeper lanes: `discover_tools({ query: "..." })` | Get any methodology: `getMethodology({ topic: "..." })` | See [MCP Tool Categories](#mcp-tool-categories)

  ---
 
package/README.md CHANGED
@@ -5,22 +5,22 @@
  [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
  [![GitHub stars](https://img.shields.io/github/stars/HomenShum/nodebench-ai.svg)](https://github.com/HomenShum/nodebench-ai)
  [![MCP Compatible](https://img.shields.io/badge/MCP-Compatible-green.svg)](https://modelcontextprotocol.io)
- [![Default](https://img.shields.io/badge/Default-19%20visible-brightgreen.svg)](https://www.npmjs.com/package/nodebench-mcp)
+ [![Default](https://img.shields.io/badge/Default-9%20visible-brightgreen.svg)](https://www.npmjs.com/package/nodebench-mcp)

- **Investigate a topic and return a sourced report fast.** NodeBench MCP now defaults to a small workflow facade, with heavier research and admin surfaces moved behind explicit presets.
+ **Investigate a topic and return a sourced report fast.** The NodeBench MCP architecture is now split into three install lanes that share one runtime.

- Default install: `19` visible tools total, including `7` core workflow tools:
+ Default install: `9` visible tools total, including `7` core workflow tools:
  `investigate`, `compare`, `track`, `summarize`, `search`, `report`, and `ask_context`.

  ```bash
- # Default v3 core workflow surface
+ # Core workflow lane
  claude mcp add nodebench -- npx -y nodebench-mcp

- # Extended workflow surface
- claude mcp add nodebench -- npx -y nodebench-mcp --preset power
+ # Power lane
+ claude mcp add nodebench-power -- npx -y nodebench-mcp-power

- # Admin/runtime surface
- claude mcp add nodebench -- npx -y nodebench-mcp --preset admin --admin
+ # Admin lane
+ claude mcp add nodebench-admin -- npx -y nodebench-mcp-admin
  ```

  ### What's New
@@ -40,8 +40,8 @@ This section records a repo-grounded reexamination of NodeBench MCP as of April

  ### Repo-grounded findings

- - **Surface area and messaging are out of sync**
- - This README currently markets `350+ tools`, `founder` under `50` tools, and `full` as `338`.
+ - **Surface area and messaging were out of sync**
+ - At the time of this audit, the README marketed `350+ tools`, `founder` under `50` tools, and `full` as `338`.
  - The audit that drove this section measured roughly `28` tools in the starter `tools/list` payload, roughly `186` in `founder`, and roughly `546` in `full`, before counting the extra dynamic-loading helpers separately.
  - **The core executable is overloaded**
  - `packages/mcp-local/src/index.ts` currently combines MCP serving, analytics tracking, embedding bootstrapping, profiling hooks, A/B instrumentation, dynamic tool loading, dashboard startup, and engine hosting in one runtime path.
@@ -103,13 +103,14 @@ NodeBench is now a workflow-first MCP. The default install proves one concrete j

  ### Presets

- | Preset | Visible tools | What it is for |
+ | Preset | Surface | What it is for |
  |---|---|---|
- | `default` | `19` | v3 core workflow facade: investigate, compare, track, summarize, search, report, ask_context, plus discovery/meta helpers |
- | `power` | `203` | Extended research and founder workflow pack without auto-starting admin runtime surfaces |
- | `admin` | `106` | Profiling, observability, dashboards, eval, and debug-oriented operator lanes |
- | `founder` | `191` | Legacy founder preset kept for compatibility |
- | `full` | all domains | Maximum coverage when you explicitly want the warehouse |
+ | `default` | `9` visible tools | Workflow-first lane: `investigate`, `compare`, `track`, `summarize`, `search`, `report`, `ask_context`, `discover_tools`, `load_toolset` |
+ | `power` | expanded workflow surface | Founder, recon, packets, and web-heavy workflows without admin runtime |
+ | `admin` | operator surface | Profiling, observability, dashboards, eval, and debug-oriented lanes |
+ | `core` | full methodology lane | Verification, eval, learning, recon, execution trace, and mission harness |
+ | `founder` | compatibility preset | Legacy founder-facing pack kept for existing setups |
+ | `full` | all loaded domains | Maximum coverage when you explicitly want the warehouse |

  ### Default workflow
 
@@ -173,7 +174,7 @@ Add to `.windsurf/mcp.json` (or Settings > MCP > View raw config):
  "mcpServers": {
  "nodebench": {
  "command": "npx",
- "args": ["-y", "nodebench-mcp", "--preset", "founder"]
+ "args": ["-y", "nodebench-mcp", "--preset", "power"]
  }
  }
  }
@@ -186,23 +187,23 @@ Any MCP-compatible client works. Point `command` to `npx`, `args` to `["-y", "no
  ### First Prompts to Try

  ```
- # Find tools for your task
- > Use discover_tools("evaluate this acquisition target") to find relevant tools
+ # Research a topic
+ > Use investigate with topic="Anthropic" to produce a sourced report

- # Load a toolset
- > Use load_toolset("deep_sim") to activate decision simulation tools
+ # Compare two entities
+ > Use compare with entities=["Anthropic","OpenAI"] to get a side-by-side brief

- # Run a decision simulation
- > Use run_deep_sim_scenario to simulate a business decision with multiple variables
+ # Turn rough notes into a report
+ > Use report with topic="AI agent infrastructure" and context="..." to produce a decision memo

- # Generate a decision memo
- > Use generate_decision_memo to produce a shareable memo from your analysis
+ # Search live web + saved knowledge
+ > Use search with query="MCP server best practices 2026"

- # Weekly founder reset
- > Use founder_weekly_reset to review the week's decisions and outcomes
+ # Track an entity
+ > Use track with action="add" and entity="Anthropic"

- # Pre-delegation briefing
- > Use pre_delegation_briefing to prepare context before handing off a task
+ # Expand only when you need more
+ > Use discover_tools("visual QA for a Vite app"), then load_toolset("ui_ux_dive")
  ```

  ### Optional: API Keys
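
Reviewer note: the new first prompts in the hunk above map naturally onto MCP `tools/call` requests. The sketch below builds those JSON-RPC 2.0 payloads. The tool names and argument names (`topic`, `entities`, `action`, `entity`, `query`) are taken from the prompt text; whether each tool's input schema uses exactly these keys is an assumption, and `toolCall` is a helper invented for this example.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request in the shape used by MCP clients.
// Argument keys mirror the README prompts and are assumed, not verified schemas.
let nextId = 1;
function toolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const requests = [
  toolCall("investigate", { topic: "Anthropic" }),
  toolCall("compare", { entities: ["Anthropic", "OpenAI"] }),
  toolCall("track", { action: "add", entity: "Anthropic" }),
  toolCall("search", { query: "MCP server best practices 2026" }),
];

// Print one request per line, as it would be sent over a stdio transport.
for (const req of requests) console.log(JSON.stringify(req));
```

In practice the host client builds these requests itself; the sketch is only meant to show what the prompts ask the agent to send.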
@@ -231,17 +232,17 @@ Set these as environment variables, or add them to the `env` block in your MCP c

  ---

- ## Progressive Discovery — How 338 Tools Fit in Any Context Window
+ ## Progressive Discovery — How Optional Toolsets Stay Off the Hot Path

- The starter preset loads 15 tools. The other 323 are discoverable and loadable on demand.
+ The default preset exposes `9` tools. Everything else stays off the hot path until you deliberately load a specific toolset.

  ### How it works

  ```
- 1. discover_tools("your task description") → ranked results from all 338 tools
- 2. load_toolset("deep_sim") tools activate in your session
- 3. Use the tools directly → no proxy, native binding
- 4. unload_toolset("deep_sim") free context budget when done
+ 1. discover_tools("your task description") → ranked results from the full registry
+ 2. load_toolset("ui_ux_dive") a specific toolset activates in your session
+ 3. Use the newly loaded tools directly → no proxy, native binding
+ 4. Keep the default surface small only load what the workflow needs
  ```

  ### Multi-modal search engine
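
Reviewer note: the "How it works" steps in the hunk above amount to a discover-then-load loop. A minimal sketch follows, under stated assumptions: `callTool(name, args)` stands in for whatever MCP client call the host exposes, and the result shape `{ results: [{ toolset }] }` is invented for the example because the diff does not show what `discover_tools` returns. The `query` and `toolset` argument names come from the Quick Refs lines elsewhere in this diff.

```javascript
// Hypothetical escalation helper: discover first, then load exactly one toolset.
// `callTool` and the discover_tools result shape are assumptions for this sketch.
async function escalate(callTool, taskDescription) {
  const found = await callTool("discover_tools", { query: taskDescription });
  const top = found?.results?.[0];
  if (!top?.toolset) return null; // nothing relevant: stay on the default lane
  await callTool("load_toolset", { toolset: top.toolset });
  return top.toolset;
}

// Stub client so the sketch runs without a server.
const calls = [];
async function fakeCallTool(name, args) {
  calls.push({ name, args });
  if (name === "discover_tools") return { results: [{ toolset: "ui_ux_dive" }] };
  return { ok: true };
}

escalate(fakeCallTool, "visual QA for a Vite app").then((loaded) => {
  console.log(loaded, calls.map((c) => c.name));
});
```

Returning `null` when nothing matches keeps the default surface untouched, which is the behaviour step 4 of the new text describes.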
@@ -263,10 +264,11 @@ Plus cursor pagination (`offset`/`limit`), result expansion (`expand: N`), and m

  ### Client compatibility

- | Client | Dynamic Loading |
+ | Client | Recommended path |
  |---|---|
- | Claude Code, GitHub Copilot | Native re-fetches tools after `list_changed` |
- | Windsurf, Cursor, Claude Desktop, Gemini CLI | Via `call_loaded_tool` fallback (always available) |
+ | Claude Code, GitHub Copilot | Default preset + `discover_tools` / `load_toolset` |
+ | Cursor | `--preset cursor` to stay within its tool cap |
+ | Windsurf, Claude Desktop, Gemini CLI | Use `--preset power` or a targeted preset if your client does not refresh tools reliably |

  ---
 
@@ -358,7 +360,7 @@ NodeBench MCP runs locally on your machine.

  - All persistent data stored in `~/.nodebench/` (SQLite). No data sent to external servers unless you provide API keys and use tools that call external APIs.
  - Analytics data never leaves your machine.
- - The `local_file` toolset can read files anywhere your Node.js process has permission. Use the `starter` preset to restrict file system access.
+ - The `local_file` toolset can read files anywhere your Node.js process has permission. Use the `default` preset to keep local-file tools off the hot path.
  - All API keys read from environment variables — never hardcoded or logged.
  - All database queries use parameterized statements.
 
@@ -401,7 +403,7 @@ Then use absolute path:

  **Cursor tools not loading** — Ensure `.cursor/mcp.json` exists in project root. Use `--preset cursor` to stay within the tool cap. Restart Cursor after config changes.

- **Dynamic loading not working** — Claude Code and GitHub Copilot support native dynamic loading. For Windsurf/Cursor, use `call_loaded_tool` as a fallback.
+ **Dynamic loading not working** — Claude Code and GitHub Copilot support native dynamic loading. For Windsurf/Cursor, prefer `--preset cursor` or `--preset power` if your client does not refresh tools reliably after `load_toolset`.

  ---
 
@@ -1511,7 +1511,7 @@ export function getOperatingDashboardHtml() {
  <!-- 10. Footer -->
  <footer class="footer fade-in fade-in-9">
  <div class="footer-text">
- <span>NodeBench</span> MCP v${NODEBENCH_VERSION} &middot; 325 tools &middot; 30 tables &middot; auto-refresh: 10s
+ <span>NodeBench</span> MCP v${NODEBENCH_VERSION} &middot; workflow lanes + operator data &middot; auto-refresh: 10s
  </div>
  </footer>
  </div>
package/dist/index.js CHANGED
@@ -34,7 +34,7 @@ import { createMetaTools } from "./tools/metaTools.js";
  import { createProgressiveDiscoveryTools } from "./tools/progressiveDiscoveryTools.js";
  import { getQuickRef, ALL_REGISTRY_ENTRIES, TOOL_REGISTRY, getToolComplexity, getToolAnnotations, toolNameToTitle, _setDbAccessor, hybridSearch, WORKFLOW_CHAINS } from "./tools/toolRegistry.js";
  import { getRequestedPreset, resolveRuntimeFlags } from "./runtimeConfig.js";
- import { NODEBENCH_PACKAGE_NAME, NODEBENCH_VERSION, comparePackageVersions } from "./packageInfo.js";
+ import { NODEBENCH_CLI_COMMAND, NODEBENCH_DISPLAY_NAME, NODEBENCH_NPX_PACKAGE, NODEBENCH_PACKAGE_NAME, NODEBENCH_SERVER_KEY, NODEBENCH_VERSION, comparePackageVersions, } from "./packageInfo.js";
  // TOON format — ~40% token savings on tool responses
  import { encode as toonEncode } from "@toon-format/toon";
  // Embedding provider — neural semantic search
@@ -57,6 +57,11 @@ const runtimeFlags = resolveRuntimeFlags(cliArgs, requestedPreset);
  const useEmbedding = runtimeFlags.enableEmbedding;
  const useEngine = runtimeFlags.enableEngine;
  const useProfile = runtimeFlags.enableProfiling;
+ const DISPLAY_NAME = NODEBENCH_DISPLAY_NAME;
+ const CLI_COMMAND = NODEBENCH_CLI_COMMAND;
+ const NPX_COMMAND = `npx ${NODEBENCH_NPX_PACKAGE}`;
+ const NPX_Y_COMMAND = `npx -y ${NODEBENCH_NPX_PACKAGE}`;
+ const SERVER_KEY = NODEBENCH_SERVER_KEY;
  const engineSecret = (() => {
  const idx = cliArgs.indexOf("--engine-secret");
  return idx >= 0 && idx + 1 < cliArgs.length ? cliArgs[idx + 1] : process.env.ENGINE_SECRET;
@@ -128,6 +133,9 @@ const PRESET_DESCRIPTIONS = {
  delta: "Delta (~65 tools) — full operating-intelligence preset. Entity intel, decision memos, watchlists, agent handoff, execution traces.",
  full: "Everything — all domains for maximum coverage",
  };
+ function isWorkflowFacadePreset(presetName) {
+ return presetName === "default" || presetName === "starter" || presetName === "v3";
+ }
  async function parseToolsets() {
  if (cliArgs.includes("--help")) {
  const lines = [
@@ -528,7 +536,8 @@ if (healthFlag) {
  await loadToolsets(presetToolsets);
  }
  const presetToolCount = presetToolsets
- ? presetToolsets.reduce((s, k) => s + (TOOLSET_MAP[k]?.length ?? 0), 0) + 12
+ ? presetToolsets.reduce((s, k) => s + (TOOLSET_MAP[k]?.length ?? 0), 0)
+ + (isWorkflowFacadePreset(activePreset) ? 2 : 12)
  : 12;
  lines.push(`${C}Tools${X} ${presetToolCount} visible (preset: ${activePreset}) | ${ALL_DOMAIN_KEYS.length} domains available`);
  // 2. TOON + Embedding
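
Reviewer note: the count change in the hunk above is easier to check with numbers. The sketch below reproduces the arithmetic with a stand-in `TOOLSET_MAP`: the size of 7 for `core_workflow` follows the README's "7 core workflow tools", while the size of 5 for `recon` is made up for the example. `countVisibleTools` is a name invented here; what the fixed `2` and `12` stand for is inferred from the nearby "facade tools + discover_tools + load_toolset" comment and is not confirmed by the diff.

```javascript
// Stand-in registry: sizes are illustrative, not the real toolset sizes.
const TOOLSET_MAP = {
  core_workflow: new Array(7).fill("tool"),
  recon: new Array(5).fill("tool"),
};

// Copied from the diff: presets that expose only the workflow facade.
function isWorkflowFacadePreset(presetName) {
  return presetName === "default" || presetName === "starter" || presetName === "v3";
}

// Same arithmetic as the hunk: domain tools plus a fixed helper count.
function countVisibleTools(presetToolsets, activePreset) {
  if (!presetToolsets) return 12;
  const domainTools = presetToolsets.reduce(
    (sum, key) => sum + (TOOLSET_MAP[key]?.length ?? 0),
    0,
  );
  // Facade presets add 2 helpers (presumably discover_tools and load_toolset);
  // every other preset keeps the previous fixed offset of 12.
  return domainTools + (isWorkflowFacadePreset(activePreset) ? 2 : 12);
}

console.log(countVisibleTools(["core_workflow"], "default")); // 9
console.log(countVisibleTools(["core_workflow", "recon"], "power")); // 24
```

The first result lines up with the "9 visible tools" figure the README now advertises for the default lane.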
@@ -630,7 +639,7 @@ if (healthFlag) {
  try {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), 3000);
- const res = await fetch("https://registry.npmjs.org/nodebench-mcp/latest", {
+ const res = await fetch(`https://registry.npmjs.org/${NODEBENCH_PACKAGE_NAME}/latest`, {
  signal: controller.signal,
  headers: { Accept: "application/json" },
  });
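
Reviewer note: the hunk above parameterises the registry URL but only shows part of the surrounding request. A self-contained sketch of a time-bounded "latest version" lookup in the same style is given below. `fetchLatestVersion` and its `fetchImpl` option are invented for this example so it can run offline; the response handling (reading `version` from the JSON body and returning `null` on any failure) is an assumption about what the caller wants, since the rest of the original function is not in the diff.

```javascript
// Sketch of a bounded registry lookup. `fetchImpl` is injectable for offline use;
// by default it is the global fetch available in Node 18 and later.
async function fetchLatestVersion(packageName, { timeoutMs = 3000, fetchImpl = fetch } = {}) {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`https://registry.npmjs.org/${packageName}/latest`, {
      signal: controller.signal,
      headers: { Accept: "application/json" },
    });
    if (!res.ok) return null;
    const body = await res.json();
    return typeof body.version === "string" ? body.version : null;
  } catch {
    return null; // network error or timeout: treat the latest version as unknown
  } finally {
    clearTimeout(timeout); // always release the timer so the process can exit
  }
}

// Offline usage with a stubbed fetch.
const stubFetch = async () => ({ ok: true, json: async () => ({ version: "3.1.0" }) });
fetchLatestVersion("nodebench-mcp", { fetchImpl: stubFetch }).then((v) => console.log(v));
```

Passing the package name in, as the diff now does with `NODEBENCH_PACKAGE_NAME`, lets wrapper packages check their own registry entry rather than the base package's.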
@@ -905,11 +914,26 @@ if (syncConfigsFlag) {
  if (process.env[key])
  envObj[key] = process.env[key];
  }
- // Build the MCP server config entry
+ // Build the MCP server config entry. Wrapper packages can override this so
+ // sync-configs writes the public lane command instead of an internal entry path.
  const nodePath = process.execPath; // path to node binary
+ const configCommandOverride = process.env.NODEBENCH_CONFIG_COMMAND_OVERRIDE?.trim();
+ let configArgsOverride;
+ const rawConfigArgsOverride = process.env.NODEBENCH_CONFIG_ARGS_OVERRIDE;
+ if (rawConfigArgsOverride) {
+ try {
+ const parsed = JSON.parse(rawConfigArgsOverride);
+ if (Array.isArray(parsed) && parsed.every((item) => typeof item === "string")) {
+ configArgsOverride = parsed;
+ }
+ }
+ catch {
+ // Ignore invalid override payloads and fall back to the local entry path.
+ }
+ }
  const serverEntry = {
- command: nodePath,
- args: [entryPath, ...forwardArgs],
+ command: configCommandOverride || nodePath,
+ args: configArgsOverride ?? [entryPath, ...forwardArgs],
  ...(Object.keys(envObj).length > 0 ? { env: envObj } : {}),
  };
  // Helper: merge into existing config file (preserves other servers)
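
Reviewer note: the override logic added in the hunk above can be exercised in isolation. The sketch below restates it as two small functions. `parseArgsOverride` and `buildServerEntry` are names invented here, the environment is passed in as a plain object rather than read from `process.env`, and the paths are placeholders; the validation rule (accept only a JSON array of strings, otherwise fall back) follows the diff.

```javascript
// Accept only a JSON array of strings, as the diff does; anything else is ignored.
function parseArgsOverride(raw) {
  if (!raw) return undefined;
  try {
    const parsed = JSON.parse(raw);
    if (Array.isArray(parsed) && parsed.every((item) => typeof item === "string")) {
      return parsed;
    }
  } catch {
    // Invalid JSON: fall through and use the default entry path.
  }
  return undefined;
}

// Build the config entry; `env` is injected so the sketch has no side effects.
function buildServerEntry(env, nodePath, entryPath, forwardArgs) {
  const command = env.NODEBENCH_CONFIG_COMMAND_OVERRIDE?.trim() || nodePath;
  const args = parseArgsOverride(env.NODEBENCH_CONFIG_ARGS_OVERRIDE) ?? [entryPath, ...forwardArgs];
  return { command, args };
}

// A wrapper package supplying both overrides (placeholder paths).
console.log(buildServerEntry(
  {
    NODEBENCH_CONFIG_COMMAND_OVERRIDE: "npx",
    NODEBENCH_CONFIG_ARGS_OVERRIDE: '["-y","nodebench-mcp-power"]',
  },
  "/usr/bin/node", "/path/to/index.js", [],
));

// A malformed override falls back to the local entry path.
console.log(buildServerEntry(
  { NODEBENCH_CONFIG_ARGS_OVERRIDE: "not json" },
  "/usr/bin/node", "/path/to/index.js", ["--preset", "power"],
));
```

Note that the command and args overrides are independent: setting only one of them mixes an overridden value with a default one, which may or may not be what a wrapper intends.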
@@ -948,7 +972,7 @@ if (syncConfigsFlag) {
  // 1. Claude Code: ~/.claude/claude_desktop_config.json
  try {
  const claudeConfigPath = path.join(os.homedir(), ".claude", "claude_desktop_config.json");
- const r = mergeConfig(claudeConfigPath, "nodebench-mcp");
+ const r = mergeConfig(claudeConfigPath, SERVER_KEY);
  results.push({ name: "Claude Code", ...r });
  }
  catch (e) {
@@ -957,7 +981,7 @@ if (syncConfigsFlag) {
  // 2. Cursor: <project>/.cursor/mcp.json
  try {
  const cursorConfigPath = path.join(process.cwd(), ".cursor", "mcp.json");
- const r = mergeConfig(cursorConfigPath, "nodebench-mcp");
+ const r = mergeConfig(cursorConfigPath, SERVER_KEY);
  results.push({ name: "Cursor", ...r });
  }
  catch (e) {
@@ -966,7 +990,7 @@ if (syncConfigsFlag) {
  // 3. Windsurf: <project>/.windsurf/mcp.json
  try {
  const windsurfConfigPath = path.join(process.cwd(), ".windsurf", "mcp.json");
- const r = mergeConfig(windsurfConfigPath, "nodebench-mcp");
+ const r = mergeConfig(windsurfConfigPath, SERVER_KEY);
  results.push({ name: "Windsurf", ...r });
  }
  catch (e) {
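
Reviewer note: the three hunks above switch `mergeConfig` from a hard-coded key to `SERVER_KEY`, but the body of `mergeConfig` is not part of the diff. The sketch below shows only the in-memory step such a helper would plausibly perform, replacing one entry under `mcpServers` while keeping the others. `mergeServerEntry` is a hypothetical name, the file reading and writing are omitted, and the `"nodebench"` key is a placeholder borrowed from the README config examples, not the actual value of `SERVER_KEY`.

```javascript
// Hypothetical core of a config merge: set one server entry, keep the rest.
// The real mergeConfig also reads and writes files; that part is not shown here.
function mergeServerEntry(existingConfig, serverKey, serverEntry) {
  const config = existingConfig && typeof existingConfig === "object" ? existingConfig : {};
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [serverKey]: serverEntry },
  };
}

const before = { mcpServers: { other: { command: "node", args: ["other.js"] } } };
const after = mergeServerEntry(before, "nodebench", {
  command: "npx",
  args: ["-y", "nodebench-mcp"],
});

// The unrelated "other" server is kept and the new key is added alongside it.
console.log(Object.keys(after.mcpServers));
```

One consequence of changing the key worth checking in review: a config previously written under `"nodebench-mcp"` would keep that old entry next to the new one unless the real helper removes it.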
@@ -985,8 +1009,8 @@ if (syncConfigsFlag) {
  // Print config summary
  lines.push("");
  lines.push(`${C}Config entry:${X}`);
- lines.push(` command: ${nodePath}`);
- lines.push(` args: [${[entryPath, ...forwardArgs].map(a => `"${a}"`).join(", ")}]`);
+ lines.push(` command: ${String(serverEntry.command)}`);
+ lines.push(` args: [${(serverEntry.args ?? []).map(a => `"${a}"`).join(", ")}]`);
  if (Object.keys(envObj).length > 0) {
  lines.push(` env: ${Object.keys(envObj).join(", ")}`);
  }
@@ -1933,7 +1957,7 @@ const dynamicLoadingTools = [
  },
  ];
  // v3 preset gate: expose only facade tools + discover_tools + load_toolset
- const isV3Surface = currentPreset === "default" || currentPreset === "starter" || currentPreset === "v3";
+ const isV3Surface = isWorkflowFacadePreset(currentPreset);
  // Combine all tools (mutable for dynamic loading)
  let allTools;
  if (isV3Surface) {
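
Reviewer note: the body of the `if (isV3Surface)` branch is cut off by the hunk boundary above. Going only by the comment "expose only facade tools + discover_tools + load_toolset", a gate of this kind might look like the sketch below. `FACADE_TOOL_NAMES` and `visibleTools` are hypothetical names; the nine tool names come from the README's default-lane table, and `run_deep_sim_scenario` is used only as an example of a non-facade tool mentioned elsewhere in the diff.

```javascript
// Hypothetical gate: on facade presets, list only the workflow tools plus the
// two escalation helpers. Names are taken from the README default-lane table.
const FACADE_TOOL_NAMES = new Set([
  "investigate", "compare", "track", "summarize", "search", "report", "ask_context",
  "discover_tools", "load_toolset",
]);

function visibleTools(allTools, isFacadeSurface) {
  return isFacadeSurface
    ? allTools.filter((tool) => FACADE_TOOL_NAMES.has(tool.name))
    : allTools;
}

const registered = [
  { name: "investigate" },
  { name: "run_deep_sim_scenario" }, // not part of the facade
  { name: "load_toolset" },
];
console.log(visibleTools(registered, true).map((t) => t.name));
console.log(visibleTools(registered, false).length);
```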
@@ -3378,7 +3402,7 @@ const engineInfo = enginePort ? ` engine at http://127.0.0.1:${enginePort}` : ""
  const runtimeInfo = runtimeFlags.enableDashboards || runtimeFlags.enableWatchdog
  ? " admin-runtime"
  : " core-runtime";
- console.error(`nodebench-mcp ready (${allTools.length} tools, ${PROMPTS.length} prompts${toolsetInfo}, SQLite at ~/.nodebench/${dashInfo}${uiDiveInfo}${engineInfo}${runtimeInfo})`);
+ console.error(`${CLI_COMMAND} ready (${allTools.length} tools, ${PROMPTS.length} prompts${toolsetInfo}, SQLite at ~/.nodebench/${dashInfo}${uiDiveInfo}${engineInfo}${runtimeInfo})`);
  // ── Auto-brief on first start (delta/hackathon presets) ──────────────
  // When using delta or hackathon preset, auto-run delta_brief on first session
  // to give users immediate value before they even ask.