npm - agent-discover - Versions diffs - 1.2.2 → 1.2.3 - Mend

agent-discover 1.2.2 → 1.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4) hide show

package/CHANGELOG.md CHANGED Viewed

@@ -5,6 +5,66 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [1.2.3] - 2026-04-09
+### Documentation
+- **README.md**: bumped test count to 179, added the new badge for 11 registry actions, rewrote the "MCP Tools" section to document `find_tool` / `find_tools` / `get_schema` / `proxy_call` with the auto_activate guidance, expanded the Features list to lead with single-call discovery + pluggable embeddings + indirect invocation, added the embeddings env-var section, linked the bench README headline result.
+- **CHANGELOG.md**: backfilled v1.1.3 / v1.1.4 / v1.2.0 / v1.2.1 / v1.2.2 entries.
+- **docs/API.md**: documented all 4 new registry actions with parameters and example payloads + responses, added the no-match threshold semantics for `find_tool`.
+- **docs/ARCHITECTURE.md**: added the embeddings provider layer to the layered diagram, documented the `src/embeddings/` subsystem (none/local/openai), the hybrid retrieval pipeline, the `0.25` no-match threshold, the migration to schema V5, and the new `embedding` / `embedding_model` columns on `server_tools`.
+- **docs/USER-MANUAL.md**: new sections for `registry_find_tool` / `registry_find_tools` / `registry_get_schema` / `registry_proxy_call` with usage examples, added the embeddings env-var table to the configuration section.
+- **docs/SETUP.md**: added the embeddings env-var table with full enable/disable walkthrough for both local and openai providers.
+## [1.2.2] - 2026-04-09
+### Fixed
+- **`find_tool` / `find_tools` no-match detection.** Hybrid retrieval (introduced in 1.2.0) returned at least one match for every query because cosine similarity is non-zero for every embedded tool, so the existing `matches.length === 0` guard never fired and a query like `"totally nonexistent xyzzy"` would surface a low-confidence garbage tool. Both actions now apply a `MIN_SCORE_THRESHOLD = 0.25` to the top result and return `{ found: false, top_score, hint }` when nothing crosses it. Real queries (typical scores > 0.4) are unaffected; garbage matches (~0.1) are now correctly rejected. Caught by the v1.2.x E2E test pass.
+## [1.2.1] - 2026-04-09
+### Added
+- **Pluggable embedding providers (`src/embeddings/`).** Multi-provider subsystem mirroring agent-knowledge's pattern, with semantic search opt-in via `AGENT_DISCOVER_EMBEDDING_PROVIDER` (default `none` so existing installs without an embedding key keep working with BM25-only ranking). Providers shipped:
+  - `none` — `NoopEmbeddingProvider`, returned by default. Reports unavailable so callers transparently fall back to BM25.
+  - `openai` — `OpenAIEmbeddingProvider`, `text-embedding-3-small` (1536 dims), native `fetch`, batched 256 inputs per request.
+  - `local` — `LocalEmbeddingProvider`, `Xenova/all-MiniLM-L6-v2` (384 dims) via `@huggingface/transformers` (optional peer dep, dynamically imported via indirect string so the package isn't required at compile time). q8 quantized, configurable `AGENT_DISCOVER_EMBEDDING_THREADS` and `AGENT_DISCOVER_EMBEDDING_IDLE_TIMEOUT`.
+- **Provider factory** (`src/embeddings/factory.ts`) caches the resolved provider, falls back to `NoopEmbeddingProvider` on any unavailable / API-key-missing / model-load-failure case so the registry never crashes on a misconfiguration.
+- **Shared math + encoding helpers** (`cosineSimilarity`, `encodeEmbedding` / `decodeEmbedding`) exported from `src/embeddings/index.ts`.
+### Changed
+- `RegistryService` constructor no longer eagerly creates the embedding provider — it's lazily resolved via `getEmbeddings()` so the factory's dynamic imports only run when somebody actually saves or searches tools.
+- `saveToolsWithEmbeddings()` returns `{ embedded, skipped, provider }` and tolerates per-tool null embeddings cleanly.
+- `searchToolsHybrid()` awaits the provider, falls back to plain `searchTools()` when `provider.name === 'none'` or the query embedding fails.
+## [1.2.0] - 2026-04-09
+### Added
+- **`find_tool` registry action — single-call tool discovery.** Hybrid BM25 + semantic ranking returns the top match with a confidence label (`high` / `medium` / `low` derived from the score gap to the runner-up), compact `required_args`, and 4 ranked alternatives in `other_matches`. Auto-activates the owning child server so the agent can call the proxied tool immediately on the next turn. Replaces the old multi-call `search → list → activate` flow that took ~16 round-trips per task in the bench baseline.
+- **`find_tools` registry action — batch variant.** Pass `intents: ["intent1", "intent2", …]` to discover N tools in one round-trip for multi-step tasks.
+- **`get_schema` registry action.** Returns the full `input_schema` for a tool already discovered via `find_tool`. Use only when the compact `required_args` summary isn't enough (conditional / polymorphic args). Compact-first delivery cuts `find_tool` result tokens 5–10× on heavy registries.
+- **`proxy_call` registry action.** Invokes a discovered tool **through** agent-discover without exposing it to the host catalog. Combined with `find_tool({auto_activate: false})`, this keeps the host MCP catalog at exactly 5 agent-discover actions regardless of how many tools the registered child servers actually expose — critical at large catalog sizes where firing `notifications/tools/list_changed` would flood the host with thousands of schemas.
+- **`did_you_mean` recovery on tool errors.** When a proxied tool call fails (validation error, missing args, runtime error), the proxy intercepts the response, runs a BM25 search by the failed tool name, and attaches a `did_you_mean` array of similarly-named alternatives. Lets the agent recover from a wrong-tool selection in one extra turn instead of giving up or re-running discovery.
+- **FTS5 + BM25 ranking on `server_tools`** (migration v4). Adds a `server_tools_fts` virtual table with name × 4 / description × 1 column weighting, plus a query preprocessor that expands verb synonyms (`fetch → get`, `cancel → delete`, etc.) and singularizes plurals (`subscriptions → subscription`).
+- **Embedding columns on `server_tools`** (migration v5). `embedding` (TEXT, base64-encoded float32) and `embedding_model` columns for semantic search. Embeddings are optional; tools without one fall back to BM25 ranking only.
+- **Hybrid retrieval (`searchToolsHybrid`).** Brute-force cosine similarity over the entire embedded catalog + BM25 candidate union, scored 70% semantic / 30% lexical. Closes the natural-language gap that pure BM25 misses (e.g. "billing arrangement" → "subscription").
+- **Bench harness** under `bench/` comparing eager tool loading vs deferred discovery. Real Claude Code (`bench/drivers/cli.ts`) and OpenCode (`bench/drivers/opencode.ts`) drivers, isolated bench DB, scoring with `success` / `choice_accuracy` / `distractor_call_rate` / `refusal_rate` metrics, and a standalone `bench/rescore.ts` that re-applies the current scoring logic to captured event streams without spending fresh API tokens. Headline result at N=1000 on OpenCode + gpt-5-mini against an adversarial natural-language verb pack: discover 100% / 100% / 0% vs eager 80% / 80% / 20%, with ~27% lower per-turn token cost. Full results in `bench/README.md`.
+## [1.1.4] - 2026-04-09
+### Added
+- Bench iteration adding `find_tools` (plural) and `did_you_mean` recovery — both later carried forward into v1.2.0. See `bench/README.md` for the historical context.
+## [1.1.3] - 2026-04-09
+### Added
+- Bench iteration adding BM25 + confidence labels + compact-first schema delivery — later carried forward into v1.2.0.
 ## [1.1.2] - 2026-04-08
 ### Fixed

package/README.md CHANGED Viewed

@@ -2,8 +2,9 @@
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
 [![Node.js](https://img.shields.io/badge/node-%3E%3D20.11-brightgreen)](https://nodejs.org/)
-[![Tests](https://img.shields.io/badge/tests-151%20passing-brightgreen)]()
+[![Tests](https://img.shields.io/badge/tests-179%20passing-brightgreen)]()
 [![MCP Tools](https://img.shields.io/badge/MCP%20tools-1-purple)]()
+[![Registry Actions](https://img.shields.io/badge/registry%20actions-11-blueviolet)]()
 [![REST Endpoints](https://img.shields.io/badge/REST-19%20endpoints-orange)]()
 **MCP server registry and marketplace.** Discover, install, activate, and manage MCP tools on demand. Acts as a dynamic proxy -- activated servers have their tools merged into the registry's own tool list, so agents can use them without restarting.
@@ -37,6 +38,11 @@ Static MCP configs mean every server is always running, even when unused. Adding
 ## Features
+- **Single-call tool discovery (`find_tool`)** — hybrid BM25 + semantic ranking returns the top match with a confidence label, compact `required_args`, and 4 ranked alternatives. Auto-activates the owning child server so the agent can call the proxied tool immediately on the next turn. Replaces the multi-step `search → list → activate` dance with one round-trip.
+- **Batch discovery (`find_tools`)** — pass an array of intents to discover N tools in a single round-trip for multi-step tasks.
+- **Indirect invocation (`proxy_call`)** — call a discovered tool **through** agent-discover without exposing it to the host catalog. Keeps the host MCP surface at exactly 5 actions regardless of how many tools the registered child servers expose — critical for very large catalogs where flooding the host with thousands of schemas would blow the model's context budget.
+- **Pluggable embeddings (`AGENT_DISCOVER_EMBEDDING_PROVIDER`)** — semantic search is opt-in via `none` (default, BM25 only) / `local` (Xenova/all-MiniLM-L6-v2 via `@huggingface/transformers`) / `openai` (`text-embedding-3-small`). Provider failures fall back to BM25 cleanly. Mirrors agent-knowledge's pattern so the same model can be reused.
+- **`did_you_mean` recovery** — when a proxied tool call fails, the proxy attaches BM25-ranked similar-tool suggestions to the error response so the agent can correct in one extra turn instead of giving up.
 - **Local registry** -- register MCP servers in a SQLite database with name, command, args, env, tags
 - **Federated marketplace search** -- a single query hits the official MCP registry, npm, and PyPI in parallel, dedupes by `<source>:<name>`, and collapses version duplicates
 - **PyPI integration** -- curated list of well-known Python MCP servers (`mcp-server-fetch`, `mcp-server-git`, `mcp-server-time`, `mcp-server-postgres`, `mcp-server-sqlite`, `mcp-proxy`, …) plus live metadata via the PyPI JSON API; Python entries install via `uvx`
@@ -49,10 +55,11 @@ Static MCP configs mean every server is always running, even when unused. Adding
 - **Secret management** -- store API keys and tokens per server, automatically injected as env vars (stdio) or HTTP headers (SSE/streamable-http) on activation; CRLF-validated to prevent header injection
 - **Health checks** -- connect/disconnect probes for inactive servers, tool-list checks for active ones, with error count tracking
 - **Per-tool metrics** -- call counts, error counts, and average latency recorded automatically on every proxied tool call
-- **Full-text search** -- FTS5 search across server names, descriptions, and tags
+- **Full-text search** -- FTS5 search across server names, descriptions, and tags + cross-server tool index for `find_tool`
 - **Pre-download** -- fire-and-forget `npm cache add` (npx servers) or `uv tool install` (uvx servers) on registration, plus a dedicated `/preinstall` endpoint
 - **Real-time dashboard** -- web UI at http://localhost:3424 with Servers and Browse tabs, dark/light theme, WebSocket updates
 - **3 transport layers** -- MCP (stdio), REST API (HTTP), WebSocket (real-time events)
+- **Bench harness** -- under `bench/`, comparing eager tool loading vs deferred discovery against real Claude Code and OpenCode hosts. Headline result at N=1000 against an adversarial natural-language verb pack: discover 100% accuracy + 27% lower per-turn token cost vs eager 80% accuracy. See [`bench/README.md`](bench/README.md).
 ---
@@ -106,14 +113,26 @@ node dist/server.js --port 3424
 ## MCP Tools (1)
-A single action-based tool handles every operation via the `action` parameter — this keeps the prompt-overhead cost minimal.
-| Tool       | Actions                                                                      | Description                                                                                                                             |
-| ---------- | ---------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
-| `registry` | `list`, `install`, `uninstall`, `activate`, `deactivate`, `browse`, `status` | Registry + server lifecycle — search local servers, install from marketplace or manual config, activate/deactivate, browse, show status |
+A single action-based tool handles every operation via the `action` parameter — this keeps the prompt-overhead cost minimal regardless of how many child servers are registered.
+| Action       | Purpose                                                                                                                                                               |
+| ------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `find_tool`  | **Single-call discovery.** Hybrid BM25 + semantic search → top match + confidence label + compact `required_args` + 4 alternatives. Auto-activates the owning server. |
+| `find_tools` | **Batch discovery.** Pass `intents: [...]` to discover N tools in one round-trip. Use for multi-step tasks.                                                           |
+| `get_schema` | Full `input_schema` for a discovered tool. Only needed when the compact `required_args` summary isn't enough (conditional / polymorphic args).                        |
+| `proxy_call` | Invoke a discovered tool **through** agent-discover without exposing it to the host catalog. Pair with `find_tool({auto_activate: false})` for huge catalogs.         |
+| `list`       | Search the local registry by server (FTS5).                                                                                                                           |
+| `install`    | Add a server from the marketplace or via manual config (command + args + env).                                                                                        |
+| `uninstall`  | Remove a server.                                                                                                                                                      |
+| `activate`   | Start a server, discover its tools, expose them to the host as `serverName__toolName`.                                                                                |
+| `deactivate` | Stop a server, hide its tools.                                                                                                                                        |
+| `browse`     | Federated search across the official MCP registry, npm, and PyPI.                                                                                                     |
+| `status`     | Active servers summary (names, tool counts, tool lists).                                                                                                              |
 Activated servers expose their tools through agent-discover, namespaced as `serverName__toolName`. For example, activating a server named `filesystem` that exposes `read_file` makes it available as `filesystem__read_file`.
+When `find_tool` is called with `auto_activate: false` (recommended for catalogs above ~1k tools), the proxy connection is opened silently and tools must be invoked via `proxy_call` instead of being added to the host's catalog. This keeps the host MCP surface area constant regardless of how many tools the registered child servers expose.
 ---
 ## REST API (19 endpoints)
@@ -170,11 +189,29 @@ npm run test:e2e:ui   # Playwright dashboard smoke tests
 ## Environment Variables
+### Core
 | Variable              | Default                       | Description          |
 | --------------------- | ----------------------------- | -------------------- |
 | `AGENT_DISCOVER_PORT` | `3424`                        | Dashboard HTTP port  |
 | `AGENT_DISCOVER_DB`   | `~/.claude/agent-discover.db` | SQLite database path |
+### Embeddings (semantic search for `find_tool`)
+Embeddings are **opt-in**. The default is `none`, which means `find_tool` ranks by BM25 + verb synonyms only. Setting a provider enables hybrid BM25 + cosine retrieval, which closes the natural-language gap (e.g. "billing arrangement" → "subscription") that BM25 alone misses.
+| Variable                                | Default | Description                                                             |
+| --------------------------------------- | ------- | ----------------------------------------------------------------------- |
+| `AGENT_DISCOVER_EMBEDDING_PROVIDER`     | `none`  | `none` \| `local` \| `openai`                                           |
+| `AGENT_DISCOVER_EMBEDDING_MODEL`        | —       | Override the default model id for the chosen provider                   |
+| `AGENT_DISCOVER_EMBEDDING_THREADS`      | `1`     | Local provider only — onnx runtime thread count                         |
+| `AGENT_DISCOVER_EMBEDDING_IDLE_TIMEOUT` | `60`    | Local provider only — seconds before unloading the model from RAM       |
+| `AGENT_DISCOVER_OPENAI_API_KEY`         | —       | OpenAI API key for embeddings (falls back to `OPENAI_API_KEY` if unset) |
+**Local provider** uses `Xenova/all-MiniLM-L6-v2` (384 dims) via `@huggingface/transformers`. Install the optional peer dependency with `npm install @huggingface/transformers` if you want to use it. No network calls, no API key.
+**OpenAI provider** uses `text-embedding-3-small` (1536 dims). Same model as agent-knowledge so the two servers can share an embedding key.
 ### Host package manager prerequisites
 agent-discover spawns child MCP servers via the host's installed package managers. Install whatever you intend to use; missing tools are reported by `GET /api/prereqs` and surfaced as a banner in the Browse tab.

package/agent-desk-plugin.json CHANGED Viewed

@@ -2,7 +2,7 @@
   "id": "agent-discover",
   "name": "Discover",
   "icon": "explore",
-  "version": "1.2.2",
+  "version": "1.2.3",
   "description": "MCP server registry — browse, install, configure, monitor",
   "ui": "./dist/ui/app.js",
   "css": "./dist/ui/styles.css",

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "agent-discover",
-  "version": "1.2.2",
+  "version": "1.2.3",
   "mcpName": "io.github.keshrath/agent-discover",
   "description": "MCP server registry and marketplace — discover, install, activate, and manage MCP tools on demand",
   "type": "module",