uer-mcp 4.1.0 → 4.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -3,16 +3,21 @@
3
3
 
4
4
  # Universal Expert Registry
5
5
 
6
- [![npm version](https://badge.fury.io/js/uer-mcp.svg)](https://www.npmjs.com/package/uer-mcp)
6
+ [![npm version](https://img.shields.io/npm/v/uer-mcp)](https://www.npmjs.com/package/uer-mcp)
7
+ [![npm](https://img.shields.io/npm/dm/uer-mcp)](https://www.npmjs.com/package/uer-mcp)
8
+ [![npm bundle size](https://img.shields.io/bundlephobia/min/uer-mcp)](https://www.npmjs.com/package/uer-mcp)
7
9
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
8
10
 
9
- **ASI-Level Experts, Infinite Memory, Any Client**
11
+ **Multi-Provider LLM Gateway S3-Compatible Storage • MCP Tool Orchestration**
12
+
13
+ > ⚠️ **Development Status**: This is a hackathon proof-of-concept. While the architecture supports 100+ LLM providers via LiteLLM, only a subset (Anthropic, Cerebras, OpenAI, Gemini, LM Studio, Ollama) have been extensively tested. Version numbers track feature implementation progress, not production readiness. If you encounter issues with other providers, please [open an issue](https://github.com/margusmartsepp/UER/issues).
10
14
  </div>
11
15
 
12
16
  ---
13
17
 
14
18
  **Standard config** works in most MCP clients:
15
- > 💡 **Quick Start**: Get a free Gemini API key at [aistudio.google.com/api-keys](https://aistudio.google.com/api-keys)
19
+ > **Quick Start**: Get a free Cerebras API key at [cloud.cerebras.ai/platform](https://cloud.cerebras.ai/platform) under apikeys or use LM Studio (100% free, local)
20
+
16
21
  ```json
17
22
  {
18
23
  "mcpServers": {
@@ -20,16 +25,20 @@
20
25
  "command": "npx",
21
26
  "args": ["uer-mcp@latest"],
22
27
  "env": {
23
- "GEMINI_API_KEY": "your-key-here"
28
+ // Specific provider key(s)
29
+ "CEREBRAS_API_KEY": "your-key-here",
30
+ "GEMINI_API_KEY": "your-key-here", // etc
31
+ // LM Studio (optional) - local models
32
+ "LM_STUDIO_API_BASE": "http://localhost:1234/v1"
24
33
  }
25
34
  }
26
35
  }
27
36
  }
28
37
  ```
29
38
 
30
- > **📦 Storage is optional**: This config works immediately for LLM and MCP features. For storage/context features, see [Storage Configuration Options](#storage-configuration-options) below.
39
+ > **Storage is optional**: This config works immediately for LLM and MCP features. For storage/context features, see [Storage Configuration Options](#storage-configuration-options) below.
31
40
 
32
- > **⚠️ Required**: Add at least one API key to the `env` section. See [CONFIGURATION.md](CONFIGURATION.md) for all provider links and detailed setup.
41
+ > **Required**: Add at least one API key to the `env` section. See [CONFIGURATION.md](CONFIGURATION.md) for all provider links and detailed setup.
33
42
 
34
43
  [<img src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF" alt="Install in VS Code">](https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522uer%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522uer-mcp%2540latest%2522%255D%257D) [<img alt="Install in VS Code Insiders" src="https://img.shields.io/badge/VS_Code_Insiders-VS_Code_Insiders?style=flat-square&label=Install%20Server&color=24bfa5">](https://insiders.vscode.dev/redirect?url=vscode-insiders%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522uer%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522uer-mcp%2540latest%2522%255D%257D) [<img src="https://cursor.com/deeplink/mcp-install-dark.svg" alt="Install in Cursor">](https://cursor.com/en/install-mcp?name=UER&config=eyJjb21tYW5kIjoibnB4IHVlci1tY3BAbGF0ZXN0In0%3D) [<img src="https://img.shields.io/badge/Windsurf-Windsurf?style=flat-square&label=Install%20Server&color=0B7A8F" alt="Install in Windsurf">](https://windsurf.com)
35
44
 
@@ -43,24 +52,24 @@ For Claude Desktop, Goose, Codex, Amp, and other clients, see [CONFIGURATION.md]
43
52
  ---
44
53
 
45
54
  An MCP server that provides:
46
- 1. **Universal LLM Access** - Call any LLM (Claude, GPT, Gemini, Bedrock, Azure, local models) through LiteLLM
47
- 2. **MCP Tool Orchestration** - Connect to 1000+ MCP servers (filesystem, databases, browsers, etc.)
48
- 3. **Shared Memory/Context** - Break context window limits via external storage with URI references
49
- 4. **Subagent Delegation** - Spawn subagents with full chat history, not just single messages
55
+ 1. **Multi-Provider LLM Access** - Call 100+ LLM providers (Anthropic, OpenAI, Google, Azure, AWS Bedrock, local models) through LiteLLM
56
+ 2. **MCP Tool Integration** - Connect to other MCP servers for extended functionality
57
+ 3. **S3-Compatible Storage** - Store context and data in MinIO, AWS S3, or other S3-compatible backends
58
+ 4. **Prompt Injection Detection** - Basic content validation and security warnings
50
59
 
51
60
  ## Why This Exists
52
61
 
53
- LLMs have fundamental limitations:
54
- - **Single message I/O**: 32-64k tokens max
55
- - **Context window**: 200k-2M tokens
56
- - **No persistent memory**: Forget between sessions
57
- - **No expert access**: Can't use specialized tools
62
+ MCP clients often need:
63
+ - **Multiple LLM providers** - Different models for different tasks
64
+ - **Persistent storage** - Save context between sessions
65
+ - **Tool integration** - Connect to specialized MCP servers
66
+ - **Configuration flexibility** - Support cloud and self-hosted solutions
58
67
 
59
- Traditional multi-agent approaches waste tokens by copying full context to each subagent. This registry solves it by:
60
- - Storing context externally (unlimited)
61
- - Passing URI references instead of full data (50 tokens vs 50k)
62
- - Building complete chat histories for subagents
63
- - Persisting across sessions
68
+ UER provides:
69
+ - Unified interface to 100+ LLM providers via LiteLLM
70
+ - S3-compatible storage for context and data
71
+ - MCP client for calling other MCP servers
72
+ - Support for enterprise clouds (Azure, AWS, GCP) and self-hosted (Ollama, LM Studio)
64
73
 
65
74
  ## Architecture
66
75
 
@@ -80,9 +89,9 @@ graph TB
80
89
 
81
90
  subgraph litellm["LiteLLM Gateway"]
82
91
  C1["100+ LLM providers"]
83
- C2["Native MCP Gateway"]
84
- C3["A2A Protocol support"]
85
- C4["Cost tracking, rate limiting, fallbacks"]
92
+ C2["Model routing"]
93
+ C3["Error handling"]
94
+ C4["Response formatting"]
86
95
  end
87
96
 
88
97
  subgraph store["Context Store"]
@@ -142,10 +151,10 @@ llm_call(model="ollama/llama3.1:8b-instruct-q4_K_M", messages=[...])
142
151
  ```
143
152
 
144
153
  Features included:
145
- - Automatic fallbacks between providers
146
- - Cost tracking per request
147
- - Rate limit handling with retries
148
- - Tool/function calling across all providers
154
+ - Unified interface across providers
155
+ - Support for cloud and self-hosted models
156
+ - Automatic model detection and caching
157
+ - Error handling and response formatting
149
158
 
150
159
  ### 2. MCP Tool Integration
151
160
 
@@ -161,28 +170,25 @@ mcp_call(server="postgres", tool="query", args={"sql": "SELECT * FROM users"})
161
170
  mcp_call(server="context7", tool="search", args={"query": "LiteLLM API reference"})
162
171
  ```
163
172
 
164
- ### 3. Shared Context (The Killer Feature)
173
+ ### 3. S3-Compatible Storage
165
174
 
166
- Store data externally, pass URI references:
175
+ Store data in S3-compatible backends:
167
176
 
168
177
  ```python
169
- # Store large document (200k tokens) in S3-compatible storage
170
- put("s3://uer-context/analysis/doc_001.json", {"content": large_document})
171
-
172
- # Pass only URI to subagent (50 tokens!)
173
- delegate(
174
- model="anthropic/claude-sonnet-4-5-20250929",
175
- task="Analyze the document",
176
- context_refs=["s3://uer-context/analysis/doc_001.json"]
178
+ # Store data in MinIO, AWS S3, or other S3-compatible storage
179
+ storage_put(
180
+ key="analysis/doc_001.json",
181
+ content={"content": large_document},
182
+ bucket="uer-context"
177
183
  )
178
184
 
179
- # Subagent retrieves full content from storage
180
- # Result stored back to S3
181
- # Parent retrieves summary only
185
+ # Retrieve data
186
+ data = storage_get(
187
+ key="analysis/doc_001.json",
188
+ bucket="uer-context"
189
+ )
182
190
  ```
183
191
 
184
- **Token savings: 99.9%** for multi-agent workflows.
185
-
186
192
  **Storage backends:**
187
193
  - **Local:** MinIO (S3-compatible, Docker-based)
188
194
  - **Cloud:** AWS S3, Azure Blob Storage, NetApp StorageGRID
@@ -263,33 +269,14 @@ With storage disabled:
263
269
 
264
270
  The server will start successfully without storage, and LLMs won't see storage-related tools in their tool list.
265
271
 
266
- ### 4. Full Chat History for Subagents
272
+ ### 4. Prompt Injection Detection
267
273
 
268
- Build complete conversation context, not just single messages:
274
+ Basic content validation and security warnings:
269
275
 
270
276
  ```python
271
- delegate(
272
- model="openai/gpt-5-mini",
273
- messages=[
274
- {"role": "system", "content": "You are a code reviewer..."},
275
- {"role": "user", "content": "Review this code for security issues"},
276
- {"role": "assistant", "content": "I'll analyze the code..."},
277
- {"role": "user", "content": "Focus on SQL injection risks"}
278
- ],
279
- tools=[...], # MCP tools available to subagent
280
- context_refs=["registry://context/codebase"] # Large context via URI
281
- )
282
- ```
283
-
284
- ### 5. Continuation Across Sessions
285
-
286
- Complex tasks can span multiple messages and sessions:
287
-
288
- ```
289
- Message 1: Start analysis → Progress: 20% → {{continuation: registry://plan/001}}
290
- Message 2: Continue → Progress: 60% → {{continuation: registry://plan/001}}
291
- [Next day]
292
- Message 3: Continue → Complete! Here's your report...
277
+ # Detects potential prompt injection patterns
278
+ # Provides risk assessment and warnings
279
+ # Helps identify suspicious content in user inputs
293
280
  ```
294
281
 
295
282
  ## Usage
@@ -327,12 +314,15 @@ User: "Ask both Gemini and Claude Sonnet to write a haiku about programming"
327
314
  → Returns both haikus for comparison
328
315
  ```
329
316
 
330
- **3. Store and Share Context:**
317
+ **3. Store and Retrieve Data:**
331
318
  ```
332
- User: "Store this document in the registry and have Gemini summarize it"
333
- put("registry://context/doc", {...})
334
- delegate(model="gemini/gemini-3-flash-preview", context_refs=["registry://context/doc"])
335
- → Returns: Summary without re-sending full document
319
+ User: "Store this configuration in S3"
320
+ storage_put(key="config/settings.json", content={...})
321
+ Returns: Confirmation with storage details
322
+
323
+ User: "Retrieve the configuration"
324
+ → storage_get(key="config/settings.json")
325
+ → Returns: Configuration data
336
326
  ```
337
327
 
338
328
  ## Troubleshooting
@@ -364,36 +354,88 @@ User: "Store this document in the registry and have Gemini summarize it"
364
354
  | Tool | Description |
365
355
  |------|-------------|
366
356
  | `llm_call` | Call any LLM via LiteLLM (100+ providers) |
357
+ | `llm_list_models` | List available models from configured providers |
358
+ | `llm_config_guide` | Get configuration help for LLM providers |
367
359
  | `mcp_call` | Call any configured MCP server tool |
368
- | `put` | Store data/context in registry |
369
- | `get` | Retrieve data/context from registry |
370
- | `search` | Search MCP servers, skills, or stored context |
371
- | `delegate` | Spawn subagent with full chat history |
372
- | `subscribe` | Watch for async results |
373
- | `cancel` | Cancel subscription or execution |
360
+ | `mcp_list_tools` | List available MCP tools |
361
+ | `mcp_servers` | List configured MCP servers |
362
+ | `storage_put` | Store data in S3-compatible storage |
363
+ | `storage_get` | Retrieve data from storage |
364
+ | `storage_list` | List stored objects |
365
+ | `storage_delete` | Delete stored objects |
374
366
 
375
367
  ## LiteLLM Integration
376
368
 
377
369
  This project uses [LiteLLM](https://github.com/BerriAI/litellm) as the unified LLM gateway, providing:
378
370
 
379
371
  - **100+ LLM providers** through single interface
380
- - **Native MCP Gateway** with permission management
381
- - **A2A Protocol** for agent-to-agent communication
382
- - **Cost tracking** per request with spend reports
383
- - **Rate limiting** with automatic retries
384
- - **Fallbacks** between providers on failure
385
- - **Tool/function calling** normalized across providers
386
-
387
- ### Supported Providers
388
-
389
- | Provider | Model Examples |
390
- |----------|---------------|
391
- | Anthropic | `anthropic/claude-sonnet-4-5-20250929`, `anthropic/claude-opus-4-5-20251101` |
392
- | OpenAI | `openai/gpt-5.2`, `openai/gpt-5-mini`, `openai/gpt-5.2-codex` |
393
- | Google | `gemini/gemini-3-flash-preview`, `gemini/gemini-3-pro-preview` |
394
- | Azure | `azure/gpt-4-deployment` |
395
- | AWS Bedrock | `bedrock/anthropic.claude-3-sonnet` |
396
- | Local | `ollama/llama3.1:8b-instruct-q4_K_M`, `lm_studio/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF` |
372
+ - **Unified API format** across all providers
373
+ - **Support for cloud and self-hosted models**
374
+ - **Automatic model detection** and caching
375
+ - **Error handling** and response formatting
376
+
377
+ ### Provider & Model Discovery
378
+
379
+ **Find supported providers and models:**
380
+ - 📖 **[PROVIDERS.md](PROVIDERS.md)** - Complete guide to LiteLLM provider integrations and configuration
381
+ - 🌐 **[LiteLLM Provider Docs](https://docs.litellm.ai/docs/providers/)** - Official documentation for all 100+ providers
382
+ - 🔧 **`llm_list_models` tool** - Query available models from your configured providers
383
+ - 🔧 **`llm_config_guide` tool** - Get configuration help for specific providers
384
+
385
+ ### Supported Providers (Examples)
386
+
387
+ | Provider | Model Examples | Testing Status |
388
+ |----------|---------------|----------------|
389
+ | Anthropic | `anthropic/claude-sonnet-4-5-20250929`, `anthropic/claude-opus-4-5-20251101` | ✅ Tested |
390
+ | Cerebras | `cerebras/llama-3.3-70b`, `cerebras/qwen-3-235b-a22b-instruct-2507` | ✅ Tested |
391
+ | OpenAI | `openai/gpt-4o`, `openai/o3-mini` | ✅ Tested |
392
+ | Google | `gemini/gemini-2.5-flash`, `gemini/gemini-2.0-flash-exp` | ✅ Tested |
393
+ | LM Studio | `lm_studio/meta-llama-3.1-8b-instruct` (local) | ✅ Tested |
394
+ | Ollama | `ollama/llama3.1:8b-instruct-q4_K_M` (local) | ✅ Tested |
395
+ | Azure | `azure/gpt-4-deployment` | ⚠️ Untested |
396
+ | AWS Bedrock | `bedrock/anthropic.claude-3-sonnet` | ⚠️ Untested |
397
+ | Cohere | `cohere_chat/command-r-plus` | ⚠️ Untested |
398
+ | Together AI | `together_ai/meta-llama/Llama-3-70b-chat-hf` | ⚠️ Untested |
399
+
400
+ **Testing Status:**
401
+ - ✅ **Tested**: Verified during development with live API queries and model caching
402
+ - ⚠️ **Untested**: Supported via LiteLLM but not extensively tested. May require minor adjustments. Please [report issues](https://github.com/margusmartsepp/UER/issues) if you encounter problems.
403
+
404
+ **Note:** Model names change frequently. Use the discovery tools above to find current models.
405
+
406
+ ### Advanced Configuration
407
+
408
+ **Multi-Instance Providers:**
409
+ LiteLLM supports multiple instances of the same provider (e.g., multiple Azure deployments). Configure via environment variables:
410
+
411
+ ```bash
412
+ # Multiple Azure deployments
413
+ AZURE_API_KEY="key1"
414
+ AZURE_API_BASE="https://endpoint1.openai.azure.com"
415
+ AZURE_API_VERSION="2023-05-15"
416
+
417
+ # Use model format: azure/<deployment-name>
418
+ # Example: azure/gpt-4-deployment
419
+ ```
420
+
421
+ **Generic Provider Support:**
422
+ Any provider with a configured API key will be detected automatically. If we don't have a specific query implementation, example models will be provided. Supported providers include:
423
+
424
+ - Cohere (`COHERE_API_KEY`)
425
+ - Together AI (`TOGETHERAI_API_KEY`)
426
+ - Replicate (`REPLICATE_API_KEY`)
427
+ - Hugging Face (`HUGGINGFACE_API_KEY`)
428
+ - And 90+ more - see [LiteLLM docs](https://docs.litellm.ai/docs/providers/)
429
+
430
+ **Fallback Chains:**
431
+ LiteLLM supports automatic fallbacks. Configure via model list:
432
+ ```python
433
+ # In your LLM call, specify fallback models
434
+ model="gpt-4o" # Primary
435
+ fallbacks=["claude-sonnet-4-5", "gemini-2.5-flash"] # Fallbacks
436
+ ```
437
+
438
+ See [PROVIDERS.md](PROVIDERS.md) for detailed configuration examples.
397
439
 
398
440
  ## Project Structure
399
441
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "uer-mcp",
3
- "version": "4.1.0",
4
- "description": "Universal Expert Registry - Multi-agent safety monitoring, sandbagging detection, and simulation framework with 100+ LLM providers",
3
+ "version": "4.2.1",
4
+ "description": "[Hackathon Proof-of-Concept] Universal Expert Registry - Multi-provider LLM gateway, S3-compatible storage, and MCP tool orchestration. Tested with Anthropic, Cerebras, OpenAI, Gemini, LM Studio, Ollama.",
5
5
  "main": "index.js",
6
6
  "bin": {
7
7
  "uer-mcp": "bin/uer-mcp.js"
@@ -51,15 +51,17 @@
51
51
  "mcp": {
52
52
  "displayName": "Universal Expert Registry",
53
53
  "icon": "img/uer.jpg",
54
- "description": "Multi-agent safety monitoring, sandbagging detection, and simulation framework. Access 100+ LLM providers, connect to 1000+ MCP servers, and manage unlimited context with external storage.",
54
+ "description": "[Proof-of-Concept] Multi-provider LLM gateway with 100+ providers via LiteLLM. Extensively tested: Anthropic, Cerebras, OpenAI, Gemini, LM Studio, Ollama. Other providers may need adjustments. Version numbers track features, not production readiness.",
55
55
  "features": [
56
56
  "Multi-Agent Safety Monitoring - 15+ behavior patterns (AgentVerse, sycophancy, deception, sandbagging)",
57
57
  "Sandbagging Detection - Multi-method detection with consistency testing and capability elicitation",
58
58
  "Multi-Agent Simulation - Full conversation orchestration with personas, audit trails, and manipulation detection",
59
59
  "Universal LLM Access - Call any LLM through LiteLLM (Claude, GPT, Gemini, Bedrock, Azure, local models)",
60
60
  "MCP Tool Orchestration - Connect to 1000+ MCP servers (filesystem, databases, browsers, etc.)",
61
- "Shared Memory/Context - Break context window limits via external storage with URI references",
62
- "Subagent Delegation - Spawn subagents with full chat history and behavior monitoring"
61
+ "S3-Compatible Storage - Persistent context storage with MinIO, AWS S3, or Azure Blob",
62
+ "Prompt Injection Detection - Basic content validation and security warnings",
63
+ "LM Studio Support - Local model hosting with OpenAI-compatible API",
64
+ "Model Query & Caching - Automatic model detection for Anthropic, Cerebras, OpenAI, Gemini"
63
65
  ],
64
66
  "tools": [
65
67
  {
@@ -91,20 +93,20 @@
91
93
  "description": "Quick sandbagging screening test"
92
94
  },
93
95
  {
94
- "name": "put",
95
- "description": "Store data in external context storage"
96
+ "name": "storage_put",
97
+ "description": "Store data in S3-compatible storage"
96
98
  },
97
99
  {
98
- "name": "get",
99
- "description": "Retrieve data from external context storage"
100
+ "name": "storage_get",
101
+ "description": "Retrieve data from S3-compatible storage"
100
102
  },
101
103
  {
102
- "name": "delegate",
103
- "description": "Delegate tasks to subagents with full context"
104
+ "name": "llm_list_models",
105
+ "description": "List available models from configured providers"
104
106
  },
105
107
  {
106
- "name": "search",
107
- "description": "Search stored context and knowledge"
108
+ "name": "llm_config_guide",
109
+ "description": "Get configuration help for LLM providers"
108
110
  }
109
111
  ],
110
112
  "configuration": {