@elvatis_com/elvatis-mcp 1.0.2 → 1.0.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +55 -28
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -4,7 +4,7 @@
 
 [![npm](https://img.shields.io/npm/v/@elvatis_com/elvatis-mcp)](https://www.npmjs.com/package/@elvatis_com/elvatis-mcp)
 [![License](https://img.shields.io/badge/license-Apache--2.0-blue)](LICENSE)
-[![Tests](https://img.shields.io/badge/tests-11%2F11%20passed-brightgreen)](#test-results)
+[![Tests](https://img.shields.io/badge/unit%20tests-42%2F42%20passed-brightgreen)](#test-results)
 
 ## What is this?
 
@@ -13,10 +13,10 @@ elvatis-mcp connects Claude (or any MCP client) to your infrastructure:
 - **Smart home** control via Home Assistant (lights, thermostats, vacuum, sensors)
 - **Memory** system with daily logs stored on your OpenClaw server
 - **Cron** job management and triggering
-- **Multi-LLM orchestration** through 4 AI backends: OpenClaw, Google Gemini, OpenAI Codex, and local LLMs
-- **Smart prompt splitting** that analyzes complex requests and routes sub-tasks to the right AI
+- **Multi-LLM orchestration** through 5 AI backends: Claude, OpenClaw, Google Gemini, OpenAI Codex, and local LLMs
+- **Smart prompt splitting** that analyzes complex requests, routes sub-tasks to the right AI, and executes the plan with rate limiting
 
-The key idea: Claude is the orchestrator, but it can delegate specialized work to other AI models. Coding tasks go to Codex. Research goes to Gemini. Simple formatting goes to your local LLM (free, private). Trading and automation go to OpenClaw. And `prompt_split` figures out the routing automatically.
+The key idea: Claude is the orchestrator, but it can delegate specialized work to other AI models. Coding tasks go to Codex. Research goes to Gemini. Simple formatting goes to your local LLM (free, private). Trading and automation go to OpenClaw. `prompt_split` figures out the routing automatically, and `prompt_split_execute` runs the plan with rate limiting on cloud agents.
 
 ## What is MCP?
 
@@ -69,7 +69,7 @@ prompt_split returns:
   t4: openclaw_memory_write -- "Save summary to today's log" (after t2, t3)
 ```
 
-Claude then executes the plan, calling tools in the right order and running parallel tasks concurrently. Three analysis strategies:
+Use `prompt_split_execute` to run the plan automatically, or let Claude execute it step by step. Tasks run in dependency order with parallel groups executed concurrently. Three analysis strategies:
 
 | Strategy | Speed | Quality | Uses |
 |---|---|---|---|
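The dependency-ordered execution described in this hunk can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the `Task` shape, the `run` dispatcher, and the agent names are assumptions.

```typescript
// Hypothetical sketch of dependency-ordered plan execution with parallel groups.
// Task ids, agent names, and the `run` dispatcher are illustrative only.
type Task = { id: string; agent: string; prompt: string; after: string[] };

async function executePlan(
  tasks: Task[],
  run: (t: Task) => Promise<string>,
): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  let pending = [...tasks];
  while (pending.length > 0) {
    // A parallel group: every task whose dependencies have all completed.
    const ready = pending.filter((t) => t.after.every((d) => results.has(d)));
    if (ready.length === 0) throw new Error("cycle or missing dependency in plan");
    const outputs = await Promise.all(ready.map((t) => run(t)));
    ready.forEach((t, i) => results.set(t.id, outputs[i]));
    pending = pending.filter((t) => !results.has(t.id));
  }
  return results;
}
```

In the README's example plan, t2 and t3 would land in the same parallel group after t1, with t4 running last.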
@@ -80,7 +80,7 @@ Claude then executes the plan, calling tools in the right order and running para
 
 ---
 
-## Available Tools (32 total)
+## Available Tools (34 total)
 
 ### Home Assistant (7 tools)
 | Tool | Description |
@@ -100,7 +100,7 @@ Claude then executes the plan, calling tools in the right order and running para
 | `openclaw_memory_read_today` | Read today's memory log |
 | `openclaw_memory_search` | Search memory files across the last N days |
 
-### Cron Automation (6 tools)
+### Cron Automation (7 tools)
 | Tool | Description |
 |---|---|
 | `openclaw_cron_list` | List all scheduled OpenClaw cron jobs |
@@ -125,7 +125,7 @@ Claude then executes the plan, calling tools in the right order and running para
 | `claude_run` | Send a prompt to Claude via the local CLI. For non-Claude MCP clients (Cursor, Windsurf). |
 | `gemini_run` | Send a prompt to Google Gemini via the local CLI. 1M token context. |
 | `codex_run` | Send a coding task to OpenAI Codex via the local CLI. |
-| `local_llm_run` | Send a prompt to a local LLM (LM Studio, Ollama, llama.cpp). Free, private. |
+| `local_llm_run` | Send a prompt to a local LLM (LM Studio, Ollama, llama.cpp). Free, private. Supports streaming. |
 | `llama_server` | Start/stop/configure a llama.cpp server with TurboQuant cache support. |
 
 ### System Management (4 tools)
@@ -136,11 +136,12 @@ Claude then executes the plan, calling tools in the right order and running para
 | `openclaw_logs` | View gateway, agent, or system logs from the OpenClaw server |
 | `file_transfer` | Upload, download, or list files on the OpenClaw server via SSH |
 
-### Routing and Orchestration (2 tools)
+### Routing and Orchestration (3 tools)
 | Tool | Description |
 |---|---|
 | `mcp_help` | Show routing guide. Pass a task to get a specific tool recommendation. |
 | `prompt_split` | Analyze a complex prompt, split into sub-tasks with agent assignments. |
+| `prompt_split_execute` | Execute a split plan: dispatch subtasks to agents in dependency order with rate limiting. |
 
 ### Dashboard
 | Endpoint | Description |
@@ -229,27 +230,27 @@ See [BENCHMARKS.md](BENCHMARKS.md) for the full benchmark suite, methodology, an
 | GPU | AMD Radeon RX 9070 XT Elite (16 GB GDDR6) |
 | RAM | 128 GB DDR4 |
 | OS | Windows 11 Pro |
-| Runtime | LM Studio + ROCm (`llama.cpp-win-x86_64-amd-rocm-avx2@2.8.0`) |
+| Runtime | LM Studio + Vulkan (`llama.cpp-win-x86_64-vulkan-avx2@2.8.0`) |
 
-### Local LLM Inference (LM Studio, ROCm GPU, `--gpu max`)
+### Local LLM Inference (LM Studio, Vulkan GPU, `--gpu max`)
 
-Median of 3 runs, `max_tokens=512`. Tasks: classify (1-word sentiment), extract (JSON), reason (arithmetic), code (Python function).
+Median of 3 runs, `max_tokens=512`. Tasks: classify (1-word sentiment), extract (JSON), reason (arithmetic), code (Python function). Vulkan is the recommended runtime for AMD RX 9070 XT (wins 4 of 5 models over ROCm).
 
-| Model | Params | classify | extract | reason | code |
-|-------|--------|----------|---------|--------|------|
-| Phi 4 Mini Reasoning | 3B | 3.9s | 5.1s | 8.6s | 8.5s |
-| Deepseek R1 0528 Qwen3 | 8B | 2.5s | 7.7s | 13.3s | 7.3s |
-| Qwen 3.5 9B | 9B | 4.9s | 11.2s | 6.6s | 11.4s |
-| Phi 4 Reasoning Plus | 15B | 0.4s | 17.4s | 4.5s | 17.3s |
-| **GPT-OSS 20B** | **20B** | **0.6s** | **0.7s** | **0.7s** | **3.1s** |
+| Model | Params | classify | extract | reason | code | avg tok/s |
+|-------|--------|----------|---------|--------|------|-----------|
+| Phi 4 Mini Reasoning | 3B | 2.6s | 1.9s | 4.7s | 4.8s | **106** |
+| Deepseek R1 0528 Qwen3 | 8B | 3.0s | 6.5s | 7.2s | 7.4s | 70 |
+| Qwen 3.5 9B | 9B | 6.2s | 4.0s | 8.4s | 7.2s | 48 |
+| Phi 4 Reasoning Plus | 15B | 0.4s | 9.7s | 3.5s | 9.9s | 40 |
+| **GPT-OSS 20B** | **20B** | **0.6s** | **0.6s** | **0.6s** | **1.9s** | 63 |
 
-**GPU speedup vs CPU (Deepseek R1 8B):** classify 8.4x faster, extract 3.2x faster.
+**GPU speedup vs CPU (Deepseek R1 8B, Vulkan):** classify 7.2x faster, extract 3.8x faster.
 
 ### Sub-Agent Comparison (same task, different backends)
 
 | Agent | Backend | Avg Latency | Cost | Notes |
 |-------|---------|-------------|------|-------|
-| **local_llm_run** | GPT-OSS 20B (ROCm GPU) | **1.3s** | Free | 3x faster than Codex, 5x faster than Claude |
+| **local_llm_run** | GPT-OSS 20B (Vulkan GPU) | **1.0s** | Free | 4x faster than Codex, 6x faster than Claude |
 | codex_run | OpenAI Codex CLI | 4.1s | Pay-per-use | Best for coding tasks |
 | claude_run | Claude Sonnet 4.6 | 6.3s | Pay-per-use | Best for complex reasoning |
 | gemini_run | Gemini 2.5 Flash | 34.0s | Free tier | CLI startup overhead, best for long context |
@@ -269,12 +270,12 @@ Median of 3 runs, `max_tokens=512`. Tasks: classify (1-word sentiment), extract
 
 | Metric | Result |
 |--------|--------|
-| Pass rate | 6/10 (60%) |
-| Task count accuracy | 7/10 (70%) |
-| Avg agent match | 70% |
+| Pass rate | **10/10 (100%)** |
+| Task count accuracy | 10/10 (100%) |
+| Avg agent match | 100% |
 | Latency | <1ms (no LLM call) |
 
-The `auto` strategy (Gemini or local LLM) handles complex multi-step prompts with higher accuracy. See [BENCHMARKS.md](BENCHMARKS.md) for details.
+Improvements in v0.8.0+: word boundary regex matching, comma-clause splitting for multi-agent prompts, per-tool routing rules, `openclaw_notify` routing. See [BENCHMARKS.md](BENCHMARKS.md) for the full test corpus.
 
 > Want to contribute benchmarks from your hardware? See [BENCHMARKS.md](BENCHMARKS.md#community-contributions).
 
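The word-boundary matching and comma-clause splitting mentioned in this hunk can be sketched roughly as follows. The keyword rules and agent names here are illustrative assumptions, not the package's actual routing table.

```typescript
// Hypothetical sketch of word-boundary keyword routing with comma-clause splitting.
// The rules below are invented for illustration; they are not the package's rules.
const rules: [RegExp, string][] = [
  [/\b(refactor|implement|debug)\b/i, "codex_run"],   // coding tasks
  [/\b(research|summarize)\b/i, "gemini_run"],        // research tasks
  [/\b(format|classify)\b/i, "local_llm_run"],        // simple local tasks
];

function routeClause(clause: string): string {
  for (const [re, agent] of rules) if (re.test(clause)) return agent;
  return "claude_run"; // fall back to the orchestrator
}

// Comma-clause splitting: each clause of a multi-agent prompt is routed separately.
function routePrompt(prompt: string): string[] {
  return prompt.split(",").map((c) => routeClause(c.trim()));
}
```

The `\b` word boundaries are what keep a keyword like `format` from firing inside an unrelated word such as "reformatting".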
@@ -443,6 +444,8 @@ Connect your client to `http://your-server:3333/mcp`.
 | `MCP_TRANSPORT` | `stdio` | Transport mode: `stdio` or `http` |
 | `MCP_HTTP_PORT` | `3333` | HTTP port |
 | `SSH_DEBUG` | -- | Set to `1` for verbose SSH output |
+| `ELVATIS_DATA_DIR` | `~/.elvatis-mcp` | Directory for persistent usage data (rate limiter) |
+| `RATE_LIMITS` | -- | JSON string with per-agent rate limit overrides |
 
 ---
 
@@ -500,11 +503,32 @@ On Windows, elvatis-mcp automatically resolves the SSH binary to `C:\Windows\Sys
 
 ## `/mcp-help` Slash Command
 
-In Claude Code, the `/project:mcp-help` slash command is available:
+In Claude Code, the `/project:mcp-help` slash command shows the full 34-tool routing guide as formatted output:
 
 ```
-/project:mcp-help
-/project:mcp-help analyze this trading strategy for risk
+/project:mcp-help                                          # full guide
+/project:mcp-help openclaw_status                          # help for a specific tool
+/project:mcp-help analyze this trading strategy for risk   # routing recommendation
+```
+
+---
+
+## Rate Limiting
+
+Cloud sub-agents (`claude_run`, `codex_run`, `gemini_run`) are rate-limited to prevent runaway costs. Default limits:
+
+| Agent | /min | /hr | /day | Est. cost/call |
+|-------|------|-----|------|----------------|
+| `claude_run` | 5 | 30 | 200 | $0.03 |
+| `codex_run` | 5 | 30 | 200 | $0.02 |
+| `gemini_run` | 10 | 60 | 500 | $0.01 |
+
+Local agents (`local_llm_run`, `home_*`, `openclaw_*`) are unlimited.
+
+Usage data persists to `~/.elvatis-mcp/usage.json`. Override limits via the `RATE_LIMITS` env var:
+
+```bash
+RATE_LIMITS='{"claude_run":{"perMinute":3,"perDay":100}}'
 ```
 
 ---
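The per-window limits added in this hunk suggest a sliding-window check. A minimal sketch, assuming a sliding-window design and the documented defaults; the class and method names are hypothetical, not the package's `rate-limiter.ts` API:

```typescript
// Hypothetical sliding-window rate limiter matching the documented per-minute,
// per-hour, and per-day limits. Names are illustrative only.
type Limits = { perMinute: number; perHour: number; perDay: number };

class RateLimiter {
  private calls: number[] = []; // timestamps (ms) of accepted calls

  constructor(private limits: Limits) {}

  allow(now: number = Date.now()): boolean {
    const windows: [number, number][] = [
      [60_000, this.limits.perMinute],
      [3_600_000, this.limits.perHour],
      [86_400_000, this.limits.perDay],
    ];
    // Drop calls older than the largest window, then check every window.
    this.calls = this.calls.filter((t) => now - t < 86_400_000);
    const ok = windows.every(
      ([ms, max]) => this.calls.filter((t) => now - t < ms).length < max,
    );
    if (ok) this.calls.push(now);
    return ok;
  }
}
```

With `{ perMinute: 5 }`, a sixth call inside the same minute is rejected, but a call after the minute window has rolled past is accepted again.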
@@ -555,9 +579,12 @@ src/
   file-transfer.ts      File upload/download via SSH
   system-status.ts      Unified health check across all services
   splitter.ts           Smart prompt splitter (multi-strategy)
+  split-execute.ts      Plan executor with agent dispatch and rate limiting
   help.ts               Routing guide and task recommender
   routing-rules.ts      Shared routing rules and keyword matching
+  rate-limiter.ts       Rate limiting + cost tracking for cloud sub-agents
 tests/
+  unit.test.ts          42 unit tests (no external services needed)
   integration.test.ts   Live integration tests
 ```
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@elvatis_com/elvatis-mcp",
-  "version": "1.0.2",
+  "version": "1.0.4",
   "description": "MCP server for OpenClaw — expose smart home, memory, cron, and more to Claude Desktop, Cursor, Windsurf, and any MCP client",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",