@elvatis_com/elvatis-mcp 0.6.1 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +54 -0
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -217,6 +217,60 @@ Prerequisites: `.env` configured, local LLM server running, OpenClaw server reac
 
  ---
 
+ ## Benchmarks
+
+ See [BENCHMARKS.md](BENCHMARKS.md) for the full benchmark suite, methodology, and community contribution guide.
+
+ ### Reference Hardware
+
+ | Component | Spec |
+ |---|---|
+ | CPU | AMD Threadripper 3960X (24 cores / 48 threads) |
+ | GPU | AMD Radeon RX 9070 XT Elite (16 GB GDDR6) |
+ | RAM | 128 GB DDR4 |
+ | OS | Windows 11 Pro |
+ | Runtime | LM Studio + ROCm (`llama.cpp-win-x86_64-amd-rocm-avx2@2.8.0`) |
+
+ ### Local LLM Inference (LM Studio, ROCm GPU, `--gpu max`)
+
+ Median of 3 runs, `max_tokens=512`. Tasks: classify (1-word sentiment), extract (JSON), reason (arithmetic), code (Python function).
+
+ | Model | Params | classify | extract | reason | code |
+ |-------|--------|----------|---------|--------|------|
+ | Phi 4 Mini Reasoning | 3B | 3.9s | 5.1s | 8.6s | 8.5s |
+ | Deepseek R1 0528 Qwen3 | 8B | 2.5s | 7.7s | 13.3s | 7.3s |
+ | Qwen 3.5 9B | 9B | 4.9s | 11.2s | 6.6s | 11.4s |
+ | Phi 4 Reasoning Plus | 15B | 0.4s | 17.4s | 4.5s | 17.3s |
+ | **GPT-OSS 20B** | **20B** | **0.6s** | **0.7s** | **0.7s** | **3.1s** |
+
+ **GPU speedup vs CPU (Deepseek R1 8B):** classify 8.4x faster, extract 3.2x faster.
+
+ ### Service Latency (system_status)
+
+ | Service | Latency | Notes |
+ |---|---|---|
+ | Home Assistant (REST API) | 48-84 ms | Local network, direct HTTP |
+ | OpenClaw SSH | 273-299 ms | LAN SSH + command execution |
+ | Local LLM (model list) | 19-38 ms | LM Studio localhost API |
+ | Claude CLI (version check) | 472-478 ms | CLI startup overhead |
+ | Codex CLI (version check) | 131-136 ms | CLI startup overhead |
+ | Gemini CLI (version check) | 4,700-4,900 ms | CLI startup + auth check |
+
+ ### prompt_split Accuracy (heuristic strategy)
+
+ | Metric | Result |
+ |--------|--------|
+ | Pass rate | 6/10 (60%) |
+ | Task count accuracy | 7/10 (70%) |
+ | Avg agent match | 70% |
+ | Latency | <1ms (no LLM call) |
+
+ The `auto` strategy (Gemini or local LLM) handles complex multi-step prompts with higher accuracy. See [BENCHMARKS.md](BENCHMARKS.md) for details.
+
+ > Want to contribute benchmarks from your hardware? See [BENCHMARKS.md](BENCHMARKS.md#community-contributions).
+
+ ---
+
  ## Requirements
 
  - Node.js 18 or later
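The inference figures added to the README above (median of 3 runs, `max_tokens=512`, four task categories) can be reproduced with a small timing harness. A minimal sketch in Python, assuming an OpenAI-compatible LM Studio endpoint at `http://localhost:1234/v1/chat/completions`; the URL, prompts, and helper names are illustrative assumptions, not taken from the package:

```python
import json
import time
import urllib.request
from statistics import median

# Assumed LM Studio default local endpoint (OpenAI-compatible API).
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

# Hypothetical prompts mirroring the four benchmark categories.
TASKS = {
    "classify": "In one word, is this review positive or negative? 'Great product!'",
    "extract": 'Return JSON {"name": ..., "city": ...} for: "Ada, 36, London".',
    "reason": "What is 17 * 23? Answer with the number only.",
    "code": "Write a Python function that reverses a string.",
}

def time_once(model: str, prompt: str) -> float:
    """Send one chat completion request and return wall-clock seconds."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }).encode()
    req = urllib.request.Request(
        LMSTUDIO_URL, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.perf_counter() - start

def bench(model: str, runs: int = 3) -> dict:
    """Median latency per task over `runs` repetitions, as in the table."""
    return {
        task: median(time_once(model, prompt) for _ in range(runs))
        for task, prompt in TASKS.items()
    }
```

Each cell in the inference table would correspond to `bench(model)[task]` for one loaded model; LM Studio must be serving that model locally for the requests to succeed.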
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@elvatis_com/elvatis-mcp",
- "version": "0.6.1",
+ "version": "0.7.0",
  "description": "MCP server for OpenClaw — expose smart home, memory, cron, and more to Claude Desktop, Cursor, Windsurf, and any MCP client",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",