local-model-suitability-mcp 1.0.1 → 1.1.0

package/CHANGELOG.md CHANGED
@@ -1,13 +1,38 @@
1
1
  # Changelog
2
2
 
3
- ## [1.0.0] - 2026-04-13
3
+ ## [1.1.0] - 2026-04-20
4
4
 
5
- ### Initial release
5
+ ### Changed
6
+ - Renamed tool from `evaluate_local_model_suitability` to `check_local_viability` — sharper, more action-oriented name
7
+ - Reframed core premise: cloud is expensive, local is the default, cloud must justify itself
8
+ - Tool description now positions as a cost gate to call BEFORE every cloud inference call
9
+
10
+ ### Added
11
+ - `data_sensitivity` input: CONFIDENTIAL forces LOCAL verdict regardless of task — data never leaves the machine
12
+ - `quality_threshold` input: PRODUCTION / PROTOTYPE / BEST_EFFORT — controls how conservatively LOCAL verdicts are given
13
+ - `estimated_cost_saving` in response — approximate $ saved per call if routing LOCAL
14
+ - `recommended_local_models` — specific Ollama model names (e.g. llama3.2:8b, mistral-7b) when LOCAL or EITHER
15
+ - `cloud_justified_reason` — specific reason why local is insufficient, only present on CLOUD verdicts
16
+ - Partial response monetisation: free tier returns verdict + confidence + reason; paid adds cost savings + model recommendations
17
+
18
+ ### Improved
19
+ - System prompt now takes a strong LOCAL-first stance — cloud must be justified, not the default
20
+ - More specific reasoning in responses — names the task type explicitly
21
+
22
+ ## [1.0.4] - 2026-04-10
23
+
24
+ ### Added
25
+ - HTTP POST MCP handler for dashboard tool counting
26
+ - STRIPE_WEBHOOK_SECRET signature verification
27
+ - RESEND_API_KEY for API key email delivery
6
28
 
7
- - `evaluate_local_model_suitability` tool — AI-powered evaluation of local model suitability
8
- - Built-in capability profiles for 25+ popular local models (Llama, Mistral, Qwen, Gemma, Phi, DeepSeek, CodeLlama)
9
- - Four-dimensional reasoning: cost, privacy, latency, quality
10
- - Verdict: LOCAL / CLOUD / EITHER / NEITHER with confidence score
11
- - Free tier: 20 evaluations/month
12
- - Pro tier: 2,000 evaluations/month ($99/month)
13
- - Enterprise tier: unlimited ($299/month)
29
+ ## [1.0.1] - 2026-04-05
30
+
31
+ ### Fixed
32
+ - Stats endpoint computing flat numbers for dashboard
33
+
34
+ ## [1.0.0] - 2026-04-01
35
+
36
+ ### Initial release
37
+ - evaluate_local_model_suitability tool
38
+ - Free 20/month | Pro $99/month | Enterprise $299/month
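Two of the 1.1.0 changelog entries describe behaviour rather than naming: `CONFIDENTIAL` forces a `LOCAL` verdict, and the free tier returns only verdict, confidence and reason. The sketch below is one way those two rules could be expressed. It is not the package's code, which this diff does not show. The function names and the `FREE_TIER_FIELDS` list are hypothetical, and dropping `cloud_justified_reason` on the free tier is an assumption based on a literal reading of the changelog.

```javascript
// Hypothetical sketch of two rules described in the 1.1.0 changelog.
// None of these names come from the package itself.

// Assumption: the free tier returns exactly these three fields.
const FREE_TIER_FIELDS = ["verdict", "confidence", "reason"];

// CONFIDENTIAL data forces a LOCAL verdict regardless of the model's answer.
function applySensitivityOverride(result, dataSensitivity) {
  if (dataSensitivity !== "CONFIDENTIAL") return result;
  // A forced LOCAL verdict cannot carry a reason for going to cloud.
  return { ...result, verdict: "LOCAL", cloud_justified_reason: null };
}

// Free tier keeps the three basic fields; paid tier keeps the full response.
function shapeForTier(result, isPaid) {
  if (isPaid) return result;
  const trimmed = {};
  for (const field of FREE_TIER_FIELDS) {
    if (field in result) trimmed[field] = result[field];
  }
  return trimmed;
}
```

Applying the override before the tier shaping would keep the forced verdict visible to free-tier callers as well.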
package/LICENSE CHANGED
@@ -1,21 +1,9 @@
1
- MIT License
1
+ UNLICENSED
2
2
 
3
- Copyright (c) 2026 Kord Agencies
3
+ Copyright (c) 2026 Kord Agencies Pte Ltd
4
4
 
5
- Permission is hereby granted, free of charge, to any person obtaining a copy
6
- of this software and associated documentation files (the "Software"), to deal
7
- in the Software without restriction, including without limitation the rights
8
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
- copies of the Software, and to permit persons to whom the Software is
10
- furnished to do so, subject to the following conditions:
5
+ All rights reserved. This software and associated documentation files are proprietary
6
+ and confidential. Unauthorized copying, modification, distribution, or use of this
7
+ software, in whole or in part, is strictly prohibited.
11
8
 
12
- The above copyright notice and this permission notice shall be included in all
13
- copies or substantial portions of the Software.
14
-
15
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
- SOFTWARE.
9
+ For licensing enquiries: ojas@kordagencies.com
package/README.md CHANGED
@@ -1,62 +1,53 @@
1
1
  # Local Model Suitability MCP
2
2
 
3
- **AI-powered evaluation of whether your local model is actually good enough for the task at hand.**
3
+ **Cloud inference is expensive. Everything that can run locally should.**
4
4
 
5
- ---
5
+ This MCP server tells your agent — before every cloud API call — whether the task can be handled by a local model instead. Route to Ollama, LM Studio, or llama.cpp when you can. Only pay for cloud when you must.
6
6
 
7
- ## The Problem
7
+ ## The Tool
8
8
 
9
- When you have both a local model (Ollama, LM Studio, etc.) and cloud APIs available, agents face a decision they cannot make intelligently alone:
9
+ ### `check_local_viability`
10
10
 
11
- **Should I run this locally or send it to the cloud?**
11
+ Call this BEFORE every cloud inference call. If verdict is `LOCAL`, skip the cloud call entirely and route to your local model. Only use cloud when this tool returns `CLOUD`.
12
12
 
13
- Getting this wrong in either direction is expensive:
14
-
15
- - **Wrong direction 1 — cloud when local works:** You pay Claude Opus rates for a task a 7B model handles perfectly. At scale, this is thousands of dollars wasted monthly.
16
- - **Wrong direction 2 local when cloud is needed:** You run a complex reasoning task through a small model and get silent quality failures. The agent proceeds confidently on bad output.
17
- - **Wrong direction 3 - cloud when data is sensitive:** You send confidential internal data to an external API that logs it. A privacy or compliance violation you never intended.
18
-
19
- ## The Solution
13
+ **Inputs:**
14
+ | Field | Required | Description |
15
+ |---|---|---|
16
+ | `task` | Required | The exact task you are about to send to a cloud model |
17
+ | `quality_threshold` | Optional | `PRODUCTION` (default) / `PROTOTYPE` / `BEST_EFFORT` |
18
+ | `data_sensitivity` | Optional | `PUBLIC` (default) / `INTERNAL` / `CONFIDENTIAL` |
20
19
 
21
- `evaluate_local_model_suitability` is a single AI-powered tool that reasons across four dimensions simultaneously - **cost, privacy, latency, and quality** - and returns a clear verdict your agent can act on.
20
+ `CONFIDENTIAL` forces `LOCAL` regardless of task complexity - data never leaves the machine.
22
21
 
23
- ```
24
- Verdict: LOCAL | CLOUD | EITHER | NEITHER
22
+ **Response:**
23
+ ```json
24
+ {
25
+ "verdict": "LOCAL",
26
+ "confidence": "HIGH",
27
+ "reason": "Simple text summarisation — no reasoning depth required. Any 7B+ local model handles this well.",
28
+ "estimated_cost_saving": "$0.002-0.008 saved per call at claude-sonnet pricing",
29
+ "recommended_local_models": ["llama3.2:8b", "mistral-7b", "phi3:medium"],
30
+ "cloud_justified_reason": null,
31
+ "analysis_type": "AI-powered cost routing — NOT a simple lookup"
32
+ }
25
33
  ```
26
34
 
27
- This is not a benchmark lookup. Claude reasons about your specific task, your specific model, and your specific constraints.
35
+ ## Data Sources
28
36
 
29
- ---
37
+ - AI reasoning: Anthropic Claude (claude-sonnet) — cost routing analysis
38
+ - No external data sources — pure AI reasoning
30
39
 
31
- ## Installation
32
-
33
- ```bash
34
- npx local-model-suitability-mcp
35
- ```
36
-
37
- Or install globally:
38
-
39
- ```bash
40
- npm install -g local-model-suitability-mcp
41
- ```
40
+ ## Pricing
42
41
 
43
- ### Claude Desktop / Claude Code config
42
+ | Plan | Price | Calls/month |
43
+ |---|---|---|
44
+ | Free | $0 | 20 |
45
+ | Pro | $99/month | 2,000 |
46
+ | Enterprise | $299/month | Unlimited |
44
47
 
45
- ```json
46
- {
47
- "mcpServers": {
48
- "local-model-suitability": {
49
- "command": "npx",
50
- "args": ["-y", "local-model-suitability-mcp"],
51
- "env": {
52
- "ANTHROPIC_API_KEY": "your-key-here"
53
- }
54
- }
55
- }
56
- }
57
- ```
48
+ [Subscribe at kordagencies.com](https://kordagencies.com)
58
49
 
59
- ### With Pro API key
50
+ ## Setup
60
51
 
61
52
  ```json
62
53
  {
@@ -65,112 +56,16 @@ npm install -g local-model-suitability-mcp
65
56
  "command": "npx",
66
57
  "args": ["-y", "local-model-suitability-mcp"],
67
58
  "env": {
68
- "ANTHROPIC_API_KEY": "your-anthropic-key",
69
- "LMS_API_KEY": "your-pro-key-from-kordagencies"
59
+ "ANTHROPIC_API_KEY": "your-key",
60
+ "API_KEY": "your-lms-api-key-for-paid-tier"
70
61
  }
71
62
  }
72
63
  }
73
64
  }
74
65
  ```
75
66
 
76
- ---
77
-
78
- ## Tool: `evaluate_local_model_suitability`
79
-
80
- ### Parameters
81
-
82
- | Parameter | Type | Required | Description |
83
- |---|---|---|---|
84
- | `task_description` | string | ✅ | Describe the task specifically. Include output format, accuracy requirements, stakes. |
85
- | `local_model` | string | ✅ | Model name in Ollama format: `llama3.1:8b`, `mistral:7b`, `qwen2.5:14b`, etc. |
86
- | `quality_threshold` | enum | ✅ | `draft` / `production` / `critical` |
87
- | `use_case_type` | enum | ✅ | `classification` / `summarisation` / `code_generation` / `reasoning` / `data_extraction` / `creative_writing` / `question_answering` / `translation` / `sentiment_analysis` / `other` |
88
- | `data_sensitivity` | enum | ✅ | `public` / `internal` / `confidential` |
89
- | `latency_requirement` | enum | ✅ | `flexible` / `moderate` / `realtime` |
90
-
91
- ### Example Request
92
-
93
- ```json
94
- {
95
- "task_description": "Classify customer support emails into 5 categories: billing, technical, returns, complaints, general. Must be accurate enough for production routing — wrong classification means wrong team gets the ticket.",
96
- "local_model": "llama3.1:8b",
97
- "quality_threshold": "production",
98
- "use_case_type": "classification",
99
- "data_sensitivity": "internal",
100
- "latency_requirement": "moderate"
101
- }
102
- ```
103
-
104
- ### Example Response
105
-
106
- ```json
107
- {
108
- "verdict": "EITHER",
109
- "confidence": "HIGH",
110
- "summary": "Llama 3.1 8B can handle 5-category email classification at production quality if emails are clear — use local to protect customer data and save cost, with cloud fallback for ambiguous cases.",
111
- "model_evaluated": "llama3.1:8b",
112
- "model_profile": {
113
- "parameter_count": "8B",
114
- "tier": "small",
115
- "known_strengths": ["simple Q&A", "basic summarisation", "short classification", "data extraction"],
116
- "known_weaknesses": ["complex multi-step reasoning", "long-context coherence", "nuanced instruction following"]
117
- },
118
- "task_complexity": "SIMPLE",
119
- "reasoning": {
120
- "quality_assessment": "5-category classification is within 8B capability for well-structured emails. Performance degrades on ambiguous or multi-issue tickets.",
121
- "cost_impact": "Running locally saves approximately $0.003-0.008 per classification vs cloud. At 10,000 emails/month that is $30-80 saved monthly.",
122
- "privacy_assessment": "Customer support emails contain personal data. Keeping classification local avoids sending customer PII to external APIs — strong argument for local.",
123
- "latency_assessment": "Classification on an 8B model completes in 200-800ms depending on hardware. Meets moderate latency requirement.",
124
- "failure_modes": "Watch for: (1) multi-issue emails being misclassified to only one category, (2) sarcastic or informal language confusing the classifier, (3) very short one-word emails with no context."
125
- },
126
- "recommended_cloud_model": null,
127
- "fallback_advice": "If local classification confidence is low (detectable via logprobs or by asking the model to rate its own confidence), escalate to claude-haiku-3 for a second opinion — cheapest cloud model that handles ambiguous classification reliably.",
128
- "task_complexity": "SIMPLE",
129
- "analysis_type": "AI-powered — NOT a simple benchmark lookup",
130
- "free_tier_remaining": 17,
131
- "checked_at": "2026-04-13T10:22:31.000Z"
132
- }
133
- ```
134
-
135
- ---
136
-
137
- ## Models With Built-in Knowledge
138
-
139
- The following models have detailed capability profiles built in. All other models are assessed based on name and parameter patterns.
140
-
141
- | Model | Params | Tier |
142
- |---|---|---|
143
- | llama3.1:8b | 8B | small |
144
- | llama3.1:70b | 70B | large |
145
- | llama3.1:405b | 405B | frontier |
146
- | llama3.2:3b | 3B | tiny |
147
- | mistral:7b | 7B | small |
148
- | mixtral:8x7b | 47B | medium |
149
- | qwen2.5:7b–72b | 7B–72B | small–large |
150
- | gemma2:2b–27b | 2B–27B | tiny–medium |
151
- | phi3:mini, phi3:medium, phi4 | 3.8B–14B | tiny–medium |
152
- | deepseek-r1:8b–70b | 8B–70B | small–large |
153
- | codellama:7b–34b | 7B–34B | small–large |
154
- | deepseek-coder:6.7b–33b | 6.7B–33B | small–large |
155
-
156
- ---
157
-
158
- ## Pricing
159
-
160
- | Tier | Price | Evaluations |
161
- |---|---|---|
162
- | Free | $0 | 20/month |
163
- | Pro | $99/month | 2,000/month |
164
- | Enterprise | $299/month | Unlimited |
165
-
166
- Get your Pro key at [kordagencies.com](https://kordagencies.com)
167
-
168
- ---
169
-
170
- ## Privacy
171
-
172
- We do not log or store your task descriptions, model names, or any query content. Each evaluation is processed and discarded. Full terms: [kordagencies.com/terms.html](https://kordagencies.com/terms.html)
67
+ Free tier requires no API key — tracked by IP.
173
68
 
174
- ---
69
+ ## Legal
175
70
 
176
- Built by [Kord Agencies](https://kordagencies.com)
71
+ Results are for cost-optimisation guidance only and do not constitute technical advice. Full terms: [kordagencies.com/terms.html](https://kordagencies.com/terms.html)
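The new README tells agents to skip the cloud call on `LOCAL` and to use cloud only when the tool returns `CLOUD`. A minimal sketch of that decision on the caller's side, given a response shaped like the README example, might look like the following. `decideRoute` and `defaultLocalModel` are hypothetical names. Treating `EITHER` (and any other non-`CLOUD` verdict) as local is an assumption drawn from the "only use cloud when this tool returns `CLOUD`" wording.

```javascript
// Hypothetical caller-side helper; not part of the package.
// `result` is a response object shaped like the README example.
function decideRoute(result, defaultLocalModel = null) {
  if (result.verdict === "CLOUD") {
    // Prefer the specific justification when the response includes one.
    return {
      route: "cloud",
      model: null,
      why: result.cloud_justified_reason ?? result.reason,
    };
  }
  // Assumption: every non-CLOUD verdict (LOCAL, EITHER) stays local.
  // Free-tier responses may omit recommendations, so fall back to a default.
  const recommended = Array.isArray(result.recommended_local_models)
    ? result.recommended_local_models
    : [];
  return {
    route: "local",
    model: recommended[0] ?? defaultLocalModel,
    why: result.reason,
  };
}
```

With the README's example response this selects the first recommended model. With a free-tier response it falls back to whatever default the caller passes in.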
package/glama.json CHANGED
@@ -1,14 +1,13 @@
1
1
  {
2
2
  "name": "local-model-suitability-mcp",
3
- "description": "AI-powered MCP tool that evaluates whether a local model is suitable for a specific task. Helps agents avoid paying cloud rates unnecessarily, sending sensitive data to external APIs, and trusting under-powered local models with tasks beyond their capability.",
4
- "version": "1.0.0",
5
- "author": "ojas1",
3
+ "title": "Local Model Suitability MCP",
4
+ "description": "Check whether a task can run on a local model instead of cloud. Saves money on every call that does not need cloud inference. AI-powered routing.",
5
+ "version": "1.1.0",
6
6
  "homepage": "https://kordagencies.com",
7
- "license": "MIT",
8
7
  "tools": [
9
8
  {
10
- "name": "evaluate_local_model_suitability",
11
- "description": "Evaluates whether a local model is suitable for a task. Returns LOCAL / CLOUD / EITHER / NEITHER verdict with cost, privacy, latency, and quality reasoning."
9
+ "name": "check_local_viability",
10
+ "description": "Call BEFORE every cloud inference call. Returns LOCAL/CLOUD/EITHER verdict with cost saving estimate and specific local model recommendations."
12
11
  }
13
12
  ]
14
13
  }
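Because the tool is renamed and its inputs change shape, callers written against 1.0.x will need to translate their arguments. The shim below is a rough sketch of such a translation, built only from the two parameter tables in the README diff. The `migrateArguments` name is hypothetical, and the quality mapping is an assumption: the diff gives no official correspondence between `draft` / `production` / `critical` and `PROTOTYPE` / `PRODUCTION` / `BEST_EFFORT`.

```javascript
// Hypothetical migration shim from 1.0.x arguments to the 1.1.0 tool call.
// Assumed mapping; `critical` has no obvious 1.1.0 counterpart, so it is
// sent as PRODUCTION, the strictest documented value.
const QUALITY_MAP = {
  draft: "PROTOTYPE",
  production: "PRODUCTION",
  critical: "PRODUCTION",
};

function migrateArguments(oldArgs) {
  return {
    name: "check_local_viability",
    arguments: {
      // `task_description` becomes `task`.
      task: oldArgs.task_description,
      quality_threshold: QUALITY_MAP[oldArgs.quality_threshold] ?? "PRODUCTION",
      // Sensitivity values keep their names but are upper-cased in 1.1.0.
      data_sensitivity: String(oldArgs.data_sensitivity ?? "public").toUpperCase(),
      // `local_model`, `use_case_type` and `latency_requirement` are dropped:
      // the 1.1.0 inputs table no longer lists them.
    },
  };
}
```

Any caller that relied on the `NEITHER` verdict should also be reviewed, since the 1.1.0 tool description lists only `LOCAL`, `CLOUD` and `EITHER`.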
package/package.json CHANGED
@@ -1,39 +1,28 @@
1
1
  {
2
2
  "name": "local-model-suitability-mcp",
3
- "version": "1.0.1",
4
- "description": "AI-powered MCP tool that evaluates whether a local model is suitable for a specific task - helps agents decide between local inference and cloud APIs based on cost, privacy, latency, and quality requirements.",
3
+ "version": "1.1.0",
4
+ "description": "Check whether a task can run on a local model instead of cloud. Save money on every call that does not need cloud inference.",
5
5
  "main": "src/server.js",
6
6
  "type": "module",
7
- "bin": {
8
- "local-model-suitability-mcp": "src/server.js"
9
- },
10
7
  "scripts": {
11
8
  "start": "node src/server.js"
12
9
  },
10
+ "license": "UNLICENSED",
11
+ "homepage": "https://kordagencies.com",
12
+ "mcpName": "io.github.OjasKord/local-model-suitability-mcp",
13
13
  "keywords": [
14
14
  "mcp",
15
15
  "agent",
16
16
  "local-llm",
17
17
  "ollama",
18
+ "cost-reduction",
18
19
  "model-routing",
19
- "ai-inference",
20
- "llm-evaluation",
21
- "on-device-ai",
22
20
  "privacy",
23
- "cost-optimisation"
21
+ "cost-optimisation",
22
+ "llm-routing",
23
+ "inference-cost"
24
24
  ],
25
- "author": "ojas1",
26
- "license": "MIT",
27
25
  "dependencies": {
28
26
  "@anthropic-ai/sdk": "^0.39.0"
29
- },
30
- "engines": {
31
- "node": ">=18.0.0"
32
- },
33
- "repository": {
34
- "type": "git",
35
- "url": "https://github.com/OjasKord/local-model-suitability-mcp"
36
- },
37
- "homepage": "https://kordagencies.com",
38
- "mcpName": "io.github.OjasKord/local-model-suitability-mcp"
39
- }
27
+ }
28
+ }
package/server.json CHANGED
@@ -1,8 +1,8 @@
1
1
  {
2
2
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
3
3
  "name": "io.github.OjasKord/local-model-suitability-mcp",
4
- "version": "1.0.0",
5
- "description": "AI-powered evaluation of local model suitability for agents.",
4
+ "version": "1.1.0",
5
+ "description": "Check if a task can run locally instead of cloud. Save money on every call that doesn't need cloud inference.",
6
6
  "title": "Local Model Suitability MCP",
7
7
  "websiteUrl": "https://kordagencies.com",
8
8
  "repository": {
@@ -14,18 +14,22 @@
14
14
  "registryType": "npm",
15
15
  "registryBaseUrl": "https://registry.npmjs.org",
16
16
  "identifier": "local-model-suitability-mcp",
17
- "version": "1.0.0",
18
- "transport": {
19
- "type": "stdio"
20
- },
17
+ "version": "1.1.0",
18
+ "transport": { "type": "stdio" },
21
19
  "environmentVariables": [
22
20
  {
23
21
  "name": "ANTHROPIC_API_KEY",
24
- "description": "Anthropic API key required to power AI-based evaluations",
22
+ "description": "Anthropic API key for Claude routing analysis",
25
23
  "isRequired": true,
26
24
  "isSecret": true
27
25
  }
28
26
  ]
29
27
  }
28
+ ],
29
+ "remotes": [
30
+ {
31
+ "type": "streamable-http",
32
+ "url": "https://local-model-suitability-mcp-production.up.railway.app"
33
+ }
30
34
  ]
31
35
  }
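The new `remotes` entry exposes the server over streamable HTTP. As a rough illustration, the helper below builds (but does not send) a JSON-RPC `tools/call` request for that URL. `buildToolCall` is a hypothetical name. The `x-api-key` header is taken from the `smithery.yaml` in this package. The URL is used exactly as listed, and it is not known from this diff whether the server expects a different path or an MCP `initialize` handshake first.

```javascript
// Hypothetical request builder for the remote listed in server.json.
// It only constructs the request; nothing is sent over the network.
const REMOTE_URL = "https://local-model-suitability-mcp-production.up.railway.app";

function buildToolCall(args, apiKey, id = 1) {
  const headers = {
    "Content-Type": "application/json",
    // Streamable HTTP servers may answer with plain JSON or an SSE stream.
    Accept: "application/json, text/event-stream",
  };
  // The free tier is described as needing no key, so the header is optional.
  if (apiKey) headers["x-api-key"] = apiKey;
  return {
    url: REMOTE_URL,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({
        jsonrpc: "2.0",
        id,
        method: "tools/call",
        params: { name: "check_local_viability", arguments: args },
      }),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)` in Node 18 or later. A full MCP client library would normally handle session setup and response parsing instead.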
package/smithery.yaml CHANGED
@@ -1,24 +1,14 @@
1
- name: local-model-suitability-mcp
2
- description: AI-powered tool that evaluates whether a local model (Ollama, LM Studio, etc.) is suitable for a specific task. Helps agents make intelligent decisions about cost, privacy, latency, and quality — avoiding expensive mistakes in both directions.
3
- version: 1.0.0
4
- author: ojas1
5
- homepage: https://kordagencies.com
6
- license: MIT
7
-
8
- tools:
9
- - name: evaluate_local_model_suitability
10
- description: >
11
- Evaluates whether a specific local model is suitable for a specific task.
12
- Returns a structured verdict (LOCAL / CLOUD / EITHER / NEITHER) with
13
- reasoning about cost, privacy, quality risk, and failure modes.
14
-
15
- config:
16
- schema:
1
+ startCommand:
2
+ type: http
3
+ url: https://local-model-suitability-mcp-production.up.railway.app
4
+ configSchema:
17
5
  type: object
18
6
  properties:
19
7
  apiKey:
20
8
  type: string
21
- description: "Pro API key from kordagencies.com. Leave blank for free tier (20 evaluations/month)."
9
+ description: "Local Model Suitability MCP API key from kordagencies.com. Leave blank for free tier (20 calls/month)."
22
10
  x-from:
23
- header: x-api-key
11
+ header: "x-api-key"
24
12
  required: []
13
+ systemPrompt: |
14
+ Call check_local_viability BEFORE every cloud inference call to check whether the task can run on a local model instead. Cloud inference costs money -- this tool tells you when you are wasting it. If verdict is LOCAL, route to Ollama or local model and skip the cloud call. Only use cloud when this tool says CLOUD. AI-powered analysis -- NOT a simple lookup. Free tier: 20 calls/month, no API key needed. Pro: $99/month at kordagencies.com.