@gabrielsmartin/orbit-sdk 0.3.3 → 0.4.0

package/README.md CHANGED
@@ -1,191 +1,188 @@
1
- # orbit-ai
1
+ # @gabrielsmartin/orbit-sdk
2
2
 
3
- > Stop blasting every query at GPT-4o. Route intelligently. Save 85%.
3
+ > Stop blasting every query at GPT-4o. Route intelligently. Save up to 98%.
4
4
 
5
- `orbit-ai` is a drop-in routing layer that reads the fingerprint of every AI query and sends it to the optimal model automatically, in under 1ms.
5
+ A small, fast, rule-based router for choosing between LLMs. Open source. No magic. ~200 lines of tuned heuristics that you can fork and customize.
6
+
7
+ **What it is:** ORBIT reads your query, classifies it across 8 axes, and tells you which model to use. You make the API call — ORBIT just picks the model.
8
+
9
+ **What it isn't:** a proxy, a black box, or a neural network. It's fast, deterministic rules: safety-critical routes (emotional content, crisis) beat every heuristic rule, and only an explicit signal code (777/555/333) is checked before them.
6
10
 
7
11
  ```bash
8
- npm install orbit-ai
12
+ npm install @gabrielsmartin/orbit-sdk
9
13
  ```
10
14
 
11
15
  ---
12
16
 
13
- ## 🚀 Pro is live
14
-
15
- | Plan | Price | What's included |
16
- |---|---|---|
17
- | **Founding Pro** | $19/mo | API access · 500k routed queries/mo · savings dashboard |
18
- | **Founding Team** | $99/mo | Unlimited queries · multi-seat · priority support |
19
-
20
- [→ Get Founding Pro](https://buy.stripe.com/6oE5kF3Yz5Co06s9AB) · [→ Get Founding Team](https://buy.stripe.com/9AQ9AV5GH9SEdss6op)
17
+ ## Why
21
18
 
22
- *Founding pricing locks in forever — price goes up at 100 customers.*
19
+ You're probably doing this:
23
20
 
24
- ---
25
-
26
- ## How it works
27
-
28
- Every query gets fingerprinted across **9 axes** in under 1ms:
21
+ ```javascript
22
+ const res = await openai.chat.completions.create({
23
+ model: "gpt-4o", // $30/1M tokens — every single query
24
+ messages
25
+ });
26
+ ```
29
27
 
30
- | Axis | What it measures |
31
- |---|---|
32
- | **Complexity** | Depth of reasoning required |
33
- | **Creativity** | Open-ended vs deterministic |
34
- | **Emotional Weight** | Sensitivity — crisis queries always go to Claude |
35
- | **Recency** | Need for live/current data → Grok |
36
- | **Context Load** | Window size needed → Claude 200k |
37
- | **Speed** | Latency sensitivity |
38
- | **Domain** | Code · Creative · Medical · Legal · General |
39
- | **Cost Tolerance** | Budget tier (overridable) |
40
- | **Signal** | Intent code — 777 (completion) · 555 (variation) · 333 (foundation) |
41
-
42
- Then it routes to the right model. Invisibly.
28
+ "Write a haiku" does not need GPT-4o. Only ~15% of real queries do. ORBIT routes the other 85% to cheaper models with equivalent quality for the task.
43
29
 
44
30
  ---
45
31
 
46
32
  ## Usage
47
33
 
48
- ### Zero-config routing decision
49
-
50
34
  ```javascript
51
- import orbit from 'orbit-ai'
35
+ import orbit from '@gabrielsmartin/orbit-sdk'
36
+
37
+ // Route a query — returns decision instantly, no network call
38
+ const { model, reason, savings } = orbit.route("write a haiku about recursion")
39
+ // model.name → "Claude Sonnet"
40
+ // model.id → "claude-sonnet-3-5"
41
+ // reason → "High creativity — Claude Sonnet for open-ended generation."
42
+ // savings.reductionPct → 50
43
+
44
+ // You then call the model yourself:
45
+ const res = await anthropic.messages.create({
46
+ model: model.id, // "claude-sonnet-3-5"
47
+ max_tokens: 1024, messages: [{ role: 'user', content: 'write a haiku about recursion' }]
48
+ })
49
+ ```
52
50
 
53
- const decision = orbit.route("write a haiku about recursion")
51
+ **ORBIT picks the model. Your code makes the call.** This keeps your API keys yours and gives you full control.
54
52
 
55
- console.log(decision.model.name) // "Claude Sonnet"
56
- console.log(decision.reason) // "High creativity score (8/10)..."
57
- console.log(decision.savings) // { savings: 0.007245, reductionPct: 97 }
58
- ```
53
+ ---
59
54
 
60
- ### With signal codes (v0.3.0+)
55
+ ## Examples
61
56
 
62
57
  ```javascript
63
- import { OrbitClient } from 'orbit-ai'
58
+ import orbit from '@gabrielsmartin/orbit-sdk'
64
59
 
65
- const orbit = new OrbitClient({ log: true })
60
+ orbit.route("what is 2+2?")
61
+ // → Gemini 2.5 Flash | cost_gemini | 98% savings vs GPT-4o
66
62
 
67
- // 777 completion mode, forces high-capability model
68
- orbit.route("finalize the architecture doc", { signal: "777" })
69
- // → Claude Sonnet (floor enforced)
63
+ orbit.route("I've been feeling really anxious lately")
64
+ // → Claude Sonnet | ethics_first | emotional weight — never a cheap model
70
65
 
71
- // 555 variation mode, maximizes model diversity
72
- orbit.route("brainstorm 10 unexpected product names", { signal: "555" })
73
- // → Grok (variation bias)
66
+ orbit.route("latest AI news today")
67
+ // → Grok | recency_grok | live web access
74
68
 
75
- // 333 foundation mode, minimizes cost
76
- orbit.route("summarize this paragraph", { signal: "333" })
77
- // → Gemini 2.5 Flash
69
+ orbit.route("architect a distributed event-driven system")
70
+ // → Claude Sonnet | complex_code | high complexity + reasoning
71
+
72
+ orbit.route("summarize this in one sentence")
73
+ // → Gemini 2.5 Flash | cost_gemini | low complexity, $0.50/1M tokens
78
74
  ```
79
75
 
80
- ### With your own API keys
76
+ ---
81
77
 
82
- ```javascript
83
- import { OrbitClient } from 'orbit-ai'
78
+ ## 8-Axis Classification
84
79
 
85
- const orbit = new OrbitClient({
86
- cost_tolerance: 'low', // 'low' | 'medium' | 'high'
87
- log: true,
88
- })
80
+ | Axis | What it measures |
81
+ |------|----------------|
82
+ | `complexity` | Depth of reasoning required |
83
+ | `creativity` | Open-ended vs. factual generation |
84
+ | `emotional_weight` | Sensitive or crisis content |
85
+ | `recency` | Need for real-time / live web data |
86
+ | `context_load` | Long-document or multi-turn depth |
87
+ | `speed` | Latency sensitivity |
88
+ | `domain` | Code, legal, medical, creative, general |
89
+ | `cost_tolerance` | Budget flexibility (overridable) |
90
+
91
+ Classification is keyword-based with tuned weights — fast and transparent. You can inspect `fingerprint.js` and see exactly how any query is scored.
89
92
 
90
- const { model, reason, savings } = orbit.route("explain blockchain simply")
91
- // [ORBIT] → Gemini 2.5 Flash | saved $0.01455 (97% reduction)
93
+ ---
92
94
 
93
- // model.id = 'gemini-2.5-flash', model.provider = 'google'
94
- // Call the model yourself with your keys
95
- ```
95
+ ## Routing Table
96
+
97
+ | Condition | Model | Rule |
98
+ |-----------|-------|------|
99
+ | Signal = 777 | Claude Sonnet | Completion — capability floor |
100
+ | Signal = 555 | Grok | Variation — max diversity |
101
+ | Signal = 333 | Gemini Flash | Foundation — cost floor |
102
+ | `emotional_weight` ≥ 6 | Claude Sonnet | Safety-first, always |
103
+ | `domain` = legal/medical | Claude Sonnet | Ethics + long context |
104
+ | `recency` ≥ 7 | Grok | Live web access |
105
+ | `complexity` ≥ 7 + code | Claude Sonnet | Deep reasoning |
106
+ | `complexity` ≥ 7 | GPT-4o | Structured output |
107
+ | `creativity` ≥ 5 | Claude Sonnet | Open-ended generation |
108
+ | `complexity` ≤ 3 | Gemini 2.5 Flash | 98% cheaper, equivalent quality |
109
+ | Default | Claude Sonnet | Safe fallback |
96
110
 
97
- ### Full pipeline example
111
+ ---
112
+
113
+ ## API
98
114
 
99
115
  ```javascript
100
- import { OrbitClient } from 'orbit-ai'
101
- import Anthropic from '@anthropic-ai/sdk'
102
- import OpenAI from 'openai'
103
- import { GoogleGenerativeAI } from '@google/generative-ai'
104
-
105
- const orbit = new OrbitClient({ log: true })
106
-
107
- async function smartQuery(text, signal) {
108
- const { model } = orbit.route(text, { signal })
109
-
110
- if (model.provider === 'anthropic') {
111
- const client = new Anthropic()
112
- return client.messages.create({ model: model.id, max_tokens: 1024, messages: [{ role: 'user', content: text }] })
113
- }
114
- if (model.provider === 'openai') {
115
- const client = new OpenAI()
116
- return client.chat.completions.create({ model: model.id, messages: [{ role: 'user', content: text }] })
117
- }
118
- if (model.provider === 'google') {
119
- const client = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY)
120
- return client.getGenerativeModel({ model: model.id }).generateContent(text)
121
- }
122
- }
123
-
124
- await smartQuery("write a poem about the ocean") // → Claude Sonnet
125
- await smartQuery("what's the latest news on AI funding?") // → Grok
126
- await smartQuery("what is 2+2") // → Gemini Flash
127
- await smartQuery("I've been feeling really overwhelmed") // → Claude Sonnet (ethics-first)
128
- await smartQuery("finalize the system design", { signal: "777" }) // → Claude Sonnet (forced)
129
- ```
116
+ import orbit, { OrbitClient, fingerprint } from '@gabrielsmartin/orbit-sdk'
117
+
118
+ // Singleton client — zero config
119
+ const { model, reason, rule, scores, savings } = orbit.route("your query")
120
+
121
+ // Custom client
122
+ const client = new OrbitClient({
123
+ cost_tolerance: 'low', // 'low' | 'medium' | 'high'
124
+ blocked_models: ['gpt4o'], // block specific models
125
+ apiKey: 'your-orbit-key', // optional; when set, each query and its routing decision are sent to the hosted gateway as telemetry
126
+ signal: '333', // default signal code (optional)
127
+ log: true, // console.log routing decisions (default: true)
128
+ on_route: (decision) => {}, // callback on each routing decision
129
+ })
130
130
 
131
- ### Session stats
131
+ // Fingerprint only — no routing
132
+ const axisScores = orbit.fingerprint("your query") // renamed: `scores` is already declared above
133
+ // → { complexity: 7, creativity: 2, emotional_weight: 0, recency: 0, ... }
132
134
 
133
- ```javascript
135
+ // Session stats
134
136
  const stats = orbit.stats()
135
- console.log(stats.total_savings_formatted) // "$0.2341"
136
- console.log(stats.model_usage) // { "Claude Sonnet": 4, "Gemini 2.5 Flash": 12, ... }
137
+ // → { total_queries: 42, total_savings_formatted: '$1.2400', model_usage: { ... } }
137
138
  ```
138
139
 
139
140
  ---
140
141
 
141
- ## Model matrix
142
+ ## Hosted API
142
143
 
143
- | Model | Provider | Cost/1M | Best for |
144
- |---|---|---|---|
145
- | Claude Sonnet 3.5 | Anthropic | $15 | Complex reasoning · Ethics · Long context |
146
- | Claude Haiku | Anthropic | $1 | Speed · Summaries · Medium tasks |
147
- | Gemini 2.5 Flash | Google | $0.50 | High volume · Simple queries · Cost |
148
- | GPT-4o | OpenAI | $30 | Structured output · Broad knowledge |
149
- | GPT-4o Mini | OpenAI | $0.30 | Classification · Filler tasks |
150
- | Grok | xAI | $10 | Trending · Real-time web |
144
+ Free to try, no auth required:
151
145
 
152
- ---
146
+ ```bash
147
+ curl -X POST https://gtll-soul-guide-81e596e1.base44.app/functions/orbitGateway \
148
+ -H "Content-Type: application/json" \
149
+ -d '{"query": "write a haiku about recursion"}'
150
+ ```
153
151
 
154
- ## The math
152
+ **Pricing:**
155
153
 
156
- Validated by [RouteLLM (UC Berkeley · ICLR 2025)](https://arxiv.org/abs/2406.18665): intelligent routing achieves **85% cost reduction** while maintaining 95% of quality.
154
+ | Tier | Price | Limit |
155
+ |------|-------|-------|
156
+ | Free | $0/mo | 3 queries/day |
157
+ | Pro | $19/mo | Unlimited |
158
+ | Team | $99/mo | Unlimited · 5 seats |
157
159
 
158
- For a team running 100k queries/month at GPT-4o:
159
- - Without ORBIT: **$1,500/month**
160
- - With ORBIT: **~$225/month**
161
- - Savings: **$1,275/month · $15,300/year**
160
+ [Pro](https://buy.stripe.com/6oE5kF3Yz5Co06s9AB) · [Team](https://buy.stripe.com/9AQ9AV5GH9SEdss6op)
162
161
 
163
162
  ---
164
163
 
165
- ## Source
164
+ ## Research backing
166
165
 
167
- [github.com/gtllco/orbit](https://github.com/gtllco/orbit)
166
+ - **RouteLLM (ICLR 2025, UC Berkeley):** intelligent routing achieves 85% cost reduction at 95% quality vs always-GPT-4o
167
+ - **OpenRouter** ($500M+ valuation) proves the market. ORBIT adds the classification layer.
168
+ - **Martian** (Accenture-backed) proves enterprises pay for routing. ORBIT is the open version.
168
169
 
169
170
  ---
170
171
 
171
172
  ## Roadmap
172
173
 
173
- - [x] 8-axis fingerprinting engine
174
- - [x] 6-model routing matrix
175
- - [x] TypeScript types
176
- - [x] Signal-aware routing (777 · 555 · 333)
177
- - [ ] Streaming support
178
- - [ ] Custom model matrix (bring your own models)
179
- - [ ] Automatic provider failover
180
- - [ ] Usage analytics dashboard
181
- - [ ] Browser extension
174
+ - [x] v0.1.x — 8-axis classification, 6-model routing matrix
175
+ - [x] v0.3.x — Signal-aware routing (777/555/333), hosted gateway
176
+ - [ ] v0.4.0 — API key gated usage dashboard
177
+ - [ ] v0.5.0 — Embedding-based fallback for ambiguous queries
178
+ - [ ] v1.0.0 — Enterprise API + savings-share pricing
182
179
 
183
180
  ---
184
181
 
185
182
  ## License
186
183
 
187
- MIT · Built by [Gabriel Martin](https://github.com/gabrielsmartin)
184
+ MIT © [Gabriel Martin](https://github.com/gabrielsmartin)
188
185
 
189
- *"Every model has a gravitational pull. ORBIT decides which one you need."*
186
+ **[GitHub](https://github.com/gtllco/orbit) · [npm](https://www.npmjs.com/package/@gabrielsmartin/orbit-sdk)**
190
187
 
191
188
  777 · 555 · 333
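
The savings figures quoted in the README (50% for Claude Sonnet, 98% for Gemini 2.5 Flash) follow from the per-million-token prices that appear in this diff. Below is a minimal sketch of that arithmetic, assuming a flat GPT-4o baseline of $30/1M tokens and the 500-token default seen in `src/index.js`. `estimateSavings` is a hypothetical helper for illustration only; it is not the package's `calculateSavings`, whose implementation is not shown in this diff.

```javascript
// Hypothetical helper: illustrates the arithmetic, not the package's own code.
const GPT4O_COST_PER_1M = 30; // USD per 1M tokens, baseline quoted in the README

function estimateSavings(costPer1M, tokens = 500, baselinePer1M = GPT4O_COST_PER_1M) {
  const baselineCost = (tokens / 1e6) * baselinePer1M; // cost if sent to GPT-4o
  const actualCost = (tokens / 1e6) * costPer1M;       // cost on the routed model
  const savings = baselineCost - actualCost;
  const reductionPct = Math.round((savings / baselineCost) * 100);
  return { baselineCost, actualCost, savings, reductionPct };
}

console.log(estimateSavings(15).reductionPct);  // 50  (Claude Sonnet at $15/1M)
console.log(estimateSavings(0.5).reductionPct); // 98  (Gemini 2.5 Flash at $0.50/1M)
```

Under these assumptions the percentage depends only on the price ratio, so the token estimate changes the dollar amount but not `reductionPct`.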
package/package.json CHANGED
@@ -1,13 +1,18 @@
1
1
  {
2
2
  "name": "@gabrielsmartin/orbit-sdk",
3
- "version": "0.3.3",
4
- "description": "Intelligent AI model routing with signal layer. 85% cost savings. 777·555·333",
5
- "main": "index.js",
6
- "types": "index.d.ts",
3
+ "version": "0.4.0",
4
+ "description": "Rule-based LLM router. Classifies queries across 8 axes and picks the optimal model. Fast, deterministic, zero dependencies.",
5
+ "type": "module",
6
+ "main": "src/index.js",
7
+ "types": "src/index.d.ts",
8
+ "exports": {
9
+ ".": {
10
+ "import": "./src/index.js",
11
+ "types": "./src/index.d.ts"
12
+ }
13
+ },
7
14
  "files": [
8
- "index.js",
9
- "index.d.ts",
10
- "src/",
15
+ "src",
11
16
  "README.md"
12
17
  ],
13
18
  "scripts": {
@@ -22,28 +27,25 @@
22
27
  "gemini",
23
28
  "orbit",
24
29
  "cost-optimization",
25
- "model-routing"
30
+ "model-routing",
31
+ "selective-model-matching",
32
+ "gpt4",
33
+ "claude",
34
+ "gemini-flash",
35
+ "grok",
36
+ "ai-infrastructure"
26
37
  ],
27
- "author": "Gabriel Martin <admin@gtll.app>",
38
+ "author": "Gabriel Martin <gabriel@gtll.app>",
28
39
  "license": "MIT",
29
40
  "repository": {
30
41
  "type": "git",
31
- "url": "https://github.com/gtllco/orbit"
42
+ "url": "git+https://github.com/gtllco/orbit.git"
32
43
  },
33
- "homepage": "https://orbit-sdk.base44.app",
34
- "publishConfig": {
35
- "registry": "https://registry.npmjs.org",
36
- "access": "public"
37
- },
38
- "module": "index.js",
39
- "type": "module",
40
- "exports": {
41
- ".": {
42
- "import": "./index.js",
43
- "types": "./index.d.ts"
44
- }
44
+ "homepage": "https://github.com/gtllco/orbit",
45
+ "bugs": {
46
+ "url": "https://github.com/gtllco/orbit/issues"
45
47
  },
46
48
  "engines": {
47
- "node": ">=18.0.0"
49
+ "node": ">=16"
48
50
  }
49
51
  }
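
The `engines` field drops from `>=18.0.0` to `>=16`, while the telemetry path in `src/index.js` calls the global `fetch`, which Node.js exposes by default only from version 18. On Node 16 that call would throw inside the `try` block and be swallowed, so telemetry would silently do nothing. A minimal sketch of an explicit guard follows; `canSendTelemetry` is a hypothetical helper and not part of the package.

```javascript
// Hypothetical guard: reports whether a telemetry POST could be attempted at all.
// fetchImpl defaults to the runtime's global fetch; pass null to simulate its absence.
function canSendTelemetry(apiKey, fetchImpl = globalThis.fetch) {
  return Boolean(apiKey) && typeof fetchImpl === 'function';
}

// Without a key nothing is sent, regardless of runtime.
console.log(canSendTelemetry(null)); // false

// Simulates a runtime with no global fetch (for example Node 16).
console.log(canSendTelemetry('your-orbit-key', null)); // false

// Simulates a runtime where fetch is available.
console.log(canSendTelemetry('your-orbit-key', () => {})); // true
```

A check like this would let a caller log a one-time warning instead of failing silently; whether that is desirable for best-effort telemetry is a design choice.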
package/src/index.js CHANGED
@@ -1,133 +1,130 @@
1
1
  /**
2
- * orbit-ai · v0.3.0
3
- * Intelligent AI model routing with signal-aware priority bias.
4
- * Drop in. Save 85%.
5
- *
6
- * https://orbitai.gtll.app
7
- * github.com/gtllco/orbit
8
- * npm: @gabrielsmartin/orbit-sdk
2
+ * @gabrielsmartin/orbit-sdk
3
+ * Rule-based LLM router. Fast, deterministic, zero dependencies.
4
+ * Picks the right model — you make the API call.
9
5
  *
6
+ * https://github.com/gtllco/orbit
10
7
  * 777 · 555 · 333
11
8
  */
12
9
 
13
10
  import { fingerprint } from './fingerprint.js';
14
11
  import { route, calculateSavings, MODEL_MATRIX } from './router.js';
15
- import { applySignalBias, inferSignalFromEvent, formatSignalResponse, SIGNAL_DESCRIPTIONS } from './signal.js';
16
12
 
17
13
  export { fingerprint, route, calculateSavings, MODEL_MATRIX };
18
- export { applySignalBias, inferSignalFromEvent, formatSignalResponse, SIGNAL_DESCRIPTIONS };
14
+
15
+ const GATEWAY_URL = 'https://gtll-soul-guide-81e596e1.base44.app/functions/orbitGateway';
19
16
 
20
17
  /**
21
18
  * OrbitClient — the main class
22
- * Now with signal-aware routing (777 · 555 · 333)
23
19
  *
24
20
  * @example
25
21
  * import { OrbitClient } from '@gabrielsmartin/orbit-sdk'
26
- * const orbit = new OrbitClient()
27
- *
28
- * // Without signal — standard 8-axis routing
29
- * const result = orbit.route("summarize this contract")
30
- *
31
- * // With signal — priority-aware routing
32
- * const result = orbit.route("write the Q1 investor memo", { signal: "777" })
33
- * // → Claude Sonnet mandatory. This is final form.
34
22
  *
35
- * const result = orbit.route("what's a business model nobody's tried?", { signal: "555" })
36
- * // → Grok or Claude Sonnet. Destabilize the expected.
37
- *
38
- * const result = orbit.route("is this email spam?", { signal: "333" })
39
- * // → Gemini Flash. Strip cost. Foundation doesn't need premium.
23
+ * const orbit = new OrbitClient({ apiKey: 'your-key' })
24
+ * const { model, reason, scores } = orbit.route("explain quantum entanglement")
25
+ * // → { model: { name: 'Claude Sonnet', id: 'claude-sonnet-3-5', ... }, reason: '...', ... }
26
+ * // You then call the model using the provider SDK of your choice
40
27
  */
41
28
  export class OrbitClient {
42
29
  constructor(config = {}) {
43
30
  this.config = {
44
- cost_tolerance: config.cost_tolerance || 'medium', // 'low' | 'medium' | 'high'
31
+ cost_tolerance: config.cost_tolerance || 'medium',
45
32
  blocked_models: config.blocked_models || [],
46
- api_key: config.apiKey || config.api_key || null,
33
+ apiKey: config.apiKey || config.api_key || null,
34
+ signal: config.signal || null,
47
35
  log: config.log !== false,
48
36
  on_route: config.on_route || null,
49
- // Provider API keys (optional — falls back to env vars)
50
- anthropic_key: config.anthropic_key || null,
51
- openai_key: config.openai_key || null,
52
- google_key: config.google_key || null,
53
37
  };
54
38
 
55
39
  this._stats = {
56
40
  total_queries: 0,
57
41
  total_savings: 0,
58
42
  model_usage: {},
59
- signal_usage: { '777': 0, '555': 0, '333': 0, none: 0 },
60
43
  };
61
44
  }
62
45
 
63
46
  /**
64
- * Route a query to the optimal model.
65
- * Signal codes bias routing before model selection.
47
+ * Route a query to the optimal model (local, <1ms).
48
+ * ORBIT picks the model — your code makes the API call.
66
49
  *
67
- * @param {string} text - The query text
68
- * @param {Object} options - Override options for this query
69
- * @param {string} [options.signal] - "777" | "555" | "333" | null
70
- * @param {string} [options.cost_tolerance] - "low" | "medium" | "high"
71
- * @param {number} [options.estimated_tokens] - Token estimate for cost calc
72
- * @returns {Object} decision - { model, reason, rule, scores, savings, signal_applied, signal_reason, estimated_cost }
50
+ * @param {string} text - The query or prompt text
51
+ * @param {Object} options - Per-query overrides: { cost_tolerance, signal, estimated_tokens, blocked_models }
52
+ * @returns {{ model, reason, rule, scores, savings, timestamp }}
73
53
  */
74
54
  route(text, options = {}) {
75
- // 1. Fingerprint
76
- const rawScores = fingerprint(text);
55
+ const scores = fingerprint(text);
77
56
 
78
- // 2. Apply cost_tolerance override
79
57
  if (options.cost_tolerance) {
80
- rawScores.cost_tolerance = options.cost_tolerance === 'low' ? 2
58
+ scores.cost_tolerance = options.cost_tolerance === 'low' ? 2
81
59
  : options.cost_tolerance === 'high' ? 9 : 5;
82
60
  }
83
61
 
84
- // 3. Apply signal bias (777 / 555 / 333)
85
- const signal = options.signal || null;
86
- const scores = applySignalBias(rawScores, signal);
62
+ const config = {
63
+ ...this.config,
64
+ ...options,
65
+ signal: options.signal || this.config.signal || null,
66
+ };
87
67
 
88
- // 4. Route
89
- const config = { ...this.config, ...options };
90
68
  const decision = route(scores, config);
91
-
92
- // 5. Calculate savings and cost
93
- const estimatedTokens = options.estimated_tokens || 500;
94
- const savings = calculateSavings(decision.model, estimatedTokens);
95
- const estimatedCost = `$${savings.actualCost.toFixed(5)}`;
96
-
97
- // 6. Format signal metadata
98
- const signalMeta = formatSignalResponse(scores, decision);
69
+ const savings = calculateSavings(decision.model, options.estimated_tokens || 500);
99
70
 
100
71
  const result = {
101
72
  model: decision.model,
102
73
  reason: decision.reason,
103
74
  rule: decision.rule,
104
- scores: rawScores, // return original scores, not biased
105
- signal_applied: signalMeta.signal_applied,
106
- signal_reason: signalMeta.signal_reason || null,
75
+ scores,
107
76
  savings,
108
- estimated_cost: estimatedCost,
109
77
  timestamp: new Date().toISOString(),
110
78
  };
111
79
 
112
- // Update stats
80
+ // Stats
113
81
  this._stats.total_queries++;
114
82
  this._stats.total_savings += savings.savings;
115
83
  const modelName = decision.model.name;
116
84
  this._stats.model_usage[modelName] = (this._stats.model_usage[modelName] || 0) + 1;
117
- this._stats.signal_usage[signal || 'none']++;
118
85
 
119
- // Log routing decision
120
86
  if (this.config.log) {
121
- const signalTag = signal ? ` [signal:${signal}]` : '';
122
- console.log(`[ORBIT]${signalTag} → ${decision.model.name} | ${decision.rule} | ${estimatedCost} (saved ${savings.reductionPct}%)`);
87
+ console.log(`[ORBIT] ${decision.model.name} | ${decision.rule} | saved $${savings.savings.toFixed(5)} (${savings.reductionPct}% vs GPT-4o)`);
88
+ }
89
+
90
+ if (this.config.on_route) {
91
+ this.config.on_route(result);
92
+ }
93
+
94
+ // Fire telemetry to gateway (non-blocking, best-effort)
95
+ if (this.config.apiKey) {
96
+ this._telemetry(text, result).catch(() => {});
123
97
  }
124
98
 
125
- if (this.config.on_route) this.config.on_route(result);
126
99
  return result;
127
100
  }
128
101
 
129
102
  /**
130
- * Get cumulative stats for this session including per-signal breakdown
103
+ * Send a routing decision to the ORBIT gateway for usage tracking.
104
+ * Only fires if apiKey is set. Non-blocking.
105
+ * @private
106
+ */
107
+ async _telemetry(query, decision) {
108
+ try {
109
+ await fetch(GATEWAY_URL, {
110
+ method: 'POST',
111
+ headers: { 'Content-Type': 'application/json' },
112
+ body: JSON.stringify({
113
+ query,
114
+ api_key: this.config.apiKey,
115
+ model_selected: decision.model.name,
116
+ rule: decision.rule,
117
+ signal: decision.scores?.signal || null,
118
+ savings_pct: decision.savings.reductionPct,
119
+ }),
120
+ });
121
+ } catch (_) {
122
+ // Silently ignore — telemetry is best-effort
123
+ }
124
+ }
125
+
126
+ /**
127
+ * Get cumulative routing stats for this session
131
128
  */
132
129
  stats() {
133
130
  return {
@@ -137,31 +134,20 @@ export class OrbitClient {
137
134
  }
138
135
 
139
136
  /**
140
- * Fingerprint a query without routing
137
+ * Fingerprint a query without routing.
138
+ * Useful for debugging or building custom logic on top.
141
139
  */
142
140
  fingerprint(text) {
143
141
  return fingerprint(text);
144
142
  }
145
-
146
- /**
147
- * Apply signal bias to an existing fingerprint
148
- * Useful for building custom routing logic on top of ORBIT
149
- */
150
- applySignal(fingerprint, signal_code) {
151
- return applySignalBias(fingerprint, signal_code);
152
- }
153
-
154
- /**
155
- * Infer signal from a neural hub event priority
156
- * coral1 events tagged 777/555/333 auto-translate to signal codes
157
- */
158
- signalFromEvent(eventPriority) {
159
- return inferSignalFromEvent(eventPriority);
160
- }
161
143
  }
162
144
 
163
145
  /**
164
- * Default singleton client
146
+ * Default singleton client — zero config, ready to use
147
+ *
148
+ * @example
149
+ * import orbit from '@gabrielsmartin/orbit-sdk'
150
+ * const { model, reason } = orbit.route("write a haiku about recursion")
165
151
  */
166
152
  const orbit = new OrbitClient();
167
153
  export default orbit;
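
In the new `route()`, per-call options are spread over the client config, and `signal` is resolved separately so that a missing per-call value falls back to the client default. Here is a small standalone sketch of that precedence. `mergeRouteConfig` is a hypothetical name used for illustration; it mirrors the object literal in the diff above rather than any exported function.

```javascript
// Hypothetical standalone version of the config merge performed inside route().
function mergeRouteConfig(clientConfig, options = {}) {
  return {
    ...clientConfig, // client-level defaults
    ...options,      // per-call overrides win
    signal: options.signal || clientConfig.signal || null,
  };
}

const clientConfig = { cost_tolerance: 'medium', blocked_models: [], signal: '333', log: true };

const a = mergeRouteConfig(clientConfig, { cost_tolerance: 'low' });
console.log(a.cost_tolerance, a.signal); // low 333

const b = mergeRouteConfig(clientConfig, { signal: '777' });
console.log(b.signal); // 777

const c = mergeRouteConfig(clientConfig, { signal: null });
console.log(c.signal); // 333
```

One consequence of the `||` chain is that a per-call `signal: null` cannot clear a client-level default, because a falsy value is treated as unset.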
package/src/router.js CHANGED
@@ -1,14 +1,17 @@
1
1
  /**
2
- * ORBIT · Selective Model Matching (SMM) Router
3
- * Routes queries to optimal models based on 8-axis fingerprints + signal codes
2
+ * ORBIT · Model Router
3
+ * Rule-based query classifier — routes to the optimal LLM based on query characteristics.
4
+ * Fast, deterministic, zero dependencies. ~160 lines.
5
+ *
6
+ * Rules are hand-tuned heuristics, not learned weights.
7
+ * Safety-critical routes (emotional content, crisis) always win.
4
8
  *
5
- * Proprietary routing logic — open SDK, closed engine weights
6
9
  * 777 · 555 · 333
7
10
  */
8
11
 
9
12
  export const MODEL_MATRIX = {
10
13
  claude_sonnet: {
11
- id: 'claude-sonnet-4-6',
14
+ id: 'claude-sonnet-3-5',
12
15
  name: 'Claude Sonnet',
13
16
  provider: 'anthropic',
14
17
  costPer1M: 15,
@@ -17,7 +20,7 @@ export const MODEL_MATRIX = {
17
20
  tier: 'medium',
18
21
  },
19
22
  claude_haiku: {
20
- id: 'claude-haiku-4-5',
23
+ id: 'claude-haiku-3-5',
21
24
  name: 'Claude Haiku',
22
25
  provider: 'anthropic',
23
26
  costPer1M: 1,
@@ -64,153 +67,117 @@ export const MODEL_MATRIX = {
64
67
  };
65
68
 
66
69
  /**
67
- * Core SMM routing logic — Signal-aware
68
- * Returns the selected model + reasoning
70
+ * Route a query fingerprint to the best model.
71
+ * Returns { model, reason, rule }
72
+ *
73
+ * Note: ORBIT picks the model — your code makes the API call.
69
74
  *
70
- * @param {Object} scores - 8-axis fingerprint scores (post-signal-bias)
71
- * @param {Object} config - User config (cost_tolerance override, blocked_models, etc.)
72
- * @returns {Object} { model, reason, fallback }
75
+ * @param {Object} scores - 8-axis fingerprint from fingerprint()
76
+ * @param {Object} config - Optional overrides (blocked_models, cost_tolerance, signal)
77
+ * @returns {{ model: Object, reason: string, rule: string }}
73
78
  */
74
79
  export function route(scores, config = {}) {
75
80
  const {
76
81
  complexity, creativity, speed, emotional_weight,
77
- recency, context_load, domain, cost_tolerance,
78
- signal_code, variation_mode
82
+ recency, context_load, domain, cost_tolerance
79
83
  } = scores;
80
84
 
81
85
  const blocked = config.blocked_models || [];
82
- const preferLow = cost_tolerance <= 3;
83
- const preferHigh = cost_tolerance >= 8;
86
+ const preferLow = config.cost_tolerance === 'low' || cost_tolerance <= 3;
84
87
 
85
- // ── SIGNAL OVERRIDES (applied before all other rules) ──────────────────────
86
-
87
- // 777 — Completion Bias: cost_tolerance and complexity already raised by applySignalBias.
88
- // But explicitly block sub-tier models when signal=777.
89
- if (signal_code === '777') {
90
- // Minimum floor: Claude Haiku. Prefer Sonnet if complexity >= 5.
91
- if (complexity >= 5 && !blocked.includes('claude_sonnet')) {
92
- return {
93
- model: MODEL_MATRIX.claude_sonnet,
94
- reason: `Completion bias (777) — complexity ${complexity}/10 meets threshold. Claude Sonnet mandatory. This output is final form.`,
95
- rule: 'signal_777_sonnet',
96
- };
97
- }
98
- // complexity < 5 but still 777: Claude Haiku minimum, no Gemini Flash/GPT-4o Mini
99
- if (!blocked.includes('claude_haiku')) {
100
- return {
101
- model: MODEL_MATRIX.claude_haiku,
102
- reason: `Completion bias (777) — complexity ${complexity}/10 below Sonnet threshold but 777 enforces Claude Haiku minimum. No sub-tier models on completion events.`,
103
- rule: 'signal_777_haiku',
104
- };
105
- }
88
+ // Signal override (777/555/333) always wins if set; it runs before the safety rule below and ignores blocked_models
89
+ if (config.signal === '777') {
90
+ return { model: MODEL_MATRIX.claude_sonnet, reason: '777 — Completion. Claude Sonnet floor enforced.', rule: 'signal_777' };
106
91
  }
107
-
108
- // 555 — Variation Bias: variation_mode=true, creativity and recency already boosted.
109
- // Prefer Grok when recency is elevated. Otherwise prefer creative non-default models.
110
- if (signal_code === '555') {
111
- if (recency >= 5 && !blocked.includes('grok')) {
112
- return {
113
- model: MODEL_MATRIX.grok,
114
- reason: `Variation bias (555) — recency boosted to ${recency}/10. Grok for live web intelligence and unexpected angles. Destabilize the expected.`,
115
- rule: 'signal_555_grok',
116
- };
117
- }
118
- if (creativity >= 6 && !blocked.includes('claude_sonnet')) {
119
- return {
120
- model: MODEL_MATRIX.claude_sonnet,
121
- reason: `Variation bias (555) — creativity at ${creativity}/10. Claude Sonnet for nuanced, surprising creative output.`,
122
- rule: 'signal_555_claude',
123
- };
124
- }
92
+ if (config.signal === '555') {
93
+ return { model: MODEL_MATRIX.grok, reason: '555 — Variation. Maximum model diversity.', rule: 'signal_555' };
94
+ }
95
+ if (config.signal === '333') {
96
+ return { model: MODEL_MATRIX.gemini_flash, reason: '333 — Foundation. Minimum cost floor.', rule: 'signal_333' };
125
97
  }
126
98
 
127
- // 333 — Foundation Bias: cost_tolerance dropped to 1 by applySignalBias (unless emotional override).
128
- // Emotional safety net is handled by ethics rule below — it fires first.
129
-
130
- // ── CORE ROUTING RULES ─────────────────────────────────────────────────────
131
-
132
- // Rule 1: ETHICS FIRST — emotional/crisis queries always go to Claude (even on 333)
99
+ // Rule 1: SAFETY — emotional/crisis always → Claude Sonnet
133
100
  if (emotional_weight >= 6) {
134
101
  return {
135
102
  model: MODEL_MATRIX.claude_sonnet,
136
- reason: `Emotional weight ${emotional_weight}/10 — routing to Claude for ethics-first handling. Never use a cheap model for sensitive content.${signal_code === '333' ? ' (333 foundation bias overridden by emotional safety rule)' : ''}`,
103
+ reason: 'Emotional weight detected — Claude Sonnet for ethics-first handling. Never route sensitive content to a cost-optimized model.',
137
104
  rule: 'ethics_first',
138
105
  };
139
106
  }
140
107
 
141
- // Rule 2: Realtime / current events → Grok
142
- if (recency >= 7 && !blocked.includes('grok') && signal_code !== '777') {
108
+ // Rule 2: Realtime → Grok
109
+ if (recency >= 7 && !blocked.includes('grok')) {
143
110
  return {
144
111
  model: MODEL_MATRIX.grok,
145
- reason: `High recency score (${recency}/10) — Grok has live web access for current events and trending topics.`,
112
+ reason: `High recency (${recency}/10) — Grok for live web access.`,
146
113
  rule: 'recency_grok',
147
114
  };
148
115
  }
149
116
 
150
- // Rule 3: Long context load → Claude Sonnet (200k window)
151
- if (context_load >= 8 && !blocked.includes('claude_sonnet') && signal_code !== '333') {
117
+ // Rule 3: Long context → Claude (200k window)
118
+ if (context_load >= 8 && !blocked.includes('claude_sonnet')) {
152
119
  return {
153
120
  model: MODEL_MATRIX.claude_sonnet,
154
- reason: `High context load (${context_load}/10) — Claude's 200k window is the only safe choice.`,
121
+ reason: `High context load (${context_load}/10) — Claude's 200k window.`,
155
122
  rule: 'context_claude',
156
123
  };
157
124
  }
158
125
 
159
- // Rule 4: High complexity code/reasoning
126
+ // Rule 4: Complex code → Claude Sonnet
160
127
  if (complexity >= 7 && domain === 'code' && !blocked.includes('claude_sonnet')) {
161
128
  return {
162
129
  model: MODEL_MATRIX.claude_sonnet,
163
- reason: `Complex code task (complexity ${complexity}/10) — Claude Sonnet for deep reasoning and long context.`,
130
+ reason: `Complex code (complexity ${complexity}/10) — Claude Sonnet for deep reasoning.`,
164
131
  rule: 'complex_code',
165
132
  };
166
133
  }
167
134
 
168
- // Rule 5: High complexity general → GPT-4o (if cost tolerance allows and not 777/333)
169
- if (complexity >= 7 && !preferLow && !blocked.includes('gpt4o') && signal_code !== '777' && signal_code !== '333') {
135
+ // Rule 5: High complexity general → GPT-4o
136
+ if (complexity >= 7 && !preferLow && !blocked.includes('gpt4o')) {
170
137
  return {
171
138
  model: MODEL_MATRIX.gpt4o,
172
- reason: `High complexity (${complexity}/10) — GPT-4o for broad knowledge and structured output.`,
139
+ reason: `High complexity (${complexity}/10) — GPT-4o for structured output.`,
173
140
  rule: 'complex_gpt4o',
174
141
  };
175
142
  }
176
143
 
177
- // Rule 6: Creative writing → Claude Sonnet (unless 333 forcing minimum)
178
- if (creativity >= 5 && !blocked.includes('claude_sonnet') && !preferLow) {
144
+ // Rule 6: Creative → Claude Sonnet
145
+ if (creativity >= 5 && !blocked.includes('claude_sonnet')) {
179
146
  return {
180
147
  model: MODEL_MATRIX.claude_sonnet,
181
- reason: `High creativity score (${creativity}/10) — Claude Sonnet for nuanced creative writing.`,
148
+ reason: `High creativity (${creativity}/10) — Claude Sonnet for open-ended generation.`,
182
149
  rule: 'creative_claude',
183
150
  };
184
151
  }
185
152
 
186
- // Rule 7: Cost sensitive OR simple queries OR 333 foundation → Gemini Flash
153
+ // Rule 7: Simple / cost-sensitive → Gemini Flash
187
154
  if ((preferLow || complexity <= 3) && !blocked.includes('gemini_flash')) {
188
155
  return {
189
156
  model: MODEL_MATRIX.gemini_flash,
190
- reason: `${signal_code === '333' ? 'Foundation bias (333) — ' : ''}Low complexity (${complexity}/10) — Gemini 2.5 Flash delivers 95% quality at 2% of GPT-4o cost.`,
157
+ reason: `Low complexity (${complexity}/10) — Gemini Flash at $0.50/1M tokens.`,
191
158
  rule: 'cost_gemini',
192
159
  };
193
160
  }
194
161
 
195
- // Rule 8: Medium complexity → Claude Haiku (fast + cheap + capable)
162
+ // Rule 8: Medium → Claude Haiku
196
163
  if (complexity <= 5 && !blocked.includes('claude_haiku')) {
197
164
  return {
198
165
  model: MODEL_MATRIX.claude_haiku,
199
- reason: `Medium complexity (${complexity}/10) — Claude Haiku balances speed, cost, and quality.`,
166
+ reason: `Medium complexity (${complexity}/10) — Claude Haiku for speed and quality balance.`,
200
167
  rule: 'medium_haiku',
201
168
  };
202
169
  }
203
170
 
204
- // Default: Claude Sonnet (safest general choice)
171
+ // Default
205
172
  return {
206
173
  model: MODEL_MATRIX.claude_sonnet,
207
- reason: 'Default routing — Claude Sonnet for reliable, high-quality responses.',
174
+ reason: 'Default — Claude Sonnet for reliable high-quality output.',
208
175
  rule: 'default',
209
176
  };
210
177
  }
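The 0.4.0 rule chain shown in this hunk is a first-match-wins cascade: conditions are checked in order and the first one that holds returns. The sketch below restates Rules 2 through 8 as a table to make that ordering easier to see. It is an illustration only: `RULES`, `pickRule`, and the sample fingerprints are hypothetical names that do not appear in the package, `preferLow` is treated as a plain boolean input because its derivation is not visible in this hunk, and the ethics rule is left out because its condition is not shown here.

```javascript
// Hypothetical rule table mirroring Rules 2-8 of the 0.4.0 hunk.
// Each entry: rule name, the blocked-list key of its target model, and a predicate.
const RULES = [
  { rule: 'recency_grok',    model: 'grok',          when: (f) => f.recency >= 7 },
  { rule: 'context_claude',  model: 'claude_sonnet', when: (f) => f.context_load >= 8 },
  { rule: 'complex_code',    model: 'claude_sonnet', when: (f) => f.complexity >= 7 && f.domain === 'code' },
  { rule: 'complex_gpt4o',   model: 'gpt4o',         when: (f, o) => f.complexity >= 7 && !o.preferLow },
  { rule: 'creative_claude', model: 'claude_sonnet', when: (f) => f.creativity >= 5 },
  { rule: 'cost_gemini',     model: 'gemini_flash',  when: (f, o) => o.preferLow || f.complexity <= 3 },
  { rule: 'medium_haiku',    model: 'claude_haiku',  when: (f) => f.complexity <= 5 },
];

// Walk the table in order; the first unblocked rule whose predicate holds wins.
function pickRule(fingerprint, { blocked = [], preferLow = false } = {}) {
  for (const { rule, model, when } of RULES) {
    if (!blocked.includes(model) && when(fingerprint, { preferLow })) {
      return { rule, model };
    }
  }
  // Nothing matched: fall back to the default, as the hunk does.
  return { rule: 'default', model: 'claude_sonnet' };
}

// Sample fingerprints (made up for illustration).
const simple = { recency: 1, context_load: 1, complexity: 2, creativity: 1, domain: 'general' };
const creative = { recency: 2, context_load: 2, complexity: 8, creativity: 6, domain: 'general' };

console.log(pickRule(simple));                                // rule: 'cost_gemini'
console.log(pickRule(simple, { blocked: ['gemini_flash'] })); // rule: 'medium_haiku'
console.log(pickRule(creative, { preferLow: true }));         // rule: 'creative_claude'
```

The last call reflects a behavior change visible in the diff: Rule 6 no longer checks `!preferLow`, so a cost-sensitive creative query now reaches Claude Sonnet instead of falling through to Rule 7.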
211
178
 
212
179
  /**
213
- * Calculate savings vs always using GPT-4o (premium baseline)
180
+ * Estimate cost savings vs always using GPT-4o
214
181
  */
215
182
  export function calculateSavings(selectedModel, estimatedTokens = 500) {
216
183
  const premiumCost = (MODEL_MATRIX.gpt4o.costPer1M / 1_000_000) * estimatedTokens;
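The hunk ends partway through `calculateSavings`, so only the baseline cost line is visible. As a rough sketch of the arithmetic that line implies (per-token price times token count, compared against the GPT-4o baseline), the example below computes a savings estimate. Everything here is an assumption made for illustration: `PRICES`, `estimateSavings`, and the returned field names are hypothetical, and the GPT-4o price is a placeholder. Only the Gemini Flash figure of $0.50 per 1M tokens comes from the diff (the Rule 7 reason string).

```javascript
// Placeholder prices in USD per 1M tokens.
const PRICES = {
  gpt4o: { costPer1M: 5.0 },        // assumed value, not taken from the diff
  gemini_flash: { costPer1M: 0.5 }, // from the Rule 7 reason string in 0.4.0
};

// Hypothetical helper: compare the selected model's cost to the GPT-4o baseline.
function estimateSavings(selectedModel, estimatedTokens = 500) {
  // Same shape as the visible line: (price per 1M / 1,000,000) * tokens.
  const premiumCost = (PRICES.gpt4o.costPer1M / 1_000_000) * estimatedTokens;
  const selectedCost = (selectedModel.costPer1M / 1_000_000) * estimatedTokens;
  const saved = premiumCost - selectedCost;
  // Guard against a zero baseline to avoid dividing by zero.
  const percent = premiumCost > 0 ? (saved / premiumCost) * 100 : 0;
  return { premiumCost, selectedCost, saved, percent };
}

const estimate = estimateSavings(PRICES.gemini_flash, 1_000_000);
console.log(estimate.percent.toFixed(1) + '% saved under these placeholder prices');
```

The percentage depends entirely on the baseline price plugged in, so headline savings figures should be read against whatever `MODEL_MATRIX.gpt4o.costPer1M` actually is in the installed version.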
package/index.js DELETED
@@ -1,14 +0,0 @@
1
- /**
2
- * @gabrielsmartin/orbit-sdk
3
- * Intelligent AI model routing — routes every query to the optimal model in <1ms
4
- *
5
- * 777 · 555 · 333
6
- * github.com/gtllco/orbit
7
- */
8
-
9
- export { fingerprint, route, calculateSavings, MODEL_MATRIX, OrbitClient } from './src/index.js';
10
-
11
- // Default export — zero-config instance
12
- import { OrbitClient } from './src/index.js';
13
- const orbit = new OrbitClient();
14
- export default orbit;
package/src/signalBias.js DELETED
@@ -1,85 +0,0 @@
1
- /**
2
- * ORBIT Signal Layer — Semantic Intent Routing Bias
3
- *
4
- * Signal codes are semantic flags that travel with a query and adjust the routing
5
- * decision before model selection. They connect ORBIT to the organizational priority
6
- * layer (the neural hub, event priorities, etc).
7
- *
8
- * 777 · 555 · 333
9
- */
10
-
11
- /**
12
- * Apply signal-based routing bias to a fingerprint
13
- * Modifies the fingerprint scores before model selection happens
14
- *
15
- * @param {Object} fingerprint - 8-axis scores from orbitFingerprint()
16
- * @param {string} signal - '777' | '555' | '333' | null
17
- * @returns {Object} biased fingerprint
18
- */
19
- export function applySignalBias(fingerprint, signal) {
20
- if (!signal) return fingerprint;
21
-
22
- const biased = { ...fingerprint };
23
-
24
- if (signal === '777') {
25
- // COMPLETION BIAS
26
- // 777 = This output is final. Quality floor raised. Never cut corners.
27
- // - Force high-capability model floor
28
- // - Never route to sub-tier models (Flash, Mini, Haiku)
29
- // - If complexity >= 5: Claude Sonnet mandatory
30
- // - If complexity < 5: Claude Haiku minimum
31
- // - If emotional_weight >= 7: Claude always (never change this)
32
-
33
- biased.cost_tolerance = Math.max(biased.cost_tolerance, 7);
34
- biased.complexity = Math.max(biased.complexity, 5);
35
- biased.signal_applied = '777';
36
- biased.signal_reason = 'Completion bias — cost floor raised, quality mandatory';
37
- }
38
-
39
- if (signal === '555') {
40
- // VARIATION BIAS
41
- // 555 = This query is exploratory. Break the expected pattern. Surprise.
42
- // - Introduce controlled model diversity
43
- // - Prefer non-default choices
44
- // - If creativity >= 5: weight variation higher
45
- // - If recency >= 6: Perplexity-like model over Claude
46
- // - If complexity >= 6: allow GPT-4o instead of Claude
47
-
48
- biased.creativity = Math.max(biased.creativity, 5);
49
- biased.recency = Math.max(biased.recency, 4);
50
- biased.variation_mode = true;
51
- biased.signal_applied = '555';
52
- biased.signal_reason = 'Variation bias — introduce model diversity, break the pattern';
53
- }
54
-
55
- if (signal === '333') {
56
- // FOUNDATION BIAS
57
- // 333 = This is ambient/background. Strip to minimum. Cost floor.
58
- // - Aggressively route to minimum viable model
59
- // - If emotional_weight < 7: force cost_tolerance to 1 (ignore user config)
60
- // - If complexity > 5: cap it at 4 (don't overpay)
61
- // - Exception: emotional_weight >= 7 ALWAYS upgrades to Claude
62
- // (never route crisis/sensitive to cheap models, even on 333)
63
-
64
- if (biased.emotional_weight < 7) {
65
- biased.cost_tolerance = 1; // force minimum cost
66
- biased.complexity = Math.min(biased.complexity, 4); // cap complexity
67
- }
68
- biased.signal_applied = '333';
69
- biased.signal_reason = 'Foundation bias — cost floor, ambient routing';
70
- }
71
-
72
- return biased;
73
- }
74
-
75
- /**
76
- * Create signal explanation for response
77
- */
78
- export function getSignalExplanation(signal) {
79
- const explanations = {
80
- '777': 'Completion bias applied — cost floor raised, complexity floor raised. Routed to highest-capability model.',
81
- '555': 'Variation bias applied — model diversity prioritized. Unexpected routing choice for exploratory output.',
82
- '333': 'Foundation bias applied — cost floor enforced. Minimum viable model selected for ambient routing.',
83
- };
84
- return explanations[signal] || null;
85
- }
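With `signalBias.js` deleted, 0.4.0 no longer adjusts the fingerprint before routing. To show what callers lose, the sketch below reproduces the 777 and 333 branches of the removed `applySignalBias` as a standalone function. The name `biasSketch` and the sample fingerprints are hypothetical; the arithmetic is copied from the deleted file, with the 555 branch and the `signal_reason` field omitted for brevity.

```javascript
// Standalone restatement of the removed 777 / 333 adjustments (0.3.3 behavior).
function biasSketch(fingerprint, signal) {
  if (!signal) return fingerprint;
  const biased = { ...fingerprint }; // shallow copy; the input is not mutated

  if (signal === '777') {
    // Completion bias: raise the cost tolerance and complexity floors.
    biased.cost_tolerance = Math.max(biased.cost_tolerance, 7);
    biased.complexity = Math.max(biased.complexity, 5);
    biased.signal_applied = '777';
  }

  if (signal === '333') {
    // Foundation bias: force minimum cost unless the query is emotionally heavy.
    if (biased.emotional_weight < 7) {
      biased.cost_tolerance = 1;
      biased.complexity = Math.min(biased.complexity, 4);
    }
    biased.signal_applied = '333';
  }

  return biased;
}

// Sample fingerprints (made up for illustration).
const routine = { emotional_weight: 2, cost_tolerance: 8, complexity: 9 };
const sensitive = { emotional_weight: 8, cost_tolerance: 8, complexity: 9 };

console.log(biasSketch(routine, '333'));   // cost_tolerance 1, complexity 4
console.log(biasSketch(sensitive, '333')); // scores unchanged, only signal_applied added
console.log(biasSketch({ emotional_weight: 1, cost_tolerance: 3, complexity: 2 }, '777'));
```

Code written against 0.3.3 that passed a signal code and relied on these adjusted scores would need to apply an equivalent transformation itself before routing on 0.4.0, since the `signal_code` checks were also removed from the rule chain.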
File without changes