npm - @gabrielsmartin/orbit-sdk - Versions diffs - 0.1.0 → 0.1.1 - Mend

@gabrielsmartin/orbit-sdk 0.1.0 → 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +96 -134
package/package.json +1 -1

package/README.md CHANGED

@@ -1,13 +1,15 @@
-# orbit-ai
+# @gabrielsmartin/orbit-sdk
 > Stop blasting every query at GPT-4o. Route intelligently. Save 85%.
-`orbit-ai` is a drop-in routing layer that reads the fingerprint of every AI query and sends it to the optimal model — automatically, in under 1ms.
+`@gabrielsmartin/orbit-sdk` is a drop-in routing layer that reads the fingerprint of every AI query and sends it to the optimal model — automatically, in under 1ms.
 ```bash
-npm install orbit-ai
+npm install @gabrielsmartin/orbit-sdk
 ```
+**Built by [Gabriel Martin](https://www.linkedin.com/in/gabrielsmartin) · [orbit-model-flow.base44.app](https://orbit-model-flow.base44.app)**
 ---
 ## The problem
@@ -18,185 +20,145 @@ You're probably doing this:
 const res = await openai.chat.completions.create({
   model: "gpt-4o",  // $30/1M tokens — every single query
   messages
-})
+});
 ```
-GPT-4o costs **30x more per token** than Gemini Flash. For "what's 2+2?" — you're paying Ferrari prices to drive to the mailbox.
-ORBIT fixes this. One line.
+You're overpaying by 85%. "Write a haiku" does not need GPT-4o. "What is 2+2?" does not need GPT-4o. Only ~15% of real queries actually require your most expensive model.
 ---
-## How it works
+## The solution
-Every query gets fingerprinted across **8 axes** in under 1ms:
+```javascript
+import orbit from '@gabrielsmartin/orbit-sdk'
+const decision = orbit.route("write a haiku about recursion")
+// → Claude Sonnet | creative_claude | saved 50%
-| Axis | What it measures |
-|---|---|
-| **Complexity** | Depth of reasoning required |
-| **Creativity** | Open-ended vs deterministic |
-| **Emotional Weight** | Sensitivity — crisis queries always go to Claude |
-| **Recency** | Need for live/current data → Grok |
-| **Context Load** | Window size needed → Claude 200k |
-| **Speed** | Latency sensitivity |
-| **Domain** | Code · Creative · Medical · Legal · General |
-| **Cost Tolerance** | Budget tier (overridable) |
+const decision2 = orbit.route("what is 2+2?")
+// → Gemini 2.5 Flash | cost_gemini | saved 98%
-Then it routes to the right model. Invisibly.
+const decision3 = orbit.route("I've been feeling really anxious")
+// → Claude Sonnet | ethics_first | (never routes sensitive content to cheap models)
+```
 ---
-## Usage
+## How it works
-### Zero-config routing decision
+Every query is fingerprinted across **8 axes** in under 1ms:
+| Axis | What it detects |
+|------|----------------|
+| `complexity` | Depth of reasoning required |
+| `creativity` | Open-ended vs. factual generation |
+| `emotional_weight` | Sensitive or crisis content |
+| `recency` | Need for real-time / live web data |
+| `context_load` | Long-document or multi-turn depth |
+| `speed` | Latency sensitivity |
+| `domain` | Code, legal, medical, creative, general |
+| `cost_tolerance` | Budget flexibility signal |
+Then the SMM (Selective Model Matching) engine routes:
+| Signal | → Model | Why |
+|--------|---------|-----|
+| Emotional weight > 6 | Claude Sonnet | Ethics-first. Always. |
+| Domain = legal/medical | Claude Sonnet | Long-context + safety |
+| Recency > 7 | Grok | Real-time web access |
+| Creativity > 5 | Claude Sonnet | Best open-ended generation |
+| Complexity < 5 | Gemini 2.5 Flash | 98% cheaper, 95% quality |
+| Trivial query | GPT-4o Mini | 99% cheaper than GPT-4o |
-```javascript
-import orbit from 'orbit-ai'
+---
-// Get the routing decision
-const decision = orbit.route("write a haiku about recursion")
+## API
-console.log(decision.model.name)    // "Claude Sonnet"
-console.log(decision.reason)        // "High creativity score (8/10)..."
-console.log(decision.savings)       // { savings: 0.007245, reductionPct: 97 }
-```
+```javascript
+import orbit, { OrbitClient, fingerprint } from '@gabrielsmartin/orbit-sdk'
-### With your own API keys
+// Route a query
+const result = orbit.route(queryText)
+// Returns: { model, reason, savings: { reductionPct, estimatedCost, premiumCost } }
-```javascript
-import { OrbitClient } from 'orbit-ai'
+// Get session stats
+const stats = orbit.stats()
+// Returns: { queries_routed, total_savings, total_savings_formatted, breakdown }
-const orbit = new OrbitClient({
-  cost_tolerance: 'low',  // 'low' | 'medium' | 'high'
-  log: true,              // logs routing decisions to console
+// Custom config
+const client = new OrbitClient({
+  default_model: 'claude_sonnet',
+  blocked_models: ['gpt4o'],  // never route here
+  always_log: true,
 })
-// Route the query
-const { model, reason, savings } = orbit.route("explain blockchain simply")
-// [ORBIT] → Gemini 2.5 Flash | cost_gemini | saved $0.01455 (97% reduction)
-// Now call the model yourself with your keys
-// model.id = 'gemini-2.5-flash', model.provider = 'google'
+// Raw fingerprint
+const fp = fingerprint("write a poem about loss")
+// Returns: { complexity, creativity, emotional_weight, recency, context_load, speed, domain, cost_tolerance }
 ```
-### Full pipeline example (with Anthropic SDK)
+---
+## Results
-```javascript
-import { OrbitClient } from 'orbit-ai'
-import Anthropic from '@anthropic-ai/sdk'
-import OpenAI from 'openai'
-import { GoogleGenerativeAI } from '@google/generative-ai'
-const orbit = new OrbitClient({ log: true })
-async function smartQuery(text) {
-  const { model, reason } = orbit.route(text)
-  if (model.provider === 'anthropic') {
-    const client = new Anthropic()
-    return client.messages.create({
-      model: model.id,
-      max_tokens: 1024,
-      messages: [{ role: 'user', content: text }]
-    })
-  }
-  if (model.provider === 'openai') {
-    const client = new OpenAI()
-    return client.chat.completions.create({
-      model: model.id,
-      messages: [{ role: 'user', content: text }]
-    })
-  }
-  if (model.provider === 'google') {
-    const client = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY)
-    const genModel = client.getGenerativeModel({ model: model.id })
-    return genModel.generateContent(text)
-  }
-}
-// Routes each query to the optimal model
-await smartQuery("write a poem about the ocean")           // → Claude Sonnet
-await smartQuery("what's the latest news on AI funding?")  // → Grok
-await smartQuery("what is 2+2")                            // → Gemini Flash
-await smartQuery("I've been feeling really overwhelmed")   // → Claude Sonnet (ethics-first)
 ```
+$ node --input-type=module << 'EOF'
+import orbit from '@gabrielsmartin/orbit-sdk'
-### Session stats
+orbit.route("write a haiku about recursion")
+// [ORBIT] → Claude Sonnet | creative_claude | saved $0.00750 (50% reduction)
-```javascript
-const stats = orbit.stats()
-console.log(stats.total_savings_formatted)  // "$0.2341"
-console.log(stats.model_usage)              // { "Claude Sonnet": 4, "Gemini 2.5 Flash": 12, ... }
-```
+orbit.route("what is 2+2?")
+// [ORBIT] → Gemini 2.5 Flash | cost_gemini | saved $0.01475 (98% reduction)
-### Fingerprint only (no routing)
+orbit.route("architect a distributed cache for 10M users")
+// [ORBIT] → Claude Sonnet | default | saved $0.00750 (50% reduction)
-```javascript
-import { fingerprint } from 'orbit-ai'
-const scores = fingerprint("architect a distributed caching system for 10M users")
-// {
-//   complexity: 9,
-//   creativity: 0,
-//   domain: 'code',
-//   emotional_weight: 0,
-//   recency: 0,
-//   context_load: 3,
-//   ...
-// }
+console.log(orbit.stats())
+// { queries_routed: 3, total_savings_formatted: '$0.0298' }
+EOF
 ```
----
+Validated by **RouteLLM (UC Berkeley / ICLR 2025)**: intelligent routing achieves **85% cost reduction** while maintaining **95% of GPT-4o quality**.
-## Model matrix
+---
-| Model | Provider | Cost/1M | Best for |
-|---|---|---|---|
-| Claude Sonnet | Anthropic | $15 | Complex reasoning · Ethics · Long context |
-| Claude Haiku | Anthropic | $1 | Speed · Summaries · Medium tasks |
-| Gemini 2.5 Flash | Google | $0.50 | High volume · Simple queries · Cost |
-| GPT-4o | OpenAI | $30 | Structured output · Broad knowledge |
-| GPT-4o Mini | OpenAI | $0.30 | Classification · Filler tasks |
-| Grok | xAI | $10 | Trending · Real-time web |
+## Hosted API (coming soon)
----
+The SDK routes decisions client-side (no API key needed, zero latency). A **hosted API with analytics dashboard** is coming for teams that want:
-## The math
+- Cross-session savings tracking
+- Routing policy editor
+- Team analytics
+- A/B cost testing
+- Custom model matrix
-Validated by [RouteLLM (UC Berkeley · ICLR 2025)](https://arxiv.org/abs/2406.18665): intelligent routing achieves **85% cost reduction** while maintaining 95% of quality.
+**[Join the waitlist →](https://orbit-model-flow.base44.app/#waitlist)**
-For a team running 100k queries/month at GPT-4o:
-- Without ORBIT: **$1,500/month**
-- With ORBIT: **~$225/month**
-- Savings: **$1,275/month · $15,300/year**
+Free tier: 100 queries/day · Pro: $19/mo unlimited · Team: $99/mo
 ---
-## Live demo
+## Research backing
-[orbit-model-flow.base44.app](https://orbit-model-flow.base44.app) — route a real query, see the fingerprint radar, watch the savings counter.
+- **RouteLLM** — UC Berkeley / ICLR 2025: *"Routing between weak and strong LLMs reduces costs by 85% while maintaining 95% quality."*
+- **OpenRouter** ($500M+ valuation) proves the market exists. ORBIT adds the intelligence layer they're missing.
+- **Martian** (Accenture-backed) proves enterprises pay for routing. ORBIT is the frictionless version.
 ---
 ## Roadmap
-- [x] 8-axis fingerprinting engine
-- [x] 6-model routing matrix
-- [x] TypeScript types
-- [ ] Streaming support
-- [ ] Custom model matrix (bring your own models)
-- [ ] Automatic provider failover
-- [ ] Usage analytics dashboard integration
-- [ ] Browser extension
+- [x] v0.1.0 — 8-axis fingerprinting + 6-model routing matrix
+- [ ] v0.2.0 — Hosted API with API key auth + rate limiting
+- [ ] v0.3.0 — Analytics dashboard
+- [ ] v0.4.0 — Chrome extension
+- [ ] v1.0.0 — Enterprise API + savings-share pricing
 ---
 ## License
-MIT · Built by [Gabriel Martin](https://github.com/gabrielsmartin)
-*"Every model has a gravitational pull. ORBIT decides which one you need."*
+MIT © [Gabriel Martin](https://www.linkedin.com/in/gabrielsmartin)
-777 · 555 · 333
+**[Live demo](https://orbit-model-flow.base44.app) · [GitHub](https://github.com/gabrielsmartin/orbit) · [npm](https://www.npmjs.com/package/@gabrielsmartin/orbit-sdk) · [LinkedIn](https://www.linkedin.com/in/gabrielsmartin)**
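
Review note on the 0.1.1 README: the routing table and the savings figures logged in the Results block can be cross-checked with a small sketch. This is not the SDK's source. It assumes the table rows are evaluated top to bottom, takes per-million-token prices from the model matrix that the 0.1.0 README listed (GPT-4o $30, Claude Sonnet $15, Grok $10, Gemini 2.5 Flash $0.50), and infers a 500-token query size because that value reproduces the logged numbers. The function names `routeFromFingerprint`, `estimateSavings`, and `describe`, the rule labels `domain_safety` and `recency`, and the hand-written fingerprints are all hypothetical. The "Trivial query" row is left out because the README gives no threshold for it.

```javascript
// Sketch only: a hypothetical re-implementation of the README's routing table,
// not the code shipped in @gabrielsmartin/orbit-sdk.

// USD per 1M tokens, taken from the model matrix in the 0.1.0 README.
const PRICE_PER_MILLION = {
  'GPT-4o': 30,
  'Claude Sonnet': 15,
  'Grok': 10,
  'Gemini 2.5 Flash': 0.5,
};

// Assumption: 500 tokens per query, inferred from the logged savings figures.
const ASSUMED_TOKENS = 500;

// Assumption: rows of the routing table are checked in the order listed.
function routeFromFingerprint(fp) {
  if (fp.emotional_weight > 6) return { model: 'Claude Sonnet', rule: 'ethics_first' };
  if (fp.domain === 'legal' || fp.domain === 'medical') {
    return { model: 'Claude Sonnet', rule: 'domain_safety' }; // hypothetical label
  }
  if (fp.recency > 7) return { model: 'Grok', rule: 'recency' }; // hypothetical label
  if (fp.creativity > 5) return { model: 'Claude Sonnet', rule: 'creative_claude' };
  if (fp.complexity < 5) return { model: 'Gemini 2.5 Flash', rule: 'cost_gemini' };
  return { model: 'Claude Sonnet', rule: 'default' };
}

// Savings are the list-price difference against GPT-4o at a fixed query size.
function estimateSavings(modelName, tokens = ASSUMED_TOKENS) {
  const premiumCost = (PRICE_PER_MILLION['GPT-4o'] * tokens) / 1e6;
  const estimatedCost = (PRICE_PER_MILLION[modelName] * tokens) / 1e6;
  const saved = premiumCost - estimatedCost;
  const reductionPct = Math.round((saved / premiumCost) * 100);
  return { premiumCost, estimatedCost, saved, reductionPct };
}

// Formats a decision in the same shape as the README's [ORBIT] log lines.
function describe(fp) {
  const { model, rule } = routeFromFingerprint(fp);
  const { saved, reductionPct } = estimateSavings(model);
  return `[ORBIT] → ${model} | ${rule} | saved $${saved.toFixed(5)} (${reductionPct}% reduction)`;
}

// Hand-written fingerprints standing in for fingerprint() output.
const haiku = { complexity: 3, creativity: 8, emotional_weight: 0, recency: 0, domain: 'creative' };
const arithmetic = { complexity: 1, creativity: 0, emotional_weight: 0, recency: 0, domain: 'general' };

console.log(describe(haiku));
// [ORBIT] → Claude Sonnet | creative_claude | saved $0.00750 (50% reduction)
console.log(describe(arithmetic));
// [ORBIT] → Gemini 2.5 Flash | cost_gemini | saved $0.01475 (98% reduction)
```

Under these assumptions the two printed lines match the first two log lines in the Results block. That suggests the reported savings are list-price differences against GPT-4o at a fixed query size rather than measured spend, though the README does not say so.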

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@gabrielsmartin/orbit-sdk",
-  "version": "0.1.0",
+  "version": "0.1.1",
   "description": "Intelligent AI model routing. Drop-in replacement for OpenAI/Anthropic. Routes every query to the optimal model automatically.",
   "type": "module",
   "main": "src/index.js",