@jz92/ai-provider 0.2.0

package/README.md ADDED
@@ -0,0 +1,367 @@
1
+ # @jz92/ai-provider
2
+
3
+ A zero-config AI routing layer for Node.js and Next.js projects.
4
+
5
+ Import one function — get Ollama locally and Anthropic/OpenAI/Groq in production, automatically, based on `NODE_ENV`. No provider-switching logic in your feature code, ever.
6
+
7
+ ---
8
+
9
+ ## What this is
10
+
11
+ When building AI-powered features, you typically want:
12
+ - **Local dev** → free, fast, no API key, works offline
13
+ - **CI** → real API, cheapest model, minimal tokens
14
+ - **Production** → best model, prompt caching, cost-optimised
15
+
16
+ This package handles that routing. You write `generateStructured()` once — the environment decides which provider runs it.
17
+
18
+ ## What this is not
19
+
20
+ - Not an agent framework
21
+ - Not a coding assistant or CLI tool
22
+ - Not something that manages Ollama for you
23
+
24
+ You bring Ollama. This package talks to it.
25
+
26
+ ---
27
+
28
+ ## Prerequisites
29
+
30
+ For local development you need Ollama installed and running on your machine.
31
+
32
+ ```bash
33
+ # Install (macOS)
34
+ brew install ollama
35
+
36
+ # Start as a background service
37
+ brew services start ollama
38
+
39
+ # Pull a model (one-time, ~9GB)
40
+ ollama pull qwen2.5-coder:14b
41
+ ```
42
+
43
+ Verify it's running:
44
+ ```bash
45
+ curl http://localhost:11434 # should return: Ollama is running
46
+ ```
47
+
48
+ For production (Vercel, AWS, etc.) you only need an API key from your chosen provider — no Ollama required.
49
+
50
+ ---
51
+
52
+ ## Installation
53
+
54
+ ```bash
55
+ npm install @jz92/ai-provider
56
+ ```
57
+
58
+ After install, a setup guide prints automatically telling you exactly which peer deps to install based on the providers you want to use. The short version:
59
+
60
+ ```bash
61
+ # Always required
62
+ npm install ai zod
63
+
64
+ # Local dev (free, no API key)
65
+ npm install ollama-ai-provider
66
+
67
+ # Cloud — install only the provider(s) you use
68
+ npm install @ai-sdk/anthropic # → ANTHROPIC_API_KEY
69
+ npm install @ai-sdk/openai # → OPENAI_API_KEY
70
+ npm install @ai-sdk/google # → GOOGLE_GENERATIVE_AI_API_KEY
71
+ npm install @ai-sdk/groq # → GROQ_API_KEY
72
+ npm install @ai-sdk/mistral # → MISTRAL_API_KEY
73
+ ```
74
+
75
+ **Only install adapters for providers you actually use.** Unused ones are never loaded — the package uses dynamic imports so missing adapters don't cause errors unless you try to use them.
76
+
77
+ **Switching providers later is one env var change** — `AI_PROVIDER=openai` — no code changes needed in your feature files.
78
+
79
+ ---
80
+
81
+ ## Usage
82
+
83
+ ```typescript
84
+ import { generateStructured, generatePlainText } from '@jz92/ai-provider'
85
+ import { z } from 'zod'
86
+
87
+ // Structured output — returns validated, typed JSON
88
+ const result = await generateStructured({
89
+ systemPrompt: 'Extract data. Respond in JSON only.',
90
+ prompt: 'My name is Jithin and I live in Maidenhead.',
91
+ schema: z.object({ name: z.string(), city: z.string() }),
92
+ cacheKey: 'extract:jithin-maidenhead', // optional: repeat calls with the same key skip the API
93
+ })
94
+
95
+ console.log(result.data) // { name: 'Jithin', city: 'Maidenhead' }
96
+ console.log(result.provider) // 'ollama' locally · 'anthropic' in prod
97
+ console.log(result.fromCache) // true on cache hit
98
+
99
+ // Plain text output
100
+ const summary = await generatePlainText({
101
+ systemPrompt: 'You are a helpful assistant.',
102
+ prompt: 'Summarise this in one sentence...',
103
+ })
104
+ ```
105
+
106
+ Your code is identical in every environment. The provider switches automatically.
107
+
108
+ ---
109
+
110
+ ## How routing works
111
+
112
+ | `NODE_ENV` | Provider | Model | Cost |
113
+ |---|---|---|---|
114
+ | `development` | Ollama (local) | `qwen2.5-coder:14b` | $0 |
115
+ | `test` / CI | Anthropic | `claude-haiku-4-5` | ~$0.001/req |
116
+ | `production` | Anthropic | `claude-sonnet-4-6` | ~$0.03/req |
117
+
118
+ Override anything with env vars:
119
+
120
+ ```bash
121
+ # Force a specific provider
122
+ AI_PROVIDER=openai npm run dev
123
+
124
+ # Force a specific model
125
+ AI_MODEL=gpt-4o npm run dev
126
+
127
+ # Use a custom Ollama model variant
128
+ OLLAMA_MODEL=my-custom-model npm run dev
129
+ ```
130
+
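The routing table and overrides above can be sketched as a small pure function. This is a minimal illustration, not the package's actual resolver: `resolveRoute`, `Route`, and `parseProvider` are hypothetical names, and the environment is passed in as a plain object instead of being read from `process.env`. Two behaviours are assumptions the README does not state: any `NODE_ENV` other than `test` or `production` is treated as `development`, and forcing a provider without `AI_MODEL` leaves the model undefined, because per-provider default models are not documented here.

```typescript
// Illustrative sketch only; not the package's actual resolver.
type Provider = 'ollama' | 'anthropic' | 'openai' | 'google' | 'groq' | 'mistral'
type Env = Record<string, string | undefined>

interface Route {
  provider: Provider
  model: string | undefined // undefined = no default documented for this provider
}

const PROVIDERS: readonly Provider[] = ['ollama', 'anthropic', 'openai', 'google', 'groq', 'mistral']
const DEFAULT_OLLAMA_MODEL = 'qwen2.5-coder:14b'

// Validate AI_PROVIDER against the documented provider names.
function parseProvider(value: string | undefined): Provider | undefined {
  if (value === undefined || value === '') return undefined
  for (const candidate of PROVIDERS) {
    if (candidate === value) return candidate
  }
  throw new Error(`Unknown AI_PROVIDER "${value}"`)
}

// Defaults from the routing table, keyed on NODE_ENV.
function defaultRoute(env: Env): Route {
  switch (env.NODE_ENV ?? 'development') {
    case 'production':
      return { provider: 'anthropic', model: 'claude-sonnet-4-6' }
    case 'test':
      return { provider: 'anthropic', model: 'claude-haiku-4-5' }
    default: // assumption: anything else is treated as development
      return { provider: 'ollama', model: env.OLLAMA_MODEL ?? DEFAULT_OLLAMA_MODEL }
  }
}

function resolveRoute(env: Env): Route {
  const base = defaultRoute(env)
  const provider = parseProvider(env.AI_PROVIDER) ?? base.provider

  // Assumption: an explicit AI_MODEL wins over every other setting.
  if (env.AI_MODEL !== undefined) return { provider, model: env.AI_MODEL }
  if (provider === base.provider) return base
  if (provider === 'ollama') return { provider, model: env.OLLAMA_MODEL ?? DEFAULT_OLLAMA_MODEL }
  return { provider, model: undefined }
}
```

Under these assumptions, `resolveRoute({ NODE_ENV: 'production', AI_PROVIDER: 'openai', AI_MODEL: 'gpt-4o' })` yields the OpenAI route with `gpt-4o`, while an empty environment yields the local Ollama route.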
131
+ ---
132
+
133
+ ## Setting your API key
134
+
135
+ Keys are read from environment variables at runtime. The package never logs or stores them.
136
+
137
+ ### Local dev — no key needed
138
+ Ollama runs entirely on your machine. `NODE_ENV` defaults to `development`, so setting it explicitly is optional.
139
+
140
+ ```bash
141
+ # .env.development
142
+ NODE_ENV=development
143
+ OLLAMA_BASE_URL=http://localhost:11434
144
+ OLLAMA_MODEL=qwen2.5-coder:14b
145
+ AI_LOG_USAGE=true
146
+ ```
147
+
148
+ ### Production — Vercel
149
+ Set one environment variable in your Vercel project dashboard:
150
+ ```
151
+ ANTHROPIC_API_KEY = sk-ant-...
152
+ ```
153
+ `NODE_ENV=production` is set automatically by Vercel. Done.
154
+
155
+ ### Production — AWS (ECS / Lambda / EC2)
156
+ ```bash
157
+ ANTHROPIC_API_KEY=sk-ant-...
158
+ NODE_ENV=production
159
+ ```
160
+
161
+ ### CI — GitHub Actions
162
+ ```yaml
163
+ env:
164
+ NODE_ENV: test
165
+ ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
166
+ ```
167
+
168
+ ### Supported provider keys
169
+
170
+ | Provider | Env var |
171
+ |---|---|
172
+ | Anthropic | `ANTHROPIC_API_KEY` |
173
+ | OpenAI | `OPENAI_API_KEY` |
174
+ | Google | `GOOGLE_GENERATIVE_AI_API_KEY` |
175
+ | Groq | `GROQ_API_KEY` |
176
+ | Mistral | `MISTRAL_API_KEY` |
177
+
178
+ ---
179
+
180
+ ## Environment variables
181
+
182
+ | Variable | Default | Description |
183
+ |---|---|---|
184
+ | `NODE_ENV` | `development` | Drives provider selection |
185
+ | `AI_PROVIDER` | — | Force a provider: `ollama`, `anthropic`, `openai`, `google`, `groq`, `mistral` |
186
+ | `AI_MODEL` | — | Force a specific model string |
187
+ | `AI_LOG_USAGE` | `false` | Log provider, model, and token usage to console |
188
+ | `AI_TIMEOUT_MS` | `60000` (Ollama) / `30000` (cloud) | Request timeout in ms |
189
+ | `AI_CACHE_MAX_SIZE` | `500` | Max in-memory cache entries |
190
+ | `AI_CACHE_TTL_MS` | `300000` (5 min) | Cache entry TTL |
191
+ | `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama host |
192
+ | `OLLAMA_MODEL` | `qwen2.5-coder:14b` | Local model name |
193
+
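To make `AI_CACHE_MAX_SIZE` and `AI_CACHE_TTL_MS` concrete, here is a minimal sketch of a bounded in-memory cache with per-entry expiry. It is not the package's implementation: `BoundedTtlCache`, `intFromEnv`, and `cacheFromEnv` are hypothetical names, the eviction policy (drop the oldest inserted entry when full) is an assumption, and the clock is injectable only so the behaviour can be checked deterministically.

```typescript
// Illustrative sketch only; the package's real cache is not shown in this README.
type Env = Record<string, string | undefined>

interface CacheEntry<T> {
  value: T
  expiresAt: number
}

class BoundedTtlCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>()
  private readonly maxSize: number
  private readonly ttlMs: number
  private readonly now: () => number

  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize
    this.ttlMs = ttlMs
    this.now = now
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key)
    if (entry === undefined) return undefined
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key) // expired: drop it and report a miss
      return undefined
    }
    return entry.value
  }

  set(key: string, value: T): void {
    this.entries.delete(key) // re-inserting moves the key to the newest position
    if (this.entries.size >= this.maxSize) {
      // Map iterates in insertion order, so the first key is the oldest entry.
      const oldest = this.entries.keys().next()
      if (!oldest.done) this.entries.delete(oldest.value)
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }

  get size(): number {
    return this.entries.size
  }
}

// Parse a positive integer env var, falling back to the documented default.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const parsed = Number.parseInt(env[name] ?? '', 10)
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback
}

function cacheFromEnv<T>(env: Env, now?: () => number): BoundedTtlCache<T> {
  return new BoundedTtlCache<T>(
    intFromEnv(env, 'AI_CACHE_MAX_SIZE', 500),
    intFromEnv(env, 'AI_CACHE_TTL_MS', 300000),
    now,
  )
}
```

A gateway built this way would call `get(cacheKey)` before contacting the provider and `set(cacheKey, result)` after a successful response.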
194
+ ---
195
+
196
+ ## Architecture
197
+
198
+ ```mermaid
199
+ flowchart TD
200
+ A["Your feature code\ngenerateStructured() · generatePlainText()"]
201
+ B["Gateway\ncache · token guard · retry · timeout"]
202
+ C["Provider resolver\nreads NODE_ENV + overrides"]
203
+
204
+ D["development\nNODE_ENV=development"]
205
+ E["test / CI\nNODE_ENV=test"]
206
+ F["production\nNODE_ENV=production"]
207
+
208
+ G["Ollama\nlocalhost:11434 · free"]
209
+ H["Anthropic · OpenAI\nGoogle · Groq · Mistral"]
210
+
211
+ A --> B --> C
212
+ C --> D --> G
213
+ C --> E --> H
214
+ C --> F --> H
215
+
216
+ style A fill:#F1EFE8,stroke:#5F5E5A
217
+ style B fill:#EEEDFE,stroke:#534AB7
218
+ style C fill:#EEEDFE,stroke:#534AB7
219
+ style D fill:#E1F5EE,stroke:#0F6E56
220
+ style E fill:#FAEEDA,stroke:#854F0B
221
+ style F fill:#FAECE7,stroke:#993C1D
222
+ style G fill:#E1F5EE,stroke:#0F6E56
223
+ style H fill:#FAECE7,stroke:#993C1D
224
+ ```
225
+
226
+ ---
227
+
228
+ ## What's included in the gateway
229
+
230
+ Every request passes through the gateway regardless of provider:
231
+
232
+ - **Response cache** — same `cacheKey` skips the API entirely. Bounded at 500 entries with a 5 min TTL by default; configurable via `AI_CACHE_MAX_SIZE` and `AI_CACHE_TTL_MS`.
233
+ - **Token budget guard** — estimates input size and throws before the API call if it exceeds the limit. Set `maxInputTokens` per call.
234
+ - **Smart retry** — retries only transient errors (rate limit, server error, timeout). Never retries auth or billing failures — those won't recover and would waste money.
235
+ - **Hard timeout** — 60s for Ollama (model load time), 30s for cloud. Override with `AI_TIMEOUT_MS`.
236
+ - **Prompt caching** — automatically enabled for Anthropic in production. Marks the system prompt for server-side caching, which can cut the cost of the cached input tokens by up to ~90% on repeat calls.
237
+
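The token budget guard in the list above can be pictured as a cheap pre-flight check. The sketch below is an assumption-laden illustration, not the package's code: the README does not say how input size is estimated, so this uses a rough four-characters-per-token heuristic, and `estimateTokens`, `assertWithinBudget`, and `TokenBudgetError` are hypothetical names. Only the `maxInputTokens` option and the `TOKEN_BUDGET` code come from this document.

```typescript
// Illustrative sketch only; the package's real estimator is not documented here.
const CHARS_PER_TOKEN = 4 // assumption: rough average for English text

class TokenBudgetError extends Error {
  readonly code = 'TOKEN_BUDGET'
  readonly estimated: number
  readonly limit: number

  constructor(estimated: number, limit: number) {
    super(`Estimated input of ~${estimated} tokens exceeds maxInputTokens (${limit}).`)
    this.name = 'TokenBudgetError'
    this.estimated = estimated
    this.limit = limit
  }
}

// Estimate tokens from character count; deliberately crude and provider-agnostic.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// Throw before any network call if the estimated input exceeds the budget.
function assertWithinBudget(systemPrompt: string, prompt: string, maxInputTokens: number): number {
  const estimated = estimateTokens(systemPrompt) + estimateTokens(prompt)
  if (estimated > maxInputTokens) throw new TokenBudgetError(estimated, maxInputTokens)
  return estimated
}
```

Because the check runs before the provider is contacted, an oversized prompt fails fast and costs nothing. A real tokenizer would give tighter estimates at the price of an extra dependency.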
238
+ ---
239
+
240
+ ## Custom Ollama model variants
241
+
242
+ You can bake your system prompt into a named local model using an Ollama `Modelfile`. This mirrors what prompt caching does in production — the stable context is paid once, not on every request.
243
+
244
+ ```dockerfile
245
+ # modelfiles/Modelfile.my-feature
246
+ FROM qwen2.5-coder:14b
247
+
248
+ SYSTEM """
249
+ Your stable system prompt here.
250
+ Respond only in JSON.
251
+ """
252
+
253
+ PARAMETER temperature 0.1
254
+ PARAMETER num_predict 1024
255
+ ```
256
+
257
+ ```bash
258
+ ollama create my-feature -f modelfiles/Modelfile.my-feature
259
+ ```
260
+
261
+ ```bash
262
+ # .env.development
263
+ OLLAMA_MODEL=my-feature
264
+ ```
265
+
266
+ A `Modelfile.template` is included at `node_modules/@jz92/ai-provider/modelfiles-template/Modelfile.template`.
267
+
268
+ ---
269
+
270
+ ## Security
271
+
272
+ This package reads API keys from environment variables and passes them directly to the provider SDK over HTTPS. Keys are never logged, stored, or transmitted by this package.
273
+
274
+ Your responsibilities as a consumer:
275
+
276
+ - Never commit `.env` or `.env.local` — add both to `.gitignore`
277
+ - Never log `process.env` in application code
278
+ - Use `.env.example` with placeholder values for documentation
279
+ - Use deployment secrets (Vercel dashboard / AWS Secrets Manager) in production
280
+ - Rotate keys immediately if accidentally exposed
281
+
282
+ ---
283
+
284
+ ## Error handling
285
+
286
+ The package throws `AIProviderError` with a typed `code` and a clear, actionable message. You never see raw SDK errors.
287
+
288
+ ```typescript
289
+ import { generateStructured, AIProviderError } from '@jz92/ai-provider'
290
+
291
+ try {
292
+ const result = await generateStructured({ ... })
293
+ } catch (err) {
294
+ if (err instanceof AIProviderError) {
295
+ console.error(err.code) // 'AUTH_ERROR' | 'BILLING_ERROR' | 'RATE_LIMIT' | etc.
296
+ console.error(err.message) // tells you exactly what to do
297
+ }
298
+ }
299
+ ```
300
+
301
+ ### Error codes
302
+
303
+ | Code | Cause | Retried? |
304
+ |---|---|---|
305
+ | `AUTH_ERROR` | Missing or invalid API key | No |
306
+ | `BILLING_ERROR` | No credits / quota exceeded | No |
307
+ | `RATE_LIMIT` | Too many requests (429) | Yes — with backoff |
308
+ | `SERVER_ERROR` | Provider 5xx error | Yes — with backoff |
309
+ | `TIMEOUT` | Request exceeded `AI_TIMEOUT_MS` | Yes — once |
310
+ | `MODEL_NOT_FOUND` | Model not pulled locally | No |
311
+ | `TOKEN_BUDGET` | Input exceeded `maxInputTokens` | No |
312
+ | `SCHEMA_VALIDATION` | Output did not match Zod schema | No |
313
+
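The "Retried?" column above amounts to a small decision table. The following sketch shows one way to encode it. It is hypothetical: `nextRetryDelayMs` and the constants are invented for illustration, and the retry count of 3, the 500 ms base delay, the 8 s cap, and the immediate single retry for `TIMEOUT` are assumptions, since the README only says "with backoff" and "once".

```typescript
// Illustrative sketch only; retry counts and delays are assumptions.
type ErrorCode =
  | 'AUTH_ERROR'
  | 'BILLING_ERROR'
  | 'RATE_LIMIT'
  | 'SERVER_ERROR'
  | 'TIMEOUT'
  | 'MODEL_NOT_FOUND'
  | 'TOKEN_BUDGET'
  | 'SCHEMA_VALIDATION'

// Maximum retries per code; 0 means the error is never retried.
const MAX_RETRIES: Record<ErrorCode, number> = {
  AUTH_ERROR: 0,
  BILLING_ERROR: 0,
  RATE_LIMIT: 3, // assumption: the table only says "with backoff"
  SERVER_ERROR: 3, // assumption: the table only says "with backoff"
  TIMEOUT: 1, // "Yes, once" in the table
  MODEL_NOT_FOUND: 0,
  TOKEN_BUDGET: 0,
  SCHEMA_VALIDATION: 0,
}

const BASE_DELAY_MS = 500
const MAX_DELAY_MS = 8000

// Returns the delay before the next attempt, or null when the caller should give up.
function nextRetryDelayMs(code: ErrorCode, retriesSoFar: number): number | null {
  if (retriesSoFar >= MAX_RETRIES[code]) return null
  if (code === 'TIMEOUT') return 0 // assumption: the single timeout retry is immediate
  // Exponential backoff: 500 ms, 1 s, 2 s, ... capped at MAX_DELAY_MS.
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** retriesSoFar)
}
```

A retry loop would call this after each failure, sleep for the returned delay, and rethrow the error as soon as it returns `null`. Keeping the policy in a pure function makes it easy to test without a network.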
314
+ ### Ollama not running
315
+
316
+ ```
317
+ [ai-provider] Ollama is not reachable at http://localhost:11434.
318
+
319
+ Start Ollama: brew services start ollama
320
+ Or (foreground): ollama serve
321
+
322
+ To use a cloud provider instead:
323
+ Set AI_PROVIDER=anthropic (and ANTHROPIC_API_KEY) in your .env
324
+ Or: AI_PROVIDER=openai (and OPENAI_API_KEY)
325
+ Or: AI_PROVIDER=groq (and GROQ_API_KEY — free tier available)
326
+ ```
327
+
328
+ ### API key not set
329
+
330
+ ```
331
+ [ai-provider] ANTHROPIC_API_KEY is not set.
332
+
333
+ 1. Install the SDK: npm install @ai-sdk/anthropic
334
+ 2. Set the key:
335
+ Local: add ANTHROPIC_API_KEY=<your-key> to .env.local
336
+ Vercel: Project Settings → Environment Variables
337
+ AWS: task definition or Secrets Manager
338
+ GitHub CI: repo secrets → ${{ secrets.ANTHROPIC_API_KEY }}
339
+
340
+ Get a key at: https://console.anthropic.com
341
+ ```
342
+
343
+ ---
344
+
345
+ ## Running tests
346
+
347
+ ```bash
348
+ # Requires Ollama running with qwen2.5-coder:14b pulled
349
+ npm test
350
+ ```
351
+
352
+ Expected: 23 passed.
353
+
354
+ ---
355
+
356
+ ## Publishing
357
+
358
+ ```bash
359
+ npm run build
360
+ npm publish --access public
361
+ ```
362
+
363
+ ---
364
+
365
+ ## Repo
366
+
367
+ [github.com/jithinjohnzachariah92/ai-provider](https://github.com/jithinjohnzachariah92/ai-provider)