@petrgrishin/ai-sdk-ollama 3.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md ADDED
@@ -0,0 +1,1008 @@
1
+ # AI SDK Ollama Provider
2
+
3
+ [![npm version](https://badge.fury.io/js/ai-sdk-ollama.svg)](https://badge.fury.io/js/ai-sdk-ollama)
4
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5.9+-blue.svg)](https://www.typescriptlang.org/)
5
+ [![Node.js](https://img.shields.io/badge/Node.js-22+-green.svg)](https://nodejs.org/)
6
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
7
+
8
+ A Vercel AI SDK v6 provider for Ollama, built on the official `ollama` package. Type-safe and future-proof, with cross-provider compatibility and native Ollama features.
9
+
10
+ > **📌 Version Compatibility**: This version (v3+) requires AI SDK v6. If you're using AI SDK v5, please use `ai-sdk-ollama@^2.0.0` instead.
11
+
12
+ ## Quick Start
13
+
14
+ ```bash
15
+ npm install ai-sdk-ollama ai@^6.0.0
16
+ ```
17
+
18
+ ```typescript
19
+ import { ollama } from 'ai-sdk-ollama';
20
+ import { generateText } from 'ai';
21
+
22
+ // Works in both Node.js and browsers
23
+ const { text } = await generateText({
24
+   model: ollama('llama3.2'),
25
+   prompt: 'Write a haiku about coding',
26
+   temperature: 0.8,
27
+ });
28
+
29
+ console.log(text);
30
+ ```
31
+
32
+ ## Why Choose AI SDK Ollama?
33
+
34
+ - ✅ **Solves tool calling problems** - Response synthesis for reliable tool execution
35
+ - ✅ **Enhanced wrapper functions** - `generateText` and `streamText` wrappers guarantee complete responses
36
+ - ✅ **Built-in reliability** - Default reliability features enabled automatically
37
+ - ✅ **Automatic JSON repair** - Fixes 14+ types of malformed JSON from LLM outputs (trailing commas, comments, URLs, Python constants, etc.)
38
+ - ✅ **Web search and fetch tools** - Powered by [Ollama's web search API](https://ollama.com/blog/web-search), ideal for fetching current information and reducing hallucinations
39
+ - ✅ **Type-safe** - Full TypeScript support with strict typing
40
+ - ✅ **Cross-environment** - Works in Node.js and browsers automatically
41
+ - ✅ **Native Ollama power** - Access advanced features like `mirostat`, `repeat_penalty`, `num_ctx`
42
+ - ✅ **Production ready** - Handles the core Ollama limitations other providers struggle with
43
+
44
+ ## Enhanced Tool Calling
45
+
46
+ > **🚀 The Problem We Solve**: Standard Ollama providers often execute tools but return empty responses. Our enhanced functions guarantee complete, useful responses every time.
47
+
48
+ ```typescript
49
+ import { ollama, generateText, streamText } from 'ai-sdk-ollama';
50
+
51
+ // ✅ Enhanced generateText - guaranteed complete responses
52
+ const { text } = await generateText({
53
+   model: ollama('llama3.2'),
54
+   tools: {
55
+     /* your tools */
56
+   },
57
+   prompt: 'Use the tools and explain the results',
58
+ });
59
+
60
+ // ✅ Enhanced streaming - tool-aware streaming
61
+ const { textStream } = await streamText({
62
+   model: ollama('llama3.2'),
63
+   tools: {
64
+     /* your tools */
65
+   },
66
+   prompt: 'Stream with tools',
67
+ });
68
+ ```
69
+
70
+ ## Web Search Tools
71
+
72
+ > **🌐 New in v0.9.0**: Built-in web search and fetch tools powered by [Ollama's web search API](https://ollama.com/blog/web-search). Perfect for getting current information and reducing hallucinations.
73
+
74
+ ```typescript
75
+ import { generateText } from 'ai';
76
+ import { ollama } from 'ai-sdk-ollama';
77
+
78
+ // 🔍 Web search for current information
79
+ const { text } = await generateText({
80
+   model: ollama('qwen3-coder:480b-cloud'), // Cloud models recommended for web search
81
+   prompt: 'What are the latest developments in AI this week?',
82
+   tools: {
83
+     webSearch: ollama.tools.webSearch({ maxResults: 5 }),
84
+   },
85
+ });
86
+
87
+ // 📄 Fetch specific web content
88
+ const { text: summary } = await generateText({
89
+   model: ollama('gpt-oss:120b-cloud'),
90
+   prompt: 'Summarize this article: https://example.com/article',
91
+   tools: {
92
+     webFetch: ollama.tools.webFetch({ maxContentLength: 5000 }),
93
+   },
94
+ });
95
+
96
+ // 🔄 Combine search and fetch for comprehensive research
97
+ const { text: research } = await generateText({
98
+   model: ollama('gpt-oss:120b-cloud'),
99
+   prompt: 'Research recent TypeScript updates and provide a detailed analysis',
100
+   tools: {
101
+     webSearch: ollama.tools.webSearch({ maxResults: 3 }),
102
+     webFetch: ollama.tools.webFetch(),
103
+   },
104
+ });
105
+ ```
106
+
107
+ ### Web Search Prerequisites
108
+
109
+ 1. **Ollama API Key**: Set `OLLAMA_API_KEY` environment variable
110
+ 2. **Cloud Models**: Use cloud models for optimal web search performance:
111
+ - `qwen3-coder:480b-cloud` - Best for general web search
112
+ - `gpt-oss:120b-cloud` - Best for complex reasoning with web data
113
+
114
+ ```bash
115
+ # Set your API key
116
+ export OLLAMA_API_KEY="your_api_key_here"
117
+
118
+ # Get your API key from: https://ollama.com/account
119
+ ```
120
+
121
+ ## Contents
122
+
123
+ - [AI SDK Ollama Provider](#ai-sdk-ollama-provider)
124
+ - [Quick Start](#quick-start)
125
+ - [Why Choose AI SDK Ollama?](#why-choose-ai-sdk-ollama)
126
+ - [Enhanced Tool Calling](#enhanced-tool-calling)
127
+ - [Web Search Tools](#web-search-tools)
128
+ - [Web Search Prerequisites](#web-search-prerequisites)
129
+ - [Contents](#contents)
130
+ - [Prerequisites](#prerequisites)
131
+ - [Browser Support](#browser-support)
132
+ - [Browser Usage](#browser-usage)
133
+ - [Explicit Browser Import](#explicit-browser-import)
134
+ - [CORS Configuration](#cors-configuration)
135
+ - [More Examples](#more-examples)
136
+ - [Cross Provider Compatibility](#cross-provider-compatibility)
137
+ - [Native Ollama Power](#native-ollama-power)
138
+ - [Model Keep-Alive Control](#model-keep-alive-control)
139
+ - [Enhanced Tool Calling Wrappers](#enhanced-tool-calling-wrappers)
140
+ - [Combining Tools with Structured Output](#combining-tools-with-structured-output)
141
+ - [Simple and Predictable](#simple-and-predictable)
142
+ - [Reranking](#reranking)
143
+ - [Streaming Utilities](#streaming-utilities)
144
+ - [Smooth Stream](#smooth-stream)
145
+ - [Partial JSON Parsing](#partial-json-parsing)
146
+ - [Middleware System](#middleware-system)
147
+ - [Default Settings Middleware](#default-settings-middleware)
148
+ - [Extract Reasoning Middleware](#extract-reasoning-middleware)
149
+ - [ToolLoopAgent](#toolloopagent)
150
+ - [Advanced Features](#advanced-features)
151
+ - [Custom Ollama Instance](#custom-ollama-instance)
152
+ - [API Key Configuration](#api-key-configuration)
153
+ - [Using Existing Ollama Client](#using-existing-ollama-client)
154
+ - [Structured Output](#structured-output)
155
+ - [Auto-Detection of Structured Outputs](#auto-detection-of-structured-outputs)
156
+ - [Automatic JSON Repair](#automatic-json-repair)
157
+ - [Reasoning Support](#reasoning-support)
158
+ - [Common Issues](#common-issues)
159
+ - [Supported Models](#supported-models)
160
+ - [Testing](#testing)
161
+ - [Learn More](#learn-more)
162
+ - [License](#license)
163
+
164
+ Standard AI SDK parameters and Ollama-specific options work together in a single call:
+
+ ```typescript
165
+ import { ollama } from 'ai-sdk-ollama';
166
+ import { generateText } from 'ai';
167
+
168
+ // Standard AI SDK parameters work everywhere
169
+ const { text } = await generateText({
170
+   model: ollama('llama3.2'),
171
+   prompt: 'Write a haiku about coding',
172
+   temperature: 0.8,
173
+   maxOutputTokens: 100,
174
+ });
175
+
176
+ // Plus access to Ollama's advanced features
177
+ const { text: advancedText } = await generateText({
178
+   model: ollama('llama3.2', {
179
+     options: {
180
+       mirostat: 2,         // Advanced sampling algorithm
181
+       repeat_penalty: 1.1, // Fine-tune repetition
182
+       num_ctx: 8192,       // Larger context window
183
+     },
184
+   }),
185
+   prompt: 'Write a haiku about coding',
186
+   temperature: 0.8, // Standard parameters still work
187
+ });
188
+ ```
189
+
190
+ ## Prerequisites
191
+
192
+ - Node.js 22+
193
+ - [Ollama](https://ollama.com) installed locally or running on a remote server
194
+ - AI SDK v6 (`ai` package)
195
+ - TypeScript 5.9+ (for TypeScript users)
196
+
197
+ ```bash
198
+ # Install Ollama from ollama.com
199
+ ollama serve
200
+
201
+ # Pull a model
202
+ ollama pull llama3.2
203
+ ```
204
+
205
+ ## Browser Support
206
+
207
+ This provider works in both Node.js and browser environments; the library automatically selects the correct Ollama client based on the environment. See the [browser example](../../examples/browser/) for a complete working setup.
210
+
211
+ ### Browser Usage
212
+
213
+ The same API works in browsers with automatic environment detection:
214
+
215
+ ```typescript
216
+ import { ollama } from 'ai-sdk-ollama'; // Automatically uses browser version
217
+ import { generateText } from 'ai';
218
+
219
+ const { text } = await generateText({
220
+   model: ollama('llama3.2'),
221
+   prompt: 'Write a haiku about coding',
222
+ });
223
+ ```
224
+
225
+ ### Explicit Browser Import
226
+
227
+ You can also explicitly import the browser version:
228
+
229
+ ```typescript
230
+ import { ollama } from 'ai-sdk-ollama/browser';
231
+ ```
232
+
233
+ ### CORS Configuration
234
+
235
+ For browser usage, you have several options to handle CORS:
236
+
237
+ ```bash
238
+ # Option 1: Use a proxy (recommended for development)
239
+ # Configure your bundler (Vite, Webpack, etc.) to proxy /api/* to Ollama
240
+ # See browser example for Vite proxy configuration
241
+
242
+ # Option 2: Allow all origins (development only)
243
+ OLLAMA_ORIGINS=* ollama serve
244
+
245
+ # Option 3: Allow specific origins
246
+ OLLAMA_ORIGINS="http://localhost:3000,https://myapp.com" ollama serve
247
+ ```
248
+
249
+ **Recommended**: Use a development proxy (like Vite's) to avoid CORS issues entirely; a minimal proxy sketch follows, and the browser example has a complete working setup.
250
+
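+ The sketch below assumes Ollama's default port and the `/api` path prefix used by its HTTP API:
+
+ ```typescript
+ // vite.config.ts (sketch, assuming Ollama's default local setup)
+ import { defineConfig } from 'vite';
+
+ export default defineConfig({
+   server: {
+     proxy: {
+       // Forward /api/* requests from the dev server to local Ollama
+       '/api': 'http://localhost:11434',
+     },
+   },
+ });
+ ```
+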
251
+ ## More Examples
252
+
253
+ ### Cross Provider Compatibility
254
+
255
+ Write code that works with any AI SDK provider:
256
+
257
+ ```typescript
258
+ // This exact code works with OpenAI, Anthropic, or Ollama
259
+ const { text } = await generateText({
260
+   model: ollama('llama3.2'), // or openai('gpt-4') or anthropic('claude-3')
261
+   prompt: 'Write a haiku',
262
+   temperature: 0.8,
263
+   maxOutputTokens: 100,
264
+   topP: 0.9,
265
+ });
266
+ ```
267
+
268
+ ### Native Ollama Power
269
+
270
+ Access Ollama's advanced features without losing portability:
271
+
272
+ ```typescript
273
+ const { text } = await generateText({
274
+   model: ollama('llama3.2', {
275
+     options: {
276
+       mirostat: 2,         // Advanced sampling algorithm
277
+       repeat_penalty: 1.1, // Repetition control
278
+       num_ctx: 8192,       // Context window size
279
+     },
280
+   }),
281
+   prompt: 'Write a haiku',
282
+   temperature: 0.8, // Standard parameters still work
283
+ });
284
+ ```
285
+
286
+ > **Parameter Precedence**: When both AI SDK parameters and Ollama options are specified, **Ollama options take precedence**. For example, if you set `temperature: 0.5` in Ollama options and `temperature: 0.8` in the `generateText` call, the final value will be `0.5`. This allows you to use standard AI SDK parameters for portability while having fine-grained control with Ollama-specific options when needed.
287
+
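+ As a concrete sketch of that precedence rule (the values are illustrative):
+
+ ```typescript
+ import { ollama } from 'ai-sdk-ollama';
+ import { generateText } from 'ai';
+
+ const { text } = await generateText({
+   model: ollama('llama3.2', {
+     options: { temperature: 0.5 }, // Ollama option: takes precedence
+   }),
+   prompt: 'Write a haiku',
+   temperature: 0.8, // AI SDK parameter: overridden by the option above
+ });
+ ```
+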
288
+ ### Model Keep-Alive Control
289
+
290
+ Control how long models stay loaded in memory after requests using the `keep_alive` parameter:
291
+
292
+ ```typescript
293
+ // Keep model loaded for 10 minutes
294
+ const model = ollama('llama3.2', { keep_alive: '10m' });
295
+
296
+ // Keep model loaded for 1 hour (3600 seconds)
297
+ const model2 = ollama('llama3.2', { keep_alive: 3600 });
298
+
299
+ // Keep model loaded indefinitely
300
+ const model3 = ollama('llama3.2', { keep_alive: -1 });
301
+
302
+ // Unload model immediately after each request
303
+ const model4 = ollama('llama3.2', { keep_alive: 0 });
304
+
305
+ const { text } = await generateText({
306
+   model,
307
+   prompt: 'Write a haiku',
308
+ });
309
+ ```
310
+
311
+ **Accepted values:**
312
+
313
+ - Duration strings: `"10m"`, `"24h"`, `"30s"` (minutes, hours, seconds)
314
+ - Numbers: seconds as a number (e.g., `3600` for 1 hour)
315
+ - Negative numbers: keep loaded indefinitely (e.g., `-1`)
316
+ - `0`: unload immediately after the request
317
+
318
+ **Default behavior**: If not specified, Ollama keeps models loaded for 5 minutes so subsequent requests respond faster.
319
+
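+ `keep_alive` sits alongside the other model settings, so it can be combined with Ollama options in one place (a sketch; combining them this way is an assumption based on the settings shapes shown above):
+
+ ```typescript
+ import { ollama } from 'ai-sdk-ollama';
+
+ // Keep the model warm for an hour and enlarge its context window
+ const model = ollama('llama3.2', {
+   keep_alive: '1h',
+   options: { num_ctx: 8192 },
+ });
+ ```
+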
320
+ ### Enhanced Tool Calling Wrappers
321
+
322
+ For maximum tool calling reliability, use our enhanced wrapper functions that guarantee complete responses:
323
+
324
+ ```typescript
325
+ import { ollama, generateText, streamText } from 'ai-sdk-ollama';
326
+ import { tool } from 'ai';
327
+ import { z } from 'zod';
328
+
329
+ // Enhanced generateText with automatic response synthesis
330
+ const result = await generateText({
331
+   model: ollama('llama3.2'),
332
+   prompt: 'What is 15 + 27? Use the math tool to calculate it.',
333
+   tools: {
334
+     math: tool({
335
+       description: 'Perform math calculations',
336
+       inputSchema: z.object({
337
+         operation: z.string().describe('Math operation like "15 + 27"'),
338
+       }),
339
+       execute: async ({ operation }) => {
340
+         // Demo only: eval runs arbitrary code; use a real expression parser in production
+         return { result: eval(operation), operation };
341
+       },
342
+     }),
343
+   },
344
+   // Optional: Configure reliability behavior
345
+   enhancedOptions: {
346
+     enableSynthesis: true,   // Default: true
347
+     maxSynthesisAttempts: 2, // Default: 2
348
+     minResponseLength: 10,   // Default: 10
349
+   },
350
+ });
351
+
352
+ console.log(result.text); // "15 + 27 equals 42. Using the math tool, I calculated..."
353
+ ```
354
+
355
+ ### Combining Tools with Structured Output
356
+
357
+ In AI SDK v6, tool calling and structured output work together by default, so you can combine tools with an `output` schema in a single call:
358
+
359
+ ```typescript
360
+ import { ollama, generateText } from 'ai-sdk-ollama';
361
+ import { Output, tool } from 'ai';
362
+ import { z } from 'zod';
363
+
364
+ const weatherTool = tool({
365
+   description: 'Get current weather for a location',
366
+   inputSchema: z.object({
367
+     location: z.string().describe('City name'),
368
+   }),
369
+   execute: async ({ location }) => ({
370
+     location,
371
+     temperature: 22,
372
+     condition: 'sunny',
373
+     humidity: 60,
374
+   }),
375
+ });
376
+
377
+ // AI SDK v6: tools and structured output work together by default
380
+ const result = await generateText({
381
+   model: ollama('llama3.2'),
382
+   prompt: 'Get weather for San Francisco and provide a structured summary',
383
+   tools: { getWeather: weatherTool },
384
+   output: Output.object({
385
+     schema: z.object({
386
+       location: z.string(),
387
+       temperature: z.number(),
388
+       summary: z.string(),
389
+     }),
390
+   }),
391
+   toolChoice: 'required',
392
+ });
393
+ // Result: Tool is called AND structured output is generated
394
+ ```
395
+
396
+ **When to Use Enhanced Wrappers:**
397
+
398
+ - **Critical tool calling scenarios** where you need guaranteed text responses
399
+ - **Production applications** that can't handle empty responses after tool execution
400
+ - **Complex multi-step tool interactions** requiring reliable synthesis
401
+
402
+ **Standard vs Enhanced Comparison:**
403
+
404
+ | Function | Standard `generateText` | Enhanced `generateText` |
405
+ | -------------------------- | ------------------------- | ------------------------------------ |
406
+ | **Simple prompts** | ✅ Perfect | ✅ Works (slight overhead) |
407
+ | **Tool calling** | ⚠️ May return empty text | ✅ **Guarantees complete responses** |
408
+ | **Complete responses** | ❌ Manual handling needed | ✅ **Automatic completion** |
409
+ | **Production reliability** | ⚠️ Unpredictable | ✅ **Reliable** |
410
+
411
+ ### Simple and Predictable
412
+
413
+ The provider works the same way with any model - just try the features you need:
414
+
415
+ ```typescript
416
+ // No capability checking required - just use any model
417
+ const { text } = await generateText({
418
+   model: ollama('any-model'),
419
+   prompt: 'What is the weather?',
420
+   tools: {
421
+     /* ... */
422
+   }, // If the model doesn't support tools, you'll get a clear error
423
+ });
424
+
425
+ // The provider is simple and predictable
426
+ // - Try any feature with any model
427
+ // - Get clear error messages if something doesn't work
428
+ // - No hidden complexity or capability detection
429
+ ```
430
+
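+ In practice, a plain try/catch is enough to surface unsupported-feature errors (a sketch; the exact error type and message are assumptions):
+
+ ```typescript
+ import { ollama } from 'ai-sdk-ollama';
+ import { generateText } from 'ai';
+
+ try {
+   const { text } = await generateText({
+     model: ollama('any-model'),
+     prompt: 'What is the weather?',
+     tools: {
+       /* ... */
+     },
+   });
+   console.log(text);
+ } catch (error) {
+   // Surface the provider's message instead of failing silently
+   console.error('Generation failed:', error instanceof Error ? error.message : error);
+ }
+ ```
+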
431
+ ## Reranking
432
+
433
+ > **AI SDK v6 Feature**: Rerank documents by semantic relevance to improve search results and RAG pipelines.
434
+
435
+ Since Ollama doesn't have native reranking yet, we provide embedding-based reranking using cosine similarity:
436
+
437
+ ```typescript
438
+ import { rerank } from 'ai';
439
+ import { ollama } from 'ai-sdk-ollama';
440
+
441
+ // Rerank documents by relevance to a query
442
+ const { ranking, rerankedDocuments } = await rerank({
443
+   model: ollama.embeddingReranking('nomic-embed-text'),
444
+   query: 'How do I get a refund?',
445
+   documents: [
446
+     'To reset your password, click Forgot Password on the login page.',
447
+     'Refunds are available within 14 days of purchase. Go to Settings > Cancel Plan.',
448
+     'Enable 2FA for extra security in Settings > Security.',
449
+   ],
450
+   topN: 2, // Return top 2 most relevant
451
+ });
452
+
453
+ console.log('Most relevant:', rerankedDocuments[0]);
454
+ // Output: "Refunds are available within 14 days..."
455
+
456
+ // Each ranking item includes score and original index
457
+ ranking.forEach((item, i) => {
458
+   console.log(
459
+     `${i + 1}. Score: ${item.score.toFixed(3)}, Index: ${item.originalIndex}`,
460
+   );
461
+ });
462
+ ```
463
+
464
+ **Use Cases:**
465
+
466
+ - **RAG Pipelines**: Rerank retrieved documents before passing to LLM
467
+ - **Search Results**: Improve relevance of search results
468
+ - **Customer Support**: Find most relevant help articles
469
+
470
+ **Recommended Models**: `embeddinggemma` (best score separation), `nomic-embed-text`, `bge-m3`
471
+
472
+ ## Streaming Utilities
473
+
474
+ ### Smooth Stream
475
+
476
+ Create smoother streaming output by chunking text into words, lines, or custom patterns:
477
+
478
+ ```typescript
479
+ import { ollama } from 'ai-sdk-ollama';
480
+ import { streamText, smoothStream } from 'ai';
481
+
482
+ // Word-by-word streaming with delay
483
+ const result = streamText({
484
+   model: ollama('llama3.2'),
485
+   prompt: 'Write a poem about the ocean.',
486
+   experimental_transform: smoothStream({
487
+     delayInMs: 50, // 50ms between chunks
488
+     chunking: 'word', // 'word' | 'line' | RegExp
489
+   }),
490
+ });
491
+
492
+ for await (const chunk of result.textStream) {
493
+   process.stdout.write(chunk); // Smooth, word-by-word output
494
+ }
495
+ ```
496
+
497
+ **Chunking Options:**
498
+
499
+ - `'word'` - Emit word by word (default)
500
+ - `'line'` - Emit line by line
501
+ - `RegExp` - Custom pattern (e.g., `/[.!?]\s+/` for sentences; see the sketch below)
502
+
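+ For instance, sentence-level chunking with a custom pattern (a sketch building on the example above):
+
+ ```typescript
+ import { ollama } from 'ai-sdk-ollama';
+ import { streamText, smoothStream } from 'ai';
+
+ // Emit roughly one sentence at a time using a RegExp chunker
+ const result = streamText({
+   model: ollama('llama3.2'),
+   prompt: 'Tell a short story.',
+   experimental_transform: smoothStream({
+     delayInMs: 100,
+     chunking: /[.!?]\s+/, // split on sentence boundaries
+   }),
+ });
+
+ for await (const chunk of result.textStream) {
+   process.stdout.write(chunk);
+ }
+ ```
+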
503
+ ### Partial JSON Parsing
504
+
505
+ Parse incomplete JSON from streaming responses - useful for progressive UI updates:
506
+
507
+ ```typescript
508
+ import { parsePartialJson } from 'ai';
509
+
510
+ // As JSON streams in, parse what's available
511
+ const partial = '{"name": "Alice", "age": 25, "city": "New';
512
+ const result = await parsePartialJson(partial);
513
+
514
+ if (result.state === 'repaired-parse' || result.state === 'successful-parse') {
515
+   console.log(result.value); // { name: "Alice", age: 25, city: "New" }
516
+ }
517
+ ```
518
+
519
+ **Note**: `createStitchableStream` and other advanced stream utilities are internal to the AI SDK. Use standard `ReadableStream` APIs for stream manipulation, or import utilities directly from `'ai'` when available.
520
+
521
+ ## Middleware System
522
+
523
+ Wrap language models with middleware for parameter transformation, logging, or custom behavior:
524
+
525
+ ```typescript
526
+ import {
527
+   ollama,
528
+   wrapLanguageModel,
529
+   defaultSettingsMiddleware,
530
+ } from 'ai-sdk-ollama';
531
+ import { generateText } from 'ai';
532
+
533
+ // Apply default settings to all calls
534
+ const model = wrapLanguageModel({
535
+   model: ollama('llama3.2'),
536
+   middleware: defaultSettingsMiddleware({
537
+     settings: {
538
+       temperature: 0.7,
539
+       maxOutputTokens: 1000,
540
+     },
541
+   }),
542
+ });
543
+
544
+ // Temperature and maxOutputTokens are now applied by default
545
+ const { text } = await generateText({
546
+   model,
547
+   prompt: 'Write a story.',
548
+ });
549
+ ```
550
+
551
+ ### Default Settings Middleware
552
+
553
+ Apply default parameters that can be overridden per-call:
554
+
555
+ ```typescript
556
+ import { defaultSettingsMiddleware } from 'ai-sdk-ollama';
557
+
558
+ const middleware = defaultSettingsMiddleware({
559
+   settings: {
560
+     temperature: 0.7,
561
+     maxOutputTokens: 500,
562
+   },
563
+ });
564
+ ```
565
+
566
+ ### Extract Reasoning Middleware
567
+
568
+ Extract reasoning/thinking from model outputs that use XML tags:
569
+
570
+ ```typescript
571
+ import {
572
+   ollama,
573
+   wrapLanguageModel,
574
+   extractReasoningMiddleware,
575
+ } from 'ai-sdk-ollama';
576
+
577
+ const model = wrapLanguageModel({
578
+   model: ollama('deepseek-r1:7b'),
579
+   middleware: extractReasoningMiddleware({
580
+     tagName: 'think', // Extract content from <think> tags
581
+     separator: '\n', // Separator for multiple reasoning blocks
582
+     startWithReasoning: true, // Model starts with reasoning
583
+   }),
584
+ });
585
+ ```
586
+
587
+ **Combining Multiple Middlewares:**
588
+
589
+ ```typescript
590
+ const model = wrapLanguageModel({
591
+   model: ollama('llama3.2'),
592
+   middleware: [
593
+     defaultSettingsMiddleware({ settings: { temperature: 0.5 } }),
594
+     extractReasoningMiddleware({ tagName: 'thinking' }),
595
+   ],
596
+ });
597
+ ```
598
+
599
+ ## ToolLoopAgent
600
+
601
+ An agent that runs tools in a loop until a stop condition is met:
602
+
603
+ ```typescript
604
+ import { ollama } from 'ai-sdk-ollama';
605
+ import { ToolLoopAgent, stepCountIs, hasToolCall, tool } from 'ai';
606
+ import { z } from 'zod';
607
+
608
+ const agent = new ToolLoopAgent({
609
+   model: ollama('llama3.2'),
610
+   instructions: 'You are a helpful assistant.',
611
+   tools: {
612
+     weather: tool({
613
+       description: 'Get weather for a location',
614
+       inputSchema: z.object({ location: z.string() }),
615
+       execute: async ({ location }: { location: string }) => ({
616
+         temp: 72,
617
+         condition: 'sunny',
618
+       }),
619
+     }),
620
+     done: tool({
621
+       description: 'Call when task is complete',
622
+       inputSchema: z.object({ summary: z.string() }),
623
+       execute: async ({ summary }: { summary: string }) => ({
624
+         completed: true,
625
+         summary,
626
+       }),
627
+     }),
628
+   },
629
+   maxOutputTokens: 1000,
630
+   stopWhen: [
631
+     stepCountIs(10),     // Stop after 10 steps max
632
+     hasToolCall('done'), // Stop when 'done' tool is called
633
+   ],
634
+   onStepFinish: (stepResult) => {
635
+     console.log('Step:', stepResult.toolCalls.length, 'tool calls');
636
+   },
637
+ });
638
+
639
+ const result = await agent.generate({
640
+   prompt: 'What is the weather in San Francisco?',
641
+ });
642
+
643
+ console.log('Final:', result.text);
644
+ console.log('Steps:', result.steps.length);
645
+ console.log('Tokens:', result.totalUsage.totalTokens ?? 'undefined');
646
+ ```
647
+
648
+ **Stop Conditions:**
649
+
650
+ - `stepCountIs(n)` - Stop after n steps
651
+ - `hasToolCall(name)` - Stop when specific tool is called
652
+ - Custom: `(options: { steps: StepResult[] }) => boolean | Promise<boolean>` (see the sketch below)
653
+
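+ A custom condition is just a predicate over the steps so far. A minimal sketch (assuming the `StepResult` and `ToolSet` types are exported from `ai`, as in recent AI SDK versions):
+
+ ```typescript
+ import type { StepResult, ToolSet } from 'ai';
+
+ // Stop once any step has produced non-empty text
+ const hasFinalText = ({ steps }: { steps: StepResult<ToolSet>[] }) =>
+   steps.some((step) => step.text.trim().length > 0);
+
+ // Usage: stopWhen: [stepCountIs(10), hasFinalText]
+ ```
+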
654
+ ## Advanced Features
655
+
656
+ ### Custom Ollama Instance
657
+
658
+ You can create a custom Ollama provider instance with specific configuration:
659
+
660
+ ```typescript
661
+ import { createOllama } from 'ai-sdk-ollama';
662
+ import { generateText } from 'ai';
663
+
664
+ const ollama = createOllama({
665
+   baseURL: 'http://my-ollama-server:11434',
666
+   headers: {
667
+     'Custom-Header': 'value',
668
+   },
669
+ });
670
+
671
+ const { text } = await generateText({
672
+   model: ollama('llama3.2'),
673
+   prompt: 'Hello!',
674
+ });
675
+ ```
676
+
677
+ ### API Key Configuration
678
+
679
+ For cloud Ollama services, pass your API key explicitly using `createOllama`:
680
+
681
+ ```typescript
682
+ import { createOllama } from 'ai-sdk-ollama';
683
+ import { generateText } from 'ai';
684
+
685
+ const ollama = createOllama({
686
+   apiKey: process.env.OLLAMA_API_KEY,
687
+   baseURL: 'https://ollama.com',
688
+ });
689
+
690
+ const { text } = await generateText({
691
+   model: ollama('llama3.2'),
692
+   prompt: 'Hello!',
693
+ });
694
+ ```
695
+
696
+ **Why explicit over auto-detection?**
697
+
698
+ Different runtimes handle environment variables differently:
699
+
700
+ | Runtime | `.env` Auto-Loading |
701
+ | --------------- | ----------------------------- |
702
+ | Node.js | ❌ No (requires `dotenv`) |
703
+ | Bun | ✅ Yes (usually) |
704
+ | Deno | ❌ No |
705
+ | Edge/Serverless | ❌ No (platform injects vars) |
706
+
707
+ Passing `apiKey` explicitly works reliably everywhere and avoids surprises.
708
+
709
+ **Runtime-specific examples:**
710
+
711
+ ```typescript
712
+ // Node.js (with dotenv)
713
+ import 'dotenv/config';
714
+ const ollama = createOllama({ apiKey: process.env.OLLAMA_API_KEY });
715
+
716
+ // Bun
717
+ const ollama = createOllama({ apiKey: Bun.env.OLLAMA_API_KEY });
718
+
719
+ // Deno
720
+ const ollama = createOllama({ apiKey: Deno.env.get('OLLAMA_API_KEY') });
721
+
722
+ // Production (Vercel, Railway, Fly.io, etc.)
723
+ // Env vars are injected by the platform - no .env files needed
724
+ const ollama = createOllama({ apiKey: process.env.OLLAMA_API_KEY });
725
+ ```
726
+
727
+ **Note**: The API key is sent as an `Authorization: Bearer {apiKey}` header. If you provide both an `apiKey` and a pre-existing `Authorization` header, the existing header takes precedence.
728
+
729
+ ### Using Existing Ollama Client
730
+
731
+ You can also pass an existing Ollama client instance to reuse your configuration:
732
+
733
+ ```typescript
734
+ import { Ollama } from 'ollama';
735
+ import { createOllama } from 'ai-sdk-ollama';
+ import { generateText } from 'ai';
736
+
737
+ // Create your existing Ollama client
738
+ const existingClient = new Ollama({
739
+   host: 'http://my-ollama-server:11434',
740
+   // Add any custom configuration
741
+ });
742
+
743
+ // Use it with the AI SDK provider
744
+ const ollamaSdk = createOllama({ client: existingClient });
745
+
746
+ // Use both clients as needed
747
+ await existingClient.list(); // Direct Ollama operations
748
+ const { text } = await generateText({
749
+   model: ollamaSdk('llama3.2'),
750
+   prompt: 'Hello!',
751
+ });
752
+ ```
753
+
754
+ ### Structured Output
755
+
756
+ ```typescript
757
+ import { ollama } from 'ai-sdk-ollama';
+ import { generateObject } from 'ai';
758
+ import { z } from 'zod';
759
+
760
+ // Auto-detection: structuredOutputs is automatically enabled for object generation
761
+ const { object } = await generateObject({
762
+   model: ollama('llama3.2'), // No need to set structuredOutputs: true
763
+   schema: z.object({
764
+     name: z.string(),
765
+     age: z.number(),
766
+     interests: z.array(z.string()),
767
+   }),
768
+   prompt: 'Generate a random person profile',
769
+ });
770
+
771
+ console.log(object);
772
+ // { name: "Alice", age: 28, interests: ["reading", "hiking"] }
773
+
774
+ // Explicit setting still works
775
+ const { object: explicitObject } = await generateObject({
776
+   model: ollama('llama3.2', { structuredOutputs: true }), // Explicit
777
+   schema: z.object({
778
+     name: z.string(),
779
+     age: z.number(),
780
+   }),
781
+   prompt: 'Generate a person',
782
+ });
783
+ ```
784
+
785
+ ### Auto-Detection of Structured Outputs
786
+
787
+ The provider automatically detects when structured outputs are needed:
788
+
789
+ - **Object Generation**: `generateObject` and `streamObject` automatically enable `structuredOutputs: true`
790
+ - **Text Generation**: `generateText` and `streamText` require explicit `structuredOutputs: true` for JSON output
791
+ - **Backward Compatibility**: Explicit settings are respected, with warnings when overridden
792
+ - **No Breaking Changes**: Existing code continues to work as expected
793
+
794
+ ```typescript
795
+ import { ollama } from 'ai-sdk-ollama';
796
+ import { generateObject, generateText } from 'ai';
797
+ import { z } from 'zod';
798
+
799
+ // This works without explicit structuredOutputs: true
800
+ const { object } = await generateObject({
801
+   model: ollama('llama3.2'),
802
+   schema: z.object({ name: z.string() }),
803
+   prompt: 'Generate a name',
804
+ });
805
+
806
+ // This still requires explicit setting for JSON output
807
+ const { text } = await generateText({
808
+   model: ollama('llama3.2', { structuredOutputs: true }),
809
+   prompt: 'Generate JSON with a message field',
810
+ });
811
+ ```
812
+
813
+ ### Automatic JSON Repair
814
+
815
+ > **🔧 Enhanced Reliability**: Built-in JSON repair automatically fixes malformed LLM outputs for object generation.
816
+
817
+ The provider includes automatic JSON repair that handles 14+ types of common JSON issues from LLM outputs:
818
+
819
+ ````typescript
820
+ import { ollama } from 'ai-sdk-ollama';
821
+ import { generateObject } from 'ai';
822
+ import { z } from 'zod';
823
+
824
+ // JSON repair is enabled by default for all object generation
825
+ const { object } = await generateObject({
826
+   model: ollama('llama3.2'),
827
+   schema: z.object({
828
+     name: z.string(),
829
+     email: z.string().email(),
830
+     age: z.number(),
831
+   }),
832
+   prompt: 'Generate a person profile',
833
+   // reliableObjectGeneration: true is the default
834
+ });
835
+
836
+ // Automatically handles:
837
+ // ✅ Trailing commas: {"name": "John",}
838
+ // ✅ Single quotes: {'name': 'John'}
839
+ // ✅ Unquoted keys: {name: "John"}
840
+ // ✅ Python constants: {active: True, value: None}
841
+ // ✅ Comments: {"name": "John" // comment}
842
+ // ✅ URLs in strings: {"url": "https://example.com" // comment}
843
+ // ✅ Escaped quotes: {"text": "It's // fine"}
844
+ // ✅ JSONP wrappers: callback({"name": "John"})
845
+ // ✅ Markdown code blocks: ```json\n{...}\n```
846
+ // ✅ Incomplete objects/arrays
847
+ // ✅ Smart quotes and special characters
848
+ // ✅ And more...
849
+ ````
850
+
851
+ **Control Options:**
852
+
853
+ ```typescript
854
+ // Disable all reliability features (not recommended)
855
+ const { object } = await generateObject({
856
+   model: ollama('llama3.2', {
857
+     reliableObjectGeneration: false, // Everything off
858
+   }),
859
+   schema: z.object({ message: z.string() }),
860
+   prompt: 'Generate a message',
861
+ });
862
+
863
+ // Fine-grained control: disable only repair, keep retries
864
+ const { object: withRetries } = await generateObject({
865
+   model: ollama('llama3.2', {
866
+     reliableObjectGeneration: true,
867
+     objectGenerationOptions: {
868
+       enableTextRepair: false, // Disable repair only
869
+       maxRetries: 3, // But keep retries
870
+     },
871
+   }),
872
+   schema: z.object({ message: z.string() }),
873
+   prompt: 'Generate a message',
874
+ });
875
+
876
+ // Custom repair function (advanced)
877
+ const { object: custom } = await generateObject({
878
+   model: ollama('llama3.2', {
879
+     objectGenerationOptions: {
880
+       repairText: async ({ text, error }) => {
881
+         // Your custom repair logic
882
+         return text.replace(/,(\s*[}\]])/g, '$1');
883
+       },
884
+     },
885
+   }),
886
+   schema: z.object({ message: z.string() }),
887
+   prompt: 'Generate a message',
888
+ });
889
+ ```
890
+
891
+ ### Reasoning Support
892
+
893
+ Some models like DeepSeek-R1 support reasoning (chain-of-thought) output. Enable this feature to see the model's thinking process:
894
+
895
+ ```typescript
896
+ import { ollama } from 'ai-sdk-ollama';
897
+ import { generateText } from 'ai';
898
+
899
+ // Enable reasoning for models that support it (e.g., deepseek-r1:7b)
900
+ const model = ollama('deepseek-r1:7b', { reasoning: true });
901
+
902
+ // Generate text with reasoning
903
+ const { text } = await generateText({
904
+   model,
905
+   prompt:
906
+     'Solve: If I have 3 boxes, each with 4 smaller boxes, and each smaller box has 5 items, how many items total?',
907
+ });
908
+
909
+ console.log('Answer:', text);
910
+ // DeepSeek-R1 includes reasoning in the output with <think> tags:
911
+ // <think>
912
+ // First, I'll calculate the number of smaller boxes: 3 × 4 = 12
913
+ // Then, the total items: 12 × 5 = 60
914
+ // </think>
915
+ // You have 60 items in total.
916
+
917
+ // Compare with reasoning disabled
918
+ const modelNoReasoning = ollama('deepseek-r1:7b', { reasoning: false });
919
+ const { text: noReasoningText } = await generateText({
920
+   model: modelNoReasoning,
921
+   prompt: 'Calculate 3 × 4 × 5',
922
+ });
923
+ // Output: 60 (without showing the thinking process)
924
+ ```
925
+
926
+ **Recommended Reasoning Models**:
927
+
928
+ - `deepseek-r1:7b` - Balanced performance and reasoning capability (5GB)
929
+ - `deepseek-r1:1.5b` - Lightweight option (2.5GB)
930
+ - `deepseek-r1:8b` - Llama-based distilled version (5.5GB)
931
+
932
+ Install with: `ollama pull deepseek-r1:7b`
933
+
934
+ **Note**: The reasoning feature is model-dependent. Models without reasoning support work normally and simply omit the thinking process.
935
+
936
+ ## Common Issues
937
+
938
+ - **Make sure Ollama is running** - Run `ollama serve` before using the provider
939
+ - **Pull models first** - Use `ollama pull model-name` before generating text
940
+ - **Model compatibility errors** - The provider will throw errors if you try to use unsupported features (e.g., tools with non-compatible models)
941
+ - **Network issues** - Verify Ollama is accessible at the configured URL (see the quick check below)
942
+ - **TypeScript support** - Full type safety with TypeScript 5.9+
943
+ - **AI SDK v6 compatibility** - Built for the latest AI SDK specification
944
+
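+ A quick way to confirm the server is reachable and see which models are installed (assuming the default local port):
+
+ ```bash
+ # Lists installed models; fails fast if Ollama isn't running
+ curl http://localhost:11434/api/tags
+ ```
+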
945
+ ## Supported Models
946
+
947
+ Works with any model in your Ollama installation:
948
+
949
+ - **Chat**: `llama3.2`, `mistral`, `phi4-mini`, `qwen2.5`, `codellama`, `gpt-oss:20b`
950
+ - **Vision**: `llava`, `bakllava`, `llama3.2-vision`, `minicpm-v`
951
+ - **Embeddings**: `nomic-embed-text`, `all-minilm`, `mxbai-embed-large` (usage sketch below)
952
+ - **Reasoning**: `deepseek-r1:7b`, `deepseek-r1:1.5b`, `deepseek-r1:8b`
953
+ - **Cloud Models** (for web search): `qwen3-coder:480b-cloud`, `gpt-oss:120b-cloud`
954
+
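+ Embedding models plug into the AI SDK's `embed` helper. A sketch, assuming this provider exposes the standard `textEmbeddingModel` factory from the AI SDK provider interface:
+
+ ```typescript
+ import { ollama } from 'ai-sdk-ollama';
+ import { embed } from 'ai';
+
+ const { embedding } = await embed({
+   model: ollama.textEmbeddingModel('nomic-embed-text'),
+   value: 'The quick brown fox jumps over the lazy dog',
+ });
+
+ console.log(embedding.length); // dimensionality of the vector
+ ```
+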
955
+ ## Testing
956
+
957
+ The project includes unit and integration tests:
958
+
959
+ ```bash
960
+ # Run unit tests only (fast, no external dependencies)
961
+ npm test
962
+
963
+ # Run all tests (unit + integration)
964
+ npm run test:all
965
+
966
+ # Run integration tests only (requires Ollama running)
967
+ npm run test:integration
968
+ ```
969
+
970
+ > **Note**: Integration tests may occasionally fail due to the non-deterministic nature of AI model outputs. This is expected behavior - the tests use loose assertions to account for LLM output variability. Some tests may also skip if required models aren't available locally.
971
+
972
+ For detailed testing information, see [Integration Tests Documentation](./src/integration-tests/README.md).
973
+
974
+ ## Learn More
975
+
976
+ 📚 **[Examples Directory](../../examples/)** - Comprehensive usage patterns with real working code
977
+
978
+ 🚀 **[Quick Start Guide](../../examples/node/src/basic-chat.ts)** - Get running in 2 minutes
979
+
980
+ ⚙️ **[Dual Parameters Demo](../../examples/node/src/dual-parameter-example.ts)** - See the key feature in action
981
+
982
+ 🔧 **[Tool Calling Guide](../../examples/node/src/simple-tool-test.ts)** - Function calling with Ollama
983
+
984
+ 🖼️ **[Image Processing Guide](../../examples/node/src/image-handling-example.ts)** - Vision models with LLaVA
985
+
986
+ 📡 **[Streaming Examples](../../examples/node/src/streaming-simple-test.ts)** - Real-time responses
987
+
988
+ 🌐 **[Web Search Tools](../../examples/node/src/web-search-ai-sdk-ollama.ts)** - Web search and fetch capabilities
989
+
990
+ 🔄 **[Reranking Example](../../examples/node/src/v6-reranking-example.ts)** - Document reranking with embeddings
991
+
992
+ 🌊 **[SmoothStream Example](../../examples/node/src/smooth-stream-example.ts)** - Smooth chunked streaming output
993
+
994
+ 🔌 **[Middleware Example](../../examples/node/src/middleware-example.ts)** - Model wrapping and middleware system
995
+
996
+ 🤖 **[ToolLoopAgent Example](../../examples/node/src/tool-loop-agent-example.ts)** - Autonomous tool-calling agents
997
+
998
+ 🛡️ **[Tool Approval Example](../../examples/node/src/v6-tool-approval-example.ts)** - Human-in-the-loop tool execution approval
999
+
1000
+ 📦 **[Structured Output + Tools](../../examples/node/src/v6-structured-output-example.ts)** - Tool calling with structured output generation
1001
+
1002
+ 🔗 **[MCP Tools Example](../../examples/node/src/mcp-tools-example.ts)** - Model Context Protocol integration
1003
+
1004
+ ## License
1005
+
1006
+ MIT © [Jag Reehal](https://jagreehal.com)
1007
+
1008
+ See [LICENSE](./LICENSE) for details.