@warlock.js/ai-ollama 4.1.1
- package/cjs/index.cjs +705 -0
- package/cjs/index.cjs.map +1 -0
- package/esm/config.type.d.mts +80 -0
- package/esm/config.type.d.mts.map +1 -0
- package/esm/embedder.mjs +101 -0
- package/esm/embedder.mjs.map +1 -0
- package/esm/index.d.mts +3 -0
- package/esm/index.mjs +3 -0
- package/esm/known-vision-models.mjs +44 -0
- package/esm/known-vision-models.mjs.map +1 -0
- package/esm/model.mjs +251 -0
- package/esm/model.mjs.map +1 -0
- package/esm/sdk.d.mts +62 -0
- package/esm/sdk.d.mts.map +1 -0
- package/esm/sdk.mjs +78 -0
- package/esm/sdk.mjs.map +1 -0
- package/esm/utils/index.mjs +6 -0
- package/esm/utils/map-done-reason.mjs +31 -0
- package/esm/utils/map-done-reason.mjs.map +1 -0
- package/esm/utils/to-ollama-messages.mjs +87 -0
- package/esm/utils/to-ollama-messages.mjs.map +1 -0
- package/esm/utils/to-ollama-tools.mjs +41 -0
- package/esm/utils/to-ollama-tools.mjs.map +1 -0
- package/esm/utils/wrap-ollama-error.mjs +104 -0
- package/esm/utils/wrap-ollama-error.mjs.map +1 -0
- package/llms-full.txt +122 -0
- package/llms.txt +9 -0
- package/package.json +38 -0
- package/skills/README.md +9 -0
- package/skills/setup-ollama/SKILL.md +112 -0
@@ -0,0 +1,112 @@
---
name: setup-ollama
description: 'Wire @warlock.js/ai-ollama — new OllamaSDK({host?, headers?}) for local / self-hosted Ollama via the official ollama client (not OpenAI-compat). chat + embed, daemon-down error handling. Triggers: `OllamaSDK`, `ollama.model`, `ollama.embedder`, `embedder.embedMany`, `ollama.count`, `host`, `headers`; "use ollama with warlock", "run llama3 locally", "self-hosted llama"; typical import `import { OllamaSDK } from "@warlock.js/ai-ollama"`. Skip: agent loop — `@warlock.js/ai/run-ai-agent/SKILL.md`; provider choice — `@warlock.js/ai/pick-ai-provider/SKILL.md`; embeddings core — `@warlock.js/ai/embed-text/SKILL.md`; siblings `@warlock.js/ai-openai`, `@warlock.js/ai-anthropic`, `@warlock.js/ai-google`; raw `ollama` npm, Vercel `@ai-sdk/ollama`; OpenAI-compat gateway via `@warlock.js/ai-openai` `baseURL`.'
---

# `@warlock.js/ai-ollama`

Provider adapter that turns a local (or self-hosted) Ollama server into a vendor-neutral `ModelContract`, plus an Ollama embedder. Uses the **official `ollama` npm package** (not OpenAI-compat). Mirrors the openai / anthropic / bedrock / google adapters.

## Construction

```ts
import { OllamaSDK } from "@warlock.js/ai-ollama";

const ollama = new OllamaSDK(); // local default host
const remote = new OllamaSDK({ host: "http://gpu-box.internal:11434" });
const gated = new OllamaSDK({
  host: "https://ollama.internal",
  headers: { Authorization: `Bearer ${process.env.OLLAMA_TOKEN}` },
});
```

`OllamaSDK` is a class that holds a long-lived `Ollama` client. Its config is `Partial<Config>` (`host` defaults to `http://127.0.0.1:11434`) plus `provider` (default `"ollama"`) and an optional `pricing` (local inference is free; the field is kept for parity and chargeback).

## Producing a model

```ts
ollama.model({ name: "llama3.1" })
ollama.model({ name: "qwen2.5:14b", temperature: 0.2 })
ollama.model({ name: "llama3.2-vision", maxTokens: 1024 })
```

## Capabilities: what's auto-set

| Flag | Default |
| --- | --- |
| `structuredOutput` | `true` (via Ollama's native `format` JSON-schema field) |
| `vision` | Inferred from model tag substring. `true` for `llava`, `bakllava`, `*-vision`, `moondream`, `minicpm-v`, `qwen2-vl`, `qwen2.5-vl`, `llama4`, `gemma3`; `false` otherwise. |

Explicit config always wins.

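The `vision` default in the table can be approximated with a substring check. This is a minimal sketch, not the package's code: `resolveVision`, `VISION_TAG_HINTS`, and a top-level `vision` option are hypothetical names, the hint list only restates the table, and lowercasing the tag is an assumption.

```ts
// Hints copied from the capabilities table; the adapter's real list may differ.
const VISION_TAG_HINTS: readonly string[] = [
  "llava",
  "bakllava",
  "-vision", // stands in for the `*-vision` pattern
  "moondream",
  "minicpm-v",
  "qwen2-vl",
  "qwen2.5-vl",
  "llama4",
  "gemma3",
];

// Hypothetical option shape: only the fields this sketch needs.
interface ModelOptions {
  name: string;
  vision?: boolean;
}

// Explicit config wins; otherwise infer the flag from the model tag.
function resolveVision(options: ModelOptions): boolean {
  if (options.vision !== undefined) return options.vision;
  const tag = options.name.toLowerCase();
  return VISION_TAG_HINTS.some((hint) => tag.includes(hint));
}

console.log(resolveVision({ name: "llama3.2-vision" })); // true
console.log(resolveVision({ name: "llama3.1" })); // false
console.log(resolveVision({ name: "llava:13b", vision: false })); // false (explicit override)
```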
## System prompt & roles

Unlike Anthropic/Gemini/Bedrock, **Ollama keeps a first-class `system` role inside `messages`**, so nothing is hoisted. Neutral roles (`system`/`user`/`assistant`/`tool`) pass straight through.

## Tool calls

- Outgoing: neutral tools → `{ type: "function", function: { name, description, parameters } }`.
- Assistant tool calls → `tool_calls: [{ function: { name, arguments } }]` (Ollama has **no tool-call id**).
- Tool results (`role: "tool"`) → a `tool` message with `tool_name` set from `toolCallId` (Ollama matches a result to its call by name).

**Synthesized ids.** Because Ollama tool calls carry no id, the adapter sets neutral `id` = tool name. **Parallel calls to the same tool in one turn share an id**, a documented v1 limitation. Ollama also reports `done_reason: "stop"` even when the model called a tool, so the adapter derives `finishReason: "tool_calls"` from tool-call presence.

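The id synthesis and its round trip can be illustrated as follows. This is a sketch under assumptions rather than the adapter's source: the helper names and the neutral `name` / `input` fields are hypothetical, and the shapes are trimmed to what the example needs.

```ts
// Trimmed shapes; the neutral field names `name` and `input` are assumptions.
interface OllamaToolCall {
  function: { name: string; arguments: Record<string, unknown> };
}

interface NeutralToolCall {
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface NeutralToolResult {
  role: "tool";
  toolCallId: string;
  content: string;
}

// Ollama sends no id, so the tool name doubles as the neutral id.
function toNeutralToolCalls(calls: OllamaToolCall[] = []): NeutralToolCall[] {
  return calls.map((call) => ({
    id: call.function.name,
    name: call.function.name,
    input: call.function.arguments,
  }));
}

// On the way back, the synthesized id already is the tool name.
function toOllamaToolMessage(result: NeutralToolResult) {
  return { role: "tool" as const, content: result.content, tool_name: result.toolCallId };
}

const neutralCalls = toNeutralToolCalls([
  { function: { name: "get_weather", arguments: { city: "Cairo" } } },
  { function: { name: "get_weather", arguments: { city: "Giza" } } },
]);
console.log(neutralCalls.map((call) => call.id)); // both ids are "get_weather"
```

The last line shows the limitation in practice: two parallel calls to the same tool are indistinguishable by id, so callers that need to tell them apart would have to rely on call order or arguments.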
## Structured output

When `responseSchema` has an object root and the model is `structuredOutput`-capable, the adapter calls `chat({ format: <schema> })` (Ollama's `format` accepts a JSON Schema object).

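The gating condition above might look like this in code. The sketch is illustrative only: `withFormat`, `ChatRequest`, and `JsonSchema` are hypothetical local names, and simply omitting `format` for a non-object root is an assumption, since the text does not say what the adapter does in that case.

```ts
// Loose JSON Schema shape, enough to inspect the root `type`.
type JsonSchema = { type?: string; [key: string]: unknown };

// Trimmed request shape; only `format` matters for this sketch.
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
  format?: JsonSchema;
}

// Attach `format` only for an object-root schema on a capable model.
function withFormat(
  request: ChatRequest,
  responseSchema: JsonSchema | undefined,
  structuredOutput: boolean,
): ChatRequest {
  if (!responseSchema || !structuredOutput) return request;
  if (responseSchema.type !== "object") return request; // assumption: left untouched
  return { ...request, format: responseSchema };
}

const baseRequest: ChatRequest = {
  model: "llama3.1",
  messages: [{ role: "user", content: "Give me a city and its country." }],
};
const citySchema: JsonSchema = {
  type: "object",
  properties: { city: { type: "string" }, country: { type: "string" } },
  required: ["city", "country"],
};
console.log(withFormat(baseRequest, citySchema, true).format === citySchema); // true
```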
## Multipart messages (vision)

A multipart user message collapses to a single `content` string plus an `images` array of **base64 strings**. An image part given as `{ type: "image", source: { url } }` **throws `InvalidRequestError`** because Ollama cannot fetch remote URLs, so resolve images to base64 first.

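A rough sketch of that collapse is shown below. The part shapes are assumptions (in particular the `base64` field name and joining text parts with a newline), and `InvalidRequestError` here is a local stand-in for the class exported by `@warlock.js/ai`.

```ts
// Hypothetical neutral part shapes; the `base64` field name is an assumption.
type Part =
  | { type: "text"; text: string }
  | { type: "image"; source: { base64: string } | { url: string } };

// Local stand-in for the typed error from `@warlock.js/ai`.
class InvalidRequestError extends Error {}

// Collapse parts into Ollama's `content` string plus `images` array.
function collapseParts(parts: Part[]): { content: string; images: string[] } {
  const texts: string[] = [];
  const images: string[] = [];
  for (const part of parts) {
    if (part.type === "text") {
      texts.push(part.text);
      continue;
    }
    const source = part.source;
    if ("url" in source) {
      // Ollama cannot fetch remote images, so URL sources are rejected.
      throw new InvalidRequestError("Image URLs are not supported; pass base64 data instead.");
    }
    images.push(source.base64);
  }
  return { content: texts.join("\n"), images };
}

const collapsed = collapseParts([
  { type: "text", text: "What is in this picture?" },
  { type: "image", source: { base64: "aGVsbG8=" } },
]);
console.log(collapsed.content, collapsed.images.length);
```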
## Streaming

`model.stream()` drains `chat({ stream: true })` (an `AbortableAsyncIterator`). Each chunk's `message.content` → `{ type: "delta" }`; `message.tool_calls` are emitted as `{ type: "tool-call" }` **fully formed**. The terminal event is `{ type: "done", finishReason, usage }`, with usage taken from the final (`done: true`) chunk's `prompt_eval_count` / `eval_count`.

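The chunk-to-event translation can be sketched over a recorded list of chunks. The real stream is asynchronous; this version is synchronous for brevity. The event payload fields (`text`, `name`, `input`, `promptTokens`, `completionTokens`) and the function name `translate` are assumptions chosen for illustration.

```ts
// Trimmed view of an Ollama chat chunk.
interface OllamaChunk {
  message: {
    content: string;
    tool_calls?: { function: { name: string; arguments: Record<string, unknown> } }[];
  };
  done: boolean;
  done_reason?: string;
  prompt_eval_count?: number;
  eval_count?: number;
}

// Hypothetical neutral events; payload field names are assumptions.
type StreamEvent =
  | { type: "delta"; text: string }
  | { type: "tool-call"; name: string; input: Record<string, unknown> }
  | { type: "done"; finishReason: string; usage: { promptTokens: number; completionTokens: number } };

function translate(chunks: OllamaChunk[]): StreamEvent[] {
  const events: StreamEvent[] = [];
  let sawToolCall = false;
  for (const chunk of chunks) {
    if (chunk.message.content) {
      events.push({ type: "delta", text: chunk.message.content });
    }
    // Tool calls arrive fully formed, so each becomes one event.
    for (const call of chunk.message.tool_calls ?? []) {
      sawToolCall = true;
      events.push({ type: "tool-call", name: call.function.name, input: call.function.arguments });
    }
    if (chunk.done) {
      const reason =
        chunk.done_reason === "stop" || chunk.done_reason === "length" ? chunk.done_reason : "error";
      events.push({
        type: "done",
        finishReason: sawToolCall ? "tool_calls" : reason,
        usage: {
          promptTokens: chunk.prompt_eval_count ?? 0,
          completionTokens: chunk.eval_count ?? 0,
        },
      });
    }
  }
  return events;
}

const streamEvents = translate([
  { message: { content: "Hel" }, done: false },
  { message: { content: "lo" }, done: false },
  { message: { content: "" }, done: true, done_reason: "stop", prompt_eval_count: 5, eval_count: 2 },
]);
console.log(streamEvents.map((streamEvent) => streamEvent.type)); // ["delta", "delta", "done"]
```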
**`options.signal` is honored** by calling the iterator's `abort()`. This applies to the stream path only: non-stream `complete()` ignores the signal, though the agent still honors it at trip boundaries.

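One way to bridge an `AbortSignal` to an iterator's `abort()` is sketched here. `linkAbort` and the `Abortable` interface are hypothetical names; the adapter's own wiring is not shown in this document and may differ.

```ts
// Minimal view of the stream handle: only `abort()` matters here.
interface Abortable {
  abort(): void;
}

// Forward a signal's abort to the stream; returns a cleanup function.
function linkAbort(signal: AbortSignal | undefined, stream: Abortable): () => void {
  if (!signal) return () => {};
  if (signal.aborted) {
    // Already cancelled before streaming started.
    stream.abort();
    return () => {};
  }
  const onAbort = () => stream.abort();
  signal.addEventListener("abort", onAbort, { once: true });
  return () => signal.removeEventListener("abort", onAbort);
}

const controller = new AbortController();
let abortCount = 0;
const unlink = linkAbort(controller.signal, { abort: () => { abortCount += 1; } });
controller.abort();
unlink();
console.log(abortCount); // 1
```

Returning a cleanup function lets the caller detach the listener once the stream finishes normally, so a long-lived signal does not keep a finished stream alive.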
## Finish-reason mapping

`stop` → `stop` · `length` → `length` · `load` / unknown / null → `error`. `tool_calls` is derived from tool-call presence.

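The mapping above fits in a small function. This is a sketch, not the package's `map-done-reason` module: the function name and signature are assumptions, and letting tool-call presence take precedence over every `done_reason` (not only `"stop"`) is also an assumption.

```ts
type FinishReason = "stop" | "length" | "tool_calls" | "error";

// Map Ollama's `done_reason` to a neutral finish reason.
function mapDoneReason(
  doneReason: string | null | undefined,
  hasToolCalls: boolean,
): FinishReason {
  // Ollama reports "stop" even after a tool call, so presence decides.
  if (hasToolCalls) return "tool_calls";
  switch (doneReason) {
    case "stop":
      return "stop";
    case "length":
      return "length";
    default:
      // "load", unknown strings, null, and undefined all land here.
      return "error";
  }
}

console.log(mapDoneReason("stop", false)); // "stop"
console.log(mapDoneReason("stop", true)); // "tool_calls"
console.log(mapDoneReason("load", false)); // "error"
```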
## Embeddings

```ts
const embedder = ollama.embedder({ name: "nomic-embed-text" });
const { vector } = await embedder.embed("Hello world");
const { vectors } = await embedder.embedMany(["a", "b"]); // single batched call
const truncated = ollama.embedder({ name: "mxbai-embed-large", dimensions: 512 });
```

`client.embed` accepts a string array natively, so `embedMany` is **one request** (like the Gemini adapter). Usage comes from `prompt_eval_count` (reported as both `promptTokens` and `totalTokens`). Local Ollama runs without a prompt cache, so model usage has no `cachedTokens`.

`dimensions` is optional. When set it's forwarded as Ollama's `dimensions` truncation field (newer embedding models) and seeds `embedder.dimensions`; when omitted, `embedder.dimensions` starts at `0` and is resolved lazily from the first response's vector length, then cached.

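The lazy resolution of `dimensions` can be pictured with a small tracker. `DimensionTracker` and its `observe` method are hypothetical; they isolate the seeding and caching behaviour described above and are not taken from the package.

```ts
// Tracks embedding width: seeded from config, else learned from the first vector.
class DimensionTracker {
  private resolved: number;

  constructor(configured?: number) {
    // An explicit `dimensions` seeds the value; otherwise start at 0.
    this.resolved = configured ?? 0;
  }

  get dimensions(): number {
    return this.resolved;
  }

  // Call with each response vector; only the first one sets the cached value.
  observe(vector: readonly number[]): void {
    if (this.resolved === 0) {
      this.resolved = vector.length;
    }
  }
}

const lazy = new DimensionTracker();
console.log(lazy.dimensions); // 0 before any response
lazy.observe([0.1, 0.2, 0.3]);
console.log(lazy.dimensions); // 3, cached from the first vector

const seeded = new DimensionTracker(512);
console.log(seeded.dimensions); // 512 from config
```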
## Errors

Errors are wrapped into the typed `@warlock.js/ai` `AIError` hierarchy. The `ollama` client throws an internal `ResponseError` (`status_code` + message); transport failures surface as a `fetch` `TypeError` with an `ECONNREFUSED` cause:

- **Daemon-down (`ECONNREFUSED` / "fetch failed") → `ProviderError`** (operational "is Ollama running?", not a request defect)
- Timeouts → `ProviderTimeoutError`
- 401/403 → `ProviderAuthError`
- 429 → `ProviderRateLimitError`
- 4xx with context phrasing → `ContextLengthExceededError`, else `InvalidRequestError`
- 5xx → `ProviderError`

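The list above amounts to a classification over status code, message, and cause. The sketch below returns the target class name as a string instead of constructing the real error classes. Several details are assumptions: the timeout check (error name or message wording), the "context phrasing" regular expression, and the order in which the checks run.

```ts
type ErrorKind =
  | "ProviderError"
  | "ProviderTimeoutError"
  | "ProviderAuthError"
  | "ProviderRateLimitError"
  | "ContextLengthExceededError"
  | "InvalidRequestError";

// Fields read from the thrown value; everything is optional on purpose.
interface ErrorLike {
  status_code?: number;
  message?: string;
  name?: string;
  cause?: { code?: string };
}

function classifyOllamaError(error: unknown): ErrorKind {
  const err = (error ?? {}) as ErrorLike;
  const message = (err.message ?? "").toLowerCase();
  const statusCode = err.status_code;

  // Daemon down: connection refused or a bare "fetch failed".
  if (err.cause?.code === "ECONNREFUSED" || message.includes("fetch failed")) return "ProviderError";
  // Assumption: timeouts are recognised by name or wording.
  if (err.name === "TimeoutError" || message.includes("timed out")) return "ProviderTimeoutError";
  if (statusCode === 401 || statusCode === 403) return "ProviderAuthError";
  if (statusCode === 429) return "ProviderRateLimitError";
  if (statusCode !== undefined && statusCode >= 400 && statusCode < 500) {
    // Assumption: this pattern stands in for "context phrasing".
    return /context (length|window)|too long/.test(message)
      ? "ContextLengthExceededError"
      : "InvalidRequestError";
  }
  return "ProviderError"; // 5xx and anything unrecognised
}

console.log(classifyOllamaError({ message: "fetch failed", cause: { code: "ECONNREFUSED" } }));
console.log(classifyOllamaError({ status_code: 429, message: "slow down" }));
```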
## Token counting

```ts
await ollama.count("some text") // approximate heuristic, offline
```

## When NOT to use this skill

- Direct `ollama` client calls without going through `@warlock.js/ai` agents.
- OpenAI / Anthropic / Bedrock / Google models, which have their own adapter packages.
- An OpenAI-compatible Ollama gateway you specifically want to drive through the OpenAI protocol: use `@warlock.js/ai-openai` with `baseURL` instead.

## See also

- [`@warlock.js/ai/run-ai-agent/SKILL.md`](@warlock.js/ai/run-ai-agent/SKILL.md)
- [`@warlock.js/ai/pick-ai-provider/SKILL.md`](@warlock.js/ai/pick-ai-provider/SKILL.md)
- [`@warlock.js/ai/embed-text/SKILL.md`](@warlock.js/ai/embed-text/SKILL.md)