omg-llmkit 0.1.0__tar.gz

@@ -0,0 +1,33 @@
1
+ name: CI
2
+
3
+ on:
4
+ push:
5
+ branches: [main]
6
+ pull_request:
7
+
8
+ jobs:
9
+ check:
10
+ runs-on: ubuntu-latest
11
+ steps:
12
+ - uses: actions/checkout@v4
13
+
14
+ - name: Install uv
15
+ uses: astral-sh/setup-uv@v5
16
+
17
+ - name: Set up Python
18
+ run: uv python install 3.13
19
+
20
+ - name: Install dependencies
21
+ run: uv sync --extra dev
22
+
23
+ - name: Ruff (lint)
24
+ run: uv run ruff check .
25
+
26
+ - name: Ruff (format check)
27
+ run: uv run ruff format --check .
28
+
29
+ - name: basedpyright (no baseline)
30
+ run: uv run basedpyright
31
+
32
+ - name: Tests
33
+ run: uv run pytest
@@ -0,0 +1,49 @@
1
+ name: Publish to PyPI
2
+
3
+ # Publishes to PyPI when a GitHub Release is published. Uses PyPI Trusted
4
+ # Publishing (OIDC) — there is NO API token stored in the repo. Configure the
5
+ # matching "pending publisher" on PyPI first (see CONTRIBUTING / release notes):
6
+ # project: omg-llmkit | owner: OMGBrews | repo: llmkit
7
+ # workflow: publish.yml | environment: pypi
8
+ # (PyPI distribution name is "omg-llmkit"; the import name stays "llmkit".)
9
+
10
+ on:
11
+ release:
12
+ types: [published]
13
+ workflow_dispatch: {}
14
+
15
+ jobs:
16
+ build:
17
+ runs-on: ubuntu-latest
18
+ steps:
19
+ - uses: actions/checkout@v4
20
+
21
+ - name: Install uv
22
+ uses: astral-sh/setup-uv@v5
23
+
24
+ - name: Build sdist and wheel
25
+ run: uv build
26
+
27
+ - name: Upload dist artifact
28
+ uses: actions/upload-artifact@v4
29
+ with:
30
+ name: dist
31
+ path: dist/
32
+
33
+ publish:
34
+ needs: build
35
+ runs-on: ubuntu-latest
36
+ environment:
37
+ name: pypi
38
+ url: https://pypi.org/p/omg-llmkit
39
+ permissions:
40
+ id-token: write # required for PyPI Trusted Publishing (OIDC)
41
+ steps:
42
+ - name: Download dist artifact
43
+ uses: actions/download-artifact@v4
44
+ with:
45
+ name: dist
46
+ path: dist/
47
+
48
+ - name: Publish to PyPI
49
+ uses: pypa/gh-action-pypi-publish@release/v1
@@ -0,0 +1,16 @@
1
+ # Python
2
+ __pycache__/
3
+ *.py[cod]
4
+ *.egg-info/
5
+ .pytest_cache/
6
+ .ruff_cache/
7
+
8
+ # Virtual env / uv
9
+ .venv/
10
+ uv.lock
11
+
12
+ # Local LLM call logs (default LocalYamlLogSink output)
13
+ data/llm-logs/
14
+
15
+ # Editor / OS
16
+ .DS_Store
@@ -0,0 +1,26 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project are documented here. The format follows
4
+ [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project
5
+ adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6
+
7
+ ## [0.1.0] — 2026-06-05
8
+
9
+ Initial public release.
10
+
11
+ ### Added
12
+
13
+ - Provider-agnostic call surface over LiteLLM (with `instructor` for structured
14
+ output) across OpenRouter, Google, Anthropic, and local Ollama.
15
+ - `structured_llm_call` / `structured_llm_call_sync` — validated Pydantic output,
16
+ with each provider pinned to its native JSON-schema mode (never auto-`Mode.TOOLS`).
17
+ - `text_llm_call` and `stream_text_with_log` for plain-text and streamed calls.
18
+ - Process-global async rate limiter (`GlobalRateLimiter`, `configure_rate_limit`).
19
+ - Transient-error retries (`with_retries`, `LLM_RECOVERABLE_ERRORS`), kept
20
+ separate from instructor's schema-repair retries.
21
+ - Agent-readable logging: `LocalYamlLogSink` writes verdict-first per-call YAML
22
+ plus an append-only `index.jsonl`; pluggable `LogSink` protocol for custom sinks.
23
+ - Approximate per-call cost (`approximate_cost`) sourced from LiteLLM's response
24
+ estimate, for budget visibility.
25
+
26
+ [0.1.0]: https://github.com/OMGBrews/llmkit/releases/tag/v0.1.0
@@ -0,0 +1,32 @@
1
+ # Contributing
2
+
3
+ Thanks for your interest. This is a small, opinionated, best-effort project — see
4
+ the scope notes in the [README](README.md). Bug reports and focused pull requests
5
+ are welcome; large feature proposals may not be a fit for the library's
6
+ deliberately-thin design, so please open an issue to discuss before investing in
7
+ a big change.
8
+
9
+ ## Development setup
10
+
11
+ ```bash
12
+ uv sync --extra dev
13
+ ```
14
+
15
+ ## Checks must pass
16
+
17
+ CI runs the same four gates on every push to `main` and on every pull request, with **no baseline**:
18
+
19
+ ```bash
20
+ uv run ruff check .
21
+ uv run ruff format --check .
22
+ uv run basedpyright # 0 errors, 0 warnings
23
+ uv run pytest
24
+ ```
25
+
26
+ Please run them locally before opening a PR. New behavior needs a test.
27
+
28
+ ## Conventions
29
+
30
+ - Keep the public surface small — `llmkit` owns the call ergonomics, not transport.
31
+ - No `dict[str, Any]` / bare `Any`; use precise types (basedpyright enforces this).
32
+ - Hard cuts over deprecation shims for internal changes.
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 OMGBrews
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,226 @@
1
+ Metadata-Version: 2.4
2
+ Name: omg-llmkit
3
+ Version: 0.1.0
4
+ Summary: A thin, opinionated, local-first structured-output + logging layer over LiteLLM
5
+ Project-URL: Homepage, https://github.com/OMGBrews/llmkit
6
+ Project-URL: Repository, https://github.com/OMGBrews/llmkit
7
+ Project-URL: Issues, https://github.com/OMGBrews/llmkit/issues
8
+ Project-URL: Changelog, https://github.com/OMGBrews/llmkit/blob/main/CHANGELOG.md
9
+ Author: OMGBrews
10
+ License: MIT License
11
+
12
+ Copyright (c) 2026 OMGBrews
13
+
14
+ Permission is hereby granted, free of charge, to any person obtaining a copy
15
+ of this software and associated documentation files (the "Software"), to deal
16
+ in the Software without restriction, including without limitation the rights
17
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
18
+ copies of the Software, and to permit persons to whom the Software is
19
+ furnished to do so, subject to the following conditions:
20
+
21
+ The above copyright notice and this permission notice shall be included in all
22
+ copies or substantial portions of the Software.
23
+
24
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
25
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
26
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
27
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
28
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
29
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
30
+ SOFTWARE.
31
+ License-File: LICENSE
32
+ Keywords: anthropic,gemini,instructor,litellm,llm,openai,structured-output
33
+ Classifier: Development Status :: 4 - Beta
34
+ Classifier: Intended Audience :: Developers
35
+ Classifier: License :: OSI Approved :: MIT License
36
+ Classifier: Operating System :: OS Independent
37
+ Classifier: Programming Language :: Python :: 3.13
38
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
39
+ Classifier: Typing :: Typed
40
+ Requires-Python: >=3.13
41
+ Requires-Dist: httpx>=0.27.0
42
+ Requires-Dist: instructor>=1.15.1
43
+ Requires-Dist: litellm>=1.87.1
44
+ Requires-Dist: openai>=2.0.0
45
+ Requires-Dist: pydantic>=2.5.0
46
+ Requires-Dist: pyyaml>=6.0.0
47
+ Provides-Extra: dev
48
+ Requires-Dist: basedpyright>=1.39; extra == 'dev'
49
+ Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
50
+ Requires-Dist: pytest>=8.0.0; extra == 'dev'
51
+ Requires-Dist: ruff==0.15.0; extra == 'dev'
52
+ Description-Content-Type: text/markdown
53
+
54
+ # llmkit
55
+
56
+ A thin, opinionated, **local-first** layer over [LiteLLM](https://github.com/BerriAI/litellm) (with [instructor](https://github.com/567-labs/instructor) for structured output). It gives an application one provider-agnostic call surface across **OpenRouter, Google, Anthropic, and local Ollama**, with validated structured output, a global async rate limiter, transient-error retries, and **agent-readable per-call logging** out of the box.
57
+
58
+ LiteLLM is the implementation of the HTTP providers; llmkit owns the ergonomic call surface, the structured-output mode pinning, the rate-limit policy, and the logging convention. It is **not** a gateway and does not reimplement transport — that is solved, and reimplementing it is the thing this library deliberately does not do.
59
+
60
+ ## Why llmkit
61
+
62
+ - **Structured output that actually validates.** Each provider is pinned to its *native* JSON-schema mode (never instructor's auto-`Mode.TOOLS`, which silently regresses Gemini to empty shapes), and instructor's in-call validation-retry repairs truncated JSON. You pass a Pydantic model; you get a validated instance back.
63
+ - **Provider switching is config, not code.** OpenRouter / Google / Anthropic / Ollama behind one `Provider` enum and one `LLMClientConfig`. Call sites never change when you switch.
64
+ - **Logging tuned for coding agents.** Every call is logged verdict-first (see below) — the design assumption is that the reader is usually an LLM coding agent debugging a run, not a dashboard.
65
+ - **Local-first, zero infra.** The default sink writes plain files to a directory. No collector, no account, no network. A pluggable `LogSink` lets you ship records anywhere later without touching call sites.
66
+
67
+ ## Install
68
+
69
+ ```bash
70
+ uv add omg-llmkit # or: pip install omg-llmkit
71
+ ```
72
+
73
+ The distribution is published as **`omg-llmkit`** (the bare `llmkit` name was already
74
+ taken on PyPI), but the import name is just `llmkit`:
75
+
76
+ ```python
77
+ import llmkit
78
+ ```
79
+
80
+ Requires Python ≥ 3.13.
81
+
82
+ ## Quick start
83
+
84
+ ```python
85
+ from pydantic import BaseModel
86
+ from llmkit import (
87
+ LLMClientConfig,
88
+ Provider,
89
+ configure_llm_client,
90
+ structured_llm_call,
91
+ )
92
+
93
+ # Point the library at a provider once, at startup.
94
+ configure_llm_client(lambda: LLMClientConfig(
95
+ provider=Provider.OPENROUTER,
96
+ model="google/gemini-2.5-flash",
97
+ api_key="sk-or-...",
98
+ ))
99
+
100
+ class Summary(BaseModel):
101
+ title: str
102
+ bullets: list[str]
103
+
104
+ result: Summary = await structured_llm_call(
105
+ prompt="Summarize the attached report.",
106
+ schema=Summary,
107
+ feature="reports", # groups calls in the logs
108
+ label="exec_summary", # names this specific call in the logs
109
+ )
110
+ ```
111
+
112
+ The public call surface:
113
+
114
+ | Function | Use |
115
+ |----------|-----|
116
+ | `structured_llm_call(prompt, schema, feature, label, ...)` | Async, returns a validated Pydantic instance |
117
+ | `structured_llm_call_sync(...)` | Synchronous wrapper around the above |
118
+ | `text_llm_call(prompt, feature, label, ...)` | Async, returns plain text (coerces provider list-content blocks) |
119
+ | `stream_text_with_log(prompt, feature, label, ...)` | Async generator yielding text chunks, logged on completion |
120
+
121
+ `configure_rate_limit(...)` sets the process-global concurrency cap; `configure_llm_logging(sink)` swaps the log sink (below).
122
+
123
+ ## Logging: agent-readable by default
124
+
125
+ `LocalYamlLogSink` (the default) writes **two** things to `data/llm-logs/`:
126
+
127
+ 1. **One YAML file per call, laid out verdict-first.** The file opens with a one-line `#` header — `ok`/`ERROR`, feature/label, resolved model, schema, duration, approximate cost — so `head -1 *.yaml` triages a whole run. Small metadata is next; the large `response` and `prompt` blobs are last, so the *head* of the file is the whole story for most reads.
128
+ 2. **A compact append-only `index.jsonl`** — one JSON line per call (file, timestamp, feature, label, model, provider, schema, duration, cost, error). Cross-call questions — "which calls errored / were slowest / most expensive / the last call for feature X" — are a single small scan instead of globbing and parsing every YAML.
129
+
130
+ ```
131
+ # ok | reports/exec_summary | google/gemini-2.5-flash | Summary | 1840ms | $0.0007
132
+ # 2026-06-05T14:22:31.004512
133
+
134
+ timestamp: '2026-06-05T14:22:31.004512'
135
+ feature: reports
136
+ label: exec_summary
137
+ model: google/gemini-2.5-flash
138
+ provider: openrouter
139
+ schema: Summary
140
+ temperature: 0.0
141
+ duration_ms: 1840.2
142
+ approximate_cost: 0.0007
143
+ error: null
144
+ response: ...
145
+ prompt: ...
146
+ ```
147
+
148
+ `approximate_cost` is LiteLLM's per-response estimate for budget visibility — **not** a billing figure (and `None` when the provider does not report it, e.g. streamed calls).
149
+
150
+ ### Write your own `LogSink`
151
+
152
+ `LogSink` is a one-method `Protocol`. Records (`LLMCallRecord`, a frozen dataclass) are handed to your sink for every call; failures are swallowed so logging can never break a call. To send records somewhere other than local YAML — a database, an HTTP collector, structured stdout — implement `write` and register it:
153
+
154
+ ```python
155
+ import logging
156
+ from pathlib import Path
157
+ from llmkit import LLMCallRecord, configure_llm_logging
158
+
159
+ logger = logging.getLogger("llm-calls")
160
+
161
+ class StructuredStdoutSink:
162
+ def write(self, record: LLMCallRecord) -> Path | None:
163
+ logger.info(
164
+ "llm_call",
165
+ extra={
166
+ "feature": record.feature,
167
+ "label": record.label,
168
+ "model": record.model,
169
+ "provider": record.provider,
170
+ "schema": record.schema,
171
+ "duration_ms": record.duration_ms,
172
+ "approximate_cost": record.approximate_cost,
173
+ "error": record.error,
174
+ },
175
+ )
176
+ return None # nothing persisted to a path
177
+
178
+ configure_llm_logging(StructuredStdoutSink()) # pass None to disable logging entirely
179
+ ```
180
+
181
+ An OpenTelemetry exporter (e.g. to Langfuse/Phoenix) is a natural future `llmkit[otel]` extra; the pluggable seam makes it a non-breaking addition.
182
+
183
+ ## Configuration
184
+
185
+ `LLMClientConfig` is flat and carries only what a call needs:
186
+
187
+ ```python
188
+ @dataclass(frozen=True)
189
+ class LLMClientConfig:
190
+ provider: Provider # OPENROUTER | OLLAMA | GOOGLE | ANTHROPIC
191
+ model: str # the provider's default model
192
+ api_key: str | None = None
193
+ base_url: str | None = None
194
+ ```
195
+
196
+ Per-call `model=` overrides the default, so "strong/small/current" model roles are the host's concern — resolve them to a model string and pass it at the call site. The library has no opinion about roles.
197
+
198
+ Register the config with `configure_llm_client(source)`, where `source` is a zero-arg callable returning an `LLMClientConfig` (re-read on each provider construction, so it tracks live settings changes).
199
+
200
+ ## Retries
201
+
202
+ Two retry layers, kept deliberately separate:
203
+
204
+ - **`with_retries()`** ([`retry.py`](src/llmkit/retry.py)) handles *transient provider* errors (429 / 503 / 5xx; the recoverable set is `LLM_RECOVERABLE_ERRORS`).
205
+ - **instructor's own low `max_retries`** handles *schema-validation* repair (re-ask the model to fix malformed JSON).
206
+
207
+ ## Development
208
+
209
+ ```bash
210
+ uv sync --extra dev
211
+ uv run ruff check . && uv run ruff format --check .
212
+ uv run basedpyright # 0 errors, 0 warnings — no baseline
213
+ uv run pytest
214
+ ```
215
+
216
+ ## Status & support
217
+
218
+ `llmkit` is a small, opinionated, **best-effort** project, extracted from a real
219
+ application and maintained in the open. It is used in production by its author
220
+ but carries no support SLA. Bug reports and focused pull requests are welcome —
221
+ see [CONTRIBUTING.md](CONTRIBUTING.md). For security issues, see
222
+ [SECURITY.md](SECURITY.md).
223
+
224
+ ## License
225
+
226
+ MIT — see [LICENSE](LICENSE).
@@ -0,0 +1,173 @@
1
+ # llmkit
2
+
3
+ A thin, opinionated, **local-first** layer over [LiteLLM](https://github.com/BerriAI/litellm) (with [instructor](https://github.com/567-labs/instructor) for structured output). It gives an application one provider-agnostic call surface across **OpenRouter, Google, Anthropic, and local Ollama**, with validated structured output, a global async rate limiter, transient-error retries, and **agent-readable per-call logging** out of the box.
4
+
5
+ LiteLLM is the implementation of the HTTP providers; llmkit owns the ergonomic call surface, the structured-output mode pinning, the rate-limit policy, and the logging convention. It is **not** a gateway and does not reimplement transport — that is solved, and reimplementing it is the thing this library deliberately does not do.
6
+
7
+ ## Why llmkit
8
+
9
+ - **Structured output that actually validates.** Each provider is pinned to its *native* JSON-schema mode (never instructor's auto-`Mode.TOOLS`, which silently regresses Gemini to empty shapes), and instructor's in-call validation-retry repairs truncated JSON. You pass a Pydantic model; you get a validated instance back.
10
+ - **Provider switching is config, not code.** OpenRouter / Google / Anthropic / Ollama behind one `Provider` enum and one `LLMClientConfig`. Call sites never change when you switch.
11
+ - **Logging tuned for coding agents.** Every call is logged verdict-first (see below) — the design assumption is that the reader is usually an LLM coding agent debugging a run, not a dashboard.
12
+ - **Local-first, zero infra.** The default sink writes plain files to a directory. No collector, no account, no network. A pluggable `LogSink` lets you ship records anywhere later without touching call sites.
13
+
14
+ ## Install
15
+
16
+ ```bash
17
+ uv add omg-llmkit # or: pip install omg-llmkit
18
+ ```
19
+
20
+ The distribution is published as **`omg-llmkit`** (the bare `llmkit` name was already
21
+ taken on PyPI), but the import name is just `llmkit`:
22
+
23
+ ```python
24
+ import llmkit
25
+ ```
26
+
27
+ Requires Python ≥ 3.13.
28
+
29
+ ## Quick start
30
+
31
+ ```python
32
+ from pydantic import BaseModel
33
+ from llmkit import (
34
+ LLMClientConfig,
35
+ Provider,
36
+ configure_llm_client,
37
+ structured_llm_call,
38
+ )
39
+
40
+ # Point the library at a provider once, at startup.
41
+ configure_llm_client(lambda: LLMClientConfig(
42
+ provider=Provider.OPENROUTER,
43
+ model="google/gemini-2.5-flash",
44
+ api_key="sk-or-...",
45
+ ))
46
+
47
+ class Summary(BaseModel):
48
+ title: str
49
+ bullets: list[str]
50
+
51
+ result: Summary = await structured_llm_call(
52
+ prompt="Summarize the attached report.",
53
+ schema=Summary,
54
+ feature="reports", # groups calls in the logs
55
+ label="exec_summary", # names this specific call in the logs
56
+ )
57
+ ```
58
+
59
+ The public call surface:
60
+
61
+ | Function | Use |
62
+ |----------|-----|
63
+ | `structured_llm_call(prompt, schema, feature, label, ...)` | Async, returns a validated Pydantic instance |
64
+ | `structured_llm_call_sync(...)` | Synchronous wrapper around the above |
65
+ | `text_llm_call(prompt, feature, label, ...)` | Async, returns plain text (coerces provider list-content blocks) |
66
+ | `stream_text_with_log(prompt, feature, label, ...)` | Async generator yielding text chunks, logged on completion |
67
+
68
+ `configure_rate_limit(...)` sets the process-global concurrency cap; `configure_llm_logging(sink)` swaps the log sink (below).
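To make the idea of a process-global concurrency cap concrete, here is a minimal sketch built on `asyncio.Semaphore`. It is not llmkit's `GlobalRateLimiter`, whose implementation and parameters are not shown in this package diff; `ConcurrencyCap`, `run`, and `peak` are hypothetical names used only for illustration.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class ConcurrencyCap:
    """Hypothetical cap on how many awaited calls may be in flight at once."""

    def __init__(self, max_concurrent: int) -> None:
        if max_concurrent < 1:
            raise ValueError("max_concurrent must be at least 1")
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, kept for inspection

    async def run(self, call: Callable[[], Awaitable[T]]) -> T:
        # Callers beyond the cap wait here until a slot is released.
        async with self._semaphore:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await call()
            finally:
                self.in_flight -= 1
```

Under this sketch, "process-global" would mean keeping a single instance at module level and routing every call through it, so unrelated call sites share one budget.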
69
+
70
+ ## Logging: agent-readable by default
71
+
72
+ `LocalYamlLogSink` (the default) writes **two** things to `data/llm-logs/`:
73
+
74
+ 1. **One YAML file per call, laid out verdict-first.** The file opens with a one-line `#` header — `ok`/`ERROR`, feature/label, resolved model, schema, duration, approximate cost — so `head -1 *.yaml` triages a whole run. Small metadata is next; the large `response` and `prompt` blobs are last, so the *head* of the file is the whole story for most reads.
75
+ 2. **A compact append-only `index.jsonl`** — one JSON line per call (file, timestamp, feature, label, model, provider, schema, duration, cost, error). Cross-call questions — "which calls errored / were slowest / most expensive / the last call for feature X" — are a single small scan instead of globbing and parsing every YAML.
76
+
77
+ ```
78
+ # ok | reports/exec_summary | google/gemini-2.5-flash | Summary | 1840ms | $0.0007
79
+ # 2026-06-05T14:22:31.004512
80
+
81
+ timestamp: '2026-06-05T14:22:31.004512'
82
+ feature: reports
83
+ label: exec_summary
84
+ model: google/gemini-2.5-flash
85
+ provider: openrouter
86
+ schema: Summary
87
+ temperature: 0.0
88
+ duration_ms: 1840.2
89
+ approximate_cost: 0.0007
90
+ error: null
91
+ response: ...
92
+ prompt: ...
93
+ ```
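The one-line header in the example above could be produced by a formatter along these lines. This is a sketch inferred from the sample output only: `CallSummary` and `header_line` are hypothetical names, the field set is a reduced stand-in for `LLMCallRecord`, and the `n/a` placeholder for a missing cost is an assumption rather than llmkit's documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CallSummary:
    """Hypothetical stand-in holding only the fields the header line needs."""

    feature: str
    label: str
    model: str
    schema: str
    duration_ms: float
    approximate_cost: Optional[float]
    error: Optional[str]


def header_line(call: CallSummary) -> str:
    """Format a verdict-first comment line: verdict, then identifying metadata."""
    verdict = "ok" if call.error is None else "ERROR"
    # Assumption: a missing cost is rendered as "n/a".
    cost = "n/a" if call.approximate_cost is None else f"${call.approximate_cost:.4f}"
    return (
        f"# {verdict} | {call.feature}/{call.label} | {call.model} | "
        f"{call.schema} | {call.duration_ms:.0f}ms | {cost}"
    )
```

Putting the verdict first is what makes a `head -1` pass over the directory useful: a reader can tell success from failure without parsing any YAML.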
94
+
95
+ `approximate_cost` is LiteLLM's per-response estimate for budget visibility — **not** a billing figure (and `None` when the provider does not report it, e.g. streamed calls).
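As an illustration of the cross-call questions the index is meant to answer, here is a minimal sketch of a scan over `index.jsonl`. The key names used (`file`, `error`, `duration_ms`, `approximate_cost`) are assumptions based on the field list and the YAML example above; the actual keys may differ, so inspect a real index line first. The helper names are hypothetical and not part of llmkit.

```python
import json
from pathlib import Path

Row = dict[str, object]


def load_index(path: Path) -> list[Row]:
    """Read one JSON object per non-empty line of an index.jsonl file."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def _num(value: object) -> float:
    """Treat missing or non-numeric values as 0 so sorting never fails."""
    return float(value) if isinstance(value, (int, float)) else 0.0


def errored(rows: list[Row]) -> list[Row]:
    """Rows whose (assumed) `error` key is set."""
    return [row for row in rows if row.get("error") is not None]


def top_by(rows: list[Row], key: str, n: int = 3) -> list[Row]:
    """The n rows with the largest value under `key`, largest first."""
    return sorted(rows, key=lambda row: _num(row.get(key)), reverse=True)[:n]
```

For example, `top_by(rows, "duration_ms")` would list the slowest calls and `top_by(rows, "approximate_cost")` the most expensive ones, each pointing back to a per-call YAML file through its `file` entry.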
96
+
97
+ ### Write your own `LogSink`
98
+
99
+ `LogSink` is a one-method `Protocol`. Records (`LLMCallRecord`, a frozen dataclass) are handed to your sink for every call; failures are swallowed so logging can never break a call. To send records somewhere other than local YAML — a database, an HTTP collector, structured stdout — implement `write` and register it:
100
+
101
+ ```python
102
+ import logging
103
+ from pathlib import Path
104
+ from llmkit import LLMCallRecord, configure_llm_logging
105
+
106
+ logger = logging.getLogger("llm-calls")
107
+
108
+ class StructuredStdoutSink:
109
+ def write(self, record: LLMCallRecord) -> Path | None:
110
+ logger.info(
111
+ "llm_call",
112
+ extra={
113
+ "feature": record.feature,
114
+ "label": record.label,
115
+ "model": record.model,
116
+ "provider": record.provider,
117
+ "schema": record.schema,
118
+ "duration_ms": record.duration_ms,
119
+ "approximate_cost": record.approximate_cost,
120
+ "error": record.error,
121
+ },
122
+ )
123
+ return None # nothing persisted to a path
124
+
125
+ configure_llm_logging(StructuredStdoutSink()) # pass None to disable logging entirely
126
+ ```
127
+
128
+ An OpenTelemetry exporter (e.g. to Langfuse/Phoenix) is a natural future `llmkit[otel]` extra; the pluggable seam makes it a non-breaking addition.
129
+
130
+ ## Configuration
131
+
132
+ `LLMClientConfig` is flat and carries only what a call needs:
133
+
134
+ ```python
135
+ @dataclass(frozen=True)
136
+ class LLMClientConfig:
137
+ provider: Provider # OPENROUTER | OLLAMA | GOOGLE | ANTHROPIC
138
+ model: str # the provider's default model
139
+ api_key: str | None = None
140
+ base_url: str | None = None
141
+ ```
142
+
143
+ Per-call `model=` overrides the default, so "strong/small/current" model roles are the host's concern — resolve them to a model string and pass it at the call site. The library has no opinion about roles.
144
+
145
+ Register the config with `configure_llm_client(source)`, where `source` is a zero-arg callable returning an `LLMClientConfig` (re-read on each provider construction, so it tracks live settings changes).
146
+
147
+ ## Retries
148
+
149
+ Two retry layers, kept deliberately separate:
150
+
151
+ - **`with_retries()`** ([`retry.py`](src/llmkit/retry.py)) handles *transient provider* errors (429 / 503 / 5xx; the recoverable set is `LLM_RECOVERABLE_ERRORS`).
152
+ - **instructor's own low `max_retries`** handles *schema-validation* repair (re-ask the model to fix malformed JSON).
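The first layer can be pictured with a generic backoff loop. This sketch is not the `with_retries` implementation, whose signature is not shown in this package diff; `TransientError` and `retry_transient` are hypothetical names, and exponential backoff with jitter is one common policy chosen here for illustration.

```python
import asyncio
import random
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a recoverable provider error such as HTTP 429 or a 5xx."""


async def retry_transient(
    call: Callable[[], Awaitable[T]],
    *,
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Retry `call` on TransientError only; any other exception propagates at once."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last transient error
            # Exponential backoff plus jitter so concurrent callers spread out.
            delay = base_delay * 2**attempt + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
    raise AssertionError("unreachable")
```

Keeping this loop blind to validation failures is what preserves the separation: a malformed response is not a transport problem, so under this sketch it would pass straight through to the schema-repair layer instead of being re-sent unchanged.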
153
+
154
+ ## Development
155
+
156
+ ```bash
157
+ uv sync --extra dev
158
+ uv run ruff check . && uv run ruff format --check .
159
+ uv run basedpyright # 0 errors, 0 warnings — no baseline
160
+ uv run pytest
161
+ ```
162
+
163
+ ## Status & support
164
+
165
+ `llmkit` is a small, opinionated, **best-effort** project, extracted from a real
166
+ application and maintained in the open. It is used in production by its author
167
+ but carries no support SLA. Bug reports and focused pull requests are welcome —
168
+ see [CONTRIBUTING.md](CONTRIBUTING.md). For security issues, see
169
+ [SECURITY.md](SECURITY.md).
170
+
171
+ ## License
172
+
173
+ MIT — see [LICENSE](LICENSE).
@@ -0,0 +1,28 @@
1
+ # Security Policy
2
+
3
+ ## Reporting a vulnerability
4
+
5
+ Please **do not** open a public issue for security problems.
6
+
7
+ Report privately through GitHub's
8
+ [private vulnerability reporting](https://github.com/OMGBrews/llmkit/security/advisories/new)
9
+ (the **Security** tab → **Report a vulnerability**). This opens a private
10
+ advisory visible only to the maintainers.
11
+
12
+ This is a small, best-effort project. There is no formal SLA, but reports are
13
+ taken seriously and acknowledged as soon as practical.
14
+
15
+ ## Scope
16
+
17
+ `llmkit` is a thin client layer; it holds no credentials of its own and runs no
18
+ network services. The most security-relevant surfaces are:
19
+
20
+ - **Provider API keys** passed through `LLMClientConfig` — these live in your
21
+ process, never in `llmkit`.
22
+ - **The default log sink** (`LocalYamlLogSink`) writes prompts and responses to
23
+ local files under `data/llm-logs/`. Treat that directory as sensitive if your
24
+ prompts carry secrets, and add it to `.gitignore` (the default for this repo).
25
+
26
+ ## Supported versions
27
+
28
+ Only the latest released version receives fixes.