llm.rb 4.21.0 → 4.23.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +78 -0
- data/README.md +290 -59
- data/data/anthropic.json +35 -2
- data/data/google.json +7 -2
- data/data/openai.json +0 -30
- data/lib/llm/active_record/acts_as_agent.rb +11 -64
- data/lib/llm/active_record/acts_as_llm.rb +81 -61
- data/lib/llm/agent.rb +15 -3
- data/lib/llm/buffer.rb +10 -0
- data/lib/llm/compactor.rb +128 -0
- data/lib/llm/context.rb +31 -2
- data/lib/llm/function.rb +2 -1
- data/lib/llm/sequel/agent.rb +4 -17
- data/lib/llm/sequel/plugin.rb +82 -60
- data/lib/llm/skill.rb +29 -14
- data/lib/llm/stream.rb +20 -1
- data/lib/llm/tool.rb +14 -0
- data/lib/llm/version.rb +1 -1
- data/llm.gemspec +3 -0
- metadata +44 -1
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 49ed8077a6283802d4141dcb9ec037c7fc46920ebd3273b30c55624b575f3156
+  data.tar.gz: e2289baf740ba9603ed1c308414e632ddda296356659c8714bf3a1744c216104
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b6b0d72baa785a6bf25cbfd3f2581d7f6a5850a0fa61dea29668596e19eb8a1142330f8acfea7f04a1bc76461c02c0af681588332d955aae2b5c6808f2fc0610
+  data.tar.gz: 836fc45489b9d86c7bde3ed2b94d2813be5bdaea1ebf7697f7e7eca5962f5374343e371188e40ced180ed50e053cd74ec8fcec8dea08c164291ee8577301f195
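The `+` entries above are the published digests for the tar members inside the 4.23.0 package. To check a downloaded copy against the published SHA256 value, a minimal sketch using only Ruby's standard library (the local `llm-4.23.0.gem` filename is an assumption):

```ruby
require "digest"
require "rubygems/package"

# Published SHA256 for data.tar.gz, copied from the checksums.yaml diff above.
EXPECTED = "e2289baf740ba9603ed1c308414e632ddda296356659c8714bf3a1744c216104"

# A .gem file is a plain tar archive whose members include metadata.gz,
# data.tar.gz, and checksums.yaml.gz; hash the member we care about.
File.open("llm-4.23.0.gem", "rb") do |io|
  Gem::Package::TarReader.new(io).each do |entry|
    next unless entry.full_name == "data.tar.gz"
    digest = Digest::SHA256.hexdigest(entry.read)
    puts(digest == EXPECTED ? "data.tar.gz: OK" : "data.tar.gz: mismatch (#{digest})")
  end
end
```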
data/CHANGELOG.md
CHANGED

@@ -2,8 +2,86 @@
 
 ## Unreleased
 
+Changes since `v4.23.0`.
+
+## v4.23.0
+
+Changes since `v4.22.0`.
+
+This release expands llm.rb's runtime surface for long-lived contexts and
+stateful tools. It adds built-in context compaction through `LLM::Compactor`,
+lets explicit `tools:` arrays accept bound `LLM::Tool` instances, and fixes
+OpenAI-compatible no-arg tool schemas for stricter providers such as xAI.
+
+### Change
+
+* **Add `LLM::Compactor` for long-lived contexts** <br>
+  Add built-in context compaction through `LLM::Compactor`, so older history
+  can be summarized, retained windows can stay bounded, compaction can run on
+  its own `model:`, and `LLM::Stream` can observe the lifecycle through
+  `on_compaction` and `on_compaction_finish`.
+
+* **Allow bound tool instances in explicit tool lists** <br>
+  Let explicit `tools:` arrays accept `LLM::Tool` instances such as
+  `MyTool.new(foo: 1)`, so tools can carry bound state without changing the
+  global tool registry model.
+
+### Fix
+
+* **Fix xAI/OpenAI-compatible no-arg tool schemas** <br>
+  Send an empty object schema for tools without declared parameters instead
+  of `null`, so stricter providers such as xAI accept mixed tool sets that
+  include no-arg tools.
+
+## v4.22.0
+
 Changes since `v4.21.0`.
 
+This release deepens the runtime shape of llm.rb. It reduces helper-method
+surface on persisted ORM models, expands real ORM coverage, and makes skills
+behave more like bounded sub-agents with inherited recent context and proper
+instruction injection.
+
+### Change
+
+* **Reduce ActiveRecord wrapper model surface** <br>
+  Move helper methods such as option resolution, column mapping,
+  serialization, and persistence into `Utils` for the ActiveRecord
+  wrappers so wrapped models include fewer internal helper methods.
+
+* **Reduce Sequel wrapper model surface** <br>
+  Move helper methods such as option resolution, column mapping,
+  serialization, and persistence into `Utils` for the Sequel wrappers
+  so wrapped models include fewer internal helper methods.
+
+* **Expand ORM integration coverage** <br>
+  Add broader ActiveRecord and Sequel coverage for persisted context and
+  agent wrappers, including real SQLite-backed records and cassette-backed
+  OpenAI persistence paths.
+
+* **Make skills inherit recent parent context** <br>
+  Run `LLM::Skill` with a curated slice of recent parent user and assistant
+  messages, prefixed with `Recent context:`, so skills behave more like
+  task-scoped sub-agents instead of instruction-only helpers.
+
+### Fix
+
+* **Fix Sequel `plugin :agent` load order** <br>
+  Require the shared Sequel plugin support from `LLM::Sequel::Agent` so
+  `plugin :agent` can load independently without raising
+  `uninitialized constant LLM::Sequel::Plugin`.
+
+* **Make skill execution inherit parent context request settings** <br>
+  Run `LLM::Skill` through a parent `LLM::Context` instead of a bare
+  provider so nested skill agents inherit context-level settings such as
+  `mode: :responses`, `store: false`, streaming, and other request defaults,
+  while still keeping skill-local tools and avoiding parent schemas.
+
+* **Keep agent instructions when history is preseeded** <br>
+  Inject `LLM::Agent` instructions once unless a system message is already
+  present, so agents and nested skills still get their instructions when
+  they start with inherited non-system context.
+
 ## v4.21.0
 
 Changes since `v4.20.2`.
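The two tool-related v4.23.0 entries are easiest to see in code. A minimal sketch, assuming hypothetical `MyTool` and `Ping` subclasses of `LLM::Tool`; the class bodies are illustrative, and only the `tools:` behavior and the empty-object schema come from the changelog:

```ruby
require "llm"

# Hypothetical tool carrying bound state: configuration passed to the
# constructor travels with this instance rather than living in the global
# tool registry. (A real subclass may also need to call super.)
class MyTool < LLM::Tool
  def initialize(foo:)
    @foo = foo
  end
end

# Hypothetical tool with no declared parameters. As of v4.23.0 its schema is
# sent as an empty object schema instead of null, which stricter
# OpenAI-compatible providers such as xAI require.
class Ping < LLM::Tool
end

llm = LLM.openai(key: ENV["KEY"])

# v4.23.0: an explicit tools: array may mix classes and bound instances.
ctx = LLM::Context.new(llm, tools: [MyTool.new(foo: 1), Ping])
```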
data/README.md
CHANGED

@@ -4,23 +4,14 @@
 <p align="center">
 <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
 <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-<a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.
+<a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.23.0-green.svg?" alt="Version"></a>
 </p>
 
 ## About
 
-llm.rb is
+llm.rb is the most capable runtime for building AI systems in Ruby.
 <br>
 
-It is also the most capable AI Ruby runtime that exists _today_, and that claim is
-backed up by research. Maybe it won't always be true, and that would be good news too -
-because it would mean the Ruby ecosystem is getting stronger.
-
-llm.rb is not just an API wrapper: it gives you one runtime for providers,
-contexts, agents, tools, skills, MCP servers, streaming, schemas, files, and
-persisted state, so real systems can be built out of one coherent execution
-model instead of a pile of adapters.
-
 llm.rb is designed for Ruby, and although it works great in Rails, it is not tightly
 coupled to it. It runs on the standard library by default (zero dependencies),
 loads optional pieces only when needed, includes built-in ActiveRecord support through
@@ -29,6 +20,10 @@ loads optional pieces only when needed, includes built-in ActiveRecord support t
 long-lived, tool-capable, stateful AI workflows instead of just
 request/response helpers.
 
+It provides one runtime for providers, agents, tools, skills, MCP servers, streaming,
+schemas, files, and persisted state, so real systems can be built out of one coherent
+execution model instead of a pile of adapters.
+
 Want to see some code? Jump to [the examples](#examples) section. <br>
 Want a taste of what llm.rb can build? See [the screencast](#screencast).
 
@@ -53,6 +48,197 @@ It holds:
 Instead of switching abstractions for each feature, everything builds on the
 same context object.
 
+## Standout features
+
+The following list is **not exhaustive**, but it covers a lot of ground.
+
+#### Skills
+
+Skills are reusable, directory-backed capabilities loaded from `SKILL.md`.
+They run through the same runtime as tools, agents, and MCP. They do not
+require a second orchestration layer or a parallel abstraction. If you've
+used Claude or Codex, you know the general idea of skills, and llm.rb
+supports that same concept with the same execution model as the rest of the
+system.
+
+In llm.rb, a skill has frontmatter and instructions. The frontmatter can
+define `name`, `description`, and `tools`. The `tools` entries are tool names,
+and each name must resolve to a subclass of
+[`LLM::Tool`](https://0x1eef.github.io/x/llm.rb/LLM/Tool.html) that is already
+loaded in the runtime.
+
+If you want Claude/Codex-like skills that can drive scripts or shell
+commands, you would typically pair the skill with a tool that can execute
+system commands.
+
+```yaml
+---
+name: release
+description: Prepare a release
+tools:
+  - search_docs
+  - git
+---
+Review the release state, summarize what changed, and prepare the release.
+```
+
+```ruby
+class Agent < LLM::Agent
+  model "gpt-5.4-mini"
+  skills "./skills/release"
+end
+
+llm = LLM.openai(key: ENV["KEY"])
+Agent.new(llm, stream: $stdout).talk("Let's prepare the release!")
+```
+
+#### ORM
+
+Any ActiveRecord model or Sequel model can become an agent-capable model,
+including existing business and domain models, without forcing you into a
+separate agent table or a second persistence layer.
+
+`acts_as_agent` extends a model with agent capabilities: the same runtime
+surface as [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html),
+because it actually wraps an `LLM::Agent`, plus persistence through a text,
+JSON, or JSONB-backed column on the same table.
+
+
+```ruby
+class Ticket < ApplicationRecord
+  acts_as_agent provider: :set_provider
+  model "gpt-5.4-mini"
+  instructions "You are a support assistant."
+
+  private
+
+  def set_provider
+    { key: ENV["#{provider.upcase}_SECRET"], persistent: true }
+  end
+end
+```
+
+#### Agentic Patterns
+
+llm.rb is especially strong when you want to build agentic systems in a Ruby
+way. Agents can be ordinary application models with state, associations,
+tools, skills, and persistence, which makes it much easier to build systems
+where users have their own specialized agents instead of treating agents as
+something outside the app.
+
+That pattern works so well in llm.rb because
+[`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html),
+`acts_as_agent`, `plugin :agent`, skills, tools, and persisted runtime state
+all fit the same execution model. The runtime stays small enough that the
+main design work becomes application design, not orchestration glue.
+
+For a concrete example, see
+[How to build a platform of agents](https://0x1eef.github.io/posts/how-to-build-a-platform-of-agents).
+
+#### Persistence
+
+The same runtime can be serialized to disk, restored later, persisted in JSON
+or JSONB-backed ORM columns, resumed across process boundaries, or shared
+across long-lived workflows.
+
+```ruby
+ctx = LLM::Context.new(llm)
+ctx.talk("Remember that my favorite language is Ruby.")
+ctx.save(path: "context.json")
+```
+
+#### Context Compaction
+
+Long-lived contexts can compact older history into a summary instead of
+growing forever. Compaction is built into [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html)
+through [`LLM::Compactor`](https://0x1eef.github.io/x/llm.rb/LLM/Compactor.html),
+and when a stream is present it emits `on_compaction` and
+`on_compaction_finish` through [`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html).
+The compactor can also use a different model from the main context, which is
+useful when you want summarization to run on a cheaper or faster model.
+
+```ruby
+ctx = LLM::Context.new(
+  llm,
+  compactor: {
+    message_threshold: 200,
+    retention_window: 8,
+    model: "gpt-5.4-mini"
+  }
+)
+```
+
+#### LLM::Stream
+
+`LLM::Stream` is not just for printing tokens. It supports `on_content`,
+`on_reasoning_content`, `on_tool_call`, `on_tool_return`, `on_compaction`,
+and `on_compaction_finish`, which means visible output, reasoning output, tool
+execution, and context compaction can all be driven through the same
+execution path.
+
+```ruby
+class Stream < LLM::Stream
+  def on_tool_call(tool, error)
+    queue << tool.spawn(:thread)
+  end
+
+  def on_tool_return(tool, result)
+    puts(result.value)
+  end
+end
+```
+
+#### Concurrency
+
+Tool execution can run sequentially with `:call` or concurrently through
+`:thread`, `:task`, `:fiber`, and experimental `:ractor`, without rewriting
+your tool layer.
+
+```ruby
+class Agent < LLM::Agent
+  model "gpt-5.4-mini"
+  tools FetchWeather, FetchNews, FetchStock
+  concurrency :thread
+end
+```
+
+#### MCP
+
+Remote MCP tools and prompts are not bolted on as a separate integration
+stack. They adapt into the same tool and prompt path used by local tools,
+skills, contexts, and agents.
+
+```ruby
+begin
+  mcp = LLM::MCP.http(url: "https://api.githubcopilot.com/mcp/").persistent
+  mcp.start
+  ctx = LLM::Context.new(llm, tools: mcp.tools)
+ensure
+  mcp.stop
+end
+```
+
+#### Cancellation
+
+Cancellation is one of the harder problems to get right, and while llm.rb
+makes it possible, it still requires careful engineering to use effectively.
+The point though is that it is possible to stop in-flight provider work cleanly
+through the same runtime, and the model used by llm.rb is directly inspired by
+Go's context package. In fact, llm.rb is heavily inspired by Go but with a Ruby
+twist.
+
+```ruby
+ctx = LLM::Context.new(llm, stream: $stdout)
+worker = Thread.new do
+  ctx.talk("Write a very long essay about network protocols.")
+rescue LLM::Interrupt
+  puts "Request was interrupted!"
+end
+STDIN.getch
+ctx.interrupt!
+worker.join
+```
+
 ## Differentiators
 
 ### Execution Model
@@ -137,11 +323,11 @@ same context object.
 - **Tools are explicit** <br>
   Run local tools, provider-native tools, and MCP tools through the same path
   with fewer special cases.
-- **Skills
+- **Skills become bounded runtime capabilities** <br>
   Point llm.rb at directories with a `SKILL.md`, resolve named tools through
-  the registry, and
-
-
+  the registry, and adapt each skill into its own callable capability through
+  the normal runtime. Unlike a generic skill-discovery tool, each skill runs
+  with its own bounded tool subset and behaves like a task-scoped sub-agent.
 - **Providers are normalized, not flattened** <br>
   Share one API surface across providers without losing access to provider-
   specific capabilities where they matter.
@@ -173,24 +359,32 @@ same context object.
 
 ## Capabilities
 
+Execution:
 - **Chat & Contexts** — stateless and stateful interactions with persistence
 - **Context Serialization** — save and restore state across processes or time
 - **Streaming** — visible output, reasoning output, tool-call events
 - **Request Interruption** — stop in-flight provider work cleanly
+- **Concurrent Execution** — threads, async tasks, and fibers
+
+Runtime Building Blocks:
 - **Tool Calling** — class-based tools and closure-based functions
 - **Run Tools While Streaming** — overlap model output with tool latency
-- **Concurrent Execution** — threads, async tasks, and fibers
 - **Agents** — reusable assistants with tool auto-execution
 - **Skills** — directory-backed capabilities loaded from `SKILL.md`
+- **MCP Support** — stdio and HTTP MCP clients with prompt and tool support
+- **Context Compaction** — summarize older history in long-lived contexts
+
+Data and Structure:
 - **Structured Outputs** — JSON Schema-based responses
 - **Responses API** — stateful response workflows where providers support them
-- **MCP Support** — stdio and HTTP MCP clients with prompt and tool support
 - **Multimodal Inputs** — text, images, audio, documents, URLs
 - **Audio** — speech generation, transcription, translation
 - **Images** — generation and editing
 - **Files API** — upload and reference files in prompts
 - **Embeddings** — vector generation for search and RAG
 - **Vector Stores** — retrieval workflows
+
+Operations:
 - **Cost Tracking** — local cost estimation without extra API calls
 - **Observability** — tracing, logging, telemetry
 - **Model Registry** — local metadata for capabilities, limits, pricing
@@ -221,6 +415,44 @@ loop do
 end
 ```
 
+#### Agent
+
+This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) directly and lets the agent manage tool execution. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
+
+```ruby
+require "llm"
+
+class ShellAgent < LLM::Agent
+  model "gpt-5.4-mini"
+  instructions "You are a Linux system assistant."
+  tools Shell
+  concurrency :thread
+end
+
+llm = LLM.openai(key: ENV["KEY"])
+agent = ShellAgent.new(llm)
+puts agent.talk("What time is it on this system?").content
+```
+
+#### Skills
+
+This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) with directory-backed skills so `SKILL.md` capabilities run through the normal tool path. In llm.rb, a skill is exposed as a tool in the runtime. When that tool is called, it spawns a sub-agent with relevant context plus the instructions and tool subset declared in its own `SKILL.md`. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
+
+Each skill runs only with the tools declared in its own frontmatter.
+
+```ruby
+require "llm"
+
+class Agent < LLM::Agent
+  model "gpt-5.4-mini"
+  instructions "You are a concise release assistant."
+  skills "./skills/release", "./skills/review"
+end
+
+llm = LLM.openai(key: ENV["KEY"])
+puts Agent.new(llm).talk("Use the review skill.").content
+```
+
 #### Streaming
 
 This example uses [`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html) directly so visible output and tool execution can happen together. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
@@ -255,6 +487,42 @@ ctx.talk("Run `date` and `uname -a`.")
 ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
 ```
 
+#### Context Compaction
+
+This example uses [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html),
+[`LLM::Compactor`](https://0x1eef.github.io/x/llm.rb/LLM/Compactor.html), and
+[`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html) together so
+long-lived contexts can summarize older history and expose the lifecycle
+through stream hooks. This approach is inspired by General Intelligence
+Systems' [Brute](https://github.com/general-intelligence-systems/brute). The
+compactor can also use its own `model:` if you want summarization to run on a
+different model from the main context. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
+
+```ruby
+require "llm"
+
+class Stream < LLM::Stream
+  def on_compaction(ctx, compactor)
+    puts "Compacting #{ctx.messages.size} messages..."
+  end
+
+  def on_compaction_finish(ctx, compactor)
+    puts "Compacted to #{ctx.messages.size} messages."
+  end
+end
+
+llm = LLM.openai(key: ENV["KEY"])
+ctx = LLM::Context.new(
+  llm,
+  stream: Stream.new,
+  compactor: {
+    message_threshold: 200,
+    retention_window: 8,
+    model: "gpt-5.4-mini"
+  }
+)
+```
+
 #### Reasoning
 
 This example uses [`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html) with the OpenAI Responses API so reasoning output is streamed separately from visible assistant output. See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
@@ -354,12 +622,11 @@ require "active_record"
 require "llm/active_record"
 
 class Ticket < ApplicationRecord
-  acts_as_agent provider: :set_provider
-
-
-
-
-end
+  acts_as_agent provider: :set_provider
+  model "gpt-5.4-mini"
+  instructions "You are a concise support assistant."
+  tools SearchDocs, Escalate
+  concurrency :thread
 
   private
 
@@ -372,42 +639,6 @@
 puts ticket.talk("How do I rotate my API key?").content
 ```
 
-#### Agent
-
-This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) directly and lets the agent manage tool execution. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
-
-```ruby
-require "llm"
-
-class ShellAgent < LLM::Agent
-  model "gpt-5.4-mini"
-  instructions "You are a Linux system assistant."
-  tools Shell
-  concurrency :thread
-end
-
-llm = LLM.openai(key: ENV["KEY"])
-agent = ShellAgent.new(llm)
-puts agent.talk("What time is it on this system?").content
-```
-
-#### Skills
-
-This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) with directory-backed skills so `SKILL.md` capabilities run through the normal tool path. If you have used skills in Claude or Codex, this is the same kind of building block. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
-
-```ruby
-require "llm"
-
-class Agent < LLM::Agent
-  model "gpt-5.4-mini"
-  instructions "You are a concise release assistant."
-  skills "./skills/release", "./skills/review"
-end
-
-llm = LLM.openai(key: ENV["KEY"])
-puts Agent.new(llm).talk("Use the review skill.").content
-```
-
 #### MCP
 
 This example uses [`LLM::MCP`](https://0x1eef.github.io/x/llm.rb/LLM/MCP.html) over HTTP so remote GitHub MCP tools run through the same `LLM::Context` tool path as local tools. See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
data/data/anthropic.json
CHANGED

@@ -213,7 +213,7 @@
     "reasoning": true,
     "tool_call": true,
     "temperature": true,
-    "knowledge": "2025-08",
+    "knowledge": "2025-08-31",
     "release_date": "2026-02-17",
     "last_updated": "2026-03-13",
     "modalities": {
@@ -271,6 +271,39 @@
       "output": 32000
     }
   },
+  "claude-opus-4-7": {
+    "id": "claude-opus-4-7",
+    "name": "Claude Opus 4.7",
+    "family": "claude-opus",
+    "attachment": true,
+    "reasoning": true,
+    "tool_call": true,
+    "temperature": false,
+    "knowledge": "2026-01-31",
+    "release_date": "2026-04-16",
+    "last_updated": "2026-04-16",
+    "modalities": {
+      "input": [
+        "text",
+        "image",
+        "pdf"
+      ],
+      "output": [
+        "text"
+      ]
+    },
+    "open_weights": false,
+    "cost": {
+      "input": 5,
+      "output": 25,
+      "cache_read": 0.5,
+      "cache_write": 6.25
+    },
+    "limit": {
+      "context": 1000000,
+      "output": 128000
+    }
+  },
   "claude-3-haiku-20240307": {
     "id": "claude-3-haiku-20240307",
     "name": "Claude Haiku 3",
@@ -609,7 +642,7 @@
     "reasoning": true,
     "tool_call": true,
     "temperature": true,
-    "knowledge": "2025-05",
+    "knowledge": "2025-05-31",
     "release_date": "2026-02-05",
     "last_updated": "2026-03-13",
     "modalities": {
data/data/google.json
CHANGED

@@ -594,7 +594,12 @@
     "cost": {
       "input": 1.25,
       "output": 10,
-      "cache_read": 0.
+      "cache_read": 0.125,
+      "context_over_200k": {
+        "input": 2.5,
+        "output": 15,
+        "cache_read": 0.25
+      }
     },
     "limit": {
       "context": 1048576,
@@ -824,7 +829,7 @@
     "cost": {
       "input": 0.3,
       "output": 2.5,
-      "cache_read": 0.
+      "cache_read": 0.03,
       "input_audio": 1
     },
     "limit": {
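The new `context_over_200k` block gives long prompts their own rates. A minimal sketch of the tier selection, under two stated assumptions: rates are USD per million tokens, and the higher tier applies to the whole prompt rather than only the marginal tokens (llm.rb's own cost tracking is the authoritative reader of these fields):

```ruby
# Rates copied from the google.json diff above.
BASE = {input: 1.25, output: 10, cache_read: 0.125}
OVER_200K = {input: 2.5, output: 15, cache_read: 0.25} # "context_over_200k"

# Assumption: USD per million tokens, whole-prompt tiering.
def input_cost(tokens)
  rates = tokens > 200_000 ? OVER_200K : BASE
  rates[:input] * tokens / 1_000_000.0
end

input_cost(100_000) # => 0.125 (base tier)
input_cost(300_000) # => 0.75  (over-200k tier)
```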
data/data/openai.json
CHANGED

@@ -1066,36 +1066,6 @@
       "output": 100000
     }
   },
-  "codex-mini-latest": {
-    "id": "codex-mini-latest",
-    "name": "Codex Mini",
-    "family": "gpt-codex-mini",
-    "attachment": true,
-    "reasoning": true,
-    "tool_call": true,
-    "temperature": false,
-    "knowledge": "2024-04",
-    "release_date": "2025-05-16",
-    "last_updated": "2025-05-16",
-    "modalities": {
-      "input": [
-        "text"
-      ],
-      "output": [
-        "text"
-      ]
-    },
-    "open_weights": false,
-    "cost": {
-      "input": 1.5,
-      "output": 6,
-      "cache_read": 0.375
-    },
-    "limit": {
-      "context": 200000,
-      "output": 100000
-    }
-  },
   "gpt-4": {
     "id": "gpt-4",
     "name": "GPT-4",