llm_cassette 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: bcde9366222cf40c75db3288557afc756c0e8d36f9a98fa2d8c538a57351f734
4
+ data.tar.gz: ffcba7b15bf3c5dc5e358def22da27bdf2985593120c1f92897f6057295d5e96
5
+ SHA512:
6
+ metadata.gz: d7dad527e4d7f8084f0c98a8d0ef5db8d406d6f7b972223d2cf61a5af8b2c628697c2ce05be2e4c9a4754267bc55210e66dbe1f41f4b3fcb87b0711bc48e9596
7
+ data.tar.gz: 6b7b76e152aecfdd2659390b60907863727b6b2f78c9db2aa61da428bdd90e6489c8b690682eb900b001c29dd6766add168322f10f73c6a8f77825b89bd7e28a
data/.rspec ADDED
@@ -0,0 +1,3 @@
1
+ --require spec_helper
2
+ --format documentation
3
+ --color
data/.rubocop.yml ADDED
@@ -0,0 +1,91 @@
1
+ plugins:
2
+ - rubocop-rspec
3
+
4
+ AllCops:
5
+ NewCops: enable
6
+ TargetRubyVersion: 3.2
7
+ Exclude:
8
+ - "bin/**/*"
9
+ - "vendor/**/*"
10
+ - "pkg/**/*"
11
+
12
+ Style/StringLiterals:
13
+ EnforcedStyle: double_quotes
14
+
15
+ Style/FrozenStringLiteralComment:
16
+ Enabled: true
17
+
18
+ Metrics/MethodLength:
19
+ Max: 20
20
+
21
+ Metrics/BlockLength:
22
+ Max: 40
23
+ Exclude:
24
+ - "spec/**/*"
25
+
26
+ Metrics/ClassLength:
27
+ Max: 150
28
+
29
+ Style/Documentation:
30
+ Enabled: false
31
+
32
+ Naming/MethodParameterName:
33
+ MinNameLength: 1
34
+
35
+ Naming/PredicateMethod:
36
+ Enabled: false
37
+
38
+ RSpec/VerifiedDoubles:
39
+ Enabled: false
40
+
41
+ RSpec/MessageSpies:
42
+ Enabled: false
43
+
44
+ RSpec/IdenticalEqualityAssertion:
45
+ Enabled: false
46
+
47
+ RSpec/StubbedMock:
48
+ Enabled: false
49
+
50
+ Style/OpenStructUse:
51
+ Exclude:
52
+ - "spec/**/*"
53
+
54
+ Lint/FloatComparison:
55
+ Enabled: false
56
+
57
+ Metrics/AbcSize:
58
+ Max: 35
59
+
60
+ Metrics/CyclomaticComplexity:
61
+ Max: 10
62
+
63
+ Metrics/PerceivedComplexity:
64
+ Max: 12
65
+
66
+ Metrics/ParameterLists:
67
+ Max: 7
68
+
69
+ RSpec/SpecFilePathFormat:
70
+ Enabled: false
71
+
72
+ RSpec/DescribeClass:
73
+ Exclude:
74
+ - "spec/llm_cassette/rspec_helpers_spec.rb"
75
+
76
+ RSpec/MultipleDescribes:
77
+ Exclude:
78
+ - "spec/llm_cassette/rspec_helpers_spec.rb"
79
+
80
+ RSpec/MultipleExpectations:
81
+ Max: 5
82
+
83
+ RSpec/ExampleLength:
84
+ Max: 20
85
+
86
+ RSpec/MultipleMemoizedHelpers:
87
+ Max: 10
88
+
89
+ Lint/EmptyBlock:
90
+ Exclude:
91
+ - "spec/**/*"
data/CHANGELOG.md ADDED
@@ -0,0 +1,27 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [Unreleased]
9
+
10
+ ## [0.1.0] — 2026-06-09
11
+
12
+ ### Added
13
+
14
+ - `LlmCassette::Middleware` Faraday middleware — intercepts LLM HTTP requests, routes to recorder or replayer based on active cassette and record mode.
15
+ - `LlmCassette::Cassette` — loads and saves `.yml` cassette files. Sequential interaction replay via internal index pointer. Creates intermediate directories on save.
16
+ - `LlmCassette::Interaction` — value object for a single request+response pair. Handles streaming (SSE chunk array + offsets) and non-streaming (raw body) response shapes.
17
+ - `LlmCassette::Recorder` — wraps Faraday's `on_data` callback to capture SSE chunks with wall-clock offsets. Falls back to capturing response body for non-streaming calls.
18
+ - `LlmCassette::Replayer` — replays non-streaming responses as a fake `Faraday::Response`. For streaming, emits recorded chunks via the caller's `on_data` proc. Optional timing replay via `replay_timing` config flag.
19
+ - `LlmCassette::RequestSignature` — normalizes request method, URI, and JSON body (sorted keys) for cassette storage and debugging.
20
+ - Record modes: `:none` (replay only — raises `CassetteNotFoundError` if cassette missing) and `:all` (always re-record, hits real API).
21
+ - Token usage extraction — best-effort parsing from response body (non-streaming) or last SSE chunk containing `usage` (streaming). Stored per interaction under `response.usage`.
22
+ - `LlmCassette.use_cassette("name") { }` block API — sets and clears thread-local cassette, ensures `eject!` even on exception.
23
+ - `LlmCassette::RSpec::Helpers` — `use_llm_cassette "name"` class macro wraps every example in the group via `around`. Auto-derives cassette name from example description when called without arguments.
24
+ - RSpec metadata form — `it "...", llm_cassette: "name"` and `it "...", llm_cassette: true` (auto-name). Enabled via `require "llm_cassette/rspec"`.
25
+ - `LlmCassette::CassetteNotFoundError` — raised when cassette file is missing in `:none` mode.
26
+ - `LlmCassette::NoMoreInteractionsError` — raised when a cassette is exhausted and another request is made.
27
+ - Works with OpenAI and Anthropic out of the box — provider-agnostic at the Faraday layer; usage extraction handles both `prompt_tokens`/`completion_tokens` (OpenAI) and `input_tokens`/`output_tokens` (Anthropic) field names.
data/README.md ADDED
@@ -0,0 +1,267 @@
1
+ # llm_cassette
2
+
3
+ **VCR for LLMs — streaming-aware. Record once, replay fast, never hit the API in CI.**
4
+
5
+ [![CI](https://github.com/jibranusman95/llm_cassette/actions/workflows/ci.yml/badge.svg)](https://github.com/jibranusman95/llm_cassette/actions)
6
+ [![Gem Version](https://badge.fury.io/rb/llm_cassette.svg)](https://badge.fury.io/rb/llm_cassette)
7
+ [![Downloads](https://img.shields.io/gem/dt/llm_cassette)](https://rubygems.org/gems/llm_cassette)
8
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
9
+
10
+ ---
11
+
12
+ Every Ruby team shipping LLM features ends up with the same problem:
13
+
14
+ ```ruby
15
+ # spec/services/chat_service_spec.rb
16
+ it "returns a greeting" do
17
+ # VCR records raw bytes — replays the entire SSE response as one blob.
18
+ # Incremental stream processing breaks. Token counts are lost.
19
+ # Cassettes bust on every prompt tweak.
20
+ VCR.use_cassette("greeting") do
21
+ result = ChatService.call("say hello")
22
+ expect(result).to include("Hello")
23
+ end
24
+ end
25
+ ```
26
+
27
+ VCR records raw HTTP bytes. For SSE streaming it replays them all at once — your `on_data` callback fires once instead of chunk-by-chunk. Incremental stream rendering breaks. Token costs vanish. Cassettes bust on any prompt change.
28
+
29
+ Here's the same test with llm_cassette:
30
+
31
+ ```ruby
32
+ it "returns a greeting" do
33
+ LlmCassette.use_cassette("greeting") do
34
+ result = ChatService.call("say hello")
35
+ expect(result).to include("Hello")
36
+ end
37
+ end
38
+ ```
39
+
40
+ Chunks replay in order via `on_data`. Token usage is stored per interaction. Cassettes are human-readable YAML. Works with OpenAI, Anthropic, or any Faraday-based LLM client.
41
+
42
+ ---
43
+
44
+ ## What you get
45
+
46
+ **Streaming-aware replay** — records SSE chunks as an ordered sequence with wall-clock offsets. Replays them via `on_data` exactly as the real API would, so incremental stream processing works correctly in tests.
47
+
48
+ **Works without configuration** — hooks into Faraday, which both [ruby_llm](https://github.com/crmne/ruby_llm) and [llm.rb](https://github.com/kieranklaassen/llm.rb) use internally. One middleware insert and you're done.
49
+
50
+ **Token usage stored per interaction** — `response.usage.prompt_tokens`, `completion_tokens`, `total_tokens` stored in every cassette. Supports both OpenAI and Anthropic field names.
51
+
52
+ **Human-readable cassettes** — plain YAML files you can read, edit, and commit. One file per cassette, multiple interactions per file.
53
+
54
+ **RSpec helpers** — class-level `use_llm_cassette`, inline block form, and RSpec metadata — three ergonomic styles for three different situations.
55
+
56
+ ---
57
+
58
+ ## Install
59
+
60
+ ```ruby
61
+ # Gemfile
62
+ gem "llm_cassette"
63
+ ```
64
+
65
+ Add the middleware to your Faraday connection:
66
+
67
+ ```ruby
68
+ # config/initializers/faraday.rb (or wherever you build your connection)
69
+ Faraday.default_connection_options.builder_middlewares.unshift(LlmCassette::Middleware)
70
+
71
+ # Or on a specific connection:
72
+ conn = Faraday.new do |f|
73
+ f.use LlmCassette::Middleware
74
+ f.adapter Faraday.default_adapter
75
+ end
76
+ ```
77
+
78
+ ---
79
+
80
+ ## Configuration
81
+
82
+ ```ruby
83
+ # spec/support/llm_cassette.rb
84
+ require "llm_cassette/rspec"
85
+
86
+ LlmCassette.configure do |config|
87
+ config.cassette_directory = Rails.root.join("spec/llm_cassettes").to_s
88
+
89
+ # :none — replay only. Raises CassetteNotFoundError if cassette missing (default, good for CI).
90
+ # :all — always hit the real API and re-record.
91
+ config.record = ENV["LLM_RECORD"] ? :all : :none
92
+
93
+ # Replay chunks with the original timing delays (default: false — fast CI).
94
+ config.replay_timing = false
95
+ end
96
+ ```
97
+
98
+ ---
99
+
100
+ ## Usage
101
+
102
+ ### Block form
103
+
104
+ ```ruby
105
+ LlmCassette.use_cassette("chat_greeting") do
106
+ response = client.chat(messages: [{ role: "user", content: "say hello" }])
107
+ expect(response.content).to include("Hello")
108
+ end
109
+ ```
110
+
111
+ ### Class-level helper (wraps every example in the group)
112
+
113
+ ```ruby
114
+ RSpec.describe ChatService do
115
+ use_llm_cassette "chat_service"
116
+
117
+ it "returns a greeting" do
118
+ expect(ChatService.call("say hello")).to include("Hello")
119
+ end
120
+
121
+ it "handles follow-ups" do
122
+ # same cassette, second interaction consumed sequentially
123
+ expect(ChatService.call("now say goodbye")).to include("Goodbye")
124
+ end
125
+ end
126
+ ```
127
+
128
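+ Omit the name to auto-derive one from each example's full description (group description plus example description), so every example in the group gets its own cassette: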
129
+
130
+ ```ruby
131
+ RSpec.describe ChatService do
132
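+ use_llm_cassette # one cassette per example, e.g. "chatservice_returns_a_greeting"
133
+
134
+ it("returns a greeting") { expect(ChatService.call("say hello")).to include("Hello") }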
135
+ end
136
+ ```
137
+
138
+ ### RSpec metadata
139
+
140
+ ```ruby
141
+ it "greets the user", llm_cassette: "greeting" do
142
+ expect(ChatService.call("hi")).to include("Hello")
143
+ end
144
+
145
+ # Auto-name from example description:
146
+ it "greets the user", llm_cassette: true do
147
+ expect(ChatService.call("hi")).to include("Hello")
148
+ end
149
+ ```
150
+
151
+ ---
152
+
153
+ ## Recording cassettes
154
+
155
+ Set the record mode to `:all` to hit the real API and write cassette files. With the configuration shown earlier, that means setting `LLM_RECORD`:
156
+
157
+ ```bash
158
+ LLM_RECORD=1 bundle exec rspec spec/services/chat_service_spec.rb
159
+ ```
160
+
161
+ Then commit the cassette files and run with `record: :none` in CI.
162
+
163
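+ To re-record a single cassette, run only the specs that use it with `LLM_RECORD=1`. In `:all` mode the existing file is overwritten when the cassette is ejected, so deleting it first is not required.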
164
+
165
+ ---
166
+
167
+ ## Cassette format
168
+
169
+ Cassettes are plain YAML — readable, diffable, and editable by hand:
170
+
171
+ ```yaml
172
+ ---
173
+ llm_cassette_version: "1"
174
+ recorded_at: "2026-06-09T12:00:00Z"
175
+ interactions:
176
+   - request:
177
+       method: post
178
+       uri: "https://api.openai.com/v1/chat/completions"
179
+       body: '{"messages":[{"role":"user","content":"say hello"}],"model":"gpt-4o","stream":true}'
180
+     response:
181
+       status: 200
182
+       headers:
183
+         content-type: "text/event-stream; charset=utf-8"
184
+       streaming: true
185
+       chunks:
186
+         - data: "data: {\"choices\":[{\"delta\":{\"content\":\"Hello\"}}]}\n\n"
187
+           offset: 0.0
188
+         - data: "data: {\"choices\":[{\"delta\":{\"content\":\" world!\"}}],\"usage\":{\"prompt_tokens\":10,\"completion_tokens\":2,\"total_tokens\":12}}\n\n"
189
+           offset: 0.134
190
+         - data: "data: [DONE]\n\n"
191
+           offset: 0.187
192
+       usage:
193
+         prompt_tokens: 10
194
+         completion_tokens: 2
195
+         total_tokens: 12
196
+ ```
197
+
198
+ Multiple interactions in one cassette are replayed sequentially — first request gets `interactions[0]`, second gets `interactions[1]`, and so on.
199
+
200
+ ---
201
+
202
+ ## How streaming replay works
203
+
204
+ VCR captures raw HTTP bytes and writes them to a cassette. When it replays a streaming response, it returns the entire body at once — your `on_data` callback fires once with all the bytes. Any code that renders output incrementally as chunks arrive breaks silently.
205
+
206
+ llm_cassette records each SSE chunk separately as it arrives from `on_data`, along with its wall-clock offset from the start of the request. On replay, it calls your `on_data` proc with each chunk in sequence. The stream arrives the same way it would from the real API.
207
+
208
+ ```
209
+ Real API: on_data("data: Hello\n\n") → on_data("data: world\n\n") → on_data("data: [DONE]\n\n")
210
+ VCR replay: on_data("data: Hello\n\ndata: world\n\ndata: [DONE]\n\n") ← one call, breaks incremental rendering
211
+ llm_cassette: on_data("data: Hello\n\n") → on_data("data: world\n\n") → on_data("data: [DONE]\n\n")
212
+ ```
213
+
214
+ Set `config.replay_timing = true` to also replay the inter-chunk delays for timing-sensitive tests.
215
+
216
+ ---
217
+
218
+ ## Why not VCR?
219
+
220
+ [vcr](https://github.com/vcr/vcr) is excellent for REST APIs — 156M downloads and well-maintained. For LLM calls specifically:
221
+
222
+ | | VCR | llm_cassette |
223
+ |---|---|---|
224
+ | SSE streaming replay | Blob — one `on_data` call | Sequential chunks via `on_data` |
225
+ | Token usage | Not captured | Stored per interaction |
226
+ | Cassette format | Marshal / YAML of raw bytes | Human-readable YAML |
227
+ | Provider knowledge | None | Extracts usage from OpenAI + Anthropic chunk formats |
228
+
229
+ If you're not using streaming and don't need token tracking, VCR works fine. llm_cassette is for teams where streaming is the default.
230
+
231
+ ---
232
+
233
+ ## Requirements
234
+
235
+ - Ruby >= 3.2
236
+ - Faraday >= 1.0
237
+
238
+ No Rails dependency. Works with any Ruby HTTP stack that uses Faraday.
239
+
240
+ ---
241
+
242
+ ## Contributing
243
+
244
+ ```bash
245
+ git clone https://github.com/jibranusman95/llm_cassette
246
+ cd llm_cassette
247
+ bundle install
248
+ bundle exec rspec
249
+ bundle exec rubocop
250
+ ```
251
+
252
+ ---
253
+
254
+ ## From the same author
255
+
256
+ | Gem | What it does |
257
+ |-----|-------------|
258
+ | [webhook_inbox](https://github.com/jibranusman95/webhook_inbox) | Transactional inbox for Rails webhook receivers — dedup, async processing, replay, dashboard |
259
+ | [turbo_presence](https://github.com/jibranusman95/turbo_presence) | Figma-style live cursors, avatar stacks, and typing indicators for Rails — one line |
260
+ | [promptscrub](https://github.com/jibranusman95/promptscrub) | PII redaction middleware for LLM calls |
261
+ | [http_decoy](https://github.com/jibranusman95/http_decoy) | A real Rack server that runs inside your RSpec tests — test HTTP contracts, not stubs |
262
+
263
+ ---
264
+
265
+ ## License
266
+
267
+ MIT. See [LICENSE](LICENSE).
data/Rakefile ADDED
@@ -0,0 +1,10 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+ require "rubocop/rake_task"
6
+
7
+ RSpec::Core::RakeTask.new(:spec)
8
+ RuboCop::RakeTask.new
9
+
10
+ task default: %i[spec rubocop]
data/lib/llm_cassette/cassette.rb ADDED
@@ -0,0 +1,83 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "yaml"
4
+ require "fileutils"
5
+ require "time"
6
+
7
+ module LlmCassette
8
+ class Cassette
9
+ attr_reader :name
10
+
11
+ def initialize(name, record: nil)
12
+ @name = name.to_s
13
+ @record_mode = (record || LlmCassette.configuration.record).to_sym
14
+ @interactions = []
15
+ @replay_index = 0
16
+
17
+ return if record?
18
+ unless file_exists?
19
+ raise CassetteNotFoundError, "Cassette '#{name}' not found at #{path}. " \
20
+ "Re-run with record: :all to record it."
21
+ end
22
+ load!
23
+ end
24
+
25
+ def record?
26
+ @record_mode == :all
27
+ end
28
+
29
+ def next_interaction
30
+ interaction = @interactions[@replay_index]
31
+ @replay_index += 1
32
+
33
+ unless interaction
34
+ raise NoMoreInteractionsError,
35
+ "No more interactions in cassette '#{name}'. " \
36
+ "Expected interaction ##{@replay_index} but cassette has #{@interactions.size}."
37
+ end
38
+
39
+ interaction
40
+ end
41
+
42
+ def record_interaction(interaction)
43
+ @interactions << interaction
44
+ end
45
+
46
+ def eject!
47
+ save! if record?
48
+ end
49
+
50
+ def size
51
+ @interactions.size
52
+ end
53
+
54
+ private
55
+
56
+ def path
57
+ dir = LlmCassette.configuration.cassette_directory
58
+ File.join(dir, "#{name}.yml")
59
+ end
60
+
61
+ def file_exists?
62
+ File.exist?(path)
63
+ end
64
+
65
+ def load!
66
+ data = YAML.safe_load_file(path, permitted_classes: [Symbol])
67
+ @interactions = (data["interactions"] || []).map { |i| Interaction.from_hash(i) }
68
+ end
69
+
70
+ def save!
71
+ FileUtils.mkdir_p(File.dirname(path))
72
+ File.write(path, YAML.dump(to_h))
73
+ end
74
+
75
+ def to_h
76
+ {
77
+ "llm_cassette_version" => "1",
78
+ "recorded_at" => Time.now.utc.iso8601,
79
+ "interactions" => @interactions.map(&:to_h)
80
+ }
81
+ end
82
+ end
83
+ end
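The positional replay in `Cassette#next_interaction` can be sketched without the gem. This is a minimal illustration under stated assumptions: `TinyCassette` and `ExhaustedError` are hypothetical stand-ins (the real class raises `NoMoreInteractionsError`), and the interactions are plain strings rather than `Interaction` objects.

```ruby
# Stand-in for LlmCassette::NoMoreInteractionsError (hypothetical name).
class ExhaustedError < StandardError; end

# Hypothetical stand-in for Cassette: hands out interactions strictly in order.
class TinyCassette
  def initialize(interactions)
    @interactions = interactions
    @replay_index = 0
  end

  # Returns the next recorded interaction; the incoming request is not inspected.
  def next_interaction
    interaction = @interactions[@replay_index]
    @replay_index += 1
    raise ExhaustedError, "expected interaction ##{@replay_index}, have #{@interactions.size}" unless interaction

    interaction
  end
end

cassette = TinyCassette.new(%w[first second])
replayed = [cassette.next_interaction, cassette.next_interaction]

# A third request has nothing left to replay.
exhausted = begin
  cassette.next_interaction
  false
rescue ExhaustedError
  true
end
```

Because replay is positional, a test that issues its requests in a different order than they were recorded would receive the responses in recorded order, not matched to the request.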
data/lib/llm_cassette/configuration.rb ADDED
@@ -0,0 +1,24 @@
1
+ # frozen_string_literal: true
2
+
3
+ module LlmCassette
4
+ class Configuration
5
+ RECORD_MODES = %i[none all].freeze
6
+
7
+ attr_reader :record
8
+ attr_accessor :cassette_directory, :replay_timing
9
+
10
+ def initialize
11
+ @cassette_directory = "spec/llm_cassettes"
12
+ @record = :none
13
+ @replay_timing = false
14
+ end
15
+
16
+ def record=(mode)
17
+ unless RECORD_MODES.include?(mode.to_sym)
18
+ raise ArgumentError, "Invalid record mode: #{mode}. Must be one of: #{RECORD_MODES.join(', ')}"
19
+ end
20
+
21
+ @record = mode.to_sym
22
+ end
23
+ end
24
+ end
data/lib/llm_cassette/errors.rb ADDED
@@ -0,0 +1,7 @@
1
+ # frozen_string_literal: true
2
+
3
+ module LlmCassette
4
+ class Error < StandardError; end
5
+ class CassetteNotFoundError < Error; end
6
+ class NoMoreInteractionsError < Error; end
7
+ end
data/lib/llm_cassette/interaction.rb ADDED
@@ -0,0 +1,106 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "json"
4
+
5
+ module LlmCassette
6
+ class Interaction
7
+ attr_reader :request, :response
8
+
9
+ def initialize(request:, response:)
10
+ @request = request
11
+ @response = response
12
+ end
13
+
14
+ def self.from_recording(env:, response:, streaming:, chunks:, request_body:)
15
+ req = {
16
+ "method" => env.method.to_s.downcase,
17
+ "uri" => env.url.to_s,
18
+ "body" => request_body
19
+ }
20
+
21
+ res = build_response_hash(response, streaming, chunks)
22
+
23
+ new(request: req, response: res)
24
+ end
25
+
26
+ def self.from_hash(hash)
27
+ new(request: hash["request"], response: hash["response"])
28
+ end
29
+
30
+ def to_h
31
+ { "request" => request, "response" => response }
32
+ end
33
+
34
+ def streaming?
35
+ response["streaming"] == true
36
+ end
37
+
38
+ class << self
39
+ private
40
+
41
+ def build_response_hash(response, streaming, chunks)
42
+ res = {
43
+ "status" => response.status,
44
+ "headers" => normalize_headers(response.headers),
45
+ "streaming" => streaming
46
+ }
47
+
48
+ if streaming
49
+ serialized = chunks.map { |c| { "data" => c[:data], "offset" => c[:offset] } }
50
+ res["chunks"] = serialized
51
+ res["usage"] = extract_usage_from_chunks(chunks)
52
+ else
53
+ res["body"] = response.body.to_s
54
+ res["usage"] = extract_usage_from_body(response.body)
55
+ end
56
+
57
+ res
58
+ end
59
+
60
+ def normalize_headers(headers)
61
+ return {} unless headers
62
+
63
+ headers.each_with_object({}) { |(k, v), h| h[k.to_s.downcase] = v }
64
+ end
65
+
66
+ def extract_usage_from_body(body)
67
+ return nil unless body&.length&.positive?
68
+
69
+ parsed = JSON.parse(body)
70
+ usage = parsed["usage"]
71
+ return nil unless usage
72
+
73
+ build_usage_hash(usage)
74
+ rescue JSON::ParserError, TypeError
75
+ nil
76
+ end
77
+
78
+ def extract_usage_from_chunks(chunks)
79
+ chunks.reverse_each do |chunk|
80
+ data = chunk[:data].to_s
81
+ next unless data.start_with?("data: ")
82
+
83
+ json_str = data.sub(/\Adata: /, "").strip
84
+ next if json_str == "[DONE]"
85
+
86
+ parsed = JSON.parse(json_str)
87
+ usage = parsed["usage"]
88
+ next unless usage
89
+
90
+ return build_usage_hash(usage)
91
+ rescue JSON::ParserError
92
+ next
93
+ end
94
+ nil
95
+ end
96
+
97
+ def build_usage_hash(usage)
98
+ {
99
+ "prompt_tokens" => usage["prompt_tokens"] || usage["input_tokens"],
100
+ "completion_tokens" => usage["completion_tokens"] || usage["output_tokens"],
101
+ "total_tokens" => usage["total_tokens"]
102
+ }.compact
103
+ end
104
+ end
105
+ end
106
+ end
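The field mapping in `build_usage_hash` is easiest to see with both provider shapes side by side. The sketch below copies that logic into a hypothetical top-level helper, `normalize_usage`, and feeds it illustrative payloads (not captured API responses).

```ruby
require "json"

# Hypothetical helper mirroring the private build_usage_hash logic.
def normalize_usage(usage)
  {
    "prompt_tokens" => usage["prompt_tokens"] || usage["input_tokens"],
    "completion_tokens" => usage["completion_tokens"] || usage["output_tokens"],
    "total_tokens" => usage["total_tokens"]
  }.compact
end

# Illustrative payloads only.
openai_body = '{"usage":{"prompt_tokens":10,"completion_tokens":2,"total_tokens":12}}'
anthropic_body = '{"usage":{"input_tokens":10,"output_tokens":2}}'

openai_usage = normalize_usage(JSON.parse(openai_body)["usage"])
anthropic_usage = normalize_usage(JSON.parse(anthropic_body)["usage"])
```

Note that `compact` drops `total_tokens` when the payload has no such field, so a cassette recorded from an Anthropic-style response would carry only the two mapped keys.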
data/lib/llm_cassette/middleware.rb ADDED
@@ -0,0 +1,19 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "faraday"
4
+
5
+ module LlmCassette
6
+ class Middleware < Faraday::Middleware
7
+ def call(env)
8
+ cassette = LlmCassette.current_cassette
9
+ return app.call(env) unless cassette
10
+
11
+ if cassette.record?
12
+ Recorder.new(env, app, cassette).call
13
+ else
14
+ interaction = cassette.next_interaction
15
+ Replayer.new(env, interaction).call
16
+ end
17
+ end
18
+ end
19
+ end
data/lib/llm_cassette/recorder.rb ADDED
@@ -0,0 +1,46 @@
1
+ # frozen_string_literal: true
2
+
3
+ module LlmCassette
4
+ class Recorder
5
+ def initialize(env, app, cassette)
6
+ @env = env
7
+ @app = app
8
+ @cassette = cassette
9
+ end
10
+
11
+ def call
12
+ request_body = @env.body.dup.to_s
13
+ streaming = !@env.request.on_data.nil?
14
+ chunks = intercept_streaming([], streaming)
15
+ response = @app.call(@env)
16
+ record(request_body, streaming, chunks, response)
17
+ response
18
+ end
19
+
20
+ private
21
+
22
+ def intercept_streaming(chunks, streaming)
23
+ return chunks unless streaming
24
+
25
+ original = @env.request.on_data
26
+ start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
27
+ @env.request.on_data = proc do |chunk, bytes|
28
+ offset = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start).round(4)
29
+ chunks << { data: chunk, offset: offset }
30
+ original.call(chunk, bytes)
31
+ end
32
+ chunks
33
+ end
34
+
35
+ def record(request_body, streaming, chunks, response)
36
+ interaction = Interaction.from_recording(
37
+ env: @env,
38
+ response: response,
39
+ streaming: streaming,
40
+ chunks: chunks,
41
+ request_body: request_body
42
+ )
43
+ @cassette.record_interaction(interaction)
44
+ end
45
+ end
46
+ end
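The callback wrapping done by `Recorder#intercept_streaming` can be tried in isolation with plain procs. This is a sketch that assumes no Faraday at all: `wrap_on_data` is a hypothetical helper name, and the two calls at the end simulate an HTTP adapter delivering chunks.

```ruby
# Wraps a caller-supplied streaming callback so each chunk is captured with
# its elapsed time before being passed through unchanged.
# `wrap_on_data` is a hypothetical helper, not part of the gem.
def wrap_on_data(original, chunks)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  proc do |chunk, bytes|
    offset = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start).round(4)
    chunks << { data: chunk, offset: offset }
    original.call(chunk, bytes)
  end
end

seen = []     # what the application's own callback received
captured = [] # what a recorder would store
callback = wrap_on_data(proc { |chunk, _bytes| seen << chunk }, captured)

# Simulate an adapter delivering two SSE chunks (second argument: bytes received so far).
callback.call("data: Hello\n\n", 13)
callback.call("data: [DONE]\n\n", 27)
```

The application callback still sees every chunk, while the captured list gains a monotonic-clock offset per chunk measured from the moment the wrapper was installed.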
data/lib/llm_cassette/replayer.rb ADDED
@@ -0,0 +1,55 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "faraday"
4
+
5
+ module LlmCassette
6
+ class Replayer
7
+ def initialize(env, interaction)
8
+ @env = env
9
+ @interaction = interaction
10
+ end
11
+
12
+ def call
13
+ if @interaction.streaming?
14
+ replay_streaming
15
+ else
16
+ replay_non_streaming
17
+ end
18
+ end
19
+
20
+ private
21
+
22
+ def replay_streaming
23
+ on_data = @env.request.on_data
24
+ chunks = @interaction.response["chunks"] || []
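25
+ # Offsets are cumulative from the start of the request, so sleep only the gap since the previous chunk.
26
+ chunks.each_with_index do |chunk, i|
27
+ gap = chunk["offset"].to_f - (i.zero? ? 0.0 : chunks[i - 1]["offset"].to_f)
28
+ sleep(gap) if LlmCassette.configuration.replay_timing && gap.positive?
29
+ on_data&.call(chunk["data"].to_s, chunk["data"].to_s.bytesize)
30
+ end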
31
+
32
+ build_response(
33
+ status: @interaction.response["status"],
34
+ headers: @interaction.response["headers"],
35
+ body: chunks.map { |c| c["data"].to_s }.join
36
+ )
37
+ end
38
+
39
+ def replay_non_streaming
40
+ build_response(
41
+ status: @interaction.response["status"],
42
+ headers: @interaction.response["headers"],
43
+ body: @interaction.response["body"].to_s
44
+ )
45
+ end
46
+
47
+ def build_response(status:, headers:, body:)
48
+ env = @env.dup
49
+ env.status = status.to_i
50
+ env.response_headers = Faraday::Utils::Headers.new(headers || {})
51
+ env.body = body
52
+ Faraday::Response.new(env)
53
+ end
54
+ end
55
+ end
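Recorded offsets are measured from the start of the request, so reproducing the original pacing means pausing for the difference between consecutive offsets rather than for each offset itself. A minimal sketch of that conversion, using the offsets from the README's cassette example; `inter_chunk_gaps` is a hypothetical helper, not part of the gem.

```ruby
# Converts cumulative offsets (seconds since request start) into the pause
# to take before emitting each chunk. Hypothetical helper for illustration.
def inter_chunk_gaps(chunks)
  previous = 0.0
  chunks.map do |chunk|
    offset = chunk["offset"].to_f
    gap = [offset - previous, 0.0].max # never negative, even for hand-edited cassettes
    previous = offset
    gap.round(4)
  end
end

chunks = [
  { "data" => "data: Hello\n\n", "offset" => 0.0 },
  { "data" => "data: world\n\n", "offset" => 0.134 },
  { "data" => "data: [DONE]\n\n", "offset" => 0.187 }
]

gaps = inter_chunk_gaps(chunks)
total_pause = gaps.sum.round(4)
```

With this conversion the total pause equals the last recorded offset, which is the property a timing-faithful replay should have.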
data/lib/llm_cassette/request_signature.rb ADDED
@@ -0,0 +1,34 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "json"
4
+
5
+ module LlmCassette
6
+ class RequestSignature
7
+ attr_reader :method, :uri, :body
8
+
9
+ def initialize(env)
10
+ @method = env.method.to_s.downcase
11
+ @uri = env.url.to_s
12
+ @body = normalize_body(env.body)
13
+ end
14
+
15
+ private
16
+
17
+ def normalize_body(body)
18
+ return nil unless body&.length&.positive?
19
+
20
+ parsed = JSON.parse(body)
21
+ JSON.generate(sort_hash(parsed))
22
+ rescue JSON::ParserError
23
+ body.to_s
24
+ end
25
+
26
+ def sort_hash(obj)
27
+ case obj
28
+ when Hash then obj.sort.to_h.transform_values { |v| sort_hash(v) }
29
+ when Array then obj.map { |v| sort_hash(v) }
30
+ else obj
31
+ end
32
+ end
33
+ end
34
+ end
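The effect of the recursive key sort in `RequestSignature` is that two JSON bodies differing only in key order serialize identically. The sketch below reproduces the technique with hypothetical top-level helpers (`sort_keys`, `canonical_json`) so it runs with only the standard library.

```ruby
require "json"

# Recursively sorts hash keys so that key order no longer affects the output.
def sort_keys(obj)
  case obj
  when Hash then obj.sort.to_h.transform_values { |v| sort_keys(v) }
  when Array then obj.map { |v| sort_keys(v) }
  else obj
  end
end

# Hypothetical wrapper: parse, sort, and re-serialize a JSON request body.
# Non-JSON bodies are returned unchanged.
def canonical_json(body)
  JSON.generate(sort_keys(JSON.parse(body)))
rescue JSON::ParserError
  body.to_s
end

a = canonical_json('{"stream":true,"model":"gpt-4o","messages":[{"role":"user","content":"hi"}]}')
b = canonical_json('{"messages":[{"content":"hi","role":"user"}],"model":"gpt-4o","stream":true}')
fallback = canonical_json("not json")
```

Array order is preserved on purpose: message order is meaningful in a chat request, whereas object key order is not.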
data/lib/llm_cassette/rspec/helpers.rb ADDED
@@ -0,0 +1,25 @@
1
+ # frozen_string_literal: true
2
+
3
+ module LlmCassette
4
+ module RSpec
5
+ module Helpers
6
+ # Class-level helper — wraps every example in the group with a cassette.
7
+ #
8
+ # describe MyService do
9
+ # use_llm_cassette "my_service"
10
+ # it "..." { ... }
11
+ # end
12
+ #
13
+ # Omit the name to auto-derive it from each example's full description.
14
+ def use_llm_cassette(name = nil, **options)
15
+ around do |example|
16
+ cassette_name = name || example.metadata[:full_description]
17
+ .gsub(%r{[^a-zA-Z0-9_\-/]}, "_")
18
+ .squeeze("_")
19
+ .downcase
20
+ LlmCassette.use_cassette(cassette_name, **options) { example.run }
21
+ end
22
+ end
23
+ end
24
+ end
25
+ end
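To see what names the auto-derivation produces, the same `gsub`/`squeeze`/`downcase` chain can be applied to a few sample descriptions. `cassette_name_for` is a hypothetical wrapper, and the descriptions are made up for illustration.

```ruby
# Hypothetical helper wrapping the same gsub/squeeze/downcase chain.
def cassette_name_for(full_description)
  full_description
    .gsub(%r{[^a-zA-Z0-9_\-/]}, "_") # anything outside letters, digits, _, -, / becomes _
    .squeeze("_")                    # collapse runs of underscores
    .downcase
end

simple = cassette_name_for("ChatService returns a greeting")
punctuated = cassette_name_for("ChatService#call  handles follow-ups!")
nested = cassette_name_for("Billing/Invoices sends a reminder")
```

Slashes survive the substitution, so a description containing `/` maps to a path with a subdirectory under the cassette directory. Renaming an example changes its derived cassette name, which is worth keeping in mind before relying on auto-naming.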
data/lib/llm_cassette/rspec.rb ADDED
@@ -0,0 +1,24 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "llm_cassette"
4
+ require "llm_cassette/rspec/helpers"
5
+
6
+ RSpec.configure do |config|
7
+ config.extend LlmCassette::RSpec::Helpers
8
+
9
+ # Metadata form:
10
+ # it "...", llm_cassette: "name" do ... end
11
+ # it "...", llm_cassette: true do ... end # auto-name
12
+ config.around(:each, :llm_cassette) do |example|
13
+ raw = example.metadata[:llm_cassette]
14
+ name = if raw == true
15
+ example.metadata[:full_description]
16
+ .gsub(%r{[^a-zA-Z0-9_\-/]}, "_")
17
+ .squeeze("_")
18
+ .downcase
19
+ else
20
+ raw.to_s
21
+ end
22
+ LlmCassette.use_cassette(name) { example.run }
23
+ end
24
+ end
data/lib/llm_cassette/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module LlmCassette
4
+ VERSION = "0.1.0"
5
+ end
data/lib/llm_cassette.rb ADDED
@@ -0,0 +1,51 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "llm_cassette/version"
4
+ require_relative "llm_cassette/errors"
5
+ require_relative "llm_cassette/configuration"
6
+ require_relative "llm_cassette/interaction"
7
+ require_relative "llm_cassette/request_signature"
8
+ require_relative "llm_cassette/cassette"
9
+ require_relative "llm_cassette/recorder"
10
+ require_relative "llm_cassette/replayer"
11
+ require_relative "llm_cassette/middleware"
12
+
13
+ module LlmCassette
14
+ class << self
15
+ def configure
16
+ yield configuration
17
+ end
18
+
19
+ def configuration
20
+ @configuration ||= Configuration.new
21
+ end
22
+
23
+ def reset!
24
+ @configuration = nil
25
+ end
26
+
27
+ def current_cassette
28
+ Thread.current[:llm_cassette_current]
29
+ end
30
+
31
+ # Block form — wraps a block with an active cassette.
32
+ #
33
+ # LlmCassette.use_cassette("greeting") do
34
+ # response = client.chat("say hello")
35
+ # end
36
+ #
37
+ # Options:
38
+ # record: :none (default) — replay only, raise if cassette missing
39
+ # record: :all — always record (hits real API)
40
+ def use_cassette(name, record: nil, &block)
41
+ cassette = Cassette.new(name, record: record)
42
+ Thread.current[:llm_cassette_current] = cassette
43
+ begin
44
+ block.call(cassette)
45
+ ensure
46
+ cassette.eject!
47
+ Thread.current[:llm_cassette_current] = nil
48
+ end
49
+ end
50
+ end
51
+ end
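The set-then-`ensure` pattern in `use_cassette` guarantees cleanup even when the block raises. A minimal sketch of the same pattern, assuming a plain hash in place of a `Cassette` and a separate thread-local key (`:demo_cassette`) so it cannot collide with the gem; `with_cassette` is a hypothetical name.

```ruby
# Hypothetical stand-in for LlmCassette.use_cassette.
def with_cassette(cassette)
  Thread.current[:demo_cassette] = cassette
  yield cassette
ensure
  cassette[:ejected] = true            # stands in for cassette.eject!
  Thread.current[:demo_cassette] = nil # always cleared, even on error
end

cassette = { name: "greeting", ejected: false }
seen_inside = nil

begin
  with_cassette(cassette) do
    seen_inside = Thread.current[:demo_cassette]
    raise "simulated failure inside the block"
  end
rescue RuntimeError
  # The error still reaches the caller, but only after cleanup has run.
end

seen_after = Thread.current[:demo_cassette]
```

Since the active cassette is stored via `Thread.current[...]`, code running on another thread inside the block would not see it, and the middleware would pass such requests through to the real adapter.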
data/sig/llm_cassette.rbs ADDED
@@ -0,0 +1,79 @@
1
+ module LlmCassette
2
+ VERSION: String
3
+
4
+ def self.configure: () { (Configuration) -> void } -> void
5
+ def self.configuration: () -> Configuration
6
+ def self.reset!: () -> nil
7
+ def self.current_cassette: () -> Cassette?
8
+ def self.use_cassette: (String name, ?record: Symbol?) { (Cassette) -> void } -> void
9
+
10
+ class Error < StandardError
11
+ end
12
+
13
+ class CassetteNotFoundError < Error
14
+ end
15
+
16
+ class NoMoreInteractionsError < Error
17
+ end
18
+
19
+ class Configuration
20
+ RECORD_MODES: Array[Symbol]
21
+
22
+ attr_accessor cassette_directory: String
23
+ attr_reader record: Symbol
24
+ attr_accessor replay_timing: bool
25
+
26
+ def initialize: () -> void
27
+ def record=: (Symbol | String mode) -> Symbol
28
+ end
29
+
30
+ class Interaction
31
+ attr_reader request: Hash[String, untyped]
32
+ attr_reader response: Hash[String, untyped]
33
+
34
+ def initialize: (request: Hash[String, untyped], response: Hash[String, untyped]) -> void
35
+ def self.from_recording: (env: untyped, response: untyped, streaming: bool, chunks: Array[Hash[Symbol, untyped]], request_body: String?) -> Interaction
36
+ def self.from_hash: (Hash[String, untyped] hash) -> Interaction
37
+ def to_h: () -> Hash[String, untyped]
38
+ def streaming?: () -> bool
39
+ end
40
+
41
+ class RequestSignature
42
+ attr_reader method: String
43
+ attr_reader uri: String
44
+ attr_reader body: String?
45
+
46
+ def initialize: (untyped env) -> void
47
+ end
48
+
49
+ class Cassette
50
+ attr_reader name: String
51
+
52
+ def initialize: (String name, ?record: Symbol?) -> void
53
+ def record?: () -> bool
54
+ def next_interaction: () -> Interaction
55
+ def record_interaction: (Interaction interaction) -> void
56
+ def eject!: () -> void
57
+ def size: () -> Integer
58
+ end
59
+
60
+ class Recorder
61
+ def initialize: (untyped env, untyped app, Cassette cassette) -> void
62
+ def call: () -> untyped
63
+ end
64
+
65
+ class Replayer
66
+ def initialize: (untyped env, Interaction interaction) -> void
67
+ def call: () -> untyped
68
+ end
69
+
70
+ class Middleware < Faraday::Middleware
71
+ def call: (untyped env) -> untyped
72
+ end
73
+
74
+ module RSpec
75
+ module Helpers
76
+ def use_llm_cassette: (?String? name, **untyped options) -> void
77
+ end
78
+ end
79
+ end
metadata ADDED
@@ -0,0 +1,82 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: llm_cassette
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Jibran Usman
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2026-06-09 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: faraday
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '1.0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '1.0'
27
+ description: |-
28
+ VCR for LLMs. Hooks into Faraday (used by ruby_llm, llm.rb, and most OpenAI/Anthropic clients),
29
+ records SSE chunks with timing, and replays them correctly. Exact-match cassettes, RSpec helpers,
30
+ token usage stored per interaction. Works with OpenAI and Anthropic out of the box.
31
+ email:
32
+ - jibran.usman@eunasolutions.com
33
+ executables: []
34
+ extensions: []
35
+ extra_rdoc_files: []
36
+ files:
37
+ - ".rspec"
38
+ - ".rubocop.yml"
39
+ - CHANGELOG.md
40
+ - README.md
41
+ - Rakefile
42
+ - lib/llm_cassette.rb
43
+ - lib/llm_cassette/cassette.rb
44
+ - lib/llm_cassette/configuration.rb
45
+ - lib/llm_cassette/errors.rb
46
+ - lib/llm_cassette/interaction.rb
47
+ - lib/llm_cassette/middleware.rb
48
+ - lib/llm_cassette/recorder.rb
49
+ - lib/llm_cassette/replayer.rb
50
+ - lib/llm_cassette/request_signature.rb
51
+ - lib/llm_cassette/rspec.rb
52
+ - lib/llm_cassette/rspec/helpers.rb
53
+ - lib/llm_cassette/version.rb
54
+ - sig/llm_cassette.rbs
55
+ homepage: https://github.com/jibranusman95/llm_cassette
56
+ licenses: []
57
+ metadata:
58
+ homepage_uri: https://github.com/jibranusman95/llm_cassette
59
+ source_code_uri: https://github.com/jibranusman95/llm_cassette
60
+ changelog_uri: https://github.com/jibranusman95/llm_cassette/blob/main/CHANGELOG.md
61
+ rubygems_mfa_required: 'true'
62
+ post_install_message:
63
+ rdoc_options: []
64
+ require_paths:
65
+ - lib
66
+ required_ruby_version: !ruby/object:Gem::Requirement
67
+ requirements:
68
+ - - ">="
69
+ - !ruby/object:Gem::Version
70
+ version: 3.2.0
71
+ required_rubygems_version: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - ">="
74
+ - !ruby/object:Gem::Version
75
+ version: '0'
76
+ requirements: []
77
+ rubygems_version: 3.5.22
78
+ signing_key:
79
+ specification_version: 4
80
+ summary: Streaming-aware cassette recorder for LLM calls — record once, replay fast,
81
+ never hit the API in CI.
82
+ test_files: []