conduit-sse 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 98cf3556d809dacc27ff9aebe5179bb0203539d0357f4d6ca52352833f76450d
4
+ data.tar.gz: dbeb4685dcef94355a0d71ad0e89cece01fbb69bcde53400e2d7755e2e252669
5
+ SHA512:
6
+ metadata.gz: ffdcffe64f0ddd71565010439debaea23bc5bbb542d1318ef46a9846d4c5e54e3a3f43267de344851a6d6e89517d71dd199d8eeb2b2b79c2ececdd6b5f49ebbb
7
+ data.tar.gz: 679f209458cc1d9bb63e758dbb0decd1bdbba578edb538e758d0716e328e23e1e66ee6c1d0143c4b71aecf65aed9244a307ea6d923005aa87e07879f7a5c2334
data/CHANGELOG.md ADDED
@@ -0,0 +1,19 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.1.0] - 2026-05-08
9
+
10
+ ### Added
11
+ - Initial release of Conduit, a lightweight, zero-dependency Ruby gem for parsing Server-Sent Events (SSE) streams
12
+ - Flexible callback-based architecture for processing real-time server push data
13
+ - Custom parser support for transforming event data into any shape
14
+ - SSE spec compliant parsing following the HTML Server-Sent Events specification
15
+ - Built-in inspector for development and troubleshooting
16
+ - Robust error handling to prevent stream interruption
17
+ - Granular access callbacks (`on_frame`, `on_field`) for non-standard SSE implementations
18
+ - Support for streaming AI responses, real-time analytics, and live updates
19
+ - Comprehensive test coverage with fuzz testing
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2026 franbach
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,600 @@
1
+ # Conduit
2
+
3
+ Conduit is a lightweight, zero-dependency Ruby gem for parsing Server-Sent Events (SSE) streams. It provides a flexible callback-based architecture for processing real-time server push data with full control over every stage of the parsing pipeline.
4
+
5
+ ## Why Conduit?
6
+
7
+ Building real-time applications with SSE shouldn't require wrestling with complex parsing logic or sacrificing performance for convenience. Conduit gives you:
8
+
9
+ **🎯 Zero Dependencies** - Drop it into any Ruby project without worrying about dependency hell. Pure Ruby, no external gems.
10
+
11
+ **🔧 Complete Control** - Hook into every stage of the parsing pipeline with callbacks. Whether you need to transform data, forward to services, emit to frontends, or add observability, Conduit adapts to your architecture.
12
+
13
+ **📡 Production Ready** - Built for real-world use with robust error handling, SSE spec compliance, and a built-in inspector for debugging. Streams from AI providers or any other SSE endpoint just work.
14
+
15
+ **⚡ Flexible Parsers** - Your parser lambda can do anything: JSON parsing, YAML loading, custom transformations, or domain-specific logic. You're not locked into any data shape.
16
+
17
+ **🔍 Granular Access** - Need to handle non-standard SSE fields? Want raw frame access? Conduit provides both spec-compliant callbacks (`on_event`, `on_parsed`) and low-level access (`on_frame`, `on_field`) for maximum flexibility.
18
+
19
+ Perfect for streaming AI responses, real-time analytics, live updates, and any application that needs to process server-push events efficiently.
20
+
21
+ ## Features
22
+
23
+ - **Zero dependencies** - Pure Ruby, no external gems required
24
+ - **Flexible callback system** - Hook into every stage of the parsing pipeline
25
+ - **Custom parsers** - Transform event data into any shape your application needs
26
+ - **SSE spec compliant** - Follows the HTML Server-Sent Events specification
27
+ - **Debugging support** - Built-in inspector for development and troubleshooting
28
+ - **Error handling** - Robust error routing to prevent stream interruption
29
+
30
+ ## Installation
31
+
32
+ Install the gem and add it to your application's Gemfile:
33
+
34
+ ```bash
35
+ bundle add conduit-sse
36
+ ```
37
+
38
+ If bundler is not being used, install the gem directly:
39
+
40
+ ```bash
41
+ gem install conduit-sse
42
+ ```
43
+
44
+ ## Usage
45
+
46
+ ### Basic Example
47
+
48
+ At its core, Conduit processes SSE data chunks and emits callbacks at each stage:
49
+
50
+ ```ruby
51
+ require "conduit"
52
+ require "json"
53
+ # Create a stream with a parser that transforms event data
54
+ stream = Conduit.new(parser: ->(data) { JSON.parse(data) })
55
+
56
+ # Subscribe to parsed events
57
+ stream.on_parsed do |parsed|
58
+ puts "Received: #{parsed}"
59
+ end
60
+
61
+ # Feed data chunks (typically from an HTTP stream)
62
+ stream << "data: {\"message\": \"hello\"}\n\n"
63
+ ```
64
+
65
+ ### Real-World Example with Net::HTTP
66
+
67
+ Here's a complete example connecting to an SSE endpoint:
68
+
69
+ ```ruby
70
+ require "conduit"
71
+ require "net/http"
72
+ require "uri"
73
+ require "json"
74
+
75
+ stream = Conduit.new(parser: ->(d) { JSON.parse(d) rescue d })
76
+
77
+ stream.on_parsed do |parsed|
78
+ next unless parsed.is_a?(Hash)
79
+ puts "#{parsed['wiki']}: #{parsed['title']} by #{parsed['user']}"
80
+ end
81
+
82
+ uri = URI("https://stream.wikimedia.org/v2/stream/recentchange")
83
+
84
+ Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
85
+ http.read_timeout = nil # disable read timeout for SSE
86
+
87
+ http.request(Net::HTTP::Get.new(uri, "Accept" => "text/event-stream")) do |response|
88
+ response.read_body { |chunk| stream << chunk }
89
+ end
90
+ end
91
+ ```
92
+
93
+ ### OpenAI Streaming Example
94
+
95
+ Here's a complete example using Conduit to stream responses from OpenAI's Responses API:
96
+
97
+ ```ruby
98
+ require "conduit"
99
+ require "net/http"
100
+ require "uri"
101
+ require "json"
102
+
103
+ # Set your OpenAI API key
104
+ api_key = "your-api-key-here"
105
+
106
+ # Create the stream with a parser that decodes each JSON data payload
107
+ stream = Conduit.new(parser: ->(data) { JSON.parse(data) })
108
+
109
+ result = +""
110
+
111
+ # Approach 1: Use on_parsed to extract delta after JSON parsing
112
+ # Since OpenAI sends structured JSON in the data field, the parser converts it to a Hash,
113
+ # making it easy to extract the delta content directly.
114
+ stream.on_parsed do |parsed_data|
115
+ type = parsed_data["type"]
116
+
117
+ if type == "response.output_text.delta"
118
+ delta = parsed_data["delta"]
119
+ if delta
120
+ puts "parsed delta: #{delta}"
121
+ result += delta
122
+
123
+ # You can also emit the delta to a frontend app here if needed.
124
+ # emit_to_frontend(delta)
125
+ end
126
+ end
127
+
128
+ if type == "response.completed"
129
+ puts "\n\nResult: #{result}"
130
+ stream.close
131
+ end
132
+ end
133
+
134
+ # Approach 2 (use instead of Approach 1, not alongside it, or each delta is appended twice): on_field for more granular control
135
+ # This approach gives you access to the raw field values before JSON parsing,
136
+ # useful if you need to inspect or modify the raw data field content.
137
+ stream.on_field do |name, value|
138
+ if name == "data"
139
+ data = JSON.parse(value)
140
+ type = data["type"]
141
+
142
+ if type == "response.output_text.delta"
143
+ delta = data["delta"]
144
+ if delta
145
+ puts "delta: #{delta}"
146
+ result += delta
147
+
148
+ # You can also emit the delta to a frontend app here if needed.
149
+ # emit_to_frontend(delta)
150
+ end
151
+ end
152
+
153
+ if type == "response.completed"
154
+ puts "\n\nResult: #{result}"
155
+ stream.close
156
+ end
157
+ end
158
+ end
159
+
160
+ # Make the streaming request
161
+ uri = URI("https://api.openai.com/v1/responses")
162
+ http = Net::HTTP.new(uri.host, uri.port)
163
+ http.use_ssl = true
164
+
165
+ request = Net::HTTP::Post.new(uri)
166
+ request["Content-Type"] = "application/json"
167
+ request["Authorization"] = "Bearer #{api_key}"
168
+
169
+ request.body = JSON.generate({
170
+ model: "gpt-4.1-mini",
171
+ stream: true, # Enable streaming
172
+ input: [
173
+ { role: "user", content: "Write a haiku about programming" }
174
+ ]
175
+ })
176
+
177
+ http.request(request) do |response|
178
+ response.read_body do |chunk|
179
+ stream << chunk
180
+ end
181
+ end
182
+ ```
183
+
184
+ **Note:** OpenAI's Responses API uses `data:` fields with JSON payloads. The response format includes a `type` field to identify event types (`response.output_text.delta` for streaming text chunks, `response.completed` when the stream finishes). The `parser` decodes each JSON payload, and the `on_parsed` callback extracts the delta text as it arrives, allowing you to display the response in real time.
185
+
186
+ ### Callback System
187
+
188
+ Conduit provides callbacks at every stage of processing:
189
+
190
+ ```ruby
191
+ stream = Conduit.new(parser: ->(data) { data })
192
+
193
+ # Raw chunk as it arrived (after normalization)
194
+ stream.on_chunk do |chunk|
195
+ puts "Chunk received: #{chunk.bytesize} bytes"
196
+ end
197
+
198
+ # Complete frame text (after sanitization)
199
+ stream.on_frame do |frame|
200
+ puts "Frame: #{frame}"
201
+ end
202
+
203
+ # Individual SSE field lines
204
+ stream.on_field do |name, value|
205
+ puts "Field: #{name}=#{value}"
206
+ end
207
+
208
+ # Fully parsed SSE event
209
+ stream.on_event do |event|
210
+ puts "Event type: #{event.event}, id: #{event.id}"
211
+ end
212
+
213
+ # Result of your parser
214
+ stream.on_parsed do |parsed|
215
+ puts "Parsed: #{parsed}"
216
+ end
217
+
218
+ # Ping/comment frames
219
+ stream.on_ping do |frame|
220
+ puts "Ping received"
221
+ end
222
+
223
+ # Errors from callbacks or parser
224
+ stream.on_error do |error|
225
+ puts "Error: #{error.message}"
226
+ end
227
+ ```
228
+
229
+ ### Understanding Callback Differences
230
+
231
+ It's important to understand the distinction between `on_frame`, `on_event`, and `on_parsed`/`each`:
232
+
233
+ **`on_frame`** - Receives the raw frame text (string) after sanitization, regardless of whether it produces an event:
234
+
235
+ ```ruby
236
+ stream.on_frame do |frame|
237
+ # frame is a sanitized string like "data: hello" (the default sanitizer strips the trailing newline)
238
+ puts frame
239
+ end
240
+ ```
241
+
242
+ **`on_event`** - Receives a fully parsed `Conduit::Event` object with SSE metadata:
243
+
244
+ ```ruby
245
+ stream.on_event do |event|
246
+ # event is a Conduit::Event object
247
+ puts event.event # Event type (e.g., "message")
248
+ puts event.data # Data field content (joined data lines)
249
+ puts event.id # Last event ID (if sent by server)
250
+ puts event.retry # Retry delay in ms (if sent by server)
251
+ end
252
+ ```
253
+
254
+ **`each` / `on_parsed`** - Receives the result of your custom parser (the `parser:` lambda):
255
+
256
+ ```ruby
257
+ stream = Conduit.new(parser: ->(data) { JSON.parse(data) })
258
+
259
+ stream.each do |parsed|
260
+ # parsed is whatever your parser returns
261
+ # In this case, a Hash from JSON.parse(data)
262
+ puts parsed
263
+ end
264
+ ```
265
+
266
+ **The processing flow:**
267
+
268
+ 1. Raw chunk arrives → `on_chunk` (string)
269
+ 2. Chunks are buffered and split into frames
270
+ 3. Frame is sanitized → `on_frame` (string)
271
+ 4. Frame is parsed into SSE fields → `on_field` (name, value pairs)
272
+ 5. Event object is constructed → `on_event` (Conduit::Event)
273
+ 6. Your parser transforms the data → `on_parsed`/`each` (your custom output)
274
+
275
+ **Key nuance:** The parser receives **only the data field content** (joined by newlines), not the entire frame. If you need access to other fields (event type, id, retry), use `on_event` instead.
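
The difference between individual fields and the joined data string can be sketched in plain Ruby. This is a minimal illustration of the field rules in the HTML Server-Sent Events specification, not Conduit's internal code; `parse_fields` is a hypothetical helper name introduced only for this example.

```ruby
# Hypothetical helper (not part of Conduit): split one sanitized frame into
# [name, value] pairs using the SSE rules for field lines.
def parse_fields(frame)
  frame.each_line(chomp: true).filter_map do |line|
    next if line.empty? || line.start_with?(":") # skip blank and comment lines

    name, value = line.split(":", 2)
    # A line without a colon is a field with an empty value; a single
    # leading space after the colon is dropped.
    [name, (value || "").delete_prefix(" ")]
  end
end

frame = "event: update\ndata: line one\ndata: line two\nid: 7"
fields = parse_fields(frame)

# Only the data values, joined by newlines, would reach a parser.
data = fields.select { |name, _value| name == "data" }.map(&:last).join("\n")

puts fields.inspect
puts data
```

Under these assumptions the `event` and `id` values never appear in `data`, which is why a handler that needs them would read them from the event object rather than from the parser output.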
276
+
277
+ ### Callback Philosophy
278
+
279
+ Conduit's callback system is designed around two complementary approaches:
280
+
281
+ **SSE-Spec Callbacks** (`on_event`, `on_parsed`)
282
+
283
+ - These callbacks are tied to the SSE specification
284
+ - `on_event` receives a structured `Conduit::Event` object with standard SSE fields (event type, data, id, retry)
285
+ - `on_parsed` receives the output of your custom parser, which operates on the data field content
286
+ - Use these when working with spec-compliant SSE streams or when you want structured, predictable data
287
+
288
+ **Granular Control Callbacks** (`on_frame`, `on_field`)
289
+
290
+ - These provide low-level access to the raw stream data, independent of SSE specification
291
+ - `on_frame` gives you the complete frame text before field parsing
292
+ - `on_field` gives you individual field lines as they're parsed, including custom/non-standard fields
293
+ - Use these when dealing with non-standard SSE implementations, custom field names, or when you need complete control over the parsing process
294
+
295
+ **Choosing between approaches:**
296
+
297
+ - If the SSE stream follows the specification, `on_event` with `Conduit::Event` provides a structured, spec-compliant representation of the event
298
+ - If the frame deviates from the SSE specification or uses custom/non-standard fields, `on_frame` gives you raw access to the frame content, allowing you to handle it independently of the specification
299
+ - Use `on_field` to inspect individual fields when you need to handle custom or non-standard field names
300
+ - Your `parser` lambda can implement any logic needed: JSON parsing, YAML loading, custom transformations, validation, or domain-specific processing
301
+
302
+ ### Common Use Cases
303
+
304
+ Conduit's callback system makes it easy to integrate SSE streams into your application architecture:
305
+
306
+ **Forwarding to Services**
307
+
308
+ ```ruby
309
+ stream.on_parsed do |parsed|
310
+ # Forward parsed events to a message queue, database, or external service
311
+ MessageQueue.publish("events", parsed)
312
+ end
313
+ ```
314
+
315
+ **Emitting to Frontend Applications**
316
+
317
+ ```ruby
318
+ stream.on_parsed do |parsed|
319
+ # Stream real-time updates to connected WebSocket clients
320
+ WebSocketBroadcaster.broadcast("updates", parsed)
321
+ end
322
+ ```
323
+
324
+ **Adding Observability**
325
+
326
+ ```ruby
327
+ stream.on_event do |event|
328
+ # Track metrics for monitoring
329
+ Metrics.increment("sse.events.received", tags: { type: event.event })
330
+ end
331
+
332
+ stream.on_error do |error|
333
+ # Log errors for debugging
334
+ Logger.error("SSE processing error", error: error.message)
335
+ end
336
+ ```
337
+
338
+ **Data Transformation**
339
+
340
+ ```ruby
341
+ stream = Conduit.new(parser: ->(data) {
342
+ # Transform raw data into your domain models
343
+ raw = JSON.parse(data)
344
+ MyDomainModel.new(raw)
345
+ })
346
+
347
+ stream.on_parsed do |model|
348
+ # Work with your domain objects directly
349
+ model.process!
350
+ end
351
+ ```
352
+
353
+ **Multi-Consumer Pattern**
354
+
355
+ ```ruby
356
+ # Multiple callbacks can handle the same event
357
+ stream.on_parsed do |parsed|
358
+ # Consumer 1: Update cache
359
+ Cache.set(parsed["id"], parsed)
360
+ end
361
+
362
+ stream.on_parsed do |parsed|
363
+ # Consumer 2: Trigger webhook
364
+ WebhookService.trigger(parsed)
365
+ end
366
+
367
+ stream.on_parsed do |parsed|
368
+ # Consumer 3: Update analytics
369
+ Analytics.track("event_received", parsed)
370
+ end
371
+ ```
372
+
373
+ ### Event Object
374
+
375
+ Parsed events are returned as `Conduit::Event` objects with the following attributes:
376
+
377
+ - `event` - Event type (defaults to "message")
378
+ - `data` - The event data string
379
+ - `id` - Last event ID (from SSE spec)
380
+ - `retry` - Retry delay in milliseconds (from SSE spec)
381
+
382
+ ```ruby
383
+ stream.on_event do |event|
384
+ puts "Type: #{event.event}"
385
+ puts "Data: #{event.data}"
386
+ puts "ID: #{event.id}"
387
+ puts "Retry: #{event.retry}ms" if event.retry
388
+ end
389
+ ```
390
+
391
+ ### Customization Options
392
+
393
+ You can customize the parsing behavior with these options:
394
+
395
+ ```ruby
396
+ stream = Conduit.new(
397
+ # Required: A callable that receives the joined data field content (string)
398
+ # and returns whatever shape your application needs (e.g., JSON.parse, YAML.load, etc.)
399
+ parser: ->(data) { JSON.parse(data) },
400
+
401
+ # Optional: Transforms incoming chunks before processing.
402
+ # The default normalizer performs UTF-8 conversion and CRLF→LF normalization.
403
+ # NOTE: Providing your own completely replaces the default behavior,
404
+ # including UTF-8 conversion. If you need UTF-8 handling, you must implement it yourself.
405
+ chunk_normalizer: ->(chunk) { chunk.upcase },
406
+
407
+ # Optional: Delimiter that separates frames in the stream (default: "\n\n")
408
+ frame_separator: "\r\n\r\n",
409
+
410
+ # Optional: Prefix used to identify the data field.
411
+ # The trailing ":" is stripped to derive the field name (default: "data:")
412
+ payload_start: "data:",
413
+
414
+ # Optional: Pattern identifying ping/comment frames (default: ":")
415
+ ping_pattern: ":",
416
+
417
+ # Optional: Cleans or validates frame content after splitting.
418
+ # The default sanitizer strips whitespace and performs UTF-8 conversion.
419
+ # NOTE: Providing your own completely replaces the default behavior,
420
+ # including UTF-8 handling. If you need UTF-8 handling, you must implement it yourself.
421
+ sanitize_pattern: ->(frame) { frame.strip }
422
+ )
423
+ ```
424
+
425
+ ### Using `each` for Enumerable Interface
426
+
427
+ For a simpler interface, use `each` to iterate over parsed events:
428
+
429
+ ```ruby
430
+ stream = Conduit.new(parser: ->(data) { data })
431
+
432
+ stream.each do |parsed|
433
+ puts "Received: #{parsed}"
434
+ end
435
+
436
+ # Feed data
437
+ stream << "data: hello\n\n"
438
+ stream << "data: world\n\n"
439
+ ```
440
+
441
+ ### Accessing SSE State
442
+
443
+ Conduit tracks SSE spec state that you can access:
444
+
445
+ ```ruby
446
+ stream = Conduit.new(parser: ->(data) { data })
447
+
448
+ stream << "id: 123\ndata: hello\n\n"
449
+
450
+ puts stream.last_event_id # => "123"
451
+ puts stream.retry_ms # => nil (unless server sends retry field)
452
+ ```
453
+
454
+ ### Handling Stream Completion
455
+
456
+ Use `finish` (or its alias `close`) to process any remaining data in the buffer when the stream ends:
457
+
458
+ ```ruby
459
+ stream = Conduit.new(parser: ->(data) { JSON.parse(data) })
460
+
461
+ http.request(request) do |response|
462
+ response.read_body do |chunk|
463
+ stream << chunk
464
+ end
465
+ end
466
+
467
+ # Process any trailing data not terminated by the frame separator
468
+ stream.finish
469
+ # or
470
+ stream.close
471
+ ```
472
+
473
+ This is useful when the HTTP connection closes cleanly without a trailing `\n\n`, which is common with many SSE servers. The method is safe to call multiple times and on empty buffers.
474
+
475
+ ### Error Handling
476
+
477
+ Errors in callbacks are routed to the `on_error` handler, preventing stream interruption:
478
+
479
+ ```ruby
480
+ stream = Conduit.new(parser: ->(data) { JSON.parse(data) })
481
+
482
+ stream.on_error do |error|
483
+ puts "Caught error: #{error.message}"
484
+ # Stream continues processing
485
+ end
486
+
487
+ stream.on_parsed do |parsed|
488
+ # If this raises, it's caught by on_error
489
+ process_data(parsed)
490
+ end
491
+
492
+ stream << "data: invalid json\n\n" # Parser fails, but stream continues
493
+ ```
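
The error routing described above can be illustrated with a standalone sketch modeled on the `call_safely` method in the gem's `Callbacks` class. The free-standing `call_safely` function and its `on_error` argument are simplifications for this example, not the gem's public API.

```ruby
require "json"

# Sentinel that marks a failed call. A unique object is used so that a
# legitimate nil result (for example JSON.parse("null")) is not mistaken
# for a failure.
FAILED = Object.new.freeze

# Simplified stand-in for Callbacks#call_safely: run the callable and route
# any StandardError to the error handler instead of letting it propagate.
def call_safely(callable, on_error, *args)
  callable.call(*args)
rescue StandardError => e
  raise unless on_error

  on_error.call(e)
  FAILED
end

errors = []
on_error = ->(error) { errors << error.class }
parser = ->(data) { JSON.parse(data) }

results = ['{"ok": true}', "invalid json", '{"n": 2}'].map do |data|
  call_safely(parser, on_error, data)
end

# Failed payloads are skipped; the remaining ones are still processed.
parsed = results.reject { |result| result.equal?(FAILED) }

puts parsed.inspect
puts errors.inspect
```

When no error handler is given, the bare `raise` re-raises the original exception, which matches the `raise unless @callbacks[:error]` guard in the source.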
494
+
495
+ ### Debugging with Inspector
496
+
497
+ Use the built-in inspector to log all stream activity during development:
498
+
499
+ ```ruby
500
+ require "net/http"
501
+ require "uri"
502
+ require "json"
503
+
504
+ stream = Conduit.new(parser: ->(data) { JSON.parse(data) })
505
+
506
+ # Attach inspector to log everything to stdout
507
+ Conduit::Inspector.attach(stream)
508
+
509
+ # Or log to a different IO
510
+ Conduit::Inspector.attach(stream, io: $stderr)
511
+
512
+ # You'll see [CHUNK], [FRAME], [FIELD], [EVENT], [PARSED] lines as data flows.
513
+ # Wikimedia tends to emit event:, id:, data: and occasional : ping keep-alives.
514
+ uri = URI("https://stream.wikimedia.org/v2/stream/recentchange")
515
+
516
+ Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
517
+ http.read_timeout = nil # disable read timeout for SSE
518
+
519
+ http.request(Net::HTTP::Get.new(uri, "Accept" => "text/event-stream")) do |response|
520
+ response.read_body { |chunk| stream << chunk }
521
+ end
522
+ end
523
+
524
+ ```
525
+
526
+ The inspector logs:
527
+
528
+ - Chunks with byte counts
529
+ - Frames with byte counts
530
+ - Individual fields
531
+ - Pings
532
+ - Events with metadata
533
+ - Parsed results
534
+ - Errors
535
+
536
+ ### Multiple Callbacks
537
+
538
+ You can register multiple callbacks for the same event type:
539
+
540
+ ```ruby
541
+ stream = Conduit.new(parser: ->(data) { data })
542
+
543
+ stream.on_parsed do |parsed|
544
+ puts "Handler 1: #{parsed}"
545
+ end
546
+
547
+ stream.on_parsed do |parsed|
548
+ puts "Handler 2: #{parsed}"
549
+ end
550
+
551
+ stream << "data: hello\n\n"
552
+ # Both handlers execute in registration order
553
+ ```
554
+
555
+ ### Custom Field Handling
556
+
557
+ Conduit emits all SSE fields, including custom ones:
558
+
559
+ ```ruby
560
+ stream = Conduit.new(parser: ->(data) { data })
561
+
562
+ stream.on_field do |name, value|
563
+ case name
564
+ when "data"
565
+ puts "Data: #{value}"
566
+ when "custom-field"
567
+ puts "Custom: #{value}"
568
+ end
569
+ end
570
+
571
+ stream << "data: hello\ncustom-field: value\n\n"
572
+ ```
573
+
574
+ ## Architecture
575
+
576
+ Conduit processes data through these stages:
577
+
578
+ 1. **Chunk Normalization** - Raw chunks are normalized (UTF-8 conversion, CRLF→LF)
579
+ 2. **Buffering** - Chunks are buffered until frame boundaries are found
580
+ 3. **Frame Splitting** - Frames are split by the separator (default: `\n\n`)
581
+ 4. **Sanitization** - Frames are sanitized (default: strip whitespace)
582
+ 5. **Ping Detection** - Ping/comment frames are identified
583
+ 6. **Field Parsing** - SSE fields are parsed per the HTML spec
584
+ 7. **Event Construction** - Events are built from parsed fields
585
+ 8. **Parser Application** - Your custom parser transforms event data
586
+ 9. **Callback Emission** - Callbacks are invoked at each stage
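
Stages 1 to 4 can be sketched as a small buffer that accepts arbitrary chunk boundaries. This is an illustration under stated assumptions rather than the gem's implementation: `FrameBuffer` and `push` are hypothetical names, and normalization is reduced to CRLF→LF conversion.

```ruby
# Hypothetical class (not part of Conduit) covering normalization,
# buffering, frame splitting and sanitization.
class FrameBuffer
  def initialize(separator: "\n\n")
    @separator = separator
    @buffer = +""
  end

  # Append one chunk and return every complete frame now available.
  def push(chunk)
    @buffer << chunk.gsub("\r\n", "\n") # simplified normalization
    frames = []
    while (index = @buffer.index(@separator))
      # Remove the frame and its separator from the front of the buffer.
      frame = @buffer.slice!(0, index + @separator.length)
      frames << frame.strip # simplified sanitization
    end
    frames.reject(&:empty?)
  end
end

buffer = FrameBuffer.new
first = buffer.push("data: hel")                        # no separator yet
second = buffer.push("lo\n\ndata: world\n\ndata: par")  # completes two frames

puts first.inspect
puts second.inspect
```

The trailing `data: par` stays in the buffer until a later chunk supplies the separator, which is the situation a final flush at the end of the stream has to handle.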
587
+
588
+ ## Development
589
+
590
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
591
+
592
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
593
+
594
+ ## Contributing
595
+
596
+ Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/conduit.
597
+
598
+ ## License
599
+
600
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/Rakefile ADDED
@@ -0,0 +1,12 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "minitest/test_task"
5
+
6
+ Minitest::TestTask.create
7
+
8
+ require "rubocop/rake_task"
9
+
10
+ RuboCop::RakeTask.new
11
+
12
+ task default: %i[test rubocop]
@@ -0,0 +1,48 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Conduit
4
+ # Manages callback registration and execution for Conduit::Stream.
5
+ class Callbacks
6
+ FAILED = Object.new.freeze
7
+
8
+ def initialize
9
+ @callbacks = {}
10
+ end
11
+
12
+ def on(name, &block)
13
+ @callbacks[name] = compose(@callbacks[name], block)
14
+ end
15
+
16
+ def emit(name, *args)
17
+ callback = @callbacks[name]
18
+ return if callback.nil?
19
+
20
+ callback.call(*args)
21
+ rescue StandardError => e
22
+ raise unless name != :error && @callbacks[:error]
23
+
24
+ @callbacks[:error].call(e)
25
+ end
26
+
27
+ def call_safely(callable, *args)
28
+ callable.call(*args)
29
+ rescue StandardError => e
30
+ raise unless @callbacks[:error]
31
+
32
+ @callbacks[:error].call(e)
33
+ FAILED
34
+ end
35
+
36
+ private
37
+
38
+ def compose(previous, current)
39
+ return previous if current.nil?
40
+ return current if previous.nil?
41
+
42
+ proc do |*args|
43
+ previous.call(*args)
44
+ current.call(*args)
45
+ end
46
+ end
47
+ end
48
+ end
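
The `compose` method above chains handlers into a single proc instead of storing a list. A standalone sketch of that technique follows; `compose` is copied from the class so the example runs on its own, and the three handlers are invented for illustration.

```ruby
# Same logic as Callbacks#compose, lifted out of the class for this sketch.
def compose(previous, current)
  return previous if current.nil?
  return current if previous.nil?

  proc do |*args|
    previous.call(*args)
    current.call(*args)
  end
end

calls = []
handler = nil
handler = compose(handler, ->(value) { calls << "first: #{value}" })
handler = compose(handler, ->(value) { calls << "second: #{value}" })
handler = compose(handler, ->(value) { calls << "third: #{value}" })

handler.call("hello")
puts calls.inspect
```

Handlers run in registration order because each new proc calls the previously composed one first. One consequence of this design is that an exception raised by an earlier handler propagates before later handlers for the same event run.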
@@ -0,0 +1,22 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Conduit
4
+ # Default configurations for Conduit::Stream
5
+ module Defaults
6
+ PING_PATTERN = ":"
7
+ FRAME_SEPARATOR = "\n\n"
8
+ PAYLOAD_START = "data:"
9
+
10
+ module_function
11
+
12
+ def to_utf8(string)
13
+ string.dup
14
+ .force_encoding("UTF-8")
15
+ .encode("UTF-8", invalid: :replace, undef: :replace)
16
+ .gsub("\r\n", "\n")
17
+ end
18
+
19
+ SANITIZE_PATTERN = ->(frame) { to_utf8(frame).strip }
20
+ CHUNK_NORMALIZER = ->(chunk) { to_utf8(chunk) }
21
+ end
22
+ end
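
The effect of `to_utf8` on raw input can be checked in isolation. The sketch below copies the method so it runs without the gem and feeds it binary strings created with `String#b`, on the assumption that an HTTP client may hand over chunks tagged as binary; the sample inputs are invented.

```ruby
# Same chain as Defaults.to_utf8, copied here so the sketch is self-contained.
def to_utf8(string)
  string.dup
        .force_encoding("UTF-8")
        .encode("UTF-8", invalid: :replace, undef: :replace)
        .gsub("\r\n", "\n")
end

crlf_chunk = "data: caf\xC3\xA9\r\n\r\n".b # valid UTF-8 bytes, CRLF line endings
bad_chunk = "data: ok\xFF\n\n".b           # contains a byte that is invalid in UTF-8

clean = to_utf8(crlf_chunk)
repaired = to_utf8(bad_chunk)

puts clean.encoding
puts clean.inspect
puts repaired.valid_encoding?
```

Because the method starts with `dup`, the caller's original string keeps its encoding and bytes; only the returned copy is retagged and repaired.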
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Conduit
4
+ Event = Data.define(:event, :data, :id, :retry)
5
+ end
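
Because `Event` is built with `Data.define` (available from Ruby 3.2), instances are immutable value objects. A short sketch of what that implies for handlers follows; the field values are invented, and the constant is redefined at the top level only so the example runs without the gem.

```ruby
# Same definition as Conduit::Event, repeated at top level for this sketch.
Event = Data.define(:event, :data, :id, :retry)

event = Event.new(event: "message", data: "hello", id: "42", retry: nil)

# Data instances are frozen and have no setters; #with returns a modified copy.
updated = event.with(data: "world")

puts event.frozen?
puts event.data
puts updated.data
puts event.to_h.inspect
```

A handler that wants to change a field therefore has to build a new object with `with`, which keeps the event seen by other handlers unchanged.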
@@ -0,0 +1,108 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Conduit
4
+ # Attach to a Conduit::Stream to log every layer of activity to an IO.
5
+ # Intended for development/debugging only.
6
+ #
7
+ # stream = Conduit.new(parser: ->(d) { JSON.parse(d) })
8
+ # Conduit::Inspector.attach(stream)
9
+ #
10
+ # Pass io: to redirect (e.g. a StringIO in tests, a file, $stderr).
11
+ class Inspector
12
+ def self.attach(stream, io: $stdout)
13
+ new(stream, io: io).attach
14
+ end
15
+
16
+ attr_reader :counts
17
+
18
+ def initialize(stream, io:)
19
+ @stream = stream
20
+ @io = io
21
+ @counts = Hash.new(0)
22
+ end
23
+
24
+ def attach
25
+ log_chunks
26
+ log_frames
27
+ log_fields
28
+ log_pings
29
+ log_events
30
+ log_parsed
31
+ log_errors
32
+ self
33
+ end
34
+
35
+ # Print a one-line summary of everything seen so far.
36
+ def summary
37
+ @io.puts(
38
+ "[SUMMARY] " \
39
+ "chunks=#{@counts[:chunk]} " \
40
+ "frames=#{@counts[:frame]} " \
41
+ "events=#{@counts[:event]} " \
42
+ "parsed=#{@counts[:parsed]} " \
43
+ "pings=#{@counts[:ping]} " \
44
+ "fields=#{@counts[:field]} " \
45
+ "errors=#{@counts[:error]} " \
46
+ "last_event_id=#{@stream.last_event_id.inspect} " \
47
+ "retry_ms=#{@stream.retry_ms.inspect}"
48
+ )
49
+ end
50
+
51
+ private
52
+
53
+ def log_chunks
54
+ @stream.on_chunk do |chunk|
55
+ @counts[:chunk] += 1
56
+ @io.puts "\n[CHUNK ##{@counts[:chunk]} | #{chunk.bytesize} bytes]"
57
+ @io.puts chunk
58
+ end
59
+ end
60
+
61
+ def log_frames
62
+ @stream.on_frame do |frame|
63
+ @counts[:frame] += 1
64
+ @io.puts "\n[FRAME ##{@counts[:frame]} | #{frame.bytesize} bytes]"
65
+ @io.puts frame
66
+ end
67
+ end
68
+
69
+ def log_fields
70
+ @stream.on_field do |name, value|
71
+ @counts[:field] += 1
72
+ @io.puts "-->[FIELD] #{name}=#{value.inspect}"
73
+ end
74
+ end
75
+
76
+ def log_pings
77
+ @stream.on_ping do |frame|
78
+ @counts[:ping] += 1
79
+ @io.puts "\n[PING ##{@counts[:ping]}] #{frame.inspect}"
80
+ end
81
+ end
82
+
83
+ def log_events
84
+ @stream.on_event do |event|
85
+ @counts[:event] += 1
86
+ @io.puts "\n[EVENT ##{@counts[:event]}] " \
87
+ "event=#{event.event.inspect} " \
88
+ "id=#{event.id.inspect} " \
89
+ "retry=#{event.retry.inspect}"
90
+ @io.puts " data: #{event.data.inspect}"
91
+ end
92
+ end
93
+
94
+ def log_parsed
95
+ @stream.on_parsed do |result|
96
+ @counts[:parsed] += 1
97
+ @io.puts "[PARSED ##{@counts[:parsed]}] #{result.inspect}"
98
+ end
99
+ end
100
+
101
+ def log_errors
102
+ @stream.on_error do |error|
103
+ @counts[:error] += 1
104
+ @io.puts "\n[ERROR ##{@counts[:error]}] #{error.class}: #{error.message}"
105
+ end
106
+ end
107
+ end
108
+ end
@@ -0,0 +1,298 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "callbacks"
4
+ require_relative "defaults"
5
+ require_relative "event"
6
+
7
+ module Conduit
8
+ # Core streaming parser for Server-Sent Events (SSE).
9
+ class Stream
10
+ # Initialize a new Stream with optional customizations.
11
+ #
12
+ # @param parser [Proc] Required. Callable that receives the joined data string of an SSE event and returns whatever shape the application wants.
13
+ # @param chunk_normalizer [Proc] Optional. Transforms incoming chunks before processing.
14
+ # @param frame_separator [String] Optional. Delimiter that separates frames in the stream.
15
+ # @param payload_start [String] Optional. Prefix used to identify the data field (the trailing ":" is stripped to derive the field name).
16
+ # @param ping_pattern [String] Optional. Pattern identifying ping frames.
17
+ # @param sanitize_pattern [Proc] Optional. Cleans or validates frame content.
18
+ def initialize(
19
+ parser:,
20
+ chunk_normalizer: nil,
21
+ frame_separator: nil,
22
+ payload_start: nil,
23
+ ping_pattern: nil,
24
+ sanitize_pattern: nil
25
+ )
26
+ raise ArgumentError, "parser must be a Proc (respond to #call)" unless parser.respond_to?(:call)
27
+
28
+ @parser = parser
29
+ @chunk_normalizer = chunk_normalizer || Defaults::CHUNK_NORMALIZER
30
+ @sanitize_pattern = sanitize_pattern || Defaults::SANITIZE_PATTERN
31
+ @frame_separator = frame_separator || Defaults::FRAME_SEPARATOR
32
+ @payload_start = payload_start || Defaults::PAYLOAD_START
33
+ @ping_pattern = ping_pattern || Defaults::PING_PATTERN
34
+ @data_field = @payload_start.chomp(":")
35
+ @buffer = +""
36
+ @callbacks = Callbacks.new
37
+ @last_event_id = nil
38
+ @retry_ms = nil
39
+ end
40
+
41
+ # Stream state — last id/retry seen, per SSE spec semantics.
42
+ attr_reader :last_event_id, :retry_ms
43
+
44
+ # Raw chunk as it arrived (after normalization).
45
+ #
46
+ # The chunk is a string that has been normalized (UTF-8 encoded, CRLF→LF).
47
+ # This is called for every chunk fed to the stream via <<, regardless of
48
+ # whether the chunk contains complete frames or partial data.
49
+ #
50
+ # @yield [chunk] The normalized chunk string
51
+ def on_chunk(&block)
52
+ @callbacks.on(:chunk, &block)
53
+ end
54
+
55
+ # Complete frame text (after sanitization), regardless of whether it produces an event.
56
+ #
57
+ # A frame is the text between frame separators (default: "\n\n").
58
+ # This callback receives the raw frame string after sanitization (default: strip).
59
+ # This includes frames that may not produce events (e.g., frames without data fields).
60
+ # Ping frames are handled separately by on_ping and do not trigger this callback.
61
+ #
62
+ # @yield [frame] The sanitized frame string
63
+ def on_frame(&block)
64
+ @callbacks.on(:frame, &block)
65
+ end
66
+
67
+ # Fully parsed SSE event as a {Conduit::Event}.
68
+ #
69
+ # This callback receives a Conduit::Event object with the following attributes:
70
+ # - event: Event type (defaults to "message")
71
+ # - data: Joined data field content (data lines joined by "\n")
72
+ # - id: Last event ID from the SSE spec
73
+ # - retry: Retry delay in milliseconds from the SSE spec
74
+ #
75
+ # Only called for frames that contain at least one data field.
76
+ # Use this callback when you need access to SSE metadata (event type, id, retry).
77
+ #
78
+ # @yield [event] A Conduit::Event object
79
+ def on_event(&block)
80
+ @callbacks.on(:event, &block)
81
+ end
82
+
83
+ # Result of running the configured parser over an event's data.
84
+ #
85
+ # The parser receives ONLY the data field content (joined by "\n"), not the entire frame.
86
+ # If you need access to other SSE fields (event type, id, retry), use on_event instead.
87
+ #
88
+ # If the parser raises an error and an on_error handler is registered,
89
+ # the error is routed to on_error and this callback is NOT invoked for that event.
90
+ #
91
+ # @yield [parsed] Whatever your parser lambda returns
92
+ def on_parsed(&block)
93
+ @callbacks.on(:parsed, &block)
94
+ end
95
+
96
+ # Ping/comment frame.
97
+ #
98
+ # Ping frames are identified by the ping_pattern (default: ":").
99
+ # These are typically used for keep-alive messages or comments.
100
+ # Ping frames do NOT trigger on_frame or on_event callbacks.
101
+ #
102
+ # @yield [frame] The ping frame string
103
+ def on_ping(&block)
104
+ @callbacks.on(:ping, &block)
105
+ end
106
+
107
+ # Every parsed SSE field line. Yields (name, value) for every field, including
108
+ # the standard ones (data/event/id/retry) and any custom fields a server emits.
109
+ #
110
+ # Per the SSE spec, fields are parsed one per line with the format "name: value".
111
+ # This callback is invoked for each field line as it's parsed from the frame.
112
+ #
113
+ # @yield [name, value] The field name and value as strings
114
+ def on_field(&block)
115
+ @callbacks.on(:field, &block)
116
+ end
117
+
118
+ # Errors raised by any callback or by the parser.
119
+ #
120
+ # When a callback (other than on_error itself) or the parser raises an error,
121
+ # it's routed to this handler if registered. This prevents errors from interrupting
122
+ # the stream processing.
123
+ #
124
+ # If on_error is not registered, errors will bubble up and interrupt processing.
125
+ # If on_error itself raises, that error will bubble up.
126
+ #
127
+ # @yield [error] The exception that was raised
128
+ def on_error(&block)
129
+ @callbacks.on(:error, &block)
130
+ end
131
+
132
+ # Feed a chunk of data to the stream for processing.
133
+ #
134
+ # Chunks are typically received from an HTTP stream (e.g., Net::HTTP response body).
135
+ # The chunk is normalized, buffered, and then processed for complete frames.
136
+ # Returns self for method chaining.
137
+ #
138
+ # @param chunk [String] Raw data chunk from the stream
139
+ # @return [self]
140
+ def <<(chunk)
141
+ chunk = normalize_chunk(chunk)
142
+
143
+ @callbacks.emit(:chunk, chunk)
144
+ @buffer << chunk
145
+
146
+ process_frames
147
+ self
148
+ end
149
+
150
+ # Signal end-of-input. Processes any bytes left in the buffer as a final frame,
151
+ # so trailing data not terminated by the frame separator still produces an event.
152
+ #
153
+ # Call this when the underlying transport closes cleanly without a trailing "\n\n"
154
+ # (typical for many HTTP SSE servers). Safe to call multiple times; safe to call on
155
+ # an empty buffer; safe to keep using the stream afterwards.
156
+ #
157
+ # @return [self]
158
+ def finish
159
+ return self if @buffer.empty?
160
+
161
+ remainder = @buffer.slice!(0, @buffer.length)
162
+ process_frame(remainder)
163
+ self
164
+ end
165
+ alias close finish
166
+
167
+ # Enumerable interface for iterating over parsed events.
168
+ #
169
+ # Provides a convenient way to iterate over the results of your parser.
170
+ # Without a block, returns an Enumerator. With a block, registers an on_parsed
171
+ # callback and returns self for chaining.
172
+ #
173
+ # @yield [parsed] The result of your parser
174
+ # @return [Enumerator, self]
175
+ def each(&block)
176
+ return enum_for(:each) unless block
177
+
178
+ on_parsed(&block)
179
+ self
180
+ end
181
+
182
+ private
183
+
184
+ # Process buffered chunks to extract complete frames.
185
+ #
186
+ # Scans the buffer for the frame separator and extracts complete frames.
187
+ # Incomplete frames remain in the buffer for the next chunk.
188
+ # This is called automatically after each chunk is fed via <<.
189
+ def process_frames
190
+ loop do
191
+ idx = @buffer.index(@frame_separator)
192
+ break unless idx
193
+
194
+ frame = @buffer.slice!(0, idx + @frame_separator.length)
195
+ process_frame(frame)
196
+ end
197
+ end
198
+
199
+ # Process a single frame through the parsing pipeline.
200
+ #
201
+ # Processing stages:
202
+ # 1. Sanitize the frame (default: strip whitespace)
203
+ # 2. Check if it's a ping frame (if so, emit on_ping and return)
204
+ # 3. Emit on_frame with the sanitized frame
205
+ # 4. Parse SSE fields from the frame
206
+ # 5. Emit on_field for each field line
207
+ # 6. Track SSE state (id, retry) from standard fields
208
+ # 7. If data fields present, build Event object and emit on_event
209
+ # 8. Apply user parser to the data content
210
+ # 9. Emit on_parsed with the parser result
211
+ #
212
+ # @param frame [String] The raw frame string
213
+ def process_frame(frame)
214
+ frame = sanitize(frame)
215
+ return if frame.empty?
216
+
217
+ if ping?(frame)
218
+ @callbacks.emit(:ping, frame)
219
+ return
220
+ end
221
+
222
+ @callbacks.emit(:frame, frame)
223
+
224
+ type = nil
225
+ data_lines = []
226
+
227
+ parse_fields(frame).each do |name, value|
228
+ @callbacks.emit(:field, name, value)
229
+
230
+ case name
231
+ when @data_field then data_lines << value
232
+ when "event" then type = value
233
+ when "id" then @last_event_id = value unless value.include?("\u0000")
234
+ when "retry" then @retry_ms = lenient_int(value)
235
+ end
236
+ end
237
+
238
+ return if data_lines.empty?
239
+
240
+ data = data_lines.join("\n")
241
+ event = Event.new(
242
+ event: type || "message",
243
+ data: data,
244
+ id: @last_event_id,
245
+ retry: @retry_ms
246
+ )
247
+
248
+ @callbacks.emit(:event, event)
249
+
250
+ parsed = @callbacks.call_safely(@parser, data)
251
+ return if parsed.equal?(Callbacks::FAILED)
252
+
253
+ @callbacks.emit(:parsed, parsed)
254
+ end
255
+
256
+ # Per https://html.spec.whatwg.org/multipage/server-sent-events.html, parse one field per line:
257
+ # - empty line: skipped
258
+ # - line with no ":" : whole line is the field name, value is ""
259
+ # - otherwise: field name = before first ":", value = after, with one optional leading space stripped
260
+ # - empty field name (line starting with ":") is a comment and ignored
261
+ def parse_fields(frame)
262
+ frame.lines.filter_map do |line|
263
+ line = line.chomp
264
+ next if line.empty?
265
+
266
+ idx = line.index(":")
267
+ if idx.nil?
268
+ [line, ""]
269
+ else
270
+ name = line[0...idx]
271
+ next if name.empty?
272
+
273
+ value = line[(idx + 1)..] || ""
274
+ value = value[1..] if value.start_with?(" ")
275
+ [name, value]
276
+ end
277
+ end
278
+ end
279
+
280
+ def lenient_int(value)
281
+ Integer(value, 10)
282
+ rescue ArgumentError, TypeError
283
+ value
284
+ end
285
+
286
+ def normalize_chunk(chunk)
287
+ @chunk_normalizer.call(chunk)
288
+ end
289
+
290
+ def sanitize(frame)
291
+ @sanitize_pattern.call(frame)
292
+ end
293
+
294
+ def ping?(frame)
295
+ frame.start_with?(@ping_pattern)
296
+ end
297
+ end
298
+ end
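The buffering and field-parsing behaviour described in the comments of stream.rb can be illustrated outside the gem. What follows is a minimal standalone sketch, not the gem's own code: `drain_frames` and the sample chunks are hypothetical names and data, and `FRAME_SEPARATOR` is assumed to be "\n\n", as the `on_frame` comment states for the default. It shows a frame split across two chunks being reassembled, and field lines being split on the first ":" with one optional leading space removed.

```ruby
# Standalone sketch (assumption: not the gem's implementation).
FRAME_SEPARATOR = "\n\n" # assumed default, per the on_frame comment

# Parse one frame into [name, value] pairs, one field per line.
def parse_fields(frame)
  frame.lines.filter_map do |line|
    line = line.chomp
    next if line.empty?

    name, sep, value = line.partition(":")
    next if name.empty?            # comment line such as ": keep-alive"
    next [name, ""] if sep.empty?  # no colon: whole line is the field name

    [name, value.delete_prefix(" ")] # strip one optional leading space
  end
end

# Cut every complete frame off the front of the buffer.
# Whatever follows the last separator stays in the buffer.
def drain_frames(buffer)
  frames = []
  while (idx = buffer.index(FRAME_SEPARATOR))
    frames << buffer.slice!(0, idx + FRAME_SEPARATOR.length).strip
  end
  frames
end

buffer = +""
frames = []

# Hypothetical chunks: the first frame is split mid-field across two reads.
["event: tick\nda", "ta: 1\ndata: 2\n\n: ping\n\nid"].each do |chunk|
  buffer << chunk
  frames.concat(drain_frames(buffer))
end

frames.each { |frame| p parse_fields(frame) }
```

In this sketch the trailing "id" stays in the buffer because no separator follows it. The gem's `finish` method exists for that case: it processes whatever is left as a final frame.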
data/lib/conduit/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Conduit
4
+ VERSION = "0.1.0"
5
+ end
data/lib/conduit.rb ADDED
@@ -0,0 +1,12 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "conduit/version"
4
+ require_relative "conduit/stream"
5
+ require_relative "conduit/inspector"
6
+
7
+ # Conduit is a lightweight, zero-dependency Ruby gem for parsing Server-Sent Events (SSE) streams.
8
+ module Conduit
9
+ def self.new(**)
10
+ Stream.new(**)
11
+ end
12
+ end
metadata ADDED
@@ -0,0 +1,58 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: conduit-sse
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - franbach
8
+ bindir: exe
9
+ cert_chain: []
10
+ date: 2026-05-09 00:00:00.000000000 Z
11
+ dependencies: []
12
+ description: Conduit provides a flexible callback-based architecture for processing
13
+ real-time server push data with full control over every stage of the parsing pipeline.
14
+ Perfect for streaming AI responses, real-time analytics, and live updates.
15
+ email:
16
+ - franciscobach@gmail.com
17
+ executables: []
18
+ extensions: []
19
+ extra_rdoc_files: []
20
+ files:
21
+ - CHANGELOG.md
22
+ - LICENSE.txt
23
+ - README.md
24
+ - Rakefile
25
+ - lib/conduit.rb
26
+ - lib/conduit/callbacks.rb
27
+ - lib/conduit/defaults.rb
28
+ - lib/conduit/event.rb
29
+ - lib/conduit/inspector.rb
30
+ - lib/conduit/stream.rb
31
+ - lib/conduit/version.rb
32
+ homepage: https://github.com/franbach/conduit
33
+ licenses:
34
+ - MIT
35
+ metadata:
36
+ allowed_push_host: https://rubygems.org
37
+ homepage_uri: https://github.com/franbach/conduit
38
+ source_code_uri: https://github.com/franbach/conduit
39
+ changelog_uri: https://github.com/franbach/conduit/blob/main/CHANGELOG.md
40
+ rdoc_options: []
41
+ require_paths:
42
+ - lib
43
+ required_ruby_version: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: 3.2.0
48
+ required_rubygems_version: !ruby/object:Gem::Requirement
49
+ requirements:
50
+ - - ">="
51
+ - !ruby/object:Gem::Version
52
+ version: '0'
53
+ requirements: []
54
+ rubygems_version: 3.6.2
55
+ specification_version: 4
56
+ summary: A lightweight, zero-dependency Ruby gem for parsing Server-Sent Events (SSE)
57
+ streams
58
+ test_files: []