fiber_stream 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: ab4449558105eb57805e40970a28971e32a37e748176df0875de75816bf53d2c
- data.tar.gz: 6d1998ce477a4602a6f23d60817f2f49b9acc6efde64b18ee90ebd7084905bd5
+ metadata.gz: c6ac2f9ec0d6888789ac2535538124b9057895b52e226d812d002e5d2955d437
+ data.tar.gz: df4722fff9e5020b24e7e23d2d4cc98ea98aa54a3f17dd86ff0825f41f007de4
  SHA512:
- metadata.gz: 4242c44baa5c1db6f7cb6c39ac217729603838dba31876180f4fcc2382e72308cad4d5197d8687b522317dfc9ad964bcd6f85b9449f580f88324098536b8906c
- data.tar.gz: c2cc8091ec14b27eef4a60a82c1c22503118eee9da0a1a53e24a3e137058e1dd8c5e1536e89c5ba13c7f725cdc0ad241d48ade86cbe2162f6ac5ae2d9ce8f96f
+ metadata.gz: 11389c5aaf69a315a9a932ddd06488eb7fe81b72ca265ecf2e45e7d96b4d984a67c2eee49b38dbde267ce35b4e5dfe7dff4d2e61a32330d5e09153058ad2fb22
+ data.tar.gz: e231b92f6c0ddb08aae7e0db1791b17325705705a76ace92b197d626954ae72dd1510359c5ea8f0c5cff19985f01b78cc4e62179699d244afa4d988ed4da9d51
data/CHANGELOG.md CHANGED
@@ -1,5 +1,25 @@
  # Changelog
 
+ ## 0.2.0 - 2026-06-05
+
+ ### Added
+
+ - `Source#zip(source)` for element-wise pairing of two sources with
+ demand-driven materialization and shortest-source completion.
+ - `Source#concat(source)` for lazy source concatenation.
+ - `Sink.foreach { |element| ... }` for side-effecting stream consumption
+ without accumulating elements.
+ - `Flow.drop(count)` and `Source#drop(count)` for fixed-prefix dropping.
+ - `Flow.take_while { |element| ... }` and `Source#take_while { |element| ... }`
+ for predicate-based prefix limiting.
+ - `Flow.drop_while { |element| ... }` and
+ `Source#drop_while { |element| ... }` for predicate-based prefix dropping.
+
+ ### Changed
+
+ - Clarified documentation around FiberStream's linear roadmap and Ractor port
+ cancellation contract.
+
  ## 0.1.0 - 2026-06-03
 
  Initial release.
data/README.md CHANGED
@@ -28,16 +28,17 @@ FiberStream currently supports linear pipelines only.
28
28
  Implemented capabilities:
29
29
 
30
30
  - in-memory, IO, and backpressure-aware Ractor port sources
31
- - mapping, filtering, limiting, line splitting, buffering, async boundaries,
32
- ordered parallel mapping, and ordered Ractor-backed mapping
33
- - array, first-element, fold, and IO sinks
31
+ - lazy source concatenation and zipping
32
+ - mapping, filtering, limiting, predicate-based limiting and dropping,
33
+ fixed-prefix dropping, line splitting, buffering, async boundaries, ordered
34
+ parallel mapping, and ordered Ractor-backed mapping
35
+ - array, first-element, fold, foreach, and IO sinks
34
36
  - reusable flow composition and runnable pipelines
35
37
  - foreground and scheduler-backed background pipeline execution
36
38
  - public RBS signatures
37
39
 
38
- Not yet implemented:
39
-
40
- - graph DSLs
40
+ FiberStream intentionally keeps the public model linear: one source, an
41
+ ordered chain of flows, and one sink.
41
42
 
42
43
  ## Core Concepts
43
44
 
@@ -52,6 +53,30 @@ source = FiberStream::Source.each([1, 2, 3])
52
53
  source.run_with(FiberStream::Sink.to_a) # => [1, 2, 3]
53
54
  ```
54
55
 
56
+ Sources can be concatenated without materializing the appended source until the
57
+ first source completes:
58
+
59
+ ```ruby
60
+ result =
61
+ FiberStream::Source.each([1, 2])
62
+ .concat(FiberStream::Source.each([3, 4]))
63
+ .run_with(FiberStream::Sink.to_a)
64
+
65
+ result # => [1, 2, 3, 4]
66
+ ```
67
+
68
+ Sources can also be zipped element-by-element. The zipped source emits pairs
69
+ and completes when either input source completes:
70
+
71
+ ```ruby
72
+ result =
73
+ FiberStream::Source.each([1, 2, 3])
74
+ .zip(FiberStream::Source.each(["a", "b"]))
75
+ .run_with(FiberStream::Sink.to_a)
76
+
77
+ result # => [[1, "a"], [2, "b"]]
78
+ ```
79
+
55
80
  IO sources read chunks on demand and require a scheduler-backed non-blocking
56
81
  fiber:
57
82
 
@@ -107,6 +132,39 @@ FiberStream::Source.ractor_port(data_port, ack_port: ack_port)
107
132
  producer.value
108
133
  ```
109
134
 
135
+ Streaming HTTP response bodies that implement `#each`, such as
136
+ `async-http` response bodies, can be used with `Source.each` without buffering
137
+ the full body first. Use the HTTP client's block form or an explicit `ensure`
138
+ close because `Source.each` does not own the response body:
139
+
140
+ ```ruby
141
+ require "async"
142
+ require "async/http/internet/instance"
143
+ require "fiber_stream"
144
+
145
+ url = "https://raw.githubusercontent.com/elastic/examples/master/" \
146
+ "Common%20Data%20Formats/nginx_logs/nginx_logs"
147
+
148
+ status_counts = Hash.new(0)
149
+
150
+ processed =
151
+ Sync do
152
+ Async::HTTP::Internet.get(url) do |response|
153
+ raise "unexpected status #{response.status}" unless response.status == 200
154
+
155
+ FiberStream::Source.each(response.body)
156
+ .lines(max_length: 16 * 1024)
157
+ .map { |line| line.split.fetch(8, nil) }
158
+ .select { |status| status&.match?(/\A\d{3}\z/) }
159
+ .run_with(
160
+ FiberStream::Sink.foreach do |status|
161
+ status_counts[status] += 1
162
+ end
163
+ )
164
+ end
165
+ end
166
+ ```
167
+
110
168
  ### Flows
111
169
 
112
170
  Flows transform a stream lazily. Convenience methods on `Source` delegate to
@@ -136,6 +194,40 @@ FiberStream::Source.each([" a ", "", " b "])
136
194
  # => ["a", "b"]
137
195
  ```
138
196
 
197
+ Use `ractor_map` for ordered CPU-bound mapping in Ractor workers. The mapper
198
+ must be shareable, usually by creating it with `Ractor.shareable_proc`.
199
+
200
+ ```ruby
201
+ require "digest"
202
+ require "fiber_stream"
203
+
204
+ records = [
205
+ { name: "alpha.bin", payload: +"A" * 200_000 },
206
+ { name: "bravo.bin", payload: +"B" * 120_000 }
207
+ ]
208
+
209
+ HASH_RECORD =
210
+ Ractor.shareable_proc do |record|
211
+ payload = record.fetch(:payload)
212
+
213
+ {
214
+ name: record.fetch(:name),
215
+ bytes: payload.bytesize,
216
+ sha256: Digest::SHA256.hexdigest(payload)
217
+ }
218
+ end
219
+
220
+ digests =
221
+ FiberStream::Source.each(records)
222
+ .ractor_map(workers: 2, input_transfer: :move, &HASH_RECORD)
223
+ .run_with(FiberStream::Sink.to_a)
224
+ ```
225
+
226
+ `ractor_map` preserves input order, limits pulled-but-unemitted work to
227
+ `workers`, and does not require `Fiber.scheduler`. Use `input_transfer: :move`
228
+ or `output_transfer: :move` only when the moved object will not be reused by
229
+ the sender.
230
+
139
231
  ### Sinks
140
232
 
141
233
  A `Sink` consumes the stream and returns a materialized value.
@@ -146,6 +238,17 @@ FiberStream::Source.each([1, 2, 3])
146
238
  # => 6
147
239
  ```
148
240
 
241
+ Use `Sink.foreach` when the terminal operation is a side effect and the stream
242
+ values should not be accumulated:
243
+
244
+ ```ruby
245
+ count =
246
+ FiberStream::Source.each(["a", "b", "c"])
247
+ .run_with(FiberStream::Sink.foreach { |value| puts value })
248
+
249
+ count # => 3
250
+ ```
251
+
149
252
  ### Pipelines
150
253
 
151
254
  `Source#to(sink)` creates a reusable runnable pipeline.
@@ -212,6 +315,67 @@ limited =
212
315
  limited # => [1, 2]
213
316
  ```
214
317
 
318
+ `Flow.drop` skips a fixed prefix and then passes later elements through:
319
+
320
+ ```ruby
321
+ tail =
322
+ FiberStream::Source.each([1, 2, 3, 4])
323
+ .drop(2)
324
+ .run_with(FiberStream::Sink.to_a)
325
+
326
+ tail # => [3, 4]
327
+ ```
328
+
329
+ `Flow.take_while` emits the leading prefix while a predicate is truthy, then
330
+ closes upstream at the first false or nil result:
331
+
332
+ ```ruby
333
+ prefix =
334
+ FiberStream::Source.each([1, 2, 3, 1])
335
+ .take_while { |number| number < 3 }
336
+ .run_with(FiberStream::Sink.to_a)
337
+
338
+ prefix # => [1, 2]
339
+ ```
340
+
341
+ `Flow.drop_while` skips the leading prefix while a predicate is truthy, then
342
+ passes the first false or nil result and all later elements through:
343
+
344
+ ```ruby
345
+ tail =
346
+ FiberStream::Source.each([1, 2, 3, 1])
347
+ .drop_while { |number| number < 3 }
348
+ .run_with(FiberStream::Sink.to_a)
349
+
350
+ tail # => [3, 1]
351
+ ```
352
+
353
+ `Source#concat` preserves pull-driven demand across source boundaries. The
354
+ appended source is not materialized while the first source can still satisfy
355
+ downstream demand:
356
+
357
+ ```ruby
358
+ first =
359
+ FiberStream::Source.each([1])
360
+ .concat(FiberStream::Source.each([2]))
361
+ .run_with(FiberStream::Sink.first)
362
+
363
+ first # => 1
364
+ ```
365
+
366
+ `Source#zip` keeps input source materialization behind downstream demand. The
367
+ other source is not materialized until the receiver has produced an element for
368
+ a pair:
369
+
370
+ ```ruby
371
+ first =
372
+ FiberStream::Source.each([1])
373
+ .zip(FiberStream::Source.each([2]))
374
+ .run_with(FiberStream::Sink.first)
375
+
376
+ first # => [1, 2]
377
+ ```
378
+
215
379
  `Flow.buffer(count)` allows bounded prefetch. `Flow.async`, `Flow.buffer`,
216
380
  `Flow.parallel_map`, `Source.io`, `Sink.io`, and `Pipeline#run_async` require an
217
381
  installed `Fiber.scheduler` and a non-blocking current fiber when demanded or
@@ -229,11 +393,16 @@ Sources:
229
393
  Source convenience methods:
230
394
 
231
395
  - `Source#via(flow)`
396
+ - `Source#concat(source)`
397
+ - `Source#zip(source)`
232
398
  - `Source#map { |element| ... }`
233
399
  - `Source#parallel_map(concurrency:) { |element| ... }`
234
400
  - `Source#ractor_map(workers:, input_transfer: :copy, output_transfer: :copy) { |element| ... }`
235
401
  - `Source#select { |element| ... }`
236
402
  - `Source#take(count)`
403
+ - `Source#drop(count)`
404
+ - `Source#take_while { |element| ... }`
405
+ - `Source#drop_while { |element| ... }`
237
406
  - `Source#async`
238
407
  - `Source#buffer(count)`
239
408
  - `Source#lines(chomp: true, max_length: nil)`
@@ -247,6 +416,9 @@ Flows:
247
416
  - `FiberStream::Flow.ractor_map(workers:, input_transfer: :copy, output_transfer: :copy) { |element| ... }`
248
417
  - `FiberStream::Flow.select { |element| ... }`
249
418
  - `FiberStream::Flow.take(count)`
419
+ - `FiberStream::Flow.drop(count)`
420
+ - `FiberStream::Flow.take_while { |element| ... }`
421
+ - `FiberStream::Flow.drop_while { |element| ... }`
250
422
  - `FiberStream::Flow.async`
251
423
  - `FiberStream::Flow.buffer(count)`
252
424
  - `FiberStream::Flow.lines(chomp: true, max_length: nil)`
@@ -258,6 +430,7 @@ Sinks:
258
430
  - `FiberStream::Sink.to_a`
259
431
  - `FiberStream::Sink.first`
260
432
  - `FiberStream::Sink.fold(initial) { |accumulator, element| ... }`
433
+ - `FiberStream::Sink.foreach { |element| ... }`
261
434
  - `FiberStream::Sink.io(io, close: false, flush: false)`
262
435
 
263
436
  Pipelines:
@@ -283,6 +456,7 @@ bundle exec ruby examples/background_execution.rb
283
456
  bundle exec ruby examples/ractor_map_hashing.rb
284
457
  bundle exec ruby examples/ractor_port_source.rb
285
458
  bundle exec ruby examples/async_http_requests.rb
459
+ bundle exec ruby examples/async_http_streaming_body.rb
286
460
  ```
287
461
 
288
462
  `examples/backpressure_buffer.rb` prints timestamped producer and consumer
@@ -297,6 +471,10 @@ with a shareable mapper proc and `input_transfer: :move`.
297
471
  `examples/async_http_requests.rb` starts a local HTTP server and shows
298
472
  FiberStream overlapping independent HTTP request waits with `parallel_map`.
299
473
 
474
+ `examples/async_http_streaming_body.rb` streams a public nginx access log with
475
+ `async-http`, feeds the response body through `Source.each(response.body)`, and
476
+ aggregates lines without storing the full body.
477
+
300
478
  Benchmark scripts live under `benchmarks/`.
301
479
 
302
480
  ```sh
data/examples/README.md CHANGED
@@ -12,6 +12,7 @@ bundle exec ruby examples/background_execution.rb
12
12
  bundle exec ruby examples/ractor_map_hashing.rb
13
13
  bundle exec ruby examples/ractor_port_source.rb
14
14
  bundle exec ruby examples/async_http_requests.rb
15
+ bundle exec ruby examples/async_http_streaming_body.rb
15
16
  ```
16
17
 
17
18
  `basic_pipeline.rb` uses only in-memory values and does not require an async
@@ -49,3 +50,7 @@ demand.
  `async_http_requests.rb` starts a local HTTP server and compares serial
  requests with FiberStream `parallel_map` requests. It keeps responses ordered
  while overlapping independent network waits.
+
+ `async_http_streaming_body.rb` downloads a public nginx access log with
+ `async-http` and streams `response.body` through `Source.each`, `Flow.lines`,
+ and `Sink.foreach` so the full HTTP body is not buffered in memory.
@@ -0,0 +1,115 @@
1
+ # frozen_string_literal: true
2
+
3
+ $LOAD_PATH.unshift(File.expand_path("../lib", __dir__))
4
+
5
+ require "async"
6
+ require "async/http/internet/instance"
7
+ require "fiber_stream"
8
+
9
+ DEFAULT_URL =
10
+ "https://raw.githubusercontent.com/elastic/examples/master/" \
11
+ "Common%20Data%20Formats/nginx_logs/nginx_logs"
12
+
13
+ URL = ENV.fetch("FIBER_STREAM_HTTP_LOG_URL", DEFAULT_URL)
14
+ PROGRESS_EVERY = Integer(ENV.fetch("FIBER_STREAM_HTTP_PROGRESS_EVERY", "10_000"))
15
+
16
+ LOG_LINE =
17
+ /
18
+ \A
19
+ (?<remote_addr>\S+)\s+\S+\s+\S+\s+
20
+ \[[^\]]+\]\s+
21
+ "[^"]+"\s+
22
+ (?<status>\d{3})\s+
23
+ (?<bytes>\d+|-)\s
24
+ /x
25
+
26
+ def monotonic_time
27
+ Process.clock_gettime(Process::CLOCK_MONOTONIC)
28
+ end
29
+
30
+ def parse_access_log(line)
31
+ match = LOG_LINE.match(line)
32
+ return nil unless match
33
+
34
+ {
35
+ remote_addr: match[:remote_addr],
36
+ status: match[:status],
37
+ bytes: match[:bytes] == "-" ? 0 : match[:bytes].to_i
38
+ }
39
+ end
40
+
41
+ def empty_stats
42
+ {
43
+ lines: 0,
44
+ parsed: 0,
45
+ payload_bytes: 0,
46
+ statuses: Hash.new(0),
47
+ remote_addrs: Hash.new(0),
48
+ started_at: monotonic_time
49
+ }
50
+ end
51
+
52
+ def record_entry(stats, entry)
53
+ stats[:lines] += 1
54
+
55
+ if entry
56
+ stats[:parsed] += 1
57
+ stats[:payload_bytes] += entry.fetch(:bytes)
58
+ stats[:statuses][entry.fetch(:status)] += 1
59
+ stats[:remote_addrs][entry.fetch(:remote_addr)] += 1
60
+ end
61
+
62
+ if (stats[:lines] % PROGRESS_EVERY).zero?
63
+ elapsed = monotonic_time - stats.fetch(:started_at)
64
+ puts format(
65
+ "processed %<lines>d lines in %<elapsed>.2fs",
66
+ lines: stats.fetch(:lines),
67
+ elapsed: elapsed
68
+ )
69
+ end
70
+
71
+ stats
72
+ end
73
+
74
+ def print_summary(stats)
75
+ elapsed = monotonic_time - stats.fetch(:started_at)
76
+ mib = stats.fetch(:payload_bytes).fdiv(1024 * 1024)
77
+
78
+ puts
79
+ puts "Streaming HTTP body summary"
80
+ puts "URL: #{URL}"
81
+ puts format("lines parsed: %<parsed>d/%<lines>d", stats)
82
+ puts format("logged payload bytes: %<mib>.2f MiB", mib: mib)
83
+ puts format("unique remote addresses: %<count>d", count: stats.fetch(:remote_addrs).length)
84
+ puts format("elapsed: %<elapsed>.2fs", elapsed: elapsed)
85
+
86
+ puts
87
+ puts "HTTP status counts"
88
+ stats.fetch(:statuses).sort.each do |status, count|
89
+ puts format("- %<status>s: %<count>d", status: status, count: count)
90
+ end
91
+ end
92
+
93
+ stats = empty_stats
94
+
95
+ processed =
96
+ Sync do
97
+ Async::HTTP::Internet.get(URL) do |response|
98
+ unless response.status == 200
99
+ raise "unexpected HTTP status #{response.status} for #{URL}"
100
+ end
101
+
102
+ FiberStream::Source.each(response.body)
103
+ .lines(max_length: 16 * 1024)
104
+ .map { |line| parse_access_log(line) }
105
+ .run_with(
106
+ FiberStream::Sink.foreach do |entry|
107
+ record_entry(stats, entry)
108
+ end
109
+ )
110
+ end
111
+ end
112
+
113
+ raise "processed count mismatch" unless processed == stats.fetch(:lines)
114
+
115
+ print_summary(stats)
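The parsing step in the example above can be tried without any network access. The following is a minimal sketch that reuses the same pattern under a different constant name; `LOG_LINE_SKETCH`, `parse_line_sketch`, and both sample log lines are made up for illustration and are not part of the package.

```ruby
# Same pattern as LOG_LINE in the example; the names and sample lines in this
# sketch are hypothetical.
LOG_LINE_SKETCH = /
  \A
  (?<remote_addr>\S+)\s+\S+\s+\S+\s+
  \[[^\]]+\]\s+
  "[^"]+"\s+
  (?<status>\d{3})\s+
  (?<bytes>\d+|-)\s
/x

# Returns a Hash for a matching line and nil for anything else.
def parse_line_sketch(line)
  match = LOG_LINE_SKETCH.match(line)
  return nil unless match

  {
    remote_addr: match[:remote_addr],
    status: match[:status],
    # A "-" byte count is treated as zero, as in the example.
    bytes: match[:bytes] == "-" ? 0 : match[:bytes].to_i
  }
end

ok_line = '203.0.113.7 - - [17/May/2015:08:05:32 +0000] ' \
          '"GET /downloads/product_1 HTTP/1.1" 304 0 "-" "curl/8.0"'
dash_line = '203.0.113.7 - - [17/May/2015:08:05:33 +0000] ' \
            '"GET /missing HTTP/1.1" 404 - "-" "curl/8.0"'

parsed_ok = parse_line_sketch(ok_line)
parsed_dash = parse_line_sketch(dash_line)
parsed_bad = parse_line_sketch("not an access log line")
```

Lines that do not match yield `nil`, which is why the example counts `lines` and `parsed` separately.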
@@ -75,6 +75,43 @@ module FiberStream
75
75
  new { |upstream| Pull.take(upstream, count) }
76
76
  end
77
77
 
78
+ # Creates a fixed-prefix dropping flow.
79
+ #
80
+ # The flow discards the first `count` upstream elements, then passes later
81
+ # elements through unchanged. `drop(0)` behaves as pass-through. Negative
82
+ # counts raise `ArgumentError`; non-Integer counts raise `TypeError`.
83
+ def self.drop(count)
84
+ raise TypeError, "count must be an Integer" unless count.is_a?(Integer)
85
+ raise ArgumentError, "count must be non-negative" if count.negative?
86
+
87
+ new { |upstream| Pull.drop(upstream, count) }
88
+ end
89
+
90
+ # Creates a predicate-based limiting flow.
91
+ #
92
+ # The flow emits leading elements while the block result is truthy. The
93
+ # first false or nil result completes the stream without emitting that
94
+ # element and closes upstream during the same downstream pull. Exceptions
95
+ # raised by the block fail the stream and are re-raised from
96
+ # `Source#run_with`.
97
+ def self.take_while(&block)
98
+ raise ArgumentError, "missing block" unless block
99
+
100
+ new { |upstream| Pull.take_while(upstream, block) }
101
+ end
102
+
103
+ # Creates a predicate-based prefix-dropping flow.
104
+ #
105
+ # The flow drops leading elements while the block result is truthy. The
106
+ # first false or nil result, and all later elements, pass through unchanged.
107
+ # After that boundary the block is not called again. Exceptions raised by
108
+ # the block fail the stream and are re-raised from `Source#run_with`.
109
+ def self.drop_while(&block)
110
+ raise ArgumentError, "missing block" unless block
111
+
112
+ new { |upstream| Pull.drop_while(upstream, block) }
113
+ end
114
+
78
115
  # Creates a scheduler-backed asynchronous boundary.
79
116
  #
80
117
  # The boundary starts its producer on the first downstream demand and
@@ -0,0 +1,95 @@
1
+ # frozen_string_literal: true
2
+
3
+ module FiberStream
4
+ module Pull
5
+ # Pull stream that emits all values from one materialized source, then all
6
+ # values from a second source materialized only after the first completes.
7
+ class Concat
8
+ def initialize(left_materializer, right_materializer)
9
+ @left_materializer = left_materializer
10
+ @right_materializer = right_materializer
11
+ @left = @left_materializer.call
12
+ @right = nil
13
+ @phase = :left
14
+ @closed = false
15
+ @done = false
16
+ end
17
+
18
+ def next
19
+ return DONE if @closed || @done
20
+
21
+ case @phase
22
+ when :left
23
+ next_left
24
+ when :right
25
+ next_right
26
+ else
27
+ DONE
28
+ end
29
+ end
30
+
31
+ def close
32
+ return if @closed
33
+
34
+ @closed = true
35
+ close_materialized_streams
36
+ end
37
+
38
+ private
39
+
40
+ def next_left
41
+ value = @left.next
42
+ return value unless Pull.done?(value)
43
+
44
+ close_left
45
+ @phase = :right
46
+ @right = @right_materializer.call
47
+ next_right
48
+ end
49
+
50
+ def next_right
51
+ value = @right.next
52
+ return value unless Pull.done?(value)
53
+
54
+ close_right
55
+ @done = true
56
+ DONE
57
+ end
58
+
59
+ def close_left
60
+ stream = @left
61
+ return unless stream
62
+
63
+ stream.close
64
+ @left = nil
65
+ end
66
+
67
+ def close_right
68
+ stream = @right
69
+ return unless stream
70
+
71
+ stream.close
72
+ @right = nil
73
+ end
74
+
75
+ def close_materialized_streams
76
+ first_error = nil
77
+
78
+ [@right, @left].each do |stream|
79
+ next unless stream
80
+
81
+ begin
82
+ stream.close
83
+ rescue StandardError => error
84
+ first_error ||= error
85
+ end
86
+ end
87
+
88
+ @right = nil
89
+ @left = nil
90
+
91
+ raise first_error if first_error
92
+ end
93
+ end
94
+ end
95
+ end
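The deferred materialization that the concat stage above describes can be observed with a small stand-in. This is a sketch under stated assumptions, not the gem's code: `END_OF_STREAM`, `ArrayPullSketch`, and `lazy_concat_sketch` are hypothetical names, and a plain `Enumerator` takes the place of the sink.

```ruby
# Hypothetical sentinel and pull stream; none of these names come from the gem.
END_OF_STREAM = Object.new.freeze

# Minimal pull stream over an Array: `next` returns values, then the sentinel.
class ArrayPullSketch
  def initialize(values)
    @values = values.dup
    @closed = false
  end

  def next
    return END_OF_STREAM if @closed || @values.empty?

    @values.shift
  end

  def close
    @closed = true
  end
end

# Calls each materializer only when the previous stream has completed.
def lazy_concat_sketch(left_materializer, right_materializer)
  Enumerator.new do |out|
    [left_materializer, right_materializer].each do |materializer|
      stream = materializer.call
      begin
        loop do
          value = stream.next
          break if value.equal?(END_OF_STREAM)

          out << value
        end
      ensure
        stream.close
      end
    end
  end
end

events = []
left = -> { events << :left; ArrayPullSketch.new([1, 2]) }
right = -> { events << :right; ArrayPullSketch.new([3]) }

# Taking only the first element never reaches the second materializer.
first_value = lazy_concat_sketch(left, right).first
events_after_first = events.dup

# Draining the stream materializes both sides, in order.
all_values = lazy_concat_sketch(left, right).to_a
```

After the first run only `:left` has been recorded; the second run adds `:left` and then `:right`.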
@@ -0,0 +1,58 @@
1
+ # frozen_string_literal: true
2
+
3
+ module FiberStream
4
+ module Pull
5
+ # Fixed-prefix dropping stage.
6
+ #
7
+ # It discards the first `count` upstream elements on downstream demand, then
8
+ # passes later elements through without buffering.
9
+ class Drop
10
+ def initialize(upstream, count)
11
+ @upstream = upstream
12
+ @remaining = count
13
+ @closed = false
14
+ @done = false
15
+ end
16
+
17
+ def next
18
+ return DONE if @closed || @done
19
+
20
+ drop_prefix
21
+ return DONE if @done
22
+
23
+ pull_retained_value
24
+ end
25
+
26
+ def close
27
+ return if @closed
28
+
29
+ @closed = true
30
+ @upstream.close
31
+ end
32
+
33
+ private
34
+
35
+ def drop_prefix
36
+ while @remaining.positive?
37
+ value = @upstream.next
38
+ if Pull.done?(value)
39
+ @done = true
40
+ return
41
+ end
42
+
43
+ @remaining -= 1
44
+ end
45
+ end
46
+
47
+ def pull_retained_value
48
+ value = @upstream.next
49
+ if Pull.done?(value)
50
+ @done = true
51
+ return DONE
52
+ end
53
+
54
+ value
55
+ end
56
+ end
57
+ end
58
+ end
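Demand-driven prefix dropping can be illustrated with Ruby's built-in `Enumerator::Lazy#drop`, which is used here only as an analogy and is not the implementation above. The `pulled` array and the `naturals` enumerator are names invented for this sketch.

```ruby
pulled = []

# An unbounded source that records every element it hands downstream.
naturals = Enumerator.new do |out|
  n = 0
  loop do
    n += 1
    pulled << n
    out << n
  end
end

# Skip a fixed prefix of two elements, then ask for only two more.
tail = naturals.lazy.drop(2).first(2)
```

Because the chain is lazy, the unbounded source is pulled only as far as downstream demand requires, so the call returns even though `naturals` never completes.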
@@ -0,0 +1,61 @@
1
+ # frozen_string_literal: true
2
+
3
+ module FiberStream
4
+ module Pull
5
+ # Predicate-based prefix dropping stage.
6
+ #
7
+ # It drops leading elements while the predicate is truthy. The first falsey
8
+ # element and all later elements pass through unchanged.
9
+ class DropWhile
10
+ def initialize(upstream, predicate)
11
+ @upstream = upstream
12
+ @predicate = predicate
13
+ @dropping = true
14
+ @closed = false
15
+ @done = false
16
+ end
17
+
18
+ def next
19
+ return DONE if @closed || @done
20
+
21
+ return pull_pass_through unless @dropping
22
+
23
+ pull_until_retained
24
+ end
25
+
26
+ def close
27
+ return if @closed
28
+
29
+ @closed = true
30
+ @upstream.close
31
+ end
32
+
33
+ private
34
+
35
+ def pull_until_retained
36
+ loop do
37
+ value = @upstream.next
38
+ if Pull.done?(value)
39
+ @done = true
40
+ return DONE
41
+ end
42
+
43
+ next if @predicate.call(value)
44
+
45
+ @dropping = false
46
+ return value
47
+ end
48
+ end
49
+
50
+ def pull_pass_through
51
+ value = @upstream.next
52
+ if Pull.done?(value)
53
+ @done = true
54
+ return DONE
55
+ end
56
+
57
+ value
58
+ end
59
+ end
60
+ end
61
+ end
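The boundary rule described above (drop while truthy, keep the first falsey result and everything after it, and stop consulting the predicate) matches Ruby's eager `Array#drop_while`. The sketch below uses that core method as a reference point; it does not exercise the stage itself, and the variable names are illustrative.

```ruby
calls = 0

# The predicate runs for 1, 2, and 3; the trailing 1 is never tested.
tail = [1, 2, 3, 1].drop_while do |number|
  calls += 1
  number < 3
end

# A nil predicate result also ends the dropped prefix.
nil_boundary = [:a, nil, :b].drop_while { |value| value }
```

Here `tail` keeps the trailing `1` even though it would satisfy the predicate, because only the leading prefix is dropped.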
@@ -0,0 +1,42 @@
1
+ # frozen_string_literal: true
2
+
3
+ module FiberStream
4
+ module Pull
5
+ # Predicate-based limiting stage.
6
+ #
7
+ # It forwards the leading prefix whose predicate results are truthy. The
8
+ # first falsey predicate result is consumed, not emitted, and closes
9
+ # upstream immediately.
10
+ class TakeWhile
11
+ def initialize(upstream, predicate)
12
+ @upstream = upstream
13
+ @predicate = predicate
14
+ @closed = false
15
+ @done = false
16
+ end
17
+
18
+ def next
19
+ return DONE if @closed || @done
20
+
21
+ value = @upstream.next
22
+ if Pull.done?(value)
23
+ @done = true
24
+ return DONE
25
+ end
26
+
27
+ return value if @predicate.call(value)
28
+
29
+ @done = true
30
+ close
31
+ DONE
32
+ end
33
+
34
+ def close
35
+ return if @closed
36
+
37
+ @closed = true
38
+ @upstream.close
39
+ end
40
+ end
41
+ end
42
+ end
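The early-close behaviour of the limiting stage above can be checked against a recording upstream. This is a self-contained sketch with hypothetical names (`STREAM_DONE`, `RecordingPullSketch`, `take_while_sketch`); it mirrors the described contract and is not the gem's class.

```ruby
# Hypothetical sentinel and upstream used only for this sketch.
STREAM_DONE = Object.new.freeze

# Pull stream that counts how often it was pulled and whether it was closed.
class RecordingPullSketch
  attr_reader :pulls, :closed

  def initialize(values)
    @values = values.dup
    @pulls = 0
    @closed = false
  end

  def next
    return STREAM_DONE if @closed || @values.empty?

    @pulls += 1
    @values.shift
  end

  def close
    @closed = true
  end
end

# Collects the leading prefix and closes upstream at the first falsey result.
def take_while_sketch(upstream)
  taken = []

  loop do
    value = upstream.next
    break if value.equal?(STREAM_DONE)

    unless yield(value)
      upstream.close
      break
    end

    taken << value
  end

  taken
end

upstream = RecordingPullSketch.new([1, 2, 3, 1])
prefix = take_while_sketch(upstream) { |number| number < 3 }
```

The element that fails the predicate is pulled but not emitted, so the upstream records three pulls for a two-element prefix and is closed before the final `1` is ever requested.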
@@ -0,0 +1,83 @@
1
+ # frozen_string_literal: true
2
+
3
+ module FiberStream
4
+ module Pull
5
+ # Pull stream that emits pairs from two source definitions.
6
+ #
7
+ # The receiver side is materialized on first downstream demand. The other
8
+ # side is materialized only after the receiver produces a value for a pair.
9
+ class Zip
10
+ def initialize(left_materializer, right_materializer)
11
+ @left_materializer = left_materializer
12
+ @right_materializer = right_materializer
13
+ @left = nil
14
+ @right = nil
15
+ @closed = false
16
+ @done = false
17
+ end
18
+
19
+ def next
20
+ return DONE if @closed || @done
21
+
22
+ left = materialize_left
23
+ left_value = left.next
24
+ if Pull.done?(left_value)
25
+ @done = true
26
+ close_materialized_streams
27
+ return DONE
28
+ end
29
+
30
+ right = materialize_right
31
+ right_value = right.next
32
+ if Pull.done?(right_value)
33
+ @done = true
34
+ close_materialized_streams
35
+ return DONE
36
+ end
37
+
38
+ [left_value, right_value]
39
+ rescue StandardError
40
+ @done = true
41
+ close_materialized_streams(raise_error: false)
42
+ raise
43
+ end
44
+
45
+ def close
46
+ return if @closed
47
+
48
+ @closed = true
49
+ close_materialized_streams
50
+ end
51
+
52
+ private
53
+
54
+ def materialize_left
55
+ @left ||= @left_materializer.call
56
+ end
57
+
58
+ def materialize_right
59
+ @right ||= @right_materializer.call
60
+ end
61
+
62
+ def close_materialized_streams(raise_error: true)
63
+ streams = [@left, @right]
64
+ @left = nil
65
+ @right = nil
66
+
67
+ first_error = nil
68
+
69
+ streams.each do |stream|
70
+ next unless stream
71
+
72
+ begin
73
+ stream.close
74
+ rescue StandardError => error
75
+ first_error ||= error
76
+ end
77
+ end
78
+
79
+ raise first_error if raise_error && first_error
80
+ end
81
+ end
82
+ end
83
+ end
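Shortest-source completion differs from Ruby's `Array#zip`, which pads with `nil` up to the receiver's length. The sketch below contrasts the two using external enumerators; `zip_shortest_sketch` is a hypothetical helper written for this illustration, not a method of the gem.

```ruby
# Pairs elements until either side is exhausted. The left side is pulled first,
# so the right side is not touched once the left side has completed.
def zip_shortest_sketch(left, right)
  Enumerator.new do |out|
    left_iter = left.each
    right_iter = right.each

    # `loop` ends quietly when either `next` call raises StopIteration.
    loop do
      left_value = left_iter.next
      right_value = right_iter.next
      out << [left_value, right_value]
    end
  end
end

padded = [1, 2, 3].zip(%w[a b])
shortest = zip_shortest_sketch([1, 2, 3], %w[a b]).to_a
left_short = zip_shortest_sketch([1], %w[a b]).to_a
```

`padded` ends with `[3, nil]`, while `shortest` stops after two pairs, which is the completion rule the zip stage documents.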
@@ -28,6 +28,14 @@ module FiberStream
  RactorPortSource.new(port, ack_port, ack_transfer, cancel)
  end
 
+ def self.concat(left_materializer, right_materializer)
+ Concat.new(left_materializer, right_materializer)
+ end
+
+ def self.zip(left_materializer, right_materializer)
+ Zip.new(left_materializer, right_materializer)
+ end
+
  def self.map(upstream, transform)
  Map.new(upstream, transform)
  end
@@ -48,6 +56,18 @@ module FiberStream
  Take.new(upstream, count)
  end
 
+ def self.drop(upstream, count)
+ Drop.new(upstream, count)
+ end
+
+ def self.take_while(upstream, predicate)
+ TakeWhile.new(upstream, predicate)
+ end
+
+ def self.drop_while(upstream, predicate)
+ DropWhile.new(upstream, predicate)
+ end
+
  def self.async(upstream)
  AsyncBoundary.new(upstream)
  end
@@ -67,9 +87,14 @@ end
  require_relative "pull/each"
  require_relative "pull/io_source"
  require_relative "pull/ractor_port_source"
+ require_relative "pull/concat"
+ require_relative "pull/zip"
  require_relative "pull/map"
  require_relative "pull/select"
  require_relative "pull/take"
+ require_relative "pull/drop"
+ require_relative "pull/take_while"
+ require_relative "pull/drop_while"
  require_relative "pull/lines"
  require_relative "pull/async_boundary"
  require_relative "pull/buffer_boundary"
@@ -78,8 +103,8 @@ require_relative "pull/ractor_map_boundary"
 
  module FiberStream
  module Pull
- private_constant :Each, :IOSource, :RactorPortSource, :Map, :Select, :Take, :Lines,
- :AsyncBoundary, :BufferBoundary, :ParallelMapBoundary,
- :RactorMapBoundary
+ private_constant :Each, :IOSource, :RactorPortSource, :Concat, :Zip, :Map, :Select, :Take, :Drop,
+ :TakeWhile, :DropWhile, :Lines, :AsyncBoundary, :BufferBoundary,
+ :ParallelMapBoundary, :RactorMapBoundary
  end
  end
@@ -56,6 +56,30 @@ module FiberStream
56
56
  end
57
57
  end
58
58
 
59
+ # Creates a sink that runs a block for each stream element.
60
+ #
61
+ # The sink consumes upstream until normal completion, calls the block once
62
+ # per element in input order, and returns the number of elements whose block
63
+ # completed successfully. Exceptions raised by the block fail the stream and
64
+ # are re-raised from `Source#run_with`.
65
+ def self.foreach(&block)
66
+ raise ArgumentError, "missing block" unless block
67
+
68
+ new do |stream|
69
+ count = 0
70
+
71
+ loop do
72
+ value = stream.next
73
+ break if Pull.done?(value)
74
+
75
+ block.call(value)
76
+ count += 1
77
+ end
78
+
79
+ count
80
+ end
81
+ end
82
+
59
83
  # Creates a sink that writes String chunks to an IO-like object.
60
84
  #
61
85
  # The sink consumes upstream until normal completion and returns the number
@@ -62,6 +62,37 @@ module FiberStream
62
62
  self.class.__send__(:new, @source_factory, @flows + [flow])
63
63
  end
64
64
 
65
+ # Returns a new source definition that emits this source, then `source`.
66
+ #
67
+ # Construction is lazy. The appended source is not materialized or pulled
68
+ # until downstream demand observes completion from this source. Flows
69
+ # attached before concat stay scoped to their source; flows attached after
70
+ # concat apply to the combined output.
71
+ def concat(source)
72
+ raise TypeError, "expected FiberStream::Source" unless source.is_a?(Source)
73
+
74
+ self.class.__send__(
75
+ :new,
76
+ -> { Pull.concat(materializer, source.__send__(:materializer)) }
77
+ )
78
+ end
79
+
80
+ # Returns a new source definition that emits pairs from this source and
81
+ # `source`.
82
+ #
83
+ # Construction is lazy. The receiver side is materialized only when
84
+ # downstream demand reaches the zip stage; the other side is materialized
85
+ # only after the receiver produces an element for a pair. The zipped source
86
+ # completes when either input completes.
87
+ def zip(source)
88
+ raise TypeError, "expected FiberStream::Source" unless source.is_a?(Source)
89
+
90
+ self.class.__send__(
91
+ :new,
92
+ -> { Pull.zip(materializer, source.__send__(:materializer)) }
93
+ )
94
+ end
95
+
65
96
  # Returns a new source definition that maps each element with `block`.
66
97
  #
67
98
  # This is a convenience wrapper around `via(FiberStream::Flow.map { ... })`
@@ -115,6 +146,34 @@ module FiberStream
115
146
  via(Flow.take(count))
116
147
  end
117
148
 
149
+ # Returns a new source definition that drops the first `count` elements.
150
+ #
151
+ # This is a convenience wrapper around `via(FiberStream::Flow.drop(count))`
152
+ # and preserves the same validation and pull-driven backpressure behavior.
153
+ def drop(count)
154
+ via(Flow.drop(count))
155
+ end
156
+
157
+ # Returns a new source definition that emits leading elements while `block`
158
+ # is truthy.
159
+ #
160
+ # This is a convenience wrapper around
161
+ # `via(FiberStream::Flow.take_while { ... })` and preserves the same
162
+ # predicate truthiness, early completion, and upstream close behavior.
163
+ def take_while(&block)
164
+ via(Flow.take_while(&block))
165
+ end
166
+
167
+ # Returns a new source definition that drops leading elements while `block`
168
+ # is truthy.
169
+ #
170
+ # This is a convenience wrapper around
171
+ # `via(FiberStream::Flow.drop_while { ... })` and preserves the same
172
+ # predicate truthiness, prefix-dropping, and pass-through behavior.
173
+ def drop_while(&block)
174
+ via(Flow.drop_while(&block))
175
+ end
176
+
118
177
  # Returns a new source definition with an asynchronous boundary.
119
178
  #
120
179
  # This is a convenience wrapper around `via(FiberStream::Flow.async)` and
@@ -161,10 +220,7 @@ module FiberStream
161
220
  primary_error = nil
162
221
 
163
222
  begin
164
- stream = @source_factory.call
165
- @flows.each do |flow|
166
- stream = flow.__send__(:attach, stream)
167
- end
223
+ stream = materialize
168
224
 
169
225
  sink.__send__(:run, stream)
170
226
  rescue StandardError => error
@@ -180,5 +236,30 @@ module FiberStream
180
236
  end
181
237
 
182
238
  private_class_method :new
239
+
240
+ private
241
+
242
+ def materializer
243
+ -> { materialize }
244
+ end
245
+
246
+ def materialize
247
+ stream = nil
248
+
249
+ begin
250
+ stream = @source_factory.call
251
+ @flows.each do |flow|
252
+ stream = flow.__send__(:attach, stream)
253
+ end
254
+ stream
255
+ rescue StandardError
256
+ begin
257
+ stream&.close
258
+ rescue StandardError
259
+ nil
260
+ end
261
+ raise
262
+ end
263
+ end
183
264
  end
184
265
  end
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module FiberStream
- VERSION = "0.1.0"
+ VERSION = "0.2.0"
  end
data/sig/fiber_stream.rbs CHANGED
@@ -2,6 +2,7 @@ module FiberStream
  type ractor_transfer_policy = :copy | :move
  type ractor_map_error_kind = :input_transfer | :output_transfer | :worker | :worker_termination | :isolation
  type ractor_port_source_error_kind = :invalid_message | :producer_failure | :receive | :ack_transfer | :cancel_transfer
+ type ractor_port_cancel_reason = :closed
 
  class SchedulerRequiredError < RuntimeError
  end
@@ -42,7 +43,7 @@ module FiberStream
  end
 
  class Cancel < Data
- attr_reader reason: Symbol
+ attr_reader reason: ractor_port_cancel_reason
  end
  end
 
@@ -51,11 +52,16 @@ module FiberStream
  def self.io: (untyped io, ?chunk_size: Integer, ?close: bool) -> Source[String]
  def self.ractor_port: [Elem] (untyped port, ack_port: untyped, ?ack_transfer: ractor_transfer_policy, ?cancel: bool) -> Source[Elem]
  def via: [Out] (Flow[Elem, Out] flow) -> Source[Out]
+ def concat: [Other] (Source[Other] source) -> Source[Elem | Other]
+ def zip: [Other] (Source[Other] source) -> Source[[Elem, Other]]
  def map: [Out] () { (Elem) -> Out } -> Source[Out]
  def parallel_map: [Out] (concurrency: Integer) { (Elem) -> Out } -> Source[Out]
  def ractor_map: [Out] (workers: Integer, ?input_transfer: ractor_transfer_policy, ?output_transfer: ractor_transfer_policy) { (Elem) -> Out } -> Source[Out]
  def select: () { (Elem) -> boolish } -> Source[Elem]
  def take: (Integer count) -> Source[Elem]
+ def drop: (Integer count) -> Source[Elem]
+ def take_while: () { (Elem) -> boolish } -> Source[Elem]
+ def drop_while: () { (Elem) -> boolish } -> Source[Elem]
  def async: () -> Source[Elem]
  def buffer: (Integer count) -> Source[Elem]
  def lines: (?chomp: bool, ?max_length: Integer?) -> Source[String]
@@ -69,6 +75,9 @@ module FiberStream
  def self.ractor_map: [In, Out] (workers: Integer, ?input_transfer: ractor_transfer_policy, ?output_transfer: ractor_transfer_policy) { (In) -> Out } -> Flow[In, Out]
  def self.select: [Elem] () { (Elem) -> boolish } -> Flow[Elem, Elem]
  def self.take: [Elem] (Integer count) -> Flow[Elem, Elem]
+ def self.drop: [Elem] (Integer count) -> Flow[Elem, Elem]
+ def self.take_while: [Elem] () { (Elem) -> boolish } -> Flow[Elem, Elem]
+ def self.drop_while: [Elem] () { (Elem) -> boolish } -> Flow[Elem, Elem]
  def self.async: [Elem] () -> Flow[Elem, Elem]
  def self.buffer: [Elem] (Integer count) -> Flow[Elem, Elem]
  def self.lines: (?chomp: bool, ?max_length: Integer?) -> Flow[String, String]
@@ -80,6 +89,7 @@ module FiberStream
  def self.to_a: [Elem] () -> Sink[Elem, Array[Elem]]
  def self.first: [Elem] () -> Sink[Elem, Elem?]
  def self.fold: [Elem, Acc] (Acc initial) { (Acc, Elem) -> Acc } -> Sink[Elem, Acc]
+ def self.foreach: [Elem] () { (Elem) -> void } -> Sink[Elem, Integer]
  def self.io: (untyped io, ?close: bool, ?flush: bool) -> Sink[String, Integer]
  end
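The new `drop`, `take_while`, `drop_while`, and `zip` signatures share names with operators in Ruby's standard library. As a rough analogy only, the sketch below uses `Enumerator::Lazy` and a small helper instead of fiber_stream itself. It assumes the gem's semantics match its own descriptions: `drop_while` drops leading elements while the block is truthy and then passes everything through, and `zip` completes with the shorter source, as the changelog entry for this release states. `zip_shortest` is a hypothetical helper, not a fiber_stream method.

```ruby
# Standard-library analogy; this does not call fiber_stream.
numbers = [1, 2, 3, 10, 2, 1].lazy

dropped_prefix = numbers.drop(2).to_a
taken_while    = numbers.take_while { |n| n < 10 }.to_a
dropped_while  = numbers.drop_while { |n| n < 10 }.to_a

puts dropped_prefix.inspect # [3, 10, 2, 1]
puts taken_while.inspect    # [1, 2, 3]
puts dropped_while.inspect  # [10, 2, 1] (later small values pass through)

# Hypothetical helper: pairs elements on demand and stops as soon as either
# side is exhausted. Array#zip would instead pad the shorter side with nil.
def zip_shortest(left, right)
  Enumerator.new do |yielder|
    left_iter = left.to_enum
    right_iter = right.to_enum
    # Kernel#loop rescues StopIteration, which ends the pairing cleanly.
    loop do
      yielder << [left_iter.next, right_iter.next]
    end
  end
end

puts zip_shortest([1, 2, 3], %w[a b]).to_a.inspect # [[1, "a"], [2, "b"]]
```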
 
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: fiber_stream
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.2.0
  platform: ruby
  authors:
  - Dai Akatsuka
@@ -23,6 +23,20 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '2.0'
+ - !ruby/object:Gem::Dependency
+ name: async-http
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0.95'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0.95'
  - !ruby/object:Gem::Dependency
  name: minitest
  requirement: !ruby/object:Gem::Requirement
@@ -95,6 +109,7 @@ files:
  - README.md
  - examples/README.md
  - examples/async_http_requests.rb
+ - examples/async_http_streaming_body.rb
  - examples/background_execution.rb
  - examples/backpressure_buffer.rb
  - examples/basic_pipeline.rb
@@ -110,6 +125,9 @@ files:
  - lib/fiber_stream/pull.rb
  - lib/fiber_stream/pull/async_boundary.rb
  - lib/fiber_stream/pull/buffer_boundary.rb
+ - lib/fiber_stream/pull/concat.rb
+ - lib/fiber_stream/pull/drop.rb
+ - lib/fiber_stream/pull/drop_while.rb
  - lib/fiber_stream/pull/each.rb
  - lib/fiber_stream/pull/io_source.rb
  - lib/fiber_stream/pull/lines.rb
@@ -119,6 +137,8 @@ files:
  - lib/fiber_stream/pull/ractor_port_source.rb
  - lib/fiber_stream/pull/select.rb
  - lib/fiber_stream/pull/take.rb
+ - lib/fiber_stream/pull/take_while.rb
+ - lib/fiber_stream/pull/zip.rb
  - lib/fiber_stream/ractor_port.rb
  - lib/fiber_stream/running_pipeline.rb
  - lib/fiber_stream/sink.rb
@@ -131,7 +151,7 @@ licenses:
  metadata:
  allowed_push_host: https://rubygems.org
  homepage_uri: https://github.com/dakatsuka/fiber_stream
- source_code_uri: https://github.com/dakatsuka/fiber_stream/tree/v0.1.0
+ source_code_uri: https://github.com/dakatsuka/fiber_stream/tree/v0.2.0
  changelog_uri: https://github.com/dakatsuka/fiber_stream/blob/main/CHANGELOG.md
  rubygems_mfa_required: 'true'
  rdoc_options: []