async-enumerable 0.1.0
- checksums.yaml +7 -0
- data/.rspec +3 -0
- data/.standard.yml +5 -0
- data/CHANGELOG.md +5 -0
- data/CODE_OF_CONDUCT.md +132 -0
- data/LICENSE.txt +21 -0
- data/README.md +416 -0
- data/Rakefile +127 -0
- data/benchmark/async_all.yaml +38 -0
- data/benchmark/async_any.yaml +39 -0
- data/benchmark/async_each.yaml +51 -0
- data/benchmark/async_find.yaml +37 -0
- data/benchmark/async_map.yaml +50 -0
- data/benchmark/async_select.yaml +31 -0
- data/benchmark/early_termination/any_early.yaml +17 -0
- data/benchmark/early_termination/any_late.yaml +17 -0
- data/benchmark/early_termination/find_middle.yaml +17 -0
- data/benchmark/size_comparison/map_10.yaml +17 -0
- data/benchmark/size_comparison/map_100.yaml +17 -0
- data/benchmark/size_comparison/map_1000.yaml +20 -0
- data/benchmark/size_comparison/map_10000.yaml +23 -0
- data/docs/reference/README.md +43 -0
- data/docs/reference/concurrency_bounder.md +234 -0
- data/docs/reference/enumerable.md +258 -0
- data/docs/reference/enumerator.md +221 -0
- data/docs/reference/methods/converters.md +97 -0
- data/docs/reference/methods/predicates.md +254 -0
- data/docs/reference/methods/transformers.md +104 -0
- data/lib/async/enumerable/comparable.rb +26 -0
- data/lib/async/enumerable/concurrency_bounder.rb +37 -0
- data/lib/async/enumerable/configurable.rb +140 -0
- data/lib/async/enumerable/methods/aggregators.rb +40 -0
- data/lib/async/enumerable/methods/converters.rb +21 -0
- data/lib/async/enumerable/methods/each.rb +39 -0
- data/lib/async/enumerable/methods/iterators.rb +27 -0
- data/lib/async/enumerable/methods/predicates/all.rb +47 -0
- data/lib/async/enumerable/methods/predicates/any.rb +47 -0
- data/lib/async/enumerable/methods/predicates/find.rb +55 -0
- data/lib/async/enumerable/methods/predicates/find_index.rb +50 -0
- data/lib/async/enumerable/methods/predicates/include.rb +23 -0
- data/lib/async/enumerable/methods/predicates/none.rb +27 -0
- data/lib/async/enumerable/methods/predicates/one.rb +48 -0
- data/lib/async/enumerable/methods/predicates.rb +29 -0
- data/lib/async/enumerable/methods/slicers.rb +34 -0
- data/lib/async/enumerable/methods/transformers/compact.rb +18 -0
- data/lib/async/enumerable/methods/transformers/filter_map.rb +19 -0
- data/lib/async/enumerable/methods/transformers/flat_map.rb +20 -0
- data/lib/async/enumerable/methods/transformers/map.rb +22 -0
- data/lib/async/enumerable/methods/transformers/reject.rb +19 -0
- data/lib/async/enumerable/methods/transformers/select.rb +21 -0
- data/lib/async/enumerable/methods/transformers/sort.rb +18 -0
- data/lib/async/enumerable/methods/transformers/sort_by.rb +19 -0
- data/lib/async/enumerable/methods/transformers/uniq.rb +18 -0
- data/lib/async/enumerable/methods/transformers.rb +35 -0
- data/lib/async/enumerable/methods.rb +26 -0
- data/lib/async/enumerable/version.rb +10 -0
- data/lib/async/enumerable.rb +72 -0
- data/lib/async/enumerator.rb +33 -0
- data/lib/enumerable/async.rb +38 -0
- data/scripts/debug_config.rb +26 -0
- data/scripts/debug_config2.rb +34 -0
- data/scripts/sketch.rb +30 -0
- data/scripts/test_aggregators.rb +66 -0
- data/scripts/test_ancestors.rb +12 -0
- data/scripts/test_async_chaining.rb +30 -0
- data/scripts/test_direct_method_calls.rb +53 -0
- data/scripts/test_example.rb +37 -0
- data/scripts/test_issue_7.rb +69 -0
- data/scripts/test_method_source.rb +15 -0
- metadata +145 -0
@@ -0,0 +1,234 @@
|
|
1
|
+
# ConcurrencyBounder Module
|
2
|
+
|
3
|
+
The `ConcurrencyBounder` module provides bounded concurrency control for async operations.
|
4
|
+
|
5
|
+
## Overview
|
6
|
+
|
7
|
+
ConcurrencyBounder provides a helper method for executing async operations with a maximum fiber limit to prevent unbounded concurrency. This module is included in `Async::Enumerable` to provide a consistent way to limit the number of concurrent fibers created during async operations.
|
8
|
+
|
9
|
+
## Core Method
|
10
|
+
|
11
|
+
### with_bounded_concurrency
|
12
|
+
|
13
|
+
Executes a block with bounded concurrency using a semaphore.
|
14
|
+
|
15
|
+
This method sets up an `Async::Semaphore` to limit the number of concurrent fibers, creates a barrier under that semaphore, and yields the barrier to the block for spawning async tasks.
|
16
|
+
|
17
|
+
#### Parameters
|
18
|
+
|
19
|
+
- `early_termination` (Boolean): Whether the operation supports early termination (expects Async::Stop exceptions)
|
20
|
+
- `&block`: Block that receives the barrier for spawning tasks
|
21
|
+
|
22
|
+
#### Yields
|
23
|
+
|
24
|
+
- `barrier` (Async::Barrier): The barrier to use for spawning async tasks
|
25
|
+
|
26
|
+
#### Returns
|
27
|
+
|
28
|
+
The result of the block execution.
|
29
|
+
|
30
|
+
#### Example Usage
|
31
|
+
|
32
|
+
```ruby
|
33
|
+
def parallel_map(&block)
|
34
|
+
results = []
|
35
|
+
|
36
|
+
with_bounded_concurrency do |barrier|
|
37
|
+
@items.each_with_index do |item, index|
|
38
|
+
barrier.async do
|
39
|
+
results[index] = block.call(item)
|
40
|
+
end
|
41
|
+
end
|
42
|
+
end
|
43
|
+
|
44
|
+
results
|
45
|
+
end
|
46
|
+
```
|
47
|
+
|
48
|
+
### max_fibers
|
49
|
+
|
50
|
+
Gets the maximum number of fibers for this instance.
|
51
|
+
|
52
|
+
#### Returns
|
53
|
+
|
54
|
+
`Integer` - The maximum number of concurrent fibers
|
55
|
+
|
56
|
+
#### Behavior
|
57
|
+
|
58
|
+
1. Returns from instance config if set
|
59
|
+
2. Falls back to class config if defined
|
60
|
+
3. Falls back to `Async::Enumerable.config.max_fibers` module default
|
61
|
+
4. Module default is 1024 if not configured
|
62
|
+
|
63
|
+
## Implementation Details
|
64
|
+
|
65
|
+
### Semaphore Usage
|
66
|
+
|
67
|
+
The module uses `Async::Semaphore` to enforce fiber limits:
|
68
|
+
|
69
|
+
```ruby
|
70
|
+
semaphore = Async::Semaphore.new(max_fibers, parent:)
|
71
|
+
barrier = Async::Barrier.new(parent: semaphore)
|
72
|
+
```
|
73
|
+
|
74
|
+
This ensures that no more than `max_fibers` tasks run concurrently.
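The bounding idea can be sketched without the async gem. The example below is an illustration only: it swaps in plain threads and a `SizedQueue` used as a counting semaphore, whereas the module is described as using `Async::Semaphore` with fibers. The `bounded_each` helper and the `peak` counter are hypothetical names.

```ruby
# Stand-in sketch: threads + SizedQueue instead of fibers + Async::Semaphore.
# bounded_each is a hypothetical helper, not part of async-enumerable.
def bounded_each(items, limit)
  slots = SizedQueue.new(limit) # push blocks once `limit` tokens are held
  threads = items.map do |item|
    slots.push(true)            # acquire a slot before spawning the task
    Thread.new do
      yield item
    ensure
      slots.pop                 # release the slot, even if the block raised
    end
  end
  threads.each(&:join)          # wait until every task has finished
end

lock = Mutex.new
running = 0
peak = 0

bounded_each((1..20).to_a, 3) do |_item|
  lock.synchronize do
    running += 1
    peak = [peak, running].max
  end
  sleep 0.01                    # simulated I/O wait
  lock.synchronize { running -= 1 }
end

puts "peak concurrency: #{peak}"
```

Because a slot is acquired before each task starts and released only after it ends, the observed peak should never exceed the limit passed in.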
|
75
|
+
|
76
|
+
### Early Termination Support
|
77
|
+
|
78
|
+
When `early_termination: true`:
|
79
|
+
|
80
|
+
```ruby
|
81
|
+
with_bounded_concurrency(early_termination: true) do |barrier|
|
82
|
+
# Tasks can call barrier.stop to terminate early
|
83
|
+
# Async::Stop exceptions are caught and handled
|
84
|
+
end
|
85
|
+
```
|
86
|
+
|
87
|
+
This is used by predicate methods like `any?` and `find` to stop processing once the result is determined.
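A rough picture of early termination, again using stdlib threads as a stand-in: workers stop pulling items once a shared flag is set. This is a sketch under stated assumptions, not the gem's implementation, which is described as relying on `barrier.stop`. The name `sketch_any?` is hypothetical, and the sketch assumes the collection contains no `nil` items.

```ruby
# Stand-in sketch: a shared flag plus a small thread pool.
# sketch_any? is a hypothetical helper; it assumes items are never nil.
def sketch_any?(items, workers: 4)
  found = false
  lock = Mutex.new
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close                       # pop returns nil once the queue is drained

  pool = Array.new(workers) do
    Thread.new do
      until lock.synchronize { found }
        item = queue.pop
        break if item.nil?          # queue is closed and empty
        lock.synchronize { found = true } if yield(item)
      end
    end
  end
  pool.each(&:join)
  found
end

checked = Queue.new
result = sketch_any?(1..1_000) do |n|
  checked << n                      # record which items were actually tested
  n == 3
end

puts "match found: #{result}, items checked: #{checked.size}"
```

How many items get checked before the flag is seen depends on scheduling, so the sketch prints the count rather than promising a specific number.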
|
88
|
+
|
89
|
+
### Normal Execution
|
90
|
+
|
91
|
+
When `early_termination: false` (default):
|
92
|
+
|
93
|
+
```ruby
|
94
|
+
with_bounded_concurrency do |barrier|
|
95
|
+
# All tasks run to completion
|
96
|
+
# barrier.wait blocks until all finish
|
97
|
+
end
|
98
|
+
```
|
99
|
+
|
100
|
+
## Usage in Async::Enumerable
|
101
|
+
|
102
|
+
### In Predicate Methods
|
103
|
+
|
104
|
+
```ruby
|
105
|
+
def any?(&block)
|
106
|
+
found = Concurrent::AtomicBoolean.new(false)
|
107
|
+
|
108
|
+
with_bounded_concurrency(early_termination: true) do |barrier|
|
109
|
+
@items.each do |item|
|
110
|
+
break if found.true?
|
111
|
+
|
112
|
+
barrier.async do
|
113
|
+
if block.call(item)
|
114
|
+
found.make_true
|
115
|
+
barrier.stop # Early termination
|
116
|
+
end
|
117
|
+
end
|
118
|
+
end
|
119
|
+
end
|
120
|
+
|
121
|
+
found.true?
|
122
|
+
end
|
123
|
+
```
|
124
|
+
|
125
|
+
### In Iteration Methods
|
126
|
+
|
127
|
+
```ruby
|
128
|
+
def each(&block)
|
129
|
+
with_bounded_concurrency do |barrier|
|
130
|
+
@items.each do |item|
|
131
|
+
barrier.async do
|
132
|
+
block.call(item)
|
133
|
+
end
|
134
|
+
end
|
135
|
+
end
|
136
|
+
end
|
137
|
+
```
|
138
|
+
|
139
|
+
## Fiber Limit Configuration
|
140
|
+
|
141
|
+
### Global Default
|
142
|
+
|
143
|
+
```ruby
|
144
|
+
Async::Enumerable.configure { |c| c.max_fibers = 100 } # Set global default
|
145
|
+
```
|
146
|
+
|
147
|
+
### Per Class and Per Instance
|
148
|
+
|
149
|
+
```ruby
|
150
|
+
class MyEnumerator
|
151
|
+
include Async::Enumerable
|
152
|
+
def_async_enumerable :@items, max_fibers: 50 # Class-level default
|
153
|
+
|
154
|
+
def initialize(items)
|
155
|
+
@items = items
|
156
|
+
end
|
157
|
+
end
|
158
|
+
|
159
|
+
# Instance override
|
160
|
+
enumerator = MyEnumerator.new(items)
|
161
|
+
enumerator.async(max_fibers: 100).map { |item| process(item) }
|
162
|
+
```
|
163
|
+
|
164
|
+
### Precedence
|
165
|
+
|
166
|
+
1. Instance config (passed to `.async` method)
|
167
|
+
2. Class config (set via `def_async_enumerable`)
|
168
|
+
3. Module config (`Async::Enumerable.configure`)
|
169
|
+
4. Default constant (1024)
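The precedence list above amounts to a first-non-nil lookup. A minimal sketch, assuming each level is either an integer or `nil`; `resolve_max_fibers` and its keyword names are hypothetical and do not come from the gem:

```ruby
# Hypothetical resolver mirroring the four precedence levels listed above.
def resolve_max_fibers(instance: nil, klass: nil, mod: nil, fallback: 1024)
  instance || klass || mod || fallback
end

resolve_max_fibers(instance: 100, klass: 50, mod: 200) # => 100
resolve_max_fibers(klass: 50, mod: 200)                # => 50
resolve_max_fibers(mod: 200)                           # => 200
resolve_max_fibers                                     # => 1024
```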
|
170
|
+
|
171
|
+
## Performance Considerations
|
172
|
+
|
173
|
+
### Choosing Fiber Limits
|
174
|
+
|
175
|
+
- **Too low**: Reduces parallelism, may not fully utilize resources
|
176
|
+
- **Too high**: Can cause resource exhaustion, context switching overhead
|
177
|
+
- **Recommended**: Start with defaults, tune based on workload
|
178
|
+
|
179
|
+
### Typical Values
|
180
|
+
|
181
|
+
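- CPU-bound tasks: 2-4x CPU cores (fibers share a single thread, so a higher limit adds scheduling overhead without adding parallelism)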
|
182
|
+
- I/O-bound tasks: 50-200 fibers
|
183
|
+
- Mixed workloads: 10-50 fibers
|
184
|
+
|
185
|
+
## Thread Safety
|
186
|
+
|
187
|
+
The module operates within the async runtime which handles:
|
188
|
+
- Fiber scheduling
|
189
|
+
- Resource allocation
|
190
|
+
- Synchronization via barriers
|
191
|
+
|
192
|
+
Combined with atomic variables from concurrent-ruby, this ensures thread-safe operation.
|
193
|
+
|
194
|
+
## Error Handling
|
195
|
+
|
196
|
+
### Normal Errors
|
197
|
+
|
198
|
+
Exceptions in async tasks propagate normally:
|
199
|
+
|
200
|
+
```ruby
|
201
|
+
with_bounded_concurrency do |barrier|
|
202
|
+
barrier.async do
|
203
|
+
raise "Error!" # Will propagate after barrier.wait
|
204
|
+
end
|
205
|
+
end
|
206
|
+
```
|
207
|
+
|
208
|
+
### Early Termination
|
209
|
+
|
210
|
+
`Async::Stop` exceptions are caught when `early_termination: true`:
|
211
|
+
|
212
|
+
```ruby
|
213
|
+
with_bounded_concurrency(early_termination: true) do |barrier|
|
214
|
+
barrier.async do
|
215
|
+
barrier.stop # Raises Async::Stop internally
|
216
|
+
end
|
217
|
+
end
|
218
|
+
# Async::Stop is caught, execution continues
|
219
|
+
```
|
220
|
+
|
221
|
+
## Integration with Async Runtime
|
222
|
+
|
223
|
+
ConcurrencyBounder requires the async runtime via `Sync` blocks:
|
224
|
+
|
225
|
+
```ruby
|
226
|
+
def with_bounded_concurrency(...)
|
227
|
+
Sync do |parent|
|
228
|
+
# Async context established
|
229
|
+
# Semaphore and barrier created with parent
|
230
|
+
end
|
231
|
+
end
|
232
|
+
```
|
233
|
+
|
234
|
+
This ensures proper async context even when called from synchronous code.
|
@@ -0,0 +1,258 @@
|
|
1
|
+
# Async::Enumerable Module
|
2
|
+
|
3
|
+
The `Async::Enumerable` module provides asynchronous, parallel execution capabilities for Ruby's Enumerable.
|
4
|
+
|
5
|
+
## Overview
|
6
|
+
|
7
|
+
This module extends Ruby's Enumerable with an `.async` method that returns an `Async::Enumerator` wrapper, enabling concurrent execution of enumerable operations using the socketry/async library. This can yield significant speedups for I/O-bound operations, especially over large collections.
|
8
|
+
|
9
|
+
## Features
|
10
|
+
|
11
|
+
- Parallel execution of enumerable methods
|
12
|
+
- Thread-safe operation with atomic variables
|
13
|
+
- Optimized early-termination implementations for predicates and find operations
|
14
|
+
- Full compatibility with standard Enumerable interface
|
15
|
+
- Configurable concurrency limits to prevent unbounded fiber creation
|
16
|
+
|
17
|
+
## Basic Usage
|
18
|
+
|
19
|
+
### Including in Your Class
|
20
|
+
|
21
|
+
```ruby
|
22
|
+
class MyCollection
|
23
|
+
include Async::Enumerable
|
24
|
+
def_async_enumerable :@items
|
25
|
+
|
26
|
+
def initialize
|
27
|
+
@items = []
|
28
|
+
end
|
29
|
+
|
30
|
+
attr_reader :items
|
31
|
+
end
|
32
|
+
|
33
|
+
collection = MyCollection.new
|
34
|
+
collection.items.concat([1, 2, 3])
|
35
|
+
collection.async.map { |x| x * 2 } # => [2, 4, 6]
|
36
|
+
```
|
37
|
+
|
38
|
+
### Using with Arrays and Standard Collections
|
39
|
+
|
40
|
+
```ruby
|
41
|
+
# Basic async enumeration
|
42
|
+
[1, 2, 3, 4, 5].async.map { |n| n * 2 }
|
43
|
+
# => [2, 4, 6, 8, 10] (processed in parallel)
|
44
|
+
|
45
|
+
# Async I/O operations
|
46
|
+
urls = ["http://api1.com", "http://api2.com", "http://api3.com"]
|
47
|
+
results = urls.async.map { |url| fetch_data(url) }
|
48
|
+
# All URLs fetched concurrently
|
49
|
+
```
|
50
|
+
|
51
|
+
## The def_async_enumerable Method
|
52
|
+
|
53
|
+
The `def_async_enumerable` class method defines the source of enumeration for async operations.
|
54
|
+
|
55
|
+
### Syntax
|
56
|
+
|
57
|
+
```ruby
|
58
|
+
def_async_enumerable :collection_ref, max_fibers: nil
|
59
|
+
```
|
60
|
+
|
61
|
+
### Parameters
|
62
|
+
|
63
|
+
- `collection_ref` (Symbol): The name of the method or instance variable that returns the enumerable
|
64
|
+
- `max_fibers` (Integer, optional): Default max_fibers for this class
|
65
|
+
|
66
|
+
### Examples
|
67
|
+
|
68
|
+
#### With Method
|
69
|
+
|
70
|
+
```ruby
|
71
|
+
class DataProcessor
|
72
|
+
include Async::Enumerable
|
73
|
+
def_async_enumerable :dataset
|
74
|
+
|
75
|
+
def dataset
|
76
|
+
fetch_data_from_source
|
77
|
+
end
|
78
|
+
end
|
79
|
+
```
|
80
|
+
|
81
|
+
#### With Instance Variable
|
82
|
+
|
83
|
+
```ruby
|
84
|
+
class JobQueue # named to avoid reopening Ruby's built-in Queue class
|
85
|
+
include Async::Enumerable
|
86
|
+
def_async_enumerable :@items # Note the @ prefix
|
87
|
+
|
88
|
+
def initialize
|
89
|
+
@items = []
|
90
|
+
end
|
91
|
+
end
|
92
|
+
```
|
93
|
+
|
94
|
+
#### With Custom Fiber Limit
|
95
|
+
|
96
|
+
```ruby
|
97
|
+
class LargeDataset
|
98
|
+
include Async::Enumerable
|
99
|
+
def_async_enumerable :@records, max_fibers: 50
|
100
|
+
|
101
|
+
attr_reader :records
|
102
|
+
end
|
103
|
+
```
|
104
|
+
|
105
|
+
## Idempotent Async Chaining
|
106
|
+
|
107
|
+
The `.async` method is idempotent: calling it on an object that is already an `Async::Enumerator` returns that same instance:
|
108
|
+
|
109
|
+
```ruby
|
110
|
+
arr = [1, 2, 3]
|
111
|
+
async1 = arr.async
|
112
|
+
async2 = async1.async
|
113
|
+
async3 = async2.async
|
114
|
+
|
115
|
+
async1.equal?(async2) # => true
|
116
|
+
async2.equal?(async3) # => true
|
117
|
+
```
|
118
|
+
|
119
|
+
This prevents unnecessary wrapper creation and allows for flexible API design.
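One way such idempotence could be arranged is for the wrapper's own `async` to return `self`. The sketch below is illustrative only: `SketchEnumerator` and `to_async` are hypothetical names, and nothing here runs concurrently.

```ruby
# Hypothetical wrapper; SketchEnumerator and to_async are illustration-only names.
class SketchEnumerator
  include Enumerable

  def initialize(source)
    @source = source
  end

  def each(&block)
    @source.each(&block)
    self
  end

  # Already a wrapper, so hand back the receiver instead of wrapping again.
  def async
    self
  end
end

# Wraps plain enumerables once; existing wrappers pass through untouched.
def to_async(enumerable)
  enumerable.is_a?(SketchEnumerator) ? enumerable : SketchEnumerator.new(enumerable)
end

first = to_async([1, 2, 3])
second = to_async(first)
first.equal?(second)      # => true
first.async.equal?(first) # => true
```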
|
120
|
+
|
121
|
+
## Fiber Limits
|
122
|
+
|
123
|
+
### Global Default
|
124
|
+
|
125
|
+
Set the global default maximum fibers:
|
126
|
+
|
127
|
+
```ruby
|
128
|
+
Async::Enumerable.configure { |c| c.max_fibers = 100 }
|
129
|
+
```
|
130
|
+
|
131
|
+
### Per-Instance
|
132
|
+
|
133
|
+
Override for specific calls:
|
134
|
+
|
135
|
+
```ruby
|
136
|
+
huge_dataset.async(max_fibers: 50).map { |item| process(item) }
|
137
|
+
```
|
138
|
+
|
139
|
+
### Default Value
|
140
|
+
|
141
|
+
The default maximum is 1024 fibers if not explicitly configured.
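The `configure { |c| ... }` style shown earlier is a common Ruby pattern. A minimal sketch of how a block-based configuration with a 1024 default could look, using a hypothetical `SketchConfigurable` module rather than the gem's own `Configurable`:

```ruby
# Hypothetical module; SketchConfigurable is not the gem's Configurable.
module SketchConfigurable
  Config = Struct.new(:max_fibers)

  def self.config
    @config ||= Config.new(1024) # default used when nothing is configured
  end

  def self.configure
    yield config
    config
  end
end

before = SketchConfigurable.config.max_fibers          # => 1024
SketchConfigurable.configure { |c| c.max_fibers = 100 }
after = SketchConfigurable.config.max_fibers           # => 100
```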
|
142
|
+
|
143
|
+
## Method Categories
|
144
|
+
|
145
|
+
When you include `Async::Enumerable`, you get:
|
146
|
+
|
147
|
+
### Predicate Methods (Early Termination)
|
148
|
+
- `all?`, `any?`, `none?`, `one?`
|
149
|
+
- `find`, `find_index`
|
150
|
+
- `include?`, `member?`
|
151
|
+
|
152
|
+
These methods stop processing as soon as the result is determined.
|
153
|
+
|
154
|
+
### Transformer Methods
|
155
|
+
- `map`, `select`, `reject`
|
156
|
+
- `filter_map`, `flat_map`
|
157
|
+
- `compact`, `uniq`
|
158
|
+
- `sort`, `sort_by`
|
159
|
+
|
160
|
+
These return new `Async::Enumerator` instances for chaining.
|
161
|
+
|
162
|
+
### Converter Methods
|
163
|
+
- `to_a` - Convert to array
|
164
|
+
- `sync` - Alias for `to_a`, semantically ends async chain
|
165
|
+
|
166
|
+
## Implementation Details
|
167
|
+
|
168
|
+
### Module Inclusion
|
169
|
+
|
170
|
+
When included, `Async::Enumerable` automatically:
|
171
|
+
1. Extends with `Configurable` for configuration management
|
172
|
+
2. Extends with `Configurable::ClassMethods` (provides `def_async_enumerable`)
|
173
|
+
3. Includes `Configurable` for instance-level configuration
|
174
|
+
4. Includes `Comparable` for comparison operators
|
175
|
+
5. Includes all async method implementations
|
176
|
+
6. Includes `AsyncMethod` module that provides the `async` method
|
177
|
+
|
178
|
+
### Source Resolution
|
179
|
+
|
180
|
+
The enumerable source is determined by:
|
181
|
+
1. If `def_async_enumerable` was called, uses that source
|
182
|
+
2. If source is an instance variable (starts with @), uses `instance_variable_get`
|
183
|
+
3. If source is a method name, calls that method
|
184
|
+
4. If no source defined, assumes self is enumerable
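The resolution steps above can be sketched in plain Ruby. This assumes the source reference is stored as a symbol (or `nil` when none was declared); `resolve_source` and `Inventory` are hypothetical names, and whether the gem calls the method publicly or privately is not stated in the docs.

```ruby
# Hypothetical resolver; resolve_source is an illustration-only name.
def resolve_source(object, source_ref)
  return object if source_ref.nil?       # no source declared: use the object itself

  name = source_ref.to_s
  if name.start_with?("@")
    object.instance_variable_get(name)   # :@items reads @items
  else
    object.public_send(name)             # :doubled calls object.doubled
  end
end

class Inventory
  def initialize
    @items = [1, 2, 3]
  end

  def doubled
    @items.map { |n| n * 2 }
  end
end

inventory = Inventory.new
resolve_source(inventory, :@items)  # => [1, 2, 3]
resolve_source(inventory, :doubled) # => [2, 4, 6]
resolve_source([7, 8], nil)         # => [7, 8]
```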
|
185
|
+
|
186
|
+
## Common Patterns
|
187
|
+
|
188
|
+
### Processing API Responses
|
189
|
+
|
190
|
+
```ruby
|
191
|
+
class ApiClient
|
192
|
+
include Async::Enumerable
|
193
|
+
def_async_enumerable :endpoints
|
194
|
+
|
195
|
+
def endpoints
|
196
|
+
["users", "posts", "comments"]
|
197
|
+
end
|
198
|
+
|
199
|
+
def fetch_all
|
200
|
+
async.map { |endpoint| fetch("/api/#{endpoint}") }
|
201
|
+
end
|
202
|
+
end
|
203
|
+
```
|
204
|
+
|
205
|
+
### Batch Processing
|
206
|
+
|
207
|
+
```ruby
|
208
|
+
class BatchProcessor
|
209
|
+
include Async::Enumerable
|
210
|
+
def_async_enumerable :@items, max_fibers: 10
|
211
|
+
|
212
|
+
def initialize(items)
|
213
|
+
@items = items
|
214
|
+
end
|
215
|
+
|
216
|
+
def process_all
|
217
|
+
async.map { |item| expensive_operation(item) }
|
218
|
+
end
|
219
|
+
end
|
220
|
+
```
|
221
|
+
|
222
|
+
### Custom Collections
|
223
|
+
|
224
|
+
```ruby
|
225
|
+
class ThreadSafeQueue
|
226
|
+
include Async::Enumerable
|
227
|
+
def_async_enumerable :snapshot
|
228
|
+
|
229
|
+
def initialize
|
230
|
+
@mutex = Mutex.new
|
231
|
+
@items = []
|
232
|
+
end
|
233
|
+
|
234
|
+
def snapshot
|
235
|
+
@mutex.synchronize { @items.dup }
|
236
|
+
end
|
237
|
+
|
238
|
+
def add(item)
|
239
|
+
@mutex.synchronize { @items << item }
|
240
|
+
end
|
241
|
+
end
|
242
|
+
```
|
243
|
+
|
244
|
+
## Performance Considerations
|
245
|
+
|
246
|
+
- Async processing has overhead, so it pays off mainly for I/O-bound operations; fibers are scheduled cooperatively on one thread, so CPU-intensive blocks gain little
|
247
|
+
- Small collections with simple operations may be slower async
|
248
|
+
- Early-termination methods (`any?`, `find`, and similar) can skip the remaining work once the result is known
|
249
|
+
- Fiber limits prevent resource exhaustion
|
250
|
+
|
251
|
+
## Thread Safety
|
252
|
+
|
253
|
+
All async operations use thread-safe atomic variables from concurrent-ruby:
|
254
|
+
- `Concurrent::AtomicBoolean` for boolean flags
|
255
|
+
- `Concurrent::AtomicFixnum` for counters
|
256
|
+
- `Concurrent::AtomicReference` for object references
|
257
|
+
|
258
|
+
This ensures correct behavior even with concurrent fiber execution.
|
@@ -0,0 +1,221 @@
|
|
1
|
+
# Async::Enumerator
|
2
|
+
|
3
|
+
The `Async::Enumerator` class is a wrapper that provides asynchronous implementations of Enumerable methods for parallel execution.
|
4
|
+
|
5
|
+
## Overview
|
6
|
+
|
7
|
+
This class wraps any Enumerable object and provides async versions of standard enumerable methods. It includes the standard Enumerable module for compatibility, as well as specialized async implementations through the Async::Enumerable module.
|
8
|
+
|
9
|
+
The Enumerator maintains a reference to the original enumerable and delegates method calls while providing concurrent execution capabilities through the async runtime.
|
10
|
+
|
11
|
+
## Creating an Async::Enumerator
|
12
|
+
|
13
|
+
### Direct Instantiation
|
14
|
+
|
15
|
+
```ruby
|
16
|
+
async_enum = Async::Enumerator.new([1, 2, 3, 4, 5])
|
17
|
+
async_enum.map { |n| n * 2 } # Executes in parallel
|
18
|
+
```
|
19
|
+
|
20
|
+
### Using Enumerable#async (Preferred)
|
21
|
+
|
22
|
+
The preferred way to create an Async::Enumerator:
|
23
|
+
|
24
|
+
```ruby
|
25
|
+
result = [1, 2, 3].async.map { |n| slow_operation(n) }
|
26
|
+
```
|
27
|
+
|
28
|
+
### With Configuration
|
29
|
+
|
30
|
+
```ruby
|
31
|
+
# With custom fiber limit
|
32
|
+
huge_dataset.async(max_fibers: 100).map { |n| process(n) }
|
33
|
+
|
34
|
+
# With config object
|
35
|
+
config = Async::Enumerable::Configurable::Config.new(max_fibers: 50)
|
36
|
+
enumerator = Async::Enumerator.new(data, config)
|
37
|
+
```
|
38
|
+
|
39
|
+
## Initialization
|
40
|
+
|
41
|
+
### Parameters
|
42
|
+
|
43
|
+
- `enumerable` (Enumerable): Any object that includes Enumerable
|
44
|
+
- `config` (Config, nil): Configuration object containing settings like max_fibers
|
45
|
+
- `**kwargs`: Configuration options passed as keyword arguments (e.g., max_fibers: 100)
|
46
|
+
|
47
|
+
### Examples
|
48
|
+
|
49
|
+
```ruby
|
50
|
+
# Default configuration
|
51
|
+
async_array = Async::Enumerator.new([1, 2, 3])
|
52
|
+
|
53
|
+
# Custom fiber limit via kwargs
|
54
|
+
async_range = Async::Enumerator.new(1..100, max_fibers: 50)
|
55
|
+
|
56
|
+
# With config object
|
57
|
+
config = Async::Enumerable::Configurable::Config.new(max_fibers: 100)
|
58
|
+
async_enum = Async::Enumerator.new(data, config)
|
59
|
+
|
60
|
+
# Override config with kwargs
|
61
|
+
async_enum = Async::Enumerator.new(data, config, max_fibers: 200)
|
62
|
+
```
|
63
|
+
|
64
|
+
## Core Methods
|
65
|
+
|
66
|
+
### each
|
67
|
+
|
68
|
+
Asynchronously iterates over each element in the enumerable, executing the given block in parallel for each item.
|
69
|
+
|
70
|
+
This method spawns async tasks for each item in the enumerable, allowing them to execute concurrently. It uses an `Async::Barrier` to coordinate the tasks and waits for all of them to complete before returning.
|
71
|
+
|
72
|
+
When called without a block, returns an Enumerator for compatibility with the standard Enumerable interface.
|
73
|
+
|
74
|
+
#### Parameters
|
75
|
+
- `&block`: Block to execute for each element in parallel
|
76
|
+
|
77
|
+
#### Returns
|
78
|
+
- With block: Returns self (for chaining)
|
79
|
+
- Without block: Returns an Enumerator
|
80
|
+
|
81
|
+
#### Examples
|
82
|
+
|
83
|
+
```ruby
|
84
|
+
# Basic async iteration
|
85
|
+
[1, 2, 3].async.each do |n|
|
86
|
+
puts "Processing #{n}"
|
87
|
+
sleep(1) # All three will complete in ~1 second total
|
88
|
+
end
|
89
|
+
|
90
|
+
# With I/O operations
|
91
|
+
urls.async.each do |url|
|
92
|
+
response = HTTP.get(url)
|
93
|
+
save_to_cache(url, response)
|
94
|
+
end
|
95
|
+
# All URLs are fetched and cached concurrently
|
96
|
+
|
97
|
+
# Chaining
|
98
|
+
data.async
|
99
|
+
.each { |item| log(item) }
|
100
|
+
.map { |item| transform(item) }
|
101
|
+
```
|
102
|
+
|
103
|
+
#### Important Notes
|
104
|
+
- The execution order of the block is not guaranteed to match the order of items in the enumerable due to parallel execution
|
105
|
+
- All tasks complete before the method returns
|
106
|
+
- Returns self to allow chaining, like standard each
|
107
|
+
|
108
|
+
### Comparison Methods
|
109
|
+
|
110
|
+
#### <=>
|
111
|
+
|
112
|
+
Compares this Async::Enumerator with another object. Converts both to arrays for comparison.
|
113
|
+
|
114
|
+
```ruby
|
115
|
+
async_enum = [1, 2, 3].async
|
116
|
+
async_enum <=> [1, 2, 3] # => 0
|
117
|
+
async_enum <=> [1, 2, 4] # => -1
|
118
|
+
```
|
119
|
+
|
120
|
+
#### ==
|
121
|
+
|
122
|
+
Checks equality with another object. Converts both to arrays for comparison.
|
123
|
+
|
124
|
+
```ruby
|
125
|
+
result = [1, 2, 3].async.map { |x| x * 2 }
|
126
|
+
result == [2, 4, 6] # => true
|
127
|
+
```
|
128
|
+
|
129
|
+
Also available as `eql?`.
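Array-based comparison of this kind can be sketched with Ruby's `Comparable`. `ComparableSketch` is a hypothetical class that only illustrates the "convert both sides to arrays" approach described above; it is not the gem's code.

```ruby
# Hypothetical wrapper; ComparableSketch is an illustration-only name.
class ComparableSketch
  include Enumerable
  include Comparable

  def initialize(source)
    @source = source
  end

  def each(&block)
    @source.each(&block)
  end

  # Compare element-wise by materializing both sides as arrays.
  def <=>(other)
    return nil unless other.respond_to?(:to_a)

    to_a <=> other.to_a
  end
end

wrapped = ComparableSketch.new([1, 2, 3])
wrapped <=> [1, 2, 3] # => 0
wrapped <=> [1, 2, 4] # => -1
wrapped == [1, 2, 3]  # => true
```

Here `==` comes from `Comparable`, which treats a `<=>` result of `0` as equality.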
|
130
|
+
|
131
|
+
## Delegated Methods
|
132
|
+
|
133
|
+
The following methods are inherently sequential and are delegated back to the wrapped enumerable for efficiency:
|
134
|
+
|
135
|
+
- `first` - Returns the first element(s)
|
136
|
+
- `take` - Takes the first n elements
|
137
|
+
- `take_while` - Takes elements while condition is true
|
138
|
+
- `lazy` - Returns a lazy enumerator
|
139
|
+
- `size` - Returns the size of the enumerable
|
140
|
+
- `length` - Alias for size
|
141
|
+
|
142
|
+
These methods bypass async processing since they don't benefit from parallelization.
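Delegation like this is often written with Ruby's standard `Forwardable` module. The sketch below assumes that approach purely for illustration; `DelegatingSketch` is a hypothetical class, and the docs do not say how the gem wires its delegation.

```ruby
require "forwardable"

# Hypothetical wrapper; DelegatingSketch is an illustration-only name.
class DelegatingSketch
  extend Forwardable

  # Sequential methods go straight to the wrapped enumerable.
  def_delegators :@source, :first, :take, :take_while, :lazy, :size

  def initialize(source)
    @source = source
  end

  def length
    size
  end
end

wrapped = DelegatingSketch.new([5, 6, 7, 8])
wrapped.first                    # => 5
wrapped.take(2)                  # => [5, 6]
wrapped.take_while { |n| n < 7 } # => [5, 6]
wrapped.length                   # => 4
```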
|
143
|
+
|
144
|
+
## Method Chaining
|
145
|
+
|
146
|
+
Async::Enumerator supports full method chaining:
|
147
|
+
|
148
|
+
```ruby
|
149
|
+
result = [1, 2, 3, 4, 5].async
|
150
|
+
.map { |n| fetch_data(n) } # Parallel fetch
|
151
|
+
.select { |data| data.valid? } # Filter results
|
152
|
+
.map { |data| process(data) } # Transform data
|
153
|
+
.sync # Materialize results
|
154
|
+
```
|
155
|
+
|
156
|
+
## Fiber Limits
|
157
|
+
|
158
|
+
The `max_fibers` parameter controls concurrency:
|
159
|
+
|
160
|
+
```ruby
|
161
|
+
# Default limit (from Async::Enumerable.config.max_fibers)
|
162
|
+
data.async.map { |x| process(x) }
|
163
|
+
|
164
|
+
# Custom limit for this instance
|
165
|
+
data.async(max_fibers: 10).map { |x| process(x) }
|
166
|
+
|
167
|
+
# Limit is preserved through chaining
|
168
|
+
enum = data.async(max_fibers: 5)
|
169
|
+
enum.map { |x| x * 2 }.select { |x| x > 10 }
|
170
|
+
# Both map and select respect the 5 fiber limit
|
171
|
+
```
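One plausible way a limit could survive chaining is for each transformer to build its result wrapper with the same setting. The sketch below models only that hand-off, sequentially; `ChainSketch` is a hypothetical class and does not run anything concurrently.

```ruby
# Hypothetical wrapper; ChainSketch only models how a limit could be carried
# along a chain. It does not run anything concurrently.
class ChainSketch
  attr_reader :max_fibers

  def initialize(source, max_fibers: 1024)
    @source = source
    @max_fibers = max_fibers
  end

  # Each transformer wraps its result with the receiver's limit.
  def map(&block)
    self.class.new(@source.map(&block), max_fibers: max_fibers)
  end

  def select(&block)
    self.class.new(@source.select(&block), max_fibers: max_fibers)
  end

  def to_a
    @source.to_a
  end
end

chain = ChainSketch.new(1..10, max_fibers: 5)
result = chain.map { |x| x * 2 }.select { |x| x > 10 }
result.max_fibers # => 5
result.to_a       # => [12, 14, 16, 18, 20]
```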
|
172
|
+
|
173
|
+
## Integration with Async Runtime
|
174
|
+
|
175
|
+
Async::Enumerator requires the async runtime to be available. Operations are automatically wrapped in `Sync` blocks when needed:
|
176
|
+
|
177
|
+
```ruby
|
178
|
+
# Automatically wrapped in Sync block
|
179
|
+
result = [1, 2, 3].async.map { |n| n * 2 }
|
180
|
+
|
181
|
+
# Explicit async context
|
182
|
+
Async do
|
183
|
+
result = urls.async.map { |url| fetch(url) }
|
184
|
+
process_results(result)
|
185
|
+
end
|
186
|
+
```
|
187
|
+
|
188
|
+
## Performance Considerations
|
189
|
+
|
190
|
+
- Async processing has overhead, so use it mainly for I/O-bound operations; fibers are scheduled cooperatively on one thread, so CPU-intensive blocks gain little
|
191
|
+
- For simple transformations on small collections, synchronous processing may be faster
|
192
|
+
- Fiber limits prevent resource exhaustion with large collections
|
193
|
+
- Early-termination methods (`any?`, `all?`, `find`) stop processing as soon as possible
|
194
|
+
|
195
|
+
## Common Patterns
|
196
|
+
|
197
|
+
### Parallel API Calls
|
198
|
+
```ruby
|
199
|
+
user_ids.async.map { |id| fetch_user(id) }
|
200
|
+
```
|
201
|
+
|
202
|
+
### Concurrent File Processing
|
203
|
+
```ruby
|
204
|
+
files.async.each { |file| process_file(file) }
|
205
|
+
```
|
206
|
+
|
207
|
+
### Batch Processing with Limits
|
208
|
+
```ruby
|
209
|
+
huge_dataset.async(max_fibers: 100).map { |item|
|
210
|
+
expensive_operation(item)
|
211
|
+
}
|
212
|
+
```
|
213
|
+
|
214
|
+
### Pipeline Processing
|
215
|
+
```ruby
|
216
|
+
raw_data.async
|
217
|
+
.map { |d| parse(d) }
|
218
|
+
.select { |d| validate(d) }
|
219
|
+
.map { |d| transform(d) }
|
220
|
+
.sync
|
221
|
+
```
|