enumerable-lazy 0.0.1
- data/README.mkd +209 -0
- data/Rakefile +2 -0
- data/enumerable-lazy.gemspec +18 -0
- data/examples/13_friday.rb +11 -0
- data/examples/fizzbuzz.rb +8 -0
- data/lib/enumerable/lazy.rb +187 -0
- metadata +53 -0
data/README.mkd
ADDED
@@ -0,0 +1,209 @@
|
|
1
|
+
enumerable-lazy
|
2
|
+
===============
|
3
|
+
|
4
|
+
This is a sample implementation of Enumerable#lazy
|
5
|
+
(see #4890 - http://redmine.ruby-lang.org/issues/4890 ).
|
6
|
+
|
7
|
+
Enumerable#lazy returns an instance of Enumerable::Lazy,
|
8
|
+
which is like Enumerator but has 'lazy' version of .map,
|
9
|
+
.select, etc.
|
10
|
+
|
11
|
+
The 'lazy' versions of map and select never return an array.
|
12
|
+
Instead, they yield the transformed/filtered elements one by one.
|
13
|
+
So you can use them for -
|
14
|
+
|
15
|
+
* huge data which cannot be processed at a time,
|
16
|
+
* data stream which you want to process in real-time,
|
17
|
+
* or an infinite list, which has no end.
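The one-by-one behaviour described above can be observed directly. The following is a minimal sketch, assuming Ruby 2.0 or later, where `Enumerable#lazy` is built in and returns an `Enumerator::Lazy` (rather than this gem's `Enumerable::Lazy`). The `log` array exists only for illustration.

```ruby
# Assumption: Ruby >= 2.0 with the built-in Enumerable#lazy.
log = []

result = (1..Float::INFINITY).lazy
  .map    { |n| log << "map #{n}";    n * 2 }
  .select { |m| log << "select #{m}"; m % 3 == 0 }
  .first(2)   # forces evaluation and stops after two matches

p result        # the first two doubled values divisible by 3
p log.first(4)  # map and select calls are interleaved per element
```

Each element passes through the whole chain before the next one is produced, so the infinite range is never materialized.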
|
18
|
+
|
19
|
+
Looking from another view, Enumerable#lazy (and its lazy
|
20
|
+
version of map and select) provides a way to transform or filter
|
21
|
+
|
22
|
+
|
23
|
+
Requirements
|
24
|
+
------------
|
25
|
+
|
26
|
+
Ruby 1.9.x (for now; it should not be so hard to support 1.8)
|
27
|
+
|
28
|
+
Example
|
29
|
+
-------
|
30
|
+
|
31
|
+
This code prints the first 100 primes of the form n^2 + 1.
|
32
|
+
|
33
|
+
require 'prime'
|
34
|
+
INFINITY = 1.0 / 0
|
35
|
+
|
36
|
+
p (1..INFINITY).lazy.map{|n| n**2+1}.
|
37
|
+
select{|m| m.prime?}.
|
38
|
+
take(100)
|
39
|
+
|
40
|
+
See examples/ for more.
|
41
|
+
|
42
|
+
When do I need Enumerable#lazy?
|
43
|
+
-------------------------------
|
44
|
+
|
45
|
+
### mapping/selecting huge files
|
46
|
+
|
47
|
+
Suppose you want to print the first 10 words of a text file:
|
48
|
+
|
49
|
+
File.open(ARGV[0]){|f|
|
50
|
+
puts f.lines.flat_map{|l| l.split}.take(10)
|
51
|
+
}
|
52
|
+
|
53
|
+
Thanks to flat_map, this task is done by really simple code. But this example
|
54
|
+
has a problem: when the file is extremely large, flat_map reads the entire
|
55
|
+
file, even though only a few lines are really needed.
|
56
|
+
|
57
|
+
Adding '.lazy' resolves this problem:
|
58
|
+
|
59
|
+
require 'enumerable/lazy'
|
60
|
+
|
61
|
+
File.open(ARGV[0]){|f|
|
62
|
+
puts f.lines.lazy.flat_map{|l| l.split}.take(10)
|
63
|
+
# ~~~~
|
64
|
+
}
|
65
|
+
|
66
|
+
The 'lazy' version of flat_map yields the result one by one, instead of
|
67
|
+
creating the entire array at first.
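To make the difference measurable, here is a small sketch that counts how many lines are actually pulled from the input. It assumes Ruby 2.0 or later (built-in `lazy`), uses `StringIO` as a stand-in for a large file, and calls `each_line` instead of `lines`, since `IO#lines` was deprecated and later removed in newer Ruby versions. The `lines_read` counter is illustrative only.

```ruby
require 'stringio'

# Assumption: Ruby >= 2.0 (built-in lazy); StringIO stands in for a large file.
io = StringIO.new("one two three\nfour five six\nseven eight nine\n" * 1000)

lines_read = 0
words = io.each_line.lazy
  .map      { |line| lines_read += 1; line }  # count lines actually pulled
  .flat_map { |line| line.split }
  .first(5)

p words
p lines_read  # far fewer than the 3000 lines available
```

Dropping `.lazy` from the chain would make `map` read every line before `flat_map` even starts.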
|
68
|
+
|
69
|
+
Enumerable#lazy is also useful when all the lines are important.
|
70
|
+
The following code counts the number of words in a (possibly big) file.
|
71
|
+
|
72
|
+
File.open(ARGV[0]){|f|
|
73
|
+
p f.lines.lazy.flat_map{|l| l.split}.count
|
74
|
+
}
|
75
|
+
|
76
|
+
Without '.lazy', flat_map loads the entire file into memory.
|
77
|
+
|
78
|
+
This example prints the length of the longest line of the file.
|
79
|
+
With '.lazy', it consumes far less memory than using normal map.
|
80
|
+
|
81
|
+
File.open(ARGV[0]){|f|
|
82
|
+
p f.lines.lazy.map{|l| l.size}.max
|
83
|
+
}
|
84
|
+
|
85
|
+
### mapping/selecting infinite lists
|
86
|
+
|
87
|
+
Another fun use of lazy evaluation is manipulating infinite lists.
|
88
|
+
With Enumerable#lazy, you can apply map(=transform) or select(=filter) to them.
|
89
|
+
|
90
|
+
This is the example shown above. Without '.lazy', map tries to
|
91
|
+
create an array of infinite length.
|
92
|
+
|
93
|
+
require 'prime'
|
94
|
+
INFINITY = 1.0 / 0
|
95
|
+
|
96
|
+
p (1..INFINITY)
|
97
|
+
.lazy
|
98
|
+
.map{|n| n**2+1}.
|
99
|
+
select{|m| m.prime?}.
|
100
|
+
take(100)
|
101
|
+
|
102
|
+
Another example is finding the first ten 'Fridays the 13th'
|
103
|
+
in an infinite list of dates starting from 1 Jan 2011.
|
104
|
+
|
105
|
+
require 'date'
|
106
|
+
|
107
|
+
puts (Date.new(2011)..Date.new(9999))
|
108
|
+
.lazy
|
109
|
+
.select{|d| d.day == 13 and d.friday?}
|
110
|
+
.take(10)
|
111
|
+
|
112
|
+
This program could otherwise be written with each and a counter variable:
|
113
|
+
|
114
|
+
require 'date'
|
115
|
+
|
116
|
+
n = 0
|
117
|
+
(Date.new(2011)..Date.new(9999)).each do |d|
|
118
|
+
if d.day == 13 and d.friday?
|
119
|
+
puts d
|
120
|
+
n += 1
|
121
|
+
exit if n >= 10
|
122
|
+
end
|
123
|
+
end
|
124
|
+
|
125
|
+
Mapping/selecting infinite lists provides another (and often simpler
|
126
|
+
and cleaner) algorithm to solve the same problem.
|
127
|
+
|
128
|
+
### mapping/selecting for streams
|
129
|
+
|
130
|
+
lazy map and lazy select can be regarded as transformation and
|
131
|
+
filtering for streams.
|
132
|
+
|
133
|
+
So Enumerable#lazy can be used to transform/filter a data stream, such as
|
134
|
+
data coming from network.
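As a rough illustration of the stream case, the sketch below filters and transforms messages as they arrive. It assumes Ruby 2.0 or later (built-in `lazy`). The `incoming` enumerator is a hypothetical stand-in for a network source, and `produced` only records how many messages the source had to emit.

```ruby
# Assumption: Ruby >= 2.0 (built-in lazy). `incoming` is a hypothetical
# stand-in for a network source; `produced` counts emitted messages.
produced = 0
incoming = Enumerator.new do |y|
  ["INFO start", "ERROR disk full", "INFO tick",
   "ERROR timeout", "INFO never needed"].each do |msg|
    produced += 1
    y << msg
  end
end

errors = incoming.lazy
  .select { |msg| msg.start_with?("ERROR") }  # filter
  .map    { |msg| msg.sub("ERROR ", "") }     # transform

first_two = errors.first(2)
p first_two
p produced  # the source stops being read once two errors are found
```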
|
135
|
+
|
136
|
+
Methods
|
137
|
+
-------
|
138
|
+
|
139
|
+
Enumerable::Lazy is a subclass of Enumerator. It overrides
|
140
|
+
the following methods with 'lazy' versions.
|
141
|
+
|
142
|
+
* Methods which transform the elements:
|
143
|
+
* map(collect)
|
144
|
+
* flat_map(collect_concat)
|
145
|
+
* zip
|
146
|
+
* Methods which filter the elements:
|
147
|
+
* select(find_all)
|
148
|
+
* grep
|
149
|
+
* take_while
|
150
|
+
* reject
|
151
|
+
* drop
|
152
|
+
* drop_while
|
153
|
+
|
154
|
+
These lazy methods return an instance of Enumerable::Lazy,
|
155
|
+
whereas the 'normal' versions return an Array (and go into an infinite loop on infinite input).
|
156
|
+
It means that calling these methods does not generate any result.
|
157
|
+
Actual values are generated when you call the methods like these
|
158
|
+
on Enumerable::Lazy.
|
159
|
+
|
160
|
+
* take
|
161
|
+
* first
|
162
|
+
* find(detect)
|
163
|
+
|
164
|
+
Example:
|
165
|
+
|
166
|
+
irb> ary = (1..100).to_a
|
167
|
+
irb> _.lazy
|
168
|
+
=> #<Enumerable::Lazy: #<Enumerator::Generator:0x000001010f3988>:each>
|
169
|
+
irb> _.map{|n| n*n}
|
170
|
+
=> #<Enumerable::Lazy: #<Enumerator::Generator:0x000001010b5b38>:each>
|
171
|
+
irb> _.select{|n| n.to_s[-1] == "9"}
|
172
|
+
=> #<Enumerable::Lazy: #<Enumerator::Generator:0x000001010848f8>:each>
|
173
|
+
irb> _.take(3)
|
174
|
+
=> [9, 49, 169]
|
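The deferred evaluation shown in the irb session can also be checked with a counter. This is a sketch under the assumption of Ruby 2.0 or later, where the built-in class is `Enumerator::Lazy` and `first` forces evaluation. The `calls` counter is illustrative.

```ruby
# Assumption: Ruby >= 2.0 with the built-in Enumerable#lazy.
calls = 0

squares = (1..100).lazy.map { |n| calls += 1; n * n }
nines   = squares.select { |n| n.to_s[-1] == "9" }
calls_before = calls           # building the chain has run no block yet

first_three = nines.first(3)   # values are generated only here

p nines.class
p calls_before
p first_three
p calls                        # only as many squares as were needed
```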
175
|
+
|
176
|
+
### Methods not redefined
|
177
|
+
|
178
|
+
(FYI) Enumerable::Lazy does not override these methods:
|
179
|
+
|
180
|
+
* chunk cycle slice_before each_(cons|entry|slice|with_index|with_object)
|
181
|
+
* They return an Enumerator. In other words, they are already lazy.
|
182
|
+
|
183
|
+
* any? include? take detect find_index first
|
184
|
+
* They never cause an infinite loop, because they need only a finite number of elements
|
185
|
+
|
186
|
+
* inject count all? none? one? max(_by) min(_by) minmax(_by)
|
187
|
+
* They need all elements to determine the return value, and they are smart enough not to store all the elements in memory.
|
188
|
+
|
189
|
+
* entries group_by partition sort(_by)
|
190
|
+
* The size of their return value is linear in the input size.
|
191
|
+
|
192
|
+
* reverse_each
|
193
|
+
* It needs the last element first.
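A few of the categories above can be tried on plain Ruby objects without any lazy wrapper. The following sketch only uses standard `Enumerable` methods. The variable names are illustrative.

```ruby
naturals = (1..Float::INFINITY)

# each_slice without a block returns an Enumerator, so it is already lazy.
slices = naturals.each_slice(3)
p slices.class
p slices.first(2)

# any? and find_index stop as soon as the answer is known.
found = naturals.any? { |n| n > 10 }
index = naturals.find_index { |n| n * n > 50 }
p found
p index

# max must visit every element, but keeps only the current maximum.
largest = (1..100_000).lazy.map { |n| n % 1000 }.max
p largest
```

The last line assumes Ruby 2.0 or later for the built-in `lazy`. With this gem on Ruby 1.9, `require 'enumerable/lazy'` would be needed first.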
|
194
|
+
|
195
|
+
Implementation note
|
196
|
+
===================
|
197
|
+
|
198
|
+
* Current implementation is not built for speed.
|
199
|
+
|
200
|
+
* zip does not need to be lazy if one of the arrays is finite
|
201
|
+
(because zip terminates execution when an end is found),
|
202
|
+
but it must be lazy for the case where all the arrays are infinite.
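The all-infinite case can be sketched with external iterators. `lazy_zip` below is a hypothetical helper written for this note, not the gem's implementation. It assumes Ruby 2.0 or later for `Enumerator#lazy`.

```ruby
# Hypothetical helper, not part of this gem: zips any number of
# (possibly infinite) enumerables without building an intermediate array.
def lazy_zip(first, *others)
  Enumerator.new do |yielder|
    iters = others.map(&:to_enum)   # fresh external iterators
    begin
      first.each { |x| yielder << [x, *iters.map(&:next)] }
    rescue StopIteration
      # one of the other inputs ran out: stop quietly
    end
  end.lazy
end

pairs = lazy_zip([1, 2, 3].cycle, [:a, :b].cycle).first(4)
p pairs
```

Unlike the standard `zip`, which pads shorter arguments with nil, this sketch simply stops when any input is exhausted.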
|
203
|
+
|
204
|
+
Contact
|
205
|
+
=======
|
206
|
+
|
207
|
+
http://github.com/yhara/enumerable-lazy
|
208
|
+
|
209
|
+
http://twitter.com/yhara_en
|
data/enumerable-lazy.gemspec
ADDED
@@ -0,0 +1,18 @@
|
|
1
|
+
# -*- encoding: utf-8 -*-
|
2
|
+
$:.push File.expand_path("../lib", __FILE__)
|
3
|
+
|
4
|
+
Gem::Specification.new do |s|
|
5
|
+
s.name = "enumerable-lazy"
|
6
|
+
s.version = "0.0.1"
|
7
|
+
s.platform = Gem::Platform::RUBY
|
8
|
+
s.authors = ["Yutaka HARA"]
|
9
|
+
s.email = ["yutaka.hara.gmail.com"]
|
10
|
+
s.homepage = "https://github.com/yhara/enumerable-lazy"
|
11
|
+
s.summary = %q{provides `lazy' version of map, select, etc. by `.lazy.map'}
|
12
|
+
s.description = s.summary
|
13
|
+
|
14
|
+
s.files = `git ls-files`.split("\n")
|
15
|
+
#s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
|
16
|
+
#s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
|
17
|
+
s.require_paths = ["lib"]
|
18
|
+
end
|
data/lib/enumerable/lazy.rb
ADDED
@@ -0,0 +1,187 @@
|
|
1
|
+
# = Enumerable#lazy example implementation
|
2
|
+
#
|
3
|
+
# Enumerable#lazy returns an instance of Enumerable::Lazy.
|
4
|
+
# You can use it just like a normal Enumerable object,
|
5
|
+
# except these methods act as 'lazy':
|
6
|
+
#
|
7
|
+
# - map collect
|
8
|
+
# - select find_all
|
9
|
+
# - reject
|
10
|
+
# - grep
|
11
|
+
# - drop
|
12
|
+
# - drop_while
|
13
|
+
# - take_while
|
14
|
+
# - flat_map collect_concat
|
15
|
+
# - zip
|
16
|
+
#
|
17
|
+
# == Example
|
18
|
+
#
|
19
|
+
# This code prints the first 100 primes.
|
20
|
+
#
|
21
|
+
# require 'prime'
|
22
|
+
# INFINITY = 1.0 / 0
|
23
|
+
# p (1..INFINITY).lazy.select{|m| m.prime?}.take(100)
|
24
|
+
#
|
25
|
+
# == Acknowledgements
|
26
|
+
#
|
27
|
+
# Inspired by https://github.com/antimon2/enumerable_lz
|
28
|
+
# http://jp.rubyist.net/magazine/?0034-Enumerable_lz (ja)
|
29
|
+
|
30
|
+
module Enumerable
|
31
|
+
def lazy
|
32
|
+
Lazy.new(self)
|
33
|
+
end
|
34
|
+
|
35
|
+
class Lazy < Enumerator
|
36
|
+
def initialize(obj, &block)
|
37
|
+
super(){|yielder|
|
38
|
+
begin
|
39
|
+
obj.each{|x|
|
40
|
+
if block
|
41
|
+
block.call(yielder, x)
|
42
|
+
else
|
43
|
+
yielder << x
|
44
|
+
end
|
45
|
+
}
|
46
|
+
rescue StopIteration
|
47
|
+
end
|
48
|
+
}
|
49
|
+
end
|
50
|
+
|
51
|
+
def map(&block)
|
52
|
+
Lazy.new(self){|yielder, val|
|
53
|
+
yielder << block.call(val)
|
54
|
+
}
|
55
|
+
end
|
56
|
+
alias collect map
|
57
|
+
|
58
|
+
def select(&block)
|
59
|
+
Lazy.new(self){|yielder, val|
|
60
|
+
if block.call(val)
|
61
|
+
yielder << val
|
62
|
+
end
|
63
|
+
}
|
64
|
+
end
|
65
|
+
alias find_all select
|
66
|
+
|
67
|
+
def reject(&block)
|
68
|
+
Lazy.new(self){|yielder, val|
|
69
|
+
if not block.call(val)
|
70
|
+
yielder << val
|
71
|
+
end
|
72
|
+
}
|
73
|
+
end
|
74
|
+
|
75
|
+
def grep(pattern)
|
76
|
+
Lazy.new(self){|yielder, val|
|
77
|
+
if pattern === val
|
78
|
+
yielder << val
|
79
|
+
end
|
80
|
+
}
|
81
|
+
end
|
82
|
+
|
83
|
+
def drop(n)
|
84
|
+
dropped = 0
|
85
|
+
Lazy.new(self){|yielder, val|
|
86
|
+
if dropped < n
|
87
|
+
dropped += 1
|
88
|
+
else
|
89
|
+
yielder << val
|
90
|
+
end
|
91
|
+
}
|
92
|
+
end
|
93
|
+
|
94
|
+
def drop_while(&block)
|
95
|
+
dropping = true
|
96
|
+
Lazy.new(self){|yielder, val|
|
97
|
+
if dropping
|
98
|
+
if not block.call(val)
|
99
|
+
yielder << val
|
100
|
+
dropping = false
|
101
|
+
end
|
102
|
+
else
|
103
|
+
yielder << val
|
104
|
+
end
|
105
|
+
}
|
106
|
+
end
|
107
|
+
|
108
|
+
def take(n)
|
109
|
+
taken = 0
|
110
|
+
Lazy.new(self){|yielder, val|
|
111
|
+
if taken < n
|
112
|
+
yielder << val
|
113
|
+
taken += 1
|
114
|
+
else
|
115
|
+
raise StopIteration
|
116
|
+
end
|
117
|
+
}
|
118
|
+
end
|
119
|
+
|
120
|
+
def take_while(&block)
|
121
|
+
Lazy.new(self){|yielder, val|
|
122
|
+
if block.call(val)
|
123
|
+
yielder << val
|
124
|
+
else
|
125
|
+
raise StopIteration
|
126
|
+
end
|
127
|
+
}
|
128
|
+
end
|
129
|
+
|
130
|
+
def flat_map(&block)
|
131
|
+
Lazy.new(self){|yielder, val|
|
132
|
+
ary = block.call(val)
|
133
|
+
# TODO: check ary is an Array
|
134
|
+
ary.each{|x|
|
135
|
+
yielder << x
|
136
|
+
}
|
137
|
+
}
|
138
|
+
end
|
139
|
+
alias collect_concat flat_map
|
140
|
+
|
141
|
+
def zip(*args, &block)
|
142
|
+
enums = [self] + args
|
143
|
+
Lazy.new(self){|yielder, val|
|
144
|
+
ary = enums.map{|e| e.next}
|
145
|
+
if block
|
146
|
+
yielder << block.call(ary)
|
147
|
+
else
|
148
|
+
yielder << ary
|
149
|
+
end
|
150
|
+
}
|
151
|
+
end
|
152
|
+
|
153
|
+
# def chunk
|
154
|
+
# def slice_before
|
155
|
+
#
|
156
|
+
# These methods are already implemented with Enumerator.
|
157
|
+
|
158
|
+
end
|
159
|
+
end
|
160
|
+
|
161
|
+
# Example
|
162
|
+
|
163
|
+
# -- Print the first 100 primes
|
164
|
+
#require 'prime'
|
165
|
+
#p (1..1.0/0).lazy.select{|m|m.prime?}.first(100)
|
166
|
+
|
167
|
+
#p (1..1.0/0).lazy.find{|n| n*n*Math::PI>10000}
|
168
|
+
|
169
|
+
# -- Print the first 10 words from a text file
|
170
|
+
#File.open("english.txt"){|f|
|
171
|
+
# p f.lines.lazy.flat_map{|line| line.split}.take(10)
|
172
|
+
#}
|
173
|
+
|
174
|
+
# -- Example of cycle and zip
|
175
|
+
#e1 = [1, 2, 3].cycle
|
176
|
+
#e2 = [:a, :b].cycle
|
177
|
+
#p e1.lazy.zip(e2).take(10)
|
178
|
+
|
179
|
+
# -- Example of chunk and take_while
|
180
|
+
#p Enumerator.new{|y|
|
181
|
+
# loop do
|
182
|
+
# y << rand(100)
|
183
|
+
# end
|
184
|
+
#}.chunk{|n| n.even?}.
|
185
|
+
# lazy.map{|even, ns| ns}.
|
186
|
+
# take_while{|ns| ns.length <= 5}.to_a
|
187
|
+
|
metadata
ADDED
@@ -0,0 +1,53 @@
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
2
|
+
name: enumerable-lazy
|
3
|
+
version: !ruby/object:Gem::Version
|
4
|
+
version: 0.0.1
|
5
|
+
prerelease:
|
6
|
+
platform: ruby
|
7
|
+
authors:
|
8
|
+
- Yutaka HARA
|
9
|
+
autorequire:
|
10
|
+
bindir: bin
|
11
|
+
cert_chain: []
|
12
|
+
date: 2011-07-03 00:00:00.000000000 +09:00
|
13
|
+
default_executable:
|
14
|
+
dependencies: []
|
15
|
+
description: provides `lazy' version of map, select, etc. by `.lazy.map'
|
16
|
+
email:
|
17
|
+
- yutaka.hara.gmail.com
|
18
|
+
executables: []
|
19
|
+
extensions: []
|
20
|
+
extra_rdoc_files: []
|
21
|
+
files:
|
22
|
+
- README.mkd
|
23
|
+
- Rakefile
|
24
|
+
- enumerable-lazy.gemspec
|
25
|
+
- examples/13_friday.rb
|
26
|
+
- examples/fizzbuzz.rb
|
27
|
+
- lib/enumerable/lazy.rb
|
28
|
+
has_rdoc: true
|
29
|
+
homepage: https://github.com/yhara/enumerable-lazy
|
30
|
+
licenses: []
|
31
|
+
post_install_message:
|
32
|
+
rdoc_options: []
|
33
|
+
require_paths:
|
34
|
+
- lib
|
35
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
36
|
+
none: false
|
37
|
+
requirements:
|
38
|
+
- - ! '>='
|
39
|
+
- !ruby/object:Gem::Version
|
40
|
+
version: '0'
|
41
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
42
|
+
none: false
|
43
|
+
requirements:
|
44
|
+
- - ! '>='
|
45
|
+
- !ruby/object:Gem::Version
|
46
|
+
version: '0'
|
47
|
+
requirements: []
|
48
|
+
rubyforge_project:
|
49
|
+
rubygems_version: 1.6.2
|
50
|
+
signing_key:
|
51
|
+
specification_version: 3
|
52
|
+
summary: provides `lazy' version of map, select, etc. by `.lazy.map'
|
53
|
+
test_files: []
|