filter_io 0.1.1 → 0.1.2
- data/.gitignore +1 -0
- data/README.markdown +12 -14
- data/Rakefile +22 -0
- data/VERSION +1 -1
- data/lib/filter_io.rb +1 -1
- data/test/filter_io_test.rb +13 -1
- metadata +7 -3
data/.gitignore
CHANGED
data/README.markdown
CHANGED
@@ -1,7 +1,7 @@
 # `filter_io`
 ## Filter IO streams with a block. Ruby's FilterInputStream.
 
-`filter_io` is analogous to Java's `FilterIOStream` in that it allows you to intercept and process data in an IO stream. This is particularly useful when
+`filter_io` is analogous to Java's `FilterIOStream` in that it allows you to intercept and process data in an IO stream. This is particularly useful when cleaning up bad input data for a CSV or XML parser.
 
 `filter_io` provides a one-pass approach to filtering data which can be much faster and memory efficient than doing two passes (cleaning the source file into a buffer and then calling the original parser).
 
@@ -9,9 +9,9 @@
 
 ### Installation
 
-You can install
+You can install the gem by running:
 
-
+    gem install filter_io
 
 ### Example Usage
 
@@ -45,7 +45,7 @@ A common usage of `filter_io` is to normalise line endings before parsing CSV da
 
 ### Reference
 
-Call `FilterIO.new` with the original IO stream, any options and the filtering block. The returned object
+Call `FilterIO.new` with the original IO stream, any options and the filtering block. The returned `filter_io` object acts like a normal read-only forward-only IO stream.
 
 #### Block `state` parameter
 
@@ -54,7 +54,7 @@ An optional second parameter to the block is the `state` parameter which contain
 * `bof?`: Returns true if this is the *first* chuck of the stream.
 * `eof?`: Returns true if this is the *last* chunk of the stream.
 
-####
+#### Requesting Additional Data
 
 If the filtering block needs more data to be able to return anything, you can raise a `FilterIO::NeedMoreData` exception and `filter_io` will read another block and pass the additional data to you. This can be repeated as necessary until enough data is retrieved.
 
@@ -62,7 +62,7 @@ For example usage of `NeedMoreData`, see the line ending normalisation example a
 
 #### Re-buffering Unprocessed Data
 
-If your block is unable to process the whole chunk of data immediately, it can return both the processed chuck and the remainder to be processed later. This is done by returning a 2 element array: `[processed, unprocessed]`. If
+If your block is unable to process the whole chunk of data immediately, it can return both the processed chuck and the remainder to be processed later. This is done by returning a 2 element array: `[processed, unprocessed]`. If `processed` is empty and there is `unprocessed` data, `filter_io` will grab another block of data from the source stream and call the block again.
 
 Here's an example which processes whole lines and prepends the line length to the beginning of each line.
 
@@ -88,15 +88,13 @@ Ruby 1.9 has character encoding support can convert between UTF-8, ISO-8859-1, A
 As per the core `IO` object, if `read` is called with a length (in bytes), the data will be returned in the external encoding.
 In summary, everything should Just Work™
 
-###
+### Notes on Patches/Pull Requests
 
-
-
-
-
-
-(if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
-* Send me a pull request. Bonus points for topic branches.
+1. Fork the project.
+1. Make your feature addition or bug fix.
+1. Add tests for it. This is important so I don't break it in a future version unintentionally.
+1. Commit, do not mess with Rakefile, VERSION, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
+1. Send me a pull request. Bonus points for topic branches.
 
 ### Copyright
 
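The re-buffering rule added to the README above (return `[processed, unprocessed]`, and the remainder is held back and offered again together with the next block) can be modelled without the gem. The sketch below is a simplified stand-in and not `filter_io`'s implementation: `filter_stream` is a hypothetical helper, and it passes a plain boolean `eof` flag where the real block receives a `state` object.

```ruby
require 'stringio'

# Hypothetical helper modelling the [processed, unprocessed] contract.
# It reads fixed-size blocks, prepends any held-back remainder, and
# yields the combined data plus an end-of-stream flag.
def filter_stream(source, block_size)
  output = ''.dup
  leftover = ''
  loop do
    chunk = source.read(block_size)       # nil once the source is exhausted
    at_eof = chunk.nil? || source.eof?
    data = leftover + chunk.to_s
    break if data.empty?
    # A block may return a String or a [processed, unprocessed] pair.
    processed, unprocessed = Array(yield(data, at_eof))
    output << processed.to_s
    leftover = unprocessed.to_s           # re-buffered for the next pass
    break if at_eof
  end
  output
end

# Prefix every line with its length, processing only whole lines
# until the final block arrives.
result = filter_stream(StringIO.new("one\nthree\nfive"), 4) do |data, eof|
  cut = eof ? data.length : (data.rindex("\n") || -1) + 1
  complete = data[0...cut]
  rest = data[cut..-1]
  [complete.lines.map { |l| "#{l.chomp.length} #{l}" }.join, rest]
end

puts result
```

With a block size of 4, the second block (`"thre"`) contains no complete line, so the block returns an empty `processed` string and the whole chunk as `unprocessed`; the helper then reads more data and calls the block again, which is the case the new README sentence describes.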
data/Rakefile
CHANGED
@@ -30,3 +30,25 @@ begin
 rescue LoadError
   puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
 end
+
+begin
+  require 'rcov/rcovtask'
+  Rcov::RcovTask.new do |t|
+    t.libs << "test"
+    t.rcov_opts = [
+      "--exclude '^(?!lib)'"
+    ]
+    t.test_files = FileList[
+      'test/**/*_test.rb'
+    ]
+    t.output_dir = 'coverage'
+    t.verbose = true
+  end
+  task :rcov do
+    system "open coverage/index.html"
+  end
+rescue LoadError
+  task :rcov do
+    raise "You must install the 'rcov' gem"
+  end
+end
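The `--exclude '^(?!lib)'` option in the new task relies on a negative lookahead: the pattern matches any string that does not start with `lib`, so everything outside `lib/` is left out of the coverage report. A minimal sketch of that matching, assuming the pattern is compared against relative paths (the file list here is illustrative only):

```ruby
# The same pattern string as in the Rakefile, compiled as a Ruby regexp.
pattern = Regexp.new('^(?!lib)')

# Illustrative paths; the real list comes from whatever rcov instruments.
paths = %w[lib/filter_io.rb test/filter_io_test.rb Rakefile]

# Paths matching the pattern would be excluded from the report.
excluded, kept = paths.partition { |path| path =~ pattern }

puts "excluded: #{excluded.inspect}"
puts "kept:     #{kept.inspect}"
```

If the tool were to compare the pattern against absolute paths instead, every file would match and be excluded, so the relative-path assumption matters.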
data/VERSION
CHANGED
@@ -1 +1 @@
-0.1.
+0.1.2
data/lib/filter_io.rb
CHANGED
data/test/filter_io_test.rb
CHANGED
@@ -317,7 +317,7 @@ class FilterIOTest < ActiveSupport::TestCase
     assert_equal expected, io.read
   end
 
-  test "block size" do
+  test "block size for read(nil)" do
     [1,4,7,9,13,30].each do |block_size|
       input = ('A'..'Z').to_a.join
       expected = input.chars.enum_for(:each_slice, block_size).to_a.map(&:join).map { |x| "[#{x}]" }.join
@@ -328,6 +328,18 @@ class FilterIOTest < ActiveSupport::TestCase
     end
   end
 
+  test "block size for gets/readline" do
+    [1,4,7,9,13,30].each do |block_size|
+      input = "ABCDEFG\nHJIKLMNOP\n"
+      expected = input.chars.enum_for(:each_slice, block_size).to_a.map(&:join).map { |x| "[#{x}]" }.join.lines.to_a
+      io = FilterIO.new(StringIO.new(input), :block_size => block_size) do |data|
+        "[#{data}]"
+      end
+      actual = io.readlines
+      assert_equal expected, actual
+    end
+  end
+
   test "block size different to read size" do
     (1..5).each do |block_size|
       input_str = ('A'..'Z').to_a.join
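The `expected` value in the new gets/readline test is easier to follow when worked through for a single block size. The snippet below reproduces that computation in plain Ruby for `block_size = 4`, without loading the gem; it uses `each_slice` directly, which should be equivalent to the `enum_for(:each_slice, ...)` form in the test.

```ruby
input = "ABCDEFG\nHJIKLMNOP\n"
block_size = 4

# Cut the input into block-sized pieces and wrap each piece in brackets,
# mirroring what the test's filter block does to every chunk it receives.
wrapped = input.chars.each_slice(block_size).map(&:join).map { |x| "[#{x}]" }.join

# The test then compares line by line, so split on newlines.
expected_lines = wrapped.lines.to_a

p wrapped
p expected_lines
```

Because the brackets are added per block while the newlines stay where they were in the input, a line boundary can fall in the middle of a bracketed block. The test checks that `readlines` still splits on the filtered output's newlines rather than on block boundaries.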
metadata
CHANGED
@@ -1,12 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: filter_io
 version: !ruby/object:Gem::Version
+  hash: 31
   prerelease: false
   segments:
   - 0
   - 1
-  -
-  version: 0.1.
+  - 2
+  version: 0.1.2
 platform: ruby
 authors:
 - Jason Weathered
@@ -14,7 +15,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2010-06-
+date: 2010-06-27 00:00:00 +10:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
@@ -25,6 +26,7 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
+        hash: 3
         segments:
         - 0
         version: "0"
@@ -62,6 +64,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
+      hash: 3
       segments:
       - 0
       version: "0"
@@ -70,6 +73,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
+      hash: 3
       segments:
       - 0
       version: "0"