tsv 0.0.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 9872712bd4a1b57465a813c036ed829be77673b0
+   data.tar.gz: e392e8d086e7d4277f2bbb868cb93a26e6ab07d2
+ SHA512:
+   metadata.gz: b4c7d0043ab7b5ae3a25769e1dca356c4ec672acef4310895c889ac991e3f764137176eb2490373c8028f9ce361e3f7600c8d9ab193fe6b9dbe0fd112b334e9b
+   data.tar.gz: 1a9e12c59e28ebf235591b73f294b2bc02f2118307a6d71d83ada443e6188736d46bef9036e81690b55eed986a5c0d272919c185e48135450ecd9e561eedc3e5
data/.gitignore ADDED
@@ -0,0 +1,17 @@
+ *.gem
+ *.rbc
+ .bundle
+ .config
+ .yardoc
+ Gemfile.lock
+ InstalledFiles
+ _yardoc
+ coverage
+ doc/
+ lib/bundler/man
+ pkg
+ rdoc
+ spec/reports
+ test/tmp
+ test/version_tmp
+ tmp
data/.rspec ADDED
@@ -0,0 +1 @@
+ --color
data/.travis.yml ADDED
@@ -0,0 +1,13 @@
+ language: ruby
+
+ addons:
+   code_climate:
+     repo_token: 17abb3979e6abb0cee4069ec3e7aeee9c6e4fd8277b0899b4cd0900ac6030f98
+
+ rvm:
+   - 1.9.3
+   - 2.0.0
+   - 2.1.1
+   - 2.1.2
+   - rbx-2.2.6
+   - jruby
data/Gemfile ADDED
@@ -0,0 +1,9 @@
+ source 'https://rubygems.org'
+
+ # Specify your gem's dependencies in tsv.gemspec
+ gemspec
+
+ gem "codeclimate-test-reporter", group: :test, require: nil
+ gem "rake"
+ gem "rspec"
+ gem "pry"
data/LICENSE.txt ADDED
@@ -0,0 +1,22 @@
+ Copyright (c) 2014 Moron Activity
+
+ MIT License
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,74 @@
+ # Tsv
+ [![Build Status](https://travis-ci.org/mimimi/ruby-tsv.svg?branch=master)](https://travis-ci.org/mimimi/ruby-tsv)
+
+ A simple TSV parser, developed with the aim of parsing a ~200 GB TSV dump. As such, enumeration is the only mode of operation considered sane. Feel free to use `#to_a` on your supercomputer :)
+
+ Does not (yet) provide a TSV writing mechanism. Pull requests are welcome :)
+
+ ## Installation
+
+ Add this line to your application's Gemfile:
+
+     gem 'tsv'
+
+ And then execute:
+
+     $ bundle
+
+ Or install it yourself as:
+
+     $ gem install tsv
+
+ ## Usage
+
+ ### High level interfaces
+
+ #### TSV::parse
+
+ `TSV.parse` accepts TSV content as a single string and returns a lazy enumerator that yields `TSV::Row` objects on demand.
+
+ #### TSV::parse_file
+
+ `TSV.parse_file` accepts a path to a TSV file and returns a lazy enumerator that yields `TSV::Row` objects on demand.
+ `TSV.parse_file` is also aliased as `[]`, allowing for the `TSV[filename]` syntax.
+
+ #### TSV::Row
+
+ By default `TSV::Row` behaves like an Array of strings derived from a TSV row. However, this similarity is limited to `Enumerable` methods. If a real array is needed, `#to_a` will behave as expected.
+ Additionally, `TSV::Row` carries header data, accessible via the `#header` reader.
+
+ If hash-like behaviour is required, a field can be accessed with its header string as the key. Alternatively, `#with_header` and `#to_h` return a hash representation of the row.
+
+ ### Examples
+
+ Getting the first line from a TSV file without headers:
+ ```ruby
+ TSV.parse_file("tsv.tsv").without_header.first
+ ```
+
+ Mapping name fields from a file:
+ ```ruby
+ TSV["tsv.tsv"].map do |row|
+   row['name']
+ end
+ ```
+
+ Mapping last and first row elements:
+ ```ruby
+ TSV["tsv.tsv"].map do |row|
+   [row[-1], row[0]]
+ end
+ ```
+
+ ### Nuances
+
+ The range accessor is not implemented in the initial version because the authors have not needed it.
+ In addition, accessing the tenth element in a row of five is considered an error from the TSV standpoint, and a range accessor would have to reflect that. Such behaviour, were it implemented, would break the expectations set by `Array`. Still, if the need arises, pull or feature requests with accompanying reasoning (or even without any) are more than welcome.
+
+ ## Contributing
+
+ 1. Fork it
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
+ 4. Push to the branch (`git push origin my-new-feature`)
+ 5. Create new Pull Request
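The README's point that enumeration is the only sane mode for very large inputs can be illustrated with a minimal, standalone sketch. It does not use the gem: `lazy_tsv_rows` is a hypothetical helper written for this note, and a `StringIO` stands in for a large file. It only shows that an `Enumerator` built on `each_line` reads as much input as the consumer asks for.

```ruby
require "stringio"

# Hypothetical helper, not part of the gem: wraps any IO-like object in an
# Enumerator that splits a line on tabs only when a consumer asks for it.
def lazy_tsv_rows(io)
  Enumerator.new do |yielder|
    io.each_line do |line|
      yielder << line.chomp.split("\t")
    end
  end
end

io = StringIO.new("first\tsecond\n0\t1\none\ttwo\n")
rows = lazy_tsv_rows(io)

first_row = rows.first   # iteration stops after a single line
puts first_row.inspect
puts io.eof?             # false: the remaining lines were never read
```

Calling `to_a` on the same enumerator would read everything into memory, which is exactly what the README warns about.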
data/Rakefile ADDED
@@ -0,0 +1,6 @@
+ require "bundler/gem_tasks"
+ require 'rspec/core/rake_task'
+
+ RSpec::Core::RakeTask.new('spec')
+
+ task :default => :spec
data/lib/tsv.rb ADDED
@@ -0,0 +1,21 @@
+ require "forwardable"
+ require "tsv/version"
+ require "tsv/row"
+ require "tsv/cyclist"
+
+ module TSV
+   extend self
+
+   def parse(content, opts = {}, &block)
+     TSV::StringCyclist.new(content, opts, &block)
+   end
+
+   def parse_file(filename, opts = {}, &block)
+     TSV::FileCyclist.new(filename, opts, &block)
+   end
+
+   alias :[] :parse_file
+
+   class ReadOnly < StandardError
+   end
+ end
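The combination of `extend self` and `alias :[] :parse_file` in `lib/tsv.rb` is what makes `TSV[filename]` work. A minimal sketch of the idiom follows; the `Shout` module and its `call` method are hypothetical names invented for illustration, and only the pattern mirrors the file above.

```ruby
# Hypothetical module for illustration; only the idiom mirrors lib/tsv.rb.
module Shout
  extend self            # instance methods defined below become callable on the module

  def call(word)
    word.upcase
  end

  alias :[] :call        # the alias is an ordinary method, so Shout["hi"] works too
end

puts Shout.call("tsv")
puts Shout["tsv"]
```

Because `extend self` adds the module to its own singleton ancestry, methods and aliases defined after that line are picked up as well.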
data/lib/tsv/cyclist.rb ADDED
@@ -0,0 +1,71 @@
+ module TSV
+   class Cyclist
+     extend Forwardable
+
+     def_delegators :enumerator, *Enumerator.instance_methods(false)
+     def_delegators :enumerator, *Enumerable.instance_methods(false)
+
+     attr_accessor :source, :header
+
+     def initialize(source, params = {}, &block)
+       self.header = params.fetch(:header, true)
+       self.source = source.to_s
+       self.enumerator.each(&block) if block_given?
+     end
+
+     def with_header
+       self.class.new(self.source, header: true)
+     end
+
+     def without_header
+       self.class.new(self.source, header: false)
+     end
+
+     def enumerator
+       @enumerator ||= ::Enumerator.new do |y|
+         lines = data_enumerator
+
+         first_line = generate_row_from begin
+           lines.next
+         rescue StopIteration => ex
+           ''
+         end
+
+         local_header = if self.header
+           first_line
+         else
+           lines.rewind
+           generate_default_header_from first_line
+         end
+
+         loop do
+           y << TSV::Row.new(generate_row_from(lines.next).freeze, local_header.freeze)
+         end
+       end
+     end
+
+     protected
+
+     def generate_row_from(str)
+       str.to_s.chomp.split("\t")
+     end
+
+     def generate_default_header_from(example_line)
+       (0...example_line.length).to_a.map(&:to_s)
+     end
+   end
+
+   class FileCyclist < Cyclist
+     alias :filepath :source
+
+     def data_enumerator
+       File.new(self.source).each_line
+     end
+   end
+
+   class StringCyclist < Cyclist
+     def data_enumerator
+       source.each_line
+     end
+   end
+ end
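The header handling in `Cyclist#enumerator` (read one line, then either treat it as the header or rewind and synthesize numeric column names) can be sketched in isolation. This is a simplified stand-in, not the gem's code: `rows_with_header` is a hypothetical function, it collects rows eagerly into plain hashes for brevity, and it assumes Ruby 2.1 or newer for keyword arguments and `Array#to_h`.

```ruby
require "stringio"

# Hypothetical stand-in for the header logic in Cyclist#enumerator.
# header: true  -> the first line names the columns.
# header: false -> columns are named "0", "1", ... and the first line is data.
def rows_with_header(io, header: true)
  lines = io.each_line

  first = begin
    lines.next.chomp.split("\t")
  rescue StopIteration
    []                                  # empty input: no columns at all
  end

  names = if header
    first
  else
    lines.rewind                        # Enumerator#rewind also calls io.rewind
    (0...first.length).map(&:to_s)
  end

  result = []
  # Kernel#loop ends quietly once lines.next raises StopIteration.
  loop { result << names.zip(lines.next.chomp.split("\t")).to_h }
  result
end

input = "a\tb\n1\t2\n"
puts rows_with_header(StringIO.new(input)).inspect
puts rows_with_header(StringIO.new(input), header: false).inspect
```

The rewind step relies on the documented behaviour of `Enumerator#rewind`, which calls `rewind` on the underlying object when it responds to it, as both `File` and `StringIO` do.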
data/lib/tsv/row.rb ADDED
@@ -0,0 +1,54 @@
+ module TSV
+   class Row
+     extend Forwardable
+
+     def_delegators :data, *Enumerable.instance_methods(false)
+
+     attr_reader :header, :data
+
+     def []=(key, value)
+       raise TSV::ReadOnly.new('TSV data is read only. Export data to modify it.')
+     end
+
+     def [](key)
+       if key.is_a? ::String
+         raise UnknownKey unless header.include?(key)
+
+         data[header.index(key)]
+       elsif key.is_a? ::Numeric
+         raise UnknownKey if data[key].nil?
+
+         data[key]
+       else
+         raise InvalidKey.new
+       end
+     end
+
+     def initialize(data, header)
+       @data = data
+       @header = header
+
+       raise InputError if @data.length != @header.length
+     end
+
+     def with_header
+       Hash[header.zip(data)]
+     end
+     alias :to_h :with_header
+
+     def ==(other)
+       other.is_a?(self.class) and
+         header == other.header and
+         data == other.data
+     end
+
+     class InvalidKey < StandardError
+     end
+
+     class UnknownKey < StandardError
+     end
+
+     class InputError < StandardError
+     end
+   end
+ end
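`TSV::Row#[]` differs from `Array#[]` in that an out-of-range index raises instead of returning `nil`, which is the behaviour the README's "Nuances" section refers to. A minimal sketch of that design follows. `StrictRow` is a hypothetical class written for this note, it supports only `String` and `Integer` keys, and it uses `Array#fetch` where the gem checks for `nil`.

```ruby
require "forwardable"

# Hypothetical class for illustration, not the gem's TSV::Row.
class StrictRow
  extend Forwardable
  def_delegators :@data, :each, :map, :to_a   # read-only views are delegated

  class UnknownKey < StandardError; end

  def initialize(data, header)
    @data = data.freeze
    @header = header.freeze
  end

  # String keys are looked up in the header; Integer keys index the data.
  def [](key)
    index = key.is_a?(String) ? @header.index(key) : key
    raise UnknownKey, key.inspect if index.nil?
    # fetch runs the block instead of returning nil when the index is out of range
    @data.fetch(index) { raise UnknownKey, key.inspect }
  end
end

row = StrictRow.new(%w[one two three], %w[first second third])
puts row["second"]
puts row[-1]

begin
  row[10]
rescue StrictRow::UnknownKey => e
  puts "unknown key: #{e.message}"
end
```

Raising on a missing index makes a short or malformed row fail loudly rather than silently producing `nil` fields further down a pipeline.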
data/lib/tsv/version.rb ADDED
@@ -0,0 +1,3 @@
+ module TSV
+   VERSION = "0.0.1"
+ end
data/spec/fixtures/broken.tsv ADDED
@@ -0,0 +1,3 @@
+ first	second	third
+ 0	1	2
+ one
data/spec/fixtures/empty.tsv ADDED
File without changes
data/spec/fixtures/example.tsv ADDED
@@ -0,0 +1,4 @@
+ first	second	third
+ 0	1	2
+ one	two	three
+ weird data	s@mthin#	else
data/spec/lib/tsv/file_cyclist_spec.rb ADDED
@@ -0,0 +1,34 @@
+ require File.join(File.dirname(__FILE__), '..', '..', 'spec_helper.rb')
+
+ describe TSV::FileCyclist do
+   let(:tsv_path) { File.join(File.dirname(__FILE__), '..', '..', 'fixtures', filename) }
+   let(:source) { tsv_path }
+   let(:filename) { 'example.tsv' }
+
+   let(:header) { true }
+   let(:parameters) { { header: header } }
+
+   subject(:cyclist) { TSV::FileCyclist.new(source, parameters) }
+
+   it_behaves_like "Cyclist"
+
+   describe "accessing unavailable files" do
+     subject { lambda { TSV::FileCyclist.new(tsv_path).to_a } }
+
+     context "when file is not found" do
+       let(:tsv_path) { "AManThatWasntThere.tsv" }
+
+       it "raises Errno::ENOENT" do
+         expect(subject).to raise_error(Errno::ENOENT)
+       end
+     end
+
+     context "when filename is nil" do
+       let(:tsv_path) { nil }
+
+       it "raises Errno::ENOENT" do
+         expect(subject).to raise_error(Errno::ENOENT)
+       end
+     end
+   end
+ end
data/spec/lib/tsv/row_spec.rb ADDED
@@ -0,0 +1,168 @@
+ require File.join(File.dirname(__FILE__), '..', '..', 'spec_helper.rb')
+
+ describe TSV::Row do
+   describe "::new" do
+     it "sets header and data from params" do
+       t = TSV::Row.new(['data'], ['header'])
+
+       expect(t.header).to eq(['header'])
+       expect(t.data).to eq(['data'])
+     end
+
+     context "when header and data length do not match" do
+       it "raises TSV::Row::InputError" do
+         expect { TSV::Row.new(['data'], ['header', 'footer']) }.to raise_error(TSV::Row::InputError)
+         expect { TSV::Row.new(['data', 'not data'], ['header']) }.to raise_error(TSV::Row::InputError)
+       end
+     end
+   end
+
+   let(:header) { ['first', 'second', 'third'] }
+   let(:data) { ['one', 'two', 'three'] }
+
+   subject(:row) { TSV::Row.new(data, header) }
+
+   describe "#[]" do
+     describe "array interface compatibility" do
+       context "when provided with element number" do
+         it "returns requested element" do
+           expect(subject[1]).to eq "two"
+         end
+       end
+
+       context "when provided with negative offset" do
+         it "returns requested element" do
+           expect(subject[-1]).to eq "three"
+         end
+       end
+
+       context "when provided with header name" do
+         it "returns requested element" do
+           expect(subject['third']).to eq "three"
+         end
+       end
+
+       context "when provided with nil or symbol" do
+         it "raises TSV::Row::InvalidKey" do
+           expect { subject[nil] }.to raise_error(TSV::Row::InvalidKey)
+           expect { subject[:something] }.to raise_error(TSV::Row::InvalidKey)
+         end
+       end
+
+       context "when provided with unknown numeric key" do
+         let(:cases) { [-(data.length + 1), data.length, 500, -500]}
+
+         it "raises TSV::Row::UnknownKey" do
+           cases.each do |item|
+             expect { subject[item] }.to raise_error(TSV::Row::UnknownKey)
+           end
+         end
+       end
+
+       context "when provided with unknown string key" do
+         it "raises TSV::Row::UnknownKey" do
+           expect { subject['something'] }.to raise_error(TSV::Row::UnknownKey)
+         end
+       end
+     end
+   end
+
+   describe "#[]=" do
+     it "raises TSV::ReadOnly exception" do
+       expect { subject['a'] = 123 }.to raise_error(TSV::ReadOnly, 'TSV data is read only. Export data to modify it.')
+     end
+   end
+
+   describe "accessors" do
+     describe "header" do
+       it "does not have setter" do
+         expect(subject).to_not respond_to(:"header=")
+       end
+
+       it "has getter" do
+         expect(subject.header).to eq ['first', 'second', 'third']
+       end
+     end
+
+     describe "data" do
+       it "does not have setter" do
+         expect(subject).to_not respond_to(:"data=")
+       end
+
+       it "has getter" do
+         expect(subject.data).to eq ['one', 'two', 'three']
+       end
+     end
+   end
+
+   describe "iterators" do
+     describe "Enumerable #methods (except #to_h, which we have a better implementation for)" do
+       (Enumerable.instance_methods(false) - [:to_h]).each do |name|
+         it "delegates #{name} to data array" do
+           expect(subject.data).to receive(name)
+           subject.send(name)
+         end
+       end
+     end
+
+     describe "#with_header" do
+       subject { row.with_header }
+
+       it "gathers header and data into hash" do
+         expect(subject).to eq({
+           "first" => "one",
+           "second" => "two",
+           "third" => "three"
+         })
+       end
+     end
+
+     describe "#to_h" do
+       subject { row.to_h }
+
+       it "gathers header and data into hash" do
+         expect(subject).to eq({
+           "first" => "one",
+           "second" => "two",
+           "third" => "three"
+         })
+       end
+     end
+   end
+
+   describe "#==" do
+     let(:other_header) { header }
+     let(:other_data) { data }
+
+     let(:other_row) { TSV::Row.new(other_data, other_header) }
+     subject { row == other_row }
+
+     context "when compared to TSV::Row" do
+       context "when both objects' data and header are equal" do
+         it { should be true }
+       end
+
+       context "when data attributes are not equal" do
+         let(:other_data) { data.reverse }
+         it { should be false }
+       end
+
+       context "when header attributes are not equal" do
+         let(:other_header) { header.reverse }
+         it { should be false }
+       end
+
+       context "when both objects' data and header are not equal" do
+         let(:other_data) { data.reverse }
+         let(:other_header) { header.reverse }
+         it { should be false }
+       end
+     end
+
+     context "when compared to something else than TSV::Row" do
+       let(:other_row) { data }
+
+       it { should be false }
+     end
+   end
+ end
data/spec/lib/tsv/string_cyclist_spec.rb ADDED
@@ -0,0 +1,13 @@
+ require File.join(File.dirname(__FILE__), '..', '..', 'spec_helper.rb')
+
+ describe TSV::StringCyclist do
+   let(:source) { IO.read(File.join(File.dirname(__FILE__), '..', '..', 'fixtures', filename)) }
+   let(:filename) { 'example.tsv' }
+
+   let(:header) { true }
+   let(:parameters) { { header: header } }
+
+   subject(:cyclist) { TSV::StringCyclist.new(source, parameters) }
+
+   it_behaves_like "Cyclist"
+ end
data/spec/lib/tsv_spec.rb ADDED
@@ -0,0 +1,59 @@
+ require File.join(File.dirname(__FILE__), '..', 'spec_helper.rb')
+
+ describe TSV do
+   let(:filename) { 'example.tsv' }
+
+   describe "#parse" do
+     let(:header) { nil }
+     let(:content) { IO.read(File.join(File.dirname(__FILE__), '..', 'fixtures', filename)) }
+     let(:parameters) { { header: header } }
+
+     subject { TSV.parse(content, parameters) }
+
+     it "returns String Cyclist initialized with given data" do
+       expect(subject).to be_a TSV::StringCyclist
+       expect(subject.source).to eq(content)
+     end
+
+     context "when block is given" do
+       it "passes block to Cyclist" do
+         data = []
+
+         TSV.parse(content) do |i|
+           data.push i
+         end
+
+         headers = %w{first second third}
+         expect(data).to eq [ TSV::Row.new( ['0', '1', '2'], headers ),
+                              TSV::Row.new( ['one', 'two', 'three'], headers ),
+                              TSV::Row.new( ['weird data', 's@mthin#', 'else'], headers ) ]
+       end
+     end
+   end
+
+   describe "#parse_file" do
+     let(:tsv_path) { File.join(File.dirname(__FILE__), '..', 'fixtures', filename) }
+
+     subject { TSV.parse_file tsv_path }
+
+     it "returns Cyclist object initialized with given filepath" do
+       expect(subject).to be_a TSV::FileCyclist
+       expect(subject.filepath).to eq tsv_path
+     end
+
+     context "when block is given" do
+       it "passes block to Cyclist" do
+         data = []
+
+         TSV.parse_file(tsv_path) do |i|
+           data.push i
+         end
+
+         headers = %w{first second third}
+         expect(data).to eq [ TSV::Row.new( ['0', '1', '2'], headers ),
+                              TSV::Row.new( ['one', 'two', 'three'], headers ),
+                              TSV::Row.new( ['weird data', 's@mthin#', 'else'], headers ) ]
+       end
+     end
+   end
+ end
data/spec/spec_helper.rb ADDED
@@ -0,0 +1,21 @@
+ require 'rubygems'
+ require 'bundler/setup'
+
+ require 'pry'
+ require 'rspec'
+
+ require 'tsv'
+
+ require "codeclimate-test-reporter"
+ CodeClimate::TestReporter.start
+
+ # Disabling old rspec should syntax
+ RSpec.configure do |config|
+   config.expect_with :rspec do |c|
+     c.syntax = :expect
+   end
+
+   config.raise_errors_for_deprecations!
+ end
+
+ Dir[File.expand_path(File.join(File.dirname(__FILE__),'support','**','*.rb'))].each {|f| require f}
data/spec/support/cyclist_generic.rb ADDED
@@ -0,0 +1,109 @@
+ shared_examples_for "Cyclist" do
+   describe "::new" do
+     it "initializes header to true by default" do
+       expect(subject.header).to be true
+     end
+
+     it "initializes source to given value" do
+       expect(subject.source).to eq(source)
+     end
+
+     context "when block is given" do
+       it "passes block to enumerator through each" do
+         data = []
+
+         described_class.new(source) do |v|
+           data << v
+         end
+
+         headers = %w{first second third}
+         expect(data).to eq [ TSV::Row.new( ['0', '1', '2'], headers ),
+                              TSV::Row.new( ['one', 'two', 'three'], headers ),
+                              TSV::Row.new( ['weird data', 's@mthin#', 'else'], headers ) ]
+       end
+     end
+   end
+
+   describe "#enumerator" do
+     it { expect(cyclist.enumerator).to be_a_kind_of(Enumerator) }
+     subject { cyclist.enumerator.to_a }
+
+     context "string is empty" do
+       let(:filename) { 'empty.tsv' }
+
+       it { should be_empty }
+     end
+
+     context "string is incorrect" do
+       let(:filename) { 'broken.tsv' }
+
+       it "raises TSV::Row::InputError" do
+         expect { subject }.to raise_error(TSV::Row::InputError)
+       end
+     end
+
+     context "string is correct" do
+       context "when requested without header" do
+         let(:header) { false }
+         let(:auto_header) { %w{0 1 2} }
+
+         it "returns its content as array of rows with generated headers" do
+           expect(subject).to eq [ TSV::Row.new( ['first', 'second', 'third'], auto_header ),
+                                   TSV::Row.new( ['0', '1', '2'], auto_header ),
+                                   TSV::Row.new( ['one', 'two', 'three'], auto_header ),
+                                   TSV::Row.new( ['weird data', 's@mthin#', 'else'], auto_header ) ]
+         end
+
+         it "freezes data and header for TSV::Row" do
+           subject.each do |i|
+             expect(i.data).to be_frozen
+             expect(i.header).to be_frozen
+           end
+         end
+       end
+
+       context "when requested with header" do
+         let(:header) { true }
+
+         it "returns its content as array of rows with file headers" do
+           headers = %w{first second third}
+           expect(subject).to eq [ TSV::Row.new( ['0', '1', '2'], headers ),
+                                   TSV::Row.new( ['one', 'two', 'three'], headers ),
+                                   TSV::Row.new( ['weird data', 's@mthin#', 'else'], headers ) ]
+         end
+
+         it "freezes data and header for TSV::Row" do
+           subject.each do |i|
+             expect(i.data).to be_frozen
+             expect(i.header).to be_frozen
+           end
+         end
+       end
+     end
+   end
+
+   describe "#with_header" do
+     subject { cyclist.with_header }
+
+     it "returns a Cyclist with header option set to true" do
+       expect(subject.header).to be true
+     end
+   end
+
+   describe "#without_header" do
+     subject { cyclist.without_header }
+
+     it "returns a Cyclist with header option set to false" do
+       expect(subject.header).to be false
+     end
+   end
+
+   describe "enumerator interfaces" do
+     ( Enumerable.instance_methods(false) + Enumerator.instance_methods(false) ).each do |name|
+       it "delegates #{name} to enumerator" do
+         expect(cyclist.enumerator).to receive(name)
+         cyclist.send(name)
+       end
+     end
+   end
+ end
data/spec/tsv_integration_spec.rb ADDED
@@ -0,0 +1,95 @@
+ require File.join(File.dirname(__FILE__), 'spec_helper.rb')
+
+ describe TSV do
+   let(:header) { nil }
+   let(:tsv_path) { File.join(File.dirname(__FILE__), 'fixtures', filename) }
+   let(:parameters) { { header: header } }
+
+   describe "reading file" do
+     subject { TSV.parse_file(tsv_path, parameters).to_a }
+
+     context "when file is empty" do
+       let(:filename) { 'empty.tsv' }
+
+       context "when requested with header" do
+         let(:header) { true }
+
+         it { expect(subject).to be_empty }
+       end
+
+       context "when requested without header" do
+         let(:header) { false }
+
+         it { expect(subject).to be_empty }
+       end
+     end
+
+     context "when file is invalid" do
+       subject { lambda { TSV.parse_file(tsv_path, parameters).to_a } }
+       let(:filename) { 'broken.tsv' }
+
+       it "raises TSV::Row::InputError" do
+         expect(subject).to raise_error TSV::Row::InputError
+       end
+     end
+
+     context "when file is valid" do
+       let(:filename) { 'example.tsv' }
+
+       context "when no block is passed" do
+         let(:parameters) { Hash.new }
+
+         it "returns its content as array of rows" do
+           headers = %w{first second third}
+           expect(subject).to eq [ TSV::Row.new( ['0', '1', '2'], headers ),
+                                   TSV::Row.new( ['one', 'two', 'three'], headers ),
+                                   TSV::Row.new( ['weird data', 's@mthin#', 'else'], headers ) ]
+         end
+       end
+     end
+   end
+
+   describe "reading from string" do
+     subject { TSV.parse(IO.read(tsv_path), parameters).to_a }
+
+     context "when string is empty" do
+       let(:filename) { 'empty.tsv' }
+
+       context "when requested with header" do
+         let(:header) { true }
+
+         it { expect(subject).to be_empty }
+       end
+
+       context "when requested without header" do
+         let(:header) { false }
+
+         it { expect(subject).to be_empty }
+       end
+     end
+
+     context "when string is invalid" do
+       subject { lambda { TSV.parse(IO.read(tsv_path), parameters).to_a } }
+       let(:filename) { 'broken.tsv' }
+
+       it "raises TSV::Row::InputError" do
+         expect(subject).to raise_error TSV::Row::InputError
+       end
+     end
+
+     context "when string is valid" do
+       let(:filename) { 'example.tsv' }
+
+       context "when no block is passed" do
+         let(:parameters) { Hash.new }
+
+         it "returns its content as array of rows" do
+           headers = %w{first second third}
+           expect(subject).to eq [ TSV::Row.new( ['0', '1', '2'], headers ),
+                                   TSV::Row.new( ['one', 'two', 'three'], headers ),
+                                   TSV::Row.new( ['weird data', 's@mthin#', 'else'], headers ) ]
+         end
+       end
+     end
+   end
+ end
data/tsv.gemspec ADDED
@@ -0,0 +1,20 @@
+ # coding: utf-8
+ lib = File.expand_path('../lib', __FILE__)
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+ require 'tsv/version'
+
+ Gem::Specification.new do |spec|
+   spec.name = "tsv"
+   spec.version = TSV::VERSION
+   spec.authors = ["Dmytro Soltys", "Alexander Rozumiy"]
+   spec.email = ["soap@slotos.net", "brain-geek@yandex.ua"]
+   spec.description = %q{Streamed TSV parser}
+   spec.summary = %q{Provides a simple parser for standard compliant and not so (missing header line) TSV files}
+   spec.homepage = ""
+   spec.license = "MIT"
+
+   spec.files = `git ls-files`.split($/)
+   spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
+   spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
+   spec.require_paths = ["lib"]
+ end
metadata ADDED
@@ -0,0 +1,79 @@
+ --- !ruby/object:Gem::Specification
+ name: tsv
+ version: !ruby/object:Gem::Version
+   version: 0.0.1
+ platform: ruby
+ authors:
+ - Dmytro Soltys
+ - Alexander Rozumiy
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2014-07-10 00:00:00.000000000 Z
+ dependencies: []
+ description: Streamed TSV parser
+ email:
+ - soap@slotos.net
+ - brain-geek@yandex.ua
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - .gitignore
+ - .rspec
+ - .travis.yml
+ - Gemfile
+ - LICENSE.txt
+ - README.md
+ - Rakefile
+ - lib/tsv.rb
+ - lib/tsv/cyclist.rb
+ - lib/tsv/row.rb
+ - lib/tsv/version.rb
+ - spec/fixtures/broken.tsv
+ - spec/fixtures/empty.tsv
+ - spec/fixtures/example.tsv
+ - spec/lib/tsv/file_cyclist_spec.rb
+ - spec/lib/tsv/row_spec.rb
+ - spec/lib/tsv/string_cyclist_spec.rb
+ - spec/lib/tsv_spec.rb
+ - spec/spec_helper.rb
+ - spec/support/cyclist_generic.rb
+ - spec/tsv_integration_spec.rb
+ - tsv.gemspec
+ homepage: ''
+ licenses:
+ - MIT
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.0.6
+ signing_key:
+ specification_version: 4
+ summary: Provides a simple parser for standard compliant and not so (missing header
+   line) TSV files
+ test_files:
+ - spec/fixtures/broken.tsv
+ - spec/fixtures/empty.tsv
+ - spec/fixtures/example.tsv
+ - spec/lib/tsv/file_cyclist_spec.rb
+ - spec/lib/tsv/row_spec.rb
+ - spec/lib/tsv/string_cyclist_spec.rb
+ - spec/lib/tsv_spec.rb
+ - spec/spec_helper.rb
+ - spec/support/cyclist_generic.rb
+ - spec/tsv_integration_spec.rb