kiba 2.5.0 → 3.0.0
- checksums.yaml +4 -4
- data/.travis.yml +1 -3
- data/Changes.md +8 -0
- data/Pro-Changes.md +63 -2
- data/bin/kiba +12 -2
- data/lib/kiba.rb +1 -1
- data/lib/kiba/parser.rb +6 -22
- data/lib/kiba/runner.rb +1 -1
- data/lib/kiba/version.rb +1 -1
- data/test/shared_runner_tests.rb +0 -4
- data/test/test_parser.rb +0 -33
- data/test/test_run.rb +12 -0
- data/test/test_runner.rb +5 -0
- data/test/test_streaming_runner.rb +5 -10
- metadata +4 -13
- data/lib/kiba/cli.rb +0 -16
- data/test/fixtures/bogus.etl +0 -2
- data/test/fixtures/namespace_conflict.etl +0 -9
- data/test/fixtures/some_extension.rb +0 -4
- data/test/fixtures/valid.etl +0 -1
- data/test/test_cli.rb +0 -21
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a5cf22f1013b65b59d9c91b9f629749b47b5474b3f500d5ca54118b6b6a46c4f
+  data.tar.gz: cae5e05fabc7c8d169d47d6aa6a7da8befd6711ad2d2471c2c0f5f1d3e604841
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 2ed99641ab799ee57f213525f6ca3ab4b9a9d5e0b7680115bef8e139d32dcfb350fb0117065b5be0ded6a886c21988413b2963e2cbea546a7a1d5691ed078f27
+  data.tar.gz: 24434d65dcf9bb81ce0f575860edbdf92598a1e990663347a599eb42155211c02b9c9eb0304c869cf85c8cd15df8c262afcf9cdf11bb7568323ea98b393b9518
data/.travis.yml
CHANGED
@@ -3,12 +3,10 @@ matrix:
   include:
     # see https://www.ruby-lang.org/en/downloads/branches/
     - rvm: ruby-head
+    - rvm: 2.7
     - rvm: 2.6
     - rvm: 2.5
     - rvm: 2.4
-    # NOTE: EOL since 2019-03-31
-    - rvm: 2.3
     # see https://www.jruby.org/download
-    - rvm: jruby-9.1
     - rvm: jruby-9.2
     - rvm: truffleruby
data/Changes.md
CHANGED
@@ -1,3 +1,11 @@
+3.0.0
+-----
+
+- Breaking: the `kiba` command line is deprecated to encourage using `Kiba.parse` API. See [#81](https://github.com/thbar/kiba/pull/81) and release notes for details & migration path.
+- Kiba now defaults to `StreamingRunner` (backward compatible & more powerful engine) [#83](https://github.com/thbar/kiba/pull/83).
+- Kiba now officially supports MRI Ruby 2.4+ (although 2.3 will still work for now), JRuby 9.2+ or TruffleRuby.
+- You may get warnings with Ruby 2.7 and errors with Ruby 2.8+. See [#85] for status on Ruby 3 keyword arguments support.
+
 2.5.0
 -----
 
data/Pro-Changes.md
CHANGED
@@ -3,8 +3,69 @@ Kiba Pro Changelog
 
 Kiba Pro is the commercial extension for Kiba. Documentation is available on the [Wiki](https://github.com/thbar/kiba/wiki).
 
-
-
+1.5.0
+-----
+
+- Compatibility with Kiba v3
+- BREAKING CHANGE: deprecate non-live Sequel connection passing (https://github.com/thbar/kiba/issues/79). Do not use `database: "connection_string"`, instead pass your `Sequel` connection directly. This moves the connection management out of the destination, which is a better pattern & provides better (block-based) resources closing.
+- Official MySQL support:
+  - While the compatibility was already here, it is now tested for in our QA testing suite.
+  - MySQL 5.5-8.0 is supported & tested
+  - MariaDB should be supported (although not tested against in the QA testing suite)
+  - Amazon Aurora MySQL is also supposed to work (although not tested)
+  - `Kiba::Pro::Sources::SQL` supports for non-streaming + streaming use
+  - `Kiba::Pro::Destinations::SQLBulkInsert` supports:
+    - Bulk insert
+    - Bulk insert with ignore
+    - Bulk upsert (including with dynamically computed columns) via `ON DUPLICATE KEY UPDATE`
+  - Note that the `Kiba::Pro::Destinations::SQLUpsert` (row-by-row) is not MySQL compatible at the moment
+
+1.2.0
+-----
+
+- `SQL` source improvements:
+  - Deprecate use_cursor in favor of block query construct. The source could previously be configured with:
+
+```ruby
+source Kiba::Pro::Sources::SQL,
+  query: "SELECT * FROM items",
+  use_cursor: true
+```
+
+The `use_cursor` keyword is now deprecated. You can use the more powerful block query construct:
+
+```ruby
+source Kiba::Pro::Sources::SQL,
+  query: -> (db) { db["SELECT * FROM items"].use_cursor },
+```
+
+  - Avoid bogus nested SQL calls when configuring the query via block/proc. A call with:
+
+```ruby
+source Kiba::Pro::Sources::SQL,
+  query: -> (db) { db["SELECT * FROM items"] },
+```
+
+would have previously generated a `SELECT * FROM (SELECT * FROM "items")`. This is now fixed.
+
+  - Add specs around streaming support (for both MySQL and Postgres).
+
+For Postgres, streaming was [recommended by the author of Sequel](https://groups.google.com/d/msg/sequel-talk/olznPcmEf8M/hd5Ris0pYNwJ) over `use_cursor: true` (but do compare on your actual cases!). To enable streaming for Postgres:
+- Add `sequel_pg` to your `Gemfile`
+- Enable the extension in your `db` instance & add `.stream` to your dataset e.g.:
+
+```ruby
+Sequel.connect(ENV.fetch('DATABASE_URL')) do |db|
+  db.extension(:pg_streaming)
+  Kiba.run(Kiba.parse do
+    source Kiba::Pro::Sources::SQL,
+      db: db,
+      query: -> (db) { db[:items].stream }
+    # SNIP
+  end)
+```
+
+For MySQL, just add `.stream` to your dataset like above (no extension required).
 
 1.1.0
 -----
data/bin/kiba
CHANGED
@@ -1,5 +1,15 @@
 #!/usr/bin/env ruby
 
-
+STDERR.puts <<DOC
 
-
+##########################################################################
+
+The 'kiba' CLI is deprecated and has been removed in Kiba ETL v3.
+
+See release notes / changelog for help.
+
+##########################################################################
+
+DOC
+
+exit(1)
data/lib/kiba.rb
CHANGED
@@ -13,7 +13,7 @@ Kiba.extend(Kiba::Parser)
 module Kiba
   def self.run(job)
     # NOTE: use Hash#dig when Ruby 2.2 reaches EOL
-    runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::
+    runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::StreamingRunner)
     runner.run(job)
   end
 end
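The one-line change in `Kiba.run` swaps the fallback of a nested `Hash#fetch`. A minimal sketch of that lookup follows; `LegacyRunner`, `StreamingRunner`, `Job`, and `pick_runner` are hypothetical stand-ins written for this illustration, not Kiba's own classes.

```ruby
# Stand-ins for illustration only; these are not Kiba's classes.
module LegacyRunner
  def self.run(_job)
    :legacy
  end
end

module StreamingRunner
  def self.run(_job)
    :streaming
  end
end

# A job only needs to expose a config hash for this sketch.
class Job
  attr_reader :config

  def initialize(config = {})
    @config = config
  end
end

# Same shape as the lookup in the hunk: a missing :kiba section or a
# missing :runner key both fall back to the default runner.
def pick_runner(job, default: StreamingRunner)
  job.config.fetch(:kiba, {}).fetch(:runner, default)
end

default_job  = Job.new
explicit_job = Job.new({ kiba: { runner: LegacyRunner } })

puts pick_runner(default_job).run(default_job)    # prints "streaming"
puts pick_runner(explicit_job).run(explicit_job)  # prints "legacy"
```

Because the default is only the second argument to `fetch`, a runner set explicitly in the job config still takes precedence over it.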
data/lib/kiba/parser.rb
CHANGED
@@ -1,26 +1,10 @@
-
-
-
-
-
-# as by default instance_eval will resolve references by adding
-# the module containing the parser class (initially "Kiba").
-# Now, the classes appear to be further hidden from the user,
-# as Kiba::Parser is its own module.
-# This allows the user to create a Parser, Context, Control class
-# without it being interpreted as reopening Kiba::Parser, Kiba::Context,
-# etc.
-# See test in test_cli.rb (test_namespace_conflict)
-module Kiba::Parser
-  def parse(source_as_string = nil, source_file = nil, &source_as_block)
-    control = Kiba::Control.new
-    context = Kiba::Context.new(control)
-    if source_as_string
-      # this somewhat weird construct allows to remove a nil source_file
-      context.instance_eval(*[source_as_string, source_file].compact)
-    else
+module Kiba
+  module Parser
+    def parse(&source_as_block)
+      control = Kiba::Control.new
+      context = Kiba::Context.new(control)
       context.instance_eval(&source_as_block)
+      control
     end
-    control
   end
 end
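The new `parse` accepts only a block and evaluates it against a context object with `instance_eval`. The sketch below shows that pattern in isolation; `MiniControl`, `MiniContext`, `MiniParser`, and `DummySource` are hypothetical names invented for this example and only imitate the shape of the hunk above.

```ruby
# Hypothetical miniature of a control/context pair; not Kiba's code.
class MiniControl
  attr_reader :sources

  def initialize
    @sources = []
  end
end

class MiniContext
  def initialize(control)
    @control = control
  end

  # DSL keyword: record the class and its arguments for later use.
  def source(klass, *args)
    @control.sources << { klass: klass, args: args }
  end
end

module MiniParser
  # Block-only parse: evaluate the block with the context as `self`,
  # then return the populated control object.
  def self.parse(&block)
    control = MiniControl.new
    MiniContext.new(control).instance_eval(&block)
    control
  end
end

DummySource = Class.new

control = MiniParser.parse do
  source DummySource, "from", "file"
end

p control.sources.size       # prints 1
p control.sources[0][:args]  # prints ["from", "file"]
```

Since the string and file arguments are gone from the signature, a job definition that used to be passed as text has to be wrapped in a block like the one passed to `MiniParser.parse` here.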
data/lib/kiba/runner.rb
CHANGED
data/lib/kiba/version.rb
CHANGED
data/test/shared_runner_tests.rb
CHANGED
data/test/test_parser.rb
CHANGED
@@ -68,39 +68,6 @@ class TestParser < Kiba::Test
     assert_instance_of Proc, control.pre_processes[0][:block]
   end
 
-  def test_source_as_string_parsing
-    control = Kiba.parse <<RUBY
-      source DummyClass, 'from', 'file'
-RUBY
-
-    assert_equal 1, control.sources.size
-    assert_equal DummyClass, control.sources[0][:klass]
-    assert_equal %w(from file), control.sources[0][:args]
-  end
-
-  def test_source_as_file_doing_require
-    IO.write 'test/tmp/etl-common.rb', <<RUBY
-      def common_source_declaration
-        source DummyClass, 'from', 'common'
-      end
-RUBY
-    IO.write 'test/tmp/etl-main.rb', <<RUBY
-      require './test/tmp/etl-common.rb'
-
-      source DummyClass, 'from', 'main'
-      common_source_declaration
-RUBY
-    control = Kiba.parse IO.read('test/tmp/etl-main.rb')
-
-    assert_equal 2, control.sources.size
-
-    assert_equal %w(from main), control.sources[0][:args]
-    assert_equal %w(from common), control.sources[1][:args]
-
-  ensure
-    remove_files('test/tmp/etl-common.rb', 'test/tmp/etl-main.rb')
-  end
-
   def test_config
     control = Kiba.parse do
       extend Kiba::DSLExtensions::Config
data/test/test_run.rb
ADDED
@@ -0,0 +1,12 @@
+require_relative 'helper'
+require 'minitest/mock'
+
+class TestRun < Kiba::Test
+  def test_ensure_kiba_defaults_to_streaming_runner
+    cb = -> (job) { "Streaming runner called" }
+    Kiba::StreamingRunner.stub(:run, cb) do
+      job = Kiba::Control.new
+      assert_equal "Streaming runner called", Kiba.run(job)
+    end
+  end
+end
data/test/test_runner.rb
CHANGED
data/test/test_streaming_runner.rb
CHANGED
@@ -8,6 +8,11 @@ require_relative 'support/test_non_closing_transform'
 require_relative 'shared_runner_tests'
 
 class TestStreamingRunner < Kiba::Test
+  def kiba_run(job)
+    job.config[:kiba] = {runner: Kiba::StreamingRunner}
+    Kiba.run(job)
+  end
+
   include SharedRunnerTests
 
   def test_yielding_class_transform
@@ -15,10 +20,6 @@ class TestStreamingRunner < Kiba::Test
     destination_array = []
 
     job = Kiba.parse do
-      extend Kiba::DSLExtensions::Config
-
-      config :kiba, runner: Kiba::StreamingRunner
-
       # provide a single row as the input
       source TestEnumerableSource, [input_row]
 
@@ -51,9 +52,6 @@ class TestStreamingRunner < Kiba::Test
   def test_transform_yielding_from_close
     destination_array = []
     job = Kiba.parse do
-      extend Kiba::DSLExtensions::Config
-      config :kiba, runner: Kiba::StreamingRunner
-
       transform CloseYieldingTransform, yield_on_close: [1, 2]
       destination TestArrayDestination, destination_array
     end
@@ -63,9 +61,6 @@ class TestStreamingRunner < Kiba::Test
 
   def test_transform_with_no_close_must_not_raise
     job = Kiba.parse do
-      extend Kiba::DSLExtensions::Config
-      config :kiba, runner: Kiba::StreamingRunner
-
       transform NonClosingTransform
     end
     Kiba.run(job)
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: kiba
 version: !ruby/object:Gem::Version
-  version:
+  version: 3.0.0
 platform: ruby
 authors:
 - Thibaut Barrère
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2020-02-10 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rake
@@ -88,7 +88,6 @@ files:
 - bin/kiba
 - kiba.gemspec
 - lib/kiba.rb
-- lib/kiba/cli.rb
 - lib/kiba/context.rb
 - lib/kiba/control.rb
 - lib/kiba/dsl_extensions/config.rb
@@ -96,10 +95,6 @@ files:
 - lib/kiba/runner.rb
 - lib/kiba/streaming_runner.rb
 - lib/kiba/version.rb
-- test/fixtures/bogus.etl
-- test/fixtures/namespace_conflict.etl
-- test/fixtures/some_extension.rb
-- test/fixtures/valid.etl
 - test/helper.rb
 - test/shared_runner_tests.rb
 - test/support/shared_tests.rb
@@ -115,9 +110,9 @@ files:
 - test/support/test_rename_field_transform.rb
 - test/support/test_source_that_reads_at_instantiation_time.rb
 - test/support/test_yielding_transform.rb
-- test/test_cli.rb
 - test/test_integration.rb
 - test/test_parser.rb
+- test/test_run.rb
 - test/test_runner.rb
 - test/test_streaming_runner.rb
 - test/tmp/.gitkeep
@@ -147,10 +142,6 @@ signing_key:
 specification_version: 4
 summary: Lightweight ETL for Ruby
 test_files:
-- test/fixtures/bogus.etl
-- test/fixtures/namespace_conflict.etl
-- test/fixtures/some_extension.rb
-- test/fixtures/valid.etl
 - test/helper.rb
 - test/shared_runner_tests.rb
 - test/support/shared_tests.rb
@@ -166,9 +157,9 @@ test_files:
 - test/support/test_rename_field_transform.rb
 - test/support/test_source_that_reads_at_instantiation_time.rb
 - test/support/test_yielding_transform.rb
-- test/test_cli.rb
 - test/test_integration.rb
 - test/test_parser.rb
+- test/test_run.rb
 - test/test_runner.rb
 - test/test_streaming_runner.rb
 - test/tmp/.gitkeep
data/lib/kiba/cli.rb
DELETED
@@ -1,16 +0,0 @@
-require 'kiba'
-
-module Kiba
-  class Cli
-    def self.run(args)
-      unless args.size == 1
-        puts 'Syntax: kiba your-script.etl'
-        exit(-1)
-      end
-      filename = args[0]
-      script_content = IO.read(filename)
-      job_definition = Kiba.parse(script_content, filename)
-      Kiba.run(job_definition)
-    end
-  end
-end
data/test/fixtures/bogus.etl
DELETED
data/test/fixtures/namespace_conflict.etl
DELETED
@@ -1,9 +0,0 @@
-fail "Context should not be visible without Kiba namespace" if defined?(Context)
-fail "Control should not be visible without Kiba namespace" if defined?(Control)
-fail "Parser should not be visible without Kiba namespace" if defined?(Parser)
-fail "Config should not be visible without Kiba namespace" if defined?(DSLExtensions::Config)
-
-# verify Kiba config (namespaced under Kiba::DSLExtensions::Config)
-# isn't causing troubles to implementers using a top-level DSLExtensions module
-require_relative 'some_extension'
-extend DSLExtensions::SomeExtension
data/test/fixtures/valid.etl
DELETED
@@ -1 +0,0 @@
-# this does nothing
data/test/test_cli.rb
DELETED
@@ -1,21 +0,0 @@
-require_relative 'helper'
-require 'kiba/cli'
-
-class TestCli < Kiba::Test
-  def test_cli_launches
-    Kiba::Cli.run([fixture('valid.etl')])
-  end
-
-  def test_cli_reports_filename_and_lineno
-    exception = assert_raises(NameError) do
-      Kiba::Cli.run([fixture('bogus.etl')])
-    end
-
-    assert_match(/uninitialized constant(.*)UnknownThing/, exception.message)
-    assert_includes exception.backtrace.to_s, 'test/fixtures/bogus.etl:2:in'
-  end
-
-  def test_namespace_conflict
-    Kiba::Cli.run([fixture('namespace_conflict.etl')])
-  end
-end