kiba 2.5.0 → 3.0.0
- checksums.yaml +4 -4
- data/.travis.yml +1 -3
- data/Changes.md +8 -0
- data/Pro-Changes.md +63 -2
- data/bin/kiba +12 -2
- data/lib/kiba.rb +1 -1
- data/lib/kiba/parser.rb +6 -22
- data/lib/kiba/runner.rb +1 -1
- data/lib/kiba/version.rb +1 -1
- data/test/shared_runner_tests.rb +0 -4
- data/test/test_parser.rb +0 -33
- data/test/test_run.rb +12 -0
- data/test/test_runner.rb +5 -0
- data/test/test_streaming_runner.rb +5 -10
- metadata +4 -13
- data/lib/kiba/cli.rb +0 -16
- data/test/fixtures/bogus.etl +0 -2
- data/test/fixtures/namespace_conflict.etl +0 -9
- data/test/fixtures/some_extension.rb +0 -4
- data/test/fixtures/valid.etl +0 -1
- data/test/test_cli.rb +0 -21
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a5cf22f1013b65b59d9c91b9f629749b47b5474b3f500d5ca54118b6b6a46c4f
+  data.tar.gz: cae5e05fabc7c8d169d47d6aa6a7da8befd6711ad2d2471c2c0f5f1d3e604841
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 2ed99641ab799ee57f213525f6ca3ab4b9a9d5e0b7680115bef8e139d32dcfb350fb0117065b5be0ded6a886c21988413b2963e2cbea546a7a1d5691ed078f27
+  data.tar.gz: 24434d65dcf9bb81ce0f575860edbdf92598a1e990663347a599eb42155211c02b9c9eb0304c869cf85c8cd15df8c262afcf9cdf11bb7568323ea98b393b9518
data/.travis.yml
CHANGED
@@ -3,12 +3,10 @@ matrix:
   include:
     # see https://www.ruby-lang.org/en/downloads/branches/
     - rvm: ruby-head
+    - rvm: 2.7
     - rvm: 2.6
     - rvm: 2.5
     - rvm: 2.4
-    # NOTE: EOL since 2019-03-31
-    - rvm: 2.3
     # see https://www.jruby.org/download
-    - rvm: jruby-9.1
     - rvm: jruby-9.2
     - rvm: truffleruby
data/Changes.md
CHANGED
@@ -1,3 +1,11 @@
+3.0.0
+-----
+
+- Breaking: the `kiba` command line is deprecated to encourage using `Kiba.parse` API. See [#81](https://github.com/thbar/kiba/pull/81) and release notes for details & migration path.
+- Kiba now defaults to `StreamingRunner` (backward compatible & more powerful engine) [#83](https://github.com/thbar/kiba/pull/83).
+- Kiba now officially supports MRI Ruby 2.4+ (although 2.3 will still work for now), JRuby 9.2+ or TruffleRuby.
+- You may get warnings with Ruby 2.7 and errors with Ruby 2.8+. See [#85] for status on Ruby 3 keyword arguments support.
+
 2.5.0
 -----
 
data/Pro-Changes.md
CHANGED
@@ -3,8 +3,69 @@ Kiba Pro Changelog
 
 Kiba Pro is the commercial extension for Kiba. Documentation is available on the [Wiki](https://github.com/thbar/kiba/wiki).
 
-
-
+1.5.0
+-----
+
+- Compatibility with Kiba v3
+- BREAKING CHANGE: deprecate non-live Sequel connection passing (https://github.com/thbar/kiba/issues/79). Do not use `database: "connection_string"`, instead pass your `Sequel` connection directly. This moves the connection management out of the destination, which is a better pattern & provides better (block-based) resources closing.
+- Official MySQL support:
+  - While the compatibility was already here, it is now tested for in our QA testing suite.
+  - MySQL 5.5-8.0 is supported & tested
+  - MariaDB should be supported (although not tested against in the QA testing suite)
+  - Amazon Aurora MySQL is also supposed to work (although not tested)
+  - `Kiba::Pro::Sources::SQL` supports for non-streaming + streaming use
+  - `Kiba::Pro::Destinations::SQLBulkInsert` supports:
+    - Bulk insert
+    - Bulk insert with ignore
+    - Bulk upsert (including with dynamically computed columns) via `ON DUPLICATE KEY UPDATE`
+  - Note that the `Kiba::Pro::Destinations::SQLUpsert` (row-by-row) is not MySQL compatible at the moment
+
+1.2.0
+-----
+
+- `SQL` source improvements:
+  - Deprecate use_cursor in favor of block query construct. The source could previously be configured with:
+
+    ```ruby
+    source Kiba::Pro::Sources::SQL,
+      query: "SELECT * FROM items",
+      use_cursor: true
+    ```
+
+    The `use_cursor` keyword is now deprecated. You can use the more powerful block query construct:
+
+    ```ruby
+    source Kiba::Pro::Sources::SQL,
+      query: -> (db) { db["SELECT * FROM items"].use_cursor },
+    ```
+
+  - Avoid bogus nested SQL calls when configuring the query via block/proc. A call with:
+
+    ```ruby
+    source Kiba::Pro::Sources::SQL,
+      query: -> (db) { db["SELECT * FROM items"] },
+    ```
+
+    would have previously generated a `SELECT * FROM (SELECT * FROM "items")`. This is now fixed.
+
+  - Add specs around streaming support (for both MySQL and Postgres).
+
+    For Postgres, streaming was [recommended by the author of Sequel](https://groups.google.com/d/msg/sequel-talk/olznPcmEf8M/hd5Ris0pYNwJ) over `use_cursor: true` (but do compare on your actual cases!). To enable streaming for Postgres:
+    - Add `sequel_pg` to your `Gemfile`
+    - Enable the extension in your `db` instance & add `.stream` to your dataset e.g.:
+
+    ```ruby
+    Sequel.connect(ENV.fetch('DATABASE_URL')) do |db|
+      db.extension(:pg_streaming)
+      Kiba.run(Kiba.parse do
+        source Kiba::Pro::Sources::SQL,
+          db: db,
+          query: -> (db) { db[:items].stream }
+        # SNIP
+      end)
+    ```
+
+    For MySQL, just add `.stream` to your dataset like above (no extension required).
 
 1.1.0
 -----
data/bin/kiba
CHANGED
@@ -1,5 +1,15 @@
 #!/usr/bin/env ruby
 
-
+STDERR.puts <<DOC
 
-
+##########################################################################
+
+The 'kiba' CLI is deprecated and has been removed in Kiba ETL v3.
+
+See release notes / changelog for help.
+
+##########################################################################
+
+DOC
+
+exit(1)
data/lib/kiba.rb
CHANGED
@@ -13,7 +13,7 @@ Kiba.extend(Kiba::Parser)
 module Kiba
   def self.run(job)
     # NOTE: use Hash#dig when Ruby 2.2 reaches EOL
-    runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::
+    runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::StreamingRunner)
     runner.run(job)
   end
 end
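Review note: the one-line change in `Kiba.run` only swaps the fallback handed to the inner `fetch`, so a job that sets `config[:kiba][:runner]` explicitly should keep whatever runner it pins. A minimal sketch of that lookup follows. The names `SketchStreamingRunner`, `SketchLegacyRunner`, `SketchJob` and `resolve_runner` are hypothetical stand-ins invented for illustration and are not part of Kiba.

```ruby
# Hypothetical stand-ins (NOT Kiba's classes): just enough structure
# to show how the nested fetch picks a runner.
module SketchStreamingRunner
  def self.run(_job)
    :streaming
  end
end

module SketchLegacyRunner
  def self.run(_job)
    :legacy
  end
end

# Assumed job shape: an object exposing a `config` hash.
class SketchJob
  attr_reader :config

  def initialize(config = {})
    @config = config
  end
end

def resolve_runner(job)
  # Same shape as the changed line: use the streaming runner unless
  # config[:kiba][:runner] is set explicitly.
  job.config.fetch(:kiba, {}).fetch(:runner, SketchStreamingRunner)
end

default_job = SketchJob.new
pinned_job  = SketchJob.new({ kiba: { runner: SketchLegacyRunner } })

puts resolve_runner(default_job).run(default_job) # prints "streaming"
puts resolve_runner(pinned_job).run(pinned_job)   # prints "legacy"
```

Under these assumptions, only jobs that never set a runner are affected by the new default.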
data/lib/kiba/parser.rb
CHANGED
@@ -1,26 +1,10 @@
-
-
-
-
-
-# as by default instance_eval will resolve references by adding
-# the module containing the parser class (initially "Kiba").
-# Now, the classes appear to be further hidden from the user,
-# as Kiba::Parser is its own module.
-# This allows the user to create a Parser, Context, Control class
-# without it being interpreted as reopening Kiba::Parser, Kiba::Context,
-# etc.
-# See test in test_cli.rb (test_namespace_conflict)
-module Kiba::Parser
-  def parse(source_as_string = nil, source_file = nil, &source_as_block)
-    control = Kiba::Control.new
-    context = Kiba::Context.new(control)
-    if source_as_string
-      # this somewhat weird construct allows to remove a nil source_file
-      context.instance_eval(*[source_as_string, source_file].compact)
-    else
+module Kiba
+  module Parser
+    def parse(&source_as_block)
+      control = Kiba::Control.new
+      context = Kiba::Context.new(control)
       context.instance_eval(&source_as_block)
+      control
     end
-    control
   end
 end
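Review note: with the string and file arguments gone, `parse` reduces to evaluating a block against a context object that records declarations into a control object. A minimal sketch of that pattern follows. `SketchControl`, `SketchContext` and `SketchParser` are hypothetical miniatures written for this note, and the single `source` keyword is an assumption about the DSL's shape based on the tests in this diff, not a copy of Kiba's implementation.

```ruby
# Hypothetical miniature of a Control/Context pair; the "Sketch"
# names are illustrative and not part of Kiba.
class SketchControl
  attr_reader :sources

  def initialize
    @sources = []
  end
end

class SketchContext
  def initialize(control)
    @control = control
  end

  # DSL keyword: records the declaration instead of instantiating it.
  def source(klass, *args)
    @control.sources << { klass: klass, args: args }
  end
end

module SketchParser
  # Block-only signature, mirroring `parse(&source_as_block)`.
  def self.parse(&source_as_block)
    control = SketchControl.new
    context = SketchContext.new(control)
    # The block runs with `self` set to the context, so bare calls
    # such as `source ...` resolve to SketchContext#source.
    context.instance_eval(&source_as_block)
    control
  end
end

control = SketchParser.parse do
  source Array, 'from', 'block'
end

puts control.sources.size              # prints 1
puts control.sources[0][:args].inspect # prints ["from", "block"]
```

Because the job definition is now an ordinary Ruby block, constants inside it resolve through normal lexical scope, which may be why the namespace-conflict comments and fixtures could be dropped along with the string-based path.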
data/lib/kiba/runner.rb
CHANGED
data/lib/kiba/version.rb
CHANGED
data/test/shared_runner_tests.rb
CHANGED
data/test/test_parser.rb
CHANGED
@@ -68,39 +68,6 @@ class TestParser < Kiba::Test
     assert_instance_of Proc, control.pre_processes[0][:block]
   end
 
-  def test_source_as_string_parsing
-    control = Kiba.parse <<RUBY
-      source DummyClass, 'from', 'file'
-RUBY
-
-    assert_equal 1, control.sources.size
-    assert_equal DummyClass, control.sources[0][:klass]
-    assert_equal %w(from file), control.sources[0][:args]
-  end
-
-  def test_source_as_file_doing_require
-    IO.write 'test/tmp/etl-common.rb', <<RUBY
-      def common_source_declaration
-        source DummyClass, 'from', 'common'
-      end
-RUBY
-    IO.write 'test/tmp/etl-main.rb', <<RUBY
-      require './test/tmp/etl-common.rb'
-
-      source DummyClass, 'from', 'main'
-      common_source_declaration
-RUBY
-    control = Kiba.parse IO.read('test/tmp/etl-main.rb')
-
-    assert_equal 2, control.sources.size
-
-    assert_equal %w(from main), control.sources[0][:args]
-    assert_equal %w(from common), control.sources[1][:args]
-
-  ensure
-    remove_files('test/tmp/etl-common.rb', 'test/tmp/etl-main.rb')
-  end
-
   def test_config
     control = Kiba.parse do
       extend Kiba::DSLExtensions::Config
data/test/test_run.rb
ADDED
@@ -0,0 +1,12 @@
+require_relative 'helper'
+require 'minitest/mock'
+
+class TestRun < Kiba::Test
+  def test_ensure_kiba_defaults_to_streaming_runner
+    cb = -> (job) { "Streaming runner called" }
+    Kiba::StreamingRunner.stub(:run, cb) do
+      job = Kiba::Control.new
+      assert_equal "Streaming runner called", Kiba.run(job)
+    end
+  end
+end
data/test/test_runner.rb
CHANGED
data/test/test_streaming_runner.rb
CHANGED
@@ -8,6 +8,11 @@ require_relative 'support/test_non_closing_transform'
 require_relative 'shared_runner_tests'
 
 class TestStreamingRunner < Kiba::Test
+  def kiba_run(job)
+    job.config[:kiba] = {runner: Kiba::StreamingRunner}
+    Kiba.run(job)
+  end
+
   include SharedRunnerTests
 
   def test_yielding_class_transform
@@ -15,10 +20,6 @@ class TestStreamingRunner < Kiba::Test
     destination_array = []
 
     job = Kiba.parse do
-      extend Kiba::DSLExtensions::Config
-
-      config :kiba, runner: Kiba::StreamingRunner
-
       # provide a single row as the input
       source TestEnumerableSource, [input_row]
 
@@ -51,9 +52,6 @@ class TestStreamingRunner < Kiba::Test
   def test_transform_yielding_from_close
     destination_array = []
     job = Kiba.parse do
-      extend Kiba::DSLExtensions::Config
-      config :kiba, runner: Kiba::StreamingRunner
-
       transform CloseYieldingTransform, yield_on_close: [1, 2]
       destination TestArrayDestination, destination_array
     end
@@ -63,9 +61,6 @@ class TestStreamingRunner < Kiba::Test
 
   def test_transform_with_no_close_must_not_raise
     job = Kiba.parse do
-      extend Kiba::DSLExtensions::Config
-      config :kiba, runner: Kiba::StreamingRunner
-
       transform NonClosingTransform
     end
     Kiba.run(job)
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: kiba
 version: !ruby/object:Gem::Version
-  version:
+  version: 3.0.0
 platform: ruby
 authors:
 - Thibaut Barrère
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2020-02-10 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rake
@@ -88,7 +88,6 @@ files:
 - bin/kiba
 - kiba.gemspec
 - lib/kiba.rb
-- lib/kiba/cli.rb
 - lib/kiba/context.rb
 - lib/kiba/control.rb
 - lib/kiba/dsl_extensions/config.rb
@@ -96,10 +95,6 @@ files:
 - lib/kiba/runner.rb
 - lib/kiba/streaming_runner.rb
 - lib/kiba/version.rb
-- test/fixtures/bogus.etl
-- test/fixtures/namespace_conflict.etl
-- test/fixtures/some_extension.rb
-- test/fixtures/valid.etl
 - test/helper.rb
 - test/shared_runner_tests.rb
 - test/support/shared_tests.rb
@@ -115,9 +110,9 @@ files:
 - test/support/test_rename_field_transform.rb
 - test/support/test_source_that_reads_at_instantiation_time.rb
 - test/support/test_yielding_transform.rb
-- test/test_cli.rb
 - test/test_integration.rb
 - test/test_parser.rb
+- test/test_run.rb
 - test/test_runner.rb
 - test/test_streaming_runner.rb
 - test/tmp/.gitkeep
@@ -147,10 +142,6 @@ signing_key:
 specification_version: 4
 summary: Lightweight ETL for Ruby
 test_files:
-- test/fixtures/bogus.etl
-- test/fixtures/namespace_conflict.etl
-- test/fixtures/some_extension.rb
-- test/fixtures/valid.etl
 - test/helper.rb
 - test/shared_runner_tests.rb
 - test/support/shared_tests.rb
@@ -166,9 +157,9 @@ test_files:
 - test/support/test_rename_field_transform.rb
 - test/support/test_source_that_reads_at_instantiation_time.rb
 - test/support/test_yielding_transform.rb
-- test/test_cli.rb
 - test/test_integration.rb
 - test/test_parser.rb
+- test/test_run.rb
 - test/test_runner.rb
 - test/test_streaming_runner.rb
 - test/tmp/.gitkeep
data/lib/kiba/cli.rb
DELETED
@@ -1,16 +0,0 @@
-require 'kiba'
-
-module Kiba
-  class Cli
-    def self.run(args)
-      unless args.size == 1
-        puts 'Syntax: kiba your-script.etl'
-        exit(-1)
-      end
-      filename = args[0]
-      script_content = IO.read(filename)
-      job_definition = Kiba.parse(script_content, filename)
-      Kiba.run(job_definition)
-    end
-  end
-end
data/test/fixtures/bogus.etl
DELETED
data/test/fixtures/namespace_conflict.etl
DELETED
@@ -1,9 +0,0 @@
-fail "Context should not be visible without Kiba namespace" if defined?(Context)
-fail "Control should not be visible without Kiba namespace" if defined?(Control)
-fail "Parser should not be visible without Kiba namespace" if defined?(Parser)
-fail "Config should not be visible without Kiba namespace" if defined?(DSLExtensions::Config)
-
-# verify Kiba config (namespaced under Kiba::DSLExtensions::Config)
-# isn't causing troubles to implementers using a top-level DSLExtensions module
-require_relative 'some_extension'
-extend DSLExtensions::SomeExtension
data/test/fixtures/valid.etl
DELETED
@@ -1 +0,0 @@
-# this does nothing
data/test/test_cli.rb
DELETED
@@ -1,21 +0,0 @@
-require_relative 'helper'
-require 'kiba/cli'
-
-class TestCli < Kiba::Test
-  def test_cli_launches
-    Kiba::Cli.run([fixture('valid.etl')])
-  end
-
-  def test_cli_reports_filename_and_lineno
-    exception = assert_raises(NameError) do
-      Kiba::Cli.run([fixture('bogus.etl')])
-    end
-
-    assert_match(/uninitialized constant(.*)UnknownThing/, exception.message)
-    assert_includes exception.backtrace.to_s, 'test/fixtures/bogus.etl:2:in'
-  end
-
-  def test_namespace_conflict
-    Kiba::Cli.run([fixture('namespace_conflict.etl')])
-  end
-end