kiba 2.5.0 → 3.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: aebe72316fa16e820c9c11b7a2a274b05b77abed7eda8799858fe00a4279e2db
4
- data.tar.gz: 686b359e9b74684472d17491c98a32adddd30603186b4f6ec7b6a40263d47d19
3
+ metadata.gz: a5cf22f1013b65b59d9c91b9f629749b47b5474b3f500d5ca54118b6b6a46c4f
4
+ data.tar.gz: cae5e05fabc7c8d169d47d6aa6a7da8befd6711ad2d2471c2c0f5f1d3e604841
5
5
  SHA512:
6
- metadata.gz: 41be9f07f5c7bef36d85c97bc3a9224c7f885de92f458407064d7fa153f8a0ed1443a626b82a4e97f1269154aac0aaca738b8ed63c596953cc1059ae0657f4d3
7
- data.tar.gz: d03d4e66991f0b903ca6b86eda7ae33790218dc84d9de906337a0ddae709a7cbfa6adb9432bbb812396e6a122c942f2cecc8a95ee16d184714e6625cac411905
6
+ metadata.gz: 2ed99641ab799ee57f213525f6ca3ab4b9a9d5e0b7680115bef8e139d32dcfb350fb0117065b5be0ded6a886c21988413b2963e2cbea546a7a1d5691ed078f27
7
+ data.tar.gz: 24434d65dcf9bb81ce0f575860edbdf92598a1e990663347a599eb42155211c02b9c9eb0304c869cf85c8cd15df8c262afcf9cdf11bb7568323ea98b393b9518
@@ -3,12 +3,10 @@ matrix:
3
3
  include:
4
4
  # see https://www.ruby-lang.org/en/downloads/branches/
5
5
  - rvm: ruby-head
6
+ - rvm: 2.7
6
7
  - rvm: 2.6
7
8
  - rvm: 2.5
8
9
  - rvm: 2.4
9
- # NOTE: EOL since 2019-03-31
10
- - rvm: 2.3
11
10
  # see https://www.jruby.org/download
12
- - rvm: jruby-9.1
13
11
  - rvm: jruby-9.2
14
12
  - rvm: truffleruby
data/Changes.md CHANGED
@@ -1,3 +1,11 @@
1
+ 3.0.0
2
+ -----
3
+
4
+ - Breaking: the `kiba` command line is deprecated to encourage use of the `Kiba.parse` API. See [#81](https://github.com/thbar/kiba/pull/81) and the release notes for details and the migration path.
5
+ - Kiba now defaults to `StreamingRunner` (backward compatible & more powerful engine) [#83](https://github.com/thbar/kiba/pull/83).
6
+ - Kiba now officially supports MRI Ruby 2.4+ (although 2.3 will still work for now), JRuby 9.2+ or TruffleRuby.
7
+ - You may get warnings with Ruby 2.7 and errors with Ruby 2.8+. See [#85] for status on Ruby 3 keyword arguments support.
8
+
1
9
  2.5.0
2
10
  -----
3
11
 
@@ -3,8 +3,69 @@ Kiba Pro Changelog
3
3
 
4
4
  Kiba Pro is the commercial extension for Kiba. Documentation is available on the [Wiki](https://github.com/thbar/kiba/wiki).
5
5
 
6
- HEAD
7
- -------
6
+ 1.5.0
7
+ -----
8
+
9
+ - Compatibility with Kiba v3
10
+ - BREAKING CHANGE: deprecate non-live Sequel connection passing (https://github.com/thbar/kiba/issues/79). Do not use `database: "connection_string"`; instead, pass your `Sequel` connection directly. This moves connection management out of the destination, which is a better pattern & provides better (block-based) resource closing.
11
+ - Official MySQL support:
12
+ - While the compatibility was already there, it is now covered by our QA test suite.
13
+ - MySQL 5.5-8.0 is supported & tested
14
+ - MariaDB should be supported (although it is not tested in the QA test suite)
15
+ - Amazon Aurora MySQL is also expected to work (although not tested)
16
+ - `Kiba::Pro::Sources::SQL` supports both non-streaming and streaming use
17
+ - `Kiba::Pro::Destinations::SQLBulkInsert` supports:
18
+ - Bulk insert
19
+ - Bulk insert with ignore
20
+ - Bulk upsert (including with dynamically computed columns) via `ON DUPLICATE KEY UPDATE`
21
+ - Note that the `Kiba::Pro::Destinations::SQLUpsert` (row-by-row) is not MySQL compatible at the moment
22
+
23
+ 1.2.0
24
+ -----
25
+
26
+ - `SQL` source improvements:
27
+ - Deprecate use_cursor in favor of block query construct. The source could previously be configured with:
28
+
29
+ ```ruby
30
+ source Kiba::Pro::Sources::SQL,
31
+ query: "SELECT * FROM items",
32
+ use_cursor: true
33
+ ```
34
+
35
+ The `use_cursor` keyword is now deprecated. You can use the more powerful block query construct:
36
+
37
+ ```ruby
38
+ source Kiba::Pro::Sources::SQL,
39
+ query: -> (db) { db["SELECT * FROM items"].use_cursor }
40
+ ```
41
+
42
+ - Avoid bogus nested SQL calls when configuring the query via block/proc. A call with:
43
+
44
+ ```ruby
45
+ source Kiba::Pro::Sources::SQL,
46
+ query: -> (db) { db["SELECT * FROM items"] }
47
+ ```
48
+
49
+ would previously have generated `SELECT * FROM (SELECT * FROM "items")`. This is now fixed.
50
+
51
+ - Add specs around streaming support (for both MySQL and Postgres).
52
+
53
+ For Postgres, streaming was [recommended by the author of Sequel](https://groups.google.com/d/msg/sequel-talk/olznPcmEf8M/hd5Ris0pYNwJ) over `use_cursor: true` (but do compare on your actual cases!). To enable streaming for Postgres:
54
+ - Add `sequel_pg` to your `Gemfile`
55
+ - Enable the extension in your `db` instance & add `.stream` to your dataset e.g.:
56
+
57
+ ```ruby
58
+ Sequel.connect(ENV.fetch('DATABASE_URL')) do |db|
59
+ db.extension(:pg_streaming)
60
+ Kiba.run(Kiba.parse do
61
+ source Kiba::Pro::Sources::SQL,
62
+ db: db,
63
+ query: -> (db) { db[:items].stream }
64
+ # SNIP
65
+ end)
+ end
66
+ ```
67
+
68
+ For MySQL, just add `.stream` to your dataset like above (no extension required).
8
69
 
9
70
  1.1.0
10
71
  -----
data/bin/kiba CHANGED
@@ -1,5 +1,15 @@
1
1
  #!/usr/bin/env ruby
2
2
 
3
- require_relative '../lib/kiba/cli'
3
+ STDERR.puts <<DOC
4
4
 
5
- Kiba::Cli.run(ARGV)
5
+ ##########################################################################
6
+
7
+ The 'kiba' CLI is deprecated and has been removed in Kiba ETL v3.
8
+
9
+ See release notes / changelog for help.
10
+
11
+ ##########################################################################
12
+
13
+ DOC
14
+
15
+ exit(1)
@@ -13,7 +13,7 @@ Kiba.extend(Kiba::Parser)
13
13
  module Kiba
14
14
  def self.run(job)
15
15
  # NOTE: use Hash#dig when Ruby 2.2 reaches EOL
16
- runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::Runner)
16
+ runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::StreamingRunner)
17
17
  runner.run(job)
18
18
  end
19
19
  end
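The hunk above changes only the fallback passed to the inner `fetch`, so a job that sets no runner now gets the streaming one. A minimal sketch of that lookup follows. The modules `Sketch::Runner` and `Sketch::StreamingRunner` and the class `Sketch::Job` are hypothetical stand-ins for illustration, not the real Kiba classes:

```ruby
# Stand-ins for illustration only; these are NOT the real Kiba classes.
module Sketch
  module Runner
    def self.run(_job)
      :classic
    end
  end

  module StreamingRunner
    def self.run(_job)
      :streaming
    end
  end

  # Hypothetical job object exposing a config hash, as job.config does in the diff.
  class Job
    attr_reader :config

    def initialize(config)
      @config = config
    end
  end

  # Same nested-fetch shape as the hunk above: the second argument of each
  # fetch is only used when the key is absent.
  def self.run(job)
    runner = job.config.fetch(:kiba, {}).fetch(:runner, StreamingRunner)
    runner.run(job)
  end
end

# No :kiba config at all: falls back to the streaming stand-in.
default_result = Sketch.run(Sketch::Job.new({}))

# Explicit runner in the config: the fallback is ignored.
explicit_result = Sketch.run(Sketch::Job.new({ kiba: { runner: Sketch::Runner } }))
```

Under this reading, a job that needs the previous engine can still set the runner explicitly, which is what `TestRunner#kiba_run` does in this diff with `job.config[:kiba] = {runner: Kiba::Runner}`.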
@@ -1,26 +1,10 @@
1
- # NOTE: using the "Kiba::Parser" declaration, as I discovered,
2
- # provides increased isolation to the declared ETL script, compared
3
- # to 2 nested modules.
4
- # Before that, a user creating entities named Control, Context
5
- # or DSLExtensions would see a conflict with Kiba own classes,
6
- # as by default instance_eval will resolve references by adding
7
- # the module containing the parser class (initially "Kiba").
8
- # Now, the classes appear to be further hidden from the user,
9
- # as Kiba::Parser is its own module.
10
- # This allows the user to create a Parser, Context, Control class
11
- # without it being interpreted as reopening Kiba::Parser, Kiba::Context,
12
- # etc.
13
- # See test in test_cli.rb (test_namespace_conflict)
14
- module Kiba::Parser
15
- def parse(source_as_string = nil, source_file = nil, &source_as_block)
16
- control = Kiba::Control.new
17
- context = Kiba::Context.new(control)
18
- if source_as_string
19
- # this somewhat weird construct allows to remove a nil source_file
20
- context.instance_eval(*[source_as_string, source_file].compact)
21
- else
1
+ module Kiba
2
+ module Parser
3
+ def parse(&source_as_block)
4
+ control = Kiba::Control.new
5
+ context = Kiba::Context.new(control)
22
6
  context.instance_eval(&source_as_block)
7
+ control
23
8
  end
24
- control
25
9
  end
26
10
  end
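With the string and file arguments gone, `parse` accepts only a block, which it evaluates against a context object before returning the control object. The sketch below imitates that shape with hypothetical stand-ins (`ParserSketch::Control`, `ParserSketch::Context`, and a single `source` keyword); the real Kiba classes carry more behaviour than is shown here:

```ruby
# Hypothetical stand-ins; NOT the real Kiba::Control / Kiba::Context.
module ParserSketch
  class Control
    # Collected source declarations, in declaration order.
    def sources
      @sources ||= []
    end
  end

  class Context
    def initialize(control)
      @control = control
    end

    # DSL keyword: records the declaration on the control object.
    def source(klass, *args)
      @control.sources << { klass: klass, args: args }
    end
  end

  # Block-only parse, mirroring the shape of the new Kiba::Parser#parse.
  def self.parse(&source_as_block)
    control = Control.new
    context = Context.new(control)
    # The block runs with the context as self, so bare `source` calls resolve to it.
    context.instance_eval(&source_as_block)
    control
  end
end

control = ParserSketch.parse do
  source Array, 'from', 'block'
end
```

Code that previously passed a script as a string, as the removed `Kiba::Cli.run` did with `Kiba.parse(script_content, filename)`, would presumably need to declare the job inside a block instead.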
@@ -71,7 +71,7 @@ module Kiba
71
71
  elsif block
72
72
  fail 'Block form is not allowed here' unless allow_block
73
73
  AliasingProc.new(&block)
74
- elsif
74
+ else
75
75
  fail 'Nil parameters not allowed here'
76
76
  end
77
77
  end
@@ -1,3 +1,3 @@
1
1
  module Kiba
2
- VERSION = '2.5.0'
2
+ VERSION = '3.0.0'
3
3
  end
@@ -3,10 +3,6 @@ require_relative 'support/test_enumerable_source'
3
3
  require_relative 'support/test_destination_returning_nil'
4
4
 
5
5
  module SharedRunnerTests
6
- def kiba_run(job)
7
- Kiba.run(job)
8
- end
9
-
10
6
  def rows
11
7
  @rows ||= [
12
8
  { identifier: 'first-row' },
@@ -68,39 +68,6 @@ class TestParser < Kiba::Test
68
68
  assert_instance_of Proc, control.pre_processes[0][:block]
69
69
  end
70
70
 
71
- def test_source_as_string_parsing
72
- control = Kiba.parse <<RUBY
73
- source DummyClass, 'from', 'file'
74
- RUBY
75
-
76
- assert_equal 1, control.sources.size
77
- assert_equal DummyClass, control.sources[0][:klass]
78
- assert_equal %w(from file), control.sources[0][:args]
79
- end
80
-
81
- def test_source_as_file_doing_require
82
- IO.write 'test/tmp/etl-common.rb', <<RUBY
83
- def common_source_declaration
84
- source DummyClass, 'from', 'common'
85
- end
86
- RUBY
87
- IO.write 'test/tmp/etl-main.rb', <<RUBY
88
- require './test/tmp/etl-common.rb'
89
-
90
- source DummyClass, 'from', 'main'
91
- common_source_declaration
92
- RUBY
93
- control = Kiba.parse IO.read('test/tmp/etl-main.rb')
94
-
95
- assert_equal 2, control.sources.size
96
-
97
- assert_equal %w(from main), control.sources[0][:args]
98
- assert_equal %w(from common), control.sources[1][:args]
99
-
100
- ensure
101
- remove_files('test/tmp/etl-common.rb', 'test/tmp/etl-main.rb')
102
- end
103
-
104
71
  def test_config
105
72
  control = Kiba.parse do
106
73
  extend Kiba::DSLExtensions::Config
@@ -0,0 +1,12 @@
1
+ require_relative 'helper'
2
+ require 'minitest/mock'
3
+
4
+ class TestRun < Kiba::Test
5
+ def test_ensure_kiba_defaults_to_streaming_runner
6
+ cb = -> (job) { "Streaming runner called" }
7
+ Kiba::StreamingRunner.stub(:run, cb) do
8
+ job = Kiba::Control.new
9
+ assert_equal "Streaming runner called", Kiba.run(job)
10
+ end
11
+ end
12
+ end
@@ -2,5 +2,10 @@ require_relative 'helper'
2
2
  require_relative 'shared_runner_tests'
3
3
 
4
4
  class TestRunner < Kiba::Test
5
+ def kiba_run(job)
6
+ job.config[:kiba] = {runner: Kiba::Runner}
7
+ Kiba.run(job)
8
+ end
9
+
5
10
  include SharedRunnerTests
6
11
  end
@@ -8,6 +8,11 @@ require_relative 'support/test_non_closing_transform'
8
8
  require_relative 'shared_runner_tests'
9
9
 
10
10
  class TestStreamingRunner < Kiba::Test
11
+ def kiba_run(job)
12
+ job.config[:kiba] = {runner: Kiba::StreamingRunner}
13
+ Kiba.run(job)
14
+ end
15
+
11
16
  include SharedRunnerTests
12
17
 
13
18
  def test_yielding_class_transform
@@ -15,10 +20,6 @@ class TestStreamingRunner < Kiba::Test
15
20
  destination_array = []
16
21
 
17
22
  job = Kiba.parse do
18
- extend Kiba::DSLExtensions::Config
19
-
20
- config :kiba, runner: Kiba::StreamingRunner
21
-
22
23
  # provide a single row as the input
23
24
  source TestEnumerableSource, [input_row]
24
25
 
@@ -51,9 +52,6 @@ class TestStreamingRunner < Kiba::Test
51
52
  def test_transform_yielding_from_close
52
53
  destination_array = []
53
54
  job = Kiba.parse do
54
- extend Kiba::DSLExtensions::Config
55
- config :kiba, runner: Kiba::StreamingRunner
56
-
57
55
  transform CloseYieldingTransform, yield_on_close: [1, 2]
58
56
  destination TestArrayDestination, destination_array
59
57
  end
@@ -63,9 +61,6 @@ class TestStreamingRunner < Kiba::Test
63
61
 
64
62
  def test_transform_with_no_close_must_not_raise
65
63
  job = Kiba.parse do
66
- extend Kiba::DSLExtensions::Config
67
- config :kiba, runner: Kiba::StreamingRunner
68
-
69
64
  transform NonClosingTransform
70
65
  end
71
66
  Kiba.run(job)
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: kiba
3
3
  version: !ruby/object:Gem::Version
4
- version: 2.5.0
4
+ version: 3.0.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Thibaut Barrère
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2019-05-29 00:00:00.000000000 Z
11
+ date: 2020-02-10 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rake
@@ -88,7 +88,6 @@ files:
88
88
  - bin/kiba
89
89
  - kiba.gemspec
90
90
  - lib/kiba.rb
91
- - lib/kiba/cli.rb
92
91
  - lib/kiba/context.rb
93
92
  - lib/kiba/control.rb
94
93
  - lib/kiba/dsl_extensions/config.rb
@@ -96,10 +95,6 @@ files:
96
95
  - lib/kiba/runner.rb
97
96
  - lib/kiba/streaming_runner.rb
98
97
  - lib/kiba/version.rb
99
- - test/fixtures/bogus.etl
100
- - test/fixtures/namespace_conflict.etl
101
- - test/fixtures/some_extension.rb
102
- - test/fixtures/valid.etl
103
98
  - test/helper.rb
104
99
  - test/shared_runner_tests.rb
105
100
  - test/support/shared_tests.rb
@@ -115,9 +110,9 @@ files:
115
110
  - test/support/test_rename_field_transform.rb
116
111
  - test/support/test_source_that_reads_at_instantiation_time.rb
117
112
  - test/support/test_yielding_transform.rb
118
- - test/test_cli.rb
119
113
  - test/test_integration.rb
120
114
  - test/test_parser.rb
115
+ - test/test_run.rb
121
116
  - test/test_runner.rb
122
117
  - test/test_streaming_runner.rb
123
118
  - test/tmp/.gitkeep
@@ -147,10 +142,6 @@ signing_key:
147
142
  specification_version: 4
148
143
  summary: Lightweight ETL for Ruby
149
144
  test_files:
150
- - test/fixtures/bogus.etl
151
- - test/fixtures/namespace_conflict.etl
152
- - test/fixtures/some_extension.rb
153
- - test/fixtures/valid.etl
154
145
  - test/helper.rb
155
146
  - test/shared_runner_tests.rb
156
147
  - test/support/shared_tests.rb
@@ -166,9 +157,9 @@ test_files:
166
157
  - test/support/test_rename_field_transform.rb
167
158
  - test/support/test_source_that_reads_at_instantiation_time.rb
168
159
  - test/support/test_yielding_transform.rb
169
- - test/test_cli.rb
170
160
  - test/test_integration.rb
171
161
  - test/test_parser.rb
162
+ - test/test_run.rb
172
163
  - test/test_runner.rb
173
164
  - test/test_streaming_runner.rb
174
165
  - test/tmp/.gitkeep
@@ -1,16 +0,0 @@
1
- require 'kiba'
2
-
3
- module Kiba
4
- class Cli
5
- def self.run(args)
6
- unless args.size == 1
7
- puts 'Syntax: kiba your-script.etl'
8
- exit(-1)
9
- end
10
- filename = args[0]
11
- script_content = IO.read(filename)
12
- job_definition = Kiba.parse(script_content, filename)
13
- Kiba.run(job_definition)
14
- end
15
- end
16
- end
@@ -1,2 +0,0 @@
1
- # this should fail because we have an unknown class
2
- source UnknownThing
@@ -1,9 +0,0 @@
1
- fail "Context should not be visible without Kiba namespace" if defined?(Context)
2
- fail "Control should not be visible without Kiba namespace" if defined?(Control)
3
- fail "Parser should not be visible without Kiba namespace" if defined?(Parser)
4
- fail "Config should not be visible without Kiba namespace" if defined?(DSLExtensions::Config)
5
-
6
- # verify Kiba config (namespaced under Kiba::DSLExtensions::Config)
7
- # isn't causing troubles to implementers using a top-level DSLExtensions module
8
- require_relative 'some_extension'
9
- extend DSLExtensions::SomeExtension
@@ -1,4 +0,0 @@
1
- module DSLExtensions
2
- module SomeExtension
3
- end
4
- end
@@ -1 +0,0 @@
1
- # this does nothing
@@ -1,21 +0,0 @@
1
- require_relative 'helper'
2
- require 'kiba/cli'
3
-
4
- class TestCli < Kiba::Test
5
- def test_cli_launches
6
- Kiba::Cli.run([fixture('valid.etl')])
7
- end
8
-
9
- def test_cli_reports_filename_and_lineno
10
- exception = assert_raises(NameError) do
11
- Kiba::Cli.run([fixture('bogus.etl')])
12
- end
13
-
14
- assert_match(/uninitialized constant(.*)UnknownThing/, exception.message)
15
- assert_includes exception.backtrace.to_s, 'test/fixtures/bogus.etl:2:in'
16
- end
17
-
18
- def test_namespace_conflict
19
- Kiba::Cli.run([fixture('namespace_conflict.etl')])
20
- end
21
- end