kiba 2.5.0 → 3.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: aebe72316fa16e820c9c11b7a2a274b05b77abed7eda8799858fe00a4279e2db
- data.tar.gz: 686b359e9b74684472d17491c98a32adddd30603186b4f6ec7b6a40263d47d19
+ metadata.gz: a5cf22f1013b65b59d9c91b9f629749b47b5474b3f500d5ca54118b6b6a46c4f
+ data.tar.gz: cae5e05fabc7c8d169d47d6aa6a7da8befd6711ad2d2471c2c0f5f1d3e604841
  SHA512:
- metadata.gz: 41be9f07f5c7bef36d85c97bc3a9224c7f885de92f458407064d7fa153f8a0ed1443a626b82a4e97f1269154aac0aaca738b8ed63c596953cc1059ae0657f4d3
- data.tar.gz: d03d4e66991f0b903ca6b86eda7ae33790218dc84d9de906337a0ddae709a7cbfa6adb9432bbb812396e6a122c942f2cecc8a95ee16d184714e6625cac411905
+ metadata.gz: 2ed99641ab799ee57f213525f6ca3ab4b9a9d5e0b7680115bef8e139d32dcfb350fb0117065b5be0ded6a886c21988413b2963e2cbea546a7a1d5691ed078f27
+ data.tar.gz: 24434d65dcf9bb81ce0f575860edbdf92598a1e990663347a599eb42155211c02b9c9eb0304c869cf85c8cd15df8c262afcf9cdf11bb7568323ea98b393b9518
@@ -3,12 +3,10 @@ matrix:
  include:
  # see https://www.ruby-lang.org/en/downloads/branches/
  - rvm: ruby-head
+ - rvm: 2.7
  - rvm: 2.6
  - rvm: 2.5
  - rvm: 2.4
- # NOTE: EOL since 2019-03-31
- - rvm: 2.3
  # see https://www.jruby.org/download
- - rvm: jruby-9.1
  - rvm: jruby-9.2
  - rvm: truffleruby
data/Changes.md CHANGED
@@ -1,3 +1,11 @@
+ 3.0.0
+ -----
+
+ - Breaking: the `kiba` command line is deprecated to encourage using `Kiba.parse` API. See [#81](https://github.com/thbar/kiba/pull/81) and release notes for details & migration path.
+ - Kiba now defaults to `StreamingRunner` (backward compatible & more powerful engine) [#83](https://github.com/thbar/kiba/pull/83).
+ - Kiba now officially supports MRI Ruby 2.4+ (although 2.3 will still work for now), JRuby 9.2+ or TruffleRuby.
+ - You may get warnings with Ruby 2.7 and errors with Ruby 2.8+. See [#85] for status on Ruby 3 keyword arguments support.
+
  2.5.0
  -----
 
@@ -3,8 +3,69 @@ Kiba Pro Changelog
 
  Kiba Pro is the commercial extension for Kiba. Documentation is available on the [Wiki](https://github.com/thbar/kiba/wiki).
 
- HEAD
- -------
+ 1.5.0
+ -----
+
+ - Compatibility with Kiba v3
+ - BREAKING CHANGE: deprecate non-live Sequel connection passing (https://github.com/thbar/kiba/issues/79). Do not use `database: "connection_string"`, instead pass your `Sequel` connection directly. This moves the connection management out of the destination, which is a better pattern & provides better (block-based) resources closing.
+ - Official MySQL support:
+ - While the compatibility was already here, it is now tested for in our QA testing suite.
+ - MySQL 5.5-8.0 is supported & tested
+ - MariaDB should be supported (although not tested against in the QA testing suite)
+ - Amazon Aurora MySQL is also supposed to work (although not tested)
+ - `Kiba::Pro::Sources::SQL` supports for non-streaming + streaming use
+ - `Kiba::Pro::Destinations::SQLBulkInsert` supports:
+ - Bulk insert
+ - Bulk insert with ignore
+ - Bulk upsert (including with dynamically computed columns) via `ON DUPLICATE KEY UPDATE`
+ - Note that the `Kiba::Pro::Destinations::SQLUpsert` (row-by-row) is not MySQL compatible at the moment
+
+ 1.2.0
+ -----
+
+ - `SQL` source improvements:
+ - Deprecate use_cursor in favor of block query construct. The source could previously be configured with:
+
+ ```ruby
+ source Kiba::Pro::Sources::SQL,
+ query: "SELECT * FROM items",
+ use_cursor: true
+ ```
+
+ The `use_cursor` keyword is now deprecated. You can use the more powerful block query construct:
+
+ ```ruby
+ source Kiba::Pro::Sources::SQL,
+ query: -> (db) { db["SELECT * FROM items"].use_cursor },
+ ```
+
+ - Avoid bogus nested SQL calls when configuring the query via block/proc. A call with:
+
+ ```ruby
+ source Kiba::Pro::Sources::SQL,
+ query: -> (db) { db["SELECT * FROM items"] },
+ ```
+
+ would have previously generated a `SELECT * FROM (SELECT * FROM "items")`. This is now fixed.
+
+ - Add specs around streaming support (for both MySQL and Postgres).
+
+ For Postgres, streaming was [recommended by the author of Sequel](https://groups.google.com/d/msg/sequel-talk/olznPcmEf8M/hd5Ris0pYNwJ) over `use_cursor: true` (but do compare on your actual cases!). To enable streaming for Postgres:
+ - Add `sequel_pg` to your `Gemfile`
+ - Enable the extension in your `db` instance & add `.stream` to your dataset e.g.:
+
+ ```ruby
+ Sequel.connect(ENV.fetch('DATABASE_URL')) do |db|
+ db.extension(:pg_streaming)
+ Kiba.run(Kiba.parse do
+ source Kiba::Pro::Sources::SQL,
+ db: db,
+ query: -> (db) { db[:items].stream }
+ # SNIP
+ end)
+ ```
+
+ For MySQL, just add `.stream` to your dataset like above (no extension required).
 
  1.1.0
  -----
data/bin/kiba CHANGED
@@ -1,5 +1,15 @@
  #!/usr/bin/env ruby
 
- require_relative '../lib/kiba/cli'
+ STDERR.puts <<DOC
 
- Kiba::Cli.run(ARGV)
+ ##########################################################################
+
+ The 'kiba' CLI is deprecated and has been removed in Kiba ETL v3.
+
+ See release notes / changelog for help.
+
+ ##########################################################################
+
+ DOC
+
+ exit(1)
@@ -13,7 +13,7 @@ Kiba.extend(Kiba::Parser)
  module Kiba
  def self.run(job)
  # NOTE: use Hash#dig when Ruby 2.2 reaches EOL
- runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::Runner)
+ runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::StreamingRunner)
  runner.run(job)
  end
  end
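The hunk above changes only the fallback passed to the inner `fetch`, so a job that names a runner explicitly should keep it. A minimal sketch of that lookup follows; `SketchStreamingRunner`, `SketchClassicRunner`, `SketchJob` and `sketch_run` are hypothetical stand-ins written for illustration, not the real Kiba classes.

```ruby
# Stand-in runners for illustration only (not Kiba::StreamingRunner / Kiba::Runner).
module SketchStreamingRunner
  def self.run(job)
    "streaming:#{job.name}"
  end
end

module SketchClassicRunner
  def self.run(job)
    "classic:#{job.name}"
  end
end

# Stand-in for a parsed job: only the config hash matters for the lookup.
SketchJob = Struct.new(:name, :config)

# Same lookup shape as the patched Kiba.run: nested fetch with defaults.
def sketch_run(job)
  runner = job.config.fetch(:kiba, {}).fetch(:runner, SketchStreamingRunner)
  runner.run(job)
end

# No :kiba entry at all: the new default is used.
puts sketch_run(SketchJob.new("a", {}))

# Explicit runner: the default is ignored.
puts sketch_run(SketchJob.new("b", { kiba: { runner: SketchClassicRunner } }))
```

In a real job, the test changes elsewhere in this diff suggest the explicit form is `extend Kiba::DSLExtensions::Config` followed by `config :kiba, runner: Kiba::Runner` for anyone who needs the previous runner.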
@@ -1,26 +1,10 @@
- # NOTE: using the "Kiba::Parser" declaration, as I discovered,
- # provides increased isolation to the declared ETL script, compared
- # to 2 nested modules.
- # Before that, a user creating entities named Control, Context
- # or DSLExtensions would see a conflict with Kiba own classes,
- # as by default instance_eval will resolve references by adding
- # the module containing the parser class (initially "Kiba").
- # Now, the classes appear to be further hidden from the user,
- # as Kiba::Parser is its own module.
- # This allows the user to create a Parser, Context, Control class
- # without it being interpreted as reopening Kiba::Parser, Kiba::Context,
- # etc.
- # See test in test_cli.rb (test_namespace_conflict)
- module Kiba::Parser
- def parse(source_as_string = nil, source_file = nil, &source_as_block)
- control = Kiba::Control.new
- context = Kiba::Context.new(control)
- if source_as_string
- # this somewhat weird construct allows to remove a nil source_file
- context.instance_eval(*[source_as_string, source_file].compact)
- else
+ module Kiba
+ module Parser
+ def parse(&source_as_block)
+ control = Kiba::Control.new
+ context = Kiba::Context.new(control)
  context.instance_eval(&source_as_block)
+ control
  end
- control
  end
  end
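With the string and filename parameters gone, `parse` accepts only a block, and that block is evaluated with the context as `self`. One possible way to keep loading a job from a script file is to call `instance_eval` on the file content from inside the block. This is an assumption made for illustration, not the documented migration path (the changelog points to #81 and the release notes for that). `SketchControl`, `SketchContext` and `sketch_parse` are hypothetical stand-ins that mirror the shape of the new `parse`.

```ruby
require 'tempfile'

# Stand-in for Kiba::Control: just collects source declarations.
class SketchControl
  def sources
    @sources ||= []
  end
end

# Stand-in for Kiba::Context: exposes the DSL method used by the script.
class SketchContext
  def initialize(control)
    @control = control
  end

  def source(klass, *args)
    @control.sources << { klass: klass, args: args }
  end
end

# Block-only parse, shaped like the 3.0.0 version shown above.
def sketch_parse(&source_as_block)
  control = SketchControl.new
  context = SketchContext.new(control)
  context.instance_eval(&source_as_block)
  control
end

Tempfile.create(['job', '.etl']) do |file|
  file.write("source String, 'from', 'file'\n")
  file.flush

  control = sketch_parse do
    # self is the context here; passing the path keeps file and line
    # information in backtraces raised from the script.
    instance_eval(File.read(file.path), file.path)
  end

  puts control.sources.inspect
end
```

Passing the path as the second argument matters for debugging: the removed CLI test in this diff checked that errors reported the script's filename and line number.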
@@ -71,7 +71,7 @@ module Kiba
  elsif block
  fail 'Block form is not allowed here' unless allow_block
  AliasingProc.new(&block)
- elsif
+ else
  fail 'Nil parameters not allowed here'
  end
  end
@@ -1,3 +1,3 @@
  module Kiba
- VERSION = '2.5.0'
+ VERSION = '3.0.0'
  end
@@ -3,10 +3,6 @@ require_relative 'support/test_enumerable_source'
  require_relative 'support/test_destination_returning_nil'
 
  module SharedRunnerTests
- def kiba_run(job)
- Kiba.run(job)
- end
-
  def rows
  @rows ||= [
  { identifier: 'first-row' },
@@ -68,39 +68,6 @@ class TestParser < Kiba::Test
  assert_instance_of Proc, control.pre_processes[0][:block]
  end
 
- def test_source_as_string_parsing
- control = Kiba.parse <<RUBY
- source DummyClass, 'from', 'file'
- RUBY
-
- assert_equal 1, control.sources.size
- assert_equal DummyClass, control.sources[0][:klass]
- assert_equal %w(from file), control.sources[0][:args]
- end
-
- def test_source_as_file_doing_require
- IO.write 'test/tmp/etl-common.rb', <<RUBY
- def common_source_declaration
- source DummyClass, 'from', 'common'
- end
- RUBY
- IO.write 'test/tmp/etl-main.rb', <<RUBY
- require './test/tmp/etl-common.rb'
-
- source DummyClass, 'from', 'main'
- common_source_declaration
- RUBY
- control = Kiba.parse IO.read('test/tmp/etl-main.rb')
-
- assert_equal 2, control.sources.size
-
- assert_equal %w(from main), control.sources[0][:args]
- assert_equal %w(from common), control.sources[1][:args]
-
- ensure
- remove_files('test/tmp/etl-common.rb', 'test/tmp/etl-main.rb')
- end
-
  def test_config
  control = Kiba.parse do
  extend Kiba::DSLExtensions::Config
@@ -0,0 +1,12 @@
+ require_relative 'helper'
+ require 'minitest/mock'
+
+ class TestRun < Kiba::Test
+ def test_ensure_kiba_defaults_to_streaming_runner
+ cb = -> (job) { "Streaming runner called" }
+ Kiba::StreamingRunner.stub(:run, cb) do
+ job = Kiba::Control.new
+ assert_equal "Streaming runner called", Kiba.run(job)
+ end
+ end
+ end
@@ -2,5 +2,10 @@ require_relative 'helper'
  require_relative 'shared_runner_tests'
 
  class TestRunner < Kiba::Test
+ def kiba_run(job)
+ job.config[:kiba] = {runner: Kiba::Runner}
+ Kiba.run(job)
+ end
+
  include SharedRunnerTests
  end
@@ -8,6 +8,11 @@ require_relative 'support/test_non_closing_transform'
  require_relative 'shared_runner_tests'
 
  class TestStreamingRunner < Kiba::Test
+ def kiba_run(job)
+ job.config[:kiba] = {runner: Kiba::StreamingRunner}
+ Kiba.run(job)
+ end
+
  include SharedRunnerTests
 
  def test_yielding_class_transform
@@ -15,10 +20,6 @@ class TestStreamingRunner < Kiba::Test
  destination_array = []
 
  job = Kiba.parse do
- extend Kiba::DSLExtensions::Config
-
- config :kiba, runner: Kiba::StreamingRunner
-
  # provide a single row as the input
  source TestEnumerableSource, [input_row]
 
@@ -51,9 +52,6 @@ class TestStreamingRunner < Kiba::Test
  def test_transform_yielding_from_close
  destination_array = []
  job = Kiba.parse do
- extend Kiba::DSLExtensions::Config
- config :kiba, runner: Kiba::StreamingRunner
-
  transform CloseYieldingTransform, yield_on_close: [1, 2]
  destination TestArrayDestination, destination_array
  end
@@ -63,9 +61,6 @@ class TestStreamingRunner < Kiba::Test
 
  def test_transform_with_no_close_must_not_raise
  job = Kiba.parse do
- extend Kiba::DSLExtensions::Config
- config :kiba, runner: Kiba::StreamingRunner
-
  transform NonClosingTransform
  end
  Kiba.run(job)
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: kiba
  version: !ruby/object:Gem::Version
- version: 2.5.0
+ version: 3.0.0
  platform: ruby
  authors:
  - Thibaut Barrère
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2019-05-29 00:00:00.000000000 Z
+ date: 2020-02-10 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rake
@@ -88,7 +88,6 @@ files:
  - bin/kiba
  - kiba.gemspec
  - lib/kiba.rb
- - lib/kiba/cli.rb
  - lib/kiba/context.rb
  - lib/kiba/control.rb
  - lib/kiba/dsl_extensions/config.rb
@@ -96,10 +95,6 @@ files:
  - lib/kiba/runner.rb
  - lib/kiba/streaming_runner.rb
  - lib/kiba/version.rb
- - test/fixtures/bogus.etl
- - test/fixtures/namespace_conflict.etl
- - test/fixtures/some_extension.rb
- - test/fixtures/valid.etl
  - test/helper.rb
  - test/shared_runner_tests.rb
  - test/support/shared_tests.rb
@@ -115,9 +110,9 @@ files:
  - test/support/test_rename_field_transform.rb
  - test/support/test_source_that_reads_at_instantiation_time.rb
  - test/support/test_yielding_transform.rb
- - test/test_cli.rb
  - test/test_integration.rb
  - test/test_parser.rb
+ - test/test_run.rb
  - test/test_runner.rb
  - test/test_streaming_runner.rb
  - test/tmp/.gitkeep
@@ -147,10 +142,6 @@ signing_key:
  specification_version: 4
  summary: Lightweight ETL for Ruby
  test_files:
- - test/fixtures/bogus.etl
- - test/fixtures/namespace_conflict.etl
- - test/fixtures/some_extension.rb
- - test/fixtures/valid.etl
  - test/helper.rb
  - test/shared_runner_tests.rb
  - test/support/shared_tests.rb
@@ -166,9 +157,9 @@ test_files:
  - test/support/test_rename_field_transform.rb
  - test/support/test_source_that_reads_at_instantiation_time.rb
  - test/support/test_yielding_transform.rb
- - test/test_cli.rb
  - test/test_integration.rb
  - test/test_parser.rb
+ - test/test_run.rb
  - test/test_runner.rb
  - test/test_streaming_runner.rb
  - test/tmp/.gitkeep
@@ -1,16 +0,0 @@
- require 'kiba'
-
- module Kiba
- class Cli
- def self.run(args)
- unless args.size == 1
- puts 'Syntax: kiba your-script.etl'
- exit(-1)
- end
- filename = args[0]
- script_content = IO.read(filename)
- job_definition = Kiba.parse(script_content, filename)
- Kiba.run(job_definition)
- end
- end
- end
@@ -1,2 +0,0 @@
- # this should fail because we have an unknown class
- source UnknownThing
@@ -1,9 +0,0 @@
- fail "Context should not be visible without Kiba namespace" if defined?(Context)
- fail "Control should not be visible without Kiba namespace" if defined?(Control)
- fail "Parser should not be visible without Kiba namespace" if defined?(Parser)
- fail "Config should not be visible without Kiba namespace" if defined?(DSLExtensions::Config)
-
- # verify Kiba config (namespaced under Kiba::DSLExtensions::Config)
- # isn't causing troubles to implementers using a top-level DSLExtensions module
- require_relative 'some_extension'
- extend DSLExtensions::SomeExtension
@@ -1,4 +0,0 @@
- module DSLExtensions
- module SomeExtension
- end
- end
@@ -1 +0,0 @@
- # this does nothing
@@ -1,21 +0,0 @@
- require_relative 'helper'
- require 'kiba/cli'
-
- class TestCli < Kiba::Test
- def test_cli_launches
- Kiba::Cli.run([fixture('valid.etl')])
- end
-
- def test_cli_reports_filename_and_lineno
- exception = assert_raises(NameError) do
- Kiba::Cli.run([fixture('bogus.etl')])
- end
-
- assert_match(/uninitialized constant(.*)UnknownThing/, exception.message)
- assert_includes exception.backtrace.to_s, 'test/fixtures/bogus.etl:2:in'
- end
-
- def test_namespace_conflict
- Kiba::Cli.run([fixture('namespace_conflict.etl')])
- end
- end