kiba 3.0.0 → 3.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a5cf22f1013b65b59d9c91b9f629749b47b5474b3f500d5ca54118b6b6a46c4f
- data.tar.gz: cae5e05fabc7c8d169d47d6aa6a7da8befd6711ad2d2471c2c0f5f1d3e604841
+ metadata.gz: f8569f0d68410f27d42f38dcb8182ef9e8f47a8c88f714dc64382943febad2c8
+ data.tar.gz: 4d7c6f8c5ef9256182cc51f85c614e383586dd600167e12190c110693d6963fb
  SHA512:
- metadata.gz: 2ed99641ab799ee57f213525f6ca3ab4b9a9d5e0b7680115bef8e139d32dcfb350fb0117065b5be0ded6a886c21988413b2963e2cbea546a7a1d5691ed078f27
- data.tar.gz: 24434d65dcf9bb81ce0f575860edbdf92598a1e990663347a599eb42155211c02b9c9eb0304c869cf85c8cd15df8c262afcf9cdf11bb7568323ea98b393b9518
+ metadata.gz: 9fb3c4fd7836e2863771a0086e98a7d8bf24f4df99310464dad6cbc84fef3963155cdac8af0d9a281b6470686334b163491ae9d15158fb19cd5269ef77102129
+ data.tar.gz: 25082194faf157dfa2398485e4cd993ba491871007a3a47c3b5f13db57e677f2f6b11621c2ec5e03ccdce5b44bb6a51e3bfee05968c8a72c99088dfbeb46cd7c
data/.github/FUNDING.yml ADDED
@@ -0,0 +1 @@
+ github: thbar
data/.travis.yml CHANGED
@@ -3,6 +3,7 @@ matrix:
  include:
  # see https://www.ruby-lang.org/en/downloads/branches/
  - rvm: ruby-head
+ - rvm: 3.0
  - rvm: 2.7
  - rvm: 2.6
  - rvm: 2.5
data/COMM-LICENSE.md CHANGED
@@ -177,7 +177,7 @@ The Licensee is responsible for making and keeping adequate backup copies of dat

  An active Agreement gives access to “Priority Support”.

- Priority Support covers one (1) email request per quarter. LoGeek will use reasonable endeavours to answer within two (2) French working days. Scope is limited to Kiba and Kiba Pro features and APIs, not the application or infrastructure. For support, email thibaut.barrere+kiba@gmail.com. Please email using the same domain as the original license email or explain your connection to the licensed company.
+ Priority Support covers one (1) email request per quarter. LoGeek will use reasonable endeavours to answer within two (2) French working days. Scope is limited to Kiba and Kiba Pro features and APIs, not the application or infrastructure. For support, email support@logeek.fr. Please email using the same domain as the original license email or explain your connection to the licensed company.

  ## ARTICLE 5 - CUSTOM SUPPORT AND MAINTENANCE

@@ -245,7 +245,7 @@ LoGeek is responsible for processing personal data regarding managing contractua

  Personal data collected by LoGeek is needed for its processing and is intended for LoGeek’s relevant departments and, where appropriate, its subcontractors and co-contractors, for the requirements of executing the Agreement.

- Pursuant to the legal provisions regarding protecting personal data, and under the conditions and to the extent provided by the applicable regulation, each Licensee’s employee concerned shall have a right to query, access, rectify, oppose, obtain erasure or restriction of processing regarding its personal data – rights which may be exercised by mail sent to the attention of thibaut.barrere+kiba@gmail.com, accompanied, if deemed appropriate by Logeek, by a copy of the relevant person’s identification papers.
+ Pursuant to the legal provisions regarding protecting personal data, and under the conditions and to the extent provided by the applicable regulation, each Licensee’s employee concerned shall have a right to query, access, rectify, oppose, obtain erasure or restriction of processing regarding its personal data – rights which may be exercised by mail sent to the attention of support@logeek.fr, accompanied, if deemed appropriate by Logeek, by a copy of the relevant person’s identification papers.

  ## ARTICLE 10 - AUDIT

data/Changes.md CHANGED
@@ -1,3 +1,16 @@
+ HEAD
+ ----
+
+ 3.6.0
+ -----
+
+ - New: `Kiba.run(job)` can now (instead of a job parameter) take a block to define the job. See [#94](https://github.com/thbar/kiba/pull/94) for more details.
+
+ 3.5.0
+ -----
+
+ - Support for Ruby 2.7+ [#93](https://github.com/thbar/kiba/pull/93). Special thanks to @eregon and @mame for their input.
+
  3.0.0
  -----

data/Pro-Changes.md CHANGED
@@ -1,7 +1,18 @@
  Kiba Pro Changelog
  ==================

- Kiba Pro is the commercial extension for Kiba. Documentation is available on the [Wiki](https://github.com/thbar/kiba/wiki).
+ Kiba Pro provides vendor-supported ETL extensions for Kiba. Your subscription funds the Open-Source development, thanks for considering it!
+
+ Learn more on the [Kiba website](https://www.kiba-etl.org/kiba-pro).
+
+ Documentation is available on the [Wiki](https://github.com/thbar/kiba/wiki#kiba-pro).
+
+ 2.0.0
+ -----
+
+ - New: `SQLBulkLookup` transform allows to efficiently lookup values in SQL tables. This is particularly useful in datawarehouse scenarios (to replace unique business keys by surrogate keys), or when writing migrations of SQL databases. Instead of looking-up each row individually, it avoids a "N+1" like effect, by working on large batches of rows.
+ - New: `ParallelTransform` provides an easy way to process a group of ETL rows at the same time using a pool of threads. It can be used to accelerate ETL transforms doing IO operations such as HTTP queries, by going multithreaded.
+ - New: `FileLock` adds an easy way to avoid overlapping runs in ETL Jobs using a local file lock.

  1.5.0
  -----
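
Reviewer note: the `FileLock` entry above describes avoiding overlapping job runs with a local file lock. The sketch below only illustrates that general idea using Ruby's standard `File#flock`. It is not Kiba Pro's API: the helper name `with_file_lock` and the lock file path are hypothetical and chosen for illustration.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of Kiba or Kiba Pro): runs the block only
# if an exclusive, non-blocking lock on `path` can be acquired.
def with_file_lock(path)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |file|
    # With LOCK_NB, flock returns false right away if the lock is already held
    unless file.flock(File::LOCK_EX | File::LOCK_NB)
      raise "Another run already holds the lock on #{path}"
    end
    begin
      yield
    ensure
      # Release the lock even if the job raises
      file.flock(File::LOCK_UN)
    end
  end
end

# Assumed lock location, for demonstration only
lock_path = File.join(Dir.mktmpdir, "etl_job.lock")
with_file_lock(lock_path) do
  puts "Running the job while holding #{lock_path}"
end
```

A second process calling `with_file_lock` on the same path while the first is still running would raise instead of starting an overlapping run.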
data/README.md CHANGED
@@ -1,67 +1,31 @@
- **If you need help**, please [ask your question with tag kiba-etl on StackOverflow](http://stackoverflow.com/questions/ask?tags=kiba-etl) so that other can benefit from your contribution! I monitor this specific tag and will reply to you.
-
- Writing reliable, concise, well-tested & maintainable data-processing code is tricky.
-
- Kiba lets you define and run such high-quality ETL ([Extract-Transform-Load](http://en.wikipedia.org/wiki/Extract,_transform,_load)) jobs using Ruby.
-
- Learn more on the [Wiki](https://github.com/thbar/kiba/wiki), on my [blog](http://thibautbarrere.com) and on [StackOverflow](http://stackoverflow.com/questions/tagged/kiba-etl).
-
- A new [kiba-blueprints](https://github.com/thbar/kiba-blueprints) repo has been also created to showcase the use of Kiba OSS and Kiba Pro. More examples are planned!
+ # Kiba ETL

  [![Gem Version](https://badge.fury.io/rb/kiba.svg)](http://badge.fury.io/rb/kiba)
  [![Build Status](https://travis-ci.org/thbar/kiba.svg?branch=master)](https://travis-ci.org/thbar/kiba) [![Build status](https://ci.appveyor.com/api/projects/status/v05jcyhpp1mueq9i?svg=true)](https://ci.appveyor.com/project/thbar/kiba) [![Code Climate](https://codeclimate.com/github/thbar/kiba/badges/gpa.svg)](https://codeclimate.com/github/thbar/kiba)

- ## Kiba 2.0.0
-
- Kiba 2.0.0 has been recently released with one major improvement for ETL components developers.
-
- Please check out the [release notes](https://github.com/thbar/kiba/releases/tag/v2.0.0).
+ Writing reliable, concise, well-tested & maintainable data-processing code is tricky.

- [Subscribe to my newsletter](https://tinyletter.com/kiba-etl) to get the upcoming detailed article on Kiba 2 benefits.
+ Kiba lets you define and run such high-quality ETL ([Extract-Transform-Load](http://en.wikipedia.org/wiki/Extract,_transform,_load)) jobs using Ruby.

- ## Installation
+ ## Getting Started

- Add the gem to your `Gemfile` and run the bundle command:
+ Head over to the [Wiki](https://github.com/thbar/kiba/wiki) for up-to-date documentation.

- ```ruby
- gem 'kiba'
- ```
+ **If you need help**, please [ask your question with tag kiba-etl on StackOverflow](http://stackoverflow.com/questions/ask?tags=kiba-etl) so that other can benefit from your contribution! I monitor this specific tag and will reply to you.

- ## Getting Started
+ [Kiba Pro](https://www.kiba-etl.org/kiba-pro) customers get priority private email support for any unforeseen issues and simple matters such as installation troubles. Our consulting services will also be prioritized to Kiba Pro subscribers. If you need any coaching on ETL & data pipeline implementation, please [reach out via email](mailto:info@logeek.fr) so we can discuss how to help you out.

- * [How do you define ETL jobs with Kiba?](https://github.com/thbar/kiba/wiki/How-do-you-define-ETL-jobs-with-Kiba%3F)
- * [Running Kiba jobs from the command line](https://github.com/thbar/kiba/wiki/Running-Kiba-jobs-from-the-command-line)
- * [Considerations for running Kiba jobs programmatically (from Sidekiq, Faktory, Rake, ...)](https://github.com/thbar/kiba/wiki/Considerations-for-running-Kiba-jobs-programmatically-(from-Sidekiq,-Faktory,-Rake,-...))
- * [Implementing ETL sources](https://github.com/thbar/kiba/wiki/Implementing-ETL-sources).
- * [Implementing ETL transforms](https://github.com/thbar/kiba/wiki/Implementing-ETL-transforms).
- * [Implementing ETL destinations](https://github.com/thbar/kiba/wiki/Implementing-ETL-destinations).
- * [Implementing pre and post-processors](https://github.com/thbar/kiba/wiki/Implementing-pre-and-post-processors).
-
- ## Useful links
-
- * [Ruby Kaigi 2018 conference - Kiba 2 - Past, present & future of data processing with Ruby](https://www.youtube.com/watch?v=fxVtbog7pIQ) ([slides](https://speakerdeck.com/thbar/kiba-etl-v2-rubykaigi-2018))
- * [Live Coding Session - Processing data with Kiba ETL](http://thibautbarrere.com/2015/11/09/video-processing-data-with-kiba-etl/)
- * [Rubyists - are you doing ETL unknowningly?](http://thibautbarrere.com/2015/03/25/rubyists-are-you-doing-etl-unknowingly/)
- * [How to write solid data processing code](http://thibautbarrere.com/2015/04/05/how-to-write-solid-data-processing-code/)
- * [How to reformat CSV files with Kiba](http://thibautbarrere.com/2015/06/04/how-to-reformat-csv-files-with-kiba/) (in-depth, hands-on tutorial)
- * [How to explode multivalued attributes with Kiba ETL?](http://thibautbarrere.com/2015/06/25/how-to-explode-multivalued-attributes-with-kiba/)
- * [Common techniques to compute aggregates with Kiba](https://stackoverflow.com/questions/31145715/how-to-do-a-aggregation-transformation-in-a-kiba-etl-script-kiba-gem)
- * [How to run Kiba in a Rails environment?](http://thibautbarrere.com/2015/09/26/how-to-run-kiba-in-a-rails-environment/)
- * [How to pass parameters to the Kiba command line?](http://stackoverflow.com/questions/32959692/how-to-pass-parameters-into-your-etl-job)
+ You can also check out the [author blog](https://thibautbarrere.com) and [StackOverflow answers](http://stackoverflow.com/questions/tagged/kiba-etl).

  ## Supported Ruby versions

- Kiba currently supports Ruby 2.3+, JRuby 9.1+ and TruffleRuby. See [test matrix](https://travis-ci.org/thbar/kiba).
-
- ## Kiba Common
-
- I'm starting to add commonly used reusable helpers in a separate gem called [kiba-common](https://github.com/thbar/kiba-common), check it out (work-in-progress).
+ Kiba currently supports Ruby 2.4+, JRuby 9.2+ and TruffleRuby. See [test matrix](https://travis-ci.org/thbar/kiba).

  ## ETL consulting & commercial version

- **Consulting services**: if your organization needs help to implement a data pipeline or to build a data-intensive application, I provide consulting services. [More information](http://thibautbarrere.com/hire-me/).
+ **Consulting services**: if your organization needs guidance on Kiba / ETL implementations, we provide consulting services. Contact at [https://www.logeek.fr](https://www.logeek.fr).

- **Kiba Pro**: for more features & goodies, check out [Kiba Pro](https://github.com/thbar/kiba/wiki#kiba-pro).
+ **Kiba Pro**: for vendor-backed ETL extensions, check out [Kiba Pro](https://www.kiba-etl.org/kiba-pro).

  ## License

data/appveyor.yml CHANGED
@@ -5,10 +5,11 @@ cache:

  environment:
  matrix:
+ # TODO: add RUBY_VERSION=30 when available (https://www.appveyor.com/updates/)
+ - RUBY_VERSION: 27
  - RUBY_VERSION: 26
  - RUBY_VERSION: 25
  - RUBY_VERSION: 24
- - RUBY_VERSION: 23
  # NOTE: jruby doesn't seem to be supported on default images
  # see https://www.appveyor.com/docs/build-environment/#ruby

data/lib/kiba.rb CHANGED
@@ -11,7 +11,13 @@ require 'kiba/dsl_extensions/config'
  Kiba.extend(Kiba::Parser)

  module Kiba
-   def self.run(job)
+   def self.run(job = nil, &block)
+     unless (job.nil? ^ block.nil?)
+       fail ArgumentError.new("Kiba.run takes either one argument (the job) or a block (defining the job)")
+     end
+
+     job ||= Kiba.parse { instance_exec(&block) }
+
      # NOTE: use Hash#dig when Ruby 2.2 reaches EOL
      runner = job.config.fetch(:kiba, {}).fetch(:runner, Kiba::StreamingRunner)
      runner.run(job)
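
Reviewer note: the hunk above accepts either a job or a block, using XOR (`^`) on the two `nil?` checks to reject "both" and "neither", then evaluates the block in the DSL context via `instance_exec`. The following self-contained sketch reproduces that pattern without the gem. `MiniEtl` and `MiniContext` are hypothetical stand-ins for illustration, not Kiba classes.

```ruby
# Hypothetical DSL context: records the steps declared in the block
class MiniContext
  attr_reader :steps

  def initialize
    @steps = []
  end

  def source(name)
    @steps << [:source, name]
  end

  def destination(name)
    @steps << [:destination, name]
  end
end

module MiniEtl
  # Evaluates the block with `self` set to a fresh context,
  # so bare `source` / `destination` calls resolve to its methods
  def self.parse(&block)
    context = MiniContext.new
    context.instance_exec(&block)
    context
  end

  def self.run(job = nil, &block)
    # true ^ false => true: exactly one of the two must be provided
    unless job.nil? ^ block.nil?
      raise ArgumentError, "run takes either a job or a block"
    end

    job ||= parse(&block)
    job.steps
  end
end

steps = MiniEtl.run do
  source :csv_file
  destination :database
end
p steps
```

Calling `MiniEtl.run` with no argument, or with both a job and a block, raises `ArgumentError`, mirroring the `test_forbids_no_arg` and `test_forbids_multiple_args` tests in this diff.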
data/lib/kiba/context.rb CHANGED
@@ -23,5 +23,9 @@ module Kiba
      def post_process(&block)
        @control.post_processes << { block: block }
      end
+
+     [:source, :transform, :destination].each do |m|
+       ruby2_keywords(m) if respond_to?(:ruby2_keywords, true)
+     end
    end
  end
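
Reviewer note: the hunk above marks the DSL methods with `ruby2_keywords`, guarded by `respond_to?` so older Rubies that lack the method are unaffected. The sketch below shows why this matters when arguments are captured with `*args`, stored, and splatted into a constructor later. `Registry` and `KeywordComponent` are hypothetical names for illustration and are not Kiba's internals.

```ruby
# Hypothetical component with keyword-only constructor arguments
class KeywordComponent
  attr_reader :values

  def initialize(mandatory:, optional: nil)
    @values = { mandatory: mandatory, optional: optional }
  end
end

# Hypothetical registry: stores class + args now, instantiates later
class Registry
  def initialize
    @entries = []
  end

  def source(klass, *args)
    @entries << { klass: klass, args: args }
  end
  # Flags a trailing keywords hash captured in *args so that a later
  # splat passes it as keywords again (needed from Ruby 3.0 onwards)
  ruby2_keywords(:source) if respond_to?(:ruby2_keywords, true)

  def build_all
    @entries.map { |entry| entry[:klass].new(*entry[:args]) }
  end
end

registry = Registry.new
registry.source(KeywordComponent, mandatory: "first")
component = registry.build_all.first
p component.values
```

Without the `ruby2_keywords` line, Ruby 3 would pass the stored hash as a positional argument and `KeywordComponent.new` would raise `ArgumentError`.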
data/lib/kiba/runner.rb CHANGED
@@ -8,9 +8,6 @@ module Kiba
      end

      def run(control)
-       # TODO: add a dry-run (not instantiating mode) to_instances call
-       # that will validate the job definition from a syntax pov before
-       # going any further. This could be shared with the parser.
        run_pre_processes(control)
        process_rows(
          to_instances(control.sources),
@@ -18,8 +15,6 @@ module Kiba
          destinations = to_instances(control.destinations)
        )
        close_destinations(destinations)
-       # TODO: when I add post processes as class, I'll have to add a test to
-       # make sure instantiation occurs after the main processing is done (#16)
        run_post_processes(control)
      end

data/lib/kiba/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Kiba
-   VERSION = '3.0.0'
+   VERSION = '3.6.0'
  end
data/test/shared_runner_tests.rb CHANGED
@@ -155,4 +155,74 @@ module SharedRunnerTests
      end
      assert_raises(RuntimeError, 'Nil parameters not allowed here') { kiba_run(control) }
    end
+
+   def test_ruby_3_source_kwargs
+     # NOTE: before Ruby 3 kwargs support, a Ruby warning would
+     # be captured here with Ruby 2.7 & ensure we fail,
+     # and an error would be raised with Ruby 2.8.0-dev
+     # NOTE: only the first warning will be captured, though, but
+     # having 3 different tests is still better
+     storage = nil
+     assert_silent do
+       Kiba.run(Kiba.parse do
+         source TestKeywordArgumentsComponent,
+           mandatory: "first",
+           on_init: -> (values) { storage = values }
+       end)
+     end
+     assert_equal({
+       mandatory: "first",
+       optional: nil
+     }, storage)
+   end
+
+   def test_ruby_3_transform_kwargs
+     storage = nil
+     assert_silent do
+       Kiba.run(Kiba.parse do
+         transform TestKeywordArgumentsComponent,
+           mandatory: "first",
+           on_init: -> (values) { storage = values }
+       end)
+     end
+     assert_equal({
+       mandatory: "first",
+       optional: nil
+     }, storage)
+   end
+
+   def test_ruby_3_destination_kwargs
+     storage = nil
+     assert_silent do
+       Kiba.run(Kiba.parse do
+         destination TestKeywordArgumentsComponent,
+           mandatory: "first",
+           on_init: -> (values) { storage = values }
+       end)
+     end
+     assert_equal({
+       mandatory: "first",
+       optional: nil
+     }, storage)
+   end
+
+   def test_positional_plus_keyword_arguments
+     storage = nil
+     assert_silent do
+       Kiba.run(Kiba.parse do
+         source TestMixedArgumentsComponent,
+           "some positional argument",
+           mandatory: "first",
+           on_init: -> (values) {
+             storage = values
+           }
+       end)
+     end
+
+     assert_equal({
+       some_value: "some positional argument",
+       mandatory: "first",
+       optional: nil
+     }, storage)
+   end
  end
data/test/support/test_keyword_arguments_component.rb ADDED
@@ -0,0 +1,14 @@
+ # a mock component to test Ruby 3 keyword argument support
+ class TestKeywordArgumentsComponent
+   def initialize(mandatory:, optional: nil, on_init: nil)
+     values = {
+       mandatory: mandatory,
+       optional: optional
+     }
+     on_init&.call(values)
+   end
+
+   def each
+     # no-op
+   end
+ end
data/test/support/test_mixed_arguments_component.rb ADDED
@@ -0,0 +1,14 @@
+ # a mock component to test Ruby 3 keyword argument support
+ class TestMixedArgumentsComponent
+   def initialize(some_value, mandatory:, optional: nil, on_init:)
+     @values = {}
+     @values[:some_value] = some_value
+     @values[:mandatory] = mandatory
+     @values[:optional] = optional
+     on_init&.call(@values)
+   end
+
+   def each
+     # no-op
+   end
+ end
data/test/test_run.rb CHANGED
@@ -1,5 +1,7 @@
  require_relative 'helper'
  require 'minitest/mock'
+ require_relative 'support/test_enumerable_source'
+ require_relative 'support/test_array_destination'

  class TestRun < Kiba::Test
    def test_ensure_kiba_defaults_to_streaming_runner
@@ -9,4 +11,28 @@ class TestRun < Kiba::Test
        assert_equal "Streaming runner called", Kiba.run(job)
      end
    end
+
+   def test_run_allows_block_arg
+     rows = []
+     Kiba.run do
+       source TestEnumerableSource, (1..10)
+       destination TestArrayDestination, rows
+     end
+     assert_equal (1..10).to_a, rows
+   end
+
+   def test_forbids_no_arg
+     assert_raises ArgumentError do
+       Kiba.run
+     end
+   end
+
+   def test_forbids_multiple_args
+     assert_raises ArgumentError do
+       job = Kiba.parse { }
+       Kiba.run(job) do
+         #
+       end
+     end
+   end
  end
data/test/test_streaming_runner.rb CHANGED
@@ -6,6 +6,8 @@ require_relative 'support/test_duplicate_row_transform'
  require_relative 'support/test_close_yielding_transform'
  require_relative 'support/test_non_closing_transform'
  require_relative 'shared_runner_tests'
+ require_relative 'support/test_keyword_arguments_component'
+ require_relative 'support/test_mixed_arguments_component'

  class TestStreamingRunner < Kiba::Test
    def kiba_run(job)
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: kiba
  version: !ruby/object:Gem::Version
- version: 3.0.0
+ version: 3.6.0
  platform: ruby
  authors:
  - Thibaut Barrère
- autorequire:
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-02-10 00:00:00.000000000 Z
+ date: 2021-02-07 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rake
@@ -74,6 +74,7 @@ executables:
  extensions: []
  extra_rdoc_files: []
  files:
+ - ".github/FUNDING.yml"
  - ".gitignore"
  - ".travis.yml"
  - COMM-LICENSE.md
@@ -106,6 +107,8 @@ files:
  - test/support/test_destination_returning_nil.rb
  - test/support/test_duplicate_row_transform.rb
  - test/support/test_enumerable_source.rb
+ - test/support/test_keyword_arguments_component.rb
+ - test/support/test_mixed_arguments_component.rb
  - test/support/test_non_closing_transform.rb
  - test/support/test_rename_field_transform.rb
  - test/support/test_source_that_reads_at_instantiation_time.rb
@@ -122,7 +125,7 @@ licenses:
  metadata:
  source_code_uri: https://github.com/thbar/kiba
  documentation_uri: https://github.com/thbar/kiba/wiki
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -137,8 +140,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.0.3
- signing_key:
+ rubygems_version: 3.2.3
+ signing_key:
  specification_version: 4
  summary: Lightweight ETL for Ruby
  test_files:
@@ -153,6 +156,8 @@ test_files:
  - test/support/test_destination_returning_nil.rb
  - test/support/test_duplicate_row_transform.rb
  - test/support/test_enumerable_source.rb
+ - test/support/test_keyword_arguments_component.rb
+ - test/support/test_mixed_arguments_component.rb
  - test/support/test_non_closing_transform.rb
  - test/support/test_rename_field_transform.rb
  - test/support/test_source_that_reads_at_instantiation_time.rb