simmer 1.0.0.pre.alpha.6 → 1.0.0.pre.alpha.7
- checksums.yaml +4 -4
- data/README.md +311 -1
- data/lib/simmer/specification/assert/assertions/table.rb +17 -3
- data/lib/simmer/util/record.rb +1 -1
- data/lib/simmer/util/record_set.rb +4 -0
- data/lib/simmer/version.rb +1 -1
- data/spec/fixtures/specifications/load_noc_list.yaml +7 -0
- data/spec/simmer/specification/assert_spec.rb +3 -1
- data/spec/simmer/util/record_set_spec.rb +17 -0
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c2f5116d06a3aa1eaafbe7a14cc204f40eeeb1a45408801ecc9f15706b36f2a2
+  data.tar.gz: 49c6a41c862824592fbd31fa9654570d93bb84a6ba70f8e59353b7187dc15e17
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e2315fec18c186454fd415c4050f183d753a635f5e45e630da3b609adfd1f498930c18a25f2ea61f4a79b0dd59fb6ee9558c2081b7092a8624fe41c4e70d68d1
+  data.tar.gz: c5572b235b2bf109e213c3d428bd7668c97af2967f56a925d3d6b0f522f1689af9e46124ccec2460e1f71a5a04f36eb13478380e633d060e4d7999f32b198739
data/README.md
CHANGED
@@ -2,4 +2,314 @@
 
 ---
 
-
+[![Gem Version](https://badge.fury.io/rb/simmer.svg)](https://badge.fury.io/rb/simmer) [![Build Status](https://travis-ci.org/bluemarblepayroll/simmer.svg?branch=master)](https://travis-ci.org/bluemarblepayroll/simmer) [![Maintainability](https://api.codeclimate.com/v1/badges/61996dff817d44efc408/maintainability)](https://codeclimate.com/github/bluemarblepayroll/simmer/maintainability) [![Test Coverage](https://api.codeclimate.com/v1/badges/61996dff817d44efc408/test_coverage)](https://codeclimate.com/github/bluemarblepayroll/simmer/test_coverage) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+
+*Note: This is not officially supported by Hitachi Vantara.*
+
+This library provides a Ruby-based testing suite for [Pentaho Data Integration](https://www.hitachivantara.com/en-us/products/data-management-analytics/pentaho-platform/pentaho-data-integration.html). You can create specifications for Pentaho transformations and jobs and then ensure they always run correctly.
+
+## Compatibility & Limitations
+
+This library was tested against:
+
+* Kettle version 6.1.0.1-196
+* MacOS and Linux
+
+Note that it is also currently limited to:
+
+* MySQL
+* Amazon Simple Storage Service
+
+Future enhancements could include breaking these out into plug-ins in order to support other database and cloud storage vendors/systems.
+
+## Installation
+
+To install through Rubygems:
+
+````bash
+gem install simmer
+````
+
+You can also add this to your Gemfile:
+
+````bash
+bundle add simmer
+````
+
+After installation, you will need to do two things:
+
+1. Add a simmer configuration file
+2. Add a simmer directory
+
+### Simmer Configuration File
+
+The configuration file contains information about external systems, such as:
+
+* Amazon Simple Storage Service
+* Local File System
+* Pentaho Data Integration
+* MySQL Database
+
+Copy this configuration template into your project's root as `config/simmer.yaml`:
+
+````yaml
+mysql_database:
+  database:
+  username:
+  host:
+  port:
+  flags: MULTI_STATEMENTS
+
+spoon_client:
+  dir: spec/mocks/spoon
+  args: 0
+
+# local_file_system:
+#   dir: tmp/store_test
+
+# aws_file_system:
+#   access_key_id:
+#   bucket:
+#   default_expires_in_seconds: 3600
+#   encryption: AES256
+#   region:
+#   secret_access_key:
+````
+
+Fill out the missing configuration values required for each section. If you would like to use your local file system, then un-comment the `local_file_system` key. If you would like to use AWS S3, then un-comment the `aws_file_system` key.
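As a rough illustration of how such a file could be read, the sketch below parses an inline copy of the template and checks which file system key is active. The filled-in values (`simmer_test`, `root`, and so on) are hypothetical, and the rule that one of the two file system keys must be present is an assumption of this sketch, not documented simmer behavior.

```ruby
require 'yaml'

# Hypothetical stand-in for the contents of config/simmer.yaml.
raw = <<~YAML
  mysql_database:
    database: simmer_test
    username: root
    host: 127.0.0.1
    port: 3306
    flags: MULTI_STATEMENTS

  spoon_client:
    dir: spec/mocks/spoon
    args: 0

  local_file_system:
    dir: tmp/store_test
YAML

config = YAML.safe_load(raw)

# Assumption: whichever file system key is un-commented is the one in use.
file_system =
  if config.key?('aws_file_system')
    :aws
  elsif config.key?('local_file_system')
    :local
  else
    raise ArgumentError, 'configure local_file_system or aws_file_system'
  end

puts file_system # prints "local"
```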
+
+### Simmer Directory
+
+You will also need to create the following folder structure in your project's root folder:
+
+* **simmer/files**: Place any files necessary to stage in this directory.
+* **simmer/fixtures**: Place the yaml files that describe the database records necessary to stage the database.
+* **simmer/specs**: Place specification yaml files here.
+
+It does not matter how each of these directories is internally structured; they can contain any arbitrary folder structure. These directories should be version controlled, as they contain the necessary information to execute your tests. But you may want to ignore the `simmer/results` directory, as that will store the results after execution.
+
+## Getting Started
+
+### What is a Specification?
+
+A specification is a blueprint for how to run a transformation or job and contains configuration for:
+
+* File system state before execution
+* Database state before execution
+* Execution command
+* Expected database state after execution
+* Expected execution output
+
+#### Specification Example
+
+The following is an example specification for a transformation:
+
+````yaml
+name: Declassify Users
+stage:
+  files:
+    src: noc_list.csv
+    dest: input/noc_list.csv
+  fixtures:
+    - iron_man
+    - hulk
+act:
+  name: load_noc_list
+  repository: top_secret
+  type: transformation
+  params:
+    files:
+      input_file: noc_list.csv
+    keys:
+      code: 'The secret code is: {codes.the_secret_one}'
+assert:
+  assertions:
+    - type: table
+      name: agents
+      records:
+        - call_sign: iron_man
+          first: tony
+          last: stark
+        - call_sign: hulk
+          first: bruce
+          last: banner
+    - type: table
+      name: agents
+      logic: includes
+      records:
+        - last: stark
+    - type: output
+      value: output to stdout
+````
+
+##### Stage Section
+
+The stage section defines the state that needs to exist before PDI execution. There are two options:
+
+1. Files
+2. Fixtures
+
+###### Files
+
+Each file entry specifies two things:
+
+* **src**: the location of the file (relative to the `simmer/files` directory)
+* **dest**: where to copy it to (within the configured file system: local or S3)
+
+###### Fixtures
+
+Each fixture entry contains the name of a fixture specified in one of the yaml files within the fixtures directory.
+
+Fixtures live in yaml files within the `simmer/fixtures` directory. They can be placed in any arbitrary file; the only restriction is that their top-level keys must uniquely identify a fixture. Here is an example of a fixture file:
+
+````yaml
+hulk:
+  table: agents
+  fields:
+    call_sign: hulk
+    first: CLASSIFIED
+    last: CLASSIFIED
+
+iron_man:
+  table: agents
+  fields:
+    call_sign: iron_man
+    first: CLASSIFIED
+    last: CLASSIFIED
+````
+
+This example specifies two fixtures: `hulk` and `iron_man`. Each will end up creating a record in the `agents` table with their respective attributes (columns).
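To make the fixture-to-record mapping concrete, here is a minimal sketch that renders one fixture as an SQL `INSERT`. The helper `fixture_to_insert` is hypothetical and is not how simmer is known to stage data; it only shows how `table` and `fields` could translate into a row.

```ruby
require 'yaml'

fixtures = YAML.safe_load(<<~YAML)
  hulk:
    table: agents
    fields:
      call_sign: hulk
      first: CLASSIFIED
      last: CLASSIFIED
YAML

# Hypothetical helper: renders one fixture as an INSERT statement.
# Real code should rely on the database driver's escaping instead.
def fixture_to_insert(fixture)
  fields  = fixture.fetch('fields')
  columns = fields.keys.join(', ')
  values  = fields.values.map { |v| "'#{v.to_s.gsub("'", "''")}'" }.join(', ')

  "INSERT INTO #{fixture.fetch('table')} (#{columns}) VALUES (#{values})"
end

puts fixture_to_insert(fixtures.fetch('hulk'))
# INSERT INTO agents (call_sign, first, last) VALUES ('hulk', 'CLASSIFIED', 'CLASSIFIED')
```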
+
+##### Act Section
+
+The act configuration contains the necessary information for invoking Pentaho through its Spoon script. The options are:
+
+* **name**: The name of the transformation or job
+* **repository**: The name of the Kettle repository
+* **type**: transformation or job
+* **file params**: key-value pairs to send through to Spoon as params. The values are relative to, and will be joined with, the `simmer/files` directory.
+* **key params**: key-value pairs to send through to Spoon as params.
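The difference between the two param groups can be sketched as follows. `build_params` is a hypothetical helper, and whether simmer produces relative or absolute paths is not stated in the README; this sketch simply joins each file value with `simmer/files` and passes key values through untouched.

```ruby
FILES_DIR = 'simmer/files' # directory named in the README

# Hypothetical helper: merges both param groups into one flat hash.
def build_params(params, files_dir: FILES_DIR)
  file_params = params.fetch('files', {}).transform_values { |v| File.join(files_dir, v) }
  key_params  = params.fetch('keys', {})

  file_params.merge(key_params)
end

params = {
  'files' => { 'input_file' => 'noc_list.csv' },
  'keys'  => { 'code' => 'The secret code is: {codes.the_secret_one}' }
}

puts build_params(params)['input_file'] # prints "simmer/files/noc_list.csv"
```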
+
+##### Assert Section
+
+The assert section contains the expected state of:
+
+* Database table contents
+* Pentaho output contents
+
+Take the assert block from the example above:
+
+````yaml
+assert:
+  assertions:
+    - type: table
+      name: agents
+      records:
+        - call_sign: iron_man
+          first: tony
+          last: stark
+        - call_sign: hulk
+          first: bruce
+          last: banner
+    - type: table
+      name: agents
+      logic: includes
+      records:
+        - last: stark
+    - type: output
+      value: output to stdout
+````
+
+This contains two table assertions and one output assertion. It explicitly states that:
+
+* The table `agents` should contain exactly two records with the column values as described (iron_man and hulk)
+* The table `agents` should include a record where the last name is `stark`
+* The output should contain the string described in the value somewhere in the log
+
+###### Table Assertion Rules
+
+Currently table assertions operate under a very rudimentary set of rules:
+
+* Record order does not matter
+* Each record being asserted should have the same keys compared
+* All values are asserted against their string coerced value
+* There is no concept of relationships or associations (yet)
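The first and third rules can be illustrated with a small sketch. `normalize` is a hypothetical helper, the `age` column is invented for the example, and coercing keys to strings is an assumption of this sketch; only the value coercion and order-insensitivity come from the rules above.

```ruby
require 'set'

# Hypothetical helper: string-coerce keys and values, and ignore record order.
def normalize(records)
  records.map { |record| record.map { |k, v| [k.to_s, v.to_s] }.to_h }.to_set
end

expected = [{ call_sign: 'hulk', age: 49 }, { call_sign: 'iron_man', age: 48 }]
actual   = [{ 'call_sign' => 'iron_man', 'age' => '48' }, { 'call_sign' => 'hulk', 'age' => '49' }]

puts normalize(expected) == normalize(actual) # prints "true"
```

Because both sides are reduced to sets of string-valued hashes, an integer `49` matches the string `'49'` and the differing record order has no effect.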
+
+### Running Tests
+
+After you have configured simmer and written a specification, you can run it by executing:
+
+````bash
+bundle exec simmer ./simmer/specs/name_of_the_spec.yaml
+````
+
+The passed-in path can also be a directory, in which case all specs in the directory (recursively) will be executed:
+
+````bash
+bundle exec simmer ./simmer/specs/some_directory
+````
+
+You can also omit the path altogether to execute all specs:
+
+````bash
+bundle exec simmer
+````
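The file-or-directory behavior described above could be approximated as shown below. `spec_paths` is a hypothetical helper written for this illustration; the README does not say how simmer discovers spec files, so the `.yaml` glob and the sorting are assumptions.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: a file path maps to itself, a directory to every
# .yaml file beneath it (recursively), sorted for a stable run order.
def spec_paths(path)
  return [path] if File.file?(path)

  Dir.glob(File.join(path, '**', '*.yaml')).sort
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'specs', 'nested'))
  FileUtils.touch(File.join(root, 'specs', 'a.yaml'))
  FileUtils.touch(File.join(root, 'specs', 'nested', 'b.yaml'))

  found = spec_paths(File.join(root, 'specs')).map { |f| f.sub("#{root}/", '') }
  puts found # prints specs/a.yaml, then specs/nested/b.yaml
end
```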
+
+## Contributing
+
+### Development Environment Configuration
+
+Basic steps to take to get this repository compiling:
+
+1. Install [Ruby](https://www.ruby-lang.org/en/documentation/installation/) (check simmer.gemspec for versions supported)
+2. Install bundler (gem install bundler)
+3. Clone the repository (git clone git@github.com:bluemarblepayroll/simmer.git)
+4. Navigate to the root folder (cd simmer)
+5. Install dependencies (bundle)
+
+### Running Tests
+
+To execute the test suite and code-coverage tool, run:
+
+````bash
+bundle exec rspec spec --format documentation
+````
+
+Alternatively, you can have Guard watch for changes:
+
+````bash
+bundle exec guard
+````
+
+Also, do not forget to run Rubocop:
+
+````bash
+bundle exec rubocop
+````
+
+or run all three in one command:
+
+````bash
+bundle exec rake
+````
+
+### Publishing
+
+Note: ensure you have proper authorization before trying to publish new versions.
+
+After code changes have successfully gone through the Pull Request review process, the following steps should be followed for publishing new versions:
+
+1. Merge Pull Request into master
+2. Update `lib/simmer/version.rb` using [semantic versioning](https://semver.org/)
+3. Install dependencies: `bundle`
+4. Update `CHANGELOG.md` with release notes
+5. Commit & push master to remote and ensure CI builds master successfully
+6. Run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+
+## Code of Conduct
+
+Everyone interacting in this codebase, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/bluemarblepayroll/simmer/blob/master/CODE_OF_CONDUCT.md).
+
+## License
+
+This project is MIT Licensed.
data/lib/simmer/specification/assert/assertions/table.rb
CHANGED
@@ -17,9 +17,23 @@ module Simmer
 class Table
   acts_as_hashable
 
-
+  module Logic
+    EQUALS = :equals
+    INCLUDES = :includes
+  end
+  include Logic
+
+  LOGIC_METHODS = {
+    EQUALS => ->(actual_record_set, record_set) { actual_record_set == record_set },
+    INCLUDES => lambda { |actual_record_set, record_set|
+      (actual_record_set & record_set) == record_set
+    }
+  }.freeze
+
+  attr_reader :logic, :name, :record_set
 
-  def initialize(name:, records: [])
+  def initialize(logic: EQUALS, name:, records: [])
+    @logic = Logic.const_get(logic.to_s.upcase.to_sym)
     @name = name.to_s
     @record_set = Util::RecordSet.new(records)
 
@@ -31,7 +45,7 @@ module Simmer
     actual_records = database.records(name, keys)
     actual_record_set = Util::RecordSet.new(actual_records)
 
-    return nil if actual_record_set
+    return nil if LOGIC_METHODS[logic].call(actual_record_set, record_set)
 
     BadTableAssertion.new(name, record_set, actual_record_set)
   end
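The two lambdas in `LOGIC_METHODS` can be exercised in isolation with a stand-in record set. `MiniRecordSet` below is hypothetical: the real `Util::RecordSet` is not shown in this diff, so only the `==` and `&` behavior the lambdas rely on is sketched.

```ruby
require 'set'

# Hypothetical stand-in for Util::RecordSet; only == and & are sketched.
class MiniRecordSet
  attr_reader :records

  def initialize(records)
    @records = records.uniq
  end

  # Intersection: keeps the records present in both sets.
  def &(other)
    self.class.new(records & other.records)
  end

  # Order-insensitive equality.
  def ==(other)
    other.is_a?(self.class) && records.to_set == other.records.to_set
  end
end

LOGIC_METHODS = {
  equals:   ->(actual, expected) { actual == expected },
  includes: ->(actual, expected) { (actual & expected) == expected }
}.freeze

actual   = MiniRecordSet.new([{ 'last' => 'stark' }, { 'last' => 'banner' }])
expected = MiniRecordSet.new([{ 'last' => 'stark' }])

puts LOGIC_METHODS[:equals].call(actual, expected)   # prints "false"
puts LOGIC_METHODS[:includes].call(actual, expected) # prints "true"
```

The `includes` check passes when intersecting the actual set with the expected set gives back the expected set unchanged, that is, when every expected record is present. Separately, because the constructor in the hunk resolves `logic` through `Logic.const_get`, a value other than `equals` or `includes` would raise a `NameError`.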
data/lib/simmer/util/record.rb
CHANGED
data/lib/simmer/version.rb
CHANGED
data/spec/simmer/specification/assert_spec.rb
CHANGED
@@ -18,9 +18,11 @@ describe Simmer::Specification::Assert do
   subject { described_class.make(config) }
 
   it 'sets assertions' do
-    expect(subject.assertions.length).to eq(
+    expect(subject.assertions.length).to eq(3)
     expect(subject.assertions.first.name).to eq('agents')
     expect(subject.assertions.first.record_set.length).to eq(2)
+    expect(subject.assertions[1].name).to eq('agents')
+    expect(subject.assertions[1].record_set.length).to eq(1)
     expect(subject.assertions.last.value).to eq('output to stdout')
   end
 end
data/spec/simmer/util/record_set_spec.rb
CHANGED
@@ -38,6 +38,13 @@ describe Simmer::Util::RecordSet do
     }
   end
 
+  let(:hash3) do
+    {
+      e: 'e',
+      f: 'f'
+    }
+  end
+
   subject { described_class.new([hash1, hash2]) }
 
   describe 'equality' do
@@ -64,4 +71,14 @@ describe Simmer::Util::RecordSet do
       expect(subject.records.length).to eq(1)
     end
   end
+
+  specify '#& returns intersection of both record sets' do
+    subject1 = described_class.new([hash1, hash2])
+    subject2 = described_class.new([hash1, hash3])
+
+    expected = described_class.new([hash1])
+    actual = subject1 & subject2
+
+    expect(actual).to eq(expected)
+  end
 end