json_data_extractor 0.0.11 → 0.0.12
- checksums.yaml +4 -4
- data/README.md +100 -7
- data/lib/json_data_extractor.rb +18 -2
- data/lib/src/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 0b205c1e3094b426d39161a47261d0d0168244cff62b3c99b557d7f0e286a322
|
|
4
|
+
data.tar.gz: 3bf7cc322309d9e0c7bb8997eaae41346fbb97e42280a78b5e663c8307ec5311
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 7973f7ca4b7575333282cbb0033f6d326b30003b8c36a7f9b7f0be4e7ee6194bbeb712e5abe2446d77e8ab8ffd5b7bf48c9d125a280da4c7d54aa62d11d2c0a8
|
|
7
|
+
data.tar.gz: 23507463438d1a106da73c0b561912ac16963aa49160563d876883586792c89ca2a2b0531ab236fffb11a45dc2eba0573f012727b28b518cb72ea3b7db35d558
|
data/README.md
CHANGED
|
@@ -163,9 +163,91 @@ results = extractor.extract(schema)
|
|
|
163
163
|
```
|
|
164
164
|
|
|
165
165
|
Modifiers are called in the order in which they are defined, so keep that in mind when defining your
|
|
166
|
-
schema. By default JDE raises an ArgumentError if a modifier is not applicable, but this behaviour
|
|
166
|
+
schema. By default JDE raises an ArgumentError if a modifier is not applicable, but this behaviour
|
|
167
167
|
can be configured to ignore missing modifiers. See Configuration options for details.
|
|
168
168
|
|
|
169
|
+
### Maps
|
|
170
|
+
|
|
171
|
+
The JsonDataExtractor gem provides a powerful feature called "maps" that allows you to transform
|
|
172
|
+
extracted data using predefined mappings. Maps are useful when you want to convert specific values
|
|
173
|
+
from the source data into different values based on predefined rules. The best use case is when you
|
|
174
|
+
need to traverse a complex tree to get to a value and then just convert it to your own dictionary.
|
|
175
|
+
E.g.:
|
|
176
|
+
|
|
177
|
+
```ruby
|
|
178
|
+
data = {
|
|
179
|
+
cars: [
|
|
180
|
+
{ make: 'A', fuel: 1 },
|
|
181
|
+
{ make: 'B', fuel: 2 },
|
|
182
|
+
{ make: 'C', fuel: 3 },
|
|
183
|
+
{ make: 'D', fuel: nil },
|
|
184
|
+
]
|
|
185
|
+
}
|
|
186
|
+
|
|
187
|
+
FUEL_TYPES = { 1 => 'Petrol', 2 => 'Diesel', nil => 'Unknown' }
|
|
188
|
+
schema = {
|
|
189
|
+
fuel: {
|
|
190
|
+
path: '$.cars[*].fuel',
|
|
191
|
+
map: FUEL_TYPES
|
|
192
|
+
}
|
|
193
|
+
}
|
|
194
|
+
result = JsonDataExtractor.new(data).extract(schema) # => {"fuel":["Petrol","Diesel",nil,"Unknown"]}
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
A map is essentially a dictionary that defines key-value pairs, where the keys represent the source
|
|
198
|
+
values and the corresponding values represent the transformed values. When extracting data, you can
|
|
199
|
+
apply one or multiple maps to modify the extracted values.
|
|
200
|
+
|
|
201
|
+
#### Syntax
|
|
202
|
+
|
|
203
|
+
To define a map, you can use the `map` or `maps` key in the schema. The map value can be a single
|
|
204
|
+
hash or an array of hashes, where each hash represents a separate mapping rule. Here's an example:
|
|
205
|
+
|
|
206
|
+
```ruby
|
|
207
|
+
{
|
|
208
|
+
path: "$.data[*].category",
|
|
209
|
+
map: {
|
|
210
|
+
"fruit" => "Fresh Fruit",
|
|
211
|
+
"vegetable" => "Organic Vegetable",
|
|
212
|
+
"meat" => "Premium Meat"
|
|
213
|
+
},
|
|
214
|
+
}
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
Multiple maps can also be provided. In this case, each map is applied to the result of the previous
|
|
218
|
+
transformation:
|
|
219
|
+
|
|
220
|
+
```ruby
|
|
221
|
+
{
|
|
222
|
+
path: "$.data[*].category",
|
|
223
|
+
maps: [
|
|
224
|
+
{
|
|
225
|
+
"fruit" => "Fresh Fruit",
|
|
226
|
+
"vegetable" => "Organic Vegetable",
|
|
227
|
+
"meat" => "Premium Meat",
|
|
228
|
+
},
|
|
229
|
+
{
|
|
230
|
+
"Fresh Fruit" => "Frisches Obst",
|
|
231
|
+
"Organic Vegetable" => "Biologisches Gemüse",
|
|
232
|
+
"Premium Meat" => "Hochwertiges Fleisch",
|
|
233
|
+
}
|
|
234
|
+
]
|
|
235
|
+
}
|
|
236
|
+
```
|
|
237
|
+
|
|
238
|
+
_(the example is a little bit silly, but you should get the idea of chaining maps)_
|
|
239
|
+
|
|
240
|
+
You can use the keys `:map` and `:maps` interchangeably, much like `:modifier` and `:modifiers`.
|
|
241
|
+
|
|
242
|
+
#### Notes
|
|
243
|
+
|
|
244
|
+
- Maps can be used together with modifiers, but this makes less sense as you can always apply complex
|
|
245
|
+
mapping rules in modifiers themselves.
|
|
246
|
+
- If used together with modifiers, maps are applied **after** modifiers.
|
|
247
|
+
- If a map does not have a key corresponding to a transformed value, it will return nil, so be careful.
|
|
248
|
+
- Maps are applied in the order they are defined in the schema. Be cautious of the order if you have
|
|
249
|
+
overlapping or conflicting mapping rules.
|
|
250
|
+
|
|
169
251
|
### Nested schemas
|
|
170
252
|
|
|
171
253
|
JDE supports nested schemas. Just provide your element with a type of `array` and add a `schema` key
|
|
@@ -189,26 +271,37 @@ E.g. this is a valid real-life schema with nested data:
|
|
|
189
271
|
}
|
|
190
272
|
}
|
|
191
273
|
```
|
|
274
|
+
|
|
192
275
|
Nested schemas can also be applied to objects, not just arrays. See specs for more examples.
|
|
193
276
|
|
|
194
277
|
## Configuration Options
|
|
195
|
-
|
|
278
|
+
|
|
279
|
+
The JsonDataExtractor gem provides a configuration option to control the behavior when encountering
|
|
280
|
+
invalid modifiers.
|
|
196
281
|
|
|
197
282
|
### Strict Modifiers
|
|
198
|
-
By default, the gem operates in strict mode, which means that if an invalid modifier is encountered, an `ArgumentError` will be raised. This ensures that only valid modifiers are applied to the extracted data.
|
|
199
283
|
|
|
200
|
-
|
|
284
|
+
By default, the gem operates in strict mode, which means that if an invalid modifier is encountered,
|
|
285
|
+
an `ArgumentError` will be raised. This ensures that only valid modifiers are applied to the
|
|
286
|
+
extracted data.
|
|
287
|
+
|
|
288
|
+
To change this behavior and allow the use of invalid modifiers without raising an error, you can
|
|
289
|
+
configure the gem to operate in non-strict mode.
|
|
201
290
|
|
|
202
291
|
```ruby
|
|
203
292
|
JsonDataExtractor.configure do |config|
|
|
204
293
|
config.strict_modifiers = false
|
|
205
294
|
end
|
|
206
295
|
```
|
|
207
|
-
When `strict_modifiers` is set to `false`, any invalid modifiers will be ignored, and the original value will be returned without applying any modification.
|
|
208
296
|
|
|
209
|
-
|
|
297
|
+
When `strict_modifiers` is set to `false`, any invalid modifiers will be ignored, and the original
|
|
298
|
+
value will be returned without applying any modification.
|
|
299
|
+
|
|
300
|
+
It is important to note that enabling non-strict mode should be done with caution, as it can lead to
|
|
301
|
+
unexpected behavior if there are typos or incorrect modifiers specified in the schema.
|
|
210
302
|
|
|
211
|
-
By default, `strict_modifiers` is set to `true`, providing a safe and strict behavior. However, you
|
|
303
|
+
By default, `strict_modifiers` is set to `true`, providing a safe and strict behavior. However, you
|
|
304
|
+
can customize this configuration option according to your specific needs.
|
|
212
305
|
|
|
213
306
|
## TODO
|
|
214
307
|
|
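The map chaining described in the README changes above can be sketched in plain Ruby, without loading the gem. The helper name `apply_maps_sketch` and the two constants are hypothetical and exist only for this illustration; the lookup itself follows what the `apply_maps` method added in this release appears to do, a plain `Hash#[]` per map, so read it as a sketch under that assumption rather than as the gem's API.

```ruby
# Hypothetical stand-in for the gem's map chaining: each map is applied to the
# result of the previous one using a plain Hash lookup.
def apply_maps_sketch(values, maps)
  values.map do |value|
    maps.reduce(value) { |current, map| map[current] }
  end
end

TO_ENGLISH = {
  'fruit' => 'Fresh Fruit',
  'meat' => 'Premium Meat'
}.freeze

TO_GERMAN = {
  'Fresh Fruit' => 'Frisches Obst',
  'Premium Meat' => 'Hochwertiges Fleisch'
}.freeze

categories = %w[fruit vegetable meat]

single  = apply_maps_sketch(categories, [TO_ENGLISH])
chained = apply_maps_sketch(categories, [TO_ENGLISH, TO_GERMAN])

p single   # "vegetable" has no entry in TO_ENGLISH, so it becomes nil
p chained  # the nil from the first map stays nil after the second lookup
```

Because a missing key yields `nil` at the first step, every later map in the chain also receives `nil`, which is why the order and completeness of the maps matter.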
data/lib/json_data_extractor.rb
CHANGED
|
@@ -23,6 +23,13 @@ class JsonDataExtractor
|
|
|
23
23
|
if val.is_a?(Hash)
|
|
24
24
|
val.transform_keys!(&:to_sym)
|
|
25
25
|
path = val[:path]
|
|
26
|
+
maps = Array([val[:maps] || val[:map]]).flatten.compact.map do |map|
|
|
27
|
+
if map.is_a?(Hash)
|
|
28
|
+
map
|
|
29
|
+
else
|
|
30
|
+
raise ArgumentError, "Invalid map: #{map.inspect}"
|
|
31
|
+
end
|
|
32
|
+
end
|
|
26
33
|
modifiers = Array(val[:modifiers] || val[:modifier]).map do |mod|
|
|
27
34
|
case mod
|
|
28
35
|
when Symbol, Proc
|
|
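The normalization in the hunk above wraps the option in an array before calling `Array(...)`. A likely reason is that `Array` applied to a bare Hash splits it into key/value pairs. The following sketch reproduces that step in isolation; `normalize_maps` is a hypothetical helper name, not a method of the gem.

```ruby
# Hypothetical helper reproducing the normalization step from the diff:
# accept nil, a single Hash, or an Array of Hashes, and return an Array of Hashes.
def normalize_maps(option)
  Array([option]).flatten.compact.map do |map|
    raise ArgumentError, "Invalid map: #{map.inspect}" unless map.is_a?(Hash)

    map
  end
end

fuel = { 1 => 'Petrol', 2 => 'Diesel' }
german = { 'Petrol' => 'Benzin' }

p Array(fuel)                     # a bare Hash is split into [key, value] pairs
p normalize_maps(fuel)            # wrapping first keeps the Hash whole
p normalize_maps([fuel, german])  # an Array of Hashes is flattened back to one level
p normalize_maps(nil)             # no map given: empty list
```

Passing anything other than a Hash (for example a String) raises `ArgumentError`, matching the `Invalid map` branch in the diff.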
@@ -38,6 +45,7 @@ class JsonDataExtractor
|
|
|
38
45
|
else
|
|
39
46
|
path = val
|
|
40
47
|
modifiers = []
|
|
48
|
+
maps = []
|
|
41
49
|
end
|
|
42
50
|
|
|
43
51
|
extracted_data = JsonPath.on(@data, path)
|
|
@@ -45,7 +53,8 @@ class JsonDataExtractor
|
|
|
45
53
|
if extracted_data.empty?
|
|
46
54
|
results[key] = nil
|
|
47
55
|
else
|
|
48
|
-
|
|
56
|
+
transformed_data = apply_modifiers(extracted_data, modifiers)
|
|
57
|
+
results[key] = apply_maps(transformed_data, maps)
|
|
49
58
|
|
|
50
59
|
if array_type && nested
|
|
51
60
|
results[key] = extract_nested_data(results[key], nested)
|
|
@@ -72,6 +81,14 @@ class JsonDataExtractor
|
|
|
72
81
|
end
|
|
73
82
|
end
|
|
74
83
|
|
|
84
|
+
def apply_maps(data, maps)
|
|
85
|
+
data.map do |value|
|
|
86
|
+
mapped_value = value
|
|
87
|
+
maps.each { |map| mapped_value = map[mapped_value] }
|
|
88
|
+
mapped_value
|
|
89
|
+
end
|
|
90
|
+
end
|
|
91
|
+
|
|
75
92
|
def apply_modifiers(data, modifiers)
|
|
76
93
|
data.map do |value|
|
|
77
94
|
modified_value = value
|
|
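Since `apply_maps` in the hunk above looks values up with a plain `map[mapped_value]`, an unmapped value becomes `nil`. One possible way to keep unmapped values instead is a Hash with a default block. This is only a sketch in plain Ruby; it assumes the gem keeps using `Hash#[]` for lookups and does not rely on any gem feature.

```ruby
# Plain lookup, as in the diff: unknown keys come back as nil.
strict = { 1 => 'Petrol', 2 => 'Diesel' }

# Assumption: a Hash built with a default block is still a Hash, and Hash#[]
# calls the block for missing keys. Here the block returns the key unchanged.
passthrough = Hash.new { |_hash, key| key }
passthrough.update(strict)

values = [1, 2, 3]

strict_result = values.map { |v| strict[v] }
passthrough_result = values.map { |v| passthrough[v] }

p strict_result       # the unmapped 3 turns into nil
p passthrough_result  # the unmapped 3 is kept as-is
```

The default block does not store anything in the Hash, so the map itself stays unchanged between lookups.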
@@ -96,7 +113,6 @@ class JsonDataExtractor
|
|
|
96
113
|
end
|
|
97
114
|
end
|
|
98
115
|
|
|
99
|
-
|
|
100
116
|
class << self
|
|
101
117
|
def configuration
|
|
102
118
|
@configuration ||= Configuration.new
|
data/lib/src/version.rb
CHANGED
metadata
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: json_data_extractor
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.0.
|
|
4
|
+
version: 0.0.12
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Max Buslaev
|
|
8
8
|
autorequire:
|
|
9
9
|
bindir: exe
|
|
10
10
|
cert_chain: []
|
|
11
|
-
date: 2023-07-
|
|
11
|
+
date: 2023-07-07 00:00:00.000000000 Z
|
|
12
12
|
dependencies:
|
|
13
13
|
- !ruby/object:Gem::Dependency
|
|
14
14
|
name: bundler
|