json_data_extractor 0.0.8 → 0.0.10
- checksums.yaml +4 -4
- data/README.md +78 -26
- data/json_data_extractor.gemspec +1 -1
- data/lib/json_data_extractor.rb +27 -15
- data/lib/src/version.rb +1 -1
- metadata +6 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: be51d8200f061eb267a9c591934a29072729462d242c349c6057e48cb5e77627
+  data.tar.gz: 88213a955399a735f2cd15546d9c0b9ad22ba38de8798e9070d5a35e4a47902f
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f724f3a5b3542644abe1103d8831bdefa0e6bbf2f073d8bfe928cce6f8d42992ecf570527f4adceba4b8a01f12fe0146511ae2ef80301f1a48a52b7e0f5c697a
+  data.tar.gz: e4318e950ec778eb19558cc1a7da37234f730ba977b42ff19f5be4d5b86c8b7a4b8a7e50a85705fa38c89d41d9cb0f39493d8f6c0bfa1e79215956f2e9817528
data/README.md
CHANGED
@@ -3,13 +3,15 @@
 NOTE: This is still a very early beta.
 
 Transform JSON data structures with the help of a simple schema and JsonPath expressions.
-Use the JsonDataExtractor gem to extract and modify data from complex JSON structures using a
+Use the JsonDataExtractor gem to extract and modify data from complex JSON structures using a
+straightforward syntax
 and a range of built-in or custom modifiers.
 
 _Another try to make something for JSON that is XSLT for XML.
 We transform one JSON into another JSON with the help of a third JSON!!!111!!eleventy!!_
 
-Remap one JSON structure into another with some basic rules
+Remap one JSON structure into another with some basic rules
+and [jsonpath](https://github.com/joshbuddy/jsonpath).
 
 Heavily inspired by [xml_data_extractor](https://github.com/monde-sistemas/xml_data_extractor).
 
@@ -32,8 +34,8 @@ Or install it yourself as:
 ## Usage
 
 JsonDataExtractor allows you to remap one JSON structure into another with some basic rules
-and [JSONPath](https://goessner.net/articles/JsonPath/) expressions. The process involves defining a
-the input JSON structure to the desired output structure.
+and [JSONPath](https://goessner.net/articles/JsonPath/) expressions. The process involves defining a
+schema that maps the input JSON structure to the desired output structure.
 
 We'll base our examples on the following source:
 
@@ -78,15 +80,15 @@ We'll base our examples on the following source:
 
 ### Defining a Schema
 
-A schema consists of one or more mappings that specify how to extract data from the input JSON and
-the output JSON.
+A schema consists of one or more mappings that specify how to extract data from the input JSON and
+where to place it in the output JSON.
 
-Each mapping has a path field that specifies the JsonPath expression to use for data extraction, and
-modifier field that specifies one or more modifiers to apply to the extracted data.
-data in some way before placing it in the output JSON.
+Each mapping has a path field that specifies the JsonPath expression to use for data extraction, and
+an optional modifier field that specifies one or more modifiers to apply to the extracted data.
+Modifiers are used to transform the data in some way before placing it in the output JSON.
 
-Here's an example schema that extracts the authors and categories from a JSON structure similar to
-previous example (here it's in YAML just for readability):
+Here's an example schema that extracts the authors and categories from a JSON structure similar to
+the one used in the previous example (here it's in YAML just for readability):
 
 ```yaml
 schemas:
@@ -116,23 +118,66 @@ The resulting json will be:
 
 ```
 
-Modifiers
-
+### Modifiers
+
+Modifiers can be supplied on object creation and/or added later by calling `#add_modifier` method.
+Please see specs for examples.
+Modifiers allow you to perform transformations on the extracted data before it is returned. You can
+use modifiers to clean up the data, format it, or apply any custom logic you need.
+
+Modifiers can be defined in two ways: by providing a symbol corresponding to the name of the method
+or lambda that should be called on each extracted value, or by providing an anonymous lambda. Here's
+an example schema that uses both types of modifiers:
+
+```ruby
+schema = {
+  name: '$.name',
+  age: { path: '$.age', modifier: :to_i },
+  email: { path: '$.contact.email', modifiers: [:downcase, lambda { |email| email.gsub(/\s/, '') }] }
+}
+
+```
+
+In this schema, the name value is simply extracted as-is. The age value is extracted from the JSON,
+but it is modified with the `to_i` method, which converts the value to an integer. The email value
+is extracted from a nested object, and then passed through two modifiers: first `downcase` is called
+to convert the email address to all lowercase letters, and then an anonymous lambda is called to
+remove any whitespace in the email address.
+
+You can also define custom modifiers by passing a lambda to the `add_modifier` method on a
+JsonDataExtractor instance:
+
+```ruby
+extractor = JsonDataExtractor.new(json_data)
+extractor.add_modifier(:remove_newlines) { |value| value.gsub("\n", '') }
+
+schema = {
+  name: 'name',
+  bio: { path: 'bio', modifiers: [:remove_newlines] }
+}
+
+results = extractor.extract(schema)
+
+```
+
+Modifiers are called in the order in which they are defined, so keep that in mind when defining your
+schema.
 
 ### Nested schemas
 
-JDE supports nested schemas. Just provide your element with a type of `array` and add a `schema` key
+JDE supports nested schemas. Just provide your element with a type of `array` and add a `schema` key
+for its data.
 
 E.g. this is a valid real-life schema with nested data:
 
 ```json
 {
-  "name":
-  "code":
-  "services":
+  "name": "$.Name",
+  "code": "$.Code",
+  "services": "$.Services[*].Code",
   "locations": {
-    "path":
-    "type":
+    "path": "$.Locations[*]",
+    "type": "array",
     "schema": {
       "name": "$.Name",
       "type": "$.Type",
@@ -148,25 +193,32 @@ Update this readme for better usage cases. Add info on arrays and modifiers.
 
 ## Development
 
-After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run
+After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run
+the tests. You can
 also run `bin/console` for an interactive prompt that will allow you to experiment.
 
-To install this gem onto your local machine, run `bundle exec rake install`. To release a new
-version
+To install this gem onto your local machine, run `bundle exec rake install`. To release a new
+version, update the
+version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag
+for the version,
 push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
 
 ## Contributing
 
-Bug reports and pull requests are welcome on GitHub
-
+Bug reports and pull requests are welcome on GitHub
+at https://github.com/austerlitz/json_data_extractor. This project
+is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere
+to
 the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
 
 ## License
 
-The gem is available as open source under the terms of
+The gem is available as open source under the terms of
+the [MIT License](https://opensource.org/licenses/MIT).
 
 ## Code of Conduct
 
-Everyone interacting in the JsonDataExtractor project’s codebases, issue trackers, chat rooms and
+Everyone interacting in the JsonDataExtractor project’s codebases, issue trackers, chat rooms and
+mailing lists is
 expected to follow
 the [code of conduct](https://github.com/austerlitz/json_data_extractor/blob/master/CODE_OF_CONDUCT.md).
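The ordering rule stated in the new README text ("Modifiers are called in the order in which they are defined") can be illustrated without the gem. The following is a minimal sketch, not the gem's implementation: `apply_chain` and the `custom` registry are hypothetical names, and the sketch assumes that a symbol is first looked up among registered custom lambdas and otherwise sent to the value as a method.

```ruby
# Hypothetical helper: run a value through modifiers in the order given.
# `custom` stands in for modifiers registered via something like #add_modifier.
def apply_chain(value, modifiers, custom = {})
  modifiers.reduce(value) do |current, modifier|
    case modifier
    when Proc
      modifier.call(current)
    when Symbol
      # Assumption: registered lambdas win over plain method calls.
      custom.key?(modifier) ? custom[modifier].call(current) : current.public_send(modifier)
    else
      raise ArgumentError, "Invalid modifier: #{modifier.inspect}"
    end
  end
end

custom       = { remove_newlines: ->(value) { value.gsub("\n", '') } }
strip_spaces = ->(email) { email.gsub(/\s/, '') }

# :downcase runs first, then the anonymous lambda sees the lowercased string.
email = apply_chain(' John.Doe@Example.COM ', [:downcase, strip_spaces])
age   = apply_chain('42', [:to_i])
bio   = apply_chain("line one\nline two", [:remove_newlines], custom)
```

Because each modifier receives the output of the previous one, swapping the order only matters when the modifiers do not commute, for example a type conversion followed by a string method.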
data/json_data_extractor.gemspec
CHANGED
@@ -26,7 +26,7 @@ transformations. The schema is defined as a simple Ruby hash that maps keys to p
   spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
   spec.require_paths = ['lib']
 
-  spec.add_development_dependency 'bundler'
+  spec.add_development_dependency 'bundler'
   spec.add_development_dependency 'rake', '~> 10.0'
   spec.add_development_dependency 'rspec', '~> 3.0'
   spec.add_development_dependency 'pry'
data/lib/json_data_extractor.rb
CHANGED
@@ -22,7 +22,16 @@ class JsonDataExtractor
       if val.is_a?(Hash)
         val.transform_keys!(&:to_sym)
         path = val[:path]
-        modifiers = Array(val[:modifiers] || val[:modifier]).map
+        modifiers = Array(val[:modifiers] || val[:modifier]).map do |mod|
+          case mod
+          when Symbol, Proc
+            mod
+          when String
+            mod.to_sym
+          else
+            raise ArgumentError, "Invalid modifier: #{mod.inspect}"
+          end
+        end
         array_type = 'array' == val[:type]
         nested = val.dup.delete(:schema)
       else
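The normalization step added in this hunk can be tried in isolation. The sketch below copies the `case` logic into a hypothetical standalone method, `normalize_modifiers` (the name and the plain-hash argument are assumptions for illustration, not part of the gem's API), to show what each input shape turns into.

```ruby
# Hypothetical standalone copy of the normalization shown in the hunk.
# `spec` is one schema entry with its keys already symbolized.
def normalize_modifiers(spec)
  Array(spec[:modifiers] || spec[:modifier]).map do |mod|
    case mod
    when Symbol, Proc then mod          # already usable as-is
    when String       then mod.to_sym   # e.g. names read from a YAML or JSON schema
    else raise ArgumentError, "Invalid modifier: #{mod.inspect}"
    end
  end
end

from_strings = normalize_modifiers({ modifiers: %w[downcase strip] })
single       = normalize_modifiers({ modifier: :to_i })
no_modifier  = normalize_modifiers({ path: '$.name' })

# Anything that is not a Symbol, Proc, or String is rejected up front.
rejected = begin
  normalize_modifiers({ modifier: 42 })
  false
rescue ArgumentError
  true
end
```

`Array(nil)` returns an empty array, so an entry without any modifier key simply yields no modifiers.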
@@ -37,20 +46,17 @@ class JsonDataExtractor
       else
         results[key] = apply_modifiers(extracted_data, modifiers)
 
-
-
-
-
-
-
-
-
-
-
-          results[key] = []
-          Array(extracted_data).each do |item|
-            results[key] << self.class.new(item, @modifiers).extract(nested)
-          end
+        if array_type && nested
+          results[key] = extract_nested_data(results[key], nested)
+        elsif !array_type && nested
+          results[key] = extract_nested_data(results[key], nested).first
+        elsif !array_type && 1 < results[key].size
+          # TODO: handle case where results[key] has more than one item
+          # do nothing for now
+        elsif array_type && !nested
+          # do nothing, it is already an array
+        else
+          results[key] = results[key].first
         end
       end
     end
@@ -59,6 +65,12 @@ class JsonDataExtractor
 
   private
 
+  def extract_nested_data(data, schema)
+    Array(data).map do |item|
+      self.class.new(item, modifiers).extract(schema)
+    end
+  end
+
   def apply_modifiers(data, modifiers)
     data.map do |value|
       modified_value = value
data/lib/src/version.rb
CHANGED
metadata
CHANGED
@@ -1,29 +1,29 @@
 --- !ruby/object:Gem::Specification
 name: json_data_extractor
 version: !ruby/object:Gem::Version
-  version: 0.0.
+  version: 0.0.10
 platform: ruby
 authors:
 - Max Buslaev
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2023-05-
+date: 2023-05-12 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - "
+    - - ">="
       - !ruby/object:Gem::Version
-        version: '
+        version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - "
+    - - ">="
       - !ruby/object:Gem::Version
-        version: '
+        version: '0'
 - !ruby/object:Gem::Dependency
   name: rake
   requirement: !ruby/object:Gem::Requirement