json-streamer 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: bdffaa800a206cbb2aecd8d6386b0f2dc7759cb3
4
+ data.tar.gz: ee433030988de4475141e342fffce3344b7b16d9
5
+ SHA512:
6
+ metadata.gz: 9195c29ee40264ad8579eed344182181e4d70c8abf677789ec2bd4deabc4fe31330ec4f69eec07abb9e33778f8753f8f38c67390d07dd5146c43df2e4bfe0f33
7
+ data.tar.gz: 37d8bf0ade2f73b28645f63c0f9c3a8427bce49b55b99a72b90e1d51fadd06ffd883cc7c2af9931538c7812bc53c47ab5b964f02e379309d82de032868baf959
data/.gitignore ADDED
@@ -0,0 +1,10 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
10
+ .idea/*
data/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,49 @@
1
+ # Contributor Code of Conduct
2
+
3
+ As contributors and maintainers of this project, and in the interest of
4
+ fostering an open and welcoming community, we pledge to respect all people who
5
+ contribute through reporting issues, posting feature requests, updating
6
+ documentation, submitting pull requests or patches, and other activities.
7
+
8
+ We are committed to making participation in this project a harassment-free
9
+ experience for everyone, regardless of level of experience, gender, gender
10
+ identity and expression, sexual orientation, disability, personal appearance,
11
+ body size, race, ethnicity, age, religion, or nationality.
12
+
13
+ Examples of unacceptable behavior by participants include:
14
+
15
+ * The use of sexualized language or imagery
16
+ * Personal attacks
17
+ * Trolling or insulting/derogatory comments
18
+ * Public or private harassment
19
+ * Publishing others' private information, such as physical or electronic
20
+ addresses, without explicit permission
21
+ * Other unethical or unprofessional conduct
22
+
23
+ Project maintainers have the right and responsibility to remove, edit, or
24
+ reject comments, commits, code, wiki edits, issues, and other contributions
25
+ that are not aligned to this Code of Conduct, or to ban temporarily or
26
+ permanently any contributor for other behaviors that they deem inappropriate,
27
+ threatening, offensive, or harmful.
28
+
29
+ By adopting this Code of Conduct, project maintainers commit themselves to
30
+ fairly and consistently applying these principles to every aspect of managing
31
+ this project. Project maintainers who do not follow or enforce the Code of
32
+ Conduct may be permanently removed from the project team.
33
+
34
+ This code of conduct applies both within project spaces and in public spaces
35
+ when an individual is representing the project or its community.
36
+
37
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
38
+ reported by contacting a project maintainer at csapagyi@users.noreply.github.com. All
39
+ complaints will be reviewed and investigated and will result in a response that
40
+ is deemed necessary and appropriate to the circumstances. Maintainers are
41
+ obligated to maintain confidentiality with regard to the reporter of an
42
+ incident.
43
+
44
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage],
45
+ version 1.3.0, available at
46
+ [http://contributor-covenant.org/version/1/3/0/][version]
47
+
48
+ [homepage]: http://contributor-covenant.org
49
+ [version]: http://contributor-covenant.org/version/1/3/0/
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in json-streamer.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2016 Csaba Apagyi
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,75 @@
1
+ # Json::Streamer
2
+
3
+ Utility to support JSON streaming allowing you to get objects based on various criteria.
4
+ Useful for e.g. streaming objects from a JSON array.
5
+
6
+ This gem will basically spare you the need to define your own callbacks when parsing a JSON stream.
7
+ Streaming is useful for
8
+ - big files that do not fit in memory (or where you'd rather avoid the risk)
9
+ - files read in chunks (e.g. arriving over network)
10
+ - cases where you expect some issue with the file (e.g. losing connection to source, invalid data at some point) but would like to get as much data as possible anyway
11
+
12
+ Regarding performance:
13
+ The gem uses JSON::Stream's events in the background. It was chosen because it's a pure Ruby parser.
14
+ A similar implementation can be done using the ~10 times faster Yajl::FFI gem that is dependent on the native YAJL library.
15
+ I did not measure the performance of my implementation on top of these libraries.
16
+
17
+ I do not recommend this or any of the gems mentioned above if you don't need streaming.
18
+
19
+ ## Installation
20
+
21
+ Add this line to your application's Gemfile:
22
+
23
+ ```ruby
24
+ gem 'json-streamer'
25
+ ```
26
+
27
+ And then execute:
28
+
29
+ $ bundle
30
+
31
+ Or install it yourself as:
32
+
33
+ $ gem install json-streamer
34
+
35
+ ## Usage
36
+
37
+ ```ruby
38
+ require 'json/streamer'
39
+
40
+ # Get a JsonStreamer object that will parse file_stream by chunks of 500
41
+ # Default chunk size is 1000
42
+ streamer = Json::Streamer::JsonStreamer.new(file_stream, 500)
43
+ ```
44
+
45
+ ```ruby
46
+ # Get objects based on nesting level
47
+ # First level will give you the full JSON, second level will give you objects within full JSON object, etc.
48
+ streamer.get_objects_from_level(2) { |object| p object }
49
+ ```
50
+
51
+ Getting second-level objects from the JSON below yields two empty objects:
52
+
53
+ ```json
54
+ {
55
+ "object1": {},
56
+ "object2": {}
57
+ }
58
+ ```
59
+
60
+ Check the unit tests for more examples.
61
+
62
+ ## Development
63
+
64
+ After checking out the repo, run `bin/setup` to install dependencies. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
65
+
66
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
67
+
68
+ ## Contributing
69
+
70
+ Bug reports and pull requests are welcome on GitHub at https://github.com/csapagyi/json-streamer. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
71
+
72
+
73
+ ## License
74
+
75
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
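The chunk size mentioned in the README's Usage section is handed to `each` on the IO object (see `@file_io.each(@chunk_size)` in `lib/json/streamer.rb`). The sketch below shows what that call yields for a `StringIO`. The helper name `chunks_of` and the sample constants are made up for illustration and are not part of the gem. Note that `each` with an integer limit also ends a chunk at a line separator, so the chunk size is an upper bound rather than an exact size.

```ruby
require 'stringio'

# Hypothetical helper (not part of json-streamer): collect the pieces
# that IO#each(limit) yields, the same call the gem uses to feed its parser.
def chunks_of(io, chunk_size)
  chunks = []
  io.each(chunk_size) { |chunk| chunks << chunk }
  chunks
end

# The sample document from the README, on a single line.
JSON_TEXT = '{"object1": {}, "object2": {}}'

# Each chunk holds at most 8 bytes; the final one holds the remainder.
CHUNKS = chunks_of(StringIO.new(JSON_TEXT), 8)

# A newline ends a chunk early even when the limit has not been reached.
LINE_CHUNKS = chunks_of(StringIO.new("ab\ncdefg"), 4)
```

Because the chunks are cut at arbitrary byte positions, keys and values are routinely split across two chunks, which is why an event-based parser that accepts partial input is needed underneath.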
data/Rakefile ADDED
@@ -0,0 +1,2 @@
1
+ require "bundler/gem_tasks"
2
+ task :default => :spec
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "json/streamer"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start
data/bin/setup ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
data/json-streamer.gemspec ADDED
@@ -0,0 +1,27 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'json/streamer/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "json-streamer"
8
+ spec.version = Json::Streamer::VERSION
9
+ spec.authors = ["Csaba Apagyi"]
10
+ spec.email = ["csapagyi@users.noreply.github.com"]
11
+
12
+ spec.summary = %q{Utility to support JSON streaming allowing you to get objects based on various criteria}
13
+ spec.description = %q{Useful for e.g. streaming objects from a JSON array.}
14
+ spec.homepage = "https://github.com/csapagyi/json-streamer"
15
+ spec.license = "MIT"
16
+
17
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
18
+ spec.bindir = "exe"
19
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
20
+ spec.require_paths = ["lib"]
21
+
22
+ spec.add_development_dependency "bundler", "~> 1.12.a"
23
+ spec.add_development_dependency "rake", "~> 10.0"
24
+ spec.add_development_dependency "json-stream"
25
+ spec.add_development_dependency "rspec"
26
+ spec.add_development_dependency "ndhash"
27
+ end
data/lib/json/streamer.rb ADDED
@@ -0,0 +1,96 @@
1
+ require "json/streamer/version"
2
+ require "json/stream"
3
+
4
+ module Json
5
+ module Streamer
6
+ class JsonStreamer
7
+ def initialize(file_io, chunk_size = 1000)
8
+ @parser = JSON::Stream::Parser.new
9
+
10
+ @file_io = file_io
11
+ @chunk_size = chunk_size
12
+
13
+ @object_nesting_level = 0
14
+ @current_key = nil
15
+ @aggregator = {}
16
+ @temp_aggregator_keys = {}
17
+
18
+ @parser.start_object {start_object}
19
+ @parser.start_array {start_array}
20
+ @parser.key {|k| key(k)}
21
+ @parser.value {|v| value(v)}
22
+
23
+ end
24
+
25
+ def get_objects_from_level(yield_nesting_level)
26
+ @yield_nesting_level = yield_nesting_level
27
+
28
+ # The callbacks containing yield have to be defined in the method that receives the block
29
+ @parser.end_object do
30
+ if @object_nesting_level.eql? @yield_nesting_level
31
+ yield @aggregator[@object_nesting_level].clone
32
+ # Empty the container in place so the next object at this level starts fresh
33
+ @aggregator[@object_nesting_level].clear
34
+ else
35
+ merge_up
36
+ end
37
+
38
+ @object_nesting_level -= 1
39
+ end
40
+
41
+ @parser.end_array do
42
+ if @object_nesting_level.eql? @yield_nesting_level
43
+ yield @aggregator[@object_nesting_level].clone
44
+ # Empty the container in place so the next array at this level starts fresh
45
+ @aggregator[@object_nesting_level].clear
46
+ else
47
+ merge_up
48
+ end
49
+
50
+ @object_nesting_level -= 1
51
+ end
52
+
53
+ @file_io.each(@chunk_size) do |chunk|
54
+ @parser << chunk
55
+ end
56
+ end
57
+
58
+ def start_object
59
+ @temp_aggregator_keys[@object_nesting_level] = @current_key
60
+ @object_nesting_level += 1
61
+ @aggregator[@object_nesting_level] = {}
62
+ end
63
+
64
+ def start_array
65
+ @temp_aggregator_keys[@object_nesting_level] = @current_key
66
+ @object_nesting_level += 1
67
+ @aggregator[@object_nesting_level] = []
68
+ end
69
+
70
+ def key k
71
+ @current_key = k
72
+ end
73
+
74
+ def value v
75
+ if @aggregator[@object_nesting_level].kind_of? Array
76
+ @aggregator[@object_nesting_level] << v
77
+ else
78
+ @aggregator[@object_nesting_level][@current_key] = v
79
+ end
80
+ end
81
+
82
+ def merge_up
83
+ return if @object_nesting_level == 1
84
+ previous_object_nesting_level = @object_nesting_level - 1
85
+ if @aggregator[previous_object_nesting_level].kind_of? Array
86
+ @aggregator[previous_object_nesting_level] << @aggregator[@object_nesting_level]
87
+ else
88
+ @aggregator[previous_object_nesting_level][@temp_aggregator_keys[previous_object_nesting_level]] = @aggregator[@object_nesting_level]
89
+ end
90
+
91
+ @aggregator.delete(@object_nesting_level)
92
+ @aggregator
93
+ end
94
+ end
95
+ end
96
+ end
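The class above keeps one container per nesting level and, when a container closes, either yields it (if it sits at the requested level) or merges it into its parent. A minimal, self-contained sketch of that idea follows. It replaces the JSON::Stream callbacks with hand-written event tuples, so `objects_at_level`, the event symbols, and `README_EVENTS` are illustrative assumptions rather than the gem's API, and it collects results in an array instead of yielding them.

```ruby
# Sketch only: per-level aggregation driven by simulated parser events.
# In the gem, equivalent events arrive through JSON::Stream::Parser callbacks.
def objects_at_level(events, yield_level)
  level = 0          # current nesting depth
  current_key = nil  # last key seen inside an object
  containers = {}    # nesting level => Hash or Array being built
  parent_keys = {}   # nesting level => key the child container belongs under
  results = []

  # Store a value in an Array (append) or a Hash (under the given key).
  store = lambda do |container, key, value|
    if container.is_a?(Array)
      container << value
    else
      container[key] = value
    end
  end

  events.each do |type, payload|
    case type
    when :start_object, :start_array
      parent_keys[level] = current_key
      level += 1
      containers[level] = (type == :start_object ? {} : [])
    when :key
      current_key = payload
    when :value
      store.call(containers[level], current_key, payload)
    when :end_object, :end_array
      finished = containers.delete(level)
      level -= 1
      if level + 1 == yield_level
        results << finished  # the gem yields to the caller's block here
      elsif level > 0
        store.call(containers[level], parent_keys[level], finished)
      end
    end
  end

  results
end

# Events for the README sample: {"object1": {}, "object2": {}}
README_EVENTS = [
  [:start_object],
  [:key, "object1"], [:start_object], [:end_object],
  [:key, "object2"], [:start_object], [:end_object],
  [:end_object]
].freeze

SECOND_LEVEL = objects_at_level(README_EVENTS, 2)
# => [{}, {}]
```

Asking for level 1 instead returns the whole document as a single hash, which matches the README's description of the first level.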
data/lib/json/streamer/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ module Json
2
+ module Streamer
3
+ VERSION = "0.1.0"
4
+ end
5
+ end
metadata ADDED
@@ -0,0 +1,126 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: json-streamer
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Csaba Apagyi
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2016-05-28 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: 1.12.a
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: 1.12.a
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: json-stream
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rspec
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: ndhash
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - ">="
74
+ - !ruby/object:Gem::Version
75
+ version: '0'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - ">="
81
+ - !ruby/object:Gem::Version
82
+ version: '0'
83
+ description: Useful for e.g. streaming objects from a JSON array.
84
+ email:
85
+ - csapagyi@users.noreply.github.com
86
+ executables: []
87
+ extensions: []
88
+ extra_rdoc_files: []
89
+ files:
90
+ - ".gitignore"
91
+ - CODE_OF_CONDUCT.md
92
+ - Gemfile
93
+ - LICENSE.txt
94
+ - README.md
95
+ - Rakefile
96
+ - bin/console
97
+ - bin/setup
98
+ - json-streamer.gemspec
99
+ - lib/json/streamer.rb
100
+ - lib/json/streamer/version.rb
101
+ homepage: https://github.com/csapagyi/json-streamer
102
+ licenses:
103
+ - MIT
104
+ metadata: {}
105
+ post_install_message:
106
+ rdoc_options: []
107
+ require_paths:
108
+ - lib
109
+ required_ruby_version: !ruby/object:Gem::Requirement
110
+ requirements:
111
+ - - ">="
112
+ - !ruby/object:Gem::Version
113
+ version: '0'
114
+ required_rubygems_version: !ruby/object:Gem::Requirement
115
+ requirements:
116
+ - - ">="
117
+ - !ruby/object:Gem::Version
118
+ version: '0'
119
+ requirements: []
120
+ rubyforge_project:
121
+ rubygems_version: 2.5.1
122
+ signing_key:
123
+ specification_version: 4
124
+ summary: Utility to support JSON streaming allowing you to get objects based on various
125
+ criteria
126
+ test_files: []