json-stream 0.1.2 → 0.1.3

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 6d404ae0a1e8e03bb1551ead94eda3e413c1f492
+   data.tar.gz: 0f14c4b8a6de9f53b4bddaf824a0c0184f969399
+ SHA512:
+   metadata.gz: 3640958bef5a32726c09c723a459d15986564bc7afe70bfc0db9faa15037bddd6539fa04a8d79bb95674b56a08901771f33118d3c43e862cc51984d9eca12b39
+   data.tar.gz: 260eac9e6ff3b440fce89231e0f013bb490ed5666ace3e2c5ee28c3d17cd2eb36a589a5340c42862cf4317a9737e80a976e3ce805b4979ed9dc1df3e9b39ae70
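The block above is the gem's checksum manifest: plain SHA1 and SHA512 hex digests of the two archives packed inside the `.gem` file. They can be recomputed with Ruby's stdlib; this is a minimal sketch — the `metadata.gz` path is illustrative, with a string fallback so it runs standalone:

```ruby
require 'digest'

# Digest the raw bytes of an archive extracted from the .gem file.
# The path is illustrative; fall back to sample bytes if it is absent.
bytes = File.binread('metadata.gz') rescue 'sample bytes'

puts Digest::SHA1.hexdigest(bytes)    # 40 hex chars, as in the manifest
puts Digest::SHA512.hexdigest(bytes)  # 128 hex chars
```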
data/LICENSE CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2010-2011 David Graham
+ Copyright (c) 2010-2013 David Graham
 
  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
data/README.md RENAMED
@@ -1,42 +1,48 @@
- == Welcome to JSON::Stream
+ # JSON::Stream
 
- JSON::Stream is a finite state machine based JSON parser that generates events
- for each state change. This allows us to stream both the JSON document into
- memory and the parsed object graph out of memory to some other process. This
- is much like an XML SAX parser that generates events during parsing. There is
- no requirement for the document nor the object graph to be fully buffered in
- memory. This is best suited for huge JSON documents that won't fit in memory.
- For example, streaming and processing large map/reduce views from Apache CouchDB.
+ JSON::Stream is a JSON parser, based on a finite state machine, that generates
+ events for each state change. This allows streaming both the JSON document into
+ memory and the parsed object graph out of memory to some other process. This
+ is much like an XML SAX parser that generates events during parsing. There is
+ no requirement for the document, or the object graph, to be fully buffered in
+ memory. This is best suited for huge JSON documents that won't fit in memory.
+ For example, streaming and processing large map/reduce views from Apache
+ CouchDB.
 
- == Usage
+ ## Usage
 
  The simplest way to parse is to read the full JSON document into memory
- and then parse it into a full object graph. This is fine for small documents
+ and then parse it into a full object graph. This is fine for small documents
  because we have room for both the document and parsed object in memory.
 
+ ```ruby
  require 'json/stream'
  json = File.read('/tmp/test.json')
  obj = JSON::Stream::Parser.parse(json)
+ ```
 
  While it's possible to do this with JSON::Stream, we really want to use the json
- gem for documents like this. JSON.parse() is much faster than this parser
+ gem for documents like this. JSON.parse() is much faster than this parser,
  because it can rely on having the entire document in memory to analyze.
 
  For larger documents we can use an IO object to stream it into the parser.
  We still need room for the parsed object, but the document itself is never
  fully read into memory.
 
+ ```ruby
  require 'json/stream'
  stream = File.open('/tmp/test.json')
  obj = JSON::Stream::Parser.parse(stream)
+ ```
 
- Again, while we can do this with JSON::Stream, if we just need to stream the
+ Again, while JSON::Stream can be used this way, if we just need to stream the
  document from disk or the network, we're better off using the yajl-ruby gem.
 
  Huge documents arriving over the network in small chunks to an EventMachine
- receive_data loop is where JSON::Stream is really useful. Inside our
+ receive_data loop is where JSON::Stream is really useful. Inside an
  EventMachine::Connection subclass we might have:
 
+ ```ruby
  def post_init
    @parser = JSON::Stream::Parser.new do
      start_document { puts "start document" }
@@ -57,26 +63,27 @@ def receive_data(data)
    close_connection
  end
  end
+ ```
 
  Notice how the parser accepts chunks of the JSON document and parses up
- up to the end of the available buffer. Passing in more data resumes the
- parse from the prior state. When an interesting state change happens, the
+ to the end of the available buffer. Passing in more data resumes the
+ parse from the prior state. When an interesting state change happens, the
  parser notifies all registered callback procs of the event.
 
  The event callback is where we can do interesting data filtering and passing
- to other processes. The above example simply prints state changes, but
+ to other processes. The above example simply prints state changes, but
  imagine the callbacks looking for an array named "rows" and processing sets
- of these row objects in small batches. We can process millions of rows streaming
- over the network in constant memory space this way.
+ of these row objects in small batches. Millions of rows, streaming over the
+ network, can be processed in constant memory space this way.
 
- == Dependencies
+ ## Dependencies
 
- * ruby >= 1.9.1
+ * ruby >= 1.9.2
 
- == Contact
+ ## Contact
 
  Project contact: David Graham <david.malcom.graham@gmail.com>
 
- == License
+ ## License
 
  JSON::Stream is released under the MIT license. Check the LICENSE file for details.
data/Rakefile CHANGED
@@ -1,37 +1,20 @@
  require 'rake'
  require 'rake/clean'
- require 'rake/gempackagetask'
  require 'rake/testtask'
- require_relative 'lib/json/stream/version'
 
- spec = Gem::Specification.new do |s|
-   s.name = "json-stream"
-   s.version = JSON::Stream::VERSION
-   s.date = Time.now.strftime("%Y-%m-%d")
-   s.summary = "A streaming JSON parser that generates SAX-like events."
-   s.description = "A finite state machine based JSON parser that generates events
- for each state change. This allows us to stream both the JSON document into
- memory and the parsed object graph out of memory to some other process. This
- is much like an XML SAX parser that generates events during parsing. There is
- no requirement for the document nor the object graph to be fully buffered in
- memory. This is best suited for huge JSON documents that won't fit in memory.
- For example, streaming and processing large map/reduce views from Apache CouchDB."
-   s.email = "david.malcom.graham@gmail.com"
-   s.homepage = "http://dgraham.github.com/json-stream/"
-   s.authors = ["David Graham"]
-   s.files = FileList['[A-Z]*', "{lib}/**/*"]
-   s.require_path = "lib"
-   s.test_files = FileList["{test}/**/*test.rb"]
-   s.required_ruby_version = '>= 1.9.1'
- end
+ CLOBBER.include('pkg')
+
+ directory 'pkg'
 
- Rake::GemPackageTask.new(spec) do |pkg|
-   pkg.need_tar = true
+ desc 'Build distributable packages'
+ task :build => [:pkg] do
+   system 'gem build json-stream.gemspec && mv json-*.gem pkg/'
  end
 
  Rake::TestTask.new(:test) do |test|
+   test.libs << 'test'
    test.pattern = 'test/**/*_test.rb'
    test.warning = true
  end
 
- task :default => [:clobber, :test, :gem]
+ task :default => [:clobber, :test, :build]
data/json-stream.gemspec ADDED
@@ -0,0 +1,20 @@
+ require './lib/json/stream/version'
+
+ Gem::Specification.new do |s|
+   s.name = 'json-stream'
+   s.version = JSON::Stream::VERSION
+   s.summary = %q[A streaming JSON parser that generates SAX-like events.]
+   s.description = %q[A parser best suited for huge JSON documents that don't fit in memory.]
+
+   s.authors = ['David Graham']
+   s.email = %w[david.malcom.graham@gmail.com]
+   s.homepage = 'http://dgraham.github.io/json-stream/'
+   s.license = 'MIT'
+
+   s.files = Dir['[A-Z]*', 'json-stream.gemspec', '{lib}/**/*']
+   s.test_files = Dir['test/**/*']
+   s.require_path = 'lib'
+
+   s.add_development_dependency 'rake'
+   s.required_ruby_version = '>= 1.9.2'
+ end
data/lib/json/stream/buffer.rb CHANGED
@@ -58,6 +58,5 @@ module JSON
        raise ParserError, message
      end
    end
-
  end
end
data/lib/json/stream/builder.rb CHANGED
@@ -82,11 +82,10 @@ module JSON
          @obj[@key] = node
          @key = nil
        else
-         @key = node
+         @key = node
        end
        self
      end
    end
-
  end
end
data/lib/json/stream/parser.rb CHANGED
@@ -2,7 +2,6 @@
 
  module JSON
    module Stream
-
      class ParserError < RuntimeError; end
 
      # A streaming JSON parser that generates SAX-like events for
@@ -10,7 +9,7 @@ module JSON
      # for huge documents that won't fit in memory.
      class Parser
        BUF_SIZE = 512
-       CONTROL = /[[:cntrl:]]/
+       CONTROL = /[\x00-\x1F]/
        WS = /\s/
        HEX = /[0-9a-fA-F]/
        DIGIT = /[0-9]/
@@ -194,7 +193,7 @@ module JSON
          end
        when :start_surrogate_pair
          case ch
-         when BACKSLASH
+         when BACKSLASH
            @state = :start_surrogate_pair_u
          else
            error('Expected low surrogate pair half')
@@ -425,6 +424,5 @@ module JSON
        raise ParserError, "#{message}: char #{@pos}"
      end
    end
-
  end
end
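The `CONTROL` change above is behavioral, not cosmetic: POSIX `[[:cntrl:]]` also matches DEL (U+007F), which JSON permits unescaped inside strings, while `[\x00-\x1F]` rejects only the range the JSON grammar actually forbids. A quick stand-alone check of the two patterns:

```ruby
old_control = /[[:cntrl:]]/  # 0.1.2: also matches DEL (U+007F)
new_control = /[\x00-\x1F]/  # 0.1.3: only the range JSON forbids unescaped

del = "\u007F"
puts !!(old_control =~ del)  # true: DEL was rejected as a control character
puts !!(new_control =~ del)  # false: DEL now passes through

tab = "\u0009"
puts !!(new_control =~ tab)  # true: real control characters are still caught
```

This is exactly what the new parser test exercises with the `" \u007F "` string.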
data/lib/json/stream/version.rb CHANGED
@@ -2,6 +2,6 @@
 
  module JSON
    module Stream
-     VERSION = "0.1.2"
+     VERSION = '0.1.3'
    end
  end
data/test/parser_test.rb CHANGED
@@ -220,6 +220,9 @@ class ParserTest < Test::Unit::TestCase
 
    expected = [:start_document, :start_object, :error]
    assert_equal(expected, events("{\" \u0000 \":12}"))
+
+   expected = [:start_document, :start_array, [:value, " \u007F "], :end_array, :end_document]
+   assert_equal(expected, events("[\" \u007f \"]"))
  end
 
  def test_unicode_escape
@@ -447,5 +450,4 @@ class ParserTest < Test::Unit::TestCase
      @events << :error
    end
  end
-
end
metadata CHANGED
@@ -1,38 +1,40 @@
- --- !ruby/object:Gem::Specification
+ --- !ruby/object:Gem::Specification
  name: json-stream
- version: !ruby/object:Gem::Version
-   prerelease:
-   version: 0.1.2
+ version: !ruby/object:Gem::Version
+   version: 0.1.3
  platform: ruby
- authors:
+ authors:
  - David Graham
  autorequire:
  bindir: bin
  cert_chain: []
-
- date: 2011-04-22 00:00:00 -06:00
- default_executable:
- dependencies: []
-
- description: |-
-   A finite state machine based JSON parser that generates events
-   for each state change. This allows us to stream both the JSON document into
-   memory and the parsed object graph out of memory to some other process. This
-   is much like an XML SAX parser that generates events during parsing. There is
-   no requirement for the document nor the object graph to be fully buffered in
-   memory. This is best suited for huge JSON documents that won't fit in memory.
-   For example, streaming and processing large map/reduce views from Apache CouchDB.
- email: david.malcom.graham@gmail.com
+ date: 2013-10-15 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: rake
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+ description: A parser best suited for huge JSON documents that don't fit in memory.
+ email:
+ - david.malcom.graham@gmail.com
  executables: []
-
  extensions: []
-
  extra_rdoc_files: []
-
- files:
+ files:
  - LICENSE
  - Rakefile
- - README
+ - README.md
+ - json-stream.gemspec
  - lib/json/stream/buffer.rb
  - lib/json/stream/builder.rb
  - lib/json/stream/parser.rb
@@ -41,35 +43,31 @@ files:
  - test/buffer_test.rb
  - test/builder_test.rb
  - test/parser_test.rb
- has_rdoc: true
- homepage: http://dgraham.github.com/json-stream/
- licenses: []
-
+ homepage: http://dgraham.github.io/json-stream/
+ licenses:
+ - MIT
+ metadata: {}
  post_install_message:
  rdoc_options: []
-
- require_paths:
+ require_paths:
  - lib
- required_ruby_version: !ruby/object:Gem::Requirement
-   none: false
-   requirements:
-   - - ">="
-     - !ruby/object:Gem::Version
-       version: 1.9.1
- required_rubygems_version: !ruby/object:Gem::Requirement
-   none: false
-   requirements:
-   - - ">="
-     - !ruby/object:Gem::Version
-       version: "0"
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: 1.9.2
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: '0'
  requirements: []
-
  rubyforge_project:
- rubygems_version: 1.6.2
+ rubygems_version: 2.0.3
  signing_key:
- specification_version: 3
+ specification_version: 4
  summary: A streaming JSON parser that generates SAX-like events.
- test_files:
+ test_files:
  - test/buffer_test.rb
  - test/builder_test.rb
  - test/parser_test.rb