saxy 0.5.2 → 0.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: c9486f25851c6f1fc75f37b2196b02cc27831386
-   data.tar.gz: d914fa68442184ac2066e8f860f1f82d029222a3
+   metadata.gz: d90158e708f84a77ddd18b19c881fa528a839f28
+   data.tar.gz: 14e88af956179b23fff98fc80a2dac03cd36efa2
  SHA512:
-   metadata.gz: 116cf7b365cecb245cec43dc7e45ec62f5e36f05461d47ce270b8aab2113a546de053d4772ac97878518d5a0534241ee44ffcd37cd9db172016872edd7192d04
-   data.tar.gz: de32e8c5e86c0a29300a5f0dc64b29c3282c40d3aa8fb5d4a2ad0f70d6352d860cb95c040a985eca2782c7e1a0e70fa507fc677272e0c8e25b0a277d71e98e88
+   metadata.gz: d1380012654abe0e6c51095220d0be5349666de298294addbd81ae1954842249cea8e055514c3583eaa285834eb5ee9ac181b632647bc30d1fa8ebcfd6d049d9
+   data.tar.gz: 4bb6c121dd6543cb736d02278ee752eba2bf9a2e6105cdaf3bdff7fa617facda2738ba697de4626135a09984f77a100a4d3ccd7082c25a1ea880cf87aee1a127
@@ -0,0 +1,22 @@
+ # Saxy Changelog
+
+ ## 0.6.0
+
+ * [BREAKING] `Saxy::ParsingError` now inherits from `StandardError`, not `Exception`.
+ * [BREAKING] Forced encoding is now an option instead of third argument of `Saxy.parse` method.
+ * Added `recovery` and `replace_entities` options that are internally passed to `Nokogiri::XML::SAX::ParserContext`
+ * Added `context` method to `Saxy::ParsingError` that holds parser context at the time of error.
+
+ ## 0.5.2
+
+ * Added optional `encoding` argument to `Saxy.parse`
+
+ ## 0.5.1
+
+ * Removed `activesupport` dependency
+
+ ## 0.5.0
+
+ * [BREAKING] Dropped support for ruby 1.9.2 and lower
+ * [BREAKING] Yields hashes instead of `OpenStruct`s
+ * Added support for `IO`-like objects
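The first breaking change in 0.6.0 matters mostly for callers that use a bare `rescue`. In Ruby, a bare `rescue` only catches `StandardError` and its subclasses, so an error class that inherits directly from `Exception` slips past it. The sketch below illustrates that language rule with two hypothetical stand-in classes (`OldStyleError` and `NewStyleError` are assumptions for illustration, not Saxy's own classes).

```ruby
# Hypothetical stand-ins for the two inheritance styles; not Saxy's classes.
class OldStyleError < Exception; end      # how ParsingError was defined before 0.6.0
class NewStyleError < StandardError; end  # how ParsingError is defined as of 0.6.0

# Runs the block and reports which rescue clause, if any, handled the error.
def handled_by
  begin
    yield
    :nothing_raised
  rescue                 # bare rescue: StandardError and subclasses only
    :bare_rescue
  end
rescue Exception         # catches whatever the bare rescue let through
  :explicit_exception_rescue
end

p handled_by { raise NewStyleError, "boom" }  # => :bare_rescue
p handled_by { raise OldStyleError, "boom" }  # => :explicit_exception_rescue
```

Code that previously had to write `rescue Exception` (or name the class explicitly) to catch parsing errors can, under this rule, rely on an ordinary `rescue => e` after upgrading.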
data/README.md CHANGED
@@ -5,7 +5,7 @@

  Memory-efficient XML parser. Finds object definitions in XML and translates them into Ruby objects.

- It uses SAX parser under the hood, which means that it doesn't load the whole XML file into memory. It goes once through it and yields objects along the way.
+ It uses SAX parser (provided by Nokogiri gem) under the hood, which means that it doesn't load the whole XML file into memory. It goes once through it and yields objects along the way.

  In result the memory footprint of the parser remains small and more or less constant irrespective of the size of the XML file, be it few KB or hundreds of GB.

@@ -23,24 +23,51 @@ Or install it yourself as:

  $ gem install saxy

+ ## Requirements
+
  As of `0.5.0` version `saxy` requires ruby 1.9.3 or higher. Previous versions of the gem work with ruby 1.8 and 1.9.2 (see below), but they are not maintained anymore.

- ## Ruby 1.8 support
+ ### Ruby 1.8 support

  See `ruby-1.8` branch. Install with:

  gem 'saxy', '~> 0.3.0'

- ## Ruby 1.9.2 support
+ ### Ruby 1.9.2 support

  See `ruby-1.9.2` branch. Install with:

  gem 'saxy', '~> 0.4.0'

+ ## Changelog
+
+ See `CHANGELOG.md` file.

  ## Usage

- Assume the XML file:
+ You instantiate the parser by passing path to XML file or an IO-like object, object-identifying tag name and options hash (optionally) as its arguments.
+
+ ```ruby
+ parser = Saxy.parse(path_or_io, object_tag, options = {})
+ ```
+
+ Then iterate over it using `each` (or any of convenient methods provided by `Enumerable` mix-in).
+
+ ```ruby
+ parser.each do |object|
+   ...
+ end
+ ```
+
+ ### Options
+
+ * `encoding` - Forces the parser to work in given encoding
+ * `recovery` - Should this parser recover from structural errors? It will not stop processing file on structural errors if set to `true`.
+ * `replace_entities` - Should this parser replace entities? `&amp;` will get converted to `&` if set to `true`.
+
+ ## Example
+
+ Assume the XML file (an imaginary product feed):

  ````xml
  <?xml version='1.0' encoding='UTF-8'?>
@@ -63,8 +90,6 @@ Assume the XML file:
  </webstore>
  ````

- You instantiate the parser by passing path to XML file or an IO-like object and object-identyfing tag name as its arguments.
-
  The following will parse the XML, find product definitions (inside `<product>` and `</product>` tags), build `Hash`es and yield them inside the block.

  Usage with a file path:
@@ -119,6 +144,18 @@ webstore = Saxy.parse("filename.xml", "webstore").first
  webstore[:products][:product].size # => 2
  ````

+ ## Debugging
+
+ Invalid XML files happen a lot and error messages are not always extremely helpful. In case of a parsing error, some additional information can be retrieved from parser's context.
+
+ ```ruby
+ begin
+   Saxy.parse(...) { ... }
+ rescue Saxy::ParsingError => e
+   puts "#{e.message} at line #{e.context.line}, column #{e.context.column}"
+ end
+ ```
+
  ## Contributing

  1. Fork it
@@ -126,3 +163,11 @@ webstore[:products][:product].size # => 2
  3. Commit your changes (`git commit -am 'Added some feature'`)
  4. Push to the branch (`git push origin my-new-feature`)
  5. Create new Pull Request
+
+ ## License
+
+ See `LICENSE.txt` file.
+
+ ## Author
+
+ Michał Szajbe, [@szajbus](https://twitter.com/szajbus), [szajbe.pl](http://szajbe.pl)
@@ -16,10 +16,15 @@ module Saxy
      # Will yield objects inside the callback after they're built
      attr_reader :callback

-     def initialize(object, object_tag, encoding=nil)
-       @object, @object_tag = object, object_tag
+     # Parser context
+     attr_reader :context
+
+     # Parser options
+     attr_reader :options
+
+     def initialize(object, object_tag, options={})
+       @object, @object_tag, @options = object, object_tag, options
        @tags, @elements = [], []
-       @encoding = encoding
      end

      def start_element(tag, attributes=[])
@@ -56,7 +61,7 @@ module Saxy
      end

      def error(message)
-       raise ParsingError.new(message)
+       raise ParsingError.new(message, context)
      end

      def current_element
@@ -68,15 +73,25 @@ module Saxy

        @callback = blk

-       args = [self, @encoding].compact
+       args = [self, options[:encoding]].compact

        parser = Nokogiri::XML::SAX::Parser.new(*args)

        if @object.respond_to?(:read) && @object.respond_to?(:close)
-         parser.parse_io(@object)
+         parser.parse_io(@object, &context_blk)
        else
-         parser.parse_file(@object)
+         parser.parse_file(@object, &context_blk)
        end
      end
+
+     def context_blk
+       proc { |context|
+         [:recovery, :replace_entities].each do |key|
+           context.send("#{key}=", options[key]) if options.has_key?(key)
+         end
+
+         @context = context
+       }
+     end
    end
  end
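The option handling in the hunks above follows two small patterns: the encoding is passed positionally only when present (via `compact`), and the context setters are called only for keys the caller actually supplied. A minimal sketch of both, assuming a hypothetical `ContextOptionsStub` struct in place of `Nokogiri::XML::SAX::ParserContext` and hypothetical helper names `parser_args` and `apply_context_options`:

```ruby
# ContextOptionsStub is a hypothetical stand-in for the real parser context,
# which exposes recovery= and replace_entities= setters.
ContextOptionsStub = Struct.new(:recovery, :replace_entities)

# Builds the positional arguments for the SAX parser constructor.
# compact drops a nil encoding, so no second argument is passed at all.
def parser_args(handler, options = {})
  [handler, options[:encoding]].compact
end

# Copies only the options the caller supplied onto the context.
# key? (rather than a truthiness check) lets an explicit false through.
def apply_context_options(context, options = {})
  [:recovery, :replace_entities].each do |key|
    context.public_send("#{key}=", options[key]) if options.key?(key)
  end
  context
end

p parser_args(:handler, {})                     # => [:handler]
p parser_args(:handler, { encoding: "UTF-8" })  # => [:handler, "UTF-8"]

stub = apply_context_options(ContextOptionsStub.new, { recovery: false })
p stub.recovery          # => false
p stub.replace_entities  # => nil
```

Checking for key presence means an omitted option leaves the context's own default untouched, while `recovery: false` is still forwarded.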
@@ -1,4 +1,10 @@
  module Saxy
-   class ParsingError < ::Exception
+   class ParsingError < ::StandardError
+     attr_reader :context
+
+     def initialize(message, context)
+       @context = context
+       super(message)
+     end
    end
- end
+ end
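An error class shaped like the one above lets a caller report where parsing stopped. The sketch below reproduces that shape with hypothetical names (`SketchParsingError`, `ContextPositionStub`, `describe_failure`) so it runs without Saxy or Nokogiri. It assumes only that the context responds to `line` and `column`, as the README's debugging example does.

```ruby
# ContextPositionStub is a hypothetical stand-in for the parser context.
ContextPositionStub = Struct.new(:line, :column)

# Same shape as the 0.6.0 error class: a message plus the context it came from.
class SketchParsingError < StandardError
  attr_reader :context

  def initialize(message, context)
    @context = context
    super(message)
  end
end

# Runs the block and turns a parsing error into a readable location string.
def describe_failure
  yield
  "ok"
rescue SketchParsingError => e
  "#{e.message} at line #{e.context.line}, column #{e.context.column}"
end

puts describe_failure {
  raise SketchParsingError.new("Tag mismatch", ContextPositionStub.new(7, 3))
}
# prints "Tag mismatch at line 7, column 3"
```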
@@ -1,3 +1,3 @@
  module Saxy
-   VERSION = "0.5.2"
+   VERSION = "0.6.0"
  end
@@ -0,0 +1,9 @@
+ <?xml version='1.0' encoding='UTF-8'?>
+ <webstore>
+   <products>
+     <product>
+       <uid>FFCF177</uid>
+       <name>No closing tag...
+     </product>
+   </products>
+ </webstore>
@@ -3,7 +3,8 @@ require 'spec_helper'
  describe Saxy::Parser do
    include FixturesHelper

-   let(:parser) { Saxy::Parser.new(fixture_file("webstore.xml"), "product", "UTF-8") }
+   let(:parser) { Saxy::Parser.new(fixture_file("webstore.xml"), "product") }
+   let(:invalid_parser) { Saxy::Parser.new(fixture_file("invalid.xml"), "product") }
    let(:file_io) { File.new(fixture_file("webstore.xml")) }
    let(:io_like) { IOLike.new(file_io) }

@@ -19,11 +20,18 @@ describe Saxy::Parser do
    end

    it "should accept optional force-encoding" do
-     parser = Saxy::Parser.new(file_io, "product", 'UTF-8')
+     parser = Saxy::Parser.new(file_io, "product", encoding: "UTF-8")
      expect(Nokogiri::XML::SAX::Parser).to receive(:new).with(parser, "UTF-8").and_call_original
      expect(parser.each.to_a.size).to eq(2)
    end

+   it "should pass options to parser context" do
+     parser = Saxy::Parser.new(file_io, "product", recovery: true, replace_entities: true)
+     parser.each.to_a
+     expect(parser.context.recovery).to be_truthy
+     expect(parser.context.replace_entities).to be_truthy
+   end
+
    it "should accept an IO-like for parsing" do
      parser = Saxy::Parser.new(io_like, "product")
      expect(parser.each.to_a.size).to eq(2)
@@ -158,7 +166,11 @@ describe Saxy::Parser do
    end

    it "should raise Saxy::ParsingError on error" do
-     expect { parser.error("Error message.") }.to raise_error(Saxy::ParsingError, "Error message.")
+     expect { invalid_parser.each.to_a }.to raise_error { |error|
+       expect(error).to be_a(Saxy::ParsingError)
+       expect(error.message).to match(/Opening and ending tag mismatch/)
+       expect(error.context).to be_a(Nokogiri::XML::SAX::ParserContext)
+     }
    end

    it "should return Enumerator when calling #each without a block" do
@@ -3,4 +3,4 @@ require 'bundler/setup'
  require 'saxy'

  require 'fixtures_helper'
- require File.join(File.dirname(__FILE__), 'fixtures', 'io_like')
+ require 'support/io_like'
File without changes
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: saxy
  version: !ruby/object:Gem::Version
-   version: 0.5.2
+   version: 0.6.0
  platform: ruby
  authors:
  - Michał Szajbe
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2017-02-20 00:00:00.000000000 Z
+ date: 2017-08-23 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: nokogiri
@@ -66,6 +66,7 @@ files:
  - ".rspec"
  - ".travis.yml"
  - Appraisals
+ - CHANGELOG.md
  - Gemfile
  - LICENSE.txt
  - README.md
@@ -78,13 +79,14 @@ files:
  - lib/saxy/parsing_error.rb
  - lib/saxy/version.rb
  - saxy.gemspec
- - spec/fixtures/io_like.rb
+ - spec/fixtures/invalid.xml
  - spec/fixtures/webstore.xml
  - spec/fixtures_helper.rb
  - spec/saxy/element_spec.rb
  - spec/saxy/parser_spec.rb
  - spec/saxy_spec.rb
  - spec/spec_helper.rb
+ - spec/support/io_like.rb
  homepage: http://github.com/humante/saxy
  licenses: []
  metadata: {}
@@ -110,10 +112,11 @@ specification_version: 4
  summary: Memory-efficient XML parser. Finds object definitions and translates them
    into Ruby objects.
  test_files:
- - spec/fixtures/io_like.rb
+ - spec/fixtures/invalid.xml
  - spec/fixtures/webstore.xml
  - spec/fixtures_helper.rb
  - spec/saxy/element_spec.rb
  - spec/saxy/parser_spec.rb
  - spec/saxy_spec.rb
  - spec/spec_helper.rb
+ - spec/support/io_like.rb