nokogiri 1.5.6.rc2-x86-mswin32-60 → 1.5.6.rc3-x86-mswin32-60
- data/CHANGELOG.ja.rdoc +23 -6
- data/CHANGELOG.rdoc +20 -4
- data/README.rdoc +3 -6
- data/ROADMAP.md +3 -2
- data/Rakefile +4 -2
- data/bin/nokogiri +19 -4
- data/build_all +5 -1
- data/ext/nokogiri/xml_document.c +1 -1
- data/ext/nokogiri/xml_node.c +31 -14
- data/ext/nokogiri/xml_sax_parser.c +16 -0
- data/lib/nokogiri/1.8/nokogiri.so +0 -0
- data/lib/nokogiri/1.9/nokogiri.so +0 -0
- data/lib/nokogiri/version.rb +1 -1
- data/lib/nokogiri/xml/document.rb +8 -6
- data/lib/nokogiri/xml/document_fragment.rb +10 -1
- data/lib/nokogiri/xml/node.rb +15 -11
- data/lib/nokogiri/xml/sax/document.rb +7 -0
- data/lib/nokogiri/xml/xpath_context.rb +1 -1
- data/test/helper.rb +6 -0
- data/test/html/test_document_fragment.rb +5 -0
- data/test/xml/sax/test_parser.rb +15 -1
- data/test/xml/test_builder.rb +19 -0
- data/test/xml/test_document.rb +42 -9
- data/test/xml/test_document_fragment.rb +7 -0
- data/test/xml/test_node.rb +62 -0
- data/test/xml/test_node_attributes.rb +22 -2
- data/test/xml/test_unparented_node.rb +9 -0
- data/test_all +6 -2
- metadata +4 -4
data/CHANGELOG.ja.rdoc
CHANGED
@@ -2,13 +2,30 @@
|
|
2
2
|
|
3
3
|
* Features
|
4
4
|
|
5
|
-
*
|
5
|
+
* Improved the performance of the XML::Document#collect_namespaces method. #761 (Thanks, Juergen Mangler!)
|
6
|
+
* Added a new callback, SAX::Document#processing_instruction. (Thanks, Kitaiti Makoto!)
|
7
|
+
* The Node#native_content= method can now set unescaped strings. #768
|
8
|
+
* Symbol keys can now be used when writing xpath expressions with namespaces. #729 (Thanks, Ben Langfeld.)
|
9
|
+
* The XML::Node#[]= method now converts the argument it receives to a string. #729 (Thanks, Ben Langfeld.)
|
10
|
+
* The bin/nokogiri command can now read and process a document from $stdin.
|
11
|
+
* bin/nokogiri -e can now be used to run a program from the command line.
|
12
|
+
|
6
13
|
|
7
|
-
|
8
|
-
* JRuby
|
9
|
-
* Nokogiri
|
10
|
-
*
|
11
|
-
* JRuby Node
|
14
|
+
* Bugfixes
|
15
|
+
* [JRuby] Parsing a document fails when there is a space before the XML declaration. (Also resolved by the fix for #748.) #790
|
16
|
+
* [JRuby] The behavior of Nokogiri::XML::Node#content on JRuby is not the same as on CRuby. #794, #797
|
17
|
+
* [JRuby] Creating an EntityReference whose name starts with '#' raises an INVALID_CHARACTER_ERR exception. #719
|
18
|
+
* [JRuby] Does not correctly convert the namespace of a Node subclass to a string. #715
|
19
|
+
* Nokogiri now detects XSLT transform errors as of this version. #731 (Thanks, Justin Fitzsimmons!)
|
20
|
+
* An ArgumentError is now raised when an invalid encoding is passed to the SAX parser. #756 (Thanks, Bradley Schaefer!)
|
21
|
+
* [JRuby] Node#content now renders newlines correctly as of this version. #737 (Thanks, Piotr Szmielew!)
|
22
|
+
* [JRuby] Undeclared namespaces are now ignored when the recover option is specified. #748
|
23
|
+
* [JRuby] XPath queries that detect namespaces must not raise an exception when run consecutively. #764
|
24
|
+
* [JRuby] Whitespace handling when emitting XML is now more consistent with the libxml2 version. #771
|
25
|
+
* [JRuby] Appending an XML document containing namespaced attributes to the builder as a string fails. #770
|
26
|
+
* [JRuby] Adding a node with << to a document created with Nokogiri::XML::Document#wrap raises
|
27
|
+
undefined method `length' for nil:NilClass. #781
|
28
|
+
* [JRuby] Closing an open file descriptor raises "bad file descriptor". #495
|
12
29
|
|
13
30
|
|
14
31
|
== 1.5.5 / 2012-06-24
|
data/CHANGELOG.rdoc
CHANGED
@@ -2,13 +2,29 @@
|
|
2
2
|
|
3
3
|
* Features
|
4
4
|
|
5
|
-
*
|
5
|
+
* Improved performance of XML::Document#collect_namespaces. #761 (Thanks, Juergen Mangler!)
|
6
|
+
* New callback SAX::Document#processing_instruction (Thanks, Kitaiti Makoto!)
|
7
|
+
* Node#native_content= allows setting unescaped node content. #768
|
8
|
+
* XPath lookup with namespaces supports symbol keys. #729 (Thanks, Ben Langfeld.)
|
9
|
+
* XML::Node#[]= stringifies values. #729 (Thanks, Ben Langfeld.)
|
10
|
+
* bin/nokogiri will process a document from $stdin
|
11
|
+
* bin/nokogiri -e will execute a program from the command line
|
12
|
+
|
6
13
|
|
7
|
-
|
8
|
-
* JRuby
|
14
|
+
* Bugfixes
|
15
|
+
* [JRuby] space prior to xml preamble causes nokogiri to fail parsing. (fixed along with #748) #790
|
16
|
+
* [JRuby] Fixed the Nokogiri::XML::Node#content inconsistency between Java and C. #794, #797
|
17
|
+
* [JRuby] raises INVALID_CHARACTER_ERR exception when EntityReference name starts with '#'. #719
|
18
|
+
* [JRuby] doesn't coerce namespaces out of strings on a direct subclass of Node. #715
|
9
19
|
* Nokogiri now detects XSLT transform errors. #731 (Thanks, Justin Fitzsimmons!)
|
10
20
|
* Raise an ArgumentError if an invalid encoding is passed to the SAX parser. #756 (Thanks, Bradley Schaefer!)
|
11
|
-
* JRuby Node#content now renders newlines properly. #737 (Thanks, Piotr Szmielew!)
|
21
|
+
* [JRuby] Node#content now renders newlines properly. #737 (Thanks, Piotr Szmielew!)
|
22
|
+
* [JRuby] Unknown namespaces are ignored when the recover option is used. #748
|
23
|
+
* [JRuby] XPath queries for namespaces should not throw exceptions when called twice in a row. #764
|
24
|
+
* [JRuby] More consistent (with libxml2) whitespace formatting when emitting XML. #771
|
25
|
+
* [JRuby] namespaced attributes broken when appending raw xml to builder. #770
|
26
|
+
* [JRuby] Nokogiri::XML::Document#wrap raises undefined method `length' for nil:NilClass when trying to << to a node. #781
|
27
|
+
* [JRuby] Fixed "bad file descriptor" bug when closing open file descriptors. #495
|
12
28
|
|
13
29
|
|
14
30
|
== 1.5.5 / 2012-06-24
|
data/README.rdoc
CHANGED
@@ -122,13 +122,10 @@ Developing Nokogiri requires racc and rexical to generate the parser and
|
|
122
122
|
tokenizer. To start development, make sure you have `libxml2` and `libxslt`
|
123
123
|
installed.
|
124
124
|
|
125
|
-
Then install
|
125
|
+
Then install core gems and bootstrap:
|
126
126
|
|
127
|
-
$ gem install hoe rake-compiler
|
128
|
-
|
129
|
-
Then run rake:
|
130
|
-
|
131
|
-
$ rake
|
127
|
+
$ gem install hoe rake-compiler mini_portile
|
128
|
+
$ rake newb
|
132
129
|
|
133
130
|
=== Developing on JRuby
|
134
131
|
|
data/ROADMAP.md
CHANGED
@@ -19,8 +19,9 @@
|
|
19
19
|
* https://github.com/sparklemotion/nokogiri/issues/679
|
20
20
|
Mixing in Enumerable has some unintended consequences; plus we want to improve the attributes API
|
21
21
|
|
22
|
-
*
|
23
|
-
|
22
|
+
* Some ideas for a better attributes API?
|
23
|
+
* (closed) https://github.com/sparklemotion/nokogiri/issues/666
|
24
|
+
* https://github.com/sparklemotion/nokogiri/issues/765
|
24
25
|
|
25
26
|
|
26
27
|
## improve CSS query parsing
|
data/Rakefile
CHANGED
@@ -17,6 +17,8 @@ def java?
|
|
17
17
|
!! (RUBY_PLATFORM =~ /java/)
|
18
18
|
end
|
19
19
|
|
20
|
+
ENV['LANG'] = "en_US.UTF-8" # UBUNTU 10.04, Y U NO DEFAULT TO UTF-8?
|
21
|
+
|
20
22
|
require 'tasks/nokogiri.org'
|
21
23
|
|
22
24
|
HOE = Hoe.spec 'nokogiri' do
|
@@ -118,9 +120,9 @@ task 'gem:spec' => 'generate' if Rake::Task.task_defined?("gem:spec")
|
|
118
120
|
# dependencies in the Gemfile are constrained to ruby platforms
|
119
121
|
# (i.e. MRI and Rubinius). There's no way to do that through hoe,
|
120
122
|
# and any solution will require changing hoe and hoe-bundler.
|
121
|
-
old_gemfile_task = Rake::Task['bundler:gemfile']
|
123
|
+
old_gemfile_task = Rake::Task['bundler:gemfile'] rescue nil
|
122
124
|
task 'bundler:gemfile' do
|
123
|
-
old_gemfile_task.invoke
|
125
|
+
old_gemfile_task.invoke if old_gemfile_task
|
124
126
|
|
125
127
|
lines = File.open('Gemfile', 'r') { |f| f.readlines }.map do |line|
|
126
128
|
line =~ /racc|rexical/ ? "#{line.strip}, :platform => :ruby" : line
|
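The `rescue nil` modifier together with the `if old_gemfile_task` guard lets the wrapper task load even when no `bundler:gemfile` task has been defined yet. A minimal plain-Ruby sketch of the same pattern follows. The `TASKS` registry and the `lookup` helper are hypothetical stand-ins for `Rake::Task[]`, not part of the Rakefile.

```ruby
# Hypothetical stand-in for Rake's task registry; `lookup` imitates a
# lookup that raises when the named task is not defined.
TASKS = { 'gem:spec' => -> { :spec_built } }

def lookup(name)
  TASKS.fetch(name) # raises KeyError for unknown names
end

# `rescue nil` turns the failed lookup into nil instead of aborting the load.
old_task = lookup('bundler:gemfile') rescue nil

# Only invoke the previous definition if one actually existed.
result = old_task.call if old_task

# A task that does exist is returned unchanged.
known = lookup('gem:spec') rescue nil
```

The trade-off of `rescue nil` is that it swallows every `StandardError` from the lookup, so the guard hides unrelated failures as well as the missing-task case.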
data/bin/nokogiri
CHANGED
@@ -16,6 +16,7 @@ opts = OptionParser.new do |opts|
|
|
16
16
|
opts.separator "Examples:"
|
17
17
|
opts.separator " nokogiri http://www.ruby-lang.org/"
|
18
18
|
opts.separator " nokogiri ./public/index.html"
|
19
|
+
opts.separator " curl -s http://nokogiri.org | nokogiri -e'p $_.css(\"h1\").length'"
|
19
20
|
opts.separator ""
|
20
21
|
opts.separator "Options:"
|
21
22
|
|
@@ -27,6 +28,10 @@ opts = OptionParser.new do |opts|
|
|
27
28
|
encoding = v
|
28
29
|
end
|
29
30
|
|
31
|
+
opts.on("-e command", "Specifies script from command-line.") do |v|
|
32
|
+
@script = v
|
33
|
+
end
|
34
|
+
|
30
35
|
opts.on("--rng <uri|path>", "Validate using this rng file.") do |v|
|
31
36
|
@rng = open(v) {|f| Nokogiri::XML::RelaxNG(f)}
|
32
37
|
end
|
@@ -45,19 +50,29 @@ opts.parse!
|
|
45
50
|
|
46
51
|
uri = ARGV.shift
|
47
52
|
|
48
|
-
if uri.to_s.strip.empty?
|
53
|
+
if uri.to_s.strip.empty? && $stdin.tty?
|
49
54
|
puts opts
|
50
55
|
exit 1
|
51
56
|
end
|
52
57
|
|
53
|
-
|
58
|
+
if $stdin.tty?
|
59
|
+
@doc = parse_class.parse(open(uri).read, nil, encoding)
|
60
|
+
else
|
61
|
+
@doc = parse_class.parse($stdin, nil, encoding)
|
62
|
+
end
|
63
|
+
|
64
|
+
$_ = @doc
|
54
65
|
|
55
66
|
if @rng
|
56
67
|
@rng.validate(@doc).each do |error|
|
57
68
|
puts error.message
|
58
69
|
end
|
59
70
|
else
|
60
|
-
|
61
|
-
|
71
|
+
if @script
|
72
|
+
eval @script, binding, '<main>'
|
73
|
+
else
|
74
|
+
puts "Your document is stored in @doc..."
|
75
|
+
IRB.start
|
76
|
+
end
|
62
77
|
end
|
63
78
|
|
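The hunks above change the command so that piped input takes precedence over the URI argument, and so that a script given with `-e` is evaluated against the parsed document. A rough sketch of that control flow in plain Ruby, without Nokogiri, is shown here. The `choose_source` helper and the placeholder string are assumptions made for illustration; the real command parses the input and stores the result in `@doc` and `$_`.

```ruby
require 'stringio'

# Hypothetical helper mirroring the branch above: prefer piped input,
# fall back to the URI/path argument, and signal "show usage" with nil.
def choose_source(uri, stdin)
  return stdin.read unless stdin.tty?   # data was piped in
  return nil if uri.to_s.strip.empty?   # interactive and no argument
  "would open #{uri}"                   # placeholder for open(uri).read
end

piped = StringIO.new('<h1>hi</h1>')     # StringIO#tty? is false, like a pipe
doc = choose_source(nil, piped)

# The script passed to -e sees the document through a variable; the real
# command uses $_, this sketch uses a local variable and a binding.
script = 'doc.length'
length = eval(script, binding, '<main>')
```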
data/build_all
CHANGED
@@ -13,6 +13,10 @@
|
|
13
13
|
#
|
14
14
|
# as you build, you may run into these problems:
|
15
15
|
#
|
16
|
+
# - if you're using Virtualbox shared directories, you'll get a mingw
|
17
|
+
# "Protocol error" at linktime. Boo! Either use NFS or a
|
18
|
+
# locally-checked-out repository.
|
19
|
+
#
|
16
20
|
# - on ubuntus 11 and later, you may have issues with building
|
17
21
|
# rake-compiler's rubies against openssl v2. Just comment the lines
|
18
22
|
# out from ossl_ssl.c and you'll be fine.
|
@@ -43,7 +47,7 @@ fi
|
|
43
47
|
|
44
48
|
function rvm_use {
|
45
49
|
current_ruby=$1
|
46
|
-
rvm use "${1}@nokogiri" --create
|
50
|
+
rvm use "${1}@nokogiri" --create || rvm -v
|
47
51
|
}
|
48
52
|
|
49
53
|
set -o errexit
|
data/ext/nokogiri/xml_document.c
CHANGED
@@ -367,7 +367,7 @@ static VALUE new(int argc, VALUE *argv, VALUE klass)
|
|
367
367
|
*
|
368
368
|
* For more information on why this probably is *not* a good thing in general,
|
369
369
|
* please direct your browser to
|
370
|
-
* http://tenderlovemaking.com/2009/04/23/namespaces-in-xml
|
370
|
+
* http://tenderlovemaking.com/2009/04/23/namespaces-in-xml.html
|
371
371
|
*/
|
372
372
|
VALUE remove_namespaces_bang(VALUE self)
|
373
373
|
{
|
data/ext/nokogiri/xml_node.c
CHANGED
@@ -84,7 +84,7 @@ static xmlNodePtr xmlReplaceNodeWrapper(xmlNodePtr pivot, xmlNodePtr new_node)
|
|
84
84
|
}
|
85
85
|
|
86
86
|
/* work around libxml2 issue: https://bugzilla.gnome.org/show_bug.cgi?id=615612 */
|
87
|
-
if (retval->type == XML_TEXT_NODE) {
|
87
|
+
if (retval && retval->type == XML_TEXT_NODE) {
|
88
88
|
if (retval->prev && retval->prev->type == XML_TEXT_NODE) {
|
89
89
|
retval = xmlTextMerge(retval->prev, retval);
|
90
90
|
}
|
@@ -699,23 +699,40 @@ static VALUE set(VALUE self, VALUE property, VALUE value)
|
|
699
699
|
*
|
700
700
|
* Get the value for +attribute+
|
701
701
|
*/
|
702
|
-
static VALUE get(VALUE self, VALUE
|
702
|
+
static VALUE get(VALUE self, VALUE rattribute)
|
703
703
|
{
|
704
704
|
xmlNodePtr node;
|
705
|
-
xmlChar*
|
706
|
-
VALUE
|
707
|
-
|
705
|
+
xmlChar* value = 0;
|
706
|
+
VALUE rvalue ;
|
707
|
+
char* attribute = 0;
|
708
|
+
char *colon = 0, *attr_name = 0, *prefix = 0;
|
709
|
+
xmlNsPtr ns;
|
708
710
|
|
709
|
-
if(NIL_P(
|
711
|
+
if (NIL_P(rattribute)) return Qnil;
|
710
712
|
|
711
|
-
|
713
|
+
Data_Get_Struct(self, xmlNode, node);
|
714
|
+
attribute = strdup(StringValuePtr(rattribute));
|
715
|
+
|
716
|
+
colon = strchr(attribute, ':');
|
717
|
+
if (colon) {
|
718
|
+
(*colon) = 0 ; /* create two null-terminated strings of the prefix and attribute name */
|
719
|
+
prefix = attribute ;
|
720
|
+
attr_name = colon + 1 ;
|
721
|
+
ns = xmlSearchNs(node->doc, node, (const xmlChar *)(prefix));
|
722
|
+
if (ns) {
|
723
|
+
value = xmlGetNsProp(node, (xmlChar*)(attr_name), ns->href);
|
724
|
+
}
|
725
|
+
} else {
|
726
|
+
value = xmlGetNoNsProp(node, (xmlChar*)attribute);
|
727
|
+
}
|
712
728
|
|
713
|
-
|
729
|
+
free(attribute);
|
730
|
+
if (!value) return Qnil;
|
714
731
|
|
715
|
-
|
732
|
+
rvalue = NOKOGIRI_STR_NEW2(value);
|
733
|
+
xmlFree(value);
|
716
734
|
|
717
|
-
|
718
|
-
return rval ;
|
735
|
+
return rvalue ;
|
719
736
|
}
|
720
737
|
|
721
738
|
/*
|
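The rewritten `get` splits a qualified attribute name at the first colon, resolves the prefix to a namespace, and otherwise falls back to a lookup without a namespace. The Ruby sketch below illustrates only that decision logic. The `NAMESPACES` and `ATTRS` tables and the `get_attribute` name are hypothetical; the C code uses `xmlSearchNs`, `xmlGetNsProp` and `xmlGetNoNsProp` on a real node.

```ruby
# Hypothetical data standing in for a node's in-scope namespaces and its
# attributes; keys of ATTRS are [namespace href or nil, local name].
NAMESPACES = { 'xml' => 'http://www.w3.org/XML/1998/namespace', 'foo' => 'x' }
ATTRS = {
  ['http://www.w3.org/XML/1998/namespace', 'lang'] => 'en-GB',
  [nil, 'id'] => 'root'
}

def get_attribute(name)
  return nil if name.nil?
  prefix, local = name.split(':', 2)      # split at the first colon only
  if local                                # "prefix:local" form
    href = NAMESPACES[prefix]
    href ? ATTRS[[href, local]] : nil     # unknown prefix yields nil
  else
    ATTRS[[nil, prefix]]                  # plain name: no-namespace lookup
  end
end
```

Under these assumptions `get_attribute('xml:lang')` finds the value while `get_attribute('lang')` does not, which matches the expectations in the new `test_node_attributes.rb` assertions.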
@@ -892,7 +909,7 @@ static VALUE node_type(VALUE self)
|
|
892
909
|
*
|
893
910
|
* Set the content for this Node
|
894
911
|
*/
|
895
|
-
static VALUE
|
912
|
+
static VALUE native_content(VALUE self, VALUE content)
|
896
913
|
{
|
897
914
|
xmlNodePtr node, child, next ;
|
898
915
|
Data_Get_Struct(self, xmlNode, node);
|
@@ -1288,7 +1305,7 @@ static VALUE in_context(VALUE self, VALUE _str, VALUE _options)
|
|
1288
1305
|
child_iter = node;
|
1289
1306
|
while (child_iter->parent)
|
1290
1307
|
child_iter = child_iter->parent;
|
1291
|
-
|
1308
|
+
|
1292
1309
|
if (child_iter->type == XML_DOCUMENT_FRAG_NODE)
|
1293
1310
|
node->doc->children = NULL;
|
1294
1311
|
}
|
@@ -1458,6 +1475,7 @@ void init_xml_node()
|
|
1458
1475
|
rb_define_method(klass, "create_external_subset", create_external_subset, 3);
|
1459
1476
|
rb_define_method(klass, "pointer_id", pointer_id, 0);
|
1460
1477
|
rb_define_method(klass, "line", line, 0);
|
1478
|
+
rb_define_method(klass, "native_content=", native_content, 1);
|
1461
1479
|
|
1462
1480
|
rb_define_private_method(klass, "process_xincludes", process_xincludes, 1);
|
1463
1481
|
rb_define_private_method(klass, "in_context", in_context, 2);
|
@@ -1467,7 +1485,6 @@ void init_xml_node()
|
|
1467
1485
|
rb_define_private_method(klass, "replace_node", replace, 1);
|
1468
1486
|
rb_define_private_method(klass, "dump_html", dump_html, 0);
|
1469
1487
|
rb_define_private_method(klass, "native_write_to", native_write_to, 4);
|
1470
|
-
rb_define_private_method(klass, "native_content=", set_content, 1);
|
1471
1488
|
rb_define_private_method(klass, "get", get, 1);
|
1472
1489
|
rb_define_private_method(klass, "set", set, 2);
|
1473
1490
|
rb_define_private_method(klass, "set_namespace", set_namespace, 1);
|
data/ext/nokogiri/xml_sax_parser.c
CHANGED
@@ -7,6 +7,7 @@ static ID id_start_document, id_end_document, id_start_element, id_end_element;
|
|
7
7
|
static ID id_start_element_namespace, id_end_element_namespace;
|
8
8
|
static ID id_comment, id_characters, id_xmldecl, id_error, id_warning;
|
9
9
|
static ID id_cdata_block, id_cAttribute;
|
10
|
+
static ID id_processing_instruction;
|
10
11
|
|
11
12
|
#define STRING_OR_NULL(str) \
|
12
13
|
(RTEST(str) ? StringValuePtr(str) : NULL)
|
@@ -236,6 +237,19 @@ static void cdata_block(void * ctx, const xmlChar * value, int len)
|
|
236
237
|
rb_funcall(doc, id_cdata_block, 1, string);
|
237
238
|
}
|
238
239
|
|
240
|
+
static void processing_instruction(void * ctx, const xmlChar * name, const xmlChar * content)
|
241
|
+
{
|
242
|
+
VALUE self = NOKOGIRI_SAX_SELF(ctx);
|
243
|
+
VALUE doc = rb_iv_get(self, "@document");
|
244
|
+
|
245
|
+
rb_funcall( doc,
|
246
|
+
id_processing_instruction,
|
247
|
+
2,
|
248
|
+
NOKOGIRI_STR_NEW2(name),
|
249
|
+
NOKOGIRI_STR_NEW2(content)
|
250
|
+
);
|
251
|
+
}
|
252
|
+
|
239
253
|
static void deallocate(xmlSAXHandlerPtr handler)
|
240
254
|
{
|
241
255
|
NOKOGIRI_DEBUG_START(handler);
|
@@ -260,6 +274,7 @@ static VALUE allocate(VALUE klass)
|
|
260
274
|
handler->warning = warning_func;
|
261
275
|
handler->error = error_func;
|
262
276
|
handler->cdataBlock = cdata_block;
|
277
|
+
handler->processingInstruction = processing_instruction;
|
263
278
|
handler->initialized = XML_SAX2_MAGIC;
|
264
279
|
|
265
280
|
return Data_Wrap_Struct(klass, NULL, deallocate, handler);
|
@@ -290,4 +305,5 @@ void init_xml_sax_parser()
|
|
290
305
|
id_cAttribute = rb_intern("Attribute");
|
291
306
|
id_start_element_namespace = rb_intern("start_element_namespace");
|
292
307
|
id_end_element_namespace = rb_intern("end_element_namespace");
|
308
|
+
id_processing_instruction = rb_intern("processing_instruction");
|
293
309
|
}
|
Binary file
|
Binary file
|
data/lib/nokogiri/version.rb
CHANGED
data/lib/nokogiri/xml/document.rb
CHANGED
@@ -149,13 +149,15 @@ module Nokogiri
|
|
149
149
|
# Non-prefixed default namespaces (as in "xmlns=") are not included
|
150
150
|
# in the hash.
|
151
151
|
#
|
152
|
-
# Note this
|
153
|
-
#
|
154
|
-
#
|
152
|
+
# Note that this method does an xpath lookup for nodes with
|
153
|
+
# namespaces, and as a result the order may be dependent on the
|
154
|
+
# implementation of the underlying XML library.
|
155
|
+
#
|
155
156
|
def collect_namespaces
|
156
|
-
|
157
|
-
|
158
|
-
|
157
|
+
xpath("//namespace::*").inject({}) do |hash, ns|
|
158
|
+
hash[["xmlns",ns.prefix].compact.join(":")] = ns.href if ns.prefix != "xml"
|
159
|
+
hash
|
160
|
+
end
|
159
161
|
end
|
160
162
|
|
161
163
|
# Get the list of decorators given +key+
|
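The new `collect_namespaces` body folds the namespace nodes returned by one XPath query into a hash and skips the built-in `xml` prefix. The fold itself can be tried without a parser. In this sketch the `Ns` struct and the sample URLs are hypothetical stand-ins for the objects that `xpath("//namespace::*")` would return.

```ruby
# Hypothetical stand-in for the namespace objects an XPath query returns.
Ns = Struct.new(:prefix, :href)

found = [
  Ns.new('foo', 'http://example.com/foo'),
  Ns.new('xml', 'http://www.w3.org/XML/1998/namespace')
]

# Same fold as in the hunk: build "xmlns:prefix" keys and skip the
# built-in xml prefix.
namespaces = found.inject({}) do |hash, ns|
  hash[['xmlns', ns.prefix].compact.join(':')] = ns.href if ns.prefix != 'xml'
  hash
end
```

Because the entries arrive in whatever order the query yields them, the resulting hash order depends on the underlying XML library, as the updated comment notes.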
data/lib/nokogiri/xml/document_fragment.rb
CHANGED
@@ -13,7 +13,8 @@ module Nokogiri
|
|
13
13
|
children = if ctx
|
14
14
|
# Fix for issue#490
|
15
15
|
if Nokogiri.jruby?
|
16
|
-
|
16
|
+
# fix for issue #770
|
17
|
+
ctx.parse("<root #{namespace_declarations(ctx)}>#{tags}</root>").children
|
17
18
|
else
|
18
19
|
ctx.parse(tags)
|
19
20
|
end
|
@@ -93,6 +94,14 @@ module Nokogiri
|
|
93
94
|
|
94
95
|
private
|
95
96
|
|
97
|
+
# fix for issue 770
|
98
|
+
def namespace_declarations ctx
|
99
|
+
ctx.namespace_scopes.map do |namespace|
|
100
|
+
prefix = namespace.prefix.nil? ? "" : ":#{namespace.prefix}"
|
101
|
+
%Q{xmlns#{prefix}="#{namespace.href}"}
|
102
|
+
end.join ' '
|
103
|
+
end
|
104
|
+
|
96
105
|
def coerce data
|
97
106
|
return super unless String === data
|
98
107
|
|
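For issue #770 the JRuby branch wraps the raw tags in a temporary root element that redeclares every namespace in scope, so that prefixed attributes can be resolved while parsing. The string building can be sketched on its own. The `Scope` struct is a hypothetical stand-in for the entries of `ctx.namespace_scopes`, and the sample values are borrowed from the new builder test.

```ruby
# Hypothetical stand-in for entries of ctx.namespace_scopes.
Scope = Struct.new(:prefix, :href)

# Same shape as the private helper in the hunk, taking the scopes directly.
def namespace_declarations(scopes)
  scopes.map do |ns|
    prefix = ns.prefix.nil? ? '' : ":#{ns.prefix}"
    %Q{xmlns#{prefix}="#{ns.href}"}
  end.join(' ')
end

scopes = [Scope.new('foo', 'x'), Scope.new(nil, 'y')]
tags = '<Element foo:bar="bazz"/>'

# The fragment is parsed inside this wrapper and only its children are kept.
wrapped = "<root #{namespace_declarations(scopes)}>#{tags}</root>"
```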
data/lib/nokogiri/xml/node.rb
CHANGED
@@ -251,14 +251,13 @@ module Nokogiri
|
|
251
251
|
###
|
252
252
|
# Get the attribute value for the attribute +name+
|
253
253
|
def [] name
|
254
|
-
return nil unless key?(name.to_s)
|
255
254
|
get(name.to_s)
|
256
255
|
end
|
257
256
|
|
258
257
|
###
|
259
258
|
# Set the attribute value for the attribute +name+ to +value+
|
260
259
|
def []= name, value
|
261
|
-
set name.to_s, value
|
260
|
+
set name.to_s, value.to_s
|
262
261
|
end
|
263
262
|
|
264
263
|
###
|
@@ -377,17 +376,22 @@ module Nokogiri
|
|
377
376
|
#
|
378
377
|
# Also see related method +swap+.
|
379
378
|
def replace node_or_tags
|
379
|
+
# We cannot replace a text node directly, otherwise libxml will return
|
380
|
+
# an internal error at parser.c:13031, I don't know exactly why
|
381
|
+
# libxml is trying to find a parent node that is an element or document
|
382
|
+
# so I can't tell if this is a bug in libxml or not. issue #775.
|
383
|
+
if text?
|
384
|
+
replacee = Nokogiri::XML::Node.new 'dummy', document
|
385
|
+
add_previous_sibling_node replacee
|
386
|
+
unlink
|
387
|
+
return replacee.replace node_or_tags
|
388
|
+
end
|
389
|
+
|
380
390
|
node_or_tags = coerce(node_or_tags)
|
391
|
+
|
381
392
|
if node_or_tags.is_a?(XML::NodeSet)
|
382
|
-
|
383
|
-
|
384
|
-
add_previous_sibling_node replacee
|
385
|
-
unlink
|
386
|
-
else
|
387
|
-
replacee = self
|
388
|
-
end
|
389
|
-
node_or_tags.each { |n| replacee.add_previous_sibling n }
|
390
|
-
replacee.unlink
|
393
|
+
node_or_tags.each { |n| add_previous_sibling n }
|
394
|
+
unlink
|
391
395
|
else
|
392
396
|
replace_node node_or_tags
|
393
397
|
end
|
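With `value.to_s` added to `[]=`, non-string values such as integers and booleans are stored as their string form. The effect can be illustrated with a small plain-Ruby class. `AttrBag` is a hypothetical stand-in for a node's attribute storage and does not involve Nokogiri.

```ruby
# Hypothetical attribute store imitating the name.to_s / value.to_s coercion.
class AttrBag
  def initialize
    @attrs = {}
  end

  # Both the name and the value are stringified before storing.
  def []=(name, value)
    @attrs[name.to_s] = value.to_s
  end

  def [](name)
    @attrs[name.to_s]
  end
end

bag = AttrBag.new
bag[:foo] = 1
bag['flag'] = false
```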
data/lib/nokogiri/xml/sax/document.rb
CHANGED
@@ -158,6 +158,13 @@ module Nokogiri
|
|
158
158
|
# +string+ contains the cdata content
|
159
159
|
def cdata_block string
|
160
160
|
end
|
161
|
+
|
162
|
+
###
|
163
|
+
# Called when processing instructions are found
|
164
|
+
# +name+ is the target of the instruction
|
165
|
+
# +content+ is the value of the instruction
|
166
|
+
def processing_instruction name, content
|
167
|
+
end
|
161
168
|
end
|
162
169
|
end
|
163
170
|
end
|
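The default `processing_instruction` callback is empty, so a handler that wants the instructions has to override it. The sketch below shows the shape of such an override. It deliberately uses a plain class and calls the callback by hand instead of running a parser, so `InstructionRecorder` is a hypothetical stand-in for a subclass of `Nokogiri::XML::SAX::Document`.

```ruby
# Plain-Ruby stand-in for a SAX document handler; in real use this class
# would inherit from Nokogiri::XML::SAX::Document.
class InstructionRecorder
  attr_reader :processing_instructions

  def initialize
    @processing_instructions = []
  end

  # +name+ is the instruction target, +content+ is everything after it.
  def processing_instruction(name, content)
    @processing_instructions << [name, content]
  end
end

recorder = InstructionRecorder.new
# Simulates what a parser would report for
# <?xml-stylesheet href="a.xsl" type="text/xsl"?>
recorder.processing_instruction('xml-stylesheet', 'href="a.xsl" type="text/xsl"')
```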
data/lib/nokogiri/xml/xpath_context.rb
CHANGED
@@ -6,7 +6,7 @@ module Nokogiri
|
|
6
6
|
# Register namespaces in +namespaces+
|
7
7
|
def register_namespaces(namespaces)
|
8
8
|
namespaces.each do |k, v|
|
9
|
-
k = k.gsub(/.*:/,'') # strip off 'xmlns:' or 'xml:'
|
9
|
+
k = k.to_s.gsub(/.*:/,'') # strip off 'xmlns:' or 'xml:'
|
10
10
|
register_ns(k, v)
|
11
11
|
end
|
12
12
|
end
|
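The added `to_s` matters because `Symbol` has no `gsub`, so a hash with symbol keys would fail before this change. The key normalization can be checked in isolation. The `registered` hash below is a hypothetical recorder standing in for the calls to `register_ns`.

```ruby
# Hypothetical recorder standing in for XPathContext#register_ns.
registered = {}

namespaces = {
  :bike        => 'http://schwinn.com/',
  'xmlns:part' => 'http://general-motors.com/'
}

namespaces.each do |k, v|
  key = k.to_s.gsub(/.*:/, '')   # convert first, then strip 'xmlns:' or 'xml:'
  registered[key] = v
end
```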
data/test/helper.rb
CHANGED
@@ -78,6 +78,7 @@ module Nokogiri
|
|
78
78
|
attr_reader :data, :comments, :cdata_blocks, :start_elements_namespace
|
79
79
|
attr_reader :errors, :warnings, :end_elements_namespace
|
80
80
|
attr_reader :xmldecls
|
81
|
+
attr_reader :processing_instructions
|
81
82
|
|
82
83
|
def xmldecl version, encoding, standalone
|
83
84
|
@xmldecls = [version, encoding, standalone].compact
|
@@ -141,6 +142,11 @@ module Nokogiri
|
|
141
142
|
@cdata_blocks += [string]
|
142
143
|
super
|
143
144
|
end
|
145
|
+
|
146
|
+
def processing_instruction name, content
|
147
|
+
@processing_instructions ||= []
|
148
|
+
@processing_instructions << [name, content]
|
149
|
+
end
|
144
150
|
end
|
145
151
|
end
|
146
152
|
end
|
data/test/html/test_document_fragment.rb
CHANGED
@@ -24,6 +24,11 @@ module Nokogiri
|
|
24
24
|
end
|
25
25
|
end
|
26
26
|
|
27
|
+
def test_colons_are_not_removed
|
28
|
+
doc = Nokogiri::HTML::DocumentFragment.parse("<span>3:30pm</span>")
|
29
|
+
assert_match /3:30/, doc.to_s
|
30
|
+
end
|
31
|
+
|
27
32
|
def test_parse_encoding
|
28
33
|
fragment = "<div>hello world</div>"
|
29
34
|
f = Nokogiri::HTML::DocumentFragment.parse fragment, 'ISO-8859-1'
|
data/test/xml/sax/test_parser.rb
CHANGED
@@ -173,7 +173,12 @@ module Nokogiri
|
|
173
173
|
end
|
174
174
|
end
|
175
175
|
|
176
|
-
|
176
|
+
# when using JRuby Nokogiri, more errors will be generated as the DOM
|
177
|
+
# parser continues to parse an ill-formed document, while the SAX parser
|
178
|
+
# will stop at the first error
|
179
|
+
unless Nokogiri.jruby?
|
180
|
+
assert_equal doc.errors.length, @parser.document.errors.length
|
181
|
+
end
|
177
182
|
end
|
178
183
|
|
179
184
|
def test_parse_with_memory_argument
|
@@ -313,6 +318,15 @@ module Nokogiri
|
|
313
318
|
@parser.document.start_elements
|
314
319
|
end
|
315
320
|
|
321
|
+
def test_processing_instruction
|
322
|
+
@parser.parse_memory(<<-eoxml)
|
323
|
+
<?xml-stylesheet href="a.xsl" type="text/xsl"?>
|
324
|
+
<?xml version="1.0"?>
|
325
|
+
eoxml
|
326
|
+
assert_equal [['xml-stylesheet', 'href="a.xsl" type="text/xsl"']],
|
327
|
+
@parser.document.processing_instructions
|
328
|
+
end
|
329
|
+
|
316
330
|
if Nokogiri.uses_libxml? # JRuby SAXParser only parses well-formed XML documents
|
317
331
|
def test_parse_document
|
318
332
|
@parser.parse_memory(<<-eoxml)
|
data/test/xml/test_builder.rb
CHANGED
@@ -209,6 +209,25 @@ module Nokogiri
|
|
209
209
|
assert_equal ["bbb","ccc"], builder.doc.at_css("aaa").children.collect(&:name)
|
210
210
|
end
|
211
211
|
|
212
|
+
def test_raw_xml_append_with_namespaces
|
213
|
+
doc = Nokogiri::XML::Builder.new do |xml|
|
214
|
+
xml.root("xmlns:foo" => "x", "xmlns" => "y") do
|
215
|
+
xml << '<Element foo:bar="bazz"/>'
|
216
|
+
end
|
217
|
+
end.doc
|
218
|
+
|
219
|
+
el = doc.at 'Element'
|
220
|
+
assert_not_nil el
|
221
|
+
|
222
|
+
assert_equal 'y', el.namespace.href
|
223
|
+
assert_nil el.namespace.prefix
|
224
|
+
|
225
|
+
attr = el.attributes["bar"]
|
226
|
+
assert_not_nil attr
|
227
|
+
assert_not_nil attr.namespace
|
228
|
+
assert_equal "foo", attr.namespace.prefix
|
229
|
+
end
|
230
|
+
|
212
231
|
def test_cdata
|
213
232
|
builder = Nokogiri::XML::Builder.new do
|
214
233
|
root {
|
data/test/xml/test_document.rb
CHANGED
@@ -16,11 +16,27 @@ module Nokogiri
|
|
16
16
|
@xml = Nokogiri::XML.parse(File.read(XML_FILE), XML_FILE)
|
17
17
|
end
|
18
18
|
|
19
|
+
def test_document_with_initial_space
|
20
|
+
doc = Nokogiri::XML(" <?xml version='1.0' encoding='utf-8' ?><first \>")
|
21
|
+
assert_equal 2, doc.children.size
|
22
|
+
end
|
23
|
+
|
19
24
|
def test_root_set_to_nil
|
20
25
|
@xml.root = nil
|
21
26
|
assert_equal nil, @xml.root
|
22
27
|
end
|
23
28
|
|
29
|
+
def test_ignore_unknown_namespace
|
30
|
+
doc = Nokogiri::XML(<<-eoxml)
|
31
|
+
<xml>
|
32
|
+
<unknown:foo xmlns='hello' />
|
33
|
+
<bar />
|
34
|
+
</xml>
|
35
|
+
eoxml
|
36
|
+
refute doc.xpath('//foo').first.namespace # assert that the namespace is nil
|
37
|
+
refute_empty doc.xpath('//bar'), "bar wasn't found in the document" # bar should be part of the doc
|
38
|
+
end
|
39
|
+
|
24
40
|
def test_collect_namespaces
|
25
41
|
doc = Nokogiri::XML(<<-eoxml)
|
26
42
|
<xml>
|
@@ -716,26 +732,43 @@ module Nokogiri
|
|
716
732
|
assert @xml.children.respond_to?(:awesome!)
|
717
733
|
end
|
718
734
|
|
719
|
-
|
720
|
-
|
735
|
+
if Nokogiri.jruby?
|
736
|
+
def wrap_java_document
|
721
737
|
require 'java'
|
722
738
|
factory = javax.xml.parsers.DocumentBuilderFactory.newInstance
|
723
739
|
builder = factory.newDocumentBuilder
|
724
740
|
document = builder.newDocument
|
725
741
|
root = document.createElement("foo")
|
726
742
|
document.appendChild(root)
|
727
|
-
|
728
|
-
|
743
|
+
Nokogiri::XML::Document.wrap(document)
|
744
|
+
end
|
745
|
+
end
|
729
746
|
|
730
|
-
|
747
|
+
def test_java_integration
|
748
|
+
skip("Ruby doesn't have the wrap method") unless Nokogiri.jruby?
|
749
|
+
noko_doc = wrap_java_document
|
750
|
+
assert_equal 'foo', noko_doc.root.name
|
751
|
+
|
752
|
+
noko_doc = Nokogiri::XML(<<eoxml)
|
731
753
|
<foo xmlns='hello'>
|
732
754
|
<bar xmlns:foo='world' />
|
733
755
|
</foo>
|
734
756
|
eoxml
|
735
|
-
|
736
|
-
|
737
|
-
|
738
|
-
|
757
|
+
dom = noko_doc.to_java
|
758
|
+
assert dom.kind_of? org.w3c.dom.Document
|
759
|
+
assert_equal 'foo', dom.getDocumentElement().getTagName()
|
760
|
+
end
|
761
|
+
|
762
|
+
def test_add_child
|
763
|
+
skip("Ruby doesn't have the wrap method") unless Nokogiri.jruby?
|
764
|
+
doc = wrap_java_document
|
765
|
+
doc.root.add_child "<bar />"
|
766
|
+
end
|
767
|
+
|
768
|
+
def test_can_be_closed
|
769
|
+
f = File.open XML_FILE
|
770
|
+
Nokogiri::XML f
|
771
|
+
f.close
|
739
772
|
end
|
740
773
|
end
|
741
774
|
end
|
data/test/xml/test_document_fragment.rb
CHANGED
@@ -8,6 +8,13 @@ module Nokogiri
|
|
8
8
|
@xml = Nokogiri::XML.parse(File.read(XML_FILE), XML_FILE)
|
9
9
|
end
|
10
10
|
|
11
|
+
def test_replace_text_node
|
12
|
+
html = "foo"
|
13
|
+
doc = Nokogiri::XML::DocumentFragment.parse(html)
|
14
|
+
doc.children[0].replace "bar"
|
15
|
+
assert_equal 'bar', doc.children[0].content
|
16
|
+
end
|
17
|
+
|
11
18
|
def test_fragment_is_relative
|
12
19
|
doc = Nokogiri::XML('<root><a xmlns="blah" /></root>')
|
13
20
|
ctx = doc.root.child
|
data/test/xml/test_node.rb
CHANGED
@@ -637,6 +637,13 @@ module Nokogiri
|
|
637
637
|
assert_equal 'foo', node.content
|
638
638
|
end
|
639
639
|
|
640
|
+
def test_set_native_content_is_unescaped
|
641
|
+
comment = Nokogiri.XML('<r><!-- foo --></r>').at('//comment()')
|
642
|
+
|
643
|
+
comment.native_content = " < " # content= will escape this string
|
644
|
+
assert_equal "<!-- < -->", comment.to_xml
|
645
|
+
end
|
646
|
+
|
640
647
|
def test_find_by_css_with_tilde_eql
|
641
648
|
xml = Nokogiri::XML.parse(<<-eoxml)
|
642
649
|
<root>
|
@@ -717,6 +724,14 @@ module Nokogiri
|
|
717
724
|
assert_equal('bar', node['foo'])
|
718
725
|
end
|
719
726
|
|
727
|
+
def test_set_property_non_string
|
728
|
+
assert node = @xml.search('//address').first
|
729
|
+
node['foo'] = 1
|
730
|
+
assert_equal('1', node['foo'])
|
731
|
+
node['foo'] = false
|
732
|
+
assert_equal('false', node['foo'])
|
733
|
+
end
|
734
|
+
|
720
735
|
def test_attributes
|
721
736
|
assert node = @xml.search('//address').first
|
722
737
|
assert_nil(node['asdfasdfasdf'])
|
@@ -783,6 +798,23 @@ module Nokogiri
|
|
783
798
|
node.content = "1234 <-> 1234"
|
784
799
|
assert_equal "1234 <-> 1234", node.content
|
785
800
|
assert_equal "<form>1234 <-> 1234</form>", node.to_xml
|
801
|
+
|
802
|
+
node.content = '1234'
|
803
|
+
node.add_child '<foo>5678</foo>'
|
804
|
+
assert_equal '12345678', node.content
|
805
|
+
end
|
806
|
+
|
807
|
+
def test_content_after_appending_text
|
808
|
+
doc = Nokogiri::XML '<foo />'
|
809
|
+
node = doc.children.first
|
810
|
+
node.content = 'bar'
|
811
|
+
node << 'baz'
|
812
|
+
assert_equal 'barbaz', node.content
|
813
|
+
end
|
814
|
+
|
815
|
+
def test_content_depth_first
|
816
|
+
node = Nokogiri::XML '<foo>first<baz>second</baz>third</foo>'
|
817
|
+
assert_equal 'firstsecondthird', node.content
|
786
818
|
end
|
787
819
|
|
788
820
|
def test_set_content_should_unlink_existing_content
|
@@ -828,6 +860,22 @@ module Nokogiri
|
|
828
860
|
assert_equal 1, tires.length
|
829
861
|
end
|
830
862
|
|
863
|
+
def test_namespace_search_with_xpath_and_hash_with_symbol_keys
|
864
|
+
xml = Nokogiri::XML.parse(<<-eoxml)
|
865
|
+
<root>
|
866
|
+
<car xmlns:part="http://general-motors.com/">
|
867
|
+
<part:tire>Michelin Model XGV</part:tire>
|
868
|
+
</car>
|
869
|
+
<bicycle xmlns:part="http://schwinn.com/">
|
870
|
+
<part:tire>I'm a bicycle tire!</part:tire>
|
871
|
+
</bicycle>
|
872
|
+
</root>
|
873
|
+
eoxml
|
874
|
+
|
875
|
+
tires = xml.xpath('//bike:tire', :bike => 'http://schwinn.com/')
|
876
|
+
assert_equal 1, tires.length
|
877
|
+
end
|
878
|
+
|
831
879
|
def test_namespace_search_with_css
|
832
880
|
xml = Nokogiri::XML.parse(<<-eoxml)
|
833
881
|
<root>
|
@@ -1034,6 +1082,20 @@ EOXML
|
|
1034
1082
|
assert_match(/xmlns="bar"/, subject.to_xml)
|
1035
1083
|
end
|
1036
1084
|
|
1085
|
+
# issue 771
|
1086
|
+
def test_format_noblank
|
1087
|
+
content = <<eoxml
|
1088
|
+
<foo>
|
1089
|
+
<bar>hello</bar>
|
1090
|
+
</foo>
|
1091
|
+
eoxml
|
1092
|
+
subject = Nokogiri::XML(content) do |conf|
|
1093
|
+
conf.default_xml.noblanks
|
1094
|
+
end
|
1095
|
+
|
1096
|
+
assert_match %r{<bar>hello</bar>}, subject.to_xml(:indent => 2)
|
1097
|
+
end
|
1098
|
+
|
1037
1099
|
def test_text_node_colon
|
1038
1100
|
document = Nokogiri::XML::Document.new
|
1039
1101
|
root = Nokogiri::XML::Node.new 'foo', document
|
data/test/xml/test_node_attributes.rb
CHANGED
@@ -21,8 +21,28 @@ module Nokogiri
|
|
21
21
|
|
22
22
|
node = doc.root
|
23
23
|
|
24
|
-
assert_equal 'en-GB', node[
|
25
|
-
assert_equal
|
24
|
+
assert_equal 'en-GB', node['xml:lang']
|
25
|
+
assert_equal 'en-GB', node.attributes['lang'].value
|
26
|
+
assert_equal nil, node['lang']
|
27
|
+
end
|
28
|
+
|
29
|
+
def test_set_prefixed_attributes
|
30
|
+
doc = Nokogiri::XML %Q{<root xmlns:foo="x"/>}
|
31
|
+
|
32
|
+
node = doc.root
|
33
|
+
|
34
|
+
node['xml:lang'] = 'en-GB'
|
35
|
+
node['foo:bar'] = 'bazz'
|
36
|
+
|
37
|
+
assert_equal 'en-GB', node['xml:lang']
|
38
|
+
assert_equal 'en-GB', node.attributes['lang'].value
|
39
|
+
assert_equal nil, node['lang']
|
40
|
+
assert_equal 'http://www.w3.org/XML/1998/namespace', node.attributes['lang'].namespace.href
|
41
|
+
|
42
|
+
assert_equal 'bazz', node['foo:bar']
|
43
|
+
assert_equal 'bazz', node.attributes['bar'].value
|
44
|
+
assert_equal nil, node['bar']
|
45
|
+
assert_equal 'x', node.attributes['bar'].namespace.href
|
26
46
|
end
|
27
47
|
|
28
48
|
def test_namespace_key?
|
data/test/xml/test_unparented_node.rb
CHANGED
@@ -403,6 +403,15 @@ module Nokogiri
|
|
403
403
|
assert_equal set[0].to_xml, second.to_xml
|
404
404
|
end
|
405
405
|
|
406
|
+
def test_replace_on_unparented_node
|
407
|
+
foo = Node.new('foo', @node.document)
|
408
|
+
if Nokogiri.jruby? # JRuby Nokogiri doesn't raise an exception
|
409
|
+
@node.replace(foo)
|
410
|
+
else
|
411
|
+
assert_raises(RuntimeError){ @node.replace(foo) }
|
412
|
+
end
|
413
|
+
end
|
414
|
+
|
406
415
|
def test_illegal_replace_of_node_with_doc
|
407
416
|
new_node = Nokogiri::XML.parse('<foo>bar</foo>')
|
408
417
|
old_node = @node.at('.//employee')
|
data/test_all
CHANGED
@@ -5,8 +5,12 @@
|
|
5
5
|
#
|
6
6
|
# requires `rvm` to be installed. sorry about that, multiruby dudes.
|
7
7
|
#
|
8
|
+
# it's worth periodically using hoe-debugger's ability to generate
|
9
|
+
# valgrind suppression files to remove spurious valgrind messages
|
10
|
+
# (e.g., 1.9.3's glob_helper). ["rake test:valgrind:suppression"]
|
11
|
+
#
|
8
12
|
|
9
|
-
RUBIES="ruby-1.9.3-p194 jruby-1.6.5.1 jruby-1.6.7.2 ree-1.8.7-2011.12 ruby-1.9.2-
|
13
|
+
RUBIES="ruby-1.9.3-p194 jruby-1.7.0 jruby-1.6.5.1 jruby-1.6.7.2 ree-1.8.7-2011.12 ruby-1.9.2-p320 ruby-1.8.7-p370"
|
10
14
|
TEST_LOG=test.log
|
11
15
|
VALGRIND_LOG=valgrind.log
|
12
16
|
|
@@ -25,7 +29,7 @@ set -o errexit
|
|
25
29
|
|
26
30
|
function rvm_use {
|
27
31
|
current_ruby=$1
|
28
|
-
rvm use "${1}@nokogiri" --create
|
32
|
+
rvm use "${1}@nokogiri" --create || rvm -v
|
29
33
|
}
|
30
34
|
|
31
35
|
function generate_parser_and_tokenizer {
|
metadata
CHANGED
@@ -1,15 +1,15 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: nokogiri
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
hash:
|
4
|
+
hash: 147139679
|
5
5
|
prerelease: 6
|
6
6
|
segments:
|
7
7
|
- 1
|
8
8
|
- 5
|
9
9
|
- 6
|
10
10
|
- rc
|
11
|
-
-
|
12
|
-
version: 1.5.6.
|
11
|
+
- 3
|
12
|
+
version: 1.5.6.rc3
|
13
13
|
platform: x86-mswin32-60
|
14
14
|
authors:
|
15
15
|
- Aaron Patterson
|
@@ -20,7 +20,7 @@ autorequire:
|
|
20
20
|
bindir: bin
|
21
21
|
cert_chain: []
|
22
22
|
|
23
|
-
date: 2012-
|
23
|
+
date: 2012-11-27 00:00:00 Z
|
24
24
|
dependencies:
|
25
25
|
- !ruby/object:Gem::Dependency
|
26
26
|
prerelease: false
|