nokogiri 1.10.9 → 1.12.5
- checksums.yaml +4 -4
- data/Gemfile +3 -0
- data/LICENSE-DEPENDENCIES.md +1173 -884
- data/LICENSE.md +1 -1
- data/README.md +176 -96
- data/dependencies.yml +12 -12
- data/ext/nokogiri/depend +38 -358
- data/ext/nokogiri/extconf.rb +716 -414
- data/ext/nokogiri/gumbo.c +584 -0
- data/ext/nokogiri/html4_document.c +166 -0
- data/ext/nokogiri/html4_element_description.c +294 -0
- data/ext/nokogiri/html4_entity_lookup.c +37 -0
- data/ext/nokogiri/html4_sax_parser_context.c +120 -0
- data/ext/nokogiri/html4_sax_push_parser.c +95 -0
- data/ext/nokogiri/libxml2_backwards_compat.c +121 -0
- data/ext/nokogiri/nokogiri.c +228 -91
- data/ext/nokogiri/nokogiri.h +191 -89
- data/ext/nokogiri/test_global_handlers.c +40 -0
- data/ext/nokogiri/xml_attr.c +15 -15
- data/ext/nokogiri/xml_attribute_decl.c +18 -18
- data/ext/nokogiri/xml_cdata.c +13 -18
- data/ext/nokogiri/xml_comment.c +19 -26
- data/ext/nokogiri/xml_document.c +267 -195
- data/ext/nokogiri/xml_document_fragment.c +13 -15
- data/ext/nokogiri/xml_dtd.c +54 -48
- data/ext/nokogiri/xml_element_content.c +31 -26
- data/ext/nokogiri/xml_element_decl.c +22 -22
- data/ext/nokogiri/xml_encoding_handler.c +28 -17
- data/ext/nokogiri/xml_entity_decl.c +32 -30
- data/ext/nokogiri/xml_entity_reference.c +16 -18
- data/ext/nokogiri/xml_namespace.c +60 -51
- data/ext/nokogiri/xml_node.c +493 -407
- data/ext/nokogiri/xml_node_set.c +174 -162
- data/ext/nokogiri/xml_processing_instruction.c +17 -19
- data/ext/nokogiri/xml_reader.c +197 -172
- data/ext/nokogiri/xml_relax_ng.c +52 -28
- data/ext/nokogiri/xml_sax_parser.c +112 -112
- data/ext/nokogiri/xml_sax_parser_context.c +105 -86
- data/ext/nokogiri/xml_sax_push_parser.c +36 -27
- data/ext/nokogiri/xml_schema.c +96 -46
- data/ext/nokogiri/xml_syntax_error.c +42 -21
- data/ext/nokogiri/xml_text.c +13 -17
- data/ext/nokogiri/xml_xpath_context.c +158 -73
- data/ext/nokogiri/xslt_stylesheet.c +158 -164
- data/gumbo-parser/CHANGES.md +63 -0
- data/gumbo-parser/Makefile +101 -0
- data/gumbo-parser/THANKS +27 -0
- data/gumbo-parser/src/Makefile +34 -0
- data/gumbo-parser/src/README.md +41 -0
- data/gumbo-parser/src/ascii.c +75 -0
- data/gumbo-parser/src/ascii.h +115 -0
- data/gumbo-parser/src/attribute.c +42 -0
- data/gumbo-parser/src/attribute.h +17 -0
- data/gumbo-parser/src/char_ref.c +22225 -0
- data/gumbo-parser/src/char_ref.h +29 -0
- data/gumbo-parser/src/char_ref.rl +2154 -0
- data/gumbo-parser/src/error.c +626 -0
- data/gumbo-parser/src/error.h +148 -0
- data/gumbo-parser/src/foreign_attrs.c +104 -0
- data/gumbo-parser/src/foreign_attrs.gperf +27 -0
- data/gumbo-parser/src/gumbo.h +943 -0
- data/gumbo-parser/src/insertion_mode.h +33 -0
- data/gumbo-parser/src/macros.h +91 -0
- data/gumbo-parser/src/parser.c +4886 -0
- data/gumbo-parser/src/parser.h +41 -0
- data/gumbo-parser/src/replacement.h +33 -0
- data/gumbo-parser/src/string_buffer.c +103 -0
- data/gumbo-parser/src/string_buffer.h +68 -0
- data/gumbo-parser/src/string_piece.c +48 -0
- data/gumbo-parser/src/svg_attrs.c +174 -0
- data/gumbo-parser/src/svg_attrs.gperf +77 -0
- data/gumbo-parser/src/svg_tags.c +137 -0
- data/gumbo-parser/src/svg_tags.gperf +55 -0
- data/gumbo-parser/src/tag.c +222 -0
- data/gumbo-parser/src/tag_lookup.c +382 -0
- data/gumbo-parser/src/tag_lookup.gperf +169 -0
- data/gumbo-parser/src/tag_lookup.h +13 -0
- data/gumbo-parser/src/token_buffer.c +79 -0
- data/gumbo-parser/src/token_buffer.h +71 -0
- data/gumbo-parser/src/token_type.h +17 -0
- data/gumbo-parser/src/tokenizer.c +3463 -0
- data/gumbo-parser/src/tokenizer.h +112 -0
- data/gumbo-parser/src/tokenizer_states.h +339 -0
- data/gumbo-parser/src/utf8.c +245 -0
- data/gumbo-parser/src/utf8.h +164 -0
- data/gumbo-parser/src/util.c +68 -0
- data/gumbo-parser/src/util.h +30 -0
- data/gumbo-parser/src/vector.c +111 -0
- data/gumbo-parser/src/vector.h +45 -0
- data/lib/nokogiri/css/node.rb +1 -0
- data/lib/nokogiri/css/parser.rb +64 -63
- data/lib/nokogiri/css/parser.y +3 -3
- data/lib/nokogiri/css/parser_extras.rb +39 -36
- data/lib/nokogiri/css/syntax_error.rb +2 -1
- data/lib/nokogiri/css/tokenizer.rb +1 -0
- data/lib/nokogiri/css/xpath_visitor.rb +73 -43
- data/lib/nokogiri/css.rb +15 -14
- data/lib/nokogiri/decorators/slop.rb +1 -0
- data/lib/nokogiri/extension.rb +31 -0
- data/lib/nokogiri/gumbo.rb +14 -0
- data/lib/nokogiri/html.rb +32 -27
- data/lib/nokogiri/{html → html4}/builder.rb +3 -2
- data/lib/nokogiri/{html → html4}/document.rb +17 -30
- data/lib/nokogiri/{html → html4}/document_fragment.rb +18 -17
- data/lib/nokogiri/{html → html4}/element_description.rb +2 -1
- data/lib/nokogiri/{html → html4}/element_description_defaults.rb +2 -1
- data/lib/nokogiri/{html → html4}/entity_lookup.rb +2 -1
- data/lib/nokogiri/{html → html4}/sax/parser.rb +12 -14
- data/lib/nokogiri/html4/sax/parser_context.rb +19 -0
- data/lib/nokogiri/{html → html4}/sax/push_parser.rb +6 -5
- data/lib/nokogiri/html4.rb +40 -0
- data/lib/nokogiri/html5/document.rb +74 -0
- data/lib/nokogiri/html5/document_fragment.rb +80 -0
- data/lib/nokogiri/html5/node.rb +93 -0
- data/lib/nokogiri/html5.rb +473 -0
- data/lib/nokogiri/jruby/dependencies.rb +20 -0
- data/lib/nokogiri/syntax_error.rb +1 -0
- data/lib/nokogiri/version/constant.rb +5 -0
- data/lib/nokogiri/version/info.rb +215 -0
- data/lib/nokogiri/version.rb +3 -109
- data/lib/nokogiri/xml/attr.rb +1 -0
- data/lib/nokogiri/xml/attribute_decl.rb +1 -0
- data/lib/nokogiri/xml/builder.rb +41 -2
- data/lib/nokogiri/xml/cdata.rb +1 -0
- data/lib/nokogiri/xml/character_data.rb +1 -0
- data/lib/nokogiri/xml/document.rb +138 -41
- data/lib/nokogiri/xml/document_fragment.rb +5 -6
- data/lib/nokogiri/xml/dtd.rb +1 -0
- data/lib/nokogiri/xml/element_content.rb +1 -0
- data/lib/nokogiri/xml/element_decl.rb +1 -0
- data/lib/nokogiri/xml/entity_decl.rb +1 -0
- data/lib/nokogiri/xml/entity_reference.rb +1 -0
- data/lib/nokogiri/xml/namespace.rb +1 -0
- data/lib/nokogiri/xml/node/save_options.rb +2 -1
- data/lib/nokogiri/xml/node.rb +629 -293
- data/lib/nokogiri/xml/node_set.rb +1 -0
- data/lib/nokogiri/xml/notation.rb +1 -0
- data/lib/nokogiri/xml/parse_options.rb +12 -3
- data/lib/nokogiri/xml/pp/character_data.rb +1 -0
- data/lib/nokogiri/xml/pp/node.rb +1 -0
- data/lib/nokogiri/xml/pp.rb +3 -2
- data/lib/nokogiri/xml/processing_instruction.rb +1 -0
- data/lib/nokogiri/xml/reader.rb +9 -12
- data/lib/nokogiri/xml/relax_ng.rb +7 -2
- data/lib/nokogiri/xml/sax/document.rb +25 -30
- data/lib/nokogiri/xml/sax/parser.rb +1 -0
- data/lib/nokogiri/xml/sax/parser_context.rb +1 -0
- data/lib/nokogiri/xml/sax/push_parser.rb +1 -0
- data/lib/nokogiri/xml/sax.rb +5 -4
- data/lib/nokogiri/xml/schema.rb +13 -4
- data/lib/nokogiri/xml/searchable.rb +25 -16
- data/lib/nokogiri/xml/syntax_error.rb +1 -0
- data/lib/nokogiri/xml/text.rb +1 -0
- data/lib/nokogiri/xml/xpath/syntax_error.rb +2 -1
- data/lib/nokogiri/xml/xpath.rb +4 -5
- data/lib/nokogiri/xml/xpath_context.rb +1 -0
- data/lib/nokogiri/xml.rb +36 -36
- data/lib/nokogiri/xslt/stylesheet.rb +2 -1
- data/lib/nokogiri/xslt.rb +17 -16
- data/lib/nokogiri.rb +32 -51
- data/lib/xsd/xmlparser/nokogiri.rb +1 -0
- data/patches/libxml2/{0002-Remove-script-macro-support.patch → 0001-Remove-script-macro-support.patch} +0 -0
- data/patches/libxml2/{0003-Update-entities-to-remove-handling-of-ssi.patch → 0002-Update-entities-to-remove-handling-of-ssi.patch} +0 -0
- data/patches/libxml2/{0004-libxml2.la-is-in-top_builddir.patch → 0003-libxml2.la-is-in-top_builddir.patch} +1 -1
- data/patches/libxml2/0004-use-glibc-strlen.patch +53 -0
- data/patches/libxml2/0005-avoid-isnan-isinf.patch +81 -0
- data/patches/libxml2/0006-update-automake-files-for-arm64.patch +2511 -0
- data/patches/libxml2/0007-Fix-XPath-recursion-limit.patch +31 -0
- data/patches/libxslt/0001-update-automake-files-for-arm64.patch +2511 -0
- data/patches/libxslt/0002-Fix-xml2-config-check-in-configure-script.patch +19 -0
- data/ports/archives/libxml2-2.9.12.tar.gz +0 -0
- metadata +139 -161
- data/ext/nokogiri/html_document.c +0 -170
- data/ext/nokogiri/html_document.h +0 -10
- data/ext/nokogiri/html_element_description.c +0 -279
- data/ext/nokogiri/html_element_description.h +0 -10
- data/ext/nokogiri/html_entity_lookup.c +0 -32
- data/ext/nokogiri/html_entity_lookup.h +0 -8
- data/ext/nokogiri/html_sax_parser_context.c +0 -116
- data/ext/nokogiri/html_sax_parser_context.h +0 -11
- data/ext/nokogiri/html_sax_push_parser.c +0 -87
- data/ext/nokogiri/html_sax_push_parser.h +0 -9
- data/ext/nokogiri/xml_attr.h +0 -9
- data/ext/nokogiri/xml_attribute_decl.h +0 -9
- data/ext/nokogiri/xml_cdata.h +0 -9
- data/ext/nokogiri/xml_comment.h +0 -9
- data/ext/nokogiri/xml_document.h +0 -23
- data/ext/nokogiri/xml_document_fragment.h +0 -10
- data/ext/nokogiri/xml_dtd.h +0 -10
- data/ext/nokogiri/xml_element_content.h +0 -10
- data/ext/nokogiri/xml_element_decl.h +0 -9
- data/ext/nokogiri/xml_encoding_handler.h +0 -8
- data/ext/nokogiri/xml_entity_decl.h +0 -10
- data/ext/nokogiri/xml_entity_reference.h +0 -9
- data/ext/nokogiri/xml_io.c +0 -61
- data/ext/nokogiri/xml_io.h +0 -11
- data/ext/nokogiri/xml_libxml2_hacks.c +0 -112
- data/ext/nokogiri/xml_libxml2_hacks.h +0 -12
- data/ext/nokogiri/xml_namespace.h +0 -14
- data/ext/nokogiri/xml_node.h +0 -13
- data/ext/nokogiri/xml_node_set.h +0 -12
- data/ext/nokogiri/xml_processing_instruction.h +0 -9
- data/ext/nokogiri/xml_reader.h +0 -10
- data/ext/nokogiri/xml_relax_ng.h +0 -9
- data/ext/nokogiri/xml_sax_parser.h +0 -39
- data/ext/nokogiri/xml_sax_parser_context.h +0 -10
- data/ext/nokogiri/xml_sax_push_parser.h +0 -9
- data/ext/nokogiri/xml_schema.h +0 -9
- data/ext/nokogiri/xml_syntax_error.h +0 -13
- data/ext/nokogiri/xml_text.h +0 -9
- data/ext/nokogiri/xml_xpath_context.h +0 -10
- data/ext/nokogiri/xslt_stylesheet.h +0 -14
- data/lib/nokogiri/html/sax/parser_context.rb +0 -16
- data/patches/libxml2/0001-Revert-Do-not-URI-escape-in-server-side-includes.patch +0 -78
- data/patches/libxml2/0005-Fix-infinite-loop-in-xmlStringLenDecodeEntities.patch +0 -32
- data/ports/archives/libxml2-2.9.10.tar.gz +0 -0
@@ -1,34 +1,29 @@
-#include <
+#include <nokogiri.h>
 
-
-#include <libxslt/xsltutils.h>
-#include <libxslt/transform.h>
-#include <libexslt/exslt.h>
-
-VALUE xslt;
-
-int vasprintf (char **strp, const char *fmt, va_list ap);
-void vasprintf_free (void *p);
+VALUE cNokogiriXsltStylesheet ;
 
-static void
+static void
+mark(nokogiriXsltStylesheetTuple *wrapper)
 {
   rb_gc_mark(wrapper->func_instances);
 }
 
-static void
+static void
+dealloc(nokogiriXsltStylesheetTuple *wrapper)
 {
-
+  xsltStylesheetPtr doc = wrapper->ss;
 
-
-
-
+  NOKOGIRI_DEBUG_START(doc);
+  xsltFreeStylesheet(doc); /* commented out for now. */
+  NOKOGIRI_DEBUG_END(doc);
 
-
+  free(wrapper);
 }
 
-static void
+static void
+xslt_generic_error_handler(void *ctx, const char *msg, ...)
 {
-  char *
+  char *message;
 
   va_list args;
   va_start(args, msg);
@@ -37,10 +32,11 @@ static void xslt_generic_error_handler(void * ctx, const char *msg, ...)
 
   rb_str_cat2((VALUE)ctx, message);
 
-
+  free(message);
 }
 
-VALUE
+VALUE
+Nokogiri_wrap_xslt_stylesheet(xsltStylesheetPtr ss)
 {
   VALUE self;
   nokogiriXsltStylesheetTuple *wrapper;
@@ -61,29 +57,29 @@ VALUE Nokogiri_wrap_xslt_stylesheet(xsltStylesheetPtr ss)
  *
  * Parse a stylesheet from +document+.
  */
-static VALUE
+static VALUE
+parse_stylesheet_doc(VALUE klass, VALUE xmldocobj)
 {
-
-
-
-
-  exsltRegisterAll();
+  xmlDocPtr xml, xml_cpy;
+  VALUE errstr, exception;
+  xsltStylesheetPtr ss ;
+  Data_Get_Struct(xmldocobj, xmlDoc, xml);
 
-
-
+  errstr = rb_str_new(0, 0);
+  xsltSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
 
-
-
+  xml_cpy = xmlCopyDoc(xml, 1); /* 1 => recursive */
+  ss = xsltParseStylesheetDoc(xml_cpy);
 
-
+  xsltSetGenericErrorFunc(NULL, NULL);
 
-
-
-
-
-
+  if (!ss) {
+    xmlFreeDoc(xml_cpy);
+    exception = rb_exc_new3(rb_eRuntimeError, errstr);
+    rb_exc_raise(exception);
+  }
 
-
+  return Nokogiri_wrap_xslt_stylesheet(ss);
 }
 
 
@@ -93,24 +89,21 @@ static VALUE parse_stylesheet_doc(VALUE klass, VALUE xmldocobj)
  *
  * Serialize +document+ to an xml string.
  */
-static VALUE
-
-  xmlDocPtr xml ;
-  nokogiriXsltStylesheetTuple *wrapper;
-  xmlChar* doc_ptr ;
-  int doc_len ;
-  VALUE rval ;
-
-  Data_Get_Struct(xmlobj, xmlDoc, xml);
-  Data_Get_Struct(self, nokogiriXsltStylesheetTuple, wrapper);
-  xsltSaveResultToString(&doc_ptr, &doc_len, xml, wrapper->ss);
-  rval = NOKOGIRI_STR_NEW(doc_ptr, doc_len);
-  xmlFree(doc_ptr);
-  return rval ;
-}
-
-static void swallow_superfluous_xml_errors(void * userdata, xmlErrorPtr error, ...)
+static VALUE
+serialize(VALUE self, VALUE xmlobj)
 {
+  xmlDocPtr xml ;
+  nokogiriXsltStylesheetTuple *wrapper;
+  xmlChar *doc_ptr ;
+  int doc_len ;
+  VALUE rval ;
+
+  Data_Get_Struct(xmlobj, xmlDoc, xml);
+  Data_Get_Struct(self, nokogiriXsltStylesheetTuple, wrapper);
+  xsltSaveResultToString(&doc_ptr, &doc_len, xml, wrapper->ss);
+  rval = NOKOGIRI_STR_NEW(doc_ptr, doc_len);
+  xmlFree(doc_ptr);
+  return rval ;
 }
 
 /*
@@ -128,109 +121,114 @@ static void swallow_superfluous_xml_errors(void * userdata, xmlErrorPtr error, .
  * puts xslt.transform(doc, ['key', 'value'])
  *
  */
-static VALUE
+static VALUE
+transform(int argc, VALUE *argv, VALUE self)
 {
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+  VALUE xmldoc, paramobj, errstr, exception ;
+  xmlDocPtr xml ;
+  xmlDocPtr result ;
+  nokogiriXsltStylesheetTuple *wrapper;
+  const char **params ;
+  long param_len, j ;
+  int parse_error_occurred ;
+
+  rb_scan_args(argc, argv, "11", &xmldoc, &paramobj);
+  if (NIL_P(paramobj)) { paramobj = rb_ary_new2(0L) ; }
+  if (!rb_obj_is_kind_of(xmldoc, cNokogiriXmlDocument)) {
+    rb_raise(rb_eArgError, "argument must be a Nokogiri::XML::Document");
+  }
+
+  /* handle hashes as arguments. */
+  if (T_HASH == TYPE(paramobj)) {
+    paramobj = rb_funcall(paramobj, rb_intern("to_a"), 0);
+    paramobj = rb_funcall(paramobj, rb_intern("flatten"), 0);
+  }
+
+  Check_Type(paramobj, T_ARRAY);
+
+  Data_Get_Struct(xmldoc, xmlDoc, xml);
+  Data_Get_Struct(self, nokogiriXsltStylesheetTuple, wrapper);
+
+  param_len = RARRAY_LEN(paramobj);
+  params = calloc((size_t)param_len + 1, sizeof(char *));
+  for (j = 0 ; j < param_len ; j++) {
+    VALUE entry = rb_ary_entry(paramobj, j);
+    const char *ptr = StringValueCStr(entry);
+    params[j] = ptr;
+  }
+  params[param_len] = 0 ;
+
+  errstr = rb_str_new(0, 0);
+  xsltSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
+  xmlSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
+
+  result = xsltApplyStylesheet(wrapper->ss, xml, params);
+  free(params);
+
+  xsltSetGenericErrorFunc(NULL, NULL);
+  xmlSetGenericErrorFunc(NULL, NULL);
+
+  parse_error_occurred = (Qfalse == rb_funcall(errstr, rb_intern("empty?"), 0));
+
+  if (parse_error_occurred) {
+    exception = rb_exc_new3(rb_eRuntimeError, errstr);
+    rb_exc_raise(exception);
+  }
+
+  return noko_xml_document_wrap((VALUE)0, result) ;
 }
 
-static void
+static void
+method_caller(xmlXPathParserContextPtr ctxt, int nargs)
 {
-
-
-
-
+  VALUE handler;
+  const char *function_name;
+  xsltTransformContextPtr transform;
+  const xmlChar *functionURI;
 
-
-
-
-
+  transform = xsltXPathGetTransformContext(ctxt);
+  functionURI = ctxt->context->functionURI;
+  handler = (VALUE)xsltGetExtData(transform, functionURI);
+  function_name = (const char *)(ctxt->context->function);
 
-
+  Nokogiri_marshal_xpath_funcall_and_return_values(ctxt, nargs, handler, (const char *)function_name);
 }
 
-static void *
+static void *
+initFunc(xsltTransformContextPtr ctxt, const xmlChar *uri)
 {
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+  VALUE modules = rb_iv_get(mNokogiriXslt, "@modules");
+  VALUE obj = rb_hash_aref(modules, rb_str_new2((const char *)uri));
+  VALUE args = { Qfalse };
+  VALUE methods = rb_funcall(obj, rb_intern("instance_methods"), 1, args);
+  VALUE inst;
+  nokogiriXsltStylesheetTuple *wrapper;
+  int i;
+
+  for (i = 0; i < RARRAY_LEN(methods); i++) {
+    VALUE method_name = rb_obj_as_string(rb_ary_entry(methods, i));
+    xsltRegisterExtFunction(ctxt,
+                            (unsigned char *)StringValueCStr(method_name), uri, method_caller);
+  }
+
+  Data_Get_Struct((VALUE)ctxt->style->_private, nokogiriXsltStylesheetTuple,
+                  wrapper);
+  inst = rb_class_new_instance(0, NULL, obj);
+  rb_ary_push(wrapper->func_instances, inst);
+
+  return (void *)inst;
 }
 
-static void
-
+static void
+shutdownFunc(xsltTransformContextPtr ctxt,
+             const xmlChar *uri, void *data)
 {
-
+  nokogiriXsltStylesheetTuple *wrapper;
 
-
-
+  Data_Get_Struct((VALUE)ctxt->style->_private, nokogiriXsltStylesheetTuple,
+                  wrapper);
 
-
+  rb_ary_clear(wrapper->func_instances);
 }
 
 /*
@@ -239,32 +237,28 @@ static void shutdownFunc(xsltTransformContextPtr ctxt,
  *
  * Register a class that implements custom XSLT transformation functions.
  */
-static VALUE
+static VALUE
+registr(VALUE self, VALUE uri, VALUE obj)
 {
-
-
+  VALUE modules = rb_iv_get(self, "@modules");
+  if (NIL_P(modules)) { rb_raise(rb_eRuntimeError, "wtf! @modules isn't set"); }
 
-
-
-
+  rb_hash_aset(modules, uri, obj);
+  xsltRegisterExtModule((unsigned char *)StringValueCStr(uri), initFunc, shutdownFunc);
+  return self;
 }
 
-
-
+void
+noko_init_xslt_stylesheet()
 {
-
-
-
-  nokogiri = rb_define_module("Nokogiri");
-  xslt = rb_define_module_under(nokogiri, "XSLT");
-  klass = rb_define_class_under(xslt, "Stylesheet", rb_cObject);
+  rb_define_singleton_method(mNokogiriXslt, "register", registr, 2);
+  rb_iv_set(mNokogiriXslt, "@modules", rb_hash_new());
 
-
+  cNokogiriXsltStylesheet = rb_define_class_under(mNokogiriXslt, "Stylesheet", rb_cObject);
 
-  cNokogiriXsltStylesheet
+  rb_undef_alloc_func(cNokogiriXsltStylesheet);
 
-  rb_define_singleton_method(
-
-  rb_define_method(
-  rb_define_method(klass, "transform", transform, -1);
+  rb_define_singleton_method(cNokogiriXsltStylesheet, "parse_stylesheet_doc", parse_stylesheet_doc, 1);
+  rb_define_method(cNokogiriXsltStylesheet, "serialize", serialize, 1);
+  rb_define_method(cNokogiriXsltStylesheet, "transform", transform, -1);
 }
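The new `transform` body in the hunks above copies its parameters into a NULL-terminated array of C strings before handing them to `xsltApplyStylesheet`, then frees the array. The following is a minimal standalone sketch of that pattern in plain C, without the Ruby or libxslt APIs. `build_params` and `count_params` are hypothetical helper names used only for illustration; they do not appear in nokogiri.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: copy `len` string pointers into a freshly allocated,
 * NULL-terminated array. The strings are borrowed, not duplicated, so the
 * caller frees only the array. Returns NULL if allocation fails. */
static const char **
build_params(const char *const *entries, size_t len)
{
  const char **params = calloc(len + 1, sizeof(char *));
  size_t j;

  if (params == NULL) { return NULL; }
  for (j = 0; j < len; j++) {
    params[j] = entries[j];
  }
  params[len] = NULL; /* explicit terminator, one slot past the last entry */
  return params;
}

/* Hypothetical helper: count entries the way a consumer of a
 * NULL-terminated array would, by walking until the terminator. */
static size_t
count_params(const char **params)
{
  size_t n = 0;

  assert(params != NULL);
  while (params[n] != NULL) { n++; }
  return n;
}
```

Because the array only borrows the strings, a single `free(params)` after the consuming call is enough, which matches the `free(params)` placed right after `xsltApplyStylesheet` in the diff.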
@@ -0,0 +1,63 @@
+## Gumbo 0.10.1 (2015-04-30)
+
+Same as 0.10.0, but with the version number bumped because the last version-number commit to v0.9.4 makes GitHub think that v0.9.4 is the latest version and so it's not highlighted on the webpage.
+
+## Gumbo 0.10.0 (2015-04-30)
+
+* Full support for `<template>` tag (kevinhendricks, nostrademons).
+* Some fixes for `<rtc>`/`<rt>` handling (kevinhendricks, vmg).
+* All html5lib-trunk tests pass now! (kevinhendricks, vmg, nostrademons)
+* Support for fragment parsing (vmg)
+* A couple additional example programs (kevinhendricks)
+* Performance improvements totaling an estimated 30-40% total improvement (vmg, nostrademons).
+
+## Gumbo 0.9.4 (2015-04-30)
+
+* Additional Visual Studio fixes (lowjoel, nostrademons)
+* Fixed some unused variable warnings.
+* Fix for glibtoolize vs. libtoolize build errors on Mac.
+* Fixed `CDATA` end tag handling.
+
+## Gumbo 0.9.3 (2015-02-17)
+
+* Bugfix for `&AElig;` entities (rgrove)
+* Fix `CDATA` handling; `CDATA` sections now generate a `GUMBO_NODE_CDATA` node rather
+  than plain text.
+* Fix `get_title example` to handle whitespace nodes (gsnedders)
+* Visual Studio compilation fixes (fishioon)
+* Take the namespace into account when determining whether a node matches a
+  certain tag (aroben)
+* Replace the varargs tag functions with a tagset bytevector, for a 20-30%
+  speedup in overall parse time (kevinhendricks, vmg)
+* Add MacOS X support to Travis CI, and fix the deployment/DLL issues this
+  uncovered (nostrademons, kevinhendricks, vmg)
+
+## Gumbo 0.9.2 (2014-09-21)
+
+* Performance improvements: Ragel-based char ref decoder and DFA-based UTF8
+  decoder, totaling speedups of up to 300%.
+* Added benchmarking program and some sample data.
+* Fixed a compiler error under Visual Studio.
+* Fix an error in the ctypes bindings that could lead to memory corruption in
+  the Python bindings.
+* Fix duplicate attributes when parsing `<isindex>` tags.
+* Don't leave semicolons behind when consuming entity references (rgrove)
+* Internally rename some functions in preparation for an amalgamation file
+  (jdeng)
+* Add proper cflags for gyp builds (skabbes)
+
+## Gumbo 0.9.1 (2014-08-07)
+
+* First version listed on PyPi.
+* Autotools files excluded from GitHub and generated via autogen.sh. (endgame)
+* Numerous compiler warnings fixed. (bnoordhuis, craigbarnes)
+* Google security audit passed.
+* Gyp support (tfarina)
+* Naming convention for structs changed to avoid C reserved words.
+* Fix several integer and buffer overflows (Maxime2)
+* Some Visual Studio compiler support (bugparty)
+* Python3 compatibility for the ctypes bindings.
+
+## Gumbo 0.9.0 (2013-08-13)
+
+* Initial release open-sourced by Google.
@@ -0,0 +1,101 @@
+.PHONY: all clean check coverage
+
+gumbo_objs := $(patsubst %.c,build/%.o,$(wildcard src/*.c))
+test_objs := $(patsubst %.cc,build/%.o,$(wildcard test/*.cc))
+gtest_lib := googletest/make/gtest_main.a
+
+# make SANITIZEFLAGS='-fsanitize=undefined -fsanitize=address'
+SANITIZEFLAGS :=
+CPPFLAGS := -Isrc
+CFLAGS := -std=c99 -Os -Wall
+CXXFLAGS := -isystem googletest/include -std=c++11 -Os -Wall
+LDFLAGS := -pthread
+
+all: check
+
+src/%.c: src/%.rl
+	ragel -F1 -o $@ $<
+
+build/src:
+	mkdir -p $@
+
+build/test:
+	mkdir -p $@
+
+build/src/%.o: src/%.c build/src/flags | build/src
+	$(CC) -MMD $(CPPFLAGS) $(CFLAGS) $(SANITIZEFLAGS) -c -o $@ $<
+
+build/test/%.o: test/%.cc build/test/flags | build/test
+	$(CXX) -MMD $(CPPFLAGS) $(CXXFLAGS) $(SANITIZEFLAGS) -c -o $@ $<
+
+build/run_tests: $(gumbo_objs) $(test_objs) $(gtest_lib)
+	$(CXX) -o $@ $+ $(LDFLAGS) $(SANITIZEFLAGS)
+
+check: build/run_tests
+	./build/run_tests
+
+coverage:
+	$(RM) build/{src,test}/*.gcda
+	$(RM) build/*.info
+	$(MAKE) CPPFLAGS='-Isrc -DNDEBUG=1' \
+		CFLAGS='-std=c99 --coverage -g -O0' \
+		CXXFLAGS='-isystem googletest/include -std=c++11 --coverage -g -O0' \
+		LDFLAGS='--coverage' \
+		build/run_tests
+	lcov --no-external \
+		--initial \
+		--capture \
+		--base-directory . \
+		--directory build \
+		--output-file build/coverage-pre.info
+	awk -F '[:,]' \
+		'/^SF:/ { delete defs } /^FN:/ { defs[$$2]=1 } /^DA:/ { if ($$3 == 0 && $$2 in defs) next } { print }' \
+		build/coverage-pre.info > build/coverage-initial.info
+	./build/run_tests
+	lcov --no-external \
+		--capture \
+		--base-directory . \
+		--directory build \
+		--rc lcov_branch_coverage=1 \
+		--output-file build/coverage-test.info
+	lcov --add-tracefile build/coverage-initial.info \
+		--add-tracefile build/coverage-test.info \
+		--rc lcov_branch_coverage=1 \
+		--output-file build/coverage.info
+	lcov --remove build/coverage.info '$(CURDIR)/googletest/*' \
+		--rc lcov_branch_coverage=1 \
+		--output-file build/coverage.info
+	genhtml --branch-coverage \
+		--output-directory build/coverage \
+		build/coverage.info
+
+clean:
+	$(RM) -r build
+
+build/src/flags: | build/src
+	@echo 'old_CC := $(CC)' > $@
+	@echo 'old_CPPFLAGS := $(CPPFLAGS)' >> $@
+	@echo 'old_CFLAGS := $(CFLAGS)' >>$@
+	@echo 'old_SANITIZEFLAGS := $(SANITIZEFLAGS)' >> $@
+	@echo 'old_LDFLAGS := $(LDFLAGS)' >> $@
+
+build/test/flags: | build/test
+	@echo 'old_CXX := $(CXX)' > $@
+	@echo 'old_CPPFLAGS := $(CPPFLAGS)' >> $@
+	@echo 'old_CXXFLAGS := $(CXXFLAGS)' >> $@
+	@echo 'old_SANITIZEFLAGS := $(SANITIZEFLAGS)' >> $@
+	@echo 'old_LDFLAGS := $(LDFLAGS)' >> $@
+
+ifeq (,$(filter clean coverage,$(MAKECMDGOALS)))
+# Ensure that the flags are up to date.
+-include build/src/flags build/test/flags
+ifneq ($(old_CC) | $(old_CPPFLAGS) | $(old_CFLAGS) | $(old_SANITIZEFLAGS) | $(old_LDFLAGS),$(CC) | $(CPPFLAGS) | $(CFLAGS) | $(SANITIZEFLAGS) | $(LDFLAGS))
+.PHONY: build/src/flags
+endif
+ifneq ($(old_CXX) | $(old_CPPFLAGS) | $(old_CXXFLAGS) | $(old_SANITIZEFLAGS) | $(old_LDFLAGS),$(CXX) | $(CPPFLAGS) | $(CXXFLAGS) | $(SANITIZEFLAGS) | $(LDFLAGS))
+.PHONY: build/test/flags
+endif
+
+# Include dependencies.
+-include $(test_objs:.o=.d) $(gumbo_objs:.o=.d)
+endif
data/gumbo-parser/THANKS
ADDED
@@ -0,0 +1,27 @@
+Gumbo HTML parser THANKS file
+
+Gumbo was originally written by Jonathan Tang, but many people helped out through suggestions, question-answering, code reviews, bugfixes, and organizational support. Here is a list of these people. Help me keep it complete and exempt of errors.
+
+Adam Barth
+Adam Roben
+Ben Noordhuis
+Bowen Han
+Constantinos Michael
+Craig Barnes
+Geoffrey Sneddon
+Ian Hickson
+Jack Deng
+Joel Low
+Jonathan Shneier
+Kevin Hendricks
+Mason Tang
+Maxim Zakharov
+Michal Zalewski
+Neal Norwitz
+Othar Hansson
+Ryan Grove
+Stefan Haustein
+Steffen Meschkat
+Steven Kabbes
+Thiago Farina
+Vicent Marti
@@ -0,0 +1,34 @@
+# this Makefile is used by ext/nokogiri/extconf.rb
+# to enable a mini_portile2 recipe to build the gumbo parser
+.PHONY: clean
+
+CFLAGS += -std=c99 -Wall
+
+# allow the ENV var to override this
+RANLIB ?= ranlib
+
+gumbo_objs := \
+	ascii.o \
+	attribute.o \
+	char_ref.o \
+	error.o \
+	foreign_attrs.o \
+	parser.o \
+	string_buffer.o \
+	string_piece.o \
+	svg_attrs.o \
+	svg_tags.o \
+	tag.o \
+	tag_lookup.o \
+	token_buffer.o \
+	tokenizer.o \
+	utf8.o \
+	util.o \
+	vector.o
+
+libgumbo.a: $(gumbo_objs)
+	$(AR) $(ARFLAGS) $@ $(gumbo_objs)
+	- ($(RANLIB) $@ || true) >/dev/null 2>&1
+
+clean:
+	rm -f $(gumbo_objs) libgumbo.a
@@ -0,0 +1,41 @@
+libgumbo
+========
+
+This is an internal fork of the [libgumbo] library, which was copied and
+later modified under the terms of the Apache 2.0 [license]. See `lua-gumbo`
+commit [`0a04728`] for details of the original import.
+
+Since importing the code, the following notable fixes and improvements
+have been made:
+
+* `91cef89`: Re-implement `adjust_foreign_attributes()` with a gperf hash
+* `b11abe7`: Pass `TagSet` arrays into functions by reference instead of value
+* `b73dc03`: Simplify `maybe_replace_codepoint()` function
+* `d5d0bb3`: Remove special handling of `<menuitem>` tag
+* `7bd5162`: Remove special handling of `<isindex>` tag
+* `a5c1b0e`: Use `realloc(3)` instead of `malloc(3)` in `enlarge_vector_if_full()`
+* `dcbebd7`: Use `realloc(3)` instead of `malloc(3)` in `maybe_resize_string_buffer()`
+* `df15262`: Make `destroy_node()` function non-recursive
+* `2df37f5`: Fix signedness of some format specifiers
+* `176553e`: Add maximum element nesting limit
+* `bed0f4a`: Annotate `gumbo_debug()` with `PRINTF` macro and fix warnings
+* `7ffc218`: Annotate `print_message()` with `PRINTF` macro and fix warnings
+* `1bd8ab5`, `9136507`, `53a1f9a`: Deduplicate some identical `TagSet` arrays
+* `a7a9065`: Add some GCC/Clang function attributes
+* `8d3d4e4`: Remove custom allocator support
+* `8d3b006`: Fix recording of source positions for `</form>` end tags
+* `1a8d763`: Replace linear search in `maybe_replace_codepoint()` with a lookup table
+* `6dca79e`: Replace `strcasecmp()` and `strncasecmp()` with ascii-only equivalents
+* `17ab1d2`: Fix `TAGSET_INCLUDES` macro to work properly with multiple bit flags
+* `7e56d45`: Re-implement `gumbo_normalize_svg_tagname()` with a gperf hash
+* `a518d35`: Replace linear array search in `adjust_svg_attributes()` with a gperf hash
+* `a4a7433`: Fix duplicate `TagSet` initializer being ignored in `is_special_node()`
+* `8137fcd`: Add support for `<dialog>` tag
+* `4b35471`: Add missing `static` qualifiers to hide symbols that shouldn't be extern
+* `df57c59`, `03101f3`, `ea62330`: Replace use of locale-dependant `ctype.h` functions
+  with custom, ASCII-only equivalents
+
+
+[libgumbo]: https://github.com/google/gumbo-parser/tree/aa91b27b02c0c80c482e24348a457ed7c3c088e0/src
+[license]: https://github.com/google/gumbo-parser/blob/aa91b27b02c0c80c482e24348a457ed7c3c088e0/COPYING
+[`0a04728`]: https://gitlab.com/craigbarnes/lua-gumbo/commit/0a047282815af86f3367a7d95fefcfe5723ece48
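Two entries in the README list above (`6dca79e` and the `ctype.h` replacements) swap locale-dependent string functions for ASCII-only ones. The following is a minimal sketch of what such replacements can look like, not the code from those commits. `sketch_ascii_tolower` and `sketch_ascii_strcasecmp` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: lower-case a single byte, touching ASCII letters only.
 * Unlike tolower(3), the result never depends on the current locale. */
static int
sketch_ascii_tolower(int c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical helper: case-insensitive comparison that folds only A-Z.
 * Returns <0, 0, or >0 in the same manner as strcmp(3). */
static int
sketch_ascii_strcasecmp(const char *s1, const char *s2)
{
  assert(s1 != NULL && s2 != NULL);
  for (;;) {
    int c1 = sketch_ascii_tolower((unsigned char)*s1++);
    int c2 = sketch_ascii_tolower((unsigned char)*s2++);

    if (c1 != c2) { return c1 - c2; }
    if (c1 == 0) { return 0; } /* both strings ended together */
  }
}
```

Restricting the fold to `A`-`Z` keeps tag and attribute matching stable regardless of the process locale, since bytes outside that range pass through unchanged.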