nokogiri 1.10.3 → 1.12.5

This version of nokogiri might be problematic.

Files changed (218)
  1. checksums.yaml +4 -4
  2. data/Gemfile +3 -0
  3. data/LICENSE-DEPENDENCIES.md +1173 -884
  4. data/LICENSE.md +1 -1
  5. data/README.md +176 -96
  6. data/dependencies.yml +28 -26
  7. data/ext/nokogiri/depend +38 -358
  8. data/ext/nokogiri/extconf.rb +716 -414
  9. data/ext/nokogiri/gumbo.c +584 -0
  10. data/ext/nokogiri/html4_document.c +166 -0
  11. data/ext/nokogiri/html4_element_description.c +294 -0
  12. data/ext/nokogiri/html4_entity_lookup.c +37 -0
  13. data/ext/nokogiri/html4_sax_parser_context.c +120 -0
  14. data/ext/nokogiri/html4_sax_push_parser.c +95 -0
  15. data/ext/nokogiri/libxml2_backwards_compat.c +121 -0
  16. data/ext/nokogiri/nokogiri.c +228 -91
  17. data/ext/nokogiri/nokogiri.h +191 -89
  18. data/ext/nokogiri/test_global_handlers.c +40 -0
  19. data/ext/nokogiri/xml_attr.c +15 -15
  20. data/ext/nokogiri/xml_attribute_decl.c +18 -18
  21. data/ext/nokogiri/xml_cdata.c +13 -18
  22. data/ext/nokogiri/xml_comment.c +19 -26
  23. data/ext/nokogiri/xml_document.c +267 -195
  24. data/ext/nokogiri/xml_document_fragment.c +13 -15
  25. data/ext/nokogiri/xml_dtd.c +54 -48
  26. data/ext/nokogiri/xml_element_content.c +31 -26
  27. data/ext/nokogiri/xml_element_decl.c +22 -22
  28. data/ext/nokogiri/xml_encoding_handler.c +28 -17
  29. data/ext/nokogiri/xml_entity_decl.c +32 -30
  30. data/ext/nokogiri/xml_entity_reference.c +16 -18
  31. data/ext/nokogiri/xml_namespace.c +60 -51
  32. data/ext/nokogiri/xml_node.c +493 -407
  33. data/ext/nokogiri/xml_node_set.c +174 -162
  34. data/ext/nokogiri/xml_processing_instruction.c +17 -19
  35. data/ext/nokogiri/xml_reader.c +197 -172
  36. data/ext/nokogiri/xml_relax_ng.c +52 -28
  37. data/ext/nokogiri/xml_sax_parser.c +112 -112
  38. data/ext/nokogiri/xml_sax_parser_context.c +105 -86
  39. data/ext/nokogiri/xml_sax_push_parser.c +36 -27
  40. data/ext/nokogiri/xml_schema.c +112 -33
  41. data/ext/nokogiri/xml_syntax_error.c +42 -21
  42. data/ext/nokogiri/xml_text.c +13 -17
  43. data/ext/nokogiri/xml_xpath_context.c +158 -73
  44. data/ext/nokogiri/xslt_stylesheet.c +158 -164
  45. data/gumbo-parser/CHANGES.md +63 -0
  46. data/gumbo-parser/Makefile +101 -0
  47. data/gumbo-parser/THANKS +27 -0
  48. data/gumbo-parser/src/Makefile +34 -0
  49. data/gumbo-parser/src/README.md +41 -0
  50. data/gumbo-parser/src/ascii.c +75 -0
  51. data/gumbo-parser/src/ascii.h +115 -0
  52. data/gumbo-parser/src/attribute.c +42 -0
  53. data/gumbo-parser/src/attribute.h +17 -0
  54. data/gumbo-parser/src/char_ref.c +22225 -0
  55. data/gumbo-parser/src/char_ref.h +29 -0
  56. data/gumbo-parser/src/char_ref.rl +2154 -0
  57. data/gumbo-parser/src/error.c +626 -0
  58. data/gumbo-parser/src/error.h +148 -0
  59. data/gumbo-parser/src/foreign_attrs.c +104 -0
  60. data/gumbo-parser/src/foreign_attrs.gperf +27 -0
  61. data/gumbo-parser/src/gumbo.h +943 -0
  62. data/gumbo-parser/src/insertion_mode.h +33 -0
  63. data/gumbo-parser/src/macros.h +91 -0
  64. data/gumbo-parser/src/parser.c +4886 -0
  65. data/gumbo-parser/src/parser.h +41 -0
  66. data/gumbo-parser/src/replacement.h +33 -0
  67. data/gumbo-parser/src/string_buffer.c +103 -0
  68. data/gumbo-parser/src/string_buffer.h +68 -0
  69. data/gumbo-parser/src/string_piece.c +48 -0
  70. data/gumbo-parser/src/svg_attrs.c +174 -0
  71. data/gumbo-parser/src/svg_attrs.gperf +77 -0
  72. data/gumbo-parser/src/svg_tags.c +137 -0
  73. data/gumbo-parser/src/svg_tags.gperf +55 -0
  74. data/gumbo-parser/src/tag.c +222 -0
  75. data/gumbo-parser/src/tag_lookup.c +382 -0
  76. data/gumbo-parser/src/tag_lookup.gperf +169 -0
  77. data/gumbo-parser/src/tag_lookup.h +13 -0
  78. data/gumbo-parser/src/token_buffer.c +79 -0
  79. data/gumbo-parser/src/token_buffer.h +71 -0
  80. data/gumbo-parser/src/token_type.h +17 -0
  81. data/gumbo-parser/src/tokenizer.c +3463 -0
  82. data/gumbo-parser/src/tokenizer.h +112 -0
  83. data/gumbo-parser/src/tokenizer_states.h +339 -0
  84. data/gumbo-parser/src/utf8.c +245 -0
  85. data/gumbo-parser/src/utf8.h +164 -0
  86. data/gumbo-parser/src/util.c +68 -0
  87. data/gumbo-parser/src/util.h +30 -0
  88. data/gumbo-parser/src/vector.c +111 -0
  89. data/gumbo-parser/src/vector.h +45 -0
  90. data/lib/nokogiri/css/node.rb +1 -0
  91. data/lib/nokogiri/css/parser.rb +64 -63
  92. data/lib/nokogiri/css/parser.y +3 -3
  93. data/lib/nokogiri/css/parser_extras.rb +39 -36
  94. data/lib/nokogiri/css/syntax_error.rb +2 -1
  95. data/lib/nokogiri/css/tokenizer.rb +105 -103
  96. data/lib/nokogiri/css/xpath_visitor.rb +73 -43
  97. data/lib/nokogiri/css.rb +15 -14
  98. data/lib/nokogiri/decorators/slop.rb +1 -0
  99. data/lib/nokogiri/extension.rb +31 -0
  100. data/lib/nokogiri/gumbo.rb +14 -0
  101. data/lib/nokogiri/html.rb +32 -27
  102. data/lib/nokogiri/{html → html4}/builder.rb +3 -2
  103. data/lib/nokogiri/{html → html4}/document.rb +17 -30
  104. data/lib/nokogiri/{html → html4}/document_fragment.rb +18 -17
  105. data/lib/nokogiri/{html → html4}/element_description.rb +2 -1
  106. data/lib/nokogiri/{html → html4}/element_description_defaults.rb +2 -1
  107. data/lib/nokogiri/{html → html4}/entity_lookup.rb +2 -1
  108. data/lib/nokogiri/{html → html4}/sax/parser.rb +12 -14
  109. data/lib/nokogiri/html4/sax/parser_context.rb +19 -0
  110. data/lib/nokogiri/{html → html4}/sax/push_parser.rb +6 -5
  111. data/lib/nokogiri/html4.rb +40 -0
  112. data/lib/nokogiri/html5/document.rb +74 -0
  113. data/lib/nokogiri/html5/document_fragment.rb +80 -0
  114. data/lib/nokogiri/html5/node.rb +93 -0
  115. data/lib/nokogiri/html5.rb +473 -0
  116. data/lib/nokogiri/jruby/dependencies.rb +20 -0
  117. data/lib/nokogiri/syntax_error.rb +1 -0
  118. data/lib/nokogiri/version/constant.rb +5 -0
  119. data/lib/nokogiri/version/info.rb +215 -0
  120. data/lib/nokogiri/version.rb +3 -109
  121. data/lib/nokogiri/xml/attr.rb +1 -0
  122. data/lib/nokogiri/xml/attribute_decl.rb +1 -0
  123. data/lib/nokogiri/xml/builder.rb +74 -32
  124. data/lib/nokogiri/xml/cdata.rb +1 -0
  125. data/lib/nokogiri/xml/character_data.rb +1 -0
  126. data/lib/nokogiri/xml/document.rb +138 -41
  127. data/lib/nokogiri/xml/document_fragment.rb +5 -6
  128. data/lib/nokogiri/xml/dtd.rb +1 -0
  129. data/lib/nokogiri/xml/element_content.rb +1 -0
  130. data/lib/nokogiri/xml/element_decl.rb +1 -0
  131. data/lib/nokogiri/xml/entity_decl.rb +1 -0
  132. data/lib/nokogiri/xml/entity_reference.rb +1 -0
  133. data/lib/nokogiri/xml/namespace.rb +1 -0
  134. data/lib/nokogiri/xml/node/save_options.rb +2 -1
  135. data/lib/nokogiri/xml/node.rb +629 -293
  136. data/lib/nokogiri/xml/node_set.rb +1 -0
  137. data/lib/nokogiri/xml/notation.rb +1 -0
  138. data/lib/nokogiri/xml/parse_options.rb +12 -3
  139. data/lib/nokogiri/xml/pp/character_data.rb +1 -0
  140. data/lib/nokogiri/xml/pp/node.rb +1 -0
  141. data/lib/nokogiri/xml/pp.rb +3 -2
  142. data/lib/nokogiri/xml/processing_instruction.rb +1 -0
  143. data/lib/nokogiri/xml/reader.rb +9 -12
  144. data/lib/nokogiri/xml/relax_ng.rb +7 -2
  145. data/lib/nokogiri/xml/sax/document.rb +25 -30
  146. data/lib/nokogiri/xml/sax/parser.rb +1 -0
  147. data/lib/nokogiri/xml/sax/parser_context.rb +1 -0
  148. data/lib/nokogiri/xml/sax/push_parser.rb +1 -0
  149. data/lib/nokogiri/xml/sax.rb +5 -4
  150. data/lib/nokogiri/xml/schema.rb +13 -4
  151. data/lib/nokogiri/xml/searchable.rb +25 -16
  152. data/lib/nokogiri/xml/syntax_error.rb +1 -0
  153. data/lib/nokogiri/xml/text.rb +1 -0
  154. data/lib/nokogiri/xml/xpath/syntax_error.rb +2 -1
  155. data/lib/nokogiri/xml/xpath.rb +4 -5
  156. data/lib/nokogiri/xml/xpath_context.rb +1 -0
  157. data/lib/nokogiri/xml.rb +36 -36
  158. data/lib/nokogiri/xslt/stylesheet.rb +2 -1
  159. data/lib/nokogiri/xslt.rb +17 -16
  160. data/lib/nokogiri.rb +32 -51
  161. data/lib/xsd/xmlparser/nokogiri.rb +1 -0
  162. data/patches/libxml2/{0002-Remove-script-macro-support.patch → 0001-Remove-script-macro-support.patch} +0 -0
  163. data/patches/libxml2/{0003-Update-entities-to-remove-handling-of-ssi.patch → 0002-Update-entities-to-remove-handling-of-ssi.patch} +0 -0
  164. data/patches/libxml2/0003-libxml2.la-is-in-top_builddir.patch +25 -0
  165. data/patches/libxml2/0004-use-glibc-strlen.patch +53 -0
  166. data/patches/libxml2/0005-avoid-isnan-isinf.patch +81 -0
  167. data/patches/libxml2/0006-update-automake-files-for-arm64.patch +2511 -0
  168. data/patches/libxml2/0007-Fix-XPath-recursion-limit.patch +31 -0
  169. data/patches/libxslt/0001-update-automake-files-for-arm64.patch +2511 -0
  170. data/patches/libxslt/0002-Fix-xml2-config-check-in-configure-script.patch +19 -0
  171. data/ports/archives/libxml2-2.9.12.tar.gz +0 -0
  172. data/ports/archives/libxslt-1.1.34.tar.gz +0 -0
  173. metadata +151 -153
  174. data/ext/nokogiri/html_document.c +0 -170
  175. data/ext/nokogiri/html_document.h +0 -10
  176. data/ext/nokogiri/html_element_description.c +0 -279
  177. data/ext/nokogiri/html_element_description.h +0 -10
  178. data/ext/nokogiri/html_entity_lookup.c +0 -32
  179. data/ext/nokogiri/html_entity_lookup.h +0 -8
  180. data/ext/nokogiri/html_sax_parser_context.c +0 -116
  181. data/ext/nokogiri/html_sax_parser_context.h +0 -11
  182. data/ext/nokogiri/html_sax_push_parser.c +0 -87
  183. data/ext/nokogiri/html_sax_push_parser.h +0 -9
  184. data/ext/nokogiri/xml_attr.h +0 -9
  185. data/ext/nokogiri/xml_attribute_decl.h +0 -9
  186. data/ext/nokogiri/xml_cdata.h +0 -9
  187. data/ext/nokogiri/xml_comment.h +0 -9
  188. data/ext/nokogiri/xml_document.h +0 -23
  189. data/ext/nokogiri/xml_document_fragment.h +0 -10
  190. data/ext/nokogiri/xml_dtd.h +0 -10
  191. data/ext/nokogiri/xml_element_content.h +0 -10
  192. data/ext/nokogiri/xml_element_decl.h +0 -9
  193. data/ext/nokogiri/xml_encoding_handler.h +0 -8
  194. data/ext/nokogiri/xml_entity_decl.h +0 -10
  195. data/ext/nokogiri/xml_entity_reference.h +0 -9
  196. data/ext/nokogiri/xml_io.c +0 -61
  197. data/ext/nokogiri/xml_io.h +0 -11
  198. data/ext/nokogiri/xml_libxml2_hacks.c +0 -112
  199. data/ext/nokogiri/xml_libxml2_hacks.h +0 -12
  200. data/ext/nokogiri/xml_namespace.h +0 -14
  201. data/ext/nokogiri/xml_node.h +0 -13
  202. data/ext/nokogiri/xml_node_set.h +0 -12
  203. data/ext/nokogiri/xml_processing_instruction.h +0 -9
  204. data/ext/nokogiri/xml_reader.h +0 -10
  205. data/ext/nokogiri/xml_relax_ng.h +0 -9
  206. data/ext/nokogiri/xml_sax_parser.h +0 -39
  207. data/ext/nokogiri/xml_sax_parser_context.h +0 -10
  208. data/ext/nokogiri/xml_sax_push_parser.h +0 -9
  209. data/ext/nokogiri/xml_schema.h +0 -9
  210. data/ext/nokogiri/xml_syntax_error.h +0 -13
  211. data/ext/nokogiri/xml_text.h +0 -9
  212. data/ext/nokogiri/xml_xpath_context.h +0 -10
  213. data/ext/nokogiri/xslt_stylesheet.h +0 -14
  214. data/lib/nokogiri/html/sax/parser_context.rb +0 -16
  215. data/patches/libxml2/0001-Revert-Do-not-URI-escape-in-server-side-includes.patch +0 -78
  216. data/patches/libxslt/0001-Fix-security-framework-bypass.patch +0 -120
  217. data/ports/archives/libxml2-2.9.9.tar.gz +0 -0
  218. data/ports/archives/libxslt-1.1.33.tar.gz +0 -0
@@ -1,34 +1,29 @@
- #include <xslt_stylesheet.h>
+ #include <nokogiri.h>
 
- #include <libxslt/xsltInternals.h>
- #include <libxslt/xsltutils.h>
- #include <libxslt/transform.h>
- #include <libexslt/exslt.h>
-
- VALUE xslt;
-
- int vasprintf (char **strp, const char *fmt, va_list ap);
- void vasprintf_free (void *p);
+ VALUE cNokogiriXsltStylesheet ;
 
- static void mark(nokogiriXsltStylesheetTuple *wrapper)
+ static void
+ mark(nokogiriXsltStylesheetTuple *wrapper)
 {
 rb_gc_mark(wrapper->func_instances);
 }
 
- static void dealloc(nokogiriXsltStylesheetTuple *wrapper)
+ static void
+ dealloc(nokogiriXsltStylesheetTuple *wrapper)
 {
- xsltStylesheetPtr doc = wrapper->ss;
+ xsltStylesheetPtr doc = wrapper->ss;
 
- NOKOGIRI_DEBUG_START(doc);
- xsltFreeStylesheet(doc); /* commented out for now. */
- NOKOGIRI_DEBUG_END(doc);
+ NOKOGIRI_DEBUG_START(doc);
+ xsltFreeStylesheet(doc); /* commented out for now. */
+ NOKOGIRI_DEBUG_END(doc);
 
- free(wrapper);
+ free(wrapper);
 }
 
- static void xslt_generic_error_handler(void * ctx, const char *msg, ...)
+ static void
+ xslt_generic_error_handler(void *ctx, const char *msg, ...)
 {
- char * message;
+ char *message;
 
 va_list args;
 va_start(args, msg);
@@ -37,10 +32,11 @@ static void xslt_generic_error_handler(void * ctx, const char *msg, ...)
 
 rb_str_cat2((VALUE)ctx, message);
 
- vasprintf_free(message);
+ free(message);
 }
 
- VALUE Nokogiri_wrap_xslt_stylesheet(xsltStylesheetPtr ss)
+ VALUE
+ Nokogiri_wrap_xslt_stylesheet(xsltStylesheetPtr ss)
 {
 VALUE self;
 nokogiriXsltStylesheetTuple *wrapper;
@@ -61,29 +57,29 @@ VALUE Nokogiri_wrap_xslt_stylesheet(xsltStylesheetPtr ss)
 *
 * Parse a stylesheet from +document+.
 */
- static VALUE parse_stylesheet_doc(VALUE klass, VALUE xmldocobj)
+ static VALUE
+ parse_stylesheet_doc(VALUE klass, VALUE xmldocobj)
 {
- xmlDocPtr xml, xml_cpy;
- VALUE errstr, exception;
- xsltStylesheetPtr ss ;
- Data_Get_Struct(xmldocobj, xmlDoc, xml);
- exsltRegisterAll();
+ xmlDocPtr xml, xml_cpy;
+ VALUE errstr, exception;
+ xsltStylesheetPtr ss ;
+ Data_Get_Struct(xmldocobj, xmlDoc, xml);
 
- errstr = rb_str_new(0, 0);
- xsltSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
+ errstr = rb_str_new(0, 0);
+ xsltSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
 
- xml_cpy = xmlCopyDoc(xml, 1); /* 1 => recursive */
- ss = xsltParseStylesheetDoc(xml_cpy);
+ xml_cpy = xmlCopyDoc(xml, 1); /* 1 => recursive */
+ ss = xsltParseStylesheetDoc(xml_cpy);
 
- xsltSetGenericErrorFunc(NULL, NULL);
+ xsltSetGenericErrorFunc(NULL, NULL);
 
- if (!ss) {
- xmlFreeDoc(xml_cpy);
- exception = rb_exc_new3(rb_eRuntimeError, errstr);
- rb_exc_raise(exception);
- }
+ if (!ss) {
+ xmlFreeDoc(xml_cpy);
+ exception = rb_exc_new3(rb_eRuntimeError, errstr);
+ rb_exc_raise(exception);
+ }
 
- return Nokogiri_wrap_xslt_stylesheet(ss);
+ return Nokogiri_wrap_xslt_stylesheet(ss);
 }
 
 
@@ -93,24 +89,21 @@ static VALUE parse_stylesheet_doc(VALUE klass, VALUE xmldocobj)
 *
 * Serialize +document+ to an xml string.
 */
- static VALUE serialize(VALUE self, VALUE xmlobj)
- {
- xmlDocPtr xml ;
- nokogiriXsltStylesheetTuple *wrapper;
- xmlChar* doc_ptr ;
- int doc_len ;
- VALUE rval ;
-
- Data_Get_Struct(xmlobj, xmlDoc, xml);
- Data_Get_Struct(self, nokogiriXsltStylesheetTuple, wrapper);
- xsltSaveResultToString(&doc_ptr, &doc_len, xml, wrapper->ss);
- rval = NOKOGIRI_STR_NEW(doc_ptr, doc_len);
- xmlFree(doc_ptr);
- return rval ;
- }
-
- static void swallow_superfluous_xml_errors(void * userdata, xmlErrorPtr error, ...)
+ static VALUE
+ serialize(VALUE self, VALUE xmlobj)
 {
+ xmlDocPtr xml ;
+ nokogiriXsltStylesheetTuple *wrapper;
+ xmlChar *doc_ptr ;
+ int doc_len ;
+ VALUE rval ;
+
+ Data_Get_Struct(xmlobj, xmlDoc, xml);
+ Data_Get_Struct(self, nokogiriXsltStylesheetTuple, wrapper);
+ xsltSaveResultToString(&doc_ptr, &doc_len, xml, wrapper->ss);
+ rval = NOKOGIRI_STR_NEW(doc_ptr, doc_len);
+ xmlFree(doc_ptr);
+ return rval ;
 }
 
 /*
@@ -128,109 +121,114 @@ static void swallow_superfluous_xml_errors(void * userdata, xmlErrorPtr error, .
 * puts xslt.transform(doc, ['key', 'value'])
 *
 */
- static VALUE transform(int argc, VALUE* argv, VALUE self)
+ static VALUE
+ transform(int argc, VALUE *argv, VALUE self)
 {
- VALUE xmldoc, paramobj, errstr, exception ;
- xmlDocPtr xml ;
- xmlDocPtr result ;
- nokogiriXsltStylesheetTuple *wrapper;
- const char** params ;
- long param_len, j ;
- int parse_error_occurred ;
-
- rb_scan_args(argc, argv, "11", &xmldoc, &paramobj);
- if (NIL_P(paramobj)) { paramobj = rb_ary_new2(0L) ; }
- if (!rb_obj_is_kind_of(xmldoc, cNokogiriXmlDocument))
- rb_raise(rb_eArgError, "argument must be a Nokogiri::XML::Document");
-
- /* handle hashes as arguments. */
- if(T_HASH == TYPE(paramobj)) {
- paramobj = rb_funcall(paramobj, rb_intern("to_a"), 0);
- paramobj = rb_funcall(paramobj, rb_intern("flatten"), 0);
- }
-
- Check_Type(paramobj, T_ARRAY);
-
- Data_Get_Struct(xmldoc, xmlDoc, xml);
- Data_Get_Struct(self, nokogiriXsltStylesheetTuple, wrapper);
-
- param_len = RARRAY_LEN(paramobj);
- params = calloc((size_t)param_len+1, sizeof(char*));
- for (j = 0 ; j < param_len ; j++) {
- VALUE entry = rb_ary_entry(paramobj, j);
- const char * ptr = StringValueCStr(entry);
- params[j] = ptr;
- }
- params[param_len] = 0 ;
-
- errstr = rb_str_new(0, 0);
- xsltSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
- xmlSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
-
- result = xsltApplyStylesheet(wrapper->ss, xml, params);
- free(params);
-
- xsltSetGenericErrorFunc(NULL, NULL);
- xmlSetGenericErrorFunc(NULL, NULL);
-
- parse_error_occurred = (Qfalse == rb_funcall(errstr, rb_intern("empty?"), 0));
-
- if (parse_error_occurred) {
- exception = rb_exc_new3(rb_eRuntimeError, errstr);
- rb_exc_raise(exception);
- }
-
- return Nokogiri_wrap_xml_document((VALUE)0, result) ;
+ VALUE xmldoc, paramobj, errstr, exception ;
+ xmlDocPtr xml ;
+ xmlDocPtr result ;
+ nokogiriXsltStylesheetTuple *wrapper;
+ const char **params ;
+ long param_len, j ;
+ int parse_error_occurred ;
+
+ rb_scan_args(argc, argv, "11", &xmldoc, &paramobj);
+ if (NIL_P(paramobj)) { paramobj = rb_ary_new2(0L) ; }
+ if (!rb_obj_is_kind_of(xmldoc, cNokogiriXmlDocument)) {
+ rb_raise(rb_eArgError, "argument must be a Nokogiri::XML::Document");
+ }
+
+ /* handle hashes as arguments. */
+ if (T_HASH == TYPE(paramobj)) {
+ paramobj = rb_funcall(paramobj, rb_intern("to_a"), 0);
+ paramobj = rb_funcall(paramobj, rb_intern("flatten"), 0);
+ }
+
+ Check_Type(paramobj, T_ARRAY);
+
+ Data_Get_Struct(xmldoc, xmlDoc, xml);
+ Data_Get_Struct(self, nokogiriXsltStylesheetTuple, wrapper);
+
+ param_len = RARRAY_LEN(paramobj);
+ params = calloc((size_t)param_len + 1, sizeof(char *));
+ for (j = 0 ; j < param_len ; j++) {
+ VALUE entry = rb_ary_entry(paramobj, j);
+ const char *ptr = StringValueCStr(entry);
+ params[j] = ptr;
+ }
+ params[param_len] = 0 ;
+
+ errstr = rb_str_new(0, 0);
+ xsltSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
+ xmlSetGenericErrorFunc((void *)errstr, xslt_generic_error_handler);
+
+ result = xsltApplyStylesheet(wrapper->ss, xml, params);
+ free(params);
+
+ xsltSetGenericErrorFunc(NULL, NULL);
+ xmlSetGenericErrorFunc(NULL, NULL);
+
+ parse_error_occurred = (Qfalse == rb_funcall(errstr, rb_intern("empty?"), 0));
+
+ if (parse_error_occurred) {
+ exception = rb_exc_new3(rb_eRuntimeError, errstr);
+ rb_exc_raise(exception);
+ }
+
+ return noko_xml_document_wrap((VALUE)0, result) ;
 }
 
- static void method_caller(xmlXPathParserContextPtr ctxt, int nargs)
+ static void
+ method_caller(xmlXPathParserContextPtr ctxt, int nargs)
 {
- VALUE handler;
- const char *function_name;
- xsltTransformContextPtr transform;
- const xmlChar *functionURI;
+ VALUE handler;
+ const char *function_name;
+ xsltTransformContextPtr transform;
+ const xmlChar *functionURI;
 
- transform = xsltXPathGetTransformContext(ctxt);
- functionURI = ctxt->context->functionURI;
- handler = (VALUE)xsltGetExtData(transform, functionURI);
- function_name = (const char*)(ctxt->context->function);
+ transform = xsltXPathGetTransformContext(ctxt);
+ functionURI = ctxt->context->functionURI;
+ handler = (VALUE)xsltGetExtData(transform, functionURI);
+ function_name = (const char *)(ctxt->context->function);
 
- Nokogiri_marshal_xpath_funcall_and_return_values(ctxt, nargs, handler, (const char*)function_name);
+ Nokogiri_marshal_xpath_funcall_and_return_values(ctxt, nargs, handler, (const char *)function_name);
 }
 
- static void * initFunc(xsltTransformContextPtr ctxt, const xmlChar *uri)
+ static void *
+ initFunc(xsltTransformContextPtr ctxt, const xmlChar *uri)
 {
- VALUE modules = rb_iv_get(xslt, "@modules");
- VALUE obj = rb_hash_aref(modules, rb_str_new2((const char *)uri));
- VALUE args = { Qfalse };
- VALUE methods = rb_funcall(obj, rb_intern("instance_methods"), 1, args);
- VALUE inst;
- nokogiriXsltStylesheetTuple *wrapper;
- int i;
-
- for(i = 0; i < RARRAY_LEN(methods); i++) {
- VALUE method_name = rb_obj_as_string(rb_ary_entry(methods, i));
- xsltRegisterExtFunction(ctxt,
- (unsigned char *)StringValueCStr(method_name), uri, method_caller);
- }
-
- Data_Get_Struct((VALUE)ctxt->style->_private, nokogiriXsltStylesheetTuple,
- wrapper);
- inst = rb_class_new_instance(0, NULL, obj);
- rb_ary_push(wrapper->func_instances, inst);
-
- return (void *)inst;
+ VALUE modules = rb_iv_get(mNokogiriXslt, "@modules");
+ VALUE obj = rb_hash_aref(modules, rb_str_new2((const char *)uri));
+ VALUE args = { Qfalse };
+ VALUE methods = rb_funcall(obj, rb_intern("instance_methods"), 1, args);
+ VALUE inst;
+ nokogiriXsltStylesheetTuple *wrapper;
+ int i;
+
+ for (i = 0; i < RARRAY_LEN(methods); i++) {
+ VALUE method_name = rb_obj_as_string(rb_ary_entry(methods, i));
+ xsltRegisterExtFunction(ctxt,
+ (unsigned char *)StringValueCStr(method_name), uri, method_caller);
+ }
+
+ Data_Get_Struct((VALUE)ctxt->style->_private, nokogiriXsltStylesheetTuple,
+ wrapper);
+ inst = rb_class_new_instance(0, NULL, obj);
+ rb_ary_push(wrapper->func_instances, inst);
+
+ return (void *)inst;
 }
 
- static void shutdownFunc(xsltTransformContextPtr ctxt,
- const xmlChar *uri, void *data)
+ static void
+ shutdownFunc(xsltTransformContextPtr ctxt,
+ const xmlChar *uri, void *data)
 {
- nokogiriXsltStylesheetTuple *wrapper;
+ nokogiriXsltStylesheetTuple *wrapper;
 
- Data_Get_Struct((VALUE)ctxt->style->_private, nokogiriXsltStylesheetTuple,
- wrapper);
+ Data_Get_Struct((VALUE)ctxt->style->_private, nokogiriXsltStylesheetTuple,
+ wrapper);
 
- rb_ary_clear(wrapper->func_instances);
+ rb_ary_clear(wrapper->func_instances);
 }
 
 /*
@@ -239,32 +237,28 @@ static void shutdownFunc(xsltTransformContextPtr ctxt,
 *
 * Register a class that implements custom XSLT transformation functions.
 */
- static VALUE registr(VALUE self, VALUE uri, VALUE obj)
+ static VALUE
+ registr(VALUE self, VALUE uri, VALUE obj)
 {
- VALUE modules = rb_iv_get(self, "@modules");
- if(NIL_P(modules)) rb_raise(rb_eRuntimeError, "wtf! @modules isn't set");
+ VALUE modules = rb_iv_get(self, "@modules");
+ if (NIL_P(modules)) { rb_raise(rb_eRuntimeError, "wtf! @modules isn't set"); }
 
- rb_hash_aset(modules, uri, obj);
- xsltRegisterExtModule((unsigned char *)StringValueCStr(uri), initFunc, shutdownFunc);
- return self;
+ rb_hash_aset(modules, uri, obj);
+ xsltRegisterExtModule((unsigned char *)StringValueCStr(uri), initFunc, shutdownFunc);
+ return self;
 }
 
- VALUE cNokogiriXsltStylesheet ;
- void init_xslt_stylesheet()
+ void
+ noko_init_xslt_stylesheet()
 {
- VALUE nokogiri;
- VALUE klass;
-
- nokogiri = rb_define_module("Nokogiri");
- xslt = rb_define_module_under(nokogiri, "XSLT");
- klass = rb_define_class_under(xslt, "Stylesheet", rb_cObject);
+ rb_define_singleton_method(mNokogiriXslt, "register", registr, 2);
+ rb_iv_set(mNokogiriXslt, "@modules", rb_hash_new());
 
- rb_iv_set(xslt, "@modules", rb_hash_new());
+ cNokogiriXsltStylesheet = rb_define_class_under(mNokogiriXslt, "Stylesheet", rb_cObject);
 
- cNokogiriXsltStylesheet = klass;
+ rb_undef_alloc_func(cNokogiriXsltStylesheet);
 
- rb_define_singleton_method(klass, "parse_stylesheet_doc", parse_stylesheet_doc, 1);
- rb_define_singleton_method(xslt, "register", registr, 2);
- rb_define_method(klass, "serialize", serialize, 1);
- rb_define_method(klass, "transform", transform, -1);
+ rb_define_singleton_method(cNokogiriXsltStylesheet, "parse_stylesheet_doc", parse_stylesheet_doc, 1);
+ rb_define_method(cNokogiriXsltStylesheet, "serialize", serialize, 1);
+ rb_define_method(cNokogiriXsltStylesheet, "transform", transform, -1);
 }
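Note on the `transform` function in the diff above: before calling `xsltApplyStylesheet`, it copies the Ruby parameter array into a NULL-terminated array of C strings, allocating one extra slot for the terminator. A minimal standalone sketch of that pattern follows, assuming plain C strings in place of Ruby `VALUE`s. The helper names `build_params` and `count_params` are hypothetical and are not part of Nokogiri or libxslt.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not Nokogiri API): copies `len` string pointers into
 * a freshly allocated array with one extra slot for the terminating NULL,
 * mirroring the calloc(param_len + 1, ...) call in transform(). */
static const char **
build_params(const char *const *pairs, size_t len)
{
  const char **params;
  size_t j;

  assert(pairs != NULL || len == 0);

  params = calloc(len + 1, sizeof(char *));
  if (!params) { return NULL; }

  for (j = 0; j < len; j++) {
    params[j] = pairs[j]; /* pointers are borrowed, not duplicated */
  }
  params[len] = NULL; /* consumers stop at the first NULL entry */

  return params;
}

/* Hypothetical helper: counts entries up to the NULL terminator, the way a
 * consumer of such an array would walk it. */
static size_t
count_params(const char **params)
{
  size_t n = 0;

  while (params[n]) { n++; }
  return n;
}
```

As in the diff, only the array itself is owned by the caller and released with `free`; the strings it points to remain owned by whoever supplied them.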
@@ -0,0 +1,63 @@
+ ## Gumbo 0.10.1 (2015-04-30)
+
+ Same as 0.10.0, but with the version number bumped because the last version-number commit to v0.9.4 makes GitHub think that v0.9.4 is the latest version and so it's not highlighted on the webpage.
+
+ ## Gumbo 0.10.0 (2015-04-30)
+
+ * Full support for `<template>` tag (kevinhendricks, nostrademons).
+ * Some fixes for `<rtc>`/`<rt>` handling (kevinhendricks, vmg).
+ * All html5lib-trunk tests pass now! (kevinhendricks, vmg, nostrademons)
+ * Support for fragment parsing (vmg)
+ * A couple additional example programs (kevinhendricks)
+ * Performance improvements totaling an estimated 30-40% total improvement (vmg, nostrademons).
+
+ ## Gumbo 0.9.4 (2015-04-30)
+
+ * Additional Visual Studio fixes (lowjoel, nostrademons)
+ * Fixed some unused variable warnings.
+ * Fix for glibtoolize vs. libtoolize build errors on Mac.
+ * Fixed `CDATA` end tag handling.
+
+ ## Gumbo 0.9.3 (2015-02-17)
+
+ * Bugfix for `&AElig;` entities (rgrove)
+ * Fix `CDATA` handling; `CDATA` sections now generate a `GUMBO_NODE_CDATA` node rather
+ than plain text.
+ * Fix `get_title example` to handle whitespace nodes (gsnedders)
+ * Visual Studio compilation fixes (fishioon)
+ * Take the namespace into account when determining whether a node matches a
+ certain tag (aroben)
+ * Replace the varargs tag functions with a tagset bytevector, for a 20-30%
+ speedup in overall parse time (kevinhendricks, vmg)
+ * Add MacOS X support to Travis CI, and fix the deployment/DLL issues this
+ uncovered (nostrademons, kevinhendricks, vmg)
+
+ ## Gumbo 0.9.2 (2014-09-21)
+
+ * Performance improvements: Ragel-based char ref decoder and DFA-based UTF8
+ decoder, totaling speedups of up to 300%.
+ * Added benchmarking program and some sample data.
+ * Fixed a compiler error under Visual Studio.
+ * Fix an error in the ctypes bindings that could lead to memory corruption in
+ the Python bindings.
+ * Fix duplicate attributes when parsing `<isindex>` tags.
+ * Don't leave semicolons behind when consuming entity references (rgrove)
+ * Internally rename some functions in preparation for an amalgamation file
+ (jdeng)
+ * Add proper cflags for gyp builds (skabbes)
+
+ ## Gumbo 0.9.1 (2014-08-07)
+
+ * First version listed on PyPi.
+ * Autotools files excluded from GitHub and generated via autogen.sh. (endgame)
+ * Numerous compiler warnings fixed. (bnoordhuis, craigbarnes)
+ * Google security audit passed.
+ * Gyp support (tfarina)
+ * Naming convention for structs changed to avoid C reserved words.
+ * Fix several integer and buffer overflows (Maxime2)
+ * Some Visual Studio compiler support (bugparty)
+ * Python3 compatibility for the ctypes bindings.
+
+ ## Gumbo 0.9.0 (2013-08-13)
+
+ * Initial release open-sourced by Google.
@@ -0,0 +1,101 @@
+ .PHONY: all clean check coverage
+
+ gumbo_objs := $(patsubst %.c,build/%.o,$(wildcard src/*.c))
+ test_objs := $(patsubst %.cc,build/%.o,$(wildcard test/*.cc))
+ gtest_lib := googletest/make/gtest_main.a
+
+ # make SANITIZEFLAGS='-fsanitize=undefined -fsanitize=address'
+ SANITIZEFLAGS :=
+ CPPFLAGS := -Isrc
+ CFLAGS := -std=c99 -Os -Wall
+ CXXFLAGS := -isystem googletest/include -std=c++11 -Os -Wall
+ LDFLAGS := -pthread
+
+ all: check
+
+ src/%.c: src/%.rl
+ ragel -F1 -o $@ $<
+
+ build/src:
+ mkdir -p $@
+
+ build/test:
+ mkdir -p $@
+
+ build/src/%.o: src/%.c build/src/flags | build/src
+ $(CC) -MMD $(CPPFLAGS) $(CFLAGS) $(SANITIZEFLAGS) -c -o $@ $<
+
+ build/test/%.o: test/%.cc build/test/flags | build/test
+ $(CXX) -MMD $(CPPFLAGS) $(CXXFLAGS) $(SANITIZEFLAGS) -c -o $@ $<
+
+ build/run_tests: $(gumbo_objs) $(test_objs) $(gtest_lib)
+ $(CXX) -o $@ $+ $(LDFLAGS) $(SANITIZEFLAGS)
+
+ check: build/run_tests
+ ./build/run_tests
+
+ coverage:
+ $(RM) build/{src,test}/*.gcda
+ $(RM) build/*.info
+ $(MAKE) CPPFLAGS='-Isrc -DNDEBUG=1' \
+ CFLAGS='-std=c99 --coverage -g -O0' \
+ CXXFLAGS='-isystem googletest/include -std=c++11 --coverage -g -O0' \
+ LDFLAGS='--coverage' \
+ build/run_tests
+ lcov --no-external \
+ --initial \
+ --capture \
+ --base-directory . \
+ --directory build \
+ --output-file build/coverage-pre.info
+ awk -F '[:,]' \
+ '/^SF:/ { delete defs } /^FN:/ { defs[$$2]=1 } /^DA:/ { if ($$3 == 0 && $$2 in defs) next } { print }' \
+ build/coverage-pre.info > build/coverage-initial.info
+ ./build/run_tests
+ lcov --no-external \
+ --capture \
+ --base-directory . \
+ --directory build \
+ --rc lcov_branch_coverage=1 \
+ --output-file build/coverage-test.info
+ lcov --add-tracefile build/coverage-initial.info \
+ --add-tracefile build/coverage-test.info \
+ --rc lcov_branch_coverage=1 \
+ --output-file build/coverage.info
+ lcov --remove build/coverage.info '$(CURDIR)/googletest/*' \
+ --rc lcov_branch_coverage=1 \
+ --output-file build/coverage.info
+ genhtml --branch-coverage \
+ --output-directory build/coverage \
+ build/coverage.info
+
+ clean:
+ $(RM) -r build
+
+ build/src/flags: | build/src
+ @echo 'old_CC := $(CC)' > $@
+ @echo 'old_CPPFLAGS := $(CPPFLAGS)' >> $@
+ @echo 'old_CFLAGS := $(CFLAGS)' >>$@
+ @echo 'old_SANITIZEFLAGS := $(SANITIZEFLAGS)' >> $@
+ @echo 'old_LDFLAGS := $(LDFLAGS)' >> $@
+
+ build/test/flags: | build/test
+ @echo 'old_CXX := $(CXX)' > $@
+ @echo 'old_CPPFLAGS := $(CPPFLAGS)' >> $@
+ @echo 'old_CXXFLAGS := $(CXXFLAGS)' >> $@
+ @echo 'old_SANITIZEFLAGS := $(SANITIZEFLAGS)' >> $@
+ @echo 'old_LDFLAGS := $(LDFLAGS)' >> $@
+
+ ifeq (,$(filter clean coverage,$(MAKECMDGOALS)))
+ # Ensure that the flags are up to date.
+ -include build/src/flags build/test/flags
+ ifneq ($(old_CC) | $(old_CPPFLAGS) | $(old_CFLAGS) | $(old_SANITIZEFLAGS) | $(old_LDFLAGS),$(CC) | $(CPPFLAGS) | $(CFLAGS) | $(SANITIZEFLAGS) | $(LDFLAGS))
+ .PHONY: build/src/flags
+ endif
+ ifneq ($(old_CXX) | $(old_CPPFLAGS) | $(old_CXXFLAGS) | $(old_SANITIZEFLAGS) | $(old_LDFLAGS),$(CXX) | $(CPPFLAGS) | $(CXXFLAGS) | $(SANITIZEFLAGS) | $(LDFLAGS))
+ .PHONY: build/test/flags
+ endif
+
+ # Include dependencies.
+ -include $(test_objs:.o=.d) $(gumbo_objs:.o=.d)
+ endif
@@ -0,0 +1,27 @@
+ Gumbo HTML parser THANKS file
+
+ Gumbo was originally written by Jonathan Tang, but many people helped out through suggestions, question-answering, code reviews, bugfixes, and organizational support. Here is a list of these people. Help me keep it complete and free of errors.
+
+ Adam Barth
+ Adam Roben
+ Ben Noordhuis
+ Bowen Han
+ Constantinos Michael
+ Craig Barnes
+ Geoffrey Sneddon
+ Ian Hickson
+ Jack Deng
+ Joel Low
+ Jonathan Shneier
+ Kevin Hendricks
+ Mason Tang
+ Maxim Zakharov
+ Michal Zalewski
+ Neal Norwitz
+ Othar Hansson
+ Ryan Grove
+ Stefan Haustein
+ Steffen Meschkat
+ Steven Kabbes
+ Thiago Farina
+ Vicent Marti
@@ -0,0 +1,34 @@
+ # this Makefile is used by ext/nokogiri/extconf.rb
+ # to enable a mini_portile2 recipe to build the gumbo parser
+ .PHONY: clean
+
+ CFLAGS += -std=c99 -Wall
+
+ # allow the ENV var to override this
+ RANLIB ?= ranlib
+
+ gumbo_objs := \
+ ascii.o \
+ attribute.o \
+ char_ref.o \
+ error.o \
+ foreign_attrs.o \
+ parser.o \
+ string_buffer.o \
+ string_piece.o \
+ svg_attrs.o \
+ svg_tags.o \
+ tag.o \
+ tag_lookup.o \
+ token_buffer.o \
+ tokenizer.o \
+ utf8.o \
+ util.o \
+ vector.o
+
+ libgumbo.a: $(gumbo_objs)
+ $(AR) $(ARFLAGS) $@ $(gumbo_objs)
+ - ($(RANLIB) $@ || true) >/dev/null 2>&1
+
+ clean:
+ rm -f $(gumbo_objs) libgumbo.a
@@ -0,0 +1,41 @@
+ libgumbo
+ ========
+
+ This is an internal fork of the [libgumbo] library, which was copied and
+ later modified under the terms of the Apache 2.0 [license]. See `lua-gumbo`
+ commit [`0a04728`] for details of the original import.
+
+ Since importing the code, the following notable fixes and improvements
+ have been made:
+
+ * `91cef89`: Re-implement `adjust_foreign_attributes()` with a gperf hash
+ * `b11abe7`: Pass `TagSet` arrays into functions by reference instead of value
+ * `b73dc03`: Simplify `maybe_replace_codepoint()` function
+ * `d5d0bb3`: Remove special handling of `<menuitem>` tag
+ * `7bd5162`: Remove special handling of `<isindex>` tag
+ * `a5c1b0e`: Use `realloc(3)` instead of `malloc(3)` in `enlarge_vector_if_full()`
+ * `dcbebd7`: Use `realloc(3)` instead of `malloc(3)` in `maybe_resize_string_buffer()`
+ * `df15262`: Make `destroy_node()` function non-recursive
+ * `2df37f5`: Fix signedness of some format specifiers
+ * `176553e`: Add maximum element nesting limit
+ * `bed0f4a`: Annotate `gumbo_debug()` with `PRINTF` macro and fix warnings
+ * `7ffc218`: Annotate `print_message()` with `PRINTF` macro and fix warnings
+ * `1bd8ab5`, `9136507`, `53a1f9a`: Deduplicate some identical `TagSet` arrays
+ * `a7a9065`: Add some GCC/Clang function attributes
+ * `8d3d4e4`: Remove custom allocator support
+ * `8d3b006`: Fix recording of source positions for `</form>` end tags
+ * `1a8d763`: Replace linear search in `maybe_replace_codepoint()` with a lookup table
+ * `6dca79e`: Replace `strcasecmp()` and `strncasecmp()` with ASCII-only equivalents
+ * `17ab1d2`: Fix `TAGSET_INCLUDES` macro to work properly with multiple bit flags
+ * `7e56d45`: Re-implement `gumbo_normalize_svg_tagname()` with a gperf hash
+ * `a518d35`: Replace linear array search in `adjust_svg_attributes()` with a gperf hash
+ * `a4a7433`: Fix duplicate `TagSet` initializer being ignored in `is_special_node()`
+ * `8137fcd`: Add support for `<dialog>` tag
+ * `4b35471`: Add missing `static` qualifiers to hide symbols that shouldn't be extern
+ * `df57c59`, `03101f3`, `ea62330`: Replace use of locale-dependent `ctype.h` functions
+ with custom, ASCII-only equivalents
+
+
+ [libgumbo]: https://github.com/google/gumbo-parser/tree/aa91b27b02c0c80c482e24348a457ed7c3c088e0/src
+ [license]: https://github.com/google/gumbo-parser/blob/aa91b27b02c0c80c482e24348a457ed7c3c088e0/COPYING
+ [`0a04728`]: https://gitlab.com/craigbarnes/lua-gumbo/commit/0a047282815af86f3367a7d95fefcfe5723ece48
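Two of the README entries above (`6dca79e` and the `df57c59` group) replace locale-dependent C library calls with ASCII-only equivalents. The sketch below illustrates the general idea only; the `sketch_` function names are hypothetical and are not taken from the fork's source, and the actual implementation may differ. The point is that an HTML parser must compare tag and attribute names the same way under every locale, whereas `tolower(3)` and `strcasecmp(3)` may change behavior with `setlocale(3)`.

```c
#include <stddef.h>

/* Hypothetical helper (not from the fork): lowercase one byte using ASCII
   rules only. Bytes outside 'A'..'Z' are returned unchanged, so the result
   never depends on the current locale. */
static int sketch_ascii_tolower(int c) {
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical stand-in for strcasecmp(3): compares two NUL-terminated
   strings, folding case for ASCII letters only. Returns 0 when equal,
   otherwise the difference of the first mismatching (folded) bytes. */
static int sketch_ascii_strcasecmp(const char *a, const char *b) {
  for (;; a++, b++) {
    int ca = sketch_ascii_tolower((unsigned char)*a);
    int cb = sketch_ascii_tolower((unsigned char)*b);
    if (ca != cb || ca == '\0') {
      return ca - cb;
    }
  }
}

/* Hypothetical stand-in for strncasecmp(3): same comparison, but looks at
   no more than n bytes of each string. */
static int sketch_ascii_strncasecmp(const char *a, const char *b, size_t n) {
  for (; n > 0; a++, b++, n--) {
    int ca = sketch_ascii_tolower((unsigned char)*a);
    int cb = sketch_ascii_tolower((unsigned char)*b);
    if (ca != cb || ca == '\0') {
      return ca - cb;
    }
  }
  return 0;
}
```

Under these assumptions, `sketch_ascii_strcasecmp("DIV", "div")` returns 0 in any locale, while a byte such as `0xC4` is never case-folded, which keeps non-ASCII input from being matched against ASCII tag names by accident.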