rgen 0.5.4 → 0.6.0
- data/CHANGELOG +28 -0
- data/Rakefile +3 -4
- data/lib/ea_support/uml13_ea_metamodel.rb +3 -3
- data/lib/ea_support/uml13_ea_to_uml13.rb +33 -2
- data/lib/ea_support/uml13_to_uml13_ea.rb +7 -0
- data/lib/mmgen/mm_ext/ecore_mmgen_ext.rb +4 -4
- data/lib/mmgen/templates/metamodel_generator.tpl +143 -143
- data/lib/rgen/ecore/ecore.rb +11 -1
- data/lib/rgen/ecore/ecore_interface.rb +47 -0
- data/lib/rgen/ecore/ecore_to_ruby.rb +166 -0
- data/lib/rgen/ecore/{ecore_transformer.rb → ruby_to_ecore.rb} +11 -11
- data/lib/rgen/environment.rb +15 -2
- data/lib/rgen/fragment/dump_file_cache.rb +63 -0
- data/lib/rgen/fragment/fragmented_model.rb +139 -0
- data/lib/rgen/fragment/model_fragment.rb +268 -0
- data/lib/rgen/instantiator/abstract_xml_instantiator.rb +44 -72
- data/lib/rgen/instantiator/default_xml_instantiator.rb +2 -2
- data/lib/rgen/instantiator/ecore_xml_instantiator.rb +16 -1
- data/lib/rgen/instantiator/json_instantiator.rb +16 -2
- data/lib/rgen/instantiator/nodebased_xml_instantiator.rb +118 -138
- data/lib/rgen/instantiator/qualified_name_resolver.rb +5 -1
- data/lib/rgen/instantiator/reference_resolver.rb +126 -24
- data/lib/rgen/instantiator/xmi11_instantiator.rb +6 -2
- data/lib/rgen/metamodel_builder.rb +18 -6
- data/lib/rgen/metamodel_builder/builder_extensions.rb +431 -407
- data/lib/rgen/metamodel_builder/builder_runtime.rb +8 -8
- data/lib/rgen/metamodel_builder/constant_order_helper.rb +4 -4
- data/lib/rgen/metamodel_builder/data_types.rb +5 -1
- data/lib/rgen/metamodel_builder/intermediate/feature.rb +167 -0
- data/lib/rgen/metamodel_builder/module_extension.rb +2 -2
- data/lib/rgen/model_builder.rb +10 -5
- data/lib/rgen/model_builder/builder_context.rb +17 -1
- data/lib/rgen/serializer/opposite_reference_filter.rb +18 -0
- data/lib/rgen/serializer/qualified_name_provider.rb +45 -0
- data/lib/rgen/template_language/template_container.rb +3 -1
- data/lib/rgen/{auto_class_creator.rb → util/auto_class_creator.rb} +6 -1
- data/lib/rgen/util/cached_glob.rb +67 -0
- data/lib/rgen/util/file_cache_map.rb +104 -0
- data/lib/rgen/util/file_change_detector.rb +78 -0
- data/lib/rgen/{method_delegation.rb → util/method_delegation.rb} +18 -3
- data/lib/rgen/{model_comparator.rb → util/model_comparator.rb} +17 -5
- data/lib/rgen/{model_comparator_base.rb → util/model_comparator_base.rb} +6 -1
- data/lib/rgen/{model_dumper.rb → util/model_dumper.rb} +6 -1
- data/lib/rgen/{name_helper.rb → util/name_helper.rb} +6 -1
- data/lib/rgen/util/pattern_matcher.rb +329 -0
- data/lib/transformers/uml13_to_ecore.rb +103 -60
- data/test/ecore_self_test.rb +43 -42
- data/test/json_test.rb +15 -0
- data/test/metamodel_builder_test.rb +361 -206
- data/test/metamodel_from_ecore_test.rb +45 -0
- data/test/metamodel_order_test.rb +10 -4
- data/test/metamodel_roundtrip_test.rb +2 -2
- data/test/metamodel_roundtrip_test/TestModel_Regenerated.rb +1 -1
- data/test/metamodel_roundtrip_test/houseMetamodel_Regenerated.ecore +50 -50
- data/test/method_delegation_test.rb +9 -9
- data/test/model_builder/ecore_internal.rb +19 -9
- data/test/model_builder/serializer_test.rb +1 -1
- data/test/reference_resolver_test.rb +79 -12
- data/test/rgen_test.rb +2 -0
- data/test/template_language_test.rb +7 -0
- data/test/template_language_test/templates/callback_indent_test/a.tpl +12 -0
- data/test/template_language_test/templates/callback_indent_test/b.tpl +5 -0
- data/test/testmodel/ea_testmodel_regenerated.xml +588 -583
- data/test/transformer_test.rb +3 -3
- data/test/util/file_cache_map_test.rb +91 -0
- data/test/util/file_cache_map_test/testdir/fileA +1 -0
- data/test/util_test.rb +4 -0
- data/test/xml_instantiator_test.rb +139 -135
- metadata +49 -104
- data/lib/rgen/ecore/ecore_instantiator.rb +0 -31
- data/lib/rgen/metamodel_builder/metamodel_description.rb +0 -232
- data/redist/xmlscan/ChangeLog +0 -1301
- data/redist/xmlscan/README +0 -34
- data/redist/xmlscan/THANKS +0 -11
- data/redist/xmlscan/doc/changes.html +0 -74
- data/redist/xmlscan/doc/changes.rd +0 -80
- data/redist/xmlscan/doc/en/conformance.html +0 -136
- data/redist/xmlscan/doc/en/conformance.rd +0 -152
- data/redist/xmlscan/doc/en/manual.html +0 -356
- data/redist/xmlscan/doc/en/manual.rd +0 -402
- data/redist/xmlscan/doc/ja/conformance.ja.html +0 -118
- data/redist/xmlscan/doc/ja/conformance.ja.rd +0 -134
- data/redist/xmlscan/doc/ja/manual.ja.html +0 -325
- data/redist/xmlscan/doc/ja/manual.ja.rd +0 -370
- data/redist/xmlscan/doc/src/Makefile +0 -41
- data/redist/xmlscan/doc/src/conformance.rd.src +0 -256
- data/redist/xmlscan/doc/src/langsplit.rb +0 -110
- data/redist/xmlscan/doc/src/manual.rd.src +0 -614
- data/redist/xmlscan/install.rb +0 -41
- data/redist/xmlscan/lib/xmlscan/encoding.rb +0 -311
- data/redist/xmlscan/lib/xmlscan/htmlscan.rb +0 -289
- data/redist/xmlscan/lib/xmlscan/namespace.rb +0 -352
- data/redist/xmlscan/lib/xmlscan/parser.rb +0 -299
- data/redist/xmlscan/lib/xmlscan/scanner.rb +0 -1109
- data/redist/xmlscan/lib/xmlscan/version.rb +0 -22
- data/redist/xmlscan/lib/xmlscan/visitor.rb +0 -158
- data/redist/xmlscan/lib/xmlscan/xmlchar.rb +0 -441
- data/redist/xmlscan/memo/CONFORMANCE +0 -1249
- data/redist/xmlscan/memo/PRODUCTIONS +0 -195
- data/redist/xmlscan/memo/contentspec.ry +0 -335
- data/redist/xmlscan/samples/chibixml.rb +0 -105
- data/redist/xmlscan/samples/getxmlchar.rb +0 -122
- data/redist/xmlscan/samples/rexml.rb +0 -159
- data/redist/xmlscan/samples/xmlbench.rb +0 -88
- data/redist/xmlscan/samples/xmlbench/parser/chibixml.rb +0 -22
- data/redist/xmlscan/samples/xmlbench/parser/nqxml.rb +0 -29
- data/redist/xmlscan/samples/xmlbench/parser/rexml.rb +0 -62
- data/redist/xmlscan/samples/xmlbench/parser/xmlparser.rb +0 -22
- data/redist/xmlscan/samples/xmlbench/parser/xmlscan-0.0.10.rb +0 -62
- data/redist/xmlscan/samples/xmlbench/parser/xmlscan-chibixml.rb +0 -22
- data/redist/xmlscan/samples/xmlbench/parser/xmlscan-rexml.rb +0 -22
- data/redist/xmlscan/samples/xmlbench/parser/xmlscan.rb +0 -99
- data/redist/xmlscan/samples/xmlbench/xmlbench-lib.rb +0 -116
- data/redist/xmlscan/samples/xmlconftest.rb +0 -200
- data/redist/xmlscan/test.rb +0 -7
- data/redist/xmlscan/tests/deftestcase.rb +0 -73
- data/redist/xmlscan/tests/runtest.rb +0 -47
- data/redist/xmlscan/tests/testall.rb +0 -14
- data/redist/xmlscan/tests/testencoding.rb +0 -438
- data/redist/xmlscan/tests/testhtmlscan.rb +0 -752
- data/redist/xmlscan/tests/testnamespace.rb +0 -457
- data/redist/xmlscan/tests/testparser.rb +0 -591
- data/redist/xmlscan/tests/testscanner.rb +0 -1749
- data/redist/xmlscan/tests/testxmlchar.rb +0 -143
- data/redist/xmlscan/tests/visitor.rb +0 -34
@@ -1,356 +0,0 @@
|
|
1
|
-
<?xml version="1.0" ?>
|
2
|
-
<!DOCTYPE html
|
3
|
-
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
4
|
-
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
5
|
-
<html xmlns="http://www.w3.org/1999/xhtml">
|
6
|
-
<head>
|
7
|
-
<title>en/manual.rd</title>
|
8
|
-
</head>
|
9
|
-
<body>
|
10
|
-
<h1><a name="label-0" id="label-0">xmlscan version 0.2 Reference Manual</a></h1><!-- RDLabel: "xmlscan version 0.2 Reference Manual" -->
|
11
|
-
<p>This is a broken English version. If you find lexical or
|
12
|
-
grammatical mistakes, or strange expressions (including kidding,
|
13
|
-
unnatural or unclear ones) in this document, please
|
14
|
-
<a href="mailto:katsu@blue.sky.or.jp">let me know</a>.</p>
|
15
|
-
<h2><a name="label-1" id="label-1">Abstract</a></h2><!-- RDLabel: "Abstract" -->
|
16
|
-
<p>XMLscan is one of non-validating XML parser written in 100%
|
17
|
-
pure Ruby.</p>
|
18
|
-
<p>XMLscan's features are as follows:</p>
|
19
|
-
<dl>
|
20
|
-
<dt><a name="label-2" id="label-2">100% pure Ruby</a></dt><!-- RDLabel: "100% pure Ruby" -->
|
21
|
-
<dd>
|
22
|
-
XMLscan doesn't require any extension libraries, so
|
23
|
-
it completely works only with a Ruby interpreter version
|
24
|
-
1.6 or above.
|
25
|
-
(It also needs no standard-bundled extension library.)
|
26
|
-
</dd>
|
27
|
-
<dt><a name="label-3" id="label-3">Compliant to the specification</a></dt><!-- RDLabel: "Compliant to the specification" -->
|
28
|
-
<dd>
|
29
|
-
XMLscan has been developed to satisfy all conditions,
|
30
|
-
described in XML 1.0 Specification and required to a
|
31
|
-
non-validating XML processor
|
32
|
-
</dd>
|
33
|
-
<dt><a name="label-4" id="label-4">High-speed</a></dt><!-- RDLabel: "High-speed" -->
|
34
|
-
<dd>
|
35
|
-
XMLscan is, probably, the fastest parser among all
|
36
|
-
existing XML/HTML parsers written in pure Ruby.
|
37
|
-
</dd>
|
38
|
-
<dt><a name="label-5" id="label-5">Support for various CES.</a></dt><!-- RDLabel: "Support for various CES." -->
|
39
|
-
<dd>
|
40
|
-
XMLscan can parse an XML document encoded in at least
|
41
|
-
iso-8859-*, EUC-*, Shift_JIS, and UTF-8 as it is.
|
42
|
-
UTF-16 is not supported directly, though.
|
43
|
-
</dd>
|
44
|
-
<dt><a name="label-6" id="label-6">Just parsing</a></dt><!-- RDLabel: "Just parsing" -->
|
45
|
-
<dd>
|
46
|
-
The role of xmlscan is just to parse an XML document.
|
47
|
-
XMLscan doesn't provide high-level features to easily
|
48
|
-
handle an XML document. XMLscan is assumed to be used as
|
49
|
-
a core part of a library providing such features.
|
50
|
-
</dd>
|
51
|
-
<dt><a name="label-7" id="label-7">HTML</a></dt><!-- RDLabel: "HTML" -->
|
52
|
-
<dd>
|
53
|
-
XMLscan contains htmlscan, an HTML parser.
|
54
|
-
</dd>
|
55
|
-
</dl>
|
56
|
-
<h2><a name="label-8" id="label-8">Character encodings</a></h2><!-- RDLabel: "Character encodings" -->
|
57
|
-
<p>By default, the value of global variable $KCODE decides
|
58
|
-
which CES (character encoding scheme) is assumed for xmlscan
|
59
|
-
to parse an XML document.
|
60
|
-
You need to set $KCODE or <!-- Reference, RDLabel "XMLScan::XMLScanner#kcode=" doesn't exist --><em class="label-not-found">XMLScan::XMLScanner#kcode=</em><!-- Reference end -->
|
61
|
-
an appropriate value to parse an XML document encoded in EUC-*,
|
62
|
-
Shift_JIS, or UTF-8.</p>
|
63
|
-
<p>UTF-16 is not supported directly. You should convert it into
|
64
|
-
UTF-8 before parsing.</p>
|
65
|
-
<h2><a name="label-9" id="label-9">XML Namespaces</a></h2><!-- RDLabel: "XML Namespaces" -->
|
66
|
-
<p>XML Namespaces have been already implemented in
|
67
|
-
xmlscan/namespace.rb. However, since its interface is going
|
68
|
-
to be modified, this feature is undocumented now.</p>
|
69
|
-
<h2><a name="label-10" id="label-10">Class Reference</a></h2><!-- RDLabel: "Class Reference" -->
|
70
|
-
<h3><a name="label-11" id="label-11">XMLScan::Error</a></h3><!-- RDLabel: "XMLScan::Error" -->
|
71
|
-
<p>The superclass for all exceptions related to xmlscan.</p>
|
72
|
-
<p>These exceptions are raised by XMLScan::Visitor
|
73
|
-
by default when it receives an error report from a parser,
|
74
|
-
such as XMLScan::XMLScanner or XMLScan::XMLParser.
|
75
|
-
Each parser never raises these exceptions by itself.</p>
|
76
|
-
<dl>
|
77
|
-
<dt><a name="label-12" id="label-12">XMLScan::ParseError</a></dt><!-- RDLabel: "XMLScan::ParseError" -->
|
78
|
-
<dd>
|
79
|
-
An error except a constraint violation, for example,
|
80
|
-
an XML document is unmatched with a production.
|
81
|
-
</dd>
|
82
|
-
<dt><a name="label-13" id="label-13">XMLScan::NotWellFormedError</a></dt><!-- RDLabel: "XMLScan::NotWellFormedError" -->
|
83
|
-
<dd>
|
84
|
-
Raised when an XML document violates a well-formedness
|
85
|
-
constraint.
|
86
|
-
</dd>
|
87
|
-
<dt><a name="label-14" id="label-14">XMLScan::NotValidError</a></dt><!-- RDLabel: "XMLScan::NotValidError" -->
|
88
|
-
<dd>
|
89
|
-
Raised when an XML document violates a validity constraint.
|
90
|
-
</dd>
|
91
|
-
</dl>
|
92
|
-
<h3><a name="label-15" id="label-15">XMLScan::Visitor</a></h3><!-- RDLabel: "XMLScan::Visitor" -->
|
93
|
-
<p>Mix-in for receiving the result of parsing an XML document.</p>
|
94
|
-
<p>Each parser included in xmlscan parses an XML document from
|
95
|
-
the beginning, and calls each specific method of given instance of
|
96
|
-
XMLScan::Visitor for each syntactic element, such as a tag.
|
97
|
-
It is ensured that these calls are made in order of appearance
|
98
|
-
in the document from the beginning.</p>
|
99
|
-
<h4><a name="label-16" id="label-16">Methods:</a></h4><!-- RDLabel: "Methods:" -->
|
100
|
-
<p>Without special notice, the following methods do nothing by
|
101
|
-
default.</p>
|
102
|
-
<dl>
|
103
|
-
<dt><a name="label-17" id="label-17"><code>XMLScan::Visitor#parse_error(<var>msg</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#parse_error" -->
|
104
|
-
<dd>
|
105
|
-
Called when the parser meets an error except a constraint
|
106
|
-
violation, for example, an XML document is unmatched with
|
107
|
-
a production. By default, this method raises
|
108
|
-
<a href="#label-12">XMLScan::ParseError</a> exception. If no exception is
|
109
|
-
raised and this method returns normally, the parser recovers
|
110
|
-
the error and continues to parse.</dd>
|
111
|
-
<dt><a name="label-18" id="label-18"><code>XMLScan::Visitor#wellformed_error(<var>msg</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#wellformed_error" -->
|
112
|
-
<dd>
|
113
|
-
Called when the parser meets a well-formedness constraint
|
114
|
-
violation. By default, this method raises
|
115
|
-
<a href="#label-13">XMLScan::NotWellFormedError</a> exception. If no exception
|
116
|
-
is raised and this method returns normally, the parser recovers
|
117
|
-
the error and continues to parse.</dd>
|
118
|
-
<dt><a name="label-19" id="label-19"><code>XMLScan::Visitor#valid_error(<var>msg</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#valid_error" -->
|
119
|
-
<dd>
|
120
|
-
<p>Called when the parser meets a validity constraint
|
121
|
-
violation. By default, this method raises
|
122
|
-
<a href="#label-14">XMLScan::NotValidError</a> exception. If no exception
|
123
|
-
is raised and this method returns normally, the parser recovers
|
124
|
-
the error and continues to parse.</p>
|
125
|
-
<p>FYI, current version of xmlscan includes no validating XML
|
126
|
-
processor. This method is reserved for future versions.</p></dd>
|
127
|
-
<dt><a name="label-20" id="label-20"><code>XMLScan::Visitor#warning(<var>msg</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#warning" -->
|
128
|
-
<dd>
|
129
|
-
Called when the parser meets a non-error but unrecommended
|
130
|
-
thing or a syntax which xmlscan is not able to parse.</dd>
|
131
|
-
<dt><a name="label-21" id="label-21"><code>XMLScan::Visitor#on_start_document</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_start_document" -->
|
132
|
-
<dd>
|
133
|
-
Called just before the parser starts parsing an XML document.
|
134
|
-
After this method is called, corresponding
|
135
|
-
<a href="#label-22">XMLScan::Visitor#on_end_document</a> method is always called.</dd>
|
136
|
-
<dt><a name="label-22" id="label-22"><code>XMLScan::Visitor#on_end_document</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_end_document" -->
|
137
|
-
<dd>
|
138
|
-
Called after the parser reaches the end of an XML document.</dd>
|
139
|
-
<dt><a name="label-23" id="label-23"><code>XMLScan::Visitor#on_xmldecl</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl" -->
|
140
|
-
<dt><a name="label-24" id="label-24"><code>XMLScan::Visitor#on_xmldecl_version(<var>str</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_version" -->
|
141
|
-
<dt><a name="label-25" id="label-25"><code>XMLScan::Visitor#on_xmldecl_encoding(<var>str</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_encoding" -->
|
142
|
-
<dt><a name="label-26" id="label-26"><code>XMLScan::Visitor#on_xmldecl_standalone(<var>str</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_standalone" -->
|
143
|
-
<dt><a name="label-27" id="label-27"><code>XMLScan::Visitor#on_xmldecl_other(<var>name</var>, <var>value</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_other" -->
|
144
|
-
<dt><a name="label-28" id="label-28"><code>XMLScan::Visitor#on_xmldecl_end</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_end" -->
|
145
|
-
<dd>
|
146
|
-
<p>Called when the parser meets an XML declaration.</p>
|
147
|
-
<pre><?xml version="1.0" encoding="euc-jp" standalone="yes" ?>
|
148
|
-
^ ^ ^ ^ ^
|
149
|
-
1 2 3 4 5
|
150
|
-
|
151
|
-
method argument
|
152
|
-
--------------------------------------
|
153
|
-
1: on_xmldecl
|
154
|
-
2: on_xmldecl_version ("1.0")
|
155
|
-
3: on_xmldecl_encoding ("euc-jp")
|
156
|
-
4: on_xmldecl_standalone ("yes")
|
157
|
-
5: on_xmldecl_end</pre>
|
158
|
-
<p>When an XML declaration is found, both on_xmldecl and
|
159
|
-
on_xmldecl_end method are always called. Any other methods
|
160
|
-
are called only when the corresponding syntaxes are found.</p>
|
161
|
-
<p>When a declaration except version, encoding, and standalone
|
162
|
-
is found in an XML declaration, on_xmldecl_other method is
|
163
|
-
called. Since such a declaration is not permitted, note that
|
164
|
-
the parser always calls <a href="#label-17">XMLScan::Visitor#parse_error</a> method
|
165
|
-
before calling on_xmldecl_other method.</p></dd>
|
166
|
-
<dt><a name="label-29" id="label-29"><code>XMLScan::Visitor#on_doctype(<var>root</var>, <var>pubid</var>, <var>sysid</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_doctype" -->
|
167
|
-
<dd>
|
168
|
-
<p>Called when the parser meets a document type declaration.</p>
|
169
|
-
<pre>document argument</pre>
|
170
|
-
<pre>--------------------------------------------------------------
|
171
|
-
1: <!DOCTYPE foo> ('foo', nil, nil)
|
172
|
-
2: <!DOCTYPE foo SYSTEM "bar"> ('foo', nil, 'bar')
|
173
|
-
3: <!DOCTYPE foo PUBLIC "bar"> ('foo', 'bar', nil )
|
174
|
-
4: <!DOCTYPE foo PUBLIC "bar" "baz"> ('foo', 'bar', 'baz')</pre></dd>
|
175
|
-
<dt><a name="label-30" id="label-30"><code>XMLScan::Visitor#on_prolog_space(<var>str</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_prolog_space" -->
|
176
|
-
<dd>
|
177
|
-
Called when the parser meets whitespaces in prolog.</dd>
|
178
|
-
<dt><a name="label-31" id="label-31"><code>XMLScan::Visitor#on_comment(<var>str</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_comment" -->
|
179
|
-
<dd>
|
180
|
-
Called when the parser meets a comment.</dd>
|
181
|
-
<dt><a name="label-32" id="label-32"><code>XMLScan::Visitor#on_pi(<var>target</var>, <var>pi</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_pi" -->
|
182
|
-
<dd>
|
183
|
-
Called when the parser meets a processing instruction.</dd>
|
184
|
-
<dt><a name="label-33" id="label-33"><code>XMLScan::Visitor#on_chardata(<var>str</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_chardata" -->
|
185
|
-
<dd>
|
186
|
-
Called when the parser meets character data.</dd>
|
187
|
-
<dt><a name="label-34" id="label-34"><code>XMLScan::Visitor#on_cdata(<var>str</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_cdata" -->
|
188
|
-
<dd>
|
189
|
-
Called when the parser meets a CDATA section.</dd>
|
190
|
-
<dt><a name="label-35" id="label-35"><code>XMLScan::Visitor#on_entityref(<var>ref</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_entityref" -->
|
191
|
-
<dd>
|
192
|
-
Called when the parser meets a general entity reference
|
193
|
-
in a place except an attribute value.</dd>
|
194
|
-
<dt><a name="label-36" id="label-36"><code>XMLScan::Visitor#on_charref(<var>code</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_charref" -->
|
195
|
-
<dt><a name="label-37" id="label-37"><code>XMLScan::Visitor#on_charref_hex(<var>code</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_charref_hex" -->
|
196
|
-
<dd>
|
197
|
-
Called when the parser meets a character reference
|
198
|
-
in a place except an attribute value.
|
199
|
-
When the character code is represented by decimals,
|
200
|
-
on_charref is called. When by hexadecimals, on_charref_hex
|
201
|
-
is called. <var>code</var> is an integer.</dd>
|
202
|
-
<dt><a name="label-38" id="label-38"><code>XMLScan::Visitor#on_stag(<var>name</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_stag" -->
|
203
|
-
<dt><a name="label-39" id="label-39"><code>XMLScan::Visitor#on_attribute(<var>name</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_attribute" -->
|
204
|
-
<dt><a name="label-40" id="label-40"><code>XMLScan::Visitor#on_attr_value(<var>str</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_attr_value" -->
|
205
|
-
<dt><a name="label-41" id="label-41"><code>XMLScan::Visitor#on_attr_entityref(<var>ref</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_attr_entityref" -->
|
206
|
-
<dt><a name="label-42" id="label-42"><code>XMLScan::Visitor#on_attr_charref(<var>code</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_attr_charref" -->
|
207
|
-
<dt><a name="label-43" id="label-43"><code>XMLScan::Visitor#on_attr_charref_hex(<var>code</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_attr_charref_hex" -->
|
208
|
-
<dt><a name="label-44" id="label-44"><code>XMLScan::Visitor#on_attribute_end(<var>name</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_attribute_end" -->
|
209
|
-
<dt><a name="label-45" id="label-45"><code>XMLScan::Visitor#on_stag_end_empty(<var>name</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_stag_end_empty" -->
|
210
|
-
<dt><a name="label-46" id="label-46"><code>XMLScan::Visitor#on_stag_end(<var>name</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_stag_end" -->
|
211
|
-
<dd>
|
212
|
-
<p>Called when the parser meets a start tag.</p>
|
213
|
-
<pre><hoge fuga="foo&bar;&#38;&#x26;baz" >
|
214
|
-
^ ^ ^ ^ ^ ^ ^ ^ ^
|
215
|
-
1 2 3 4 5 6 7 8 9
|
216
|
-
|
217
|
-
method argument
|
218
|
-
------------------------------------
|
219
|
-
1: on_stag ('hoge')
|
220
|
-
2: on_attribute ('fuga')
|
221
|
-
3: on_attr_value ('foo')
|
222
|
-
4: on_attr_entityref ('bar')
|
223
|
-
5: on_attr_charref (38)
|
224
|
-
6: on_attr_charref_hex (38)
|
225
|
-
7: on_attr_value ('baz')
|
226
|
-
8: on_attribute_end ('fuga')
|
227
|
-
9: on_stag_end ('hoge')
|
228
|
-
or
|
229
|
-
on_stag_end_empty ('hoge')</pre>
|
230
|
-
<p>When a start tag is found, both on_stag and corresponding
|
231
|
-
either on_stag_end or on_stag_end_empty method are always
|
232
|
-
called. Any other methods are called only when at least one
|
233
|
-
attribute is found in the start tag.</p>
|
234
|
-
<p>When an attribute is found, both on_attribute and
|
235
|
-
on_attribute_end method are always called. If the attribute
|
236
|
-
value is empty, only these two methods are called.</p>
|
237
|
-
<p>When the parser meets a general entity reference in an
|
238
|
-
attribute value, it calls on_attr_entityref method.
|
239
|
-
When the parser meets a character reference in an attribute
|
240
|
-
value, it calls either on_attr_charref or on_attr_charref_hex method.</p>
|
241
|
-
<p>If the tag is an empty element tag, on_stag_end_empty method
|
242
|
-
is called instead of on_stag_end method.</p></dd>
|
243
|
-
<dt><a name="label-47" id="label-47"><code>XMLScan::Visitor#on_etag(<var>name</var>)</code></a></dt><!-- RDLabel: "XMLScan::Visitor#on_etag" -->
|
244
|
-
<dd>
|
245
|
-
Called when the parser meets an end tag.</dd>
|
246
|
-
</dl>
|
247
|
-
<h3><a name="label-48" id="label-48">XMLScan::XMLScanner</a></h3><!-- RDLabel: "XMLScan::XMLScanner" -->
|
248
|
-
<p>The scanner which tokenizes an XML document and recognizes tags,
|
249
|
-
and so on.</p>
|
250
|
-
<p>The conformance of XMLScan::XMLScanner to the specification
|
251
|
-
is described in another document.</p>
|
252
|
-
<h4><a name="label-49" id="label-49">SuperClass:</a></h4><!-- RDLabel: "SuperClass:" -->
|
253
|
-
<ul>
|
254
|
-
<li>Object</li>
|
255
|
-
</ul>
|
256
|
-
<h4><a name="label-50" id="label-50">Class Methods:</a></h4><!-- RDLabel: "Class Methods:" -->
|
257
|
-
<dl>
|
258
|
-
<dt><a name="label-51" id="label-51"><code>XMLScan::XMLScanner.new(<var>visitor</var>[, <var>option</var> ...])</code></a></dt><!-- RDLabel: "XMLScan::XMLScanner.new" -->
|
259
|
-
<dd>
|
260
|
-
<p>Creates an instance. <var>visitor</var> is an instance of
|
261
|
-
<a href="#label-15">XMLScan::Visitor</a> and receives the result of parsing
|
262
|
-
from the XMLScan::XMLScanner object.</p>
|
263
|
-
<p>You can specify one or more <var>option</var>s as strings or symbols.
|
264
|
-
XMLScan::XMLScanner's options are as follows:</p>
|
265
|
-
<dl>
|
266
|
-
<dt><a name="label-52" id="label-52">'strict_char'</a></dt><!-- RDLabel: "'strict_char'" -->
|
267
|
-
<dd>
|
268
|
-
This option is enabled after
|
269
|
-
<code>require 'xmlscan/xmlchar'</code>.
|
270
|
-
XMLScan::XMLScanner checks whether an XML document includes
|
271
|
-
an illegal character. The performance decreases sharply.
|
272
|
-
</dd>
|
273
|
-
</dl></dd>
|
274
|
-
</dl>
|
275
|
-
<h4><a name="label-53" id="label-53">Methods:</a></h4><!-- RDLabel: "Methods:" -->
|
276
|
-
<dl>
|
277
|
-
<dt><a name="label-54" id="label-54"><code>XMLScan::XMLScanner#kcode= <var>arg</var></code></a></dt><!-- RDLabel: "XMLScan::XMLScanner#kcode= arg" -->
|
278
|
-
<dd>
|
279
|
-
Sets CES. Available values for <var>arg</var> are same as $KCODE
|
280
|
-
except nil. If <var>arg</var> is nil, $KCODE decides the CES.</dd>
|
281
|
-
<dt><a name="label-55" id="label-55"><code>XMLScan::XMLScanner#kcode</code></a></dt><!-- RDLabel: "XMLScan::XMLScanner#kcode" -->
|
282
|
-
<dd>
|
283
|
-
Returns CES. The format of the return value is same as
|
284
|
-
Regexp#kcode. If this method returns nil, it represents that
|
285
|
-
$KCODE decides the CES.</dd>
|
286
|
-
<dt><a name="label-56" id="label-56"><code>XMLScan::XMLScanner#parse(<var>source</var>)</code></a></dt><!-- RDLabel: "XMLScan::XMLScanner#parse" -->
|
287
|
-
<dd>
|
288
|
-
Parses <var>source</var> as an XML document. <var>source</var> must be
|
289
|
-
a string, an array of strings, or an object which responds to
|
290
|
-
gets method which behaves same as IO#gets does.</dd>
|
291
|
-
</dl>
|
292
|
-
<h3><a name="label-57" id="label-57">XMLScan::XMLParser</a></h3><!-- RDLabel: "XMLScan::XMLParser" -->
|
293
|
-
<p>The non-validating XML parser.</p>
|
294
|
-
<p>The conformance of XMLScan::XMLParser to the specification
|
295
|
-
is described in another document.</p>
|
296
|
-
<h4><a name="label-58" id="label-58">SuperClass:</a></h4><!-- RDLabel: "SuperClass:" -->
|
297
|
-
<ul>
|
298
|
-
<li><a href="#label-48">XMLScan::XMLScanner</a></li>
|
299
|
-
</ul>
|
300
|
-
<h4><a name="label-59" id="label-59">Class Methods:</a></h4><!-- RDLabel: "Class Methods:" -->
|
301
|
-
<dl>
|
302
|
-
<dt><a name="label-60" id="label-60"><code>XMLScan::XMLParser.new(<var>visitor</var>[, <var>option</var> ...])</code></a></dt><!-- RDLabel: "XMLScan::XMLParser.new" -->
|
303
|
-
<dd>
|
304
|
-
<p>XMLScan::XMLParser makes sure the following for each
|
305
|
-
method of <var>visitor</var>:</p>
|
306
|
-
<dl>
|
307
|
-
<dt><a name="label-61" id="label-61"><a href="#label-38">XMLScan::Visitor#on_stag</a></a></dt><!-- RDLabel: "XMLScan::Visitor#on_stag" -->
|
308
|
-
<dd>
|
309
|
-
After calling this method, XMLScan::XMLParser always calls
|
310
|
-
corresponding <a href="#label-47">XMLScan::Visitor#on_etag</a> method.
|
311
|
-
</dd>
|
312
|
-
</dl>
|
313
|
-
<p>In addition, if you never intend error recovery, method calls
|
314
|
-
which must not be occurred in a well-formed XML document are
|
315
|
-
all suppressed.</p></dd>
|
316
|
-
</dl>
|
317
|
-
<h3><a name="label-62" id="label-62">XMLScan::HTMLScanner</a></h3><!-- RDLabel: "XMLScan::HTMLScanner" -->
|
318
|
-
<p>An HTML parser based on <a href="#label-48">XMLScan::XMLScanner</a>.</p>
|
319
|
-
<p>The conformance of XMLScan::HTMLScanner to the specification
|
320
|
-
is described in another document.</p>
|
321
|
-
<h4><a name="label-63" id="label-63">SuperClass:</a></h4><!-- RDLabel: "SuperClass:" -->
|
322
|
-
<ul>
|
323
|
-
<li><a href="#label-48">XMLScan::XMLScanner</a></li>
|
324
|
-
</ul>
|
325
|
-
<h4><a name="label-64" id="label-64">Class Methods:</a></h4><!-- RDLabel: "Class Methods:" -->
|
326
|
-
<dl>
|
327
|
-
<dt><a name="label-65" id="label-65"><code>XMLScan::HTMLScanner.new(<var>visitor</var>[, <var>option</var> ...])</code></a></dt><!-- RDLabel: "XMLScan::HTMLScanner.new" -->
|
328
|
-
<dd>
|
329
|
-
XMLScan::HTMLScanner makes sure the following for each
|
330
|
-
method of <var>visitor</var>:
|
331
|
-
<dl>
|
332
|
-
<dt><a name="label-66" id="label-66"><a href="#label-23">XMLScan::Visitor#on_xmldecl</a></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl" -->
|
333
|
-
<dt><a name="label-67" id="label-67"><a href="#label-24">XMLScan::Visitor#on_xmldecl_version</a></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_version" -->
|
334
|
-
<dt><a name="label-68" id="label-68"><a href="#label-25">XMLScan::Visitor#on_xmldecl_encoding</a></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_encoding" -->
|
335
|
-
<dt><a name="label-69" id="label-69"><a href="#label-26">XMLScan::Visitor#on_xmldecl_standalone</a></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_standalone" -->
|
336
|
-
<dt><a name="label-70" id="label-70"><a href="#label-28">XMLScan::Visitor#on_xmldecl_end</a></a></dt><!-- RDLabel: "XMLScan::Visitor#on_xmldecl_end" -->
|
337
|
-
<dd>
|
338
|
-
An XML declaration never appears in an HTML document,
|
339
|
-
so XMLScan::HTMLScanner never calls these methods.
|
340
|
-
</dd>
|
341
|
-
<dt><a name="label-71" id="label-71"><a href="#label-45">XMLScan::Visitor#on_stag_end_empty</a></a></dt><!-- RDLabel: "XMLScan::Visitor#on_stag_end_empty" -->
|
342
|
-
<dd>
|
343
|
-
An empty element tag never appears in an HTML document,
|
344
|
-
so XMLScan::HTMLScanner never calls this method.
|
345
|
-
An empty element tag causes a parse error.
|
346
|
-
</dd>
|
347
|
-
<dt><a name="label-72" id="label-72"><a href="#label-18">XMLScan::Visitor#wellformed_error</a></a></dt><!-- RDLabel: "XMLScan::Visitor#wellformed_error" -->
|
348
|
-
<dd>
|
349
|
-
There is no well-formedness constraint for HTML,
|
350
|
-
so XMLScan::HTMLScanner never calls this method.
|
351
|
-
</dd>
|
352
|
-
</dl></dd>
|
353
|
-
</dl>
|
354
|
-
|
355
|
-
</body>
|
356
|
-
</html>
|
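The callback order that the removed manual documents for the start tag `<hoge fuga="foo&bar;&#38;&#x26;baz" >` can be illustrated with a small recorder. This is only a sketch under stated assumptions: it does not load xmlscan, `RecordingVisitor` and `replay_start_tag` are hypothetical names introduced here, and the calls are replayed by hand in the order the manual's table lists them rather than produced by a real `XMLScan::XMLScanner`.

```ruby
# Hypothetical stand-in for a class that would include XMLScan::Visitor.
# It only records each callback it receives; xmlscan itself is not loaded.
class RecordingVisitor
  attr_reader :events

  def initialize
    @events = []
  end

  # Callback names taken from the manual's start-tag table.
  %i[on_stag on_attribute on_attr_value on_attr_entityref
     on_attr_charref on_attr_charref_hex on_attribute_end
     on_stag_end on_stag_end_empty].each do |callback|
    define_method(callback) { |arg| @events << [callback, arg] }
  end
end

# Hand-written replay (an assumption, not parser output) of the sequence
# the manual documents for: <hoge fuga="foo&bar;&#38;&#x26;baz" >
def replay_start_tag(visitor)
  visitor.on_stag('hoge')
  visitor.on_attribute('fuga')
  visitor.on_attr_value('foo')
  visitor.on_attr_entityref('bar')
  visitor.on_attr_charref(38)       # &#38; (decimal)
  visitor.on_attr_charref_hex(38)   # &#x26; (hexadecimal, 0x26 == 38)
  visitor.on_attr_value('baz')
  visitor.on_attribute_end('fuga')
  visitor.on_stag_end('hoge')       # on_stag_end_empty for an empty element tag
  visitor
end

visitor = replay_start_tag(RecordingVisitor.new)
visitor.events.each { |name, arg| puts "#{name}(#{arg.inspect})" }
```

With the real library, the manual suggests the equivalent would be to mix `XMLScan::Visitor` into the class and pass an instance to `XMLScan::XMLScanner.new`, then call `parse` on the source; that wiring is left out here because it cannot be checked without the removed files.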
@@ -1,402 +0,0 @@
|
|
1
|
-
=begin
|
2
|
-
# $Id: manual.rd.src,v 1.1 2003/01/22 16:41:45 katsu Exp $
|
3
|
-
|
4
|
-
= xmlscan version 0.2 Reference Manual
|
5
|
-
|
6
|
-
This is a broken English version. If you find lexical or
|
7
|
-
grammatical mistakes, or strange expressions (including kidding,
|
8
|
-
unnatural or unclear ones) in this document, please
|
9
|
-
((<let me know|URL:mailto:katsu@blue.sky.or.jp>)).
|
10
|
-
|
11
|
-
== Abstract
|
12
|
-
|
13
|
-
XMLscan is one of non-validating XML parser written in 100%
|
14
|
-
pure Ruby.
|
15
|
-
|
16
|
-
XMLscan's features are as follows:
|
17
|
-
|
18
|
-
: 100% pure Ruby
|
19
|
-
XMLscan doesn't require any extension libraries, so
|
20
|
-
it completely works only with a Ruby interpreter version
|
21
|
-
1.6 or above.
|
22
|
-
(It also needs no standard-bundled extension library.)
|
23
|
-
|
24
|
-
: Compliant to the specification
|
25
|
-
XMLscan has been developed to satisfy all conditions,
|
26
|
-
described in XML 1.0 Specification and required to a
|
27
|
-
non-validating XML processor
|
28
|
-
|
29
|
-
: High-speed
|
30
|
-
XMLscan is, probably, the fastest parser among all
|
31
|
-
existing XML/HTML parsers written in pure Ruby.
|
32
|
-
|
33
|
-
: Support for various CES.
|
34
|
-
XMLscan can parse an XML document encoded in at least
|
35
|
-
iso-8859-*, EUC-*, Shift_JIS, and UTF-8 as it is.
|
36
|
-
UTF-16 is not supported directly, though.
|
37
|
-
|
38
|
-
: Just parsing
|
39
|
-
The role of xmlscan is just to parse an XML document.
|
40
|
-
XMLscan doesn't provide high-level features to easily
|
41
|
-
handle an XML document. XMLscan is assumed to be used as
|
42
|
-
a core part of a library providing such features.
|
43
|
-
|
44
|
-
: HTML
|
45
|
-
XMLscan contains htmlscan, an HTML parser.
|
46
|
-
|
47
|
-
|
48
|
-
== Character encodings
|
49
|
-
|
50
|
-
By default, the value of global variable $KCODE decides
|
51
|
-
which CES (character encoding scheme) is assumed for xmlscan
|
52
|
-
to parse an XML document.
|
53
|
-
You need to set $KCODE or ((<XMLScan::XMLScanner#kcode=>)) to
|
54
|
-
an appropriate value to parse an XML document encoded in EUC-*,
|
55
|
-
Shift_JIS, or UTF-8.
|
56
|
-
|
57
|
-
UTF-16 is not supported directly. You should convert it into
|
58
|
-
UTF-8 before parsing.
|
59
|
-
|
60
|
-
|
61
|
-
== XML Namespaces
|
62
|
-
|
63
|
-
XML Namespaces have already been implemented in
|
64
|
-
xmlscan/namespace.rb. However, since its interface is going
|
65
|
-
to be modified, this feature is undocumented now.
|
66
|
-
|
67
|
-
|
68
|
-
|
69
|
-
== Class Reference
|
70
|
-
|
71
|
-
|
72
|
-
=== XMLScan::Error
|
73
|
-
|
74
|
-
The superclass for all exceptions related to xmlscan.
|
75
|
-
|
76
|
-
These exceptions are raised by XMLScan::Visitor
|
77
|
-
by default when it receives an error report from a parser,
|
78
|
-
such as XMLScan::XMLScanner or XMLScan::XMLParser.
|
79
|
-
Each parser never raises these exceptions by itself.
|
80
|
-
|
81
|
-
#The following exceptions are defined in xmlscan/scanner.rb:
|
82
|
-
|
83
|
-
: XMLScan::ParseError
|
84
|
-
|
85
|
-
An error other than a constraint violation, for example,
|
86
|
-
an XML document that does not match a production.
|
87
|
-
|
88
|
-
: XMLScan::NotWellFormedError
|
89
|
-
|
90
|
-
Raised when an XML document violates a well-formedness
|
91
|
-
constraint.
|
92
|
-
|
93
|
-
: XMLScan::NotValidError
|
94
|
-
|
95
|
-
Raised when an XML document violates a validity constraint.
|
96
|
-
|
97
|
-
|
98
|
-
=== XMLScan::Visitor
|
99
|
-
|
100
|
-
Mix-in for receiving the result of parsing an XML document.
|
101
|
-
|
102
|
-
Each parser included in xmlscan parses an XML document from
|
103
|
-
the beginning, and calls the corresponding method of the given instance of
|
104
|
-
XMLScan::Visitor for each syntactic element, such as a tag.
|
105
|
-
It is ensured that these calls are made in order of appearance
|
106
|
-
in the document from the beginning.
|
107
|
-
|
108
|
-
==== Methods:
|
109
|
-
|
110
|
-
Unless otherwise noted, the following methods do nothing by
|
111
|
-
default.
|
112
|
-
|
113
|
-
--- XMLScan::Visitor#parse_error(msg)
|
114
|
-
|
115
|
-
Called when the parser meets an error other than a constraint
|
116
|
-
violation, for example, when an XML document does not match
|
117
|
-
a production. By default, this method raises
|
118
|
-
((<XMLScan::ParseError>)) exception. If no exception is
|
119
|
-
raised and this method returns normally, the parser recovers from
|
120
|
-
the error and continues to parse.
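The recovery rule above can be sketched without the library itself. The snippet below is a rough illustration only: `SketchXMLScan`, `StrictVisitor`, `LenientVisitor`, and `report_problems` are hypothetical stand-ins written for this example, not part of xmlscan. It assumes nothing beyond what the text states, namely that the default `parse_error` raises and that returning normally lets parsing continue.

```ruby
# Stand-in definitions so the sketch runs without xmlscan installed.
# They mimic only the behaviour described in the manual text and are
# NOT the library's own code.
module SketchXMLScan
  class Error < StandardError; end
  class ParseError < Error; end

  module Visitor
    # Default behaviour as documented: raise on a parse error.
    def parse_error(msg)
      raise ParseError, msg
    end
  end
end

# Keeps the default behaviour, so the first error stops everything.
class StrictVisitor
  include SketchXMLScan::Visitor
end

# Records the message and returns normally, which (per the manual)
# signals the parser to recover and keep going.
class LenientVisitor
  include SketchXMLScan::Visitor

  attr_reader :errors

  def initialize
    @errors = []
  end

  def parse_error(msg)
    @errors << msg
    nil
  end
end

# Hypothetical driver standing in for a parser that reports two problems.
def report_problems(visitor)
  visitor.parse_error("first problem")
  visitor.parse_error("second problem")
  :finished
end

lenient = LenientVisitor.new
lenient_result = report_problems(lenient)

strict_result =
  begin
    report_problems(StrictVisitor.new)
  rescue SketchXMLScan::ParseError => e
    "stopped: #{e.message}"
  end
```

With the lenient visitor both reports are collected and the driver runs to the end. With the strict visitor the first report raises and the second is never delivered.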
|
121
|
-
|
122
|
-
--- XMLScan::Visitor#wellformed_error(msg)
|
123
|
-
|
124
|
-
Called when the parser meets a well-formedness constraint
|
125
|
-
violation. By default, this method raises
|
126
|
-
((<XMLScan::NotWellFormedError>)) exception. If no exception
|
127
|
-
is raised and this method returns normally, the parser recovers from
|
128
|
-
the error and continues to parse.
|
129
|
-
|
130
|
-
--- XMLScan::Visitor#valid_error(msg)
|
131
|
-
|
132
|
-
Called when the parser meets a validity constraint
|
133
|
-
violation. By default, this method raises
|
134
|
-
((<XMLScan::NotValidError>)) exception. If no exception
|
135
|
-
is raised and this method returns normally, the parser recovers from
|
136
|
-
the error and continues to parse.
|
137
|
-
|
138
|
-
Note that the current version of xmlscan includes no validating XML
|
139
|
-
processor. This method is reserved for future versions.
|
140
|
-
|
141
|
-
--- XMLScan::Visitor#warning(msg)
|
142
|
-
|
143
|
-
Called when the parser meets something that is not an error but is
|
144
|
-
discouraged, or syntax which xmlscan is not able to parse.
|
145
|
-
|
146
|
-
--- XMLScan::Visitor#on_start_document
|
147
|
-
|
148
|
-
Called just before the parser starts parsing an XML document.
|
149
|
-
After this method is called, corresponding
|
150
|
-
((<XMLScan::Visitor#on_end_document>)) method is always called.
|
151
|
-
|
152
|
-
--- XMLScan::Visitor#on_end_document
|
153
|
-
|
154
|
-
Called after the parser reaches the end of an XML document.
|
155
|
-
|
156
|
-
--- XMLScan::Visitor#on_xmldecl
|
157
|
-
--- XMLScan::Visitor#on_xmldecl_version(str)
|
158
|
-
--- XMLScan::Visitor#on_xmldecl_encoding(str)
|
159
|
-
--- XMLScan::Visitor#on_xmldecl_standalone(str)
|
160
|
-
--- XMLScan::Visitor#on_xmldecl_other(name, value)
|
161
|
-
--- XMLScan::Visitor#on_xmldecl_end
|
162
|
-
|
163
|
-
Called when the parser meets an XML declaration.
|
164
|
-
|
165
|
-
<?xml version="1.0" encoding="euc-jp" standalone="yes" ?>
|
166
|
-
^ ^ ^ ^ ^
|
167
|
-
1 2 3 4 5
|
168
|
-
|
169
|
-
method argument
|
170
|
-
--------------------------------------
|
171
|
-
1: on_xmldecl
|
172
|
-
2: on_xmldecl_version ("1.0")
|
173
|
-
3: on_xmldecl_encoding ("euc-jp")
|
174
|
-
4: on_xmldecl_standalone ("yes")
|
175
|
-
5: on_xmldecl_end
|
176
|
-
|
177
|
-
When an XML declaration is found, both on_xmldecl and
|
178
|
-
on_xmldecl_end methods are always called. The other methods
|
179
|
-
are called only when the corresponding syntaxes are found.
|
180
|
-
|
181
|
-
When a declaration other than version, encoding, or standalone
|
182
|
-
is found in an XML declaration, on_xmldecl_other method is
|
183
|
-
called. Since such a declaration is not permitted, note that
|
184
|
-
the parser always calls ((<XMLScan::Visitor#parse_error>)) method
|
185
|
-
before calling on_xmldecl_other method.
|
186
|
-
|
187
|
-
--- XMLScan::Visitor#on_doctype(root, pubid, sysid)
|
188
|
-
|
189
|
-
Called when the parser meets a document type declaration.
|
190
|
-
|
191
|
-
document argument
|
192
|
-
--------------------------------------------------------------
|
193
|
-
1: <!DOCTYPE foo> ('foo', nil, nil)
|
194
|
-
2: <!DOCTYPE foo SYSTEM "bar"> ('foo', nil, 'bar')
|
195
|
-
3: <!DOCTYPE foo PUBLIC "bar"> ('foo', 'bar', nil )
|
196
|
-
4: <!DOCTYPE foo PUBLIC "bar" "baz"> ('foo', 'bar', 'baz')
|
197
|
-
|
198
|
-
--- XMLScan::Visitor#on_prolog_space(str)
|
199
|
-
|
200
|
-
Called when the parser meets whitespace in the prolog.
|
201
|
-
|
202
|
-
--- XMLScan::Visitor#on_comment(str)
|
203
|
-
|
204
|
-
Called when the parser meets a comment.
|
205
|
-
|
206
|
-
--- XMLScan::Visitor#on_pi(target, pi)
|
207
|
-
|
208
|
-
Called when the parser meets a processing instruction.
|
209
|
-
|
210
|
-
--- XMLScan::Visitor#on_chardata(str)
|
211
|
-
|
212
|
-
Called when the parser meets character data.
|
213
|
-
|
214
|
-
--- XMLScan::Visitor#on_cdata(str)
|
215
|
-
|
216
|
-
Called when the parser meets a CDATA section.
|
217
|
-
|
218
|
-
--- XMLScan::Visitor#on_entityref(ref)
|
219
|
-
|
220
|
-
Called when the parser meets a general entity reference
|
221
|
-
anywhere other than in an attribute value.
|
222
|
-
|
223
|
-
--- XMLScan::Visitor#on_charref(code)
|
224
|
-
--- XMLScan::Visitor#on_charref_hex(code)
|
225
|
-
|
226
|
-
Called when the parser meets a character reference
|
227
|
-
anywhere other than in an attribute value.
|
228
|
-
When the character code is written in decimal,
|
229
|
-
on_charref is called. When it is written in hexadecimal, on_charref_hex
|
230
|
-
is called. ((|code|)) is an integer.
|
231
|
-
|
232
|
-
--- XMLScan::Visitor#on_stag(name)
|
233
|
-
--- XMLScan::Visitor#on_attribute(name)
|
234
|
-
--- XMLScan::Visitor#on_attr_value(str)
|
235
|
-
--- XMLScan::Visitor#on_attr_entityref(ref)
|
236
|
-
--- XMLScan::Visitor#on_attr_charref(code)
|
237
|
-
--- XMLScan::Visitor#on_attr_charref_hex(code)
|
238
|
-
--- XMLScan::Visitor#on_attribute_end(name)
|
239
|
-
--- XMLScan::Visitor#on_stag_end_empty(name)
|
240
|
-
--- XMLScan::Visitor#on_stag_end(name)
|
241
|
-
|
242
|
-
Called when the parser meets a start tag.
|
243
|
-
|
244
|
-
<hoge fuga="foo&bar;&#38;&#x26;baz" >
|
245
|
-
^ ^ ^ ^ ^ ^ ^ ^ ^
|
246
|
-
1 2 3 4 5 6 7 8 9
|
247
|
-
|
248
|
-
method argument
|
249
|
-
------------------------------------
|
250
|
-
1: on_stag ('hoge')
|
251
|
-
2: on_attribute ('fuga')
|
252
|
-
3: on_attr_value ('foo')
|
253
|
-
4: on_attr_entityref ('bar')
|
254
|
-
5: on_attr_charref (38)
|
255
|
-
6: on_attr_charref_hex (38)
|
256
|
-
7: on_attr_value ('baz')
|
257
|
-
8: on_attribute_end ('fuga')
|
258
|
-
9: on_stag_end ('hoge')
|
259
|
-
or
|
260
|
-
on_stag_end_empty ('hoge')
|
261
|
-
|
262
|
-
When a start tag is found, both on_stag and corresponding
|
263
|
-
either on_stag_end or on_stag_end_empty method are always
|
264
|
-
called. Any other methods are called only when at least one
|
265
|
-
attribute is found in the start tag.
|
266
|
-
|
267
|
-
When an attribute is found, both on_attribute and
|
268
|
-
on_attribute_end methods are always called. If the attribute
|
269
|
-
value is empty, only these two methods are called.
|
270
|
-
|
271
|
-
When the parser meets a general entity reference in an
|
272
|
-
attribute value, it calls on_attr_entityref method.
|
273
|
-
When the parser meets a character reference in an attribute
|
274
|
-
value, it calls either on_attr_charref or on_attr_charref_hex method.
|
275
|
-
|
276
|
-
If the tag is an empty element tag, on_stag_end_empty method
|
277
|
-
is called instead of on_stag_end method.
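As a rough illustration of the callback order in the table above, the sketch below rebuilds an attribute value from the individual callbacks. `AttributeCollector` is a hypothetical class written for this example. In real use it would presumably include XMLScan::Visitor and be driven by a parser, but here the callbacks are invoked by hand in the documented order so the snippet runs without xmlscan. Leaving the entity reference unresolved as `&bar;` is a choice made for the example, not something the library prescribes.

```ruby
# Hypothetical visitor that reassembles start tags from the callbacks
# listed in the table. The callbacks are called by hand further down;
# no real parser is involved.
class AttributeCollector
  attr_reader :tags

  def initialize
    @tags = []
  end

  def on_stag(name)
    @current = { name: name, attributes: {}, empty: false }
  end

  def on_attribute(_name)
    @value = +""                       # fresh, mutable buffer per attribute
  end

  def on_attr_value(str)
    @value << str
  end

  def on_attr_entityref(ref)
    @value << "&" << ref << ";"        # keep the reference unresolved
  end

  def on_attr_charref(code)
    @value << [code].pack("U")         # code is an integer code point
  end

  def on_attr_charref_hex(code)
    on_attr_charref(code)              # same integer, different notation in the source
  end

  def on_attribute_end(name)
    @current[:attributes][name] = @value
  end

  def on_stag_end(_name)
    @tags << @current
  end

  def on_stag_end_empty(_name)
    @current[:empty] = true
    @tags << @current
  end
end

# Replay steps 1 to 9 of the table by hand.
collector = AttributeCollector.new
collector.on_stag("hoge")
collector.on_attribute("fuga")
collector.on_attr_value("foo")
collector.on_attr_entityref("bar")
collector.on_attr_charref(38)
collector.on_attr_charref_hex(38)
collector.on_attr_value("baz")
collector.on_attribute_end("fuga")
collector.on_stag_end("hoge")

stag = collector.tags.first
```

Code point 38 is the ampersand, so both character references contribute a literal `&` to the collected value.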
|
278
|
-
|
279
|
-
--- XMLScan::Visitor#on_etag(name)
|
280
|
-
|
281
|
-
Called when the parser meets an end tag.
|
282
|
-
|
283
|
-
|
284
|
-
|
285
|
-
=== XMLScan::XMLScanner
|
286
|
-
|
287
|
-
The scanner which tokenizes an XML document and recognizes tags,
|
288
|
-
and so on.
|
289
|
-
|
290
|
-
The conformance of XMLScan::XMLScanner to the specification
|
291
|
-
is described in another document.
|
292
|
-
|
293
|
-
==== SuperClass:
|
294
|
-
|
295
|
-
* Object
|
296
|
-
|
297
|
-
==== Class Methods:
|
298
|
-
|
299
|
-
--- XMLScan::XMLScanner.new(visitor[, option ...])
|
300
|
-
|
301
|
-
Creates an instance. ((|visitor|)) is an instance of
|
302
|
-
((<XMLScan::Visitor>)) and receives the result of parsing
|
303
|
-
from the XMLScan::Scanner object.
|
304
|
-
|
305
|
-
You can specify one or more ((|option|)) as a string or symbol.
|
306
|
-
XMLScan::Scanner's options are as follows:
|
307
|
-
|
308
|
-
: 'strict_char'
|
309
|
-
|
310
|
-
This option is enabled after
|
311
|
-
(({require 'xmlscan/xmlchar'})).
|
312
|
-
XMLScan::Scanner checks whether an XML document includes
|
313
|
-
an illegal character. The performance decreases sharply.
|
314
|
-
|
315
|
-
==== Methods:
|
316
|
-
|
317
|
-
--- XMLScan::XMLScanner#kcode= arg
|
318
|
-
|
319
|
-
Sets the CES. Available values for ((|arg|)) are the same as for $KCODE
|
320
|
-
except nil. If ((|arg|)) is nil, $KCODE decides the CES.
|
321
|
-
|
322
|
-
--- XMLScan::XMLScanner#kcode
|
323
|
-
|
324
|
-
Returns the CES. The format of the return value is the same as that of
|
325
|
-
Regexp#kcode. If this method returns nil, it means that
|
326
|
-
$KCODE decides the CES.
|
327
|
-
|
328
|
-
--- XMLScan::XMLScanner#parse(source)
|
329
|
-
|
330
|
-
Parses ((|source|)) as an XML document. ((|source|)) must be
|
331
|
-
a string, an array of strings, or an object which responds to
|
332
|
-
a gets method which behaves the same as IO#gets.
|
333
|
-
|
334
|
-
|
335
|
-
=== XMLScan::XMLParser
|
336
|
-
|
337
|
-
The non-validating XML parser.
|
338
|
-
|
339
|
-
The conformance of XMLScan::XMLParser to the specification
|
340
|
-
is described in another document.
|
341
|
-
|
342
|
-
|
343
|
-
==== SuperClass:
|
344
|
-
|
345
|
-
* ((<XMLScan::XMLScanner>))
|
346
|
-
|
347
|
-
==== Class Methods:
|
348
|
-
|
349
|
-
--- XMLScan::XMLParser.new(visitor[, option ...])
|
350
|
-
|
351
|
-
XMLScan::XMLParser guarantees the following for each
|
352
|
-
method of ((|visitor|)):
|
353
|
-
|
354
|
-
: ((<XMLScan::Visitor#on_stag>))
|
355
|
-
|
356
|
-
After calling this method, XMLScan::Parser always calls the
|
357
|
-
corresponding ((<XMLScan::Visitor#on_etag>)) method.
|
358
|
-
|
359
|
-
In addition, if you never intend error recovery, method calls
|
360
|
-
which must not occur in a well-formed XML document are
|
361
|
-
all suppressed.
|
362
|
-
|
363
|
-
|
364
|
-
=== XMLScan::HTMLScanner
|
365
|
-
|
366
|
-
An HTML parser based on ((<XMLScan::XMLScanner>)).
|
367
|
-
|
368
|
-
The conformance of XMLScan::HTMLScanner to the specification
|
369
|
-
is described in another document.
|
370
|
-
|
371
|
-
==== SuperClass:
|
372
|
-
|
373
|
-
* ((<XMLScan::XMLScanner>))
|
374
|
-
|
375
|
-
==== Class Methods:
|
376
|
-
|
377
|
-
--- XMLScan::HTMLScanner.new(visitor[, option ...])
|
378
|
-
|
379
|
-
XMLScan::HTMLScanner guarantees the following for each
|
380
|
-
method of ((|visitor|)):
|
381
|
-
|
382
|
-
: ((<XMLScan::Visitor#on_xmldecl>))
|
383
|
-
: ((<XMLScan::Visitor#on_xmldecl_version>))
|
384
|
-
: ((<XMLScan::Visitor#on_xmldecl_encoding>))
|
385
|
-
: ((<XMLScan::Visitor#on_xmldecl_standalone>))
|
386
|
-
: ((<XMLScan::Visitor#on_xmldecl_end>))
|
387
|
-
|
388
|
-
An XML declaration never appears in an HTML document,
|
389
|
-
so XMLScan::HTMLScanner never calls these methods.
|
390
|
-
|
391
|
-
: ((<XMLScan::Visitor#on_stag_end_empty>))
|
392
|
-
|
393
|
-
An empty element tag never appears in an HTML document,
|
394
|
-
so XMLScan::HTMLScanner never calls this method.
|
395
|
-
An empty element tag causes a parse error.
|
396
|
-
|
397
|
-
: ((<XMLScan::Visitor#wellformed_error>))
|
398
|
-
|
399
|
-
There is no well-formedness constraint for HTML,
|
400
|
-
so XMLScan::HTMLScanner never calls this method.
|
401
|
-
|
402
|
-
=end
|