pandoc_refeq_mathml 0.1 → 0.2
- checksums.yaml +4 -4
- data/.gitignore +1 -0
- data/ChangeLog +7 -0
- data/README.en.rdoc +31 -10
- data/doc/PandocRefeqMathml.html +1428 -0
- data/doc/class_list.html +51 -0
- data/doc/css/common.css +1 -0
- data/doc/css/full_list.css +58 -0
- data/doc/css/style.css +497 -0
- data/doc/file.README.en.html +262 -0
- data/doc/file_list.html +56 -0
- data/doc/frames.html +17 -0
- data/doc/index.html +262 -0
- data/doc/js/app.js +314 -0
- data/doc/js/full_list.js +216 -0
- data/doc/js/jquery.js +4 -0
- data/doc/method_list.html +139 -0
- data/doc/top-level-namespace.html +110 -0
- data/lib/pandoc_refeq_mathml.rb +38 -11
- data/pandoc_refeq_mathml.gemspec +9 -9
- data/test/data/.gitignore +10 -0
- data/test/data/Makefile +32 -0
- data/test/data/shifted-numbered-headings.lua +17 -0
- data/test/data/try01.html +115 -0
- data/test/data/try01_latex.aux +32 -0
- data/test/data/try01_tmpl.tex +96 -0
- data/test/test_pandoc_refeq_mathml.rb +22 -15
- metadata +24 -3
@@ -0,0 +1,262 @@
|
|
1
|
+
<!DOCTYPE html>
|
2
|
+
<html>
|
3
|
+
<head>
|
4
|
+
<meta charset="UTF-8">
|
5
|
+
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
6
|
+
<title>
|
7
|
+
File: README.en
|
8
|
+
|
9
|
+
— Documentation by YARD 0.9.28
|
10
|
+
|
11
|
+
</title>
|
12
|
+
|
13
|
+
<link rel="stylesheet" href="css/style.css" type="text/css" />
|
14
|
+
|
15
|
+
<link rel="stylesheet" href="css/common.css" type="text/css" />
|
16
|
+
|
17
|
+
<script type="text/javascript">
|
18
|
+
pathId = "README.en";
|
19
|
+
relpath = '';
|
20
|
+
</script>
|
21
|
+
|
22
|
+
|
23
|
+
<script type="text/javascript" charset="utf-8" src="js/jquery.js"></script>
|
24
|
+
|
25
|
+
<script type="text/javascript" charset="utf-8" src="js/app.js"></script>
|
26
|
+
|
27
|
+
|
28
|
+
</head>
|
29
|
+
<body>
|
30
|
+
<div class="nav_wrap">
|
31
|
+
<iframe id="nav" src="file_list.html?1"></iframe>
|
32
|
+
<div id="resizer"></div>
|
33
|
+
</div>
|
34
|
+
|
35
|
+
<div id="main" tabindex="-1">
|
36
|
+
<div id="header">
|
37
|
+
<div id="menu">
|
38
|
+
|
39
|
+
<a href="_index.html">Index</a> »
|
40
|
+
<span class="title">File: README.en</span>
|
41
|
+
|
42
|
+
</div>
|
43
|
+
|
44
|
+
<div id="search">
|
45
|
+
|
46
|
+
<a class="full_list_link" id="class_list_link"
|
47
|
+
href="class_list.html">
|
48
|
+
|
49
|
+
<svg width="24" height="24">
|
50
|
+
<rect x="0" y="4" width="24" height="4" rx="1" ry="1"></rect>
|
51
|
+
<rect x="0" y="12" width="24" height="4" rx="1" ry="1"></rect>
|
52
|
+
<rect x="0" y="20" width="24" height="4" rx="1" ry="1"></rect>
|
53
|
+
</svg>
|
54
|
+
</a>
|
55
|
+
|
56
|
+
</div>
|
57
|
+
<div class="clear"></div>
|
58
|
+
</div>
|
59
|
+
|
60
|
+
<div id="content"><div id='filecontents'>
|
61
|
+
<h1 id="label-PandocRefeqMathml+-+ad+hoc+tool+to+modify+pandoc-converted+MathML+from+LaTeX">PandocRefeqMathml - ad hoc tool to modify pandoc-converted MathML from LaTeX</h1>
|
62
|
+
|
63
|
+
<h2 id="label-Summary">Summary</h2>
|
64
|
+
|
65
|
+
<p>This Ruby command-line command modifies a MathML file converted with <code>pandoc</code> from LaTeX.</p>
|
66
|
+
|
67
|
+
<p>Whereas <code>pandoc</code> is a great text-ish file converter, there are a few caveats, at the time of writing, in converting a LaTeX file to MathML.</p>
|
68
|
+
|
69
|
+
<p>A major caveat is that the generated MathML displays neither the equation numbers that LaTeX generates automatically by default for the <code>equation</code> and <code>eqnarray</code> environments nor their (LaTeX) labels. All the (LaTeX) <code>ref</code> references remain as raw label names, which read as a coded message to readers.</p>
|
70
|
+
|
71
|
+
<p>Another caveat is the alignments of equations in the <code>eqnarray</code> environment.</p>
|
72
|
+
|
73
|
+
<p>This tool is a bit of an ad hoc (dirty) hack to correct these points *in some basic situations*. “Basic” here means just the standard LaTeX commands, not external package-specific commands.</p>
|
74
|
+
|
75
|
+
<p>The full package of this module is found on the <a href="http://rubygems.org/gems/pandoc_refeq_mathml">PandocRefeqMathml Ruby Gems page</a> (with documentation generated from source annotations with yard) and on <a href="https://github.com/masasakano/pandoc_refeq_mathml">GitHub</a>.</p>
|
76
|
+
|
77
|
+
<h2 id="label-Background+and+constraints">Background and constraints</h2>
|
78
|
+
|
79
|
+
<p>Pandoc-converted MathML.html from LaTeX lacks the equation numbers that are present in the original LaTeX. <a href="https://github.com/lierdakil/pandoc-crossref">pandoc-crossref</a> offers a way to tackle the problem; however, its fix is far from perfect, with three or four major caveats.</p>
|
80
|
+
<ol><li>
|
81
|
+
<p>A single number is assigned to the whole set of equations in an <code>eqnarray</code> environment, which is inconsistent with LaTeX.</p>
|
82
|
+
</li><li>
|
83
|
+
<p>The LaTeX <tt>\nonumber</tt> is not taken into account.</p>
|
84
|
+
</li><li>
|
85
|
+
<p>Referencing text to an equation displays the original LaTeX label, as opposed to the equation number, which makes no sense to readers.</p>
|
86
|
+
</li><li>
|
87
|
+
<p>Because of points (1) and (2), the given equation numbers usually do not agree at all with the original document compiled by LaTeX.</p>
|
88
|
+
</li></ol>
|
89
|
+
|
90
|
+
<p>In LaTeX, you may reference equations 1 and 3 in a single <code>eqnarray</code> environment separately. However, because of point (1), that is not possible in pandoc-generated MathML. Besides, they are not referenced by equation number in the MathML in the first place (point 3).</p>
|
91
|
+
|
92
|
+
<p>This tool (command-line command) offers a way to fix these problems, albeit in a crude way. The command adds equation numbers that are guessed from the text in the annotation fields in <tt><math></tt> and LaTeX aux file (the latter of which is automatically generated as a byproduct when you compile a LaTeX document). Not all the numbers are recovered but only those that are referenced somewhere in the MathML file.</p>
|
93
|
+
|
94
|
+
<p>(Note that in principle, it should not be too difficult to modify the program so that all the labelled equations in LaTeX are labelled again in MathML. Nevertheless, it would be tricky to label equations that are not explicitly labelled in LaTeX because implicit numbering information is not available in the LaTeX aux file.)</p>
|
95
|
+
|
96
|
+
<p>The algorithm assumes a LaTeX standard aux file-format, the MathML having a link tag <tt><a></tt> with the attributes “data-reference-type=ref” and href to the label of the exact reference label in LaTeX (and the label should have no duplicates in the MathML) and also having the <tt>‘annotation[ encoding=“application/x-tex”]’</tt> tag in each math tag containing the original LaTeX code. The LaTeX code must have either the standard “equation” or “eqnarray” structures associated with the standard “label” tag with a simple content (if it contains, apart from the label string, something more than preceding or trailing white spaces, such as a comment, this algorithm would likely fail). If equations in an eqnarray environment have complicated nested structures like a matrix, I do not know how the algorithm of this routine handles them. Also, the LaTeX section numbering must be combinations of Arabic numbers, full-stops, and maybe capital letters (for Appendix) only.</p>
|
97
|
+
|
98
|
+
<p>Essentially, LaTeX has a huge amount of freedom and so I am afraid it would be a somewhat futile effort to deal with every possibility…</p>
|
99
|
+
|
100
|
+
<h3 id="label-Output+MathML+by+pandoc-2.19+converted+from+LaTeX">Output MathML by pandoc-2.19 converted from LaTeX</h3>
|
101
|
+
|
102
|
+
<p>Ordinary LaTeX inline maths expressions (e.g., <tt>$5\pi$</tt>) are expressed as follows:</p>
|
103
|
+
|
104
|
+
<pre class="code ruby"><code class="ruby"><math display="inline" xmlns="http://w..."><semantics>
|
105
|
+
<mrow><mn>5</mn><mi>π</mi></mrow>
|
106
|
+
<annotation encoding="application/x-tex">5\pi</annotation>
|
107
|
+
</semantics></math>
|
108
|
+
</code></pre>
|
109
|
+
|
110
|
+
<p>LaTeX’s <tt>\begin{equation}</tt> is converted as follows (n.b., the <tt><p></tt> tag may not be closed immediately after <tt></math></tt>; other ordinary sentences may follow):</p>
|
111
|
+
|
112
|
+
<pre class="code ruby"><code class="ruby"><p><math display="block" xmlns="http://w..."><semantics>
|
113
|
+
<mrow><mi>x</mi><mo>±</mo><mi>ϵ</mi></mrow>
|
114
|
+
<annotation encoding="application/x-tex">x \pm \epsilon \label{my_xe}</annotation>
|
115
|
+
</semantics></math>
|
116
|
+
</code></pre>
|
117
|
+
|
118
|
+
<p>LaTeX’s <tt>\begin{eqnarray}</tt> is converted as follows:</p>
|
119
|
+
|
120
|
+
<pre class="code ruby"><code class="ruby"><p><math display="block" xmlns="http://w..."><semantics><mtable>
|
121
|
+
<mtr><mtd columnalign="right"><mrow><mn>1</mn><mo>+</mo><mi>x</mi></mrow></mtd>
|
122
|
+
<mtd columnalign="left"><mo>=</mo></mtd>
|
123
|
+
<mtd columnalign="right"><mrow><mn>1</mn><mo>−</mo><mi>x</mi></mrow></mtd></mtr>
|
124
|
+
<mtr><mtd columnalign="right"></mtd>
|
125
|
+
<mtd columnalign="left"><mo>=</mo></mtd>
|
126
|
+
<mtd columnalign="right"><mfrac><mn>2</mn><mrow><mn>1</mn><mi>x</mi></mrow></mfrac></mtd></mtr>
|
127
|
+
</mtable><annotation encoding="application/x-tex">\begin{aligned}
|
128
|
+
1+x &amp; = &amp; 1-x \nonumber\\
|
129
|
+
&amp; = &amp; \frac{2}{1x} \label{eq_trivial}
|
130
|
+
\end{aligned}</annotation></semantics></math></p>
|
131
|
+
</code></pre>
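For an annotation like the eqnarray example above, the row that carries a given label could be located roughly as follows. This is only a minimal sketch, not the gem's actual code: the method name `row_index_of_label` is hypothetical, and it assumes that rows are separated by a plain `\\` with no nested environments (such as a matrix) inside a row.

```ruby
# Hypothetical helper: return the zero-based index of the row whose TeX
# source contains \label{LABEL}, or nil if no row carries that label.
# Rows are split on the LaTeX row separator "\\"; nested environments
# inside a row are NOT handled by this sketch.
def row_index_of_label(tex, label)
  # Strip the outer \begin{...} and \end{...} wrapper if present.
  body = tex.sub(/\A\s*\\begin\{\w+\*?\}/, '').sub(/\\end\{\w+\*?\}\s*\z/, '')
  rows = body.split(/\\\\/)
  rows.index { |row| row.match?(/\\label\{\s*#{Regexp.escape(label)}\s*\}/) }
end

tex = <<'TEX'
\begin{aligned}
1+x & = & 1-x \nonumber\\
 & = & \frac{2}{1x} \label{eq_trivial}
\end{aligned}
TEX

p row_index_of_label(tex, 'eq_trivial')  # zero-based index of the labelled row
```

The returned index would correspond to the position of the `<mtr>` row in the `<mtable>`, assuming pandoc emits one `<mtr>` per TeX row.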
|
132
|
+
|
133
|
+
<p>They are referred to from other text as follows:</p>
|
134
|
+
|
135
|
+
<pre class="code ruby"><code class="ruby"><p>Eq.<a href="#eq_trivial" data-reference-type="ref"
|
136
|
+
data-reference="eq_trivial">[eq_trivial]</a> was easy...
|
137
|
+
</code></pre>
|
138
|
+
|
139
|
+
<h3 id="label-Algorithm">Algorithm</h3>
|
140
|
+
|
141
|
+
<p>For fixing the alignments to follow the standard eqnarray alignments (right, centre, and left in this order), the program searches for <tt><mtable></tt> and rewrites the <code>columnalign</code> attributes in the <tt><mtd></tt> tags.</p>
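The alignment fix described above can be illustrated with a rough Ruby sketch. It is not the gem's implementation (which depends on Nokogiri); the names `EQNARRAY_ALIGN` and `fix_eqnarray_align` are hypothetical, and plain string substitution is used only to keep the example dependency-free. It assumes each `<mtr>` row contains no nested `<mtable>`.

```ruby
# Alignment order of the three eqnarray columns ("center" is the MathML
# spelling of the attribute value).
EQNARRAY_ALIGN = %w[right center left].freeze

# Hypothetical helper: within every <mtr> row, rewrite the columnalign
# attribute of the first three <mtd> cells to right, center, left.
# Cells beyond the third are left unchanged.
def fix_eqnarray_align(mathml)
  mathml.gsub(%r{<mtr\b.*?</mtr>}m) do |row|
    col = -1
    row.gsub(/(<mtd\b[^>]*?\bcolumnalign=")[^"]*(")/) do
      col += 1
      align = EQNARRAY_ALIGN[col]
      align ? "#{$1}#{align}#{$2}" : $&
    end
  end
end

src = '<mtable><mtr><mtd columnalign="right"><mn>1</mn></mtd>' \
      '<mtd columnalign="left"><mo>=</mo></mtd>' \
      '<mtd columnalign="right"><mn>2</mn></mtd></mtr></mtable>'
puts fix_eqnarray_align(src)
```

A parser-based version would be more robust against nested tables and unusual attribute quoting; the regular expressions here only cover the simple markup shown in the pandoc output examples.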
|
142
|
+
|
143
|
+
<p>For fixing the equation numbers and links, the program</p>
|
144
|
+
<ol><li>
|
145
|
+
<p>first reads a LaTeX aux file and lists all the labels for equations and their numbers.</p>
|
146
|
+
</li><li>
|
147
|
+
<p>Then, it picks up an internally-pointing HTML anchor,</p>
|
148
|
+
</li><li>
|
149
|
+
<p>matches it with the list generated from the LaTeX aux file and identifies the equation number,</p>
|
150
|
+
</li><li>
|
151
|
+
<p>searches labels in <tt><annotation></tt> tags for the identical string for the HTML/MathML-anchor,</p>
|
152
|
+
</li><li>
|
153
|
+
<p>identifies the exact equation corresponding to the label (if in the eqnarray environment),</p>
|
154
|
+
</li><li>
|
155
|
+
<p>inserts the identified equation number next to the MathML equation,</p>
|
156
|
+
</li><li>
|
157
|
+
<p>and finally modifies the plain text for the HTML anchor.</p>
|
158
|
+
</li></ol>
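Steps 1 to 3 and 7 of the list above can be sketched in a few lines of Ruby. This is an illustration under stated assumptions rather than the gem's code: the names `NEWLABEL_RE`, `ANCHOR_RE`, `read_aux_labels` and `renumber_anchors` are hypothetical, the aux entries are assumed to have the standard form `\newlabel{LABEL}{{NUMBER}{PAGE}...}`, and the anchor text is simply replaced with the bare number (the exact text the tool writes is not specified here).

```ruby
# Matches the standard aux entry  \newlabel{LABEL}{{NUMBER}{PAGE}...
NEWLABEL_RE = /\\newlabel\{([^{}]+)\}\{\{([^{}]*)\}\{[^{}]*\}/

# Step 1: build a { label => number } hash from the aux file contents.
# Labels of sections, tables and figures end up in the hash as well,
# because the aux entry alone does not say what a label belongs to.
def read_aux_labels(aux_text)
  aux_text.each_line.with_object({}) do |line, labels|
    m = NEWLABEL_RE.match(line)
    labels[m[1]] = m[2] if m
  end
end

# Matches a pandoc reference anchor and captures its data-reference label.
ANCHOR_RE = %r{(<a\b[^>]*\bdata-reference="([^"]+)"[^>]*>)\[[^\]]*\](</a>)}

# Steps 2, 3 and 7: replace the "[label]" text of each known anchor with
# its number; anchors with unknown labels are left untouched.
def renumber_anchors(html, labels)
  html.gsub(ANCHOR_RE) do
    open_tag, label, close_tag = $1, $2, $3
    labels.key?(label) ? "#{open_tag}#{labels[label]}#{close_tag}" : $&
  end
end

aux = <<'AUX'
\relax
\newlabel{sec_intro}{{1}{1}}
\newlabel{eq_trivial}{{2}{1}}
AUX
labels = read_aux_labels(aux)
html = '<p>Eq.<a href="#eq_trivial" data-reference-type="ref" ' \
       'data-reference="eq_trivial">[eq_trivial]</a> was easy...'
puts renumber_anchors(html, labels)
```

Inserting the number next to the equation itself (steps 4 to 6) needs real XML manipulation and is deliberately left out of this sketch.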
|
159
|
+
|
160
|
+
<p>Each inserted equation number is placed next to the corresponding equation, inside <tt><mtext></tt> tags. In <tt><mtable></tt> (for the LaTeX <tt>eqnarray</tt> environment), it is inserted as a new <tt><mtd></tt> cell. In both cases, the text is right-aligned with some padding to the left. However, the position is relative to either the equation or the set of equations that contains the relevant equation (for <tt>eqnarray</tt>), unlike in the original LaTeX, where equation numbers in parentheses are by default always placed at the right edge of the page.</p>
|
161
|
+
|
162
|
+
<h2 id="label-How+to+use+the+command">How to use the command</h2>
|
163
|
+
|
164
|
+
<p>Once you have installed it according to the standard RubyGems procedure (see section Install), the main Ruby executable (command) <code>pandoc_refeq_mathml</code> should be in your command-search path.</p>
|
165
|
+
|
166
|
+
<p>It reads a MathML file from either the first command-line argument or STDIN, together with a LaTeX aux file specified on the command line, and then outputs the modified (corrected) MathML to STDOUT.</p>
|
167
|
+
|
168
|
+
<p>Any warnings are printed to either STDERR or a log file specified with a command-line option.</p>
|
169
|
+
|
170
|
+
<p>A failure to match a label from an HTML tag with any of the MathML equations is printed as a warning (to STDERR by default). Although it may genuinely mean that the label does not exist in the original LaTeX source, it is far more likely that the label belongs to a section (or a table or figure), because the algorithm cannot tell the type (section, table, figure, equation, or something else) of each label’s origin.</p>
|
171
|
+
|
172
|
+
<h3 id="label-Help+doc">Help doc</h3>
|
173
|
+
|
174
|
+
<p>The help doc for the command-line interface is displayed with the <code>-h</code> (or <code>--help</code>) option:</p>
|
175
|
+
|
176
|
+
<pre class="code ruby"><code class="ruby">% pandoc_refeq_mathml -h
|
177
|
+
Usage: pandoc_refeq_mathml [options] [--] [MathML.html] > STDOUT
|
178
|
+
pandoc_refeq_mathml [options] [--] < STDIN > STDOUT
|
179
|
+
|
180
|
+
Description (Version=0.1):
|
181
|
+
This fixes issues, label-references of equations and eqnarray alignments, of pandoc-converted MathML from LaTeX.
|
182
|
+
|
183
|
+
Specific options:
|
184
|
+
-a, --aux [FILENAME] (mandatory) LaTeX aux filename
|
185
|
+
--log [FILENAME] Log filename (Default: STDERR). /dev/null to disable it.
|
186
|
+
--[no-]fixalign Fix eqnarray-alignment problems? (Def: true)
|
187
|
+
-v, --[no-]verbose Run verbosely (Def: true)
|
188
|
+
|
189
|
+
Common options:
|
190
|
+
-h, --help Show this message
|
191
|
+
--version Show version
|
192
|
+
</code></pre>
|
193
|
+
|
194
|
+
<h3 id="label-Examples">Examples</h3>
|
195
|
+
|
196
|
+
<pre class="code ruby"><code class="ruby">% pandoc_refeq_mathml --aux=mydoc.aux --log=error.log mydoc.html > revised1.html
|
197
|
+
% head -n 90 mydoc.html | pandoc_refeq_mathml --aux=mydoc.aux --no-fixalign > revised2.html
|
198
|
+
</code></pre>
|
199
|
+
|
200
|
+
<p>Also, in the <code>test/data/</code> directory, there is a sample LaTeX file. You can run <code>make</code> in the directory to generate and correct a HTML/MathML file. Read the comment in the <code>Makefile</code> to see options, such as the LaTeX executable in your environment.</p>
|
201
|
+
|
202
|
+
<h2 id="label-Install">Install</h2>
|
203
|
+
|
204
|
+
<p>The standard RubyGems install procedure suffices:</p>
|
205
|
+
|
206
|
+
<pre class="code ruby"><code class="ruby">% gem install pandoc_refeq_mathml
|
207
|
+
</code></pre>
|
208
|
+
|
209
|
+
<p>which should also install the dependent <a href="https://rubygems.org/gems/nokogiri/">Nokogiri gem</a>.</p>
|
210
|
+
|
211
|
+
<p>Alternatively, you can download the library file <code>lib/pandoc_refeq_mathml.rb</code> to a local directory, set the environment variable <code>RUBYLIB</code> to include that directory, and execute</p>
|
212
|
+
|
213
|
+
<pre class="code ruby"><code class="ruby">% ruby bin/pandoc_refeq_mathml
|
214
|
+
</code></pre>
|
215
|
+
|
216
|
+
<p>where <code>ruby</code> is optional. Note that the <a href="https://rubygems.org/gems/nokogiri/">Nokogiri gem</a> must be available in your Ruby library path.</p>
|
217
|
+
|
218
|
+
<p>In the developer’s environment, the <a href="https://rubygems.org/gems/diff-lcs">diff-lcs gem</a> is also required.</p>
|
219
|
+
|
220
|
+
<p>This tool requires <a href="http://www.ruby-lang.org">Ruby</a> Version 2.0 or above.</p>
|
221
|
+
|
222
|
+
<h2 id="label-Developer-27s+note">Developer’s note</h2>
|
223
|
+
|
224
|
+
<p>The source code is also maintained on <a href="https://github.com/masasakano/pandoc_refeq_mathml">GitHub</a> with no intuitive interface for annotations.</p>
|
225
|
+
|
226
|
+
<h3 id="label-Tests">Tests</h3>
|
227
|
+
|
228
|
+
<p>The Ruby files under the directory <code>test/</code> are the test scripts. You can run them from the top directory with <code>ruby test/test_****.rb</code>, or simply run <code>make test</code> or <code>rake test</code>.</p>
|
229
|
+
|
230
|
+
<h2 id="label-Known+bugs+and+ToDo+items">Known bugs and ToDo items</h2>
|
231
|
+
<ul><li>
|
232
|
+
<p>pandoc-generated HTML files do not contain Table/Figure numbers in their <tt><caption></tt>, even though each anchored text refers to the corresponding number, such as <tt>see Table “2”</tt>, where “2” is the anchor.</p>
|
233
|
+
</li><li>
|
234
|
+
<p>In fact, pandoc-generated HTML files contain no <tt><figure></tt> tags, let alone <tt><figcaption></tt>, for LaTeX figure environments that contain more than one figure (with <code>\includegraphics</code>)…</p>
|
235
|
+
</li></ul>
|
236
|
+
|
237
|
+
<h2 id="label-Copyright">Copyright</h2>
|
238
|
+
<dl class="rdoc-list note-list"><dt>Author
|
239
|
+
<dd>
|
240
|
+
<p>Masa Sakano < info a_t wisebabel dot com ></p>
|
241
|
+
</dd><dt>Versions
|
242
|
+
<dd>
|
243
|
+
<p>The versions of this package follow Semantic Versioning (2.0.0); see <a href="http://semver.org">semver.org</a>.</p>
|
244
|
+
</dd><dt>License
|
245
|
+
<dd>
|
246
|
+
<p>MIT</p>
|
247
|
+
</dd><dt>Warranty
|
248
|
+
<dd>
|
249
|
+
<p>No warranty.</p>
|
250
|
+
</dd></dl>
|
251
|
+
<hr>
|
252
|
+
</div></div>
|
253
|
+
|
254
|
+
<div id="footer">
|
255
|
+
Generated on Sat Aug 27 02:12:28 2022 by
|
256
|
+
<a href="https://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
|
257
|
+
0.9.28 (ruby-3.1.2).
|
258
|
+
</div>
|
259
|
+
|
260
|
+
</div>
|
261
|
+
</body>
|
262
|
+
</html>
|
data/doc/file_list.html
ADDED
@@ -0,0 +1,56 @@
|
|
1
|
+
<!DOCTYPE html>
|
2
|
+
<html>
|
3
|
+
<head>
|
4
|
+
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
5
|
+
<meta charset="utf-8" />
|
6
|
+
|
7
|
+
<link rel="stylesheet" href="css/full_list.css" type="text/css" media="screen" />
|
8
|
+
|
9
|
+
<link rel="stylesheet" href="css/common.css" type="text/css" media="screen" />
|
10
|
+
|
11
|
+
|
12
|
+
|
13
|
+
<script type="text/javascript" charset="utf-8" src="js/jquery.js"></script>
|
14
|
+
|
15
|
+
<script type="text/javascript" charset="utf-8" src="js/full_list.js"></script>
|
16
|
+
|
17
|
+
|
18
|
+
<title>File List</title>
|
19
|
+
<base id="base_target" target="_parent" />
|
20
|
+
</head>
|
21
|
+
<body>
|
22
|
+
<div id="content">
|
23
|
+
<div class="fixed_header">
|
24
|
+
<h1 id="full_list_header">File List</h1>
|
25
|
+
<div id="full_list_nav">
|
26
|
+
|
27
|
+
<span><a target="_self" href="class_list.html">
|
28
|
+
Classes
|
29
|
+
</a></span>
|
30
|
+
|
31
|
+
<span><a target="_self" href="method_list.html">
|
32
|
+
Methods
|
33
|
+
</a></span>
|
34
|
+
|
35
|
+
<span><a target="_self" href="file_list.html">
|
36
|
+
Files
|
37
|
+
</a></span>
|
38
|
+
|
39
|
+
</div>
|
40
|
+
|
41
|
+
<div id="search">Search: <input type="text" /></div>
|
42
|
+
</div>
|
43
|
+
|
44
|
+
<ul id="full_list" class="file">
|
45
|
+
|
46
|
+
|
47
|
+
<li id="object_README.en" class="odd">
|
48
|
+
<div class="item"><span class="object_link"><a href="index.html" title="README.en">README.en</a></span></div>
|
49
|
+
</li>
|
50
|
+
|
51
|
+
|
52
|
+
|
53
|
+
</ul>
|
54
|
+
</div>
|
55
|
+
</body>
|
56
|
+
</html>
|
data/doc/frames.html
ADDED
@@ -0,0 +1,17 @@
|
|
1
|
+
<!DOCTYPE html>
|
2
|
+
<html>
|
3
|
+
<head>
|
4
|
+
<meta charset="utf-8">
|
5
|
+
<title>Documentation by YARD 0.9.28</title>
|
6
|
+
</head>
|
7
|
+
<script type="text/javascript">
|
8
|
+
var match = unescape(window.location.hash).match(/^#!(.+)/);
|
9
|
+
var name = match ? match[1] : 'index.html';
|
10
|
+
name = name.replace(/^(\w+):\/\//, '').replace(/^\/\//, '');
|
11
|
+
window.top.location = name;
|
12
|
+
</script>
|
13
|
+
<noscript>
|
14
|
+
<h1>Oops!</h1>
|
15
|
+
<h2>YARD requires JavaScript!</h2>
|
16
|
+
</noscript>
|
17
|
+
</html>
|
data/doc/index.html
ADDED
@@ -0,0 +1,262 @@
|
|
1
|
+
<!DOCTYPE html>
|
2
|
+
<html>
|
3
|
+
<head>
|
4
|
+
<meta charset="UTF-8">
|
5
|
+
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
6
|
+
<title>
|
7
|
+
File: README.en
|
8
|
+
|
9
|
+
— Documentation by YARD 0.9.28
|
10
|
+
|
11
|
+
</title>
|
12
|
+
|
13
|
+
<link rel="stylesheet" href="css/style.css" type="text/css" />
|
14
|
+
|
15
|
+
<link rel="stylesheet" href="css/common.css" type="text/css" />
|
16
|
+
|
17
|
+
<script type="text/javascript">
|
18
|
+
pathId = "README.en";
|
19
|
+
relpath = '';
|
20
|
+
</script>
|
21
|
+
|
22
|
+
|
23
|
+
<script type="text/javascript" charset="utf-8" src="js/jquery.js"></script>
|
24
|
+
|
25
|
+
<script type="text/javascript" charset="utf-8" src="js/app.js"></script>
|
26
|
+
|
27
|
+
|
28
|
+
</head>
|
29
|
+
<body>
|
30
|
+
<div class="nav_wrap">
|
31
|
+
<iframe id="nav" src="class_list.html?1"></iframe>
|
32
|
+
<div id="resizer"></div>
|
33
|
+
</div>
|
34
|
+
|
35
|
+
<div id="main" tabindex="-1">
|
36
|
+
<div id="header">
|
37
|
+
<div id="menu">
|
38
|
+
|
39
|
+
<a href="_index.html">Index</a> »
|
40
|
+
<span class="title">File: README.en</span>
|
41
|
+
|
42
|
+
</div>
|
43
|
+
|
44
|
+
<div id="search">
|
45
|
+
|
46
|
+
<a class="full_list_link" id="class_list_link"
|
47
|
+
href="class_list.html">
|
48
|
+
|
49
|
+
<svg width="24" height="24">
|
50
|
+
<rect x="0" y="4" width="24" height="4" rx="1" ry="1"></rect>
|
51
|
+
<rect x="0" y="12" width="24" height="4" rx="1" ry="1"></rect>
|
52
|
+
<rect x="0" y="20" width="24" height="4" rx="1" ry="1"></rect>
|
53
|
+
</svg>
|
54
|
+
</a>
|
55
|
+
|
56
|
+
</div>
|
57
|
+
<div class="clear"></div>
|
58
|
+
</div>
|
59
|
+
|
60
|
+
<div id="content"><div id='filecontents'>
|
61
|
+
<h1 id="label-PandocRefeqMathml+-+ad+hoc+tool+to+modify+pandoc-converted+MathML+from+LaTeX">PandocRefeqMathml - ad hoc tool to modify pandoc-converted MathML from LaTeX</h1>
|
62
|
+
|
63
|
+
<h2 id="label-Summary">Summary</h2>
|
64
|
+
|
65
|
+
<p>This Ruby command-line command modifies a MathML file converted with <code>pandoc</code> from LaTeX.</p>
|
66
|
+
|
67
|
+
<p>Whereas <code>pandoc</code> is a great text-ish file converter, there are a few caveats, at the time of writing, in converting a LaTeX file to MathML.</p>
|
68
|
+
|
69
|
+
<p>A major caveat is that the generated MathML displays neither the equation numbers that LaTeX generates automatically by default for the <code>equation</code> and <code>eqnarray</code> environments nor their (LaTeX) labels. All the (LaTeX) <code>ref</code> references remain as raw label names, which read as a coded message to readers.</p>
|
70
|
+
|
71
|
+
<p>Another caveat is the alignments of equations in the <code>eqnarray</code> environment.</p>
|
72
|
+
|
73
|
+
<p>This tool is a bit of an ad hoc (dirty) hack to correct these points *in some basic situations*. “Basic” here means just the standard LaTeX commands, not external package-specific commands.</p>
|
74
|
+
|
75
|
+
<p>The full package of this module is found on the <a href="http://rubygems.org/gems/pandoc_refeq_mathml">PandocRefeqMathml Ruby Gems page</a> (with documentation generated from source annotations with yard) and on <a href="https://github.com/masasakano/pandoc_refeq_mathml">GitHub</a>.</p>
|
76
|
+
|
77
|
+
<h2 id="label-Background+and+constraints">Background and constraints</h2>
|
78
|
+
|
79
|
+
<p>Pandoc-converted MathML.html from LaTeX lacks the equation numbers that are present in the original LaTeX. <a href="https://github.com/lierdakil/pandoc-crossref">pandoc-crossref</a> offers a way to tackle the problem; however, its fix is far from perfect, with three or four major caveats.</p>
|
80
|
+
<ol><li>
|
81
|
+
<p>A single number is assigned to the whole set of equations in an <code>eqnarray</code> environment, which is inconsistent with LaTeX.</p>
|
82
|
+
</li><li>
|
83
|
+
<p>The LaTeX <tt>\nonumber</tt> is not taken into account.</p>
|
84
|
+
</li><li>
|
85
|
+
<p>Referencing text to an equation displays the original LaTeX label, as opposed to the equation number, which makes no sense to readers.</p>
|
86
|
+
</li><li>
|
87
|
+
<p>Because of points (1) and (2), the given equation numbers usually do not agree at all with the original document compiled by LaTeX.</p>
|
88
|
+
</li></ol>
|
89
|
+
|
90
|
+
<p>In LaTeX, you may reference equations 1 and 3 in a single <code>eqnarray</code> environment separately. However, because of point (1), that is not possible in pandoc-generated MathML. Besides, they are not referenced by equation number in the MathML in the first place (point 3).</p>
|
91
|
+
|
92
|
+
<p>This tool (command-line command) offers a way to fix these problems, albeit in a crude way. The command adds equation numbers that are guessed from the text in the annotation fields in <tt><math></tt> and LaTeX aux file (the latter of which is automatically generated as a byproduct when you compile a LaTeX document). Not all the numbers are recovered but only those that are referenced somewhere in the MathML file.</p>
|
93
|
+
|
94
|
+
<p>(Note that in principle, it should not be too difficult to modify the program so that all the labelled equations in LaTeX are labelled again in MathML. Nevertheless, it would be tricky to label equations that are not explicitly labelled in LaTeX because implicit numbering information is not available in the LaTeX aux file.)</p>
|
95
|
+
|
96
|
+
<p>The algorithm assumes a LaTeX standard aux file-format, the MathML having a link tag <tt><a></tt> with the attributes “data-reference-type=ref” and href to the label of the exact reference label in LaTeX (and the label should have no duplicates in the MathML) and also having the <tt>‘annotation[ encoding=“application/x-tex”]’</tt> tag in each math tag containing the original LaTeX code. The LaTeX code must have either the standard “equation” or “eqnarray” structures associated with the standard “label” tag with a simple content (if it contains, apart from the label string, something more than preceding or trailing white spaces, such as a comment, this algorithm would likely fail). If equations in an eqnarray environment have complicated nested structures like a matrix, I do not know how the algorithm of this routine handles them. Also, the LaTeX section numbering must be combinations of Arabic numbers, full-stops, and maybe capital letters (for Appendix) only.</p>
|
97
|
+
|
98
|
+
<p>Essentially, LaTeX has a huge amount of freedom and so I am afraid it would be a somewhat futile effort to deal with every possibility…</p>
|
99
|
+
|
100
|
+
<h3 id="label-Output+MathML+by+pandoc-2.19+converted+from+LaTeX">Output MathML by pandoc-2.19 converted from LaTeX</h3>
|
101
|
+
|
102
|
+
<p>Ordinary LaTeX inline maths expressions (e.g., <tt>$5\pi$</tt>) are expressed as follows:</p>
|
103
|
+
|
104
|
+
<pre class="code ruby"><code class="ruby"><math display="inline" xmlns="http://w..."><semantics>
|
105
|
+
<mrow><mn>5</mn><mi>π</mi></mrow>
|
106
|
+
<annotation encoding="application/x-tex">5\pi</annotation>
|
107
|
+
</semantics></math>
|
108
|
+
</code></pre>
|
109
|
+
|
110
|
+
<p>LaTeX’s <tt>\begin{equation}</tt> is converted as follows (n.b., the <tt><p></tt> tag may not be closed immediately after <tt></math></tt>; other ordinary sentences may follow):</p>
|
111
|
+
|
112
|
+
<pre class="code ruby"><code class="ruby"><p><math display="block" xmlns="http://w..."><semantics>
|
113
|
+
<mrow><mi>x</mi><mo>±</mo><mi>ϵ</mi></mrow>
|
114
|
+
<annotation encoding="application/x-tex">x \pm \epsilon \label{my_xe}</annotation>
|
115
|
+
</semantics></math>
|
116
|
+
</code></pre>
|
117
|
+
|
118
|
+
<p>LaTeX’s <tt>\begin{eqnarray}</tt> is converted as follows:</p>
|
119
|
+
|
120
|
+
<pre class="code ruby"><code class="ruby"><p><math display="block" xmlns="http://w..."><semantics><mtable>
|
121
|
+
<mtr><mtd columnalign="right"><mrow><mn>1</mn><mo>+</mo><mi>x</mi></mrow></mtd>
|
122
|
+
<mtd columnalign="left"><mo>=</mo></mtd>
|
123
|
+
<mtd columnalign="right"><mrow><mn>1</mn><mo>−</mo><mi>x</mi></mrow></mtd></mtr>
|
124
|
+
<mtr><mtd columnalign="right"></mtd>
|
125
|
+
<mtd columnalign="left"><mo>=</mo></mtd>
|
126
|
+
<mtd columnalign="right"><mfrac><mn>2</mn><mrow><mn>1</mn><mi>x</mi></mrow></mfrac></mtd></mtr>
|
127
|
+
</mtable><annotation encoding="application/x-tex">\begin{aligned}
|
128
|
+
1+x &amp; = &amp; 1-x \nonumber\\
|
129
|
+
&amp; = &amp; \frac{2}{1x} \label{eq_trivial}
|
130
|
+
\end{aligned}</annotation></semantics></math></p>
|
131
|
+
</code></pre>
|
132
|
+
|
133
|
+
<p>They are referred to from other text as follows:</p>
|
134
|
+
|
135
|
+
<pre class="code ruby"><code class="ruby"><p>Eq.<a href="#eq_trivial" data-reference-type="ref"
|
136
|
+
data-reference="eq_trivial">[eq_trivial]</a> was easy...
|
137
|
+
</code></pre>
|
138
|
+
|
139
|
+
<h3 id="label-Algorithm">Algorithm</h3>
|
140
|
+
|
141
|
+
<p>For fixing the alignments to follow the standard eqnarray alignments (right, centre, and left in this order), the program searches for <tt><mtable></tt> and rewrites the <code>columnalign</code> attributes in the <tt><mtd></tt> tags.</p>
|
142
|
+
|
143
|
+
<p>For fixing the equation numbers and links, the program</p>
|
144
|
+
<ol><li>
|
145
|
+
<p>first reads a LaTeX aux file and lists all the labels for equations and their numbers.</p>
|
146
|
+
</li><li>
|
147
|
+
<p>Then, it picks up an internally-pointing HTML anchor,</p>
|
148
|
+
</li><li>
|
149
|
+
<p>matches it with the list generated from the LaTeX aux file and identifies the equation number,</p>
|
150
|
+
</li><li>
|
151
|
+
<p>searches labels in <tt><annotation></tt> tags for the identical string for the HTML/MathML-anchor,</p>
|
152
|
+
</li><li>
|
153
|
+
<p>identifies the exact equation corresponding to the label (if in the eqnarray environment),</p>
|
154
|
+
</li><li>
|
155
|
+
<p>inserts the identified equation number next to the MathML equation,</p>
|
156
|
+
</li><li>
|
157
|
+
<p>and finally modifies the plain text for the HTML anchor.</p>
|
158
|
+
</li></ol>
|
159
|
+
|
160
|
+
<p>Each inserted equation number is placed next to the corresponding equation, inside <tt><mtext></tt> tags. In <tt><mtable></tt> (for the LaTeX <tt>eqnarray</tt> environment), it is inserted as a new <tt><mtd></tt> cell. In both cases, the text is right-aligned with some padding to the left. However, the position is relative to either the equation or the set of equations that contains the relevant equation (for <tt>eqnarray</tt>), unlike in the original LaTeX, where equation numbers in parentheses are by default always placed at the right edge of the page.</p>
|
161
|
+
|
162
|
+
<h2 id="label-How+to+use+the+command">How to use the command</h2>
|
163
|
+
|
164
|
+
<p>Once you have installed it according to the standard RubyGems procedure (see section Install), the main Ruby executable (command) <code>pandoc_refeq_mathml</code> should be in your command-search path.</p>
|
165
|
+
|
166
|
+
<p>It reads a MathML file from either the first command-line argument or STDIN, together with a LaTeX aux file specified on the command line, and then outputs the modified (corrected) MathML to STDOUT.</p>
|
167
|
+
|
168
|
+
<p>Any warnings are printed to either STDERR or a log file specified with a command-line option.</p>
|
169
|
+
|
170
|
+
<p>A failure to match a label from an HTML tag with any of the MathML equations is printed as a warning (to STDERR by default). Although it may genuinely mean that the label does not exist in the original LaTeX source, it is far more likely that the label belongs to a section (or a table or figure), because the algorithm cannot tell the type (section, table, figure, equation, or something else) of each label’s origin.</p>
|
171
|
+
|
172
|
+
<h3 id="label-Help+doc">Help doc</h3>
|
173
|
+
|
174
|
+
<p>The help doc for the command-line interface is displayed with the <code>-h</code> (or <code>--help</code>) option:</p>
|
175
|
+
|
176
|
+
<pre class="code ruby"><code class="ruby">% pandoc_refeq_mathml -h
|
177
|
+
Usage: pandoc_refeq_mathml [options] [--] [MathML.html] > STDOUT
|
178
|
+
pandoc_refeq_mathml [options] [--] < STDIN > STDOUT
|
179
|
+
|
180
|
+
Description (Version=0.1):
|
181
|
+
This fixes issues, label-references of equations and eqnarray alignments, of pandoc-converted MathML from LaTeX.
|
182
|
+
|
183
|
+
Specific options:
|
184
|
+
-a, --aux [FILENAME] (mandatory) LaTeX aux filename
|
185
|
+
--log [FILENAME] Log filename (Default: STDERR). /dev/null to disable it.
|
186
|
+
--[no-]fixalign Fix eqnarray-alignment problems? (Def: true)
|
187
|
+
-v, --[no-]verbose Run verbosely (Def: true)
|
188
|
+
|
189
|
+
Common options:
|
190
|
+
-h, --help Show this message
|
191
|
+
--version Show version
|
192
|
+
</code></pre>
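<p>For readers who want to see how such an option set behaves, the following sketch mirrors the options listed in the help text using Ruby's standard <code>optparse</code> library. It is an assumption-laden illustration, not the gem's actual code: the method name <code>parse_cli</code> and the option-hash keys are hypothetical.</p>

```ruby
require "optparse"

# Hypothetical sketch (not the gem's code): parse the options shown in the
# help text and return [options, input_filename_or_nil].
def parse_cli(argv)
  options = { aux: nil, log: nil, fixalign: true, verbose: true }
  parser = OptionParser.new do |opts|
    opts.on("-a", "--aux [FILENAME]") { |v| options[:aux] = v }
    opts.on("--log [FILENAME]")       { |v| options[:log] = v }
    opts.on("--[no-]fixalign")        { |v| options[:fixalign] = v }
    opts.on("-v", "--[no-]verbose")   { |v| options[:verbose] = v }
  end
  rest = parser.parse(argv) # non-destructive; returns the remaining arguments
  [options, rest.first]     # a nil filename would mean "read from STDIN"
end

opts, input = parse_cli(%w[--aux=mydoc.aux --no-fixalign mydoc.html])
p opts
p input
```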
|
193
|
+
|
194
|
+
<h3 id="label-Examples">Examples</h3>
|
195
|
+
|
196
|
+
<pre class="code ruby"><code class="ruby">% pandoc_refeq_mathml --aux=mydoc.aux --log=error.log mydoc.html > revised1.html
|
197
|
+
% head -n 90 mydoc.html | pandoc_refeq_mathml --aux=mydoc.aux --no-fixalign > revised2.html
|
198
|
+
</code></pre>
|
199
|
+
|
200
|
+
<p>Also, the <code>test/data/</code> directory contains a sample LaTeX file. You can run <code>make</code> in that directory to generate and correct an HTML/MathML file. Read the comments in the <code>Makefile</code> to see the options, such as the LaTeX executable in your environment.</p>
|
201
|
+
|
202
|
+
<h2 id="label-Install">Install</h2>
|
203
|
+
|
204
|
+
<p>The standard RubyGems install procedure suffices:</p>
|
205
|
+
|
206
|
+
<pre class="code ruby"><code class="ruby">% gem install pandoc_refeq_mathml
|
207
|
+
</code></pre>
|
208
|
+
|
209
|
+
<p>which should also install the <a href="https://rubygems.org/gems/nokogiri/">Nokogiri gem</a>, on which this gem depends.</p>
|
210
|
+
|
211
|
+
<p>Alternatively, you can download the library file <code>lib/pandoc_refeq_mathml.rb</code> to a local directory, set the environment variable <code>RUBYLIB</code> to include that directory, and execute</p>
|
212
|
+
|
213
|
+
<pre class="code ruby"><code class="ruby">% ruby bin/pandoc_refeq_mathml
|
214
|
+
</code></pre>
|
215
|
+
|
216
|
+
<p>where <code>ruby</code> is optional. Note that the <a href="https://rubygems.org/gems/nokogiri/">Nokogiri gem</a> must be available in your Ruby library path.</p>
|
217
|
+
|
218
|
+
<p>In a development environment, the <a href="https://rubygems.org/gems/diff-lcs">diff-lcs gem</a> is also required.</p>
|
219
|
+
|
220
|
+
<p>This tool requires <a href="http://www.ruby-lang.org">Ruby</a> Version 2.0 or above.</p>
|
221
|
+
|
222
|
+
<h2 id="label-Developer-27s+note">Developer’s note</h2>
|
223
|
+
|
224
|
+
<p>The source code is also maintained on <a href="https://github.com/masasakano/pandoc_refeq_mathml">GitHub</a>, with no intuitive interface for annotations.</p>
|
225
|
+
|
226
|
+
<h3 id="label-Tests">Tests</h3>
|
227
|
+
|
228
|
+
<p>The Ruby files under the directory <code>test/</code> are the test scripts. You can run them from the top directory with <code>ruby test/test_****.rb</code>, or simply run <code>make test</code> or <code>rake test</code>.</p>
|
229
|
+
|
230
|
+
<h2 id="label-Known+bugs+and+ToDo+items">Known bugs and ToDo items</h2>
|
231
|
+
<ul><li>
|
232
|
+
<p>pandoc-generated HTML files do not contain Table/Figure numbers in their <tt><caption></tt>, even though each anchor text refers to the corresponding number, as in <tt>see Table “2”</tt>, where “2” is the anchor.</p>
|
233
|
+
</li><li>
|
234
|
+
<p>In fact, pandoc-generated HTML files contain no <tt><figure></tt> tags, let alone <tt><figcaption></tt>, for LaTeX figure environments that contain more than one figure (with <code>\includegraphics</code>)…</p>
|
235
|
+
</li></ul>
|
236
|
+
|
237
|
+
<h2 id="label-Copyright">Copyright</h2>
|
238
|
+
<dl class="rdoc-list note-list"><dt>Author
|
239
|
+
<dd>
|
240
|
+
<p>Masa Sakano < info a_t wisebabel dot com ></p>
|
241
|
+
</dd><dt>Versions
|
242
|
+
<dd>
|
243
|
+
<p>The versions of this package follow Semantic Versioning (2.0.0); see <a href="http://semver.org">semver.org</a>.</p>
|
244
|
+
</dd><dt>License
|
245
|
+
<dd>
|
246
|
+
<p>MIT</p>
|
247
|
+
</dd><dt>Warranty
|
248
|
+
<dd>
|
249
|
+
<p>No warranty.</p>
|
250
|
+
</dd></dl>
|
251
|
+
<hr>
|
252
|
+
</div></div>
|
253
|
+
|
254
|
+
<div id="footer">
|
255
|
+
Generated on Sat Aug 27 02:12:28 2022 by
|
256
|
+
<a href="https://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
|
257
|
+
0.9.28 (ruby-3.1.2).
|
258
|
+
</div>
|
259
|
+
|
260
|
+
</div>
|
261
|
+
</body>
|
262
|
+
</html>
|