asciimath2unitsml 0.0.2 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d2ef44eb717d6b445489de85ee1e8eb3f5b81a6d83602c48ea23a8cbe4f100c7
- data.tar.gz: 84b867f5a97b3ad154c7be8e5975ca4c97446049c4368ece6ae5e23da3c94437
+ metadata.gz: 97081f507478d1952ab2010cd4810efe745980341b9d4d1a420dbd8132be88b0
+ data.tar.gz: ef63bceb0dc7b7c8dc8c40a1b6fa3795b02a3d0f0d9454be41e121d3a5199054
  SHA512:
- metadata.gz: b749aa65924f4b815a7d38d2df9c9d6b48e4cd8f9db62678c100d732a60c8159b537fa2c3bd192296f533371f731ec85365f0d7f231f560f173d167971c38871
- data.tar.gz: 64a47773ef26b6b870fc6a4406b697ec10b943e637a82f549946d8126830db9a521f3bdcb4e314718d084e40b7cfacbc820417476cd98663760ec8de29ba566e
+ metadata.gz: ead9d484a747edf442ff9ce220d46e53f32ccfeada2653ca44e2ecac9a3da4ade854fbfe0dbb6d0120b87358e5cfae0dee3135e7798eed331b7cff25b3558a9c
+ data.tar.gz: ed1aaf1967055b8ca4e270bef69b02b24ccef20fb075cc9f46acee7f5447e7c92e9c2f9bfd36646ee043b4f335dd4fd15920d958d429bdf89d155c5ddc14d729
data/README.adoc CHANGED
@@ -1,21 +1,27 @@
  = asciimath2unitsml
- Convert Asciimath via MathML to UnitsML
+ Convert Units expressions via MathML to UnitsML
 
- Encode UnitsML expressions in AsciiMath as `"unitsml(...)"`. The gem converts
- AsciiMath incorporating UnitsML expressions (based on the Ascii representation provided by NIST)
+ This gem converts
+ MathML incorporating UnitsML expressions (based on the Ascii representation provided by NIST)
  into MathML complying with https://www.w3.org/TR/mathml-units/[], with
- UnitsML markup embedded in it, with identifiers for each unit and dimension.
+ UnitsML markup embedded in it, and with unique identifiers for each distinct unit and dimension.
+ Units expressions are identified in MathML as `<mtext>unitsml(...)</mtext>`, which in turn
+ can be identified in AsciiMath as `"unitsml(...)"`.
  The consuming document is meant to deduplicate the instances of UnitsML markup
  with the same identifier, and potentially remove them to elsewhere in the document
  or another document.
 
- The AsciiMath conventions used are:
+ The conventions used for writing units are:
 
  * `^` for exponents, e.g. `m^-2`
  * `*` to combine two units by multiplication; e.g. `m*s^-2`. Division is not supported, use negative exponents instead
  * `u` for μ (micro-)
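
The notation listed above can be illustrated with a minimal sketch. It is not the gem's parser (which is built on Rsec and the unitsdb YAML files); `split_units` is a hypothetical helper that only splits on `*`, reads an optional `^exponent`, and naively treats a leading `u` as the micro- prefix. It does not separate other prefixes from unit symbols.

```ruby
# Hypothetical, simplified reading of the units notation described above.
# Assumption: every factor is "<symbol>" or "<symbol>^<integer>".
def split_units(expr)
  expr.split("*").map do |factor|
    symbol, exponent = factor.split("^", 2)
    {
      # Naive micro- handling: a leading "u" followed by more characters
      # is shown as the Greek letter mu.
      unit: symbol.sub(/\Au(?=.)/, "μ"),
      # A missing exponent means the unit is raised to the power 1.
      exponent: exponent ? Integer(exponent) : 1
    }
  end
end

split_units("m*s^-2").each do |u|
  puts "#{u[:unit]} (exponent #{u[:exponent]})"
end
```

Because division is not supported, a quotient such as metres per second squared is written with a negative exponent, as in the call above.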
 
- So
+ The gem follows the MathML Units convention of inserting a spacing invisible times operator
+ (`<mo rspace='thickmathspace'>&#x2062;</mo>`) between any numbers (`<mn>`) and unit expressions
+ in MathML, and representing units in MathML as non-italic variables (`<mi mathvariant='normal'>`).
+
+ So:
 
  [source]
  ----
@@ -77,4 +83,20 @@ is converted into:
  </math>
  ----
 
+ The converter is run as:
+
+ [source,ruby]
+ ----
+ c = Asciimath2UnitsML::Conv.new()
+ c.Asciimath2UnitsML({Asciimath string containing "unitsml()"})
+ c.MathML2UnitsML({Nokogiri parse of MathML document containing <mtext>unitsml()</mtext>})
+ ----
+
+ The converter class may be initialised with options:
 
+ * `multiplier` is the symbol used to represent the multiplication of units. By default,
+ following MathML Units, the symbol is middle dot (`&#xB7`). An arbitrary UTF-8 string can be
+ supplied instead; it will be encoded as XML entities. The value `:space` is rendered
+ as a spacing invisible times in MathML (`<mo rspace='thickmathspace'>&#x2062;</mo>`),
+ and as a non-breaking space in HTML. The value `:nospace` is rendered as a non-spacing
+ invisible times in MathML (`<mo>&#x2062;</mo>`), and is not rendered in HTML.
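
The three kinds of `multiplier` value described above can be sketched in isolation. This mirrors the shape of the `multiplier` method added in this release, but it is a standalone approximation: `encode_entities` is a hypothetical stand-in for the `HTMLEntities` gem that only escapes non-ASCII characters, and `join_html` is an illustrative helper, not part of the gem.

```ruby
# Hypothetical stand-in for HTMLEntities#encode, so the sketch needs no gems.
# It only turns non-ASCII characters into hexadecimal numeric entities.
def encode_entities(str)
  str.each_char.map { |c| c.ord > 127 ? format("&#x%X;", c.ord) : c }.join
end

# Maps the option value to its HTML and MathML renderings.
def multiplier(value)
  case value
  when :space
    { html: "&nbsp;", mathml: "<mo rspace='thickmathspace'>&#x2062;</mo>" }
  when :nospace
    { html: "", mathml: "<mo>&#x2062;</mo>" }
  else
    encoded = encode_entities(value)
    { html: encoded, mathml: "<mo>#{encoded}</mo>" }
  end
end

# Illustrative helper: joins unit symbols with the HTML form of the multiplier.
def join_html(symbols, value)
  symbols.join(multiplier(value)[:html])
end

puts join_html(%w[kg s], "\u00b7")
puts join_html(%w[kg s], "\u00d7")
puts join_html(%w[kg s], :space)
puts join_html(%w[kg s], :nospace)
```

Returning both renderings from one method keeps the HTML and MathML unit symbols consistent, since both are joined from the same option value.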
@@ -11,36 +11,25 @@ module Asciimath2UnitsML
  UNITSML_NS = "http://unitsml.nist.gov/2005".freeze
 
  class Conv
- def initialize
+ def initialize(options = {})
  @prefixes_id = read_yaml("../unitsdb/prefixes.yaml")
  @prefixes = flip_name_and_id(@prefixes_id)
  @quantities = read_yaml("../unitsdb/quantities.yaml")
  @units_id = read_yaml("../unitsdb/units.yaml")
  @units = flip_name_and_id(@units_id)
  @parser = parser
+ @multiplier = multiplier(options[:multiplier] || "\u00b7")
  end
 
- # https://www.w3.org/TR/mathml-units/ section 2: delimit number Invisible-Times unit
- def Asciimath2UnitsML(expression)
- xml = Nokogiri::XML(asciimath2mathml(expression))
- MathML2UnitsML(xml).to_xml
- end
-
- def MathML2UnitsML(xml)
- xml.xpath(".//m:mtext", "m" => MATHML_NS).each do |x|
- next unless %r{^unitsml\(.+\)$}.match(x.text)
- text = x.text.sub(%r{^unitsml\((.+)\)$}m, "\\1")
- units = parse(text)
- delim = x&.previous_element&.name == "mn" ? "<mo rspace='thickmathspace'>&#x2062;</mo>" : ""
- x.replace("#{delim}<mrow xref='#{unit_id(text)}'>#{mathmlsymbol(units)}</mrow>\n#{unitsml(units, text)}")
+ def multiplier(x)
+ case x
+ when :space
+ { html: "&nbsp;", mathml: "<mo rspace='thickmathspace'>&#x2062;</mo>" }
+ when :nospace
+ { html: "", mathml: "<mo>&#x2062;</mo>" }
+ else
+ { html: HTMLEntities.new.encode(x), mathml: "<mo>#{HTMLEntities.new.encode(x)}</mo>" }
  end
- xml
- end
-
- def asciimath2mathml(expression)
- AsciiMath::MathMLBuilder.new(:msword => true).append_expression(
- AsciiMath.parse(HTMLEntities.new.decode(expression)).ast).to_s.
- gsub(/<math>/, "<math xmlns='#{MATHML_NS}'>")
  end
 
  def unit_id(text)
@@ -92,7 +81,7 @@ module Asciimath2UnitsML
  units.map do |u|
  u[:exponent] and exp = "<sup>#{u[:exponent].sub(/-/, "&#x2212;")}</sup>"
  "#{u[:prefix]}#{u[:unit]}#{exp}"
- end.join(" &#183; ")
+ end.join(@multiplier[:html])
  end
 
  def mathmlsymbol(units)
@@ -104,7 +93,7 @@ module Asciimath2UnitsML
  else
  base
  end
- end.join("<mo>&#xB7;</mo>")
+ end.join(@multiplier[:mathml])
  end
 
  def mathmlsymbolwrap(units)
@@ -149,17 +138,6 @@ module Asciimath2UnitsML
  END
  end
 
- U2D = {
- "m" => { dimension: "Length", order: 1, symbol: "L" },
- "g" => { dimension: "Mass", order: 2, symbol: "M" },
- "kg" => { dimension: "Mass", order: 2, symbol: "M" },
- "s" => { dimension: "Time", order: 3, symbol: "T" },
- "A" => { dimension: "ElectricCurrent", order: 4, symbol: "I" },
- "K" => { dimension: "ThermodynamicTemperature", order: 5, symbol: "Theta" },
- "mol" => { dimension: "AmountOfSubstance", order: 6, symbol: "N" },
- "cd" => { dimension: "LuminousIntensity", order: 7, symbol: "J" },
- }
-
  def units2dimensions(units)
  norm = normalise_units(units)
  return if norm.any? { |u| u[:unit] == "unknown" || u[:prefix] == "unknown" }
@@ -219,15 +197,6 @@ module Asciimath2UnitsML
  "unknown"
  end
 
- def parse(x)
- units = @parser.parse(x)
- if !units || Rsec::INVALID[units]
- raise Rsec::SyntaxError.new "error parsing UnitsML expression", x, 1, 0
- end
- Rsec::Fail.reset
- units
- end
-
  def unitsml(units, text)
  dims = units2dimensions(units)
  <<~END
@@ -54,5 +54,39 @@ module Asciimath2UnitsML
  Rsec::Fail.reset
  units
  end
+
+ U2D = {
+ "m" => { dimension: "Length", order: 1, symbol: "L" },
+ "g" => { dimension: "Mass", order: 2, symbol: "M" },
+ "kg" => { dimension: "Mass", order: 2, symbol: "M" },
+ "s" => { dimension: "Time", order: 3, symbol: "T" },
+ "A" => { dimension: "ElectricCurrent", order: 4, symbol: "I" },
+ "K" => { dimension: "ThermodynamicTemperature", order: 5, symbol: "Theta" },
+ "mol" => { dimension: "AmountOfSubstance", order: 6, symbol: "N" },
+ "cd" => { dimension: "LuminousIntensity", order: 7, symbol: "J" },
+ }
+
+ def Asciimath2UnitsML(expression)
+ xml = Nokogiri::XML(asciimath2mathml(expression))
+ MathML2UnitsML(xml).to_xml
+ end
+
+ # https://www.w3.org/TR/mathml-units/ section 2: delimit number Invisible-Times unit
+ def MathML2UnitsML(xml)
+ xml.xpath(".//m:mtext", "m" => MATHML_NS).each do |x|
+ next unless %r{^unitsml\(.+\)$}.match(x.text)
+ text = x.text.sub(%r{^unitsml\((.+)\)$}m, "\\1")
+ units = parse(text)
+ delim = x&.previous_element&.name == "mn" ? "<mo rspace='thickmathspace'>&#x2062;</mo>" : ""
+ x.replace("#{delim}<mrow xref='#{unit_id(text)}'>#{mathmlsymbol(units)}</mrow>\n#{unitsml(units, text)}")
+ end
+ xml
+ end
+
+ def asciimath2mathml(expression)
+ AsciiMath::MathMLBuilder.new(:msword => true).append_expression(
+ AsciiMath.parse(HTMLEntities.new.decode(expression)).ast).to_s.
+ gsub(/<math>/, "<math xmlns='#{MATHML_NS}'>")
+ end
  end
  end
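
The two decisions that `MathML2UnitsML` makes for each `<mtext>` can be sketched without Nokogiri: whether the text is a units expression, and whether a spacing invisible times goes in front of it. `unitsml_body` and `number_delimiter` are hypothetical names, and the pattern is anchored with `\A` and `\z` rather than the `^` and `$` used in the gem, so this is an approximation of the moved code, not a copy of it.

```ruby
# Assumption: anchored on the whole string, unlike the line anchors in the gem.
UNITSML_RE = %r{\Aunitsml\((.+)\)\z}m

# Returns the expression inside unitsml(...), or nil if the text is not one.
def unitsml_body(text)
  match = UNITSML_RE.match(text)
  match && match[1]
end

# previous_name stands in for the name of the element preceding the <mtext>.
# Following MathML Units, a number (<mn>) is separated from the unit by a
# spacing invisible times; anything else gets no delimiter.
def number_delimiter(previous_name)
  previous_name == "mn" ? "<mo rspace='thickmathspace'>&#x2062;</mo>" : ""
end

puts unitsml_body("unitsml(kg*s^-2)")
puts number_delimiter("mn")
```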
@@ -1,3 +1,3 @@
  module Asciimath2UnitsML
- VERSION = '0.0.2'.freeze
+ VERSION = '0.1.0'.freeze
  end
data/spec/conv_spec.rb CHANGED
@@ -94,7 +94,7 @@ RSpec.describe Asciimath2UnitsML do
  <UnitSystem name='SI' type='SI_derived' xml:lang='en-US'/>
  <UnitName xml:lang='en'>kg*s^-2</UnitName>
  <UnitSymbol type='HTML'>
- kg &#xB7; s
+ kg&#xB7;s
  <sup>&#x2212;2</sup>
  </UnitSymbol>
  <UnitSymbol type='MathML'>
@@ -147,7 +147,7 @@ RSpec.describe Asciimath2UnitsML do
  <UnitSystem name='SI' type='SI_derived' xml:lang='en-US'/>
  <UnitName xml:lang='en'>meter per second squared</UnitName>
  <UnitSymbol type='HTML'>
- m &#xB7; s
+ m&#xB7;s
  <sup>&#x2212;2</sup>
  </UnitSymbol>
  <UnitSymbol type='MathML'>
@@ -197,7 +197,7 @@ RSpec.describe Asciimath2UnitsML do
  <UnitSymbol type='HTML'>
  C
  <sup>3</sup>
- &#xB7; A
+ &#xB7;A
  </UnitSymbol>
  <UnitSymbol type='MathML'>
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
@@ -251,4 +251,181 @@ RSpec.describe Asciimath2UnitsML do
  12 "unitsml(que?)"
  INPUT
  end
+
+ it "initialises multiplier" do
+ expect(xmlpp(Asciimath2UnitsML::Conv.new(multiplier: "\u00d7").Asciimath2UnitsML(<<~INPUT))).to be_equivalent_to xmlpp(<<~OUTPUT)
+ 1 "unitsml(kg*s^-2)"
+ INPUT
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
+ <mn>1</mn>
+ <mo rspace='thickmathspace'>&#x2062;</mo>
+ <mrow xref='U_kg.s-2'>
+ <mi mathvariant='normal'>kg</mi>
+ <mo>&#xD7;</mo>
+ <msup>
+ <mrow>
+ <mi mathvariant='normal'>s</mi>
+ </mrow>
+ <mrow>
+ <mo>&#x2212;</mo>
+ <mn>2</mn>
+ </mrow>
+ </msup>
+ </mrow>
+ <Unit xmlns='http://unitsml.nist.gov/2005' xml:id='U_kg.s-2' dimensionURL='#D_MT-2'>
+ <UnitSystem name='SI' type='SI_derived' xml:lang='en-US'/>
+ <UnitName xml:lang='en'>kg*s^-2</UnitName>
+ <UnitSymbol type='HTML'>
+ kg&#xD7;s
+ <sup>&#x2212;2</sup>
+ </UnitSymbol>
+ <UnitSymbol type='MathML'>
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
+ <mrow>
+ <mi mathvariant='normal'>kg</mi>
+ <mo>&#xD7;</mo>
+ <msup>
+ <mrow>
+ <mi mathvariant='normal'>s</mi>
+ </mrow>
+ <mrow>
+ <mo>&#x2212;</mo>
+ <mn>2</mn>
+ </mrow>
+ </msup>
+ </mrow>
+ </math>
+ </UnitSymbol>
+ <RootUnits>
+ <EnumeratedRootUnit unit='gram' prefix='k'/>
+ <EnumeratedRootUnit unit='second' powerNumerator='-2'/>
+ </RootUnits>
+ </Unit>
+ <Prefix xmlns='http://unitsml.nist.gov/2005' prefixBase='10' prefixPower='3' xml:id='NISTp10_3'>
+ <PrefixName xml:lang='en'>kilo</PrefixName>
+ <PrefixSymbol type='ASCII'>k</PrefixSymbol>
+ </Prefix>
+ <Dimension xmlns='http://unitsml.nist.gov/2005' xml:id='D_MT-2'>
+ <Mass symbol='M' powerNumerator='1'/>
+ <Time symbol='T' powerNumerator='-2'/>
+ </Dimension>
+ </math>
+ OUTPUT
+ expect(xmlpp(Asciimath2UnitsML::Conv.new(multiplier: :space).Asciimath2UnitsML(<<~INPUT))).to be_equivalent_to xmlpp(<<~OUTPUT)
+ 1 "unitsml(kg*s^-2)"
+ INPUT
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
+ <mn>1</mn>
+ <mo rspace='thickmathspace'>&#x2062;</mo>
+ <mrow xref='U_kg.s-2'>
+ <mi mathvariant='normal'>kg</mi>
+ <mo rspace='thickmathspace'>&#x2062;</mo>
+ <msup>
+ <mrow>
+ <mi mathvariant='normal'>s</mi>
+ </mrow>
+ <mrow>
+ <mo>&#x2212;</mo>
+ <mn>2</mn>
+ </mrow>
+ </msup>
+ </mrow>
+ <unit xmlns='http://unitsml.nist.gov/2005' dimensionurl='#D_MT-2' xml:id='U_kg.s-2'>
+ <unitsystem name='SI' type='SI_derived' xml:lang='en-US'/>
+ <unitname xml:lang='en'>kg*s^-2</unitname>
+ <unitsymbol type='HTML'>
+ kg&#xA0;s
+ <sup>&#x2212;2</sup>
+ </unitsymbol>
+ <unitsymbol type='MathML'>
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
+ <mrow>
+ <mi mathvariant='normal'>kg</mi>
+ <mo rspace='thickmathspace'>&#x2062;</mo>
+ <msup>
+ <mrow>
+ <mi mathvariant='normal'>s</mi>
+ </mrow>
+ <mrow>
+ <mo>&#x2212;</mo>
+ <mn>2</mn>
+ </mrow>
+ </msup>
+ </mrow>
+ </math>
+ </unitsymbol>
+ <rootunits>
+ <enumeratedrootunit unit='gram' prefix='k'/>
+ <enumeratedrootunit unit='second' powernumerator='-2'/>
+ </rootunits>
+ </unit>
+ <prefix xmlns='http://unitsml.nist.gov/2005' prefixbase='10' prefixpower='3' xml:id='NISTp10_3'>
+ <prefixname xml:lang='en'>kilo</prefixname>
+ <prefixsymbol type='ASCII'>k</prefixsymbol>
+ </prefix>
+ <dimension xmlns='http://unitsml.nist.gov/2005' xml:id='D_MT-2'>
+ <mass symbol='M' powernumerator='1'/>
+ <time symbol='T' powernumerator='-2'/>
+ </dimension>
+ </math>
+ OUTPUT
+ expect(xmlpp(Asciimath2UnitsML::Conv.new(multiplier: :nospace).Asciimath2UnitsML(<<~INPUT))).to be_equivalent_to xmlpp(<<~OUTPUT)
+ 1 "unitsml(kg*s^-2)"
+ INPUT
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
+ <mn>1</mn>
+ <mo rspace='thickmathspace'>&#x2062;</mo>
+ <mrow xref='U_kg.s-2'>
+ <mi mathvariant='normal'>kg</mi>
+ <mo>&#x2062;</mo>
+ <msup>
+ <mrow>
+ <mi mathvariant='normal'>s</mi>
+ </mrow>
+ <mrow>
+ <mo>&#x2212;</mo>
+ <mn>2</mn>
+ </mrow>
+ </msup>
+ </mrow>
+ <Unit xmlns='http://unitsml.nist.gov/2005' xml:id='U_kg.s-2' dimensionURL='#D_MT-2'>
+ <UnitSystem name='SI' type='SI_derived' xml:lang='en-US'/>
+ <UnitName xml:lang='en'>kg*s^-2</UnitName>
+ <UnitSymbol type='HTML'>
+ kgs
+ <sup>&#x2212;2</sup>
+ </UnitSymbol>
+ <UnitSymbol type='MathML'>
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
+ <mrow>
+ <mi mathvariant='normal'>kg</mi>
+ <mo>&#x2062;</mo>
+ <msup>
+ <mrow>
+ <mi mathvariant='normal'>s</mi>
+ </mrow>
+ <mrow>
+ <mo>&#x2212;</mo>
+ <mn>2</mn>
+ </mrow>
+ </msup>
+ </mrow>
+ </math>
+ </UnitSymbol>
+ <RootUnits>
+ <EnumeratedRootUnit unit='gram' prefix='k'/>
+ <EnumeratedRootUnit unit='second' powerNumerator='-2'/>
+ </RootUnits>
+ </Unit>
+ <Prefix xmlns='http://unitsml.nist.gov/2005' prefixBase='10' prefixPower='3' xml:id='NISTp10_3'>
+ <PrefixName xml:lang='en'>kilo</PrefixName>
+ <PrefixSymbol type='ASCII'>k</PrefixSymbol>
+ </Prefix>
+ <Dimension xmlns='http://unitsml.nist.gov/2005' xml:id='D_MT-2'>
+ <Mass symbol='M' powerNumerator='1'/>
+ <Time symbol='T' powerNumerator='-2'/>
+ </Dimension>
+ </math>
+ OUTPUT
+ end
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: asciimath2unitsml
  version: !ruby/object:Gem::Version
- version: 0.0.2
+ version: 0.1.0
  platform: ruby
  authors:
  - Ribose Inc.
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2021-02-15 00:00:00.000000000 Z
+ date: 2021-02-18 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: asciimath