asciimath2unitsml 0.0.2 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: d2ef44eb717d6b445489de85ee1e8eb3f5b81a6d83602c48ea23a8cbe4f100c7
4
- data.tar.gz: 84b867f5a97b3ad154c7be8e5975ca4c97446049c4368ece6ae5e23da3c94437
3
+ metadata.gz: 97081f507478d1952ab2010cd4810efe745980341b9d4d1a420dbd8132be88b0
4
+ data.tar.gz: ef63bceb0dc7b7c8dc8c40a1b6fa3795b02a3d0f0d9454be41e121d3a5199054
5
5
  SHA512:
6
- metadata.gz: b749aa65924f4b815a7d38d2df9c9d6b48e4cd8f9db62678c100d732a60c8159b537fa2c3bd192296f533371f731ec85365f0d7f231f560f173d167971c38871
7
- data.tar.gz: 64a47773ef26b6b870fc6a4406b697ec10b943e637a82f549946d8126830db9a521f3bdcb4e314718d084e40b7cfacbc820417476cd98663760ec8de29ba566e
6
+ metadata.gz: ead9d484a747edf442ff9ce220d46e53f32ccfeada2653ca44e2ecac9a3da4ade854fbfe0dbb6d0120b87358e5cfae0dee3135e7798eed331b7cff25b3558a9c
7
+ data.tar.gz: ed1aaf1967055b8ca4e270bef69b02b24ccef20fb075cc9f46acee7f5447e7c92e9c2f9bfd36646ee043b4f335dd4fd15920d958d429bdf89d155c5ddc14d729
data/README.adoc CHANGED
@@ -1,21 +1,27 @@
1
1
  = asciimath2unitsml
2
- Convert Asciimath via MathML to UnitsML
2
+ Convert Units expressions via MathML to UnitsML
3
3
 
4
- Encode UnitsML expressions in AsciiMath as `"unitsml(...)"`. The gem converts
5
- AsciiMath incorporating UnitsML expressions (based on the Ascii representation provided by NIST)
4
+ This gem converts
5
+ MathML incorporating UnitsML expressions (based on the Ascii representation provided by NIST)
6
6
  into MathML complying with https://www.w3.org/TR/mathml-units/[], with
7
- UnitsML markup embedded in it, with identifiers for each unit and dimension.
7
+ UnitsML markup embedded in it, and with unique identifiers for each distinct unit and dimension.
8
+ Units expressions are identified in MathML as `<mtext>unitsml(...)</mtext>`, which in turn
9
+ can be identified in AsciiMath as `"unitsml(...)"`.
8
10
  The consuming document is meant to deduplicate the instances of UnitsML markup
9
11
  with the same identifier, and potentially remove them to elsewhere in the document
10
12
  or another document.
11
13
 
12
- The AsciiMath conventions used are:
14
+ The conventions used for writing units are:
13
15
 
14
16
  * `^` for exponents, e.g. `m^-2`
15
17
  * `*` to combine two units by multiplication; e.g. `m*s^-2`. Division is not supported, use negative exponents instead
16
18
  * `u` for μ (micro-)
17
19
 
18
- So
20
+ The gem follows the MathML Units convention of inserting a spacing invisible times operator
21
+ (`<mo rspace='thickmathspace'>&#x2062;</mo>`) between any numbers (`<mn>`) and unit expressions
22
+ in MathML, and representing units in MathML as non-italic variables (`<mi mathvariant='normal'>`).
23
+
24
+ So:
19
25
 
20
26
  [source]
21
27
  ----
@@ -77,4 +83,20 @@ is converted into:
77
83
  </math>
78
84
  ----
79
85
 
86
+ The converter is run as:
87
+
88
+ [source,ruby]
89
+ ----
90
+ c = Asciimath2UnitsML::Conv.new()
91
+ c.Asciimath2UnitsML({Asciimath string containing "unitsml()"})
92
+ c.MathML2UnitsML({Nokogiri parse of MathML document containing <mtext>unitsml()</mtext>})
93
+ ----
94
+
95
+ The converter class may be initialised with options:
80
96
 
97
+ * `multiplier` is the symbol used to represent the multiplication of units. By default,
98
+ following MathML Units, the symbol is middle dot (`&#xB7;`). An arbitrary UTF-8 string can be
99
+ supplied instead; it will be encoded as XML entities. The value `:space` is rendered
100
+ as a spacing invisible times in MathML (`<mo rspace='thickmathspace'>&#x2062;</mo>`),
101
+ and as a non-breaking space in HTML. The value `:nospace` is rendered as a non-spacing
102
+ invisible times in MathML (`<mo>&#x2062;</mo>`), and is not rendered in HTML.
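The three cases of the `multiplier` option described above can be sketched in isolation. This is a minimal illustration, not the gem's implementation: `encode_entities`, `multiplier_markup` and `join_units_html` are hypothetical names, and the entity encoding is a simplified stdlib-only stand-in (it assumes every non-ASCII character becomes a hexadecimal character reference and does not escape `&`, `<` or `>`).

```ruby
# Hypothetical stand-in for the gem's entity encoding (assumption, not the gem's API):
# each non-ASCII character becomes a hexadecimal character reference.
def encode_entities(str)
  str.each_char.map { |c| c.ord > 127 ? format("&#x%X;", c.ord) : c }.join
end

# Maps a multiplier option value to its HTML and MathML renderings,
# following the three cases described in the README text.
def multiplier_markup(value = "\u00b7")
  case value
  when :space
    # spacing invisible times in MathML, non-breaking space in HTML
    { html: "&nbsp;", mathml: "<mo rspace='thickmathspace'>&#x2062;</mo>" }
  when :nospace
    # non-spacing invisible times in MathML, nothing in HTML
    { html: "", mathml: "<mo>&#x2062;</mo>" }
  else
    # any other string is used as the separator, with non-ASCII characters encoded
    encoded = encode_entities(value)
    { html: encoded, mathml: "<mo>#{encoded}</mo>" }
  end
end

# Joins unit symbols with the HTML form of the chosen multiplier.
def join_units_html(symbols, value = "\u00b7")
  symbols.join(multiplier_markup(value)[:html])
end

puts join_units_html(%w[kg s], "\u00d7")
puts join_units_html(%w[kg s], :space)
puts join_units_html(%w[kg s], :nospace)
```

Returning both renderings from one lookup keeps the HTML and MathML symbols in step, so a single option value drives both outputs.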
@@ -11,36 +11,25 @@ module Asciimath2UnitsML
11
11
  UNITSML_NS = "http://unitsml.nist.gov/2005".freeze
12
12
 
13
13
  class Conv
14
- def initialize
14
+ def initialize(options = {})
15
15
  @prefixes_id = read_yaml("../unitsdb/prefixes.yaml")
16
16
  @prefixes = flip_name_and_id(@prefixes_id)
17
17
  @quantities = read_yaml("../unitsdb/quantities.yaml")
18
18
  @units_id = read_yaml("../unitsdb/units.yaml")
19
19
  @units = flip_name_and_id(@units_id)
20
20
  @parser = parser
21
+ @multiplier = multiplier(options[:multiplier] || "\u00b7")
21
22
  end
22
23
 
23
- # https://www.w3.org/TR/mathml-units/ section 2: delimit number Invisible-Times unit
24
- def Asciimath2UnitsML(expression)
25
- xml = Nokogiri::XML(asciimath2mathml(expression))
26
- MathML2UnitsML(xml).to_xml
27
- end
28
-
29
- def MathML2UnitsML(xml)
30
- xml.xpath(".//m:mtext", "m" => MATHML_NS).each do |x|
31
- next unless %r{^unitsml\(.+\)$}.match(x.text)
32
- text = x.text.sub(%r{^unitsml\((.+)\)$}m, "\\1")
33
- units = parse(text)
34
- delim = x&.previous_element&.name == "mn" ? "<mo rspace='thickmathspace'>&#x2062;</mo>" : ""
35
- x.replace("#{delim}<mrow xref='#{unit_id(text)}'>#{mathmlsymbol(units)}</mrow>\n#{unitsml(units, text)}")
24
+ def multiplier(x)
25
+ case x
26
+ when :space
27
+ { html: "&nbsp;", mathml: "<mo rspace='thickmathspace'>&#x2062;</mo>" }
28
+ when :nospace
29
+ { html: "", mathml: "<mo>&#x2062;</mo>" }
30
+ else
31
+ { html: HTMLEntities.new.encode(x), mathml: "<mo>#{HTMLEntities.new.encode(x)}</mo>" }
36
32
  end
37
- xml
38
- end
39
-
40
- def asciimath2mathml(expression)
41
- AsciiMath::MathMLBuilder.new(:msword => true).append_expression(
42
- AsciiMath.parse(HTMLEntities.new.decode(expression)).ast).to_s.
43
- gsub(/<math>/, "<math xmlns='#{MATHML_NS}'>")
44
33
  end
45
34
 
46
35
  def unit_id(text)
@@ -92,7 +81,7 @@ module Asciimath2UnitsML
92
81
  units.map do |u|
93
82
  u[:exponent] and exp = "<sup>#{u[:exponent].sub(/-/, "&#x2212;")}</sup>"
94
83
  "#{u[:prefix]}#{u[:unit]}#{exp}"
95
- end.join(" &#183; ")
84
+ end.join(@multiplier[:html])
96
85
  end
97
86
 
98
87
  def mathmlsymbol(units)
@@ -104,7 +93,7 @@ module Asciimath2UnitsML
104
93
  else
105
94
  base
106
95
  end
107
- end.join("<mo>&#xB7;</mo>")
96
+ end.join(@multiplier[:mathml])
108
97
  end
109
98
 
110
99
  def mathmlsymbolwrap(units)
@@ -149,17 +138,6 @@ module Asciimath2UnitsML
149
138
  END
150
139
  end
151
140
 
152
- U2D = {
153
- "m" => { dimension: "Length", order: 1, symbol: "L" },
154
- "g" => { dimension: "Mass", order: 2, symbol: "M" },
155
- "kg" => { dimension: "Mass", order: 2, symbol: "M" },
156
- "s" => { dimension: "Time", order: 3, symbol: "T" },
157
- "A" => { dimension: "ElectricCurrent", order: 4, symbol: "I" },
158
- "K" => { dimension: "ThermodynamicTemperature", order: 5, symbol: "Theta" },
159
- "mol" => { dimension: "AmountOfSubstance", order: 6, symbol: "N" },
160
- "cd" => { dimension: "LuminousIntensity", order: 7, symbol: "J" },
161
- }
162
-
163
141
  def units2dimensions(units)
164
142
  norm = normalise_units(units)
165
143
  return if norm.any? { |u| u[:unit] == "unknown" || u[:prefix] == "unknown" }
@@ -219,15 +197,6 @@ module Asciimath2UnitsML
219
197
  "unknown"
220
198
  end
221
199
 
222
- def parse(x)
223
- units = @parser.parse(x)
224
- if !units || Rsec::INVALID[units]
225
- raise Rsec::SyntaxError.new "error parsing UnitsML expression", x, 1, 0
226
- end
227
- Rsec::Fail.reset
228
- units
229
- end
230
-
231
200
  def unitsml(units, text)
232
201
  dims = units2dimensions(units)
233
202
  <<~END
@@ -54,5 +54,39 @@ module Asciimath2UnitsML
54
54
  Rsec::Fail.reset
55
55
  units
56
56
  end
57
+
58
+ U2D = {
59
+ "m" => { dimension: "Length", order: 1, symbol: "L" },
60
+ "g" => { dimension: "Mass", order: 2, symbol: "M" },
61
+ "kg" => { dimension: "Mass", order: 2, symbol: "M" },
62
+ "s" => { dimension: "Time", order: 3, symbol: "T" },
63
+ "A" => { dimension: "ElectricCurrent", order: 4, symbol: "I" },
64
+ "K" => { dimension: "ThermodynamicTemperature", order: 5, symbol: "Theta" },
65
+ "mol" => { dimension: "AmountOfSubstance", order: 6, symbol: "N" },
66
+ "cd" => { dimension: "LuminousIntensity", order: 7, symbol: "J" },
67
+ }
68
+
69
+ def Asciimath2UnitsML(expression)
70
+ xml = Nokogiri::XML(asciimath2mathml(expression))
71
+ MathML2UnitsML(xml).to_xml
72
+ end
73
+
74
+ # https://www.w3.org/TR/mathml-units/ section 2: delimit number Invisible-Times unit
75
+ def MathML2UnitsML(xml)
76
+ xml.xpath(".//m:mtext", "m" => MATHML_NS).each do |x|
77
+ next unless %r{^unitsml\(.+\)$}.match(x.text)
78
+ text = x.text.sub(%r{^unitsml\((.+)\)$}m, "\\1")
79
+ units = parse(text)
80
+ delim = x&.previous_element&.name == "mn" ? "<mo rspace='thickmathspace'>&#x2062;</mo>" : ""
81
+ x.replace("#{delim}<mrow xref='#{unit_id(text)}'>#{mathmlsymbol(units)}</mrow>\n#{unitsml(units, text)}")
82
+ end
83
+ xml
84
+ end
85
+
86
+ def asciimath2mathml(expression)
87
+ AsciiMath::MathMLBuilder.new(:msword => true).append_expression(
88
+ AsciiMath.parse(HTMLEntities.new.decode(expression)).ast).to_s.
89
+ gsub(/<math>/, "<math xmlns='#{MATHML_NS}'>")
90
+ end
57
91
  end
58
92
  end
@@ -1,3 +1,3 @@
1
1
  module Asciimath2UnitsML
2
- VERSION = '0.0.2'.freeze
2
+ VERSION = '0.1.0'.freeze
3
3
  end
data/spec/conv_spec.rb CHANGED
@@ -94,7 +94,7 @@ RSpec.describe Asciimath2UnitsML do
94
94
  <UnitSystem name='SI' type='SI_derived' xml:lang='en-US'/>
95
95
  <UnitName xml:lang='en'>kg*s^-2</UnitName>
96
96
  <UnitSymbol type='HTML'>
97
- kg &#xB7; s
97
+ kg&#xB7;s
98
98
  <sup>&#x2212;2</sup>
99
99
  </UnitSymbol>
100
100
  <UnitSymbol type='MathML'>
@@ -147,7 +147,7 @@ RSpec.describe Asciimath2UnitsML do
147
147
  <UnitSystem name='SI' type='SI_derived' xml:lang='en-US'/>
148
148
  <UnitName xml:lang='en'>meter per second squared</UnitName>
149
149
  <UnitSymbol type='HTML'>
150
- m &#xB7; s
150
+ m&#xB7;s
151
151
  <sup>&#x2212;2</sup>
152
152
  </UnitSymbol>
153
153
  <UnitSymbol type='MathML'>
@@ -197,7 +197,7 @@ RSpec.describe Asciimath2UnitsML do
197
197
  <UnitSymbol type='HTML'>
198
198
  C
199
199
  <sup>3</sup>
200
- &#xB7; A
200
+ &#xB7;A
201
201
  </UnitSymbol>
202
202
  <UnitSymbol type='MathML'>
203
203
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
@@ -251,4 +251,181 @@ RSpec.describe Asciimath2UnitsML do
251
251
  12 "unitsml(que?)"
252
252
  INPUT
253
253
  end
254
+
255
+ it "initialises multiplier" do
256
+ expect(xmlpp(Asciimath2UnitsML::Conv.new(multiplier: "\u00d7").Asciimath2UnitsML(<<~INPUT))).to be_equivalent_to xmlpp(<<~OUTPUT)
257
+ 1 "unitsml(kg*s^-2)"
258
+ INPUT
259
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
260
+ <mn>1</mn>
261
+ <mo rspace='thickmathspace'>&#x2062;</mo>
262
+ <mrow xref='U_kg.s-2'>
263
+ <mi mathvariant='normal'>kg</mi>
264
+ <mo>&#xD7;</mo>
265
+ <msup>
266
+ <mrow>
267
+ <mi mathvariant='normal'>s</mi>
268
+ </mrow>
269
+ <mrow>
270
+ <mo>&#x2212;</mo>
271
+ <mn>2</mn>
272
+ </mrow>
273
+ </msup>
274
+ </mrow>
275
+ <Unit xmlns='http://unitsml.nist.gov/2005' xml:id='U_kg.s-2' dimensionURL='#D_MT-2'>
276
+ <UnitSystem name='SI' type='SI_derived' xml:lang='en-US'/>
277
+ <UnitName xml:lang='en'>kg*s^-2</UnitName>
278
+ <UnitSymbol type='HTML'>
279
+ kg&#xD7;s
280
+ <sup>&#x2212;2</sup>
281
+ </UnitSymbol>
282
+ <UnitSymbol type='MathML'>
283
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
284
+ <mrow>
285
+ <mi mathvariant='normal'>kg</mi>
286
+ <mo>&#xD7;</mo>
287
+ <msup>
288
+ <mrow>
289
+ <mi mathvariant='normal'>s</mi>
290
+ </mrow>
291
+ <mrow>
292
+ <mo>&#x2212;</mo>
293
+ <mn>2</mn>
294
+ </mrow>
295
+ </msup>
296
+ </mrow>
297
+ </math>
298
+ </UnitSymbol>
299
+ <RootUnits>
300
+ <EnumeratedRootUnit unit='gram' prefix='k'/>
301
+ <EnumeratedRootUnit unit='second' powerNumerator='-2'/>
302
+ </RootUnits>
303
+ </Unit>
304
+ <Prefix xmlns='http://unitsml.nist.gov/2005' prefixBase='10' prefixPower='3' xml:id='NISTp10_3'>
305
+ <PrefixName xml:lang='en'>kilo</PrefixName>
306
+ <PrefixSymbol type='ASCII'>k</PrefixSymbol>
307
+ </Prefix>
308
+ <Dimension xmlns='http://unitsml.nist.gov/2005' xml:id='D_MT-2'>
309
+ <Mass symbol='M' powerNumerator='1'/>
310
+ <Time symbol='T' powerNumerator='-2'/>
311
+ </Dimension>
312
+ </math>
313
+ OUTPUT
314
+ expect(xmlpp(Asciimath2UnitsML::Conv.new(multiplier: :space).Asciimath2UnitsML(<<~INPUT))).to be_equivalent_to xmlpp(<<~OUTPUT)
315
+ 1 "unitsml(kg*s^-2)"
316
+ INPUT
317
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
318
+ <mn>1</mn>
319
+ <mo rspace='thickmathspace'>&#x2062;</mo>
320
+ <mrow xref='U_kg.s-2'>
321
+ <mi mathvariant='normal'>kg</mi>
322
+ <mo rspace='thickmathspace'>&#x2062;</mo>
323
+ <msup>
324
+ <mrow>
325
+ <mi mathvariant='normal'>s</mi>
326
+ </mrow>
327
+ <mrow>
328
+ <mo>&#x2212;</mo>
329
+ <mn>2</mn>
330
+ </mrow>
331
+ </msup>
332
+ </mrow>
333
+ <unit xmlns='http://unitsml.nist.gov/2005' dimensionurl='#D_MT-2' xml:id='U_kg.s-2'>
334
+ <unitsystem name='SI' type='SI_derived' xml:lang='en-US'/>
335
+ <unitname xml:lang='en'>kg*s^-2</unitname>
336
+ <unitsymbol type='HTML'>
337
+ kg&#xA0;s
338
+ <sup>&#x2212;2</sup>
339
+ </unitsymbol>
340
+ <unitsymbol type='MathML'>
341
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
342
+ <mrow>
343
+ <mi mathvariant='normal'>kg</mi>
344
+ <mo rspace='thickmathspace'>&#x2062;</mo>
345
+ <msup>
346
+ <mrow>
347
+ <mi mathvariant='normal'>s</mi>
348
+ </mrow>
349
+ <mrow>
350
+ <mo>&#x2212;</mo>
351
+ <mn>2</mn>
352
+ </mrow>
353
+ </msup>
354
+ </mrow>
355
+ </math>
356
+ </unitsymbol>
357
+ <rootunits>
358
+ <enumeratedrootunit unit='gram' prefix='k'/>
359
+ <enumeratedrootunit unit='second' powernumerator='-2'/>
360
+ </rootunits>
361
+ </unit>
362
+ <prefix xmlns='http://unitsml.nist.gov/2005' prefixbase='10' prefixpower='3' xml:id='NISTp10_3'>
363
+ <prefixname xml:lang='en'>kilo</prefixname>
364
+ <prefixsymbol type='ASCII'>k</prefixsymbol>
365
+ </prefix>
366
+ <dimension xmlns='http://unitsml.nist.gov/2005' xml:id='D_MT-2'>
367
+ <mass symbol='M' powernumerator='1'/>
368
+ <time symbol='T' powernumerator='-2'/>
369
+ </dimension>
370
+ </math>
371
+ OUTPUT
372
+ expect(xmlpp(Asciimath2UnitsML::Conv.new(multiplier: :nospace).Asciimath2UnitsML(<<~INPUT))).to be_equivalent_to xmlpp(<<~OUTPUT)
373
+ 1 "unitsml(kg*s^-2)"
374
+ INPUT
375
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
376
+ <mn>1</mn>
377
+ <mo rspace='thickmathspace'>&#x2062;</mo>
378
+ <mrow xref='U_kg.s-2'>
379
+ <mi mathvariant='normal'>kg</mi>
380
+ <mo>&#x2062;</mo>
381
+ <msup>
382
+ <mrow>
383
+ <mi mathvariant='normal'>s</mi>
384
+ </mrow>
385
+ <mrow>
386
+ <mo>&#x2212;</mo>
387
+ <mn>2</mn>
388
+ </mrow>
389
+ </msup>
390
+ </mrow>
391
+ <Unit xmlns='http://unitsml.nist.gov/2005' xml:id='U_kg.s-2' dimensionURL='#D_MT-2'>
392
+ <UnitSystem name='SI' type='SI_derived' xml:lang='en-US'/>
393
+ <UnitName xml:lang='en'>kg*s^-2</UnitName>
394
+ <UnitSymbol type='HTML'>
395
+ kgs
396
+ <sup>&#x2212;2</sup>
397
+ </UnitSymbol>
398
+ <UnitSymbol type='MathML'>
399
+ <math xmlns='http://www.w3.org/1998/Math/MathML'>
400
+ <mrow>
401
+ <mi mathvariant='normal'>kg</mi>
402
+ <mo>&#x2062;</mo>
403
+ <msup>
404
+ <mrow>
405
+ <mi mathvariant='normal'>s</mi>
406
+ </mrow>
407
+ <mrow>
408
+ <mo>&#x2212;</mo>
409
+ <mn>2</mn>
410
+ </mrow>
411
+ </msup>
412
+ </mrow>
413
+ </math>
414
+ </UnitSymbol>
415
+ <RootUnits>
416
+ <EnumeratedRootUnit unit='gram' prefix='k'/>
417
+ <EnumeratedRootUnit unit='second' powerNumerator='-2'/>
418
+ </RootUnits>
419
+ </Unit>
420
+ <Prefix xmlns='http://unitsml.nist.gov/2005' prefixBase='10' prefixPower='3' xml:id='NISTp10_3'>
421
+ <PrefixName xml:lang='en'>kilo</PrefixName>
422
+ <PrefixSymbol type='ASCII'>k</PrefixSymbol>
423
+ </Prefix>
424
+ <Dimension xmlns='http://unitsml.nist.gov/2005' xml:id='D_MT-2'>
425
+ <Mass symbol='M' powerNumerator='1'/>
426
+ <Time symbol='T' powerNumerator='-2'/>
427
+ </Dimension>
428
+ </math>
429
+ OUTPUT
430
+ end
254
431
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: asciimath2unitsml
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.2
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ribose Inc.
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2021-02-15 00:00:00.000000000 Z
11
+ date: 2021-02-18 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: asciimath