plurimath 0.10.4 → 0.10.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (33) hide show
  1. checksums.yaml +4 -4
  2. data/lib/plurimath/mathml/translator.rb +1 -2
  3. data/lib/plurimath/version.rb +1 -1
  4. metadata +2 -31
  5. data/TODO.bad_symbols.md +0 -45
  6. data/TODO.bugs/01-unitsml-enoent.md +0 -28
  7. data/TODO.bugs/02-system-stack-error-cloned-objects.md +0 -34
  8. data/TODO.bugs/03-omml-underover-displaystyle.md +0 -23
  9. data/TODO.bugs/04-unitsml-spec-diffs.md +0 -20
  10. data/TODO.bugs/05-omml-greek-entity-encoding.md +0 -50
  11. data/TODO.bugs/mml_custom_model_child_order.md +0 -149
  12. data/TODO.fix-fails/01-phantom-whitespace.md +0 -63
  13. data/TODO.fix-fails/02-table-parentheses-latex.md +0 -17
  14. data/TODO.fix-fails/03-longidv-tag-mathml.md +0 -17
  15. data/TODO.fix-fails/04-mmultiscript-none-mathml.md +0 -17
  16. data/TODO.fix-fails/05-mstyle-nary-oint-mathml.md +0 -17
  17. data/TODO.fix-fails/06-issue-238-mathml.md +0 -17
  18. data/TODO.fix-fails/07-metanorma-bipm-latex.md +0 -19
  19. data/TODO.fix-fails/08-metanorma-itu-latex.md +0 -17
  20. data/TODO.fix-fails/09-omml-greek-encoding.md +0 -24
  21. data/TODO.fix-fails/10-omml-underover-greek.md +0 -19
  22. data/TODO.fix-fails/11-omml-multiscripts-zwsp.md +0 -19
  23. data/TODO.fix-fails/12-omml-oint-integral.md +0 -19
  24. data/TODO.fix-fails/13-omml-nary-prod.md +0 -19
  25. data/TODO.fix-fails/14-omml-empty-mo.md +0 -19
  26. data/TODO.fix-fails/REMAINING_FAILURES.md +0 -168
  27. data/TODO.fix-tests/00-overview.md +0 -102
  28. data/TODO.fix-tests/01-zero-width-space.md +0 -34
  29. data/TODO.fix-tests/02-html-linebreak.md +0 -41
  30. data/TODO.fix-tests/03-phantom-whitespace.md +0 -34
  31. data/TODO.fix-tests/04-mathml-structure.md +0 -33
  32. data/TODO.fix-tests/05-omml-rendering.md +0 -27
  33. data/TODO.mml-plurimath-model.md +0 -86
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 7aad5adc31e4e7391ca43f1e6c4608d00710092037b26a0a42c6f47de06c8524
4
- data.tar.gz: 89f63ecb2b54a96a2326802917e54a6dd82ba55426d173fe575010d3ac706524
3
+ metadata.gz: df50cd9d9fe202d3c2847ab7bd9cc5f40d702c9e531c80f62eff307dea491111
4
+ data.tar.gz: 27221b3755dca96be34fb924a83fa1fa495890f079b7f1f360f4b4f290286f3c
5
5
  SHA512:
6
- metadata.gz: d4b5a070148d62563d62bd3f363cecdcf309a95d97de74f4a7b4c0221f8364631b63780d11521a6e1d6666b7f30e28e16e5257938e9b13ac0859a6f0c9375f38
7
- data.tar.gz: 0a3c516e3b3da885fca07a94d36967998ebf0c9c7f4b046f0110062434c3da70e47c03b31af33acbffeb5c3d6a5a398daac7ea15f3cb9b066444c51402d320da
6
+ metadata.gz: 37f43e179e056e95e857a4c6a79c810a6bd8922fba767607195b0e354376017b9329a8e29adc996aa495605d86d1f8244aaa42f07228a7acd882d785cd762e94
7
+ data.tar.gz: b7733abb9f05a78266577ec20b656b71c7d33a7c5db76823e7191906771645074944d29676b98c23bbbf26a21c5d14c6ef6c157aad59cbfa26464c4a72c5eee9
@@ -338,8 +338,7 @@ module Plurimath
338
338
  def msqrt_to_sqrt(sqrt)
339
339
  children = content_children(sqrt)
340
340
  radicand = children.filter_map { |child| mml_to_plurimath(child) }
341
- radicand = radicand.first if radicand.size == 1
342
- Plurimath::Math::Function::Sqrt.new(radicand)
341
+ Plurimath::Math::Function::Sqrt.new(wrap_children(radicand))
343
342
  end
344
343
 
345
344
  # MathML element: <mroot> radicand index
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Plurimath
4
- VERSION = "0.10.4"
4
+ VERSION = "0.10.5"
5
5
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: plurimath
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.10.4
4
+ version: 0.10.5
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ribose Inc.
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2026-05-02 00:00:00.000000000 Z
11
+ date: 2026-05-14 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bigdecimal
@@ -180,35 +180,6 @@ files:
180
180
  - MathML-Supported-Data.adoc
181
181
  - README.adoc
182
182
  - Rakefile
183
- - TODO.bad_symbols.md
184
- - TODO.bugs/01-unitsml-enoent.md
185
- - TODO.bugs/02-system-stack-error-cloned-objects.md
186
- - TODO.bugs/03-omml-underover-displaystyle.md
187
- - TODO.bugs/04-unitsml-spec-diffs.md
188
- - TODO.bugs/05-omml-greek-entity-encoding.md
189
- - TODO.bugs/mml_custom_model_child_order.md
190
- - TODO.fix-fails/01-phantom-whitespace.md
191
- - TODO.fix-fails/02-table-parentheses-latex.md
192
- - TODO.fix-fails/03-longidv-tag-mathml.md
193
- - TODO.fix-fails/04-mmultiscript-none-mathml.md
194
- - TODO.fix-fails/05-mstyle-nary-oint-mathml.md
195
- - TODO.fix-fails/06-issue-238-mathml.md
196
- - TODO.fix-fails/07-metanorma-bipm-latex.md
197
- - TODO.fix-fails/08-metanorma-itu-latex.md
198
- - TODO.fix-fails/09-omml-greek-encoding.md
199
- - TODO.fix-fails/10-omml-underover-greek.md
200
- - TODO.fix-fails/11-omml-multiscripts-zwsp.md
201
- - TODO.fix-fails/12-omml-oint-integral.md
202
- - TODO.fix-fails/13-omml-nary-prod.md
203
- - TODO.fix-fails/14-omml-empty-mo.md
204
- - TODO.fix-fails/REMAINING_FAILURES.md
205
- - TODO.fix-tests/00-overview.md
206
- - TODO.fix-tests/01-zero-width-space.md
207
- - TODO.fix-tests/02-html-linebreak.md
208
- - TODO.fix-tests/03-phantom-whitespace.md
209
- - TODO.fix-tests/04-mathml-structure.md
210
- - TODO.fix-tests/05-omml-rendering.md
211
- - TODO.mml-plurimath-model.md
212
183
  - UnicodeMath-Supported-Data.adoc
213
184
  - UnitsML-Supported-Data.adoc
214
185
  - bin/console
data/TODO.bad_symbols.md DELETED
@@ -1,45 +0,0 @@
1
- # Bad Symbol.new Usages in translator.rb
2
-
3
- ## Line 577 (FIXED)
4
- ```ruby
5
- Plurimath::Math::Symbols::Symbol.new("(", mo_element: true)
6
- ```
7
- Was used in `combine_function_with_parens` to create opening parenthesis.
8
- FIXED: Now uses `next_elem` directly if it's a Paren object, or creates `Paren::Lround.new` as fallback.
9
- Problem with old code: It was ignoring the existing `Paren::Lround` object returned by `mo_to_symbol` and creating a generic Symbol.
10
-
11
- ## Line 171 - mo_element= (FIXED)
12
- ```ruby
13
- result.mo_element = true
14
- ```
15
- This was setting mo_element on Symbol objects returned by mathml_unary_classes.
16
- FIXED: Removed this line as Symbol class doesn't have mo_element attribute and doesn't need it.
17
- The Paren classes already know how to render themselves as `<mo>` via their own to_mathml_without_math_tag method.
18
-
19
- ## Line 169-175 - linebreakstyle not passed (FIXED)
20
- ```ruby
21
- return Plurimath::Math::Function::Linebreak.new(
22
- Plurimath::Math::Symbols::Symbol.new(value)
23
- )
24
- ```
25
- The `linebreakstyle` attribute was not being passed to Linebreak constructor.
26
- FIXED: Now passes `linebreakstyle` attribute and uses `mathml_unary_classes` for proper encoding.
27
-
28
- ## Line 159 - whitespace stripped (PARTIALLY FIXED)
29
- ```ruby
30
- result = Plurimath::Utility.mathml_unary_classes([stripped], lang: :mathml)
31
- ```
32
- Whitespace was being stripped from Symbol values, causing phantom content to lose spaces.
33
- FIXED for LaTeX output: Restores original value after `mathml_unary_classes` creates the Symbol.
34
- MathML structure issue remains a pre-existing bug.
35
-
36
- ## Remaining Issues - SEE TODO.fix-tests/
37
-
38
- Detailed investigation reports for the remaining 19 test failures are in `TODO.fix-tests/`:
39
-
40
- - 00-overview.md - Summary of all failures
41
- - 01-zero-width-space.md - OMML empty element serialization
42
- - 02-html-linebreak.md - HTML linebreak positioning (FIXED)
43
- - 03-phantom-whitespace.md - LaTeX whitespace in phantom (PARTIALLY FIXED)
44
- - 04-mathml-structure.md - MathML structure differences
45
- - 05-omml-rendering.md - OMML rendering issues
@@ -1,28 +0,0 @@
1
- # Fix Unitsml ENOENT - missing unitsdb data files
2
-
3
- ## Problem
4
- ~44 test failures with `Errno::ENOENT` - unitsml gem can't find
5
- `unitsdb/units.yaml`. The GitHub-hosted unitsml gem has an empty
6
- `unitsdb/` directory (git submodule not initialized during `bundle install`).
7
-
8
- ## Affected specs (51 total failures)
9
- - spec/plurimath/asciimath_spec.rb (23 failures)
10
- - spec/plurimath/integration/asciimath_spec.rb (13 failures)
11
- - spec/plurimath/math/formula/unitsml_spec.rb (3 failures)
12
- - spec/plurimath/asciimath/metanorma/mn_samples_bipm_spec.rb (2 failures)
13
- - spec/plurimath/asciimath/metanorma/mn_samples_jcgm_spec.rb (1 failure)
14
- - spec/plurimath/asciimath/parser_spec.rb (1 failure)
15
- - spec/plurimath/unicode_math_spec.rb (1 failure)
16
-
17
- ## Root cause
18
- The unitsml gem's `unitsdb/` is a git submodule. When bundler installs
19
- from GitHub, submodules aren't initialized. The local copy at
20
- `../../unitsml/unitsml-ruby/` has the data.
21
-
22
- ## Fix
23
- Change Gemfile to use local path for unitsml:
24
- ```ruby
25
- gem "unitsml", path: "../../unitsml/unitsml-ruby"
26
- ```
27
-
28
- ## Status: Fixed (changed Gemfile to local path)
@@ -1,34 +0,0 @@
1
- # Fix SystemStackError in Formula#cloned_objects
2
-
3
- ## Problem
4
- 4 test failures with `SystemStackError: stack level too deep` in
5
- `Plurimath::Math::Core#cloned_objects`. The `cloned_objects` method
6
- recursively clones all values, but some formula structures create
7
- infinite recursion.
8
-
9
- ## Affected specs (19 failures total)
10
- - spec/plurimath/mathml_spec.rb:213 - MathML object round-trip
11
- - spec/plurimath/mathml_spec.rb:778 - table with parentheses and sqrt
12
- - spec/plurimath/mathml_spec.rb:1448 - mtable with frame/rowlines
13
- - spec/plurimath/mathml_spec.rb:1713 - metanorma bipm input
14
- - spec/plurimath/math_zone_spec.rb:675 - table math zone
15
- - spec/plurimath/math_zone_spec.rb:4234 - table math zone
16
- - spec/plurimath/integration/asciimath_spec.rb - 12 ogc/bipm/itu/jcgm examples
17
-
18
- ## Stack trace pattern
19
- ```
20
- Core#cloned_objects → Formula#cloned_objects → Array#map →
21
- Core#variable_value → Array#map → Core#cloned_objects → ...
22
- ```
23
-
24
- ## Root cause
25
- `cloned_objects` in `lib/plurimath/math/core.rb:223` creates
26
- `self.class.new(nil)` which triggers `Formula.new(nil)`. If the Formula
27
- contains objects that reference back to the formula or create cyclic
28
- structures, the recursion never terminates.
29
-
30
- This is likely a pre-existing bug exposed by V4 parsing producing
31
- different formula structures. Needs investigation of what formula
32
- structure triggers the infinite recursion.
33
-
34
- ## Status: Open (pre-existing, needs investigation)
@@ -1,23 +0,0 @@
1
- # Fix OMML output diff for underover with displaystyle false
2
-
3
- ## Problem
4
- 1 test failure in mathml_spec.rb:2663 - OMML output for underover
5
- tags with `displaystyle false` produces different XML structure.
6
-
7
- ## Expected vs Actual
8
- Expected: `<m:sSup>` with `<m:sup>` element
9
- Actual: `<m:limUpp>` with `<m:lim>` element
10
-
11
- This means the underover is being interpreted differently (as a limit
12
- upper instead of a superscript).
13
-
14
- ## Root cause
15
- The V4 MathML parser produces a different formula structure for
16
- `<munderover>` elements when `displaystyle="false"`. This changes
17
- how the OMML serializer renders the formula.
18
-
19
- This is related to MathML 4 vs MathML 3 behavior differences. In
20
- MathML 4, displaystyle affects whether underover renders as limits
21
- vs scripts. The specs need to be updated to match V4 behavior.
22
-
23
- ## Status: Fixed (updated expected OMML to use limLow/limUpp)
@@ -1,20 +0,0 @@
1
- # Fix unitsml spec attribute ordering and displaystyle differences
2
-
3
- ## Problem
4
- 2 test failures in spec/plurimath/math/formula/unitsml_spec.rb - the
5
- unitsml-generated MathML output has changed:
6
- 1. XML attribute ordering changed (id vs dimensionURL)
7
- 2. `lang="en-US"` changed to `lang="en"`
8
- 3. Inner `<math>` now includes `displaystyle="true"` (from V4::Math default)
9
-
10
- ## Root cause
11
- The local unitsml gem (at ../../unitsml/unitsml-ruby/) produces slightly
12
- different output than the previously-tested version. The attribute
13
- ordering and lang value differences are from the updated unitsml/unitsdb
14
- data. The `displaystyle="true"` addition is from our V4::Math default.
15
-
16
- ## Fix
17
- Update the expected values in the spec to match the new output. These
18
- are cosmetic differences in XML serialization.
19
-
20
- ## Status: Fixed (updated expected values for attribute ordering, lang, displaystyle)
@@ -1,50 +0,0 @@
1
- # Greek Letter HTML Entity Serialization in moxml Ox Adapter
2
-
3
- ## Issue
4
-
5
- When using the `moxml` gem with the Ox adapter, Greek letters (and other special Unicode characters) are serialized as raw Unicode characters instead of HTML entities.
6
-
7
- **Example:**
8
- - Expected: `<m:t>&#x3b8;</m:t>` (theta as HTML entity)
9
- - Actual: `<m:t>θ</m:t>` (theta as raw Unicode character)
10
-
11
- ## Affected Tests
12
-
13
- All OMML comparison tests in plurimath:
14
-
15
- | Test # | Name | Issue |
16
- |--------|------|-------|
17
- | 2452 | contains multiple tags in Mathml | Greek letters (α, θ) as Unicode vs HTML entities |
18
- | 2663 | contains underover, under, and over tags | Greek letter θ encoding |
19
- | 3121 | multiscripts containing none tag | ZWSP (&#8203;) not rendered - empty m:t |
20
- | 3201 | oint msubsup tag | Integral symbol (&#x222e;) missing in output |
21
- | 3365 | nary prod symbol in underover | Integral symbol (&#x222e;) missing |
22
- | 3486 | empty mo example from plurimath#318 | Empty mo (&#xb1; ±) not rendered |
23
-
24
- ## Root Cause
25
-
26
- The moxml gem's Ox adapter uses `Ox.dump()` to serialize XML. Ox serializes text content as raw Unicode characters rather than HTML entities.
27
-
28
- When a symbol like `θ` (theta, U+03B8) is stored as a text node, Ox outputs it as `θ` instead of `&#x3b8;`.
29
-
30
- ## Expected Behavior
31
-
32
- For MathML/OMML output, certain characters should be serialized as HTML entities:
33
- - Greek letters: `&#x3b8;` for `θ`, `&#x3b1;` for `α`, `&#x3b2;` for `β`
34
- - Zero-width space: `&#8203;` for ZWSP
35
- - Operators: `&#xb1;` for `±`, `&#x222e;` for `∮`
36
-
37
- ## Solution
38
-
39
- The Ox adapter's serialize method needs to post-process text content to encode certain Unicode characters as HTML entities. This could be done in:
40
-
41
- 1. `moxml/lib/moxml/adapter/ox.rb` - modify the `serialize` method to post-process text nodes
42
- 2. Or create a text encoding step when adding text content to elements
43
-
44
- ## Alternative
45
-
46
- Use the Oga adapter instead of Ox adapter. The Oga adapter appears to handle HTML entity encoding correctly (based on test outputs showing `&#8203;` preserved with Oga).
47
-
48
- ## Context
49
-
50
- This is a serialization issue in the moxml library, not in plurimath itself. Plurimath correctly stores the symbol information; the issue is how moxml/Ox serializes that information to XML output.
@@ -1,149 +0,0 @@
1
- # Proposal: Fix Custom Model Deserialization Child Order for Binary Operators
2
-
3
- ## Problem Description
4
-
5
- When using the mml gem's custom model feature to substitute element classes (e.g., mapping `Mml::V4::Mover` to a custom `Overset` class), the child elements are being passed to the custom model's constructor in **reversed order** compared to their document order.
6
-
7
- ## Expected Behavior
8
-
9
- For MathML `<mover>` element:
10
- - First child = base (the element being overscored)
11
- - Second child = overscript (the accent mark)
12
-
13
- When parsing `<mover><mi>θ</mi><mi>d</mi></mover>`:
14
- - `mi[0]` = θ (base)
15
- - `mi[1]` = d (overscript)
16
-
17
- The custom model `Overset.new(base, overscript)` should receive:
18
- - `parameter_one` = θ
19
- - `parameter_two` = d
20
-
21
- ## Actual Behavior
22
-
23
- The custom model `Overset` receives children in **reversed order**:
24
- - `parameter_one` = d (should be θ)
25
- - `parameter_two` = θ (should be d)
26
-
27
- ## Clarification: This is NOT a lutaml-model bug
28
-
29
- Lutaml-model has confirmed this is not an issue in their gem. The problem is entirely within the **mml gem** and how it sets up and uses custom models.
30
-
31
- ## Root Cause Analysis
32
-
33
- ### Investigation Steps
34
-
35
- 1. **Parsing without custom models:**
36
- ```ruby
37
- parsed = Mml.parse(mathml, namespace_exist: true, version: 4)
38
- mover = parsed.instance_variable_get(:@mover_value).first
39
- # mover.mi_value[0].value = "θ" (correct - base first)
40
- # mover.mi_value[1].value = "d" (correct - overscript second)
41
- ```
42
-
43
- 2. **Parsing with custom models:**
44
- ```ruby
45
- Mml::V4::Configuration.custom_models = { Mml::V4::Mover => Overset }
46
- parsed = Mml.parse(mathml, namespace_exist: true, version: 4)
47
- overset = parsed.value.first
48
- # overset.parameter_one.value = "d" (WRONG - should be "θ")
49
- # overset.parameter_two.class = Theta (WRONG - should be "d")
50
- ```
51
-
52
- 3. **Custom model setup in mml gem:**
53
- ```ruby
54
- # From mml gem's context_configuration.rb:
55
- def custom_models=(models_hash)
56
- models_hash.each do |klass, model|
57
- klass.model(model)
58
- end
59
- end
60
-
61
- # This calls lutaml-model's model method:
62
- # From lutaml-model serialize/initialization.rb:
63
- def model(klass = nil)
64
- if klass
65
- @model = klass
66
- add_custom_handling_methods_to_model(klass)
67
- else
68
- @model
69
- end
70
- end
71
- ```
72
-
73
- 4. **The fix must be in mml gem:**
74
- When `klass.model(custom_model)` is called, the mml gem should ensure that when the custom model is instantiated during deserialization, the child elements are passed in document order. The issue is that the mml gem's element attribute definitions (e.g., `element "mover"` with child `mi_value`) are not being properly respected when the custom model substitution is applied.
75
-
76
- Specifically, the mml gem's `ContextConfiguration#custom_models=` sets up the model substitution, but it does not ensure that the original element's `element_order` or child element mappings are preserved for the custom model. When lutaml-model instantiates the custom model via its `from_xml` or similar deserialization path, it passes children in an order that may not match the MathML document order.
77
-
78
- ### Key Finding
79
-
80
- The issue is in **how the mml gem sets up the model substitution**, not in lutaml-model itself. When `Mml::V4::Mover.model(Overset)` is called:
81
-
82
- 1. Lutaml-model stores `@model = Overset`
83
- 2. During XML deserialization, lutaml-model uses its own logic to map XML children to constructor arguments
84
- 3. The mml gem's element definitions (which define child element order via `element_order` or similar) are not being communicated to the custom model
85
-
86
- The fix should be in `ContextConfiguration#custom_models=` — after calling `klass.model(model)`, the mml gem should also ensure that the custom model class inherits or mirrors the original element's child element ordering information.
87
-
88
- ## Affected Elements
89
-
90
- This issue affects all binary MathML elements when using custom models:
91
- - `<mover>` → Overset (base, overscript)
92
- - `<munder>` → Underset (base, underscript)
93
- - `<mover>` and `<munder>` combined in `<munderover>` → Underover (base, underscript, overscript)
94
- - `<msup>` → Power (base, superscript)
95
- - `<msub>` → Base (base, subscript)
96
-
97
- ## Expected Fix
98
-
99
- The fix must be in the **mml gem** — specifically in `ContextConfiguration#custom_models=` or the related element deserialization logic. When setting up a custom model substitution, the mml gem must ensure that the custom model receives child elements in MathML document order.
100
-
101
- ### Specific Fix Location
102
-
103
- In `lib/mml/context_configuration.rb`, the `custom_models=` method should be updated to preserve child element ordering when setting up the model substitution. One approach:
104
-
105
- 1. After calling `klass.model(model)`, copy or alias the original element's `element_order` or child-getting logic to the custom model
106
- 2. Or, intercept the deserialization of elements with custom models to explicitly pass children in document order
107
-
108
- ## Test Case
109
-
110
- ```ruby
111
- # Setup
112
- require "mml"
113
- class Overset < BinaryFunction
114
- def initialize(base = nil, overscript = nil)
115
- super(base, overscript)
116
- end
117
- end
118
- Mml::V4::Configuration.custom_models = { Mml::V4::Mover => Overset }
119
-
120
- # MathML: <mover><mi>θ</mi><mi>d</mi></mover>
121
- # Expected: base=θ, overscript=d
122
- # Actual: base=d, overscript=θ (BUG!)
123
-
124
- mathml = <<~MATHML
125
- <math xmlns="http://www.w3.org/1998/Math/MathML">
126
- <mover>
127
- <mi>θ</mi>
128
- <mi>d</mi>
129
- </mover>
130
- </math>
131
- MATHML
132
-
133
- parsed = Mml.parse(mathml, namespace_exist: true, version: 4)
134
- overset = parsed.value.first
135
- raise "Bug: children reversed!" unless overset.parameter_one.value == "θ"
136
- ```
137
-
138
- ## Impact
139
-
140
- This bug affects any application using the mml gem's custom model feature to substitute element classes, particularly when:
141
- - Building domain-specific MathML parsers (like plurimath)
142
- - Converting MathML to other formats
143
- - Processing mathematical content with accent marks, subscripts, superscripts, etc.
144
-
145
- ## References
146
-
147
- - MathML Specification: https://www.w3.org/TR/MathML3/chapter3.html
148
- - mml gem: https://github.com/metanorma/mml
149
- - lutaml-model gem: https://github.com/lutaml/lutaml-model (confirmed: not the source of this bug)
@@ -1,63 +0,0 @@
1
- # 01 - phantom tag whitespace issue
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:281`
5
-
6
- ## Issue Summary
7
- Test "contains Mathml phantom tag's example" fails with XML equivalence check.
8
-
9
- ## Input MathML
10
- ```xml
11
- <math>
12
- <mrow>
13
- <mi> x </mi>
14
- <mphantom>
15
- <mo> + </mo>
16
- </mphantom>
17
- <mphantom>
18
- <mi> y </mi>
19
- </mphantom>
20
- <mo> + </mo>
21
- <mi> z </mi>
22
- </mrow>
23
- </math>
24
- ```
25
-
26
- ## Expected MathML (from test)
27
- ```xml
28
- <mphantom>
29
- <mo>+</mo>
30
- </mphantom>
31
- <mphantom>
32
- <mi> y </mi>
33
- </mphantom>
34
- <mo> + </mo>
35
- ```
36
-
37
- ## Our Output
38
- ```xml
39
- <mphantom>
40
- <mo> + </mo>
41
- </mphantom>
42
- <mphantom>
43
- <mi> y </mi>
44
- </mphantom>
45
- <mo> + </mo>
46
- ```
47
-
48
- ## Difference
49
- - Expected first mo inside phantom: `<mo>+</mo>` (no spaces)
50
- - Our output first mo inside phantom: `<mo> + </mo>` (with spaces)
51
-
52
- The second mo (outside phantom) correctly preserves spaces.
53
-
54
- ## Analysis
55
- The issue is that `mphantom` is not supposed to preserve the whitespace inside the `mo` element according to the expected test output. However, this seems semantically incorrect - if the input has `<mo> + </mo>`, the spaces are part of the mo content.
56
-
57
- ## Status
58
- **Pre-existing from LutaML-Model update** - This test was passing before the update, meaning the previous code stripped spaces inside mo elements when inside phantom.
59
-
60
- ## Possible Causes
61
- 1. The LutaML-Model update changed how `mo_to_symbol` preserves whitespace
62
- 2. The test expectation was always semantically questionable but matched the previous implementation
63
- 3. Something in the translator is now preserving whitespace that was previously stripped
@@ -1,17 +0,0 @@
1
- # 02 - table with parentheses (LaTeX trailing space)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:778`
5
-
6
- ## Issue Summary
7
- Test "contains table with surrounding parentheses(metanorma example) and sqrt tag" fails on LaTeX comparison.
8
-
9
- ## Failure Details
10
- - Expected and got appear identical but differ in invisible characters or whitespace
11
- - The test uses `to_latex` comparison
12
-
13
- ## Possible Cause
14
- The LaTeX output has extra whitespace/trailing spaces that don't match the expected.
15
-
16
- ## Status
17
- **Pre-existing from LutaML-Model update**
@@ -1,17 +0,0 @@
1
- # 03 - longidv tag (MathML structure)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:1197`
5
-
6
- ## Issue Summary
7
- Test "contains longidv tag Mathml" fails with MathML element_structure differences.
8
-
9
- ## Failure Details
10
- - 3 dimension differences in XML comparison
11
- - Element structure mismatch in the MathML output
12
-
13
- ## Possible Cause
14
- The LutaML-Model update changed how certain MathML elements are parsed/rendered, affecting the structure.
15
-
16
- ## Status
17
- **Pre-existing from LutaML-Model update**
@@ -1,17 +0,0 @@
1
- # 04 - mmultiscript containing none (MathML structure)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:1319`
5
-
6
- ## Issue Summary
7
- Test "contains mmultiscript containing none tag" fails with MathML element_structure differences.
8
-
9
- ## Failure Details
10
- - Element structure mismatch in MathML output
11
- - The `<none>` tag handling may have changed
12
-
13
- ## Possible Cause
14
- The LutaML-Model update changed how `<none>` elements inside `<mmultiscripts>` are handled.
15
-
16
- ## Status
17
- **Pre-existing from LutaML-Model update**
@@ -1,17 +0,0 @@
1
- # 05 - mstyle containing nary oint (MathML structure)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:1366`
5
-
6
- ## Issue Summary
7
- Test "contains mstyle containing nary oint value in msubsup tag" fails with MathML element_structure differences.
8
-
9
- ## Failure Details
10
- - Element structure mismatch in MathML output
11
- - Issue with nary operator rendering inside msubsup
12
-
13
- ## Possible Cause
14
- The LutaML-Model update changed how nary operators are handled within subscript/superscript elements.
15
-
16
- ## Status
17
- **Pre-existing from LutaML-Model update**
@@ -1,17 +0,0 @@
1
- # 06 - plurimath/issue#238 (MathML structure)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:1540`
5
-
6
- ## Issue Summary
7
- Test "contains string from plurimath/issue#238" fails with MathML element_structure differences.
8
-
9
- ## Failure Details
10
- - 4-5 dimension differences in XML comparison
11
- - Complex MathML structure issue
12
-
13
- ## Possible Cause
14
- The issue #238 was about a specific MathML rendering problem that the LutaML-Model update may have affected.
15
-
16
- ## Status
17
- **Pre-existing from LutaML-Model update**
@@ -1,19 +0,0 @@
1
- # 07 - metanorma-cli-actions-mn-bipm (LaTeX structure)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:1713`
5
-
6
- ## Issue Summary
7
- Test "contains input from metanorma-cli-actions-mn-bipm run" fails with LaTeX comparison.
8
-
9
- ## Expected LaTeX
10
- `\underline{\mathit{B}} = \left [ ... \end{matrix}\right ]`
11
-
12
- ## Got LaTeX
13
- `\underset{ \left ( \underline \right ) }{ \left ( \mathit{B} \right ) } = \left [ ... \end{matrix}\right ]`
14
-
15
- ## Issue
16
- The LaTeX rendering of the underline/enclose structure is different - the structure of how `underline` wraps `B` has changed.
17
-
18
- ## Status
19
- **Pre-existing from LutaML-Model update**
@@ -1,17 +0,0 @@
1
- # 08 - metanorma-cli-actions-mn-itu (LaTeX spacing)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:1860`
5
-
6
- ## Issue Summary
7
- Test "contains input from metanorma-cli-actions-mn-itu run" fails with LaTeX comparison.
8
-
9
- ## Expected
10
- `y_{k} = ( x_{k} \pm h ) m`
11
- (Got: `y_{k} = ( x_{k} \pm h ) m`)
12
-
13
- ## Issue
14
- Double space before `m` in expected, single space in our output. This is likely related to how `None#to_latex` or similar empty-returning functions are handled.
15
-
16
- ## Status
17
- **Pre-existing from LutaML-Model update**
@@ -1,24 +0,0 @@
1
- # 09 - OMML Greek letter encoding (multiple tags)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:2452`
5
-
6
- ## Issue Summary
7
- Test "contains multiple tags in Mathml" fails with OMML comparison due to Greek letter encoding.
8
-
9
- ## Expected
10
- `<m:t>&#x3b1;</m:t>` (HTML entity)
11
- `<m:t>&#x3b8;</m:t>` (HTML entity)
12
-
13
- ## Got
14
- `<m:t>α</m:t>` (Unicode character)
15
- `<m:t>θ</m:t>` (Unicode character)
16
-
17
- ## Issue
18
- Greek letters are being output as Unicode characters instead of HTML entities. This is the **Greek letter encoding issue** documented in `TODO.bugs/05-omml-greek-entity-encoding.md`.
19
-
20
- ## Cause
21
- Ox serializer outputs Unicode characters directly instead of HTML entities.
22
-
23
- ## Status
24
- **Pre-existing from LutaML-Model update - KNOWN ISSUE**
@@ -1,19 +0,0 @@
1
- # 10 - OMML Greek letter encoding (underover)
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:2663`
5
-
6
- ## Issue Summary
7
- Test "contains underover, under, and over tags with displaystyle false" fails with OMML comparison.
8
-
9
- ## Expected
10
- `<m:t>&#x3b8;</m:t>` (HTML entity)
11
-
12
- ## Got
13
- `<m:t>θ</m:t>` (Unicode character)
14
-
15
- ## Issue
16
- Greek letter theta encoding issue - same as test 09.
17
-
18
- ## Status
19
- **Pre-existing from LutaML-Model update - KNOWN ISSUE**
@@ -1,19 +0,0 @@
1
- # 11 - OMML multiscripts ZWSP handling
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:3121`
5
-
6
- ## Issue Summary
7
- Test "contains multiscripts containing none tag" fails with OMML comparison.
8
-
9
- ## Expected
10
- `<m:t>&#8203;</m:t>` (ZWSP - Zero-Width Space)
11
-
12
- ## Got
13
- `<m:t></m:t>` (empty)
14
-
15
- ## Issue
16
- ZWSP (Zero-Width Space, U+200B) placeholder is not being rendered in OMML output. The empty `m:t` element suggests the ZWSP character is being lost.
17
-
18
- ## Status
19
- **Pre-existing from LutaML-Model update**
@@ -1,19 +0,0 @@
1
- # 12 - OMML oint msubsup missing integral
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:3201`
5
-
6
- ## Issue Summary
7
- Test "contains oint msubsup tag" fails with OMML comparison.
8
-
9
- ## Expected
10
- Structure includes `<m:t>&#x222e;</m:t>` (contour integral symbol ∮)
11
-
12
- ## Got
13
- Missing the integral symbol in output
14
-
15
- ## Issue
16
- The integral symbol `&#x222e;` is not appearing in the OMML output for nary operators.
17
-
18
- ## Status
19
- **Pre-existing from LutaML-Model update**
@@ -1,19 +0,0 @@
1
- # 13 - OMML nary prod missing integral
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:3365`
5
-
6
- ## Issue Summary
7
- Test "contains nary prod symbol in underover for nary tag" fails with OMML comparison.
8
-
9
- ## Expected
10
- Structure includes integral symbol
11
-
12
- ## Got
13
- Structure differs from expected
14
-
15
- ## Issue
16
- Similar to test 12 - nary product/integral operator rendering issue in OMML.
17
-
18
- ## Status
19
- **Pre-existing from LutaML-Model update**
@@ -1,19 +0,0 @@
1
- # 14 - OMML empty mo example
2
-
3
- ## Test Location
4
- `spec/plurimath/mathml_spec.rb:3486`
5
-
6
- ## Issue Summary
7
- Test "contains empty mo example from plurimath/plurimath#318" fails with OMML comparison.
8
-
9
- ## Expected
10
- `<m:t>&#xb1;</m:t>` (plus-minus sign ±)
11
-
12
- ## Got
13
- `<m:t></m:t>` (empty)
14
-
15
- ## Issue
16
- Empty `mo` element (representing ±) is not being rendered correctly in OMML output.
17
-
18
- ## Status
19
- **Pre-existing from LutaML-Model update**
@@ -1,168 +0,0 @@
1
- # MathML Translator: Remaining 14 Failing Tests
2
-
3
- ## Context
4
-
5
- This document describes the 14 MathML translator tests that remain failing after the LutaML-Model update. The fixes for 17 other tests are in this PR; these require additional investigation.
6
-
7
- ---
8
-
9
- ## Summary of Fixed Tests
10
-
11
- 17 tests were resolved by addressing:
12
- - `None#to_latex` and `None#to_asciimath` returning `nil` instead of `""`
13
- - Whitespace preservation in `mo_to_symbol`
14
- - Filtering empty strings from formula output
15
- - Fixing `Plus#to_mathml_without_math_tag` to use `value` instead of hardcoded `"+"`
16
-
17
- ---
18
-
19
- ## Remaining 14 Failing Tests
20
-
21
- ### Category 1: MathML Structure Issues (5 tests)
22
-
23
- #### 1. Test 281: Phantom Tag Whitespace
24
- - **Location**: `spec/plurimath/mathml_spec.rb:281`
25
- - **Test**: `contains Mathml phantom tag's example`
26
- - **Issue**: Expected `<mo>+</mo>` (no spaces) inside `<mphantom>`, but output is `<mo> + </mo>` (with spaces)
27
- - **Input**: `<mo> + </mo>` inside `<mphantom>`
28
- - **Note**: The second `<mo> + </mo>` outside phantom correctly preserves spaces. Issue is specific to `mphantom`.
29
-
30
- ---
31
-
32
- #### 2. Test 1197: Longidv Tag
33
- - **Location**: `spec/plurimath/mathml_spec.rb:1197`
34
- - **Test**: `contains longidv tag Mathml`
35
- - **Issue**: 3 element_structure differences in XML comparison
36
- - **Hint**: `longidv` involves `mscarries` elements. LutaML update changed parsing/rendering of these.
37
-
38
- ---
39
-
40
- #### 3. Test 1319: Mmultiscript with None Tag
41
- - **Location**: `spec/plurimath/mathml_spec.rb:1319`
42
- - **Test**: `contains mmultiscript containing none tag`
43
- - **Issue**: Element structure mismatch for `mmultiscripts` containing `<none>` elements
44
-
45
- ---
46
-
47
- #### 4. Test 1366: Mstyle with Nary Oint
48
- - **Location**: `spec/plurimath/mathml_spec.rb:1366`
49
- - **Test**: `contains mstyle containing nary oint value in msubsup tag`
50
- - **Issue**: Element structure differences with nary operators inside `msubsup`
51
-
52
- ---
53
-
54
- #### 5. Test 1540: Issue #238
55
- - **Location**: `spec/plurimath/mathml_spec.rb:1540`
56
- - **Test**: `contains string from plurimath/issue#238`
57
- - **Issue**: 4-5 element_structure differences in complex MathML structure
58
-
59
- ---
60
-
61
- ### Category 2: LaTeX Output Issues (3 tests)
62
-
63
- #### 6. Test 778: Table with Parentheses (Metanorma)
64
- - **Location**: `spec/plurimath/mathml_spec.rb:778`
65
- - **Test**: `contains table with surrounding parentheses(metanorma example) and sqrt tag`
66
- - **Issue**: LaTeX comparison fails - expected and got look identical but differ in invisible characters
67
- - **Hint**: Use byte-level comparison (`xxd`, `od -c`) to identify the difference
68
-
69
- ---
70
-
71
- #### 7. Test 1713: Metanorma BIPM Run
72
- - **Location**: `spec/plurimath/mathml_spec.rb:1713`
73
- - **Test**: `contains input from metanorma-cli-actions-mn-bipm run`
74
- - **Expected**: `\underline{\mathit{B}} = \left [ ... \end{matrix}\right ]`
75
- - **Got**: `\underset{ \left ( \underline \right ) }{ \left ( \mathit{B} \right ) } = \left [ ... \end{matrix}\right ]`
76
- - **Hint**: `menclose` with `notation="updiagonalstrike"` rendering changed from `\underline{...}` to `\underset{...}{...}` form
77
-
78
- ---
79
-
80
- #### 8. Test 1860: Metanorma ITU Run
81
- - **Location**: `spec/plurimath/mathml_spec.rb:1860`
82
- - **Test**: `contains input from metanorma-cli-actions-mn-itu run`
83
- - **Expected**: `y_{k} = ( x_{k} \pm h ) m` (double space before `m`)
84
- - **Got**: `y_{k} = ( x_{k} \pm h ) m` (single space)
85
- - **Hint**: Related to how `None` elements contribute spacing
86
-
87
- ---
88
-
89
- ### Category 3: OMML Rendering Issues (6 tests)
90
-
91
- **Root Cause**: Ox serializer outputs Unicode characters instead of HTML entities. See `TODO.bugs/05-omml-greek-entity-encoding.md`.
92
-
93
- ---
94
-
95
- #### 9. Test 2452: Multiple Tags - Greek Encoding
96
- - **Location**: `spec/plurimath/mathml_spec.rb:2452`
97
- - **Expected**: `<m:t>&#x3b1;</m:t>`, `<m:t>&#x3b8;</m:t>`
98
- - **Got**: `<m:t>α</m:t>`, `<m:t>θ</m:t>`
99
-
100
- ---
101
-
102
- #### 10. Test 2663: Underover with Greek
103
- - **Location**: `spec/plurimath/mathml_spec.rb:2663`
104
- - **Expected**: `<m:t>&#x3b8;</m:t>`
105
- - **Got**: `<m:t>θ</m:t>`
106
-
107
- ---
108
-
109
- #### 11. Test 3121: Multiscripts None - ZWSP
110
- - **Location**: `spec/plurimath/mathml_spec.rb:3121`
111
- - **Expected**: `<m:t>&#8203;</m:t>` (ZWSP - Zero-Width Space)
112
- - **Got**: `<m:t></m:t>` (empty)
113
- - **Hint**: ZWSP (U+200B) used as placeholder in multiscripts is being dropped by Ox serializer
114
-
115
- ---
116
-
117
- #### 12. Test 3201: Oint Msubsup - Missing Integral
118
- - **Location**: `spec/plurimath/mathml_spec.rb:3201`
119
- - **Expected**: Contains `<m:t>&#x222e;</m:t>` (contour integral ∮)
120
- - **Got**: Integral symbol missing
121
- - **Hint**: Nary operator `oint` not rendering in OMML within msubsup context
122
-
123
- ---
124
-
125
- #### 13. Test 3365: Nary Prod Symbol
126
- - **Location**: `spec/plurimath/mathml_spec.rb:3365`
127
- - **Issue**: Same as test 12 - integral symbol `&#x222e;` missing
128
- - **Hint**: Similar nary operator issue with product notation
129
-
130
- ---
131
-
132
- #### 14. Test 3486: Empty MO Example
133
- - **Location**: `spec/plurimath/mathml_spec.rb:3486`
134
- - **Expected**: `<m:t>&#xb1;</m:t>` (plus-minus sign ±)
135
- - **Got**: `<m:t></m:t>` (empty)
136
- - **Hint**: Symbol value being lost in translation for empty `mo` elements
137
-
138
- ---
139
-
140
- ## Investigation Files
141
-
142
- Detailed notes in:
143
- - `TODO.fix-fails/01-phantom-whitespace.md` through `TODO.fix-fails/14-omml-empty-mo.md`
144
- - `TODO.bugs/05-omml-greek-entity-encoding.md`
145
-
146
- ---
147
-
148
- ## Hints
149
-
150
- 1. **MathML Structure**: Compare AST structure before/after LutaML update. Focus on `translator.rb` handling of `mphantom`, `mmultiscripts`, `mscarries`, `munderover`.
151
-
152
- 2. **LaTeX Whitespace**: Use byte-level comparison to identify invisible character differences. Issue likely in how empty/nil values contribute spacing.
153
-
154
- 3. **OMML Issues**:
155
- - Greek encoding: Post-process Ox output to convert Unicode to HTML entities, or use Oga adapter instead
156
- - ZWSP/Empty MO: Investigate how Ox handles special Unicode characters (U+200B, U+00B1, U+222E)
157
-
158
- ---
159
-
160
- ## Testing
161
-
162
- ```bash
163
- # All failing tests
164
- bundle exec rspec spec/plurimath/mathml_spec.rb --format documentation
165
-
166
- # Single test
167
- bundle exec rspec spec/plurimath/mathml_spec.rb:281 --format documentation
168
- ```
@@ -1,102 +0,0 @@
1
- # MathML Spec Test Failures Investigation
2
-
3
- ## Summary
4
- 41 tests total: 22 passing, 19 failing
5
- - 2 issues FIXED by current changes (Category 2 linebreak, Category 3 LaTeX part)
6
- - These failures existed BEFORE mo_element fix and are pre-existing issues from lutaml-model update.
7
-
8
- ## Failure Categories
9
-
10
- ### Category 1: ZERO WIDTH SPACE serialization in OMML (7 tests)
11
- **Tests:** 2452, 2913, 3039, 3121, 3201, 3365, 3486
12
-
13
- **Issue:** `<m:t>&#8203;</m:t>` becomes `<m:t></m:t>` or `<m:t/>`
14
-
15
- **Root Cause:** Ox serialization issue - empty elements serialized as `<m:t></m:t>` instead of `<m:t/>`
16
-
17
- **Example diff:**
18
- ```
19
- - <m:t/>
20
- + <m:t></m:t>
21
- ```
22
-
23
- ---
24
-
25
- ### Category 2: Linebreak positioning in HTML (FIXED)
26
- **Tests:** 3594
27
-
28
- **Issue:** `<br/>` positioned before operator instead of after
29
-
30
- **Example diff:**
31
- ```
32
- - <i>N</i><sub>s</sub><sup>2</sup> =<br/> T <br/>&#x2191; S <br/> D
33
- + <i>N</i><sub>s</sub><sup>2</sup> <br/>= T <br/>↑ S <br/> D
34
- ```
35
-
36
- **Root Cause:** `linebreakstyle="after"` not being passed to Linebreak constructor.
37
-
38
- **FIXED:** Now passes `linebreakstyle` attribute and uses `mathml_unary_classes` for proper encoding.
39
-
40
- ---
41
-
42
- ### Category 3: LaTeX whitespace in phantom (PARTIALLY FIXED)
43
- **Tests:** 281
44
-
45
- **Issue:** `\phantom{ y }` vs `\phantom{y}` - whitespace stripped from phantom content
46
-
47
- **Example diff:**
48
- ```
49
- - " x \\phantom{+} \\phantom{ y } + z "
50
- + "x \\phantom{+} \\phantom{y} + z"
51
- ```
52
-
53
- **Root Cause:** Symbol class strips whitespace when creating symbol value.
54
-
55
- **FIXED for LaTeX output:** After `mathml_unary_classes` creates Symbol, restore original value to preserve whitespace.
56
- **Still failing:** MathML structure issue (mrow wrapper missing) - pre-existing bug.
57
-
58
- ---
59
-
60
- ### Category 4: MathML structure differences (8 tests)
61
- **Tests:** 213, 778, 1197, 1272, 1319, 1366, 1540, 1713
62
-
63
- **Issue:** XML structure differences with msubsup, mrow element positioning
64
-
65
- **Example diff:**
66
- ```
67
- Element_position differs: mrow at position 0 vs position 1
68
- Element differs: msubsup → (empty)
69
- ```
70
-
71
- **Root Cause:** Likely mml parsing issue with nested elements in semantics.
72
-
73
- ---
74
-
75
- ### Category 5: OMML rendering issues (3 tests)
76
- **Tests:** 2559, 2663, 2790
77
-
78
- **Issue:** limLow, accent, sSubSup not rendering correctly
79
-
80
- **Example:** Expected limLow with proper lim printing, but got different structure.
81
-
82
- **Root Cause:** Translator OMML rendering issues.
83
-
84
- ---
85
-
86
- ## Recommendations
87
-
88
- ### For Category 1 (ZERO WIDTH SPACE)
89
- This is an Ox serialization issue. Empty elements are being serialized as `<m:t></m:t>` instead of `<m:t/>`.
90
- Likely requires changes to how lutaml-model/Ox handles empty element serialization.
91
-
92
- ### For Category 2 (HTML linebreak) - FIXED
93
- No further action needed.
94
-
95
- ### For Category 3 (Phantom whitespace) - PARTIALLY FIXED
96
- LaTeX output is now correct. MathML structure issue requires investigation into mrow handling.
97
-
98
- ### For Category 4 (MathML structure)
99
- Investigate mml parsing of semantics elements and how children are ordered.
100
-
101
- ### For Category 5 (OMML rendering)
102
- Fix translator OMML rendering for limLow, accent, and sSubSup elements.
@@ -1,34 +0,0 @@
1
- # Category 1: ZERO WIDTH SPACE Serialization in OMML
2
-
3
- ## Affected Tests
4
- - 2452: contains multiple tags in Mathml
5
- - 2913: mfrac with options/attributes tag
6
- - 3039: mpadded with attributes
7
- - 3121: multiscripts containing none tag
8
- - 3201: oint msubsup tag
9
- - 3365: nary prod symbol in underover
10
- - 3486: empty mo example from plurimath/plurimath#318
11
-
12
- ## Issue
13
- `<m:t>&#8203;</m:t>` (zero-width space) becomes `<m:t></m:t>` or `<m:t/>` in OMML output.
14
-
15
- ## Root Cause
16
- In `lib/plurimath/math/core.rb`, the `empty_tag` method:
17
- ```ruby
18
- def empty_tag(wrapper_tag = nil)
19
- r_tag = ox_element("r", namespace: "m")
20
- r_tag << (ox_element("t", namespace: "m") << "&#8203;")
21
- ...
22
- end
23
- ```
24
-
25
- The `&#8203;` entity is being lost during Ox element serialization.
26
-
27
- ## Investigation Needed
28
- 1. Check how `ox_element` and Ox handle HTML entities in text content
29
- 2. Verify if this is a lutaml-model issue or an issue with how Plurimath uses Ox
30
- 3. Test if using raw Unicode character U+200B directly works instead of entity
31
-
32
- ## Related Files
33
- - `lib/plurimath/math/core.rb:48-54` - empty_tag method
34
- - `lib/plurimath/math/symbols/symbol.rb` - Symbol rendering
@@ -1,41 +0,0 @@
1
- # Category 2: HTML Linebreak Positioning
2
-
3
- ## Affected Tests
4
- - 3594: contains subsup and linebreak with different values example in MathML
5
-
6
- ## Issue
7
- `<br/>` is positioned BEFORE the operator instead of AFTER when `linebreakstyle="after"` is set.
8
-
9
- ## Example Diff
10
- ```
11
- Expected: "<i>N</i><sub>s</sub><sup>2</sup> =<br/> T <br/>&#x2191; S <br/> D"
12
- Actual: "<i>N</i><sub>s</sub><sup>2</sup> <br/>= T <br/>↑ S <br/> D"
13
- ```
14
-
15
- Note: Also `&#x2191;` (entity) becomes `↑` (literal Unicode character).
16
-
17
- ## Root Cause
18
- In `lib/plurimath/math/function/linebreak.rb`, the `to_html` method:
19
- ```ruby
20
- def to_html(options:)
21
- br_tag = "<br/>"
22
- return br_tag unless parameter_one
23
-
24
- case attributes[:linebreakstyle]
25
- when "after"
26
- "#{parameter_one.to_html(options: options)}#{br_tag}"
27
- else
28
- "#{br_tag}#{parameter_one.to_html(options: options)}"
29
- end
30
- end
31
- ```
32
-
33
- The `linebreakstyle` attribute is being passed from MathML but may not be properly captured or applied.
34
-
35
- ## Investigation Needed
36
- 1. Check if `mo_to_symbol` properly captures `linebreakstyle` attribute from `<mo linebreak="newline" linebreakstyle="after">`
37
- 2. Verify that Linebreak class stores and uses this attribute correctly
38
-
39
- ## Related Files
40
- - `lib/plurimath/math/function/linebreak.rb`
41
- - `lib/plurimath/mathml/translator.rb` - mo_to_symbol
@@ -1,34 +0,0 @@
1
- # Category 3: LaTeX Whitespace in Phantom
2
-
3
- ## Affected Tests
4
- - 281: phantom tag's example
5
-
6
- ## Issue
7
- Whitespace is stripped from phantom content in LaTeX output.
8
-
9
- ## Example Diff
10
- ```
11
- Expected: " x \\phantom{+} \\phantom{ y } + z "
12
- Actual: "x \\phantom{+} \\phantom{y} + z"
13
- ```
14
-
15
- Leading/trailing spaces inside `\phantom{}` are being lost.
16
-
17
- ## Root Cause
18
- In `lib/plurimath/mathml/translator.rb`, `mi_to_symbol`:
19
- ```ruby
20
- stripped = value.strip
21
- ...
22
- result = Plurimath::Utility.mathml_unary_classes([stripped], lang: :mathml)
23
- ```
24
-
25
- The whitespace is stripped when creating the Symbol. When Phantom renders via `latex_value`, it uses `parameter_one.to_latex` which returns the stripped value.
26
-
27
- ## Investigation Needed
28
- 1. Check if Symbol class should preserve whitespace in value
29
- 2. Or if Phantom class should handle whitespace differently
30
-
31
- ## Related Files
32
- - `lib/plurimath/mathml/translator.rb:149-166` - mi_to_symbol
33
- - `lib/plurimath/math/function/phantom.rb`
34
- - `lib/plurimath/math/symbols/symbol.rb`
@@ -1,33 +0,0 @@
1
- # Category 4: MathML Structure Differences
2
-
3
- ## Affected Tests
4
- - 213: contains Mathml object (msubsup in semantics)
5
- - 778: table with surrounding parentheses
6
- - 1197: longidv tag
7
- - 1272: mpadded with attributes
8
- - 1319: mmultiscript containing none tag
9
- - 1366: mstyle containing nary oint value in msubsup
10
- - 1540: plurimath/issue#238
11
- - 1713: metanorma-cli-actions-mn-bipm
12
-
13
- ## Issue
14
- Element structure differs - msubsup/mrow positions are wrong or elements are empty when they shouldn't be.
15
-
16
- ## Example Diff (test 213)
17
- ```
18
- Position: mrow at position 0 vs position 1
19
- Element differs: msubsup → (empty)
20
- Element differs: mrow → (empty)
21
- ```
22
-
23
- ## Root Cause
24
- Likely related to how mml parses `<semantics>` elements and orders children. The translator may not be properly handling the `semantics` wrapper and its `annotation` child elements.
25
-
26
- ## Investigation Needed
27
- 1. Check how mml gem parses semantics elements
28
- 2. Verify `mrow_to_mrow` correctly handles semantics content
29
- 3. Check `ordered_children` method for proper ordering
30
-
31
- ## Related Files
32
- - `lib/plurimath/mathml/translator.rb` - mrow_to_mrow, ordered_children
33
- - `lib/plurimath/math/formula.rb` - mrow handling
@@ -1,27 +0,0 @@
1
- # Category 5: OMML Rendering Issues
2
-
3
- ## Affected Tests
4
- - 2559: scarries, longdiv, msline and scarry tags
5
- - 2663: underover, under, and over tags with displaystyle false
6
- - 2790: bar, vec, dot, ddot, ul, and tilde examples containing accent
7
-
8
- ## Issue
9
- OMML output has incorrect structure for limLow, accent, sSubSup elements.
10
-
11
- ## Example Issues
12
- - `limLow` not rendering with proper lim printing
13
- - Accent elements not properly structured
14
- - sSubSup vs sSub/sSup structure issues
15
-
16
- ## Root Cause
17
- Likely issues in the translator's OMML rendering methods for these specific elements.
18
-
19
- ## Investigation Needed
20
- 1. Check `PowerBase#to_omml_without_math_tag` for sSubSup handling
21
- 2. Check accent/overline rendering for OMML
22
- 3. Check nary functions (integral, product) OMML output
23
-
24
- ## Related Files
25
- - `lib/plurimath/math/function/power_base.rb`
26
- - `lib/plurimath/math/function/nary.rb`
27
- - `lib/plurimath/mathml/translator.rb`
@@ -1,86 +0,0 @@
1
- # TODO: Translation Layer for MathML to Plurimath Model
2
-
3
- ## Status: Translation Complete ✅
4
-
5
- The translation layer in `lib/plurimath/mathml/translator.rb` is **complete and working**.
6
-
7
- ## What Was Fixed
8
-
9
- ### 1. Function Classes (cos, sin, tan, etc.)
10
- Added comprehensive `MATHML_FUNCTION_CLASSES` lookup mapping 45 function names to their Plurimath Function classes:
11
- - Trigonometric: sin, cos, tan, cot, sec, csc
12
- - Hyperbolic: sinh, cosh, tanh, coth, sech, csch
13
- - Inverse trigonometric: arcsin, arccos, arctan, arccot, arcsec, arccsc
14
- - Inverse hyperbolic: arsinh, arcosh, artanh, arcoth, arsech, arcsch
15
- - Exponential/logarithmic: exp, log, ln, lg
16
- - Limits/extremum: lim, liminf, limsup, inf, sup, max, min
17
- - Other: det, gcd, dim, hom, ker, deg, mod, arg, abs, norm, floor, ceil, sgn, sum, prod, int, oint
18
-
19
- ### 2. Function + Parenthesis Heuristic
20
- When `<mi>cos</mi><mo>(</mo>` is encountered in an `<mrow>`, they are combined into `Cos.new(Symbol.new("("))` to produce `\cos{(}` in LaTeX - matching old parser behavior.
21
-
22
- ### 3. Mtext Array Handling
23
- Fixed issue where `<mtext>` with mixed content would pass an Array to `Text.new` instead of a String.
24
-
25
- ### 4. Mover Bug (Previously Fixed)
26
- `ordered_children` using `each_mixed_content` ensures correct child order for binary operators.
27
-
28
- ## Test Status
29
-
30
- ```
31
- 41 examples total
32
- 36 failures (due to output format differences, NOT translation bugs)
33
- 5 passing (verified correct)
34
- ```
35
-
36
- ## What Still Fails (36 Tests)
37
-
38
- The failures are **NOT translation bugs** - they are **output format differences** caused by:
39
-
40
- ### 1. Structural Differences in MathML/OMML Output
41
- - Extra `<mrow>` wrappers in some places
42
- - Missing `<mrow>` wrappers in other places
43
- - The `Mstyle` or `Formula` rendering adds wrappers differently than old parser
44
-
45
- ### 2. Missing Model Attribute Support
46
- - `Scarries` class doesn't support `crossout` attribute
47
- - `Msline` class doesn't support `length` attribute
48
- - `Mglyph` class doesn't support all attributes properly
49
-
50
- ### 3. Format Differences
51
- - `<=` vs `le` in ASCIIMath output
52
- - `&#x3b8;` vs `θ` in OMML output (semantically equivalent)
53
- - Various whitespace differences
54
-
55
- ### 4. Model Structure Differences
56
- - The new translator creates objects differently than old `custom_models` approach
57
- - When these objects render to MathML/OMML/HTML, structure differs
58
-
59
- ## Why These Cannot Be Fixed in Translator
60
-
61
- The translator correctly creates Plurimath model objects. The failures occur when:
62
- 1. `formula.to_mathml` renders the model to MathML - structure differs from expected
63
- 2. `formula.to_omml` renders the model to OMML - structure differs from expected
64
- 3. `formula.to_asciimath` renders the model to ASCIIMath - format differs from expected
65
-
66
- These are **model rendering issues**, not **translation issues**.
67
-
68
- ## What Would Fix the Remaining Failures
69
-
70
- 1. **Change Plurimath model classes** to support missing attributes (crossout, length, etc.)
71
- 2. **Change Formula/Mstyle rendering** to match old parser's mrow wrapping behavior
72
- 3. **Update test expectations** to match correct output (not allowed per user constraint)
73
-
74
- ## Files
75
-
76
- ### Created/Modified
77
- - `lib/plurimath/mathml/translator.rb` - 550+ line translation layer
78
- - `lib/plurimath/mathml/parser.rb` - uses Translator
79
-
80
- ## Passing Tests (Verified Correct)
81
- 1. Basic MathML parsing
82
- 2. mover children order (critical fix!)
83
- 3. Greek letters → Symbol subclasses
84
- 4. Parentheses → `(` / `)` not `\lparen`/`\rparen`
85
- 5. Linebreak handling
86
- 6. LaTeX output (to_latex assertions now passing)