xmi 0.5.5 → 0.5.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 9a7c79e8d907feee2589c9cd13f534297ffd3869b15bf1c05db190b59903fdaa
4
- data.tar.gz: f46a5b384fe0f8da80d075766fe226d8519c3b0a72739b66220c606012f098ab
3
+ metadata.gz: 62e6b5232df619d790a5ca7b69454d3ed6f2762bb457808436be572162867f16
4
+ data.tar.gz: 4adbc6c45fd9e75a2fb6612f60cb35ef47a91afe44cca8c366f8c9b17237aeab
5
5
  SHA512:
6
- metadata.gz: 7e54e2c55800f18215428d8001502c9476d1b63a9dd9db3a164de3e0eae569a0fc405b237adc07e2a2e6fbd789e001a2a57df26cc276b3c9ea8ffccc349159a4
7
- data.tar.gz: 12c434257662c1a5291f915e97b197497a1153676c57b997e0392b00fc90c08ba712781f859ccdaf4b76f25b17ec3c6277480d9f6a2f5c1b763fe92d24f7341c
6
+ metadata.gz: fad7e8f844a110c38395e3d6b90f5506832cf4d971830b3dee23294d7268c6183b5a07ffafaa0832aba65b13ae1a2ad7dae7176b93d333c1607ef5607079f7af
7
+ data.tar.gz: e81076d61606ae1742f08fffa61aa7b0efd999c29dcbffd1c80ee5a34b81adaeb79bb988021f33a07cb51c8c16d023a09cdf5b884905dfb80b4070634731dbdd
data/.rubocop_todo.yml CHANGED
@@ -1,6 +1,6 @@
1
1
  # This configuration was generated by
2
2
  # `rubocop --auto-gen-config`
3
- # on 2026-04-21 01:02:50 UTC using RuboCop version 1.86.1.
3
+ # on 2026-04-22 09:21:04 UTC using RuboCop version 1.86.1.
4
4
  # The point is for the user to remove these configuration records
5
5
  # one by one as the offenses are removed from the code base.
6
6
  # Note that changes in the inspected code, or installation of new
@@ -11,15 +11,7 @@ Gemspec/RequiredRubyVersion:
11
11
  Exclude:
12
12
  - 'xmi.gemspec'
13
13
 
14
- # Offense count: 1
15
- # This cop supports safe autocorrection (--autocorrect).
16
- # Configuration parameters: EnforcedStyle.
17
- # SupportedStyles: empty_lines, no_empty_lines
18
- Layout/EmptyLinesAroundBlockBody:
19
- Exclude:
20
- - 'spec/xmi/ea_root/extension_loading_spec.rb'
21
-
22
- # Offense count: 76
14
+ # Offense count: 77
23
15
  # This cop supports safe autocorrection (--autocorrect).
24
16
  # Configuration parameters: Max, AllowHeredoc, AllowURI, AllowQualifiedName, URISchemes, AllowRBSInlineAnnotation, AllowCopDirectives, AllowedPatterns, SplitStrings.
25
17
  # URISchemes: http, https
@@ -60,7 +52,7 @@ Metrics/CyclomaticComplexity:
60
52
  - 'lib/xmi/ea_root/code_generation.rb'
61
53
  - 'lib/xmi/sparx/index.rb'
62
54
 
63
- # Offense count: 28
55
+ # Offense count: 30
64
56
  # Configuration parameters: CountComments, CountAsOne, AllowedMethods, AllowedPatterns.
65
57
  Metrics/MethodLength:
66
58
  Max: 55
@@ -141,12 +133,19 @@ RSpec/MultipleMemoizedHelpers:
141
133
  RSpec/NestedGroups:
142
134
  Max: 4
143
135
 
144
- # Offense count: 1
136
+ # Offense count: 8
145
137
  # Configuration parameters: CustomTransform, IgnoreMethods, IgnoreMetadata, InflectorPath, EnforcedInflector.
146
138
  # SupportedInflectors: default, active_support
147
139
  RSpec/SpecFilePathFormat:
148
140
  Exclude:
149
141
  - 'spec/xmi/ea_root/extension_loading_spec.rb'
142
+ - 'spec/xmi/sparx/sparx_root_citygml_rel_ns_spec.rb'
143
+ - 'spec/xmi/sparx/sparx_root_citygml_spec.rb'
144
+ - 'spec/xmi/sparx/sparx_root_eauml_spec.rb'
145
+ - 'spec/xmi/sparx/sparx_root_gml_spec.rb'
146
+ - 'spec/xmi/sparx/sparx_root_mdg_spec.rb'
147
+ - 'spec/xmi/sparx/sparx_root_xmi2013_uml2013_spec.rb'
148
+ - 'spec/xmi/sparx/sparx_root_xmi_parsing_spec.rb'
150
149
 
151
150
  # Offense count: 4
152
151
  # Configuration parameters: AllowedClasses.
data/CLAUDE.md ADDED
@@ -0,0 +1,166 @@
1
+ # CLAUDE.md
2
+
3
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4
+
5
+ ## Build, Test, and Development Commands
6
+
7
+ ```bash
8
+ # Install dependencies
9
+ bundle install
10
+
11
+ # Run all tests
12
+ bundle exec rake spec
13
+ bundle exec rspec
14
+
15
+ # Run a single test file
16
+ bundle exec rspec spec/xmi/sparx/sparx_root_xmi2013_uml2013_spec.rb
17
+
18
+ # Run a specific test by line number
19
+ bundle exec rspec spec/xmi/sparx/sparx_root_xmi2013_uml2016_spec.rb:610
20
+
21
+ # Run linter with auto-correct
22
+ bundle exec rubocop -A --auto-gen-config
23
+
24
+ # Run both tests and linting (default rake task)
25
+ bundle exec rake
26
+
27
+ # Interactive console for experimentation
28
+ bin/console
29
+ ```
30
+
31
+ ## Architecture Overview
32
+
33
+ This gem converts XMI (XML Metadata Interchange) files into Ruby objects, specifically designed for Enterprise Architect generated XMI files.
34
+
35
+ ### Core Dependencies
36
+
37
+ - **lutaml-model**: All serializable models inherit from `Lutaml::Model::Serializable`
38
+ - **nokogiri**: XML parsing backend
39
+
40
+ ### Main Entry Point
41
+
42
+ `Xmi::Sparx::Root.parse_xml(xml_content)` is the primary method to parse XMI files. It performs preprocessing before parsing:
43
+
44
+ 1. `fix_encoding` - Fixes invalid UTF-8 byte sequences in the XML content
45
+ 2. `normalize_omg_namespace_versions` - Normalizes OMG namespace versions (XMI, UML) to canonical 20131001
46
+ 3. `resolve_relative_namespaces` - Replaces relative `xmlns` values with `targetNamespace` values
47
+ 4. `rename_ea_xmlns_attribute` - Renames `xmlns` attribute to `altered_xmlns` on EA-specific elements (see below)
48
+
49
+ ### OMG Namespace Version Normalization
50
+
51
+ OMG publishes XMI and UML specifications with dated namespace URIs (e.g., `http://www.omg.org/spec/XMI/20110701`, `20131001`, `20161101`). This library normalizes all versions to `20131001` during parsing, allowing a single set of model classes to handle all versions.
52
+
53
+ ### Enterprise Architect's Misuse of the `xmlns` Attribute
54
+
55
+ **This is a critical quirk to understand when working with EA-generated XMI.**
56
+
57
+ In standard XML, `xmlns` is a reserved attribute for namespace declarations. However, Enterprise Architect incorrectly uses `xmlns` as a **regular data attribute** on certain stereotype elements (e.g., `GML:ApplicationSchema`, `CityGML:ApplicationSchema`), storing arbitrary URI values unrelated to namespace declarations.
58
+
59
+ This violates XML conventions and creates parsing conflicts—XML libraries treat `xmlns` as reserved. The library works around this by renaming `xmlns` to `altered_xmlns` before parsing:
60
+
61
+ ```xml
62
+ <!-- EA-generated -->
63
+ <GML:ApplicationSchema xmlns="http://some-value" ...>
64
+
65
+ <!-- After preprocessing -->
66
+ <GML:ApplicationSchema altered_xmlns="http://some-value" ...>
67
+ ```
68
+
69
+ Model classes define `altered_xmlns` attributes to receive these values:
70
+
71
+ ```ruby
72
+ class ApplicationSchema < Lutaml::Model::Serializable
73
+ attribute :altered_xmlns, :string # renamed from xmlns
74
+ end
75
+ ```
76
+
77
+ ### Namespace Architecture
78
+
79
+ **All XMI/UML namespace versions are normalized to 20131001 before parsing** (see `Root.replace_xmlns`).
80
+
81
+ Namespace classes are defined in:
82
+ - `lib/xmi/namespace/omg.rb` - OMG namespaces (XMI, UML, UmlDi, UmlDc)
83
+ - `lib/xmi/namespace/sparx.rb` - Sparx-specific profiles (SysPhS, GML, EaUml, CustomProfile, CityGML)
84
+
85
+ Use version-agnostic alias classes that inherit from 20131001 versions:
86
+ ```ruby
87
+ ::Xmi::Namespace::Omg::Xmi # => inherits from Xmi20131001
88
+ ::Xmi::Namespace::Omg::Uml # => inherits from Uml20131001
89
+ ```
90
+
91
+ ### Custom Types with XML Namespace
92
+
93
+ Custom types in `lib/xmi/type.rb` declare their XML namespace using the `xml do ... end` block:
94
+
95
+ ```ruby
96
+ class XmiId < Lutaml::Model::Type::String
97
+ xml do
98
+ namespace ::Xmi::Namespace::Omg::Xmi
99
+ end
100
+ end
101
+ ```
102
+
103
+ ### Model Definition Pattern
104
+
105
+ Models use lutaml-model syntax with explicit namespace declarations:
106
+ ```ruby
107
+ class MyModel < Lutaml::Model::Serializable
108
+ attribute :id, ::Xmi::Type::XmiId
109
+
110
+ xml do
111
+ root "Model"
112
+ namespace ::Xmi::Namespace::Omg::Uml
113
+ namespace_scope [::Xmi::Namespace::Omg::Xmi, ::Xmi::Namespace::Omg::Uml]
114
+
115
+ # Attributes with XMI namespace require explicit declaration
116
+ map_attribute "id", to: :id,
117
+ namespace: "http://www.omg.org/spec/XMI/20131001",
118
+ prefix: "xmi"
119
+ end
120
+ end
121
+ ```
122
+
123
+ ### Dynamic Extension Loading
124
+
125
+ `Xmi::EaRoot.load_extension(xml_path)` dynamically generates Ruby classes from EA MDG extension XML files. This creates stereotype classes under `Xmi::EaRoot::{ModuleName}::{ClassName}` and updates `Root` mappings.
126
+
127
+ Extensions use `NamespaceRegistry` to look up or create namespace classes dynamically:
128
+ - Existing namespace URIs resolve to predefined classes
129
+ - New URIs create dynamic classes under `Xmi::Namespace::Dynamic::{ModuleName}`
130
+
131
+ ### Key Files
132
+
133
+ | File | Purpose |
134
+ |------|---------|
135
+ | `lib/xmi.rb` | Main entry point, loads dependencies and configures XML adapter |
136
+ | `lib/xmi/sparx.rb` | Module with autoload declarations for Sparx components |
137
+ | `lib/xmi/sparx/root.rb` | Main `Root` class with parsing and namespace normalization |
138
+ | `lib/xmi/root.rb` | Base `Root` class with common XMI attributes |
139
+ | `lib/xmi/uml.rb` | UML model classes (UmlModel, PackagedElement, etc.) |
140
+ | `lib/xmi/ea_root.rb` | Dynamic extension loading from MDG XML |
141
+ | `lib/xmi/type.rb` | Custom types with namespace declarations (XmiId, XmiType, etc.) |
142
+ | `lib/xmi/namespace/omg.rb` | OMG namespace classes (XMI, UML, UmlDi, UmlDc) |
143
+ | `lib/xmi/namespace/sparx.rb` | Sparx-specific profile namespaces |
144
+ | `lib/xmi/namespace_registry.rb` | URI-to-class mapping for namespace lookup |
145
+
146
+ ### Collection Value Maps
147
+
148
+ When mapping collection elements, use the standard `VALUE_MAP` pattern to handle nil/empty values:
149
+
150
+ ```ruby
151
+ map_element "Element", to: :elements,
152
+ value_map: {
153
+ from: { nil: :empty, empty: :empty, omitted: :empty },
154
+ to: { nil: :empty, empty: :empty, omitted: :empty }
155
+ }
156
+ ```
157
+
158
+ A shared constant is available at `Xmi::Sparx::VALUE_MAP`.
159
+
160
+ ## Known Issues
161
+
162
+ One serialization test fails due to a lutaml-model bug where `to_xml` does not respect namespace declarations on custom types. See `LUTAML_MODEL_BUG_REPORT.md` for details. Parsing works correctly; only serialization is affected.
163
+
164
+ ## Limitations
165
+
166
+ This gem is designed for Enterprise Architect generated XMI files and may not work with XMI from other tools. Some EA-specific elements (e.g., `GML:ApplicationSchema`) use `xmlns` as an attribute, which is renamed to `altered_xmlns` during preprocessing to avoid conflicts with Lutaml::Model internals.
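The OMG namespace normalization that CLAUDE.md describes can be pictured with a small sketch. This is not the gem's code: `normalize_omg_versions`, `OMG_VERSIONED_URI` and `CANONICAL_VERSION` are hypothetical names chosen for illustration, and the sketch assumes the versioned URIs follow the `http://www.omg.org/spec/XMI/<8 digits>` shape quoted above.

```ruby
# Hypothetical names; the gem's real preprocessing step may differ.
CANONICAL_VERSION = "20131001"

# Captures the URI up to the trailing slash, then matches the 8-digit version.
OMG_VERSIONED_URI = %r{(http://www\.omg\.org/spec/(?:XMI|UML)/)\d{8}}

# Rewrites every dated XMI/UML namespace URI to the canonical version.
def normalize_omg_versions(xml)
  xml.gsub(OMG_VERSIONED_URI) { "#{Regexp.last_match(1)}#{CANONICAL_VERSION}" }
end

sample = '<xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20110701" ' \
         'xmlns:uml="http://www.omg.org/spec/UML/20161101"/>'
normalized = normalize_omg_versions(sample)
puts normalized
```

Rewriting the URIs as plain text before parsing is what would let one set of model classes, bound to the `20131001` namespaces, accept documents exported with other spec versions.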
data/README.adoc CHANGED
@@ -157,7 +157,7 @@ classes.
157
157
 
158
158
  === Namespace Normalization
159
159
 
160
- The normalization is performed by [`SparxRoot.replace_xmlns`](lib/xmi/sparx.rb:1158) which rewrites
160
+ The normalization is performed by `Xmi::ParserPipeline` which rewrites
161
161
  namespace URIs in the input XML:
162
162
 
163
163
  .Example of namespace normalization
@@ -180,7 +180,7 @@ namespace URIs in the input XML:
180
180
 
181
181
  === Namespace Classes
182
182
 
183
- Namespace classes are defined in [`lib/xmi/namespace/omg.rb`](lib/xmi/namespace/omg.rb:1):
183
+ Namespace classes are defined in `lib/xmi/namespace/omg.rb`:
184
184
 
185
185
  * **Version-specific classes**: `Xmi20110701`, `Uml20131001`, etc.
186
186
  * **Version-agnostic aliases**: `Xmi`, `Uml`, `UmlDi`, `UmlDc`
@@ -277,7 +277,7 @@ namespace prefixes will parse as `nil`.
277
277
 
278
278
  === Sparx Systems Namespaces
279
279
 
280
- Sparx-specific namespaces are defined in [`lib/xmi/namespace/sparx.rb`](lib/xmi/namespace/sparx.rb:1):
280
+ Sparx-specific namespaces are defined in `lib/xmi/namespace/sparx.rb`:
281
281
 
282
282
  * **SysPhS** - SysML Extension for Physical Interaction and Signal Flow Simulation profile
283
283
  * **GML** - Geography Markup Language profile
@@ -301,7 +301,7 @@ end
301
301
 
302
302
  === Extension Namespaces
303
303
 
304
- Dynamically loaded extensions (via [`EaRoot.load_extension`](lib/xmi/ea_root.rb:54)) also use
304
+ Dynamically loaded extensions (via `EaRoot.load_extension`) also use
305
305
  namespace-qualified mappings:
306
306
 
307
307
  .Extension with namespace mapping
@@ -384,17 +384,14 @@ The old API used `SparxRoot.parse_xml` with automatic namespace normalization:
384
384
  [source,ruby]
385
385
  ----
386
386
  # Old API (still works, but version-aware API is recommended)
387
- doc = Xmi::Sparx::SparxRoot.parse_xml(xml_content)
387
+ doc = Xmi::Sparx::Root.parse_xml(xml_content)
388
388
  ----
389
389
 
390
390
  The new version-aware API:
391
391
 
392
392
  [source,ruby]
393
393
  ----
394
- # New API - explicit about version handling
395
- doc = Xmi::Sparx::SparxRoot.parse_xml_with_versioning(xml_content)
396
-
397
- # Or use the module-level API
394
+ # New API - module-level convenience method
398
395
  doc = Xmi.parse(xml_content)
399
396
  ----
400
397
 
@@ -465,13 +462,13 @@ define an `altered_xmlns` attribute to receive this value.
465
462
 
466
463
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
467
464
 
468
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
465
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to https://rubygems.org[rubygems.org].
469
466
 
470
467
 
471
468
  == Contributing
472
469
 
473
- Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/xmi. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/[USERNAME]/xmi/blob/master/CODE_OF_CONDUCT.md).
470
+ Bug reports and pull requests are welcome on GitHub at https://github.com/lutaml/xmi. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the https://github.com/lutaml/xmi/blob/master/CODE_OF_CONDUCT.md[code of conduct].
474
471
 
475
472
  == Code of Conduct
476
473
 
477
- Everyone interacting in the Xmi project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/xmi/blob/master/CODE_OF_CONDUCT.md).
474
+ Everyone interacting in the Xmi project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the https://github.com/lutaml/xmi/blob/master/CODE_OF_CONDUCT.md[code of conduct].
@@ -0,0 +1,33 @@
1
+ # 01: Eliminate Duplicate Nokogiri Parse
2
+
3
+ ## Impact: HIGH (~30-50% parse time for large files)
4
+
5
+ ## Problem
6
+
7
+ `ParserPipeline` parses XML twice:
8
+ 1. `NamespaceDetector.detect_versions` → `Nokogiri::XML(xml_content)` + `collect_namespaces` — parses entire doc just to read root namespace URIs
9
+ 2. `from_xml` → adapter.parse(xml_content) — parses again via lutaml-model
10
+
11
+ For the 3.5MB large fixture, this is the dominant cost.
12
+
13
+ ## Fix
14
+
15
+ Extract namespace URIs from the first ~4KB of the XML string using regex, avoiding Nokogiri entirely for version detection.
16
+
17
+ ```ruby
18
+ # Instead of:
19
+ doc = Nokogiri::XML(xml_content)
20
+ doc.collect_namespaces
21
+
22
+ # Use:
23
+ NS_REGEX = /xmlns:?(\w*)=["']([^"']+)["']/
24
+ xml_content[0, 4096].scan(NS_REGEX).to_h
25
+ ```
26
+
27
+ The namespace declarations are always on or near the root element, so scanning the first 4KB is sufficient.
28
+
29
+ ## Files
30
+
31
+ - `lib/xmi/namespace_detector.rb` — replace `extract_namespace_uris` Nokogiri parse with regex
32
+ - `lib/xmi/namespace_detector.rb` — verify `detect_versions` still works
33
+ - `spec/xmi/parser_pipeline_spec.rb` — verify tests pass
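The head-scan idea in this note can be tried in isolation. The following is a minimal sketch under stated assumptions, not the gem's implementation: `scan_namespace_head`, `NS_DECL` and `SCAN_BYTES` are hypothetical names, and the pattern mirrors the `NS_DECL_REGEX` that appears in the `NamespaceDetector` hunk of this diff.

```ruby
# Matches default (xmlns="...") and prefixed (xmlns:foo="...") declarations.
NS_DECL = /xmlns(?::(\w+))?\s*=\s*["']([^"']+)["']/

# Hypothetical scan window; the note suggests 4KB.
SCAN_BYTES = 4096

# Returns a prefix => URI hash built from the head of the document only.
def scan_namespace_head(xml)
  # byteslice may cut a multibyte character in half, so scrub invalid bytes.
  head = xml.byteslice(0, SCAN_BYTES).scrub("?")
  head.scan(NS_DECL).each_with_object({}) do |(prefix, uri), found|
    key = prefix || "xmlns" # the default namespace has no prefix
    found[key] ||= uri      # first declaration wins
  end
end

xml = <<~XML
  <?xml version="1.0"?>
  <xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
           xmlns:uml='http://www.omg.org/spec/UML/20161101'>
  </xmi:XMI>
XML

namespaces = scan_namespace_head(xml)
puts namespaces
```

The trade-off is that a declaration placed beyond the scan window is never seen, so the approach relies on namespace declarations sitting on or near the root element.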
@@ -0,0 +1,38 @@
1
+ # 02: Read-Only Fast Mode (Skip Namespace Declaration Plan)
2
+
3
+ ## Impact: HIGH (eliminates full tree walk before mapping starts)
4
+
5
+ ## Problem
6
+
7
+ In lutaml-model's `ModelTransform#data_to_model` (lines 71-76), `build_input_declaration_plan` calls `collect_element_namespaces` which recursively walks **every element** in the parsed document — solely for round-trip namespace fidelity (preserving xmlns declarations for `to_xml`).
8
+
9
+ Most XMI consumers only parse (read) and never call `to_xml`. This full tree walk is wasted work.
10
+
11
+ ## Fix
12
+
13
+ ### In lutaml-model
14
+
15
+ Add a `parse_only: true` option to `from_xml` that skips namespace declaration plan collection:
16
+
17
+ ```ruby
18
+ # model_transform.rb line 71
19
+ unless options.key?(:lutaml_parent) || options[:parse_only]
20
+ input_declaration_plan = build_input_declaration_plan(root_element)
21
+ # ...
22
+ end
23
+ ```
24
+
25
+ ### In xmi gem
26
+
27
+ Pass `parse_only: true` in `ParserPipeline::ParseXml` step:
28
+
29
+ ```ruby
30
+ model_class.from_xml(ctx[:xml], register: register, parse_only: true)
31
+ ```
32
+
33
+ If the user later calls `to_xml`, it works but without preserved input namespace declarations (acceptable for read-only use). If they need full round-trip, they opt out of `parse_only`.
34
+
35
+ ## Files
36
+
37
+ - `lutaml-model/lib/lutaml/xml/model_transform.rb` — gate `build_input_declaration_plan` behind `parse_only` option
38
+ - `lib/xmi/parser_pipeline.rb` — pass `parse_only: true` in ParseXml step
@@ -0,0 +1,20 @@
1
+ # 03: Register Fallback Idempotency Guard
2
+
3
+ ## Impact: MEDIUM (prevents cache invalidation on every parse)
4
+
5
+ ## Problem
6
+
7
+ `extend_fallback_for_mixed_namespaces` is called on every `parse_xml` for mixed-namespace documents. It calls `primary_register.add_fallback(reg.id)` which invalidates internal caches even when the fallback was already added on a previous parse.
8
+
9
+ ## Fix
10
+
11
+ Add a guard before calling `add_fallback`:
12
+
13
+ ```ruby
14
+ next if primary_register.fallback.include?(reg.id)
15
+ primary_register.add_fallback(reg.id)
16
+ ```
17
+
18
+ ## Files
19
+
20
+ - `lib/xmi/namespace_detector.rb` or `lib/xmi/version_registry.rb` — find where `extend_fallback_for_mixed_namespaces` is called and add the guard
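The effect of the proposed guard can be shown with a toy model. `ToyRegister` and `extend_fallback` below are hypothetical stand-ins, not lutaml-model classes; the sketch simply assumes, as the note states, that each `add_fallback` call invalidates caches, and it also carries the cycle check that the `version_registry.rb` hunk in this diff keeps alongside the new guard.

```ruby
# Toy stand-in for a register; NOT the lutaml-model class.
class ToyRegister
  attr_reader :id, :fallback, :invalidations

  def initialize(id)
    @id = id
    @fallback = []
    @invalidations = 0
  end

  def add_fallback(other_id)
    @fallback << other_id
    @invalidations += 1 # pretend every call clears internal caches
  end
end

# Links each extra register as a fallback of the primary one, at most once.
def extend_fallback(primary, others)
  others.each do |reg|
    next if primary.fallback.include?(reg.id) # already linked: nothing to do
    next if reg.fallback.include?(primary.id) # linking would create a cycle

    primary.add_fallback(reg.id)
  end
end

primary = ToyRegister.new(:xmi_20131001)
extra = ToyRegister.new(:uml_20161101)

# Simulate three consecutive parses of a mixed-namespace document.
3.times { extend_fallback(primary, [extra]) }
puts primary.invalidations
```

With the guard in place, repeated parses leave the fallback list and the caches untouched after the first call.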
@@ -0,0 +1,31 @@
1
+ # 04: Pipeline Hash Mutation
2
+
3
+ ## Impact: LOW (saves 4 intermediate Hash allocations per parse)
4
+
5
+ ## Problem
6
+
7
+ Each pipeline step returns `ctx.merge(key: value)` creating intermediate Hash objects.
8
+
9
+ ## Fix
10
+
11
+ Use `ctx[:key] = value` mutation instead:
12
+
13
+ ```ruby
14
+ # Before
15
+ def self.call(ctx)
16
+ xml = ctx[:xml]
17
+ # ...process...
18
+ ctx.merge(xml: fixed_xml)
19
+
20
+ # After
21
+ def self.call(ctx)
22
+ xml = ctx[:xml]
23
+ # ...process...
24
+ ctx[:xml] = fixed_xml
25
+ ctx
26
+ end
27
+ ```
28
+
29
+ ## Files
30
+
31
+ - `lib/xmi/parser_pipeline.rb` — all 4 step modules
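The difference between the two step styles is easy to observe on plain hashes. `MergeStep` and `MutateStep` below are hypothetical modules written for this illustration; they are not the gem's pipeline steps.

```ruby
# Returns a new Hash on every call; the caller's Hash is left untouched.
module MergeStep
  def self.call(ctx)
    ctx.merge(xml: ctx[:xml].strip)
  end
end

# Updates the caller's Hash in place and returns that same object.
module MutateStep
  def self.call(ctx)
    ctx[:xml] = ctx[:xml].strip
    ctx
  end
end

original = { xml: "  <a/>  " }
merged = MergeStep.call(original)

mutated_input = { xml: "  <a/>  " }
mutated = MutateStep.call(mutated_input)

puts merged.equal?(original)       # identity check for the merge style
puts mutated.equal?(mutated_input) # identity check for the mutation style
```

Mutation saves the intermediate allocations, but the context passed by the caller is modified, so a caller that wants to keep its original hash would need to pass a copy.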
@@ -0,0 +1,29 @@
1
+ # 05: EaRoot Single Parse for Extension Loading
2
+
3
+ ## Impact: LOW (only affects load_extension, not parse_xml hot path)
4
+
5
+ ## Problem
6
+
7
+ `EaRoot.load_extension(xml_path)` reads and parses the XML file twice:
8
+ 1. `derive_module_name` in `ea_root.rb:54-56` — `Nokogiri::XML(File.read(xml_path))`
9
+ 2. `build_extension` in `code_generation.rb:8` — `Nokogiri::XML(File.read(xml_path))`
10
+
11
+ ## Fix
12
+
13
+ Parse once in `load_extension`, pass the Nokogiri doc to both:
14
+
15
+ ```ruby
16
+ def load_extension(xml_path)
17
+ xmi_doc = Nokogiri::XML(File.read(xml_path))
18
+ extension_id = get_module_name(xmi_doc)
19
+ # ...guard...
20
+ build_extension_from_doc(xmi_doc)
21
+ update_mappings(extension_id)
22
+ loaded_extensions[extension_id] = xml_path
23
+ end
24
+ ```
25
+
26
+ ## Files
27
+
28
+ - `lib/xmi/ea_root.rb` — parse once, pass doc
29
+ - `lib/xmi/ea_root/code_generation.rb` — accept doc parameter instead of file path
data/docs/migration.md CHANGED
@@ -12,7 +12,7 @@ The old API used `SparxRoot.parse_xml` with automatic namespace normalization:
12
12
  require 'xmi'
13
13
 
14
14
  # Old approach - normalizes all namespaces to 20131001
15
- doc = Xmi::Sparx::SparxRoot.parse_xml(xml_content)
15
+ doc = Xmi::Sparx::Root.parse_xml(xml_content)
16
16
  ```
17
17
 
18
18
  This approach:
@@ -31,7 +31,7 @@ require 'xmi'
31
31
  doc = Xmi.parse(xml_content)
32
32
 
33
33
  # Or for Sparx EA files
34
- doc = Xmi::Sparx::SparxRoot.parse_xml_with_versioning(xml_content)
34
+ doc = Xmi::Sparx::Root.parse_xml_with_versioning(xml_content)
35
35
  ```
36
36
 
37
37
  This approach:
@@ -45,10 +45,10 @@ This approach:
45
45
 
46
46
  ```ruby
47
47
  # Before
48
- doc = Xmi::Sparx::SparxRoot.parse_xml(xml_content)
48
+ doc = Xmi::Sparx::Root.parse_xml(xml_content)
49
49
 
50
50
  # After
51
- doc = Xmi::Sparx::SparxRoot.parse_xml_with_versioning(xml_content)
51
+ doc = Xmi::Sparx::Root.parse_xml_with_versioning(xml_content)
52
52
 
53
53
  # Or
54
54
  doc = Xmi.parse(xml_content)
@@ -86,7 +86,7 @@ doc = Xmi.parse_with_version(xml_content, "20131001")
86
86
 
87
87
  | Use Case | Recommended API |
88
88
  |----------|----------------|
89
- | Sparx EA files | `Xmi::Sparx::SparxRoot.parse_xml_with_versioning` |
89
+ | Sparx EA files | `Xmi::Sparx::Root.parse_xml_with_versioning` |
90
90
  | General XMI files | `Xmi.parse` |
91
91
  | Version detection | `Xmi::Parsing.detect_version` |
92
92
  | Known version | `Xmi.parse_with_version(xml, version)` |
@@ -97,7 +97,7 @@ The old `parse_xml` method still works but normalizes namespaces:
97
97
 
98
98
  ```ruby
99
99
  # Old API - still supported
100
- doc = Xmi::Sparx::SparxRoot.parse_xml(xml_content)
100
+ doc = Xmi::Sparx::Root.parse_xml(xml_content)
101
101
 
102
102
  # This internally normalizes to 20131001 namespace
103
103
  ```
data/docs/versioning.md CHANGED
@@ -64,7 +64,7 @@ Xmi::Parsing.version_supported?("20131001") # => true
64
64
 
65
65
  ```ruby
66
66
  # For Enterprise Architect generated XMI files
67
- doc = Xmi::Sparx::SparxRoot.parse_xml_with_versioning(xml_content)
67
+ doc = Xmi::Sparx::Root.parse_xml_with_versioning(xml_content)
68
68
  ```
69
69
 
70
70
  ## Version Modules
@@ -194,7 +194,7 @@ class BenchmarkRunner
194
194
 
195
195
  case method
196
196
  when :xmi_parse_242_small, :xmi_parse_242_medium, :xmi_parse_242_large, :xmi_parse_251
197
- measure_time { Xmi::Sparx::SparxRoot.parse_xml(xml_content) }
197
+ measure_time { Xmi::Sparx::Root.parse_xml(xml_content) }
198
198
  else
199
199
  raise "Unknown benchmark: #{method}"
200
200
  end
@@ -4,8 +4,7 @@ module Xmi
4
4
  class EaRoot
5
5
  module CodeGeneration
6
6
  # Build extension classes directly via Class.new + const_set.
7
- def build_extension(xml_path)
8
- xmi_doc = Nokogiri::XML(File.read(xml_path))
7
+ def build_extension(xmi_doc)
9
8
  @module_name = get_module_name(xmi_doc)
10
9
  @def_namespace = get_namespace_from_definition(xmi_doc)
11
10
 
@@ -12,7 +12,7 @@ module Xmi
12
12
  private
13
13
 
14
14
  def inject_model_attributes(new_klasses, module_name)
15
- sparx_root = Xmi::Sparx::SparxRoot
15
+ sparx_root = Xmi::Sparx::Root
16
16
  new_klasses.each do |klass_name|
17
17
  method_name = Lutaml::Model::Utils.snake_case(klass_name)
18
18
  full_klass = EaRoot.const_get(module_name).const_get(klass_name)
@@ -32,7 +32,7 @@ module Xmi
32
32
 
33
33
  return if map_entries.empty?
34
34
 
35
- Xmi::Sparx::SparxRoot.class_eval do
35
+ Xmi::Sparx::Root.class_eval do
36
36
  xml do
37
37
  map_entries.each do |element_name, method_sym|
38
38
  map_element element_name, to: method_sym,
data/lib/xmi/ea_root.rb CHANGED
@@ -18,7 +18,8 @@ module Xmi
18
18
 
19
19
  class << self
20
20
  def load_extension(xml_path)
21
- extension_id = derive_module_name(xml_path)
21
+ xmi_doc = Nokogiri::XML(File.read(xml_path))
22
+ extension_id = get_module_name(xmi_doc)
22
23
 
23
24
  if loaded_extensions.key?(extension_id)
24
25
  raise ArgumentError,
@@ -27,7 +28,7 @@ module Xmi
27
28
  "Call unload_extension('#{extension_id}') first if you want to reload it."
28
29
  end
29
30
 
30
- build_extension(xml_path)
31
+ build_extension(xmi_doc)
31
32
  update_mappings(extension_id)
32
33
  loaded_extensions[extension_id] = xml_path
33
34
  end
@@ -48,13 +49,6 @@ module Xmi
48
49
  def loaded_extensions
49
50
  @loaded_extensions ||= {}
50
51
  end
51
-
52
- private
53
-
54
- def derive_module_name(xml_path)
55
- xmi_doc = Nokogiri::XML(File.read(xml_path))
56
- get_module_name(xmi_doc)
57
- end
58
52
  end
59
53
  end
60
54
  end
@@ -10,6 +10,15 @@ module Xmi
10
10
  class NamespaceDetector
11
11
  VERSION_PATTERN = /(\d{8})/
12
12
 
13
+ # Regex to extract xmlns declarations without parsing the entire document.
14
+ # Matches both default namespace (xmlns="...") and prefixed (xmlns:foo="...").
15
+ # Namespace declarations are always on or near the root element, so scanning
16
+ # the first 8KB is sufficient for any XMI file.
17
+ NS_DECL_REGEX = /xmlns(?::(\w+))?\s*=\s*["']([^"']+)["']/
18
+
19
+ # How many bytes of the XML to scan for namespace declarations
20
+ NS_SCAN_BYTES = 8192
21
+
13
22
  # Namespace URI patterns for OMG specifications
14
23
  NS_PATTERNS = {
15
24
  xmi: %r{http://www\.omg\.org/spec/XMI/(\d{8})},
@@ -33,11 +42,34 @@ module Xmi
33
42
  }
34
43
  end
35
44
 
36
- # Extract all namespace URIs from XML content
45
+ # Extract all namespace URIs from XML content using regex on the first 8KB.
37
46
  #
38
- # @param xml_content [String] The XML content to parse
47
+ # This avoids a full Nokogiri parse: namespace declarations are always
48
+ # on or near the root element, so scanning the first few KB is sufficient
49
+ # and ~10x faster than parsing a 3.5MB document.
50
+ #
51
+ # @param xml_content [String] The XML content
39
52
  # @return [Hash<String, String>] A hash mapping prefixes to namespace URIs
40
53
  def self.extract_namespace_uris(xml_content)
54
+ head = xml_content.byteslice(0, NS_SCAN_BYTES)
55
+ unless head.valid_encoding?
56
+ head = head.encode("UTF-8", invalid: :replace,
57
+ undef: :replace)
58
+ end
59
+ result = {}
60
+ head.scan(NS_DECL_REGEX) do |prefix, uri|
61
+ key = prefix.nil? ? "xmlns" : prefix
62
+ result[key] = uri unless result.key?(key)
63
+ end
64
+ result
65
+ end
66
+
67
+ # Extract namespace URIs via Nokogiri (full parse).
68
+ # Used by `analyze` when the complete namespace map is needed.
69
+ #
70
+ # @param xml_content [String] The XML content
71
+ # @return [Hash<String, String>] A hash mapping prefixes to namespace URIs
72
+ def self.extract_namespace_uris_full(xml_content)
41
73
  doc = Nokogiri::XML(xml_content, &:noent)
42
74
  doc.collect_namespaces
43
75
  rescue Nokogiri::XML::SyntaxError
@@ -117,7 +149,7 @@ module Xmi
117
149
  def self.analyze(xml_content)
118
150
  versions = detect_versions(xml_content)
119
151
  uris = detect_namespace_uris(xml_content)
120
- raw_namespaces = extract_namespace_uris(xml_content)
152
+ raw_namespaces = extract_namespace_uris_full(xml_content)
121
153
 
122
154
  {
123
155
  versions: versions,
@@ -8,9 +8,9 @@ module Xmi
8
8
  # without modifying existing code — Open/Closed Principle.
9
9
  #
10
10
  # @example
11
- # context = { xml: xml_content, root_class: Xmi::Sparx::SparxRoot }
11
+ # context = { xml: xml_content, root_class: Xmi::Sparx::Root }
12
12
  # result = Xmi::ParserPipeline.run(context)
13
- # result[:root] # => parsed SparxRoot instance
13
+ # result[:root] # => parsed Root instance
14
14
  #
15
15
  module ParserPipeline
16
16
  module Steps
@@ -18,11 +18,11 @@ module Xmi
18
18
  def self.call(ctx)
19
19
  xml = ctx[:xml]
20
20
  if xml.respond_to?(:valid_encoding?) && !xml.valid_encoding?
21
- xml = xml
21
+ ctx[:xml] = xml
22
22
  .encode("UTF-16be", invalid: :replace, replace: "?")
23
23
  .encode("UTF-8")
24
24
  end
25
- ctx.merge(xml: xml)
25
+ ctx
26
26
  end
27
27
  end
28
28
 
@@ -36,16 +36,16 @@ module Xmi
36
36
  module ParseXml
37
37
  def self.call(ctx)
38
38
  root_class = ctx[:root_class]
39
- root = VersionRegistry.parse_with_detected_version(ctx[:xml],
40
- root_class)
41
- ctx.merge(root: root)
39
+ ctx[:root] = VersionRegistry.parse_with_detected_version(
40
+ ctx[:xml], root_class
41
+ )
42
+ ctx
42
43
  end
43
44
  end
44
45
 
45
46
  module BuildIndex
46
47
  def self.call(ctx)
47
- root = ctx[:root]
48
- root.build_index if root.respond_to?(:build_index)
48
+ ctx[:root].build_index if ctx[:root].respond_to?(:build_index)
49
49
  ctx
50
50
  end
51
51
  end
@@ -2,7 +2,7 @@
2
2
 
3
3
  module Xmi
4
4
  module Sparx
5
- # Builds all commonly needed indexes from a parsed SparxRoot in a single
5
+ # Builds all commonly needed indexes from a parsed Root in a single
6
6
  # targeted walk, avoiding the generic map_id_name approach that visits every
7
7
  # attribute of every node.
8
8
  #
@@ -24,7 +24,7 @@ module Xmi
24
24
 
25
25
  PackagedElement = ::Xmi::Uml::PackagedElement
26
26
 
27
- # @param root [Xmi::Sparx::SparxRoot] parsed XMI model
27
+ # @param root [Xmi::Sparx::Root] parsed XMI model
28
28
  def initialize(root)
29
29
  @id_name_map = {}
30
30
  @packaged_elements = []
@@ -2,15 +2,15 @@
2
2
 
3
3
  module Xmi
4
4
  module Sparx
5
- module SparxMappings
5
+ module Mappings
6
6
  # Base XML mapping class for Sparx EA XMI documents.
7
7
  #
8
8
  # This reusable mapping class encapsulates all the XML element → attribute
9
- # mappings for SparxRoot.
9
+ # mappings for Root.
10
10
  #
11
11
  # @example Use in a model class
12
- # class SparxRoot < Root
13
- # xml SparxMappings::BaseMapping
12
+ # class Root < ::Xmi::Root
13
+ # xml Mappings::BaseMapping
14
14
  # end
15
15
  class BaseMapping < Lutaml::Xml::Mapping
16
16
  xml do
@@ -2,7 +2,7 @@
2
2
 
3
3
  module Xmi
4
4
  module Sparx
5
- module SparxMappings
5
+ module Mappings
6
6
  # Reusable XML mapping classes for Sparx EA XMI documents.
7
7
  autoload :BaseMapping, "xmi/sparx/mappings/base_mapping"
8
8
  end
@@ -2,7 +2,7 @@
2
2
 
3
3
  module Xmi
4
4
  module Sparx
5
- class SparxRoot < Root
5
+ class Root < ::Xmi::Root
6
6
  attribute :modelica_parameter, SysPhS
7
7
 
8
8
  attribute :eauml_import, EaUml::Import, collection: true
@@ -26,7 +26,7 @@ module Xmi
26
26
  # encoding fix → version detection → XML parsing → index building.
27
27
  #
28
28
  # @param xml_content [String] The raw XMI XML content
29
- # @return [SparxRoot] The parsed Ruby object with index built
29
+ # @return [Root] The parsed Ruby object with index built
30
30
  #
31
31
  # @see Xmi::ParserPipeline
32
32
  # @see Xmi.parse
@@ -55,7 +55,7 @@ module Xmi
55
55
  end
56
56
 
57
57
  # Use the reusable BaseMapping class instead of eval hack
58
- xml SparxMappings::BaseMapping
58
+ xml Mappings::BaseMapping
59
59
 
60
60
  # Build index for fast lookups
61
61
  # @return [Sparx::Index]
data/lib/xmi/sparx.rb CHANGED
@@ -17,8 +17,8 @@ module Xmi
17
17
  autoload :EaStub, "xmi/sparx/ea_stub"
18
18
  autoload :SysPhS, "xmi/sparx/sys_ph_s"
19
19
  autoload :Extension, "xmi/sparx/extension"
20
- autoload :SparxRoot, "xmi/sparx/root"
21
- autoload :SparxMappings, "xmi/sparx/mappings"
20
+ autoload :Root, "xmi/sparx/root"
21
+ autoload :Mappings, "xmi/sparx/mappings"
22
22
  autoload :Index, "xmi/sparx/index"
23
23
  end
24
24
  end
data/lib/xmi/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Xmi
4
- VERSION = "0.5.5"
4
+ VERSION = "0.5.6"
5
5
  end
@@ -108,16 +108,15 @@ module Xmi
108
108
  #
109
109
  # @param xml_content [String] XML content
110
110
  # @param model_class [Class] The model class to parse into
111
+ # @param opts [Hash] Options passed through to from_xml
112
+ # (e.g., import_declaration_plan: :skip)
111
113
  # @return [Object] Parsed model instance
112
- def parse_with_detected_version(xml_content, model_class)
114
+ def parse_with_detected_version(xml_content, model_class, **opts)
113
115
  register = detect_register(xml_content)
114
116
 
115
- if register
116
- model_class.from_xml(xml_content, register: register)
117
- else
118
- # Fallback to default parsing (existing behavior)
119
- model_class.from_xml(xml_content)
120
- end
117
+ opts[:register] = register if register
118
+
119
+ model_class.from_xml(xml_content, **opts)
121
120
  end
122
121
 
123
122
  # Handle mixed namespace documents by binding additional namespace URIs
@@ -159,7 +158,11 @@ module Xmi
159
158
  # A cycle would occur if reg's fallback chain includes primary_register.
160
159
  # We check this by seeing if primary_register.id appears in reg's fallback.
161
160
  # Use add_fallback to keep Register and TypeContext in sync and invalidate caches.
162
- primary_register.add_fallback(reg.id) unless reg.fallback.include?(primary_register.id)
161
+ # Guard: skip if primary already has this fallback to avoid unnecessary cache invalidation.
162
+ unless primary_register.fallback.include?(reg.id) ||
163
+ reg.fallback.include?(primary_register.id)
164
+ primary_register.add_fallback(reg.id)
165
+ end
163
166
  end
164
167
  end
165
168
  # rubocop:enable Metrics/CyclomaticComplexity
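The option pass-through introduced in `parse_with_detected_version` follows a common Ruby keyword-splat pattern. The sketch below uses a toy model class and a hypothetical helper, `parse_with_register`, that takes the register as an argument instead of detecting it; it only illustrates how `**opts` collapses the former if/else into a single call.

```ruby
# Toy stand-in that records what it was called with; NOT a lutaml-model class.
class ToyModel
  def self.from_xml(xml, **opts)
    { xml: xml, opts: opts }
  end
end

# Adds :register only when one is available, then forwards every option.
def parse_with_register(xml, model_class, register, **opts)
  opts[:register] = register if register
  model_class.from_xml(xml, **opts)
end

without_register = parse_with_register("<a/>", ToyModel, nil)
with_register = parse_with_register("<a/>", ToyModel, :reg_2013, strict: true)

puts without_register[:opts]
puts with_register[:opts]
```

Because a `**opts` parameter always receives a fresh Hash, adding `:register` inside the method does not alter a hash the caller may have splatted in.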
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: xmi
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.5.5
4
+ version: 0.5.6
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ribose Inc.
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2026-04-21 00:00:00.000000000 Z
11
+ date: 2026-04-22 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: lutaml-model
@@ -52,10 +52,16 @@ files:
52
52
  - ".rubocop.yml"
53
53
  - ".rubocop_todo.yml"
54
54
  - CHANGELOG.md
55
+ - CLAUDE.md
55
56
  - CODE_OF_CONDUCT.md
56
57
  - Gemfile
57
58
  - README.adoc
58
59
  - Rakefile
60
+ - TODO.perf/01-eliminate-duplicate-nokogiri-parse.md
61
+ - TODO.perf/02-read-only-fast-mode.md
62
+ - TODO.perf/03-register-fallback-idempotency.md
63
+ - TODO.perf/04-pipeline-hash-mutation.md
64
+ - TODO.perf/05-ea-root-single-parse.md
59
65
  - bin/console
60
66
  - bin/setup
61
67
  - docs/migration.md