xmi 0.5.6 → 0.5.7
- checksums.yaml +4 -4
- data/lib/xmi/version.rb +1 -1
- metadata +1 -6
- data/TODO.perf/01-eliminate-duplicate-nokogiri-parse.md +0 -33
- data/TODO.perf/02-read-only-fast-mode.md +0 -38
- data/TODO.perf/03-register-fallback-idempotency.md +0 -20
- data/TODO.perf/04-pipeline-hash-mutation.md +0 -31
- data/TODO.perf/05-ea-root-single-parse.md +0 -29
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: efc69e1cb868e8f245cb943d45b7ceaf007b31c59f6c6ff63f6439d999d4385e
+  data.tar.gz: 5bb58e282bd3eb3a565e8d407ce71ba602add350e8ed3ccf5ade429deafe01e8
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 0eaafdbea1e4fc8d9cd306c9b17ba169a660675b11094fa971f729c5a20126753c3d57e31a31cbae0c394215c26879e63b7c05d0e0a80479fc969bb48fbafb0b
+  data.tar.gz: ec71dd2f1fa4f19238b6c8f0a464bac681f05989f54aed68bc43ff6c212ee8b3e493b26624613b4244121841d422aaa12353c452fbd87cae9845ddcbebd01d68
data/lib/xmi/version.rb
CHANGED
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: xmi
 version: !ruby/object:Gem::Version
-  version: 0.5.
+  version: 0.5.7
 platform: ruby
 authors:
 - Ribose Inc.
@@ -57,11 +57,6 @@ files:
 - Gemfile
 - README.adoc
 - Rakefile
-- TODO.perf/01-eliminate-duplicate-nokogiri-parse.md
-- TODO.perf/02-read-only-fast-mode.md
-- TODO.perf/03-register-fallback-idempotency.md
-- TODO.perf/04-pipeline-hash-mutation.md
-- TODO.perf/05-ea-root-single-parse.md
 - bin/console
 - bin/setup
 - docs/migration.md
data/TODO.perf/01-eliminate-duplicate-nokogiri-parse.md
DELETED
@@ -1,33 +0,0 @@
-# 01: Eliminate Duplicate Nokogiri Parse
-
-## Impact: HIGH (~30-50% parse time for large files)
-
-## Problem
-
-`ParserPipeline` parses XML twice:
-1. `NamespaceDetector.detect_versions` → `Nokogiri::XML(xml_content)` + `collect_namespaces` — parses entire doc just to read root namespace URIs
-2. `from_xml` → adapter.parse(xml_content) — parses again via lutaml-model
-
-For the 3.5MB large fixture, this is the dominant cost.
-
-## Fix
-
-Extract namespace URIs from the first ~4KB of the XML string using regex, avoiding Nokogiri entirely for version detection.
-
-```ruby
-# Instead of:
-doc = Nokogiri::XML(xml_content)
-doc.collect_namespaces
-
-# Use:
-NS_REGEX = /xmlns:?(\\w*)=["']([^"']+)["']/
-xml_content[0, 4096].scan(NS_REGEX).to_h { |prefix, uri| [prefix, uri] }
-```
-
-The namespace declarations are always on or near the root element, so scanning the first 4KB is sufficient.
-
-## Files
-
-- `lib/xmi/namespace_detector.rb` — replace `extract_namespace_uris` Nokogiri parse with regex
-- `lib/xmi/namespace_detector.rb` — verify `detect_versions` still works
-- `spec/xmi/parser_pipeline_spec.rb` — verify tests pass
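The regex-based namespace scan proposed in the removed TODO 01 can be tried in isolation. The following is a minimal sketch, not the gem's implementation: `sniff_namespaces` is a hypothetical helper name, the sample document is made up, and the pattern assumes a single-backslash `\w` was intended (as printed in the diff, the doubled backslash would match a literal backslash and the pattern would never match a real declaration).

```ruby
# Hypothetical helper, not part of the xmi gem's API.
# Assumes the intended pattern uses \w (word characters) for the prefix.
NS_REGEX = /xmlns:?(\w*)=["']([^"']+)["']/

# Scans only the first `limit` characters, so no XML parser is involved.
def sniff_namespaces(xml_content, limit = 4096)
  xml_content[0, limit].scan(NS_REGEX).to_h { |prefix, uri| [prefix, uri] }
end

# Made-up sample input for illustration.
xml = <<~XML
  <?xml version="1.0"?>
  <xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
           xmlns:uml="http://www.omg.org/spec/UML/20131001">
    <uml:Model name="demo"/>
  </xmi:XMI>
XML

namespaces = sniff_namespaces(xml)
```

A default declaration (`xmlns="..."`) would land under the empty-string key. The approach relies on the TODO's stated assumption that declarations sit within the first 4KB; a document with a long leading comment or DOCTYPE could push them past that window.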
data/TODO.perf/02-read-only-fast-mode.md
DELETED
@@ -1,38 +0,0 @@
-# 02: Read-Only Fast Mode (Skip Namespace Declaration Plan)
-
-## Impact: HIGH (eliminates full tree walk before mapping starts)
-
-## Problem
-
-In lutaml-model's `ModelTransform#data_to_model` (lines 71-76), `build_input_declaration_plan` calls `collect_element_namespaces` which recursively walks **every element** in the parsed document — solely for round-trip namespace fidelity (preserving xmlns declarations for `to_xml`).
-
-Most XMI consumers only parse (read) and never call `to_xml`. This full tree walk is wasted work.
-
-## Fix
-
-### In lutaml-model
-
-Add a `parse_only: true` option to `from_xml` that skips namespace declaration plan collection:
-
-```ruby
-# model_transform.rb line 71
-unless options.key?(:lutaml_parent) || options[:parse_only]
-  input_declaration_plan = build_input_declaration_plan(root_element)
-  # ...
-end
-```
-
-### In xmi gem
-
-Pass `parse_only: true` in `ParserPipeline::ParseXml` step:
-
-```ruby
-model_class.from_xml(ctx[:xml], register: register, parse_only: true)
-```
-
-If the user later calls `to_xml`, it works but without preserved input namespace declarations (acceptable for read-only use). If they need full round-trip, they opt out of `parse_only`.
-
-## Files
-
-- `lutaml-model/lib/lutaml/xml/model_transform.rb` — gate `build_input_declaration_plan` behind `parse_only` option
-- `lib/xmi/parser_pipeline.rb` — pass `parse_only: true` in ParseXml step
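The effect of the `parse_only` guard described in the removed TODO 02 can be illustrated with a toy model. Everything in this sketch is a hypothetical stand-in (`ToyNode`, `ToyTransform`, the `nodes_walked` counter); it does not use or reproduce lutaml-model's real classes, and `parse_only` is only an option the TODO proposed, not a confirmed API.

```ruby
# Toy element tree; a stand-in for a parsed XML document.
ToyNode = Struct.new(:name, :children)

# Hypothetical transform that counts how many elements the namespace walk visits.
class ToyTransform
  attr_reader :nodes_walked

  def initialize
    @nodes_walked = 0
  end

  # Mirrors the proposed guard: skip the walk for nested models
  # and for callers that pass parse_only: true.
  def data_to_model(root, options = {})
    plan = nil
    unless options.key?(:lutaml_parent) || options[:parse_only]
      plan = build_input_declaration_plan(root)
    end
    { root: root.name, declaration_plan: plan }
  end

  private

  # Visits every element once, which is the cost the TODO wants to avoid.
  def build_input_declaration_plan(node)
    @nodes_walked += 1
    node.children.each { |child| build_input_declaration_plan(child) }
    :plan
  end
end

tree = ToyNode.new("XMI", [
  ToyNode.new("Model", [ToyNode.new("Class", [])]),
  ToyNode.new("Extension", [])
])

full = ToyTransform.new
full_result = full.data_to_model(tree)

fast = ToyTransform.new
fast_result = fast.data_to_model(tree, parse_only: true)
```

In this toy, the default call visits all four elements while the `parse_only` call visits none and returns no declaration plan, which matches the trade-off the TODO describes: cheaper reads at the cost of round-trip namespace fidelity.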
data/TODO.perf/03-register-fallback-idempotency.md
DELETED
@@ -1,20 +0,0 @@
-# 03: Register Fallback Idempotency Guard
-
-## Impact: MEDIUM (prevents cache invalidation on every parse)
-
-## Problem
-
-`extend_fallback_for_mixed_namespaces` is called on every `parse_xml` for mixed-namespace documents. It calls `primary_register.add_fallback(reg.id)` which invalidates internal caches even when the fallback was already added on a previous parse.
-
-## Fix
-
-Add a guard before calling `add_fallback`:
-
-```ruby
-next if primary_register.fallback.include?(reg.id)
-primary_register.add_fallback(reg.id)
-```
-
-## Files
-
-- `lib/xmi/namespace_detector.rb` or `lib/xmi/version_registry.rb` — find where `extend_fallback_for_mixed_namespaces` is called and add the guard
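The idempotency guard from the removed TODO 03 can be sketched with a toy register. `ToyRegister`, `extend_fallbacks`, and the `cache_invalidations` counter are hypothetical names used only for illustration; the real register class and its cache behaviour are not shown in this diff.

```ruby
# Hypothetical register: every add_fallback call counts as a cache invalidation.
class ToyRegister
  attr_reader :fallback, :cache_invalidations

  def initialize
    @fallback = []
    @cache_invalidations = 0
  end

  def add_fallback(id)
    @fallback << id
    @cache_invalidations += 1 # stand-in for clearing internal caches
  end
end

# Adds each id at most once, so repeated parses do not touch the caches again.
def extend_fallbacks(primary_register, ids)
  ids.each do |id|
    next if primary_register.fallback.include?(id)

    primary_register.add_fallback(id)
  end
end

primary = ToyRegister.new
# Simulate three parses of the same mixed-namespace document.
3.times { extend_fallbacks(primary, %i[xmi_20131001 uml_20131001]) }
```

With the guard in place, three simulated parses register each fallback once and trigger two invalidations in total instead of six.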
data/TODO.perf/04-pipeline-hash-mutation.md
DELETED
@@ -1,31 +0,0 @@
-# 04: Pipeline Hash Mutation
-
-## Impact: LOW (saves 4 intermediate Hash allocations per parse)
-
-## Problem
-
-Each pipeline step returns `ctx.merge(key: value)` creating intermediate Hash objects.
-
-## Fix
-
-Use `ctx[:key] = value` mutation instead:
-
-```ruby
-# Before
-def self.call(ctx)
-  xml = ctx[:xml]
-  # ...process...
-  ctx.merge(xml: fixed_xml)
-
-# After
-def self.call(ctx)
-  xml = ctx[:xml]
-  # ...process...
-  ctx[:xml] = fixed_xml
-  ctx
-end
-```
-
-## Files
-
-- `lib/xmi/parser_pipeline.rb` — all 4 step modules
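The difference between the two step styles in the removed TODO 04 shows up in object identity. This is a minimal sketch with made-up step modules (`MergeStep`, `MutateStep`) that strip whitespace as a placeholder for real processing; the gem's actual pipeline steps are not reproduced here.

```ruby
# Returns a new Hash on every call; the caller's Hash is left untouched.
module MergeStep
  def self.call(ctx)
    ctx.merge(xml: ctx[:xml].strip)
  end
end

# Writes into the Hash it was given and returns that same object.
module MutateStep
  def self.call(ctx)
    ctx[:xml] = ctx[:xml].strip
    ctx
  end
end

input = { xml: "  <xmi/>  " }

merged = MergeStep.call(input)
input_after_merge = input[:xml].dup # snapshot before the mutating step runs

mutated = MutateStep.call(input)
```

`merged` is a separate object and `input` still holds the padded string after the first call, whereas `mutated` is `input` itself with the stripped value. The mutating style avoids the extra allocation but changes the caller's Hash, so it fits best when the pipeline owns the context Hash from the start.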
data/TODO.perf/05-ea-root-single-parse.md
DELETED
@@ -1,29 +0,0 @@
-# 05: EaRoot Single Parse for Extension Loading
-
-## Impact: LOW (only affects load_extension, not parse_xml hot path)
-
-## Problem
-
-`EaRoot.load_extension(xml_path)` reads and parses the XML file twice:
-1. `derive_module_name` in `ea_root.rb:54-56` — `Nokogiri::XML(File.read(xml_path))`
-2. `build_extension` in `code_generation.rb:8` — `Nokogiri::XML(File.read(xml_path))`
-
-## Fix
-
-Parse once in `load_extension`, pass the Nokogiri doc to both:
-
-```ruby
-def load_extension(xml_path)
-  xmi_doc = Nokogiri::XML(File.read(xml_path))
-  extension_id = get_module_name(xmi_doc)
-  # ...guard...
-  build_extension_from_doc(xmi_doc)
-  update_mappings(extension_id)
-  loaded_extensions[extension_id] = xml_path
-end
-```
-
-## Files
-
-- `lib/xmi/ea_root.rb` — parse once, pass doc
-- `lib/xmi/ea_root/code_generation.rb` — accept doc parameter instead of file path
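The parse-once shape proposed in the removed TODO 05 can be sketched without Nokogiri by counting parser calls. `CountingParser` and `ToyExtensionLoader` are hypothetical stand-ins, the loader takes an XML string rather than a file path to stay self-contained, and the module-name derivation is a placeholder rather than the gem's logic.

```ruby
# Hypothetical stand-in for an XML parser; it only records how often it runs.
class CountingParser
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def parse(source)
    @calls += 1
    { root: source[/<(\w+)/, 1] } # crude root-element name, enough for the demo
  end
end

# Hypothetical loader mirroring the shape of the proposed load_extension.
class ToyExtensionLoader
  attr_reader :loaded_extensions

  def initialize(parser)
    @parser = parser
    @loaded_extensions = {}
  end

  def load_extension(label, source)
    doc = @parser.parse(source)          # the only parse
    extension_id = module_name_from(doc) # reuses the parsed doc
    return extension_id if @loaded_extensions.key?(extension_id)

    build_extension_from_doc(doc)        # reuses it again
    @loaded_extensions[extension_id] = label
    extension_id
  end

  private

  def module_name_from(doc)
    doc[:root].capitalize
  end

  def build_extension_from_doc(doc)
    doc # placeholder for class generation
  end
end

parser = CountingParser.new
loader = ToyExtensionLoader.new(parser)
extension_id = loader.load_extension("sample.xml", "<mdg><stereotype/></mdg>")
```

Both helpers receive the already-parsed document, so the parser runs once per `load_extension` call regardless of how many steps need the document.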