canon 0.1.9 → 0.1.11
- checksums.yaml +4 -4
- data/.rubocop_todo.yml +25 -99
- data/README.adoc +220 -26
- data/docs/advanced/diff-classification.adoc +118 -26
- data/lib/canon/cli.rb +30 -0
- data/lib/canon/commands/diff_command.rb +3 -0
- data/lib/canon/comparison/markup_comparator.rb +109 -2
- data/lib/canon/comparison/xml_comparator/diff_node_builder.rb +108 -0
- data/lib/canon/comparison/xml_comparator.rb +192 -0
- data/lib/canon/config/env_schema.rb +5 -1
- data/lib/canon/config.rb +30 -0
- data/lib/canon/diff/diff_classifier.rb +48 -33
- data/lib/canon/diff/xml_serialization_formatter.rb +153 -0
- data/lib/canon/diff_formatter.rb +102 -12
- data/lib/canon/version.rb +1 -1
- metadata +3 -2
|
data/docs/advanced/diff-classification.adoc
CHANGED
|
@@ -80,14 +80,20 @@ Classification depends on `attribute_order` setting:
 │ │
 │ DiffClassifier examines each DiffNode: │
 │ │
-│
-│
+│ 1. Serialization-level formatting (XmlSerializationFormatter) │
+│ → XML syntax differences: <tag/> vs <tag></tag> │
+│ → ALWAYS formatting-only (non-normative) │
 │ │
-│
-│
-│
-│ → NORMATIVE (difference matters) │
+│ 2. Content-level formatting (text_content: :normalize) │
+│ → Whitespace differences in content │
+│ → Formatting-only when normalized content matches │
 │ │
+│ 3. CompareProfile policy (normative vs informative) │
+│ → behavior == :ignore → INFORMATIVE │
+│ → behavior == :strict → NORMATIVE │
+│ → behavior == :normalize → Check content normalization │
+│ │
+│ Sets diff_node.formatting = true/false │
 │ Sets diff_node.normative = true/false │
 └───────────────────────────────────┬───────────────────────────────┘
 ↓
|
@@ -102,6 +108,27 @@ Classification depends on `attribute_order` setting:
 └──────────────────────────────────────────────────────────────────┘
 ----
 
+=== Three-Level Classification System
+
+Canon distinguishes between **three distinct kinds of differences**:
+
+| Kind | `formatting:` | `normative:` | Meaning | Examples |
+|------|---------------|--------------|---------|----------|
+| **Serialization formatting** | `true` | `false` | XML syntax differences | `<tag/>` vs `<tag></tag>` |
+| **Content formatting** | `true` | `false` | Whitespace in content | `Hello world` vs `Hello world` |
+| **Informative** | `false` | `false` | Tracked but doesn't affect equivalence | Attribute order (when `:ignore`) |
+| **Normative** | `false` | `true` | Affects equivalence | Different words, missing elements |
+
+**Key distinction**:
+
+* **Serialization-level formatting**: XML syntax differences that are ALWAYS non-normative regardless of match options, because they represent different valid serializations of the same semantic content. Detected by `XmlSerializationFormatter`.
+
+* **Content-level formatting**: Whitespace differences in document content. These are formatting-only (non-normative) when normalized content matches (using `text_content: :normalize`).
+
+* **Informative**: Differences tracked for reference but don't affect equivalence (when behavior is `:ignore`).
+
+* **Normative**: Semantic content differences that affect equivalence (when behavior is `:strict` or when normalized content differs).
+
 == CompareProfile-Based Classification
 
 === Overview
|
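The flag combinations in the table added by this hunk can be read as a small decision function. The following is a minimal sketch, not Canon's code: `FlaggedDiff`, `classification_label` and `equivalent_by_flags?` are hypothetical names introduced only for illustration, and the `:inconsistent` label is an assumption for the one combination the table does not list.

```ruby
# Hypothetical stand-in for a diff node; only the two flags from the table matter here.
FlaggedDiff = Struct.new(:formatting, :normative)

# Translate the (formatting, normative) pair into one of the labels from the table.
def classification_label(diff)
  return :inconsistent if diff.formatting && diff.normative # not a row in the table
  return :formatting if diff.formatting
  return :normative if diff.normative

  :informative
end

# Per the table, only normative differences count against equivalence.
def equivalent_by_flags?(diffs)
  diffs.none?(&:normative)
end
```

Note that the two flags alone cannot tell serialization formatting apart from content formatting, since both rows carry `formatting: true, normative: false`.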
@@ -120,22 +147,42 @@ DiffNode → DiffClassifier → CompareProfile → normative?
 
 === Classification Hierarchy
 
-Canon uses a
+Canon uses a **multi-level hierarchy** for classifying differences:
 
-
-
-
-
+[source]
+----
+DiffNode → DiffClassifier → XmlSerializationFormatter → serialization formatting?
+↓
+CompareProfile → normative dimension?
+↓
+FormattingDetector → formatting-only?
+↓
+Final classification
+----
+
+**Classification priority (from highest to lowest specificity)**:
+
+1. **Serialization-level formatting** (highest priority)
+- XML syntax differences: `<tag/>` vs `<tag></tag>`
+- Detected by `XmlSerializationFormatter`
+- **ALWAYS** `formatting: true, normative: false`
+- Bypasses all other classification logic
 
-2. **
+2. **Content-level formatting**
+- Whitespace differences in document content
+- Detected by `FormattingDetector` when `text_content: :normalize`
+- `formatting: true, normative: false` when normalized content matches
+- Respects element-level whitespace sensitivity
+
+3. **Informative** (based on `:ignore` behavior)
 - Tracked but doesn't affect equivalence
--
--
+- `formatting: false, normative: false`
+- Example: Attribute order when `attribute_order: :ignore`
 
-
+4. **Normative** (based on `:strict` behavior or content mismatch)
 - Affects equivalence
--
--
+- `formatting: false, normative: true`
+- Example: Different words, missing elements
 
 === Format-Specific Policies
 
|
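The content-level rule in the hierarchy above (formatting-only when the normalized content matches) can be sketched in a few lines. This is an illustration under assumptions: `normalize_whitespace` and `classify_text_difference` are hypothetical helpers, and collapsing whitespace runs to a single space is an assumed normalization rule, not necessarily the one `FormattingDetector` applies. Element-level whitespace sensitivity is deliberately left out.

```ruby
# Collapse runs of whitespace and trim both ends (an assumed normalization rule).
def normalize_whitespace(text)
  text.to_s.gsub(/\s+/, " ").strip
end

# Hypothetical helper: flags for a text difference under text_content: :normalize.
def classify_text_difference(text1, text2)
  if normalize_whitespace(text1) == normalize_whitespace(text2)
    { formatting: true, normative: false } # whitespace-only difference
  else
    { formatting: false, normative: true } # the words themselves differ
  end
end

classify_text_difference("Hello   world", "Hello world")
# => { formatting: true, normative: false }
```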
@@ -292,6 +339,34 @@ Canon::Comparison.equivalent?(html1, html2, format: :html)
 ----
 ====
 
+.Self-closing vs explicit closing tags
+====
+Per XML standards, `<tag/>` and `<tag></tag>` are semantically equivalent (both represent empty elements). Canon classifies differences in serialisation format as **formatting-only** (non-normative):
+
+[source,ruby]
+----
+# Self-closing vs explicit closing - always equivalent
+xml1 = '<svg><rect x="10" y="10"/></svg>'
+xml2 = '<svg><rect x="10" y="10"></rect></svg>'
+
+Canon::Comparison.equivalent?(xml1, xml2, format: :xml)
+# => true
+
+# Empty/whitespace-only text nodes from serialisation are formatting-only
+result = Canon::Comparison.equivalent?(xml1, xml2, format: :xml, verbose: true)
+result.differences.each do |diff|
+  if diff.dimension == :text_content
+    puts "Normative: #{diff.normative?}" # => false
+    puts "Formatting: #{diff.formatting?}" # => true
+  end
+end
+----
+
+This applies regardless of `text_content` behavior setting, as these differences are purely serialisation format variations (similar to attribute order).
+
+The key insight: empty or whitespace-only text nodes created by different serialisation styles (`<tag/>` vs `<tag></tag>`) are always classified as **formatting-only**, not normative.
+====
+
 === FormattingDetector Integration
 
 For dimensions that support it (`:text_content`, `:structural_whitespace`),
|
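To see why the two serialisations in the example above carry the same content, one can rewrite explicit empty pairs into self-closing form before comparing strings. The sketch below is a text-level illustration only: it is not how `XmlSerializationFormatter` works (that class operates on diff nodes), `collapse_empty_elements` and `serialization_equivalent?` are hypothetical names, and the regular expression assumes attribute values contain no `<` or `>`.

```ruby
# Matches <name attrs></name>, optionally with whitespace between the tags.
EMPTY_PAIR = %r{<([A-Za-z_][\w.:-]*)((?:\s[^<>]*)?)>\s*</\1\s*>}

# Rewrite every explicit empty pair as a self-closing tag.
def collapse_empty_elements(xml)
  xml.gsub(EMPTY_PAIR) do
    name = Regexp.last_match(1)
    attrs = Regexp.last_match(2).rstrip
    "<#{name}#{attrs}/>"
  end
end

# Two documents are serialization-equivalent here if they match after collapsing.
def serialization_equivalent?(xml1, xml2)
  collapse_empty_elements(xml1) == collapse_empty_elements(xml2)
end

xml1 = '<svg><rect x="10" y="10"/></svg>'
xml2 = '<svg><rect x="10" y="10"></rect></svg>'
serialization_equivalent?(xml1, xml2) # => true
```

A real comparison should work on a parsed tree rather than on strings; the regular expression is only enough to show the idea.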
@@ -319,19 +394,35 @@ With `:normalize` mode:
 
 === Implementation Details
 
-The
+The classification system uses three main classes:
 
-*
-
-
+* **`XmlSerializationFormatter`** - Detects XML serialization-level formatting differences
+- Self-closing vs explicit closing tags: `<tag/>` vs `<tag></tag>`
+- Always returns `formatting: true, normative: false`
+- These differences are ALWAYS non-normative regardless of match options
 
-
+* **`CompareProfile`** - Determines dimension behavior and policy
+- `normative_dimension?(dimension)` - Is this dimension normative?
+- `affects_equivalence?(dimension)` - Does this dimension affect equivalence?
+- `supports_formatting_detection?(dimension)` - Can this dimension have formatting-only diffs?
+
+* **`DiffClassifier`** - Orchestrates classification using the above
+- First checks `XmlSerializationFormatter` for serialization formatting
+- Then handles content-level formatting (text_content: :normalize)
+- Finally applies `CompareProfile` policy for normative vs informative
 
 [source,ruby]
 ----
 def classify(diff_node)
-  #
-  #
+  # FIRST: Check for XML serialization-level formatting differences
+  # These are ALWAYS non-normative (formatting-only) regardless of match options
+  if XmlSerializationFormatter.serialization_formatting?(diff_node)
+    diff_node.formatting = true
+    diff_node.normative = false
+    return diff_node
+  end
+
+  # SECOND: Handle content-level formatting for text_content with :normalize
   if diff_node.dimension == :text_content &&
      profile.send(:behavior_for, :text_content) == :normalize &&
      !inside_whitespace_sensitive_element?(diff_node) &&
|
@@ -341,10 +432,10 @@ def classify(diff_node)
     return diff_node
   end
 
-  #
+  # THIRD: Apply CompareProfile policy
   is_normative = profile.normative_dimension?(diff_node.dimension)
 
-  #
+  # FOURTH: Check FormattingDetector for non-normative dimensions
   if !is_normative && profile.supports_formatting_detection?(diff_node.dimension)
     if formatting_only_diff?(diff_node)
       diff_node.formatting = true
|
@@ -353,6 +444,7 @@ def classify(diff_node)
     end
   end
 
+  # FIFTH: Apply normative determination
   diff_node.normative = is_normative
   diff_node
 end
|
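The ordering in `classify` (serialization check first, then content normalization, then profile policy) can be tried out with a toy model. Everything in this sketch is hypothetical: `ToyDiffNode`, `ToyClassifier` and the `serialization_only` flag stand in for Canon's `DiffNode`, `DiffClassifier` and `XmlSerializationFormatter`, the behaviors hash replaces `CompareProfile`, and the five documented steps are condensed into three. Unlike the real method, the toy applies `:normalize` to any dimension rather than only `:text_content`.

```ruby
# Toy diff node; serialization_only marks a <tag/> vs <tag></tag> style difference.
ToyDiffNode = Struct.new(:dimension, :text1, :text2, :serialization_only,
                         :formatting, :normative, keyword_init: true)

class ToyClassifier
  # behaviors maps a dimension to :strict, :normalize or :ignore.
  def initialize(behaviors)
    @behaviors = behaviors
  end

  def classify(node)
    # FIRST: serialization-level formatting short-circuits everything else.
    if node.serialization_only
      node.formatting = true
      node.normative = false
      return node
    end

    behavior = @behaviors.fetch(node.dimension, :strict)

    # SECOND: under :normalize, compare whitespace-normalized content.
    if behavior == :normalize
      same = squeeze(node.text1) == squeeze(node.text2)
      node.formatting = same
      node.normative = !same
      return node
    end

    # THIRD: plain policy; :ignore is informative, anything else is normative.
    node.formatting = false
    node.normative = behavior != :ignore
    node
  end

  private

  def squeeze(text)
    text.to_s.gsub(/\s+/, " ").strip
  end
end
```

For example, `ToyClassifier.new(attribute_order: :ignore)` leaves an `:attribute_order` node with both flags false (informative), while a dimension without an entry falls back to `:strict` and comes out normative.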
data/lib/canon/cli.rb
CHANGED
|
@@ -126,6 +126,24 @@ module Canon
 
       # Disable color output
       $ canon diff file1.xml file2.xml --no-color
+
+      # Show raw file contents (for copying to specs)
+      $ canon diff file1.xml file2.xml --show-raw-inputs
+
+      # Show preprocessed contents (what was actually compared)
+      $ canon diff file1.xml file2.xml --show-preprocessed-inputs
+
+      # Show both raw and preprocessed (full trace)
+      $ canon diff file1.xml file2.xml --show-raw-inputs --show-preprocessed-inputs
+
+      # Preprocess with normalization and show what was compared
+      $ canon diff file1.xml file2.xml --preprocessing normalize --show-preprocessed-inputs
+
+      # Show raw inputs with line numbers (RSpec-style)
+      $ canon diff file1.xml file2.xml --show-line-numbered-inputs
+
+      # Verbose mode (shows all three input displays)
+      $ canon diff file1.xml file2.xml --verbose
     DESC
     method_option :format,
                   aliases: "-f",
|
@@ -213,6 +231,18 @@ module Canon
     method_option :diff_grouping_lines,
                   type: :numeric,
                   desc: "Group diffs within N lines into context blocks (default: no grouping)"
+    method_option :show_raw_inputs,
+                  type: :boolean,
+                  default: false,
+                  desc: "Show raw/original file contents before diff"
+    method_option :show_preprocessed_inputs,
+                  type: :boolean,
+                  default: false,
+                  desc: "Show preprocessed contents (what was actually compared)"
+    method_option :show_line_numbered_inputs,
+                  type: :boolean,
+                  default: false,
+                  desc: "Show raw inputs with line numbers (RSpec-style)"
     def diff(file1, file2)
       Commands::DiffCommand.new(options).run(file1, file2)
     end
|
data/lib/canon/commands/diff_command.rb
CHANGED
|
@@ -53,6 +53,9 @@ module Canon
           context_lines: @options.fetch(:context_lines, 3),
           diff_grouping_lines: @options[:diff_grouping_lines],
           show_diffs: @options[:show_diffs]&.to_sym || :all,
+          show_raw_inputs: @options[:show_raw_inputs] || false,
+          show_preprocessed_inputs: @options[:show_preprocessed_inputs] || false,
+          show_line_numbered_inputs: @options[:show_line_numbered_inputs] || false,
         )
 
         # Show configuration in verbose mode using shared DebugOutput
|
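The three new flags default to `false`, and the CLI help text says verbose mode shows all three input displays. One way to picture how those two statements combine is the sketch below. It is an assumption about the resulting behaviour, not Canon's implementation: `resolve_input_displays` and `INPUT_DISPLAY_FLAGS` are hypothetical names, and where the verbose override actually happens is not visible in this diff.

```ruby
# The three display flags introduced in this release.
INPUT_DISPLAY_FLAGS = %i[
  show_raw_inputs
  show_preprocessed_inputs
  show_line_numbered_inputs
].freeze

# Hypothetical helper: a missing flag falls back to false, as in the
# `@options[:flag] || false` pattern above; verbose is assumed to enable all three.
def resolve_input_displays(options)
  INPUT_DISPLAY_FLAGS.to_h do |flag|
    [flag, options[:verbose] ? true : (options[flag] || false)]
  end
end

resolve_input_displays(show_raw_inputs: true)
# => { show_raw_inputs: true, show_preprocessed_inputs: false, show_line_numbered_inputs: false }
```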
data/lib/canon/comparison/markup_comparator.rb
CHANGED
|
@@ -239,9 +239,116 @@ module Canon
       # @param diff2 [Symbol] Difference type for node2
       # @param dimension [Symbol] The dimension of the difference
       # @return [String] Human-readable reason
-      def build_difference_reason(
+      def build_difference_reason(node1, node2, diff1, diff2, dimension)
+        # For attribute presence differences, show what attributes differ
+        if dimension == :attribute_presence
+          attrs1 = extract_attributes(node1)
+          attrs2 = extract_attributes(node2)
+          return build_attribute_difference_reason(attrs1, attrs2)
+        end
+
+        # For text content differences, show the actual text (truncated if needed)
+        if dimension == :text_content
+          text1 = extract_text_content_from_node(node1)
+          text2 = extract_text_content_from_node(node2)
+          return build_text_difference_reason(text1, text2)
+        end
+
         # Default reason - can be overridden in subclasses
-        "
+        "#{diff1} vs #{diff2}"
+      end
+
+      # Build a clear reason message for attribute presence differences
+      # Shows which attributes are only in node1, only in node2, or different values
+      #
+      # @param attrs1 [Hash, nil] First node's attributes
+      # @param attrs2 [Hash, nil] Second node's attributes
+      # @return [String] Clear explanation of the attribute difference
+      def build_attribute_difference_reason(attrs1, attrs2)
+        return "#{attrs1&.keys&.size || 0} vs #{attrs2&.keys&.size || 0} attributes" unless attrs1 && attrs2
+
+        require "set"
+        keys1 = attrs1.keys.to_set
+        keys2 = attrs2.keys.to_set
+
+        only_in_1 = keys1 - keys2
+        only_in_2 = keys2 - keys1
+        common = keys1 & keys2
+
+        # Check if values differ for common keys
+        different_values = common.reject { |k| attrs1[k] == attrs2[k] }
+
+        parts = []
+        parts << "only in first: #{only_in_1.to_a.sort.join(', ')}" if only_in_1.any?
+        parts << "only in second: #{only_in_2.to_a.sort.join(', ')}" if only_in_2.any?
+        parts << "different values: #{different_values.sort.join(', ')}" if different_values.any?
+
+        if parts.empty?
+          "#{keys1.size} vs #{keys2.size} attributes (same names)"
+        else
+          parts.join("; ")
+        end
+      end
+
+      # Extract text content from a node for diff reason
+      #
+      # @param node [Object, nil] Node to extract text from
+      # @return [String, nil] Text content or nil
+      def extract_text_content_from_node(node)
+        return nil if node.nil?
+
+        # For Canon::Xml::Nodes::TextNode
+        return node.value if node.respond_to?(:value) && node.is_a?(Canon::Xml::Nodes::TextNode)
+
+        # For XML/HTML nodes with text_content method
+        return node.text_content if node.respond_to?(:text_content)
+
+        # For nodes with text method
+        return node.text if node.respond_to?(:text)
+
+        # For nodes with content method (Moxml::Text)
+        return node.content if node.respond_to?(:content)
+
+        # For nodes with value method (other types)
+        return node.value if node.respond_to?(:value)
+
+        # For simple text nodes or strings
+        return node.to_s if node.is_a?(String)
+
+        # For other node types, try to_s
+        node.to_s
+      rescue StandardError
+        nil
+      end
+
+      # Build a clear reason message for text content differences
+      # Shows the actual text content (truncated if too long)
+      #
+      # @param text1 [String, nil] First text content
+      # @param text2 [String, nil] Second text content
+      # @return [String] Clear explanation of the text difference
+      def build_text_difference_reason(text1, text2)
+        # Handle nil cases
+        return "missing vs '#{truncate_text(text2)}'" if text1.nil? && text2
+        return "'#{truncate_text(text1)}' vs missing" if text1 && text2.nil?
+        return "both missing" if text1.nil? && text2.nil?
+
+        # Both have content - show truncated versions
+        "'#{truncate_text(text1)}' vs '#{truncate_text(text2)}'"
+      end
+
+      # Truncate text for display in reason messages
+      #
+      # @param text [String] Text to truncate
+      # @param max_length [Integer] Maximum length
+      # @return [String] Truncated text
+      def truncate_text(text, max_length = 40)
+        return "" if text.nil?
+
+        text = text.to_s
+        return text if text.length <= max_length
+
+        "#{text[0...max_length]}..."
       end
 
       # Serialize an element node to string
|
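The attribute reason builder added above is easiest to follow with concrete inputs. The sketch below condenses the same logic into a free-standing function so it can be run outside Canon; `attribute_difference_reason` is a hypothetical name, and plain hashes stand in for whatever `extract_attributes` returns.

```ruby
require "set"

# Condensed, standalone version of the attribute reason logic shown in the hunk.
def attribute_difference_reason(attrs1, attrs2)
  return "#{attrs1&.size || 0} vs #{attrs2&.size || 0} attributes" unless attrs1 && attrs2

  keys1 = attrs1.keys.to_set
  keys2 = attrs2.keys.to_set
  only_in_first = (keys1 - keys2).sort
  only_in_second = (keys2 - keys1).sort
  changed = (keys1 & keys2).reject { |key| attrs1[key] == attrs2[key] }.sort

  parts = []
  parts << "only in first: #{only_in_first.join(', ')}" unless only_in_first.empty?
  parts << "only in second: #{only_in_second.join(', ')}" unless only_in_second.empty?
  parts << "different values: #{changed.join(', ')}" unless changed.empty?

  parts.empty? ? "#{keys1.size} vs #{keys2.size} attributes (same names)" : parts.join("; ")
end

attribute_difference_reason({ "id" => "a", "class" => "x" }, { "id" => "b", "lang" => "en" })
# => "only in first: class; only in second: lang; different values: id"
```

When either side is `nil`, the function falls back to a plain count such as `"0 vs 1 attributes"`.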
data/lib/canon/comparison/xml_comparator/diff_node_builder.rb
CHANGED
|
@@ -1,5 +1,6 @@
 # frozen_string_literal: true
 
+require "set"
 require_relative "../../diff/diff_node"
 require_relative "../../diff/path_builder"
 require_relative "../../diff/node_serializer"
|
@@ -62,6 +63,21 @@ module Canon
           end
         end
 
+        # For attribute presence differences, show what attributes differ
+        if dimension == :attribute_presence
+          attrs1 = extract_attributes(node1)
+          attrs2 = extract_attributes(node2)
+          return build_attribute_difference_reason(attrs1, attrs2)
+        end
+
+        # For text content differences, show the actual text (truncated if needed)
+        if dimension == :text_content
+          text1 = extract_text_content(node1)
+          text2 = extract_text_content(node2)
+          return build_text_difference_reason(text1, text2)
+        end
+
+        # Default reason
         "#{diff1} vs #{diff2}"
       end
 
|
@@ -110,6 +126,98 @@ module Canon
 
         Canon::Diff::NodeSerializer.extract_attributes(node)
       end
+
+      # Build a clear reason message for attribute presence differences
+      # Shows which attributes are only in node1, only in node2, or different values
+      #
+      # @param attrs1 [Hash, nil] First node's attributes
+      # @param attrs2 [Hash, nil] Second node's attributes
+      # @return [String] Clear explanation of the attribute difference
+      def self.build_attribute_difference_reason(attrs1, attrs2)
+        return "#{attrs1&.keys&.size || 0} vs #{attrs2&.keys&.size || 0} attributes" unless attrs1 && attrs2
+
+        keys1 = attrs1.keys.to_set
+        keys2 = attrs2.keys.to_set
+
+        only_in_1 = keys1 - keys2
+        only_in_2 = keys2 - keys1
+        common = keys1 & keys2
+
+        # Check if values differ for common keys
+        different_values = common.reject { |k| attrs1[k] == attrs2[k] }
+
+        parts = []
+        parts << "only in first: #{only_in_1.to_a.sort.join(', ')}" if only_in_1.any?
+        parts << "only in second: #{only_in_2.to_a.sort.join(', ')}" if only_in_2.any?
+        parts << "different values: #{different_values.sort.join(', ')}" if different_values.any?
+
+        if parts.empty?
+          "#{keys1.size} vs #{keys2.size} attributes (same names)"
+        else
+          parts.join("; ")
+        end
+      end
+
+      # Extract text content from a node
+      #
+      # @param node [Object, nil] Node to extract text from
+      # @return [String, nil] Text content or nil
+      def self.extract_text_content(node)
+        return nil if node.nil?
+
+        # For Canon::Xml::Nodes::TextNode
+        return node.value if node.respond_to?(:value) && node.is_a?(Canon::Xml::Nodes::TextNode)
+
+        # For XML/HTML nodes with text_content method
+        return node.text_content if node.respond_to?(:text_content)
+
+        # For nodes with text method
+        return node.text if node.respond_to?(:text)
+
+        # For nodes with content method (Moxml::Text)
+        return node.content if node.respond_to?(:content)
+
+        # For nodes with value method (other types)
+        return node.value if node.respond_to?(:value)
+
+        # For simple text nodes or strings
+        return node.to_s if node.is_a?(String)
+
+        # For other node types, try to_s
+        node.to_s
+      rescue StandardError
+        nil
+      end
+
+      # Build a clear reason message for text content differences
+      # Shows the actual text content (truncated if too long)
+      #
+      # @param text1 [String, nil] First text content
+      # @param text2 [String, nil] Second text content
+      # @return [String] Clear explanation of the text difference
+      def self.build_text_difference_reason(text1, text2)
+        # Handle nil cases
+        return "missing vs '#{truncate(text2)}'" if text1.nil? && text2
+        return "'#{truncate(text1)}' vs missing" if text1 && text2.nil?
+        return "both missing" if text1.nil? && text2.nil?
+
+        # Both have content - show truncated versions
+        "'#{truncate(text1)}' vs '#{truncate(text2)}'"
+      end
+
+      # Truncate text for display in reason messages
+      #
+      # @param text [String] Text to truncate
+      # @param max_length [Integer] Maximum length
+      # @return [String] Truncated text
+      def self.truncate(text, max_length = 40)
+        return "" if text.nil?
+
+        text = text.to_s
+        return text if text.length <= max_length
+
+        "#{text[0...max_length]}..."
+      end
     end
   end
 end
|
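The text reason and truncation helpers can likewise be exercised on their own. This is a standalone sketch with hypothetical top-level names (`truncate_for_reason`, `text_difference_reason`) that mirrors the logic in the hunk above; it is meant only to show the resulting message shapes.

```ruby
# Cut text to max_length characters and append "..." when it was longer.
def truncate_for_reason(text, max_length = 40)
  text = text.to_s
  text.length <= max_length ? text : "#{text[0...max_length]}..."
end

# Build the reason string for a text difference, handling missing sides first.
def text_difference_reason(text1, text2)
  return "both missing" if text1.nil? && text2.nil?
  return "missing vs '#{truncate_for_reason(text2)}'" if text1.nil?
  return "'#{truncate_for_reason(text1)}' vs missing" if text2.nil?

  "'#{truncate_for_reason(text1)}' vs '#{truncate_for_reason(text2)}'"
end

text_difference_reason("Hello", "World") # => "'Hello' vs 'World'"
text_difference_reason(nil, "World")     # => "missing vs 'World'"
```

The ellipsis is appended after the cut, so a truncated value is up to `max_length + 3` characters long.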