canon 0.1.8 → 0.1.10
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.rubocop_todo.yml +83 -22
- data/docs/Gemfile +1 -0
- data/docs/_config.yml +90 -1
- data/docs/advanced/diff-classification.adoc +196 -24
- data/docs/features/match-options/index.adoc +239 -1
- data/lib/canon/comparison/format_detector.rb +2 -1
- data/lib/canon/comparison/html_comparator.rb +19 -8
- data/lib/canon/comparison/html_compare_profile.rb +8 -2
- data/lib/canon/comparison/markup_comparator.rb +109 -2
- data/lib/canon/comparison/match_options/base_resolver.rb +7 -0
- data/lib/canon/comparison/whitespace_sensitivity.rb +208 -0
- data/lib/canon/comparison/xml_comparator/child_comparison.rb +15 -7
- data/lib/canon/comparison/xml_comparator/diff_node_builder.rb +108 -0
- data/lib/canon/comparison/xml_comparator/node_parser.rb +10 -5
- data/lib/canon/comparison/xml_comparator/node_type_comparator.rb +14 -7
- data/lib/canon/comparison/xml_comparator.rb +240 -23
- data/lib/canon/comparison/xml_node_comparison.rb +25 -3
- data/lib/canon/diff/diff_classifier.rb +119 -5
- data/lib/canon/diff/formatting_detector.rb +1 -1
- data/lib/canon/diff/xml_serialization_formatter.rb +153 -0
- data/lib/canon/rspec_matchers.rb +37 -8
- data/lib/canon/version.rb +1 -1
- data/lib/canon/xml/data_model.rb +24 -13
- metadata +4 -78
- data/docs/plans/2025-01-17-html-parser-selection-fix.adoc +0 -250
- data/false_positive_analysis.txt +0 -0
- data/file1.html +0 -1
- data/file2.html +0 -1
- data/old-docs/ADVANCED_TOPICS.adoc +0 -20
- data/old-docs/BASIC_USAGE.adoc +0 -16
- data/old-docs/CHARACTER_VISUALIZATION.adoc +0 -567
- data/old-docs/CLI.adoc +0 -497
- data/old-docs/CUSTOMIZING_BEHAVIOR.adoc +0 -19
- data/old-docs/DIFF_ARCHITECTURE.adoc +0 -435
- data/old-docs/DIFF_FORMATTING.adoc +0 -540
- data/old-docs/DIFF_PARAMETERS.adoc +0 -261
- data/old-docs/DOM_DIFF.adoc +0 -1017
- data/old-docs/ENV_CONFIG.adoc +0 -876
- data/old-docs/FORMATS.adoc +0 -867
- data/old-docs/INPUT_VALIDATION.adoc +0 -477
- data/old-docs/MATCHER_BEHAVIOR.adoc +0 -90
- data/old-docs/MATCH_ARCHITECTURE.adoc +0 -463
- data/old-docs/MATCH_OPTIONS.adoc +0 -912
- data/old-docs/MODES.adoc +0 -432
- data/old-docs/NORMATIVE_INFORMATIVE_DIFFS.adoc +0 -219
- data/old-docs/OPTIONS.adoc +0 -1387
- data/old-docs/PREPROCESSING.adoc +0 -491
- data/old-docs/README.old.adoc +0 -2831
- data/old-docs/RSPEC.adoc +0 -814
- data/old-docs/RUBY_API.adoc +0 -485
- data/old-docs/SEMANTIC_DIFF_REPORT.adoc +0 -646
- data/old-docs/SEMANTIC_TREE_DIFF.adoc +0 -765
- data/old-docs/STRING_COMPARE.adoc +0 -345
- data/old-docs/TMP.adoc +0 -3384
- data/old-docs/TREE_DIFF.adoc +0 -1080
- data/old-docs/UNDERSTANDING_CANON.adoc +0 -17
- data/old-docs/VERBOSE.adoc +0 -482
- data/old-docs/VISUALIZATION_MAP.adoc +0 -625
- data/old-docs/WHITESPACE_TREATMENT.adoc +0 -1155
- data/scripts/analyze_current_state.rb +0 -85
- data/scripts/analyze_false_positives.rb +0 -114
- data/scripts/analyze_remaining_failures.rb +0 -105
- data/scripts/compare_current_failures.rb +0 -95
- data/scripts/compare_dom_tree_diff.rb +0 -158
- data/scripts/compare_failures.rb +0 -151
- data/scripts/debug_attribute_extraction.rb +0 -66
- data/scripts/debug_blocks_839.rb +0 -115
- data/scripts/debug_meta_matching.rb +0 -52
- data/scripts/debug_p_matching.rb +0 -192
- data/scripts/debug_signature_matching.rb +0 -118
- data/scripts/debug_sourcecode_124.rb +0 -32
- data/scripts/debug_whitespace_sensitive.rb +0 -192
- data/scripts/extract_false_positives.rb +0 -138
- data/scripts/find_actual_false_positives.rb +0 -125
- data/scripts/investigate_all_false_positives.rb +0 -161
- data/scripts/investigate_batch1.rb +0 -127
- data/scripts/investigate_classification.rb +0 -150
- data/scripts/investigate_classification_detailed.rb +0 -190
- data/scripts/investigate_common_failures.rb +0 -342
- data/scripts/investigate_false_negative.rb +0 -80
- data/scripts/investigate_false_positive.rb +0 -83
- data/scripts/investigate_false_positives.rb +0 -227
- data/scripts/investigate_false_positives_batch.rb +0 -163
- data/scripts/investigate_mixed_content.rb +0 -125
- data/scripts/investigate_remaining_16.rb +0 -214
- data/scripts/run_single_test.rb +0 -29
- data/scripts/test_all_false_positives.rb +0 -95
- data/scripts/test_attribute_details.rb +0 -61
- data/scripts/test_both_algorithms.rb +0 -49
- data/scripts/test_both_simple.rb +0 -49
- data/scripts/test_enhanced_semantic_output.rb +0 -125
- data/scripts/test_readme_examples.rb +0 -131
- data/scripts/test_semantic_tree_diff.rb +0 -99
- data/scripts/test_semantic_ux_improvements.rb +0 -135
- data/scripts/test_single_false_positive.rb +0 -119
- data/scripts/test_size_limits.rb +0 -99
- data/test_html_1.html +0 -21
- data/test_html_2.html +0 -21
- data/test_nokogiri.rb +0 -33
- data/test_normalize.rb +0 -45
@@ -1,17 +0,0 @@
----
-layout: default
-title: Understanding Canon
-nav_order: 3
-has_children: true
----
-= Understanding Canon
-
-Learn how Canon works internally:
-
-* **link:FORMATS[Format support]** - XML, HTML, JSON, YAML
-canonicalization
-* **link:MODES[Diff modes]** - By-line vs by-object comparison modes
-* **link:MATCH_ARCHITECTURE[Match architecture]** - Three-phase
-comparison flow
-
-These documents explain Canon's core concepts and architecture.
data/old-docs/VERBOSE.adoc
DELETED
@@ -1,482 +0,0 @@
----
-layout: default
-title: Verbose Mode
-nav_order: 40
-parent: Advanced Topics
----
-= Canon verbose mode guide
-:toc:
-:toclevels: 3
-
-== General
-
-Canon provides a two-tier verbose output architecture for debugging
-comparison failures:
-
-* **Semantic Diff Report**: Always shown in verbose mode - provides
-actionable, dimension-specific details for each difference
-* **CANON VERBOSE tables**: Extra detailed option tables shown only when
-`CANON_VERBOSE=1` environment variable is set
-
-This progressive disclosure ensures developers get useful information by
-default, with additional debugging details available when needed.
-
-== Architecture
-
-The output architecture follows a clear three-tier structure:
-
-[source]
-----
-╔═════════════════════════════════════════════════════════════════════╗
-║ CANON VERBOSE MODE OUTPUT ARCHITECTURE ║
-╚═════════════════════════════════════════════════════════════════════╝
-
-When verbose: true is used:
-
-┌─────────────────────────────────────────────────────────────────────┐
-│ TIER 1: CANON VERBOSE Tables (ONLY if CANON_VERBOSE=1) │
-│ │
-│ ┌────────────────────────────────────────────────────────────────┐ │
-│ │ Match Options Table │ │
-│ │ • Shows preprocessing behavior │ │
-│ │ • Shows dimension behaviors (strict/normalize/ignore) │ │
-│ │ • Explains what each setting means │ │
-│ └────────────────────────────────────────────────────────────────┘ │
-│ ┌────────────────────────────────────────────────────────────────┐ │
-│ │ Formatter Options Table │ │
-│ │ • Shows mode (by_line vs by_object) │ │
-│ │ • Shows context_lines, diff_grouping_lines │ │
-│ │ • Shows show_diffs filter setting │ │
-│ └────────────────────────────────────────────────────────────────┘ │
-│ ┌────────────────────────────────────────────────────────────────┐ │
-│ │ Comparison Result Summary │ │
-│ │ • Equivalent? (YES/NO) │ │
-│ │ • Normative/Informative/Total diff counts │ │
-│ └────────────────────────────────────────────────────────────────┘ │
-└─────────────────────────────────────────────────────────────────────┘
-│
-▼
-┌─────────────────────────────────────────────────────────────────────┐
-│ TIER 2: Semantic Diff Report (ALWAYS if diffs exist) │
-│ │
-│ For each difference: │
-│ • XPath location (e.g., /html/body/div/table/pre/text) │
-│ • Dimension classification (attribute_presence, text_content) │
-│ • Specific changes (Added: +xmlns:v, +xmlns:o) │
-│ • Normative/Informative status │
-│ • Dimension-specific formatting │
-└─────────────────────────────────────────────────────────────────────┘
-│
-▼
-┌─────────────────────────────────────────────────────────────────────┐
-│ TIER 3: Detailed Diff (ALWAYS) │
-│ │
-│ Either: │
-│ • Line-by-line diff (for HTML, or with --by-line flag) │
-│ • Object tree diff (for XML/JSON/YAML by default) │
-└─────────────────────────────────────────────────────────────────────┘
-----
-
-=== Output flow
-
-The `DiffFormatter.format_comparison_result()` method orchestrates the
-output:
-
-. Check if `CANON_VERBOSE=1` → Render option tables
-. Check if differences exist → Render Semantic Diff Report
-. Always render detailed diff (by-line or by-object)
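The three steps listed above could be sketched in Ruby roughly as follows. This is only an illustration of the ordering, not Canon's code: `verbose_output` and the three `render_*` helpers are hypothetical names that return placeholder strings, and the `env:` parameter is an assumption added so the sketch can be tried without touching the real environment.

```ruby
# Hypothetical stand-ins for the three output tiers; none of these
# names come from Canon itself.
def render_option_tables
  "[option tables]"
end

def render_semantic_report(differences)
  "[semantic diff report: #{differences.size} difference(s)]"
end

def render_detailed_diff(differences)
  "[detailed diff: #{differences.size} change(s)]"
end

# Assemble the sections in the order listed above.
def verbose_output(differences, env: ENV)
  sections = []
  # Step 1: option tables only when CANON_VERBOSE=1
  sections << render_option_tables if env["CANON_VERBOSE"] == "1"
  # Step 2: semantic report only when differences exist
  sections << render_semantic_report(differences) unless differences.empty?
  # Step 3: the detailed diff is always rendered
  sections << render_detailed_diff(differences)
  sections.join("\n")
end

puts verbose_output(["attribute_presence at /html"],
                    env: { "CANON_VERBOSE" => "1" })
```

Passing a plain hash for `env:` keeps the example deterministic; by default the sketch falls back to the process environment.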
-
-== Semantic diff report
-
-=== General
-
-The Semantic Diff Report is the core verbose output, always shown when
-differences exist. It provides dimension-specific, actionable details for
-each difference.
-
-Unlike the detailed diff (which shows every changed line), the Semantic
-Diff Report shows a high-level summary of WHAT changed and WHY it matters.
-
-=== Output format
-
-[example]
-====
-[source]
-----
-======================================================================
-SEMANTIC DIFF REPORT (1 difference)
-======================================================================
-
-🔍 DIFFERENCE #1/1 [NORMATIVE]
-──────────────────────────────────────────────────────────────────────
-Dimension: attribute_presence
-Location: /html
-
-⊖ Expected (File 1):
-<html> with 2 attributes: lang, xmlns:epub
-
-⊕ Actual (File 2):
-<html> with 6 attributes: lang, xmlns:epub, xmlns:m, xmlns:o,
-xmlns:v, xmlns:w
-
-✨ Changes:
-Added: +xmlns:m, +xmlns:o, +xmlns:v, +xmlns:w
-
-======================================================================
-----
-====
-
-=== Format structure
-
-Each difference displays:
-
-* **Status indicator**: `[NORMATIVE]` (green) or `[INFORMATIVE]` (yellow)
-* **Dimension**: Which aspect differs (colorized in magenta)
-* **Location**: XPath for XML/HTML, path for JSON/YAML (colorized in blue)
-* **Expected section**: What was in File 1 (red heading, bold)
-* **Actual section**: What was in File 2 (green heading, bold)
-* **Changes summary**: Actionable description of the difference (yellow,
-bold)
-
-=== Dimension-specific formats
-
-==== Attribute presence differences
-
-For missing or extra attributes:
-
-[example]
-====
-[source]
-----
-Dimension: attribute_presence
-Location: /html/body/p
-
-⊖ Expected: <p> with 2 attributes: id, lang
-⊕ Actual: <p> with 4 attributes: id, lang, data-value, aria-label
-
-✨ Changes: Added: +data-value, +aria-label
-----
-====
-
-Shows:
-
-* Element name (`<p>`)
-* How many attributes each has
-* Which attributes were added (green with `+` prefix) or removed (red with
-`-` prefix)
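As a rough illustration of how such a changes line could be derived, the sketch below compares two lists of attribute names with array difference. `attribute_presence_changes` is a hypothetical helper, not a Canon method, and the `Removed:` label is an assumption, since the deleted document only shows an `Added:` example.

```ruby
# Hypothetical helper, not part of Canon: summarize which attribute
# names were added or removed between two elements.
def attribute_presence_changes(expected_names, actual_names)
  added = actual_names - expected_names    # present only in the actual element
  removed = expected_names - actual_names  # present only in the expected element

  parts = []
  parts << "Added: #{added.map { |name| "+#{name}" }.join(', ')}" unless added.empty?
  # The "Removed:" label is assumed; the source shows only "Added:".
  parts << "Removed: #{removed.map { |name| "-#{name}" }.join(', ')}" unless removed.empty?
  parts.join("; ")
end

puts attribute_presence_changes(%w[id lang], %w[id lang data-value aria-label])
```

With the attribute lists from the example above, this prints `Added: +data-value, +aria-label`.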
-
-==== Attribute value differences
-
-For differing attribute values:
-
-[example]
-====
-[source]
-----
-Dimension: attribute_values
-Location: /html/body/div
-
-⊖ Expected: <div> class=" container fluid "
-⊕ Actual: <div> class="container fluid"
-
-✨ Changes: Whitespace normalization difference
-----
-====
-
-Shows:
-
-* Which specific attribute differs (highlighted in cyan)
-* Exact values on both sides
-* Analysis: "Whitespace difference only", "Whitespace normalization
-difference", or "Value changed"
-
-==== Text content differences
-
-For text that differs:
-
-[example]
-====
-[source]
-----
-Dimension: text_content
-Location: /html/body/div/table/tbody/tr/td/pre/text
-
-⊖ Expected: <text> "
-puts \"Hello, world.\"
-"
-⊕ Actual: <text> "puts \"Hello, world.\" "
-
-✨ Changes: ⚠️ Whitespace preserved (inside <pre>, <code>, etc. -
-whitespace is significant)
-----
-====
-
-Shows:
-
-* Text preview (truncated at 100 characters)
-* Special warning if inside `<pre>`, `<code>`, `<textarea>`, `<script>`,
-or `<style>` elements (where whitespace is significant)
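A minimal sketch of the two behaviours described above, assuming the preview is cut at 100 characters and that the warning applies when any ancestor element is in the listed set. The names `PREVIEW_LIMIT`, `PRESERVE_ELEMENTS`, `text_preview` and `whitespace_significant?` are hypothetical, and the `...` suffix on truncated text is an assumption; the deleted document does not say how truncation is marked.

```ruby
# Values taken from the description above; the constant and method
# names are hypothetical, not Canon's.
PREVIEW_LIMIT = 100
PRESERVE_ELEMENTS = %w[pre code textarea script style].freeze

# Return the text unchanged if short enough, otherwise the first
# `limit` characters followed by an (assumed) "..." marker.
def text_preview(text, limit: PREVIEW_LIMIT)
  text.length > limit ? "#{text[0, limit]}..." : text
end

# `ancestors` is a list of element names from the root down to the
# parent of the text node.
def whitespace_significant?(ancestors)
  ancestors.any? { |name| PRESERVE_ELEMENTS.include?(name.downcase) }
end

path = %w[html body div table tbody tr td pre]
puts whitespace_significant?(path)
puts text_preview("puts \"Hello, world.\"")
```

For the `/html/body/div/table/tbody/tr/td/pre/text` location used in the example, the check returns `true` because `pre` is among the ancestors.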
-
-==== Structural whitespace differences
-
-For whitespace-only differences (usually informative):
-
-[example]
-====
-[source]
-----
-Dimension: structural_whitespace
-Location: /root/p
-
-⊖ Expected: <p> "hello␣␣world"
-⊕ Actual: <p> "hello␣world"
-
-✨ Changes: Whitespace-only difference (informative)
-----
-====
-
-Shows:
-
-* Whitespace visualized: `␣` for space, `→` for tab, `↵` for newline
-* Marked as `[INFORMATIVE]` (yellow)
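The symbol mapping above can be illustrated with a small Ruby sketch. `WHITESPACE_SYMBOLS` and `visualize_whitespace` are hypothetical names; only the three symbols themselves come from the deleted document.

```ruby
# Symbols as listed above; the constant and method names are
# hypothetical, not Canon's.
WHITESPACE_SYMBOLS = { " " => "␣", "\t" => "→", "\n" => "↵" }.freeze

# Replace each space, tab and newline with its visible symbol.
# String#gsub accepts a hash and looks up each match as a key.
def visualize_whitespace(text)
  text.gsub(/[ \t\n]/, WHITESPACE_SYMBOLS)
end

puts visualize_whitespace("hello  world")
```

For the expected value in the example above (two spaces between the words), this prints `hello␣␣world`.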
-
-==== JSON/YAML differences
-
-For JSON/YAML path-based differences:
-
-[example]
-====
-[source]
-----
-Dimension: 15
-Location: user.email
-
-⊖ Expected: user.email = "alice@example.com"
-⊕ Actual: user.email = "bob@example.com"
-
-✨ Changes: Value changed
-----
-====
-
-== CANON VERBOSE mode
-
-=== General
-
-CANON VERBOSE mode adds detailed option tables BEFORE the Semantic Diff
-Report. These tables help understand:
-
-* What match options are in effect
-* How the diff formatter is configured
-* Statistics about the comparison result
-
-To enable, set the `CANON_VERBOSE` environment variable:
-
-[source,bash]
-----
-CANON_VERBOSE=1 bundle exec rspec spec/my_failing_spec.rb:123
-----
-
-=== Match options table
-
-Shows preprocessing and dimension behaviors:
-
-[example]
-====
-[source]
-----
-╭────────────────────────────────────────────────────────────────────╮
-│ Match Options (HTML) │
-├────────────────────┬───────────┬────────────────────────────────────┤
-│ Dimension │ Behavior │ Meaning │
-├────────────────────┼───────────┼────────────────────────────────────┤
-│ preprocessing │ rendered │ As browser-rendered (compacted wh… │
-│ text_content │ normalize │ Normalized then compared (normative… │
-│ structural_whit… │ ignore │ Differences IGNORED (informative) │
-│ attribute_presence │ strict │ Must match exactly (normative) │
-│ attribute_values │ normalize │ Normalized then compared (normative… │
-│ comments │ ignore │ Differences IGNORED (informative) │
-╰────────────────────┴───────────┴────────────────────────────────────╯
-----
-====
-
-Preprocessing behaviors:
-
-* `:none` - No preprocessing (compare as-is)
-* `:c14n` - Canonicalize (XML C14N normalization)
-* `:normalize` - Normalize (collapse whitespace, trim lines)
-* `:format` - Pretty-format (consistent indentation)
-* `:rendered` - As browser-rendered (compacted whitespace, to_html)
-
-Dimension behaviors:
-
-* `:ignore` - Differences IGNORED (innormative, won't fail test)
-* `:strict` - Must match exactly (normative, will fail test)
-* `:normalize` - Normalized then compared (normative if different after
-normalization)
-* `:strip` - Strip leading/trailing whitespace only
-* `:compact` - Collapse whitespace runs to single space
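To make the five dimension behaviors concrete, here is a hedged Ruby sketch of how two string values might be compared under each one. `values_match?` is a hypothetical function, not Canon's comparison code, and treating `:normalize` as "strip, then collapse whitespace runs" is an assumption; the deleted document does not spell out the exact normalization.

```ruby
# Hypothetical sketch; Canon's real comparison code is not shown in
# this diff.
def values_match?(expected, actual, behavior)
  case behavior
  when :ignore  then true                                # never a failing difference
  when :strict  then expected == actual                  # exact match required
  when :strip   then expected.strip == actual.strip      # ignore leading/trailing whitespace
  when :compact then expected.gsub(/\s+/, " ") == actual.gsub(/\s+/, " ")
  when :normalize
    # Assumed definition: strip, then collapse whitespace runs.
    expected.strip.gsub(/\s+/, " ") == actual.strip.gsub(/\s+/, " ")
  else
    raise ArgumentError, "unknown behavior: #{behavior.inspect}"
  end
end

%i[strict strip compact normalize ignore].each do |behavior|
  result = values_match?(" container  fluid ", "container fluid", behavior)
  puts "#{behavior}: #{result}"
end
```

Under these assumptions, the padded and unpadded class values match only with `:normalize` and `:ignore`: `:strip` leaves the inner double space in place and `:compact` leaves the outer spaces in place.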
-
-=== Formatter options table
-
-Shows diff formatting settings:
-
-[example]
-====
-[source]
-----
-╭────────────────────────────────────────────────────────────────────╮
-│ Formatter Options │
-├─────────────────────┬─────────┬────────────────────────────────────┤
-│ Option │ Value │ Impact │
-├─────────────────────┼─────────┼─────────────────────────────────────┤
-│ mode │ by_line │ Line-by-line diff │
-│ context_lines │ 3 │ 3 lines of context around diffs │
-│ show_diffs │ all │ Show all diffs (normative + informative) │
-╰─────────────────────┴─────────┴────────────────────────────────────╯
-----
-====
-
-=== Comparison result summary
-
-Shows diff statistics:
-
-[example]
-====
-[source]
-----
-╭─────────────────────────────────────────────────────────────────────╮
-│ Comparison Result Summary │
-├────────────────┬─────────┬──────────────────────────────────────────┤
-│ Equivalent? │ ✗ NO │ Documents have semantic differences │
-│ Normative Diffs │ 1 diffs │ Semantic differences that matter │
-│ Informative Diffs │ 0 │ Textual/formatting differences (ignored) │
-│ Total Diffs │ 1 │ All differences found │
-╰────────────────┴─────────┴──────────────────────────────────────────╯
-----
-====
-
-== Usage
-
-=== Using in RSpec matchers
-
-Verbose mode is activated by using `verbose: true` in the comparison:
-
-[source,ruby]
-----
-result = Canon::Comparison::XmlComparator.equivalent?(
-  xml1,
-  xml2,
-  verbose: true
-)
-# Returns ComparisonResult object
-# Semantic Diff Report shown if differences exist
-----
-
-With RSpec matchers, verbose mode is automatic on test failure:
-
-[source,ruby]
-----
-# Semantic Diff Report automatically shown on failure
-expect(actual_html).to be_html4_equivalent_to(expected_html)
-----
-
-To enable CANON VERBOSE tables:
-
-[source,bash]
-----
-CANON_VERBOSE=1 bundle exec rspec spec/my_spec.rb:123
-----
-
-=== Using via CLI
-
-[source,bash]
-----
-# Verbose mode (shows Semantic Diff Report)
-canon diff file1.xml file2.xml --verbose
-
-# With CANON VERBOSE tables
-CANON_VERBOSE=1 canon diff file1.xml file2.xml --verbose
-----
-
-=== Configuration
-
-You can enable CANON VERBOSE mode permanently for a project:
-
-[source,ruby]
-----
-# In spec/spec_helper.rb
-ENV['CANON_VERBOSE'] = '1' if ENV['DEBUG']
-
-# Or in your test
-before(:each) do
-  ENV['CANON_VERBOSE'] = '1'
-end
-----
-
-== Implementation
-
-=== DiffDetailFormatter module
-
-Location: `lib/canon/diff_formatter/diff_detail_formatter.rb`
-
-Responsible for:
-
-* Formatting the Semantic Diff Report
-* Dispatching to dimension-specific formatters
-* Extracting XPath/JSON paths
-* Detecting whitespace-preserving elements (`<pre>`, `<code>`, etc.)
-* Colorizing output
-
-Key methods:
-
-* `format_report(differences)` - Main entry point
-* `format_attribute_presence_details()` - Format attribute presence diffs
-* `format_attribute_values_details()` - Format attribute value diffs
-* `format_text_content_details()` - Format text content diffs
-* `extract_xpath(node)` - Extract XPath with safety limits
-* `inside_preserve_element?(node)` - Detect whitespace preservation
-
-=== DebugOutput module
-
-Location: `lib/canon/diff_formatter/debug_output.rb`
-
-Responsible for:
-
-* Rendering CANON VERBOSE option tables
-* Checking if `CANON_VERBOSE=1` is set
-* Formatting match options with descriptions
-* Formatting formatter options with impact
-* Formatting comparison summary statistics
-
-Key methods:
-
-* `verbose_tables_only()` - Returns CANON VERBOSE tables or empty string
-* `format_match_options_table()` - Render match options as table
-* `format_formatter_options_table()` - Render formatter options as table
-* `format_comparison_summary()` - Render result summary as table
-
-=== DiffFormatter integration
-
-Location: `lib/canon/diff_formatter.rb`
-
-The `format_comparison_result()` method orchestrates output:
-
-[source,ruby]
-----
-def format_comparison_result(comparison_result, expected, actual)
-  output = []
-
-  # 1. CANON VERBOSE tables (ONLY if CANON_VERBOSE=1)
-  output << DebugOutput.verbose_tables_only(...)
-
-  # 2. Semantic Diff Report (ALWAYS if diffs exist)
-  output << DiffDetailFormatter.format_report(...)
-
-  # 3. Detailed diff (ALWAYS)
-  output << format(differences, ...)
-
-  output.compact.join("\n")
-end
-----
-
-This ensures the correct output order and separation of concerns.