canon 0.1.6 → 0.1.7

Files changed (136)
  1. checksums.yaml +4 -4
  2. data/.rubocop_todo.yml +163 -67
  3. data/README.adoc +400 -7
  4. data/docs/Gemfile +9 -0
  5. data/docs/INDEX.adoc +99 -182
  6. data/docs/_config.yml +100 -0
  7. data/docs/advanced/diff-classification.adoc +547 -0
  8. data/docs/advanced/diff-pipeline.adoc +358 -0
  9. data/docs/advanced/index.adoc +214 -0
  10. data/docs/advanced/semantic-diff-report.adoc +390 -0
  11. data/docs/{VERBOSE.adoc → advanced/verbose-mode-architecture.adoc} +51 -53
  12. data/docs/features/diff-formatting/algorithm-specific-output.adoc +533 -0
  13. data/docs/{CHARACTER_VISUALIZATION.adoc → features/diff-formatting/character-visualization.adoc} +23 -62
  14. data/docs/features/diff-formatting/colors-and-symbols.adoc +606 -0
  15. data/docs/features/diff-formatting/context-and-grouping.adoc +490 -0
  16. data/docs/features/diff-formatting/display-filtering.adoc +472 -0
  17. data/docs/features/diff-formatting/index.adoc +140 -0
  18. data/docs/features/environment-configuration/index.adoc +327 -0
  19. data/docs/features/environment-configuration/override-system.adoc +436 -0
  20. data/docs/features/environment-configuration/size-limits.adoc +273 -0
  21. data/docs/features/index.adoc +173 -0
  22. data/docs/features/input-validation/index.adoc +521 -0
  23. data/docs/features/match-options/algorithm-specific-behavior.adoc +365 -0
  24. data/docs/features/match-options/html-policies.adoc +312 -0
  25. data/docs/features/match-options/index.adoc +621 -0
  26. data/docs/getting-started/index.adoc +83 -0
  27. data/docs/getting-started/quick-start.adoc +76 -0
  28. data/docs/guides/choosing-configuration.adoc +689 -0
  29. data/docs/guides/index.adoc +181 -0
  30. data/docs/{CLI.adoc → interfaces/cli/index.adoc} +18 -13
  31. data/docs/interfaces/index.adoc +101 -0
  32. data/docs/{RSPEC.adoc → interfaces/rspec/index.adoc} +242 -31
  33. data/docs/{RUBY_API.adoc → interfaces/ruby-api/index.adoc} +118 -16
  34. data/docs/lychee.toml +65 -0
  35. data/docs/reference/cli-options.adoc +418 -0
  36. data/docs/reference/environment-variables.adoc +375 -0
  37. data/docs/reference/index.adoc +204 -0
  38. data/docs/reference/options-across-interfaces.adoc +417 -0
  39. data/docs/understanding/algorithms/dom-diff.adoc +389 -0
  40. data/docs/understanding/algorithms/index.adoc +314 -0
  41. data/docs/understanding/algorithms/semantic-tree-diff.adoc +533 -0
  42. data/docs/understanding/architecture.adoc +447 -0
  43. data/docs/understanding/comparison-pipeline.adoc +317 -0
  44. data/docs/understanding/formats/html.adoc +380 -0
  45. data/docs/understanding/formats/index.adoc +261 -0
  46. data/docs/understanding/formats/json.adoc +390 -0
  47. data/docs/understanding/formats/xml.adoc +366 -0
  48. data/docs/understanding/formats/yaml.adoc +504 -0
  49. data/docs/understanding/index.adoc +130 -0
  50. data/lib/canon/cli.rb +42 -1
  51. data/lib/canon/commands/diff_command.rb +108 -23
  52. data/lib/canon/comparison/compare_profile.rb +101 -0
  53. data/lib/canon/comparison/comparison_result.rb +41 -2
  54. data/lib/canon/comparison/html_comparator.rb +292 -71
  55. data/lib/canon/comparison/html_compare_profile.rb +117 -0
  56. data/lib/canon/comparison/match_options.rb +42 -4
  57. data/lib/canon/comparison/strategies/base_match_strategy.rb +99 -0
  58. data/lib/canon/comparison/strategies/match_strategy_factory.rb +74 -0
  59. data/lib/canon/comparison/strategies/semantic_tree_match_strategy.rb +220 -0
  60. data/lib/canon/comparison/xml_comparator.rb +695 -91
  61. data/lib/canon/comparison.rb +207 -2
  62. data/lib/canon/config/env_provider.rb +71 -0
  63. data/lib/canon/config/env_schema.rb +58 -0
  64. data/lib/canon/config/override_resolver.rb +55 -0
  65. data/lib/canon/config/type_converter.rb +59 -0
  66. data/lib/canon/config.rb +158 -29
  67. data/lib/canon/data_model.rb +29 -0
  68. data/lib/canon/diff/diff_classifier.rb +74 -14
  69. data/lib/canon/diff/diff_context_builder.rb +41 -0
  70. data/lib/canon/diff/diff_line.rb +18 -2
  71. data/lib/canon/diff/diff_node.rb +18 -3
  72. data/lib/canon/diff/diff_node_mapper.rb +71 -12
  73. data/lib/canon/diff/formatting_detector.rb +53 -0
  74. data/lib/canon/diff_formatter/by_line/base_formatter.rb +60 -5
  75. data/lib/canon/diff_formatter/by_line/html_formatter.rb +68 -16
  76. data/lib/canon/diff_formatter/by_line/json_formatter.rb +0 -37
  77. data/lib/canon/diff_formatter/by_line/simple_formatter.rb +0 -42
  78. data/lib/canon/diff_formatter/by_line/xml_formatter.rb +116 -31
  79. data/lib/canon/diff_formatter/by_line/yaml_formatter.rb +0 -37
  80. data/lib/canon/diff_formatter/by_object/base_formatter.rb +126 -19
  81. data/lib/canon/diff_formatter/by_object/xml_formatter.rb +30 -1
  82. data/lib/canon/diff_formatter/debug_output.rb +7 -1
  83. data/lib/canon/diff_formatter/diff_detail_formatter.rb +674 -57
  84. data/lib/canon/diff_formatter/legend.rb +42 -0
  85. data/lib/canon/diff_formatter.rb +78 -9
  86. data/lib/canon/errors.rb +56 -0
  87. data/lib/canon/formatters/html_formatter_base.rb +35 -1
  88. data/lib/canon/formatters/json_formatter.rb +3 -0
  89. data/lib/canon/formatters/yaml_formatter.rb +3 -0
  90. data/lib/canon/html/data_model.rb +229 -0
  91. data/lib/canon/html.rb +9 -0
  92. data/lib/canon/options/cli_generator.rb +70 -0
  93. data/lib/canon/options/registry.rb +234 -0
  94. data/lib/canon/rspec_matchers.rb +34 -13
  95. data/lib/canon/tree_diff/adapters/html_adapter.rb +316 -0
  96. data/lib/canon/tree_diff/adapters/json_adapter.rb +204 -0
  97. data/lib/canon/tree_diff/adapters/xml_adapter.rb +285 -0
  98. data/lib/canon/tree_diff/adapters/yaml_adapter.rb +213 -0
  99. data/lib/canon/tree_diff/core/attribute_comparator.rb +84 -0
  100. data/lib/canon/tree_diff/core/matching.rb +241 -0
  101. data/lib/canon/tree_diff/core/node_signature.rb +164 -0
  102. data/lib/canon/tree_diff/core/node_weight.rb +135 -0
  103. data/lib/canon/tree_diff/core/tree_node.rb +450 -0
  104. data/lib/canon/tree_diff/matchers/hash_matcher.rb +258 -0
  105. data/lib/canon/tree_diff/matchers/similarity_matcher.rb +168 -0
  106. data/lib/canon/tree_diff/matchers/structural_propagator.rb +242 -0
  107. data/lib/canon/tree_diff/matchers/universal_matcher.rb +220 -0
  108. data/lib/canon/tree_diff/operation_converter.rb +631 -0
  109. data/lib/canon/tree_diff/operations/operation.rb +92 -0
  110. data/lib/canon/tree_diff/operations/operation_detector.rb +626 -0
  111. data/lib/canon/tree_diff/tree_diff_integrator.rb +140 -0
  112. data/lib/canon/tree_diff.rb +33 -0
  113. data/lib/canon/validators/json_validator.rb +3 -1
  114. data/lib/canon/validators/yaml_validator.rb +3 -1
  115. data/lib/canon/version.rb +1 -1
  116. data/lib/canon/xml/data_model.rb +22 -23
  117. data/lib/canon/xml/element_matcher.rb +128 -20
  118. data/lib/canon/xml/namespace_helper.rb +110 -0
  119. data/lib/canon.rb +3 -0
  120. metadata +81 -23
  121. data/_config.yml +0 -116
  122. data/docs/ADVANCED_TOPICS.adoc +0 -20
  123. data/docs/BASIC_USAGE.adoc +0 -16
  124. data/docs/CUSTOMIZING_BEHAVIOR.adoc +0 -19
  125. data/docs/DIFF_ARCHITECTURE.adoc +0 -435
  126. data/docs/DIFF_FORMATTING.adoc +0 -540
  127. data/docs/FORMATS.adoc +0 -447
  128. data/docs/INPUT_VALIDATION.adoc +0 -477
  129. data/docs/MATCH_ARCHITECTURE.adoc +0 -463
  130. data/docs/MATCH_OPTIONS.adoc +0 -719
  131. data/docs/MODES.adoc +0 -432
  132. data/docs/NORMATIVE_INFORMATIVE_DIFFS.adoc +0 -219
  133. data/docs/OPTIONS.adoc +0 -1387
  134. data/docs/PREPROCESSING.adoc +0 -491
  135. data/docs/SEMANTIC_DIFF_REPORT.adoc +0 -528
  136. data/docs/UNDERSTANDING_CANON.adoc +0 -17
@@ -0,0 +1,621 @@
1
+ ---
2
+ title: Match Options
3
+ parent: Features
4
+ nav_order: 3
5
+ has_children: true
6
+ ---
7
+ = Match options
8
+ :toc:
9
+ :toclevels: 3
10
+
11
+ == Purpose
12
+
13
+ This section provides a complete reference for Canon's match options, including match dimensions, behaviors, and predefined profiles.
14
+
15
+ Match options control **Layer 3 (Match Options)** of Canon's 4-layer comparison architecture. See link:../../understanding/comparison-pipeline.adoc[Comparison Pipeline] for the complete flow.
16
+
17
+ == Overview
18
+
19
+ Match options control which aspects of documents are compared and how strictly they are compared. Canon provides:
20
+
21
+ * **Match dimensions**: Independent aspects of documents (text, whitespace, attributes, etc.)
22
+ * **Dimension behaviors**: How each dimension is compared (`:strict`, `:normalize`, `:ignore`)
23
+ * **Match profiles**: Predefined combinations for common scenarios
24
+
25
+ **Important**: Match options behave differently with each algorithm. See link:algorithm-specific-behavior.adoc[Algorithm-Specific Behavior] for details.
26
+
27
+ == Child Pages
28
+
29
+ * link:dimensions.adoc[Match Dimensions] - Detailed reference for all dimensions
30
+ * link:profiles.adoc[Match Profiles] - Predefined configurations
31
+ * link:algorithm-specific-behavior.adoc[Algorithm-Specific Behavior] - How DOM and Semantic algorithms interpret options differently
32
+ * link:html-policies.adoc[HTML-Specific Policies] - HTML format-specific comparison policies
33
+
34
+ == Match dimensions overview
35
+
36
+ Match dimensions are orthogonal aspects that can be configured independently.
37
+
38
+ === text_content
39
+
40
+ **Applies to**: All formats
41
+
42
+ **Purpose**: Controls how text content within elements/values is compared.
43
+
44
+ **Behaviors**:
45
+
46
+ `:strict`:: Text must match exactly, character-for-character including all whitespace
47
+
48
+ `:normalize`:: Whitespace is normalized (collapsed/trimmed) before comparison
49
+
50
+ `:ignore`:: Text content is completely ignored in comparison
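The three `text_content` behaviors can be pictured with a small plain-Ruby sketch. It does not call Canon; `text_equal?` and `normalize_ws` are hypothetical helpers, and the exact normalization Canon applies may differ from the "collapse and trim" rule assumed here.

```ruby
# Hypothetical helpers for illustration only; not part of Canon's API.

# Assumed normalization: collapse whitespace runs to one space, then trim.
def normalize_ws(text)
  text.gsub(/\s+/, " ").strip
end

def text_equal?(left, right, behavior)
  case behavior
  when :strict    then left == right
  when :normalize then normalize_ws(left) == normalize_ws(right)
  when :ignore    then true
  else raise ArgumentError, "unknown behavior: #{behavior.inspect}"
  end
end

left  = "Hello   world\n"
right = "Hello world"

puts text_equal?(left, right, :strict)      # differs: whitespace is significant
puts text_equal?(left, right, :normalize)   # equal after collapsing whitespace
puts text_equal?(left, "Goodbye", :ignore)  # text is not considered at all
```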
51
+
52
+ === structural_whitespace
53
+
54
+ **Applies to**: All formats
55
+
56
+ **Purpose**: Controls how whitespace between elements (indentation, newlines) is handled.
57
+
58
+ **Behaviors**:
59
+
60
+ `:strict`:: All structural whitespace must match exactly
61
+
62
+ `:normalize`:: Structural whitespace is normalized
63
+
64
+ `:ignore`:: Structural whitespace is completely ignored
65
+
66
+ === attribute_whitespace
67
+
68
+ **Applies to**: XML, HTML only
69
+
70
+ **Purpose**: Controls how whitespace in attribute values is handled.
71
+
72
+ **Behaviors**:
73
+
74
+ `:strict`:: Attribute value whitespace must match exactly
75
+
76
+ `:normalize`:: Whitespace in attribute values is normalized
77
+
78
+ `:ignore`:: Whitespace in attribute values is ignored
79
+
80
+ === attribute_order
81
+
82
+ **Applies to**: XML, HTML only
83
+
84
+ **Purpose**: Controls whether attribute order matters.
85
+
86
+ **Behaviors**:
87
+
88
+ `:strict`:: Attributes must appear in the same order
89
+
90
+ `:ignore`:: Attribute order doesn't matter (set-based comparison)
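As a rough illustration of ordered versus set-based attribute comparison, the sketch below models attributes as name/value pairs. `attributes_equal?` is a hypothetical helper, not Canon's implementation.

```ruby
# Hypothetical helper for illustration only; not part of Canon's API.
# Attributes are modelled as an array of [name, value] pairs in document order.
def attributes_equal?(left, right, order:)
  case order
  when :strict then left == right            # arrays: position matters
  when :ignore then left.to_h == right.to_h  # hashes: position does not matter
  else raise ArgumentError, "unknown behavior: #{order.inspect}"
  end
end

a = [["id", "1"], ["class", "x"]]
b = [["class", "x"], ["id", "1"]]

puts attributes_equal?(a, b, order: :strict)
puts attributes_equal?(a, b, order: :ignore)
```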
91
+
92
+ === attribute_values
93
+
94
+ **Applies to**: XML, HTML only
95
+
96
+ **Purpose**: Controls how attribute values are compared.
97
+
98
+ **Behaviors**:
99
+
100
+ `:strict`:: Attribute values must match exactly
101
+
102
+ `:normalize`:: Whitespace in values is normalized
103
+
104
+ `:ignore`:: Only attribute presence is checked, values ignored
105
+
106
+ === key_order
107
+
108
+ **Applies to**: JSON, YAML only
109
+
110
+ **Purpose**: Controls whether object key order matters.
111
+
112
+ **Behaviors**:
113
+
114
+ `:strict`:: Keys must appear in the same order
115
+
116
+ `:ignore`:: Key order doesn't matter (unordered comparison)
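The difference between the two `key_order` behaviors can be sketched with Ruby's standard `json` library. `json_equal?` and `same_key_order?` are hypothetical helpers; the sketch relies on the fact that Ruby hashes preserve insertion order while `Hash#==` ignores it.

```ruby
require "json"

# Hypothetical helpers for illustration only; not part of Canon's API.

# True when every nested object lists its keys in the same order.
def same_key_order?(a, b)
  if a.is_a?(Hash) && b.is_a?(Hash)
    a.keys == b.keys && a.keys.all? { |k| same_key_order?(a[k], b[k]) }
  elsif a.is_a?(Array) && b.is_a?(Array)
    a.size == b.size && a.zip(b).all? { |x, y| same_key_order?(x, y) }
  else
    true
  end
end

def json_equal?(left, right, key_order:)
  a = JSON.parse(left)
  b = JSON.parse(right)
  return false unless a == b # Hash#== compares content, not key order

  key_order == :ignore || same_key_order?(a, b)
end

doc1 = '{"name": "canon", "meta": {"a": 1, "b": 2}}'
doc2 = '{"meta": {"b": 2, "a": 1}, "name": "canon"}'

puts json_equal?(doc1, doc2, key_order: :strict)
puts json_equal?(doc1, doc2, key_order: :ignore)
```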
117
+
118
+ === comments
119
+
120
+ **Applies to**: XML, HTML, YAML (the JSON specification has no comment syntax)
121
+
122
+ **Purpose**: Controls how comments are compared.
123
+
124
+ **Behaviors**:
125
+
126
+ `:strict`:: Comments must match exactly (including whitespace)
127
+
128
+ `:normalize`:: Whitespace in comments is normalized
129
+
130
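+ `:ignore`:: Comment differences do not affect equivalence; they are still detected but classified as informative (see "Comments dimension" below)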
131
+
132
+ === namespace_uri
133
+
134
+ **Applies to**: XML only
135
+
136
+ **Purpose**: Controls how XML element namespaces are compared. Elements are identified by the pair `{namespace_uri, local_name}` according to XML semantics.
137
+
138
+ **Behaviors**:
139
+
140
+ `:strict`:: Namespace URIs must match (default and only supported behavior)
141
+
142
+ **Note**: This dimension is always `:strict` for XML. Namespace prefixes are not significant - only the namespace URI matters. Elements with different prefixes but the same namespace URI are considered equivalent.
143
+
144
+ .Namespace URI comparison
145
+ [example]
146
+ ====
147
+ [source,ruby]
148
+ ----
149
+ # These are equivalent (same namespace URI, different prefixes)
150
+ xml1 = '<root xmlns:a="http://example.com"><a:item>value</a:item></root>'
151
+ xml2 = '<root xmlns:b="http://example.com"><b:item>value</b:item></root>'
152
+
153
+ # These are NOT equivalent (different namespace URIs)
154
+ xml3 = '<root xmlns:a="http://example.com"><a:item>value</a:item></root>'
155
+ xml4 = '<root xmlns:a="http://other.com"><a:item>value</a:item></root>'
156
+ ----
157
+ ====
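The rule illustrated above, that elements are identified by the `{namespace_uri, local_name}` pair and not by prefix, can be sketched without an XML parser. The `Element` struct below is a hypothetical stand-in for a parsed node, not a Canon class.

```ruby
# Hypothetical stand-in for a parsed element; not a Canon class.
# `namespaces` maps in-scope prefixes to namespace URIs.
Element = Struct.new(:prefix, :local_name, :namespaces) do
  # The expanded name is what XML semantics treat as the element's identity.
  def expanded_name
    [namespaces.fetch(prefix), local_name]
  end
end

a_item = Element.new("a", "item", { "a" => "http://example.com" })
b_item = Element.new("b", "item", { "b" => "http://example.com" })
other  = Element.new("a", "item", { "a" => "http://other.com" })

puts a_item.expanded_name == b_item.expanded_name # same URI, different prefix
puts a_item.expanded_name == other.expanded_name  # same prefix, different URI
```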
158
+
159
+ == Match profiles overview
160
+
161
+ Profiles are predefined combinations of dimension settings for common scenarios.
162
+
163
+ === strict
164
+
165
+ **Purpose**: Exact matching - all dimensions use `:strict` behavior.
166
+
167
+ **When to use**:
168
+
169
+ * Character-perfect matching required
170
+ * Testing exact serializer output
171
+ * Verifying formatting compliance
172
+ * Maximum strictness needed
173
+
174
+ === rendered
175
+
176
+ **Purpose**: Mimics how browsers/CSS engines render content.
177
+
178
+ **When to use**:
179
+
180
+ * Comparing HTML rendered output
181
+ * Formatting doesn't affect display
182
+ * Testing web page generation
183
+ * Browser-equivalent comparison
184
+
185
+ === spec_friendly
186
+
187
+ **Purpose**: Test-friendly comparison that ignores most formatting differences.
188
+
189
+ **When to use**:
190
+
191
+ * Writing RSpec tests
192
+ * Testing semantic correctness
193
+ * Ignoring pretty-printing differences
194
+ * Most common test scenario
195
+
196
+ === content_only
197
+
198
+ **Purpose**: Only semantic content matters - maximum tolerance for formatting.
199
+
200
+ **When to use**:
201
+
202
+ * Only care about data, not presentation
203
+ * Maximum flexibility needed
204
+ * Comparing across different formats
205
+ * Structural equivalence only
206
+
207
+ == Format defaults
208
+
209
+ Each format has sensible defaults based on typical usage:
210
+
211
+ [cols="1,1,1,1,1"]
212
+ |===
213
+ |Dimension |XML |HTML |JSON |YAML
214
+
215
+ |`text_content`
216
+ |`:strict`
217
+ |`:normalize`
218
+ |`:strict`
219
+ |`:strict`
220
+
221
+ |`structural_whitespace`
222
+ |`:strict`
223
+ |`:normalize`
224
+ |`:strict`
225
+ |`:strict`
226
+
227
+ |`attribute_whitespace`
228
+ |`:strict`
229
+ |`:normalize`
230
+ |—
231
+ |—
232
+
233
+ |`attribute_order`
234
+ |`:ignore`
235
+ |`:ignore`
236
+ |—
237
+ |—
238
+
239
+ |`attribute_values`
240
+ |`:strict`
241
+ |`:strict`
242
+ |—
243
+ |—
244
+
245
+ |`key_order`
246
+ |—
247
+ |—
248
+ |`:strict`
249
+ |`:strict`
250
+
251
+ |`comments`
252
+ |`:strict`
253
+ |`:ignore`
254
+ |—
255
+ |`:strict`
256
+
257
+ |`namespace_uri`
258
+ |`:strict`
259
+ |—
260
+ |—
261
+ |—
262
+ |===
263
+
264
+ == Configuration precedence
265
+
266
+ When options are specified in multiple places, Canon resolves them using this hierarchy (highest to lowest priority):
267
+
268
+ [source]
269
+ ----
270
+ 1. Per-comparison explicit options (highest)
271
+
272
+ 2. Per-comparison profile
273
+
274
+ 3. Global configuration explicit options
275
+
276
+ 4. Global configuration profile
277
+
278
+ 5. Format defaults (lowest)
279
+ ----
280
+
281
+ .Precedence example
282
+ [example]
283
+ ====
284
+ **Global configuration**:
285
+
286
+ [source,ruby]
287
+ ----
288
+ Canon::RSpecMatchers.configure do |config|
289
+   config.xml.match.profile = :spec_friendly
290
+   config.xml.match.options = { comments: :strict }
291
+ end
292
+ ----
293
+
294
+ The `:spec_friendly` profile sets:
295
+
296
+ * `text_content: :normalize`
297
+ * `structural_whitespace: :ignore`
298
+ * `comments: :ignore`
299
+
300
+ But the explicit `comments: :strict` overrides the profile setting.
301
+
302
+ **Per-test usage**:
303
+
304
+ [source,ruby]
305
+ ----
306
+ expect(actual).to be_xml_equivalent_to(expected)
307
+   .with_profile(:rendered)
308
+   .with_options(structural_whitespace: :ignore)
309
+ ----
310
+
311
+ **Final resolved options**:
312
+
313
+ * `text_content: :normalize` (from `:rendered` per-test profile)
314
+ * `structural_whitespace: :ignore` (from per-test explicit option)
315
+ * `comments: :strict` (from global explicit option)
316
+ * Other dimensions use `:rendered` profile or format defaults
317
+ ====
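The precedence hierarchy can be modelled as a chain of hash merges, applied from lowest to highest priority. This is only a sketch of the idea, not Canon's resolver: `resolve_match_options` is a hypothetical helper, the `:spec_friendly` values are the ones listed in the example above, and the contents assumed for the `:rendered` profile are a guess limited to the one setting the example mentions.

```ruby
# Hypothetical helper for illustration only; not part of Canon's API.
# Later layers win, so layers are passed from lowest to highest priority.
def resolve_match_options(*layers)
  layers.reduce({}) { |resolved, layer| resolved.merge(layer) }
end

format_defaults   = { text_content: :strict, structural_whitespace: :strict, comments: :strict }
global_profile    = { text_content: :normalize, structural_whitespace: :ignore, comments: :ignore } # :spec_friendly
global_explicit   = { comments: :strict }
per_test_profile  = { text_content: :normalize } # assumed contents of :rendered
per_test_explicit = { structural_whitespace: :ignore }

resolved = resolve_match_options(
  format_defaults,   # 5. lowest priority
  global_profile,    # 4.
  global_explicit,   # 3.
  per_test_profile,  # 2.
  per_test_explicit  # 1. highest priority
)

resolved.each { |dimension, behavior| puts "#{dimension}: #{behavior}" }
```

Under these assumptions the result matches the "Final resolved options" listed in the example: `text_content` normalized, `structural_whitespace` ignored, and `comments` strict.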
318
+
319
+ == Usage examples
320
+
321
+ === Ruby API
322
+
323
+ [source,ruby]
324
+ ----
325
+ # Use specific dimensions
326
+ Canon::Comparison.equivalent?(doc1, doc2,
327
+   match: {
328
+     text_content: :normalize,
329
+     structural_whitespace: :ignore,
330
+     comments: :ignore
331
+   }
332
+ )
333
+
334
+ # Use a profile
335
+ Canon::Comparison.equivalent?(doc1, doc2,
336
+   match_profile: :spec_friendly
337
+ )
338
+
339
+ # Profile with dimension overrides
340
+ Canon::Comparison.equivalent?(doc1, doc2,
341
+   match_profile: :spec_friendly,
342
+   match: {
343
+     comments: :strict # Override profile
344
+   }
345
+ )
346
+
347
+ # Use semantic dimensions
348
+ Canon::Comparison.equivalent?(doc1, doc2,
349
+   diff_algorithm: :semantic,
350
+   match: {
351
+     element_position: :ignore,
352
+     element_hierarchy: :ignore
353
+   }
354
+ )
355
+ ----
356
+
357
+ === CLI
358
+
359
+ [source,bash]
360
+ ----
361
+ # Use profile
362
+ $ canon diff file1.xml file2.xml \
363
+   --match-profile spec_friendly \
364
+   --verbose
365
+
366
+ # Override specific dimensions
367
+ $ canon diff file1.xml file2.xml \
368
+   --text-content normalize \
369
+   --structural-whitespace ignore \
370
+   --verbose
371
+
372
+ # Combine profile with overrides
373
+ $ canon diff file1.xml file2.xml \
374
+   --match-profile spec_friendly \
375
+   --comments strict \
376
+   --verbose
377
+
378
+ # Use semantic algorithm with flexible positioning
379
+ $ canon diff file1.xml file2.xml \
380
+   --diff-algorithm semantic \
381
+   --element-position ignore \
382
+   --verbose
383
+ ----
384
+
385
+ === RSpec
386
+
387
+ [source,ruby]
388
+ ----
389
+ # Global configuration
390
+ Canon::RSpecMatchers.configure do |config|
391
+   config.xml.match.profile = :spec_friendly
392
+   config.xml.match.options = {
393
+     text_content: :normalize,
394
+     comments: :ignore
395
+   }
396
+ end
397
+
398
+ # Per-test override
399
+ expect(actual).to be_xml_equivalent_to(expected)
400
+   .with_profile(:strict)
401
+
402
+ # Per-test dimension override
403
+ expect(actual).to be_xml_equivalent_to(expected)
404
+   .with_options(
405
+     structural_whitespace: :strict,
406
+     text_content: :strict
407
+   )
408
+
409
+ # Semantic algorithm with flexible hierarchy
410
+ expect(actual).to be_xml_equivalent_to(expected,
411
+   diff_algorithm: :semantic
412
+ )
413
+   .with_options(
414
+     element_position: :ignore,
415
+     element_hierarchy: :ignore
416
+   )
417
+ ----
418
+
419
+ == Comments dimension
420
+
421
+ The `comments` dimension controls how comment nodes are matched and how their differences are classified in diff output.
422
+
423
+ === Matching behaviors
424
+
425
+ `:strict`:: Comment content must match exactly. Differences are classified as **normative** (shown in red/green).
426
+ `:normalize`:: Comment text is normalized (whitespace collapsed) before matching. Differences are still classified as **normative**.
427
+ `:ignore`:: Comment content is compared but differences are classified as **informative** (shown in cyan/blue) rather than normative.
428
+
429
+ IMPORTANT: With `comments: :ignore`, comment nodes still participate in comparison and create DiffNodes. The difference is that these DiffNodes are marked as **informative** rather than **normative**. Use the `show_diffs` option to control visibility of informative diffs in the output.
430
+
431
+ === Default values
432
+
433
+ * **XML**: `comments: :strict` - Comment differences are normative
434
+ * **HTML**: `comments: :ignore` - Comment differences are informative
435
+
436
+ === Example: Comments as informative differences
437
+
438
+ .Match comments but classify differences as informative
439
+ [source,ruby]
440
+ ----
441
+ xml1 = '<root><!--comment 1--><child>text</child></root>'
442
+ xml2 = '<root><!--comment 2--><child>text</child></root>'
443
+
444
+ Canon.equivalent?(xml1, xml2,
445
+   format: :xml,
446
+   verbose: true,
447
+   match: { comments: :ignore }, # Comment diffs → informative
448
+   show_diffs: :all # Show all diffs (cyan for comments)
449
+ )
450
+ ----
451
+
452
+ In this example, the comment difference is still detected and included in the diff output, but is shown in cyan color to indicate it's an informative difference.
453
+
454
+ === Example: Hide informative differences
455
+
456
+ .Hide informative differences including comments
457
+ [source,ruby]
458
+ ----
459
+ Canon.equivalent?(xml1, xml2,
460
+   format: :xml,
461
+   verbose: true,
462
+   match: { comments: :ignore }, # Comment diffs → informative
463
+   show_diffs: :normative # Hide informative diffs
464
+ )
465
+ # Returns empty string - no normative diffs to show
466
+ ----
467
+
468
+ == Controlling diff visibility with show_diffs
469
+
470
+ The `show_diffs` option controls which differences appear in verbose output. This provides fine-grained control over what is displayed without affecting the comparison algorithm.
471
+
472
+ === Values
473
+
474
+ `:all` (default):: Show all differences (both normative and informative)
475
+ `:normative`:: Show only normative differences (hide informative)
476
+ `:informative`:: Show only informative differences (hide normative)
477
+
478
+ === Color scheme
479
+
480
+ Normative differences:: Shown in red (removed) and green (added)
481
+ Informative differences:: Shown in cyan/blue (both removed and added)
482
+
483
+ === Three-stage pipeline
484
+
485
+ The comparison process follows a three-stage pipeline:
486
+
487
+ [source]
488
+ ----
489
+ ┌─────────────────────────────────────────────────────────────┐
490
+ │ STAGE 1: MATCHING - Compare all nodes │
491
+ │ • Parse XML/HTML to DOM or TreeNode │
492
+ │ • Compare ALL nodes (including comments) │
493
+ │ • Create DiffNodes for ALL differences │
494
+ └────────────────────┬────────────────────────────────────────┘
495
+
496
+
497
+ ┌─────────────────────────────────────────────────────────────┐
498
+ │ STAGE 2: CLASSIFICATION - Mark normative vs informative │
499
+ │ • DiffClassifier uses match_options │
500
+ │ • comments: :ignore → comment diffs = informative │
501
+ │ • comments: :strict → comment diffs = normative │
502
+ └────────────────────┬────────────────────────────────────────┘
503
+
504
+
505
+ ┌─────────────────────────────────────────────────────────────┐
506
+ │ STAGE 3: RENDERING - Control visibility │
507
+ │ • show_diffs: :all → show everything │
508
+ │ • show_diffs: :normative → show only normative │
509
+ │ • show_diffs: :informative → show only informative │
510
+ └─────────────────────────────────────────────────────────────┘
511
+ ----
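Stages 2 and 3 of the diagram can be sketched in plain Ruby. This is a simplified model rather than Canon's `DiffClassifier`: the `DiffNode` struct, `classify`, and `visible` are hypothetical names, and the sketch assumes a difference is informative exactly when its dimension is set to `:ignore`.

```ruby
# Hypothetical model for illustration only; not Canon's actual classes.
DiffNode = Struct.new(:dimension, :before, :after, :normative)

# Stage 2: mark each difference as normative or informative.
def classify(nodes, match_options)
  nodes.map do |node|
    classified = node.dup
    classified.normative = match_options.fetch(node.dimension, :strict) != :ignore
    classified
  end
end

# Stage 3: decide which classified differences are rendered.
def visible(nodes, show_diffs)
  case show_diffs
  when :all         then nodes
  when :normative   then nodes.select(&:normative)
  when :informative then nodes.reject(&:normative)
  else raise ArgumentError, "unknown show_diffs value: #{show_diffs.inspect}"
  end
end

# Stage 1 output, written by hand here: one comment diff and one text diff.
diffs = [
  DiffNode.new(:comments, "comment 1", "comment 2"),
  DiffNode.new(:text_content, "text1", "text2")
]

classified = classify(diffs, { comments: :ignore, text_content: :strict })
visible(classified, :normative).each do |node|
  puts "#{node.dimension}: #{node.before} -> #{node.after}"
end
```

Filtering happens after classification, which is why changing `show_diffs` alters what is displayed without changing which differences were found.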
512
+
513
+ === Example: Show only normative differences
514
+
515
+ .Show only normative differences (common use case)
516
+ [source,ruby]
517
+ ----
518
+ result = Canon.equivalent?(xml1, xml2,
519
+   format: :xml,
520
+   verbose: true,
521
+   show_diffs: :normative
522
+ )
523
+ ----
524
+
525
+ This is useful when you want to focus on actual semantic differences and hide cosmetic ones like comment changes or whitespace formatting.
526
+
527
+ === Example: Show only informative differences
528
+
529
+ .Show only informative differences (debugging use case)
530
+ [source,ruby]
531
+ ----
532
+ result = Canon.equivalent?(xml1, xml2,
533
+   format: :xml,
534
+   verbose: true,
535
+   show_diffs: :informative
536
+ )
537
+ ----
538
+
539
+ This is useful for debugging or reviewing formatting/cosmetic changes separately from semantic ones.
540
+
541
+ === Combining comments and show_diffs
542
+
543
+ The `comments` match option and `show_diffs` option work together:
544
+
545
+ [cols="1,1,2"]
546
+ |===
547
+ | comments | show_diffs | Result
548
+
549
+ | `:ignore`
550
+ | `:normative`
551
+ | Comment diffs hidden (informative)
552
+
553
+ | `:ignore`
554
+ | `:all`
555
+ | Comment diffs shown in cyan (informative)
556
+
557
+ | `:strict`
558
+ | `:normative`
559
+ | Comment diffs shown in red/green (normative)
560
+
561
+ | `:strict`
562
+ | `:informative`
563
+ | Comment diffs hidden (normative)
564
+ |===
565
+
566
+ .Example: Mixed scenario
567
+ [source,ruby]
+ ----
+ # XML with both comment and text differences
568
+ xml1 = '<root><!--old comment--><child>text1</child></root>'
569
+ xml2 = '<root><!--new comment--><child>text2</child></root>'
570
+
571
+ # Show only normative diffs (hide comment changes)
572
+ result = Canon.equivalent?(xml1, xml2,
573
+   format: :xml,
574
+   verbose: true,
575
+   match: { comments: :ignore, text_content: :strict },
576
+   show_diffs: :normative
577
+ )
578
+
579
+ # Output shows text change but not comment change
580
+ # - text1 (in red)
581
+ # + text2 (in green)
582
+ # Comment diff is hidden because it's informative
583
+ ----
584
+
585
+ === CLI usage
586
+
587
+ [source,bash]
588
+ ----
589
+ # Show only normative differences
590
+ canon diff file1.xml file2.xml --show-diffs=normative
591
+
592
+ # Show all differences (default)
593
+ canon diff file1.xml file2.xml --show-diffs=all
594
+
595
+ # Show only informative differences
596
+ canon diff file1.xml file2.xml --show-diffs=informative
597
+ ----
598
+
599
+ === RSpec usage
600
+
601
+ [source,ruby]
602
+ ----
603
+ RSpec.describe 'XML comparison' do
604
+   it 'matches despite comment differences' do
605
+     expect(xml1).to be_xml_equivalent_to(xml2)
606
+       .with_match(comments: :ignore)
607
+       .show_diffs(:normative)
608
+   end
609
+ end
610
+ ----
611
+
612
+ === Environment variable
613
+
614
+ You can set a default value using the `CANON_SHOW_DIFFS` environment variable:
615
+
616
+ [source,bash]
617
+ ----
618
+ export CANON_SHOW_DIFFS=normative
619
+ ----
620
+
621
+ Valid values: `all`, `normative`, `informative`
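A minimal sketch of how an environment default like this might be read and validated follows. `show_diffs_from_env` is a hypothetical helper, and falling back to `:all` on an unrecognized value is an assumption; the source does not say how Canon treats invalid values.

```ruby
# Hypothetical helper for illustration only; not part of Canon's API.
VALID_SHOW_DIFFS = %w[all normative informative].freeze

# `env` defaults to the process environment; any object responding to
# #fetch (such as a Hash) can be passed in for testing.
def show_diffs_from_env(env = ENV)
  value = env.fetch("CANON_SHOW_DIFFS", "all")
  # Assumption: unrecognized values fall back to the documented default.
  VALID_SHOW_DIFFS.include?(value) ? value.to_sym : :all
end

puts show_diffs_from_env({ "CANON_SHOW_DIFFS" => "normative" })
puts show_diffs_from_env({})
```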
@@ -0,0 +1,83 @@
1
+ ---
2
+ layout: default
3
+ title: Getting Started
4
+ nav_order: 2
5
+ has_children: true
6
+ ---
7
+ = Getting Started
8
+
9
+ Welcome to Canon! This section will help you get up and running quickly.
10
+
11
+ == Overview
12
+
13
+ Canon is a Ruby library for canonicalization, pretty-printing, and semantic comparison of structured documents in multiple formats (XML, HTML, JSON, YAML).
14
+
15
+ Whether you're:
16
+
17
+ * A developer integrating Canon into an application
18
+ * A QA engineer writing tests for document generation
19
+ * A DevOps engineer comparing configuration files
20
+ * An architect evaluating Canon's design
21
+
22
+ This section provides everything you need to start using Canon effectively.
23
+
24
+ == What You'll Learn
25
+
26
+ This section covers:
27
+
28
+ link:installation[**Installation**]::
29
+ How to install Canon in your Ruby environment, including gem installation and bundler setup.
30
+
31
+ link:quick-start[**Quick Start**]::
32
+ Your first Canon operations - formatting and comparing documents with minimal code.
33
+
34
+ link:core-concepts[**Core Concepts**]::
35
+ Essential concepts to understand how Canon works: canonicalization, semantic comparison, and diff modes.
36
+
37
+ == Quick Example
38
+
39
+ Here's a taste of what Canon can do:
40
+
41
+ [source,ruby]
42
+ ----
43
+ require 'canon'
44
+
45
+ # Format XML in canonical form
46
+ xml = '<root><b>2</b><a>1</a></root>'
47
+ canonical = Canon.format(xml, :xml, mode: :c14n)
48
+ # => "<root><b>2</b><a>1</a></root>" (C14N preserves element order)
49
+
50
+ # Compare documents semantically
51
+ doc1 = '<root><a>1</a><b>2</b></root>'
52
+ doc2 = '<root> <b>2</b> <a>1</a> </root>'
53
+ Canon::Comparison.equivalent?(doc1, doc2)
54
+ # => true when the match options in effect ignore structural whitespace and element order
55
+ ----
56
+
57
+ == Next Steps
58
+
59
+ After completing this section:
60
+
61
+ * Explore link:../interfaces/[Interfaces] to learn Ruby API, CLI, and RSpec usage
62
+ * Read link:../understanding/[Understanding Canon] to learn how it works internally
63
+ * Check link:../features/[Features] to customize Canon's behavior
64
+
65
+ == Common Questions
66
+
67
+ **Which Ruby versions are supported?**::
68
+ Canon supports Ruby 2.7 and higher.
69
+
70
+ **What formats does Canon support?**::
71
+ XML, HTML, JSON, and YAML. See link:../understanding/formats/[Format Support] for details.
72
+
73
+ **Can I use Canon without RSpec?**::
74
+ Yes! Canon works as a standalone library. RSpec matchers are optional.
75
+
76
+ **Is Canon suitable for production use?**::
77
+ Yes. The core DOM diff algorithm is stable and well-tested. The semantic tree diff is experimental.
78
+
79
+ == See Also
80
+
81
+ * link:../interfaces/[Interfaces] - Choose your preferred way to use Canon
82
+ * link:../understanding/formats/[Format Support] - Format-specific details
83
+ * link:../features/match-options/[Match Options] - Customizing comparison behavior