canon 0.1.6 → 0.1.7

Files changed (136)
  1. checksums.yaml +4 -4
  2. data/.rubocop_todo.yml +163 -67
  3. data/README.adoc +400 -7
  4. data/docs/Gemfile +9 -0
  5. data/docs/INDEX.adoc +99 -182
  6. data/docs/_config.yml +100 -0
  7. data/docs/advanced/diff-classification.adoc +547 -0
  8. data/docs/advanced/diff-pipeline.adoc +358 -0
  9. data/docs/advanced/index.adoc +214 -0
  10. data/docs/advanced/semantic-diff-report.adoc +390 -0
  11. data/docs/{VERBOSE.adoc → advanced/verbose-mode-architecture.adoc} +51 -53
  12. data/docs/features/diff-formatting/algorithm-specific-output.adoc +533 -0
  13. data/docs/{CHARACTER_VISUALIZATION.adoc → features/diff-formatting/character-visualization.adoc} +23 -62
  14. data/docs/features/diff-formatting/colors-and-symbols.adoc +606 -0
  15. data/docs/features/diff-formatting/context-and-grouping.adoc +490 -0
  16. data/docs/features/diff-formatting/display-filtering.adoc +472 -0
  17. data/docs/features/diff-formatting/index.adoc +140 -0
  18. data/docs/features/environment-configuration/index.adoc +327 -0
  19. data/docs/features/environment-configuration/override-system.adoc +436 -0
  20. data/docs/features/environment-configuration/size-limits.adoc +273 -0
  21. data/docs/features/index.adoc +173 -0
  22. data/docs/features/input-validation/index.adoc +521 -0
  23. data/docs/features/match-options/algorithm-specific-behavior.adoc +365 -0
  24. data/docs/features/match-options/html-policies.adoc +312 -0
  25. data/docs/features/match-options/index.adoc +621 -0
  26. data/docs/getting-started/index.adoc +83 -0
  27. data/docs/getting-started/quick-start.adoc +76 -0
  28. data/docs/guides/choosing-configuration.adoc +689 -0
  29. data/docs/guides/index.adoc +181 -0
  30. data/docs/{CLI.adoc → interfaces/cli/index.adoc} +18 -13
  31. data/docs/interfaces/index.adoc +101 -0
  32. data/docs/{RSPEC.adoc → interfaces/rspec/index.adoc} +242 -31
  33. data/docs/{RUBY_API.adoc → interfaces/ruby-api/index.adoc} +118 -16
  34. data/docs/lychee.toml +65 -0
  35. data/docs/reference/cli-options.adoc +418 -0
  36. data/docs/reference/environment-variables.adoc +375 -0
  37. data/docs/reference/index.adoc +204 -0
  38. data/docs/reference/options-across-interfaces.adoc +417 -0
  39. data/docs/understanding/algorithms/dom-diff.adoc +389 -0
  40. data/docs/understanding/algorithms/index.adoc +314 -0
  41. data/docs/understanding/algorithms/semantic-tree-diff.adoc +533 -0
  42. data/docs/understanding/architecture.adoc +447 -0
  43. data/docs/understanding/comparison-pipeline.adoc +317 -0
  44. data/docs/understanding/formats/html.adoc +380 -0
  45. data/docs/understanding/formats/index.adoc +261 -0
  46. data/docs/understanding/formats/json.adoc +390 -0
  47. data/docs/understanding/formats/xml.adoc +366 -0
  48. data/docs/understanding/formats/yaml.adoc +504 -0
  49. data/docs/understanding/index.adoc +130 -0
  50. data/lib/canon/cli.rb +42 -1
  51. data/lib/canon/commands/diff_command.rb +108 -23
  52. data/lib/canon/comparison/compare_profile.rb +101 -0
  53. data/lib/canon/comparison/comparison_result.rb +41 -2
  54. data/lib/canon/comparison/html_comparator.rb +292 -71
  55. data/lib/canon/comparison/html_compare_profile.rb +117 -0
  56. data/lib/canon/comparison/match_options.rb +42 -4
  57. data/lib/canon/comparison/strategies/base_match_strategy.rb +99 -0
  58. data/lib/canon/comparison/strategies/match_strategy_factory.rb +74 -0
  59. data/lib/canon/comparison/strategies/semantic_tree_match_strategy.rb +220 -0
  60. data/lib/canon/comparison/xml_comparator.rb +695 -91
  61. data/lib/canon/comparison.rb +207 -2
  62. data/lib/canon/config/env_provider.rb +71 -0
  63. data/lib/canon/config/env_schema.rb +58 -0
  64. data/lib/canon/config/override_resolver.rb +55 -0
  65. data/lib/canon/config/type_converter.rb +59 -0
  66. data/lib/canon/config.rb +158 -29
  67. data/lib/canon/data_model.rb +29 -0
  68. data/lib/canon/diff/diff_classifier.rb +74 -14
  69. data/lib/canon/diff/diff_context_builder.rb +41 -0
  70. data/lib/canon/diff/diff_line.rb +18 -2
  71. data/lib/canon/diff/diff_node.rb +18 -3
  72. data/lib/canon/diff/diff_node_mapper.rb +71 -12
  73. data/lib/canon/diff/formatting_detector.rb +53 -0
  74. data/lib/canon/diff_formatter/by_line/base_formatter.rb +60 -5
  75. data/lib/canon/diff_formatter/by_line/html_formatter.rb +68 -16
  76. data/lib/canon/diff_formatter/by_line/json_formatter.rb +0 -37
  77. data/lib/canon/diff_formatter/by_line/simple_formatter.rb +0 -42
  78. data/lib/canon/diff_formatter/by_line/xml_formatter.rb +116 -31
  79. data/lib/canon/diff_formatter/by_line/yaml_formatter.rb +0 -37
  80. data/lib/canon/diff_formatter/by_object/base_formatter.rb +126 -19
  81. data/lib/canon/diff_formatter/by_object/xml_formatter.rb +30 -1
  82. data/lib/canon/diff_formatter/debug_output.rb +7 -1
  83. data/lib/canon/diff_formatter/diff_detail_formatter.rb +674 -57
  84. data/lib/canon/diff_formatter/legend.rb +42 -0
  85. data/lib/canon/diff_formatter.rb +78 -9
  86. data/lib/canon/errors.rb +56 -0
  87. data/lib/canon/formatters/html_formatter_base.rb +35 -1
  88. data/lib/canon/formatters/json_formatter.rb +3 -0
  89. data/lib/canon/formatters/yaml_formatter.rb +3 -0
  90. data/lib/canon/html/data_model.rb +229 -0
  91. data/lib/canon/html.rb +9 -0
  92. data/lib/canon/options/cli_generator.rb +70 -0
  93. data/lib/canon/options/registry.rb +234 -0
  94. data/lib/canon/rspec_matchers.rb +34 -13
  95. data/lib/canon/tree_diff/adapters/html_adapter.rb +316 -0
  96. data/lib/canon/tree_diff/adapters/json_adapter.rb +204 -0
  97. data/lib/canon/tree_diff/adapters/xml_adapter.rb +285 -0
  98. data/lib/canon/tree_diff/adapters/yaml_adapter.rb +213 -0
  99. data/lib/canon/tree_diff/core/attribute_comparator.rb +84 -0
  100. data/lib/canon/tree_diff/core/matching.rb +241 -0
  101. data/lib/canon/tree_diff/core/node_signature.rb +164 -0
  102. data/lib/canon/tree_diff/core/node_weight.rb +135 -0
  103. data/lib/canon/tree_diff/core/tree_node.rb +450 -0
  104. data/lib/canon/tree_diff/matchers/hash_matcher.rb +258 -0
  105. data/lib/canon/tree_diff/matchers/similarity_matcher.rb +168 -0
  106. data/lib/canon/tree_diff/matchers/structural_propagator.rb +242 -0
  107. data/lib/canon/tree_diff/matchers/universal_matcher.rb +220 -0
  108. data/lib/canon/tree_diff/operation_converter.rb +631 -0
  109. data/lib/canon/tree_diff/operations/operation.rb +92 -0
  110. data/lib/canon/tree_diff/operations/operation_detector.rb +626 -0
  111. data/lib/canon/tree_diff/tree_diff_integrator.rb +140 -0
  112. data/lib/canon/tree_diff.rb +33 -0
  113. data/lib/canon/validators/json_validator.rb +3 -1
  114. data/lib/canon/validators/yaml_validator.rb +3 -1
  115. data/lib/canon/version.rb +1 -1
  116. data/lib/canon/xml/data_model.rb +22 -23
  117. data/lib/canon/xml/element_matcher.rb +128 -20
  118. data/lib/canon/xml/namespace_helper.rb +110 -0
  119. data/lib/canon.rb +3 -0
  120. metadata +81 -23
  121. data/_config.yml +0 -116
  122. data/docs/ADVANCED_TOPICS.adoc +0 -20
  123. data/docs/BASIC_USAGE.adoc +0 -16
  124. data/docs/CUSTOMIZING_BEHAVIOR.adoc +0 -19
  125. data/docs/DIFF_ARCHITECTURE.adoc +0 -435
  126. data/docs/DIFF_FORMATTING.adoc +0 -540
  127. data/docs/FORMATS.adoc +0 -447
  128. data/docs/INPUT_VALIDATION.adoc +0 -477
  129. data/docs/MATCH_ARCHITECTURE.adoc +0 -463
  130. data/docs/MATCH_OPTIONS.adoc +0 -719
  131. data/docs/MODES.adoc +0 -432
  132. data/docs/NORMATIVE_INFORMATIVE_DIFFS.adoc +0 -219
  133. data/docs/OPTIONS.adoc +0 -1387
  134. data/docs/PREPROCESSING.adoc +0 -491
  135. data/docs/SEMANTIC_DIFF_REPORT.adoc +0 -528
  136. data/docs/UNDERSTANDING_CANON.adoc +0 -17
data/docs/guides/choosing-configuration.adoc +689 -0
@@ -0,0 +1,689 @@
1
+ ---
2
+ title: Choosing Configuration
3
+ parent: Guides
4
+ nav_order: 1
5
+ ---
6
+ = Choosing Configuration
7
+
8
+ == Purpose
9
+
10
+ Canon's 4-layer architecture is flexible, but the number of options can be overwhelming. This guide helps you choose the right configuration for your use case through decision trees, use case scenarios, and practical recommendations.
11
+
12
+ == Quick Decision Tree
13
+
14
+ [mermaid]
15
+ ----
16
+ graph TD
17
+ Start[What are you comparing?] --> Similar{Similar<br/>structure?}
18
+ Similar -->|Yes| Fast{Need<br/>speed?}
19
+ Similar -->|No| Semantic[Use Semantic Algorithm]
20
+
21
+ Fast -->|Yes| DOM[Use DOM Algorithm]
22
+ Fast -->|No| Semantic
23
+
24
+ DOM --> Format{Care about<br/>formatting?}
25
+ Semantic --> Format
26
+
27
+ Format -->|Yes| Strict[strict profile]
28
+ Format -->|No| SpecFriendly[spec_friendly profile]
29
+
30
+ Strict --> Output1[by_line mode]
31
+ SpecFriendly --> Output2{Want<br/>operations?}
32
+
33
+ Output2 -->|Yes| ByObject[by_object mode]
34
+ Output2 -->|No| ByLine[by_line mode]
35
+
36
+ style DOM fill:#fff4e1
37
+ style Semantic fill:#e1f5ff
38
+ style Strict fill:#ffe1f5
39
+ style SpecFriendly fill:#ffe1f5
40
+ style ByObject fill:#e1ffe1
41
+ style ByLine fill:#e1ffe1
42
+ ----
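The decision tree above can also be read as a small function. The sketch below is illustrative only: `recommend_config` is a hypothetical helper, not part of Canon, and it only assembles an options hash using the option names shown in this guide.

```ruby
# Hypothetical helper (not part of Canon): walks the Quick Decision Tree
# and returns an options hash built from the option names in this guide.
def recommend_config(similar_structure:, need_speed:, care_about_formatting:, want_operations: false)
  # Layer 2: DOM only when the structures are similar AND speed matters.
  algorithm = (similar_structure && need_speed) ? :dom : :semantic

  # Layer 3: strict when formatting matters, spec_friendly otherwise.
  profile = care_about_formatting ? :strict : :spec_friendly

  # Layer 4: the strict branch always ends in by_line; the spec_friendly
  # branch picks by_object only when operations are wanted.
  mode = (!care_about_formatting && want_operations) ? :by_object : :by_line

  { diff_algorithm: algorithm, match_profile: profile, diff_mode: mode }
end

config = recommend_config(similar_structure: true, need_speed: true,
                          care_about_formatting: false)
p config
```

The returned hash uses the same keys as the Ruby API examples in this guide, so it could be splatted into a comparison call once the gem is loaded.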
43
+
44
+ == Layer-by-Layer Decision Guide
45
+
46
+ === Layer 1: Preprocessing
47
+
48
+ **Question**: How should documents be normalized before comparison?
49
+
50
+ [cols="2,3,3"]
51
+ |===
52
+ |Choose |When |Example
53
+
54
+ |**none**
55
+ |Documents already in comparable form, no normalization needed
56
+ |Comparing canonicalized XML files
57
+
58
+ |**c14n**
59
+ |Testing XML canonicalization implementations
60
+ |Validating C14N output
61
+
62
+ |**normalize**
63
+ |Whitespace differences are irrelevant
64
+ |Comparing generated vs handwritten XML
65
+
66
+ |**format**
67
+ |Want to compare structure, ignore all formatting
68
+ |Comparing minified vs formatted JSON
69
+ |===
70
+
71
+ **Default**: `none` (no preprocessing)
72
+
73
+ **Ruby API**:
74
+ [source,ruby]
75
+ ----
76
+ Canon::Comparison.equivalent?(doc1, doc2,
77
+ preprocessing: :normalize # or :c14n, :format
78
+ )
79
+ ----
80
+
81
+ **CLI**:
82
+ [source,bash]
83
+ ----
84
+ canon diff file1.xml file2.xml --preprocessing normalize
85
+ ----
86
+
87
+ === Layer 2: Algorithm Selection
88
+
89
+ **Question**: What comparison strategy fits your documents?
90
+
91
+ [cols="2,3,3"]
92
+ |===
93
+ |Choose |When |Characteristics
94
+
95
+ |**dom**
96
+ |• Similar document structure +
97
+ • Traditional diff workflow +
98
+ • Speed is important +
99
+ • Production use (stable)
100
+ |• Fast +
101
+ • Position-based +
102
+ • No move detection +
103
+ • Well-tested
104
+
105
+ |**semantic**
106
+ |• Restructured documents +
107
+ • Need move detection +
108
+ • Operation analysis needed +
109
+ • Experimental OK
110
+ |• Slower +
111
+ • Signature-based +
112
+ • Detects moves +
113
+ • Experimental
114
+ |===
115
+
116
+ **Default**: `dom` (stable algorithm)
117
+
118
+ **Decision Matrix**:
119
+ [cols="2,1,1"]
120
+ |===
121
+ |Scenario |DOM |Semantic
122
+
123
+ |Documents have same structure
124
+ |✓
125
+ |✓
126
+
127
+ |Documents are reordered
128
+ |✗
129
+ |✓
130
+
131
+ |Need fast comparison
132
+ |✓
133
+ |✗
134
+
135
+ |Need move detection
136
+ |✗
137
+ |✓
138
+
139
+ |Production use
140
+ |✓
141
+ |⚠
142
+
143
+ |Large documents (> 100KB)
144
+ |✓
145
+ |✗
146
+ |===
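As a rough illustration of the matrix above, the following sketch picks an algorithm and collects caveats. `choose_algorithm` and `LARGE_DOC_BYTES` are hypothetical names, and treating "> 100KB" as exactly 100 * 1024 bytes is an assumption.

```ruby
# Assumption: the matrix's "> 100KB" row is read as 100 * 1024 bytes.
LARGE_DOC_BYTES = 100 * 1024

# Hypothetical helper (not part of Canon): returns [algorithm, warnings].
def choose_algorithm(reordered:, need_move_detection:, doc_bytes:)
  # Only the semantic column has a check mark for reordering and moves.
  algorithm = (reordered || need_move_detection) ? :semantic : :dom

  warnings = []
  if algorithm == :semantic
    # The "Production use" row marks semantic with a warning sign.
    warnings << "semantic is experimental; review before production use"
    # The "Large documents" row marks semantic with a cross.
    warnings << "semantic is slow on large documents" if doc_bytes > LARGE_DOC_BYTES
  end

  [algorithm, warnings]
end

algorithm, warnings = choose_algorithm(reordered: true, need_move_detection: false,
                                       doc_bytes: 250_000)
puts "algorithm: #{algorithm}"
warnings.each { |w| puts "warning: #{w}" }
```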
147
+
148
+ **Ruby API**:
149
+ [source,ruby]
150
+ ----
151
+ Canon::Comparison.equivalent?(doc1, doc2,
152
+ diff_algorithm: :dom # or :semantic
153
+ )
154
+ ----
155
+
156
+ **CLI**:
157
+ [source,bash]
158
+ ----
159
+ canon diff file1.xml file2.xml --diff-algorithm semantic
160
+ ----
161
+
162
+ === Layer 3: Match Options
163
+
164
+ **Question**: How strict should comparison be?
165
+
166
+ ==== Using Match Profiles (Recommended)
167
+
168
+ [cols="2,4"]
169
+ |===
170
+ |Profile |Use When
171
+
172
+ |**strict**
173
+ |Exact matching required. Everything must match exactly, including whitespace, attribute order, and comments.
174
+
175
+ |**rendered**
176
+ |Comparing rendered output. Simulates browser/CSS rendering: ignores formatting but keeps content strict.
177
+
178
+ |**spec_friendly**
179
+ |Writing tests. Ignores formatting differences, focuses on content and structure.
180
+
181
+ |**content_only**
182
+ |Content comparison only. Ignores all structural and formatting differences.
183
+ |===
184
+
185
+ **Default**: `strict` (exact matching)
186
+
187
+ **Ruby API**:
188
+ [source,ruby]
189
+ ----
190
+ Canon::Comparison.equivalent?(doc1, doc2,
191
+ match_profile: :spec_friendly # or :strict, :rendered, :content_only
192
+ )
193
+ ----
194
+
195
+ **CLI**:
196
+ [source,bash]
197
+ ----
198
+ canon diff file1.xml file2.xml --match-profile spec_friendly
199
+ ----
200
+
201
+ ==== Custom Match Dimensions
202
+
203
+ For fine-grained control, configure individual dimensions:
204
+
205
+ [source,ruby]
206
+ ----
207
+ Canon::Comparison.equivalent?(doc1, doc2,
208
+ match: {
209
+ text_content: :normalize, # normalize, strict, ignore
210
+ structural_whitespace: :ignore, # ignore, normalize, strict
211
+ attribute_order: :ignore, # ignore, strict (XML/HTML)
212
+ attribute_values: :normalize, # normalize, strict, ignore
213
+ comments: :ignore # ignore, normalize, strict
214
+ }
215
+ )
216
+ ----
217
+
218
+ **Remember**: Match options behave differently with each algorithm! See link:../features/match-options/algorithm-specific-behavior.adoc[Algorithm-Specific Behavior].
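A typo in a dimension value is easy to miss, so it can help to check a `match` hash before using it. This is a minimal sketch under stated assumptions: `MATCH_DIMENSIONS` and `match_option_errors` are hypothetical names, the allowed values are copied from the comments in the example above, and dimensions not listed there (such as `key_order`) are skipped rather than guessed.

```ruby
# Allowed values per dimension, copied from the comments in the example above.
# This table is an illustration, not Canon's own registry.
MATCH_DIMENSIONS = {
  text_content: %i[normalize strict ignore],
  structural_whitespace: %i[ignore normalize strict],
  attribute_order: %i[ignore strict],
  attribute_values: %i[normalize strict ignore],
  comments: %i[ignore normalize strict]
}.freeze

# Hypothetical helper: returns one message per invalid value.
# Dimensions missing from MATCH_DIMENSIONS are skipped, not rejected.
def match_option_errors(match)
  match.filter_map do |dimension, value|
    allowed = MATCH_DIMENSIONS[dimension]
    next if allowed.nil? || allowed.include?(value)

    "#{dimension}: #{value.inspect} is not one of #{allowed.inspect}"
  end
end

errors = match_option_errors({ text_content: :normalize, attribute_order: :normalize })
errors.each { |message| puts message }
```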
219
+
220
+ === Layer 4: Diff Formatting
221
+
222
+ **Question**: How should differences be displayed?
223
+
224
+ ==== Choosing Diff Mode
225
+
226
+ [cols="2,3,3"]
227
+ |===
228
+ |Mode |Best For |Output Type
229
+
230
+ |**by_line**
231
+ |• Traditional diffs +
232
+ • Code review +
233
+ • Quick scanning +
234
+ • DOM algorithm
235
+ |Line-based diff similar to `git diff`
236
+
237
+ |**by_object**
238
+ |• Tree structure view +
239
+ • Operation analysis +
240
+ • Semantic algorithm
241
+ |Tree-based with operations (INSERT, DELETE, UPDATE, MOVE)
242
+ |===
243
+
244
+ **Default**: `by_line` for DOM, `by_object` for Semantic
245
+
246
+ **Natural Fits**:
247
+ * DOM + by_line = Traditional positional diff
248
+ * Semantic + by_object = Operation-based tree diff
249
+
250
+ **Ruby API**:
251
+ [source,ruby]
252
+ ----
253
+ Canon::Comparison.equivalent?(doc1, doc2,
254
+ diff_mode: :by_object, # or :by_line
255
+ verbose: true # Enable diff output
256
+ )
257
+ ----
258
+
259
+ **CLI**:
260
+ [source,bash]
261
+ ----
262
+ canon diff file1.xml file2.xml --diff-mode by_object --verbose
263
+ ----
264
+
265
+ ==== Visual Formatting Options
266
+
267
+ [source,ruby]
268
+ ----
269
+ Canon::Comparison.equivalent?(doc1, doc2,
270
+ verbose: true,
271
+ use_color: true, # Enable colors
272
+ context_lines: 3, # Lines of context
273
+ diff_grouping_lines: 5, # Group nearby changes
274
+ show_legend: true # Display symbol legend
275
+ )
276
+ ----
277
+
278
+ == Use Case Scenarios
279
+
280
+ === Scenario 1: Unit Testing XML Generation
281
+
282
+ **Requirement**: Test that code generates correct XML, ignoring formatting
283
+
284
+ **Configuration**:
285
+ [source,ruby]
286
+ ----
287
+ expect(actual_xml).to be_equivalent_to(expected_xml).with_options(
288
+ preprocessing: :normalize, # Ignore formatting differences
289
+ diff_algorithm: :dom, # Fast, stable
290
+ match_profile: :spec_friendly, # Test-friendly
291
+ verbose: true # Show diffs on failure
292
+ )
293
+ ----
294
+
295
+ **Why**:
296
+ * `normalize` handles inconsistent whitespace
297
+ * `dom` is fast and stable for tests
298
+ * `spec_friendly` focuses on content, not formatting
299
+ * `verbose` helps debug failures
300
+
301
+ === Scenario 2: Comparing API Responses
302
+
303
+ **Requirement**: Compare JSON responses where key order doesn't matter
304
+
305
+ **Configuration**:
306
+ [source,ruby]
307
+ ----
308
+ Canon::Comparison.equivalent?(response1, response2,
309
+ diff_algorithm: :dom,
310
+ match: {
311
+ key_order: :ignore, # JSON key order irrelevant
312
+ text_content: :normalize # Normalize string values
313
+ },
314
+ verbose: true,
315
+ diff_mode: :by_object # Tree view of differences
316
+ )
317
+ ----
318
+
319
+ **Why**:
320
+ * `key_order: :ignore` handles JSON object key reordering
321
+ * `by_object` shows structured diff
322
+ * `dom` is sufficient for API responses
323
+
324
+ === Scenario 3: Detecting Document Restructuring
325
+
326
+ **Requirement**: Find what changed when a document is reorganized
327
+
328
+ **Configuration**:
329
+ [source,ruby]
330
+ ----
331
+ result = Canon::Comparison.equivalent?(old_doc, new_doc,
332
+ diff_algorithm: :semantic, # Detect moves
333
+ match_profile: :spec_friendly, # Ignore formatting
334
+ verbose: true,
335
+ diff_mode: :by_object # See operations
336
+ )
337
+
338
+ # Analyze operations
339
+ puts "Moves: #{result.statistics.moves}"
340
+ puts "Updates: #{result.statistics.updates}"
341
+ ----
342
+
343
+ **Why**:
344
+ * `semantic` algorithm detects moves and restructuring
345
+ * `by_object` shows operation-level changes
346
+ * Statistics provide quantitative analysis
347
+
348
+ === Scenario 4: Code Review Diff
349
+
350
+ **Requirement**: Traditional diff for reviewing changes
351
+
352
+ **Configuration**:
353
+ [source,bash]
354
+ ----
355
+ canon diff old.xml new.xml \
356
+ --diff-algorithm dom \
357
+ --match-profile spec_friendly \
358
+ --diff-mode by_line \
359
+ --verbose \
360
+ --use-color \
361
+ --context-lines 3
362
+ ----
363
+
364
+ **Why**:
365
+ * `dom` + `by_line` gives a traditional diff
366
+ * `--context-lines 3` shows three lines of context around each change
367
+ * Colors improve readability
368
+
369
+ === Scenario 5: Canonicalization Testing
370
+
371
+ **Requirement**: Test C14N implementation
372
+
373
+ **Configuration**:
374
+ [source,ruby]
375
+ ----
376
+ Canon::Comparison.equivalent?(doc, canonical_doc,
377
+ preprocessing: :c14n, # Apply canonicalization
378
+ diff_algorithm: :dom,
379
+ match_profile: :strict, # Exact match required
380
+ verbose: true
381
+ )
382
+ ----
383
+
384
+ **Why**:
385
+ * `c14n` preprocessing applies canonicalization
386
+ * `strict` profile ensures exact match
387
+ * Tests that canonicalization produces correct output
388
+
389
+ === Scenario 6: Content-Only Comparison
390
+
391
+ **Requirement**: Compare only text content, ignore all structure
392
+
393
+ **Configuration**:
394
+ [source,ruby]
395
+ ----
396
+ Canon::Comparison.equivalent?(doc1, doc2,
397
+ preprocessing: :format, # Normalize structure first
398
+ diff_algorithm: :semantic, # Better for structure-independent comparison
399
+ match_profile: :content_only, # Ignore all structure
400
+ verbose: true,
401
+ diff_mode: :by_object
402
+ )
403
+ ----
404
+
405
+ **Why**:
406
+ * `content_only` profile ignores structure
407
+ * `semantic` algorithm is better at structure-independent comparison
408
+ * `format` preprocessing normalizes before comparison
409
+
410
+ == Layer Interaction Matrix
411
+
412
+ This table shows recommended configurations for common scenarios:
413
+
414
+ [cols="3,1,1,1,1,2"]
415
+ |===
416
+ |Use Case |Layer 1 |Layer 2 |Layer 3 |Layer 4 |Notes
417
+
418
+ |Unit tests (similar structure)
419
+ |normalize
420
+ |dom
421
+ |spec_friendly
422
+ |by_line
423
+ |Fast, test-friendly
424
+
425
+ |Unit tests (any structure)
426
+ |normalize
427
+ |semantic
428
+ |spec_friendly
429
+ |by_object
430
+ |Handles restructuring
431
+
432
+ |API response comparison
433
+ |none
434
+ |dom
435
+ |custom
436
+ |by_object
437
+ |Configure key_order
438
+
439
+ |Document evolution tracking
440
+ |none
441
+ |semantic
442
+ |rendered
443
+ |by_object
444
+ |Detect operations
445
+
446
+ |Code review
447
+ |none
448
+ |dom
449
+ |strict
450
+ |by_line
451
+ |Traditional diff
452
+
453
+ |C14N testing
454
+ |c14n
455
+ |dom
456
+ |strict
457
+ |by_line
458
+ |Exact match
459
+
460
+ |Content extraction testing
461
+ |format
462
+ |semantic
463
+ |content_only
464
+ |by_object
465
+ |Structure-independent
466
+
467
+ |Regression testing
468
+ |normalize
469
+ |dom
470
+ |spec_friendly
471
+ |by_line
472
+ |Stable, fast
473
+ |===
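One way to keep these rows handy in a project is a small preset table. The sketch below is an assumption-laden illustration: `PRESETS` and `preset` are hypothetical names, the values are copied from the matrix above, and the "custom" entry for API responses is filled in with `key_order: :ignore` as in Scenario 2.

```ruby
# Hypothetical preset table (not part of Canon). Values come from the
# Layer Interaction Matrix; keys follow the Ruby examples in this guide.
PRESETS = {
  unit_tests_similar: { preprocessing: :normalize, diff_algorithm: :dom,
                        match_profile: :spec_friendly, diff_mode: :by_line },
  unit_tests_any:     { preprocessing: :normalize, diff_algorithm: :semantic,
                        match_profile: :spec_friendly, diff_mode: :by_object },
  api_responses:      { preprocessing: :none, diff_algorithm: :dom,
                        match: { key_order: :ignore }, diff_mode: :by_object },
  document_evolution: { preprocessing: :none, diff_algorithm: :semantic,
                        match_profile: :rendered, diff_mode: :by_object },
  code_review:        { preprocessing: :none, diff_algorithm: :dom,
                        match_profile: :strict, diff_mode: :by_line },
  c14n_testing:       { preprocessing: :c14n, diff_algorithm: :dom,
                        match_profile: :strict, diff_mode: :by_line },
  content_extraction: { preprocessing: :format, diff_algorithm: :semantic,
                        match_profile: :content_only, diff_mode: :by_object },
  regression:         { preprocessing: :normalize, diff_algorithm: :dom,
                        match_profile: :spec_friendly, diff_mode: :by_line }
}.freeze

# Returns a copy of the named preset with any overrides applied.
# Raises KeyError for an unknown preset name.
def preset(name, **overrides)
  PRESETS.fetch(name).merge(overrides)
end

options = preset(:code_review, verbose: true)
# With the canon gem loaded, the hash could be passed on like this:
#   Canon::Comparison.equivalent?(old_doc, new_doc, **options)
p options
```

Because `preset` merges into a copy, per-call overrides such as `verbose: true` never leak back into the shared table.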
474
+
475
+ == Common Configuration Patterns
476
+
477
+ === Pattern 1: Fast Test Assertion
478
+
479
+ [source,ruby]
480
+ ----
481
+ # Minimal configuration for speed
482
+ Canon::Comparison.equivalent?(expected, actual,
483
+ match_profile: :spec_friendly
484
+ )
485
+ # Uses defaults: no preprocessing, dom algorithm, by_line output
486
+ ----
487
+
488
+ === Pattern 2: Comprehensive Analysis
489
+
490
+ [source,ruby]
491
+ ----
492
+ # Full analysis with all features
493
+ result = Canon::Comparison.equivalent?(doc1, doc2,
494
+ preprocessing: :normalize,
495
+ diff_algorithm: :semantic,
496
+ match_profile: :spec_friendly,
497
+ verbose: true,
498
+ diff_mode: :by_object,
499
+ use_color: true,
500
+ context_lines: 5,
501
+ show_legend: true
502
+ )
503
+
504
+ # Access rich data
505
+ puts result.operations
506
+ puts result.statistics
507
+ ----
508
+
509
+ === Pattern 3: Strict Validation
510
+
511
+ [source,ruby]
512
+ ----
513
+ # Exact match required
514
+ Canon::Comparison.equivalent?(doc1, doc2,
515
+ preprocessing: :c14n, # Canonicalize first
516
+ match_profile: :strict, # Exact matching
517
+ verbose: true # Show any differences
518
+ )
519
+ ----
520
+
521
+ === Pattern 4: Flexible Content Comparison
522
+
523
+ [source,ruby]
524
+ ----
525
+ # Focus on content, ignore structure
526
+ Canon::Comparison.equivalent?(doc1, doc2,
527
+ preprocessing: :normalize,
528
+ diff_algorithm: :semantic,
529
+ match_profile: :content_only,
530
+ verbose: true
531
+ )
532
+ ----
533
+
534
+ == Anti-Patterns to Avoid
535
+
536
+ === Anti-Pattern 1: Over-Configuration
537
+
538
+ [source,ruby]
539
+ ----
540
+ # DON'T: Conflicting settings
541
+ Canon::Comparison.equivalent?(doc1, doc2,
542
+ preprocessing: :c14n, # Canonicalizes
543
+ match: {
544
+ structural_whitespace: :strict # Conflicts with c14n
545
+ }
546
+ )
547
+
548
+ # DO: Choose one approach
549
+ Canon::Comparison.equivalent?(doc1, doc2,
550
+ preprocessing: :c14n # Handles normalization
551
+ )
552
+ ----
553
+
554
+ === Anti-Pattern 2: Wrong Algorithm/Mode Combination
555
+
556
+ [source,ruby]
557
+ ----
558
+ # SUBOPTIMAL: Loses semantic information
559
+ Canon::Comparison.equivalent?(doc1, doc2,
560
+ diff_algorithm: :semantic,
561
+ diff_mode: :by_line # Doesn't show operations well
562
+ )
563
+
564
+ # BETTER: Use natural fit
565
+ Canon::Comparison.equivalent?(doc1, doc2,
566
+ diff_algorithm: :semantic,
567
+ diff_mode: :by_object # Shows operations clearly
568
+ )
569
+ ----
570
+
571
+ === Anti-Pattern 3: Unnecessary Semantic Algorithm
572
+
573
+ [source,ruby]
574
+ ----
575
+ # SLOW: Semantic not needed for similar documents
576
+ Canon::Comparison.equivalent?(doc1, doc2,
577
+ diff_algorithm: :semantic # Overkill if no restructuring
578
+ )
579
+
580
+ # FASTER: Use DOM for similar structures
581
+ Canon::Comparison.equivalent?(doc1, doc2,
582
+ diff_algorithm: :dom # Fast for similar docs
583
+ )
584
+ ----
585
+
586
+ === Anti-Pattern 4: Missing Verbose Flag
587
+
588
+ [source,ruby]
589
+ ----
590
+ # DON'T: Can't see what's different
591
+ result = Canon::Comparison.equivalent?(doc1, doc2)
592
+ # result is just true/false
593
+
594
+ # DO: Enable verbose for debugging
595
+ result = Canon::Comparison.equivalent?(doc1, doc2,
596
+ verbose: true
597
+ )
598
+ # result.diff shows actual differences
599
+ ----
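Anti-patterns 1, 2, and 4 can be spotted from the options hash alone, so a small check can flag them before a comparison runs. This is a sketch, not a Canon feature: `config_warnings` is a hypothetical helper. Anti-pattern 3 depends on the documents themselves and is not covered.

```ruby
# Hypothetical lint (not part of Canon): inspects an options hash and
# returns a warning for each anti-pattern it can detect statically.
def config_warnings(opts)
  warnings = []
  match = opts.fetch(:match, {})

  # Anti-pattern 1: c14n preprocessing combined with strict structural whitespace.
  if opts[:preprocessing] == :c14n && match[:structural_whitespace] == :strict
    warnings << "structural_whitespace: :strict conflicts with preprocessing: :c14n"
  end

  # Anti-pattern 2: semantic algorithm rendered line by line.
  if opts[:diff_algorithm] == :semantic && opts[:diff_mode] == :by_line
    warnings << "semantic + by_line does not show operations well; consider by_object"
  end

  # Anti-pattern 4: no diff output to inspect on mismatch.
  warnings << "verbose is off; only true/false will be returned" unless opts[:verbose]

  warnings
end

config_warnings({ diff_algorithm: :semantic, diff_mode: :by_line }).each do |warning|
  puts "warning: #{warning}"
end
```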
600
+
601
+ == Performance Considerations
602
+
603
+ === Performance Impact by Layer
604
+
605
+ [cols="2,2,2,3"]
606
+ |===
607
+ |Layer |Low Impact |Medium Impact |High Impact
608
+
609
+ |**Layer 1**
610
+ |none
611
+ |normalize, format
612
+ |c14n (complex documents)
613
+
614
+ |**Layer 2**
615
+ |dom
616
+ |—
617
+ |semantic
618
+
619
+ |**Layer 3**
620
+ |Any profile
621
+ |—
622
+ |Complex custom dimensions
623
+
624
+ |**Layer 4**
625
+ |by_line
626
+ |by_object (small docs)
627
+ |by_object (large docs)
628
+ |===
629
+
630
+ === Optimization Guidelines
631
+
632
+ **For Speed**:
633
+ [source,ruby]
634
+ ----
635
+ Canon::Comparison.equivalent?(doc1, doc2,
636
+ preprocessing: :none, # Skip preprocessing
637
+ diff_algorithm: :dom, # Fast algorithm
638
+ match_profile: :strict, # Simple matching
639
+ diff_mode: :by_line # Fast output
640
+ )
641
+ ----
642
+
643
+ **For Intelligence** (accepting slower performance):
644
+ [source,ruby]
645
+ ----
646
+ Canon::Comparison.equivalent?(doc1, doc2,
647
+ preprocessing: :normalize, # Normalize first
648
+ diff_algorithm: :semantic, # Intelligent algorithm
649
+ diff_mode: :by_object # Rich output
650
+ )
651
+ ----
652
+
653
+ == Migration Checklist
654
+
655
+ When changing configuration:
656
+
657
+ === Changing Algorithm (DOM → Semantic)
658
+
659
+ - [ ] Update `diff_algorithm` option
660
+ - [ ] Consider changing `diff_mode` to `by_object`
661
+ - [ ] Remove or update `attribute_order` expectations
662
+ - [ ] Update test assertions for operation-based output
663
+ - [ ] Accept slower performance
664
+ - [ ] Review move detection impact
665
+
666
+ === Changing Algorithm (Semantic → DOM)
667
+
668
+ - [ ] Update `diff_algorithm` option
669
+ - [ ] Consider changing `diff_mode` to `by_line`
670
+ - [ ] Add `attribute_order: :ignore` if needed
671
+ - [ ] Update test assertions for line-based output
672
+ - [ ] Expect faster performance
673
+ - [ ] Accept no move detection
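The first two items of each algorithm checklist are mechanical and can be sketched as a helper. `switch_algorithm` and `NATURAL_MODE` are hypothetical names; the helper applies the "natural fit" diff mode automatically, whereas the checklists only say to consider it, and the remaining items still need manual review.

```ruby
# Natural algorithm/mode pairs from the "Natural Fits" list in Layer 4.
NATURAL_MODE = { dom: :by_line, semantic: :by_object }.freeze

# Hypothetical helper (not part of Canon): returns a new options hash with
# the algorithm switched and the diff mode set to its natural fit.
def switch_algorithm(config, to:)
  raise ArgumentError, "unknown algorithm: #{to.inspect}" unless NATURAL_MODE.key?(to)

  config.merge(diff_algorithm: to, diff_mode: NATURAL_MODE.fetch(to))
end

old_config = { diff_algorithm: :dom, diff_mode: :by_line, verbose: true }
new_config = switch_algorithm(old_config, to: :semantic)
p new_config
```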
674
+
675
+ === Changing Match Profile
676
+
677
+ - [ ] Review impact on existing tests
678
+ - [ ] Understand what each dimension does
679
+ - [ ] Test with sample documents
680
+ - [ ] Update documentation
681
+
682
+ == See Also
683
+
684
+ * link:../understanding/comparison-pipeline.adoc[Comparison Pipeline] - Understanding the 4 layers
685
+ * link:../understanding/algorithms/[Algorithms] - Detailed algorithm documentation
686
+ * link:../features/match-options/algorithm-specific-behavior.adoc[Algorithm-Specific Behavior] - How algorithms differ
687
+ * link:../features/diff-formatting/algorithm-specific-output.adoc[Algorithm-Specific Output] - Output format differences
688
+ * link:../features/match-options/[Match Options] - All matching options
689
+ * link:../features/diff-formatting/[Diff Formatting] - Formatting options