canon 0.1.5 → 0.1.7
- checksums.yaml +4 -4
- data/.rubocop_todo.yml +163 -67
- data/README.adoc +400 -7
- data/docs/Gemfile +9 -0
- data/docs/INDEX.adoc +99 -182
- data/docs/_config.yml +100 -0
- data/docs/advanced/diff-classification.adoc +547 -0
- data/docs/advanced/diff-pipeline.adoc +358 -0
- data/docs/advanced/index.adoc +214 -0
- data/docs/advanced/semantic-diff-report.adoc +390 -0
- data/docs/{VERBOSE.adoc → advanced/verbose-mode-architecture.adoc} +51 -53
- data/docs/features/diff-formatting/algorithm-specific-output.adoc +533 -0
- data/docs/{CHARACTER_VISUALIZATION.adoc → features/diff-formatting/character-visualization.adoc} +23 -62
- data/docs/features/diff-formatting/colors-and-symbols.adoc +606 -0
- data/docs/features/diff-formatting/context-and-grouping.adoc +490 -0
- data/docs/features/diff-formatting/display-filtering.adoc +472 -0
- data/docs/features/diff-formatting/index.adoc +140 -0
- data/docs/features/environment-configuration/index.adoc +327 -0
- data/docs/features/environment-configuration/override-system.adoc +436 -0
- data/docs/features/environment-configuration/size-limits.adoc +273 -0
- data/docs/features/index.adoc +173 -0
- data/docs/features/input-validation/index.adoc +521 -0
- data/docs/features/match-options/algorithm-specific-behavior.adoc +365 -0
- data/docs/features/match-options/html-policies.adoc +312 -0
- data/docs/features/match-options/index.adoc +621 -0
- data/docs/getting-started/index.adoc +83 -0
- data/docs/getting-started/quick-start.adoc +76 -0
- data/docs/guides/choosing-configuration.adoc +689 -0
- data/docs/guides/index.adoc +181 -0
- data/docs/{CLI.adoc → interfaces/cli/index.adoc} +18 -13
- data/docs/interfaces/index.adoc +101 -0
- data/docs/{RSPEC.adoc → interfaces/rspec/index.adoc} +242 -31
- data/docs/{RUBY_API.adoc → interfaces/ruby-api/index.adoc} +118 -16
- data/docs/lychee.toml +65 -0
- data/docs/reference/cli-options.adoc +418 -0
- data/docs/reference/environment-variables.adoc +375 -0
- data/docs/reference/index.adoc +204 -0
- data/docs/reference/options-across-interfaces.adoc +417 -0
- data/docs/understanding/algorithms/dom-diff.adoc +389 -0
- data/docs/understanding/algorithms/index.adoc +314 -0
- data/docs/understanding/algorithms/semantic-tree-diff.adoc +533 -0
- data/docs/understanding/architecture.adoc +447 -0
- data/docs/understanding/comparison-pipeline.adoc +317 -0
- data/docs/understanding/formats/html.adoc +380 -0
- data/docs/understanding/formats/index.adoc +261 -0
- data/docs/understanding/formats/json.adoc +390 -0
- data/docs/understanding/formats/xml.adoc +366 -0
- data/docs/understanding/formats/yaml.adoc +504 -0
- data/docs/understanding/index.adoc +130 -0
- data/lib/canon/cli.rb +42 -1
- data/lib/canon/commands/diff_command.rb +108 -23
- data/lib/canon/comparison/compare_profile.rb +101 -0
- data/lib/canon/comparison/comparison_result.rb +41 -2
- data/lib/canon/comparison/html_comparator.rb +292 -71
- data/lib/canon/comparison/html_compare_profile.rb +117 -0
- data/lib/canon/comparison/match_options.rb +42 -4
- data/lib/canon/comparison/strategies/base_match_strategy.rb +99 -0
- data/lib/canon/comparison/strategies/match_strategy_factory.rb +74 -0
- data/lib/canon/comparison/strategies/semantic_tree_match_strategy.rb +220 -0
- data/lib/canon/comparison/xml_comparator.rb +695 -91
- data/lib/canon/comparison.rb +207 -2
- data/lib/canon/config/env_provider.rb +71 -0
- data/lib/canon/config/env_schema.rb +58 -0
- data/lib/canon/config/override_resolver.rb +55 -0
- data/lib/canon/config/type_converter.rb +59 -0
- data/lib/canon/config.rb +158 -29
- data/lib/canon/data_model.rb +29 -0
- data/lib/canon/diff/diff_classifier.rb +74 -14
- data/lib/canon/diff/diff_context_builder.rb +41 -0
- data/lib/canon/diff/diff_line.rb +18 -2
- data/lib/canon/diff/diff_node.rb +18 -3
- data/lib/canon/diff/diff_node_mapper.rb +71 -12
- data/lib/canon/diff/formatting_detector.rb +53 -0
- data/lib/canon/diff_formatter/by_line/base_formatter.rb +60 -5
- data/lib/canon/diff_formatter/by_line/html_formatter.rb +68 -16
- data/lib/canon/diff_formatter/by_line/json_formatter.rb +0 -37
- data/lib/canon/diff_formatter/by_line/simple_formatter.rb +0 -42
- data/lib/canon/diff_formatter/by_line/xml_formatter.rb +116 -31
- data/lib/canon/diff_formatter/by_line/yaml_formatter.rb +0 -37
- data/lib/canon/diff_formatter/by_object/base_formatter.rb +126 -19
- data/lib/canon/diff_formatter/by_object/xml_formatter.rb +30 -1
- data/lib/canon/diff_formatter/debug_output.rb +7 -1
- data/lib/canon/diff_formatter/diff_detail_formatter.rb +674 -57
- data/lib/canon/diff_formatter/legend.rb +42 -0
- data/lib/canon/diff_formatter.rb +78 -9
- data/lib/canon/errors.rb +56 -0
- data/lib/canon/formatters/html_formatter_base.rb +35 -1
- data/lib/canon/formatters/json_formatter.rb +3 -0
- data/lib/canon/formatters/yaml_formatter.rb +3 -0
- data/lib/canon/html/data_model.rb +229 -0
- data/lib/canon/html.rb +9 -0
- data/lib/canon/options/cli_generator.rb +70 -0
- data/lib/canon/options/registry.rb +234 -0
- data/lib/canon/rspec_matchers.rb +34 -13
- data/lib/canon/tree_diff/adapters/html_adapter.rb +316 -0
- data/lib/canon/tree_diff/adapters/json_adapter.rb +204 -0
- data/lib/canon/tree_diff/adapters/xml_adapter.rb +285 -0
- data/lib/canon/tree_diff/adapters/yaml_adapter.rb +213 -0
- data/lib/canon/tree_diff/core/attribute_comparator.rb +84 -0
- data/lib/canon/tree_diff/core/matching.rb +241 -0
- data/lib/canon/tree_diff/core/node_signature.rb +164 -0
- data/lib/canon/tree_diff/core/node_weight.rb +135 -0
- data/lib/canon/tree_diff/core/tree_node.rb +450 -0
- data/lib/canon/tree_diff/matchers/hash_matcher.rb +258 -0
- data/lib/canon/tree_diff/matchers/similarity_matcher.rb +168 -0
- data/lib/canon/tree_diff/matchers/structural_propagator.rb +242 -0
- data/lib/canon/tree_diff/matchers/universal_matcher.rb +220 -0
- data/lib/canon/tree_diff/operation_converter.rb +631 -0
- data/lib/canon/tree_diff/operations/operation.rb +92 -0
- data/lib/canon/tree_diff/operations/operation_detector.rb +626 -0
- data/lib/canon/tree_diff/tree_diff_integrator.rb +140 -0
- data/lib/canon/tree_diff.rb +33 -0
- data/lib/canon/validators/json_validator.rb +3 -1
- data/lib/canon/validators/yaml_validator.rb +3 -1
- data/lib/canon/version.rb +1 -1
- data/lib/canon/xml/data_model.rb +22 -23
- data/lib/canon/xml/element_matcher.rb +128 -20
- data/lib/canon/xml/namespace_helper.rb +110 -0
- data/lib/canon.rb +3 -0
- metadata +81 -23
- data/_config.yml +0 -116
- data/docs/ADVANCED_TOPICS.adoc +0 -20
- data/docs/BASIC_USAGE.adoc +0 -16
- data/docs/CUSTOMIZING_BEHAVIOR.adoc +0 -19
- data/docs/DIFF_ARCHITECTURE.adoc +0 -435
- data/docs/DIFF_FORMATTING.adoc +0 -540
- data/docs/FORMATS.adoc +0 -447
- data/docs/INPUT_VALIDATION.adoc +0 -477
- data/docs/MATCH_ARCHITECTURE.adoc +0 -463
- data/docs/MATCH_OPTIONS.adoc +0 -719
- data/docs/MODES.adoc +0 -432
- data/docs/NORMATIVE_INFORMATIVE_DIFFS.adoc +0 -219
- data/docs/OPTIONS.adoc +0 -1387
- data/docs/PREPROCESSING.adoc +0 -491
- data/docs/SEMANTIC_DIFF_REPORT.adoc +0 -528
- data/docs/UNDERSTANDING_CANON.adoc +0 -17
|
@@ -0,0 +1,621 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: Match Options
|
|
3
|
+
parent: Features
|
|
4
|
+
nav_order: 3
|
|
5
|
+
has_children: true
|
|
6
|
+
---
|
|
7
|
+
= Match options
|
|
8
|
+
:toc:
|
|
9
|
+
:toclevels: 3
|
|
10
|
+
|
|
11
|
+
== Purpose
|
|
12
|
+
|
|
13
|
+
This section provides a complete reference for Canon's match options, including match dimensions, behaviors, and predefined profiles.
|
|
14
|
+
|
|
15
|
+
Match options control **Layer 3 (Match Options)** of Canon's 4-layer comparison architecture. See link:../../understanding/comparison-pipeline.adoc[Comparison Pipeline] for the complete flow.
|
|
16
|
+
|
|
17
|
+
== Overview
|
|
18
|
+
|
|
19
|
+
Match options control which aspects of a document are compared and how strictly. Canon provides:
|
|
20
|
+
|
|
21
|
+
* **Match dimensions**: Independent aspects of documents (text, whitespace, attributes, etc.)
|
|
22
|
+
* **Dimension behaviors**: How each dimension is compared (`:strict`, `:normalize`, `:ignore`)
|
|
23
|
+
* **Match profiles**: Predefined combinations for common scenarios
|
|
24
|
+
|
|
25
|
+
**Important**: Match options behave differently with each algorithm. See link:algorithm-specific-behavior.adoc[Algorithm-Specific Behavior] for details.
|
|
26
|
+
|
|
27
|
+
== Child Pages
|
|
28
|
+
|
|
29
|
+
* link:dimensions.adoc[Match Dimensions] - Detailed reference for all dimensions
|
|
30
|
+
* link:profiles.adoc[Match Profiles] - Predefined configurations
|
|
31
|
+
* link:algorithm-specific-behavior.adoc[Algorithm-Specific Behavior] - How DOM and Semantic algorithms interpret options differently
|
|
32
|
+
* link:html-policies.adoc[HTML-Specific Policies] - HTML format-specific comparison policies
|
|
33
|
+
|
|
34
|
+
== Match dimensions overview
|
|
35
|
+
|
|
36
|
+
Match dimensions are orthogonal aspects that can be configured independently.
|
|
37
|
+
|
|
38
|
+
=== text_content
|
|
39
|
+
|
|
40
|
+
**Applies to**: All formats
|
|
41
|
+
|
|
42
|
+
**Purpose**: Controls how text content within elements/values is compared.
|
|
43
|
+
|
|
44
|
+
**Behaviors**:
|
|
45
|
+
|
|
46
|
+
`:strict`:: Text must match exactly, character-for-character including all whitespace
|
|
47
|
+
|
|
48
|
+
`:normalize`:: Whitespace is normalized (collapsed/trimmed) before comparison
|
|
49
|
+
|
|
50
|
+
`:ignore`:: Text content is completely ignored in comparison
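
The three behaviors can be illustrated with a minimal pure-Ruby sketch. This is not Canon's implementation: `texts_match?` is a hypothetical helper, and "collapse runs of whitespace and trim the ends" is an assumed reading of `:normalize`.

```ruby
# Hypothetical helper for illustration only; not part of Canon's API.
def texts_match?(a, b, behavior)
  case behavior
  when :strict    then a == b                                  # character-for-character
  when :normalize then a.split.join(" ") == b.split.join(" ")  # collapse and trim whitespace
  when :ignore    then true                                    # text never causes a mismatch
  else raise ArgumentError, "unknown behavior: #{behavior.inspect}"
  end
end

left  = "Hello   world\n"
right = "Hello world"

puts texts_match?(left, right, :strict)     # false
puts texts_match?(left, right, :normalize)  # true
puts texts_match?(left, right, :ignore)     # true
```

The same three-way switch applies, in spirit, to the other dimensions that accept `:strict`, `:normalize`, and `:ignore`.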
|
|
51
|
+
|
|
52
|
+
=== structural_whitespace
|
|
53
|
+
|
|
54
|
+
**Applies to**: All formats
|
|
55
|
+
|
|
56
|
+
**Purpose**: Controls how whitespace between elements (indentation, newlines) is handled.
|
|
57
|
+
|
|
58
|
+
**Behaviors**:
|
|
59
|
+
|
|
60
|
+
`:strict`:: All structural whitespace must match exactly
|
|
61
|
+
|
|
62
|
+
`:normalize`:: Structural whitespace is normalized
|
|
63
|
+
|
|
64
|
+
`:ignore`:: Structural whitespace is completely ignored
|
|
65
|
+
|
|
66
|
+
=== attribute_whitespace
|
|
67
|
+
|
|
68
|
+
**Applies to**: XML, HTML only
|
|
69
|
+
|
|
70
|
+
**Purpose**: Controls how whitespace in attribute values is handled.
|
|
71
|
+
|
|
72
|
+
**Behaviors**:
|
|
73
|
+
|
|
74
|
+
`:strict`:: Attribute value whitespace must match exactly
|
|
75
|
+
|
|
76
|
+
`:normalize`:: Whitespace in attribute values is normalized
|
|
77
|
+
|
|
78
|
+
`:ignore`:: Whitespace in attribute values is ignored
|
|
79
|
+
|
|
80
|
+
=== attribute_order
|
|
81
|
+
|
|
82
|
+
**Applies to**: XML, HTML only
|
|
83
|
+
|
|
84
|
+
**Purpose**: Controls whether attribute order matters.
|
|
85
|
+
|
|
86
|
+
**Behaviors**:
|
|
87
|
+
|
|
88
|
+
`:strict`:: Attributes must appear in the same order
|
|
89
|
+
|
|
90
|
+
`:ignore`:: Attribute order doesn't matter (set-based comparison)
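
A small sketch of the difference between ordered and set-based attribute comparison, assuming attributes are modeled as ordered `[name, value]` pairs. The `attributes_match?` helper is hypothetical and does not reflect how Canon stores attributes internally.

```ruby
# Hypothetical helper; attributes are modeled as ordered [name, value] pairs.
def attributes_match?(a, b, order:)
  case order
  when :strict then a == b            # same pairs in the same positions
  when :ignore then a.sort == b.sort  # same pairs, position irrelevant
  else raise ArgumentError, "unknown behavior: #{order.inspect}"
  end
end

first  = [["id", "1"], ["class", "note"]]
second = [["class", "note"], ["id", "1"]]

puts attributes_match?(first, second, order: :strict)  # false
puts attributes_match?(first, second, order: :ignore)  # true
```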
|
|
91
|
+
|
|
92
|
+
=== attribute_values
|
|
93
|
+
|
|
94
|
+
**Applies to**: XML, HTML only
|
|
95
|
+
|
|
96
|
+
**Purpose**: Controls how attribute values are compared.
|
|
97
|
+
|
|
98
|
+
**Behaviors**:
|
|
99
|
+
|
|
100
|
+
`:strict`:: Attribute values must match exactly
|
|
101
|
+
|
|
102
|
+
`:normalize`:: Whitespace in values is normalized
|
|
103
|
+
|
|
104
|
+
`:ignore`:: Only attribute presence is checked, values ignored
|
|
105
|
+
|
|
106
|
+
=== key_order
|
|
107
|
+
|
|
108
|
+
**Applies to**: JSON, YAML only
|
|
109
|
+
|
|
110
|
+
**Purpose**: Controls whether object key order matters.
|
|
111
|
+
|
|
112
|
+
**Behaviors**:
|
|
113
|
+
|
|
114
|
+
`:strict`:: Keys must appear in the same order
|
|
115
|
+
|
|
116
|
+
`:ignore`:: Key order doesn't matter (unordered comparison)
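
For intuition, the sketch below contrasts the two behaviors on parsed JSON using only the Ruby standard library. Ruby's `Hash#==` already ignores key order, so the ordered check needs an explicit helper; `same_with_key_order?` is a hypothetical name, not a Canon method.

```ruby
require "json"

# Hypothetical helper: recursive equality that also requires identical key order.
def same_with_key_order?(a, b)
  if a.is_a?(Hash) && b.is_a?(Hash)
    a.keys == b.keys && a.keys.all? { |k| same_with_key_order?(a[k], b[k]) }
  elsif a.is_a?(Array) && b.is_a?(Array)
    a.size == b.size && a.zip(b).all? { |x, y| same_with_key_order?(x, y) }
  else
    a == b
  end
end

doc1 = JSON.parse('{"name": "canon", "version": "0.1.7"}')
doc2 = JSON.parse('{"version": "0.1.7", "name": "canon"}')

puts doc1 == doc2                      # true: corresponds to key_order: :ignore
puts same_with_key_order?(doc1, doc2)  # false: corresponds to key_order: :strict
```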
|
|
117
|
+
|
|
118
|
+
=== comments
|
|
119
|
+
|
|
120
|
+
**Applies to**: XML, HTML, YAML (standard JSON has no comment syntax)
|
|
121
|
+
|
|
122
|
+
**Purpose**: Controls how comments are compared.
|
|
123
|
+
|
|
124
|
+
**Behaviors**:
|
|
125
|
+
|
|
126
|
+
`:strict`:: Comments must match exactly (including whitespace)
|
|
127
|
+
|
|
128
|
+
`:normalize`:: Whitespace in comments is normalized
|
|
129
|
+
|
|
130
|
+
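`:ignore`:: Comment differences do not affect equivalence; they are still detected but classified as informative (see the "Comments dimension" section)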
|
|
131
|
+
|
|
132
|
+
=== namespace_uri
|
|
133
|
+
|
|
134
|
+
**Applies to**: XML only
|
|
135
|
+
|
|
136
|
+
**Purpose**: Controls how XML element namespaces are compared. Elements are identified by the pair `{namespace_uri, local_name}` according to XML semantics.
|
|
137
|
+
|
|
138
|
+
**Behaviors**:
|
|
139
|
+
|
|
140
|
+
`:strict`:: Namespace URIs must match (default and only supported behavior)
|
|
141
|
+
|
|
142
|
+
**Note**: This dimension is always `:strict` for XML. Namespace prefixes are not significant - only the namespace URI matters. Elements with different prefixes but the same namespace URI are considered equivalent.
|
|
143
|
+
|
|
144
|
+
.Namespace URI comparison
|
|
145
|
+
[example]
|
|
146
|
+
====
|
|
147
|
+
[source,ruby]
|
|
148
|
+
----
|
|
149
|
+
# These are equivalent (same namespace URI, different prefixes)
|
|
150
|
+
xml1 = '<root xmlns:a="http://example.com"><a:item>value</a:item></root>'
|
|
151
|
+
xml2 = '<root xmlns:b="http://example.com"><b:item>value</b:item></root>'
|
|
152
|
+
|
|
153
|
+
# These are NOT equivalent (different namespace URIs)
|
|
154
|
+
xml3 = '<root xmlns:a="http://example.com"><a:item>value</a:item></root>'
|
|
155
|
+
xml4 = '<root xmlns:a="http://other.com"><a:item>value</a:item></root>'
|
|
156
|
+
----
|
|
157
|
+
====
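
The equivalence rule can be sketched by resolving each prefixed name to its `{namespace_uri, local_name}` pair and comparing the pairs. The `expanded_name` helper and the hand-written prefix bindings are assumptions for illustration; a real comparison would take the bindings from the parsed document.

```ruby
# Hypothetical helper: resolve "prefix:local" against a prefix => URI map.
def expanded_name(qname, bindings)
  prefix, local = qname.include?(":") ? qname.split(":", 2) : [nil, qname]
  [bindings[prefix], local]
end

a = expanded_name("a:item", { "a" => "http://example.com" })
b = expanded_name("b:item", { "b" => "http://example.com" })
c = expanded_name("a:item", { "a" => "http://other.com" })

puts a == b  # true: different prefixes, same namespace URI
puts a == c  # false: same prefix, different namespace URI
```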
|
|
158
|
+
|
|
159
|
+
== Match profiles overview
|
|
160
|
+
|
|
161
|
+
Profiles are predefined combinations of dimension settings for common scenarios.
|
|
162
|
+
|
|
163
|
+
=== strict
|
|
164
|
+
|
|
165
|
+
**Purpose**: Exact matching - all dimensions use `:strict` behavior.
|
|
166
|
+
|
|
167
|
+
**When to use**:
|
|
168
|
+
|
|
169
|
+
* Character-perfect matching required
|
|
170
|
+
* Testing exact serializer output
|
|
171
|
+
* Verifying formatting compliance
|
|
172
|
+
* Maximum strictness needed
|
|
173
|
+
|
|
174
|
+
=== rendered
|
|
175
|
+
|
|
176
|
+
**Purpose**: Mimics how browsers/CSS engines render content.
|
|
177
|
+
|
|
178
|
+
**When to use**:
|
|
179
|
+
|
|
180
|
+
* Comparing HTML rendered output
|
|
181
|
+
* Formatting doesn't affect display
|
|
182
|
+
* Testing web page generation
|
|
183
|
+
* Browser-equivalent comparison
|
|
184
|
+
|
|
185
|
+
=== spec_friendly
|
|
186
|
+
|
|
187
|
+
**Purpose**: Test-friendly comparison that ignores most formatting differences.
|
|
188
|
+
|
|
189
|
+
**When to use**:
|
|
190
|
+
|
|
191
|
+
* Writing RSpec tests
|
|
192
|
+
* Testing semantic correctness
|
|
193
|
+
* Ignoring pretty-printing differences
|
|
194
|
+
* Most common test scenario
|
|
195
|
+
|
|
196
|
+
=== content_only
|
|
197
|
+
|
|
198
|
+
**Purpose**: Only semantic content matters - maximum tolerance for formatting.
|
|
199
|
+
|
|
200
|
+
**When to use**:
|
|
201
|
+
|
|
202
|
+
* Only care about data, not presentation
|
|
203
|
+
* Maximum flexibility needed
|
|
204
|
+
* Comparing across different formats
|
|
205
|
+
* Structural equivalence only
|
|
206
|
+
|
|
207
|
+
== Format defaults
|
|
208
|
+
|
|
209
|
+
Each format has sensible defaults based on typical usage:
|
|
210
|
+
|
|
211
|
+
[cols="1,1,1,1,1"]
|
|
212
|
+
|===
|
|
213
|
+
|Dimension |XML |HTML |JSON |YAML
|
|
214
|
+
|
|
215
|
+
|`text_content`
|
|
216
|
+
|`:strict`
|
|
217
|
+
|`:normalize`
|
|
218
|
+
|`:strict`
|
|
219
|
+
|`:strict`
|
|
220
|
+
|
|
221
|
+
|`structural_whitespace`
|
|
222
|
+
|`:strict`
|
|
223
|
+
|`:normalize`
|
|
224
|
+
|`:strict`
|
|
225
|
+
|`:strict`
|
|
226
|
+
|
|
227
|
+
|`attribute_whitespace`
|
|
228
|
+
|`:strict`
|
|
229
|
+
|`:normalize`
|
|
230
|
+
|—
|
|
231
|
+
|—
|
|
232
|
+
|
|
233
|
+
|`attribute_order`
|
|
234
|
+
|`:ignore`
|
|
235
|
+
|`:ignore`
|
|
236
|
+
|—
|
|
237
|
+
|—
|
|
238
|
+
|
|
239
|
+
|`attribute_values`
|
|
240
|
+
|`:strict`
|
|
241
|
+
|`:strict`
|
|
242
|
+
|—
|
|
243
|
+
|—
|
|
244
|
+
|
|
245
|
+
|`key_order`
|
|
246
|
+
|—
|
|
247
|
+
|—
|
|
248
|
+
|`:strict`
|
|
249
|
+
|`:strict`
|
|
250
|
+
|
|
251
|
+
|`comments`
|
|
252
|
+
|`:strict`
|
|
253
|
+
|`:ignore`
|
|
254
|
+
|—
|
|
255
|
+
|`:strict`
|
|
256
|
+
|
|
257
|
+
|`namespace_uri`
|
|
258
|
+
|`:strict`
|
|
259
|
+
|—
|
|
260
|
+
|—
|
|
261
|
+
|—
|
|
262
|
+
|===
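
The table can also be read as a lookup structure. The sketch below transcribes it into a plain Ruby hash; `FORMAT_DEFAULTS` and `default_behavior` are hypothetical names for this illustration and are not claimed to exist in Canon. A dimension marked "—" in the table is simply absent for that format.

```ruby
# Transcription of the table above; names are illustrative, not Canon's API.
FORMAT_DEFAULTS = {
  xml:  { text_content: :strict, structural_whitespace: :strict,
          attribute_whitespace: :strict, attribute_order: :ignore,
          attribute_values: :strict, comments: :strict, namespace_uri: :strict },
  html: { text_content: :normalize, structural_whitespace: :normalize,
          attribute_whitespace: :normalize, attribute_order: :ignore,
          attribute_values: :strict, comments: :ignore },
  json: { text_content: :strict, structural_whitespace: :strict, key_order: :strict },
  yaml: { text_content: :strict, structural_whitespace: :strict,
          key_order: :strict, comments: :strict }
}.freeze

# Returns the default behavior, or nil when the dimension does not apply.
def default_behavior(format, dimension)
  FORMAT_DEFAULTS.fetch(format)[dimension]
end

puts default_behavior(:html, :comments).inspect  # :ignore
puts default_behavior(:json, :comments).inspect  # nil
```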
|
|
263
|
+
|
|
264
|
+
== Configuration precedence
|
|
265
|
+
|
|
266
|
+
When options are specified in multiple places, Canon resolves them using this hierarchy (highest to lowest priority):
|
|
267
|
+
|
|
268
|
+
[source]
|
|
269
|
+
----
|
|
270
|
+
1. Per-comparison explicit options (highest)
|
|
271
|
+
↓
|
|
272
|
+
2. Per-comparison profile
|
|
273
|
+
↓
|
|
274
|
+
3. Global configuration explicit options
|
|
275
|
+
↓
|
|
276
|
+
4. Global configuration profile
|
|
277
|
+
↓
|
|
278
|
+
5. Format defaults (lowest)
|
|
279
|
+
----
|
|
280
|
+
|
|
281
|
+
.Precedence example
|
|
282
|
+
[example]
|
|
283
|
+
====
|
|
284
|
+
**Global configuration**:
|
|
285
|
+
|
|
286
|
+
[source,ruby]
|
|
287
|
+
----
|
|
288
|
+
Canon::RSpecMatchers.configure do |config|
|
|
289
|
+
config.xml.match.profile = :spec_friendly
|
|
290
|
+
config.xml.match.options = { comments: :strict }
|
|
291
|
+
end
|
|
292
|
+
----
|
|
293
|
+
|
|
294
|
+
The `:spec_friendly` profile sets:
|
|
295
|
+
|
|
296
|
+
* `text_content: :normalize`
|
|
297
|
+
* `structural_whitespace: :ignore`
|
|
298
|
+
* `comments: :ignore`
|
|
299
|
+
|
|
300
|
+
But the explicit `comments: :strict` overrides the profile setting.
|
|
301
|
+
|
|
302
|
+
**Per-test usage**:
|
|
303
|
+
|
|
304
|
+
[source,ruby]
|
|
305
|
+
----
|
|
306
|
+
expect(actual).to be_xml_equivalent_to(expected)
|
|
307
|
+
.with_profile(:rendered)
|
|
308
|
+
.with_options(structural_whitespace: :ignore)
|
|
309
|
+
----
|
|
310
|
+
|
|
311
|
+
**Final resolved options**:
|
|
312
|
+
|
|
313
|
+
* `text_content: :normalize` (from `:rendered` per-test profile)
|
|
314
|
+
* `structural_whitespace: :ignore` (from per-test explicit option)
|
|
315
|
+
* `comments: :strict` (from global explicit option)
|
|
316
|
+
* Other dimensions use `:rendered` profile or format defaults
|
|
317
|
+
====
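
One way to picture the hierarchy is as a chain of hash merges from lowest to highest priority, where later layers overwrite earlier ones. This is a sketch under stated assumptions, not Canon's resolver: `resolve_match_options` is a hypothetical function, and the `:rendered` profile is assumed here to set only `text_content: :normalize`.

```ruby
# Hypothetical resolver: each layer overrides the ones merged before it.
def resolve_match_options(format_defaults:, global_profile: {}, global_options: {},
                          profile: {}, options: {})
  format_defaults
    .merge(global_profile)   # 4. global configuration profile
    .merge(global_options)   # 3. global configuration explicit options
    .merge(profile)          # 2. per-comparison profile
    .merge(options)          # 1. per-comparison explicit options
end

resolved = resolve_match_options(
  format_defaults: { text_content: :strict, structural_whitespace: :strict, comments: :strict },
  global_profile:  { text_content: :normalize, structural_whitespace: :ignore, comments: :ignore },
  global_options:  { comments: :strict },
  profile:         { text_content: :normalize },       # assumed content of :rendered
  options:         { structural_whitespace: :ignore }
)

p resolved
```

With these inputs the result agrees with the resolved options listed in the precedence example: `text_content: :normalize`, `structural_whitespace: :ignore`, `comments: :strict`.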
|
|
318
|
+
|
|
319
|
+
== Usage examples
|
|
320
|
+
|
|
321
|
+
=== Ruby API
|
|
322
|
+
|
|
323
|
+
[source,ruby]
|
|
324
|
+
----
|
|
325
|
+
# Use specific dimensions
|
|
326
|
+
Canon::Comparison.equivalent?(doc1, doc2,
|
|
327
|
+
match: {
|
|
328
|
+
text_content: :normalize,
|
|
329
|
+
structural_whitespace: :ignore,
|
|
330
|
+
comments: :ignore
|
|
331
|
+
}
|
|
332
|
+
)
|
|
333
|
+
|
|
334
|
+
# Use a profile
|
|
335
|
+
Canon::Comparison.equivalent?(doc1, doc2,
|
|
336
|
+
match_profile: :spec_friendly
|
|
337
|
+
)
|
|
338
|
+
|
|
339
|
+
# Profile with dimension overrides
|
|
340
|
+
Canon::Comparison.equivalent?(doc1, doc2,
|
|
341
|
+
match_profile: :spec_friendly,
|
|
342
|
+
match: {
|
|
343
|
+
comments: :strict # Override profile
|
|
344
|
+
}
|
|
345
|
+
)
|
|
346
|
+
|
|
347
|
+
# Use semantic dimensions
|
|
348
|
+
Canon::Comparison.equivalent?(doc1, doc2,
|
|
349
|
+
diff_algorithm: :semantic,
|
|
350
|
+
match: {
|
|
351
|
+
element_position: :ignore,
|
|
352
|
+
element_hierarchy: :ignore
|
|
353
|
+
}
|
|
354
|
+
)
|
|
355
|
+
----
|
|
356
|
+
|
|
357
|
+
=== CLI
|
|
358
|
+
|
|
359
|
+
[source,bash]
|
|
360
|
+
----
|
|
361
|
+
# Use profile
|
|
362
|
+
$ canon diff file1.xml file2.xml \
|
|
363
|
+
--match-profile spec_friendly \
|
|
364
|
+
--verbose
|
|
365
|
+
|
|
366
|
+
# Override specific dimensions
|
|
367
|
+
$ canon diff file1.xml file2.xml \
|
|
368
|
+
--text-content normalize \
|
|
369
|
+
--structural-whitespace ignore \
|
|
370
|
+
--verbose
|
|
371
|
+
|
|
372
|
+
# Combine profile with overrides
|
|
373
|
+
$ canon diff file1.xml file2.xml \
|
|
374
|
+
--match-profile spec_friendly \
|
|
375
|
+
--comments strict \
|
|
376
|
+
--verbose
|
|
377
|
+
|
|
378
|
+
# Use semantic algorithm with flexible positioning
|
|
379
|
+
$ canon diff file1.xml file2.xml \
|
|
380
|
+
--diff-algorithm semantic \
|
|
381
|
+
--element-position ignore \
|
|
382
|
+
--verbose
|
|
383
|
+
----
|
|
384
|
+
|
|
385
|
+
=== RSpec
|
|
386
|
+
|
|
387
|
+
[source,ruby]
|
|
388
|
+
----
|
|
389
|
+
# Global configuration
|
|
390
|
+
Canon::RSpecMatchers.configure do |config|
|
|
391
|
+
config.xml.match.profile = :spec_friendly
|
|
392
|
+
config.xml.match.options = {
|
|
393
|
+
text_content: :normalize,
|
|
394
|
+
comments: :ignore
|
|
395
|
+
}
|
|
396
|
+
end
|
|
397
|
+
|
|
398
|
+
# Per-test override
|
|
399
|
+
expect(actual).to be_xml_equivalent_to(expected)
|
|
400
|
+
.with_profile(:strict)
|
|
401
|
+
|
|
402
|
+
# Per-test dimension override
|
|
403
|
+
expect(actual).to be_xml_equivalent_to(expected)
|
|
404
|
+
.with_options(
|
|
405
|
+
structural_whitespace: :strict,
|
|
406
|
+
text_content: :strict
|
|
407
|
+
)
|
|
408
|
+
|
|
409
|
+
# Semantic algorithm with flexible hierarchy
|
|
410
|
+
expect(actual).to be_xml_equivalent_to(expected,
|
|
411
|
+
diff_algorithm: :semantic
|
|
412
|
+
)
|
|
413
|
+
.with_options(
|
|
414
|
+
element_position: :ignore,
|
|
415
|
+
element_hierarchy: :ignore
|
|
416
|
+
)
|
|
417
|
+
----
|
|
418
|
+
|
|
419
|
+
== Comments dimension
|
|
420
|
+
|
|
421
|
+
The `comments` dimension controls how comment nodes are matched and how their differences are classified in diff output.
|
|
422
|
+
|
|
423
|
+
=== Matching behaviors
|
|
424
|
+
|
|
425
|
+
`:strict`:: Comment content must match exactly. Differences are classified as **normative** (shown in red/green).
|
|
426
|
+
`:normalize`:: Comment text is normalized (whitespace collapsed) before matching. Differences are still classified as **normative**.
|
|
427
|
+
`:ignore`:: Comment content is compared, but differences are classified as **informative** (shown in cyan/blue) rather than normative.
|
|
428
|
+
|
|
429
|
+
IMPORTANT: With `comments: :ignore`, comment nodes still participate in comparison and create DiffNodes. The difference is that these DiffNodes are marked as **informative** rather than **normative**. Use the `show_diffs` option to control visibility of informative diffs in the output.
|
|
430
|
+
|
|
431
|
+
=== Default values
|
|
432
|
+
|
|
433
|
+
* **XML**: `comments: :strict` - Comment differences are normative
|
|
434
|
+
* **HTML**: `comments: :ignore` - Comment differences are informative
|
|
435
|
+
|
|
436
|
+
=== Example: Comments as informative differences
|
|
437
|
+
|
|
438
|
+
.Match comments but classify differences as informative
|
|
439
|
+
[source,ruby]
|
|
440
|
+
----
|
|
441
|
+
xml1 = '<root><!--comment 1--><child>text</child></root>'
|
|
442
|
+
xml2 = '<root><!--comment 2--><child>text</child></root>'
|
|
443
|
+
|
|
444
|
+
Canon.equivalent?(xml1, xml2,
|
|
445
|
+
format: :xml,
|
|
446
|
+
verbose: true,
|
|
447
|
+
match: { comments: :ignore }, # Comment diffs → informative
|
|
448
|
+
show_diffs: :all # Show all diffs (cyan for comments)
|
|
449
|
+
)
|
|
450
|
+
----
|
|
451
|
+
|
|
452
|
+
In this example, the comment difference is still detected and included in the diff output, but it is rendered in cyan to mark it as informative.
|
|
453
|
+
|
|
454
|
+
=== Example: Hide informative differences
|
|
455
|
+
|
|
456
|
+
.Hide informative differences including comments
|
|
457
|
+
[source,ruby]
|
|
458
|
+
----
|
|
459
|
+
Canon.equivalent?(xml1, xml2,
|
|
460
|
+
format: :xml,
|
|
461
|
+
verbose: true,
|
|
462
|
+
match: { comments: :ignore }, # Comment diffs → informative
|
|
463
|
+
show_diffs: :normative # Hide informative diffs
|
|
464
|
+
)
|
|
465
|
+
# Returns empty string - no normative diffs to show
|
|
466
|
+
----
|
|
467
|
+
|
|
468
|
+
== Controlling diff visibility with show_diffs
|
|
469
|
+
|
|
470
|
+
The `show_diffs` option controls which differences appear in verbose output. This provides fine-grained control over what is displayed without affecting the comparison algorithm.
|
|
471
|
+
|
|
472
|
+
=== Values
|
|
473
|
+
|
|
474
|
+
`:all` (default):: Show all differences (both normative and informative)
|
|
475
|
+
`:normative`:: Show only normative differences (hide informative)
|
|
476
|
+
`:informative`:: Show only informative differences (hide normative)
|
|
477
|
+
|
|
478
|
+
=== Color scheme
|
|
479
|
+
|
|
480
|
+
Normative differences:: Shown in red (removed) and green (added)
|
|
481
|
+
Informative differences:: Shown in cyan/blue (both removed and added)
|
|
482
|
+
|
|
483
|
+
=== Three-stage pipeline
|
|
484
|
+
|
|
485
|
+
The comparison process follows a three-stage pipeline:
|
|
486
|
+
|
|
487
|
+
[source]
|
|
488
|
+
----
|
|
489
|
+
┌─────────────────────────────────────────────────────────────┐
|
|
490
|
+
│ STAGE 1: MATCHING - Compare all nodes │
|
|
491
|
+
│ • Parse XML/HTML to DOM or TreeNode │
|
|
492
|
+
│ • Compare ALL nodes (including comments) │
|
|
493
|
+
│ • Create DiffNodes for ALL differences │
|
|
494
|
+
└────────────────────┬────────────────────────────────────────┘
|
|
495
|
+
│
|
|
496
|
+
▼
|
|
497
|
+
┌─────────────────────────────────────────────────────────────┐
|
|
498
|
+
│ STAGE 2: CLASSIFICATION - Mark normative vs informative │
|
|
499
|
+
│ • DiffClassifier uses match_options │
|
|
500
|
+
│ • comments: :ignore → comment diffs = informative │
|
|
501
|
+
│ • comments: :strict → comment diffs = normative │
|
|
502
|
+
└────────────────────┬────────────────────────────────────────┘
|
|
503
|
+
│
|
|
504
|
+
▼
|
|
505
|
+
┌─────────────────────────────────────────────────────────────┐
|
|
506
|
+
│ STAGE 3: RENDERING - Control visibility │
|
|
507
|
+
│ • show_diffs: :all → show everything │
|
|
508
|
+
│ • show_diffs: :normative → show only normative │
|
|
509
|
+
│ • show_diffs: :informative → show only informative │
|
|
510
|
+
└─────────────────────────────────────────────────────────────┘
|
|
511
|
+
----
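
Stages 2 and 3 of the diagram can be sketched in a few lines of plain Ruby. Everything here is illustrative: `SketchDiff`, `classify`, and `visible` are hypothetical names, and the rule "a dimension set to `:ignore` yields informative diffs" is a simplification of what the diagram describes for comments.

```ruby
# Hypothetical stand-in for a DiffNode: which dimension differed, and its class.
SketchDiff = Struct.new(:dimension, :description, :normative)

# Stage 2: mark each diff normative unless its dimension is set to :ignore.
def classify(diffs, match_options)
  diffs.map do |d|
    SketchDiff.new(d.dimension, d.description, match_options[d.dimension] != :ignore)
  end
end

# Stage 3: choose what to display; the classification itself is untouched.
def visible(diffs, show_diffs)
  case show_diffs
  when :all         then diffs
  when :normative   then diffs.select(&:normative)
  when :informative then diffs.reject(&:normative)
  else raise ArgumentError, "unknown show_diffs value: #{show_diffs.inspect}"
  end
end

raw = [
  SketchDiff.new(:comments, "old comment -> new comment", nil),
  SketchDiff.new(:text_content, "text1 -> text2", nil)
]
classified = classify(raw, { comments: :ignore, text_content: :strict })

puts visible(classified, :normative).map(&:description).inspect    # ["text1 -> text2"]
puts visible(classified, :informative).map(&:description).inspect  # ["old comment -> new comment"]
```

Keeping classification separate from rendering is what lets `show_diffs` change the output without changing the comparison.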
|
|
512
|
+
|
|
513
|
+
=== Example: Show only normative differences
|
|
514
|
+
|
|
515
|
+
.Show only normative differences (common use case)
|
|
516
|
+
[source,ruby]
|
|
517
|
+
----
|
|
518
|
+
result = Canon.equivalent?(xml1, xml2,
|
|
519
|
+
format: :xml,
|
|
520
|
+
verbose: true,
|
|
521
|
+
show_diffs: :normative
|
|
522
|
+
)
|
|
523
|
+
----
|
|
524
|
+
|
|
525
|
+
This is useful when you want to focus on actual semantic differences and hide cosmetic ones like comment changes or whitespace formatting.
|
|
526
|
+
|
|
527
|
+
=== Example: Show only informative differences
|
|
528
|
+
|
|
529
|
+
.Show only informative differences (debugging use case)
|
|
530
|
+
[source,ruby]
|
|
531
|
+
----
|
|
532
|
+
result = Canon.equivalent?(xml1, xml2,
|
|
533
|
+
format: :xml,
|
|
534
|
+
verbose: true,
|
|
535
|
+
show_diffs: :informative
|
|
536
|
+
)
|
|
537
|
+
----
|
|
538
|
+
|
|
539
|
+
This is useful for debugging or reviewing formatting/cosmetic changes separately from semantic ones.
|
|
540
|
+
|
|
541
|
+
=== Combining comments and show_diffs
|
|
542
|
+
|
|
543
|
+
The `comments` match option and `show_diffs` option work together:
|
|
544
|
+
|
|
545
|
+
[cols="1,1,2"]
|
|
546
|
+
|===
|
|
547
|
+
| comments | show_diffs | Result
|
|
548
|
+
|
|
549
|
+
| `:ignore`
|
|
550
|
+
| `:normative`
|
|
551
|
+
| Comment diffs hidden (informative)
|
|
552
|
+
|
|
553
|
+
| `:ignore`
|
|
554
|
+
| `:all`
|
|
555
|
+
| Comment diffs shown in cyan (informative)
|
|
556
|
+
|
|
557
|
+
| `:strict`
|
|
558
|
+
| `:normative`
|
|
559
|
+
| Comment diffs shown in red/green (normative)
|
|
560
|
+
|
|
561
|
+
| `:strict`
|
|
562
|
+
| `:informative`
|
|
563
|
+
| Comment diffs hidden (normative)
|
|
564
|
+
|===
|
|
565
|
+
|
|
566
|
+
.Example: Mixed scenario
[source,ruby]
----
# XML with both comment and text differences
xml1 = '<root><!--old comment--><child>text1</child></root>'
xml2 = '<root><!--new comment--><child>text2</child></root>'

# Show only normative diffs (hide comment changes)
result = Canon.equivalent?(xml1, xml2,
  format: :xml,
  verbose: true,
  match: { comments: :ignore, text_content: :strict },
  show_diffs: :normative
)

# Output shows text change but not comment change
# - text1 (in red)
# + text2 (in green)
# Comment diff is hidden because it's informative
----
|
|
584
|
+
|
|
585
|
+
=== CLI usage
|
|
586
|
+
|
|
587
|
+
[source,bash]
|
|
588
|
+
----
|
|
589
|
+
# Show only normative differences
|
|
590
|
+
canon diff file1.xml file2.xml --show-diffs=normative
|
|
591
|
+
|
|
592
|
+
# Show all differences (default)
|
|
593
|
+
canon diff file1.xml file2.xml --show-diffs=all
|
|
594
|
+
|
|
595
|
+
# Show only informative differences
|
|
596
|
+
canon diff file1.xml file2.xml --show-diffs=informative
|
|
597
|
+
----
|
|
598
|
+
|
|
599
|
+
=== RSpec usage
|
|
600
|
+
|
|
601
|
+
[source,ruby]
|
|
602
|
+
----
|
|
603
|
+
RSpec.describe 'XML comparison' do
|
|
604
|
+
it 'matches despite comment differences' do
|
|
605
|
+
expect(xml1).to be_xml_equivalent_to(xml2)
|
|
606
|
+
.with_match(comments: :ignore)
|
|
607
|
+
.show_diffs(:normative)
|
|
608
|
+
end
|
|
609
|
+
end
|
|
610
|
+
----
|
|
611
|
+
|
|
612
|
+
=== Environment variable
|
|
613
|
+
|
|
614
|
+
You can set a default value using the `CANON_SHOW_DIFFS` environment variable:
|
|
615
|
+
|
|
616
|
+
[source,bash]
|
|
617
|
+
----
|
|
618
|
+
export CANON_SHOW_DIFFS=normative
|
|
619
|
+
----
|
|
620
|
+
|
|
621
|
+
Valid values: `all`, `normative`, `informative`
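
A minimal sketch of reading and validating this variable in Ruby. How Canon itself treats a missing or invalid value is not stated here, so the fallback to `all` and the `ArgumentError` are choices made for this example only; `show_diffs_from_env` is a hypothetical helper.

```ruby
VALID_SHOW_DIFFS = %w[all normative informative].freeze

# Hypothetical helper: accepts ENV or any Hash-like object for easy testing.
def show_diffs_from_env(env = ENV)
  raw = env.fetch("CANON_SHOW_DIFFS", "all").strip.downcase
  unless VALID_SHOW_DIFFS.include?(raw)
    raise ArgumentError, "CANON_SHOW_DIFFS must be one of: #{VALID_SHOW_DIFFS.join(', ')}"
  end
  raw.to_sym
end

puts show_diffs_from_env({ "CANON_SHOW_DIFFS" => "normative" }).inspect  # :normative
puts show_diffs_from_env({}).inspect                                     # :all
```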
|
|
@@ -0,0 +1,83 @@
|
|
|
1
|
+
---
|
|
2
|
+
layout: default
|
|
3
|
+
title: Getting Started
|
|
4
|
+
nav_order: 2
|
|
5
|
+
has_children: true
|
|
6
|
+
---
|
|
7
|
+
= Getting Started
|
|
8
|
+
|
|
9
|
+
Welcome to Canon! This section will help you get up and running quickly.
|
|
10
|
+
|
|
11
|
+
== Overview
|
|
12
|
+
|
|
13
|
+
Canon is a Ruby library for canonicalization, pretty-printing, and semantic comparison of structured documents in multiple formats (XML, HTML, JSON, YAML).
|
|
14
|
+
|
|
15
|
+
Typical readers include:
|
|
16
|
+
|
|
17
|
+
* A developer integrating Canon into an application
|
|
18
|
+
* A QA engineer writing tests for document generation
|
|
19
|
+
* A DevOps engineer comparing configuration files
|
|
20
|
+
* An architect evaluating Canon's design
|
|
21
|
+
|
|
22
|
+
This section provides everything you need to start using Canon effectively.
|
|
23
|
+
|
|
24
|
+
== What You'll Learn
|
|
25
|
+
|
|
26
|
+
This section covers:
|
|
27
|
+
|
|
28
|
+
link:installation[**Installation**]::
|
|
29
|
+
How to install Canon in your Ruby environment, including gem installation and bundler setup.
|
|
30
|
+
|
|
31
|
+
link:quick-start[**Quick Start**]::
|
|
32
|
+
Your first Canon operations - formatting and comparing documents with minimal code.
|
|
33
|
+
|
|
34
|
+
link:core-concepts[**Core Concepts**]::
|
|
35
|
+
Essential concepts to understand how Canon works: canonicalization, semantic comparison, and diff modes.
|
|
36
|
+
|
|
37
|
+
== Quick Example
|
|
38
|
+
|
|
39
|
+
Here's a taste of what Canon can do:
|
|
40
|
+
|
|
41
|
+
[source,ruby]
|
|
42
|
+
----
|
|
43
|
+
require 'canon'
|
|
44
|
+
|
|
45
|
+
# Format XML in canonical form
|
|
46
|
+
xml = '<root><b>2</b><a>1</a></root>'
|
|
47
|
+
canonical = Canon.format(xml, :xml, mode: :c14n)
|
|
48
|
+
# => "<root><b>2</b><a>1</a></root>" (C14N keeps elements in document order)
|
|
49
|
+
|
|
50
|
+
# Compare documents semantically
|
|
51
|
+
doc1 = '<root><a>1</a><b>2</b></root>'
|
|
52
|
+
doc2 = '<root> <b>2</b> <a>1</a> </root>'
|
|
53
|
+
Canon::Comparison.equivalent?(doc1, doc2)
|
|
54
|
+
# => true (ignores whitespace and element order)
|
|
55
|
+
----
|
|
56
|
+
|
|
57
|
+
== Next Steps
|
|
58
|
+
|
|
59
|
+
After completing this section:
|
|
60
|
+
|
|
61
|
+
* Explore link:../interfaces/[Interfaces] to learn Ruby API, CLI, and RSpec usage
|
|
62
|
+
* Read link:../understanding/[Understanding Canon] to learn how it works internally
|
|
63
|
+
* Check link:../features/[Features] to customize Canon's behavior
|
|
64
|
+
|
|
65
|
+
== Common Questions
|
|
66
|
+
|
|
67
|
+
**Which Ruby versions are supported?**::
|
|
68
|
+
Canon supports Ruby 2.7 and higher.
|
|
69
|
+
|
|
70
|
+
**What formats does Canon support?**::
|
|
71
|
+
XML, HTML, JSON, and YAML. See link:../understanding/formats/[Format Support] for details.
|
|
72
|
+
|
|
73
|
+
**Can I use Canon without RSpec?**::
|
|
74
|
+
Yes! Canon works as a standalone library. RSpec matchers are optional.
|
|
75
|
+
|
|
76
|
+
**Is Canon suitable for production use?**::
|
|
77
|
+
Yes. The core DOM diff algorithm is stable and well-tested. The semantic tree diff is experimental.
|
|
78
|
+
|
|
79
|
+
== See Also
|
|
80
|
+
|
|
81
|
+
* link:../interfaces/[Interfaces] - Choose your preferred way to use Canon
|
|
82
|
+
* link:../understanding/formats/[Format Support] - Format-specific details
|
|
83
|
+
* link:../features/match-options/[Match Options] - Customizing comparison behavior
|