canon 0.1.21 → 0.1.22
- checksums.yaml +4 -4
- data/.rubocop_todo.yml +43 -43
- data/README.adoc +8 -3
- data/docs/advanced/diff-pipeline.adoc +36 -9
- data/docs/features/diff-formatting/colors-and-symbols.adoc +82 -0
- data/docs/features/diff-formatting/index.adoc +12 -0
- data/docs/features/diff-formatting/themes.adoc +353 -0
- data/docs/features/environment-configuration/index.adoc +23 -0
- data/docs/internals/diff-char-range-pipeline.adoc +249 -0
- data/docs/internals/diffnode-enrichment.adoc +1 -0
- data/docs/internals/index.adoc +52 -4
- data/docs/reference/environment-variables.adoc +6 -0
- data/docs/understanding/architecture.adoc +5 -0
- data/examples/show_themes.rb +217 -0
- data/lib/canon/comparison/comparison_result.rb +9 -4
- data/lib/canon/config/env_schema.rb +3 -1
- data/lib/canon/config.rb +11 -0
- data/lib/canon/diff/diff_block.rb +7 -0
- data/lib/canon/diff/diff_block_builder.rb +2 -2
- data/lib/canon/diff/diff_char_range.rb +140 -0
- data/lib/canon/diff/diff_line.rb +42 -4
- data/lib/canon/diff/diff_line_builder.rb +907 -0
- data/lib/canon/diff/diff_node.rb +5 -1
- data/lib/canon/diff/diff_node_enricher.rb +1418 -0
- data/lib/canon/diff/diff_node_mapper.rb +54 -0
- data/lib/canon/diff/source_locator.rb +105 -0
- data/lib/canon/diff/text_decomposer.rb +103 -0
- data/lib/canon/diff_formatter/by_line/base_formatter.rb +264 -24
- data/lib/canon/diff_formatter/by_line/html_formatter.rb +35 -20
- data/lib/canon/diff_formatter/by_line/json_formatter.rb +36 -19
- data/lib/canon/diff_formatter/by_line/simple_formatter.rb +33 -19
- data/lib/canon/diff_formatter/by_line/xml_formatter.rb +583 -98
- data/lib/canon/diff_formatter/by_line/yaml_formatter.rb +36 -19
- data/lib/canon/diff_formatter/by_object/base_formatter.rb +62 -13
- data/lib/canon/diff_formatter/by_object/json_formatter.rb +59 -24
- data/lib/canon/diff_formatter/by_object/xml_formatter.rb +74 -34
- data/lib/canon/diff_formatter/diff_detail_formatter/color_helper.rb +4 -5
- data/lib/canon/diff_formatter/diff_detail_formatter.rb +1 -1
- data/lib/canon/diff_formatter/legend.rb +4 -2
- data/lib/canon/diff_formatter/theme.rb +857 -0
- data/lib/canon/diff_formatter.rb +11 -6
- data/lib/canon/tree_diff/matchers/hash_matcher.rb +15 -15
- data/lib/canon/tree_diff/matchers/similarity_matcher.rb +10 -0
- data/lib/canon/tree_diff/operations/operation_detector.rb +5 -1
- data/lib/canon/tree_diff/tree_diff_integrator.rb +1 -1
- data/lib/canon/version.rb +1 -1
- metadata +11 -2
|
@@ -0,0 +1,353 @@
|
|
|
1
|
+
---
|
|
2
|
+
layout: default
|
|
3
|
+
title: Diff Display Themes
|
|
4
|
+
parent: Diff Formatting
|
|
5
|
+
grand_parent: Features
|
|
6
|
+
nav_order: 6
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
:toc:
|
|
10
|
+
:toclevels: 3
|
|
11
|
+
|
|
12
|
+
== Purpose
|
|
13
|
+
|
|
14
|
+
Canon provides a theme system for diff display that controls colors, backgrounds, text effects, and visual markers. Themes keep diffs readable on both light and dark terminal backgrounds and let you personalize the visual style.
|
|
15
|
+
|
|
16
|
+
== Available themes
|
|
17
|
+
|
|
18
|
+
Canon ships with four predefined themes:
|
|
19
|
+
|
|
20
|
+
[cols="1,2,3"]
|
|
21
|
+
|===
|
|
22
|
+
| Theme | Best For | Key Characteristics
|
|
23
|
+
|
|
24
|
+
| `:light`
|
|
25
|
+
| Light terminal backgrounds
|
|
26
|
+
| Light marker backgrounds (`light_red`, `light_green`), dark text, `:black` structure elements
|
|
27
|
+
|
|
28
|
+
| `:dark`
|
|
29
|
+
| Dark terminal backgrounds (default)
|
|
30
|
+
| Saturated foreground colors, no backgrounds, `:cyan` informative, `:bright_blue` formatting
|
|
31
|
+
|
|
32
|
+
| `:retro`
|
|
33
|
+
| Amber CRT / low blue light / accessibility
|
|
34
|
+
| Monochromatic amber/yellow, inverse video for removed, `:bright_yellow` text on `:yellow` background
|
|
35
|
+
|
|
36
|
+
| `:claude`
|
|
37
|
+
| Maximum visual contrast (Claude Code style)
|
|
38
|
+
| `:white` on `:red` background for removed, `:black` on `:green` background for added
|
|
39
|
+
|===
|
|
40
|
+
|
|
41
|
+
== Theme structure
|
|
42
|
+
|
|
43
|
+
Each theme is a nested hash with these top-level sections:
|
|
44
|
+
|
|
45
|
+
[cols="1,3"]
|
|
46
|
+
|===
|
|
47
|
+
| Section | Controls
|
|
48
|
+
|
|
49
|
+
| `diff`
|
|
50
|
+
| Colors for removed, added, changed, unchanged, formatting, and informative diffs
|
|
51
|
+
|
|
52
|
+
| `xml`
|
|
53
|
+
| Colors for XML syntax elements: tags, attribute names, attribute values, text, comments, CDATA
|
|
54
|
+
|
|
55
|
+
| `html`
|
|
56
|
+
| Same as `xml` but for HTML output
|
|
57
|
+
|
|
58
|
+
| `structure`
|
|
59
|
+
| Colors for line numbers, pipe separators, context lines
|
|
60
|
+
|
|
61
|
+
| `visualization`
|
|
62
|
+
| Unicode characters for invisible whitespace (space, tab, newline, nbsp)
|
|
63
|
+
|
|
64
|
+
| `display_mode`
|
|
65
|
+
| How changed lines display: `:separate`, `:inline`, or `:mixed`
|
|
66
|
+
|===
|
|
67
|
+
|
|
68
|
+
=== Styling properties
|
|
69
|
+
|
|
70
|
+
Each styled element supports these properties:
|
|
71
|
+
|
|
72
|
+
[cols="1,1,3"]
|
|
73
|
+
|===
|
|
74
|
+
| Property | Type | Description
|
|
75
|
+
|
|
76
|
+
| `color`
|
|
77
|
+
| Symbol
|
|
78
|
+
| ANSI foreground color
|
|
79
|
+
|
|
80
|
+
| `bg`
|
|
81
|
+
| Symbol or nil
|
|
82
|
+
| ANSI background color (nil = no background)
|
|
83
|
+
|
|
84
|
+
| `bold`
|
|
85
|
+
| Boolean
|
|
86
|
+
| Bold text
|
|
87
|
+
|
|
88
|
+
| `underline`
|
|
89
|
+
| Boolean
|
|
90
|
+
| Underlined text
|
|
91
|
+
|
|
92
|
+
| `strikethrough`
|
|
93
|
+
| Boolean
|
|
94
|
+
| Strikethrough text
|
|
95
|
+
|
|
96
|
+
| `italic`
|
|
97
|
+
| Boolean
|
|
98
|
+
| Italic text (terminal support varies)
|
|
99
|
+
|===
|
|
100
|
+
|
|
101
|
+
=== Valid color values
|
|
102
|
+
|
|
103
|
+
.Standard ANSI colors (universally supported)
|
|
104
|
+
[cols="2,1"]
|
|
105
|
+
|===
|
|
106
|
+
| Color | ANSI Code
|
|
107
|
+
|
|
108
|
+
| `:black` | 30
|
|
109
|
+
| `:red` | 31
|
|
110
|
+
| `:green` | 32
|
|
111
|
+
| `:yellow` | 33
|
|
112
|
+
| `:blue` | 34
|
|
113
|
+
| `:magenta` | 35
|
|
114
|
+
| `:cyan` | 36
|
|
115
|
+
| `:white` | 37
|
|
116
|
+
|===
|
|
117
|
+
|
|
118
|
+
.Bright variants (supported by most terminals)
|
|
119
|
+
[cols="2,1"]
|
|
120
|
+
|===
|
|
121
|
+
| Color | ANSI Code
|
|
122
|
+
|
|
123
|
+
| `:bright_red` | 91
|
|
124
|
+
| `:bright_green` | 92
|
|
125
|
+
| `:bright_yellow` | 93
|
|
126
|
+
| `:bright_blue` | 94
|
|
127
|
+
| `:bright_magenta` | 95
|
|
128
|
+
| `:bright_cyan` | 96
|
|
129
|
+
| `:default` | Default terminal color
|
|
130
|
+
|===
|
|
131
|
+
|
|
132
|
+
NOTE: The Rainbow gem (Canon's terminal color library) doesn't support `:bright_black` or `:bright_white` directly. Themes use `:white` or `:black` as substitutes.
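
To make the code tables concrete, here is a minimal sketch of how a style hash could be turned into an ANSI escape sequence by hand. The `ANSI_FG` table and the `ansi_wrap` helper are hypothetical names used for illustration only; Canon itself delegates coloring to Rainbow. The sketch relies on the standard SGR convention that a background code is the foreground code plus 10, and it assumes code 39 for `:default`.

```ruby
# Hypothetical helper, not part of Canon: builds SGR escape sequences by hand.
ANSI_FG = {
  black: 30, red: 31, green: 32, yellow: 33,
  blue: 34, magenta: 35, cyan: 36, white: 37,
  bright_red: 91, bright_green: 92, bright_yellow: 93,
  bright_blue: 94, bright_magenta: 95, bright_cyan: 96,
  default: 39 # SGR "default foreground" (assumption for :default)
}.freeze

def ansi_wrap(text, color: :default, bg: nil, bold: false, underline: false)
  codes = []
  codes << 1 if bold
  codes << 4 if underline
  codes << ANSI_FG.fetch(color)
  codes << ANSI_FG.fetch(bg) + 10 if bg # background = foreground code + 10
  "\e[#{codes.join(';')}m#{text}\e[0m"
end

puts ansi_wrap("- removed line", color: :white, bg: :red)
puts ansi_wrap("+ added line", color: :black, bg: :green, bold: true)
```

Unknown color names raise `KeyError` through `Hash#fetch`, which is one simple way to surface a typo in a theme early.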
|
|
133
|
+
|
|
134
|
+
== Diff color sections
|
|
135
|
+
|
|
136
|
+
Each theme defines colors for six diff categories:
|
|
137
|
+
|
|
138
|
+
[cols="1,2,2"]
|
|
139
|
+
|===
|
|
140
|
+
| Category | Markers | Description
|
|
141
|
+
|
|
142
|
+
| `removed`
|
|
143
|
+
| `-` (normative), `[` (formatting), `<` (informative)
|
|
144
|
+
| Lines or content present in the old document but not the new
|
|
145
|
+
|
|
146
|
+
| `added`
|
|
147
|
+
| `+` (normative), `]` (formatting), `>` (informative)
|
|
148
|
+
| Lines or content present in the new document but not the old
|
|
149
|
+
|
|
150
|
+
| `changed`
|
|
151
|
+
| `*` (mixed)
|
|
152
|
+
| Lines with both old and new content; has `content_old` and `content_new` sub-sections
|
|
153
|
+
|
|
154
|
+
| `unchanged`
|
|
155
|
+
| ` ` (space)
|
|
156
|
+
| Context lines with no differences
|
|
157
|
+
|
|
158
|
+
| `formatting`
|
|
159
|
+
| `[` / `]`
|
|
160
|
+
| Pure whitespace/formatting differences (no semantic change)
|
|
161
|
+
|
|
162
|
+
| `informative`
|
|
163
|
+
| `<` / `>`
|
|
164
|
+
| Tracked differences that don't affect equivalence
|
|
165
|
+
|===
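
The marker column of this table can be read as a small lookup. The sketch below is illustrative only: `MARKERS` and `marker_for` are hypothetical names, and the classification symbols (`:normative`, `:formatting`, `:informative`) are taken from the labels in the table rather than from Canon's source.

```ruby
# Hypothetical lookup built from the table above; not Canon's implementation.
MARKERS = {
  removed: { normative: "-", formatting: "[", informative: "<" },
  added:   { normative: "+", formatting: "]", informative: ">" }
}.freeze

def marker_for(category, classification = :normative)
  return "*" if category == :changed   # mixed old/new content
  return " " if category == :unchanged # context line

  MARKERS.fetch(category).fetch(classification)
end

puts marker_for(:removed)               # normative removal
puts marker_for(:added, :formatting)    # whitespace-only addition
puts marker_for(:removed, :informative) # tracked, non-equivalence-affecting removal
```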
|
|
166
|
+
|
|
167
|
+
== Theme specifications
|
|
168
|
+
|
|
169
|
+
=== Light theme (`:light`)
|
|
170
|
+
|
|
171
|
+
Designed for light terminal backgrounds (white/light gray).
|
|
172
|
+
|
|
173
|
+
* Removed: `:red` text with `:light_red` background on markers, `:strikethrough` on content
|
|
174
|
+
* Added: `:green` text with `:light_green` background on markers
|
|
175
|
+
* Changed: `:bright_red` old / `:bright_green` new, bold
|
|
176
|
+
* Formatting: `:bright_blue` (visible on light backgrounds)
|
|
177
|
+
* Informative: `:bright_magenta`
|
|
178
|
+
* Structure: `:black` line numbers and pipes
|
|
179
|
+
* Comments: `:magenta` italic
|
|
180
|
+
|
|
181
|
+
=== Dark theme (`:dark`, default)
|
|
182
|
+
|
|
183
|
+
Designed for dark terminal backgrounds.
|
|
184
|
+
|
|
185
|
+
* Removed: `:red` text, `:strikethrough` on content
|
|
186
|
+
* Added: `:green` text
|
|
187
|
+
* Changed: `:yellow` marker, `:bright_red` old / `:bright_green` new
|
|
188
|
+
* Formatting: `:bright_blue` (subtle but visible on dark backgrounds)
|
|
189
|
+
* Informative: `:cyan` (distinct from formatting)
|
|
190
|
+
* Structure: `:white` line numbers and pipes
|
|
191
|
+
* Comments: `:cyan` italic
|
|
192
|
+
|
|
193
|
+
=== Retro theme (`:retro`)
|
|
194
|
+
|
|
195
|
+
Amber CRT phosphor look. Monochromatic amber/yellow, good for low blue light and accessibility.
|
|
196
|
+
|
|
197
|
+
* Removed: Inverse video - `:bright_yellow` on `:yellow` background, bold
|
|
198
|
+
* Added: `:bright_white` (less emphasis than removed)
|
|
199
|
+
* Changed: `:bright_yellow` old with strikethrough / `:bright_white` new with underline
|
|
200
|
+
* Formatting: `:yellow` (subtle)
|
|
201
|
+
* Informative: `:bright_yellow` bold (stands out)
|
|
202
|
+
* Structure: `:yellow` throughout
|
|
203
|
+
* Comments: `:yellow` italic
|
|
204
|
+
|
|
205
|
+
=== Claude theme (`:claude`)
|
|
206
|
+
|
|
207
|
+
Claude Code diff style. Maximum visual contrast with colored backgrounds.
|
|
208
|
+
|
|
209
|
+
* Removed: `:white` text on `:red` background
|
|
210
|
+
* Added: `:black` text on `:green` background
|
|
211
|
+
* Changed: `:white` on `:magenta` background for marker
|
|
212
|
+
* Formatting: `:yellow`
|
|
213
|
+
* Informative: `:bright_cyan`
|
|
214
|
+
* Structure: `:yellow` line numbers and pipes
|
|
215
|
+
* Comments: `:cyan` italic
|
|
216
|
+
|
|
217
|
+
== Configuration
|
|
218
|
+
|
|
219
|
+
=== Setting theme by name
|
|
220
|
+
|
|
221
|
+
.Ruby API
|
|
222
|
+
[source,ruby]
|
|
223
|
+
----
|
|
224
|
+
Canon::Config.configure do |config|
|
|
225
|
+
config.xml.diff.theme = :claude
|
|
226
|
+
end
|
|
227
|
+
----
|
|
228
|
+
|
|
229
|
+
.ENV variable
|
|
230
|
+
[source,bash]
|
|
231
|
+
----
|
|
232
|
+
export CANON_DIFF_THEME=claude
|
|
233
|
+
----
|
|
234
|
+
|
|
235
|
+
Valid values: `light`, `dark`, `retro`, `claude`
|
|
236
|
+
|
|
237
|
+
=== Resolution priority
|
|
238
|
+
|
|
239
|
+
Theme is resolved in this order (highest to lowest):
|
|
240
|
+
|
|
241
|
+
. `CANON_DIFF_THEME` environment variable
|
|
242
|
+
. `config.xml.diff.theme_inheritance` (base theme + overrides)
|
|
243
|
+
. `config.xml.diff.custom_theme` (complete custom theme hash)
|
|
244
|
+
. `config.xml.diff.theme` (theme name)
|
|
245
|
+
. `:dark` (default)
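
The priority list can be sketched as a chain of early returns. `resolve_theme_source` is a hypothetical function written for this page, not Canon's resolver, and its handling of an unrecognized `CANON_DIFF_THEME` value (falling through to the next source) is an assumption.

```ruby
# Hypothetical resolver mirroring the documented priority order.
VALID_THEMES = %i[light dark retro claude].freeze

def resolve_theme_source(env: ENV, inheritance: nil, custom_theme: nil, theme: nil)
  env_name = env["CANON_DIFF_THEME"]
  # Assumption: an unset or unrecognized ENV value falls through to the config.
  return [:env, env_name.to_sym] if env_name && VALID_THEMES.include?(env_name.to_sym)
  return [:theme_inheritance, inheritance] if inheritance
  return [:custom_theme, custom_theme] if custom_theme
  return [:theme, theme] if theme

  [:default, :dark]
end

p resolve_theme_source(env: { "CANON_DIFF_THEME" => "claude" }, theme: :light)
p resolve_theme_source(env: {}, theme: :light)
p resolve_theme_source(env: {})
```

Passing a plain hash as `env:` keeps the sketch testable without touching the real process environment.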
|
|
246
|
+
|
|
247
|
+
=== Theme inheritance
|
|
248
|
+
|
|
249
|
+
Create a custom theme by inheriting from a base theme and overriding specific properties:
|
|
250
|
+
|
|
251
|
+
[source,ruby]
|
|
252
|
+
----
|
|
253
|
+
Canon::Config.configure do |config|
|
|
254
|
+
config.xml.diff.theme_inheritance = {
|
|
255
|
+
base: :dark,
|
|
256
|
+
overrides: {
|
|
257
|
+
diff: {
|
|
258
|
+
removed: { content: { bg: :light_red } }
|
|
259
|
+
}
|
|
260
|
+
}
|
|
261
|
+
}
|
|
262
|
+
end
|
|
263
|
+
----
|
|
264
|
+
|
|
265
|
+
Only the overridden properties change; everything else inherits from the base theme.
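
Conceptually, inheritance behaves like a recursive hash merge. The following is a minimal sketch of that idea; `deep_merge` is a hypothetical helper and the `base` hash is a simplified stand-in for a real theme, so Canon's actual merge logic may differ.

```ruby
# Hypothetical deep merge; nested hashes are merged, other values are replaced.
def deep_merge(base, overrides)
  base.merge(overrides) do |_key, old_val, new_val|
    if old_val.is_a?(Hash) && new_val.is_a?(Hash)
      deep_merge(old_val, new_val)
    else
      new_val
    end
  end
end

# Simplified sample data, not a complete theme.
base = { diff: { removed: { content: { color: :red, bg: nil, strikethrough: true } } } }
overrides = { diff: { removed: { content: { bg: :light_red } } } }

merged = deep_merge(base, overrides)
p merged[:diff][:removed][:content]
```

Only `bg` changes in the merged result; `color` and `strikethrough` are carried over from the base, and `base` itself is left unmodified because `Hash#merge` returns a new hash at every level.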
|
|
266
|
+
|
|
267
|
+
.Programmatic theme building
|
|
268
|
+
[source,ruby]
|
|
269
|
+
----
|
|
270
|
+
custom_theme = Canon::DiffFormatter::Theme.inherit_from(:dark)
|
|
271
|
+
.merge(diff: { removed: { content: { bg: :light_red } } })
|
|
272
|
+
.build
|
|
273
|
+
|
|
274
|
+
Canon::Config.configure do |config|
|
|
275
|
+
config.xml.diff.custom_theme = custom_theme
|
|
276
|
+
end
|
|
277
|
+
----
|
|
278
|
+
|
|
279
|
+
=== Full custom theme
|
|
280
|
+
|
|
281
|
+
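Provide a complete theme hash with all required keys. One way to obtain such a hash is to build it from an existing theme and change individual properties: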
|
|
282
|
+
|
|
283
|
+
[source,ruby]
|
|
284
|
+
----
|
|
285
|
+
Canon::Config.configure do |config|
|
|
286
|
+
config.xml.diff.custom_theme = Canon::DiffFormatter::Theme.inherit_from(:light)
|
|
287
|
+
.merge(diff: { removed: { content: { color: :magenta } } })
|
|
288
|
+
.build
|
|
289
|
+
end
|
|
290
|
+
----
|
|
291
|
+
|
|
292
|
+
== Example script
|
|
293
|
+
|
|
294
|
+
Canon includes an example script that demonstrates all themes with a sample XML diff:
|
|
295
|
+
|
|
296
|
+
[source,bash]
|
|
297
|
+
----
|
|
298
|
+
bundle exec ruby examples/show_themes.rb
|
|
299
|
+
----
|
|
300
|
+
|
|
301
|
+
This shows each theme's color output side-by-side, plus plain text output and usage examples.
|
|
302
|
+
|
|
303
|
+
== Theme validation
|
|
304
|
+
|
|
305
|
+
All themes (built-in and custom) are validated for:
|
|
306
|
+
|
|
307
|
+
* **Completeness**: All required sections and keys present
|
|
308
|
+
* **Valid values**: Colors are valid ANSI, booleans are true/false
|
|
309
|
+
* **MECE** (mutually exclusive, collectively exhaustive): No missing or extra keys
|
|
310
|
+
|
|
311
|
+
[source,ruby]
|
|
312
|
+
----
|
|
313
|
+
result = Canon::DiffFormatter::Theme.validate(my_theme)
|
|
314
|
+
result.valid? # => true/false
|
|
315
|
+
result.missing_keys # => ["diff.removed.content.color", ...]
|
|
316
|
+
result.invalid_values # => ["diff.removed.content.color must be one of ..."]
|
|
317
|
+
----
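
For intuition, the completeness and value checks can be sketched as a walk over dotted key paths. Everything below is hypothetical: `SCHEMA` covers only two illustrative keys, `validate_theme` returns a plain hash rather than Canon's result object, and the check for extra keys is omitted for brevity.

```ruby
# Hypothetical validator sketch; the schema is a tiny illustrative subset.
VALID_COLORS = %i[black red green yellow blue magenta cyan white
                  bright_red bright_green bright_yellow bright_blue
                  bright_magenta bright_cyan default].freeze

SCHEMA = {
  "diff.removed.content.color" => ->(v) { VALID_COLORS.include?(v) },
  "diff.removed.content.bold"  => ->(v) { [true, false].include?(v) }
}.freeze

def validate_theme(theme)
  missing = []
  invalid = []
  SCHEMA.each do |path, check|
    keys = path.split(".").map(&:to_sym)
    # Walk down to the hash that should hold the final key.
    parent = keys[0..-2].reduce(theme) { |h, k| h.is_a?(Hash) ? h[k] : nil }
    if !parent.is_a?(Hash) || !parent.key?(keys.last)
      missing << path
    elsif !check.call(parent[keys.last])
      invalid << path
    end
  end
  { valid: missing.empty? && invalid.empty?, missing_keys: missing, invalid_values: invalid }
end

p validate_theme({ diff: { removed: { content: { color: :pink } } } })
```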
|
|
318
|
+
|
|
319
|
+
== Rainbow gem compatibility
|
|
320
|
+
|
|
321
|
+
Canon uses the https://github.com/kucaahbe/rainbow[Rainbow] gem for terminal colors. Some color names that Canon's theme system accepts are translated internally:
|
|
322
|
+
|
|
323
|
+
[cols="2,2"]
|
|
324
|
+
|===
|
|
325
|
+
| Theme Color | Rainbow Method Chain
|
|
326
|
+
|
|
327
|
+
| `:bright_red`
|
|
328
|
+
| `.red.bright`
|
|
329
|
+
|
|
330
|
+
| `:bright_blue`
|
|
331
|
+
| `.blue.bright`
|
|
332
|
+
|
|
333
|
+
| `:bright_green`
|
|
334
|
+
| `.green.bright`
|
|
335
|
+
|
|
336
|
+
| `:bright_cyan`
|
|
337
|
+
| `.cyan.bright`
|
|
338
|
+
|
|
339
|
+
| `:bright_magenta`
|
|
340
|
+
| `.magenta.bright`
|
|
341
|
+
|
|
342
|
+
| `:bright_yellow`
|
|
343
|
+
| `.yellow.bright`
|
|
344
|
+
|===
|
|
345
|
+
|
|
346
|
+
Colors `:bright_black` and `:bright_white` are not supported by Rainbow in 16-color mode. Themes use `:white`, `:black`, or `:cyan` as substitutes.
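
The translation in the table amounts to splitting off a `bright_` prefix. The sketch below shows that step in isolation; `color_chain` is a hypothetical name, and it returns the method names as symbols instead of calling Rainbow so that it runs without the gem.

```ruby
# Hypothetical translation step: split a theme color into a base color
# plus an optional :bright modifier, as in the table above.
def color_chain(color)
  name = color.to_s
  return [color] unless name.start_with?("bright_")

  [name.delete_prefix("bright_").to_sym, :bright]
end

p color_chain(:bright_red)
p color_chain(:cyan)
```

With Rainbow loaded, such a chain could presumably be applied with something like `chain.reduce(Rainbow(text)) { |presenter, m| presenter.public_send(m) }`; treat that line as an assumption about usage rather than Canon's actual code.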
|
|
347
|
+
|
|
348
|
+
== See also
|
|
349
|
+
|
|
350
|
+
* link:colors-and-symbols.adoc[Colors and symbols] - Diff markers and classification
|
|
351
|
+
* link:display-filtering.adoc[Display filtering] - Control which diff types appear
|
|
352
|
+
* link:../environment-configuration/index.html[Environment configuration] - ENV variable setup
|
|
353
|
+
* link:../../reference/environment-variables.html[Environment variables reference] - Complete variable listing
|
|
@@ -99,6 +99,29 @@ export CANON_HTML_DIFF_USE_COLOR=true
|
|
|
99
99
|
|
|
100
100
|
Valid values: `true`, `false`, `1`, `0`, `yes`, `no`
|
|
101
101
|
|
|
102
|
+
=== Diff display theme
|
|
103
|
+
|
|
104
|
+
Choose a color theme for diff output. Themes control foreground/background colors, text effects, and visual markers.
|
|
105
|
+
|
|
106
|
+
[source,bash]
|
|
107
|
+
----
|
|
108
|
+
# Set theme for dark terminal
|
|
109
|
+
export CANON_DIFF_THEME=dark
|
|
110
|
+
|
|
111
|
+
# Set theme for light terminal
|
|
112
|
+
export CANON_DIFF_THEME=light
|
|
113
|
+
|
|
114
|
+
# Amber CRT retro look
|
|
115
|
+
export CANON_DIFF_THEME=retro
|
|
116
|
+
|
|
117
|
+
# Claude Code style (red/green backgrounds)
|
|
118
|
+
export CANON_DIFF_THEME=claude
|
|
119
|
+
----
|
|
120
|
+
|
|
121
|
+
Valid values: `light`, `dark`, `retro`, `claude`
|
|
122
|
+
|
|
123
|
+
See link:../diff-formatting/themes.adoc[Diff display themes] for complete theme documentation.
|
|
124
|
+
|
|
102
125
|
=== Context and grouping
|
|
103
126
|
|
|
104
127
|
[source,bash]
|
|
@@ -0,0 +1,249 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: DiffCharRange Pipeline
|
|
3
|
+
parent: Internals
|
|
4
|
+
nav_order: 2
|
|
5
|
+
---
|
|
6
|
+
= DiffCharRange Pipeline
|
|
7
|
+
|
|
8
|
+
== Purpose
|
|
9
|
+
|
|
10
|
+
This document explains the two-phase pipeline that enriches semantic DiffNodes with character-level positions and assembles them into display-ready DiffLines. This pipeline replaced the previous DiffNodeMapper approach that used Diff::LCS on text lines, which caused structural mismatches for XML.
|
|
11
|
+
|
|
12
|
+
== The Problem
|
|
13
|
+
|
|
14
|
+
The previous approach (`DiffNodeMapper`) ran `Diff::LCS.sdiff(lines1, lines2)` on text lines. This was fundamentally wrong because:
|
|
15
|
+
|
|
16
|
+
1. LCS operates on text lines, not XML structure
|
|
17
|
+
2. It caused spurious deletes/inserts for closing tags that merely moved to a different line
|
|
18
|
+
3. Character-level highlighting was independently re-tokenized in the formatter
|
|
19
|
+
4. A string-concatenation hack (`merge_adjacent_removals!`) was needed to paper over structural mismatches
|
|
20
|
+
|
|
21
|
+
== The Solution
|
|
22
|
+
|
|
23
|
+
The new pipeline follows a strict two-phase separation:
|
|
24
|
+
|
|
25
|
+
* **Phase 1 (Enrichment)**: DiffNodes are enriched with character positions and line counts. LCS on serialized content is acceptable here -- it operates on the change itself (short strings like "Hello World" vs "Hello Universe"), NOT on document lines.
|
|
26
|
+
* **Phase 2 (Rendering)**: Formatter renders pre-computed DiffCharRanges. NO LCS, NO tokenization, NO computation. Just read char ranges and apply colors.
|
|
27
|
+
|
|
28
|
+
== Architecture
|
|
29
|
+
|
|
30
|
+
[source]
|
|
31
|
+
----
|
|
32
|
+
PHASE 1: DIFFNODE GENERATION + ENRICHMENT
|
|
33
|
+
==========================================
|
|
34
|
+
|
|
35
|
+
Comparator (DOM/semantic) produces DiffNode[]
|
|
36
|
+
|
|
|
37
|
+
| serialized_before, serialized_after, dimension, etc.
|
|
38
|
+
|
|
|
39
|
+
v
|
|
40
|
+
DiffNodeEnricher (NEW)
|
|
41
|
+
|
|
|
42
|
+
| For each DiffNode:
|
|
43
|
+
| 1. SourceLocator: find serialized content in source text
|
|
44
|
+
| -> char_offset, line_number, col in text1 and text2
|
|
45
|
+
| 2. TextDecomposer: decompose serialized_before vs serialized_after
|
|
46
|
+
| -> common_prefix, changed_old, changed_new, common_suffix
|
|
47
|
+
| 3. Create DiffCharRange objects for each part
|
|
48
|
+
| -> attached to DiffNode
|
|
49
|
+
| 4. Compute line_range_before, line_range_after
|
|
50
|
+
|
|
|
51
|
+
v
|
|
52
|
+
DiffNode[] (enriched with char_ranges, line counts)
|
|
53
|
+
|
|
54
|
+
|
|
55
|
+
PHASE 2: RENDERING (NO LCS, NO COMPUTATION)
|
|
56
|
+
============================================
|
|
57
|
+
|
|
58
|
+
DiffNode[] -> DiffLineBuilder (maps DiffCharRanges to DiffLines)
|
|
59
|
+
-> DiffLines (each carrying DiffCharRange[])
|
|
60
|
+
-> DiffBlockBuilder -> DiffContextBuilder -> DiffReport [UNCHANGED]
|
|
61
|
+
-> Formatter (reads DiffCharRanges, applies colors)
|
|
62
|
+
----
|
|
63
|
+
|
|
64
|
+
== New Classes
|
|
65
|
+
|
|
66
|
+
=== DiffCharRange
|
|
67
|
+
|
|
68
|
+
*Location*: `lib/canon/diff/diff_char_range.rb`
|
|
69
|
+
|
|
70
|
+
Value object: a character range within a source line linked to a DiffNode.
|
|
71
|
+
|
|
72
|
+
[source,ruby]
|
|
73
|
+
----
|
|
74
|
+
class DiffCharRange
|
|
75
|
+
attr_reader :line_number, :start_col, :end_col, :side, :status, :role, :diff_node
|
|
76
|
+
|
|
77
|
+
# side: :old (text1) or :new (text2)
|
|
78
|
+
# status: :unchanged / :removed / :added / :changed_old / :changed_new
|
|
79
|
+
# role: :before / :changed / :after
|
|
80
|
+
end
|
|
81
|
+
----
|
|
82
|
+
|
|
83
|
+
Status meanings:
|
|
84
|
+
|
|
85
|
+
[cols="2,4"]
|
|
86
|
+
|===
|
|
87
|
+
| Status | Meaning
|
|
88
|
+
|
|
89
|
+
| `:unchanged` | Same in both documents (before-text / after-text portions)
|
|
90
|
+
| `:changed_old` | Old text being replaced (changed-text, text1 side)
|
|
91
|
+
| `:changed_new` | New replacement text (changed-text, text2 side)
|
|
92
|
+
| `:removed` | Entire range only in text1 (whole-line deletion)
|
|
93
|
+
| `:added` | Entire range only in text2 (whole-line addition)
|
|
94
|
+
|===
|
|
95
|
+
|
|
96
|
+
=== TextDecomposer
|
|
97
|
+
|
|
98
|
+
*Location*: `lib/canon/diff/text_decomposer.rb`
|
|
99
|
+
|
|
100
|
+
Pure function: decomposes two strings into common prefix / changed / common suffix.
|
|
101
|
+
|
|
102
|
+
[source,ruby]
|
|
103
|
+
----
|
|
104
|
+
TextDecomposer.decompose("Hello World", "Hello Universe")
|
|
105
|
+
# => {
|
|
106
|
+
# common_prefix: "Hello ",
|
|
107
|
+
# changed_old: "World",
|
|
108
|
+
# changed_new: "Universe",
|
|
109
|
+
# common_suffix: ""
|
|
110
|
+
# }
|
|
111
|
+
----
|
|
112
|
+
|
|
113
|
+
Algorithm: character-by-character prefix scan + reverse suffix scan. O(n).
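
A minimal sketch of such a prefix/suffix scan is shown below. It is written from the description above, not copied from `text_decomposer.rb`; the free-standing `decompose` method is an assumption made to keep the example self-contained.

```ruby
# Sketch of a prefix/suffix decomposition; not Canon's actual source.
def decompose(old_text, new_text)
  max = [old_text.length, new_text.length].min

  # Forward scan: length of the shared prefix.
  prefix_len = 0
  prefix_len += 1 while prefix_len < max && old_text[prefix_len] == new_text[prefix_len]

  # Reverse scan: length of the shared suffix, never overlapping the prefix.
  suffix_len = 0
  while suffix_len < max - prefix_len &&
        old_text[-1 - suffix_len] == new_text[-1 - suffix_len]
    suffix_len += 1
  end

  {
    common_prefix: old_text[0, prefix_len],
    changed_old: old_text[prefix_len, old_text.length - prefix_len - suffix_len],
    changed_new: new_text[prefix_len, new_text.length - prefix_len - suffix_len],
    common_suffix: suffix_len.zero? ? "" : old_text[-suffix_len..]
  }
end

p decompose("Hello World", "Hello Universe")
p decompose("John Doe", "Jane Doe")
```

Bounding the suffix scan by `max - prefix_len` prevents the prefix and suffix from claiming the same characters when one string is a repetition-heavy extension of the other (for example `"aa"` vs `"a"`).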
|
|
114
|
+
|
|
115
|
+
=== SourceLocator
|
|
116
|
+
|
|
117
|
+
*Location*: `lib/canon/diff/source_locator.rb`
|
|
118
|
+
|
|
119
|
+
Locates serialized content within the source text and returns its char offset plus line/col.
|
|
120
|
+
|
|
121
|
+
[source,ruby]
|
|
122
|
+
----
|
|
123
|
+
SourceLocator.locate("Hello World", text1, line_map)
|
|
124
|
+
# => { char_offset: 17, line_number: 1, col: 9 }
|
|
125
|
+
----
|
|
126
|
+
|
|
127
|
+
Uses `String#index` on the full text, then maps char offset to line/col via a pre-built line offset map.
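
The lookup can be sketched with a sorted array of line start offsets and a binary search. The helper names (`build_line_map`, `locate`) and the 0-based line and column numbering are assumptions of this sketch and may not match Canon's conventions.

```ruby
# Sketch only: names and the 0-based conventions here are assumptions.
def build_line_map(text)
  offsets = [0] # char offset at which each line starts
  text.each_char.with_index { |ch, i| offsets << i + 1 if ch == "\n" }
  offsets
end

def locate(needle, text, line_map)
  char_offset = text.index(needle)
  return nil unless char_offset

  # First line that starts after the match; the match lies on the line before it.
  after = line_map.bsearch_index { |start| start > char_offset }
  line_number = (after || line_map.length) - 1
  { char_offset: char_offset, line_number: line_number,
    col: char_offset - line_map[line_number] }
end

text = "<root>\n  <p>Hello World</p>\n</root>\n"
line_map = build_line_map(text)
p locate("Hello World", text, line_map)
```

Building the map once per document keeps each lookup at one substring search plus a logarithmic line search.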
|
|
128
|
+
|
|
129
|
+
=== DiffNodeEnricher
|
|
130
|
+
|
|
131
|
+
*Location*: `lib/canon/diff/diff_node_enricher.rb`
|
|
132
|
+
|
|
133
|
+
Enriches DiffNodes with character position data. This is Phase 1 of the pipeline.
|
|
134
|
+
|
|
135
|
+
**Input**: `DiffNode[]`, `text1`, `text2` (preprocessed strings)
|
|
136
|
+
**Output**: Same `DiffNode[]` enriched with `char_ranges`, `line_range_before`, `line_range_after`
|
|
137
|
+
|
|
138
|
+
**Algorithm**:
|
|
139
|
+
----
|
|
140
|
+
Step 1: Build line offset maps for text1 and text2
|
|
141
|
+
Step 2: For each DiffNode:
|
|
142
|
+
a. SourceLocator.locate(serialized_before, text1) -> position in text1
|
|
143
|
+
b. SourceLocator.locate(serialized_after, text2) -> position in text2
|
|
144
|
+
c. TextDecomposer.decompose(serialized_before, serialized_after) -> 3 parts
|
|
145
|
+
d. Map each part to DiffCharRange objects at correct line/col positions
|
|
146
|
+
e. Store char_ranges on DiffNode
|
|
147
|
+
f. Compute line_range_before, line_range_after from char_ranges
|
|
148
|
+
----
|
|
149
|
+
|
|
150
|
+
=== DiffLineBuilder
|
|
151
|
+
|
|
152
|
+
*Location*: `lib/canon/diff/diff_line_builder.rb`
|
|
153
|
+
|
|
154
|
+
Assembles DiffLines from enriched DiffNodes. This is Phase 2 -- NO LCS, NO computation.
|
|
155
|
+
|
|
156
|
+
**Input**: `DiffNode[]` (enriched), `text1`, `text2`
|
|
157
|
+
**Output**: `DiffLine[]` (each carrying `DiffCharRange[]`)
|
|
158
|
+
|
|
159
|
+
**Algorithm**:
|
|
160
|
+
----
|
|
161
|
+
Step 1: Walk DiffNodes in document order (sorted by line_range_before)
|
|
162
|
+
Step 2: For each DiffNode, create DiffLines from its char_ranges:
|
|
163
|
+
- char_ranges with side: :old -> DiffLine.char_ranges
|
|
164
|
+
- char_ranges with side: :new -> DiffLine.new_char_ranges
|
|
165
|
+
- Determine DiffLine type from the DiffNode's dimension and line ranges
|
|
166
|
+
Step 3: Fill in unchanged lines between DiffNodes
|
|
167
|
+
- Use cumulative offset from line_range differences
|
|
168
|
+
- Detect reflow (lines that moved but content unchanged)
|
|
169
|
+
Step 4: Return DiffLine[] in document order
|
|
170
|
+
----
|
|
171
|
+
|
|
172
|
+
**Reflow detection**: A line that exists in text1 may have its content absorbed into an adjacent changed line in text2 (e.g., a closing tag moved up to the previous line). Such lines are detected by checking whether their stripped content is a substring of the adjacent line. They become formatting-only DiffLines (`[`/`]` markers), NOT deletions.
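
The substring check at the heart of reflow detection can be sketched in a few lines. `reflowed?` is a hypothetical predicate, and the guard against whitespace-only lines is an assumption added so that an empty string does not match everything.

```ruby
# Hypothetical check following the description above.
def reflowed?(old_line, adjacent_new_line)
  stripped = old_line.strip
  # Assumption: blank lines are never treated as reflowed content.
  !stripped.empty? && adjacent_new_line.include?(stripped)
end

puts reflowed?("  </p>", "<p>Hello Universe</p>") # closing tag moved up
puts reflowed?("  <b/>", "<p>Hello Universe</p>") # genuinely removed element
```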
|
|
173
|
+
|
|
174
|
+
== Per-Dimension Strategies
|
|
175
|
+
|
|
176
|
+
[cols="2,4"]
|
|
177
|
+
|===
|
|
178
|
+
| Dimension | Strategy
|
|
179
|
+
|
|
180
|
+
| `:text_content` | Locate serialized_before/after, decompose into prefix/changed/suffix
|
|
181
|
+
| `:attribute_values` | Find `key="old_value"` in old line, `key="new_value"` in new line
|
|
182
|
+
| `:attribute_presence` | Find `key="value"` for added/removed attributes
|
|
183
|
+
| `:attribute_order` | Highlight entire attribute section as formatting
|
|
184
|
+
| `:comments` | Locate `<!--...-->`, decompose if changed
|
|
185
|
+
| `:structural_whitespace` | Mark affected lines as formatting-only
|
|
186
|
+
| `:element_structure` | Full-element deletion/insertion (all lines)
|
|
187
|
+
|===
|
|
188
|
+
|
|
189
|
+
== Modified Classes
|
|
190
|
+
|
|
191
|
+
=== DiffNode
|
|
192
|
+
|
|
193
|
+
Added enrichment attributes:
|
|
194
|
+
|
|
195
|
+
* `attr_accessor :char_ranges` -- `Array<DiffCharRange>`
|
|
196
|
+
* `attr_accessor :line_range_before` -- `[start_line, end_line]` in text1
|
|
197
|
+
* `attr_accessor :line_range_after` -- `[start_line, end_line]` in text2
|
|
198
|
+
|
|
199
|
+
=== DiffLine
|
|
200
|
+
|
|
201
|
+
Added:
|
|
202
|
+
|
|
203
|
+
* `attr_reader :char_ranges` (text1 side DiffCharRange[])
|
|
204
|
+
* `attr_reader :new_char_ranges` (text2 side DiffCharRange[])
|
|
205
|
+
* `attr_reader :new_content` (text2 line text, for :changed lines)
|
|
206
|
+
* `add_char_range(cr)`, `add_new_char_range(cr)`, `has_char_ranges?`
|
|
207
|
+
|
|
208
|
+
== The 3-Part Decomposition
|
|
209
|
+
|
|
210
|
+
When a text node changes (e.g., "Hello World" -> "Hello Universe"), the change is decomposed into 3 logical parts:
|
|
211
|
+
|
|
212
|
+
[source]
|
|
213
|
+
----
|
|
214
|
+
"Hello World" -> "Hello Universe"
|
|
215
|
+
|
|
216
|
+
part      old side   new side
|
|
217
|
+
:before   "Hello "   "Hello "     (common prefix, unchanged)
|
|
218
|
+
:changed  "World"    "Universe"   (the actual diff)
|
|
219
|
+
:after    ""         ""           (common suffix, empty in this case)
|
|
220
|
+
|
|
221
|
+
|
|
222
|
+
Each part produces DiffCharRange objects:
|
|
223
|
+
:before -> status: :unchanged (white/default)
|
|
224
|
+
:changed -> status: :changed_old (red) / :changed_new (green)
|
|
225
|
+
:after -> status: :unchanged (white/default)
|
|
226
|
+
----
|
|
227
|
+
|
|
228
|
+
Each part can map to **multiple DiffCharRanges** across **multiple source lines** when the content spans line boundaries.
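
How a single span turns into one range per source line can be sketched as follows. `CharRange` is a simplified stand-in for `DiffCharRange`, `split_by_line` is a hypothetical helper, and the 0-based lines with an exclusive `end_col` are assumptions of this sketch.

```ruby
# Hypothetical value object and splitter; field names follow the DiffCharRange
# description above, everything else is an assumption.
CharRange = Struct.new(:line_number, :start_col, :end_col, :side, :status,
                       keyword_init: true)

def split_by_line(text, char_offset, length, side:, status:)
  line_starts = [0]
  text.each_char.with_index { |ch, i| line_starts << i + 1 if ch == "\n" }

  ranges = []
  pos = char_offset
  finish = char_offset + length
  while pos < finish
    line = line_starts.rindex { |start| start <= pos }
    # Offset just past the last visible character of this line.
    content_end = line + 1 < line_starts.length ? line_starts[line + 1] - 1 : text.length
    stop = [finish, content_end].min
    if stop > pos
      ranges << CharRange.new(line_number: line,
                              start_col: pos - line_starts[line],
                              end_col: stop - line_starts[line],
                              side: side, status: status)
    end
    pos = content_end + 1 # skip the newline and continue on the next line
  end
  ranges
end

text = "<p>Hello\nWorld</p>\n"
ranges = split_by_line(text, 3, "Hello\nWorld".length, side: :old, status: :removed)
ranges.each { |r| p r.to_h }
```

The span `"Hello\nWorld"` yields two ranges, one on each line, and the newline itself is not part of either range.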
|
|
229
|
+
|
|
230
|
+
== Mixed Change Detection
|
|
231
|
+
|
|
232
|
+
A "mixed" change (`*` marker) occurs when a line has **multiple separate changed regions**. The detection counts contiguous changed regions, not total ranges:
|
|
233
|
+
|
|
234
|
+
* Simple replacement "John Doe" -> "Jane Doe": 3 ranges but 1 changed region -> `-`/`+`
|
|
235
|
+
* True mixed "foo bar baz" -> "foo qux baz quux": 2+ separate changed regions -> `*`
|
|
236
|
+
|
|
237
|
+
The `count_changed_regions` method in `XmlFormatter` walks the ranges array counting transitions into `:changed_old`/`:changed_new` status.
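
The counting idea can be sketched as follows. This version takes a plain array of status symbols instead of `DiffCharRange` objects, which is a simplification for illustration; the real method in `XmlFormatter` may differ in detail.

```ruby
# Sketch of the counting idea: count transitions *into* a changed status.
CHANGED_STATUSES = %i[changed_old changed_new].freeze

def count_changed_regions(statuses)
  in_region = false
  statuses.count do |status|
    changed = CHANGED_STATUSES.include?(status)
    entering = changed && !in_region
    in_region = changed
    entering
  end
end

simple = %i[unchanged changed_old unchanged]           # e.g. "John Doe" -> "Jane Doe"
mixed  = %i[unchanged changed_old unchanged changed_old]

puts count_changed_regions(simple) > 1 ? "*" : "-/+"
puts count_changed_regions(mixed) > 1 ? "*" : "-/+"
```

Adjacent changed ranges count as a single region, so a replacement that happens to span several ranges is still rendered with `-`/`+` rather than `*`.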
|
|
238
|
+
|
|
239
|
+
== Deleted Files
|
|
240
|
+
|
|
241
|
+
The following file was removed and replaced by this pipeline:
|
|
242
|
+
|
|
243
|
+
* `lib/canon/diff/diff_node_mapper.rb` -- replaced by `DiffNodeEnricher` + `DiffLineBuilder`
|
|
244
|
+
|
|
245
|
+
== See Also
|
|
246
|
+
|
|
247
|
+
* link:diffnode-enrichment[DiffNode Enrichment] -- How metadata is attached to DiffNodes
|
|
248
|
+
* link:../advanced/diff-pipeline[Diff Pipeline Architecture] -- The 6-layer pipeline overview
|
|
249
|
+
* link:../features/diff-formatting/colors-and-symbols[Colors and Symbols] -- How char ranges are rendered with color
|
|
@@ -609,3 +609,4 @@ The old API still works for backwards compatibility, but enriched properties pro
|
|
|
609
609
|
* link:../understanding/architecture.adoc[Architecture] - 4-layer architecture overview
|
|
610
610
|
* link:../understanding/algorithms/[Algorithms] - DOM and Semantic algorithm details
|
|
611
611
|
* link:../features/diff-formatting/[Diff Formatting] - Layer 4 rendering options
|
|
612
|
+
* link:diff-char-range-pipeline[DiffCharRange Pipeline] - How enriched DiffNodes are processed into character-level display positions
|