yaml-janitor 20251113 → 20251115

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 022b412fb7fefdf3b91aae1bb4d47db513c144e1ddb3e99b6d0b83ee9141ac90
4
- data.tar.gz: 0d0f1fd010e75a0bceda041960f03432b283091fb8425b2909d3f0f4beaadc80
3
+ metadata.gz: 9836d954e6602561f6197a639b73183bc96409e93d477191590192d689c669ae
4
+ data.tar.gz: f1f96ba5f15ec13038060a9ebdf6775a939e46d8b5fddaa190c8664e50925cc8
5
5
  SHA512:
6
- metadata.gz: 18655c70a33f9db707541e76d6951bbb98b3f227028f933101ff8dd0ea28d2b0fa319a36af467ccb8b52ea7bb46727ef95d8e6625a5b1d647ebdda6ee4601958
7
- data.tar.gz: d48f8f842813eb5538c4e939969736b74b1fc8c2414f7be91d9890108d57336015276c6802b35ee138414e88175305c43cf2b85732e9c8ce95873c893083ad1b
6
+ metadata.gz: 1a02c1e0afd72eb574dff29f779bd9de9a58a4455fee1be3ceb9e598e958608c3302da92784a1730f08927ef0f1a221f52a72b556d2d8d7e4de970252f9bfbe7
7
+ data.tar.gz: bded6694abd1eb893bcde70ac3144315ececc62a92eb20f7730ec85075b2bb14dd3ad696a32dd431644cb7f0b2fb559dad9e53a21094629352013bc296f029ea
data/README.md CHANGED
@@ -1,13 +1,13 @@
1
1
  # yaml-janitor
2
2
 
3
- A YAML linter built on psych-pure that preserves comments while detecting and
4
- fixing issues.
3
+ A YAML linter and formatter built on psych-pure that preserves comments while
4
+ formatting files.
5
5
 
6
6
  ## Why?
7
7
 
8
8
  Traditional YAML tools destroy comments when editing files. yaml-janitor uses
9
- psych-pure's comment-preserving parser to lint and fix YAML files without
10
- losing valuable documentation.
9
+ psych-pure's comment-preserving parser to format YAML files without losing
10
+ valuable documentation.
11
11
 
12
12
  ## Installation
13
13
 
@@ -25,7 +25,7 @@ gem 'yaml-janitor'
25
25
 
26
26
  ### CLI
27
27
 
28
- Check a single file:
28
+ Check a single file (reports formatting issues):
29
29
  ```bash
30
30
  yaml-janitor config.yml
31
31
  ```
@@ -35,14 +35,19 @@ Check all YAML files in a directory:
35
35
  yaml-janitor containers/
36
36
  ```
37
37
 
38
- Auto-fix issues:
38
+ Format files in-place:
39
39
  ```bash
40
40
  yaml-janitor --fix config.yml
41
41
  ```
42
42
 
43
- Run specific rules:
43
+ Format with custom indentation:
44
44
  ```bash
45
- yaml-janitor --rules multiline_certificate config.yml
45
+ yaml-janitor --fix --indentation 4 config.yml
46
+ ```
47
+
48
+ Show diff of formatting changes:
49
+ ```bash
50
+ yaml-janitor --diff config.yml
46
51
  ```
47
52
 
48
53
  ### Ruby API
@@ -50,21 +55,27 @@ yaml-janitor --rules multiline_certificate config.yml
50
55
  ```ruby
51
56
  require 'yaml_janitor'
52
57
 
53
- # Lint a file
58
+ # Check a file for formatting issues
54
59
  result = YamlJanitor.lint_file("config.yml")
55
60
  result[:violations].each do |violation|
56
- puts violation
61
+ puts "#{violation.file}: #{violation.message}"
57
62
  end
58
63
 
59
- # Lint and fix
60
- result = YamlJanitor.lint_file("config.yml", fix: true)
64
+ # Format a file in-place
65
+ result = YamlJanitor.format_file("config.yml")
61
66
  if result[:fixed]
62
- puts "Fixed! New content:\n#{result[:output]}"
67
+ puts "Formatted!"
63
68
  end
64
69
 
65
- # Lint a string
70
+ # Format a string
66
71
  yaml_string = File.read("config.yml")
67
- result = YamlJanitor.lint(yaml_string)
72
+ result = YamlJanitor.format(yaml_string)
73
+ puts result[:output]
74
+
75
+ # Use custom config
76
+ config = YamlJanitor::Config.new(overrides: { indentation: 4 })
77
+ linter = YamlJanitor::Linter.new(config: config)
78
+ result = linter.lint_file("config.yml", fix: true)
68
79
  ```
69
80
 
70
81
  ## Configuration
@@ -72,29 +83,15 @@ result = YamlJanitor.lint(yaml_string)
72
83
  Create a `.yaml-janitor.yml` file in your project root:
73
84
 
74
85
  ```yaml
75
- # Formatting options (applied during --fix)
86
+ # Formatting options
76
87
  indentation: 2
77
88
  line_width: 80
78
- sequence_indent: false
79
-
80
- # Rule configuration
81
- rules:
82
- multiline_certificate:
83
- enabled: true
84
- consistent_indentation:
85
- enabled: true
86
89
  ```
87
90
 
88
91
  ### Configuration Options
89
92
 
90
- **Formatting**:
91
93
  - `indentation`: Number of spaces for indentation (default: 2)
92
- - `line_width`: Maximum line width before wrapping (default: 80)
93
- - `sequence_indent`: Indent sequences under their key (default: false)
94
-
95
- **Rules**:
96
- - `multiline_certificate`: Detects multi-line certificates in double-quoted strings
97
- - `consistent_indentation`: Detects and fixes inconsistent indentation
94
+ - `line_width`: Maximum line width before wrapping (default: 80, not yet implemented)
98
95
 
99
96
  ### Command Line Overrides
100
97
 
@@ -106,47 +103,43 @@ yaml-janitor --indentation 4 --line-width 100 config.yml
106
103
  yaml-janitor --config production.yml containers/
107
104
  ```
108
105
 
109
- ## Rules
106
+ ## How It Works
110
107
 
111
- ### multiline_certificate
108
+ yaml-janitor uses a two-phase approach:
112
109
 
113
- Detects multi-line certificates embedded in double-quoted strings. This pattern
114
- triggers a psych-pure parser bug.
110
+ 1. **Parse**: Load YAML with psych-pure, preserving comment metadata
111
+ 2. **Format**: Emit YAML using custom formatter with full control over style
115
112
 
116
- ```yaml
117
- # BAD (will trigger violation)
118
- DISCOURSE_SAML_CERT: "-----BEGIN CERTIFICATE-----
119
- MIIDGDCCAgCgAwIBAgIVAMP/9hm9Vl3/23QoXrL8hQ31DLwRMA0GCSqGSIb3DQEB
120
- -----END CERTIFICATE-----"
121
-
122
- # GOOD (use block literal style)
123
- DISCOURSE_SAML_CERT: |
124
- -----BEGIN CERTIFICATE-----
125
- MIIDGDCCAgCgAwIBAgIVAMP/9hm9Vl3/23QoXrL8hQ31DLwRMA0GCSqGSIb3DQEB
126
- -----END CERTIFICATE-----
127
- ```
113
+ When you run `yaml-janitor --fix`, it:
114
+ - Loads your YAML file with comments preserved
115
+ - Formats it according to configuration (indentation, line width, etc.)
116
+ - Verifies semantics are unchanged (paranoid mode)
117
+ - Writes the formatted output back to the file
128
118
 
129
- **Auto-fix**: Not yet implemented (requires psych-pure enhancements)
119
+ ### Formatting Rules
130
120
 
131
- ### consistent_indentation
121
+ The formatter enforces:
122
+ - **Consistent indentation** (default: 2 spaces)
123
+ - **Block style for arrays and mappings** (never flow style like `[a, b, c]`)
124
+ - **Normalized string quoting** (only quotes when necessary)
125
+ - **Proper line breaks** between top-level keys
132
126
 
133
- Detects inconsistent indentation (mixing 2-space, 4-space, etc.) in YAML files.
127
+ ### Comment Preservation
134
128
 
135
- ```yaml
136
- # BAD (inconsistent: 4 and 8 spaces)
137
- database:
138
- host: "localhost"
139
- config:
140
- timeout: 30
141
-
142
- # GOOD (consistent: 2 spaces)
143
- database:
144
- host: "localhost"
145
- config:
146
- timeout: 30
147
- ```
129
+ Comments are preserved in most locations:
130
+ - Leading comments (before keys)
131
+ - Trailing comments (after values)
132
+ - Mid-document comments (between keys)
133
+
134
+ Known limitation: Inline comments on mapping keys (e.g., `servers: # comment`)
135
+ may be repositioned as leading comments on the next key due to psych-pure's
136
+ comment tracking.
137
+
138
+ ### Safety
148
139
 
149
- **Auto-fix**: Yes, normalizes to configured indentation (default: 2 spaces)
140
+ All formatting changes are verified with paranoid mode: the original YAML and
141
+ formatted YAML are both parsed and compared for semantic equality. If they
142
+ differ, the tool errors out instead of writing the file.
150
143
 
151
144
  ## Development
152
145
 
@@ -164,12 +157,12 @@ bundle exec rake test
164
157
  ### Test Coverage
165
158
 
166
159
  Integration tests verify:
167
- - Comment preservation during fixes
160
+ - Comment preservation during formatting
168
161
  - Indentation normalization
169
162
  - Paranoid mode (semantic verification)
170
- - Config loading and rule enable/disable
171
- - Multi-line certificate detection
172
- - Clean files pass without violations
163
+ - Config loading and overrides
164
+ - Parse error detection
165
+ - Idempotent formatting (clean files pass without violations)
173
166
 
174
167
  ## Background
175
168
 
data/bin/yaml-janitor CHANGED
@@ -7,27 +7,34 @@ def print_usage
7
7
  puts <<~USAGE
8
8
  Usage: yaml-janitor [options] <file_or_directory>
9
9
 
10
+ yaml-janitor is a YAML linter and formatter that preserves comments.
11
+
10
12
  Options:
11
- --fix Auto-fix issues where possible
12
- --rules RULES Comma-separated list of rules (default: all)
13
+ --fix Format files in-place (without this, just check)
14
+ --diff Show diff of formatting changes
13
15
  --config PATH Path to config file (default: .yaml-janitor.yml)
14
16
  --indentation N Number of spaces for indentation (default: 2)
15
17
  --line-width N Maximum line width (default: 80)
16
18
  --help Show this help message
17
19
 
18
20
  Examples:
21
+ # Check files (report issues)
19
22
  yaml-janitor config.yml
23
+ yaml-janitor containers/
24
+
25
+ # Format files in-place
20
26
  yaml-janitor --fix config.yml
21
- yaml-janitor --rules multiline_certificate containers/
22
- yaml-janitor --config my-config.yml --fix config.yml
23
- yaml-janitor --indentation 4 --line-width 100 config.yml
27
+ yaml-janitor --fix --indentation 4 containers/
28
+
29
+ # Show diff of formatting changes
30
+ yaml-janitor --diff config.yml
24
31
  USAGE
25
32
  exit 0
26
33
  end
27
34
 
28
35
  # Parse args
29
36
  fix = false
30
- rules = :all
37
+ diff = false
31
38
  config_path = nil
32
39
  config_overrides = {}
33
40
  paths = []
@@ -37,9 +44,8 @@ while i < ARGV.length
37
44
  case ARGV[i]
38
45
  when "--fix"
39
46
  fix = true
40
- when "--rules"
41
- i += 1
42
- rules = ARGV[i].split(",").map(&:to_sym)
47
+ when "--diff"
48
+ diff = true
43
49
  when "--config"
44
50
  i += 1
45
51
  config_path = ARGV[i]
@@ -69,10 +75,11 @@ end
69
75
 
70
76
  # Process files
71
77
  config = YamlJanitor::Config.new(config_path: config_path, overrides: config_overrides)
72
- linter = YamlJanitor::Linter.new(rules: rules, config: config)
78
+ linter = YamlJanitor::Linter.new(config: config)
73
79
  total_files = 0
74
- total_violations = 0
75
80
  files_with_violations = []
81
+ formatted_files = []
82
+ failed_files = []
76
83
 
77
84
  paths.each do |path|
78
85
  if File.directory?(path)
@@ -82,18 +89,23 @@ paths.each do |path|
82
89
  total_files += 1
83
90
  result = linter.lint_file(file, fix: fix)
84
91
 
85
- if result[:violations].any?
92
+ if result[:error]
93
+ failed_files << { file: file, error: result[:error] }
94
+ puts "✗ #{file}: #{result[:error].message}"
95
+ elsif result[:violations].any?
86
96
  files_with_violations << file
87
- total_violations += result[:violations].length
88
-
89
- puts "\n#{file}:"
90
- result[:violations].each do |violation|
91
- puts " #{violation}"
92
- end
93
-
94
- if fix && result[:fixed]
95
- puts " Fixed"
97
+ if fix
98
+ formatted_files << file
99
+ puts "#{file} (formatted)"
100
+ elsif diff
101
+ puts "#{file}: needs formatting"
102
+ puts linter.generate_diff(result[:original], result[:formatted], file)
103
+ puts ""
104
+ else
105
+ puts " #{file}: needs formatting"
96
106
  end
107
+ elsif !fix && !diff
108
+ puts "✓ #{file}"
97
109
  end
98
110
  end
99
111
  elsif File.file?(path)
@@ -101,18 +113,23 @@ paths.each do |path|
101
113
  total_files += 1
102
114
  result = linter.lint_file(path, fix: fix)
103
115
 
104
- if result[:violations].any?
116
+ if result[:error]
117
+ failed_files << { file: path, error: result[:error] }
118
+ puts "✗ #{path}: #{result[:error].message}"
119
+ elsif result[:violations].any?
105
120
  files_with_violations << path
106
- total_violations += result[:violations].length
107
-
108
- puts "\n#{path}:"
109
- result[:violations].each do |violation|
110
- puts " #{violation}"
111
- end
112
-
113
- if fix && result[:fixed]
114
- puts " Fixed"
121
+ if fix
122
+ formatted_files << path
123
+ puts "#{path} (formatted)"
124
+ elsif diff
125
+ puts "#{path}: needs formatting"
126
+ puts linter.generate_diff(result[:original], result[:formatted], path)
127
+ puts ""
128
+ else
129
+ puts " #{path}: needs formatting"
115
130
  end
131
+ elsif !fix && !diff
132
+ puts "✓ #{path}"
116
133
  end
117
134
  else
118
135
  puts "Warning: #{path} not found"
@@ -121,10 +138,21 @@ end
121
138
 
122
139
  # Summary
123
140
  puts "\n" + "="*60
124
- puts "Checked #{total_files} files"
125
- puts "Found #{total_violations} violations in #{files_with_violations.length} files"
141
+ if fix
142
+ puts "Formatted #{formatted_files.length}/#{total_files} files"
143
+ else
144
+ puts "Checked #{total_files} files"
145
+ puts "#{files_with_violations.length} files need formatting"
146
+ end
126
147
 
127
- if total_violations > 0
148
+ if failed_files.any?
149
+ puts "\nFailed files:"
150
+ failed_files.each do |failure|
151
+ puts " #{failure[:file]}: #{failure[:error].message}"
152
+ end
153
+ exit 1
154
+ elsif files_with_violations.any? && !fix
155
+ puts "\nRun with --fix to format these files"
128
156
  exit 1
129
157
  else
130
158
  puts "✓ All files clean!"
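Note on the summary hunk above: the exit status now depends on failures and on whether `--fix` was passed, not on a violation count. A minimal sketch of that decision as a pure function follows. The helper name `exit_status` and its keyword arguments are hypothetical, and it assumes the final `else` branch ends with a zero exit status, which the hunk does not show.

```ruby
# Hypothetical helper mirroring the summary branch of bin/yaml-janitor.
# It returns the status code instead of calling exit directly.
def exit_status(failed_count:, violation_count:, fix:)
  # Files that raised an error (parse failure, failed verification) always fail the run.
  return 1 if failed_count.positive?
  # Files that need formatting fail a check-only or --diff run.
  return 1 if violation_count.positive? && !fix
  # Assumption: a clean run, or a --fix run without errors, exits with 0.
  0
end
```

Under this reading, a CI job can run the tool without `--fix` and treat a non-zero status as "formatting needed or a file failed".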
data/lib/yaml_janitor/config.rb CHANGED
@@ -7,7 +7,6 @@ module YamlJanitor
7
7
  DEFAULT_CONFIG = {
8
8
  indentation: 2,
9
9
  line_width: 80,
10
- sequence_indent: false,
11
10
  rules: {
12
11
  multiline_certificate: { enabled: true },
13
12
  consistent_indentation: { enabled: true }
@@ -30,10 +29,6 @@ module YamlJanitor
30
29
  @config[:line_width]
31
30
  end
32
31
 
33
- def sequence_indent
34
- @config[:sequence_indent]
35
- end
36
-
37
32
  def rule_enabled?(rule_name)
38
33
  rule_config = @config[:rules][rule_name.to_sym]
39
34
  rule_config && rule_config[:enabled] != false
@@ -46,8 +41,7 @@ module YamlJanitor
46
41
  def dump_options
47
42
  {
48
43
  indentation: indentation,
49
- line_width: line_width,
50
- sequence_indent: sequence_indent
44
+ line_width: line_width
51
45
  }
52
46
  end
53
47
 
data/lib/yaml_janitor/emitter.rb ADDED
@@ -0,0 +1,289 @@
1
+ # frozen_string_literal: true
2
+
3
+ module YamlJanitor
4
+ # Emitter takes a loaded YAML document (with comments) and formats it
5
+ # according to configuration rules. Unlike Psych::Pure.dump, we have
6
+ # complete control over formatting choices.
7
+ class Emitter
8
+ def initialize(node, config)
9
+ @node = node
10
+ @config = config
11
+ @output = []
12
+ end
13
+
14
+ def emit
15
+ # Emit any leading comments on the root document
16
+ emit_comments(get_comments(@node, :leading), 0)
17
+
18
+ emit_document(@node)
19
+ @output.join("\n") + "\n"
20
+ end
21
+
22
+ private
23
+
24
+ def emit_document(node, indent: 0)
25
+ case node
26
+ when Psych::Pure::LoadedHash
27
+ emit_mapping(node, indent)
28
+ when Hash
29
+ emit_mapping(node, indent)
30
+ when Psych::Pure::LoadedObject
31
+ # Check if it wraps an array
32
+ inner = node.__getobj__
33
+ if inner.is_a?(Array)
34
+ emit_sequence(inner, indent, loaded_object: node)
35
+ else
36
+ emit_node(inner, indent)
37
+ end
38
+ when Array
39
+ emit_sequence(node, indent)
40
+ else
41
+ emit_scalar(node, indent)
42
+ end
43
+ end
44
+
45
+ def emit_mapping(hash, indent)
46
+ # Use psych_keys if available (LoadedHash), otherwise fall back to regular iteration
47
+ entries = if hash.respond_to?(:psych_keys)
48
+ hash.psych_keys.map { |pk| [pk.key_node, pk.value_node] }
49
+ else
50
+ hash.to_a
51
+ end
52
+
53
+ entries.each_with_index do |(key, value), index|
54
+ # Add blank line between top-level keys if configured
55
+ actual_value = value.is_a?(Psych::Pure::LoadedObject) ? value.__getobj__ : value
56
+ @output << "" if index > 0 && indent == 0 && should_add_blank_line?(actual_value)
57
+
58
+ # Emit any leading comments
59
+ emit_comments(get_comments(key, :leading), indent)
60
+
61
+ # Emit the key-value pair
62
+ key_str = scalar_to_string(key.is_a?(Psych::Pure::LoadedObject) ? key.__getobj__ : key)
63
+
64
+ # Unwrap LoadedObject to check the actual type
65
+ actual_value = value.is_a?(Psych::Pure::LoadedObject) ? value.__getobj__ : value
66
+
67
+ case actual_value
68
+ when Hash, Psych::Pure::LoadedHash, Array
69
+ # Complex value - put on next line
70
+ line = "#{' ' * indent}#{key_str}:"
71
+
72
+ # Check for inline comment on the value
73
+ if (trailing = get_comments(value, :trailing))
74
+ inline = trailing.find { |c| c.inline? }
75
+ if inline
76
+ line += " #{inline.value}"
77
+ trailing = trailing.reject { |c| c.inline? }
78
+ end
79
+ end
80
+
81
+ @output << line
82
+ emit_node(value, indent + indentation)
83
+
84
+ # Emit any non-inline trailing comments
85
+ emit_comments(trailing, indent) if trailing&.any?
86
+ else
87
+ # Simple value - same line
88
+ value_str = scalar_to_string(actual_value)
89
+ line = "#{' ' * indent}#{key_str}: #{value_str}"
90
+
91
+ # Check for inline comment on the value
92
+ if (trailing = get_comments(value, :trailing))
93
+ inline = trailing.find { |c| c.inline? }
94
+ line += " #{inline.value}" if inline
95
+ end
96
+
97
+ @output << line
98
+ end
99
+
100
+ # Emit any trailing comments on the key itself
101
+ emit_comments(get_comments(key, :trailing), indent)
102
+ end
103
+ end
104
+
105
+ def emit_sequence(array, indent, loaded_object: nil)
106
+ array.each_with_index do |item, index|
107
+ # Emit any leading comments (check both the item and the LoadedObject wrapper)
108
+ comments = get_comments(item, :leading) || (loaded_object ? get_comments(loaded_object, :leading) : nil)
109
+ emit_comments(comments, indent)
110
+
111
+ case item
112
+ when Hash, Psych::Pure::LoadedHash
113
+ # Complex item - use compact style (dash on same line as first key)
114
+ emit_compact_hash_item(item, indent)
115
+ when Array
116
+ # Nested array
117
+ @output << "#{' ' * indent}-"
118
+ emit_node(item, indent + indentation)
119
+ else
120
+ # Simple item - same line
121
+ item_str = scalar_to_string(item)
122
+ @output << "#{' ' * indent}- #{item_str}"
123
+ end
124
+
125
+ # Emit any trailing comments
126
+ emit_comments(get_comments(item, :trailing), indent)
127
+ end
128
+ end
129
+
130
+ def emit_compact_hash_item(hash, indent)
131
+ # Emit hash as array item in compact style:
132
+ # - key1: value1
133
+ # key2: value2
134
+
135
+ # Use psych_keys if available (LoadedHash), otherwise fall back to regular iteration
136
+ entries = if hash.respond_to?(:psych_keys)
137
+ hash.psych_keys.map { |pk| [pk.key_node, pk.value_node] }
138
+ else
139
+ hash.to_a
140
+ end
141
+
142
+ entries.each_with_index do |(key, value), index|
143
+ # Emit any leading comments
144
+ emit_comments(get_comments(key, :leading), indent + (index > 0 ? indentation : 0))
145
+
146
+ # Get the actual key and value (unwrap LoadedObject)
147
+ key_str = scalar_to_string(key.is_a?(Psych::Pure::LoadedObject) ? key.__getobj__ : key)
148
+ actual_value = value.is_a?(Psych::Pure::LoadedObject) ? value.__getobj__ : value
149
+
150
+ # First item gets the dash, rest are indented
151
+ prefix = index == 0 ? "#{' ' * indent}- " : "#{' ' * (indent + indentation)}"
152
+
153
+ case actual_value
154
+ when Hash, Psych::Pure::LoadedHash, Array
155
+ # Complex value - put on next line
156
+ line = "#{prefix}#{key_str}:"
157
+
158
+ # Check for inline comment on the value
159
+ if (trailing = get_comments(value, :trailing))
160
+ inline = trailing.find { |c| c.inline? }
161
+ if inline
162
+ line += " #{inline.value}"
163
+ trailing = trailing.reject { |c| c.inline? }
164
+ end
165
+ end
166
+
167
+ @output << line
168
+ emit_node(value, indent + indentation * 2)
169
+
170
+ # Emit any non-inline trailing comments
171
+ emit_comments(trailing, indent + indentation) if trailing&.any?
172
+ else
173
+ # Simple value - same line
174
+ value_str = scalar_to_string(actual_value)
175
+ line = "#{prefix}#{key_str}: #{value_str}"
176
+
177
+ # Check for inline comment on the value
178
+ if (trailing = get_comments(value, :trailing))
179
+ inline = trailing.find { |c| c.inline? }
180
+ line += " #{inline.value}" if inline
181
+ end
182
+
183
+ @output << line
184
+ end
185
+
186
+ # Emit any trailing comments on the key itself
187
+ emit_comments(get_comments(key, :trailing), indent + (index > 0 ? indentation : 0))
188
+ end
189
+ end
190
+
191
+ def emit_node(node, indent)
192
+ case node
193
+ when Psych::Pure::LoadedHash, Hash
194
+ emit_mapping(node, indent)
195
+ when Psych::Pure::LoadedObject
196
+ emit_node(node.__getobj__, indent)
197
+ when Array
198
+ emit_sequence(node, indent)
199
+ else
200
+ @output << "#{' ' * indent}#{scalar_to_string(node)}"
201
+ end
202
+ end
203
+
204
+ def emit_scalar(value, indent)
205
+ @output << "#{' ' * indent}#{scalar_to_string(value)}"
206
+ end
207
+
208
+ def scalar_to_string(value)
209
+ case value
210
+ when String
211
+ format_string(value)
212
+ when Symbol
213
+ ":#{value}"
214
+ when NilClass
215
+ "null"
216
+ when TrueClass, FalseClass
217
+ value.to_s
218
+ when Numeric
219
+ value.to_s
220
+ else
221
+ value.to_s
222
+ end
223
+ end
224
+
225
+ def format_string(str)
226
+ # Choose appropriate string style
227
+ if str.include?("\n")
228
+ # Multi-line string - use literal block scalar
229
+ format_literal_string(str)
230
+ elsif needs_quoting?(str)
231
+ # Quote if necessary
232
+ if str.include?('"') && !str.include?("'")
233
+ "'#{str.gsub("'", "''")}'"
234
+ else
235
+ "\"#{str.gsub('"', '\\"')}\""
236
+ end
237
+ else
238
+ str
239
+ end
240
+ end
241
+
242
+ def format_literal_string(str)
243
+ # For now, just quote it - we can enhance this later
244
+ "\"#{str.gsub('"', '\\"').gsub("\n", '\\n')}\""
245
+ end
246
+
247
+ def needs_quoting?(str)
248
+ # Basic rules - quote if:
249
+ # - Starts/ends with whitespace
250
+ # - Contains : or # or special chars
251
+ # - Looks like a boolean/null/number
252
+ return true if str.match?(/\A\s|\s\z/)
253
+ return true if str.match?(/[:#\[\]{}]/)
254
+ return true if str.match?(/\A(true|false|null|~|yes|no|on|off)\z/i)
255
+ return true if str.match?(/\A[-+]?[0-9]/)
256
+ false
257
+ end
258
+
259
+ def emit_comments(comments, indent)
260
+ return unless comments&.any?
261
+
262
+ comments.each do |comment|
263
+ @output << "#{' ' * indent}#{comment.value}"
264
+ end
265
+ end
266
+
267
+ def get_comments(node, type)
268
+ return nil unless node.respond_to?(:psych_node)
269
+ return nil unless node.psych_node.respond_to?(:comments?)
270
+ return nil unless node.psych_node.comments?
271
+
272
+ case type
273
+ when :leading
274
+ node.psych_node.comments.leading
275
+ when :trailing
276
+ node.psych_node.comments.trailing
277
+ end
278
+ end
279
+
280
+ def should_add_blank_line?(value)
281
+ # Add blank line before complex structures
282
+ value.is_a?(Hash) || value.is_a?(Array)
283
+ end
284
+
285
+ def indentation
286
+ @config.indentation
287
+ end
288
+ end
289
+ end
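Note on the quoting rules in the emitter above: `needs_quoting?` decides when a string is quoted, and the double-quote branch escapes only `"`. The sketch below copies the four checks from the diff and pairs them with a round-trip test through the standard YAML parser. The helpers `double_quote`, `render_scalar`, and `round_trips?` are hypothetical. Escaping backslashes is an assumption added here, not something the emitter does. Only single-line strings are covered.

```ruby
require "yaml"

# The same four checks as needs_quoting? in the emitter diff.
def needs_quoting?(str)
  return true if str.match?(/\A\s|\s\z/)
  return true if str.match?(/[:#\[\]{}]/)
  return true if str.match?(/\A(true|false|null|~|yes|no|on|off)\z/i)
  return true if str.match?(/\A[-+]?[0-9]/)
  false
end

# Assumption: escape backslashes first, then double quotes, so the result
# is a valid YAML double-quoted scalar.
def double_quote(str)
  escaped = str.gsub("\\") { "\\\\" }.gsub('"') { '\\"' }
  "\"#{escaped}\""
end

# Render a single-line string as a YAML scalar, quoting only when required.
def render_scalar(str)
  needs_quoting?(str) ? double_quote(str) : str
end

# A rendered scalar is acceptable only if it loads back as the same String.
def round_trips?(str)
  YAML.safe_load("key: #{render_scalar(str)}\n")["key"] == str
end
```

A check such as `round_trips?('C:\temp')` exercises the backslash case: without the extra escape, `\t` inside double quotes would be read as a tab.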
data/lib/yaml_janitor/linter.rb CHANGED
@@ -2,11 +2,8 @@
2
2
 
3
3
  module YamlJanitor
4
4
  class Linter
5
- attr_reader :rules
6
-
7
- def initialize(rules: :all, config: nil, config_path: nil)
5
+ def initialize(config: nil, config_path: nil)
8
6
  @config = config || Config.new(config_path: config_path)
9
- @rules = load_rules(rules)
10
7
  end
11
8
 
12
9
  # Lint a file, optionally fixing issues
@@ -28,22 +25,24 @@ module YamlJanitor
28
25
  # Load with comments
29
26
  loaded = Psych::Pure.load(yaml_content, comments: true)
30
27
 
31
- # Check for violations
32
- @rules.each do |rule|
33
- violations += rule.check(loaded, file: file)
28
+ # Format using our custom emitter
29
+ formatted = Emitter.new(loaded, @config).emit
30
+
31
+ # Check if formatting would change the file
32
+ if yaml_content != formatted
33
+ violations << Violation.new(
34
+ rule: :formatting,
35
+ message: "File needs formatting (indentation, style, or whitespace issues)",
36
+ file: file
37
+ )
34
38
  end
35
39
 
36
40
  # Apply fixes if requested
37
41
  output = yaml_content
38
42
  fixed = false
39
43
 
40
- if fix && violations.any?
41
- @rules.each do |rule|
42
- rule.fix!(loaded)
43
- end
44
-
45
- # Dump back to YAML with configured options
46
- output = Psych::Pure.dump(loaded, **@config.dump_options)
44
+ if fix
45
+ output = formatted
47
46
  fixed = true
48
47
 
49
48
  # Paranoid mode: verify semantics match
@@ -53,7 +52,9 @@ module YamlJanitor
53
52
  {
54
53
  violations: violations,
55
54
  fixed: fixed,
56
- output: output
55
+ output: output,
56
+ original: yaml_content,
57
+ formatted: formatted
57
58
  }
58
59
  rescue => e
59
60
  {
@@ -68,34 +69,47 @@ module YamlJanitor
68
69
  }
69
70
  end
70
71
 
71
- private
72
-
73
- def load_rules(rule_specs)
74
- available_rules = {
75
- multiline_certificate: Rules::MultilineCertificate,
76
- consistent_indentation: Rules::ConsistentIndentation
77
- }
78
-
79
- if rule_specs == :all
80
- # Load all enabled rules from config
81
- rule_names = available_rules.keys.select do |name|
82
- @config.rule_enabled?(name)
72
+ # Generate unified diff between original and formatted content
73
+ def generate_diff(original, formatted, path)
74
+ require 'tempfile'
75
+
76
+ # Write to temp files and use system diff
77
+ Tempfile.create(['original', '.yml']) do |orig_file|
78
+ Tempfile.create(['formatted', '.yml']) do |fmt_file|
79
+ orig_file.write(original)
80
+ orig_file.flush
81
+ fmt_file.write(formatted)
82
+ fmt_file.flush
83
+
84
+ # Use git diff if available (better formatting), fall back to diff
85
+ diff_cmd = if system('which git > /dev/null 2>&1')
86
+ "git diff --no-index --color=always #{orig_file.path} #{fmt_file.path}"
87
+ else
88
+ "diff -u #{orig_file.path} #{fmt_file.path}"
89
+ end
90
+
91
+ diff_output = `#{diff_cmd}`
92
+
93
+ # Replace temp file paths with actual path
94
+ # Git adds a/ and b/ prefixes (or just a/b for temp files)
95
+ orig_path_pattern = Regexp.escape(orig_file.path)
96
+ fmt_path_pattern = Regexp.escape(fmt_file.path)
97
+
98
+ # Handle various git diff formats
99
+ diff_output.gsub(/a\/#{orig_path_pattern}/, path)
100
+ .gsub(/b\/#{fmt_path_pattern}/, "#{path} (formatted)")
101
+ .gsub(/a#{orig_path_pattern}/, path)
102
+ .gsub(/b#{fmt_path_pattern}/, "#{path} (formatted)")
103
+ .gsub(/#{orig_path_pattern}/, path)
104
+ .gsub(/#{fmt_path_pattern}/, "#{path} (formatted)")
83
105
  end
84
- elsif rule_specs.is_a?(Array)
85
- rule_names = rule_specs
86
- else
87
- raise Error, "Invalid rules specification: #{rule_specs.inspect}"
88
106
  end
89
-
90
- rule_names.map do |name|
91
- rule_class = available_rules[name.to_sym]
92
- raise Error, "Unknown rule: #{name}" unless rule_class
93
- next unless @config.rule_enabled?(name)
94
-
95
- rule_class.new(@config.rule_config(name))
96
- end.compact
107
+ rescue => e
108
+ "Error generating diff: #{e.message}"
97
109
  end
98
110
 
111
+ private
112
+
99
113
  def verify_semantics!(original, fixed)
100
114
  original_data = YAML.load(original)
101
115
  fixed_data = YAML.load(fixed)
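Note on paranoid mode: the `verify_semantics!` hunk above is cut off after the two `YAML.load` calls, and the README describes the check as parsing both versions and comparing them for semantic equality. A minimal sketch of that idea, using only the standard library, follows. The names `verify_same_data!` and `SemanticMismatch` are hypothetical; the gem defines its own `YamlJanitor::SemanticMismatchError`, and its actual comparison may differ.

```ruby
require "yaml"

# Hypothetical error class for this sketch only.
class SemanticMismatch < StandardError; end

# Parse both documents and compare the resulting Ruby data structures.
# Comments and layout are invisible to this check; only the data counts.
def verify_same_data!(original, formatted)
  before = YAML.safe_load(original, aliases: true)
  after = YAML.safe_load(formatted, aliases: true)
  return true if before == after

  raise SemanticMismatch, "formatted YAML no longer parses to the same data"
end
```

Rewriting a flow sequence as a block sequence passes this check, while quoting a number (for example turning `8080` into `"8080"`) fails it, because the parsed value changes from an Integer to a String.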
data/lib/yaml_janitor/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module YamlJanitor
4
- VERSION = "20251113"
4
+ VERSION = "20251115"
5
5
  end
data/lib/yaml_janitor.rb CHANGED
@@ -5,11 +5,9 @@ require "yaml"
5
5
 
6
6
  require_relative "yaml_janitor/version"
7
7
  require_relative "yaml_janitor/config"
8
+ require_relative "yaml_janitor/emitter"
8
9
  require_relative "yaml_janitor/linter"
9
- require_relative "yaml_janitor/rule"
10
10
  require_relative "yaml_janitor/violation"
11
- require_relative "yaml_janitor/rules/multiline_certificate"
12
- require_relative "yaml_janitor/rules/consistent_indentation"
13
11
 
14
12
  module YamlJanitor
15
13
  class Error < StandardError; end
@@ -17,16 +15,16 @@ module YamlJanitor
17
15
  class SemanticMismatchError < Error; end
18
16
 
19
17
  class << self
20
- # Convenience method to lint a file
21
- def lint_file(path, rules: :all, fix: false)
22
- linter = Linter.new(rules: rules)
23
- linter.lint_file(path, fix: fix)
18
+ # Convenience method to format a file
19
+ def format_file(path, config: nil)
20
+ linter = Linter.new(config: config)
21
+ linter.lint_file(path, fix: true)
24
22
  end
25
23
 
26
- # Convenience method to lint a string
27
- def lint(yaml_string, rules: :all, fix: false)
28
- linter = Linter.new(rules: rules)
29
- linter.lint(yaml_string, fix: fix)
24
+ # Convenience method to format a string
25
+ def format(yaml_string, config: nil)
26
+ linter = Linter.new(config: config)
27
+ linter.lint(yaml_string, fix: true)
30
28
  end
31
29
  end
32
30
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: yaml-janitor
3
3
  version: !ruby/object:Gem::Version
4
- version: '20251113'
4
+ version: '20251115'
5
5
  platform: ruby
6
6
  authors:
7
7
  - ducks
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2025-11-13 00:00:00.000000000 Z
11
+ date: 2025-11-16 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: psych-pure
@@ -66,6 +66,7 @@ files:
66
66
  - bin/yaml-janitor
67
67
  - lib/yaml_janitor.rb
68
68
  - lib/yaml_janitor/config.rb
69
+ - lib/yaml_janitor/emitter.rb
69
70
  - lib/yaml_janitor/linter.rb
70
71
  - lib/yaml_janitor/rule.rb
71
72
  - lib/yaml_janitor/rules/consistent_indentation.rb