json-repair 0.3.0 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: db2b6fb7849a2e75329405c1f85fa7de836b0fa2f079623032571f42d359514d
-   data.tar.gz: 1c845714c4c443bad3c9277a2ceae6cef8ff346125f52f89473aaa50b9ff2132
+   metadata.gz: 83ede681caac8ebbc294902937c8af11ebc90fe066beed41e15013782fbd0b96
+   data.tar.gz: da309d72a1c564f05dd405c02115985572c38c8b2c28aad85f778b94cc2a68e9
  SHA512:
-   metadata.gz: 53929154af31033e2f380ed89979430f4339c97c94c088b6f85da27ac251d658b98840e44085c4ba9b4972bab75c1bb0f8ad750beddd4bb79e439efb135e0386
-   data.tar.gz: b4b5150aee81c518eaee8847bb2f5d8d8131a15719bb93badce465a2d447ddc361888155b2d33125fdd69d2568424c08440772c61a7f7f5b35922a4d1270adf8
+   metadata.gz: da63df705106d20701700dd55712ed06676cdb4e6615f36d915cae12f3bdf7ac085ea919a553c77e8e0876663e5ad7d0067b46b00fddd3a7cca837eb38e54fc8
+   data.tar.gz: d4471c07b5a3a39505ec890b8afc359ca12fa3042c953143675af2d88bd8a4f56fe6225f00e0d4665eaf66465862f2b273929a2bb7fd54a81f79f1abf39ecfa1
data/CHANGELOG.md CHANGED
Binary file
data/CLAUDE.md ADDED
@@ -0,0 +1,67 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Commands
+
+ - `bin/setup` — install dependencies via Bundler.
+ - `bundle exec rake` — default task; runs both RSpec and RuboCop.
+ - `bundle exec rspec` — run the test suite.
+ - `bundle exec rspec spec/json_spec.rb:42` — run a single example by line number; nearly all behavioral specs live in `spec/json_spec.rb`.
+ - `bundle exec rubocop` — lint. Project-specific exclusions in `.rubocop.yml` deliberately disable several `Metrics/*` cops for `lib/json/repairer.rb` and `lib/json/repair/string_utils.rb` because the parser is long by design — don't try to "fix" it by chopping methods up.
+ - `bin/console` — IRB with the gem preloaded.
+ - `bundle exec rake install` / `bundle exec rake release` — local install / publish to rubygems.org.
+ - Type checking: `Steepfile` checks `lib/` against `sig/`. Run `bundle exec steep check` if/when steep is installed (not in `Gemfile` by default).
+
+ Ruby `>= 3.0.0` is required (per gemspec). CI runs against Ruby 3.3.1.
+
+ ## Architecture
+
+ This gem is a **Ruby port of the [josdejong/jsonrepair](https://github.com/josdejong/jsonrepair) TypeScript library**. The upstream version currently mirrored is tracked in `CHANGELOG.md` (presently v3.14.0). When syncing upstream changes, the goal is parity with the JS implementation, not idiomatic refactoring — keep method names, control flow, and repair heuristics aligned with the JS source so future syncs stay tractable.
+
+ ### Entry point
+
+ `JSON.repair(str)` in `lib/json/repair.rb` is a thin wrapper that constructs `JSON::Repairer.new(str).repair`. `JSON::JSONRepairError` is the only error raised for unrecoverable inputs.
+
+ ### The parser (`lib/json/repairer.rb`)
+
+ A single-pass, hand-written recursive-descent parser. State is three instance variables:
+
+ - `@json` — the input string (read-only after init).
+ - `@index` — the current cursor into `@json`.
+ - `@output` — a mutable `+''` buffer that accumulates the *repaired* JSON. The parser writes directly to `@output` as it walks; it does not build an AST.
+
+ Each `parse_*` method (`parse_value`, `parse_object`, `parse_array`, `parse_string`, `parse_number`, `parse_keywords`, `parse_unquoted_string`, `parse_regex`, `parse_comment`, `parse_markdown_code_block`, …) follows a contract:
+
+ 1. Returns truthy if it consumed something, falsy otherwise.
+ 2. On success, advances `@index` past the consumed input and appends the *valid* JSON form to `@output`.
+ 3. On a recoverable mismatch (missing quote, missing comma, trailing comma, wrong quote style, etc.) it performs an in-place repair on `@output` using helpers like `insert_before_last_whitespace`, `strip_last_occurrence`, or `remove_at_index`.
+ 4. On an unrecoverable error it calls one of the `throw_*` helpers, which raise `JSON::JSONRepairError`.
+
+ Two patterns recur and are worth knowing before editing:
+
+ - **Backtracking via snapshots.** Methods like `parse_string` capture `i_before = @index` and `o_before = @output.length` before tentatively consuming input. If a later check (e.g. "the end quote turned out not to be a real end quote") fails, they restore both and re-invoke themselves with different flags (e.g. `stop_at_delimiter: true`, `stop_at_index: …`). Preserve this pattern when modifying string/number parsing.
+ - **Repair-by-rewriting-tail.** Helpers like `insert_before_last_whitespace(@output, ',')` and `@output = strip_last_occurrence(@output, ',')` patch the already-emitted output to fix things like missing or trailing commas. These run *after* the malformed input has been partially emitted — they are the mechanism for "I now realize that earlier token needed a comma after it."
+
+ `repair` (the public method) drives `parse_value` then handles top-level concerns: stripping Markdown fences (` ```json ... ``` `), converting newline-delimited JSON at the root into an array, dropping redundant trailing braces/brackets, and rejecting any non-whitespace trailing garbage.
+
+ ### Shared helpers (`lib/json/repair/string_utils.rb`)
+
+ `JSON::Repair::StringUtils` is a mixin included into `Repairer`. It holds:
+
+ - Character constants (`OPENING_BRACE`, `BACKSLASH`, smart-quote variants, special whitespace code points, etc.) used in lieu of magic literals.
+ - Character-class predicates (`digit?`, `hex?`, `quote?`, `single_quote_like?`, `delimiter?`, `whitespace?`, `special_whitespace?`, `start_of_value?`, …).
+ - The keyword machinery — `parse_keywords` / `parse_keyword` — which converts Python `True`/`False`/`None` and Ruby `nil` into their JSON equivalents in addition to recognizing `true`/`false`/`null`.
+ - Output-buffer surgery helpers: `strip_last_occurrence`, `insert_before_last_whitespace`, `remove_at_index`, `ends_with_comma_or_newline?`.
+
+ Because the mixin reads `@json`, `@index`, and `@output` directly (notably inside `parse_keyword`), it is **not standalone** — it is coupled to `Repairer`'s state and should only be mixed into classes that own those ivars.
+
+ ### Type signatures (`sig/`)
+
+ RBS signatures mirror the public surface of `JSON.repair`, `JSON::Repairer`, and `JSON::Repair::StringUtils`. Update them in lockstep with `lib/` changes; the `Steepfile` will surface drift.
+
+ ### Test layout
+
+ - `spec/json_spec.rb` — the substantive behavioral suite (700+ examples covering every repair heuristic). New behavior — and every sync from upstream — belongs here.
+ - `spec/json/repair_spec.rb` — sanity check on `JSON::Repair::VERSION` only.
+ - `.rspec_status` is committed and tracks per-example pass/fail so `--only-failures` / `--next-failure` work across runs.
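Note on the "repair-by-rewriting-tail" pattern described in the CLAUDE.md above: the following is a minimal Ruby sketch of how patching an already-emitted output buffer can work. Only the two helper names come from CLAUDE.md. The module name `TailRepairSketch` and both method bodies are assumptions written for illustration, and the gem's real helpers in `lib/json/repair/string_utils.rb` may behave differently.

```ruby
# Illustrative stand-ins only; NOT the gem's implementation.
module TailRepairSketch
  module_function

  # Insert `text` in front of any run of trailing whitespace in `output`,
  # so a late comma lands right after the last emitted token.
  def insert_before_last_whitespace(output, text)
    trailing = output[/\s*\z/] # "" when there is no trailing whitespace
    output[0, output.length - trailing.length] + text + trailing
  end

  # Drop the last occurrence of `text` from `output`, if there is one.
  def strip_last_occurrence(output, text)
    index = output.rindex(text)
    return output unless index

    output[0, index] + output[(index + text.length)..]
  end
end

# A value was emitted, then the parser discovers a comma was missing.
emitted = "{\"a\": 1\n  "
with_comma = TailRepairSketch.insert_before_last_whitespace(emitted, ',')
# with_comma == "{\"a\": 1,\n  "

# A trailing comma was emitted, then the closing bracket shows up.
without_comma = TailRepairSketch.strip_last_occurrence('[1, 2, ]', ',')
# without_comma == "[1, 2 ]"
```

Both functions return a new string instead of mutating the buffer, which matches the `@output = strip_last_occurrence(@output, ',')` call shape quoted in CLAUDE.md.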
data/README.md CHANGED
@@ -31,6 +31,21 @@ puts repaired_json # Outputs: {"name": "Alice", "age": 25}
 
  The `repair` method takes a string containing JSON data and returns a corrected version of this string, ensuring it is valid JSON.
 
+ ## Command line
+
+ The gem ships a `json-repair` executable. It reads from stdin or a file and writes to stdout, `--output FILE`, or back over the input file with `--overwrite`.
+
+ ```bash
+ $ echo '{a:1,}' | json-repair
+ {"a":1}
+
+ $ json-repair broken.json
+ $ json-repair broken.json -o fixed.json
+ $ json-repair broken.json --overwrite
+ ```
+
+ Run `json-repair --help` for the full list of options.
+
  ## Development
 
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
data/exe/json-repair ADDED
@@ -0,0 +1,6 @@
+ #!/usr/bin/env ruby
+ # frozen_string_literal: true
+
+ require 'json/repair/cli'
+
+ exit JSON::Repair::CLI.call(ARGV)
data/lib/json/repair/cli.rb ADDED
@@ -0,0 +1,133 @@
+ # frozen_string_literal: true
+
+ require 'fileutils'
+ require 'optparse'
+ require 'tempfile'
+ require_relative '../repair'
+
+ module JSON
+   module Repair
+     class CLI
+       def self.call(argv, stdin: $stdin, stdout: $stdout, stderr: $stderr)
+         new(stdin: stdin, stdout: stdout, stderr: stderr).call(argv)
+       end
+
+       def initialize(stdin: $stdin, stdout: $stdout, stderr: $stderr)
+         @stdin = stdin
+         @stdout = stdout
+         @stderr = stderr
+       end
+
+       # Reset per-invocation state so a single instance can be safely reused
+       # (e.g. `cli = CLI.new; cli.call(['-v']); cli.call(['x'])`).
+       def call(argv)
+         @output_path = @halt = nil
+         @overwrite = false
+         run(argv)
+       rescue OptionParser::ParseError, JSON::JSONRepairError, SystemCallError, IOError,
+              SystemStackError => e
+         @stderr.puts "json-repair: #{e.message}"
+         1
+       end
+
+       private
+
+       def run(argv)
+         positional = catch(:halt) { parser.parse(argv) }
+         return @halt if @halt
+
+         input_path = positional.first
+         return 1 unless validate(positional, input_path)
+
+         repaired = JSON.repair(read_input(input_path))
+         write_output(repaired, input_path)
+         0
+       end
+
+       def validate(positional, input_path)
+         error = validation_error(positional, input_path)
+         return true unless error
+
+         @stderr.puts "json-repair: #{error}"
+         false
+       end
+
+       def validation_error(positional, input_path)
+         return "unexpected argument: #{positional[1]}" if positional.length > 1
+         return '--overwrite requires a filename' if @overwrite && input_path.nil?
+         return '--overwrite and --output are mutually exclusive' if @overwrite && @output_path
+
+         nil
+       end
+
+       def read_input(input_path)
+         raw = input_path ? File.read(input_path) : @stdin.read
+         raw.force_encoding(Encoding::UTF_8)
+         raise JSON::JSONRepairError, 'input is not valid UTF-8' unless raw.valid_encoding?
+
+         raw
+       end
+
+       def write_output(repaired, input_path)
+         if @overwrite
+           replace_in_place(input_path, repaired)
+         elsif @output_path
+           File.write(@output_path, repaired)
+         else
+           @stdout.write(repaired)
+           @stdout.write("\n") unless repaired.end_with?("\n")
+         end
+       end
+
+       # Write to a uniquely-named tempfile alongside the input, then move it
+       # over the original. Tempfile.create uses O_EXCL + a random suffix, so
+       # the temp path is safe against symlink / clobber races; FileUtils.mv
+       # with force: true handles cross-device renames and Windows, where
+       # File.rename cannot overwrite an existing destination. The original
+       # file's mode is preserved (Tempfile defaults to 0600).
+       #
+       # Symlinks are followed via File.realpath so the underlying file is
+       # rewritten in place and the link is left pointing at it; otherwise
+       # the rename would replace the link itself with a regular file.
+       def replace_in_place(input_path, repaired)
+         real_path = File.realpath(input_path)
+         original_mode = File.stat(real_path).mode
+         Tempfile.create(['json-repair', '.tmp'], File.dirname(real_path)) do |tmp|
+           tmp.write(repaired)
+           tmp.close
+           File.chmod(original_mode, tmp.path)
+           FileUtils.mv(tmp.path, real_path, force: true)
+         end
+       end
+
+       def parser
+         OptionParser.new do |opts|
+           opts.banner = 'Usage: json-repair [filename] [options]'
+           opts.separator ''
+           opts.separator 'Repair a broken JSON document. Reads stdin when no filename is given.'
+           opts.separator ''
+           define_options(opts)
+         end
+       end
+
+       OVERWRITE_DESC = 'Replace the input file in place (requires filename; conflicts with --output)'
+       private_constant :OVERWRITE_DESC
+
+       def define_options(opts)
+         opts.on('-o', '--output FILE', 'Write repaired JSON to FILE') { |f| @output_path = f }
+         opts.on('--overwrite', OVERWRITE_DESC) { @overwrite = true }
+         opts.on('-v', '--version', 'Print version and exit') { halt_with(JSON::Repair::VERSION) }
+         opts.on('-h', '--help', 'Print this help and exit') { halt_with(opts.help) }
+       end
+
+       # Print to stdout and short-circuit `parser.parse` so trailing args
+       # after --version/--help do not raise OptionParser::ParseError and
+       # flip the exit code (the option text promises "...and exit").
+       def halt_with(message)
+         @stdout.puts message
+         @halt = 0
+         throw :halt
+       end
+     end
+   end
+ end
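Note on the `catch(:halt)` / `throw :halt` short-circuit used by `halt_with` in the CLI above: the sketch below isolates that technique with only the standard library. The method name `parse_with_halt` and the version string `0.0.0-sketch` are hypothetical and exist only for this illustration; this is not the gem's CLI.

```ruby
require 'optparse'
require 'stringio'

# Hypothetical miniature of the halt pattern; not the gem's code.
# Returns :halted when --version fired, otherwise the positional arguments.
def parse_with_halt(argv, out)
  halted = false
  parser = OptionParser.new do |opts|
    opts.on('-v', '--version', 'Print version and exit') do
      out.puts '0.0.0-sketch'
      halted = true
      throw :halt # unwinds out of parser.parse before later args are seen
    end
  end

  positional = catch(:halt) { parser.parse(argv) }
  halted ? :halted : positional
end

out = StringIO.new
# '--bogus' is never parsed because '-v' throws first.
parse_with_halt(['-v', '--bogus'], out) # => :halted
parse_with_halt(['file.json'], out)     # => ["file.json"]
```

Without the throw, the same `['-v', '--bogus']` input would continue parsing and raise `OptionParser::InvalidOption`, which is the exit-code flip the comment on `halt_with` describes.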
data/lib/json/repair/version.rb CHANGED
@@ -2,6 +2,6 @@
 
  module JSON
    module Repair
-     VERSION = '0.3.0'
+     VERSION = '0.5.0'
    end
  end
data/lib/json/repair.rb CHANGED
@@ -4,7 +4,14 @@ require_relative 'repair/version'
  require_relative 'repairer'
 
  module JSON
-   class JSONRepairError < StandardError; end
+   class JSONRepairError < StandardError
+     attr_reader :position
+
+     def initialize(message = nil, position = nil)
+       super(message && position ? "#{message} at index #{position}" : message)
+       @position = position
+     end
+   end
 
    def self.repair(json)
      Repairer.new(json).repair
data/lib/json/repairer.rb CHANGED
@@ -739,28 +739,28 @@ module JSON
      end
 
      def throw_invalid_character(char)
-       raise JSONRepairError, "Invalid character #{char.inspect} at index #{@index}"
+       raise JSONRepairError.new("Invalid character #{char.inspect}", @index)
      end
 
      def throw_unexpected_character
-       raise JSONRepairError, "Unexpected character #{@json[@index].inspect} at index #{@index}"
+       raise JSONRepairError.new("Unexpected character #{@json[@index].inspect}", @index)
      end
 
      def throw_unexpected_end
-       raise JSONRepairError, 'Unexpected end of json string'
+       raise JSONRepairError.new('Unexpected end of json string', @index)
      end
 
      def throw_object_key_expected
-       raise JSONRepairError, 'Object key expected'
+       raise JSONRepairError.new('Object key expected', @index)
      end
 
      def throw_colon_expected
-       raise JSONRepairError, 'Colon expected'
+       raise JSONRepairError.new('Colon expected', @index)
      end
 
      def throw_invalid_unicode_character
        chars = @json[@index, 6]
-       raise JSONRepairError, "Invalid unicode character #{chars.inspect} at index #{@index}"
+       raise JSONRepairError.new("Invalid unicode character #{chars.inspect}", @index)
      end
    end
  end
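Note on the new `position` attribute: the constructor added in `lib/json/repair.rb` appends "at index N" to the message whenever both a message and a position are given, and every `throw_*` helper above now passes `@index`. The sketch below copies that constructor into a hypothetical `Sketch` namespace (an assumption, so the snippet runs without the gem installed) to show what callers can read from a rescued error.

```ruby
# Hypothetical namespace; the class body mirrors the constructor in the diff.
module Sketch
  class JSONRepairError < StandardError
    attr_reader :position

    def initialize(message = nil, position = nil)
      super(message && position ? "#{message} at index #{position}" : message)
      @position = position
    end
  end
end

begin
  raise Sketch::JSONRepairError.new('Colon expected', 7)
rescue Sketch::JSONRepairError => e
  e.message  # => "Colon expected at index 7"
  e.position # => 7
end

# The two-argument `raise Klass, 'msg'` form still works; position stays nil
# and the message is left untouched.
legacy = begin
  raise Sketch::JSONRepairError, 'input is not valid UTF-8'
rescue Sketch::JSONRepairError => e
  e
end
legacy.message  # => "input is not valid UTF-8"
legacy.position # => nil
```

One visible consequence of the diff is that messages such as 'Colon expected' and 'Object key expected', which previously carried no index, now end in "at index N".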
data/sig/json/repair/cli.rbs ADDED
@@ -0,0 +1,16 @@
+ module JSON
+   module Repair
+     class CLI
+       # `::IO | ::StringIO` because `::StringIO` is not an `::IO` subclass; the
+       # specs inject `::StringIO` instances and any other duck-typed stream
+       # that implements `#read` / `#write` / `#puts` would work too.
+       type stream = ::IO | ::StringIO
+
+       def self.call: (::Array[::String] argv, ?stdin: stream, ?stdout: stream, ?stderr: stream) -> ::Integer
+
+       def initialize: (?stdin: stream, ?stdout: stream, ?stderr: stream) -> void
+
+       def call: (::Array[::String] argv) -> ::Integer
+     end
+   end
+ end
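Note on the `stream` type above: because the CLI takes its streams as keyword arguments, a caller or spec can substitute `StringIO` objects for the real standard streams. The sketch below shows that injection style with a hypothetical class, `StreamSketchCLI`, whose behavior (upcasing stdin) is invented purely for illustration; it shares only the signature shape with `JSON::Repair::CLI`.

```ruby
require 'stringio'

# Hypothetical class with the same keyword-argument shape as the signature
# above; its behavior is made up for this example.
class StreamSketchCLI
  def self.call(argv, stdin: $stdin, stdout: $stdout, stderr: $stderr)
    new(stdin: stdin, stdout: stdout, stderr: stderr).call(argv)
  end

  def initialize(stdin: $stdin, stdout: $stdout, stderr: $stderr)
    @stdin = stdin
    @stdout = stdout
    @stderr = stderr
  end

  # Returns an Integer exit status, like the `-> ::Integer` in the signature.
  def call(argv)
    if argv.any?
      @stderr.puts "sketch: unexpected argument: #{argv.first}"
      return 1
    end

    @stdout.write(@stdin.read.upcase)
    0
  end
end

out = StringIO.new
err = StringIO.new
status = StreamSketchCLI.call([], stdin: StringIO.new('abc'), stdout: out, stderr: err)
# status == 0, out.string == "ABC", err.string == ""
```

Anything that responds to `#read`, `#write`, and `#puts` would work in place of `StringIO` here, which is the duck-typing point the RBS comment makes.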
data/sig/json/repair.rbs CHANGED
@@ -1,5 +1,8 @@
  module JSON
    class JSONRepairError < StandardError
+     attr_reader position: ::Integer?
+
+     def initialize: (?::String? message, ?::Integer? position) -> void
    end
 
    module Repair
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: json-repair
  version: !ruby/object:Gem::Version
-   version: 0.3.0
+   version: 0.5.0
  platform: ruby
  authors:
  - Aleksandr Zykov
@@ -12,23 +12,28 @@ dependencies: []
  description: This is a simple gem that repairs broken JSON strings.
  email:
  - alexandrz@gmail.com
- executables: []
+ executables:
+ - json-repair
  extensions: []
  extra_rdoc_files: []
  files:
  - ".rspec"
  - ".rubocop.yml"
  - CHANGELOG.md
+ - CLAUDE.md
  - CODE_OF_CONDUCT.md
  - LICENSE.txt
  - README.md
  - Rakefile
  - Steepfile
+ - exe/json-repair
  - lib/json/repair.rb
+ - lib/json/repair/cli.rb
  - lib/json/repair/string_utils.rb
  - lib/json/repair/version.rb
  - lib/json/repairer.rb
  - sig/json/repair.rbs
+ - sig/json/repair/cli.rbs
  - sig/json/repair/string_utils.rbs
  - sig/json/repairer.rbs
  homepage: https://github.com/sashazykov/json-repair-rb