json-repair 0.7.0 → 0.10.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: fdf2958528b936eba0faf6c724a20d10a3a3e0b95329bc1866740c99432f6fe3
- data.tar.gz: 33fa97d2c7689ea594723e5d0d412646cf57a2ebdbf79f3b62947d4928963f32
+ metadata.gz: 6de36fcd3ab73ce63e1f9367d4e86dd7373dd7430b2dc87d451d75dd1f5fd685
+ data.tar.gz: 7221ab8c14253abf6cf4d40ff8fb8e6bd93c20504a8b5e6608c93a159420dc2f
  SHA512:
- metadata.gz: 587323c57caac3e53af1da24cfeee47df18ff738516d8fd2862d37a547c6e864cf5653c8b43427ec58aa47be3f2e958c991208b36de098c5ffc9503f2c64a0a2
- data.tar.gz: f5380cfdca6a4f60833ab57fc8465715b546415fa4f6ed5781b6bb24653ce324d2b6d928f9a5ff87b4deead145391303c1e805bf0fbff238b9ef1c7024a1f4aa
+ metadata.gz: 5083d8d8ada9a0b0a67beb8cb8ad1f1af53435e4e5c99632de5c68e19fd02d6dc85567b8c44cfc3068ae44118934c073d5a1acc22abf7f3d089b2add3899e083
+ data.tar.gz: 01d4eb164137dcb0e5b3b11703e2b1099efff6749f633466c3b37e1cccec07904e04006ddca1961d30702b5f133e35b88f8c37edce2bfa72ad61a6172dbbbada
data/CHANGELOG.md CHANGED
@@ -1,5 +1,54 @@
  # Changes
 
+ ### 2026-06-11 (0.10.0)
+
+ * Repair Markdown list markers in front of top-level values:
+ `- {"a": 1}` → `{"a":1}`, and multi-line lists become arrays via the
+ existing newline-delimited JSON handling
+ (`"- {\"a\": 1}\n- {\"b\": 2}"` → `[{"a":1},{"b":2}]`). Bullet
+ markers `-`, `*`, `+` and ordered markers like `1.` / `2)` (up to
+ nine digits, the CommonMark limit) are recognized at the start of
+ the root value and of each newline-delimited line, only when
+ followed by same-line whitespace and a value — so `-5`, a trailing
+ `"- "`, and newline-delimited decimals like `"1.5\n2.5"` keep their
+ number readings, and nothing changes inside nested structures.
+ Previously these inputs raised `JSONRepairError`; two non-raising
+ behaviors change for the better: `"3\n- 5\n7"` now repairs to
+ `[3,5,7]` instead of the corrupt `[3,0,5,7]`, and a single-line
+ `* text` becomes `"text"` instead of `"* text"`. Deliberate
+ divergence from upstream
+ [jsonrepair](https://github.com/josdejong/jsonrepair) (no Markdown
+ list handling as of v3.14.0), and more precise than Python
+ [`json_repair`](https://github.com/mangiucugna/json_repair), which
+ collapses scalar list items to `""`.
+
+ ### 2026-06-11 (0.9.0)
+
+ * Repair numbers missing the digit before their decimal point:
+ `.5` → `0.5`, `-.5` → `-0.5`, and truncated forms like `.` → `0.0`.
+ Previously these leaked a raw stdlib `JSON::ParserError` out of
+ `JSON.repair` because the repairer emitted the leading-dot number
+ unchanged (invalid JSON) and the canonical-output re-parse choked on
+ it. This is a deliberate divergence from upstream
+ [jsonrepair](https://github.com/josdejong/jsonrepair) (which leaves
+ leading-dot numbers unrepaired as of v3.14.0), matching
+ [dirty-json](https://github.com/RyanMarcus/dirty-json) behavior.
+ * `JSON.repair` now guards its error contract: if the repairer ever
+ emits a string stdlib JSON cannot parse (a repairer bug), the stdlib
+ error is wrapped in `JSON::JSONRepairError` instead of leaking
+ `JSON::ParserError` to callers.
+
+ ### 2026-05-15 (0.8.0)
+
+ * `JSON.repair_file(path)` and `JSON.repair_io(io)` convenience
+ wrappers around `JSON.repair`. `repair_file` reads a path from disk
+ (accepts a `String` or `Pathname`); `repair_io` reads from any
+ object responding to `#read` (e.g. `File`, `StringIO`, `$stdin`)
+ without closing it. Both forward `return_objects:` and
+ `skip_json_loads:` through to `JSON.repair`. Mirrors Python's
+ [`json_repair`](https://github.com/mangiucugna/json_repair)
+ `load` / `from_file` helpers.
+
  ### 2026-05-12 (0.7.0)
 
  * `JSON.repair` now always returns canonical JSON via
data/README.md CHANGED
@@ -31,6 +31,12 @@ puts repaired_json # Outputs: {"name":"Alice","age":25}
 
  The `repair` method takes a string containing JSON data and returns a corrected version of this string, ensuring it is valid JSON.
 
+ Markdown markup in LLM output is handled too: fenced code blocks like `` ```json `` are stripped, and list markers (`-`, `*`, `+`, `1.`) in front of top-level values are removed — a multi-line list becomes an array:
+
+ ```ruby
+ JSON.repair("- {\"a\": 1}\n- {\"b\": 2}") # => '[{"a":1},{"b":2}]'
+ ```
+
  Pass `return_objects: true` to get the parsed Ruby value (Hash, Array, or scalar) instead of a string:
 
  ```ruby
@@ -53,6 +59,20 @@ If you need the parsed Ruby value instead of a string, pass `return_objects: tru
 
  `skip_json_loads: true` skips the stdlib `JSON.parse` attempt and routes the input straight through the repairer. The output is the same; the option is purely a performance knob for callers who know their input will need repair.
 
+ ### Reading from a file or IO
+
+ `JSON.repair_file(path)` reads a file from disk and repairs its contents. `JSON.repair_io(io)` does the same with any object that responds to `#read` (e.g. `File`, `StringIO`, `$stdin`). Both forward `return_objects:` and `skip_json_loads:` to `JSON.repair`.
+
+ ```ruby
+ JSON.repair_file('broken.json')
+ JSON.repair_file('broken.json', return_objects: true)
+
+ File.open('broken.json') { |io| JSON.repair_io(io) }
+ JSON.repair_io($stdin)
+ ```
+
+ `JSON.repair_io` does not close the IO — the caller manages its lifecycle.
+
  ## Command line
 
  The gem ships a `json-repair` executable. It reads from stdin or a file and writes to stdout, `--output FILE`, or back over the input file with `--overwrite`.
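
The IO contract described in the README hunk above (read from anything that responds to `#read`, leave it open) can be sketched without the gem installed. The helper name `repair_io_sketch` is hypothetical, and stdlib `JSON.parse` stands in for the repairer, so this performs no actual repair:

```ruby
require 'json'
require 'stringio'

# Hypothetical stand-in for JSON.repair_io. It only illustrates the IO
# handling: read once, never close, return canonical JSON or the parsed value.
def repair_io_sketch(io, return_objects: false)
  json = io.read || '' # the gem's RBS types #read as String?, hence the fallback
  parsed = JSON.parse(json)
  return_objects ? parsed : JSON.generate(parsed)
end

io = StringIO.new('{ "name": "Alice", "age": 25 }')
canonical = repair_io_sketch(io)
puts canonical  # {"name":"Alice","age":25}
puts io.closed? # false: the caller still owns the IO
```

Because the helper never calls `close`, the same pattern works for `$stdin` or a `File` opened in a block, where the caller or the block decides when the handle goes away.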
data/Rakefile CHANGED
@@ -19,6 +19,9 @@ task :steep do
  sh 'bundle exec steep check'
  end
 
+ desc 'Type-check: rbs validate + steep check'
+ task typecheck: %i[rbs steep]
+
  desc 'Run benchmark/run.rb (regression baseline for JSON.repair)'
  task :bench do
  ruby '-Ilib', 'benchmark/run.rb'
@@ -60,13 +60,14 @@ module JSON
 
  # Functions to check character chars
  def hex?(char)
- (char >= ZERO && char <= NINE) ||
- (char >= UPPERCASE_A && char <= UPPERCASE_F) ||
- (char >= LOWERCASE_A && char <= LOWERCASE_F)
+ !char.nil? &&
+ ((char >= ZERO && char <= NINE) ||
+ (char >= UPPERCASE_A && char <= UPPERCASE_F) ||
+ (char >= LOWERCASE_A && char <= LOWERCASE_F))
  end
 
  def digit?(char)
- char && char >= ZERO && char <= NINE
+ !char.nil? && char >= ZERO && char <= NINE
  end
 
  def valid_string_character?(char)
@@ -74,11 +75,11 @@ module JSON
  end
 
  def delimiter?(char)
- REGEX_DELIMITER.match?(char)
+ !char.nil? && REGEX_DELIMITER.match?(char)
  end
 
  def unquoted_string_delimiter?(char)
- REGEX_UNQUOTED_STRING_DELIMITER.match?(char)
+ !char.nil? && REGEX_UNQUOTED_STRING_DELIMITER.match?(char)
  end
 
  REGEX_FUNCTION_NAME_CHAR_START = /\A[a-zA-Z_$]\z/
@@ -93,19 +94,19 @@ module JSON
  end
 
  def start_of_value?(char)
- REGEX_START_OF_VALUE.match?(char) || (char && quote?(char))
+ !char.nil? && (REGEX_START_OF_VALUE.match?(char) || quote?(char))
  end
 
  def control_character?(char)
- [NEWLINE, RETURN, TAB, BACKSPACE, FORM_FEED].include?(char)
+ !char.nil? && [NEWLINE, RETURN, TAB, BACKSPACE, FORM_FEED].include?(char)
  end
 
  def whitespace?(char)
- [SPACE, NEWLINE, TAB, RETURN].include?(char)
+ !char.nil? && [SPACE, NEWLINE, TAB, RETURN].include?(char)
  end
 
  def whitespace_except_newline?(char)
- [SPACE, TAB, RETURN].include?(char)
+ !char.nil? && [SPACE, TAB, RETURN].include?(char)
  end
 
  def special_whitespace?(char)
@@ -122,6 +123,10 @@ module JSON
  (char >= EN_QUAD && char <= ZERO_WIDTH_SPACE)
  end
 
+ def same_line_whitespace?(char)
+ whitespace_except_newline?(char) || special_whitespace?(char)
+ end
+
  def quote?(char)
  double_quote_like?(char) || single_quote_like?(char)
  end
@@ -135,20 +140,25 @@ module JSON
  end
 
  def double_quote_like?(char)
- [DOUBLE_QUOTE, DOUBLE_QUOTE_LEFT, DOUBLE_QUOTE_RIGHT].include?(char)
+ !char.nil? && [DOUBLE_QUOTE, DOUBLE_QUOTE_LEFT, DOUBLE_QUOTE_RIGHT].include?(char)
  end
 
  def single_quote_like?(char)
- [QUOTE, QUOTE_LEFT, QUOTE_RIGHT, GRAVE_ACCENT, ACUTE_ACCENT].include?(char)
+ !char.nil? && [QUOTE, QUOTE_LEFT, QUOTE_RIGHT, GRAVE_ACCENT, ACUTE_ACCENT].include?(char)
  end
 
- # Strip last occurrence of text_to_strip from text
+ # Strip last occurrence of text_to_strip from text.
+ #
+ # `|| ''` on the slices below (and in `insert_before_last_whitespace` /
+ # `remove_at_index`) is for steep's nil-narrowing: `String#[range]` is
+ # typed `String?`, but every call site here keeps indices within
+ # `0..text.length`, so the slices never actually return `nil`.
  def strip_last_occurrence(text, text_to_strip, strip_remaining_text: false)
  index = text.rindex(text_to_strip)
  return text unless index
 
- remaining_text = strip_remaining_text ? '' : text[index + 1..]
- text[0...index] + remaining_text
+ remaining_text = strip_remaining_text ? '' : (text[index + 1..] || '')
+ (text[0...index] || '') + remaining_text
  end
 
  def insert_before_last_whitespace(text, text_to_insert)
@@ -158,7 +168,7 @@ module JSON
 
  index -= 1 while whitespace?(text[index - 1])
 
- text[0...index] + text_to_insert + text[index..]
+ (text[0...index] || '') + text_to_insert + (text[index..] || '')
  end
 
  # Parse keywords true, false, null
@@ -187,7 +197,7 @@ module JSON
  end
 
  def remove_at_index(text, start, count)
- text[0...start] + text[start + count..]
+ (text[0...start] || '') + (text[start + count..] || '')
  end
 
  def ends_with_comma_or_newline?(text)
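
The nil-handling changes in the hunks above rest on a few plain-Ruby behaviors that are easy to confirm in isolation. The snippet below is a standalone sketch: the method names `digit_before?` and `digit_after?` are hypothetical labels for the removed and added forms of `digit?`, and the constants are redefined locally rather than taken from the gem.

```ruby
ZERO = '0'
NINE = '9'

# Removed form: short-circuits to nil (not false) when char is nil.
def digit_before?(char)
  char && char >= ZERO && char <= NINE
end

# Added form: always yields true or false, which fits a `bool` signature.
def digit_after?(char)
  !char.nil? && char >= ZERO && char <= NINE
end

p digit_before?(nil) # nil
p digit_after?(nil)  # false
p digit_after?('7')  # true

# String#[] with a range returns nil only when the start lies past the end,
# which is why the `|| ''` fallbacks never fire for in-range indices.
p 'abc'[3..] # ""
p 'abc'[4..] # nil
```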
@@ -2,6 +2,6 @@
 
  module JSON
  module Repair
- VERSION = '0.7.0'
+ VERSION = '0.10.0'
  end
  end
data/lib/json/repair.rb CHANGED
@@ -20,6 +20,22 @@ module JSON
  return_objects ? parsed : JSON.generate(parsed)
  end
 
+ # Inlined rather than calling `repair(...)` so the literal-bool overloads
+ # in sig/json/repair.rbs narrow correctly per caller — forwarding a
+ # `bool`-typed `return_objects` will not resolve against the literal-
+ # `true`/`false` overloads on `JSON.repair`.
+ def self.repair_io(io, return_objects: false, skip_json_loads: false)
+ json = io.read || ''
+ parsed = skip_json_loads ? repaired_parse(json) : tolerant_parse(json)
+ return_objects ? parsed : JSON.generate(parsed)
+ end
+
+ def self.repair_file(path, return_objects: false, skip_json_loads: false)
+ json = File.read(path.to_s)
+ parsed = skip_json_loads ? repaired_parse(json) : tolerant_parse(json)
+ return_objects ? parsed : JSON.generate(parsed)
+ end
+
  def self.tolerant_parse(json)
  JSON.parse(json)
  rescue JSON::ParserError
@@ -27,8 +43,14 @@ module JSON
  end
  private_class_method :tolerant_parse
 
+ # The rescue guards the JSONRepairError-only error contract: if the
+ # Repairer ever emits a string stdlib JSON cannot parse (a Repairer bug),
+ # wrap the stdlib error instead of leaking JSON::ParserError to callers.
  def self.repaired_parse(json)
- JSON.parse(Repairer.new(json).repair)
+ repaired = Repairer.new(json).repair
+ JSON.parse(repaired)
+ rescue JSON::ParserError => e
+ raise JSONRepairError, "Internal error: repaired output is not valid JSON (#{e.message})"
  end
  private_class_method :repaired_parse
  end
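
The error-contract guard added to `repaired_parse` follows a common wrap-and-reraise pattern. A minimal standalone sketch, assuming a hypothetical `RepairErrorSketch` class in place of `JSON::JSONRepairError` and a hypothetical `parse_repaired` helper in place of the private method:

```ruby
require 'json'

# Hypothetical error class standing in for JSON::JSONRepairError.
class RepairErrorSketch < StandardError; end

# Parse the output of a repair step. Any stdlib parse failure is converted
# into the library's own error type, so callers rescue a single class.
def parse_repaired(repaired)
  JSON.parse(repaired)
rescue JSON::ParserError => e
  raise RepairErrorSketch, "Internal error: repaired output is not valid JSON (#{e.message})"
end

begin
  parse_repaired('{"a": }') # deliberately invalid, simulating a repairer bug
rescue RepairErrorSketch => e
  puts e.class # RepairErrorSketch
end
```

The original message is kept inside the wrapped error, so the underlying parse failure remains visible when debugging.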
data/lib/json/repairer.rb CHANGED
@@ -37,6 +37,12 @@ module JSON
  def repair
  parse_markdown_code_block(MARKDOWN_OPEN_BLOCKS)
 
+ # repair: skip a Markdown list marker before the root value
+ # (and any comments before it, which parse_value would otherwise
+ # only consume after the marker check has already failed)
+ parse_whitespace_and_skip_comments
+ skip_markdown_list_marker
+
  processed = parse_value
 
  throw_unexpected_end unless processed
@@ -46,7 +52,8 @@ module JSON
  processed_comma = parse_character(COMMA)
  parse_whitespace_and_skip_comments if processed_comma
 
- if start_of_value?(@json[@index]) && ends_with_comma_or_newline?(@output)
+ if (start_of_value?(@json[@index]) || markdown_list_marker_length) &&
+ ends_with_comma_or_newline?(@output)
  # start of a new value after end of the root level object: looks like
  # newline delimited JSON -> turn into a root level array
  unless processed_comma
@@ -170,6 +177,52 @@ module JSON
  false
  end
 
+ # Look ahead from @index for a Markdown list marker like "- ", "* ",
+ # "+ ", or "12. " that precedes a value. Returns the marker's length,
+ # or nil when there is no marker. Only consulted at the top level —
+ # the root value and each newline-delimited value — never inside
+ # nested structures. A marker must be followed by same-line
+ # whitespace and a value, so "-5", a trailing "- ", and "-\n{...}"
+ # keep their number readings. Ordered markers are capped at nine
+ # digits (the CommonMark limit) so long truncated decimals are not
+ # mistaken for markers. Divergence from upstream (no Markdown list
+ # handling as of v3.14.0): LLMs frequently emit JSON values as
+ # Markdown list items.
+ def markdown_list_marker_length
+ j = @index
+
+ if [MINUS, ASTERISK, PLUS].include?(@json[j])
+ j += 1
+ elsif digit?(@json[j])
+ j += 1 while digit?(@json[j]) && j - @index < 9
+ return nil unless [DOT, CLOSE_PARENTHESIS].include?(@json[j])
+
+ j += 1
+ else
+ return nil
+ end
+
+ marker_length = j - @index
+ return nil unless same_line_whitespace?(@json[j])
+
+ j += 1 while same_line_whitespace?(@json[j])
+ # a leading-dot number like ".5" is also a value here: parse_number
+ # repairs it to "0.5" even though start_of_value? does not match it
+ return nil unless start_of_value?(@json[j]) || @json[j] == DOT
+
+ marker_length
+ end
+
+ # Repair a value behind a Markdown list marker, like "- {"a":1}",
+ # by skipping the marker. See markdown_list_marker_length.
+ def skip_markdown_list_marker
+ length = markdown_list_marker_length
+ return false unless length
+
+ @index += length
+ true
+ end
+
  # Parse an object like '{"key": "value"}'
  def parse_object
  return false unless @json[@index] == OPENING_BRACE
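
The look-ahead in `markdown_list_marker_length` can be tried outside the repairer with a standalone approximation. This sketch makes simplifying assumptions that are not in the gem: the helper names `marker_length_sketch` and `same_line_ws?` are hypothetical, same-line whitespace is limited to space, tab and carriage return, and the start-of-value test is reduced to a single character class.

```ruby
# Simplified same-line whitespace check (assumption: ASCII only).
def same_line_ws?(char)
  [' ', "\t", "\r"].include?(char)
end

# Returns the length of a Markdown list marker at `index`, or nil if the text
# there is not a marker followed by same-line whitespace and a value.
def marker_length_sketch(json, index = 0)
  j = index

  if ['-', '*', '+'].include?(json[j])
    j += 1
  elsif json[j]&.match?(/\d/)
    # Ordered marker: at most nine digits, then "." or ")".
    j += 1 while json[j]&.match?(/\d/) && j - index < 9
    return nil unless ['.', ')'].include?(json[j])

    j += 1
  else
    return nil
  end

  length = j - index
  return nil unless same_line_ws?(json[j])

  j += 1 while same_line_ws?(json[j])
  # Reduced start-of-value test; "." is included so ".5" counts as a value.
  return nil unless json[j]&.match?(/[\[{\w"'.-]/)

  length
end

p marker_length_sketch('- {"a": 1}') # 1
p marker_length_sketch('12. [1]')    # 3
p marker_length_sketch('-5')         # nil (a negative number)
p marker_length_sketch('1.5')        # nil (a decimal)
p marker_length_sketch('- ')         # nil (no value follows)
```

Requiring both the whitespace and a following value is what keeps `-5` and `1.5` on their number readings, as the changelog entry for 0.10.0 describes.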
@@ -570,7 +623,9 @@ module JSON
  repair_number_ending_with_numeric_symbol(start)
  return true
  end
- unless digit?(@json[@index])
+ # also accept a dot so "-.5" continues into the fraction branch
+ # below (divergence from upstream, which leaves "-.5" unrepaired)
+ unless digit?(@json[@index]) || @json[@index] == DOT
  @index = start
  return false
  end
@@ -620,7 +675,7 @@ module JSON
  num = @json[start...@index]
  has_invalid_leading_zero = num.match?(/^0\d/)
 
- @output << (has_invalid_leading_zero ? "\"#{num}\"" : num)
+ @output << (has_invalid_leading_zero ? "\"#{num}\"" : repair_leading_dot_number(num))
  return true
  end
 
@@ -711,7 +766,18 @@ module JSON
  # repair numbers cut off at the end
  # this will only be called when we end after a '.', '-', or 'e' and does not
  # change the number more than it needs to make it valid JSON
- @output << "#{@json[start...@index]}0"
+ @output << repair_leading_dot_number("#{@json[start...@index]}0")
+ end
+
+ # Repair a number missing its digit before the decimal point, like ".5"
+ # or "-.5", into "0.5" / "-0.5". Divergence from upstream, which emits
+ # the invalid leading-dot number unchanged. The guard keeps the common
+ # case (a number that needs no repair) allocation-free; `sub` copies
+ # its receiver even when the pattern does not match.
+ def repair_leading_dot_number(num)
+ return num unless num.start_with?('.', '-.')
+
+ num.sub(/\A(?<sign>-?)\./, '\k<sign>0.')
  end
 
  # Parse and repair Newline Delimited JSON (NDJSON):
  # Parse and repair Newline Delimited JSON (NDJSON):
@@ -732,6 +798,10 @@ module JSON
732
798
  end
733
799
  end
734
800
 
801
+ # repair: skip a Markdown list marker before the next value
802
+ parse_whitespace_and_skip_comments
803
+ skip_markdown_list_marker
804
+
735
805
  processed_value = parse_value
736
806
  end
737
807
 
@@ -1,9 +1,9 @@
  module JSON
  module Repair
  module StringUtils
- @output: untyped
+ @output: ::String
 
- @index: untyped
+ @index: ::Integer
 
  # Constants for character chars
  BACKSLASH: "\\"
@@ -24,17 +24,17 @@ module JSON
 
  CLOSE_PARENTHESIS: ")"
 
- SPACE: " "
+ SPACE: ::String
 
- NEWLINE: "\n"
+ NEWLINE: ::String
 
- TAB: "\t"
+ TAB: ::String
 
- RETURN: "\r"
+ RETURN: ::String
 
- BACKSPACE: "\b"
+ BACKSPACE: ::String
 
- FORM_FEED: "\f"
+ FORM_FEED: ::String
 
  DOUBLE_QUOTE: "\""
 
@@ -110,56 +110,62 @@ module JSON
 
  REGEX_FUNCTION_NAME_CHAR: ::Regexp
 
- # Functions to check character chars
- def hex?: (untyped char) -> untyped
+ # Functions to check character chars.
+ # `char` is `::String?` because every caller passes `@json[@index]`,
+ # which is `nil` past the end of input. The predicates either guard
+ # against `nil` explicitly or rely on `Array#include?` / `==` /
+ # `Regexp#match?` returning a safe value for `nil`.
+ def hex?: (::String? char) -> bool
 
- def digit?: (untyped char) -> untyped
+ def digit?: (::String? char) -> bool
 
- def valid_string_character?: (untyped char) -> untyped
+ def valid_string_character?: (::String char) -> bool
 
- def delimiter?: (untyped char) -> untyped
+ def delimiter?: (::String? char) -> bool
 
- def unquoted_string_delimiter?: (untyped char) -> untyped
+ def unquoted_string_delimiter?: (::String? char) -> bool
 
- def function_name_char_start?: (untyped char) -> untyped
+ def function_name_char_start?: (::String? char) -> bool
 
- def function_name_char?: (untyped char) -> untyped
+ def function_name_char?: (::String? char) -> bool
 
- def start_of_value?: (untyped char) -> untyped
+ def start_of_value?: (::String? char) -> bool
 
- def control_character?: (untyped char) -> untyped
+ def control_character?: (::String? char) -> bool
 
- def whitespace?: (untyped char) -> untyped
+ def whitespace?: (::String? char) -> bool
 
- def whitespace_except_newline?: (untyped char) -> untyped
+ def whitespace_except_newline?: (::String? char) -> bool
 
- def special_whitespace?: (untyped char) -> untyped
+ def special_whitespace?: (::String? char) -> bool
 
- def quote?: (untyped char) -> untyped
+ def same_line_whitespace?: (::String? char) -> bool
 
- def double_quote?: (untyped char) -> untyped
+ def quote?: (::String? char) -> bool
 
- def single_quote?: (untyped char) -> untyped
+ def double_quote?: (::String? char) -> bool
 
- def double_quote_like?: (untyped char) -> untyped
+ def single_quote?: (::String? char) -> bool
 
- def single_quote_like?: (untyped char) -> untyped
+ def double_quote_like?: (::String? char) -> bool
+
+ def single_quote_like?: (::String? char) -> bool
 
  # Strip last occurrence of text_to_strip from text
- def strip_last_occurrence: (untyped text, untyped text_to_strip, ?strip_remaining_text: bool) -> untyped
+ def strip_last_occurrence: (::String text, ::String text_to_strip, ?strip_remaining_text: bool) -> ::String
 
- def insert_before_last_whitespace: (untyped text, untyped text_to_insert) -> untyped
+ def insert_before_last_whitespace: (::String text, ::String text_to_insert) -> ::String
 
  # Parse keywords true, false, null
  # Repair Python keywords True, False, None
  # Repair Ruby keyword nil
- def parse_keywords: () -> untyped
+ def parse_keywords: () -> bool
 
- def parse_keyword: (untyped name, untyped value) -> (true | false)
+ def parse_keyword: (::String name, ::String value) -> bool
 
- def remove_at_index: (untyped text, untyped start, untyped count) -> untyped
+ def remove_at_index: (::String text, ::Integer start, ::Integer count) -> ::String
 
- def ends_with_comma_or_newline?: (untyped text) -> untyped
+ def ends_with_comma_or_newline?: (::String text) -> bool
  end
  end
  end
data/sig/json/repair.rbs CHANGED
@@ -1,4 +1,10 @@
  module JSON
+ # Recursive type for any `JSON.parse` result. Mirrors what stdlib's
+ # `JSON.parse` produces (and the JS upstream emits): scalars, arrays,
+ # and objects of the same. Used in place of `untyped` for the
+ # `return_objects: true` and internal `*_parse` paths.
+ type json_value = ::Hash[::String, json_value] | ::Array[json_value] | ::String | ::Integer | ::Float | bool | nil
+
  class JSONRepairError < StandardError
  attr_reader position: ::Integer?
 
@@ -9,13 +15,25 @@ module JSON
  VERSION: ::String
  end
 
+ interface _Readable
+ def read: () -> ::String?
+ end
+
  def self.repair: (::String json, return_objects: false, ?skip_json_loads: bool) -> ::String
- | (::String json, return_objects: true, ?skip_json_loads: bool) -> untyped
+ | (::String json, return_objects: true, ?skip_json_loads: bool) -> json_value
  | (::String json, ?skip_json_loads: bool) -> ::String
 
+ def self.repair_io: (_Readable io, return_objects: false, ?skip_json_loads: bool) -> ::String
+ | (_Readable io, return_objects: true, ?skip_json_loads: bool) -> json_value
+ | (_Readable io, ?skip_json_loads: bool) -> ::String
+
+ def self.repair_file: (::String | ::Pathname path, return_objects: false, ?skip_json_loads: bool) -> ::String
+ | (::String | ::Pathname path, return_objects: true, ?skip_json_loads: bool) -> json_value
+ | (::String | ::Pathname path, ?skip_json_loads: bool) -> ::String
+
  private
 
- def self.tolerant_parse: (::String json) -> untyped
+ def self.tolerant_parse: (::String json) -> json_value
 
- def self.repaired_parse: (::String json) -> untyped
+ def self.repaired_parse: (::String json) -> json_value
  end
@@ -11,7 +11,7 @@ module JSON
  # `lib/json/repairer.rb`).
  @json: untyped
 
- @index: Integer
+ @index: ::Integer
 
  @output: ::String
 
@@ -31,25 +31,32 @@ module JSON
 
  private
 
- def parse_value: () -> untyped
+ def parse_value: () -> bool
 
- def parse_whitespace: (?skip_newline: bool) -> (true | false)
+ def parse_whitespace: (?skip_newline: bool) -> bool
 
- def parse_comment: () -> (true | false)
+ def parse_comment: () -> bool
 
  # Find and skip over a Markdown fenced code block
- def parse_markdown_code_block: (::Array[::String] blocks) -> (true | false)
+ def parse_markdown_code_block: (::Array[::String] blocks) -> bool
 
- def skip_markdown_code_block: (::Array[::String] blocks) -> (true | false)
+ def skip_markdown_code_block: (::Array[::String] blocks) -> bool
+
+ # Look ahead for a Markdown list marker like "- " or "12. " that
+ # precedes a value; returns the marker's length, or nil when there
+ # is no marker.
+ def markdown_list_marker_length: () -> ::Integer?
+
+ def skip_markdown_list_marker: () -> bool
 
  # Parse an object like '{"key": "value"}'
- def parse_object: () -> (false | true)
+ def parse_object: () -> bool
 
- def skip_character: (untyped char) -> (true | false)
+ def skip_character: (::String char) -> bool
 
  # Skip ellipsis like "[1,2,3,...]" or "[1,2,3,...,9]" or "[...,7,8,9]"
  # or a similar construct in objects.
- def skip_ellipsis: () -> untyped
+ def skip_ellipsis: () -> void
 
  # Parse a string enclosed by double quotes "...". Can contain escaped quotes
  # Repair strings enclosed in single quotes or special quotes
@@ -62,51 +69,59 @@ module JSON
  # more conservative way, stopping the string at the first next delimiter
  # and fixing the string by inserting a quote there, or stopping at a
  # stop index detected in the first iteration.
- def parse_string: (?stop_at_delimiter: bool, ?stop_at_index: ::Integer) -> (untyped | true | false)
+ def parse_string: (?stop_at_delimiter: bool, ?stop_at_index: ::Integer) -> bool
 
  # Repair an unquoted string by adding quotes around it
  # Repair a MongoDB function call like NumberLong("2")
  # Repair a JSONP function call like callback({...});
- def parse_unquoted_string: (bool is_key) -> (false | true)
+ def parse_unquoted_string: (bool is_key) -> bool
 
  # Parse a regular expression literal like /foo/ or /foo\/bar/
- def parse_regex: () -> (false | true)
+ def parse_regex: () -> bool
 
- def parse_character: (untyped char) -> (true | false)
+ def parse_character: (::String char) -> bool
 
- def parse_whitespace_and_skip_comments: (?skip_newline: bool) -> untyped
+ def parse_whitespace_and_skip_comments: (?skip_newline: bool) -> bool
 
  # Parse a number like 2.4 or 2.4e6
- def parse_number: () -> (true | false)
+ def parse_number: () -> bool
 
- def at_end_of_number?: () -> untyped
+ def at_end_of_number?: () -> bool
 
  # Parse an array like '["item1", "item2", ...]'
- def parse_array: () -> (true | false)
+ def parse_array: () -> bool
 
- def prev_non_whitespace_index: (untyped start) -> untyped
+ def prev_non_whitespace_index: (::Integer start) -> ::Integer
 
  # Repair concatenated strings like "hello" + "world", change this into "helloworld"
- def parse_concatenated_string: () -> untyped
+ def parse_concatenated_string: () -> bool
+
+ def repair_number_ending_with_numeric_symbol: (::Integer start) -> void
 
- def repair_number_ending_with_numeric_symbol: (untyped start) -> untyped
+ # Repair a number missing its digit before the decimal point, like ".5"
+ # or "-.5", into "0.5" / "-0.5".
+ def repair_leading_dot_number: (::String num) -> ::String
 
  # Parse and repair Newline Delimited JSON (NDJSON):
  # multiple JSON objects separated by a newline character
- def parse_newline_delimited_json: () -> untyped
+ def parse_newline_delimited_json: () -> void
 
- def skip_escape_character: () -> untyped
+ def skip_escape_character: () -> bool
 
- def throw_invalid_character: (untyped char) -> untyped
+ # `bot` (bottom) because these always raise — steep needs this to
+ # treat their call sites as unreachable so methods like `repair`
+ # type-check (the trailing `throw_unexpected_character` must not
+ # contribute `void` to the method's union return type).
+ def throw_invalid_character: (::String char) -> bot
 
- def throw_unexpected_character: () -> untyped
+ def throw_unexpected_character: () -> bot
 
- def throw_unexpected_end: () -> untyped
+ def throw_unexpected_end: () -> bot
 
- def throw_object_key_expected: () -> untyped
+ def throw_object_key_expected: () -> bot
 
- def throw_colon_expected: () -> untyped
+ def throw_colon_expected: () -> bot
 
- def throw_invalid_unicode_character: () -> untyped
+ def throw_invalid_unicode_character: () -> bot
  end
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: json-repair
  version: !ruby/object:Gem::Version
- version: 0.7.0
+ version: 0.10.0
  platform: ruby
  authors:
  - Aleksandr Zykov