RubyGems - csv - Versions diffs - 3.2.0 → 3.2.3 - Mend

csv 3.2.0 → 3.2.3

Files changed (14)

checksums.yaml +4 -4
data/NEWS.md +99 -0
data/README.md +1 -1
data/doc/csv/options/generating/write_headers.rdoc +1 -1
data/doc/csv/recipes/generating.rdoc +1 -1
data/doc/csv/recipes/parsing.rdoc +1 -1
data/lib/csv/fields_converter.rb +6 -2
data/lib/csv/input_record_separator.rb +18 -0
data/lib/csv/parser.rb +202 -65
data/lib/csv/table.rb +14 -4
data/lib/csv/version.rb +1 -1
data/lib/csv/writer.rb +2 -1
data/lib/csv.rb +61 -17
metadata +6 -5

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: c48c0d15454e002ff10270a9c56cf4311ce635a8a9dfb527f7a7541f29f801b2
-  data.tar.gz: 505d1d0dbb4cff0a544b2e00925cb1101ed71642a584d534f443405fba8bd820
+  metadata.gz: 915b3ed5a51bf4836f08f7bb06efc3b07bdc90e09209a5253092130e2cad2ab6
+  data.tar.gz: 6bce2e39329afcf200691b4b2f422b6a48d45da66368f5d5e136e0c761cd6217
 SHA512:
-  metadata.gz: 1c9ecd18d5b9a4f663c0676694ffc133a4657e2f7a07cafe2f0a5d9ddd2d7846f505bc62c21698fcf1117126efc6978b7aa1b497d2fef8d532a8a4246c58bff2
-  data.tar.gz: e4fe05b49f92c68c011060d1dcd39ead1785d886eabbd3689a12884df9eb30124694417314e39bcf656cd05d6d0dea6a80a701dd5fe6cac42efc33c67be54926
+  metadata.gz: 5c1434c8e91c16de40d19d4d1200f193248e786720b67f2bbecf26a481859fe814b8cbaa02d22027668ff02588541266c8ff5d00b9fc1cfc2163b358b8e9ece9
+  data.tar.gz: 1978e933549049129f0ec99e80a10f2838b3c75a282103aa177d8421fe7589d428308e2786b29a961a4a7a5565ede77e3b1ef44ba8f4bc91b593a5a884ded7aa

data/NEWS.md CHANGED

@@ -1,5 +1,104 @@
 # News
+## 3.2.3 - 2022-04-09
+### Improvements
+  * Added contents summary to `CSV::Table#inspect`.
+    [GitHub#229][Patch by Eriko Sugiyama]
+    [GitHub#235][Patch by Sampat Badhe]
+  * Suppressed `$INPUT_RECORD_SEPARATOR` deprecation warning by
+    `Warning.warn`.
+    [GitHub#233][Reported by Jean byroot Boussier]
+  * Improved error message for liberal parsing with quoted values.
+    [GitHub#231][Patch by Nikolay Rys]
+  * Fixed typos in documentation.
+    [GitHub#236][Patch by Sampat Badhe]
+  * Added `:max_field_size` option and deprecated `:field_size_limit` option.
+    [GitHub#238][Reported by Dan Buettner]
+  * Added `:symbol_raw` to built-in header converters.
+    [GitHub#237][Reported by taki]
+    [GitHub#239][Patch by Eriko Sugiyama]
+### Fixes
+  * Fixed a bug that some texts may be dropped unexpectedly.
+    [Bug #18245][ruby-core:105587][Reported by Hassan Abdul Rehman]
+  * Fixed a bug that `:field_size_limit` doesn't work with not complex row.
+    [GitHub#238][Reported by Dan Buettner]
+### Thanks
+  * Hassan Abdul Rehman
+  * Eriko Sugiyama
+  * Jean byroot Boussier
+  * Nikolay Rys
+  * Sampat Badhe
+  * Dan Buettner
+  * taki
+## 3.2.2 - 2021-12-24
+### Improvements
+  * Added a validation for invalid option combination.
+    [GitHub#225][Patch by adamroyjones]
+  * Improved documentation for developers.
+    [GitHub#227][Patch by Eriko Sugiyama]
+### Fixes
+  * Fixed a bug that all of `ARGF` contents may not be consumed.
+    [GitHub#228][Reported by Rafael Navaza]
+### Thanks
+  * adamroyjones
+  * Eriko Sugiyama
+  * Rafael Navaza
+## 3.2.1 - 2021-10-23
+### Improvements
+  * doc: Fixed wrong class name.
+    [GitHub#217][Patch by Vince]
+  * Changed to always use `"\n"` for the default row separator on Ruby
+    3.0 or later because `$INPUT_RECORD_SEPARATOR` was deprecated
+    since Ruby 3.0.
+  * Added support for Ractor.
+    [GitHub#218][Patch by rm155]
+    * Users who want to use the built-in converters in non-main
+      Ractors need to call `Ractor.make_shareable(CSV::Converters)`
+      and/or `Ractor.make_shareable(CSV::HeaderConverters)` before
+      creating non-main Ractors.
+### Thanks
+  * Vince
+  * Joakim Antman
+  * rm155
 ## 3.2.0 - 2021-06-06
 ### Improvements

data/README.md CHANGED

@@ -35,7 +35,7 @@ end
 ## Development
-After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+After checking out the repo, run `ruby run-test.rb` to check if your changes can pass the test.
 To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).

data/doc/csv/options/generating/write_headers.rdoc CHANGED

@@ -19,7 +19,7 @@ Without +write_headers+:
 With +write_headers+":
   CSV.open(file_path,'w',
-      :write_headers=> true,
+      :write_headers => true,
       :headers => ['Name','Value']
     ) do |csv|
       csv << ['foo', '0']

data/doc/csv/recipes/generating.rdoc CHANGED

@@ -148,7 +148,7 @@ This example defines and uses a custom write converter to strip whitespace from
 ==== Recipe: Specify Multiple Write Converters
-Use option <tt>:write_converters</tt> and multiple custom coverters
+Use option <tt>:write_converters</tt> and multiple custom converters
 to convert field values when generating \CSV.
 This example defines and uses two custom write converters to strip and upcase generated fields:

data/doc/csv/recipes/parsing.rdoc CHANGED

@@ -83,7 +83,7 @@ Use instance method CSV#each with option +headers+ to read a source \String one
   CSV.new(string, headers: true).each do |row|
     p row
   end
-Ouput:
+Output:
   #<CSV::Row "Name":"foo" "Value":"0">
   #<CSV::Row "Name":"bar" "Value":"1">
   #<CSV::Row "Name":"baz" "Value":"2">

data/lib/csv/fields_converter.rb CHANGED

@@ -16,7 +16,7 @@ class CSV
       @empty_value = options[:empty_value]
       @empty_value_is_empty_string = (@empty_value == "")
       @accept_nil = options[:accept_nil]
-      @builtin_converters = options[:builtin_converters]
+      @builtin_converters_name = options[:builtin_converters_name]
       @need_static_convert = need_static_convert?
     end
@@ -24,7 +24,7 @@ class CSV
       if name.nil?  # custom converter
         @converters << converter
       else          # named converter
-        combo = @builtin_converters[name]
+        combo = builtin_converters[name]
         case combo
         when Array  # combo converter
           combo.each do |sub_name|
@@ -80,5 +80,9 @@ class CSV
       @need_static_convert or
         (not @converters.empty?)
     end
+    def builtin_converters
+      @builtin_converters ||= ::CSV.const_get(@builtin_converters_name)
+    end
   end
 end
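
The change above stores only the name of the converter table and resolves it with `const_get` on first use, instead of holding the table itself in an instance variable. A minimal sketch of that lazy-lookup pattern follows; `Registry`, `LazyLookup`, and the `:integer` entry are hypothetical stand-ins, not the gem's classes.

```ruby
# Hypothetical namespace standing in for ::CSV and its converter constants.
module Registry
  Converters = {
    integer: ->(field) { Integer(field, 10) },
  }
end

# Hypothetical class: keeps the constant's name and resolves it lazily.
class LazyLookup
  def initialize(constant_name)
    @constant_name = constant_name
  end

  def [](name)
    table[name]
  end

  private

  # Resolved once, on first access, then memoized.
  def table
    @table ||= Registry.const_get(@constant_name)
  end
end

lookup = LazyLookup.new(:Converters)
puts lookup[:integer].call("42") # prints 42
```

Passing a name rather than the object means the options hash carries only a Symbol, which is presumably what makes this friendlier to the Ractor support mentioned in NEWS.md.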

data/lib/csv/input_record_separator.rb ADDED

@@ -0,0 +1,18 @@
+require "English"
+require "stringio"
+class CSV
+  module InputRecordSeparator
+    class << self
+      if RUBY_VERSION >= "3.0.0"
+        def value
+          "\n"
+        end
+      else
+        def value
+          $INPUT_RECORD_SEPARATOR
+        end
+      end
+    end
+  end
+end
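
The new module picks the default row separator at load time based on the Ruby version. The sketch below expresses the same decision as a plain method so it can be tried with different version strings; `default_row_separator` is a hypothetical helper, and it reuses the string comparison shown in the diff.

```ruby
# Hypothetical helper mirroring the version gate in the new file.
# `$/` is the same global that "English" exposes as $INPUT_RECORD_SEPARATOR.
def default_row_separator(ruby_version = RUBY_VERSION)
  if ruby_version >= "3.0.0"
    "\n"   # fixed default on Ruby 3.0 or later
  else
    $/     # older Rubies keep following the global
  end
end

p default_row_separator("3.1.0") # prints "\n"
```

Note that this is a lexical String comparison, which is adequate for single-digit major versions; it is not a general version comparison.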

data/lib/csv/parser.rb CHANGED

@@ -3,6 +3,7 @@
 require "strscan"
 require_relative "delete_suffix"
+require_relative "input_record_separator"
 require_relative "match_p"
 require_relative "row"
 require_relative "table"
@@ -26,6 +27,10 @@ class CSV
     class InvalidEncoding < StandardError
     end
+    # Raised when unexpected case is happen.
+    class UnexpectedError < StandardError
+    end
     #
     # CSV::Scanner receives a CSV output, scans it and return the content.
     # It also controls the life cycle of the object with its methods +keep_start+,
@@ -77,16 +82,17 @@ class CSV
     # +keep_end+, +keep_back+, +keep_drop+.
     #
     # CSV::InputsScanner.scan() tries to match with pattern at the current position.
-    # If there's a match, the scanner advances the “scan pointer” and returns the matched string.
+    # If there's a match, the scanner advances the "scan pointer" and returns the matched string.
     # Otherwise, the scanner returns nil.
     #
-    # CSV::InputsScanner.rest() returns the “rest” of the string (i.e. everything after the scan pointer).
+    # CSV::InputsScanner.rest() returns the "rest" of the string (i.e. everything after the scan pointer).
     # If there is no more data (eos? = true), it returns "".
     #
     class InputsScanner
-      def initialize(inputs, encoding, chunk_size: 8192)
+      def initialize(inputs, encoding, row_separator, chunk_size: 8192)
         @inputs = inputs.dup
         @encoding = encoding
+        @row_separator = row_separator
         @chunk_size = chunk_size
         @last_scanner = @inputs.empty?
         @keeps = []
@@ -94,11 +100,13 @@ class CSV
       end
       def each_line(row_separator)
+        return enum_for(__method__, row_separator) unless block_given?
         buffer = nil
         input = @scanner.rest
         position = @scanner.pos
         offset = 0
         n_row_separator_chars = row_separator.size
+        # trace(__method__, :start, line, input)
         while true
           input.each_line(row_separator) do |line|
             @scanner.pos += line.bytesize
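
The guard added at the top of `each_line` is the usual Ruby idiom for returning an Enumerator when no block is given. A small sketch of the idiom in isolation, using a hypothetical `Lines` class rather than `InputsScanner`:

```ruby
# Hypothetical class used only to illustrate the enum_for guard.
class Lines
  def initialize(text)
    @text = text
  end

  def each_line(row_separator)
    # Without a block, hand back an Enumerator over this same method.
    return enum_for(__method__, row_separator) unless block_given?
    @text.each_line(row_separator) { |line| yield line }
  end
end

p Lines.new("a\nb\n").each_line("\n").to_a # prints ["a\n", "b\n"]
```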
@@ -138,25 +146,28 @@ class CSV
       end
       def scan(pattern)
+        # trace(__method__, pattern, :start)
         value = @scanner.scan(pattern)
+        # trace(__method__, pattern, :done, :last, value) if @last_scanner
         return value if @last_scanner
-        if value
-          read_chunk if @scanner.eos?
-          return value
-        else
-          nil
-        end
+        read_chunk if value and @scanner.eos?
+        # trace(__method__, pattern, :done, value)
+        value
       end
       def scan_all(pattern)
+        # trace(__method__, pattern, :start)
         value = @scanner.scan(pattern)
+        # trace(__method__, pattern, :done, :last, value) if @last_scanner
         return value if @last_scanner
         return nil if value.nil?
         while @scanner.eos? and read_chunk and (sub_value = @scanner.scan(pattern))
+          # trace(__method__, pattern, :sub, sub_value)
           value << sub_value
         end
+        # trace(__method__, pattern, :done, value)
         value
       end
@@ -165,76 +176,135 @@ class CSV
       end
       def keep_start
-        @keeps.push([@scanner.pos, nil])
+        # trace(__method__, :start)
+        adjust_last_keep
+        @keeps.push([@scanner, @scanner.pos, nil])
+        # trace(__method__, :done)
       end
       def keep_end
-        start, buffer = @keeps.pop
-        keep = @scanner.string.byteslice(start, @scanner.pos - start)
+        # trace(__method__, :start)
+        scanner, start, buffer = @keeps.pop
+        if scanner == @scanner
+          keep = @scanner.string.byteslice(start, @scanner.pos - start)
+        else
+          keep = @scanner.string.byteslice(0, @scanner.pos)
+        end
         if buffer
           buffer << keep
           keep = buffer
         end
+        # trace(__method__, :done, keep)
         keep
       end
       def keep_back
-        start, buffer = @keeps.pop
+        # trace(__method__, :start)
+        scanner, start, buffer = @keeps.pop
         if buffer
+          # trace(__method__, :rescan, start, buffer)
           string = @scanner.string
-          keep = string.byteslice(start, string.bytesize - start)
+          if scanner == @scanner
+            keep = string.byteslice(start, string.bytesize - start)
+          else
+            keep = string
+          end
           if keep and not keep.empty?
             @inputs.unshift(StringIO.new(keep))
             @last_scanner = false
           end
           @scanner = StringScanner.new(buffer)
         else
+          if @scanner != scanner
+            message = "scanners are different but no buffer: "
+            message += "#{@scanner.inspect}(#{@scanner.object_id}): "
+            message += "#{scanner.inspect}(#{scanner.object_id})"
+            raise UnexpectedError, message
+          end
+          # trace(__method__, :repos, start, buffer)
           @scanner.pos = start
         end
         read_chunk if @scanner.eos?
       end
       def keep_drop
-        @keeps.pop
+        _, _, buffer = @keeps.pop
+        # trace(__method__, :done, :empty) unless buffer
+        return unless buffer
+        last_keep = @keeps.last
+        # trace(__method__, :done, :no_last_keep) unless last_keep
+        return unless last_keep
+        if last_keep[2]
+          last_keep[2] << buffer
+        else
+          last_keep[2] = buffer
+        end
+        # trace(__method__, :done)
       end
       def rest
         @scanner.rest
       end
+      def check(pattern)
+        @scanner.check(pattern)
+      end
       private
-      def read_chunk
-        return false if @last_scanner
+      def trace(*args)
+        pp([*args, @scanner, @scanner&.string, @scanner&.pos, @keeps])
+      end
-        unless @keeps.empty?
-          keep = @keeps.last
-          keep_start = keep[0]
-          string = @scanner.string
-          keep_data = string.byteslice(keep_start, @scanner.pos - keep_start)
-          if keep_data
-            keep_buffer = keep[1]
-            if keep_buffer
-              keep_buffer << keep_data
-            else
-              keep[1] = keep_data.dup
-            end
+      def adjust_last_keep
+        # trace(__method__, :start)
+        keep = @keeps.last
+        # trace(__method__, :done, :empty) if keep.nil?
+        return if keep.nil?
+        scanner, start, buffer = keep
+        string = @scanner.string
+        if @scanner != scanner
+          start = 0
+        end
+        if start == 0 and @scanner.eos?
+          keep_data = string
+        else
+          keep_data = string.byteslice(start, @scanner.pos - start)
+        end
+        if keep_data
+          if buffer
+            buffer << keep_data
+          else
+            keep[2] = keep_data.dup
           end
-          keep[0] = 0
         end
+        # trace(__method__, :done)
+      end
+      def read_chunk
+        return false if @last_scanner
+        adjust_last_keep
         input = @inputs.first
         case input
         when StringIO
           string = input.read
           raise InvalidEncoding unless string.valid_encoding?
+          # trace(__method__, :stringio, string)
           @scanner = StringScanner.new(string)
           @inputs.shift
           @last_scanner = @inputs.empty?
           true
         else
-          chunk = input.gets(nil, @chunk_size)
+          chunk = input.gets(@row_separator, @chunk_size)
           if chunk
             raise InvalidEncoding unless chunk.valid_encoding?
+            # trace(__method__, :chunk, chunk)
             @scanner = StringScanner.new(chunk)
             if input.respond_to?(:eof?) and input.eof?
               @inputs.shift
@@ -242,6 +312,7 @@ class CSV
             end
             true
           else
+            # trace(__method__, :no_chunk)
             @scanner = StringScanner.new("".encode(@encoding))
             @inputs.shift
             @last_scanner = @inputs.empty?
@@ -276,7 +347,11 @@ class CSV
     end
     def field_size_limit
-      @field_size_limit
+      @max_field_size&.succ
+    end
+    def max_field_size
+      @max_field_size
     end
     def skip_lines
@@ -344,6 +419,16 @@ class CSV
         end
         message = "Invalid byte sequence in #{@encoding}"
         raise MalformedCSVError.new(message, lineno)
+      rescue UnexpectedError => error
+        if @scanner
+          ignore_broken_line
+          lineno = @lineno
+        else
+          lineno = @lineno + 1
+        end
+        message = "This should not be happen: #{error.message}: "
+        message += "Please report this to https://github.com/ruby/csv/issues"
+        raise MalformedCSVError.new(message, lineno)
       end
     end
@@ -360,6 +445,7 @@ class CSV
       prepare_skip_lines
       prepare_strip
       prepare_separators
+      validate_strip_and_col_sep_options
       prepare_quoted
       prepare_unquoted
       prepare_line
@@ -387,7 +473,7 @@ class CSV
         @backslash_quote = false
       end
       @unconverted_fields = @options[:unconverted_fields]
-      @field_size_limit = @options[:field_size_limit]
+      @max_field_size = @options[:max_field_size]
       @skip_blanks = @options[:skip_blanks]
       @fields_converter = @options[:fields_converter]
       @header_fields_converter = @options[:header_fields_converter]
@@ -479,9 +565,9 @@ class CSV
     begin
       StringScanner.new("x").scan("x")
     rescue TypeError
-      @@string_scanner_scan_accept_string = false
+      STRING_SCANNER_SCAN_ACCEPT_STRING = false
     else
-      @@string_scanner_scan_accept_string = true
+      STRING_SCANNER_SCAN_ACCEPT_STRING = true
     end
     def prepare_separators
@@ -505,7 +591,7 @@ class CSV
         @first_column_separators = Regexp.new(@escaped_first_column_separator +
                                               "+".encode(@encoding))
       else
-        if @@string_scanner_scan_accept_string
+        if STRING_SCANNER_SCAN_ACCEPT_STRING
           @column_end = @column_separator
         else
           @column_end = Regexp.new(@escaped_column_separator)
@@ -526,10 +612,32 @@ class CSV
       @cr = "\r".encode(@encoding)
       @lf = "\n".encode(@encoding)
-      @cr_or_lf = Regexp.new("[\r\n]".encode(@encoding))
+      @line_end = Regexp.new("\r\n|\n|\r".encode(@encoding))
       @not_line_end = Regexp.new("[^\r\n]+".encode(@encoding))
     end
+    # This method verifies that there are no (obvious) ambiguities with the
+    # provided +col_sep+ and +strip+ parsing options. For example, if +col_sep+
+    # and +strip+ were both equal to +\t+, then there would be no clear way to
+    # parse the input.
+    def validate_strip_and_col_sep_options
+      return unless @strip
+      if @strip.is_a?(String)
+        if @column_separator.start_with?(@strip) || @column_separator.end_with?(@strip)
+          raise ArgumentError,
+                "The provided strip (#{@escaped_strip}) and " \
+                "col_sep (#{@escaped_column_separator}) options are incompatible."
+        end
+      else
+        if Regexp.new("\\A[#{@escaped_strip}]|[#{@escaped_strip}]\\z").match?(@column_separator)
+          raise ArgumentError,
+                "The provided strip (true) and " \
+                "col_sep (#{@escaped_column_separator}) options are incompatible."
+        end
+      end
+    end
     def prepare_quoted
       if @quote_character
         @quotes = Regexp.new(@escaped_quote_character +
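
The validation added in this hunk rejects a `col_sep` that begins or ends with a character that `strip` would remove. Below is a standalone sketch of the same check. `incompatible_strip_and_col_sep?` and `ASSUMED_WHITESPACE` are hypothetical names, and the whitespace set used for `strip: true` is an assumption, since this diff does not show how `@escaped_strip` is built.

```ruby
# Assumption: the characters removed by `strip: true`; not shown in this diff.
ASSUMED_WHITESPACE = " \t\f\v"

# Hypothetical standalone version of validate_strip_and_col_sep_options
# that returns a boolean instead of raising ArgumentError.
def incompatible_strip_and_col_sep?(strip, col_sep)
  return false unless strip
  if strip.is_a?(String)
    col_sep.start_with?(strip) || col_sep.end_with?(strip)
  else
    edge = "[#{Regexp.escape(ASSUMED_WHITESPACE)}]"
    Regexp.new("\\A#{edge}|#{edge}\\z").match?(col_sep)
  end
end

p incompatible_strip_and_col_sep?("\t", "\t") # prints true
p incompatible_strip_and_col_sep?(true, ",")  # prints false
```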
@@ -605,7 +713,7 @@ class CSV
             # do nothing:  ensure will set default
           end
         end
-        separator = $INPUT_RECORD_SEPARATOR if separator == :auto
+        separator = InputRecordSeparator.value if separator == :auto
       end
       separator.to_s.encode(@encoding)
     end
@@ -704,26 +812,28 @@ class CSV
       sample[0, 128].index(@quote_character)
     end
-    SCANNER_TEST = (ENV["CSV_PARSER_SCANNER_TEST"] == "yes")
-    if SCANNER_TEST
-      class UnoptimizedStringIO
-        def initialize(string)
-          @io = StringIO.new(string, "rb:#{string.encoding}")
-        end
+    class UnoptimizedStringIO # :nodoc:
+      def initialize(string)
+        @io = StringIO.new(string, "rb:#{string.encoding}")
+      end
-        def gets(*args)
-          @io.gets(*args)
-        end
+      def gets(*args)
+        @io.gets(*args)
+      end
-        def each_line(*args, &block)
-          @io.each_line(*args, &block)
-        end
+      def each_line(*args, &block)
+        @io.each_line(*args, &block)
+      end
-        def eof?
-          @io.eof?
-        end
+      def eof?
+        @io.eof?
       end
+    end
+    SCANNER_TEST = (ENV["CSV_PARSER_SCANNER_TEST"] == "yes")
+    if SCANNER_TEST
+      SCANNER_TEST_CHUNK_SIZE_NAME = "CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"
+      SCANNER_TEST_CHUNK_SIZE_VALUE = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
       def build_scanner
         inputs = @samples.collect do |sample|
           UnoptimizedStringIO.new(sample)
@@ -733,17 +843,27 @@ class CSV
         else
           inputs << @input
         end
-        chunk_size = ENV["CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"] || "1"
+        begin
+          chunk_size_value = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
+        rescue # Ractor::IsolationError
+          # Ractor on Ruby 3.0 can't read ENV value.
+          chunk_size_value = SCANNER_TEST_CHUNK_SIZE_VALUE
+        end
+        chunk_size = Integer((chunk_size_value || "1"), 10)
         InputsScanner.new(inputs,
                           @encoding,
-                          chunk_size: Integer(chunk_size, 10))
+                          @row_separator,
+                          chunk_size: chunk_size)
       end
     else
       def build_scanner
         string = nil
         if @samples.empty? and @input.is_a?(StringIO)
           string = @input.read
-        elsif @samples.size == 1 and @input.respond_to?(:eof?) and @input.eof?
+        elsif @samples.size == 1 and
+              @input != ARGF and
+              @input.respond_to?(:eof?) and
+              @input.eof?
           string = @samples[0]
         end
         if string
@@ -762,7 +882,7 @@ class CSV
             StringIO.new(sample)
           end
           inputs << @input
-          InputsScanner.new(inputs, @encoding)
+          InputsScanner.new(inputs, @encoding, @row_separator)
         end
       end
     end
@@ -796,6 +916,14 @@ class CSV
       end
     end
+    def validate_field_size(field)
+      return unless @max_field_size
+      return if field.size <= @max_field_size
+      ignore_broken_line
+      message = "Field size exceeded: #{field.size} > #{@max_field_size}"
+      raise MalformedCSVError.new(message, @lineno)
+    end
     def parse_no_quote(&block)
       @scanner.each_line(@row_separator) do |line|
         next if @skip_lines and skip_line?(line)
@@ -808,6 +936,11 @@ class CSV
         else
           line = strip_value(line)
           row = line.split(@split_column_separator, -1)
+          if @max_field_size
+            row.each do |column|
+              validate_field_size(column)
+            end
+          end
           n_columns = row.size
           i = 0
           while i < n_columns
@@ -863,6 +996,7 @@ class CSV
                 @need_robust_parsing = true
                 return parse_quotable_robust(&block)
               end
+              validate_field_size(row[i])
             end
             i += 1
           end
@@ -886,10 +1020,7 @@ class CSV
         value = parse_column_value
         if value
           @scanner.scan_all(@strip_value) if @strip_value
-          if @field_size_limit and value.size >= @field_size_limit
-            ignore_broken_line
-            raise MalformedCSVError.new("Field size exceeded", @lineno)
-          end
+          validate_field_size(value)
         end
         if parse_column_end
           row << value
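
The removed check rejected a field when `value.size >= @field_size_limit`, while `validate_field_size` accepts any field whose size is at most `@max_field_size`. That is why the `field_size_limit` reader now returns `@max_field_size&.succ`. A short sketch of the off-by-one relationship, with hypothetical helper names:

```ruby
# Hypothetical helpers restating the two comparisons from the diff.
def exceeds_old_limit?(field, field_size_limit)
  field.size >= field_size_limit
end

def exceeds_new_limit?(field, max_field_size)
  field.size > max_field_size
end

max_field_size = 5
field_size_limit = max_field_size + 1 # mirrors `@max_field_size&.succ`

["abcde", "abcdef"].each do |field|
  # Both predicates agree once the old limit is one larger than the new one.
  p [field, exceeds_old_limit?(field, field_size_limit),
     exceeds_new_limit?(field, max_field_size)]
end
```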
@@ -910,11 +1041,17 @@ class CSV
           break
         else
           if @quoted_column_value
+            if liberal_parsing? and (new_line = @scanner.check(@line_end))
+              message =
+                "Illegal end-of-line sequence outside of a quoted field " +
+                "<#{new_line.inspect}>"
+            else
+              message = "Any value after quoted field isn't allowed"
+            end
             ignore_broken_line
-            message = "Any value after quoted field isn't allowed"
             raise MalformedCSVError.new(message, @lineno)
           elsif @unquoted_column_value and
-                (new_line = @scanner.scan(@cr_or_lf))
+                (new_line = @scanner.scan(@line_end))
             ignore_broken_line
             message = "Unquoted fields do not allow new line " +
                       "<#{new_line.inspect}>"
@@ -923,7 +1060,7 @@ class CSV
             ignore_broken_line
             message = "Illegal quoting"
             raise MalformedCSVError.new(message, @lineno)
-          elsif (new_line = @scanner.scan(@cr_or_lf))
+          elsif (new_line = @scanner.scan(@line_end))
             ignore_broken_line
             message = "New line must be <#{@row_separator.inspect}> " +
                       "not <#{new_line.inspect}>"
@@ -1089,7 +1226,7 @@ class CSV
     def ignore_broken_line
       @scanner.scan_all(@not_line_end)
-      @scanner.scan_all(@cr_or_lf)
+      @scanner.scan_all(@line_end)
       @lineno += 1
     end
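
Several hunks above replace `@cr_or_lf` (`[\r\n]`) with `@line_end` (`\r\n|\n|\r`). The practical difference is that the new pattern consumes a CRLF pair as a single line ending. A minimal sketch with `StringScanner`, using plain Regexp literals instead of the encoded patterns the parser builds:

```ruby
require "strscan"

old_pattern = /[\r\n]/      # one character at a time
new_pattern = /\r\n|\n|\r/  # CRLF is tried first, as one unit

old_match = StringScanner.new("\r\nnext").scan(old_pattern)
new_match = StringScanner.new("\r\nnext").scan(new_pattern)

p old_match # prints "\r"
p new_match # prints "\r\n"
```

With the new pattern, the `new_line.inspect` values interpolated into the error messages include the whole CRLF sequence rather than only its first byte.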

data/lib/csv/table.rb CHANGED

@@ -999,9 +999,15 @@ class CSV
     # Omits the headers if option +write_headers+ is given as +false+
     # (see {Option +write_headers+}[../CSV.html#class-CSV-label-Option+write_headers]):
     #   table.to_csv(write_headers: false) # => "foo,0\nbar,1\nbaz,2\n"
-    def to_csv(write_headers: true, **options)
+    #
+    # Limit rows if option +limit+ is given like +2+:
+    #   table.to_csv(limit: 2) # => "Name,Value\nfoo,0\nbar,1\n"
+    def to_csv(write_headers: true, limit: nil, **options)
       array = write_headers ? [headers.to_csv(**options)] : []
-      @table.each do |row|
+      limit ||= @table.size
+      limit = @table.size + 1 + limit if limit < 0
+      limit = 0 if limit < 0
+      @table.first(limit).each do |row|
         array.push(row.fields.to_csv(**options)) unless row.header_row?
       end
@@ -1038,9 +1044,13 @@ class CSV
     # Example:
     #   source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
     #   table = CSV.parse(source, headers: true)
-    #   table.inspect # => "#<CSV::Table mode:col_or_row row_count:4>"
+    #   table.inspect # => "#<CSV::Table mode:col_or_row row_count:4>\nName,Value\nfoo,0\nbar,1\nbaz,2\n"
+    #
     def inspect
-      "#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>".encode("US-ASCII")
+      inspected = +"#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>"
+      summary = to_csv(limit: 5)
+      inspected << "\n" << summary if summary.encoding.ascii_compatible?
+      inspected
     end
   end
 end
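
The `limit` handling added to `to_csv` normalizes `nil` and negative values before calling `first`. The sketch below isolates that arithmetic on a plain Array; `first_rows` is a hypothetical helper, not part of `CSV::Table`.

```ruby
# Hypothetical helper reproducing the limit normalization from to_csv.
def first_rows(rows, limit = nil)
  limit ||= rows.size                        # nil means "all rows"
  limit = rows.size + 1 + limit if limit < 0 # -1 means "all", -2 drops the last row
  limit = 0 if limit < 0                     # clamp very negative limits
  rows.first(limit)
end

rows = %w[a b c d]
p first_rows(rows)      # prints ["a", "b", "c", "d"]
p first_rows(rows, 2)   # prints ["a", "b"]
p first_rows(rows, -2)  # prints ["a", "b", "c"]
p first_rows(rows, -10) # prints []
```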

data/lib/csv/version.rb CHANGED

@@ -2,5 +2,5 @@
 class CSV
   # The version of the installed library.
-  VERSION = "3.2.0"
+  VERSION = "3.2.3"
 end

data/lib/csv/writer.rb CHANGED

@@ -1,5 +1,6 @@
 # frozen_string_literal: true
+require_relative "input_record_separator"
 require_relative "match_p"
 require_relative "row"
@@ -133,7 +134,7 @@ class CSV
       @column_separator = @options[:column_separator].to_s.encode(@encoding)
       row_separator = @options[:row_separator]
       if row_separator == :auto
-        @row_separator = $INPUT_RECORD_SEPARATOR.encode(@encoding)
+        @row_separator = InputRecordSeparator.value.encode(@encoding)
       else
         @row_separator = row_separator.to_s.encode(@encoding)
       end

data/lib/csv.rb CHANGED

@@ -90,11 +90,11 @@
 # with any questions.
 require "forwardable"
-require "English"
 require "date"
 require "stringio"
 require_relative "csv/fields_converter"
+require_relative "csv/input_record_separator"
 require_relative "csv/match_p"
 require_relative "csv/parser"
 require_relative "csv/row"
@@ -341,6 +341,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
 #     liberal_parsing:    false,
 #     nil_value:          nil,
 #     empty_value:        "",
+#     strip:              false,
 #     # For generating.
 #     write_headers:      nil,
 #     quote_empty:        true,
@@ -348,7 +349,6 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
 #     write_converters:   nil,
 #     write_nil_value:    nil,
 #     write_empty_value:  "",
-#     strip:              false,
 #   }
 #
 # ==== Options for Parsing
@@ -357,7 +357,9 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
 # - +row_sep+: Specifies the row separator; used to delimit rows.
 # - +col_sep+: Specifies the column separator; used to delimit fields.
 # - +quote_char+: Specifies the quote character; used to quote fields.
-# - +field_size_limit+: Specifies the maximum field size allowed.
+# - +field_size_limit+: Specifies the maximum field size + 1 allowed.
+#   Deprecated since 3.2.3. Use +max_field_size+ instead.
+# - +max_field_size+: Specifies the maximum field size allowed.
 # - +converters+: Specifies the field converters to be used.
 # - +unconverted_fields+: Specifies whether unconverted fields are to be available.
 # - +headers+: Specifies whether data contains headers,
@@ -366,8 +368,9 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
 # - +header_converters+: Specifies the header converters to be used.
 # - +skip_blanks+: Specifies whether blanks lines are to be ignored.
 # - +skip_lines+: Specifies how comments lines are to be recognized.
-# - +strip+: Specifies whether leading and trailing whitespace are
-#   to be stripped from fields..
+# - +strip+: Specifies whether leading and trailing whitespace are to be
+#   stripped from fields. This must be compatible with +col_sep+; if it is not,
+#   then an +ArgumentError+ exception will be raised.
 # - +liberal_parsing+: Specifies whether \CSV should attempt to parse
 #   non-compliant data.
 # - +nil_value+: Specifies the object that is to be substituted for each null (no-text) field.
@@ -513,7 +516,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
 #  [" 1 ", #<struct CSV::FieldInfo index=1, line=2, header=nil>]
 #  [" baz ", #<struct CSV::FieldInfo index=0, line=3, header=nil>]
 #  [" 2 ", #<struct CSV::FieldInfo index=1, line=3, header=nil>]
-# Each CSV::Info object shows:
+# Each CSV::FieldInfo object shows:
 # - The 0-based field index.
 # - The 1-based line index.
 # - The field header, if any.
@@ -547,6 +550,14 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
 #
 # There is no such storage structure for write headers.
 #
+# In order for the parsing methods to access stored converters in non-main-Ractors, the
+# storage structure must be made shareable first.
+# Therefore, <tt>Ractor.make_shareable(CSV::Converters)</tt> and
+# <tt>Ractor.make_shareable(CSV::HeaderConverters)</tt> must be called before the creation
+# of Ractors that use the converters stored in these structures. (Since making the storage
+# structures shareable involves freezing them, any custom converters that are to be used
+# must be added first.)
+#
 # ===== Converter Lists
 #
 # A _converter_ _list_ is an \Array that may include any assortment of:
@@ -917,8 +928,10 @@ class CSV
     symbol:   lambda { |h|
       h.encode(ConverterEncoding).downcase.gsub(/[^\s\w]+/, "").strip.
                                            gsub(/\s+/, "_").to_sym
-    }
+    },
+    symbol_raw: lambda { |h| h.encode(ConverterEncoding).to_sym }
   }
   # Default values for method options.
   DEFAULT_OPTIONS = {
     # For both parsing and generating.
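
The new `symbol_raw` header converter only converts the header to a Symbol, whereas `symbol` also downcases it, drops punctuation, and replaces whitespace with underscores. A rough comparison follows; the two lambdas are simplified copies that omit the `encode(ConverterEncoding)` step, and the header text is a made-up example.

```ruby
# Simplified copies of the two converters, without the encoding step.
symbol = lambda { |h|
  h.downcase.gsub(/[^\s\w]+/, "").strip.gsub(/\s+/, "_").to_sym
}
symbol_raw = lambda { |h| h.to_sym }

header = "First Name (legal)" # hypothetical header text

p symbol.call(header)     # prints :first_name_legal
p symbol_raw.call(header) # prints :"First Name (legal)"
```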
@@ -927,6 +940,7 @@ class CSV
     quote_char:         '"',
     # For parsing.
     field_size_limit:   nil,
+    max_field_size:     nil,
     converters:         nil,
     unconverted_fields: nil,
     headers:            false,
@@ -937,6 +951,7 @@ class CSV
     liberal_parsing:    false,
     nil_value:          nil,
     empty_value:        "",
+    strip:              false,
     # For generating.
     write_headers:      nil,
     quote_empty:        true,
@@ -944,7 +959,6 @@ class CSV
     write_converters:   nil,
     write_nil_value:    nil,
     write_empty_value:  "",
-    strip:              false,
   }.freeze
   class << self
@@ -957,6 +971,8 @@ class CSV
     # Creates or retrieves cached \CSV objects.
     # For arguments and options, see CSV.new.
     #
+    # This API is not Ractor-safe.
+    #
     # ---
     #
     # With no block given, returns a \CSV object.
@@ -1187,7 +1203,7 @@ class CSV
     #   See {Options for Parsing}[#class-CSV-label-Options+for+Parsing].
     def filter(input=nil, output=nil, **options)
       # parse options for input, output, or both
-      in_options, out_options = Hash.new, {row_sep: $INPUT_RECORD_SEPARATOR}
+      in_options, out_options = Hash.new, {row_sep: InputRecordSeparator.value}
       options.each do |key, value|
         case key.to_s
         when /\Ain(?:put)?_(.+)\Z/
@@ -1407,8 +1423,8 @@ class CSV
     # Argument +ary+ must be an \Array.
     #
     # Special options:
-    # * Option <tt>:row_sep</tt> defaults to <tt>$INPUT_RECORD_SEPARATOR</tt>
-    #   (<tt>$/</tt>).:
+    # * Option <tt>:row_sep</tt> defaults to <tt>"\n"</tt> on Ruby 3.0 or later
+    #   and <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>) otherwise.:
     #     $INPUT_RECORD_SEPARATOR # => "\n"
     # * This method accepts an additional option, <tt>:encoding</tt>, which sets the base
     #   Encoding for the output. This method will try to guess your Encoding from
@@ -1430,7 +1446,7 @@ class CSV
     #   CSV.generate_line(:foo)
     #
     def generate_line(row, **options)
-      options = {row_sep: $INPUT_RECORD_SEPARATOR}.merge(options)
+      options = {row_sep: InputRecordSeparator.value}.merge(options)
       str = +""
       if options[:encoding]
         str.force_encoding(options[:encoding])
@@ -1853,6 +1869,7 @@ class CSV
                  row_sep: :auto,
                  quote_char: '"',
                  field_size_limit: nil,
+                 max_field_size: nil,
                  converters: nil,
                  unconverted_fields: nil,
                  headers: false,
@@ -1868,11 +1885,11 @@ class CSV
                  encoding: nil,
                  nil_value: nil,
                  empty_value: "",
+                 strip: false,
                  quote_empty: true,
                  write_converters: nil,
                  write_nil_value: nil,
-                 write_empty_value: "",
-                 strip: false)
+                 write_empty_value: "")
     raise ArgumentError.new("Cannot parse nil as CSV") if data.nil?
     if data.is_a?(String)
@@ -1895,11 +1912,14 @@ class CSV
     @initial_header_converters = header_converters
     @initial_write_converters = write_converters
+    if max_field_size.nil? and field_size_limit
+      max_field_size = field_size_limit - 1
+    end
     @parser_options = {
       column_separator: col_sep,
       row_separator: row_sep,
       quote_character: quote_char,
-      field_size_limit: field_size_limit,
+      max_field_size: max_field_size,
       unconverted_fields: unconverted_fields,
       headers: headers,
       return_headers: return_headers,
@@ -1967,10 +1987,24 @@ class CSV
   # Returns the limit for field size; used for parsing;
   # see {Option +field_size_limit+}[#class-CSV-label-Option+field_size_limit]:
   #   CSV.new('').field_size_limit # => nil
+  #
+  # Deprecated since 3.2.3. Use +max_field_size+ instead.
   def field_size_limit
     parser.field_size_limit
   end
+  # :call-seq:
+  #   csv.max_field_size -> integer or nil
+  #
+  # Returns the limit for field size; used for parsing;
+  # see {Option +max_field_size+}[#class-CSV-label-Option+max_field_size]:
+  #   CSV.new('').max_field_size # => nil
+  #
+  # Since 3.2.3.
+  def max_field_size
+    parser.max_field_size
+  end
   # :call-seq:
   #   csv.skip_lines -> regexp or nil
   #
@@ -1992,6 +2026,10 @@ class CSV
   #   csv.converters # => [:integer]
   #   csv.convert(proc {|x| x.to_s })
   #   csv.converters
+  #
+  # Note that you need to call
+  # +Ractor.make_shareable(CSV::Converters)+ on the main Ractor to use
+  # this method.
   def converters
     parser_fields_converter.map do |converter|
       name = Converters.rassoc(converter)
@@ -2054,6 +2092,10 @@ class CSV
   # Returns an \Array containing header converters; used for parsing;
   # see {Header Converters}[#class-CSV-label-Header+Converters]:
   #   CSV.new('').header_converters # => []
+  #
+  # Note that you need to call
+  # +Ractor.make_shareable(CSV::HeaderConverters)+ on the main Ractor
+  # to use this method.
   def header_converters
     header_fields_converter.map do |converter|
       name = HeaderConverters.rassoc(converter)
@@ -2694,7 +2736,7 @@ class CSV
   def build_parser_fields_converter
     specific_options = {
-      builtin_converters: Converters,
+      builtin_converters_name: :Converters,
     }
     options = @base_fields_converter_options.merge(specific_options)
     build_fields_converter(@initial_converters, options)
@@ -2706,7 +2748,7 @@ class CSV
   def build_header_fields_converter
     specific_options = {
-      builtin_converters: HeaderConverters,
+      builtin_converters_name: :HeaderConverters,
       accept_nil: true,
     }
     options = @base_fields_converter_options.merge(specific_options)
@@ -2774,6 +2816,8 @@ end
 #   io = StringIO.new
 #   CSV(io, col_sep: ";") { |csv| csv << ["a", "b", "c"] }
 #
+# This API is not Ractor-safe.
+#
 def CSV(*args, **options, &block)
   CSV.instance(*args, **options, &block)
 end
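
The `max_field_size` fallback added to `CSV#initialize` above can be sketched in isolation. This is a minimal illustration, not the library's code: `resolve_max_field_size` is a hypothetical helper that only mirrors the `field_size_limit - 1` conversion shown in the hunk, and it assumes nothing else about how the parser enforces the limit.

```ruby
# Hypothetical helper (not part of the csv gem): mirrors the fallback in the
# diff, where a legacy field_size_limit is translated into max_field_size
# only when max_field_size was not given explicitly.
def resolve_max_field_size(max_field_size: nil, field_size_limit: nil)
  if max_field_size.nil? && field_size_limit
    max_field_size = field_size_limit - 1
  end
  max_field_size
end

p resolve_max_field_size(field_size_limit: 10)                    # legacy option only
p resolve_max_field_size(max_field_size: 10, field_size_limit: 5) # explicit value wins
p resolve_max_field_size                                          # neither option set
```

Under this sketch a legacy `field_size_limit: 10` maps to a `max_field_size` of 9, an explicit `max_field_size` is left untouched, and leaving both unset keeps the limit at `nil`.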

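The difference between the existing `symbol` header converter and the new `symbol_raw` one can be shown by copying the two lambdas out of the diff. This is a sketch under one assumption: `CONVERTER_ENCODING` stands in for `CSV::ConverterEncoding` and is taken to be UTF-8. The constant names and the sample header are illustrative only.

```ruby
# Assumption: stands in for CSV::ConverterEncoding, taken here to be UTF-8.
CONVERTER_ENCODING = Encoding::UTF_8

# Copied from the diff: normalizes the header before making it a Symbol.
SYMBOL = lambda { |h|
  h.encode(CONVERTER_ENCODING).downcase.gsub(/[^\s\w]+/, "").strip.
                                        gsub(/\s+/, "_").to_sym
}

# Copied from the diff: converts the header to a Symbol without normalizing it.
SYMBOL_RAW = lambda { |h| h.encode(CONVERTER_ENCODING).to_sym }

header = " First Name (legal)"
p SYMBOL.call(header)     # => :first_name_legal
p SYMBOL_RAW.call(header) # => :" First Name (legal)"
```

`symbol_raw` keeps case, punctuation, and surrounding whitespace exactly as they appear in the header row, which matters when headers must round-trip unchanged.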
metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: csv
 version: !ruby/object:Gem::Version
-  version: 3.2.0
+  version: 3.2.3
 platform: ruby
 authors:
 - James Edward Gray II
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2021-06-05 00:00:00.000000000 Z
+date: 2022-04-08 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -59,14 +59,14 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 3.4.3
+        version: 3.4.8
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 3.4.3
+        version: 3.4.8
 description: The CSV library provides a complete interface to CSV files and data.
   It offers tools to enable you to read and write to and from Strings or IO objects,
   as needed.
@@ -118,6 +118,7 @@ files:
 - lib/csv/core_ext/string.rb
 - lib/csv/delete_suffix.rb
 - lib/csv/fields_converter.rb
+- lib/csv/input_record_separator.rb
 - lib/csv/match_p.rb
 - lib/csv/parser.rb
 - lib/csv/row.rb
@@ -146,7 +147,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.3.0.dev
+rubygems_version: 3.4.0.dev
 signing_key:
 specification_version: 4
 summary: CSV Reading and Writing