logstash-filter-csv 3.0.2 → 3.0.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +3 -0
- data/docs/index.asciidoc +152 -0
- data/lib/logstash/filters/csv.rb +20 -4
- data/logstash-filter-csv.gemspec +2 -2
- data/spec/filters/csv_spec.rb +66 -4
- metadata +4 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: b64bc3605560eab32e1532795bcfa5afc2471da5
|
4
|
+
data.tar.gz: 0a17fb598a3ac9932e2e7485309eb3c34f22efd2
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: c5d0fed6269330d4f0e5de50023f6cba7371f3aecaf4d4996f82a0f5f645a4c4381b627086a2eaef484547ab450b66569bb0e15a4fe71c59f8d5a299d00d24f9
|
7
|
+
data.tar.gz: 89563ed47c27a9a50e5047bd8cc09b8cdbb613154c9a96fabc14cc4f6fa2acf372e56a106adabda8f999aaaa710b9148e261e6f4843257eefc03f29708610be5
|
data/CHANGELOG.md
CHANGED
data/docs/index.asciidoc
ADDED
@@ -0,0 +1,152 @@
|
|
1
|
+
:plugin: csv
|
2
|
+
:type: filter
|
3
|
+
|
4
|
+
///////////////////////////////////////////
|
5
|
+
START - GENERATED VARIABLES, DO NOT EDIT!
|
6
|
+
///////////////////////////////////////////
|
7
|
+
:version: %VERSION%
|
8
|
+
:release_date: %RELEASE_DATE%
|
9
|
+
:changelog_url: %CHANGELOG_URL%
|
10
|
+
:include_path: ../../../logstash/docs/include
|
11
|
+
///////////////////////////////////////////
|
12
|
+
END - GENERATED VARIABLES, DO NOT EDIT!
|
13
|
+
///////////////////////////////////////////
|
14
|
+
|
15
|
+
[id="plugins-{type}-{plugin}"]
|
16
|
+
|
17
|
+
=== Csv
|
18
|
+
|
19
|
+
include::{include_path}/plugin_header.asciidoc[]
|
20
|
+
|
21
|
+
==== Description
|
22
|
+
|
23
|
+
The CSV filter takes an event field containing CSV data, parses it,
|
24
|
+
and stores it as individual fields (can optionally specify the names).
|
25
|
+
This filter can also parse data with any separator, not just commas.
|
26
|
+
|
27
|
+
[id="plugins-{type}s-{plugin}-options"]
|
28
|
+
==== Csv Filter Configuration Options
|
29
|
+
|
30
|
+
This plugin supports the following configuration options plus the <<plugins-{type}s-common-options>> described later.
|
31
|
+
|
32
|
+
[cols="<,<,<",options="header",]
|
33
|
+
|=======================================================================
|
34
|
+
|Setting |Input type|Required
|
35
|
+
| <<plugins-{type}s-{plugin}-autodetect_column_names>> |<<boolean,boolean>>|No
|
36
|
+
| <<plugins-{type}s-{plugin}-autogenerate_column_names>> |<<boolean,boolean>>|No
|
37
|
+
| <<plugins-{type}s-{plugin}-columns>> |<<array,array>>|No
|
38
|
+
| <<plugins-{type}s-{plugin}-convert>> |<<hash,hash>>|No
|
39
|
+
| <<plugins-{type}s-{plugin}-quote_char>> |<<string,string>>|No
|
40
|
+
| <<plugins-{type}s-{plugin}-separator>> |<<string,string>>|No
|
41
|
+
| <<plugins-{type}s-{plugin}-skip_empty_columns>> |<<boolean,boolean>>|No
|
42
|
+
| <<plugins-{type}s-{plugin}-source>> |<<string,string>>|No
|
43
|
+
| <<plugins-{type}s-{plugin}-target>> |<<string,string>>|No
|
44
|
+
|=======================================================================
|
45
|
+
|
46
|
+
Also see <<plugins-{type}s-common-options>> for a list of options supported by all
|
47
|
+
filter plugins.
|
48
|
+
|
49
|
+
|
50
|
+
|
51
|
+
[id="plugins-{type}s-{plugin}-autodetect_column_names"]
|
52
|
+
===== `autodetect_column_names`
|
53
|
+
|
54
|
+
* Value type is <<boolean,boolean>>
|
55
|
+
* Default value is `false`
|
56
|
+
|
57
|
+
Define whether column names should be auto-detected from the header column or not.
|
58
|
+
Defaults to false.
|
59
|
+
|
60
|
+
[id="plugins-{type}s-{plugin}-autogenerate_column_names"]
|
61
|
+
===== `autogenerate_column_names`
|
62
|
+
|
63
|
+
* Value type is <<boolean,boolean>>
|
64
|
+
* Default value is `true`
|
65
|
+
|
66
|
+
Define whether column names should be autogenerated or not.
|
67
|
+
Defaults to true. If set to false, columns not having a header specified will not be parsed.
|
68
|
+
|
69
|
+
[id="plugins-{type}s-{plugin}-columns"]
|
70
|
+
===== `columns`
|
71
|
+
|
72
|
+
* Value type is <<array,array>>
|
73
|
+
* Default value is `[]`
|
74
|
+
|
75
|
+
Define a list of column names (in the order they appear in the CSV,
|
76
|
+
as if it were a header line). If `columns` is not configured, or there
|
77
|
+
are not enough columns specified, the default column names are
|
78
|
+
"column1", "column2", etc. In the case that there are more columns
|
79
|
+
in the data than specified in this column list, extra columns will be auto-numbered:
|
80
|
+
(e.g. "user_defined_1", "user_defined_2", "column3", "column4", etc.)
|
81
|
+
|
82
|
+
[id="plugins-{type}s-{plugin}-convert"]
|
83
|
+
===== `convert`
|
84
|
+
|
85
|
+
* Value type is <<hash,hash>>
|
86
|
+
* Default value is `{}`
|
87
|
+
|
88
|
+
Define a set of datatype conversions to be applied to columns.
|
89
|
+
Possible conversions are integer, float, date, date_time, boolean
|
90
|
+
|
91
|
+
# Example:
|
92
|
+
[source,ruby]
|
93
|
+
filter {
|
94
|
+
csv {
|
95
|
+
convert => {
|
96
|
+
"column1" => "integer"
|
97
|
+
"column2" => "boolean"
|
98
|
+
}
|
99
|
+
}
|
100
|
+
}
|
101
|
+
|
102
|
+
[id="plugins-{type}s-{plugin}-quote_char"]
|
103
|
+
===== `quote_char`
|
104
|
+
|
105
|
+
* Value type is <<string,string>>
|
106
|
+
* Default value is `"\""`
|
107
|
+
|
108
|
+
Define the character used to quote CSV fields. If this is not specified
|
109
|
+
the default is a double quote `"`.
|
110
|
+
Optional.
|
111
|
+
|
112
|
+
[id="plugins-{type}s-{plugin}-separator"]
|
113
|
+
===== `separator`
|
114
|
+
|
115
|
+
* Value type is <<string,string>>
|
116
|
+
* Default value is `","`
|
117
|
+
|
118
|
+
Define the column separator value. If this is not specified, the default
|
119
|
+
is a comma `,`. If you want to define a tabulation as a separator, you need
|
120
|
+
to set the value to the actual tab character and not `\t`.
|
121
|
+
Optional.
|
122
|
+
|
123
|
+
[id="plugins-{type}s-{plugin}-skip_empty_columns"]
|
124
|
+
===== `skip_empty_columns`
|
125
|
+
|
126
|
+
* Value type is <<boolean,boolean>>
|
127
|
+
* Default value is `false`
|
128
|
+
|
129
|
+
Define whether empty columns should be skipped.
|
130
|
+
Defaults to false. If set to true, columns containing no value will not get set.
|
131
|
+
|
132
|
+
[id="plugins-{type}s-{plugin}-source"]
|
133
|
+
===== `source`
|
134
|
+
|
135
|
+
* Value type is <<string,string>>
|
136
|
+
* Default value is `"message"`
|
137
|
+
|
138
|
+
The CSV data in the value of the `source` field will be expanded into a
|
139
|
+
data structure.
|
140
|
+
|
141
|
+
[id="plugins-{type}s-{plugin}-target"]
|
142
|
+
===== `target`
|
143
|
+
|
144
|
+
* Value type is <<string,string>>
|
145
|
+
* There is no default value for this setting.
|
146
|
+
|
147
|
+
Define target field for placing the data.
|
148
|
+
Defaults to writing to the root of the event.
|
149
|
+
|
150
|
+
|
151
|
+
|
152
|
+
include::{include_path}/{type}.asciidoc[]
|
data/lib/logstash/filters/csv.rb
CHANGED
@@ -23,7 +23,8 @@ class LogStash::Filters::CSV < LogStash::Filters::Base
|
|
23
23
|
config :columns, :validate => :array, :default => []
|
24
24
|
|
25
25
|
# Define the column separator value. If this is not specified, the default
|
26
|
-
# is a comma `,`.
|
26
|
+
# is a comma `,`. If you want to define a tabulation as a separator, you need
|
27
|
+
# to set the value to the actual tab character and not `\t`.
|
27
28
|
# Optional.
|
28
29
|
config :separator, :validate => :string, :default => ","
|
29
30
|
|
@@ -51,11 +52,18 @@ class LogStash::Filters::CSV < LogStash::Filters::Base
|
|
51
52
|
# [source,ruby]
|
52
53
|
# filter {
|
53
54
|
# csv {
|
54
|
-
# convert => {
|
55
|
+
# convert => {
|
56
|
+
# "column1" => "integer"
|
57
|
+
# "column2" => "boolean"
|
58
|
+
# }
|
55
59
|
# }
|
56
60
|
# }
|
57
61
|
config :convert, :validate => :hash, :default => {}
|
58
62
|
|
63
|
+
# Define whether column names should be auto-detected from the header column or not.
|
64
|
+
# Defaults to false.
|
65
|
+
config :autodetect_column_names, :validate => :boolean, :default => false
|
66
|
+
|
59
67
|
CONVERTERS = {
|
60
68
|
:integer => lambda do |value|
|
61
69
|
CSV::Converters[:integer].call(value)
|
@@ -66,11 +74,13 @@ class LogStash::Filters::CSV < LogStash::Filters::Base
|
|
66
74
|
end,
|
67
75
|
|
68
76
|
:date => lambda do |value|
|
69
|
-
CSV::Converters[:date].call(value)
|
77
|
+
result = CSV::Converters[:date].call(value)
|
78
|
+
result.is_a?(Date) ? LogStash::Timestamp.new(result.to_time) : result
|
70
79
|
end,
|
71
80
|
|
72
81
|
:date_time => lambda do |value|
|
73
|
-
CSV::Converters[:date_time].call(value)
|
82
|
+
result = CSV::Converters[:date_time].call(value)
|
83
|
+
result.is_a?(DateTime) ? LogStash::Timestamp.new(result.to_time) : result
|
74
84
|
end,
|
75
85
|
|
76
86
|
:boolean => lambda do |value|
|
@@ -112,6 +122,12 @@ class LogStash::Filters::CSV < LogStash::Filters::Base
|
|
112
122
|
begin
|
113
123
|
values = CSV.parse_line(source, :col_sep => @separator, :quote_char => @quote_char)
|
114
124
|
|
125
|
+
if (@autodetect_column_names && @columns.empty?)
|
126
|
+
@columns = values
|
127
|
+
event.cancel
|
128
|
+
return
|
129
|
+
end
|
130
|
+
|
115
131
|
values.each_index do |i|
|
116
132
|
unless (@skip_empty_columns && (values[i].nil? || values[i].empty?))
|
117
133
|
unless ignore_field?(i)
|
data/logstash-filter-csv.gemspec
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
Gem::Specification.new do |s|
|
2
2
|
|
3
3
|
s.name = 'logstash-filter-csv'
|
4
|
-
s.version = '3.0.2'
|
4
|
+
s.version = '3.0.3'
|
5
5
|
s.licenses = ['Apache License (2.0)']
|
6
6
|
s.summary = "The CSV filter takes an event field containing CSV data, parses it, and stores it as individual fields (can optionally specify the names)."
|
7
7
|
s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
|
@@ -11,7 +11,7 @@ Gem::Specification.new do |s|
|
|
11
11
|
s.require_paths = ["lib"]
|
12
12
|
|
13
13
|
# Files
|
14
|
-
s.files = Dir["lib/**/*","spec/**/*","*.gemspec","*.md","CONTRIBUTORS","Gemfile","LICENSE","NOTICE.TXT", "vendor/jar-dependencies/**/*.jar", "vendor/jar-dependencies/**/*.rb", "VERSION"]
|
14
|
+
s.files = Dir["lib/**/*","spec/**/*","*.gemspec","*.md","CONTRIBUTORS","Gemfile","LICENSE","NOTICE.TXT", "vendor/jar-dependencies/**/*.jar", "vendor/jar-dependencies/**/*.rb", "VERSION", "docs/**/*"]
|
15
15
|
|
16
16
|
# Tests
|
17
17
|
s.test_files = s.files.grep(%r{^(test|spec|features)/})
|
data/spec/filters/csv_spec.rb
CHANGED
@@ -20,7 +20,7 @@ describe LogStash::Filters::CSV do
|
|
20
20
|
|
21
21
|
it "should register" do
|
22
22
|
input = LogStash::Plugin.lookup("filter", "csv").new(config)
|
23
|
-
expect {input.register}.to raise_error
|
23
|
+
expect {input.register}.to raise_error(LogStash::ConfigurationError)
|
24
24
|
end
|
25
25
|
end
|
26
26
|
end
|
@@ -232,18 +232,64 @@ describe LogStash::Filters::CSV do
|
|
232
232
|
describe "using field convertion" do
|
233
233
|
|
234
234
|
let(:config) do
|
235
|
-
{
|
235
|
+
{
|
236
|
+
"convert" => {
|
237
|
+
"column1" => "integer",
|
238
|
+
"column3" => "boolean",
|
239
|
+
"column4" => "float",
|
240
|
+
"column5" => "date",
|
241
|
+
"column6" => "date_time",
|
242
|
+
"column7" => "date",
|
243
|
+
"column8" => "date_time",
|
244
|
+
}
|
245
|
+
}
|
236
246
|
end
|
237
|
-
|
247
|
+
# 2017-06-01,2001-02-03T04:05:06+07:00
|
248
|
+
let(:doc) { "1234,bird,false,3.14159265359,2017-06-01,2001-02-03 04:05:06,invalid_date,invalid_date_time" }
|
238
249
|
let(:event) { LogStash::Event.new("message" => doc) }
|
239
250
|
|
240
|
-
it "
|
251
|
+
it "converts to integer" do
|
241
252
|
plugin.filter(event)
|
242
253
|
expect(event.get("column1")).to eq(1234)
|
254
|
+
end
|
255
|
+
|
256
|
+
it "does not convert without converter" do
|
257
|
+
plugin.filter(event)
|
243
258
|
expect(event.get("column2")).to eq("bird")
|
259
|
+
end
|
260
|
+
|
261
|
+
it "converts to boolean" do
|
262
|
+
plugin.filter(event)
|
244
263
|
expect(event.get("column3")).to eq(false)
|
245
264
|
end
|
246
265
|
|
266
|
+
it "converts to float" do
|
267
|
+
plugin.filter(event)
|
268
|
+
expect(event.get("column4")).to eq(3.14159265359)
|
269
|
+
end
|
270
|
+
|
271
|
+
it "converts to date" do
|
272
|
+
plugin.filter(event)
|
273
|
+
expect(event.get("column5")).to be_a(LogStash::Timestamp)
|
274
|
+
expect(event.get("column5").to_s).to eq(LogStash::Timestamp.new(Date.parse("2017-06-01").to_time).to_s)
|
275
|
+
end
|
276
|
+
|
277
|
+
it "converts to date_time" do
|
278
|
+
plugin.filter(event)
|
279
|
+
expect(event.get("column6")).to be_a(LogStash::Timestamp)
|
280
|
+
expect(event.get("column6").to_s).to eq(LogStash::Timestamp.new(DateTime.parse("2001-02-03 04:05:06").to_time).to_s)
|
281
|
+
end
|
282
|
+
|
283
|
+
it "tries to converts to date but return original" do
|
284
|
+
plugin.filter(event)
|
285
|
+
expect(event.get("column7")).to eq("invalid_date")
|
286
|
+
end
|
287
|
+
|
288
|
+
it "tries to converts to date_time but return original" do
|
289
|
+
plugin.filter(event)
|
290
|
+
expect(event.get("column8")).to eq("invalid_date_time")
|
291
|
+
end
|
292
|
+
|
247
293
|
context "when using column names" do
|
248
294
|
|
249
295
|
let(:config) do
|
@@ -259,5 +305,21 @@ describe LogStash::Filters::CSV do
|
|
259
305
|
end
|
260
306
|
end
|
261
307
|
end
|
308
|
+
|
309
|
+
describe "given autodetect option" do
|
310
|
+
let(:header) { LogStash::Event.new("message" => "first,last,address") }
|
311
|
+
let(:doc) { "big,bird,sesame street" }
|
312
|
+
let(:config) do
|
313
|
+
{ "autodetect_column_names" => true }
|
314
|
+
end
|
315
|
+
|
316
|
+
it "extract all the values with the autodetected header" do
|
317
|
+
plugin.filter(header)
|
318
|
+
plugin.filter(event)
|
319
|
+
expect(event.get("first")).to eq("big")
|
320
|
+
expect(event.get("last")).to eq("bird")
|
321
|
+
expect(event.get("address")).to eq("sesame street")
|
322
|
+
end
|
323
|
+
end
|
262
324
|
end
|
263
325
|
end
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: logstash-filter-csv
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 3.0.2
|
4
|
+
version: 3.0.3
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Elastic
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2017-05-24 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
requirement: !ruby/object:Gem::Requirement
|
@@ -56,6 +56,7 @@ files:
|
|
56
56
|
- LICENSE
|
57
57
|
- NOTICE.TXT
|
58
58
|
- README.md
|
59
|
+
- docs/index.asciidoc
|
59
60
|
- lib/logstash/filters/csv.rb
|
60
61
|
- logstash-filter-csv.gemspec
|
61
62
|
- spec/filters/csv_spec.rb
|
@@ -81,7 +82,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
81
82
|
version: '0'
|
82
83
|
requirements: []
|
83
84
|
rubyforge_project:
|
84
|
-
rubygems_version: 2.
|
85
|
+
rubygems_version: 2.4.8
|
85
86
|
signing_key:
|
86
87
|
specification_version: 4
|
87
88
|
summary: The CSV filter takes an event field containing CSV data, parses it, and stores it as individual fields (can optionally specify the names).
|