logstash-output-scacsv 1.0.0 → 1.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 02e5d4b3c9ceaa6e7968a2ec2e3563e1b9bb2119
- data.tar.gz: a979b0ca6d5bb260769f9f5e83d5e315e7e04728
+ metadata.gz: 373f6c4b2a800362409b49877ddfbb9448a1cc2f
+ data.tar.gz: 2b73b60065ab79367ea9e24a3b908c64a415a39b
  SHA512:
- metadata.gz: f0b09ab0d67f2e4cee37eb3bd34c3e75c06ebf16e60ea346d511dccc90d402dd73e4b4d713d9dbb990ff54f52bbce0e28a8bc3d66b95fe7329377b090dcdd677
- data.tar.gz: 9ba0172228e956cb5a0c7cb71a4851c6b8424f9d3d631fd2a5468e02c776fc0346f404f76fada3f6a2178c470381f107873d660792ffce9d8003d3d5f80ada29
+ metadata.gz: ce2af21c0d26bb9686975c5be44481d5ad5bdbfd0f77c4961189310db35fc83a6a4d466cbafbf5fe975df001e592d5172490e01fcd5090d80f7fcf7d8c314d62
+ data.tar.gz: 8d1e769135cf5ba7980a75d6bfed39750120801d15dbe596d6e8ffa5115432b039300e1dde5cc17ca505f2f5ed08b748b1791aebba6add9fb9e123e881d73168
data/README.md CHANGED
@@ -1,86 +1,174 @@
- # Logstash Plugin
+ <html>
+ <head>
+ <meta charset="UTF-8">
+ <title>Logstash for SCAPI - output scacsv</title>
+ <link rel="stylesheet" href="http://logstash.net/style.css">
+ </head>
+ <body>
+ <div class="container">
+ <div class="header">
+
+ <!--main content goes here, yo!-->
+ <div class="content_wrapper">
+ <h2>scacsv</h2>
+ <h3> Synopsis </h3>
+ Receives a stream of events and outputs files complying with the SCAPI requirements for headers and file naming.
+ Essentially, it bridges Logstash's 'streaming' approach and SCAPI's file-based input requirements.
+ This is what it might look like in your config file:
+ <pre><code>output {
+   scacsv {
+     <a href="#fields">fields</a> => ... # array (required)
+     <a href="#header">header</a> => ... # array (optional), default: {}
+     <a href="#path">path</a> => ... # string (required)
+     <a href="#group">group</a> => ... # string (required)
+     <a href="#max_size">max_size</a> => ... # number (optional), default: 0 (not used)
+     <a href="#flush_interval">flush_interval</a> => ... # number (optional), default: 60
+     <a href="#file_interval_width">file_interval_width</a> => ... # string (optional), default: ""
+     <a href="#time_field">time_field</a> => ... # string (optional), default: 'timestamp'
+     <a href="#time_field_format">time_field_format</a> => ... # string (required)
+     <a href="#timestamp_output_format">timestamp_output_format</a> => ... # string (optional), default: ""
+     <a href="#increment_time">increment_time</a> => ... # boolean (optional), default: false
+   }
+ }
+ </code></pre>
+ <h3> Details </h3>
+ Note: by default this plugin expects the supplied timestamps to be in epoch time. You can override this expectation and supply non-epoch timestamps, which will be used as-is, via the <a href="#keep_original_timestamps">keep_original_timestamps</a> configuration option. However, such non-epoch timestamps will not automatically be incremented when determining the end time of the file.
+ <h4>
+ <a name="fields">
+ fields
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="http://logstash.net/docs/1.4.2/configuration#array">Array</a> </li>
+ <li> There is no default for this setting </li>
+ </ul>
+ <p>Specify which fields from the incoming event you wish to output, and in which order</p>
+ <h4>
+ <a name="header">
+ header
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="http://logstash.net/docs/1.4.2/configuration#hash">Array</a> </li>
+ <li> Default value is {} </li>
+ </ul>
+ <p>
+ Used to specify a string to put as the header (first) line in the file. Useful if you want to override the default headers, which are determined from the fields setting
+ </p>
+ <h4>
+ <a name="path">
+ path
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="http://logstash.net/docs/1.4.2/configuration#string">string</a> </li>
+ <li> Default value is "" </li>
+ </ul>
+ <p>Path of the temporary output file. Output is written to this file until it is time to close it, at which point it is renamed per the SCAPI file convention. The temporary output file path is then reused for the next set of output. For example, if outputting data for a CPU group, we might define the following path </p>
+ <p><code>path =&gt; "./cpu.csv"</code>.</p>
+ <h4>
+ <a name="group">
+ group (required setting)
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="http://logstash.net/docs/1.4.2/configuration#string">string</a> </li>
+ <li> There is no default value for this setting. </li>
+ </ul>
+ <p>SCAPI input filenames must have a group identifier as part of the name. The filename generally has this format <code>&lt;group&gt;__&lt;starttime&gt;__&lt;endtime&gt;.csv</code>. This <code>group</code> parameter specifies that group name, and it is used as a prefix when the file is renamed from <code>path</code>. For example</p>
+ <p><code>group =&gt; "cpu"</code>.</p>
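The renaming step described above can be sketched in plain Ruby. This is a hypothetical illustration, not the plugin's actual code; the helper name `scapi_filename` and its string timestamps are assumptions:

```ruby
# Hypothetical sketch of the SCAPI naming convention
# <group>__<starttime>__<endtime>.csv used when the temporary
# file at `path` is renamed on close.
def scapi_filename(group, start_time, end_time)
  "#{group}__#{start_time}__#{end_time}.csv"
end

name = scapi_filename("cpu", "20150702100000", "20150702105959")
```
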
+ <h4>
+ <a name="max_size">
+ max_size
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="../configuration#number">number</a> </li>
+ <li> Default value is 0 (meaning it is not used)</li>
+ </ul>
+ <p>This will close and rename a file once <code>max_size</code> events have been received. This limits the size of a file, and can sometimes be useful when 'chopping' a stream into chunks for use in SCAPI</p>
+ <h4>
+ <a name="flush_interval">
+ flush_interval
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="../configuration#number">number</a> </li>
+ <li> Default value is 60 </li>
+ </ul>
+ <p>Amount of time (in seconds) to wait before flushing, closing and renaming a file if no events have been received. This ensures that after a period of idleness, we still output a SCAPI file.</p>
+ <h4>
+ <a name="file_interval_width">
+ file_interval_width
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="../configuration#string">string</a> </li>
+ <li> Default value is "" (meaning it is not used). Allowed values are "MINUTE", "HOUR", "DAY"</li>
+ </ul>
+ <p>Setting this enables files to be closed on the specified boundaries. This is useful for breaking the incoming stream up on PI-preferred boundaries. If HOUR is set, for example, then all incoming data for a particular hour is put in a file for that hour; when data for the next hour arrives, that file is closed and a new one is opened</p>
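The boundary detection behind this option mirrors the `snapTimestampToInterval` helper in the plugin source: integer division truncates an epoch-seconds timestamp down to the start of its interval. A minimal sketch (method name `snap_to_interval` and the sample timestamp are illustrative):

```ruby
# Truncate an epoch-seconds timestamp down to the start of its
# enclosing interval (60 = MINUTE, 3600 = HOUR, 86400 = DAY).
def snap_to_interval(epoch_seconds, interval_seconds)
  (epoch_seconds / interval_seconds) * interval_seconds
end

ts = 1_435_838_523                  # some event timestamp, epoch seconds
hour_start = snap_to_interval(ts, 3600)
# Events whose snapped value matches the current interval start stay in
# the open file; a different snapped value triggers close-and-rename.
```
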
+ <h4>
+ <a name="time_field">
+ time_field
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="../configuration#string">string</a> </li>
+ <li> Default value is "timestamp"</li>
+ </ul>
+ <p>Specify which field to use as the 'timestamp' when determining filename times. Values from the 'timestamp' field are used for <code>starttime</code> (first value seen) and <code>endtime</code> (last value seen) in the file name <code>&lt;group&gt;__&lt;starttime&gt;__&lt;endtime&gt;.csv</code></p>
+ <h4>
+ <a name="time_field_format">
+ time_field_format (required setting)
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="../configuration#string">string</a> </li>
+ <li> There is no default value for this setting</li>
+ </ul>
+ <p>A format string, in Java SimpleDateFormat syntax, specifying how to interpret the time field values, e.g. <code>"yyyy-MM-dd HH:mm:ss"</code>. </p>
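The plugin itself parses this format with `java.text.SimpleDateFormat` under JRuby. As a plain-Ruby illustration of what the pattern means, the SimpleDateFormat pattern `"yyyy-MM-dd HH:mm:ss"` corresponds to the `strptime` pattern used below (the sample value is an assumption):

```ruby
require 'date'

# Plain-Ruby equivalent of parsing a "yyyy-MM-dd HH:mm:ss" time field.
# (The plugin uses java.text.SimpleDateFormat; %Y-%m-%d %H:%M:%S is the
# matching strptime pattern.)
t = DateTime.strptime("2015-07-02 10:15:30", "%Y-%m-%d %H:%M:%S")
epoch_seconds = t.to_time.to_i   # what the interval-snapping logic operates on
```
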
+ <h4>
+ <a name="timestamp_output_format">
+ timestamp_output_format
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="../configuration#string">string</a> </li>
+ <li> If not specified, it uses the format declared by <code>time_field_format</code></li>
+ </ul>
+ <p>A format string, in Java SimpleDateFormat syntax, specifying how to output filename timestamps</p>
+ <h4>
+ <a name="increment_time">
+ increment_time
+ </a>
+ </h4>
+ <ul>
+ <li> Value type is <a href="../configuration#boolean">boolean</a> </li>
+ <li> Default value is false</li>
+ </ul>
+ <p>
+ By default, the supplied timestamp is left as-is. If set to <code>true</code>, the end timestamp is incremented by one second. This ensures that the end time is greater than the last event time in the file, per PI datafile requirements
+ </p>
+ <hr>
+ </div>
+ <div class="clear">
+ </div>
+ </div>
+ </div>
+ <!--closes main container div-->
+ <div class="clear">
+ </div>
+ <div class="footer">
+ <p>
+ Hello! I'm your friendly footer. If you're actually reading this, I'm impressed.
+ </p>
+ </div>
+ <noscript>
+ <div style="display:inline;">
+ <img height="1" width="1" style="border-style:none;" alt="" src="//googleads.g.doubleclick.net/pagead/viewthroughconversion/985891458/?value=0&amp;guid=ON&amp;script=0"/>
+ </div>
+ </noscript>
+ <script src="/js/patch.js?1.4.2"></script>
+ </body>
+ </html>
 
- This is a plugin for [Logstash](https://github.com/elasticsearch/logstash).
-
- It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
-
- ## Documentation
-
- Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elasticsearch.org/guide/en/logstash/current/).
-
- - For formatting code or config example, you can use the asciidoc `[source,ruby]` directive
- - For more asciidoc formatting tips, see the excellent reference here https://github.com/elasticsearch/docs#asciidoc-guide
-
- ## Need Help?
-
- Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
-
- ## Developing
-
- ### 1. Plugin Developement and Testing
-
- #### Code
- - To get started, you'll need JRuby with the Bundler gem installed.
-
- - Create a new plugin or clone and existing from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).
-
- - Install dependencies
- ```sh
- bundle install
- ```
-
- #### Test
-
- - Update your dependencies
-
- ```sh
- bundle install
- ```
-
- - Run tests
-
- ```sh
- bundle exec rspec
- ```
-
- ### 2. Running your unpublished Plugin in Logstash
-
- #### 2.1 Run in a local Logstash clone
-
- - Edit Logstash `Gemfile` and add the local plugin path, for example:
- ```ruby
- gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
- ```
- - Install plugin
- ```sh
- bin/plugin install --no-verify
- ```
- - Run Logstash with your plugin
- ```sh
- bin/logstash -e 'filter {awesome {}}'
- ```
- At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.
-
- #### 2.2 Run in an installed Logstash
-
- You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory or you can build the gem and install it using:
-
- - Build your plugin gem
- ```sh
- gem build logstash-filter-awesome.gemspec
- ```
- - Install the plugin from the Logstash home
- ```sh
- bin/plugin install /your/local/plugin/logstash-filter-awesome.gem
- ```
- - Start Logstash and proceed to test the plugin
-
- ## Contributing
-
- All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.
-
- Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.
-
- It is more important to the community that you are able to contribute.
-
- For more information about contributing, see the [CONTRIBUTING](https://github.com/elasticsearch/logstash/blob/master/CONTRIBUTING.md) file.
@@ -4,7 +4,7 @@
  #
  # Logstash mediation output for SCAPI
  #
- # Version 160215.1 Robert Mckeown
+ # Version 170615.1 Robert Mckeown
  #
  ############################################
 
@@ -42,14 +42,13 @@ class LogStash::Outputs::SCACSV < LogStash::Outputs::File
  # Name of the output group - used as a prefix in the renamed file
  config :group, :validate => :string, :required => true
  config :max_size, :validate => :number, :default => 0
+ config :file_interval_width, :validate => :string, :default => "" # Allow "" or "hour","day" or "minute"
  config :flush_interval, :validate => :number, :default => 60
  config :time_field, :validate => :string, :default => "timestamp"
  # config :time_format, :validate => :string, :default => "%Y%m%d%H%M%S"
  config :time_field_format, :validate => :string, :required => true
  config :timestamp_output_format, :validate => :string, :default => "" # "yyyyMMddHHmmss" # java format
 
-
-
  config :tz_offset, :validate => :number, :default => 0
  config :increment_time, :validate => :boolean, :default => false
 
@@ -63,11 +62,35 @@ class LogStash::Outputs::SCACSV < LogStash::Outputs::File
  @endTime = "missingEndTime"
  @recordCount = 0
 
- @lastOutputTime = 0
+ @lastOutputTime = 0 # data time
  @flushInterval = @flush_interval.to_i
 
  @timerThread = Thread.new { flushWatchdog(@flush_interval) }
 
+ @currentOutputIntervalStartTime = 0
+ @fileIntervalWidthSeconds = 0
+ @closeOnIntervalBoundaries = false
+ case @file_interval_width.upcase
+ when "MINUTE"
+   @fileIntervalWidthSeconds = 60
+   @closeOnIntervalBoundaries = true
+ when "HOUR"
+   @fileIntervalWidthSeconds = 3600
+   @closeOnIntervalBoundaries = true
+ when "DAY"
+   @fileIntervalWidthSeconds = 86400
+   @closeOnIntervalBoundaries = true
+ else
+   @fileIntervalWidthSeconds = 0 # not used
+   @closeOnIntervalBoundaries = false
+ end
+
+ # Build the date parser once; when time_field_format is "epoch",
+ # timestamps are treated as epoch seconds and no parser is needed
+ @df = nil
+ if (@time_field_format != "epoch")
+   @df = java.text.SimpleDateFormat.new(@time_field_format)
+ end
+
  end
 
  # This thread ensures that we output (close and rename) a file every so often
@@ -101,6 +124,12 @@ class LogStash::Outputs::SCACSV < LogStash::Outputs::File
  closeAndRenameCurrentFile
  end
  else
+
+ # Now see if we need to close the file because we've crossed an interval boundary
+ if @closeOnIntervalBoundaries and @recordCount >= 1 and (@currentOutputIntervalStartTime != snapTimestampToInterval(timestampAsEpochSeconds(event), @fileIntervalWidthSeconds))
+   closeAndRenameCurrentFile
+ end
+
  @formattedPath = event.sprintf(@path)
  fd = open(@formattedPath)
  @logger.debug("SCACSVreceive - after opening fd=" + fd.to_s)
@@ -127,11 +156,27 @@ class LogStash::Outputs::SCACSV < LogStash::Outputs::File
 
  # capture the earliest - assumption is that records are in order
  if (@recordCount) == 1
- @startTime = event[@time_field]
+ if !@closeOnIntervalBoundaries
+   @startTime = event[@time_field]
+ else
+   # snap the start time down to the enclosing interval boundary
+   @startTime = snapTimestampToInterval(timestampAsEpochSeconds(event), @fileIntervalWidthSeconds)
+ end
  end
 
  # for every record, update endTime - again, assumption is that records are in order
- @endTime = event[@time_field]
+ if !@closeOnIntervalBoundaries
+   @endTime = event[@time_field]
+ else
+   @endTime = @startTime + @fileIntervalWidthSeconds - 1 # end of interval
+ end
+
+ # remember start of boundary for next time
+ if @closeOnIntervalBoundaries
+   @currentOutputIntervalStartTime = @startTime
+ end
+
 
  if ((@max_size > 0) and (@recordCount >= max_size))
  # Have enough records, close it out
@@ -142,6 +187,22 @@ class LogStash::Outputs::SCACSV < LogStash::Outputs::File
 
  end #def receive
 
+ private
+ def timestampAsEpochSeconds(event)
+   # rmck: come back and remove global refs here!
+   if !@df.nil?
+     @df.parse(event[@time_field])
+   else
+     # when @df is not set, we assume the value is already epoch seconds
+     event[@time_field].to_i
+   end
+ end
+
+ private
+ def snapTimestampToInterval(timestamp, interval)
+   # integer-divide down to the nearest interval boundary
+   (timestamp / interval) * interval
+ end
+
  private
  def get_value(name, event)
  val = event[name]
@@ -215,13 +276,11 @@ class LogStash::Outputs::SCACSV < LogStash::Outputs::File
  if (@time_field_format != "epoch")
  # if not epoch, then we expect java timestamp format
  # so must convert start/end times
+ nStartTime = @df.parse(@startTime)
+ nEndTime = @df.parse(@endTime)
 
- df = java.text.SimpleDateFormat.new(@time_field_format)
- nStartTime = df.parse(@startTime)
- nEndTime = df.parse(@endTime)
-
- @startTime = df.parse(@startTime).getTime
- @endTime = df.parse(@endTime).getTime
+ @startTime = @df.parse(@startTime).getTime
+ @endTime = @df.parse(@endTime).getTime
 
  end
 
@@ -235,7 +294,6 @@ class LogStash::Outputs::SCACSV < LogStash::Outputs::File
  @endTime = @endTime.to_i + @tz_offset
  if (@increment_time)
  # increment is used to ensure that the end-time on the filename is after the last data value
-
  @endTime = @endTime.to_i + 1000 # 1000ms = 1sec
 
  end
@@ -1,6 +1,6 @@
  Gem::Specification.new do |s|
  s.name = 'logstash-output-scacsv'
- s.version = "1.0.0"
+ s.version = "1.0.1"
  s.licenses = ["Apache License (2.0)"]
  s.summary = "Receives a stream of events and outputs files meeting the csv reqmts for IBM SmartCloudAnalytics Predictive Insights"
  s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program"
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: logstash-output-scacsv
  version: !ruby/object:Gem::Version
-   version: 1.0.0
+   version: 1.0.1
  platform: ruby
  authors:
  - Robert Mckeown
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-06-12 00:00:00.000000000 Z
+ date: 2015-07-02 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: logstash-core