logstash-filter-geoip 1.0.2 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 4cb9d672a670ac659f61fa7604bc57274eabd759
4
- data.tar.gz: d697817443b57ce1ef55e4dbdd19caf657e90cc6
3
+ metadata.gz: 9d5c90fa4938034e2c5ed20e1351df96ccf0a943
4
+ data.tar.gz: e69c420b023bc2f13db5d7b5844e6abe65325900
5
5
  SHA512:
6
- metadata.gz: 95c606eb5c2df267eb7476b91247bf14e9c4758ea1c27b720342fc6d624091473b754c11cf1b42ed13204e321910e288f17bc95e90e61fe4abd3a2e73083c213
7
- data.tar.gz: 62ba7bd2e0a890969d6e2ca3c1a0575ab1ccdd8e84232d468bee03bcab0294e4d9850d889160ec5973093313cd6263ab4554897d47fe8c15400b2208bfdc886b
6
+ metadata.gz: d8a75e6eb497f9d400ba1007c0dd370e35ea3447a7f523cd8719076a0f8b68edb854e58e7bc6716be4974bf10bffb3bf76f2dadd6918e8b28c3e5a18f8bd4403
7
+ data.tar.gz: a8655cbb27a53d6f4cfe48c0442475a82b7bdbf04ef8af099fc2155f581f534b15fdbf9085ee3c9f6a7b30c8a0ae622e7a65a807a06b4be6094755c67630a445
data/CHANGELOG.md ADDED
@@ -0,0 +1,2 @@
1
+ * 1.1.0
2
+ - Add LRU cache
data/README.md CHANGED
@@ -1,15 +1,15 @@
1
1
  # Logstash Plugin
2
2
 
3
- This is a plugin for [Logstash](https://github.com/elasticsearch/logstash).
3
+ This is a plugin for [Logstash](https://github.com/elastic/logstash).
4
4
 
5
5
  It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
6
6
 
7
7
  ## Documentation
8
8
 
9
- Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elasticsearch.org/guide/en/logstash/current/).
9
+ Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elastic.co/guide/en/logstash/current/).
10
10
 
11
11
  - For formatting code or config example, you can use the asciidoc `[source,ruby]` directive
12
- - For more asciidoc formatting tips, see the excellent reference here https://github.com/elasticsearch/docs#asciidoc-guide
12
+ - For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide
13
13
 
14
14
  ## Need Help?
15
15
 
@@ -89,4 +89,4 @@ Programming is not a required skill. Whatever you've seen about open source and
89
89
 
90
90
  It is more important to the community that you are able to contribute.
91
91
 
92
- For more information about contributing, see the [CONTRIBUTING](https://github.com/elasticsearch/logstash/blob/master/CONTRIBUTING.md) file.
92
+ For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
data/lib/logstash/filters/geoip.rb CHANGED
@@ -2,6 +2,8 @@
2
2
  require "logstash/filters/base"
3
3
  require "logstash/namespace"
4
4
  require "tempfile"
5
+ require "lru_redux"
6
+ require "geoip"
5
7
 
6
8
  # The GeoIP filter adds information about the geographical location of IP addresses,
7
9
  # based on data from the Maxmind database.
@@ -11,7 +13,7 @@ require "tempfile"
11
13
  # http://geojson.org/geojson-spec.html[GeoJSON] format. Additionally,
12
14
  # the default Elasticsearch template provided with the
13
15
  # <<plugins-outputs-elasticsearch,`elasticsearch` output>> maps
14
- # the `[geoip][location]` field to an http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-geo-point-type.html#_mapping_options[Elasticsearch geo_point].
16
+ # the `[geoip][location]` field to an https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-geo-point-type.html#_mapping_options[Elasticsearch geo_point].
15
17
  #
16
18
  # As this field is a `geo_point` _and_ it is still valid GeoJSON, you get
17
19
  # the awesomeness of Elasticsearch's geospatial query, facet and filter functions
@@ -22,6 +24,12 @@ require "tempfile"
22
24
  # Maxmind with a CCA-ShareAlike 3.0 license. For more details on GeoLite, see
23
25
  # <http://www.maxmind.com/en/geolite>.
24
26
  class LogStash::Filters::GeoIP < LogStash::Filters::Base
27
+ LOOKUP_CACHE_INIT_MUTEX = Mutex.new
28
+ # Map of lookup caches, keyed by geoip_type
29
+ LOOKUP_CACHES = {}
30
+
31
+ attr_accessor :lookup_cache
32
+
25
33
  config_name "geoip"
26
34
 
27
35
  # The path to the GeoIP database file which Logstash should use. Country, City, ASN, ISP
@@ -58,9 +66,24 @@ class LogStash::Filters::GeoIP < LogStash::Filters::Base
58
66
  # is still valid GeoJSON.
59
67
  config :target, :validate => :string, :default => 'geoip'
60
68
 
69
+ # GeoIP lookup is surprisingly expensive. This filter uses an LRU cache to take advantage of the fact that
70
+ # IPs agents are often found adjacent to one another in log files and rarely have a random distribution.
71
+ # The higher you set this the more likely an item is to be in the cache and the faster this filter will run.
72
+ # However, if you set this too high you can use more memory than desired.
73
+ #
74
+ # Experiment with different values for this option to find the best performance for your dataset.
75
+ #
76
+ # This MUST be set to a value > 0. There is really no reason to not want this behavior, the overhead is minimal
77
+ # and the speed gains are large.
78
+ #
79
+ # It is important to note that this config value is global to the geoip_type. That is to say all instances of the geoip filter
80
+ # of the same geoip_type share the same cache. The last declared cache size will 'win'. The reason for this is that there would be no benefit
81
+ # to having multiple caches for different instances at different points in the pipeline, that would just increase the
82
+ # number of cache misses and waste memory.
83
+ config :lru_cache_size, :validate => :number, :default => 1000
84
+
61
85
  public
62
86
  def register
63
- require "geoip"
64
87
  if @database.nil?
65
88
  @database = ::Dir.glob(::File.join(::File.expand_path("../../../vendor/", ::File.dirname(__FILE__)),"GeoLiteCity*.dat")).first
66
89
  if !File.exists?(@database)
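The `lru_cache_size` comment in the hunk above describes a least-recently-used cache placed in front of the GeoIP lookup. The eviction policy can be sketched in plain Ruby as follows. `TinyLru` is a hypothetical class written only for illustration; it is not the `lru_redux` implementation the plugin requires, and unlike `LruRedux::ThreadSafeCache` it is not thread-safe.

```ruby
# Minimal LRU cache built on Ruby's insertion-ordered Hash.
# Illustrative only: TinyLru is a hypothetical name, not part of lru_redux.
class TinyLru
  def initialize(max_size)
    raise ArgumentError, "max_size must be > 0" unless max_size > 0
    @max_size = max_size
    @store = {}
  end

  # Returns the cached value (or nil) and marks the key as most recently used.
  def [](key)
    return nil unless @store.key?(key)
    value = @store.delete(key)
    @store[key] = value
  end

  # Stores a value, evicting the least recently used entry when full.
  def []=(key, value)
    @store.delete(key)
    @store[key] = value
    @store.shift if @store.size > @max_size # the first entry is the oldest
  end

  def size
    @store.size
  end
end

cache = TinyLru.new(2)
cache["8.8.8.8"] = :google
cache["1.1.1.1"] = :cloudflare
cache["8.8.8.8"]            # touch: 8.8.8.8 becomes the most recently used
cache["9.9.9.9"] = :quad9   # evicts 1.1.1.1, the least recently used
```

A larger `max_size` keeps more addresses resident at the cost of memory, which is the trade-off the option's comment describes.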
@@ -88,6 +111,11 @@ class LogStash::Filters::GeoIP < LogStash::Filters::Base
88
111
  end
89
112
 
90
113
  @threadkey = "geoip-#{self.object_id}"
114
+
115
+ # This is wrapped in a mutex to make sure the initialization behavior of LOOKUP_CACHES (see def above) doesn't create a dupe
116
+ LOOKUP_CACHE_INIT_MUTEX.synchronize do
117
+ self.lookup_cache = LOOKUP_CACHES[@geoip_type] ||= LruRedux::ThreadSafeCache.new(1000)
118
+ end
91
119
  end # def register
92
120
 
93
121
  public
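The `register` hunk above guards a `||=` assignment with a mutex so that concurrent registrations of the same `geoip_type` end up sharing one cache. A minimal sketch of that pattern follows, assuming a plain Hash as a stand-in for the LRU cache; `CacheRegistry` and `fetch` are hypothetical names, not part of the plugin.

```ruby
# Hypothetical stand-in for LOOKUP_CACHE_INIT_MUTEX / LOOKUP_CACHES.
module CacheRegistry
  INIT_MUTEX = Mutex.new
  CACHES = {}

  # Returns the cache registered for geoip_type, creating it on first use.
  # With ||=, a later call with a different size reuses the existing cache.
  def self.fetch(geoip_type, size)
    INIT_MUTEX.synchronize do
      CACHES[geoip_type] ||= { max_size: size, entries: {} }
    end
  end
end

# Eight "filter instances" registering concurrently for the same type.
threads = 8.times.map do |i|
  Thread.new { CacheRegistry.fetch("city", 1000 + i) }
end
caches = threads.map(&:value)
```

Note that with `||=` the first cache created for a type is the one that is kept, and that the hunk as written constructs the cache with a literal `1000` rather than reading `lru_cache_size`, so the behaviour may differ from what the option's comment describes.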
@@ -104,18 +132,16 @@ class LogStash::Filters::GeoIP < LogStash::Filters::Base
104
132
  Thread.current[@threadkey] = ::GeoIP.new(@database)
105
133
  end
106
134
 
107
- begin
108
- ip = event[@source]
109
- ip = ip.first if ip.is_a? Array
110
- geo_data = Thread.current[@threadkey].send(@geoip_type, ip)
111
- rescue SocketError => e
112
- @logger.error("IP Field contained invalid IP address or hostname", :field => @source, :event => event)
113
- rescue Exception => e
114
- @logger.error("Unknown error while looking up GeoIP data", :exception => e, :field => @source, :event => event)
115
- end
135
+ geo_data = get_geo_data(event)
116
136
 
117
137
  return if geo_data.nil? || !geo_data.respond_to?(:to_hash)
118
138
 
139
+ apply_geodata(geo_data, event)
140
+
141
+ filter_matched(event)
142
+ end # def filter
143
+
144
+ def apply_geodata(geo_data,event)
119
145
  geo_data_hash = geo_data.to_hash
120
146
  geo_data_hash.delete(:request)
121
147
  event[@target] = {} if event[@target].nil?
@@ -130,16 +156,36 @@ class LogStash::Filters::GeoIP < LogStash::Filters::Base
130
156
  if value.is_a?(String)
131
157
  # Some strings from GeoIP don't have the correct encoding...
132
158
  value = case value.encoding
133
- # I have found strings coming from GeoIP that are ASCII-8BIT are actually
134
- # ISO-8859-1...
135
- when Encoding::ASCII_8BIT; value.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
136
- when Encoding::ISO_8859_1, Encoding::US_ASCII; value.encode(Encoding::UTF_8)
137
- else; value
138
- end
159
+ # I have found strings coming from GeoIP that are ASCII-8BIT are actually
160
+ # ISO-8859-1...
161
+ when Encoding::ASCII_8BIT; value.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
162
+ when Encoding::ISO_8859_1, Encoding::US_ASCII; value.encode(Encoding::UTF_8)
163
+ else; value.dup
164
+ end
139
165
  end
140
166
  event[@target][key.to_s] = value
141
167
  end
142
168
  end # geo_data_hash.each
143
- filter_matched(event)
144
- end # def filter
169
+ end
170
+
171
+ def get_geo_data(event)
172
+ ip = event[@source]
173
+ ip = ip.first if ip.is_a? Array
174
+
175
+ get_geo_data_for_ip(ip)
176
+ rescue SocketError => e
177
+ @logger.error("IP Field contained invalid IP address or hostname", :field => @source, :event => event)
178
+ rescue StandardError => e
179
+ @logger.error("Unknown error while looking up GeoIP data", :exception => e, :field => @source, :event => event)
180
+ end
181
+
182
+ def get_geo_data_for_ip(ip)
183
+ if (cached = lookup_cache[ip])
184
+ cached
185
+ else
186
+ geo_data = Thread.current[@threadkey].send(@geoip_type, ip)
187
+ lookup_cache[ip] = geo_data
188
+ geo_data
189
+ end
190
+ end
145
191
  end # class LogStash::Filters::GeoIP
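The encoding `case` in the hunk above can be exercised on its own. The sketch below follows the same three branches; `to_utf8` is a hypothetical helper name, and it calls `dup` before `force_encoding` so the input string is left untouched, which is an assumption of this sketch rather than what the plugin does.

```ruby
# Normalise a string to UTF-8, following the branches of the case statement above.
def to_utf8(value)
  case value.encoding
  when Encoding::ASCII_8BIT
    # Binary-tagged bytes are treated as ISO-8859-1, per the source comment.
    # dup is an addition of this sketch: it keeps the caller's string unchanged.
    value.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
  when Encoding::ISO_8859_1, Encoding::US_ASCII
    value.encode(Encoding::UTF_8)
  else
    # Already in some other encoding (typically UTF-8): return a copy.
    value.dup
  end
end

raw = "S\xE3o Paulo".b   # ISO-8859-1 bytes tagged as ASCII-8BIT
city = to_utf8(raw)      # UTF-8 string with a properly encoded "ã"
```

The `else` branch returning `value.dup` instead of `value` matters once results are cached: each event receives its own String rather than a reference to the cached one.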
data/logstash-filter-geoip.gemspec CHANGED
@@ -1,7 +1,7 @@
1
1
  Gem::Specification.new do |s|
2
2
 
3
3
  s.name = 'logstash-filter-geoip'
4
- s.version = '1.0.2'
4
+ s.version = '1.1.0'
5
5
  s.licenses = ['Apache License (2.0)']
6
6
  s.summary = "$summary"
7
7
  s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program"
@@ -11,7 +11,7 @@ Gem::Specification.new do |s|
11
11
  s.require_paths = ["lib"]
12
12
 
13
13
  # Files
14
- s.files = `git ls-files`.split($\)+::Dir.glob('vendor/*')
14
+ s.files = Dir['lib/**/*','spec/**/*','vendor/**/*','*.gemspec','*.md','CONTRIBUTORS','Gemfile','LICENSE','NOTICE.TXT']
15
15
 
16
16
  # Tests
17
17
  s.test_files = s.files.grep(%r{^(test|spec|features)/})
@@ -23,6 +23,7 @@ Gem::Specification.new do |s|
23
23
  s.add_runtime_dependency "logstash-core", '>= 1.4.0', '< 2.0.0'
24
24
 
25
25
  s.add_runtime_dependency 'geoip', ['>= 1.3.2']
26
+ s.add_runtime_dependency 'lru_redux', "~> 1.1.0"
26
27
 
27
28
  s.add_development_dependency 'logstash-devutils'
28
29
  end
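The new `lru_redux` dependency uses the pessimistic constraint `~> 1.1.0`, whereas `geoip` uses an open-ended `>= 1.3.2`. A short check with RubyGems' own `Gem::Requirement` shows which versions the pessimistic form admits; the version numbers tested are arbitrary examples, not actual `lru_redux` releases.

```ruby
require "rubygems"

# "~> 1.1.0" is equivalent to ">= 1.1.0" and "< 1.2.0".
requirement = Gem::Requirement.new("~> 1.1.0")

results = %w[1.0.9 1.1.0 1.1.5 1.2.0].map do |v|
  [v, requirement.satisfied_by?(Gem::Version.new(v))]
end.to_h

results.each { |version, ok| puts "#{version}: #{ok}" }
```

Patch-level updates within 1.1.x satisfy the constraint, while 1.0.x and 1.2.0 or later do not.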
data/spec/filters/geoip_spec.rb CHANGED
@@ -221,6 +221,29 @@ describe LogStash::Filters::GeoIP do
221
221
  subject
222
222
  end
223
223
  end
224
+ end
225
+
226
+ describe "returned object identities" do
227
+ let(:plugin) { LogStash::Filters::GeoIP.new("source" => "message") }
228
+ let(:geo_data) { plugin.get_geo_data_for_ip("8.8.8.8") }
229
+
230
+ before do
231
+ plugin.register
232
+ end
224
233
 
234
+ it "should dup the objects" do
235
+ event = {}
236
+ alt_event = {}
237
+ plugin.apply_geodata(geo_data, event)
238
+ plugin.apply_geodata(geo_data, alt_event)
239
+
240
+ event["geoip"].each do |k,v|
241
+ alt_v = alt_event["geoip"][k]
242
+ expect(v).to eql(alt_v)
243
+ unless v.is_a?(Numeric) # Numeric values can't be mutated, so this isn't an issue, its really for strings
244
+ expect(v.object_id).not_to eql(alt_v.object_id), "Object Ids for key #{k} and v #{v}"
245
+ end
246
+ end
247
+ end
225
248
  end
226
249
  end
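The "returned object identities" spec above checks that two events populated from the same lookup result do not share String objects. The sketch below illustrates the problem it guards against, using plain Hashes in place of Logstash events; `copy_fields` and its `dup_values` flag are hypothetical names introduced for this example.

```ruby
# Copies cached fields into a new hash, optionally duplicating each value.
def copy_fields(cached, dup_values:)
  cached.each_with_object({}) do |(key, value), target|
    target[key] = dup_values ? value.dup : value
  end
end

# Without dup: both "events" hold the very same cached String.
shared = { "country_name" => String.new("United States") }
event_a = copy_fields(shared, dup_values: false)
event_b = copy_fields(shared, dup_values: false)
event_a["country_name"] << " (edited)"   # in-place mutation leaks to event_b

# With dup: each "event" owns its copy, so the cache entry stays intact.
isolated = { "country_name" => String.new("United States") }
event_c = copy_fields(isolated, dup_values: true)
event_d = copy_fields(isolated, dup_values: true)
event_c["country_name"] << " (edited)"
```

Once lookup results are cached and reused across events, an in-place mutation by a later filter would otherwise alter the cached value and every event that references it.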
metadata CHANGED
@@ -1,83 +1,94 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: logstash-filter-geoip
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.2
4
+ version: 1.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Elastic
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-07-30 00:00:00.000000000 Z
11
+ date: 2015-09-15 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
- name: logstash-core
15
- version_requirements: !ruby/object:Gem::Requirement
14
+ requirement: !ruby/object:Gem::Requirement
16
15
  requirements:
17
- - - '>='
16
+ - - ">="
18
17
  - !ruby/object:Gem::Version
19
18
  version: 1.4.0
20
- - - <
19
+ - - "<"
21
20
  - !ruby/object:Gem::Version
22
21
  version: 2.0.0
23
- requirement: !ruby/object:Gem::Requirement
22
+ name: logstash-core
23
+ prerelease: false
24
+ type: :runtime
25
+ version_requirements: !ruby/object:Gem::Requirement
24
26
  requirements:
25
- - - '>='
27
+ - - ">="
26
28
  - !ruby/object:Gem::Version
27
29
  version: 1.4.0
28
- - - <
30
+ - - "<"
29
31
  - !ruby/object:Gem::Version
30
32
  version: 2.0.0
31
- prerelease: false
32
- type: :runtime
33
33
  - !ruby/object:Gem::Dependency
34
+ requirement: !ruby/object:Gem::Requirement
35
+ requirements:
36
+ - - ">="
37
+ - !ruby/object:Gem::Version
38
+ version: 1.3.2
34
39
  name: geoip
40
+ prerelease: false
41
+ type: :runtime
35
42
  version_requirements: !ruby/object:Gem::Requirement
36
43
  requirements:
37
- - - '>='
44
+ - - ">="
38
45
  - !ruby/object:Gem::Version
39
46
  version: 1.3.2
47
+ - !ruby/object:Gem::Dependency
40
48
  requirement: !ruby/object:Gem::Requirement
41
49
  requirements:
42
- - - '>='
50
+ - - "~>"
43
51
  - !ruby/object:Gem::Version
44
- version: 1.3.2
52
+ version: 1.1.0
53
+ name: lru_redux
45
54
  prerelease: false
46
55
  type: :runtime
47
- - !ruby/object:Gem::Dependency
48
- name: logstash-devutils
49
56
  version_requirements: !ruby/object:Gem::Requirement
50
57
  requirements:
51
- - - '>='
58
+ - - "~>"
52
59
  - !ruby/object:Gem::Version
53
- version: '0'
60
+ version: 1.1.0
61
+ - !ruby/object:Gem::Dependency
54
62
  requirement: !ruby/object:Gem::Requirement
55
63
  requirements:
56
- - - '>='
64
+ - - ">="
57
65
  - !ruby/object:Gem::Version
58
66
  version: '0'
67
+ name: logstash-devutils
59
68
  prerelease: false
60
69
  type: :development
70
+ version_requirements: !ruby/object:Gem::Requirement
71
+ requirements:
72
+ - - ">="
73
+ - !ruby/object:Gem::Version
74
+ version: '0'
61
75
  description: This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program
62
76
  email: info@elastic.co
63
77
  executables: []
64
78
  extensions: []
65
79
  extra_rdoc_files: []
66
80
  files:
67
- - .gitignore
68
81
  - CHANGELOG.md
69
82
  - CONTRIBUTORS
70
83
  - Gemfile
71
84
  - LICENSE
72
85
  - NOTICE.TXT
73
86
  - README.md
74
- - Rakefile
75
87
  - lib/logstash/filters/geoip.rb
76
88
  - logstash-filter-geoip.gemspec
77
89
  - spec/filters/geoip_spec.rb
78
- - vendor.json
79
- - vendor/GeoLiteCity-2013-01-18.dat
80
90
  - vendor/GeoIPASNum-2014-02-12.dat
91
+ - vendor/GeoLiteCity-2013-01-18.dat
81
92
  homepage: http://www.elastic.co/guide/en/logstash/current/index.html
82
93
  licenses:
83
94
  - Apache License (2.0)
@@ -90,19 +101,20 @@ require_paths:
90
101
  - lib
91
102
  required_ruby_version: !ruby/object:Gem::Requirement
92
103
  requirements:
93
- - - '>='
104
+ - - ">="
94
105
  - !ruby/object:Gem::Version
95
106
  version: '0'
96
107
  required_rubygems_version: !ruby/object:Gem::Requirement
97
108
  requirements:
98
- - - '>='
109
+ - - ">="
99
110
  - !ruby/object:Gem::Version
100
111
  version: '0'
101
112
  requirements: []
102
113
  rubyforge_project:
103
- rubygems_version: 2.1.9
114
+ rubygems_version: 2.4.8
104
115
  signing_key:
105
116
  specification_version: 4
106
- summary: $summary
117
+ summary: "$summary"
107
118
  test_files:
108
119
  - spec/filters/geoip_spec.rb
120
+ has_rdoc:
data/.gitignore DELETED
@@ -1,4 +0,0 @@
1
- *.gem
2
- Gemfile.lock
3
- .bundle
4
- vendor
data/Rakefile DELETED
@@ -1,10 +0,0 @@
1
- require 'json'
2
-
3
- BASE_PATH = File.expand_path(File.dirname(__FILE__))
4
- @files = JSON.parse(File.read(File.join(BASE_PATH, 'vendor.json')))
5
-
6
- task :default do
7
- system("rake -T")
8
- end
9
-
10
- require "logstash/devutils/rake"
data/vendor.json DELETED
@@ -1,10 +0,0 @@
1
- [
2
- {
3
- "url": "http://logstash.objects.dreamhost.com/maxmind/GeoLiteCity-2013-01-18.dat.gz",
4
- "sha1": "15aab9a90ff90c4784b2c48331014d242b86bf82"
5
- },
6
- {
7
- "url": "http://logstash.objects.dreamhost.com/maxmind/GeoIPASNum-2014-02-12.dat.gz",
8
- "sha1": "6f33ca0b31e5f233e36d1f66fbeae36909b58f91"
9
- }
10
- ]