logstash-filter-useragent 3.2.4-java → 3.3.1-java

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: d11626aca14a03d4bc4e7cfcb233ffa902b1a320211883b3bb9ab9967cabbd13
4
- data.tar.gz: 4cff4fd0c0ae2f025c3a760810eb8673026b7d58b562f4cef8d31be934a7c583
3
+ metadata.gz: 0e9602a03ed7ff6329d58b7f929f6a2337a1429fd6b9794f7df7ae5fa37aefaa
4
+ data.tar.gz: a5a45f3a8b9de9cdbe68e1cc351d7a84ecd045710038cb39e7f4d88decd689eb
5
5
  SHA512:
6
- metadata.gz: '09441a1f7854948b6bbed31d03796b46ed9b0380f202275f098ecc4e90799b59084ac1b4a610189aa4b93dbd2fbf5d7a7c41ddf7bc000cc9164b7fb211c95bba'
7
- data.tar.gz: a0daa9e0d151772559c7501acc29c94710084df43227cceca6df1d25b13565d097950f068f9b22191f40241f6a474e73e69aebe8f1d03b5fb8f4e7bf38499b21
6
+ metadata.gz: aeeb91619083fbf74249ec8f38526efab217e9caba60e94a84ddce30f284d254e8deacea6b7fa98ba6da08043e47972e3041f2df7fdb468d49d9587ce73f7e8a
7
+ data.tar.gz: 8bd5b3c6059f151cc1e900a94b2cf1a6386174a6b01bd24d76f9e56d30742480af685e624e53ce1e878b5948a2e40dd38b2bd2b663818f3811f246bdc03aeeda
data/CHANGELOG.md CHANGED
@@ -1,3 +1,21 @@
1
+ ## 3.3.1
2
+ - Fix: invalid 3.3.0 release which did not package correctly [#71](https://github.com/logstash-plugins/logstash-filter-useragent/pull/71)
3
+
4
+ ## 3.3.0 (invalid)
5
+ - Feat: support ECS mode when setting UA fields [#68](https://github.com/logstash-plugins/logstash-filter-useragent/pull/68)
6
+
7
+ - Fix: capture os major version + update UA regexes [#69](https://github.com/logstash-plugins/logstash-filter-useragent/pull/69)
8
+
9
+ The UA parser *regexes.yaml* update (to **v0.12.0**) will accurately detect recent user agent strings.
10
+
11
+ NOTE: The update might cause changes in matching user agent fields such as `name`
12
+ (for example, the previous version did not support `Edge` and detect it as `Chrome`).
13
+ If needed the old behavior can be restored by downloading the outdated [regexes.yaml](https://raw.githubusercontent.com/ua-parser/uap-core/2e6c983e42e7aae7d957a263cb4d3de7ccbd92af/regexes.yaml)
14
+ and configuring `regexes => path/to/regexes.yaml`.
15
+
16
+ - Plugin no longer sets the `[build]` UA version field which is not implemented and was always `""`.
17
+ - Fix: `target => [field]` configuration, which wasn't working previously
18
+
1
19
  ## 3.2.4
2
20
  - Added support for OS regular expressions that use backreferences [#59](https://github.com/logstash-plugins/logstash-filter-useragent/pull/59)
3
21
 
data/LICENSE CHANGED
@@ -1,13 +1,202 @@
1
- Copyright (c) 2012-2018 Elasticsearch <http://www.elastic.co>
2
1
 
3
- Licensed under the Apache License, Version 2.0 (the "License");
4
- you may not use this file except in compliance with the License.
5
- You may obtain a copy of the License at
2
+ Apache License
3
+ Version 2.0, January 2004
4
+ http://www.apache.org/licenses/
6
5
 
7
- http://www.apache.org/licenses/LICENSE-2.0
6
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
8
7
 
9
- Unless required by applicable law or agreed to in writing, software
10
- distributed under the License is distributed on an "AS IS" BASIS,
11
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
- See the License for the specific language governing permissions and
13
- limitations under the License.
8
+ 1. Definitions.
9
+
10
+ "License" shall mean the terms and conditions for use, reproduction,
11
+ and distribution as defined by Sections 1 through 9 of this document.
12
+
13
+ "Licensor" shall mean the copyright owner or entity authorized by
14
+ the copyright owner that is granting the License.
15
+
16
+ "Legal Entity" shall mean the union of the acting entity and all
17
+ other entities that control, are controlled by, or are under common
18
+ control with that entity. For the purposes of this definition,
19
+ "control" means (i) the power, direct or indirect, to cause the
20
+ direction or management of such entity, whether by contract or
21
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
22
+ outstanding shares, or (iii) beneficial ownership of such entity.
23
+
24
+ "You" (or "Your") shall mean an individual or Legal Entity
25
+ exercising permissions granted by this License.
26
+
27
+ "Source" form shall mean the preferred form for making modifications,
28
+ including but not limited to software source code, documentation
29
+ source, and configuration files.
30
+
31
+ "Object" form shall mean any form resulting from mechanical
32
+ transformation or translation of a Source form, including but
33
+ not limited to compiled object code, generated documentation,
34
+ and conversions to other media types.
35
+
36
+ "Work" shall mean the work of authorship, whether in Source or
37
+ Object form, made available under the License, as indicated by a
38
+ copyright notice that is included in or attached to the work
39
+ (an example is provided in the Appendix below).
40
+
41
+ "Derivative Works" shall mean any work, whether in Source or Object
42
+ form, that is based on (or derived from) the Work and for which the
43
+ editorial revisions, annotations, elaborations, or other modifications
44
+ represent, as a whole, an original work of authorship. For the purposes
45
+ of this License, Derivative Works shall not include works that remain
46
+ separable from, or merely link (or bind by name) to the interfaces of,
47
+ the Work and Derivative Works thereof.
48
+
49
+ "Contribution" shall mean any work of authorship, including
50
+ the original version of the Work and any modifications or additions
51
+ to that Work or Derivative Works thereof, that is intentionally
52
+ submitted to Licensor for inclusion in the Work by the copyright owner
53
+ or by an individual or Legal Entity authorized to submit on behalf of
54
+ the copyright owner. For the purposes of this definition, "submitted"
55
+ means any form of electronic, verbal, or written communication sent
56
+ to the Licensor or its representatives, including but not limited to
57
+ communication on electronic mailing lists, source code control systems,
58
+ and issue tracking systems that are managed by, or on behalf of, the
59
+ Licensor for the purpose of discussing and improving the Work, but
60
+ excluding communication that is conspicuously marked or otherwise
61
+ designated in writing by the copyright owner as "Not a Contribution."
62
+
63
+ "Contributor" shall mean Licensor and any individual or Legal Entity
64
+ on behalf of whom a Contribution has been received by Licensor and
65
+ subsequently incorporated within the Work.
66
+
67
+ 2. Grant of Copyright License. Subject to the terms and conditions of
68
+ this License, each Contributor hereby grants to You a perpetual,
69
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
70
+ copyright license to reproduce, prepare Derivative Works of,
71
+ publicly display, publicly perform, sublicense, and distribute the
72
+ Work and such Derivative Works in Source or Object form.
73
+
74
+ 3. Grant of Patent License. Subject to the terms and conditions of
75
+ this License, each Contributor hereby grants to You a perpetual,
76
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
77
+ (except as stated in this section) patent license to make, have made,
78
+ use, offer to sell, sell, import, and otherwise transfer the Work,
79
+ where such license applies only to those patent claims licensable
80
+ by such Contributor that are necessarily infringed by their
81
+ Contribution(s) alone or by combination of their Contribution(s)
82
+ with the Work to which such Contribution(s) was submitted. If You
83
+ institute patent litigation against any entity (including a
84
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
85
+ or a Contribution incorporated within the Work constitutes direct
86
+ or contributory patent infringement, then any patent licenses
87
+ granted to You under this License for that Work shall terminate
88
+ as of the date such litigation is filed.
89
+
90
+ 4. Redistribution. You may reproduce and distribute copies of the
91
+ Work or Derivative Works thereof in any medium, with or without
92
+ modifications, and in Source or Object form, provided that You
93
+ meet the following conditions:
94
+
95
+ (a) You must give any other recipients of the Work or
96
+ Derivative Works a copy of this License; and
97
+
98
+ (b) You must cause any modified files to carry prominent notices
99
+ stating that You changed the files; and
100
+
101
+ (c) You must retain, in the Source form of any Derivative Works
102
+ that You distribute, all copyright, patent, trademark, and
103
+ attribution notices from the Source form of the Work,
104
+ excluding those notices that do not pertain to any part of
105
+ the Derivative Works; and
106
+
107
+ (d) If the Work includes a "NOTICE" text file as part of its
108
+ distribution, then any Derivative Works that You distribute must
109
+ include a readable copy of the attribution notices contained
110
+ within such NOTICE file, excluding those notices that do not
111
+ pertain to any part of the Derivative Works, in at least one
112
+ of the following places: within a NOTICE text file distributed
113
+ as part of the Derivative Works; within the Source form or
114
+ documentation, if provided along with the Derivative Works; or,
115
+ within a display generated by the Derivative Works, if and
116
+ wherever such third-party notices normally appear. The contents
117
+ of the NOTICE file are for informational purposes only and
118
+ do not modify the License. You may add Your own attribution
119
+ notices within Derivative Works that You distribute, alongside
120
+ or as an addendum to the NOTICE text from the Work, provided
121
+ that such additional attribution notices cannot be construed
122
+ as modifying the License.
123
+
124
+ You may add Your own copyright statement to Your modifications and
125
+ may provide additional or different license terms and conditions
126
+ for use, reproduction, or distribution of Your modifications, or
127
+ for any such Derivative Works as a whole, provided Your use,
128
+ reproduction, and distribution of the Work otherwise complies with
129
+ the conditions stated in this License.
130
+
131
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
132
+ any Contribution intentionally submitted for inclusion in the Work
133
+ by You to the Licensor shall be under the terms and conditions of
134
+ this License, without any additional terms or conditions.
135
+ Notwithstanding the above, nothing herein shall supersede or modify
136
+ the terms of any separate license agreement you may have executed
137
+ with Licensor regarding such Contributions.
138
+
139
+ 6. Trademarks. This License does not grant permission to use the trade
140
+ names, trademarks, service marks, or product names of the Licensor,
141
+ except as required for reasonable and customary use in describing the
142
+ origin of the Work and reproducing the content of the NOTICE file.
143
+
144
+ 7. Disclaimer of Warranty. Unless required by applicable law or
145
+ agreed to in writing, Licensor provides the Work (and each
146
+ Contributor provides its Contributions) on an "AS IS" BASIS,
147
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
148
+ implied, including, without limitation, any warranties or conditions
149
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
150
+ PARTICULAR PURPOSE. You are solely responsible for determining the
151
+ appropriateness of using or redistributing the Work and assume any
152
+ risks associated with Your exercise of permissions under this License.
153
+
154
+ 8. Limitation of Liability. In no event and under no legal theory,
155
+ whether in tort (including negligence), contract, or otherwise,
156
+ unless required by applicable law (such as deliberate and grossly
157
+ negligent acts) or agreed to in writing, shall any Contributor be
158
+ liable to You for damages, including any direct, indirect, special,
159
+ incidental, or consequential damages of any character arising as a
160
+ result of this License or out of the use or inability to use the
161
+ Work (including but not limited to damages for loss of goodwill,
162
+ work stoppage, computer failure or malfunction, or any and all
163
+ other commercial damages or losses), even if such Contributor
164
+ has been advised of the possibility of such damages.
165
+
166
+ 9. Accepting Warranty or Additional Liability. While redistributing
167
+ the Work or Derivative Works thereof, You may choose to offer,
168
+ and charge a fee for, acceptance of support, warranty, indemnity,
169
+ or other liability obligations and/or rights consistent with this
170
+ License. However, in accepting such obligations, You may act only
171
+ on Your own behalf and on Your sole responsibility, not on behalf
172
+ of any other Contributor, and only if You agree to indemnify,
173
+ defend, and hold each Contributor harmless for any liability
174
+ incurred by, or claims asserted against, such Contributor by reason
175
+ of your accepting any such warranty or additional liability.
176
+
177
+ END OF TERMS AND CONDITIONS
178
+
179
+ APPENDIX: How to apply the Apache License to your work.
180
+
181
+ To apply the Apache License to your work, attach the following
182
+ boilerplate notice, with the fields enclosed by brackets "[]"
183
+ replaced with your own identifying information. (Don't include
184
+ the brackets!) The text should be enclosed in the appropriate
185
+ comment syntax for the file format. We also recommend that a
186
+ file or class name and description of purpose be included on the
187
+ same "printed page" as the copyright notice for easier
188
+ identification within third-party archives.
189
+
190
+ Copyright 2020 Elastic and contributors
191
+
192
+ Licensed under the Apache License, Version 2.0 (the "License");
193
+ you may not use this file except in compliance with the License.
194
+ You may obtain a copy of the License at
195
+
196
+ http://www.apache.org/licenses/LICENSE-2.0
197
+
198
+ Unless required by applicable law or agreed to in writing, software
199
+ distributed under the License is distributed on an "AS IS" BASIS,
200
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201
+ See the License for the specific language governing permissions and
202
+ limitations under the License.
data/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Logstash Plugin
2
2
 
3
- [![Travis Build Status](https://travis-ci.org/logstash-plugins/logstash-filter-useragent.svg)](https://travis-ci.org/logstash-plugins/logstash-filter-useragent)
3
+ [![Travis Build Status](https://travis-ci.com/logstash-plugins/logstash-filter-useragent.svg)](https://travis-ci.com/logstash-plugins/logstash-filter-useragent)
4
4
 
5
5
  This is a plugin for [Logstash](https://github.com/elastic/logstash).
6
6
 
data/docs/index.asciidoc CHANGED
@@ -22,12 +22,81 @@ include::{include_path}/plugin_header.asciidoc[]
22
22
 
23
23
  Parse user agent strings into structured data based on BrowserScope data
24
24
 
25
- UserAgent filter, adds information about user agent like family, operating
26
- system, version, and device
25
+ UserAgent filter, adds information about user agent like name, version, operating
26
+ system, and device.
27
27
 
28
- Logstash releases ship with the regexes.yaml database made available from
29
- ua-parser with an Apache 2.0 license. For more details on ua-parser, see
30
- <https://github.com/tobie/ua-parser/>.
28
+ The plugin ships with the *regexes.yaml* database made available from ua-parser
29
+ with an Apache 2.0 license. For more details on ua-parser, see
30
+ <https://github.com/ua-parser/uap-core/>.
31
+
32
+ ==== Compatibility with the Elastic Common Schema (ECS)
33
+
34
+ This plugin can be used to parse user-agent (UA) _into_ fields compliant with the Elastic Common Schema.
35
+ Here's how
36
+ <<plugins-{type}s-{plugin}-ecs_compatibility,ECS compatibility mode>> affects
37
+ output.
38
+
39
+ [cols="<l,<l,e,<e"]
40
+ |=======================================================================
41
+ |ECS disabled |ECS v1 |Description |Notes
42
+
43
+ |[name] |[user_agent][name] |Detected UA name |
44
+ | N/A |[user_agent][version] |Detected UA version |Only available in ECS mode
45
+ |[major] |[@metadata][filter][user_agent][version][major] |UA major version |Only as meta-data in ECS mode
46
+ |[minor] |[@metadata][filter][user_agent][version][minor] |UA minor version |Only as meta-data in ECS mode
47
+ |[patch] |[@metadata][filter][user_agent][version][patch] |UA patch version |Only as meta-data in ECS mode
48
+ |[os_name] |[user_agent][os][name] |Detected operating-system name |
49
+ | N/A |[user_agent][os][version] |Detected OS version |Only available in ECS mode
50
+ |[os_major] |[@metadata][filter][user_agent][os][version][major] |OS major version |Only as meta-data in ECS mode
51
+ |[os_minor] |[@metadata][filter][user_agent][os][version][minor] |OS minor version |Only as meta-data in ECS mode
52
+ |[os_patch] |[@metadata][filter][user_agent][os][version][patch] |OS patch version |Only as meta-data in ECS mode
53
+ |[os_full] |[user_agent][os][full] |Full operating-system name |
54
+ |[device] |[user_agent][device][name] |Device name |
55
+ |=======================================================================
56
+
57
+ Example:
58
+ [source,ruby]
59
+ filter {
60
+ useragent {
61
+ source => 'message'
62
+ }
63
+ }
64
+
65
+ Given an event with the `message` field set as: `Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Firefox/45.0`
66
+ produces the following fields:
67
+
68
+ [source,ruby]
69
+ -----
70
+ {
71
+ "name"=>"Firefox",
72
+ "version"=>"45.0",
73
+ "major"=>"45",
74
+ "minor"=>"0",
75
+ "os_name"=>"Mac OS X",
76
+ "os_version"=>"10.11",
77
+ "os_full"=>"Mac OS X 10.11",
78
+ "os_major"=>"10",
79
+ "os_minor"=>"11",
80
+ "device"=>"Mac"
81
+ }
82
+ -----
83
+
84
+ **and with ECS enabled:**
85
+ [source,ruby]
86
+ -----
87
+ {
88
+ "user_agent"=>{
89
+ "name"=>"Firefox",
90
+ "version"=>"45.0",
91
+ "os"=>{
92
+ "name"=>"Mac OS X",
93
+ "version"=>"10.11",
94
+ "full"=>"Mac OS X 10.11"
95
+ },
96
+ "device"=>{"name"=>"Mac"},
97
+ }
98
+ }
99
+ -----
31
100
 
32
101
  [id="plugins-{type}s-{plugin}-options"]
33
102
  ==== Useragent Filter Configuration Options
@@ -37,6 +106,7 @@ This plugin supports the following configuration options plus the <<plugins-{typ
37
106
  [cols="<,<,<",options="header",]
38
107
  |=======================================================================
39
108
  |Setting |Input type|Required
109
+ | <<plugins-{type}s-{plugin}-ecs_compatibility>> |<<string,string>>|No
40
110
  | <<plugins-{type}s-{plugin}-lru_cache_size>> |<<number,number>>|No
41
111
  | <<plugins-{type}s-{plugin}-prefix>> |<<string,string>>|No
42
112
  | <<plugins-{type}s-{plugin}-regexes>> |<<string,string>>|No
@@ -49,11 +119,25 @@ filter plugins.
49
119
 
50
120
  &nbsp;
51
121
 
122
+ [id="plugins-{type}s-{plugin}-ecs_compatibility"]
123
+ ===== `ecs_compatibility`
124
+
125
+ * Value type is <<string,string>>
126
+ * Supported values are:
127
+ ** `disabled`: does not use ECS-compatible field names (fields might be set at the root of the event)
128
+ ** `v1`: uses fields that are compatible with Elastic Common Schema (for example, `[user_agent][version]`)
129
+ * Default value depends on which version of Logstash is running:
130
+ ** When Logstash provides a `pipeline.ecs_compatibility` setting, its value is used as the default
131
+ ** Otherwise, the default value is `disabled`.
132
+
133
+ Controls this plugin's compatibility with the {ecs-ref}[Elastic Common Schema (ECS)].
134
+ The value of this setting affects the _default_ value of <<plugins-{type}s-{plugin}-target>>.
135
+
52
136
  [id="plugins-{type}s-{plugin}-lru_cache_size"]
53
137
  ===== `lru_cache_size`
54
138
 
55
139
  * Value type is <<number,number>>
56
- * Default value is `1000`
140
+ * Default value is `100000`
57
141
 
58
142
  UA parsing is surprisingly expensive. This filter uses an LRU cache to take advantage of the fact that
59
143
  user agents are often found adjacent to one another in log files and rarely have a random distribution.
@@ -84,10 +168,8 @@ A string to prepend to all of the extracted keys
84
168
  * Value type is <<string,string>>
85
169
  * There is no default value for this setting.
86
170
 
87
- `regexes.yaml` file to use
88
-
89
171
  If not specified, this will default to the `regexes.yaml` that ships
90
- with logstash.
172
+ with logstash. Otherwise use the provided `regexes.yaml` file.
91
173
 
92
174
  You can find the latest version of this here:
93
175
  <https://github.com/ua-parser/uap-core/blob/master/regexes.yaml>
@@ -106,7 +188,9 @@ array, only the first value will be used.
106
188
  ===== `target`
107
189
 
108
190
  * Value type is <<string,string>>
109
- * There is no default value for this setting.
191
+ * Default value depends on whether <<plugins-{type}s-{plugin}-ecs_compatibility>> is enabled:
192
+ ** ECS Compatibility disabled: no default value for this setting
193
+ ** ECS Compatibility enabled: `"user_agent"`
110
194
 
111
195
  The name of the field to assign user agent data into.
112
196
 
@@ -115,4 +199,4 @@ If not specified user agent data will be stored in the root of the event.
115
199
 
116
200
 
117
201
  [id="plugins-{type}s-{plugin}-common-options"]
118
- include::{include_path}/{type}.asciidoc[]
202
+ include::{include_path}/{type}.asciidoc[]
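The ECS field table in the documentation hunk above can be restated as a small lookup. This is only a sketch and not code from the plugin: the constant `ECS_V1_FIELDS` and the helpers `ecs_field_for` and `metadata_only?` are hypothetical names, and the mapping is copied from the table as shown in the diff.

```ruby
# Hypothetical lookup restating the documentation table:
# legacy field name (ECS disabled) => field reference used in ECS v1 mode.
ECS_V1_FIELDS = {
  'name'     => '[user_agent][name]',
  'major'    => '[@metadata][filter][user_agent][version][major]',
  'minor'    => '[@metadata][filter][user_agent][version][minor]',
  'patch'    => '[@metadata][filter][user_agent][version][patch]',
  'os_name'  => '[user_agent][os][name]',
  'os_major' => '[@metadata][filter][user_agent][os][version][major]',
  'os_minor' => '[@metadata][filter][user_agent][os][version][minor]',
  'os_patch' => '[@metadata][filter][user_agent][os][version][patch]',
  'os_full'  => '[user_agent][os][full]',
  'device'   => '[user_agent][device][name]'
}.freeze

# Returns the ECS v1 field reference for a legacy field name,
# or nil when the table lists no counterpart.
def ecs_field_for(legacy_name)
  ECS_V1_FIELDS[legacy_name]
end

# True when the table marks the field as "Only as meta-data in ECS mode",
# i.e. its ECS reference lives under [@metadata].
def metadata_only?(legacy_name)
  ref = ecs_field_for(legacy_name)
  !ref.nil? && ref.start_with?('[@metadata]')
end
```

For instance, `ecs_field_for('device')` gives `[user_agent][device][name]`, while `metadata_only?('os_major')` is true because the split version segments are kept only under `[@metadata]` in ECS mode.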
@@ -1,10 +1,8 @@
1
1
  # encoding: utf-8
2
- require "java"
3
2
  require "logstash-filter-useragent_jars"
4
3
  require "logstash/filters/base"
5
4
  require "logstash/namespace"
6
- require "tempfile"
7
- require "thread"
5
+ require 'logstash/plugin_mixins/ecs_compatibility_support'
8
6
 
9
7
  # Parse user agent strings into structured data based on BrowserScope data
10
8
  #
@@ -16,6 +14,8 @@ require "thread"
16
14
  # <https://github.com/tobie/ua-parser/>.
17
15
  class LogStash::Filters::UserAgent < LogStash::Filters::Base
18
16
 
17
+ include LogStash::PluginMixins::ECSCompatibilitySupport(:disabled, :v1)
18
+
19
19
  config_name "useragent"
20
20
 
21
21
  # The field containing the user agent string. If this field is an
@@ -25,7 +25,7 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
25
25
  # The name of the field to assign user agent data into.
26
26
  #
27
27
  # If not specified user agent data will be stored in the root of the event.
28
- config :target, :validate => :string
28
+ config :target, :validate => :string # default [user_agent] in ECS mode
29
29
 
30
30
  # `regexes.yaml` file to use
31
31
  #
@@ -37,7 +37,7 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
37
37
  config :regexes, :validate => :string
38
38
 
39
39
  # A string to prepend to all of the extracted keys
40
- config :prefix, :validate => :string, :default => ''
40
+ config :prefix, :validate => :string, :default => '' # not supported in ECS mode
41
41
 
42
42
  # UA parsing is surprisingly expensive. This filter uses an LRU cache to take advantage of the fact that
43
43
  # user agents are often found adjacent to one another in log files and rarely have a random distribution.
@@ -55,7 +55,46 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
55
55
  # number of cache misses and waste memory.
56
56
  config :lru_cache_size, :validate => :number, :default => 100_000
57
57
 
58
+ def initialize(*params)
59
+ super
60
+
61
+ # make @target in the format [field name] if defined, i.e. surrounded by brackets
62
+ target = @target || ecs_select[disabled: '', v1: '[user_agent]']
63
+ target = "[#{@target}]" if !target.empty? && target !~ /^\[[^\[\]]+\]$/
64
+
65
+ @name_field = ecs_select[disabled: "[#{@prefix}name]", v1: '[name]']
66
+ @name_field = "#{target}#{@name_field}"
67
+
68
+ @device_name_field = ecs_select[disabled: "[#{@prefix}device]", v1: '[device][name]']
69
+ @device_name_field = "#{target}#{@device_name_field}"
70
+
71
+ @version_field = ecs_select[disabled: "[#{@prefix}version]", v1: '[version]']
72
+ @version_field = "#{target}#{@version_field}"
73
+ @major_field = ecs_select[disabled: "#{target}[#{@prefix}major]", v1: "[@metadata][filter][user_agent][version][major]"]
74
+ @minor_field = ecs_select[disabled: "#{target}[#{@prefix}minor]", v1: "[@metadata][filter][user_agent][version][minor]"]
75
+ @patch_field = ecs_select[disabled: "#{target}[#{@prefix}patch]", v1: "[@metadata][filter][user_agent][version][patch]"]
76
+
77
+ @os_full_name_field = ecs_select[disabled: "[#{@prefix}os_full]", v1: '[os][full]'] # did not exist in legacy prior to ECS-ification
78
+ @os_full_name_field = "#{target}#{@os_full_name_field}"
79
+
80
+ @os_name_field = ecs_select[disabled: "[#{@prefix}os_name]", v1: '[os][name]']
81
+ @os_name_field = "#{target}#{@os_name_field}"
82
+ @legacy_os_field = ecs_select[disabled: "#{target}[#{@prefix}os]", v1: nil] # same as [os_name] in legacy mode
83
+
84
+ @os_version_field = ecs_select[disabled: "[#{@prefix}os_version]", v1: '[os][version]']
85
+ @os_version_field = "#{target}#{@os_version_field}"
86
+ @os_major_field = ecs_select[disabled: "#{target}[#{@prefix}os_major]", v1: "[@metadata][filter][user_agent][os][version][major]"]
87
+ @os_minor_field = ecs_select[disabled: "#{target}[#{@prefix}os_minor]", v1: "[@metadata][filter][user_agent][os][version][minor]"]
88
+ @os_patch_field = ecs_select[disabled: "#{target}[#{@prefix}os_patch]", v1: "[@metadata][filter][user_agent][os][version][patch]"]
89
+
90
+ # NOTE: unfortunately we can not reliably provide `user_agent.original` since the patterns do not
91
+ # reliably give back the matched group and they support the UA string prefixed and/or suffixed
92
+ end
93
+
58
94
  def register
95
+ if ecs_compatibility != :disabled && @prefix && !@prefix.empty?
96
+ @logger.warn "Field prefix isn't supported in ECS compatibility mode, please remove `prefix => #{@prefix.inspect}`"
97
+ end
59
98
 
60
99
  if @regexes.nil?
61
100
  @parser = org.logstash.uaparser.CachingParser.new(lru_cache_size)
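The target handling added to `initialize` in the hunk above can be illustrated outside the plugin. The sketch below is a simplified stand-in, not the plugin's code: `normalize_target` and `field_reference` are hypothetical helpers, `ecs_mode` stands in for the `ecs_compatibility` setting, and the `prefix` option is left out for brevity.

```ruby
# Hypothetical helper: returns the target as a bracketed field reference.
# With no explicit target, ECS v1 falls back to "[user_agent]" and legacy
# mode to the event root (empty string).
def normalize_target(target, ecs_mode)
  target ||= (ecs_mode == :v1 ? '[user_agent]' : '')
  # Wrap a bare name such as "ua" in brackets; "[ua]" and "" stay unchanged.
  target = "[#{target}]" if !target.empty? && target !~ /^\[[^\[\]]+\]$/
  target
end

# Hypothetical helper: builds the full field reference for one extracted value.
# legacy_name is the flat legacy key, ecs_suffix the nested ECS reference.
def field_reference(target, ecs_mode, legacy_name, ecs_suffix)
  suffix = ecs_mode == :v1 ? ecs_suffix : "[#{legacy_name}]"
  "#{normalize_target(target, ecs_mode)}#{suffix}"
end
```

Under these assumptions, `field_reference('ua', :disabled, 'name', '[name]')` yields `[ua][name]`, and `field_reference(nil, :v1, 'device', '[device][name]')` yields `[user_agent][device][name]`.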
@@ -63,21 +102,6 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
63
102
  @logger.debug("Using user agent regexes", :regexes => @regexes)
64
103
  @parser = org.logstash.uaparser.CachingParser.new(@regexes, lru_cache_size)
65
104
  end
66
-
67
- # make @target in the format [field name] if defined, i.e. surrounded by brakets
68
- normalized_target = (@target && @target !~ /^\[[^\[\]]+\]$/) ? "[#{@target}]" : ""
69
-
70
- # predefine prefixed field names
71
- @prefixed_name = "#{normalized_target}[#{@prefix}name]"
72
- @prefixed_os = "#{normalized_target}[#{@prefix}os]"
73
- @prefixed_os_name = "#{normalized_target}[#{@prefix}os_name]"
74
- @prefixed_os_major = "#{normalized_target}[#{@prefix}os_major]"
75
- @prefixed_os_minor = "#{normalized_target}[#{@prefix}os_minor]"
76
- @prefixed_device = "#{normalized_target}[#{@prefix}device]"
77
- @prefixed_major = "#{normalized_target}[#{@prefix}major]"
78
- @prefixed_minor = "#{normalized_target}[#{@prefix}minor]"
79
- @prefixed_patch = "#{normalized_target}[#{@prefix}patch]"
80
- @prefixed_build = "#{normalized_target}[#{@prefix}build]"
81
105
  end
82
106
 
83
107
  def filter(event)
@@ -88,60 +112,127 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
88
112
 
89
113
  begin
90
114
  ua_data = lookup_useragent(useragent)
91
- rescue StandardError => e
92
- @logger.error("Uknown error while parsing user agent data", :exception => e, :field => @source, :event => event)
115
+ rescue => e
116
+ @logger.error("Unknown error while parsing user agent data",
117
+ :exception => e.class, :message => e.message, :backtrace => e.backtrace,
118
+ :field => @source, :event => event.to_hash)
93
119
  return
94
120
  end
95
121
 
96
122
  return unless ua_data
97
123
 
98
124
  event.remove(@source) if @target == @source
99
- set_fields(event, ua_data)
125
+ set_fields(event, useragent, ua_data)
100
126
 
101
127
  filter_matched(event)
102
128
  end
103
129
 
104
- # should be private but need to stay public for specs
105
- # TODO: (colin) the related specs should be refactored to not rely on private methods.
106
- def lookup_useragent(useragent)
107
- return unless useragent
130
+ private
108
131
 
109
- # the UserAgentParser::Parser class is not thread safe, indications are that it is probably
110
- # caused by the underlying JRuby regex code that is not thread safe.
111
- # see https://github.com/logstash-plugins/logstash-filter-useragent/issues/25
132
+ def lookup_useragent(useragent)
112
133
  @parser.parse(useragent)
113
134
  end
114
135
 
115
- private
116
-
117
- def set_fields(event, ua_data)
118
- # UserAgentParser outputs as US-ASCII.
119
-
120
- event.set(@prefixed_name, ua_data.userAgent.family.dup.force_encoding(Encoding::UTF_8))
121
-
122
- #OSX, Android and maybe iOS parse correctly, ua-agent parsing for Windows does not provide this level of detail
136
+ def set_fields(event, ua_source, ua_data)
137
+ # UserAgentParser strings are US-ASCII
138
+
139
+ ua = ua_data.userAgent
140
+ event.set(@name_field, duped_string(ua.family))
141
+ event.set(@device_name_field, duped_string(ua_data.device)) if ua_data.device
142
+
143
+ event.set(@major_field, duped_string(ua.major)) if ua.major
144
+ event.set(@minor_field, duped_string(ua.minor)) if ua.minor
145
+ event.set(@patch_field, duped_string(ua.patch)) if ua.patch
146
+ set_version(event, ua_source, ua) # UA version string e.g. "89.0.4389.90"
147
+
148
+ os = ua_data.os
149
+ if os
150
+ # os.major, os.minor, ... are all strings
151
+ event.set(@os_major_field, duped_string(os.major)) if os.major # e.g. 'Vista' or '10'
152
+ event.set(@os_minor_field, duped_string(os.minor)) if os.minor
153
+ event.set(@os_patch_field, duped_string(os.patch)) if os.patch
154
+ os_version = build_os_version(os)
155
+ event.set(@os_version_field, os_version) if os_version
156
+
157
+ os_name = os.family
158
+ if os_name
159
+ os_name = duped_string(os_name)
160
+ event.set(@os_name_field, os_name)
161
+ event.set(@legacy_os_field, os_name.dup) if @legacy_os_field
162
+ os_full_name = os_name.dup
163
+ os_full_name << ' ' << os_version if os_version
164
+ event.set(@os_full_name_field, os_full_name)
165
+ end
166
+ end
167
+ end
123
168
 
124
- # Calls in here use #dup because there's potential for later filters to modify these values
125
- # and corrupt the cache. See uap source here for details https://github.com/ua-parser/uap-ruby/tree/master/lib/user_agent_parser
126
- if (os = ua_data.os)
127
- # The OS is a rich object
128
- event.set(@prefixed_os, ua_data.os.family.dup.force_encoding(Encoding::UTF_8))
129
- event.set(@prefixed_os_name, os.family.dup.force_encoding(Encoding::UTF_8)) if os.family
130
-
131
- # These are all strings
132
- if os.minor && os.major
133
- event.set(@prefixed_os_major, os.major.dup.force_encoding(Encoding::UTF_8)) if os.major
134
- event.set(@prefixed_os_minor, os.minor.dup.force_encoding(Encoding::UTF_8)) if os.minor
169
+ # reconstruct and set the User-Agent version string
170
+ def set_version(event, ua_source, ua)
171
+ if @version_field && ua.major
172
+ # only Chrome has all 4 segments, while Firefox only uses major.minor
173
+ version = duped_string(ua.major)
174
+ if ua.minor
175
+ version << '.' << ua.minor
176
+ if ua.patch
177
+ version << '.' << ua.patch
178
+ if ua.patchMinor
179
+ version << '.' << ua.patchMinor
180
+ else
181
+ adjusted_version = check_and_adjust_version(ua_source, version)
182
+ version = adjusted_version if adjusted_version
183
+ end
184
+ end
135
185
  end
186
+ event.set(@version_field, version)
136
187
  end
188
+ end
137
189
 
138
- event.set(@prefixed_device, ua_data.device.to_s.dup.force_encoding(Encoding::UTF_8)) if ua_data.device
190
+ def check_and_adjust_version(ua_source, version)
191
+ # only set OS version if it's not 'interpreted' (contained in UA string)
192
+ return nil if !version || (i = ua_source.index(version)).nil?
193
+ i += version.size
194
+ # complete version when patchMinor is not matched but still there
195
+ if ua_source[i] == '.' # we built the version with dots
196
+ if patch_minor = ua_source.index(' ', i + 1)
197
+ patch_minor = ua_source[i + 1...patch_minor]
198
+ if patch_minor.eql? patch_minor.to_i.to_s
199
+ version = "#{version}.#{patch_minor}"
200
+ end
201
+ end
202
+ end
203
+ version
204
+ end
139
205
 
140
- if (ua_version = ua_data.userAgent)
141
- event.set(@prefixed_major, ua_version.major.dup.force_encoding(Encoding::UTF_8)) if ua_version.major
142
- event.set(@prefixed_minor, ua_version.minor.dup.force_encoding(Encoding::UTF_8)) if ua_version.minor
143
- event.set(@prefixed_patch, ua_version.patch.dup.force_encoding(Encoding::UTF_8)) if ua_version.patch
144
- event.set(@prefixed_build, ua_version.patchMinor.dup.force_encoding(Encoding::UTF_8)) if ua_version.patchMinor
206
+ # reconstructs the OS version string
207
+ def build_os_version(os)
208
+ # NOTE: UA regexes don't always give us the versions back
209
+ # they do get "corrected" for various OSes such as:
210
+ # - Windows (Windows NT 6.0 => 'Vista')
211
+ # - Windows ('Windows NT 6.3' => '8','1')
212
+ # - Windows ('Windows NT 10.0' => '10')
213
+ # - iOS ('Darwin/15.5' => '9','3','2')
214
+ return unless major = os.major
215
+ if major.to_i.to_s == major
216
+ version, sep = duped_string(major), '.'
217
+ else
218
+ version, sep = duped_string(major), ' '
145
219
  end
220
+ if os.minor
221
+ version << sep << os.minor
222
+ if os.patch
223
+ version << '.' << os.patch
224
+ if os.patchMinor
225
+ version << '.' << os.patchMinor
226
+ end
227
+ end
228
+ end
229
+ version
146
230
  end
231
+
232
+ def duped_string(str)
233
+ # Calls in here use #dup because there's potential for later filters to modify these values
234
+ # and corrupt the cache. See uap source here for details https://github.com/ua-parser/uap-ruby/tree/master/lib/user_agent_parser
235
+ str.dup.force_encoding(Encoding::UTF_8)
236
+ end
237
+
147
238
  end
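The OS version reconstruction introduced by `build_os_version` can be tried in isolation. What follows is a minimal sketch under stated assumptions: `OsInfo` is a hypothetical struct standing in for the parser's OS result (the plugin reads `major`, `minor`, `patch` and `patchMinor` from a Java object), and the UTF-8 re-encoding done by `duped_string` is reduced to a plain `dup`.

```ruby
# Hypothetical stand-in for the OS object returned by the UA parser.
# All segments are strings or nil, as noted in the plugin's comments.
OsInfo = Struct.new(:major, :minor, :patch, :patch_minor)

# Rebuilds an OS version string from its segments.
# Returns nil when no major segment was matched.
def build_os_version(os)
  major = os.major
  return nil unless major

  # Numeric majors ("10") are joined to the minor with '.',
  # named majors ("Vista") with a space.
  sep = major.to_i.to_s == major ? '.' : ' '
  version = major.dup # copy so the caller's string is never mutated
  if os.minor
    version << sep << os.minor
    if os.patch
      version << '.' << os.patch
      version << '.' << os.patch_minor if os.patch_minor
    end
  end
  version
end
```

With the documentation example, `build_os_version(OsInfo.new('10', '11'))` returns `"10.11"`, and a named major such as `OsInfo.new('Vista')` comes back unchanged as `"Vista"`.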