logstash-filter-useragent 3.2.4-java → 3.3.1-java
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -0
- data/LICENSE +199 -10
- data/README.md +1 -1
- data/docs/index.asciidoc +95 -11
- data/lib/logstash/filters/useragent.rb +146 -55
- data/logstash-filter-useragent.gemspec +1 -0
- data/spec/filters/useragent_spec.rb +427 -78
- data/vendor/jar-dependencies/org/logstash/filters/logstash-filter-useragent/{3.2.4/logstash-filter-useragent-3.2.4.jar → 3.3.1/logstash-filter-useragent-3.3.1.jar} +0 -0
- data/version +1 -1
- metadata +17 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 0e9602a03ed7ff6329d58b7f929f6a2337a1429fd6b9794f7df7ae5fa37aefaa
+  data.tar.gz: a5a45f3a8b9de9cdbe68e1cc351d7a84ecd045710038cb39e7f4d88decd689eb
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: aeeb91619083fbf74249ec8f38526efab217e9caba60e94a84ddce30f284d254e8deacea6b7fa98ba6da08043e47972e3041f2df7fdb468d49d9587ce73f7e8a
+  data.tar.gz: 8bd5b3c6059f151cc1e900a94b2cf1a6386174a6b01bd24d76f9e56d30742480af685e624e53ce1e878b5948a2e40dd38b2bd2b663818f3811f246bdc03aeeda
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,21 @@
+## 3.3.1
+- Fix: invalid 3.3.0 release which did not package correctly [#71](https://github.com/logstash-plugins/logstash-filter-useragent/pull/71)
+
+## 3.3.0 (invalid)
+- Feat: support ECS mode when setting UA fields [#68](https://github.com/logstash-plugins/logstash-filter-useragent/pull/68)
+
+- Fix: capture os major version + update UA regexes [#69](https://github.com/logstash-plugins/logstash-filter-useragent/pull/69)
+
+The UA parser *regexes.yaml* update (to **v0.12.0**) will accurately detect recent user agent strings.
+
+NOTE: The update might cause changes in matching user agent fields such as `name`
+(for example, the previous version did not support `Edge` and detect it as `Chrome`).
+If needed the old behavior can be restored by downloading the outdated [regexes.yaml](https://raw.githubusercontent.com/ua-parser/uap-core/2e6c983e42e7aae7d957a263cb4d3de7ccbd92af/regexes.yaml)
+and configuring `regexes => path/to/regexes.yaml`.
+
+- Plugin no longer sets the `[build]` UA version field which is not implemented and was always `""`.
+- Fix: `target => [field]` configuration, which wasn't working previously
+
 ## 3.2.4
 - Added support for OS regular expressions that use backreferences [#59](https://github.com/logstash-plugins/logstash-filter-useragent/pull/59)
 
data/LICENSE
CHANGED
@@ -1,13 +1,202 @@
-Copyright (c) 2012-2018 Elasticsearch <http://www.elastic.co>
 
-
-
-
+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/
 
-
+TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
 
-
-
-
-
-
+1. Definitions.
+
+"License" shall mean the terms and conditions for use, reproduction,
+and distribution as defined by Sections 1 through 9 of this document.
+
+"Licensor" shall mean the copyright owner or entity authorized by
+the copyright owner that is granting the License.
+
+"Legal Entity" shall mean the union of the acting entity and all
+other entities that control, are controlled by, or are under common
+control with that entity. For the purposes of this definition,
+"control" means (i) the power, direct or indirect, to cause the
+direction or management of such entity, whether by contract or
+otherwise, or (ii) ownership of fifty percent (50%) or more of the
+outstanding shares, or (iii) beneficial ownership of such entity.
+
+"You" (or "Your") shall mean an individual or Legal Entity
+exercising permissions granted by this License.
+
+"Source" form shall mean the preferred form for making modifications,
+including but not limited to software source code, documentation
+source, and configuration files.
+
+"Object" form shall mean any form resulting from mechanical
+transformation or translation of a Source form, including but
+not limited to compiled object code, generated documentation,
+and conversions to other media types.
+
+"Work" shall mean the work of authorship, whether in Source or
+Object form, made available under the License, as indicated by a
+copyright notice that is included in or attached to the work
+(an example is provided in the Appendix below).
+
+"Derivative Works" shall mean any work, whether in Source or Object
+form, that is based on (or derived from) the Work and for which the
+editorial revisions, annotations, elaborations, or other modifications
+represent, as a whole, an original work of authorship. For the purposes
+of this License, Derivative Works shall not include works that remain
+separable from, or merely link (or bind by name) to the interfaces of,
+the Work and Derivative Works thereof.
+
+"Contribution" shall mean any work of authorship, including
+the original version of the Work and any modifications or additions
+to that Work or Derivative Works thereof, that is intentionally
+submitted to Licensor for inclusion in the Work by the copyright owner
+or by an individual or Legal Entity authorized to submit on behalf of
+the copyright owner. For the purposes of this definition, "submitted"
+means any form of electronic, verbal, or written communication sent
+to the Licensor or its representatives, including but not limited to
+communication on electronic mailing lists, source code control systems,
+and issue tracking systems that are managed by, or on behalf of, the
+Licensor for the purpose of discussing and improving the Work, but
+excluding communication that is conspicuously marked or otherwise
+designated in writing by the copyright owner as "Not a Contribution."
+
+"Contributor" shall mean Licensor and any individual or Legal Entity
+on behalf of whom a Contribution has been received by Licensor and
+subsequently incorporated within the Work.
+
+2. Grant of Copyright License. Subject to the terms and conditions of
+this License, each Contributor hereby grants to You a perpetual,
+worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+copyright license to reproduce, prepare Derivative Works of,
+publicly display, publicly perform, sublicense, and distribute the
+Work and such Derivative Works in Source or Object form.
+
+3. Grant of Patent License. Subject to the terms and conditions of
+this License, each Contributor hereby grants to You a perpetual,
+worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+(except as stated in this section) patent license to make, have made,
+use, offer to sell, sell, import, and otherwise transfer the Work,
+where such license applies only to those patent claims licensable
+by such Contributor that are necessarily infringed by their
+Contribution(s) alone or by combination of their Contribution(s)
+with the Work to which such Contribution(s) was submitted. If You
+institute patent litigation against any entity (including a
+cross-claim or counterclaim in a lawsuit) alleging that the Work
+or a Contribution incorporated within the Work constitutes direct
+or contributory patent infringement, then any patent licenses
+granted to You under this License for that Work shall terminate
+as of the date such litigation is filed.
+
+4. Redistribution. You may reproduce and distribute copies of the
+Work or Derivative Works thereof in any medium, with or without
+modifications, and in Source or Object form, provided that You
+meet the following conditions:
+
+(a) You must give any other recipients of the Work or
+Derivative Works a copy of this License; and
+
+(b) You must cause any modified files to carry prominent notices
+stating that You changed the files; and
+
+(c) You must retain, in the Source form of any Derivative Works
+that You distribute, all copyright, patent, trademark, and
+attribution notices from the Source form of the Work,
+excluding those notices that do not pertain to any part of
+the Derivative Works; and
+
+(d) If the Work includes a "NOTICE" text file as part of its
+distribution, then any Derivative Works that You distribute must
+include a readable copy of the attribution notices contained
+within such NOTICE file, excluding those notices that do not
+pertain to any part of the Derivative Works, in at least one
+of the following places: within a NOTICE text file distributed
+as part of the Derivative Works; within the Source form or
+documentation, if provided along with the Derivative Works; or,
+within a display generated by the Derivative Works, if and
+wherever such third-party notices normally appear. The contents
+of the NOTICE file are for informational purposes only and
+do not modify the License. You may add Your own attribution
+notices within Derivative Works that You distribute, alongside
+or as an addendum to the NOTICE text from the Work, provided
+that such additional attribution notices cannot be construed
+as modifying the License.
+
+You may add Your own copyright statement to Your modifications and
+may provide additional or different license terms and conditions
+for use, reproduction, or distribution of Your modifications, or
+for any such Derivative Works as a whole, provided Your use,
+reproduction, and distribution of the Work otherwise complies with
+the conditions stated in this License.
+
+5. Submission of Contributions. Unless You explicitly state otherwise,
+any Contribution intentionally submitted for inclusion in the Work
+by You to the Licensor shall be under the terms and conditions of
+this License, without any additional terms or conditions.
+Notwithstanding the above, nothing herein shall supersede or modify
+the terms of any separate license agreement you may have executed
+with Licensor regarding such Contributions.
+
+6. Trademarks. This License does not grant permission to use the trade
+names, trademarks, service marks, or product names of the Licensor,
+except as required for reasonable and customary use in describing the
+origin of the Work and reproducing the content of the NOTICE file.
+
+7. Disclaimer of Warranty. Unless required by applicable law or
+agreed to in writing, Licensor provides the Work (and each
+Contributor provides its Contributions) on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+implied, including, without limitation, any warranties or conditions
+of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+PARTICULAR PURPOSE. You are solely responsible for determining the
+appropriateness of using or redistributing the Work and assume any
+risks associated with Your exercise of permissions under this License.
+
+8. Limitation of Liability. In no event and under no legal theory,
+whether in tort (including negligence), contract, or otherwise,
+unless required by applicable law (such as deliberate and grossly
+negligent acts) or agreed to in writing, shall any Contributor be
+liable to You for damages, including any direct, indirect, special,
+incidental, or consequential damages of any character arising as a
+result of this License or out of the use or inability to use the
+Work (including but not limited to damages for loss of goodwill,
+work stoppage, computer failure or malfunction, or any and all
+other commercial damages or losses), even if such Contributor
+has been advised of the possibility of such damages.
+
+9. Accepting Warranty or Additional Liability. While redistributing
+the Work or Derivative Works thereof, You may choose to offer,
+and charge a fee for, acceptance of support, warranty, indemnity,
+or other liability obligations and/or rights consistent with this
+License. However, in accepting such obligations, You may act only
+on Your own behalf and on Your sole responsibility, not on behalf
+of any other Contributor, and only if You agree to indemnify,
+defend, and hold each Contributor harmless for any liability
+incurred by, or claims asserted against, such Contributor by reason
+of your accepting any such warranty or additional liability.
+
+END OF TERMS AND CONDITIONS
+
+APPENDIX: How to apply the Apache License to your work.
+
+To apply the Apache License to your work, attach the following
+boilerplate notice, with the fields enclosed by brackets "[]"
+replaced with your own identifying information. (Don't include
+the brackets!) The text should be enclosed in the appropriate
+comment syntax for the file format. We also recommend that a
+file or class name and description of purpose be included on the
+same "printed page" as the copyright notice for easier
+identification within third-party archives.
+
+Copyright 2020 Elastic and contributors
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
data/README.md
CHANGED
@@ -1,6 +1,6 @@
 # Logstash Plugin
 
-[![Travis Build Status](https://travis-ci.
+[![Travis Build Status](https://travis-ci.com/logstash-plugins/logstash-filter-useragent.svg)](https://travis-ci.com/logstash-plugins/logstash-filter-useragent)
 
 This is a plugin for [Logstash](https://github.com/elastic/logstash).
 
data/docs/index.asciidoc
CHANGED
@@ -22,12 +22,81 @@ include::{include_path}/plugin_header.asciidoc[]
 
 Parse user agent strings into structured data based on BrowserScope data
 
-UserAgent filter, adds information about user agent like
-system,
+UserAgent filter, adds information about user agent like name, version, operating
+system, and device.
 
-
-
-<https://github.com/
+The plugin ships with the *regexes.yaml* database made available from ua-parser
+with an Apache 2.0 license. For more details on ua-parser, see
+<https://github.com/ua-parser/uap-core/>.
+
+==== Compatibility with the Elastic Common Schema (ECS)
+
+This plugin can be used to parse user-agent (UA) _into_ fields compliant with the Elastic Common Schema.
+Here's how
+<<plugins-{type}s-{plugin}-ecs_compatibility,ECS compatibility mode>> affects
+output.
+
+[cols="<l,<l,e,<e"]
+|=======================================================================
+|ECS disabled |ECS v1 |Description |Notes
+
+|[name] |[user_agent][name] |Detected UA name |
+| N/A |[user_agent][version] |Detected UA version |Only available in ECS mode
+|[major] |[@metadata][filter][user_agent][version][major] |UA major version |Only as meta-data in ECS mode
+|[minor] |[@metadata][filter][user_agent][version][minor] |UA minor version |Only as meta-data in ECS mode
+|[patch] |[@metadata][filter][user_agent][version][patch] |UA patch version |Only as meta-data in ECS mode
+|[os_name] |[user_agent][os][name] |Detected operating-system name |
+| N/A |[user_agent][os][version] |Detected OS version |Only available in ECS mode
+|[os_major] |[@metadata][filter][user_agent][os][version][major] |OS major version |Only as meta-data in ECS mode
+|[os_minor] |[@metadata][filter][user_agent][os][version][minor] |OS minor version |Only as meta-data in ECS mode
+|[os_patch] |[@metadata][filter][user_agent][os][version][patch] |OS patch version |Only as meta-data in ECS mode
+|[os_full] |[user_agent][os][full] |Full operating-system name |
+|[device] |[user_agent][device][name] |Device name |
+|=======================================================================
+
+Example:
+[source,ruby]
+    filter {
+      useragent {
+        source => 'message'
+      }
+    }
+
+Given an event with the `message` field set as: `Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Firefox/45.0`
+produces the following fields:
+
+[source,ruby]
+-----
+    {
+      "name"=>"Firefox",
+      "version"=>"45.0",
+      "major"=>"45",
+      "minor"=>"0",
+      "os_name"=>"Mac OS X",
+      "os_version"=>"10.11",
+      "os_full"=>"Mac OS X 10.11",
+      "os_major"=>"10",
+      "os_minor"=>"11",
+      "device"=>"Mac"
+    }
+-----
+
+**and with ECS enabled:**
+[source,ruby]
+-----
+    {
+      "user_agent"=>{
+        "name"=>"Firefox",
+        "version"=>"45.0",
+        "os"=>{
+          "name"=>"Mac OS X",
+          "version"=>"10.11",
+          "full"=>"Mac OS X 10.11"
+        },
+        "device"=>{"name"=>"Mac"},
+      }
+    }
+-----
 
 [id="plugins-{type}s-{plugin}-options"]
 ==== Useragent Filter Configuration Options
@@ -37,6 +106,7 @@ This plugin supports the following configuration options plus the <<plugins-{typ
 [cols="<,<,<",options="header",]
 |=======================================================================
 |Setting |Input type|Required
+| <<plugins-{type}s-{plugin}-ecs_compatibility>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-lru_cache_size>> |<<number,number>>|No
 | <<plugins-{type}s-{plugin}-prefix>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-regexes>> |<<string,string>>|No
@@ -49,11 +119,25 @@ filter plugins.
 
 
 
+[id="plugins-{type}s-{plugin}-ecs_compatibility"]
+===== `ecs_compatibility`
+
+* Value type is <<string,string>>
+* Supported values are:
+** `disabled`: does not use ECS-compatible field names (fields might be set at the root of the event)
+** `v1`: uses fields that are compatible with Elastic Common Schema (for example, `[user_agent][version]`)
+* Default value depends on which version of Logstash is running:
+** When Logstash provides a `pipeline.ecs_compatibility` setting, its value is used as the default
+** Otherwise, the default value is `disabled`.
+
+Controls this plugin's compatibility with the {ecs-ref}[Elastic Common Schema (ECS)].
+The value of this setting affects the _default_ value of <<plugins-{type}s-{plugin}-target>>.
+
 [id="plugins-{type}s-{plugin}-lru_cache_size"]
 ===== `lru_cache_size`
 
 * Value type is <<number,number>>
-* Default value is `
+* Default value is `100000`
 
 UA parsing is surprisingly expensive. This filter uses an LRU cache to take advantage of the fact that
 user agents are often found adjacent to one another in log files and rarely have a random distribution.
@@ -84,10 +168,8 @@ A string to prepend to all of the extracted keys
 * Value type is <<string,string>>
 * There is no default value for this setting.
 
-`regexes.yaml` file to use
-
 If not specified, this will default to the `regexes.yaml` that ships
-with logstash.
+with logstash. Otherwise use the provided `regexes.yaml` file.
 
 You can find the latest version of this here:
 <https://github.com/ua-parser/uap-core/blob/master/regexes.yaml>
@@ -106,7 +188,9 @@ array, only the first value will be used.
 ===== `target`
 
 * Value type is <<string,string>>
-*
+* Default value depends on whether <<plugins-{type}s-{plugin}-ecs_compatibility>> is enabled:
+** ECS Compatibility disabled: no default value for this setting
+** ECS Compatibility enabled: `"user_agent"`
 
 The name of the field to assign user agent data into.
 
@@ -115,4 +199,4 @@ If not specified user agent data will be stored in the root of the event.
 
 
 [id="plugins-{type}s-{plugin}-common-options"]
-include::{include_path}/{type}.asciidoc[]
+include::{include_path}/{type}.asciidoc[]
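The field mapping documented in the hunks above can be sketched as a small standalone Ruby helper. This is only an illustration under stated assumptions: `field_reference`, `normalize_target`, and the two constant names are hypothetical and do not exist in the plugin. The field paths themselves are taken from the ECS table and from the assignments in the plugin's `initialize` method, which (unlike the table) also emits `version` and `os_version` when ECS is disabled.

```ruby
# Hypothetical helper names; only the field paths come from the plugin diff.
ECS_V1_FIELDS = {
  'name'       => '[name]',
  'version'    => '[version]',
  'device'     => '[device][name]',
  'os_name'    => '[os][name]',
  'os_version' => '[os][version]',
  'os_full'    => '[os][full]'
}.freeze

# In ECS mode the individual version segments are kept only as event metadata.
ECS_V1_METADATA_FIELDS = {
  'major'    => '[@metadata][filter][user_agent][version][major]',
  'minor'    => '[@metadata][filter][user_agent][version][minor]',
  'patch'    => '[@metadata][filter][user_agent][version][patch]',
  'os_major' => '[@metadata][filter][user_agent][os][version][major]',
  'os_minor' => '[@metadata][filter][user_agent][os][version][minor]',
  'os_patch' => '[@metadata][filter][user_agent][os][version][patch]'
}.freeze

# Wraps a bare field name in brackets; an empty target means "root of the event".
def normalize_target(target, default)
  target = default if target.nil?
  return '' if target.empty?
  target.match?(/\A\[[^\[\]]+\]\z/) ? target : "[#{target}]"
end

# Returns the event field reference for one parsed value.
def field_reference(key, ecs_compatibility: :disabled, target: nil, prefix: '')
  case ecs_compatibility
  when :disabled
    # Legacy mode: flat, optionally prefixed names under the optional target.
    "#{normalize_target(target, '')}[#{prefix}#{key}]"
  when :v1
    # ECS mode: the prefix is ignored and the target defaults to user_agent.
    return ECS_V1_METADATA_FIELDS[key] if ECS_V1_METADATA_FIELDS.key?(key)
    "#{normalize_target(target, 'user_agent')}#{ECS_V1_FIELDS.fetch(key)}"
  else
    raise ArgumentError, "unsupported ecs_compatibility: #{ecs_compatibility.inspect}"
  end
end
```

For instance, `field_reference('device', ecs_compatibility: :v1)` resolves to `[user_agent][device][name]`, while `field_reference('os_name', target: 'ua', prefix: 'x_')` resolves to `[ua][x_os_name]`.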
data/lib/logstash/filters/useragent.rb
CHANGED
@@ -1,10 +1,8 @@
 # encoding: utf-8
-require "java"
 require "logstash-filter-useragent_jars"
 require "logstash/filters/base"
 require "logstash/namespace"
-require
-require "thread"
+require 'logstash/plugin_mixins/ecs_compatibility_support'
 
 # Parse user agent strings into structured data based on BrowserScope data
 #
@@ -16,6 +14,8 @@ require "thread"
 # <https://github.com/tobie/ua-parser/>.
 class LogStash::Filters::UserAgent < LogStash::Filters::Base
 
+  include LogStash::PluginMixins::ECSCompatibilitySupport(:disabled, :v1)
+
   config_name "useragent"
 
   # The field containing the user agent string. If this field is an
@@ -25,7 +25,7 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
   # The name of the field to assign user agent data into.
   #
   # If not specified user agent data will be stored in the root of the event.
-  config :target, :validate => :string
+  config :target, :validate => :string # default [user_agent] in ECS mode
 
   # `regexes.yaml` file to use
   #
@@ -37,7 +37,7 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
   config :regexes, :validate => :string
 
   # A string to prepend to all of the extracted keys
-  config :prefix, :validate => :string, :default => ''
+  config :prefix, :validate => :string, :default => '' # not supported in ECS mode
 
   # UA parsing is surprisingly expensive. This filter uses an LRU cache to take advantage of the fact that
   # user agents are often found adjacent to one another in log files and rarely have a random distribution.
@@ -55,7 +55,46 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
   # number of cache misses and waste memory.
   config :lru_cache_size, :validate => :number, :default => 100_000
 
+  def initialize(*params)
+    super
+
+    # make @target in the format [field name] if defined, i.e. surrounded by brackets
+    target = @target || ecs_select[disabled: '', v1: '[user_agent]']
+    target = "[#{@target}]" if !target.empty? && target !~ /^\[[^\[\]]+\]$/
+
+    @name_field = ecs_select[disabled: "[#{@prefix}name]", v1: '[name]']
+    @name_field = "#{target}#{@name_field}"
+
+    @device_name_field = ecs_select[disabled: "[#{@prefix}device]", v1: '[device][name]']
+    @device_name_field = "#{target}#{@device_name_field}"
+
+    @version_field = ecs_select[disabled: "[#{@prefix}version]", v1: '[version]']
+    @version_field = "#{target}#{@version_field}"
+    @major_field = ecs_select[disabled: "#{target}[#{@prefix}major]", v1: "[@metadata][filter][user_agent][version][major]"]
+    @minor_field = ecs_select[disabled: "#{target}[#{@prefix}minor]", v1: "[@metadata][filter][user_agent][version][minor]"]
+    @patch_field = ecs_select[disabled: "#{target}[#{@prefix}patch]", v1: "[@metadata][filter][user_agent][version][patch]"]
+
+    @os_full_name_field = ecs_select[disabled: "[#{@prefix}os_full]", v1: '[os][full]'] # did not exist in legacy prior to ECS-ification
+    @os_full_name_field = "#{target}#{@os_full_name_field}"
+
+    @os_name_field = ecs_select[disabled: "[#{@prefix}os_name]", v1: '[os][name]']
+    @os_name_field = "#{target}#{@os_name_field}"
+    @legacy_os_field = ecs_select[disabled: "#{target}[#{@prefix}os]", v1: nil] # same as [os_name] in legacy mode
+
+    @os_version_field = ecs_select[disabled: "[#{@prefix}os_version]", v1: '[os][version]']
+    @os_version_field = "#{target}#{@os_version_field}"
+    @os_major_field = ecs_select[disabled: "#{target}[#{@prefix}os_major]", v1: "[@metadata][filter][user_agent][os][version][major]"]
+    @os_minor_field = ecs_select[disabled: "#{target}[#{@prefix}os_minor]", v1: "[@metadata][filter][user_agent][os][version][minor]"]
+    @os_patch_field = ecs_select[disabled: "#{target}[#{@prefix}os_patch]", v1: "[@metadata][filter][user_agent][os][version][patch]"]
+
+    # NOTE: unfortunately we can not reliably provide `user_agent.original` since the patterns do not
+    # reliably give back the matched group and they support the UA string prefixed and/or suffixed
+  end
+
   def register
+    if ecs_compatibility != :disabled && @prefix && !@prefix.empty?
+      @logger.warn "Field prefix isn't supported in ECS compatibility mode, please remove `prefix => #{@prefix.inspect}`"
+    end
 
     if @regexes.nil?
       @parser = org.logstash.uaparser.CachingParser.new(lru_cache_size)
@@ -63,21 +102,6 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
       @logger.debug("Using user agent regexes", :regexes => @regexes)
       @parser = org.logstash.uaparser.CachingParser.new(@regexes, lru_cache_size)
     end
-
-    # make @target in the format [field name] if defined, i.e. surrounded by brakets
-    normalized_target = (@target && @target !~ /^\[[^\[\]]+\]$/) ? "[#{@target}]" : ""
-
-    # predefine prefixed field names
-    @prefixed_name = "#{normalized_target}[#{@prefix}name]"
-    @prefixed_os = "#{normalized_target}[#{@prefix}os]"
-    @prefixed_os_name = "#{normalized_target}[#{@prefix}os_name]"
-    @prefixed_os_major = "#{normalized_target}[#{@prefix}os_major]"
-    @prefixed_os_minor = "#{normalized_target}[#{@prefix}os_minor]"
-    @prefixed_device = "#{normalized_target}[#{@prefix}device]"
-    @prefixed_major = "#{normalized_target}[#{@prefix}major]"
-    @prefixed_minor = "#{normalized_target}[#{@prefix}minor]"
-    @prefixed_patch = "#{normalized_target}[#{@prefix}patch]"
-    @prefixed_build = "#{normalized_target}[#{@prefix}build]"
   end
 
   def filter(event)
@@ -88,60 +112,127 @@ class LogStash::Filters::UserAgent < LogStash::Filters::Base
 
     begin
       ua_data = lookup_useragent(useragent)
-    rescue
-      @logger.error("
+    rescue => e
+      @logger.error("Unknown error while parsing user agent data",
+                    :exception => e.class, :message => e.message, :backtrace => e.backtrace,
+                    :field => @source, :event => event.to_hash)
       return
     end
 
     return unless ua_data
 
     event.remove(@source) if @target == @source
-    set_fields(event, ua_data)
+    set_fields(event, useragent, ua_data)
 
     filter_matched(event)
   end
 
-
-  # TODO: (colin) the related specs should be refactored to not rely on private methods.
-  def lookup_useragent(useragent)
-    return unless useragent
+  private
 
-
-    # caused by the underlying JRuby regex code that is not thread safe.
-    # see https://github.com/logstash-plugins/logstash-filter-useragent/issues/25
+  def lookup_useragent(useragent)
     @parser.parse(useragent)
   end
 
-
-
-
-
-
-event.set(@
-
-
+  def set_fields(event, ua_source, ua_data)
+    # UserAgentParser strings are US-ASCII
+
+    ua = ua_data.userAgent
+    event.set(@name_field, duped_string(ua.family))
+    event.set(@device_name_field, duped_string(ua_data.device)) if ua_data.device
+
+    event.set(@major_field, duped_string(ua.major)) if ua.major
+    event.set(@minor_field, duped_string(ua.minor)) if ua.minor
+    event.set(@patch_field, duped_string(ua.patch)) if ua.patch
+    set_version(event, ua_source, ua) # UA version string e.g. "89.0.4389.90"
+
+    os = ua_data.os
+    if os
+      # os.major, os.minor, ... are all strings
+      event.set(@os_major_field, duped_string(os.major)) if os.major # e.g. 'Vista' or '10'
+      event.set(@os_minor_field, duped_string(os.minor)) if os.minor
+      event.set(@os_patch_field, duped_string(os.patch)) if os.patch
+      os_version = build_os_version(os)
+      event.set(@os_version_field, os_version) if os_version
+
+      os_name = os.family
+      if os_name
+        os_name = duped_string(os_name)
+        event.set(@os_name_field, os_name)
+        event.set(@legacy_os_field, os_name.dup) if @legacy_os_field
+        os_full_name = os_name.dup
+        os_full_name << ' ' << os_version if os_version
+        event.set(@os_full_name_field, os_full_name)
+      end
+    end
+  end
 
-
-
-if
-#
-
-
-
-
-
-
-
+  # reconstruct and set the User-Agent version string
+  def set_version(event, ua_source, ua)
+    if @version_field && ua.major
+      # only Chrome has all 4 segments, while Firefox only uses major.minor
+      version = duped_string(ua.major)
+      if ua.minor
+        version << '.' << ua.minor
+        if ua.patch
+          version << '.' << ua.patch
+          if ua.patchMinor
+            version << '.' << ua.patchMinor
+          else
+            adjusted_version = check_and_adjust_version(ua_source, version)
+            version = adjusted_version if adjusted_version
+          end
+        end
       end
+      event.set(@version_field, version)
     end
+  end
 
-
+  def check_and_adjust_version(ua_source, version)
+    # only set OS version if it's not 'interpreted' (contained in UA string)
+    return nil if !version || (i = ua_source.index(version)).nil?
+    i += version.size
+    # complete version when patchMinor is not matched but still there
+    if ua_source[i] == '.' # we built the version with dots
+      if patch_minor = ua_source.index(' ', i + 1)
+        patch_minor = ua_source[i + 1...patch_minor]
+        if patch_minor.eql? patch_minor.to_i.to_s
+          version = "#{version}.#{patch_minor}"
+        end
+      end
+    end
+    version
+  end
 
-
-
-
-
-
+  # reconstructs the OS version string
+  def build_os_version(os)
+    # NOTE: UA regexes don't always give us the versions back
+    # they do get "corrected" for various OSes such as:
+    # - Windows (Windows NT 6.0 => 'Vista')
+    # - Windows ('Windows NT 6.3' => '8','1')
+    # - Windows ('Windows NT 10.0' => '10')
+    # - iOS ('Darwin/15.5' => '9','3','2')
+    return unless major = os.major
+    if major.to_i.to_s == major
+      version, sep = duped_string(major), '.'
+    else
+      version, sep = duped_string(major), ' '
     end
+    if os.minor
+      version << sep << os.minor
+      if os.patch
+        version << '.' << os.patch
+        if os.patchMinor
+          version << '.' << os.patchMinor
+        end
+      end
+    end
+    version
   end
+
+  def duped_string(str)
+    # Calls in here use #dup because there's potential for later filters to modify these values
+    # and corrupt the cache. See uap source here for details https://github.com/ua-parser/uap-ruby/tree/master/lib/user_agent_parser
+    str.dup.force_encoding(Encoding::UTF_8)
+  end
+
 end
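The version-string handling added in this hunk can be exercised outside Logstash with a minimal sketch. It assumes a plain `Struct` (`OsInfo`, a hypothetical stand-in) in place of the Java objects returned by the parser, and it renames `check_and_adjust_version` to `complete_version` for the illustration; the logic mirrors `build_os_version` and `check_and_adjust_version` as shown above but is not the plugin's code.

```ruby
# Hypothetical stand-in for the parser's OS object: four string-or-nil segments.
OsInfo = Struct.new(:major, :minor, :patch, :patch_minor)

# Joins the OS version segments. A numeric major ("10") is followed by a dot,
# a named major ("Vista") by a space, as in the plugin's build_os_version.
def build_os_version(os)
  major = os.major
  return nil unless major
  sep = major.to_i.to_s == major ? '.' : ' '
  version = major.dup
  if os.minor
    version << sep << os.minor
    if os.patch
      version << '.' << os.patch
      version << '.' << os.patch_minor if os.patch_minor
    end
  end
  version
end

# Appends a fourth numeric segment when the raw UA string carries one
# directly after the reconstructed version (e.g. "89.0.4389" + ".90").
# Returns nil when the version does not occur verbatim in the UA string.
def complete_version(ua_source, version)
  start = ua_source.index(version)
  return nil if start.nil?
  i = start + version.size
  if ua_source[i] == '.'
    stop = ua_source.index(' ', i + 1)
    if stop
      candidate = ua_source[(i + 1)...stop]
      version = "#{version}.#{candidate}" if candidate == candidate.to_i.to_s
    end
  end
  version
end
```

Under these assumptions, `build_os_version(OsInfo.new('10', '11'))` yields `"10.11"`, and `complete_version('Mozilla/5.0 Chrome/89.0.4389.90 Safari/537.36', '89.0.4389')` yields `"89.0.4389.90"`.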