log_sense 1.9.0 → 2.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: ccf396e27466411bfa603709787126290e24d0652ac5cc6b96bad008d11ccd6d
- data.tar.gz: 4cee6242192c1c38273c7a9fa55c2d55ea18aa65c6328932a27f944fa8b25cbd
+ metadata.gz: 288570381159c730801985845d064e62fa8ad08ee6b44f48ebf2928e60e80e47
+ data.tar.gz: 64d80612d568f4fd1991257d2755ec38ef94543b1fd451097efd6e9006ab551f
  SHA512:
- metadata.gz: 25874781f2e012a40de832d553a1a3bd0cfa0fd232988e482a2e6ceaf6c3cdd632bbe5059bbb134f4fc3029add42ff0e51311d9c6c3a61feedebf69df1f9c8b6
- data.tar.gz: 0e9733a78e5b972ed20c657983e04f7d238a14646e91a0c68e2d4594c4fbaecacc5271ec1c3e15949d590b86f04049b837e448b1473f9db60a8c2a3673074a02
+ metadata.gz: 6352f42ccbd453e9adf4af83372ad8c7113f58d419e502e444dad2b3f3ad5e54b1f31f077631040ea473a9b2976e38ff2b8960cb91091f65830ed03b987cb3e8
+ data.tar.gz: 4f01dc9e7ea6a53d983d51508e69710999741f6e5c74b49c2ab42e45a312f3364a07f7ff58fe6dd8e852f08d67d1df110a86c7e0330c6301e522bf8c595839f3
data/CHANGELOG.org CHANGED
@@ -2,6 +2,21 @@
  #+AUTHOR: Adolfo Villafiorita
  #+STARTUP: showall
 
+ * 2.0.1
+
+ - Add GitHub action for publishing to RubyGems after repeated failures
+ with authentication from the command line
+
+ * 2.0.0 (Not released)
+
+ - World Map
+ - Dark mode
+ - Fix link colors in sidebar
+ - Bars in the statuses bar plot are now colored according to status
+ - Add "statuses by day" in Rails report
+ - Enlarge "errors" and "potential attacks" reports
+ - Various smaller fixes
+
  * 1.9.0
 
  - Perform calculation on HTML pages only
data/Gemfile CHANGED
@@ -1,6 +1,6 @@
  source "https://rubygems.org"
 
- # Specify your gem's dependencies in apache_log_report.gemspec
+ # Specify your gem's dependencies in log_sense.gemspec
  gemspec
 
- gem "rake", "~> 12.0"
+ gem "rake", "~> 13.0"
data/Gemfile.lock CHANGED
@@ -1,12 +1,12 @@
  PATH
  remote: .
  specs:
- log_sense (1.7.0)
- browser
- ipaddr
- iso_country_codes
- sqlite3
- terminal-table
+ log_sense (2.0.0)
+ browser (~> 5.3.0)
+ ipaddr (~> 1.2.0)
+ iso_country_codes (~> 0.7.0)
+ sqlite3 (~> 2.0.0)
+ terminal-table (~> 3.0.0)
 
  GEM
  remote: https://rubygems.org/
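
The lock file now records pessimistic (`~>`) constraints for the runtime dependencies instead of bare gem names. As a rough illustration of what such constraints accept, the sketch below checks a few versions with RubyGems' `Gem::Requirement`. The constraints and the locked versions are taken from this diff; `2.1.0` is a made-up candidate used only for contrast, and the helper `allowed?` is hypothetical.

```ruby
require "rubygems"

# Constraints copied from the new Gemfile.lock and Gemfile in this diff.
constraints = {
  "sqlite3" => "~> 2.0.0", # >= 2.0.0 and < 2.1
  "rake"    => "~> 13.0"   # >= 13.0 and < 14
}

# Hypothetical helper: true when `version` satisfies `constraint`.
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts allowed?(constraints["sqlite3"], "2.0.3") # locked version in this diff
puts allowed?(constraints["sqlite3"], "2.1.0") # made-up candidate, outside the range
puts allowed?(constraints["rake"], "13.2.1")   # new locked rake
puts allowed?(constraints["rake"], "12.3.3")   # old locked rake
```

With three-segment constraints such as `~> 2.0.0`, only patch-level updates are accepted, which is stricter than the two-segment `~> 13.0` used for rake.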
@@ -17,22 +17,22 @@ GEM
  reline (>= 0.3.8)
  io-console (0.7.2)
  ipaddr (1.2.6)
- irb (1.13.1)
+ irb (1.14.0)
  rdoc (>= 4.0.0)
  reline (>= 0.4.2)
  iso_country_codes (0.7.8)
  mini_portile2 (2.8.7)
- minitest (5.23.1)
+ minitest (5.24.1)
  psych (5.1.2)
  stringio
- rake (12.3.3)
+ rake (13.2.1)
  rdoc (6.7.0)
  psych (>= 4.0.0)
- reline (0.5.8)
+ reline (0.5.9)
  io-console (~> 0.5)
- sqlite3 (2.0.2)
+ sqlite3 (2.0.3)
  mini_portile2 (~> 2.8.0)
- stringio (3.1.0)
+ stringio (3.1.1)
  terminal-table (3.0.2)
  unicode-display_width (>= 1.1.1, < 3)
  unicode-display_width (2.5.0)
@@ -41,10 +41,10 @@ PLATFORMS
  ruby
 
  DEPENDENCIES
- debug
+ debug (~> 1.9.0)
  log_sense!
- minitest
- rake (~> 12.0)
+ minitest (~> 5.24.0)
+ rake (~> 13.0)
 
  BUNDLED WITH
  2.5.3
data/README.org CHANGED
@@ -4,58 +4,89 @@
 
  * Introduction
 
- LogSense generates reports and statistics from Apache and Ruby on Rails log
- files. All the statistics you need to monitor your application, its
- performances, and how users access your app. Since it collects data from logs,
- there is no need for cookies or other tracking technologies.
-
- LogSense is Written in Ruby, it runs from the command line, it is
- fast, and it can be installed on any system with a relatively recent
- version of Ruby. We tested on Ruby 2.6.9, Ruby 3.0.x and later.
-
- When generating reports, LogSense reports the following data:
-
- Visitors, hits, unique visitors, bandwidth used
- Most accessed HTML pages
- Most accessed resources
- Missed resources (also by IP) which helps highlight
- potential attacks
- Response statuses
- Referers
- OS, browsers, and devices
- IP Country location, thanks to the DP-IP lite country DB
- Streaks: resources accessed by a given IP over time
- Performance of Rails requests
- Rails Fatal Errors (with reference to the logs)
+ LogSense generates reports and statistics from Ruby on Rails and Apache/Nginx
+ log files.
+
+ Main features:
+
+ - Statistics for Rails app in production and Web server logs (combined format,
+ which can be produced both by Apache and Nginx)
+ - Reports on performances, errors, visitors, and devices used to access your
+ websites and webapps[fn:: LogSense parses also the data generated by the
+ BrowserInfo gem, providing additional information for Rails apps, including
+ devices, platforms and number of accesses to methods by device type.].
+ - Can combine one or more log files
+ - No need for cookies or other tracking technologies (but you need access to
+ your log files)
+ - Filters allow to analyze specific periods distinguish traffic generated by
+ self polls and crawlers.
+ - Reports can be generated in HTML, txt, ufw, and SQLite. HTML reports are
+ responsive and come with dark and light theme.
 
- LogSense parses also the data generated by BrowserInfo, providing additional
- information for Rails apps, including devices and platforms and number of
- accesses to methods by device type.
+ LogSense is Written in Ruby, it runs from the command line, it is fast, and it
+ can be installed on any system with a relatively recent version of Ruby. We
+ use it with Ruby 3.1.4 and 3.3.0.
 
- A special output format =ufw= generates rules for the [[https://launchpad.net/ufw][Uncomplicated
- Firewall]] to blacklist IPs requesting URLs matching a specific pattern.
-
- Filters from the command line allow to analyze specific periods and
- distinguish traffic generated by self polls and crawlers.
+ It is fast. On a ThinkPad P16, a 277M log file is parsed in 15 seconds,
+ processing, that is, about 7740 events per second; a 569M log file is parsed in
+ 50 seconds, that is, about 4700 events per second.
 
- LogSense generates HTML, txt, ufw, and SQLite outputs.
 
- ** Rails Report Structure
+ ** Rails Production Report
 
  #+ATTR_HTML: :width 80%
  [[file:./screenshots/rails-screenshot.png]]
 
-
- ** Apache Report Structure
+ LogSense understands the Rails *production log* and generates the following
+ reports in TXT and HTML:
+
+ - Daily Distribution
+ - Time Distribution
+ - Statuses
+ - Statuses by Day
+ - Rails Performance
+ - Controller and Methods by Device
+ - Fatal Events
+ - Internal Server Errors
+ - Errors
+ - Potential Attacks
+ - Browsers
+ - Platforms
+ - IPs
+ - Countries
+ - IP per hour
+ - Sessions
+
+ ** Apache/Nginx Report
 
  #+ATTR_HTML: :width 80%
- [[file:./screenshots/apache-screenshot.png]]
-
+ [[file:./screenshots/combined_log-screenshot.png]]
+
+ LogSense reads the Apache/Nginx *combined log* format and generates the
+ following reports in TXT and HTML:
+
+ - Time Distribution
+ - 20_ and 30_ on HTML pages
+ - 20_ and 30_ on other resources
+ - 40_ and 50_x on HTML pages
+ - 40_ and 50_ on other resources
+ - 40_ and 50_x on HTML pages by IP
+ - 40_ and 50_ on other resources by IP
+ - Statuses
+ - Statuses by Day
+ - Browsers
+ - Platforms
+ - IPs
+ - Countries
+ - IP per hour
+ - Combined Platform Data
+ - Referers
+ - Sessions
 
  ** UFW Report
 
- The output format =ufw= generates directives for Uncomplicated
- Firewall blacklisting IPs requesting URLs matching a given pattern.
+ The =ufw= output format generates directives for Uncomplicated Firewall,
+ blacklisting IPs requesting URLs matching a given pattern.
 
  We use it to blacklist IPs requesting WordPress login pages on our
  websites... since we don't use WordPress for our websites.
@@ -73,40 +104,55 @@ ufw deny from 185.255.134.18
  ...
  #+end_src
 
-
- * An important word of warning
-
- [[https://owasp.org/www-community/attacks/Log_Injection][Log poisoning]] is a technique whereby attackers send requests with invalidated
- user input to forge log entries or inject malicious content into the logs.
-
- log_sense sanitizes entries of HTML reports, to try and protect from log
- poisoning. *Log entries and URLs in SQLite3, however, are not sanitized*:
- they are stored and read from the log. This is not, in general, an issue,
- unless you use the data from SQLite in environments in which URLs can be
- opened or code executed.
-
- * Motivation
-
- LogSense moves along the lines of tools such as [[https://goaccess.io/][GoAccess]] and [[https://umami.is/][Umami]], focusing on
- *privacy*, *data-ownership*, and *simplicity*: no need to install JavaScript
- snippets, no tracking cookies, just plain and simple log analysis.
-
- LogSense is also inspired by *static websites generators*: statistics are
- generated from the command line and accessed as static HTML files. This
- significantly reduces the attack surface of your web server and installation
- headaches. We have, for instance, a cron job running on our servers, generating
- statistics at night. The generated files are then made available on a private
- area on the web.
-
  * Installation
 
  #+begin_src bash
  gem install log_sense
  #+end_src
 
+ If you want to collect information about browsers, platform and devices when
+ generating Rails reports, add the =browser= gem to your bundle and the
+ following code to =application_controller.rb=:
+
+ #+begin_example ruby
+ # Gemfile
+ gem "browser"
+ #+end_example
+
+ #+begin_example ruby
+ # application_controller.rb
+ class ApplicationController < ActionController::Base
+
+ # [...]
+
+ before_action do |controller|
+ user_agent = request.env['HTTP_USER_AGENT']
+ ip = request.env['REMOTE_ADDR']
+
+ hashed_ip = Digest::SHA256.hexdigest ip
+ b = Browser.new(user_agent)
+ now = DateTime.now
+
+ logger = Rails.logger
+ browser_data = [
+ b.name, b.platform, b.device.name,
+ controller.class.name, controller.action_name,
+ request.format.symbol,
+ hashed_ip,
+ now
+ ]
+
+ browser_data_str = browser_data.map { |x| "\"#{x}\"" }.join(',')
+ logger.info "BrowserInfo: #{browser_data_str}"
+ end
+
+ # [...]
+ end
+ #+end_example
+
  * Usage
 
- #+begin_src bash :results raw output :wrap example
+ #+begin_src bash :results raw output :wrap example :exports both
  log_sense --help
  #+end_src
 
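
The `before_action` hook added to the README above writes one `BrowserInfo:` line per request, with eight double-quoted, comma-separated values. To make that format concrete, here is a minimal sketch of reading such a line back. It is not LogSense's parser: the helper `parse_browser_info`, the field names, and the sample log line (including its logger prefix and the shortened hash) are all assumptions made for illustration.

```ruby
# Field order follows the browser_data array in the README snippet.
# The symbol names themselves are hypothetical.
BROWSER_INFO_FIELDS = %i[
  browser platform device controller action format hashed_ip time
].freeze

# Hypothetical helper: returns a Hash of fields, or nil when the line
# does not carry a complete BrowserInfo payload.
def parse_browser_info(line)
  marker = "BrowserInfo: "
  start = line.index(marker)
  return nil if start.nil?

  payload = line[(start + marker.size)..]
  # Each value was written as "\"#{x}\"", so collect the quoted chunks.
  values = payload.scan(/"([^"]*)"/).flatten
  return nil unless values.size == BROWSER_INFO_FIELDS.size

  BROWSER_INFO_FIELDS.zip(values).to_h
end

# Made-up sample line; a real hashed_ip would be a 64-character SHA256 digest.
sample = 'I, [2024-07-01T10:00:00.000000 #1234]  INFO -- : BrowserInfo: ' \
         '"Chrome","macOS","Unknown","PagesController","index","html",' \
         '"abc123","2024-07-01T10:00:00+00:00"'

info = parse_browser_info(sample)
puts info[:browser]
puts info[:controller]
```

Note that the hook does not escape double quotes inside values, so a value containing `"` would confuse this simple scan; a stricter reader would need to account for that.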
@@ -131,7 +177,7 @@ area on the web.
  -v, --version Prints version information
  -h, --help Prints this help
 
- This is version 1.8.0
+ This is version 2.0.0
 
  Output formats:
 
@@ -146,6 +192,51 @@ log_sense -f apache -i access.log -t txt > access-data.txt
  log_sense -f rails -i production.log -t html -o performance.html
  #+end_example
 
+ * Motivation
+
+ LogSense focuses on *privacy*, *data-ownership*, and *simplicity*: no need to
+ install JavaScript snippets, no tracking cookies, just plain and simple log
+ analysis.
+
+ LogSense is also inspired by *static websites generators*: statistics are
+ generated from the command line and accessed as static HTML files. This
+ significantly reduces the attack surface of your web server and installation
+ headaches. We have a cron job running on our servers, generating statistics at
+ night. The generated files are then made available on a private area on the
+ web and rotated monthly.
+
+ * An important word of warning on SQLite3 output
+
+ [[https://owasp.org/www-community/attacks/Log_Injection][Log poisoning]] is a technique whereby attackers send requests with invalidated
+ user input to forge log entries or inject malicious content into the logs.
+
+ log_sense sanitizes entries of HTML reports, to try and protect from log
+ poisoning. *Log entries and URLs in SQLite3 tables, however, are not
+ sanitized*: they are read and stored from the log as they are. This is not, in
+ general, an issue, unless you use the unsanitized data from SQLite as it is in
+ environments where URL can be opened or code executed using the URLs as
+ argument.
+
+ * Change Log
+
+ See the [[file:CHANGELOG.org][CHANGELOG]] file.
+
+ * Compatibility
+
+ LogSense should run on any system on which a recent version of Ruby
+ runs. We tested it with Ruby 2.6.9 and Ruby 3.0.x, and Ruby 3.3.x
+
+ * Author and Contributors
+
+ [[https://shair.tech][Shair.Tech]]
+
+ * Credits
+
+ - HTML reports use [[https://get.foundation/][Zurb Foundation]], [[https://www.datatables.net/][Data Tables]], and [[https://echarts.apache.org/en/index.html][Apache ECharts]]
+ - The textual format is compatible with [[https://orgmode.org/][Org Mode]] and can be further processed to
+ any format [[https://orgmode.org/][Org Mode]] can be exported to, including HTML and PDF, with the word
+ of warning in the section above concerning log poisoning.
+
  * Code Structure
 
  The code implements a pipeline, with the following steps:
@@ -164,64 +255,26 @@ The code implements a pipeline, with the following steps:
  building the reports.
  5. *Emitter* generates reports from shaped data using ERB.
 
- The architecture and the structure of the code is far from being nice,
- for historical reason and for a bunch of small differences existing
- between the input and the outputs to be generated. This usually ends
- up with modifications to the code that have to be replicated in
- different parts of the code and in interferences.
-
- Among the points I would like to address:
-
- The execution pipeline in the main script has a few exceptions to
- manage SQLite reading/dumping and ufw report. A linear structure
- would be a lot nicer.
- Two different classes are defined for steps 1, 2, and 4, to manage,
- respectively, Apache and Rails logs. These classes inherit from a
- common ancestor (e.g. ApacheParser and RailsParser both inherit from
- Parser), but there is still too little code shared. A nicer
- approach would be that of identifying a common DB structure and
- unify the pipeline up to (or including) the generation of
- reports. There are a bunch of small different things to highlight in
- reports, which still make this difficult. For instance, the country
- report for Apache reports size of TX data, which is not available
- for Rail reports.
- Geolocation could become a lot more efficient if performed in
- SQLite, rather than in Ruby
- The distinction between Aggregation, Shaping, and Emission is a too
- fine-grained and it would be nice to be able to cleanly remove one
- of the steps.
-
-
- * Change Log
-
- See the [[file:CHANGELOG.org][CHANGELOG]] file.
-
- * Compatibility
-
- LogSense should run on any system on which a recent version of Ruby
- runs. We tested it with Ruby 2.6.9 and Ruby 3.x.x.
-
- Concerning the outputs:
 
- HTML reports use [[https://get.foundation/][Zurb Foundation]], [[https://www.datatables.net/][Data Tables]], and [[https://vega.github.io/vega-lite/][Vega Light]], which
- are all downloaded from a CDN
- The textual format is compatible with [[https://orgmode.org/][Org Mode]] and can be further
- processed to any format [[https://orgmode.org/][Org Mode]] can be exported to, including HTML
- and PDF, with the word of warning in the section above.
+ * Todo
 
- * Author and Contributors
-
- [[https://shair.tech][Shair.Tech]]
+ See [[todo.org]]
 
  * Known Bugs
 
  We have been running LogSense for quite a few years with no particular issues.
  There are no known bugs; there is an unknown number of unknown bugs.
 
- * License
+ You are most welcome to report issues and missing features, using the Issue
+ tracker.
+
+ * Licenses
 
- Source code distributed under the terms of the [[http://opensource.org/licenses/MIT][MIT License]].
+ LogSense is distributed under the terms of the [[http://opensource.org/licenses/MIT][MIT License]].
 
- Geolocation is made possible by the DB-IP.com IP to City database,
- released under a CC license.
+ Geolocation is made possible by [[https://db-ip.com/][dbip]]'s IP to City database, released under a
+ CC license.
 
+ The world map is distributed under the terms of the [[http://opensource.org/licenses/MIT][MIT License]] by Pareto
+ Softare, [[https://simplemaps.com/][Simplemaps.com]]. It is used in LogSense with some changes to the class
+ names and ids.
data/exe/log_sense CHANGED
@@ -114,7 +114,7 @@ elsif @options[:output_format] == "ufw"
  }
  ips_and_urls.each do |ip, urls|
  puts "# #{urls[0..10].uniq.join(' ')}"
- puts "ufw deny from #{ip}"
+ puts "ufw insert 1 deny from #{ip}"
  puts
  end
 
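
The hunk above switches the emitted rule from `ufw deny from IP` to `ufw insert 1 deny from IP`. ufw evaluates rules in order and stops at the first match, so inserting at position 1 puts the deny rule ahead of any existing allow rules instead of appending it after them. The sketch below reproduces the loop as a standalone function so the output can be inspected; the function name `ufw_rules` and the sample URLs are assumptions, and the IP is the one that appears in the README example.

```ruby
# Hypothetical standalone version of the loop in exe/log_sense.
# Returns the lines that the script would print for each offending IP.
def ufw_rules(ips_and_urls)
  ips_and_urls.flat_map do |ip, urls|
    [
      "# #{urls[0..10].uniq.join(' ')}", # comment listing the requested URLs
      "ufw insert 1 deny from #{ip}",    # rule placed at the top of the chain
      ""                                 # blank separator line
    ]
  end
end

# Made-up URLs, chosen to resemble WordPress probing.
sample = {
  "185.255.134.18" => ["/wp-login.php", "/wp-login.php", "/xmlrpc.php"]
}

puts ufw_rules(sample)
```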
@@ -132,6 +132,7 @@ else
 
  warn "Grouping IPs by country ..." if @options[:verbose]
  country_col = geolocated_data[0].size - 1
+ @data[:ips] = geolocated_data
  @data[:countries] = geolocated_data.group_by { |x| x[country_col] }
  elsif @options[:geolocation] && @data[:ips].size == 0
  warn "Skipping geolocation: no IP found" if @options[:verbose]
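
The added line stores the geolocated rows back into `@data[:ips]`, so the IP report and the country report work from the same rows. A minimal sketch of the grouping step, assuming each row is an array whose last column holds the country code (the row layout and the sample values are made up for illustration):

```ruby
# Made-up geolocated rows: [ip, hits, country]. The real rows may carry
# more columns; only "country is the last column" is taken from the diff.
geolocated_data = [
  ["203.0.113.7", 42, "IT"],
  ["198.51.100.9", 17, "US"],
  ["203.0.113.8", 5, "IT"]
]

# Same expressions as in exe/log_sense, with local variables instead of @data.
country_col = geolocated_data[0].size - 1
ips = geolocated_data
countries = geolocated_data.group_by { |row| row[country_col] }

countries.each do |country, rows|
  puts "#{country}: #{rows.map(&:first).join(', ')}"
end
```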
@@ -78,7 +78,8 @@ module LogSense
  extra_cols = ""
  end
 
- @ips = @db.execute %(SELECT ip, count(ip) #{extra_cols} from #{@table}
+ @ips = @db.execute %(SELECT ip, count(ip) #{extra_cols}
+ from #{@table}
  where #{filter}
  group by ip
  order by count(ip) desc
@@ -169,7 +170,7 @@ module LogSense
  # name is used to give the name to the column with formatted time
  def ip_by_time_query(name, format_string)
  %(SELECT ip,
- strftime("%H", #{@date_field}) as #{name},
+ strftime('#{format_string}', #{@date_field}) as #{name},
  count(#{@url_field}) from #{@table}
  where #{filter} and ip != "" and
  #{@url_field} != "" and
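
In the old line the format was hard-coded to `"%H"`, so the `format_string` argument was ignored and every caller got an hourly breakdown. The new line interpolates the argument and quotes it with single quotes, the standard SQL string delimiter. The sketch below is a simplified, standalone variant that only builds the SQL text; the keyword arguments, the table and column names, and the `GROUP BY` clause are assumptions, since the rest of the real query is not visible in this hunk.

```ruby
# Simplified, hypothetical variant of ip_by_time_query: it takes the
# table and column names as arguments instead of instance variables.
def ip_by_time_query(name, format_string, table:, date_field:, url_field:)
  %(SELECT ip,
           strftime('#{format_string}', #{date_field}) AS #{name},
           count(#{url_field})
    FROM #{table}
    WHERE ip != '' AND #{url_field} != ''
    GROUP BY ip, #{name})
end

# "%H" yields the hour (00-23); "%Y-%m-%d" yields the calendar day.
hourly = ip_by_time_query("hour", "%H",
                          table: "LogLine", date_field: "datetime", url_field: "path")
daily = ip_by_time_query("day", "%Y-%m-%d",
                         table: "LogLine", date_field: "datetime", url_field: "path")

puts hourly
puts daily
```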