log_sense 1.0.11 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: b0b958a6580aff4f478248d606897af928ffe23d6e8d8f43abcdeca50fbac6db
- data.tar.gz: 9ecd13a65d80653e50b242938bce8083ff1e9ec590260fea3f8b5fc04b5e6a81
+ metadata.gz: d96a22ce71f0c0266811faa1853981d1211b93687bfd2e3077b91189feb5742b
+ data.tar.gz: 014ad1230a6a83b379310ba2b93b16bfd956db19a8f087bd2c216b5e77211620
  SHA512:
- metadata.gz: 5a0c3a5e40a7e7cc74df707062fbd7cf958341696da9388b2d941f83fc219cdbcc8a258660ad85daa445bb781b3ca736350ac961907ff4220e6844b9c95bba55
- data.tar.gz: e670fe25f1f8b368b77cb7259aac46fbb551751423ce322e50f5e48087703a01bdb5a08d242522db6a04693f1676cc7e123212d4ddf771260bd1febdec272835
+ metadata.gz: 8fe6fb1a329b0cabae06199f13c70936c9c7aeaa6c99142a683414ac380bfc23fecb14d3588803f2f9afc24483bfb8f429f867633ce5029c3dc23df1c713dea4
+ data.tar.gz: 10fe88f045845a5573f8ba7aedde8d3475a0f390e6025746799525be685c6db11e5eb9148e9de3f8f05a7950ae9c912f2d67e454a285de08143789db374818c7
data/CHANGELOG.org CHANGED
@@ -2,26 +2,13 @@
  #+AUTHOR: Adolfo Villafiorita
  #+STARTUP: showall
 
- * Unreleased
+ * Changes in log_sense 1.1.2
+ <2021-12-17 Fri>
 
- This changes are in the repository but not yet released to Rubygems.
+ - Added Rails Log HTML output
 
- ** New Functions and Changes
+ * Changes in log_sense 1.1.1 and earlier
+ <2021-12-17 Fri>
 
- ** Fixes
-
- ** Documentation
-
- ** Code
-
-
- * Version 1.0.0
-
- ** New Functions and Changes
-
- ** Fixes
-
- ** Documentation
-
- ** Code
+ - In the Git commit messages (not very informative, I am afraid).
 
data/README.org CHANGED
@@ -14,21 +14,15 @@ and [[https://umami.is/][Umami]], focusing on privacy and data-ownership: the da
  generated by LogSense is stored on your computer and owned by
  you (like it should be).
 
- LogSense is also inspired by static websites generators:
- statistics are generated from the command line and accessed as static
- HTML files. By generating static resources, LogSense
- significantly reduces the attack surface of your webserver and
- installation headaches.
+ LogSense is also inspired by *static websites generators*: statistics
+ are generated from the command line and accessed as static HTML files.
+ By generating static resources, LogSense significantly reduces the
+ attack surface of your webserver and installation headaches.
 
  We have, for instance, a cron job running on our servers, generating
  statistics at night. The generated files are then made available on a
  private area on the web.
 
- Statistics are generated from Apache log formats in the =combined=
- format and from Rails logs. Reports are tailored, but not limited, to
- web servers serving static websites. No need to install Java Script
- code on your websites, no cookies installed, no user tracking.
-
  LogSense reports the following data:
 
  - Visitors, hits, unique visitors, bandwidth used
@@ -62,20 +56,29 @@ LogSense generates HTML, txt (Org Mode), and SQLite outputs.
 
  #+RESULTS:
  #+begin_example
- Usage: apache_log_report [options] [logfile]
- -l, --limit=N Number of entries to show (defaults to 30)
+ Usage: log_sense [options] [logfile]
+ --title=TITLE Title to use in the report
+ -f, --input-format=FORMAT Input format (either rails or apache)
+ -i, --input-file=INPUT_FILE Input file
+ -t, --output-format=FORMAT Output format: html, org, txt, sqlite. See below for available formats
+ -o, --output-file=OUTPUT_FILE Output file
  -b, --begin=DATE Consider entries after or on DATE
  -e, --end=DATE Consider entries before or on DATE
- -i, --ignore-crawlers Ignore crawlers
- -p, --ignore-selfpoll Ignore apaches self poll entries (from ::1)
- --only-crawlers Perform analysis on crawlers only
- -u, --prefix=PREFIX Prefix to add to all plots (used to run multiple analyses in the same dir)
- -w, --suffix=SUFFIX Suffix to add to all plots (used to run multiple analyses in the same dir)
- -c, --code-export=WHAT Control :export directive in Org Mode code blocks (code, results, *both*, none)
- -f, --format=FORMAT Output format: html, org, sqlite. Defaults to org mode
+ -l, --limit=N Number of entries to show (defaults to 30)
+ -c, --crawlers=POLICY Decide what to do with crawlers (applies to Apache Logs)
+ -n, --no-selfpolls Ignore self poll entries (requests from ::1; applies to Apache Logs)
  -v, --version Prints version information
  -h, --help Prints this help
- This is version 1.1.6
+
+ This is version 1.1.1
+
+ Output formats
+ apache parsing can produce the following outputs:
+ - sqlite
+ - html
+ rails parsing can produce the following outputs:
+ - sqlite
+ - txt
  #+end_example
 
  * Change Log
@@ -1,4 +1,5 @@
  require 'terminal-table'
+ require 'json'
  require 'erb'
  require 'ostruct'
 
@@ -14,6 +14,10 @@ module LogSense
  opt_parser = OptionParser.new do |opts|
  opts.banner = "Usage: log_sense [options] [logfile]"
 
+ opts.on("-tTITLE", "--title=TITLE", String, "Title to use in the report") do |n|
+ args[:title] = n
+ end
+
  opts.on("-fFORMAT", "--input-format=FORMAT", String, "Input format (either rails or apache)") do |n|
  args[:input_format] = n
  end
@@ -26,6 +26,10 @@ module LogSense
  @log_size = db.execute "SELECT count(started_at) from Event"
  @log_size = @log_size[0][0]
 
+ # TODO: I should make the names of events/size/etc uniform betweeen Apache and Rails Logs
+ # SAME AS ABOVE
+ @total_hits = @log_size
+
  # SAME AS ABOVE (but log_size is wrong in the case of Rails
  # logs, since an event takes more than one line)
  @events = db.execute "SELECT count(started_at) from Event"
@@ -88,11 +92,12 @@ module LogSense
 
  @statuses = db.execute "SELECT status, count(status) from Event where #{filter} group by status order by status"
 
+ @by_day_5xx = db.execute "SELECT date(started_at), count(started_at) from Event where substr(status, 1,1) == '5' and #{filter} group by date(started_at)"
  @by_day_4xx = db.execute "SELECT date(started_at), count(started_at) from Event where substr(status, 1,1) == '4' and #{filter} group by date(started_at)"
  @by_day_3xx = db.execute "SELECT date(started_at), count(started_at) from Event where substr(status, 1,1) == '3' and #{filter} group by date(started_at)"
  @by_day_2xx = db.execute "SELECT date(started_at), count(started_at) from Event where substr(status, 1,1) == '2' and #{filter} group by date(started_at)"
 
- @statuses_by_day = (@by_day_2xx + @by_day_3xx + @by_day_4xx).group_by { |x| x[0] }.to_a.map { |x|
+ @statuses_by_day = (@by_day_2xx + @by_day_3xx + @by_day_4xx + @by_day_5xx).group_by { |x| x[0] }.to_a.map { |x|
  [x[0], x[1].map { |y| y[1] }].flatten
  }
 
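For reference, a minimal sketch (with invented counts) of what the group_by/flatten step in this hunk produces: each row starts with the date, followed by the per-status-class counts recorded for that day.

    # Toy stand-ins for @by_day_2xx .. @by_day_5xx (values are made up).
    by_day_2xx = [["2021-12-01", 120], ["2021-12-02", 90]]
    by_day_3xx = [["2021-12-01", 10]]
    by_day_4xx = [["2021-12-01", 4], ["2021-12-02", 2]]
    by_day_5xx = [["2021-12-02", 1]]

    statuses_by_day = (by_day_2xx + by_day_3xx + by_day_4xx + by_day_5xx)
                        .group_by { |x| x[0] }
                        .to_a
                        .map { |x| [x[0], x[1].map { |y| y[1] }].flatten }
    # => [["2021-12-01", 120, 10, 4], ["2021-12-02", 90, 2, 1]]
    # Days with no entries for a given status class simply yield shorter rows.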
@@ -100,8 +105,12 @@ module LogSense
 
  @performance = db.execute "SELECT distinct(controller), count(controller), printf(\"%.2f\", min(duration_total_ms)), printf(\"%.2f\", avg(duration_total_ms)), printf(\"%.2f\", max(duration_total_ms)) from Event group by controller order by controller"
 
- @fatal = db.execute "SELECT strftime(\"%Y-%m-%d %H:%M\", started_at), ip, url, log_id FROM Event WHERE exit_status == 'F'"
+ @fatal = db.execute ("SELECT strftime(\"%Y-%m-%d %H:%M\", started_at), ip, url, error.description, event.log_id FROM Event JOIN Error ON event.log_id == error.log_id WHERE exit_status == 'F'") || [[]]
+
+ @internal_server_error = (db.execute "SELECT strftime(\"%Y-%m-%d %H:%M\", started_at), status, ip, url, error.description, event.log_id FROM Event JOIN Error ON event.log_id == error.log_id WHERE status is 500") || [[]]
 
+ @error = (db.execute "SELECT log_id, context, description, count(log_id) from Error GROUP BY description") || [[]]
+
  data = {}
  self.instance_variables.each do |variable|
  var_as_symbol = variable.to_s[1..-1].to_sym
@@ -1,4 +1,5 @@
  require 'sqlite3'
+ require 'byebug'
 
  module LogSense
  module RailsLogParser
@@ -24,7 +25,7 @@ module LogSense
  allocations INTEGER,
  comment TEXT
  )'
-
+
  ins = db.prepare("insert into Event(
  exit_status,
  started_at,
@@ -44,6 +45,22 @@ module LogSense
  )
  values (#{Array.new(15, '?').join(', ')})")
 
+
+ db.execute 'CREATE TABLE IF NOT EXISTS Error(
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
+ log_id TEXT,
+ context TEXT,
+ description TEXT
+ )'
+
+ ins_error = db.prepare("insert into Error(
+ log_id,
+ context,
+ description
+ )
+ values (?, ?, ?)")
+
+
  # requests in the log might be interleaved.
  #
  # We use the 'pending' variable to progressively store data
@@ -65,8 +82,14 @@ module LogSense
  # Different requests might be interleaved, of course
 
  File.readlines(filename).each do |line|
- # We discard LOG_LEVEL != 'I'
- next if line[0] != 'I' and line[0] != 'F'
+ # I and F for completed requests, [ is for error messages
+ next if line[0] != 'I' and line[0] != 'F' and line[0] != '['
+
+ data = self.match_and_process_error line
+ if data
+ ins_error.execute(data[:log_id], data[:context], data[:description])
+ next
+ end
 
  data = self.match_and_process_start line
  if data
@@ -145,39 +168,6 @@ module LogSense
  end
  end
 
-
- data = self.match_and_process_completed_no_alloc line
- if data
- id = data[:log_id]
-
- # it might as well be that the first event started before
- # the log. With this, we make sure we add only events whose
- # start was logged and parsed
- if pending[id]
- event = data.merge (pending[id] || {})
-
- ins.execute(
- event[:exit_status],
- event[:started_at],
- event[:ended_at],
- event[:log_id],
- event[:ip],
- "#{DateTime.parse(event[:ended_at]).strftime("%Y-%m-%d")} #{event[:ip]}",
- event[:url],
- event[:controller],
- event[:html_verb],
- event[:status],
- event[:duration_total_ms],
- event[:duration_views_ms],
- event[:duration_ar_ms],
- event[:allocations],
- event[:comment]
- )
-
- pending.delete(id)
- end
- end
-
  end
 
  db
@@ -189,8 +179,30 @@ module LogSense
  URL = /(?<url>[^"]+)/
  IP = /(?<ip>[0-9.]+)/
  STATUS = /(?<status>[0-9]+)/
+ STATUS_IN_WORDS = /(OK|Unauthorized|Found|Internal Server Error|Bad Request|Method Not Allowed|Request Timeout|Not Implemented|Bad Gateway|Service Unavailable)/
  MSECS = /[0-9.]+/
 
+ # Error Messages
+ # [584cffcc-f1fd-4b5c-bb8b-b89621bd4921] ActionController::RoutingError (No route matches [GET] "/assets/foundation-icons.svg"):
+ # [fd8df8b5-83c9-48b5-a056-e5026e31bd5e] ActionView::Template::Error (undefined method `all_my_ancestor' for nil:NilClass):
+ # [d17ed55c-f5f1-442a-a9d6-3035ab91adf0] ActionView::Template::Error (undefined method `volunteer_for' for #<DonationsController:0x007f4864c564b8>
+ CONTEXT = /(?<context>[^ ]+Error)/
+ ERROR_REGEXP = /^\[#{ID}\] #{CONTEXT} \((?<description>.*)\):/
+
+ def self.match_and_process_error line
+ matchdata = ERROR_REGEXP.match line
+ if matchdata
+ {
+ log_id: matchdata[:id],
+ context: matchdata[:context],
+ description: matchdata[:description]
+ }
+ else
+ nil
+ end
+ end
+
+
  # I, [2021-10-19T08:16:34.343858 #10477] INFO -- : [67103c0d-455d-4fe8-951e-87e97628cb66] Started GET "/grow/people/471" for 217.77.80.35 at 2021-10-19 08:16:34 +0000
  STARTED_REGEXP = /I, \[#{TIMESTAMP} #[0-9]+\] INFO -- : \[#{ID}\] Started #{VERB} "#{URL}" for #{IP} at/
 
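As an illustration of how the new error matching in this hunk is intended to behave on the sample lines quoted in its comments, here is a minimal, self-contained sketch. ID is defined elsewhere in rails_log_parser.rb and is not part of this diff; a UUID-style named group is assumed below.

    # Assumed placeholder for the ID pattern defined elsewhere in the parser.
    ID      = /(?<id>[0-9a-f-]+)/
    CONTEXT = /(?<context>[^ ]+Error)/
    ERROR_REGEXP = /^\[#{ID}\] #{CONTEXT} \((?<description>.*)\):/

    line = '[584cffcc-f1fd-4b5c-bb8b-b89621bd4921] ActionController::RoutingError ' \
           '(No route matches [GET] "/assets/foundation-icons.svg"):'
    m = ERROR_REGEXP.match(line)
    m[:id]          # => "584cffcc-f1fd-4b5c-bb8b-b89621bd4921"
    m[:context]     # => "ActionController::RoutingError"
    m[:description] # => 'No route matches [GET] "/assets/foundation-icons.svg"'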
@@ -209,11 +221,15 @@ module LogSense
  end
  end
 
+ # TODO: Add regexps for the performance data (Views ...). We have three cases (view, active records, allocations), (views, active records), (active records, allocations)
  # I, [2021-10-19T08:16:34.712331 #10477] INFO -- : [67103c0d-455d-4fe8-951e-87e97628cb66] Completed 200 OK in 367ms (Views: 216.7ms | ActiveRecord: 141.3ms | Allocations: 168792)
- COMPLETED_REGEXP = /I, \[#{TIMESTAMP} #[0-9]+\] INFO -- : \[#{ID}\] Completed #{STATUS} [^ ]+ in (?<total>#{MSECS})ms \(Views: (?<views>#{MSECS})ms \| ActiveRecord: (?<arec>#{MSECS})ms \| Allocations: (?<alloc>[0-9]+)\)/
+ # I, [2021-12-09T16:53:52.657727 #2735058] INFO -- : [0064e403-9eb2-439d-8fe1-a334c86f5532] Completed 200 OK in 13ms (Views: 11.1ms | ActiveRecord: 1.2ms)
+ # I, [2021-12-06T14:28:19.736545 #2804090] INFO -- : [34091cb5-3e7b-4042-aaf8-6c6510d3f14c] Completed 500 Internal Server Error in 66ms (ActiveRecord: 8.0ms | Allocations: 24885)
+ COMPLETED_REGEXP = /I, \[#{TIMESTAMP} #[0-9]+\] INFO -- : \[#{ID}\] Completed #{STATUS} #{STATUS_IN_WORDS} in (?<total>#{MSECS})ms \((Views: (?<views>#{MSECS})ms \| )?ActiveRecord: (?<arec>#{MSECS})ms( \| Allocations: (?<alloc>[0-9]+))?\)/
 
  def self.match_and_process_completed line
  matchdata = (COMPLETED_REGEXP.match line)
+ # exit_status = matchdata[:status].to_i == 500 ? "E" : "I"
  if matchdata
  {
  exit_status: "I",
@@ -231,29 +247,6 @@ module LogSense
  end
  end
 
- # I, [2021-12-09T16:53:52.657727 #2735058] INFO -- : [0064e403-9eb2-439d-8fe1-a334c86f5532] Completed 200 OK in 13ms (Views: 11.1ms | ActiveRecord: 1.2ms)
- COMPLETED_NO_ALLOC_REGEXP = /I, \[#{TIMESTAMP} #[0-9]+\] INFO -- : \[#{ID}\] Completed #{STATUS} [^ ]+ in (?<total>#{MSECS})ms \(Views: (?<views>#{MSECS})ms \| ActiveRecord: (?<arec>#{MSECS})ms\)/
-
- def self.match_and_process_completed_no_alloc line
- matchdata = (COMPLETED_NO_ALLOC_REGEXP.match line)
- if matchdata
- {
- exit_status: "I",
- ended_at: matchdata[:timestamp],
- log_id: matchdata[:id],
- status: matchdata[:status],
- duration_total_ms: matchdata[:total],
- duration_views_ms: matchdata[:views],
- duration_ar_ms: matchdata[:arec],
- allocations: -1,
- comment: ""
- }
- else
- nil
- end
- end
-
-
  # I, [2021-10-19T08:16:34.345162 #10477] INFO -- : [67103c0d-455d-4fe8-951e-87e97628cb66] Processing by PeopleController#show as HTML
  PROCESSING_REGEXP = /I, \[#{TIMESTAMP} #[0-9]+\] INFO -- : \[#{ID}\] Processing by (?<controller>[^ ]+) as/
 
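The single COMPLETED_REGEXP introduced above makes the removed COMPLETED_NO_ALLOC_REGEXP redundant: the Views and Allocations segments are now optional groups. A minimal sketch against the two sample lines quoted in the diff; TIMESTAMP and ID are defined elsewhere in the parser, so simple placeholders are assumed here.

    # Assumed placeholders for patterns defined elsewhere in rails_log_parser.rb.
    TIMESTAMP = /(?<timestamp>[^ ]+)/
    ID        = /(?<id>[0-9a-f-]+)/
    STATUS    = /(?<status>[0-9]+)/
    STATUS_IN_WORDS = /(OK|Unauthorized|Found|Internal Server Error|Bad Request|Method Not Allowed|Request Timeout|Not Implemented|Bad Gateway|Service Unavailable)/
    MSECS     = /[0-9.]+/
    COMPLETED_REGEXP = /I, \[#{TIMESTAMP} #[0-9]+\] INFO -- : \[#{ID}\] Completed #{STATUS} #{STATUS_IN_WORDS} in (?<total>#{MSECS})ms \((Views: (?<views>#{MSECS})ms \| )?ActiveRecord: (?<arec>#{MSECS})ms( \| Allocations: (?<alloc>[0-9]+))?\)/

    with_views    = 'I, [2021-12-09T16:53:52.657727 #2735058] INFO -- : [0064e403-9eb2-439d-8fe1-a334c86f5532] Completed 200 OK in 13ms (Views: 11.1ms | ActiveRecord: 1.2ms)'
    without_views = 'I, [2021-12-06T14:28:19.736545 #2804090] INFO -- : [34091cb5-3e7b-4042-aaf8-6c6510d3f14c] Completed 500 Internal Server Error in 66ms (ActiveRecord: 8.0ms | Allocations: 24885)'

    m1 = COMPLETED_REGEXP.match(with_views)
    m1[:total]  # => "13"
    m1[:views]  # => "11.1"
    m1[:alloc]  # => nil (no Allocations segment in this line)

    m2 = COMPLETED_REGEXP.match(without_views)
    m2[:views]  # => nil (no Views segment in this line)
    m2[:alloc]  # => "24885"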
@@ -289,7 +282,7 @@ module LogSense
  end
 
  # generate a unique visitor id from an event
- def unique_visitor_id event
+ def self.unique_visitor_id event
  "#{DateTime.parse(event[:started_at] || event[:ended_at] || "1970-01-01").strftime("%Y-%m-%d")} #{event[:ip]}"
  end
 
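Making unique_visitor_id a module method matters because the surrounding parser code consists of self.* methods, which could not call the previous instance-method form. Assuming the method lives in LogSense::RailsLogParser (as the rest of this file suggests), a quick sketch of the fixed call:

    require 'log_sense'

    # Hypothetical event hash with the two fields the method actually reads.
    event = { started_at: "2021-10-19T08:16:34", ip: "217.77.80.35" }
    LogSense::RailsLogParser.unique_visitor_id(event)
    # => "2021-10-19 217.77.80.35"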
@@ -0,0 +1,24 @@
+ <ul class="stats-list">
+ <li>
+ <%= data[:first_day].strftime("%b %d, %Y") %>
+ <span class="stats-list-label">From</span>
+ </li>
+ <li>
+ <%= data[:last_day].strftime("%b %d, %Y") %>
+ <span class="stats-list-label">To</span>
+ </li>
+ <li class="stats-list-positive">
+ <%= data[:total_days] %> <span class="stats-list-label">Days in Log</span>
+ </li>
+ <li class="stats-list-negative">
+ <%= data[:log_size] %> <span class="stats-list-label">Total Entries</span>
+ </li>
+ <li class="stats-list-negative">
+ <%= data[:selfpolls_size] %> <span class="stats-list-label">Self Polls Entries</span>
+ </li>
+ <li class="stats-list-negative">
+ <td><%= data[:crawlers_size] %></td>
+ <span class="stats-list-label">Crawlers Entries</span>
+ </li>
+ </ul>
+
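This new partial (and the two rewritten ones below) pulls its values from a data hash and formats dates with strftime, so the date entries must be Date/Time objects rather than strings. A minimal rendering sketch with invented values; the template filename is a placeholder, not the path used by the gem.

    require 'erb'
    require 'date'

    # Invented values standing in for the hash the report emitter builds.
    data = {
      first_day: Date.new(2021, 12, 1),
      last_day: Date.new(2021, 12, 17),
      total_days: 17,
      log_size: 10_423,
      selfpolls_size: 12,
      crawlers_size: 309
    }

    template = File.read("stats.html.erb") # placeholder filename
    puts ERB.new(template).result(binding)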
@@ -1,23 +1,21 @@
- <table class="table unstriped performance">
- <tbody>
- <tr>
- <th>Analysis started at</th>
- <td><%= data[:started_at].to_s %></td>
- </tr>
- <tr>
- <th>Analysis ended at</th>
- <td><%= data[:ended_at].to_s %></td>
- </tr>
- <tr>
- <th>Duration</th>
- <td><%= "%02d:%02d" % [data[:duration] / 60, data[:duration] % 60] %></td>
- </tr>
- <tr>
- <th>Events</th>
- <td><%= data[:log_size] %></td>
- </tr>
- <tr>
- <th>Parsed Events/sec</th>
- <td><%= "%.2f" % (data[:log_size] / data[:duration]) %></td></tr>
- </tbody>
- </table>
+ <ul class="stats-list">
+ <li>
+ <%= data[:started_at].strftime("%b %d, %Y @ %H:%M:%S") %>
+ <span class="stats-list-label">Analysis Started</span>
+ </li>
+ <li>
+ <%= data[:ended_at].strftime("%b %d, %Y @ %H:%M:%S") %>
+ <span class="stats-list-label">Analysis Ended</span>
+ </li>
+ <li class="stats-list-negative">
+ <%= "%02d:%02d" % [data[:duration] / 60, data[:duration] % 60] %>
+ <span class="stats-list-label">Duration</span>
+ </li>
+ <li class="stats-list-negative">
+ <%= data[:log_size] %> <span class="stats-list-label">Events</span>
+ </li>
+ <li class="stats-list-positive">
+ <td><%= "%.2f" % (data[:log_size] / data[:duration]) %>
+ <span class="stats-list-label">Parsed Events/sec</span>
+ </li>
+ </ul>
@@ -1,34 +1,23 @@
- <table class="table unstriped summary">
- <tr>
- <th>Input file</th>
- <td><b><%= (data[:log_file] || "stdin") %></b></td>
- </tr>
- <tr>
- <th class="period">Period Analyzed</th>
- <td class="period">
- <%= data[:first_day_in_analysis] %>
- --
- <%= data[:last_day_in_analysis] %>
- </td>
- </tr>
- <tr>
- <th class="days">Days </th>
- <td class="days"><%= data[:total_days_in_analysis] %></td>
- </tr>
- <tr>
- <th class="hits">Hits</th>
- <td class="hits"><%= data[:total_hits] %></td>
- </tr>
- <tr>
- <th class="unique-visits">Unique Visits</th>
- <td class="unique-visits"><%= data[:total_unique_visits] %></td>
- </tr>
- <tr>
- <th class="avg-hits-per-unique-visits">Unique Visits</th>
- <td class="avg-hits-per-unique-visits"><%= data[:total_unique_visits] != 0 ? data[:total_hits] / data[:total_unique_visits] : "N/A" %></td>
- </tr>
- <tr>
- <th class="tx">Tx</th>
- <td class="tx"><%= data[:total_size] %></td>
- </tr>
- </table>
+ <ul class="stats-list">
+ <li>
+ <%= data[:first_day_in_analysis].strftime("%b %d, %Y") %>
+ <span class="stats-list-label">From</span>
+ </li>
+ <li>
+ <%= data[:last_day_in_analysis].strftime("%b %d, %Y") %>
+ <span class="stats-list-label">To</span>
+ </li>
+ <li class="stats-list-positive">
+ <%= data[:total_days_in_analysis] %> <span class="stats-list-label">Days</span>
+ </li>
+ <li class="stats-list-negative">
+ <%= data[:total_hits] %> <span class="stats-list-label">Hits</span>
+ </li>
+ <li class="stats-list-negative">
+ <%= data[:total_unique_visits] %> <span class="stats-list-label">Unique Visits</span>
+ </li>
+ <li class="stats-list-negative">
+ <%= data[:total_unique_visits] != 0 ? data[:total_hits] / data[:total_unique_visits] : "N/A" %>
+ <span class="stats-list-label">Unique Visits / Day</span>
+ </li>
+ </ul>