columbo 0.1.3 → 0.1.4

data/README.md CHANGED
@@ -45,7 +45,7 @@ Check the Rack configuration:

  ## Disclaimer

- This is an alpha release and it is untested with Sinatra, it is tested with Rails 3 only.
+ This is an alpha release, it is untested with Sinatra and Rails 3 only.
  UI to explore sessions will be completed later (ETA: 2013'Q2).

  ## Author
@@ -10,9 +10,12 @@ module Columbo

  def initialize(app, opts={})
    @app = app
-   @capture = opts[:capture]
-   @bench = opts[:capture] && opts[:bench]
-   @mongo_uri = opts[:mongo_uri]
+   # Options
+   @capture = opts[:capture]
+   @bench = opts[:capture] && opts[:bench]
+   @capture_crawlers = opts[:capture_crawlers]
+   @crawlers = opts[:crawlers] || "(Baidu|Gigabot|Googlebot|libwww-perl|lwp-trivial|msnbot|SiteUptime|Slurp|WordPress|ZIBB|ZyBorg|bot|crawler|spider|robot|crawling|facebook|w3c|coccoc)"
+   @mongo_uri = opts[:mongo_uri]

    Columbo.logger = opts[:logger] if opts[:logger]

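The option handling added in this hunk can be illustrated in isolation. The following is a minimal sketch, not code from the gem: `resolve_crawler_options` and `DEFAULT_CRAWLERS` are hypothetical names, and the default pattern is shortened for readability. It shows that a caller-supplied `:crawlers` pattern takes precedence over the built-in one, and that omitting `:capture_crawlers` leaves it `nil`, which the later check treats as "do not capture crawlers".

```ruby
# Hypothetical helper, not part of columbo: mirrors how the new
# initializer reads its crawler-related options.
DEFAULT_CRAWLERS = "(Googlebot|msnbot|Slurp|bot|crawler|spider)" # shortened for illustration

def resolve_crawler_options(opts = {})
  {
    # nil (falsy) when the caller does not pass :capture_crawlers
    capture_crawlers: opts[:capture_crawlers],
    # a caller-supplied pattern wins over the default
    crawlers: opts[:crawlers] || DEFAULT_CRAWLERS
  }
end

defaults = resolve_crawler_options
custom   = resolve_crawler_options(crawlers: "(MyBot)", capture_crawlers: true)
```

Because the pattern is passed as a plain string, a custom value presumably needs to be valid regular expression source, since it is later handed to `Regexp.new`.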
@@ -43,7 +46,7 @@ module Columbo

  Thread.new do
    begin
-     @inspector.investigate env, status, headers, response, start_processing, stop_processing
+     @inspector.investigate env, status, headers, response, start_processing, stop_processing, @crawlers, @capture_crawlers
    rescue Exception => e
      log_error env, e
    end
@@ -10,11 +10,13 @@ module Columbo
    @mongo_uri = mongo_uri
  end

- def investigate(env, status, headers, body, start, stop)
+ def investigate(env, status, headers, body, start, stop, crawlers, capture_crawlers)
    # Lazy connection to MongoDB
    client = Columbo::DbClient.new @mongo_uri
    # Normalise request from env
    request = Rack::Request.new(env)
+   rg = Regexp.new(crawlers, Regexp::IGNORECASE)
+   return if request.user_agent.match(rg) && !capture_crawlers
    html = ''
    body.each { |part| html += part }
    # Retrieve plain text body for full text search
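One thing to watch in the added guard: `Rack::Request#user_agent` returns `nil` when the request carries no `User-Agent` header, and calling `match` on `nil` would raise `NoMethodError`. The sketch below shows a nil-tolerant version of the same decision. It is an illustration only: `skip_request?` is a hypothetical helper, not part of columbo, and treating a missing user agent as a non-crawler is an assumption of this sketch.

```ruby
# Hypothetical helper, not part of columbo: decides whether a request
# should be skipped, based on its User-Agent string.
def skip_request?(user_agent, crawlers, capture_crawlers)
  # Crawler traffic is kept when capture is explicitly enabled.
  return false if capture_crawlers

  pattern = Regexp.new(crawlers, Regexp::IGNORECASE)
  # to_s turns a missing User-Agent (nil) into "", which this sketch
  # treats as a non-crawler instead of raising.
  !!(user_agent.to_s =~ pattern)
end

pattern = "(Googlebot|bot)" # shortened pattern for illustration
bot     = skip_request?("Mozilla/5.0 (compatible; Googlebot/2.1)", pattern, nil)
browser = skip_request?("Mozilla/5.0 Firefox/20.0", pattern, nil)
missing = skip_request?(nil, pattern, nil)
```

Checking `capture_crawlers` first also avoids compiling the regular expression when crawler traffic is being captured anyway.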
@@ -1,3 +1,3 @@
  module Columbo
-   VERSION = '0.1.3'
+   VERSION = '0.1.4'
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: columbo
  version: !ruby/object:Gem::Version
-   version: 0.1.3
+   version: 0.1.4
  prerelease:
  platform: ruby
  authors:
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2013-03-29 00:00:00.000000000 Z
+ date: 2013-04-08 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rack