slim_scrooge 1.0.0 → 1.0.1
- data/README.textile +8 -4
- data/Rakefile +2 -3
- data/VERSION.yml +2 -2
- data/lib/slim_scrooge/callsite.rb +21 -0
- data/lib/slim_scrooge/callsites.rb +15 -0
- data/lib/slim_scrooge/monitored_hash.rb +31 -0
- data/lib/slim_scrooge/result_set.rb +8 -0
- data/lib/slim_scrooge/slim_scrooge.rb +19 -13
- metadata +6 -15
data/README.textile
CHANGED
@@ -1,10 +1,10 @@
-h1. SlimScrooge - serious optimisation
+h1. SlimScrooge - serious optimisation for activerecord

 h2. What is it?

 It's an optimization layer to ensure your application only fetches the database content needed to minimize wire traffic, excessive SQL queries and reduce conversion overheads to native Ruby types.

-SlimScrooge implements
+SlimScrooge implements inline query optimisation, automatically restricting the columns fetched based on what was used during previous passes through the same part of your code.

 SlimScrooge is similar to (and is partly derived from) "Scrooge":http://github.com/methodmissing/scrooge but has many fewer lines of code and is faster.

@@ -12,7 +12,11 @@ h2. Benchmark

 SlimScrooge performs best when the database is not on the same machine as your rails app. In this case the overhead of fetching unnecessary data from the database can become more important.

-I ran a benchmark that consisted of fetching 400 real urls (culled from the log file) from our complex web app. In this test I found a consistent *12% improvement* in performance over plain active record. Not earth-shattering, but worthwhile.
+I ran a benchmark that consisted of fetching 400 real urls (culled from the log file) from our complex web app. In this test I found a consistent *12% improvement* in performance over plain active record. Not earth-shattering, but worthwhile.
+
+Note that this result was for comparing the time taken for running complete requests through rails - of which database accesses are only one part. So the result is better than it at first sounds.
+
+In future releases I expect further gains.

 h2. Installation

@@ -54,7 +58,7 @@ h2. Technical discussion

 SlimScrooge hooks in at just one particular place in ActiveRecord - and that place is the find_all_by_sql method. All select queries pass through this method.

-SlimScrooge is able to record each call (and where it came from in your code), and to modify queries that do SELECT * FROM en-route to
+SlimScrooge is able to record each call (and where it came from in your code), and to modify queries that do SELECT * FROM en-route to the database so that they only select the rows that are actually used by that piece of code.

 How does SlimScrooge know which columns are actually used?

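The rewrite described in the Technical discussion can be sketched as follows. This is a minimal illustration, not SlimScrooge's own code: the helper name `narrow_select`, its simplified pattern, and the MySQL-style backtick quoting are all assumptions made for the example.

```ruby
# Hypothetical helper: replace the * of a single-table SELECT with an
# explicit column list. Backtick quoting is assumed (MySQL style).
def narrow_select(sql, table, columns)
  # Matches "SELECT * FROM" and "SELECT `table`.* FROM" (simplified pattern)
  pattern = /SELECT (?:`?#{Regexp.escape(table)}`?\.)?\* FROM/
  column_list = columns.map { |col| "`#{table}`.`#{col}`" }.join(",")
  # Block form of sub, so the replacement text is inserted literally
  sql.sub(pattern) { "SELECT #{column_list} FROM" }
end

puts narrow_select("SELECT * FROM `users` WHERE id = 1", "users", %w[id name])
```

A query that does not start with a star select is returned unchanged, which keeps the helper safe to call on any SQL string.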
data/Rakefile
CHANGED
@@ -14,14 +14,13 @@ begin
   require 'jeweler'
   Jeweler::Tasks.new do |s|
     s.name = "slim_scrooge"
-    s.summary = "Slim_scrooge -
+    s.summary = "Slim_scrooge - serious optimisation for ActiveRecord"
     s.email = "sdsykes@gmail.com"
     s.homepage = "http://github.com/sdsykes/slim_scrooge"
-    s.description = "Slim scrooge boosts speed in Rails
+    s.description = "Slim scrooge boosts speed in Rails ActiveRecord Models by only querying the database for what is needed."
     s.authors = ["Stephen Sykes"]
     s.files = FileList["[A-Z]*", "{ext,lib,test}/**/*"]
     s.extensions = "ext/extconf.rb"
-    s.add_dependency('slim-attributes', '>= 0.7.0')
   end
   Jeweler::GemcutterTasks.new
 rescue LoadError
data/VERSION.yml
CHANGED
data/lib/slim_scrooge/callsite.rb
CHANGED
@@ -1,6 +1,9 @@
 # Author: Stephen Sykes

 module SlimScrooge
+  # A Callsite contains the list of columns that are accessed when an SQL
+  # query is made from a particular place in the app
+  #
   class Callsite
     ScroogeComma = ",".freeze
     ScroogeRegexJoin = /(?:LEFT|INNER|OUTER|CROSS)*\s*(?:STRAIGHT_JOIN|JOIN)/i
@@ -9,6 +12,8 @@ module SlimScrooge
     attr_reader :columns_hash, :primary_key, :model_class

     class << self
+      # Make a callsite if the query is of the right type for us to optimise
+      #
       def make_callsite(model_class, original_sql)
         if use_scrooge?(model_class, original_sql)
           new(model_class)
@@ -17,12 +22,17 @@ module SlimScrooge
         end
       end

+      # Check if query can be optimised
+      #
       def use_scrooge?(model_class, original_sql)
         original_sql =~ select_regexp(model_class.table_name) &&
         model_class.columns_hash.has_key?(model_class.primary_key) &&
         original_sql !~ ScroogeRegexJoin
       end

+      # The regexp that enables us to replace the * from SELECT * with
+      # the list of columns we actually need
+      #
       def select_regexp(table_name)
         %r{SELECT (`?(?:#{table_name})?`?.?\\*) FROM}
       end
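The three conditions in use_scrooge? can be tried in isolation with a small sketch. The join pattern below is copied from ScroogeRegexJoin in this diff; `STAR_PATTERN` and `optimisable?` are hypothetical stand-ins, and the primary-key check is reduced to a boolean argument for illustration.

```ruby
# Copied from ScroogeRegexJoin in the diff
JOIN_PATTERN = /(?:LEFT|INNER|OUTER|CROSS)*\s*(?:STRAIGHT_JOIN|JOIN)/i
# Simplified stand-in for select_regexp (an assumption, not the gem's pattern)
STAR_PATTERN = /\ASELECT (?:`?\w+`?\.)?\* FROM/

# True only for a star select, on a model with a primary key, with no join
def optimisable?(sql, has_primary_key)
  !!(sql =~ STAR_PATTERN) && has_primary_key && sql !~ JOIN_PATTERN
end

puts optimisable?("SELECT * FROM users WHERE id = 1", true)
puts optimisable?("SELECT * FROM users INNER JOIN posts ON posts.user_id = users.id", true)
```

Note that the join pattern is conservative: it matches the letters "join" anywhere in the statement, so a query whose WHERE clause contains a value such as 'joiner' is also left unoptimised.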
@@ -38,6 +48,8 @@ module SlimScrooge
       @seen_columns = SimpleSet.new(essential_columns)
     end

+    # List of columns that should always be fetched no matter what
+    #
     def essential_columns
       @model_class.reflect_on_all_associations.inject([@model_class.primary_key]) do |arr, assoc|
         if assoc.options[:dependent] && assoc.macro == :belongs_to
@@ -47,20 +59,29 @@ module SlimScrooge
       end
     end

+    # Returns suitable sql given a list of columns and the original query
+    #
     def scrooged_sql(seen_columns, sql)
       sql.gsub(@select_regexp, "SELECT #{scrooge_select_sql(seen_columns)} FROM")
     end

+    # List if columns what were not fetched
+    #
     def missing_columns(fetched_columns)
       (@all_columns - SimpleSet.new(fetched_columns)) << @primary_key
     end

+    # Returns sql for fetching the unfetched columns for all the rows
+    # in the result set, specified by primary_keys
+    #
     def reload_sql(primary_keys, fetched_columns)
       sql_keys = primary_keys.collect{|pk| "'#{pk}'"}.join(ScroogeComma)
       cols = scrooge_select_sql(missing_columns(fetched_columns))
       "SELECT #{cols} FROM #{@quoted_table_name} WHERE #{@quoted_table_name}.#{@primary_key} IN (#{sql_keys})"
     end

+    # Change a set of columns into a correctly quoted comma separated list
+    #
     def scrooge_select_sql(set)
       set.collect do |name|
         "#{@quoted_table_name}.#{@model_class.connection.quote_column_name(name)}"
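To make the missing_columns and reload_sql pair concrete, here is a standalone sketch of the same idea. It is a simplification under stated assumptions: plain arrays stand in for SimpleSet, column and table names are left unquoted, and the free function `reload_sql_for` with its parameter list is hypothetical.

```ruby
# Hypothetical standalone version of the reload query: select every column
# that was not fetched the first time, plus the primary key, for the given ids.
def reload_sql_for(table, primary_key, all_columns, fetched_columns, ids)
  # Unfetched columns, with the primary key always included so the
  # reloaded rows can be matched back to the originals
  missing = (all_columns - fetched_columns) | [primary_key]
  cols = missing.map { |name| "#{table}.#{name}" }.join(",")
  # As in the diff, ids are interpolated directly; they are assumed to be
  # trusted primary key values already read from the database
  keys = ids.map { |id| "'#{id}'" }.join(",")
  "SELECT #{cols} FROM #{table} WHERE #{table}.#{primary_key} IN (#{keys})"
end

puts reload_sql_for("users", "id", %w[id name email bio], %w[id name], [1, 2])
```

Fetching the missing columns for every row of the result set in one IN query avoids issuing a separate reload per record.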
data/lib/slim_scrooge/callsites.rb
CHANGED
@@ -1,23 +1,34 @@
 # Author: Stephen Sykes

 module SlimScrooge
+  # Contains the complete list of callsites
+  #
   class Callsites
     CallsitesMutex = Mutex.new
     @@callsites = {}

     class << self
+      # Whether we have encountered a callsite before
+      #
       def has_key?(callsite_key)
         @@callsites.has_key?(callsite_key)
       end

+      # Return the callsite for this key
+      #
       def [](callsite_key)
         @@callsites[callsite_key]
       end

+      # Generate a key string - uses the portion of the query before the WHERE
+      # together with the callsite_hash generated by callsite_hash.c
+      #
       def callsite_key(callsite_hash, sql)
         callsite_hash + sql.gsub(/WHERE.*/i, "").hash
       end

+      # Create a new callsite
+      #
       def create(sql, callsite_key, name)
         begin
           model_class = name.split.first.constantize
@@ -28,12 +39,16 @@ module SlimScrooge
         end
       end

+      # Add a new callsite, wrap in a mutex for safety
+      #
       def add_callsite(callsite_key, callsite)
         CallsitesMutex.synchronize do
           @@callsites[callsite_key] = callsite
         end
       end

+      # Record that a column was accessed, wrap in a mutex for safety
+      #
       def add_seen_column(callsite, seen_column)
         CallsitesMutex.synchronize do
           callsite.seen_columns << seen_column
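The key scheme and the mutex-guarded writes shown in callsites.rb can be illustrated with a small registry. This is a sketch only: `CallsiteRegistry`, `key_for`, and the integer `location_hash` argument (standing in for the value produced by callsite_hash.c) are assumptions for the example.

```ruby
# Hypothetical registry mirroring the shape of Callsites in the diff
class CallsiteRegistry
  def initialize
    @sites = {}
    @lock = Mutex.new
  end

  # Same code location plus same SQL up to the WHERE clause gives the same key,
  # so queries that differ only in their conditions share one callsite
  def key_for(location_hash, sql)
    location_hash + sql.gsub(/WHERE.*/i, "").hash
  end

  # Writes go through the mutex so concurrent threads cannot interleave them
  def add(key, site)
    @lock.synchronize { @sites[key] = site }
  end

  def [](key)
    @sites[key]
  end
end

registry = CallsiteRegistry.new
key = registry.key_for(42, "SELECT * FROM users WHERE id = 1")
registry.add(key, :users_callsite)
puts registry[registry.key_for(42, "SELECT * FROM users WHERE id = 2")].inspect
```

Because everything from WHERE onwards is stripped before hashing, the second lookup finds the callsite registered by the first query.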
data/lib/slim_scrooge/monitored_hash.rb
CHANGED
@@ -1,9 +1,20 @@
 # Author: Stephen Sykes

 module SlimScrooge
+  # A MonitoredHash allows us to return only some columns into the @attributes
+  # of an ActiveRecord model object, but to notice when an attribute that
+  # wasn't fetched is accessed.
+  #
+  # Also, when a result is first fetched for a particular callsite, we monitor
+  # all the columns so that we can immediately learn which columns are needed.
+  #
   class MonitoredHash < Hash
     attr_accessor :callsite, :result_set, :monitored_columns

+    # Create a monitored hash. The unmonitored_columns are accessed like a regular
+    # hash. The monitored columns kept separately, and new_column_access is called
+    # before they are returned.
+    #
     def self.[](monitored_columns, unmonitored_columns, callsite)
       hash = MonitoredHash.new {|hash, key| hash.new_column_access(key)}
       hash.monitored_columns = monitored_columns
@@ -12,6 +23,10 @@ module SlimScrooge
       hash
     end

+    # Called when an unknown column is requested, through the default proc.
+    # If the column requested is valid, and the result set is not completely
+    # loaded, then we reload. Otherwise just note the column with add_seen_column.
+    #
     def new_column_access(name)
       if @callsite.columns_hash.has_key?(name)
         @result_set.reload! if @result_set && name != @callsite.primary_key
@@ -20,6 +35,8 @@ module SlimScrooge
       @monitored_columns[name]
     end

+    # Reload if needed before allowing assignment
+    #
     def []=(name, value)
       if has_key?(name)
         return super
@@ -30,16 +47,22 @@ module SlimScrooge
       @monitored_columns[name] = value
     end

+    # Returns the column names
+    #
     def keys
       @result_set ? @callsite.columns_hash.keys : super | @monitored_columns.keys
     end

+    # Check for a column name
+    #
     def has_key?(name)
       @result_set ? @callsite.columns_hash.has_key?(name) : super || @monitored_columns.has_key?(name)
     end

     alias_method :include?, :has_key?

+    # Called by Hash#update when reload is called on an ActiveRecord object
+    #
     def to_hash
       @result_set.reload! if @result_set
       @monitored_columns.merge(self)
@@ -58,6 +81,14 @@ module SlimScrooge
   end
 end

+# We need to change the update method of Hash so that it *always* calls
+# to_hash. This is because it normally checks if other_hash is a kind of
+# Hash, and doesn't bother calling to_hash if so. But we need it to call
+# to_hash, because otherwise update will not get the complete columns
+# from a MonitoredHash
+#
+# This is not harmful - to_hash in a regular Hash just returns self.
+#
 class Hash
   alias_method :c_update, :update
   def update(other_hash, &block)
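The mechanism MonitoredHash relies on, a Hash default proc that fires when a key is missing, can be shown with a much smaller class. This is an illustrative sketch: `WatchedRow`, `build`, `note_miss`, and `missed` are hypothetical names, and where the real code reloads the result set the sketch only records the column name.

```ruby
# Hypothetical miniature of the idea: a Hash that notices reads of keys
# it does not hold
class WatchedRow < Hash
  # Build a row from the columns that were actually fetched
  def self.build(fetched)
    row = new { |hash, key| hash.note_miss(key) } # default proc runs on a missing key
    row.update(fetched)
  end

  # Names of columns that were read but had not been fetched
  def missed
    @missed ||= []
  end

  # Record the miss; a real implementation would reload and return the value
  def note_miss(key)
    missed << key
    nil
  end
end

row = WatchedRow.build("id" => 1, "name" => "Ann")
row["name"]
row["bio"]
puts row.missed.inspect
```

Only the plain `[]` lookup triggers a default proc; methods such as `fetch` and `has_key?` bypass it, which is one reason the class in the diff overrides `keys` and `has_key?` explicitly.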
data/lib/slim_scrooge/result_set.rb
CHANGED
@@ -1,6 +1,11 @@
 # Author: Stephen Sykes

 module SlimScrooge
+  # A ResultSet contains all the rows found by an sql query
+  # A call to reload! will cause all the rows in the set to be fully loaded
+  # from the database - this should be called when a column access that hasn't previously
+  # been seen by SlimScrooge is encountered
+  #
   class ResultSet
     attr_reader :rows, :callsite_key

@@ -14,6 +19,9 @@ module SlimScrooge
       @rows.inject({}) {|hash, row| hash[row[key]] = row; hash}
     end

+    # Reload all the rows in the sql result at once
+    # Reloads only those columns we didn't fetch the first time
+    #
     def reload!
       callsite = Callsites[@callsite_key]
       rows_hash = rows_by_key(callsite.primary_key)
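The reload step indexes the existing rows by primary key so that the newly fetched columns can be merged back into the right row. A minimal sketch of that pattern, using plain hashes for rows; `merge_reloaded` is a hypothetical helper and only the inject expression in `rows_by_key` is taken from the diff.

```ruby
# Index rows by the value of their key column (same inject as in the diff)
def rows_by_key(rows, key)
  rows.inject({}) { |hash, row| hash[row[key]] = row; hash }
end

# Hypothetical merge step: copy the reloaded columns into the matching rows.
# Reloaded rows may arrive in any order, hence the lookup by key.
def merge_reloaded(rows, reloaded, key)
  index = rows_by_key(rows, key)
  reloaded.each do |extra|
    target = index[extra[key]]
    target.update(extra) if target
  end
  rows
end

rows = [{ "id" => 1, "name" => "Ann" }, { "id" => 2, "name" => "Bob" }]
reloaded = [{ "id" => 2, "bio" => "b" }, { "id" => 1, "bio" => "a" }]
puts merge_reloaded(rows, reloaded, "id").inspect
```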
data/lib/slim_scrooge/slim_scrooge.rb
CHANGED
@@ -13,19 +13,8 @@ module SlimScrooge
       def find_by_sql_with_slim_scrooge(sql)
         return find_by_sql_without_slim_scrooge(sql) if sql.is_a?(Array) # don't mess with user's custom query
         callsite_key = SlimScrooge::Callsites.callsite_key(callsite_hash, sql)
-        if SlimScrooge::Callsites.has_key?(callsite_key)
-
-            seen_columns = callsite.seen_columns.dup # dup so cols aren't changed underneath us
-            rows = connection.select_all(callsite.scrooged_sql(seen_columns, sql), "#{name} Load SlimScrooged")
-            rows.collect! {|row| MonitoredHash[{}, row, callsite]}
-            result_set = SlimScrooge::ResultSet.new(rows.dup, callsite_key, seen_columns)
-            rows.collect! do |row|
-              row.result_set = result_set
-              instantiate(row)
-            end
-          else
-            find_by_sql_without_slim_scrooge(sql)
-          end
+        if SlimScrooge::Callsites.has_key?(callsite_key)
+          find_with_callsite_key(sql, callsite_key)
         elsif callsite = SlimScrooge::Callsites.create(sql, callsite_key, name) # new site that is scroogeable
           rows = connection.select_all(sql, "#{name} Load SlimScrooged 1st time")
           rows.collect! {|row| instantiate(MonitoredHash[row, {}, callsite])}
@@ -33,6 +22,23 @@ module SlimScrooge
           find_by_sql_without_slim_scrooge(sql)
         end
       end
+
+      private
+
+      def find_with_callsite_key(sql, callsite_key)
+        if callsite = SlimScrooge::Callsites[callsite_key]
+          seen_columns = callsite.seen_columns.dup # dup so cols aren't changed underneath us
+          rows = connection.select_all(callsite.scrooged_sql(seen_columns, sql), "#{name} Load SlimScrooged")
+          rows.collect! {|row| MonitoredHash[{}, row, callsite]}
+          result_set = SlimScrooge::ResultSet.new(rows.dup, callsite_key, seen_columns)
+          rows.collect! do |row|
+            row.result_set = result_set
+            instantiate(row)
+          end
+        else
+          find_by_sql_without_slim_scrooge(sql)
+        end
+      end
     end
   end
 end
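The `_with_slim_scrooge` / `_without_slim_scrooge` method names follow the wrap-and-alias naming pattern common in Rails code of this period. How the gem installs its hook is not shown in this diff, so the following is only a sketch of that pattern in plain Ruby; `PlainFinder` and the `_tracking` suffix are hypothetical.

```ruby
# Hypothetical class standing in for the code being wrapped
class PlainFinder
  def find_by_sql(sql)
    [:plain, sql]
  end
end

# Reopen the class and wrap find_by_sql, keeping the original reachable
class PlainFinder
  def find_by_sql_with_tracking(sql)
    # Array-form queries are passed straight through, as in the diff
    return find_by_sql_without_tracking(sql) if sql.is_a?(Array)
    [:tracked, sql]
  end

  alias_method :find_by_sql_without_tracking, :find_by_sql # keep the original
  alias_method :find_by_sql, :find_by_sql_with_tracking    # install the wrapper
end

finder = PlainFinder.new
puts finder.find_by_sql("SELECT 1").inspect
puts finder.find_by_sql(["SELECT ?", 1]).inspect
```

Callers keep using `find_by_sql` unchanged, and the wrapper can fall back to the original implementation whenever it decides not to intervene.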
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: slim_scrooge
 version: !ruby/object:Gem::Version
-  version: 1.0.
+  version: 1.0.1
 platform: ruby
 authors:
 - Stephen Sykes
@@ -9,20 +9,11 @@ autorequire:
 bindir: bin
 cert_chain: []

-date: 2009-11-
+date: 2009-11-07 00:00:00 +02:00
 default_executable:
-dependencies:
-
-
-  type: :runtime
-  version_requirement:
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 0.7.0
-    version:
-description: Slim scrooge boosts speed in Rails/Mysql ActiveRecord Models by lazily instantiating attributes as needed, and only querying the database for what is needed.
+dependencies: []
+
+description: Slim scrooge boosts speed in Rails ActiveRecord Models by only querying the database for what is needed.
 email: sdsykes@gmail.com
 executables: []

@@ -75,7 +66,7 @@ rubyforge_project:
 rubygems_version: 1.3.5
 signing_key:
 specification_version: 3
-summary: Slim_scrooge -
+summary: Slim_scrooge - serious optimisation for ActiveRecord
 test_files:
 - test/active_record_setup.rb
 - test/helper.rb