slim_scrooge 1.0.0 → 1.0.1

data/README.textile CHANGED
@@ -1,10 +1,10 @@
1
- h1. SlimScrooge - serious optimisation of mysql for activerecord
1
+ h1. SlimScrooge - serious optimisation for activerecord
2
2
 
3
3
  h2. What is it?
4
4
 
5
5
  It's an optimization layer that ensures your application fetches only the database content it needs, minimizing wire traffic and excessive SQL queries and reducing the overhead of converting values to native Ruby types.
6
6
 
7
- SlimScrooge implements both lazy loading of attributes fetched from mysql, as well as inline query optimisation, automatically restricting the columns fetched based on what was used during previous passes through the same part of your code.
7
+ SlimScrooge implements inline query optimisation, automatically restricting the columns fetched based on what was used during previous passes through the same part of your code.
8
8
 
9
9
  SlimScrooge is similar to (and is partly derived from) "Scrooge":http://github.com/methodmissing/scrooge but has many fewer lines of code and is faster.
10
10
 
@@ -12,7 +12,11 @@ h2. Benchmark
12
12
 
13
13
  SlimScrooge performs best when the database is not on the same machine as your rails app. In this case the overhead of fetching unnecessary data from the database can become more important.
14
14
 
15
- I ran a benchmark that consisted of fetching 400 real urls (culled from the log file) from our complex web app. In this test I found a consistent *12% improvement* in performance over plain active record. Not earth-shattering, but worthwhile. In future releases I expect further gains.
15
+ I ran a benchmark that consisted of fetching 400 real urls (culled from the log file) from our complex web app. In this test I found a consistent *12% improvement* in performance over plain active record. Not earth-shattering, but worthwhile.
16
+
17
+ Note that this result compares the time taken to run complete requests through rails, of which database accesses are only one part. So the result is better than it first sounds.
18
+
19
+ In future releases I expect further gains.
16
20
 
17
21
  h2. Installation
18
22
 
@@ -54,7 +58,7 @@ h2. Technical discussion
54
58
 
55
59
  SlimScrooge hooks in at just one particular place in ActiveRecord - and that place is the find_all_by_sql method. All select queries pass through this method.
56
60
 
57
- SlimScrooge is able to record each call (and where it came from in your code), and to modify queries that do SELECT * FROM en-route to mysql so that they only select the columns that are actually used by that piece of code.
61
+ SlimScrooge is able to record each call (and where it came from in your code), and to modify queries that do SELECT * FROM en-route to the database so that they only select the columns that are actually used by that piece of code.
58
62
 
59
63
  How does SlimScrooge know which columns are actually used?
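As a rough illustration of the rewriting step described in the technical discussion, the Ruby sketch below narrows a SELECT * query to an explicit column list. The helper name narrow_select and the simplified regexp are assumptions made for this example only; SlimScrooge itself uses select_regexp and scrooged_sql, and quotes names through the database connection.

```ruby
# Hypothetical helper (not part of SlimScrooge): replace the * in a
# leading "SELECT * FROM" with a comma separated list of columns.
def narrow_select(sql, table_name, columns)
  column_list = columns.map { |name| "#{table_name}.#{name}" }.join(",")
  # Block form of sub, so the replacement text is used literally.
  sql.sub(/\ASELECT\s+\*\s+FROM\b/i) { "SELECT #{column_list} FROM" }
end

original = "SELECT * FROM users WHERE id = 1"
narrowed = narrow_select(original, "users", %w[id name])
puts narrowed
# prints: SELECT users.id,users.name FROM users WHERE id = 1
```

Queries that do not start with SELECT * FROM come back unchanged, which mirrors the idea that only one particular query shape is optimised.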
60
64
 
data/Rakefile CHANGED
@@ -14,14 +14,13 @@ begin
14
14
  require 'jeweler'
15
15
  Jeweler::Tasks.new do |s|
16
16
  s.name = "slim_scrooge"
17
- s.summary = "Slim_scrooge - lazy instantiation of attributes and query optimisation for ActiveRecord"
17
+ s.summary = "Slim_scrooge - serious optimisation for ActiveRecord"
18
18
  s.email = "sdsykes@gmail.com"
19
19
  s.homepage = "http://github.com/sdsykes/slim_scrooge"
20
- s.description = "Slim scrooge boosts speed in Rails/Mysql ActiveRecord Models by lazily instantiating attributes as needed, and only querying the database for what is needed."
20
+ s.description = "Slim scrooge boosts speed in Rails ActiveRecord Models by only querying the database for what is needed."
21
21
  s.authors = ["Stephen Sykes"]
22
22
  s.files = FileList["[A-Z]*", "{ext,lib,test}/**/*"]
23
23
  s.extensions = "ext/extconf.rb"
24
- s.add_dependency('slim-attributes', '>= 0.7.0')
25
24
  end
26
25
  Jeweler::GemcutterTasks.new
27
26
  rescue LoadError
data/VERSION.yml CHANGED
@@ -1,5 +1,5 @@
1
1
  ---
2
+ :major: 1
2
3
  :minor: 0
3
- :patch: 0
4
+ :patch: 1
4
5
  :build:
5
- :major: 1
@@ -1,6 +1,9 @@
1
1
  # Author: Stephen Sykes
2
2
 
3
3
  module SlimScrooge
4
+ # A Callsite contains the list of columns that are accessed when an SQL
5
+ # query is made from a particular place in the app
6
+ #
4
7
  class Callsite
5
8
  ScroogeComma = ",".freeze
6
9
  ScroogeRegexJoin = /(?:LEFT|INNER|OUTER|CROSS)*\s*(?:STRAIGHT_JOIN|JOIN)/i
@@ -9,6 +12,8 @@ module SlimScrooge
9
12
  attr_reader :columns_hash, :primary_key, :model_class
10
13
 
11
14
  class << self
15
+ # Make a callsite if the query is of the right type for us to optimise
16
+ #
12
17
  def make_callsite(model_class, original_sql)
13
18
  if use_scrooge?(model_class, original_sql)
14
19
  new(model_class)
@@ -17,12 +22,17 @@ module SlimScrooge
17
22
  end
18
23
  end
19
24
 
25
+ # Check if query can be optimised
26
+ #
20
27
  def use_scrooge?(model_class, original_sql)
21
28
  original_sql =~ select_regexp(model_class.table_name) &&
22
29
  model_class.columns_hash.has_key?(model_class.primary_key) &&
23
30
  original_sql !~ ScroogeRegexJoin
24
31
  end
25
32
 
33
+ # The regexp that enables us to replace the * from SELECT * with
34
+ # the list of columns we actually need
35
+ #
26
36
  def select_regexp(table_name)
27
37
  %r{SELECT (`?(?:#{table_name})?`?.?\\*) FROM}
28
38
  end
@@ -38,6 +48,8 @@ module SlimScrooge
38
48
  @seen_columns = SimpleSet.new(essential_columns)
39
49
  end
40
50
 
51
+ # List of columns that should always be fetched no matter what
52
+ #
41
53
  def essential_columns
42
54
  @model_class.reflect_on_all_associations.inject([@model_class.primary_key]) do |arr, assoc|
43
55
  if assoc.options[:dependent] && assoc.macro == :belongs_to
@@ -47,20 +59,29 @@ module SlimScrooge
47
59
  end
48
60
  end
49
61
 
62
+ # Returns suitable sql given a list of columns and the original query
63
+ #
50
64
  def scrooged_sql(seen_columns, sql)
51
65
  sql.gsub(@select_regexp, "SELECT #{scrooge_select_sql(seen_columns)} FROM")
52
66
  end
53
67
 
68
+ # List of columns that were not fetched
69
+ #
54
70
  def missing_columns(fetched_columns)
55
71
  (@all_columns - SimpleSet.new(fetched_columns)) << @primary_key
56
72
  end
57
73
 
74
+ # Returns sql for fetching the unfetched columns for all the rows
75
+ # in the result set, specified by primary_keys
76
+ #
58
77
  def reload_sql(primary_keys, fetched_columns)
59
78
  sql_keys = primary_keys.collect{|pk| "'#{pk}'"}.join(ScroogeComma)
60
79
  cols = scrooge_select_sql(missing_columns(fetched_columns))
61
80
  "SELECT #{cols} FROM #{@quoted_table_name} WHERE #{@quoted_table_name}.#{@primary_key} IN (#{sql_keys})"
62
81
  end
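The shape of the query built by reload_sql can be sketched in isolation as below. This is a simplified stand-in under stated assumptions: sketch_reload_sql is a hypothetical name, the table and column names are passed in as plain strings, and no connection-level quoting is done, so it should not be used with untrusted values.

```ruby
# Hypothetical stand-alone version of the idea behind reload_sql:
# fetch the columns that were skipped, for every row already loaded.
def sketch_reload_sql(table, primary_key, primary_keys, missing_columns)
  keys = primary_keys.map { |pk| "'#{pk}'" }.join(",")
  cols = missing_columns.map { |col| "#{table}.#{col}" }.join(",")
  "SELECT #{cols} FROM #{table} WHERE #{table}.#{primary_key} IN (#{keys})"
end

reload = sketch_reload_sql("users", "id", [1, 2], %w[id email])
puts reload
# prints: SELECT users.id,users.email FROM users WHERE users.id IN ('1','2')
```

The primary key is included in the column list so that the reloaded rows can be matched back to the rows already in memory.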
63
82
 
83
+ # Change a set of columns into a correctly quoted comma separated list
84
+ #
64
85
  def scrooge_select_sql(set)
65
86
  set.collect do |name|
66
87
  "#{@quoted_table_name}.#{@model_class.connection.quote_column_name(name)}"
@@ -1,23 +1,34 @@
1
1
  # Author: Stephen Sykes
2
2
 
3
3
  module SlimScrooge
4
+ # Contains the complete list of callsites
5
+ #
4
6
  class Callsites
5
7
  CallsitesMutex = Mutex.new
6
8
  @@callsites = {}
7
9
 
8
10
  class << self
11
+ # Whether we have encountered a callsite before
12
+ #
9
13
  def has_key?(callsite_key)
10
14
  @@callsites.has_key?(callsite_key)
11
15
  end
12
16
 
17
+ # Return the callsite for this key
18
+ #
13
19
  def [](callsite_key)
14
20
  @@callsites[callsite_key]
15
21
  end
16
22
 
23
+ # Generate a key string - uses the portion of the query before the WHERE
24
+ # together with the callsite_hash generated by callsite_hash.c
25
+ #
17
26
  def callsite_key(callsite_hash, sql)
18
27
  callsite_hash + sql.gsub(/WHERE.*/i, "").hash
19
28
  end
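A small sketch of why the WHERE clause is stripped before hashing: two queries issued from the same place that differ only in their conditions end up with the same key. The name sketch_callsite_key and the integer 42 standing in for the value produced by callsite_hash.c are assumptions for illustration.

```ruby
# Hypothetical copy of the key calculation, for experimenting outside Rails.
def sketch_callsite_key(callsite_hash, sql)
  # Everything from WHERE onwards is dropped, so only the "shape" of the
  # query (table and selected columns) contributes to the key.
  callsite_hash + sql.gsub(/WHERE.*/i, "").hash
end

key_a = sketch_callsite_key(42, "SELECT * FROM users WHERE id = 1")
key_b = sketch_callsite_key(42, "SELECT * FROM users WHERE id = 2")
puts key_a == key_b
# prints: true
```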
20
29
 
30
+ # Create a new callsite
31
+ #
21
32
  def create(sql, callsite_key, name)
22
33
  begin
23
34
  model_class = name.split.first.constantize
@@ -28,12 +39,16 @@ module SlimScrooge
28
39
  end
29
40
  end
30
41
 
42
+ # Add a new callsite, wrap in a mutex for safety
43
+ #
31
44
  def add_callsite(callsite_key, callsite)
32
45
  CallsitesMutex.synchronize do
33
46
  @@callsites[callsite_key] = callsite
34
47
  end
35
48
  end
36
49
 
50
+ # Record that a column was accessed, wrap in a mutex for safety
51
+ #
37
52
  def add_seen_column(callsite, seen_column)
38
53
  CallsitesMutex.synchronize do
39
54
  callsite.seen_columns << seen_column
@@ -1,9 +1,20 @@
1
1
  # Author: Stephen Sykes
2
2
 
3
3
  module SlimScrooge
4
+ # A MonitoredHash allows us to return only some columns into the @attributes
5
+ # of an ActiveRecord model object, but to notice when an attribute that
6
+ # wasn't fetched is accessed.
7
+ #
8
+ # Also, when a result is first fetched for a particular callsite, we monitor
9
+ # all the columns so that we can immediately learn which columns are needed.
10
+ #
4
11
  class MonitoredHash < Hash
5
12
  attr_accessor :callsite, :result_set, :monitored_columns
6
13
 
14
+ # Create a monitored hash. The unmonitored_columns are accessed like a regular
15
+ # hash. The monitored columns are kept separately, and new_column_access is called
16
+ # before they are returned.
17
+ #
7
18
  def self.[](monitored_columns, unmonitored_columns, callsite)
8
19
  hash = MonitoredHash.new {|hash, key| hash.new_column_access(key)}
9
20
  hash.monitored_columns = monitored_columns
@@ -12,6 +23,10 @@ module SlimScrooge
12
23
  hash
13
24
  end
14
25
 
26
+ # Called when an unknown column is requested, through the default proc.
27
+ # If the column requested is valid, and the result set is not completely
28
+ # loaded, then we reload. Otherwise just note the column with add_seen_column.
29
+ #
15
30
  def new_column_access(name)
16
31
  if @callsite.columns_hash.has_key?(name)
17
32
  @result_set.reload! if @result_set && name != @callsite.primary_key
@@ -20,6 +35,8 @@ module SlimScrooge
20
35
  @monitored_columns[name]
21
36
  end
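The mechanism MonitoredHash relies on, a Hash default proc that fires when a key is missing, can be tried out with a much smaller class. TrackingHash, build, note_missing and missed are hypothetical names for this sketch, and the reload step is left out entirely.

```ruby
# Hypothetical, simplified cousin of MonitoredHash: it only records which
# missing keys were asked for, instead of reloading anything.
class TrackingHash < Hash
  attr_reader :missed

  def self.build(fetched)
    # The block becomes the default proc, called only for absent keys.
    hash = new { |h, key| h.note_missing(key) }
    hash.update(fetched)
    hash
  end

  def note_missing(key)
    (@missed ||= []) << key
    nil
  end
end

row = TrackingHash.build("id" => 1)
row["id"]     # present, so the default proc is not called
row["email"]  # absent, so note_missing records the access
p row.missed
# prints: ["email"]
```

Note that the default proc is triggered by Hash#[] but not by has_key? or fetch, which is one reason the real class overrides several other methods as well.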
22
37
 
38
+ # Reload if needed before allowing assignment
39
+ #
23
40
  def []=(name, value)
24
41
  if has_key?(name)
25
42
  return super
@@ -30,16 +47,22 @@ module SlimScrooge
30
47
  @monitored_columns[name] = value
31
48
  end
32
49
 
50
+ # Returns the column names
51
+ #
33
52
  def keys
34
53
  @result_set ? @callsite.columns_hash.keys : super | @monitored_columns.keys
35
54
  end
36
55
 
56
+ # Check for a column name
57
+ #
37
58
  def has_key?(name)
38
59
  @result_set ? @callsite.columns_hash.has_key?(name) : super || @monitored_columns.has_key?(name)
39
60
  end
40
61
 
41
62
  alias_method :include?, :has_key?
42
63
 
64
+ # Called by Hash#update when reload is called on an ActiveRecord object
65
+ #
43
66
  def to_hash
44
67
  @result_set.reload! if @result_set
45
68
  @monitored_columns.merge(self)
@@ -58,6 +81,14 @@ module SlimScrooge
58
81
  end
59
82
  end
60
83
 
84
+ # We need to change the update method of Hash so that it *always* calls
85
+ # to_hash. This is because it normally checks if other_hash is a kind of
86
+ # Hash, and doesn't bother calling to_hash if so. But we need it to call
87
+ # to_hash, because otherwise update will not get the complete columns
88
+ # from a MonitoredHash
89
+ #
90
+ # This is not harmful - to_hash in a regular Hash just returns self.
91
+ #
61
92
  class Hash
62
93
  alias_method :c_update, :update
63
94
  def update(other_hash, &block)
@@ -1,6 +1,11 @@
1
1
  # Author: Stephen Sykes
2
2
 
3
3
  module SlimScrooge
4
+ # A ResultSet contains all the rows found by an sql query
5
+ # A call to reload! will cause all the rows in the set to be fully loaded
6
+ # from the database - this should be called when a column access that hasn't previously
7
+ # been seen by SlimScrooge is encountered
8
+ #
4
9
  class ResultSet
5
10
  attr_reader :rows, :callsite_key
6
11
 
@@ -14,6 +19,9 @@ module SlimScrooge
14
19
  @rows.inject({}) {|hash, row| hash[row[key]] = row; hash}
15
20
  end
16
21
 
22
+ # Reload all the rows in the sql result at once
23
+ # Reloads only those columns we didn't fetch the first time
24
+ #
17
25
  def reload!
18
26
  callsite = Callsites[@callsite_key]
19
27
  rows_hash = rows_by_key(callsite.primary_key)
@@ -13,19 +13,8 @@ module SlimScrooge
13
13
  def find_by_sql_with_slim_scrooge(sql)
14
14
  return find_by_sql_without_slim_scrooge(sql) if sql.is_a?(Array) # don't mess with user's custom query
15
15
  callsite_key = SlimScrooge::Callsites.callsite_key(callsite_hash, sql)
16
- if SlimScrooge::Callsites.has_key?(callsite_key) # seen before
17
- if callsite = SlimScrooge::Callsites[callsite_key] # and is scroogeable
18
- seen_columns = callsite.seen_columns.dup # dup so cols aren't changed underneath us
19
- rows = connection.select_all(callsite.scrooged_sql(seen_columns, sql), "#{name} Load SlimScrooged")
20
- rows.collect! {|row| MonitoredHash[{}, row, callsite]}
21
- result_set = SlimScrooge::ResultSet.new(rows.dup, callsite_key, seen_columns)
22
- rows.collect! do |row|
23
- row.result_set = result_set
24
- instantiate(row)
25
- end
26
- else
27
- find_by_sql_without_slim_scrooge(sql)
28
- end
16
+ if SlimScrooge::Callsites.has_key?(callsite_key)
17
+ find_with_callsite_key(sql, callsite_key)
29
18
  elsif callsite = SlimScrooge::Callsites.create(sql, callsite_key, name) # new site that is scroogeable
30
19
  rows = connection.select_all(sql, "#{name} Load SlimScrooged 1st time")
31
20
  rows.collect! {|row| instantiate(MonitoredHash[row, {}, callsite])}
@@ -33,6 +22,23 @@ module SlimScrooge
33
22
  find_by_sql_without_slim_scrooge(sql)
34
23
  end
35
24
  end
25
+
26
+ private
27
+
28
+ def find_with_callsite_key(sql, callsite_key)
29
+ if callsite = SlimScrooge::Callsites[callsite_key]
30
+ seen_columns = callsite.seen_columns.dup # dup so cols aren't changed underneath us
31
+ rows = connection.select_all(callsite.scrooged_sql(seen_columns, sql), "#{name} Load SlimScrooged")
32
+ rows.collect! {|row| MonitoredHash[{}, row, callsite]}
33
+ result_set = SlimScrooge::ResultSet.new(rows.dup, callsite_key, seen_columns)
34
+ rows.collect! do |row|
35
+ row.result_set = result_set
36
+ instantiate(row)
37
+ end
38
+ else
39
+ find_by_sql_without_slim_scrooge(sql)
40
+ end
41
+ end
36
42
  end
37
43
  end
38
44
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: slim_scrooge
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.0.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Stephen Sykes
@@ -9,20 +9,11 @@ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
11
 
12
- date: 2009-11-06 00:00:00 +02:00
12
+ date: 2009-11-07 00:00:00 +02:00
13
13
  default_executable:
14
- dependencies:
15
- - !ruby/object:Gem::Dependency
16
- name: slim-attributes
17
- type: :runtime
18
- version_requirement:
19
- version_requirements: !ruby/object:Gem::Requirement
20
- requirements:
21
- - - ">="
22
- - !ruby/object:Gem::Version
23
- version: 0.7.0
24
- version:
25
- description: Slim scrooge boosts speed in Rails/Mysql ActiveRecord Models by lazily instantiating attributes as needed, and only querying the database for what is needed.
14
+ dependencies: []
15
+
16
+ description: Slim scrooge boosts speed in Rails ActiveRecord Models by only querying the database for what is needed.
26
17
  email: sdsykes@gmail.com
27
18
  executables: []
28
19
 
@@ -75,7 +66,7 @@ rubyforge_project:
75
66
  rubygems_version: 1.3.5
76
67
  signing_key:
77
68
  specification_version: 3
78
- summary: Slim_scrooge - lazy instantiation of attributes and query optimisation for ActiveRecord
69
+ summary: Slim_scrooge - serious optimisation for ActiveRecord
79
70
  test_files:
80
71
  - test/active_record_setup.rb
81
72
  - test/helper.rb