batch-loader 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 3a50ebacf9007fad9837f5b414d6c6ba9f06bff9
4
- data.tar.gz: 646e11f0e0599449c7a0036672659752ac2018e3
3
+ metadata.gz: 0ced31b90f33da9d4e1fe719fe5f6f0b5a043e60
4
+ data.tar.gz: bc75c0b4234c6bb01db28e648853be9afbc4e7bd
5
5
  SHA512:
6
- metadata.gz: 0a841e009396580f4ca5115b8515ffcb465e3983a07c7ad05d117f7e7f49d24d16b7ef70f80d1e44f027a92f6ce969fcd3f6078019e44178b8200322c7beccae
7
- data.tar.gz: 754f66efbed2c515407354eed12d175c88baabb61eaeb870a59f290470a9fbca1d03b8a08775ff16e202b7a778482b9aefa5e677308dbd0b4f84a59010659017
6
+ metadata.gz: f4548669975cacfac9958d50ddb7b8d4a4d7de1cb974dcefd7c4b1545987ccfa552d76dce04252fa26ffa5e03b6cded5d699232ece4e184a0cf870167ca3f009
7
+ data.tar.gz: d3867a1d1313ae11f9d97af2b6dc3076165be4f68f990bf60487eaa7c811d2fad46356d909fc894537aea1992014636c5fe7c3b1127aabdc263ccde65f839fb5
@@ -0,0 +1,23 @@
1
+ # Changelog
2
+
3
+ The following are lists of the notable changes included with each release.
4
+ This is intended to help keep people informed about notable changes between
5
+ versions, as well as provide a rough history. Each item is prefixed with
6
+ one of the following labels: `Added`, `Changed`, `Deprecated`,
7
+ `Removed`, `Fixed`, `Security`. We also use [Semantic Versioning](http://semver.org)
8
+ to manage the versions of this gem so
9
+ that you can set version constraints properly.
10
+
11
+ #### [Unreleased](https://github.com/exAspArk/batch-loader/compare/v0.2.0...HEAD)
12
+
13
+ * WIP
14
+
15
+ #### [v0.2.0](https://github.com/exAspArk/batch-loader/compare/v0.1.0...v0.2.0) – 2017-08-02
16
+
17
+ * `Added`: `cache: false` option.
18
+ * `Added`: `BatchLoader::Middleware`.
19
+ * `Added`: More docs and tests.
20
+
21
+ #### [v0.1.0](https://github.com/exAspArk/batch-loader/compare/ed32edb...v0.1.0) – 2017-07-31
22
+
23
+ * `Added`: initial functional version.
data/README.md CHANGED
@@ -1,8 +1,232 @@
1
1
  # BatchLoader
2
2
 
3
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/batch_loader`. To experiment with that code, run `bin/console` for an interactive prompt.
3
+ [![Build Status](https://travis-ci.org/exAspArk/batch-loader.svg?branch=master)](https://travis-ci.org/exAspArk/batch-loader)
4
4
 
5
- TODO: Delete this and the text above, and describe your gem
5
+ Simple tool to avoid N+1 DB queries, HTTP requests, etc.
6
+
7
+ ## Contents
8
+
9
+ * [Highlights](#highlights)
10
+ * [Usage](#usage)
11
+ * [Why?](#why)
12
+ * [Basic example](#basic-example)
13
+ * [How it works](#how-it-works)
14
+ * [REST API example](#rest-api-example)
15
+ * [GraphQL example](#graphql-example)
16
+ * [Caching](#caching)
17
+ * [Installation](#installation)
18
+ * [Implementation details](#implementation-details)
19
+ * [Testing](#testing)
20
+ * [Development](#development)
21
+ * [Contributing](#contributing)
22
+ * [License](#license)
23
+ * [Code of Conduct](#code-of-conduct)
24
+
25
+ ## Highlights
26
+
27
+ * Generic utility to avoid N+1 DB queries, HTTP requests, etc.
28
+ * Adapted Ruby implementation of battle-tested tools like [Haskell Haxl](https://github.com/facebook/Haxl), [JS DataLoader](https://github.com/facebook/dataloader), etc.
29
+ * Parent objects don't have to know about children's requirements; batching is isolated.
30
+ * Automatically caches previous queries.
31
+ * Doesn't require creating custom classes.
32
+ * Thread-safe (`BatchLoader#load`).
33
+ * Has zero dependencies.
34
+ * Works with any Ruby code, including REST APIs and GraphQL.
35
+
36
+ ## Usage
37
+
38
+ ### Why?
39
+
40
+ Let's have a look at the code with N+1 queries:
41
+
42
+ ```ruby
43
+ def load_posts(ids)
44
+ Post.where(id: ids)
45
+ end
46
+
47
+ def load_users(posts)
48
+ posts.map { |post| post.user }
49
+ end
50
+
51
+ posts = load_posts([1, 2, 3]) # Posts SELECT * FROM posts WHERE id IN (1, 2, 3)
52
+ # _ ↓ _
53
+ # ↙ ↓ ↘
54
+ # U ↓ ↓ SELECT * FROM users WHERE id = 1
55
+ users = load_users(posts) # ↓ U ↓ SELECT * FROM users WHERE id = 2
56
+ # ↓ ↓ U SELECT * FROM users WHERE id = 3
57
+ # ↘ ↓ ↙
58
+ # ¯ ↓ ¯
59
+ users.map { |u| u.name } # Users
60
+ ```
61
+
62
+ The naive approach would be to preload dependent objects on the top level:
63
+
64
+ ```ruby
65
+ # With ORM in basic cases
66
+ def load_posts(ids)
67
+ Post.where(id: ids).includes(:user)
68
+ end
69
+
70
+ # But without ORM or in more complicated cases you will have to do something like:
71
+ def load_posts(ids)
72
+ # load posts
73
+ posts = Post.where(id: ids)
74
+ user_ids = posts.map(&:user_id)
75
+
76
+ # load users
77
+ users = User.where(id: user_ids)
78
+ user_by_id = users.each_with_object({}) { |user, memo| memo[user.id] = user }
79
+
80
+ # map user to post
81
+ posts.each { |post| post.user = user_by_id[post.user_id] }
82
+ end
83
+
84
+ def load_users(posts)
85
+ posts.map { |post| post.user }
86
+ end
87
+
88
+ posts = load_posts([1, 2, 3]) # Posts SELECT * FROM posts WHERE id IN (1, 2, 3)
89
+ # _ ↓ _ SELECT * FROM users WHERE id IN (1, 2, 3)
90
+ # ↙ ↓ ↘
91
+ # U ↓ ↓
92
+ users = load_users(posts) # ↓ U ↓
93
+ # ↓ ↓ U
94
+ # ↘ ↓ ↙
95
+ # ¯ ↓ ¯
96
+ users.map { |u| u.name } # Users
97
+ ```
98
+
99
+ But the problem here is that `load_posts` now depends on the child association. Plus it'll preload the association every time, even if it's not necessary. Can we do better? Sure!
100
+
101
+ ### Basic example
102
+
103
+ With `BatchLoader` we can rewrite the code above:
104
+
105
+ ```ruby
106
+ def load_posts(ids)
107
+ Post.where(id: ids)
108
+ end
109
+
110
+ def load_users(posts)
111
+ posts.map do |post|
112
+ BatchLoader.for(post.user_id).batch do |user_ids, batch_loader|
113
+ User.where(id: user_ids).each { |user| batch_loader.load(user.id, user) }
114
+ end
115
+ end
116
+ end
117
+
118
+ posts = load_posts([1, 2, 3]) # Posts SELECT * FROM posts WHERE id IN (1, 2, 3)
119
+ # _ ↓ _
120
+ # ↙ ↓ ↘
121
+ # BL ↓ ↓
122
+ users = load_users(posts) # ↓ BL ↓
123
+ # ↓ ↓ BL
124
+ # ↘ ↓ ↙
125
+ # ¯ ↓ ¯
126
+ BatchLoader.sync!(users).map(&:name) # Users SELECT * FROM users WHERE id IN (1, 2, 3)
127
+ ```
128
+
129
+ As we can see, batching is isolated and described right in the place where it's needed.
130
+
131
+ ### How it works
132
+
133
+ In general, `BatchLoader` returns an object that other similar implementations call a Promise. Each Promise knows which data it needs to load and how to batch the query. Once all the Promises are collected, they can be resolved at once without N+1 queries.
134
+
135
+ So, when we call `BatchLoader.for` we pass an item (`user_id`) which should be batched. For the `batch` method, we pass a block which uses all the collected items (`user_ids`):
136
+
137
+ <pre>
138
+ BatchLoader.for(post.<b>user_id</b>).batch do |<b>user_ids</b>, batch_loader|
139
+ ...
140
+ end
141
+ </pre>
142
+
143
+ Inside the block we execute a batch query for our items (`User.where`). After that, all we have to do is call the `load` method and pass the item that was used in the `BatchLoader.for` method (`user_id`) and the loaded object itself (`user`):
144
+
145
+ <pre>
146
+ BatchLoader.for(post.<b>user_id</b>).batch do |user_ids, batch_loader|
147
+ User.where(id: user_ids).each { |user| batch_loader.load(<b>user.id</b>, <b>user</b>) }
148
+ end
149
+ </pre>
150
+
151
+ Now we can resolve all the collected `BatchLoader` objects:
152
+
153
+ <pre>
154
+ BatchLoader.sync!(users) # => SELECT * FROM users WHERE id IN (1, 2, 3)
155
+ </pre>
156
+
157
+ For more information, see the [Implementation details](#implementation-details) section.
158
+
159
+ ### REST API example
160
+
161
+ Now imagine we have a regular Rails app with N+1 HTTP requests:
162
+
163
+ ```ruby
164
+ # app/models/post.rb
165
+ class Post < ApplicationRecord
166
+ def rating
167
+ HttpClient.request(:get, "https://example.com/ratings/#{id}")
168
+ end
169
+ end
170
+
171
+ # app/controllers/posts_controller.rb
172
+ class PostsController < ApplicationController
173
+ def index
174
+ posts = Post.limit(10)
175
+ serialized_posts = posts.map { |post| {id: post.id, rating: post.rating} }
176
+
177
+ render json: serialized_posts
178
+ end
179
+ end
180
+ ```
181
+
182
+ As we can see, the code above will make N+1 HTTP requests, one for each post. Let's batch the requests with a gem called [parallel](https://github.com/grosser/parallel):
183
+
184
+ ```ruby
185
+ # app/models/post.rb
186
+ class Post < ApplicationRecord
187
+ def rating_lazy
188
+ BatchLoader.for(self).batch do |posts, batch_loader|
189
+ Parallel.each(posts, in_threads: 10) { |post| batch_loader.load(post, post.rating) }
190
+ end
191
+ end
192
+
193
+ def rating
194
+ HttpClient.request(:get, "https://example.com/ratings/#{id}")
195
+ end
196
+ end
197
+ ```
198
+
199
+ `BatchLoader#load` is thread-safe. So, if `HttpClient` is also thread-safe, then with the `parallel` gem we can execute all HTTP requests concurrently in threads (there are some benchmarks for [concurrent HTTP requests](https://github.com/exAspArk/concurrent_http_requests) in Ruby). Thanks to Matz, MRI releases the GIL when a thread hits blocking I/O, which is an HTTP request in our case.
200
+
201
+ Now we can resolve all `BatchLoader` objects in the controller:
202
+
203
+ ```ruby
204
+ # app/controllers/posts_controller.rb
205
+ class PostsController < ApplicationController
206
+ def index
207
+ posts = Post.limit(10)
208
+ serialized_posts = posts.map { |post| {id: post.id, rating: post.rating_lazy} }
209
+ render json: BatchLoader.sync!(serialized_posts)
210
+ end
211
+ end
212
+ ```
213
+
214
+ `BatchLoader` caches the resolved values. To ensure that the cache is purged for each request in the app, add the following middleware:
215
+
216
+ ```ruby
217
+ # config/application.rb
218
+ config.middleware.use BatchLoader::Middleware
219
+ ```
220
+
221
+ See the [Caching](#caching) section for more information.
222
+
223
+ ### GraphQL example
224
+
225
+ TODO
226
+
227
+ ### Caching
228
+
229
+ TODO
6
230
 
7
231
  ## Installation
8
232
 
@@ -20,9 +244,13 @@ Or install it yourself as:
20
244
 
21
245
  $ gem install batch-loader
22
246
 
23
- ## Usage
247
+ ## Implementation details
248
+
249
+ TODO
250
+
251
+ ## Testing
24
252
 
25
- TODO: Write usage instructions here
253
+ TODO
26
254
 
27
255
  ## Development
28
256
 
@@ -9,8 +9,8 @@ Gem::Specification.new do |spec|
9
9
  spec.authors = ["exAspArk"]
10
10
  spec.email = ["exaspark@gmail.com"]
11
11
 
12
- spec.summary = %q{Simple tool to avoid N+1 queries, HTTP requests, etc.}
13
- spec.description = %q{Simple tool to avoid N+1 queries, HTTP requests, etc.}
12
+ spec.summary = %q{Simple tool to avoid N+1 DB queries, HTTP requests, etc.}
13
+ spec.description = %q{Simple tool to avoid N+1 DB queries, HTTP requests, etc.}
14
14
  spec.homepage = "https://github.com/exAspArk/batch-loader"
15
15
  spec.license = "MIT"
16
16
 
@@ -0,0 +1 @@
1
+ require "batch_loader"
@@ -1,50 +1,69 @@
1
1
  require "batch_loader/version"
2
- require "batch_loader/executor"
2
+ require "batch_loader/executor_proxy"
3
+ require "batch_loader/middleware"
3
4
 
4
5
  class BatchLoader
5
- def self.for(item)
6
- new(item: item)
7
- end
6
+ NoBatchError = Class.new(StandardError)
7
+ BatchAlreadyExistsError = Class.new(StandardError)
8
+
9
+ class << self
10
+ def for(item)
11
+ new(item: item)
12
+ end
8
13
 
9
- def self.sync!(value)
10
- case value
11
- when Array
12
- value.map { |v| sync!(v) }
13
- when Hash
14
- value.each { |k, v| value[k] = sync!(v) }
15
- when BatchLoader
16
- sync!(value.sync)
17
- else
18
- value
14
+ def sync!(value)
15
+ case value
16
+ when Array
17
+ value.map! { |v| sync!(v) }
18
+ when Hash
19
+ value.each { |k, v| value[k] = sync!(v) }
20
+ when BatchLoader
21
+ sync!(value.sync)
22
+ else
23
+ value
24
+ end
19
25
  end
20
26
  end
21
27
 
28
+ attr_reader :item, :batch_block, :cache
29
+
22
30
  def initialize(item:)
23
31
  @item = item
24
32
  end
25
33
 
26
- def batch(&batch_block)
34
+ def batch(cache: true, &batch_block)
35
+ raise BatchAlreadyExistsError if @batch_block
36
+ @cache = cache
27
37
  @batch_block = batch_block
28
- executor.add_item(@item, &@batch_block)
38
+ executor_for_block.add(item: item)
29
39
  self
30
40
  end
31
41
 
32
- def load(item, loaded_item)
33
- executor.save(item, loaded_item, &@batch_block)
42
+ def load(item, value)
43
+ executor_for_block.load(item: item, value: value)
34
44
  end
35
45
 
36
46
  def sync
37
- unless executor.saved?(&@batch_block)
38
- items = executor.items(&@batch_block)
39
- @batch_block.call(items, self)
47
+ unless executor_for_block.value_loaded?(item: item)
48
+ batch_block.call(executor_for_block.list_items, self)
49
+ executor_for_block.delete_items
40
50
  end
41
-
42
- executor.find(@item, &@batch_block)
51
+ result = executor_for_block.loaded_value(item: item)
52
+ purge_cache unless cache
53
+ result
43
54
  end
44
55
 
45
56
  private
46
57
 
47
- def executor
48
- BatchLoader::Executor.ensure_current
58
+ def executor_for_block
59
+ @executor_for_block ||= begin
60
+ raise NoBatchError.new("Please provide a batch block first") unless batch_block
61
+ BatchLoader::ExecutorProxy.new(&batch_block)
62
+ end
63
+ end
64
+
65
+ def purge_cache
66
+ executor_for_block.unload_value(item: item)
67
+ executor_for_block.add(item: item)
49
68
  end
50
69
  end
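
The `batch`, `load`, and `sync` flow in this file may be easier to follow with a toy model. The sketch below is a simplified, hypothetical `MiniLoader` (it is not the gem's code, and it ignores threads, the `cache:` option, and error handling). It only illustrates the idea that items are collected lazily and the batch block runs once for all of them. The names `MiniLoader`, `PENDING`, `LOADED`, `users_table`, and `queries` are assumptions made for illustration.

```ruby
# Simplified, hypothetical model of lazy batching; not the gem's actual code.
class MiniLoader
  # Pending items and loaded values, keyed by where the batch block was defined.
  PENDING = Hash.new { |hash, key| hash[key] = [] }
  LOADED  = Hash.new { |hash, key| hash[key] = {} }

  def self.for(item)
    new(item)
  end

  def initialize(item)
    @item = item
  end

  # Remember the block and register the item; nothing is loaded yet.
  def batch(&block)
    @block = block
    @key = block.source_location
    PENDING[@key] << @item unless PENDING[@key].include?(@item)
    self
  end

  # Called from inside the batch block to store a loaded value.
  def load(item, value)
    LOADED[@key][item] = value
  end

  # Run the batch block once for all pending items, then return this item's value.
  def sync
    unless LOADED[@key].key?(@item)
      @block.call(PENDING[@key].dup, self)
      PENDING[@key].clear
    end
    LOADED[@key][@item]
  end
end

users_table = { 1 => "Ann", 2 => "Bob", 3 => "Cid" } # stands in for a users table
queries = []                                         # records each simulated query

lazy_names = [1, 2, 3].map do |user_id|
  MiniLoader.for(user_id).batch do |user_ids, loader|
    queries << user_ids
    user_ids.each { |id| loader.load(id, users_table[id]) }
  end
end

names = lazy_names.map(&:sync)
```

In this model the first `sync` triggers one simulated query for all three ids, and the remaining `sync` calls read from the already loaded values.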
@@ -3,36 +3,18 @@ class BatchLoader
3
3
  NAMESPACE = :batch_loader
4
4
 
5
5
  def self.ensure_current
6
- Thread.current[NAMESPACE] = Thread.current[NAMESPACE] || new
6
+ Thread.current[NAMESPACE] ||= new
7
7
  end
8
8
 
9
- def self.clear_current
9
+ def self.delete_current
10
10
  Thread.current[NAMESPACE] = nil
11
11
  end
12
12
 
13
- def initialize
14
- @items_by_batch_block = Hash.new { |hash, key| hash[key] = [] }
15
- @loaded_items_by_batch_block = Hash.new { |hash, key| hash[key] = {} }
16
- end
17
-
18
- def add_item(item, &batch_block)
19
- @items_by_batch_block[batch_block.source_location] << item
20
- end
21
-
22
- def items(&batch_block)
23
- @items_by_batch_block[batch_block.source_location]
24
- end
13
+ attr_reader :items_by_block, :loaded_values_by_block
25
14
 
26
- def save(item, loaded_item, &batch_block)
27
- @loaded_items_by_batch_block[batch_block.source_location][item] = loaded_item
28
- end
29
-
30
- def saved?(&batch_block)
31
- @loaded_items_by_batch_block.key?(batch_block.source_location)
32
- end
33
-
34
- def find(item, &batch_block)
35
- @loaded_items_by_batch_block.dig(batch_block.source_location, item)
15
+ def initialize
16
+ @items_by_block = Hash.new { |hash, key| hash[key] = Set.new }
17
+ @loaded_values_by_block = Hash.new { |hash, key| hash[key] = {} }
36
18
  end
37
19
  end
38
20
  end
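
The switch from an array to `Set.new` as the default value of `items_by_block` suggests that duplicate items are meant to be batched only once. A minimal sketch of that data structure, assuming a made-up block key (the `key` value below is hypothetical; the gem derives its key from the batch block):

```ruby
require "set"

# A Hash whose missing keys default to a fresh Set, as in the new initializer.
items_by_block = Hash.new { |hash, key| hash[key] = Set.new }

key = ["app/models/post.rb", 12] # hypothetical block key, for illustration only

# Adding the same id several times keeps a single copy.
[1, 2, 2, 3, 1].each { |user_id| items_by_block[key] << user_id }

batched_ids = items_by_block[key].to_a
```

Ruby's `Set` keeps insertion order, so `to_a` yields the ids in the order they were first added.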
@@ -0,0 +1,51 @@
1
+ require "batch_loader/executor"
2
+
3
+ class BatchLoader
4
+ class ExecutorProxy
5
+ attr_reader :block, :global_executor
6
+
7
+ def initialize(&block)
8
+ @block = block
9
+ @block_hash_key = block.source_location
10
+ @global_executor = BatchLoader::Executor.ensure_current
11
+ end
12
+
13
+ def add(item:)
14
+ items << item
15
+ end
16
+
17
+ def list_items
18
+ items.to_a
19
+ end
20
+
21
+ def delete_items
22
+ global_executor.items_by_block[@block_hash_key] = Set.new
23
+ end
24
+
25
+ def load(item:, value:)
26
+ loaded[item] = value
27
+ end
28
+
29
+ def loaded_value(item:)
30
+ loaded[item]
31
+ end
32
+
33
+ def value_loaded?(item:)
34
+ loaded.key?(item)
35
+ end
36
+
37
+ def unload_value(item:)
38
+ loaded.delete(item)
39
+ end
40
+
41
+ private
42
+
43
+ def items
44
+ global_executor.items_by_block[@block_hash_key]
45
+ end
46
+
47
+ def loaded
48
+ global_executor.loaded_values_by_block[@block_hash_key]
49
+ end
50
+ end
51
+ end
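
`ExecutorProxy` uses `block.source_location` as its hash key. The short sketch below, which uses only core Ruby and invented variable names, shows why that groups loaders together: procs created by the same line of source report the same location, while a proc defined elsewhere does not.

```ruby
# Two procs created by the same block literal share a source location.
same_line = Array.new(2) { |i| proc { i } }
other_line = proc { :other }

key_a, key_b = same_line.map(&:source_location)
key_c = other_line.source_location

# Equal arrays act as the same Hash key, so both procs map to one bucket.
buckets = Hash.new { |hash, key| hash[key] = [] }
buckets[key_a] << :first
buckets[key_b] << :second
buckets[key_c] << :third
```

One consequence worth keeping in mind is that the key identifies a place in the source rather than a particular block object, so distinct blocks that report the same location would presumably share state as well.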
@@ -0,0 +1,16 @@
1
+ class BatchLoader
2
+ class Middleware
3
+ def initialize(app)
4
+ @app = app
5
+ end
6
+
7
+ def call(env)
8
+ begin
9
+ BatchLoader::Executor.ensure_current
10
+ @app.call(env)
11
+ ensure
12
+ BatchLoader::Executor.delete_current
13
+ end
14
+ end
15
+ end
16
+ end
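
The middleware relies on `ensure` so that the thread-local executor is removed even when the wrapped app raises. A self-contained sketch of the same pattern follows; `RequestState` and `CleanupMiddleware` are hypothetical stand-ins for `BatchLoader::Executor` and `BatchLoader::Middleware`, and the lambda stands in for a Rack app.

```ruby
# Hypothetical stand-in for per-request state kept in a thread-local slot.
class RequestState
  KEY = :request_state_sketch

  def self.ensure_current
    Thread.current[KEY] ||= {}
  end

  def self.delete_current
    Thread.current[KEY] = nil
  end
end

# Rack-style middleware: takes the next app and responds to #call(env).
class CleanupMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    RequestState.ensure_current
    @app.call(env)
  ensure
    # Runs even if the app raises, so state does not leak into the next request.
    RequestState.delete_current
  end
end

# A tiny app that writes to the per-request state while handling the request.
app = lambda do |env|
  RequestState.ensure_current[:seen] = env["PATH_INFO"]
  [200, { "content-type" => "text/plain" }, ["ok"]]
end

status, _headers, body = CleanupMiddleware.new(app).call("PATH_INFO" => "/posts")
state_after = Thread.current[RequestState::KEY]
```

After the call returns, the thread-local slot is empty again, which is the property that keeps cached values from one request out of the next one served by the same thread.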
@@ -1,3 +1,3 @@
1
1
  class BatchLoader
2
- VERSION = "0.1.0"
2
+ VERSION = "0.2.0"
3
3
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: batch-loader
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - exAspArk
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2017-07-31 00:00:00.000000000 Z
11
+ date: 2017-08-02 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -52,7 +52,7 @@ dependencies:
52
52
  - - "~>"
53
53
  - !ruby/object:Gem::Version
54
54
  version: '3.0'
55
- description: Simple tool to avoid N+1 queries, HTTP requests, etc.
55
+ description: Simple tool to avoid N+1 DB queries, HTTP requests, etc.
56
56
  email:
57
57
  - exaspark@gmail.com
58
58
  executables: []
@@ -63,6 +63,7 @@ files:
63
63
  - ".rspec"
64
64
  - ".ruby-version"
65
65
  - ".travis.yml"
66
+ - CHANGELOG.md
66
67
  - CODE_OF_CONDUCT.md
67
68
  - Gemfile
68
69
  - LICENSE.txt
@@ -71,8 +72,11 @@ files:
71
72
  - batch-loader.gemspec
72
73
  - bin/console
73
74
  - bin/setup
75
+ - lib/batch-loader.rb
74
76
  - lib/batch_loader.rb
75
77
  - lib/batch_loader/executor.rb
78
+ - lib/batch_loader/executor_proxy.rb
79
+ - lib/batch_loader/middleware.rb
76
80
  - lib/batch_loader/version.rb
77
81
  homepage: https://github.com/exAspArk/batch-loader
78
82
  licenses:
@@ -97,5 +101,5 @@ rubyforge_project:
97
101
  rubygems_version: 2.5.2
98
102
  signing_key:
99
103
  specification_version: 4
100
- summary: Simple tool to avoid N+1 queries, HTTP requests, etc.
104
+ summary: Simple tool to avoid N+1 DB queries, HTTP requests, etc.
101
105
  test_files: []