pipeloader 0.0.1 → 0.0.3
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: e89c97e574dfe8954ae5d612925c75391c10b131291e061f2bbfd92446fa8088
-   data.tar.gz: 8982f0272ae0dc1fe47b7cea62b6b01a462b47ee7447c5173f16720cd424f164
+   metadata.gz: 2efb3ab68daaf2649998d53eab252167254d1e195ff3f46ff676c1bcaeeb8d17
+   data.tar.gz: a45b1c6236205caf6421437bdf4ec6b04b459698d630f63708c6f929e63c1242
  SHA512:
-   metadata.gz: 59182c22fc62605148fa69ee580858fb670c2199628e8068ac9957fd826979634a9687179d43330013987fb18f6f33c6dad4fe0246850b47d07bfecfeed17d32
-   data.tar.gz: e9d19c2e8de08e14041c9459250c0839af2925895f20ca012e3c3abad39d9bec36ecc6bb91e9470b43c800f111f04a759bb481258d8aeb1bfd10bf52e59d781d
+   metadata.gz: c607f4d9a7c5c4c5ee035bc2458ad3e91b0a1d15e83d7a9cedd35a2942de1c470521175eff0d9898211c828575c1615aeb514212e3edb252159686b5aa11864b
+   data.tar.gz: 5d9dbf27e303c91733d6f6b5787015bf993a9658b173f689e4cecbd9a0a55739ac4d691976bc4cfe034f0b5d0d8e541717c1a57a3b6780ed58b57c454d3e7180
data/DATALOADERS.md ADDED
@@ -0,0 +1,379 @@
+ # Batch loaders vs `GraphQL::Dataloader`, by example
+
+ Almost every `GraphQL::Dataloader::Source` you write is one of a handful of shapes:
+ load a record by id, load children by foreign key, count them, look one up by some
+ other column, check whether a row exists. `Pipeloader::Batch` collapses each of those
+ to a line on the model — and for the association shapes, to **no resolver at all**:
+ you declare the field and call `object.author`, and the load batches itself.
+
+ Every "before" below is real `GraphQL::Dataloader` code (a custom `Source`, or one of
+ graphql-ruby's built-ins); every "after" is the pipeloader equivalent. The schema is a
+ GitHub-ish one: `Repository`, `Issue`, `User`, `Star`, `Topic`.
+
+ The one-time setup: `include Pipeloader::Batch::Model` in the model. There's nothing to
+ scope — siblings are the records loaded by the same query, and the group rides on the
+ records, so batching just works. Everything below assumes the concern is included.
+
+ ---
+
+ ## 1. A record by id — `belongs_to`
+
+ The single most common source: hydrate a foreign key.
+
+ ```ruby
+ # --- GraphQL::Dataloader ---
+ class Sources::Record < GraphQL::Dataloader::Source
+   def initialize(model) = @model = model
+   def fetch(ids)
+     found = @model.where(id: ids).index_by(&:id)
+     ids.map { |id| found[id] } # one row per id, nil if missing
+   end
+ end
+
+ class Types::Issue < GraphQL::Schema::Object
+   field :author, Types::User, null: false
+   def author = dataloader.with(Sources::Record, ::User).load(object.author_id)
+ end
+ ```
+
+ ```ruby
+ # --- pipeloader ---
+ class Issue < ApplicationRecord
+   include Pipeloader::Batch::Model
+   batch_belongs_to :author
+ end
+
+ class Types::Issue < GraphQL::Schema::Object
+   field :author, Types::User, null: false # no resolver — object.author batches
+ end
+ ```
+
+ graphql-ruby's built-in `ActiveRecordAssociationSource` is the closest dataloader
+ equivalent, and even it needs a per-field resolver:
+
+ ```ruby
+ def author = dataloader.with(GraphQL::Dataloader::ActiveRecordAssociationSource, :author).load(object)
+ ```
+
+ pipeloader needs none: the field resolves through `object.author`, which the macro has
+ already made batch across every sibling `Issue` in the context.
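To make "batches across every sibling" concrete, here is a minimal plain-Ruby sketch of the idea. It is not pipeloader's implementation: `SiblingGroup`, `ToyIssue`, and `USERS` are hypothetical stand-ins. Records loaded together share one group object, and the first `author` read resolves every sibling's foreign key in a single lookup.

```ruby
# Toy illustration only: USERS stands in for a users table, and
# SiblingGroup / ToyIssue are hypothetical names, not pipeloader classes.
USERS = { 1 => "alice", 2 => "bob" }.freeze

class SiblingGroup
  attr_reader :records, :lookups

  def initialize
    @records = []
    @lookups = 0   # how many times the "table" was queried
    @authors = nil
  end

  # Resolve every sibling's author_id in one lookup, then memoize.
  def authors
    @authors ||= begin
      @lookups += 1
      ids = @records.map(&:author_id).uniq
      USERS.slice(*ids)
    end
  end
end

class ToyIssue
  attr_reader :author_id

  def initialize(author_id, group)
    @author_id = author_id
    @group = group
    group.records << self
  end

  # Reading one record's author triggers the load for the whole group.
  def author = @group.authors[author_id]
end

group  = SiblingGroup.new
issues = [1, 2, 1].map { |id| ToyIssue.new(id, group) }
names  = issues.map(&:author)
```

Three reads, one lookup: the group, not the individual record, owns the cache. That is the property the dataloader version gets from `Source#fetch` receiving all ids at once.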
+
+ ---
+
+ ## 2. A collection — `has_many`
+
+ ```ruby
+ # --- GraphQL::Dataloader ---
+ class Sources::HasMany < GraphQL::Dataloader::Source
+   def initialize(model, fk)
+     @model = model
+     @fk = fk
+   end
+   def fetch(owner_ids)
+     by_owner = @model.where(@fk => owner_ids).group_by { |r| r[@fk] }
+     owner_ids.map { |id| by_owner[id] || [] }
+   end
+ end
+
+ def issues = dataloader.with(Sources::HasMany, ::Issue, :repository_id).load(object.id)
+ ```
+
+ ```ruby
+ # --- pipeloader ---
+ class Repository < ApplicationRecord
+   include Pipeloader::Batch::Model
+   batch_has_many :issues
+ end
+ # field :issues, [Types::Issue], null: false — no resolver
+ ```
+
+ `object.issues` returns a lazy, chainable proxy whose load batches across every
+ sibling. See §6 for the chaining.
+
+ ---
+
+ ## 3. A singular child — `has_one`
+
+ ```ruby
+ # --- GraphQL::Dataloader: HasMany source, but take the first ---
+ def profile = dataloader.with(Sources::HasMany, ::Profile, :user_id).load(object.id).then(&:first)
+ ```
+
+ ```ruby
+ # --- pipeloader ---
+ batch_has_one :profile
+ ```
+
+ ---
+
+ ## 4. A count
+
+ ```ruby
+ # --- GraphQL::Dataloader ---
+ class Sources::Count < GraphQL::Dataloader::Source
+   def initialize(model, fk)
+     @model = model
+     @fk = fk
+   end
+   def fetch(owner_ids)
+     counts = @model.where(@fk => owner_ids).group(@fk).count
+     owner_ids.map { |id| counts[id] || 0 }
+   end
+ end
+
+ def issues_count = dataloader.with(Sources::Count, ::Issue, :repository_id).load(object.id)
+ ```
+
+ ```ruby
+ # --- pipeloader ---
+ batch_count :issues_count # source derived from the name; default 0
+ ```
+
+ ---
+
+ ## 5. A sum / aggregate
+
+ ```ruby
+ # --- GraphQL::Dataloader ---
+ class Sources::Sum < GraphQL::Dataloader::Source
+   def initialize(model, fk, column)
+     @model = model
+     @fk = fk
+     @column = column
+   end
+   def fetch(owner_ids)
+     sums = @model.where(@fk => owner_ids).group(@fk).sum(@column)
+     owner_ids.map { |id| sums[id] || 0 }
+   end
+ end
+
+ def disk_usage = dataloader.with(Sources::Sum, ::File, :repository_id, :byte_size).load(object.id)
+ ```
+
+ ```ruby
+ # --- pipeloader ---
+ batch_aggregate :disk_usage, of: :files, function: :sum, column: :byte_size
+ # function: also :average / :minimum / :maximum (empty group -> nil, or your :default)
+ ```
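The defaults mentioned in the comments above (0 for a count, `nil` for an empty minimum or maximum) follow from how grouped results are mapped back onto owners. A small plain-Ruby sketch of that mapping, using a hypothetical in-memory `ROWS` array in place of a table and helper names invented for illustration:

```ruby
# Hypothetical in-memory rows standing in for a `files` table.
ROWS = [
  { repository_id: 1, byte_size: 100 },
  { repository_id: 1, byte_size: 50 },
  { repository_id: 3, byte_size: 7 }
].freeze

# Group once, then map results back onto the owner ids in input order.
def grouped(owner_ids)
  ROWS.select { |r| owner_ids.include?(r[:repository_id]) }
      .group_by { |r| r[:repository_id] }
end

def counts_for(owner_ids)
  by_owner = grouped(owner_ids)
  owner_ids.map { |id| (by_owner[id] || []).size } # empty group -> 0
end

def maximums_for(owner_ids)
  by_owner = grouped(owner_ids)
  # No rows means there is nothing to take a maximum of, hence nil.
  owner_ids.map { |id| by_owner[id]&.map { |r| r[:byte_size] }&.max }
end

counts   = counts_for([1, 2, 3])
maximums = maximums_for([1, 2, 3])
```

Repository 2 has no rows, so it counts as 0 but has no maximum. That asymmetry is why an aggregate other than a count or sum usually needs an explicit default if the field is non-null.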
+
+ ---
+
+ ## 6. A filtered collection
+
+ With a dataloader you either bake the filter into a dedicated source, or load
+ everything and filter in Ruby (which un-batches the database side). With pipeloader the
+ `has_many` proxy is chainable, and the filter is pushed **into** the one batched query:
+
+ ```ruby
+ # --- pipeloader ---
+ field :open_issues, [Types::Issue], null: false
+ def open_issues = object.issues.where(state: "open").order(:created_at)
+ # one query total: SELECT * FROM issues WHERE state = 'open' AND repository_id IN (...)
+ ```
+
+ `where` / `order` / `limit` / `select` all stay batched (limit/offset are per-owner —
+ top-N per repository, not N overall). A distinct chain is cached as its own batch, so
+ the same chain across many owners is still one round trip. Prefer a named relation? Pass
+ the scope to the macro:
+
+ ```ruby
+ batch_has_many :open_issues, -> { where(state: "open") }, class_name: "Issue"
+ ```
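The "limit is per-owner" point is easy to misread, so here is a plain-Ruby sketch of the difference between a top-N per repository and a top-N overall. The `ISSUES` array and the helper name are hypothetical and only illustrate the semantics; how pipeloader generates the SQL for this is not shown in this document.

```ruby
# Hypothetical in-memory rows; created_at is a plain integer for brevity.
ISSUES = [
  { id: 1, repository_id: 10, created_at: 3 },
  { id: 2, repository_id: 10, created_at: 1 },
  { id: 3, repository_id: 10, created_at: 2 },
  { id: 4, repository_id: 20, created_at: 5 },
  { id: 5, repository_id: 20, created_at: 4 }
].freeze

# Per-owner limit: order within each owner's group, then take n from each.
def top_n_per_owner(rows, owner_ids, n)
  by_owner = rows.group_by { |r| r[:repository_id] }
  owner_ids.to_h do |owner_id|
    ordered = (by_owner[owner_id] || []).sort_by { |r| r[:created_at] }
    [owner_id, ordered.first(n).map { |r| r[:id] }]
  end
end

per_owner = top_n_per_owner(ISSUES, [10, 20], 2)

# A naive overall limit would starve repository 20 entirely.
overall = ISSUES.sort_by { |r| r[:created_at] }.first(2).map { |r| r[:id] }
```

With a limit of 2, each repository gets its own two oldest issues, whereas the overall limit returns two rows that both belong to repository 10.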
+
+ ---
+
+ ## 7. A record by a non-primary-key column
+
+ ```ruby
+ # --- GraphQL::Dataloader ---
+ class Sources::RecordBy < GraphQL::Dataloader::Source
+   def initialize(model, column)
+     @model = model
+     @column = column
+   end
+   def fetch(values)
+     found = @model.where(@column => values).index_by { |r| r[@column] }
+     values.map { |v| found[v] }
+   end
+ end
+
+ def owner = dataloader.with(Sources::RecordBy, ::User, :login).load(object.owner_login)
+ ```
+
+ If the column is a real key, say so on the association and it just works:
+
+ ```ruby
+ # --- pipeloader ---
+ batch_belongs_to :owner, class_name: "User", primary_key: :login, foreign_key: :owner_login
+ ```
+
+ If it isn't an association at all, the general `batch` macro (§8) keyed by that column
+ does the same in three lines.
+
+ ---
+
+ ## 8. An existence / viewer-scoped flag — the `batch` macro
+
+ This is the case that doesn't map to an association: a value that depends on the request
+ (the current viewer). With a dataloader it's a source parameterized by the viewer:
+
+ ```ruby
+ # --- GraphQL::Dataloader ---
+ class Sources::ViewerStarred < GraphQL::Dataloader::Source
+   def initialize(viewer_id) = @viewer_id = viewer_id
+   def fetch(repo_ids)
+     starred = Star.where(user_id: @viewer_id, repo_id: repo_ids).pluck(:repo_id).to_set
+     repo_ids.map { |id| starred.include?(id) }
+   end
+ end
+
+ def viewer_has_starred = dataloader.with(Sources::ViewerStarred, context[:viewer].id).load(object.id)
+ ```
+
+ `batch` is the general escape hatch: give it a loader that takes the owner keys and
+ returns a `{ key => value }` Hash; each instance reads its own, or the `default`.
+
+ ```ruby
+ # --- pipeloader ---
+ class Repository < ApplicationRecord
+   include Pipeloader::Batch::Model
+
+   batch :viewer_has_starred, default: false do |repo_ids|
+     Star.where(user_id: Current.user.id, repo_id: repo_ids)
+         .pluck(:repo_id).index_with(true) # { repo_id => true }, missing -> false
+   end
+ end
+ # field :viewer_has_starred, Boolean, null: false — no resolver
+ ```
+
+ The loader reads the viewer from request state (`Current.user` here — any
+ `ActiveSupport::CurrentAttributes`; outside Rails, close over a local or read what you
+ stashed when you opened the batch). It runs **once** per context, the first time any
+ repository's flag is read.
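The contract described here (the loader receives all owner keys, returns a Hash, runs once, and each instance reads its own entry or the default) can be sketched in plain Ruby. This is a toy under stated assumptions, not pipeloader's code: `ToyBatch`, `ToyRepo`, the shared `group` hash, and `STARRED_IDS` are all hypothetical.

```ruby
# Hypothetical toy: a class-level `batch` macro whose loader runs once per
# group of sibling records. The group is a plain Hash shared by the siblings.
module ToyBatch
  def batch(name, default: nil, &loader)
    define_method(name) do
      cache = group[:cache]
      unless cache.key?(name)
        keys = group[:members].map(&:id)
        cache[name] = loader.call(keys) # one call for all siblings
      end
      cache[name].fetch(id, default)    # own value, or the default
    end
  end
end

class ToyRepo
  extend ToyBatch
  attr_reader :id, :group

  def initialize(id, group)
    @id = id
    @group = group
    group[:members] << self
  end

  STARRED_IDS = [2].freeze # stands in for the viewer's rows in `stars`
  LOADER_CALLS = []        # records each loader invocation

  batch :viewer_has_starred, default: false do |repo_ids|
    LOADER_CALLS << repo_ids
    (repo_ids & STARRED_IDS).to_h { |rid| [rid, true] }
  end
end

group = { members: [], cache: {} }
repos = [1, 2, 3].map { |id| ToyRepo.new(id, group) }
flags = repos.map(&:viewer_has_starred)
```

Only starred repositories appear in the loader's Hash; the other two fall through to `default: false`, and the loader is invoked a single time with all three ids.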
+
+ ---
+
+ ## 9. A derived value — the `batch` macro
+
+ Anything you can compute in one grouped query: a score, a summary, a lookup by a column.
+
+ ```ruby
+ # reaction counts per issue, one query for the whole page
+ batch :reaction_counts, default: {} do |issue_ids|
+   Reaction.where(issue_id: issue_ids).group(:issue_id, :emoji).count # { [id, emoji] => n }
+     .each_with_object(Hash.new { |h, k| h[k] = {} }) { |((id, emoji), n), acc| acc[id][emoji] = n }
+ end
+
+ # a record looked up by a non-PK column, keyed by that column instead of the PK
+ batch :author, key: :author_login, default: nil do |logins|
+   User.where(login: logins).index_by(&:login)
+ end
+ ```
+
+ `key:` controls which column the owner keys come from (default the primary key), so the
+ loader and the reader agree on what to index by.
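The reshaping step in the `reaction_counts` loader is dense, so here it is in isolation as plain Ruby. The `flat` hash is made-up data in the shape a two-column grouped count returns; the fallback at the end is an assumption about how a reader would apply the macro's `default`, not something this document specifies.

```ruby
# Shape of a two-column grouped count: { [issue_id, emoji] => n }.
# The data is invented for illustration.
flat = { [1, "+1"] => 3, [1, "heart"] => 1, [2, "+1"] => 2 }

# Fold into { issue_id => { emoji => n } }. The accumulator's default block
# creates the inner hash the first time an issue id is seen.
nested = flat.each_with_object(Hash.new { |h, k| h[k] = {} }) do |((id, emoji), n), acc|
  acc[id][emoji] = n
end

# An issue with no reactions is absent from the result, so a reader would
# fall back to the default. key? avoids triggering the default block, which
# would otherwise insert an empty entry as a side effect.
default = {}
for_issue_3 = nested.key?(3) ? nested[3] : default
```

One detail worth noticing: the returned hash keeps its default block, so reading a missing id with `[]` would silently add an entry. Checking `key?` (or using `fetch`) sidesteps that.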
+
+ ---
+
+ ## 10. `has_many :through`
+
+ The case a flat foreign-key batcher can't express — the target has no column pointing
+ back at the owner. With a dataloader you join in the source and re-hydrate the rows:
+
+ ```ruby
+ # --- GraphQL::Dataloader ---
+ class Sources::Topics < GraphQL::Dataloader::Source
+   def fetch(repo_ids)
+     rows = RepositoryTopic.where(repository_id: repo_ids).joins(:topic)
+       .pluck(:repository_id, "topics.id", "topics.name")
+     by_repo = rows.group_by(&:first)
+     repo_ids.map { |id| (by_repo[id] || []).map { |(_, tid, name)| Topic.new(id: tid, name: name) } }
+   end
+ end
+
+ def topics = dataloader.with(Sources::Topics).load(object.id)
+ ```
+
+ ```ruby
+ # --- pipeloader ---
+ batch_has_many :topics, through: :repository_topics
+ ```
+
+ pipeloader routes a `:through` collection to AR's `Preloader` (which walks the join), so
+ it batches to one query for the join and one for the targets, across every owner. It
+ returns a plain loaded array — no chainable proxy — so filter at the relation if you need
+ to: `batch_has_many :recent_topics, -> { order(created_at: :desc) }, through: :repository_topics`.
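The "one query for the join and one for the targets" shape can be pictured with a plain-Ruby sketch. It mirrors the two-step idea only; it does not reproduce what ActiveRecord's preloader does internally, and `REPOSITORY_TOPICS`, `TOPICS`, and `topics_for` are hypothetical names.

```ruby
# Hypothetical in-memory tables.
REPOSITORY_TOPICS = [
  { repository_id: 1, topic_id: 10 },
  { repository_id: 1, topic_id: 11 },
  { repository_id: 2, topic_id: 10 }
].freeze
TOPICS = { 10 => "ruby", 11 => "graphql" }.freeze

def topics_for(repo_ids)
  # Lookup 1: join rows for every owner at once.
  joins = REPOSITORY_TOPICS.select { |j| repo_ids.include?(j[:repository_id]) }
  # Lookup 2: the distinct targets those rows point at.
  topics = TOPICS.slice(*joins.map { |j| j[:topic_id] }.uniq)
  # Stitch back per owner, preserving the input order of repo_ids.
  by_repo = joins.group_by { |j| j[:repository_id] }
  repo_ids.to_h { |id| [id, (by_repo[id] || []).map { |j| topics[j[:topic_id]] }] }
end

result = topics_for([1, 2, 3])
```

The number of lookups stays at two however many owners are passed in, and an owner with no join rows gets an empty array.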
+
+ ---
+
+ ## 11. A polymorphic `belongs_to`
+
+ ```ruby
+ # --- GraphQL::Dataloader: group by type, load each, re-map to input order ---
+ class Sources::Commentable < GraphQL::Dataloader::Source
+   def fetch(comments)
+     by_type = comments.group_by(&:commentable_type)
+     loaded = by_type.transform_values do |group|
+       group.first.commentable_type.constantize.where(id: group.map(&:commentable_id)).index_by(&:id)
+     end
+     comments.map { |c| loaded[c.commentable_type][c.commentable_id] }
+   end
+ end
+
+ def commentable = dataloader.with(Sources::Commentable).load(object)
+ ```
+
+ ```ruby
+ # --- pipeloader ---
+ batch_belongs_to :commentable, polymorphic: true
+ ```
+
+ The `Preloader` groups by `commentable_type` under the hood, so it's one query **per
+ type** across all siblings (not one per record), resolving each to the right class.
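A plain-Ruby sketch of the "one query per type" behaviour, with a hypothetical `TABLES` hash standing in for the per-type tables and a `lookups` array recording how many times a table is consulted. The names are illustrative assumptions, not part of pipeloader or ActiveRecord.

```ruby
# Hypothetical in-memory tables keyed by type name.
TABLES = {
  "Issue"       => { 1 => "issue#1", 2 => "issue#2" },
  "PullRequest" => { 1 => "pr#1" }
}.freeze

Comment = Struct.new(:commentable_type, :commentable_id)

def commentables_for(comments, lookups)
  loaded = comments.group_by(&:commentable_type).to_h do |type, group|
    lookups << type # one lookup per distinct type
    [type, TABLES.fetch(type).slice(*group.map(&:commentable_id).uniq)]
  end
  # Re-map to the input order, picking each record from its own type's rows.
  comments.map { |c| loaded[c.commentable_type][c.commentable_id] }
end

lookups  = []
comments = [
  Comment.new("Issue", 1), Comment.new("PullRequest", 1),
  Comment.new("Issue", 2), Comment.new("Issue", 1)
]
resolved = commentables_for(comments, lookups)
```

Four comments across two types cost two lookups, and ids that collide across types (`Issue` 1 and `PullRequest` 1) still resolve to the right record because the cache is keyed by type first.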
+
+ ---
+
+ ## Where the edges are
+
+ - **The `has_many` proxy handles reads only.** Writes (`<<`, `create`, `build`, …) delegate to
+   the real association; a read method the proxy doesn't implement raises `NoMethodError`
+   rather than silently issuing a per-record query. `:through` collections aren't chainable
+   (they load to an array; put the scope on the macro).
+ - **You still write the query for a `batch` loader.** The escape hatch removes the `Source`
+   class and the `.load`, and batches across siblings for you — but an existence check or a
+   derived value is still SQL you write.
+
+ ---
+
+ ## Ergonomics — the scorecard
+
+ | pattern | `GraphQL::Dataloader` | pipeloader |
+ |---|---|---|
+ | record by id | a `Source` + `.load` resolver | `batch_belongs_to` — **no resolver** |
+ | collection | a `Source` + `.load` resolver | `batch_has_many` — **no resolver** |
+ | has_one | reuse the HasMany source + `.first` | `batch_has_one` — **no resolver** |
+ | count | a `Source` + `.load` resolver | `batch_count` |
+ | sum/avg/min/max | a `Source` + `.load` resolver | `batch_aggregate` |
+ | filtered collection | a dedicated source, or filter in Ruby | `object.issues.where(...)` — pushed into the batch |
+ | by non-PK column | a `Source` + `.load` resolver | a `primary_key:` on the association, or `batch key:` |
+ | existence / viewer-scoped | a viewer-parameterized `Source` + `.load` | `batch ... do \|ids\| ... end` |
+ | derived value | a custom `Source` + `.load` | `batch ... do \|ids\| ... end` |
+ | `has_many :through` | a join in a custom `Source` | `batch_has_many through:` — **no resolver** |
+ | polymorphic belongs_to | group-by-type custom `Source` | `batch_belongs_to polymorphic: true` — **no resolver** |
+
+ Three things change:
+
+ 1. **No source registry.** A dataloader app accretes a `Sources::` namespace of small,
+    near-identical classes. pipeloader has none — the shape is named by the macro.
+ 2. **No `.load` in resolvers — usually no resolver.** Every association resolves through
+    the model's own method (`object.author`), so the GraphQL type is just field
+    declarations. The batching lives on the model, next to the association it batches.
+ 3. **Batched and chainable by default.** A collection is a real, filterable relation
+    surface, with the filter pushed into the single query — no second source per variant.
+
+ What stays the same: the genuinely custom cases — an existence check, a viewer-scoped
+ flag, a derived value — still need a hand-written query. `batch` removes the ceremony
+ around it (no `Source` class, no `.load`, automatic batching across siblings) but you
+ still write the SQL the value needs. That's the honest boundary: every standard
+ association — including `:through` and polymorphic — becomes a macro with no resolver,
+ and the long tail becomes a one-method `batch`, rather than a `Sources::` class apiece.
data/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Joshua Hull
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.