pipeloader 0.0.1
- checksums.yaml +7 -0
- data/README.md +224 -0
- data/lib/pipeloader/ar_patch.rb +49 -0
- data/lib/pipeloader/field_exact.rb +99 -0
- data/lib/pipeloader/pipeliner.rb +49 -0
- data/lib/pipeloader/source.rb +19 -0
- data/lib/pipeloader/version.rb +3 -0
- data/lib/pipeloader.rb +72 -0
- metadata +120 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: e89c97e574dfe8954ae5d612925c75391c10b131291e061f2bbfd92446fa8088
+  data.tar.gz: 8982f0272ae0dc1fe47b7cea62b6b01a462b47ee7447c5173f16720cd424f164
+SHA512:
+  metadata.gz: 59182c22fc62605148fa69ee580858fb670c2199628e8068ac9957fd826979634a9687179d43330013987fb18f6f33c6dad4fe0246850b47d07bfecfeed17d32
+  data.tar.gz: e9d19c2e8de08e14041c9459250c0839af2925895f20ca012e3c3abad39d9bec36ecc6bb91e9470b43c800f111f04a759bb481258d8aeb1bfd10bf52e59d781d
data/README.md
ADDED
@@ -0,0 +1,224 @@
+# pipeloader
+
+Transparent libpq pipelining for **graphql-ruby on ActiveRecord**. During GraphQL
+response building, every ActiveRecord `SELECT` is routed through a libpq pipeline,
+so a nested query resolves in roughly **one round trip per tree level** — with
+**plain resolvers and plain models**. No Futures, no `dataloader.load`, no field
+changes.
+
+## Adopting it
+
+One line:
+
+```ruby
+class AppSchema < GraphQL::Schema
+  use Pipeloader
+end
+```
+
+That's the whole adoption surface. Your types and resolvers stay exactly as they
+are — ordinary ActiveRecord:
+
+```ruby
+class Types::Post < GraphQL::Schema::Object
+  field :title, String, null: false
+  field :author, Types::Author, null: false # resolves via post.author
+  field :comments, [Types::Comment], null: false # resolves via post.comments
+end
+```
+
+`post.author`, `post.comments`, a `has_many`, a `.where(...)` in a hand-written
+resolver — any AR SELECT issued while building the response is intercepted and
+pipelined. Because it hooks AR's query path (not the GraphQL field), nothing
+leaks back to synchronous N+1, even from custom resolver code.
+
+## What it does
+
+`example/run.rb` — plain resolvers, against a seeded database:
+
+```
+{ posts(limit: 50) { title author { name } comments { body commenter { name } } } }
+
+resolved 50 posts with PLAIN AR resolvers
+pipeline round-trips: 3
+queries pipelined: 403
+naive N+1 would be: ~594 round trips
+```
+
+Three round trips: `posts` → (`authors` + `comments`) → `commenters`. The to-one
+`author` and the to-many `comments` are *different shapes at the same level*, yet
+collapse into a single round trip.
+
+## Benchmark
+
+The same 3-level query (`posts → author + comments → commenter`, 25 posts),
+resolved four ways, with **real** network latency added by a local TCP proxy in
+front of Postgres (`example/latency_proxy.rb` delays the request direction, so each
+synchronous query pays the RTT once, while a whole pipelined burst pays it only once). Min of 3
+iterations; your numbers will vary.
+
+| approach | RTT 0 | RTT 1 ms | RTT 5 ms | round-trips |
+|---|--:|--:|--:|--:|
+| naive (N+1) | 94 ms | 505 ms | **1972 ms** | 290 |
+| AR `includes` (hand-written) | 17 ms | 22 ms | 42 ms | 4 |
+| `GraphQL::Dataloader` | 16 ms | 21 ms | 42 ms | 4 |
+| **pipeloader** | 41 ms | 45 ms | **73 ms** | **3** |
+
+Reading it honestly:
+
+- **vs the N+1 you actually have** — the headline. pipeloader turns 290 round
+  trips into 3 with zero resolver code, so at a 5 ms hop it's **~27× faster**
+  (1972 ms vs 73 ms) than naive. Most "there's an N+1 in here somewhere" code is the naive row.
+- **vs batching (`includes` / `GraphQL::Dataloader`)** — at low/moderate RTT,
+  batching still wins: its 4 `IN` queries do less work than pipeloader's ~400
+  prepared point queries. pipeloader prepares + caches statements per connection
+  (so parse cost is amortized to ~one parse per query shape), but it still runs
+  400 bind+executes and builds 400 results. **Pipelining cuts round trips;
+  batching cuts server work.** pipeloader does fewer round trips (3 vs 4 — it
+  collapses the to-one `author` and to-many `comments` into one burst, where
+  Dataloader runs them as two sequential sources), so it closes the gap as RTT
+  rises and passes the batchers around ~25 ms RTT (cross-region). Same
+  point-vs-batch tradeoff the Go experiments in this repo show.
+- **What pipeloader actually buys you: zero code, for any query shape.**
+  `GraphQL::Dataloader` needs a source plus a `.load` call per association;
+  `includes` must be hand-written per query and kept in sync with the selection.
+  pipeloader is `use Pipeloader` and ordinary resolvers.
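Reviewer sketch (not part of the package): the statement caching mentioned in the batching bullet, "one parse per query shape", can be illustrated without a database. `FakeConnection` and `StatementCache` are hypothetical stand-ins invented for this example; only the `pipeloader_<n>` naming scheme is taken from `pipeliner.rb` in this diff.

```ruby
# Hypothetical stand-in for a database connection: it only records prepares.
class FakeConnection
  attr_reader :prepared_names

  def initialize
    @prepared_names = []
  end

  def prepare(name, _sql)
    @prepared_names << name
  end
end

# Maps each distinct SQL string to a statement name, preparing it on first sight.
class StatementCache
  def initialize(conn)
    @conn = conn
    @names = {}
  end

  def name_for(sql)
    @names[sql] ||= begin
      name = "pipeloader_#{@names.size}"
      @conn.prepare(name, sql)
      name
    end
  end
end

conn = FakeConnection.new
cache = StatementCache.new(conn)

author_sql = "SELECT * FROM authors WHERE id = $1"
comment_sql = "SELECT * FROM comments WHERE post_id = $1"

# 25 posts, each needing one author lookup and one comments lookup.
names = (1..25).flat_map { [cache.name_for(author_sql), cache.name_for(comment_sql)] }

puts "executions: #{names.size}, prepares: #{conn.prepared_names.size}"
```

Fifty executions share two prepares here, which is the amortization the bullet describes. The per-execution bind and result-building cost is untouched, so this does not change the point-vs-batch tradeoff.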
+
+Run it: `ruby example/bench.rb` (needs the seeded `graphql_experiment` DB).
+
+### Scaling with tree shape
+
+That benchmark is a *narrow* tree (3 deep, 2 relations at its widest), which is
+close to the worst case for pipeloader. The gap widens with **width**, because:
+
+- **pipeloader round trips = tree depth** — one burst per level, any width.
+- **batching round trips = Σ (distinct target tables per level)** — each is its
+  own `IN` query (a Dataloader source, or an `includes` preload).
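Reviewer sketch (not part of the package): the two round-trip formulas above can be checked against the narrow benchmark tree. The table names per level are assumptions made for illustration (the README only names the relations), and `tags` is a hypothetical extra relation.

```ruby
# Each level lists the target tables that level's queries hit (assumed names).
# Narrow benchmark tree: posts -> author + comments -> commenter.
NARROW_TREE = [
  %w[posts],
  %w[authors comments],
  %w[users] # commenter
].freeze

# Pipelining: one burst per level, regardless of how many tables it touches.
def pipelined_round_trips(levels)
  levels.size
end

# Batching: one IN query per distinct target table at each level.
def batched_round_trips(levels)
  levels.sum { |tables| tables.uniq.size }
end

# Widen level 2 with one more (hypothetical) relation.
wider_tree = [NARROW_TREE[0], NARROW_TREE[1] + %w[tags], NARROW_TREE[2]]

puts "narrow: pipelined=#{pipelined_round_trips(NARROW_TREE)} batched=#{batched_round_trips(NARROW_TREE)}"
puts "wider:  pipelined=#{pipelined_round_trips(wider_tree)} batched=#{batched_round_trips(wider_tree)}"
```

For the narrow tree this gives 3 and 4, matching the round-trip column of the benchmark table. Adding a sibling relation raises only the batched count.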
+
+A *wide* query — issues fanning out to assignee, creator, project, parent, and
+comments, those nesting to team, lead, and authors (`example/bench_wide.rb`):
+
+| approach | RTT 0 | RTT 1 ms | RTT 5 ms | round-trips |
+|---|--:|--:|--:|--:|
+| naive (N+1) | 63 ms | 278 ms | 1115 ms | 164 |
+| AR `includes` | 13 ms | 29 ms | 91 ms | 11 |
+| `GraphQL::Dataloader` | 9 ms | 20 ms | 57 ms | 7 |
+| **pipeloader** | 28 ms | 34 ms | **51 ms** | **3** |
+
+pipeloader's round trips stay at **3** (the depth) while batching climbs to 7–11,
+so at a 5 ms hop **pipeloader is the fastest of all** — the point-vs-batch
+crossover dropped from ~25 ms (narrow) to under 5 ms (wide). The wider and deeper
+the tree, the lower the RTT at which pipelining wins, because pipelining is the
+only one of the three whose round trips don't grow with the query.
+
+## How it works
+
+1. **`use GraphQL::Dataloader`** — runs resolution in fibers. This is what lets a
+   synchronous-looking `post.author` *yield* instead of blocking, so sibling
+   queries can gather before anything hits the wire.
+2. **A monkey-patch on `select_all`** — while a response is being built, AR's
+   SELECT path hands the query to a Dataloader source instead of executing it.
+   The active dataloader is **stashed on the connection** for the duration of the
+   multiplex (and cleared at the end), so the patch finds it as `self`.
+3. **The source pipelines** — when the fibers all park, it prepares each distinct
+   SQL once (cached per connection), then sends every gathered query as one libpq
+   burst (`enter_pipeline_mode` … `pipeline_sync`), reads the results, and returns
+   an `ActiveRecord::Result` per query so AR builds models normally.
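Reviewer sketch (not part of the package): the "gather, then flush" idea behind the three steps above can be modelled with plain Ruby fibers and no database. `ToyGatherer` is a hypothetical toy, not pipeloader's code and not how `GraphQL::Dataloader` is implemented internally; it only shows why sibling queries end up in the same burst.

```ruby
# Toy scheduler: each job runs in a fiber, parks on every query, and all
# parked queries are "sent" together before the fibers are resumed.
class ToyGatherer
  attr_reader :bursts

  def initialize
    @pending = [] # [sql, slot] pairs waiting for the next burst
    @bursts = []  # each entry is the list of SQL strings sent together
  end

  # Called inside a fiber: enqueue, park, and return the rows after resume.
  def query(sql)
    slot = {}
    @pending << [sql, slot]
    Fiber.yield
    slot[:rows]
  end

  def run(jobs)
    live = jobs.map { |job| Fiber.new { job.call(self); :done } }
    until live.empty?
      # Resume every fiber until it parks on a query or finishes.
      live = live.reject { |fiber| fiber.resume == :done }
      flush
    end
  end

  private

  # Stand-in for the pipelined burst: answer everything gathered so far.
  def flush
    return if @pending.empty?

    @bursts << @pending.map(&:first)
    @pending.each { |sql, slot| slot[:rows] = "rows for #{sql}" }
    @pending.clear
  end
end

resolved = []
jobs = (1..3).map do |id|
  lambda do |db|
    db.query("author of post #{id}")     # level 2
    db.query("profile of author #{id}")  # level 3
    resolved << id
  end
end

gatherer = ToyGatherer.new
gatherer.run(jobs)
puts "bursts: #{gatherer.bursts.size}, queries: #{gatherer.bursts.flatten.size}"
```

Six queries go out in two bursts, one per level, even though each job is written as straight-line code.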
+
+## Field-exact projection (opt-in)
+
+By default AR picks the columns (`SELECT *`), which keeps adoption zero-effort. If
+you want the pipeline to fetch **only the columns the query selected**, opt in and
+pipeloader narrows each SELECT using graphql-ruby's `lookahead`:
+
+```ruby
+Pipeloader.field_exact = true # globally, before your types load, or…
+
+class Types::Post < GraphQL::Schema::Object
+  pipeloader_field_exact! # …per type
+  field :title, String, null: false
+  field :author, Types::Author, null: false
+end
+```
+
+For `{ posts { title author { name } } }` the posts SELECT becomes
+`SELECT id, title, author_id FROM …` (PK + selected column + the FK needed to
+resolve `author`), and the authors SELECT becomes `SELECT id, name FROM …` — for
+both the root relation and the `belongs_to`.
+
+**It never breaks a field.** A classifier narrows only when it can *prove* every
+selected field reads a known column or association. The instant a selection is
+opaque — a computed field, a custom resolver, anything it can't map to a column —
+it **bails to a whole-row fetch** for that record, so a projected field can never
+raise `MissingAttributeError`.
+
+**The `selects:` escape hatch.** A computed field can declare the columns it
+reads, so projection keeps them instead of bailing to a whole row:
+
+```ruby
+field :excerpt, String, null: false, selects: %i[body]
+def excerpt = object.body[0, 200]
+```
+
+Selecting `excerpt` now adds `body` to the projection. With no opt-in (the
+default), `selects:` is accepted and ignored, and every SELECT is whole-row.
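Reviewer sketch (not part of the package): the "narrow only when provable, otherwise bail" rule can be shown with plain hashes instead of ActiveRecord and graphql-ruby. The `POST` description and the shape of the selection hashes are hypothetical; the branch order follows `project_columns` in `field_exact.rb` from this diff.

```ruby
# Hypothetical description of a model: columns plus association metadata.
POST = {
  primary_key: "id",
  columns: %w[id title body author_id created_at],
  belongs_to: { "author" => "author_id" },
  has_many: %w[comments]
}.freeze

# selections: hashes with :name, optional :selects (declared columns) and
# optional :custom (field has a hand-written resolver).
# Returns a column list, or nil meaning "cannot prove it is safe, fetch whole rows".
def project_columns(model, selections)
  needed = [model[:primary_key]]
  selections.each do |sel|
    name = sel[:name]
    if sel[:selects]
      needed.concat(sel[:selects])        # explicit escape hatch
    elsif sel[:custom]
      return nil                          # custom resolver: opaque
    elsif model[:columns].include?(name)
      needed << name                      # plain column
    elsif model[:belongs_to].key?(name)
      needed << model[:belongs_to][name]  # keep the FK for a belongs_to
    elsif model[:has_many].include?(name)
      next                                # keyed off the primary key
    else
      return nil                          # unknown accessor: opaque
    end
  end
  needed.uniq
end

p project_columns(POST, [{ name: "title" }, { name: "author" }])
p project_columns(POST, [{ name: "title" }, { name: "excerpt", custom: true }])
p project_columns(POST, [{ name: "excerpt", custom: true, selects: %w[body] }])
```

The first call yields the PK, the selected column, and the foreign key; the second bails with `nil`; the third keeps `body` because the field declared it.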
+
+## Status & caveats — this is a proof of concept
+
+- **Whole rows by default; field-exact is [opt-in](#field-exact-projection-opt-in).**
+  Off, AR picks the columns (maximum transparency); on, the pipeline projects to
+  the selected columns and bails to whole rows on anything opaque.
+- **PostgreSQL pipelines; SQLite narrows only; anything else raises.** Pipelining
+  is libpq-specific, so on PostgreSQL queries are pipelined, on SQLite they run
+  un-pipelined (the opt-in column projection still applies, useful for tests/dev),
+  and any other adapter raises a `RuntimeError` at query time rather than silently
+  misbehaving. Running SQLite un-pipelined is safe because SQLite is *embedded* —
+  its queries are in-process calls with no network round trip, so there's nothing
+  for a dataloader or a pipeline to collapse. N+1 there is just a series of cheap
+  local calls, not the latency amplification that makes N+1 catastrophic against a
+  networked database. So pipelining buys nothing on SQLite, and skipping it costs
+  nothing.
+- **Reads only.** It intercepts `select_all` (SELECTs); writes and non-SELECTs
+  fall straight through, and queries inside an open transaction are skipped.
+- **Assumes thread-isolated connections** (the ActiveRecord default): a request's
+  resolver fibers all share one connection. Under `:fiber` isolation you'd stash
+  per leased connection.
+- **Stats are process-global**, single-threaded demo instrumentation.
+- Prepares and caches statements per connection, but doesn't re-prepare after a
+  reconnect / `DEALLOCATE` the way AR does. Also not hardened for multiple
+  databases, `count`/`exists?` (which route through other methods), or error
+  recovery mid-pipeline.
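Reviewer sketch (not part of the package): the gating rules in the caveats above (adapter, open transaction, SELECT-only) can be collapsed into one predicate for illustration. `pipeline?` is a hypothetical helper; in the package the adapter check lives in `Pipeloader.pipelining_supported?` and the other two checks in the `select_all` patch.

```ruby
# Decide whether a statement would be gathered for pipelining.
def pipeline?(adapter_name:, in_transaction:, sql:)
  supported =
    case adapter_name
    when "PostgreSQL" then true
    when "SQLite" then false # runs un-pipelined
    else raise "#{adapter_name} is not supported"
    end

  # Same prefix test as the patch: the first six non-blank characters must
  # spell "select", case-insensitively.
  supported && !in_transaction && sql.lstrip[0, 6].casecmp("select").zero?
end

checks = {
  "plain select"   => pipeline?(adapter_name: "PostgreSQL", in_transaction: false, sql: "  select 1"),
  "in transaction" => pipeline?(adapter_name: "PostgreSQL", in_transaction: true, sql: "SELECT 1"),
  "insert"         => pipeline?(adapter_name: "PostgreSQL", in_transaction: false, sql: "INSERT INTO posts DEFAULT VALUES"),
  "sqlite"         => pipeline?(adapter_name: "SQLite", in_transaction: false, sql: "SELECT 1"),
  "cte"            => pipeline?(adapter_name: "PostgreSQL", in_transaction: false, sql: "WITH t AS (SELECT 1) SELECT * FROM t")
}
checks.each { |label, gathered| puts "#{label}: #{gathered}" }
```

One consequence of a prefix test like this is that a statement starting with `WITH` is not recognised as a SELECT, so under this sketch it would take the synchronous path.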
+
+## Running the example
+
+```bash
+# Needs a Postgres DB with posts/authors/comments/users tables. In this repo:
+# go run ./cmd/gqlbench -reset # seeds the graphql_experiment DB
+ruby example/run.rb # shows the round-trip collapse
+ruby example/bench.rb # the latency benchmark (narrow tree)
+ruby example/bench_wide.rb # the wide-tree benchmark
+```
+
+Requires `activerecord`, `graphql`, and `pg` (libpq ≥ 14 for pipelining).
+
+## Tests
+
+`rake test`. Three suites, all **parity-first** — the pipelined result must be
+byte-identical to plain ActiveRecord:
+
+- **`test/pipeloader_test.rb`** — every query runs through both a plain schema and
+  a `use Pipeloader` schema, asserting identical results across each relationship
+  kind, nullable foreign keys, empty has-many, deduplication, ordering, type
+  casting, aliases, variables, and multiplex. It also checks round-trip counts
+  (= tree depth) and that the patch leaves writes, transactions, and non-GraphQL
+  ActiveRecord untouched.
+- **`test/field_exact_test.rb`** — the opt-in projection: projected results match
+  the whole-row schema, the emitted SQL is actually narrowed (and keeps the FK),
+  the `selects:` escape hatch includes its columns, and opaque fields bail to a
+  whole-row `SELECT *` instead of raising.
+- **`test/adapter_test.rb`** — adapter handling: PostgreSQL pipelines, an
+  unsupported adapter raises, and a real in-memory **SQLite** run (in a subprocess)
+  proves projection works there with pipelining disabled.
+
+Needs a reachable Postgres (the suites create `pl_*` fixture tables in
+`graphql_experiment`).
data/lib/pipeloader/ar_patch.rb
ADDED
@@ -0,0 +1,49 @@
+module Pipeloader
+  # Monkey-patch: while a Pipeloader dataloader is active (i.e. during GraphQL
+  # response building), route SELECTs through the pipelining Source instead of
+  # executing them synchronously. Everything else falls through to AR untouched.
+  module ARPatch
+    def self.install!
+      return if @installed
+
+      # select_all lives on QueryCache (ahead of AbstractAdapter in the chain),
+      # so prepend there to intercept before AR's own query cache.
+      ActiveRecord::ConnectionAdapters::QueryCache.prepend(Methods)
+      @installed = true
+    end
+
+    module Methods
+      # The active dataloader is stashed on the connection for the duration of the
+      # response phase (set/cleared by Pipeloader::Trace). The connection is the
+      # pipelining unit and is shared across the request's resolver fibers, so
+      # this is both the natural home and reliably visible here as `self`.
+      attr_accessor :pipeloader_dataloader
+
+      def select_all(arel, name = nil, binds = [], preparable: nil, async: false, allow_retry: false)
+        dl = pipeloader_dataloader
+        if dl.is_a?(GraphQL::Dataloader) && !in_transaction?
+          relation = arel_from_relation(arel)
+          sql, bind_objs, = to_sql_and_binds(relation, binds, preparable, allow_retry)
+          if sql.lstrip[0, 6].casecmp("select").zero?
+            params = bind_objs.map { |b| b.respond_to?(:value_for_database) ? b.value_for_database : b }
+            return dl.with(Pipeloader::Source, raw_connection).load([sql, params])
+          end
+        end
+        # Synchronous fallback (no active response, or not gathered): one query,
+        # one round trip.
+        result = super
+        Pipeloader.round_trips += 1
+        Pipeloader.queries += 1
+        result
+      end
+
+      private
+
+      def in_transaction?
+        current_transaction.open?
+      rescue StandardError
+        false
+      end
+    end
+  end
+end
data/lib/pipeloader/field_exact.rb
ADDED
@@ -0,0 +1,99 @@
+module Pipeloader
+  # Opt-in, off by default. Set true before your schema's types are defined to
+  # turn on projection everywhere; or opt in per type with `pipeloader_field_exact!`.
+  class << self
+    attr_accessor :field_exact
+  end
+  self.field_exact = false
+
+  # Per-type opt-in, mixed into every GraphQL::Schema::Object.
+  module TypeOptIn
+    def pipeloader_field_exact!(value = true)
+      @pipeloader_field_exact = value
+    end
+
+    def pipeloader_field_exact?
+      return @pipeloader_field_exact if defined?(@pipeloader_field_exact)
+
+      superclass.respond_to?(:pipeloader_field_exact?) && superclass.pipeloader_field_exact?
+    end
+  end
+
+  # Monkey-patches `field` to (a) accept the `selects:` escape hatch and (b)
+  # attach the projection extension when this field's type has opted in.
+  module FieldSelects
+    attr_reader :pipeloader_selects
+
+    def initialize(*args, selects: nil, owner: nil, extensions: [], **kwargs, &block)
+      @pipeloader_selects = selects && Array(selects).map(&:to_s)
+      if Pipeloader.field_exact || (owner.respond_to?(:pipeloader_field_exact?) && owner.pipeloader_field_exact?)
+        extensions = extensions + [ProjectionExtension]
+      end
+      super(*args, owner: owner, extensions: extensions, **kwargs, &block)
+    end
+  end
+
+  # Narrows a returned ActiveRecord::Relation to exactly the columns the query
+  # needs — and bails to a whole-row fetch the moment a selected field can't be
+  # proven to read only known columns, so it can never raise MissingAttributeError.
+  class ProjectionExtension < GraphQL::Schema::FieldExtension
+    extras [:lookahead]
+
+    def resolve(object:, arguments:, context:, **)
+      lookahead = arguments[:lookahead]
+      inner = arguments.key?(:lookahead) ? arguments.reject { |k, _| k == :lookahead } : arguments
+
+      # A belongs_to is loaded whole-row by `object.assoc`, so resolve it via a
+      # projected query instead (still pipelined). Skipped if the type defines a
+      # custom resolver, or the selection is opaque (then fall through to default).
+      record = object.respond_to?(:object) ? object.object : object
+      if lookahead && record.is_a?(ActiveRecord::Base) &&
+         !field.owner.instance_methods(false).include?(field.resolver_method) &&
+         (assoc = record.class.reflect_on_association(field.method_str.to_sym))&.belongs_to?
+        fk = record.public_send(assoc.foreign_key)
+        return nil if fk.nil?
+
+        cols = Pipeloader.project_columns(assoc.klass, lookahead)
+        return assoc.klass.where(assoc.klass.primary_key => fk).select(*cols).first if cols
+      end
+
+      value = yield(object, inner)
+      return value unless lookahead && value.is_a?(ActiveRecord::Relation)
+
+      cols = Pipeloader.project_columns(value.klass, lookahead)
+      return value unless cols # opaque field selected -> fetch whole rows
+
+      # Keep a has_many's foreign key so AR can still group / wire the inverse.
+      if value.respond_to?(:proxy_association) && (proxy = value.proxy_association)
+        cols += Array(proxy.reflection.foreign_key)
+      end
+      value.select(*cols.uniq)
+    end
+  end
+
+  # Returns the exact column list for a model + selection, or nil meaning
+  # "can't prove it's safe — fetch whole rows."
+  def self.project_columns(model, lookahead)
+    columns = model.column_names
+    needed = [model.primary_key]
+    lookahead.selections.each do |sel|
+      field = sel.field
+      if field.respond_to?(:pipeloader_selects) && field.pipeloader_selects
+        needed.concat(field.pipeloader_selects) # explicit escape hatch
+      elsif field.owner.instance_methods(false).include?(field.resolver_method)
+        return nil # custom resolver: opaque
+      elsif columns.include?(field.method_str)
+        needed << field.method_str # plain column
+      elsif (assoc = model.reflect_on_association(field.method_str.to_sym))
+        needed << assoc.foreign_key if assoc.belongs_to? # FK for a belongs_to
+        # has_many keys off the model's PK, already included
+      else
+        return nil # unknown accessor: opaque
+      end
+    end
+    needed.compact.uniq.map(&:to_s)
+  end
+end
+
+GraphQL::Schema::Field.prepend(Pipeloader::FieldSelects)
+GraphQL::Schema::Object.extend(Pipeloader::TypeOptIn)
data/lib/pipeloader/pipeliner.rb
ADDED
@@ -0,0 +1,49 @@
+require "pg"
+
+module Pipeloader
+  # Runs a batch of SELECTs as one libpq pipelined burst on a PG::Connection.
+  module Pipeliner
+    module_function
+
+    # queries: array of [sql, params]. Returns array of [columns, rows] (raw
+    # strings), in the same order, having sent them all in a single round trip.
+    def pipeline_batch(pg, queries)
+      prepared = pg.instance_variable_get(:@pipeloader_prepared)
+      unless prepared
+        prepared = {}
+        pg.instance_variable_set(:@pipeloader_prepared, prepared)
+      end
+
+      # Prepare any unseen SQL before entering pipeline mode (Parse can't be
+      # issued synchronously mid-pipeline). GraphQL has a bounded set of query
+      # shapes, so this amortizes to ~one parse per shape for the connection's
+      # life; thereafter every execution reuses the named statement.
+      queries.each do |sql, _params|
+        prepared[sql] ||= begin
+          name = "pipeloader_#{prepared.size}"
+          pg.prepare(name, sql)
+          name
+        end
+      end
+
+      pg.enter_pipeline_mode
+      queries.each { |sql, params| pg.send_query_prepared(prepared[sql], params) }
+      pg.pipeline_sync
+
+      results = []
+      loop do
+        result = pg.get_result
+        break if result.nil?
+        break if result.result_status == PG::PGRES_PIPELINE_SYNC
+
+        # Raw strings, so ActiveRecord casts via its own column types (and so we
+        # never disturb the connection's type map that AR relies on).
+        result.type_map = PG::TypeMapAllStrings.new
+        results << [result.fields, result.values]
+        pg.get_result # per-query nil delimiter
+      end
+      pg.exit_pipeline_mode
+      results
+    end
+  end
+end
data/lib/pipeloader/source.rb
ADDED
@@ -0,0 +1,19 @@
+module Pipeloader
+  # A GraphQL::Dataloader source that gathers every SELECT parked at a fiber tick
+  # and runs them as one pipelined burst, returning an ActiveRecord::Result per
+  # query (so AR builds models normally).
+  class Source < GraphQL::Dataloader::Source
+    def initialize(pg)
+      @pg = pg
+    end
+
+    # keys: array of [sql, params], deduplicated by Dataloader. Must return one
+    # result per key, in order.
+    def fetch(keys)
+      batch = Pipeliner.pipeline_batch(@pg, keys)
+      Pipeloader.round_trips += 1
+      Pipeloader.queries += keys.size
+      batch.map { |columns, rows| ActiveRecord::Result.new(columns, rows) }
+    end
+  end
+end
data/lib/pipeloader.rb
ADDED
@@ -0,0 +1,72 @@
+require "pg"
+require "graphql"
+require "active_record"
+
+require_relative "pipeloader/version"
+require_relative "pipeloader/pipeliner"
+require_relative "pipeloader/source"
+require_relative "pipeloader/ar_patch"
+require_relative "pipeloader/field_exact"
+
+# Pipeloader makes a graphql-ruby query resolve its ActiveRecord SELECTs through
+# a libpq pipeline: one round trip per tree level, transparently. Resolvers stay
+# plain AR — no Futures, no dataloader.load, no field changes — because AR's own
+# query path is intercepted during response building.
+#
+# class AppSchema < GraphQL::Schema
+#   use Pipeloader # adds GraphQL::Dataloader (fibers) + the AR patch
+# end
+module Pipeloader
+  class << self
+    # Process-global stats (single-threaded; reset at the start of each multiplex).
+    attr_accessor :round_trips, :queries
+  end
+  self.round_trips = 0
+  self.queries = 0
+
+  def self.reset_stats!
+    self.round_trips = 0
+    self.queries = 0
+  end
+
+  def self.use(schema)
+    schema.use(GraphQL::Dataloader) # run resolution in fibers so SELECTs can gather
+    ARPatch.install!
+    schema.trace_with(Trace)
+  end
+
+  # Pipelining is libpq-specific. PostgreSQL pipelines; SQLite can't, but the
+  # opt-in column projection is plain ActiveRecord and still applies, so SQLite
+  # is allowed with pipelining disabled. Any other adapter is unsupported.
+  #
+  # Running SQLite un-pipelined is safe because SQLite is embedded: its queries
+  # are in-process calls with no network round trip, so there's nothing for a
+  # dataloader or pipeline to collapse. N+1 there is just cheap local calls, not
+  # the latency amplification pipelining exists to remove.
+  def self.pipelining_supported?(conn)
+    case conn.adapter_name
+    when "PostgreSQL" then true
+    when "SQLite" then false
+    else
+      raise "Pipeloader supports PostgreSQL (pipelined) and SQLite " \
+            "(field narrowing only); #{conn.adapter_name} is not supported."
+    end
+  end
+
+  # Stash the active dataloader on the connection for the whole response phase,
+  # and clear it at the end. This is done at *multiplex* scope, not per-query,
+  # because under Dataloader resolution is deferred to the multiplex's fiber run
+  # loop — a per-query hook would clear the stash before resolvers ever run.
+  module Trace
+    def execute_multiplex(multiplex:)
+      Pipeloader.reset_stats!
+      conn = ActiveRecord::Base.connection
+      # Raises on an unsupported adapter; on SQLite, leaves the stash unset so
+      # select_all never pipelines (column projection still applies).
+      conn.pipeloader_dataloader = multiplex.dataloader if Pipeloader.pipelining_supported?(conn)
+      super
+    ensure
+      conn.pipeloader_dataloader = nil if conn
+    end
+  end
+end
metadata
ADDED
@@ -0,0 +1,120 @@
+--- !ruby/object:Gem::Specification
+name: pipeloader
+version: !ruby/object:Gem::Version
+  version: 0.0.1
+platform: ruby
+authors:
+- Joshua Hull
+bindir: bin
+cert_chain: []
+date: 1980-01-02 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: activerecord
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '7.1'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '7.1'
+- !ruby/object:Gem::Dependency
+  name: graphql
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '2.0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '2.0'
+- !ruby/object:Gem::Dependency
+  name: pg
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '1.3'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '1.3'
+- !ruby/object:Gem::Dependency
+  name: minitest
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '5'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '5'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '13'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '13'
+description: During GraphQL response building, Pipeloader routes ActiveRecord SELECTs
+  through a libpq pipeline so a query tree resolves in roughly one round trip per
+  level — with plain resolvers and plain models, no Futures, no dataloader.load, no
+  resolver changes.
+email:
+- josh@fireflop.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- README.md
+- lib/pipeloader.rb
+- lib/pipeloader/ar_patch.rb
+- lib/pipeloader/field_exact.rb
+- lib/pipeloader/pipeliner.rb
+- lib/pipeloader/source.rb
+- lib/pipeloader/version.rb
+homepage: https://github.com/joshbuddy/pipeloader
+licenses:
+- MIT
+metadata: {}
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '3.2'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubygems_version: 3.6.7
+specification_version: 4
+summary: Transparent libpq pipelining for graphql-ruby on ActiveRecord
+test_files: []