RubyGems - pipeloader - Versions diffs - 0.0.1 → 0.0.3 - Mend

pipeloader 0.0.1 → 0.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

checksums.yaml +4 -4
data/DATALOADERS.md +379 -0
data/LICENSE +21 -0
data/README.md +243 -161
data/lib/pipeloader/ar_patch.rb +3 -1
data/lib/pipeloader/batch/batch_loader.rb +63 -0
data/lib/pipeloader/batch/batch_proxy.rb +204 -0
data/lib/pipeloader/batch/context.rb +43 -0
data/lib/pipeloader/batch/fetcher.rb +30 -0
data/lib/pipeloader/batch/fetcher_state.rb +27 -0
data/lib/pipeloader/batch/load_grouping.rb +28 -0
data/lib/pipeloader/batch/load_interceptor.rb +44 -0
data/lib/pipeloader/batch/model.rb +170 -0
data/lib/pipeloader/batch/relationship.rb +68 -0
data/lib/pipeloader/batch.rb +44 -0
data/lib/pipeloader/field_exact.rb +235 -14
data/lib/pipeloader/pipeliner.rb +114 -26
data/lib/pipeloader/source.rb +27 -3
data/lib/pipeloader/version.rb +1 -1
data/lib/pipeloader.rb +32 -1
metadata +47 -4

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: e89c97e574dfe8954ae5d612925c75391c10b131291e061f2bbfd92446fa8088
-  data.tar.gz: 8982f0272ae0dc1fe47b7cea62b6b01a462b47ee7447c5173f16720cd424f164
+  metadata.gz: 2efb3ab68daaf2649998d53eab252167254d1e195ff3f46ff676c1bcaeeb8d17
+  data.tar.gz: a45b1c6236205caf6421437bdf4ec6b04b459698d630f63708c6f929e63c1242
 SHA512:
-  metadata.gz: 59182c22fc62605148fa69ee580858fb670c2199628e8068ac9957fd826979634a9687179d43330013987fb18f6f33c6dad4fe0246850b47d07bfecfeed17d32
-  data.tar.gz: e9d19c2e8de08e14041c9459250c0839af2925895f20ca012e3c3abad39d9bec36ecc6bb91e9470b43c800f111f04a759bb481258d8aeb1bfd10bf52e59d781d
+  metadata.gz: c607f4d9a7c5c4c5ee035bc2458ad3e91b0a1d15e83d7a9cedd35a2942de1c470521175eff0d9898211c828575c1615aeb514212e3edb252159686b5aa11864b
+  data.tar.gz: 5d9dbf27e303c91733d6f6b5787015bf993a9658b173f689e4cecbd9a0a55739ac4d691976bc4cfe034f0b5d0d8e541717c1a57a3b6780ed58b57c454d3e7180

data/DATALOADERS.md ADDED

@@ -0,0 +1,379 @@
+# Batch loaders vs `GraphQL::Dataloader`, by example
+Almost every `GraphQL::Dataloader::Source` you write is one of a handful of shapes:
+load a record by id, load children by foreign key, count them, look one up by some
+other column, check whether a row exists. `Pipeloader::Batch` collapses each of those
+to a line on the model — and for the association shapes, to **no resolver at all**:
+you declare the field and call `object.author`, and the load batches itself.
+Every "before" below is real `GraphQL::Dataloader` code (a custom `Source`, or one of
+graphql-ruby's built-ins); every "after" is the pipeloader equivalent. The schema is a
+GitHub-ish one: `Repository`, `Issue`, `User`, `Star`, `Topic`.
+The one-time setup: `include Pipeloader::Batch::Model` in the model. There's nothing to
+scope — siblings are the records loaded by the same query, and the group rides on the
+records, so batching just works. Everything below assumes the concern is included.
+---
+## 1. A record by id — `belongs_to`
+The single most common source: hydrate a foreign key.
+```ruby
+# --- GraphQL::Dataloader ---
+class Sources::Record < GraphQL::Dataloader::Source
+  def initialize(model) = @model = model
+  def fetch(ids)
+    found = @model.where(id: ids).index_by(&:id)
+    ids.map { |id| found[id] }            # one row per id, nil if missing
+  end
+end
+class Types::Issue < GraphQL::Schema::Object
+  field :author, Types::User, null: false
+  def author = dataloader.with(Sources::Record, ::User).load(object.author_id)
+end
+```
+```ruby
+# --- pipeloader ---
+class Issue < ApplicationRecord
+  include Pipeloader::Batch::Model
+  batch_belongs_to :author
+end
+class Types::Issue < GraphQL::Schema::Object
+  field :author, Types::User, null: false   # no resolver — object.author batches
+end
+```
+graphql-ruby's built-in `ActiveRecordAssociationSource` is the closest dataloader
+equivalent, and even it needs a per-field resolver:
+```ruby
+def author = dataloader.with(GraphQL::Dataloader::ActiveRecordAssociationSource, :author).load(object)
+```
+pipeloader needs none: the field resolves through `object.author`, which the macro has
+already made batch across every sibling `Issue` in the context.
+---
+## 2. A collection — `has_many`
+```ruby
+# --- GraphQL::Dataloader ---
+class Sources::HasMany < GraphQL::Dataloader::Source
+  def initialize(model, fk)
+    @model = model
+    @fk = fk
+  end
+  def fetch(owner_ids)
+    by_owner = @model.where(@fk => owner_ids).group_by { |r| r[@fk] }
+    owner_ids.map { |id| by_owner[id] || [] }
+  end
+end
+def issues = dataloader.with(Sources::HasMany, ::Issue, :repository_id).load(object.id)
+```
+```ruby
+# --- pipeloader ---
+class Repository < ApplicationRecord
+  include Pipeloader::Batch::Model
+  batch_has_many :issues
+end
+# field :issues, [Types::Issue], null: false   — no resolver
+```
+`object.issues` returns a lazy, chainable proxy whose load batches across every
+sibling. See §6 for the chaining.
+---
+## 3. A singular child — `has_one`
+```ruby
+# --- GraphQL::Dataloader: HasMany source, but take the first ---
+def profile = dataloader.with(Sources::HasMany, ::Profile, :user_id).load(object.id).then(&:first)
+```
+```ruby
+# --- pipeloader ---
+batch_has_one :profile
+```
+---
+## 4. A count
+```ruby
+# --- GraphQL::Dataloader ---
+class Sources::Count < GraphQL::Dataloader::Source
+  def initialize(model, fk)
+    @model = model
+    @fk = fk
+  end
+  def fetch(owner_ids)
+    counts = @model.where(@fk => owner_ids).group(@fk).count
+    owner_ids.map { |id| counts[id] || 0 }
+  end
+end
+def issues_count = dataloader.with(Sources::Count, ::Issue, :repository_id).load(object.id)
+```
+```ruby
+# --- pipeloader ---
+batch_count :issues_count            # source derived from the name; default 0
+```
+---
+## 5. A sum / aggregate
+```ruby
+# --- GraphQL::Dataloader ---
+class Sources::Sum < GraphQL::Dataloader::Source
+  def initialize(model, fk, column)
+    @model = model
+    @fk = fk
+    @column = column
+  end
+  def fetch(owner_ids)
+    sums = @model.where(@fk => owner_ids).group(@fk).sum(@column)
+    owner_ids.map { |id| sums[id] || 0 }
+  end
+end
+def disk_usage = dataloader.with(Sources::Sum, ::File, :repository_id, :byte_size).load(object.id)
+```
+```ruby
+# --- pipeloader ---
+batch_aggregate :disk_usage, of: :files, function: :sum, column: :byte_size
+# function: also :average / :minimum / :maximum (empty group -> nil, or your :default)
+```
+---
+## 6. A filtered collection
+With a dataloader you either bake the filter into a dedicated source, or load
+everything and filter in Ruby (which un-batches the database side). With pipeloader the
+`has_many` proxy is chainable, and the filter is pushed **into** the one batched query:
+```ruby
+# --- pipeloader ---
+field :open_issues, [Types::Issue], null: false
+def open_issues = object.issues.where(state: "open").order(:created_at)
+#   one query total:  SELECT * FROM issues WHERE state = 'open' AND repository_id IN (...)
+```
+`where` / `order` / `limit` / `select` all stay batched (limit/offset are per-owner —
+top-N per repository, not N overall). A distinct chain is cached as its own batch, so
+the same chain across many owners is still one round trip. Prefer a named relation? Pass
+the scope to the macro:
+```ruby
+batch_has_many :open_issues, -> { where(state: "open") }, class_name: "Issue"
+```
+---
+## 7. A record by a non-primary-key column
+```ruby
+# --- GraphQL::Dataloader ---
+class Sources::RecordBy < GraphQL::Dataloader::Source
+  def initialize(model, column)
+    @model = model
+    @column = column
+  end
+  def fetch(values)
+    found = @model.where(@column => values).index_by { |r| r[@column] }
+    values.map { |v| found[v] }
+  end
+end
+def owner = dataloader.with(Sources::RecordBy, ::User, :login).load(object.owner_login)
+```
+If the column is a real key, say so on the association and it just works:
+```ruby
+# --- pipeloader ---
+batch_belongs_to :owner, class_name: "User", primary_key: :login, foreign_key: :owner_login
+```
+If it isn't an association at all, the general `batch` macro (§8) keyed by that column
+does the same in three lines.
+---
+## 8. An existence / viewer-scoped flag — the `batch` macro
+This is the case that doesn't map to an association: a value that depends on the request
+(the current viewer). With a dataloader it's a source parameterized by the viewer:
+```ruby
+# --- GraphQL::Dataloader ---
+class Sources::ViewerStarred < GraphQL::Dataloader::Source
+  def initialize(viewer_id) = @viewer_id = viewer_id
+  def fetch(repo_ids)
+    starred = Star.where(user_id: @viewer_id, repo_id: repo_ids).pluck(:repo_id).to_set
+    repo_ids.map { |id| starred.include?(id) }
+  end
+end
+def viewer_has_starred = dataloader.with(Sources::ViewerStarred, context[:viewer].id).load(object.id)
+```
+`batch` is the general escape hatch: give it a loader that takes the owner keys and
+returns a `{ key => value }` Hash; each instance reads its own, or the `default`.
+```ruby
+# --- pipeloader ---
+class Repository < ApplicationRecord
+  include Pipeloader::Batch::Model
+  batch :viewer_has_starred, default: false do |repo_ids|
+    Star.where(user_id: Current.user.id, repo_id: repo_ids)
+        .pluck(:repo_id).index_with(true)       # { repo_id => true }, missing -> false
+  end
+end
+# field :viewer_has_starred, Boolean, null: false   — no resolver
+```
+The loader reads the viewer from request state (`Current.user` here — any
+`ActiveSupport::CurrentAttributes`; outside Rails, close over a local or read what you
+stashed when you opened the batch). It runs **once** per context, the first time any
+repository's flag is read.
+---
+## 9. A derived value — the `batch` macro
+Anything you can compute in one grouped query: a score, a summary, a lookup by a column.
+```ruby
+# reaction counts per issue, one query for the whole page
+batch :reaction_counts, default: {} do |issue_ids|
+  Reaction.where(issue_id: issue_ids).group(:issue_id, :emoji).count   # { [id, emoji] => n }
+          .each_with_object(Hash.new { |h, k| h[k] = {} }) { |((id, emoji), n), acc| acc[id][emoji] = n }
+end
+# a record looked up by a non-PK column, keyed by that column instead of the PK
+batch :author, key: :author_login, default: nil do |logins|
+  User.where(login: logins).index_by(&:login)
+end
+```
+`key:` controls which column the owner keys come from (default the primary key), so the
+loader and the reader agree on what to index by.
+---
+## 10. `has_many :through`
+The case a flat foreign-key batcher can't express — the target has no column pointing
+back at the owner. With a dataloader you join in the source and re-hydrate the rows:
+```ruby
+# --- GraphQL::Dataloader ---
+class Sources::Topics < GraphQL::Dataloader::Source
+  def fetch(repo_ids)
+    rows = RepositoryTopic.where(repository_id: repo_ids).joins(:topic)
+                          .pluck(:repository_id, "topics.id", "topics.name")
+    by_repo = rows.group_by(&:first)
+    repo_ids.map { |id| (by_repo[id] || []).map { |(_, tid, name)| Topic.new(id: tid, name: name) } }
+  end
+end
+def topics = dataloader.with(Sources::Topics).load(object.id)
+```
+```ruby
+# --- pipeloader ---
+batch_has_many :topics, through: :repository_topics
+```
+pipeloader routes a `:through` collection to AR's `Preloader` (which walks the join), so
+it batches to one query for the join and one for the targets, across every owner. It
+returns a plain loaded array — no chainable proxy — so filter at the relation if you need
+to: `batch_has_many :recent_topics, -> { order(created_at: :desc) }, through: :repository_topics`.
+---
+## 11. A polymorphic `belongs_to`
+```ruby
+# --- GraphQL::Dataloader: group by type, load each, re-map to input order ---
+class Sources::Commentable < GraphQL::Dataloader::Source
+  def fetch(comments)
+    by_type = comments.group_by(&:commentable_type)
+    loaded = by_type.transform_values do |group|
+      group.first.commentable_type.constantize.where(id: group.map(&:commentable_id)).index_by(&:id)
+    end
+    comments.map { |c| loaded[c.commentable_type][c.commentable_id] }
+  end
+end
+def commentable = dataloader.with(Sources::Commentable).load(object)
+```
+```ruby
+# --- pipeloader ---
+batch_belongs_to :commentable, polymorphic: true
+```
+The `Preloader` groups by `commentable_type` under the hood, so it's one query **per
+type** across all siblings (not one per record), resolving each to the right class.
+---
+## Where the edges are
+- **The `has_many` proxy is read-only.** Writes (`<<`, `create`, `build`, …) delegate to
+  the real association; reads it doesn't implement raise `NoMethodError` rather than
+  silently issuing a per-record query. `:through` collections aren't chainable (they load
+  to an array; put the scope on the macro).
+- **`batch` loaders still write their own query.** The escape hatch removes the `Source`
+  class and the `.load`, and batches across siblings for you — but an existence check or a
+  derived value is still SQL you write.
+---
+## Ergonomics — the scorecard
+| pattern | `GraphQL::Dataloader` | pipeloader |
+|---|---|---|
+| record by id | a `Source` + `.load` resolver | `batch_belongs_to` — **no resolver** |
+| collection | a `Source` + `.load` resolver | `batch_has_many` — **no resolver** |
+| has_one | reuse the HasMany source + `.first` | `batch_has_one` — **no resolver** |
+| count | a `Source` + `.load` resolver | `batch_count` |
+| sum/avg/min/max | a `Source` + `.load` resolver | `batch_aggregate` |
+| filtered collection | a dedicated source, or filter in Ruby | `object.issues.where(...)` — pushed into the batch |
+| by non-PK column | a `Source` + `.load` resolver | a `primary_key:` on the association, or `batch key:` |
+| existence / viewer-scoped | a viewer-parameterized `Source` + `.load` | `batch ... do \|ids\| ... end` |
+| derived value | a custom `Source` + `.load` | `batch ... do \|ids\| ... end` |
+| `has_many :through` | a join in a custom `Source` | `batch_has_many through:` — **no resolver** |
+| polymorphic belongs_to | group-by-type custom `Source` | `batch_belongs_to polymorphic: true` — **no resolver** |
+Three things change:
+1. **No source registry.** A dataloader app accretes a `Sources::` namespace of small,
+   near-identical classes. pipeloader has none — the shape is named by the macro.
+2. **No `.load` in resolvers — usually no resolver.** Every association resolves through
+   the model's own method (`object.author`), so the GraphQL type is just field
+   declarations. The batching lives on the model, next to the association it batches.
+3. **Batched and chainable by default.** A collection is a real, filterable relation
+   surface, with the filter pushed into the single query — no second source per variant.
+What stays the same: the genuinely custom cases — an existence check, a viewer-scoped
+flag, a derived value — still need a hand-written query. `batch` removes the ceremony
+around it (no `Source` class, no `.load`, automatic batching across siblings) but you
+still write the SQL the value needs. That's the honest boundary: every standard
+association — including `:through` and polymorphic — becomes a macro with no resolver,
+and the long tail becomes a one-method `batch`, rather than a `Sources::` class apiece.
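
Reader's note, not part of the diffed package: the contract DATALOADERS.md describes for the `batch` macro (a loader receives every sibling's key once and returns a `{ key => value }` Hash, and each record reads its own entry or the `default`) can be sketched in plain Ruby. This is a toy under stated assumptions, not pipeloader's implementation. `TinyBatch`, `Group`, `Repo`, `CALLS` and `STARRED` are hypothetical names invented for illustration, and the real code in `lib/pipeloader/batch/` may work quite differently.

```ruby
# Toy illustration only: NOT pipeloader's implementation.
# Every name here (TinyBatch, Group, Repo, CALLS, STARRED) is hypothetical.
module TinyBatch
  # The sibling records plus a per-attribute cache of loader results.
  Group = Struct.new(:records, :cache)

  def self.included(base)
    base.extend(ClassMethods)
  end

  # Tag every record with one shared group, as if a single query loaded them.
  def self.group(records)
    shared = Group.new(records, {})
    records.each { |record| record.batch_group = shared }
    records
  end

  attr_accessor :batch_group

  module ClassMethods
    # Define a reader whose loader gets all sibling keys at once and returns
    # a { key => value } Hash; keys missing from the Hash read as `default`.
    def batch(name, key: :id, default: nil, &loader)
      define_method(name) do
        group = batch_group || Group.new([self], {})
        # The loader runs only on the first read within the group.
        results = group.cache[name] ||= loader.call(group.records.map(&key))
        results.fetch(public_send(key), default)
      end
    end
  end
end

class Repo
  include TinyBatch
  attr_reader :id

  CALLS = []          # records each loader invocation, for demonstration
  STARRED = [2, 3]    # stand-in for a database lookup

  def initialize(id)
    @id = id
  end

  batch :viewer_has_starred, default: false do |ids|
    CALLS << ids
    (ids & STARRED).to_h { |id| [id, true] }
  end
end

repos = TinyBatch.group([Repo.new(1), Repo.new(2), Repo.new(3)])
flags = repos.map(&:viewer_has_starred)
```

Reading the flag on all three records invokes the loader block once with `[1, 2, 3]`; the record whose id is absent from the returned Hash falls back to `default: false`. Storing the group on the records themselves mirrors the document's remark that "the group rides on the records", though how pipeloader actually tracks siblings is not shown in this part of the diff.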

data/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Joshua Hull
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.