@michaelstewart/convex-tanstack-db-collection 0.0.1

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Michael Stewart
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,385 @@
1
+ # Convex TanStack DB Collection
2
+
3
+ On-demand real-time sync between [Convex](https://convex.dev) and [TanStack DB](https://tanstack.com/db) collections.
4
+
5
+ Uses a "backfill + tail" pattern: fetch full history for new filters, then subscribe with a cursor to catch ongoing changes. Convex's OCC guarantees per-key timestamp monotonicity, enabling efficient cursor-based sync without a global transaction log.
6
+
7
+ ## Installation
8
+
9
+ ```bash
10
+ npm install @michaelstewart/convex-tanstack-db-collection
11
+ # or
12
+ pnpm add @michaelstewart/convex-tanstack-db-collection
13
+ ```
14
+
15
+ ## Example Use Case
16
+
17
+ Imagine a Slack-like app with messages inside channels:
18
+
19
+ ```typescript
20
+ // convex/schema.ts
21
+ import { defineSchema, defineTable } from "convex/server"
22
+ import { v } from "convex/values"
23
+
24
+ export default defineSchema({
25
+ channels: defineTable({
26
+ name: v.string(),
27
+ }),
28
+ messages: defineTable({
29
+ // Client-generated UUID to support optimistic inserts
30
+ id: v.string(),
31
+ channelId: v.id("channels"),
32
+ authorId: v.string(),
33
+ body: v.string(),
34
+ updatedAt: v.number(),
35
+ })
36
+ .index("by_channel_updatedAt", ["channelId", "updatedAt"])
37
+ .index("by_author_updatedAt", ["authorId", "updatedAt"]),
38
+ })
39
+ ```
40
+
41
+ ```typescript
42
+ // src/collections.ts
43
+ import { createCollection, useLiveQuery } from '@tanstack/react-db'
44
+ import { convexCollectionOptions } from '@michaelstewart/convex-tanstack-db-collection'
45
+ import { api } from '@convex/_generated/api'
46
+
47
+ const messagesCollection = createCollection(
48
+ convexCollectionOptions({
49
+ client: convexClient,
50
+ query: api.messages.getMessagesAfter,
51
+ filters: { filterField: 'channelId', convexArg: 'channelIds' },
52
+ getKey: (msg) => msg.id,
53
+ })
54
+ )
55
+
56
+ // In your UI - TanStack DB extracts channelId from the where clause
57
+ const { data: messages } = useLiveQuery(q =>
58
+ q.from({ msg: messagesCollection })
59
+ .where(({ msg }) => msg.channelId.eq(currentChannelId))
60
+ )
61
+ ```
62
+
63
+ ```typescript
64
+ // convex/messages.ts
65
+ import { v } from 'convex/values'
66
+ import { query } from './_generated/server'
67
+
68
+ export const getMessagesAfter = query({
69
+ args: {
70
+ channelIds: v.optional(v.array(v.id("channels"))),
71
+ after: v.optional(v.number()),
72
+ },
73
+ handler: async (ctx, { channelIds, after = 0 }) => {
74
+ if (!channelIds || channelIds.length === 0) return []
75
+
76
+ const results = await Promise.all(
77
+ channelIds.map(channelId =>
78
+ ctx.db
79
+ .query("messages")
80
+ .withIndex("by_channel_updatedAt", q =>
81
+ q.eq("channelId", channelId).gt("updatedAt", after)
82
+ )
83
+ .collect()
84
+ )
85
+ )
86
+ return results.flat()
87
+ },
88
+ })
89
+ ```
90
+
91
+ ## Design Background
92
+
93
+ ### Why Not a Changelog?
94
+
95
+ [ElectricSQL](https://tanstack.com/db/latest/docs/collections/electric-collection) syncs from Postgres using the write-ahead log (WAL) as a changelog. Every transaction has a globally-ordered transaction ID (txid), so Electric can stream exactly what changed and clients can confirm when their mutations are synced by waiting for specific txids.
96
+
97
+ Convex doesn't have a global transaction log—there's no single writer assigning sequential IDs. Instead, Convex provides:
98
+
99
+ 1. **Deterministic optimistic concurrency control (OCC)**: Transactions are serializable based on read sets, with automatic deterministic retry on conflicts
100
+ 2. **Reactive subscriptions**: Queries automatically re-run when their dependencies change, tracked efficiently via index ranges in query read sets
101
+
102
+ This adapter uses these two Convex superpowers to construct an **update log** from an index on `updatedAt`. Because OCC guarantees that `updatedAt` is non-decreasing for any given key (it acts as a Lamport timestamp), we can query `after: cursor` to fetch only newer records.
103
+
104
+ The result is efficient cursor-based sync—with two caveats:
105
+ 1. Index records in the last few seconds of the update log can become visible out of order; this is solved with the [tail overlap](#the-tail-overlap-why-we-need-it)
106
+ 2. [Hard deletes are unsupported](#hard-deletes-not-supported)
107
+
108
+ ### The Backfill + Tail Pattern
109
+
110
+ We use a two-phase sync:
111
+
112
+ 1. **Backfill**: Query with `after: 0` to get full current state for filter values
113
+ 2. **Tail**: Subscribe with `after: globalCursor - tailOverlapMs` to catch ongoing changes
114
+
115
+ A single subscription covers all active filter values.
116
+
117
+ **Why one subscription for all filters?**
118
+
119
+ Convex function calls are billed on subscription creation and subscription update. If you have 50 filter values active, 50 separate subscriptions could be expensive. Instead, we merge them into one subscription that tracks changes across all values, using cursor advancement to minimize redundant data.
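
As a rough illustration of the merge, the sketch below folds every active filter value into one argument object for a single subscription. The `ActiveFilters` shape and the `buildSubscriptionArgs` helper are hypothetical names chosen for this example; the adapter's real internals may differ.

```typescript
// Hypothetical shape: Convex argument name -> set of currently active values.
type ActiveFilters = Map<string, Set<string>>
type SubscriptionArgs = Record<string, string[] | number>

// Build the argument object for ONE subscription covering all active values.
function buildSubscriptionArgs(active: ActiveFilters, after: number): SubscriptionArgs {
  const args: SubscriptionArgs = { after }
  active.forEach((values, convexArg) => {
    // Skip dimensions with no active values so the query sees them as unset.
    if (values.size > 0) args[convexArg] = Array.from(values).sort()
  })
  return args
}

const active: ActiveFilters = new Map<string, Set<string>>()
active.set("channelIds", new Set(["c2", "c1"]))
active.set("authorIds", new Set<string>())

// One call carries both channels instead of two separate subscriptions.
const args = buildSubscriptionArgs(active, 5000)
```

Sorting the values is only there to keep the argument object stable when the same set of filters is requested in a different order.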
120
+
121
+ ### The Tail Overlap (Why We Need It)
122
+
123
+ The per-key timestamp guarantee doesn't extend across keys. Specifically, **commit order doesn't match timestamp generation order**:
124
+
125
+ ```
126
+ T=1000: Transaction A generates updatedAt=1000 for key1
127
+ T=1001: Transaction B generates updatedAt=1001 for key2
128
+ T=1002: Transaction B commits first → key2 visible with updatedAt=1001
129
+ T=1003: Transaction A commits second → key1 visible with updatedAt=1000
130
+ ```
131
+
132
+ If we see key2 first, advance `globalCursor` to 1001, and re-subscribe with `after: 1001`, we'd **never see key1** because `1000 < 1001`.
133
+
134
+ The **tail overlap** (`tailOverlapMs`, default 10 seconds) solves this by using a conservative subscription cursor:
135
+
136
+ ```typescript
137
+ subscriptionCursor = globalCursor - tailOverlapMs
138
+ ```
139
+
140
+ This creates an overlap window where we re-receive some data. The LWW (Last-Write-Wins) resolution using `updatedAt` handles duplicates correctly—for any given key, we keep whichever version has the higher timestamp.
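
A minimal sketch of that LWW step, assuming an in-memory `Map` as the local store. The `applyLww` name and the `Versioned` interface are illustrative only and are not part of the package's API.

```typescript
interface Versioned {
  updatedAt: number
}

// Merge incoming rows into the store, keeping the higher updatedAt per key.
// Returns how many rows actually changed the store.
function applyLww<T extends Versioned>(
  store: Map<string, T>,
  incoming: T[],
  getKey: (item: T) => string,
): number {
  let applied = 0
  for (const item of incoming) {
    const key = getKey(item)
    const current = store.get(key)
    // Equal timestamps are treated as duplicates from the overlap window.
    if (current === undefined || item.updatedAt > current.updatedAt) {
      store.set(key, item)
      applied++
    }
  }
  return applied
}
```

Because duplicates and stale versions are dropped here, re-receiving rows from the overlap window costs bandwidth but does not change the local state.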
141
+
142
+ **The tradeoff:** A larger overlap means more duplicate data but safer sync. A smaller overlap saves bandwidth but risks missing updates if transactions take longer than the window to commit.
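
To make the failure mode concrete, here is a small simulation of the two-transaction timeline from this section. The `Row` type, the `visibleAt` field, and both helpers are assumptions made for the example, and clamping the cursor at 0 is also an assumption. It only models what a query with `after` would return at a given moment.

```typescript
interface Row {
  key: string
  updatedAt: number // timestamp generated inside the transaction
  visibleAt: number // moment the transaction commits (simulation only)
}

// What a sync query with `after` would return at time `now`.
function queryAfter(rows: Row[], after: number, now: number): Row[] {
  return rows.filter((r) => r.visibleAt <= now && r.updatedAt > after)
}

// The rewound cursor; clamping at 0 is an assumption of this sketch.
function subscriptionCursor(globalCursor: number, tailOverlapMs: number): number {
  return Math.max(0, globalCursor - tailOverlapMs)
}

const rows: Row[] = [
  { key: "key1", updatedAt: 1000, visibleAt: 1003 }, // Transaction A commits second
  { key: "key2", updatedAt: 1001, visibleAt: 1002 }, // Transaction B commits first
]

// At T=1002 only key2 is visible, so the global cursor advances to 1001.
const globalCursor = Math.max(...queryAfter(rows, 0, 1002).map((r) => r.updatedAt))

// Re-subscribing at T=1003 without an overlap misses key1 entirely.
const withoutOverlap = queryAfter(rows, subscriptionCursor(globalCursor, 0), 1003)

// A 10 ms overlap is enough in this toy timeline to pick key1 up.
const withOverlap = queryAfter(rows, subscriptionCursor(globalCursor, 10), 1003)
```

In this timeline `withoutOverlap` is empty, while `withOverlap` contains both rows; key2 is the duplicate that LWW resolution then discards.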
143
+
144
+ ### Lamport Timestamps
145
+
146
+ Your documents must have an `updatedAt` field that you update on every mutation. To guarantee monotonicity within each key, even with updates from different servers with skewed clocks, use a Lamport-style timestamp:
147
+
148
+ ```typescript
149
+ /**
150
+ * Calculate a monotonically increasing updatedAt timestamp.
151
+ * Uses max(Date.now(), prevUpdatedAt + 1) to handle server clock skew.
152
+ */
153
+ function getLamportUpdatedAt(prevUpdatedAt: number): number {
154
+ return Math.max(Date.now(), prevUpdatedAt + 1)
155
+ }
156
+
157
+ // On insert
158
+ await ctx.db.insert('messages', {
159
+ ...data,
160
+ updatedAt: Date.now(), // No previous timestamp, so Date.now() is fine
161
+ })
162
+
163
+ // On update
164
+ const existing = await ctx.db.get(id)
165
+ await ctx.db.patch(id, {
166
+ ...changes,
167
+ updatedAt: getLamportUpdatedAt(existing.updatedAt),
168
+ })
169
+ ```
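
The effect of the `max` is easiest to see with a fixed clock. The variant below takes the current time as an optional parameter so both cases can be checked deterministically; the `lamportUpdatedAt` name and the `now` parameter are additions for this sketch, not part of the package.

```typescript
// Same rule as getLamportUpdatedAt, with the clock injectable for testing.
function lamportUpdatedAt(prevUpdatedAt: number, now: number = Date.now()): number {
  return Math.max(now, prevUpdatedAt + 1)
}

// Healthy clock: wall time is ahead of the stored value, so it wins.
const healthy = lamportUpdatedAt(1000, 5000) // 5000

// Skewed clock: this server is behind the stored value, so we bump by 1.
const skewed = lamportUpdatedAt(5000, 4200) // 5001
```

In both cases the new value is strictly greater than the previous one for that key, which is the property the cursor-based sync relies on.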
170
+
171
+ <details>
172
+ <summary><strong>More Examples</strong></summary>
173
+
174
+ ### Multiple Filter Dimensions
175
+
176
+ You can filter by multiple fields using the same sync query:
177
+
178
+ ```typescript
179
+ // Filter by channel OR by author - both use the same getMessagesAfter query
180
+ const messagesCollection = createCollection(
181
+ convexCollectionOptions({
182
+ client: convexClient,
183
+ query: api.messages.getMessagesAfter,
184
+ filters: [
185
+ { filterField: 'channelId', convexArg: 'channelIds' },
186
+ { filterField: 'authorId', convexArg: 'authorIds' },
187
+ ],
188
+ getKey: (msg) => msg.id,
189
+ })
190
+ )
191
+
192
+ // View messages in a channel
193
+ const { data: channelMessages } = useLiveQuery(q =>
194
+ q.from({ msg: messagesCollection })
195
+ .where(({ msg }) => msg.channelId.eq(channelId))
196
+ )
197
+
198
+ // Or view all messages by an author
199
+ const { data: authorMessages } = useLiveQuery(q =>
200
+ q.from({ msg: messagesCollection })
201
+ .where(({ msg }) => msg.authorId.eq(userId))
202
+ )
203
+ ```
204
+
205
+ ### Global Sync (No Filters)
206
+
207
+ For small datasets, sync everything:
208
+
209
+ ```typescript
210
+ const allMessagesCollection = createCollection(
211
+ convexCollectionOptions({
212
+ client: convexClient,
213
+ query: api.messages.getAllMessagesAfter, // Query takes only { after }
214
+ getKey: (msg) => msg.id,
215
+ })
216
+ )
217
+ ```
218
+
219
+ </details>
220
+
221
+ <details>
222
+ <summary><strong>Convex Query Setup (Advanced)</strong></summary>
223
+
224
+ ## Convex Query Setup
225
+
226
+ Your sync query accepts filter arrays and an `after` timestamp. Use compound indexes for efficient queries:
227
+
228
+ ```typescript
229
+ // convex/messages.ts
230
+ import { v } from 'convex/values'
231
+ import { query } from './_generated/server'
232
+
233
+ export const getMessagesAfter = query({
234
+ args: {
235
+ channelIds: v.optional(v.array(v.id("channels"))),
236
+ authorIds: v.optional(v.array(v.string())),
237
+ after: v.optional(v.number()),
238
+ },
239
+ handler: async (ctx, { channelIds, authorIds, after = 0 }) => {
240
+ // Query each channel using the compound index
241
+ if (channelIds && channelIds.length > 0) {
242
+ const results = await Promise.all(
243
+ channelIds.map(channelId =>
244
+ ctx.db
245
+ .query("messages")
246
+ .withIndex("by_channel_updatedAt", q =>
247
+ q.eq("channelId", channelId).gt("updatedAt", after)
248
+ )
249
+ .collect()
250
+ )
251
+ )
252
+ return results.flat()
253
+ }
254
+
255
+ // For author queries, use a different index (or filter)
256
+ if (authorIds && authorIds.length > 0) {
257
+ const results = await Promise.all(
258
+ authorIds.map(authorId =>
259
+ ctx.db
260
+ .query("messages")
261
+ .withIndex("by_author_updatedAt", q =>
262
+ q.eq("authorId", authorId).gt("updatedAt", after)
263
+ )
264
+ .collect()
265
+ )
266
+ )
267
+ return results.flat()
268
+ }
269
+
270
+ return []
271
+ },
272
+ })
273
+ ```
274
+
275
+ The compound index `["channelId", "updatedAt"]` allows efficient range queries: "all messages in this channel updated after this timestamp".
276
+
277
+ </details>
278
+
279
+ <details>
280
+ <summary><strong>Configuration Reference</strong></summary>
281
+
282
+ ## Configuration
283
+
284
+ ### Filter Options
285
+
286
+ ```typescript
287
+ interface FilterDimension {
288
+ // Field name in TanStack DB queries (e.g., 'channelId')
289
+ filterField: string
290
+
291
+ // Convex query argument name (e.g., 'channelIds')
292
+ convexArg: string
293
+
294
+ // If true, assert only one value is ever requested (default: false)
295
+ // Throws error if multiple values requested
296
+ single?: boolean
297
+ }
298
+ ```
299
+
300
+ ### Full Config
301
+
302
+ ```typescript
303
+ interface ConvexCollectionConfig {
304
+ client: ConvexClient | ConvexReactClient
305
+ query: FunctionReference<'query'>
306
+ getKey: (item: T) => string | number
307
+
308
+ // Filter configuration (optional)
309
+ filters?: FilterDimension | FilterDimension[]
310
+
311
+ // Timestamp field for LWW conflict resolution (default: 'updatedAt')
312
+ updatedAtFieldName?: string
313
+
314
+ // Debounce for batching loadSubset calls (default: 50ms)
315
+ debounceMs?: number
316
+
317
+ // Overlap window when rewinding subscription cursor (default: 10000ms)
318
+ // See "The Tail Overlap" section above for why this is needed
319
+ tailOverlapMs?: number
320
+
321
+ // Messages before re-subscribing with advanced cursor (default: 10)
322
+ // Set to 0 to disable cursor advancement entirely
323
+ resubscribeThreshold?: number
324
+
325
+ // Mutation handlers
326
+ onInsert?: (params) => Promise<void>
327
+ onUpdate?: (params) => Promise<void>
328
+ }
329
+ ```
330
+
331
+ ### Tuning the Tail Overlap
332
+
333
+ The default `tailOverlapMs` of 10 seconds is generous. Convex has a [1-second execution time limit](https://docs.convex.dev/production/state/limits) for user code in mutations, so it's unlikely that a record becomes visible multiple seconds after another record with a later timestamp. However, I expect it is technically possible in cases of degraded DB performance or bad clock skew.
334
+
335
+ Even if you set this ultra-conservatively to 5 minutes, you'd still cut duplicate traffic by orders of magnitude in most apps. Ask yourself: what percentage of data on this page was written in the last 5 minutes? For many applications, it's a small fraction.
336
+
337
+ </details>
338
+
339
+ ## How It Works
340
+
341
+ 1. **Filter Extraction**: Parses TanStack DB `where` clauses to extract filter values
342
+ 2. **Backfill**: Fetches full history for new filter values with `after: 0`
343
+ 3. **Subscription Merging**: Maintains a single Convex subscription for all active filter values
344
+ 4. **LWW Conflict Resolution**: Uses `updatedAt` timestamps to handle overlapping data
345
+ 5. **Cursor Advancement**: Periodically re-subscribes with advanced cursor to reduce data transfer
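
Step 5 can be sketched as a small counter around the cursor. This is a guess at the bookkeeping based on the `resubscribeThreshold` and `tailOverlapMs` options described in this README. The `createCursorTracker` helper, counting one unit per received message, and clamping at 0 are all assumptions rather than the adapter's actual code.

```typescript
// Tracks the highest updatedAt seen and decides when to re-subscribe.
function createCursorTracker(tailOverlapMs: number = 10000, resubscribeThreshold: number = 10) {
  let globalCursor = 0
  let sinceResubscribe = 0

  return {
    // Returns the new `after` value when it is time to re-subscribe, else null.
    observe(updatedAt: number): number | null {
      globalCursor = Math.max(globalCursor, updatedAt)
      // A threshold of 0 disables cursor advancement entirely.
      if (resubscribeThreshold === 0) return null
      sinceResubscribe++
      if (sinceResubscribe < resubscribeThreshold) return null
      sinceResubscribe = 0
      // Rewind by the tail overlap so late-committing rows are still seen.
      return Math.max(0, globalCursor - tailOverlapMs)
    },
  }
}

const tracker = createCursorTracker(10000, 3)
const first = tracker.observe(20000) // null, below threshold
const second = tracker.observe(20500) // null, below threshold
const third = tracker.observe(20100) // 10500 = 20500 - 10000
```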
346
+
347
+ ## Limitations
348
+
349
+ ### Hard Deletes Not Supported
350
+
351
+ This adapter does not support hard deletes. When a record is deleted from Convex, other subscribed clients have no way to learn about the deletion—the sync query only returns items that exist.
352
+
353
+ **Use soft deletes instead:**
354
+
355
+ ```typescript
356
+ // Instead of deleting:
357
+ await ctx.db.delete(id)
358
+
359
+ // Set a status field:
360
+ await ctx.db.patch(id, {
361
+ status: 'deleted',
362
+ updatedAt: Date.now()
363
+ })
364
+ ```
365
+
366
+ The sync will receive the updated record with `status: 'deleted'`. Your UI can filter out deleted items:
367
+
368
+ ```typescript
369
+ const { data } = useLiveQuery(q =>
370
+ q.from({ item: itemsCollection })
371
+ .where(({ item }) => item.status.eq('active'))
372
+ )
373
+ ```
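
The soft-delete patch above writes `updatedAt: Date.now()` directly. If a record's stored `updatedAt` could already be ahead of the local clock, the tombstone write presumably needs the same Lamport-style bump as any other update, otherwise LWW resolution on other clients could discard it. A minimal sketch with hypothetical `Item`, `toTombstone`, and `visibleItems` names:

```typescript
interface Item {
  id: string
  status: "active" | "deleted"
  updatedAt: number
}

// Build the soft-deleted version of a record with a strictly newer timestamp.
function toTombstone(existing: Item, now: number = Date.now()): Item {
  return {
    ...existing,
    status: "deleted",
    updatedAt: Math.max(now, existing.updatedAt + 1),
  }
}

// Client-side view: hide tombstones, mirroring the status filter above.
function visibleItems(items: Item[]): Item[] {
  return items.filter((item) => item.status === "active")
}
```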
374
+
375
+ ### Filter Expressions
376
+
377
+ Only `.eq()` and `.in()` operators are supported for filter extraction. Complex expressions like `.gt()`, `.lt()`, or nested `or` conditions on filter fields won't work.
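
The sketch below illustrates the idea on a deliberately simplified expression tree. The `Expr` type is hypothetical and is not TanStack DB's real internal representation, and descending into `and` is an assumption of this example. It only shows why `eq` and `in` yield concrete filter values while range and `or` expressions do not.

```typescript
// Hypothetical, simplified where-clause representation.
type Expr =
  | { op: "eq"; field: string; value: string }
  | { op: "in"; field: string; values: string[] }
  | { op: "gt"; field: string; value: number }
  | { op: "and"; args: Expr[] }
  | { op: "or"; args: Expr[] }

// Collect the concrete values requested for one filter field.
function extractFilterValues(expr: Expr, filterField: string): string[] {
  switch (expr.op) {
    case "eq":
      return expr.field === filterField ? [expr.value] : []
    case "in":
      return expr.field === filterField ? expr.values.slice() : []
    case "and":
      // Assumption: conjunctions are searched for eq/in on the filter field.
      return expr.args.reduce<string[]>(
        (acc, arg) => acc.concat(extractFilterValues(arg, filterField)),
        [],
      )
    default:
      // gt and or give no finite list of values to backfill.
      return []
  }
}
```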
378
+
379
+ ## When to Use This
380
+
381
+ **Consider starting with [query-collection](https://tanstack.com/db/latest/docs/collections/query-collection)** if you have few items on screen. It's simpler, uses Convex's built-in `useQuery` under the hood, and is sufficient for many apps.
382
+
383
+ This adapter is for when you need:
384
+ - **On-demand sync**: Specifically load data matching your current queries
385
+ - **Cursor-based efficiency**: Avoid re-fetching unchanged data on every subscription update