@earth-app/collegedb 1.0.7 → 1.1.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # CollegeDB
2
2
 
3
- > Cloudflare D1 Horizontal Sharding Router
3
+ Universal Database Horizontal Sharding Router
4
4
 
5
5
  [![TypeScript](https://img.shields.io/badge/TypeScript-5.0+-blue.svg)](https://www.typescriptlang.org/)
6
6
  [![GitHub Issues](https://img.shields.io/github/issues/earth-app/CollegeDB)](https://github.com/earth-app/CollegeDB/issues)
@@ -8,7 +8,7 @@
8
8
  [![GitHub License](https://img.shields.io/github/license/earth-app/CollegeDB)](LICENSE)
9
9
  ![NPM Version](https://img.shields.io/npm/v/%40earth-app%2Fcollegedb)
10
10
 
11
- A TypeScript library for **true horizontal scaling** of SQLite-style databases on Cloudflare using D1 and KV. CollegeDB distributes your data across multiple D1 databases, with each table's records split by primary key across different database instances.
11
+ A TypeScript library for **true horizontal scaling** of SQLite-style databases, built primarily for Cloudflare D1 and KV, with additional provider adapters for Redis/Valkey (KV) and PostgreSQL/MySQL/SQLite (SQL) backends. CollegeDB distributes your data across multiple database shards, with each table's records split by primary key across different database instances.
12
12
 
13
13
  CollegeDB implements **data distribution** where a single logical table is physically stored across multiple D1 databases:
14
14
 
@@ -46,6 +46,7 @@ CollegeDB provides a sharding layer on top of Cloudflare D1 databases, enabling
46
46
  - **Scale horizontally** by distributing table data across multiple D1 instances
47
47
  - **Route queries automatically** based on primary key mappings
48
48
  - **Maintain consistency** with KV-based shard mapping
49
+ - **Run on multiple providers** through `KVStorage` and `SQLDatabase` contracts
49
50
  - **Optimize for geography** with location-aware shard allocation
50
51
  - **Monitor and rebalance** shard distribution
51
52
  - **Handle migrations** between shards seamlessly
@@ -53,6 +54,7 @@ CollegeDB provides a sharding layer on top of Cloudflare D1 databases, enabling
53
54
  ## 📦 Features
54
55
 
55
56
  - **🔀 Automatic Query Routing**: Primary key → shard mapping using Cloudflare KV
57
+ - **🧩 Provider Adapters (v1.1.0)**: Redis/Valkey KV + PostgreSQL/MySQL/SQLite SQL adapters while preserving Cloudflare compatibility
56
58
  - **🎯 Multiple Allocation Strategies**: Round-robin, random, hash-based, and location-aware distribution
57
59
  - **🔄 Mixed Strategy Support**: Different strategies for reads vs writes (e.g., location for writes, hash for reads)
58
60
  - **📊 Shard Coordination**: Durable Objects for allocation and statistics
@@ -62,18 +64,193 @@ CollegeDB provides a sharding layer on top of Cloudflare D1 databases, enabling
62
64
  - **⚡ High Performance**: Optimized for Cloudflare Workers runtime
63
65
  - **🔧 TypeScript First**: Full type safety and excellent DX
64
66
 
67
+ ## Benchmark Suite
68
+
69
+ CollegeDB includes a comprehensive benchmark suite covering real-world latency across provider combinations and Cloudflare Worker routing paths.
70
+
71
+ ### Matrix
72
+
73
+ | Scenario Key | Scenario | What Happens | Workload Per Run |
74
+ | ----------------- | -------------------------------- | --------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------- |
75
+ | basic_crud | Basic CRUD round-trip | Insert, read, update, and delete a user via routed queries. | 20 iterations; 4 routed SQL ops per iteration |
76
+ | advanced_usage | Advanced lookup workflow | Writes user+post, adds lookup aliases, then validates join and alias-based lookup. | 15 iterations; ~5 routed SQL ops + KV lookup-key updates per iteration |
77
+ | migration_mapping | Migration-style mapping creation | Inserts legacy records on a fixed shard, then builds shard mappings in batch and validates routing. | 10 iterations; 20 legacy records mapped per iteration |
78
+ | bulk_crud | Bulk CRUD pressure | Performs bulk inserts, half updates, and full delete sweep, then validates shard-wide totals. | 7 iterations; 160 inserts + 80 updates + 160 deletes per iteration |
79
+ | indexing | Indexed query scan | Creates an index on posts(user_id) and repeatedly queries the indexed path. | 15 iterations after warmup dataset build |
80
+
81
+ Real-world latency benchmarks across provider combinations (`average / p95`):
82
+
83
+ | Combination | Basic CRUD | Advanced Operations | Migration | Bulk CRUD | Indexing | Overall Avg |
84
+ | --------------- | ------------------- | ------------------- | ------------------- | --------------------- | -------------------- | ----------- |
85
+ | cloudflare | 13.14 ms / 16.50 ms | 4.43 ms / 9.65 ms | 27.68 ms / 30.69 ms | 156.30 ms / 163.76 ms | 67.17 ms / 106.63 ms | 28.40 ms |
86
+ | postgres+redis | 2.77 ms / 3.90 ms | 3.11 ms / 4.64 ms | 6.55 ms / 8.07 ms | 42.33 ms / 80.67 ms | 0.34 ms / 0.61 ms | 5.87 ms |
87
+ | postgres+valkey | 1.65 ms / 2.23 ms | 2.10 ms / 2.82 ms | 5.60 ms / 6.05 ms | 33.13 ms / 43.69 ms | 0.30 ms / 0.46 ms | 4.64 ms |
88
+ | mysql+redis | 5.11 ms / 8.38 ms | 5.45 ms / 8.51 ms | 27.41 ms / 61.56 ms | 92.70 ms / 139.70 ms | 0.49 ms / 1.22 ms | 13.91 ms |
89
+ | mysql+valkey | 4.99 ms / 6.66 ms | 4.21 ms / 6.42 ms | 21.68 ms / 27.20 ms | 87.67 ms / 109.44 ms | 0.55 ms / 1.92 ms | 12.54 ms |
90
+ | mariadb+redis | 2.64 ms / 5.90 ms | 3.02 ms / 7.55 ms | 6.48 ms / 7.66 ms | 46.99 ms / 58.08 ms | 0.37 ms / 1.08 ms | 6.29 ms |
91
+ | mariadb+valkey | 2.34 ms / 4.58 ms | 2.91 ms / 5.69 ms | 5.73 ms / 7.35 ms | 45.04 ms / 61.42 ms | 0.36 ms / 0.79 ms | 5.96 ms |
92
+ | sqlite+redis | 2.21 ms / 3.84 ms | 2.43 ms / 3.14 ms | 10.49 ms / 17.31 ms | 140.85 ms / 184.48 ms | 0.07 ms / 0.14 ms | 15.87 ms |
93
+ | sqlite+valkey | 1.36 ms / 1.80 ms | 2.06 ms / 2.70 ms | 6.36 ms / 8.77 ms | 121.13 ms / 156.31 ms | 0.06 ms / 0.14 ms | 13.42 ms |
94
+
95
+ ### Overview
96
+
97
+ CollegeDB includes an integration benchmark suite covering both local provider matrices and Cloudflare Worker routing paths.
98
+
99
+ Top-level benchmark scenarios:
100
+
101
+ - `basic_crud`: insert/read/update/delete round-trip routing
102
+ - `advanced_usage`: join + multi-key lookup behavior
103
+ - `migration_mapping`: batch mapping creation for existing keys
104
+ - `bulk_crud`: high-volume insert/update/delete flow
105
+ - `indexing`: indexed query latency under warm data
106
+ - `metadata_fetch`: schema/metadata query latency
107
+ - `pragma_or_info`: provider-specific pragma/info query latency
108
+ - `counting`: shard-wide aggregate counting
109
+ - `shard_fanout`: all-shards fanout query aggregation
110
+ - `reassignment`: shard reassignment and routed-read validation
111
+
112
+ Benchmark reports include:
113
+
114
+ - an interpretation guide (`How To Read This Report`)
115
+ - a benchmark catalog with per-run workload details
116
+ - a compact overall matrix (passed/failed/skipped + overall average)
117
+ - split scenario matrices for core workload latency and introspection/routing latency
118
+
65
119
  ## Installation
66
120
 
67
121
  ```bash
68
- bun add collegedb
122
+ bun add @earth-app/collegedb
69
123
  # or
70
- npm install collegedb
124
+ npm install @earth-app/collegedb
71
125
  ```
72
126
 
127
+ ## Provider Adapters
128
+
129
+ CollegeDB can run with either native Cloudflare bindings or custom providers, as long as they match the exported `KVStorage` and `SQLDatabase` interfaces.
130
+
131
+ Supported adapters:
132
+
133
+ - `createRedisKVProvider`
134
+ - `createValkeyKVProvider`
135
+ - `createPostgresSQLProvider`
136
+ - `createMySQLSQLProvider`
137
+ - `createSQLiteSQLProvider`
138
+ - `createHyperdrivePostgresProvider`
139
+ - `createHyperdriveMySQLProvider`
140
+
141
+ ```typescript
142
+ import { createClient as createRedisClient } from 'redis';
143
+ import { Pool } from 'pg';
144
+ import { createPostgresSQLProvider, createRedisKVProvider, initialize, run, type CollegeDBConfig } from '@earth-app/collegedb';
145
+
146
+ const redisClient = createRedisClient({ url: process.env.REDIS_URL });
147
+ const pgPool = new Pool({ connectionString: process.env.POSTGRES_URL });
148
+
149
+ const config: CollegeDBConfig = {
150
+ kv: createRedisKVProvider(redisClient),
151
+ shards: {
152
+ 'pg-east': createPostgresSQLProvider(pgPool)
153
+ },
154
+ strategy: 'hash',
155
+ disableAutoMigration: true
156
+ };
157
+
158
+ async function bootstrap() {
159
+ await redisClient.connect();
160
+ initialize(config);
161
+ await run('user-1', 'INSERT INTO users (id, name) VALUES (?, ?)', ['user-1', 'Taylor']);
162
+ }
163
+
164
+ bootstrap().catch(console.error);
165
+ ```
166
+
167
+ For Hyperdrive-backed SQL connections, use `createHyperdrivePostgresProvider` or `createHyperdriveMySQLProvider` with your database client factory.
168
+
169
+ For a complete non-Cloudflare setup, see `examples/provider-sandbox.ts`.
170
+
171
+ ## Sandbox Benchmarks (Docker Compose)
172
+
173
+ CollegeDB ships with an integration sandbox runner that benchmarks real latency across provider combinations.
174
+
175
+ Requirements:
176
+
177
+ - Docker + Docker Compose plugin
178
+ - Bun
179
+ - Wrangler (installed as a dev dependency and invoked by scripts)
180
+
181
+ The Cloudflare benchmark path runs against the dedicated sandbox worker:
182
+
183
+ - Worker entry: `sandbox/worker.ts`
184
+ - Wrangler config: `sandbox/wrangler.jsonc`
185
+
186
+ Main commands:
187
+
188
+ ```bash
189
+ # Run full SQL x KV matrix plus Cloudflare local benchmark
190
+ bun run test:sandbox
191
+
192
+ # Run full SQL x KV matrix only
193
+ bun run test:sandbox:all
194
+
195
+ # Run Cloudflare local benchmark only (wrangler dev --local)
196
+ bun run test:sandbox:cloudflare
197
+ ```
198
+
199
+ Provider filters:
200
+
201
+ ```bash
202
+ # One SQL provider against both KV providers
203
+ bun run test:sandbox:mysql
204
+ bun run test:sandbox:postgres
205
+ bun run test:sandbox:mariadb
206
+ bun run test:sandbox:sqlite
207
+
208
+ # One KV provider against all SQL providers
209
+ bun run test:sandbox:redis
210
+ bun run test:sandbox:valkey
211
+
212
+ # Explicit pairwise combinations
213
+ bun run test:sandbox:postgres+redis
214
+ bun run test:sandbox:postgres+valkey
215
+ bun run test:sandbox:mysql+redis
216
+ bun run test:sandbox:mysql+valkey
217
+ bun run test:sandbox:mariadb+redis
218
+ bun run test:sandbox:mariadb+valkey
219
+ bun run test:sandbox:sqlite+redis
220
+ bun run test:sandbox:sqlite+valkey
221
+ ```
222
+
223
+ Output behavior:
224
+
225
+ - Every run writes a timestamped Markdown report to `sandbox/results/`
226
+ - `sandbox/results/latest.md` is always updated to the newest report
227
+ - The runner prints the report in-terminal using Bun's Markdown renderer with ANSI formatting
228
+ - `test:sandbox` produces a matrix for all SQL x KV combinations; filtered commands produce matrix subsets
229
+
230
+ Benchmark coverage includes:
231
+
232
+ - basic CRUD
233
+ - advanced lookup/routing workflows
234
+ - migration-style mapping creation
235
+ - bulk CRUD
236
+ - indexing queries
237
+ - metadata fetch
238
+ - pragma/info queries (provider-specific)
239
+ - counting across shards
240
+ - shard fanout aggregation
241
+ - shard reassignment workflow
242
+
243
+ How to read benchmark rows:
244
+
245
+ - Latency cells are formatted as `average / p95` in milliseconds.
246
+ - `FAILED` means the scenario returned an error.
247
+ - `N/A` means the scenario was intentionally skipped in that environment.
248
+ - Use the detailed section for full `avg`, `p50`, `p95`, `min`, `max`, and sample count (`n`).
249
+
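The statistics named in the bullets above (`avg`, `p50`, `p95`, `min`, `max`, `n`) can be sketched as follows. This is a minimal illustration and not the sandbox runner's code: `summarize`, `percentile`, and `formatCell` are hypothetical names, and the nearest-rank percentile method is an assumption, since the README does not say how the runner computes percentiles.

```typescript
// Hypothetical helpers, not part of CollegeDB: summarize latency samples (ms).
interface LatencySummary {
  n: number;
  avg: number;
  p50: number;
  p95: number;
  min: number;
  max: number;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array (assumed method).
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

function summarize(samples: number[]): LatencySummary {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  return {
    n: sorted.length,
    avg: sum / sorted.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    min: sorted[0],
    max: sorted[sorted.length - 1]
  };
}

// Formats a matrix cell in the `average / p95` style described above.
function formatCell(s: LatencySummary): string {
  return `${s.avg.toFixed(2)} ms / ${s.p95.toFixed(2)} ms`;
}
```

With few samples (the scenarios above run 7 to 20 iterations), a nearest-rank `p95` is often the slowest or second-slowest sample, so it helps to read it alongside `max` and `n`.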
73
250
  ## Basic Usage
74
251
 
75
252
  ```typescript
76
- import { collegedb, createSchema, run, first } from 'collegedb';
253
+ import { collegedb, createSchema, run, first } from '@earth-app/collegedb';
77
254
 
78
255
  // Initialize with your Cloudflare bindings (existing databases work automatically!)
79
256
  collegedb(
@@ -88,7 +265,7 @@ collegedb(
88
265
  },
89
266
  async () => {
90
267
  // Create schema on new shards only (existing shards auto-detected)
91
- await createSchema(env['db-new-shard']);
268
+ await createSchema(env['db-new-shard'], 'CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, name TEXT, email TEXT)');
92
269
 
93
270
  // Insert data (automatically routed to appropriate shard)
94
271
  await run('user-123', 'INSERT INTO users (id, name, email) VALUES (?, ?, ?)', ['user-123', 'Johnson', 'alice@example.com']);
@@ -104,7 +281,7 @@ collegedb(
104
281
  ### Geographic Distribution Example
105
282
 
106
283
  ```typescript
107
- import { collegedb, first, run } from 'collegedb';
284
+ import { collegedb, first, run } from '@earth-app/collegedb';
108
285
 
109
286
  // Optimize for North American users with geographic sharding
110
287
  collegedb(
@@ -141,7 +318,7 @@ collegedb(
141
318
  ### Mixed Strategy Example
142
319
 
143
320
  ```typescript
144
- import { collegedb, first, run, type MixedShardingStrategy } from 'collegedb';
321
+ import { collegedb, first, run, type MixedShardingStrategy } from '@earth-app/collegedb';
145
322
 
146
323
  // Use location strategy for writes (optimal data placement) and hash for reads (optimal performance)
147
324
  const mixedStrategy: MixedShardingStrategy = {
@@ -194,7 +371,7 @@ This approach provides:
194
371
  CollegeDB supports **multiple lookup keys** for the same record, allowing you to query by username, email, ID, or any unique identifier. Keys are automatically hashed with SHA-256 for security and privacy.
195
372
 
196
373
  ```typescript
197
- import { collegedb, first, run, KVShardMapper } from 'collegedb';
374
+ import { collegedb, first, run, KVShardMapper } from '@earth-app/collegedb';
198
375
 
199
376
  collegedb(
200
377
  {
@@ -222,8 +399,6 @@ collegedb(
222
399
 
223
400
  ### Adding Lookup Keys to Existing Mappings
224
401
 
225
- s
226
-
227
402
  ```typescript
228
403
  const mapper = new KVShardMapper(env.KV);
229
404
 
@@ -291,7 +466,7 @@ CollegeDB supports **seamless, automatic integration** with existing D1 database
291
466
  4. **KV Namespace**: A Cloudflare KV namespace for storing shard mappings
292
467
 
293
468
  ```typescript
294
- import { collegedb, first, run } from 'collegedb';
469
+ import { collegedb, first, run } from '@earth-app/collegedb';
295
470
 
296
471
  // Add your existing databases as shards - that's it!
297
472
  collegedb(
@@ -321,7 +496,7 @@ collegedb(
321
496
  You can manually validate databases before integration if needed:
322
497
 
323
498
  ```typescript
324
- import { validateTableForSharding, listTables } from 'collegedb';
499
+ import { validateTableForSharding, listTables } from '@earth-app/collegedb';
325
500
 
326
501
  // Check database structure
327
502
  const tables = await listTables(env.ExistingDB);
@@ -343,7 +518,7 @@ for (const table of tables) {
343
518
  If you want to inspect existing data before automatic migration:
344
519
 
345
520
  ```typescript
346
- import { discoverExistingPrimaryKeys } from 'collegedb';
521
+ import { discoverExistingPrimaryKeys } from '@earth-app/collegedb';
347
522
 
348
523
  // Discover all user IDs in existing users table
349
524
  const userIds = await discoverExistingPrimaryKeys(env.ExistingDB, 'users');
@@ -358,7 +533,7 @@ const orderIds = await discoverExistingPrimaryKeys(env.ExistingDB, 'orders', 'or
358
533
  For complete control over the integration process:
359
534
 
360
535
  ```typescript
361
- import { integrateExistingDatabase, KVShardMapper } from 'collegedb';
536
+ import { integrateExistingDatabase, KVShardMapper } from '@earth-app/collegedb';
362
537
 
363
538
  const mapper = new KVShardMapper(env.KV);
364
539
 
@@ -386,7 +561,7 @@ if (result.success) {
386
561
  After integration, initialize CollegeDB with your existing databases as shards:
387
562
 
388
563
  ```typescript
389
- import { initialize, first } from 'collegedb';
564
+ import { initialize, first } from '@earth-app/collegedb';
390
565
 
391
566
  // Include existing databases as shards
392
567
  initialize({
@@ -409,7 +584,7 @@ const user = await first('existing-user-123', 'SELECT * FROM users WHERE id = ?'
409
584
  The simplest possible integration - just add your existing databases:
410
585
 
411
586
  ```typescript
412
- import { initialize, first, run } from 'collegedb';
587
+ import { initialize, first, run } from '@earth-app/collegedb';
413
588
 
414
589
  export default {
415
590
  async fetch(request: Request, env: Env): Promise<Response> {
@@ -479,12 +654,12 @@ console.log(`Would process ${testResult.totalRecords} records from ${testResult.
479
654
 
480
655
  ```typescript
481
656
  // Simple rollback - clear all mappings
482
- import { KVShardMapper } from 'collegedb';
657
+ import { KVShardMapper } from '@earth-app/collegedb';
483
658
  const mapper = new KVShardMapper(env.KV);
484
659
  await mapper.clearAllMappings(); // Returns to pre-migration state
485
660
 
486
661
  // Or clear cache to force re-detection
487
- import { clearMigrationCache } from 'collegedb';
662
+ import { clearMigrationCache } from '@earth-app/collegedb';
488
663
  clearMigrationCache(); // Forces fresh migration check
489
664
  ```
490
665
 
@@ -535,7 +710,7 @@ for (const [table, pkColumn] of Object.entries(customIntegration)) {
535
710
  | ------------------------------------------ | ---------------------------------------------------------------- | -------------------------- |
536
711
  | `collegedb(config, callback)` | Initialize CollegeDB, then run a callback | `CollegeDBConfig, () => T` |
537
712
  | `initialize(config)` | Initialize CollegeDB with configuration | `CollegeDBConfig` |
538
- | `createSchema(d1)` | Create database schema on a D1 instance | `D1Database` |
713
+ | `createSchema(db, schema)` | Create schema on a shard database | `SQLDatabase, string` |
539
714
  | `prepare(key, sql)` | Prepare a SQL statement for execution | `string, string` |
540
715
  | `run(key, sql, bindings)` | Execute a SQL query with primary key routing | `string, string, any[]` |
541
716
  | `first(key, sql, bindings)` | Execute a SQL query and return first result | `string, string, any[]` |
@@ -549,19 +724,34 @@ for (const [table, pkColumn] of Object.entries(customIntegration)) {
549
724
  | `reassignShard(key, newShard)` | Move primary key to different shard | `string, string` |
550
725
  | `listKnownShards()` | Get list of available shards | `void` |
551
726
  | `getShardStats()` | Get statistics for all shards | `void` |
727
+ | `getDatabaseSizeForShard(shard)` | Get size of a specific shard in bytes | `string` |
552
728
  | `flush()` | Clear all shard mappings (development only) | `void` |
553
729
 
730
+ ### Provider Adapter Functions
731
+
732
+ | Function | Description | Parameters |
733
+ | ---------------------------------------------------------- | -------------------------------------------------------------------- | -------------------------------------------------------- |
734
+ | `createRedisKVProvider(client, options?)` | Adapt a Redis client to CollegeDB's `KVStorage` contract | `RedisLikeClient, { scanCount?: number }` |
735
+ | `createValkeyKVProvider(client, options?)` | Adapt a Valkey client to CollegeDB's `KVStorage` contract | `RedisLikeClient, { scanCount?: number }` |
736
+ | `createPostgresSQLProvider(client)` | Adapt a PostgreSQL client/pool to `SQLDatabase` | `PostgresClientLike` |
737
+ | `createMySQLSQLProvider(client)` | Adapt a MySQL/MariaDB client to `SQLDatabase` | `MySQLClientLike` |
738
+ | `createSQLiteSQLProvider(client)` | Adapt a SQLite client to `SQLDatabase` | `SQLiteClientLike` |
739
+ | `createHyperdrivePostgresProvider(binding, clientFactory)` | Create a PostgreSQL `SQLDatabase` adapter using a Hyperdrive binding | `HyperdriveBindingLike, HyperdrivePostgresClientFactory` |
740
+ | `createHyperdriveMySQLProvider(binding, clientFactory)` | Create a MySQL `SQLDatabase` adapter using a Hyperdrive binding | `HyperdriveBindingLike, HyperdriveMySQLClientFactory` |
741
+ | `isKVStorage(value)` | Runtime guard for `KVStorage` | `unknown` |
742
+ | `isSQLDatabase(value)` | Runtime guard for `SQLDatabase` | `unknown` |
743
+
554
744
  ### Drop-in Replacement Functions
555
745
 
556
746
  | Function | Description | Parameters |
557
747
  | ----------------------------------------- | ---------------------------------------------- | ------------------------------ |
558
- | `autoDetectAndMigrate(d1, shard, config)` | Automatically detect and migrate existing data | `D1Database, string, config` |
559
- | `checkMigrationNeeded(d1, shard, config)` | Check if database needs migration | `D1Database, string, config` |
560
- | `validateTableForSharding(d1, table)` | Check if table is suitable for sharding | `D1Database, string` |
561
- | `discoverExistingPrimaryKeys(d1, table)` | Find all primary keys in existing table | `D1Database, string` |
562
- | `integrateExistingDatabase(d1, shard)` | Complete drop-in integration of existing DB | `D1Database, string, mapper` |
748
+ | `autoDetectAndMigrate(d1, shard, config)` | Automatically detect and migrate existing data | `SQLDatabase, string, config` |
749
+ | `checkMigrationNeeded(d1, shard, config)` | Check if database needs migration | `SQLDatabase, string, config` |
750
+ | `validateTableForSharding(d1, table)` | Check if table is suitable for sharding | `SQLDatabase, string` |
751
+ | `discoverExistingPrimaryKeys(d1, table)` | Find all primary keys in existing table | `SQLDatabase, string` |
752
+ | `integrateExistingDatabase(d1, shard)` | Complete drop-in integration of existing DB | `SQLDatabase, string, mapper` |
563
753
  | `createMappingsForExistingKeys(keys)` | Create shard mappings for existing keys | `string[], string[], strategy` |
564
- | `listTables(d1)` | Get list of tables in database | `D1Database` |
754
+ | `listTables(d1)` | Get list of tables in database | `SQLDatabase` |
565
755
  | `clearMigrationCache()` | Clear automatic migration cache | `void` |
566
756
 
567
757
  ### Error Handling
@@ -612,7 +802,7 @@ The `ShardCoordinator` is an optional Durable Object that provides centralized s
612
802
  #### Usage Example
613
803
 
614
804
  ```typescript
615
- import { ShardCoordinator } from 'collegedb';
805
+ import { ShardCoordinator } from '@earth-app/collegedb';
616
806
 
617
807
  // Export for Cloudflare Workers runtime
618
808
  export { ShardCoordinator };
@@ -644,14 +834,19 @@ The main configuration interface supports both single strategies and mixed strat
644
834
 
645
835
  ```typescript
646
836
  interface CollegeDBConfig {
647
- kv: KVNamespace;
837
+ kv: KVStorage;
648
838
  coordinator?: DurableObjectNamespace;
649
- shards: Record<string, D1Database>;
839
+ shards: Record<string, SQLDatabase>;
650
840
  strategy?: ShardingStrategy | MixedShardingStrategy;
651
841
  targetRegion?: D1Region;
652
842
  shardLocations?: Record<string, ShardLocation>;
653
843
  disableAutoMigration?: boolean; // Default: false
654
844
  hashShardMappings?: boolean; // Default: true
845
+ maxDatabaseSize?: number; // Default: undefined (no limit)
846
+ mappingCacheTtlMs?: number; // Default: 30000
847
+ knownShardsCacheTtlMs?: number; // Default: 10000
848
+ sizeCacheTtlMs?: number; // Default: 30000
849
+ migrationConcurrency?: number; // Default: 25
655
850
  }
656
851
  ```
657
852
 
@@ -702,6 +897,91 @@ const mixedStrategyConfig: CollegeDBConfig = {
702
897
  };
703
898
  ```
704
899
 
900
+ ### Database Size Management
901
+
902
+ CollegeDB supports automatic size-based shard exclusion, which stops new records from being allocated to shards that have grown past a configured size. This helps maintain performance and avoid hitting database storage limits.
903
+
904
+ #### Configuration
905
+
906
+ ```typescript
907
+ const config: CollegeDBConfig = {
908
+ kv: env.KV,
909
+ shards: {
910
+ 'db-east': env.DB_EAST,
911
+ 'db-west': env.DB_WEST,
912
+ 'db-central': env.DB_CENTRAL
913
+ },
914
+ strategy: 'hash',
915
+ maxDatabaseSize: 500 * 1024 * 1024 // 500 MB limit per shard
916
+ };
917
+ ```
918
+
919
+ #### How It Works
920
+
921
+ When `maxDatabaseSize` is configured:
922
+
923
+ 1. **Allocation Phase**: Before allocating new records, CollegeDB checks each shard's size using efficient SQLite pragmas
924
+ 2. **Size Filtering**: Shards exceeding the limit are excluded from new allocations
925
+ 3. **Fallback Protection**: If all shards exceed the limit, allocation continues to prevent complete failure
926
+ 4. **Existing Records**: Records already mapped to oversized shards remain accessible
927
+
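The four steps above can be sketched as a filter with a fallback. This is an illustration under stated assumptions, not CollegeDB's internal implementation: `sizeFromPragmas` and `eligibleShards` are hypothetical names, shard sizes are assumed to be already measured, and falling back to the full shard list when every shard is over the limit is one reading of "allocation continues".

```typescript
// Hypothetical helpers, not CollegeDB's internal code.

// Total size in bytes from the two SQLite pragma values (page_count x page_size).
function sizeFromPragmas(pageCount: number, pageSize: number): number {
  return pageCount * pageSize;
}

// sizes: shard name -> current size in bytes.
// maxDatabaseSize: optional per-shard limit in bytes (undefined means no limit).
function eligibleShards(sizes: Record<string, number>, maxDatabaseSize?: number): string[] {
  const all = Object.keys(sizes);
  if (maxDatabaseSize === undefined) return all;

  // Shards strictly over the limit are excluded from new allocations.
  const underLimit = all.filter((name) => sizes[name] <= maxDatabaseSize);

  // Assumed fallback: if every shard is over the limit, keep allocating
  // across all shards rather than failing the write.
  return underLimit.length > 0 ? underLimit : all;
}
```

Only new allocations pass through a filter like this. Keys that already have a mapping keep routing to their existing shard, which is why records on oversized shards stay readable.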
928
+ #### Size Check Implementation
929
+
930
+ The size check uses SQLite's `PRAGMA page_count` and `PRAGMA page_size` for accurate, low-overhead size calculation:
931
+
932
+ ```sql
933
+ -- Efficient size calculation (used internally)
934
+ PRAGMA page_count; -- Returns number of database pages
935
+ PRAGMA page_size; -- Returns size of each page in bytes
936
+ -- Total size = page_count × page_size
937
+ ```
938
+
939
+ #### Usage Examples
940
+
941
+ ```typescript
942
+ // Conservative limit for high-performance scenarios
943
+ const performanceConfig: CollegeDBConfig = {
944
+ // ... other config
945
+ maxDatabaseSize: 100 * 1024 * 1024, // 100 MB per shard
946
+ strategy: 'round-robin' // Ensures even distribution
947
+ };
948
+
949
+ // Standard production limit
950
+ const productionConfig: CollegeDBConfig = {
951
+ // ... other config
952
+ maxDatabaseSize: 1024 * 1024 * 1024, // 1 GB per shard
953
+ strategy: 'hash' // Consistent allocation
954
+ };
955
+
956
+ // Check individual shard sizes
957
+ import { getDatabaseSizeForShard } from '@earth-app/collegedb';
958
+
959
+ const eastSize = await getDatabaseSizeForShard('db-east');
960
+ console.log(`East shard: ${Math.round(eastSize / 1024 / 1024)} MB`);
961
+ ```
962
+
963
+ #### Debug Logging
964
+
965
+ Enable debug logging to monitor size-based exclusions:
966
+
967
+ ```typescript
968
+ const config: CollegeDBConfig = {
969
+ // ... other config
970
+ maxDatabaseSize: 500 * 1024 * 1024,
971
+ debug: true // Logs when shards are excluded due to size
972
+ };
973
+
974
+ // Console output example:
975
+ // "Excluded 2 shards due to size limits: db-east, db-central"
976
+ ```
977
+
978
+ #### Size-Limit Performance Impact
979
+
980
+ - **Size Check Frequency**: Only performed during new allocations (not on reads)
981
+ - **Query Efficiency**: Uses fast SQLite pragmas (microsecond execution time)
982
+ - **Parallel Execution**: Size checks run concurrently across all shards
983
+ - **Caching**: Size checks are cached in-memory (controlled by `sizeCacheTtlMs`, default `30000`)
984
+
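The caching behaviour in the last bullet can be pictured as a small time-to-live map. The sketch below is a hypothetical `TtlCache`, not the library's actual cache. The injectable `now` clock is an assumption added so that expiry can be tested deterministically.

```typescript
// Hypothetical in-memory TTL cache; CollegeDB's internal cache may differ.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop the entry and report a miss
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that expires ttlMs milliseconds from now.
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

With a 30000 ms TTL, a shard that crosses `maxDatabaseSize` could keep receiving new allocations for up to 30 seconds before the next size check sees it. A lower `sizeCacheTtlMs` narrows that window at the cost of more frequent size queries.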
705
985
  ### Types
706
986
 
707
987
  CollegeDB exports TypeScript types for a better development experience and type safety:
@@ -709,6 +989,10 @@ CollegeDB exports TypeScript types for better development experience and type sa
709
989
  | Type | Description | Example |
710
990
  | ----------------------- | ------------------------------ | --------------------------------------------------- |
711
991
  | `CollegeDBConfig` | Main configuration object | `{ kv, shards, strategy }` |
992
+ | `KVStorage` | Provider-agnostic KV contract | `createRedisKVProvider(redisClient)` |
993
+ | `SQLDatabase` | Provider-agnostic SQL contract | `createPostgresSQLProvider(pgPool)` |
994
+ | `QueryResult` | Standard query response shape | `{ success, results, meta }` |
995
+ | `QueryResultMeta` | Query execution metadata | `{ duration, changes?, last_row_id? }` |
712
996
  | `ShardingStrategy` | Single strategy options | `'hash' \| 'location' \| 'round-robin' \| 'random'` |
713
997
  | `MixedShardingStrategy` | Mixed strategy configuration | `{ read: 'hash', write: 'location' }` |
714
998
  | `OperationType` | Database operation types | `'read' \| 'write'` |
@@ -719,7 +1003,7 @@ CollegeDB exports TypeScript types for better development experience and type sa
719
1003
  #### Mixed Strategy Configuration
720
1004
 
721
1005
  ```typescript
722
- import type { MixedShardingStrategy, CollegeDBConfig } from 'collegedb';
1006
+ import type { MixedShardingStrategy, CollegeDBConfig } from '@earth-app/collegedb';
723
1007
 
724
1008
  // Type-safe mixed strategy configuration
725
1009
  const mixedStrategy: MixedShardingStrategy = {
@@ -831,81 +1115,127 @@ wrangler d1 create collegedb-central
831
1115
  wrangler kv namespace create "KV"
832
1116
  ```
833
1117
 
834
- ### 3. Configure wrangler.toml
1118
+ ### 3. Configure wrangler.jsonc
835
1119
 
836
- ```toml
837
- [[d1_databases]]
838
- binding = "db-east"
839
- database_name = "collegedb-east"
840
- database_id = "your-database-id"
841
-
842
- [[d1_databases]]
843
- binding = "db-west"
844
- database_name = "collegedb-west"
845
- database_id = "your-database-id"
846
-
847
- [[kv_namespaces]]
848
- binding = "KV"
849
- id = "your-kv-namespace-id"
850
-
851
- [[durable_objects.bindings]]
852
- name = "ShardCoordinator"
853
- class_name = "ShardCoordinator"
1120
+ ```jsonc
1121
+ {
1122
+ "$schema": "./node_modules/wrangler/config-schema.json",
1123
+ "name": "collegedb-app",
1124
+ "main": "src/index.ts",
1125
+ "compatibility_date": "2026-04-15",
1126
+ "d1_databases": [
1127
+ {
1128
+ "binding": "db-east",
1129
+ "database_name": "collegedb-east",
1130
+ "database_id": "your-east-database-id"
1131
+ },
1132
+ {
1133
+ "binding": "db-west",
1134
+ "database_name": "collegedb-west",
1135
+ "database_id": "your-west-database-id"
1136
+ }
1137
+ ],
1138
+ "kv_namespaces": [
1139
+ {
1140
+ "binding": "KV",
1141
+ "id": "your-kv-namespace-id",
1142
+ "preview_id": "your-kv-preview-id"
1143
+ }
1144
+ ],
1145
+ "durable_objects": {
1146
+ "bindings": [
1147
+ {
1148
+ "name": "ShardCoordinator",
1149
+ "class_name": "ShardCoordinator"
1150
+ }
1151
+ ]
1152
+ },
1153
+ "migrations": [
1154
+ {
1155
+ "tag": "v1",
1156
+ "new_sqlite_classes": ["ShardCoordinator"]
1157
+ }
1158
+ ]
1159
+ }
854
1160
  ```
855
1161
 
856
- #### Complete wrangler.toml with ShardCoordinator
1162
+ #### Complete wrangler.jsonc with ShardCoordinator
857
1163
 
858
- ```toml
859
- name = "collegedb-app"
860
- main = "src/index.ts"
861
- compatibility_date = "2024-08-10"
862
-
863
- # D1 Database bindings
864
- [[d1_databases]]
865
- binding = "db-east"
866
- database_name = "collegedb-east"
867
- database_id = "your-east-database-id"
868
-
869
- [[d1_databases]]
870
- binding = "db-west"
871
- database_name = "collegedb-west"
872
- database_id = "your-west-database-id"
873
-
874
- [[d1_databases]]
875
- binding = "db-central"
876
- database_name = "collegedb-central"
877
- database_id = "your-central-database-id"
878
-
879
- # KV namespace for shard mappings
880
- [[kv_namespaces]]
881
- binding = "KV"
882
- id = "your-kv-namespace-id"
883
- preview_id = "your-kv-preview-id" # For local development
884
-
885
- # Durable Object for shard coordination
886
- [[durable_objects.bindings]]
887
- name = "ShardCoordinator"
888
- class_name = "ShardCoordinator"
889
-
890
- # Environment-specific configurations
891
- [env.production]
892
- [[env.production.d1_databases]]
893
- binding = "db-east"
894
- database_name = "collegedb-prod-east"
895
- database_id = "your-prod-east-id"
896
-
897
- [[env.production.d1_databases]]
898
- binding = "db-west"
899
- database_name = "collegedb-prod-west"
900
- database_id = "your-prod-west-id"
901
-
902
- [[env.production.kv_namespaces]]
903
- binding = "KV"
904
- id = "your-prod-kv-namespace-id"
905
-
906
- [[env.production.durable_objects.bindings]]
907
- name = "ShardCoordinator"
908
- class_name = "ShardCoordinator"
+ ```jsonc
+ {
+   "$schema": "./node_modules/wrangler/config-schema.json",
+   "name": "collegedb-app",
+   "main": "src/index.ts",
+   "compatibility_date": "2026-04-15",
+   "d1_databases": [
+     {
+       "binding": "db-east",
+       "database_name": "collegedb-east",
+       "database_id": "your-east-database-id"
+     },
+     {
+       "binding": "db-west",
+       "database_name": "collegedb-west",
+       "database_id": "your-west-database-id"
+     },
+     {
+       "binding": "db-central",
+       "database_name": "collegedb-central",
+       "database_id": "your-central-database-id"
+     }
+   ],
+   "kv_namespaces": [
+     {
+       "binding": "KV",
+       "id": "your-kv-namespace-id",
+       "preview_id": "your-kv-preview-id"
+     }
+   ],
+   "durable_objects": {
+     "bindings": [
+       {
+         "name": "ShardCoordinator",
+         "class_name": "ShardCoordinator"
+       }
+     ]
+   },
+   "migrations": [
+     {
+       "tag": "v1",
+       "new_sqlite_classes": ["ShardCoordinator"]
+     }
+   ],
+   "env": {
+     "production": {
+       "d1_databases": [
+         {
+           "binding": "db-east",
+           "database_name": "collegedb-prod-east",
+           "database_id": "your-prod-east-id"
+         },
+         {
+           "binding": "db-west",
+           "database_name": "collegedb-prod-west",
+           "database_id": "your-prod-west-id"
+         }
+       ],
+       "kv_namespaces": [
+         {
+           "binding": "KV",
+           "id": "your-prod-kv-namespace-id"
+         }
+       ],
+       "durable_objects": {
+         "bindings": [
+           {
+             "name": "ShardCoordinator",
+             "class_name": "ShardCoordinator"
+           }
+         ]
+       }
+     }
+   }
+ }
  ```
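
The D1 binding names in the configuration above contain hyphens (`db-east`, `db-west`, `db-central`), so they cannot be read from `env` with dot notation and need bracket access instead. The following is a minimal sketch of gathering such bindings into a shard map. `collectShards` is a hypothetical helper written for illustration and is not part of the CollegeDB API; the type parameter `T` stands in for whatever database type the bindings carry (for example `D1Database`).

```typescript
// Hypothetical helper, not part of CollegeDB: collects the named bindings
// from a Workers-style env object into a plain shard map.
function collectShards<T>(
  env: Record<string, unknown>,
  names: readonly string[]
): Record<string, T> {
  const shards: Record<string, T> = {};
  for (const name of names) {
    // Hyphenated binding names require bracket access, e.g. env['db-east'].
    const binding = env[name];
    if (binding === undefined || binding === null) {
      throw new Error(`Missing binding "${name}"; check d1_databases in wrangler.jsonc`);
    }
    shards[name] = binding as T;
  }
  return shards;
}
```

Passing the names explicitly matters here because the `production` environment above declares only `db-east` and `db-west`, so a hard-coded list that includes `db-central` would fail there.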
 
  ### 3.1. Worker Script Setup (Required for ShardCoordinator)
@@ -914,7 +1244,7 @@ Create your main worker file with ShardCoordinator export:
 
  ```typescript
  // src/index.ts
- import { collegedb, ShardCoordinator, first, run } from 'collegedb';
+ import { collegedb, ShardCoordinator, first, run } from '@earth-app/collegedb';

  // IMPORTANT: Export ShardCoordinator for Cloudflare Workers runtime
  export { ShardCoordinator };
@@ -978,7 +1308,7 @@ wrangler deploy --env production
  #### Using CollegeDB Functions

  ```typescript
- import { getShardStats, listKnownShards } from 'collegedb';
+ import { getShardStats, listKnownShards } from '@earth-app/collegedb';

  // Get detailed statistics
  const stats = await getShardStats();
@@ -1086,7 +1416,7 @@ export default {
  ### Shard Rebalancing

  ```typescript
- import { reassignShard } from 'collegedb';
+ import { reassignShard } from '@earth-app/collegedb';

  // Move a primary key to a different shard
  await reassignShard('user-123', 'db-west');
@@ -1675,25 +2005,31 @@ CollegeDB includes an optional **ShardCoordinator** Durable Object that provides
 
  #### Durable Object Setup

- First, configure the Durable Object in your `wrangler.toml`:
-
- ```toml
- [[durable_objects.bindings]]
- name = "ShardCoordinator"
- class_name = "ShardCoordinator"
+ First, configure the Durable Object in your `wrangler.jsonc`:

- # Export the ShardCoordinator class
- [durable_objects.bindings.script_name]
- # If using modules format
- [[durable_objects.bindings]]
- name = "ShardCoordinator"
- class_name = "ShardCoordinator"
+ ```jsonc
+ {
+   "durable_objects": {
+     "bindings": [
+       {
+         "name": "ShardCoordinator",
+         "class_name": "ShardCoordinator"
+       }
+     ]
+   },
+   "migrations": [
+     {
+       "tag": "v1",
+       "new_sqlite_classes": ["ShardCoordinator"]
+     }
+   ]
+ }
  ```
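
The configuration above pairs the `ShardCoordinator` binding with a `migrations` entry that registers the class. As a rough sanity check, the sketch below reports any locally defined Durable Object class that is bound but never registered by a migration. `findUnmigratedClasses` and the interfaces are hypothetical and cover only the fields shown here; the check is a simplification that ignores renamed or deleted classes.

```typescript
// Hypothetical shapes covering only the config fields used in this check.
interface DurableObjectBinding {
  name: string;
  class_name: string;
  script_name?: string; // set when the class lives in another Worker
}

interface Migration {
  tag: string;
  new_classes?: string[];
  new_sqlite_classes?: string[];
}

interface WranglerConfigSubset {
  durable_objects?: { bindings: DurableObjectBinding[] };
  migrations?: Migration[];
}

// Returns class names that are bound locally but not introduced by any migration.
function findUnmigratedClasses(config: WranglerConfigSubset): string[] {
  const migrated = new Set<string>();
  for (const migration of config.migrations ?? []) {
    for (const cls of migration.new_classes ?? []) migrated.add(cls);
    for (const cls of migration.new_sqlite_classes ?? []) migrated.add(cls);
  }
  return (config.durable_objects?.bindings ?? [])
    .filter((binding) => binding.script_name === undefined) // skip classes owned by other Workers
    .map((binding) => binding.class_name)
    .filter((cls) => !migrated.has(cls));
}
```

Run against a parsed copy of the config, an empty result suggests every bound class has a migration; a non-empty result names the classes to add under `new_sqlite_classes`.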
 
  #### Basic Usage with ShardCoordinator

  ```typescript
- import { collegedb, ShardCoordinator } from 'collegedb';
+ import { collegedb, ShardCoordinator } from '@earth-app/collegedb';

  // Export the Durable Object class for Cloudflare Workers
  export { ShardCoordinator };