@earth-app/collegedb 1.0.1 → 1.0.2

package/README.md CHANGED
@@ -1,27 +1,60 @@
1
1
  # CollegeDB
2
2
 
3
- > Cloudflare D1 Sharding Router
3
+ > Cloudflare D1 Horizontal Sharding Router
4
4
 
5
5
  [![TypeScript](https://img.shields.io/badge/TypeScript-5.0+-blue.svg)](https://www.typescriptlang.org/)
6
+ [![GitHub Issues](https://img.shields.io/github/issues/earth-app/CollegeDB)](https://github.com/earth-app/CollegeDB/issues)
6
7
  [![Cloudflare Workers](https://img.shields.io/badge/cloudflare-workers-orange.svg)](https://workers.cloudflare.com/)
7
- [![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
8
+ [![GitHub License](https://img.shields.io/github/license/earth-app/CollegeDB)](LICENSE)
9
+ ![NPM Version](https://img.shields.io/npm/v/%40earth-app%2Fcollegedb)
8
10
 
9
- A TypeScript library for horizontal scaling of SQLite-style databases on Cloudflare using D1 and KV. CollegeDB simulates vertical scaling by routing queries to the correct D1 database instance using primary key mappings stored in Cloudflare KV.
11
+ A TypeScript library for **true horizontal scaling** of SQLite-style databases on Cloudflare using D1 and KV. CollegeDB distributes your data across multiple D1 databases, with each table's records split by primary key across different database instances.
10
12
 
11
- ## Overview
13
+ CollegeDB implements **data distribution** where a single logical table is physically stored across multiple D1 databases:
14
+
15
+ ```txt
16
+ env.db-east (Shard 1)
17
+ ┌────────────────────────────────────────────┐
18
+ │ table users: [user-1, user-3, user-5, ...] │
19
+ │ table posts: [post-2, post-7, post-9, ...] │
20
+ └────────────────────────────────────────────┘
21
+
22
+ env.db-west (Shard 2)
23
+ ┌────────────────────────────────────────────┐
24
+ │ table users: [user-2, user-4, user-6, ...] │
25
+ │ table posts: [post-1, post-3, post-8, ...] │
26
+ └────────────────────────────────────────────┘
27
+
28
+ env.db-central (Shard 3)
29
+ ┌────────────────────────────────────────────┐
30
+ │ table users: [user-7, user-8, user-9, ...] │
31
+ │ table posts: [post-4, post-5, post-6, ...] │
32
+ └────────────────────────────────────────────┘
33
+ ```
34
+
35
+ This allows you to:
36
+
37
+ - **Break through D1's single database limits** by spreading data across many databases
38
+ - **Improve query performance** by reducing data per database instance
39
+ - **Scale geographically** by placing shards in different regions
40
+ - **Increase write throughput** by parallelizing across multiple database instances
41
+
42
+ ## 📈 Overview
12
43
 
13
44
  CollegeDB provides a sharding layer on top of Cloudflare D1 databases, enabling you to:
14
45
 
15
- - **Scale horizontally** across multiple D1 instances
16
- - **Route queries automatically** based on primary keys
17
- - **Maintain consistency** with KV-based mapping
46
+ - **Scale horizontally** by distributing table data across multiple D1 instances
47
+ - **Route queries automatically** based on primary key mappings
48
+ - **Maintain consistency** with KV-based shard mapping
49
+ - **Optimize for geography** with location-aware shard allocation
18
50
  - **Monitor and rebalance** shard distribution
19
51
  - **Handle migrations** between shards seamlessly
20
52
 
21
53
  ## 📦 Features
22
54
 
23
55
  - **🔀 Automatic Query Routing**: Primary key → shard mapping using Cloudflare KV
24
- - **🎯 Multiple Allocation Strategies**: Round-robin, random, or hash-based distribution
56
+ - **🎯 Multiple Allocation Strategies**: Round-robin, random, hash-based, and location-aware distribution
57
+ - **🔄 Mixed Strategy Support**: Different strategies for reads vs writes (e.g., location for writes, hash for reads)
25
58
  - **📊 Shard Coordination**: Durable Objects for allocation and statistics
26
59
  - **🛠 Migration Support**: Move data between shards with zero downtime
27
60
  - **🔄 Automatic Drop-in Replacement**: Zero-config integration with existing databases
@@ -51,7 +84,7 @@ collegedb(
51
84
  'db-east': env['db-east'], // Can be existing DB with data
52
85
  'db-west': env['db-west'] // Can be existing DB with data
53
86
  },
54
- strategy: 'hash' // or 'round-robin', 'random'
87
+ strategy: 'hash'
55
88
  },
56
89
  async () => {
57
90
  // Create schema on new shards only (existing shards auto-detected)
@@ -68,6 +101,94 @@ collegedb(
68
101
  );
69
102
  ```
70
103
 
104
+ ### Geographic Distribution Example
105
+
106
+ ```typescript
107
+ import { collegedb, first, run } from 'collegedb';
108
+
109
+ // Optimize for North American users with geographic sharding
110
+ collegedb(
111
+ {
112
+ kv: env.KV,
113
+ strategy: 'location',
114
+ targetRegion: 'wnam', // Western North America
115
+ shardLocations: {
116
+ 'db-west': { region: 'wnam', priority: 2 }, // SF - Preferred for target region
117
+ 'db-east': { region: 'enam', priority: 1 }, // NYC - Secondary
118
+ 'db-europe': { region: 'weur', priority: 0.5 } // London - Fallback
119
+ },
120
+ shards: {
121
+ 'db-west': env.DB_WEST,
122
+ 'db-east': env.DB_EAST,
123
+ 'db-europe': env.DB_EUROPE
124
+ }
125
+ },
126
+ async () => {
127
+ // New users will be allocated to db-west (closest to target region)
128
+ await run('user-west-123', 'INSERT INTO users (id, name, location) VALUES (?, ?, ?)', [
129
+ 'user-west-123',
130
+ 'West Coast User',
131
+ 'California'
132
+ ]);
133
+
134
+ // Queries are routed to the correct geographic shard
135
+ const user = await first<User>('user-west-123', 'SELECT * FROM users WHERE id = ?', ['user-west-123']);
136
+ console.log(`User found in optimal shard: ${user?.name}`);
137
+ }
138
+ );
139
+ ```
140
+
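The example above does not spell out how `targetRegion` and `priority` interact. The sketch below shows one plausible reading: prefer shards in the target region, and otherwise fall back to the highest-priority shard anywhere. `pickLocationShard` and `ShardLocationSketch` are hypothetical names used for illustration only, and the "higher priority wins" rule is an assumption, not CollegeDB's documented algorithm.

```typescript
// Hypothetical sketch: not the library's actual location algorithm.
interface ShardLocationSketch {
  region: string; // e.g. 'wnam', 'enam', 'weur'
  priority?: number; // assumed semantics: higher is preferred
}

function pickLocationShard(
  targetRegion: string,
  shardLocations: Record<string, ShardLocationSketch>
): string {
  const entries = Object.entries(shardLocations);
  if (entries.length === 0) {
    throw new Error('No shard locations configured');
  }
  // Prefer shards in the target region; otherwise consider every shard.
  const inRegion = entries.filter(([, loc]) => loc.region === targetRegion);
  const candidates = inRegion.length > 0 ? inRegion : entries;
  // Among the candidates, keep the one with the highest priority.
  let best = candidates[0];
  for (const candidate of candidates) {
    if ((candidate[1].priority ?? 0) > (best[1].priority ?? 0)) {
      best = candidate;
    }
  }
  return best[0];
}

const sketchLocations: Record<string, ShardLocationSketch> = {
  'db-west': { region: 'wnam', priority: 2 },
  'db-east': { region: 'enam', priority: 1 },
  'db-europe': { region: 'weur', priority: 0.5 }
};

console.log(pickLocationShard('wnam', sketchLocations)); // db-west
console.log(pickLocationShard('weur', sketchLocations)); // db-europe
```

A region with no configured shard falls through to the global highest priority, which is `db-west` for the locations above.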
141
+ ### Mixed Strategy Example
142
+
143
+ ```typescript
144
+ import { collegedb, first, run, type MixedShardingStrategy } from 'collegedb';
145
+
146
+ // Use location strategy for writes (optimal data placement) and hash for reads (optimal performance)
147
+ const mixedStrategy: MixedShardingStrategy = {
148
+ write: 'location', // New data goes to geographically optimal shards
149
+ read: 'hash' // Reads use consistent hashing for best performance
150
+ };
151
+
152
+ collegedb(
153
+ {
154
+ kv: env.KV,
155
+ strategy: mixedStrategy,
156
+ targetRegion: 'wnam', // Western North America for writes
157
+ shardLocations: {
158
+ 'db-west': { region: 'wnam', priority: 2 },
159
+ 'db-east': { region: 'enam', priority: 1 },
160
+ 'db-central': { region: 'enam', priority: 1 }
161
+ },
162
+ shards: {
163
+ 'db-west': env.DB_WEST,
164
+ 'db-east': env.DB_EAST,
165
+ 'db-central': env.DB_CENTRAL
166
+ }
167
+ },
168
+ async () => {
169
+ // Write operations use location strategy - new users placed optimally
170
+ await run('user-california-456', 'INSERT INTO users (id, name, location) VALUES (?, ?, ?)', [
171
+ 'user-california-456',
172
+ 'California User',
173
+ 'Los Angeles'
174
+ ]);
175
+
176
+ // Read operations use hash strategy - consistent and fast routing
177
+ const user = await first<User>('user-california-456', 'SELECT * FROM users WHERE id = ?', ['user-california-456']);
178
+
179
+ // Different operations can route to different shards based on strategy
180
+ // This optimizes both data placement (writes) and query performance (reads)
181
+ console.log(`User: ${user?.name}, Location: ${user?.location}`);
182
+ }
183
+ );
184
+ ```
185
+
186
+ This approach provides:
187
+
188
+ - **Optimal data placement**: New records are written to geographically optimal shards using `location` strategy
189
+ - **Optimal read performance**: Queries use `hash` strategy for consistent, high-performance routing
190
+ - **Flexibility**: Each operation type can use the most appropriate routing strategy
191
+
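To make the read/write split concrete, here is a minimal sketch of how a strategy could be resolved per operation. The type shapes follow the README's `ShardingStrategy` and `MixedShardingStrategy`; `resolveStrategy` and `classifySql` are hypothetical helpers, and the "starts with SELECT" rule is a simplifying assumption rather than the library's actual classifier.

```typescript
type ShardingStrategy = 'round-robin' | 'random' | 'hash' | 'location';
type OperationType = 'read' | 'write';

interface MixedShardingStrategy {
  read: ShardingStrategy;
  write: ShardingStrategy;
}

// Hypothetical helper: a single strategy applies to every operation,
// a mixed strategy is looked up by operation type.
function resolveStrategy(
  strategy: ShardingStrategy | MixedShardingStrategy,
  operation: OperationType
): ShardingStrategy {
  return typeof strategy === 'string' ? strategy : strategy[operation];
}

// Hypothetical classifier: statements that start with SELECT count as reads,
// everything else (INSERT/UPDATE/DELETE, ...) counts as a write.
function classifySql(sql: string): OperationType {
  return /^\s*select\b/i.test(sql) ? 'read' : 'write';
}

const mixedSketch: MixedShardingStrategy = { read: 'hash', write: 'location' };
const insertSql = 'INSERT INTO users (id, name, location) VALUES (?, ?, ?)';
const selectSql = 'SELECT * FROM users WHERE id = ?';

console.log(resolveStrategy(mixedSketch, classifySql(insertSql))); // location
console.log(resolveStrategy(mixedSketch, classifySql(selectSql))); // hash
```

A prefix check like this would misclassify statements such as `WITH ... SELECT`, so treat it as an illustration of the idea only.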
71
192
  ## Drop-in Replacement for Existing Databases
72
193
 
73
194
  CollegeDB supports **seamless, automatic integration** with existing D1 databases that already contain data. Simply add your existing databases as shards in the configuration. CollegeDB will automatically detect existing data and create the necessary shard mappings **without requiring any manual migration steps**.
@@ -320,18 +441,22 @@ for (const [table, pkColumn] of Object.entries(customIntegration)) {
320
441
 
321
442
  ## 📚 API Reference
322
443
 
323
- | Function | Description | Parameters |
324
- | ------------------------------ | -------------------------------------------- | ------------------------ |
325
- | `collegedb(config, callback)` | Initialize CollegeDB, then run a callback | `CollegeDBConfig, ()=>T` |
326
- | `initialize(config)` | Initialize CollegeDB with configuration | `CollegeDBConfig` |
327
- | `createSchema(d1)` | Create database schema on a D1 instance | `D1Database` |
328
- | `prepare(key, sql)` | Prepare a SQL statement for execution | `string, string` |
329
- | `run(key, sql, bindings)` | Execute a SQL query with primary key routing | `string, string, any[]` |
330
- | `first(key, sql, bindings)` | Execute a SQL query and return first result | `string, string, any[]` |
331
- | `all(key, sql, bindings)` | Execute a SQL query and return all results | `string, string, any[]` |
332
- | `reassignShard(key, newShard)` | Move primary key to different shard | `string, string` |
333
- | `listKnownShards()` | Get list of available shards | `void` |
334
- | `getShardStats()` | Get statistics for all shards | `void` |
444
+ | Function | Description | Parameters |
445
+ | ---------------------------------- | ---------------------------------------------------- | ------------------------ |
446
+ | `collegedb(config, callback)` | Initialize CollegeDB, then run a callback | `CollegeDBConfig, ()=>T` |
447
+ | `initialize(config)` | Initialize CollegeDB with configuration | `CollegeDBConfig` |
448
+ | `createSchema(d1)` | Create database schema on a D1 instance | `D1Database` |
449
+ | `prepare(key, sql)` | Prepare a SQL statement for execution | `string, string` |
450
+ | `run(key, sql, bindings)` | Execute a SQL query with primary key routing | `string, string, any[]` |
451
+ | `first(key, sql, bindings)` | Execute a SQL query and return first result | `string, string, any[]` |
452
+ | `all(key, sql, bindings)` | Execute a SQL query and return all results | `string, string, any[]` |
453
+ | `runShard(shard, sql, bindings)` | Execute a SQL query directly on a specific shard | `string, string, any[]` |
454
+ | `allShard(shard, sql, bindings)` | Execute query on specific shard, return all results | `string, string, any[]` |
455
+ | `firstShard(shard, sql, bindings)` | Execute query on specific shard, return first result | `string, string, any[]` |
456
+ | `reassignShard(key, newShard)` | Move primary key to different shard | `string, string` |
457
+ | `listKnownShards()` | Get list of available shards | `void` |
458
+ | `getShardStats()` | Get statistics for all shards | `void` |
459
+ | `flush()` | Clear all shard mappings (development only) | `void` |
335
460
 
336
461
  ### Drop-in Replacement Functions
337
462
 
@@ -346,6 +471,181 @@ for (const [table, pkColumn] of Object.entries(customIntegration)) {
346
471
  | `listTables(d1)` | Get list of tables in database | `D1Database` |
347
472
  | `clearMigrationCache()` | Clear automatic migration cache | `void` |
348
473
 
474
+ ### Error Handling
475
+
476
+ | Class | Description | Usage |
477
+ | ---------------- | ------------------------------------------- | ------------------------------------- |
478
+ | `CollegeDBError` | Custom error class for CollegeDB operations | `throw new CollegeDBError(msg, code)` |
479
+
480
+ The `CollegeDBError` class extends the native `Error` class and includes an optional error code for better error categorization:
481
+
482
+ ```typescript
483
+ try {
484
+ await run('invalid-key', 'SELECT * FROM users WHERE id = ?', ['invalid-key']);
485
+ } catch (error) {
486
+ if (error instanceof CollegeDBError) {
487
+ console.error(`CollegeDB Error (${error.code}): ${error.message}`);
488
+ }
489
+ }
490
+ ```
491
+
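For readers who want to experiment with this pattern without installing the package, the following stand-in mirrors the description above (extends `Error`, optional `code`). `SketchError`, `describeError`, and the `'NO_SHARDS'` code are hypothetical; the real `CollegeDBError` may define different codes or extra fields.

```typescript
// Hypothetical stand-in for an error class with an optional code.
class SketchError extends Error {
  readonly code?: string;

  constructor(message: string, code?: string) {
    super(message);
    this.name = 'SketchError';
    this.code = code;
    // Keeps `instanceof` reliable when compiling to older targets.
    Object.setPrototypeOf(this, SketchError.prototype);
  }
}

// Narrow an unknown caught value into a log-friendly message.
function describeError(error: unknown): string {
  if (error instanceof SketchError) {
    return `SketchError (${error.code ?? 'no code'}): ${error.message}`;
  }
  if (error instanceof Error) {
    return `Unexpected error: ${error.message}`;
  }
  return `Unknown failure: ${String(error)}`;
}

try {
  throw new SketchError('No shards configured', 'NO_SHARDS');
} catch (error) {
  console.error(describeError(error));
}
```

Checking the specific class first and falling back to plain `Error` keeps unrelated failures (for example, network errors) from being reported as library errors.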
492
+ ### ShardCoordinator (Durable Object) API
493
+
494
+ The `ShardCoordinator` is an optional Durable Object that provides centralized shard allocation and statistics management. All endpoints return JSON responses.
495
+
496
+ #### HTTP API Endpoints
497
+
498
+ | Endpoint | Method | Description | Request Body | Response |
499
+ | ----------- | ------ | ---------------------------------- | ------------------------------------------------ | -------------------------------------- |
500
+ | `/shards` | GET | List all registered shards | None | `["db-east", "db-west"]` |
501
+ | `/shards` | POST | Register a new shard | `{"shard": "db-new"}` | `{"success": true}` |
502
+ | `/shards` | DELETE | Unregister a shard | `{"shard": "db-old"}` | `{"success": true}` |
503
+ | `/stats` | GET | Get shard statistics | None | `[{"binding":"db-east","count":1542}]` |
504
+ | `/stats` | POST | Update shard statistics | `{"shard": "db-east", "count": 1600}` | `{"success": true}` |
505
+ | `/allocate` | POST | Allocate shard for primary key | `{"primaryKey": "user-123"}` | `{"shard": "db-west"}` |
506
+ | `/allocate` | POST | Allocate with specific strategy | `{"primaryKey": "user-123", "strategy": "hash"}` | `{"shard": "db-west"}` |
507
+ | `/flush` | POST | Clear all state (development only) | None | `{"success": true}` |
508
+ | `/health` | GET | Health check | None | `"OK"` |
509
+
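The two `/allocate` rows differ only in whether `strategy` is present in the body. A small helper can make that explicit. `buildAllocateCall` and its return shape are hypothetical; only the endpoint path and the JSON field names come from the table above.

```typescript
type StrategySketch = 'round-robin' | 'random' | 'hash' | 'location';

interface AllocateCall {
  url: string;
  init: {
    method: 'POST';
    headers: Record<string, string>;
    body: string;
  };
}

// Hypothetical helper: builds the arguments for a POST to /allocate.
// The `strategy` field is only included when the caller supplies one.
function buildAllocateCall(primaryKey: string, strategy?: StrategySketch): AllocateCall {
  const payload: { primaryKey: string; strategy?: StrategySketch } = { primaryKey };
  if (strategy !== undefined) {
    payload.strategy = strategy;
  }
  return {
    url: 'http://coordinator/allocate',
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload)
    }
  };
}

const defaultCall = buildAllocateCall('user-123');
const hashCall = buildAllocateCall('user-123', 'hash');
console.log(defaultCall.init.body); // {"primaryKey":"user-123"}
console.log(hashCall.init.body); // {"primaryKey":"user-123","strategy":"hash"}
```

With a Durable Object stub, the result would be passed along as `coordinator.fetch(call.url, call.init)`.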
510
+ #### Programmatic Methods
511
+
512
+ | Method | Description | Parameters | Returns |
513
+ | ----------------------------- | ----------------------------- | -------------------- | ------------------- |
514
+ | `new ShardCoordinator(state)` | Create coordinator instance | `DurableObjectState` | `ShardCoordinator` |
515
+ | `fetch(request)` | Handle HTTP requests | `Request` | `Promise<Response>` |
516
+ | `incrementShardCount(shard)` | Increment key count for shard | `string` | `Promise<void>` |
517
+ | `decrementShardCount(shard)` | Decrement key count for shard | `string` | `Promise<void>` |
518
+
519
+ #### Usage Example
520
+
521
+ ```typescript
522
+ import { ShardCoordinator } from 'collegedb';
523
+
524
+ // Export for Cloudflare Workers runtime
525
+ export { ShardCoordinator };
526
+
527
+ // Use in your worker
528
+ export default {
529
+ async fetch(request: Request, env: Env) {
530
+ const coordinatorId = env.ShardCoordinator.idFromName('default');
531
+ const coordinator = env.ShardCoordinator.get(coordinatorId);
532
+
533
+ // Allocate shard for user
534
+ const response = await coordinator.fetch('http://coordinator/allocate', {
535
+ method: 'POST',
536
+ headers: { 'Content-Type': 'application/json' },
537
+ body: JSON.stringify({ primaryKey: 'user-123', strategy: 'hash' })
538
+ });
539
+
540
+ const { shard } = (await response.json()) as { shard: string };
541
+ // Use allocated shard for database operations...
542
+ }
543
+ };
544
+ ```
545
+
546
+ ### Configuration Types
547
+
548
+ #### CollegeDBConfig
549
+
550
+ The main configuration interface supports both single strategies and mixed strategies:
551
+
552
+ ```typescript
553
+ interface CollegeDBConfig {
554
+ kv: KVNamespace;
555
+ coordinator?: DurableObjectNamespace;
556
+ shards: Record<string, D1Database>;
557
+ strategy?: ShardingStrategy | MixedShardingStrategy;
558
+ targetRegion?: D1Region;
559
+ shardLocations?: Record<string, ShardLocation>;
560
+ disableAutoMigration?: boolean;
561
+ }
562
+ ```
563
+
564
+ #### Strategy Types
565
+
566
+ ```typescript
567
+ // Single strategy for all operations
568
+ type ShardingStrategy = 'round-robin' | 'random' | 'hash' | 'location';
569
+
570
+ // Mixed strategy for different operation types
571
+ interface MixedShardingStrategy {
572
+ read: ShardingStrategy; // Strategy for SELECT operations
573
+ write: ShardingStrategy; // Strategy for INSERT/UPDATE/DELETE operations
574
+ }
575
+
576
+ // Operation types for internal routing
577
+ type OperationType = 'read' | 'write';
578
+ ```
579
+
580
+ #### Example Configurations
581
+
582
+ ```typescript
583
+ // Single strategy configuration (traditional)
584
+ const singleStrategyConfig: CollegeDBConfig = {
585
+ kv: env.KV,
586
+ strategy: 'hash', // All operations use hash strategy
587
+ shards: {
588
+ /* ... */
589
+ }
590
+ };
591
+
592
+ // Mixed strategy configuration (new feature)
593
+ const mixedStrategyConfig: CollegeDBConfig = {
594
+ kv: env.KV,
595
+ strategy: {
596
+ read: 'hash', // Fast, consistent reads
597
+ write: 'location' // Optimal data placement
598
+ },
599
+ targetRegion: 'wnam',
600
+ shardLocations: {
601
+ /* ... */
602
+ },
603
+ shards: {
604
+ /* ... */
605
+ }
606
+ };
607
+ ```
608
+
609
+ ### Types
610
+
611
+ CollegeDB exports TypeScript types for a better development experience and type safety:
612
+
613
+ | Type | Description | Example |
614
+ | ----------------------- | ------------------------------ | --------------------------------------------------- |
615
+ | `CollegeDBConfig` | Main configuration object | `{ kv, shards, strategy }` |
616
+ | `ShardingStrategy` | Single strategy options | `'hash' \| 'location' \| 'round-robin' \| 'random'` |
617
+ | `MixedShardingStrategy` | Mixed strategy configuration | `{ read: 'hash', write: 'location' }` |
618
+ | `OperationType` | Database operation types | `'read' \| 'write'` |
619
+ | `D1Region` | Cloudflare D1 regions | `'wnam' \| 'enam' \| 'weur' \| ...` |
620
+ | `ShardLocation` | Geographic shard configuration | `{ region: 'wnam', priority: 2 }` |
621
+ | `ShardStats` | Shard usage statistics | `{ binding: 'db-east', count: 1542 }` |
622
+
623
+ #### Mixed Strategy Configuration
624
+
625
+ ```typescript
626
+ import type { MixedShardingStrategy, CollegeDBConfig } from 'collegedb';
627
+
628
+ // Type-safe mixed strategy configuration
629
+ const mixedStrategy: MixedShardingStrategy = {
630
+ read: 'hash', // Fast, deterministic reads
631
+ write: 'location' // Geographically optimized writes
632
+ };
633
+
634
+ const config: CollegeDBConfig = {
635
+ kv: env.KV,
636
+ strategy: mixedStrategy, // Type-checked
637
+ targetRegion: 'wnam',
638
+ shardLocations: {
639
+ 'db-west': { region: 'wnam', priority: 2 },
640
+ 'db-east': { region: 'enam', priority: 1 }
641
+ },
642
+ shards: {
643
+ 'db-west': env.DB_WEST,
644
+ 'db-east': env.DB_EAST
645
+ }
646
+ };
647
+ ```
648
+
349
649
  ## 🏗 Architecture
350
650
 
351
651
  ```txt
@@ -369,16 +669,53 @@ for (const [table, pkColumn] of Object.entries(customIntegration)) {
369
669
 
370
670
  ### Data Flow
371
671
 
672
+ #### Without ShardCoordinator (Hash/Random/Location strategies)
673
+
372
674
  1. **Query Received**: Application sends query with primary key
373
- 2. **Shard Resolution**: CollegeDB checks KV for existing mapping or allocates new shard
374
- 3. **Query Execution**: SQL executed on appropriate D1 database
375
- 4. **Response**: Results returned to application
675
+ 2. **Shard Resolution**: CollegeDB checks KV for existing mapping or calculates shard using strategy
676
+ 3. **Direct Allocation**: For new keys, shard selected using hash/random/location algorithm
677
+ 4. **Query Execution**: SQL executed on appropriate D1 database
678
+ 5. **Response**: Results returned to application
679
+
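The steps above amount to "look up first, allocate only on a miss". A minimal sketch of that flow, using an in-memory `Map` as a stand-in for Cloudflare KV (which is asynchronous in practice). `createRouter` and `Allocator` are hypothetical names, and the toy allocator in the usage exists only to count calls.

```typescript
// Hypothetical allocator signature: any hash/random/location function fits here.
type Allocator = (primaryKey: string, shards: readonly string[]) => string;

function createRouter(shards: readonly string[], allocate: Allocator) {
  if (shards.length === 0) {
    throw new Error('At least one shard is required');
  }
  // Stand-in for the KV mapping of primary key -> shard binding.
  const mapping = new Map<string, string>();

  return {
    resolve(primaryKey: string): string {
      // Step 2: an existing mapping always wins.
      const existing = mapping.get(primaryKey);
      if (existing !== undefined) {
        return existing;
      }
      // Step 3: new keys are allocated once, then remembered.
      const shard = allocate(primaryKey, shards);
      mapping.set(primaryKey, shard);
      return shard;
    },
    knownKeys(): number {
      return mapping.size;
    }
  };
}

let allocationCalls = 0;
const demoRouter = createRouter(['db-east', 'db-west'], (key, shards) => {
  allocationCalls += 1;
  return shards[key.length % shards.length]; // toy rule for the demo only
});

const firstLookup = demoRouter.resolve('user-123');
const secondLookup = demoRouter.resolve('user-123');
console.log(firstLookup === secondLookup, allocationCalls); // true 1
```

Because the mapping is sticky, changing the allocation rule later only affects keys that have never been seen.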
680
+ #### With ShardCoordinator (Round-Robin strategy)
681
+
682
+ 1. **Query Received**: Application sends query with primary key
683
+ 2. **Shard Resolution**: CollegeDB checks KV for existing mapping
684
+ 3. **Coordinator Allocation**: For new keys, coordinator allocates shard using round-robin
685
+ 4. **State Update**: Coordinator updates round-robin index and shard statistics
686
+ 5. **Query Execution**: SQL executed on appropriate D1 database
687
+ 6. **Response**: Results returned to application
688
+
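Round-robin needs shared state (the next index), which is why a single coordinator is involved. The sketch below keeps that state in a plain class to show the allocation and state-update steps; `RoundRobinSketch` is a hypothetical name and it does not persist anything, unlike a Durable Object.

```typescript
// Hypothetical in-memory model of round-robin allocation with per-shard counts.
class RoundRobinSketch {
  private index = 0;
  private readonly counts = new Map<string, number>();
  private readonly shards: string[];

  constructor(shards: string[]) {
    if (shards.length === 0) {
      throw new Error('At least one shard is required');
    }
    this.shards = [...shards];
  }

  // Step 3: pick the shard at the current index.
  // Step 4: advance the index and bump the shard's key count.
  allocate(): string {
    const shard = this.shards[this.index];
    this.index = (this.index + 1) % this.shards.length;
    this.counts.set(shard, (this.counts.get(shard) ?? 0) + 1);
    return shard;
  }

  stats(): Array<{ binding: string; count: number }> {
    return this.shards.map((binding) => ({
      binding,
      count: this.counts.get(binding) ?? 0
    }));
  }
}

const coordinatorSketch = new RoundRobinSketch(['db-east', 'db-west', 'db-central']);
const allocationOrder = [1, 2, 3, 4].map(() => coordinatorSketch.allocate());
console.log(allocationOrder); // db-east, db-west, db-central, db-east
console.log(coordinatorSketch.stats());
```

If each Worker instance kept its own index instead, instances would restart from the first shard independently and the distribution would skew.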
689
+ #### ShardCoordinator Internal Flow
690
+
691
+ ```txt
692
+ ┌─────────────────────────────────────────────────────────────┐
693
+ │ ShardCoordinator (Durable Object) │
694
+ ├─────────────────────────────────────────────────────────────┤
695
+ │ ┌─────────────────┐ ┌─────────────────────────────────┐ │
696
+ │ │ HTTP API │ │ Persistent Storage │ │
697
+ │ │ - /allocate │ │ - knownShards: string[] │ │
698
+ │ │ - /shards │ │ - shardStats: ShardStats{} │ │
699
+ │ │ - /stats │ │ - strategy: ShardingStrategy │ │
700
+ │ │ - /health │ │ - roundRobinIndex: number │ │
701
+ │ └─────────────────┘ └─────────────────────────────────┘ │
702
+ │ │ │
703
+ │ ┌─────────────────────────────────────────────────────┐ │
704
+ │ │ Allocation Algorithms │ │
705
+ │ │ - Round-Robin: state.roundRobinIndex │ │
706
+ │ │ - Hash: consistent hash(primaryKey) │ │
707
+ │ │ - Random: Math.random() * shards.length │ │
708
+ │ │ - Location: region proximity + priority │ │
709
+ │ └─────────────────────────────────────────────────────┘ │
710
+ └─────────────────────────────────────────────────────────────┘
711
+ ```
376
712
 
377
713
  ### Shard Allocation Strategies
378
714
 
379
715
  - **Hash**: Consistent hashing for deterministic shard selection
380
716
  - **Round-Robin**: Evenly distribute new keys across shards
381
717
  - **Random**: Random shard selection for load balancing
718
+ - **Location**: Geographic proximity-based allocation for optimal latency
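As a rough illustration of the hash and random strategies, the sketch below uses a 32-bit FNV-1a hash with a plain modulo. This is an assumption made for the example: the README does not state which hash function CollegeDB uses, and modulo hashing is simpler than a true consistent-hash ring (adding a shard would remap most unmapped keys). `fnv1a`, `hashShard`, and `randomShard` are hypothetical names.

```typescript
// 32-bit FNV-1a over UTF-16 code units; chosen here only as an example
// of a cheap deterministic hash.
function fnv1a(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value unsigned
  }
  return hash;
}

// Hash strategy: the same key always lands on the same shard
// as long as the shard list does not change.
function hashShard(primaryKey: string, shards: readonly string[]): string {
  if (shards.length === 0) {
    throw new Error('At least one shard is required');
  }
  return shards[fnv1a(primaryKey) % shards.length];
}

// Random strategy: the index must be floored to be a valid array position.
// The random source is injectable so the function can be tested.
function randomShard(shards: readonly string[], rng: () => number = Math.random): string {
  if (shards.length === 0) {
    throw new Error('At least one shard is required');
  }
  return shards[Math.floor(rng() * shards.length)];
}

const strategyShards = ['db-east', 'db-west', 'db-central'];
console.log(hashShard('user-123', strategyShards) === hashShard('user-123', strategyShards)); // true
console.log(strategyShards.includes(randomShard(strategyShards))); // true
```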
382
719
 
383
720
  ## 🌐 Cloudflare Setup
384
721
 
@@ -420,6 +757,114 @@ name = "ShardCoordinator"
420
757
  class_name = "ShardCoordinator"
421
758
  ```
422
759
 
760
+ #### Complete wrangler.toml with ShardCoordinator
761
+
762
+ ```toml
763
+ name = "collegedb-app"
764
+ main = "src/index.ts"
765
+ compatibility_date = "2024-08-10"
766
+
767
+ # D1 Database bindings
768
+ [[d1_databases]]
769
+ binding = "db-east"
770
+ database_name = "collegedb-east"
771
+ database_id = "your-east-database-id"
772
+
773
+ [[d1_databases]]
774
+ binding = "db-west"
775
+ database_name = "collegedb-west"
776
+ database_id = "your-west-database-id"
777
+
778
+ [[d1_databases]]
779
+ binding = "db-central"
780
+ database_name = "collegedb-central"
781
+ database_id = "your-central-database-id"
782
+
783
+ # KV namespace for shard mappings
784
+ [[kv_namespaces]]
785
+ binding = "KV"
786
+ id = "your-kv-namespace-id"
787
+ preview_id = "your-kv-preview-id" # For local development
788
+
789
+ # Durable Object for shard coordination
790
+ [[durable_objects.bindings]]
791
+ name = "ShardCoordinator"
792
+ class_name = "ShardCoordinator"
793
+
794
+ # Environment-specific configurations
795
+ [env.production]
796
+ [[env.production.d1_databases]]
797
+ binding = "db-east"
798
+ database_name = "collegedb-prod-east"
799
+ database_id = "your-prod-east-id"
800
+
801
+ [[env.production.d1_databases]]
802
+ binding = "db-west"
803
+ database_name = "collegedb-prod-west"
804
+ database_id = "your-prod-west-id"
805
+
806
+ [[env.production.kv_namespaces]]
807
+ binding = "KV"
808
+ id = "your-prod-kv-namespace-id"
809
+
810
+ [[env.production.durable_objects.bindings]]
811
+ name = "ShardCoordinator"
812
+ class_name = "ShardCoordinator"
813
+ ```
814
+
815
+ ### 3.1. Worker Script Setup (Required for ShardCoordinator)
816
+
817
+ Create your main worker file and export `ShardCoordinator` from it:
818
+
819
+ ```typescript
820
+ // src/index.ts
821
+ import { collegedb, ShardCoordinator, first, run } from 'collegedb';
822
+
823
+ // IMPORTANT: Export ShardCoordinator for Cloudflare Workers runtime
824
+ export { ShardCoordinator };
825
+
826
+ interface Env {
827
+ KV: KVNamespace;
828
+ ShardCoordinator: DurableObjectNamespace;
829
+ 'db-east': D1Database;
830
+ 'db-west': D1Database;
831
+ 'db-central': D1Database;
832
+ }
833
+
834
+ export default {
835
+ async fetch(request: Request, env: Env): Promise<Response> {
836
+ return await collegedb(
837
+ {
838
+ kv: env.KV,
839
+ coordinator: env.ShardCoordinator, // Optional: only needed for round-robin
840
+ strategy: 'hash', // or 'round-robin', 'random', 'location'
841
+ shards: {
842
+ 'db-east': env['db-east'],
843
+ 'db-west': env['db-west'],
844
+ 'db-central': env['db-central']
845
+ }
846
+ },
847
+ async () => {
848
+ // Your application logic here
849
+ const url = new URL(request.url);
850
+
851
+ if (url.pathname === '/user') {
852
+ const userId = url.searchParams.get('id');
853
+ if (!userId) {
854
+ return new Response('Missing user ID', { status: 400 });
855
+ }
856
+
857
+ const user = await first(userId, 'SELECT * FROM users WHERE id = ?', [userId]);
858
+ return Response.json(user);
859
+ }
860
+
861
+ return new Response('CollegeDB API', { status: 200 });
862
+ }
863
+ );
864
+ }
865
+ };
866
+ ```
867
+
423
868
  ### 4. Deploy
424
869
 
425
870
  ```bash
@@ -434,6 +879,8 @@ wrangler deploy --env production
434
879
 
435
880
  ### Shard Statistics
436
881
 
882
+ #### Using CollegeDB Functions
883
+
437
884
  ```typescript
438
885
  import { getShardStats, listKnownShards } from 'collegedb';
439
886
 
@@ -450,6 +897,96 @@ const shards = await listKnownShards();
450
897
  console.log(shards); // ['db-east', 'db-west']
451
898
  ```
452
899
 
900
+ #### Using ShardCoordinator Directly
901
+
902
+ ```typescript
903
+ // Get coordinator instance
904
+ const coordinatorId = env.ShardCoordinator.idFromName('default');
905
+ const coordinator = env.ShardCoordinator.get(coordinatorId);
906
+
907
+ // Get real-time shard statistics
908
+ const statsResponse = await coordinator.fetch('http://coordinator/stats');
909
+ const detailedStats = await statsResponse.json();
910
+ console.log(detailedStats);
911
+ /* Returns:
912
+ [
913
+ {
914
+ "binding": "db-east",
915
+ "count": 1542,
916
+ "lastUpdated": 1672531200000
917
+ },
918
+ {
919
+ "binding": "db-west",
920
+ "count": 1458,
921
+ "lastUpdated": 1672531205000
922
+ }
923
+ ]
924
+ */
925
+
926
+ // List registered shards
927
+ const shardsResponse = await coordinator.fetch('http://coordinator/shards');
928
+ const allShards = await shardsResponse.json();
929
+ console.log(allShards); // ['db-east', 'db-west', 'db-central']
930
+ ```
931
+
932
+ #### Advanced Monitoring Dashboard
933
+
934
+ ```typescript
935
+ async function createMonitoringDashboard(env: Env) {
936
+ const coordinatorId = env.ShardCoordinator.idFromName('default');
937
+ const coordinator = env.ShardCoordinator.get(coordinatorId);
938
+
939
+ // Get comprehensive metrics
940
+ const [shardsResponse, statsResponse, healthResponse] = await Promise.all([
941
+ coordinator.fetch('http://coordinator/shards'),
942
+ coordinator.fetch('http://coordinator/stats'),
943
+ coordinator.fetch('http://coordinator/health')
944
+ ]);
945
+
946
+ const shards = await shardsResponse.json();
947
+ const stats = await statsResponse.json();
948
+ const isHealthy = healthResponse.ok;
949
+
950
+ // Calculate distribution metrics
951
+ const totalKeys = stats.reduce((sum: number, shard: any) => sum + shard.count, 0);
952
+ const avgKeysPerShard = stats.length > 0 ? totalKeys / stats.length : 0; // Avoid NaN when no stats exist
953
+ const maxKeys = Math.max(...stats.map((s: any) => s.count));
954
+ const minKeys = Math.min(...stats.map((s: any) => s.count));
955
+ const distributionRatio = maxKeys / (minKeys || 1);
956
+
957
+ // Check for stale statistics (>5 minutes)
958
+ const now = Date.now();
959
+ const staleThreshold = 5 * 60 * 1000; // 5 minutes
960
+ const staleShards = stats.filter((shard: any) => now - shard.lastUpdated > staleThreshold);
961
+
962
+ return {
963
+ healthy: isHealthy,
964
+ totalShards: shards.length,
965
+ totalKeys,
966
+ avgKeysPerShard: Math.round(avgKeysPerShard),
967
+ distributionRatio: Math.round(distributionRatio * 100) / 100,
968
+ isBalanced: distributionRatio < 1.5, // Less than 50% difference
969
+ staleShards: staleShards.length,
970
+ shardDetails: stats.map((shard: any) => ({
971
+ ...shard,
972
+ loadPercentage: totalKeys > 0 ? Math.round((shard.count / totalKeys) * 100) : 0, // Avoid NaN when no keys exist
973
+ isStale: now - shard.lastUpdated > staleThreshold
974
+ }))
975
+ };
976
+ }
977
+
978
+ // Usage in monitoring endpoint
979
+ export default {
980
+ async fetch(request: Request, env: Env) {
981
+ if (new URL(request.url).pathname === '/monitor') {
982
+ const dashboard = await createMonitoringDashboard(env);
983
+ return Response.json(dashboard);
984
+ }
985
+ // ... rest of your app
986
+ }
987
+ };
988
+ ```
989
+
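The distribution arithmetic in the dashboard can be isolated into a pure function, which makes the edge cases (no shards, no keys) easy to check. This is a sketch under the README's `{ binding, count }` stats shape; `summarizeDistribution` and the result interface are hypothetical names, and the 1.5 threshold is copied from the example above.

```typescript
interface ShardStatSketch {
  binding: string;
  count: number;
}

interface DistributionSummary {
  totalKeys: number;
  avgKeysPerShard: number;
  distributionRatio: number; // max count divided by min count (min treated as 1 when 0)
  isBalanced: boolean;
  loadPercentages: Record<string, number>;
}

function summarizeDistribution(stats: ShardStatSketch[]): DistributionSummary {
  if (stats.length === 0) {
    // Nothing to compare: report a neutral summary instead of NaN/Infinity.
    return { totalKeys: 0, avgKeysPerShard: 0, distributionRatio: 1, isBalanced: true, loadPercentages: {} };
  }
  const counts = stats.map((s) => s.count);
  const totalKeys = counts.reduce((sum, count) => sum + count, 0);
  const maxKeys = Math.max(...counts);
  const minKeys = Math.min(...counts);
  const ratio = maxKeys / (minKeys || 1);

  const loadPercentages: Record<string, number> = {};
  for (const s of stats) {
    loadPercentages[s.binding] = totalKeys > 0 ? Math.round((s.count / totalKeys) * 100) : 0;
  }

  return {
    totalKeys,
    avgKeysPerShard: Math.round(totalKeys / stats.length),
    distributionRatio: Math.round(ratio * 100) / 100,
    isBalanced: ratio < 1.5,
    loadPercentages
  };
}

const summarySketch = summarizeDistribution([
  { binding: 'db-east', count: 1542 },
  { binding: 'db-west', count: 1458 }
]);
console.log(summarySketch.totalKeys, summarySketch.distributionRatio); // 3000 1.06
```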
453
990
  ### Shard Rebalancing
454
991
 
455
992
  ```typescript
@@ -467,8 +1004,500 @@ Monitor your CollegeDB deployment by tracking:
467
1004
  - **Query latency per shard**
468
1005
  - **Error rates and failed queries**
469
1006
  - **KV operation metrics**
1007
+ - **ShardCoordinator health and availability**
1008
+
1009
+ #### Automated Health Checks
1010
+
1011
+ ```typescript
1012
+ async function performHealthChecks(env: Env): Promise<HealthReport> {
1013
+ const results: HealthReport = {
1014
+ overall: 'healthy',
1015
+ timestamp: new Date().toISOString(),
1016
+ checks: {}
1017
+ };
1018
+
1019
+ // 1. Test KV availability
1020
+ try {
1021
+ await env.KV.put('health-check', 'ok', { expirationTtl: 60 });
1022
+ const kvTest = await env.KV.get('health-check');
1023
+ results.checks.kv = kvTest === 'ok' ? 'healthy' : 'degraded';
1024
+ } catch (error) {
1025
+ results.checks.kv = 'unhealthy';
1026
+ results.overall = 'unhealthy';
1027
+ }
1028
+
1029
+ // 2. Test ShardCoordinator availability
1030
+ if (env.ShardCoordinator) {
1031
+ try {
1032
+ const coordinatorId = env.ShardCoordinator.idFromName('default');
1033
+ const coordinator = env.ShardCoordinator.get(coordinatorId);
1034
+ const healthResponse = await coordinator.fetch('http://coordinator/health');
1035
+ results.checks.coordinator = healthResponse.ok ? 'healthy' : 'unhealthy';
1036
+
1037
+ if (!healthResponse.ok) {
1038
+ if (results.overall === 'healthy') results.overall = 'degraded'; // Do not mask an earlier 'unhealthy'
1039
+ }
1040
+ } catch (error) {
1041
+ results.checks.coordinator = 'unhealthy';
1042
+ if (results.overall === 'healthy') results.overall = 'degraded'; // Can fall back to hash allocation
1043
+ }
1044
+ }
1045
+
1046
+ // 3. Test each D1 shard
1047
+ const shardTests = Object.entries(env)
1048
+ .filter(([key]) => key.startsWith('db-'))
1049
+ .map(async ([shardName, db]: [string, any]) => {
1050
+ try {
1051
+ // Simple query to test connectivity
1052
+ await db.prepare('SELECT 1 as test').first();
1053
+ results.checks[shardName] = 'healthy';
1054
+ } catch (error) {
1055
+ results.checks[shardName] = 'unhealthy';
1056
+ results.overall = 'unhealthy';
1057
+ }
1058
+ });
1059
+
1060
+ await Promise.all(shardTests);
1061
+
1062
+ // 4. Check shard distribution balance
1063
+ if (results.checks.coordinator === 'healthy') {
1064
+ try {
1065
+ const coordinatorId = env.ShardCoordinator.idFromName('default');
1066
+ const coordinator = env.ShardCoordinator.get(coordinatorId);
1067
+ const statsResponse = await coordinator.fetch('http://coordinator/stats');
1068
+ const stats = await statsResponse.json();
1069
+
1070
+ const totalKeys = stats.reduce((sum: number, shard: any) => sum + shard.count, 0);
1071
+ if (totalKeys > 0) {
1072
+ const avgKeys = totalKeys / stats.length;
1073
+ const maxKeys = Math.max(...stats.map((s: any) => s.count));
1074
+ const distributionRatio = maxKeys / avgKeys;
1075
+
1076
+ results.checks.distribution = distributionRatio < 2 ? 'healthy' : 'degraded';
1077
+ results.distributionRatio = distributionRatio;
1078
+
1079
+ if (distributionRatio >= 3 && results.overall === 'healthy') {
1080
+ results.overall = 'degraded';
1081
+ }
1082
+ }
1083
+ } catch (error) {
1084
+ results.checks.distribution = 'unknown';
1085
+ }
1086
+ }
1087
+
1088
+ return results;
1089
+ }
1090
+
1091
+ interface HealthReport {
1092
+ overall: 'healthy' | 'degraded' | 'unhealthy';
1093
+ timestamp: string;
1094
+ checks: Record<string, 'healthy' | 'degraded' | 'unhealthy' | 'unknown'>;
1095
+ distributionRatio?: number;
1096
+ }
1097
+
1098
+ // Health endpoint example
1099
+ export default {
1100
+ async fetch(request: Request, env: Env) {
1101
+ if (new URL(request.url).pathname === '/health') {
1102
+ const health = await performHealthChecks(env);
1103
+ const statusCode = health.overall === 'unhealthy' ? 503 : 200; // degraded shards still serve traffic; the JSON body carries the detail
1104
+ return Response.json(health, { status: statusCode });
1105
+ }
1106
+ // ... rest of your app
1107
+ }
1108
+ };
1109
+ ```
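The distribution check in the health report reduces to one ratio: the key count of the fullest shard divided by the average key count. Here is a minimal standalone sketch of that calculation, assuming the `/stats` payload shape documented in the ShardCoordinator section (`binding`, `count`, `lastUpdated`). The `ShardStat` type and `distributionRatio` helper are illustrative names, not part of the CollegeDB API.

```typescript
// Assumed shape of one entry returned by the coordinator's /stats endpoint.
interface ShardStat {
	binding: string;
	count: number;
	lastUpdated: number;
}

// Ratio of the fullest shard to the average shard; 1 means perfectly even.
// Returns undefined when there are no shards or no keys, where the ratio is meaningless.
function distributionRatio(stats: ShardStat[]): number | undefined {
	const totalKeys = stats.reduce((sum, shard) => sum + shard.count, 0);
	if (stats.length === 0 || totalKeys === 0) return undefined;
	const avgKeys = totalKeys / stats.length;
	const maxKeys = Math.max(...stats.map((shard) => shard.count));
	return maxKeys / avgKeys;
}

const sampleStats: ShardStat[] = [
	{ binding: 'db-east', count: 1542, lastUpdated: 1672531200000 },
	{ binding: 'db-west', count: 1458, lastUpdated: 1672531205000 }
];

// Average is 1500 keys, so the ratio is 1542 / 1500 = 1.028 (well under the "degraded" threshold of 2).
const ratio = distributionRatio(sampleStats);
```

Keeping the calculation in a pure function makes the thresholds easy to unit test without a live coordinator.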
1110
+
1111
+ #### Alerting and Monitoring Integration
1112
+
1113
+ ```typescript
1114
+ // Integration with external monitoring services
1115
+ async function sendAlert(severity: 'warning' | 'critical', message: string, env: Env) {
1116
+ // Example: Slack webhook
1117
+ if (env.SLACK_WEBHOOK_URL) {
1118
+ await fetch(env.SLACK_WEBHOOK_URL, {
1119
+ method: 'POST',
1120
+ headers: { 'Content-Type': 'application/json' },
1121
+ body: JSON.stringify({
1122
+ text: `🚨 CollegeDB ${severity.toUpperCase()}: ${message}`,
1123
+ username: 'CollegeDB Monitor'
1124
+ })
1125
+ });
1126
+ }
1127
+
1128
+ // Example: Custom webhook
1129
+ if (env.MONITORING_WEBHOOK_URL) {
1130
+ await fetch(env.MONITORING_WEBHOOK_URL, {
1131
+ method: 'POST',
1132
+ headers: { 'Content-Type': 'application/json' },
1133
+ body: JSON.stringify({
1134
+ service: 'collegedb',
1135
+ severity,
1136
+ message,
1137
+ timestamp: new Date().toISOString()
1138
+ })
1139
+ });
1140
+ }
1141
+ }
1142
+
1143
+ // Scheduled monitoring (using Cron Triggers)
1144
+ export default {
1145
+ async scheduled(event: ScheduledEvent, env: Env, ctx: ExecutionContext): Promise<void> {
1146
+ const health = await performHealthChecks(env);
1147
+
1148
+ if (health.overall === 'unhealthy') {
1149
+ await sendAlert('critical', `System unhealthy: ${JSON.stringify(health.checks)}`, env);
1150
+ } else if (health.overall === 'degraded') {
1151
+ await sendAlert('warning', `System degraded: ${JSON.stringify(health.checks)}`, env);
1152
+ }
1153
+
1154
+ // Check for severe shard imbalance
1155
+ if (health.distributionRatio && health.distributionRatio > 5) {
1156
+ await sendAlert('warning', `Severe shard imbalance detected: largest shard holds ${health.distributionRatio.toFixed(1)}x the average key count`, env);
1157
+ }
1158
+ }
1159
+ };
1160
+ ```
1161
+
1162
+ ## ⚙️ Performance Analysis
1163
+
1164
+ ### Scaling Performance Comparison
1165
+
1166
+ CollegeDB improves throughput by spreading load across multiple D1 databases. The tables below are rough mathematical estimates, not measured benchmarks, comparing a single D1 database with CollegeDB at different shard counts:
1167
+
1168
+ #### Query Performance
1169
+
1170
+ _SELECT, VALUES, TABLE, PRAGMA, ..._
1171
+
1172
+ | Configuration | Query Latency\* | Concurrent Queries | Throughput Gain |
1173
+ | ----------------------- | --------------- | ----------------------- | --------------- |
1174
+ | Single D1 | ~50-80ms | Limited by D1 limits | 1x (baseline) |
1175
+ | CollegeDB (10 shards) | ~55-85ms | 10x parallel capacity | ~8-9x |
1176
+ | CollegeDB (100 shards) | ~60-90ms | 100x parallel capacity | ~75-80x |
1177
+ | CollegeDB (1000 shards) | ~65-95ms | 1000x parallel capacity | ~650-700x |
1178
+
1179
+ \*Includes KV lookup overhead (~5-15ms)
1180
+
1181
+ #### Write Performance
1182
+
1183
+ _INSERT, UPDATE, DELETE, ..._
1184
+
1185
+ | Configuration | Write Latency\* | Concurrent Writes | Throughput Gain |
1186
+ | ----------------------- | --------------- | ------------------ | --------------- |
1187
+ | Single D1 | ~80-120ms | ~50 writes/sec | 1x (baseline) |
1188
+ | CollegeDB (10 shards) | ~90-135ms | ~450 writes/sec | ~9x |
1189
+ | CollegeDB (100 shards) | ~95-145ms | ~4,200 writes/sec | ~84x |
1190
+ | CollegeDB (1000 shards) | ~105-160ms | ~35,000 writes/sec | ~700x |
1191
+
1192
+ \*Includes KV mapping creation/update overhead (~10-25ms)
1193
+
1194
+ ### Strategy-Specific Performance
1195
+
1196
+ #### Hash Strategy
1197
+
1198
+ - **Best for**: Consistent performance, even data distribution
1199
+ - **Latency**: Lowest overhead (no coordinator calls)
1200
+ - **Throughput**: Optimal for high-volume scenarios
1201
+
1202
+ | Shards | Avg Latency | Distribution Quality | Coordinator Dependency |
1203
+ | ------ | ----------- | -------------------- | ---------------------- |
1204
+ | 10 | +5ms | Excellent | None |
1205
+ | 100 | +5ms | Excellent | None |
1206
+ | 1000 | +5ms | Excellent | None |
470
1207
 
471
- ## 🔧 Advanced Configuration
1208
+ #### Round-Robin Strategy
1209
+
1210
+ - **Best for**: Guaranteed even distribution
1211
+ - **Latency**: Requires coordinator communication
1212
+ - **Throughput**: Good, limited by coordinator
1213
+
1214
+ | Shards | Avg Latency | Distribution Quality | Coordinator Dependency |
1215
+ | ------ | ----------- | -------------------- | ---------------------- |
1216
+ | 10 | +15ms | Perfect | High |
1217
+ | 100 | +20ms | Perfect | High |
1218
+ | 1000 | +25ms | Perfect | High |
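The coordinator dependency follows from how round-robin works: the allocator has to remember which shard it handed out last, and that cursor must live in one place to stay consistent across Worker instances. The sketch below shows the core idea with an in-memory cursor. `RoundRobinAllocator` is a hypothetical name for illustration, not CollegeDB's implementation, which keeps this state in the ShardCoordinator Durable Object.

```typescript
// Illustrative in-memory round-robin allocator (an assumption, not the library's code).
class RoundRobinAllocator {
	private readonly shards: string[];
	private next = 0;

	constructor(shards: string[]) {
		if (shards.length === 0) throw new Error('at least one shard is required');
		this.shards = shards;
	}

	// Returns the next shard in order and advances the cursor, wrapping at the end.
	allocate(): string {
		const shard = this.shards[this.next];
		this.next = (this.next + 1) % this.shards.length;
		return shard;
	}
}

const rr = new RoundRobinAllocator(['db-east', 'db-west', 'db-central']);
// Four allocations cycle through the list and wrap: db-east, db-west, db-central, db-east
const firstFour = [rr.allocate(), rr.allocate(), rr.allocate(), rr.allocate()];
```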
1219
+
1220
+ #### Random Strategy
1221
+
1222
+ - **Best for**: Simple setup, good distribution over time
1223
+ - **Latency**: Low overhead
1224
+ - **Throughput**: Good for medium-scale deployments
1225
+
1226
+ | Shards | Avg Latency | Distribution Quality | Coordinator Dependency |
1227
+ | ------ | ----------- | -------------------- | ---------------------- |
1228
+ | 10 | +3ms | Good | None |
1229
+ | 100 | +3ms | Good | None |
1230
+ | 1000 | +3ms | Fair | None |
1231
+
1232
+ #### Location Strategy
1233
+
1234
+ - **Best for**: Geographic optimization, reduced latency
1235
+ - **Latency**: Optimized by region proximity
1236
+ - **Throughput**: Regional performance benefits
1237
+
1238
+ | Shards | Avg Latency | Geographic Benefit | Coordinator Dependency |
1239
+ | ------ | ----------- | -------------------- | ---------------------- |
1240
+ | 10 | +8ms | Excellent (-20-40ms) | Optional |
1241
+ | 100 | +10ms | Excellent (-20-40ms) | Optional |
1242
+ | 1000 | +12ms | Excellent (-20-40ms) | Optional |
1243
+
1244
+ #### Mixed Strategy
1245
+
1246
+ - **Best for**: Optimizing both read and write performance independently
1247
+ - **Latency**: Each operation inherits the latency profile of its own strategy
1248
+ - **Throughput**: Optimal for workloads with different read/write patterns
1249
+
1250
+ **High-Performance Mix**: `{ read: 'hash', write: 'location' }`
1251
+
1252
+ | Operation | Strategy | Latency Impact | Throughput Benefit | Geographic Benefit |
1253
+ | --------- | -------- | ------------------------ | ------------------ | -------------------- |
1254
+ | Reads | Hash | +5ms | Excellent | None |
1255
+ | Writes | Location | +8ms (-20-40ms regional) | Good | Excellent (-20-40ms) |
1256
+
1257
+ **Balanced Mix**: `{ read: 'location', write: 'hash' }`
1258
+
1259
+ | Operation | Strategy | Latency Impact | Throughput Benefit | Geographic Benefit |
1260
+ | --------- | -------- | ------------------------ | ------------------ | -------------------- |
1261
+ | Reads | Location | +8ms (-20-40ms regional) | Good | Excellent (-20-40ms) |
1262
+ | Writes | Hash | +5ms | Excellent | None |
1263
+
1264
+ **Enterprise Mix**: `{ read: 'hash', write: 'round-robin' }`
1265
+
1266
+ | Operation | Strategy | Latency Impact | Distribution Quality | Coordinator Dependency |
1267
+ | --------- | ----------- | -------------- | -------------------- | ---------------------- |
1268
+ | Reads | Hash | +5ms | Excellent | None |
1269
+ | Writes | Round-Robin | +15-25ms | Perfect | High |
1270
+
1271
+ ##### By Shard Count
1272
+
1273
+ **Hash + Location Mix** (`{ read: 'hash', write: 'location' }`)
1274
+
1275
+ | Shards | Read Latency | Write Latency | Combined Benefit | Best Use Case |
1276
+ | ------ | ------------ | ---------------------- | --------------------- | ---------------- |
1277
+ | 10 | +5ms | +8ms (-30ms regional) | ~22ms net improvement | Global apps |
1278
+ | 100 | +5ms | +10ms (-30ms regional) | ~20ms net improvement | Enterprise scale |
1279
+ | 1000 | +5ms | +12ms (-30ms regional) | ~18ms net improvement | Massive scale |
1280
+
1281
+ **Location + Hash Mix** (`{ read: 'location', write: 'hash' }`)
1282
+
1283
+ | Shards | Read Latency | Write Latency | Combined Benefit | Best Use Case |
1284
+ | ------ | ---------------------- | ------------- | --------------------- | --------------------- |
1285
+ | 10 | +8ms (-30ms regional) | +5ms | ~17ms net improvement | Read-heavy regional |
1286
+ | 100 | +10ms (-30ms regional) | +5ms | ~15ms net improvement | Analytics workloads |
1287
+ | 1000 | +12ms (-30ms regional) | +5ms | ~13ms net improvement | Large-scale reporting |
1288
+
1289
+ **Hash + Round-Robin Mix** (`{ read: 'hash', write: 'round-robin' }`)
1290
+
1291
+ | Shards | Read Latency | Write Latency | Distribution Quality | Best Use Case |
1292
+ | ------ | ------------ | ------------- | ------------------------------- | ------------------ |
1293
+ | 10 | +5ms | +15ms | Perfect writes, Excellent reads | Balanced workloads |
1294
+ | 100 | +5ms | +20ms | Perfect writes, Excellent reads | Large databases |
1295
+ | 1000 | +5ms | +25ms | Perfect writes, Excellent reads | Enterprise scale |
1296
+
1297
+ ### Mixed Strategy Scenarios & Recommendations
1298
+
1299
+ #### Large Database Scenarios (>10M records)
1300
+
1301
+ **Scenario**: Massive datasets requiring optimal query performance and balanced growth
1302
+
1303
+ ```typescript
1304
+ // Recommended: Hash reads + Round-Robin writes
1305
+ {
1306
+ strategy: { read: 'hash', write: 'round-robin' },
1307
+ coordinator: env.ShardCoordinator // Required for round-robin
1308
+ }
1309
+ ```
1310
+
1311
+ **Performance Profile**:
1312
+
1313
+ - Read latency: +5ms (fastest possible routing)
1314
+ - Write latency: +15-25ms (coordinator overhead)
1315
+ - Data distribution: Perfect balance over time
1316
+ - **Ideal for**: Analytics platforms, data warehouses, reporting systems
1317
+
1318
+ #### Vast Geographic Spread Scenarios
1319
+
1320
+ **Scenario**: Global applications with users across multiple continents
1321
+
1322
+ ```typescript
1323
+ // Recommended: Hash reads + Location writes
1324
+ {
1325
+ strategy: { read: 'hash', write: 'location' },
1326
+ targetRegion: 'auto', // Or specific region like 'wnam'
1327
+ shardLocations: {
1328
+ 'db-americas': { region: 'wnam', priority: 2 },
1329
+ 'db-europe': { region: 'weur', priority: 2 },
1330
+ 'db-asia': { region: 'apac', priority: 2 }
1331
+ }
1332
+ }
1333
+ ```
1334
+
1335
+ **Performance Profile**:
1336
+
1337
+ - Read latency: +5ms (consistent global performance)
1338
+ - Write latency: +8ms baseline (-20-40ms regional benefit)
1339
+ - **Net improvement**: 15-35ms for geographically distributed users
1340
+ - **Ideal for**: Social networks, e-commerce, content platforms
1341
+
1342
+ #### High-Volume Write Scenarios
1343
+
1344
+ **Scenario**: Applications with heavy write loads (IoT, logging, real-time data)
1345
+
1346
+ ```typescript
1347
+ // Recommended: Location reads + Hash writes
1348
+ {
1349
+ strategy: { read: 'location', write: 'hash' },
1350
+ targetRegion: 'wnam',
1351
+ shardLocations: {
1352
+ 'db-west': { region: 'wnam', priority: 3 },
1353
+ 'db-central': { region: 'enam', priority: 2 },
1354
+ 'db-east': { region: 'enam', priority: 1 }
1355
+ }
1356
+ }
1357
+ ```
1358
+
1359
+ **Performance Profile**:
1360
+
1361
+ - Read latency: +8ms baseline (-20-40ms regional benefit)
1362
+ - Write latency: +5ms (fastest write routing)
1363
+ - Write throughput: Maximum possible for hash strategy
1364
+ - **Ideal for**: IoT data collection, real-time analytics, logging systems
1365
+
1366
+ #### Multi-Tenant SaaS Scenarios
1367
+
1368
+ **Scenario**: SaaS applications with predictable performance requirements
1369
+
1370
+ ```typescript
1371
+ // Recommended: Hash reads + Hash writes (consistent performance)
1372
+ {
1373
+ strategy: { read: 'hash', write: 'hash' }
1374
+ // No coordinator needed, predictable routing for both operations
1375
+ }
1376
+ ```
1377
+
1378
+ **Performance Profile**:
1379
+
1380
+ - Read latency: +5ms (most predictable)
1381
+ - Write latency: +5ms (most predictable)
1382
+ - Tenant isolation: Natural sharding by tenant ID
1383
+ - **Ideal for**: B2B SaaS, multi-tenant platforms, predictable workloads
1384
+
1385
+ #### Read-Heavy Analytics Scenarios
1386
+
1387
+ **Scenario**: Analytics and reporting workloads with occasional writes
1388
+
1389
+ ```typescript
1390
+ // Recommended: Random reads + Location writes
1391
+ {
1392
+ strategy: { read: 'random', write: 'location' },
1393
+ targetRegion: 'wnam',
1394
+ shardLocations: { /* geographic configuration */ }
1395
+ }
1396
+ ```
1397
+
1398
+ **Performance Profile**:
1399
+
1400
+ - Read latency: +3ms (lowest overhead, good load balancing)
1401
+ - Write latency: +8ms baseline (-20-40ms regional benefit)
1402
+ - Read load distribution: Excellent across all shards
1403
+ - **Ideal for**: Business intelligence, data analysis, reporting dashboards
1404
+
1405
+ ### Mixed Strategy Performance Comparison
1406
+
1407
+ #### By Database Size
1408
+
1409
+ | Database Size | Best Mixed Strategy | Read Performance | Write Performance | Overall Benefit |
1410
+ | ----------------------------- | ------------------------------------------ | --------------------- | -------------------- | --------------------- |
1411
+ | **Small (1K-100K records)** | `{read: 'hash', write: 'hash'}` | Excellent | Excellent | Consistent, simple |
1412
+ | **Medium (100K-1M records)** | `{read: 'hash', write: 'location'}` | Excellent | Good + Regional | 15-35ms improvement |
1413
+ | **Large (1M-10M records)** | `{read: 'hash', write: 'round-robin'}` | Excellent | Perfect distribution | Optimal scaling |
1414
+ | **Very Large (10M+ records)** | `{read: 'location', write: 'round-robin'}` | Regional optimization | Perfect distribution | Best for global scale |
1415
+
1416
+ #### By Geographic Distribution
1417
+
1418
+ | Geographic Spread | Best Mixed Strategy | Latency Impact | Use Case |
1419
+ | ----------------- | --------------------------------------- | -------------------------------------- | ------------------------------- |
1420
+ | **Single Region** | `{read: 'hash', write: 'hash'}` | +5ms overhead on both operations | Simple, fast |
1421
+ | **Multi-Region** | `{read: 'hash', write: 'location'}` | 15-35ms net improvement | Global apps |
1422
+ | **Global** | `{read: 'location', write: 'location'}` | 20-40ms improvement on both operations | Maximum geographic optimization |
1423
+
1424
+ #### By Workload Pattern
1425
+
1426
+ | Workload Type | Read/Write Ratio | Best Mixed Strategy | Primary Benefit |
1427
+ | --------------- | ---------------- | ------------------------------------------ | ------------------------------- |
1428
+ | **Read-Heavy** | 90% reads | `{read: 'random', write: 'location'}` | Load-balanced queries |
1429
+ | **Write-Heavy** | 70% writes | `{read: 'location', write: 'hash'}` | Fast write processing |
1430
+ | **Balanced** | 50/50 | `{read: 'hash', write: 'hash'}` | Consistent performance |
1431
+ | **Analytics** | 95% reads | `{read: 'location', write: 'round-robin'}` | Regional + perfect distribution |
1432
+
1433
+ ### Real-World Scaling Benefits
1434
+
1435
+ #### Database Size Limits
1436
+
1437
+ - **Single D1**: Limited to D1's database size constraints
1438
+ - **CollegeDB**: Total capacity grows with shard count, since each shard is a separate D1 database with its own size limit
1439
+ - **Data per shard**: Scales inversely with shard count (1000 shards = 1/1000 data per shard)
1440
+
1441
+ #### Geographic Distribution
1442
+
1443
+ ```typescript
1444
+ // Location-aware sharding reduces latency by 20-40ms
1445
+ initialize({
1446
+ kv: env.KV,
1447
+ strategy: 'location',
1448
+ targetRegion: 'wnam', // Western North America
1449
+ shardLocations: {
1450
+ 'db-west': { region: 'wnam', priority: 2 }, // Preferred
1451
+ 'db-east': { region: 'enam', priority: 1 }, // Secondary
1452
+ 'db-europe': { region: 'weur', priority: 0.5 } // Fallback
1453
+ },
1454
+ shards: { ... }
1455
+ });
1456
+ ```
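To make the `region` and `priority` fields concrete, here is one plausible selection rule: prefer shards in the target region, then pick the highest priority among the candidates. This is a sketch under stated assumptions, not CollegeDB's actual algorithm. `ShardLocation` and `pickShardForRegion` are hypothetical names, and "higher priority wins" is inferred from the comments in the example above.

```typescript
// Assumed shape of one shardLocations entry.
interface ShardLocation {
	region: string;
	priority: number;
}

// Picks a shard for a region: same-region shards first, then highest priority.
// Falls back to the highest-priority shard overall when no shard is in the region.
function pickShardForRegion(targetRegion: string, shardLocations: Record<string, ShardLocation>): string | undefined {
	const names = Object.keys(shardLocations);
	const sameRegion = names.filter((name) => shardLocations[name].region === targetRegion);
	const candidates = sameRegion.length > 0 ? sameRegion : names;

	let best: string | undefined;
	for (const name of candidates) {
		if (best === undefined || shardLocations[name].priority > shardLocations[best].priority) {
			best = name;
		}
	}
	return best;
}

const exampleLocations: Record<string, ShardLocation> = {
	'db-west': { region: 'wnam', priority: 2 },
	'db-east': { region: 'enam', priority: 1 },
	'db-europe': { region: 'weur', priority: 0.5 }
};

const forWest = pickShardForRegion('wnam', exampleLocations); // only wnam shard: 'db-west'
const forAsia = pickShardForRegion('apac', exampleLocations); // no apac shard, so highest priority overall: 'db-west'
```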
1457
+
1458
+ #### Fault Tolerance
1459
+
1460
+ - **Single D1**: Single point of failure
1461
+ - **CollegeDB**: Distributed failure isolation (failure of 1 shard affects only 1/N of data)
1462
+
1463
+ ### Cost-Performance Analysis
1464
+
1465
+ | Shards | D1 Costs\*\* | Performance Gain | Cost per Performance Unit |
1466
+ | ------ | ------------ | ---------------- | ------------------------- |
1467
+ | 1 | 1x | 1x | 1.00x |
1468
+ | 10 | 1.2x | ~9x | 0.13x |
1469
+ | 100 | 2.5x | ~80x | 0.03x |
1470
+ | 1000 | 15x | ~700x | 0.02x |
1471
+
1472
+ \*\*Estimated based on D1's pricing model including KV overhead
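The last column of the table is the relative cost divided by the performance gain. The short sketch below reproduces it from the other two columns. `ScalingEstimate` and `costPerPerformanceUnit` are illustrative names, and the inputs are the table's estimates rather than measurements.

```typescript
// One row of the cost-performance table (values are the README's estimates).
interface ScalingEstimate {
	shards: number;
	relativeCost: number; // D1 cost relative to a single database
	performanceGain: number; // throughput relative to a single database
}

// Cost per performance unit = relative cost / performance gain.
function costPerPerformanceUnit(estimate: ScalingEstimate): number {
	return estimate.relativeCost / estimate.performanceGain;
}

const scalingEstimates: ScalingEstimate[] = [
	{ shards: 1, relativeCost: 1, performanceGain: 1 },
	{ shards: 10, relativeCost: 1.2, performanceGain: 9 },
	{ shards: 100, relativeCost: 2.5, performanceGain: 80 },
	{ shards: 1000, relativeCost: 15, performanceGain: 700 }
];

// Rounded to two decimals: 1.00, 0.13, 0.03, 0.02
const costColumn = scalingEstimates.map((estimate) => costPerPerformanceUnit(estimate).toFixed(2));
```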
1473
+
1474
+ ### When to Use CollegeDB
1475
+
1476
+ ✅ **Recommended for:**
1477
+
1478
+ - High-traffic applications (>1000 QPS)
1479
+ - Large datasets approaching D1 limits
1480
+ - Geographic distribution requirements
1481
+ - Applications needing >50 concurrent operations
1482
+ - Systems requiring fault tolerance
1483
+
1484
+ ✅ **Mixed Strategy specifically recommended for:**
1485
+
1486
+ - **Global applications** needing both fast queries and optimal data placement
1487
+ - **Large-scale databases** requiring different optimization for reads vs writes
1488
+ - **Multi-workload systems** with distinct read/write patterns
1489
+ - **Geographic optimization** while maintaining query performance
1490
+ - **Enterprise applications** needing fine-tuned performance control
1491
+
1492
+ ❌ **Not recommended for:**
1493
+
1494
+ - Small applications (<100 QPS)
1495
+ - Simple CRUD operations with minimal scale
1496
+ - Applications without geographic spread
1497
+ - Cost-sensitive deployments at small scale
1498
+ - **Single-strategy applications** where reads and writes have identical performance needs
1499
+
1500
+ ## 🔧 Advanced Configuration
472
1501
 
473
1502
  ### Custom Allocation Strategy
474
1503
 
@@ -492,6 +1521,409 @@ const config = {
492
1521
  initialize(config);
493
1522
  ```
494
1523
 
1524
+ ### Durable Objects Integration (ShardCoordinator)
1525
+
1526
+ CollegeDB includes an optional **ShardCoordinator** Durable Object that provides centralized shard allocation and statistics management. This is particularly useful for round-robin allocation strategies and monitoring shard utilization across your application.
1527
+
1528
+ #### Durable Object Setup
1529
+
1530
+ First, configure the Durable Object in your `wrangler.toml`:
1531
+
1532
+ ```toml
1533
+ [[durable_objects.bindings]]
1534
+ name = "ShardCoordinator"
1535
+ class_name = "ShardCoordinator"
1536
+
1537
+ # The ShardCoordinator class must be exported from your Worker module
1538
+ # (see the usage example below). New Durable Object classes also
1539
+ # need a migration entry:
1540
+ [[migrations]]
1541
+ tag = "v1"
1542
+ new_classes = ["ShardCoordinator"]
1543
+ ```
1544
+
1545
+ #### Basic Usage with ShardCoordinator
1546
+
1547
+ ```typescript
1548
+ import { collegedb, first, ShardCoordinator } from '@earth-app/collegedb';
1549
+
1550
+ // Export the Durable Object class for Cloudflare Workers
1551
+ export { ShardCoordinator };
1552
+
1553
+ export default {
1554
+ async fetch(request: Request, env: Env): Promise<Response> {
1555
+ // Initialize CollegeDB with coordinator support
1556
+ return await collegedb(
1557
+ {
1558
+ kv: env.KV,
1559
+ coordinator: env.ShardCoordinator, // Add coordinator binding
1560
+ strategy: 'round-robin',
1561
+ shards: {
1562
+ 'db-east': env.DB_EAST,
1563
+ 'db-west': env.DB_WEST,
1564
+ 'db-central': env.DB_CENTRAL
1565
+ }
1566
+ },
1567
+ async () => {
1568
+ // Your application logic here
1569
+ const user = await first('user-123', 'SELECT * FROM users WHERE id = ?', ['user-123']);
1570
+ return Response.json(user);
1571
+ }
1572
+ );
1573
+ }
1574
+ };
1575
+ ```
1576
+
1577
+ #### ShardCoordinator HTTP API
1578
+
1579
+ The ShardCoordinator exposes a comprehensive HTTP API for managing shards and allocation:
1580
+
1581
+ ##### Shard Management
1582
+
1583
+ ```typescript
1584
+ // Get coordinator instance
1585
+ const coordinatorId = env.ShardCoordinator.idFromName('default');
1586
+ const coordinator = env.ShardCoordinator.get(coordinatorId);
1587
+
1588
+ // List all registered shards
1589
+ const shardsResponse = await coordinator.fetch('http://coordinator/shards');
1590
+ const shards = await shardsResponse.json();
1591
+ // Returns: ["db-east", "db-west", "db-central"]
1592
+
1593
+ // Register a new shard
1594
+ await coordinator.fetch('http://coordinator/shards', {
1595
+ method: 'POST',
1596
+ headers: { 'Content-Type': 'application/json' },
1597
+ body: JSON.stringify({ shard: 'db-new-region' })
1598
+ });
1599
+
1600
+ // Remove a shard
1601
+ await coordinator.fetch('http://coordinator/shards', {
1602
+ method: 'DELETE',
1603
+ headers: { 'Content-Type': 'application/json' },
1604
+ body: JSON.stringify({ shard: 'db-old-region' })
1605
+ });
1606
+ ```
1607
+
1608
+ ##### Statistics and Monitoring
1609
+
1610
+ ```typescript
1611
+ // Get shard statistics
1612
+ const statsResponse = await coordinator.fetch('http://coordinator/stats');
1613
+ const stats = await statsResponse.json();
1614
+ /* Returns:
1615
+ [
1616
+ {
1617
+ "binding": "db-east",
1618
+ "count": 1542,
1619
+ "lastUpdated": 1672531200000
1620
+ },
1621
+ {
1622
+ "binding": "db-west",
1623
+ "count": 1458,
1624
+ "lastUpdated": 1672531205000
1625
+ }
1626
+ ]
1627
+ */
1628
+
1629
+ // Update shard statistics manually
1630
+ await coordinator.fetch('http://coordinator/stats', {
1631
+ method: 'POST',
1632
+ headers: { 'Content-Type': 'application/json' },
1633
+ body: JSON.stringify({
1634
+ shard: 'db-east',
1635
+ count: 1600
1636
+ })
1637
+ });
1638
+ ```
1639
+
1640
+ ##### Shard Allocation
1641
+
1642
+ ```typescript
1643
+ // Allocate a shard for a primary key
1644
+ const allocationResponse = await coordinator.fetch('http://coordinator/allocate', {
1645
+ method: 'POST',
1646
+ headers: { 'Content-Type': 'application/json' },
1647
+ body: JSON.stringify({
1648
+ primaryKey: 'user-123',
1649
+ strategy: 'round-robin' // Optional, uses coordinator default if not specified
1650
+ })
1651
+ });
1652
+
1653
+ const { shard } = await allocationResponse.json();
1654
+ // Returns: { "shard": "db-west" }
1655
+
1656
+ // Hash-based allocation (consistent for same key)
1657
+ const hashAllocation = await coordinator.fetch('http://coordinator/allocate', {
1658
+ method: 'POST',
1659
+ headers: { 'Content-Type': 'application/json' },
1660
+ body: JSON.stringify({
1661
+ primaryKey: 'user-456',
1662
+ strategy: 'hash'
1663
+ })
1664
+ });
1665
+ ```
1666
+
1667
+ ##### Health Check and Development
1668
+
1669
+ ```typescript
1670
+ // Health check endpoint
1671
+ const healthResponse = await coordinator.fetch('http://coordinator/health');
1672
+ // Returns: "OK" with 200 status
1673
+
1674
+ // Clear all coordinator state (DEVELOPMENT ONLY!)
1675
+ await coordinator.fetch('http://coordinator/flush', {
1676
+ method: 'POST'
1677
+ });
1678
+ // WARNING: This removes all shard registrations and statistics
1679
+ ```
1680
+
1681
+ #### Programmatic Shard Management
1682
+
1683
+ The ShardCoordinator also provides methods for direct programmatic access:
1684
+
1685
+ ```typescript
1686
+ // Get coordinator instance
1687
+ const coordinatorId = env.ShardCoordinator.idFromName('default');
1688
+ const coordinator = env.ShardCoordinator.get(coordinatorId);
1689
+
1690
+ // Increment shard count (when adding new keys)
1691
+ await coordinator.incrementShardCount('db-east');
1692
+
1693
+ // Decrement shard count (when removing keys)
1694
+ await coordinator.decrementShardCount('db-west');
1695
+ ```
1696
+
1697
+ #### Advanced Monitoring Setup
1698
+
1699
+ Set up comprehensive monitoring of your shard distribution:
1700
+
1701
+ ```typescript
1702
+ async function monitorShardHealth(env: Env) {
1703
+ const coordinatorId = env.ShardCoordinator.idFromName('default');
1704
+ const coordinator = env.ShardCoordinator.get(coordinatorId);
1705
+
1706
+ // Get current statistics
1707
+ const statsResponse = await coordinator.fetch('http://coordinator/stats');
1708
+ const stats = await statsResponse.json();
1709
+
1710
+ // Calculate distribution balance
1711
+ const totalKeys = stats.reduce((sum: number, shard: any) => sum + shard.count, 0);
1712
+ const avgKeysPerShard = totalKeys / stats.length;
1713
+
1714
+ // Check for imbalanced shards (>20% deviation from average)
1715
+ const imbalancedShards = stats.filter((shard: any) => {
1716
+ const deviation = Math.abs(shard.count - avgKeysPerShard) / avgKeysPerShard;
1717
+ return deviation > 0.2;
1718
+ });
1719
+
1720
+ if (imbalancedShards.length > 0) {
1721
+ console.warn('Shard imbalance detected:', imbalancedShards);
1722
+ // Trigger rebalancing logic or alerts
1723
+ }
1724
+
1725
+ // Check for stale statistics (>1 hour old)
1726
+ const now = Date.now();
1727
+ const staleShards = stats.filter((shard: any) => {
1728
+ return now - shard.lastUpdated > 3600000; // 1 hour in ms
1729
+ });
1730
+
1731
+ if (staleShards.length > 0) {
1732
+ console.warn('Stale shard statistics detected:', staleShards);
1733
+ }
1734
+
1735
+ return {
1736
+ totalKeys,
1737
+ avgKeysPerShard,
1738
+ balance: imbalancedShards.length === 0,
1739
+ freshStats: staleShards.length === 0,
1740
+ shards: stats
1741
+ };
1742
+ }
1743
+ ```
1744
+
1745
+ #### Error Handling
1746
+
1747
+ ```typescript
1748
+ try {
1749
+ const coordinator = env.ShardCoordinator.get(env.ShardCoordinator.idFromName('default'));
1750
+ const response = await coordinator.fetch('http://coordinator/allocate', {
1751
+ method: 'POST',
1752
+ headers: { 'Content-Type': 'application/json' },
1753
+ body: JSON.stringify({ primaryKey: 'user-123' })
1754
+ });
1755
+
1756
+ if (!response.ok) {
1757
+ const error = await response.json();
1758
+ throw new Error(`ShardCoordinator error: ${error.error}`);
1759
+ }
1760
+
1761
+ const { shard } = await response.json();
1762
+ return shard;
1763
+ } catch (error) {
1764
+ console.error('Failed to allocate shard:', error);
1765
+ // Fall back to hash-based allocation without the coordinator (hashFunction and availableShards are placeholders for your own helper and shard list)
1766
+ return hashFunction('user-123', availableShards);
1767
+ }
1768
+ ```
1769
+
1770
+ #### Performance Considerations
1771
+
1772
+ - **Coordinator Latency**: Round-robin allocation adds roughly 15-25ms of latency for coordinator communication, depending on shard count (see the Round-Robin Strategy table)
1773
+ - **Scalability**: Single coordinator instance can handle thousands of allocations per second
1774
+ - **Fault Tolerance**: Design fallback allocation strategies when coordinator is unavailable
1775
+ - **Caching**: Consider caching allocation results for frequently accessed keys
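The caching suggestion can be sketched as a small wrapper around whatever function performs the allocation. This is a minimal illustration under assumptions: `Allocator` and `withAllocationCache` are hypothetical names, and the cache lives in memory for a single Worker isolate, so it is lost on eviction and is not shared between instances.

```typescript
// Hypothetical allocator signature: resolves a primary key to a shard name.
type Allocator = (primaryKey: string) => Promise<string>;

// Wraps an allocator with a bounded in-memory cache so repeated keys skip the coordinator.
function withAllocationCache(allocate: Allocator, maxEntries = 10000): Allocator {
	const cache = new Map<string, string>();

	return async (primaryKey: string): Promise<string> => {
		const cached = cache.get(primaryKey);
		if (cached !== undefined) return cached;

		const shard = await allocate(primaryKey);

		if (cache.size >= maxEntries) {
			// Map iterates in insertion order, so the first key is the oldest entry.
			const oldest = cache.keys().next().value;
			if (oldest !== undefined) cache.delete(oldest);
		}
		cache.set(primaryKey, shard);
		return shard;
	};
}
```

Two concurrent requests for the same uncached key may both reach the underlying allocator. That is harmless for deterministic strategies such as hash, but with round-robin the authoritative mapping should still be the one stored in KV.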
1776
+
1777
+ ```typescript
1778
+ // Fallback allocation when coordinator is unavailable
1779
+ function fallbackAllocation(primaryKey: string, shards: string[]): string {
1780
+ // Use hash-based allocation as fallback (simpleHash stands for any function that returns a non-negative integer hash of a string)
1781
+ const hash = simpleHash(primaryKey);
1782
+ return shards[hash % shards.length];
1783
+ }
1784
+
1785
+ async function allocateWithFallback(coordinator: DurableObjectNamespace, primaryKey: string, shards: string[]): Promise<string> {
1786
+ try {
1787
+ const coordinatorId = coordinator.idFromName('default');
1788
+ const instance = coordinator.get(coordinatorId);
1789
+
1790
+ const response = await instance.fetch('http://coordinator/allocate', {
1791
+ method: 'POST',
1792
+ headers: { 'Content-Type': 'application/json' },
1793
+ body: JSON.stringify({ primaryKey })
1794
+ });
1795
+
1796
+ if (response.ok) {
1797
+ const { shard } = await response.json();
1798
+ return shard;
1799
+ }
1800
+ } catch (error) {
1801
+ console.warn('Coordinator unavailable, using fallback allocation:', error);
1802
+ }
1803
+
1804
+ // Fallback to hash-based allocation
1805
+ return fallbackAllocation(primaryKey, shards);
1806
+ }
1807
+ ```
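The fallback above calls `simpleHash` without defining it. One reasonable stand-in is 32-bit FNV-1a, sketched below. This is an assumption for illustration, not the hash CollegeDB uses internally. It hashes UTF-16 code units, which matches byte-wise FNV-1a for ASCII keys.

```typescript
// 32-bit FNV-1a over UTF-16 code units; an assumed stand-in for `simpleHash`.
function simpleHash(input: string): number {
	let hash = 0x811c9dc5; // FNV offset basis
	for (let i = 0; i < input.length; i++) {
		hash ^= input.charCodeAt(i);
		hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime with 32-bit wraparound
	}
	return hash >>> 0; // convert to unsigned so the modulo below is never negative
}

// The same key always maps to the same shard as long as the shard list is unchanged.
const shardNames = ['db-east', 'db-west', 'db-central'];
const chosenShard = shardNames[simpleHash('user-123') % shardNames.length];
```

Plain modulo hashing reassigns most keys when the shard list changes, so it suits a temporary fallback better than long-term placement.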
1808
+
1809
+ ## 🚀 Quick Reference
1810
+
1811
+ ### Strategy Selection Guide
1812
+
1813
+ | Strategy | Use Case | Latency | Distribution | Coordinator Required |
1814
+ | ------------- | ---------------------------------------- | ------------------ | ------------ | -------------------- |
1815
+ | `hash` | High-volume apps, consistent performance | Lowest | Excellent | No |
1816
+ | `round-robin` | Guaranteed even distribution | Medium | Perfect | Yes |
1817
+ | `random` | Simple setup, good enough distribution | Low | Good | No |
1818
+ | `location` | Geographic optimization, reduced latency | Region-optimized | Good | No |
1819
+ | `mixed` | Optimized read/write performance | Strategy-dependent | Variable | Strategy-dependent |
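For mixed strategies, "Strategy-dependent" in the last column means the coordinator is needed whenever either side uses `round-robin`. A small helper makes that rule explicit. The `ShardingStrategy` and `MixedStrategy` types and the `requiresCoordinator` function are illustrative names that mirror the configuration shapes in this README, not exports of the library.

```typescript
// Strategy names as they appear in the table above (assumed type, for illustration).
type ShardingStrategy = 'hash' | 'round-robin' | 'random' | 'location';
type MixedStrategy = { read: ShardingStrategy; write: ShardingStrategy };

// True when the configuration needs the ShardCoordinator binding,
// i.e. when any operation is routed with round-robin.
function requiresCoordinator(strategy: ShardingStrategy | MixedStrategy): boolean {
	const used = typeof strategy === 'string' ? [strategy] : [strategy.read, strategy.write];
	return used.some((name) => name === 'round-robin');
}

const hashOnly = requiresCoordinator('hash'); // false
const largeDatabaseMix = requiresCoordinator({ read: 'hash', write: 'round-robin' }); // true
```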
1820
+
1821
+ #### Mixed Strategy Recommendations by Scenario
1822
+
1823
+ | Scenario | Recommended Mix | Read Strategy | Write Strategy | Benefits |
1824
+ | ---------------------------------- | -------------------------------------- | ------------- | -------------- | ---------------------------------------------- |
1825
+ | **Large Databases (>10M records)** | `{read: 'hash', write: 'round-robin'}` | Hash | Round-Robin | Fastest reads, even data distribution |
1826
+ | **Global Applications** | `{read: 'hash', write: 'location'}` | Hash | Location | Fast queries, optimal geographic placement |
1827
+ | **High Write Volume** | `{read: 'location', write: 'hash'}` | Location | Hash | Regional read optimization, fast write routing |
1828
+ | **Analytics Workloads** | `{read: 'random', write: 'location'}` | Random | Location | Load-balanced queries, optimal data placement |
1829
+ | **Multi-Tenant SaaS** | `{read: 'hash', write: 'hash'}` | Hash | Hash | Consistent performance, predictable routing |
1830
+
1831
+ ### Configuration Templates
1832
+
1833
+ **Hash Strategy (Recommended for most apps):**
1834
+
1835
+ ```typescript
1836
+ {
1837
+ kv: env.KV,
1838
+ strategy: 'hash',
1839
+ shards: { 'db-1': env.DB_1, 'db-2': env.DB_2 }
1840
+ }
1841
+ ```
1842
+
1843
+ **Location Strategy (Geographic optimization):**
1844
+
1845
+ ```typescript
1846
+ {
1847
+ kv: env.KV,
1848
+ strategy: 'location',
1849
+ targetRegion: 'wnam',
1850
+ shardLocations: {
1851
+ 'db-west': { region: 'wnam', priority: 2 },
1852
+ 'db-east': { region: 'enam', priority: 1 }
1853
+ },
1854
+ shards: { 'db-west': env.DB_WEST, 'db-east': env.DB_EAST }
1855
+ }
1856
+ ```
1857
+
1858
+ **Round-Robin Strategy (Even distribution):**
1859
+
1860
+ ```typescript
1861
+ {
1862
+ kv: env.KV,
1863
+ coordinator: env.ShardCoordinator,
1864
+ strategy: 'round-robin',
1865
+ shards: { 'db-1': env.DB_1, 'db-2': env.DB_2, 'db-3': env.DB_3 }
1866
+ }
1867
+ ```
1868
+
1869
+ **Mixed Strategy (Global applications):**
1870
+
1871
+ ```typescript
1872
+ {
1873
+ kv: env.KV,
1874
+ strategy: {
1875
+ read: 'hash', // Fast, consistent reads
1876
+ write: 'location' // Optimal geographic placement
1877
+ },
1878
+ targetRegion: 'wnam',
1879
+ shardLocations: {
1880
+ 'db-west': { region: 'wnam', priority: 2 },
1881
+ 'db-east': { region: 'enam', priority: 1 }
1882
+ },
1883
+ shards: { 'db-west': env.DB_WEST, 'db-east': env.DB_EAST }
1884
+ }
1885
+ ```
1886
+
1887
+ **Mixed Strategy (Large databases):**
1888
+
1889
+ ```typescript
1890
+ {
1891
+ kv: env.KV,
1892
+ coordinator: env.ShardCoordinator,
1893
+ strategy: {
1894
+ read: 'hash', // Fastest possible reads
1895
+ write: 'round-robin' // Perfect distribution
1896
+ },
1897
+ shards: { 'db-1': env.DB_1, 'db-2': env.DB_2, 'db-3': env.DB_3 }
1898
+ }
1899
+ ```
1900
+
1901
+ **Mixed Strategy (High-performance consistent):**
1902
+
1903
+ ```typescript
1904
+ {
1905
+ kv: env.KV,
1906
+ strategy: {
1907
+ read: 'hash', // Predictable read performance
1908
+ write: 'hash' // Predictable write performance
1909
+ },
1910
+ shards: { 'db-1': env.DB_1, 'db-2': env.DB_2 }
1911
+ }
1912
+ ```
1913
+
1914
+ ### Region Codes Reference
1915
+
1916
+ | Code | Region | Typical Location |
1917
+ | ------ | --------------------- | ---------------- |
1918
+ | `wnam` | Western North America | San Francisco |
1919
+ | `enam` | Eastern North America | New York |
1920
+ | `weur` | Western Europe | London |
1921
+ | `eeur` | Eastern Europe | Berlin |
1922
+ | `apac` | Asia Pacific | Tokyo |
1923
+ | `oc` | Oceania | Sydney |
1924
+ | `me` | Middle East | Dubai |
1925
+ | `af` | Africa | Johannesburg |
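When region codes come from configuration or request metadata, it can help to validate them before passing them to `targetRegion`. The sketch below encodes the table as a union type with a runtime guard. `REGION_CODES`, `RegionCode`, and `isRegionCode` are hypothetical names and are not assumed to exist in the library.

```typescript
// Region codes copied from the table above.
const REGION_CODES = ['wnam', 'enam', 'weur', 'eeur', 'apac', 'oc', 'me', 'af'] as const;
type RegionCode = (typeof REGION_CODES)[number];

// Runtime type guard: narrows an arbitrary string to a known region code.
function isRegionCode(value: string): value is RegionCode {
	return (REGION_CODES as readonly string[]).indexOf(value) !== -1;
}

const knownRegion = isRegionCode('wnam'); // true
const unknownRegion = isRegionCode('mars'); // false
```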
1926
+
495
1927
  ## 🤝 Contributing
496
1928
 
497
1929
  1. Fork the repository