remult-sqlite-github 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2024
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,326 @@
+ # remult-sqlite-github
+
+ A [Remult](https://remult.dev) data provider that syncs a SQLite database to a GitHub repository for serverless deployments.
+
+ ## Features
+
+ - **Automatic sync**: The SQLite database is automatically synced to GitHub
+ - **Recovery**: The database is automatically recovered from GitHub if the local copy is missing
+ - **Conflict detection**: Uses GitHub's SHA-based conflict detection
+ - **Compression**: Optional gzip compression for uploads
+ - **GitHub App support**: Supports both Personal Access Tokens and GitHub App authentication
+ - **Rate limit handling**: Built-in retry logic with exponential backoff
+
+ ## Installation
+
+ ```bash
+ npm install remult-sqlite-github
+ ```
+
+ ## Requirements
+
+ - Node.js >= 18.0.0
+ - `better-sqlite3` >= 9.0.0
+ - `remult` >= 3.0.0
+
+ ## Quick Start
+
+ ```typescript
+ import { remultApi } from "remult/remult-sveltekit"; // or your framework
+ import { createGitHubDataProvider } from "remult-sqlite-github";
+
+ export const api = remultApi({
+   dataProvider: createGitHubDataProvider({
+     file: "./mydb.sqlite",
+     github: {
+       owner: "your-username",
+       repo: "your-database-repo",
+       token: process.env.GITHUB_TOKEN,
+     },
+   }),
+   entities: [Task], // Task: your own Remult entity class
+ });
+ ```
+
+ ## Configuration
+
+ ### Full Options
+
+ ```typescript
+ createGitHubDataProvider({
+   // SQLite Configuration
+   file: "./mydb.sqlite",
+   sqliteOptions: {
+     foreignKeys: true,        // Enable foreign key constraints (default: true)
+     busyTimeout: 5000,        // SQLite busy timeout in ms (default: 5000)
+     pragmas: {                // Additional PRAGMA statements
+       cache_size: -64000,
+     },
+   },
+
+   // GitHub Configuration
+   github: {
+     owner: "your-username",        // Repository owner (required)
+     repo: "your-database-repo",    // Repository name (required)
+     branch: "main",                // Branch to use (default: "main")
+     path: "data",                  // Path prefix in repo (default: "")
+
+     // Authentication (one of these required)
+     token: "ghp_xxxx",             // Personal Access Token
+
+     // OR GitHub App authentication
+     appId: 12345,
+     privateKey: "-----BEGIN RSA PRIVATE KEY-----...",
+     installationId: 67890,
+   },
+
+   // Sync Configuration
+   sync: {
+     snapshotInterval: 30000,   // Interval between snapshots in ms (default: 30 sec)
+     snapshotOnChange: true,    // Only snapshot if changes detected (default: true)
+     enableWal: false,          // Enable WAL mode segmentation (default: false)
+     walThreshold: 1048576,     // WAL size threshold in bytes (default: 1MB)
+     maxRetries: 3,             // Max retry attempts (default: 3)
+     compression: false,        // Enable gzip compression (default: false)
+   },
+
+   // Event callback
+   onEvent: (event) => {
+     console.log("Sync event:", event);
+   },
+
+   verbose: false,              // Enable verbose logging (default: false)
+ });
+ ```
+
+ ### Environment Variables Example
+
+ ```bash
+ # GitHub Configuration
+ GITHUB_OWNER=your-username
+ GITHUB_REPO=your-database-repo
+ GITHUB_BRANCH=main
+ GITHUB_TOKEN=ghp_xxxx
+
+ # Or GitHub App (alternative to token)
+ # GITHUB_APP_ID=12345
+ # GITHUB_PRIVATE_KEY="-----BEGIN RSA PRIVATE KEY-----..."
+ # GITHUB_INSTALLATION_ID=67890
+
+ # Sync options
+ COMPRESSION=false
+ DEBUG=false
+ ```
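
The variables above still have to be mapped onto the `github` option block. The following is a minimal sketch of one way to do that; `GitHubConfig`, `Env`, and `githubConfigFromEnv` are hypothetical names introduced here for illustration and are not exports of the package. The field names mirror the options shown under "Full Options".

```typescript
// Hypothetical shape mirroring the `github` option block shown in this README.
interface GitHubConfig {
  owner: string;
  repo: string;
  branch: string;
  token?: string;
  appId?: number;
  privateKey?: string;
  installationId?: number;
}

type Env = Record<string, string | undefined>;

// Hypothetical helper (not part of the package): reads the variables listed
// above and fails fast when required ones are missing.
function githubConfigFromEnv(env: Env): GitHubConfig {
  const owner = env.GITHUB_OWNER;
  const repo = env.GITHUB_REPO;
  if (!owner || !repo) {
    throw new Error("GITHUB_OWNER and GITHUB_REPO are required");
  }
  const base = { owner, repo, branch: env.GITHUB_BRANCH || "main" };

  // Prefer the Personal Access Token when it is present.
  if (env.GITHUB_TOKEN) {
    return { ...base, token: env.GITHUB_TOKEN };
  }

  // Otherwise all three GitHub App variables must be set.
  const { GITHUB_APP_ID, GITHUB_PRIVATE_KEY, GITHUB_INSTALLATION_ID } = env;
  if (GITHUB_APP_ID && GITHUB_PRIVATE_KEY && GITHUB_INSTALLATION_ID) {
    return {
      ...base,
      appId: Number(GITHUB_APP_ID),
      privateKey: GITHUB_PRIVATE_KEY,
      installationId: Number(GITHUB_INSTALLATION_ID),
    };
  }
  throw new Error("Set GITHUB_TOKEN or all three GitHub App variables");
}
```

Assuming the result is compatible with the provider's options, it could be passed as `github: githubConfigFromEnv(process.env)`. Failing at startup on a missing variable is a design choice of this sketch, not documented behavior of the library.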
+
+ ## Events
+
+ The `onEvent` callback receives the following event types:
+
+ ```typescript
+ type GitHubSyncEvent =
+   | { type: "snapshot_uploaded"; sha: string; size: number; compressed: boolean }
+   | { type: "wal_uploaded"; index: number; size: number }
+   | { type: "recovery_started"; branch: string }
+   | { type: "recovery_completed"; sha: string }
+   | { type: "commit_created"; sha: string; message: string }
+   | { type: "sync_error"; error: Error; context: string; willRetry: boolean }
+   | { type: "rate_limit_hit"; resetAt: Date; remaining: number }
+   | { type: "initialized"; recovered: boolean };
+ ```
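
Because every event carries a `type` discriminant, a handler can narrow on it with a `switch`. The sketch below copies the union from the README so that it is self-contained; `describeEvent` is a hypothetical helper, and the message wording is an assumption rather than anything the library emits.

```typescript
// Event union copied from the README above so the example compiles on its own.
type GitHubSyncEvent =
  | { type: "snapshot_uploaded"; sha: string; size: number; compressed: boolean }
  | { type: "wal_uploaded"; index: number; size: number }
  | { type: "recovery_started"; branch: string }
  | { type: "recovery_completed"; sha: string }
  | { type: "commit_created"; sha: string; message: string }
  | { type: "sync_error"; error: Error; context: string; willRetry: boolean }
  | { type: "rate_limit_hit"; resetAt: Date; remaining: number }
  | { type: "initialized"; recovered: boolean };

// Hypothetical helper: turns an event into a one-line log message.
function describeEvent(event: GitHubSyncEvent): string {
  switch (event.type) {
    case "snapshot_uploaded":
      return `snapshot ${event.sha} uploaded (${event.size} bytes, compressed=${event.compressed})`;
    case "wal_uploaded":
      return `WAL segment ${event.index} uploaded (${event.size} bytes)`;
    case "recovery_started":
      return `recovery started from branch ${event.branch}`;
    case "recovery_completed":
      return `recovery completed at ${event.sha}`;
    case "commit_created":
      return `commit ${event.sha}: ${event.message}`;
    case "sync_error":
      return `sync error in ${event.context}: ${event.error.message} (willRetry=${event.willRetry})`;
    case "rate_limit_hit":
      return `rate limit hit, ${event.remaining} remaining, resets at ${event.resetAt.toISOString()}`;
    case "initialized":
      return `initialized (recovered=${event.recovered})`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a case is missing.
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```

A handler such as `onEvent: (event) => console.log(describeEvent(event))` would then log one line per event. The `never` assignment in the `default` branch makes the compiler flag any event type added to the union later.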
+
+ ## GitHub Repository Structure
+
+ The database is stored in your GitHub repository with this structure:
+
+ ```
+ [path/]
+ ├── snapshot.db           # Database snapshot (or snapshot.db.gz if compressed)
+ └── snapshot.meta.json    # Snapshot metadata (timestamp, size, checksum)
+ ```
+
+ ## Branching Strategy
+
+ Use different branches for different environments:
+
+ ```typescript
+ // Production
+ createGitHubDataProvider({
+   github: { branch: "main", /* ...other options */ },
+ })
+
+ // Staging
+ createGitHubDataProvider({
+   github: { branch: "staging", /* ...other options */ },
+ })
+ ```
+
+ ## Advanced Usage
+
+ ### Manual Snapshot
+
+ ```typescript
+ import { BetterSqlite3GitHubDataProvider } from "remult-sqlite-github";
+
+ const provider = new BetterSqlite3GitHubDataProvider(options);
+ await provider.init();
+
+ // Force an immediate snapshot
+ await provider.snapshot();
+
+ // Or force sync (ignores snapshotOnChange setting)
+ await provider.forceSync();
+ ```
+
+ ### Access Raw Database
+
+ ```typescript
+ const db = provider.rawDatabase;
+ if (db) {
+   db.exec("VACUUM");
+ }
+ ```
+
+ ### Graceful Shutdown
+
+ ```typescript
+ // Performs final snapshot if there are pending changes
+ await provider.close();
+ ```
+
+ ## Limitations & Recommendations
+
+ ### Database Size
+
+ | Size | Status | Notes |
+ |------|--------|-------|
+ | < 50MB | Recommended | Optimal performance, fast uploads/downloads |
+ | 50-100MB | Supported | Works but slower sync times |
+ | > 100MB | Not supported | GitHub API rejects files over 100MB |
+
+ **Recommendations for managing database size:**
+ - Run `VACUUM` periodically to reclaim space: `db.exec("VACUUM")`
+ - Use appropriate data types (avoid storing large blobs)
+ - Consider archiving old data to separate tables/databases
+ - Monitor size with the `onEvent` callback (`snapshot_uploaded` event includes size)
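
The size table can be turned into a small check that runs whenever a snapshot size is reported. This is a sketch under stated assumptions: `classifySnapshotSize` is a hypothetical helper, 1 MB is taken as 1,048,576 bytes (matching the `walThreshold` default quoted earlier), and exactly 50 MB is treated as "supported".

```typescript
const MB = 1024 * 1024; // assumption: 1 MB = 1,048,576 bytes

type SizeStatus = "recommended" | "supported" | "not supported";

// Hypothetical helper mapping a byte count onto the tiers of the table above.
function classifySnapshotSize(bytes: number): SizeStatus {
  if (bytes < 50 * MB) return "recommended";
  if (bytes <= 100 * MB) return "supported";
  return "not supported";
}
```

One possible use is inside `onEvent`: when a `snapshot_uploaded` event arrives, pass `event.size` to `classifySnapshotSize` and log a warning for anything other than `"recommended"`. Whether `size` reports the compressed or uncompressed byte count is not stated in the README, so treat the result as approximate when compression is on.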
+
+ ### Compression
+
+ Compression is **off by default** but recommended for databases over 10MB:
+
+ ```typescript
+ createGitHubDataProvider({
+   sync: {
+     compression: true,  // Enable gzip compression
+   },
+ })
+ ```
+
+ | Scenario | Compression | Rationale |
+ |----------|-------------|-----------|
+ | < 10MB database | Off | Overhead not worth it, fast enough uncompressed |
+ | 10-50MB database | Recommended | 50-70% size reduction typical for SQLite |
+ | 50-100MB database | Strongly recommended | Essential to stay within limits and reduce transfer time |
+ | Text-heavy data | On | Excellent compression ratios (70-90%) |
+ | Binary/blob data | Optional | May not compress well |
+
+ **Compression trade-offs:**
+ - **Pros**: Smaller uploads, faster transfers, lower bandwidth usage
+ - **Cons**: CPU overhead for compression/decompression, slightly slower snapshots
+
+ ### Sync Interval
+
+ The default snapshot interval is **30 seconds**. Adjust based on your needs:
+
+ ```typescript
+ createGitHubDataProvider({
+   sync: {
+     snapshotInterval: 30 * 1000,  // 30 seconds (default)
+     snapshotOnChange: true,       // Only sync if data changed (default)
+   },
+ })
+ ```
+
+ | Interval | Use Case |
+ |----------|----------|
+ | 30 seconds | General use (default) |
+ | 1 minute | Slightly lower API usage |
+ | 5 minutes | Low-frequency updates, cost-conscious |
+ | 0 (disabled) | Manual sync only via `forceSync()` |
+
+ **Note**: With `snapshotOnChange: true` (default), snapshots only occur if data has changed, so a short interval won't cause unnecessary uploads.
+
+ ### Rate Limits
+
+ GitHub API has rate limits that this library handles automatically:
+
+ | Auth Type | Rate Limit | Notes |
+ |-----------|------------|-------|
+ | Personal Access Token | 5,000/hour | Sufficient for most use cases |
+ | GitHub App | 5,000/hour per installation | Same limit, but can have multiple installations |
+ | Unauthenticated | 60/hour | Not supported by this library |
+
+ **Rate limit considerations:**
+ - Each snapshot uses 2-4 API calls (get SHA, upload file, upload metadata, optional cleanup)
+ - Recovery uses 2-3 API calls
+ - With 30-second intervals: ~120 snapshots/hour max = ~480 API calls/hour (well under limit)
+ - With `snapshotOnChange: true` (default), actual API calls are much lower since only changes trigger uploads
+ - The library automatically waits and retries when rate limited
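
The arithmetic behind the 480 calls/hour figure can be reproduced for other intervals. The sketch below assumes the worst case of 4 API calls per snapshot and a snapshot on every tick; `maxApiCallsPerHour` is a hypothetical helper, not part of the package.

```typescript
const MS_PER_HOUR = 3_600_000;

// Hypothetical helper: upper bound on API calls per hour for a given interval,
// assuming every interval tick uploads a snapshot.
function maxApiCallsPerHour(snapshotIntervalMs: number, callsPerSnapshot = 4): number {
  if (snapshotIntervalMs <= 0) return 0; // interval 0 disables automatic snapshots
  const snapshotsPerHour = Math.floor(MS_PER_HOUR / snapshotIntervalMs);
  return snapshotsPerHour * callsPerSnapshot;
}

console.log(maxApiCallsPerHour(30_000));  // 480 (120 snapshots x 4 calls)
console.log(maxApiCallsPerHour(60_000));  // 240
console.log(maxApiCallsPerHour(300_000)); // 48
```

Even the 30-second worst case uses under 10% of the 5,000/hour budget (480 / 5,000 = 9.6%), and this bound ignores the savings from `snapshotOnChange`.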
+
+ ### Data Durability
+
+ **Important**: This is an **async-only** sync strategy. Data written to SQLite is:
+ 1. ✅ Immediately persisted to local SQLite file
+ 2. ⏳ Uploaded to GitHub on next snapshot interval (or manual sync)
+
+ **What this means:**
+ - If your serverless instance terminates before the next snapshot, recent changes may be lost
+ - For critical writes, call `forceSync()` after important operations
+ - This is suitable for applications where occasional data loss is acceptable
+
+ ```typescript
+ // For critical operations
+ await taskRepo.insert({ title: "Important task" });
+ await provider.forceSync(); // Ensure it's uploaded immediately
+ ```
+
+ ### Concurrent Access
+
+ This library is designed for **single-writer** scenarios:
+ - Multiple serverless instances reading is fine
+ - Multiple instances writing simultaneously may cause conflicts
+ - Conflicts are detected via SHA validation and will trigger retries
+ - For multi-writer scenarios, consider a traditional database
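
To illustrate what SHA-based conflict detection with retries looks like in general, here is a self-contained simulation. It does not call GitHub and does not reproduce this library's internals; `FakeRemoteFile` and `writeWithRetry` are hypothetical names, and the version strings stand in for real SHAs.

```typescript
// In-memory stand-in for a remote file guarded by a version identifier.
class FakeRemoteFile {
  private version = 0;
  private content = "";

  read(): { sha: string; content: string } {
    return { sha: `v${this.version}`, content: this.content };
  }

  // Rejects the write when the caller's SHA is stale, like a conflict response.
  write(content: string, expectedSha: string): string {
    if (expectedSha !== `v${this.version}`) {
      throw new Error("conflict: remote changed since last read");
    }
    this.version += 1;
    this.content = content;
    return `v${this.version}`;
  }
}

// Tries once with the given SHA, then re-reads the current SHA and retries
// up to maxRetries more times after each conflict.
function writeWithRetry(
  remote: FakeRemoteFile,
  content: string,
  knownSha: string,
  maxRetries = 3,
): { sha: string; attempts: number } {
  let sha = knownSha;
  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    try {
      return { sha: remote.write(content, sha), attempts: attempt };
    } catch {
      sha = remote.read().sha; // refresh the SHA and try again
    }
  }
  throw new Error("gave up after repeated conflicts");
}
```

Note that a blind retry like this lets the last writer win: the second writer's snapshot replaces the first writer's content wholesale. That is one reason a whole-file sync scheme fits single-writer deployments better than multi-writer ones.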
+
+ ### Other Limitations
+
+ - **GitHub.com only**: GitHub Enterprise Server is not supported
+ - **Public/Private repos**: Both work, but private repos require appropriate token scopes
+ - **Branch protection**: Ensure your token has permission to push to the configured branch
+ - **Large files**: Files over 100MB cannot be stored (GitHub API limit)
+ - **Binary data**: Large BLOBs in SQLite will increase database size significantly
+
+ ## Best Practices
+
+ 1. **Use a dedicated repository** for your database (not your application repo)
+ 2. **Enable compression** for databases over 10MB
+ 3. **Use branches** for environments (main, staging, development)
+ 4. **Monitor events** to track sync health and catch errors
+ 5. **Set appropriate intervals** based on your data criticality
+ 6. **Run VACUUM periodically** to keep database size manageable
+ 7. **Use GitHub App auth** for production (more secure than PATs)
+
+ ## Demo
+
+ See the `examples/sveltekit` directory for a complete SvelteKit demo application.
+
+ ```bash
+ cd examples/sveltekit
+ npm install
+ cp .env.example .env
+ # Edit .env with your GitHub credentials
+ npm run dev
+ ```
+
+ ## License
+
+ MIT