firebase-rtdb-tools 0.2.1__tar.gz

@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Berkay Turanci
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,152 @@
1
+ Metadata-Version: 2.4
2
+ Name: firebase-rtdb-tools
3
+ Version: 0.2.1
4
+ Summary: Firebase Realtime Database (RTDB) lossless restore toolkit for large backups
5
+ Author-email: Berkay Turanci <berkayturanci@gmail.com>
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/berkayturanci/firebase-rtdb-restore
8
+ Project-URL: Repository, https://github.com/berkayturanci/firebase-rtdb-restore
9
+ Project-URL: Issues, https://github.com/berkayturanci/firebase-rtdb-restore/issues
10
+ Project-URL: Changelog, https://github.com/berkayturanci/firebase-rtdb-restore/blob/main/CHANGELOG.md
11
+ Project-URL: Documentation, https://berkayturanci.github.io/firebase-rtdb-restore/
12
+ Classifier: Programming Language :: Python :: 3
13
+ Classifier: Operating System :: OS Independent
14
+ Classifier: Topic :: Database
15
+ Classifier: Topic :: Utilities
16
+ Requires-Python: >=3.8
17
+ Description-Content-Type: text/markdown
18
+ License-File: LICENSE
19
+ Requires-Dist: firebase-admin>=5.0.0
20
+ Provides-Extra: dev
21
+ Requires-Dist: ruff>=0.4.0; extra == "dev"
22
+ Requires-Dist: build; extra == "dev"
23
+ Requires-Dist: pre-commit; extra == "dev"
24
+ Dynamic: license-file
25
+
26
+ # Firebase RTDB Lossless Restore Toolkit
27
+
28
+ A simple, memory-efficient toolkit to restore large Firebase Realtime Database (RTDB) backups safely and without data loss.
29
+
30
+ [![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/berkayturanci/firebase-rtdb-restore)](https://github.com/berkayturanci/firebase-rtdb-restore/releases)
31
+ [![PyPI version](https://img.shields.io/pypi/v/firebase-rtdb-tools?logo=pypi)](https://pypi.org/project/firebase-rtdb-tools/)
32
+ [![Run Tests](https://github.com/berkayturanci/firebase-rtdb-restore/actions/workflows/tests.yml/badge.svg)](https://github.com/berkayturanci/firebase-rtdb-restore/actions/workflows/tests.yml)
33
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
34
+
35
+ ---
36
+
37
+ ## The Problem
38
+
39
+ Restoring a large Firebase database backup (e.g., 1 GB+) using default tools is difficult for three reasons:
40
+
41
+ 1. **The Overwrite Trap**: Importing a JSON file in the Firebase Console completely erases all existing data at that path first. You cannot upload a large backup in pieces because each new piece wipes out the previous ones.
42
+ 2. **Request Size Limits**: Firebase limits the size of a single write request. Large backup files will time out or fail with payload size errors.
43
+ 3. **Out-Of-Memory Crashes**: Loading a giant JSON backup file into memory will crash standard scripts.
44
+
45
+ ---
46
+
47
+ ## The Solution
48
+
49
+ This toolkit solves these problems using four simple steps:
50
+
51
+ * **Stream Splitting**: Splits a giant JSON file into smaller chunks without loading the whole file into memory. It reads the file in tiny 128 KB blocks.
52
+ * **Lossless Verification**: Automatically checks that no data was lost during splitting by comparing SHA-256 fingerprints of every single entry.
53
+ * **Batch Uploading**: Groups entries into safe ≤ 4 MB batches and uploads them using additive `PATCH` updates, merging data without erasing anything else.
54
+ * **Oversized Entry Recovery**: Recursively splits individual massive entries (like a single user with huge data) child-key by child-key so they fit under request limits.
55
+
56
+ ---
57
+
58
+ ## Installation
59
+
60
+ ### Via PyPI
61
+
62
+ ```bash
63
+ pip install firebase-rtdb-tools
64
+ ```
65
+
66
+ This installs four simple command-line tools:
67
+ * `firebase-rtdb-split`
68
+ * `firebase-rtdb-validate`
69
+ * `firebase-rtdb-upload`
70
+ * `firebase-rtdb-upload-single`
71
+
72
+ If PyPI does not show the package yet, install from source until the next release workflow publishes successfully.
73
+
74
+ ### From Source
75
+
76
+ ```bash
77
+ git clone https://github.com/berkayturanci/firebase-rtdb-restore.git
78
+ cd firebase-rtdb-restore
79
+ pip install -r requirements.txt
80
+ ```
81
+
82
+ For local development, install the editable package with development tools:
83
+
84
+ ```bash
85
+ pip install -e ".[dev]"
86
+ ```
87
+
88
+ ---
89
+
90
+ ## How to Get Your Firebase Service Account Key
91
+
92
+ To upload data to your Firebase database:
93
+
94
+ 1. Go to your **Firebase Console** -> **Project Settings** -> **Service accounts**.
95
+ 2. Click **Generate new private key** and download the JSON file.
96
+ 3. Pass the path to this JSON file using the `-s` / `--service-account` option, or set the environment variable:
97
+ ```bash
98
+ export FIREBASE_SERVICE_ACCOUNT_KEY="/path/to/serviceAccountKey.json"
99
+ ```
100
+
101
+ ---
102
+
103
+ ## Simple Restore Workflow
104
+
105
+ ### Step 1: Split the giant backup file
106
+ Split the backup JSON into smaller files (default is 1,000 entries per file):
107
+ ```bash
108
+ make split BACKUP=backup.json CHUNKS=./chunks NODE=users
109
+ ```
110
+ *(Or use `firebase-rtdb-split backup.json -o ./chunks -n users -c 1000`)*
111
+
112
+ ### Step 2: Verify the split
113
+ Check that the split was 100% exact and no data was lost:
114
+ ```bash
115
+ make validate BACKUP=backup.json CHUNKS=./chunks NODE=users
116
+ ```
117
+ *(Or use `firebase-rtdb-validate backup.json ./chunks -n users`)*
118
+
119
+ **Do not proceed if this step fails.**
120
+
121
+ ### Step 3: Upload chunks to Firebase
122
+ Upload all chunks to your database. By default (Option A) this merges data additively and does not overwrite sibling nodes; Options B and C wipe data first:
123
+ ```bash
124
+ # Option A: Append/Resume (merges chunks into /users without wiping anything else)
125
+ make upload CHUNKS=./chunks SA=serviceAccountKey.json DBPATH=/users
126
+
127
+ # Option B: Clean restore of the TARGET path (wipes /users first, leaves siblings intact)
128
+ make upload-wipe CHUNKS=./chunks SA=serviceAccountKey.json DBPATH=/users
129
+
130
+ # Option C: Full reset (wipes the ENTIRE database root first — destroys all data)
131
+ make upload-wipe-root CHUNKS=./chunks SA=serviceAccountKey.json DBPATH=/users
132
+ ```
133
+ *(Or use `firebase-rtdb-upload ./chunks -s serviceAccountKey.json -p /users` for Option A; append `--wipe` for Option B or `--wipe-root` for Option C.)*
134
+
135
+ **Upload options:**
136
+ * `--wipe` wipes only the target path (`-p`); `--wipe-root` wipes the entire database root.
137
+ * `--dry-run` previews exactly what would be wiped/uploaded without writing anything.
138
+ * `-w/--workers N` uploads N chunks in parallel (default 1).
139
+ * Uploads are **resumable**: completed chunks are recorded in a `.upload-progress` file inside the chunks directory and skipped automatically when you re-run after a failure. Transient write errors are retried with exponential backoff.
140
+
141
+ ### Step 4: Handle giant entries (if any)
142
+ If the upload script reports that a specific entry failed because it is too large to fit in a single request, upload that entry on its own so it can be split child key by child key:
143
+ ```bash
144
+ make upload-single UID=some_uid CHUNKS=./chunks/chunk_0000.json SA=serviceAccountKey.json DBPATH=/users
145
+ ```
146
+ *(Or use `firebase-rtdb-upload-single some_uid ./chunks/chunk_0000.json -s serviceAccountKey.json -p /users`)*
147
+
148
+ ---
149
+
150
+ ## License
151
+
152
+ MIT License. See [LICENSE](LICENSE) for details.
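
The "Lossless Verification" step in the description above compares SHA-256 fingerprints of every entry. The validator module is not among the files shown here, so the following is only a minimal sketch of the idea, assuming each entry is fingerprinted from a canonical (sorted-key, compact) JSON encoding. The names `fingerprint` and `compare_entries` are hypothetical and are not the package's API.

```python
import hashlib
import json


def fingerprint(value):
    """Return a SHA-256 hex digest of a canonical JSON encoding of ``value``."""
    # Assumption: sorting keys and using compact separators makes the encoding
    # independent of key order and whitespace in the source file.
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_entries(original, chunks):
    """Compare per-entry fingerprints of ``original`` with the union of ``chunks``.

    Returns ``(missing, extra, changed)`` as sorted lists of keys.
    """
    expected = {k: fingerprint(v) for k, v in original.items()}
    actual = {}
    for chunk in chunks:
        for k, v in chunk.items():
            actual[k] = fingerprint(v)
    missing = sorted(set(expected) - set(actual))
    extra = sorted(set(actual) - set(expected))
    changed = sorted(k for k in expected.keys() & actual.keys() if expected[k] != actual[k])
    return missing, extra, changed
```

A split would count as lossless in this sketch only when all three returned lists are empty. A real validator for multi-gigabyte backups would stream the original rather than hold it in a dict.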
@@ -0,0 +1,127 @@
1
+ # Firebase RTDB Lossless Restore Toolkit
2
+
3
+ A simple, memory-efficient toolkit to restore large Firebase Realtime Database (RTDB) backups safely and without data loss.
4
+
5
+ [![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/berkayturanci/firebase-rtdb-restore)](https://github.com/berkayturanci/firebase-rtdb-restore/releases)
6
+ [![PyPI version](https://img.shields.io/pypi/v/firebase-rtdb-tools?logo=pypi)](https://pypi.org/project/firebase-rtdb-tools/)
7
+ [![Run Tests](https://github.com/berkayturanci/firebase-rtdb-restore/actions/workflows/tests.yml/badge.svg)](https://github.com/berkayturanci/firebase-rtdb-restore/actions/workflows/tests.yml)
8
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
9
+
10
+ ---
11
+
12
+ ## The Problem
13
+
14
+ Restoring a large Firebase database backup (e.g., 1 GB+) using default tools is difficult for three reasons:
15
+
16
+ 1. **The Overwrite Trap**: Importing a JSON file in the Firebase Console completely erases all existing data at that path first. You cannot upload a large backup in pieces because each new piece wipes out the previous ones.
17
+ 2. **Request Size Limits**: Firebase limits the size of a single write request. Large backup files will time out or fail with payload size errors.
18
+ 3. **Out-Of-Memory Crashes**: Loading a giant JSON backup file into memory will crash standard scripts.
19
+
20
+ ---
21
+
22
+ ## The Solution
23
+
24
+ This toolkit solves these problems using four simple steps:
25
+
26
+ * **Stream Splitting**: Splits a giant JSON file into smaller chunks without loading the whole file into memory. It reads the file in tiny 128 KB blocks.
27
+ * **Lossless Verification**: Automatically checks that no data was lost during splitting by comparing SHA-256 fingerprints of every single entry.
28
+ * **Batch Uploading**: Groups entries into safe ≤ 4 MB batches and uploads them using additive `PATCH` updates, merging data without erasing anything else.
29
+ * **Oversized Entry Recovery**: Recursively splits individual massive entries (like a single user with huge data) child-key by child-key so they fit under request limits.
30
+
31
+ ---
32
+
33
+ ## Installation
34
+
35
+ ### Via PyPI
36
+
37
+ ```bash
38
+ pip install firebase-rtdb-tools
39
+ ```
40
+
41
+ This installs four simple command-line tools:
42
+ * `firebase-rtdb-split`
43
+ * `firebase-rtdb-validate`
44
+ * `firebase-rtdb-upload`
45
+ * `firebase-rtdb-upload-single`
46
+
47
+ If PyPI does not show the package yet, install from source until the next release workflow publishes successfully.
48
+
49
+ ### From Source
50
+
51
+ ```bash
52
+ git clone https://github.com/berkayturanci/firebase-rtdb-restore.git
53
+ cd firebase-rtdb-restore
54
+ pip install -r requirements.txt
55
+ ```
56
+
57
+ For local development, install the editable package with development tools:
58
+
59
+ ```bash
60
+ pip install -e ".[dev]"
61
+ ```
62
+
63
+ ---
64
+
65
+ ## How to Get Your Firebase Service Account Key
66
+
67
+ To upload data to your Firebase database:
68
+
69
+ 1. Go to your **Firebase Console** -> **Project Settings** -> **Service accounts**.
70
+ 2. Click **Generate new private key** and download the JSON file.
71
+ 3. Pass the path to this JSON file using the `-s` / `--service-account` option, or set the environment variable:
72
+ ```bash
73
+ export FIREBASE_SERVICE_ACCOUNT_KEY="/path/to/serviceAccountKey.json"
74
+ ```
75
+
76
+ ---
77
+
78
+ ## Simple Restore Workflow
79
+
80
+ ### Step 1: Split the giant backup file
81
+ Split the backup JSON into smaller files (default is 1,000 entries per file):
82
+ ```bash
83
+ make split BACKUP=backup.json CHUNKS=./chunks NODE=users
84
+ ```
85
+ *(Or use `firebase-rtdb-split backup.json -o ./chunks -n users -c 1000`)*
86
+
87
+ ### Step 2: Verify the split
88
+ Check that the split was 100% exact and no data was lost:
89
+ ```bash
90
+ make validate BACKUP=backup.json CHUNKS=./chunks NODE=users
91
+ ```
92
+ *(Or use `firebase-rtdb-validate backup.json ./chunks -n users`)*
93
+
94
+ **Do not proceed if this step fails.**
95
+
96
+ ### Step 3: Upload chunks to Firebase
97
+ Upload all chunks to your database. By default (Option A) this merges data additively and does not overwrite sibling nodes; Options B and C wipe data first:
98
+ ```bash
99
+ # Option A: Append/Resume (merges chunks into /users without wiping anything else)
100
+ make upload CHUNKS=./chunks SA=serviceAccountKey.json DBPATH=/users
101
+
102
+ # Option B: Clean restore of the TARGET path (wipes /users first, leaves siblings intact)
103
+ make upload-wipe CHUNKS=./chunks SA=serviceAccountKey.json DBPATH=/users
104
+
105
+ # Option C: Full reset (wipes the ENTIRE database root first — destroys all data)
106
+ make upload-wipe-root CHUNKS=./chunks SA=serviceAccountKey.json DBPATH=/users
107
+ ```
108
+ *(Or use `firebase-rtdb-upload ./chunks -s serviceAccountKey.json -p /users` for Option A; append `--wipe` for Option B or `--wipe-root` for Option C.)*
109
+
110
+ **Upload options:**
111
+ * `--wipe` wipes only the target path (`-p`); `--wipe-root` wipes the entire database root.
112
+ * `--dry-run` previews exactly what would be wiped/uploaded without writing anything.
113
+ * `-w/--workers N` uploads N chunks in parallel (default 1).
114
+ * Uploads are **resumable**: completed chunks are recorded in a `.upload-progress` file inside the chunks directory and skipped automatically when you re-run after a failure. Transient write errors are retried with exponential backoff.
115
+
116
+ ### Step 4: Handle giant entries (if any)
117
+ If the upload script reports that a specific entry failed because it is too large to fit in a single request, upload that entry on its own so it can be split child key by child key:
118
+ ```bash
119
+ make upload-single UID=some_uid CHUNKS=./chunks/chunk_0000.json SA=serviceAccountKey.json DBPATH=/users
120
+ ```
121
+ *(Or use `firebase-rtdb-upload-single some_uid ./chunks/chunk_0000.json -s serviceAccountKey.json -p /users`)*
122
+
123
+ ---
124
+
125
+ ## License
126
+
127
+ MIT License. See [LICENSE](LICENSE) for details.
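
The README's "Batch Uploading" bullet groups entries into batches of at most 4 MB before sending them as additive updates. The upload module is not among the files shown here, so this is a hedged sketch of one way such grouping could work: a greedy pass that measures each entry's compact JSON size. `MAX_BATCH_BYTES`, `encoded_size` and `batch_entries` are illustrative names, and the 4 MB figure is taken from the README text rather than from the code.

```python
import json

MAX_BATCH_BYTES = 4 * 1024 * 1024  # assumed limit, mirroring the README's "4 MB"


def encoded_size(key, value):
    """Bytes the entry occupies when encoded alone as a compact JSON object."""
    return len(json.dumps({key: value}, ensure_ascii=False, separators=(",", ":")).encode("utf-8"))


def batch_entries(entries, max_bytes=MAX_BATCH_BYTES):
    """Greedily group ``(key, value)`` pairs into dicts of at most ``max_bytes``.

    Summing per-entry sizes slightly over-estimates the merged payload, so the
    estimate errs on the safe side. An entry that alone exceeds ``max_bytes`` is
    yielded as its own single-entry batch so the caller can handle it separately.
    """
    batch, size = {}, 0
    for key, value in entries:
        entry_size = encoded_size(key, value)
        if batch and size + entry_size > max_bytes:
            yield batch
            batch, size = {}, 0
        batch[key] = value
        size += entry_size
    if batch:
        yield batch
```

With the Firebase Admin SDK, each yielded batch could then be merged into the target path with `db.reference(path).update(batch)`, which adds or replaces only the listed children.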
@@ -0,0 +1,5 @@
1
+ """
2
+ Firebase RTDB Lossless Restore Toolkit.
3
+ """
4
+
5
+ __version__ = "0.2.1"
@@ -0,0 +1,24 @@
1
+ """Allow ``python -m firebase_rtdb_restore`` to list the available commands."""
2
+
3
+ import sys
4
+
5
+ COMMANDS = {
6
+ "split": "firebase_rtdb_restore.split_backup",
7
+ "validate": "firebase_rtdb_restore.validate_chunks",
8
+ "upload": "firebase_rtdb_restore.upload_chunks",
9
+ "upload-single": "firebase_rtdb_restore.upload_single_user",
10
+ }
11
+
12
+
13
+ def main():
14
+ print("Firebase RTDB Lossless Restore Toolkit\n")
15
+ print("Run one of the module entry points directly, e.g.:")
16
+ for name, module in COMMANDS.items():
17
+ print(f" python -m {module} # {name}")
18
+ print("\nOr use the installed console scripts: firebase-rtdb-split, "
19
+ "firebase-rtdb-validate, firebase-rtdb-upload, firebase-rtdb-upload-single.")
20
+ return 0
21
+
22
+
23
+ if __name__ == "__main__":
24
+ sys.exit(main())
@@ -0,0 +1,234 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Shared helpers for the Firebase RTDB restore toolkit.
4
+
5
+ Keeps the streaming JSON parser, node location, service-account resolution,
6
+ idempotent app initialisation, retrying writes and TTY-aware progress in one
7
+ place so the four CLI entry points stay thin and behave consistently.
8
+ """
9
+
10
+ import contextlib
11
+ import json
12
+ import os
13
+ import sys
14
+ import time
15
+
16
+ READ_CHUNK = 128 * 1024 # 128 KB read window
17
+
18
+ # Sentinel distinguishing "value not parsed yet" from a legitimate JSON ``null``
19
+ # value, so an entry whose value is literally ``null`` is not silently dropped.
20
+ _INCOMPLETE = object()
21
+
22
+
23
+ def is_tty():
24
+ return sys.stdout.isatty()
25
+
26
+
27
+ def tty_progress(msg):
28
+ """Emit a carriage-return progress line, but only on an interactive TTY.
29
+
30
+ On non-interactive streams (CI logs, pipes, test runners) this is a no-op so
31
+ the transient ``\\r`` updates do not flood the output. Final/summary lines
32
+ should use ``print`` directly.
33
+ """
34
+ if is_tty():
35
+ print(f"\r{msg}", end="", flush=True)
36
+
37
+
38
+ def locate_node(f, node_key):
39
+ """Stream-search the open text file ``f`` for the top-level ``"<node_key>": {``.
40
+
41
+ Reads the file in blocks until the node is found, so the target node may
42
+ appear anywhere in the backup rather than only within a fixed-size header
43
+ window. Returns the leftover buffer positioned just after the opening
44
+ ``{`` of the node object, or ``None`` if the node is never found.
45
+ """
46
+ node_pattern = f'"{node_key}"'
47
+ # Retain enough of a tail across reads that a pattern straddling a read
48
+ # boundary still matches on the next iteration.
49
+ overlap = len(node_pattern) + 1
50
+ buf = ""
51
+ while True:
52
+ idx = buf.find(node_pattern)
53
+ if idx != -1:
54
+ after = buf[idx + len(node_pattern):]
55
+ colon = after.find(":")
56
+ if colon != -1:
57
+ brace = after.find("{", colon)
58
+ if brace != -1:
59
+ return after[brace + 1:]
60
+ # Key found but ':' / '{' not fully buffered yet — keep from the key
61
+ # onward and read more.
62
+ more = f.read(READ_CHUNK)
63
+ if not more:
64
+ return None
65
+ buf = buf[idx:] + more
66
+ continue
67
+ more = f.read(READ_CHUNK)
68
+ if not more:
69
+ return None
70
+ buf = buf[-overlap:] + more
71
+
72
+
73
+ def iter_entries(f, buf, file_size=None, label=""):
74
+ """Yield ``(key, value)`` pairs from the node object being streamed.
75
+
76
+ ``buf`` must be positioned just after the node's opening ``{`` (see
77
+ :func:`locate_node`). Works on both pretty-printed and minified JSON and
78
+ never loads the whole file into memory. When ``file_size`` and ``label`` are
79
+ given, a TTY-only progress line is emitted as the file is consumed.
80
+ """
81
+ decoder = json.JSONDecoder()
82
+ total = 0
83
+
84
+ def report():
85
+ if file_size and label:
86
+ pct = min(f.tell() * 100 // file_size, 100)
87
+ tty_progress(f" {label}: {total} entries | {pct}% read ")
88
+
89
+ while True:
90
+ if len(buf) < READ_CHUNK:
91
+ more = f.read(READ_CHUNK)
92
+ if more:
93
+ buf += more
94
+ report()
95
+
96
+ s = buf.lstrip(" \t\n\r")
97
+ if not s or s[0] == "}":
98
+ break
99
+ if s[0] == ",":
100
+ buf = s[1:]
101
+ continue
102
+ if s[0] != '"':
103
+ # Unexpected character outside of a key — skip one char and retry.
104
+ buf = s[1:]
105
+ continue
106
+
107
+ # ── parse key ───────────────────────────────────────────────────────
108
+ try:
109
+ key, key_end = decoder.raw_decode(s)
110
+ except json.JSONDecodeError:
111
+ more = f.read(READ_CHUNK)
112
+ if not more:
113
+ break
114
+ buf = s + more
115
+ continue
116
+
117
+ if not isinstance(key, str):
118
+ buf = s[key_end:]
119
+ continue
120
+
121
+ rest = s[key_end:].lstrip()
122
+ if not rest or rest[0] != ":":
123
+ buf = rest
124
+ continue
125
+
126
+ val_str = rest[1:].lstrip()
127
+
128
+ # ── parse value (read more if incomplete) ───────────────────────────
129
+ val = _INCOMPLETE
130
+ while True:
131
+ try:
132
+ val, val_end = decoder.raw_decode(val_str)
133
+ break
134
+ except json.JSONDecodeError:
135
+ more = f.read(READ_CHUNK)
136
+ if not more:
137
+ break # EOF with incomplete value
138
+ val_str += more
139
+
140
+ if val is _INCOMPLETE:
141
+ break # incomplete entry at EOF — stop
142
+
143
+ total += 1
144
+ buf = val_str[val_end:]
145
+ yield key, val
146
+
147
+ report()
148
+
149
+
150
+ def resolve_service_account(arg):
151
+ """Resolve the service-account path from CLI arg, env var, or local default.
152
+
153
+ Order of precedence: explicit ``arg`` → ``FIREBASE_SERVICE_ACCOUNT_KEY`` env
154
+ var → ``./serviceAccountKey.json``. Returns the expanded path or ``None``.
155
+ """
156
+ if arg:
157
+ return os.path.expanduser(arg)
158
+ env = os.environ.get("FIREBASE_SERVICE_ACCOUNT_KEY")
159
+ if env:
160
+ return os.path.expanduser(env)
161
+ if os.path.exists("./serviceAccountKey.json"):
162
+ return "./serviceAccountKey.json"
163
+ return None
164
+
165
+
166
+ def service_account_error():
167
+ print("ERROR: Service account file must be provided via -s/--service-account,")
168
+ print("or set via the FIREBASE_SERVICE_ACCOUNT_KEY environment variable,")
169
+ print("or exist as './serviceAccountKey.json' in the current working directory.")
170
+
171
+
172
+ def init_app(sa_path, database_url=None):
173
+ """Initialise the default Firebase app idempotently.
174
+
175
+ Returns ``(service_account_dict, database_url)``. Safe to call more than once
176
+ in the same process: a pre-existing default app is reused instead of raising.
177
+ """
178
+ import firebase_admin
179
+ from firebase_admin import credentials
180
+
181
+ with open(sa_path) as f:
182
+ sa = json.load(f)
183
+
184
+ db_url = database_url or f"https://{sa['project_id']}.firebaseio.com"
185
+
186
+ # A pre-existing default app raises ValueError — reuse it instead of failing.
187
+ with contextlib.suppress(ValueError):
188
+ firebase_admin.initialize_app(credentials.Certificate(sa_path), {"databaseURL": db_url})
189
+
190
+ return sa, db_url
191
+
192
+
193
+ def with_retry(fn, *, attempts=4, base_delay=1.0, label=""):
194
+ """Call ``fn`` with exponential backoff, retrying on any exception.
195
+
196
+ ``attempts`` is the total number of tries (1 initial + ``attempts - 1``
197
+ retries). Re-raises the last exception if every attempt fails.
198
+ """
199
+ last = None
200
+ for i in range(1, attempts + 1):
201
+ try:
202
+ return fn()
203
+ except Exception as e: # noqa: BLE001 — surface any transient failure to retry
204
+ last = e
205
+ if i == attempts:
206
+ break
207
+ delay = base_delay * (2 ** (i - 1))
208
+ print(f"\n retry {i}/{attempts - 1} for {label or 'write'} after error: {e} (waiting {delay:.1f}s)")
209
+ time.sleep(delay)
210
+ raise last
211
+
212
+
213
+ def recursive_write(ref, value, path, max_bytes, depth=0):
214
+ """Write ``value`` to ``ref``, splitting dicts child-by-child if too large.
215
+
216
+ Any single write is retried with backoff. A non-dict value larger than
217
+ ``max_bytes`` cannot be split, so it is written as one request with a warning.
218
+ """
219
+ sz = len(json.dumps(value, ensure_ascii=False, separators=(",", ":")).encode())
220
+ indent = " " * (depth + 1)
221
+ tty_progress(f"{indent}{path} ({sz // 1024} KB) ...")
222
+
223
+ if sz <= max_bytes or not isinstance(value, dict):
224
+ if sz > max_bytes:
225
+ print(
226
+ f"\n WARNING: {path} is {sz // 1024} KB and is not a dict — "
227
+ f"cannot split further, writing as a single request."
228
+ )
229
+ with_retry(lambda: ref.set(value), label=path)
230
+ return
231
+
232
+ # Too large — write each child key separately (recursing as needed).
233
+ for k, v in value.items():
234
+ recursive_write(ref.child(k), v, f"{path}/{k}", max_bytes, depth + 1)
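
The core trick in `iter_entries` above is to call `json.JSONDecoder.raw_decode` on a growing buffer and read another block whenever decoding fails because the value is incomplete. The following is a simplified, self-contained sketch of that technique, not the package's code: it assumes a stream of whitespace-separated JSON objects or arrays, whose closing bracket makes completeness unambiguous. `iter_json_objects` and `read_size` are hypothetical names.

```python
import io
import json


def iter_json_objects(f, read_size=16):
    """Yield JSON objects/arrays from text stream ``f`` without reading it whole."""
    decoder = json.JSONDecoder()
    buf = ""
    while True:
        buf = buf.lstrip()
        if buf:
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                pass  # value not fully buffered yet; fall through and read more
            else:
                yield value
                buf = buf[end:]  # keep only the unparsed remainder
                continue
        more = f.read(read_size)
        if not more:
            return  # end of stream; an incomplete trailing value is dropped
        buf += more


if __name__ == "__main__":
    stream = io.StringIO('{"a": 1}\n{"b": {"c": [true, null, 12345]}}')
    for obj in iter_json_objects(stream, read_size=5):
        print(obj)
```

Bare numbers are the awkward case for this approach: a number cut off at a block boundary still decodes successfully as a shorter number, which is why the sketch restricts itself to bracketed values.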
@@ -0,0 +1,91 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Stream-split a Firebase RTDB backup into N-entry chunk files.
4
+
5
+ Works on both pretty-printed AND minified (single-line) JSON.
6
+ Reads in 128 KB blocks — never loads the full file into memory.
7
+ """
8
+
9
+ import argparse
10
+ import json
11
+ import os
12
+
13
+ from firebase_rtdb_restore._common import iter_entries, locate_node
14
+
15
+
16
+ def split_backup(input_path, output_dir, chunk_size, node_key):
17
+ if not os.path.exists(input_path):
18
+ print(f"ERROR: Input file not found: {input_path}")
19
+ return 0, 0
20
+ file_size = os.path.getsize(input_path)
21
+ os.makedirs(output_dir, exist_ok=True)
22
+
23
+ chunk_num = 0
24
+ total = 0
25
+ chunk = {}
26
+
27
+ with open(input_path, encoding="utf-8") as f:
28
+ buf = locate_node(f, node_key)
29
+ if buf is None:
30
+ print(f'ERROR: "{node_key}" key not found. Is this a Firebase RTDB backup?')
31
+ return 0, 0
32
+
33
+ for key, val in iter_entries(f, buf, file_size=file_size, label="parsing"):
34
+ chunk[key] = val
35
+ total += 1
36
+ if len(chunk) >= chunk_size:
37
+ _write_chunk(output_dir, chunk_num, chunk)
38
+ chunk_num += 1
39
+ chunk = {}
40
+
41
+ # Final partial chunk
42
+ if chunk:
43
+ _write_chunk(output_dir, chunk_num, chunk)
44
+ chunk_num += 1
45
+
46
+ print(f"\n\nDone: {total} entries → {chunk_num} chunk files in:\n {output_dir}/")
47
+ _print_upload_instructions(output_dir, node_key)
48
+ return chunk_num, total
49
+
50
+
51
+ def _write_chunk(output_dir, chunk_num, chunk):
52
+ path = os.path.join(output_dir, f"chunk_{chunk_num:04d}.json")
53
+ with open(path, "w", encoding="utf-8") as f:
54
+ json.dump(chunk, f)
55
+ print(f"\r chunk_{chunk_num:04d}.json ({len(chunk)} entries){' ' * 30}")
56
+
57
+
58
+ def _print_upload_instructions(output_dir, node_key):
59
+ print(f"\nUpload each chunk (merges into /{node_key} — does NOT overwrite others):")
60
+ print(f" for f in {output_dir}/chunk_*.json; do")
61
+ print( " echo \"Uploading $f ...\"")
62
+ print(f" firebase database:update /{node_key} \"$f\"")
63
+ print( " done")
64
+
65
+
66
+ def main():
67
+ parser = argparse.ArgumentParser(description="Stream-split a Firebase RTDB backup into N-entry chunk files.")
68
+ parser.add_argument("backup_file", help="Path to the RTDB backup JSON file.")
69
+ parser.add_argument("-o", "--output-dir", help="Directory to save chunk files. Defaults to '<backup_file_dir>/rtdb-chunks'.")
70
+ parser.add_argument("-c", "--chunk-size", type=int, default=1000, help="Number of entries per chunk file (default: 1000).")
71
+ parser.add_argument("-n", "--node", default="users", help="The top-level JSON key to split (default: 'users').")
72
+
73
+ args = parser.parse_args()
74
+
75
+ if args.chunk_size < 1:
76
+ parser.error("--chunk-size must be a positive integer")
77
+
78
+ input_path = os.path.expanduser(args.backup_file)
79
+ output_dir = os.path.expanduser(args.output_dir) if args.output_dir \
80
+ else os.path.join(os.path.dirname(input_path), "rtdb-chunks")
81
+
82
+ print(f"Input: {input_path}")
83
+ print(f"Output dir: {output_dir}")
84
+ print(f"Chunk size: {args.chunk_size} entries")
85
+ print(f"Split node: {args.node}\n")
86
+
87
+ split_backup(input_path, output_dir, args.chunk_size, args.node)
88
+
89
+
90
+ if __name__ == "__main__":
91
+ main()
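
The grouping logic in `split_backup` above can be exercised without a real backup file. The sketch below is a hypothetical standalone helper, `write_chunks`, that takes any iterable of `(key, value)` pairs and writes files with the same `chunk_NNNN.json` naming scheme; it is an illustration under those assumptions and is not part of the package.

```python
import json
import os
import tempfile


def write_chunks(entries, output_dir, chunk_size):
    """Write ``(key, value)`` pairs to ``chunk_size``-entry JSON files.

    Returns the list of file paths written, in order.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    os.makedirs(output_dir, exist_ok=True)
    paths = []

    def flush(chunk):
        # File names follow the chunk_0000.json pattern used by the splitter.
        path = os.path.join(output_dir, f"chunk_{len(paths):04d}.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(chunk, f)
        paths.append(path)

    chunk = {}
    for key, value in entries:
        chunk[key] = value
        if len(chunk) >= chunk_size:
            flush(chunk)
            chunk = {}
    if chunk:
        flush(chunk)  # final partial chunk
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = ((f"user{i}", {"n": i}) for i in range(5))
        for p in write_chunks(demo, tmp, chunk_size=2):
            print(os.path.basename(p))
```

Because the helper consumes an iterator and holds only one chunk at a time, memory use is bounded by `chunk_size` entries rather than by the size of the backup.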