git-safepoint 0.0.1__tar.gz

Files changed (34)
  1. git_safepoint-0.0.1/.gitignore +12 -0
  2. git_safepoint-0.0.1/LICENSE +21 -0
  3. git_safepoint-0.0.1/PKG-INFO +322 -0
  4. git_safepoint-0.0.1/README.md +294 -0
  5. git_safepoint-0.0.1/adapters/git-safepoint-preexec.zsh +243 -0
  6. git_safepoint-0.0.1/adapters/pretooluse_hook.py +42 -0
  7. git_safepoint-0.0.1/git_safepoint/__init__.py +9 -0
  8. git_safepoint-0.0.1/git_safepoint/__main__.py +9 -0
  9. git_safepoint-0.0.1/git_safepoint/cli.py +767 -0
  10. git_safepoint-0.0.1/git_safepoint/destructive.py +422 -0
  11. git_safepoint-0.0.1/git_safepoint/engine.py +2043 -0
  12. git_safepoint-0.0.1/git_safepoint/gitutil.py +460 -0
  13. git_safepoint-0.0.1/git_safepoint/ids.py +86 -0
  14. git_safepoint-0.0.1/git_safepoint/lock.py +106 -0
  15. git_safepoint-0.0.1/git_safepoint/mtimecache.py +150 -0
  16. git_safepoint-0.0.1/git_safepoint/secret.py +261 -0
  17. git_safepoint-0.0.1/git_safepoint.py +16 -0
  18. git_safepoint-0.0.1/pyproject.toml +70 -0
  19. git_safepoint-0.0.1/tests/__init__.py +0 -0
  20. git_safepoint-0.0.1/tests/helpers.py +91 -0
  21. git_safepoint-0.0.1/tests/test_batch.py +101 -0
  22. git_safepoint-0.0.1/tests/test_cli.py +351 -0
  23. git_safepoint-0.0.1/tests/test_concurrency.py +369 -0
  24. git_safepoint-0.0.1/tests/test_debounce.py +65 -0
  25. git_safepoint-0.0.1/tests/test_destructive.py +334 -0
  26. git_safepoint-0.0.1/tests/test_gitutil.py +296 -0
  27. git_safepoint-0.0.1/tests/test_hook.py +635 -0
  28. git_safepoint-0.0.1/tests/test_ids.py +173 -0
  29. git_safepoint-0.0.1/tests/test_include_ignored.py +256 -0
  30. git_safepoint-0.0.1/tests/test_incremental.py +458 -0
  31. git_safepoint-0.0.1/tests/test_prune.py +506 -0
  32. git_safepoint-0.0.1/tests/test_restore.py +970 -0
  33. git_safepoint-0.0.1/tests/test_secret.py +331 -0
  34. git_safepoint-0.0.1/tests/test_snapshot.py +493 -0
@@ -0,0 +1,12 @@
1
+ __pycache__/
2
+ *.pyc
3
+ *.pyo
4
+ *.pyd
5
+ .pytest_cache/
6
+ *.egg-info/
7
+ dist/
8
+ build/
9
+ .venv/
10
+ venv/
11
+ env/
12
+ .env
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Takayoshi Hirano
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,322 @@
1
+ Metadata-Version: 2.4
2
+ Name: git-safepoint
3
+ Version: 0.0.1
4
+ Summary: Work-tree snapshot safety net: untracked-safe capture + selective restore via git plumbing
5
+ Project-URL: Homepage, https://github.com/takahira/git-safepoint
6
+ Project-URL: Repository, https://github.com/takahira/git-safepoint
7
+ Project-URL: Issues, https://github.com/takahira/git-safepoint/issues
8
+ Author: Takayoshi Hirano
9
+ License-Expression: MIT
10
+ License-File: LICENSE
11
+ Keywords: backup,checkpoint,git,recovery,restore,safety-net,snapshot,undo,untracked
12
+ Classifier: Development Status :: 3 - Alpha
13
+ Classifier: Environment :: Console
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: Operating System :: MacOS
16
+ Classifier: Operating System :: POSIX :: Linux
17
+ Classifier: Programming Language :: Python :: 3
18
+ Classifier: Programming Language :: Python :: 3.9
19
+ Classifier: Programming Language :: Python :: 3.10
20
+ Classifier: Programming Language :: Python :: 3.11
21
+ Classifier: Programming Language :: Python :: 3.12
22
+ Classifier: Programming Language :: Python :: 3.13
23
+ Classifier: Programming Language :: Python :: 3.14
24
+ Classifier: Topic :: Software Development :: Version Control :: Git
25
+ Classifier: Topic :: System :: Archiving :: Backup
26
+ Requires-Python: >=3.9
27
+ Description-Content-Type: text/markdown
28
+
29
+ # git-safepoint
30
+
31
+ **Local-first work-tree snapshot safety net for AI-assisted coding.**
32
+
33
+ Captures tracked + untracked files before every destructive command, and lets
34
+ you restore just the file you lost. No cloud. No accounts. Standard library
35
+ only. Touches neither your index nor HEAD.
36
+
37
+ [![MIT License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
38
+ [![Python 3.9+](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org)
39
+ [![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux-lightgrey.svg)](https://github.com/takahira/git-safepoint)
40
+
41
+ ---
42
+
43
+ ## The problem
44
+
45
+ AI coding tools have a structural blind spot: **bash/terminal commands and untracked files**.
46
+
47
+ - Claude Code [`/rewind`](https://code.claude.com/docs/en/checkpointing) explicitly documents it cannot restore bash changes (`rm`, `mv`, `cp`, …)
48
+ - [copilot-cli #1675](https://github.com/github/copilot-cli/issues/1675) (Feb 2026): checkpoint restore deleted ~1 GB of untracked files via `git clean -fd`
49
+ - Replit (Jul 2025), Gemini CLI (Jul 2025), PocketOS (Apr 2026) — same pattern, different tools
50
+
51
+ `git reflog` won't help. These files were never staged.
52
+
53
+ git-safepoint fills that gap.
54
+
55
+ ---
56
+
57
+ ## What makes it different
58
+
59
+ <!-- markdownlint-disable MD060 -->
60
+ | Feature | git-safepoint | mrq¹ | ckpt² | Re_gent³ | Native AI⁴ |
61
+ |---------------------------------|---------|-----------------|---------------------|---------------------|--------------------|
62
+ | Local-only (no cloud upload) | Yes | No (cloud only) | Yes | Yes | Yes |
63
+ | Captures untracked files | Yes | Partial | Unverified | Yes | No |
64
+ | Covers manual bash / ext tools | Yes | Yes (fs-watch) | Yes | No (agent-hook) | No |
65
+ | File-level selective restore | Yes | No (whole snap) | Partial (gen only) | No (not impl) | No (session only) |
66
+ | Free OSS | Yes | No ($9–29/mo) | Yes | Yes | Yes |
67
+ <!-- markdownlint-enable MD060 -->
68
+
69
+ ¹ [mrq](https://getmrq.com) — commercial cloud snapshot service
70
+ ² [ckpt](https://github.com/mohshomis/ckpt) — OSS, TypeScript/Node
71
+ ³ [Re_gent](https://github.com/regent-vcs/re_gent) — OSS, Go, agent-hook only
72
+ ⁴ Claude Code `/rewind`, Cursor checkpoints, Gemini CLI, Codex CLI
73
+
74
+ git-safepoint is the only tool we're aware of that satisfies all four simultaneously: **local OSS × true untracked-safe × covers manual bash × snapshot-id / file-level restore**.
75
+
76
+ ---
77
+
78
+ ## Install
79
+
80
+ Zero runtime dependencies — standard library only, Python 3.9+.
81
+
82
+ ```sh
83
+ # Recommended: install the `git-safepoint` command (pipx keeps it isolated)
84
+ pipx install git+https://github.com/takahira/git-safepoint
85
+ # ... or: pip install git+https://github.com/takahira/git-safepoint
86
+
87
+ # Or run from a clone without installing (also gives you the adapters/ hooks)
88
+ git clone https://github.com/takahira/git-safepoint
89
+ cd git-safepoint
90
+ ```
91
+
92
+ After a pip/pipx install the CLI is `git-safepoint` (or `python3 -m git_safepoint`).
93
+ From a clone it is `python3 git_safepoint.py`. The Claude Code / zsh hook adapters
94
+ under `adapters/` ship with the clone (and the sdist) — use the clone if you want
95
+ them.
96
+
97
+ ## Quick start
98
+
99
+ ```sh
100
+ # Snapshot the current work tree (tracked + untracked; secrets auto-excluded)
101
+ git-safepoint --repo /path/to/your/repo snapshot --label "before refactor"
102
+
103
+ # Also capture .gitignore'd build artifacts (secrets stay excluded even here)
104
+ git-safepoint --repo . snapshot --include-ignored 'output/' --include-ignored '*.log'
105
+
106
+ # List snapshots (newest first)
107
+ git-safepoint --repo . list
108
+ git-safepoint --repo . list --json
109
+
110
+ # Diff between two snapshots (or vs current work tree)
111
+ git-safepoint --repo . diff <id1> [<id2>] [--path FILE]
112
+
113
+ # Restore a single file
114
+ git-safepoint --repo . restore <id> path/to/lost-file.txt
115
+
116
+ # Restore a subtree or everything
117
+ git-safepoint --repo . restore <id> --dir notes/
118
+ git-safepoint --repo . restore <id> --all --yes
119
+
120
+ # Interactive restore (TTY: list → diff → confirm)
121
+ git-safepoint --repo . restore --interactive
122
+
123
+ # Recover a partially-staged version (git add -p content lost to reset --hard)
124
+ git-safepoint --repo . restore <id> path/to/file --staged
125
+
126
+ # GC / retention
127
+ git-safepoint --repo . prune --keep-generations 50 --dry-run
128
+ ```
129
+
130
+ From a clone (no install), replace `git-safepoint` with `python3 git_safepoint.py`.
131
+
132
+ ---
133
+
134
+ ## Claude Code hook (auto-capture before every tool call)
135
+
136
+ Add to `~/.claude/settings.json`:
137
+
138
+ ```json
139
+ {
140
+ "hooks": {
141
+ "PreToolUse": [
142
+ {
143
+ "matcher": "Bash",
144
+ "hooks": [{"type": "command", "command": "python3 /abs/path/to/git-safepoint/adapters/pretooluse_hook.py"}]
145
+ },
146
+ {
147
+ "matcher": "Write",
148
+ "hooks": [{"type": "command", "command": "python3 /abs/path/to/git-safepoint/adapters/pretooluse_hook.py"}]
149
+ },
150
+ {
151
+ "matcher": "Edit",
152
+ "hooks": [{"type": "command", "command": "python3 /abs/path/to/git-safepoint/adapters/pretooluse_hook.py"}]
153
+ },
154
+ {
155
+ "matcher": "NotebookEdit",
156
+ "hooks": [{"type": "command", "command": "python3 /abs/path/to/git-safepoint/adapters/pretooluse_hook.py"}]
157
+ }
158
+ ]
159
+ }
160
+ }
161
+ ```
162
+
163
+ Reload your Claude Code window. The hook fires before every Bash, Write, Edit, and NotebookEdit call:
164
+
165
+ - **Conservative mode**: skips if nothing changed (mtime check + tree SHA dedup), so it's low-overhead on read-only calls
166
+ - **Fail-open**: always exits 0 — the safety net never blocks the agent
167
+ - Detects git repos from command paths when `cwd` is outside any repo
168
+
169
+ Hooks take no CLI flags, so to also capture `.gitignore`'d build artifacts through the hook/preexec path, export a `:`-separated allow-list — e.g. `export GIT_SAFEPOINT_INCLUDE_IGNORED='output/:dist/'`. Secrets stay excluded even then.
170
+
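A minimal sketch of the fail-open behaviour and the `:`-separated allow-list described above. It assumes the hook receives a JSON payload on stdin that carries a `cwd` field; `run_hook`, `capture`, and `parse_include_ignored` are hypothetical names used for illustration, not the adapter's actual code.

```python
import json
import os


def parse_include_ignored(environ=None):
    """Split the ':'-separated GIT_SAFEPOINT_INCLUDE_IGNORED allow-list."""
    environ = os.environ if environ is None else environ
    raw = environ.get("GIT_SAFEPOINT_INCLUDE_IGNORED", "")
    return [part for part in raw.split(":") if part]


def run_hook(stdin_text, capture, environ=None):
    """Call `capture(cwd, include_ignored)` and return exit code 0 regardless."""
    try:
        payload = json.loads(stdin_text)
        # `cwd` is an assumed payload field; fall back to the current directory.
        capture(payload.get("cwd", "."), parse_include_ignored(environ))
    except Exception:
        # Fail-open: a broken safety net must never block the agent.
        pass
    return 0
```

Swallowing every exception is deliberate here: the hook trades diagnostics for the guarantee that a tool call is never blocked.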
171
+ Live-verified with the Claude Code VSCode extension (June 2026).
172
+
173
+ ---
174
+
175
+ ## zsh preexec (auto-capture before terminal commands)
176
+
177
+ ```sh
178
+ export GIT_SAFEPOINT_PY=/abs/path/to/git-safepoint/git_safepoint.py
179
+ source /abs/path/to/git-safepoint/adapters/git-safepoint-preexec.zsh
180
+ ```
181
+
182
+ Snapshots before destructive shell commands (`rm`, `mv`, `git reset --hard`, etc.).
183
+
184
+ ---
185
+
186
+ ## How it works
187
+
188
+ git-safepoint uses git plumbing only — no diffs, no stash, no index changes:
189
+
190
+ 1. `git ls-files --cached --others --exclude-standard` enumerates tracked + untracked
191
+ 2. Files are stored with `git hash-object -w` into the repo's object store (batch mode: 500 files/fork)
192
+ 3. A **private index** (separate from yours) builds a tree with `git write-tree`
193
+ 4. A shadow commit lands at `refs/snapshots/<timestamp-seq-pid>` — HEAD and your index are untouched
194
+ 5. If the **staged index** differs from both the work tree and HEAD (a `git add -p` / stage-then-edit state that `reset --hard` would otherwise destroy), that index is captured as the snapshot commit's parent — `list` marks it `+staged`, and `restore --staged <id> …` / `diff --staged …` reach it. Built from a copy of the index, so the real index is never touched.
195
+
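The plumbing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's code: it swaps the batched `git hash-object -w` step for a single `git update-index --add` against a private index (which also writes the blobs), omits the `seq` component of the ref name, skips the staged-index parent, and applies no secret exclusion. The helper names `git` and `snapshot` and the fallback identity are hypothetical.

```python
import os
import subprocess
import tempfile
import time


def git(repo, *args, env=None, stdin=None):
    """Run a git command inside `repo` and return its stdout as bytes."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        input=stdin, env=env, check=True, stdout=subprocess.PIPE,
    )
    return result.stdout


def snapshot(repo, label="snapshot"):
    """Store the work tree as a shadow commit under refs/snapshots/."""
    # 1. Enumerate tracked + untracked paths (NUL-separated, ignore rules applied).
    listed = git(repo, "ls-files", "-z", "--cached", "--others", "--exclude-standard")
    root = os.fsencode(repo)
    paths = [
        p for p in listed.split(b"\0")
        # Skip deleted tracked files and untracked nested repos ("dir/").
        if p and not p.endswith(b"/") and os.path.lexists(os.path.join(root, p))
    ]
    with tempfile.TemporaryDirectory() as tmp:
        # 2-3. A private index: the user's own .git/index is never opened for writing.
        env = dict(os.environ, GIT_INDEX_FILE=os.path.join(tmp, "index"))
        for key in ("GIT_AUTHOR_NAME", "GIT_COMMITTER_NAME"):
            env.setdefault(key, "snapshot-sketch")  # hypothetical fallback identity
        for key in ("GIT_AUTHOR_EMAIL", "GIT_COMMITTER_EMAIL"):
            env.setdefault(key, "snapshot-sketch@example.invalid")
        git(repo, "update-index", "--add", "-z", "--stdin", env=env,
            stdin=b"".join(p + b"\0" for p in paths))
        tree = git(repo, "write-tree", env=env).strip().decode()
        commit = git(repo, "commit-tree", tree, "-m", label, env=env).strip().decode()
    # 4. Point a ref outside refs/heads at the commit; HEAD is untouched.
    ref = "refs/snapshots/%s-%d" % (time.strftime("%Y%m%dT%H%M%S"), os.getpid())
    git(repo, "update-ref", ref, commit)
    return ref, commit
```

Calling `snapshot("/path/to/repo", "before refactor")` returns the ref name and commit id; `git ls-tree -r <commit>` then lists what was captured.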
196
+ All state lives under `.git/snap/` and never touches your work tree. With
197
+ `git worktree`, the `lock` and `seq` are shared on the common `.git` (so
198
+ captures across linked worktrees are serialised and IDs stay monotonic); the
199
+ mtime cache is per worktree:
200
+
201
+ - `mtime-cache.json` — incremental hash cache; only rehashes changed files (per worktree)
202
+ - `seq` — monotonic counter for collision-free IDs across concurrent processes and linked worktrees
203
+ - `lock` — per-repo `flock` (shared across linked worktrees) so parallel hook fires don't corrupt
204
+
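To illustrate how a `flock`-guarded counter can keep IDs monotonic across concurrent processes, here is a minimal sketch. The file names `lock` and `seq` come from the list above; the helper names `repo_lock` and `next_seq`, the plain-text counter format, and the `.tmp` suffix are assumptions made for the example.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def repo_lock(state_dir):
    """Hold an exclusive flock on <state_dir>/lock for the duration of the block."""
    os.makedirs(state_dir, exist_ok=True)
    fd = os.open(os.path.join(state_dir, "lock"), os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other holders release
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def next_seq(state_dir):
    """Increment and return the counter stored in <state_dir>/seq."""
    with repo_lock(state_dir):
        path = os.path.join(state_dir, "seq")
        try:
            with open(path) as fh:
                current = int(fh.read().strip() or 0)
        except FileNotFoundError:
            current = 0
        tmp = path + ".tmp"
        with open(tmp, "w") as fh:
            fh.write(str(current + 1))
        os.replace(tmp, path)  # atomic swap, so readers never see a partial write
        return current + 1
```

`fcntl` is POSIX-only, which matches the macOS / Linux scope stated in the requirements.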
205
+ **Secrets** (`.env`, `*.pem`, `id_rsa`, `*.key`, `*firebase-adminsdk*.json`, etc.) are excluded from snapshots — **for untracked files**. The floor's job is to keep an untracked secret out of the object store; a file already **tracked by git** is exempt (its blob is already committed, so snapshotting it leaks nothing and excluding it would only leave it unprotected). Editor/merge backup & swap copies of a recognised secret (`.env~`, `id_rsa.bak`, `server.pem.swp`, `#.env#`) are excluded too. The exclusion list is **name-based** (a conservative floor): a credential with an unrecognizable name — e.g. a randomly-named cloud service-account key — won't be auto-excluded, so keep it `.gitignore`'d or outside the repo. `.gitignore`'d files are excluded by default (the floor still applies even with `--include-ignored`). Snapshots survive `git clean -fdx`. Destroyed by `rm -rf .git` (same single point of failure as git itself).
206
+
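A name-based exclusion floor of this kind might look like the sketch below. The patterns are only the ones named in the paragraph above (the real list is longer, per its "etc."), and `is_secret_name`, `should_exclude`, and the backup-suffix handling are hypothetical illustrations rather than the package's implementation.

```python
import fnmatch
import os

# Only the patterns quoted in the text; the actual list is assumed to be longer.
SECRET_PATTERNS = [".env", "*.pem", "id_rsa", "*.key", "*firebase-adminsdk*.json"]
BACKUP_SUFFIXES = ("~", ".bak", ".swp")


def _strip_backup_decoration(name):
    """Map '.env~', 'id_rsa.bak', '#.env#' back to the name they shadow."""
    if len(name) > 2 and name.startswith("#") and name.endswith("#"):
        return name[1:-1]
    for suffix in BACKUP_SUFFIXES:
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)]
    return name


def is_secret_name(path):
    """True if the file name, or the name it is a backup of, looks like a secret."""
    base = os.path.basename(path)
    candidates = {base, _strip_backup_decoration(base)}
    return any(
        fnmatch.fnmatchcase(candidate, pattern)
        for candidate in candidates
        for pattern in SECRET_PATTERNS
    )


def should_exclude(path, tracked):
    """Tracked files are exempt: their blobs are already in the object store."""
    return (not tracked) and is_secret_name(path)
```

Because the check looks only at names, a credential saved as `a8f3c1.json` passes straight through, which is the limitation the paragraph warns about.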
207
+ ---
208
+
209
+ ## Performance (measured on macOS, Python 3.14, SSD)
210
+
211
+ | Files | Cold (first capture) | Incremental (no change) | Incremental (1 file changed) |
212
+ |-------:|---------------------:|------------------------:|-----------------------------:|
213
+ | 1,000 | ~0.6 s | ~90–100 ms | ~100 ms |
214
+ | 5,000 | ~2.8 s | ~150–250 ms | ~150–250 ms |
215
+ | 10,000 | ~6.5 s | ~230–410 ms | ~230–460 ms |
216
+
217
+ Cold uses batch hashing (500 files/fork), reducing it from ~100 s to ~6.5 s vs. the naive one-process-per-file approach. Incremental uses an mtime+size+inode+exec-bit+ctime+type signature cache to skip unchanged files.
218
+
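The signature cache can be pictured with a short sketch. The six fields mirror the ones listed above; the function names, the in-memory `dict` cache, and the use of the owner exec bit are assumptions for illustration.

```python
import os
import stat


def file_signature(path):
    """Cheap change detector built from lstat fields only (no file reads)."""
    st = os.lstat(path)
    return (
        st.st_mtime_ns,                   # mtime
        st.st_size,                       # size
        st.st_ino,                        # inode
        bool(st.st_mode & stat.S_IXUSR),  # exec bit (owner)
        st.st_ctime_ns,                   # ctime
        stat.S_IFMT(st.st_mode),          # type: regular file, symlink, ...
    )


def needs_rehash(path, cache):
    """Return True and record the new signature when `path` changed since last seen."""
    signature = file_signature(path)
    if cache.get(path) == signature:
        return False
    cache[path] = signature
    return True
```

Only files for which `needs_rehash` returns `True` would be handed to the hashing step, which is what keeps the no-change case in the millisecond range.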
219
+ ---
220
+
221
+ ## Retention / GC
222
+
223
+ ```sh
224
+ # Keep last 50 snapshots, max 512 MiB total, max 72 hours
225
+ git-safepoint --repo . prune --keep-generations 50 --max-bytes 536870912 --keep-hours 72
226
+
227
+ # Dry run first
228
+ git-safepoint --repo . prune --dry-run
229
+ ```
230
+
231
+ The most recent snapshot is always preserved regardless of **any** retention
232
+ limit (size, generations, or age) — even `--keep-generations 0` or when every
233
+ snapshot is older than `--keep-hours`.
234
+
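The "newest snapshot always survives" rule can be sketched as below. This assumes a snapshot is dropped as soon as it exceeds any one limit, and it leaves out the size limit; `select_prunable` and its arguments are hypothetical, so treat it as a reading of the description rather than the actual prune logic.

```python
def select_prunable(snapshots, now, keep_generations=None, keep_hours=None):
    """Return ids to drop. `snapshots` is [(id, created_epoch_seconds)], newest first."""
    prunable = []
    for index, (snap_id, created) in enumerate(snapshots):
        if index == 0:
            continue  # the newest snapshot survives every limit
        too_many = keep_generations is not None and index >= keep_generations
        too_old = keep_hours is not None and (now - created) > keep_hours * 3600
        if too_many or too_old:
            prunable.append(snap_id)
    return prunable
```

With `keep_generations=0` every snapshot except the newest is selected, matching the guarantee stated above.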
235
+ `prune` runs `git gc` with the prune/reflog grace **pinned** on the command line
236
+ to specific safe values — reflog expiry is pinned to `never` in both directions
237
+ (`gc.reflogExpire=never`, `gc.reflogExpireUnreachable=never`), and
238
+ `gc.pruneExpire=2.weeks.ago` — so an aggressive user `gc.*` config cannot
239
+ force an early prune. Your own unreachable-but-recoverable objects (dropped stashes,
240
+ pre-reset commits, reflog history) are not collaterally collected. Dropped
241
+ snapshot objects are reclaimed on git's normal grace schedule. Use `--no-gc` to
242
+ drop refs without any gc.
243
+
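Pinning the grace values on the command line could be done with git's `-c` option, which takes precedence over values in config files. The sketch below only builds the argument list; the function name and the `--quiet` flag are assumptions, and the real invocation may differ.

```python
def pinned_gc_argv(repo):
    """Build a `git gc` command whose expiry settings ignore the user's gc.* config."""
    return [
        "git", "-C", repo,
        "-c", "gc.reflogExpire=never",
        "-c", "gc.reflogExpireUnreachable=never",
        "-c", "gc.pruneExpire=2.weeks.ago",
        "gc", "--quiet",
    ]
```

Passing the list to `subprocess.run(..., check=True)` would run the collection with the pinned values.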
244
+ ---
245
+
246
+ ## Known limitations
247
+
248
+ The hook's destructive-command detection uses a verb-allowlist approach optimized for zero false positives. It misses destruction hidden inside arguments:
249
+
250
+ - `find . -exec rm {} \;` — verb is `find`, not `rm`
251
+ - `` echo `rm -rf x` `` — verb is `echo`
252
+ - `(rm -rf x)` — leading token is `(`
253
+ - `python3 -c "open('f','w').write(...)"` — verb is `python3`
254
+
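A leading-verb check makes it easy to see why the cases above slip through. This is a simplified sketch using `shlex`: the verb set contains only `rm` and `mv` (the two named earlier in this README), the git subcommands are the ones listed in this section, and compound-command handling is omitted. `looks_destructive` is a hypothetical name.

```python
import shlex

DESTRUCTIVE_VERBS = {"rm", "mv"}
DESTRUCTIVE_GIT_SUBCOMMANDS = {
    "checkout", "switch", "restore", "reset", "clean", "rm", "stash",
}


def looks_destructive(command_line):
    """Classify a command by its first token only."""
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        return False  # unbalanced quotes: treat as unrecognised
    if not tokens:
        return False
    verb = tokens[0]
    if verb == "git" and len(tokens) > 1:
        if tokens[1] == "branch":
            return "-D" in tokens[2:]
        return tokens[1] in DESTRUCTIVE_GIT_SUBCOMMANDS
    return verb in DESTRUCTIVE_VERBS
```

Here `looks_destructive("find . -exec rm {} \\;")` is `False` because the verb is `find`, which is exactly the gap described in the list.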
255
+ The git-subcommand allowlist (`checkout`/`switch`/`restore`/`reset`/`clean`/`rm`/`stash`, plus `branch -D`) is deliberately narrow: recovery/abort subcommands that can touch the work tree — `rebase`/`merge`/`am`/`cherry-pick --abort`, `read-tree -u`, `checkout-index -f`, `worktree remove` — are **not** individually thorough-mode triggers. Most refuse to run with conflicting uncommitted changes (so they don't silently destroy unsaved work), and any residual case is covered by conservative mode below; the trade-off keeps the false-positive rate near zero.
256
+
257
+ Conservative mode (used by the Claude Code hook) covers most of these by snapshotting on every tool call rather than only on destructive ones. For an *undetected* destructive command above, the conservative path still captures content changes (the file signature includes mtime, size, inode, exec-bit and **ctime**, so even an external `tar -x` / `rsync --times` / `cp -p` that restores mtime is caught). The remaining sliver is a same-size in-place content swap on a filesystem whose ctime resolution is too coarse to separate two writes in one tick; for those, only a destructive command the allowlist *does* recognise force-rehashes. The zsh preexec path has no conservative fallback, so the verb-allowlist gaps apply there in full.
258
+
259
+ Control-flow / compound bodies *are* detected (`if …; then rm …; fi`, `for/while … do rm …`, `{ rm …; }`). One repo-resolution gap remains: a **bare-name** target reached only after a `cd` in the same command line — `cd sub && rm -rf nestedrepo` — resolves `nestedrepo` against the original cwd, so a *separate nested git repo* at `sub/nestedrepo` is not found (the outer repo is still snapshotted). Use a path that contains a `/` (`rm -rf sub/nestedrepo`) and it is found.
260
+
261
+ **Batch restore is per-file atomic, not all-or-nothing.** Each file lands via `os.replace` (a crash never leaves a half-written file), but `restore --all` / `--dir` over many files is not a single transaction. If interrupted (Ctrl-C / kill) midway, the work tree is left part-restored; git-safepoint prints how many files it restored so you can re-run the same restore to finish (re-restoring a file is idempotent, and any overwritten originals are saved under `.snap-bak/`).
262
+
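Per-file atomicity with a backup of the overwritten original might be sketched as follows. The `.snap-bak/` directory and the `.snap-restore-tmp` suffix are taken from this README; `restore_file`, `restore_many`, and the "skip when content already matches" rule are assumptions chosen so that a re-run does not overwrite the saved original.

```python
import os
import shutil


def restore_file(work_tree, rel_path, data):
    """Atomically write `data` to work_tree/rel_path; return False if already current."""
    target = os.path.join(work_tree, rel_path)
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    if os.path.isfile(target):
        with open(target, "rb") as fh:
            if fh.read() == data:
                return False  # already restored: re-running is a no-op
        backup = os.path.join(work_tree, ".snap-bak", rel_path)
        os.makedirs(os.path.dirname(backup), exist_ok=True)
        shutil.copy2(target, backup)  # keep the content about to be overwritten
    tmp = target + ".snap-restore-tmp"
    with open(tmp, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, target)  # atomic rename: never a half-written target
    return True


def restore_many(work_tree, files):
    """Restore {rel_path: bytes}; report progress even if interrupted midway."""
    done = 0
    try:
        for rel_path, data in files.items():
            restore_file(work_tree, rel_path, data)
            done += 1
    finally:
        print("restored %d of %d files" % (done, len(files)))
    return done
```

Each file is safe on its own, but an interrupt between two iterations leaves the tree part-restored, which is the limitation described above.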
263
+ **Capture is not a point-in-time snapshot.** Files are stat'd and then hashed in separate steps, and the per-repo lock only excludes other git-safepoint processes — not your editor or build. If an external writer changes a file *during* a capture, that file may be stored with a slightly torn view (new bytes against the pre-write mode); it is never corrupt git data, and the next capture re-hashes it. Submodules record only their pinned commit (the pin is restored manually, not the submodule work tree); in **conservative** mode a submodule-only HEAD change (no change to the superproject's own files) is skipped before any tree is built, so the pin is captured best-effort there — a destructive command still force-captures the live pin.
264
+
265
+ **Reserved work-tree names.** git-safepoint never captures its own restore artifacts: the `.snap-bak/` directory (pre-overwrite backups) and any path ending in `.snap-restore-tmp` (in-flight restore temp files). A user file that happens to use those names is excluded from snapshots — avoid them.
266
+
267
+ **Concurrency / `.git` on a network filesystem.** The per-repo lock uses
268
+ `fcntl.flock`, which is reliable on a local filesystem. On some NFS mounts
269
+ (`nolock`/`local_lock`) and overlay/network filesystems `flock` may not actually
270
+ serialize across hosts; the lock then degrades silently. The collision-free ID
271
+ mint (create-only ref + retry) still prevents corrupt refs, and the mtime cache
272
+ is a pure optimization, so the worst realistic outcome is duplicated work / a
273
+ cold re-hash — not ref-store corruption. Keep `.git` on a local filesystem for
274
+ guaranteed serialization.
275
+
276
+ **Local metadata leak.** Two minor, local-only caveats: a snapshot's commit
277
+ message records the (truncated) triggering command **verbatim — there is no
278
+ redaction**, so avoid putting a secret directly on a destructive command line
279
+ (e.g. `... --token=…`); and a snapshot of a symlink
280
+ stores the link's *target path string* (never the secret's bytes). A normal push
281
+ (`refs/heads` / `refs/tags`) does not transfer `refs/snapshots/`, so this stays
282
+ on your machine by default — but `git push --mirror`, `git clone --mirror`, and
283
+ `git bundle --all` *do* carry the snapshot refs and their objects, so if you
284
+ mirror the repo as a backup, prune first or keep the destination inside your
285
+ trust boundary. Either way, if a command line or a symlink target itself contains
286
+ a secret, that *string* lives in your local object store until the snapshot is
287
+ pruned.
288
+
289
+ ---
290
+
291
+ ## Running tests
292
+
293
+ ```sh
294
+ python3 -m unittest discover -s tests -p 'test_*.py'
295
+ # → 305 tests pass (macOS / Linux; 2 non-UTF-8-name tests skip on macOS)
296
+ ```
297
+
298
+ ---
299
+
300
+ ## Status
301
+
302
+ MVP — full test suite passing (see [Running tests](#running-tests)), live in Claude Code sessions.
303
+
304
+ **Implemented**: snapshot engine (tracked + untracked + opt-in .gitignore'd), secret exclusion (tracked files exempt), incremental capture, debounce + tree-SHA dedup, collision-free IDs across concurrent processes, staged-index variant capture (`restore --staged`), single-file / subtree / all / interactive restore, diff, GC/prune, Claude Code PreToolUse hook (conservative mode, live-verified), zsh preexec adapter.
305
+
306
+ **Not yet**: PyPI package, daemon mode (fswatch / kqueue), interactive TUI, off-`.git` mirror.
307
+
308
+ ---
309
+
310
+ ## Requirements
311
+
312
+ - Python 3.9+
313
+ - git 2.x for snapshot / restore; **git ≥ 2.25** for the snapshot-vs-work-tree
314
+ `diff` / interactive-restore preview (it uses `git add --pathspec-file-nul`;
315
+ on older git that one feature reports an error instead of a wrong empty diff)
316
+ - macOS or Linux (Windows: untested)
317
+
318
+ ---
319
+
320
+ ## License
321
+
322
+ MIT
@@ -0,0 +1,294 @@
1
+ # git-safepoint
2
+
3
+ **Local-first work-tree snapshot safety net for AI-assisted coding.**
4
+
5
+ Captures tracked + untracked files before every destructive command, and lets
6
+ you restore just the file you lost. No cloud. No accounts. Standard library
7
+ only. Touches neither your index nor HEAD.
8
+
9
+ [![MIT License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
10
+ [![Python 3.9+](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org)
11
+ [![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux-lightgrey.svg)](https://github.com/takahira/git-safepoint)
12
+
13
+ ---
14
+
15
+ ## The problem
16
+
17
+ AI coding tools have a structural blind spot: **bash/terminal commands and untracked files**.
18
+
19
+ - Claude Code [`/rewind`](https://code.claude.com/docs/en/checkpointing) explicitly documents it cannot restore bash changes (`rm`, `mv`, `cp`, …)
20
+ - [copilot-cli #1675](https://github.com/github/copilot-cli/issues/1675) (Feb 2026): checkpoint restore deleted ~1 GB of untracked files via `git clean -fd`
21
+ - Replit (Jul 2025), Gemini CLI (Jul 2025), PocketOS (Apr 2026) — same pattern, different tools
22
+
23
+ `git reflog` won't help. These files were never staged.
24
+
25
+ git-safepoint fills that gap.
26
+
27
+ ---
28
+
29
+ ## What makes it different
30
+
31
+ <!-- markdownlint-disable MD060 -->
32
+ | Feature | git-safepoint | mrq¹ | ckpt² | Re_gent³ | Native AI⁴ |
33
+ |---------------------------------|---------|-----------------|---------------------|---------------------|--------------------|
34
+ | Local-only (no cloud upload) | Yes | No (cloud only) | Yes | Yes | Yes |
35
+ | Captures untracked files | Yes | Partial | Unverified | Yes | No |
36
+ | Covers manual bash / ext tools | Yes | Yes (fs-watch) | Yes | No (agent-hook) | No |
37
+ | File-level selective restore | Yes | No (whole snap) | Partial (gen only) | No (not impl) | No (session only) |
38
+ | Free OSS | Yes | No ($9–29/mo) | Yes | Yes | Yes |
39
+ <!-- markdownlint-enable MD060 -->
40
+
41
+ ¹ [mrq](https://getmrq.com) — commercial cloud snapshot service
42
+ ² [ckpt](https://github.com/mohshomis/ckpt) — OSS, TypeScript/Node
43
+ ³ [Re_gent](https://github.com/regent-vcs/re_gent) — OSS, Go, agent-hook only
44
+ ⁴ Claude Code `/rewind`, Cursor checkpoints, Gemini CLI, Codex CLI
45
+
46
+ git-safepoint is the only tool we're aware of that satisfies all four simultaneously: **local OSS × true untracked-safe × covers manual bash × snapshot-id / file-level restore**.
47
+
48
+ ---
49
+
50
+ ## Install
51
+
52
+ Zero runtime dependencies — standard library only, Python 3.9+.
53
+
54
+ ```sh
55
+ # Recommended: install the `git-safepoint` command (pipx keeps it isolated)
56
+ pipx install git+https://github.com/takahira/git-safepoint
57
+ # ... or: pip install git+https://github.com/takahira/git-safepoint
58
+
59
+ # Or run from a clone without installing (also gives you the adapters/ hooks)
60
+ git clone https://github.com/takahira/git-safepoint
61
+ cd git-safepoint
62
+ ```
63
+
64
+ After a pip/pipx install the CLI is `git-safepoint` (or `python3 -m git_safepoint`).
65
+ From a clone it is `python3 git_safepoint.py`. The Claude Code / zsh hook adapters
66
+ under `adapters/` ship with the clone (and the sdist) — use the clone if you want
67
+ them.
68
+
69
+ ## Quick start
70
+
71
+ ```sh
72
+ # Snapshot the current work tree (tracked + untracked; secrets auto-excluded)
73
+ git-safepoint --repo /path/to/your/repo snapshot --label "before refactor"
74
+
75
+ # Also capture .gitignore'd build artifacts (secrets stay excluded even here)
76
+ git-safepoint --repo . snapshot --include-ignored 'output/' --include-ignored '*.log'
77
+
78
+ # List snapshots (newest first)
79
+ git-safepoint --repo . list
80
+ git-safepoint --repo . list --json
81
+
82
+ # Diff between two snapshots (or vs current work tree)
83
+ git-safepoint --repo . diff <id1> [<id2>] [--path FILE]
84
+
85
+ # Restore a single file
86
+ git-safepoint --repo . restore <id> path/to/lost-file.txt
87
+
88
+ # Restore a subtree or everything
89
+ git-safepoint --repo . restore <id> --dir notes/
90
+ git-safepoint --repo . restore <id> --all --yes
91
+
92
+ # Interactive restore (TTY: list → diff → confirm)
93
+ git-safepoint --repo . restore --interactive
94
+
95
+ # Recover a partially-staged version (git add -p content lost to reset --hard)
96
+ git-safepoint --repo . restore <id> path/to/file --staged
97
+
98
+ # GC / retention
99
+ git-safepoint --repo . prune --keep-generations 50 --dry-run
100
+ ```
101
+
102
+ From a clone (no install), replace `git-safepoint` with `python3 git_safepoint.py`.
103
+
104
+ ---
105
+
106
+ ## Claude Code hook (auto-capture before every tool call)
107
+
108
+ Add to `~/.claude/settings.json`:
109
+
110
+ ```json
111
+ {
112
+ "hooks": {
113
+ "PreToolUse": [
114
+ {
115
+ "matcher": "Bash",
116
+ "hooks": [{"type": "command", "command": "python3 /abs/path/to/git-safepoint/adapters/pretooluse_hook.py"}]
117
+ },
118
+ {
119
+ "matcher": "Write",
120
+ "hooks": [{"type": "command", "command": "python3 /abs/path/to/git-safepoint/adapters/pretooluse_hook.py"}]
121
+ },
122
+ {
123
+ "matcher": "Edit",
124
+ "hooks": [{"type": "command", "command": "python3 /abs/path/to/git-safepoint/adapters/pretooluse_hook.py"}]
125
+ },
126
+ {
127
+ "matcher": "NotebookEdit",
128
+ "hooks": [{"type": "command", "command": "python3 /abs/path/to/git-safepoint/adapters/pretooluse_hook.py"}]
129
+ }
130
+ ]
131
+ }
132
+ }
133
+ ```
134
+
135
+ Reload your Claude Code window. The hook fires before every Bash, Write, Edit, and NotebookEdit call:
136
+
137
+ - **Conservative mode**: skips if nothing changed (mtime check + tree SHA dedup), so it's low-overhead on read-only calls
138
+ - **Fail-open**: always exits 0 — the safety net never blocks the agent
139
+ - Detects git repos from command paths when `cwd` is outside any repo
140
+
141
+ Hooks take no CLI flags, so to also capture `.gitignore`'d build artifacts through the hook/preexec path, export a `:`-separated allow-list — e.g. `export GIT_SAFEPOINT_INCLUDE_IGNORED='output/:dist/'`. Secrets stay excluded even then.
142
+
143
+ Live-verified with the Claude Code VSCode extension (June 2026).
144
+
145
+ ---
146
+
147
+ ## zsh preexec (auto-capture before terminal commands)
148
+
149
+ ```sh
150
+ export GIT_SAFEPOINT_PY=/abs/path/to/git-safepoint/git_safepoint.py
151
+ source /abs/path/to/git-safepoint/adapters/git-safepoint-preexec.zsh
152
+ ```
153
+
154
+ Snapshots before destructive shell commands (`rm`, `mv`, `git reset --hard`, etc.).
155
+
156
+ ---
157
+
158
+ ## How it works
159
+
160
+ git-safepoint uses git plumbing only — no diffs, no stash, no index changes:
161
+
162
+ 1. `git ls-files --cached --others --exclude-standard` enumerates tracked + untracked
163
+ 2. Files are stored with `git hash-object -w` into the repo's object store (batch mode: 500 files/fork)
164
+ 3. A **private index** (separate from yours) builds a tree with `git write-tree`
165
+ 4. A shadow commit lands at `refs/snapshots/<timestamp-seq-pid>` — HEAD and your index are untouched
166
+ 5. If the **staged index** differs from both the work tree and HEAD (a `git add -p` / stage-then-edit state that `reset --hard` would otherwise destroy), that index is captured as the snapshot commit's parent — `list` marks it `+staged`, and `restore --staged <id> …` / `diff --staged …` reach it. Built from a copy of the index, so the real index is never touched.
167
+
168
+ All state lives under `.git/snap/` and never touches your work tree. With
169
+ `git worktree`, the `lock` and `seq` are shared on the common `.git` (so
170
+ captures across linked worktrees are serialised and IDs stay monotonic); the
171
+ mtime cache is per worktree:
172
+
173
+ - `mtime-cache.json` — incremental hash cache; only rehashes changed files (per worktree)
174
+ - `seq` — monotonic counter for collision-free IDs across concurrent processes and linked worktrees
175
+ - `lock` — per-repo `flock` (shared across linked worktrees) so parallel hook fires don't corrupt
176
+
177
+ **Secrets** (`.env`, `*.pem`, `id_rsa`, `*.key`, `*firebase-adminsdk*.json`, etc.) are excluded from snapshots — **for untracked files**. The floor's job is to keep an untracked secret out of the object store; a file already **tracked by git** is exempt (its blob is already committed, so snapshotting it leaks nothing and excluding it would only leave it unprotected). Editor/merge backup & swap copies of a recognised secret (`.env~`, `id_rsa.bak`, `server.pem.swp`, `#.env#`) are excluded too. The exclusion list is **name-based** (a conservative floor): a credential with an unrecognizable name — e.g. a randomly-named cloud service-account key — won't be auto-excluded, so keep it `.gitignore`'d or outside the repo. `.gitignore`'d files are excluded by default (the floor still applies even with `--include-ignored`). Snapshots survive `git clean -fdx`. Destroyed by `rm -rf .git` (same single point of failure as git itself).
178
+
179
+ ---
180
+
181
+ ## Performance (measured on macOS, Python 3.14, SSD)
182
+
183
+ | Files | Cold (first capture) | Incremental (no change) | Incremental (1 file changed) |
184
+ |-------:|---------------------:|------------------------:|-----------------------------:|
185
+ | 1,000 | ~0.6 s | ~90–100 ms | ~100 ms |
186
+ | 5,000 | ~2.8 s | ~150–250 ms | ~150–250 ms |
187
+ | 10,000 | ~6.5 s | ~230–410 ms | ~230–460 ms |
188
+
189
+ Cold capture uses batch hashing (500 files per fork), which cuts the 10,000-file case from ~100 s with a naive one-process-per-file approach to ~6.5 s. Incremental capture uses an mtime+size+inode+exec-bit+ctime+type signature cache to skip unchanged files.
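The two techniques named above can be sketched roughly as follows. This is an illustrative sketch, not the package's code: `chunked`, `hash_batch`, `file_signature` and `needs_rehash` are hypothetical names, and it assumes batching is done through `git hash-object -w --stdin-paths`, which the README does not confirm.

```python
import os
import stat
import subprocess
from typing import Dict, Iterator, List, Sequence, Tuple

BATCH_SIZE = 500  # files per git process, the figure quoted above

Signature = Tuple[int, int, int, bool, int, int]


def chunked(paths: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` paths."""
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


def hash_batch(repo: str, paths: Sequence[str]) -> List[str]:
    """Write blobs for one batch using a single git process.

    `git hash-object -w --stdin-paths` reads one path per line and prints one
    object ID per line in the same order. Paths containing newlines are not
    handled by this sketch.
    """
    if not paths:
        return []
    out = subprocess.run(
        ["git", "-C", repo, "hash-object", "-w", "--stdin-paths"],
        input="\n".join(paths) + "\n",
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def file_signature(path: str) -> Signature:
    """Cheap change detector: (mtime, size, inode, exec bit, ctime, type)."""
    st = os.lstat(path)  # lstat describes a symlink itself, not its target
    return (
        st.st_mtime_ns,
        st.st_size,
        st.st_ino,
        bool(st.st_mode & stat.S_IXUSR),
        st.st_ctime_ns,
        stat.S_IFMT(st.st_mode),
    )


def needs_rehash(path: str, cache: Dict[str, Signature]) -> bool:
    """Return True (and refresh the cache entry) when the signature changed."""
    sig = file_signature(path)
    if cache.get(path) == sig:
        return False
    cache[path] = sig
    return True
```

With 10,000 files, `chunked` yields 20 batches, so a cold capture would spawn 20 git processes instead of 10,000. On later runs only the paths for which `needs_rehash` returns True would be passed to `hash_batch`.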
190
+
191
+ ---
192
+
193
+ ## Retention / GC
194
+
195
+ ```sh
196
+ # Keep last 50 snapshots, max 512 MiB total, max 72 hours
197
+ git-safepoint --repo . prune --keep-generations 50 --max-bytes 536870912 --keep-hours 72
198
+
199
+ # Dry run first
200
+ git-safepoint --repo . prune --dry-run
201
+ ```
202
+
203
+ The most recent snapshot is always preserved regardless of **any** retention
204
+ limit (size, generations, or age) — even `--keep-generations 0` or when every
205
+ snapshot is older than `--keep-hours`.
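One way to read the retention rules is sketched below. The `Snapshot` record and `select_prunable` are hypothetical, and the sketch assumes a snapshot is dropped as soon as any single limit excludes it; the README does not spell out how the three limits combine, so treat this as an interpretation rather than the tool's actual logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:  # hypothetical record, not git-safepoint's own type
    snap_id: str
    age_hours: float
    size_bytes: int


def select_prunable(
    snapshots: List[Snapshot],  # must be ordered newest first
    keep_generations: Optional[int] = None,
    max_bytes: Optional[int] = None,
    keep_hours: Optional[float] = None,
) -> List[Snapshot]:
    """Return the snapshots that violate at least one limit.

    Assumption: any single violated limit is enough to drop a snapshot.
    The newest snapshot (index 0) is never returned, whatever the limits say.
    """
    prunable = []
    running_bytes = 0
    for index, snap in enumerate(snapshots):
        running_bytes += snap.size_bytes
        if index == 0:
            continue  # the most recent snapshot is always preserved
        too_many = keep_generations is not None and index >= keep_generations
        too_big = max_bytes is not None and running_bytes > max_bytes
        too_old = keep_hours is not None and snap.age_hours > keep_hours
        if too_many or too_big or too_old:
            prunable.append(snap)
    return prunable
```

Calling `select_prunable(snaps, keep_generations=0)` still leaves the newest entry out of the result, which mirrors the guarantee described above.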
206
+
207
+ `prune` runs `git gc` with the prune/reflog grace **pinned** on the command line
208
+ to specific safe values — reflog expiry is pinned to `never` in both directions
209
+ (`gc.reflogExpire=never`, `gc.reflogExpireUnreachable=never`), and
210
+ `gc.pruneExpire=2.weeks.ago`. An aggressive `gc.*` setting in your git config
210
+ therefore cannot force an early prune. Your own unreachable-but-recoverable objects (dropped stashes,
212
+ pre-reset commits, reflog history) are not collaterally collected. Dropped
213
+ snapshot objects are reclaimed on git's normal grace schedule. Use `--no-gc` to
214
+ drop refs without any gc.
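Pinning configuration on the command line can be done with git's `-c key=value` option, which takes precedence over config files. The sketch below builds such a command from the three values quoted above; `pinned_gc_argv` and `run_pinned_gc` are hypothetical helper names, and the exact command git-safepoint runs may differ.

```python
import subprocess
from typing import List

# Values quoted in the paragraph above.
PINNED_GC_CONFIG = {
    "gc.reflogExpire": "never",
    "gc.reflogExpireUnreachable": "never",
    "gc.pruneExpire": "2.weeks.ago",
}


def pinned_gc_argv(repo: str) -> List[str]:
    """Build a `git gc` command whose grace settings ignore user config."""
    argv = ["git", "-C", repo]
    for key, value in PINNED_GC_CONFIG.items():
        argv += ["-c", f"{key}={value}"]  # -c overrides values from config files
    return argv + ["gc", "--quiet"]


def run_pinned_gc(repo: str) -> None:
    """Run the pinned gc; raises CalledProcessError if git fails."""
    subprocess.run(pinned_gc_argv(repo), check=True)
```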
215
+
216
+ ---
217
+
218
+ ## Known limitations
219
+
220
+ The hook's destructive-command detection uses a verb-allowlist approach optimized for zero false positives. It misses destruction hidden inside arguments:
221
+
222
+ - `find . -exec rm {} \;` — verb is `find`, not `rm`
223
+ - `` echo `rm -rf x` `` — verb is `echo`
224
+ - `(rm -rf x)` — leading token is `(`
225
+ - `python3 -c "open('f','w').write(...)"` — verb is `python3`
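The gaps listed above follow directly from looking only at the leading verb. A deliberately simplified sketch, assuming a tokenizer based on `shlex` and using illustrative allowlists (`DESTRUCTIVE_VERBS` and `looks_destructive` are hypothetical, and this sketch ignores the control-flow handling the real detector has):

```python
import shlex

# Illustrative subsets only; the package's real allowlists are larger.
DESTRUCTIVE_VERBS = {"rm"}
DESTRUCTIVE_GIT_SUBCOMMANDS = {
    "checkout", "switch", "restore", "reset", "clean", "rm", "stash",
}


def looks_destructive(command: str) -> bool:
    """Classify a command line by its leading verb only."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes: treat as not recognised
        return False
    if not tokens:
        return False
    verb = tokens[0]
    if verb in DESTRUCTIVE_VERBS:
        return True
    if verb == "git" and len(tokens) > 1:
        sub = tokens[1]
        if sub in DESTRUCTIVE_GIT_SUBCOMMANDS:
            return True
        return sub == "branch" and "-D" in tokens[2:]
    return False
```

Under this scheme `rm -rf build` and `git reset --hard` are flagged, while `find . -exec rm {} \;` is not, because the only token inspected is `find`.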
226
+
227
+ The git-subcommand allowlist (`checkout`/`switch`/`restore`/`reset`/`clean`/`rm`/`stash`, plus `branch -D`) is deliberately narrow: recovery/abort subcommands that can touch the work tree — `rebase`/`merge`/`am`/`cherry-pick --abort`, `read-tree -u`, `checkout-index -f`, `worktree remove` — are **not** individually thorough-mode triggers. Most refuse to run with conflicting uncommitted changes (so they don't silently destroy unsaved work), and any residual case is covered by conservative mode below; the trade-off keeps the false-positive rate near zero.
228
+
229
+ Conservative mode (used by the Claude Code hook) covers most of these by snapshotting on every tool call rather than only on destructive ones. For an *undetected* destructive command above, the conservative path still captures content changes (the file signature includes mtime, size, inode, exec-bit and **ctime**, so even an external `tar -x` / `rsync --times` / `cp -p` that restores mtime is caught). The remaining sliver is a same-size in-place content swap on a filesystem whose ctime resolution is too coarse to separate two writes in one tick; for those, only a destructive command the allowlist *does* recognise force-rehashes. The zsh preexec path has no conservative fallback, so the verb-allowlist gaps apply there in full.
230
+
231
+ Control-flow / compound bodies *are* detected (`if …; then rm …; fi`, `for/while … do rm …`, `{ rm …; }`). One repo-resolution gap remains: a **bare-name** target reached only after a `cd` in the same command line — `cd sub && rm -rf nestedrepo` — resolves `nestedrepo` against the original cwd, so a *separate nested git repo* at `sub/nestedrepo` is not found (the outer repo is still snapshotted). Use a path that contains a `/` (`rm -rf sub/nestedrepo`) and it is found.
232
+
233
+ **Batch restore is per-file atomic, not all-or-nothing.** Each file lands via `os.replace` (a crash never leaves a half-written file), but `restore --all` / `--dir` over many files is not a single transaction. If interrupted (Ctrl-C / kill) midway, the work tree is left part-restored; git-safepoint prints how many files it restored so you can re-run the same restore to finish (re-running is idempotent, and any overwritten originals are saved under `.snap-bak/`).
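The per-file pattern can be sketched as a write to a temporary file followed by `os.replace`. The helpers `restore_file` and `restore_many` below are hypothetical; the `.snap-bak/` directory and the `.snap-restore-tmp` suffix come from this README, but the backup layout shown here is an assumption.

```python
import os
import shutil
from typing import Dict


def restore_file(work_tree: str, rel_path: str, content: bytes) -> None:
    """Restore one file atomically, backing up any existing regular file."""
    target = os.path.join(work_tree, rel_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    if os.path.isfile(target):
        # Assumed layout: the backup mirrors the relative path under .snap-bak/
        backup = os.path.join(work_tree, ".snap-bak", rel_path)
        os.makedirs(os.path.dirname(backup), exist_ok=True)
        shutil.copy2(target, backup)
    tmp = target + ".snap-restore-tmp"
    with open(tmp, "wb") as fh:
        fh.write(content)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, target)  # atomic swap: readers see old or new, never half


def restore_many(work_tree: str, files: Dict[str, bytes]) -> int:
    """Restore files one by one and return how many landed.

    Not a transaction: an interruption leaves earlier files restored.
    """
    restored = 0
    for rel_path, content in files.items():
        restore_file(work_tree, rel_path, content)
        restored += 1
    return restored
```

Because each file is swapped in independently, re-running `restore_many` with the same input after an interruption simply rewrites the same contents.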
234
+
235
+ **Capture is not a point-in-time snapshot.** Files are stat'd and then hashed in separate steps, and the per-repo lock only excludes other git-safepoint processes — not your editor or build. If an external writer changes a file *during* a capture, that file may be stored with a slightly torn view (new bytes against the pre-write mode); it is never corrupt git data, and the next capture re-hashes it. Submodules record only their pinned commit (the pin is restored manually, not the submodule work tree); in **conservative** mode a submodule-only HEAD change (no change to the superproject's own files) is skipped before any tree is built, so the pin is captured best-effort there — a destructive command still force-captures the live pin.
236
+
237
+ **Reserved work-tree names.** git-safepoint never captures its own restore artifacts: the `.snap-bak/` directory (pre-overwrite backups) and any path ending in `.snap-restore-tmp` (in-flight restore temp files). A user file that happens to use those names is excluded from snapshots — avoid them.
238
+
239
+ **Concurrency / `.git` on a network filesystem.** The per-repo lock uses
240
+ `fcntl.flock`, which is reliable on a local filesystem. On some NFS mounts
241
+ (`nolock`/`local_lock`) and overlay/network filesystems `flock` may not actually
242
+ serialize across hosts; the lock then degrades silently. The collision-free ID
243
+ mint (create-only ref + retry) still prevents corrupt refs, and the mtime cache
244
+ is a pure optimization, so the worst realistic outcome is duplicated work / a
245
+ cold re-hash — not ref-store corruption. Keep `.git` on a local filesystem for
246
+ guaranteed serialization.
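For reference, an exclusive `fcntl.flock` lock looks roughly like the sketch below. `repo_lock` is a hypothetical helper, not the package's API, and it relies on the local-filesystem behavior described above.

```python
import contextlib
import fcntl
import os
from typing import Iterator


@contextlib.contextmanager
def repo_lock(lock_path: str) -> Iterator[int]:
    """Hold an exclusive advisory lock on `lock_path` for the with-block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

The lock is advisory: it only excludes other processes that also call `flock` on the same file, which is why an editor or build tool writing to the work tree is not held back by it.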
247
+
248
+ **Local metadata leak.** Two minor, local-only caveats: a snapshot's commit
249
+ message records the (truncated) triggering command **verbatim — there is no
250
+ redaction**, so avoid putting a secret directly on a destructive command line
251
+ (e.g. `... --token=…`); and a snapshot of a symlink
252
+ stores the link's *target path string* (never the secret's bytes). A normal push
253
+ (`refs/heads` / `refs/tags`) does not transfer `refs/snapshots/`, so this stays
254
+ on your machine by default — but `git push --mirror`, `git clone --mirror`, and
255
+ `git bundle --all` *do* carry the snapshot refs and their objects, so if you
256
+ mirror the repo as a backup, prune first or keep the destination inside your
257
+ trust boundary. Either way, if a command line or a symlink target itself contains
258
+ a secret, that *string* lives in your local object store until the snapshot is
259
+ pruned.
260
+
261
+ ---
262
+
263
+ ## Running tests
264
+
265
+ ```sh
266
+ python3 -m unittest discover -s tests -p 'test_*.py'
267
+ # → 305 tests pass (macOS / Linux; 2 non-UTF-8-name tests skip on macOS)
268
+ ```
269
+
270
+ ---
271
+
272
+ ## Status
273
+
274
+ MVP — full test suite passing (see [Running tests](#running-tests)), live in Claude Code sessions.
275
+
276
+ **Implemented**: snapshot engine (tracked + untracked + opt-in .gitignore'd), secret exclusion (tracked files exempt), incremental capture, debounce + tree-SHA dedup, collision-free IDs across concurrent processes, staged-index variant capture (`restore --staged`), single-file / subtree / all / interactive restore, diff, GC/prune, Claude Code PreToolUse hook (conservative mode, live-verified), zsh preexec adapter.
277
+
278
+ **Not yet**: PyPI package, daemon mode (fswatch / kqueue), interactive TUI, off-`.git` mirror.
279
+
280
+ ---
281
+
282
+ ## Requirements
283
+
284
+ - Python 3.9+
285
+ - git 2.x for snapshot / restore; **git ≥ 2.25** for the snapshot-vs-work-tree
286
+ `diff` / interactive-restore preview (it uses `git add --pathspec-file-nul`;
287
+ on older git that one feature reports an error instead of a wrong empty diff)
288
+ - macOS or Linux (Windows: untested)
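A feature gate on the git version could look like the following sketch. `parse_git_version` and `supports_pathspec_file_nul` are hypothetical names; only the 2.25 threshold comes from the requirement above, and how git-safepoint actually detects an old git is not stated here.

```python
import re
import subprocess
from typing import Tuple

MIN_GIT_FOR_DIFF = (2, 25)  # threshold stated in the requirements above


def parse_git_version(output: str) -> Tuple[int, int]:
    """Extract (major, minor) from `git --version` output."""
    match = re.search(r"git version (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognised git version string: {output!r}")
    return int(match.group(1)), int(match.group(2))


def supports_pathspec_file_nul() -> bool:
    """Return True when the installed git meets the 2.25 threshold."""
    out = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return parse_git_version(out) >= MIN_GIT_FOR_DIFF
```

Tuple comparison keeps the check correct for two-digit minor versions, where a plain string comparison would sort `2.9` above `2.25`.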
289
+
290
+ ---
291
+
292
+ ## License
293
+
294
+ MIT