claude-code-session-sync 0.1.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- claude_code_session_sync-0.1.0.dist-info/METADATA +87 -0
- claude_code_session_sync-0.1.0.dist-info/RECORD +31 -0
- claude_code_session_sync-0.1.0.dist-info/WHEEL +5 -0
- claude_code_session_sync-0.1.0.dist-info/entry_points.txt +2 -0
- claude_code_session_sync-0.1.0.dist-info/licenses/LICENSE +21 -0
- claude_code_session_sync-0.1.0.dist-info/top_level.txt +1 -0
- claude_session_sync/__init__.py +11 -0
- claude_session_sync/acks.py +279 -0
- claude_session_sync/anomaly.py +161 -0
- claude_session_sync/apply.py +874 -0
- claude_session_sync/atomicio.py +621 -0
- claude_session_sync/bootstrap.py +370 -0
- claude_session_sync/canonical.py +185 -0
- claude_session_sync/classify.py +133 -0
- claude_session_sync/cli.py +1065 -0
- claude_session_sync/config.py +128 -0
- claude_session_sync/doctor.py +351 -0
- claude_session_sync/fuzzy.py +136 -0
- claude_session_sync/lineset.py +143 -0
- claude_session_sync/memory.py +953 -0
- claude_session_sync/merge.py +836 -0
- claude_session_sync/pathsafe.py +91 -0
- claude_session_sync/py.typed +0 -0
- claude_session_sync/resolve.py +226 -0
- claude_session_sync/scan.py +485 -0
- claude_session_sync/session_merge.py +214 -0
- claude_session_sync/sidecar.py +238 -0
- claude_session_sync/snapshot.py +136 -0
- claude_session_sync/state.py +240 -0
- claude_session_sync/tombstone.py +330 -0
- claude_session_sync/transfer.py +462 -0
@@ -0,0 +1,87 @@
+Metadata-Version: 2.4
+Name: claude-code-session-sync
+Version: 0.1.0
+Summary: Offline cross-machine sync for Claude Code sessions (JSONL) and memory (.md).
+Author: weilung
+License-Expression: MIT
+Project-URL: Homepage, https://github.com/weilung/claude-session-sync
+Project-URL: Repository, https://github.com/weilung/claude-session-sync
+Project-URL: Issues, https://github.com/weilung/claude-session-sync/issues
+Project-URL: Changelog, https://github.com/weilung/claude-session-sync/blob/main/CHANGELOG.md
+Keywords: claude,claude-code,sync,sessions,memory,jsonl,offline
+Classifier: Development Status :: 4 - Beta
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Developers
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Utilities
+Requires-Python: >=3.11
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Dynamic: license-file
+
+# claude-session-sync
+
+A CLI tool that syncs Claude Code's **conversation sessions (JSONL) and memory (.md)** offline across multiple machines via an **external or network drive**.
+
+- Machines in the same group share one hub directory and sync **bidirectionally**; across groups, specific sessions are pulled in or pushed back **explicitly and selectively**.
+- Core principle: **mechanical work goes to Python, semantic work goes to AI; data is never dropped silently.**
+- How it differs from existing tools: offline (no mandatory cloud or git) + selective + AI-assisted memory merging + multi-writer safety.
+
+## Status
+
+**Core features are implemented and stabilized** (cross-model code review + per-block fresh-gate; cross-platform, Windows green).
+
+- Done: read-only scan/classify → `sync` bidirectional sync (safe writes: read-verify-write + lock + tombstone + keep-both),
+  cross-group `pull`/`push`, `bootstrap` baseline, `doctor` diagnostics/rebuild-state/break-lock/ack,
+  memory union + tombstone + `MEMORY.md` index rebuild, `memory-merge` (including cross-group `--from` and fuzzy matching via `--fuzzy`), and the SessionEnd `nudge` hook.
+- **Fuzzy matching** for memory files that record the same fact under different filenames (the last and highest-risk P2 item) is done. It is deliberately limited to **read-only suggestions**: `memory-merge --fuzzy` lists candidates, both versions are kept only after you approve each pair, and it **never merges automatically**.
+- First public release, **0.1.0 (beta)**; not yet a stable 1.0, so the interface and behavior may still change.
+
+## Installation
+
+```bash
+# From PyPI (recommended): the package is named claude-code-session-sync, but the installed command is still claude-session-sync
+pipx install claude-code-session-sync # or: pip install claude-code-session-sync
+
+# Or install the latest version from source
+pipx install "git+https://github.com/weilung/claude-session-sync.git"
+```
+
+Requires Python ≥ 3.11 and has zero third-party dependencies (standard library only). The `claude-session-sync` command is available once installed.
+
+## Quick start
+
+```bash
+# 1) Configure your own group's hub (edit the config file; use single quotes for Windows paths)
+# Windows: %APPDATA%\claude-session-sync\config.toml
+# POSIX: ~/.config/claude-session-sync/config.toml
+# own_hub = 'D:\SyncDrive\HomeJSONL'
+
+# 2) Build a baseline on first use
+claude-session-sync bootstrap --map "local-project-folder=hub-project-folder" --yes
+
+# 3) Day-to-day sync (preview first, then commit with --apply)
+claude-session-sync status # show differences (strictly read-only)
+claude-session-sync sync # preview
+claude-session-sync sync --apply # apply the changes
+```
+
+(In a development environment where the package is not installed, the equivalent is `python -m claude_session_sync.cli <subcommand>`.)
+
+## Documentation
+
+- **[`docs/`](docs/README.md): user guide (plain-language edition)**. No programming knowledge required; it explains what the tool does, how it stays safe, and how to use it.
+- [Concepts and operation](docs/01-概念與運作.md) | [Safety mechanisms in plain language](docs/02-安全機制白話.md) | [Command reference](docs/03-指令手冊.md)
+- [Scenario playbooks](docs/04-情境劇本.md) | [Windows and known limitations](docs/05-windows-與已知限制.md) | [Glossary](docs/06-名詞對照.md)
+
+## Development
+
+```bash
+python -m unittest discover -t . -s tests
+```
+
+Pure Python with zero third-party dependencies (standard library only). Cross-OS CI runs py3.11/3.13 on Ubuntu and Windows.
@@ -0,0 +1,31 @@
+claude_code_session_sync-0.1.0.dist-info/licenses/LICENSE,sha256=nkzV9VJScNHPuadWPypvUFVzNTMXGqsMrfDUHZ-QGVw,1064
+claude_session_sync/__init__.py,sha256=5OGFH_TrJq-If5oS_s-0W8yb34SOQfoTBr3gLZtxKws,489
+claude_session_sync/acks.py,sha256=UcVC9m6QB8joxX8iVY9loKHoDZx2it7EM9X1UGJWl7o,17610
+claude_session_sync/anomaly.py,sha256=C_QoR_kVirvHz01RVz6JLJK_QFrGxq9vE0N6db7W9sM,9510
+claude_session_sync/apply.py,sha256=dVDckRvpIRSvkouyjZCOPhlOrge8M5XYbCM4hDuH_Gc,63096
+claude_session_sync/atomicio.py,sha256=3lRrMgjYjFvpMF6hUhAtbe0CMMnNKmq1HTeF8YRRlyE,30323
+claude_session_sync/bootstrap.py,sha256=1cEcI7xfVhi9TXgjKYGWcmn3gIq_A2wXagSEGTISUSY,22304
+claude_session_sync/canonical.py,sha256=sRZs1I1RBZRuy_TamdDCurD_CGvJmq_k7a_9expCfRQ,6497
+claude_session_sync/classify.py,sha256=M5HJRuOk9myn5zbU0_X7KJerg0i76qVf0U-5Fcz_Szs,6756
+claude_session_sync/cli.py,sha256=QrFCCwl0CvYmpgr9QGi63h0eQ2YOQ2vIsoMFmK6ggGs,62446
+claude_session_sync/config.py,sha256=0XSkBIuXr9i3k1ptWkb_IO-B_X-c9gMi0ND9XCxnJ8A,4960
+claude_session_sync/doctor.py,sha256=FbX9lU8QnR9shhNKXyijiPnda08frW3miEEO0OBJJio,19526
+claude_session_sync/fuzzy.py,sha256=P2nxuVmQHg8wDCliOg56M21bFLh05fuDOGlzoDMi4Ug,7258
+claude_session_sync/lineset.py,sha256=XChbVTiZwjoqH3bWKLUnn4CN590iQq07PZAlWxQ_B3w,5890
+claude_session_sync/memory.py,sha256=t0MAkbvIWae1J1w-iqR4R4hT322gzwTd2oIK4gLHhuE,68323
+claude_session_sync/merge.py,sha256=4ID7DoHmFZyIeXXm180t3w9B5hWmRT3qBgX2WJSn0aI,55886
+claude_session_sync/pathsafe.py,sha256=KUCxhU-qTVzfPgibbAlS6Drp9E5rr6mxpas4nBStkkk,5291
+claude_session_sync/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+claude_session_sync/resolve.py,sha256=ES5YwNDNJzrYQEg-lNoH7FORZop_adNdhaRNNyIeUMM,11936
+claude_session_sync/scan.py,sha256=xivMpXUizbvIy8sx7k3gMhBRyYiGQvhY1guKzoPeHHI,29492
+claude_session_sync/session_merge.py,sha256=5Q5GiyvvW3_xVnkdEpTJdYawovCE7tm5X6fP0zsxepY,10304
+claude_session_sync/sidecar.py,sha256=FKqHw-nxIGyBcQp4isrpk2oR4Cg5nODA4o2kCOpeLms,9360
+claude_session_sync/snapshot.py,sha256=6OcyeSlF-Wjeafd6aEz16mmgVxXlQLaEkBdo6zBVnTM,7026
+claude_session_sync/state.py,sha256=PNoDm8xN4dRHkvCiE7B5xWTbed7MC8ZElrrTBco9L6M,12109
+claude_session_sync/tombstone.py,sha256=higglM06WfClIfWt_JDbYkXeG-6sTIA4CcizPMDQKQU,18003
+claude_session_sync/transfer.py,sha256=d0Udb8XXJJ0b_Anwjy2u1Dwr8Ub_2FSeOjS9tQCMI-w,26668
+claude_code_session_sync-0.1.0.dist-info/METADATA,sha256=maj1R7CukDlQjKXSSRoMr93lfSOMyU3eZw-ukvRziGw,4159
+claude_code_session_sync-0.1.0.dist-info/WHEEL,sha256=K260EYznzXsJYBQGqmI8VTxEdiZYNvDZwW9cBh9-_MA,91
+claude_code_session_sync-0.1.0.dist-info/entry_points.txt,sha256=uRISCzJtnCBJO9qM9xus_kyXmRJ0m5Lsew9Cm8buw4o,69
+claude_code_session_sync-0.1.0.dist-info/top_level.txt,sha256=Cr8U4xm_8iwtkaCRNXeG8-rU2KmET9GyHhM5Zo0g6NU,20
+claude_code_session_sync-0.1.0.dist-info/RECORD,,
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 weilung
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1 @@
+claude_session_sync
@@ -0,0 +1,11 @@
+"""claude-session-sync: cross-machine sync for Claude Code sessions (JSONL) and memory (.md).
+
+P1a read-only core (implemented):
+- canonical: encoding absorption + canonical hash + line parsing (states: ok / zero-byte / blank / decode-error)
+- lineset: line identity, root-set, genuine leaf, active-tip
+- classify: §4.1 classification table + safety gates + main-root compatibility + active-tip cross-check
+
+Based on DESIGN.md v0.4 Appendix B (finalized by the P0 spike) and PLAN-P1.md v0.3.
+"""
+
+__version__ = "0.1.0"
@@ -0,0 +1,279 @@
+"""Ack ledger: for blocked items that the tool can never resolve automatically (damaged / casefold-collision / identity-collision),
+record them as "reviewed" so that doctor/sync stop reporting them over and over (DESIGN Appendix A15, "convergence exit for blocked items").
+
+**Safety rules**:
+1. **Presentation layer only**: an ack only suppresses *reporting* and **never changes classification**. An acked damaged item is still never synced, and an acked collision is still never
+   merged automatically. This ledger never enters `build_plan`/`classify`/`apply`, so it is structurally impossible for it to turn a blocked item into auto-apply. The presentation layer
+   (`format_plan` / `doctor.diagnose`) uses `AckView` to hide or demote acked items; the classification and write paths never see this module.
+2. **Fingerprint binding**: an ack records `(kind, identity, fingerprint)`. If the current fingerprint does not match (the damaged file's content changed,
+   or the collision set changed), the item is treated as **not** acked and is reported again as usual, so new or changed problems are never masked.
+3. **Fail-closed reads**: ledger missing → no acks (everything is reported as usual, `ok=True`); corrupt/unreadable → the whole ledger is ignored (`ok=False`, the caller
+   warns, and everything is still reported); bad entry → that entry is skipped without poisoning the whole ledger. `.tombstones/` is a symlink or escapes → **not trusted, nothing
+   suppressed** (returns empty, `ok=True`). Every path can **only suppress less, never more**.
+4. **A3, never lose data**: an ack writes only one hub-side per-project JSON file (`<proj>/.tombstones/acks.json`) and **never touches session/
+   memory files**. It lives in `.tombstones/` (already excluded by scan, so it is never treated as a session or project); writes go through the atomicio atomic write plus a dedicated lock.
+
+Scope (v1, session side): casefold-collision (A9 filename collision, both files are real sessions), damaged / blocked-damaged-source
+(broken JSONL), identity-collision (content identity conflict: same uuid, different hash). The memory-side equivalents are left as a follow-on.
+"""
+from __future__ import annotations
+
+import hashlib
+import json
+from dataclasses import dataclass, field
+from pathlib import Path
+
+from . import atomicio, pathsafe, scan, tombstone
+
+SCHEMA_VERSION = 1
+ACKS_FILE = "acks.json"
+
+# Ackable plan action → ack kind. casefold-collision uses the casefold key as its identity (one item covers the whole sid group);
+# damaged / identity-collision use the sid as identity (the fingerprint covers the file bytes on both sides).
+_ACTION_KIND: dict[str, str] = {
+    "blocked-casefold-collision": "casefold-collision",
+    "blocked-damaged-source": "damaged",
+    "damaged": "damaged",
+    "identity-collision": "identity-collision",
+}
+_KINDS = frozenset(_ACTION_KIND.values())
+# Ackable plan action strings (used by apply.format_report for presentation-layer filtering). **The single source of truth is `scan.ACKABLE_ACTIONS`** (acks depends on
+# scan, so scan must not import acks back); the keys of `_ACTION_KIND` must match it (test_acks has a drift guard).
+ACKABLE_ACTIONS = scan.ACKABLE_ACTIONS
+
+
+class UnsafeAcksDir(OSError):
+    """`<proj>/.tombstones` is a symlink or escapes the project folder (points out of bounds) → refuse to write acks (otherwise the write would land outside the trust root)."""
+
+
+@dataclass(frozen=True)
+class AckItem:
+    """One ackable blocked item (the **single source of truth** for doctor / format_plan / ack writes, produced by `ackable_from_plan`)."""
+    project: str  # hub project folder name (pk)
+    hub_dir: str  # hub project folder path; ledger location = <hub_dir>/.tombstones/acks.json
+    kind: str  # casefold-collision | damaged | identity-collision
+    identity: str  # collision: casefold key; damaged/identity: sid
+    fingerprint: str | None  # current fingerprint (see fingerprint_*); None = content cannot be bound (unreadable) → not ackable, never hidden (g6)
+    session_ids: tuple[str, ...]  # sids covered by this item (collision = the whole group; others = a single one), used by the presentation layer to hide the matching rows
+    label: str  # short display label
+
+
+@dataclass
+class Ledger:
+    """The loaded acks.json. `by_key` is keyed by the **(kind, identity, fingerprint) triple** (several fingerprints may coexist for the same
+    (kind, identity), g2); `ok=False` means the ledger was corrupt or unreadable (already ignored; the caller warns)."""
+    by_key: dict[tuple[str, str, str], dict] = field(default_factory=dict)
+    ok: bool = True
+    path: Path | None = None
+
+
+# ── fingerprint ─────────────────────────────────────────────────────────────
+
+def fingerprint_collision(names, local_files=None, hub_files=None) -> str | None:
+    """Collision fingerprint = the collision set (sorted spellings) + **the raw-bytes digest of each colliding file on both sides**. A spelling added or removed → the set changes → the fingerprint changes;
+    a colliding file's content changes (becomes damaged, or is replaced by another session) → the digest changes → the fingerprint changes → the item is reported again. **Why content is bound too** (g4):
+    the collision gate in `classify_session` runs **before** the damaged gate → if one colliding file goes bad it is still classified as a collision, so if the fp covered names only, a collision ack
+    would mask the new "colliding file became damaged" situation; binding content makes "a colliding file changed" prompt again. **Returns `None`** if a colliding file is present but unreadable
+    (content cannot be bound → not listed as ackable, fail-closed, g5 Medium). `*_files` come from `scan._session_files` (symlinks already excluded);
+    if omitted (e.g. a direct call outside of matching) → both sides are treated as having no file."""
+    lf, hf = local_files or {}, hub_files or {}
+    parts = ["\n".join(sorted(names))]
+    for n in sorted(names):
+        fpf = fingerprint_files(lf.get(n), hf.get(n))
+        if fpf is None:
+            return None  # a colliding file is unreadable → cannot bind → cannot ack (fail-closed)
+        parts.append(fpf)
+    return "cf:" + hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
+
+
+def fingerprint_files(*paths: Path | None) -> str | None:
+    """Fingerprint for damaged / identity-collision = the combination of the raw-bytes digests of the files on both sides. Content changing or disappearing on either side → the fingerprint changes → the item is reported again.
+    **Returns `None`** if a present file cannot be read (unreadable, `raw_file_digest` → None) → the caller (ackable_from_plan) treats it as
+    **content cannot be bound → not ackable, always reported** (fail-closed, g5 Medium: without binding, the fp cannot reflect content changes, so a change in a read-denied
+    file would be masked by an old ack). Missing file (None path) → "-" (bindable as "no such file on this side"). Paths come only from `scan._session_files`
+    (symlinks already excluded), so `raw_file_digest` (which follows symlinks) never reads an out-of-bounds file."""
+    parts: list[str] = []
+    for p in paths:
+        if p is None:
+            parts.append("-")  # no file on this side (bindable)
+        else:
+            d = tombstone.raw_file_digest(p)
+            if d is None:
+                return None  # present but unreadable → content cannot be bound (fail-closed)
+            parts.append(d)
+    return "fs:" + hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
+
+
+# ── Extract ackable items from the plan (single source of truth) ───────────────
+
+def ackable_from_plan(plan) -> list[AckItem]:
+    """Extract every ackable session-side blocked item from a `SyncPlan`. **Only projects that have a hub_dir are included** (the ack ledger lives on the hub side;
+    local-only projects are not on the hub yet and are not a sync problem yet, so they are excluded). Collisions are grouped by casefold key (one item per group)."""
+    out: list[AckItem] = []
+    for pp in plan.projects:
+        if not pp.hub_dir:
+            continue
+        pk = Path(pp.hub_dir).name
+        coll: dict[str, set[str]] = {}
+        filebased: list[tuple[str, str]] = []  # (sid, kind) for damaged / identity-collision
+        for s in pp.sessions:
+            kind = _ACTION_KIND.get(s.action)
+            if kind is None:
+                continue
+            if kind == "casefold-collision":
+                coll.setdefault(s.session_id.casefold(), set()).add(s.session_id)
+            else:
+                filebased.append((s.session_id, kind))
+        if not coll and not filebased:
+            continue
+        # Both collision and damaged/identity need the content of the colliding/broken files to compute the fingerprint (g4: the collision fp binds content too) → read whenever there is an ackable item.
+        # Use `_session_files` (symlinks already excluded) to get real file paths, so the fingerprint only covers physical files the tool would process anyway.
+        local_files = scan._session_files(Path(pp.local_dir)) if pp.local_dir else {}
+        hub_files = scan._session_files(Path(pp.hub_dir))
+        # Unbindable (content unreadable) → fp=None: **still listed** (not skipped). fingerprint=None → is_acked is always False → compute_ack_view
+        # never hides that (pk, sid) (fail-closed, g6: if the item were skipped entirely, an ack for the same sid from another view of the same hub would make this sid look as if "every covering
+        # item is acked" and wrongly hide this unbindable row). `_doctor_ack` separately filters out fp=None items (not ackable; content changes cannot be bound).
+        for cf, sids in sorted(coll.items()):
+            names = sorted(sids)
+            out.append(AckItem(pk, pp.hub_dir, "casefold-collision", cf,
+                               fingerprint_collision(names, local_files, hub_files), tuple(names),
+                               "/".join(n[:8] for n in names)))
+        for sid, kind in sorted(filebased):
+            out.append(AckItem(pk, pp.hub_dir, kind, sid,
+                               fingerprint_files(local_files.get(sid), hub_files.get(sid)), (sid,), sid[:8]))
+    return out
+
+
+# ── Ledger reads (lock-free, fail-closed) ──────────────────────────────────────
+
+def load_ledger(hub_project_dir) -> Ledger:
+    """Read `<hub_project_dir>/.tombstones/acks.json`. Fail-closed: missing → empty, `ok=True`; corrupt/unreadable → empty, `ok=False`;
+    unsafe `.tombstones` → empty, `ok=True` (not trusted, nothing suppressed). **Every case can only suppress less, never more**."""
+    tdir = tombstone.tombstones_dir(hub_project_dir)
+    path = tdir / ACKS_FILE
+    if not tombstone._tombstones_ok(hub_project_dir):
+        return Ledger({}, True, path)  # symlinked/escaping .tombstones → do not trust its contents, suppress nothing
+    if pathsafe.is_reparse(path):
+        # `.tombstones` is a real folder but `acks.json` **itself** is a symlink/reparse point → `read_bytes()` would follow it and read an out-of-bounds or planted ledger
+        # → not trusted, nothing suppressed (fail-closed, g1 High; a leaf-level defense, since `_tombstones_ok` only checks the parent folder).
+        return Ledger({}, True, path)
+    try:
+        raw = path.read_bytes()
+    except FileNotFoundError:
+        return Ledger({}, True, path)  # no ledger yet → no acks (normal)
+    except OSError:
+        return Ledger({}, False, path)  # unreadable (permissions etc.) → fail-closed, ignore
+    try:
+        obj = json.loads(raw.decode("utf-8"))
+    except (ValueError, UnicodeDecodeError):
+        return Ledger({}, False, path)  # bad JSON → ignore the whole ledger
+    version = obj.get("version") if isinstance(obj, dict) else None
+    # `type(version) is int`: **rejects bool/float**. In Python both `True == 1` and `1.0 == 1` are true, so with a bare `version == 1`
+    # `{"version": true}` / `{"version": 1.0}` would pass as a valid version → a bad ledger could suppress real problems (R1 High#1,
+    # mirroring the type-level fail-closed check in tombstone's `_valid_tombstone`).
+    if not (type(version) is int and version == SCHEMA_VERSION and isinstance(obj.get("acks"), list)):
+        return Ledger({}, False, path)  # version mismatch / wrong type / broken structure → ignore the whole ledger
+    by_key: dict[tuple[str, str, str], dict] = {}
+    for rec in obj["acks"]:
+        if not isinstance(rec, dict):
+            continue  # skip bad entries (without poisoning the whole ledger)
+        k, i, fp = rec.get("kind"), rec.get("identity"), rec.get("fingerprint")
+        if isinstance(k, str) and isinstance(i, str) and isinstance(fp, str) and k in _KINDS:
+            by_key[(k, i, fp)] = rec  # keyed by the **(kind, identity, fingerprint) triple**: the same (kind, identity)
+            # may have several fps side by side (when one hub project is mapped by several local folders, the content fp of a damaged item differs, g2), and they do not overwrite each other.
+    return Ledger(by_key, True, path)
+
+
+def is_acked(ledger: Ledger, kind: str, identity: str, fingerprint: str | None) -> bool:
+    """Whether the (kind, identity, fingerprint) triple is in the ledger (**fp-exact**: a fingerprint mismatch means the content or collision set has changed → not acked).
+    `fingerprint=None` (content cannot be bound) → **always False** (fail-closed: an item whose content is unreadable can never be "acked" and must not be hidden, g6)."""
+    return fingerprint is not None and (kind, identity, fingerprint) in ledger.by_key
+
+
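The comment on the `type(version) is int` check in `load_ledger` rests on a Python equality quirk that is easy to verify in isolation. The sketch below compares a bare equality test against the strict form; `loose_ok` and `strict_ok` are hypothetical names used only for this comparison.

```python
import json

SCHEMA_VERSION = 1


def loose_ok(version) -> bool:
    """Bare equality: also accepts True and 1.0, because both compare equal to 1."""
    return version == SCHEMA_VERSION


def strict_ok(version) -> bool:
    """Exact-type check: bool (a subclass of int) and float are both rejected."""
    return type(version) is int and version == SCHEMA_VERSION


for raw in ('{"version": 1}', '{"version": true}', '{"version": 1.0}'):
    v = json.loads(raw)["version"]
    print(raw, loose_ok(v), strict_ok(v))
```

Note that `isinstance(version, int)` would not be enough here, since `isinstance(True, int)` is true in Python.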
+# ── ack view (presentation-layer filter shared by format_plan / doctor) ────────
+
+@dataclass
+class AckView:
+    """Presentation-layer filter view. `hidden[pk]` = the set of session_ids to hide for that project (see the fail-safe rule in compute_ack_view);
+    `corrupt_projects` = projects whose ledger is corrupt (the caller warns). Each renderer counts its own hidden rows."""
+    hidden: dict[str, set[str]] = field(default_factory=dict)
+    corrupt_projects: list[str] = field(default_factory=list)
+
+
+def compute_ack_view(plan) -> AckView:
+    """Look up each ackable item of the plan in its project's ledger (one read per project) and compute the session_ids to hide. Read-only, lock-free.
+
+    **Fail-safe hiding**: a `(pk, sid)` is hidden only when **every ackable item covering it is fp-exact acked**; otherwise it stays visible.
+    Because one hub project can be mapped by several local folders (two cwd bindings / two clones), the damaged-content fp of the same sid can differ → two
+    AckItems share `(pk, sid)`; hiding by `(pk, sid)` after acking only one of them would wrongly hide the other, un-acked one (masking a real problem, g2 High).
+    Better to show an already-acked item than to wrongly hide an un-acked item with the same sid. (The collision fp covers only the name set and is identical across views, so this rule has no side effect on it.)"""
+    view = AckView()
+    by_dir: dict[str, list[AckItem]] = {}
+    for it in ackable_from_plan(plan):
+        by_dir.setdefault(it.hub_dir, []).append(it)
+    for hub_dir, its in by_dir.items():
+        led = load_ledger(hub_dir)
+        pk = its[0].project
+        if not led.ok:
+            view.corrupt_projects.append(pk)
+        sid_all_acked: dict[str, bool] = {}  # sid → whether **all** items covering it are acked (AND)
+        for it in its:
+            acked = is_acked(led, it.kind, it.identity, it.fingerprint)
+            for sid in it.session_ids:
+                sid_all_acked[sid] = sid_all_acked.get(sid, True) and acked
+        hide = {sid for sid, ok in sid_all_acked.items() if ok}
+        if hide:
+            view.hidden.setdefault(pk, set()).update(hide)
+    return view
+
+
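The fail-safe hiding rule in `compute_ack_view` reduces to a logical AND over every item that covers a session id. A minimal sketch of just that reduction, with the ledger lookup replaced by a precomputed boolean; `hidden_sids` and its `(sid, acked)` input shape are assumptions for illustration:

```python
def hidden_sids(items: list[tuple[str, bool]]) -> set[str]:
    """items: (sid, acked) pairs, one per ackable item covering that sid."""
    all_acked: dict[str, bool] = {}
    for sid, acked in items:
        # A single un-acked item covering the sid keeps it visible.
        all_acked[sid] = all_acked.get(sid, True) and acked
    return {sid for sid, ok in all_acked.items() if ok}


# "aaa" is covered by two items (e.g. two local folders mapped to one hub project);
# only one of them has been acked, so "aaa" must stay visible.
items = [("aaa", True), ("aaa", False), ("bbb", True)]
print(hidden_sids(items))
```

Starting each session id at `True` and AND-ing in each item means the order of the items does not matter, and an item with `fingerprint=None` (always un-acked) blocks hiding for every session id it covers.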
+# ── Ledger writes (locked read-modify-write, atomic) ───────────────────────────
+
+@dataclass
+class UpdateResult:
+    added: list[str] = field(default_factory=list)  # labels of new acks / updated fingerprints
+    removed: list[str] = field(default_factory=list)  # labels of removed acks
+    unchanged: list[str] = field(default_factory=list)  # already acked with a matching fingerprint (no change)
+    replaced_corrupt: bool = False  # the original ledger was corrupt and has been replaced with new content (the caller reports this)
+
+
+def update_ledger(hub_project_dir, *, add=(), remove=(), lock_timeout_s: float = 5.0) -> UpdateResult:
+    """Locked read-modify-write of one project's ledger. `add`: `list[AckItem]` (keyed by the (kind, identity, fingerprint) triple;
+    entries with the same (kind, identity) but different fps **coexist without overwriting each other**, g2); `remove`: `list[(kind, identity, fingerprint)]` (the triple keys exposed by
+    load_ledger). Re-reads under the lock (concurrent acks are merged, not overwritten); atomic write. Unsafe `.tombstones` → raise."""
+    tdir = tombstone.tombstones_dir(hub_project_dir)
+    if not tombstone._tombstones_ok(hub_project_dir):
+        raise UnsafeAcksDir(f".tombstones is a symlink or escapes the project folder; refusing to write acks: {tdir}")
+    path = tdir / ACKS_FILE
+    res = UpdateResult()
+    lock = atomicio.FileLock(path).acquire_blocking(timeout_s=lock_timeout_s)
+    try:
+        led = load_ledger(hub_project_dir)  # re-read under the lock
+        res.replaced_corrupt = not led.ok  # a corrupt ledger is replaced by this write (it was already being ignored and suppressed nothing)
+        by_key = dict(led.by_key)
+        for it in add:
+            key = (it.kind, it.identity, it.fingerprint)
+            if key in by_key:
+                res.unchanged.append(it.label)
+                continue
+            by_key[key] = {
+                "kind": it.kind, "identity": it.identity, "fingerprint": it.fingerprint,
+                "label": it.label, "acked_at": tombstone.now_iso(),
+                "acked_by": tombstone.local_machine_id(),
+            }
+            res.added.append(it.label)
+        for key in remove:
+            rec = by_key.pop(key, None)
+            if rec is not None:
+                res.removed.append(rec.get("label") or key[1])
+        _write_ledger(path, by_key)
+        return res
+    finally:
+        lock.release()
+
+
+def _write_ledger(path: Path, by_key: dict[tuple[str, str, str], dict]) -> None:
+    acks = sorted(by_key.values(),
+                  key=lambda r: (r.get("kind", ""), r.get("identity", ""), r.get("fingerprint", "")))
+    obj = {"version": SCHEMA_VERSION, "acks": acks}
+    atomicio.atomic_write_text(path, json.dumps(obj, ensure_ascii=False, indent=2))
@@ -0,0 +1,161 @@
+"""Existence-anomaly detection (§8.5): blocks before any apply, guarding against "wrong drive mounted / empty mount / sudden hub change / known sessions vanishing in bulk".
+
+Based on DESIGN §8.5 + PLAN v0.8 §2.9 + codex r6 must-fix ② (bulk disappearance of known sessions):
+relies on **existence** rather than a percentage threshold to catch "the whole thing is wrong": mount point missing / hub folder-name fingerprint changed → halt;
+**known sessions vanishing from the hub in bulk** (folder names unchanged, but contents wiped / partially synced / wrongly mounted) also halts.
+
+Convention (shared by the whole tool): the keys of state.known_sessions are **hub project folder names** (the encoded folder names under hub_root),
+and so are the values of bindings. This is what lets anomaly locate hub_root/<project_key> directly from a project_key.
+
+From P1a: mount + hub-fingerprint; P1b adds bulk disappearance of known sessions (a coarse safety net with a high default threshold to avoid false positives).
+"""
+from __future__ import annotations
+
+import hashlib
+import json
+from dataclasses import dataclass
+from pathlib import Path
+
+from .pathsafe import safe_project_dir  # leaf module; an escaping pk folder is never read out of bounds (e2e gate2 #4, no import cycle)
+from .state import State
+
+# Bulk-disappearance thresholds (a coarse safety net; better to occasionally ask the user to confirm than to silently swallow a "wrong mount / wiped" situation).
+DISAPPEAR_MIN_KNOWN = 8  # below this **overall** count of known sessions the global check does not trigger (sample too small)
+DISAPPEAR_FRAC = 0.6  # missing fraction at or above this → halt (global or per project)
+PROJECT_VANISH_MIN = 4  # a **single project** with at least this many known sessions gets its own check (so a large project cannot dilute a small project being wiped entirely)
+
+
+@dataclass
+class Anomaly:
+    code: str
+    message: str
+    severity: str  # "halt" | "warn"
+
+
+def hub_fingerprint(hub_root: str | Path) -> str:
+    """Existence fingerprint of the hub: a hash of all project folder names (sorted). Wrong drive mounted → the folder-name set differs widely → the fingerprint changes."""
+    root = Path(hub_root)
+    names = sorted(d.name for d in root.iterdir() if d.is_dir()) if root.exists() else []
+    return hashlib.sha256(json.dumps(names, ensure_ascii=False).encode("utf-8")).hexdigest()
+
+
+def known_session_set_hash(state: State | None) -> str:
+    """Stable hash of the full project_key → known sessionId set mapping (lets anomaly snapshots detect sudden changes in the project set)."""
+    if state is None:
+        return hashlib.sha256(b"none").hexdigest()
+    payload = {k: sorted(v) for k, v in sorted(state.known_sessions.items())}
+    return hashlib.sha256(json.dumps(payload, ensure_ascii=False, sort_keys=True).encode("utf-8")).hexdigest()
+
+
+def _hub_session_stems(hub_dir: Path) -> set[str]:
+    if not hub_dir.exists():
+        return set()
+    return {p.stem for p in hub_dir.glob("*.jsonl") if p.is_file() and not p.name.startswith(".")}
+
+
+def detect_disappearance(
+    state: State | None, hub_root: str | Path, *,
+    min_known: int = DISAPPEAR_MIN_KNOWN, frac: float = DISAPPEAR_FRAC,
+    project_min: int = PROJECT_VANISH_MIN,
+) -> Anomaly | None:
+    """Whether known sessions have vanished from the hub **in bulk** (the "contents wiped / partially synced / wrongly mounted" cases that the folder-name fingerprint misses).
+
+    Looks at the hub side only (local disappearance is handled by the tombstone flow; the most damaging case, a wrong drive mounted, is covered by this check plus the fingerprint). Triggers when:
+    - **per project**: known ≥ project_min and that project's missing fraction ≥ frac (so a large project cannot dilute a small project being wiped entirely, codex r8);
+    - **or globally**: overall known ≥ min_known and the overall missing fraction ≥ frac.
+    Fractions are compared exactly (not via an int() floor, which would let 50% wrongly trip the 60% threshold, codex r8). Returns None when there is no anomaly.
+    """
+    if state is None or not state.known_sessions:
+        return None
+    root = Path(hub_root)
+    total_known = 0
+    total_missing = 0
+    flagged: list[str] = []
+    for pk, known in state.known_sessions.items():
+        if not known:
+            continue
+        if not safe_project_dir(root, root / pk):
+            continue  # escaping pk folder (symlink/junction out of hub_root) → do not glob out of bounds (e2e gate2 #4); that anomaly is surfaced by build_plan as skipped-unsafe
+        present = _hub_session_stems(root / pk)
+        missing = len(set(known) - present)
+        total_known += len(known)
+        total_missing += missing
+        if len(known) >= project_min and missing / len(known) >= frac:
+            flagged.append(pk)
+    global_hit = total_known >= min_known and total_known > 0 and total_missing / total_known >= frac
+    if not flagged and not global_hit:
+        return None
+    parts: list[str] = []
+    if flagged:
+        parts.append(f"{len(flagged)} project(s) each lost sessions in bulk")
+    if global_hit:
+        parts.append(f"overall {total_missing}/{total_known} not on the hub")
+    return Anomaly(
+        "known-sessions-vanished",
+        "Known sessions have vanished from the hub in bulk (" + "; ".join(parts) + "). "
+        "Likely a wrong drive mounted, a wiped hub, or a partial sync. Confirm the hub is correct, then rebuild the baseline with `bootstrap`; nothing is handled automatically.",
+        "halt",
+    )
+
+
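The docstring of `detect_disappearance` warns that flooring with `int()` would let 50% trip the 60% threshold. The sketch below shows one way such a bug can arise; `trips_floored` is a hypothetical reconstruction of the failure mode, not code from the package, and `trips_exact` mirrors the comparison the function actually uses.

```python
DISAPPEAR_FRAC = 0.6


def trips_exact(missing: int, known: int, frac: float = DISAPPEAR_FRAC) -> bool:
    """Exact fraction comparison, as in detect_disappearance."""
    return missing / known >= frac


def trips_floored(missing: int, known: int, frac: float = DISAPPEAR_FRAC) -> bool:
    """Hypothetical buggy variant: the threshold count is floored before comparing."""
    return missing >= int(known * frac)


# 2 of 4 sessions missing is 50%, below the 60% threshold.
# int(4 * 0.6) floors 2.4 down to 2, so the floored variant fires anyway.
print(trips_exact(2, 4), trips_floored(2, 4))
```

With small per-project counts (the per-project check starts at 4 known sessions), a one-session rounding error is a large relative shift, which is why the exact comparison matters most for small projects.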
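+def check(state: State | None, hub_root: str | Path) -> list[Anomaly]:
+    root = Path(hub_root)
+    if not root.exists() or not root.is_dir():
+        return [Anomaly("mount-missing", f"hub mount point does not exist or is not a directory: {root}", "halt")]
+    out: list[Anomaly] = []
+    cur = hub_fingerprint(root)
+    if state is not None and state.hub_fingerprint and state.hub_fingerprint != cur:
+        out.append(Anomaly("hub-fingerprint-changed", "hub fingerprint changed (possibly a wrong drive mounted / sudden change in the project set)", "halt"))
+    disappeared = detect_disappearance(state, root)
+    if disappeared is not None:
+        out.append(disappeared)
+    return out
+
+
+# ── Cross-side presence / identity safety predicates (shared by session-scan and memory-scan; a single source of truth to avoid drift) ──
+# These two predicates are not halt anomalies themselves; they are safety criteria fed to classify (bulk disappearance → do not propagate deletions automatically;
+# casefold collision → cross-OS aliasing risk). They originally lived in scan.py; P1d memory needs the same logic → hoisted into this leaf module,
+# and scan re-exports them under the original names (`is_bulk_local_deletion` / the `_collision_casefolds` alias) so existing callers stay unchanged.
+
+# Minimum sample for present-empty detection: at least this many local items were known and none are present now → suspect a wrong drive mounted or the whole folder wiped.
+# Deleting a single item down to empty (known==1) still counts as a normal deletion (only triggers from a floor of 2), so common small projects are not blocked by mistake.
+_MOUNT_UNCONFIRMED_MIN = 2
+
+
+def is_bulk_local_deletion(local_known: set | None, local_stems: set[str]) -> bool:
+    """Whether this project's local items have "vanished in bulk" (suspected wrong drive mounted / contents wiped rather than the user deleting items one by one) → every `local-deleted` in the project
+    is reclassified as `blocked-bulk-local-deletion` (handed to a human; **no tombstone is written automatically**). Shared by sessions (sid) and memory (filename).
+
+    This is the most dangerous part of deletion detection: a false positive would write tombstones that **suppress real items on the other side (session/memory; a loss of cross-machine data
+    availability)**. Trigger conditions (either one):
+    (a) **Mount cannot be confirmed**: there were ≥2 known local items, but **none of them** is present now (present is the empty set). This is the most dangerous wrong-drive scenario:
+        the folder name matches through binding/git while the contents belong to another drive (zero overlap between the known set and the current state), and it is blocked here; even small projects with too few samples for frac
+        are covered (codex r24-2). Deleting a single item down to empty (known==1) is still trusted.
+    (b) **Bulk fractional disappearance**: known ≥ project_min and the missing fraction ≥ frac (mirrors the hub-side anomaly check, with exact fraction comparison).
+    A conservative trade-off: better to ask once more on bulk or suspicious disappearance than to silently write tombstones when the wrong drive is mounted or the contents were wiped. **Residual risk remains**: a "partial" disappearance where the mount
+    has been confirmed by surviving items (present non-empty) is treated as normal deletion and gets tombstones; this is a deliberate trade-off that follows from the feature's purpose (propagating user deletions),
+    and the harm is bounded: A3 guarantees that the hub and the other side's local files are never deleted, and a tombstone only suppresses re-propagation and is reversible (remove it to restore)."""
+    if not local_known:
+        return False
+    known = set(local_known)
+    present = known & set(local_stems)
+    if not present and len(known) >= _MOUNT_UNCONFIRMED_MIN:
+        return True  # (a) mount cannot be confirmed
+    return len(known) >= PROJECT_VANISH_MIN and (len(known) - len(present)) / len(known) >= DISAPPEAR_FRAC
+
+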
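The two trigger conditions of `is_bulk_local_deletion` are easiest to see on concrete inputs. The function body is restated below, with the constant values from this module, so the example runs on its own; the sample session ids are made up for illustration.

```python
PROJECT_VANISH_MIN = 4
DISAPPEAR_FRAC = 0.6
_MOUNT_UNCONFIRMED_MIN = 2


def is_bulk_local_deletion(local_known, local_stems) -> bool:
    if not local_known:
        return False
    known = set(local_known)
    present = known & set(local_stems)
    if not present and len(known) >= _MOUNT_UNCONFIRMED_MIN:
        return True  # (a) mount cannot be confirmed
    return len(known) >= PROJECT_VANISH_MIN and (len(known) - len(present)) / len(known) >= DISAPPEAR_FRAC


five = {"a", "b", "c", "d", "e"}
print(is_bulk_local_deletion({"a"}, set()))            # one known item deleted to empty: trusted
print(is_bulk_local_deletion({"a", "b"}, {"x", "y"}))  # (a) zero overlap with what was known
print(is_bulk_local_deletion(five, {"a"}))             # (b) 4 of 5 missing
print(is_bulk_local_deletion(five, {"a", "b", "c"}))   # 2 of 5 missing: normal deletion
```

The second case is the wrong-drive scenario from the docstring: the folder exists and contains files, but none of them are the files this machine knew about.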
+def collision_casefolds(a_stems, b_stems, keyfn=None) -> set[str]:
+    """Casefold collision set (A9, cross-OS collision risk). **Merges both sides**, then checks whether each fold key maps to more than one spelling. This covers
+    both same-side duplicates and **cross-side** case-only variants (local `ABC` + hub `abc`: neither is a duplicate on a case-sensitive machine, but on
+    Windows/exFAT they alias, and their lock paths differ so they are not mutually exclusive) (codex r11-4). Sessions pass the sid set; memory passes the filename set.
+
+    `keyfn` = the fold-key function, default `str.casefold` (a session sid is an ASCII UUID, so NFC/NFD does not apply → existing behavior is kept unchanged).
+    **Memory passes `pathsafe.name_key` (NFC∘casefold∘NFC)**; otherwise the NFC and NFD spellings of the same filename (common when memory is written across
+    platforms) would not be flagged as a collision and would each be copied in both directions as independent files (norm-sensitive FS) or be aliased and overwritten on a norm-insensitive FS (e2e-r1
+    Finding 2; memory filenames are paired byte-exactly, and only this fold key recognizes normalization aliases). The returned set is keyed by `keyfn`, so **a caller's
+    `x in <this set>` test must use the same `keyfn(x)`** (the memory side already uses `name_key(name) in collisions`)."""
+    kf = keyfn or str.casefold
+    by_cf: dict[str, set[str]] = {}
+    for s in set(a_stems) | set(b_stems):
+        by_cf.setdefault(kf(s), set()).add(s)
+    return {cf for cf, spellings in by_cf.items() if len(spellings) > 1}
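The difference between the default `str.casefold` key and an NFC-aware key shows up as soon as a filename contains a composed character. In the sketch below, `name_key` is an assumed implementation that follows the "NFC∘casefold∘NFC" description in the docstring; the real `pathsafe.name_key` is not shown in this diff and may differ. `collision_casefolds` is restated so the example runs standalone.

```python
import unicodedata


def name_key(name: str) -> str:
    """Assumed shape of an NFC, casefold, NFC fold key (not the package's own code)."""
    nfc = unicodedata.normalize("NFC", name)
    return unicodedata.normalize("NFC", nfc.casefold())


def collision_casefolds(a_stems, b_stems, keyfn=None) -> set[str]:
    kf = keyfn or str.casefold
    by_cf: dict[str, set[str]] = {}
    for s in set(a_stems) | set(b_stems):
        by_cf.setdefault(kf(s), set()).add(s)
    return {cf for cf, spellings in by_cf.items() if len(spellings) > 1}


nfc = "caf\u00e9.md"   # "é" as a single code point
nfd = "cafe\u0301.md"  # "e" followed by a combining acute accent

print(collision_casefolds({"ABC"}, {"abc"}))        # case-only variants collide by default
print(collision_casefolds({nfc}, {nfd}))            # plain casefold sees two unrelated names
print(collision_casefolds({nfc}, {nfd}, name_key))  # the NFC-aware key detects the alias
```

As the docstring notes, membership tests against the returned set have to apply the same key function, for example `name_key(name) in collisions`, because the set holds fold keys, not original spellings.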