umarise-huggingface 0.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- umarise_huggingface-0.1.0/PKG-INFO +123 -0
- umarise_huggingface-0.1.0/README.md +102 -0
- umarise_huggingface-0.1.0/pyproject.toml +34 -0
- umarise_huggingface-0.1.0/setup.cfg +4 -0
- umarise_huggingface-0.1.0/umarise_huggingface/__init__.py +19 -0
- umarise_huggingface-0.1.0/umarise_huggingface/anchor.py +105 -0
- umarise_huggingface-0.1.0/umarise_huggingface/hook.py +100 -0
- umarise_huggingface-0.1.0/umarise_huggingface.egg-info/PKG-INFO +123 -0
- umarise_huggingface-0.1.0/umarise_huggingface.egg-info/SOURCES.txt +10 -0
- umarise_huggingface-0.1.0/umarise_huggingface.egg-info/dependency_links.txt +1 -0
- umarise_huggingface-0.1.0/umarise_huggingface.egg-info/requires.txt +2 -0
- umarise_huggingface-0.1.0/umarise_huggingface.egg-info/top_level.txt +1 -0
@@ -0,0 +1,123 @@
+Metadata-Version: 2.4
+Name: umarise-huggingface
+Version: 0.1.0
+Summary: Anchor every HuggingFace upload to Bitcoin. Zero-touch provenance for models and datasets.
+Author-email: Umarise <partners@umarise.com>
+License: Unlicense
+Project-URL: Homepage, https://umarise.com
+Project-URL: Documentation, https://umarise.com/api-reference
+Project-URL: Repository, https://github.com/AnchoringTrust/umarise-huggingface
+Keywords: umarise,huggingface,transformers,anchoring,bitcoin,proof-of-existence,opentimestamps,provenance,ai-audit,model-provenance,dataset-provenance
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: Science/Research
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Classifier: Topic :: Security :: Cryptography
+Classifier: Programming Language :: Python :: 3
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+Requires-Dist: umarise-core-sdk>=1.1.0
+Requires-Dist: huggingface-hub>=0.20
+
+# umarise-huggingface
+
+**Anchor every HuggingFace upload to Bitcoin. Automatically.**
+
+`umarise-huggingface` intercepts `upload_file()` and `upload_folder()` to automatically compute a local SHA-256 hash and anchor it to Bitcoin via the Umarise Core API. No files are transmitted — only the hash.
+
+## Install
+
+```bash
+pip install umarise-huggingface
+export UMARISE_API_KEY=um_your_key_here
+```
+
+## Usage
+
+### Option 1: Auto-anchor (recommended)
+
+```python
+import umarise_huggingface
+from huggingface_hub import HfApi
+
+umarise_huggingface.enable()
+
+api = HfApi()
+api.upload_file(
+    path_or_fileobj="model.safetensors",
+    path_in_repo="model.safetensors",
+    repo_id="your-org/your-model",
+)
+# [umarise] ✓ Anchored your-org/your-model:model.safetensors → origin_id: abc123...
+```
+
+### Option 2: With transformers push_to_hub
+
+```python
+import umarise_huggingface
+umarise_huggingface.enable()
+
+# Works automatically with any push_to_hub call
+model.push_to_hub("your-org/your-model")
+tokenizer.push_to_hub("your-org/your-model")
+# Each file is automatically anchored
+```
+
+### Option 3: Explicit
+
+```python
+from umarise_huggingface import anchor_file, anchor_folder
+
+# Single file
+origin_id = anchor_file("model.safetensors")
+
+# Entire directory
+results = anchor_folder("./model_output/", pattern="*.safetensors")
+```
+
+## What happens
+
+1. You upload via HuggingFace Hub — upload proceeds normally
+2. SHA-256 is computed **locally** — bytes never leave your machine
+3. Hash is submitted to Umarise Core API → anchored to Bitcoin
+4. `origin_id` is printed to stderr
+5. Within ~2 hours, proof is confirmed on the Bitcoin blockchain
+
+## What this proves
+
+- ✓ This exact model/dataset existed no later than time T
+- ✓ The artifact was not modified after anchoring
+- ✓ Anyone can verify — no Umarise account needed
+
+## Verify
+
+```bash
+# Get the hash of your file
+shasum -a 256 model.safetensors
+
+# Verify against Umarise
+curl -s https://core.umarise.com/v1-core-verify \
+  -H 'Content-Type: application/json' \
+  -d '{"hash":"sha256:YOUR_HASH"}' | python3 -m json.tool
+```
+
+Or visit [verify-anchoring.org](https://verify-anchoring.org)
+
+## Design principles
+
+- **Never breaks your pipeline** — if anchoring fails, uploads continue normally
+- **Zero storage** — only the SHA-256 hash is sent to Umarise
+- **Vault-independent** — HuggingFace stores the data, Umarise stores the proof
+- **Independently verifiable** — no Umarise account needed to verify
+
+## Links
+
+- [Umarise](https://umarise.com)
+- [API Reference](https://umarise.com/api-reference)
+- [Anchoring Specification](https://anchoring-spec.org)
+- [umarise-wandb](https://pypi.org/project/umarise-wandb/)
+- [umarise-mlflow](https://pypi.org/project/umarise-mlflow/)
+
+## License
+
+[Unlicense](https://unlicense.org) — public domain.
@@ -0,0 +1,102 @@
+# umarise-huggingface
+
+**Anchor every HuggingFace upload to Bitcoin. Automatically.**
+
+`umarise-huggingface` intercepts `upload_file()` and `upload_folder()` to automatically compute a local SHA-256 hash and anchor it to Bitcoin via the Umarise Core API. No files are transmitted — only the hash.
+
+## Install
+
+```bash
+pip install umarise-huggingface
+export UMARISE_API_KEY=um_your_key_here
+```
+
+## Usage
+
+### Option 1: Auto-anchor (recommended)
+
+```python
+import umarise_huggingface
+from huggingface_hub import HfApi
+
+umarise_huggingface.enable()
+
+api = HfApi()
+api.upload_file(
+    path_or_fileobj="model.safetensors",
+    path_in_repo="model.safetensors",
+    repo_id="your-org/your-model",
+)
+# [umarise] ✓ Anchored your-org/your-model:model.safetensors → origin_id: abc123...
+```
+
+### Option 2: With transformers push_to_hub
+
+```python
+import umarise_huggingface
+umarise_huggingface.enable()
+
+# Works automatically with any push_to_hub call
+model.push_to_hub("your-org/your-model")
+tokenizer.push_to_hub("your-org/your-model")
+# Each file is automatically anchored
+```
+
+### Option 3: Explicit
+
+```python
+from umarise_huggingface import anchor_file, anchor_folder
+
+# Single file
+origin_id = anchor_file("model.safetensors")
+
+# Entire directory
+results = anchor_folder("./model_output/", pattern="*.safetensors")
+```
+
+## What happens
+
+1. You upload via HuggingFace Hub — upload proceeds normally
+2. SHA-256 is computed **locally** — bytes never leave your machine
+3. Hash is submitted to Umarise Core API → anchored to Bitcoin
+4. `origin_id` is printed to stderr
+5. Within ~2 hours, proof is confirmed on the Bitcoin blockchain
+
+## What this proves
+
+- ✓ This exact model/dataset existed no later than time T
+- ✓ The artifact was not modified after anchoring
+- ✓ Anyone can verify — no Umarise account needed
+
+## Verify
+
+```bash
+# Get the hash of your file
+shasum -a 256 model.safetensors
+
+# Verify against Umarise
+curl -s https://core.umarise.com/v1-core-verify \
+  -H 'Content-Type: application/json' \
+  -d '{"hash":"sha256:YOUR_HASH"}' | python3 -m json.tool
+```
+
+Or visit [verify-anchoring.org](https://verify-anchoring.org)
+
+## Design principles
+
+- **Never breaks your pipeline** — if anchoring fails, uploads continue normally
+- **Zero storage** — only the SHA-256 hash is sent to Umarise
+- **Vault-independent** — HuggingFace stores the data, Umarise stores the proof
+- **Independently verifiable** — no Umarise account needed to verify
+
+## Links
+
+- [Umarise](https://umarise.com)
+- [API Reference](https://umarise.com/api-reference)
+- [Anchoring Specification](https://anchoring-spec.org)
+- [umarise-wandb](https://pypi.org/project/umarise-wandb/)
+- [umarise-mlflow](https://pypi.org/project/umarise-mlflow/)
+
+## License
+
+[Unlicense](https://unlicense.org) — public domain.
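Steps 2 and 3 of the README's "What happens" list, and the body posted by the `curl` call in its Verify section, can be sketched in Python. This is a minimal sketch rather than the package's own code: the helper names `sha256_ref` and `verify_payload` are hypothetical, and only the `sha256:<hex>` string format and the `{"hash": ...}` body shape are taken from the README. The request itself is deliberately not sent.

```python
import hashlib
import json


def sha256_ref(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the file's SHA-256 as 'sha256:<hex>', reading the file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return f"sha256:{digest.hexdigest()}"


def verify_payload(path: str) -> str:
    """Build the JSON body used in the README's curl example (hypothetical helper)."""
    return json.dumps({"hash": sha256_ref(path)})
```

Only the digest string appears in the payload, which is consistent with the README's statement that file bytes stay on the local machine.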
@@ -0,0 +1,34 @@
+[build-system]
+requires = ["setuptools>=68"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "umarise-huggingface"
+version = "0.1.0"
+description = "Anchor every HuggingFace upload to Bitcoin. Zero-touch provenance for models and datasets."
+readme = "README.md"
+license = {text = "Unlicense"}
+requires-python = ">=3.8"
+authors = [{name = "Umarise", email = "partners@umarise.com"}]
+keywords = [
+    "umarise", "huggingface", "transformers", "anchoring", "bitcoin",
+    "proof-of-existence", "opentimestamps", "provenance", "ai-audit",
+    "model-provenance", "dataset-provenance"
+]
+classifiers = [
+    "Development Status :: 3 - Alpha",
+    "Intended Audience :: Developers",
+    "Intended Audience :: Science/Research",
+    "Topic :: Scientific/Engineering :: Artificial Intelligence",
+    "Topic :: Security :: Cryptography",
+    "Programming Language :: Python :: 3",
+]
+dependencies = [
+    "umarise-core-sdk>=1.1.0",
+    "huggingface-hub>=0.20",
+]
+
+[project.urls]
+Homepage = "https://umarise.com"
+Documentation = "https://umarise.com/api-reference"
+Repository = "https://github.com/AnchoringTrust/umarise-huggingface"
@@ -0,0 +1,19 @@
+"""
+umarise-huggingface: Anchor HuggingFace uploads to Bitcoin.
+
+Usage:
+    # Option 1: Auto-anchor (recommended)
+    import umarise_huggingface
+    umarise_huggingface.enable()
+    # All subsequent upload_file/upload_folder/push_to_hub calls are anchored
+
+    # Option 2: Explicit
+    from umarise_huggingface import anchor_file
+    origin_id = anchor_file("model.safetensors")
+"""
+
+from umarise_huggingface.anchor import anchor_file, anchor_folder
+from umarise_huggingface.hook import enable, disable
+
+__version__ = "0.1.0"
+__all__ = ["anchor_file", "anchor_folder", "enable", "disable"]
@@ -0,0 +1,105 @@
+"""
+Core anchoring functions for HuggingFace Hub uploads.
+
+Architecture:
+    upload_file() / push_to_hub() → HuggingFace Hub (unchanged)
+        → SHA-256 hash locally
+        → Hash submitted to Umarise Core API
+        → origin_id printed to stderr
+"""
+
+import hashlib
+import os
+import sys
+from typing import Optional
+
+
+def _hash_file(path: str) -> str:
+    """Compute SHA-256 of a local file."""
+    h = hashlib.sha256()
+    with open(path, "rb") as f:
+        for chunk in iter(lambda: f.read(8192), b""):
+            h.update(chunk)
+    return h.hexdigest()
+
+
+def _get_client(api_key: Optional[str] = None):
+    """Lazy-import UmariseCore."""
+    from umarise import UmariseCore
+    key = api_key or os.environ.get("UMARISE_API_KEY")
+    if not key:
+        raise RuntimeError(
+            "UMARISE_API_KEY not set. Export it or pass api_key= to enable()."
+        )
+    return UmariseCore(api_key=key)
+
+
+def anchor_file(
+    file_path: str,
+    api_key: Optional[str] = None,
+    repo_id: Optional[str] = None,
+) -> Optional[str]:
+    """
+    Anchor a single file to Bitcoin via Umarise.
+
+    Args:
+        file_path: Local path to the file.
+        api_key: Umarise API key (falls back to UMARISE_API_KEY).
+        repo_id: Optional HF repo ID for logging context.
+
+    Returns:
+        origin_id if successful, None otherwise.
+    """
+    try:
+        file_hash = _hash_file(file_path)
+        client = _get_client(api_key)
+        result = client.attest(f"sha256:{file_hash}")
+
+        origin_id = getattr(result, "origin_id", None) or "unknown"
+        proof_status = getattr(result, "proof_status", None) or "pending"
+
+        label = f"{repo_id}:{os.path.basename(file_path)}" if repo_id else os.path.basename(file_path)
+        print(
+            f"[umarise] ✓ Anchored {label} → {origin_id} ({proof_status})",
+            file=sys.stderr,
+        )
+        return origin_id
+
+    except Exception as e:
+        print(
+            f"[umarise] ⚠ Anchoring failed for {file_path}: {e}",
+            file=sys.stderr,
+        )
+        return None
+
+
+def anchor_folder(
+    folder_path: str,
+    api_key: Optional[str] = None,
+    repo_id: Optional[str] = None,
+    pattern: Optional[str] = None,
+) -> list:
+    """
+    Anchor all files in a folder to Bitcoin.
+
+    Args:
+        folder_path: Local directory path.
+        api_key: Umarise API key.
+        repo_id: Optional HF repo ID for logging context.
+        pattern: Optional glob pattern to filter files (e.g., "*.safetensors").
+
+    Returns:
+        List of (file_path, origin_id) tuples.
+    """
+    import fnmatch
+
+    results = []
+    for root, _, files in os.walk(folder_path):
+        for fname in files:
+            if pattern and not fnmatch.fnmatch(fname, pattern):
+                continue
+            fpath = os.path.join(root, fname)
+            origin_id = anchor_file(fpath, api_key=api_key, repo_id=repo_id)
+            results.append((fpath, origin_id))
+
+    return results
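`anchor_folder()` above walks the directory recursively and applies `pattern` to each file's base name, not to its relative path. The sketch below reproduces only that selection logic so the set of files can be previewed without calling the Umarise API. The helper name `matching_files` is hypothetical and is not part of the package.

```python
import fnmatch
import os
from typing import List, Optional


def matching_files(folder_path: str, pattern: Optional[str] = None) -> List[str]:
    """List files under folder_path whose base name matches pattern.

    Mirrors the walk-and-filter loop in anchor_folder(); with pattern=None
    every file is returned. The result is sorted for stable output.
    """
    matches = []
    for root, _, files in os.walk(folder_path):
        for fname in files:
            if pattern and not fnmatch.fnmatch(fname, pattern):
                continue
            matches.append(os.path.join(root, fname))
    return sorted(matches)
```

Because matching is done on the base name, a pattern such as `sub/*.safetensors` selects nothing, while `*.safetensors` also selects files in nested directories.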
@@ -0,0 +1,100 @@
+"""
+Auto-anchor hook for HuggingFace Hub.
+
+Monkey-patches huggingface_hub.HfApi methods to automatically
+anchor every uploaded file to Bitcoin via Umarise.
+
+Intercepted methods:
+    - upload_file()
+    - upload_folder()
+"""
+
+import os
+import sys
+from typing import Optional
+
+from umarise_huggingface.anchor import anchor_file, anchor_folder
+
+
+# Global state
+_original_upload_file = None
+_original_upload_folder = None
+_enabled = False
+
+
+def enable(api_key: Optional[str] = None):
+    """
+    Enable automatic anchoring for all HuggingFace Hub uploads.
+
+    Call once at the start of your script:
+        import umarise_huggingface
+        umarise_huggingface.enable()
+
+    Every subsequent upload_file() and upload_folder() call
+    will automatically anchor the uploaded files to Bitcoin.
+    """
+    global _original_upload_file, _original_upload_folder, _enabled
+
+    if _enabled:
+        return
+
+    from huggingface_hub import HfApi
+
+    _original_upload_file = HfApi.upload_file
+    _original_upload_folder = HfApi.upload_folder
+
+    _api_key = api_key
+
+    def _patched_upload_file(self, *args, **kwargs):
+        result = _original_upload_file(self, *args, **kwargs)
+
+        # Extract path_or_fileobj from args or kwargs
+        try:
+            path = kwargs.get("path_or_fileobj") or (args[0] if args else None)
+            repo_id = kwargs.get("repo_id") or (args[1] if len(args) > 1 else None)
+
+            if path and isinstance(path, str) and os.path.isfile(path):
+                anchor_file(path, api_key=_api_key, repo_id=repo_id)
+        except Exception as e:
+            print(f"[umarise] ⚠ auto_anchor error: {e}", file=sys.stderr)
+
+        return result
+
+    def _patched_upload_folder(self, *args, **kwargs):
+        result = _original_upload_folder(self, *args, **kwargs)
+
+        try:
+            folder_path = kwargs.get("folder_path") or (args[0] if args else None)
+            repo_id = kwargs.get("repo_id") or (args[1] if len(args) > 1 else None)
+
+            if folder_path and os.path.isdir(folder_path):
+                anchor_folder(folder_path, api_key=_api_key, repo_id=repo_id)
+        except Exception as e:
+            print(f"[umarise] ⚠ auto_anchor error: {e}", file=sys.stderr)
+
+        return result
+
+    HfApi.upload_file = _patched_upload_file
+    HfApi.upload_folder = _patched_upload_folder
+    _enabled = True
+
+    print("[umarise] ✓ HuggingFace auto-anchoring enabled", file=sys.stderr)
+
+
+def disable():
+    """Disable automatic anchoring."""
+    global _original_upload_file, _original_upload_folder, _enabled
+
+    if not _enabled:
+        return
+
+    from huggingface_hub import HfApi
+
+    if _original_upload_file:
+        HfApi.upload_file = _original_upload_file
+    if _original_upload_folder:
+        HfApi.upload_folder = _original_upload_folder
+
+    _original_upload_file = None
+    _original_upload_folder = None
+    _enabled = False
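The patching scheme in `hook.py` (save the original method, install a wrapper that calls it first, swallow hook errors, restore on `disable()`) can be tried in isolation. The sketch below uses a hypothetical stand-in class `Uploader` instead of `huggingface_hub.HfApi`, and a caller-supplied `after_upload` callback in place of `anchor_file()`. Both are assumptions made for illustration, and nothing here touches the network.

```python
import sys


class Uploader:
    """Hypothetical stand-in for a client class such as HfApi."""

    def upload_file(self, path: str) -> str:
        return f"uploaded {path}"


_original = None  # holds the unpatched method while the hook is enabled


def enable(after_upload) -> None:
    """Wrap Uploader.upload_file so after_upload(path) runs after each upload."""
    global _original
    if _original is not None:  # already enabled, do not wrap twice
        return
    original = Uploader.upload_file
    _original = original

    def patched(self, *args, **kwargs):
        result = original(self, *args, **kwargs)  # the real upload runs first
        try:
            after_upload(kwargs.get("path") or (args[0] if args else None))
        except Exception as e:  # a failing hook must not break the upload
            print(f"hook error: {e}", file=sys.stderr)
        return result

    Uploader.upload_file = patched


def disable() -> None:
    """Restore the original method."""
    global _original
    if _original is not None:
        Uploader.upload_file = _original
        _original = None
```

Calling the original before the hook, and catching every exception the hook raises, is what backs the README's "never breaks your pipeline" principle: the upload result is returned whether or not the callback succeeds.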
@@ -0,0 +1,123 @@
+Metadata-Version: 2.4
+Name: umarise-huggingface
+Version: 0.1.0
+Summary: Anchor every HuggingFace upload to Bitcoin. Zero-touch provenance for models and datasets.
+Author-email: Umarise <partners@umarise.com>
+License: Unlicense
+Project-URL: Homepage, https://umarise.com
+Project-URL: Documentation, https://umarise.com/api-reference
+Project-URL: Repository, https://github.com/AnchoringTrust/umarise-huggingface
+Keywords: umarise,huggingface,transformers,anchoring,bitcoin,proof-of-existence,opentimestamps,provenance,ai-audit,model-provenance,dataset-provenance
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: Science/Research
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Classifier: Topic :: Security :: Cryptography
+Classifier: Programming Language :: Python :: 3
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+Requires-Dist: umarise-core-sdk>=1.1.0
+Requires-Dist: huggingface-hub>=0.20
+
+# umarise-huggingface
+
+**Anchor every HuggingFace upload to Bitcoin. Automatically.**
+
+`umarise-huggingface` intercepts `upload_file()` and `upload_folder()` to automatically compute a local SHA-256 hash and anchor it to Bitcoin via the Umarise Core API. No files are transmitted — only the hash.
+
+## Install
+
+```bash
+pip install umarise-huggingface
+export UMARISE_API_KEY=um_your_key_here
+```
+
+## Usage
+
+### Option 1: Auto-anchor (recommended)
+
+```python
+import umarise_huggingface
+from huggingface_hub import HfApi
+
+umarise_huggingface.enable()
+
+api = HfApi()
+api.upload_file(
+    path_or_fileobj="model.safetensors",
+    path_in_repo="model.safetensors",
+    repo_id="your-org/your-model",
+)
+# [umarise] ✓ Anchored your-org/your-model:model.safetensors → origin_id: abc123...
+```
+
+### Option 2: With transformers push_to_hub
+
+```python
+import umarise_huggingface
+umarise_huggingface.enable()
+
+# Works automatically with any push_to_hub call
+model.push_to_hub("your-org/your-model")
+tokenizer.push_to_hub("your-org/your-model")
+# Each file is automatically anchored
+```
+
+### Option 3: Explicit
+
+```python
+from umarise_huggingface import anchor_file, anchor_folder
+
+# Single file
+origin_id = anchor_file("model.safetensors")
+
+# Entire directory
+results = anchor_folder("./model_output/", pattern="*.safetensors")
+```
+
+## What happens
+
+1. You upload via HuggingFace Hub — upload proceeds normally
+2. SHA-256 is computed **locally** — bytes never leave your machine
+3. Hash is submitted to Umarise Core API → anchored to Bitcoin
+4. `origin_id` is printed to stderr
+5. Within ~2 hours, proof is confirmed on the Bitcoin blockchain
+
+## What this proves
+
+- ✓ This exact model/dataset existed no later than time T
+- ✓ The artifact was not modified after anchoring
+- ✓ Anyone can verify — no Umarise account needed
+
+## Verify
+
+```bash
+# Get the hash of your file
+shasum -a 256 model.safetensors
+
+# Verify against Umarise
+curl -s https://core.umarise.com/v1-core-verify \
+  -H 'Content-Type: application/json' \
+  -d '{"hash":"sha256:YOUR_HASH"}' | python3 -m json.tool
+```
+
+Or visit [verify-anchoring.org](https://verify-anchoring.org)
+
+## Design principles
+
+- **Never breaks your pipeline** — if anchoring fails, uploads continue normally
+- **Zero storage** — only the SHA-256 hash is sent to Umarise
+- **Vault-independent** — HuggingFace stores the data, Umarise stores the proof
+- **Independently verifiable** — no Umarise account needed to verify
+
+## Links
+
+- [Umarise](https://umarise.com)
+- [API Reference](https://umarise.com/api-reference)
+- [Anchoring Specification](https://anchoring-spec.org)
+- [umarise-wandb](https://pypi.org/project/umarise-wandb/)
+- [umarise-mlflow](https://pypi.org/project/umarise-mlflow/)
+
+## License
+
+[Unlicense](https://unlicense.org) — public domain.
@@ -0,0 +1,10 @@
+README.md
+pyproject.toml
+umarise_huggingface/__init__.py
+umarise_huggingface/anchor.py
+umarise_huggingface/hook.py
+umarise_huggingface.egg-info/PKG-INFO
+umarise_huggingface.egg-info/SOURCES.txt
+umarise_huggingface.egg-info/dependency_links.txt
+umarise_huggingface.egg-info/requires.txt
+umarise_huggingface.egg-info/top_level.txt
@@ -0,0 +1 @@
+
@@ -0,0 +1 @@
+umarise_huggingface