hyperretro-0.3.0-py3-none-any.whl

@@ -0,0 +1,146 @@
1
+ Metadata-Version: 2.4
2
+ Name: hyperretro
3
+ Version: 0.3.0
4
+ Summary: Geometric LLM compression: factor+int4 with verifiable quality certificates
5
+ Author-email: "William Ken Ohara Stewart (NagusameCS)" <nagusamecs@proton.me>
6
+ License: MIT License
7
+
8
+ Copyright (c) 2026 William "Nagusame" Ken Ohara Stewart
9
+
10
+ Permission is hereby granted, free of charge, to any person obtaining a copy
11
+ of this software and associated documentation files (the "Software"), to deal
12
+ in the Software without restriction, including without limitation the rights
13
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
14
+ copies of the Software, and to permit persons to whom the Software is
15
+ furnished to do so, subject to the following conditions:
16
+
17
+ The above copyright notice and this permission notice shall be included in all
18
+ copies or substantial portions of the Software.
19
+
20
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
21
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
22
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
23
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
24
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
25
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
26
+ SOFTWARE.
27
+
28
+ Project-URL: Repository, https://github.com/NagusameCS/HyperTensor
29
+ Project-URL: Documentation, https://github.com/NagusameCS/HyperTensor/tree/main/hyperretro
30
+ Project-URL: Issues, https://github.com/NagusameCS/HyperTensor/issues
31
+ Project-URL: Changelog, https://github.com/NagusameCS/HyperTensor/blob/main/CHANGELOG.md
32
+ Keywords: llm,compression,quantization,speculative-decoding,pytorch,huggingface,vllm,gguf,safetensors,low-rank,svd,grc
33
+ Classifier: Development Status :: 4 - Beta
34
+ Classifier: Intended Audience :: Science/Research
35
+ Classifier: License :: OSI Approved :: MIT License
36
+ Classifier: Operating System :: OS Independent
37
+ Classifier: Programming Language :: Python :: 3
38
+ Classifier: Programming Language :: Python :: 3.10
39
+ Classifier: Programming Language :: Python :: 3.11
40
+ Classifier: Programming Language :: Python :: 3.12
41
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
42
+ Requires-Python: >=3.10
43
+ Description-Content-Type: text/markdown
44
+ License-File: LICENSE
45
+ Requires-Dist: numpy>=1.24
46
+ Requires-Dist: safetensors>=0.4
47
+ Provides-Extra: hf
48
+ Requires-Dist: transformers>=4.40; extra == "hf"
49
+ Requires-Dist: torch>=2.1; extra == "hf"
50
+ Requires-Dist: huggingface_hub>=0.20; extra == "hf"
51
+ Provides-Extra: vllm
52
+ Requires-Dist: vllm>=0.5; extra == "vllm"
53
+ Provides-Extra: bench
54
+ Requires-Dist: torch>=2.1; extra == "bench"
55
+ Requires-Dist: transformers>=4.40; extra == "bench"
56
+ Requires-Dist: datasets>=2.18; extra == "bench"
57
+ Requires-Dist: tqdm; extra == "bench"
58
+ Provides-Extra: dev
59
+ Requires-Dist: pytest>=7; extra == "dev"
60
+ Requires-Dist: pytest-cov; extra == "dev"
61
+ Dynamic: license-file
62
+
63
+ # HyperRetro
64
+
65
+ **HyperTensor, retrofitted into the PyTorch / HuggingFace / vLLM ecosystem.**
66
+
67
+ HyperTensor proper is a standalone runtime. HyperRetro is the *integrated*
68
+ sibling project: it takes the same geometric primitives (UGT shared basis,
69
+ GRC / sink-aware projection, geodesic speculative draft, fused dual-Q8
70
+ GEMV) and exposes them as drop-in pieces of the standard inference stack.
71
+
72
+ ```
73
+ hyperretro/
74
+ ├── kernels/ # PyTorch C++ extension (gemv_dual_q8_0, ...)
75
+ ├── hf/ # offline HuggingFace compression -> .safetensors
76
+ ├── vllm/ # speculative-decoding draft adapter
77
+ └── bench/ # 3-way benchmark harness (baseline | retro | HyperTensor)
78
+ ```
79
+
80
+ ## Three retrofits
81
+
82
+ ### 1. Fused kernels as a PyTorch extension
83
+
84
+ The CUDA kernel `kernel_gemv_dual_q8_0` from
85
+ [`runtime/nn/cuda_kernels.cu`](../runtime/nn/cuda_kernels.cu) is wrapped as a
86
+ JIT-built `torch.utils.cpp_extension` so users can call it from regular
87
+ PyTorch:
88
+
89
+ ```python
90
+ import hyperretro
91
+ import torch
92
+
93
+ x = torch.randn(4096)
94
+ # Wa, Wb may be float matrices or pre-quantized (scale, codes) tuples
95
+ out_a, out_b = hyperretro.gemv_dual_q8_0(x, Wa, Wb)
96
+ ```
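For readers unfamiliar with the format, the following NumPy sketch shows what a dual Q8_0 GEMV computes. It assumes the GGML-style Q8_0 layout (blocks of 32 int8 codes sharing one scale) and a `(rows, in_dim)` weight orientation. The names `quantize_q8_0` and `gemv_dual_q8_0_ref` are hypothetical and are not the package's API.

```python
import numpy as np

BLOCK = 32  # assumed Q8_0 block size (GGML convention)

def quantize_q8_0(W):
    """Quantize each row of W block-wise: one float scale plus int8 codes per block."""
    rows, in_dim = W.shape
    assert in_dim % BLOCK == 0, "in_dim must be a multiple of the block size"
    blocks = W.reshape(rows, in_dim // BLOCK, BLOCK)
    amax = np.abs(blocks).max(axis=-1, keepdims=True)
    # Avoid division by zero for all-zero blocks.
    scale = np.where(amax > 0, amax / 127.0, 1.0).astype(np.float32)
    codes = np.clip(np.rint(blocks / scale), -127, 127).astype(np.int8)
    return scale, codes

def gemv_q8_0_ref(x, q):
    """Reference GEMV against one (scale, codes) pair."""
    scale, codes = q
    _, nblk, blk = codes.shape
    xb = x.reshape(nblk, blk).astype(np.float32)
    # Per-block integer-code dot products, rescaled and summed per row.
    partial = np.einsum("rnb,nb->rn", codes.astype(np.float32), xb)
    return (partial * scale[..., 0]).sum(axis=1)

def gemv_dual_q8_0_ref(x, qa, qb):
    """Two GEMVs over the same activation vector; a fused kernel would read x once."""
    return gemv_q8_0_ref(x, qa), gemv_q8_0_ref(x, qb)

rng = np.random.default_rng(0)
x = rng.standard_normal(256).astype(np.float32)
Wa = rng.standard_normal((64, 256)).astype(np.float32)
Wb = rng.standard_normal((64, 256)).astype(np.float32)

out_a, out_b = gemv_dual_q8_0_ref(x, quantize_q8_0(Wa), quantize_q8_0(Wb))
rel_err_a = np.linalg.norm(out_a - Wa @ x) / np.linalg.norm(Wa @ x)
rel_err_b = np.linalg.norm(out_b - Wb @ x) / np.linalg.norm(Wb @ x)
```

Comparing against the float product `Wa @ x` gives a quick sanity check on the quantization error before timing any backend.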
97
+
98
+ Backend resolution: `cext` (JIT-compiled C extension) → `torch` (pure
99
+ torch reference) → `numpy` (always works). Force the fallback with
100
+ `HYPERRETRO_FORCE_FALLBACK=1`.
101
+
102
+ ### 2. Offline HuggingFace compression
103
+
104
+ A single CLI takes a vanilla HF model, runs the GRC projection / sink-aware
105
+ GRC pipeline ([Paper E](../ARXIV_SUBMISSIONS/paper-V)), and writes the
106
+ result back out as standard `.safetensors` shards that load with stock
107
+ `AutoModelForCausalLM.from_pretrained`:
108
+
109
+ ```bash
110
+ pip install -e hyperretro[hf]
111
+ hyperretro-compress \
112
+ --model Qwen/Qwen2.5-0.5B-Instruct \
113
+ --out ./qwen-grc-1024/ \
114
+ --rank 1024 \
115
+ --sink 4
116
+ ```
117
+
118
+ The output directory is 100 % HuggingFace-native — no HyperTensor runtime
119
+ needed at inference time. A `hyperretro_report.json` is written alongside
120
+ recording the per-layer Frobenius rel-err.
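To make the reported metric concrete, here is a minimal sketch of a rank-limited factorization and its Frobenius relative error. It uses plain truncated SVD as a stand-in; the GRC and sink-aware steps are not described in this README, and the `report` dictionary is illustrative rather than the schema of `hyperretro_report.json`.

```python
import numpy as np

def low_rank_factor(W, rank):
    """Best rank-`rank` approximation of W as A @ B (truncated SVD stand-in)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # fold singular values into the left factor
    B = Vt[:rank]
    return A, B

def frobenius_rel_err(W, W_hat):
    """||W - W_hat||_F / ||W||_F (np.linalg.norm defaults to Frobenius for 2-D)."""
    return float(np.linalg.norm(W - W_hat) / np.linalg.norm(W))

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 96))  # stand-in for one layer's weight matrix

report = {}
for r in (8, 32, 96):
    A, B = low_rank_factor(W, r)
    report[r] = frobenius_rel_err(W, A @ B)
```

The error falls as the rank grows and reaches floating-point noise at full rank, which is the behaviour a per-layer report makes visible when choosing `--rank`.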
121
+
122
+ ### 3. Geodesic speculative draft for vLLM
123
+
124
+ `hyperretro.vllm.GeodesicDraft` replaces the random / smaller-model draft
125
+ proposer in vLLM-style speculative decoding with the geodesic-step
126
+ draft from [Paper C](../docs/papers/03-speculative-decoding.html). The
127
+ adapter is framework-agnostic (`propose(h_curr, h_prev) -> (token_ids,
128
+ confidences)`) and includes a `register_with_vllm()` hook for live
129
+ deployments.
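The `propose(h_curr, h_prev) -> (token_ids, confidences)` shape can be illustrated with a toy proposer. The class below is hypothetical: it extrapolates the hidden state linearly and scores tokens against an unembedding matrix, which is a simple stand-in and not the geodesic step from Paper C. The constructor arguments are assumptions as well.

```python
import numpy as np

class ToyExtrapolationDraft:
    """Toy draft proposer with the same propose() signature (not GeodesicDraft)."""

    def __init__(self, unembed, top_k=4):
        self.unembed = unembed  # assumed shape: (vocab, d_model)
        self.top_k = top_k

    def propose(self, h_curr, h_prev):
        # Linear extrapolation of the hidden-state trajectory (stand-in step).
        h_next = h_curr + (h_curr - h_prev)
        logits = self.unembed @ h_next
        # Numerically stable softmax over the vocabulary.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        token_ids = np.argsort(probs)[::-1][: self.top_k]
        return token_ids, probs[token_ids]

rng = np.random.default_rng(0)
d_model, vocab = 512, 2048
draft = ToyExtrapolationDraft(0.05 * rng.standard_normal((vocab, d_model)), top_k=4)
h_prev = rng.standard_normal(d_model)
h_curr = rng.standard_normal(d_model)
token_ids, confidences = draft.propose(h_curr, h_prev)
```

A verifier in speculative decoding would then accept or reject the proposed `token_ids`; the confidences give it a cheap ordering to work from.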
130
+
131
+ ## Benchmarks
132
+
133
+ ```bash
134
+ hyperretro-bench kernel --rows 4096 --in-dim 4096
135
+ hyperretro-bench spec --d-model 512 --k 64 --vocab 2048 --steps 64
136
+ hyperretro-bench compress --model Qwen/Qwen2.5-0.5B --out /tmp/qwen-retro \
137
+ --rank 256 --eval-text "The quick brown fox..."
138
+ ```
139
+
140
+ Each subcommand emits a JSON report comparing **standard baseline**,
141
+ **HyperRetro**, and (where applicable) **standalone HyperTensor**.
142
+
143
+ ## License
144
+
145
+ MIT for code, CC-BY-4.0 for the accompanying documentation/papers — same
146
+ as the parent HyperTensor project.
@@ -0,0 +1,6 @@
1
+ hyperretro-0.3.0.dist-info/licenses/LICENSE,sha256=Y6hoyise223tZL3FphgIR7K87fvF9jfXeMYtNrnTKkg,1114
2
+ hyperretro-0.3.0.dist-info/METADATA,sha256=0F-pcqQd8HZicRRq2QqRYqH0F5IvXlJzy0l9lZdHfv8,6196
3
+ hyperretro-0.3.0.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
4
+ hyperretro-0.3.0.dist-info/entry_points.txt,sha256=AWAmTID1J8IzsoVqHpzRRMASBcB1R7LnRO0CZf4Hceg,261
5
+ hyperretro-0.3.0.dist-info/top_level.txt,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
6
+ hyperretro-0.3.0.dist-info/RECORD,,
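Each RECORD line has the form `path,hash,size`, where the hash is the SHA-256 digest encoded as URL-safe base64 with the `=` padding removed. The sketch below recomputes one entry; `record_hash` and `check_record_line` are hypothetical helper names. The `top_level.txt` entry above has size 1, and a single newline byte reproduces its recorded hash.

```python
import base64
import hashlib

def record_hash(data: bytes) -> str:
    """Hash a file's bytes the way wheel RECORD entries encode it."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded

def check_record_line(line: str, data: bytes) -> bool:
    """Compare one 'path,hash,size' RECORD line against the file's bytes."""
    _path, expected_hash, expected_size = line.rsplit(",", 2)
    return record_hash(data) == expected_hash and len(data) == int(expected_size)

line = (
    "hyperretro-0.3.0.dist-info/top_level.txt,"
    "sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1"
)
ok = check_record_line(line, b"\n")
```

The entry for RECORD itself carries an empty hash and size (`RECORD,,`), so a full verifier would skip that line rather than parse it.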
@@ -0,0 +1,5 @@
1
+ Wheel-Version: 1.0
2
+ Generator: setuptools (82.0.1)
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
5
+
@@ -0,0 +1,6 @@
1
+ [console_scripts]
2
+ hyperretro = hyperretro.cli:main
3
+ hyperretro-bench = hyperretro.bench.run:_cli_main
4
+ hyperretro-compress = hyperretro.hf.compress:_cli_main
5
+ hyperretro-distill = hyperretro.hf.distill:_cli_main
6
+ hyperretro-vllm = hyperretro.vllm_adapter:_cli_main
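Each `console_scripts` line maps a command name to a `module:attribute` target that the installer wraps in a launcher script. As a small sketch, the standard library can parse the file above and split each target without importing the package; the variable names are illustrative.

```python
import configparser
from importlib.metadata import EntryPoint

ENTRY_POINTS_TXT = """\
[console_scripts]
hyperretro = hyperretro.cli:main
hyperretro-bench = hyperretro.bench.run:_cli_main
hyperretro-compress = hyperretro.hf.compress:_cli_main
hyperretro-distill = hyperretro.hf.distill:_cli_main
hyperretro-vllm = hyperretro.vllm_adapter:_cli_main
"""

parser = configparser.ConfigParser(delimiters=("=",))
parser.optionxform = str  # keep command names case-sensitive
parser.read_string(ENTRY_POINTS_TXT)

# Build EntryPoint objects; .module and .attr split the "module:attr" value.
scripts = {
    name: EntryPoint(name=name, value=value, group="console_scripts")
    for name, value in parser["console_scripts"].items()
}
targets = {name: (ep.module, ep.attr) for name, ep in scripts.items()}
```

Calling `ep.load()` on any of these would import the module and return the callable, which requires the package to be installed.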
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 William "Nagusame" Ken Ohara Stewart
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1 @@
1
+