aimirror 0.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- aimirror-0.1.0/MANIFEST.in +4 -0
- aimirror-0.1.0/PKG-INFO +308 -0
- aimirror-0.1.0/README.md +273 -0
- aimirror-0.1.0/aimirror.egg-info/PKG-INFO +308 -0
- aimirror-0.1.0/aimirror.egg-info/SOURCES.txt +14 -0
- aimirror-0.1.0/aimirror.egg-info/dependency_links.txt +1 -0
- aimirror-0.1.0/aimirror.egg-info/entry_points.txt +2 -0
- aimirror-0.1.0/aimirror.egg-info/requires.txt +11 -0
- aimirror-0.1.0/aimirror.egg-info/top_level.txt +4 -0
- aimirror-0.1.0/cache.py +129 -0
- aimirror-0.1.0/config.yaml +65 -0
- aimirror-0.1.0/downloader.py +136 -0
- aimirror-0.1.0/main.py +239 -0
- aimirror-0.1.0/pyproject.toml +55 -0
- aimirror-0.1.0/router.py +56 -0
- aimirror-0.1.0/setup.cfg +4 -0
aimirror-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,308 @@
Metadata-Version: 2.4
Name: aimirror
Version: 0.1.0
Summary: Download mirror accelerator for the AI era - a multi-threaded download-accelerating proxy for Docker/PyPI/HuggingFace
Author-email: Your Name <your.email@example.com>
License: MIT
Project-URL: Homepage, https://github.com/livehl/aimirror
Project-URL: Repository, https://github.com/livehl/aimirror
Project-URL: Issues, https://github.com/livehl/aimirror/issues
Keywords: proxy,download,accelerator,docker,pypi,huggingface,cache
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet :: Proxy Servers
Classifier: Topic :: System :: Networking
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: fastapi>=0.100.0
Requires-Dist: uvicorn[standard]>=0.23.0
Requires-Dist: httpx[http2]>=0.25.0
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: aiofiles>=23.0.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"

# 🚀 aimirror

[](https://www.python.org/)
[](https://fastapi.tiangolo.com/)
[](LICENSE)

> A download mirror accelerator for the AI era, a self-rescue tool for engineers driven mad by slow networks

## 💡 Background

As an AI engineer, my daily work depends on:
- `pip install torch`: wheel packages hundreds of MB in size take forever to download
- `docker pull nvidia/cuda`: multi-GB image layers get downloaded over and over
- `huggingface-cli download`: model files crawl in from HuggingFace at a snail's pace

The company intranet has a proxy, but single-threaded downloads of large files are still maddeningly slow, and re-downloading the same package hits no cache at all. Fed up, I wrote this tool.

**aimirror** = smart routing + parallel chunked downloads + local caching, to make downloads fly.

## ✨ Features

- **⚡ Parallel downloads**: HTTP Range chunking with concurrent connections to saturate your bandwidth
- **💾 Smart caching**: deduplication by file digest, with automatic LRU eviction
- **🎯 Dynamic routing**: small files are proxied directly, large files are downloaded in parallel automatically
- **🔗 Multiple sources**: Docker Hub, PyPI, CRAN, and HuggingFace work out of the box
- **🔌 Arbitrarily extensible**: any HTTP download can be made tens of times faster with a single rule
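The Range-chunking idea behind the parallel downloads can be sketched as follows. This is a minimal illustration, not aimirror's actual `downloader.py`; the function names are ours:

```python
# Minimal sketch of HTTP Range chunking: probe the total size, split it into
# byte ranges, fetch the ranges concurrently, and stitch the parts in order.
import concurrent.futures
import urllib.request


def split_ranges(total: int, chunk_size: int) -> list[tuple[int, int]]:
    """Inclusive (start, end) byte ranges covering [0, total)."""
    return [(s, min(s + chunk_size, total) - 1) for s in range(0, total, chunk_size)]


def fetch_range(url: str, start: int, end: int) -> bytes:
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:  # expects 206 Partial Content
        return resp.read()


def parallel_download(url: str, workers: int = 20, chunk_size: int = 10 * 1024 * 1024) -> bytes:
    # HEAD request to learn the size; the upstream must support Range requests.
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as resp:
        total = int(resp.headers["Content-Length"])
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda r: fetch_range(url, *r), split_ranges(total, chunk_size))
    return b"".join(parts)
```

If the upstream ignores `Range` headers this degrades to fetching the whole file per chunk, which is why the rules below gate parallelism behind a `min_size` and a pattern match.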
## 🏗️ Architecture

```mermaid
flowchart LR
    subgraph Client["Client"]
        PIP[pip install]
        DOCKER[docker pull]
        HF[huggingface-cli]
    end

    subgraph aimirror["aimirror service"]
        ROUTER[Route matcher<br/>router.py]
        PROXY[Direct proxy<br/>small files]
        DOWNLOADER[Parallel downloader<br/>downloader.py]
        CACHE[Cache manager<br/>cache.py]
    end

    subgraph UpstreamProxy["Upstream proxy (optional)"]
        COMPANY_PROXY[Company proxy<br/>http://proxy.company.com:8080]
    end

    subgraph Upstream["Upstream services"]
        PYPI[PyPI]
        DOCKER_HUB[Docker Hub]
        HF_HUB[HuggingFace]
    end

    PIP --> ROUTER
    DOCKER --> ROUTER
    HF --> ROUTER

    ROUTER -->|small file| PROXY
    ROUTER -->|large file| DOWNLOADER

    PROXY --> UpstreamProxy
    DOWNLOADER --> CACHE
    CACHE -->|miss| UpstreamProxy
    CACHE -->|hit| Client

    UpstreamProxy -->|optional| Upstream
    UpstreamProxy -.->|direct| Upstream

    DOCKER_HUB -->|returns file| CACHE
    PYPI -->|returns file| CACHE
    HF_HUB -->|returns file| CACHE
```

## 🚀 Quick start

```bash
# Install
pip install -r requirements.txt

# Start
python main.py

# Use
curl http://localhost:8081/health
```

## 🔧 Client configuration

**pip**
```bash
pip install torch --index-url http://localhost:8081/simple --trusted-host localhost:8081
```

**Docker**
```json
{
  "registry-mirrors": ["http://localhost:8081"]
}
```

**HuggingFace (huggingface-cli)**
```bash
# Set the endpoint environment variable
export HF_ENDPOINT=http://localhost:8081

# Download a model file (all file types are supported: .gguf, .bin, .safetensors, .json, etc.)
huggingface-cli download TheBloke/Llama-2-7B-GGUF llama-2-7b.Q4_K_M.gguf

# Download an entire repository
huggingface-cli download meta-llama/Llama-2-7b-hf --local-dir ./models
```

Or from Python:
```python
import os
os.environ["HF_ENDPOINT"] = "http://localhost:8081"

from huggingface_hub import hf_hub_download, snapshot_download

# Download a single file
hf_hub_download(repo_id="TheBloke/Llama-2-7B-GGUF", filename="llama-2-7b.Q4_K_M.gguf")

# Download an entire repository
snapshot_download(repo_id="meta-llama/Llama-2-7b-hf", local_dir="./models")
```

## 📖 API

| Path | Description |
|------|-------------|
| `/*` | Proxied to the matching upstream (Docker/PyPI/CRAN/HuggingFace) |
| `/health` | Health check |
| `/stats` | Cache statistics |

## 🐳 Docker deployment

```bash
# From the GitHub Container Registry
docker pull ghcr.io/livehl/aimirror:latest

# Run
docker run -d -p 8081:8081 -v $(pwd)/cache:/data/fast_proxy/cache ghcr.io/livehl/aimirror:latest
```

## ⚙️ Example configuration

```yaml
server:
  host: "0.0.0.0"
  port: 8081
  upstream_proxy: "http://proxy.company.com:8080"  # corporate proxy (optional)

cache:
  dir: "./cache"
  max_size_gb: 100

rules:
  - name: docker-blob
    pattern: "/v2/.*/blobs/sha256:[a-f0-9]+"
    upstream: "https://registry-1.docker.io"
    strategy: parallel
    min_size: 1048576
    concurrency: 20
    chunk_size: 10485760

  - name: pip-wheel
    pattern: '/packages/.+\.whl$'
    upstream: "https://pypi.org"
    strategy: parallel
    min_size: 1048576
    concurrency: 20
    chunk_size: 5242880

  # HuggingFace file downloads (all model files supported; cache keying works around temporary signed URLs)
  - name: huggingface-files
    pattern: '/.*/(blob|resolve)/main/.+'
    upstream: "https://huggingface.co"
    strategy: parallel
    min_size: 1048576
    concurrency: 20
    chunk_size: 10485760
    cache_key_source: original  # use the original URL as the cache key
    path_rewrite:
      - search: "/blob/main/"
        replace: "/resolve/main/"

  # Example: extend to any HTTP download site
  # - name: my-custom-repo
  #   pattern: '/downloads/.+\.(tar\.gz|zip|bin)$'
  #   upstream: "https://downloads.example.com"
  #   strategy: parallel
  #   min_size: 10485760    # parallelize files above 10 MB
  #   concurrency: 16
  #   chunk_size: 20971520  # 20 MB chunks

  - name: default
    pattern: ".*"
    upstream: "https://pypi.org"
    strategy: proxy
```
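The `cache.max_size_gb` bound is enforced by LRU eviction keyed on file digests. One way to picture the bookkeeping (a sketch under our own naming, not the real `cache.py`):

```python
from collections import OrderedDict


class LRUCache:
    """Size-bounded cache index: evict least-recently-used digests past max_bytes."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self.entries: "OrderedDict[str, int]" = OrderedDict()  # digest -> size in bytes

    def touch(self, digest: str) -> None:
        self.entries.move_to_end(digest)  # mark as most recently used

    def put(self, digest: str, size: int) -> list[str]:
        """Record a stored file; return the digests evicted to make room."""
        evicted: list[str] = []
        if digest in self.entries:
            self.touch(digest)  # dedup: the same digest is stored only once
            return evicted
        self.entries[digest] = size
        self.used += size
        while self.used > self.max_bytes:
            old_digest, old_size = self.entries.popitem(last=False)  # oldest first
            self.used -= old_size
            evicted.append(old_digest)
        return evicted
```

Keying on the digest is also what lets the same Docker layer or wheel be served from cache regardless of which URL requested it.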
### Configuration reference

| Field | Description |
|------|-------------|
| `server.upstream_proxy` | Optional corporate proxy used to reach the outside network |
| `rules[].upstream` | Base URL of the upstream source |
| `rules[].strategy` | `proxy` for direct proxying / `parallel` for parallel download |
| `rules[].path_rewrite` | Path rewrite rules (e.g. HuggingFace blob → resolve) |
| `rules[].cache_key_source` | `original` uses the original URL as the cache key (works around temporary signed URLs) |
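Rules are evaluated in order and the first match wins, which is why the catch-all `default` rule goes last. A sketch of that matching logic (our own illustration, assuming regex search semantics, not `router.py` verbatim):

```python
import re

# Ordered as in config.yaml: most specific rules first, catch-all last.
RULES = [
    {"name": "docker-blob", "pattern": r"/v2/.*/blobs/sha256:[a-f0-9]+", "strategy": "parallel"},
    {"name": "pip-wheel", "pattern": r"/packages/.+\.whl$", "strategy": "parallel"},
    {"name": "default", "pattern": r".*", "strategy": "proxy"},
]


def match_rule(path: str) -> dict:
    """Return the first rule whose pattern matches the request path."""
    for rule in RULES:
        if re.search(rule["pattern"], path):
            return rule
    raise LookupError(path)  # unreachable while a catch-all rule exists
```

A path that only the catch-all matches is proxied directly rather than chunked, mirroring the small-file/large-file split in the architecture above.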
## 🧪 Testing

### Run the tests

```bash
# Run the simple tests (no pytest required)
python test_simple.py

# Run the full test suite (requires pytest)
pytest test_proxy.py -v
```

### Manual verification

**Test the PyPI proxy**
```bash
curl -o /dev/null "http://localhost:8081/packages/fb/d7/71b982339efc4fff3c622c6fefecddfd3e0b35b60c5f822872d5b806bb71/torch-1.0.0-cp27-cp27m-manylinux1_x86_64.whl" \
  -w "HTTP: %{http_code}, Size: %{size_download}, Time: %{time_total}s\n"
```

**Test the HuggingFace proxy**
```bash
export HF_ENDPOINT=http://localhost:8081

# Download a GGUF model file
huggingface-cli download TheBloke/Llama-2-7B-GGUF llama-2-7b.Q4_K_M.gguf

# Download a safetensors model
huggingface-cli download meta-llama/Llama-2-7b-hf model-00001-of-00002.safetensors

# Download an entire repository
huggingface-cli download sentence-transformers/all-MiniLM-L6-v2 --local-dir ./test-model
```

**Test the Docker Registry proxy**
```bash
# Get a token
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/nginx:pull" \
  | grep -o '"token":"[^"]*"' | cut -d'"' -f4)

# Download a blob
curl -o /dev/null "http://localhost:8081/v2/library/nginx/blobs/sha256:abc123" \
  -H "Authorization: Bearer $TOKEN" \
  -w "HTTP: %{http_code}, Size: %{size_download}, Time: %{time_total}s\n"
```

**Verify cache hits**
```bash
# First download (parallel fetch)
time curl -o /tmp/test1.gguf "http://localhost:8081/unsloth/model/resolve/main/file.gguf"

# Second download (cache hit, should be much faster)
time curl -o /tmp/test2.gguf "http://localhost:8081/unsloth/model/resolve/main/file.gguf"

# Inspect cache statistics
curl http://localhost:8081/stats | jq
```

## 📄 License

MIT
aimirror-0.1.0/README.md
ADDED
@@ -0,0 +1,273 @@
(Content identical to the README body embedded in the PKG-INFO above.)