hedge-python 0.1.0__py3-none-any.whl

@@ -0,0 +1,367 @@
1
+ Metadata-Version: 2.4
2
+ Name: hedge-python
3
+ Version: 0.1.0
4
+ Summary: Adaptive hedged request library for Python. Learns per-host latency via DDSketch, fires backup requests at estimated p90, caps hedge rate with token bucket.
5
+ Author-email: LeoSun <sunhailin.shl@antgroup.com>
6
+ License: MIT
7
+ License-File: LICENSE
8
+ Keywords: aiohttp,ddsketch,grpc,hedge,httpx,latency,tail-latency
9
+ Classifier: Development Status :: 3 - Alpha
10
+ Classifier: Framework :: AsyncIO
11
+ Classifier: Intended Audience :: Developers
12
+ Classifier: License :: OSI Approved :: MIT License
13
+ Classifier: Programming Language :: Python :: 3
14
+ Classifier: Programming Language :: Python :: 3.9
15
+ Classifier: Programming Language :: Python :: 3.10
16
+ Classifier: Programming Language :: Python :: 3.11
17
+ Classifier: Programming Language :: Python :: 3.12
18
+ Classifier: Programming Language :: Python :: 3.13
19
+ Classifier: Programming Language :: Python :: 3.14
20
+ Classifier: Topic :: Software Development :: Libraries
21
+ Classifier: Typing :: Typed
22
+ Requires-Python: >=3.9
23
+ Provides-Extra: aiohttp
24
+ Requires-Dist: aiohttp>=3.9.0; extra == 'aiohttp'
25
+ Provides-Extra: all
26
+ Requires-Dist: aiohttp>=3.9.0; extra == 'all'
27
+ Requires-Dist: grpcio>=1.50.0; extra == 'all'
28
+ Requires-Dist: httpx>=0.24.0; extra == 'all'
29
+ Requires-Dist: protobuf>=4.21.0; extra == 'all'
30
+ Provides-Extra: dev
31
+ Requires-Dist: aiohttp>=3.9.0; extra == 'dev'
32
+ Requires-Dist: grpcio-tools>=1.50.0; extra == 'dev'
33
+ Requires-Dist: grpcio>=1.50.0; extra == 'dev'
34
+ Requires-Dist: httpx>=0.24.0; extra == 'dev'
35
+ Requires-Dist: matplotlib>=3.7.0; extra == 'dev'
36
+ Requires-Dist: mypy>=1.0; extra == 'dev'
37
+ Requires-Dist: protobuf>=4.21.0; extra == 'dev'
38
+ Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
39
+ Requires-Dist: pytest-benchmark>=4.0; extra == 'dev'
40
+ Requires-Dist: pytest-cov>=4.0; extra == 'dev'
41
+ Requires-Dist: pytest-timeout>=2.1.0; extra == 'dev'
42
+ Requires-Dist: pytest>=7.0; extra == 'dev'
43
+ Requires-Dist: ruff>=0.4.0; extra == 'dev'
44
+ Provides-Extra: grpc
45
+ Requires-Dist: grpcio>=1.50.0; extra == 'grpc'
46
+ Requires-Dist: protobuf>=4.21.0; extra == 'grpc'
47
+ Provides-Extra: httpx
48
+ Requires-Dist: httpx>=0.24.0; extra == 'httpx'
49
+ Description-Content-Type: text/markdown
50
+
51
+ # hedge-python
52
+
53
+ **English** | [简体中文](README.zh-CN.md) | [日本語](README.ja.md)
54
+
55
+ [![CI](https://github.com/sunhailin-Leo/hedge-python/actions/workflows/ci.yml/badge.svg)](https://github.com/sunhailin-Leo/hedge-python/actions)
56
+ [![Coverage](https://img.shields.io/badge/coverage-97%25-brightgreen.svg)](#testing)
57
+ [![Python](https://img.shields.io/badge/python-3.9%E2%80%933.13-blue.svg)](pyproject.toml)
58
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
59
+
60
+ Python port of [bhope/hedge](https://github.com/bhope/hedge) — **adaptive hedged
61
+ requests for tail-latency optimisation**.
62
+
63
+ `hedge-python` learns per-host latency distributions with
64
+ [DDSketch](https://arxiv.org/abs/2004.08604), races a backup request when the
65
+ primary exceeds its estimated p90, and caps the hedge rate with a token bucket
66
+ to prevent load amplification during outages. Zero configuration required.
67
+ First-class support for **httpx**, **aiohttp**, and **gRPC** (unary +
68
+ server-streaming).
69
+
70
+ Inspired by Dean & Barroso, [_The Tail at Scale_](https://research.google/pubs/the-tail-at-scale/) (CACM 2013).
71
+
72
+ ---
73
+
74
+ ## Why hedging?
75
+
76
+ A small fraction of slow responses dominates user-perceived latency. Hedging
77
+ fires a duplicate request after the primary blows past its expected deadline;
78
+ whichever finishes first wins, and the other is cancelled.
79
+
80
+ **Result on a benchmark with 5% straggler requests (10× slower); latencies in milliseconds:**
81
+
82
+ ![Multi-framework benchmark](eval_multi_framework.png)
83
+
84
+ | Framework | Configuration | p50 | p90 | p95 | p99 | p999 | Overhead |
85
+ |-----------|---------------------|-------|-------|--------|--------|---------|----------|
86
+ | httpx | No hedging | 5.8 | 10.3 | 12.2 | 51.3 | 78.3 | 0.0% |
87
+ | httpx | **Adaptive (hedge)**| 6.2 | 10.5 | 12.1 | **18.8** | **22.2** | 7.0% |
88
+ | aiohttp | No hedging | 6.3 | 10.7 | 13.0 | 52.4 | 79.0 | 0.0% |
89
+ | aiohttp | **Adaptive (hedge)**| 6.5 | 11.3 | 13.8 | **20.5** | **25.1** | 4.6% |
90
+ | grpc | No hedging | 6.5 | 10.8 | 12.7 | 59.9 | 82.0 | 0.0% |
91
+ | grpc | **Adaptive (hedge)**| 6.9 | 11.6 | 13.7 | **20.4** | **23.5** | 5.6% |
92
+
93
+ Across all three frameworks, p99 latency drops by **60–66%** at the cost of
94
+ ~5–7% extra backend traffic. Reproduce with `make bench-multi && make bench-plot`.
95
+
96
+ ---
97
+
98
+ ## Quick Start
99
+
100
+ ```bash
101
+ # Install with your preferred framework
102
+ pip install hedge-python[httpx]
103
+ pip install hedge-python[aiohttp]
104
+ pip install hedge-python[grpc]
105
+ pip install hedge-python[all] # all frameworks
106
+ ```
107
+
108
+ ### httpx
109
+
110
+ ```python
111
+ import asyncio
112
+ import httpx
113
+ from hedge import HedgeConfig
114
+ from hedge.transport import HedgedHttpxTransport
115
+
116
+ async def main():
117
+     transport = HedgedHttpxTransport(config=HedgeConfig())
118
+     async with httpx.AsyncClient(transport=transport) as client:
119
+         resp = await client.get("https://api.example.com/data")
120
+         print(resp.status_code)
121
+
122
+ asyncio.run(main())
123
+ ```
124
+
125
+ ### aiohttp
126
+
127
+ ```python
128
+ import asyncio
129
+ from hedge import HedgeConfig
130
+ from hedge.transport import HedgedAiohttpSession
131
+
132
+ async def main():
133
+     async with HedgedAiohttpSession(config=HedgeConfig()) as session:
133
+         resp = await session.get("https://api.example.com/data")
134
+         data = await resp.json()
135
+         print(data)
137
+
138
+ asyncio.run(main())
139
+ ```
140
+
141
+ ### gRPC (Unary)
142
+
143
+ ```python
144
+ import grpc.aio
145
+ from hedge import HedgeConfig
146
+ from hedge.interceptor import HedgedUnaryInterceptor
147
+
148
+ async def make_channel():
149
+     return grpc.aio.insecure_channel(
149
+         "localhost:50051",
150
+         interceptors=[HedgedUnaryInterceptor(config=HedgeConfig(estimated_rps=500))],
151
+     )
153
+ ```
154
+
155
+ ### gRPC (Server Streaming — LLM inference, log tailing, …)
156
+
157
+ ```python
158
+ import grpc.aio
159
+ from hedge import HedgeConfig
160
+ from hedge.interceptor import HedgedServerStreamInterceptor
161
+
162
+ async def make_channel():
163
+     return grpc.aio.insecure_channel(
163
+         "localhost:50051",
164
+         interceptors=[HedgedServerStreamInterceptor(config=HedgeConfig())],
165
+     )
167
+ ```
168
+
169
+ For server streaming, the hedge signal is **time-to-first-message (TTFM)**: if
170
+ the primary stream doesn't yield its first chunk within the estimated p90,
171
+ a backup stream is started. Whichever yields first wins and continues
172
+ streaming; the loser is cancelled at the wire level.
173
+
174
+ > **Runnable examples** for each framework live in [`examples/`](examples/) —
175
+ > the gRPC ones are fully self-contained (they spin up a local server with
176
+ > simulated stragglers so you can see hedging in action without any external
177
+ > dependency). See [`examples/README.md`](examples/README.md) for the index.
178
+
179
+ ---
180
+
181
+ ## How It Works
182
+
183
+ ### 1. DDSketch quantile estimator
184
+
185
+ Each target host gets a `WindowedSketch` — a pair of DDSketches that rotate
186
+ every 30 seconds. DDSketch uses logarithmic bucket mapping to provide
187
+ **relative-error guarantees**: any quantile estimate is within ±1% of the
188
+ true value, regardless of the underlying distribution.
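
The logarithmic bucket mapping can be illustrated with a minimal sketch. `TinyDDSketch` below is a hypothetical teaching class, not the code in `hedge/sketch/_ddsketch.py`; it assumes strictly positive values and a relative accuracy `alpha` of 1%, and it leaves out the window rotation and merging that a real implementation would need.

```python
import math
from collections import defaultdict


class TinyDDSketch:
    """Illustrative log-bucket quantile sketch (positive values only)."""

    def __init__(self, alpha: float = 0.01) -> None:
        # gamma is the ratio between consecutive bucket boundaries.
        self.gamma = (1 + alpha) / (1 - alpha)
        self._log_gamma = math.log(self.gamma)
        self.buckets = defaultdict(int)  # bucket index -> count
        self.count = 0

    def add(self, value: float) -> None:
        if value <= 0:
            raise ValueError("this sketch only handles positive values")
        # Bucket i covers the interval (gamma**(i-1), gamma**i].
        index = math.ceil(math.log(value) / self._log_gamma)
        self.buckets[index] += 1
        self.count += 1

    def quantile(self, q: float) -> float:
        if self.count == 0:
            raise ValueError("empty sketch")
        rank = q * (self.count - 1)
        seen = 0
        for index in sorted(self.buckets):
            seen += self.buckets[index]
            if seen > rank:
                # This representative is within alpha (relative) of
                # every value that can fall into bucket `index`.
                return 2 * self.gamma ** index / (self.gamma + 1)
        raise RuntimeError("unreachable: counts are inconsistent")


sketch = TinyDDSketch(alpha=0.01)
for latency_ms in range(1, 1001):
    sketch.add(float(latency_ms))
p90 = sketch.quantile(0.90)  # close to 900, within 1% relative error
```

Because every bucket spans the same ratio `gamma`, the error bound is relative rather than absolute, which is why it holds regardless of the shape of the latency distribution.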
189
+
190
+ ### 2. Adaptive trigger
191
+
192
+ On each request, the transport queries the sketch for the configured
193
+ percentile (default p90). If the primary hasn't responded by that deadline,
194
+ a backup request is fired. Whichever response arrives first is returned;
195
+ the loser is cancelled (including the underlying gRPC `Call` for streams).
196
+
197
+ ```
198
+               ┌─ primary ─────────── ✓ (fast) ──→ return
199
+ request ──────┤
200
+               └─ hedge fires after p90 ─── ✗ (cancelled)
201
+ ```
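
The race described in this step can be sketched with plain asyncio. `hedged_call` and its `send` and `hedge_delay` parameters are hypothetical names chosen for illustration; the library's own scheduler (in `hedge/transport/_base.py`, per the project structure) also consults the sketch and the token bucket, which this sketch leaves out.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def hedged_call(send: Callable[[], Awaitable[T]], hedge_delay: float) -> T:
    """Run `send`, and race a second copy if the first is slow."""
    primary = asyncio.ensure_future(send())

    # Give the primary a head start equal to the hedge delay.
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        return primary.result()

    # The primary missed the deadline: fire one backup and race both.
    backup = asyncio.ensure_future(send())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )

    # Cancel the loser and wait for the cancellation to settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    # If the first finisher raised, the exception propagates here.
    return done.pop().result()
```

In this simplified form a failing first finisher surfaces its exception immediately; whether the real implementation waits for the other attempt in that case is not stated in this README.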
202
+
203
+ ### 3. Token bucket budget
204
+
205
+ Hedges are rate-limited by a token bucket that refills at
206
+ `estimated_rps × budget_percent / 100` tokens per second. During genuine
207
+ outages the bucket drains and hedging stops automatically — preventing the
208
+ load-doubling spiral that would deepen the incident.
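
A minimal token bucket along these lines might look like the following. `SimpleTokenBucket` and `try_acquire` are hypothetical names, the capacity is assumed to equal one second's worth of refill, and the injectable `clock` exists only to make the behaviour easy to demonstrate; none of this is taken from `hedge/budget/_token_bucket.py`.

```python
import time
from typing import Callable


class SimpleTokenBucket:
    """Illustrative hedge budget; not thread-safe."""

    def __init__(
        self,
        estimated_rps: float = 100.0,
        budget_percent: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # Refill rate in tokens per second, as described in the text.
        self.rate = estimated_rps * budget_percent / 100.0
        # Assumption: the bucket holds at most one second of refill.
        self.capacity = self.rate
        self.tokens = self.capacity
        self._clock = clock
        self._last = clock()

    def try_acquire(self) -> bool:
        """Spend one token for a hedge, or return False if none is left."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = SimpleTokenBucket(estimated_rps=100.0, budget_percent=10.0)
if bucket.try_acquire():
    pass  # budget available: a backup request may be sent
```

With the defaults this allows about 10 hedges per second. When most requests are slow, hedge attempts outpace the refill rate, `try_acquire` starts returning `False`, and the extra load stops.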
209
+
210
+ ### gRPC implementation note
211
+
212
+ The gRPC `intercept_unary_unary` continuation returns a `Call` object almost
213
+ immediately; the real RTT is spent in the subsequent `await call`. We wrap
214
+ **both steps** in a single asyncio task so the hedge timer reflects true
215
+ end-to-end RPC latency. Cancelling a loser invokes `call.cancel()` first
216
+ (notifying the server) then `task.cancel()` (cleaning up the coroutine).
217
+
218
+ ---
219
+
220
+ ## Configuration
221
+
222
+ All knobs live on `HedgeConfig`:
223
+
224
+ | Parameter | Type | Default | Description |
225
+ |-----------|------|---------|-------------|
226
+ | `percentile` | `float` | `0.90` | Sketch quantile used as hedge trigger |
227
+ | `max_hedges` | `int` | `1` | Maximum concurrent hedge requests per call |
228
+ | `budget_percent` | `float` | `10.0` | Max hedge rate as percent of total traffic |
229
+ | `estimated_rps` | `float` | `100.0` | Expected requests per second; sets token bucket capacity |
230
+ | `min_delay` | `float` | `0.001` | Floor on the hedge delay in seconds |
231
+ | `warmup_requests` | `int` | `20` | Number of initial requests using fixed delay |
232
+ | `warmup_delay` | `float` | `0.01` | Fixed hedge delay during warmup in seconds |
233
+ | `window_duration` | `float` | `30.0` | Sketch window rotation interval in seconds |
234
+ | `stats` | `Stats \| None` | `None` | Inject a custom `Stats` for observability |
235
+
236
+ > **Tip — `estimated_rps`**: pick a value close to your real RPS so the token
237
+ > bucket capacity (`rps × budget_percent / 100`) is meaningful. If unsure,
238
+ > start at the default `100.0` and watch `hedge_rate` / `budget_exhausted` in
239
+ > the stats snapshot.
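
How the delay-related knobs could combine is sketched below. `DelayParams` and `pick_hedge_delay` are hypothetical and only mirror the defaults in the table above; the assumption is that the fixed `warmup_delay` applies until `warmup_requests` samples have been observed, after which the sketch estimate is used with `min_delay` as a floor. The library's actual logic may differ.

```python
from dataclasses import dataclass


@dataclass
class DelayParams:
    # Defaults copied from the HedgeConfig table; all values in seconds.
    min_delay: float = 0.001
    warmup_requests: int = 20
    warmup_delay: float = 0.01


def pick_hedge_delay(params: DelayParams, observed: int, estimate: float) -> float:
    """Return how long to wait before firing a backup request.

    `observed` is the number of latency samples recorded so far and
    `estimate` is the sketch's value for the configured percentile.
    """
    if observed < params.warmup_requests:
        # Too few samples to trust the sketch: use the fixed delay.
        return params.warmup_delay
    # Never hedge sooner than the configured floor.
    return max(params.min_delay, estimate)


params = DelayParams()
delay = pick_hedge_delay(params, observed=50, estimate=0.012)  # 0.012
```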
240
+
241
+ ---
242
+
243
+ ## Observability
244
+
245
+ ```python
246
+ from hedge import HedgeConfig, Stats
247
+ from hedge.transport import HedgedHttpxTransport
248
+
249
+ stats = Stats()
250
+ transport = HedgedHttpxTransport(config=HedgeConfig(stats=stats))
251
+
252
+ # ... after running some traffic ...
253
+ snap = stats.snapshot()
254
+ print(f"total={snap.total_requests} hedged={snap.hedged_requests}")
255
+ print(f"hedge_wins={snap.hedge_wins} primary_wins={snap.primary_wins}")
256
+ print(f"budget_exhausted={snap.budget_exhausted}")
257
+ print(f"hedge_rate={stats.hedge_rate():.2%}")
258
+ ```
259
+
260
+ `Stats` is fully thread-safe and can be shared across multiple
261
+ transports/interceptors to aggregate metrics.
262
+
263
+ ---
264
+
265
+ ## Benchmarks & charts
266
+
267
+ Two benchmark suites ship with the project:
268
+
269
+ | Command | What it does | Output |
270
+ |--------------------|------------------------------------------------------------------------|---------------------------------------|
271
+ | `make bench-compare` | httpx only: No hedging vs Static 10ms vs Static 50ms vs Adaptive | `benchmark/results.csv` |
272
+ | `make bench-multi` | httpx vs aiohttp vs gRPC, No hedging vs Adaptive | `benchmark/results_multi.csv` |
273
+ | `make bench-plot` | Render both CSVs into charts | `eval.png`, `eval_multi_framework.png` |
274
+
275
+ Each suite runs 500 requests against a simulated lognormal latency
276
+ (`mean=5ms, stddev=2ms`) with 5% straggler probability (10× spike).
277
+
278
+ ---
279
+
280
+ ## Development
281
+
282
+ ```bash
283
+ # Install uv (if not already)
284
+ curl -LsSf https://astral.sh/uv/install.sh | sh
285
+
286
+ make install # install all extras with uv
287
+ make lint # ruff check
288
+ make typecheck # mypy
289
+ make test # all tests
290
+ make test-unit # unit tests only
291
+ make test-integration # integration tests (requires httpx / aiohttp / grpcio)
292
+ make coverage # coverage report (current: 97%)
293
+ make bench-multi # multi-framework benchmark
294
+ make bench-plot # render charts
295
+ make ci # lint + typecheck + test + coverage
296
+ ```
297
+
298
+ ### Testing
299
+
300
+ * **Unit tests** (`tests/unit/`): DDSketch, token bucket, scheduler, stats,
301
+ options, lazy import shims, gRPC interceptor branches (with fake
302
+ continuations).
303
+ * **Integration tests** (`tests/integration/`): real httpx transport, real
304
+ aiohttp session, **real local gRPC server** with `.proto` + generated pb2.
305
+ * **Benchmarks** (`tests/benchmark/`): DDSketch microbench, token bucket
306
+ microbench, four-config comparison, three-framework comparison.
307
+
308
+ Current coverage: **97%** (122 tests, ~7 seconds).
309
+
310
+ ---
311
+
312
+ ## Project Structure
313
+
314
+ ```
315
+ hedge-python/
316
+ ├── src/hedge/
317
+ │ ├── __init__.py # Public API
318
+ │ ├── _options.py # HedgeConfig dataclass
319
+ │ ├── _stats.py # Thread-safe Stats + StatsSnapshot
320
+ │ ├── sketch/
321
+ │ │ ├── _ddsketch.py # DDSketch quantile estimator
322
+ │ │ └── _windowed.py # Sliding-window DDSketch pair
323
+ │ ├── budget/
324
+ │ │ └── _token_bucket.py # Token bucket rate limiter
325
+ │ ├── transport/
326
+ │ │ ├── _base.py # Shared HedgeScheduler logic
327
+ │ │ ├── _httpx.py # httpx AsyncBaseTransport adapter
328
+ │ │ └── _aiohttp.py # aiohttp session wrapper
329
+ │ └── interceptor/
330
+ │ └── _grpc.py # gRPC unary + server-stream interceptors
331
+ ├── tests/
332
+ │ ├── unit/ # 7 unit-test files
333
+ │ ├── integration/
334
+ │ │ ├── proto/ # .proto + generated pb2 / pb2_grpc
335
+ │ │ ├── test_httpx_transport.py
336
+ │ │ ├── test_aiohttp_session.py
337
+ │ │ └── test_grpc_interceptor.py
338
+ │ └── benchmark/
339
+ │ ├── test_bench_ddsketch.py
340
+ │ ├── test_bench_token_bucket.py
341
+ │ ├── test_bench_hedge_comparison.py # httpx 4-config
342
+ │ └── test_bench_multi_framework.py # 3-framework comparison
343
+ ├── benchmark/
344
+ │ ├── plot.py # CSV → matplotlib charts
345
+ │ ├── results.csv # produced by bench-compare
346
+ │ └── results_multi.csv # produced by bench-multi
347
+ ├── eval.png # single-framework chart
348
+ ├── eval_multi_framework.png # cross-framework chart
349
+ ├── pyproject.toml
350
+ ├── Makefile
351
+ └── .github/workflows/ci.yml
352
+ ```
353
+
354
+ ---
355
+
356
+ ## References
357
+
358
+ - Jeffrey Dean and Luiz André Barroso. ["The Tail at Scale."](https://research.google/pubs/the-tail-at-scale/) *Communications of the ACM*, 56(2):74–80, February 2013.
359
+ - Charles Masson, Jee E. Rim, and Homin K. Lee. ["DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees."](https://arxiv.org/abs/2004.08604) *Proceedings of the VLDB Endowment*, 12(12):2195–2205, 2019.
360
+
361
+ ## Changelog
362
+
363
+ See [CHANGELOG.md](CHANGELOG.md) for the full release history.
364
+
365
+ ## License
366
+
367
+ `hedge-python` is released under the [MIT License](LICENSE).
@@ -0,0 +1,19 @@
1
+ hedge/__init__.py,sha256=3az0aD88vR8EzW-a64-6V-jm3txcklg0rR6cm3nAKbs,436
2
+ hedge/_options.py,sha256=iiK4XlraaAl4SeQPk0FRIDC3EK4VD-au8Srz4cTr4js,1291
3
+ hedge/_stats.py,sha256=Vqq559hXBe45j-iLUERdDCLSTpLSPgGcOnaV0oimgC0,2253
4
+ hedge/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
5
+ hedge/budget/__init__.py,sha256=iGU4ZAhPXYMFRdiMFAfMU64nGaGa5EENQg1v66bNYBw,145
6
+ hedge/budget/_token_bucket.py,sha256=nScj5E8tGv9emlJesHWU5ZXXiJvB3rjRxxEisFfG0FA,1882
7
+ hedge/interceptor/__init__.py,sha256=doYyBrf_htHANnZWedLZV4XSfIPFnaXC1eEAf8SGZ5Q,754
8
+ hedge/interceptor/_grpc.py,sha256=2yJqzpOPkXMSVHEcCtYll3WTbsR0cqtbwjCccVmf1DE,11080
9
+ hedge/sketch/__init__.py,sha256=2U_sql-SLfp_-kWEOTtMLamXFZlXP-rB4KWSrNodR0Q,210
10
+ hedge/sketch/_ddsketch.py,sha256=wVWDz11wvKr9zJChE4_tnQ1p12D2-YN4nkrx-ZExW2o,5692
11
+ hedge/sketch/_windowed.py,sha256=QQEXL8Hi2w6n2BnOWavb21Kio_gLk6MlxesDirzgOPs,3991
12
+ hedge/transport/__init__.py,sha256=PDpqXwP53-VHjrMMiPQID475snunZO14icfnGjJS0bo,734
13
+ hedge/transport/_aiohttp.py,sha256=mPWctXY0MpMKJKW5NYoyIeJYIE8-jT3iLIQxO3rIYWI,4481
14
+ hedge/transport/_base.py,sha256=wP_4NCcYXArxSIibe0qy41ZQN5zf87tO4v3lnBLBGYE,5694
15
+ hedge/transport/_httpx.py,sha256=vnJ7LbeHEkiAe0zXjrmDRcSqdX6c8Lo_C2gYLOr9q_k,2632
16
+ hedge_python-0.1.0.dist-info/METADATA,sha256=XdNa3aOi4VhaaPKiLIdXEZGrYBIrzEsqowfaL4v75xY,14502
17
+ hedge_python-0.1.0.dist-info/WHEEL,sha256=QccIxa26bgl1E6uMy58deGWi-0aeIkkangHcxk2kWfw,87
18
+ hedge_python-0.1.0.dist-info/licenses/LICENSE,sha256=tyl0aeFJKseJ8As6y6-tjiUolw5iRMME6zl49q8T2ZE,1063
19
+ hedge_python-0.1.0.dist-info/RECORD,,
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: hatchling 1.29.0
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 LeoSun
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.