clawperf 0.2.2__tar.gz → 0.2.4__tar.gz
- {clawperf-0.2.2 → clawperf-0.2.4}/PKG-INFO +5 -4
- {clawperf-0.2.2 → clawperf-0.2.4}/README.md +4 -3
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/__init__.py +1 -1
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf.egg-info/PKG-INFO +5 -4
- {clawperf-0.2.2 → clawperf-0.2.4}/LICENSE +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/pyproject.toml +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/setup.cfg +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/__main__.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/cli.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/config.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/context.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/logging_setup.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/mock_server.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/runner.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/scheduler.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/system_metrics.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf/tokenizer.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf.egg-info/SOURCES.txt +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf.egg-info/dependency_links.txt +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf.egg-info/entry_points.txt +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf.egg-info/requires.txt +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/src/clawperf.egg-info/top_level.txt +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_aggregation.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_config.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_context.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_history.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_mock_server.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_runner_math.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_runner_utils.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_scheduler.py +0 -0
- {clawperf-0.2.2 → clawperf-0.2.4}/tests/test_system_metrics.py +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: clawperf
-Version: 0.2.2
+Version: 0.2.4
 Summary: Performance benchmarking tool for LLM Serving backends with multi-turn long-context workloads
 Author: ClawPerf Contributors
 License-Expression: Apache-2.0
@@ -281,15 +281,14 @@ Each simulated user maintains an independent conversation state with its own gro
 
 Each user's context follows this structure:
 
-
-[System Prefix] [User Prefix] [History] [Current Input]
-```
+
 
 When context reaches `--max-context-tokens`, append-mode compaction fires:
 
 1. The base context (system + user prefix + input, without history) is checked first. If it already exceeds the limit, compaction is skipped and the turn is marked as `context_overflow` — this prevents infinite compaction loops.
 2. Otherwise, history is cleared and the user prefix grows by `--compaction-prefix-increment` tokens.
 3. New random content fills the enlarged user prefix.
+4. If the grown base still exceeds the limit, the prefix growth is **reverted** (history cleared only) so the user isn't permanently trapped in overflow.
 
 This simulates how real LLM serving systems handle context overflow with prefix caching.
 
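The compaction steps listed in the hunk above, including the newly added step 4, can be sketched roughly as follows. This is a minimal illustration and not clawperf's actual implementation: `UserContext`, `compact`, and the returned status strings other than `context_overflow` are hypothetical names, contexts are modelled as token counts only (no random content is generated), and treating "reaches" as `>=` and "exceeds" as `>` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class UserContext:
    """Hypothetical per-user state, tracked as token counts only."""
    system_prefix_tokens: int
    user_prefix_tokens: int
    history_tokens: int = 0


def compact(ctx: UserContext, input_tokens: int,
            max_context_tokens: int, prefix_increment: int) -> str:
    """Apply append-mode compaction to ctx and return a status string."""
    # Base context: system + user prefix + input, without history.
    base = ctx.system_prefix_tokens + ctx.user_prefix_tokens + input_tokens
    total = base + ctx.history_tokens

    # Compaction only fires once the full context reaches the limit.
    if total < max_context_tokens:
        return "no_compaction"

    # Step 1: if the base alone exceeds the limit, skip compaction.
    if base > max_context_tokens:
        return "context_overflow"

    # Steps 2 and 3: clear history and grow the user prefix.
    ctx.history_tokens = 0
    ctx.user_prefix_tokens += prefix_increment

    # Step 4: if the grown base exceeds the limit, revert the growth
    # so that only the history clearing remains in effect.
    if base + prefix_increment > max_context_tokens:
        ctx.user_prefix_tokens -= prefix_increment
        return "compacted_reverted"

    return "compacted"
```

In this sketch a user whose prefix cannot grow any further still gets a cleared history on each compaction, which is the behaviour step 4 describes.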
@@ -304,6 +303,8 @@ The mock server simulates vLLM's KV-block prefix cache using a trie:
 
 ## User Arrival Scheduling
 
+
+
 - **burst**: All users start immediately
 - **steady:2**: Users arrive every 2 seconds
 - **poisson:0.5**: Users arrive following a Poisson process with rate 0.5
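To make the three arrival modes in the hunk above concrete, here is a small sketch that turns a spec string into a list of start offsets in seconds. The function name `arrival_times` and its parameters are hypothetical and do not come from clawperf. It also assumes that the first `steady` user starts at time 0, that the `poisson` rate is in users per second, and that the first Poisson arrival follows an exponential gap after time 0.

```python
import random


def arrival_times(spec: str, num_users: int, seed=None) -> list:
    """Return one start offset (seconds) per user for a given spec."""
    rng = random.Random(seed)
    kind, _, arg = spec.partition(":")

    if kind == "burst":
        # All users start immediately.
        return [0.0] * num_users

    if kind == "steady":
        # One user every `interval` seconds, first user at t = 0.
        interval = float(arg)
        return [i * interval for i in range(num_users)]

    if kind == "poisson":
        # Poisson process: gaps between arrivals are exponentially
        # distributed with mean 1 / rate.
        rate = float(arg)
        t, times = 0.0, []
        for _ in range(num_users):
            t += rng.expovariate(rate)
            times.append(t)
        return times

    raise ValueError(f"unknown arrival spec: {spec!r}")
```

For example, `arrival_times("steady:2", 3)` gives offsets of 0, 2, and 4 seconds, while `poisson:0.5` yields gaps that average 2 seconds but vary from run to run unless a seed is fixed.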
@@ -246,15 +246,14 @@ Each simulated user maintains an independent conversation state with its own gro
 
 Each user's context follows this structure:
 
-
-[System Prefix] [User Prefix] [History] [Current Input]
-```
+
 
 When context reaches `--max-context-tokens`, append-mode compaction fires:
 
 1. The base context (system + user prefix + input, without history) is checked first. If it already exceeds the limit, compaction is skipped and the turn is marked as `context_overflow` — this prevents infinite compaction loops.
 2. Otherwise, history is cleared and the user prefix grows by `--compaction-prefix-increment` tokens.
 3. New random content fills the enlarged user prefix.
+4. If the grown base still exceeds the limit, the prefix growth is **reverted** (history cleared only) so the user isn't permanently trapped in overflow.
 
 This simulates how real LLM serving systems handle context overflow with prefix caching.
 
@@ -269,6 +268,8 @@ The mock server simulates vLLM's KV-block prefix cache using a trie:
 
 ## User Arrival Scheduling
 
+
+
 - **burst**: All users start immediately
 - **steady:2**: Users arrive every 2 seconds
 - **poisson:0.5**: Users arrive following a Poisson process with rate 0.5
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: clawperf
-Version: 0.2.2
+Version: 0.2.4
 Summary: Performance benchmarking tool for LLM Serving backends with multi-turn long-context workloads
 Author: ClawPerf Contributors
 License-Expression: Apache-2.0
@@ -281,15 +281,14 @@ Each simulated user maintains an independent conversation state with its own gro
 
 Each user's context follows this structure:
 
-
-[System Prefix] [User Prefix] [History] [Current Input]
-```
+
 
 When context reaches `--max-context-tokens`, append-mode compaction fires:
 
 1. The base context (system + user prefix + input, without history) is checked first. If it already exceeds the limit, compaction is skipped and the turn is marked as `context_overflow` — this prevents infinite compaction loops.
 2. Otherwise, history is cleared and the user prefix grows by `--compaction-prefix-increment` tokens.
 3. New random content fills the enlarged user prefix.
+4. If the grown base still exceeds the limit, the prefix growth is **reverted** (history cleared only) so the user isn't permanently trapped in overflow.
 
 This simulates how real LLM serving systems handle context overflow with prefix caching.
 
@@ -304,6 +303,8 @@ The mock server simulates vLLM's KV-block prefix cache using a trie:
 
 ## User Arrival Scheduling
 
+
+
 - **burst**: All users start immediately
 - **steady:2**: Users arrive every 2 seconds
 - **poisson:0.5**: Users arrive following a Poisson process with rate 0.5
File without changes