ml-loadtest 1.0.0__tar.gz

Metadata-Version: 2.4
Name: ml-loadtest
Version: 1.0.0
Summary: Adaptive load testing tool for ML inference APIs with dynamic scaling and regression detection
License: MIT
Keywords: load-testing,locust,ml,api-testing,performance,inference,benchmarking
Requires-Python: >=3.10
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: System :: Benchmark
Requires-Dist: dacite (>=1.8,<2.0)
Requires-Dist: locust (>=2.20,<3.0)
Requires-Dist: notion-client (>=2.0,<3.0)
Requires-Dist: numpy (>=1.23.5,<2.0.0)
Requires-Dist: psutil (>=5.9,<6.0)
Project-URL: Homepage, https://github.com/IncodeTechnologies/ml-load-testing-tool
Project-URL: Issues, https://github.com/IncodeTechnologies/ml-load-testing-tool/issues
Project-URL: Repository, https://github.com/IncodeTechnologies/ml-load-testing-tool
Description-Content-Type: text/markdown

# ML Load Testing Tool

Adaptive load testing tool for ML inference APIs. Uses Locust to dynamically scale concurrent users based on P99 latency targets, then analyzes results for regression detection and rate limit recommendations.

## Features

- **Adaptive Scaling**: Automatically adjusts concurrent users based on P99 latency targets
- **Multi-Mode Testing**: Individual endpoint, production mix, exploration, and spike test modes
- **Regression Detection**: Compare test results against baselines to catch performance degradations
- **Rate Limit Recommendations**: Calculates safe rate limits with configurable safety factors
- **Notion Integration**: Sync results to Notion for tracking and visualization

## Installation

### From GitHub

```bash
pip install git+https://github.com/IncodeTechnologies/ml-load-testing-tool.git
```

### From Source

```bash
git clone https://github.com/IncodeTechnologies/ml-load-testing-tool.git
cd ml-load-testing-tool
poetry install
```

## Quick Start

The tool requires a weights module that defines which endpoints to test. Use the bundled examples or create your own:

```bash
# Test with bundled example TaskSets
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module loadtest.distribution_weights

# Run headless with specific parameters
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module loadtest.distribution_weights \
    --users 32 \
    --spawn-rate 4 \
    --run-time 60s \
    --headless
```

The `ml-loadtest-file` command prints the path to the installed locustfile, giving you full access to all Locust CLI parameters.

**Note:** The `--weights-module` parameter is required. It specifies a Python module containing a `production_weights` dictionary that maps TaskSet classes to their relative weights.

## Usage

### Two Usage Patterns

**Pattern 1: Direct Path (Recommended for CLI)**

```bash
# Get full Locust CLI access with the installed package
locust -f $(ml-loadtest-file) --host http://api:8000 [any locust params]
```

**Pattern 2: Local Import (Recommended for Customization)**

Create a local `locustfile.py` in your project:

```python
# Import everything from the installed package
from ml_loadtest.locustfile import *

# Optionally override settings or add custom logic here
```

Then run:

```bash
locust -f locustfile.py --host http://api:8000
```

### Load Testing

The tool supports four test modes via the `--test-mode` parameter. You can run a single mode or multiple modes in sequence (space-separated).

#### 1. Individual Endpoint Testing

Test each endpoint separately to find its individual capacity:

```bash
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module loadtest.distribution_weights \
    --test-mode INDIVIDUAL \
    --target-p99-ms 500 \
    --max-users 100
```

#### 2. Production Mix Testing

Test with the production traffic distribution:

```bash
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module ml_loadtest.examples.distribution_weights \
    --test-mode PRODUCTION \
    --target-p99-ms 500
```

#### 3. Exploration Mode

Test multiple weight distributions to find the optimal mix:

```bash
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module loadtest.distribution_weights \
    --test-mode EXPLORATION \
    --target-p99-ms 1000
```

#### 4. Spike Testing

Test sudden traffic spikes:

```bash
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module ml_loadtest.examples.distribution_weights \
    --test-mode SPIKE \
    --spike-target-rps 1000 \
    --spike-duration 30
```

#### 5. Multiple Modes

Run multiple test modes in sequence:

```bash
# Run individual and production tests (the default)
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module loadtest.distribution_weights \
    --test-mode INDIVIDUAL PRODUCTION

# Run all test modes
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module loadtest.distribution_weights \
    --test-mode INDIVIDUAL PRODUCTION EXPLORATION SPIKE
```

### Key Configuration Options

All standard Locust parameters are available, plus:

- `--target-p99-ms`: Target P99 latency in milliseconds (default: 1000)
- `--max-users`: Maximum concurrent users (default: 32)
- `--min-users`: Minimum concurrent users (default: 1)
- `--test-mode`: Test modes to run: INDIVIDUAL, PRODUCTION, EXPLORATION, or SPIKE; space-separated for multiple (default: INDIVIDUAL PRODUCTION)
- `--increase-rate`: User increase multiplier when under target (default: 1.2)
- `--decrease-rate`: User decrease multiplier when over target (default: 0.8)
- `--check-interval`: Seconds between scaling checks (default: 30)
- `--output-file`: Output filename prefix (default: "report_loadtest_results")
- `--weights-module`: Python module with custom `production_weights` (required)
- `--spike-target-rps`: Target RPS for spike mode (default: 100.0)
- `--spike-duration`: Duration in seconds for spike mode (default: 30)

Full list of Locust parameters: https://docs.locust.io/en/stable/configuration.html
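
The scaling parameters above combine into a simple multiplicative controller. A minimal sketch of that rule, using the documented defaults and a hypothetical `next_user_count` helper (illustrative only, not the package's actual implementation):

```python
# Hypothetical sketch of the adaptive scaling rule, assuming the documented
# defaults; the real controller in the locustfile may differ in detail.
def next_user_count(
    current_users: int,
    p99_ms: float,
    target_p99_ms: float = 1000,
    increase_rate: float = 1.2,
    decrease_rate: float = 0.8,
    tolerance: float = 0.1,
    min_users: int = 1,
    max_users: int = 32,
) -> int:
    if p99_ms < target_p99_ms * (1 - tolerance):
        scaled = current_users * increase_rate   # under target: ramp up
    elif p99_ms > target_p99_ms * (1 + tolerance):
        scaled = current_users * decrease_rate   # over target: back off
    else:
        return current_users                     # inside tolerance band: hold
    return max(min_users, min(max_users, round(scaled)))
```

With the defaults, 10 users at a measured P99 of 400 ms step up to 12, while 10 users at 1500 ms drop back to 8, always clamped to the `--min-users`/`--max-users` range.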

### Configuration File

The package includes a `locust.conf` configuration file that provides default settings for load tests. This allows you to avoid repeating common parameters on the command line.

**What is locust.conf?**

A Locust configuration file that sets default values for both standard Locust parameters and custom ml-loadtest parameters.

**Configuration example:**

```ini
; Locust configuration file
host = http://api:8000
headless
only-summary
run-time = 2h
loglevel = INFO
csv = report
html = report.html

; Load test settings
target-p99-ms = 1000
min-users = 1
max-users = 32
increase-rate = 1.2
decrease-rate = 0.8
check-interval = 30
tolerance = 0.1
production-run-duration = 1200
weights-module = loadtest.distribution_weights
```

**How to use it:**

```bash
# Use the bundled config file (from the installed package location)
locust -f $(ml-loadtest-file) \
    --config locust.conf

# Override specific settings from the config file
locust -f $(ml-loadtest-file) \
    --config locust.conf \
    --max-users 64
```

**Note:** Command-line arguments always override config file settings.

### Analysis

After running tests, analyze results for regressions and get rate limit recommendations:

```bash
# Basic analysis
python -m ml_loadtest.analyze

# Update the baseline after confirming results are good
python -m ml_loadtest.analyze --update-baseline

# Custom input/output files
python -m ml_loadtest.analyze \
    --input-file custom_report_loadtest_results.json \
    --baseline-file my_baseline.json \
    --output-file analysis_output.txt
```

The analyzer will:

- Compare current results against the baseline
- Detect performance regressions (default threshold: 10%)
- Recommend safe rate limits (default: 70% of measured capacity)
- Generate detailed reports with statistics
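
Both checks reduce to a single comparison each. A minimal illustration of the thresholds described above, with hypothetical helper names (the actual `LoadTestAnalyzer` logic may differ):

```python
# Illustrative only: assumed shapes of the analyzer's two checks,
# built from the documented 10% threshold and 70% safety factor.
def is_regression(current_p99_ms: float, baseline_p99_ms: float,
                  threshold: float = 0.10) -> bool:
    """Flag a run whose P99 exceeds the baseline by more than the threshold."""
    return current_p99_ms > baseline_p99_ms * (1 + threshold)


def recommended_rate_limit(measured_capacity_rps: float,
                           safety_factor: float = 0.70) -> float:
    """Recommend a rate limit at a fraction of the measured capacity."""
    return measured_capacity_rps * safety_factor
```

For example, a current P99 of 115 ms against a 100 ms baseline counts as a regression, and a measured capacity of 200 RPS yields a recommended limit of 140 RPS.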

### Notion Integration

Sync test results to Notion for tracking:

```bash
# Set Notion credentials (environment variables)
export NOTION_TOKEN="notion-integration-token"
export NOTION_TEST_RESULTS_DATABASE_ID="test-results-database-id"
export NOTION_ENDPOINT_DATABASE_ID="endpoint-database-id"

python -m ml_loadtest.notion_sync "service-name" "v1.0.0" \
    --report-file report_loadtest_results.json \
    --baseline-file baseline.json
```

## Extending with Custom TaskSets

### Creating Custom TaskSets

Each TaskSet must implement this interface:

```python
from locust import TaskSet

class MyCustomTaskSet(TaskSet):
    # Required: endpoint identifier
    endpoint = "/my-endpoint"

    # Required: test implementation
    def test_endpoint(self) -> None:
        with self.client.post(
            self.endpoint,
            json={"data": "example"},
            name=self.endpoint,
            catch_response=True,
        ) as response:
            if response.status_code == 200:
                response.success()
            else:
                response.failure(f"Failed with {response.status_code}")
```

### Using Custom TaskSets

Create a weights module (e.g., `distribution_weights.py`):

```python
from my_tasks import TaskSet1, TaskSet2, TaskSet3

production_weights = {
    TaskSet1: 50,  # 50% of requests
    TaskSet2: 30,  # 30% of requests
    TaskSet3: 20,  # 20% of requests
}
```

Run with the custom weights:

```bash
locust -f $(ml-loadtest-file) \
    --host http://api:8000 \
    --weights-module distribution_weights
```

## Architecture

### Core Components

1. **locustfile.py** - Test orchestration with adaptive scaling
   - `EndpointCapacityExplorer`: Manages test modes and adaptive scaling
   - `LoadTestHttpUser`: Executes weighted endpoint tasks
   - Daemon thread monitors P99 and adjusts user count dynamically

2. **analyze.py** - Post-test analysis
   - `LoadTestAnalyzer`: Regression detection and rate limit calculation
   - Compares against baselines (10% regression threshold)
   - Recommends safe limits (70% of measured capacity by default)

3. **distribution_weights.py** - Production traffic weights
   - Example weight configuration for the bundled TaskSets
   - Template for custom weight modules

4. **notion_sync.py** - Notion integration
   - Syncs test results to Notion databases
   - Tracks performance metrics over time

### Data Flow

1. Locust users send requests to target endpoints
2. Response times are captured in circular buffers (maxlen=2000)
3. A daemon thread checks P99 every `--check-interval` seconds
4. The user count is adjusted based on the P99 vs. target comparison
5. After test completion, JSON/TXT reports are saved
6. The analyzer loads the reports for regression detection and recommendations
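
Steps 2-3 amount to a fixed-size sample window per endpoint. A stdlib-only sketch of that idea, with hypothetical names (`record`, `p99_ms`); the real buffers live inside the locustfile and may be shaped differently:

```python
from collections import deque

# Circular buffer: only the most recent 2000 response times are kept,
# matching the maxlen=2000 described above.
latencies: deque[float] = deque(maxlen=2000)


def record(latency_ms: float) -> None:
    latencies.append(latency_ms)


def p99_ms() -> float:
    """P99 with linear interpolation over the buffered samples."""
    data = sorted(latencies)
    k = (len(data) - 1) * 0.99
    lo = int(k)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (k - lo)
```

Because the deque is bounded, P99 always reflects the most recent load level rather than the whole run.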

## Development

### Running Tests

```bash
make test
```

### Linting and Formatting

```bash
make type-check
make format
make lint
```
+ ```