loadtest 7.1.1 → 8.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,633 @@
1
+ # TCP Sockets Performance
2
+
3
+ To improve performance the author has tried using raw TCP sockets
4
+ with the [net module](https://nodejs.org/api/net.html),
5
+ instead of the [HTTP module](https://nodejs.org/api/http.html).
6
+ This is the story of how it went.
7
+
8
+ ## Rationale
9
+
10
+ Keep-alive (option `-k`) makes a huge difference in performance:
11
+ instead of opening a new socket for every request,
12
+ the same connection is reused,
13
+ so it is usually much faster.
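+
+ As an illustration of what keep-alive means for a regular `http` client,
+ Node's `http.Agent` can be told to reuse sockets.
+ This is a generic sketch (port 7357 is the test server used throughout this document),
+ not necessarily how `loadtest` wires up `-k` internally:
+
+ ```js
+ import http from 'node:http'
+
+ // an agent with keepAlive: true keeps the socket open after a request,
+ // so the next request on the same agent reuses it instead of reconnecting
+ const agent = new http.Agent({keepAlive: true})
+
+ function request(remaining) {
+     if (remaining == 0) {
+         agent.destroy()
+         return
+     }
+     http.get({host: 'localhost', port: 7357, path: '/', agent}, response => {
+         response.resume()
+         response.on('end', () => request(remaining - 1))
+     })
+ }
+
+ request(3)
+ ```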
14
+
15
+ We need to run the measurements with and without it
16
+ to see how each factor is affected.
17
+
18
+ ### Summary
19
+
20
+ The following tables summarize all comparisons.
21
+ The fastest option is shown **in bold**.
22
+ Results are shown with one core (or worker, or thread) and three cores for the load tester.
23
+ Detailed explanations follow.
24
+
25
+ First without keep-alive, one-core load tester against 3-core test server:
26
+
27
+ |package|krps|
28
+ |-------|----|
29
+ |loadtest|6|
30
+ |tcp barebones|10|
31
+ |loadtest tcp|9|
32
+ |ab|**20**|
33
+ |autocannon|8|
34
+
35
+ Now with keep-alive, also one-core load tester against 3-core test server:
36
+
37
+ |package|krps|
38
+ |-------|----|
39
+ |loadtest|21|
40
+ |tcp barebones|**80**|
41
+ |loadtest tcp|68|
42
+ |autocannon|57|
43
+ |wrk|73|
44
+
45
+ With keep-alive, 3-core load tester against 3-core test server:
46
+
47
+ |package|krps|
48
+ |-------|----|
49
+ |loadtest|54|
50
+ |loadtest tcp|115|
51
+ |autocannon|107|
52
+ |wrk|**118**|
53
+
54
+ With keep-alive, 1-core load tester against Nginx:
55
+
56
+ |package|krps|
57
+ |-------|----|
58
+ |loadtest|19|
59
+ |loadtest tcp|61|
60
+ |autocannon|40|
61
+ |wrk|**111**|
62
+
63
+ Finally with keep-alive, 3-core load tester against Nginx:
64
+
65
+ |package|krps|
66
+ |-------|----|
67
+ |loadtest|49|
68
+ |loadtest tcp|111|
69
+ |autocannon|80|
70
+ |wrk|**122**|
71
+
72
+ ## Implementations
73
+
74
+ All measurements are taken against the test server using 3 cores
75
+ (the default configuration for our six-core machine),
76
+ unless specified otherwise:
77
+
78
+ ```console
79
+ $ node bin/testserver.js
80
+ ```
81
+
82
+ Note that the first `$` is the console prompt.
83
+ Tests run on an Intel Core i5-12400T processor with 6 cores,
84
+ with Ubuntu 22.04.3 LTS (Xubuntu actually).
85
+ Performance numbers are shown in bold and as thousands of requests per second (krps):
86
+ **80 krps**.
87
+
88
+ ### Targets
89
+
90
+ We compare a few packages on the test machine.
91
+ Keep in mind that `ab` does not use keep-alive while `autocannon` does,
92
+ so they should not be compared with each other.
93
+
94
+ #### Apache ab
95
+
96
+ The first performance target is [Apache `ab`](https://httpd.apache.org/docs/2.4/programs/ab.html).
97
+
98
+ ```console
99
+ $ ab -V
100
+ Version 2.3 <$Revision: 1879490 $>
101
+ ```
102
+
103
+ We run it with 10 concurrent connections and without keep-alive:
104
+
105
+ ```console
106
+ $ ab -t 10 -c 10 http://localhost:7357/
107
+ [...]
108
+ Requests per second: 20395.83 [#/sec] (mean)
109
+ ```
110
+
111
+ Results are around **20 krps**.
112
+ Keep-alive cannot be used with `ab` as far as the author knows.
113
+
114
+ #### Autocannon
115
+
116
+ The [autocannon](https://www.npmjs.com/package/autocannon) package uses by default
117
+ 10 concurrent connections with keep-alive enabled:
118
+
119
+ ```console
120
+ $ autocannon --version
121
+ autocannon v7.12.0
122
+ node v18.17.1
123
+ ```
124
+
125
+ ```console
126
+ $ autocannon http://localhost:7357/
127
+ [...]
128
+ ┌───────────┬─────────┬─────────┬─────────┬─────────┬──────────┬─────────┬─────────┐
129
+ │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
130
+ ├───────────┼─────────┼─────────┼─────────┼─────────┼──────────┼─────────┼─────────┤
131
+ │ Req/Sec │ 51295 │ 51295 │ 57343 │ 59103 │ 56798.55 │ 2226.35 │ 51285 │
132
+ ├───────────┼─────────┼─────────┼─────────┼─────────┼──────────┼─────────┼─────────┤
133
+ │ Bytes/Sec │ 6.36 MB │ 6.36 MB │ 7.11 MB │ 7.33 MB │ 7.04 MB │ 276 kB │ 6.36 MB │
134
+ └───────────┴─────────┴─────────┴─────────┴─────────┴──────────┴─────────┴─────────┘
135
+ ```
136
+
137
+ We will look at the median rate (reported as 50%),
138
+ so results are around **57 krps**.
139
+ Keep-alive cannot be disabled with an option,
140
+ but it can be changed directly in the code by setting the header `Connection: close`.
141
+ Performance is near **8 krps**:
142
+
143
+ ```console
144
+ $ npx autocannon http://localhost:7357/
145
+ [...]
146
+ ┌───────────┬────────┬────────┬────────┬────────┬────────┬─────────┬────────┐
147
+ │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
148
+ ├───────────┼────────┼────────┼────────┼────────┼────────┼─────────┼────────┤
149
+ │ Req/Sec │ 5831 │ 5831 │ 7703 │ 8735 │ 7674.4 │ 753.53 │ 5828 │
150
+ ├───────────┼────────┼────────┼────────┼────────┼────────┼─────────┼────────┤
151
+ │ Bytes/Sec │ 560 kB │ 560 kB │ 739 kB │ 839 kB │ 737 kB │ 72.4 kB │ 559 kB │
152
+ └───────────┴────────┴────────┴────────┴────────┴────────┴─────────┴────────┘
153
+ ```
154
+
155
+ #### `wrk`
156
+
157
+ To complete the set we try `wrk`:
158
+
159
+ ```console
160
+ $ wrk -v
161
+ wrk debian/4.1.0-3build1 [epoll]
162
+ ```
163
+
164
+ With a single thread (core) for fair comparison we get almost **73 krps**:
165
+
166
+ ```console
167
+ $ wrk http://localhost:7357/ -t 1
168
+ [...]
169
+ Requests/sec: 72639.52
170
+ ```
171
+
172
+ ### Baseline
173
+
174
+ The baseline is the existing `http` implementation in `loadtest` 7.1.1,
175
+ running on one core.
176
+
177
+ Without keep-alive close to **6 krps**:
178
+
179
+ ```console
180
+ $ node bin/loadtest.js http://localhost:7357 --cores 1
181
+ [...]
182
+ Effective rps: 6342
183
+ ```
184
+
185
+ Very far away from the 20 krps given by `ab`.
186
+ With keep-alive:
187
+
188
+ ```console
189
+ $ node bin/loadtest.js http://localhost:7357 --cores 1 -k
190
+ [...]
191
+ Effective rps: 20490
192
+ ```
193
+
194
+ We are around **20 krps**.
195
+ Again quite far from the 57 krps by `autocannon`;
196
+ close to `ab`, but `ab` doesn't use keep-alive so the comparison is meaningless.
197
+
198
+ ### Proof of Concept
199
+
200
+ For the first implementation we want to learn if bare sockets are worth the time.
201
+ In this naïve version we open the socket
202
+ and send a short canned request, without taking into account any parameters or headers:
203
+
204
+ ```js
205
+ this.params.request = `${this.params.method} ${this.params.path} HTTP/1.1\r\n\r\n`
206
+ ```
207
+
208
+ We don't parse the result either,
209
+ just assume that it is received as one packet
210
+ and disregard it.
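+
+ A minimal sketch of this barebones approach with the `net` module could look as follows;
+ the host, port and canned request are illustrative, not the actual `loadtest` code:
+
+ ```js
+ import net from 'node:net'
+
+ const request = 'GET / HTTP/1.1\r\n\r\n'
+ // open a single socket and keep it open (keep-alive)
+ const socket = net.connect(7357, 'localhost', () => socket.write(request))
+ socket.on('data', () => {
+     // disregard the response entirely and fire the next canned request
+     socket.write(request)
+ })
+ socket.on('error', error => console.error(error))
+ ```
+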
211
+ The results are almost **80 krps**:
212
+
213
+ ```console
214
+ $ node bin/loadtest.js http://localhost:7357 --cores 1 --tcp
215
+ [...]
216
+ Effective rps: 79997
217
+ ```
218
+
219
+ Very promising start!
220
+ Obviously this only works properly with GET requests without any body,
221
+ so it is only useful as a benchmark:
222
+ we want to make sure we don't lose too much performance when adding all the functionality.
223
+
224
+ We can also do a barebones implementation without keep-alive,
225
+ creating a new socket for every request.
226
+ The result is around **10 krps**,
227
+ still far from Apache `ab`.
228
+ But here there is not much we can do:
229
+ apparently writing sockets in C is more efficient than in Node.js,
230
+ or perhaps `ab` has some tricks up its sleeve,
231
+ probably some low-level optimizations.
232
+ In the Node.js code there is not much fat we can trim.
233
+
234
+ So from now on we will focus on the keep-alive tests.
235
+
236
+ ### Adding Headers
237
+
238
+ First we add the proper headers to the request.
239
+ This means we are sending out more data for each round,
240
+ but performance doesn't seem to be altered much,
241
+ still around **80 krps**.
242
+
243
+ The request we are now sending is:
244
+
245
+ ```
246
+ GET / HTTP/1.1
247
+ host: localhost:7357
248
+ accept: */*
249
+ user-agent: loadtest/7.1.0
250
+ Connection: keep-alive
251
+
252
+ ```
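+
+ As a rough sketch, the request string can be assembled from the method, path and headers
+ along these lines (the property names are illustrative, not necessarily those of `lib/tcpClient.js`):
+
+ ```js
+ // hypothetical params object, for illustration only
+ const params = {
+     method: 'GET',
+     path: '/',
+     headers: {
+         host: 'localhost:7357',
+         accept: '*/*',
+         'user-agent': 'loadtest/7.1.0',
+         connection: 'keep-alive',
+     },
+ }
+ const headerLines = Object.entries(params.headers)
+     .map(([name, value]) => `${name}: ${value}\r\n`)
+     .join('')
+ params.request = `${params.method} ${params.path} HTTP/1.1\r\n${headerLines}\r\n`
+ ```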
253
+
254
+ One interesting bit is that sending the header `connection: keep-alive`
255
+ does not affect performance;
256
+ however, sending `connection: close` brings performance down to 8 requests per second.
257
+ Probably there are huge inefficiencies in the way sockets are created.
258
+ This should be investigated in depth
259
+ if we want to have a test without keep-alive at some point.
260
+
261
+ ### Parsing Responses
262
+
263
+ Now we come to the really critical part:
264
+ parsing the response including the content.
265
+
266
+ A very simple implementation just parses the response as a string,
267
+ reads the first line and extracts the status code.
268
+ Performance is now down to around **68 krps**.
269
+ Note that we are still assuming that each response is a single packet.
270
+ A sample response from the test server included with `loadtest`
271
+ can look like this:
272
+
273
+ ```
274
+ HTTP/1.1 200 OK
275
+ Date: Fri, 08 Sep 2023 11:04:21 GMT
276
+ Connection: keep-alive
277
+ Keep-Alive: timeout=5
278
+ Content-Length: 2
279
+
280
+ OK
281
+ ```
282
+
283
+ We can see a very simple HTTP response that fits in one packet.
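+
+ A sketch of this simple parsing, assuming the whole response arrives in a single packet
+ (variable names are illustrative):
+
+ ```js
+ // `data` is a Buffer holding the whole response, like the sample above
+ function parseStatusCode(data) {
+     const text = data.toString()
+     // the first line looks like "HTTP/1.1 200 OK"
+     const firstLine = text.slice(0, text.indexOf('\r\n'))
+     return parseInt(firstLine.split(' ')[1])
+ }
+ ```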
284
+
285
+ ### Parsing All Headers
286
+
287
+ It is possible that a response comes in multiple packets,
288
+ so we need to keep some state between packets.
289
+ This is the next step:
290
+ we should make sure that we have received the whole body and not just part of it.
291
+ The way to do this is to read the `content-length` header,
292
+ and then check that the body we have received has this length;
293
+ only then can we be 100% sure that we have the whole body.
294
+
295
+ Therefore we need to parse all incoming headers,
296
+ find the content length (in the header `content-length`),
297
+ and then parse the rest of the packet to check that we have the whole body.
298
+ Again, a very simple implementation that parses content length and checks against body length
299
+ goes down to **63 krps**.
300
+
301
+ If the body is not complete we need to keep the partial body,
302
+ and add the rest as it comes until we reach the required `content-length`.
303
+ Keep in mind that even headers can be so long that they come in several packets!
304
+ In this case even more state needs to be stored between packets.
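+
+ A simplified sketch of this stateful parsing:
+ it accumulates packets until the headers are complete
+ and the body has reached the announced `content-length`.
+ This is a schematic version, not the actual parser in `loadtest`:
+
+ ```js
+ // schematic parser that keeps partial data between packets
+ class ResponseParser {
+     constructor() {
+         this.buffer = ''
+         this.contentLength = 0
+         this.bodyStart = -1
+     }
+     // returns true once the whole response has been received
+     addPacket(data) {
+         this.buffer += data.toString()
+         if (this.bodyStart == -1) {
+             const headerEnd = this.buffer.indexOf('\r\n\r\n')
+             if (headerEnd == -1) return false // headers still incomplete
+             this.bodyStart = headerEnd + 4
+             const match = /content-length:\s*(\d+)/i.exec(this.buffer.slice(0, headerEnd))
+             this.contentLength = match ? parseInt(match[1]) : 0
+         }
+         return this.buffer.length - this.bodyStart >= this.contentLength
+     }
+ }
+ ```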
305
+
306
+ With decent packet parsing,
307
+ including multi-packet headers and bodies,
308
+ performance goes down to **60 krps**.
309
+ Most of the time is spent parsing headers,
310
+ since the body only needs to be checked for length,
311
+ not parsed.
312
+
313
+ ### Considering Duplicates
314
+
315
+ Given that responses tend to be identical in a load test,
316
+ perhaps changing a date or a serial number,
317
+ we can apply a trick:
318
+ when receiving a packet we check if it's similar enough to one received before,
319
+ so we can skip parsing the headers altogether.
320
+
321
+ The algorithm checks the following conditions:
322
+
323
+ - Length of the received packet is less than 1000 bytes.
324
+ - Length of the packet is identical to one received before.
325
+ - Length of headers and body are also identical.
326
+ - Same status as before.
327
+
328
+ If all of them apply then the headers in the message are not parsed:
329
+ we assume that the packet is complete and we don't need to check the content length.
330
+ Keep in mind that we _might_ be wrong:
331
+ we might have received a packet with just part of a response
332
+ that happens to have the same length, status and header length as a previous complete response,
333
+ and which is also below 1000 bytes.
334
+ This is however extremely unlikely.
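+
+ A sketch of such a check, with illustrative names;
+ `packet` is the received data as a string,
+ and `previousPackets` would map the total length of earlier, fully parsed responses
+ to their status and header length:
+
+ ```js
+ // returns true if the new packet can be assumed to be a complete response,
+ // so its headers don't need to be parsed again
+ function looksLikeDuplicate(packet, previousPackets) {
+     if (packet.length >= 1000) return false
+     const previous = previousPackets.get(packet.length)
+     if (!previous) return false
+     // cheap checks that avoid a full header parse;
+     // with equal total and header lengths the body length is also equal
+     const headerLength = packet.indexOf('\r\n\r\n')
+     const status = packet.slice(9, 12)
+     return headerLength == previous.headerLength && status == previous.status
+ }
+ ```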
335
+
336
+ Using this trick we go back to **67 krps**.
337
+
338
+ Packets of different lengths are stored for comparison,
339
+ which could cause memory issues when sizes vary constantly.
340
+
341
+ ### Multiprocess, Multi-core
342
+
343
+ Now we can go back to using multiple cores:
344
+
345
+ ```console
346
+ $ node bin/loadtest.js http://localhost:7357 --cores 3 --tcp
347
+ [...]
348
+ Effective rps: 115379
349
+ ```
350
+
351
+ In this case we use half the available cores,
352
+ leaving the rest for the test server.
353
+ Now we go up to **115 krps**!
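+
+ For reference, this is roughly the mechanism behind the `--cores` option:
+ the primary process forks one worker per core with the
+ [cluster module](https://nodejs.org/api/cluster.html),
+ and each worker runs its own load test.
+ A generic sketch, simplified from what `lib/cluster.js` actually does:
+
+ ```js
+ import cluster from 'node:cluster'
+
+ const cores = 3
+
+ if (cluster.isPrimary) {
+     // fork one worker per requested core
+     for (let index = 0; index < cores; index++) {
+         cluster.fork()
+     }
+ } else {
+     // each worker sends its own share of the requests
+     console.log(`worker ${cluster.worker.id} starting load test`)
+ }
+ ```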
354
+
355
+ What about regular `http` connections without the `--tcp` option?
356
+ It stays at **54 krps**:
357
+
358
+ ```console
359
+ $ node bin/loadtest.js http://localhost:7357/ -k --cores 3
360
+ [...]
361
+ Effective rps: 54432
362
+ ```
363
+
364
+ For comparison we try using `autocannon` also with three workers:
365
+
366
+ ```console
367
+ $ autocannon http://localhost:7357/ -w 3 -c 30
368
+ [...]
369
+ ┌───────────┬───────┬───────┬─────────┬─────────┬──────────┬─────────┬───────┐
370
+ │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
371
+ ├───────────┼───────┼───────┼─────────┼─────────┼──────────┼─────────┼───────┤
372
+ │ Req/Sec │ 88511 │ 88511 │ 107071 │ 110079 │ 105132.8 │ 6148.39 │ 88460 │
373
+ ├───────────┼───────┼───────┼─────────┼─────────┼──────────┼─────────┼───────┤
374
+ │ Bytes/Sec │ 11 MB │ 11 MB │ 13.3 MB │ 13.6 MB │ 13 MB │ 764 kB │ 11 MB │
375
+ └───────────┴───────┴───────┴─────────┴─────────┴──────────┴─────────┴───────┘
376
+ ```
377
+
378
+ The median rate (50th percentile) is **107 krps**.
379
+ Now we try `wrk`, which yields **118 krps**:
380
+
381
+ ```console
382
+ $ wrk http://localhost:7357/ -t 3
383
+ [...]
384
+ Requests/sec: 118164.03
385
+ ```
386
+
387
+ So `loadtest` has managed to be slightly above `autocannon` using multiple tricks,
388
+ but below `wrk`.
389
+
390
+ ### Pool of Clients
391
+
392
+ We are not done yet.
393
+ As it happens the new code is not very precise with connections and clients:
394
+ in particular it doesn't play nice with our `--rps` feature,
395
+ which is used to send an exact number of requests per second.
396
+ We need to do a complete refactoring to have a pool of clients:
397
+ take a client to fulfill a request and then free it back to the pool.
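+
+ A schematic sketch of such a pool: a client is taken out to make a request
+ and returned when the request finishes
+ (illustrative code, not the actual implementation):
+
+ ```js
+ // schematic pool of reusable clients
+ class Pool {
+     constructor(createClient, size) {
+         this.free = []
+         this.busy = new Set()
+         for (let index = 0; index < size; index++) {
+             this.free.push(createClient())
+         }
+     }
+     acquire() {
+         const client = this.free.pop()
+         if (!client) return null
+         this.busy.add(client)
+         return client
+     }
+     // called when a request finishes, successfully or not
+     finishRequest(client) {
+         this.busy.delete(client)
+         this.free.push(client)
+     }
+ }
+ ```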
398
+
399
+ After the refactoring we get some bad news:
400
+ performance has dropped down back to **60 krps**!
401
+
402
+ ```console
403
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
404
+ [...]
405
+ Effective rps: 60331
406
+ ```
407
+
408
+ We need to do the painstaking exercise of getting back to our target performance.
409
+
410
+ ### Profiling and Micro-profiling
411
+
412
+ We need to see where our microseconds (µs) are being spent.
413
+ Every microsecond counts: between 67 krps (15 µs per request) and 60 krps (16.7 µs per request)
414
+ the difference is... less than two microseconds.
415
+
416
+ We use the [`microprofiler`](https://github.com/alexfernandez/microprofiler) package,
417
+ which allows us to instrument the code that is sending and receiving requests.
418
+ For instance the function `makeRequest()` in `lib/tcpClient.js`, which sends out the request:
419
+
420
+ ```js
421
+ import microprofiler from 'microprofiler'
422
+
423
+ [...]
424
+ makeRequest() {
425
+ if (!this.running) {
426
+ return
427
+ }
428
+ // first block: connect
429
+ const start1 = microprofiler.start()
430
+ this.connect()
431
+ microprofiler.measureFrom(start1, 'connect', 100000)
432
+ // second block: create parser
433
+ const start2 = microprofiler.start()
434
+ this.parser = new Parser(this.params.method)
435
+ microprofiler.measureFrom(start2, 'create parser', 100000)
436
+ // third block: start measuring latency
437
+ const start3 = microprofiler.start()
438
+ const id = this.latency.begin();
439
+ this.currentId = id
440
+ microprofiler.measureFrom(start3, 'latency begin', 100000)
441
+ // fourth block: write to socket
442
+ const start4 = microprofiler.start()
443
+ this.connection.write(this.params.request)
444
+ microprofiler.measureFrom(start4, 'write', 100000)
445
+ }
446
+ ```
447
+
448
+ Each of the four blocks is instrumented.
449
+ When this code runs the output has a lot of lines like this:
450
+
451
+ ```console
452
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
453
+ [...]
454
+ Profiling connect: 100000 requests, mean time: 1.144 µs, rps: 6948026
455
+ Profiling create parser: 100000 requests, mean time: 0.152 µs, rps: 6582446
456
+ Profiling latency begin: 100000 requests, mean time: 1.138 µs, rps: 878664
457
+ Profiling write: 100000 requests, mean time: 5.669 µs, rps: 176409
458
+ ```
459
+
460
+ Note that the results oscillate by something like 0.3 µs from time to time,
461
+ so don't pay attention to very small differences.
462
+ Mean time is the interesting part: from 0.152 µs to create the parser to 5.669 µs for the write.
463
+ There is not a lot that we can do with the `connection.write()` call,
464
+ since it speaks directly to the Node.js core;
465
+ we can try reducing the message size (not sending all headers)
466
+ but it doesn't seem to do much.
467
+ So we focus on the `this.connect()` call,
468
+ which we can reduce to less than a µs.
469
+ Then we repeat the exercise on the `finishRequest()` call to see if we can squeeze another microsecond there.
470
+
471
+ After some optimizing and a lot of bug fixing we are back to **68 krps**:
472
+
473
+ ```console
474
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
475
+ [...]
476
+ Effective rps: 68466
477
+ ```
478
+
479
+ With classic `loadtest` without the `--tcp` option, we still get **21 krps**:
480
+
481
+ ```console
482
+ $ node bin/loadtest.js http://localhost:7357/ -k --cores 1
483
+ [...]
484
+ Effective rps: 21446
485
+ ```
486
+
487
+ Marginally better than before.
488
+ By the way, it would be a good idea to try again without keep-alive.
489
+ There is currently no option to disable keep-alive,
490
+ but it can be done by hacking the header as
491
+ `Connection: close`.
492
+ We get a bit less performance than the barebones implementation,
493
+ almost **9 krps**:
494
+
495
+ ```console
496
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
497
+ [...]
498
+ Effective rps: 8682
499
+ ```
500
+
501
+ ### Reproducible Script
502
+
503
+ The current setup is a bit cumbersome: start the server,
504
+ then start the load test with the right parameters.
505
+ We need to have a reproducible way of getting performance measurements.
506
+ So we introduce the script `bin/tcp-performance.js`,
507
+ which starts a test server and then runs a load test with the parameters we have been using.
508
+ Unfortunately the test server only uses one core (being run in API mode),
509
+ and maxes out quickly at **27 krps**.
510
+
511
+ ```console
512
+ $ node bin/tcp-performance.js
513
+ [...]
514
+ Effective rps: 27350
515
+ ```
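+
+ The script is essentially a programmatic version of the manual steps.
+ A rough sketch of the idea, assuming the package exposes `startServer()` and `loadTest()` functions
+ with option names matching the command line; the actual API and option names may differ:
+
+ ```js
+ // rough sketch only: the real bin/tcp-performance.js may look different
+ import {startServer, loadTest} from 'loadtest'
+
+ // start the bundled test server in API mode, then hammer it over TCP sockets
+ const server = await startServer({port: 7357})
+ const result = await loadTest({
+     url: 'http://localhost:7357/',
+     tcp: true,
+     maxSeconds: 10,
+ })
+ console.log(`Effective rps: ${result.rps}`)
+ await server.close()
+ ```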
516
+
517
+ The author has carried out multiple attempts at getting a multi-core test server running:
518
+ using the cluster module,
519
+ running it as a multi-core process,
520
+ running it as a script using
521
+ [child_process.exec()](https://nodejs.org/api/child_process.html#child_processexeccommand-options-callback)...
522
+ They all add too much complexity.
523
+ So we can use the single-core measurements as a benchmark,
524
+ even if they are not representative of full operation.
525
+
526
+ By the way, `autocannon` does a bit better in this scenario (single-core test server),
527
+ as it reaches **43 krps**.
528
+ How does it do this magic?
529
+ One part of the puzzle may be that it sends fewer headers,
530
+ without `user-agent` or `accept`.
531
+ So we can do a quick trial of removing these headers in `loadtest`:
532
+
533
+ ```console
534
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
535
+ [...]
536
+ Effective rps: 29694
537
+ ```
538
+
539
+ Performance is improved a bit but not much, to almost **30 krps**.
540
+ How `autocannon` does this wizardry is not evident.
541
+
542
+ ### Face-off with Nginx
543
+
544
+ Our last test is to run `loadtest` against a local Nginx server,
545
+ which is sure not to max out with only one core:
546
+ it goes to **61 krps**.
547
+
548
+ ```console
549
+ $ node bin/loadtest.js http://localhost:80/ --tcp --cores 1
550
+ [...]
551
+ Effective rps: 61059
552
+ ```
553
+
554
+ Without `--tcp` we only get **19 krps**.
555
+ A similar test with `autocannon` yields only **40 krps**:
556
+
557
+ ```console
558
+ $ autocannon http://localhost:80/
559
+ [...]
560
+ ┌───────────┬─────────┬─────────┬───────┬─────────┬─────────┬─────────┬─────────┐
561
+ │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
562
+ ├───────────┼─────────┼─────────┼───────┼─────────┼─────────┼─────────┼─────────┤
563
+ │ Req/Sec │ 34591 │ 34591 │ 40735 │ 43679 │ 40400 │ 2664.56 │ 34590 │
564
+ ├───────────┼─────────┼─────────┼───────┼─────────┼─────────┼─────────┼─────────┤
565
+ │ Bytes/Sec │ 29.7 MB │ 29.7 MB │ 35 MB │ 37.5 MB │ 34.7 MB │ 2.29 MB │ 29.7 MB │
566
+ └───────────┴─────────┴─────────┴───────┴─────────┴─────────┴─────────┴─────────┘
567
+ ```
568
+
569
+ Now it's not evident either why it reaches lower performance against Nginx
570
+ than against our Node.js test server,
571
+ but the numbers are quite consistent.
572
+ While `wrk` takes the crown again with **111 krps**:
573
+
574
+ ```console
575
+ $ wrk http://localhost:80/ -t 1
576
+ [...]
577
+ Requests/sec: 111176.14
578
+ ```
579
+
580
+ Running `loadtest` again with three cores we get **111 krps**:
581
+
582
+ ```console
583
+ $ node bin/loadtest.js http://localhost:80/ --tcp --cores 3
584
+ [...]
585
+ Effective rps: 110858
586
+ ```
587
+
588
+ Without `--tcp` we get **49 krps**.
589
+ While `autocannon` with three workers reaches **80 krps**:
590
+
591
+ ```console
592
+ $ autocannon http://localhost:80/ -w 3
593
+ [...]
594
+ ┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
595
+ │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
596
+ ├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
597
+ │ Req/Sec │ 65727 │ 65727 │ 80191 │ 84223 │ 78668.8 │ 5071.38 │ 65676 │
598
+ ├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
599
+ │ Bytes/Sec │ 56.4 MB │ 56.4 MB │ 68.9 MB │ 72.4 MB │ 67.6 MB │ 4.36 MB │ 56.4 MB │
600
+ └───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
601
+ ```
602
+
603
+ Consistent with the numbers reached above against a test server with 3 cores.
604
+
605
+ `wrk` does not go much further with three threads than with one, at **122 krps**:
606
+
607
+ ```console
608
+ $ wrk http://localhost:80/ -t 3
609
+ [...]
610
+ Requests/sec: 121991.96
611
+ ```
612
+
613
+ ## Conclusions
614
+
615
+ It is good to know that `loadtest` can hold its own against such beasts as `ab`,
616
+ `autocannon` or `wrk`.
617
+ `ab` and `wrk` are written in C,
618
+ while `autocannon` is maintained by Matteo Collina, one of the leading Node.js performance gurus.
619
+
620
+ There are some unexplained effects,
621
+ like why `autocannon` performs so poorly against Nginx.
622
+ It would be really interesting to understand it.
623
+
624
+ Now with TCP sockets and keep-alive you can use `loadtest`
625
+ to go beyond the paltry 6 to 20 krps that we used to get:
626
+ especially with multiple cores you can reach 100 krps locally.
627
+ If you need performance that goes beyond that,
628
+ you can try some of the other options used here.
629
+
630
+ Note that there are many options not yet implemented for TCP sockets,
631
+ like secure connections with HTTPS.
632
+ They will come in the next releases.
633
+
package/lib/baseClient.js CHANGED
@@ -3,9 +3,9 @@ import {addUserAgent} from './headers.js'
3
3
 
4
4
 
5
5
  export class BaseClient {
6
- constructor(loadTest, options) {
6
+ constructor(loadTest) {
7
7
  this.loadTest = loadTest;
8
- this.options = options;
8
+ this.options = loadTest.options;
9
9
  this.generateMessage = undefined;
10
10
  }
11
11
 
@@ -23,11 +23,7 @@ export class BaseClient {
23
23
  }
24
24
  }
25
25
  this.loadTest.latency.end(id, errorCode);
26
- let callback;
27
- if (!this.options.requestsPerSecond) {
28
- callback = () => this.makeRequest();
29
- }
30
- this.loadTest.finishRequest(error, result, callback);
26
+ this.loadTest.pool.finishRequest(this, result, error);
31
27
  };
32
28
  }
33
29
 
package/lib/cluster.js CHANGED
@@ -17,7 +17,7 @@ export async function runTask(cores, task) {
17
17
  if (cluster.isPrimary) {
18
18
  return await runWorkers(cores)
19
19
  } else {
20
- const result = await task(cluster.worker.id)
20
+ const result = await task(cluster.worker.id) || '0'
21
21
  process.send(result)
22
22
  }
23
23
  }