loadtest 8.0.0 → 8.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/doc/tcp-sockets.md +121 -57
  2. package/package.json +1 -1
package/doc/tcp-sockets.md CHANGED
@@ -1,6 +1,6 @@
  # TCP Sockets Performance
 
- To improve performance the author tried out using raw TCP sockets
+ To improve performance the author has tried out using raw TCP sockets
  using the [net module](https://nodejs.org/api/net.html),
  instead of the [HTTP module](https://nodejs.org/api/http.html).
  This is the story of how it went.
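For orientation, here is a minimal sketch of the idea (ours, not the package's actual code): writing a hand-built HTTP request straight onto a raw socket from the `net` module, instead of going through `http.request()`.

```js
// Minimal sketch, not loadtest's actual code: a hand-written HTTP
// request sent over a raw TCP socket using the net module.
import net from 'net'

const socket = net.connect(7357, 'localhost', () => {
	// The request line and headers are just bytes on the wire.
	socket.write('GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n')
})
socket.on('data', data => console.log(data.toString()))
socket.on('end', () => console.log('connection closed'))
```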
@@ -71,12 +71,15 @@ Finally with keep-alive, 3-core load tester against Nginx:
 
  ## Implementations
 
- All measurements against the test server using 3 cores (default):
+ All measurements against the test server using 3 cores
+ (the default configuration for our six-core machine),
+ unless specified otherwise:
 
- ```
- node bin/testserver.js
+ ```console
+ $ node bin/testserver.js
  ```
 
+ Note that the first `$` is the console prompt.
  Tests run on an Intel Core i5-12400T processor with 6 cores,
  with Ubuntu 22.04.3 LTS (Xubuntu actually).
  Performance numbers are shown in bold and as thousands of requests per second (krps):
@@ -92,15 +95,15 @@ so they are not to be compared between them.
 
  First target performance is against [Apache `ab`](https://httpd.apache.org/docs/2.4/programs/ab.html).
 
- ```
- ab -V
+ ```console
+ $ ab -V
  Version 2.3 <$Revision: 1879490 $>
  ```
 
  With 10 concurrent connections without keep-alive.
 
- ```
- ab -t 10 -c 10 http://localhost:7357/
+ ```console
+ $ ab -t 10 -c 10 http://localhost:7357/
  [...]
  Requests per second: 20395.83 [#/sec] (mean)
  ```
@@ -113,14 +116,14 @@ Keep-alive cannot be used with `ab` as far as the author knows.
  The [autocannon](https://www.npmjs.com/package/autocannon) package uses by default
  10 concurrent connections with keep-alive enabled:
 
- ```
- autocannon --version
+ ```console
+ $ autocannon --version
  autocannon v7.12.0
  node v18.17.1
  ```
 
- ```
- autocannon http://localhost:7357/
+ ```console
+ $ autocannon http://localhost:7357/
  [...]
  ┌───────────┬─────────┬─────────┬─────────┬─────────┬──────────┬─────────┬─────────┐
  │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
@@ -137,8 +140,8 @@ Keep-alive cannot be disabled with an option,
  but it can be changed directly in the code by setting the header `Connection: close`.
  Performance is near **8 krps**:
 
- ```
- npx autocannon http://localhost:7357/
+ ```console
+ $ npx autocannon http://localhost:7357/
  [...]
  ┌───────────┬────────┬────────┬────────┬────────┬────────┬─────────┬────────┐
  │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
@@ -153,15 +156,15 @@ npx autocannon http://localhost:7357/
 
  To complete the set we try `wrk`:
 
- ```
- wrk -v
+ ```console
+ $ wrk -v
  wrk debian/4.1.0-3build1 [epoll]
  ```
 
  With a single thread (core) for fair comparison we get almost **73 krps**:
 
- ```
- wrk http://localhost:7357/ -t 1
+ ```console
+ $ wrk http://localhost:7357/ -t 1
  [...]
  Requests/sec: 72639.52
  ```
@@ -173,8 +176,8 @@ running on one core.
 
  Without keep-alive close to **6 krps**:
 
- ```
- node bin/loadtest.js http://localhost:7357 --cores 1
+ ```console
+ $ node bin/loadtest.js http://localhost:7357 --cores 1
  [...]
  Effective rps: 6342
  ```
@@ -182,8 +185,8 @@ Effective rps: 6342
  Very far away from the 20 krps given by `ab`.
  With keep-alive:
 
- ```
- node bin/loadtest.js http://localhost:7357 --cores 1 -k
+ ```console
+ $ node bin/loadtest.js http://localhost:7357 --cores 1 -k
  [...]
  Effective rps: 20490
  ```
@@ -198,7 +201,7 @@ For the first implementation we want to learn if the bare sockets implementation
  In this naïve implementation we open the socket
  and send a short canned request, without taking any parameters or headers into account:
 
- ```
+ ```js
  this.params.request = `${this.params.method} ${this.params.path} HTTP/1.1\r\n\r\n`
  ```
 
@@ -207,8 +210,8 @@ just assume that it is received as one packet
  and disregard it.
  The results are almost **80 krps**:
 
- ```
- node bin/loadtest.js http://localhost:7357 --cores 1 --tcp
+ ```console
+ $ node bin/loadtest.js http://localhost:7357 --cores 1 --tcp
  [...]
  Effective rps: 79997
  ```
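As a hedged illustration (hypothetical code, not the actual implementation in `lib/tcpClient.js`), the whole naïve client fits in a few lines: fire the canned request, and treat every `data` event as one complete response.

```js
// Hypothetical sketch of the naïve client described above.
import net from 'net'

const request = 'GET / HTTP/1.1\r\n\r\n'
let completed = 0

const socket = net.connect(7357, 'localhost', () => socket.write(request))
socket.on('data', () => {
	// Naïve assumption: the whole response arrived in this one packet.
	completed += 1
	// Fire the next request immediately.
	socket.write(request)
})
setInterval(() => {
	console.log(`requests in the last second: ${completed}`)
	completed = 0
}, 1000)
```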
@@ -339,8 +342,8 @@ which can cause memory issues when size varies constantly.
 
  Now we can go back to using multiple cores:
 
- ```
- node bin/loadtest.js http://localhost:7357 --cores 3 --tcp
+ ```console
+ $ node bin/loadtest.js http://localhost:7357 --cores 3 --tcp
  [...]
  Effective rps: 115379
  ```
@@ -352,16 +355,16 @@ Now we go up to **115 krps**!
 
  What about regular `http` connections without the `--tcp` option?
  It stays at **54 krps**:
 
- ```
- node bin/loadtest.js http://localhost:7357/ -k --cores 3
+ ```console
+ $ node bin/loadtest.js http://localhost:7357/ -k --cores 3
  [...]
  Effective rps: 54432
  ```
 
  For comparison we try using `autocannon` also with three workers:
 
- ```
- autocannon http://localhost:7357/ -w 3 -c 30
+ ```console
+ $ autocannon http://localhost:7357/ -w 3 -c 30
  [...]
  ┌───────────┬───────┬───────┬─────────┬─────────┬──────────┬─────────┬───────┐
  │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
@@ -375,8 +378,8 @@ autocannon http://localhost:7357/ -w 3 -c 30
  The median rate (50th percentile) is **107 krps**.
  Now `wrk`, which yields **118 krps**:
 
- ```
- wrk http://localhost:7357/ -t 3
+ ```console
+ $ wrk http://localhost:7357/ -t 3
  [...]
  Requests/sec: 118164.03
  ```
@@ -396,26 +399,87 @@ take them to fulfill a request and then free them back to the pool.
  After the refactoring we get some bad news:
  performance has dropped back down to **60 krps**!
 
- ```
- node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
+ ```console
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
  [...]
  Effective rps: 60331
  ```
 
  We need to do the painstaking exercise of getting back to our target performance.
 
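The pool mentioned above (take a connection to fulfill a request, then free it back to the pool) can be pictured with this rough sketch; the class and method names are ours, not the package's, and the real refactoring is more involved.

```js
// Rough sketch of a socket pool; names are hypothetical.
import net from 'net'

class SocketPool {
	constructor(port, host) {
		this.port = port
		this.host = host
		this.free = []
	}
	acquire() {
		// Reuse an idle socket when available, otherwise open a new one.
		return this.free.pop() || net.connect(this.port, this.host)
	}
	release(socket) {
		if (!socket.destroyed) this.free.push(socket)
	}
}
```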
+ ### Profiling and Micro-profiling
+
+ We need to see where our microseconds (µs) are being spent.
+ Every microsecond counts: between 67 krps (15 µs per request) and 60 krps (16.7 µs per request),
+ the difference is... less than two microseconds.
+
+ We use the [`microprofiler`](https://github.com/alexfernandez/microprofiler) package,
+ which allows us to instrument the code that is sending and receiving requests.
+ For instance the function `makeRequest()` in `lib/tcpClient.js`, which sends out the request:
+
+ ```js
+ import microprofiler from 'microprofiler'
+
+ [...]
+ makeRequest() {
+ 	if (!this.running) {
+ 		return
+ 	}
+ 	// first block: connect
+ 	const start1 = microprofiler.start()
+ 	this.connect()
+ 	microprofiler.measureFrom(start1, 'connect', 100000)
+ 	// second block: create parser
+ 	const start2 = microprofiler.start()
+ 	this.parser = new Parser(this.params.method)
+ 	microprofiler.measureFrom(start2, 'create parser', 100000)
+ 	// third block: start measuring latency
+ 	const start3 = microprofiler.start()
+ 	const id = this.latency.begin()
+ 	this.currentId = id
+ 	microprofiler.measureFrom(start3, 'latency begin', 100000)
+ 	// fourth block: write to socket
+ 	const start4 = microprofiler.start()
+ 	this.connection.write(this.params.request)
+ 	microprofiler.measureFrom(start4, 'write', 100000)
+ }
+ ```
+
+ Each of the four blocks is instrumented.
+ When this code runs the output has a lot of lines like these:
+
+ ```console
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
+ [...]
+ Profiling connect: 100000 requests, mean time: 1.144 µs, rps: 6948026
+ Profiling create parser: 100000 requests, mean time: 0.152 µs, rps: 6582446
+ Profiling latency begin: 100000 requests, mean time: 1.138 µs, rps: 878664
+ Profiling write: 100000 requests, mean time: 5.669 µs, rps: 176409
+ ```
+
+ Note that the results oscillate by something like 0.3 µs from run to run,
+ so don't pay attention to very small differences.
+ The mean time is the interesting part: from 0.152 µs to create the parser up to 5.669 µs for the write.
+ There is not a lot that we can do about the `connection.write()` call,
+ since it speaks directly to the Node.js core;
+ we can try reducing the message size (not sending all headers)
+ but it doesn't seem to do much.
+ So we focus on the `this.connect()` call,
+ which we can reduce to less than a µs.
+ Then we repeat the exercise on the `finishRequest()` call to see if we can squeeze out another microsecond there.
+
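One plausible way to shave the `connect` time, an assumption on our part rather than the documented fix, is to skip reconnecting entirely when the previous keep-alive socket is still usable. A sketch in the style of the method snippet above:

```js
// Hypothetical optimization sketch for connect():
// reuse the keep-alive socket left over from the previous request.
connect() {
	if (this.connection && !this.connection.destroyed) {
		// Keep-alive: the socket is still good, nothing to do.
		return
	}
	this.connection = net.connect(this.params.port, this.params.hostname)
}
```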
  After some optimizing and a lot of bug fixing we are back to **68 krps**:
 
- ```
- node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
+ ```console
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
  [...]
  Effective rps: 68466
  ```
 
  With classic `loadtest` without the `--tcp` option, we still get **21 krps**:
 
- ```
- node bin/loadtest.js http://localhost:7357/ -k --cores 1
+ ```console
+ $ node bin/loadtest.js http://localhost:7357/ -k --cores 1
  [...]
  Effective rps: 21446
  ```
@@ -428,8 +492,8 @@ but it can be done by hacking the header as
  We get a bit less performance than the barebones implementation,
  almost **9 krps**:
 
- ```
- node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
+ ```console
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
  [...]
  Effective rps: 8682
  ```
@@ -444,8 +508,8 @@ that starts a test server and then runs a load test with the parameters we have
  Unfortunately the test server only uses one core (being run in API mode),
  and maxes out quickly at **27 krps**:
 
- ```
- node bin/tcp-performance.js
+ ```console
+ $ node bin/tcp-performance.js
  [...]
  Effective rps: 27350
  ```
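For reference, a script like `bin/tcp-performance.js` plausibly boils down to something like the following. The `loadTest` and `startServer` entry points come from the package README, but the exact signatures, the `tcp` option, and the result fields used here are assumptions, so check the actual API before copying.

```js
// Assumed sketch of a combined test-server-plus-load-test script;
// verify against the real loadtest API before relying on it.
import {loadTest, startServer} from 'loadtest'

const server = await startServer({port: 7357})
const result = await loadTest({
	url: 'http://localhost:7357',
	tcp: true, // assumption: API equivalent of the --tcp flag
	maxSeconds: 10,
})
console.log(`Effective rps: ${result.rps}`)
server.close()
```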
@@ -466,8 +530,8 @@ One part of the puzzle can be that it sends fewer headers,
  without `user-agent` or `accepts`.
  So we can do a quick trial of removing these headers in `loadtest`:
 
- ```
- node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
+ ```console
+ $ node bin/loadtest.js http://localhost:7357/ --tcp --cores 1
  [...]
  Effective rps: 29694
  ```
@@ -481,8 +545,8 @@ Our last test is to run `loadtest` against a local Nginx server,
  which is sure not to max out with only one core:
  it goes to **61 krps**.
 
- ```
- node bin/loadtest.js http://localhost:80/ --tcp --cores 1
+ ```console
+ $ node bin/loadtest.js http://localhost:80/ --tcp --cores 1
  [...]
  Effective rps: 61059
  ```
@@ -490,8 +554,8 @@ Effective rps: 61059
  While without `--tcp` we only get **19 krps**.
  A similar test with `autocannon` yields only **40 krps**:
 
- ```
- autocannon http://localhost:80/
+ ```console
+ $ autocannon http://localhost:80/
  [...]
  ┌───────────┬─────────┬─────────┬───────┬─────────┬─────────┬─────────┬─────────┐
  │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
@@ -507,16 +571,16 @@ than against our Node.js test server,
  but the numbers are quite consistent.
  While `wrk` takes the crown again with **111 krps**:
 
- ```
- wrk http://localhost:80/ -t 1
+ ```console
+ $ wrk http://localhost:80/ -t 1
  [...]
  Requests/sec: 111176.14
  ```
 
  Running `loadtest` again with three cores we get **111 krps**:
 
- ```
- node bin/loadtest.js http://localhost:80/ --tcp --cores 3
+ ```console
+ $ node bin/loadtest.js http://localhost:80/ --tcp --cores 3
  [...]
  Effective rps: 110858
  ```
@@ -524,8 +588,8 @@ Effective rps: 110858
  Without `--tcp` we get **49 krps**.
  While `autocannon` with three workers reaches **80 krps**:
 
- ```
- autocannon http://localhost:80/ -w 3
+ ```console
+ $ autocannon http://localhost:80/ -w 3
  [...]
  ┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
  │ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │ Stdev │ Min │
@@ -540,8 +604,8 @@ Consistent with the numbers reached above against a test server with 3 cores.
 
  `wrk` does not go much further with three threads than with one, at **122 krps**:
 
- ```
- wrk http://localhost:80/ -t 3
+ ```console
+ $ wrk http://localhost:80/ -t 3
  [...]
  Requests/sec: 121991.96
  ```
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "loadtest",
-   "version": "8.0.0",
+   "version": "8.0.1",
    "type": "module",
    "description": "Run load tests for your web application. Mostly ab-compatible interface, with an option to force requests per second. Includes an API for automated load testing.",
    "homepage": "https://github.com/alexfernandez/loadtest",