@wazir-dev/cli 1.0.0

Files changed (629)
  1. package/AGENTS.md +111 -0
  2. package/CHANGELOG.md +14 -0
  3. package/CONTRIBUTING.md +101 -0
  4. package/LICENSE +21 -0
  5. package/README.md +314 -0
  6. package/assets/composition-engine.mmd +34 -0
  7. package/assets/demo-script.sh +17 -0
  8. package/assets/logo-dark.svg +14 -0
  9. package/assets/logo.svg +14 -0
  10. package/assets/pipeline.mmd +39 -0
  11. package/assets/record-demo.sh +51 -0
  12. package/docs/README.md +51 -0
  13. package/docs/adapters/context-mode.md +60 -0
  14. package/docs/concepts/architecture.md +87 -0
  15. package/docs/concepts/artifact-model.md +60 -0
  16. package/docs/concepts/composition-engine.md +36 -0
  17. package/docs/concepts/indexing-and-recall.md +160 -0
  18. package/docs/concepts/observability.md +41 -0
  19. package/docs/concepts/roles-and-workflows.md +59 -0
  20. package/docs/concepts/terminology-policy.md +27 -0
  21. package/docs/getting-started/01-installation.md +78 -0
  22. package/docs/getting-started/02-first-run.md +102 -0
  23. package/docs/getting-started/03-adding-to-project.md +15 -0
  24. package/docs/getting-started/04-host-setup.md +15 -0
  25. package/docs/guides/ci-integration.md +15 -0
  26. package/docs/guides/creating-skills.md +15 -0
  27. package/docs/guides/expertise-module-authoring.md +15 -0
  28. package/docs/guides/hook-development.md +15 -0
  29. package/docs/guides/memory-and-learnings.md +34 -0
  30. package/docs/guides/multi-host-export.md +15 -0
  31. package/docs/guides/troubleshooting.md +101 -0
  32. package/docs/guides/writing-custom-roles.md +15 -0
  33. package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
  34. package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
  35. package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
  36. package/docs/readmes/INDEX.md +99 -0
  37. package/docs/readmes/features/expertise/README.md +171 -0
  38. package/docs/readmes/features/exports/README.md +222 -0
  39. package/docs/readmes/features/hooks/README.md +103 -0
  40. package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
  41. package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
  42. package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
  43. package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
  44. package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
  45. package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
  46. package/docs/readmes/features/hooks/session-start.md +119 -0
  47. package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
  48. package/docs/readmes/features/roles/README.md +157 -0
  49. package/docs/readmes/features/roles/clarifier.md +152 -0
  50. package/docs/readmes/features/roles/content-author.md +190 -0
  51. package/docs/readmes/features/roles/designer.md +193 -0
  52. package/docs/readmes/features/roles/executor.md +184 -0
  53. package/docs/readmes/features/roles/learner.md +210 -0
  54. package/docs/readmes/features/roles/planner.md +182 -0
  55. package/docs/readmes/features/roles/researcher.md +164 -0
  56. package/docs/readmes/features/roles/reviewer.md +184 -0
  57. package/docs/readmes/features/roles/specifier.md +162 -0
  58. package/docs/readmes/features/roles/verifier.md +215 -0
  59. package/docs/readmes/features/schemas/README.md +178 -0
  60. package/docs/readmes/features/skills/README.md +63 -0
  61. package/docs/readmes/features/skills/brainstorming.md +96 -0
  62. package/docs/readmes/features/skills/debugging.md +148 -0
  63. package/docs/readmes/features/skills/design.md +120 -0
  64. package/docs/readmes/features/skills/prepare-next.md +109 -0
  65. package/docs/readmes/features/skills/run-audit.md +159 -0
  66. package/docs/readmes/features/skills/scan-project.md +109 -0
  67. package/docs/readmes/features/skills/self-audit.md +176 -0
  68. package/docs/readmes/features/skills/tdd.md +137 -0
  69. package/docs/readmes/features/skills/using-skills.md +92 -0
  70. package/docs/readmes/features/skills/verification.md +120 -0
  71. package/docs/readmes/features/skills/writing-plans.md +104 -0
  72. package/docs/readmes/features/tooling/README.md +320 -0
  73. package/docs/readmes/features/workflows/README.md +186 -0
  74. package/docs/readmes/features/workflows/author.md +181 -0
  75. package/docs/readmes/features/workflows/clarify.md +154 -0
  76. package/docs/readmes/features/workflows/design-review.md +171 -0
  77. package/docs/readmes/features/workflows/design.md +169 -0
  78. package/docs/readmes/features/workflows/discover.md +162 -0
  79. package/docs/readmes/features/workflows/execute.md +173 -0
  80. package/docs/readmes/features/workflows/learn.md +167 -0
  81. package/docs/readmes/features/workflows/plan-review.md +165 -0
  82. package/docs/readmes/features/workflows/plan.md +170 -0
  83. package/docs/readmes/features/workflows/prepare-next.md +167 -0
  84. package/docs/readmes/features/workflows/review.md +169 -0
  85. package/docs/readmes/features/workflows/run-audit.md +191 -0
  86. package/docs/readmes/features/workflows/spec-challenge.md +159 -0
  87. package/docs/readmes/features/workflows/specify.md +160 -0
  88. package/docs/readmes/features/workflows/verify.md +177 -0
  89. package/docs/readmes/packages/README.md +50 -0
  90. package/docs/readmes/packages/ajv.md +117 -0
  91. package/docs/readmes/packages/context-mode.md +118 -0
  92. package/docs/readmes/packages/gray-matter.md +116 -0
  93. package/docs/readmes/packages/node-test.md +137 -0
  94. package/docs/readmes/packages/yaml.md +112 -0
  95. package/docs/reference/configuration-reference.md +159 -0
  96. package/docs/reference/expertise-index.md +52 -0
  97. package/docs/reference/git-flow.md +43 -0
  98. package/docs/reference/hooks.md +87 -0
  99. package/docs/reference/host-exports.md +50 -0
  100. package/docs/reference/launch-checklist.md +172 -0
  101. package/docs/reference/marketplace-listings.md +76 -0
  102. package/docs/reference/release-process.md +34 -0
  103. package/docs/reference/roles-reference.md +77 -0
  104. package/docs/reference/skills.md +33 -0
  105. package/docs/reference/templates.md +29 -0
  106. package/docs/reference/tooling-cli.md +94 -0
  107. package/docs/truth-claims.yaml +222 -0
  108. package/expertise/PROGRESS.md +63 -0
  109. package/expertise/README.md +18 -0
  110. package/expertise/antipatterns/PROGRESS.md +56 -0
  111. package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
  112. package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
  113. package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
  114. package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
  115. package/expertise/antipatterns/backend/index.md +24 -0
  116. package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
  117. package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
  118. package/expertise/antipatterns/code/async-antipatterns.md +622 -0
  119. package/expertise/antipatterns/code/code-smells.md +1186 -0
  120. package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
  121. package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
  122. package/expertise/antipatterns/code/index.md +27 -0
  123. package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
  124. package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
  125. package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
  126. package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
  127. package/expertise/antipatterns/design/dark-patterns.md +1121 -0
  128. package/expertise/antipatterns/design/index.md +22 -0
  129. package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
  130. package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
  131. package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
  132. package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
  133. package/expertise/antipatterns/frontend/index.md +23 -0
  134. package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
  135. package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
  136. package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
  137. package/expertise/antipatterns/index.md +31 -0
  138. package/expertise/antipatterns/performance/index.md +20 -0
  139. package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
  140. package/expertise/antipatterns/performance/premature-optimization.md +623 -0
  141. package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
  142. package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
  143. package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
  144. package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
  145. package/expertise/antipatterns/process/index.md +23 -0
  146. package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
  147. package/expertise/antipatterns/security/index.md +20 -0
  148. package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
  149. package/expertise/antipatterns/security/security-theater.md +843 -0
  150. package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
  151. package/expertise/architecture/PROGRESS.md +70 -0
  152. package/expertise/architecture/data/caching-architecture.md +671 -0
  153. package/expertise/architecture/data/data-consistency.md +574 -0
  154. package/expertise/architecture/data/data-modeling.md +536 -0
  155. package/expertise/architecture/data/event-streams-and-queues.md +634 -0
  156. package/expertise/architecture/data/index.md +25 -0
  157. package/expertise/architecture/data/search-architecture.md +663 -0
  158. package/expertise/architecture/data/sql-vs-nosql.md +708 -0
  159. package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
  160. package/expertise/architecture/decisions/build-vs-buy.md +616 -0
  161. package/expertise/architecture/decisions/index.md +23 -0
  162. package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
  163. package/expertise/architecture/decisions/technology-selection.md +616 -0
  164. package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
  165. package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
  166. package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
  167. package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
  168. package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
  169. package/expertise/architecture/distributed/index.md +25 -0
  170. package/expertise/architecture/distributed/saga-pattern.md +797 -0
  171. package/expertise/architecture/foundations/architectural-thinking.md +460 -0
  172. package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
  173. package/expertise/architecture/foundations/design-principles-solid.md +649 -0
  174. package/expertise/architecture/foundations/domain-driven-design.md +719 -0
  175. package/expertise/architecture/foundations/index.md +25 -0
  176. package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
  177. package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
  178. package/expertise/architecture/index.md +34 -0
  179. package/expertise/architecture/integration/api-design-graphql.md +638 -0
  180. package/expertise/architecture/integration/api-design-grpc.md +804 -0
  181. package/expertise/architecture/integration/api-design-rest.md +892 -0
  182. package/expertise/architecture/integration/index.md +25 -0
  183. package/expertise/architecture/integration/third-party-integration.md +795 -0
  184. package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
  185. package/expertise/architecture/integration/websockets-realtime.md +791 -0
  186. package/expertise/architecture/mobile-architecture/index.md +22 -0
  187. package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
  188. package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
  189. package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
  190. package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
  191. package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
  192. package/expertise/architecture/patterns/event-driven.md +797 -0
  193. package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
  194. package/expertise/architecture/patterns/index.md +27 -0
  195. package/expertise/architecture/patterns/layered-architecture.md +736 -0
  196. package/expertise/architecture/patterns/microservices.md +753 -0
  197. package/expertise/architecture/patterns/modular-monolith.md +692 -0
  198. package/expertise/architecture/patterns/monolith.md +626 -0
  199. package/expertise/architecture/patterns/plugin-architecture.md +735 -0
  200. package/expertise/architecture/patterns/serverless.md +780 -0
  201. package/expertise/architecture/scaling/database-scaling.md +615 -0
  202. package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
  203. package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
  204. package/expertise/architecture/scaling/index.md +24 -0
  205. package/expertise/architecture/scaling/multi-tenancy.md +800 -0
  206. package/expertise/architecture/scaling/stateless-design.md +787 -0
  207. package/expertise/backend/embedded-firmware.md +625 -0
  208. package/expertise/backend/go.md +853 -0
  209. package/expertise/backend/index.md +24 -0
  210. package/expertise/backend/java-spring.md +448 -0
  211. package/expertise/backend/node-typescript.md +625 -0
  212. package/expertise/backend/python-fastapi.md +724 -0
  213. package/expertise/backend/rust.md +458 -0
  214. package/expertise/backend/solidity.md +711 -0
  215. package/expertise/composition-map.yaml +443 -0
  216. package/expertise/content/foundations/content-modeling.md +395 -0
  217. package/expertise/content/foundations/editorial-standards.md +449 -0
  218. package/expertise/content/foundations/index.md +24 -0
  219. package/expertise/content/foundations/microcopy.md +455 -0
  220. package/expertise/content/foundations/terminology-governance.md +509 -0
  221. package/expertise/content/index.md +34 -0
  222. package/expertise/content/patterns/accessibility-copy.md +518 -0
  223. package/expertise/content/patterns/index.md +24 -0
  224. package/expertise/content/patterns/notification-content.md +433 -0
  225. package/expertise/content/patterns/sample-content.md +486 -0
  226. package/expertise/content/patterns/state-copy.md +439 -0
  227. package/expertise/design/PROGRESS.md +58 -0
  228. package/expertise/design/disciplines/dark-mode-theming.md +577 -0
  229. package/expertise/design/disciplines/design-systems.md +595 -0
  230. package/expertise/design/disciplines/index.md +25 -0
  231. package/expertise/design/disciplines/information-architecture.md +800 -0
  232. package/expertise/design/disciplines/interaction-design.md +788 -0
  233. package/expertise/design/disciplines/responsive-design.md +552 -0
  234. package/expertise/design/disciplines/usability-testing.md +516 -0
  235. package/expertise/design/disciplines/user-research.md +792 -0
  236. package/expertise/design/foundations/accessibility-design.md +796 -0
  237. package/expertise/design/foundations/color-theory.md +797 -0
  238. package/expertise/design/foundations/iconography.md +795 -0
  239. package/expertise/design/foundations/index.md +26 -0
  240. package/expertise/design/foundations/motion-and-animation.md +653 -0
  241. package/expertise/design/foundations/rtl-design.md +585 -0
  242. package/expertise/design/foundations/spacing-and-layout.md +607 -0
  243. package/expertise/design/foundations/typography.md +800 -0
  244. package/expertise/design/foundations/visual-hierarchy.md +761 -0
  245. package/expertise/design/index.md +32 -0
  246. package/expertise/design/patterns/authentication-flows.md +474 -0
  247. package/expertise/design/patterns/content-consumption.md +789 -0
  248. package/expertise/design/patterns/data-display.md +618 -0
  249. package/expertise/design/patterns/e-commerce.md +1494 -0
  250. package/expertise/design/patterns/feedback-and-states.md +642 -0
  251. package/expertise/design/patterns/forms-and-input.md +819 -0
  252. package/expertise/design/patterns/gamification.md +801 -0
  253. package/expertise/design/patterns/index.md +31 -0
  254. package/expertise/design/patterns/microinteractions.md +449 -0
  255. package/expertise/design/patterns/navigation.md +800 -0
  256. package/expertise/design/patterns/notifications.md +705 -0
  257. package/expertise/design/patterns/onboarding.md +700 -0
  258. package/expertise/design/patterns/search-and-filter.md +601 -0
  259. package/expertise/design/patterns/settings-and-preferences.md +768 -0
  260. package/expertise/design/patterns/social-and-community.md +748 -0
  261. package/expertise/design/platforms/desktop-native.md +612 -0
  262. package/expertise/design/platforms/index.md +25 -0
  263. package/expertise/design/platforms/mobile-android.md +825 -0
  264. package/expertise/design/platforms/mobile-cross-platform.md +983 -0
  265. package/expertise/design/platforms/mobile-ios.md +699 -0
  266. package/expertise/design/platforms/tablet.md +794 -0
  267. package/expertise/design/platforms/web-dashboard.md +790 -0
  268. package/expertise/design/platforms/web-responsive.md +550 -0
  269. package/expertise/design/psychology/behavioral-nudges.md +449 -0
  270. package/expertise/design/psychology/cognitive-load.md +1191 -0
  271. package/expertise/design/psychology/error-psychology.md +778 -0
  272. package/expertise/design/psychology/index.md +22 -0
  273. package/expertise/design/psychology/persuasive-design.md +736 -0
  274. package/expertise/design/psychology/user-mental-models.md +623 -0
  275. package/expertise/design/tooling/open-pencil.md +266 -0
  276. package/expertise/frontend/angular.md +1073 -0
  277. package/expertise/frontend/desktop-electron.md +546 -0
  278. package/expertise/frontend/flutter.md +782 -0
  279. package/expertise/frontend/index.md +27 -0
  280. package/expertise/frontend/native-android.md +409 -0
  281. package/expertise/frontend/native-ios.md +490 -0
  282. package/expertise/frontend/react-native.md +1160 -0
  283. package/expertise/frontend/react.md +808 -0
  284. package/expertise/frontend/vue.md +1089 -0
  285. package/expertise/humanize/domain-rules-code.md +79 -0
  286. package/expertise/humanize/domain-rules-content.md +67 -0
  287. package/expertise/humanize/domain-rules-technical-docs.md +56 -0
  288. package/expertise/humanize/index.md +35 -0
  289. package/expertise/humanize/self-audit-checklist.md +87 -0
  290. package/expertise/humanize/sentence-patterns.md +218 -0
  291. package/expertise/humanize/vocabulary-blacklist.md +105 -0
  292. package/expertise/i18n/PROGRESS.md +65 -0
  293. package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
  294. package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
  295. package/expertise/i18n/advanced/complex-scripts.md +30 -0
  296. package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
  297. package/expertise/i18n/advanced/testing-i18n.md +28 -0
  298. package/expertise/i18n/content/content-adaptation.md +23 -0
  299. package/expertise/i18n/content/locale-specific-formatting.md +23 -0
  300. package/expertise/i18n/content/machine-translation-integration.md +28 -0
  301. package/expertise/i18n/content/translation-management.md +29 -0
  302. package/expertise/i18n/foundations/date-time-calendars.md +67 -0
  303. package/expertise/i18n/foundations/i18n-architecture.md +272 -0
  304. package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
  305. package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
  306. package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
  307. package/expertise/i18n/foundations/string-externalization.md +236 -0
  308. package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
  309. package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
  310. package/expertise/i18n/index.md +38 -0
  311. package/expertise/i18n/platform/backend-i18n.md +31 -0
  312. package/expertise/i18n/platform/flutter-i18n.md +148 -0
  313. package/expertise/i18n/platform/native-android-i18n.md +36 -0
  314. package/expertise/i18n/platform/native-ios-i18n.md +36 -0
  315. package/expertise/i18n/platform/react-i18n.md +103 -0
  316. package/expertise/i18n/platform/web-css-i18n.md +81 -0
  317. package/expertise/i18n/rtl/arabic-specific.md +175 -0
  318. package/expertise/i18n/rtl/hebrew-specific.md +149 -0
  319. package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
  320. package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
  321. package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
  322. package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
  323. package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
  324. package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
  325. package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
  326. package/expertise/i18n/rtl/rtl-typography.md +160 -0
  327. package/expertise/index.md +113 -0
  328. package/expertise/index.yaml +216 -0
  329. package/expertise/infrastructure/cloud-aws.md +597 -0
  330. package/expertise/infrastructure/cloud-gcp.md +599 -0
  331. package/expertise/infrastructure/cybersecurity.md +816 -0
  332. package/expertise/infrastructure/database-mongodb.md +447 -0
  333. package/expertise/infrastructure/database-postgres.md +400 -0
  334. package/expertise/infrastructure/devops-cicd.md +787 -0
  335. package/expertise/infrastructure/index.md +27 -0
  336. package/expertise/performance/PROGRESS.md +50 -0
  337. package/expertise/performance/backend/api-latency.md +1204 -0
  338. package/expertise/performance/backend/background-jobs.md +506 -0
  339. package/expertise/performance/backend/connection-pooling.md +1209 -0
  340. package/expertise/performance/backend/database-query-optimization.md +515 -0
  341. package/expertise/performance/backend/index.md +23 -0
  342. package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
  343. package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
  344. package/expertise/performance/foundations/caching-strategies.md +489 -0
  345. package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
  346. package/expertise/performance/foundations/index.md +24 -0
  347. package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
  348. package/expertise/performance/foundations/memory-management.md +964 -0
  349. package/expertise/performance/foundations/performance-budgets.md +1314 -0
  350. package/expertise/performance/index.md +31 -0
  351. package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
  352. package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
  353. package/expertise/performance/infrastructure/index.md +22 -0
  354. package/expertise/performance/infrastructure/load-balancing.md +1081 -0
  355. package/expertise/performance/infrastructure/observability.md +1079 -0
  356. package/expertise/performance/mobile/index.md +23 -0
  357. package/expertise/performance/mobile/mobile-animations.md +544 -0
  358. package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
  359. package/expertise/performance/mobile/mobile-network.md +452 -0
  360. package/expertise/performance/mobile/mobile-rendering.md +599 -0
  361. package/expertise/performance/mobile/mobile-startup-time.md +505 -0
  362. package/expertise/performance/platform-specific/flutter-performance.md +647 -0
  363. package/expertise/performance/platform-specific/index.md +22 -0
  364. package/expertise/performance/platform-specific/node-performance.md +1307 -0
  365. package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
  366. package/expertise/performance/platform-specific/react-performance.md +1403 -0
  367. package/expertise/performance/web/bundle-optimization.md +1239 -0
  368. package/expertise/performance/web/image-and-media.md +636 -0
  369. package/expertise/performance/web/index.md +24 -0
  370. package/expertise/performance/web/network-optimization.md +1133 -0
  371. package/expertise/performance/web/rendering-performance.md +1098 -0
  372. package/expertise/performance/web/ssr-and-hydration.md +918 -0
  373. package/expertise/performance/web/web-vitals.md +1374 -0
  374. package/expertise/quality/accessibility.md +985 -0
  375. package/expertise/quality/evidence-based-verification.md +499 -0
  376. package/expertise/quality/index.md +24 -0
  377. package/expertise/quality/ml-model-audit.md +614 -0
  378. package/expertise/quality/performance.md +600 -0
  379. package/expertise/quality/testing-api.md +891 -0
  380. package/expertise/quality/testing-mobile.md +496 -0
  381. package/expertise/quality/testing-web.md +849 -0
  382. package/expertise/security/PROGRESS.md +54 -0
  383. package/expertise/security/agentic-identity.md +540 -0
  384. package/expertise/security/compliance-frameworks.md +601 -0
  385. package/expertise/security/data/data-encryption.md +364 -0
  386. package/expertise/security/data/data-privacy-gdpr.md +692 -0
  387. package/expertise/security/data/database-security.md +1171 -0
  388. package/expertise/security/data/index.md +22 -0
  389. package/expertise/security/data/pii-handling.md +531 -0
  390. package/expertise/security/foundations/authentication.md +1041 -0
  391. package/expertise/security/foundations/authorization.md +603 -0
  392. package/expertise/security/foundations/cryptography.md +1001 -0
  393. package/expertise/security/foundations/index.md +25 -0
  394. package/expertise/security/foundations/owasp-top-10.md +1354 -0
  395. package/expertise/security/foundations/secrets-management.md +1217 -0
  396. package/expertise/security/foundations/secure-sdlc.md +700 -0
  397. package/expertise/security/foundations/supply-chain-security.md +698 -0
  398. package/expertise/security/index.md +31 -0
  399. package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
  400. package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
  401. package/expertise/security/infrastructure/container-security.md +721 -0
  402. package/expertise/security/infrastructure/incident-response.md +1295 -0
  403. package/expertise/security/infrastructure/index.md +24 -0
  404. package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
  405. package/expertise/security/infrastructure/network-security.md +1337 -0
  406. package/expertise/security/mobile/index.md +23 -0
  407. package/expertise/security/mobile/mobile-android-security.md +1218 -0
  408. package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
  409. package/expertise/security/mobile/mobile-data-storage.md +1265 -0
  410. package/expertise/security/mobile/mobile-ios-security.md +1401 -0
  411. package/expertise/security/mobile/mobile-network-security.md +1520 -0
  412. package/expertise/security/smart-contract-security.md +594 -0
  413. package/expertise/security/testing/index.md +22 -0
  414. package/expertise/security/testing/penetration-testing.md +1258 -0
  415. package/expertise/security/testing/security-code-review.md +1765 -0
  416. package/expertise/security/testing/threat-modeling.md +1074 -0
  417. package/expertise/security/testing/vulnerability-scanning.md +1062 -0
  418. package/expertise/security/web/api-security.md +586 -0
  419. package/expertise/security/web/cors-and-headers.md +433 -0
  420. package/expertise/security/web/csrf.md +562 -0
  421. package/expertise/security/web/file-upload.md +1477 -0
  422. package/expertise/security/web/index.md +25 -0
  423. package/expertise/security/web/injection.md +1375 -0
  424. package/expertise/security/web/session-management.md +1101 -0
  425. package/expertise/security/web/xss.md +1158 -0
  426. package/exports/README.md +17 -0
  427. package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
  428. package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
  429. package/exports/hosts/claude/.claude/agents/designer.md +55 -0
  430. package/exports/hosts/claude/.claude/agents/executor.md +55 -0
  431. package/exports/hosts/claude/.claude/agents/learner.md +51 -0
  432. package/exports/hosts/claude/.claude/agents/planner.md +53 -0
  433. package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
  434. package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
  435. package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
  436. package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
  437. package/exports/hosts/claude/.claude/commands/author.md +42 -0
  438. package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
  439. package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
  440. package/exports/hosts/claude/.claude/commands/design.md +44 -0
  441. package/exports/hosts/claude/.claude/commands/discover.md +37 -0
  442. package/exports/hosts/claude/.claude/commands/execute.md +48 -0
  443. package/exports/hosts/claude/.claude/commands/learn.md +38 -0
  444. package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
  445. package/exports/hosts/claude/.claude/commands/plan.md +39 -0
  446. package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
  447. package/exports/hosts/claude/.claude/commands/review.md +40 -0
  448. package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
  449. package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
  450. package/exports/hosts/claude/.claude/commands/specify.md +38 -0
  451. package/exports/hosts/claude/.claude/commands/verify.md +37 -0
  452. package/exports/hosts/claude/.claude/settings.json +34 -0
  453. package/exports/hosts/claude/CLAUDE.md +19 -0
  454. package/exports/hosts/claude/export.manifest.json +38 -0
  455. package/exports/hosts/claude/host-package.json +67 -0
  456. package/exports/hosts/codex/AGENTS.md +19 -0
  457. package/exports/hosts/codex/export.manifest.json +38 -0
  458. package/exports/hosts/codex/host-package.json +41 -0
  459. package/exports/hosts/cursor/.cursor/hooks.json +16 -0
  460. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
  461. package/exports/hosts/cursor/export.manifest.json +38 -0
  462. package/exports/hosts/cursor/host-package.json +42 -0
  463. package/exports/hosts/gemini/GEMINI.md +19 -0
  464. package/exports/hosts/gemini/export.manifest.json +38 -0
  465. package/exports/hosts/gemini/host-package.json +41 -0
  466. package/hooks/README.md +18 -0
  467. package/hooks/definitions/loop_cap_guard.yaml +21 -0
  468. package/hooks/definitions/post_tool_capture.yaml +24 -0
  469. package/hooks/definitions/pre_compact_summary.yaml +19 -0
  470. package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
  471. package/hooks/definitions/protected_path_write_guard.yaml +19 -0
  472. package/hooks/definitions/session_start.yaml +19 -0
  473. package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
  474. package/hooks/loop-cap-guard +17 -0
  475. package/hooks/post-tool-lint +36 -0
  476. package/hooks/protected-path-write-guard +17 -0
  477. package/hooks/session-start +41 -0
  478. package/llms-full.txt +2355 -0
  479. package/llms.txt +43 -0
  480. package/package.json +79 -0
  481. package/roles/README.md +20 -0
  482. package/roles/clarifier.md +42 -0
  483. package/roles/content-author.md +63 -0
  484. package/roles/designer.md +55 -0
  485. package/roles/executor.md +55 -0
  486. package/roles/learner.md +51 -0
  487. package/roles/planner.md +53 -0
  488. package/roles/researcher.md +43 -0
  489. package/roles/reviewer.md +54 -0
  490. package/roles/specifier.md +47 -0
  491. package/roles/verifier.md +71 -0
  492. package/schemas/README.md +24 -0
  493. package/schemas/accepted-learning.schema.json +20 -0
  494. package/schemas/author-artifact.schema.json +156 -0
  495. package/schemas/clarification.schema.json +19 -0
  496. package/schemas/design-artifact.schema.json +80 -0
  497. package/schemas/docs-claim.schema.json +18 -0
  498. package/schemas/export-manifest.schema.json +20 -0
  499. package/schemas/hook.schema.json +67 -0
  500. package/schemas/host-export-package.schema.json +18 -0
  501. package/schemas/implementation-plan.schema.json +19 -0
  502. package/schemas/proposed-learning.schema.json +19 -0
  503. package/schemas/research.schema.json +18 -0
  504. package/schemas/review.schema.json +29 -0
  505. package/schemas/run-manifest.schema.json +18 -0
  506. package/schemas/spec-challenge.schema.json +18 -0
  507. package/schemas/spec.schema.json +20 -0
  508. package/schemas/usage.schema.json +102 -0
  509. package/schemas/verification-proof.schema.json +29 -0
  510. package/schemas/wazir-manifest.schema.json +173 -0
  511. package/skills/README.md +40 -0
  512. package/skills/brainstorming/SKILL.md +77 -0
  513. package/skills/debugging/SKILL.md +50 -0
  514. package/skills/design/SKILL.md +61 -0
  515. package/skills/dispatching-parallel-agents/SKILL.md +128 -0
  516. package/skills/executing-plans/SKILL.md +70 -0
  517. package/skills/finishing-a-development-branch/SKILL.md +169 -0
  518. package/skills/humanize/SKILL.md +123 -0
  519. package/skills/init-pipeline/SKILL.md +124 -0
  520. package/skills/prepare-next/SKILL.md +20 -0
  521. package/skills/receiving-code-review/SKILL.md +123 -0
  522. package/skills/requesting-code-review/SKILL.md +105 -0
  523. package/skills/requesting-code-review/code-reviewer.md +108 -0
  524. package/skills/run-audit/SKILL.md +197 -0
  525. package/skills/scan-project/SKILL.md +41 -0
  526. package/skills/self-audit/SKILL.md +153 -0
  527. package/skills/subagent-driven-development/SKILL.md +154 -0
  528. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
  529. package/skills/subagent-driven-development/implementer-prompt.md +102 -0
  530. package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
  531. package/skills/tdd/SKILL.md +23 -0
  532. package/skills/using-git-worktrees/SKILL.md +163 -0
  533. package/skills/using-skills/SKILL.md +95 -0
  534. package/skills/verification/SKILL.md +22 -0
  535. package/skills/wazir/SKILL.md +463 -0
  536. package/skills/writing-plans/SKILL.md +30 -0
  537. package/skills/writing-skills/SKILL.md +157 -0
  538. package/skills/writing-skills/anthropic-best-practices.md +122 -0
  539. package/skills/writing-skills/persuasion-principles.md +50 -0
  540. package/templates/README.md +20 -0
  541. package/templates/artifacts/README.md +10 -0
  542. package/templates/artifacts/accepted-learning.md +19 -0
  543. package/templates/artifacts/accepted-learning.template.json +12 -0
  544. package/templates/artifacts/author.md +74 -0
  545. package/templates/artifacts/author.template.json +19 -0
  546. package/templates/artifacts/clarification.md +21 -0
  547. package/templates/artifacts/clarification.template.json +12 -0
  548. package/templates/artifacts/execute-notes.md +19 -0
  549. package/templates/artifacts/implementation-plan.md +21 -0
  550. package/templates/artifacts/implementation-plan.template.json +11 -0
  551. package/templates/artifacts/learning-proposal.md +19 -0
  552. package/templates/artifacts/next-run-handoff.md +21 -0
  553. package/templates/artifacts/plan-review.md +19 -0
  554. package/templates/artifacts/proposed-learning.template.json +12 -0
  555. package/templates/artifacts/research.md +21 -0
  556. package/templates/artifacts/research.template.json +12 -0
  557. package/templates/artifacts/review-findings.md +19 -0
  558. package/templates/artifacts/review.template.json +11 -0
  559. package/templates/artifacts/run-manifest.template.json +8 -0
  560. package/templates/artifacts/spec-challenge.md +19 -0
  561. package/templates/artifacts/spec-challenge.template.json +11 -0
  562. package/templates/artifacts/spec.md +21 -0
  563. package/templates/artifacts/spec.template.json +12 -0
  564. package/templates/artifacts/verification-proof.md +19 -0
  565. package/templates/artifacts/verification-proof.template.json +11 -0
  566. package/templates/examples/accepted-learning.example.json +14 -0
  567. package/templates/examples/author.example.json +152 -0
  568. package/templates/examples/clarification.example.json +15 -0
  569. package/templates/examples/docs-claim.example.json +8 -0
  570. package/templates/examples/export-manifest.example.json +7 -0
  571. package/templates/examples/host-export-package.example.json +11 -0
  572. package/templates/examples/implementation-plan.example.json +17 -0
  573. package/templates/examples/proposed-learning.example.json +13 -0
  574. package/templates/examples/research.example.json +15 -0
  575. package/templates/examples/research.example.md +6 -0
  576. package/templates/examples/review.example.json +17 -0
  577. package/templates/examples/run-manifest.example.json +9 -0
  578. package/templates/examples/spec-challenge.example.json +14 -0
  579. package/templates/examples/spec.example.json +21 -0
  580. package/templates/examples/verification-proof.example.json +21 -0
  581. package/templates/examples/wazir-manifest.example.yaml +65 -0
  582. package/templates/task-definition-schema.md +99 -0
  583. package/tooling/README.md +20 -0
  584. package/tooling/src/adapters/context-mode.js +50 -0
  585. package/tooling/src/capture/command.js +376 -0
  586. package/tooling/src/capture/store.js +99 -0
  587. package/tooling/src/capture/usage.js +270 -0
  588. package/tooling/src/checks/branches.js +50 -0
  589. package/tooling/src/checks/brand-truth.js +110 -0
  590. package/tooling/src/checks/changelog.js +231 -0
  591. package/tooling/src/checks/command-registry.js +36 -0
  592. package/tooling/src/checks/commits.js +102 -0
  593. package/tooling/src/checks/docs-drift.js +103 -0
  594. package/tooling/src/checks/docs-truth.js +201 -0
  595. package/tooling/src/checks/runtime-surface.js +156 -0
  596. package/tooling/src/cli.js +116 -0
  597. package/tooling/src/command-options.js +56 -0
  598. package/tooling/src/commands/validate.js +320 -0
  599. package/tooling/src/doctor/command.js +91 -0
  600. package/tooling/src/export/command.js +77 -0
  601. package/tooling/src/export/compiler.js +498 -0
  602. package/tooling/src/guards/loop-cap-guard.js +52 -0
  603. package/tooling/src/guards/protected-path-write-guard.js +67 -0
  604. package/tooling/src/index/command.js +152 -0
  605. package/tooling/src/index/storage.js +1061 -0
  606. package/tooling/src/index/summarizers.js +261 -0
  607. package/tooling/src/loaders.js +18 -0
  608. package/tooling/src/project-root.js +22 -0
  609. package/tooling/src/recall/command.js +225 -0
  610. package/tooling/src/schema-validator.js +30 -0
  611. package/tooling/src/state-root.js +40 -0
  612. package/tooling/src/status/command.js +71 -0
  613. package/wazir.manifest.yaml +135 -0
  614. package/workflows/README.md +19 -0
  615. package/workflows/author.md +42 -0
  616. package/workflows/clarify.md +38 -0
  617. package/workflows/design-review.md +46 -0
  618. package/workflows/design.md +44 -0
  619. package/workflows/discover.md +37 -0
  620. package/workflows/execute.md +48 -0
  621. package/workflows/learn.md +38 -0
  622. package/workflows/plan-review.md +42 -0
  623. package/workflows/plan.md +39 -0
  624. package/workflows/prepare-next.md +37 -0
  625. package/workflows/review.md +40 -0
  626. package/workflows/run-audit.md +41 -0
  627. package/workflows/spec-challenge.md +41 -0
  628. package/workflows/specify.md +38 -0
  629. package/workflows/verify.md +37 -0
@@ -0,0 +1,1081 @@
1
+ # Load Balancing: Performance Engineering Reference
2
+
3
+ > **Scope**: Algorithms, architectures, software/cloud implementations, health checks,
4
+ > session affinity, connection draining, bottlenecks, anti-patterns, and decision frameworks.
5
+ > Every claim includes measured numbers. Last updated: 2026-03-08.
6
+
7
+ ---
8
+
9
+ ## Table of Contents
10
+
11
+ 1. [Load Balancing Algorithms](#1-load-balancing-algorithms)
12
+ 2. [Layer 4 vs Layer 7 Load Balancing](#2-layer-4-vs-layer-7-load-balancing)
13
+ 3. [Software Load Balancers: Benchmarks](#3-software-load-balancers-benchmarks)
14
+ 4. [Cloud Load Balancers](#4-cloud-load-balancers)
15
+ 5. [Health Check Strategies and Failover Timing](#5-health-check-strategies-and-failover-timing)
16
+ 6. [Session Affinity and Sticky Sessions](#6-session-affinity-and-sticky-sessions)
17
+ 7. [Connection Draining and Graceful Shutdown](#7-connection-draining-and-graceful-shutdown)
18
+ 8. [Common Bottlenecks](#8-common-bottlenecks)
19
+ 9. [Anti-Patterns](#9-anti-patterns)
20
+ 10. [Before/After: Real-World Improvements](#10-beforeafter-real-world-improvements)
21
+ 11. [Decision Tree: Which Algorithm Should I Use?](#11-decision-tree-which-algorithm-should-i-use)
22
+ 12. [Configuration Quick Reference](#12-configuration-quick-reference)
23
+ 13. [Sources](#13-sources)
24
+
25
+ ---
26
+
27
+ ## 1. Load Balancing Algorithms
28
+
29
+ ### 1.1 Round Robin
30
+
31
+ The simplest algorithm. Requests are distributed sequentially across servers in a fixed
32
+ rotation. No per-request state is required.
33
+
34
+ | Metric | Value |
35
+ |----------------------|------------------------------------------------------------|
36
+ | CPU overhead | Negligible -- single counter increment per request |
37
+ | Memory per backend | O(1) -- one pointer to current position |
38
+ | Distribution quality | Perfect only when all servers are identical and all requests cost the same |
39
+ | Best for | Homogeneous fleets, stateless services, < 20 backends |
40
+
41
+ **Weakness**: Round robin assumes all servers are equally powerful and all requests need
42
+ equal processing time. A slow server accumulates a backlog while round robin keeps
43
+ sending it work at the same rate. In heterogeneous fleets, this causes p99 latency to
44
+ spike by 2-5x as the weakest server becomes a bottleneck.
45
+
46
+ ### 1.2 Weighted Round Robin
47
+
48
+ Extends round robin by assigning a weight to each server proportional to its capacity.
49
+ A server with weight 3 receives three times the requests of a server with weight 1.
50
+
51
+ | Metric | Value |
52
+ |----------------------|------------------------------------------------------------|
53
+ | CPU overhead | Negligible -- weighted counter |
54
+ | Configuration effort | Must manually assign and maintain weights |
55
+ | Distribution quality | Good when capacity ratios are known and stable |
56
+ | Best for | Mixed hardware fleets, gradual canary deployments |
57
+
58
+ **Practical note**: In Kubernetes, weighted round robin is commonly used for canary
59
+ deployments where a new version receives 5% of traffic (weight 1) while the stable
60
+ version keeps 95% (weight 19); a 1:9 split gives the canary 10%.
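The weighting described above can be sketched as follows. This is a minimal illustration of the "smooth" weighted round robin variant, which interleaves picks rather than sending bursts. The class name, backend names, and weights are assumptions made for the example and are not taken from any particular load balancer.

```python
from collections import Counter


class SmoothWeightedRoundRobin:
    """Smooth weighted round robin: spreads picks out instead of bursting."""

    def __init__(self, weights):
        # weights: mapping of backend name -> positive integer weight
        self.weights = dict(weights)
        self.current = {name: 0 for name in self.weights}
        self.total = sum(self.weights.values())

    def next_backend(self):
        # Raise every backend's running score by its weight...
        for name, weight in self.weights.items():
            self.current[name] += weight
        # ...pick the highest score, then charge it the total weight.
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


# Hypothetical 19:1 canary split: the canary gets 1 pick in every 20.
balancer = SmoothWeightedRoundRobin({"stable": 19, "canary": 1})
picks = Counter(balancer.next_backend() for _ in range(200))
print(picks)
```

Over any full cycle of 20 picks the canary is chosen exactly once, so 200 picks give it 10 requests (5%).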
61
+
62
+ ### 1.3 Least Connections
63
+
64
+ Routes each new request to the server currently handling the fewest active connections.
65
+ Requires maintaining a per-server connection counter.
66
+
67
+ | Metric | Value |
68
+ |----------------------|------------------------------------------------------------|
69
+ | CPU overhead | O(n) scan or O(log n) with a min-heap, where n = backend count |
70
+ | Memory per backend | 8 bytes (one atomic counter) |
71
+ | Distribution quality | Excellent for variable-duration requests |
72
+ | Best for | WebSocket servers, database proxies, APIs with mixed latency |
73
+
74
+ **Why it works**: If one server is processing a slow query (500ms) while others finish
75
+ in 10ms, least connections naturally routes 50x more requests to the fast servers during
76
+ that window. Round robin would still send the same rate to the slow server.
77
+
78
+ ### 1.4 Weighted Least Connections
79
+
80
+ Combines least connections with server capacity weights. The algorithm selects the server
81
+ with the lowest ratio of (active connections / weight). A server with weight 3 and 6
82
+ connections has the same priority as a server with weight 1 and 2 connections.
83
+
84
+ | Metric | Value |
85
+ |----------------------|------------------------------------------------------------|
86
+ | CPU overhead | O(n) scan with division per backend |
87
+ | Best for | Heterogeneous fleets with variable-duration requests |
88
+ | Production use | Default in many enterprise load balancers (F5, Citrix) |
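The selection rule can be sketched in a few lines. This is an illustrative O(n) scan, not the implementation of any specific product, and the backend names and the `(active, weight)` tuple layout are assumptions. Plain least connections is the special case where every weight is 1.

```python
from fractions import Fraction


def pick_weighted_least_conn(backends):
    """Return the backend with the lowest active-connections / weight ratio.

    backends: mapping of name -> (active_connections, weight), weight > 0.
    """
    # Fraction keeps the comparison exact; ties go to the first backend listed.
    return min(backends, key=lambda name: Fraction(*backends[name]))


# "large" (6 conns, weight 3) and "small" (2 conns, weight 1) tie at ratio 2;
# "idle" (3 conns, weight 3) has ratio 1 and is selected.
backends = {"large": (6, 3), "small": (2, 1), "idle": (3, 3)}
print(pick_weighted_least_conn(backends))
```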
89
+
90
+ ### 1.5 Consistent Hashing
91
+
92
+ Maps both servers and requests onto a hash ring. Each request is routed to the nearest
93
+ server clockwise on the ring. When a server is added or removed, only 1/n of requests
94
+ are redistributed (where n = number of servers), compared to full redistribution in
95
+ modulo hashing.
96
+
97
+ | Metric | Value |
98
+ |----------------------|------------------------------------------------------------|
99
+ | CPU overhead | O(log n) binary search on the ring |
100
+ | Memory | 100-200 virtual nodes per server typical (Ketama-style rings: 160 points per server) |
101
+ | Redistribution | ~1/n keys move when a server is added/removed |
102
+ | Best for | Caching layers, stateful services, rate limiting per user |
103
+
104
+ **Virtual nodes**: Without virtual nodes, hash distribution is uneven. With 150+ virtual
105
+ nodes per server, standard deviation of load drops below 10% of the mean. Envoy's ring
106
+ hash load balancer uses a configurable minimum ring size (default: 1024 entries) to
107
+ control distribution quality vs memory usage.
108
+
109
+ **Warning**: Consistent hashing gives up adaptiveness. A slow server still receives its
110
+ fixed share of traffic. Combine with circuit breaking for production safety.
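A hash ring with virtual nodes can be sketched as below. This is a minimal illustration under stated assumptions: MD5 is used only as a convenient well-mixed hash, 160 virtual nodes per server is an arbitrary choice within the range above, and the server and key names are invented for the example.

```python
import bisect
import hashlib


def _hash(key):
    # First 8 bytes of MD5 as an integer; any well-mixed hash would do here.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, vnodes=160):
        self.vnodes = vnodes
        self._points = []   # sorted hash positions on the ring
        self._owner = {}    # hash position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        # First point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["s1", "s2", "s3", "s4"])
keys = [f"user-{i}" for i in range(10_000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("s5")
moved = sum(1 for k in keys if ring.lookup(k) != before[k])
print(f"{moved / len(keys):.1%} of keys moved after adding a fifth server")
```

Growing from four to five servers should move roughly one fifth of the keys, and every key that moves lands on the new server. Modulo hashing would instead remap most keys.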
111
+
112
+ ### 1.6 Random with Two Choices (Power of Two)
113
+
114
+ Pick two servers at random. Send the request to whichever has fewer active connections.
115
+ Described by Michael Mitzenmacher in his 1996 dissertation at UC Berkeley.
116
+
117
+ | Metric | Value |
118
+ |----------------------|------------------------------------------------------------|
119
+ | CPU overhead | O(1) -- two random selections plus one comparison |
120
+ | Max queue length | O(log log n) vs O(log n / log log n) for pure random |
121
+ | Best for | Distributed load balancers with incomplete state |
122
+
123
+ **The exponential improvement**: Pure random load balancing produces a maximum queue
124
+ length of O(log n / log log n). Adding just one more random choice -- checking two
125
+ servers instead of one -- reduces the maximum queue to O(log log n). This is an
126
+ exponential improvement from a single additional comparison.
127
+
128
+ **Real-world benchmarks** (HAProxy test-driving Power of Two):
129
+ - **No contention**: All algorithms perform similarly; Power of Two within 2% of
130
+ least connections.
131
+ - **Medium contention**: Random produces ~10% lower throughput than alternatives;
132
+ Power of Two matches least connections.
133
+ - **High contention**: Power of Two reduces peak server load beyond what least
134
+ connections achieves, because it avoids the "herd behavior" where all load balancers
135
+ simultaneously pick the same least-loaded server.
136
+
137
+ **Why it matters for distributed systems**: In a fleet of load balancers, each has a
138
+ slightly stale view of backend load. Least connections in this scenario causes all
139
+ balancers to simultaneously route to the server they all think is least loaded,
140
+ creating a thundering herd. Power of Two's randomization breaks this synchronization.
141
+ NGINX adopted this algorithm for its `random two least_conn` directive.
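The gap between one and two random choices is easy to reproduce with a small simulation. The sketch below uses the classic balls-into-bins simplification, which is an assumption: requests are assigned once and never complete, so "load" is a running count, not a live queue. The function name and the seed are arbitrary.

```python
import random


def max_load(n_servers, n_requests, choices, rng):
    """Assign requests using `choices` random candidates; return the peak load."""
    load = [0] * n_servers
    for _ in range(n_requests):
        # Sample candidates (with replacement) and keep the least loaded one.
        candidates = [rng.randrange(n_servers) for _ in range(choices)]
        target = min(candidates, key=load.__getitem__)
        load[target] += 1
    return max(load)


rng = random.Random(42)
n = 10_000
one = max_load(n, n, choices=1, rng=rng)
two = max_load(n, n, choices=2, rng=rng)
print(f"one choice: max load {one}; two choices: max load {two}")
```

With as many requests as servers, the single-choice maximum grows like log n / log log n, while the two-choice maximum stays near log log n, so the second number should come out visibly smaller.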
142
+
143
+ ### 1.7 Algorithm Comparison Summary
144
+
145
+ | Algorithm | Overhead | State Required | Heterogeneous Fleet | Variable Requests | Cache Affinity |
146
+ |------------------------|----------|---------------|--------------------|--------------------|---------------|
147
+ | Round Robin | O(1) | Counter | Poor | Poor | None |
148
+ | Weighted Round Robin | O(1) | Counter+weights| Good | Poor | None |
149
+ | Least Connections | O(n) | Counters | Poor | Excellent | None |
150
+ | Weighted Least Conn | O(n) | Counters+weights| Excellent | Excellent | None |
151
+ | Consistent Hashing | O(log n) | Hash ring | Poor | Poor | Excellent |
152
+ | Random Two Choices | O(1) | Counters | Poor | Good | None |
153
+
154
+ ---
155
+
156
+ ## 2. Layer 4 vs Layer 7 Load Balancing
157
+
158
+ ### 2.1 Layer 4 (Transport Layer)
159
+
160
+ Operates on TCP/UDP packets. Routing decisions based on source/destination IP and port.
161
+ Does not inspect payload content.
162
+
163
+ | Metric | Typical Value |
164
+ |-------------------|------------------------------------------------------------|
165
+ | Throughput | 10-40 Gbps per node; up to 1M+ new connections/second |
166
+ | Added latency | Sub-millisecond (< 100 microseconds in kernel-bypass modes) |
167
+ | TLS awareness | None -- passes encrypted traffic through |
168
+ | Routing granularity| IP + port only |
169
+ | Memory per connection | ~256 bytes (connection tracking entry) |
170
+
171
+ **Implementation**: L4 balancers often support Direct Server Return (DSR), where the
172
+ response goes directly from the backend to the client, bypassing the load balancer
173
+ entirely. This eliminates the load balancer as a bandwidth bottleneck on the return path.
174
+
175
+ ### 2.2 Layer 7 (Application Layer)
176
+
177
+ Operates on HTTP/HTTPS, gRPC, and other application protocols. Full request parsing,
178
+ header inspection, TLS termination.
179
+
180
+ | Metric | Typical Value |
181
+ |-------------------|------------------------------------------------------------|
182
+ | Throughput | 50K-200K requests/second per node (varies with request size)|
183
+ | Added latency | 1-20ms (dominated by TLS handshake on new connections) |
184
+ | TLS awareness | Full -- terminates and re-encrypts |
185
+ | Routing granularity| URL path, headers, cookies, body content |
186
+ | Memory per connection | 8-32 KB (buffering, TLS state, HTTP parsing) |
187
+
188
+ ### 2.3 Performance Tradeoffs
189
+
190
+ ```
191
+ L4 L7
192
+ Throughput: 10-40 Gbps 1-10 Gbps (TLS-dependent)
193
+ New connections/s: ~1,000,000 ~100,000-200,000
194
+ Latency added: < 0.1 ms 1-20 ms
195
+ CPU per request: ~1 microsecond ~50-200 microseconds
196
+ Can route by URL: No Yes
197
+ Can retry failed: No (connection-level only) Yes (request-level)
198
+ Session affinity: IP hash only Cookie-based (robust across NAT)
199
+ Health checks: TCP SYN/ACK only HTTP status codes, body content
200
+ WebSocket support: Pass-through Protocol-aware upgrade handling
201
+ ```
202
+
203
+ ### 2.4 Production Deployment Pattern: L4 + L7 Tiering
204
+
205
+ Most large-scale deployments use both layers strategically:
206
+
207
+ ```
208
+ Internet
209
+ |
210
+ [L4 Global LB] -- anycast, BGP, ECMP
211
+ | Handles: DDoS absorption, geographic routing
212
+ | Throughput: millions of connections/s
213
+ v
214
+ [L7 Regional LB] -- NGINX/HAProxy/Envoy per region
215
+ | Handles: TLS termination, path routing, retries
216
+ | Throughput: 100K-500K req/s per node
217
+ v
218
+ [L4 Intra-DC LB] -- IPVS/Maglev/kube-proxy
219
+ | Handles: pod-level distribution
220
+ | Throughput: millions of packets/s
221
+ v
222
+ [Backend Pods]
223
+ ```
224
+
225
+ Google's Maglev paper (2016) demonstrated this pattern at scale: their L4 balancer
226
+ handles 10M+ packets/second per machine using kernel-bypass networking, feeding into
227
+ L7 proxies for application-level routing.
228
+
229
+ ---
230
+
231
+ ## 3. Software Load Balancers: Benchmarks
232
+
233
+ ### 3.1 HAProxy
234
+
235
+ | Metric | Value | Source/Conditions |
236
+ |------------------------|------------------------------------|-----------------------------|
237
+ | HTTP RPS (peak) | 2.08 million req/s | HAProxy 2.4, Graviton2 64-core ARM |
238
+ | HTTP bandwidth | 92 Gbps payload | 30 KB request size |
239
+ | Added latency | ~400 microseconds average | vs 160 us direct connection |
240
+ | Memory usage | ~50 MB baseline | 4-core VPS benchmark |
241
+ | Concurrent connections | 2+ million | With connection pooling |
242
+ | HTTP/2 support | Full multiplexing | Since HAProxy 2.0 |
243
+ | HTTP/3 (QUIC) | Supported since HAProxy 2.6 | Needs a QUIC-capable TLS library |
244
+
245
+ **Key tuning parameters**:
246
+ - `nbthread`: Match to CPU core count; HAProxy scales linearly up to ~64 threads
247
+ - `maxconn`: Default 2000; production typically 50,000-100,000
248
+ - `tune.bufsize`: Default 16384; increase for large headers (WebSocket upgrade)
249
+ - Connection reuse: `http-reuse always` reduces backend connection overhead by 60-80%
250
+
251
+ ### 3.2 NGINX
252
+
253
+ | Metric | Value | Source/Conditions |
254
+ |------------------------|------------------------------------|-----------------------------|
255
+ | HTTP RPS | ~5,200 req/s per worker (tuned) | Codedamn benchmark, 8 workers |
256
+ | Static file serving | 100K+ req/s | Small files, keepalive |
257
+ | Memory usage | ~80 MB baseline | 4-core VPS benchmark |
258
+ | HTTP/3 (QUIC) | Native support since 1.25.0 | Most mature implementation |
259
+ | Worker model | Event-driven, single-threaded workers | One worker per CPU core |
260
+
261
+ **Key tuning parameters**:
262
+ - `worker_processes auto`: One worker per CPU core
263
+ - `worker_connections 10240`: Default 512 is far too low for production
264
+ - `keepalive_timeout 65`: Reduces TLS handshake overhead via connection reuse
265
+ - `keepalive 64` (inside the `upstream` block): Connection pool to backends; critical for latency
266
+ - `proxy_buffering on`: Frees backend connections faster; reduces p99 by 20-40%
267
+
268
+ ### 3.3 Envoy
269
+
270
+ | Metric | Value | Source/Conditions |
271
+ |------------------------|------------------------------------|-----------------------------|
272
+ | Throughput | Highest raw RPS in Loggly benchmark| Multiple test configurations|
273
+ | Latency under churn | Sub-10ms during pod scaling | Kubernetes dynamic backends |
274
+ | CPU usage | 73% more than HAProxy | For equivalent throughput |
275
+ | Memory usage | ~150 MB baseline | 4-core VPS benchmark |
276
+ | xDS configuration | Dynamic, zero-downtime config reload| Via gRPC control plane |
277
+
278
+ **Envoy's advantage**: In dynamic environments like Kubernetes, Envoy maintains
279
+ consistent sub-10ms latency during pod scaling events. HAProxy can spike to 25-second
280
+ response times during configuration reloads in similar scenarios. This makes Envoy the
281
+ default sidecar proxy for service meshes such as Istio (Linkerd ships its own linkerd2-proxy instead).
282
+
283
+ **Envoy's cost**: The C++ thread-per-core architecture with extensive observability
284
+ (histogram stats, distributed tracing) comes at a CPU and memory premium. Disabling
285
+ access logs can boost Envoy throughput by up to 2x.
286
+
287
+ ### 3.4 Traefik
288
+
289
+ | Metric | Value | Source/Conditions |
290
+ |------------------------|------------------------------------|-----------------------------|
291
+ | HTTP RPS | ~30,000 req/s | Default configuration |
292
+ | Auto-discovery | Native Docker/K8s integration | Zero configuration routing |
293
+ | Memory usage | ~100 MB baseline | Comparable workload |
294
+ | Best for | Development, small-to-medium prod | Ease of use over raw perf |
295
+
296
+ ### 3.5 Comparative Summary
297
+
298
+ ```
299
+ Throughput ranking (HTTP/1.1, same hardware):
300
+ HAProxy > Envoy > NGINX > Traefik
301
+
302
+ Latency consistency under dynamic backends:
303
+ Envoy > HAProxy > NGINX > Traefik
304
+
305
+ Memory efficiency:
306
+ HAProxy (~50 MB) > NGINX (~80 MB) > Traefik (~100 MB) > Envoy (~150 MB)
307
+
308
+ HTTP/3 support maturity:
309
+ NGINX > Traefik > Envoy > HAProxy (since 2.6, via a QUIC-capable TLS library)
310
+
311
+ Dynamic reconfiguration:
312
+ Envoy (xDS, zero downtime) > Traefik (auto-discovery) > HAProxy (reload) > NGINX (reload)
313
+ ```
314
+
315
+ **Production guidance**: In real-world production environments, the performance difference
316
+ between HAProxy and NGINX is usually minimal when both are properly configured. The
317
+ decision is driven more by architectural needs -- HAProxy for pure TCP/HTTP proxying,
318
+ NGINX when you also need static file serving or HTTP/3, Envoy when you need service
319
+ mesh integration.
320
+
321
+ ---
322
+
323
+ ## 4. Cloud Load Balancers
324
+
325
+ ### 4.1 AWS Network Load Balancer (NLB)
326
+
327
+ | Metric | Value |
328
+ |-----------------------|-------------------------------------------------------------|
329
+ | Layer | 4 (TCP/UDP/TLS) |
330
+ | Throughput | Millions of requests/second, scales automatically |
331
+ | Latency | Single-digit milliseconds |
332
+ | Static IP | Yes -- one per AZ |
333
+ | TLS termination | Supported (offloads from backends) |
334
+ | Idle timeout | 350 seconds (TCP, configurable), 120 seconds (UDP, fixed) |
335
+ | Pricing | ~$0.006 per NLCU-hour + $0.0225/hour |
336
+
337
+ ### 4.2 AWS Application Load Balancer (ALB)
338
+
339
+ | Metric | Value |
340
+ |-----------------------|-------------------------------------------------------------|
341
+ | Layer | 7 (HTTP/HTTPS/gRPC/WebSocket) |
342
+ | Throughput | Scales automatically, but lower than NLB |
343
+ | Latency | Roughly 4x higher than NLB in benchmarks |
344
+ | Content-based routing | URL path, host header, HTTP method, query string |
345
+ | gRPC support | Native |
346
+ | Pricing | ~$0.008 per LCU-hour + $0.0225/hour |
347
+
348
+ **Why ALB is slower**: ALB terminates the incoming TCP session, reads the HTTP payload
349
+ into memory, evaluates routing rules against headers/path, then opens a new connection
350
+ to the target. NLB rewrites IP/TCP headers and forwards packets without inspecting
351
+ payload. The extra memory copy and rule evaluation add measurable latency.
352
+
353
+ **NLB vs ALB selection**: Use NLB when latency is critical (gaming, financial trading,
354
+ real-time streaming). Benchmark results show NLB responding roughly 4x faster, with a
355
+ tighter standard deviation, than ALB under identical conditions.
356
+
357
+ ### 4.3 Google Cloud Load Balancer
358
+
359
+ | Metric | Value |
360
+ |-----------------------|-------------------------------------------------------------|
361
+ | Architecture | True anycast global load balancer |
362
+ | Single IP | One IP serves all regions via anycast |
363
+ | Auto-scaling | Fully managed, no pre-warming needed |
364
+ | Backend types | Instance groups, NEGs, serverless |
365
+ | SSL policies | Configurable min TLS version and cipher suites |
366
+
367
+ **GCP's architectural advantage**: Google's global HTTP(S) load balancer is anycast-based:
368
+ a single IP address is advertised worldwide. Traffic enters Google's network at the
369
+ nearest edge POP (100+ locations) and traverses Google's private backbone to the nearest
370
+ healthy backend. This eliminates multiple public internet hops and typically reduces
371
+ latency by 20-40% compared to DNS-based global load balancing.
372
+
373
+ ### 4.4 Cloudflare Load Balancer
374
+
375
+ | Metric | Value |
376
+ |-----------------------|-------------------------------------------------------------|
377
+ | Network | 300+ data centers worldwide |
378
+ | Proximity | Within ~50 ms of ~95% of internet-connected population |
379
+ | Routing methods | Latency-based, geo, GPS coordinates, random, hash |
380
+ | Argo Smart Routing | Improves performance by up to 30% via optimized paths |
381
+ | Health checks | Every 60 seconds from multiple regions |
382
+ | Pricing | From $5/month per origin (basic); enterprise for advanced |
383
+
384
+ **Latency consideration**: Edge-proxy load balancers like this one are not colocated with origin
385
+ servers. The request path is: client -> nearest Cloudflare POP (low latency) -> public
386
+ internet -> origin server (variable latency). For latency-critical applications, this
387
+ extra hop can add 5-50ms depending on origin location. Cloudflare mitigates this with
388
+ Argo Smart Routing, which finds the fastest path across their backbone network.
389
+
390
+ ### 4.5 Cloud Load Balancer Comparison
391
+
392
+ ```
393
+ Latency (lowest to highest):
394
+ AWS NLB (L4, <5ms) < GCP Global LB (anycast) < AWS ALB (L7) < Cloudflare (edge proxy)
395
+
396
+ Global distribution:
397
+ Cloudflare (300+ PoPs) > GCP (100+ edge) > AWS (30+ regions, needs Global Accelerator)
398
+
399
+ Ease of setup:
400
+ Cloudflare (DNS change) > AWS ALB (console/IaC) > GCP (console/IaC) > AWS NLB
401
+
402
+ Cost for high traffic:
403
+ AWS NLB < AWS ALB < GCP < Cloudflare (depends on plan tier)
404
+ ```
405
+
406
+ ---
407
+
408
+ ## 5. Health Check Strategies and Failover Timing
409
+
410
+ ### 5.1 Health Check Types
411
+
412
+ | Check Type | What It Tests | Overhead | Accuracy |
413
+ |-------------------|------------------------------|-------------|-------------|
414
+ | TCP SYN | Port open | Minimal | Low -- port can be open but app broken |
415
+ | HTTP GET /health | App responds with 200 | Low | Medium |
416
+ | HTTP GET /ready | App ready to serve traffic | Low-Medium | High |
417
+ | Deep health | DB connection, cache, deps | Medium-High | Highest |
418
+ | gRPC health | gRPC health check protocol | Low | High |
419
+
420
+ **Best practice**: Use protocol-matched health checks. Running a TCP health check
421
+ against an HTTP service gives false positives -- the TCP handshake succeeds even when
422
+ the HTTP service returns 503 errors. Always check at the application protocol level.
423
+
424
+ ### 5.2 Timing Configuration
425
+
426
+ | Parameter | Recommended Value | Why |
427
+ |----------------------------|-------------------|----------------------------------|
428
+ | Check interval | 5-10 seconds | Balances responsiveness vs load |
429
+ | Timeout per check | 2-5 seconds | Must be < interval |
430
+ | Unhealthy threshold | 2-3 consecutive failures | Avoids false positives from network jitter |
431
+ | Healthy threshold | 2-3 consecutive successes | Prevents flapping |
432
+ | Non-critical service interval | 30-60 seconds | For non-latency-critical paths |
433
+
434
+ **Failover timing calculation**:
435
+ ```
436
+ Time to detect failure = check_interval x unhealthy_threshold
437
+ Time to restore = check_interval x healthy_threshold
438
+
439
+ Example (default AWS ALB):
440
+ Detect: 30s interval x 2 failures = 60 seconds
441
+ Restore: 30s interval x 5 successes = 150 seconds
442
+
443
+ Example (aggressive, latency-sensitive):
444
+ Detect: 5s interval x 2 failures = 10 seconds
445
+ Restore: 5s interval x 2 successes = 10 seconds
446
+ ```
447
+
448
+ **Warning**: Setting intervals below 2 seconds causes health check storms at scale.
449
+ With 100 backends and 1-second intervals, a single load balancer sends 100 health check
450
+ requests per second, one to each backend. Every additional load balancer node adds one
451
+ more check per second to every backend, so probe traffic grows with nodes x backends.
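The timing formulas and the probe-volume arithmetic can be checked with a short script. This is a sketch under one assumption: every load balancer node probes every backend independently, once per interval. The function and parameter names (`lb_nodes` in particular) are illustrative.

```python
def failover_timing(interval_s, unhealthy_threshold, healthy_threshold):
    """Return (seconds to detect a failure, seconds to restore a backend)."""
    return interval_s * unhealthy_threshold, interval_s * healthy_threshold


def probe_rates(backends, lb_nodes, interval_s):
    """Return (checks/s sent by the whole LB fleet, checks/s seen by one backend)."""
    per_backend = lb_nodes / interval_s
    return per_backend * backends, per_backend


print(failover_timing(30, 2, 5))   # (60, 150)   default ALB-style profile
print(failover_timing(5, 2, 2))    # (10, 10)    aggressive profile
print(probe_rates(100, 1, 1))      # (100.0, 1.0)
print(probe_rates(100, 20, 1))     # (2000.0, 20.0)
```

The last line shows where storms come from: 20 load balancer nodes at a 1-second interval generate 2,000 probes per second across the fleet, and each backend answers 20 of them.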
452
+
453
+ ### 5.3 Health Check Endpoint Design
454
+
455
+ A well-designed health endpoint returns quickly and tests meaningful dependencies:
456
+
457
+ ```json
458
+ // GET /health -- returns in < 50ms
459
+ {
+   "status": "healthy",
+   "checks": {
+     "database": { "status": "up", "latency_ms": 2 },
+     "cache": { "status": "up", "latency_ms": 1 },
+     "disk": { "status": "up", "free_gb": 42 }
+   },
+   "uptime_seconds": 86400
+ }
468
+ ```
469
+
470
+ **Critical rule**: Health check endpoints must NOT perform expensive operations. A health
471
+ check that runs a database query under load can itself cause the failure it is trying to
472
+ detect. Use connection pool validation or ping commands, not full queries.
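One way to keep the endpoint cheap is to cache the probe results for a short time, so that frequent health checks do not each hit the dependencies. The sketch below is illustrative only: the class name `CachedHealth`, the `probes` mapping, and the 5-second TTL are assumptions, and each probe is expected to be a lightweight ping rather than a full query.

```python
import time


class CachedHealth:
    """Serve a cached health status; re-run probes only after the TTL expires."""

    def __init__(self, probes, ttl_s=5.0, clock=time.monotonic):
        self._probes = probes        # name -> zero-argument callable returning bool
        self._ttl_s = ttl_s
        self._clock = clock          # injectable so the cache can be tested
        self._cached = None
        self._expires_at = 0.0

    def status(self):
        now = self._clock()
        if self._cached is None or now >= self._expires_at:
            checks = {}
            for name, probe in self._probes.items():
                try:
                    ok = bool(probe())
                except Exception:
                    ok = False       # a probe that raises counts as down
                checks[name] = {"status": "up" if ok else "down"}
            healthy = all(c["status"] == "up" for c in checks.values())
            self._cached = {
                "status": "healthy" if healthy else "unhealthy",
                "checks": checks,
            }
            self._expires_at = now + self._ttl_s
        return self._cached


# Hypothetical usage: the lambdas stand in for a pool validation or PING call.
health = CachedHealth({"database": lambda: True, "cache": lambda: True})
print(health.status()["status"])
```

The HTTP handler would return 200 when `status` is `healthy` and 503 otherwise. The TTL bounds how stale the answer can be, so it should stay below the check interval.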
473
+
474
+ ### 5.4 Graceful Health Check Failure
475
+
476
+ Before shutting down, a server should:
477
+ 1. Start returning 503 from the health endpoint
478
+ 2. Wait for the load balancer to detect the failure (interval x unhealthy_threshold)
479
+ 3. Wait for in-flight requests to complete (connection draining timeout)
480
+ 4. Then shut down
481
+
482
+ Skipping step 2 causes dropped requests -- the load balancer still has the server in
483
+ its pool and continues routing traffic to a shutting-down process.
484
+
485
+ ---
486
+
487
+ ## 6. Session Affinity and Sticky Sessions
488
+
489
+ ### 6.1 Mechanisms
490
+
491
+ | Method | Implementation | Survives NAT | Server Removal Impact |
492
+ |---------------------|------------------------|-------------|----------------------|
493
+ | Source IP hash | L4 or L7 | No | All sessions from that IP disrupted |
494
+ | Cookie insertion | L7 only | Yes | Only that server's sessions |
495
+ | URL parameter | L7 only | Yes | Only that server's sessions |
496
+ | Header-based | L7 only | Yes | Only that server's sessions |
497
+ | TLS session ID | L7 only | Yes | Only new sessions |
498
+
499
+ ### 6.2 Performance Cost
500
+
501
+ Sticky sessions impose several measurable costs:
502
+
503
+ **Uneven load distribution**: In a cluster of 10 servers, sticky sessions can cause
504
+ load variance of 3-5x between the most- and least-loaded servers. A server accumulating
505
+ many long-lived sessions becomes a hotspot while others sit idle.
506
+
507
+ **Scaling impediment**: When auto-scaling adds a new server, existing sticky sessions
508
+ remain on old servers. The new server receives only new sessions, which may be a small
509
+ fraction of total traffic. Result: the overloaded servers that triggered the scale-out
510
+ remain overloaded; the new server is underutilized.
511
+
512
+ **Failover disruption**: When a server with sticky sessions fails or is removed, all
513
+ sessions on that server lose their state. With 1,000 sessions per server, a single
514
+ server removal disrupts 1,000 users simultaneously.
515
+
516
+ **Cache efficiency tradeoff**: Sticky sessions improve RAM cache hit rates because the
517
+ same user's requests always hit the same server's cache. This can improve response times
518
+ by 10-30% for cache-heavy applications. However, this benefit must be weighed against
519
+ the scaling and availability costs.
520
+
521
+ ### 6.3 Modern Alternatives
522
+
523
+ | Approach | Latency Impact | Scaling Impact |
524
+ |-----------------------|------------------------|------------------------|
525
+ | Sticky sessions | +0ms (same server) | Poor -- prevents even distribution |
526
+ | Redis session store | +1-3ms per request | Excellent -- any server can serve any request |
527
+ | JWT stateless tokens | +0.1-0.5ms (decode) | Excellent -- no server-side state |
528
+ | Client-side state | +0ms | Excellent -- no server-side state |
529
+
530
+ **Recommendation**: Use externalized session stores (Redis, Memcached) or stateless
531
+ tokens (JWT) instead of sticky sessions. The 1-3ms latency penalty for Redis is far less
532
+ costly than the scaling and availability problems caused by sticky sessions. Redis
533
+ Cluster can serve 100,000+ session lookups per second per node with sub-millisecond
534
+ p99 latency.
535
+
536
+ ---
537
+
538
+ ## 7. Connection Draining and Graceful Shutdown
539
+
540
+ ### 7.1 How Connection Draining Works
541
+
542
+ When a backend is marked for removal:
543
+ 1. Load balancer stops sending **new** requests to the backend
544
+ 2. **Existing** in-flight requests are allowed to complete
545
+ 3. After the drain timeout expires, remaining connections are forcefully closed
546
+ 4. Backend is removed from the pool
547
+
548
+ ### 7.2 Timeout Configuration
549
+
550
+ | Service Type | Recommended Drain Timeout | Rationale |
551
+ |---------------------------|---------------------------|------------------------------|
552
+ | Fast APIs (< 1s response) | 5-15 seconds | 5s covers 99.9th percentile |
553
+ | Standard web apps | 30-60 seconds | Covers slow pages, uploads |
554
+ | WebSocket/streaming | 300-3600 seconds | Long-lived connections |
555
+ | Database connections | 60-120 seconds | Covers in-flight transactions|
556
+
557
+ **Platform defaults**:
558
+ - AWS Classic LB: 300 seconds (configurable 1-3600s)
559
+ - AWS ALB/NLB: 300 seconds deregistration delay
560
+ - Google Cloud: 0-3600 seconds (must be explicitly enabled)
561
+ - Kubernetes: `terminationGracePeriodSeconds` default 30 seconds
562
+
563
+ ### 7.3 Implementation Pattern
564
+
565
+ ```
566
+ Application receives SIGTERM
567
+ |
568
+ v
569
+ Stop accepting new connections
570
+ Set health check endpoint to return 503
571
+ |
572
+ v
573
+ Wait for LB to detect unhealthy (interval x threshold = 10-60s)
574
+ |
575
+ v
576
+ Wait for in-flight requests to complete (up to drain timeout)
577
+ |
578
+ v
579
+ Close database connections, flush buffers
580
+ |
581
+ v
582
+ Exit process
583
+ ```
584
+
585
+ **Critical coordination**: The application's graceful shutdown timeout must be LONGER
586
+ than the load balancer's health check detection time plus the drain timeout. If the
587
+ application exits before the load balancer removes it, clients see connection resets.
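The shutdown sequence can be sketched as a small coordinator object. This is a hedged illustration, not a production implementation: the class name `GracefulShutdown`, its parameters, and the 0.1-second polling step are assumptions, and `in_flight_count` stands for whatever counter the application keeps of active requests.

```python
import threading
import time


class GracefulShutdown:
    """Flip health to 503, wait for LB detection, then drain in-flight requests."""

    def __init__(self, detection_wait_s, drain_timeout_s, in_flight_count,
                 sleep=time.sleep, clock=time.monotonic):
        self._detection_wait_s = detection_wait_s  # interval x unhealthy_threshold
        self._drain_timeout_s = drain_timeout_s
        self._in_flight_count = in_flight_count    # callable returning an int
        self._sleep = sleep
        self._clock = clock
        self.draining = threading.Event()

    def health_status_code(self):
        # The health endpoint reports 503 as soon as draining starts.
        return 503 if self.draining.is_set() else 200

    def run(self):
        """Return True if every in-flight request finished before the deadline."""
        self.draining.set()
        self._sleep(self._detection_wait_s)        # let the LB mark us unhealthy
        deadline = self._clock() + self._drain_timeout_s
        while self._in_flight_count() > 0 and self._clock() < deadline:
            self._sleep(0.1)
        return self._in_flight_count() == 0


# Hypothetical usage with no active requests and short waits.
shutdown = GracefulShutdown(0.01, 0.05, in_flight_count=lambda: 0)
print(shutdown.health_status_code(), shutdown.run(), shutdown.health_status_code())
```

In a real service, `run()` would be triggered from a SIGTERM handler registered with `signal.signal`, and the process would close database connections and exit once it returns. The sum of `detection_wait_s` and `drain_timeout_s` is the minimum grace period the orchestrator has to allow.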
588
+
589
+ **AWS ECS specifics**: ECS sends SIGTERM to the container, then waits for the
590
+ `stopTimeout` (default 30s) before sending SIGKILL. The deregistration delay must
591
+ complete within this window. Set `stopTimeout >= deregistration_delay + 10s` as a safety
592
+ margin.
593
+
594
+ ### 7.4 Zero-Downtime Deployment Checklist
595
+
596
+ 1. New instances register and pass health checks (healthy threshold met)
597
+ 2. Old instances begin draining (health check returns 503)
598
+ 3. Load balancer detects old instances as unhealthy (unhealthy threshold met)
599
+ 4. In-flight requests on old instances complete (drain timeout)
600
+ 5. Old instances shut down
601
+ 6. Total deployment time: (check_interval x healthy_threshold) + (check_interval x unhealthy_threshold) + drain_timeout
602
+
603
+ ---
604
+
605
+ ## 8. Common Bottlenecks
606
+
607
+ ### 8.1 Uneven Distribution
608
+
609
+ **Symptom**: Some backends at 90% CPU while others sit at 20%.
610
+
611
+ **Causes**:
612
+ - Round robin with heterogeneous servers
613
+ - Sticky sessions accumulating on specific servers
614
+ - Consistent hashing with too few virtual nodes (< 100 per server)
615
+ - Hot keys in hash-based routing (one user generating 50% of traffic)
616
+
617
+ **Measurement**: Standard deviation of request counts across backends. Healthy: < 15%
618
+ of mean. Problematic: > 30% of mean.
619
+
620
+ **Fix**: Switch to least connections or weighted least connections. For hash-based
621
+ routing, increase virtual nodes to 150+ per server.
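The measurement described above is the coefficient of variation of per-backend request counts. A minimal sketch follows, assuming the counts have already been collected over a common time window. The function names and the middle `watch` label for values between the two thresholds are assumptions added for illustration.

```python
from statistics import mean, pstdev


def distribution_skew(request_counts):
    """Standard deviation of per-backend request counts as a fraction of the mean."""
    avg = mean(request_counts)
    if avg == 0:
        return 0.0
    return pstdev(request_counts) / avg


def classify_skew(skew):
    """Apply the thresholds from the text: < 15% healthy, > 30% problematic."""
    if skew < 0.15:
        return "healthy"
    if skew > 0.30:
        return "problematic"
    return "watch"


# Hypothetical sample: one hot backend and three lightly loaded ones.
counts = [900, 200, 200, 200]
skew = distribution_skew(counts)
print(f"skew = {skew:.0%} -> {classify_skew(skew)}")
```

Running this periodically against load balancer metrics gives a single number to alert on, rather than eyeballing per-server CPU graphs.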
622
+
623
+ ### 8.2 Thundering Herd
624
+
625
+ **Symptom**: Server comes back online and is immediately overwhelmed, fails health
626
+ check, is removed, recovers, repeat.
627
+
628
+ **Mechanism**: When a server recovers, the load balancer routes a burst of queued or
629
+ new requests to it simultaneously. The server, not yet warmed up (cold caches, JIT not
630
+ compiled, connection pools empty), cannot handle the burst and fails again.
631
+
632
+ **Real-world example**: In a documented AWS ALB case, the controller deregistered all
633
+ targets from pods failing readiness probes, then reintroduced them. Each pod was
634
+ immediately crushed by thousands of queued requests, failed the readiness probe again,
635
+ and was deregistered -- creating an infinite crash loop.
636
+
637
+ **Mitigation**:
638
+ - **Slow start**: HAProxy's `slowstart 60s` ramps traffic to a recovering server over
639
+ 60 seconds instead of sending full load immediately
640
+ - **Jitter**: Add randomized delays (50-500ms) to health check intervals to prevent
641
+ synchronized checks from multiple LB instances
642
+ - **Request queuing**: Limit the rate of new connections to a recovering server
643
+ - **Warm-up**: Pre-populate caches and connection pools before marking healthy
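The idea behind slow start can be illustrated with a weight that grows as the server stays healthy. The sketch below assumes a simple linear ramp. It is not a description of HAProxy's internal algorithm, and the function name `slow_start_weight` is hypothetical.

```python
def slow_start_weight(full_weight, seconds_since_recovery, slowstart_s=60.0):
    """Effective weight of a recovering server under an assumed linear ramp."""
    if slowstart_s <= 0 or seconds_since_recovery >= slowstart_s:
        return float(full_weight)
    fraction = max(seconds_since_recovery, 0.0) / slowstart_s
    return full_weight * fraction


# Effective weight at a few points during a 60-second ramp.
for elapsed in (0, 15, 30, 60):
    print(elapsed, slow_start_weight(100, elapsed))
```

A load balancer applying this weight sends the recovering server a growing share of traffic, which gives caches, JIT compilation, and connection pools time to warm up before full load arrives.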
644
+
645
+ ### 8.3 Health Check Storms
646
+
647
+ **Symptom**: Backend servers spending significant CPU on health check responses.
648
+
649
+ **Scale calculation**:
650
+ ```
651
+ Health checks per second = (num_LB_instances x num_backends) / check_interval
652
+
653
+ Example: 5 LB instances, 200 backends, 2-second interval
654
+ = (5 x 200) / 2 = 500 health checks per second
655
+
656
+ With deep health checks querying database:
657
+ = 500 database queries/second JUST for health checks
658
+ ```
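The same calculation as a short Python sketch. Both function names are hypothetical. The second helper is an addition that shows the load seen by a single backend, which is what matters when sizing the health endpoint.

```python
def health_checks_per_second(num_lb_instances, num_backends, check_interval_s):
    """Total health checks per second across the whole fleet."""
    return (num_lb_instances * num_backends) / check_interval_s


def per_backend_checks_per_second(num_lb_instances, check_interval_s):
    """Health checks per second arriving at one backend."""
    return num_lb_instances / check_interval_s


# The example from the text: 5 LB instances, 200 backends, 2-second interval.
total = health_checks_per_second(5, 200, 2)
per_backend = per_backend_checks_per_second(5, 2)
print(f"fleet-wide: {total} checks/s, per backend: {per_backend} checks/s")
```

If each check triggers a database query, the fleet-wide figure is also the query rate added to the shared database.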
659
+
660
+ **Mitigation**:
661
+ - Use lightweight health endpoints (return cached status, not live queries)
662
+ - Stagger check intervals across LB instances (jitter)
663
+ - Use passive health checking (track real request failures) supplemented by
664
+ infrequent active checks (every 30-60 seconds)
665
+
666
+ ### 8.4 TLS Termination Overhead
667
+
668
+ **CPU cost**: TLS handshakes are CPU-intensive. A single HAProxy core can be fully
669
+ saturated by TLS handshake processing alone. However, established connections with
670
+ session resumption use minimal CPU -- less than 1% of CPU load and less than 2% of
671
+ network overhead according to Google's production data (istlsfastyet.com).
672
+
673
+ **Optimization strategies**:
674
+ - Enable TLS session resumption (reduces handshake from 2-RTT to 1-RTT)
675
+ - Use TLS 1.3 (1-RTT full handshake; optional 0-RTT early data for resumed sessions)
676
+ - Prefer ECDSA certificates over RSA (3-10x faster signature operations)
677
+ - Use CHACHA20-POLY1305 on non-AES-NI hardware (ARM, older x86)
678
+ - Enable OCSP stapling (eliminates client-side OCSP lookup, saves 100-300ms)
679
+ - Keep-alive connections amortize handshake cost across many requests
680
+
681
+ **Quantified impact**: A new TLS 1.2 RSA-2048 handshake costs ~1ms of CPU time on
682
+ modern hardware. TLS 1.3 with ECDSA-P256 costs ~0.1ms. With keepalive connections
683
+ averaging 100 requests each, the per-request overhead drops to 1-10 microseconds.
684
+
685
+ ### 8.5 Connection Exhaustion
686
+
687
+ **Symptom**: Load balancer stops accepting new connections; existing connections work.
688
+
689
+ **Cause**: Ephemeral port exhaustion on the load balancer. Each connection to a backend
690
+ uses a source port. Default ephemeral port range (32768-60999) provides ~28,000 ports.
691
+ With high connection rates and slow backend responses, ports are consumed faster than
692
+ they are recycled.
693
+
694
+ **Fix**:
695
+ - Enable connection reuse (`http-reuse always` in HAProxy)
696
+ - Expand ephemeral port range: `sysctl net.ipv4.ip_local_port_range="1024 65535"`
697
+ - Reuse sockets in TIME_WAIT for new outbound connections: `sysctl net.ipv4.tcp_tw_reuse=1`
698
+ - Use multiple backend IPs to multiply available port space
699
+
700
+ ---
701
+
702
+ ## 9. Anti-Patterns
703
+
704
+ ### 9.1 Sticky Sessions Preventing Horizontal Scaling
705
+
706
+ **Pattern**: Application stores session state in local memory. Sticky sessions added
707
+ to "solve" the problem. Auto-scaling adds servers but existing sessions stay pinned
708
+ to overloaded servers.
709
+
710
+ **Impact**: 60-70% of auto-scaling events fail to relieve load because new capacity
711
+ receives only new sessions, which may be < 10% of total traffic.
712
+
713
+ **Fix**: Externalize session state to Redis/Memcached. Migration cost: typically 2-5
714
+ days of engineering effort. Payoff: true horizontal scalability, simplified
715
+ deployments, elimination of session loss during server failures.
716
+
717
+ ### 9.2 No Health Checks (Blind Load Balancing)
718
+
719
+ **Pattern**: Load balancer sends traffic to all configured backends without checking
720
+ if they are alive.
721
+
722
+ **Impact**: Failed backends receive their full share of traffic. With 5 backends and
723
+ 1 failed, 20% of all requests fail. Mean time to detection depends entirely on
724
+ monitoring alerting, which may be minutes to hours.
725
+
726
+ **Fix**: Enable active health checks at the application protocol level. TCP checks
727
+ are better than nothing but miss application-level failures. HTTP checks to a
728
+ dedicated /health endpoint catch the most failure modes.
729
+
730
+ ### 9.3 Single Load Balancer (SPOF)
731
+
732
+ **Pattern**: One load balancer instance in front of a redundant backend fleet.
733
+
734
+ **Impact**: The load balancer becomes the single point of failure. When it fails,
735
+ 100% of traffic is lost despite having healthy backends.
736
+
737
+ **Fix**:
738
+ - **Active-passive**: Two LB instances with VRRP failover (HAProxy + keepalived).
739
+ Failover time: 1-5 seconds.
740
+ - **Active-active**: Multiple LB instances behind DNS round robin or anycast.
741
+ No failover delay; capacity is always shared.
742
+ - **Cloud managed**: AWS ALB/NLB, GCP LB -- inherently redundant across AZs.
743
+ The cloud provider manages redundancy transparently.
744
+
745
+ ### 9.4 Ignoring Backend Capacity in Algorithm Selection
746
+
747
+ **Pattern**: Using round robin with servers of vastly different capacity (e.g., mixing
748
+ 4-core and 32-core machines).
749
+
750
+ **Impact**: The 4-core machine receives the same traffic as the 32-core machine,
751
+ becomes overloaded, and causes 12-25% of requests to have elevated latency (the
752
+ fraction served by the underpowered server).
753
+
754
+ **Fix**: Use weighted round robin or weighted least connections. Assign weights
755
+ proportional to capacity (e.g., weight 1 for 4-core, weight 8 for 32-core).
756
+
757
+ ### 9.5 Overly Aggressive Health Checks
758
+
759
+ **Pattern**: Health check interval set to 1 second with 1 failure threshold. Deep
760
+ health check queries database on every check.
761
+
762
+ **Impact**: Network jitter causes a single dropped health check packet, immediately
763
+ removing a healthy server. Server handles hundreds of health check queries per second,
764
+ consuming capacity that should serve production traffic. Flapping servers create
765
+ oscillating load patterns.
766
+
767
+ **Fix**: Set interval to 5-10 seconds, unhealthy threshold to 2-3, and use lightweight
768
+ health endpoints that test connectivity without executing heavy queries.
769
+
770
+ ### 9.6 No Connection Draining During Deployments
771
+
772
+ **Pattern**: Server process is killed immediately during deployment. Load balancer
773
+ continues routing to the dead server until health checks detect the failure.
774
+
775
+ **Impact**: All in-flight requests on the terminated server receive connection reset
776
+ errors. With a 30-second health check interval and 2-failure threshold, new requests
777
+ continue being routed to the dead server for up to 60 seconds.
778
+
779
+ **Fix**: Implement graceful shutdown (SIGTERM handler), return 503 from health endpoint,
780
+ wait for LB detection, drain in-flight requests, then exit.
781
+
782
+ ---
783
+
784
+ ## 10. Before/After: Real-World Improvements
785
+
786
+ ### 10.1 Round Robin to Least Connections (Heterogeneous Fleet)
787
+
788
+ ```
789
+ Scenario: 12 backend servers, mix of c5.xlarge (4) and c5.4xlarge (8)
790
+ Traffic: 50,000 req/s, variable processing time (2ms-200ms)
791
+
792
+ BEFORE (Round Robin):
793
+ p50 latency: 15 ms
794
+ p95 latency: 180 ms
795
+ p99 latency: 450 ms <-- c5.xlarge servers saturated
796
+ Error rate: 2.3% <-- timeouts on overloaded servers
797
+
798
+ AFTER (Weighted Least Connections, weights 1:4):
799
+ p50 latency: 12 ms (20% improvement)
800
+ p95 latency: 45 ms (75% improvement)
801
+ p99 latency: 85 ms (81% improvement)
802
+ Error rate: 0.1% (95% reduction)
803
+ ```
804
+
805
+ ### 10.2 Single LB to Active-Active with Health Checks
806
+
807
+ ```
808
+ Scenario: E-commerce platform, 2 million daily users
809
+ Single HAProxy instance, no health checks
810
+
811
+ BEFORE:
812
+ Monthly downtime: 45 minutes (LB failures + backend failures)
813
+ p99 during failures: 30+ seconds (requests to dead backends)
814
+ Availability: 99.90%
815
+
816
+ AFTER (2 HAProxy + VRRP + HTTP health checks):
817
+ Monthly downtime: < 2 minutes
818
+ p99 during failures: 200 ms (immediate failover)
819
+ Availability: 99.995%
820
+ Failover time: < 3 seconds (VRRP) + 10 seconds (health check detection)
821
+ ```
822
+
823
+ ### 10.3 Sticky Sessions to Externalized State
824
+
825
+ ```
826
+ Scenario: SaaS application, 500 concurrent users per server, 8 servers
827
+ Sticky sessions with in-memory session store
828
+
829
+ BEFORE:
830
+ Scale-out effectiveness: ~30% (new servers get only new sessions)
831
+ Deployment error rate: 5% of users lose session on deploy
832
+ Load variance: 3.5x between busiest and quietest server
833
+ p99 latency: 320 ms
834
+
835
+ AFTER (Redis Cluster session store, least connections LB):
836
+ Scale-out effectiveness: ~95% (immediate load redistribution)
837
+ Deployment error rate: 0% (sessions survive server removal)
838
+ Load variance: 1.2x (near-perfect distribution)
839
+ p99 latency: 180 ms (+2ms Redis overhead, -140ms from even distribution)
840
+ ```
841
+
842
+ ### 10.4 Pipedrive: Thundering Herd Fix
843
+
844
+ ```
845
+ Scenario: Microservices with service mesh, periodic cache invalidation
846
+
847
+ BEFORE:
848
+ Every hour, cache invalidation triggered thousands of simultaneous requests
849
+ to a single upstream service through the service mesh.
850
+ p99 latency spike: 40ms -> 1000ms (25x increase) every hour
851
+
852
+ AFTER (Request coalescing / singleflight pattern):
853
+ Only one request sent to upstream per cache key; concurrent requesters
854
+ share the Promise/Future.
855
+ p99 latency: stable at 40ms, no hourly spikes
856
+ Upstream load: reduced by 95% during invalidation windows
857
+ Fix: ~10 lines of code
858
+ ```
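The request coalescing pattern described above can be sketched in Python with `asyncio`. This is an illustrative sketch rather than Pipedrive's actual code: the class name `SingleFlight`, the cache key, and the stand-in upstream call are all assumptions.

```python
import asyncio


class SingleFlight:
    """Coalesce concurrent calls for the same key into one upstream request."""

    def __init__(self):
        self._in_flight = {}  # key -> asyncio.Task for the pending upstream call

    async def do(self, key, fetch):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the upstream request.
            task = asyncio.create_task(fetch())
            self._in_flight[key] = task
            task.add_done_callback(lambda _t: self._in_flight.pop(key, None))
        # shield() keeps one cancelled waiter from cancelling the shared task.
        return await asyncio.shield(task)


async def demo():
    upstream_calls = 0

    async def fetch_from_upstream():
        nonlocal upstream_calls
        upstream_calls += 1
        await asyncio.sleep(0.01)  # stand-in for a slow upstream request
        return "fresh-value"

    flights = SingleFlight()
    results = await asyncio.gather(
        *(flights.do("cache-key", fetch_from_upstream) for _ in range(100))
    )
    return upstream_calls, results


calls, results = asyncio.run(demo())
print(f"{len(results)} callers served by {calls} upstream request(s)")
```

All concurrent callers for a key await the same task, so the upstream sees one request per key during an invalidation window instead of one per caller. Once the task finishes, the key is removed and the next caller triggers a fresh fetch.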
859
+
860
+ ---
861
+
862
+ ## 11. Decision Tree: Which Algorithm Should I Use?
863
+
864
+ ```
865
+ START: What type of service are you load balancing?
866
+ |
867
+ +---> Caching layer (Redis, Memcached, CDN origin)?
868
+ | |
869
+ | YES --> Use CONSISTENT HASHING
870
+ | (preserves cache affinity, ~1/n redistribution on changes)
871
+ | Consider: ring size >= 150 virtual nodes per server
872
+ |
873
+ +---> Stateless API / web service?
874
+ | |
875
+ | +---> Are all backend servers identical hardware?
876
+ | | |
877
+ | | YES --> Are request processing times uniform?
878
+ | | | |
879
+ | | | YES --> Use ROUND ROBIN (simplest, lowest overhead)
880
+ | | | |
881
+ | | | NO --> Use LEAST CONNECTIONS
882
+ | | | (adapts to variable request durations)
883
+ | | |
884
+ | | NO --> Use WEIGHTED LEAST CONNECTIONS
885
+ | | (adapts to both capacity and request variation)
886
+ | |
887
+ | +---> Do you have multiple independent load balancers?
888
+ | |
889
+ | YES --> Use RANDOM TWO CHOICES (LEAST_CONN)
890
+ | (avoids herd behavior from stale state)
891
+ | |
892
+ | NO --> Use LEAST CONNECTIONS
893
+ |
894
+ +---> WebSocket / long-lived connection service?
895
+ | |
896
+ | YES --> Use LEAST CONNECTIONS
897
+ | (prevents new connections piling on servers with
898
+ | many idle long-lived connections)
899
+ |
900
+ +---> Canary / blue-green deployment?
901
+ | |
902
+ | YES --> Use WEIGHTED ROUND ROBIN
903
+ | (weight controls traffic split: 95/5, 90/10, etc.)
904
+ |
905
+ +---> Rate limiting / per-user routing?
906
+ |
907
+ YES --> Use CONSISTENT HASHING on user ID / API key
908
+ (same user always hits same backend for rate counting)
909
+ ```
910
+
911
+ ### Layer Selection Guide
912
+
913
+ ```
914
+ Do you need content-based routing (URL path, headers, cookies)?
915
+ |
916
+ YES --> Use LAYER 7 (HTTP-aware load balancing)
917
+ | |
918
+ | +---> Need service mesh integration? --> ENVOY
919
+ | +---> Need HTTP/3 support? --> NGINX
920
+ | +---> Need maximum HTTP throughput? --> HAPROXY
921
+ | +---> Need auto-discovery? --> TRAEFIK
922
+ |
923
+ NO --> Do you need maximum throughput with minimum latency?
924
+ |
925
+ YES --> Use LAYER 4 (TCP/UDP)
926
+ | |
927
+ | +---> Cloud: AWS NLB, GCP Network LB
928
+ | +---> Self-hosted: IPVS, HAProxy (mode tcp)
929
+ |
930
+ NO --> Use LAYER 7 (more features, acceptable overhead)
931
+ ```
932
+
933
+ ---
934
+
935
+ ## 12. Configuration Quick Reference
936
+
937
+ ### HAProxy: Weighted Least Connections with Health Checks
938
+
939
+ ```haproxy
940
+ frontend http_front
941
+ bind *:443 ssl crt /etc/ssl/certs/site.pem
942
+ mode http
943
+ option http-server-close
944
+ default_backend app_servers
945
+
946
+ backend app_servers
947
+ mode http
948
+ balance leastconn
949
+ option httpchk GET /health
950
+ http-check expect status 200
951
+
952
+ # Health checks every 5s (down after 3 failures, up after 2 successes); 60s slow start
953
+ default-server inter 5s fall 3 rise 2 slowstart 60s
954
+
955
+ server app1 10.0.1.1:8080 weight 4 check maxconn 2000
956
+ server app2 10.0.1.2:8080 weight 4 check maxconn 2000
957
+ server app3 10.0.1.3:8080 weight 1 check maxconn 500 # canary
958
+ ```
959
+
960
+ ### NGINX: Upstream with Keepalive and Passive Health Checks
961
+
962
+ ```nginx
963
+ upstream app_backend {
964
+ least_conn;
965
+
966
+ server 10.0.1.1:8080 weight=4 max_fails=3 fail_timeout=10s;
967
+ server 10.0.1.2:8080 weight=4 max_fails=3 fail_timeout=10s;
968
+ server 10.0.1.3:8080 weight=1 max_fails=3 fail_timeout=10s;
969
+
970
+ keepalive 64; # connection pool to backends
971
+ keepalive_time 1h; # max connection lifetime
972
+ keepalive_timeout 60s; # idle timeout for pooled connections
973
+ }
974
+
975
+ server {
976
+ listen 443 ssl http2;
977
+
978
+ ssl_session_cache shared:SSL:10m; # TLS session resumption
979
+ ssl_session_timeout 1d;
980
+ ssl_session_tickets off; # forward secrecy
981
+
982
+ location / {
983
+ proxy_pass http://app_backend;
984
+ proxy_http_version 1.1;
985
+ proxy_set_header Connection ""; # enable keepalive to upstream
986
+ proxy_connect_timeout 5s;
987
+ proxy_read_timeout 30s;
988
+ }
989
+
990
+ location /health {
991
+ access_log off;
992
+ proxy_pass http://app_backend;
993
+ }
994
+ }
995
+ ```
996
+
997
+ ### Envoy: Cluster with Power of Two Choices
998
+
999
+ ```yaml
1000
+ clusters:
+   - name: app_cluster
+     type: STRICT_DNS
+     lb_policy: LEAST_REQUEST  # implements power of two choices
+     load_assignment:
+       cluster_name: app_cluster
+       endpoints:
+         - lb_endpoints:
+             - endpoint:
+                 address:
+                   socket_address:
+                     address: app-service
+                     port_value: 8080
+     health_checks:
+       - timeout: 5s
+         interval: 10s
+         unhealthy_threshold: 3
+         healthy_threshold: 2
+         http_health_check:
+           path: /health
+     circuit_breakers:
+       thresholds:
+         - max_connections: 1024
+           max_pending_requests: 1024
+           max_requests: 1024
+     outlier_detection:
+       consecutive_5xx: 5
+       interval: 10s
+       base_ejection_time: 30s
+       max_ejection_percent: 50
1030
+ ```
1031
+
1032
+ ### Kubernetes: Service with Session Affinity (when you must)
1033
+
1034
+ ```yaml
1035
+ apiVersion: v1
+ kind: Service
+ metadata:
+   name: app-service
+ spec:
+   selector:
+     app: my-app
+   ports:
+     - port: 80
+       targetPort: 8080
+   sessionAffinity: ClientIP
+   sessionAffinityConfig:
+     clientIP:
+       timeoutSeconds: 1800  # 30-minute session timeout
1049
+ ```
1050
+
1051
+ ---
1052
+
1053
+ ## 13. Sources
1054
+
1055
+ - [HAProxy Exceeds 2 Million RPS on a Single ARM Instance](https://www.haproxy.com/blog/haproxy-forwards-over-2-million-http-requests-per-second-on-a-single-aws-arm-instance) -- HAProxy benchmark data: 2.08M RPS, 400us latency, 92 Gbps bandwidth
1056
+ - [Benchmarking 5 Popular Load Balancers: Nginx, HAProxy, Envoy, Traefik, and ALB](https://www.loggly.com/blog/benchmarking-5-popular-load-balancers-nginx-haproxy-envoy-traefik-and-alb/) -- Loggly comparative benchmark of throughput and latency
1057
+ - [HAProxy vs NGINX vs Envoy on VPS in 2025](https://onidel.com/blog/haproxy-nginx-envoy-benchmark-vps) -- Memory usage comparison: HAProxy 50MB, NGINX 80MB, Envoy 150MB
1058
+ - [HAProxy vs NGINX Performance: A Comprehensive Analysis](https://last9.io/blog/haproxy-vs-nginx-performance/) -- HAProxy 10-15% advantage in raw connection handling
1059
+ - [Test Driving Power of Two Random Choices Load Balancing](https://www.haproxy.com/blog/power-of-two-load-balancing) -- HAProxy P2C benchmarks under contention levels
1060
+ - [NGINX and the Power of Two Choices Load-Balancing Algorithm](https://www.f5.com/company/blog/nginx/nginx-power-of-two-choices-load-balancing-algorithm) -- NGINX random two least_conn implementation
1061
+ - [The Power of Two Choices in Randomized Load Balancing (Mitzenmacher)](https://www.eecs.harvard.edu/~michaelm/postscripts/mythesis.pdf) -- Original dissertation proving O(log log n) bound
1062
+ - [Load Balancing Algorithms - Cloudflare](https://www.cloudflare.com/learning/performance/types-of-load-balancing-algorithms/) -- Algorithm overview and tradeoffs
1063
+ - [Layer 4 vs Layer 7 Load Balancing](https://www.systemoverflow.com/learn/load-balancing/lb-algorithms/layer-4-vs-layer-7-load-balancing-algorithm-trade-offs) -- L4 vs L7 performance tradeoffs
1064
+ - [ALB vs NLB: Which AWS Load Balancer Fits Your Needs](https://blog.cloudcraft.co/alb-vs-nlb-which-aws-load-balancer-fits-your-needs/) -- NLB 4x faster than ALB in benchmarks
1065
+ - [Load Balancing Compared: Cloudflare vs AWS ELB vs Azure vs GCP](https://inventivehq.com/blog/cloudflare-load-balancing-vs-aws-alb-vs-azure-front-door-vs-google-cloud-load-balancing) -- Cloud LB comparison
1066
+ - [Cloudflare Load Balancing Reference Architecture](https://developers.cloudflare.com/reference-architecture/architectures/load-balancing/) -- 300+ PoPs, ~50ms from 95% of internet population
1067
+ - [Health Checks Overview - Google Cloud](https://docs.cloud.google.com/load-balancing/docs/health-check-concepts) -- GCP health check configuration
1068
+ - [Load Balancer Health Check Best Practices - Oracle](https://docs.oracle.com/en-us/iaas/Content/Balance/Tasks/health_check_best_practices.htm) -- Health check timing recommendations
1069
+ - [Is TLS Fast Yet?](https://istlsfastyet.com/) -- Google production data: TLS < 1% CPU load
1070
+ - [HAProxy TLS/SSL Termination](https://www.haproxy.com/glossary/what-is-ssl-tls-termination) -- TLS CPU overhead details
1071
+ - [Load Balancing, Affinity, Persistence, Sticky Sessions - HAProxy](https://www.haproxy.com/blog/load-balancing-affinity-persistence-sticky-sessions-what-you-need-to-know) -- Session affinity mechanisms and tradeoffs
1072
+ - [Sticky Sessions Disrupts Pod Auto-Scaling in Kubernetes](https://medium.com/nerd-for-tech/how-session-stickiness-disrupts-pod-auto-scaling-in-kubernetes-17ece8e2ea4f) -- Sticky sessions and auto-scaling conflict
1073
+ - [Enable Connection Draining - Google Cloud](https://cloud.google.com/load-balancing/docs/enabling-connection-draining) -- GCP drain timeout configuration
1074
+ - [Optimize Load Balancer Connection Draining for Amazon ECS](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/load-balancer-connection-draining.html) -- AWS ECS drain best practices
1075
+ - [How to Reduce Latency Spikes: Singleflight Pattern - Pipedrive](https://medium.com/pipedrive-engineering/how-to-reduce-latency-spikes-a-trick-with-shared-upstream-requests-d43d3bac951c) -- P99 spike from 40ms to 1000ms, fixed with request coalescing
1076
+ - [Thundering Herd Problem - Encore Blog](https://encore.dev/blog/thundering-herd-problem) -- Mitigation strategies: jitter, rate limiting, backoff
1077
+ - [NGINX Performance Tuning - Codedamn](https://codedamn.com/news/backend/performance-tuning-benchmarking-optimization) -- NGINX 5200 RPS per worker with 8 workers
1078
+ - [Tuning NGINX for Performance - F5](https://www.f5.com/company/blog/nginx/tuning-nginx) -- worker_connections, keepalive tuning
1079
+ - [Envoy Load Balancer Documentation](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/load_balancers) -- Consistent hashing ring size, algorithm details
1080
+ - [Battle of the Proxies: Envoy vs Traefik vs HAProxy in 2025](https://aws.plainenglish.io/battle-of-the-proxies-envoy-vs-traefik-vs-haproxy-in-2025-8f0bed6c7a66) -- Envoy sub-10ms under churn vs HAProxy 25s spikes
1081
+ - [Predictive Load Balancing: Round Robin, Weighted Round Robin, and ML](https://www.mdpi.com/2673-4591/122/1/26) -- ML-based approaches vs traditional algorithms