@wazir-dev/cli 1.0.0

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (629)
  1. package/AGENTS.md +111 -0
  2. package/CHANGELOG.md +14 -0
  3. package/CONTRIBUTING.md +101 -0
  4. package/LICENSE +21 -0
  5. package/README.md +314 -0
  6. package/assets/composition-engine.mmd +34 -0
  7. package/assets/demo-script.sh +17 -0
  8. package/assets/logo-dark.svg +14 -0
  9. package/assets/logo.svg +14 -0
  10. package/assets/pipeline.mmd +39 -0
  11. package/assets/record-demo.sh +51 -0
  12. package/docs/README.md +51 -0
  13. package/docs/adapters/context-mode.md +60 -0
  14. package/docs/concepts/architecture.md +87 -0
  15. package/docs/concepts/artifact-model.md +60 -0
  16. package/docs/concepts/composition-engine.md +36 -0
  17. package/docs/concepts/indexing-and-recall.md +160 -0
  18. package/docs/concepts/observability.md +41 -0
  19. package/docs/concepts/roles-and-workflows.md +59 -0
  20. package/docs/concepts/terminology-policy.md +27 -0
  21. package/docs/getting-started/01-installation.md +78 -0
  22. package/docs/getting-started/02-first-run.md +102 -0
  23. package/docs/getting-started/03-adding-to-project.md +15 -0
  24. package/docs/getting-started/04-host-setup.md +15 -0
  25. package/docs/guides/ci-integration.md +15 -0
  26. package/docs/guides/creating-skills.md +15 -0
  27. package/docs/guides/expertise-module-authoring.md +15 -0
  28. package/docs/guides/hook-development.md +15 -0
  29. package/docs/guides/memory-and-learnings.md +34 -0
  30. package/docs/guides/multi-host-export.md +15 -0
  31. package/docs/guides/troubleshooting.md +101 -0
  32. package/docs/guides/writing-custom-roles.md +15 -0
  33. package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
  34. package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
  35. package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
  36. package/docs/readmes/INDEX.md +99 -0
  37. package/docs/readmes/features/expertise/README.md +171 -0
  38. package/docs/readmes/features/exports/README.md +222 -0
  39. package/docs/readmes/features/hooks/README.md +103 -0
  40. package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
  41. package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
  42. package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
  43. package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
  44. package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
  45. package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
  46. package/docs/readmes/features/hooks/session-start.md +119 -0
  47. package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
  48. package/docs/readmes/features/roles/README.md +157 -0
  49. package/docs/readmes/features/roles/clarifier.md +152 -0
  50. package/docs/readmes/features/roles/content-author.md +190 -0
  51. package/docs/readmes/features/roles/designer.md +193 -0
  52. package/docs/readmes/features/roles/executor.md +184 -0
  53. package/docs/readmes/features/roles/learner.md +210 -0
  54. package/docs/readmes/features/roles/planner.md +182 -0
  55. package/docs/readmes/features/roles/researcher.md +164 -0
  56. package/docs/readmes/features/roles/reviewer.md +184 -0
  57. package/docs/readmes/features/roles/specifier.md +162 -0
  58. package/docs/readmes/features/roles/verifier.md +215 -0
  59. package/docs/readmes/features/schemas/README.md +178 -0
  60. package/docs/readmes/features/skills/README.md +63 -0
  61. package/docs/readmes/features/skills/brainstorming.md +96 -0
  62. package/docs/readmes/features/skills/debugging.md +148 -0
  63. package/docs/readmes/features/skills/design.md +120 -0
  64. package/docs/readmes/features/skills/prepare-next.md +109 -0
  65. package/docs/readmes/features/skills/run-audit.md +159 -0
  66. package/docs/readmes/features/skills/scan-project.md +109 -0
  67. package/docs/readmes/features/skills/self-audit.md +176 -0
  68. package/docs/readmes/features/skills/tdd.md +137 -0
  69. package/docs/readmes/features/skills/using-skills.md +92 -0
  70. package/docs/readmes/features/skills/verification.md +120 -0
  71. package/docs/readmes/features/skills/writing-plans.md +104 -0
  72. package/docs/readmes/features/tooling/README.md +320 -0
  73. package/docs/readmes/features/workflows/README.md +186 -0
  74. package/docs/readmes/features/workflows/author.md +181 -0
  75. package/docs/readmes/features/workflows/clarify.md +154 -0
  76. package/docs/readmes/features/workflows/design-review.md +171 -0
  77. package/docs/readmes/features/workflows/design.md +169 -0
  78. package/docs/readmes/features/workflows/discover.md +162 -0
  79. package/docs/readmes/features/workflows/execute.md +173 -0
  80. package/docs/readmes/features/workflows/learn.md +167 -0
  81. package/docs/readmes/features/workflows/plan-review.md +165 -0
  82. package/docs/readmes/features/workflows/plan.md +170 -0
  83. package/docs/readmes/features/workflows/prepare-next.md +167 -0
  84. package/docs/readmes/features/workflows/review.md +169 -0
  85. package/docs/readmes/features/workflows/run-audit.md +191 -0
  86. package/docs/readmes/features/workflows/spec-challenge.md +159 -0
  87. package/docs/readmes/features/workflows/specify.md +160 -0
  88. package/docs/readmes/features/workflows/verify.md +177 -0
  89. package/docs/readmes/packages/README.md +50 -0
  90. package/docs/readmes/packages/ajv.md +117 -0
  91. package/docs/readmes/packages/context-mode.md +118 -0
  92. package/docs/readmes/packages/gray-matter.md +116 -0
  93. package/docs/readmes/packages/node-test.md +137 -0
  94. package/docs/readmes/packages/yaml.md +112 -0
  95. package/docs/reference/configuration-reference.md +159 -0
  96. package/docs/reference/expertise-index.md +52 -0
  97. package/docs/reference/git-flow.md +43 -0
  98. package/docs/reference/hooks.md +87 -0
  99. package/docs/reference/host-exports.md +50 -0
  100. package/docs/reference/launch-checklist.md +172 -0
  101. package/docs/reference/marketplace-listings.md +76 -0
  102. package/docs/reference/release-process.md +34 -0
  103. package/docs/reference/roles-reference.md +77 -0
  104. package/docs/reference/skills.md +33 -0
  105. package/docs/reference/templates.md +29 -0
  106. package/docs/reference/tooling-cli.md +94 -0
  107. package/docs/truth-claims.yaml +222 -0
  108. package/expertise/PROGRESS.md +63 -0
  109. package/expertise/README.md +18 -0
  110. package/expertise/antipatterns/PROGRESS.md +56 -0
  111. package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
  112. package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
  113. package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
  114. package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
  115. package/expertise/antipatterns/backend/index.md +24 -0
  116. package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
  117. package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
  118. package/expertise/antipatterns/code/async-antipatterns.md +622 -0
  119. package/expertise/antipatterns/code/code-smells.md +1186 -0
  120. package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
  121. package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
  122. package/expertise/antipatterns/code/index.md +27 -0
  123. package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
  124. package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
  125. package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
  126. package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
  127. package/expertise/antipatterns/design/dark-patterns.md +1121 -0
  128. package/expertise/antipatterns/design/index.md +22 -0
  129. package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
  130. package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
  131. package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
  132. package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
  133. package/expertise/antipatterns/frontend/index.md +23 -0
  134. package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
  135. package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
  136. package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
  137. package/expertise/antipatterns/index.md +31 -0
  138. package/expertise/antipatterns/performance/index.md +20 -0
  139. package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
  140. package/expertise/antipatterns/performance/premature-optimization.md +623 -0
  141. package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
  142. package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
  143. package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
  144. package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
  145. package/expertise/antipatterns/process/index.md +23 -0
  146. package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
  147. package/expertise/antipatterns/security/index.md +20 -0
  148. package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
  149. package/expertise/antipatterns/security/security-theater.md +843 -0
  150. package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
  151. package/expertise/architecture/PROGRESS.md +70 -0
  152. package/expertise/architecture/data/caching-architecture.md +671 -0
  153. package/expertise/architecture/data/data-consistency.md +574 -0
  154. package/expertise/architecture/data/data-modeling.md +536 -0
  155. package/expertise/architecture/data/event-streams-and-queues.md +634 -0
  156. package/expertise/architecture/data/index.md +25 -0
  157. package/expertise/architecture/data/search-architecture.md +663 -0
  158. package/expertise/architecture/data/sql-vs-nosql.md +708 -0
  159. package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
  160. package/expertise/architecture/decisions/build-vs-buy.md +616 -0
  161. package/expertise/architecture/decisions/index.md +23 -0
  162. package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
  163. package/expertise/architecture/decisions/technology-selection.md +616 -0
  164. package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
  165. package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
  166. package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
  167. package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
  168. package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
  169. package/expertise/architecture/distributed/index.md +25 -0
  170. package/expertise/architecture/distributed/saga-pattern.md +797 -0
  171. package/expertise/architecture/foundations/architectural-thinking.md +460 -0
  172. package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
  173. package/expertise/architecture/foundations/design-principles-solid.md +649 -0
  174. package/expertise/architecture/foundations/domain-driven-design.md +719 -0
  175. package/expertise/architecture/foundations/index.md +25 -0
  176. package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
  177. package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
  178. package/expertise/architecture/index.md +34 -0
  179. package/expertise/architecture/integration/api-design-graphql.md +638 -0
  180. package/expertise/architecture/integration/api-design-grpc.md +804 -0
  181. package/expertise/architecture/integration/api-design-rest.md +892 -0
  182. package/expertise/architecture/integration/index.md +25 -0
  183. package/expertise/architecture/integration/third-party-integration.md +795 -0
  184. package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
  185. package/expertise/architecture/integration/websockets-realtime.md +791 -0
  186. package/expertise/architecture/mobile-architecture/index.md +22 -0
  187. package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
  188. package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
  189. package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
  190. package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
  191. package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
  192. package/expertise/architecture/patterns/event-driven.md +797 -0
  193. package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
  194. package/expertise/architecture/patterns/index.md +27 -0
  195. package/expertise/architecture/patterns/layered-architecture.md +736 -0
  196. package/expertise/architecture/patterns/microservices.md +753 -0
  197. package/expertise/architecture/patterns/modular-monolith.md +692 -0
  198. package/expertise/architecture/patterns/monolith.md +626 -0
  199. package/expertise/architecture/patterns/plugin-architecture.md +735 -0
  200. package/expertise/architecture/patterns/serverless.md +780 -0
  201. package/expertise/architecture/scaling/database-scaling.md +615 -0
  202. package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
  203. package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
  204. package/expertise/architecture/scaling/index.md +24 -0
  205. package/expertise/architecture/scaling/multi-tenancy.md +800 -0
  206. package/expertise/architecture/scaling/stateless-design.md +787 -0
  207. package/expertise/backend/embedded-firmware.md +625 -0
  208. package/expertise/backend/go.md +853 -0
  209. package/expertise/backend/index.md +24 -0
  210. package/expertise/backend/java-spring.md +448 -0
  211. package/expertise/backend/node-typescript.md +625 -0
  212. package/expertise/backend/python-fastapi.md +724 -0
  213. package/expertise/backend/rust.md +458 -0
  214. package/expertise/backend/solidity.md +711 -0
  215. package/expertise/composition-map.yaml +443 -0
  216. package/expertise/content/foundations/content-modeling.md +395 -0
  217. package/expertise/content/foundations/editorial-standards.md +449 -0
  218. package/expertise/content/foundations/index.md +24 -0
  219. package/expertise/content/foundations/microcopy.md +455 -0
  220. package/expertise/content/foundations/terminology-governance.md +509 -0
  221. package/expertise/content/index.md +34 -0
  222. package/expertise/content/patterns/accessibility-copy.md +518 -0
  223. package/expertise/content/patterns/index.md +24 -0
  224. package/expertise/content/patterns/notification-content.md +433 -0
  225. package/expertise/content/patterns/sample-content.md +486 -0
  226. package/expertise/content/patterns/state-copy.md +439 -0
  227. package/expertise/design/PROGRESS.md +58 -0
  228. package/expertise/design/disciplines/dark-mode-theming.md +577 -0
  229. package/expertise/design/disciplines/design-systems.md +595 -0
  230. package/expertise/design/disciplines/index.md +25 -0
  231. package/expertise/design/disciplines/information-architecture.md +800 -0
  232. package/expertise/design/disciplines/interaction-design.md +788 -0
  233. package/expertise/design/disciplines/responsive-design.md +552 -0
  234. package/expertise/design/disciplines/usability-testing.md +516 -0
  235. package/expertise/design/disciplines/user-research.md +792 -0
  236. package/expertise/design/foundations/accessibility-design.md +796 -0
  237. package/expertise/design/foundations/color-theory.md +797 -0
  238. package/expertise/design/foundations/iconography.md +795 -0
  239. package/expertise/design/foundations/index.md +26 -0
  240. package/expertise/design/foundations/motion-and-animation.md +653 -0
  241. package/expertise/design/foundations/rtl-design.md +585 -0
  242. package/expertise/design/foundations/spacing-and-layout.md +607 -0
  243. package/expertise/design/foundations/typography.md +800 -0
  244. package/expertise/design/foundations/visual-hierarchy.md +761 -0
  245. package/expertise/design/index.md +32 -0
  246. package/expertise/design/patterns/authentication-flows.md +474 -0
  247. package/expertise/design/patterns/content-consumption.md +789 -0
  248. package/expertise/design/patterns/data-display.md +618 -0
  249. package/expertise/design/patterns/e-commerce.md +1494 -0
  250. package/expertise/design/patterns/feedback-and-states.md +642 -0
  251. package/expertise/design/patterns/forms-and-input.md +819 -0
  252. package/expertise/design/patterns/gamification.md +801 -0
  253. package/expertise/design/patterns/index.md +31 -0
  254. package/expertise/design/patterns/microinteractions.md +449 -0
  255. package/expertise/design/patterns/navigation.md +800 -0
  256. package/expertise/design/patterns/notifications.md +705 -0
  257. package/expertise/design/patterns/onboarding.md +700 -0
  258. package/expertise/design/patterns/search-and-filter.md +601 -0
  259. package/expertise/design/patterns/settings-and-preferences.md +768 -0
  260. package/expertise/design/patterns/social-and-community.md +748 -0
  261. package/expertise/design/platforms/desktop-native.md +612 -0
  262. package/expertise/design/platforms/index.md +25 -0
  263. package/expertise/design/platforms/mobile-android.md +825 -0
  264. package/expertise/design/platforms/mobile-cross-platform.md +983 -0
  265. package/expertise/design/platforms/mobile-ios.md +699 -0
  266. package/expertise/design/platforms/tablet.md +794 -0
  267. package/expertise/design/platforms/web-dashboard.md +790 -0
  268. package/expertise/design/platforms/web-responsive.md +550 -0
  269. package/expertise/design/psychology/behavioral-nudges.md +449 -0
  270. package/expertise/design/psychology/cognitive-load.md +1191 -0
  271. package/expertise/design/psychology/error-psychology.md +778 -0
  272. package/expertise/design/psychology/index.md +22 -0
  273. package/expertise/design/psychology/persuasive-design.md +736 -0
  274. package/expertise/design/psychology/user-mental-models.md +623 -0
  275. package/expertise/design/tooling/open-pencil.md +266 -0
  276. package/expertise/frontend/angular.md +1073 -0
  277. package/expertise/frontend/desktop-electron.md +546 -0
  278. package/expertise/frontend/flutter.md +782 -0
  279. package/expertise/frontend/index.md +27 -0
  280. package/expertise/frontend/native-android.md +409 -0
  281. package/expertise/frontend/native-ios.md +490 -0
  282. package/expertise/frontend/react-native.md +1160 -0
  283. package/expertise/frontend/react.md +808 -0
  284. package/expertise/frontend/vue.md +1089 -0
  285. package/expertise/humanize/domain-rules-code.md +79 -0
  286. package/expertise/humanize/domain-rules-content.md +67 -0
  287. package/expertise/humanize/domain-rules-technical-docs.md +56 -0
  288. package/expertise/humanize/index.md +35 -0
  289. package/expertise/humanize/self-audit-checklist.md +87 -0
  290. package/expertise/humanize/sentence-patterns.md +218 -0
  291. package/expertise/humanize/vocabulary-blacklist.md +105 -0
  292. package/expertise/i18n/PROGRESS.md +65 -0
  293. package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
  294. package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
  295. package/expertise/i18n/advanced/complex-scripts.md +30 -0
  296. package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
  297. package/expertise/i18n/advanced/testing-i18n.md +28 -0
  298. package/expertise/i18n/content/content-adaptation.md +23 -0
  299. package/expertise/i18n/content/locale-specific-formatting.md +23 -0
  300. package/expertise/i18n/content/machine-translation-integration.md +28 -0
  301. package/expertise/i18n/content/translation-management.md +29 -0
  302. package/expertise/i18n/foundations/date-time-calendars.md +67 -0
  303. package/expertise/i18n/foundations/i18n-architecture.md +272 -0
  304. package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
  305. package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
  306. package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
  307. package/expertise/i18n/foundations/string-externalization.md +236 -0
  308. package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
  309. package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
  310. package/expertise/i18n/index.md +38 -0
  311. package/expertise/i18n/platform/backend-i18n.md +31 -0
  312. package/expertise/i18n/platform/flutter-i18n.md +148 -0
  313. package/expertise/i18n/platform/native-android-i18n.md +36 -0
  314. package/expertise/i18n/platform/native-ios-i18n.md +36 -0
  315. package/expertise/i18n/platform/react-i18n.md +103 -0
  316. package/expertise/i18n/platform/web-css-i18n.md +81 -0
  317. package/expertise/i18n/rtl/arabic-specific.md +175 -0
  318. package/expertise/i18n/rtl/hebrew-specific.md +149 -0
  319. package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
  320. package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
  321. package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
  322. package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
  323. package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
  324. package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
  325. package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
  326. package/expertise/i18n/rtl/rtl-typography.md +160 -0
  327. package/expertise/index.md +113 -0
  328. package/expertise/index.yaml +216 -0
  329. package/expertise/infrastructure/cloud-aws.md +597 -0
  330. package/expertise/infrastructure/cloud-gcp.md +599 -0
  331. package/expertise/infrastructure/cybersecurity.md +816 -0
  332. package/expertise/infrastructure/database-mongodb.md +447 -0
  333. package/expertise/infrastructure/database-postgres.md +400 -0
  334. package/expertise/infrastructure/devops-cicd.md +787 -0
  335. package/expertise/infrastructure/index.md +27 -0
  336. package/expertise/performance/PROGRESS.md +50 -0
  337. package/expertise/performance/backend/api-latency.md +1204 -0
  338. package/expertise/performance/backend/background-jobs.md +506 -0
  339. package/expertise/performance/backend/connection-pooling.md +1209 -0
  340. package/expertise/performance/backend/database-query-optimization.md +515 -0
  341. package/expertise/performance/backend/index.md +23 -0
  342. package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
  343. package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
  344. package/expertise/performance/foundations/caching-strategies.md +489 -0
  345. package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
  346. package/expertise/performance/foundations/index.md +24 -0
  347. package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
  348. package/expertise/performance/foundations/memory-management.md +964 -0
  349. package/expertise/performance/foundations/performance-budgets.md +1314 -0
  350. package/expertise/performance/index.md +31 -0
  351. package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
  352. package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
  353. package/expertise/performance/infrastructure/index.md +22 -0
  354. package/expertise/performance/infrastructure/load-balancing.md +1081 -0
  355. package/expertise/performance/infrastructure/observability.md +1079 -0
  356. package/expertise/performance/mobile/index.md +23 -0
  357. package/expertise/performance/mobile/mobile-animations.md +544 -0
  358. package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
  359. package/expertise/performance/mobile/mobile-network.md +452 -0
  360. package/expertise/performance/mobile/mobile-rendering.md +599 -0
  361. package/expertise/performance/mobile/mobile-startup-time.md +505 -0
  362. package/expertise/performance/platform-specific/flutter-performance.md +647 -0
  363. package/expertise/performance/platform-specific/index.md +22 -0
  364. package/expertise/performance/platform-specific/node-performance.md +1307 -0
  365. package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
  366. package/expertise/performance/platform-specific/react-performance.md +1403 -0
  367. package/expertise/performance/web/bundle-optimization.md +1239 -0
  368. package/expertise/performance/web/image-and-media.md +636 -0
  369. package/expertise/performance/web/index.md +24 -0
  370. package/expertise/performance/web/network-optimization.md +1133 -0
  371. package/expertise/performance/web/rendering-performance.md +1098 -0
  372. package/expertise/performance/web/ssr-and-hydration.md +918 -0
  373. package/expertise/performance/web/web-vitals.md +1374 -0
  374. package/expertise/quality/accessibility.md +985 -0
  375. package/expertise/quality/evidence-based-verification.md +499 -0
  376. package/expertise/quality/index.md +24 -0
  377. package/expertise/quality/ml-model-audit.md +614 -0
  378. package/expertise/quality/performance.md +600 -0
  379. package/expertise/quality/testing-api.md +891 -0
  380. package/expertise/quality/testing-mobile.md +496 -0
  381. package/expertise/quality/testing-web.md +849 -0
  382. package/expertise/security/PROGRESS.md +54 -0
  383. package/expertise/security/agentic-identity.md +540 -0
  384. package/expertise/security/compliance-frameworks.md +601 -0
  385. package/expertise/security/data/data-encryption.md +364 -0
  386. package/expertise/security/data/data-privacy-gdpr.md +692 -0
  387. package/expertise/security/data/database-security.md +1171 -0
  388. package/expertise/security/data/index.md +22 -0
  389. package/expertise/security/data/pii-handling.md +531 -0
  390. package/expertise/security/foundations/authentication.md +1041 -0
  391. package/expertise/security/foundations/authorization.md +603 -0
  392. package/expertise/security/foundations/cryptography.md +1001 -0
  393. package/expertise/security/foundations/index.md +25 -0
  394. package/expertise/security/foundations/owasp-top-10.md +1354 -0
  395. package/expertise/security/foundations/secrets-management.md +1217 -0
  396. package/expertise/security/foundations/secure-sdlc.md +700 -0
  397. package/expertise/security/foundations/supply-chain-security.md +698 -0
  398. package/expertise/security/index.md +31 -0
  399. package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
  400. package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
  401. package/expertise/security/infrastructure/container-security.md +721 -0
  402. package/expertise/security/infrastructure/incident-response.md +1295 -0
  403. package/expertise/security/infrastructure/index.md +24 -0
  404. package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
  405. package/expertise/security/infrastructure/network-security.md +1337 -0
  406. package/expertise/security/mobile/index.md +23 -0
  407. package/expertise/security/mobile/mobile-android-security.md +1218 -0
  408. package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
  409. package/expertise/security/mobile/mobile-data-storage.md +1265 -0
  410. package/expertise/security/mobile/mobile-ios-security.md +1401 -0
  411. package/expertise/security/mobile/mobile-network-security.md +1520 -0
  412. package/expertise/security/smart-contract-security.md +594 -0
  413. package/expertise/security/testing/index.md +22 -0
  414. package/expertise/security/testing/penetration-testing.md +1258 -0
  415. package/expertise/security/testing/security-code-review.md +1765 -0
  416. package/expertise/security/testing/threat-modeling.md +1074 -0
  417. package/expertise/security/testing/vulnerability-scanning.md +1062 -0
  418. package/expertise/security/web/api-security.md +586 -0
  419. package/expertise/security/web/cors-and-headers.md +433 -0
  420. package/expertise/security/web/csrf.md +562 -0
  421. package/expertise/security/web/file-upload.md +1477 -0
  422. package/expertise/security/web/index.md +25 -0
  423. package/expertise/security/web/injection.md +1375 -0
  424. package/expertise/security/web/session-management.md +1101 -0
  425. package/expertise/security/web/xss.md +1158 -0
  426. package/exports/README.md +17 -0
  427. package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
  428. package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
  429. package/exports/hosts/claude/.claude/agents/designer.md +55 -0
  430. package/exports/hosts/claude/.claude/agents/executor.md +55 -0
  431. package/exports/hosts/claude/.claude/agents/learner.md +51 -0
  432. package/exports/hosts/claude/.claude/agents/planner.md +53 -0
  433. package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
  434. package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
  435. package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
  436. package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
  437. package/exports/hosts/claude/.claude/commands/author.md +42 -0
  438. package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
  439. package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
  440. package/exports/hosts/claude/.claude/commands/design.md +44 -0
  441. package/exports/hosts/claude/.claude/commands/discover.md +37 -0
  442. package/exports/hosts/claude/.claude/commands/execute.md +48 -0
  443. package/exports/hosts/claude/.claude/commands/learn.md +38 -0
  444. package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
  445. package/exports/hosts/claude/.claude/commands/plan.md +39 -0
  446. package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
  447. package/exports/hosts/claude/.claude/commands/review.md +40 -0
  448. package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
  449. package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
  450. package/exports/hosts/claude/.claude/commands/specify.md +38 -0
  451. package/exports/hosts/claude/.claude/commands/verify.md +37 -0
  452. package/exports/hosts/claude/.claude/settings.json +34 -0
  453. package/exports/hosts/claude/CLAUDE.md +19 -0
  454. package/exports/hosts/claude/export.manifest.json +38 -0
  455. package/exports/hosts/claude/host-package.json +67 -0
  456. package/exports/hosts/codex/AGENTS.md +19 -0
  457. package/exports/hosts/codex/export.manifest.json +38 -0
  458. package/exports/hosts/codex/host-package.json +41 -0
  459. package/exports/hosts/cursor/.cursor/hooks.json +16 -0
  460. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
  461. package/exports/hosts/cursor/export.manifest.json +38 -0
  462. package/exports/hosts/cursor/host-package.json +42 -0
  463. package/exports/hosts/gemini/GEMINI.md +19 -0
  464. package/exports/hosts/gemini/export.manifest.json +38 -0
  465. package/exports/hosts/gemini/host-package.json +41 -0
  466. package/hooks/README.md +18 -0
  467. package/hooks/definitions/loop_cap_guard.yaml +21 -0
  468. package/hooks/definitions/post_tool_capture.yaml +24 -0
  469. package/hooks/definitions/pre_compact_summary.yaml +19 -0
  470. package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
  471. package/hooks/definitions/protected_path_write_guard.yaml +19 -0
  472. package/hooks/definitions/session_start.yaml +19 -0
  473. package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
  474. package/hooks/loop-cap-guard +17 -0
  475. package/hooks/post-tool-lint +36 -0
  476. package/hooks/protected-path-write-guard +17 -0
  477. package/hooks/session-start +41 -0
  478. package/llms-full.txt +2355 -0
  479. package/llms.txt +43 -0
  480. package/package.json +79 -0
  481. package/roles/README.md +20 -0
  482. package/roles/clarifier.md +42 -0
  483. package/roles/content-author.md +63 -0
  484. package/roles/designer.md +55 -0
  485. package/roles/executor.md +55 -0
  486. package/roles/learner.md +51 -0
  487. package/roles/planner.md +53 -0
  488. package/roles/researcher.md +43 -0
  489. package/roles/reviewer.md +54 -0
  490. package/roles/specifier.md +47 -0
  491. package/roles/verifier.md +71 -0
  492. package/schemas/README.md +24 -0
  493. package/schemas/accepted-learning.schema.json +20 -0
  494. package/schemas/author-artifact.schema.json +156 -0
  495. package/schemas/clarification.schema.json +19 -0
  496. package/schemas/design-artifact.schema.json +80 -0
  497. package/schemas/docs-claim.schema.json +18 -0
  498. package/schemas/export-manifest.schema.json +20 -0
  499. package/schemas/hook.schema.json +67 -0
  500. package/schemas/host-export-package.schema.json +18 -0
  501. package/schemas/implementation-plan.schema.json +19 -0
  502. package/schemas/proposed-learning.schema.json +19 -0
  503. package/schemas/research.schema.json +18 -0
  504. package/schemas/review.schema.json +29 -0
  505. package/schemas/run-manifest.schema.json +18 -0
  506. package/schemas/spec-challenge.schema.json +18 -0
  507. package/schemas/spec.schema.json +20 -0
  508. package/schemas/usage.schema.json +102 -0
  509. package/schemas/verification-proof.schema.json +29 -0
  510. package/schemas/wazir-manifest.schema.json +173 -0
  511. package/skills/README.md +40 -0
  512. package/skills/brainstorming/SKILL.md +77 -0
  513. package/skills/debugging/SKILL.md +50 -0
  514. package/skills/design/SKILL.md +61 -0
  515. package/skills/dispatching-parallel-agents/SKILL.md +128 -0
  516. package/skills/executing-plans/SKILL.md +70 -0
  517. package/skills/finishing-a-development-branch/SKILL.md +169 -0
  518. package/skills/humanize/SKILL.md +123 -0
  519. package/skills/init-pipeline/SKILL.md +124 -0
  520. package/skills/prepare-next/SKILL.md +20 -0
  521. package/skills/receiving-code-review/SKILL.md +123 -0
  522. package/skills/requesting-code-review/SKILL.md +105 -0
  523. package/skills/requesting-code-review/code-reviewer.md +108 -0
  524. package/skills/run-audit/SKILL.md +197 -0
  525. package/skills/scan-project/SKILL.md +41 -0
  526. package/skills/self-audit/SKILL.md +153 -0
  527. package/skills/subagent-driven-development/SKILL.md +154 -0
  528. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
  529. package/skills/subagent-driven-development/implementer-prompt.md +102 -0
  530. package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
  531. package/skills/tdd/SKILL.md +23 -0
  532. package/skills/using-git-worktrees/SKILL.md +163 -0
  533. package/skills/using-skills/SKILL.md +95 -0
  534. package/skills/verification/SKILL.md +22 -0
  535. package/skills/wazir/SKILL.md +463 -0
  536. package/skills/writing-plans/SKILL.md +30 -0
  537. package/skills/writing-skills/SKILL.md +157 -0
  538. package/skills/writing-skills/anthropic-best-practices.md +122 -0
  539. package/skills/writing-skills/persuasion-principles.md +50 -0
  540. package/templates/README.md +20 -0
  541. package/templates/artifacts/README.md +10 -0
  542. package/templates/artifacts/accepted-learning.md +19 -0
  543. package/templates/artifacts/accepted-learning.template.json +12 -0
  544. package/templates/artifacts/author.md +74 -0
  545. package/templates/artifacts/author.template.json +19 -0
  546. package/templates/artifacts/clarification.md +21 -0
  547. package/templates/artifacts/clarification.template.json +12 -0
  548. package/templates/artifacts/execute-notes.md +19 -0
  549. package/templates/artifacts/implementation-plan.md +21 -0
  550. package/templates/artifacts/implementation-plan.template.json +11 -0
  551. package/templates/artifacts/learning-proposal.md +19 -0
  552. package/templates/artifacts/next-run-handoff.md +21 -0
  553. package/templates/artifacts/plan-review.md +19 -0
  554. package/templates/artifacts/proposed-learning.template.json +12 -0
  555. package/templates/artifacts/research.md +21 -0
  556. package/templates/artifacts/research.template.json +12 -0
  557. package/templates/artifacts/review-findings.md +19 -0
  558. package/templates/artifacts/review.template.json +11 -0
  559. package/templates/artifacts/run-manifest.template.json +8 -0
  560. package/templates/artifacts/spec-challenge.md +19 -0
  561. package/templates/artifacts/spec-challenge.template.json +11 -0
  562. package/templates/artifacts/spec.md +21 -0
  563. package/templates/artifacts/spec.template.json +12 -0
  564. package/templates/artifacts/verification-proof.md +19 -0
  565. package/templates/artifacts/verification-proof.template.json +11 -0
  566. package/templates/examples/accepted-learning.example.json +14 -0
  567. package/templates/examples/author.example.json +152 -0
  568. package/templates/examples/clarification.example.json +15 -0
  569. package/templates/examples/docs-claim.example.json +8 -0
  570. package/templates/examples/export-manifest.example.json +7 -0
  571. package/templates/examples/host-export-package.example.json +11 -0
  572. package/templates/examples/implementation-plan.example.json +17 -0
  573. package/templates/examples/proposed-learning.example.json +13 -0
  574. package/templates/examples/research.example.json +15 -0
  575. package/templates/examples/research.example.md +6 -0
  576. package/templates/examples/review.example.json +17 -0
  577. package/templates/examples/run-manifest.example.json +9 -0
  578. package/templates/examples/spec-challenge.example.json +14 -0
  579. package/templates/examples/spec.example.json +21 -0
  580. package/templates/examples/verification-proof.example.json +21 -0
  581. package/templates/examples/wazir-manifest.example.yaml +65 -0
  582. package/templates/task-definition-schema.md +99 -0
  583. package/tooling/README.md +20 -0
  584. package/tooling/src/adapters/context-mode.js +50 -0
  585. package/tooling/src/capture/command.js +376 -0
  586. package/tooling/src/capture/store.js +99 -0
  587. package/tooling/src/capture/usage.js +270 -0
  588. package/tooling/src/checks/branches.js +50 -0
  589. package/tooling/src/checks/brand-truth.js +110 -0
  590. package/tooling/src/checks/changelog.js +231 -0
  591. package/tooling/src/checks/command-registry.js +36 -0
  592. package/tooling/src/checks/commits.js +102 -0
  593. package/tooling/src/checks/docs-drift.js +103 -0
  594. package/tooling/src/checks/docs-truth.js +201 -0
  595. package/tooling/src/checks/runtime-surface.js +156 -0
  596. package/tooling/src/cli.js +116 -0
  597. package/tooling/src/command-options.js +56 -0
  598. package/tooling/src/commands/validate.js +320 -0
  599. package/tooling/src/doctor/command.js +91 -0
  600. package/tooling/src/export/command.js +77 -0
  601. package/tooling/src/export/compiler.js +498 -0
  602. package/tooling/src/guards/loop-cap-guard.js +52 -0
  603. package/tooling/src/guards/protected-path-write-guard.js +67 -0
  604. package/tooling/src/index/command.js +152 -0
  605. package/tooling/src/index/storage.js +1061 -0
  606. package/tooling/src/index/summarizers.js +261 -0
  607. package/tooling/src/loaders.js +18 -0
  608. package/tooling/src/project-root.js +22 -0
  609. package/tooling/src/recall/command.js +225 -0
  610. package/tooling/src/schema-validator.js +30 -0
  611. package/tooling/src/state-root.js +40 -0
  612. package/tooling/src/status/command.js +71 -0
  613. package/wazir.manifest.yaml +135 -0
  614. package/workflows/README.md +19 -0
  615. package/workflows/author.md +42 -0
  616. package/workflows/clarify.md +38 -0
  617. package/workflows/design-review.md +46 -0
  618. package/workflows/design.md +44 -0
  619. package/workflows/discover.md +37 -0
  620. package/workflows/execute.md +48 -0
  621. package/workflows/learn.md +38 -0
  622. package/workflows/plan-review.md +42 -0
  623. package/workflows/plan.md +39 -0
  624. package/workflows/prepare-next.md +37 -0
  625. package/workflows/review.md +40 -0
  626. package/workflows/run-audit.md +41 -0
  627. package/workflows/spec-challenge.md +41 -0
  628. package/workflows/specify.md +38 -0
  629. package/workflows/verify.md +37 -0
@@ -0,0 +1,971 @@
1
+ # Rate Limiting & Throttling
2
+
3
+ > Performance expertise module covering algorithms, distributed implementations, client-side
4
+ > strategies, and production decision frameworks for rate limiting and throttling systems.
5
+
6
+ ---
7
+
8
+ ## Table of Contents
9
+
10
+ 1. [Core Concepts: Rate Limiting vs Throttling vs Backpressure](#core-concepts)
11
+ 2. [Rate Limiting Algorithms](#rate-limiting-algorithms)
12
+ 3. [Algorithm Performance Comparison](#algorithm-performance-comparison)
13
+ 4. [Distributed Rate Limiting with Redis](#distributed-rate-limiting-with-redis)
14
+ 5. [Client-Side Throttling and Debouncing](#client-side-throttling-and-debouncing)
15
+ 6. [API Rate Limit Headers](#api-rate-limit-headers)
16
+ 7. [Backpressure Mechanisms](#backpressure-mechanisms)
17
+ 8. [Common Bottlenecks](#common-bottlenecks)
18
+ 9. [Anti-Patterns](#anti-patterns)
19
+ 10. [Before/After: System Stability Under Load](#beforeafter-system-stability-under-load)
20
+ 11. [Decision Tree: Which Algorithm Should I Use?](#decision-tree-which-algorithm-should-i-use)
21
+ 12. [Production Case Studies](#production-case-studies)
22
+ 13. [Sources](#sources)
23
+
24
+ ---
25
+
26
+ ## Core Concepts
27
+
28
+ ### Rate Limiting
29
+
30
+ Rate limiting enforces a strict quantitative ceiling on operations within a fixed timeframe.
31
+ When a client exceeds N requests per window, subsequent requests are rejected outright with
32
+ HTTP 429 ("Too Many Requests"). The enforcement is binary: the request either passes or is
33
+ denied. Typical configurations enforce 100-10,000 requests per minute per API key.
34
+
35
+ ### Throttling
36
+
37
+ Throttling controls the *rate at which requests are processed* rather than rejecting them.
38
+ A throttled client continues receiving responses but at a degraded throughput -- for example,
39
+ a search API that normally responds in 50ms may add artificial delays of 200-500ms to slow
40
+ a misbehaving client. Throttling preserves availability while degrading performance, whereas
41
+ rate limiting preserves server resources by shedding load.
42
+
43
+ ### Backpressure
44
+
45
+ Backpressure is an upstream signal that tells producers to slow down when downstream
46
+ consumers cannot keep pace. Unlike rate limiting (which is imposed on callers), backpressure
47
+ is a cooperative feedback loop where a service communicates its capacity to its upstream
48
+ dependencies. In practice, backpressure manifests as TCP flow control, HTTP 429 with
49
+ Retry-After headers, Kafka consumer lag signals, or reactive streams demand signaling.
50
+
51
+ ### When to Use Each
52
+
53
+ | Mechanism | Direction | Behavior When Triggered | Best For |
54
+ |----------------|-----------------|----------------------------------|-----------------------------------|
55
+ | Rate Limiting | Server to client | Reject excess requests (429) | API abuse prevention, DDoS |
56
+ | Throttling | Server to client | Slow down processing | Graceful degradation under load |
57
+ | Backpressure | Downstream to upstream | Signal producer to reduce rate | Internal service-to-service flows |
58
+
59
+ ---
60
+
61
+ ## Rate Limiting Algorithms
62
+
63
+ ### 1. Token Bucket
64
+
65
+ **How it works:** A bucket holds up to `B` tokens (the burst capacity). Tokens are added
66
+ at a fixed refill rate of `R` tokens per second. Each incoming request consumes 1 token.
67
+ If the bucket is empty, the request is rejected. Tokens accumulate up to the bucket
68
+ capacity, enabling short bursts.
69
+
70
+ ```
71
+ Parameters:
72
+ - Bucket capacity (B): maximum burst size (e.g., 100 tokens)
73
+ - Refill rate (R): tokens added per second (e.g., 10/s)
74
+
75
+ State per client:
76
+ - current_tokens: float or integer
77
+ - last_refill_timestamp: epoch milliseconds
78
+
79
+ Per-request logic:
80
+ 1. Calculate elapsed time since last refill
81
+ 2. Add (elapsed * R) tokens, cap at B
82
+ 3. If current_tokens >= 1: consume 1 token, allow request
83
+ 4. Else: reject request
84
+ ```
85
+
86
+ **Memory per bucket:** ~24-40 bytes (token count + last-refill timestamp). At 1M users,
87
+ that is roughly 24-40 MB of state.
88
+
89
+ **Strengths:** Allows controlled bursts (a client that was idle accumulates tokens).
90
+ Simple to reason about. Two tunable parameters (rate and burst) map cleanly to business
91
+ requirements. Used in production at Stripe, AWS API Gateway, and NGINX.
92
+
93
+ **Weaknesses:** Requires per-user state. Tuning burst capacity too high allows traffic
94
+ spikes that defeat the purpose of limiting. Does not smooth output rate -- bursts pass
95
+ through to the backend.
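
A minimal in-memory sketch of the per-request logic above, assuming a single process and one bucket per client. The names `TokenBucket` and `tryConsume` are illustrative and not taken from any library; time is passed in explicitly so the behavior is deterministic.

```javascript
// Minimal in-memory token bucket (illustrative sketch, not a library API).
class TokenBucket {
  constructor(capacity, refillPerSecond, nowMs = Date.now()) {
    this.capacity = capacity;               // B: maximum burst size
    this.refillPerSecond = refillPerSecond; // R: tokens added per second
    this.tokens = capacity;                 // a new bucket starts full
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  tryConsume(nowMs = Date.now(), cost = 1) {
    // Steps 1-2: refill lazily based on elapsed time, capped at capacity.
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    // Steps 3-4: consume if enough tokens remain, otherwise reject.
    if (this.tokens >= cost) {
      this.tokens -= cost;
      return true;
    }
    return false;
  }
}
```

Refilling lazily on each call avoids a background timer per client, which is why the state stays at two fields.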
96
+
97
+ ### 2. Leaky Bucket
98
+
99
+ **How it works:** Requests enter a FIFO queue (the bucket) with fixed capacity `B`.
100
+ Requests drain from the queue at a constant rate `R` per second. If the queue is full
101
+ when a new request arrives, it is dropped. The output rate is perfectly smooth regardless
102
+ of input burstiness.
103
+
104
+ ```
105
+ Parameters:
106
+ - Queue capacity (B): maximum queued requests
107
+ - Leak rate (R): requests processed per second
108
+
109
+ State per client:
110
+ - queue: bounded FIFO (or counter + timestamp)
111
+ - last_leak_timestamp
112
+
113
+ Per-request logic:
114
+ 1. Drain (elapsed * R) items from queue
115
+ 2. If queue length < B: enqueue request
116
+ 3. Else: reject request
117
+ ```
118
+
119
+ **Memory per bucket:** ~24-32 bytes if using counter-based implementation (no actual
120
+ queue). If using a real FIFO queue, memory grows with queue depth.
121
+
122
+ **Strengths:** Produces perfectly smooth, constant-rate output. Simple to implement.
123
+ Prevents any burst from reaching the backend.
124
+
125
+ **Weaknesses:** Cannot handle legitimate burst traffic -- even a briefly idle client
126
+ gets no burst allowance. Adds latency because requests wait in the queue. Less flexible
127
+ than token bucket for APIs where occasional bursts are acceptable.
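
A sketch of the counter-based variant mentioned above, assuming the bucket is used as a meter (requests are admitted or rejected immediately rather than held in a real queue). `LeakyBucket` and `tryEnqueue` are hypothetical names chosen for illustration.

```javascript
// Counter-based leaky bucket (illustrative sketch): tracks a water level
// instead of storing queued requests.
class LeakyBucket {
  constructor(capacity, leakPerSecond, nowMs = Date.now()) {
    this.capacity = capacity;           // B: maximum queued requests
    this.leakPerSecond = leakPerSecond; // R: requests drained per second
    this.level = 0;                     // current (possibly fractional) depth
    this.lastLeakMs = nowMs;
  }

  // Returns true if the request fits in the bucket, false if it overflows.
  tryEnqueue(nowMs = Date.now()) {
    // Step 1: drain according to the time elapsed since the last call.
    const elapsedSec = Math.max(0, nowMs - this.lastLeakMs) / 1000;
    this.level = Math.max(0, this.level - elapsedSec * this.leakPerSecond);
    this.lastLeakMs = nowMs;
    // Steps 2-3: admit only if one more request still fits.
    if (this.level + 1 <= this.capacity) {
      this.level += 1;
      return true;
    }
    return false;
  }
}
```

A queue-backed implementation would additionally delay each admitted request until its drain slot, which is where the added latency comes from.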
128
+
129
+ ### 3. Fixed Window Counter
130
+
131
+ **How it works:** Time is divided into fixed windows of duration `W` (e.g., 60 seconds).
132
+ A counter tracks requests in the current window. When the counter reaches the limit `L`,
133
+ subsequent requests are rejected until the window resets.
134
+
135
+ ```
136
+ Parameters:
137
+ - Window size (W): e.g., 60 seconds
138
+ - Limit (L): e.g., 100 requests per window
139
+
140
+ State per client:
141
+ - counter: integer
142
+ - window_start: timestamp
143
+
144
+ Per-request logic:
145
+ 1. If current_time >= window_start + W: reset counter to 0, update window_start
146
+ 2. If counter < L: increment counter, allow request
147
+ 3. Else: reject request
148
+ ```
149
+
150
+ **Memory per client:** ~16 bytes (1 integer + 1 timestamp). The most memory-efficient
151
+ algorithm.
152
+
153
+ **Strengths:** Extremely simple. O(1) time and space. Maps directly onto Redis
154
+ INCR + EXPIRE (2 commands).
155
+
156
+ **Weaknesses:** Suffers from the **boundary burst problem**. A client can send L requests
157
+ at the end of window N and L requests at the start of window N+1, effectively sending 2L
158
+ requests in a span much shorter than W. Any interval of length W that straddles the
159
+ boundary can therefore see up to 2x the intended rate. For a 100 req/min limit, a client
160
+ could send 200 requests within a few seconds around the boundary.
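
The boundary burst is easy to reproduce. The sketch below assumes windows aligned to multiples of the window size; `FixedWindowCounter` is an illustrative name, not an existing API.

```javascript
// Fixed window counter (illustrative sketch) with explicit timestamps.
class FixedWindowCounter {
  constructor(limit, windowMs) {
    this.limit = limit;       // L: requests allowed per window
    this.windowMs = windowMs; // W: window size in milliseconds
    this.windowStart = null;
    this.count = 0;
  }

  allow(nowMs) {
    // Windows are aligned to multiples of windowMs (an assumption of this sketch).
    const start = Math.floor(nowMs / this.windowMs) * this.windowMs;
    if (start !== this.windowStart) {
      this.windowStart = start; // a new window began: reset the counter
      this.count = 0;
    }
    if (this.count < this.limit) {
      this.count += 1;
      return true;
    }
    return false;
  }
}

// Boundary burst demo: 100 requests just before the minute boundary and
// 100 just after it are all admitted, although they are only 200 ms apart.
function boundaryBurstAdmitted() {
  const limiter = new FixedWindowCounter(100, 60000);
  let admitted = 0;
  for (let i = 0; i < 100; i++) if (limiter.allow(59900)) admitted += 1;
  for (let i = 0; i < 100; i++) if (limiter.allow(60100)) admitted += 1;
  return admitted;
}
```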
161
+
162
+ ### 4. Sliding Window Log
163
+
164
+ **How it works:** Every request timestamp is stored in a sorted set. When a new request
165
+ arrives, all timestamps older than `(now - W)` are removed. If the remaining count is
166
+ below the limit, the request is allowed and its timestamp is added.
167
+
168
+ ```
169
+ Parameters:
170
+ - Window size (W): e.g., 60 seconds
171
+ - Limit (L): e.g., 100 requests per window
172
+
173
+ State per client:
174
+ - sorted set of timestamps (e.g., Redis ZSET)
175
+
176
+ Per-request logic:
177
+ 1. Remove all entries with timestamp < (now - W)
178
+ 2. Count remaining entries
179
+ 3. If count < L: add current timestamp, allow request
180
+ 4. Else: reject request
181
+ ```
182
+
183
+ **Memory per client:** O(L) -- stores up to L timestamps. At 100 requests per window with
184
+ 8-byte timestamps, that is ~800 bytes per client. At 1M users, that is ~800 MB. At
185
+ 10,000 requests per window, memory grows to ~80 KB per client.
186
+
187
+ **Strengths:** Perfectly accurate. No boundary burst problem. Exact sliding window
188
+ semantics.
189
+
190
+ **Weaknesses:** Memory-intensive: O(L) per client where L is the rate limit. Requires
191
+ cleanup of expired entries on every request, which is O(L) worst case. Not practical for
192
+ high-limit scenarios (e.g., 10,000 req/min). The ZREMRANGEBYSCORE operation on large sets
193
+ adds latency.
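
An in-memory sketch of the log approach, assuming timestamps arrive in non-decreasing order so that expired entries are always at the front of the array. `SlidingWindowLog` is a hypothetical name; entries at or before `now - W` are evicted, mirroring the inclusive range that `ZREMRANGEBYSCORE` removes.

```javascript
// Sliding window log (illustrative sketch): one stored timestamp per admitted request.
class SlidingWindowLog {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // kept in arrival order, oldest first
  }

  allow(nowMs) {
    const cutoff = nowMs - this.windowMs;
    // Step 1: drop entries that have left the window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    // Steps 2-4: admit only while fewer than `limit` entries remain.
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(nowMs);
      return true;
    }
    return false;
  }
}
```

Unlike the fixed window, a burst of 100 requests at 59.9 s blocks every request until those entries age out a full window later.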
194
+
195
+ ### 5. Sliding Window Counter
196
+
197
+ **How it works:** Combines fixed window counter with weighted overlap calculation. Maintains
198
+ counters for the current and previous windows. The effective count is calculated as:
199
+
200
+ ```
201
+ effective_count = (previous_window_count * overlap_percentage) + current_window_count
202
+
203
+ Where overlap_percentage = 1 - (elapsed_time_in_current_window / window_size)
204
+ ```
205
+
206
+ For example, if we are 30 seconds into a 60-second window, overlap is 50%. If the previous
207
+ window had 80 requests and the current has 30, the effective count is (80 * 0.5) + 30 = 70.
208
+
209
+ ```
210
+ Parameters:
211
+ - Window size (W): e.g., 60 seconds
212
+ - Limit (L): e.g., 100 requests per window
213
+
214
+ State per client:
215
+ - previous_counter: integer
216
+ - current_counter: integer
217
+ - current_window_start: timestamp
218
+
219
+ Per-request logic:
220
+ 1. If current_time >= window_start + W: rotate counters
221
+ 2. Calculate overlap_pct = 1 - ((now - window_start) / W)
222
+ 3. effective = (prev_counter * overlap_pct) + current_counter
223
+ 4. If effective < L: increment current_counter, allow request
224
+ 5. Else: reject request
225
+ ```
226
+
227
+ **Memory per client:** ~20 bytes (2 integers + 1 timestamp). O(1) space -- same order as
228
+ fixed window.
229
+
230
+ **Strengths:** Smooths boundary bursts with minimal memory overhead. O(1) time and space.
231
+ In Cloudflare's analysis, only 0.003% of requests were wrongly allowed or rate limited
232
+ compared to an exact sliding window (source: Cloudflare engineering). Best balance of accuracy and
233
+ efficiency.
234
+
235
+ **Weaknesses:** Approximate -- not perfectly accurate. Assumes uniform distribution of
236
+ requests within the previous window. Slightly more complex to implement than fixed window.
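
The weighted calculation can be sketched as follows, assuming windows aligned to multiples of the window size. `effectiveCount` and `SlidingWindowCounter` are illustrative names rather than an existing API.

```javascript
// Weighted estimate of requests in the trailing window (illustrative sketch).
function effectiveCount(prevCount, currCount, elapsedMs, windowMs) {
  const overlap = 1 - elapsedMs / windowMs; // share of the previous window still in view
  return prevCount * overlap + currCount;
}

class SlidingWindowCounter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windowStart = null;
    this.prev = 0;
    this.curr = 0;
  }

  allow(nowMs) {
    const start = Math.floor(nowMs / this.windowMs) * this.windowMs;
    if (this.windowStart === null) {
      this.windowStart = start;
    } else if (start !== this.windowStart) {
      // Rotate counters; if more than one window passed, the previous one is empty.
      this.prev = start - this.windowStart === this.windowMs ? this.curr : 0;
      this.curr = 0;
      this.windowStart = start;
    }
    const effective = effectiveCount(this.prev, this.curr, nowMs - this.windowStart, this.windowMs);
    if (effective < this.limit) {
      this.curr += 1;
      return true;
    }
    return false;
  }
}
```

With the worked example above, `effectiveCount(80, 30, 30000, 60000)` evaluates to 70.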
237
+
238
+ ---
239
+
240
+ ## Algorithm Performance Comparison
241
+
242
+ | Algorithm | Memory/Client | Time Complexity | Burst Handling | Accuracy | Impl. Complexity |
243
+ |-----------------------|---------------|-----------------|-----------------|---------------|------------------|
244
+ | Token Bucket | ~32 bytes | O(1) | Allows bursts | Exact (rate) | Low-Medium |
245
+ | Leaky Bucket | ~32 bytes | O(1) | Smooths bursts | Exact (rate) | Low |
246
+ | Fixed Window Counter | ~16 bytes | O(1) | Boundary bursts | 2x overshoot | Very Low |
247
+ | Sliding Window Log | ~8L bytes | O(L) cleanup | No bursts | Perfect | Medium |
248
+ | Sliding Window Counter| ~20 bytes | O(1) | Smoothed | ~99.997% | Low-Medium |
249
+
250
+ ### Memory at Scale (1M concurrent users)
251
+
252
+ | Algorithm | Memory @ 100 req/min limit | Memory @ 10K req/min limit |
253
+ |-----------------------|----------------------------|----------------------------|
254
+ | Token Bucket | ~32 MB | ~32 MB |
255
+ | Leaky Bucket | ~32 MB | ~32 MB |
256
+ | Fixed Window Counter | ~16 MB | ~16 MB |
257
+ | Sliding Window Log | ~800 MB | ~80 GB |
258
+ | Sliding Window Counter| ~20 MB | ~20 MB |
259
+
260
+ The sliding window log is the only algorithm whose memory scales with the rate limit value
261
+ itself. All others use constant space per client regardless of the limit.
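
The figures in the table follow from simple multiplication. A small sketch of that arithmetic, assuming 8-byte timestamps and decimal units (1 MB = 10^6 bytes); the function names are illustrative.

```javascript
// Memory estimates behind the table above (illustrative arithmetic only).
function slidingLogBytes(limit, clients, bytesPerTimestamp = 8) {
  return limit * bytesPerTimestamp * clients; // grows with the limit itself
}

function constantStateBytes(bytesPerClient, clients) {
  return bytesPerClient * clients; // independent of the limit
}

// Formats a byte count using decimal units.
function formatBytes(bytes) {
  const units = ['B', 'KB', 'MB', 'GB', 'TB'];
  let value = bytes;
  let index = 0;
  while (value >= 1000 && index < units.length - 1) {
    value /= 1000;
    index += 1;
  }
  return `${value} ${units[index]}`;
}
```

For 1M clients, `formatBytes(slidingLogBytes(100, 1e6))` gives `800 MB` and `formatBytes(slidingLogBytes(10000, 1e6))` gives `80 GB`, while `formatBytes(constantStateBytes(32, 1e6))` stays at `32 MB` for any limit.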
262
+
263
+ ### Throughput Overhead
264
+
265
+ A well-implemented rate limiter adds minimal latency:
266
+
267
+ - **In-memory (single-process):** 50-200 nanoseconds per check. Negligible overhead.
268
+ - **Redis single-node:** 0.1-0.5ms per check (network round trip dominates). At sub-ms
269
+ latency, this supports 50,000+ rate limit checks per second per Redis node.
270
+ - **Redis with Lua script:** Single round trip regardless of algorithm complexity. P95
271
+ latency < 2ms, P99 < 5ms in production (source: Redis benchmarks).
272
+ - **Redis cluster:** Add ~0.1ms for slot redirection on first access. Throughput scales
273
+ linearly with nodes.
274
+
275
+ ---
276
+
277
+ ## Distributed Rate Limiting with Redis
278
+
279
+ ### Why Redis
280
+
281
+ Redis is the de facto choice for distributed rate limiting because:
282
+
283
+ 1. **Sub-millisecond latency:** Redis processes commands in the sub-microsecond range
284
+ internally. Unix domain socket latency is ~30 microseconds; network latency on 1 Gbit/s
285
+ is ~200 microseconds (source: Redis latency documentation).
286
+ 2. **Atomic operations:** Individual commands such as INCR, EXPIRE, ZADD, and ZREMRANGEBYSCORE each execute atomically.
287
+ 3. **Lua scripting:** EVAL/EVALSHA run multi-step logic atomically on the server, eliminating
288
+ race conditions and reducing round trips from 3-4 to 1.
289
+ 4. **Built-in expiration:** TTL-based key expiry handles window rotation automatically.
290
+
291
+ ### The Race Condition Problem
292
+
293
+ Without atomicity, concurrent requests create a classic TOCTOU (time-of-check/time-of-use)
294
+ race:
295
+
296
+ ```
297
+ Thread A: GET counter -> 99 (under limit of 100)
298
+ Thread B: GET counter -> 99 (under limit of 100)
299
+ Thread A: INCR counter -> 100 (allowed)
300
+ Thread B: INCR counter -> 101 (SHOULD have been rejected)
301
+ ```
302
+
303
+ At 10,000 requests per second, this race condition can allow 5-15% over-admission in
304
+ testing (source: Halodoc engineering blog).
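
The interleaving above can be reproduced deterministically. This sketch models each request as a generator that pauses between its check and its update; `runRoundRobin` and the `store` object are hypothetical stand-ins for concurrent clients and a shared counter, not Redis itself.

```javascript
// Non-atomic: read the counter, let other requests run, then write.
function* checkThenIncrement(store, limit) {
  const seen = store.value; // time of check
  yield;                    // another request may run here
  if (seen < limit) {
    store.value += 1;       // time of use
    return true;
  }
  return false;
}

// Atomic: increment and capture the new value in one uninterruptible step,
// which is the same shape as INCR followed by a comparison.
function* incrementThenCheck(store, limit) {
  store.value += 1;
  const updated = store.value;
  yield;
  return updated <= limit;
}

// Advances every task one step at a time to simulate interleaved execution.
function runRoundRobin(tasks) {
  const results = new Array(tasks.length);
  let pending = tasks.map((task, index) => [task, index]);
  while (pending.length > 0) {
    pending = pending.filter(([task, index]) => {
      const step = task.next();
      if (step.done) {
        results[index] = step.value;
        return false;
      }
      return true;
    });
  }
  return results;
}
```

Starting from a counter of 99 with a limit of 100, two interleaved `checkThenIncrement` tasks are both admitted, while two `incrementThenCheck` tasks admit only the first.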
305
+
306
+ ### Lua Script Solutions
307
+
308
+ #### Fixed Window with Lua
309
+
310
+ ```lua
311
+ -- KEYS[1] = rate limit key
312
+ -- ARGV[1] = limit
313
+ -- ARGV[2] = window size in seconds
314
+ local current = redis.call('INCR', KEYS[1])
315
+ if current == 1 then
316
+   redis.call('EXPIRE', KEYS[1], ARGV[2])
317
+ end
318
+ if current > tonumber(ARGV[1]) then
319
+   return 0 -- rejected
320
+ end
321
+ return 1 -- allowed
322
+ ```
323
+
324
+ #### Sliding Window Log with Lua
325
+
326
+ ```lua
327
+ -- KEYS[1] = rate limit key
328
+ -- ARGV[1] = limit
329
+ -- ARGV[2] = window size in milliseconds
330
+ -- ARGV[3] = current timestamp in milliseconds
331
+ -- ARGV[4] = unique request ID
332
+ local window_start = tonumber(ARGV[3]) - tonumber(ARGV[2])
333
+ redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', window_start)
334
+ local count = redis.call('ZCARD', KEYS[1])
335
+ if count < tonumber(ARGV[1]) then
336
+   redis.call('ZADD', KEYS[1], ARGV[3], ARGV[4])
337
+   redis.call('PEXPIRE', KEYS[1], ARGV[2])
338
+   return 1 -- allowed
339
+ end
340
+ return 0 -- rejected
341
+ ```
342
+
343
+ #### Token Bucket with Lua
344
+
345
+ ```lua
346
+ -- KEYS[1] = bucket key
347
+ -- ARGV[1] = bucket capacity
348
+ -- ARGV[2] = refill rate (tokens/sec)
349
+ -- ARGV[3] = current timestamp (seconds, float)
350
+ -- ARGV[4] = tokens to consume
351
+ local bucket = redis.call('HMGET', KEYS[1], 'tokens', 'last_refill')
352
+ local tokens = tonumber(bucket[1]) or tonumber(ARGV[1])
353
+ local last_refill = tonumber(bucket[2]) or tonumber(ARGV[3])
354
+ local now = tonumber(ARGV[3])
355
+ local elapsed = math.max(0, now - last_refill)
356
+ tokens = math.min(tonumber(ARGV[1]), tokens + elapsed * tonumber(ARGV[2]))
357
+ local allowed = 0
358
+ if tokens >= tonumber(ARGV[4]) then
359
+   tokens = tokens - tonumber(ARGV[4])
360
+   allowed = 1
361
+ end
362
+ redis.call('HSET', KEYS[1], 'tokens', tokens, 'last_refill', now) -- HSET takes multiple pairs since Redis 4.0; HMSET is deprecated
363
+ redis.call('EXPIRE', KEYS[1], math.ceil(tonumber(ARGV[1]) / tonumber(ARGV[2])) * 2)
364
+ return allowed
365
+ ```
366
+
367
+ ### EVAL vs EVALSHA
368
+
369
+ - **EVAL** sends the full script source on every call. This wastes bandwidth --
370
+ a typical rate limiter Lua script is 300-800 bytes.
371
+ - **EVALSHA** sends only the 40-byte SHA1 hash after the script is loaded via SCRIPT LOAD.
372
+ At 50,000 requests/second, this saves ~13-38 MB/s of bandwidth.
373
+
374
+ **Production pattern:** Load scripts at application startup, cache SHA1 hashes, call
375
+ EVALSHA. If Redis returns NOSCRIPT (after a restart or failover), fall back to EVAL once
376
+ and re-cache the hash.
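
A sketch of that pattern. The `client` object here is a hypothetical minimal interface (`scriptLoad`, `evalsha`, `eval`, each returning a promise), not the API of a specific Redis library; it is assumed to reject with an error whose message starts with `NOSCRIPT` when the script is missing from the server cache.

```javascript
// EVALSHA with an EVAL fallback (illustrative sketch against a hypothetical client).
class CachedScript {
  constructor(source) {
    this.source = source;
    this.sha = null; // filled in on first use
  }

  async run(client, keys, args) {
    if (this.sha === null) {
      // Normally done once at application startup via SCRIPT LOAD.
      this.sha = await client.scriptLoad(this.source);
    }
    try {
      return await client.evalsha(this.sha, keys, args);
    } catch (err) {
      if (!String(err && err.message).startsWith('NOSCRIPT')) throw err;
      // The server lost its script cache (restart or failover). EVAL runs the
      // script and caches it again under the same SHA1, so the stored hash
      // remains valid for later EVALSHA calls.
      return await client.eval(this.source, keys, args);
    }
  }
}
```

Errors other than `NOSCRIPT` are rethrown so that genuine script or connection failures are not masked by the fallback.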
377
+
378
+ ### Handling Hot Keys
379
+
380
+ When millions of users share a single rate limit key (e.g., global API limit), that key
381
+ becomes a hot key on a single Redis shard. Mitigation strategies:
382
+
383
+ 1. **Key sharding:** Append a shard suffix (e.g., `ratelimit:global:0` through
384
+ `ratelimit:global:7`) and sum counters across shards. Distributes load across 8 keys.
385
+ 2. **Local pre-aggregation:** Aggregate counts in-process for 100-500ms, then flush to
386
+ Redis. Reduces Redis operations by 10-50x at the cost of accuracy.
387
+ 3. **Redis Cluster with read replicas:** Route read-heavy operations (checking remaining
388
+ quota) to replicas. Writes still go to the primary.
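
Key sharding can be sketched as follows, with a plain `Map` standing in for Redis. The helper names and the `base:index` key layout are assumptions made for illustration.

```javascript
// Sharded counter (illustrative sketch; a Map stands in for Redis).
function shardedKey(base, shardCount, rand = Math.random) {
  // Each increment lands on one randomly chosen shard.
  return `${base}:${Math.floor(rand() * shardCount)}`;
}

function incrementSharded(store, base, shardCount, rand = Math.random) {
  const key = shardedKey(base, shardCount, rand);
  store.set(key, (store.get(key) || 0) + 1);
}

// Reading the global count requires one lookup per shard.
function totalAcrossShards(store, base, shardCount) {
  let total = 0;
  for (let i = 0; i < shardCount; i++) {
    total += store.get(`${base}:${i}`) || 0;
  }
  return total;
}
```

The trade-off is that writes spread out while every read of the total costs `shardCount` lookups, so the shard count should stay small.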
389
+
390
+ ---
391
+
392
+ ## Client-Side Throttling and Debouncing
393
+
394
+ ### Debouncing
395
+
396
+ Debouncing delays execution until activity stops for a specified period. Ideal for
397
+ search-as-you-type or autocomplete where only the final input matters.
398
+
399
+ ```javascript
400
+ function debounce(fn, delayMs) {
401
+   let timer;
402
+   return (...args) => {
403
+     clearTimeout(timer);
404
+     timer = setTimeout(() => fn(...args), delayMs);
405
+   };
406
+ }
407
+ // Usage: debounce(searchAPI, 300) -- waits 300ms after last keystroke
408
+ ```
409
+
410
+ **Impact:** A user typing 5 characters in 1 second generates 1 API call instead of 5.
411
+ At 100,000 concurrent users, debouncing at 300ms reduces search API traffic by 60-80%.
412
+
413
+ ### Throttling (Client-Side)
414
+
415
+ Throttling limits execution to at most once per time interval, regardless of how many
416
+ times the function is invoked.
417
+
418
+ ```javascript
419
+ function throttle(fn, intervalMs) {
420
+   let lastCall = 0;
421
+   return (...args) => {
422
+     const now = Date.now();
423
+     if (now - lastCall >= intervalMs) {
424
+       lastCall = now;
425
+       fn(...args);
426
+     }
427
+   };
428
+ }
429
+ // Usage: throttle(trackScroll, 100) -- at most 10 calls/sec on scroll
430
+ ```
431
+
432
+ **Impact:** Scroll-tracking that fires 60 times/second (matching 60fps) is reduced to
433
+ 10 calls/second with 100ms throttling -- an 83% reduction in API calls.
434
+
435
+ ### Exponential Backoff with Jitter
436
+
437
+ When a client receives HTTP 429, it should retry with exponential backoff:
438
+
439
+ ```
440
+ delay = min(base_delay * 2^attempt, max_delay) + random_jitter
441
+
442
+ Where:
443
+ base_delay = 1 second (typical)
444
+ max_delay = 32-64 seconds (cap)
445
+ jitter = random(0, delay * 0.5) -- prevents thundering herd
446
+ ```
447
+
448
+ **Without jitter:** If 1,000 clients hit a rate limit simultaneously and all wait
449
+ exactly 1s, 2s, 4s, 8s between attempts, the server sees synchronized spikes of 1,000
450
+ requests at t=1s, t=3s, t=7s, t=15s -- the thundering herd problem.
451
+
452
+ **With jitter:** The same 1,000 retries spread across a time range. At attempt 3 with
453
+ base delay 1s: delay = 8s + random(0, 4s), so retries spread across the 8-12s window.
454
+ This reduces peak retry load by 60-80%.
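
A sketch of the delay calculation, reading the formula as "capped exponential delay plus up to 50% jitter". The function name and option names are illustrative; the random source is injectable so the result can be checked deterministically.

```javascript
// Exponential backoff with additive jitter (illustrative sketch).
function backoffDelayMs(attempt, options = {}) {
  const { baseMs = 1000, maxMs = 32000, rand = Math.random } = options;
  // Exponential growth, capped at maxMs.
  const capped = Math.min(baseMs * 2 ** attempt, maxMs);
  // Jitter drawn uniformly from [0, 50% of the capped delay).
  const jitter = rand() * capped * 0.5;
  return capped + jitter;
}
```

At attempt 3 with the defaults, the result falls between 8000 ms and 12000 ms, matching the 8-12s range described above.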
455
+
456
+ ### Adaptive Client-Side Rate Limiting (Google SRE)
457
+
458
+ Google's approach from the SRE book implements client-side throttling based on observed
459
+ rejection rate:
460
+
461
+ ```
462
+ client_rejection_probability = max(0, (requests - K * accepts) / (requests + 1))
463
+
464
+ Where:
465
+ requests = total requests in recent window
466
+ accepts = requests that were accepted (not rate-limited)
467
+ K = multiplier (typically 2.0)
468
+ ```
469
+
470
+ When the backend starts rejecting requests, the client proactively reduces its own send
471
+ rate. At K=2, the client starts self-throttling when more than 50% of requests are
472
+ rejected. This prevents wasted work (sending requests that will be rejected) and reduces
473
+ server load from processing rejections.
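
In the SRE book this quantity is the probability that the client rejects a request locally instead of sending it. A minimal sketch, with illustrative function names and an injectable random source:

```javascript
// Client-side adaptive throttling (illustrative sketch of the formula above).
function rejectionProbability(requests, accepts, k = 2) {
  return Math.max(0, (requests - k * accepts) / (requests + 1));
}

// Decides whether to drop a request locally before it reaches the backend.
function shouldDropLocally(requests, accepts, k = 2, rand = Math.random) {
  return rand() < rejectionProbability(requests, accepts, k);
}
```

With K = 2 the probability stays at 0 as long as at least half of recent requests were accepted; with 100 requests and 25 accepts it is 50/101, roughly 0.495.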
474
+
475
+ ---
476
+
477
+ ## API Rate Limit Headers
478
+
479
+ ### Current Standard (IETF Draft)
480
+
481
+ The IETF HTTPAPI Working Group is standardizing rate limit headers via
482
+ `draft-ietf-httpapi-ratelimit-headers` (currently at draft-10, expires March 2026).
483
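+ Earlier revisions of the draft define the separate fields below; later revisions fold limit, remaining, and reset into a single `RateLimit` field alongside `RateLimit-Policy`: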
484
+
485
+ | Header | Purpose | Example Value |
486
+ |-------------------|--------------------------------------------------|----------------------|
487
+ | `RateLimit-Limit` | Maximum requests allowed in the current window | `100` |
488
+ | `RateLimit-Remaining` | Requests remaining in the current window | `47` |
489
+ | `RateLimit-Reset` | Seconds until the rate limit window resets | `30` |
490
+ | `RateLimit-Policy`| Describes the rate limit policy | `100;w=60` |
491
+ | `Retry-After` | Seconds to wait before retrying (on 429) | `30` |
492
+
493
+ ### Legacy Headers (Still Widely Used)
494
+
495
+ Before the IETF draft, APIs used non-standard `X-RateLimit-*` headers:
496
+
497
+ ```
498
+ X-RateLimit-Limit: 1000
499
+ X-RateLimit-Remaining: 742
500
+ X-RateLimit-Reset: 1672531200 (Unix epoch -- inconsistent across APIs)
501
+ ```
502
+
503
+ The key inconsistency: `X-RateLimit-Reset` is sometimes a Unix timestamp (GitHub, Twitter)
504
+ and sometimes delta-seconds (others). The IETF draft standardizes on delta-seconds for
505
+ `RateLimit-Reset`, matching the delta-seconds form of `Retry-After` in RFC 9110.
506
+
507
+ ### HTTP Status Codes
508
+
509
+ | Code | Meaning | When to Use |
510
+ |------|-------------------------|--------------------------------------------------|
511
+ | 429 | Too Many Requests | Client exceeded rate limit (RFC 6585) |
512
+ | 503 | Service Unavailable | Server-side overload / load shedding |
513
+ | 403 | Forbidden | Some APIs use this instead of 429 (not ideal) |
514
+
515
+ **Best practice:** Always return 429 for rate limiting, 503 for load shedding. Include
516
+ `Retry-After` header with both. Include `RateLimit-*` headers on ALL responses (not just
517
+ 429s) so clients can proactively manage their request rate.
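
A sketch of a helper that builds these headers for every response, using the field names from the table above. The function name and the shape of its input object are assumptions for illustration.

```javascript
// Builds rate limit response headers (illustrative sketch).
function rateLimitHeaders(state, rejected = false) {
  const { limit, remaining, resetSeconds, windowSeconds } = state;
  const headers = {
    'RateLimit-Limit': String(limit),
    'RateLimit-Remaining': String(Math.max(0, remaining)),
    'RateLimit-Reset': String(resetSeconds),          // delta-seconds
    'RateLimit-Policy': `${limit};w=${windowSeconds}`,
  };
  if (rejected) {
    // Only rejected (429) responses carry Retry-After in this sketch.
    headers['Retry-After'] = String(resetSeconds);
  }
  return headers;
}
```

Calling it on successful responses as well as on 429s gives clients the remaining quota before they hit the limit.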
518
+
519
+ ### Response Body Best Practice
520
+
521
+ ```json
522
+ {
523
+   "error": {
524
+     "code": "rate_limit_exceeded",
525
+     "message": "Rate limit exceeded. Limit: 100 requests per 60 seconds.",
526
+     "retry_after": 30,
527
+     "limit": 100,
528
+     "remaining": 0,
529
+     "reset": 30
530
+   }
531
+ }
532
+ ```
533
+
534
+ ---
535
+
536
+ ## Backpressure Mechanisms
537
+
538
+ ### TCP-Level Backpressure
539
+
540
+ TCP flow control is the original backpressure mechanism. When a receiver's buffer fills,
541
+ it advertises a smaller window size, causing the sender to slow down. This is invisible
542
+ to the application layer but can cause connection pooling issues when TCP windows shrink
543
+ under load.
544
+
545
+ ### Application-Level Backpressure
546
+
547
+ #### Queue-Based Backpressure
548
+
549
+ Bounded queues with rejection policies provide explicit backpressure:
550
+
551
+ ```
552
+ Queue capacity: 10,000 items
553
+ Current depth: 9,500 items (95% full)
554
+ Action: Return 503 to new requests, signal upstream to reduce rate
555
+ ```
556
+
557
+ When queue depth exceeds 80% capacity, start rejecting low-priority requests. At 95%,
558
+ reject all non-critical requests. This graduated response prevents queue overflow while
559
+ maintaining service for critical traffic.
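The graduated response above can be sketched as a single admission check. This is an illustration under stated assumptions: the three-level `Priority` enum and the `admit` function are hypothetical names, and the 80% and 95% thresholds are taken from the paragraph above.

```python
from enum import IntEnum


class Priority(IntEnum):
    LOW = 0
    NORMAL = 1
    CRITICAL = 2


def admit(queue_depth: int, capacity: int, priority: Priority) -> bool:
    """Decide whether to enqueue a request given the current queue depth."""
    # Integer arithmetic avoids float rounding at the exact thresholds.
    if queue_depth >= capacity:
        return False  # queue is full: reject everything
    if queue_depth * 100 >= capacity * 95:
        return priority == Priority.CRITICAL  # at 95%: critical traffic only
    if queue_depth * 100 > capacity * 80:
        return priority > Priority.LOW  # above 80%: shed low priority
    return True
```

A rejected request would typically be answered with 503 and a `Retry-After` header, as described in the status code table.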

560
+
561
+ #### Reactive Streams Backpressure
562
+
563
+ Reactive frameworks (RxJava, Project Reactor, Akka Streams) implement demand-based
564
+ backpressure where consumers explicitly request N items from producers:
565
+
566
+ - `request(10)` -- consumer can handle 10 more items
567
+ - Producer sends at most 10 items, then waits for more demand
568
+ - If the consumer is slow, the producer slows with it instead of filling an unbounded buffer
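The demand signal above can be illustrated without any reactive framework. The sketch below is not the Reactive Streams API; `DemandDrivenProducer` is a hypothetical class that only shows the core idea, namely that the producer emits nothing until the consumer asks for it.

```python
from typing import Iterator, List


class DemandDrivenProducer:
    """Emits items only in response to explicit demand from the consumer."""

    def __init__(self, source: Iterator[int]):
        self._source = source
        self.emitted = 0  # total items handed out so far

    def request(self, n: int) -> List[int]:
        """Return at most n items; never produce ahead of demand."""
        batch: List[int] = []
        for _ in range(n):
            try:
                batch.append(next(self._source))
            except StopIteration:
                break  # source exhausted
        self.emitted += len(batch)
        return batch


producer = DemandDrivenProducer(iter(range(1000)))
first_batch = producer.request(10)  # consumer can handle 10 more items
```

Because the consumer calls `request` only when it has capacity, a slow consumer automatically throttles the producer.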
569
+
570
+ #### Load Shedding as Backpressure (Stripe Model)
571
+
572
+ Stripe runs 4 tiers of traffic control in production, two rate limiters and two load shedders (source: Stripe engineering blog):
573
+
574
+ 1. **Request Rate Limiter:** Token bucket, N requests/second per user.
575
+ 2. **Concurrent Request Limiter:** Caps simultaneous in-flight requests per user to
576
+ manage CPU-intensive endpoints.
577
+ 3. **Fleet Usage Load Shedder:** Reserves 20% of fleet capacity for critical requests.
578
+ Non-critical requests are rejected with 503 when the fleet exceeds 80% utilization.
579
+ 4. **Worker Utilization Load Shedder:** Final defense. When individual workers are
580
+ overloaded, they progressively shed lower-priority traffic, starting with test-mode
581
+ requests.
582
+
583
+ This tiered approach means Stripe can handle 100+ billion API requests without cascading
584
+ failures (source: Stripe engineering).
585
+
586
+ ---
587
+
588
+ ## Common Bottlenecks
589
+
590
+ ### 1. The Rate Limiter Becomes the Bottleneck
591
+
592
+ **Problem:** If the rate limiter itself is slower than the services it protects, it adds
593
+ latency to every request -- including those well within their limits.
594
+
595
+ **Symptoms:**
596
+ - P99 latency increases by 5-50ms under load
597
+ - Rate limiter Redis CPU exceeds 80%
598
+ - Rate limiter timeouts cause cascading failures
599
+
600
+ **Solutions:**
601
+ - Use in-memory rate limiting for single-instance services (~200ns per check vs ~500us
602
+ for Redis)
603
+ - Use Lua scripts to reduce Redis round trips from 3-4 to 1
604
+ - Set aggressive timeouts on Redis calls (e.g., 5ms). If Redis is unavailable, fail
605
+ open (allow request) rather than blocking
606
+ - Pre-compute rate limit decisions and cache them for 100-500ms
607
+
608
+ ### 2. Redis Round-Trip Latency
609
+
610
+ **Problem:** Each rate limit check requires a Redis round trip. At 50,000 requests/second,
611
+ that is 50,000 Redis operations/second, consuming significant network bandwidth and Redis
612
+ CPU.
613
+
614
+ **Measurements:**
615
+ - Same-datacenter Redis: 0.2-0.5ms per round trip
616
+ - Cross-datacenter Redis: 1-5ms per round trip
617
+ - Redis Lua script (single round trip): 0.3-1ms regardless of algorithm complexity
618
+
619
+ **Solutions:**
620
+ - **Pipeline commands:** Batch multiple rate limit checks into a single Redis pipeline.
621
+ Reduces amortized per-check latency by 3-5x.
622
+ - **Local token cache:** Each application instance maintains a local allocation of tokens
623
+ (e.g., 1/N of the global limit where N = number of instances). Refresh from Redis every
624
+ 1-5 seconds. Reduces Redis calls by 100-1000x.
625
+ - **Redis Cluster:** Shard rate limit keys across nodes. Linear throughput scaling.
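The local token cache can be sketched as follows. This is a simplified illustration, not a production design: `LocalTokenCache` is a hypothetical class, the share is a static 1/N split, and the point where a real instance would synchronize with Redis is only marked by a counter.

```python
import time


class LocalTokenCache:
    """Per-instance allocation of a global limit, refreshed on a fixed interval."""

    def __init__(self, global_limit_per_sec: float, instances: int,
                 refresh_interval: float = 1.0, clock=time.monotonic):
        # Each instance may spend 1/N of the global budget per refresh interval.
        self._share = global_limit_per_sec / instances * refresh_interval
        self._interval = refresh_interval
        self._clock = clock
        self._tokens = self._share
        self._last_refresh = clock()
        self.remote_calls = 0  # how often we would have contacted Redis

    def _refresh_if_due(self) -> None:
        now = self._clock()
        if now - self._last_refresh >= self._interval:
            # Assumption: a real deployment would sync with Redis here.
            self.remote_calls += 1
            self._tokens = self._share
            self._last_refresh = now

    def allow(self) -> bool:
        self._refresh_if_due()
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

With this shape, the number of remote calls depends on the refresh interval rather than on the request rate, which is where the reduction in Redis traffic comes from.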
626
+
627
+ ### 3. Hot Keys in Redis
628
+
629
+ **Problem:** A single popular API key or a global rate limit creates a "hot key" that
630
+ concentrates all writes on one Redis shard.
631
+
632
+ **Impact:** A single Redis shard handles ~100,000-200,000 operations/second. A global
633
+ rate limit at 500,000 requests/second exceeds this capacity.
634
+
635
+ **Solutions:**
636
+ - **Key sharding:** Split `ratelimit:global` into `ratelimit:global:{0..15}` and sum
637
+ across shards. Each shard handles 1/16 of the traffic.
638
+ - **Probabilistic local counting:** Maintain an approximate local counter (e.g., a simple
638
+ counter with periodic sync, or HyperLogLog over unique request IDs). Accept ~1-5% inaccuracy.
640
+ - **Two-tier limiting:** Coarse local limit (in-memory) + fine-grained global limit
641
+ (Redis). Local limit catches 90% of rejections without touching Redis.
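A minimal sketch of the key sharding idea, with a plain dictionary standing in for Redis so the example runs on its own. The helper names are hypothetical; the comments note which Redis command each step would map to.

```python
import random

NUM_SHARDS = 16  # matches the 16-way split described above


def shard_key(base: str, shard: int) -> str:
    return f"{base}:{shard}"


def increment(store: dict, base: str = "ratelimit:global") -> None:
    """Record one request on a randomly chosen shard."""
    key = shard_key(base, random.randrange(NUM_SHARDS))
    store[key] = store.get(key, 0) + 1  # Redis equivalent: INCR key


def total(store: dict, base: str = "ratelimit:global") -> int:
    """Sum all shards to recover the global count."""
    # Redis equivalent: one read per shard key, summed client-side.
    return sum(store.get(shard_key(base, i), 0) for i in range(NUM_SHARDS))
```

Writes spread evenly across 16 keys, while reads pay the cost of touching every shard, so this trade fits write-heavy counters best.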
642
+
643
+ ### 4. Clock Skew in Distributed Systems
644
+
645
+ **Problem:** Fixed and sliding window algorithms depend on consistent time. If servers
646
+ disagree by >1 second, rate limits become inaccurate.
647
+
648
+ **Impact:** At 100 requests per 10-second window, a 2-second clock skew between servers
649
+ can allow 120 requests (20% over-admission).
650
+
651
+ **Solutions:**
652
+ - Use Redis server time (via `TIME` command or Lua `redis.call('TIME')`) instead of
653
+ client timestamps. All decisions reference the same clock.
654
+ - Use NTP to keep server clocks within 10ms of each other.
655
+ - Prefer token bucket algorithm, which is less sensitive to clock skew (it uses elapsed
656
+ time deltas, not absolute timestamps).
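To make the last point concrete, here is a small single-process token bucket that refills from elapsed time only. It is a sketch under stated assumptions: the `TokenBucket` class and its injectable `clock` parameter are illustrative, and the default clock is `time.monotonic`, which is unaffected by wall-clock adjustments.

```python
import time


class TokenBucket:
    """Token bucket that refills based on elapsed time, not absolute timestamps."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Clamp to zero so a clock that steps backwards never removes tokens.
        elapsed = max(0.0, now - self._last)
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Only the difference between two readings of the same clock matters here, so a constant offset between servers does not change the refill amount.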
657
+
658
+ ---
659
+
660
+ ## Anti-Patterns
661
+
662
+ ### 1. In-Memory Rate Limiting in Distributed Systems
663
+
664
+ **The mistake:** Using a local in-memory counter (e.g., `ConcurrentHashMap` in Java, or
665
+ a process-level dictionary in Python) when running multiple service instances behind a
666
+ load balancer.
667
+
668
+ **Why it fails:** With 10 instances and a limit of 100 requests/minute, a client can
669
+ send 100 requests to each instance -- effectively getting a 1,000 request/minute limit.
670
+ Each instance sees only 1/N of the traffic.
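The arithmetic can be checked with a short simulation. The round-robin load balancer and the constant names are assumptions for illustration only.

```python
LIMIT_PER_MINUTE = 100
INSTANCES = 10


def simulate(total_requests: int) -> int:
    """Count requests admitted when each instance keeps its own counter."""
    counters = [0] * INSTANCES
    admitted = 0
    for i in range(total_requests):
        instance = i % INSTANCES  # round-robin load balancer
        if counters[instance] < LIMIT_PER_MINUTE:
            counters[instance] += 1
            admitted += 1
    return admitted


admitted_in_one_minute = simulate(5000)  # one client sending 5,000 requests
```

Every instance enforces the limit correctly in isolation, yet the client is admitted `INSTANCES * LIMIT_PER_MINUTE` times in total.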
671
+
672
+ **The fix:** Use a shared store (Redis, Memcached) or implement a gossip protocol to
673
+ synchronize counters. Alternatively, use sticky sessions (but this creates uneven load
674
+ distribution and single-point-of-failure risks).
675
+
676
+ ### 2. No Rate Limiting
677
+
678
+ **The mistake:** Deploying an API without any rate limiting because "our clients are
679
+ well-behaved."
680
+
681
+ **What happens:** A single misbehaving client (or bot, or retry storm) saturates the
682
+ service. 73% of SaaS outages are linked to API overuse or poor traffic management
683
+ (source: Gartner, 2024). Without rate limiting, a single client can monopolize shared
684
+ resources, degrading service for all other users.
685
+
686
+ **The fix:** Start with a generous rate limit (e.g., 10x your expected peak per-client
687
+ traffic). Monitor, measure, then tighten. A loose limit is far better than no
688
+ limit.
689
+
690
+ ### 3. Overly Aggressive Rate Limiting
691
+
692
+ **The mistake:** Setting rate limits based on average traffic rather than peak traffic.
693
+ A limit of 10 requests/second when the client legitimately bursts to 50 requests/second
694
+ during normal operation causes constant 429 errors.
695
+
696
+ **Symptoms:**
697
+ - High 429 rate (>5% of requests) during normal operation
698
+ - Client retry storms amplifying the problem
699
+ - Support tickets from frustrated developers
700
+
701
+ **The fix:** Set limits at 2-5x the observed P95 client request rate. Use token bucket
702
+ algorithm to allow bursts. Monitor 429 rates by client and alert when they exceed 1%.
703
+
704
+ ### 4. Rate Limiting Without Informative Responses
705
+
706
+ **The mistake:** Returning 429 without `Retry-After` header, without `RateLimit-*`
707
+ headers, or with a generic error message.
708
+
709
+ **Why it matters:** Without `Retry-After`, clients guess when to retry -- often too soon,
710
+ creating retry storms. Without `RateLimit-Remaining`, clients cannot proactively manage
711
+ their request rate.
712
+
713
+ **The fix:** Always include `Retry-After` on 429 responses. Include `RateLimit-*` headers
714
+ on ALL responses. Include a JSON body with specific error details.
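On the client side, an informative 429 is only useful if the client honors it. The sketch below shows one way a client might pick its wait time: use `Retry-After` when present (delta-seconds or HTTP-date), otherwise fall back to exponential backoff with full jitter. The function name `retry_delay` and the `base` and `cap` defaults are hypothetical.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay(retry_after: Optional[str], attempt: int, base: float = 0.5,
                cap: float = 60.0, now: Optional[datetime] = None) -> float:
    """Return how many seconds to wait before retrying a 429 response."""
    if retry_after:
        value = retry_after.strip()
        if value.isdigit():
            return float(value)  # delta-seconds form
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            when = None
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            now = now or datetime.now(timezone.utc)
            return max(0.0, (when - now).total_seconds())
    # No usable header: exponential backoff with full jitter.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

The jitter in the fallback path spreads retries out so that many clients rejected at the same moment do not all return at once.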
715
+
716
+ ### 5. Using the Wrong Identifier
717
+
718
+ **The mistake:** Rate limiting by IP address in an environment with NAT gateways, proxies,
719
+ or cloud egress. A corporate NAT gateway may funnel 10,000 users through a single IP.
720
+
721
+ **Impact:** Legitimate users behind the same NAT get rate-limited collectively. Meanwhile,
722
+ an attacker can rotate IPs easily (e.g., using cloud VMs at $0.01/hour each).
723
+
724
+ **The fix:** Rate limit by API key, user ID, or OAuth token. Use IP-based limiting only
725
+ as a last resort for unauthenticated endpoints (e.g., login, registration).
726
+
727
+ ### 6. Fail-Closed Rate Limiter
728
+
729
+ **The mistake:** When Redis (or your rate limit store) is unavailable, rejecting all
730
+ requests.
731
+
732
+ **Impact:** A Redis outage causes a complete API outage -- the rate limiter becomes a
733
+ single point of failure.
734
+
735
+ **The fix:** Fail open. If the rate limit check cannot be completed within 5ms, allow
736
+ the request and log the failure. Use circuit breakers around Redis calls. A short period
737
+ of unlimited access is vastly preferable to a complete outage.
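A minimal sketch of the fail-open behavior, assuming the actual store lookup is passed in as a callable. The names `allow_fail_open` and `check` are hypothetical, the thread pool is just one way to enforce a deadline on a blocking call, and the circuit breaker mentioned above is not shown.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable

log = logging.getLogger("ratelimit")
_executor = ThreadPoolExecutor(max_workers=4)


def allow_fail_open(check: Callable[[], bool], timeout_s: float = 0.005) -> bool:
    """Run a rate limit check with a deadline; allow the request if it fails."""
    future = _executor.submit(check)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        log.warning("rate limiter unavailable: check timed out, failing open")
    except Exception:
        log.warning("rate limiter unavailable: check raised, failing open",
                    exc_info=True)
    return True  # fail open: a brief period without limits beats an outage
```

A normal rejection (`check` returning `False`) is still honored; only timeouts and errors from the store result in the request being allowed.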
738
+
739
+ ---
740
+
741
+ ## Before/After: System Stability Under Load
742
+
743
+ ### Scenario: API Receiving Traffic Spike (10x Normal Load)
744
+
745
+ #### WITHOUT Rate Limiting
746
+
747
+ ```
748
+ Time Requests/s Latency (P99) Error Rate CPU Usage
749
+ t+0s 1,000 50ms 0.1% 40%
750
+ t+10s 5,000 200ms 2% 75%
751
+ t+20s 10,000 2,000ms 25% 98%
752
+ t+30s 10,000 timeout 80% 100%
753
+ t+40s 10,000 timeout 95% 100% <-- cascading failure
754
+ t+50s 2,000 timeout 90% 100% <-- legitimate users affected
755
+ t+60s 500 5,000ms 50% 95% <-- slow recovery begins
756
+ t+120s 1,000 500ms 10% 60% <-- partial recovery
757
+ ```
758
+
759
+ **Total impact:** 2+ minutes of degraded service. ~60% of legitimate requests failed.
760
+ Cascading failures propagated to downstream services.
761
+
762
+ #### WITH Rate Limiting (Token Bucket: 2,000 req/s per client, global: 5,000 req/s)
763
+
764
+ ```
765
+ Time Requests/s Latency (P99) Error Rate 429 Rate CPU Usage
766
+ t+0s 1,000 50ms 0.1% 0% 40%
767
+ t+10s 5,000 55ms 0.1% 0% 45%
768
+ t+20s 10,000 60ms 0.1% 50% 50%
769
+ t+30s 10,000 60ms 0.1% 50% 50%
770
+ t+40s 10,000 60ms 0.1% 50% 50% <-- stable
771
+ t+50s 10,000 60ms 0.1% 50% 50% <-- stable
772
+ t+60s 5,000 55ms 0.1% 0% 45% <-- load subsides
773
+ ```
774
+
775
+ **Total impact:** Zero degradation for clients within their limits. All excess traffic (50% of peak requests) was
776
+ shed cleanly with 429 + Retry-After. P99 latency increased by only 10ms (rate limiter
777
+ overhead). No cascading failures. No recovery period needed.
778
+
779
+ ### Scenario: Redis Rate Limiter Failure
780
+
781
+ #### Fail-Closed (Anti-Pattern)
782
+
783
+ ```
784
+ Redis down at t+0s:
785
+ - ALL rate limit checks fail
786
+ - ALL requests rejected (100% error rate)
787
+ - Complete API outage for 30-300 seconds until Redis recovers
788
+ - Worse than having no rate limiter at all
789
+ ```
790
+
791
+ #### Fail-Open (Best Practice)
792
+
793
+ ```
794
+ Redis down at t+0s:
795
+ - Rate limit checks time out after 5ms
796
+ - All requests allowed (fail open)
797
+ - Log "rate limiter unavailable" at WARN level
798
+ - Alert on-call engineer
799
+ - Service continues operating without rate limiting for 30-300 seconds
800
+ - Risk: temporary over-admission. Actual risk is low if the outage is brief.
801
+ ```
802
+
803
+ ---
804
+
805
+ ## Decision Tree: Which Algorithm Should I Use?
806
+
807
+ ```
808
+ START: What is your primary requirement?
809
+ |
810
+ +---> Need to allow traffic bursts?
811
+ | |
812
+ | +---> YES: Use TOKEN BUCKET
813
+ | | - Stripe and AWS API Gateway use this
814
+ | | - Two params: rate + burst capacity
815
+ | | - Memory: ~32 bytes/client
816
+ | | - Best for: API rate limiting with burst tolerance
817
+ | |
818
+ | +---> NO: Need perfectly smooth output rate?
819
+ | |
820
+ | +---> YES: Use LEAKY BUCKET
821
+ | | - Processes requests at constant rate
822
+ | | - Adds queuing latency
823
+ | | - Best for: traffic shaping, network QoS
824
+ | |
825
+ | +---> NO: Continue below
826
+ |
827
+ +---> Is memory a primary constraint?
828
+ | |
829
+ | +---> YES, minimal memory:
830
+ | | |
831
+ | | +---> Can you tolerate boundary bursts (up to 2x)?
832
+ | | |
833
+ | | +---> YES: Use FIXED WINDOW COUNTER
834
+ | | | - ~16 bytes/client, simplest to implement
835
+ | | | - Redis: INCR + EXPIRE (2 commands)
836
+ | | |
837
+ | | +---> NO: Use SLIDING WINDOW COUNTER
838
+ | | - ~20 bytes/client, 99.997% accurate
839
+ | | - Best default choice for most APIs
840
+ | |
841
+ | +---> NO, memory is not a concern:
842
+ | |
843
+ | +---> Need perfect accuracy (zero false allows)?
844
+ | |
845
+ | +---> YES: Use SLIDING WINDOW LOG
846
+ | | - O(L) memory per client
847
+ | | - Only practical for L < 1,000
848
+ | | - Best for: billing, compliance, audit
849
+ | |
850
+ | +---> NO: Use SLIDING WINDOW COUNTER
851
+ | - Best overall balance
852
+ |
853
+ +---> Running distributed (multi-node)?
854
+ |
855
+ +---> YES:
856
+ | - Use Redis + Lua scripts for atomicity
857
+ | - SLIDING WINDOW COUNTER is the recommended default
858
+ | - TOKEN BUCKET if bursts are needed
859
+ | - Always use EVALSHA (not EVAL) for performance
860
+ | - Set 5ms timeout, fail open on Redis errors
861
+ |
862
+ +---> NO (single process):
863
+ - Use in-memory implementation
864
+ - Any algorithm works; TOKEN BUCKET is most versatile
865
+ - ~200ns per check, no external dependencies
866
+
867
+ QUICK DECISION MATRIX:
868
+ API rate limiting (general) --> Sliding Window Counter
869
+ API rate limiting (allow bursts)--> Token Bucket
870
+ Network traffic shaping --> Leaky Bucket
871
+ Simple + low memory --> Fixed Window Counter
872
+ Billing / compliance / audit --> Sliding Window Log
873
+ ```
874
+
875
+ ---
876
+
877
+ ## Production Case Studies
878
+
879
+ ### Cloudflare: Rate Limiting at the Edge
880
+
881
+ Cloudflare processes rate limiting across millions of domains at their edge network. Their
882
+ architecture uses a Twemproxy cluster inside each Point of Presence (PoP) with consistent
883
+ hashing to distribute memcache keys across servers. When the cluster is resized, consistent
884
+ hashing ensures only a small fraction of keys are rehashed. Rate limiting at the edge
885
+ means origin servers never see excessive traffic, and the performance/memory cost is
886
+ distributed across the global edge network (source: Cloudflare engineering blog).
887
+
888
+ Key numbers:
889
+ - Rate limiting deployed across 300+ data centers globally
890
+ - Consistent hashing minimizes key redistribution during scaling
891
+ - Edge-based limiting eliminates origin server load from abusive traffic
892
+
893
+ ### Stripe: Four-Tier Rate Limiting
894
+
895
+ Stripe's production rate limiting uses 4 distinct layers (source: Stripe engineering blog):
896
+
897
+ 1. **Request rate limiter** (token bucket): Per-user request rate cap
898
+ 2. **Concurrent request limiter:** Per-user cap on simultaneous in-flight requests
899
+ 3. **Fleet usage load shedder:** Reserves 20% fleet capacity for critical traffic
900
+ 4. **Worker utilization load shedder:** Per-worker progressive shedding by priority
901
+
902
+ The per-user limiters use Redis, with the token bucket algorithm for the request rate limiter. The load
903
+ shedders use local worker metrics. This layered approach protects against both individual
904
+ abuse and systemic overload while ensuring critical payment processing is never starved.
905
+
906
+ ### Halodoc: Redis + Lua Sliding Window
907
+
908
+ Halodoc implemented a sliding window rate limiter using Redis sorted sets and Lua scripts
909
+ (source: Halodoc engineering blog). Key findings:
910
+
911
+ - Lua scripts reduced race condition over-admission from ~12% to 0%
912
+ - Single Redis round trip per rate limit check (vs 3-4 without Lua)
913
+ - ZREMRANGEBYSCORE + ZCARD + ZADD + PEXPIRE in a single atomic operation
914
+ - Production deployment handles thousands of requests/second with sub-2ms latency
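The sequence of sorted-set operations listed above can be mirrored in memory to show the logic. This is not Halodoc's script: it is a single-process analog with a hypothetical `SlidingWindowLog` class, a sorted Python list in place of the Redis sorted set, and no equivalent of `PEXPIRE`.

```python
import bisect
from typing import Dict, List


class SlidingWindowLog:
    """In-memory analog of a sorted-set sliding window log."""

    def __init__(self, limit: int, window_ms: int):
        self.limit = limit
        self.window_ms = window_ms
        self._log: Dict[str, List[int]] = {}

    def allow(self, client: str, now_ms: int) -> bool:
        entries = self._log.setdefault(client, [])
        # ZREMRANGEBYSCORE analog: drop timestamps that left the window.
        cutoff = now_ms - self.window_ms
        del entries[:bisect.bisect_right(entries, cutoff)]
        # ZCARD analog: count what remains in the window.
        if len(entries) >= self.limit:
            return False
        # ZADD analog: record this request's timestamp.
        bisect.insort(entries, now_ms)
        return True
```

In Redis, running these steps inside one Lua script is what makes them atomic; here the single-threaded call plays that role.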
915
+
916
+ ### Google SRE: Client-Side Adaptive Throttling
917
+
918
+ Google's SRE book describes client-side adaptive throttling where clients track their own
919
+ acceptance rate and proactively reduce sending rate when backends are stressed. With
920
+ multiplier K=2, clients start self-throttling when rejection rate exceeds 50%. This
921
+ reduces wasted work (sending requests only to be rejected) and can reduce recovery time
922
+ from overload by 50-70% compared to server-side-only rate limiting (source: Google SRE
923
+ book, Chapter 21).
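The rejection probability from the SRE book's adaptive throttling scheme is max(0, (requests - K * accepts) / (requests + 1)), where both counts cover a recent window. A minimal sketch follows; the function names are hypothetical and the window bookkeeping is left out.

```python
import random


def rejection_probability(requests: int, accepts: int, k: float = 2.0) -> float:
    """Probability that the client rejects a new request locally."""
    return max(0.0, (requests - k * accepts) / (requests + 1))


def should_reject_locally(requests: int, accepts: int, k: float = 2.0) -> bool:
    """Randomly drop a request according to the current rejection probability."""
    return random.random() < rejection_probability(requests, accepts, k)
```

With K=2 the probability stays at zero until fewer than half of recent requests are accepted, which matches the 50% threshold described above. A lower K makes the client throttle earlier; a higher K lets more requests through to the backend.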
924
+
925
+ ---
926
+
927
+ ## Implementation Checklist
928
+
929
+ ```
930
+ [ ] Choose algorithm based on decision tree above
931
+ [ ] Implement with Redis + Lua scripts for distributed systems
932
+ [ ] Use EVALSHA (not EVAL) -- load scripts at startup, cache SHA1
933
+ [ ] Set Redis call timeout to 5ms, fail open on timeout
934
+ [ ] Return RateLimit-* headers on ALL responses (not just 429)
935
+ [ ] Return Retry-After header on 429 responses
936
+ [ ] Include informative JSON error body on 429 responses
937
+ [ ] Rate limit by API key or user ID (not IP address)
938
+ [ ] Set initial limits at 2-5x observed P95 client traffic
939
+ [ ] Monitor 429 rate by client -- alert if >1% during normal operation
940
+ [ ] Implement exponential backoff with jitter in client SDKs
941
+ [ ] Add circuit breaker around Redis rate limit calls
942
+ [ ] Test under load: verify rate limiter does not become the bottleneck
943
+ [ ] Test Redis failure: verify fail-open behavior
944
+ [ ] Test clock skew: verify algorithm tolerates 1-2 second differences
945
+ [ ] Dashboard: request rate, 429 rate, rate limiter latency, Redis health
946
+ ```
947
+
948
+ ---
949
+
950
+ ## Sources
951
+
952
+ - [Cloudflare: How We Built Rate Limiting Capable of Scaling to Millions of Domains](https://blog.cloudflare.com/counting-things-a-lot-of-different-things/)
953
+ - [Stripe: Scaling Your API with Rate Limiters](https://stripe.com/blog/rate-limiters)
954
+ - [Redis: Build 5 Rate Limiters with Redis](https://redis.io/tutorials/howtos/ratelimiting/)
955
+ - [Halodoc: Redis and Lua Powered Sliding Window Rate Limiter](https://blogs.halodoc.io/taming-the-traffic-redis-and-lua-powered-sliding-window-rate-limiter-in-action/)
956
+ - [IETF: RateLimit Header Fields for HTTP (draft-ietf-httpapi-ratelimit-headers)](https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit-headers/)
957
+ - [API7: From Token Bucket to Sliding Window Rate Limiting Guide](https://api7.ai/blog/rate-limiting-guide-algorithms-best-practices)
958
+ - [Arpit Bhayani: Sliding Window Rate Limiting Design and Implementation](https://arpitbhayani.me/blogs/sliding-window-ratelimiter/)
959
+ - [Gravitee: API Rate Limiting at Scale](https://www.gravitee.io/blog/rate-limiting-apis-scale-patterns-strategies)
960
+ - [AlgoMaster: Rate Limiting Algorithms Explained with Code](https://blog.algomaster.io/p/rate-limiting-algorithms-explained-with-code)
961
+ - [Redis: Diagnosing Latency Issues](https://redis.io/docs/latest/operate/oss_and_stack/management/optimization/latency/)
962
+ - [Zuplo: 10 Best Practices for API Rate Limiting in 2025](https://zuplo.com/learning-center/10-best-practices-for-api-rate-limiting-in-2025)
963
+ - [Kodekx: API Rate Limiting Best Practices for Scaling SaaS Apps in 2025](https://www.kodekx.com/blog/api-rate-limiting-best-practices-scaling-saas-2025)
964
+ - [Expedia Group: Traffic Shedding, Rate Limiting, Backpressure](https://medium.com/expedia-group-tech/traffic-shedding-rate-limiting-backpressure-oh-my-21f95c403b29)
965
+ - [Smudge.ai: Visualizing Algorithms for Rate Limiting](https://smudge.ai/blog/ratelimit-algorithms)
966
+ - [FreeCodeCamp: How to Build a Distributed Rate Limiting System Using Redis and Lua](https://www.freecodecamp.org/news/build-rate-limiting-system-using-redis-and-lua/)
967
+ - [Kong: How to Design a Scalable Rate Limiting Algorithm](https://konghq.com/blog/engineering/how-to-design-a-scalable-rate-limiting-algorithm)
968
+ - [Lunar.dev: Maximizing Performance with Client-Side Throttling](https://www.lunar.dev/post/client-side-throttling)
969
+ - [GeeksforGeeks: Token Bucket vs Leaky Bucket Algorithm](https://www.geeksforgeeks.org/system-design/token-bucket-vs-leaky-bucket-algorithm-system-design/)
970
+ - [Speakeasy: Rate Limiting Best Practices in REST API Design](https://www.speakeasy.com/api-design/rate-limiting)
971
+ - [Arxiv: Designing Scalable Rate Limiting Systems](https://arxiv.org/html/2602.11741)