@wazir-dev/cli 1.0.0
This diff lists the content of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +111 -0
- package/CHANGELOG.md +14 -0
- package/CONTRIBUTING.md +101 -0
- package/LICENSE +21 -0
- package/README.md +314 -0
- package/assets/composition-engine.mmd +34 -0
- package/assets/demo-script.sh +17 -0
- package/assets/logo-dark.svg +14 -0
- package/assets/logo.svg +14 -0
- package/assets/pipeline.mmd +39 -0
- package/assets/record-demo.sh +51 -0
- package/docs/README.md +51 -0
- package/docs/adapters/context-mode.md +60 -0
- package/docs/concepts/architecture.md +87 -0
- package/docs/concepts/artifact-model.md +60 -0
- package/docs/concepts/composition-engine.md +36 -0
- package/docs/concepts/indexing-and-recall.md +160 -0
- package/docs/concepts/observability.md +41 -0
- package/docs/concepts/roles-and-workflows.md +59 -0
- package/docs/concepts/terminology-policy.md +27 -0
- package/docs/getting-started/01-installation.md +78 -0
- package/docs/getting-started/02-first-run.md +102 -0
- package/docs/getting-started/03-adding-to-project.md +15 -0
- package/docs/getting-started/04-host-setup.md +15 -0
- package/docs/guides/ci-integration.md +15 -0
- package/docs/guides/creating-skills.md +15 -0
- package/docs/guides/expertise-module-authoring.md +15 -0
- package/docs/guides/hook-development.md +15 -0
- package/docs/guides/memory-and-learnings.md +34 -0
- package/docs/guides/multi-host-export.md +15 -0
- package/docs/guides/troubleshooting.md +101 -0
- package/docs/guides/writing-custom-roles.md +15 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
- package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
- package/docs/readmes/INDEX.md +99 -0
- package/docs/readmes/features/expertise/README.md +171 -0
- package/docs/readmes/features/exports/README.md +222 -0
- package/docs/readmes/features/hooks/README.md +103 -0
- package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
- package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
- package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
- package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
- package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
- package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
- package/docs/readmes/features/hooks/session-start.md +119 -0
- package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
- package/docs/readmes/features/roles/README.md +157 -0
- package/docs/readmes/features/roles/clarifier.md +152 -0
- package/docs/readmes/features/roles/content-author.md +190 -0
- package/docs/readmes/features/roles/designer.md +193 -0
- package/docs/readmes/features/roles/executor.md +184 -0
- package/docs/readmes/features/roles/learner.md +210 -0
- package/docs/readmes/features/roles/planner.md +182 -0
- package/docs/readmes/features/roles/researcher.md +164 -0
- package/docs/readmes/features/roles/reviewer.md +184 -0
- package/docs/readmes/features/roles/specifier.md +162 -0
- package/docs/readmes/features/roles/verifier.md +215 -0
- package/docs/readmes/features/schemas/README.md +178 -0
- package/docs/readmes/features/skills/README.md +63 -0
- package/docs/readmes/features/skills/brainstorming.md +96 -0
- package/docs/readmes/features/skills/debugging.md +148 -0
- package/docs/readmes/features/skills/design.md +120 -0
- package/docs/readmes/features/skills/prepare-next.md +109 -0
- package/docs/readmes/features/skills/run-audit.md +159 -0
- package/docs/readmes/features/skills/scan-project.md +109 -0
- package/docs/readmes/features/skills/self-audit.md +176 -0
- package/docs/readmes/features/skills/tdd.md +137 -0
- package/docs/readmes/features/skills/using-skills.md +92 -0
- package/docs/readmes/features/skills/verification.md +120 -0
- package/docs/readmes/features/skills/writing-plans.md +104 -0
- package/docs/readmes/features/tooling/README.md +320 -0
- package/docs/readmes/features/workflows/README.md +186 -0
- package/docs/readmes/features/workflows/author.md +181 -0
- package/docs/readmes/features/workflows/clarify.md +154 -0
- package/docs/readmes/features/workflows/design-review.md +171 -0
- package/docs/readmes/features/workflows/design.md +169 -0
- package/docs/readmes/features/workflows/discover.md +162 -0
- package/docs/readmes/features/workflows/execute.md +173 -0
- package/docs/readmes/features/workflows/learn.md +167 -0
- package/docs/readmes/features/workflows/plan-review.md +165 -0
- package/docs/readmes/features/workflows/plan.md +170 -0
- package/docs/readmes/features/workflows/prepare-next.md +167 -0
- package/docs/readmes/features/workflows/review.md +169 -0
- package/docs/readmes/features/workflows/run-audit.md +191 -0
- package/docs/readmes/features/workflows/spec-challenge.md +159 -0
- package/docs/readmes/features/workflows/specify.md +160 -0
- package/docs/readmes/features/workflows/verify.md +177 -0
- package/docs/readmes/packages/README.md +50 -0
- package/docs/readmes/packages/ajv.md +117 -0
- package/docs/readmes/packages/context-mode.md +118 -0
- package/docs/readmes/packages/gray-matter.md +116 -0
- package/docs/readmes/packages/node-test.md +137 -0
- package/docs/readmes/packages/yaml.md +112 -0
- package/docs/reference/configuration-reference.md +159 -0
- package/docs/reference/expertise-index.md +52 -0
- package/docs/reference/git-flow.md +43 -0
- package/docs/reference/hooks.md +87 -0
- package/docs/reference/host-exports.md +50 -0
- package/docs/reference/launch-checklist.md +172 -0
- package/docs/reference/marketplace-listings.md +76 -0
- package/docs/reference/release-process.md +34 -0
- package/docs/reference/roles-reference.md +77 -0
- package/docs/reference/skills.md +33 -0
- package/docs/reference/templates.md +29 -0
- package/docs/reference/tooling-cli.md +94 -0
- package/docs/truth-claims.yaml +222 -0
- package/expertise/PROGRESS.md +63 -0
- package/expertise/README.md +18 -0
- package/expertise/antipatterns/PROGRESS.md +56 -0
- package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
- package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
- package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
- package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
- package/expertise/antipatterns/backend/index.md +24 -0
- package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
- package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
- package/expertise/antipatterns/code/async-antipatterns.md +622 -0
- package/expertise/antipatterns/code/code-smells.md +1186 -0
- package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
- package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
- package/expertise/antipatterns/code/index.md +27 -0
- package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
- package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
- package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
- package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
- package/expertise/antipatterns/design/dark-patterns.md +1121 -0
- package/expertise/antipatterns/design/index.md +22 -0
- package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
- package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
- package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
- package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
- package/expertise/antipatterns/frontend/index.md +23 -0
- package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
- package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
- package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
- package/expertise/antipatterns/index.md +31 -0
- package/expertise/antipatterns/performance/index.md +20 -0
- package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
- package/expertise/antipatterns/performance/premature-optimization.md +623 -0
- package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
- package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
- package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
- package/expertise/antipatterns/process/index.md +23 -0
- package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
- package/expertise/antipatterns/security/index.md +20 -0
- package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
- package/expertise/antipatterns/security/security-theater.md +843 -0
- package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
- package/expertise/architecture/PROGRESS.md +70 -0
- package/expertise/architecture/data/caching-architecture.md +671 -0
- package/expertise/architecture/data/data-consistency.md +574 -0
- package/expertise/architecture/data/data-modeling.md +536 -0
- package/expertise/architecture/data/event-streams-and-queues.md +634 -0
- package/expertise/architecture/data/index.md +25 -0
- package/expertise/architecture/data/search-architecture.md +663 -0
- package/expertise/architecture/data/sql-vs-nosql.md +708 -0
- package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
- package/expertise/architecture/decisions/build-vs-buy.md +616 -0
- package/expertise/architecture/decisions/index.md +23 -0
- package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
- package/expertise/architecture/decisions/technology-selection.md +616 -0
- package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
- package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
- package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
- package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
- package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
- package/expertise/architecture/distributed/index.md +25 -0
- package/expertise/architecture/distributed/saga-pattern.md +797 -0
- package/expertise/architecture/foundations/architectural-thinking.md +460 -0
- package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
- package/expertise/architecture/foundations/design-principles-solid.md +649 -0
- package/expertise/architecture/foundations/domain-driven-design.md +719 -0
- package/expertise/architecture/foundations/index.md +25 -0
- package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
- package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
- package/expertise/architecture/index.md +34 -0
- package/expertise/architecture/integration/api-design-graphql.md +638 -0
- package/expertise/architecture/integration/api-design-grpc.md +804 -0
- package/expertise/architecture/integration/api-design-rest.md +892 -0
- package/expertise/architecture/integration/index.md +25 -0
- package/expertise/architecture/integration/third-party-integration.md +795 -0
- package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
- package/expertise/architecture/integration/websockets-realtime.md +791 -0
- package/expertise/architecture/mobile-architecture/index.md +22 -0
- package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
- package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
- package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
- package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
- package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
- package/expertise/architecture/patterns/event-driven.md +797 -0
- package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
- package/expertise/architecture/patterns/index.md +27 -0
- package/expertise/architecture/patterns/layered-architecture.md +736 -0
- package/expertise/architecture/patterns/microservices.md +753 -0
- package/expertise/architecture/patterns/modular-monolith.md +692 -0
- package/expertise/architecture/patterns/monolith.md +626 -0
- package/expertise/architecture/patterns/plugin-architecture.md +735 -0
- package/expertise/architecture/patterns/serverless.md +780 -0
- package/expertise/architecture/scaling/database-scaling.md +615 -0
- package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
- package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
- package/expertise/architecture/scaling/index.md +24 -0
- package/expertise/architecture/scaling/multi-tenancy.md +800 -0
- package/expertise/architecture/scaling/stateless-design.md +787 -0
- package/expertise/backend/embedded-firmware.md +625 -0
- package/expertise/backend/go.md +853 -0
- package/expertise/backend/index.md +24 -0
- package/expertise/backend/java-spring.md +448 -0
- package/expertise/backend/node-typescript.md +625 -0
- package/expertise/backend/python-fastapi.md +724 -0
- package/expertise/backend/rust.md +458 -0
- package/expertise/backend/solidity.md +711 -0
- package/expertise/composition-map.yaml +443 -0
- package/expertise/content/foundations/content-modeling.md +395 -0
- package/expertise/content/foundations/editorial-standards.md +449 -0
- package/expertise/content/foundations/index.md +24 -0
- package/expertise/content/foundations/microcopy.md +455 -0
- package/expertise/content/foundations/terminology-governance.md +509 -0
- package/expertise/content/index.md +34 -0
- package/expertise/content/patterns/accessibility-copy.md +518 -0
- package/expertise/content/patterns/index.md +24 -0
- package/expertise/content/patterns/notification-content.md +433 -0
- package/expertise/content/patterns/sample-content.md +486 -0
- package/expertise/content/patterns/state-copy.md +439 -0
- package/expertise/design/PROGRESS.md +58 -0
- package/expertise/design/disciplines/dark-mode-theming.md +577 -0
- package/expertise/design/disciplines/design-systems.md +595 -0
- package/expertise/design/disciplines/index.md +25 -0
- package/expertise/design/disciplines/information-architecture.md +800 -0
- package/expertise/design/disciplines/interaction-design.md +788 -0
- package/expertise/design/disciplines/responsive-design.md +552 -0
- package/expertise/design/disciplines/usability-testing.md +516 -0
- package/expertise/design/disciplines/user-research.md +792 -0
- package/expertise/design/foundations/accessibility-design.md +796 -0
- package/expertise/design/foundations/color-theory.md +797 -0
- package/expertise/design/foundations/iconography.md +795 -0
- package/expertise/design/foundations/index.md +26 -0
- package/expertise/design/foundations/motion-and-animation.md +653 -0
- package/expertise/design/foundations/rtl-design.md +585 -0
- package/expertise/design/foundations/spacing-and-layout.md +607 -0
- package/expertise/design/foundations/typography.md +800 -0
- package/expertise/design/foundations/visual-hierarchy.md +761 -0
- package/expertise/design/index.md +32 -0
- package/expertise/design/patterns/authentication-flows.md +474 -0
- package/expertise/design/patterns/content-consumption.md +789 -0
- package/expertise/design/patterns/data-display.md +618 -0
- package/expertise/design/patterns/e-commerce.md +1494 -0
- package/expertise/design/patterns/feedback-and-states.md +642 -0
- package/expertise/design/patterns/forms-and-input.md +819 -0
- package/expertise/design/patterns/gamification.md +801 -0
- package/expertise/design/patterns/index.md +31 -0
- package/expertise/design/patterns/microinteractions.md +449 -0
- package/expertise/design/patterns/navigation.md +800 -0
- package/expertise/design/patterns/notifications.md +705 -0
- package/expertise/design/patterns/onboarding.md +700 -0
- package/expertise/design/patterns/search-and-filter.md +601 -0
- package/expertise/design/patterns/settings-and-preferences.md +768 -0
- package/expertise/design/patterns/social-and-community.md +748 -0
- package/expertise/design/platforms/desktop-native.md +612 -0
- package/expertise/design/platforms/index.md +25 -0
- package/expertise/design/platforms/mobile-android.md +825 -0
- package/expertise/design/platforms/mobile-cross-platform.md +983 -0
- package/expertise/design/platforms/mobile-ios.md +699 -0
- package/expertise/design/platforms/tablet.md +794 -0
- package/expertise/design/platforms/web-dashboard.md +790 -0
- package/expertise/design/platforms/web-responsive.md +550 -0
- package/expertise/design/psychology/behavioral-nudges.md +449 -0
- package/expertise/design/psychology/cognitive-load.md +1191 -0
- package/expertise/design/psychology/error-psychology.md +778 -0
- package/expertise/design/psychology/index.md +22 -0
- package/expertise/design/psychology/persuasive-design.md +736 -0
- package/expertise/design/psychology/user-mental-models.md +623 -0
- package/expertise/design/tooling/open-pencil.md +266 -0
- package/expertise/frontend/angular.md +1073 -0
- package/expertise/frontend/desktop-electron.md +546 -0
- package/expertise/frontend/flutter.md +782 -0
- package/expertise/frontend/index.md +27 -0
- package/expertise/frontend/native-android.md +409 -0
- package/expertise/frontend/native-ios.md +490 -0
- package/expertise/frontend/react-native.md +1160 -0
- package/expertise/frontend/react.md +808 -0
- package/expertise/frontend/vue.md +1089 -0
- package/expertise/humanize/domain-rules-code.md +79 -0
- package/expertise/humanize/domain-rules-content.md +67 -0
- package/expertise/humanize/domain-rules-technical-docs.md +56 -0
- package/expertise/humanize/index.md +35 -0
- package/expertise/humanize/self-audit-checklist.md +87 -0
- package/expertise/humanize/sentence-patterns.md +218 -0
- package/expertise/humanize/vocabulary-blacklist.md +105 -0
- package/expertise/i18n/PROGRESS.md +65 -0
- package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
- package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
- package/expertise/i18n/advanced/complex-scripts.md +30 -0
- package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
- package/expertise/i18n/advanced/testing-i18n.md +28 -0
- package/expertise/i18n/content/content-adaptation.md +23 -0
- package/expertise/i18n/content/locale-specific-formatting.md +23 -0
- package/expertise/i18n/content/machine-translation-integration.md +28 -0
- package/expertise/i18n/content/translation-management.md +29 -0
- package/expertise/i18n/foundations/date-time-calendars.md +67 -0
- package/expertise/i18n/foundations/i18n-architecture.md +272 -0
- package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
- package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
- package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
- package/expertise/i18n/foundations/string-externalization.md +236 -0
- package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
- package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
- package/expertise/i18n/index.md +38 -0
- package/expertise/i18n/platform/backend-i18n.md +31 -0
- package/expertise/i18n/platform/flutter-i18n.md +148 -0
- package/expertise/i18n/platform/native-android-i18n.md +36 -0
- package/expertise/i18n/platform/native-ios-i18n.md +36 -0
- package/expertise/i18n/platform/react-i18n.md +103 -0
- package/expertise/i18n/platform/web-css-i18n.md +81 -0
- package/expertise/i18n/rtl/arabic-specific.md +175 -0
- package/expertise/i18n/rtl/hebrew-specific.md +149 -0
- package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
- package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
- package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
- package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
- package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
- package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
- package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
- package/expertise/i18n/rtl/rtl-typography.md +160 -0
- package/expertise/index.md +113 -0
- package/expertise/index.yaml +216 -0
- package/expertise/infrastructure/cloud-aws.md +597 -0
- package/expertise/infrastructure/cloud-gcp.md +599 -0
- package/expertise/infrastructure/cybersecurity.md +816 -0
- package/expertise/infrastructure/database-mongodb.md +447 -0
- package/expertise/infrastructure/database-postgres.md +400 -0
- package/expertise/infrastructure/devops-cicd.md +787 -0
- package/expertise/infrastructure/index.md +27 -0
- package/expertise/performance/PROGRESS.md +50 -0
- package/expertise/performance/backend/api-latency.md +1204 -0
- package/expertise/performance/backend/background-jobs.md +506 -0
- package/expertise/performance/backend/connection-pooling.md +1209 -0
- package/expertise/performance/backend/database-query-optimization.md +515 -0
- package/expertise/performance/backend/index.md +23 -0
- package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
- package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
- package/expertise/performance/foundations/caching-strategies.md +489 -0
- package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
- package/expertise/performance/foundations/index.md +24 -0
- package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
- package/expertise/performance/foundations/memory-management.md +964 -0
- package/expertise/performance/foundations/performance-budgets.md +1314 -0
- package/expertise/performance/index.md +31 -0
- package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
- package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
- package/expertise/performance/infrastructure/index.md +22 -0
- package/expertise/performance/infrastructure/load-balancing.md +1081 -0
- package/expertise/performance/infrastructure/observability.md +1079 -0
- package/expertise/performance/mobile/index.md +23 -0
- package/expertise/performance/mobile/mobile-animations.md +544 -0
- package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
- package/expertise/performance/mobile/mobile-network.md +452 -0
- package/expertise/performance/mobile/mobile-rendering.md +599 -0
- package/expertise/performance/mobile/mobile-startup-time.md +505 -0
- package/expertise/performance/platform-specific/flutter-performance.md +647 -0
- package/expertise/performance/platform-specific/index.md +22 -0
- package/expertise/performance/platform-specific/node-performance.md +1307 -0
- package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
- package/expertise/performance/platform-specific/react-performance.md +1403 -0
- package/expertise/performance/web/bundle-optimization.md +1239 -0
- package/expertise/performance/web/image-and-media.md +636 -0
- package/expertise/performance/web/index.md +24 -0
- package/expertise/performance/web/network-optimization.md +1133 -0
- package/expertise/performance/web/rendering-performance.md +1098 -0
- package/expertise/performance/web/ssr-and-hydration.md +918 -0
- package/expertise/performance/web/web-vitals.md +1374 -0
- package/expertise/quality/accessibility.md +985 -0
- package/expertise/quality/evidence-based-verification.md +499 -0
- package/expertise/quality/index.md +24 -0
- package/expertise/quality/ml-model-audit.md +614 -0
- package/expertise/quality/performance.md +600 -0
- package/expertise/quality/testing-api.md +891 -0
- package/expertise/quality/testing-mobile.md +496 -0
- package/expertise/quality/testing-web.md +849 -0
- package/expertise/security/PROGRESS.md +54 -0
- package/expertise/security/agentic-identity.md +540 -0
- package/expertise/security/compliance-frameworks.md +601 -0
- package/expertise/security/data/data-encryption.md +364 -0
- package/expertise/security/data/data-privacy-gdpr.md +692 -0
- package/expertise/security/data/database-security.md +1171 -0
- package/expertise/security/data/index.md +22 -0
- package/expertise/security/data/pii-handling.md +531 -0
- package/expertise/security/foundations/authentication.md +1041 -0
- package/expertise/security/foundations/authorization.md +603 -0
- package/expertise/security/foundations/cryptography.md +1001 -0
- package/expertise/security/foundations/index.md +25 -0
- package/expertise/security/foundations/owasp-top-10.md +1354 -0
- package/expertise/security/foundations/secrets-management.md +1217 -0
- package/expertise/security/foundations/secure-sdlc.md +700 -0
- package/expertise/security/foundations/supply-chain-security.md +698 -0
- package/expertise/security/index.md +31 -0
- package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
- package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
- package/expertise/security/infrastructure/container-security.md +721 -0
- package/expertise/security/infrastructure/incident-response.md +1295 -0
- package/expertise/security/infrastructure/index.md +24 -0
- package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
- package/expertise/security/infrastructure/network-security.md +1337 -0
- package/expertise/security/mobile/index.md +23 -0
- package/expertise/security/mobile/mobile-android-security.md +1218 -0
- package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
- package/expertise/security/mobile/mobile-data-storage.md +1265 -0
- package/expertise/security/mobile/mobile-ios-security.md +1401 -0
- package/expertise/security/mobile/mobile-network-security.md +1520 -0
- package/expertise/security/smart-contract-security.md +594 -0
- package/expertise/security/testing/index.md +22 -0
- package/expertise/security/testing/penetration-testing.md +1258 -0
- package/expertise/security/testing/security-code-review.md +1765 -0
- package/expertise/security/testing/threat-modeling.md +1074 -0
- package/expertise/security/testing/vulnerability-scanning.md +1062 -0
- package/expertise/security/web/api-security.md +586 -0
- package/expertise/security/web/cors-and-headers.md +433 -0
- package/expertise/security/web/csrf.md +562 -0
- package/expertise/security/web/file-upload.md +1477 -0
- package/expertise/security/web/index.md +25 -0
- package/expertise/security/web/injection.md +1375 -0
- package/expertise/security/web/session-management.md +1101 -0
- package/expertise/security/web/xss.md +1158 -0
- package/exports/README.md +17 -0
- package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
- package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
- package/exports/hosts/claude/.claude/agents/designer.md +55 -0
- package/exports/hosts/claude/.claude/agents/executor.md +55 -0
- package/exports/hosts/claude/.claude/agents/learner.md +51 -0
- package/exports/hosts/claude/.claude/agents/planner.md +53 -0
- package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
- package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
- package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
- package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
- package/exports/hosts/claude/.claude/commands/author.md +42 -0
- package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
- package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
- package/exports/hosts/claude/.claude/commands/design.md +44 -0
- package/exports/hosts/claude/.claude/commands/discover.md +37 -0
- package/exports/hosts/claude/.claude/commands/execute.md +48 -0
- package/exports/hosts/claude/.claude/commands/learn.md +38 -0
- package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
- package/exports/hosts/claude/.claude/commands/plan.md +39 -0
- package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
- package/exports/hosts/claude/.claude/commands/review.md +40 -0
- package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
- package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
- package/exports/hosts/claude/.claude/commands/specify.md +38 -0
- package/exports/hosts/claude/.claude/commands/verify.md +37 -0
- package/exports/hosts/claude/.claude/settings.json +34 -0
- package/exports/hosts/claude/CLAUDE.md +19 -0
- package/exports/hosts/claude/export.manifest.json +38 -0
- package/exports/hosts/claude/host-package.json +67 -0
- package/exports/hosts/codex/AGENTS.md +19 -0
- package/exports/hosts/codex/export.manifest.json +38 -0
- package/exports/hosts/codex/host-package.json +41 -0
- package/exports/hosts/cursor/.cursor/hooks.json +16 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
- package/exports/hosts/cursor/export.manifest.json +38 -0
- package/exports/hosts/cursor/host-package.json +42 -0
- package/exports/hosts/gemini/GEMINI.md +19 -0
- package/exports/hosts/gemini/export.manifest.json +38 -0
- package/exports/hosts/gemini/host-package.json +41 -0
- package/hooks/README.md +18 -0
- package/hooks/definitions/loop_cap_guard.yaml +21 -0
- package/hooks/definitions/post_tool_capture.yaml +24 -0
- package/hooks/definitions/pre_compact_summary.yaml +19 -0
- package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
- package/hooks/definitions/protected_path_write_guard.yaml +19 -0
- package/hooks/definitions/session_start.yaml +19 -0
- package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
- package/hooks/loop-cap-guard +17 -0
- package/hooks/post-tool-lint +36 -0
- package/hooks/protected-path-write-guard +17 -0
- package/hooks/session-start +41 -0
- package/llms-full.txt +2355 -0
- package/llms.txt +43 -0
- package/package.json +79 -0
- package/roles/README.md +20 -0
- package/roles/clarifier.md +42 -0
- package/roles/content-author.md +63 -0
- package/roles/designer.md +55 -0
- package/roles/executor.md +55 -0
- package/roles/learner.md +51 -0
- package/roles/planner.md +53 -0
- package/roles/researcher.md +43 -0
- package/roles/reviewer.md +54 -0
- package/roles/specifier.md +47 -0
- package/roles/verifier.md +71 -0
- package/schemas/README.md +24 -0
- package/schemas/accepted-learning.schema.json +20 -0
- package/schemas/author-artifact.schema.json +156 -0
- package/schemas/clarification.schema.json +19 -0
- package/schemas/design-artifact.schema.json +80 -0
- package/schemas/docs-claim.schema.json +18 -0
- package/schemas/export-manifest.schema.json +20 -0
- package/schemas/hook.schema.json +67 -0
- package/schemas/host-export-package.schema.json +18 -0
- package/schemas/implementation-plan.schema.json +19 -0
- package/schemas/proposed-learning.schema.json +19 -0
- package/schemas/research.schema.json +18 -0
- package/schemas/review.schema.json +29 -0
- package/schemas/run-manifest.schema.json +18 -0
- package/schemas/spec-challenge.schema.json +18 -0
- package/schemas/spec.schema.json +20 -0
- package/schemas/usage.schema.json +102 -0
- package/schemas/verification-proof.schema.json +29 -0
- package/schemas/wazir-manifest.schema.json +173 -0
- package/skills/README.md +40 -0
- package/skills/brainstorming/SKILL.md +77 -0
- package/skills/debugging/SKILL.md +50 -0
- package/skills/design/SKILL.md +61 -0
- package/skills/dispatching-parallel-agents/SKILL.md +128 -0
- package/skills/executing-plans/SKILL.md +70 -0
- package/skills/finishing-a-development-branch/SKILL.md +169 -0
- package/skills/humanize/SKILL.md +123 -0
- package/skills/init-pipeline/SKILL.md +124 -0
- package/skills/prepare-next/SKILL.md +20 -0
- package/skills/receiving-code-review/SKILL.md +123 -0
- package/skills/requesting-code-review/SKILL.md +105 -0
- package/skills/requesting-code-review/code-reviewer.md +108 -0
- package/skills/run-audit/SKILL.md +197 -0
- package/skills/scan-project/SKILL.md +41 -0
- package/skills/self-audit/SKILL.md +153 -0
- package/skills/subagent-driven-development/SKILL.md +154 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
- package/skills/subagent-driven-development/implementer-prompt.md +102 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/skills/tdd/SKILL.md +23 -0
- package/skills/using-git-worktrees/SKILL.md +163 -0
- package/skills/using-skills/SKILL.md +95 -0
- package/skills/verification/SKILL.md +22 -0
- package/skills/wazir/SKILL.md +463 -0
- package/skills/writing-plans/SKILL.md +30 -0
- package/skills/writing-skills/SKILL.md +157 -0
- package/skills/writing-skills/anthropic-best-practices.md +122 -0
- package/skills/writing-skills/persuasion-principles.md +50 -0
- package/templates/README.md +20 -0
- package/templates/artifacts/README.md +10 -0
- package/templates/artifacts/accepted-learning.md +19 -0
- package/templates/artifacts/accepted-learning.template.json +12 -0
- package/templates/artifacts/author.md +74 -0
- package/templates/artifacts/author.template.json +19 -0
- package/templates/artifacts/clarification.md +21 -0
- package/templates/artifacts/clarification.template.json +12 -0
- package/templates/artifacts/execute-notes.md +19 -0
- package/templates/artifacts/implementation-plan.md +21 -0
- package/templates/artifacts/implementation-plan.template.json +11 -0
- package/templates/artifacts/learning-proposal.md +19 -0
- package/templates/artifacts/next-run-handoff.md +21 -0
- package/templates/artifacts/plan-review.md +19 -0
- package/templates/artifacts/proposed-learning.template.json +12 -0
- package/templates/artifacts/research.md +21 -0
- package/templates/artifacts/research.template.json +12 -0
- package/templates/artifacts/review-findings.md +19 -0
- package/templates/artifacts/review.template.json +11 -0
- package/templates/artifacts/run-manifest.template.json +8 -0
- package/templates/artifacts/spec-challenge.md +19 -0
- package/templates/artifacts/spec-challenge.template.json +11 -0
- package/templates/artifacts/spec.md +21 -0
- package/templates/artifacts/spec.template.json +12 -0
- package/templates/artifacts/verification-proof.md +19 -0
- package/templates/artifacts/verification-proof.template.json +11 -0
- package/templates/examples/accepted-learning.example.json +14 -0
- package/templates/examples/author.example.json +152 -0
- package/templates/examples/clarification.example.json +15 -0
- package/templates/examples/docs-claim.example.json +8 -0
- package/templates/examples/export-manifest.example.json +7 -0
- package/templates/examples/host-export-package.example.json +11 -0
- package/templates/examples/implementation-plan.example.json +17 -0
- package/templates/examples/proposed-learning.example.json +13 -0
- package/templates/examples/research.example.json +15 -0
- package/templates/examples/research.example.md +6 -0
- package/templates/examples/review.example.json +17 -0
- package/templates/examples/run-manifest.example.json +9 -0
- package/templates/examples/spec-challenge.example.json +14 -0
- package/templates/examples/spec.example.json +21 -0
- package/templates/examples/verification-proof.example.json +21 -0
- package/templates/examples/wazir-manifest.example.yaml +65 -0
- package/templates/task-definition-schema.md +99 -0
- package/tooling/README.md +20 -0
- package/tooling/src/adapters/context-mode.js +50 -0
- package/tooling/src/capture/command.js +376 -0
- package/tooling/src/capture/store.js +99 -0
- package/tooling/src/capture/usage.js +270 -0
- package/tooling/src/checks/branches.js +50 -0
- package/tooling/src/checks/brand-truth.js +110 -0
- package/tooling/src/checks/changelog.js +231 -0
- package/tooling/src/checks/command-registry.js +36 -0
- package/tooling/src/checks/commits.js +102 -0
- package/tooling/src/checks/docs-drift.js +103 -0
- package/tooling/src/checks/docs-truth.js +201 -0
- package/tooling/src/checks/runtime-surface.js +156 -0
- package/tooling/src/cli.js +116 -0
- package/tooling/src/command-options.js +56 -0
- package/tooling/src/commands/validate.js +320 -0
- package/tooling/src/doctor/command.js +91 -0
- package/tooling/src/export/command.js +77 -0
- package/tooling/src/export/compiler.js +498 -0
- package/tooling/src/guards/loop-cap-guard.js +52 -0
- package/tooling/src/guards/protected-path-write-guard.js +67 -0
- package/tooling/src/index/command.js +152 -0
- package/tooling/src/index/storage.js +1061 -0
- package/tooling/src/index/summarizers.js +261 -0
- package/tooling/src/loaders.js +18 -0
- package/tooling/src/project-root.js +22 -0
- package/tooling/src/recall/command.js +225 -0
- package/tooling/src/schema-validator.js +30 -0
- package/tooling/src/state-root.js +40 -0
- package/tooling/src/status/command.js +71 -0
- package/wazir.manifest.yaml +135 -0
- package/workflows/README.md +19 -0
- package/workflows/author.md +42 -0
- package/workflows/clarify.md +38 -0
- package/workflows/design-review.md +46 -0
- package/workflows/design.md +44 -0
- package/workflows/discover.md +37 -0
- package/workflows/execute.md +48 -0
- package/workflows/learn.md +38 -0
- package/workflows/plan-review.md +42 -0
- package/workflows/plan.md +39 -0
- package/workflows/prepare-next.md +37 -0
- package/workflows/review.md +40 -0
- package/workflows/run-audit.md +41 -0
- package/workflows/spec-challenge.md +41 -0
- package/workflows/specify.md +38 -0
- package/workflows/verify.md +37 -0
@@ -0,0 +1,971 @@

# Rate Limiting & Throttling

> Performance expertise module covering algorithms, distributed implementations, client-side
> strategies, and production decision frameworks for rate limiting and throttling systems.

---

## Table of Contents

1. [Core Concepts: Rate Limiting vs Throttling vs Backpressure](#core-concepts)
2. [Rate Limiting Algorithms](#rate-limiting-algorithms)
3. [Algorithm Performance Comparison](#algorithm-performance-comparison)
4. [Distributed Rate Limiting with Redis](#distributed-rate-limiting-with-redis)
5. [Client-Side Throttling and Debouncing](#client-side-throttling-and-debouncing)
6. [API Rate Limit Headers](#api-rate-limit-headers)
7. [Backpressure Mechanisms](#backpressure-mechanisms)
8. [Common Bottlenecks](#common-bottlenecks)
9. [Anti-Patterns](#anti-patterns)
10. [Before/After: System Stability Under Load](#beforeafter-system-stability-under-load)
11. [Decision Tree: Which Algorithm Should I Use?](#decision-tree-which-algorithm-should-i-use)
12. [Production Case Studies](#production-case-studies)
13. [Sources](#sources)

---

## Core Concepts

### Rate Limiting

Rate limiting enforces a strict quantitative ceiling on operations within a fixed timeframe.
When a client exceeds N requests per window, subsequent requests are rejected outright with
HTTP 429 ("Too Many Requests"). The enforcement is binary: the request either passes or is
denied. Typical configurations enforce 100-10,000 requests per minute per API key.

### Throttling

Throttling controls the *rate at which requests are processed* rather than rejecting them.
A throttled client continues receiving responses but at a degraded throughput -- for example,
a search API that normally responds in 50ms may add artificial delays of 200-500ms to slow
a misbehaving client. Throttling preserves availability while degrading performance, whereas
rate limiting preserves server resources by shedding load.

### Backpressure

Backpressure is an upstream signal that tells producers to slow down when downstream
consumers cannot keep pace. Unlike rate limiting (which is imposed on callers), backpressure
is a cooperative feedback loop in which a service communicates its capacity to its upstream
dependencies. In practice, backpressure manifests as TCP flow control, HTTP 429 with
Retry-After headers, Kafka consumer lag signals, or reactive streams demand signaling.

### When to Use Each

| Mechanism     | Direction              | Behavior When Triggered        | Best For                          |
|---------------|------------------------|--------------------------------|-----------------------------------|
| Rate Limiting | Server to client       | Reject excess requests (429)   | API abuse prevention, DDoS        |
| Throttling    | Server to client       | Slow down processing           | Graceful degradation under load   |
| Backpressure  | Downstream to upstream | Signal producer to reduce rate | Internal service-to-service flows |

---

## Rate Limiting Algorithms

### 1. Token Bucket

**How it works:** A bucket holds up to `B` tokens (the burst capacity). Tokens are added
at a fixed refill rate of `R` tokens per second. Each incoming request consumes 1 token.
If the bucket is empty, the request is rejected. Tokens accumulate up to the bucket
capacity, enabling short bursts.

```
Parameters:
- Bucket capacity (B): maximum burst size (e.g., 100 tokens)
- Refill rate (R): tokens added per second (e.g., 10/s)

State per client:
- current_tokens: float or integer
- last_refill_timestamp: epoch milliseconds

Per-request logic:
1. Calculate elapsed time since last refill
2. Add (elapsed * R) tokens, cap at B
3. If current_tokens >= 1: consume 1 token, allow request
4. Else: reject request
```

**Memory per bucket:** ~24-40 bytes (2 numeric fields + timestamp). At 1M users, that is
roughly 24-40 MB of state.

**Strengths:** Allows controlled bursts (a client that was idle accumulates tokens).
Simple to reason about. Two tunable parameters (rate and burst) map cleanly to business
requirements. Used in production at Stripe, AWS API Gateway, and NGINX.

**Weaknesses:** Requires per-user state. Tuning burst capacity too high allows traffic
spikes that defeat the purpose of limiting. Does not smooth the output rate -- bursts pass
through to the backend.
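
The per-request logic above can be condensed into a small in-memory limiter. This is a minimal single-process sketch, not a production implementation; the class and method names are our own, and the timestamp is injected to keep the logic deterministic and testable:

```javascript
// In-memory token bucket: capacity B, refill rate R tokens/sec.
class TokenBucket {
  constructor(capacity, refillPerSec) {
    this.capacity = capacity;         // B: burst capacity
    this.refillPerSec = refillPerSec; // R: tokens added per second
    this.tokens = capacity;           // start full
    this.lastRefillMs = 0;
  }

  // Returns true if the request is allowed, false if rejected.
  tryConsume(nowMs) {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    // Refill lazily on each check, capped at the bucket capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Note how an idle bucket refills back up to `B`, which is exactly what gives token bucket its burst allowance.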

### 2. Leaky Bucket

**How it works:** Requests enter a FIFO queue (the bucket) with fixed capacity `B`.
Requests drain from the queue at a constant rate `R` per second. If the queue is full
when a new request arrives, it is dropped. The output rate is perfectly smooth regardless
of input burstiness.

```
Parameters:
- Queue capacity (B): maximum queued requests
- Leak rate (R): requests processed per second

State per client:
- queue: bounded FIFO (or counter + timestamp)
- last_leak_timestamp

Per-request logic:
1. Drain (elapsed * R) items from queue
2. If queue length < B: enqueue request
3. Else: reject request
```

**Memory per bucket:** ~24-32 bytes if using a counter-based implementation (no actual
queue). If using a real FIFO queue, memory grows with queue depth.

**Strengths:** Produces perfectly smooth, constant-rate output. Simple to implement.
Prevents any burst from reaching the backend.

**Weaknesses:** Cannot handle legitimate burst traffic -- even a briefly idle client
gets no burst allowance. Adds latency because requests wait in the queue. Less flexible
than token bucket for APIs where occasional bursts are acceptable.
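
The counter-based variant mentioned above can be sketched as follows (illustrative names, single-process only; the queue depth is tracked as a number rather than a real FIFO, which is why it uses so little memory):

```javascript
// Counter-based leaky bucket: queue capacity B, drain rate R requests/sec.
class LeakyBucket {
  constructor(capacity, leakPerSec) {
    this.capacity = capacity;     // B: max queued requests
    this.leakPerSec = leakPerSec; // R: drain rate
    this.depth = 0;               // simulated queue depth
    this.lastLeakMs = 0;
  }

  // Returns true if the request fits in the queue, false if dropped.
  tryEnqueue(nowMs) {
    const elapsedSec = Math.max(0, nowMs - this.lastLeakMs) / 1000;
    // Drain (elapsed * R) items lazily, never below empty.
    this.depth = Math.max(0, this.depth - elapsedSec * this.leakPerSec);
    this.lastLeakMs = nowMs;
    if (this.depth < this.capacity) {
      this.depth += 1;
      return true;
    }
    return false;
  }
}
```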

### 3. Fixed Window Counter

**How it works:** Time is divided into fixed windows of duration `W` (e.g., 60 seconds).
A counter tracks requests in the current window. When the counter reaches the limit `L`,
subsequent requests are rejected until the window resets.

```
Parameters:
- Window size (W): e.g., 60 seconds
- Limit (L): e.g., 100 requests per window

State per client:
- counter: integer
- window_start: timestamp

Per-request logic:
1. If current_time >= window_start + W: reset counter to 0, update window_start
2. If counter < L: increment counter, allow request
3. Else: reject request
```

**Memory per client:** ~16 bytes (1 integer + 1 timestamp). The most memory-efficient
algorithm.

**Strengths:** Extremely simple to implement. O(1) time and space. Trivial to implement
in Redis with INCR + EXPIRE (2 commands).

**Weaknesses:** Suffers from the **boundary burst problem**. A client can send L requests
at the end of window N and L requests at the start of window N+1, effectively sending 2L
requests in a span of W seconds. This can allow up to 2x the intended rate at window
boundaries. For a 100 req/min limit, a client could send 200 requests in a 60-second span
straddling two windows.
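
A minimal sketch makes the boundary burst concrete (illustrative names, single-process; the test scenario below sends 2L requests in a short span straddling a window edge):

```javascript
// Fixed window counter: limit L per window of W milliseconds.
class FixedWindow {
  constructor(limit, windowMs) {
    this.limit = limit;       // L
    this.windowMs = windowMs; // W
    this.count = 0;
    this.windowStart = 0;
  }

  tryAcquire(nowMs) {
    if (nowMs >= this.windowStart + this.windowMs) {
      // New window: reset the counter and align the window start.
      this.count = 0;
      this.windowStart = nowMs - (nowMs % this.windowMs);
    }
    if (this.count < this.limit) {
      this.count += 1;
      return true;
    }
    return false;
  }
}
```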

### 4. Sliding Window Log

**How it works:** Every request timestamp is stored in a sorted set. When a new request
arrives, all timestamps older than `(now - W)` are removed. If the remaining count is
below the limit, the request is allowed and its timestamp is added.

```
Parameters:
- Window size (W): e.g., 60 seconds
- Limit (L): e.g., 100 requests per window

State per client:
- sorted set of timestamps (e.g., Redis ZSET)

Per-request logic:
1. Remove all entries with timestamp < (now - W)
2. Count remaining entries
3. If count < L: add current timestamp, allow request
4. Else: reject request
```

**Memory per client:** O(L) -- stores up to L timestamps. At 100 requests per window with
8-byte timestamps, that is ~800 bytes per client. At 1M users, that is ~800 MB. At
10,000 requests per window, memory grows to ~80 KB per client.

**Strengths:** Perfectly accurate. No boundary burst problem. Exact sliding window
semantics.

**Weaknesses:** Memory-intensive: O(L) per client, where L is the rate limit. Requires
cleanup of expired entries on every request, which is O(L) worst case. Not practical for
high-limit scenarios (e.g., 10,000 req/min). The ZREMRANGEBYSCORE operation on large sets
adds latency.
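
An in-memory version of the same logic shows both the exactness and the O(L) cleanup cost (a minimal sketch with illustrative names; a sorted array stands in for the Redis ZSET):

```javascript
// Sliding window log: stores every request timestamp, O(limit) memory per client.
class SlidingWindowLog {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // sorted ascending, like a ZSET by score
  }

  tryAcquire(nowMs) {
    const cutoff = nowMs - this.windowMs;
    // Evict entries that have left the window (O(L) worst case).
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(nowMs);
      return true;
    }
    return false;
  }
}
```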

### 5. Sliding Window Counter

**How it works:** Combines the fixed window counter with a weighted overlap calculation.
Maintains counters for the current and previous windows. The effective count is calculated as:

```
effective_count = (previous_window_count * overlap_percentage) + current_window_count

Where overlap_percentage = 1 - (elapsed_time_in_current_window / window_size)
```

For example, if we are 30 seconds into a 60-second window, the overlap is 50%. If the previous
window had 80 requests and the current has 30, the effective count is (80 * 0.5) + 30 = 70.

```
Parameters:
- Window size (W): e.g., 60 seconds
- Limit (L): e.g., 100 requests per window

State per client:
- previous_counter: integer
- current_counter: integer
- current_window_start: timestamp

Per-request logic:
1. If current_time >= window_start + W: rotate counters
2. Calculate overlap_pct = 1 - ((now - window_start) / W)
3. effective = (prev_counter * overlap_pct) + current_counter
4. If effective < L: increment current_counter, allow request
5. Else: reject request
```

**Memory per client:** ~20 bytes (2 integers + 1 timestamp). O(1) space -- the same order as
fixed window.

**Strengths:** Smooths boundary bursts with minimal memory overhead. O(1) time and space.
In real-world testing, only 0.003% of requests were incorrectly allowed compared to a
perfect sliding window log (source: Cloudflare engineering). Best balance of accuracy and
efficiency.

**Weaknesses:** Approximate -- not perfectly accurate. Assumes a uniform distribution of
requests within the previous window. Slightly more complex to implement than fixed window.
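
The counter rotation and weighted overlap can be sketched in a few lines (a minimal single-process sketch; class and field names are our own, and the test replays the 80-previous/30-current worked example above):

```javascript
// Sliding window counter: two counters plus a weighted overlap
// approximate a true sliding window in O(1) space.
class SlidingWindowCounter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.prevCount = 0;
    this.currCount = 0;
    this.windowStart = 0;
  }

  tryAcquire(nowMs) {
    const elapsedWindows = Math.floor((nowMs - this.windowStart) / this.windowMs);
    if (elapsedWindows >= 2) {
      // Idle for more than a full window: both counters are stale.
      this.prevCount = 0;
      this.currCount = 0;
      this.windowStart += elapsedWindows * this.windowMs;
    } else if (elapsedWindows === 1) {
      // Rotate: current window becomes the previous one.
      this.prevCount = this.currCount;
      this.currCount = 0;
      this.windowStart += this.windowMs;
    }
    const overlapPct = 1 - (nowMs - this.windowStart) / this.windowMs;
    const effective = this.prevCount * overlapPct + this.currCount;
    if (effective < this.limit) {
      this.currCount += 1;
      return true;
    }
    return false;
  }
}
```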

---

## Algorithm Performance Comparison

| Algorithm              | Memory/Client | Time Complexity | Burst Handling  | Accuracy     | Impl. Complexity |
|------------------------|---------------|-----------------|-----------------|--------------|------------------|
| Token Bucket           | ~32 bytes     | O(1)            | Allows bursts   | Exact (rate) | Low-Medium       |
| Leaky Bucket           | ~32 bytes     | O(1)            | Smooths bursts  | Exact (rate) | Low              |
| Fixed Window Counter   | ~16 bytes     | O(1)            | Boundary bursts | 2x overshoot | Very Low         |
| Sliding Window Log     | ~8L bytes     | O(L) cleanup    | No bursts       | Perfect      | Medium           |
| Sliding Window Counter | ~20 bytes     | O(1)            | Smoothed        | ~99.997%     | Low-Medium       |

### Memory at Scale (1M concurrent users)

| Algorithm              | Memory @ 100 req/min limit | Memory @ 10K req/min limit |
|------------------------|----------------------------|----------------------------|
| Token Bucket           | ~32 MB                     | ~32 MB                     |
| Leaky Bucket           | ~32 MB                     | ~32 MB                     |
| Fixed Window Counter   | ~16 MB                     | ~16 MB                     |
| Sliding Window Log     | ~800 MB                    | ~80 GB                     |
| Sliding Window Counter | ~20 MB                     | ~20 MB                     |

The sliding window log is the only algorithm whose memory scales with the rate limit value
itself. All others use constant space per client regardless of the limit.

### Throughput Overhead

A well-implemented rate limiter adds minimal latency:

- **In-memory (single-process):** 50-200 nanoseconds per check. Negligible overhead.
- **Redis single-node:** 0.1-0.5ms per check (the network round trip dominates). At sub-ms
  latency, this supports 50,000+ rate limit checks per second per Redis node.
- **Redis with Lua script:** Single round trip regardless of algorithm complexity. P95
  latency < 2ms, P99 < 5ms in production (source: Redis benchmarks).
- **Redis cluster:** Add ~0.1ms for slot redirection on first access. Throughput scales
  linearly with nodes.

---

## Distributed Rate Limiting with Redis

### Why Redis

Redis is the de facto choice for distributed rate limiting because:

1. **Sub-millisecond latency:** Redis processes commands in the sub-microsecond range
   internally. Unix domain socket latency is ~30 microseconds; network latency on 1 Gbit/s
   is ~200 microseconds (source: Redis latency documentation).
2. **Atomic operations:** INCR, EXPIRE, ZADD, and ZRANGEBYSCORE execute atomically.
3. **Lua scripting:** EVAL/EVALSHA run multi-step logic atomically on the server,
   eliminating race conditions and reducing round trips from 3-4 to 1.
4. **Built-in expiration:** TTL-based key expiry handles window rotation automatically.

### The Race Condition Problem

Without atomicity, concurrent requests create a classic TOCTOU (time-of-check/time-of-use)
race:

```
Thread A: GET counter -> 99 (under limit of 100)
Thread B: GET counter -> 99 (under limit of 100)
Thread A: INCR counter -> 100 (allowed)
Thread B: INCR counter -> 101 (SHOULD have been rejected)
```

At 10,000 requests per second, this race condition can allow 5-15% over-admission in
testing (source: Halodoc engineering blog).

### Lua Script Solutions

#### Fixed Window with Lua

```lua
-- KEYS[1] = rate limit key
-- ARGV[1] = limit
-- ARGV[2] = window size in seconds
local current = redis.call('INCR', KEYS[1])
if current == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[2])
end
if current > tonumber(ARGV[1]) then
  return 0 -- rejected
end
return 1 -- allowed
```

#### Sliding Window Log with Lua

```lua
-- KEYS[1] = rate limit key
-- ARGV[1] = limit
-- ARGV[2] = window size in milliseconds
-- ARGV[3] = current timestamp in milliseconds
-- ARGV[4] = unique request ID
local window_start = tonumber(ARGV[3]) - tonumber(ARGV[2])
redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', window_start)
local count = redis.call('ZCARD', KEYS[1])
if count < tonumber(ARGV[1]) then
  redis.call('ZADD', KEYS[1], ARGV[3], ARGV[4])
  redis.call('PEXPIRE', KEYS[1], ARGV[2])
  return 1 -- allowed
end
return 0 -- rejected
```

Both scripts use the same return convention: `1` means allowed, `0` means rejected.

#### Token Bucket with Lua

```lua
-- KEYS[1] = bucket key
-- ARGV[1] = bucket capacity
-- ARGV[2] = refill rate (tokens/sec)
-- ARGV[3] = current timestamp (seconds, float)
-- ARGV[4] = tokens to consume
local bucket = redis.call('HMGET', KEYS[1], 'tokens', 'last_refill')
local tokens = tonumber(bucket[1]) or tonumber(ARGV[1])
local last_refill = tonumber(bucket[2]) or tonumber(ARGV[3])
local now = tonumber(ARGV[3])
local elapsed = math.max(0, now - last_refill)
tokens = math.min(tonumber(ARGV[1]), tokens + elapsed * tonumber(ARGV[2]))
local allowed = 0
if tokens >= tonumber(ARGV[4]) then
  tokens = tokens - tonumber(ARGV[4])
  allowed = 1
end
redis.call('HSET', KEYS[1], 'tokens', tokens, 'last_refill', now)
redis.call('EXPIRE', KEYS[1], math.ceil(tonumber(ARGV[1]) / tonumber(ARGV[2])) * 2)
return allowed
```

### EVAL vs EVALSHA

- **EVAL** sends the full script source on every call. This wastes bandwidth --
  a typical rate limiter Lua script is 300-800 bytes.
- **EVALSHA** sends only the 40-byte SHA1 hash after the script is loaded via SCRIPT LOAD.
  At 50,000 requests/second, this saves ~12-38 MB/s of bandwidth.

**Production pattern:** Load scripts at application startup, cache the SHA1 hashes, and call
EVALSHA. If Redis returns NOSCRIPT (after a restart or failover), fall back to EVAL once
and re-cache the hash.

### Handling Hot Keys

When millions of users share a single rate limit key (e.g., a global API limit), that key
becomes a hot key on a single Redis shard. Mitigation strategies:

1. **Key sharding:** Append a shard suffix (e.g., `ratelimit:global:{0-7}`) and sum
   counters across shards. Distributes load across 8 Redis slots.
2. **Local pre-aggregation:** Aggregate counts in-process for 100-500ms, then flush to
   Redis. Reduces Redis operations by 10-50x at the cost of accuracy.
3. **Redis Cluster with read replicas:** Route read-heavy operations (checking remaining
   quota) to replicas. Writes still go to the primary.
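
The key-sharding strategy boils down to two small helpers (a sketch with names of our own: writers pick a random shard per increment, readers sum the per-shard counters to get the effective global count):

```javascript
// Pick one of `shards` sharded keys at random for this increment.
// The {N} hash tag pins each shard's key to a single cluster slot.
function shardKey(base, shards) {
  const shard = Math.floor(Math.random() * shards);
  return `${base}:{${shard}}`;
}

// Sum per-shard counts (e.g., the reply from MGET over all shard keys;
// missing keys come back as null and count as zero).
function globalCount(shardCounts) {
  return shardCounts.reduce((sum, n) => sum + (Number(n) || 0), 0);
}
```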

---

## Client-Side Throttling and Debouncing

### Debouncing

Debouncing delays execution until activity stops for a specified period. Ideal for
search-as-you-type or autocomplete where only the final input matters.

```javascript
function debounce(fn, delayMs) {
  let timer;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
}
// Usage: debounce(searchAPI, 300) -- waits 300ms after the last keystroke
```

**Impact:** A user typing 5 characters in 1 second generates 1 API call instead of 5.
At 100,000 concurrent users, debouncing at 300ms reduces search API traffic by 60-80%.

### Throttling (Client-Side)

Throttling limits execution to at most once per time interval, regardless of how many
times the function is invoked.

```javascript
function throttle(fn, intervalMs) {
  let lastCall = 0;
  return (...args) => {
    const now = Date.now();
    if (now - lastCall >= intervalMs) {
      lastCall = now;
      fn(...args);
    }
  };
}
// Usage: throttle(trackScroll, 100) -- at most 10 calls/sec on scroll
```

**Impact:** Scroll tracking that fires 60 times/second (matching 60fps) is reduced to
10 calls/second with 100ms throttling -- an 83% reduction in API calls.

### Exponential Backoff with Jitter

When a client receives HTTP 429, it should retry with exponential backoff:

```
delay = min(base_delay * 2^attempt, max_delay) + random_jitter

Where:
  base_delay = 1 second (typical)
  max_delay  = 32-64 seconds (cap)
  jitter     = random(0, delay * 0.5) -- prevents thundering herd
```

**Without jitter:** If 1,000 clients hit a rate limit simultaneously and all retry at
exactly 2^N seconds, the server sees synchronized spikes of 1,000 requests at t=1s, t=2s,
t=4s, t=8s -- the thundering herd problem.

**With jitter:** The same 1,000 retries spread across a time range. At attempt 3 with
base delay 1s: delay = 8s + random(0, 4s), so retries spread across the 8-12s window.
This reduces peak retry load by 60-80%.
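
The delay formula above translates directly into code (a sketch with names of our own; the random source is injectable so the jitter range can be verified):

```javascript
// delay = min(base * 2^attempt, cap) + random(0, delay * 0.5)
function backoffDelayMs(attempt, baseMs = 1000, capMs = 32000, rand = Math.random) {
  const exp = Math.min(baseMs * 2 ** attempt, capMs); // capped exponential
  const jitter = rand() * exp * 0.5;                  // spread retries out
  return exp + jitter;
}
```

At attempt 3 this yields 8s plus up to 4s of jitter, matching the worked example above.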

### Adaptive Client-Side Rate Limiting (Google SRE)

Google's approach from the SRE book implements client-side throttling based on the observed
rejection rate. Each new request is rejected locally with probability:

```
client_rejection_probability = max(0, (requests - K * accepts) / (requests + 1))

Where:
  requests = total requests in the recent window
  accepts  = requests that were accepted (not rate-limited)
  K        = multiplier (typically 2.0)
```

When the backend starts rejecting requests, the client proactively reduces its own send
rate. At K=2, the client starts self-throttling when more than 50% of requests are
rejected. This prevents wasted work (sending requests that will be rejected) and reduces
server load from processing rejections.
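
The formula is a one-liner in practice (a direct transcription of the expression above; the function name is ours):

```javascript
// Probability of locally rejecting the next request, given recent
// request/accept counts (Google SRE adaptive throttling formula).
function rejectionProbability(requests, accepts, K = 2.0) {
  return Math.max(0, (requests - K * accepts) / (requests + 1));
}
```

With K=2, the probability stays at zero until fewer than half of recent requests are accepted, then rises toward 1 as the backend rejects more.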

---

## API Rate Limit Headers

### Current Standard (IETF Draft)

The IETF HTTPAPI Working Group is standardizing rate limit headers via
`draft-ietf-httpapi-ratelimit-headers` (currently at draft-10, expires March 2026).
The specification defines:

| Header                | Purpose                                        | Example Value |
|-----------------------|------------------------------------------------|---------------|
| `RateLimit-Limit`     | Maximum requests allowed in the current window | `100`         |
| `RateLimit-Remaining` | Requests remaining in the current window       | `47`          |
| `RateLimit-Reset`     | Seconds until the rate limit window resets     | `30`          |
| `RateLimit-Policy`    | Describes the rate limit policy                | `100;w=60`    |
| `Retry-After`         | Seconds to wait before retrying (on 429)       | `30`          |

### Legacy Headers (Still Widely Used)

Before the IETF draft, APIs used non-standard `X-RateLimit-*` headers:

```
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 742
X-RateLimit-Reset: 1672531200 (Unix epoch -- inconsistent across APIs)
```

The key inconsistency: `X-RateLimit-Reset` is sometimes a Unix timestamp (GitHub, Twitter)
and sometimes delta-seconds (others). The IETF draft standardizes on delta-seconds for
`RateLimit-Reset`, consistent with `Retry-After` from RFC 9110.

### HTTP Status Codes

| Code | Meaning             | When to Use                                   |
|------|---------------------|-----------------------------------------------|
| 429  | Too Many Requests   | Client exceeded rate limit (RFC 6585)         |
| 503  | Service Unavailable | Server-side overload / load shedding          |
| 403  | Forbidden           | Some APIs use this instead of 429 (not ideal) |

**Best practice:** Always return 429 for rate limiting and 503 for load shedding, and
include a `Retry-After` header with both. Include `RateLimit-*` headers on ALL responses
(not just 429s) so clients can proactively manage their request rate.
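
A small helper can build this header set from limiter state on every response (a sketch using the IETF draft header names; the limiter-state object shape is our own, not a specific framework's API):

```javascript
// Build RateLimit-* headers (IETF draft names) from limiter state.
// Adds Retry-After only when the client is out of quota (the 429 case).
function rateLimitHeaders({ limit, remaining, resetSec, windowSec }) {
  const headers = {
    'RateLimit-Limit': String(limit),
    'RateLimit-Remaining': String(Math.max(0, remaining)),
    'RateLimit-Reset': String(resetSec),            // delta-seconds, per the draft
    'RateLimit-Policy': `${limit};w=${windowSec}`,  // e.g. "100;w=60"
  };
  if (remaining <= 0) headers['Retry-After'] = String(resetSec);
  return headers;
}
```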

### Response Body Best Practice

```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Limit: 100 requests per 60 seconds.",
    "retry_after": 30,
    "limit": 100,
    "remaining": 0,
    "reset": 30
  }
}
```

---

## Backpressure Mechanisms

### TCP-Level Backpressure

TCP flow control is the original backpressure mechanism. When a receiver's buffer fills,
it advertises a smaller window size, causing the sender to slow down. This is invisible
to the application layer but can cause connection pooling issues when TCP windows shrink
under load.

### Application-Level Backpressure

#### Queue-Based Backpressure

Bounded queues with rejection policies provide explicit backpressure:

```
Queue capacity: 10,000 items
Current depth:  9,500 items (95% full)
Action:         Return 503 to new requests, signal upstream to reduce rate
```

When queue depth exceeds 80% capacity, start rejecting low-priority requests. At 95%,
reject all non-critical requests. This graduated response prevents queue overflow while
maintaining service for critical traffic.
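
The graduated policy above can be sketched as a single admission predicate; the 80% and
95% thresholds mirror the text, and the priority labels are illustrative:

```python
def admit(queue_depth: int, capacity: int, priority: str) -> bool:
    """Graduated backpressure: shed low-priority first, then all non-critical."""
    fill = queue_depth / capacity
    if fill >= 0.95:
        return priority == "critical"   # near overflow: critical traffic only
    if fill >= 0.80:
        return priority != "low"        # stressed: start shedding low priority
    return True                         # healthy: admit everything
```

Rejected requests would receive a 503 with `Retry-After`, signaling upstream callers to
reduce their rate.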

#### Reactive Streams Backpressure

Reactive frameworks (RxJava, Project Reactor, Akka Streams) implement demand-based
backpressure where consumers explicitly request N items from producers:

- `request(10)` -- consumer can handle 10 more items
- Producer sends at most 10 items, then waits for more demand
- If consumer is slow, producer naturally slows without buffering
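
A minimal, framework-free sketch of the demand protocol -- not the actual Reactive
Streams `Publisher`/`Subscriber` interfaces, just the core idea that the producer emits
nothing until the consumer has signaled demand:

```python
class Producer:
    """Emits items only against outstanding consumer demand (illustrative)."""

    def __init__(self, items):
        self._items = iter(items)
        self._demand = 0

    def request(self, n: int) -> None:
        """Consumer signals it can handle n more items."""
        self._demand += n

    def poll(self):
        """Emit one item if there is demand, else None (producer waits)."""
        if self._demand == 0:
            return None          # no demand: no emission, no buffering
        self._demand -= 1
        return next(self._items, None)
```

A slow consumer simply stops calling `request`, and the producer idles instead of
accumulating an unbounded buffer.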

#### Load Shedding as Backpressure (Stripe Model)

Stripe implements 4 tiers of rate limiting in production (source: Stripe engineering blog):

1. **Request Rate Limiter:** Token bucket, N requests/second per user.
2. **Concurrent Request Limiter:** Caps simultaneous in-flight requests per user to
   manage CPU-intensive endpoints.
3. **Fleet Usage Load Shedder:** Reserves 20% of fleet capacity for critical requests.
   Non-critical requests are rejected with 503 when the fleet exceeds 80% utilization.
4. **Worker Utilization Load Shedder:** Final defense. When individual workers are
   overloaded, they progressively shed lower-priority traffic, starting with test-mode
   requests.

This tiered approach means Stripe can handle 100+ billion API requests without cascading
failures (source: Stripe engineering).
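
Tier 2 is the least widely known of the four. A single-process sketch using a
non-blocking semaphore (class name and parameters are illustrative, not Stripe's
implementation):

```python
import threading

class ConcurrentRequestLimiter:
    """Caps simultaneous in-flight requests for one user (tier 2 sketch)."""

    def __init__(self, max_in_flight: int):
        self._sem = threading.BoundedSemaphore(max_in_flight)

    def try_acquire(self) -> bool:
        # Non-blocking: reject immediately (429/503) instead of queueing,
        # so slow CPU-heavy requests cannot pile up behind each other.
        return self._sem.acquire(blocking=False)

    def release(self) -> None:
        """Call when the request finishes, success or failure."""
        self._sem.release()
```

Unlike a request-rate cap, this bounds *work in progress*: a user issuing slow requests
is throttled even if their request rate is modest.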

---

## Common Bottlenecks

### 1. The Rate Limiter Becomes the Bottleneck

**Problem:** If the rate limiter itself is slower than the services it protects, it adds
latency to every request -- including those well within their limits.

**Symptoms:**
- P99 latency increases by 5-50ms under load
- Rate limiter Redis CPU exceeds 80%
- Rate limiter timeouts cause cascading failures

**Solutions:**
- Use in-memory rate limiting for single-instance services (~200ns per check vs ~500us
  for Redis)
- Use Lua scripts to reduce Redis round trips from 3-4 to 1
- Set aggressive timeouts on Redis calls (e.g., 5ms). If Redis is unavailable, fail
  open (allow the request) rather than blocking
- Pre-compute rate limit decisions and cache them for 100-500ms
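
The in-memory option can be sketched as a single-process sliding window counter, which
estimates the rolling count as a weighted blend of the previous and current fixed
windows (class and parameter names are illustrative):

```python
import time
from typing import Optional

class SlidingWindowCounter:
    """Single-process sliding window counter (~20 bytes of state per key)."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.current_index = 0
        self.current_count = 0
        self.previous_count = 0

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        index = int(now // self.window)
        if index != self.current_index:
            # Roll over; a gap of more than one window means "previous" is empty.
            self.previous_count = (
                self.current_count if index == self.current_index + 1 else 0
            )
            self.current_index = index
            self.current_count = 0
        # Weight the previous window by how much of it still overlaps the
        # rolling window ending at `now`.
        elapsed_fraction = (now % self.window) / self.window
        estimated = self.previous_count * (1 - elapsed_fraction) + self.current_count
        if estimated >= self.limit:
            return False
        self.current_count += 1
        return True
```

A check is a handful of arithmetic operations on three integers -- no network hop, which
is where the ~200ns-vs-~500us gap comes from.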

### 2. Redis Round-Trip Latency

**Problem:** Each rate limit check requires a Redis round trip. At 50,000 requests/second,
that is 50,000 Redis operations/second, consuming significant network bandwidth and Redis
CPU.

**Measurements:**
- Same-datacenter Redis: 0.2-0.5ms per round trip
- Cross-datacenter Redis: 1-5ms per round trip
- Redis Lua script (single round trip): 0.3-1ms regardless of algorithm complexity

**Solutions:**
- **Pipeline commands:** Batch multiple rate limit checks into a single Redis pipeline.
  Reduces per-check latency by 3-5x.
- **Local token cache:** Each application instance maintains a local allocation of tokens
  (e.g., 1/N of the global limit where N = number of instances). Refresh from Redis every
  1-5 seconds. Reduces Redis calls by 100-1000x.
- **Redis Cluster:** Shard rate limit keys across nodes. Linear throughput scaling.
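
The local token cache can be sketched as follows. The shared-store refresh is reduced
here to resetting a local allocation; a real implementation would reconcile the local
count with Redis on each refresh. Names and the 1/N split are illustrative:

```python
class LocalTokenCache:
    """Local slice of a global limit: each of N instances gets limit/N tokens
    and touches the shared store only once per refresh interval."""

    def __init__(self, global_limit: int, instances: int,
                 refresh_interval: float = 1.0):
        self.allocation = global_limit // instances
        self.refresh_interval = refresh_interval
        self.tokens = self.allocation
        self.last_refresh = 0.0
        self.store_calls = 0   # how often we actually hit the shared store

    def allow(self, now: float) -> bool:
        if now - self.last_refresh >= self.refresh_interval:
            # One store round trip replaces hundreds of per-request checks.
            self.store_calls += 1
            self.tokens = self.allocation
            self.last_refresh = now
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False
```

The trade-off is accuracy: a static 1/N split under-admits when traffic is skewed toward
a few instances, which is why refreshes usually rebalance against the global counter.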

### 3. Hot Keys in Redis

**Problem:** A single popular API key or a global rate limit creates a "hot key" that
concentrates all writes on one Redis shard.

**Impact:** A single Redis shard handles ~100,000-200,000 operations/second. A global
rate limit at 500,000 requests/second exceeds this capacity.

**Solutions:**
- **Key sharding:** Split `ratelimit:global` into `ratelimit:global:{0..15}` and sum
  across shards. Each shard handles 1/16 of the traffic.
- **Probabilistic local counting:** Maintain an approximate local counter (e.g., HyperLogLog
  or a simple counter with periodic sync). Accept ~1-5% inaccuracy.
- **Two-tier limiting:** Coarse local limit (in-memory) + fine-grained global limit
  (Redis). The local limit catches 90% of rejections without touching Redis.
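
A sketch of key sharding with a plain dict standing in for the store. A real deployment
would pick the shard by hashing a request attribute or choosing randomly; the modulo on
a request counter here just makes the illustration deterministic:

```python
SHARDS = 16

def shard_key(base: str, request_id: int) -> str:
    """Spread writes for one hot key across 16 sub-keys."""
    return f"{base}:{request_id % SHARDS}"

def global_count(store: dict, base: str) -> int:
    """The effective global count is the sum over all shards."""
    return sum(store.get(f"{base}:{i}", 0) for i in range(SHARDS))

# Simulate 1,600 increments against a dict standing in for Redis:
store: dict = {}
for request_id in range(1600):
    key = shard_key("ratelimit:global", request_id)
    store[key] = store.get(key, 0) + 1
```

Reads become 16x more expensive (sum across shards), but writes -- the hot-key
bottleneck -- spread evenly.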

### 4. Clock Skew in Distributed Systems

**Problem:** Fixed and sliding window algorithms depend on consistent time. If servers
disagree by >1 second, rate limits become inaccurate.

**Impact:** At 100 requests per 10-second window, a 2-second clock skew between servers
can allow 120 requests (20% over-admission).

**Solutions:**
- Use Redis server time (via the `TIME` command or Lua `redis.call('TIME')`) instead of
  client timestamps. All decisions reference the same clock.
- Use NTP to keep server clocks within 10ms of each other.
- Prefer the token bucket algorithm, which is less sensitive to clock skew (it uses
  elapsed time deltas, not absolute timestamps).
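
The third point can be sketched as a token bucket whose refill is driven purely by
monotonic elapsed time, so wall-clock skew and NTP step adjustments cannot affect it
(single-process, names illustrative):

```python
import time
from typing import Optional

class TokenBucket:
    """Token bucket refilled from elapsed-time deltas, not absolute timestamps."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)   # a delta, immune to clock steps
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```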

---

## Anti-Patterns

### 1. In-Memory Rate Limiting in Distributed Systems

**The mistake:** Using a local in-memory counter (e.g., `ConcurrentHashMap` in Java, or
a process-level dictionary in Python) when running multiple service instances behind a
load balancer.

**Why it fails:** With 10 instances and a limit of 100 requests/minute, a client can
send 100 requests to each instance -- effectively getting a 1,000 request/minute limit.
Each instance sees only 1/N of the traffic.

**The fix:** Use a shared store (Redis, Memcached) or implement a gossip protocol to
synchronize counters. Alternatively, use sticky sessions (but this creates uneven load
distribution and single-point-of-failure risks).

### 2. No Rate Limiting

**The mistake:** Deploying an API without any rate limiting because "our clients are
well-behaved."

**What happens:** A single misbehaving client (or bot, or retry storm) saturates the
service. 73% of SaaS outages are linked to API overuse or poor traffic management
(source: Gartner, 2024). Without rate limiting, a single client can monopolize shared
resources, degrading service for all other users.

**The fix:** Start with a generous rate limit (e.g., 10x your expected peak per-client
traffic). Monitor, measure, then tighten. A loose limit is far better than no limit.

### 3. Too Aggressive Rate Limiting

**The mistake:** Setting rate limits based on average traffic rather than peak traffic.
A limit of 10 requests/second when the client legitimately bursts to 50 requests/second
during normal operation causes constant 429 errors.

**Symptoms:**
- High 429 rate (>5% of requests) during normal operation
- Client retry storms amplifying the problem
- Support tickets from frustrated developers

**The fix:** Set limits at 2-5x the observed P95 client request rate. Use the token
bucket algorithm to allow bursts. Monitor 429 rates by client and alert when they
exceed 1%.

### 4. Rate Limiting Without Informative Responses

**The mistake:** Returning 429 without a `Retry-After` header, without `RateLimit-*`
headers, or with a generic error message.

**Why it matters:** Without `Retry-After`, clients guess when to retry -- often too soon,
creating retry storms. Without `RateLimit-Remaining`, clients cannot proactively manage
their request rate.

**The fix:** Always include `Retry-After` on 429 responses. Include `RateLimit-*` headers
on ALL responses. Include a JSON body with specific error details.

### 5. Using the Wrong Identifier

**The mistake:** Rate limiting by IP address in an environment with NAT gateways, proxies,
or cloud egress. A corporate NAT gateway may funnel 10,000 users through a single IP.

**Impact:** Legitimate users behind the same NAT get rate-limited collectively. Meanwhile,
an attacker can rotate IPs easily (e.g., using cloud VMs at $0.01/hour each).

**The fix:** Rate limit by API key, user ID, or OAuth token. Use IP-based limiting only
as a last resort for unauthenticated endpoints (e.g., login, registration).

### 6. Fail-Closed Rate Limiter

**The mistake:** When Redis (or your rate limit store) is unavailable, rejecting all
requests.

**Impact:** A Redis outage causes a complete API outage -- the rate limiter becomes a
single point of failure.

**The fix:** Fail open. If the rate limit check cannot be completed within 5ms, allow
the request and log the failure. Use circuit breakers around Redis calls. A short period
of unlimited access is vastly preferable to a complete outage.
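
A sketch of the fail-open wrapper under these assumptions: `check` stands for any
zero-argument callable that consults the shared store, and a production version would
also log at WARN, record metrics, and trip a circuit breaker:

```python
import concurrent.futures

def check_with_fail_open(check, timeout_s: float = 0.005,
                         default: bool = True) -> bool:
    """Run a rate limit check with a hard deadline; on timeout or store error,
    allow the request (fail open) rather than rejecting all traffic."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(check).result(timeout=timeout_s)
    except Exception:
        # Timed out or the store errored: the limiter must never become
        # the outage, so we admit the request.
        return default
    finally:
        pool.shutdown(wait=False)   # do not block on a hung store call
```

Spawning a worker per check is for clarity only; real clients set the timeout on the
Redis connection itself.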

---

## Before/After: System Stability Under Load

### Scenario: API Receiving Traffic Spike (10x Normal Load)

#### WITHOUT Rate Limiting

```
Time     Requests/s   Latency (P99)   Error Rate   CPU Usage
t+0s     1,000        50ms            0.1%         40%
t+10s    5,000        200ms           2%           75%
t+20s    10,000       2,000ms         25%          98%
t+30s    10,000       timeout         80%          100%
t+40s    10,000       timeout         95%          100%   <-- cascading failure
t+50s    2,000        timeout         90%          100%   <-- legitimate users affected
t+60s    500          5,000ms         50%          95%    <-- slow recovery begins
t+120s   1,000        500ms           10%          60%    <-- partial recovery
```

**Total impact:** 2+ minutes of degraded service. ~60% of legitimate requests failed.
Cascading failures propagated to downstream services.

#### WITH Rate Limiting (Token Bucket: 2,000 req/s per client, global: 5,000 req/s)

```
Time     Requests/s   Latency (P99)   Error Rate   429 Rate   CPU Usage
t+0s     1,000        50ms            0.1%         0%         40%
t+10s    5,000        55ms            0.1%         0%         45%
t+20s    10,000       60ms            0.1%         50%        50%
t+30s    10,000       60ms            0.1%         50%        50%
t+40s    10,000       60ms            0.1%         50%        50%   <-- stable
t+50s    10,000       60ms            0.1%         50%        50%   <-- stable
t+60s    5,000        55ms            0.1%         0%         45%   <-- load subsides
```

**Total impact:** Zero degradation for clients within their limits. 50% of excess traffic
shed cleanly with 429 + Retry-After. P99 latency increased by only 10ms (rate limiter
overhead). No cascading failures. No recovery period needed.

### Scenario: Redis Rate Limiter Failure

#### Fail-Closed (Anti-Pattern)

```
Redis down at t+0s:
- ALL rate limit checks fail
- ALL requests rejected (100% error rate)
- Complete API outage for 30-300 seconds until Redis recovers
- Worse than having no rate limiter at all
```

#### Fail-Open (Best Practice)

```
Redis down at t+0s:
- Rate limit checks time out after 5ms
- All requests allowed (fail open)
- Log "rate limiter unavailable" at WARN level
- Alert on-call engineer
- Service continues operating without rate limiting for 30-300 seconds
- Risk: temporary over-admission. Actual risk is low if the outage is brief.
```

---

## Decision Tree: Which Algorithm Should I Use?

```
START: What is your primary requirement?
|
+---> Need to allow traffic bursts?
|     |
|     +---> YES: Use TOKEN BUCKET
|     |     - Stripe, AWS API Gateway, NGINX use this
|     |     - Two params: rate + burst capacity
|     |     - Memory: ~32 bytes/client
|     |     - Best for: API rate limiting with burst tolerance
|     |
|     +---> NO: Need perfectly smooth output rate?
|           |
|           +---> YES: Use LEAKY BUCKET
|           |     - Processes requests at constant rate
|           |     - Adds queuing latency
|           |     - Best for: traffic shaping, network QoS
|           |
|           +---> NO: Continue below
|
+---> Is memory a primary constraint?
|     |
|     +---> YES, minimal memory:
|     |     |
|     |     +---> Can you tolerate boundary bursts (up to 2x)?
|     |           |
|     |           +---> YES: Use FIXED WINDOW COUNTER
|     |           |     - ~16 bytes/client, simplest to implement
|     |           |     - Redis: INCR + EXPIRE (2 commands)
|     |           |
|     |           +---> NO: Use SLIDING WINDOW COUNTER
|     |                 - ~20 bytes/client, 99.997% accurate
|     |                 - Best default choice for most APIs
|     |
|     +---> NO, memory is not a concern:
|           |
|           +---> Need perfect accuracy (zero false allows)?
|                 |
|                 +---> YES: Use SLIDING WINDOW LOG
|                 |     - O(L) memory per client
|                 |     - Only practical for L < 1,000
|                 |     - Best for: billing, compliance, audit
|                 |
|                 +---> NO: Use SLIDING WINDOW COUNTER
|                       - Best overall balance
|
+---> Running distributed (multi-node)?
      |
      +---> YES:
      |     - Use Redis + Lua scripts for atomicity
      |     - SLIDING WINDOW COUNTER is the recommended default
      |     - TOKEN BUCKET if bursts are needed
      |     - Always use EVALSHA (not EVAL) for performance
      |     - Set 5ms timeout, fail open on Redis errors
      |
      +---> NO (single process):
            - Use in-memory implementation
            - Any algorithm works; TOKEN BUCKET is most versatile
            - ~200ns per check, no external dependencies

QUICK DECISION MATRIX:
  API rate limiting (general)      --> Sliding Window Counter
  API rate limiting (allow bursts) --> Token Bucket
  Network traffic shaping          --> Leaky Bucket
  Simple + low memory              --> Fixed Window Counter
  Billing / compliance / audit     --> Sliding Window Log
```

---

## Production Case Studies

### Cloudflare: Rate Limiting at the Edge

Cloudflare processes rate limiting across millions of domains at their edge network. Their
architecture uses a Twemproxy cluster inside each Point of Presence (PoP) with consistent
hashing to distribute memcache keys across servers. When the cluster is resized, consistent
hashing ensures only a small fraction of keys are rehashed. Rate limiting at the edge
means origin servers never see excessive traffic, and the performance/memory cost is
distributed across the global edge network (source: Cloudflare engineering blog).

Key numbers:
- Rate limiting deployed across 300+ data centers globally
- Consistent hashing minimizes key redistribution during scaling
- Edge-based limiting eliminates origin server load from abusive traffic

### Stripe: Four-Tier Rate Limiting

Stripe's production rate limiting uses 4 distinct layers (source: Stripe engineering blog):

1. **Request rate limiter** (token bucket): Per-user request rate cap
2. **Concurrent request limiter:** Per-user cap on simultaneous in-flight requests
3. **Fleet usage load shedder:** Reserves 20% fleet capacity for critical traffic
4. **Worker utilization load shedder:** Per-worker progressive shedding by priority

The per-user limiters use Redis with the token bucket algorithm; the load shedders use
local worker metrics. This layered approach protects against both individual abuse and
systemic overload while ensuring critical payment processing is never starved.

### Halodoc: Redis + Lua Sliding Window

Halodoc implemented a sliding window rate limiter using Redis sorted sets and Lua scripts
(source: Halodoc engineering blog). Key findings:

- Lua scripts reduced race condition over-admission from ~12% to 0%
- Single Redis round trip per rate limit check (vs 3-4 without Lua)
- ZREMRANGEBYSCORE + ZCARD + ZADD + PEXPIRE in a single atomic operation
- Production deployment handles thousands of requests/second with sub-2ms latency

### Google SRE: Client-Side Adaptive Throttling

Google's SRE book describes client-side adaptive throttling where clients track their own
acceptance rate and proactively reduce sending rate when backends are stressed. With
multiplier K=2, clients start self-throttling when rejection rate exceeds 50%. This
reduces wasted work (sending requests only to be rejected) and can reduce recovery time
from overload by 50-70% compared to server-side-only rate limiting (source: Google SRE
book, Chapter 21).
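
The SRE book's rule can be sketched directly from its formula: the client rejects its
own request locally with probability max(0, (requests - K * accepts) / (requests + 1)),
computed over a trailing window (class name is illustrative; the trailing-window
bookkeeping is simplified to running totals here):

```python
import random

class AdaptiveThrottle:
    """Client-side adaptive throttling (Google SRE book, ch. 21 sketch)."""

    def __init__(self, k: float = 2.0):
        self.k = k
        self.requests = 0   # attempts the client made
        self.accepts = 0    # attempts the backend accepted

    def reject_probability(self) -> float:
        return max(0.0,
                   (self.requests - self.k * self.accepts) / (self.requests + 1))

    def should_send(self, rng=random.random) -> bool:
        """Decide locally, before spending a network round trip."""
        return rng() >= self.reject_probability()

    def record(self, accepted: bool) -> None:
        self.requests += 1
        if accepted:
            self.accepts += 1
```

With K=2 the probability stays at zero until more than half of recent requests are
rejected, then rises smoothly -- exactly the >50% self-throttling threshold cited above.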

---

## Implementation Checklist

```
[ ] Choose algorithm based on decision tree above
[ ] Implement with Redis + Lua scripts for distributed systems
[ ] Use EVALSHA (not EVAL) -- load scripts at startup, cache SHA1
[ ] Set Redis call timeout to 5ms, fail open on timeout
[ ] Return RateLimit-* headers on ALL responses (not just 429)
[ ] Return Retry-After header on 429 responses
[ ] Include informative JSON error body on 429 responses
[ ] Rate limit by API key or user ID (not IP address)
[ ] Set initial limits at 2-5x observed P95 client traffic
[ ] Monitor 429 rate by client -- alert if >1% during normal operation
[ ] Implement exponential backoff with jitter in client SDKs
[ ] Add circuit breaker around Redis rate limit calls
[ ] Test under load: verify rate limiter does not become the bottleneck
[ ] Test Redis failure: verify fail-open behavior
[ ] Test clock skew: verify algorithm tolerates 1-2 second differences
[ ] Dashboard: request rate, 429 rate, rate limiter latency, Redis health
```
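
One checklist item worth spelling out is client-side backoff. A "full jitter" sketch --
sleep a uniformly random amount up to an exponentially growing cap, so clients that were
rate-limited at the same instant do not all retry at the same instant (the base and cap
constants are illustrative defaults):

```python
import random

def backoff_with_jitter(attempt: int, base: float = 0.5, cap: float = 30.0,
                        rng=random.random) -> float:
    """Seconds to sleep before retry number `attempt` (0-based), full-jitter style."""
    ceiling = min(cap, base * (2 ** attempt))   # exponential growth, capped
    return rng() * ceiling                      # jitter over the whole range
```

When the server supplies `Retry-After`, honor it and use jittered backoff only as the
fallback.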

---

## Sources

- [Cloudflare: How We Built Rate Limiting Capable of Scaling to Millions of Domains](https://blog.cloudflare.com/counting-things-a-lot-of-different-things/)
- [Stripe: Scaling Your API with Rate Limiters](https://stripe.com/blog/rate-limiters)
- [Redis: Build 5 Rate Limiters with Redis](https://redis.io/tutorials/howtos/ratelimiting/)
- [Halodoc: Redis and Lua Powered Sliding Window Rate Limiter](https://blogs.halodoc.io/taming-the-traffic-redis-and-lua-powered-sliding-window-rate-limiter-in-action/)
- [IETF: RateLimit Header Fields for HTTP (draft-ietf-httpapi-ratelimit-headers)](https://datatracker.ietf.org/doc/draft-ietf-httpapi-ratelimit-headers/)
- [API7: From Token Bucket to Sliding Window Rate Limiting Guide](https://api7.ai/blog/rate-limiting-guide-algorithms-best-practices)
- [Arpit Bhayani: Sliding Window Rate Limiting Design and Implementation](https://arpitbhayani.me/blogs/sliding-window-ratelimiter/)
- [Gravitee: API Rate Limiting at Scale](https://www.gravitee.io/blog/rate-limiting-apis-scale-patterns-strategies)
- [AlgoMaster: Rate Limiting Algorithms Explained with Code](https://blog.algomaster.io/p/rate-limiting-algorithms-explained-with-code)
- [Redis: Diagnosing Latency Issues](https://redis.io/docs/latest/operate/oss_and_stack/management/optimization/latency/)
- [Zuplo: 10 Best Practices for API Rate Limiting in 2025](https://zuplo.com/learning-center/10-best-practices-for-api-rate-limiting-in-2025)
- [Kodekx: API Rate Limiting Best Practices for Scaling SaaS Apps in 2025](https://www.kodekx.com/blog/api-rate-limiting-best-practices-scaling-saas-2025)
- [Expedia Group: Traffic Shedding, Rate Limiting, Backpressure](https://medium.com/expedia-group-tech/traffic-shedding-rate-limiting-backpressure-oh-my-21f95c403b29)
- [Smudge.ai: Visualizing Algorithms for Rate Limiting](https://smudge.ai/blog/ratelimit-algorithms)
- [FreeCodeCamp: How to Build a Distributed Rate Limiting System Using Redis and Lua](https://www.freecodecamp.org/news/build-rate-limiting-system-using-redis-and-lua/)
- [Kong: How to Design a Scalable Rate Limiting Algorithm](https://konghq.com/blog/engineering/how-to-design-a-scalable-rate-limiting-algorithm)
- [Lunar.dev: Maximizing Performance with Client-Side Throttling](https://www.lunar.dev/post/client-side-throttling)
- [GeeksforGeeks: Token Bucket vs Leaky Bucket Algorithm](https://www.geeksforgeeks.org/system-design/token-bucket-vs-leaky-bucket-algorithm-system-design/)
- [Speakeasy: Rate Limiting Best Practices in REST API Design](https://www.speakeasy.com/api-design/rate-limiting)
- [Arxiv: Designing Scalable Rate Limiting Systems](https://arxiv.org/html/2602.11741)
|