@wazir-dev/cli 1.0.0
This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.
- package/AGENTS.md +111 -0
- package/CHANGELOG.md +14 -0
- package/CONTRIBUTING.md +101 -0
- package/LICENSE +21 -0
- package/README.md +314 -0
- package/assets/composition-engine.mmd +34 -0
- package/assets/demo-script.sh +17 -0
- package/assets/logo-dark.svg +14 -0
- package/assets/logo.svg +14 -0
- package/assets/pipeline.mmd +39 -0
- package/assets/record-demo.sh +51 -0
- package/docs/README.md +51 -0
- package/docs/adapters/context-mode.md +60 -0
- package/docs/concepts/architecture.md +87 -0
- package/docs/concepts/artifact-model.md +60 -0
- package/docs/concepts/composition-engine.md +36 -0
- package/docs/concepts/indexing-and-recall.md +160 -0
- package/docs/concepts/observability.md +41 -0
- package/docs/concepts/roles-and-workflows.md +59 -0
- package/docs/concepts/terminology-policy.md +27 -0
- package/docs/getting-started/01-installation.md +78 -0
- package/docs/getting-started/02-first-run.md +102 -0
- package/docs/getting-started/03-adding-to-project.md +15 -0
- package/docs/getting-started/04-host-setup.md +15 -0
- package/docs/guides/ci-integration.md +15 -0
- package/docs/guides/creating-skills.md +15 -0
- package/docs/guides/expertise-module-authoring.md +15 -0
- package/docs/guides/hook-development.md +15 -0
- package/docs/guides/memory-and-learnings.md +34 -0
- package/docs/guides/multi-host-export.md +15 -0
- package/docs/guides/troubleshooting.md +101 -0
- package/docs/guides/writing-custom-roles.md +15 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
- package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
- package/docs/readmes/INDEX.md +99 -0
- package/docs/readmes/features/expertise/README.md +171 -0
- package/docs/readmes/features/exports/README.md +222 -0
- package/docs/readmes/features/hooks/README.md +103 -0
- package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
- package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
- package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
- package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
- package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
- package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
- package/docs/readmes/features/hooks/session-start.md +119 -0
- package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
- package/docs/readmes/features/roles/README.md +157 -0
- package/docs/readmes/features/roles/clarifier.md +152 -0
- package/docs/readmes/features/roles/content-author.md +190 -0
- package/docs/readmes/features/roles/designer.md +193 -0
- package/docs/readmes/features/roles/executor.md +184 -0
- package/docs/readmes/features/roles/learner.md +210 -0
- package/docs/readmes/features/roles/planner.md +182 -0
- package/docs/readmes/features/roles/researcher.md +164 -0
- package/docs/readmes/features/roles/reviewer.md +184 -0
- package/docs/readmes/features/roles/specifier.md +162 -0
- package/docs/readmes/features/roles/verifier.md +215 -0
- package/docs/readmes/features/schemas/README.md +178 -0
- package/docs/readmes/features/skills/README.md +63 -0
- package/docs/readmes/features/skills/brainstorming.md +96 -0
- package/docs/readmes/features/skills/debugging.md +148 -0
- package/docs/readmes/features/skills/design.md +120 -0
- package/docs/readmes/features/skills/prepare-next.md +109 -0
- package/docs/readmes/features/skills/run-audit.md +159 -0
- package/docs/readmes/features/skills/scan-project.md +109 -0
- package/docs/readmes/features/skills/self-audit.md +176 -0
- package/docs/readmes/features/skills/tdd.md +137 -0
- package/docs/readmes/features/skills/using-skills.md +92 -0
- package/docs/readmes/features/skills/verification.md +120 -0
- package/docs/readmes/features/skills/writing-plans.md +104 -0
- package/docs/readmes/features/tooling/README.md +320 -0
- package/docs/readmes/features/workflows/README.md +186 -0
- package/docs/readmes/features/workflows/author.md +181 -0
- package/docs/readmes/features/workflows/clarify.md +154 -0
- package/docs/readmes/features/workflows/design-review.md +171 -0
- package/docs/readmes/features/workflows/design.md +169 -0
- package/docs/readmes/features/workflows/discover.md +162 -0
- package/docs/readmes/features/workflows/execute.md +173 -0
- package/docs/readmes/features/workflows/learn.md +167 -0
- package/docs/readmes/features/workflows/plan-review.md +165 -0
- package/docs/readmes/features/workflows/plan.md +170 -0
- package/docs/readmes/features/workflows/prepare-next.md +167 -0
- package/docs/readmes/features/workflows/review.md +169 -0
- package/docs/readmes/features/workflows/run-audit.md +191 -0
- package/docs/readmes/features/workflows/spec-challenge.md +159 -0
- package/docs/readmes/features/workflows/specify.md +160 -0
- package/docs/readmes/features/workflows/verify.md +177 -0
- package/docs/readmes/packages/README.md +50 -0
- package/docs/readmes/packages/ajv.md +117 -0
- package/docs/readmes/packages/context-mode.md +118 -0
- package/docs/readmes/packages/gray-matter.md +116 -0
- package/docs/readmes/packages/node-test.md +137 -0
- package/docs/readmes/packages/yaml.md +112 -0
- package/docs/reference/configuration-reference.md +159 -0
- package/docs/reference/expertise-index.md +52 -0
- package/docs/reference/git-flow.md +43 -0
- package/docs/reference/hooks.md +87 -0
- package/docs/reference/host-exports.md +50 -0
- package/docs/reference/launch-checklist.md +172 -0
- package/docs/reference/marketplace-listings.md +76 -0
- package/docs/reference/release-process.md +34 -0
- package/docs/reference/roles-reference.md +77 -0
- package/docs/reference/skills.md +33 -0
- package/docs/reference/templates.md +29 -0
- package/docs/reference/tooling-cli.md +94 -0
- package/docs/truth-claims.yaml +222 -0
- package/expertise/PROGRESS.md +63 -0
- package/expertise/README.md +18 -0
- package/expertise/antipatterns/PROGRESS.md +56 -0
- package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
- package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
- package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
- package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
- package/expertise/antipatterns/backend/index.md +24 -0
- package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
- package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
- package/expertise/antipatterns/code/async-antipatterns.md +622 -0
- package/expertise/antipatterns/code/code-smells.md +1186 -0
- package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
- package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
- package/expertise/antipatterns/code/index.md +27 -0
- package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
- package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
- package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
- package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
- package/expertise/antipatterns/design/dark-patterns.md +1121 -0
- package/expertise/antipatterns/design/index.md +22 -0
- package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
- package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
- package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
- package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
- package/expertise/antipatterns/frontend/index.md +23 -0
- package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
- package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
- package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
- package/expertise/antipatterns/index.md +31 -0
- package/expertise/antipatterns/performance/index.md +20 -0
- package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
- package/expertise/antipatterns/performance/premature-optimization.md +623 -0
- package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
- package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
- package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
- package/expertise/antipatterns/process/index.md +23 -0
- package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
- package/expertise/antipatterns/security/index.md +20 -0
- package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
- package/expertise/antipatterns/security/security-theater.md +843 -0
- package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
- package/expertise/architecture/PROGRESS.md +70 -0
- package/expertise/architecture/data/caching-architecture.md +671 -0
- package/expertise/architecture/data/data-consistency.md +574 -0
- package/expertise/architecture/data/data-modeling.md +536 -0
- package/expertise/architecture/data/event-streams-and-queues.md +634 -0
- package/expertise/architecture/data/index.md +25 -0
- package/expertise/architecture/data/search-architecture.md +663 -0
- package/expertise/architecture/data/sql-vs-nosql.md +708 -0
- package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
- package/expertise/architecture/decisions/build-vs-buy.md +616 -0
- package/expertise/architecture/decisions/index.md +23 -0
- package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
- package/expertise/architecture/decisions/technology-selection.md +616 -0
- package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
- package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
- package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
- package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
- package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
- package/expertise/architecture/distributed/index.md +25 -0
- package/expertise/architecture/distributed/saga-pattern.md +797 -0
- package/expertise/architecture/foundations/architectural-thinking.md +460 -0
- package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
- package/expertise/architecture/foundations/design-principles-solid.md +649 -0
- package/expertise/architecture/foundations/domain-driven-design.md +719 -0
- package/expertise/architecture/foundations/index.md +25 -0
- package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
- package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
- package/expertise/architecture/index.md +34 -0
- package/expertise/architecture/integration/api-design-graphql.md +638 -0
- package/expertise/architecture/integration/api-design-grpc.md +804 -0
- package/expertise/architecture/integration/api-design-rest.md +892 -0
- package/expertise/architecture/integration/index.md +25 -0
- package/expertise/architecture/integration/third-party-integration.md +795 -0
- package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
- package/expertise/architecture/integration/websockets-realtime.md +791 -0
- package/expertise/architecture/mobile-architecture/index.md +22 -0
- package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
- package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
- package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
- package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
- package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
- package/expertise/architecture/patterns/event-driven.md +797 -0
- package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
- package/expertise/architecture/patterns/index.md +27 -0
- package/expertise/architecture/patterns/layered-architecture.md +736 -0
- package/expertise/architecture/patterns/microservices.md +753 -0
- package/expertise/architecture/patterns/modular-monolith.md +692 -0
- package/expertise/architecture/patterns/monolith.md +626 -0
- package/expertise/architecture/patterns/plugin-architecture.md +735 -0
- package/expertise/architecture/patterns/serverless.md +780 -0
- package/expertise/architecture/scaling/database-scaling.md +615 -0
- package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
- package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
- package/expertise/architecture/scaling/index.md +24 -0
- package/expertise/architecture/scaling/multi-tenancy.md +800 -0
- package/expertise/architecture/scaling/stateless-design.md +787 -0
- package/expertise/backend/embedded-firmware.md +625 -0
- package/expertise/backend/go.md +853 -0
- package/expertise/backend/index.md +24 -0
- package/expertise/backend/java-spring.md +448 -0
- package/expertise/backend/node-typescript.md +625 -0
- package/expertise/backend/python-fastapi.md +724 -0
- package/expertise/backend/rust.md +458 -0
- package/expertise/backend/solidity.md +711 -0
- package/expertise/composition-map.yaml +443 -0
- package/expertise/content/foundations/content-modeling.md +395 -0
- package/expertise/content/foundations/editorial-standards.md +449 -0
- package/expertise/content/foundations/index.md +24 -0
- package/expertise/content/foundations/microcopy.md +455 -0
- package/expertise/content/foundations/terminology-governance.md +509 -0
- package/expertise/content/index.md +34 -0
- package/expertise/content/patterns/accessibility-copy.md +518 -0
- package/expertise/content/patterns/index.md +24 -0
- package/expertise/content/patterns/notification-content.md +433 -0
- package/expertise/content/patterns/sample-content.md +486 -0
- package/expertise/content/patterns/state-copy.md +439 -0
- package/expertise/design/PROGRESS.md +58 -0
- package/expertise/design/disciplines/dark-mode-theming.md +577 -0
- package/expertise/design/disciplines/design-systems.md +595 -0
- package/expertise/design/disciplines/index.md +25 -0
- package/expertise/design/disciplines/information-architecture.md +800 -0
- package/expertise/design/disciplines/interaction-design.md +788 -0
- package/expertise/design/disciplines/responsive-design.md +552 -0
- package/expertise/design/disciplines/usability-testing.md +516 -0
- package/expertise/design/disciplines/user-research.md +792 -0
- package/expertise/design/foundations/accessibility-design.md +796 -0
- package/expertise/design/foundations/color-theory.md +797 -0
- package/expertise/design/foundations/iconography.md +795 -0
- package/expertise/design/foundations/index.md +26 -0
- package/expertise/design/foundations/motion-and-animation.md +653 -0
- package/expertise/design/foundations/rtl-design.md +585 -0
- package/expertise/design/foundations/spacing-and-layout.md +607 -0
- package/expertise/design/foundations/typography.md +800 -0
- package/expertise/design/foundations/visual-hierarchy.md +761 -0
- package/expertise/design/index.md +32 -0
- package/expertise/design/patterns/authentication-flows.md +474 -0
- package/expertise/design/patterns/content-consumption.md +789 -0
- package/expertise/design/patterns/data-display.md +618 -0
- package/expertise/design/patterns/e-commerce.md +1494 -0
- package/expertise/design/patterns/feedback-and-states.md +642 -0
- package/expertise/design/patterns/forms-and-input.md +819 -0
- package/expertise/design/patterns/gamification.md +801 -0
- package/expertise/design/patterns/index.md +31 -0
- package/expertise/design/patterns/microinteractions.md +449 -0
- package/expertise/design/patterns/navigation.md +800 -0
- package/expertise/design/patterns/notifications.md +705 -0
- package/expertise/design/patterns/onboarding.md +700 -0
- package/expertise/design/patterns/search-and-filter.md +601 -0
- package/expertise/design/patterns/settings-and-preferences.md +768 -0
- package/expertise/design/patterns/social-and-community.md +748 -0
- package/expertise/design/platforms/desktop-native.md +612 -0
- package/expertise/design/platforms/index.md +25 -0
- package/expertise/design/platforms/mobile-android.md +825 -0
- package/expertise/design/platforms/mobile-cross-platform.md +983 -0
- package/expertise/design/platforms/mobile-ios.md +699 -0
- package/expertise/design/platforms/tablet.md +794 -0
- package/expertise/design/platforms/web-dashboard.md +790 -0
- package/expertise/design/platforms/web-responsive.md +550 -0
- package/expertise/design/psychology/behavioral-nudges.md +449 -0
- package/expertise/design/psychology/cognitive-load.md +1191 -0
- package/expertise/design/psychology/error-psychology.md +778 -0
- package/expertise/design/psychology/index.md +22 -0
- package/expertise/design/psychology/persuasive-design.md +736 -0
- package/expertise/design/psychology/user-mental-models.md +623 -0
- package/expertise/design/tooling/open-pencil.md +266 -0
- package/expertise/frontend/angular.md +1073 -0
- package/expertise/frontend/desktop-electron.md +546 -0
- package/expertise/frontend/flutter.md +782 -0
- package/expertise/frontend/index.md +27 -0
- package/expertise/frontend/native-android.md +409 -0
- package/expertise/frontend/native-ios.md +490 -0
- package/expertise/frontend/react-native.md +1160 -0
- package/expertise/frontend/react.md +808 -0
- package/expertise/frontend/vue.md +1089 -0
- package/expertise/humanize/domain-rules-code.md +79 -0
- package/expertise/humanize/domain-rules-content.md +67 -0
- package/expertise/humanize/domain-rules-technical-docs.md +56 -0
- package/expertise/humanize/index.md +35 -0
- package/expertise/humanize/self-audit-checklist.md +87 -0
- package/expertise/humanize/sentence-patterns.md +218 -0
- package/expertise/humanize/vocabulary-blacklist.md +105 -0
- package/expertise/i18n/PROGRESS.md +65 -0
- package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
- package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
- package/expertise/i18n/advanced/complex-scripts.md +30 -0
- package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
- package/expertise/i18n/advanced/testing-i18n.md +28 -0
- package/expertise/i18n/content/content-adaptation.md +23 -0
- package/expertise/i18n/content/locale-specific-formatting.md +23 -0
- package/expertise/i18n/content/machine-translation-integration.md +28 -0
- package/expertise/i18n/content/translation-management.md +29 -0
- package/expertise/i18n/foundations/date-time-calendars.md +67 -0
- package/expertise/i18n/foundations/i18n-architecture.md +272 -0
- package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
- package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
- package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
- package/expertise/i18n/foundations/string-externalization.md +236 -0
- package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
- package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
- package/expertise/i18n/index.md +38 -0
- package/expertise/i18n/platform/backend-i18n.md +31 -0
- package/expertise/i18n/platform/flutter-i18n.md +148 -0
- package/expertise/i18n/platform/native-android-i18n.md +36 -0
- package/expertise/i18n/platform/native-ios-i18n.md +36 -0
- package/expertise/i18n/platform/react-i18n.md +103 -0
- package/expertise/i18n/platform/web-css-i18n.md +81 -0
- package/expertise/i18n/rtl/arabic-specific.md +175 -0
- package/expertise/i18n/rtl/hebrew-specific.md +149 -0
- package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
- package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
- package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
- package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
- package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
- package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
- package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
- package/expertise/i18n/rtl/rtl-typography.md +160 -0
- package/expertise/index.md +113 -0
- package/expertise/index.yaml +216 -0
- package/expertise/infrastructure/cloud-aws.md +597 -0
- package/expertise/infrastructure/cloud-gcp.md +599 -0
- package/expertise/infrastructure/cybersecurity.md +816 -0
- package/expertise/infrastructure/database-mongodb.md +447 -0
- package/expertise/infrastructure/database-postgres.md +400 -0
- package/expertise/infrastructure/devops-cicd.md +787 -0
- package/expertise/infrastructure/index.md +27 -0
- package/expertise/performance/PROGRESS.md +50 -0
- package/expertise/performance/backend/api-latency.md +1204 -0
- package/expertise/performance/backend/background-jobs.md +506 -0
- package/expertise/performance/backend/connection-pooling.md +1209 -0
- package/expertise/performance/backend/database-query-optimization.md +515 -0
- package/expertise/performance/backend/index.md +23 -0
- package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
- package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
- package/expertise/performance/foundations/caching-strategies.md +489 -0
- package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
- package/expertise/performance/foundations/index.md +24 -0
- package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
- package/expertise/performance/foundations/memory-management.md +964 -0
- package/expertise/performance/foundations/performance-budgets.md +1314 -0
- package/expertise/performance/index.md +31 -0
- package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
- package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
- package/expertise/performance/infrastructure/index.md +22 -0
- package/expertise/performance/infrastructure/load-balancing.md +1081 -0
- package/expertise/performance/infrastructure/observability.md +1079 -0
- package/expertise/performance/mobile/index.md +23 -0
- package/expertise/performance/mobile/mobile-animations.md +544 -0
- package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
- package/expertise/performance/mobile/mobile-network.md +452 -0
- package/expertise/performance/mobile/mobile-rendering.md +599 -0
- package/expertise/performance/mobile/mobile-startup-time.md +505 -0
- package/expertise/performance/platform-specific/flutter-performance.md +647 -0
- package/expertise/performance/platform-specific/index.md +22 -0
- package/expertise/performance/platform-specific/node-performance.md +1307 -0
- package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
- package/expertise/performance/platform-specific/react-performance.md +1403 -0
- package/expertise/performance/web/bundle-optimization.md +1239 -0
- package/expertise/performance/web/image-and-media.md +636 -0
- package/expertise/performance/web/index.md +24 -0
- package/expertise/performance/web/network-optimization.md +1133 -0
- package/expertise/performance/web/rendering-performance.md +1098 -0
- package/expertise/performance/web/ssr-and-hydration.md +918 -0
- package/expertise/performance/web/web-vitals.md +1374 -0
- package/expertise/quality/accessibility.md +985 -0
- package/expertise/quality/evidence-based-verification.md +499 -0
- package/expertise/quality/index.md +24 -0
- package/expertise/quality/ml-model-audit.md +614 -0
- package/expertise/quality/performance.md +600 -0
- package/expertise/quality/testing-api.md +891 -0
- package/expertise/quality/testing-mobile.md +496 -0
- package/expertise/quality/testing-web.md +849 -0
- package/expertise/security/PROGRESS.md +54 -0
- package/expertise/security/agentic-identity.md +540 -0
- package/expertise/security/compliance-frameworks.md +601 -0
- package/expertise/security/data/data-encryption.md +364 -0
- package/expertise/security/data/data-privacy-gdpr.md +692 -0
- package/expertise/security/data/database-security.md +1171 -0
- package/expertise/security/data/index.md +22 -0
- package/expertise/security/data/pii-handling.md +531 -0
- package/expertise/security/foundations/authentication.md +1041 -0
- package/expertise/security/foundations/authorization.md +603 -0
- package/expertise/security/foundations/cryptography.md +1001 -0
- package/expertise/security/foundations/index.md +25 -0
- package/expertise/security/foundations/owasp-top-10.md +1354 -0
- package/expertise/security/foundations/secrets-management.md +1217 -0
- package/expertise/security/foundations/secure-sdlc.md +700 -0
- package/expertise/security/foundations/supply-chain-security.md +698 -0
- package/expertise/security/index.md +31 -0
- package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
- package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
- package/expertise/security/infrastructure/container-security.md +721 -0
- package/expertise/security/infrastructure/incident-response.md +1295 -0
- package/expertise/security/infrastructure/index.md +24 -0
- package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
- package/expertise/security/infrastructure/network-security.md +1337 -0
- package/expertise/security/mobile/index.md +23 -0
- package/expertise/security/mobile/mobile-android-security.md +1218 -0
- package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
- package/expertise/security/mobile/mobile-data-storage.md +1265 -0
- package/expertise/security/mobile/mobile-ios-security.md +1401 -0
- package/expertise/security/mobile/mobile-network-security.md +1520 -0
- package/expertise/security/smart-contract-security.md +594 -0
- package/expertise/security/testing/index.md +22 -0
- package/expertise/security/testing/penetration-testing.md +1258 -0
- package/expertise/security/testing/security-code-review.md +1765 -0
- package/expertise/security/testing/threat-modeling.md +1074 -0
- package/expertise/security/testing/vulnerability-scanning.md +1062 -0
- package/expertise/security/web/api-security.md +586 -0
- package/expertise/security/web/cors-and-headers.md +433 -0
- package/expertise/security/web/csrf.md +562 -0
- package/expertise/security/web/file-upload.md +1477 -0
- package/expertise/security/web/index.md +25 -0
- package/expertise/security/web/injection.md +1375 -0
- package/expertise/security/web/session-management.md +1101 -0
- package/expertise/security/web/xss.md +1158 -0
- package/exports/README.md +17 -0
- package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
- package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
- package/exports/hosts/claude/.claude/agents/designer.md +55 -0
- package/exports/hosts/claude/.claude/agents/executor.md +55 -0
- package/exports/hosts/claude/.claude/agents/learner.md +51 -0
- package/exports/hosts/claude/.claude/agents/planner.md +53 -0
- package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
- package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
- package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
- package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
- package/exports/hosts/claude/.claude/commands/author.md +42 -0
- package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
- package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
- package/exports/hosts/claude/.claude/commands/design.md +44 -0
- package/exports/hosts/claude/.claude/commands/discover.md +37 -0
- package/exports/hosts/claude/.claude/commands/execute.md +48 -0
- package/exports/hosts/claude/.claude/commands/learn.md +38 -0
- package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
- package/exports/hosts/claude/.claude/commands/plan.md +39 -0
- package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
- package/exports/hosts/claude/.claude/commands/review.md +40 -0
- package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
- package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
- package/exports/hosts/claude/.claude/commands/specify.md +38 -0
- package/exports/hosts/claude/.claude/commands/verify.md +37 -0
- package/exports/hosts/claude/.claude/settings.json +34 -0
- package/exports/hosts/claude/CLAUDE.md +19 -0
- package/exports/hosts/claude/export.manifest.json +38 -0
- package/exports/hosts/claude/host-package.json +67 -0
- package/exports/hosts/codex/AGENTS.md +19 -0
- package/exports/hosts/codex/export.manifest.json +38 -0
- package/exports/hosts/codex/host-package.json +41 -0
- package/exports/hosts/cursor/.cursor/hooks.json +16 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
- package/exports/hosts/cursor/export.manifest.json +38 -0
- package/exports/hosts/cursor/host-package.json +42 -0
- package/exports/hosts/gemini/GEMINI.md +19 -0
- package/exports/hosts/gemini/export.manifest.json +38 -0
- package/exports/hosts/gemini/host-package.json +41 -0
- package/hooks/README.md +18 -0
- package/hooks/definitions/loop_cap_guard.yaml +21 -0
- package/hooks/definitions/post_tool_capture.yaml +24 -0
- package/hooks/definitions/pre_compact_summary.yaml +19 -0
- package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
- package/hooks/definitions/protected_path_write_guard.yaml +19 -0
- package/hooks/definitions/session_start.yaml +19 -0
- package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
- package/hooks/loop-cap-guard +17 -0
- package/hooks/post-tool-lint +36 -0
- package/hooks/protected-path-write-guard +17 -0
- package/hooks/session-start +41 -0
- package/llms-full.txt +2355 -0
- package/llms.txt +43 -0
- package/package.json +79 -0
- package/roles/README.md +20 -0
- package/roles/clarifier.md +42 -0
- package/roles/content-author.md +63 -0
- package/roles/designer.md +55 -0
- package/roles/executor.md +55 -0
- package/roles/learner.md +51 -0
- package/roles/planner.md +53 -0
- package/roles/researcher.md +43 -0
- package/roles/reviewer.md +54 -0
- package/roles/specifier.md +47 -0
- package/roles/verifier.md +71 -0
- package/schemas/README.md +24 -0
- package/schemas/accepted-learning.schema.json +20 -0
- package/schemas/author-artifact.schema.json +156 -0
- package/schemas/clarification.schema.json +19 -0
- package/schemas/design-artifact.schema.json +80 -0
- package/schemas/docs-claim.schema.json +18 -0
- package/schemas/export-manifest.schema.json +20 -0
- package/schemas/hook.schema.json +67 -0
- package/schemas/host-export-package.schema.json +18 -0
- package/schemas/implementation-plan.schema.json +19 -0
- package/schemas/proposed-learning.schema.json +19 -0
- package/schemas/research.schema.json +18 -0
- package/schemas/review.schema.json +29 -0
- package/schemas/run-manifest.schema.json +18 -0
- package/schemas/spec-challenge.schema.json +18 -0
- package/schemas/spec.schema.json +20 -0
- package/schemas/usage.schema.json +102 -0
- package/schemas/verification-proof.schema.json +29 -0
- package/schemas/wazir-manifest.schema.json +173 -0
- package/skills/README.md +40 -0
- package/skills/brainstorming/SKILL.md +77 -0
- package/skills/debugging/SKILL.md +50 -0
- package/skills/design/SKILL.md +61 -0
- package/skills/dispatching-parallel-agents/SKILL.md +128 -0
- package/skills/executing-plans/SKILL.md +70 -0
- package/skills/finishing-a-development-branch/SKILL.md +169 -0
- package/skills/humanize/SKILL.md +123 -0
- package/skills/init-pipeline/SKILL.md +124 -0
- package/skills/prepare-next/SKILL.md +20 -0
- package/skills/receiving-code-review/SKILL.md +123 -0
- package/skills/requesting-code-review/SKILL.md +105 -0
- package/skills/requesting-code-review/code-reviewer.md +108 -0
- package/skills/run-audit/SKILL.md +197 -0
- package/skills/scan-project/SKILL.md +41 -0
- package/skills/self-audit/SKILL.md +153 -0
- package/skills/subagent-driven-development/SKILL.md +154 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
- package/skills/subagent-driven-development/implementer-prompt.md +102 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/skills/tdd/SKILL.md +23 -0
- package/skills/using-git-worktrees/SKILL.md +163 -0
- package/skills/using-skills/SKILL.md +95 -0
- package/skills/verification/SKILL.md +22 -0
- package/skills/wazir/SKILL.md +463 -0
- package/skills/writing-plans/SKILL.md +30 -0
- package/skills/writing-skills/SKILL.md +157 -0
- package/skills/writing-skills/anthropic-best-practices.md +122 -0
- package/skills/writing-skills/persuasion-principles.md +50 -0
- package/templates/README.md +20 -0
- package/templates/artifacts/README.md +10 -0
- package/templates/artifacts/accepted-learning.md +19 -0
- package/templates/artifacts/accepted-learning.template.json +12 -0
- package/templates/artifacts/author.md +74 -0
- package/templates/artifacts/author.template.json +19 -0
- package/templates/artifacts/clarification.md +21 -0
- package/templates/artifacts/clarification.template.json +12 -0
- package/templates/artifacts/execute-notes.md +19 -0
- package/templates/artifacts/implementation-plan.md +21 -0
- package/templates/artifacts/implementation-plan.template.json +11 -0
- package/templates/artifacts/learning-proposal.md +19 -0
- package/templates/artifacts/next-run-handoff.md +21 -0
- package/templates/artifacts/plan-review.md +19 -0
- package/templates/artifacts/proposed-learning.template.json +12 -0
- package/templates/artifacts/research.md +21 -0
- package/templates/artifacts/research.template.json +12 -0
- package/templates/artifacts/review-findings.md +19 -0
- package/templates/artifacts/review.template.json +11 -0
- package/templates/artifacts/run-manifest.template.json +8 -0
- package/templates/artifacts/spec-challenge.md +19 -0
- package/templates/artifacts/spec-challenge.template.json +11 -0
- package/templates/artifacts/spec.md +21 -0
- package/templates/artifacts/spec.template.json +12 -0
- package/templates/artifacts/verification-proof.md +19 -0
- package/templates/artifacts/verification-proof.template.json +11 -0
- package/templates/examples/accepted-learning.example.json +14 -0
- package/templates/examples/author.example.json +152 -0
- package/templates/examples/clarification.example.json +15 -0
- package/templates/examples/docs-claim.example.json +8 -0
- package/templates/examples/export-manifest.example.json +7 -0
- package/templates/examples/host-export-package.example.json +11 -0
- package/templates/examples/implementation-plan.example.json +17 -0
- package/templates/examples/proposed-learning.example.json +13 -0
- package/templates/examples/research.example.json +15 -0
- package/templates/examples/research.example.md +6 -0
- package/templates/examples/review.example.json +17 -0
- package/templates/examples/run-manifest.example.json +9 -0
- package/templates/examples/spec-challenge.example.json +14 -0
- package/templates/examples/spec.example.json +21 -0
- package/templates/examples/verification-proof.example.json +21 -0
- package/templates/examples/wazir-manifest.example.yaml +65 -0
- package/templates/task-definition-schema.md +99 -0
- package/tooling/README.md +20 -0
- package/tooling/src/adapters/context-mode.js +50 -0
- package/tooling/src/capture/command.js +376 -0
- package/tooling/src/capture/store.js +99 -0
- package/tooling/src/capture/usage.js +270 -0
- package/tooling/src/checks/branches.js +50 -0
- package/tooling/src/checks/brand-truth.js +110 -0
- package/tooling/src/checks/changelog.js +231 -0
- package/tooling/src/checks/command-registry.js +36 -0
- package/tooling/src/checks/commits.js +102 -0
- package/tooling/src/checks/docs-drift.js +103 -0
- package/tooling/src/checks/docs-truth.js +201 -0
- package/tooling/src/checks/runtime-surface.js +156 -0
- package/tooling/src/cli.js +116 -0
- package/tooling/src/command-options.js +56 -0
- package/tooling/src/commands/validate.js +320 -0
- package/tooling/src/doctor/command.js +91 -0
- package/tooling/src/export/command.js +77 -0
- package/tooling/src/export/compiler.js +498 -0
- package/tooling/src/guards/loop-cap-guard.js +52 -0
- package/tooling/src/guards/protected-path-write-guard.js +67 -0
- package/tooling/src/index/command.js +152 -0
- package/tooling/src/index/storage.js +1061 -0
- package/tooling/src/index/summarizers.js +261 -0
- package/tooling/src/loaders.js +18 -0
- package/tooling/src/project-root.js +22 -0
- package/tooling/src/recall/command.js +225 -0
- package/tooling/src/schema-validator.js +30 -0
- package/tooling/src/state-root.js +40 -0
- package/tooling/src/status/command.js +71 -0
- package/wazir.manifest.yaml +135 -0
- package/workflows/README.md +19 -0
- package/workflows/author.md +42 -0
- package/workflows/clarify.md +38 -0
- package/workflows/design-review.md +46 -0
- package/workflows/design.md +44 -0
- package/workflows/discover.md +37 -0
- package/workflows/execute.md +48 -0
- package/workflows/learn.md +38 -0
- package/workflows/plan-review.md +42 -0
- package/workflows/plan.md +39 -0
- package/workflows/prepare-next.md +37 -0
- package/workflows/review.md +40 -0
- package/workflows/run-audit.md +41 -0
- package/workflows/spec-challenge.md +41 -0
- package/workflows/specify.md +38 -0
- package/workflows/verify.md +37 -0
|
@@ -0,0 +1,1204 @@
|
|
|
1
|
+
# API Latency -- Performance Expertise Module
|
|
2
|
+
|
|
3
|
+
> API latency directly impacts user experience, conversion rates, and system throughput. Amazon found that every 100ms of added latency cost them 1% in sales. Google found that an extra 500ms in search page generation time dropped traffic by 20%. Walmart reported a 2% increase in conversions for every 1 second shaved off loading time. At scale, tail latency (p99) matters more than average -- a service with 50ms average but 2s p99 will have many users experiencing multi-second waits, especially in fan-out architectures where a single slow dependency degrades the entire request.
|
|
4
|
+
|
|
5
|
+
> **Impact:** Critical
|
|
6
|
+
> **Applies to:** Backend, Web, Mobile
|
|
7
|
+
> **Key metrics:** Response time p50/p95/p99, Time to First Byte (TTFB), Throughput (RPS), Error rate under load
|
|
8
|
+
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
## Table of Contents
|
|
12
|
+
|
|
13
|
+
1. [Why This Matters](#1-why-this-matters)
|
|
14
|
+
2. [Performance Budgets and Targets](#2-performance-budgets-and-targets)
|
|
15
|
+
3. [Measurement and Profiling](#3-measurement-and-profiling)
|
|
16
|
+
4. [Common Bottlenecks](#4-common-bottlenecks)
|
|
17
|
+
5. [Optimization Patterns](#5-optimization-patterns)
|
|
18
|
+
6. [Anti-Patterns](#6-anti-patterns)
|
|
19
|
+
7. [Architecture-Level Decisions](#7-architecture-level-decisions)
|
|
20
|
+
8. [Testing and Regression Prevention](#8-testing-and-regression-prevention)
|
|
21
|
+
9. [Decision Trees](#9-decision-trees)
|
|
22
|
+
10. [Code Examples](#10-code-examples)
|
|
23
|
+
11. [Quick Reference](#11-quick-reference)
|
|
24
|
+
|
|
25
|
+
---
|
|
26
|
+
|
|
27
|
+
## 1. Why This Matters
|
|
28
|
+
|
|
29
|
+
### Business Impact -- Real Numbers
|
|
30
|
+
|
|
31
|
+
| Study / Company | Finding | Source |
|
|
32
|
+
|-----------------------|----------------------------------------------------------------|--------------------------------|
|
|
33
|
+
| Amazon | Every 100ms of latency costs 1% in sales | Greg Linden, Stanford talk |
|
|
34
|
+
| Google | 500ms extra latency drops search traffic by 20% | Marissa Mayer, Web 2.0 Summit |
|
|
35
|
+
| Google | 100ms search delay reduces conversions by up to 7% | Akamai / Google joint study |
|
|
36
|
+
| Walmart | Every 1s improvement increases conversions by 2% | Walmart Labs engineering blog |
|
|
37
|
+
| Akamai | 100ms delay in page load hurts conversion by 7% | Akamai "State of Online Retail"|
|
|
38
|
+
| Booking.com | 1s slower load time = 1.5% drop in conversion | Velocity Conference 2015 |
|
|
39
|
+
| Aberdeen Group | 1s delay = 11% fewer page views, 16% lower satisfaction | Aberdeen Research report |
|
|
40
|
+
|
|
41
|
+
### Tail Latency Amplification in Microservices
|
|
42
|
+
|
|
43
|
+
In distributed systems, tail latency is amplified by fan-out. If a single API call fans out to N backend services, the overall request latency is bounded by the **slowest** responder.
|
|
44
|
+
|
|
45
|
+
**Quantified impact of fan-out on tail latency:**
|
|
46
|
+
|
|
47
|
+
- If a single service has p99 = 50ms, and a request fans out to 10 services in parallel:
|
|
48
|
+
- Probability that ALL 10 respond within 50ms = 0.99^10 = 90.4%
|
|
49
|
+
- Probability at least one exceeds 50ms = 9.6%
|
|
50
|
+
- Effective p99 of the composite call is much worse than any single service
|
|
51
|
+
- At 100 services in a fan-out: 0.99^100 = 36.6% chance that every call stays within p99, so 63.4% of requests hit tail latency on at least one
|
|
52
|
+
- At 1000 services: effectively guaranteed to hit tail latency on every request
|
|
53
|
+
|
|
54
|
+
This means a service that looks "fast" in isolation can create unacceptable latency when composed into a microservice mesh. Jeff Dean and Luiz Barroso documented this in "The Tail at Scale": if each server exceeds one second on only 1 in 100 requests (p99 = 1s), a request that fans out to 100 such servers exceeds one second about 63% of the time.
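
The fan-out arithmetic above can be reproduced with a short sketch. It assumes each backend's latency is independent of the others, which real systems only approximate; the function name `prob_any_slow` is illustrative.

```python
def prob_any_slow(fan_out: int, per_service_quantile: float = 0.99) -> float:
    """Probability that at least one of `fan_out` parallel calls is slower
    than its own p99, assuming independent latencies."""
    return 1.0 - per_service_quantile ** fan_out


for n in (1, 10, 100, 1000):
    # Share of composite requests that wait on at least one tail response.
    print(f"fan-out {n:>4}: {prob_any_slow(n):.1%}")
```

Because the composite request waits for the slowest responder, this probability is the share of user requests that experience at least one backend's tail latency.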
|
|
55
|
+
|
|
56
|
+
### User Experience Thresholds
|
|
57
|
+
|
|
58
|
+
| Response Time | User Perception |
|
|
59
|
+
|------------------|----------------------------------------------------|
|
|
60
|
+
| 0-100ms | Feels instantaneous |
|
|
61
|
+
| 100-300ms | Slight delay, still feels responsive |
|
|
62
|
+
| 300-1000ms | Noticeable delay, user waits consciously |
|
|
63
|
+
| 1-3 seconds | Mental context switch, user considers alternatives |
|
|
64
|
+
| 3-10 seconds | Frustration, high abandonment probability |
|
|
65
|
+
| 10+ seconds | Task failure; user leaves |
|
|
66
|
+
|
|
67
|
+
Source: Nielsen Norman Group; Jakob Nielsen's response time research.
|
|
68
|
+
|
|
69
|
+
---
|
|
70
|
+
|
|
71
|
+
## 2. Performance Budgets and Targets
|
|
72
|
+
|
|
73
|
+
### General API Latency Targets
|
|
74
|
+
|
|
75
|
+
| Metric | Internal APIs (service-to-service) | User-facing APIs | Real-time APIs (chat, gaming) |
|
|
76
|
+
|--------|-------------------------------------|------------------|-------------------------------|
|
|
77
|
+
| p50 | < 20ms | < 100ms | < 30ms |
|
|
78
|
+
| p95 | < 100ms | < 300ms | < 80ms |
|
|
79
|
+
| p99 | < 500ms | < 1000ms | < 150ms |
|
|
80
|
+
| p99.9 | < 2000ms | < 3000ms | < 500ms |
|
|
81
|
+
|
|
82
|
+
### Targets by API Type
|
|
83
|
+
|
|
84
|
+
| API Type | p50 Target | p95 Target | p99 Target | Notes |
|
|
85
|
+
|-----------------------------|------------|------------|------------|-------------------------------------|
|
|
86
|
+
| Health check / ping | < 5ms | < 10ms | < 50ms | Should be trivially fast |
|
|
87
|
+
| Simple CRUD read | < 30ms | < 100ms | < 300ms | Single DB query + serialization |
|
|
88
|
+
| CRUD write | < 50ms | < 150ms | < 500ms | Write + validation + events |
|
|
89
|
+
| Search / filtered list | < 100ms | < 300ms | < 1000ms | Indexed queries, pagination |
|
|
90
|
+
| Aggregation / reporting | < 500ms | < 2000ms | < 5000ms | Consider async for heavy queries |
|
|
91
|
+
| File upload / processing | < 1000ms | < 3000ms | < 10000ms | Return 202 Accepted for long ops |
|
|
92
|
+
| Third-party API proxy | < 200ms | < 1000ms | < 3000ms | Depends on downstream; add caching |
|
|
93
|
+
|
|
94
|
+
### Setting a Latency Budget
|
|
95
|
+
|
|
96
|
+
Break down the total latency budget into components:
|
|
97
|
+
|
|
98
|
+
```
|
|
99
|
+
Total budget: 200ms (p95 for a user-facing read API)
|
|
100
|
+
- Network (client to LB): 20ms
|
|
101
|
+
- Load balancer / API gateway: 5ms
|
|
102
|
+
- Auth middleware (JWT validation): 5ms
|
|
103
|
+
- Rate limiting check: 2ms
|
|
104
|
+
- Business logic: 10ms
|
|
105
|
+
- Database query: 30ms
|
|
106
|
+
- Serialization (JSON): 5ms
|
|
107
|
+
- Network (response to client): 20ms
|
|
108
|
+
- Overhead / buffer: 103ms
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
When the sum of components exceeds the budget, optimization is needed, and the largest components show where to start.
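
A budget like the one above can be kept as data and checked mechanically. The sketch below reuses the example figures; `budget_report` is a hypothetical helper, not part of any tool named in this module.

```python
BUDGET_MS = 200  # p95 target for a user-facing read API

components_ms = {
    "network (client to LB)": 20,
    "load balancer / API gateway": 5,
    "auth middleware": 5,
    "rate limiting": 2,
    "business logic": 10,
    "database query": 30,
    "serialization": 5,
    "network (response to client)": 20,
}


def budget_report(budget_ms: int, components: dict) -> tuple:
    """Return (spent, headroom, largest component) for a latency budget."""
    spent = sum(components.values())
    largest = max(components, key=components.get)
    return spent, budget_ms - spent, largest


spent, headroom, largest = budget_report(BUDGET_MS, components_ms)
print(f"spent={spent}ms headroom={headroom}ms largest={largest}")
```

A negative headroom means the budget is blown; the largest component is usually the first place to look.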
|
|
112
|
+
|
|
113
|
+
---
|
|
114
|
+
|
|
115
|
+
## 3. Measurement and Profiling
|
|
116
|
+
|
|
117
|
+
### Distributed Tracing
|
|
118
|
+
|
|
119
|
+
Distributed tracing is essential in microservice architectures. Each request gets a unique trace ID that propagates across service boundaries, allowing you to visualize the full request lifecycle.
|
|
120
|
+
|
|
121
|
+
**Tools:**
|
|
122
|
+
|
|
123
|
+
| Tool | Type | Key Feature | Overhead |
|
|
124
|
+
|--------------|-----------------|------------------------------------------|--------------|
|
|
125
|
+
| Jaeger | Open source | OpenTelemetry native, Uber-developed | ~1-3% |
|
|
126
|
+
| Zipkin | Open source | Lightweight, Twitter-developed | ~1-2% |
|
|
127
|
+
| Datadog APM | Commercial | Full-stack observability, ML anomalies | ~2-5% |
|
|
128
|
+
| New Relic | Commercial | Code-level visibility, distributed trace | ~2-5% |
|
|
129
|
+
| Grafana Tempo| Open source | Cost-effective trace storage at scale | ~1-3% |
|
|
130
|
+
| AWS X-Ray | Cloud-native | Deep AWS integration | ~1-3% |
|
|
131
|
+
|
|
132
|
+
**OpenTelemetry instrumentation (standard approach):**
|
|
133
|
+
|
|
134
|
+
```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Export spans in batches to an OTLP collector over gRPC.
provider = TracerProvider()
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="http://collector:4317"))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

# `app`, `db`, and `serialize` are assumed to be defined elsewhere
# (for example, a Flask app, a MongoDB client, and a response serializer).
@app.route("/api/orders/<order_id>")
def get_order(order_id):
    with tracer.start_as_current_span("get_order") as span:
        span.set_attribute("order.id", order_id)

        # Child spans show how the request time splits between steps.
        with tracer.start_as_current_span("db_query"):
            order = db.orders.find_one({"id": order_id})

        with tracer.start_as_current_span("serialize"):
            result = serialize(order)

        return result
```
|
|
160
|
+
|
|
161
|
+
### Application Performance Monitoring (APM)
|
|
162
|
+
|
|
163
|
+
Key metrics to track continuously:
|
|
164
|
+
|
|
165
|
+
- **Response time percentiles:** p50, p95, p99, p99.9
|
|
166
|
+
- **Throughput:** requests per second (RPS)
|
|
167
|
+
- **Error rate:** percentage of 5xx responses
|
|
168
|
+
- **Saturation:** CPU, memory, connection pool utilization
|
|
169
|
+
- **Apdex score:** (satisfied + tolerating/2) / total requests
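
The Apdex formula in the last bullet can be computed directly from raw latencies. This is a minimal sketch that assumes the standard Apdex convention: requests up to the threshold T are "satisfied" and requests up to 4T are "tolerating". The 500ms threshold in the example is illustrative.

```python
def apdex(latencies_ms: list, threshold_ms: float) -> float:
    """Apdex = (satisfied + tolerating / 2) / total requests."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    satisfied = sum(1 for ms in latencies_ms if ms <= threshold_ms)
    tolerating = sum(
        1 for ms in latencies_ms if threshold_ms < ms <= 4 * threshold_ms
    )
    return (satisfied + tolerating / 2) / len(latencies_ms)


# Two satisfied, one tolerating, one frustrated request.
print(apdex([100, 200, 600, 3000], threshold_ms=500))
```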
|
|
170
|
+
|
|
171
|
+
### Flame Graphs and CPU Profiling
|
|
172
|
+
|
|
173
|
+
Flame graphs visualize where CPU time is spent. Use them to identify unexpected hot paths:
|
|
174
|
+
|
|
175
|
+
```bash
|
|
176
|
+
# Go: pprof CPU profile
|
|
177
|
+
go tool pprof -http=:6060 "http://localhost:8080/debug/pprof/profile?seconds=30"
|
|
178
|
+
|
|
179
|
+
# Java: async-profiler (low overhead, production-safe)
|
|
180
|
+
./asprof -d 30 -f flamegraph.html <pid>
|
|
181
|
+
|
|
182
|
+
# Node.js: 0x or clinic.js
|
|
183
|
+
npx 0x -- node server.js
|
|
184
|
+
npx clinic flame -- node server.js
|
|
185
|
+
|
|
186
|
+
# Python: py-spy (sampling profiler, production-safe)
|
|
187
|
+
py-spy record -o profile.svg --pid <pid>
|
|
188
|
+
```
|
|
189
|
+
|
|
190
|
+
### Load Testing Tools
|
|
191
|
+
|
|
192
|
+
| Tool | Language | Protocol Support | Key Strength | Max RPS (single machine) |
|
|
193
|
+
|---------|----------|-------------------------|---------------------------------------|--------------------------|
|
|
194
|
+
| k6 | Go/JS | HTTP/1.1, HTTP/2, gRPC | Scripting flexibility, cloud mode | 50,000+ |
|
|
195
|
+
| wrk | C | HTTP/1.1 | Raw throughput, minimal overhead | 100,000+ |
|
|
196
|
+
| hey | Go | HTTP/1.1, HTTP/2 | Simple, quick benchmarks | 50,000+ |
|
|
197
|
+
| Gatling | Scala | HTTP, WebSocket, gRPC | Advanced scenarios, detailed reports | 30,000+ |
|
|
198
|
+
| Locust | Python | HTTP (extensible) | Python scripting, distributed mode | 10,000+ |
|
|
199
|
+
| vegeta | Go | HTTP | Constant-rate load, histogram output | 50,000+ |
|
|
200
|
+
|
|
201
|
+
**Example k6 load test:**
|
|
202
|
+
|
|
203
|
+
```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 },  // ramp up
    { duration: '5m', target: 100 },  // sustain
    { duration: '2m', target: 500 },  // spike
    { duration: '2m', target: 0 },    // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<300', 'p(99)<1000'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/v1/orders');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 300ms': (r) => r.timings.duration < 300,
  });
  sleep(1);
}
```
|
|
229
|
+
|
|
230
|
+
---
|
|
231
|
+
|
|
232
|
+
## 4. Common Bottlenecks
|
|
233
|
+
|
|
234
|
+
### 4.1 N+1 Query Problem
|
|
235
|
+
|
|
236
|
+
**Impact:** 10-50x slowdown on list endpoints.
|
|
237
|
+
**Mechanism:** One query fetches N parent records; N additional queries fetch related records individually.
|
|
238
|
+
|
|
239
|
+
A concrete benchmark: fetching 100 records with N+1 produces 101 queries at ~10ms each = 1,010ms total. A single JOIN or eager-loaded query takes 20ms -- a 50x improvement.
|
|
240
|
+
|
|
241
|
+
Real-world case: 800 items across 17 categories took 1+ second with 18 N+1 queries. A single JOIN achieved the same result 10x faster at ~14ms reading 834 rows vs. 42ms reading 13,889 rows.
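
To make the query-count difference concrete, here is a small sketch using an in-memory SQLite database. The `authors` and `books` tables are invented for illustration, and timings are deliberately not measured because they depend on the database and network.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
""")
conn.executemany("INSERT INTO authors VALUES (?, ?)",
                 [(i, f"author-{i}") for i in range(100)])
conn.executemany("INSERT INTO books (author_id, title) VALUES (?, ?)",
                 [(i, f"book-{i}") for i in range(100)])


def titles_n_plus_one(conn):
    """One query for the parents, then one query per parent (N+1)."""
    queries = 1
    result = {}
    for author_id, name in conn.execute("SELECT id, name FROM authors").fetchall():
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ?", (author_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_joined(conn):
    """Same data from a single JOIN."""
    result = {}
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "LEFT JOIN books b ON b.author_id = a.id"
    )
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:
            result[name].append(title)
    return result, 1


slow_result, slow_queries = titles_n_plus_one(conn)
fast_result, fast_queries = titles_joined(conn)
print(slow_queries, fast_queries)
```

Both functions return the same mapping; only the number of round trips to the database differs.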
|
|
242
|
+
|
|
243
|
+
### 4.2 Missing or Incorrect Database Indexes
|
|
244
|
+
|
|
245
|
+
**Impact:** Full table scans turn O(log n) lookups into O(n) scans. On a table with 1M rows, a missing index can turn a 2ms query into a 2000ms query.
|
|
246
|
+
|
|
247
|
+
### 4.3 Serialization Overhead
|
|
248
|
+
|
|
249
|
+
**Impact:** JSON serialization can consume 10-30% of total response time for large payloads.
|
|
250
|
+
|
|
251
|
+
Protobuf is 5-7x faster than JSON for serialization/deserialization (Go benchmarks), with 3x fewer memory allocations and 3-10x smaller wire format. Java benchmarks show Protobuf at 4x+ faster than JSON.
|
|
252
|
+
|
|
253
|
+
### 4.4 Network Hops and Service-to-Service Calls
|
|
254
|
+
|
|
255
|
+
**Impact:** Each network hop adds 0.5-5ms within a datacenter, 50-150ms cross-region. A chain of 5 synchronous service calls adds 2.5-25ms of pure network overhead internally.
|
|
256
|
+
|
|
257
|
+
### 4.5 Cold Starts (Serverless)
|
|
258
|
+
|
|
259
|
+
**Impact:** AWS Lambda cold starts range from 16ms (Rust/arm64) to 6+ seconds (Java without SnapStart). Node.js and Python typically cold start in 100-300ms. Java with SnapStart reduces from 6.1s to 1.4s (4.3x improvement). Memory allocation matters: 512MB Lambda cold starts 40% faster than 128MB.
|
|
260
|
+
|
|
261
|
+
### 4.6 Connection Pool Exhaustion
|
|
262
|
+
|
|
263
|
+
**Impact:** When all database connections in the pool are in use, new requests queue. A pool of 20 connections serving 200 concurrent requests means 180 requests wait. Each blocked request adds the full wait time to its latency.
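
A rough worked example of the queueing effect, under two simplifying assumptions: all requests arrive at once, and each holds a connection for the same fixed time. The helper `worst_case_wait_ms` and the 10ms hold time are illustrative only.

```python
import math


def worst_case_wait_ms(concurrent_requests: int, pool_size: int,
                       hold_ms: float) -> float:
    """Wait of the last request when requests are served in 'waves'
    of `pool_size`, each wave holding its connections for `hold_ms`."""
    waves = math.ceil(concurrent_requests / pool_size)
    return (waves - 1) * hold_ms


queued = max(0, 200 - 20)  # requests that cannot get a connection immediately
print(queued, worst_case_wait_ms(200, 20, hold_ms=10))
```

With slower queries the hold time grows, and the wait of the queued requests grows with it.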
|
|
264
|
+
|
|
265
|
+
### 4.7 Garbage Collection Pauses
|
|
266
|
+
|
|
267
|
+
**Impact:** Stop-the-world GC pauses can spike p99 from 10ms to 50-500ms. In Java, poorly tuned GC can cause 100ms+ pauses. Modern collectors like ZGC and Shenandoah reduce pause times by 10-1000x (sub-millisecond pauses) with a 5-15% throughput trade-off. In Go, tuning GOGC to 50-80 yields frequent but shorter pauses, reducing tail latency spikes.
|
|
268
|
+
|
|
269
|
+
### 4.8 Middleware Chain Overhead
|
|
270
|
+
|
|
271
|
+
**Impact:** Each middleware layer (auth, logging, rate limiting, CORS, request parsing) adds 0.5-5ms. A chain of 10 middleware layers can add 5-50ms before business logic even executes.
|
|
272
|
+
|
|
273
|
+
### 4.9 DNS Resolution
|
|
274
|
+
|
|
275
|
+
**Impact:** DNS lookups add 10-200ms when not cached. Internal service discovery via DNS can add latency if TTL is too low, causing frequent re-resolution.
|
|
276
|
+
|
|
277
|
+
### 4.10 TLS Handshake
|
|
278
|
+
|
|
279
|
+
**Impact:** Full TLS 1.2 handshake adds 2 round trips (40-200ms). TLS 1.3 reduces this to 1 round trip. TLS session resumption can eliminate the handshake overhead on subsequent requests.
|
|
280
|
+
|
|
281
|
+
### 4.11 Large Payload Sizes
|
|
282
|
+
|
|
283
|
+
**Impact:** Uncompressed 1MB JSON response takes ~80ms to transfer on 100Mbps. With gzip (60-70% reduction) or Brotli (65-75% reduction), this drops to 20-30ms of transfer time.
|
|
284
|
+
|
|
285
|
+
### 4.12 Synchronous External API Calls
|
|
286
|
+
|
|
287
|
+
**Impact:** Blocking on third-party APIs (payment processors, email services) adds their full latency to your response time. A Stripe API call averaging 300ms makes your API at least 300ms.
|
|
288
|
+
|
|
289
|
+
### 4.13 Logging and Observability Overhead
|
|
290
|
+
|
|
291
|
+
**Impact:** Synchronous logging to disk or network can add 1-10ms per request. Structured logging with JSON encoding is 2-5x slower than plain text. Solution: buffer and flush asynchronously.
|
|
292
|
+
|
|
293
|
+
### 4.14 Lock Contention
|
|
294
|
+
|
|
295
|
+
**Impact:** Mutex contention in hot paths serializes parallel work. Under high concurrency, a critical section held for 1ms can create 100ms+ waits as goroutines/threads queue.
|
|
296
|
+
|
|
297
|
+
### 4.15 Inefficient Memory Allocation
|
|
298
|
+
|
|
299
|
+
**Impact:** Excessive heap allocations increase GC pressure. In Go, each allocation adds ~25ns; at 100K allocations per request, that is 2.5ms of pure allocation overhead plus GC impact.
|
|
300
|
+
|
|
301
|
+
### 4.16 Unbounded Result Sets
|
|
302
|
+
|
|
303
|
+
**Impact:** Returning 100,000 rows from a database when the client only needs 20 wastes DB I/O, network bandwidth, serialization time, and client memory. Server-side pagination can reduce response sizes by 50-99%.
|
|
304
|
+
|
|
305
|
+
### 4.17 Missing Response Compression
|
|
306
|
+
|
|
307
|
+
**Impact:** Gzip reduces JSON payload sizes by 60-80%. Brotli achieves 65-85% compression. A 500KB JSON response shrinks to roughly 100-200KB with gzip, removing 300-400KB from every transfer.
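
The effect is easy to measure on your own payloads. The sketch below compresses a synthetic JSON list with Python's standard `gzip` module and estimates transfer time on a given link. The payload shape is made up, and real compression ratios depend on the data.

```python
import gzip
import json

# Synthetic, repetitive payload standing in for a list endpoint response.
payload = json.dumps([
    {"id": i, "status": "shipped", "customer": f"customer-{i % 50}",
     "total_cents": 1000 + i}
    for i in range(5000)
]).encode("utf-8")

compressed = gzip.compress(payload, compresslevel=6)
reduction = 1 - len(compressed) / len(payload)


def transfer_ms(size_bytes: int, link_mbps: float) -> float:
    """Serialization delay only: ignores round trips and TCP slow start."""
    return size_bytes * 8 / (link_mbps * 1_000_000) * 1000


print(f"raw={len(payload)}B gzip={len(compressed)}B reduction={reduction:.0%}")
print(f"raw transfer on 100Mbps: {transfer_ms(len(payload), 100):.1f}ms")
print(f"gzip transfer on 100Mbps: {transfer_ms(len(compressed), 100):.1f}ms")
```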
|
|
308
|
+
|
|
309
|
+
---
|
|
310
|
+
|
|
311
|
+
## 5. Optimization Patterns
|
|
312
|
+
|
|
313
|
+
### 5.1 Connection Pooling
|
|
314
|
+
|
|
315
|
+
Maintain persistent connections to databases, caches, and downstream services.
|
|
316
|
+
|
|
317
|
+
**Benchmarks (PgBouncer for PostgreSQL):**
|
|
318
|
+
- Throughput improvement: 3.4x to 7.0x higher TPS
|
|
319
|
+
- Connection speed: 3.9x to 9.9x faster connection establishment
|
|
320
|
+
- Latency reduction: up to 7x lower query latency
|
|
321
|
+
- Under stress: 2.1x more transactions completed
|
|
322
|
+
|
|
323
|
+
**Configuration guidelines:**
|
|
324
|
+
|
|
325
|
+
| Parameter | Recommended Value | Rationale |
|
|
326
|
+
|------------------------|------------------------------------|------------------------------------|
|
|
327
|
+
| Pool size | (2 * CPU cores) + disk spindles | HikariCP formula; avoids contention|
|
|
328
|
+
| Connection timeout | 5-10 seconds | Fail fast rather than queue |
|
|
329
|
+
| Idle timeout | 10-30 minutes | Reclaim unused connections |
|
|
330
|
+
| Max lifetime | 30-60 minutes | Prevent stale connection issues |
|
|
331
|
+
| Validation query | Every 30-60 seconds | Detect dead connections |
|
|
332
|
+
|
|
333
|
+
### 5.2 Query Optimization
|
|
334
|
+
|
|
335
|
+
- **Add indexes** on columns used in WHERE, JOIN, ORDER BY, and GROUP BY clauses
|
|
336
|
+
- **Use EXPLAIN ANALYZE** to verify query plans and detect full table scans
|
|
337
|
+
- **Eliminate N+1** queries using JOINs, subqueries, or ORM eager loading
|
|
338
|
+
- **Denormalize** read-heavy data where appropriate (trade storage for speed)
|
|
339
|
+
- **Use covering indexes** so the database can answer queries from the index alone
|
|
340
|
+
- **Partition large tables** (time-series data benefits from range partitioning)
|
|
341
|
+
|
|
342
|
+
### 5.3 Caching Layers
|
|
343
|
+
|
|
344
|
+
**Multi-tier caching strategy:**
|
|
345
|
+
|
|
346
|
+
```
Client Cache (browser/mobile) -----> 0ms (cache hit)
        |
CDN / Edge Cache -----------------> 1-10ms
        |
API Gateway Cache ----------------> 1-5ms
        |
Application Cache (Redis) --------> 0.5-2ms
        |
Database Query Cache --------------> avoid (unpredictable invalidation)
        |
Database (source of truth) -------> 5-100ms+
```
|
|
359
|
+
|
|
360
|
+
**Redis caching benchmarks:**
|
|
361
|
+
- API response latency drops from ~300ms to <30ms (10x improvement, ~90% reduction)
|
|
362
|
+
- AWS ElastiCache Redis 7.1 achieves sub-millisecond p99 at peak load
|
|
363
|
+
- Edge caching enables 5ms global latency even with geographically distributed clients
|
|
364
|
+
|
|
365
|
+
**Cache invalidation strategies:**
|
|
366
|
+
|
|
367
|
+
| Strategy | Consistency | Complexity | Use Case |
|
|
368
|
+
|-------------------|-------------|------------|----------------------------------|
|
|
369
|
+
| TTL-based | Eventual | Low | Rarely-changing reference data |
|
|
370
|
+
| Write-through | Strong | Medium | User profiles, settings |
|
|
371
|
+
| Write-behind | Eventual | High | High-write-volume analytics |
|
|
372
|
+
| Cache-aside | Eventual | Low | General purpose |
|
|
373
|
+
| Event-driven | Near-real | Medium | Multi-service cache invalidation |
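
As an illustration of the cache-aside and TTL-based rows, here is a minimal in-process sketch. A plain dictionary stands in for Redis, and the `TTLCache` class and its method names are hypothetical.

```python
import time


class TTLCache:
    """Cache-aside with TTL expiry: read the cache first, load on a miss."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: skip the source of truth
        value = loader(key)          # miss or expired: go to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Explicit invalidation, e.g. after a write to the source of truth."""
        self._entries.pop(key, None)
```

Between expiry and reload, readers may see data up to one TTL old, which is why the table lists these strategies as eventually consistent.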
|
|
374
|
+
|
|
375
|
+
### 5.4 Async Processing
|
|
376
|
+
|
|
377
|
+
Move non-critical work out of the request path:
|
|
378
|
+
|
|
379
|
+
- **Email/notification sending** -- queue via RabbitMQ, SQS, or Kafka
|
|
380
|
+
- **Analytics/logging** -- fire-and-forget to a message bus
|
|
381
|
+
- **Image/file processing** -- return 202 Accepted, process in background
|
|
382
|
+
- **Third-party webhooks** -- enqueue and retry independently
|
|
383
|
+
|
|
384
|
+
Return early with a 202 Accepted + a polling endpoint or WebSocket for status updates. This can reduce perceived latency from seconds to milliseconds.
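
The accept-then-poll flow can be sketched with the standard library alone. An in-process queue and a worker thread stand in for a real broker such as SQS or RabbitMQ, and `submit`, `poll`, and the `/jobs/...` URL are hypothetical names.

```python
import queue
import threading
import uuid

jobs = queue.Queue()
status = {}    # job_id -> "queued" | "processing" | "done"
results = {}   # job_id -> result of the background work


def worker():
    while True:
        job_id, payload = jobs.get()
        try:
            status[job_id] = "processing"
            results[job_id] = payload.upper()  # stand-in for slow work
            status[job_id] = "done"
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()


def submit(payload: str):
    """Request handler: enqueue the work and answer immediately with 202."""
    job_id = str(uuid.uuid4())
    status[job_id] = "queued"
    jobs.put((job_id, payload))
    return 202, {"job_id": job_id, "status_url": f"/jobs/{job_id}"}


def poll(job_id: str) -> str:
    """Polling endpoint: report the current state of a job."""
    return status.get(job_id, "unknown")
```

The request path only pays for the enqueue; the slow work happens on the worker.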
|
|
385
|
+
|
|
386
|
+
### 5.5 Payload Optimization
|
|
387
|
+
|
|
388
|
+
- **Sparse fieldsets** -- let clients specify which fields they need (GraphQL, JSON:API)
|
|
389
|
+
- **Pagination** -- server-side with cursor-based pagination for stable performance
|
|
390
|
+
- **Compression** -- enable gzip/Brotli at the API gateway or reverse proxy level
|
|
391
|
+
- **Binary formats** -- Protobuf for internal services (5-7x faster than JSON, 3-10x smaller)
|
|
392
|
+
- **Streaming** -- use chunked transfer encoding or Server-Sent Events for large result sets
|
|
393
|
+
|
|
394
|
+
### 5.6 HTTP/2 and HTTP/3
|
|
395
|
+
|
|
396
|
+
HTTP/2 multiplexing allows multiple requests over a single TCP connection, eliminating head-of-line blocking at the HTTP layer. Header compression (HPACK) reduces repeated header overhead -- especially valuable for APIs where the same headers appear in every call.
|
|
397
|
+
|
|
398
|
+
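HTTP/3 (QUIC) eliminates TCP head-of-line blocking entirely and combines the transport and TLS handshakes: new connections complete setup in 1 round trip, and resumed connections can send data with 0-RTT, saving 100-300ms of connection setup on high-latency links.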
|
|
399
|
+
|
|
400
|
+
### 5.7 Edge Computing and CDN Caching
|
|
401
|
+
|
|
402
|
+
Deploy API responses at edge locations to reduce geographic latency:
|
|
403
|
+
|
|
404
|
+
- CDN-cached API responses reduce latency by up to 60% for geographically dispersed users
|
|
405
|
+
- Edge computing with regional API gateways reduces latency penalties by 70%+
|
|
406
|
+
- Stale-while-revalidate patterns serve cached responses immediately while refreshing in background
|
|
407
|
+
|
|
408
|
+
### 5.8 Request Coalescing and Batching
|
|
409
|
+
|
|
410
|
+
- **Request coalescing:** deduplicate identical in-flight requests so only one hits the backend
|
|
411
|
+
- **Batching:** combine multiple small requests into one (e.g., DataLoader pattern)
|
|
412
|
+
- **Debouncing:** on the client side, wait for user to stop typing before sending search queries
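
Request coalescing is sometimes called "single flight". The following is a minimal `asyncio` sketch under the assumption that callers share one event loop; the `SingleFlight` class is illustrative rather than a library API.

```python
import asyncio


class SingleFlight:
    """Deduplicate identical in-flight requests: callers asking for the
    same key while a fetch is running share that fetch's result."""

    def __init__(self):
        self._in_flight = {}  # key -> asyncio.Task

    async def do(self, key, fetch):
        task = self._in_flight.get(key)
        if task is None:
            task = asyncio.ensure_future(fetch(key))
            self._in_flight[key] = task
            # Forget the key once the fetch finishes so later calls refetch.
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        # shield: one cancelled caller must not cancel the shared fetch.
        return await asyncio.shield(task)
```

Five concurrent callers requesting the same key result in a single backend call; a caller arriving after the fetch completes triggers a new one.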
|
|
413
|
+
|
|
414
|
+
---
|
|
415
|
+
|
|
416
|
+
## 6. Anti-Patterns
|
|
417
|
+
|
|
418
|
+
### 6.1 Over-Fetching
|
|
419
|
+
|
|
420
|
+
Returning entire database rows when the client only needs 3 fields. A user profile endpoint returning 50 fields when the UI only displays name, avatar, and email. This wastes serialization time, bandwidth, and client parsing time.
|
|
421
|
+
|
|
422
|
+
### 6.2 Chatty Microservices
|
|
423
|
+
|
|
424
|
+
Making 10+ synchronous inter-service calls per request. Each call adds 1-5ms of network overhead; sequential calls add up, and parallel calls are only as fast as the slowest one. Prefer coarse-grained service boundaries or aggregate services (Backend-for-Frontend pattern).
|
|
425
|
+
|
|
426
|
+
### 6.3 Synchronous Chains
|
|
427
|
+
|
|
428
|
+
Service A calls B, which calls C, which calls D -- all synchronously. The total latency is the SUM of all calls. If A=10ms, B=20ms, C=30ms, D=15ms, total = 75ms minimum. Any spike in D cascades to all upstream callers.
|
|
429
|
+
|
|
430
|
+
### 6.4 Premature Caching Without Measurement
|
|
431
|
+
|
|
432
|
+
Adding Redis before profiling. If the bottleneck is serialization or CPU, caching will not help. Measure first, cache second. Caching the wrong thing adds complexity, increases stale data risk, and wastes infrastructure budget.
|
|
433
|
+
|
|
434
|
+
### 6.5 No Pagination on List Endpoints
|
|
435
|
+
|
|
436
|
+
Returning all 100,000 records when the client displays 20. This creates multi-second response times, high memory usage, and potential OOM errors under concurrent load.
|
|
437
|
+
|
|
438
|
+
### 6.6 Synchronous Logging
|
|
439
|
+
|
|
440
|
+
Writing structured JSON logs to disk or a network endpoint synchronously in the request path. Each log statement adds 0.5-5ms. Solution: use async log writers with in-memory buffers.
|
|
441
|
+
|
|
442
|
+
### 6.7 Unbounded Retry Loops
|
|
443
|
+
|
|
444
|
+
Retrying failed downstream calls without exponential backoff, jitter, or circuit breaking. This amplifies load on already-struggling services and can cause cascading failures.
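
A bounded alternative, sketched with capped exponential backoff and full jitter. The function name and default values are illustrative, and circuit breaking is left out for brevity.

```python
import random
import time


def retry_with_backoff(call, max_attempts=4, base_delay=0.1, max_delay=2.0,
                       sleep=time.sleep, rng=random.random):
    """Call `call()` up to `max_attempts` times, sleeping a random amount
    up to an exponentially growing, capped ceiling between attempts."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * ceiling)  # full jitter spreads out retry storms
```

In practice, retry only errors that are safe to retry (timeouts, 503s on idempotent calls), not every exception as this sketch does.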
|
|
445
|
+
|
|
446
|
+
### 6.8 Missing Timeouts
|
|
447
|
+
|
|
448
|
+
Not setting timeouts on HTTP clients, database connections, or gRPC calls. A single slow dependency can block a thread/goroutine indefinitely, eventually exhausting the connection pool and causing the entire service to stall.
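
A minimal sketch of bounding a slow dependency with `asyncio.wait_for`. The helper `call_with_timeout` and its fallback behaviour are assumptions for illustration; most HTTP and database clients also expose their own timeout settings, which should be set as well.

```python
import asyncio


async def call_with_timeout(coro, timeout_s: float, fallback=None):
    """Await `coro` for at most `timeout_s` seconds, then cancel it and
    return `fallback` instead of blocking the caller indefinitely."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return fallback
```

Returning a fallback suits optional data such as recommendations; for required data, re-raising and failing fast is usually the better choice.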
|
|
449
|
+
|
|
450
|
+
### 6.9 Ignoring Connection Reuse
|
|
451
|
+
|
|
452
|
+
Creating a new HTTP client or database connection per request instead of reusing pooled connections. TCP handshake (1-3ms) + TLS handshake (10-50ms) per request adds up quickly at high RPS.
|
|
453
|
+
|
|
454
|
+
### 6.10 N+1 API Calls from Frontend
|
|
455
|
+
|
|
456
|
+
The API returns a list of IDs; the frontend makes individual requests for each ID. This is the HTTP equivalent of N+1 queries. Provide batch endpoints or embed related resources.
|
|
457
|
+
|
|
458
|
+
### 6.11 Blocking on Non-Critical Work
|
|
459
|
+
|
|
460
|
+
Sending confirmation emails, updating analytics, or generating audit logs synchronously before returning the response. If it is not needed for the response, do it asynchronously.
|
|
461
|
+
|
|
462
|
+
### 6.12 Using SELECT * in Production Queries
|
|
463
|
+
|
|
464
|
+
Fetching all columns when only a few are needed wastes I/O, increases deserialization time, and prevents the use of covering indexes.
|
|
465
|
+
|
|
466
|
+
### 6.13 Misconfigured Service Meshes
|
|
467
|
+
|
|
468
|
+
Service meshes like Istio and Linkerd add observability and security but can introduce up to 25% latency increase when misconfigured (Red Hat 2023 survey). Ensure sidecar proxy resource limits are properly tuned.
|
|
469
|
+
|
|
470
|
+
---
|
|
471
|
+
|
|
472
|
+
## 7. Architecture-Level Decisions

### Synchronous vs. Asynchronous Communication

| Dimension            | Synchronous (REST/gRPC)           | Asynchronous (Message Queue)        |
|----------------------|-----------------------------------|-------------------------------------|
| Latency              | Sum of all hops                   | Only critical-path hops             |
| Coupling             | Tight (caller waits)              | Loose (fire and forget)             |
| Error handling       | Direct propagation                | Dead letter queues, retries         |
| Consistency          | Easier strong consistency         | Eventual consistency patterns       |
| Debugging            | Straightforward stack traces      | Requires correlation IDs, tracing   |
| Throughput           | Limited by slowest service        | Buffered, handles spikes gracefully |

**Guideline:** Use synchronous calls only when the response is needed to continue processing. For everything else, prefer asynchronous messaging.

### Monolith vs. Microservices -- Latency Tradeoffs

| Aspect                | Monolith                          | Microservices                       |
|-----------------------|-----------------------------------|-------------------------------------|
| In-process calls      | ~100ns function call              | 0.5-5ms network call                |
| Serialization         | None (shared memory)              | JSON: 1-10ms; Protobuf: 0.1-2ms     |
| Fan-out latency       | N/A                               | Amplified by service count          |
| Deployment latency    | Full redeploy (minutes)           | Independent deploy (seconds)        |
| Tail latency          | Single process, predictable       | Distributed, harder to bound        |

A monolith has inherently lower latency for internal operations. Microservices trade latency for scalability, team autonomy, and deployment independence. The CQRS pattern can achieve a 58% reduction in inter-service communication overhead by separating read and write paths.

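The "amplified by service count" and "harder to bound" entries can be made concrete with a little arithmetic. The sketch below assumes each downstream call independently has the same chance of landing in its slow tail, which is an idealization (real services are often correlated); the function name is illustrative.

```python
def prob_any_slow(fanout, per_call_slow_prob=0.01):
    """Probability that at least one of `fanout` parallel calls hits its slow tail.

    Assumes calls are independent and each is slow with `per_call_slow_prob`
    (0.01 corresponds to "slower than that service's own p99").
    """
    return 1 - (1 - per_call_slow_prob) ** fanout


for n in (1, 10, 100):
    print(f"fan-out {n:>3}: {prob_any_slow(n):.1%} of requests wait on a p99-slow call")
```

Under these assumptions, a request that fans out to 100 services waits on at least one p99-slow call roughly 63% of the time, which is why per-service p99 targets have to tighten as fan-out grows.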
### CQRS (Command Query Responsibility Segregation)

Separate read and write models:

- **Write side:** normalized, consistent, handles commands through domain logic
- **Read side:** denormalized, optimized for query patterns, eventually consistent

Benefits for latency:

- Read APIs serve from pre-computed, denormalized views (no JOINs at query time)
- Write APIs are simpler (no read optimization logic)
- Read replicas can scale independently

### Event Sourcing

Store events rather than current state. Benefits for latency:

- Writes are append-only (extremely fast, no read-modify-write)
- Read models are materialized views, optimized per query pattern
- Event replay enables building new read models without touching production data

Trade-off: increased complexity, eventual consistency, and storage costs.

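The append-only write path and the replayable read model can be sketched in a few lines. Everything here is hypothetical: the `Event` shape, the account domain, and the in-memory list standing in for a durable event store.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    account_id: str
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int  # minor units, e.g. cents


event_log = []  # append-only write side; a real system would use a durable store


def append(event):
    """Write path: append only, no read-modify-write of current state."""
    event_log.append(event)


def build_balances(events):
    """Read path: fold the event stream into a denormalized balance view."""
    balances = defaultdict(int)
    for e in events:
        balances[e.account_id] += e.amount if e.kind == "deposited" else -e.amount
    return dict(balances)


append(Event("acct-1", "deposited", 500))
append(Event("acct-1", "withdrawn", 120))
append(Event("acct-2", "deposited", 75))

balances = build_balances(event_log)      # materialized read model
replayed = build_balances(event_log[:2])  # replaying a prefix rebuilds past state
```

A new read model is just another fold over the same log, which is what makes it possible to add query patterns later without migrating the write side.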
### Zone-Aware Routing

Direct traffic to services within the same availability zone to minimize cross-zone latency (typically 1-3ms per zone hop). Tools like Istio support zone-aware routing natively.

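Where the platform does not provide this, the core idea fits in a small client-side selector. This is a simplified sketch: the endpoint dictionaries, addresses, and zone names are made up, and real implementations also weigh endpoint health and load.

```python
import random


def pick_endpoint(endpoints, local_zone, rng=random):
    """Prefer endpoints in the caller's zone; fall back to any zone if none are local."""
    local = [e for e in endpoints if e["zone"] == local_zone]
    return rng.choice(local or endpoints)


# Hypothetical service endpoints spread across two zones.
endpoints = [
    {"addr": "10.0.1.5:8080", "zone": "zone-a"},
    {"addr": "10.0.2.7:8080", "zone": "zone-b"},
    {"addr": "10.0.1.9:8080", "zone": "zone-a"},
]

chosen = pick_endpoint(endpoints, local_zone="zone-a")    # always a zone-a endpoint
fallback = pick_endpoint(endpoints, local_zone="zone-c")  # no local match: any endpoint
```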
### Circuit Breaker Pattern

Circuit breakers are reported to reduce cascading failures by 78% and to enable automatic recovery in 94% of partial-failure scenarios, within about 2.8 seconds on average. An open breaker fails fast instead of waiting on the dependency, which prevents a slow dependency from consuming all connection pool slots and degrading the entire service.

---

## 8. Testing and Regression Prevention

### Load Testing in CI/CD

Integrate performance tests into the deployment pipeline:

```yaml
# Example: GitHub Actions performance gate
performance-test:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Start application
      run: docker compose up -d   # Compose v2 syntax
    - name: Run k6 load test
      uses: grafana/k6-action@v0.4.0
      with:
        filename: tests/performance/api-load.js
    - name: Check thresholds
      run: |
        # k6 exits with code 99 if thresholds are breached,
        # which fails the k6 step above and stops the pipeline.
        # This step only runs when the thresholds passed.
        echo "Performance thresholds validated"
```

### Latency Budgets as SLOs

Define Service Level Objectives tied to latency percentiles:

| SLI (Indicator)                          | SLO (Objective)        | Error Budget (30 days) |
|------------------------------------------|------------------------|------------------------|
| p95 response time for /api/orders        | < 300ms, 99.5% of time | 3.6 hours of violation |
| p99 response time for /api/search        | < 1000ms, 99% of time  | 7.2 hours of violation |
| Availability (non-5xx)                   | 99.9%                  | 43.2 minutes downtime  |

When the error budget is consumed, freeze feature deployments and focus on reliability.

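The error budgets in the table follow directly from the SLO target and the window length. A small sketch of that arithmetic (the function name is illustrative):

```python
def error_budget_minutes(slo_target, window_days=30):
    """Minutes of allowed violation in the window for an SLO target such as 0.995."""
    window_minutes = window_days * 24 * 60
    return (1 - slo_target) * window_minutes


for target in (0.995, 0.99, 0.999):
    minutes = error_budget_minutes(target)
    print(f"{target:.1%} SLO -> {minutes:.1f} min ({minutes / 60:.1f} h) per 30 days")
```

For a 30-day window this reproduces the 3.6 hours, 7.2 hours, and 43.2 minutes listed above.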
### Continuous Profiling

Run continuous profiling in production to detect regressions as they happen:

- **Pyroscope** (open source): continuous profiling for Go, Python, Java, Node.js
- **Datadog Continuous Profiler**: tied to APM traces
- **Google Cloud Profiler**: low-overhead production profiling

### Performance Regression Detection

```python
# Example: Compare latency distributions between releases
import numpy as np
from scipy import stats


def check_latency_regression(baseline_latencies, canary_latencies):
    """Raise if the canary's p99 is >10% worse and the shift is statistically significant."""
    baseline_p99 = np.percentile(baseline_latencies, 99)
    canary_p99 = np.percentile(canary_latencies, 99)

    # Statistical significance test (two-sided Mann-Whitney U):
    # do the two samples come from different distributions?
    _, p_value = stats.mannwhitneyu(
        baseline_latencies, canary_latencies, alternative="two-sided"
    )

    if canary_p99 > baseline_p99 * 1.1 and p_value < 0.05:
        raise RuntimeError(
            f"Latency regression detected: p99 went from "
            f"{baseline_p99:.1f}ms to {canary_p99:.1f}ms (p={p_value:.4f})"
        )
```

### Synthetic Monitoring

Run synthetic API tests from multiple geographic regions every 1-5 minutes:

- Detect latency degradations before users report them
- Track TTFB trends over weeks/months
- Alert when p95 exceeds budget for 5+ consecutive checks

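The consecutive-check alert rule from the list above can be sketched as a small predicate. The function name, the 300 ms budget, and the sample readings are hypothetical; requiring several breaches in a row is what keeps a single noisy probe from paging anyone.

```python
def should_alert(p95_samples_ms, budget_ms, required_consecutive=5):
    """Return True once the most recent `required_consecutive` checks all exceed the budget."""
    if len(p95_samples_ms) < required_consecutive:
        return False
    recent = p95_samples_ms[-required_consecutive:]
    return all(sample > budget_ms for sample in recent)


# Hypothetical p95 readings (ms) from successive synthetic checks, oldest first.
healthy = [280, 310, 290, 320, 305, 295]        # isolated spikes, never 5 in a row
degraded = [280, 310, 320, 305, 330, 340, 315]  # last 6 checks all above 300 ms

print(should_alert(healthy, budget_ms=300))
print(should_alert(degraded, budget_ms=300))
```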
---

## 9. Decision Trees

### "My API Is Slow" -- Diagnostic Flowchart

```
START: API response time exceeds budget
  |
  +---> Is it slow for ALL requests or only SOME?
  |       |
  |       +---> ALL requests:
  |       |       |
  |       |       +---> Check CPU utilization
  |       |       |       > 80%? --> Profile CPU (flame graph)
  |       |       |                  - Hot serialization loop?
  |       |       |                  - Regex/parsing overhead?
  |       |       |                  - Crypto operations?
  |       |       |
  |       |       +---> Check memory / GC
  |       |       |       High GC pause? --> Tune GC, reduce allocations
  |       |       |       OOM pressure?  --> Increase memory or reduce footprint
  |       |       |
  |       |       +---> Check connection pools
  |       |               All in use?      --> Increase pool size or reduce hold time
  |       |               Timeout waiting? --> Check downstream service health
  |       |
  |       +---> SOME requests (tail latency):
  |               |
  |               +---> Specific endpoints?
  |               |       Yes --> Profile that endpoint (N+1? Missing index?)
  |               |       No  --> Systemic issue (GC, noisy neighbor)
  |               |
  |               +---> Correlated with payload size?
  |               |       Yes --> Pagination, compression, field filtering
  |               |
  |               +---> Correlated with time of day?
  |                       Yes --> Load-related; auto-scaling or caching needed
  |
  +---> Where is the time spent? (Use distributed tracing)
          |
          +---> Database (> 50% of request time):
          |       - Check query plan (EXPLAIN ANALYZE)
          |       - Add missing indexes
          |       - Eliminate N+1 queries
          |       - Consider read replicas
          |       - Add caching layer
          |
          +---> Network / downstream services (> 30%):
          |       - Add timeouts and circuit breakers
          |       - Cache downstream responses
          |       - Parallelize independent calls
          |       - Consider async processing
          |
          +---> Serialization (> 15%):
          |       - Switch to Protobuf for internal APIs
          |       - Use streaming serialization
          |       - Reduce payload size (sparse fieldsets)
          |       - Enable compression (gzip/Brotli)
          |
          +---> Application logic (> 20%):
                  - Profile with flame graph
                  - Optimize hot loops
                  - Reduce memory allocations
                  - Consider algorithmic improvements
```

### "Should I Cache This?" -- Decision Framework

```
Is the data requested frequently?
  No  --> Do not cache (waste of memory)
  Yes --> Is the data expensive to compute/fetch?
    No  --> Probably not worth caching complexity
    Yes --> Can you tolerate stale data?
      No  --> Use write-through cache or skip caching
      Yes --> For how long?
        Seconds --> Cache with short TTL (rate-limit protection)
        Minutes --> Cache-aside with TTL (most common pattern)
        Hours+  --> CDN / edge cache with background refresh
```

### "Should I Make This Async?" -- Decision Framework

```
Is the result needed for the API response?
  Yes --> Must be synchronous (but can parallelize with other sync work)
  No  --> Does the user need the result later?
    No  --> Fire-and-forget via message queue
    Yes --> Return 202 Accepted + status polling endpoint
            Process in background worker
```

---

## 10. Code Examples

### Example 1: N+1 Query Elimination

**Before (N+1 -- 101 queries for 100 orders):**

```python
# BAD: 1 query for orders + 100 queries for customers
orders = db.execute("SELECT * FROM orders LIMIT 100")
for order in orders:
    customer = db.execute(
        "SELECT * FROM customers WHERE id = %s", (order.customer_id,)
    )
    order.customer = customer
# Total: 101 queries, ~1010ms at 10ms/query
```

**After (single JOIN -- 1 query):**

```python
# GOOD: 1 query with JOIN
orders = db.execute("""
    SELECT o.id, o.total, o.created_at,
           c.id as customer_id, c.name, c.email
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    LIMIT 100
""")
# Total: 1 query, ~15ms
# Improvement: 67x faster
```

**ORM equivalent (SQLAlchemy eager loading):**

```python
# BAD: lazy loading triggers N+1
orders = session.query(Order).limit(100).all()
for order in orders:
    print(order.customer.name)  # triggers individual SELECT per customer

# GOOD: joinedload eliminates N+1
from sqlalchemy.orm import joinedload
orders = (
    session.query(Order)
    .options(joinedload(Order.customer))
    .limit(100)
    .all()
)
for order in orders:
    print(order.customer.name)  # no additional queries
```

### Example 2: Response Caching with Redis

**Before (uncached -- 250ms):**

```python
@app.route("/api/products/<category>")
def get_products(category):
    # Direct DB query every time: ~200ms
    products = db.execute(
        "SELECT * FROM products WHERE category = %s ORDER BY popularity DESC",
        (category,)
    )
    return jsonify([serialize(p) for p in products])  # ~50ms serialization
# Total: ~250ms per request
```

**After (Redis-cached -- 2ms on cache hit):**

```python
import redis
import json

cache = redis.Redis(host='localhost', port=6379, decode_responses=True)
CACHE_TTL = 300  # 5 minutes

@app.route("/api/products/<category>")
def get_products(category):
    cache_key = f"products:{category}"

    # Check cache first: ~0.5ms
    cached = cache.get(cache_key)
    if cached:
        return cached, 200, {'Content-Type': 'application/json'}

    # Cache miss: query DB (~200ms) + serialize (~50ms)
    products = db.execute(
        "SELECT * FROM products WHERE category = %s ORDER BY popularity DESC",
        (category,)
    )
    result = json.dumps([serialize(p) for p in products])

    # Store in cache: ~0.5ms
    cache.setex(cache_key, CACHE_TTL, result)

    return result, 200, {'Content-Type': 'application/json'}
# Cache hit: ~2ms | Cache miss: ~252ms (same + cache write)
# At 90% hit rate, average = 0.9*2 + 0.1*252 = 27ms (9.3x improvement)
```

### Example 3: Async Processing -- Fire-and-Forget

**Before (synchronous -- 850ms):**

```python
from flask import jsonify, request

@app.route("/api/orders", methods=["POST"])
def create_order():
    order_data = request.get_json()         # Flask passes only URL variables as arguments
    order = validate_order(order_data)      # 10ms
    order = save_to_database(order)         # 30ms
    charge_payment(order)                   # 300ms (Stripe API)
    send_confirmation_email(order)          # 200ms (SendGrid API)
    update_inventory(order)                 # 50ms
    notify_warehouse(order)                 # 100ms
    track_analytics(order)                  # 50ms
    generate_invoice_pdf(order)             # 110ms
    return jsonify(order), 201
# Total: ~850ms -- user waits for everything
```

**After (async non-critical work -- 340ms):**

```python
from celery import Celery
from flask import jsonify, request

celery_app = Celery('orders', broker='redis://localhost:6379/0')

@app.route("/api/orders", methods=["POST"])
def create_order():
    order_data = request.get_json()
    order = validate_order(order_data)      # 10ms
    order = save_to_database(order)         # 30ms
    charge_payment(order)                   # 300ms (critical, must be sync)

    # Non-critical work: fire-and-forget to background workers
    celery_app.send_task('send_confirmation_email', args=[order.id])
    celery_app.send_task('update_inventory', args=[order.id])
    celery_app.send_task('notify_warehouse', args=[order.id])
    celery_app.send_task('track_analytics', args=[order.id])
    celery_app.send_task('generate_invoice_pdf', args=[order.id])
    # Enqueue time: ~2ms total

    return jsonify(order), 201
# Total: ~342ms -- 2.5x faster, user gets response immediately
```

### Example 4: Connection Pooling

**Before (new connection per request):**

```python
import psycopg2

def get_user(user_id):
    # New connection every time: ~20-50ms for TCP + TLS + auth
    conn = psycopg2.connect(
        host="db.example.com", dbname="app", user="api", password="secret"
    )
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
    user = cursor.fetchone()
    conn.close()
    return user
# Connection overhead: 20-50ms per request
```

**After (connection pool):**

```python
from psycopg2 import pool

# Initialize pool once at startup
db_pool = pool.ThreadedConnectionPool(
    minconn=5,
    maxconn=20,  # (2 * CPU_cores) + disk_spindles
    host="db.example.com",
    dbname="app",
    user="api",
    password="secret"
)

def get_user(user_id):
    conn = db_pool.getconn()  # ~0.1ms from pool
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
        user = cursor.fetchone()
        return user
    finally:
        db_pool.putconn(conn)  # return to pool, not closed
# Connection overhead: ~0.1ms per request (200-500x faster)
```

### Example 5: Parallel Service Calls

**Before (sequential -- 450ms):**

```python
async def get_dashboard(user_id):
    profile = await fetch_profile(user_id)                  # 100ms
    orders = await fetch_recent_orders(user_id)             # 150ms
    recommendations = await fetch_recommendations(user_id)  # 200ms
    return {
        "profile": profile,
        "orders": orders,
        "recommendations": recommendations,
    }
# Total: 100 + 150 + 200 = 450ms (sequential)
```

**After (parallel -- 200ms):**

```python
import asyncio

async def get_dashboard(user_id):
    profile, orders, recommendations = await asyncio.gather(
        fetch_profile(user_id),            # 100ms
        fetch_recent_orders(user_id),      # 150ms
        fetch_recommendations(user_id),    # 200ms
    )
    return {
        "profile": profile,
        "orders": orders,
        "recommendations": recommendations,
    }
# Total: max(100, 150, 200) = 200ms (parallel)
# Improvement: 2.25x faster
```

### Example 6: Payload Optimization with Field Selection

**Before (over-fetching -- 12KB response):**

```python
@app.route("/api/users")
def list_users():
    users = db.execute("SELECT * FROM users LIMIT 50")
    return jsonify([
        {
            "id": u.id,
            "name": u.name,
            "email": u.email,
            "avatar_url": u.avatar_url,
            "bio": u.bio,                      # 500 chars avg
            "settings": u.settings,            # large JSON blob
            "login_history": u.login_history,  # array of 100+ entries
            "metadata": u.metadata,            # internal data, not needed
            # ... 40+ more fields
        }
        for u in users
    ])
# Response: ~12KB, serialization: ~15ms
```

**After (sparse fieldsets -- 1.2KB response):**

```python
@app.route("/api/users")
def list_users():
    default_fields = ["id", "name", "avatar_url"]
    allowed = {"id", "name", "email", "avatar_url", "bio", "created_at"}
    requested = request.args.get("fields", ",".join(default_fields)).split(",")
    # Keep only whitelisted columns (this is also what makes the f-string below
    # safe to interpolate); fall back to the defaults if nothing valid remains.
    selected = [f for f in requested if f in allowed] or default_fields

    columns = ", ".join(selected)
    users = db.execute(f"SELECT {columns} FROM users LIMIT 50")
    return jsonify([{field: getattr(u, field) for field in selected} for u in users])
# Response: ~1.2KB, serialization: ~2ms
# Improvement: 10x smaller payload, 7.5x faster serialization
```

### Example 7: Database Index Optimization

**Before (full table scan -- 1200ms):**

```sql
-- Query without proper index on a 5M row table
EXPLAIN ANALYZE
SELECT * FROM events
WHERE user_id = 12345
  AND created_at > '2025-01-01'
ORDER BY created_at DESC
LIMIT 20;

-- Result: Seq Scan on events
-- Rows examined: 5,000,000
-- Execution time: 1200ms
```

**After (composite index -- 2ms):**

```sql
-- Add a composite index matching the query pattern
CREATE INDEX idx_events_user_created
ON events (user_id, created_at DESC);

-- Same query now uses index
EXPLAIN ANALYZE
SELECT * FROM events
WHERE user_id = 12345
  AND created_at > '2025-01-01'
ORDER BY created_at DESC
LIMIT 20;

-- Result: Index Scan using idx_events_user_created
-- Rows examined: 20
-- Execution time: 2ms
-- Improvement: 600x faster
```

### Example 8: Response Compression

**Before (uncompressed -- 480KB transfer):**

```python
@app.route("/api/reports/monthly")
def monthly_report():
    data = generate_report()  # Returns 480KB JSON
    return jsonify(data)
# Transfer size: 480KB
# Transfer time at 100Mbps: ~38ms
```

**After (Brotli compression -- 96KB transfer):**

```python
from flask import Flask
from flask_compress import Compress

app = Flask(__name__)
Compress(app)  # Enables gzip/Brotli automatically based on Accept-Encoding

# Or at the reverse proxy level (nginx):
# gzip on;
# gzip_types application/json;
# gzip_min_length 1000;
# brotli on;
# brotli_types application/json;
```

```nginx
# nginx configuration for compression
server {
    # Gzip: 60-70% reduction
    gzip on;
    gzip_types application/json application/javascript text/plain;
    gzip_min_length 1000;
    gzip_comp_level 6;  # balance between compression ratio and CPU

    # Brotli: 70-80% reduction (requires ngx_brotli module)
    brotli on;
    brotli_types application/json application/javascript text/plain;
    brotli_comp_level 6;
    brotli_min_length 1000;
}
# 480KB --> ~96KB with Brotli (80% reduction)
# Transfer time: 38ms --> 7.6ms (5x faster transfer)
```

### Example 9: Circuit Breaker with Timeout

**Before (no timeout, no circuit breaker):**

```python
import requests

def get_payment_status(payment_id):
    # requests applies no timeout by default, so if the payment service
    # stalls, this call can hang indefinitely
    response = requests.get(f"http://payment-service/api/payments/{payment_id}")
    return response.json()
```

**After (timeout + circuit breaker):**

```python
import requests
from circuitbreaker import circuit

@circuit(
    failure_threshold=5,     # open after 5 failures
    recovery_timeout=30,     # try again after 30s
    expected_exception=Exception,
)
def get_payment_status(payment_id):
    response = requests.get(
        f"http://payment-service/api/payments/{payment_id}",
        timeout=(1, 5),  # (connect_timeout=1s, read_timeout=5s)
    )
    response.raise_for_status()
    return response.json()

# Usage with fallback
def get_payment_status_safe(payment_id):
    try:
        return get_payment_status(payment_id)
    except Exception:
        # Return a degraded response (or a cached value, if one is available)
        return {"id": payment_id, "status": "unknown", "degraded": True}
```

---

## 11. Quick Reference

### Latency Numbers Every Developer Should Know

| Operation                                    | Latency        |
|----------------------------------------------|----------------|
| L1 cache reference                           | 0.5 ns         |
| L2 cache reference                           | 7 ns           |
| Main memory reference                        | 100 ns         |
| SSD random read                              | 16 us          |
| HDD random read                              | 2-10 ms        |
| Redis GET (local)                            | 0.1-0.5 ms     |
| Redis GET (cross-AZ)                         | 1-3 ms         |
| PostgreSQL simple query (indexed)            | 1-5 ms         |
| PostgreSQL complex query (no index)          | 50-5000 ms     |
| HTTP request (same datacenter)               | 0.5-5 ms       |
| HTTP request (same region, different AZ)     | 1-3 ms         |
| HTTP request (cross-region)                  | 50-150 ms      |
| DNS lookup (uncached)                        | 10-200 ms      |
| TLS handshake (full, TLS 1.2)                | 40-200 ms      |
| TLS handshake (resumed, TLS 1.3)             | 0-10 ms        |
| TCP connection setup                         | 1-3 ms (LAN)   |
| JSON serialize 1KB object                    | 0.01-0.1 ms    |
| JSON serialize 1MB object                    | 5-50 ms        |
| Protobuf serialize 1KB object                | 0.002-0.02 ms  |
| Gzip compress 100KB JSON                     | 1-5 ms         |
| Brotli compress 100KB JSON                   | 5-20 ms        |
| AWS Lambda cold start (Node.js)              | 100-300 ms     |
| AWS Lambda cold start (Java)                 | 3000-6000 ms   |
| AWS Lambda cold start (Rust/arm64)           | 16 ms          |

### Optimization Impact Summary

| Optimization                           | Typical Improvement      | Effort  |
|----------------------------------------|--------------------------|---------|
| Add missing database index             | 10-600x query speedup    | Low     |
| Eliminate N+1 queries                  | 10-50x on list APIs      | Low     |
| Add Redis caching                      | 10x (300ms to 30ms)      | Medium  |
| Connection pooling                     | 3-7x throughput          | Low     |
| Enable response compression            | 5x transfer speed        | Low     |
| Parallelize independent calls          | 2-3x for multi-call APIs | Medium  |
| Async non-critical work                | 2-5x perceived latency   | Medium  |
| Switch JSON to Protobuf (internal)     | 5-7x serialization       | High    |
| HTTP/2 multiplexing                    | 1.5-3x for multi-request | Low     |
| Circuit breakers + timeouts            | Prevents 30s+ hangs      | Low     |
| Pagination (100K to 20 rows)           | 50-100x response size    | Low     |
| Sparse field selection                 | 5-10x payload reduction  | Medium  |
| GC tuning (ZGC/Shenandoah)             | 10-1000x pause reduction | Medium  |
| Edge caching / CDN                     | 60-70% latency reduction | Medium  |

### Compression Quick Reference

| Format   | Compression Ratio | Speed     | Browser Support | Best For                    |
|----------|-------------------|-----------|-----------------|-----------------------------|
| gzip     | 60-70%            | Fast      | 99%+            | Dynamic API responses       |
| Brotli   | 65-80%            | Moderate  | 96%             | Static + API (pre-compress) |
| zstd     | 65-75%            | Very fast | Growing         | Internal services           |
| Protobuf | 70-90% vs JSON    | Very fast | N/A (binary)    | Service-to-service          |

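The ratios above are typical figures; actual savings depend heavily on the payload. A quick way to check your own responses is to measure them, as in this sketch using only the standard library. The `gzip_savings` helper and the sample rows are hypothetical, and highly repetitive data like this sample compresses better than most real payloads.

```python
import gzip
import json


def gzip_savings(payload, level=6):
    """Return (raw_bytes, compressed_bytes, fraction_saved) for a JSON-serializable payload."""
    raw = json.dumps(payload).encode("utf-8")
    compressed = gzip.compress(raw, compresslevel=level)
    return len(raw), len(compressed), 1 - len(compressed) / len(raw)


# Hypothetical report rows; real payloads will compress differently.
rows = [{"id": i, "status": "shipped", "total_cents": 1000 + i} for i in range(2000)]
raw_size, gz_size, saved = gzip_savings(rows)
print(f"{raw_size} B -> {gz_size} B ({saved:.0%} smaller)")
```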
### Serialization Quick Reference

| Format      | Serialize Speed | Deserialize Speed | Wire Size | Human Readable | Schema Required |
|-------------|-----------------|-------------------|-----------|----------------|-----------------|
| JSON        | 1x (baseline)   | 1x (baseline)     | 1x        | Yes            | No              |
| Protobuf    | 5-7x faster     | 5-7x faster       | 0.1-0.3x  | No             | Yes (.proto)    |
| FlatBuffers | 10x+ faster     | Zero-copy         | 0.2-0.4x  | No             | Yes (.fbs)      |
| MessagePack | 2-3x faster     | 2-3x faster       | 0.5-0.7x  | No             | No              |
| Avro        | 3-5x faster     | 3-5x faster       | 0.2-0.4x  | No             | Yes (.avsc)     |

---

## Sources
|
|
1171
|
+
|
|
1172
|
+
- [Amazon: Every 100ms of latency costs 1% in sales](https://www.gigaspaces.com/blog/amazon-found-every-100ms-of-latency-cost-them-1-in-sales/)
|
|
1173
|
+
- [P50 vs P95 vs P99 Latency Explained](https://oneuptime.com/blog/post/2025-09-15-p50-vs-p95-vs-p99-latency-percentiles/view)
|
|
1174
|
+
- [Mastering Latency Metrics: P90, P95, P99](https://medium.com/javarevisited/mastering-latency-metrics-p90-p95-p99-d5427faea879)
|
|
1175
|
+
- [API Response Time Standards](https://odown.com/blog/api-response-time-standards/)
|
|
1176
|
+
- [Tail Latency: Key in Large-Scale Distributed Systems](https://last9.io/blog/tail-latency/)
|
|
1177
|
+
- [What Is P99 Latency?](https://aerospike.com/blog/what-is-p99-latency/)
|
|
1178
|
+
- [How to Reduce Long-Tail Latency in Microservices](https://sysctl.id/reduce-long-tail-latency-microservices/)
|
|
1179
|
+
- [Global Payments: Request Hedging with DynamoDB](https://aws.amazon.com/blogs/database/how-global-payments-inc-improved-their-tail-latency-using-request-hedging-with-amazon-dynamodb/)
|
|
1180
|
+
- [Fanouts and Percentiles](https://paulcavallaro.com/blog/fanouts-and-percentiles/)
|
|
1181
|
+
- [Why Tail Latencies Matter](https://www.gomomento.com/blog/why-tail-latencies-matter/)
|
|
1182
|
+
- [How to Minimize Latency and Cost in Distributed Systems](https://www.infoq.com/articles/minimize-latency-cost-distributed-systems/)
|
|
1183
|
+
- [Monolith vs Microservices 2025](https://medium.com/@pawel.piwosz/monolith-vs-microservices-2025-real-cloud-migration-costs-and-hidden-challenges-8b453a3c71ec)
|
|
1184
|
+
- [Improve Database Performance with Connection Pooling](https://stackoverflow.blog/2020/10/14/improve-database-performance-with-connection-pooling/)
|
|
1185
|
+
- [PostgreSQL Performance with PgBouncer](https://opstree.com/blog/2025/10/07/postgresql-performance-with-pgbouncer/)
|
|
1186
|
+
- [Azure PostgreSQL Connection Pooling Best Practices](https://azure.microsoft.com/en-us/blog/performance-best-practices-for-using-azure-database-for-postgresql-connection-pooling/)
|
|
1187
|
+
- [N+1 Query Problem Explained](https://planetscale.com/blog/what-is-n-1-query-problem-and-how-to-solve-it)
- [Solving the N+1 Query Problem](https://dev.to/vasughanta09/solving-the-n1-query-problem-a-developers-guide-to-database-performance-321c)
- [Benchmarking JSON vs Protobuf vs FlatBuffers](https://medium.com/@harshiljani2002/benchmarking-data-serialization-json-vs-protobuf-vs-flatbuffers-3218eecdba77)
- [Beating JSON Performance with Protobuf](https://auth0.com/blog/beating-json-performance-with-protobuf/)
- [Protobuf vs JSON: Performance and Efficiency](https://www.gravitee.io/blog/protobuf-vs-json)
- [HTTP/2 vs HTTP/1.1 Performance](https://www.cloudflare.com/learning/performance/http2-vs-http1.1/)
- [Brotli vs Gzip Compression](https://www.debugbear.com/blog/http-compression-gzip-brotli)
- [REST API Compression with Gzip and Brotli](https://zuplo.com/learning-center/implementing-data-compression-in-rest-apis-with-gzip-and-brotli)
- [AWS Lambda Cold Start Optimization 2025](https://zircon.tech/blog/aws-lambda-cold-start-optimization-in-2025-what-actually-works/)
- [Lambda Cold Starts Benchmark](https://maxday.github.io/lambda-perf/)
- [Redis Edge Caching: 5ms Global Latency](https://upstash.com/blog/edge-caching-benchmark)
- [ElastiCache Redis 7.1: 500M+ RPS](https://aws.amazon.com/blogs/database/achieve-over-500-million-requests-per-second-per-cluster-with-amazon-elasticache-for-redis-7-1/)
- [Cache Optimization Strategies](https://redis.io/blog/guide-to-cache-optimization-strategies/)
- [GC Optimization for High-Throughput Java](https://engineering.linkedin.com/garbage-collection/garbage-collection-optimization-high-throughput-and-low-latency-java-applications)
- [Taming Go's Garbage Collector](https://dev.to/jones_charles_ad50858dbc0/taming-gos-garbage-collector-for-blazing-fast-low-latency-apps-24an)
- [GC Impact on Application Performance](https://www.azul.com/blog/garbage-collection-application-performance-impact/)
- [API Latency and P99 Tuning Lab](https://github.com/fourcoretech/api-latency-and-p99-tuning-lab)
- [How to Increase API Performance](https://zuplo.com/blog/2025/01/30/increase-api-performance)