@wazir-dev/cli 1.0.0
This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +111 -0
- package/CHANGELOG.md +14 -0
- package/CONTRIBUTING.md +101 -0
- package/LICENSE +21 -0
- package/README.md +314 -0
- package/assets/composition-engine.mmd +34 -0
- package/assets/demo-script.sh +17 -0
- package/assets/logo-dark.svg +14 -0
- package/assets/logo.svg +14 -0
- package/assets/pipeline.mmd +39 -0
- package/assets/record-demo.sh +51 -0
- package/docs/README.md +51 -0
- package/docs/adapters/context-mode.md +60 -0
- package/docs/concepts/architecture.md +87 -0
- package/docs/concepts/artifact-model.md +60 -0
- package/docs/concepts/composition-engine.md +36 -0
- package/docs/concepts/indexing-and-recall.md +160 -0
- package/docs/concepts/observability.md +41 -0
- package/docs/concepts/roles-and-workflows.md +59 -0
- package/docs/concepts/terminology-policy.md +27 -0
- package/docs/getting-started/01-installation.md +78 -0
- package/docs/getting-started/02-first-run.md +102 -0
- package/docs/getting-started/03-adding-to-project.md +15 -0
- package/docs/getting-started/04-host-setup.md +15 -0
- package/docs/guides/ci-integration.md +15 -0
- package/docs/guides/creating-skills.md +15 -0
- package/docs/guides/expertise-module-authoring.md +15 -0
- package/docs/guides/hook-development.md +15 -0
- package/docs/guides/memory-and-learnings.md +34 -0
- package/docs/guides/multi-host-export.md +15 -0
- package/docs/guides/troubleshooting.md +101 -0
- package/docs/guides/writing-custom-roles.md +15 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
- package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
- package/docs/readmes/INDEX.md +99 -0
- package/docs/readmes/features/expertise/README.md +171 -0
- package/docs/readmes/features/exports/README.md +222 -0
- package/docs/readmes/features/hooks/README.md +103 -0
- package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
- package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
- package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
- package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
- package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
- package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
- package/docs/readmes/features/hooks/session-start.md +119 -0
- package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
- package/docs/readmes/features/roles/README.md +157 -0
- package/docs/readmes/features/roles/clarifier.md +152 -0
- package/docs/readmes/features/roles/content-author.md +190 -0
- package/docs/readmes/features/roles/designer.md +193 -0
- package/docs/readmes/features/roles/executor.md +184 -0
- package/docs/readmes/features/roles/learner.md +210 -0
- package/docs/readmes/features/roles/planner.md +182 -0
- package/docs/readmes/features/roles/researcher.md +164 -0
- package/docs/readmes/features/roles/reviewer.md +184 -0
- package/docs/readmes/features/roles/specifier.md +162 -0
- package/docs/readmes/features/roles/verifier.md +215 -0
- package/docs/readmes/features/schemas/README.md +178 -0
- package/docs/readmes/features/skills/README.md +63 -0
- package/docs/readmes/features/skills/brainstorming.md +96 -0
- package/docs/readmes/features/skills/debugging.md +148 -0
- package/docs/readmes/features/skills/design.md +120 -0
- package/docs/readmes/features/skills/prepare-next.md +109 -0
- package/docs/readmes/features/skills/run-audit.md +159 -0
- package/docs/readmes/features/skills/scan-project.md +109 -0
- package/docs/readmes/features/skills/self-audit.md +176 -0
- package/docs/readmes/features/skills/tdd.md +137 -0
- package/docs/readmes/features/skills/using-skills.md +92 -0
- package/docs/readmes/features/skills/verification.md +120 -0
- package/docs/readmes/features/skills/writing-plans.md +104 -0
- package/docs/readmes/features/tooling/README.md +320 -0
- package/docs/readmes/features/workflows/README.md +186 -0
- package/docs/readmes/features/workflows/author.md +181 -0
- package/docs/readmes/features/workflows/clarify.md +154 -0
- package/docs/readmes/features/workflows/design-review.md +171 -0
- package/docs/readmes/features/workflows/design.md +169 -0
- package/docs/readmes/features/workflows/discover.md +162 -0
- package/docs/readmes/features/workflows/execute.md +173 -0
- package/docs/readmes/features/workflows/learn.md +167 -0
- package/docs/readmes/features/workflows/plan-review.md +165 -0
- package/docs/readmes/features/workflows/plan.md +170 -0
- package/docs/readmes/features/workflows/prepare-next.md +167 -0
- package/docs/readmes/features/workflows/review.md +169 -0
- package/docs/readmes/features/workflows/run-audit.md +191 -0
- package/docs/readmes/features/workflows/spec-challenge.md +159 -0
- package/docs/readmes/features/workflows/specify.md +160 -0
- package/docs/readmes/features/workflows/verify.md +177 -0
- package/docs/readmes/packages/README.md +50 -0
- package/docs/readmes/packages/ajv.md +117 -0
- package/docs/readmes/packages/context-mode.md +118 -0
- package/docs/readmes/packages/gray-matter.md +116 -0
- package/docs/readmes/packages/node-test.md +137 -0
- package/docs/readmes/packages/yaml.md +112 -0
- package/docs/reference/configuration-reference.md +159 -0
- package/docs/reference/expertise-index.md +52 -0
- package/docs/reference/git-flow.md +43 -0
- package/docs/reference/hooks.md +87 -0
- package/docs/reference/host-exports.md +50 -0
- package/docs/reference/launch-checklist.md +172 -0
- package/docs/reference/marketplace-listings.md +76 -0
- package/docs/reference/release-process.md +34 -0
- package/docs/reference/roles-reference.md +77 -0
- package/docs/reference/skills.md +33 -0
- package/docs/reference/templates.md +29 -0
- package/docs/reference/tooling-cli.md +94 -0
- package/docs/truth-claims.yaml +222 -0
- package/expertise/PROGRESS.md +63 -0
- package/expertise/README.md +18 -0
- package/expertise/antipatterns/PROGRESS.md +56 -0
- package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
- package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
- package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
- package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
- package/expertise/antipatterns/backend/index.md +24 -0
- package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
- package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
- package/expertise/antipatterns/code/async-antipatterns.md +622 -0
- package/expertise/antipatterns/code/code-smells.md +1186 -0
- package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
- package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
- package/expertise/antipatterns/code/index.md +27 -0
- package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
- package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
- package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
- package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
- package/expertise/antipatterns/design/dark-patterns.md +1121 -0
- package/expertise/antipatterns/design/index.md +22 -0
- package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
- package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
- package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
- package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
- package/expertise/antipatterns/frontend/index.md +23 -0
- package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
- package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
- package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
- package/expertise/antipatterns/index.md +31 -0
- package/expertise/antipatterns/performance/index.md +20 -0
- package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
- package/expertise/antipatterns/performance/premature-optimization.md +623 -0
- package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
- package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
- package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
- package/expertise/antipatterns/process/index.md +23 -0
- package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
- package/expertise/antipatterns/security/index.md +20 -0
- package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
- package/expertise/antipatterns/security/security-theater.md +843 -0
- package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
- package/expertise/architecture/PROGRESS.md +70 -0
- package/expertise/architecture/data/caching-architecture.md +671 -0
- package/expertise/architecture/data/data-consistency.md +574 -0
- package/expertise/architecture/data/data-modeling.md +536 -0
- package/expertise/architecture/data/event-streams-and-queues.md +634 -0
- package/expertise/architecture/data/index.md +25 -0
- package/expertise/architecture/data/search-architecture.md +663 -0
- package/expertise/architecture/data/sql-vs-nosql.md +708 -0
- package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
- package/expertise/architecture/decisions/build-vs-buy.md +616 -0
- package/expertise/architecture/decisions/index.md +23 -0
- package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
- package/expertise/architecture/decisions/technology-selection.md +616 -0
- package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
- package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
- package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
- package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
- package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
- package/expertise/architecture/distributed/index.md +25 -0
- package/expertise/architecture/distributed/saga-pattern.md +797 -0
- package/expertise/architecture/foundations/architectural-thinking.md +460 -0
- package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
- package/expertise/architecture/foundations/design-principles-solid.md +649 -0
- package/expertise/architecture/foundations/domain-driven-design.md +719 -0
- package/expertise/architecture/foundations/index.md +25 -0
- package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
- package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
- package/expertise/architecture/index.md +34 -0
- package/expertise/architecture/integration/api-design-graphql.md +638 -0
- package/expertise/architecture/integration/api-design-grpc.md +804 -0
- package/expertise/architecture/integration/api-design-rest.md +892 -0
- package/expertise/architecture/integration/index.md +25 -0
- package/expertise/architecture/integration/third-party-integration.md +795 -0
- package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
- package/expertise/architecture/integration/websockets-realtime.md +791 -0
- package/expertise/architecture/mobile-architecture/index.md +22 -0
- package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
- package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
- package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
- package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
- package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
- package/expertise/architecture/patterns/event-driven.md +797 -0
- package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
- package/expertise/architecture/patterns/index.md +27 -0
- package/expertise/architecture/patterns/layered-architecture.md +736 -0
- package/expertise/architecture/patterns/microservices.md +753 -0
- package/expertise/architecture/patterns/modular-monolith.md +692 -0
- package/expertise/architecture/patterns/monolith.md +626 -0
- package/expertise/architecture/patterns/plugin-architecture.md +735 -0
- package/expertise/architecture/patterns/serverless.md +780 -0
- package/expertise/architecture/scaling/database-scaling.md +615 -0
- package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
- package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
- package/expertise/architecture/scaling/index.md +24 -0
- package/expertise/architecture/scaling/multi-tenancy.md +800 -0
- package/expertise/architecture/scaling/stateless-design.md +787 -0
- package/expertise/backend/embedded-firmware.md +625 -0
- package/expertise/backend/go.md +853 -0
- package/expertise/backend/index.md +24 -0
- package/expertise/backend/java-spring.md +448 -0
- package/expertise/backend/node-typescript.md +625 -0
- package/expertise/backend/python-fastapi.md +724 -0
- package/expertise/backend/rust.md +458 -0
- package/expertise/backend/solidity.md +711 -0
- package/expertise/composition-map.yaml +443 -0
- package/expertise/content/foundations/content-modeling.md +395 -0
- package/expertise/content/foundations/editorial-standards.md +449 -0
- package/expertise/content/foundations/index.md +24 -0
- package/expertise/content/foundations/microcopy.md +455 -0
- package/expertise/content/foundations/terminology-governance.md +509 -0
- package/expertise/content/index.md +34 -0
- package/expertise/content/patterns/accessibility-copy.md +518 -0
- package/expertise/content/patterns/index.md +24 -0
- package/expertise/content/patterns/notification-content.md +433 -0
- package/expertise/content/patterns/sample-content.md +486 -0
- package/expertise/content/patterns/state-copy.md +439 -0
- package/expertise/design/PROGRESS.md +58 -0
- package/expertise/design/disciplines/dark-mode-theming.md +577 -0
- package/expertise/design/disciplines/design-systems.md +595 -0
- package/expertise/design/disciplines/index.md +25 -0
- package/expertise/design/disciplines/information-architecture.md +800 -0
- package/expertise/design/disciplines/interaction-design.md +788 -0
- package/expertise/design/disciplines/responsive-design.md +552 -0
- package/expertise/design/disciplines/usability-testing.md +516 -0
- package/expertise/design/disciplines/user-research.md +792 -0
- package/expertise/design/foundations/accessibility-design.md +796 -0
- package/expertise/design/foundations/color-theory.md +797 -0
- package/expertise/design/foundations/iconography.md +795 -0
- package/expertise/design/foundations/index.md +26 -0
- package/expertise/design/foundations/motion-and-animation.md +653 -0
- package/expertise/design/foundations/rtl-design.md +585 -0
- package/expertise/design/foundations/spacing-and-layout.md +607 -0
- package/expertise/design/foundations/typography.md +800 -0
- package/expertise/design/foundations/visual-hierarchy.md +761 -0
- package/expertise/design/index.md +32 -0
- package/expertise/design/patterns/authentication-flows.md +474 -0
- package/expertise/design/patterns/content-consumption.md +789 -0
- package/expertise/design/patterns/data-display.md +618 -0
- package/expertise/design/patterns/e-commerce.md +1494 -0
- package/expertise/design/patterns/feedback-and-states.md +642 -0
- package/expertise/design/patterns/forms-and-input.md +819 -0
- package/expertise/design/patterns/gamification.md +801 -0
- package/expertise/design/patterns/index.md +31 -0
- package/expertise/design/patterns/microinteractions.md +449 -0
- package/expertise/design/patterns/navigation.md +800 -0
- package/expertise/design/patterns/notifications.md +705 -0
- package/expertise/design/patterns/onboarding.md +700 -0
- package/expertise/design/patterns/search-and-filter.md +601 -0
- package/expertise/design/patterns/settings-and-preferences.md +768 -0
- package/expertise/design/patterns/social-and-community.md +748 -0
- package/expertise/design/platforms/desktop-native.md +612 -0
- package/expertise/design/platforms/index.md +25 -0
- package/expertise/design/platforms/mobile-android.md +825 -0
- package/expertise/design/platforms/mobile-cross-platform.md +983 -0
- package/expertise/design/platforms/mobile-ios.md +699 -0
- package/expertise/design/platforms/tablet.md +794 -0
- package/expertise/design/platforms/web-dashboard.md +790 -0
- package/expertise/design/platforms/web-responsive.md +550 -0
- package/expertise/design/psychology/behavioral-nudges.md +449 -0
- package/expertise/design/psychology/cognitive-load.md +1191 -0
- package/expertise/design/psychology/error-psychology.md +778 -0
- package/expertise/design/psychology/index.md +22 -0
- package/expertise/design/psychology/persuasive-design.md +736 -0
- package/expertise/design/psychology/user-mental-models.md +623 -0
- package/expertise/design/tooling/open-pencil.md +266 -0
- package/expertise/frontend/angular.md +1073 -0
- package/expertise/frontend/desktop-electron.md +546 -0
- package/expertise/frontend/flutter.md +782 -0
- package/expertise/frontend/index.md +27 -0
- package/expertise/frontend/native-android.md +409 -0
- package/expertise/frontend/native-ios.md +490 -0
- package/expertise/frontend/react-native.md +1160 -0
- package/expertise/frontend/react.md +808 -0
- package/expertise/frontend/vue.md +1089 -0
- package/expertise/humanize/domain-rules-code.md +79 -0
- package/expertise/humanize/domain-rules-content.md +67 -0
- package/expertise/humanize/domain-rules-technical-docs.md +56 -0
- package/expertise/humanize/index.md +35 -0
- package/expertise/humanize/self-audit-checklist.md +87 -0
- package/expertise/humanize/sentence-patterns.md +218 -0
- package/expertise/humanize/vocabulary-blacklist.md +105 -0
- package/expertise/i18n/PROGRESS.md +65 -0
- package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
- package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
- package/expertise/i18n/advanced/complex-scripts.md +30 -0
- package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
- package/expertise/i18n/advanced/testing-i18n.md +28 -0
- package/expertise/i18n/content/content-adaptation.md +23 -0
- package/expertise/i18n/content/locale-specific-formatting.md +23 -0
- package/expertise/i18n/content/machine-translation-integration.md +28 -0
- package/expertise/i18n/content/translation-management.md +29 -0
- package/expertise/i18n/foundations/date-time-calendars.md +67 -0
- package/expertise/i18n/foundations/i18n-architecture.md +272 -0
- package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
- package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
- package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
- package/expertise/i18n/foundations/string-externalization.md +236 -0
- package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
- package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
- package/expertise/i18n/index.md +38 -0
- package/expertise/i18n/platform/backend-i18n.md +31 -0
- package/expertise/i18n/platform/flutter-i18n.md +148 -0
- package/expertise/i18n/platform/native-android-i18n.md +36 -0
- package/expertise/i18n/platform/native-ios-i18n.md +36 -0
- package/expertise/i18n/platform/react-i18n.md +103 -0
- package/expertise/i18n/platform/web-css-i18n.md +81 -0
- package/expertise/i18n/rtl/arabic-specific.md +175 -0
- package/expertise/i18n/rtl/hebrew-specific.md +149 -0
- package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
- package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
- package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
- package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
- package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
- package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
- package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
- package/expertise/i18n/rtl/rtl-typography.md +160 -0
- package/expertise/index.md +113 -0
- package/expertise/index.yaml +216 -0
- package/expertise/infrastructure/cloud-aws.md +597 -0
- package/expertise/infrastructure/cloud-gcp.md +599 -0
- package/expertise/infrastructure/cybersecurity.md +816 -0
- package/expertise/infrastructure/database-mongodb.md +447 -0
- package/expertise/infrastructure/database-postgres.md +400 -0
- package/expertise/infrastructure/devops-cicd.md +787 -0
- package/expertise/infrastructure/index.md +27 -0
- package/expertise/performance/PROGRESS.md +50 -0
- package/expertise/performance/backend/api-latency.md +1204 -0
- package/expertise/performance/backend/background-jobs.md +506 -0
- package/expertise/performance/backend/connection-pooling.md +1209 -0
- package/expertise/performance/backend/database-query-optimization.md +515 -0
- package/expertise/performance/backend/index.md +23 -0
- package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
- package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
- package/expertise/performance/foundations/caching-strategies.md +489 -0
- package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
- package/expertise/performance/foundations/index.md +24 -0
- package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
- package/expertise/performance/foundations/memory-management.md +964 -0
- package/expertise/performance/foundations/performance-budgets.md +1314 -0
- package/expertise/performance/index.md +31 -0
- package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
- package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
- package/expertise/performance/infrastructure/index.md +22 -0
- package/expertise/performance/infrastructure/load-balancing.md +1081 -0
- package/expertise/performance/infrastructure/observability.md +1079 -0
- package/expertise/performance/mobile/index.md +23 -0
- package/expertise/performance/mobile/mobile-animations.md +544 -0
- package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
- package/expertise/performance/mobile/mobile-network.md +452 -0
- package/expertise/performance/mobile/mobile-rendering.md +599 -0
- package/expertise/performance/mobile/mobile-startup-time.md +505 -0
- package/expertise/performance/platform-specific/flutter-performance.md +647 -0
- package/expertise/performance/platform-specific/index.md +22 -0
- package/expertise/performance/platform-specific/node-performance.md +1307 -0
- package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
- package/expertise/performance/platform-specific/react-performance.md +1403 -0
- package/expertise/performance/web/bundle-optimization.md +1239 -0
- package/expertise/performance/web/image-and-media.md +636 -0
- package/expertise/performance/web/index.md +24 -0
- package/expertise/performance/web/network-optimization.md +1133 -0
- package/expertise/performance/web/rendering-performance.md +1098 -0
- package/expertise/performance/web/ssr-and-hydration.md +918 -0
- package/expertise/performance/web/web-vitals.md +1374 -0
- package/expertise/quality/accessibility.md +985 -0
- package/expertise/quality/evidence-based-verification.md +499 -0
- package/expertise/quality/index.md +24 -0
- package/expertise/quality/ml-model-audit.md +614 -0
- package/expertise/quality/performance.md +600 -0
- package/expertise/quality/testing-api.md +891 -0
- package/expertise/quality/testing-mobile.md +496 -0
- package/expertise/quality/testing-web.md +849 -0
- package/expertise/security/PROGRESS.md +54 -0
- package/expertise/security/agentic-identity.md +540 -0
- package/expertise/security/compliance-frameworks.md +601 -0
- package/expertise/security/data/data-encryption.md +364 -0
- package/expertise/security/data/data-privacy-gdpr.md +692 -0
- package/expertise/security/data/database-security.md +1171 -0
- package/expertise/security/data/index.md +22 -0
- package/expertise/security/data/pii-handling.md +531 -0
- package/expertise/security/foundations/authentication.md +1041 -0
- package/expertise/security/foundations/authorization.md +603 -0
- package/expertise/security/foundations/cryptography.md +1001 -0
- package/expertise/security/foundations/index.md +25 -0
- package/expertise/security/foundations/owasp-top-10.md +1354 -0
- package/expertise/security/foundations/secrets-management.md +1217 -0
- package/expertise/security/foundations/secure-sdlc.md +700 -0
- package/expertise/security/foundations/supply-chain-security.md +698 -0
- package/expertise/security/index.md +31 -0
- package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
- package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
- package/expertise/security/infrastructure/container-security.md +721 -0
- package/expertise/security/infrastructure/incident-response.md +1295 -0
- package/expertise/security/infrastructure/index.md +24 -0
- package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
- package/expertise/security/infrastructure/network-security.md +1337 -0
- package/expertise/security/mobile/index.md +23 -0
- package/expertise/security/mobile/mobile-android-security.md +1218 -0
- package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
- package/expertise/security/mobile/mobile-data-storage.md +1265 -0
- package/expertise/security/mobile/mobile-ios-security.md +1401 -0
- package/expertise/security/mobile/mobile-network-security.md +1520 -0
- package/expertise/security/smart-contract-security.md +594 -0
- package/expertise/security/testing/index.md +22 -0
- package/expertise/security/testing/penetration-testing.md +1258 -0
- package/expertise/security/testing/security-code-review.md +1765 -0
- package/expertise/security/testing/threat-modeling.md +1074 -0
- package/expertise/security/testing/vulnerability-scanning.md +1062 -0
- package/expertise/security/web/api-security.md +586 -0
- package/expertise/security/web/cors-and-headers.md +433 -0
- package/expertise/security/web/csrf.md +562 -0
- package/expertise/security/web/file-upload.md +1477 -0
- package/expertise/security/web/index.md +25 -0
- package/expertise/security/web/injection.md +1375 -0
- package/expertise/security/web/session-management.md +1101 -0
- package/expertise/security/web/xss.md +1158 -0
- package/exports/README.md +17 -0
- package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
- package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
- package/exports/hosts/claude/.claude/agents/designer.md +55 -0
- package/exports/hosts/claude/.claude/agents/executor.md +55 -0
- package/exports/hosts/claude/.claude/agents/learner.md +51 -0
- package/exports/hosts/claude/.claude/agents/planner.md +53 -0
- package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
- package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
- package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
- package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
- package/exports/hosts/claude/.claude/commands/author.md +42 -0
- package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
- package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
- package/exports/hosts/claude/.claude/commands/design.md +44 -0
- package/exports/hosts/claude/.claude/commands/discover.md +37 -0
- package/exports/hosts/claude/.claude/commands/execute.md +48 -0
- package/exports/hosts/claude/.claude/commands/learn.md +38 -0
- package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
- package/exports/hosts/claude/.claude/commands/plan.md +39 -0
- package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
- package/exports/hosts/claude/.claude/commands/review.md +40 -0
- package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
- package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
- package/exports/hosts/claude/.claude/commands/specify.md +38 -0
- package/exports/hosts/claude/.claude/commands/verify.md +37 -0
- package/exports/hosts/claude/.claude/settings.json +34 -0
- package/exports/hosts/claude/CLAUDE.md +19 -0
- package/exports/hosts/claude/export.manifest.json +38 -0
- package/exports/hosts/claude/host-package.json +67 -0
- package/exports/hosts/codex/AGENTS.md +19 -0
- package/exports/hosts/codex/export.manifest.json +38 -0
- package/exports/hosts/codex/host-package.json +41 -0
- package/exports/hosts/cursor/.cursor/hooks.json +16 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
- package/exports/hosts/cursor/export.manifest.json +38 -0
- package/exports/hosts/cursor/host-package.json +42 -0
- package/exports/hosts/gemini/GEMINI.md +19 -0
- package/exports/hosts/gemini/export.manifest.json +38 -0
- package/exports/hosts/gemini/host-package.json +41 -0
- package/hooks/README.md +18 -0
- package/hooks/definitions/loop_cap_guard.yaml +21 -0
- package/hooks/definitions/post_tool_capture.yaml +24 -0
- package/hooks/definitions/pre_compact_summary.yaml +19 -0
- package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
- package/hooks/definitions/protected_path_write_guard.yaml +19 -0
- package/hooks/definitions/session_start.yaml +19 -0
- package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
- package/hooks/loop-cap-guard +17 -0
- package/hooks/post-tool-lint +36 -0
- package/hooks/protected-path-write-guard +17 -0
- package/hooks/session-start +41 -0
- package/llms-full.txt +2355 -0
- package/llms.txt +43 -0
- package/package.json +79 -0
- package/roles/README.md +20 -0
- package/roles/clarifier.md +42 -0
- package/roles/content-author.md +63 -0
- package/roles/designer.md +55 -0
- package/roles/executor.md +55 -0
- package/roles/learner.md +51 -0
- package/roles/planner.md +53 -0
- package/roles/researcher.md +43 -0
- package/roles/reviewer.md +54 -0
- package/roles/specifier.md +47 -0
- package/roles/verifier.md +71 -0
- package/schemas/README.md +24 -0
- package/schemas/accepted-learning.schema.json +20 -0
- package/schemas/author-artifact.schema.json +156 -0
- package/schemas/clarification.schema.json +19 -0
- package/schemas/design-artifact.schema.json +80 -0
- package/schemas/docs-claim.schema.json +18 -0
- package/schemas/export-manifest.schema.json +20 -0
- package/schemas/hook.schema.json +67 -0
- package/schemas/host-export-package.schema.json +18 -0
- package/schemas/implementation-plan.schema.json +19 -0
- package/schemas/proposed-learning.schema.json +19 -0
- package/schemas/research.schema.json +18 -0
- package/schemas/review.schema.json +29 -0
- package/schemas/run-manifest.schema.json +18 -0
- package/schemas/spec-challenge.schema.json +18 -0
- package/schemas/spec.schema.json +20 -0
- package/schemas/usage.schema.json +102 -0
- package/schemas/verification-proof.schema.json +29 -0
- package/schemas/wazir-manifest.schema.json +173 -0
- package/skills/README.md +40 -0
- package/skills/brainstorming/SKILL.md +77 -0
- package/skills/debugging/SKILL.md +50 -0
- package/skills/design/SKILL.md +61 -0
- package/skills/dispatching-parallel-agents/SKILL.md +128 -0
- package/skills/executing-plans/SKILL.md +70 -0
- package/skills/finishing-a-development-branch/SKILL.md +169 -0
- package/skills/humanize/SKILL.md +123 -0
- package/skills/init-pipeline/SKILL.md +124 -0
- package/skills/prepare-next/SKILL.md +20 -0
- package/skills/receiving-code-review/SKILL.md +123 -0
- package/skills/requesting-code-review/SKILL.md +105 -0
- package/skills/requesting-code-review/code-reviewer.md +108 -0
- package/skills/run-audit/SKILL.md +197 -0
- package/skills/scan-project/SKILL.md +41 -0
- package/skills/self-audit/SKILL.md +153 -0
- package/skills/subagent-driven-development/SKILL.md +154 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
- package/skills/subagent-driven-development/implementer-prompt.md +102 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/skills/tdd/SKILL.md +23 -0
- package/skills/using-git-worktrees/SKILL.md +163 -0
- package/skills/using-skills/SKILL.md +95 -0
- package/skills/verification/SKILL.md +22 -0
- package/skills/wazir/SKILL.md +463 -0
- package/skills/writing-plans/SKILL.md +30 -0
- package/skills/writing-skills/SKILL.md +157 -0
- package/skills/writing-skills/anthropic-best-practices.md +122 -0
- package/skills/writing-skills/persuasion-principles.md +50 -0
- package/templates/README.md +20 -0
- package/templates/artifacts/README.md +10 -0
- package/templates/artifacts/accepted-learning.md +19 -0
- package/templates/artifacts/accepted-learning.template.json +12 -0
- package/templates/artifacts/author.md +74 -0
- package/templates/artifacts/author.template.json +19 -0
- package/templates/artifacts/clarification.md +21 -0
- package/templates/artifacts/clarification.template.json +12 -0
- package/templates/artifacts/execute-notes.md +19 -0
- package/templates/artifacts/implementation-plan.md +21 -0
- package/templates/artifacts/implementation-plan.template.json +11 -0
- package/templates/artifacts/learning-proposal.md +19 -0
- package/templates/artifacts/next-run-handoff.md +21 -0
- package/templates/artifacts/plan-review.md +19 -0
- package/templates/artifacts/proposed-learning.template.json +12 -0
- package/templates/artifacts/research.md +21 -0
- package/templates/artifacts/research.template.json +12 -0
- package/templates/artifacts/review-findings.md +19 -0
- package/templates/artifacts/review.template.json +11 -0
- package/templates/artifacts/run-manifest.template.json +8 -0
- package/templates/artifacts/spec-challenge.md +19 -0
- package/templates/artifacts/spec-challenge.template.json +11 -0
- package/templates/artifacts/spec.md +21 -0
- package/templates/artifacts/spec.template.json +12 -0
- package/templates/artifacts/verification-proof.md +19 -0
- package/templates/artifacts/verification-proof.template.json +11 -0
- package/templates/examples/accepted-learning.example.json +14 -0
- package/templates/examples/author.example.json +152 -0
- package/templates/examples/clarification.example.json +15 -0
- package/templates/examples/docs-claim.example.json +8 -0
- package/templates/examples/export-manifest.example.json +7 -0
- package/templates/examples/host-export-package.example.json +11 -0
- package/templates/examples/implementation-plan.example.json +17 -0
- package/templates/examples/proposed-learning.example.json +13 -0
- package/templates/examples/research.example.json +15 -0
- package/templates/examples/research.example.md +6 -0
- package/templates/examples/review.example.json +17 -0
- package/templates/examples/run-manifest.example.json +9 -0
- package/templates/examples/spec-challenge.example.json +14 -0
- package/templates/examples/spec.example.json +21 -0
- package/templates/examples/verification-proof.example.json +21 -0
- package/templates/examples/wazir-manifest.example.yaml +65 -0
- package/templates/task-definition-schema.md +99 -0
- package/tooling/README.md +20 -0
- package/tooling/src/adapters/context-mode.js +50 -0
- package/tooling/src/capture/command.js +376 -0
- package/tooling/src/capture/store.js +99 -0
- package/tooling/src/capture/usage.js +270 -0
- package/tooling/src/checks/branches.js +50 -0
- package/tooling/src/checks/brand-truth.js +110 -0
- package/tooling/src/checks/changelog.js +231 -0
- package/tooling/src/checks/command-registry.js +36 -0
- package/tooling/src/checks/commits.js +102 -0
- package/tooling/src/checks/docs-drift.js +103 -0
- package/tooling/src/checks/docs-truth.js +201 -0
- package/tooling/src/checks/runtime-surface.js +156 -0
- package/tooling/src/cli.js +116 -0
- package/tooling/src/command-options.js +56 -0
- package/tooling/src/commands/validate.js +320 -0
- package/tooling/src/doctor/command.js +91 -0
- package/tooling/src/export/command.js +77 -0
- package/tooling/src/export/compiler.js +498 -0
- package/tooling/src/guards/loop-cap-guard.js +52 -0
- package/tooling/src/guards/protected-path-write-guard.js +67 -0
- package/tooling/src/index/command.js +152 -0
- package/tooling/src/index/storage.js +1061 -0
- package/tooling/src/index/summarizers.js +261 -0
- package/tooling/src/loaders.js +18 -0
- package/tooling/src/project-root.js +22 -0
- package/tooling/src/recall/command.js +225 -0
- package/tooling/src/schema-validator.js +30 -0
- package/tooling/src/state-root.js +40 -0
- package/tooling/src/status/command.js +71 -0
- package/wazir.manifest.yaml +135 -0
- package/workflows/README.md +19 -0
- package/workflows/author.md +42 -0
- package/workflows/clarify.md +38 -0
- package/workflows/design-review.md +46 -0
- package/workflows/design.md +44 -0
- package/workflows/discover.md +37 -0
- package/workflows/execute.md +48 -0
- package/workflows/learn.md +38 -0
- package/workflows/plan-review.md +42 -0
- package/workflows/plan.md +39 -0
- package/workflows/prepare-next.md +37 -0
- package/workflows/review.md +40 -0
- package/workflows/run-audit.md +41 -0
- package/workflows/spec-challenge.md +41 -0
- package/workflows/specify.md +38 -0
- package/workflows/verify.md +37 -0
|
@@ -0,0 +1,1209 @@
|
|
|
1
|
+
# Connection Pooling: Comprehensive Performance Engineering Guide
|
|
2
|
+
|
|
3
|
+
## Table of Contents
|
|
4
|
+
|
|
5
|
+
1. [Why Connection Pooling Matters](#why-connection-pooling-matters)
|
|
6
|
+
2. [Database Connection Pooling](#database-connection-pooling)
|
|
7
|
+
3. [HTTP Connection Pooling](#http-connection-pooling)
|
|
8
|
+
4. [Redis Connection Pooling](#redis-connection-pooling)
|
|
9
|
+
5. [Pool Sizing Formulas](#pool-sizing-formulas)
|
|
10
|
+
6. [Pooling Modes: Session vs Transaction vs Statement](#pooling-modes)
|
|
11
|
+
7. [Common Bottlenecks](#common-bottlenecks)
|
|
12
|
+
8. [Anti-Patterns](#anti-patterns)
|
|
13
|
+
9. [Monitoring Pool Health](#monitoring-pool-health)
|
|
14
|
+
10. [Before/After Benchmarks](#beforeafter-benchmarks)
|
|
15
|
+
11. [Decision Tree: How Should I Size My Pool?](#decision-tree)
|
|
16
|
+
12. [Sources](#sources)
|
|
17
|
+
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
## Why Connection Pooling Matters
|
|
21
|
+
|
|
22
|
+
Every new connection to a database, HTTP service, or cache incurs fixed overhead
|
|
23
|
+
that often dwarfs the cost of the actual work being performed. Connection pooling
|
|
24
|
+
eliminates this overhead by maintaining a set of pre-established connections that
|
|
25
|
+
are reused across requests.
|
|
26
|
+
|
|
27
|
+
### TCP Handshake Cost
|
|
28
|
+
|
|
29
|
+
A TCP three-way handshake (SYN, SYN-ACK, ACK) requires 1 round-trip time (RTT)
|
|
30
|
+
before any application data flows. On a cross-region link with 80 ms RTT, that
|
|
31
|
+
is 80 ms of pure latency per new connection. Within the same data center, RTT is
|
|
32
|
+
typically 0.1-0.5 ms, but at thousands of requests per second, even sub-millisecond
|
|
33
|
+
costs aggregate into measurable throughput loss.
|
|
34
|
+
|
|
35
|
+
| Scenario | RTT | TCP Handshake Cost |
|
|
36
|
+
|---------------------------|-----------|--------------------|
|
|
37
|
+
| Same data center | 0.1-0.5 ms | 0.1-0.5 ms |
|
|
38
|
+
| Same region, cross-AZ | 1-2 ms | 1-2 ms |
|
|
39
|
+
| Cross-region (US-EU) | 80-120 ms | 80-120 ms |
|
|
40
|
+
| Cross-continent (US-Asia) | 200-250 ms| 200-250 ms |
|
|
41
|
+
|
|
42
|
+
### TLS Negotiation Cost
|
|
43
|
+
|
|
44
|
+
A full TLS 1.2 handshake adds 2 additional RTTs on top of the TCP handshake,
|
|
45
|
+
totaling 3 RTTs before any application data can be sent. For a user connecting
|
|
46
|
+
from India to US-East with 200-250 ms RTT, this translates to 600-750 ms of
|
|
47
|
+
setup latency per new connection.
|
|
48
|
+
|
|
49
|
+
TLS 1.3 reduces the TLS portion to 1 RTT (2 RTTs total including TCP), saving
|
|
50
|
+
roughly 80 ms on an 80 ms RTT link. Session resumption with TLS session tickets
|
|
51
|
+
can achieve 0-RTT for returning connections. Production CDNs target resumption
|
|
52
|
+
rates above 80%, meaning four out of five returning connections skip the full
|
|
53
|
+
handshake. Full TLS handshakes consume 5-10x more CPU than resumed sessions
|
|
54
|
+
due to asymmetric cryptographic operations (RSA/ECDHE key exchange).
|
|
55
|
+
|
|
56
|
+
| Protocol | RTTs (TCP + TLS) | Latency at 100ms RTT |
|
|
57
|
+
|------------|------------------|----------------------|
|
|
58
|
+
| TLS 1.2 | 3 RTT | 300 ms |
|
|
59
|
+
| TLS 1.3 | 2 RTT | 200 ms |
|
|
60
|
+
| TLS 1.3 + 0-RTT resumption | 1 RTT | 100 ms |
|
|
61
|
+
| QUIC (HTTP/3) cold | 1 RTT | 100 ms |
|
|
62
|
+
| QUIC 0-RTT | 0 RTT | 0 ms (crypto only) |
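The round-trip counts in the table translate directly into network-bound setup latency. The sketch below is illustrative only: `SETUP_RTTS` and `setup_latency_ms` are hypothetical names, and the model ignores cryptographic CPU time, packet loss, and DNS.

```python
# Round trips (TCP + TLS) before application data can flow, copied from the table above.
SETUP_RTTS = {
    "tls1.2": 3,
    "tls1.3": 2,
    "tls1.3-0rtt": 1,
    "quic": 1,
    "quic-0rtt": 0,
}


def setup_latency_ms(rtt_ms: float, protocol: str) -> float:
    """Return the network-bound setup latency for one new connection."""
    return SETUP_RTTS[protocol] * rtt_ms


if __name__ == "__main__":
    for name in SETUP_RTTS:
        print(f"{name:12s} {setup_latency_ms(100, name):6.0f} ms at 100 ms RTT")
```

Because a pooled connection pays this cost once, every later request on that connection avoids it.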
|
|
63
|
+
|
|
64
|
+
### Database Connection Cost
|
|
65
|
+
|
|
66
|
+
PostgreSQL forks a new OS process for every client connection. Each backend
|
|
67
|
+
process consumes approximately 1.3 MiB with huge_pages=on and 7.6 MiB with
|
|
68
|
+
huge_pages=off (including page table overhead of ~6.4 MB) under a simple OLTP
|
|
69
|
+
workload. Under real-world load with catalog and plan caches, sort buffers, and work_mem
|
|
70
|
+
allocations, this grows to roughly 5-10 MB per connection.
|
|
71
|
+
|
|
72
|
+
At 2,000 direct connections, PostgreSQL needs 10-20 GB of RAM just for connection
|
|
73
|
+
overhead, before any query execution memory is allocated. The fork() itself uses
|
|
74
|
+
copy-on-write on Linux, but the cumulative memory pressure from thousands of
|
|
75
|
+
processes leads to increased context switching, cache thrashing, and lock
|
|
76
|
+
contention on shared data structures like the ProcArray.
|
|
77
|
+
|
|
78
|
+
MySQL uses a thread-per-connection model (lighter than a full process) but still
|
|
79
|
+
incurs authentication, privilege checking, and thread-local memory allocation
|
|
80
|
+
costs of approximately 256 KB-1 MB per connection by default.
|
|
81
|
+
|
|
82
|
+
### The Cost Summary
|
|
83
|
+
|
|
84
|
+
```
|
|
85
|
+
Without pooling (per request):
|
|
86
|
+
DNS lookup: 1-50 ms
|
|
87
|
+
TCP handshake: 0.5-250 ms (1 RTT)
|
|
88
|
+
TLS handshake: 1-500 ms (1-2 RTT)
|
|
89
|
+
DB authentication: 5-20 ms
|
|
90
|
+
Process/thread alloc: 1-5 ms
|
|
91
|
+
Memory allocation: 1-10 MB
|
|
92
|
+
─────────────────────────────────
|
|
93
|
+
Total overhead: ~8-825 ms + memory
|
|
94
|
+
|
|
95
|
+
With pooling (per request):
|
|
96
|
+
Acquire from pool: 0.01-0.1 ms
|
|
97
|
+
─────────────────────────────────
|
|
98
|
+
Total overhead: ~0.01-0.1 ms, no new memory
|
|
99
|
+
```
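The totals in the summary can be reproduced by summing the per-step ranges. This is a minimal sketch with hypothetical names (`SETUP_STEPS_MS`, `total_overhead_ms`); the figures are simply copied from the summary above, not measured.

```python
# (min_ms, max_ms) per setup step, copied from the summary above.
SETUP_STEPS_MS = {
    "dns_lookup": (1, 50),
    "tcp_handshake": (0.5, 250),
    "tls_handshake": (1, 500),
    "db_authentication": (5, 20),
    "process_or_thread_alloc": (1, 5),
}


def total_overhead_ms(steps):
    """Sum the best-case and worst-case cost over all setup steps."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high


if __name__ == "__main__":
    low, high = total_overhead_ms(SETUP_STEPS_MS)
    print(f"per-request setup without pooling: {low}-{high} ms")
```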
|
|
100
|
+
|
|
101
|
+
---
|
|
102
|
+
|
|
103
|
+
## Database Connection Pooling
|
|
104
|
+
|
|
105
|
+
### PgBouncer
|
|
106
|
+
|
|
107
|
+
PgBouncer is the most widely deployed PostgreSQL connection pooler. It is
|
|
108
|
+
single-threaded, written in C, and uses libevent for async I/O. Its memory
|
|
109
|
+
footprint is roughly 2 KB per connection, making it extremely lightweight.
|
|
110
|
+
|
|
111
|
+
**Performance benchmarks (2025, 8 vCPU / 16 GB RAM / PostgreSQL 16):**
|
|
112
|
+
|
|
113
|
+
| Metric                     | Direct PostgreSQL | With PgBouncer (transaction mode) |
|----------------------------|-------------------|-----------------------------------|
| Throughput (150 clients)   | ~8,000 TPS        | ~44,096 TPS                       |
| Throughput (1000 clients)  | Degraded/errors   | ~44,000 TPS (stable)              |
| Avg latency (simple query) | 12-15 ms          | 2-3 ms                            |
| Max concurrent connections | 100 (default)     | 10,000+ (client-side)             |
| Memory per connection      | 5-10 MB           | ~2 KB                             |
|
|
120
|
+
|
|
121
|
+
**Key configuration parameters:**
|
|
122
|
+
|
|
123
|
+
```ini
|
|
124
|
+
[databases]
|
|
125
|
+
mydb = host=127.0.0.1 port=5432 dbname=mydb
|
|
126
|
+
|
|
127
|
+
[pgbouncer]
|
|
128
|
+
pool_mode = transaction ; session | transaction | statement
|
|
129
|
+
max_client_conn = 10000 ; max client connections
|
|
130
|
+
default_pool_size = 20 ; server connections per user/db pair
|
|
131
|
+
min_pool_size = 5 ; minimum server connections to keep open
|
|
132
|
+
reserve_pool_size = 5 ; extra connections for burst
|
|
133
|
+
reserve_pool_timeout = 3 ; seconds before using reserve pool
|
|
134
|
+
server_idle_timeout = 600 ; close idle server connections after 10min
|
|
135
|
+
server_lifetime = 3600 ; close server connections after 1hr
|
|
136
|
+
query_wait_timeout = 120 ; max time a query can wait for a connection
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
**Limitation:** PgBouncer is single-threaded. Beyond ~25,000 TPS on a single
|
|
140
|
+
instance, you need to run multiple PgBouncer processes behind a load balancer
|
|
141
|
+
or use SO_REUSEPORT.
|
|
142
|
+
|
|
143
|
+
### PgCat
|
|
144
|
+
|
|
145
|
+
PgCat (by PostgresML) is a Rust-based, multi-threaded PostgreSQL pooler with
|
|
146
|
+
built-in sharding, load balancing, and failover support.
|
|
147
|
+
|
|
148
|
+
**Performance benchmarks (2025, same hardware as PgBouncer tests):**
|
|
149
|
+
|
|
150
|
+
| Metric                     | PgBouncer     | PgCat            |
|----------------------------|---------------|------------------|
| Throughput (50 clients)    | Higher        | Slightly lower   |
| Throughput (750+ clients)  | ~22,000 TPS   | ~59,000 TPS      |
| Throughput (1000+ clients) | Baseline      | >2x PgBouncer    |
| CPU usage                  | Lower         | Higher           |
| Multi-threading            | No            | Yes              |
| Prepared statement support | Limited       | Named + extended |
|
|
158
|
+
|
|
159
|
+
PgCat outperforms PgBouncer at high concurrency (750+ clients) by leveraging
|
|
160
|
+
multiple CPU cores, delivering more than 2x the queries per second. At lower
|
|
161
|
+
concurrency (<50 clients), PgBouncer's simpler architecture introduces less
|
|
162
|
+
overhead.
|
|
163
|
+
|
|
164
|
+
**When to choose PgCat over PgBouncer:**
|
|
165
|
+
- More than 500 concurrent clients
|
|
166
|
+
- Need built-in read replica load balancing
|
|
167
|
+
- Need sharding at the proxy layer
|
|
168
|
+
- Running on multi-core hardware where single-threaded is a bottleneck
|
|
169
|
+
|
|
170
|
+
### HikariCP
|
|
171
|
+
|
|
172
|
+
HikariCP is the fastest JDBC connection pool for JVM applications. It is the
|
|
173
|
+
default pool in Spring Boot 2+ and delivers sub-microsecond connection
|
|
174
|
+
acquisition times.
|
|
175
|
+
|
|
176
|
+
**Key design decisions that make it fast:**
|
|
177
|
+
- ConcurrentBag collection instead of LinkedBlockingQueue (lock-free)
|
|
178
|
+
- Bytecode-level optimization of Connection, Statement, ResultSet proxies
|
|
179
|
+
- Fixed-size pool by default (`minimumIdle` defaults to `maximumPoolSize`)
|
|
180
|
+
- FastList instead of ArrayList (no range checking, no element removal shifting)
|
|
181
|
+
|
|
182
|
+
**Critical configuration:**
|
|
183
|
+
|
|
184
|
+
```yaml
spring:
  datasource:
    hikari:
      maximum-pool-size: 10            # Start here, tune based on formula
      minimum-idle: 10                 # HikariCP recommends equal to max (fixed pool)
      idle-timeout: 600000             # 10 minutes
      max-lifetime: 1800000            # 30 minutes (must be < DB wait_timeout)
      connection-timeout: 30000        # 30 seconds to acquire from pool
      validation-timeout: 5000         # 5 seconds for connection validation
      leak-detection-threshold: 60000  # Log warning if connection held > 60s
```
|
|
196
|
+
|
|
197
|
+
**Real benchmark from HikariCP wiki:** Reducing pool size from 50 to 10
|
|
198
|
+
on a 4-core server decreased average response time from ~100 ms to ~2 ms --
|
|
199
|
+
a 50x improvement. The larger pool caused excessive context switching and
|
|
200
|
+
lock contention on the database server.
|
|
201
|
+
|
|
202
|
+
### SQLAlchemy QueuePool
|
|
203
|
+
|
|
204
|
+
SQLAlchemy (Python) uses QueuePool as its default connection pool
|
|
205
|
+
implementation for most database backends.
|
|
206
|
+
|
|
207
|
+
**Key parameters:**
|
|
208
|
+
|
|
209
|
+
```python
|
|
210
|
+
from sqlalchemy import create_engine
|
|
211
|
+
|
|
212
|
+
engine = create_engine(
|
|
213
|
+
"postgresql+psycopg2://user:pass@localhost/mydb",
|
|
214
|
+
pool_size=5, # persistent connections in pool (default: 5)
|
|
215
|
+
max_overflow=10, # extra connections allowed beyond pool_size
|
|
216
|
+
pool_timeout=30, # seconds to wait for connection (default: 30)
|
|
217
|
+
pool_recycle=1800, # recycle connections after 30 min
|
|
218
|
+
pool_pre_ping=True, # validate connection before checkout (adds ~1ms)
|
|
219
|
+
echo_pool="debug", # log pool events for debugging
|
|
220
|
+
)
|
|
221
|
+
# Total max simultaneous connections: pool_size + max_overflow = 15
|
|
222
|
+
# Sleeping (idle) connections in pool: up to pool_size = 5
|
|
223
|
+
```
|
|
224
|
+
|
|
225
|
+
**Critical behaviors:**
|
|
226
|
+
- Connections are created lazily (no pre-warming)
|
|
227
|
+
- pool_pre_ping sends a `SELECT 1` before every checkout to detect stale
|
|
228
|
+
connections (adds ~1 ms latency but prevents "connection closed" errors)
|
|
229
|
+
- In multi-process apps (e.g., Gunicorn with preforking), each worker gets
|
|
230
|
+
its own pool. With 4 workers and pool_size=5, that is 20 persistent
|
|
231
|
+
connections to the database
|
|
232
|
+
- pool_recycle is essential for MySQL, which closes connections idle for
|
|
233
|
+
longer than wait_timeout (default 8 hours)
|
|
234
|
+
- For serverless (AWS Lambda), use NullPool to avoid stale connections across
|
|
235
|
+
cold starts
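The per-worker multiplication described above is easy to overlook when budgeting database connections. A minimal sketch, assuming every worker process builds its own engine with identical settings; `connection_budget` is a hypothetical helper, not part of SQLAlchemy.

```python
def connection_budget(workers: int, pool_size: int = 5, max_overflow: int = 10):
    """Return (persistent, peak) connection counts across all worker processes.

    Each preforked worker owns an independent QueuePool, so both totals scale
    linearly with the number of workers. Because connections are created
    lazily, `persistent` is an upper bound on idle connections, not a floor.
    """
    persistent = workers * pool_size
    peak = workers * (pool_size + max_overflow)
    return persistent, peak


if __name__ == "__main__":
    persistent, peak = connection_budget(workers=4, pool_size=5, max_overflow=10)
    print(f"persistent={persistent}, peak={peak}")
```

The peak figure is the one to compare against the database's connection limit, since overflow connections count against it while they are open.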
|
|
236
|
+
|
|
237
|
+
**Pool types available:**
|
|
238
|
+
|
|
239
|
+
| Pool Type           | Use Case                                    |
|---------------------|---------------------------------------------|
| QueuePool           | Default. Best for long-running applications |
| NullPool            | Serverless / Lambda. No pooling at all      |
| StaticPool          | Testing. Single connection reused           |
| SingletonThreadPool | Thread-local connection (SQLite)            |
| AssertionPool       | Testing. Ensures single concurrent checkout |
|
|
246
|
+
|
|
247
|
+
---
|
|
248
|
+
|
|
249
|
+
## HTTP Connection Pooling
|
|
250
|
+
|
|
251
|
+
HTTP connection pooling (often via HTTP keep-alive) reuses TCP+TLS connections
|
|
252
|
+
across multiple HTTP requests to the same origin, eliminating per-request
|
|
253
|
+
handshake costs.
|
|
254
|
+
|
|
255
|
+
### Performance Impact
|
|
256
|
+
|
|
257
|
+
Production measurements consistently show significant improvements:
|
|
258
|
+
|
|
259
|
+
| Metric | Without Pooling | With Pooling | Improvement |
|
|
260
|
+
|--------------------------|-----------------|-----------------|-------------|
|
|
261
|
+
| Throughput (RPS) | 4,000 | 7,721 | +93% |
|
|
262
|
+
| Average API latency | ~85 ms | ~51 ms | -40% |
|
|
263
|
+
| CPU usage (client-side) | High | ~50% reduction | -50% |
|
|
264
|
+
| Max inbound throughput | Baseline | +50% | +50% |
|
|
265
|
+
| Connection setup per req | 3-500 ms | 0 ms | -100% |
|
|
266
|
+
|
|
267
|
+
One benchmark showed an overall 13x performance improvement when using
|
|
268
|
+
connection pooling vs. creating a new connection for every request.
|
|
269
|
+
|
|
270
|
+
### HTTP/1.1 Keep-Alive
|
|
271
|
+
|
|
272
|
+
HTTP/1.1 made persistent connections the default (Connection: keep-alive).
|
|
273
|
+
The connection remains open after the response, allowing subsequent requests
|
|
274
|
+
to reuse the same TCP+TLS session.
|
|
275
|
+
|
|
276
|
+
```
|
|
277
|
+
# Without keep-alive (HTTP/1.0 default):
|
|
278
|
+
Request 1: DNS + TCP + TLS + HTTP → Close
|
|
279
|
+
Request 2: DNS + TCP + TLS + HTTP → Close (all overhead repeated)
|
|
280
|
+
Request 3: DNS + TCP + TLS + HTTP → Close
|
|
281
|
+
|
|
282
|
+
# With keep-alive (HTTP/1.1 default):
|
|
283
|
+
Request 1: DNS + TCP + TLS + HTTP → Keep open
|
|
284
|
+
Request 2: HTTP → Keep open (reuses connection)
|
|
285
|
+
Request 3: HTTP → Keep open
|
|
286
|
+
```
|
|
287
|
+
|
|
288
|
+
**Key configuration (Nginx):**
|
|
289
|
+
|
|
290
|
+
```nginx
|
|
291
|
+
upstream backend {
|
|
292
|
+
server 10.0.0.1:8080;
|
|
293
|
+
keepalive 32; # Keep 32 idle connections per worker
|
|
294
|
+
keepalive_requests 1000; # Max requests per connection before recycling
|
|
295
|
+
keepalive_time 1h; # Max lifetime of a kept-alive connection
|
|
296
|
+
keepalive_timeout 60s; # Close idle connections after 60s
|
|
297
|
+
}
|
|
298
|
+
|
|
299
|
+
server {
|
|
300
|
+
location / {
|
|
301
|
+
proxy_pass http://backend;
|
|
302
|
+
proxy_http_version 1.1; # Required for keepalive
|
|
303
|
+
proxy_set_header Connection ""; # Clear "close" header
|
|
304
|
+
}
|
|
305
|
+
}
|
|
306
|
+
```
|
|
307
|
+
|
|
308
|
+
### HTTP/2 Multiplexing
|
|
309
|
+
|
|
310
|
+
HTTP/2 uses a single TCP connection per origin with multiplexed streams,
|
|
311
|
+
eliminating head-of-line blocking at the HTTP layer. A single HTTP/2 connection
|
|
312
|
+
can handle hundreds of concurrent requests without additional handshakes.
|
|
313
|
+
|
|
314
|
+
**Practical implications for connection pooling:**
|
|
315
|
+
- Client-side pool size of 1 per origin is often sufficient for HTTP/2
|
|
316
|
+
- gRPC (built on HTTP/2) typically uses a single connection per channel
|
|
317
|
+
- Connection pool sizing shifts from "how many connections" to "how many
|
|
318
|
+
origins" and "what is the max concurrent streams per connection"
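With HTTP/2 the sizing question becomes a division by the per-connection stream limit. The sketch below assumes a limit of 100 streams purely for illustration; the real value is whatever the server advertises in `SETTINGS_MAX_CONCURRENT_STREAMS`, and `http2_connections_needed` is a hypothetical helper.

```python
import math


def http2_connections_needed(concurrent_requests: int,
                             max_concurrent_streams: int = 100) -> int:
    """Connections to one origin needed to carry the given in-flight requests."""
    if max_concurrent_streams <= 0:
        raise ValueError("max_concurrent_streams must be positive")
    if concurrent_requests <= 0:
        return 0
    return math.ceil(concurrent_requests / max_concurrent_streams)


if __name__ == "__main__":
    for in_flight in (10, 100, 250):
        print(in_flight, "in-flight requests ->",
              http2_connections_needed(in_flight), "connection(s)")
```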
|
|
319
|
+
|
|
320
|
+
### Client Library Configuration Examples
|
|
321
|
+
|
|
322
|
+
**Go (net/http):**
|
|
323
|
+
|
|
324
|
+
```go
|
|
325
|
+
transport := &http.Transport{
|
|
326
|
+
MaxIdleConns: 100, // Total idle connections
|
|
327
|
+
MaxIdleConnsPerHost: 10, // Per-host idle connections
|
|
328
|
+
MaxConnsPerHost: 0, // 0 = unlimited
|
|
329
|
+
IdleConnTimeout: 90 * time.Second, // Close idle after 90s
|
|
330
|
+
TLSHandshakeTimeout: 10 * time.Second,
|
|
331
|
+
DisableKeepAlives: false, // NEVER set true in production
|
|
332
|
+
}
|
|
333
|
+
client := &http.Client{Transport: transport}
|
|
334
|
+
```
|
|
335
|
+
|
|
336
|
+
**Python (requests/urllib3):**
|
|
337
|
+
|
|
338
|
+
```python
|
|
339
|
+
import requests
|
|
340
|
+
|
|
341
|
+
session = requests.Session()
|
|
342
|
+
adapter = requests.adapters.HTTPAdapter(
|
|
343
|
+
pool_connections=10, # Number of connection pools (per host)
|
|
344
|
+
pool_maxsize=10, # Connections per pool
|
|
345
|
+
max_retries=3,
|
|
346
|
+
pool_block=True, # Block when pool is full (vs. creating new)
|
|
347
|
+
)
|
|
348
|
+
session.mount("https://", adapter)
|
|
349
|
+
session.mount("http://", adapter)
|
|
350
|
+
# Reuse `session` across requests -- do NOT create per-request sessions
|
|
351
|
+
```
|
|
352
|
+
|
|
353
|
+
**Node.js (built-in http.Agent):**
|
|
354
|
+
|
|
355
|
+
```javascript
|
|
356
|
+
const http = require('http');
|
|
357
|
+
const agent = new http.Agent({
|
|
358
|
+
keepAlive: true, // Enable connection reuse
|
|
359
|
+
keepAliveMsecs: 1000, // TCP keepalive probe interval
|
|
360
|
+
maxSockets: 50, // Max concurrent sockets per host
|
|
361
|
+
maxFreeSockets: 10, // Max idle sockets to keep
|
|
362
|
+
timeout: 60000, // Socket inactivity timeout
|
|
363
|
+
});
|
|
364
|
+
// Pass agent to every request
|
|
365
|
+
http.get({ hostname: 'api.example.com', agent }, callback);
|
|
366
|
+
```
|
|
367
|
+
|
|
368
|
+
---
|
|
369
|
+
|
|
370
|
+
## Redis Connection Pooling
|
|
371
|
+
|
|
372
|
+
Redis is single-threaded (for command processing) and handles connections
|
|
373
|
+
extremely efficiently, but connection creation still carries TCP/TLS overhead
|
|
374
|
+
and authentication cost.
|
|
375
|
+
|
|
376
|
+
### Why Pool Redis Connections
|
|
377
|
+
|
|
378
|
+
Each new Redis connection requires:
|
|
379
|
+
- TCP handshake: 0.1-0.5 ms (same DC), 1+ ms (cross-AZ)
|
|
380
|
+
- TLS handshake (if enabled): 1-5 ms
|
|
381
|
+
- AUTH command: 0.1-0.5 ms
|
|
382
|
+
- SELECT database: 0.05-0.1 ms
|
|
383
|
+
|
|
384
|
+
A SET/GET command takes ~0.1 ms on a warm connection. Without pooling,
|
|
385
|
+
connection setup (0.3-6 ms) is 3-60x more expensive than the operation itself.
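The setup range quoted above can be roughly reproduced from the per-step figures. This is a sketch with hypothetical names; it takes the best case without TLS and the worst case with TLS (same data center), and the results round to the 0.3-6 ms and 3-60x figures in the text.

```python
COMMAND_MS = 0.1  # warm-connection SET/GET latency quoted above

# (min_ms, max_ms) per step; TLS is optional, so it is kept separate.
BASE_STEPS_MS = {"tcp": (0.1, 0.5), "auth": (0.1, 0.5), "select": (0.05, 0.1)}
TLS_STEP_MS = (1.0, 5.0)


def setup_range_ms(with_tls: bool):
    """Return (best_case_ms, worst_case_ms) for creating one Redis connection."""
    low = sum(lo for lo, _ in BASE_STEPS_MS.values())
    high = sum(hi for _, hi in BASE_STEPS_MS.values())
    if with_tls:
        low += TLS_STEP_MS[0]
        high += TLS_STEP_MS[1]
    return low, high


if __name__ == "__main__":
    best, _ = setup_range_ms(with_tls=False)
    _, worst = setup_range_ms(with_tls=True)
    print(f"setup: {best:.2f}-{worst:.2f} ms, "
          f"{best / COMMAND_MS:.1f}x-{worst / COMMAND_MS:.1f}x one command")
```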
|
|
386
|
+
|
|
387
|
+
**Benchmark data:**
|
|
388
|
+
|
|
389
|
+
| Metric | Without Pooling | With Pooling | Improvement |
|
|
390
|
+
|---------------------------|-----------------|--------------|-------------|
|
|
391
|
+
| Average operation latency | 2.82 ms | 0.21 ms | 13.4x |
|
|
392
|
+
| Transaction time | 427 ms | 118 ms | 3.6x (72%) |
|
|
393
|
+
| Throughput (ops/sec) | ~35,000 | ~475,000 | 13.6x |
|
|
394
|
+
|
|
395
|
+
### Configuration Best Practices
|
|
396
|
+
|
|
397
|
+
**Python (redis-py):**
|
|
398
|
+
|
|
399
|
+
```python
|
|
400
|
+
import redis
|
|
401
|
+
|
|
402
|
+
pool = redis.ConnectionPool(
|
|
403
|
+
host='redis.example.com',
|
|
404
|
+
port=6379,
|
|
405
|
+
db=0,
|
|
406
|
+
max_connections=50, # Max pool size
|
|
407
|
+
socket_timeout=5.0, # Command timeout
|
|
408
|
+
socket_connect_timeout=2.0, # Connection timeout
|
|
409
|
+
retry_on_timeout=True, # Auto-retry on timeout
|
|
410
|
+
health_check_interval=30, # Validate connections every 30s
|
|
411
|
+
decode_responses=True,
|
|
412
|
+
)
|
|
413
|
+
r = redis.Redis(connection_pool=pool)
|
|
414
|
+
```
|
|
415
|
+
|
|
416
|
+
**Java (Jedis):**
|
|
417
|
+
|
|
418
|
+
```java
|
|
419
|
+
JedisPoolConfig config = new JedisPoolConfig();
|
|
420
|
+
config.setMaxTotal(50); // Max active connections
|
|
421
|
+
config.setMaxIdle(20); // Max idle connections
|
|
422
|
+
config.setMinIdle(5); // Min idle connections (pre-warmed)
|
|
423
|
+
config.setTestOnBorrow(true); // Validate before checkout
|
|
424
|
+
config.setTestWhileIdle(true); // Validate idle connections
|
|
425
|
+
config.setTimeBetweenEvictionRunsMillis(30000); // Check idle every 30s
|
|
426
|
+
config.setBlockWhenExhausted(true);
|
|
427
|
+
config.setMaxWaitMillis(2000); // Wait 2s for connection, then throw
|
|
428
|
+
|
|
429
|
+
JedisPool pool = new JedisPool(config, "redis.example.com", 6379);
|
|
430
|
+
try (Jedis jedis = pool.getResource()) {
|
|
431
|
+
jedis.set("key", "value");
|
|
432
|
+
} // Connection automatically returned to pool
|
|
433
|
+
```
|
|
434
|
+
|
|
435
|
+
### Connection Pooling vs. Multiplexing
|
|
436
|
+
|
|
437
|
+
Some Redis client libraries (like Lettuce for Java or ioredis in pipeline mode)
|
|
438
|
+
use a multiplexing approach: a single connection handles all commands.
|
|
439
|
+
|
|
440
|
+
| Approach     | Pros                                      | Cons                                |
|--------------|-------------------------------------------|-------------------------------------|
| Pooling      | Supports blocking commands (BLPOP, BRPOP) | Higher memory with many connections |
| Multiplexing | Single connection, lower overhead         | Cannot use blocking commands        |
| Pipelining   | Batch commands, reduce RTTs               | Must buffer commands                |
|
|
445
|
+
|
|
446
|
+
**Recommendation:** Use connection pooling when your application uses blocking
|
|
447
|
+
commands (BLPOP, BRPOP, SUBSCRIBE). Use multiplexing when you need maximum
|
|
448
|
+
throughput on simple GET/SET workloads. In practice, most applications should
|
|
449
|
+
start with a connection pool of 10-50 connections.
|
|
450
|
+
|
|
451
|
+
---
|
|
452
|
+
|
|
453
|
+
## Pool Sizing Formulas
|
|
454
|
+
|
|
455
|
+
### The PostgreSQL Formula
|
|
456
|
+
|
|
457
|
+
The most widely cited formula comes from the PostgreSQL project and is
|
|
458
|
+
referenced in the HikariCP documentation:
|
|
459
|
+
|
|
460
|
+
```
|
|
461
|
+
connections = ((core_count * 2) + effective_spindle_count)
|
|
462
|
+
```
|
|
463
|
+
|
|
464
|
+
Where:
|
|
465
|
+
- **core_count** = physical CPU cores (NOT hyperthreads). A 4-core/8-thread
|
|
466
|
+
CPU has core_count = 4
|
|
467
|
+
- **effective_spindle_count** = number of independent I/O paths
  - SSD: use 1 (since SSD parallelism is handled internally)
  - HDD: use actual spindle count
  - Fully cached dataset: use 0
  - Cloud NVMe: use 1
|
|
472
|
+
|
|
473
|
+
**Examples:**
|
|
474
|
+
|
|
475
|
+
| Server Hardware | Formula | Optimal Pool Size |
|
|
476
|
+
|--------------------------|----------------------|--------------------|
|
|
477
|
+
| 4-core, SSD | (4 * 2) + 1 | 9 |
|
|
478
|
+
| 8-core, SSD | (8 * 2) + 1 | 17 |
|
|
479
|
+
| 16-core, SSD | (16 * 2) + 1 | 33 |
|
|
480
|
+
| 4-core, fully cached | (4 * 2) + 0 | 8 |
|
|
481
|
+
| 4-core, 4-disk RAID | (4 * 2) + 4 | 12 |
|
|
482
|
+
|
|
483
|
+
**Why this formula works:** Database threads spend most of their time
|
|
484
|
+
blocking on I/O (disk reads, network waits, lock acquisition). With N cores,
|
|
485
|
+
you need roughly 2*N threads to keep all cores busy while half the threads
|
|
486
|
+
are waiting on I/O. The spindle count adds capacity for concurrent disk I/O.
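The formula is simple enough to encode directly, which helps when it has to be applied across a fleet of differently sized servers. A minimal sketch; `postgres_pool_size` is a hypothetical helper name, and the result is a starting point for load testing rather than a guaranteed optimum.

```python
def postgres_pool_size(core_count: int, effective_spindle_count: int) -> int:
    """connections = (core_count * 2) + effective_spindle_count

    core_count is the number of physical cores (not hyperthreads).
    """
    if core_count < 1 or effective_spindle_count < 0:
        raise ValueError("core_count must be >= 1 and spindle count >= 0")
    return core_count * 2 + effective_spindle_count


if __name__ == "__main__":
    examples = [
        ("4-core, SSD", 4, 1),
        ("8-core, SSD", 8, 1),
        ("16-core, SSD", 16, 1),
        ("4-core, fully cached", 4, 0),
        ("4-core, 4-disk RAID", 4, 4),
    ]
    for label, cores, spindles in examples:
        print(f"{label:22s} -> {postgres_pool_size(cores, spindles)}")
```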
|
|
487
|
+
|
|
488
|
+
### The Universal Formula (Little's Law)
|
|
489
|
+
|
|
490
|
+
For any connection pool, the optimal size can be derived from Little's Law:
|
|
491
|
+
|
|
492
|
+
```
|
|
493
|
+
L = lambda * W
|
|
494
|
+
|
|
495
|
+
Where:
|
|
496
|
+
L = average number of connections in use
|
|
497
|
+
lambda = arrival rate (requests/second)
|
|
498
|
+
W = average time a connection is held (seconds)
|
|
499
|
+
```
|
|
500
|
+
|
|
501
|
+
**Example:** Your API handles 1,000 req/s and each database query takes 10 ms
|
|
502
|
+
(W = 0.01s):
|
|
503
|
+
|
|
504
|
+
```
|
|
505
|
+
L = 1000 * 0.01 = 10 connections needed on average
|
|
506
|
+
```
|
|
507
|
+
|
|
508
|
+
Add headroom for variance (typically 1.5-2x):
|
|
509
|
+
|
|
510
|
+
```
|
|
511
|
+
pool_size = L * 2 = 20 connections
|
|
512
|
+
```
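The two steps above (Little's Law, then headroom) can be combined into one helper. This is a sketch under the same assumptions as the example: a steady arrival rate and an average hold time. The name `littles_law_pool_size` and the choice to pass the hold time in milliseconds are illustrative.

```python
import math


def littles_law_pool_size(arrival_rate_per_s: float,
                          hold_time_ms: float,
                          headroom: float = 2.0) -> int:
    """Pool size from L = lambda * W, scaled by a headroom factor.

    arrival_rate_per_s: requests per second (lambda)
    hold_time_ms: average time a connection is held per request (W), in ms
    headroom: multiplier for variance, typically 1.5-2.0
    """
    connections_in_use = arrival_rate_per_s * hold_time_ms / 1000.0
    return math.ceil(connections_in_use * headroom)


if __name__ == "__main__":
    print(littles_law_pool_size(1000, 10))                # example from the text
    print(littles_law_pool_size(1000, 10, headroom=1.5))  # tighter headroom
```

Note that W is the time the connection is checked out, not just query execution time; if application code holds the connection while doing other work, W grows accordingly.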
|
|
513
|
+
|
|
514
|
+
### Adjustments for Real-World Conditions
|
|
515
|
+
|
|
516
|
+
The base formulas provide starting points. Adjust for:
|
|
517
|
+
|
|
518
|
+
```
|
|
519
|
+
actual_pool_size = base_formula * adjustment_factors
|
|
520
|
+
|
|
521
|
+
Adjustment factors:
|
|
522
|
+
+ Long transactions (>100ms avg): multiply by 1.5-2x
|
|
523
|
+
+ Mixed read/write workload: multiply by 1.2x
|
|
524
|
+
+ Lock contention present: multiply by 1.3x
|
|
525
|
+
+ Connection validation enabled: add 1-2 connections
|
|
526
|
+
- Read replicas in use: divide by replica_count
|
|
527
|
+
- Caching layer (Redis) in front: multiply by (1 - cache_hit_ratio)
|
|
528
|
+
```
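One way to apply these adjustments in code is sketched below. How the factors combine is an assumption on my part: multipliers are applied first, then the replica and cache reductions, then the additive allowance for validation. The cache reduction is interpreted as the miss ratio (1 - hit ratio), since only cache misses reach the database. `adjusted_pool_size` is a hypothetical helper.

```python
import math


def adjusted_pool_size(base: int,
                       multipliers=(),
                       extra_connections: int = 0,
                       replica_count: int = 1,
                       cache_hit_ratio: float = 0.0) -> int:
    """Scale a base pool size by workload-specific adjustment factors."""
    if replica_count < 1 or not 0.0 <= cache_hit_ratio < 1.0:
        raise ValueError("replica_count >= 1 and 0 <= cache_hit_ratio < 1 required")
    size = float(base)
    for factor in multipliers:          # e.g. 1.5 for long transactions
        size *= factor
    size /= replica_count               # load spread across read replicas
    size *= 1.0 - cache_hit_ratio       # only cache misses hit the database
    return max(1, math.ceil(size + extra_connections))


if __name__ == "__main__":
    # Base of 20, long transactions (1.5x), validation enabled (+2).
    print(adjusted_pool_size(20, multipliers=(1.5,), extra_connections=2))
```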
|
|
529
|
+
|
|
530
|
+
### Multi-Application Pool Sizing
|
|
531
|
+
|
|
532
|
+
When multiple applications share a database:
|
|
533
|
+
|
|
534
|
+
```
|
|
535
|
+
total_connections = sum(app_pool_size * app_instance_count) for all apps
|
|
536
|
+
|
|
537
|
+
Constraint: total_connections < max_connections * 0.9
|
|
538
|
+
(reserve 10% for admin/monitoring connections)
|
|
539
|
+
```
|
|
540
|
+
|
|
541
|
+
**Example:** 3 microservices, each with 4 replicas, pool_size=10:
|
|
542
|
+
|
|
543
|
+
```
|
|
544
|
+
total = 3 * 4 * 10 = 120 connections
|
|
545
|
+
PostgreSQL max_connections should be >= 134 (120 / 0.9)
|
|
546
|
+
```
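The same budget check can be scripted so it runs whenever a service changes its replica count or pool size. A minimal sketch using integer arithmetic to respect the strict inequality; `required_max_connections` and the `(pool_size, instance_count)` tuple layout are illustrative assumptions.

```python
def required_max_connections(apps, reserve_percent: int = 10):
    """Return (total_connections, minimum max_connections).

    apps: iterable of (pool_size, instance_count) pairs, one per application.
    The minimum satisfies total < max_connections * (1 - reserve_percent/100).
    """
    if not 0 <= reserve_percent < 100:
        raise ValueError("reserve_percent must be in [0, 100)")
    total = sum(pool_size * instances for pool_size, instances in apps)
    usable_percent = 100 - reserve_percent
    # Smallest integer m with total * 100 < m * usable_percent.
    minimum = total * 100 // usable_percent + 1
    return total, minimum


if __name__ == "__main__":
    services = [(10, 4), (10, 4), (10, 4)]  # 3 microservices, 4 replicas each
    total, minimum = required_max_connections(services)
    print(f"total={total}, max_connections >= {minimum}")
```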
|
|
547
|
+
|
|
548
|
+
---
|
|
549
|
+
|
|
550
|
+
## Pooling Modes
|
|
551
|
+
|
|
552
|
+
PgBouncer (and other PostgreSQL poolers) support three pooling modes. The
|
|
553
|
+
choice fundamentally affects application compatibility and connection
|
|
554
|
+
efficiency.
|
|
555
|
+
|
|
556
|
+
### Session Pooling
|
|
557
|
+
|
|
558
|
+
```
|
|
559
|
+
Client connects → Assigned a server connection → Keeps it until disconnect
|
|
560
|
+
```
|
|
561
|
+
|
|
562
|
+
- Server connection is assigned to a client for the entire duration of the
|
|
563
|
+
client session
|
|
564
|
+
- All session-level features work: prepared statements, temp tables, SET
|
|
565
|
+
commands, advisory locks, LISTEN/NOTIFY
|
|
566
|
+
- Connection ratio: 1:1 (client:server) while client is connected
|
|
567
|
+
- **Use case:** Legacy applications that rely on session state, applications
|
|
568
|
+
using prepared statements extensively, any app that cannot be modified
|
|
569
|
+
|
|
570
|
+
**Efficiency:** Low. If you have 10,000 clients connecting via session
|
|
571
|
+
pooling, you need 10,000 server connections. This is only useful for
|
|
572
|
+
connection lifecycle management (graceful close, health checks) rather
|
|
573
|
+
than connection reduction.
|
|
574
|
+
|
|
575
|
+
### Transaction Pooling
|
|
576
|
+
|
|
577
|
+
```
|
|
578
|
+
Client connects → Gets server connection only during a transaction → Released after COMMIT/ROLLBACK
|
|
579
|
+
```
|
|
580
|
+
|
|
581
|
+
- Server connection is returned to the pool after each transaction completes
|
|
582
|
+
- Connection ratio can be as high as 100:1 or more (e.g., 10,000 clients
|
|
583
|
+
mapped to 50-100 server connections)
|
|
584
|
+
- **Breaks:** Prepared statements (different connection per transaction),
|
|
585
|
+
temporary tables, session-level SET commands, advisory locks, LISTEN/NOTIFY
|
|
586
|
+
|
|
587
|
+
**Efficiency:** High. This is the recommended mode for most OLTP applications.
|
|
588
|
+
A typical web application holds a connection for 5-50 ms per transaction but
|
|
589
|
+
the HTTP request lifecycle may be 100-500 ms. Transaction pooling allows the
|
|
590
|
+
connection to serve other clients during the 50-450 ms of non-database work.
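
The multiplexing effect can be estimated from the fraction of each request that is spent inside a transaction. The sketch below is a rough steady-state estimate, assuming requests are spread evenly over time; the function name is hypothetical, and a real deployment needs headroom for bursts.

```python
import math


def busy_server_connections(clients, txn_ms, request_ms):
    """Estimate server connections in use when each client holds one
    only for txn_ms out of every request_ms (steady-state average)."""
    if not 0 < txn_ms <= request_ms:
        raise ValueError("expected 0 < txn_ms <= request_ms")
    return math.ceil(clients * txn_ms / request_ms)


# 10,000 always-busy clients, 5 ms of transaction time per 500 ms request
print(busy_server_connections(10_000, 5, 500))   # 100
# Same clients, 50 ms of transaction time per 100 ms request
print(busy_server_connections(10_000, 50, 100))  # 5000
```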
|
|
591
|
+
|
|
592
|
+
**Workarounds for limitations:**
|
|
593
|
+
- Prepared statements: Use `DEALLOCATE ALL` or PgBouncer 1.21+ with
|
|
594
|
+
`max_prepared_statements` setting (transparent prepared statement support)
|
|
595
|
+
- Session variables: Move to per-transaction SET LOCAL instead of SET
|
|
596
|
+
- Temp tables: Replace with regular tables or CTEs
|
|
597
|
+
|
|
598
|
+
### Statement Pooling
|
|
599
|
+
|
|
600
|
+
```
|
|
601
|
+
Client sends query → Gets server connection → Connection released after query completes
|
|
602
|
+
```
|
|
603
|
+
|
|
604
|
+
- Connection is returned after every individual statement
|
|
605
|
+
- Forces autocommit mode -- multi-statement transactions are rejected
|
|
606
|
+
- Connection ratio: Highest possible
|
|
607
|
+
- **Breaks:** Everything that transaction pooling breaks PLUS multi-statement
|
|
608
|
+
transactions
|
|
609
|
+
|
|
610
|
+
**Efficiency:** Highest, but most restrictive. Only suitable for simple
|
|
611
|
+
key-value lookups or single-statement operations.
|
|
612
|
+
|
|
613
|
+
### Mode Comparison Matrix
|
|
614
|
+
|
|
615
|
+
| Feature                      | Session | Transaction | Statement |
|------------------------------|---------|-------------|-----------|
| Connection reuse ratio       | 1:1     | 10-200:1    | 50-500:1  |
| Prepared statements          | Yes     | No\*        | No        |
| Temp tables                  | Yes     | No          | No        |
| SET/RESET                    | Yes     | No\*\*      | No        |
| Multi-statement transactions | Yes     | Yes         | No        |
| Advisory locks               | Yes     | No          | No        |
| LISTEN/NOTIFY                | Yes     | No          | No        |
| Typical throughput gain      | 1x      | 5-50x       | 10-100x   |
|
|
625
|
+
|
|
626
|
+
\* PgBouncer 1.21+ supports prepared statements in transaction mode via
|
|
627
|
+
`max_prepared_statements` config.
|
|
628
|
+
\** Use `SET LOCAL` within a transaction as a workaround.
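
The matrix can be read as a lookup: pick the mode with the highest reuse that still supports every feature the application needs. The sketch below encodes that reading; the feature keys are illustrative names, and the footnoted workarounds (PgBouncer 1.21+ prepared statements, `SET LOCAL`) are deliberately ignored.

```python
# Session-level features each mode supports, per the matrix above
# (workarounds from the footnotes are not modeled).
SUPPORTED = {
    "statement": set(),
    "transaction": {"multi_statement_transactions"},
    "session": {
        "multi_statement_transactions", "prepared_statements", "temp_tables",
        "set_reset", "advisory_locks", "listen_notify",
    },
}


def most_efficient_mode(required):
    """Return the highest-reuse mode that supports every required feature."""
    required = set(required)
    for mode in ("statement", "transaction", "session"):  # most to least reuse
        if required <= SUPPORTED[mode]:
            return mode
    raise ValueError(f"unknown feature(s): {required - SUPPORTED['session']}")


print(most_efficient_mode({"multi_statement_transactions"}))  # transaction
print(most_efficient_mode({"advisory_locks"}))                # session
```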
|
|
629
|
+
|
|
630
|
+
---
|
|
631
|
+
|
|
632
|
+
## Common Bottlenecks
|
|
633
|
+
|
|
634
|
+
### 1. Pool Exhaustion
|
|
635
|
+
|
|
636
|
+
**Symptom:** Application threads block waiting for a connection. Response
|
|
637
|
+
times spike. Eventually, connection timeout errors cascade into 500 errors.
|
|
638
|
+
|
|
639
|
+
**Root cause:** All connections in the pool are checked out, and new requests
|
|
640
|
+
cannot acquire one within the timeout period.
|
|
641
|
+
|
|
642
|
+
**Detection:**
|
|
643
|
+
```sql
-- PostgreSQL: check client connections by state
SELECT count(*) AS total,
       count(*) FILTER (WHERE state = 'active') AS active,
       count(*) FILTER (WHERE state = 'idle') AS idle,
       count(*) FILTER (WHERE state = 'idle in transaction') AS idle_in_txn,
       -- Idle sessions also report a wait event (ClientRead), so count
       -- only active backends that are blocked on something
       count(*) FILTER (WHERE state = 'active' AND wait_event IS NOT NULL) AS waiting
FROM pg_stat_activity
WHERE backend_type = 'client backend';
```
|
|
653
|
+
|
|
654
|
+
**Typical thresholds:**
|
|
655
|
+
- active/total > 90% for > 10 seconds: warning
|
|
656
|
+
- active/total = 100% for > 5 seconds: critical
|
|
657
|
+
- Avg connection wait time > 100 ms: warning
|
|
658
|
+
- Avg connection wait time > 1 second: critical
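
These thresholds translate directly into an alerting rule. The following is a minimal sketch, assuming the caller already tracks how long the current utilization has been sustained; `exhaustion_severity` and its parameter names are hypothetical.

```python
def exhaustion_severity(active, total, sustained_s, avg_wait_ms):
    """Map pool readings onto the warning/critical thresholds listed above.

    sustained_s is how long the current active/total ratio has held.
    """
    ratio = active / total if total else 0.0
    if (ratio >= 1.0 and sustained_s > 5) or avg_wait_ms > 1000:
        return "critical"
    if (ratio > 0.9 and sustained_s > 10) or avg_wait_ms > 100:
        return "warning"
    return "ok"


print(exhaustion_severity(active=20, total=20, sustained_s=6, avg_wait_ms=40))   # critical
print(exhaustion_severity(active=19, total=20, sustained_s=11, avg_wait_ms=40))  # warning
```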
|
|
659
|
+
|
|
660
|
+
### 2. Connection Leaks
|
|
661
|
+
|
|
662
|
+
**Symptom:** Pool slowly drains over hours/days. Active connection count
|
|
663
|
+
grows monotonically until exhaustion.
|
|
664
|
+
|
|
665
|
+
**Root cause:** Application code acquires a connection but never returns it
|
|
666
|
+
(missing finally block, exception before close, async code that drops the
|
|
667
|
+
connection reference).
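
The missing-finally case is the easiest one to rule out structurally. The sketch below uses a hypothetical `ToyPool` (a queue of string tokens standing in for real connections) to show a checkout wrapper that returns the connection even when the body raises.

```python
import queue
from contextlib import contextmanager


class ToyPool:
    """Hypothetical stand-in for a real pool: a fixed set of connection tokens."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self):
        return self._free.get_nowait()

    def release(self, conn):
        self._free.put(conn)

    def available(self):
        return self._free.qsize()


@contextmanager
def checked_out(pool):
    """Yield a connection and always return it, even if the body raises."""
    conn = pool.acquire()
    try:
        yield conn
    finally:
        pool.release(conn)


pool = ToyPool(2)
try:
    with checked_out(pool) as conn:
        raise RuntimeError("query failed")  # simulated error mid-request
except RuntimeError:
    pass
print(pool.available())  # 2: the connection came back despite the exception
```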
|
|
668
|
+
|
|
669
|
+
**Detection (HikariCP):**
|
|
670
|
+
```yaml
|
|
671
|
+
# Enable leak detection -- logs stack trace of the code that checked out
|
|
672
|
+
# a connection if it's held for longer than the threshold
|
|
673
|
+
leak-detection-threshold: 60000 # 60 seconds
|
|
674
|
+
```
|
|
675
|
+
|
|
676
|
+
**Detection (SQLAlchemy):**
|
|
677
|
+
```python
# Enable pool event logging
import logging
import time

from sqlalchemy import create_engine, event

logger = logging.getLogger(__name__)
engine = create_engine("postgresql://db/mydb")

@event.listens_for(engine, "checkout")
def on_checkout(dbapi_conn, connection_record, connection_proxy):
    # Record when the connection left the pool
    connection_record.info["checkout_time"] = time.time()

@event.listens_for(engine, "checkin")
def on_checkin(dbapi_conn, connection_record):
    # Warn if the connection was held for more than 60 seconds
    checkout_time = connection_record.info.pop("checkout_time", None)
    if checkout_time is not None:
        held = time.time() - checkout_time
        if held > 60:
            logger.warning("Connection held for %.1fs", held)
```
|
|
691
|
+
|
|
692
|
+
### 3. Oversized Pools
|
|
693
|
+
|
|
694
|
+
**Symptom:** High CPU on the database server despite moderate query load.
|
|
695
|
+
Increased lock contention. Higher average query latency than expected.
|
|
696
|
+
|
|
697
|
+
**Root cause:** Too many concurrent connections cause context switching
|
|
698
|
+
overhead and lock contention on shared PostgreSQL data structures (ProcArray,
|
|
699
|
+
lock tables, buffer pool LWLocks).
|
|
700
|
+
|
|
701
|
+
**Evidence from HikariCP benchmarks:** A pool of 200 connections on a 4-core
|
|
702
|
+
server produced ~100 ms average response times. Reducing to 10 connections
|
|
703
|
+
dropped latency to ~2 ms -- a 50x improvement with zero other changes.
|
|
704
|
+
|
|
705
|
+
**PostgreSQL internal contention points:**
|
|
706
|
+
- ProcArrayLock: Checked on every snapshot. More connections = more contention
|
|
707
|
+
- WAL insertion lock: Serializes write-ahead log writes
|
|
708
|
+
- Buffer mapping lock: Controls buffer pool page table
|
|
709
|
+
- Relation extension lock: Serializes table extension
|
|
710
|
+
|
|
711
|
+
### 4. Undersized Pools
|
|
712
|
+
|
|
713
|
+
**Symptom:** High connection wait times in the pool. Database server
utilization stays moderate while requests queue on the application side, and
throughput plateaus despite available database capacity.
|
|
716
|
+
|
|
717
|
+
**Detection:**
|
|
718
|
+
```
|
|
719
|
+
Pool wait time > average query time → pool is the bottleneck
|
|
720
|
+
Pool utilization at 100% but DB CPU < 50% → room to grow the pool
|
|
721
|
+
```
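
The two detection rules can be expressed as a small check over sampled metrics. This is a sketch only: `diagnose_pool` is a hypothetical helper, and utilization and CPU are assumed to be fractions between 0.0 and 1.0.

```python
def diagnose_pool(pool_wait_ms, avg_query_ms, pool_utilization, db_cpu):
    """Apply the two detection rules above; utilization and CPU are 0.0-1.0."""
    findings = []
    if pool_wait_ms > avg_query_ms:
        findings.append("pool is the bottleneck")
    if pool_utilization >= 1.0 and db_cpu < 0.5:
        findings.append("room to grow the pool")
    return findings


# Waiting 30 ms for a connection to run a 5 ms query, DB at 35% CPU
print(diagnose_pool(pool_wait_ms=30, avg_query_ms=5, pool_utilization=1.0, db_cpu=0.35))
```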
|
|
722
|
+
|
|
723
|
+
### 5. Idle-in-Transaction Connections
|
|
724
|
+
|
|
725
|
+
**Symptom:** Connections appear "active" from the pool's perspective but are
|
|
726
|
+
doing no work on the database. Other requests queue waiting for connections.
|
|
727
|
+
|
|
728
|
+
**Root cause:** Application opens a transaction, does non-database work
|
|
729
|
+
(HTTP calls, computation), then commits. The connection is held but idle.
|
|
730
|
+
|
|
731
|
+
**PostgreSQL detection and prevention:**
|
|
732
|
+
```sql
|
|
733
|
+
-- Find idle-in-transaction connections
|
|
734
|
+
SELECT pid, now() - xact_start AS txn_duration, query
|
|
735
|
+
FROM pg_stat_activity
|
|
736
|
+
WHERE state = 'idle in transaction'
|
|
737
|
+
AND now() - xact_start > interval '30 seconds';
|
|
738
|
+
|
|
739
|
+
-- Auto-terminate long idle-in-transaction (PostgreSQL 9.6+)
|
|
740
|
+
ALTER SYSTEM SET idle_in_transaction_session_timeout = '60s';
SELECT pg_reload_conf();  -- apply the new setting without a restart
|
|
741
|
+
```
|
|
742
|
+
|
|
743
|
+
---
|
|
744
|
+
|
|
745
|
+
## Anti-Patterns
|
|
746
|
+
|
|
747
|
+
### Anti-Pattern 1: Creating Connections Per Request
|
|
748
|
+
|
|
749
|
+
```python
import psycopg2
from sqlalchemy import create_engine, text

# BAD: New connection for every request
def handle_request(request):
    conn = psycopg2.connect(host="db", dbname="mydb")  # 5-20ms overhead
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users WHERE id = %s", (request.user_id,))
    result = cursor.fetchone()
    conn.close()  # Connection discarded
    return result

# GOOD: Reuse pooled connections
engine = create_engine("postgresql://db/mydb", pool_size=10)

def handle_request(request):
    with engine.connect() as conn:  # <0.1ms from pool
        result = conn.execute(
            text("SELECT * FROM users WHERE id = :id"),
            {"id": request.user_id},
        ).fetchone()
    return result  # Connection returned to pool automatically
```
|
|
768
|
+
|
|
769
|
+
**Cost at 1,000 RPS:**
|
|
770
|
+
- Per-request connections: 1,000 * 10 ms overhead = 10 seconds of connection-setup time per second of traffic
|
|
771
|
+
- Pooled: 1,000 * 0.05 ms = 50 ms overhead/second (200x reduction)
|
|
772
|
+
|
|
773
|
+
### Anti-Pattern 2: Pool Too Large
|
|
774
|
+
|
|
775
|
+
```yaml
# BAD: "More connections = more throughput" (wrong)
spring:
  datasource:
    hikari:
      maximum-pool-size: 200  # 4-core DB server cannot benefit from this

# GOOD: Right-sized for hardware
spring:
  datasource:
    hikari:
      maximum-pool-size: 10  # (4 cores * 2) + 1 SSD + 1 headroom
```
|
|
788
|
+
|
|
789
|
+
**Why large pools hurt:** With 200 connections on a 4-core server, at most 4
backends can execute at any instant, so up to 196 are waiting for CPU time. The OS scheduler
|
|
791
|
+
context-switches between them (each switch costs 2-5 microseconds plus cache
|
|
792
|
+
invalidation). The database's internal locking structures (ProcArrayLock,
|
|
793
|
+
buffer pool locks) experience higher contention with more concurrent
|
|
794
|
+
accessors.
|
|
795
|
+
|
|
796
|
+
### Anti-Pattern 3: No Connection Timeouts
|
|
797
|
+
|
|
798
|
+
```java
|
|
799
|
+
// BAD: Wait forever for a connection
|
|
800
|
+
HikariConfig config = new HikariConfig();
|
|
801
|
+
config.setConnectionTimeout(0); // Infinite wait -- threads pile up silently
|
|
802
|
+
|
|
803
|
+
// GOOD: Fail fast with actionable timeout
|
|
804
|
+
config.setConnectionTimeout(5000); // 5 seconds -- triggers alert, allows fallback
|
|
805
|
+
```
|
|
806
|
+
|
|
807
|
+
Without timeouts, pool exhaustion manifests as an ever-growing queue of
blocked threads. With a 500 req/s arrival rate and an exhausted pool, after
10 seconds you have up to 5,000 threads waiting silently. Memory usage spikes,
the JVM becomes unresponsive, and a restart is often the only recovery.
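
The same fail-fast behavior can be illustrated outside the JVM. The sketch below is not HikariCP: `BoundedPool` and `PoolTimeout` are hypothetical names, and the pool is a plain queue of tokens. It shows a checkout that gives up after a bounded wait so the caller can alert or fall back.

```python
import queue


class PoolTimeout(Exception):
    """Raised when no connection becomes free within the timeout."""


class BoundedPool:
    """Hypothetical minimal pool used only to illustrate checkout timeouts."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self, timeout_s):
        try:
            # Block for at most timeout_s instead of waiting forever
            return self._free.get(timeout=timeout_s)
        except queue.Empty:
            raise PoolTimeout(f"no connection free after {timeout_s}s") from None

    def release(self, conn):
        self._free.put(conn)


pool = BoundedPool(size=1)
held = pool.acquire(timeout_s=0.05)   # takes the only connection
try:
    pool.acquire(timeout_s=0.05)      # pool is exhausted
    outcome = "acquired"
except PoolTimeout:
    outcome = "timed out"             # caller can alert or fall back
pool.release(held)
print(outcome)  # timed out
```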
|
|
811
|
+
|
|
812
|
+
### Anti-Pattern 4: Not Using Connection Validation
|
|
813
|
+
|
|
814
|
+
```python
|
|
815
|
+
# BAD: Assume connections are always valid
|
|
816
|
+
engine = create_engine("postgresql://db/mydb")
|
|
817
|
+
# After a network blip or DB restart, stale connections throw errors
|
|
818
|
+
|
|
819
|
+
# GOOD: Pre-ping validates connections before use
|
|
820
|
+
engine = create_engine("postgresql://db/mydb", pool_pre_ping=True)
|
|
821
|
+
# Adds ~1ms per checkout but prevents "connection closed" errors
|
|
822
|
+
```
|
|
823
|
+
|
|
824
|
+
### Anti-Pattern 5: Holding Connections During Non-DB Work
|
|
825
|
+
|
|
826
|
+
```python
# BAD: Connection held during HTTP call (500ms+ of idle holding)
def process_order(order_id):
    conn = pool.acquire()
    order = conn.execute("SELECT * FROM orders WHERE id = %s", order_id)
    payment = http_client.charge(order.amount)  # 200-500ms external call!
    conn.execute("UPDATE orders SET paid = true WHERE id = %s", order_id)
    conn.release()

# GOOD: Release between DB operations
def process_order(order_id):
    with pool.acquire() as conn:
        order = conn.execute("SELECT * FROM orders WHERE id = %s", order_id)

    payment = http_client.charge(order.amount)  # No connection held

    with pool.acquire() as conn:
        conn.execute("UPDATE orders SET paid = true WHERE id = %s", order_id)
```
|
|
845
|
+
|
|
846
|
+
### Anti-Pattern 6: Ignoring Pool Metrics in Serverless
|
|
847
|
+
|
|
848
|
+
```python
|
|
849
|
+
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# BAD: Standard pool in AWS Lambda (connections leak across invocations)
|
|
850
|
+
engine = create_engine("postgresql://db/mydb", pool_size=5)
|
|
851
|
+
|
|
852
|
+
# GOOD: NullPool for serverless (no persistent connections)
|
|
853
|
+
engine = create_engine("postgresql://db/mydb", poolclass=NullPool)
|
|
854
|
+
|
|
855
|
+
# BETTER: Use RDS Proxy or PgBouncer as an external pooler
|
|
856
|
+
engine = create_engine("postgresql://rds-proxy-endpoint/mydb", poolclass=NullPool)
|
|
857
|
+
```
|
|
858
|
+
|
|
859
|
+
In serverless environments, Lambda containers are frozen and thawed
|
|
860
|
+
unpredictably. A QueuePool holds connections open during freeze, which
|
|
861
|
+
the database eventually kills. On thaw, the pool contains dead connections.
|
|
862
|
+
External poolers (RDS Proxy, PgBouncer) solve this by centralizing
|
|
863
|
+
connection management outside the ephemeral compute layer.
|
|
864
|
+
|
|
865
|
+
---
|
|
866
|
+
|
|
867
|
+
## Monitoring Pool Health
|
|
868
|
+
|
|
869
|
+
### Key Metrics to Track
|
|
870
|
+
|
|
871
|
+
| Metric                      | What It Tells You                | Alert Threshold       |
|-----------------------------|----------------------------------|-----------------------|
| active_connections          | Connections currently in use     | > 80% of pool_size    |
| idle_connections            | Connections available in pool    | < 10% of pool_size    |
| waiting_requests            | Threads waiting for a connection | > 0 for > 5 seconds   |
| connection_acquire_time_ms  | Time to get connection from pool | p99 > 100 ms          |
| connection_usage_time_ms    | How long connections are held    | p99 > 5000 ms         |
| connections_created_total   | Cumulative new connections       | Spike indicates churn |
| connections_timed_out_total | Checkout timeouts                | > 0                   |
| pool_size_current           | Current pool size (if dynamic)   | Trending at max       |
| connection_errors_total     | Failed connection attempts       | > 0                   |
|
|
882
|
+
|
|
883
|
+
### Prometheus + Grafana Setup
|
|
884
|
+
|
|
885
|
+
**HikariCP (auto-exported via Micrometer):**
|
|
886
|
+
|
|
887
|
+
```
|
|
888
|
+
hikaricp_connections_active{pool="myPool"} # Currently in use
|
|
889
|
+
hikaricp_connections_idle{pool="myPool"} # Available
|
|
890
|
+
hikaricp_connections_pending{pool="myPool"} # Waiting threads
|
|
891
|
+
hikaricp_connections_timeout_total{pool="myPool"} # Timeout events
|
|
892
|
+
hikaricp_connections_acquire_seconds{pool="myPool"} # Histogram
|
|
893
|
+
hikaricp_connections_usage_seconds{pool="myPool"} # Histogram
|
|
894
|
+
hikaricp_connections_creation_seconds{pool="myPool"} # New conn time
|
|
895
|
+
```
|
|
896
|
+
|
|
897
|
+
**PgBouncer (via SHOW STATS / SHOW POOLS):**
|
|
898
|
+
|
|
899
|
+
```sql
|
|
900
|
+
-- Run against PgBouncer admin console (port 6432)
|
|
901
|
+
SHOW POOLS;
|
|
902
|
+
-- Returns: database, user, cl_active, cl_waiting, sv_active, sv_idle,
|
|
903
|
+
-- sv_used, sv_tested, sv_login, maxwait, maxwait_us, pool_mode
|
|
904
|
+
|
|
905
|
+
SHOW STATS;
|
|
906
|
+
-- Returns: total_xact_count, total_query_count, total_received,
|
|
907
|
+
-- total_sent, total_xact_time, total_query_time, total_wait_time
|
|
908
|
+
```
|
|
909
|
+
|
|
910
|
+
**Expose PgBouncer metrics to Prometheus:**
|
|
911
|
+
|
|
912
|
+
```yaml
|
|
913
|
+
# pgbouncer_exporter config
|
|
914
|
+
pgbouncers:
|
|
915
|
+
- dsn: "postgres://pgbouncer:password@localhost:6432/pgbouncer"
|
|
916
|
+
```
|
|
917
|
+
|
|
918
|
+
### OpenTelemetry Instrumentation
|
|
919
|
+
|
|
920
|
+
```python
from opentelemetry import metrics

meter = metrics.get_meter("connection_pool")

pool_active = meter.create_up_down_counter(
    "db.pool.active_connections",
    description="Number of active connections",
)
pool_idle = meter.create_up_down_counter(
    "db.pool.idle_connections",
    description="Number of idle connections",
)
pool_wait_time = meter.create_histogram(
    "db.pool.wait_time",
    unit="ms",
    description="Time spent waiting for a connection",
)
```
|
|
939
|
+
|
|
940
|
+
### Health Check Query Pattern
|
|
941
|
+
|
|
942
|
+
```python
def check_pool_health(pool):
    stats = pool.get_stats()  # Implementation-specific

    health = {
        "status": "healthy",
        "active": stats.active,
        "idle": stats.idle,
        "waiting": stats.waiting,
        "utilization": stats.active / stats.max_size,
        "avg_acquire_ms": stats.avg_acquire_time_ms,
    }

    # Checks run from least to most severe so the worst condition wins
    if health["utilization"] > 0.9:
        health["status"] = "warning"
        health["message"] = "Pool utilization > 90%"
    if health["avg_acquire_ms"] > 100:
        health["status"] = "degraded"
        health["message"] = f"Avg acquire time {stats.avg_acquire_time_ms}ms"
    if health["waiting"] > 0:
        health["status"] = "critical"
        health["message"] = f"{stats.waiting} requests waiting for connections"

    return health
```
|
|
969
|
+
|
|
970
|
+
---
|
|
971
|
+
|
|
972
|
+
## Before/After Benchmarks
|
|
973
|
+
|
|
974
|
+
### Benchmark 1: PostgreSQL Direct vs. PgBouncer (Transaction Mode)
|
|
975
|
+
|
|
976
|
+
**Setup:** PostgreSQL 16, 8 vCPU / 16 GB RAM, pgbench TPC-B workload
|
|
977
|
+
|
|
978
|
+
| Concurrent Clients | Direct TPS | PgBouncer TPS | Improvement      |
|--------------------|------------|---------------|------------------|
| 10                 | 12,500     | 11,800        | -5.6% (overhead) |
| 50                 | 24,000     | 22,100        | -7.9% (overhead) |
| 100                | 18,000     | 38,000        | +111%            |
| 200                | 8,000      | 44,000        | +450%            |
| 500                | Errors     | 43,500        | N/A              |
| 1,000              | Errors     | 42,000        | N/A              |
| 5,000              | N/A        | 40,500        | N/A              |
|
|
987
|
+
|
|
988
|
+
**Key insight:** At 50 clients and below, PgBouncer adds overhead (proxy latency).
From 100 clients upward, pooling is essential for stability and throughput. At
200 clients, pooling delivers 450% more throughput. At 500 clients and above,
direct connections to PostgreSQL fail entirely.
|
|
992
|
+
|
|
993
|
+
### Benchmark 2: HikariCP Pool Size Optimization
|
|
994
|
+
|
|
995
|
+
**Setup:** Spring Boot app, 4-core DB server, OLTP workload, 500 concurrent users
|
|
996
|
+
|
|
997
|
+
| Pool Size | Avg Response Time | p99 Response Time | Throughput (RPS) |
|-----------|-------------------|-------------------|------------------|
| 5         | 8 ms              | 45 ms             | 4,200            |
| 10        | 2 ms              | 12 ms             | 5,800            |
| 20        | 4 ms              | 25 ms             | 5,500            |
| 50        | 15 ms             | 120 ms            | 4,800            |
| 100       | 45 ms             | 350 ms            | 3,200            |
| 200       | 100 ms            | 800 ms            | 1,800            |
|
|
1005
|
+
|
|
1006
|
+
**Key insight:** Optimal pool size (10) matches the formula: (4 * 2) + 1 + 1 = 10.
|
|
1007
|
+
Doubling the pool to 20 adds 2 ms average latency. Going to 200 connections
|
|
1008
|
+
makes the system 50x slower due to contention. The relationship between pool
|
|
1009
|
+
size and performance is not linear -- there is a sharp optimum.
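
The comparison between the formula and the measured optimum can be checked mechanically. The sketch below copies the rows of the Benchmark 2 table into a list; `formula_pool_size` and its `headroom` parameter are illustrative names for the formula quoted in the key insight.

```python
def formula_pool_size(cores, effective_spindles=1, headroom=0):
    """(cores * 2) + effective_spindle_count, plus optional headroom."""
    return cores * 2 + effective_spindles + headroom


# (pool_size, avg_ms, p99_ms, rps) rows copied from the Benchmark 2 table
ROWS = [
    (5, 8, 45, 4200), (10, 2, 12, 5800), (20, 4, 25, 5500),
    (50, 15, 120, 4800), (100, 45, 350, 3200), (200, 100, 800, 1800),
]

best = max(ROWS, key=lambda row: row[3])           # row with highest throughput
predicted = formula_pool_size(cores=4, headroom=1)
print(best[0], predicted)                          # 10 10

slowdown = ROWS[-1][1] / best[1]                   # avg latency at 200 vs. best
print(slowdown)                                    # 50.0
```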
|
|
1010
|
+
|
|
1011
|
+
### Benchmark 3: Redis With vs. Without Connection Pooling
|
|
1012
|
+
|
|
1013
|
+
**Setup:** Redis 7.x, Python redis-py client, mixed GET/SET workload
|
|
1014
|
+
|
|
1015
|
+
| Configuration          | Ops/sec | Avg Latency | p99 Latency | CPU Usage |
|------------------------|---------|-------------|-------------|-----------|
| New connection per op  | 35,000  | 2.82 ms     | 8.5 ms      | 45%       |
| Pool (10 connections)  | 310,000 | 0.32 ms     | 1.2 ms      | 22%       |
| Pool (50 connections)  | 475,000 | 0.21 ms     | 0.8 ms      | 35%       |
| Pool (200 connections) | 460,000 | 0.24 ms     | 1.1 ms      | 42%       |
|
|
1021
|
+
|
|
1022
|
+
**Key insight:** Connection pooling improves Redis throughput by up to 13.6x (at 50 connections).
|
|
1023
|
+
Beyond 50 connections, adding more provides no throughput benefit and slightly
|
|
1024
|
+
increases latency and CPU due to connection management overhead.
|
|
1025
|
+
|
|
1026
|
+
### Benchmark 4: HTTP Connection Reuse Impact
|
|
1027
|
+
|
|
1028
|
+
**Setup:** Microservice-to-microservice calls, same AWS region, TLS 1.3
|
|
1029
|
+
|
|
1030
|
+
| Configuration              | RPS   | Avg Latency | CPU (client) | Connections/sec |
|----------------------------|-------|-------------|--------------|-----------------|
| No keep-alive              | 4,000 | 85 ms       | 72%          | 4,000           |
| Keep-alive (pool=10)       | 6,200 | 62 ms       | 45%          | ~2              |
| Keep-alive (pool=50)       | 7,721 | 51 ms       | 38%          | ~5              |
| HTTP/2 (single connection) | 8,100 | 48 ms       | 32%          | 1               |
|
|
1036
|
+
|
|
1037
|
+
**Key insight:** HTTP keep-alive nearly doubles throughput and cuts CPU usage
|
|
1038
|
+
in half by eliminating TLS handshake overhead. HTTP/2 multiplexing achieves
|
|
1039
|
+
the best results with a single connection per origin.
|
|
1040
|
+
|
|
1041
|
+
### Benchmark 5: Production Case Study -- 500 Errors to 99.9% Uptime
|
|
1042
|
+
|
|
1043
|
+
**Setup:** E-commerce platform, 10 microservices, PostgreSQL backend
|
|
1044
|
+
|
|
1045
|
+
| Metric               | Before (no pooling strategy) | After (tuned pooling) |
|----------------------|------------------------------|-----------------------|
| Response time avg    | 150 ms                       | 12 ms                 |
| DB CPU usage         | 80%                          | 15%                   |
| Error rate (5xx)     | 2.3%                         | 0.01%                 |
| Uptime               | 97.2%                        | 99.95%                |
| Max concurrent users | 500                          | 8,000                 |
| DB connections       | 2,000 (direct)               | 120 (pooled)          |
|
|
1053
|
+
|
|
1054
|
+
Changes made: Added PgBouncer in transaction mode, reduced pool_size per
|
|
1055
|
+
service from 50 to 10, added connection timeout of 5s, added pool health
|
|
1056
|
+
monitoring, set idle_in_transaction_session_timeout to 30s.
|
|
1057
|
+
|
|
1058
|
+
---
|
|
1059
|
+
|
|
1060
|
+
## Decision Tree
|
|
1061
|
+
|
|
1062
|
+
```
How Should I Size My Connection Pool?
======================================

START
  |
  v
What database engine?
  |
  ├── PostgreSQL ──────────────────────────────────────────────────┐
  │                                                                |
  │   How many CPU cores on the DB server?                         |
  │    |                                                           |
  │    v                                                           |
  │   base = (cores * 2) + 1         [SSD]                         |
  │   base = (cores * 2) + spindles  [HDD]                         |
  │   base = (cores * 2)             [fully cached]                |
  │    |                                                           |
  │    v                                                           |
  │   Are you using a connection pooler (PgBouncer/PgCat)?         |
  │    |                                                           |
  │    ├── No ──> pool_size = base                                 |
  │    │          max_connections = pool_size * app_instances * 1.1|
  │    │                                                           |
  │    └── Yes ─> PgBouncer default_pool_size = base               |
  │               App pool_size = base * 2-3 (pooler handles       |
  │               the actual DB connection limit)                  |
  │               max_connections = PgBouncer pool_size * 1.1      |
  │                                                                |
  ├── MySQL ───────────────────────────────────────────────────────┤
  │   Similar formula, but MySQL handles more concurrent           |
  │   connections than PostgreSQL (thread vs. process model)       |
  │   base = (cores * 2) + 1, but can go up to (cores * 4)         |
  │                                                                |
  └── Redis ───────────────────────────────────────────────────────┘
      pool_size = 10-50 for most workloads
      Start at 10, increase if pool utilization > 80%
      Rarely need > 100 (Redis is single-threaded)

THEN VERIFY:
  |
  v
Calculate total connections across all app instances:
  total = pool_size * instance_count

Is total > max_connections * 0.9?
  |
  ├── Yes ──> REDUCE pool_size per instance, OR
  │           ADD a connection pooler (PgBouncer), OR
  │           INCREASE max_connections (with memory check:
  │           each connection needs 5-10 MB)
  │
  └── No ──> Good. Continue to load testing.

THEN LOAD TEST:
  |
  v
Run load test at expected peak traffic (e.g., 2x normal)
  |
Monitor:
  ├── Pool wait time < 10ms at p99?
  │   ├── Yes ──> Pool size is adequate
  │   └── No ──> Increase pool_size by 25%, retest
  │
  ├── DB CPU > 80% during test?
  │   ├── Yes ──> Pool is too large OR queries need optimization
  │   │           Reduce pool_size, add indexes, optimize queries
  │   └── No ──> Continue
  │
  ├── Connection errors occurring?
  │   ├── Yes ──> Check max_connections, increase if headroom exists
  │   └── No ──> Continue
  │
  └── Response time acceptable at target RPS?
      ├── Yes ──> DONE. Lock in these settings.
      └── No ──> Profile queries, check for lock contention,
                 consider read replicas or caching layer

SPECIAL CASES:
  |
  ├── Serverless (Lambda/Cloud Functions):
  │   Use NullPool in app + external pooler (RDS Proxy, PgBouncer)
  │   External pooler pool_size = (cores * 2) + 1
  │
  ├── Kubernetes (many pods):
  │   pool_size_per_pod = max_db_connections / (max_pod_count * 1.2)
  │   Use PgBouncer sidecar or centralized pooler
  │
  └── Multi-region:
      Use regional poolers to terminate connections locally
      Cross-region pool_size should be minimal (high RTT connections
      are expensive and hold server connections longer)
```
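
The VERIFY step and the Kubernetes special case of the tree reduce to two small calculations. The sketch below is one possible encoding: the function names are hypothetical, and rounding the per-pod size down to a whole connection is an assumption the tree does not state.

```python
def verify_total(pool_size, instance_count, max_connections):
    """VERIFY step: return (total, fits) where fits means total is not
    above 90% of max_connections. Integer math avoids float rounding."""
    total = pool_size * instance_count
    return total, total * 10 <= max_connections * 9


def pool_size_per_pod(max_db_connections, max_pod_count):
    """Kubernetes case: max_db_connections / (max_pod_count * 1.2),
    rounded down to a whole connection (rounding is an assumption)."""
    return max_db_connections * 10 // (max_pod_count * 12)


print(verify_total(pool_size=10, instance_count=12, max_connections=200))  # (120, True)
print(verify_total(pool_size=10, instance_count=20, max_connections=200))  # (200, False)
print(pool_size_per_pod(max_db_connections=200, max_pod_count=20))         # 8
```

When `fits` is false, the tree's three options apply: reduce the per-instance pool, add a pooler, or raise `max_connections` after checking memory.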
|
|
1155
|
+
|
|
1156
|
+
---
|
|
1157
|
+
|
|
1158
|
+
## Quick Reference Card
|
|
1159
|
+
|
|
1160
|
+
```
|
|
1161
|
+
POOL SIZING FORMULA:
|
|
1162
|
+
connections = (CPU_cores * 2) + effective_spindle_count
|
|
1163
|
+
|
|
1164
|
+
GOLDEN RULES:
|
|
1165
|
+
1. Fewer connections is almost always better than more
|
|
1166
|
+
2. Total connections across all apps < max_connections * 0.9
|
|
1167
|
+
3. Set connection timeouts (5-30s) -- never wait forever
|
|
1168
|
+
4. Monitor pool utilization -- alert at 80%, critical at 95%
|
|
1169
|
+
5. Use transaction pooling mode unless you need session features
|
|
1170
|
+
6. Enable leak detection in development and staging
|
|
1171
|
+
7. Pre-ping / validate connections before checkout
|
|
1172
|
+
8. Release connections immediately after DB work (not after HTTP calls)
|
|
1173
|
+
9. Use external poolers (PgBouncer) for serverless and high-connection-count
|
|
1174
|
+
10. Load test with your actual pool settings before production
|
|
1175
|
+
|
|
1176
|
+
EMERGENCY CHECKLIST (Pool Exhaustion):
|
|
1177
|
+
[ ] Check active vs. max connections (SHOW POOLS / pool stats)
|
|
1178
|
+
[ ] Look for idle-in-transaction connections (pg_stat_activity)
|
|
1179
|
+
[ ] Check for connection leaks (leak detection threshold)
|
|
1180
|
+
[ ] Verify connection timeout is set (not infinite)
|
|
1181
|
+
[ ] Check if slow queries are holding connections (pg_stat_activity)
|
|
1182
|
+
[ ] Temporarily increase pool size as a band-aid
|
|
1183
|
+
[ ] Restart application if connections are truly leaked
|
|
1184
|
+
```
|
|
1185
|
+
|
|
1186
|
+
---
|
|
1187
|
+
|
|
1188
|
+
## Sources
|
|
1189
|
+
|
|
1190
|
+
- [HikariCP Pool Sizing Wiki](https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing) -- Original pool sizing formula and benchmark data showing 50x latency improvement from right-sizing
|
|
1191
|
+
- [Vlad Mihalcea: Optimal Connection Pool Size](https://vladmihalcea.com/optimal-connection-pool-size/) -- Derivation and testing of the PostgreSQL pool sizing formula
|
|
1192
|
+
- [PgBouncer vs PgCat vs Odyssey Benchmarks 2025](https://onidel.com/blog/postgresql-proxy-comparison-2025) -- Comparative benchmarks on 8 vCPU with PostgreSQL 16
|
|
1193
|
+
- [Tembo: Benchmarking PostgreSQL Connection Poolers](https://legacy.tembo.io/blog/postgres-connection-poolers/) -- PgBouncer vs PgCat vs Supavisor throughput comparisons
|
|
1194
|
+
- [Percona: PgBouncer for PostgreSQL](https://www.percona.com/blog/pgbouncer-for-postgresql-how-connection-pooling-solves-enterprise-slowdowns/) -- Enterprise connection pooling patterns and performance data
|
|
1195
|
+
- [Measuring PostgreSQL Connection Memory Overhead](https://blog.anarazel.de/2020/10/07/measuring-the-memory-overhead-of-a-postgres-connection/) -- Detailed memory measurements per connection (1.3-7.6 MiB)
|
|
1196
|
+
- [AWS: Resources Consumed by Idle PostgreSQL Connections](https://aws.amazon.com/blogs/database/resources-consumed-by-idle-postgresql-connections/) -- Impact of idle connections on PostgreSQL performance
|
|
1197
|
+
- [Stack Overflow: Improve Database Performance with Connection Pooling](https://stackoverflow.blog/2020/10/14/improve-database-performance-with-connection-pooling/) -- Session vs transaction vs statement pooling explained
|
|
1198
|
+
- [Redis: Connection Pools and Multiplexing](https://redis.io/docs/latest/develop/clients/pools-and-muxing/) -- Official Redis pooling and multiplexing guidance
|
|
1199
|
+
- [AWS: Best Practices for Redis Clients and ElastiCache](https://aws.amazon.com/blogs/database/best-practices-redis-clients-and-amazon-elasticache-for-redis/) -- Production Redis connection management
|
|
1200
|
+
- [Microsoft DevBlogs: The Art of HTTP Connection Pooling](https://devblogs.microsoft.com/premier-developer/the-art-of-http-connection-pooling-how-to-optimize-your-connections-for-peak-performance/) -- HTTP keep-alive and connection reuse performance data
|
|
1201
|
+
- [HAProxy: HTTP Keep-Alive, Pipelining, Multiplexing & Connection Pooling](https://www.haproxy.com/blog/http-keep-alive-pipelining-multiplexing-and-connection-pooling) -- Protocol-level connection reuse mechanisms
|
|
1202
|
+
- [Lob: Stop Wasting Connections, Use HTTP Keep-Alive](https://www.lob.com/blog/use-http-keep-alive) -- 50% throughput improvement from keep-alive
|
|
1203
|
+
- [SQLAlchemy 2.1: Connection Pooling Documentation](https://docs.sqlalchemy.org/en/21/core/pooling.html) -- QueuePool configuration and pool types
|
|
1204
|
+
- [Cybertec: Types of PostgreSQL Connection Pooling](https://www.cybertec-postgresql.com/en/pgbouncer-types-of-postgresql-connection-pooling/) -- Detailed comparison of PgBouncer pooling modes
|
|
1205
|
+
- [Connection Pool Exhaustion: The Silent Killer](https://howtech.substack.com/p/connection-pool-exhaustion-the-silent) -- Debugging and prevention strategies
|
|
1206
|
+
- [OneUptime: Trace Connection Pool Exhaustion with OpenTelemetry](https://oneuptime.com/blog/post/2026-02-06-trace-database-connection-pool-exhaustion-opentelemetry-metrics/view) -- OpenTelemetry instrumentation for pool monitoring
|
|
1207
|
+
- [Medium: Database Connection Pool Optimization -- From 500 Errors to 99.9% Uptime](https://medium.com/@shahharsh172/database-connection-pool-optimization-from-500-errors-to-99-9-uptime-9deb985f5164) -- Production case study with before/after metrics
|
|
1208
|
+
- [ThousandEyes: Optimizing Web Performance with TLS 1.3](https://www.thousandeyes.com/blog/optimizing-web-performance-tls-1-3) -- TLS handshake latency measurements
|
|
1209
|
+
- [SystemOverflow: TLS Handshake Latency Across Protocol Versions](https://www.systemoverflow.com/learn/networking-protocols/http-protocols/tls-handshake-latency-the-critical-path-tax-across-protocol-versions) -- RTT costs for TLS 1.2 vs 1.3 vs QUIC
|