@wazir-dev/cli 1.0.0
This diff shows the contents of publicly released package versions as they appear in their public registries. It is provided for informational purposes only.
- package/AGENTS.md +111 -0
- package/CHANGELOG.md +14 -0
- package/CONTRIBUTING.md +101 -0
- package/LICENSE +21 -0
- package/README.md +314 -0
- package/assets/composition-engine.mmd +34 -0
- package/assets/demo-script.sh +17 -0
- package/assets/logo-dark.svg +14 -0
- package/assets/logo.svg +14 -0
- package/assets/pipeline.mmd +39 -0
- package/assets/record-demo.sh +51 -0
- package/docs/README.md +51 -0
- package/docs/adapters/context-mode.md +60 -0
- package/docs/concepts/architecture.md +87 -0
- package/docs/concepts/artifact-model.md +60 -0
- package/docs/concepts/composition-engine.md +36 -0
- package/docs/concepts/indexing-and-recall.md +160 -0
- package/docs/concepts/observability.md +41 -0
- package/docs/concepts/roles-and-workflows.md +59 -0
- package/docs/concepts/terminology-policy.md +27 -0
- package/docs/getting-started/01-installation.md +78 -0
- package/docs/getting-started/02-first-run.md +102 -0
- package/docs/getting-started/03-adding-to-project.md +15 -0
- package/docs/getting-started/04-host-setup.md +15 -0
- package/docs/guides/ci-integration.md +15 -0
- package/docs/guides/creating-skills.md +15 -0
- package/docs/guides/expertise-module-authoring.md +15 -0
- package/docs/guides/hook-development.md +15 -0
- package/docs/guides/memory-and-learnings.md +34 -0
- package/docs/guides/multi-host-export.md +15 -0
- package/docs/guides/troubleshooting.md +101 -0
- package/docs/guides/writing-custom-roles.md +15 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
- package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
- package/docs/readmes/INDEX.md +99 -0
- package/docs/readmes/features/expertise/README.md +171 -0
- package/docs/readmes/features/exports/README.md +222 -0
- package/docs/readmes/features/hooks/README.md +103 -0
- package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
- package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
- package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
- package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
- package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
- package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
- package/docs/readmes/features/hooks/session-start.md +119 -0
- package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
- package/docs/readmes/features/roles/README.md +157 -0
- package/docs/readmes/features/roles/clarifier.md +152 -0
- package/docs/readmes/features/roles/content-author.md +190 -0
- package/docs/readmes/features/roles/designer.md +193 -0
- package/docs/readmes/features/roles/executor.md +184 -0
- package/docs/readmes/features/roles/learner.md +210 -0
- package/docs/readmes/features/roles/planner.md +182 -0
- package/docs/readmes/features/roles/researcher.md +164 -0
- package/docs/readmes/features/roles/reviewer.md +184 -0
- package/docs/readmes/features/roles/specifier.md +162 -0
- package/docs/readmes/features/roles/verifier.md +215 -0
- package/docs/readmes/features/schemas/README.md +178 -0
- package/docs/readmes/features/skills/README.md +63 -0
- package/docs/readmes/features/skills/brainstorming.md +96 -0
- package/docs/readmes/features/skills/debugging.md +148 -0
- package/docs/readmes/features/skills/design.md +120 -0
- package/docs/readmes/features/skills/prepare-next.md +109 -0
- package/docs/readmes/features/skills/run-audit.md +159 -0
- package/docs/readmes/features/skills/scan-project.md +109 -0
- package/docs/readmes/features/skills/self-audit.md +176 -0
- package/docs/readmes/features/skills/tdd.md +137 -0
- package/docs/readmes/features/skills/using-skills.md +92 -0
- package/docs/readmes/features/skills/verification.md +120 -0
- package/docs/readmes/features/skills/writing-plans.md +104 -0
- package/docs/readmes/features/tooling/README.md +320 -0
- package/docs/readmes/features/workflows/README.md +186 -0
- package/docs/readmes/features/workflows/author.md +181 -0
- package/docs/readmes/features/workflows/clarify.md +154 -0
- package/docs/readmes/features/workflows/design-review.md +171 -0
- package/docs/readmes/features/workflows/design.md +169 -0
- package/docs/readmes/features/workflows/discover.md +162 -0
- package/docs/readmes/features/workflows/execute.md +173 -0
- package/docs/readmes/features/workflows/learn.md +167 -0
- package/docs/readmes/features/workflows/plan-review.md +165 -0
- package/docs/readmes/features/workflows/plan.md +170 -0
- package/docs/readmes/features/workflows/prepare-next.md +167 -0
- package/docs/readmes/features/workflows/review.md +169 -0
- package/docs/readmes/features/workflows/run-audit.md +191 -0
- package/docs/readmes/features/workflows/spec-challenge.md +159 -0
- package/docs/readmes/features/workflows/specify.md +160 -0
- package/docs/readmes/features/workflows/verify.md +177 -0
- package/docs/readmes/packages/README.md +50 -0
- package/docs/readmes/packages/ajv.md +117 -0
- package/docs/readmes/packages/context-mode.md +118 -0
- package/docs/readmes/packages/gray-matter.md +116 -0
- package/docs/readmes/packages/node-test.md +137 -0
- package/docs/readmes/packages/yaml.md +112 -0
- package/docs/reference/configuration-reference.md +159 -0
- package/docs/reference/expertise-index.md +52 -0
- package/docs/reference/git-flow.md +43 -0
- package/docs/reference/hooks.md +87 -0
- package/docs/reference/host-exports.md +50 -0
- package/docs/reference/launch-checklist.md +172 -0
- package/docs/reference/marketplace-listings.md +76 -0
- package/docs/reference/release-process.md +34 -0
- package/docs/reference/roles-reference.md +77 -0
- package/docs/reference/skills.md +33 -0
- package/docs/reference/templates.md +29 -0
- package/docs/reference/tooling-cli.md +94 -0
- package/docs/truth-claims.yaml +222 -0
- package/expertise/PROGRESS.md +63 -0
- package/expertise/README.md +18 -0
- package/expertise/antipatterns/PROGRESS.md +56 -0
- package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
- package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
- package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
- package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
- package/expertise/antipatterns/backend/index.md +24 -0
- package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
- package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
- package/expertise/antipatterns/code/async-antipatterns.md +622 -0
- package/expertise/antipatterns/code/code-smells.md +1186 -0
- package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
- package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
- package/expertise/antipatterns/code/index.md +27 -0
- package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
- package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
- package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
- package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
- package/expertise/antipatterns/design/dark-patterns.md +1121 -0
- package/expertise/antipatterns/design/index.md +22 -0
- package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
- package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
- package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
- package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
- package/expertise/antipatterns/frontend/index.md +23 -0
- package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
- package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
- package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
- package/expertise/antipatterns/index.md +31 -0
- package/expertise/antipatterns/performance/index.md +20 -0
- package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
- package/expertise/antipatterns/performance/premature-optimization.md +623 -0
- package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
- package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
- package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
- package/expertise/antipatterns/process/index.md +23 -0
- package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
- package/expertise/antipatterns/security/index.md +20 -0
- package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
- package/expertise/antipatterns/security/security-theater.md +843 -0
- package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
- package/expertise/architecture/PROGRESS.md +70 -0
- package/expertise/architecture/data/caching-architecture.md +671 -0
- package/expertise/architecture/data/data-consistency.md +574 -0
- package/expertise/architecture/data/data-modeling.md +536 -0
- package/expertise/architecture/data/event-streams-and-queues.md +634 -0
- package/expertise/architecture/data/index.md +25 -0
- package/expertise/architecture/data/search-architecture.md +663 -0
- package/expertise/architecture/data/sql-vs-nosql.md +708 -0
- package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
- package/expertise/architecture/decisions/build-vs-buy.md +616 -0
- package/expertise/architecture/decisions/index.md +23 -0
- package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
- package/expertise/architecture/decisions/technology-selection.md +616 -0
- package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
- package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
- package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
- package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
- package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
- package/expertise/architecture/distributed/index.md +25 -0
- package/expertise/architecture/distributed/saga-pattern.md +797 -0
- package/expertise/architecture/foundations/architectural-thinking.md +460 -0
- package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
- package/expertise/architecture/foundations/design-principles-solid.md +649 -0
- package/expertise/architecture/foundations/domain-driven-design.md +719 -0
- package/expertise/architecture/foundations/index.md +25 -0
- package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
- package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
- package/expertise/architecture/index.md +34 -0
- package/expertise/architecture/integration/api-design-graphql.md +638 -0
- package/expertise/architecture/integration/api-design-grpc.md +804 -0
- package/expertise/architecture/integration/api-design-rest.md +892 -0
- package/expertise/architecture/integration/index.md +25 -0
- package/expertise/architecture/integration/third-party-integration.md +795 -0
- package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
- package/expertise/architecture/integration/websockets-realtime.md +791 -0
- package/expertise/architecture/mobile-architecture/index.md +22 -0
- package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
- package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
- package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
- package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
- package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
- package/expertise/architecture/patterns/event-driven.md +797 -0
- package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
- package/expertise/architecture/patterns/index.md +27 -0
- package/expertise/architecture/patterns/layered-architecture.md +736 -0
- package/expertise/architecture/patterns/microservices.md +753 -0
- package/expertise/architecture/patterns/modular-monolith.md +692 -0
- package/expertise/architecture/patterns/monolith.md +626 -0
- package/expertise/architecture/patterns/plugin-architecture.md +735 -0
- package/expertise/architecture/patterns/serverless.md +780 -0
- package/expertise/architecture/scaling/database-scaling.md +615 -0
- package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
- package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
- package/expertise/architecture/scaling/index.md +24 -0
- package/expertise/architecture/scaling/multi-tenancy.md +800 -0
- package/expertise/architecture/scaling/stateless-design.md +787 -0
- package/expertise/backend/embedded-firmware.md +625 -0
- package/expertise/backend/go.md +853 -0
- package/expertise/backend/index.md +24 -0
- package/expertise/backend/java-spring.md +448 -0
- package/expertise/backend/node-typescript.md +625 -0
- package/expertise/backend/python-fastapi.md +724 -0
- package/expertise/backend/rust.md +458 -0
- package/expertise/backend/solidity.md +711 -0
- package/expertise/composition-map.yaml +443 -0
- package/expertise/content/foundations/content-modeling.md +395 -0
- package/expertise/content/foundations/editorial-standards.md +449 -0
- package/expertise/content/foundations/index.md +24 -0
- package/expertise/content/foundations/microcopy.md +455 -0
- package/expertise/content/foundations/terminology-governance.md +509 -0
- package/expertise/content/index.md +34 -0
- package/expertise/content/patterns/accessibility-copy.md +518 -0
- package/expertise/content/patterns/index.md +24 -0
- package/expertise/content/patterns/notification-content.md +433 -0
- package/expertise/content/patterns/sample-content.md +486 -0
- package/expertise/content/patterns/state-copy.md +439 -0
- package/expertise/design/PROGRESS.md +58 -0
- package/expertise/design/disciplines/dark-mode-theming.md +577 -0
- package/expertise/design/disciplines/design-systems.md +595 -0
- package/expertise/design/disciplines/index.md +25 -0
- package/expertise/design/disciplines/information-architecture.md +800 -0
- package/expertise/design/disciplines/interaction-design.md +788 -0
- package/expertise/design/disciplines/responsive-design.md +552 -0
- package/expertise/design/disciplines/usability-testing.md +516 -0
- package/expertise/design/disciplines/user-research.md +792 -0
- package/expertise/design/foundations/accessibility-design.md +796 -0
- package/expertise/design/foundations/color-theory.md +797 -0
- package/expertise/design/foundations/iconography.md +795 -0
- package/expertise/design/foundations/index.md +26 -0
- package/expertise/design/foundations/motion-and-animation.md +653 -0
- package/expertise/design/foundations/rtl-design.md +585 -0
- package/expertise/design/foundations/spacing-and-layout.md +607 -0
- package/expertise/design/foundations/typography.md +800 -0
- package/expertise/design/foundations/visual-hierarchy.md +761 -0
- package/expertise/design/index.md +32 -0
- package/expertise/design/patterns/authentication-flows.md +474 -0
- package/expertise/design/patterns/content-consumption.md +789 -0
- package/expertise/design/patterns/data-display.md +618 -0
- package/expertise/design/patterns/e-commerce.md +1494 -0
- package/expertise/design/patterns/feedback-and-states.md +642 -0
- package/expertise/design/patterns/forms-and-input.md +819 -0
- package/expertise/design/patterns/gamification.md +801 -0
- package/expertise/design/patterns/index.md +31 -0
- package/expertise/design/patterns/microinteractions.md +449 -0
- package/expertise/design/patterns/navigation.md +800 -0
- package/expertise/design/patterns/notifications.md +705 -0
- package/expertise/design/patterns/onboarding.md +700 -0
- package/expertise/design/patterns/search-and-filter.md +601 -0
- package/expertise/design/patterns/settings-and-preferences.md +768 -0
- package/expertise/design/patterns/social-and-community.md +748 -0
- package/expertise/design/platforms/desktop-native.md +612 -0
- package/expertise/design/platforms/index.md +25 -0
- package/expertise/design/platforms/mobile-android.md +825 -0
- package/expertise/design/platforms/mobile-cross-platform.md +983 -0
- package/expertise/design/platforms/mobile-ios.md +699 -0
- package/expertise/design/platforms/tablet.md +794 -0
- package/expertise/design/platforms/web-dashboard.md +790 -0
- package/expertise/design/platforms/web-responsive.md +550 -0
- package/expertise/design/psychology/behavioral-nudges.md +449 -0
- package/expertise/design/psychology/cognitive-load.md +1191 -0
- package/expertise/design/psychology/error-psychology.md +778 -0
- package/expertise/design/psychology/index.md +22 -0
- package/expertise/design/psychology/persuasive-design.md +736 -0
- package/expertise/design/psychology/user-mental-models.md +623 -0
- package/expertise/design/tooling/open-pencil.md +266 -0
- package/expertise/frontend/angular.md +1073 -0
- package/expertise/frontend/desktop-electron.md +546 -0
- package/expertise/frontend/flutter.md +782 -0
- package/expertise/frontend/index.md +27 -0
- package/expertise/frontend/native-android.md +409 -0
- package/expertise/frontend/native-ios.md +490 -0
- package/expertise/frontend/react-native.md +1160 -0
- package/expertise/frontend/react.md +808 -0
- package/expertise/frontend/vue.md +1089 -0
- package/expertise/humanize/domain-rules-code.md +79 -0
- package/expertise/humanize/domain-rules-content.md +67 -0
- package/expertise/humanize/domain-rules-technical-docs.md +56 -0
- package/expertise/humanize/index.md +35 -0
- package/expertise/humanize/self-audit-checklist.md +87 -0
- package/expertise/humanize/sentence-patterns.md +218 -0
- package/expertise/humanize/vocabulary-blacklist.md +105 -0
- package/expertise/i18n/PROGRESS.md +65 -0
- package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
- package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
- package/expertise/i18n/advanced/complex-scripts.md +30 -0
- package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
- package/expertise/i18n/advanced/testing-i18n.md +28 -0
- package/expertise/i18n/content/content-adaptation.md +23 -0
- package/expertise/i18n/content/locale-specific-formatting.md +23 -0
- package/expertise/i18n/content/machine-translation-integration.md +28 -0
- package/expertise/i18n/content/translation-management.md +29 -0
- package/expertise/i18n/foundations/date-time-calendars.md +67 -0
- package/expertise/i18n/foundations/i18n-architecture.md +272 -0
- package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
- package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
- package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
- package/expertise/i18n/foundations/string-externalization.md +236 -0
- package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
- package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
- package/expertise/i18n/index.md +38 -0
- package/expertise/i18n/platform/backend-i18n.md +31 -0
- package/expertise/i18n/platform/flutter-i18n.md +148 -0
- package/expertise/i18n/platform/native-android-i18n.md +36 -0
- package/expertise/i18n/platform/native-ios-i18n.md +36 -0
- package/expertise/i18n/platform/react-i18n.md +103 -0
- package/expertise/i18n/platform/web-css-i18n.md +81 -0
- package/expertise/i18n/rtl/arabic-specific.md +175 -0
- package/expertise/i18n/rtl/hebrew-specific.md +149 -0
- package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
- package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
- package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
- package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
- package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
- package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
- package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
- package/expertise/i18n/rtl/rtl-typography.md +160 -0
- package/expertise/index.md +113 -0
- package/expertise/index.yaml +216 -0
- package/expertise/infrastructure/cloud-aws.md +597 -0
- package/expertise/infrastructure/cloud-gcp.md +599 -0
- package/expertise/infrastructure/cybersecurity.md +816 -0
- package/expertise/infrastructure/database-mongodb.md +447 -0
- package/expertise/infrastructure/database-postgres.md +400 -0
- package/expertise/infrastructure/devops-cicd.md +787 -0
- package/expertise/infrastructure/index.md +27 -0
- package/expertise/performance/PROGRESS.md +50 -0
- package/expertise/performance/backend/api-latency.md +1204 -0
- package/expertise/performance/backend/background-jobs.md +506 -0
- package/expertise/performance/backend/connection-pooling.md +1209 -0
- package/expertise/performance/backend/database-query-optimization.md +515 -0
- package/expertise/performance/backend/index.md +23 -0
- package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
- package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
- package/expertise/performance/foundations/caching-strategies.md +489 -0
- package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
- package/expertise/performance/foundations/index.md +24 -0
- package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
- package/expertise/performance/foundations/memory-management.md +964 -0
- package/expertise/performance/foundations/performance-budgets.md +1314 -0
- package/expertise/performance/index.md +31 -0
- package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
- package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
- package/expertise/performance/infrastructure/index.md +22 -0
- package/expertise/performance/infrastructure/load-balancing.md +1081 -0
- package/expertise/performance/infrastructure/observability.md +1079 -0
- package/expertise/performance/mobile/index.md +23 -0
- package/expertise/performance/mobile/mobile-animations.md +544 -0
- package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
- package/expertise/performance/mobile/mobile-network.md +452 -0
- package/expertise/performance/mobile/mobile-rendering.md +599 -0
- package/expertise/performance/mobile/mobile-startup-time.md +505 -0
- package/expertise/performance/platform-specific/flutter-performance.md +647 -0
- package/expertise/performance/platform-specific/index.md +22 -0
- package/expertise/performance/platform-specific/node-performance.md +1307 -0
- package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
- package/expertise/performance/platform-specific/react-performance.md +1403 -0
- package/expertise/performance/web/bundle-optimization.md +1239 -0
- package/expertise/performance/web/image-and-media.md +636 -0
- package/expertise/performance/web/index.md +24 -0
- package/expertise/performance/web/network-optimization.md +1133 -0
- package/expertise/performance/web/rendering-performance.md +1098 -0
- package/expertise/performance/web/ssr-and-hydration.md +918 -0
- package/expertise/performance/web/web-vitals.md +1374 -0
- package/expertise/quality/accessibility.md +985 -0
- package/expertise/quality/evidence-based-verification.md +499 -0
- package/expertise/quality/index.md +24 -0
- package/expertise/quality/ml-model-audit.md +614 -0
- package/expertise/quality/performance.md +600 -0
- package/expertise/quality/testing-api.md +891 -0
- package/expertise/quality/testing-mobile.md +496 -0
- package/expertise/quality/testing-web.md +849 -0
- package/expertise/security/PROGRESS.md +54 -0
- package/expertise/security/agentic-identity.md +540 -0
- package/expertise/security/compliance-frameworks.md +601 -0
- package/expertise/security/data/data-encryption.md +364 -0
- package/expertise/security/data/data-privacy-gdpr.md +692 -0
- package/expertise/security/data/database-security.md +1171 -0
- package/expertise/security/data/index.md +22 -0
- package/expertise/security/data/pii-handling.md +531 -0
- package/expertise/security/foundations/authentication.md +1041 -0
- package/expertise/security/foundations/authorization.md +603 -0
- package/expertise/security/foundations/cryptography.md +1001 -0
- package/expertise/security/foundations/index.md +25 -0
- package/expertise/security/foundations/owasp-top-10.md +1354 -0
- package/expertise/security/foundations/secrets-management.md +1217 -0
- package/expertise/security/foundations/secure-sdlc.md +700 -0
- package/expertise/security/foundations/supply-chain-security.md +698 -0
- package/expertise/security/index.md +31 -0
- package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
- package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
- package/expertise/security/infrastructure/container-security.md +721 -0
- package/expertise/security/infrastructure/incident-response.md +1295 -0
- package/expertise/security/infrastructure/index.md +24 -0
- package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
- package/expertise/security/infrastructure/network-security.md +1337 -0
- package/expertise/security/mobile/index.md +23 -0
- package/expertise/security/mobile/mobile-android-security.md +1218 -0
- package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
- package/expertise/security/mobile/mobile-data-storage.md +1265 -0
- package/expertise/security/mobile/mobile-ios-security.md +1401 -0
- package/expertise/security/mobile/mobile-network-security.md +1520 -0
- package/expertise/security/smart-contract-security.md +594 -0
- package/expertise/security/testing/index.md +22 -0
- package/expertise/security/testing/penetration-testing.md +1258 -0
- package/expertise/security/testing/security-code-review.md +1765 -0
- package/expertise/security/testing/threat-modeling.md +1074 -0
- package/expertise/security/testing/vulnerability-scanning.md +1062 -0
- package/expertise/security/web/api-security.md +586 -0
- package/expertise/security/web/cors-and-headers.md +433 -0
- package/expertise/security/web/csrf.md +562 -0
- package/expertise/security/web/file-upload.md +1477 -0
- package/expertise/security/web/index.md +25 -0
- package/expertise/security/web/injection.md +1375 -0
- package/expertise/security/web/session-management.md +1101 -0
- package/expertise/security/web/xss.md +1158 -0
- package/exports/README.md +17 -0
- package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
- package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
- package/exports/hosts/claude/.claude/agents/designer.md +55 -0
- package/exports/hosts/claude/.claude/agents/executor.md +55 -0
- package/exports/hosts/claude/.claude/agents/learner.md +51 -0
- package/exports/hosts/claude/.claude/agents/planner.md +53 -0
- package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
- package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
- package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
- package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
- package/exports/hosts/claude/.claude/commands/author.md +42 -0
- package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
- package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
- package/exports/hosts/claude/.claude/commands/design.md +44 -0
- package/exports/hosts/claude/.claude/commands/discover.md +37 -0
- package/exports/hosts/claude/.claude/commands/execute.md +48 -0
- package/exports/hosts/claude/.claude/commands/learn.md +38 -0
- package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
- package/exports/hosts/claude/.claude/commands/plan.md +39 -0
- package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
- package/exports/hosts/claude/.claude/commands/review.md +40 -0
- package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
- package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
- package/exports/hosts/claude/.claude/commands/specify.md +38 -0
- package/exports/hosts/claude/.claude/commands/verify.md +37 -0
- package/exports/hosts/claude/.claude/settings.json +34 -0
- package/exports/hosts/claude/CLAUDE.md +19 -0
- package/exports/hosts/claude/export.manifest.json +38 -0
- package/exports/hosts/claude/host-package.json +67 -0
- package/exports/hosts/codex/AGENTS.md +19 -0
- package/exports/hosts/codex/export.manifest.json +38 -0
- package/exports/hosts/codex/host-package.json +41 -0
- package/exports/hosts/cursor/.cursor/hooks.json +16 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
- package/exports/hosts/cursor/export.manifest.json +38 -0
- package/exports/hosts/cursor/host-package.json +42 -0
- package/exports/hosts/gemini/GEMINI.md +19 -0
- package/exports/hosts/gemini/export.manifest.json +38 -0
- package/exports/hosts/gemini/host-package.json +41 -0
- package/hooks/README.md +18 -0
- package/hooks/definitions/loop_cap_guard.yaml +21 -0
- package/hooks/definitions/post_tool_capture.yaml +24 -0
- package/hooks/definitions/pre_compact_summary.yaml +19 -0
- package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
- package/hooks/definitions/protected_path_write_guard.yaml +19 -0
- package/hooks/definitions/session_start.yaml +19 -0
- package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
- package/hooks/loop-cap-guard +17 -0
- package/hooks/post-tool-lint +36 -0
- package/hooks/protected-path-write-guard +17 -0
- package/hooks/session-start +41 -0
- package/llms-full.txt +2355 -0
- package/llms.txt +43 -0
- package/package.json +79 -0
- package/roles/README.md +20 -0
- package/roles/clarifier.md +42 -0
- package/roles/content-author.md +63 -0
- package/roles/designer.md +55 -0
- package/roles/executor.md +55 -0
- package/roles/learner.md +51 -0
- package/roles/planner.md +53 -0
- package/roles/researcher.md +43 -0
- package/roles/reviewer.md +54 -0
- package/roles/specifier.md +47 -0
- package/roles/verifier.md +71 -0
- package/schemas/README.md +24 -0
- package/schemas/accepted-learning.schema.json +20 -0
- package/schemas/author-artifact.schema.json +156 -0
- package/schemas/clarification.schema.json +19 -0
- package/schemas/design-artifact.schema.json +80 -0
- package/schemas/docs-claim.schema.json +18 -0
- package/schemas/export-manifest.schema.json +20 -0
- package/schemas/hook.schema.json +67 -0
- package/schemas/host-export-package.schema.json +18 -0
- package/schemas/implementation-plan.schema.json +19 -0
- package/schemas/proposed-learning.schema.json +19 -0
- package/schemas/research.schema.json +18 -0
- package/schemas/review.schema.json +29 -0
- package/schemas/run-manifest.schema.json +18 -0
- package/schemas/spec-challenge.schema.json +18 -0
- package/schemas/spec.schema.json +20 -0
- package/schemas/usage.schema.json +102 -0
- package/schemas/verification-proof.schema.json +29 -0
- package/schemas/wazir-manifest.schema.json +173 -0
- package/skills/README.md +40 -0
- package/skills/brainstorming/SKILL.md +77 -0
- package/skills/debugging/SKILL.md +50 -0
- package/skills/design/SKILL.md +61 -0
- package/skills/dispatching-parallel-agents/SKILL.md +128 -0
- package/skills/executing-plans/SKILL.md +70 -0
- package/skills/finishing-a-development-branch/SKILL.md +169 -0
- package/skills/humanize/SKILL.md +123 -0
- package/skills/init-pipeline/SKILL.md +124 -0
- package/skills/prepare-next/SKILL.md +20 -0
- package/skills/receiving-code-review/SKILL.md +123 -0
- package/skills/requesting-code-review/SKILL.md +105 -0
- package/skills/requesting-code-review/code-reviewer.md +108 -0
- package/skills/run-audit/SKILL.md +197 -0
- package/skills/scan-project/SKILL.md +41 -0
- package/skills/self-audit/SKILL.md +153 -0
- package/skills/subagent-driven-development/SKILL.md +154 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
- package/skills/subagent-driven-development/implementer-prompt.md +102 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/skills/tdd/SKILL.md +23 -0
- package/skills/using-git-worktrees/SKILL.md +163 -0
- package/skills/using-skills/SKILL.md +95 -0
- package/skills/verification/SKILL.md +22 -0
- package/skills/wazir/SKILL.md +463 -0
- package/skills/writing-plans/SKILL.md +30 -0
- package/skills/writing-skills/SKILL.md +157 -0
- package/skills/writing-skills/anthropic-best-practices.md +122 -0
- package/skills/writing-skills/persuasion-principles.md +50 -0
- package/templates/README.md +20 -0
- package/templates/artifacts/README.md +10 -0
- package/templates/artifacts/accepted-learning.md +19 -0
- package/templates/artifacts/accepted-learning.template.json +12 -0
- package/templates/artifacts/author.md +74 -0
- package/templates/artifacts/author.template.json +19 -0
- package/templates/artifacts/clarification.md +21 -0
- package/templates/artifacts/clarification.template.json +12 -0
- package/templates/artifacts/execute-notes.md +19 -0
- package/templates/artifacts/implementation-plan.md +21 -0
- package/templates/artifacts/implementation-plan.template.json +11 -0
- package/templates/artifacts/learning-proposal.md +19 -0
- package/templates/artifacts/next-run-handoff.md +21 -0
- package/templates/artifacts/plan-review.md +19 -0
- package/templates/artifacts/proposed-learning.template.json +12 -0
- package/templates/artifacts/research.md +21 -0
- package/templates/artifacts/research.template.json +12 -0
- package/templates/artifacts/review-findings.md +19 -0
- package/templates/artifacts/review.template.json +11 -0
- package/templates/artifacts/run-manifest.template.json +8 -0
- package/templates/artifacts/spec-challenge.md +19 -0
- package/templates/artifacts/spec-challenge.template.json +11 -0
- package/templates/artifacts/spec.md +21 -0
- package/templates/artifacts/spec.template.json +12 -0
- package/templates/artifacts/verification-proof.md +19 -0
- package/templates/artifacts/verification-proof.template.json +11 -0
- package/templates/examples/accepted-learning.example.json +14 -0
- package/templates/examples/author.example.json +152 -0
- package/templates/examples/clarification.example.json +15 -0
- package/templates/examples/docs-claim.example.json +8 -0
- package/templates/examples/export-manifest.example.json +7 -0
- package/templates/examples/host-export-package.example.json +11 -0
- package/templates/examples/implementation-plan.example.json +17 -0
- package/templates/examples/proposed-learning.example.json +13 -0
- package/templates/examples/research.example.json +15 -0
- package/templates/examples/research.example.md +6 -0
- package/templates/examples/review.example.json +17 -0
- package/templates/examples/run-manifest.example.json +9 -0
- package/templates/examples/spec-challenge.example.json +14 -0
- package/templates/examples/spec.example.json +21 -0
- package/templates/examples/verification-proof.example.json +21 -0
- package/templates/examples/wazir-manifest.example.yaml +65 -0
- package/templates/task-definition-schema.md +99 -0
- package/tooling/README.md +20 -0
- package/tooling/src/adapters/context-mode.js +50 -0
- package/tooling/src/capture/command.js +376 -0
- package/tooling/src/capture/store.js +99 -0
- package/tooling/src/capture/usage.js +270 -0
- package/tooling/src/checks/branches.js +50 -0
- package/tooling/src/checks/brand-truth.js +110 -0
- package/tooling/src/checks/changelog.js +231 -0
- package/tooling/src/checks/command-registry.js +36 -0
- package/tooling/src/checks/commits.js +102 -0
- package/tooling/src/checks/docs-drift.js +103 -0
- package/tooling/src/checks/docs-truth.js +201 -0
- package/tooling/src/checks/runtime-surface.js +156 -0
- package/tooling/src/cli.js +116 -0
- package/tooling/src/command-options.js +56 -0
- package/tooling/src/commands/validate.js +320 -0
- package/tooling/src/doctor/command.js +91 -0
- package/tooling/src/export/command.js +77 -0
- package/tooling/src/export/compiler.js +498 -0
- package/tooling/src/guards/loop-cap-guard.js +52 -0
- package/tooling/src/guards/protected-path-write-guard.js +67 -0
- package/tooling/src/index/command.js +152 -0
- package/tooling/src/index/storage.js +1061 -0
- package/tooling/src/index/summarizers.js +261 -0
- package/tooling/src/loaders.js +18 -0
- package/tooling/src/project-root.js +22 -0
- package/tooling/src/recall/command.js +225 -0
- package/tooling/src/schema-validator.js +30 -0
- package/tooling/src/state-root.js +40 -0
- package/tooling/src/status/command.js +71 -0
- package/wazir.manifest.yaml +135 -0
- package/workflows/README.md +19 -0
- package/workflows/author.md +42 -0
- package/workflows/clarify.md +38 -0
- package/workflows/design-review.md +46 -0
- package/workflows/design.md +44 -0
- package/workflows/discover.md +37 -0
- package/workflows/execute.md +48 -0
- package/workflows/learn.md +38 -0
- package/workflows/plan-review.md +42 -0
- package/workflows/plan.md +39 -0
- package/workflows/prepare-next.md +37 -0
- package/workflows/review.md +40 -0
- package/workflows/run-audit.md +41 -0
- package/workflows/spec-challenge.md +41 -0
- package/workflows/specify.md +38 -0
- package/workflows/verify.md +37 -0
|
@@ -0,0 +1,964 @@
# Memory Management Performance Expertise Module

> **Scope**: Hardware memory hierarchy, garbage collection strategies and tuning,
> memory leak detection, cache-friendly data structures, allocation patterns,
> and diagnostic workflows.
>
> **Audience**: Performance engineers, backend developers, and SREs responsible
> for latency-sensitive or memory-constrained services.

---

## 1. Memory Hierarchy: The Numbers That Matter

Every performance decision starts with understanding the cost of reaching data.
Modern processors rely on a multi-level cache hierarchy; a miss at any level
forces the CPU to wait for the next, slower tier.

| Level | Typical Size | Latency (cycles) | Latency (approx.) | Bandwidth |
|-------|-------------|-------------------|--------------------|-----------|
| L1 Cache | 32-64 KB per core (up to 192 KB on Apple M-series) | 4 cycles | ~1 ns | ~1 TB/s |
| L2 Cache | 256 KB - 1 MB per core | 12 cycles | ~3-4 ns | ~500 GB/s |
| L3 Cache | 8-64 MB shared | 30-40 cycles | ~10-12 ns | ~200 GB/s |
| Main RAM (DDR5) | 16-512 GB | 200+ cycles | ~60-100 ns | ~50-100 GB/s |
| NVMe SSD | TB-scale | - | ~10-100 us | ~7 GB/s |
| Network (same DC) | - | - | ~500 us | ~12.5 GB/s (100GbE) |

Source: CPU cache data from [CPU cache - Wikipedia](https://en.wikipedia.org/wiki/CPU_cache);
cache line measurements from [Daniel Lemire's blog](https://lemire.me/blog/2023/12/12/measuring-the-size-of-the-cache-line-empirically/);
modern processor specs from [StoredBits](https://storedbits.com/cpu-cache-l1-l2-l3/).

### 1.1 Cache Line Fundamentals

- Standard cache line size: **64 bytes** on x86/AMD64 processors.
- Apple ARM64 (M1/M2/M3/M4): **128-byte** cache lines.
- IBM POWER7-9: **128-byte** cache lines; IBM s390x: **256-byte** cache lines.
- All memory transfers between cache levels happen in full cache-line units.
  Accessing a single byte loads the entire 64-byte (or 128-byte) line.

**Practical implication**: If a struct is 72 bytes on x86, it spans two cache
lines. Reading the whole struct, or any field that straddles the boundary, costs
two cache-line fetches. Shrinking the struct to 64 bytes or less and aligning it
to a line boundary, or reordering fields so the hot ones share one line, avoids
the second fetch.
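
The cache-line arithmetic can be checked with a small sketch. The helper name `lines_touched` and the example offsets are illustrative assumptions, not part of any library; the 64-byte line size is the x86 figure from the list above.

```python
CACHE_LINE = 64  # bytes; assumed x86-64 line size


def lines_touched(offset: int, size: int, line: int = CACHE_LINE) -> int:
    """Number of cache lines covered by `size` bytes starting at byte `offset`."""
    if size <= 0:
        return 0
    first = offset // line
    last = (offset + size - 1) // line
    return last - first + 1


# A 72-byte struct that starts on a line boundary covers two lines.
whole_struct = lines_touched(0, 72)
# An 8-byte field at offset 60 straddles the boundary between lines 0 and 1.
straddling_field = lines_touched(60, 8)
# The same field moved to offset 56 stays inside line 0.
aligned_field = lines_touched(56, 8)
print(whole_struct, straddling_field, aligned_field)
```

Only accesses that cross the 64-byte boundary pay for the second line, which is why field order matters as much as total size.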

### 1.2 TLB and Page Size

Translation Lookaside Buffer (TLB) misses add 7-30 ns per access. Standard 4 KB
pages mean a 256 MB working set requires 65,536 TLB entries. Using 2 MB huge
pages reduces that to 128 entries, dramatically reducing TLB misses. Linux
Transparent Huge Pages (THP) can automate this but may cause latency spikes from
compaction; explicit `madvise(MADV_HUGEPAGE)` gives finer control.
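
The entry counts above follow from dividing the working set by the page size. A minimal sketch of that arithmetic, where `pages_needed` is a hypothetical helper name:

```python
KIB = 1024
MIB = 1024 * KIB


def pages_needed(working_set_bytes: int, page_bytes: int) -> int:
    """Pages (and therefore TLB entries) needed to map the working set, rounded up."""
    return -(-working_set_bytes // page_bytes)  # ceiling division


working_set = 256 * MIB
small_pages = pages_needed(working_set, 4 * KIB)  # 4 KB pages
huge_pages = pages_needed(working_set, 2 * MIB)   # 2 MB huge pages
print(small_pages, huge_pages)
```

The same helper can be used to estimate whether a working set fits within the TLB reach of a given CPU once its entry count is known.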

---

## 2. Garbage Collection Strategies

### 2.1 Core GC Approaches

| Strategy | Mechanism | Pause Characteristic | Throughput Overhead |
|----------|-----------|---------------------|---------------------|
| **Reference Counting** | Decrement on release; free at zero | No GC pauses; deallocation inline | Low, but cycles require backup tracing |
| **Mark-and-Sweep** | Trace from roots; sweep unmarked | Stop-the-world; proportional to heap | Low overhead between collections |
| **Mark-and-Compact** | Trace, then relocate live objects | Longer pauses; eliminates fragmentation | Moderate; compaction is expensive |
| **Copying (Semi-space)** | Copy live objects to new space | Fast for young gen; half memory wasted | Very low for short-lived objects |
| **Generational** | Separate young/old spaces | Frequent minor pauses; rare major | Best overall throughput for typical workloads |
| **Concurrent** | GC runs alongside application | Sub-ms pauses; requires barriers | 5-15% CPU overhead for GC threads |
| **Incremental** | GC work interleaved with application | Bounded pauses per increment | Moderate; write barriers add cost |

### 2.2 Generational GC Hypothesis

Most objects die young. Empirical measurements across languages consistently show
that 80-98% of objects become unreachable within their first GC cycle. Generational
collectors exploit this by:

1. Allocating into a small **nursery** (young generation).
2. Running frequent, fast minor collections on the nursery.
3. **Promoting** survivors to older generations collected less frequently.
4. Running major (full) collections only when older generations fill up.

---

## 3. GC Tuning by Runtime

### 3.1 JVM (Java / Kotlin / Scala)

The JVM offers the most mature GC ecosystem. Choose based on your primary
constraint (pick two of three: latency, throughput, footprint).

#### Collector Comparison (Java 21+)

| Collector | Typical Pause | Max Heap | CPU Overhead | Best For |
|-----------|---------------|----------|--------------|----------|
| **G1GC** (default) | 10-200 ms | Multi-TB | Baseline | General purpose, balanced |
| **ZGC** | 0.1-1 ms | 16 TB | +5-10% CPU | Ultra-low latency |
| **Shenandoah** | 1-10 ms | TB-scale | +5-15% CPU | Low latency, Red Hat ecosystem |
| **Parallel GC** | 50-500 ms | Large heaps | Lowest CPU | Batch processing, throughput |
| **Serial GC** | Varies | Small heaps | Minimal threads | Containers with 1 vCPU |

Sources: [Java Code Geeks - G1 vs ZGC vs Shenandoah](https://www.javacodegeeks.com/2025/08/java-gc-performance-g1-vs-zgc-vs-shenandoah.html);
[Gunnar Morling - Lower Java Tail Latencies With ZGC](https://www.morling.dev/blog/lower-java-tail-latencies-with-zgc/);
[Datadog - Deep dive into Java GC](https://www.datadoghq.com/blog/understanding-java-gc/).

**ZGC benchmark highlight**: In production testing, ZGC achieved pause times
consistently under 0.5 ms regardless of heap size (tested up to 128 GB), with
occasional spikes to ~1 ms under extreme allocation pressure. G1GC on the same
workload showed pauses exceeding 20 ms.

#### Key JVM GC Flags

```bash
# G1GC with 200ms pause target
-XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xmx8g -Xms8g

# ZGC for sub-millisecond pauses (Java 21+)
-XX:+UseZGC -XX:+ZGenerational -Xmx16g -Xms16g

# Shenandoah (OpenJDK / Red Hat builds)
-XX:+UseShenandoahGC -Xmx8g -Xms8g

# Diagnostic flags (always enable in production)
-Xlog:gc*:file=gc.log:time,uptime,level,tags:filecount=10,filesize=100m
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/dumps/
```

#### JVM Tuning Checklist

1. **Set -Xms equal to -Xmx** to avoid heap resizing pauses.
2. **Size young generation** via `-XX:NewRatio` or `-XX:NewSize`. More young-gen
   space reduces promotion rate and major GC frequency.
3. **Monitor promotion rate**: If objects promote too quickly, young gen is too
   small or objects live too long.
4. **GC log analysis**: Use GCEasy or GCViewer to identify pause time
   distributions, allocation rates, and promotion rates.
5. **Avoid finalizers**: They add a full GC cycle delay to reclamation and
   create GC pressure. Use `Cleaner` or try-with-resources instead.

Source: [LinkedIn Engineering - GC Optimization](https://engineering.linkedin.com/garbage-collection/garbage-collection-optimization-high-throughput-and-low-latency-java-applications);
[Atlassian - GC Tuning Guide](https://confluence.atlassian.com/enterprise/garbage-collection-gc-tuning-guide-461504616.html).

### 3.2 V8 / Node.js

V8's **Orinoco** garbage collector uses a generational approach with parallel,
concurrent, and incremental techniques to minimize pauses.

#### Memory Spaces

| Space | Purpose | Default Size |
|-------|---------|-------------|
| **New Space (Young Gen)** | Short-lived objects; semi-space copying | 16 MB (2 x 8 MB semi-spaces) |
| **Old Space** | Promoted long-lived objects; mark-sweep-compact | Up to ~1.5 GB (64-bit default) |
| **Large Object Space** | Objects > 512 KB; never moved | Part of old space budget |
| **Code Space** | JIT-compiled machine code | Variable |

#### Key V8/Node.js Tuning Flags

```bash
# Increase heap for memory-intensive apps (in MB)
node --max-old-space-size=4096 app.js

# Increase semi-space for high allocation rates (in MB)
# Measured 18% throughput improvement in one benchmark
node --max-semi-space-size=128 app.js

# Enable GC tracing for diagnostics
node --trace-gc app.js

# Expose GC for manual triggering (use sparingly)
node --expose-gc app.js
```

**Semi-space tuning impact**: Increasing `--max-semi-space-size` from default
(8 MB) to 128 MB reduced Old Space promotion rates significantly, yielding an
**18% throughput improvement** in high-allocation workloads by reducing the
frequency of expensive Old Space GC cycles.

Source: [Platformatic - V8 GC Optimization](https://blog.platformatic.dev/optimizing-nodejs-performance-v8-memory-management-and-gc-tuning);
[V8 Blog - Trash Talk: Orinoco GC](https://v8.dev/blog/trash-talk);
[Node.js issue #42511](https://github.com/nodejs/node/issues/42511).

### 3.3 Go

Go uses a **concurrent, tri-color mark-and-sweep** collector without generational
separation. GC pauses are typically under 1 ms, achieved through concurrent
marking and sweeping with write barriers.

#### GOGC and GOMEMLIMIT

| Variable | Default | Effect |
|----------|---------|--------|
| `GOGC` | 100 | Triggers GC when new allocations reach this % of live heap. GOGC=200 means collect once the total heap reaches roughly 3x the live heap (live heap plus 200% of new allocation). |
| `GOMEMLIMIT` | None | Soft cap on total Go runtime memory. GC becomes aggressive near this limit. |

**Trade-offs**:
- **GOGC=50**: More frequent GC, lower peak memory, higher CPU overhead, shorter pauses.
- **GOGC=200**: Less frequent GC, higher peak memory, lower CPU overhead.
- **GOMEMLIMIT**: Prevents OOM in containers but risks "death spiral" if live heap
  approaches the limit (GC runs continuously, consuming all CPU).
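
The effect of `GOGC` on peak heap size can be estimated with the target-heap formula from the Go GC Guide cited in this section: target heap = live heap + (live heap + GC roots) x GOGC / 100. The sketch below only evaluates that formula; the function name and the 100 MiB live heap are illustrative assumptions, and `GOMEMLIMIT` is not modeled.

```python
def gogc_target_heap(live_heap_mib: float, gogc: int, gc_roots_mib: float = 0.0) -> float:
    """Heap size (MiB) at which the next GC cycle is expected to finish.

    Follows the Go GC Guide formula:
    target = live heap + (live heap + GC roots) * GOGC / 100
    """
    return live_heap_mib + (live_heap_mib + gc_roots_mib) * gogc / 100


live = 100.0  # MiB of live heap after the previous cycle (assumed)
for setting in (50, 100, 200):
    print(f"GOGC={setting}: target heap ~{gogc_target_heap(live, setting):.0f} MiB")
```

Halving `GOGC` from 100 to 50 cuts the headroom above the live heap in half, which is where the lower peak memory and the extra GC cycles both come from.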

**Production case study**: An ad platform handling tens of thousands of requests
per second saw P99 latency spike to 50 ms with default GOGC=100. Tuning to
GOGC=50 brought P99 under 20 ms. Uber implemented dynamic GC tuning across Go
services and saved tens of thousands of CPU cores.

Source: [Go GC Guide](https://go.dev/doc/gc-guide);
[Go Performance Guide - GC](https://goperf.dev/01-common-patterns/gc/);
[Go issue #68346](https://github.com/golang/go/issues/68346).

#### Go Memory Optimization Patterns

```go
package main

import "sync"

type Record struct{ ID int }

// Use sync.Pool for frequently allocated/deallocated objects.
// Storing *[]byte avoids an extra heap allocation for the slice header on every Put.
var bufPool = sync.Pool{
	New: func() interface{} {
		buf := make([]byte, 0, 4096)
		return &buf
	},
}

func process(expectedSize int) []Record {
	// Pre-allocate slices to avoid repeated growth
	data := make([]Record, 0, expectedSize)

	// Reduce allocations: pass pointers, reuse buffers
	bufp := bufPool.Get().(*[]byte)
	defer func() { *bufp = (*bufp)[:0]; bufPool.Put(bufp) }()
	*bufp = append(*bufp, "scratch"...)
	return data
}

func main() { _ = process(1024) }
```

### 3.4 Python (CPython)

CPython uses **reference counting** as its primary mechanism, supplemented by a
**cyclic garbage collector** for reference cycles.

#### Generation Thresholds

| Generation | Default Threshold | Collected When |
|------------|-------------------|----------------|
| Gen 0 | 700 allocations | allocs - deallocs > 700 |
| Gen 1 | 10 Gen-0 collections | Every 10th Gen-0 run |
| Gen 2 | 10 Gen-1 collections | Every 10th Gen-1 run |

#### Tuning Options

```python
import gc

# View current thresholds
print(gc.get_threshold())  # (700, 10, 10)

# Adjust for workload with many long-lived objects
gc.set_threshold(1000, 15, 15)

# Disable GC for performance-critical section
gc.disable()
# ... hot loop with many temporary objects ...
gc.collect()  # Manual sweep
gc.enable()

# Freeze long-lived objects to skip them in future scans (Python 3.7+)
gc.freeze()  # All currently tracked objects become exempt
```

**gc.freeze() use case**: In Django/Flask applications with large in-memory
caches, calling `gc.freeze()` after initialization can reduce GC scan time
by 30-50% because the collector skips frozen objects entirely.

Source: [Python gc module docs](https://docs.python.org/3/library/gc.html);
[Artem Golubin - Python GC internals](https://rushter.com/blog/python-garbage-collector/).

---

## 4. Memory Leak Patterns and Detection

### 4.1 Common Leak Patterns by Language

| Pattern | Languages | Example | Impact |
|---------|-----------|---------|--------|
| **Forgotten event listeners** | JS, Java, C# | `emitter.on('data', fn)` never removed | Objects retained by listener closure |
| **Unbounded caches** | All | Map grows indefinitely without eviction | Heap grows linearly with requests |
| **Closures capturing scope** | JS, Python | Lambda captures parent scope variables | Entire scope tree retained |
| **Static/global references** | Java, C#, Go | Static `List<>` accumulates entries | Never eligible for GC |
| **Circular references** | Python, ObjC | A -> B -> A with no weak refs | CPython: only cyclic GC can collect |
| **Unreleased native resources** | Java, .NET | DB connections, file handles not closed | OS resource exhaustion + heap retention |
| **Detached DOM nodes** | JavaScript | DOM removed but JS reference remains | Entire subtree retained in memory |
| **Goroutine leaks** | Go | Goroutine blocks on channel forever | 2-8 KB stack per goroutine, unbounded |
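
The "unbounded caches" row is usually fixed by adding an eviction policy. As one possible sketch, the class below bounds a map with least-recently-used eviction; `BoundedCache` and the limit of 1000 entries are hypothetical choices for illustration, and the standard library's `functools.lru_cache` covers the common function-memoization case.

```python
from collections import OrderedDict


class BoundedCache:
    """LRU cache that evicts the least recently used entry once `max_entries` is exceeded."""

    def __init__(self, max_entries: int):
        if max_entries <= 0:
            raise ValueError("max_entries must be positive")
        self._max = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._max:
            self._data.popitem(last=False)  # drop the least recently used entry

    def __len__(self):
        return len(self._data)


cache = BoundedCache(max_entries=1000)
for request_id in range(100_000):
    cache.put(request_id, {"payload": request_id})
print(len(cache))  # bounded by max_entries, however many requests arrive
```

With the bound in place, heap usage from the cache stops growing with request count and depends only on `max_entries` and the entry size.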

### 4.2 Detection Techniques

#### Heap Snapshot Comparison (The Gold Standard)

The most reliable technique for finding leaks is comparing heap snapshots
taken at different points in time.

**Process**:
1. Take **Snapshot A** (baseline, after warmup).
2. Execute the suspected leaking operation N times.
3. Force a GC cycle.
4. Take **Snapshot B**.
5. Compare: objects with **increasing count or retained size** between A and B
   are strong leak candidates.

**Key metrics**:
- **Shallow size**: Memory consumed by the object itself.
- **Retained size**: Memory freed if this object were garbage collected
  (includes all exclusively-referenced children). This is the critical metric
  for leak analysis.

Source: [Chrome DevTools - Heap Snapshots](https://developer.chrome.com/docs/devtools/memory-problems/heap-snapshots);
[Halodoc - Fix Node.js Memory Leaks](https://blogs.halodoc.io/fix-node-js-memory-leaks/).

#### Tools by Runtime

| Runtime | Tool | Capability |
|---------|------|-----------|
| **JVM** | VisualVM, Eclipse MAT, async-profiler | Heap dumps, allocation tracking, flame graphs |
| **JVM** | Java Flight Recorder (JFR) | Low-overhead production profiling (~2% CPU) |
| **JVM** | jcmd + jmap | Command-line heap dumps and histograms |
| **Node.js** | Chrome DevTools Memory tab | Heap snapshots, allocation timeline |
| **Node.js** | `--inspect` + `v8.writeHeapSnapshot()` | On-demand snapshot capture; blocks the event loop while the snapshot is written |
| **Python** | tracemalloc | Track allocation origin with file:line |
| **Python** | objgraph, Scalene | Reference graph visualization, leak detection |
| **Go** | pprof (`/debug/pprof/heap`) | Live heap profile with allocation stacks |
| **Go** | runtime.ReadMemStats | Programmatic heap/GC metrics |
| **C/C++** | Valgrind (Memcheck) | Definite/possible leak detection at exit |
| **C/C++** | AddressSanitizer (ASan) | Compile-time instrumentation; ~2x slowdown |
| **.NET** | dotMemory, dotnet-dump | Heap snapshots, retention paths |

#### Production Monitoring Signals

Monitor these metrics continuously; trend-based alerting catches leaks before OOM:

- **RSS (Resident Set Size)**: Monotonically increasing RSS is the first signal.
- **Heap used after GC**: Plot heap size at each GC boundary. Upward trend = leak.
- **GC frequency**: Increasing GC frequency with stable allocation rate = growing
  live set.
- **GC pause duration**: Increasing major GC pauses indicate a growing old
  generation.
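
Trend-based alerting on "heap used after GC" can be as simple as fitting a line through recent samples. The sketch below is one possible approach; the sample values and the 0.5 MiB-per-sample threshold are assumptions chosen for illustration and would need tuning per service.

```python
def slope_per_sample(values):
    """Least-squares slope of `values` against their sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


def looks_like_leak(heap_after_gc_mib, min_growth_mib_per_sample=0.5):
    """Flag a sustained upward trend in post-GC heap size."""
    return slope_per_sample(heap_after_gc_mib) >= min_growth_mib_per_sample


steady = [512, 515, 510, 514, 511, 513, 512, 514]   # noise around a flat level
leaking = [512, 518, 523, 531, 536, 544, 549, 557]  # grows at every GC boundary
print(looks_like_leak(steady), looks_like_leak(leaking))
```

Sampling at GC boundaries rather than at arbitrary times removes the sawtooth of normal allocation, so the slope reflects only the live set.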

Source: [Microsoft Azure - RESIN memory leak detection](https://azure.microsoft.com/en-us/blog/advancing-memory-leak-detection-with-aiops-introducing-resin/);
[Browserless - Memory Leak Guide](https://www.browserless.io/blog/memory-leak-how-to-find-fix-prevent-them).

---

## 5. Cache-Friendly Data Structures

### 5.1 Arrays vs. Linked Lists

Contiguous memory layout is the single most impactful factor for iteration
performance on modern hardware.

| Operation | Array/Vector | Linked List | Why |
|-----------|-------------|-------------|-----|
| Sequential iteration | ~1 ns/element (L1 hits) | ~5-60 ns/element (cache misses) | Spatial locality; prefetcher works |
| Random access | O(1), ~1-4 ns | O(n), pointer chasing | Each node may be a cache miss |
| Insertion at front | O(n) memmove | O(1) | But memmove is hardware-optimized |
| Insertion at middle | O(n) | O(1) if at cursor | Array still often wins due to cache |

**Benchmark evidence**: Linked list traversal triggers frequent cache and TLB
misses because nodes are scattered across memory. The CPU prefetcher cannot
predict the next address. Even for insertions, arrays outperform linked lists
for collections under ~10,000 elements because the cost of cache misses
dominates pointer manipulation savings.

**When linked lists may win**: When elements exceed ~256 bytes and the workload
is insertion-heavy, avoiding element copies can outweigh cache-miss penalties.

Source: [DZone - Array vs Linked List Performance](https://dzone.com/articles/performance-of-array-vs-linked-list-on-modern-comp);
[arXiv - RIP Linked List](https://arxiv.org/html/2306.06942v2);
[AlgoCademy - Cache-Friendly Data Structures](https://algocademy.com/blog/cache-friendly-algorithms-and-data-structures-optimizing-performance-through-efficient-memory-access/).

### 5.2 Struct of Arrays (SoA) vs. Array of Structs (AoS)

This is one of the highest-impact transformations for data-intensive workloads.

**Array of Structs (AoS)** -- the natural OOP layout:
```
[{x, y, z, color, normal}, {x, y, z, color, normal}, ...]
```

**Struct of Arrays (SoA)** -- the data-oriented layout:
```
{xs: [x, x, ...], ys: [y, y, ...], zs: [z, z, ...], ...}
```

| Aspect | AoS | SoA |
|--------|-----|-----|
| Cache utilization when accessing 1 field | Low: loads unused fields into cache | High: only needed data in cache |
| SIMD vectorization | Hard: mixed types in registers | Easy: homogeneous data, direct SIMD load |
| Code readability | Natural, object-oriented | Less intuitive, requires discipline |
| Random single-object access | Good: entire object in 1-2 cache lines | Poor: fields scattered across arrays |
| Measured speedup (field iteration) | Baseline | **40-60% faster**; up to **40x** in extreme cases |

**Hybrid: AoSoA (Array of Structs of Arrays)**: Interleave data in blocks
matching the SIMD vector width (e.g., 8 or 16 elements). Achieves SoA's
throughput while maintaining better cache locality for multi-field access.
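
The three layouts can be sketched in Python to show how the same data is indexed. This is a layout illustration only, not a benchmark: Python dicts are not laid out like C structs, so only the `array`-backed SoA fields below are actually contiguous. The field names, the element count, and the block width of 8 are assumptions.

```python
from array import array

N = 1000

# AoS: one record per particle; every record carries all fields.
aos = [{"x": float(i), "y": 0.0, "z": 0.0, "color": 0} for i in range(N)]

# SoA: one contiguous typed array per field.
soa = {
    "x": array("d", (float(i) for i in range(N))),
    "y": array("d", [0.0] * N),
    "z": array("d", [0.0] * N),
    "color": array("I", [0] * N),
}

# Summing one field: AoS walks every record, SoA reads a single array.
sum_x_aos = sum(p["x"] for p in aos)
sum_x_soa = sum(soa["x"])

# Bytes that a pass over `x` has to stream in the SoA layout.
x_bytes = soa["x"].itemsize * len(soa["x"])


def aosoa_index(i: int, width: int = 8):
    """Map element i to (block, lane) for an AoSoA layout with `width` lanes per block."""
    return divmod(i, width)


print(sum_x_aos == sum_x_soa, x_bytes, aosoa_index(19))
```

In the AoSoA mapping, all fields for the `width` elements of a block sit next to each other, so a SIMD load of one field and a follow-up access to another field of the same elements stay within a few cache lines.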

Source: [Wikipedia - AoS and SoA](https://en.wikipedia.org/wiki/AoS_and_SoA);
[Medium - SoA vs AoS Deep Dive](https://medium.com/@azad217/structure-of-arrays-soa-vs-array-of-structures-aos-in-c-a-deep-dive-into-cache-optimized-13847588232e);
[Serge Skoredin - Cache-Friendly Go](https://skoredin.pro/blog/golang/cpu-cache-friendly-go).

### 5.3 Cache-Line Alignment Techniques

```c
#include <stdint.h>

// Align struct to cache line to prevent false sharing
struct __attribute__((aligned(64))) Counter {
    uint64_t value;
    char padding[56];  // Fill to 64 bytes
};
```

```go
// In Go: pad structs to avoid false sharing between goroutines
type PaddedCounter struct {
	Value uint64
	_     [56]byte // Pad to 64-byte cache line
}
```

**False sharing** occurs when two threads write to different fields that share
the same cache line, causing the line to bounce between CPU cores. Padding
eliminates this; measured impact is often **2-10x throughput improvement** for
contended counters.
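
The padding size depends on the payload and on the cache-line size of the target CPU (64 bytes on x86, 128 bytes on Apple ARM64 per Section 1.1). A small sketch of the calculation follows, using `ctypes` only to confirm the resulting struct size; `padding_bytes` and the struct name are hypothetical.

```python
import ctypes


def padding_bytes(payload_size: int, line_size: int = 64) -> int:
    """Bytes of padding needed to round `payload_size` up to a whole number of cache lines."""
    return -payload_size % line_size


class PaddedCounter(ctypes.Structure):
    """8-byte counter padded so that each instance occupies one full 64-byte line."""
    _fields_ = [
        ("value", ctypes.c_uint64),
        ("_pad", ctypes.c_char * padding_bytes(ctypes.sizeof(ctypes.c_uint64))),
    ]


print(padding_bytes(8), padding_bytes(8, line_size=128), ctypes.sizeof(PaddedCounter))
```

A struct padded for 64-byte lines still shares a line with its neighbor on hardware with 128-byte lines, so the line size is worth making a build-time or per-platform constant.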

---

## 6. Object Pooling and Arena Allocation

### 6.1 Object Pooling

Object pooling pre-creates and reuses expensive objects rather than allocating
and deallocating them repeatedly.

#### When Pooling Helps

| Object Type | Creation Cost | Pool Benefit |
|-------------|--------------|--------------|
| Database connections | 5-50 ms (TCP + auth + TLS) | Massive: amortize handshake |
| Thread / goroutine | OS threads: kernel call plus a stack typically reserved in MB; goroutines: ~2 KB initial stack | High for OS threads; low for goroutines, which are cheap to create |
| Large byte buffers | Allocation + zeroing | Moderate: avoids GC churn |
| Small value objects | ~10 ns allocation | **Negative**: pool overhead > allocation cost |

**Anti-pattern warning**: Do not pool cheap objects. In languages with efficient
allocators (Go, Java, modern C++), the overhead of synchronized pool access
(lock contention, cache invalidation) can exceed the cost of simple allocation.
Benchmark before adopting pooling.
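
A pooling benchmark can be as small as timing the two code paths side by side. The sketch below uses Python's `timeit` only to show the shape of such a measurement; `Point`, `LockedPool`, and the iteration count are hypothetical, and the numbers will differ in Go, Java, or C++.

```python
import threading
import timeit


class Point:
    __slots__ = ("x", "y")

    def __init__(self, x: float = 0.0, y: float = 0.0):
        self.x, self.y = x, y


class LockedPool:
    """Minimal synchronized free list; falls back to the factory when empty."""

    def __init__(self, factory, size: int):
        self._lock = threading.Lock()
        self._factory = factory
        self._free = [factory() for _ in range(size)]

    def acquire(self):
        with self._lock:
            return self._free.pop() if self._free else self._factory()

    def release(self, obj) -> None:
        with self._lock:
            self._free.append(obj)


pool = LockedPool(Point, 64)


def fresh() -> float:
    p = Point(1.0, 2.0)  # allocate a new small object each call
    return p.x + p.y


def pooled() -> float:
    p = pool.acquire()   # reuse an object, paying for the lock instead
    p.x, p.y = 1.0, 2.0
    total = p.x + p.y
    pool.release(p)
    return total


fresh_s = timeit.timeit(fresh, number=200_000)
pooled_s = timeit.timeit(pooled, number=200_000)
print(f"fresh: {fresh_s:.3f}s  pooled: {pooled_s:.3f}s")
```

Running both paths under the same load, and ideally under realistic thread contention, is what decides whether the pool earns its complexity.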

Source: [Webtide - Object Pooling Benchmarks](https://webtide.com/object-pooling-benchmarks-and-another-way/);
[Medium - Benchmarking Object Pools](https://medium.com/@chrishantha/benchmarking-object-pools-6df007a31ada).

#### Pool Implementation Patterns

```go
// Go: sync.Pool -- GC-aware, automatically shrinks
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func handleRequest(data []byte) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)
	// Use buf...
}
```

```java
// Java: Apache Commons Pool2 with bounded size
GenericObjectPool<Connection> pool = new GenericObjectPool<>(factory);
pool.setMaxTotal(50);
pool.setMaxIdle(10);
pool.setMinIdle(5);
pool.setTestOnBorrow(true);
```

### 6.2 Arena (Region-Based) Allocation

An arena allocates from a large contiguous block by advancing a pointer.
Deallocation happens in bulk when the entire arena is freed.
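
A minimal sketch of the idea, written in Python over a `bytearray` purely to show the bookkeeping: allocation is a bounds check plus an offset bump, and freeing is resetting the offset. The `Arena` class, its 8-byte default alignment, and the sizes used are illustrative assumptions, not a production allocator.

```python
class Arena:
    """Bump allocator over one contiguous buffer."""

    def __init__(self, capacity: int):
        self._buf = bytearray(capacity)
        self._offset = 0

    def alloc(self, size: int, align: int = 8) -> memoryview:
        """Return `size` bytes from the arena; `align` must be a power of two."""
        start = (self._offset + align - 1) & ~(align - 1)  # round up to alignment
        end = start + size
        if end > len(self._buf):
            raise MemoryError("arena exhausted")
        self._offset = end  # the "allocation" is just this pointer bump
        return memoryview(self._buf)[start:end]

    def reset(self) -> None:
        """Bulk free: every previous allocation becomes reusable at once."""
        self._offset = 0

    @property
    def used(self) -> int:
        return self._offset


arena = Arena(1024)
a = arena.alloc(10)  # occupies bytes 0..9
b = arena.alloc(24)  # aligned up to byte 16, occupies bytes 16..39
used_before_reset = arena.used
arena.reset()
print(used_before_reset, arena.used)
```

Views handed out before `reset()` still alias the buffer afterwards, which mirrors the main hazard of arenas in systems languages: nothing allocated from the arena may outlive it.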
|
|
472
|
+
|
|
473
|
+
| Metric | malloc/free | Arena Allocator |
|
|
474
|
+
|--------|-------------|-----------------|
|
|
475
|
+
| Allocation time | 40-75 ns | 8-12 ns |
|
|
476
|
+
| Deallocation time | 250-1000 ns per free() | ~0 ns (bulk free) |
|
|
477
|
+
| Fragmentation | Grows over time | Zero (contiguous block) |
|
|
478
|
+
| Memory overhead | Per-allocation bookkeeping | Single pointer + block |
|
|
479
|
+
|
|
480
|
+
**Benchmark**: On Intel i7-10750H, arena allocation measured at **9.2 ns** vs
|
|
481
|
+
system allocator (Box/malloc) at **71-74 ns** -- a **7-8x speedup**.
|
|
482
|
+
Pool allocators measured at **8.9 ns**, and shared arenas at **12.2 ns**.
|
|
483
|
+
|
|
484
|
+
Source: [Medium - Arena Allocators 50-100x Performance](https://medium.com/@ramogh2404/arena-and-memory-pool-allocators-the-50-100x-performance-secret-behind-game-engines-and-browsers-1e491cb40b49);
|
|
485
|
+
[Ryan Fleury - Untangling Lifetimes](https://www.rfleury.com/p/untangling-lifetimes-the-arena-allocator);
|
|
486
|
+
[OneUpTime - Arena Allocators in Rust](https://oneuptime.com/blog/post/2026-01-25-optimize-memory-arena-allocators-rust/view).
|
|
487
|
+
|
|
488
|
+
#### Ideal Use Cases for Arenas
|
|
489
|
+
|
|
490
|
+
- **Per-request processing**: Allocate all request objects from an arena; free
|
|
491
|
+
the entire arena when the request completes.
|
|
492
|
+
- **Compiler/parser phases**: Parse tree nodes share a lifetime; arena-allocate
|
|
493
|
+
and bulk-free after the phase.
|
|
494
|
+
- **Game frame allocation**: Per-frame scratch memory reset each tick.
|
|
495
|
+
- **Protobuf/serialization**: Google's Protobuf Arena API allocates message
|
|
496
|
+
trees from a single arena (a C++ API), reducing allocation and destructor overhead by a reported 60-80% in benchmarks.
|
|
497
|
+
|
|
498
|
+
---
|
|
499
|
+
|
|
500
|
+
## 7. Memory Fragmentation and Compaction
|
|
501
|
+
|
|
502
|
+
### 7.1 Types of Fragmentation
|
|
503
|
+
|
|
504
|
+
- **External fragmentation**: Free memory exists but is scattered in small,
|
|
505
|
+
non-contiguous blocks. A 100 MB allocation fails even though 200 MB is free
|
|
506
|
+
in total.
|
|
507
|
+
- **Internal fragmentation**: Allocated blocks are larger than needed due to
|
|
508
|
+
alignment or minimum allocation sizes. A 17-byte request gets a 32-byte slot.
|
|
509
|
+
|
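Both examples above can be reproduced with a few lines of arithmetic. The power-of-two size classes and the 16-byte minimum slot are assumptions chosen to match the 17-byte example; real allocators use finer-grained size classes.

```python
def slot_size(request, min_slot=16):
    """Round a request up to the next power-of-two size class (assumed policy)."""
    size = min_slot
    while size < request:
        size *= 2
    return size


def internal_waste(request, min_slot=16):
    """Bytes lost to internal fragmentation for one request."""
    return slot_size(request, min_slot) - request


def can_allocate(free_blocks, request):
    """External fragmentation check: one free block must fit the whole request."""
    return any(block >= request for block in free_blocks)
```

With four free 50 MB blocks, `sum(free_blocks)` is 200 but `can_allocate(free_blocks, 100)` is false, which is the external-fragmentation failure described in the first bullet.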
|
510
|
+
### 7.2 Fragmentation in Practice
|
|
511
|
+
|
|
512
|
+
**jemalloc** (used by Redis and Firefox, and Rust's default allocator until 1.32) reports fragmentation via
|
|
513
|
+
`stats.allocated` vs `stats.resident`. A resident-to-allocated ratio above 1.5 indicates significant
|
|
514
|
+
fragmentation. Redis documentation recommends restarting instances when
|
|
515
|
+
fragmentation ratio exceeds 1.5.
|
|
516
|
+
|
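As a sketch of the check described above, the ratio is resident bytes divided by allocated bytes. The function names are hypothetical, and the two inputs are assumed to have been read from the allocator's statistics already.

```python
def fragmentation_ratio(allocated_bytes, resident_bytes):
    """Resident memory divided by memory the application actually allocated."""
    if allocated_bytes <= 0:
        raise ValueError("allocated_bytes must be positive")
    return resident_bytes / allocated_bytes


def is_fragmented(allocated_bytes, resident_bytes, threshold=1.5):
    """Apply the 1.5 threshold quoted in the text."""
    return fragmentation_ratio(allocated_bytes, resident_bytes) > threshold
```

For example, 1,000 MB allocated with 1,600 MB resident gives a ratio of 1.6, which is above the threshold.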
|
517
|
+
**Linux kernel**: Memory fragmentation triggers **direct compaction** when the
|
|
518
|
+
kernel cannot find contiguous pages for allocation. This blocks the requesting
|
|
519
|
+
process and adds 1-100 ms of latency. PingCAP documented this as a root cause
|
|
520
|
+
of TiDB tail-latency spikes in production.
|
|
521
|
+
|
|
522
|
+
Source: [PingCAP - Linux vs Memory Fragmentation](https://www.pingcap.com/blog/linux-kernel-vs-memory-fragmentation-1/);
|
|
523
|
+
[Oracle Diagnostician - Memory Fragmentation](https://savvinov.com/2019/10/14/memory-fragmentation-the-silent-performance-killer/).
|
|
524
|
+
|
|
525
|
+
### 7.3 Mitigation Strategies
|
|
526
|
+
|
|
527
|
+
| Strategy | Mechanism | Trade-off |
|
|
528
|
+
|----------|-----------|-----------|
|
|
529
|
+
| **Compacting GC** (JVM G1/ZGC) | Relocate live objects to eliminate gaps | CPU overhead for copying |
|
|
530
|
+
| **Slab allocation** (Linux kernel, memcached) | Fixed-size slabs for common object sizes | Internal fragmentation for odd sizes |
|
|
531
|
+
| **jemalloc / tcmalloc** | Thread-local caches + size classes | ~2% memory overhead for metadata |
|
|
532
|
+
| **Arena allocation** | Bulk-free eliminates fragmentation by design | Requires phase-based lifetimes |
|
|
533
|
+
| **Huge pages** | 2 MB pages reduce fragmentation at page level | Requires explicit configuration |
|
|
534
|
+
| **Periodic restart** | Reclaim all fragmented memory | Brief downtime; requires load balancing |
|
|
535
|
+
|
|
536
|
+
---
|
|
537
|
+
|
|
538
|
+
## 8. Stack vs. Heap Allocation Performance
|
|
539
|
+
|
|
540
|
+
### 8.1 Fundamental Differences
|
|
541
|
+
|
|
542
|
+
| Aspect | Stack | Heap |
|
|
543
|
+
|--------|-------|------|
|
|
544
|
+
| Allocation cost | ~0.15 ns (pointer decrement) | 10-75 ns (malloc; allocator-dependent) |
|
|
545
|
+
| Deallocation cost | ~0 ns (pointer increment on return) | 250-1000 ns (free; coalescing, bookkeeping) |
|
|
546
|
+
| Total alloc+dealloc | < 1 ns | 260-1075 ns |
|
|
547
|
+
| Memory limit | 1-8 MB typical thread stack | Limited by available RAM |
|
|
548
|
+
| Fragmentation | None | Grows over time |
|
|
549
|
+
| Lifetime | Automatic; scoped to function | Manual or GC-managed |
|
|
550
|
+
| Thread safety | Inherently thread-local | Requires synchronization |
|
|
551
|
+
| Cache behavior | Excellent: hot stack memory in L1 | Variable: depends on allocation pattern |
|
|
552
|
+
|
|
553
|
+
Source: [Stack vs Heap Benchmark](https://publicwork.wordpress.com/2019/06/27/stack-allocation-vs-heap-allocation-performance-benchmark/);
|
|
554
|
+
[GitHub - stack-vs-heap-benchmark](https://github.com/spuhpointer/stack-vs-heap-benchmark).
|
|
555
|
+
|
|
556
|
+
### 8.2 Escape Analysis
|
|
557
|
+
|
|
558
|
+
Modern compilers and runtimes perform escape analysis to allocate heap-intended
|
|
559
|
+
objects on the stack when they do not outlive their scope.
|
|
560
|
+
|
|
561
|
+
**Go**: The compiler performs escape analysis at build time. Use `go build -gcflags="-m"` to see decisions. Variables that do not escape are stack-allocated (zero GC overhead).
|
|
562
|
+
|
|
563
|
+
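**JVM**: HotSpot's C2 compiler performs scalar replacement -- decomposing objects into their fields, held in registers or on the stack. Escape analysis is enabled by default (`-XX:+DoEscapeAnalysis`); the diagnostic flag `-XX:+PrintEscapeAnalysis` is only available in debug (non-product) JVM builds.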
|
|
564
|
+
|
|
565
|
+
**V8**: TurboFan can allocate short-lived objects in registers or on the stack when escape analysis proves safety.
|
|
566
|
+
|
|
567
|
+
**Practical impact**: In Go, reducing heap escapes by refactoring function
|
|
568
|
+
signatures (returning values instead of pointers, passing buffers as parameters)
|
|
569
|
+
can reduce GC pressure by 20-50% in allocation-heavy code paths.
|
|
570
|
+
|
|
571
|
+
---
|
|
572
|
+
|
|
573
|
+
## 9. Common Bottlenecks and Anti-Patterns
|
|
574
|
+
|
|
575
|
+
### 9.1 Top Memory Performance Bottlenecks
|
|
576
|
+
|
|
577
|
+
| Bottleneck | Symptom | Typical Impact | Detection |
|
|
578
|
+
|------------|---------|----------------|-----------|
|
|
579
|
+
| **Memory leaks** | RSS grows monotonically | OOM kill after hours/days | Heap snapshots, RSS trending |
|
|
580
|
+
| **GC pressure** | High GC CPU %, frequent pauses | 10-50% throughput loss | GC logs, `gc_pause_seconds` metric |
|
|
581
|
+
| **Cache misses** | High LLC-miss in perf counters | 10-100x slower data access | `perf stat`, VTune, Instruments |
|
|
582
|
+
| **Memory fragmentation** | Allocation failures despite free memory | Latency spikes from compaction | `/proc/buddyinfo`, jemalloc stats |
|
|
583
|
+
| **False sharing** | Poor multi-thread scaling | 2-10x slower than expected | `perf c2c`, padding experiments |
|
|
584
|
+
| **TLB misses** | High `dTLB-load-misses` | 5-30 ns per access overhead | `perf stat -e dTLB-load-misses` |
|
|
585
|
+
|
|
586
|
+
### 9.2 Anti-Patterns (With Fixes)
|
|
587
|
+
|
|
588
|
+
#### Anti-Pattern 1: Object Allocation in Hot Loops
|
|
589
|
+
|
|
590
|
+
```javascript
|
|
591
|
+
// BAD: Creates a new object every iteration
|
|
592
|
+
function processItems(items) {
|
|
593
|
+
for (const item of items) {
|
|
594
|
+
const result = { id: item.id, value: transform(item) }; // Allocation!
|
|
595
|
+
send(result);
|
|
596
|
+
}
|
|
597
|
+
}
|
|
598
|
+
|
|
599
|
+
// GOOD: Reuse a single object (safe only if send() consumes it synchronously and keeps no reference)
|
|
600
|
+
function processItems(items) {
|
|
601
|
+
const result = { id: null, value: null };
|
|
602
|
+
for (const item of items) {
|
|
603
|
+
result.id = item.id;
|
|
604
|
+
result.value = transform(item);
|
|
605
|
+
send(result);
|
|
606
|
+
}
|
|
607
|
+
}
|
|
608
|
+
```
|
|
609
|
+
|
|
610
|
+
**Impact**: Reducing allocations in a loop processing 1M items can decrease
|
|
611
|
+
young-gen GC frequency by 80%+ and improve throughput by 15-30%.
|
|
612
|
+
|
|
613
|
+
#### Anti-Pattern 2: Not Reusing Buffers
|
|
614
|
+
|
|
615
|
+
```go
|
|
616
|
+
// BAD: Allocates new buffer per request
|
|
617
|
+
func handleRequest(w http.ResponseWriter, r *http.Request) {
|
|
618
|
+
buf := make([]byte, 32*1024) // 32 KB allocation every request
|
|
619
|
+
n, _ := r.Body.Read(buf)
|
|
620
|
+
process(buf[:n])
|
|
621
|
+
}
|
|
622
|
+
|
|
623
|
+
// GOOD: Pool buffers
|
|
624
|
+
var bufPool = sync.Pool{
|
|
625
|
+
New: func() interface{} { b := make([]byte, 32*1024); return &b }, // store a pointer so Put does not allocate
|
|
626
|
+
}
|
|
627
|
+
|
|
628
|
+
func handleRequest(w http.ResponseWriter, r *http.Request) {
|
|
629
|
+
bufp := bufPool.Get().(*[]byte)
|
|
630
|
+
defer bufPool.Put(bufp)
|
|
631
|
+
n, _ := r.Body.Read(*bufp)
|
|
632
|
+
process((*bufp)[:n])
|
|
633
|
+
}
|
|
634
|
+
```
|
|
635
|
+
|
|
636
|
+
#### Anti-Pattern 3: Holding References Unnecessarily
|
|
637
|
+
|
|
638
|
+
```java
|
|
639
|
+
// BAD: Cache grows unbounded
|
|
640
|
+
private static final Map<String, UserSession> sessionCache = new HashMap<>();
|
|
641
|
+
|
|
642
|
+
public void onLogin(UserSession session) {
|
|
643
|
+
sessionCache.put(session.getId(), session);
|
|
644
|
+
// Never removed! Classic memory leak.
|
|
645
|
+
}
|
|
646
|
+
|
|
647
|
+
// GOOD: Use WeakHashMap, LRU eviction, or explicit TTL
|
|
648
|
+
private static final Map<String, UserSession> sessionCache =
|
|
649
|
+
Collections.synchronizedMap(new LinkedHashMap<>(100, 0.75f, true) {
|
|
650
|
+
@Override
|
|
651
|
+
protected boolean removeEldestEntry(Map.Entry<String, UserSession> e) {
|
|
652
|
+
return size() > MAX_CACHE_SIZE;
|
|
653
|
+
}
|
|
654
|
+
});
|
|
655
|
+
```
|
|
656
|
+
|
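The Java example shows LRU eviction; the explicit-TTL alternative mentioned in the comment can be sketched as follows. `TTLCache` is a hypothetical class, the injectable `clock` parameter exists only to make the behaviour testable, and expired entries are swept on every `put`, which is O(n) and would need a smarter structure for large caches.

```python
import time


class TTLCache:
    """Hypothetical cache whose entries expire ttl_seconds after insertion."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def put(self, key, value):
        self._evict_expired()
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(key, None)  # drop the stale entry, if any
            return None
        return entry[1]

    def _evict_expired(self):
        now = self._clock()
        expired = [k for k, (exp, _) in self._entries.items() if exp <= now]
        for k in expired:
            del self._entries[k]

    def __len__(self):
        return len(self._entries)
```

Because every entry has a bounded lifetime, the cache cannot grow without limit as long as the insertion rate is bounded.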
|
657
|
+
#### Anti-Pattern 4: String Concatenation in Loops
|
|
658
|
+
|
|
659
|
+
```python
|
|
660
|
+
# BAD: O(n^2) bytes copied in the worst case -- each concatenation can create a new string
|
|
661
|
+
result = ""
|
|
662
|
+
for line in lines:
|
|
663
|
+
result += line + "\n" # Copies entire string each time
|
|
664
|
+
|
|
665
|
+
# GOOD: O(n) memory -- join pre-allocates
|
|
666
|
+
result = "".join(f"{line}\n" for line in lines)  # same output as the loop above
|
|
667
|
+
```
|
|
668
|
+
|
|
669
|
+
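For 100,000 lines, the bad pattern copies up to n(n+1)/2 = ~5 x 10^9 line-lengths
|
|
670
|
+
of data in the worst case (~5 GB at 1 byte per line, ~500 GB at 100 bytes per line; CPython can often extend the string in place and avoid this), while the good pattern allocates roughly the final string size only.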
|
|
671
|
+
|
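The triangular sum behind the estimate above can be checked directly. This sketch assumes the worst case, where every `+=` rebuilds the whole string, and a uniform average line length; `worst_case_bytes_copied` is a hypothetical helper.

```python
def worst_case_bytes_copied(num_lines, avg_line_bytes):
    """Bytes copied if every `+=` rebuilds the whole string (worst case)."""
    # After the i-th append the string holds i * avg_line_bytes bytes, and
    # building it copies all of them, so the total is a triangular sum.
    return avg_line_bytes * num_lines * (num_lines + 1) // 2


def final_size_bytes(num_lines, avg_line_bytes):
    """Size of the finished string, which is all that join needs to allocate."""
    return num_lines * avg_line_bytes
```

With 100,000 lines the sum is about 5 x 10^9 times the average line length, so the total scales linearly with line length while the final string stays at `num_lines * avg_line_bytes`.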
|
672
|
+
#### Anti-Pattern 5: Forgetting to Remove Event Listeners
|
|
673
|
+
|
|
674
|
+
```javascript
|
|
675
|
+
// BAD: Listener leaks the component and its entire closure scope
|
|
676
|
+
class Dashboard {
|
|
677
|
+
constructor() {
|
|
678
|
+
window.addEventListener('resize', this.onResize.bind(this));
|
|
679
|
+
}
|
|
680
|
+
// Component destroyed but listener keeps it alive
|
|
681
|
+
}
|
|
682
|
+
|
|
683
|
+
// GOOD: Clean up on destroy
|
|
684
|
+
class Dashboard {
|
|
685
|
+
constructor() {
|
|
686
|
+
this._onResize = this.onResize.bind(this);
|
|
687
|
+
window.addEventListener('resize', this._onResize);
|
|
688
|
+
}
|
|
689
|
+
destroy() {
|
|
690
|
+
window.removeEventListener('resize', this._onResize);
|
|
691
|
+
}
|
|
692
|
+
}
|
|
693
|
+
```
|
|
694
|
+
|
|
695
|
+
---
|
|
696
|
+
|
|
697
|
+
## 10. Before/After Case Studies
|
|
698
|
+
|
|
699
|
+
### Case Study 1: GC Tuning Reduces P99 Latency by ~9x
|
|
700
|
+
|
|
701
|
+
**System**: Java microservice, 8 GB heap, G1GC, processing 5,000 req/s.
|
|
702
|
+
|
|
703
|
+
| Metric | Before (G1GC default) | After (ZGC) |
|
|
704
|
+
|--------|----------------------|-------------|
|
|
705
|
+
| P50 latency | 12 ms | 11 ms |
|
|
706
|
+
| P99 latency | 120 ms | 14 ms |
|
|
707
|
+
| P99.9 latency | 450 ms | 18 ms |
|
|
708
|
+
| Max GC pause | 380 ms | 0.8 ms |
|
|
709
|
+
| CPU overhead | Baseline | +8% |
|
|
710
|
+
| Throughput | 5,000 req/s | 4,800 req/s |
|
|
711
|
+
|
|
712
|
+
**Action**: Switched from G1GC to generational ZGC (`-XX:+UseZGC -XX:+ZGenerational`, JDK 21+).
|
|
713
|
+
P99.9 dropped from 450 ms to 18 ms. The 4% throughput decrease from ZGC's
|
|
714
|
+
concurrent overhead was acceptable given the 25x improvement in tail latency.
|
|
715
|
+
|
|
716
|
+
### Case Study 2: Cache-Friendly Restructuring Improves Throughput 3.5x
|
|
717
|
+
|
|
718
|
+
**System**: Particle simulation, 10M particles, iterating over position data.
|
|
719
|
+
|
|
720
|
+
| Metric | Before (AoS) | After (SoA) |
|
|
721
|
+
|--------|-------------|-------------|
|
|
722
|
+
| Layout | `Particle { x, y, z, mass, color, ... }` (96 bytes) | `Positions { xs[], ys[], zs[] }` + separate arrays |
|
|
723
|
+
| Cache lines per particle (position update) | 2 lines (96 bytes > 64-byte line) | 0.19 lines (12 bytes of a 64-byte line, shared) |
|
|
724
|
+
| L1 cache miss rate | 34% | 2% |
|
|
725
|
+
| Position update throughput | 28M particles/sec | 98M particles/sec |
|
|
726
|
+
| Speedup | Baseline | **3.5x** |
|
|
727
|
+
|
|
728
|
+
**Action**: Restructured from array-of-structs to struct-of-arrays for the hot
|
|
729
|
+
position-update loop. The SoA layout meant only x, y, z floats were loaded into
|
|
730
|
+
cache (12 bytes per particle), not the unused mass/color/metadata fields (84 bytes
|
|
731
|
+
wasted per particle in AoS).
|
|
732
|
+
|
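The cache-line figures in this case study follow from simple arithmetic, sketched here under the assumption that x, y, and z are 4-byte floats (12 bytes per particle, as stated above) and that cache lines are 64 bytes. The helper names are hypothetical.

```python
import math

CACHE_LINE_BYTES = 64


def aos_lines_spanned(struct_bytes):
    """Cache lines a single struct spans when it is larger than one line."""
    return math.ceil(struct_bytes / CACHE_LINE_BYTES)


def soa_lines_per_element(hot_bytes):
    """Average cache lines per element when only the hot fields are packed."""
    return hot_bytes / CACHE_LINE_BYTES


def wasted_bytes(struct_bytes, hot_bytes):
    """Bytes loaded per element in AoS that the hot loop never reads."""
    return struct_bytes - hot_bytes
```

For the 96-byte particle this gives 2 lines in AoS, 0.1875 lines in SoA, and 84 unused bytes per particle; the reported throughputs (98M vs 28M particles/sec) give the 3.5x speedup.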
|
733
|
+
### Case Study 3: Memory Leak Fix Prevents Daily OOM Restarts
|
|
734
|
+
|
|
735
|
+
**System**: Node.js API server, 2 GB heap limit.
|
|
736
|
+
|
|
737
|
+
| Metric | Before | After |
|
|
738
|
+
|--------|--------|-------|
|
|
739
|
+
| RSS at startup | 180 MB | 180 MB |
|
|
740
|
+
| RSS after 24 hours | 1,950 MB (near OOM) | 220 MB |
|
|
741
|
+
| Required restarts | Every 18-24 hours | None |
|
|
742
|
+
| Root cause | Event listeners on WebSocket connections never removed | Added `ws.removeAllListeners()` on disconnect |
|
|
743
|
+
|
|
744
|
+
**Detection method**: Captured three heap snapshots at 1-hour intervals using
|
|
745
|
+
`v8.writeHeapSnapshot()`. Comparison showed `Closure` objects growing by 12,000
|
|
746
|
+
per hour, all retained by WebSocket event listener chains. Each closure retained
|
|
747
|
+
~4 KB of scope data, totaling ~48 MB/hour of leaked memory.
|
|
748
|
+
|
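The leak rate quoted above is the product of the two measurements from the snapshot comparison. A small sketch, using decimal units (1 MB = 1,000 KB) as an assumption to match the rounded figure; the function names are hypothetical.

```python
def leak_rate_mb_per_hour(objects_per_hour, kb_per_object):
    """Leaked megabytes per hour from a per-object retained size."""
    return objects_per_hour * kb_per_object / 1000


def leaked_mb(rate_mb_per_hour, hours):
    """Total leaked megabytes after a given number of hours."""
    return rate_mb_per_hour * hours
```

Here `leak_rate_mb_per_hour(12_000, 4)` gives 48 MB/hour, or about 1,152 MB over 24 hours of closure scope data. The RSS growth in the table is larger than that, which would be consistent with the listener chains retaining more than the closure scopes alone, though the case study does not break this down.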
|
749
|
+
---
|
|
750
|
+
|
|
751
|
+
## 11. Decision Tree: "My App Uses Too Much Memory"
|
|
752
|
+
|
|
753
|
+
```
|
|
754
|
+
START: App memory is too high
|
|
755
|
+
|
|
|
756
|
+
+-> Is memory growing over time (monotonically)?
|
|
757
|
+
| |
|
|
758
|
+
| +-> YES: Likely a MEMORY LEAK
|
|
759
|
+
| | |
|
|
760
|
+
| | +-> Take heap snapshots at T=0, T+5min, T+15min
|
|
761
|
+
| | +-> Compare snapshots: which object types are growing?
|
|
762
|
+
| | +-> Check for:
|
|
763
|
+
| | - Unbounded caches/maps (add eviction policy)
|
|
764
|
+
| | - Event listeners not removed (add cleanup)
|
|
765
|
+
| | - Closures capturing large scopes (narrow captures)
|
|
766
|
+
| | - Goroutine/thread leaks (check blocked routines)
|
|
767
|
+
| | - Circular references without weak refs
|
|
768
|
+
| |
|
|
769
|
+
| +-> NO: Memory is high but stable
|
|
770
|
+
| |
|
|
771
|
+
| +-> Is GC running frequently (high CPU)?
|
|
772
|
+
| | |
|
|
773
|
+
| | +-> YES: GC PRESSURE
|
|
774
|
+
| | | |
|
|
775
|
+
| | | +-> Check allocation rate (alloc bytes/sec)
|
|
776
|
+
| | | +-> Profile hot allocation sites
|
|
777
|
+
| | | +-> Reduce allocations:
|
|
778
|
+
| | | - Reuse buffers (sync.Pool, object pool)
|
|
779
|
+
| | | - Pre-allocate collections with capacity
|
|
780
|
+
| | | - Move allocations out of hot loops
|
|
781
|
+
| | | - Use stack allocation (escape analysis)
|
|
782
|
+
| | | +-> Tune GC:
|
|
783
|
+
| | | - JVM: increase young gen, tune -XX:MaxGCPauseMillis
|
|
784
|
+
| | | - Go: increase GOGC, set GOMEMLIMIT
|
|
785
|
+
| | | - Node.js: increase --max-semi-space-size
|
|
786
|
+
| | | - Python: gc.freeze() long-lived objects
|
|
787
|
+
| | |
|
|
788
|
+
| | +-> NO: Legitimate high memory usage
|
|
789
|
+
| | |
|
|
790
|
+
| | +-> Is the working set larger than available RAM?
|
|
791
|
+
| | | |
|
|
792
|
+
| | | +-> YES: CAPACITY ISSUE
|
|
793
|
+
| | | | - Add more RAM
|
|
794
|
+
| | | | - Shard data across instances
|
|
795
|
+
| | | | - Use off-heap storage (memory-mapped files)
|
|
796
|
+
| | | | - Move cold data to disk/SSD
|
|
797
|
+
| | | |
|
|
798
|
+
| | | +-> NO: INEFFICIENT DATA STRUCTURES
|
|
799
|
+
| | | - Audit data structure overhead
|
|
800
|
+
| | | - Java: HashMap entry = 32-48 bytes overhead per entry
|
|
801
|
+
| | | - Consider primitive collections (Eclipse Collections,
|
|
802
|
+
| | | fastutil) to avoid boxing: 4 bytes vs 16 bytes per int
|
|
803
|
+
| | | - Use compact representations (byte[] vs String)
|
|
804
|
+
| | | - Consider struct-of-arrays layout
|
|
805
|
+
| | | - Check for duplicate data that can be interned
|
|
806
|
+
| |
|
|
807
|
+
| +-> Is memory fragmented (alloc failures despite free space)?
|
|
808
|
+
| |
|
|
809
|
+
| +-> YES: FRAGMENTATION
|
|
810
|
+
| - Switch to jemalloc/tcmalloc
|
|
811
|
+
| - Use arena allocation for phase-based workloads
|
|
812
|
+
| - Enable huge pages to reduce page-level fragmentation
|
|
813
|
+
| - Consider periodic process restart with graceful drain
|
|
814
|
+
| - JVM: ZGC/Shenandoah compact automatically
|
|
815
|
+
```
|
|
816
|
+
|
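The top-level branches of the tree can be written as a small triage function. This is a sketch: the boolean inputs are assumed to come from your own monitoring, the function name is hypothetical, and the order in which the sibling checks (GC pressure, then fragmentation) are evaluated is an assumption, since the tree lists them side by side.

```python
def triage(growing_over_time, gc_cpu_high, alloc_failures_despite_free,
           working_set_exceeds_ram):
    """Map observed symptoms to the diagnosis labels used in the tree above."""
    if growing_over_time:
        return "MEMORY LEAK"
    # Memory is high but stable from here on.
    if gc_cpu_high:
        return "GC PRESSURE"
    if alloc_failures_despite_free:
        return "FRAGMENTATION"
    if working_set_exceeds_ram:
        return "CAPACITY ISSUE"
    return "INEFFICIENT DATA STRUCTURES"
```

Each label corresponds to the list of remedies under the matching branch of the tree.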
|
817
|
+
---
|
|
818
|
+
|
|
819
|
+
## 12. Memory Profiling Methodology
|
|
820
|
+
|
|
821
|
+
### Step-by-Step Production Profiling
|
|
822
|
+
|
|
823
|
+
**Phase 1: Baseline Measurement**
|
|
824
|
+
1. Record RSS, heap used, and heap committed at startup (after warmup).
|
|
825
|
+
2. Record GC frequency, pause durations, and allocation rate.
|
|
826
|
+
3. Establish baseline under representative load using tools like k6 or Artillery.
|
|
827
|
+
|
|
828
|
+
**Phase 2: Load Testing**
|
|
829
|
+
1. Run sustained load for 30-60 minutes minimum (leaks need time to manifest).
|
|
830
|
+
2. Monitor heap-used-after-GC trend (should be flat for a non-leaking app).
|
|
831
|
+
3. Record peak RSS and compare to container memory limit (leave 20% headroom).
|
|
832
|
+
|
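The trend and headroom checks in Phase 2 can be sketched with a least-squares slope over heap-used-after-GC samples. The sample format (minutes, megabytes) and the function names are assumptions; what slope counts as "flat" is a judgment call for your workload.

```python
def slope_per_minute(samples):
    """Least-squares slope of (minute, heap_mb_after_gc) samples."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    cov = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    return cov / var


def has_headroom(peak_rss_mb, limit_mb, headroom=0.20):
    """True if peak RSS leaves the requested fraction of the limit unused."""
    return peak_rss_mb <= limit_mb * (1 - headroom)
```

A slope near zero over the 30-60 minute run matches the flat trend expected of a non-leaking app; a steady positive slope is the signal to move on to snapshot analysis.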
|
833
|
+
**Phase 3: Snapshot Analysis**
|
|
834
|
+
1. Capture heap snapshot during steady state.
|
|
835
|
+
2. Capture second snapshot after 15-30 minutes under load.
|
|
836
|
+
3. Use comparison/diff view to identify growing objects.
|
|
837
|
+
4. Focus on **retained size deltas**, not shallow size.
|
|
838
|
+
|
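For Python services, the two-snapshot comparison in Phase 3 can be sketched with the standard `tracemalloc` module. `top_growth` is a hypothetical helper, and note that `tracemalloc` reports the sizes of allocated blocks per source line, which is closer to shallow size than to retained size, so it narrows down where growth originates rather than what keeps it alive.

```python
import tracemalloc


def top_growth(workload, limit=10):
    """Run workload between two snapshots and return the largest size deltas."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        workload()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to returns StatisticDiff entries sorted by absolute size_diff,
    # largest first, grouped here by source line.
    return after.compare_to(before, "lineno")[:limit]
```

Each returned entry exposes `size_diff` and `count_diff` plus a traceback; lines whose `size_diff` keeps growing across repeated comparisons are the candidates to investigate.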
|
839
|
+
**Phase 4: Allocation Profiling**
|
|
840
|
+
1. Enable allocation tracking (JFR for the JVM, `--heap-prof` for Node.js, pprof for Go).
|
|
841
|
+
2. Identify top allocation sites by volume (bytes/sec).
|
|
842
|
+
3. Determine if allocations are necessary or can be eliminated/pooled.
|
|
843
|
+
4. Verify fixes with A/B benchmarks under identical load.
|
|
844
|
+
|
|
845
|
+
### Tool Quick Reference
|
|
846
|
+
|
|
847
|
+
```bash
|
|
848
|
+
# JVM: capture heap dump
|
|
849
|
+
jcmd <pid> GC.heap_dump /tmp/heap.hprof
|
|
850
|
+
|
|
851
|
+
# JVM: live heap histogram (no dump file needed)
|
|
852
|
+
jmap -histo:live <pid> | head -30
|
|
853
|
+
|
|
854
|
+
# Node.js: capture heap snapshot programmatically (call v8.writeHeapSnapshot() inside the running app; this one-liner snapshots a fresh process)
|
|
855
|
+
node -e "require('v8').writeHeapSnapshot()"
|
|
856
|
+
|
|
857
|
+
# Go: capture heap profile
|
|
858
|
+
curl http://localhost:6060/debug/pprof/heap > heap.pb.gz
|
|
859
|
+
go tool pprof heap.pb.gz
|
|
860
|
+
|
|
861
|
+
# Python: track allocations
|
|
862
|
+
python -c "
|
|
863
|
+
import tracemalloc
|
|
864
|
+
tracemalloc.start()
|
|
865
|
+
# ... your code ...
|
|
866
|
+
snapshot = tracemalloc.take_snapshot()
|
|
867
|
+
for stat in snapshot.statistics('lineno')[:10]:
|
|
868
|
+
print(stat)
|
|
869
|
+
"
|
|
870
|
+
|
|
871
|
+
# Linux: check memory fragmentation
|
|
872
|
+
cat /proc/buddyinfo
|
|
873
|
+
cat /proc/pagetypeinfo
|
|
874
|
+
|
|
875
|
+
# Linux: check process memory breakdown
|
|
876
|
+
cat /proc/<pid>/smaps_rollup
|
|
877
|
+
pmap -x <pid>
|
|
878
|
+
```
|
|
879
|
+
|
|
880
|
+
---
|
|
881
|
+
|
|
882
|
+
## 13. Quick Reference: Numbers Every Engineer Should Know
|
|
883
|
+
|
|
884
|
+
| Operation | Time | Relative to L1 |
|
|
885
|
+
|-----------|------|-----------------|
|
|
886
|
+
| L1 cache reference | ~1 ns | 1x |
|
|
887
|
+
| L2 cache reference | ~4 ns | 4x |
|
|
888
|
+
| L3 cache reference | ~10 ns | 10x |
|
|
889
|
+
| Main memory (RAM) | ~100 ns | 100x |
|
|
890
|
+
| SSD random read | ~100 us | 100,000x |
|
|
891
|
+
| HDD random read | ~10 ms | 10,000,000x |
|
|
892
|
+
| Stack allocation | ~0.15 ns | <1x |
|
|
893
|
+
| Heap allocation (malloc) | ~40-75 ns | 40-75x |
|
|
894
|
+
| Heap deallocation (free) | ~250-1000 ns | 250-1000x |
|
|
895
|
+
| Arena allocation | ~9-12 ns | 9-12x |
|
|
896
|
+
| Pool allocation | ~9 ns | 9x |
|
|
897
|
+
| System allocator (Box) | ~71-74 ns | 71-74x |
|
|
898
|
+
| Context switch | ~1-10 us | 1,000-10,000x |
|
|
899
|
+
| Mutex lock/unlock | ~25 ns | 25x |
|
|
900
|
+
|
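The table's cache and RAM figures can be combined into an average access time for a given hit distribution. This is a sketch using the approximate latencies from the table; the hit-rate split in the example is invented for illustration, and the names are hypothetical.

```python
LATENCY_NS = {"L1": 1, "L2": 4, "L3": 10, "RAM": 100}  # from the table above


def relative_to_l1(latency_ns):
    """Express a latency as a multiple of an L1 cache reference."""
    return latency_ns / LATENCY_NS["L1"]


def expected_access_ns(shares):
    """Average access time given the fraction of accesses served per level."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(LATENCY_NS[level] * share for level, share in shares.items())
```

For example, with 90% of accesses served from L1, 6% from L2, 3% from L3, and 1% from RAM, the average is about 2.44 ns, and the 1% of RAM accesses contributes roughly 1 ns of that.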
|
901
|
+
---
|
|
902
|
+
|
|
903
|
+
## 14. Key Takeaways
|
|
904
|
+
|
|
905
|
+
1. **Know your hierarchy**: The 100x latency gap between L1 cache (1 ns) and RAM
|
|
906
|
+
(100 ns) means data layout often matters more than algorithmic complexity for
|
|
907
|
+
in-memory workloads.
|
|
908
|
+
|
|
909
|
+
2. **Measure before tuning GC**: Default GC settings are well-tuned for general
|
|
910
|
+
workloads. Profile with GC logs before changing parameters. Pick two of three:
|
|
911
|
+
latency, throughput, footprint.
|
|
912
|
+
|
|
913
|
+
3. **ZGC is transformative for tail latency**: Sub-millisecond pauses regardless
|
|
914
|
+
of heap size, at a cost of 5-10% CPU overhead. If your JVM service has P99
|
|
915
|
+
latency requirements, ZGC is the first lever to pull.
|
|
916
|
+
|
|
917
|
+
4. **Allocation avoidance beats allocation optimization**: Reusing buffers,
|
|
918
|
+
pre-allocating collections, and keeping objects on the stack (via escape
|
|
919
|
+
analysis) eliminates GC work for those objects entirely.
|
|
920
|
+
|
|
921
|
+
5. **SoA > AoS for iteration-heavy workloads**: Restructuring data from
|
|
922
|
+
array-of-structs to struct-of-arrays routinely yields 2-4x throughput
|
|
923
|
+
improvements by maximizing cache-line utilization.
|
|
924
|
+
|
|
925
|
+
6. **Leaks are detected by trend, not threshold**: Monitor heap-used-after-GC
|
|
926
|
+
over time. A monotonically increasing line is the definitive leak signal.
|
|
927
|
+
|
|
928
|
+
7. **Arena allocation eliminates fragmentation**: For workloads with clear
|
|
929
|
+
phase-based lifetimes (per-request, per-frame), arenas deliver 7-8x faster
|
|
930
|
+
allocation and zero fragmentation.
|
|
931
|
+
|
|
932
|
+
8. **False sharing is the silent multi-threaded killer**: Pad contended data
|
|
933
|
+
structures to cache-line boundaries. Use `perf c2c` on Linux to detect it.
|
|
934
|
+
|
|
935
|
+
---
|
|
936
|
+
|
|
937
|
+
## Sources
|
|
938
|
+
|
|
939
|
+
- [CPU Cache - Wikipedia](https://en.wikipedia.org/wiki/CPU_cache)
|
|
940
|
+
- [Daniel Lemire - Measuring Cache Line Size](https://lemire.me/blog/2023/12/12/measuring-the-size-of-the-cache-line-empirically/)
|
|
941
|
+
- [Java Code Geeks - G1 vs ZGC vs Shenandoah](https://www.javacodegeeks.com/2025/08/java-gc-performance-g1-vs-zgc-vs-shenandoah.html)
|
|
942
|
+
- [Gunnar Morling - Lower Java Tail Latencies With ZGC](https://www.morling.dev/blog/lower-java-tail-latencies-with-zgc/)
|
|
943
|
+
- [Datadog - Deep Dive into Java GC](https://www.datadoghq.com/blog/understanding-java-gc/)
|
|
944
|
+
- [LinkedIn Engineering - GC Optimization](https://engineering.linkedin.com/garbage-collection/garbage-collection-optimization-high-throughput-and-low-latency-java-applications)
|
|
945
|
+
- [V8 Blog - Trash Talk: Orinoco GC](https://v8.dev/blog/trash-talk)
|
|
946
|
+
- [Platformatic - V8 GC Optimization](https://blog.platformatic.dev/optimizing-nodejs-performance-v8-memory-management-and-gc-tuning)
|
|
947
|
+
- [Go GC Guide](https://go.dev/doc/gc-guide)
|
|
948
|
+
- [Go Performance Guide - GC](https://goperf.dev/01-common-patterns/gc/)
|
|
949
|
+
- [Python gc Module Documentation](https://docs.python.org/3/library/gc.html)
|
|
950
|
+
- [Artem Golubin - Python GC Internals](https://rushter.com/blog/python-garbage-collector/)
|
|
951
|
+
- [Chrome DevTools - Heap Snapshots](https://developer.chrome.com/docs/devtools/memory-problems/heap-snapshots)
|
|
952
|
+
- [Microsoft Azure - RESIN Memory Leak Detection](https://azure.microsoft.com/en-us/blog/advancing-memory-leak-detection-with-aiops-introducing-resin/)
|
|
953
|
+
- [DZone - Array vs Linked List Performance](https://dzone.com/articles/performance-of-array-vs-linked-list-on-modern-comp)
|
|
954
|
+
- [arXiv - RIP Linked List](https://arxiv.org/html/2306.06942v2)
|
|
955
|
+
- [Wikipedia - AoS and SoA](https://en.wikipedia.org/wiki/AoS_and_SoA)
|
|
956
|
+
- [Medium - Arena Allocators 50-100x Performance](https://medium.com/@ramogh2404/arena-and-memory-pool-allocators-the-50-100x-performance-secret-behind-game-engines-and-browsers-1e491cb40b49)
|
|
957
|
+
- [Ryan Fleury - Untangling Lifetimes: Arena Allocator](https://www.rfleury.com/p/untangling-lifetimes-the-arena-allocator)
|
|
958
|
+
- [PingCAP - Linux vs Memory Fragmentation](https://www.pingcap.com/blog/linux-kernel-vs-memory-fragmentation-1/)
|
|
959
|
+
- [Webtide - Object Pooling Benchmarks](https://webtide.com/object-pooling-benchmarks-and-another-way/)
|
|
960
|
+
- [Stack vs Heap Benchmark](https://publicwork.wordpress.com/2019/06/27/stack-allocation-vs-heap-allocation-performance-benchmark/)
|
|
961
|
+
- [Browserless - Memory Leak Guide](https://www.browserless.io/blog/memory-leak-how-to-find-fix-prevent-them)
|
|
962
|
+
- [Halodoc - Fix Node.js Memory Leaks](https://blogs.halodoc.io/fix-node-js-memory-leaks/)
|
|
963
|
+
- [AlgoCademy - Cache-Friendly Data Structures](https://algocademy.com/blog/cache-friendly-algorithms-and-data-structures-optimizing-performance-through-efficient-memory-access/)
|
|
964
|
+
- [Serge Skoredin - Cache-Friendly Go](https://skoredin.pro/blog/golang/cpu-cache-friendly-go)
|