@wazir-dev/cli 1.0.0
This diff shows the changes between publicly released versions of this package as they appear in their public registries. It is provided for informational purposes only.
- package/AGENTS.md +111 -0
- package/CHANGELOG.md +14 -0
- package/CONTRIBUTING.md +101 -0
- package/LICENSE +21 -0
- package/README.md +314 -0
- package/assets/composition-engine.mmd +34 -0
- package/assets/demo-script.sh +17 -0
- package/assets/logo-dark.svg +14 -0
- package/assets/logo.svg +14 -0
- package/assets/pipeline.mmd +39 -0
- package/assets/record-demo.sh +51 -0
- package/docs/README.md +51 -0
- package/docs/adapters/context-mode.md +60 -0
- package/docs/concepts/architecture.md +87 -0
- package/docs/concepts/artifact-model.md +60 -0
- package/docs/concepts/composition-engine.md +36 -0
- package/docs/concepts/indexing-and-recall.md +160 -0
- package/docs/concepts/observability.md +41 -0
- package/docs/concepts/roles-and-workflows.md +59 -0
- package/docs/concepts/terminology-policy.md +27 -0
- package/docs/getting-started/01-installation.md +78 -0
- package/docs/getting-started/02-first-run.md +102 -0
- package/docs/getting-started/03-adding-to-project.md +15 -0
- package/docs/getting-started/04-host-setup.md +15 -0
- package/docs/guides/ci-integration.md +15 -0
- package/docs/guides/creating-skills.md +15 -0
- package/docs/guides/expertise-module-authoring.md +15 -0
- package/docs/guides/hook-development.md +15 -0
- package/docs/guides/memory-and-learnings.md +34 -0
- package/docs/guides/multi-host-export.md +15 -0
- package/docs/guides/troubleshooting.md +101 -0
- package/docs/guides/writing-custom-roles.md +15 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
- package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
- package/docs/readmes/INDEX.md +99 -0
- package/docs/readmes/features/expertise/README.md +171 -0
- package/docs/readmes/features/exports/README.md +222 -0
- package/docs/readmes/features/hooks/README.md +103 -0
- package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
- package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
- package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
- package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
- package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
- package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
- package/docs/readmes/features/hooks/session-start.md +119 -0
- package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
- package/docs/readmes/features/roles/README.md +157 -0
- package/docs/readmes/features/roles/clarifier.md +152 -0
- package/docs/readmes/features/roles/content-author.md +190 -0
- package/docs/readmes/features/roles/designer.md +193 -0
- package/docs/readmes/features/roles/executor.md +184 -0
- package/docs/readmes/features/roles/learner.md +210 -0
- package/docs/readmes/features/roles/planner.md +182 -0
- package/docs/readmes/features/roles/researcher.md +164 -0
- package/docs/readmes/features/roles/reviewer.md +184 -0
- package/docs/readmes/features/roles/specifier.md +162 -0
- package/docs/readmes/features/roles/verifier.md +215 -0
- package/docs/readmes/features/schemas/README.md +178 -0
- package/docs/readmes/features/skills/README.md +63 -0
- package/docs/readmes/features/skills/brainstorming.md +96 -0
- package/docs/readmes/features/skills/debugging.md +148 -0
- package/docs/readmes/features/skills/design.md +120 -0
- package/docs/readmes/features/skills/prepare-next.md +109 -0
- package/docs/readmes/features/skills/run-audit.md +159 -0
- package/docs/readmes/features/skills/scan-project.md +109 -0
- package/docs/readmes/features/skills/self-audit.md +176 -0
- package/docs/readmes/features/skills/tdd.md +137 -0
- package/docs/readmes/features/skills/using-skills.md +92 -0
- package/docs/readmes/features/skills/verification.md +120 -0
- package/docs/readmes/features/skills/writing-plans.md +104 -0
- package/docs/readmes/features/tooling/README.md +320 -0
- package/docs/readmes/features/workflows/README.md +186 -0
- package/docs/readmes/features/workflows/author.md +181 -0
- package/docs/readmes/features/workflows/clarify.md +154 -0
- package/docs/readmes/features/workflows/design-review.md +171 -0
- package/docs/readmes/features/workflows/design.md +169 -0
- package/docs/readmes/features/workflows/discover.md +162 -0
- package/docs/readmes/features/workflows/execute.md +173 -0
- package/docs/readmes/features/workflows/learn.md +167 -0
- package/docs/readmes/features/workflows/plan-review.md +165 -0
- package/docs/readmes/features/workflows/plan.md +170 -0
- package/docs/readmes/features/workflows/prepare-next.md +167 -0
- package/docs/readmes/features/workflows/review.md +169 -0
- package/docs/readmes/features/workflows/run-audit.md +191 -0
- package/docs/readmes/features/workflows/spec-challenge.md +159 -0
- package/docs/readmes/features/workflows/specify.md +160 -0
- package/docs/readmes/features/workflows/verify.md +177 -0
- package/docs/readmes/packages/README.md +50 -0
- package/docs/readmes/packages/ajv.md +117 -0
- package/docs/readmes/packages/context-mode.md +118 -0
- package/docs/readmes/packages/gray-matter.md +116 -0
- package/docs/readmes/packages/node-test.md +137 -0
- package/docs/readmes/packages/yaml.md +112 -0
- package/docs/reference/configuration-reference.md +159 -0
- package/docs/reference/expertise-index.md +52 -0
- package/docs/reference/git-flow.md +43 -0
- package/docs/reference/hooks.md +87 -0
- package/docs/reference/host-exports.md +50 -0
- package/docs/reference/launch-checklist.md +172 -0
- package/docs/reference/marketplace-listings.md +76 -0
- package/docs/reference/release-process.md +34 -0
- package/docs/reference/roles-reference.md +77 -0
- package/docs/reference/skills.md +33 -0
- package/docs/reference/templates.md +29 -0
- package/docs/reference/tooling-cli.md +94 -0
- package/docs/truth-claims.yaml +222 -0
- package/expertise/PROGRESS.md +63 -0
- package/expertise/README.md +18 -0
- package/expertise/antipatterns/PROGRESS.md +56 -0
- package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
- package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
- package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
- package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
- package/expertise/antipatterns/backend/index.md +24 -0
- package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
- package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
- package/expertise/antipatterns/code/async-antipatterns.md +622 -0
- package/expertise/antipatterns/code/code-smells.md +1186 -0
- package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
- package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
- package/expertise/antipatterns/code/index.md +27 -0
- package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
- package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
- package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
- package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
- package/expertise/antipatterns/design/dark-patterns.md +1121 -0
- package/expertise/antipatterns/design/index.md +22 -0
- package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
- package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
- package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
- package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
- package/expertise/antipatterns/frontend/index.md +23 -0
- package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
- package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
- package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
- package/expertise/antipatterns/index.md +31 -0
- package/expertise/antipatterns/performance/index.md +20 -0
- package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
- package/expertise/antipatterns/performance/premature-optimization.md +623 -0
- package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
- package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
- package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
- package/expertise/antipatterns/process/index.md +23 -0
- package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
- package/expertise/antipatterns/security/index.md +20 -0
- package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
- package/expertise/antipatterns/security/security-theater.md +843 -0
- package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
- package/expertise/architecture/PROGRESS.md +70 -0
- package/expertise/architecture/data/caching-architecture.md +671 -0
- package/expertise/architecture/data/data-consistency.md +574 -0
- package/expertise/architecture/data/data-modeling.md +536 -0
- package/expertise/architecture/data/event-streams-and-queues.md +634 -0
- package/expertise/architecture/data/index.md +25 -0
- package/expertise/architecture/data/search-architecture.md +663 -0
- package/expertise/architecture/data/sql-vs-nosql.md +708 -0
- package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
- package/expertise/architecture/decisions/build-vs-buy.md +616 -0
- package/expertise/architecture/decisions/index.md +23 -0
- package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
- package/expertise/architecture/decisions/technology-selection.md +616 -0
- package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
- package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
- package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
- package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
- package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
- package/expertise/architecture/distributed/index.md +25 -0
- package/expertise/architecture/distributed/saga-pattern.md +797 -0
- package/expertise/architecture/foundations/architectural-thinking.md +460 -0
- package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
- package/expertise/architecture/foundations/design-principles-solid.md +649 -0
- package/expertise/architecture/foundations/domain-driven-design.md +719 -0
- package/expertise/architecture/foundations/index.md +25 -0
- package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
- package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
- package/expertise/architecture/index.md +34 -0
- package/expertise/architecture/integration/api-design-graphql.md +638 -0
- package/expertise/architecture/integration/api-design-grpc.md +804 -0
- package/expertise/architecture/integration/api-design-rest.md +892 -0
- package/expertise/architecture/integration/index.md +25 -0
- package/expertise/architecture/integration/third-party-integration.md +795 -0
- package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
- package/expertise/architecture/integration/websockets-realtime.md +791 -0
- package/expertise/architecture/mobile-architecture/index.md +22 -0
- package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
- package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
- package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
- package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
- package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
- package/expertise/architecture/patterns/event-driven.md +797 -0
- package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
- package/expertise/architecture/patterns/index.md +27 -0
- package/expertise/architecture/patterns/layered-architecture.md +736 -0
- package/expertise/architecture/patterns/microservices.md +753 -0
- package/expertise/architecture/patterns/modular-monolith.md +692 -0
- package/expertise/architecture/patterns/monolith.md +626 -0
- package/expertise/architecture/patterns/plugin-architecture.md +735 -0
- package/expertise/architecture/patterns/serverless.md +780 -0
- package/expertise/architecture/scaling/database-scaling.md +615 -0
- package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
- package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
- package/expertise/architecture/scaling/index.md +24 -0
- package/expertise/architecture/scaling/multi-tenancy.md +800 -0
- package/expertise/architecture/scaling/stateless-design.md +787 -0
- package/expertise/backend/embedded-firmware.md +625 -0
- package/expertise/backend/go.md +853 -0
- package/expertise/backend/index.md +24 -0
- package/expertise/backend/java-spring.md +448 -0
- package/expertise/backend/node-typescript.md +625 -0
- package/expertise/backend/python-fastapi.md +724 -0
- package/expertise/backend/rust.md +458 -0
- package/expertise/backend/solidity.md +711 -0
- package/expertise/composition-map.yaml +443 -0
- package/expertise/content/foundations/content-modeling.md +395 -0
- package/expertise/content/foundations/editorial-standards.md +449 -0
- package/expertise/content/foundations/index.md +24 -0
- package/expertise/content/foundations/microcopy.md +455 -0
- package/expertise/content/foundations/terminology-governance.md +509 -0
- package/expertise/content/index.md +34 -0
- package/expertise/content/patterns/accessibility-copy.md +518 -0
- package/expertise/content/patterns/index.md +24 -0
- package/expertise/content/patterns/notification-content.md +433 -0
- package/expertise/content/patterns/sample-content.md +486 -0
- package/expertise/content/patterns/state-copy.md +439 -0
- package/expertise/design/PROGRESS.md +58 -0
- package/expertise/design/disciplines/dark-mode-theming.md +577 -0
- package/expertise/design/disciplines/design-systems.md +595 -0
- package/expertise/design/disciplines/index.md +25 -0
- package/expertise/design/disciplines/information-architecture.md +800 -0
- package/expertise/design/disciplines/interaction-design.md +788 -0
- package/expertise/design/disciplines/responsive-design.md +552 -0
- package/expertise/design/disciplines/usability-testing.md +516 -0
- package/expertise/design/disciplines/user-research.md +792 -0
- package/expertise/design/foundations/accessibility-design.md +796 -0
- package/expertise/design/foundations/color-theory.md +797 -0
- package/expertise/design/foundations/iconography.md +795 -0
- package/expertise/design/foundations/index.md +26 -0
- package/expertise/design/foundations/motion-and-animation.md +653 -0
- package/expertise/design/foundations/rtl-design.md +585 -0
- package/expertise/design/foundations/spacing-and-layout.md +607 -0
- package/expertise/design/foundations/typography.md +800 -0
- package/expertise/design/foundations/visual-hierarchy.md +761 -0
- package/expertise/design/index.md +32 -0
- package/expertise/design/patterns/authentication-flows.md +474 -0
- package/expertise/design/patterns/content-consumption.md +789 -0
- package/expertise/design/patterns/data-display.md +618 -0
- package/expertise/design/patterns/e-commerce.md +1494 -0
- package/expertise/design/patterns/feedback-and-states.md +642 -0
- package/expertise/design/patterns/forms-and-input.md +819 -0
- package/expertise/design/patterns/gamification.md +801 -0
- package/expertise/design/patterns/index.md +31 -0
- package/expertise/design/patterns/microinteractions.md +449 -0
- package/expertise/design/patterns/navigation.md +800 -0
- package/expertise/design/patterns/notifications.md +705 -0
- package/expertise/design/patterns/onboarding.md +700 -0
- package/expertise/design/patterns/search-and-filter.md +601 -0
- package/expertise/design/patterns/settings-and-preferences.md +768 -0
- package/expertise/design/patterns/social-and-community.md +748 -0
- package/expertise/design/platforms/desktop-native.md +612 -0
- package/expertise/design/platforms/index.md +25 -0
- package/expertise/design/platforms/mobile-android.md +825 -0
- package/expertise/design/platforms/mobile-cross-platform.md +983 -0
- package/expertise/design/platforms/mobile-ios.md +699 -0
- package/expertise/design/platforms/tablet.md +794 -0
- package/expertise/design/platforms/web-dashboard.md +790 -0
- package/expertise/design/platforms/web-responsive.md +550 -0
- package/expertise/design/psychology/behavioral-nudges.md +449 -0
- package/expertise/design/psychology/cognitive-load.md +1191 -0
- package/expertise/design/psychology/error-psychology.md +778 -0
- package/expertise/design/psychology/index.md +22 -0
- package/expertise/design/psychology/persuasive-design.md +736 -0
- package/expertise/design/psychology/user-mental-models.md +623 -0
- package/expertise/design/tooling/open-pencil.md +266 -0
- package/expertise/frontend/angular.md +1073 -0
- package/expertise/frontend/desktop-electron.md +546 -0
- package/expertise/frontend/flutter.md +782 -0
- package/expertise/frontend/index.md +27 -0
- package/expertise/frontend/native-android.md +409 -0
- package/expertise/frontend/native-ios.md +490 -0
- package/expertise/frontend/react-native.md +1160 -0
- package/expertise/frontend/react.md +808 -0
- package/expertise/frontend/vue.md +1089 -0
- package/expertise/humanize/domain-rules-code.md +79 -0
- package/expertise/humanize/domain-rules-content.md +67 -0
- package/expertise/humanize/domain-rules-technical-docs.md +56 -0
- package/expertise/humanize/index.md +35 -0
- package/expertise/humanize/self-audit-checklist.md +87 -0
- package/expertise/humanize/sentence-patterns.md +218 -0
- package/expertise/humanize/vocabulary-blacklist.md +105 -0
- package/expertise/i18n/PROGRESS.md +65 -0
- package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
- package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
- package/expertise/i18n/advanced/complex-scripts.md +30 -0
- package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
- package/expertise/i18n/advanced/testing-i18n.md +28 -0
- package/expertise/i18n/content/content-adaptation.md +23 -0
- package/expertise/i18n/content/locale-specific-formatting.md +23 -0
- package/expertise/i18n/content/machine-translation-integration.md +28 -0
- package/expertise/i18n/content/translation-management.md +29 -0
- package/expertise/i18n/foundations/date-time-calendars.md +67 -0
- package/expertise/i18n/foundations/i18n-architecture.md +272 -0
- package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
- package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
- package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
- package/expertise/i18n/foundations/string-externalization.md +236 -0
- package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
- package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
- package/expertise/i18n/index.md +38 -0
- package/expertise/i18n/platform/backend-i18n.md +31 -0
- package/expertise/i18n/platform/flutter-i18n.md +148 -0
- package/expertise/i18n/platform/native-android-i18n.md +36 -0
- package/expertise/i18n/platform/native-ios-i18n.md +36 -0
- package/expertise/i18n/platform/react-i18n.md +103 -0
- package/expertise/i18n/platform/web-css-i18n.md +81 -0
- package/expertise/i18n/rtl/arabic-specific.md +175 -0
- package/expertise/i18n/rtl/hebrew-specific.md +149 -0
- package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
- package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
- package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
- package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
- package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
- package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
- package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
- package/expertise/i18n/rtl/rtl-typography.md +160 -0
- package/expertise/index.md +113 -0
- package/expertise/index.yaml +216 -0
- package/expertise/infrastructure/cloud-aws.md +597 -0
- package/expertise/infrastructure/cloud-gcp.md +599 -0
- package/expertise/infrastructure/cybersecurity.md +816 -0
- package/expertise/infrastructure/database-mongodb.md +447 -0
- package/expertise/infrastructure/database-postgres.md +400 -0
- package/expertise/infrastructure/devops-cicd.md +787 -0
- package/expertise/infrastructure/index.md +27 -0
- package/expertise/performance/PROGRESS.md +50 -0
- package/expertise/performance/backend/api-latency.md +1204 -0
- package/expertise/performance/backend/background-jobs.md +506 -0
- package/expertise/performance/backend/connection-pooling.md +1209 -0
- package/expertise/performance/backend/database-query-optimization.md +515 -0
- package/expertise/performance/backend/index.md +23 -0
- package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
- package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
- package/expertise/performance/foundations/caching-strategies.md +489 -0
- package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
- package/expertise/performance/foundations/index.md +24 -0
- package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
- package/expertise/performance/foundations/memory-management.md +964 -0
- package/expertise/performance/foundations/performance-budgets.md +1314 -0
- package/expertise/performance/index.md +31 -0
- package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
- package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
- package/expertise/performance/infrastructure/index.md +22 -0
- package/expertise/performance/infrastructure/load-balancing.md +1081 -0
- package/expertise/performance/infrastructure/observability.md +1079 -0
- package/expertise/performance/mobile/index.md +23 -0
- package/expertise/performance/mobile/mobile-animations.md +544 -0
- package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
- package/expertise/performance/mobile/mobile-network.md +452 -0
- package/expertise/performance/mobile/mobile-rendering.md +599 -0
- package/expertise/performance/mobile/mobile-startup-time.md +505 -0
- package/expertise/performance/platform-specific/flutter-performance.md +647 -0
- package/expertise/performance/platform-specific/index.md +22 -0
- package/expertise/performance/platform-specific/node-performance.md +1307 -0
- package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
- package/expertise/performance/platform-specific/react-performance.md +1403 -0
- package/expertise/performance/web/bundle-optimization.md +1239 -0
- package/expertise/performance/web/image-and-media.md +636 -0
- package/expertise/performance/web/index.md +24 -0
- package/expertise/performance/web/network-optimization.md +1133 -0
- package/expertise/performance/web/rendering-performance.md +1098 -0
- package/expertise/performance/web/ssr-and-hydration.md +918 -0
- package/expertise/performance/web/web-vitals.md +1374 -0
- package/expertise/quality/accessibility.md +985 -0
- package/expertise/quality/evidence-based-verification.md +499 -0
- package/expertise/quality/index.md +24 -0
- package/expertise/quality/ml-model-audit.md +614 -0
- package/expertise/quality/performance.md +600 -0
- package/expertise/quality/testing-api.md +891 -0
- package/expertise/quality/testing-mobile.md +496 -0
- package/expertise/quality/testing-web.md +849 -0
- package/expertise/security/PROGRESS.md +54 -0
- package/expertise/security/agentic-identity.md +540 -0
- package/expertise/security/compliance-frameworks.md +601 -0
- package/expertise/security/data/data-encryption.md +364 -0
- package/expertise/security/data/data-privacy-gdpr.md +692 -0
- package/expertise/security/data/database-security.md +1171 -0
- package/expertise/security/data/index.md +22 -0
- package/expertise/security/data/pii-handling.md +531 -0
- package/expertise/security/foundations/authentication.md +1041 -0
- package/expertise/security/foundations/authorization.md +603 -0
- package/expertise/security/foundations/cryptography.md +1001 -0
- package/expertise/security/foundations/index.md +25 -0
- package/expertise/security/foundations/owasp-top-10.md +1354 -0
- package/expertise/security/foundations/secrets-management.md +1217 -0
- package/expertise/security/foundations/secure-sdlc.md +700 -0
- package/expertise/security/foundations/supply-chain-security.md +698 -0
- package/expertise/security/index.md +31 -0
- package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
- package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
- package/expertise/security/infrastructure/container-security.md +721 -0
- package/expertise/security/infrastructure/incident-response.md +1295 -0
- package/expertise/security/infrastructure/index.md +24 -0
- package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
- package/expertise/security/infrastructure/network-security.md +1337 -0
- package/expertise/security/mobile/index.md +23 -0
- package/expertise/security/mobile/mobile-android-security.md +1218 -0
- package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
- package/expertise/security/mobile/mobile-data-storage.md +1265 -0
- package/expertise/security/mobile/mobile-ios-security.md +1401 -0
- package/expertise/security/mobile/mobile-network-security.md +1520 -0
- package/expertise/security/smart-contract-security.md +594 -0
- package/expertise/security/testing/index.md +22 -0
- package/expertise/security/testing/penetration-testing.md +1258 -0
- package/expertise/security/testing/security-code-review.md +1765 -0
- package/expertise/security/testing/threat-modeling.md +1074 -0
- package/expertise/security/testing/vulnerability-scanning.md +1062 -0
- package/expertise/security/web/api-security.md +586 -0
- package/expertise/security/web/cors-and-headers.md +433 -0
- package/expertise/security/web/csrf.md +562 -0
- package/expertise/security/web/file-upload.md +1477 -0
- package/expertise/security/web/index.md +25 -0
- package/expertise/security/web/injection.md +1375 -0
- package/expertise/security/web/session-management.md +1101 -0
- package/expertise/security/web/xss.md +1158 -0
- package/exports/README.md +17 -0
- package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
- package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
- package/exports/hosts/claude/.claude/agents/designer.md +55 -0
- package/exports/hosts/claude/.claude/agents/executor.md +55 -0
- package/exports/hosts/claude/.claude/agents/learner.md +51 -0
- package/exports/hosts/claude/.claude/agents/planner.md +53 -0
- package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
- package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
- package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
- package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
- package/exports/hosts/claude/.claude/commands/author.md +42 -0
- package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
- package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
- package/exports/hosts/claude/.claude/commands/design.md +44 -0
- package/exports/hosts/claude/.claude/commands/discover.md +37 -0
- package/exports/hosts/claude/.claude/commands/execute.md +48 -0
- package/exports/hosts/claude/.claude/commands/learn.md +38 -0
- package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
- package/exports/hosts/claude/.claude/commands/plan.md +39 -0
- package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
- package/exports/hosts/claude/.claude/commands/review.md +40 -0
- package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
- package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
- package/exports/hosts/claude/.claude/commands/specify.md +38 -0
- package/exports/hosts/claude/.claude/commands/verify.md +37 -0
- package/exports/hosts/claude/.claude/settings.json +34 -0
- package/exports/hosts/claude/CLAUDE.md +19 -0
- package/exports/hosts/claude/export.manifest.json +38 -0
- package/exports/hosts/claude/host-package.json +67 -0
- package/exports/hosts/codex/AGENTS.md +19 -0
- package/exports/hosts/codex/export.manifest.json +38 -0
- package/exports/hosts/codex/host-package.json +41 -0
- package/exports/hosts/cursor/.cursor/hooks.json +16 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
- package/exports/hosts/cursor/export.manifest.json +38 -0
- package/exports/hosts/cursor/host-package.json +42 -0
- package/exports/hosts/gemini/GEMINI.md +19 -0
- package/exports/hosts/gemini/export.manifest.json +38 -0
- package/exports/hosts/gemini/host-package.json +41 -0
- package/hooks/README.md +18 -0
- package/hooks/definitions/loop_cap_guard.yaml +21 -0
- package/hooks/definitions/post_tool_capture.yaml +24 -0
- package/hooks/definitions/pre_compact_summary.yaml +19 -0
- package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
- package/hooks/definitions/protected_path_write_guard.yaml +19 -0
- package/hooks/definitions/session_start.yaml +19 -0
- package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
- package/hooks/loop-cap-guard +17 -0
- package/hooks/post-tool-lint +36 -0
- package/hooks/protected-path-write-guard +17 -0
- package/hooks/session-start +41 -0
- package/llms-full.txt +2355 -0
- package/llms.txt +43 -0
- package/package.json +79 -0
- package/roles/README.md +20 -0
- package/roles/clarifier.md +42 -0
- package/roles/content-author.md +63 -0
- package/roles/designer.md +55 -0
- package/roles/executor.md +55 -0
- package/roles/learner.md +51 -0
- package/roles/planner.md +53 -0
- package/roles/researcher.md +43 -0
- package/roles/reviewer.md +54 -0
- package/roles/specifier.md +47 -0
- package/roles/verifier.md +71 -0
- package/schemas/README.md +24 -0
- package/schemas/accepted-learning.schema.json +20 -0
- package/schemas/author-artifact.schema.json +156 -0
- package/schemas/clarification.schema.json +19 -0
- package/schemas/design-artifact.schema.json +80 -0
- package/schemas/docs-claim.schema.json +18 -0
- package/schemas/export-manifest.schema.json +20 -0
- package/schemas/hook.schema.json +67 -0
- package/schemas/host-export-package.schema.json +18 -0
- package/schemas/implementation-plan.schema.json +19 -0
- package/schemas/proposed-learning.schema.json +19 -0
- package/schemas/research.schema.json +18 -0
- package/schemas/review.schema.json +29 -0
- package/schemas/run-manifest.schema.json +18 -0
- package/schemas/spec-challenge.schema.json +18 -0
- package/schemas/spec.schema.json +20 -0
- package/schemas/usage.schema.json +102 -0
- package/schemas/verification-proof.schema.json +29 -0
- package/schemas/wazir-manifest.schema.json +173 -0
- package/skills/README.md +40 -0
- package/skills/brainstorming/SKILL.md +77 -0
- package/skills/debugging/SKILL.md +50 -0
- package/skills/design/SKILL.md +61 -0
- package/skills/dispatching-parallel-agents/SKILL.md +128 -0
- package/skills/executing-plans/SKILL.md +70 -0
- package/skills/finishing-a-development-branch/SKILL.md +169 -0
- package/skills/humanize/SKILL.md +123 -0
- package/skills/init-pipeline/SKILL.md +124 -0
- package/skills/prepare-next/SKILL.md +20 -0
- package/skills/receiving-code-review/SKILL.md +123 -0
- package/skills/requesting-code-review/SKILL.md +105 -0
- package/skills/requesting-code-review/code-reviewer.md +108 -0
- package/skills/run-audit/SKILL.md +197 -0
- package/skills/scan-project/SKILL.md +41 -0
- package/skills/self-audit/SKILL.md +153 -0
- package/skills/subagent-driven-development/SKILL.md +154 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
- package/skills/subagent-driven-development/implementer-prompt.md +102 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/skills/tdd/SKILL.md +23 -0
- package/skills/using-git-worktrees/SKILL.md +163 -0
- package/skills/using-skills/SKILL.md +95 -0
- package/skills/verification/SKILL.md +22 -0
- package/skills/wazir/SKILL.md +463 -0
- package/skills/writing-plans/SKILL.md +30 -0
- package/skills/writing-skills/SKILL.md +157 -0
- package/skills/writing-skills/anthropic-best-practices.md +122 -0
- package/skills/writing-skills/persuasion-principles.md +50 -0
- package/templates/README.md +20 -0
- package/templates/artifacts/README.md +10 -0
- package/templates/artifacts/accepted-learning.md +19 -0
- package/templates/artifacts/accepted-learning.template.json +12 -0
- package/templates/artifacts/author.md +74 -0
- package/templates/artifacts/author.template.json +19 -0
- package/templates/artifacts/clarification.md +21 -0
- package/templates/artifacts/clarification.template.json +12 -0
- package/templates/artifacts/execute-notes.md +19 -0
- package/templates/artifacts/implementation-plan.md +21 -0
- package/templates/artifacts/implementation-plan.template.json +11 -0
- package/templates/artifacts/learning-proposal.md +19 -0
- package/templates/artifacts/next-run-handoff.md +21 -0
- package/templates/artifacts/plan-review.md +19 -0
- package/templates/artifacts/proposed-learning.template.json +12 -0
- package/templates/artifacts/research.md +21 -0
- package/templates/artifacts/research.template.json +12 -0
- package/templates/artifacts/review-findings.md +19 -0
- package/templates/artifacts/review.template.json +11 -0
- package/templates/artifacts/run-manifest.template.json +8 -0
- package/templates/artifacts/spec-challenge.md +19 -0
- package/templates/artifacts/spec-challenge.template.json +11 -0
- package/templates/artifacts/spec.md +21 -0
- package/templates/artifacts/spec.template.json +12 -0
- package/templates/artifacts/verification-proof.md +19 -0
- package/templates/artifacts/verification-proof.template.json +11 -0
- package/templates/examples/accepted-learning.example.json +14 -0
- package/templates/examples/author.example.json +152 -0
- package/templates/examples/clarification.example.json +15 -0
- package/templates/examples/docs-claim.example.json +8 -0
- package/templates/examples/export-manifest.example.json +7 -0
- package/templates/examples/host-export-package.example.json +11 -0
- package/templates/examples/implementation-plan.example.json +17 -0
- package/templates/examples/proposed-learning.example.json +13 -0
- package/templates/examples/research.example.json +15 -0
- package/templates/examples/research.example.md +6 -0
- package/templates/examples/review.example.json +17 -0
- package/templates/examples/run-manifest.example.json +9 -0
- package/templates/examples/spec-challenge.example.json +14 -0
- package/templates/examples/spec.example.json +21 -0
- package/templates/examples/verification-proof.example.json +21 -0
- package/templates/examples/wazir-manifest.example.yaml +65 -0
- package/templates/task-definition-schema.md +99 -0
- package/tooling/README.md +20 -0
- package/tooling/src/adapters/context-mode.js +50 -0
- package/tooling/src/capture/command.js +376 -0
- package/tooling/src/capture/store.js +99 -0
- package/tooling/src/capture/usage.js +270 -0
- package/tooling/src/checks/branches.js +50 -0
- package/tooling/src/checks/brand-truth.js +110 -0
- package/tooling/src/checks/changelog.js +231 -0
- package/tooling/src/checks/command-registry.js +36 -0
- package/tooling/src/checks/commits.js +102 -0
- package/tooling/src/checks/docs-drift.js +103 -0
- package/tooling/src/checks/docs-truth.js +201 -0
- package/tooling/src/checks/runtime-surface.js +156 -0
- package/tooling/src/cli.js +116 -0
- package/tooling/src/command-options.js +56 -0
- package/tooling/src/commands/validate.js +320 -0
- package/tooling/src/doctor/command.js +91 -0
- package/tooling/src/export/command.js +77 -0
- package/tooling/src/export/compiler.js +498 -0
- package/tooling/src/guards/loop-cap-guard.js +52 -0
- package/tooling/src/guards/protected-path-write-guard.js +67 -0
- package/tooling/src/index/command.js +152 -0
- package/tooling/src/index/storage.js +1061 -0
- package/tooling/src/index/summarizers.js +261 -0
- package/tooling/src/loaders.js +18 -0
- package/tooling/src/project-root.js +22 -0
- package/tooling/src/recall/command.js +225 -0
- package/tooling/src/schema-validator.js +30 -0
- package/tooling/src/state-root.js +40 -0
- package/tooling/src/status/command.js +71 -0
- package/wazir.manifest.yaml +135 -0
- package/workflows/README.md +19 -0
- package/workflows/author.md +42 -0
- package/workflows/clarify.md +38 -0
- package/workflows/design-review.md +46 -0
- package/workflows/design.md +44 -0
- package/workflows/discover.md +37 -0
- package/workflows/execute.md +48 -0
- package/workflows/learn.md +38 -0
- package/workflows/plan-review.md +42 -0
- package/workflows/plan.md +39 -0
- package/workflows/prepare-next.md +37 -0
- package/workflows/review.md +40 -0
- package/workflows/run-audit.md +41 -0
- package/workflows/spec-challenge.md +41 -0
- package/workflows/specify.md +38 -0
- package/workflows/verify.md +37 -0
@@ -0,0 +1,564 @@
# Distributed Systems Fundamentals -- Architecture Expertise Module

> Distributed systems are systems where components on networked computers communicate
> by passing messages. They introduce fundamental challenges absent from single-process
> systems: partial failure, network unreliability, clock skew, and consensus difficulty.
> Understanding these fundamentals prevents the most expensive architectural mistakes --
> the kind that only surface under production load, during network partitions, or at
> 3 AM when your on-call engineer discovers two database primaries accepting writes.

> **Category:** Distributed
> **Complexity:** Expert
> **Applies when:** Any system spanning more than one process -- microservices, multi-region
> deployments, client-server applications, or systems using external services (databases,
> caches, message brokers, third-party APIs).

---

## What This Is

### You Are Probably Already Building a Distributed System

If your application talks to a database, you have a distributed system. If it calls an
external API, uses a cache, a message queue, or a CDN, you have a distributed system. A
single-server web app with PostgreSQL is a two-node distributed system. The network between
them is usually local, which makes failures rare but not impossible -- and when they happen,
developers are blindsided because they never designed for them.

### The Eight Fallacies of Distributed Computing

L. Peter Deutsch and colleagues at Sun Microsystems compiled the first seven false
assumptions by 1994; James Gosling later added the eighth. Decades later, they remain
among the most common sources of distributed bugs.

**1. The Network Is Reliable.** Packets drop. Connections time out. During the February
2025 AWS eu-north-1 incident, a networking fault in Stockholm disrupted intra-region
traffic between availability zones while external connectivity appeared normal. EC2, S3,
Lambda, DynamoDB, and CloudWatch all degraded -- not because the services failed, but
because internal communication broke. *Every network call needs timeout, retry, and
failure-handling logic.*

**2. Latency Is Zero.** A function call takes nanoseconds; a network call takes
milliseconds -- six orders of magnitude. A page making 50 sequential 5ms service calls
accumulates 250ms of pure network time. *Batch operations, caching, and coarse-grained
APIs exist to combat this.*

**3. Bandwidth Is Infinite.** Unbounded traffic (full objects when only IDs are needed,
uncompressed payloads, full-dataset replication instead of deltas) increases packet loss
and tail latency system-wide. At scale, AWS cross-AZ/cross-region data transfer costs
become a significant line item.

**4. The Network Is Secure.** Every link is an attack surface. Distributed systems have
more links than monoliths, each requiring independent security. Zero-trust networking
exists because the "trusted internal network" assumption has been proven wrong repeatedly.

**5. Topology Doesn't Change.** In cloud environments, VMs are replaced, containers
rescheduled, load balancers rebalanced, IPs recycled. Hard-coded addresses and infinite
DNS TTLs break silently. *Service discovery and dynamic routing exist for this.*

**6. There Is One Administrator.** Modern systems span teams, organizations, and cloud
providers. Debugging requires coordinating across different on-call rotations, deployment
schedules, and monitoring systems. *Distributed tracing, centralized logging, and clear
service ownership are necessities.*

**7. Transport Cost Is Zero.** Serialization, TLS handshakes, connection management, and
infrastructure costs are all non-zero. JSON overhead invisible at low volume becomes a CPU
bottleneck under load. Binary formats (Protocol Buffers, Avro) exist because this is real.

**8. The Network Is Homogeneous.** Real networks span different vendors, protocol versions,
encodings, and byte orderings. Standard wire formats, versioned APIs, and explicit encoding
declarations address this.
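
The arithmetic behind the latency fallacy (50 sequential 5ms calls) can be sketched as a
small estimate. The function name `page_latency_ms` and the batching model (each batch of
calls costs one round trip) are illustrative assumptions, not measurements of any real
service.

```python
import math


def page_latency_ms(calls: int, per_call_ms: float, batch_size: int = 1) -> float:
    """Estimate pure network time when `calls` requests are sent in batches.

    Assumption: every batch costs exactly one round trip of `per_call_ms`.
    """
    if calls < 0 or per_call_ms < 0 or batch_size < 1:
        raise ValueError("calls and per_call_ms must be >= 0, batch_size >= 1")
    round_trips = math.ceil(calls / batch_size)
    return round_trips * per_call_ms


# 50 sequential calls versus the same 50 calls grouped into batches of 10.
sequential = page_latency_ms(calls=50, per_call_ms=5)
batched = page_latency_ms(calls=50, per_call_ms=5, batch_size=10)
print(sequential, batched)
```

Under this simple model, batching reduces the number of round trips, which is why
coarse-grained APIs help even when each individual call is fast.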

### Partial Failure vs. Total Failure

In a single-process system, failure is total: running or not. In distributed systems,
failure is **partial**: some components fail while others continue. This is the defining
challenge:

- A request may have been received but not processed (server crashed after receipt).
- A request may have been processed but the response lost (the client cannot tell whether
  to retry -- retrying may duplicate the operation; giving up may lose it).
- A slow node looks identical to a dead node from a timeout's perspective, but it is still
  processing, potentially conflicting with its replacement.
- Different observers see different states: A sees B as failed while C sees B as healthy.

### Fundamental Impossibility Results

Three results define the theoretical boundaries. Ignoring them leads to violated guarantees.

**FLP Impossibility (Fischer, Lynch, Paterson, 1985).** In an asynchronous system where
even one process might crash, no deterministic consensus algorithm can guarantee
termination. You cannot have safety (never wrong), liveness (always terminates), and fault
tolerance simultaneously in a purely asynchronous system. Practical systems work around FLP
with partial synchrony (timeouts), randomization, or failure detectors.

**CAP Theorem (Brewer 2000; Gilbert & Lynch 2002).** A distributed data store cannot
simultaneously provide Consistency (linearizability), Availability (every request gets a
non-error response), and Partition tolerance. Since partitions are inevitable, the choice
is CP (consistent but unavailable during partitions -- etcd, ZooKeeper, Spanner) or AP
(available but potentially stale -- Cassandra, DynamoDB eventual mode). Brewer clarified
in 2012 that the "2 of 3" formulation is misleading: during normal operation you can have
both C and A; the trade-off applies only during partitions.

**Byzantine Fault Tolerance (Lamport, Shostak, Pease, 1982).** If nodes can behave
arbitrarily (lie, collude), consensus requires 3f+1 nodes to tolerate f faults -- more
than two-thirds must be honest. PBFT (Castro & Liskov, 1999) made this practical. BFT is
essential for blockchains (Solana, Stellar, Tendermint) but overkill for internal systems;
most use crash-fault tolerance (CFT), needing only 2f+1 nodes.
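
The 2f+1 and 3f+1 bounds translate directly into cluster sizing. The helper names below are
hypothetical; the sketch only restates the formulas and shows that adding a node does not
always buy extra fault tolerance.

```python
def cft_cluster_size(f: int) -> int:
    """Nodes needed to tolerate f crash faults with majority quorums: 2f + 1."""
    return 2 * f + 1


def bft_cluster_size(f: int) -> int:
    """Nodes needed to tolerate f Byzantine faults: 3f + 1."""
    return 3 * f + 1


def majority_quorum(n: int) -> int:
    """Smallest group that is guaranteed to overlap with any other majority."""
    return n // 2 + 1


def max_crash_faults(n: int) -> int:
    """Largest f such that 2f + 1 <= n."""
    return (n - 1) // 2


def max_byzantine_faults(n: int) -> int:
    """Largest f such that 3f + 1 <= n."""
    return (n - 1) // 3


for n in (3, 4, 5, 7):
    print(n, majority_quorum(n), max_crash_faults(n), max_byzantine_faults(n))
```

Note that a 4-node crash-tolerant cluster still tolerates only one fault, the same as a
3-node cluster, which is why odd sizes are the usual choice.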

---

### Signals You Are Ignoring Distribution

If any of these are true, you are building a distributed system without treating it as one:

- No timeout on any network call (database, API, cache).
- No retry logic, or retries are infinite without backoff.
- You assume database transactions always succeed on the first attempt.
- No circuit breaker for downstream service calls.
- Multi-service operations with no partial failure handling.
- No health checks, no distributed tracing, no centralized logging.
- You deploy to "the cloud" but have not considered AZ failure.
- You use wall-clock time for ordering events across services.

---

## When to Avoid Distribution (Equally Important)

### Distribution Is Never Free

Every process boundary adds: network latency, partial failure modes, consistency
challenges, operational complexity, debugging difficulty, and infrastructure cost.

**The single-process advantage:** Shared memory (nanoseconds, not milliseconds). Total
failure (no split brain). Trivially atomic transactions. Stack traces showing the full
call chain. One artifact to deploy and roll back.

### Real Examples of Unnecessary Distribution

**Premature microservices:** A 5-engineer startup splits a CRUD app into 12 microservices
because "that's what Netflix does." Netflix needed it for 2,000+ engineers deploying
independently. The startup now spends more time debugging inter-service calls than
building features.

**Distributed cache for single-server apps:** Adding Redis for an app on one server
creates cache invalidation complexity, connection management, and a new failure mode. An
in-process LRU cache is faster and simpler.

**Event-driven architecture for synchronous workflows:** Replacing direct function calls
with a message queue between same-process components that need synchronous responses adds
latency, complexity, and a new dependency for zero benefit.

**CQRS for 100 users:** A separate read-optimized database synced via events from the write
database, for a workload a single PostgreSQL instance handles trivially. Now you maintain
two databases, an event pipeline, and stale read handling.

**The alternative:** A modular monolith -- single deployable, well-defined module
boundaries, enforced dependency rules, separate data ownership per module. Extract services
only when a module genuinely needs independent scaling or deployment.

---

## How It Works

### Network Models

| Model | Guarantee | Real-World Analog |
|-------|-----------|-------------------|
| **Reliable links** | Messages delivered exactly once, uncorrupted | TCP (approximation) |
| **Fair-loss links** | Messages may be lost/duplicated; persistent retransmission succeeds | UDP + application retransmit |
| **Arbitrary (Byzantine)** | Network can lose, duplicate, reorder, corrupt, or fabricate | Untrusted networks; TLS converts to authenticated fair-loss |

### Timing Models

| Model | Guarantee | Implication |
|-------|-----------|-------------|
| **Synchronous** | Known upper bounds on delay, processing, clock drift | Timeouts reliably detect failure. Unrealistic for most systems. |
| **Partially synchronous** | Eventually synchronous; bounds exist but are unknown | Most practical protocols (Raft, Paxos) assume this. |
| **Asynchronous** | No timing bounds at all | Cannot distinguish slow from dead. FLP applies. |

### Failure Models

| Model | Behavior | Tolerance / Notes | Used By / Seen In |
|-------|----------|-------------------|-------------------|
| **Crash-stop** | Correct or permanently stopped | Simplest to reason about | Theoretical analysis |
| **Crash-recovery** | Can crash and restart with durable state | 2f+1 nodes for f faults | Raft, Paxos, ZAB |
| **Byzantine** | Arbitrary/malicious behavior | 3f+1 nodes for f faults | PBFT, Tendermint, blockchains |
| **Omission** | Fails to send/receive some messages | Severity between crash and Byzantine | Network issues, GC pauses |

### Ordering and Logical Clocks

**Happens-before (Lamport, 1978):** A -> B if: (1) A and B occur on the same node and A
comes before B in program order; (2) A is a send and B is the corresponding receive; or
(3) there is an event C with A -> C and C -> B (transitivity). If neither A -> B nor
B -> A, the events are concurrent.

**Lamport timestamps:** Each node maintains a counter; increment on every event; attach to
sent messages; on receive, set counter to max(local, received) + 1. Breaking ties by node
ID yields a total order consistent with causality, but it does not capture concurrency --
if ts(A) < ts(B), A may have happened before B or be concurrent with it. In the diagram,
each bracket shows the node's counter after that event.

```
Node A: [1]---[2] send(msg,ts=2)--------------------------------[3]---[4]
Node B: [1]---[2]---[3] receive(msg,ts=2)---[4] send(reply,ts=4)
Node C: [1]---------------------------------[5] receive(reply,ts=4)---[6]
```
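
The Lamport update rules described above fit in a few lines. This is a minimal sketch; the
class and method names are illustrative, and treating a send as an ordinary event that
increments the counter is an assumption made here for consistency with "increment on every
event".

```python
class LamportClock:
    """Logical clock: a single counter per node."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Local event: increment and return the new timestamp."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """A send is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, msg_ts: int) -> int:
        """On receive, jump past both the local and the received timestamp."""
        self.time = max(self.time, msg_ts) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                 # local event on A
msg_ts = a.send()        # A sends a message
b.tick()
b.tick()                 # two local events on B
recv_ts = b.receive(msg_ts)

# Sorting by (timestamp, node_id) gives a total order consistent with causality.
events = sorted([(recv_ts, "B"), (msg_ts, "A")])
print(events)
```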

**Vector clocks (Fidge & Mattern, 1988):** Each node maintains a vector of counters (one
per node). Increment own entry on events; attach full vector to messages; on receipt, take
the element-wise max and then increment own entry. Captures concurrency precisely: if every
entry of V(A) is <= the matching entry of V(B) and at least one is strictly smaller, then
A -> B; if neither vector dominates, the events are concurrent. Used by Amazon's original
Dynamo and by Riak for conflict detection.

```
Node A: [1,0,0]---[2,0,0] send(msg)---------------------------------[3,3,0] receive(reply)
Node B: [0,1,0]---[2,2,0] receive(msg)---[2,3,0] send(reply)
Node C: [0,0,1]   ← concurrent with every event on A and B
```

Vector clocks grow linearly with the number of nodes. For large or frequently changing
clusters, interval tree clocks or dotted version vectors provide more compact alternatives.
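
A vector clock comparison can be sketched with plain dictionaries. The function names and
the node labels are hypothetical, and a receive is treated as an event (merge, then
increment the receiver's own entry), which is the conventional reading of the rules above.

```python
from typing import Dict

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s entry advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(local: VectorClock, received: VectorClock, node: str) -> VectorClock:
    """Receive: element-wise max of both vectors, then count the receive event."""
    merged = {n: max(local.get(n, 0), received.get(n, 0))
              for n in set(local) | set(received)}
    return increment(merged, node)


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify two events as 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


a_send = increment(increment({}, "A"), "A")   # A: local event, then send
b_recv = merge(increment({}, "B"), a_send, "B")
c_local = increment({}, "C")
print(compare(a_send, b_recv), compare(c_local, b_recv))
```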

### Physical Clocks

**NTP:** Typically synchronizes to within 1-50ms on a LAN and 10-100ms over the internet.
Cannot guarantee error bounds -- two NTP-synced servers may disagree on event ordering
within the error margin.

**Google TrueTime:** Represents time as an interval [earliest, latest]. GPS receivers and
atomic clocks in every Google datacenter keep uncertainty under 1ms at p99. Spanner's
"commit wait" trades latency for correctness: after committing at timestamp T, wait for
the uncertainty interval to pass before reporting success.

**Hybrid Logical Clocks (HLC):** Combine physical time with logical counters. CockroachDB
uses HLC to achieve Spanner-like consistency without atomic clocks, relying on NTP with
clock skew detection and bounded staleness reads.
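
To show how an HLC combines the two components, here is a minimal sketch of the commonly
described HLC update rules. It is not CockroachDB's implementation; the class name, the
injectable `physical_clock` parameter, and the `(wall, logical)` tuple layout are
assumptions made for illustration.

```python
import time
from typing import Callable, Tuple

Timestamp = Tuple[int, int]  # (wall-clock component, logical counter)


class HybridLogicalClock:
    def __init__(self, physical_clock: Callable[[], int] = time.time_ns) -> None:
        self._physical = physical_clock
        self.wall = 0
        self.logical = 0

    def now(self) -> Timestamp:
        """Timestamp for a local or send event."""
        pt = self._physical()
        if pt > self.wall:
            # Physical time moved forward: adopt it and reset the counter.
            self.wall, self.logical = pt, 0
        else:
            # Physical time stalled or went backwards: advance the counter.
            self.logical += 1
        return (self.wall, self.logical)

    def update(self, remote: Timestamp) -> Timestamp:
        """Timestamp for a receive event carrying the sender's timestamp."""
        pt = self._physical()
        remote_wall, remote_logical = remote
        new_wall = max(self.wall, remote_wall, pt)
        if new_wall == self.wall and new_wall == remote_wall:
            self.logical = max(self.logical, remote_logical) + 1
        elif new_wall == self.wall:
            self.logical += 1
        elif new_wall == remote_wall:
            self.logical = remote_logical + 1
        else:
            self.logical = 0
        self.wall = new_wall
        return (self.wall, self.logical)


clock = HybridLogicalClock()
first = clock.now()
second = clock.now()
print(first < second)  # tuples compare lexicographically
```

Injecting the physical clock keeps the sketch testable: a fake clock that stalls or jumps
backwards shows that the logical counter preserves ordering.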

---

## Trade-Offs Matrix

| Dimension | Option A | Option B | Key Tension |
|-----------|----------|----------|-------------|
| Consistency vs. Availability | CP: correct data, may reject during partitions | AP: always responds, may return stale data | CAP forces this choice during partitions only |
| Latency vs. Consistency | Synchronous replication: higher latency, no data loss | Async replication: lower latency, risk of data loss | Cross-region sync adds 50-200ms RTT per replica |
| Throughput vs. Ordering | Total order broadcast: limited by leader | Partial/causal ordering: higher throughput | Partition data for independent parallel total orders |
| Fault tolerance vs. Cost | More replicas (2f+1) | Fewer replicas | f=1→3 nodes; f=2→5 nodes (67% more hardware, marginal gain) |
| Simplicity vs. Resilience | Single-region | Multi-region | Multi-region adds 50-200ms latency; worth it only for region-level DR |
| Autonomy vs. Coordination | Shared database: strong consistency | DB-per-service: loose coupling, saga complexity | Shared DB is a coupling and scaling bottleneck |
| Exactly-once vs. At-least-once | Distributed transactions (higher latency) | Idempotent operations (simpler infra) | True exactly-once is impossible (Two Generals); use idempotent at-least-once |
| Sync vs. Async communication | Request/response: simple, caller blocked | Events/messages: decoupled, harder to debug | Sync creates temporal coupling; async adds infra and ordering complexity |

---

## Evolution Path

**Stage 1 -- Single Process.** All logic in one process, one database. Correct starting
point for nearly all systems. Shared memory, ACID transactions, simple debugging.
*Outgrow when:* you need independent scaling, independent deployment for team autonomy, or
the codebase is too large for one team without constant merge conflicts.

**Stage 2 -- Monolith + External Services.** Database, cache, third-party APIs. You already
have a distributed system. Add: connection pools, timeouts, retry with backoff, circuit
breakers, dependency health checks.
*Outgrow when:* module boundaries need enforcement, teams step on each other, deployments
are slow due to monolith size.

**Stage 3 -- Modular Monolith.** Single deployable with enforced module boundaries (e.g.,
ArchUnit, linting rules). Each module owns its data. Inter-module interfaces could later
become network boundaries.
*Outgrow when:* specific modules need independent scaling, different tech stacks, or
independent deployment cadences.

**Stage 4 -- Selective Extraction.** Extract services only for concrete reasons (scaling,
team autonomy, tech mismatch). Add: service discovery, distributed tracing, centralized
logging, API gateways, SLAs. Each extraction is a deliberate decision with documented
justification, not speculation.

**Stage 5 -- Distributed at Scale.** Multiple services, databases, queues. Invest in
platform engineering: service mesh, deployment pipelines, observability. Establish data
patterns (sagas, outbox, CDC). Run chaos engineering. Maintain architecture decision
records for every cross-cutting concern.

**Stage 6 -- Multi-Region.** Choose consistency models per data type (strong for financial,
eventual for preferences). Implement conflict resolution (CRDTs, last-writer-wins,
application-level merge). Design for region-level failure (active-active or
active-passive). Address data sovereignty and regulatory requirements.

---

## Failure Modes

### Network Partition

**AWS eu-north-1 (February 2025).** A networking fault in Stockholm disrupted inter-AZ
traffic. Services in eun1-az3 could not reach other AZs, but external connectivity was
fine. EC2, S3, Lambda, DynamoDB, CloudWatch all degraded from broken internal
service-to-service calls. *Mitigation:* design for partition tolerance from the start; use
consensus for must-be-consistent data; accept eventual consistency where staleness is
tolerable; monitor internal network health, not just external probes.

### Split Brain

**GitHub (2013).** A network partition caused split brain between database replicas. Both
sides accepted writes, creating divergent state requiring manual reconciliation.
**AWS US-East-1 (2013).** Network partition caused split brain with customer-facing
inconsistencies. *Mitigation:* quorum-based writes (require majority); fencing tokens
(monotonic epoch numbers -- Kafka uses these; HDFS uses ZooKeeper for single-NameNode
guarantee); never auto-promote a leader without quorum.
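
The fencing-token idea can be sketched as a storage layer that remembers the highest token
it has accepted and rejects anything older. `FencedStore` and `StaleTokenError` are
hypothetical names, and the sketch assumes tokens are issued as monotonically increasing
integers by whatever service elects the leader.

```python
from typing import Any, Dict, Optional


class StaleTokenError(Exception):
    """Raised when a writer presents a token older than one already accepted."""


class FencedStore:
    def __init__(self) -> None:
        self._highest_token = 0
        self._data: Dict[str, Any] = {}

    def write(self, token: int, key: str, value: Any) -> None:
        # A deposed leader that wakes up late still holds its old, smaller token.
        if token < self._highest_token:
            raise StaleTokenError(
                f"token {token} is older than accepted token {self._highest_token}"
            )
        self._highest_token = token
        self._data[key] = value

    def read(self, key: str) -> Optional[Any]:
        return self._data.get(key)


store = FencedStore()
store.write(token=1, key="primary", value="node-a")   # old leader
store.write(token=2, key="primary", value="node-b")   # new leader after failover
try:
    store.write(token=1, key="primary", value="node-a")  # old leader resumes
except StaleTokenError as exc:
    print("rejected:", exc)
```

The check lives in the resource being protected, so it still works when the old leader has
no idea it was replaced.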

### Clock Drift

A distributed lock with a 30-second TTL: if an observer's clock runs 5 seconds ahead of the
holder's, the lock appears expired to that observer 5 real seconds early, while the holder
still believes it owns the lock. Log timestamps from different servers cannot be reliably
compared within the drift margin. *Mitigation:* logical clocks for ordering; fencing tokens
instead of time-based leases; bounded-uncertainty clocks (TrueTime, HLC); monitor and alert
on NTP skew.

### Cascading Failures

**Amazon DynamoDB (September 2015).** A transient network problem caused storage servers to
miss partition assignments. They retried against metadata servers, which became overwhelmed,
causing more timeouts and more retries -- a positive feedback loop that cascaded for 4+
hours. *Mitigation:* circuit breakers, bulkheads, exponential backoff with jitter, load
shedding, health checks with dependency awareness.

### Thundering Herd

**PayPal Braintree Disputes API (2018).** Failed jobs retried at static intervals
(no jitter), compounding load. Autoscaling took 45+ seconds to respond; existing servers
were overwhelmed during the gap. *Mitigation:* staggered cache TTLs (random jitter),
request coalescing, rate limiting, connection draining before restarts, exponential backoff
with jitter.
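
Staggered cache TTLs can be sketched as a base TTL plus a random offset, so entries written
at the same moment do not all expire together. The function name and the default 10% spread
are illustrative assumptions rather than recommended values.

```python
import random


def jittered_ttl(base_ttl_s: float, jitter_fraction: float = 0.1, rng=random) -> float:
    """Return base_ttl_s shifted by a uniform offset of up to +/- jitter_fraction.

    `rng` may be the `random` module or a seeded `random.Random` for repeatable runs.
    """
    if base_ttl_s <= 0 or not 0 <= jitter_fraction < 1:
        raise ValueError("base_ttl_s must be > 0 and jitter_fraction in [0, 1)")
    spread = base_ttl_s * jitter_fraction
    return base_ttl_s + rng.uniform(-spread, spread)


# 1,000 keys cached at the same instant now expire across a 60-second window
# instead of all at the 300-second mark.
rng = random.Random(42)
ttls = [jittered_ttl(300, 0.1, rng) for _ in range(1000)]
print(min(ttls), max(ttls))
```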

### BGP / Routing Misconfiguration

**Cloudflare BGP Incident.** During scheduled maintenance of Cloudflare's Hong Kong data
center, outbound routes were entered into the inbound interface, causing global traffic to
be directed to an offline data center. It took ~15 minutes to correct routes network-wide.
In a separate incident, an internal configuration error caused Cloudflare's 1.1.1.1 route
announcements to disappear from the global routing table entirely. The lesson: distributed
systems fail in ways that are almost impossible to predict until they actually fail in
production. *Mitigation:* route announcement monitoring, BGP session alerting, staged
rollouts for network configuration changes, automated rollback on anomaly detection.

### Message Ordering and Duplication

Messages arrive out of order or are delivered more than once. Systems assuming FIFO delivery
or exactly-once semantics produce incorrect results. *Mitigation:* idempotent consumers
(same message twice = same result); idempotency keys for deduplication; single partition
per ordering key (Kafka) for ordered processing; sequence numbers with gap detection.
Accept that exactly-once delivery is impossible in the general case (Two Generals' Problem)
and design for at-least-once with idempotent processing.
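
Sequence numbers with gap detection and duplicate suppression can be combined in one
consumer. This is a minimal in-memory sketch: `OrderedConsumer` is a hypothetical name, and
it assumes each ordering key carries sequence numbers starting at 1.

```python
from typing import Any, Callable, Dict, List


class OrderedConsumer:
    """Processes each message once, in sequence order, per ordering key."""

    def __init__(self, handler: Callable[[str, int, Any], None]) -> None:
        self._handler = handler
        self._next_seq: Dict[str, int] = {}            # next expected seq per key
        self._pending: Dict[str, Dict[int, Any]] = {}  # out-of-order messages

    def on_message(self, key: str, seq: int, payload: Any) -> str:
        expected = self._next_seq.get(key, 1)
        buffered = self._pending.setdefault(key, {})
        if seq < expected or seq in buffered:
            return "duplicate"          # already processed or already waiting
        buffered[seq] = payload
        if seq > expected:
            return "buffered"           # gap: an earlier message is still missing
        # seq == expected: process it and anything it unblocks.
        while expected in buffered:
            self._handler(key, expected, buffered.pop(expected))
            expected += 1
        self._next_seq[key] = expected
        return "processed"

    def missing(self, key: str) -> List[int]:
        """Sequence numbers we are still waiting for before buffered work can run."""
        buffered = self._pending.get(key, {})
        if not buffered:
            return []
        expected = self._next_seq.get(key, 1)
        return [s for s in range(expected, max(buffered)) if s not in buffered]


seen = []
consumer = OrderedConsumer(lambda key, seq, payload: seen.append((seq, payload)))
consumer.on_message("order-1", 1, "created")
consumer.on_message("order-1", 3, "shipped")   # arrives early, held back
print(consumer.missing("order-1"))
```

A real consumer would persist the expected sequence number alongside the side effects so
that a restart does not reprocess messages.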

### Dual-Write Inconsistency

Writing to a database AND a message broker without coordination: if one succeeds and the
other fails, stores diverge permanently. *Mitigation:* transactional outbox pattern (write
event to outbox table in the same DB transaction; relay to broker asynchronously); change
data capture (CDC); never rely on application-level coordination of two independent writes.
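
The transactional outbox pattern can be sketched with the standard-library `sqlite3`
module. The table names, the `order.placed` topic, and the `publish` callback are
hypothetical stand-ins; a real relay would publish to an actual broker.

```python
import json
import sqlite3
from typing import Callable


def create_schema(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS orders (
            id TEXT PRIMARY KEY,
            total_cents INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn: sqlite3.Connection, order_id: str, total_cents: int) -> None:
    event = json.dumps({"order_id": order_id, "total_cents": total_cents})
    with conn:  # one transaction: both rows commit, or neither does
        conn.execute("INSERT INTO orders (id, total_cents) VALUES (?, ?)",
                     (order_id, total_cents))
        conn.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                     ("order.placed", event))


def relay_outbox(conn: sqlite3.Connection, publish: Callable[[str, str], None]) -> int:
    """Publish unpublished events in order; returns how many were relayed."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)  # if this raises, the row stays unpublished
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


conn = sqlite3.connect(":memory:")
create_schema(conn)
place_order(conn, "o-1", 1250)
sent = []
relayed = relay_outbox(conn, lambda topic, payload: sent.append((topic, payload)))
print(relayed, sent)
```

A crash between `publish` and the `UPDATE` re-sends the event on the next relay run, so this
gives at-least-once delivery and still relies on idempotent consumers.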

---

## Technology Landscape

### Service Discovery

| Technology | Model | Consistency | Notes |
|-----------|-------|-------------|-------|
| **Consul** | Registry + DNS + KV | Raft (CP) | Multi-datacenter support |
| **etcd** | Key-value with watch | Raft (CP) | Kubernetes control plane backend |
| **ZooKeeper** | Hierarchical KV | ZAB (CP) | Kafka (older), HDFS, HBase |
| **Eureka** | AP registry | Peer replication (AP) | Netflix OSS; favors availability |
| **Kubernetes DNS** | DNS-based | Eventually consistent | Built into k8s, no extra infra |

### Consensus Protocols

| Protocol | Fault Model | Complexity | Used By |
|----------|------------|------------|---------|
| **Raft** | Crash-recovery | Moderate (designed for clarity) | etcd, Consul, CockroachDB, TiKV |
| **Paxos/Multi-Paxos** | Crash-recovery | High | Google Chubby, Spanner, Azure Storage |
| **ZAB** | Crash-recovery | Moderate | ZooKeeper |
| **PBFT** | Byzantine | Very High (O(n^2) messages) | Hyperledger Fabric (v0.6), Tendermint (PBFT-derived) |

### Observability

| Technology | Scope | Notes |
|-----------|-------|-------|
| **OpenTelemetry** | Traces + metrics + logs | Industry standard; vendor-neutral |
| **Jaeger** | Distributed tracing | CNCF graduated; strong k8s integration |
| **Grafana Tempo** | Trace storage | Cost-effective; object storage backend |

---

## Decision Tree

```
Does your system span more than one process?
├── NO → Does it talk to a database or external API?
│   ├── YES → You have a distributed system. Continue.
│   └── NO → You do not need this module.
│
└── YES → Can you avoid distribution? (modular monolith instead?)
    ├── YES → Prefer single-process. See → modular-monolith
    └── NO → How many services?
        ├── 2-5: Sync HTTP/gRPC. Timeouts + retries + circuit breakers.
        │   See → circuit-breaker-bulkhead, idempotency-and-retry
        ├── 6-20: Service discovery. Distributed tracing. Event-driven
        │   for decoupled flows. DB-per-service. Sagas.
        │   See → microservices, consensus-and-coordination
        └── 20+: Service mesh. Platform team. Chaos engineering.
            │
            ├── Need strong consistency? (financial, inventory)
            │   → Consensus-backed stores. 2PC or sagas.
            │   See → cap-theorem-and-tradeoffs
            ├── Eventual consistency OK? (feeds, analytics)
            │   → Event-driven + idempotent consumers.
            │   See → idempotency-and-retry
            └── Multi-region needed?
                → Per-data-type consistency model. CRDTs. Conflict resolution.
                See → cap-theorem-and-tradeoffs
```

---

## Implementation Sketch

### Minimal Distributed Hygiene (Every Networked Application)

```python
import random
import time
import uuid

import requests


class RetryableError(Exception):
    """Transient failure that is safe to retry (timeout, HTTP 503, ...)."""

# 1. TIMEOUTS -- never make a network call without one
response = requests.get("https://api.example.com/data",
                        timeout=(3, 10))  # (connect timeout, read timeout), seconds

# 2. RETRY WITH EXPONENTIAL BACKOFF AND JITTER
def retry_with_backoff(fn, max_retries=3, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return fn()
        except RetryableError:
            if attempt == max_retries - 1:
                raise
            # Full jitter: random delay up to the exponential cap
            delay = random.uniform(0, base_delay * (2 ** attempt))
            time.sleep(delay)

# 3. IDEMPOTENCY KEYS -- make operations safe to retry
# (url and data are placeholders for the real endpoint and payload)
# Generate the key once per operation and reuse it on every retry.
idempotency_key = str(uuid.uuid4())
requests.post(url, json=data, timeout=(3, 10),
              headers={"Idempotency-Key": idempotency_key})
# Server: check if key seen before → return cached response; else process and cache.
```
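
The server-side half of step 3 is only described in a comment above. A minimal in-memory sketch of it follows. The names `IdempotencyStore`, `handle`, and `charge_card` are hypothetical, and a real service would likely keep keys in a shared, persistent store with an expiry rather than in a process-local dict.

```python
import threading


class IdempotencyStore:
    """In-memory map from idempotency key to the cached response (illustrative only)."""

    def __init__(self):
        self._responses = {}
        self._lock = threading.Lock()

    def handle(self, key, process):
        """Run process() at most once per key; replay the cached result on duplicates."""
        # Holding the lock while processing serializes all requests. That keeps
        # the sketch simple but is not how a high-throughput service would do it.
        with self._lock:
            if key in self._responses:
                return self._responses[key]  # Duplicate: replay, do not re-run
            response = process()  # If this raises, nothing is cached
            self._responses[key] = response
            return response


# Usage: a retried request with the same key does not charge twice.
charges = []

def charge_card():
    charges.append(25)
    return {"status": "charged", "amount": 25}

store = IdempotencyStore()
first = store.handle("key-abc", charge_card)
second = store.handle("key-abc", charge_card)  # Simulated client retry
```

Because a failed `process()` call caches nothing, a retry after a server-side error runs the operation again, which is the behavior a client retry loop expects.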

### Circuit Breaker (Preventing Cascading Failures)

```python
import time
from enum import Enum

class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being attempted."""

class CircuitState(Enum):
    CLOSED = "closed"        # Normal operation, requests pass through
    OPEN = "open"            # Failures exceeded threshold, requests fast-rejected
    HALF_OPEN = "half_open"  # Testing if downstream has recovered

class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout  # seconds
        self.state = CircuitState.CLOSED
        self.failure_count = 0
        self.last_failure_time = 0.0

    def call(self, fn):
        if self.state == CircuitState.OPEN:
            # Monotonic clock: unaffected by wall-clock adjustments
            if time.monotonic() - self.last_failure_time > self.recovery_timeout:
                self.state = CircuitState.HALF_OPEN  # Allow one probe request
            else:
                raise CircuitOpenError("Circuit open -- request rejected immediately")

        try:
            result = fn()
            # Any success resets the count, so the threshold means consecutive failures
            self.failure_count = 0
            if self.state == CircuitState.HALF_OPEN:
                self.state = CircuitState.CLOSED  # Probe succeeded, recover
            return result
        except Exception:
            self.failure_count += 1
            self.last_failure_time = time.monotonic()
            if self.failure_count >= self.failure_threshold:
                self.state = CircuitState.OPEN
            raise

# Usage: wrap every downstream call in a per-service circuit breaker.
# When a downstream fails repeatedly, the breaker opens and requests
# are rejected instantly -- preventing thread pool exhaustion and
# cascading failure upstream.
```

### Health Check with Dependency Awareness

```python
import requests

# check_dep, db, redis, and url are assumed to be defined elsewhere.
def check_health():
    """Report on all dependencies. A service is only as healthy as its
    required dependencies."""
    deps = [
        check_dep("postgresql", lambda: db.execute("SELECT 1"),
                  timeout_ms=5000, required=True),
        check_dep("redis", lambda: redis.ping(),
                  timeout_ms=1000, required=False),  # Degraded without it
        check_dep("payment-svc", lambda: requests.get(url, timeout=2),
                  timeout_ms=2000, required=True),
    ]
    required_down = any(d.unhealthy for d in deps if d.required)
    any_down = any(d.unhealthy for d in deps)
    status = "unhealthy" if required_down else "degraded" if any_down else "healthy"
    return {"status": status, "dependencies": [d.to_dict() for d in deps]}

# Key insight: distinguish REQUIRED dependencies (service cannot function)
# from OPTIONAL dependencies (service is degraded but still useful).
# Load balancers use this to route traffic away from unhealthy instances.
```
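
The health check above calls a `check_dep` helper that the module does not define. One possible shape for it is sketched below, under stated assumptions: the `DepStatus` dataclass, its `error` field, and the keys returned by `to_dict` are illustrative choices, not part of the original. The probe runs on a worker thread so that a hung dependency cannot stall the health endpoint past `timeout_ms`.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DepStatus:
    name: str
    required: bool
    unhealthy: bool
    error: Optional[str] = None

    def to_dict(self):
        return {"name": self.name, "required": self.required,
                "healthy": not self.unhealthy, "error": self.error}


_pool = ThreadPoolExecutor(max_workers=4)


def check_dep(name: str, probe: Callable[[], object],
              timeout_ms: int, required: bool) -> DepStatus:
    """Run probe() with a deadline; an exception or a timeout marks it unhealthy."""
    future = _pool.submit(probe)
    try:
        future.result(timeout=timeout_ms / 1000)
        return DepStatus(name, required, unhealthy=False)
    except Exception as exc:  # Includes the timeout error raised at the deadline
        return DepStatus(name, required, unhealthy=True, error=type(exc).__name__)


def failing_probe():
    raise ConnectionError("refused")

ok = check_dep("cache", lambda: True, timeout_ms=500, required=False)
down = check_dep("db", failing_probe, timeout_ms=500, required=True)
slow = check_dep("slow-svc", lambda: time.sleep(0.3), timeout_ms=50, required=True)
```

A timed-out probe keeps running on its worker thread until it returns, so the probe itself should still carry its own network timeout, as the `payment-svc` probe above does.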

### Transactional Outbox (Preventing Dual-Write Inconsistency)

```sql
BEGIN TRANSACTION;
INSERT INTO orders (id, customer_id, total) VALUES ('ord-123', 'cust-456', 99.99);
INSERT INTO outbox (id, aggregate_type, event_type, payload, created_at)
VALUES ('evt-789', 'Order', 'OrderCreated',
        '{"orderId":"ord-123","total":99.99}', NOW());
COMMIT;
-- Separate relay process polls outbox, publishes to broker, marks published.
-- Consumer must be idempotent (relay may re-publish on crash).
```
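
The relay mentioned in the SQL comments could look roughly like the sketch below. It is written against an in-memory SQLite database so that it runs standalone. The `published_at` column, the `relay_once` function, and the `publish` callback are assumptions made for illustration; the original schema does not say how rows are marked as published.

```python
import json
import sqlite3


def relay_once(conn, publish, batch_size=100):
    """Publish unpublished outbox rows in creation order, then mark each as published."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published_at IS NULL ORDER BY created_at, id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for event_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))  # May raise; row stays unpublished
        # The row is marked only after publishing, so a crash between the two
        # steps leads to a re-publish on the next poll (at-least-once delivery).
        conn.execute(
            "UPDATE outbox SET published_at = CURRENT_TIMESTAMP WHERE id = ?",
            (event_id,),
        )
        conn.commit()
    return len(rows)


# Demo: SQLite stands in for the service database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (id TEXT PRIMARY KEY, aggregate_type TEXT, event_type TEXT,"
    " payload TEXT, created_at TEXT, published_at TEXT)"
)
conn.execute(
    "INSERT INTO outbox (id, aggregate_type, event_type, payload, created_at)"
    " VALUES ('evt-789', 'Order', 'OrderCreated',"
    " '{\"orderId\":\"ord-123\",\"total\":99.99}', '2026-01-01T00:00:00')"
)
conn.commit()

published = []
sent = relay_once(conn, lambda event_type, body: published.append((event_type, body)))
```

Publishing before marking is the deliberate ordering here: it can duplicate an event after a crash but cannot lose one, which is why the consumer has to be idempotent.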

---

## Cross-References

- **[cap-theorem-and-tradeoffs](../distributed/cap-theorem-and-tradeoffs.md):** CAP
  implications, PACELC, per-operation consistency, system positioning on the C-A spectrum.
- **[consensus-and-coordination](../distributed/consensus-and-coordination.md):** Paxos,
  Raft, ZAB, leader election, distributed locking, coordination services.
- **[circuit-breaker-bulkhead](../../patterns/circuit-breaker-bulkhead.md):** Fault
  isolation -- circuit breakers, bulkheads, timeouts, load shedding.
- **[idempotency-and-retry](../../patterns/idempotency-and-retry.md):** Idempotency keys,
  deduplication, exactly-once via at-least-once with idempotent consumers.
- **[microservices](../distributed/microservices.md):** Service decomposition, boundaries,
  data ownership, inter-service communication, operational cost of distribution.

---

## Key Takeaways

1. **You are already building distributed systems.** Database, API, cache -- any network
   boundary makes you distributed. Treat it accordingly.
2. **Distribution is never free.** Every boundary adds latency, failure modes, consistency
   challenges, and operational cost. Single-process first.
3. **Partial failure is the defining challenge.** Design every interaction for the ambiguity
   of "did it succeed or not?"
4. **The Eight Fallacies are not theoretical.** They are production incidents waiting to
   happen.
5. **Know the impossibility results.** CAP, FLP, and BFT define what is achievable. Choose
   trade-offs; do not pretend they do not exist.
6. **Clocks lie.** Use logical clocks for ordering, fencing tokens for mutual exclusion,
   bounded-uncertainty clocks when physical time is necessary.
7. **Idempotency is the universal safety net.** In a world of retries and at-least-once
   delivery, idempotent operations prevent data corruption.
8. **Start simple. Distribute when forced.** Monolith first. Modular monolith second.
   Extract services for concrete, measurable reasons.
|