@wazir-dev/cli 1.0.0

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.
Files changed (629)
  1. package/AGENTS.md +111 -0
  2. package/CHANGELOG.md +14 -0
  3. package/CONTRIBUTING.md +101 -0
  4. package/LICENSE +21 -0
  5. package/README.md +314 -0
  6. package/assets/composition-engine.mmd +34 -0
  7. package/assets/demo-script.sh +17 -0
  8. package/assets/logo-dark.svg +14 -0
  9. package/assets/logo.svg +14 -0
  10. package/assets/pipeline.mmd +39 -0
  11. package/assets/record-demo.sh +51 -0
  12. package/docs/README.md +51 -0
  13. package/docs/adapters/context-mode.md +60 -0
  14. package/docs/concepts/architecture.md +87 -0
  15. package/docs/concepts/artifact-model.md +60 -0
  16. package/docs/concepts/composition-engine.md +36 -0
  17. package/docs/concepts/indexing-and-recall.md +160 -0
  18. package/docs/concepts/observability.md +41 -0
  19. package/docs/concepts/roles-and-workflows.md +59 -0
  20. package/docs/concepts/terminology-policy.md +27 -0
  21. package/docs/getting-started/01-installation.md +78 -0
  22. package/docs/getting-started/02-first-run.md +102 -0
  23. package/docs/getting-started/03-adding-to-project.md +15 -0
  24. package/docs/getting-started/04-host-setup.md +15 -0
  25. package/docs/guides/ci-integration.md +15 -0
  26. package/docs/guides/creating-skills.md +15 -0
  27. package/docs/guides/expertise-module-authoring.md +15 -0
  28. package/docs/guides/hook-development.md +15 -0
  29. package/docs/guides/memory-and-learnings.md +34 -0
  30. package/docs/guides/multi-host-export.md +15 -0
  31. package/docs/guides/troubleshooting.md +101 -0
  32. package/docs/guides/writing-custom-roles.md +15 -0
  33. package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
  34. package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
  35. package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
  36. package/docs/readmes/INDEX.md +99 -0
  37. package/docs/readmes/features/expertise/README.md +171 -0
  38. package/docs/readmes/features/exports/README.md +222 -0
  39. package/docs/readmes/features/hooks/README.md +103 -0
  40. package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
  41. package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
  42. package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
  43. package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
  44. package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
  45. package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
  46. package/docs/readmes/features/hooks/session-start.md +119 -0
  47. package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
  48. package/docs/readmes/features/roles/README.md +157 -0
  49. package/docs/readmes/features/roles/clarifier.md +152 -0
  50. package/docs/readmes/features/roles/content-author.md +190 -0
  51. package/docs/readmes/features/roles/designer.md +193 -0
  52. package/docs/readmes/features/roles/executor.md +184 -0
  53. package/docs/readmes/features/roles/learner.md +210 -0
  54. package/docs/readmes/features/roles/planner.md +182 -0
  55. package/docs/readmes/features/roles/researcher.md +164 -0
  56. package/docs/readmes/features/roles/reviewer.md +184 -0
  57. package/docs/readmes/features/roles/specifier.md +162 -0
  58. package/docs/readmes/features/roles/verifier.md +215 -0
  59. package/docs/readmes/features/schemas/README.md +178 -0
  60. package/docs/readmes/features/skills/README.md +63 -0
  61. package/docs/readmes/features/skills/brainstorming.md +96 -0
  62. package/docs/readmes/features/skills/debugging.md +148 -0
  63. package/docs/readmes/features/skills/design.md +120 -0
  64. package/docs/readmes/features/skills/prepare-next.md +109 -0
  65. package/docs/readmes/features/skills/run-audit.md +159 -0
  66. package/docs/readmes/features/skills/scan-project.md +109 -0
  67. package/docs/readmes/features/skills/self-audit.md +176 -0
  68. package/docs/readmes/features/skills/tdd.md +137 -0
  69. package/docs/readmes/features/skills/using-skills.md +92 -0
  70. package/docs/readmes/features/skills/verification.md +120 -0
  71. package/docs/readmes/features/skills/writing-plans.md +104 -0
  72. package/docs/readmes/features/tooling/README.md +320 -0
  73. package/docs/readmes/features/workflows/README.md +186 -0
  74. package/docs/readmes/features/workflows/author.md +181 -0
  75. package/docs/readmes/features/workflows/clarify.md +154 -0
  76. package/docs/readmes/features/workflows/design-review.md +171 -0
  77. package/docs/readmes/features/workflows/design.md +169 -0
  78. package/docs/readmes/features/workflows/discover.md +162 -0
  79. package/docs/readmes/features/workflows/execute.md +173 -0
  80. package/docs/readmes/features/workflows/learn.md +167 -0
  81. package/docs/readmes/features/workflows/plan-review.md +165 -0
  82. package/docs/readmes/features/workflows/plan.md +170 -0
  83. package/docs/readmes/features/workflows/prepare-next.md +167 -0
  84. package/docs/readmes/features/workflows/review.md +169 -0
  85. package/docs/readmes/features/workflows/run-audit.md +191 -0
  86. package/docs/readmes/features/workflows/spec-challenge.md +159 -0
  87. package/docs/readmes/features/workflows/specify.md +160 -0
  88. package/docs/readmes/features/workflows/verify.md +177 -0
  89. package/docs/readmes/packages/README.md +50 -0
  90. package/docs/readmes/packages/ajv.md +117 -0
  91. package/docs/readmes/packages/context-mode.md +118 -0
  92. package/docs/readmes/packages/gray-matter.md +116 -0
  93. package/docs/readmes/packages/node-test.md +137 -0
  94. package/docs/readmes/packages/yaml.md +112 -0
  95. package/docs/reference/configuration-reference.md +159 -0
  96. package/docs/reference/expertise-index.md +52 -0
  97. package/docs/reference/git-flow.md +43 -0
  98. package/docs/reference/hooks.md +87 -0
  99. package/docs/reference/host-exports.md +50 -0
  100. package/docs/reference/launch-checklist.md +172 -0
  101. package/docs/reference/marketplace-listings.md +76 -0
  102. package/docs/reference/release-process.md +34 -0
  103. package/docs/reference/roles-reference.md +77 -0
  104. package/docs/reference/skills.md +33 -0
  105. package/docs/reference/templates.md +29 -0
  106. package/docs/reference/tooling-cli.md +94 -0
  107. package/docs/truth-claims.yaml +222 -0
  108. package/expertise/PROGRESS.md +63 -0
  109. package/expertise/README.md +18 -0
  110. package/expertise/antipatterns/PROGRESS.md +56 -0
  111. package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
  112. package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
  113. package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
  114. package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
  115. package/expertise/antipatterns/backend/index.md +24 -0
  116. package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
  117. package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
  118. package/expertise/antipatterns/code/async-antipatterns.md +622 -0
  119. package/expertise/antipatterns/code/code-smells.md +1186 -0
  120. package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
  121. package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
  122. package/expertise/antipatterns/code/index.md +27 -0
  123. package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
  124. package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
  125. package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
  126. package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
  127. package/expertise/antipatterns/design/dark-patterns.md +1121 -0
  128. package/expertise/antipatterns/design/index.md +22 -0
  129. package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
  130. package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
  131. package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
  132. package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
  133. package/expertise/antipatterns/frontend/index.md +23 -0
  134. package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
  135. package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
  136. package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
  137. package/expertise/antipatterns/index.md +31 -0
  138. package/expertise/antipatterns/performance/index.md +20 -0
  139. package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
  140. package/expertise/antipatterns/performance/premature-optimization.md +623 -0
  141. package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
  142. package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
  143. package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
  144. package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
  145. package/expertise/antipatterns/process/index.md +23 -0
  146. package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
  147. package/expertise/antipatterns/security/index.md +20 -0
  148. package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
  149. package/expertise/antipatterns/security/security-theater.md +843 -0
  150. package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
  151. package/expertise/architecture/PROGRESS.md +70 -0
  152. package/expertise/architecture/data/caching-architecture.md +671 -0
  153. package/expertise/architecture/data/data-consistency.md +574 -0
  154. package/expertise/architecture/data/data-modeling.md +536 -0
  155. package/expertise/architecture/data/event-streams-and-queues.md +634 -0
  156. package/expertise/architecture/data/index.md +25 -0
  157. package/expertise/architecture/data/search-architecture.md +663 -0
  158. package/expertise/architecture/data/sql-vs-nosql.md +708 -0
  159. package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
  160. package/expertise/architecture/decisions/build-vs-buy.md +616 -0
  161. package/expertise/architecture/decisions/index.md +23 -0
  162. package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
  163. package/expertise/architecture/decisions/technology-selection.md +616 -0
  164. package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
  165. package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
  166. package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
  167. package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
  168. package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
  169. package/expertise/architecture/distributed/index.md +25 -0
  170. package/expertise/architecture/distributed/saga-pattern.md +797 -0
  171. package/expertise/architecture/foundations/architectural-thinking.md +460 -0
  172. package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
  173. package/expertise/architecture/foundations/design-principles-solid.md +649 -0
  174. package/expertise/architecture/foundations/domain-driven-design.md +719 -0
  175. package/expertise/architecture/foundations/index.md +25 -0
  176. package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
  177. package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
  178. package/expertise/architecture/index.md +34 -0
  179. package/expertise/architecture/integration/api-design-graphql.md +638 -0
  180. package/expertise/architecture/integration/api-design-grpc.md +804 -0
  181. package/expertise/architecture/integration/api-design-rest.md +892 -0
  182. package/expertise/architecture/integration/index.md +25 -0
  183. package/expertise/architecture/integration/third-party-integration.md +795 -0
  184. package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
  185. package/expertise/architecture/integration/websockets-realtime.md +791 -0
  186. package/expertise/architecture/mobile-architecture/index.md +22 -0
  187. package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
  188. package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
  189. package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
  190. package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
  191. package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
  192. package/expertise/architecture/patterns/event-driven.md +797 -0
  193. package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
  194. package/expertise/architecture/patterns/index.md +27 -0
  195. package/expertise/architecture/patterns/layered-architecture.md +736 -0
  196. package/expertise/architecture/patterns/microservices.md +753 -0
  197. package/expertise/architecture/patterns/modular-monolith.md +692 -0
  198. package/expertise/architecture/patterns/monolith.md +626 -0
  199. package/expertise/architecture/patterns/plugin-architecture.md +735 -0
  200. package/expertise/architecture/patterns/serverless.md +780 -0
  201. package/expertise/architecture/scaling/database-scaling.md +615 -0
  202. package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
  203. package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
  204. package/expertise/architecture/scaling/index.md +24 -0
  205. package/expertise/architecture/scaling/multi-tenancy.md +800 -0
  206. package/expertise/architecture/scaling/stateless-design.md +787 -0
  207. package/expertise/backend/embedded-firmware.md +625 -0
  208. package/expertise/backend/go.md +853 -0
  209. package/expertise/backend/index.md +24 -0
  210. package/expertise/backend/java-spring.md +448 -0
  211. package/expertise/backend/node-typescript.md +625 -0
  212. package/expertise/backend/python-fastapi.md +724 -0
  213. package/expertise/backend/rust.md +458 -0
  214. package/expertise/backend/solidity.md +711 -0
  215. package/expertise/composition-map.yaml +443 -0
  216. package/expertise/content/foundations/content-modeling.md +395 -0
  217. package/expertise/content/foundations/editorial-standards.md +449 -0
  218. package/expertise/content/foundations/index.md +24 -0
  219. package/expertise/content/foundations/microcopy.md +455 -0
  220. package/expertise/content/foundations/terminology-governance.md +509 -0
  221. package/expertise/content/index.md +34 -0
  222. package/expertise/content/patterns/accessibility-copy.md +518 -0
  223. package/expertise/content/patterns/index.md +24 -0
  224. package/expertise/content/patterns/notification-content.md +433 -0
  225. package/expertise/content/patterns/sample-content.md +486 -0
  226. package/expertise/content/patterns/state-copy.md +439 -0
  227. package/expertise/design/PROGRESS.md +58 -0
  228. package/expertise/design/disciplines/dark-mode-theming.md +577 -0
  229. package/expertise/design/disciplines/design-systems.md +595 -0
  230. package/expertise/design/disciplines/index.md +25 -0
  231. package/expertise/design/disciplines/information-architecture.md +800 -0
  232. package/expertise/design/disciplines/interaction-design.md +788 -0
  233. package/expertise/design/disciplines/responsive-design.md +552 -0
  234. package/expertise/design/disciplines/usability-testing.md +516 -0
  235. package/expertise/design/disciplines/user-research.md +792 -0
  236. package/expertise/design/foundations/accessibility-design.md +796 -0
  237. package/expertise/design/foundations/color-theory.md +797 -0
  238. package/expertise/design/foundations/iconography.md +795 -0
  239. package/expertise/design/foundations/index.md +26 -0
  240. package/expertise/design/foundations/motion-and-animation.md +653 -0
  241. package/expertise/design/foundations/rtl-design.md +585 -0
  242. package/expertise/design/foundations/spacing-and-layout.md +607 -0
  243. package/expertise/design/foundations/typography.md +800 -0
  244. package/expertise/design/foundations/visual-hierarchy.md +761 -0
  245. package/expertise/design/index.md +32 -0
  246. package/expertise/design/patterns/authentication-flows.md +474 -0
  247. package/expertise/design/patterns/content-consumption.md +789 -0
  248. package/expertise/design/patterns/data-display.md +618 -0
  249. package/expertise/design/patterns/e-commerce.md +1494 -0
  250. package/expertise/design/patterns/feedback-and-states.md +642 -0
  251. package/expertise/design/patterns/forms-and-input.md +819 -0
  252. package/expertise/design/patterns/gamification.md +801 -0
  253. package/expertise/design/patterns/index.md +31 -0
  254. package/expertise/design/patterns/microinteractions.md +449 -0
  255. package/expertise/design/patterns/navigation.md +800 -0
  256. package/expertise/design/patterns/notifications.md +705 -0
  257. package/expertise/design/patterns/onboarding.md +700 -0
  258. package/expertise/design/patterns/search-and-filter.md +601 -0
  259. package/expertise/design/patterns/settings-and-preferences.md +768 -0
  260. package/expertise/design/patterns/social-and-community.md +748 -0
  261. package/expertise/design/platforms/desktop-native.md +612 -0
  262. package/expertise/design/platforms/index.md +25 -0
  263. package/expertise/design/platforms/mobile-android.md +825 -0
  264. package/expertise/design/platforms/mobile-cross-platform.md +983 -0
  265. package/expertise/design/platforms/mobile-ios.md +699 -0
  266. package/expertise/design/platforms/tablet.md +794 -0
  267. package/expertise/design/platforms/web-dashboard.md +790 -0
  268. package/expertise/design/platforms/web-responsive.md +550 -0
  269. package/expertise/design/psychology/behavioral-nudges.md +449 -0
  270. package/expertise/design/psychology/cognitive-load.md +1191 -0
  271. package/expertise/design/psychology/error-psychology.md +778 -0
  272. package/expertise/design/psychology/index.md +22 -0
  273. package/expertise/design/psychology/persuasive-design.md +736 -0
  274. package/expertise/design/psychology/user-mental-models.md +623 -0
  275. package/expertise/design/tooling/open-pencil.md +266 -0
  276. package/expertise/frontend/angular.md +1073 -0
  277. package/expertise/frontend/desktop-electron.md +546 -0
  278. package/expertise/frontend/flutter.md +782 -0
  279. package/expertise/frontend/index.md +27 -0
  280. package/expertise/frontend/native-android.md +409 -0
  281. package/expertise/frontend/native-ios.md +490 -0
  282. package/expertise/frontend/react-native.md +1160 -0
  283. package/expertise/frontend/react.md +808 -0
  284. package/expertise/frontend/vue.md +1089 -0
  285. package/expertise/humanize/domain-rules-code.md +79 -0
  286. package/expertise/humanize/domain-rules-content.md +67 -0
  287. package/expertise/humanize/domain-rules-technical-docs.md +56 -0
  288. package/expertise/humanize/index.md +35 -0
  289. package/expertise/humanize/self-audit-checklist.md +87 -0
  290. package/expertise/humanize/sentence-patterns.md +218 -0
  291. package/expertise/humanize/vocabulary-blacklist.md +105 -0
  292. package/expertise/i18n/PROGRESS.md +65 -0
  293. package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
  294. package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
  295. package/expertise/i18n/advanced/complex-scripts.md +30 -0
  296. package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
  297. package/expertise/i18n/advanced/testing-i18n.md +28 -0
  298. package/expertise/i18n/content/content-adaptation.md +23 -0
  299. package/expertise/i18n/content/locale-specific-formatting.md +23 -0
  300. package/expertise/i18n/content/machine-translation-integration.md +28 -0
  301. package/expertise/i18n/content/translation-management.md +29 -0
  302. package/expertise/i18n/foundations/date-time-calendars.md +67 -0
  303. package/expertise/i18n/foundations/i18n-architecture.md +272 -0
  304. package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
  305. package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
  306. package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
  307. package/expertise/i18n/foundations/string-externalization.md +236 -0
  308. package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
  309. package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
  310. package/expertise/i18n/index.md +38 -0
  311. package/expertise/i18n/platform/backend-i18n.md +31 -0
  312. package/expertise/i18n/platform/flutter-i18n.md +148 -0
  313. package/expertise/i18n/platform/native-android-i18n.md +36 -0
  314. package/expertise/i18n/platform/native-ios-i18n.md +36 -0
  315. package/expertise/i18n/platform/react-i18n.md +103 -0
  316. package/expertise/i18n/platform/web-css-i18n.md +81 -0
  317. package/expertise/i18n/rtl/arabic-specific.md +175 -0
  318. package/expertise/i18n/rtl/hebrew-specific.md +149 -0
  319. package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
  320. package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
  321. package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
  322. package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
  323. package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
  324. package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
  325. package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
  326. package/expertise/i18n/rtl/rtl-typography.md +160 -0
  327. package/expertise/index.md +113 -0
  328. package/expertise/index.yaml +216 -0
  329. package/expertise/infrastructure/cloud-aws.md +597 -0
  330. package/expertise/infrastructure/cloud-gcp.md +599 -0
  331. package/expertise/infrastructure/cybersecurity.md +816 -0
  332. package/expertise/infrastructure/database-mongodb.md +447 -0
  333. package/expertise/infrastructure/database-postgres.md +400 -0
  334. package/expertise/infrastructure/devops-cicd.md +787 -0
  335. package/expertise/infrastructure/index.md +27 -0
  336. package/expertise/performance/PROGRESS.md +50 -0
  337. package/expertise/performance/backend/api-latency.md +1204 -0
  338. package/expertise/performance/backend/background-jobs.md +506 -0
  339. package/expertise/performance/backend/connection-pooling.md +1209 -0
  340. package/expertise/performance/backend/database-query-optimization.md +515 -0
  341. package/expertise/performance/backend/index.md +23 -0
  342. package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
  343. package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
  344. package/expertise/performance/foundations/caching-strategies.md +489 -0
  345. package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
  346. package/expertise/performance/foundations/index.md +24 -0
  347. package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
  348. package/expertise/performance/foundations/memory-management.md +964 -0
  349. package/expertise/performance/foundations/performance-budgets.md +1314 -0
  350. package/expertise/performance/index.md +31 -0
  351. package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
  352. package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
  353. package/expertise/performance/infrastructure/index.md +22 -0
  354. package/expertise/performance/infrastructure/load-balancing.md +1081 -0
  355. package/expertise/performance/infrastructure/observability.md +1079 -0
  356. package/expertise/performance/mobile/index.md +23 -0
  357. package/expertise/performance/mobile/mobile-animations.md +544 -0
  358. package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
  359. package/expertise/performance/mobile/mobile-network.md +452 -0
  360. package/expertise/performance/mobile/mobile-rendering.md +599 -0
  361. package/expertise/performance/mobile/mobile-startup-time.md +505 -0
  362. package/expertise/performance/platform-specific/flutter-performance.md +647 -0
  363. package/expertise/performance/platform-specific/index.md +22 -0
  364. package/expertise/performance/platform-specific/node-performance.md +1307 -0
  365. package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
  366. package/expertise/performance/platform-specific/react-performance.md +1403 -0
  367. package/expertise/performance/web/bundle-optimization.md +1239 -0
  368. package/expertise/performance/web/image-and-media.md +636 -0
  369. package/expertise/performance/web/index.md +24 -0
  370. package/expertise/performance/web/network-optimization.md +1133 -0
  371. package/expertise/performance/web/rendering-performance.md +1098 -0
  372. package/expertise/performance/web/ssr-and-hydration.md +918 -0
  373. package/expertise/performance/web/web-vitals.md +1374 -0
  374. package/expertise/quality/accessibility.md +985 -0
  375. package/expertise/quality/evidence-based-verification.md +499 -0
  376. package/expertise/quality/index.md +24 -0
  377. package/expertise/quality/ml-model-audit.md +614 -0
  378. package/expertise/quality/performance.md +600 -0
  379. package/expertise/quality/testing-api.md +891 -0
  380. package/expertise/quality/testing-mobile.md +496 -0
  381. package/expertise/quality/testing-web.md +849 -0
  382. package/expertise/security/PROGRESS.md +54 -0
  383. package/expertise/security/agentic-identity.md +540 -0
  384. package/expertise/security/compliance-frameworks.md +601 -0
  385. package/expertise/security/data/data-encryption.md +364 -0
  386. package/expertise/security/data/data-privacy-gdpr.md +692 -0
  387. package/expertise/security/data/database-security.md +1171 -0
  388. package/expertise/security/data/index.md +22 -0
  389. package/expertise/security/data/pii-handling.md +531 -0
  390. package/expertise/security/foundations/authentication.md +1041 -0
  391. package/expertise/security/foundations/authorization.md +603 -0
  392. package/expertise/security/foundations/cryptography.md +1001 -0
  393. package/expertise/security/foundations/index.md +25 -0
  394. package/expertise/security/foundations/owasp-top-10.md +1354 -0
  395. package/expertise/security/foundations/secrets-management.md +1217 -0
  396. package/expertise/security/foundations/secure-sdlc.md +700 -0
  397. package/expertise/security/foundations/supply-chain-security.md +698 -0
  398. package/expertise/security/index.md +31 -0
  399. package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
  400. package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
  401. package/expertise/security/infrastructure/container-security.md +721 -0
  402. package/expertise/security/infrastructure/incident-response.md +1295 -0
  403. package/expertise/security/infrastructure/index.md +24 -0
  404. package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
  405. package/expertise/security/infrastructure/network-security.md +1337 -0
  406. package/expertise/security/mobile/index.md +23 -0
  407. package/expertise/security/mobile/mobile-android-security.md +1218 -0
  408. package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
  409. package/expertise/security/mobile/mobile-data-storage.md +1265 -0
  410. package/expertise/security/mobile/mobile-ios-security.md +1401 -0
  411. package/expertise/security/mobile/mobile-network-security.md +1520 -0
  412. package/expertise/security/smart-contract-security.md +594 -0
  413. package/expertise/security/testing/index.md +22 -0
  414. package/expertise/security/testing/penetration-testing.md +1258 -0
  415. package/expertise/security/testing/security-code-review.md +1765 -0
  416. package/expertise/security/testing/threat-modeling.md +1074 -0
  417. package/expertise/security/testing/vulnerability-scanning.md +1062 -0
  418. package/expertise/security/web/api-security.md +586 -0
  419. package/expertise/security/web/cors-and-headers.md +433 -0
  420. package/expertise/security/web/csrf.md +562 -0
  421. package/expertise/security/web/file-upload.md +1477 -0
  422. package/expertise/security/web/index.md +25 -0
  423. package/expertise/security/web/injection.md +1375 -0
  424. package/expertise/security/web/session-management.md +1101 -0
  425. package/expertise/security/web/xss.md +1158 -0
  426. package/exports/README.md +17 -0
  427. package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
  428. package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
  429. package/exports/hosts/claude/.claude/agents/designer.md +55 -0
  430. package/exports/hosts/claude/.claude/agents/executor.md +55 -0
  431. package/exports/hosts/claude/.claude/agents/learner.md +51 -0
  432. package/exports/hosts/claude/.claude/agents/planner.md +53 -0
  433. package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
  434. package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
  435. package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
  436. package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
  437. package/exports/hosts/claude/.claude/commands/author.md +42 -0
  438. package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
  439. package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
  440. package/exports/hosts/claude/.claude/commands/design.md +44 -0
  441. package/exports/hosts/claude/.claude/commands/discover.md +37 -0
  442. package/exports/hosts/claude/.claude/commands/execute.md +48 -0
  443. package/exports/hosts/claude/.claude/commands/learn.md +38 -0
  444. package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
  445. package/exports/hosts/claude/.claude/commands/plan.md +39 -0
  446. package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
  447. package/exports/hosts/claude/.claude/commands/review.md +40 -0
  448. package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
  449. package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
  450. package/exports/hosts/claude/.claude/commands/specify.md +38 -0
  451. package/exports/hosts/claude/.claude/commands/verify.md +37 -0
  452. package/exports/hosts/claude/.claude/settings.json +34 -0
  453. package/exports/hosts/claude/CLAUDE.md +19 -0
  454. package/exports/hosts/claude/export.manifest.json +38 -0
  455. package/exports/hosts/claude/host-package.json +67 -0
  456. package/exports/hosts/codex/AGENTS.md +19 -0
  457. package/exports/hosts/codex/export.manifest.json +38 -0
  458. package/exports/hosts/codex/host-package.json +41 -0
  459. package/exports/hosts/cursor/.cursor/hooks.json +16 -0
  460. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
  461. package/exports/hosts/cursor/export.manifest.json +38 -0
  462. package/exports/hosts/cursor/host-package.json +42 -0
  463. package/exports/hosts/gemini/GEMINI.md +19 -0
  464. package/exports/hosts/gemini/export.manifest.json +38 -0
  465. package/exports/hosts/gemini/host-package.json +41 -0
  466. package/hooks/README.md +18 -0
  467. package/hooks/definitions/loop_cap_guard.yaml +21 -0
  468. package/hooks/definitions/post_tool_capture.yaml +24 -0
  469. package/hooks/definitions/pre_compact_summary.yaml +19 -0
  470. package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
  471. package/hooks/definitions/protected_path_write_guard.yaml +19 -0
  472. package/hooks/definitions/session_start.yaml +19 -0
  473. package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
  474. package/hooks/loop-cap-guard +17 -0
  475. package/hooks/post-tool-lint +36 -0
  476. package/hooks/protected-path-write-guard +17 -0
  477. package/hooks/session-start +41 -0
  478. package/llms-full.txt +2355 -0
  479. package/llms.txt +43 -0
  480. package/package.json +79 -0
  481. package/roles/README.md +20 -0
  482. package/roles/clarifier.md +42 -0
  483. package/roles/content-author.md +63 -0
  484. package/roles/designer.md +55 -0
  485. package/roles/executor.md +55 -0
  486. package/roles/learner.md +51 -0
  487. package/roles/planner.md +53 -0
  488. package/roles/researcher.md +43 -0
  489. package/roles/reviewer.md +54 -0
  490. package/roles/specifier.md +47 -0
  491. package/roles/verifier.md +71 -0
  492. package/schemas/README.md +24 -0
  493. package/schemas/accepted-learning.schema.json +20 -0
  494. package/schemas/author-artifact.schema.json +156 -0
  495. package/schemas/clarification.schema.json +19 -0
  496. package/schemas/design-artifact.schema.json +80 -0
  497. package/schemas/docs-claim.schema.json +18 -0
  498. package/schemas/export-manifest.schema.json +20 -0
  499. package/schemas/hook.schema.json +67 -0
  500. package/schemas/host-export-package.schema.json +18 -0
  501. package/schemas/implementation-plan.schema.json +19 -0
  502. package/schemas/proposed-learning.schema.json +19 -0
  503. package/schemas/research.schema.json +18 -0
  504. package/schemas/review.schema.json +29 -0
  505. package/schemas/run-manifest.schema.json +18 -0
  506. package/schemas/spec-challenge.schema.json +18 -0
  507. package/schemas/spec.schema.json +20 -0
  508. package/schemas/usage.schema.json +102 -0
  509. package/schemas/verification-proof.schema.json +29 -0
  510. package/schemas/wazir-manifest.schema.json +173 -0
  511. package/skills/README.md +40 -0
  512. package/skills/brainstorming/SKILL.md +77 -0
  513. package/skills/debugging/SKILL.md +50 -0
  514. package/skills/design/SKILL.md +61 -0
  515. package/skills/dispatching-parallel-agents/SKILL.md +128 -0
  516. package/skills/executing-plans/SKILL.md +70 -0
  517. package/skills/finishing-a-development-branch/SKILL.md +169 -0
  518. package/skills/humanize/SKILL.md +123 -0
  519. package/skills/init-pipeline/SKILL.md +124 -0
  520. package/skills/prepare-next/SKILL.md +20 -0
  521. package/skills/receiving-code-review/SKILL.md +123 -0
  522. package/skills/requesting-code-review/SKILL.md +105 -0
  523. package/skills/requesting-code-review/code-reviewer.md +108 -0
  524. package/skills/run-audit/SKILL.md +197 -0
  525. package/skills/scan-project/SKILL.md +41 -0
  526. package/skills/self-audit/SKILL.md +153 -0
  527. package/skills/subagent-driven-development/SKILL.md +154 -0
  528. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
  529. package/skills/subagent-driven-development/implementer-prompt.md +102 -0
  530. package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
  531. package/skills/tdd/SKILL.md +23 -0
  532. package/skills/using-git-worktrees/SKILL.md +163 -0
  533. package/skills/using-skills/SKILL.md +95 -0
  534. package/skills/verification/SKILL.md +22 -0
  535. package/skills/wazir/SKILL.md +463 -0
  536. package/skills/writing-plans/SKILL.md +30 -0
  537. package/skills/writing-skills/SKILL.md +157 -0
  538. package/skills/writing-skills/anthropic-best-practices.md +122 -0
  539. package/skills/writing-skills/persuasion-principles.md +50 -0
  540. package/templates/README.md +20 -0
  541. package/templates/artifacts/README.md +10 -0
  542. package/templates/artifacts/accepted-learning.md +19 -0
  543. package/templates/artifacts/accepted-learning.template.json +12 -0
  544. package/templates/artifacts/author.md +74 -0
  545. package/templates/artifacts/author.template.json +19 -0
  546. package/templates/artifacts/clarification.md +21 -0
  547. package/templates/artifacts/clarification.template.json +12 -0
  548. package/templates/artifacts/execute-notes.md +19 -0
  549. package/templates/artifacts/implementation-plan.md +21 -0
  550. package/templates/artifacts/implementation-plan.template.json +11 -0
  551. package/templates/artifacts/learning-proposal.md +19 -0
  552. package/templates/artifacts/next-run-handoff.md +21 -0
  553. package/templates/artifacts/plan-review.md +19 -0
  554. package/templates/artifacts/proposed-learning.template.json +12 -0
  555. package/templates/artifacts/research.md +21 -0
  556. package/templates/artifacts/research.template.json +12 -0
  557. package/templates/artifacts/review-findings.md +19 -0
  558. package/templates/artifacts/review.template.json +11 -0
  559. package/templates/artifacts/run-manifest.template.json +8 -0
  560. package/templates/artifacts/spec-challenge.md +19 -0
  561. package/templates/artifacts/spec-challenge.template.json +11 -0
  562. package/templates/artifacts/spec.md +21 -0
  563. package/templates/artifacts/spec.template.json +12 -0
  564. package/templates/artifacts/verification-proof.md +19 -0
  565. package/templates/artifacts/verification-proof.template.json +11 -0
  566. package/templates/examples/accepted-learning.example.json +14 -0
  567. package/templates/examples/author.example.json +152 -0
  568. package/templates/examples/clarification.example.json +15 -0
  569. package/templates/examples/docs-claim.example.json +8 -0
  570. package/templates/examples/export-manifest.example.json +7 -0
  571. package/templates/examples/host-export-package.example.json +11 -0
  572. package/templates/examples/implementation-plan.example.json +17 -0
  573. package/templates/examples/proposed-learning.example.json +13 -0
  574. package/templates/examples/research.example.json +15 -0
  575. package/templates/examples/research.example.md +6 -0
  576. package/templates/examples/review.example.json +17 -0
  577. package/templates/examples/run-manifest.example.json +9 -0
  578. package/templates/examples/spec-challenge.example.json +14 -0
  579. package/templates/examples/spec.example.json +21 -0
  580. package/templates/examples/verification-proof.example.json +21 -0
  581. package/templates/examples/wazir-manifest.example.yaml +65 -0
  582. package/templates/task-definition-schema.md +99 -0
  583. package/tooling/README.md +20 -0
  584. package/tooling/src/adapters/context-mode.js +50 -0
  585. package/tooling/src/capture/command.js +376 -0
  586. package/tooling/src/capture/store.js +99 -0
  587. package/tooling/src/capture/usage.js +270 -0
  588. package/tooling/src/checks/branches.js +50 -0
  589. package/tooling/src/checks/brand-truth.js +110 -0
  590. package/tooling/src/checks/changelog.js +231 -0
  591. package/tooling/src/checks/command-registry.js +36 -0
  592. package/tooling/src/checks/commits.js +102 -0
  593. package/tooling/src/checks/docs-drift.js +103 -0
  594. package/tooling/src/checks/docs-truth.js +201 -0
  595. package/tooling/src/checks/runtime-surface.js +156 -0
  596. package/tooling/src/cli.js +116 -0
  597. package/tooling/src/command-options.js +56 -0
  598. package/tooling/src/commands/validate.js +320 -0
  599. package/tooling/src/doctor/command.js +91 -0
  600. package/tooling/src/export/command.js +77 -0
  601. package/tooling/src/export/compiler.js +498 -0
  602. package/tooling/src/guards/loop-cap-guard.js +52 -0
  603. package/tooling/src/guards/protected-path-write-guard.js +67 -0
  604. package/tooling/src/index/command.js +152 -0
  605. package/tooling/src/index/storage.js +1061 -0
  606. package/tooling/src/index/summarizers.js +261 -0
  607. package/tooling/src/loaders.js +18 -0
  608. package/tooling/src/project-root.js +22 -0
  609. package/tooling/src/recall/command.js +225 -0
  610. package/tooling/src/schema-validator.js +30 -0
  611. package/tooling/src/state-root.js +40 -0
  612. package/tooling/src/status/command.js +71 -0
  613. package/wazir.manifest.yaml +135 -0
  614. package/workflows/README.md +19 -0
  615. package/workflows/author.md +42 -0
  616. package/workflows/clarify.md +38 -0
  617. package/workflows/design-review.md +46 -0
  618. package/workflows/design.md +44 -0
  619. package/workflows/discover.md +37 -0
  620. package/workflows/execute.md +48 -0
  621. package/workflows/learn.md +38 -0
  622. package/workflows/plan-review.md +42 -0
  623. package/workflows/plan.md +39 -0
  624. package/workflows/prepare-next.md +37 -0
  625. package/workflows/review.md +40 -0
  626. package/workflows/run-audit.md +41 -0
  627. package/workflows/spec-challenge.md +41 -0
  628. package/workflows/specify.md +38 -0
  629. package/workflows/verify.md +37 -0
@@ -0,0 +1,800 @@
1
+ # CAP Theorem and Tradeoffs -- Architecture Expertise Module
2
+
3
+ > The CAP theorem states that a distributed system can provide at most two of three guarantees:
4
+ > Consistency, Availability, and Partition tolerance. Since network partitions are inevitable,
5
+ > the real choice is between consistency and availability during a partition. PACELC extends
6
+ > this: even without partitions, there is a latency vs consistency tradeoff.
7
+
8
+ > **Category:** Distributed
9
+ > **Complexity:** Complex
10
+ > **Applies when:** Designing data replication strategy, choosing databases, or deciding consistency guarantees for distributed services.
11
+
12
+ ---
13
+
14
+ ## What This Is
15
+
16
+ ### The CAP Theorem
17
+
18
+ Eric Brewer introduced the CAP conjecture in his keynote at the ACM PODC symposium in 2000.
19
+ Seth Gilbert and Nancy Lynch of MIT formally proved it in 2002, elevating it from conjecture
20
+ to theorem. The theorem states that any distributed data store can provide at most two of the
21
+ following three guarantees simultaneously:
22
+
23
+ **Consistency (C):** Every read receives the most recent write or an error. All nodes in the
24
+ system see the same data at the same time. When a client writes a value, every subsequent
25
+ read from any node must return that value (or a more recent one). This is linearizable
26
+ consistency -- the strongest form -- not eventual consistency.
27
+
28
+ **Availability (A):** Every request to a non-failing node receives a non-error response,
29
+ without the guarantee that it contains the most recent write. The system remains operational
30
+ and responsive. In CAP's formal definition, *every* request must eventually receive a
31
+ response -- there is no timeout constraint.
32
+
33
+ **Partition Tolerance (P):** The system continues to operate despite an arbitrary number of
34
+ messages being dropped or delayed by the network between nodes. A partition is a
35
+ communication break -- a lost or temporarily delayed connection between two nodes or groups
36
+ of nodes.
37
+
38
+ ### The Real Choice: CP or AP
39
+
40
+ The critical insight that Brewer himself clarified in 2012 is that CAP is not about
41
+ choosing two of three in normal operation. It is about what happens *during a network
42
+ partition*:
43
+
44
+ - **Network partitions are inevitable.** In any distributed system that spans more than one
45
+ node, network failures will occur. Hardware fails, cables get cut, switches drop packets,
46
+ cloud availability zones lose connectivity. Partition tolerance is not optional -- it is a
47
+ given.
48
+
49
+ - **The real choice is binary.** When a partition occurs, the system must choose: respond to
50
+ requests with potentially stale data (choose Availability, sacrifice Consistency) or refuse
51
+ to respond until the partition heals and consistency can be confirmed (choose Consistency,
52
+ sacrifice Availability).
53
+
54
+ - **During normal operation, all three are achievable.** When the network is healthy and no
55
+ partitions exist, a well-designed system can be both consistent and available. The tradeoff
56
+ only manifests during failures.
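The binary choice described above can be sketched as a replica that behaves differently once it loses contact with its peers. This is a minimal illustration, not any database's actual behavior: the `Replica` class, its `mode` label, and the `partitioned` flag are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it holds the latest value."""


@dataclass
class Replica:
    mode: str                  # "CP" or "AP" (hypothetical label for this sketch)
    value: str                 # last value this replica has seen
    partitioned: bool = False  # True when the replica cannot reach its peers

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Consistency over availability: refuse rather than risk a stale answer.
            raise Unavailable("cannot confirm latest value during partition")
        # AP replica, or a healthy network: answer from local state (possibly stale).
        return self.value
```

With a healthy network both modes answer identically; the difference only shows up when `partitioned` is true.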
57
+
58
+ ### PACELC: The Essential Extension
59
+
60
+ Daniel Abadi proposed the PACELC theorem in 2010 (published 2012) to address CAP's blind
61
+ spot. CAP says nothing about system behavior when there is *no* partition -- which is the
62
+ vast majority of the time. PACELC states:
63
+
64
+ > If there is a **P**artition, choose between **A**vailability and **C**onsistency;
65
+ > **E**lse, when the system is operating normally, choose between **L**atency and
66
+ > **C**onsistency.
67
+
68
+ This captures a fundamental truth: even without failures, replicating data across nodes
69
+ introduces latency. A system that waits for all replicas to acknowledge a write before
70
+ responding is consistent but slow. A system that responds after writing to one replica is
71
+ fast but temporarily inconsistent.
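The latency side of this tradeoff can be made concrete with a small sketch. It assumes the write is sent to all replicas in parallel and that each replica's round-trip time is known; the function name and the sample latencies are illustrative, not measurements.

```python
def ack_latency_ms(replica_latencies_ms: list, acks_required: int) -> float:
    """Time until `acks_required` replicas have acknowledged a write sent to all in parallel."""
    if not 1 <= acks_required <= len(replica_latencies_ms):
        raise ValueError("acks_required must be between 1 and the replica count")
    # The k-th fastest replica determines when the k-th acknowledgement arrives.
    return sorted(replica_latencies_ms)[acks_required - 1]


latencies = [2.0, 15.0, 120.0]       # illustrative: local, same-region, cross-region
print(ack_latency_ms(latencies, 1))  # 2.0   (respond after one replica)
print(ack_latency_ms(latencies, 2))  # 15.0  (respond after a majority)
print(ack_latency_ms(latencies, 3))  # 120.0 (respond after all replicas)
```

Waiting for every replica means the slowest link sets the response time, which is the "EC" cost PACELC names.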
72
+
73
+ PACELC classifications:
74
+
75
+ | Classification | During Partition | Normal Operation | Example Systems |
76
+ |---------------|-----------------|-----------------|-----------------|
77
+ | **PA/EL** | Availability | Low Latency | DynamoDB, Cassandra, Riak |
78
+ | **PA/EC** | Availability | Consistency | --- |
79
+ | **PC/EL** | Consistency | Low Latency | --- |
80
+ | **PC/EC** | Consistency | Consistency | Google Spanner, CockroachDB, VoltDB |
81
+
82
+ PA/EL systems are the most common AP systems -- they optimize for speed and uptime at the
83
+ cost of strict consistency. PC/EC systems are the most common CP systems -- they never
84
+ compromise on correctness. The off-diagonal combinations (PA/EC, PC/EL) are rare because
85
+ the design philosophies that drive partition behavior tend to align with normal-operation
86
+ behavior.
87
+
88
+ ### What CAP Does NOT Say
89
+
90
+ These are the most common and damaging misconceptions:
91
+
92
+ **Misconception 1: "Pick any two."** The original "pick 2 of 3" framing (often drawn as a
93
+ Venn diagram with CA, CP, and AP regions) is misleading. CA systems do not exist in
94
+ distributed computing because you cannot simply opt out of partitions. A single-node
95
+ PostgreSQL database is "CA" only because it is not distributed -- there is no network to
96
+ partition. The moment you add replication, you must handle partitions.
97
+
98
+ **Misconception 2: "A system is either CP or AP, full stop."** As Martin Kleppmann argued in
99
+ his influential 2015 essay "Please stop calling databases CP or AP," most real systems cannot
100
+ be cleanly classified. MongoDB with a single primary is CP for writes, but its secondaries
101
+ may serve stale reads. DynamoDB offers tunable consistency per request. A system's CAP
102
+ position can vary by operation, configuration, and even by individual request.
103
+
104
+ **Misconception 3: "CAP means you can never have consistency and availability."** You can have
105
+ both during normal operation. The tradeoff is only forced during a partition event. Google
106
+ Spanner demonstrates this: it is technically CP (it will sacrifice availability during a
107
+ partition) but achieves greater than 99.999% availability because Google's private network
108
+ infrastructure makes partitions extraordinarily rare.
109
+
110
+ **Misconception 4: "Availability in CAP means 'high availability' as SREs define it."** CAP's
111
+ definition of availability is very specific: every request to a non-failing node must receive
112
+ a response. It says nothing about response time. A system that takes 30 days to respond is
113
+ "available" by CAP's definition. This is why PACELC's addition of latency is so important
114
+ for practical system design.
115
+
116
+ **Misconception 5: "Consistency in CAP is the same as ACID consistency."** CAP consistency
117
+ means linearizability -- a specific property of read/write operations across replicas. ACID
118
+ consistency means transactions preserve database invariants (foreign keys, constraints). They
119
+ are different concepts that happen to share a name.
120
+
121
+ ---
122
+
123
+ ## When to Prioritize Consistency (CP Systems)
124
+
125
+ Choose consistency over availability when incorrect or stale data causes irreversible harm,
126
+ financial loss, or safety risks.
127
+
128
+ ### Financial Systems and Banking
129
+
130
+ A bank transfer between accounts must be atomic. If a network partition occurs mid-transfer,
131
+ the system must refuse to process further transactions on those accounts rather than risk
132
+ double-spending or lost funds. When a customer checks their balance, they must see the actual
133
+ balance -- not a cached value from before a recent deposit. Banks typically choose CP for ledger writes
134
+ because the cost of an incorrect balance (regulatory penalties, customer trust, actual money
135
+ loss) far outweighs the cost of brief unavailability.
136
+
137
+ **Real-world example:** Traditional banking cores (Temenos, FIS) operate as CP systems. When
138
+ a branch cannot reach the central ledger, it queues transactions locally rather than
139
+ processing them optimistically. ATM networks use authorization holds -- a CP pattern --
140
+ rather than dispensing cash they cannot verify.
141
+
142
+ ### Inventory and Booking Systems
143
+
144
+ An airline cannot sell the same seat twice. A hotel cannot book the same room to two guests.
145
+ An event venue cannot oversell beyond capacity. These systems require strong consistency
146
+ because the physical resource is finite and non-fungible.
147
+
148
+ **Real-world example:** Ticketmaster's seat reservation system uses strong consistency for
149
+ the booking operation. When a partition occurs between data centers, the system will reject
150
+ booking attempts rather than risk double-booking. However, the *browsing* portion of the
151
+ system (checking what seats are available) can tolerate eventual consistency -- showing a
152
+ seat as available when it was just booked is acceptable because the booking attempt will
153
+ be rejected at the consistent layer.
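The split between a stale browsing view and a consistent booking step can be sketched as follows. This is a single-process stand-in for the consistent layer, assuming one authoritative store guarded by a lock; `SeatInventory` and its methods are hypothetical names, not any ticketing system's API.

```python
import threading


class SeatInventory:
    """Single authoritative store for seat ownership (stands in for the consistent layer)."""

    def __init__(self, seats):
        self._owner = {seat: None for seat in seats}
        self._lock = threading.Lock()

    def snapshot_available(self) -> list:
        """Browsing path: returns a copy that may be stale by the time the user clicks."""
        return [seat for seat, owner in self._owner.items() if owner is None]

    def book(self, seat, customer) -> bool:
        """Booking path: atomic check-and-set; returns False if the seat is unknown or taken."""
        with self._lock:
            if seat not in self._owner or self._owner[seat] is not None:
                return False
            self._owner[seat] = customer
            return True


inventory = SeatInventory(["12A", "12B"])
stale_view = inventory.snapshot_available()  # both seats look free
inventory.book("12A", "alice")               # succeeds
inventory.book("12A", "bob")                 # rejected, even though stale_view still lists 12A
```

The stale view is harmless because only `book` decides ownership.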
154
+
155
+ ### Medical Records and Safety-Critical Systems
156
+
157
+ Patient medication records, dosage calculations, and allergy information cannot tolerate
158
+ stale reads. Administering a medication that was contraindicated by a recently-entered
159
+ allergy could be fatal. These systems choose CP and accept the operational burden of
160
+ unavailability during network issues.
161
+
162
+ ### Leader Election and Coordination
163
+
164
+ Distributed coordination services like ZooKeeper and etcd are inherently CP. They implement
165
+ consensus protocols (ZAB, Raft) that sacrifice availability during partitions to ensure
166
+ that all nodes agree on the current state. This is necessary because their purpose is to
167
+ provide a single source of truth for configuration, leader election, and distributed locks.
168
+
169
+ ### CP System Characteristics
170
+
171
+ | Property | Typical CP Behavior |
172
+ |----------|-------------------|
173
+ | Write path | Synchronous replication to a quorum before acknowledging |
174
+ | Read path | Read from leader, or read from follower with consistency check |
175
+ | During partition | Minority partition becomes read-only or fully unavailable |
176
+ | Recovery | Automatic once partition heals; no conflict resolution needed |
177
+ | Consensus protocol | Raft, Paxos, ZAB, or similar |
178
+ | Example databases | PostgreSQL (single-primary), CockroachDB, Google Spanner, etcd, ZooKeeper, HBase |
179
+
180
+ ---
181
+
182
+ ## When to Prioritize Availability (AP Systems)
183
+
184
+ Choose availability over consistency when the system must remain responsive at all costs
185
+ and temporary staleness or inconsistency is tolerable.
186
+
187
+ ### Social Media Feeds and Content Platforms
188
+
189
+ When a user posts on a social platform, it is acceptable if followers in another region
190
+ see the post a few seconds or even minutes later. The platform must never show a user an
191
+ error page or refuse to load their feed because of a network issue between data centers.
192
+ Facebook, Twitter/X, and Instagram all prioritize availability for feed rendering.
193
+
194
+ **Real-world example:** If a user updates their profile picture on Facebook, other users
195
+ may see the old picture for a brief period. This is a deliberate design choice: the
196
+ alternative -- making the entire profile unavailable until all replicas confirm the new
197
+ picture -- would degrade the user experience far more than a few seconds of staleness.
198
+
199
+ ### Caching Layers and CDNs
200
+
201
+ Content delivery networks are inherently AP systems. A CDN node serves cached content even
202
+ if it cannot reach the origin server. The content may be stale (an old version of a webpage,
203
+ an outdated product image) but serving stale content is vastly preferable to serving nothing.
204
+ DNS is another classic AP system -- DNS resolvers cache records and serve potentially stale
205
+ entries rather than failing when they cannot reach authoritative nameservers.
206
+
207
+ ### Shopping Carts and Wishlists
208
+
209
+ Amazon's original Dynamo paper (2007) described the shopping cart as an AP use case. Items
210
+ added to a cart during a partition might temporarily diverge across replicas, but the system
211
+ resolves conflicts by merging (union of items) rather than discarding. A customer seeing a
212
+ previously removed item reappear in their cart is annoying; a customer being unable to add
213
+ items to their cart at all loses revenue.
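A minimal sketch of the union merge shows why a removed item can come back. It models each replica's cart as a plain set, which is a simplification of what the Dynamo paper describes; the function name and item names are illustrative.

```python
def merge_carts(replica_a: set, replica_b: set) -> set:
    """Resolve divergent carts by keeping every item either replica has seen."""
    return replica_a | replica_b


before_partition = {"book", "lamp"}
side_a = before_partition - {"lamp"}  # customer removes the lamp via replica A
side_b = before_partition | {"mug"}   # customer adds a mug via replica B

merged = merge_carts(side_a, side_b)
print(sorted(merged))  # ['book', 'lamp', 'mug'] (the removed lamp reappears)
```

A plain union cannot tell "never removed" from "not yet told about the removal", which is why the add always wins.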
214
+
215
+ ### IoT Sensor Data and Telemetry
216
+
217
+ Sensor networks collecting temperature, humidity, or machine telemetry readings prioritize
218
+ availability. Missing a few readings or receiving them out of order is tolerable. Losing the
219
+ ability to ingest data at all -- because a network link between the collection tier and the
220
+ storage tier is down -- means losing irreplaceable time-series data.
221
+
222
+ ### AP System Characteristics
223
+
224
+ | Property | Typical AP Behavior |
225
+ |----------|-------------------|
226
+ | Write path | Write to any available node; asynchronous replication |
227
+ | Read path | Read from any available node; may return stale data |
228
+ | During partition | Both sides of the partition continue serving reads and writes |
229
+ | Recovery | Conflict resolution via last-write-wins, vector clocks, CRDTs, or application-level merge |
230
+ | Conflict strategy | Merge, last-write-wins, or custom resolution |
231
+ | Example databases | Cassandra, DynamoDB (default), Riak, CouchDB, DNS |
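Of the recovery strategies in the table, vector clocks are the one that detects a conflict instead of silently picking a winner. The sketch below is a generic comparison, assuming each clock is a mapping from node name to counter with missing entries treated as zero; it is not the API of any listed database.

```python
def compare(a: dict, b: dict) -> str:
    """Return how version `a` relates to version `b`: before, after, equal, or concurrent."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # b has seen everything a has; b supersedes a
    if b_le_a:
        return "after"       # a supersedes b
    return "concurrent"      # neither saw the other's update: a real conflict


print(compare({"n1": 1}, {"n1": 2}))                    # before
print(compare({"n1": 2, "n2": 0}, {"n1": 1, "n2": 1}))  # concurrent
```

Only the "concurrent" case needs a merge or application-level resolution; the other cases can be resolved by keeping the newer version.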
232
+
233
+ ---
234
+
235
+ ## When NOT to Apply CAP
236
+
237
+ This section is as important as the sections above. CAP is frequently misapplied, leading
238
+ to poor architectural decisions.
239
+
240
+ ### Single-Node Systems
241
+
242
+ CAP is a theorem about *distributed* systems. A single PostgreSQL instance, a single Redis
243
+ server, or a monolithic application with one database has no network partitions to worry
244
+ about. Applying CAP to these systems is a category error. If you have a single database
245
+ server with no replication, your concerns are durability (disk failure), capacity (can the
246
+ server handle the load), and recovery time -- not CAP tradeoffs.
247
+
248
+ **Common mistake:** A team chooses Cassandra "for availability" when they have a single
249
+ data center, a single application server, and modest data volumes. A single PostgreSQL
250
+ instance with good backups would serve them better, with simpler operations and stronger
251
+ consistency.
252
+
253
+ ### Systems Where Partitions Are Handled by Infrastructure
254
+
255
+ Google Spanner is technically CP but achieves greater than 99.999% availability. How?
256
+ Google's private fiber network makes partitions so rare that the theoretical availability
257
+ sacrifice almost never materializes. Similarly, systems running within a single availability
258
+ zone in AWS with redundant networking face partition probabilities so low that designing
259
+ around partition behavior is not the dominant architectural concern.
260
+
261
+ **Key insight:** If your partition probability is 0.001%, designing your entire data model
262
+ around partition behavior is over-engineering. Focus on the tradeoffs that actually affect
263
+ your system daily: latency, throughput, operational complexity, and cost.
264
+
265
+ ### Over-Simplification of "Just Pick CP or AP"
266
+
267
+ Real systems are not uniformly CP or AP. A well-designed system uses different consistency
268
+ levels for different operations:
269
+
270
+ - **E-commerce platform:** Product catalog browsing (AP -- eventual consistency is fine),
271
+ inventory reservation (CP -- must be consistent), payment processing (CP -- must be
272
+ consistent), order history display (AP -- slight delay is acceptable), recommendation
273
+ engine (AP -- stale preferences are tolerable).
274
+
275
+ - **Ride-sharing application:** Driver location updates (AP -- eventual consistency, high
276
+ frequency), ride matching (CP at the assignment moment -- cannot double-assign a driver),
277
+ fare calculation (CP -- must be based on consistent trip data), trip history (AP -- can
278
+ tolerate brief delays).
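Per-feature decisions like the e-commerce list above can be written down as an explicit policy table rather than left implicit in each service. This is a sketch only: the operation names and the `Consistency` enum are hypothetical, and the fallback to strong consistency for unlisted operations is an assumption of this example.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "strong"      # CP-style: reject rather than serve unverified data
    EVENTUAL = "eventual"  # AP-style: serve local data, tolerate staleness


# Hypothetical operation names mirroring the e-commerce example.
POLICY = {
    "catalog.browse": Consistency.EVENTUAL,
    "inventory.reserve": Consistency.STRONG,
    "payment.process": Consistency.STRONG,
    "orders.history": Consistency.EVENTUAL,
    "recommendations.fetch": Consistency.EVENTUAL,
}


def consistency_for(operation: str) -> Consistency:
    """Look up the consistency level; unlisted operations default to STRONG (assumed here)."""
    return POLICY.get(operation, Consistency.STRONG)
```

Defaulting to the stricter level means a forgotten entry costs latency rather than correctness.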
279
+
280
+ **Common mistake:** A team declares their entire system "AP" or "CP" and forces every
281
+ component into that mold, rather than making per-feature consistency decisions.
282
+
283
+ ### When the Problem Is Actually Latency, Not Partitions
284
+
285
+ Many teams invoke CAP when their real problem is latency. They say "we chose AP for
286
+ availability" when they actually mean "we chose eventual consistency because synchronous
287
+ replication was too slow." This is a PACELC tradeoff (EL vs EC), not a CAP tradeoff (A vs C).
288
+ Conflating the two leads to architectural discussions where participants talk past each other.
289
+
290
+ ### When Consensus Is the Real Requirement
291
+
292
+ If your system needs distributed transactions, global ordering of events, or leader election,
293
+ CAP is the wrong framework. You need to reason about consensus protocols (Raft, Paxos,
294
+ PBFT), their failure modes, and their performance characteristics. CAP tells you that
295
+ a minority partition cannot reach consensus -- which is true but not useful for designing the
296
+ consensus protocol itself.
297
+
298
+ ### Real Examples of Teams Misapplying CAP
299
+
300
+ **Example 1: Choosing MongoDB "because it is CP."** A startup chose MongoDB for a social
301
+ media application specifically because they wanted consistency. But MongoDB with a replica
302
+ set is only CP for writes routed to the primary. Reads routed to secondaries can return
303
+ stale data. The team did not configure read preferences correctly and ended up with an
304
+ effectively AP read path they did not intend, causing subtle bugs in their notification
305
+ system.
306
+
307
+ **Example 2: Choosing Cassandra "because it is AP" for financial data.** A fintech company
308
+ chose Cassandra for transaction records because they wanted "five nines of availability."
309
+ They did not realize that Cassandra's AP nature meant concurrent writes to the same
310
+ transaction record could conflict silently, with last-write-wins discarding earlier updates.
311
+ They discovered lost transactions during an audit and had to add an external coordination
312
+ layer (effectively rebuilding CP semantics on top of an AP database).
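The silent loss in Example 2 follows directly from how last-write-wins resolves conflicts. The sketch below models a whole-record LWW register with made-up timestamps and values; real systems may apply the rule at a finer granularity, so treat it as an illustration of the rule, not of any specific database.

```python
def lww_merge(a: tuple, b: tuple) -> tuple:
    """Keep whichever (timestamp, value) pair carries the larger timestamp."""
    return a if a[0] >= b[0] else b


# Two clients update the same record concurrently on different replicas.
write_1 = (1000, "status=settled")
write_2 = (1001, "status=refunded")

survivor = lww_merge(write_1, write_2)
print(survivor)  # (1001, 'status=refunded'); write_1 is discarded without any error
```

Neither client is told that a conflict occurred, which is why the loss tends to surface only in an audit.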
313
+
314
+ **Example 3: Ignoring the "Else" in PACELC.** A team chose a PC/EC database (CockroachDB)
315
+ for a latency-sensitive user-facing API. During normal operation (no partitions), every write
316
+ required cross-region consensus, adding 100-200ms of latency. Users complained about slow
317
+ response times. The team eventually moved user session data to a PA/EL store (Redis with
318
+ replication) while keeping financial data in CockroachDB -- a per-feature consistency
319
+ decision they should have made from the start.
320
+
321
+ ---
322
+
323
+ ## How It Works
324
+
325
+ ### Partition Detection
326
+
327
+ A network partition is detected when nodes cannot communicate with each other within a
328
+ configured timeout. Detection mechanisms include:
329
+
330
+ 1. **Heartbeat failure:** Nodes exchange periodic heartbeat messages. If a node misses
331
+ several consecutive heartbeats from a peer, it suspects a partition.
332
+ 2. **Quorum loss:** In consensus-based systems, if a node cannot reach a majority of peers,
333
+ it knows it is on the minority side of a partition.
334
+ 3. **Split-brain detection:** Some systems use a witness node, a shared disk, or a cloud
335
+ API as a tiebreaker to determine which side of a partition is the "real" cluster.
336
+
337
+ The difficulty is distinguishing a true network partition from a slow node. A node that
338
+ takes 10 seconds to respond to a heartbeat might be overloaded, not partitioned. Most
339
+ systems use aggressive timeouts (seconds) to detect partitions quickly, accepting that some
340
+ slow nodes will be falsely flagged as partitioned.
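Heartbeat tracking and the quorum check can be combined in a small failure detector. The sketch assumes the caller supplies timestamps explicitly (which keeps it deterministic) and that a peer never heard from counts as suspected; `FailureDetector` and its methods are hypothetical names.

```python
class FailureDetector:
    """Tracks the last heartbeat from each peer and derives suspicion and quorum from it."""

    def __init__(self, peers, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {peer: None for peer in peers}  # None = never heard from

    def heartbeat(self, peer, now: float) -> None:
        self.last_seen[peer] = now

    def suspected(self, now: float) -> set:
        """Peers whose last heartbeat is older than the timeout (slow or partitioned)."""
        return {
            peer for peer, seen in self.last_seen.items()
            if seen is None or now - seen > self.timeout_s
        }

    def has_quorum(self, now: float) -> bool:
        """True if this node plus its reachable peers form a strict majority of the cluster."""
        cluster_size = len(self.last_seen) + 1  # peers plus this node
        reachable = cluster_size - len(self.suspected(now))
        return reachable > cluster_size // 2
```

Note that `suspected` cannot tell an overloaded peer from a partitioned one; a shorter `timeout_s` detects partitions sooner at the price of more false suspicions.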
341
+
342
+ ### Consistency Levels: A Spectrum
343
+
344
+ Consistency is not binary. The following levels form a hierarchy from strongest to weakest:
345
+
346
+ **Linearizability (Strongest):**
347
+ All operations appear to execute atomically at some point between their invocation and
348
+ completion. There is a total order of operations consistent with real-time ordering. If
349
+ operation A completes before operation B begins, A appears before B in the total order. This
350
+ is what CAP means by "consistency." It requires coordination on every operation and is the
351
+ most expensive consistency level.
352
+
353
+ **Sequential Consistency:**
354
+ All operations appear to execute in some total order that is consistent with the program
355
+ order of each individual process, but this order need not respect real-time ordering. All
356
+ clients observe the same order of operations, though that order may lag real time, as long as
357
+ each client's own operations appear in the order it issued them.
358
+
359
+ **Causal Consistency:**
360
+ Operations that are causally related must be seen in the same order by all nodes. Causally
361
+ unrelated (concurrent) operations may be seen in different orders by different nodes. If
362
+ process A writes X, and process B reads X and then writes Y, then X causally precedes Y and
363
+ all nodes must see X before Y. But if process C independently writes Z with no knowledge of
364
+ X or Y, Z may appear at any point.
365
+
366
+ **Read-Your-Writes Consistency:**
367
+ A client always sees its own writes. If client A writes a value, client A's subsequent reads
368
+ will reflect that write. Other clients may see stale data. This is often sufficient for web
369
+ applications where users primarily interact with their own data.
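One common way to get read-your-writes on top of lagging replicas is for the client session to remember the version of its last write and only read from replicas that have caught up. The sketch below assumes a single primary that assigns increasing version numbers; `Replica` and `Session` are hypothetical classes for illustration.

```python
class Replica:
    def __init__(self):
        self.version = 0   # version of the last write applied here
        self.data = {}

    def apply(self, version: int, key, value) -> None:
        self.data[key] = value
        self.version = version


class Session:
    """Client session that never reads from a replica older than its own last write."""

    def __init__(self, primary: Replica, replicas: list):
        self.primary = primary
        self.replicas = replicas
        self.min_version = 0  # highest version this session has written

    def write(self, key, value) -> None:
        version = self.primary.version + 1
        self.primary.apply(version, key, value)
        self.min_version = version

    def read(self, key):
        for replica in self.replicas:
            if replica.version >= self.min_version:
                return replica.data.get(key)   # replica has caught up with our writes
        return self.primary.data.get(key)      # otherwise fall back to the primary
```

A different session with `min_version` still at zero may read the lagging replica and see old data, which matches the definition: only the writer is guaranteed to see its own write.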
370
+
371
+ **Eventual Consistency (Weakest):**
372
+ If no new updates are made, all replicas will *eventually* converge to the same value. There
373
+ is no bound on how long "eventually" takes (though in practice it is usually seconds). During
374
+ the convergence window, different replicas may return different values. This is the cheapest
375
+ consistency level in terms of latency and availability.
376
+
377
+ ### Availability Levels
378
+
379
+ | Level | Annual Downtime | Description |
380
+ |-------|----------------|-------------|
381
+ | 99% ("two nines") | 3.65 days | Acceptable for internal tools |
382
+ | 99.9% ("three nines") | 8.76 hours | Standard for most web applications |
383
+ | 99.95% | 4.38 hours | Typical SLA for cloud databases |
384
+ | 99.99% ("four nines") | 52.6 minutes | High-availability production systems |
385
+ | 99.999% ("five nines") | 5.26 minutes | Telecom, financial systems, Google Spanner |
386
+ | 99.9999% ("six nines") | 31.5 seconds | Theoretical; requires extraordinary infrastructure |
387
+
388
+ Note: CAP availability and SLA availability are different concepts. CAP availability means
389
+ every request to a non-failing node gets a response. SLA availability means the system
390
+ responds successfully within a defined latency threshold for a defined percentage of
391
+ requests over a time window.
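The downtime figures in the table above follow from simple arithmetic on a 365-day year. A short sketch, with a function name chosen for this example:

```python
def annual_downtime_seconds(availability_percent: float) -> float:
    """Seconds of allowed downtime per 365-day year at the given availability."""
    seconds_per_year = 365 * 24 * 60 * 60  # 31,536,000
    return seconds_per_year * (1 - availability_percent / 100)


for level in (99.0, 99.9, 99.95, 99.99, 99.999, 99.9999):
    seconds = annual_downtime_seconds(level)
    print(f"{level}%: {seconds / 3600:.2f} hours = {seconds / 60:.2f} minutes")
```

Each additional nine divides the downtime budget by ten, which is why the step from four nines to five nines is far more expensive than the percentages suggest.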
392
+
393
+ ### Real Database CAP/PACELC Positions
394
+
395
+ | Database | CAP Position | PACELC Position | Notes |
396
+ |----------|-------------|----------------|-------|
397
+ | **PostgreSQL** (single primary) | CP | PC/EC | Replicas reject writes during a partition; reads from the primary are always strongly consistent |
398
+ | **PostgreSQL** (Patroni HA) | CP | PC/EC | Automatic failover but still single-writer; fencing prevents split-brain |
399
+ | **MySQL** (Group Replication) | CP | PC/EC | Multi-primary mode exists but defaults to single-primary |
400
+ | **Cassandra** | AP | PA/EL | Tunable consistency (ONE, QUORUM, ALL) per query; default is eventual |
401
+ | **DynamoDB** | AP (default) | PA/EL | Supports strongly consistent reads per-request (each consumes twice the read capacity of an eventually consistent read) |
402
+ | **MongoDB** | CP-ish | PC/EC | Primary handles writes; reads from secondaries can be stale unless configured |
403
+ | **CockroachDB** | CP | PC/EC | Serializable isolation; Raft consensus for every write; geo-partitioned leaseholders reduce latency |
404
+ | **Google Spanner** | CP | PC/EC | TrueTime enables external consistency; >99.999% availability via network investment |
405
+ | **Redis** (replicated) | AP | PA/EL | Asynchronous replication; acknowledged writes can be lost on failover |
406
+ | **Redis** (Sentinel) | AP | PA/EL | Sentinel provides failover but does not prevent data loss during partition |
407
+ | **etcd** | CP | PC/EC | Raft consensus; minority partition is unavailable |
408
+ | **ZooKeeper** | CP | PC/EC | ZAB protocol; minority partition refuses requests |
409
+ | **CouchDB** | AP | PA/EL | Multi-master replication with conflict detection; user resolves conflicts |
410
+ | **Riak** | AP | PA/EL | Dynamo-inspired; vector clocks and CRDTs for conflict resolution |
411
+ | **ScyllaDB** | AP | PA/EL | Cassandra-compatible; same tunable consistency model, higher throughput |
412
+ | **TiDB** | CP | PC/EC | Raft-based; strong consistency with MySQL compatibility |
413
+ | **YugabyteDB** | CP | PC/EC | Raft consensus; PostgreSQL-compatible wire protocol |
414
+ | **FoundationDB** | CP | PC/EC | Strictly serializable; Apple's iCloud backend |
415
+
416
+ ---
417
+
418
+ ## Trade-Offs Matrix
419
+
420
+ | Decision | Choose Consistency When | Choose Availability When | Real-World Signal |
421
+ |----------|----------------------|------------------------|-------------------|
422
+ | **Data correctness** | A wrong answer is worse than no answer (finance, medical, legal) | A stale answer is better than no answer (feeds, search, analytics) | "Can we tolerate showing outdated data for 5 seconds?" |
423
+ | **Conflict resolution** | Conflicts are expensive or impossible to resolve after the fact (double-booking, double-spending) | Conflicts are cheap to resolve or merge (shopping carts, like counts, view counters) | "What happens if two replicas accept conflicting writes?" |
424
+ | **User expectations** | Users expect to see their most recent action immediately (bank balance after transfer) | Users tolerate brief delays (social feed not showing a just-posted comment) | "Will users call support if they see stale data?" |
425
+ | **Regulatory requirements** | Regulations demand audit trails with total ordering (SOX, PCI-DSS, HIPAA) | No regulatory ordering requirements (content platforms, IoT telemetry) | "Do auditors need to see a globally consistent timeline?" |
426
+ | **Failure blast radius** | Brief unavailability affects few users or is operationally manageable | Unavailability causes revenue loss, user churn, or SLA penalties | "What costs more: 30 seconds of downtime or 30 seconds of stale data?" |
427
+ | **Write frequency** | Writes are infrequent relative to the consensus latency budget | Writes are high-frequency and latency-sensitive | "Can we afford 50-200ms of consensus overhead per write?" |
428
+ | **Geographic distribution** | Users are geographically concentrated or latency is not the primary concern | Users are globally distributed and latency is critical | "Are our users within one region or spread across continents?" |
429
+ | **Operational complexity** | Team can operate consensus-based systems (monitoring, debugging split-brain, quorum management) | Team prefers simpler operational model (any-node-writes, no quorum) | "Does the team have experience operating Raft/Paxos-based systems?" |
430
+ | **Recovery cost** | Recovery from inconsistency is expensive (manual reconciliation, compensating transactions) | Recovery from inconsistency is automated (CRDTs, last-write-wins, merge functions) | "What does our conflict resolution procedure look like?" |
431
+ | **Data volume and velocity** | Moderate data volume where consensus overhead is acceptable | High data volume or velocity where consensus would be a bottleneck | "Are we writing 100 records/sec or 100,000 records/sec?" |
432
+
433
+ ---
434
+
435
+ ## Evolution Path
436
+
437
+ Most systems should start with strong consistency and deliberately relax it where the
438
+ tradeoffs justify the complexity.
439
+
440
+ ### Phase 1: Start with Strong Consistency
441
+
442
+ Begin with a single-primary relational database (PostgreSQL, MySQL). Every read and write
443
+ goes through one node. There are no CAP tradeoffs because there is no distribution. This is
444
+ not a limitation -- it is a feature. You get linearizable consistency, ACID transactions, and
445
+ simple debugging. Most applications never outgrow this phase.
446
+
447
+ ### Phase 2: Identify Read Paths That Tolerate Staleness
448
+
449
+ As traffic grows, identify read operations where eventual consistency is acceptable: product
450
+ catalog pages, user feeds, recommendation results, analytics dashboards. Route these reads
451
+ to replicas with asynchronous replication. Writes still go to the primary.
452
+
453
+ ### Phase 3: Add Caching for AP Read Paths
454
+
455
+ Put a caching layer (Redis, Memcached) in front of read replicas for frequently accessed,
456
+ staleness-tolerant data. The cache is inherently AP: it serves stale data when it cannot
457
+ reach the database, and cache invalidation introduces a consistency window. The result is
458
+ a tiered read path: hot reads from cache (AP, sub-ms), warm reads from replica (near-
459
+ consistent, low-ms), and consistent reads from primary (CP, higher latency).
460
+
461
+ ### Phase 4: Per-Feature Consistency Decisions
462
+
463
+ As the system grows, different features adopt different consistency models based on their
464
+ requirements. This is the mature state. Document each feature's consistency choice and
465
+ rationale. For example: authentication and payment processing use CP (PostgreSQL primary,
466
+ serializable isolation), inventory reservation uses CP (SELECT FOR UPDATE), while product
467
+ catalog (Elasticsearch via CDC), shopping cart (DynamoDB), recommendations (Redis), order
468
+ history (read replica), and notifications (WebSocket) all use AP with varying staleness
469
+ tolerances.
470
+
471
+ ### Phase 5: Multi-Region with Tunable Consistency
472
+
473
+ For global scale, use databases that support per-operation consistency tuning. CockroachDB
474
+ with geo-partitioned leaseholders pins data to the region closest to the user, reducing
475
+ consensus latency. DynamoDB global tables provide eventual consistency across regions with
476
+ strong consistency available per-request within a region.
477
+
478
+ ---
479
+
480
+ ## Failure Modes
481
+
482
+ ### Split-Brain from Incorrect Partition Handling
483
+
484
+ **What happens:** A network partition divides a cluster into two groups. Both groups elect a
485
+ leader and accept writes independently. When the partition heals, the system has two
486
+ divergent histories that cannot be automatically reconciled.
487
+
488
+ **Real-world example:** The October 2018 GitHub outage began with a brief network partition
489
+ that triggered a failover in their MySQL clusters. Both sides of the partition had accepted
490
+ writes, causing data inconsistencies that required manual intervention to resolve.
491
+
492
+ **Prevention:**
493
+ - Use odd-numbered clusters (3, 5, 7 nodes) so a partition can never split the cluster into two equal halves with no majority
494
+ - Implement fencing tokens -- when a new leader is elected, it gets a monotonically
495
+ increasing token, and storage nodes reject writes from old leaders with stale tokens
496
+ - Use external witness services (cloud provider APIs, separate availability zone) as a
497
+ tiebreaker
498
+ - Prefer consensus protocols (Raft, Paxos) that mathematically prevent split-brain over
499
+ ad-hoc leader election
500
+
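The fencing-token rule from the list above can be sketched as follows. This is a minimal in-memory illustration; `FencedStore` and its method names are hypothetical and do not correspond to any particular storage system's API.

```python
from typing import Any, Dict


class FencedStore:
    """Storage node that rejects writes carrying a stale fencing token."""

    def __init__(self) -> None:
        self.highest_token = 0          # largest token seen so far
        self.data: Dict[str, Any] = {}

    def write(self, token: int, key: str, value: Any) -> bool:
        # A token lower than one already seen belongs to a deposed leader.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.data[key] = value
        return True


store = FencedStore()
old_leader_token, new_leader_token = 1, 2   # issued at successive elections
first = store.write(old_leader_token, "k", "from-old-leader")
second = store.write(new_leader_token, "k", "from-new-leader")
late = store.write(old_leader_token, "k", "delayed-write")   # rejected
```

The key design choice is that the check lives in the storage node, not in the leader: a paused or partitioned old leader cannot know it has been replaced, so the receiver must enforce the ordering.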
501
+ ### Stale Reads Causing Business Logic Errors
502
+
503
+ **What happens:** A service reads stale data from an eventually consistent store and makes a
504
+ business decision based on that stale data. The decision is wrong because the data has since
505
+ changed.
506
+
507
+ **Example:** An inventory service reads available stock from a read replica (2 seconds behind
508
+ primary). It sees 5 units available and allows a purchase. But the primary already processed
509
+ 4 other purchases, leaving only 1 unit. The system has now oversold.
510
+
511
+ **Prevention:**
512
+ - Route business-critical reads to the primary or a synchronous replica
513
+ - Use read-your-writes consistency for operations within a single user session
514
+ - Implement optimistic concurrency control (version numbers) so that writes based on stale
515
+ reads fail at the write step
516
+ - Accept eventual consistency for the read (showing "5 in stock") but enforce consistency
517
+ at the write (inventory decrement uses a compare-and-swap or database constraint)
518
+
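The last prevention item, enforcing consistency at the write, can be illustrated with a conditional update. The sketch below uses an in-memory SQLite database purely so that it runs standalone; the `inventory` table, its columns, and the helper names are assumptions for illustration.

```python
import sqlite3


def make_demo_db(initial_stock: int) -> sqlite3.Connection:
    """Create an in-memory inventory table holding a single SKU."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory ("
        " sku TEXT PRIMARY KEY,"
        " stock INTEGER NOT NULL CHECK (stock >= 0))"
    )
    conn.execute("INSERT INTO inventory VALUES ('widget', ?)", (initial_stock,))
    conn.commit()
    return conn


def reserve(conn: sqlite3.Connection, sku: str, qty: int) -> bool:
    """Decrement stock only if enough remains; True if the reservation succeeded."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE inventory SET stock = stock - ? WHERE sku = ? AND stock >= ?",
            (qty, sku, qty),
        )
    # rowcount is 0 when the stock condition did not hold at write time
    return cur.rowcount == 1


def current_stock(conn: sqlite3.Connection, sku: str) -> int:
    row = conn.execute("SELECT stock FROM inventory WHERE sku = ?", (sku,)).fetchone()
    return row[0]
```

Even if a page displayed a stale "5 in stock" from a replica, a purchase only goes through when the condition holds on the primary at write time, so the stale read cannot cause an oversell.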
519
+ ### Unavailability Cascading Through the System
520
+
521
+ **What happens:** A CP data store becomes unavailable during a partition. Services that depend
522
+ on it also become unavailable. Services that depend on *those* services also become
523
+ unavailable. The blast radius grows with every layer of dependent services.
524
+
525
+ **Example:** The user authentication service uses etcd (CP) for session validation. During a
526
+ partition, etcd's minority side is unavailable. All services in that zone cannot validate
527
+ sessions. The API gateway cannot authenticate requests. The entire zone is effectively down,
528
+ even though the application servers, databases, and network within the zone are healthy.
529
+
530
+ **Prevention:**
531
+ - Cache authentication tokens locally with a TTL so services can validate existing sessions
532
+ during brief partitions
533
+ - Implement circuit breakers that allow degraded operation when a CP dependency is unavailable
534
+ - Design fallback paths: if the CP store is unreachable, degrade gracefully rather than
535
+ failing completely
536
+ - Avoid putting CP systems on the critical path of every request
537
+
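The first prevention item, locally cached validation with a TTL, might look like the sketch below. `CachedSessionValidator`, the injected `validate_remote` callable, and the use of `ConnectionError` to signal an unreachable store are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CachedSessionValidator:
    """Validates sessions against a CP store, falling back to a recent
    cached answer when the store is unreachable."""

    def __init__(self, validate_remote: Callable[[str], bool],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._validate_remote = validate_remote
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[bool, float]] = {}  # id -> (result, checked_at)

    def is_valid(self, session_id: str) -> bool:
        try:
            result = self._validate_remote(session_id)
        except ConnectionError:
            cached = self._cache.get(session_id)
            if cached is not None and self._clock() - cached[1] <= self._ttl:
                return cached[0]  # degraded mode: trust the recent answer
            raise  # nothing fresh enough is cached: fail closed
        self._cache[session_id] = (result, self._clock())
        return result
```

The trade-off is explicit: a session revoked during the partition can remain accepted for up to `ttl_seconds`, so the TTL should be no longer than the staleness the security model tolerates.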
538
+ ### Conflict Resolution Complexity in AP Systems
539
+
540
+ **What happens:** During a partition, both sides accept conflicting writes. When the partition
541
+ heals, the system must resolve these conflicts. Simple strategies (last-write-wins) lose
542
+ data. Complex strategies (application-level merge) introduce subtle bugs.
543
+
544
+ **Example:** Two users edit the same document during a partition. User A adds paragraph 3.
545
+ User B deletes paragraph 2. When the partition heals, the system must merge these changes.
546
+ Last-write-wins would discard one user's edits entirely. A naive merge might apply both
547
+ changes but produce a garbled document.
548
+
549
+ **Prevention:**
550
+ - Use CRDTs (Conflict-free Replicated Data Types) for data structures that can be
551
+ mathematically merged without conflicts (counters, sets, registers)
552
+ - Design data models to be append-only where possible (event sourcing) so conflicts become
553
+ a matter of ordering rather than overwriting
554
+ - Implement domain-specific merge functions that understand the semantics of the data
555
+ - Alert operators when conflicts occur so they can be reviewed, rather than silently
556
+ applying a generic resolution strategy
557
+
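As a concrete instance of the CRDT item above, here is a minimal grow-only counter (G-Counter). It is a teaching sketch with illustrative names, not the implementation used by Riak or any other database.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each node increments only its own slot, and
    merge takes the per-node maximum."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # The per-node max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas accept increments independently during a partition.
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
# After the partition heals, they exchange state in either order.
a.merge(b)
b.merge(a)
```

Both replicas now report 5, and merging again changes nothing. Last-write-wins on a plain integer would instead have kept only 3 or only 2.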
558
+ ### Timeout Misconfiguration
559
+
560
+ **What happens:** Partition detection timeouts set too aggressively cause healthy-but-slow
561
+ nodes to be flagged as partitioned, triggering unnecessary failovers. Timeouts set too
562
+ conservatively leave the system operating inconsistently for minutes without detection.
563
+
564
+ **Prevention:** Use adaptive timeouts (phi accrual failure detector, used by Cassandra and
565
+ Akka). Separate "suspicion" from "declared dead" thresholds. Monitor false-positive and
566
+ false-negative rates and tune accordingly.
567
+
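The idea behind an accrual failure detector can be sketched briefly. The original phi accrual formulation models heartbeat inter-arrival times with a normal distribution; the sketch below substitutes a simpler exponential model, and the class name and thresholds are illustrative assumptions, not the Cassandra or Akka implementation.

```python
import math
from collections import deque
from typing import Deque, Optional


class PhiAccrualDetector:
    """Outputs a suspicion level (phi) that grows the longer a heartbeat
    is overdue, relative to the observed heartbeat interval."""

    def __init__(self, window: int = 100) -> None:
        self.intervals: Deque[float] = deque(maxlen=window)
        self.last_heartbeat: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now: float) -> float:
        if not self.intervals or self.last_heartbeat is None:
            return 0.0  # not enough history to suspect anything
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        # Exponential model: P(next heartbeat arrives later) = exp(-elapsed / mean),
        # and phi = -log10 of that probability.
        return elapsed / (mean * math.log(10))


SUSPECT_PHI = 5.0   # hypothetical "suspicion" threshold
DEAD_PHI = 8.0      # hypothetical "declared dead" threshold
```

Because phi scales with the observed mean interval, a node that is consistently slow accumulates suspicion more slowly than one that suddenly goes silent, which is what separates "healthy but slow" from "partitioned".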
568
+ ---
569
+
570
+ ## Technology Landscape
571
+
572
+ ### CP Databases and When to Use Them
573
+
574
+ | Database | Consensus Protocol | Consistency Level | Best For |
575
+ |----------|-------------------|-------------------|----------|
576
+ | **PostgreSQL** (single primary + replicas) | N/A (single writer) | Linearizable (primary), eventual (replicas) | General-purpose OLTP, moderate scale |
577
+ | **CockroachDB** | Raft | Serializable (default), read-committed available | Global OLTP requiring strong consistency with PostgreSQL compatibility |
578
+ | **Google Spanner** | Paxos + TrueTime | External consistency (stronger than linearizable) | Global-scale OLTP where Google Cloud is acceptable |
579
+ | **TiDB** | Raft (via TiKV) | Snapshot isolation (default), configurable | MySQL-compatible distributed OLTP |
580
+ | **YugabyteDB** | Raft | Serializable (YSQL), tunable (YCQL) | PostgreSQL-compatible distributed OLTP |
581
+ | **FoundationDB** | Custom (OCC + Paxos) | Strictly serializable | Low-level key-value requiring strongest guarantees |
582
+ | **etcd** | Raft | Linearizable | Configuration management, service discovery, leader election |
583
+ | **ZooKeeper** | ZAB | Linearizable | Distributed coordination, lock management |
584
+
585
+ ### AP Databases and When to Use Them
586
+
587
+ | Database | Replication Model | Conflict Resolution | Best For |
588
+ |----------|------------------|-------------------|----------|
589
+ | **Cassandra** | Gossip + hinted handoff | Last-write-wins (LWW) by default, LWTs available | High-throughput time-series, IoT, logs |
590
+ | **DynamoDB** | Multi-master (global tables) | Last-write-wins; strong reads available per-request | Serverless, key-value, session stores |
591
+ | **Riak** | Vnodes + hinted handoff | Vector clocks, CRDTs, sibling resolution | High availability key-value, session stores |
592
+ | **CouchDB** | Multi-master HTTP replication | Revision tree, deterministic winner, user-resolves conflicts | Offline-first mobile, document sync |
593
+ | **ScyllaDB** | Gossip (Cassandra-compatible) | Last-write-wins (Cassandra-compatible) | Cassandra workloads requiring lower latency |
594
+ | **Redis** (replicated) | Async primary-replica | Last-write-wins (no conflict detection) | Caching, session stores, pub/sub |
595
+
596
+ ### Tunable Consistency Databases
597
+
598
+ These databases allow per-operation consistency tuning, which is the most practical approach
599
+ for systems with mixed consistency requirements:
600
+
601
+ | Database | Consistency Tuning Mechanism | Range |
602
+ |----------|---------------------------|-------|
603
+ | **Cassandra** | Per-query consistency level (ONE, TWO, THREE, QUORUM, ALL, LOCAL_QUORUM, EACH_QUORUM) | Full AP to near-CP |
604
+ | **DynamoDB** | `ConsistentRead: true` parameter on GetItem/Query | AP (default) or CP per request |
605
+ | **MongoDB** | Read concern (local, majority, linearizable) + write concern (w:1, w:majority, or w:N to require N members) | Near-AP to CP |
606
+ | **YugabyteDB** | YSQL (serializable) vs YCQL (tunable, Cassandra-compatible) | CP (YSQL) or tunable (YCQL) |
607
+ | **Cosmos DB** | Five consistency levels (strong, bounded staleness, session, consistent prefix, eventual) | CP to AP in five steps |
608
+
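For quorum-replicated stores such as Cassandra, where a given pair of read and write levels falls on the "full AP to near-CP" range depends on whether the read and write replica sets must overlap, that is, whether R + W > N. A small sketch of that check, with illustrative helper names and only three of the levels modeled:

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must acknowledge at the given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level in this sketch: {level}")


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when every read set intersects every write set (R + W > N)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, QUORUM reads plus QUORUM writes overlap (2 + 2 > 3), as do ONE reads plus ALL writes, while ONE plus ONE does not and can return stale data.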
609
+ ---
610
+
611
+ ## Decision Tree
612
+
613
+ Use this flowchart to determine the appropriate consistency model for a specific feature
614
+ or operation (not for the entire system):
615
+
616
+ ```
+ START: What happens if this operation returns stale data?
+   |
+   |-- "Financial loss, safety risk, or regulatory violation"
+   |     |
+   |     --> Use STRONG CONSISTENCY (CP)
+   |         |
+   |         |-- Is this a global system with cross-region writes?
+   |         |     |-- Yes --> CockroachDB, Spanner, YugabyteDB
+   |         |     |-- No --> PostgreSQL (single primary), MySQL with Group Replication
+   |         |
+   |         |-- Is this a coordination/config primitive?
+   |               |-- Yes --> etcd, ZooKeeper
+   |               |-- No --> Use database above with serializable isolation
+   |
+   |-- "User confusion but no lasting harm"
+   |     |
+   |     --> Use SESSION CONSISTENCY (read-your-writes)
+   |         |
+   |         |-- Route user's reads to the same node that processed their writes
+   |         |-- Or use sticky sessions with a bounded staleness guarantee
+   |         |-- Example: User sees their own post immediately; followers see it within seconds
+   |
+   |-- "Minor inconvenience or unnoticeable"
+   |     |
+   |     --> Use EVENTUAL CONSISTENCY (AP)
+   |         |
+   |         |-- Can conflicts be automatically resolved?
+   |         |     |-- Yes, with LWW --> Cassandra, DynamoDB, Redis
+   |         |     |-- Yes, with CRDTs --> Riak, custom implementation
+   |         |     |-- No, needs manual merge --> CouchDB, application-level resolution
+   |         |
+   |         |-- Is this a cache?
+   |               |-- Yes --> Redis, Memcached with TTL-based invalidation
+   |               |-- No --> Choose based on data model and query patterns
+   |
+   |-- "It depends on the specific field or context"
+         |
+         --> Use MIXED CONSISTENCY (tunable per-operation)
+             |
+             |-- Use Cosmos DB's five levels, or
+             |-- Use DynamoDB with per-request ConsistentRead, or
+             |-- Use separate databases for different consistency tiers
+ ```
660
+
661
+ ---
662
+
663
+ ## Implementation Sketch
664
+
665
+ ### Pattern: Consistency Tier Router
666
+
667
+ A middleware that routes requests to different data stores based on the consistency
668
+ requirement of each operation:
669
+
670
+ ```python
+ from enum import Enum
+ from typing import Any, Optional
+
+ class ConsistencyLevel(Enum):
+     STRONG = "strong"      # Linearizable reads from primary
+     SESSION = "session"    # Read-your-writes within a session
+     BOUNDED = "bounded"    # Staleness bounded by time or version
+     EVENTUAL = "eventual"  # Read from any replica or cache
+
+ class ConsistencyRouter:
+     """Routes data operations to the appropriate store based on
+     the consistency level required by each operation.
+
+     The store objects are placeholders: read/write/current_lsn,
+     replayed_lsn/lag_seconds, and get/set/delete are assumed methods."""
+
+     def __init__(self, primary_db, read_replica, cache):
+         self.primary = primary_db    # CP: PostgreSQL primary
+         self.replica = read_replica  # Near-consistent: streaming replica
+         self.cache = cache           # AP: Redis cache
+
+     def read(self, key: str, level: ConsistencyLevel,
+              session_id: Optional[str] = None) -> Any:
+         if level == ConsistencyLevel.STRONG:
+             # Always read from primary -- linearizable
+             return self.primary.read(key)
+
+         if level == ConsistencyLevel.SESSION:
+             if session_id is None:
+                 raise ValueError("SESSION reads require a session_id")
+             # LSN of this session's most recent write to the key, if any
+             last_write_lsn = self.cache.get(f"session:{session_id}:lwt:{key}")
+             if (last_write_lsn is not None
+                     and self.replica.replayed_lsn() < last_write_lsn):
+                 # Replica has not caught up to this session's write
+                 return self.primary.read(key)
+             return self.replica.read(key)
+
+         if level == ConsistencyLevel.BOUNDED:
+             # Read from replica if lag is within bounds
+             if self.replica.lag_seconds() < 5:
+                 return self.replica.read(key)
+             return self.primary.read(key)
+
+         if level == ConsistencyLevel.EVENTUAL:
+             # Try cache first, then fall back to the replica
+             cached = self.cache.get(key)
+             if cached is not None:
+                 return cached
+             value = self.replica.read(key)
+             self.cache.set(key, value, ttl=60)
+             return value
+
+         raise ValueError(f"Unknown consistency level: {level}")
+
+     def write(self, key: str, value: Any,
+               session_id: Optional[str] = None) -> None:
+         # Writes always go to primary (CP path)
+         self.primary.write(key, value)
+
+         # Record the write position (LSN) for session consistency
+         if session_id:
+             self.cache.set(
+                 f"session:{session_id}:lwt:{key}",
+                 self.primary.current_lsn(),
+                 ttl=300
+             )
+
+         # Invalidate the cached value so EVENTUAL reads refetch it (AP path)
+         self.cache.delete(key)
+ ```
734
+
735
+ ### Pattern: Feature Consistency Declaration
736
+
737
+ Declare consistency requirements per feature in configuration, not in code, so they can
738
+ be reviewed and audited. Each entry specifies the consistency level, backing store,
739
+ optional staleness bound, conflict resolution strategy, and a rationale:
740
+
741
+ ```yaml
+ # consistency-config.yaml
+ features:
+   user_authentication:   { consistency: strong,   store: postgresql_primary, rationale: "Security-critical" }
+   product_catalog:       { consistency: eventual, store: elasticsearch,      max_staleness: 30s }
+   inventory_check:       { consistency: bounded,  store: postgresql_replica, max_staleness: 2s }
+   inventory_reservation: { consistency: strong,   store: postgresql_primary, isolation: serializable }
+   shopping_cart:         { consistency: session,  store: dynamodb,           conflict_resolution: union_merge }
+   recommendation_feed:   { consistency: eventual, store: redis_cache,        max_staleness: 5m }
+   payment_processing:    { consistency: strong,   store: postgresql_primary, isolation: serializable }
+ ```
752
+
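A declaration like the one above is only auditable if it is also checked. The following sketch validates an already-parsed `features` mapping (loading the YAML itself is left out); the validation rules, function names, and the duration format are assumptions for illustration, not part of any existing tool.

```python
import re
from typing import Any, Dict, List

ALLOWED_LEVELS = {"strong", "session", "bounded", "eventual"}
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_staleness(text: str) -> float:
    """Convert a duration such as '30s' or '5m' into seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def validate_features(features: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; empty means the config is valid."""
    problems: List[str] = []
    for name, spec in features.items():
        level = spec.get("consistency")
        if level not in ALLOWED_LEVELS:
            problems.append(f"{name}: unknown consistency level {level!r}")
        if "store" not in spec:
            problems.append(f"{name}: missing store")
        if level == "bounded" and "max_staleness" not in spec:
            problems.append(f"{name}: bounded consistency needs max_staleness")
        if level == "strong" and "max_staleness" in spec:
            problems.append(f"{name}: strong consistency cannot declare max_staleness")
        if "max_staleness" in spec:
            try:
                parse_staleness(str(spec["max_staleness"]))
            except ValueError as exc:
                problems.append(f"{name}: {exc}")
    return problems
```

Running such a check in CI turns a consistency relaxation into a reviewed change: a feature cannot silently move from `strong` to `eventual` without the diff showing it.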
753
+ ---
754
+
755
+ ## Key Takeaways
756
+
757
+ 1. **CAP is about partitions, not about normal operation.** During normal operation, a
758
+ well-designed distributed system provides both consistency and availability. The
759
+ tradeoff only manifests during network partitions.
760
+
761
+ 2. **PACELC is the more useful model.** It captures the latency-consistency tradeoff that
762
+ dominates day-to-day system design, not just the partition-time tradeoff.
763
+
764
+ 3. **Per-feature, not per-system.** Choose consistency levels per feature, per operation,
765
+ or even per request. No serious production system is uniformly CP or AP.
766
+
767
+ 4. **Start consistent, relax deliberately.** Begin with strong consistency. Identify paths
768
+ where eventual consistency is acceptable. Document the rationale for each relaxation.
769
+
770
+ 5. **The real question is cost of inconsistency vs cost of unavailability.** For each
771
+ feature, quantify what happens when data is stale versus what happens when the service
772
+ is down. The answer determines your consistency choice.
773
+
774
+ ---
775
+
776
+ ## Cross-References
777
+
778
+ - **distributed-systems-fundamentals** -- Foundational concepts (replication, consensus, failure models) that underpin CAP
779
+ - **data-consistency** -- Deep dive into consistency models, isolation levels, and implementation patterns
780
+ - **consensus-and-coordination** -- Raft, Paxos, ZAB, and other protocols that implement CP guarantees
781
+ - **sql-vs-nosql** -- Database selection criteria beyond CAP, including data model, query patterns, and operational concerns
782
+
783
+ ---
784
+
785
+ ## Sources
786
+
787
+ - Brewer, E. (2000). "Towards Robust Distributed Systems." ACM PODC Keynote.
788
+ - Gilbert, S. and Lynch, N. (2002). "Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services."
789
+ - Brewer, E. (2012). "CAP Twelve Years Later: How the 'Rules' Have Changed." IEEE Computer.
790
+ - Abadi, D. (2012). "Consistency Tradeoffs in Modern Distributed Database System Design." IEEE Computer.
791
+ - Kleppmann, M. (2015). ["Please stop calling databases CP or AP."](https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html)
792
+ - Brewer, E. (2017). ["Spanner, TrueTime and the CAP Theorem."](https://research.google/pubs/spanner-truetime-and-the-cap-theorem/) Google Research.
793
+ - DeCandia, G. et al. (2007). "Dynamo: Amazon's Highly Available Key-value Store." SOSP.
794
+ - [CAP Theorem -- Wikipedia](https://en.wikipedia.org/wiki/CAP_theorem)
795
+ - [PACELC Theorem -- Wikipedia](https://en.wikipedia.org/wiki/PACELC_design_principle)
796
+ - [Jepsen: Consistency Models](https://jepsen.io/consistency)
797
+ - [Consistency and Partition Tolerance -- ByteByteGo](https://blog.bytebytego.com/p/consistency-and-partition-tolerance)
798
+ - [CAP Theorem -- IBM](https://www.ibm.com/think/topics/cap-theorem)
799
+ - [PACELC Theorem -- ScyllaDB](https://www.scylladb.com/glossary/pacelc-theorem/)
800
+ - [Strong Consistency Models -- Aphyr](https://aphyr.com/posts/313-strong-consistency-models)