@wazir-dev/cli 1.0.0

This diff shows the content of publicly available package versions that have been released to one of the supported registries. It is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (629)
  1. package/AGENTS.md +111 -0
  2. package/CHANGELOG.md +14 -0
  3. package/CONTRIBUTING.md +101 -0
  4. package/LICENSE +21 -0
  5. package/README.md +314 -0
  6. package/assets/composition-engine.mmd +34 -0
  7. package/assets/demo-script.sh +17 -0
  8. package/assets/logo-dark.svg +14 -0
  9. package/assets/logo.svg +14 -0
  10. package/assets/pipeline.mmd +39 -0
  11. package/assets/record-demo.sh +51 -0
  12. package/docs/README.md +51 -0
  13. package/docs/adapters/context-mode.md +60 -0
  14. package/docs/concepts/architecture.md +87 -0
  15. package/docs/concepts/artifact-model.md +60 -0
  16. package/docs/concepts/composition-engine.md +36 -0
  17. package/docs/concepts/indexing-and-recall.md +160 -0
  18. package/docs/concepts/observability.md +41 -0
  19. package/docs/concepts/roles-and-workflows.md +59 -0
  20. package/docs/concepts/terminology-policy.md +27 -0
  21. package/docs/getting-started/01-installation.md +78 -0
  22. package/docs/getting-started/02-first-run.md +102 -0
  23. package/docs/getting-started/03-adding-to-project.md +15 -0
  24. package/docs/getting-started/04-host-setup.md +15 -0
  25. package/docs/guides/ci-integration.md +15 -0
  26. package/docs/guides/creating-skills.md +15 -0
  27. package/docs/guides/expertise-module-authoring.md +15 -0
  28. package/docs/guides/hook-development.md +15 -0
  29. package/docs/guides/memory-and-learnings.md +34 -0
  30. package/docs/guides/multi-host-export.md +15 -0
  31. package/docs/guides/troubleshooting.md +101 -0
  32. package/docs/guides/writing-custom-roles.md +15 -0
  33. package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
  34. package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
  35. package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
  36. package/docs/readmes/INDEX.md +99 -0
  37. package/docs/readmes/features/expertise/README.md +171 -0
  38. package/docs/readmes/features/exports/README.md +222 -0
  39. package/docs/readmes/features/hooks/README.md +103 -0
  40. package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
  41. package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
  42. package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
  43. package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
  44. package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
  45. package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
  46. package/docs/readmes/features/hooks/session-start.md +119 -0
  47. package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
  48. package/docs/readmes/features/roles/README.md +157 -0
  49. package/docs/readmes/features/roles/clarifier.md +152 -0
  50. package/docs/readmes/features/roles/content-author.md +190 -0
  51. package/docs/readmes/features/roles/designer.md +193 -0
  52. package/docs/readmes/features/roles/executor.md +184 -0
  53. package/docs/readmes/features/roles/learner.md +210 -0
  54. package/docs/readmes/features/roles/planner.md +182 -0
  55. package/docs/readmes/features/roles/researcher.md +164 -0
  56. package/docs/readmes/features/roles/reviewer.md +184 -0
  57. package/docs/readmes/features/roles/specifier.md +162 -0
  58. package/docs/readmes/features/roles/verifier.md +215 -0
  59. package/docs/readmes/features/schemas/README.md +178 -0
  60. package/docs/readmes/features/skills/README.md +63 -0
  61. package/docs/readmes/features/skills/brainstorming.md +96 -0
  62. package/docs/readmes/features/skills/debugging.md +148 -0
  63. package/docs/readmes/features/skills/design.md +120 -0
  64. package/docs/readmes/features/skills/prepare-next.md +109 -0
  65. package/docs/readmes/features/skills/run-audit.md +159 -0
  66. package/docs/readmes/features/skills/scan-project.md +109 -0
  67. package/docs/readmes/features/skills/self-audit.md +176 -0
  68. package/docs/readmes/features/skills/tdd.md +137 -0
  69. package/docs/readmes/features/skills/using-skills.md +92 -0
  70. package/docs/readmes/features/skills/verification.md +120 -0
  71. package/docs/readmes/features/skills/writing-plans.md +104 -0
  72. package/docs/readmes/features/tooling/README.md +320 -0
  73. package/docs/readmes/features/workflows/README.md +186 -0
  74. package/docs/readmes/features/workflows/author.md +181 -0
  75. package/docs/readmes/features/workflows/clarify.md +154 -0
  76. package/docs/readmes/features/workflows/design-review.md +171 -0
  77. package/docs/readmes/features/workflows/design.md +169 -0
  78. package/docs/readmes/features/workflows/discover.md +162 -0
  79. package/docs/readmes/features/workflows/execute.md +173 -0
  80. package/docs/readmes/features/workflows/learn.md +167 -0
  81. package/docs/readmes/features/workflows/plan-review.md +165 -0
  82. package/docs/readmes/features/workflows/plan.md +170 -0
  83. package/docs/readmes/features/workflows/prepare-next.md +167 -0
  84. package/docs/readmes/features/workflows/review.md +169 -0
  85. package/docs/readmes/features/workflows/run-audit.md +191 -0
  86. package/docs/readmes/features/workflows/spec-challenge.md +159 -0
  87. package/docs/readmes/features/workflows/specify.md +160 -0
  88. package/docs/readmes/features/workflows/verify.md +177 -0
  89. package/docs/readmes/packages/README.md +50 -0
  90. package/docs/readmes/packages/ajv.md +117 -0
  91. package/docs/readmes/packages/context-mode.md +118 -0
  92. package/docs/readmes/packages/gray-matter.md +116 -0
  93. package/docs/readmes/packages/node-test.md +137 -0
  94. package/docs/readmes/packages/yaml.md +112 -0
  95. package/docs/reference/configuration-reference.md +159 -0
  96. package/docs/reference/expertise-index.md +52 -0
  97. package/docs/reference/git-flow.md +43 -0
  98. package/docs/reference/hooks.md +87 -0
  99. package/docs/reference/host-exports.md +50 -0
  100. package/docs/reference/launch-checklist.md +172 -0
  101. package/docs/reference/marketplace-listings.md +76 -0
  102. package/docs/reference/release-process.md +34 -0
  103. package/docs/reference/roles-reference.md +77 -0
  104. package/docs/reference/skills.md +33 -0
  105. package/docs/reference/templates.md +29 -0
  106. package/docs/reference/tooling-cli.md +94 -0
  107. package/docs/truth-claims.yaml +222 -0
  108. package/expertise/PROGRESS.md +63 -0
  109. package/expertise/README.md +18 -0
  110. package/expertise/antipatterns/PROGRESS.md +56 -0
  111. package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
  112. package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
  113. package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
  114. package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
  115. package/expertise/antipatterns/backend/index.md +24 -0
  116. package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
  117. package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
  118. package/expertise/antipatterns/code/async-antipatterns.md +622 -0
  119. package/expertise/antipatterns/code/code-smells.md +1186 -0
  120. package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
  121. package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
  122. package/expertise/antipatterns/code/index.md +27 -0
  123. package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
  124. package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
  125. package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
  126. package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
  127. package/expertise/antipatterns/design/dark-patterns.md +1121 -0
  128. package/expertise/antipatterns/design/index.md +22 -0
  129. package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
  130. package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
  131. package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
  132. package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
  133. package/expertise/antipatterns/frontend/index.md +23 -0
  134. package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
  135. package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
  136. package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
  137. package/expertise/antipatterns/index.md +31 -0
  138. package/expertise/antipatterns/performance/index.md +20 -0
  139. package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
  140. package/expertise/antipatterns/performance/premature-optimization.md +623 -0
  141. package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
  142. package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
  143. package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
  144. package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
  145. package/expertise/antipatterns/process/index.md +23 -0
  146. package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
  147. package/expertise/antipatterns/security/index.md +20 -0
  148. package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
  149. package/expertise/antipatterns/security/security-theater.md +843 -0
  150. package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
  151. package/expertise/architecture/PROGRESS.md +70 -0
  152. package/expertise/architecture/data/caching-architecture.md +671 -0
  153. package/expertise/architecture/data/data-consistency.md +574 -0
  154. package/expertise/architecture/data/data-modeling.md +536 -0
  155. package/expertise/architecture/data/event-streams-and-queues.md +634 -0
  156. package/expertise/architecture/data/index.md +25 -0
  157. package/expertise/architecture/data/search-architecture.md +663 -0
  158. package/expertise/architecture/data/sql-vs-nosql.md +708 -0
  159. package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
  160. package/expertise/architecture/decisions/build-vs-buy.md +616 -0
  161. package/expertise/architecture/decisions/index.md +23 -0
  162. package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
  163. package/expertise/architecture/decisions/technology-selection.md +616 -0
  164. package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
  165. package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
  166. package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
  167. package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
  168. package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
  169. package/expertise/architecture/distributed/index.md +25 -0
  170. package/expertise/architecture/distributed/saga-pattern.md +797 -0
  171. package/expertise/architecture/foundations/architectural-thinking.md +460 -0
  172. package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
  173. package/expertise/architecture/foundations/design-principles-solid.md +649 -0
  174. package/expertise/architecture/foundations/domain-driven-design.md +719 -0
  175. package/expertise/architecture/foundations/index.md +25 -0
  176. package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
  177. package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
  178. package/expertise/architecture/index.md +34 -0
  179. package/expertise/architecture/integration/api-design-graphql.md +638 -0
  180. package/expertise/architecture/integration/api-design-grpc.md +804 -0
  181. package/expertise/architecture/integration/api-design-rest.md +892 -0
  182. package/expertise/architecture/integration/index.md +25 -0
  183. package/expertise/architecture/integration/third-party-integration.md +795 -0
  184. package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
  185. package/expertise/architecture/integration/websockets-realtime.md +791 -0
  186. package/expertise/architecture/mobile-architecture/index.md +22 -0
  187. package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
  188. package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
  189. package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
  190. package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
  191. package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
  192. package/expertise/architecture/patterns/event-driven.md +797 -0
  193. package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
  194. package/expertise/architecture/patterns/index.md +27 -0
  195. package/expertise/architecture/patterns/layered-architecture.md +736 -0
  196. package/expertise/architecture/patterns/microservices.md +753 -0
  197. package/expertise/architecture/patterns/modular-monolith.md +692 -0
  198. package/expertise/architecture/patterns/monolith.md +626 -0
  199. package/expertise/architecture/patterns/plugin-architecture.md +735 -0
  200. package/expertise/architecture/patterns/serverless.md +780 -0
  201. package/expertise/architecture/scaling/database-scaling.md +615 -0
  202. package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
  203. package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
  204. package/expertise/architecture/scaling/index.md +24 -0
  205. package/expertise/architecture/scaling/multi-tenancy.md +800 -0
  206. package/expertise/architecture/scaling/stateless-design.md +787 -0
  207. package/expertise/backend/embedded-firmware.md +625 -0
  208. package/expertise/backend/go.md +853 -0
  209. package/expertise/backend/index.md +24 -0
  210. package/expertise/backend/java-spring.md +448 -0
  211. package/expertise/backend/node-typescript.md +625 -0
  212. package/expertise/backend/python-fastapi.md +724 -0
  213. package/expertise/backend/rust.md +458 -0
  214. package/expertise/backend/solidity.md +711 -0
  215. package/expertise/composition-map.yaml +443 -0
  216. package/expertise/content/foundations/content-modeling.md +395 -0
  217. package/expertise/content/foundations/editorial-standards.md +449 -0
  218. package/expertise/content/foundations/index.md +24 -0
  219. package/expertise/content/foundations/microcopy.md +455 -0
  220. package/expertise/content/foundations/terminology-governance.md +509 -0
  221. package/expertise/content/index.md +34 -0
  222. package/expertise/content/patterns/accessibility-copy.md +518 -0
  223. package/expertise/content/patterns/index.md +24 -0
  224. package/expertise/content/patterns/notification-content.md +433 -0
  225. package/expertise/content/patterns/sample-content.md +486 -0
  226. package/expertise/content/patterns/state-copy.md +439 -0
  227. package/expertise/design/PROGRESS.md +58 -0
  228. package/expertise/design/disciplines/dark-mode-theming.md +577 -0
  229. package/expertise/design/disciplines/design-systems.md +595 -0
  230. package/expertise/design/disciplines/index.md +25 -0
  231. package/expertise/design/disciplines/information-architecture.md +800 -0
  232. package/expertise/design/disciplines/interaction-design.md +788 -0
  233. package/expertise/design/disciplines/responsive-design.md +552 -0
  234. package/expertise/design/disciplines/usability-testing.md +516 -0
  235. package/expertise/design/disciplines/user-research.md +792 -0
  236. package/expertise/design/foundations/accessibility-design.md +796 -0
  237. package/expertise/design/foundations/color-theory.md +797 -0
  238. package/expertise/design/foundations/iconography.md +795 -0
  239. package/expertise/design/foundations/index.md +26 -0
  240. package/expertise/design/foundations/motion-and-animation.md +653 -0
  241. package/expertise/design/foundations/rtl-design.md +585 -0
  242. package/expertise/design/foundations/spacing-and-layout.md +607 -0
  243. package/expertise/design/foundations/typography.md +800 -0
  244. package/expertise/design/foundations/visual-hierarchy.md +761 -0
  245. package/expertise/design/index.md +32 -0
  246. package/expertise/design/patterns/authentication-flows.md +474 -0
  247. package/expertise/design/patterns/content-consumption.md +789 -0
  248. package/expertise/design/patterns/data-display.md +618 -0
  249. package/expertise/design/patterns/e-commerce.md +1494 -0
  250. package/expertise/design/patterns/feedback-and-states.md +642 -0
  251. package/expertise/design/patterns/forms-and-input.md +819 -0
  252. package/expertise/design/patterns/gamification.md +801 -0
  253. package/expertise/design/patterns/index.md +31 -0
  254. package/expertise/design/patterns/microinteractions.md +449 -0
  255. package/expertise/design/patterns/navigation.md +800 -0
  256. package/expertise/design/patterns/notifications.md +705 -0
  257. package/expertise/design/patterns/onboarding.md +700 -0
  258. package/expertise/design/patterns/search-and-filter.md +601 -0
  259. package/expertise/design/patterns/settings-and-preferences.md +768 -0
  260. package/expertise/design/patterns/social-and-community.md +748 -0
  261. package/expertise/design/platforms/desktop-native.md +612 -0
  262. package/expertise/design/platforms/index.md +25 -0
  263. package/expertise/design/platforms/mobile-android.md +825 -0
  264. package/expertise/design/platforms/mobile-cross-platform.md +983 -0
  265. package/expertise/design/platforms/mobile-ios.md +699 -0
  266. package/expertise/design/platforms/tablet.md +794 -0
  267. package/expertise/design/platforms/web-dashboard.md +790 -0
  268. package/expertise/design/platforms/web-responsive.md +550 -0
  269. package/expertise/design/psychology/behavioral-nudges.md +449 -0
  270. package/expertise/design/psychology/cognitive-load.md +1191 -0
  271. package/expertise/design/psychology/error-psychology.md +778 -0
  272. package/expertise/design/psychology/index.md +22 -0
  273. package/expertise/design/psychology/persuasive-design.md +736 -0
  274. package/expertise/design/psychology/user-mental-models.md +623 -0
  275. package/expertise/design/tooling/open-pencil.md +266 -0
  276. package/expertise/frontend/angular.md +1073 -0
  277. package/expertise/frontend/desktop-electron.md +546 -0
  278. package/expertise/frontend/flutter.md +782 -0
  279. package/expertise/frontend/index.md +27 -0
  280. package/expertise/frontend/native-android.md +409 -0
  281. package/expertise/frontend/native-ios.md +490 -0
  282. package/expertise/frontend/react-native.md +1160 -0
  283. package/expertise/frontend/react.md +808 -0
  284. package/expertise/frontend/vue.md +1089 -0
  285. package/expertise/humanize/domain-rules-code.md +79 -0
  286. package/expertise/humanize/domain-rules-content.md +67 -0
  287. package/expertise/humanize/domain-rules-technical-docs.md +56 -0
  288. package/expertise/humanize/index.md +35 -0
  289. package/expertise/humanize/self-audit-checklist.md +87 -0
  290. package/expertise/humanize/sentence-patterns.md +218 -0
  291. package/expertise/humanize/vocabulary-blacklist.md +105 -0
  292. package/expertise/i18n/PROGRESS.md +65 -0
  293. package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
  294. package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
  295. package/expertise/i18n/advanced/complex-scripts.md +30 -0
  296. package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
  297. package/expertise/i18n/advanced/testing-i18n.md +28 -0
  298. package/expertise/i18n/content/content-adaptation.md +23 -0
  299. package/expertise/i18n/content/locale-specific-formatting.md +23 -0
  300. package/expertise/i18n/content/machine-translation-integration.md +28 -0
  301. package/expertise/i18n/content/translation-management.md +29 -0
  302. package/expertise/i18n/foundations/date-time-calendars.md +67 -0
  303. package/expertise/i18n/foundations/i18n-architecture.md +272 -0
  304. package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
  305. package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
  306. package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
  307. package/expertise/i18n/foundations/string-externalization.md +236 -0
  308. package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
  309. package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
  310. package/expertise/i18n/index.md +38 -0
  311. package/expertise/i18n/platform/backend-i18n.md +31 -0
  312. package/expertise/i18n/platform/flutter-i18n.md +148 -0
  313. package/expertise/i18n/platform/native-android-i18n.md +36 -0
  314. package/expertise/i18n/platform/native-ios-i18n.md +36 -0
  315. package/expertise/i18n/platform/react-i18n.md +103 -0
  316. package/expertise/i18n/platform/web-css-i18n.md +81 -0
  317. package/expertise/i18n/rtl/arabic-specific.md +175 -0
  318. package/expertise/i18n/rtl/hebrew-specific.md +149 -0
  319. package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
  320. package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
  321. package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
  322. package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
  323. package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
  324. package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
  325. package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
  326. package/expertise/i18n/rtl/rtl-typography.md +160 -0
  327. package/expertise/index.md +113 -0
  328. package/expertise/index.yaml +216 -0
  329. package/expertise/infrastructure/cloud-aws.md +597 -0
  330. package/expertise/infrastructure/cloud-gcp.md +599 -0
  331. package/expertise/infrastructure/cybersecurity.md +816 -0
  332. package/expertise/infrastructure/database-mongodb.md +447 -0
  333. package/expertise/infrastructure/database-postgres.md +400 -0
  334. package/expertise/infrastructure/devops-cicd.md +787 -0
  335. package/expertise/infrastructure/index.md +27 -0
  336. package/expertise/performance/PROGRESS.md +50 -0
  337. package/expertise/performance/backend/api-latency.md +1204 -0
  338. package/expertise/performance/backend/background-jobs.md +506 -0
  339. package/expertise/performance/backend/connection-pooling.md +1209 -0
  340. package/expertise/performance/backend/database-query-optimization.md +515 -0
  341. package/expertise/performance/backend/index.md +23 -0
  342. package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
  343. package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
  344. package/expertise/performance/foundations/caching-strategies.md +489 -0
  345. package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
  346. package/expertise/performance/foundations/index.md +24 -0
  347. package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
  348. package/expertise/performance/foundations/memory-management.md +964 -0
  349. package/expertise/performance/foundations/performance-budgets.md +1314 -0
  350. package/expertise/performance/index.md +31 -0
  351. package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
  352. package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
  353. package/expertise/performance/infrastructure/index.md +22 -0
  354. package/expertise/performance/infrastructure/load-balancing.md +1081 -0
  355. package/expertise/performance/infrastructure/observability.md +1079 -0
  356. package/expertise/performance/mobile/index.md +23 -0
  357. package/expertise/performance/mobile/mobile-animations.md +544 -0
  358. package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
  359. package/expertise/performance/mobile/mobile-network.md +452 -0
  360. package/expertise/performance/mobile/mobile-rendering.md +599 -0
  361. package/expertise/performance/mobile/mobile-startup-time.md +505 -0
  362. package/expertise/performance/platform-specific/flutter-performance.md +647 -0
  363. package/expertise/performance/platform-specific/index.md +22 -0
  364. package/expertise/performance/platform-specific/node-performance.md +1307 -0
  365. package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
  366. package/expertise/performance/platform-specific/react-performance.md +1403 -0
  367. package/expertise/performance/web/bundle-optimization.md +1239 -0
  368. package/expertise/performance/web/image-and-media.md +636 -0
  369. package/expertise/performance/web/index.md +24 -0
  370. package/expertise/performance/web/network-optimization.md +1133 -0
  371. package/expertise/performance/web/rendering-performance.md +1098 -0
  372. package/expertise/performance/web/ssr-and-hydration.md +918 -0
  373. package/expertise/performance/web/web-vitals.md +1374 -0
  374. package/expertise/quality/accessibility.md +985 -0
  375. package/expertise/quality/evidence-based-verification.md +499 -0
  376. package/expertise/quality/index.md +24 -0
  377. package/expertise/quality/ml-model-audit.md +614 -0
  378. package/expertise/quality/performance.md +600 -0
  379. package/expertise/quality/testing-api.md +891 -0
  380. package/expertise/quality/testing-mobile.md +496 -0
  381. package/expertise/quality/testing-web.md +849 -0
  382. package/expertise/security/PROGRESS.md +54 -0
  383. package/expertise/security/agentic-identity.md +540 -0
  384. package/expertise/security/compliance-frameworks.md +601 -0
  385. package/expertise/security/data/data-encryption.md +364 -0
  386. package/expertise/security/data/data-privacy-gdpr.md +692 -0
  387. package/expertise/security/data/database-security.md +1171 -0
  388. package/expertise/security/data/index.md +22 -0
  389. package/expertise/security/data/pii-handling.md +531 -0
  390. package/expertise/security/foundations/authentication.md +1041 -0
  391. package/expertise/security/foundations/authorization.md +603 -0
  392. package/expertise/security/foundations/cryptography.md +1001 -0
  393. package/expertise/security/foundations/index.md +25 -0
  394. package/expertise/security/foundations/owasp-top-10.md +1354 -0
  395. package/expertise/security/foundations/secrets-management.md +1217 -0
  396. package/expertise/security/foundations/secure-sdlc.md +700 -0
  397. package/expertise/security/foundations/supply-chain-security.md +698 -0
  398. package/expertise/security/index.md +31 -0
  399. package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
  400. package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
  401. package/expertise/security/infrastructure/container-security.md +721 -0
  402. package/expertise/security/infrastructure/incident-response.md +1295 -0
  403. package/expertise/security/infrastructure/index.md +24 -0
  404. package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
  405. package/expertise/security/infrastructure/network-security.md +1337 -0
  406. package/expertise/security/mobile/index.md +23 -0
  407. package/expertise/security/mobile/mobile-android-security.md +1218 -0
  408. package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
  409. package/expertise/security/mobile/mobile-data-storage.md +1265 -0
  410. package/expertise/security/mobile/mobile-ios-security.md +1401 -0
  411. package/expertise/security/mobile/mobile-network-security.md +1520 -0
  412. package/expertise/security/smart-contract-security.md +594 -0
  413. package/expertise/security/testing/index.md +22 -0
  414. package/expertise/security/testing/penetration-testing.md +1258 -0
  415. package/expertise/security/testing/security-code-review.md +1765 -0
  416. package/expertise/security/testing/threat-modeling.md +1074 -0
  417. package/expertise/security/testing/vulnerability-scanning.md +1062 -0
  418. package/expertise/security/web/api-security.md +586 -0
  419. package/expertise/security/web/cors-and-headers.md +433 -0
  420. package/expertise/security/web/csrf.md +562 -0
  421. package/expertise/security/web/file-upload.md +1477 -0
  422. package/expertise/security/web/index.md +25 -0
  423. package/expertise/security/web/injection.md +1375 -0
  424. package/expertise/security/web/session-management.md +1101 -0
  425. package/expertise/security/web/xss.md +1158 -0
  426. package/exports/README.md +17 -0
  427. package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
  428. package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
  429. package/exports/hosts/claude/.claude/agents/designer.md +55 -0
  430. package/exports/hosts/claude/.claude/agents/executor.md +55 -0
  431. package/exports/hosts/claude/.claude/agents/learner.md +51 -0
  432. package/exports/hosts/claude/.claude/agents/planner.md +53 -0
  433. package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
  434. package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
  435. package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
  436. package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
  437. package/exports/hosts/claude/.claude/commands/author.md +42 -0
  438. package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
  439. package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
  440. package/exports/hosts/claude/.claude/commands/design.md +44 -0
  441. package/exports/hosts/claude/.claude/commands/discover.md +37 -0
  442. package/exports/hosts/claude/.claude/commands/execute.md +48 -0
  443. package/exports/hosts/claude/.claude/commands/learn.md +38 -0
  444. package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
  445. package/exports/hosts/claude/.claude/commands/plan.md +39 -0
  446. package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
  447. package/exports/hosts/claude/.claude/commands/review.md +40 -0
  448. package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
  449. package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
  450. package/exports/hosts/claude/.claude/commands/specify.md +38 -0
  451. package/exports/hosts/claude/.claude/commands/verify.md +37 -0
  452. package/exports/hosts/claude/.claude/settings.json +34 -0
  453. package/exports/hosts/claude/CLAUDE.md +19 -0
  454. package/exports/hosts/claude/export.manifest.json +38 -0
  455. package/exports/hosts/claude/host-package.json +67 -0
  456. package/exports/hosts/codex/AGENTS.md +19 -0
  457. package/exports/hosts/codex/export.manifest.json +38 -0
  458. package/exports/hosts/codex/host-package.json +41 -0
  459. package/exports/hosts/cursor/.cursor/hooks.json +16 -0
  460. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
  461. package/exports/hosts/cursor/export.manifest.json +38 -0
  462. package/exports/hosts/cursor/host-package.json +42 -0
  463. package/exports/hosts/gemini/GEMINI.md +19 -0
  464. package/exports/hosts/gemini/export.manifest.json +38 -0
  465. package/exports/hosts/gemini/host-package.json +41 -0
  466. package/hooks/README.md +18 -0
  467. package/hooks/definitions/loop_cap_guard.yaml +21 -0
  468. package/hooks/definitions/post_tool_capture.yaml +24 -0
  469. package/hooks/definitions/pre_compact_summary.yaml +19 -0
  470. package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
  471. package/hooks/definitions/protected_path_write_guard.yaml +19 -0
  472. package/hooks/definitions/session_start.yaml +19 -0
  473. package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
  474. package/hooks/loop-cap-guard +17 -0
  475. package/hooks/post-tool-lint +36 -0
  476. package/hooks/protected-path-write-guard +17 -0
  477. package/hooks/session-start +41 -0
  478. package/llms-full.txt +2355 -0
  479. package/llms.txt +43 -0
  480. package/package.json +79 -0
  481. package/roles/README.md +20 -0
  482. package/roles/clarifier.md +42 -0
  483. package/roles/content-author.md +63 -0
  484. package/roles/designer.md +55 -0
  485. package/roles/executor.md +55 -0
  486. package/roles/learner.md +51 -0
  487. package/roles/planner.md +53 -0
  488. package/roles/researcher.md +43 -0
  489. package/roles/reviewer.md +54 -0
  490. package/roles/specifier.md +47 -0
  491. package/roles/verifier.md +71 -0
  492. package/schemas/README.md +24 -0
  493. package/schemas/accepted-learning.schema.json +20 -0
  494. package/schemas/author-artifact.schema.json +156 -0
  495. package/schemas/clarification.schema.json +19 -0
  496. package/schemas/design-artifact.schema.json +80 -0
  497. package/schemas/docs-claim.schema.json +18 -0
  498. package/schemas/export-manifest.schema.json +20 -0
  499. package/schemas/hook.schema.json +67 -0
  500. package/schemas/host-export-package.schema.json +18 -0
  501. package/schemas/implementation-plan.schema.json +19 -0
  502. package/schemas/proposed-learning.schema.json +19 -0
  503. package/schemas/research.schema.json +18 -0
  504. package/schemas/review.schema.json +29 -0
  505. package/schemas/run-manifest.schema.json +18 -0
  506. package/schemas/spec-challenge.schema.json +18 -0
  507. package/schemas/spec.schema.json +20 -0
  508. package/schemas/usage.schema.json +102 -0
  509. package/schemas/verification-proof.schema.json +29 -0
  510. package/schemas/wazir-manifest.schema.json +173 -0
  511. package/skills/README.md +40 -0
  512. package/skills/brainstorming/SKILL.md +77 -0
  513. package/skills/debugging/SKILL.md +50 -0
  514. package/skills/design/SKILL.md +61 -0
  515. package/skills/dispatching-parallel-agents/SKILL.md +128 -0
  516. package/skills/executing-plans/SKILL.md +70 -0
  517. package/skills/finishing-a-development-branch/SKILL.md +169 -0
  518. package/skills/humanize/SKILL.md +123 -0
  519. package/skills/init-pipeline/SKILL.md +124 -0
  520. package/skills/prepare-next/SKILL.md +20 -0
  521. package/skills/receiving-code-review/SKILL.md +123 -0
  522. package/skills/requesting-code-review/SKILL.md +105 -0
  523. package/skills/requesting-code-review/code-reviewer.md +108 -0
  524. package/skills/run-audit/SKILL.md +197 -0
  525. package/skills/scan-project/SKILL.md +41 -0
  526. package/skills/self-audit/SKILL.md +153 -0
  527. package/skills/subagent-driven-development/SKILL.md +154 -0
  528. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
  529. package/skills/subagent-driven-development/implementer-prompt.md +102 -0
  530. package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
  531. package/skills/tdd/SKILL.md +23 -0
  532. package/skills/using-git-worktrees/SKILL.md +163 -0
  533. package/skills/using-skills/SKILL.md +95 -0
  534. package/skills/verification/SKILL.md +22 -0
  535. package/skills/wazir/SKILL.md +463 -0
  536. package/skills/writing-plans/SKILL.md +30 -0
  537. package/skills/writing-skills/SKILL.md +157 -0
  538. package/skills/writing-skills/anthropic-best-practices.md +122 -0
  539. package/skills/writing-skills/persuasion-principles.md +50 -0
  540. package/templates/README.md +20 -0
  541. package/templates/artifacts/README.md +10 -0
  542. package/templates/artifacts/accepted-learning.md +19 -0
  543. package/templates/artifacts/accepted-learning.template.json +12 -0
  544. package/templates/artifacts/author.md +74 -0
  545. package/templates/artifacts/author.template.json +19 -0
  546. package/templates/artifacts/clarification.md +21 -0
  547. package/templates/artifacts/clarification.template.json +12 -0
  548. package/templates/artifacts/execute-notes.md +19 -0
  549. package/templates/artifacts/implementation-plan.md +21 -0
  550. package/templates/artifacts/implementation-plan.template.json +11 -0
  551. package/templates/artifacts/learning-proposal.md +19 -0
  552. package/templates/artifacts/next-run-handoff.md +21 -0
  553. package/templates/artifacts/plan-review.md +19 -0
  554. package/templates/artifacts/proposed-learning.template.json +12 -0
  555. package/templates/artifacts/research.md +21 -0
  556. package/templates/artifacts/research.template.json +12 -0
  557. package/templates/artifacts/review-findings.md +19 -0
  558. package/templates/artifacts/review.template.json +11 -0
  559. package/templates/artifacts/run-manifest.template.json +8 -0
  560. package/templates/artifacts/spec-challenge.md +19 -0
  561. package/templates/artifacts/spec-challenge.template.json +11 -0
  562. package/templates/artifacts/spec.md +21 -0
  563. package/templates/artifacts/spec.template.json +12 -0
  564. package/templates/artifacts/verification-proof.md +19 -0
  565. package/templates/artifacts/verification-proof.template.json +11 -0
  566. package/templates/examples/accepted-learning.example.json +14 -0
  567. package/templates/examples/author.example.json +152 -0
  568. package/templates/examples/clarification.example.json +15 -0
  569. package/templates/examples/docs-claim.example.json +8 -0
  570. package/templates/examples/export-manifest.example.json +7 -0
  571. package/templates/examples/host-export-package.example.json +11 -0
  572. package/templates/examples/implementation-plan.example.json +17 -0
  573. package/templates/examples/proposed-learning.example.json +13 -0
  574. package/templates/examples/research.example.json +15 -0
  575. package/templates/examples/research.example.md +6 -0
  576. package/templates/examples/review.example.json +17 -0
  577. package/templates/examples/run-manifest.example.json +9 -0
  578. package/templates/examples/spec-challenge.example.json +14 -0
  579. package/templates/examples/spec.example.json +21 -0
  580. package/templates/examples/verification-proof.example.json +21 -0
  581. package/templates/examples/wazir-manifest.example.yaml +65 -0
  582. package/templates/task-definition-schema.md +99 -0
  583. package/tooling/README.md +20 -0
  584. package/tooling/src/adapters/context-mode.js +50 -0
  585. package/tooling/src/capture/command.js +376 -0
  586. package/tooling/src/capture/store.js +99 -0
  587. package/tooling/src/capture/usage.js +270 -0
  588. package/tooling/src/checks/branches.js +50 -0
  589. package/tooling/src/checks/brand-truth.js +110 -0
  590. package/tooling/src/checks/changelog.js +231 -0
  591. package/tooling/src/checks/command-registry.js +36 -0
  592. package/tooling/src/checks/commits.js +102 -0
  593. package/tooling/src/checks/docs-drift.js +103 -0
  594. package/tooling/src/checks/docs-truth.js +201 -0
  595. package/tooling/src/checks/runtime-surface.js +156 -0
  596. package/tooling/src/cli.js +116 -0
  597. package/tooling/src/command-options.js +56 -0
  598. package/tooling/src/commands/validate.js +320 -0
  599. package/tooling/src/doctor/command.js +91 -0
  600. package/tooling/src/export/command.js +77 -0
  601. package/tooling/src/export/compiler.js +498 -0
  602. package/tooling/src/guards/loop-cap-guard.js +52 -0
  603. package/tooling/src/guards/protected-path-write-guard.js +67 -0
  604. package/tooling/src/index/command.js +152 -0
  605. package/tooling/src/index/storage.js +1061 -0
  606. package/tooling/src/index/summarizers.js +261 -0
  607. package/tooling/src/loaders.js +18 -0
  608. package/tooling/src/project-root.js +22 -0
  609. package/tooling/src/recall/command.js +225 -0
  610. package/tooling/src/schema-validator.js +30 -0
  611. package/tooling/src/state-root.js +40 -0
  612. package/tooling/src/status/command.js +71 -0
  613. package/wazir.manifest.yaml +135 -0
  614. package/workflows/README.md +19 -0
  615. package/workflows/author.md +42 -0
  616. package/workflows/clarify.md +38 -0
  617. package/workflows/design-review.md +46 -0
  618. package/workflows/design.md +44 -0
  619. package/workflows/discover.md +37 -0
  620. package/workflows/execute.md +48 -0
  621. package/workflows/learn.md +38 -0
  622. package/workflows/plan-review.md +42 -0
  623. package/workflows/plan.md +39 -0
  624. package/workflows/prepare-next.md +37 -0
  625. package/workflows/review.md +40 -0
  626. package/workflows/run-audit.md +41 -0
  627. package/workflows/spec-challenge.md +41 -0
  628. package/workflows/specify.md +38 -0
  629. package/workflows/verify.md +37 -0
@@ -0,0 +1,790 @@
1
+ # Monolith to Microservices — Architecture Expertise Module
2
+
3
+ > Migrating from monolith to microservices is one of the riskiest architectural undertakings. Most organizations should NOT do it — the modular monolith is sufficient for the majority. When migration is genuinely needed, incremental extraction using the Strangler Fig pattern is the only proven safe approach. Big-bang rewrites fail.
4
+
5
+ > **Category:** Decision
6
+ > **Complexity:** Expert
7
+ > **Applies when:** Existing monolith causing demonstrable team scaling problems, deployment bottlenecks, or independent scaling needs that modular monolith cannot address
8
+
9
+ ---
10
+
11
+ ## What This Is (and What It Isn't)
12
+
13
+ A monolith-to-microservices migration is the process of decomposing a single deployable application into multiple independently deployable services, each owning its own data store and business capability. The goal is to gain independent deployment, independent scaling, and team autonomy — but the cost is distributed systems complexity across every layer of the stack.
14
+
15
+ The only proven safe approach is the **Strangler Fig pattern**: gradually replace monolith capabilities with new services, one bounded context at a time, while the monolith continues to serve traffic. The monolith shrinks incrementally as services take over. At no point does the system go through a non-functional state.
16
+
17
+ The name comes from the strangler fig of tropical forests: a fig that germinates in the canopy of a host tree and sends roots down around it, gradually enveloping it. Eventually the host tree dies and rots away, leaving only the fig, which was functional at every stage. The same principle applies to migration: the system is always working, always deployable, always serving users.
18
+
19
+ **What this migration IS:**
20
+
21
+ - An incremental, multi-quarter (often multi-year) evolutionary process
22
+ - A domain decomposition exercise first, a technical exercise second
23
+ - An organizational restructuring as much as a technical one
24
+ - Reversible at every step: traffic for each extracted service can be routed back to the monolith
25
+
26
+ **What this migration IS NOT:**
27
+
28
+ - A big-bang rewrite. Big-bang rewrites of a monolith into microservices have a long record of failing outright, running massively over schedule, or producing a distributed monolith. The Second System Effect (Fred Brooks) applies with full force.
29
+ - A technology upgrade. "Rewrite the Java monolith in Go microservices" is not a migration strategy. It is two migrations stacked on top of each other (language change + architecture change), and the risk multiplies.
30
+ - A solution to bad code. If the monolith has unclear domain boundaries, no tests, and tangled dependencies, extracting pieces into services will produce microservices with unclear boundaries, no tests, and tangled network dependencies — except now debugging requires distributed tracing instead of a stack trace.
31
+ - Inevitable. Many of the world's most successful software systems are monoliths or modular monoliths. There is no architectural law that says a system must evolve toward microservices.
32
+
33
+ **The Strangler Fig pattern — how it works at a high level:**
34
+
35
+ ```
36
+ Phase 1: Identify a bounded context for extraction (e.g., Notifications)
37
+ Phase 2: Build the new Notifications service alongside the monolith
38
+ Phase 3: Introduce a routing layer (API gateway, proxy, or facade)
39
+ Phase 4: Redirect traffic: monolith → new service (1% → 10% → 100%)
40
+ Phase 5: Remove the Notifications code from the monolith
41
+ Phase 6: Repeat for the next bounded context
42
+ ```
43
+
44
+ At every phase, the system works. At every phase, you can stop and declare the migration complete with the current set of extracted services and the remaining monolith. This is the critical property that makes the Strangler Fig pattern safe: it is always possible to stop.
45
+
46
+ ---
47
+
48
+ ## When to Do It
49
+
50
+ This section describes scenarios where migration is genuinely justified. Each scenario includes real examples and measurable indicators. If you cannot point to one of these specific pain points with data, you should not migrate.
51
+
52
+ ### Teams larger than 50 developers are blocked by each other
53
+
54
+ When 8+ teams share a single deployable monolith, deployment coordination becomes a full-time job. Teams queue for merge windows. A bug in Team A's code blocks Team B's release. Feature flags accumulate to manage partial deployments. Release trains become monthly instead of daily.
55
+
56
+ **Amazon (2001-2006):** The canonical example. Jeff Bezos mandated that every team must communicate via service interfaces — no shared databases, no direct linking. This was driven by organizational scaling pain: hundreds of engineers could not coordinate changes to a single codebase. The result was a service-oriented architecture that enabled Amazon's engineering organization to scale to thousands of developers. The mandate was organizational first, technical second.
57
+
58
+ **Uber (2014-2017):** Uber's original Python monolith ("uberblack") served the company from founding through hypergrowth. By 2014, with hundreds of engineers across multiple cities and product lines (UberX, UberPool, UberEats), the monolith was the bottleneck. Deployment took hours. A single bad commit could take down all services globally. Teams were blocked by each other's changes in shared code paths. Uber decomposed along domain lines — trips, pricing, matching, payments — enabling independent team velocity. By 2017, Uber operated approximately 2,200 microservices.
59
+
60
+ **Measurable indicator:** Deployment frequency has declined as the team has grown. Lead time from "code complete" to "in production" exceeds 1 week due to coordination, not technical complexity.
61
+
62
+ ### Deployment frequency is limited by coupling, not by choice
63
+
64
+ If your team wants to deploy multiple times per day but cannot because changes to one component require retesting the entire application, and a modular monolith with independent test suites has already been attempted, service extraction may be warranted.
65
+
66
+ **Netflix (2008-2015):** After a three-day database corruption outage in August 2008 exposed the fragility of their monolithic architecture, Netflix began a seven-year migration to microservices. The driver was not team size alone but deployment coupling: a change to the recommendation engine required redeploying the entire application, including the streaming pipeline, billing, and user management. By 2015, Netflix operated 700+ microservices and deployed thousands of times per day across independent services.
67
+
68
+ **Measurable indicator:** You have tried modular monolith approaches (separate test suites per module, feature flags, trunk-based development) and deployment coupling persists because of shared runtime state or shared deployment artifact constraints.
69
+
70
+ ### Genuinely different scaling requirements per component
71
+
72
+ If your video transcoding pipeline needs 500 GPU cores during peak hours while your user profile service needs 2 CPU cores, scaling them as a single unit wastes resources. This is a legitimate driver for extraction — but only for the specific components with divergent scaling needs.
73
+
74
+ **Measurable indicator:** Infrastructure cost analysis shows 3x+ over-provisioning because the monolith must be scaled to the requirement of its most resource-intensive component. Actual measurement of traffic patterns shows 10x+ difference in resource requirements between components.
75
+
76
+ ### Regulatory or compliance isolation requirements
77
+
78
+ Some domains require that payment processing, healthcare data, or financial calculations run in isolated environments with independent audit trails, separate access controls, and dedicated infrastructure. A monolith sharing process memory between PCI-scoped and non-PCI-scoped code may not satisfy auditors.
79
+
80
+ **Measurable indicator:** Compliance auditors have flagged shared process boundaries as a risk. The cost of making the entire monolith PCI/HIPAA/SOX compliant exceeds the cost of extracting the regulated component.
81
+
82
+ ### Technology heterogeneity is genuinely required
83
+
84
+ If your ML inference pipeline must be Python/PyTorch, your real-time bidding engine must be C++ for latency, and your web API is Go, these cannot coexist in a single deployable unit. This is a hard constraint that microservices solve.
85
+
86
+ **Measurable indicator:** Performance benchmarks prove that the required component cannot meet its SLA in the monolith's language/runtime, and the performance gap is fundamental (not fixable by optimization).
87
+
88
+ ---
89
+
90
+ ## When NOT to Do It
91
+
92
+ **This section is deliberately longer than "When to Do It" because the default answer should be "don't migrate."** The industry has a well-documented bias toward microservices adoption driven by hype, resume-driven development, and vendor marketing — not by genuine architectural need. The failure rate of monolith-to-microservices migrations is high, and the costs are routinely underestimated by 3-10x.
93
+
94
+ ### "Microservices are modern" is not a reason
95
+
96
+ The most common driver for migration is not a measured pain point but a belief that microservices are inherently superior. They are not. They are a tradeoff: you gain independent deployment and scaling at the cost of distributed systems complexity. For most applications, the cost exceeds the benefit.
97
+
98
+ Sam Newman, author of "Monolith to Microservices" (O'Reilly, 2019), states explicitly: **"The monolith is not the enemy."** And: **"Microservices should not be the default choice."** His book's core argument is that decomposition must be driven by measurable organizational or technical pain, not by architectural fashion.
99
+
100
+ ### Team smaller than ~20 developers
101
+
102
+ Below approximately 20 developers (2-3 teams), the coordination overhead of a monolith is manageable, and the operational overhead of microservices is disproportionate. You need to staff on-call rotations, build distributed tracing infrastructure, manage inter-service API versioning, handle partial failures, and operate a service mesh — all for a team that could coordinate via a daily standup.
103
+
104
+ **The math:** A microservices architecture typically requires 10-30% of engineering capacity devoted to platform/infrastructure work (CI/CD per service, monitoring, service discovery, API gateway management). For a team of 10, that is 1-3 full-time engineers maintaining infrastructure instead of building product. For a team of 100, the same 10-30% is 10-30 engineers, which is a viable platform team.
105
+
106
+ ### You have not tried a modular monolith first
107
+
108
+ The pragmatic evolution path is: Monolith → Modular Monolith → Selective Microservice Extraction. Skipping the modular monolith step means you are paying the full distributed systems tax before proving that simpler alternatives are insufficient.
109
+
110
+ A modular monolith provides most of the organizational benefits of microservices (team ownership, clear boundaries, independent development) without the operational cost. If a modular monolith solves your pain, you have saved yourself years of migration work. If it does not, you have at least identified your domain boundaries — which you need before extracting services anyway.
111
+
112
+ ### No DevOps maturity (CI/CD, monitoring, observability)
113
+
114
+ Microservices require mature DevOps practices as a prerequisite, not as a nice-to-have. Each service needs: its own CI/CD pipeline, its own deployment configuration, health checks, distributed tracing, centralized logging, alerting, and on-call procedures. If your team does not have automated deployment for a single application, you are not ready for automated deployment of 20 applications.
115
+
116
+ **The prerequisite checklist:**
117
+ - Automated testing with >80% coverage
118
+ - CI/CD pipeline with <15 minute build-to-deploy
119
+ - Centralized logging and monitoring
120
+ - Infrastructure-as-code for provisioning
121
+ - Container orchestration (Kubernetes or equivalent)
122
+ - Distributed tracing infrastructure (Jaeger, Zipkin, or equivalent)
123
+
124
+ If you lack more than 2 of these, invest in DevOps maturity before considering microservices.
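The checklist rule above can be expressed as a small self-assessment helper. This is a minimal sketch: the prerequisite identifiers and function names are hypothetical labels chosen for illustration, and only the "more than 2 missing" threshold comes from the text.

```python
# Hypothetical identifiers for the six prerequisites listed above.
PREREQUISITES = [
    "automated_testing",        # >80% coverage
    "fast_ci_cd",               # <15 minute build-to-deploy
    "centralized_logging",      # logging and monitoring
    "infrastructure_as_code",
    "container_orchestration",  # Kubernetes or equivalent
    "distributed_tracing",      # Jaeger, Zipkin, or equivalent
]


def missing_prerequisites(in_place: set) -> list:
    """Return the prerequisites that are not yet in place, in checklist order."""
    return [p for p in PREREQUISITES if p not in in_place]


def should_invest_in_devops_first(in_place: set) -> bool:
    """Apply the rule from the text: lacking more than 2 items means wait."""
    return len(missing_prerequisites(in_place)) > 2
```

A team with only testing, CI/CD, and logging in place lacks three items, so the helper reports that DevOps maturity should come first.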
125
+
126
+ ### Real failure stories: companies that migrated and regretted it
127
+
128
+ **Segment (2017-2018): Microservices back to monolith.** Segment, the customer data platform, migrated from a monolith to microservices in 2016-2017. Each destination integration (sending data to Amplitude, Mixpanel, Google Analytics, etc.) became its own microservice. During a period of hypergrowth, Segment was adding approximately 3 new destination integrations per month, each requiring a new microservice with its own deployment pipeline, queue, worker, and monitoring.
129
+
130
+ The result: 3 full-time engineers spent most of their time just keeping the microservices alive. The team was unable to make progress on product features. Each new destination multiplied operational complexity. Segment reversed course: it consolidated the per-destination services into a single service, backed by "Centrifuge", its delivery infrastructure built to handle billions of messages per day, and described the move publicly in 2018. Developer productivity improved markedly. The team went from being unable to ship features to shipping them routinely.
131
+
132
+ The lesson: Segment's microservices architecture was driven by a reasonable-sounding principle (isolation per destination) but the operational cost per service was constant, while the number of services grew linearly with business growth. The total operational cost became unsustainable for the team size.
133
+
134
+ **Amazon Prime Video (2023): Microservices to monolith for audio/video monitoring.** Amazon Prime Video's Video Quality Analysis (VQA) team published a case study describing how they reduced infrastructure costs by 90% by consolidating a microservices architecture (built on AWS Step Functions and Lambda) into a monolithic application. The original architecture used distributed components for media conversion, defect detection, and real-time notifications. The inter-service communication overhead (particularly data transfer between Lambda functions via S3) dominated the cost.
135
+
136
+ Adrian Cockcroft, former VP of Cloud Architecture Strategy at AWS, noted this was more accurately described as a "Step Functions to proper microservice" story — the original architecture was not well-designed microservices but rather an over-distributed pipeline. Regardless, the case demonstrates that even within AWS, teams find that consolidation outperforms distribution for certain workloads.
137
+
138
+ **Unnamed enterprises (industry pattern):** A 2025 CNCF survey reportedly found that approximately 42% of organizations that adopted microservices have consolidated at least some services back into larger deployable units, and a 2025 Gartner study reportedly found that 60% of teams regret adopting microservices for small-to-medium applications. The pattern is consistent: teams adopt microservices for perceived benefits, discover that the operational overhead exceeds the organizational benefits at their scale, and consolidate.
139
+
140
+ ### The 18-month migration with no user-visible improvement
141
+
142
+ A recurring anti-pattern: a team spends 12-18 months migrating a monolith to microservices. During that period, no new features ship. User experience does not improve. Performance does not improve (often it degrades due to network overhead). The only visible result is higher infrastructure costs and a more complex deployment pipeline. Leadership questions the value. The migration stalls at 40% completion, leaving the organization with the worst of both worlds: a partially decomposed system that is harder to operate than either a monolith or a fully migrated microservices architecture.
143
+
144
+ **Prevention:** Before starting migration, define measurable success criteria. "Deploy frequency increases from weekly to daily for Team X" or "Scaling cost for Component Y decreases by 50%." If you cannot state the expected user-visible or cost-visible improvement, the migration lacks justification.
145
+
146
+ ### Resume-driven development
147
+
148
+ A blunt but necessary point: engineers are incentivized to adopt microservices because "Led migration from monolith to microservices" is a career-enhancing line on a resume. This incentive is real and systematically biases architectural decisions toward complexity. Organizations must guard against this by requiring measurable justification for migration decisions and by valuing operational simplicity as an engineering virtue.
149
+
150
+ ---
151
+
152
+ ## How It Works
153
+
154
+ ### Step 1: Domain analysis — identify extraction candidates
155
+
156
+ Before touching any code, map the monolith's domain. Use Domain-Driven Design techniques to identify bounded contexts. Each bounded context is a potential extraction candidate.
157
+
158
+ **Techniques for domain discovery:**
159
+
160
+ - **Event Storming:** Workshop format where domain experts and engineers map business events on sticky notes. Clusters of related events reveal natural boundaries. A 2-day Event Storming session is often more valuable than months of code analysis.
161
+ - **Context Mapping:** Identify the relationships between bounded contexts: shared kernel, customer-supplier, conformist, anti-corruption layer. These relationships determine extraction order.
162
+ - **Dependency analysis:** Static analysis of the monolith's code to identify clusters of classes/modules with high internal cohesion and low external coupling. Tools: JDepend (Java), Structure101, Lattix, CodeScene.
163
+ - **Data affinity analysis:** Which database tables are joined together most frequently? Tables that are always queried together likely belong in the same bounded context. Tables that are rarely joined across contexts are natural split points.
164
+
165
+ **Prioritization matrix (Sam Newman's approach):**
166
+
167
+ Plot each candidate on two axes:
168
+ - **X-axis: Difficulty of extraction** (low → high). Factors: number of inbound dependencies, shared database tables, synchronous call chains, data volume.
169
+ - **Y-axis: Value of extraction** (low → high). Factors: deployment frequency need, scaling need, team autonomy need, regulatory isolation need.
170
+
171
+ Extract candidates in the **high value, low difficulty** quadrant first. This delivers benefits early and builds migration expertise before tackling harder extractions.
172
+
173
+ ```
174
+ High Value │ Extract      │ Extract
175
+            │ FIRST        │ SECOND
176
+            │ (quick wins) │ (hard but worth it)
177
+            │──────────────┼──────────────────────────
178
+            │ Leave in     │ Leave in monolith
179
+            │ monolith     │ (high cost, low benefit)
180
+ Low Value  │ (not worth   │
181
+            │ the effort)  │
182
+            └──────────────┴──────────────────────────
183
+             Low Difficulty   High Difficulty
184
+ ```
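The quadrant logic can be sketched as a small ranking helper. The 1-10 scoring scale, the threshold of 5, and the candidate names are assumptions made for illustration; the source only defines the two axes and the order of the quadrants.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    value: int       # assumed scale, 1 (low) to 10 (high)
    difficulty: int  # assumed scale, 1 (low) to 10 (high)


def quadrant(c: Candidate, threshold: int = 5) -> str:
    """Place a candidate in one of the matrix quadrants."""
    high_value = c.value > threshold
    hard = c.difficulty > threshold
    if high_value and not hard:
        return "extract first"
    if high_value and hard:
        return "extract second"
    return "leave in monolith"


def extraction_order(candidates: list) -> list:
    """Drop low-value candidates, then put quick wins ahead of hard extractions."""
    worth_it = [c for c in candidates if quadrant(c) != "leave in monolith"]
    return sorted(worth_it, key=lambda c: (quadrant(c) != "extract first", c.difficulty))
```

Scoring is the subjective part; the helper only makes the ordering rule explicit once the team has agreed on scores.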
185
+
186
+ ### Step 2: Build the new service alongside the monolith
187
+
188
+ The new service is developed and deployed independently. It does not replace the monolith capability yet — it runs in parallel. Key decisions at this stage:
189
+
190
+ - **API contract:** Define the new service's API before writing implementation. The API should reflect the domain model, not the monolith's internal structure.
191
+ - **Data ownership:** The new service gets its own database. It does NOT share the monolith's database. Data it needs from the monolith is either: (a) passed in API calls, (b) synchronized via Change Data Capture (CDC), or (c) replicated via events.
192
+ - **Anti-corruption layer (ACL):** An adapter that translates between the monolith's data model and the new service's domain model. The ACL prevents the monolith's legacy concepts from leaking into the new service's clean domain model.
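An anti-corruption layer is often little more than a translation function at the boundary. The sketch below assumes a hypothetical legacy row shape (`USR_EMAIL_ADDR`, `NOTIF_TYPE_CD`, `MSG_TXT`) and a hypothetical `Notification` domain type; none of these names come from a real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    """The new service's domain model (hypothetical)."""
    recipient_email: str
    channel: str
    body: str


# Hypothetical legacy numeric codes used by the monolith.
_CHANNEL_BY_LEGACY_CODE = {1: "email", 2: "sms", 3: "push"}


def from_monolith_row(row: dict) -> Notification:
    """Translate a legacy row into the domain model.

    Legacy column names and codes stop here; the rest of the new
    service only ever sees Notification objects.
    """
    return Notification(
        recipient_email=row["USR_EMAIL_ADDR"].strip().lower(),
        channel=_CHANNEL_BY_LEGACY_CODE[row["NOTIF_TYPE_CD"]],
        body=row["MSG_TXT"],
    )
```

Keeping the mapping in one place means a later change to the monolith's schema touches only the adapter, not the service's domain code.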
193
+
194
+ ### Step 3: Introduce a routing layer (the strangler facade)
195
+
196
+ A proxy or API gateway sits in front of both the monolith and the new service. It routes requests to the appropriate backend based on rules. This is the "strangler" — it intercepts traffic and gradually redirects it.
197
+
198
+ **Implementation options:**
199
+
200
+ - **API Gateway (Kong, AWS API Gateway, Envoy):** Route by URL path. `/api/notifications/*` goes to the new service; everything else goes to the monolith.
201
+ - **Feature flags:** Use a feature flag system to route specific users or a percentage of traffic to the new service.
202
+ - **DNS/load balancer:** At the infrastructure level, route traffic based on request characteristics.
203
+ - **In-code routing:** The monolith itself delegates calls to the new service via an HTTP client. Simplest to implement but couples the monolith to the new service.
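Whichever option is chosen, the routing decision itself reduces to a prefix match with the monolith as the default. A minimal sketch follows; the backend URLs are hypothetical placeholders, and a real gateway would express the same rule in its own configuration rather than in application code.

```python
# Hypothetical backend addresses, for illustration only.
MONOLITH = "http://monolith.internal"
ROUTES = {
    "/api/notifications/": "http://notifications.internal",
}


def pick_backend(path: str) -> str:
    """Return the backend for a request path; unmatched paths stay on the monolith."""
    for prefix, backend in ROUTES.items():
        if path.startswith(prefix):
            return backend
    return MONOLITH
```

Removing an entry from `ROUTES` sends that traffic back to the monolith, which is the property that keeps each extraction reversible.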
204
+
205
+ ### Step 4: Incremental traffic migration
206
+
207
+ Never flip 100% of traffic to the new service at once. Use a graduated rollout:
208
+
209
+ ```
210
+ 1% traffic → new service (canary: verify basic functionality)
211
+ 5% traffic → new service (verify error rates match monolith)
212
+ 25% traffic → new service (verify latency and throughput at scale)
213
+ 50% traffic → new service (verify under significant load)
214
+ 100% traffic → new service (full migration, monolith code now dormant)
215
+ ```
216
+
217
+ At each stage, compare: error rates, latency percentiles (p50, p95, p99), business metrics (conversion rates, data accuracy), and resource consumption. If any metric degrades, roll back to the monolith instantly via the routing layer.
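One common way to implement a graduated rollout is deterministic bucketing: hash a stable identifier into a bucket from 0 to 99 and send the request to the new service when the bucket is below the current percentage. The sketch below assumes a string user ID is available on each request; the function names are illustrative.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def use_new_service(user_id: str, rollout_percent: int) -> bool:
    """True when this user falls inside the current rollout percentage."""
    return bucket(user_id) < rollout_percent
```

Because the bucket is stable, a user who is on the new service at 5% stays on it at 25%, and setting the percentage to 0 is the instant rollback.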
218
+
219
+ **Parallel running:** For critical business logic (payments, order processing), run both the monolith and the new service simultaneously, compare outputs, and alert on divergence. This catches logic bugs before users are affected.
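Parallel running can be sketched as a wrapper that always returns the monolith's answer and only reports differences. This assumes the new service can be called without side effects (for example in a dry-run mode); the handler and callback names are hypothetical.

```python
def parallel_run(request, monolith_handler, new_service_handler, on_divergence):
    """Serve from the monolith while shadow-testing the new service."""
    trusted = monolith_handler(request)
    try:
        candidate = new_service_handler(request)
    except Exception as exc:
        # A failure in the new service must never affect the user.
        on_divergence(request, trusted, exc)
        return trusted
    if candidate != trusted:
        on_divergence(request, trusted, candidate)
    return trusted
```

The user always receives the monolith's result, so a logic bug in the new service shows up as an alert rather than as a bad payment.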
220
+
221
+ ### Step 5: Database decomposition — the hardest part
222
+
223
+ Database decomposition is universally cited as the hardest part of monolith-to-microservices migration. The monolith typically has a single shared database with foreign key relationships, joins, and transactions spanning what should be separate bounded contexts.
224
+
225
+ **Strategies for database decomposition:**
226
+
227
+ **Strategy 1: Shared database (temporary)**
228
+ During migration, both the monolith and the new service read/write the same database. This is a transitional state — it couples the services at the data layer and must be resolved, but it reduces initial migration risk.
229
+
230
+ **Strategy 2: Database view / materialized view**
231
+ The new service reads from a database view that presents the monolith's data in the service's domain model. Write operations go through the monolith. This is a read-only transitional pattern.
232
+
233
+ **Strategy 3: Change Data Capture (CDC)**
234
+ Use Debezium, AWS DMS, or similar tools to capture changes from the monolith's database and replicate them to the new service's database. The new service has its own copy of the data it needs, in its own schema, updated asynchronously.
235
+
236
+ ```
237
+ Monolith DB ──CDC (Debezium)──► Kafka ──► New Service DB
238
+ (source of truth)                         (read replica in
239
+                                            service's schema)
240
+ ```
241
+
242
+ **Strategy 4: Dual-write with reconciliation**
243
+ The monolith writes to both its own database and the new service's database (or publishes events that the new service consumes). A reconciliation job periodically verifies consistency. This is error-prone (dual writes can diverge on failure) and should only be used temporarily.
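The reconciliation job mentioned above can be sketched as a comparison of two keyed snapshots. This sketch assumes both stores can be read into dictionaries keyed by primary key, which only holds for small tables; a production job would typically compare in batches or by checksum.

```python
def reconcile(monolith_rows: dict, service_rows: dict) -> dict:
    """Report keys that diverged between the monolith and the new service."""
    missing = sorted(k for k in monolith_rows if k not in service_rows)
    orphaned = sorted(k for k in service_rows if k not in monolith_rows)
    mismatched = sorted(
        k for k in monolith_rows
        if k in service_rows and monolith_rows[k] != service_rows[k]
    )
    return {
        "missing_in_service": missing,    # write reached the monolith only
        "orphaned_in_service": orphaned,  # write reached the service only
        "mismatched": mismatched,         # both written, values differ
    }
```

Any non-empty list is a signal that a dual write failed partway, which is the failure mode that makes this strategy suitable only as a temporary measure.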
244
+
245
+ **Strategy 5: Event-sourced migration**
246
+ If the monolith can be modified to publish domain events, the new service builds its own read model from those events. This is the cleanest long-term solution but requires the monolith to support event publication.
247
+
248
+ **The critical rule:** At the end of migration, each service owns its data exclusively. No shared databases between services. Cross-service data access is via APIs or events, never via direct database queries.
249
+
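The reconciliation job mentioned under Strategy 4 can be sketched as a comparison of two keyed snapshots. This is a minimal illustration, not a production job: the function names (`row_fingerprint`, `reconcile`) and the in-memory dictionaries standing in for the two databases are assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Mapping

Row = Mapping[str, Any]


def row_fingerprint(row: Row) -> str:
    """Stable hash of a row's contents, independent of column order."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(
    monolith_rows: Mapping[str, Row], service_rows: Mapping[str, Row]
) -> dict[str, list[str]]:
    """Compare two snapshots keyed by primary key and report drifted keys."""
    report: dict[str, list[str]] = {
        "missing_in_service": [],
        "extra_in_service": [],
        "mismatched": [],
    }
    for key, row in monolith_rows.items():
        if key not in service_rows:
            report["missing_in_service"].append(key)
        elif row_fingerprint(row) != row_fingerprint(service_rows[key]):
            report["mismatched"].append(key)
    report["extra_in_service"] = [k for k in service_rows if k not in monolith_rows]
    return report
```

In practice the two snapshots would be read in batches from each database, and a non-empty report would feed the divergence alerting rather than being repaired silently.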
250
+ ### Step 6: Retire monolith code
251
+
252
+ Once 100% of traffic for a capability is served by the new service and has been stable for a defined period (typically 2-4 weeks), remove the corresponding code from the monolith. This is often neglected — teams leave dead code in the monolith "just in case." Dead code increases cognitive load, maintenance burden, and the risk of accidentally routing traffic back to it.
253
+
254
+ ### Step 7: Repeat for the next bounded context
255
+
256
+ Each extraction follows the same cycle. With each iteration, the team's extraction expertise improves, patterns become reusable, and the monolith shrinks. A typical migration extracts 1-3 services per quarter.
257
+
258
+ ---
259
+
260
+ ## Trade-Offs Matrix
261
+
262
+ | You Get | You Pay |
263
+ |---------|---------|
264
+ | Independent deployment per service — each team ships on its own schedule | N deployment pipelines to build, maintain, and monitor instead of 1 |
265
+ | Independent scaling per service — scale only what needs scaling | Service discovery, load balancing, and circuit breakers become mandatory infrastructure |
266
+ | Team autonomy — each team owns a service end-to-end | Cross-service debugging requires distributed tracing (Jaeger, Zipkin, Datadog) |
267
+ | Technology heterogeneity — each service can use the best language/framework | Polyglot operations burden: N runtimes, N dependency management tools, N security patching processes |
268
+ | Fault isolation — one service crashing does not take down others | Network partitions, timeouts, and partial failures become normal operating conditions |
269
+ | Smaller codebases per service — faster builds, easier onboarding | System-level understanding requires reading N codebases and understanding their interactions |
270
+ | Independent data stores — each service optimizes for its access patterns | Distributed data consistency requires sagas, eventual consistency, and compensation logic |
271
+ | Organizational scaling — add teams without increasing coordination overhead | API versioning, backward compatibility, and contract testing become critical |
272
+ | Granular monitoring and alerting per service | Monitoring infrastructure scales with service count; alert fatigue is common |
273
+ | Selective technology upgrades — upgrade one service at a time | Integration testing across services is harder than testing a monolith |
274
+ | Regulatory isolation — PCI/HIPAA-scoped services separated from others | Every network call is a potential security boundary requiring mTLS, auth tokens, and audit |
275
+ | Incremental migration — extract value without big-bang risk | Migration can take years; maintaining partial migration state is expensive |
276
+
277
+ ---
278
+
279
+ ## Evolution Path
280
+
281
+ The recommended evolution path is not "monolith to microservices" but "monolith to modular monolith to selective extraction." Most organizations should stop at the modular monolith stage.
282
+
283
+ ### Stage 1: Monolith (team of 1-10)
284
+
285
+ A single deployable application with no enforced internal boundaries. This is the correct starting architecture for most new projects. Optimize for development speed and learning. Do not prematurely decompose.
286
+
287
+ ### Stage 2: Modular monolith (team of 10-50)
288
+
289
+ Enforce module boundaries within the monolith using tooling (ArchUnit, Packwerk, Spring Modulith, Nx). Each module owns its data, exposes a public API, and hides its internals. Teams own modules. Deploy as a single unit.
290
+
291
+ **This is where most organizations should stay.** The modular monolith provides team autonomy and clean domain separation without distributed systems complexity. Shopify operates one of the world's largest Rails codebases (2,800,000+ lines of Ruby, 2,000+ engineers) as a modular monolith with 40+ production deployments per day.
292
+
293
+ ### Stage 3: Selective extraction (team of 50+, with specific pain)
294
+
295
+ Extract only the modules that have a demonstrated need for independent deployment or independent scaling. The modular monolith's clean interfaces make extraction a mechanical operation: wrap the module's interface in a network adapter, deploy it separately, swap the in-process call for a network call.
296
+
297
+ **What to extract (with evidence):**
298
+ - The module whose deployment is blocked by other modules' release schedules (measured: deployment frequency data)
299
+ - The module that needs 100x the compute resources of other modules (measured: resource utilization data)
300
+ - The module that must be in a separate compliance boundary (documented: regulatory requirement)
301
+ - The module that must use a different technology stack (validated: performance benchmarks)
302
+
303
+ **What NOT to extract:**
304
+ - Modules that "feel like" they should be services
305
+ - Modules extracted to match an org chart change
306
+ - Modules extracted because "we'll need to scale someday"
307
+ - All modules simultaneously
308
+
309
+ ### Stage 4: Continued monolith + services (steady state)
310
+
311
+ Many successful architectures are hybrids: a modular monolith core with a small number of extracted services for specific needs. This is not a failure state or a transitional state — it is often the optimal end state. The monolith handles 80% of the domain where the simplicity of in-process communication, shared deployment, and ACID transactions is valuable. The services handle the 20% where independent scaling, deployment, or isolation is genuinely required.
312
+
313
+ ---
314
+
315
+ ## Failure Modes
316
+
317
+ ### Failure Mode 1: The distributed monolith
318
+
319
+ **What it looks like:** Services are deployed independently but cannot function independently. Every change requires coordinating deployments across 3-5 services. Services share a database. A service cannot be tested without running 8 other services locally. Deploying Service A without simultaneously deploying Service B causes failures.
320
+
321
+ **Why it happens:** The monolith was decomposed along technical layers (UI service, business logic service, data access service) instead of domain boundaries. Or the monolith was split into services that communicate synchronously in long chains, creating temporal coupling. Or services share a database, creating data coupling that undoes the independence microservices are supposed to provide.
322
+
323
+ **How common:** Extremely. A 2025 CNCF survey found that approximately 42% of organizations that adopted microservices have since consolidated at least some services back into larger units. The distributed monolith is the single most common outcome of naive migration.
324
+
325
+ **Prevention:** Decompose along domain boundaries (bounded contexts), not technical layers. Each service must own its data. Prefer asynchronous communication (events) over synchronous call chains.
326
+
327
+ ### Failure Mode 2: Data consistency nightmares
328
+
329
+ **What it looks like:** Orders are created but payments are not processed. Inventory is decremented but the order fails. Customers see stale data because eventual consistency has a lag. Reconciliation jobs run nightly to fix data drift between services.
330
+
331
+ **Why it happens:** The monolith used ACID transactions to maintain consistency between what are now separate services. After decomposition, there is no distributed transaction coordinator. Sagas and compensating transactions are harder to implement correctly than most teams anticipate.
332
+
333
+ **Prevention:** Accept eventual consistency as a fundamental constraint. Design compensating transactions for every distributed operation. Implement idempotency on all service endpoints. Use the Saga pattern with explicit compensation steps. Monitor consistency metrics and alert on drift.
334
+
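Idempotency is what makes saga retries and compensations safe to repeat. The sketch below shows the idea with a caller-supplied idempotency key. The class name `IdempotentHandler` and the in-memory result store are assumptions for illustration; a real service would keep the keys in a durable store guarded by a unique constraint.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class IdempotentHandler:
    """Wraps a side-effecting operation so retries with the same key run it once."""

    operation: Callable[[dict], dict]
    _results: Dict[str, dict] = field(default_factory=dict)

    def handle(self, idempotency_key: str, payload: dict) -> dict:
        # A repeated key means a retry: return the stored result, do not re-run.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        result = self.operation(payload)
        self._results[idempotency_key] = result
        return result
```

With this wrapper, a payment activity retried after a timeout returns the original charge result instead of charging the customer twice.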
335
+ ### Failure Mode 3: Incomplete migration — worst of both worlds
336
+
337
+ **What it looks like:** After 18 months, 40% of the monolith has been extracted to services. The remaining 60% is still a monolith. The monolith now depends on 8 external services, adding network calls and failure modes to its critical paths. The services depend on the monolith for shared data. Neither the monolith nor the services can be understood in isolation. The system is harder to operate than either a pure monolith or pure microservices.
338
+
339
+ **Why it happens:** Migration fatigue. The first 2-3 extractions are exciting. Extractions 4-8 are progressively harder because they hit the tightly coupled core of the monolith. Leadership loses patience with the multi-year timeline. Engineers who championed the migration move to other companies. The migration stalls.
340
+
341
+ **Prevention:** Before starting, commit to a realistic multi-year timeline. Plan extractions so that each one delivers measurable value independently — the migration can stop at any point and the system is in a better state than before. Define explicit "good enough" stopping points. Not every module needs to be extracted.
342
+
343
+ ### Failure Mode 4: Team burnout from multi-year migration
344
+
345
+ **What it looks like:** The migration was estimated at 12 months. After 24 months, it is 60% complete. The team has been doing migration work instead of product work for two years. Morale is low. Senior engineers leave for companies where they can build new features instead of migrating old ones. New hires inherit a partially migrated system they do not understand.
346
+
347
+ **Why it happens:** Migration timelines are systematically underestimated. Database decomposition alone often takes longer than the entire original estimate. Each extraction uncovers undocumented dependencies, implicit assumptions, and edge cases.
348
+
349
+ **Prevention:** Budget migration work at 20-30% of engineering capacity and keep the remaining 70-80% on product work. This extends the calendar time but preserves team morale and product velocity. No team should spend more than one quarter doing 100% migration work.
350
+
351
+ ### Failure Mode 5: Over-decomposition — nano-services
352
+
353
+ **What it looks like:** The team creates 200 services for a system that could be served by 15. Each service has 200-500 lines of code. Services exist for individual CRUD entities (UserService, AddressService, PreferenceService). The call graph for a single user request traverses 12 services.
354
+
355
+ **Why it happens:** Misapplication of the Single Responsibility Principle at the service level. "A service should do one thing" is interpreted as "a service should manage one database table." This confuses granularity of deployment with granularity of domain responsibility.
356
+
357
+ **Prevention:** A microservice should align with a bounded context, not an entity. The bounded context for "User Management" includes users, addresses, preferences, and authentication; these form a single service, not four. If a typical HTTP request traverses more than 3 services, the decomposition is too fine-grained.
358
+
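The "more than 3 services per request" heuristic can be checked against exported trace data. The following is a rough sketch, assuming spans have already been reduced to `(trace_id, service)` pairs. The `Span` type and function names are hypothetical and not part of any tracing library.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Span(NamedTuple):
    trace_id: str
    service: str


def services_per_trace(spans: Iterable[Span]) -> dict[str, int]:
    """Count the distinct services that each trace (one request) passed through."""
    seen: dict[str, set[str]] = defaultdict(set)
    for span in spans:
        seen[span.trace_id].add(span.service)
    return {trace_id: len(services) for trace_id, services in seen.items()}


def over_decomposed_traces(spans: Iterable[Span], max_services: int = 3) -> list[str]:
    """Return the trace IDs whose request touched more than max_services services."""
    counts = services_per_trace(spans)
    return sorted(t for t, n in counts.items() if n > max_services)
```

Tracking the share of traces over the threshold over time gives an early signal that service boundaries are drifting toward entity-level granularity.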
359
+ ### Failure Mode 6: Migrating without observability
360
+
361
+ **What it looks like:** Services are deployed but there is no distributed tracing. When a request fails, engineers cannot determine which of the 15 services in the call chain caused the failure. Debugging requires reading logs from multiple services manually. Mean time to resolution (MTTR) increases from minutes (monolith: read the stack trace) to hours (microservices: correlate logs across services).
362
+
363
+ **Prevention:** Distributed tracing (OpenTelemetry, Jaeger, Zipkin) is a prerequisite for microservices, not a follow-up project. Deploy tracing infrastructure before extracting the first service.
364
+
365
+ ---
366
+
367
+ ## Technology Landscape
368
+
369
+ ### Strangler Fig implementation tools
370
+
371
+ **API Gateways (routing layer):**
372
+ - **Kong:** Open-source API gateway. Route by path, header, or custom plugin logic. Supports canary deployments and traffic splitting.
373
+ - **AWS API Gateway:** Managed gateway with Lambda integration. Route by path to different backend services.
374
+ - **Envoy Proxy:** High-performance proxy used as the data plane in several service meshes (for example Istio and AWS App Mesh; Linkerd ships its own Rust-based proxy instead). Supports traffic splitting, canary routing, and circuit breaking.
375
+ - **Nginx:** Reverse proxy with upstream routing. Simple configuration for path-based routing between monolith and new services.
376
+ - **Traefik:** Cloud-native reverse proxy with automatic service discovery. Integrates with Kubernetes, Docker, and Consul.
377
+
378
+ **Service mesh for traffic management:**
379
+ - **Istio:** Full-featured service mesh. Traffic splitting (send 5% to new service, 95% to monolith), circuit breaking, mTLS, observability. Heavy operational overhead.
380
+ - **Linkerd:** Lightweight service mesh. Simpler than Istio, focused on reliability and observability. Good choice for teams that need traffic splitting without Istio's complexity.
381
+ - **AWS App Mesh:** Managed service mesh. Integrates with ECS, EKS, and EC2. Virtual routers enable traffic splitting between monolith and new services.
382
+
383
+ ### Database migration and data synchronization
384
+
385
+ **Change Data Capture (CDC):**
386
+ - **Debezium:** Open-source CDC platform. Captures row-level changes from PostgreSQL, MySQL, MongoDB, SQL Server, Oracle. Publishes to Kafka. The standard tool for database decomposition during migration.
387
+ - **AWS Database Migration Service (DMS):** Managed CDC service. Supports continuous replication from source database to target.
388
+ - **Maxwell:** Lightweight CDC for MySQL. Reads binlog and publishes to Kafka, Kinesis, or other targets.
389
+
390
+ **Dual-write coordination:**
391
+ - **Outbox pattern:** Write domain events to an "outbox" table in the same transaction as the business data. A separate process (Debezium, polling publisher) reads the outbox and publishes to the message broker. Guarantees at-least-once delivery without dual-write inconsistency.
392
+ - **Transactional outbox with Debezium:** Combines the outbox pattern with CDC. Debezium reads the outbox table and publishes events. The application only writes to one database — no dual writes.
393
+
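The outbox pattern described above can be sketched with the standard-library `sqlite3` module. This illustrates the polling-publisher variant only. The table layout, the function names, and the `publish` callback standing in for a message broker client are all assumptions made for the example.

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, event_type TEXT NOT NULL, "
        "payload TEXT NOT NULL, published INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn: sqlite3.Connection, order_id: int, total: float) -> None:
    """Write the business row and its event in one local transaction."""
    with conn:  # commits on success, rolls back if either insert raises
        conn.execute("INSERT INTO orders (id, total) VALUES (?, ?)", (order_id, total))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id, "total": total})),
        )


def publish_pending(conn: sqlite3.Connection, publish) -> int:
    """Polling publisher: deliver unpublished events, then mark them published."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, event_type, payload in rows:
        # If publish raises, the row stays unpublished and is retried on the next poll.
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
    return len(rows)
```

Because an event is marked published only after `publish` returns, a crash between the two steps causes a redelivery. That is the at-least-once guarantee, and it is why consumers need to be idempotent.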
394
+ ### Saga orchestration
395
+
396
+ - **Temporal:** Open-source workflow engine. Model sagas as workflows with explicit compensation steps. Handles retries, timeouts, and failure recovery. Growing adoption for microservices orchestration.
397
+ - **AWS Step Functions:** Managed workflow orchestration. Model sagas as state machines. Integrates with Lambda, ECS, and other AWS services.
398
+ - **Camunda:** Open-source workflow engine with BPMN modeling. Supports saga orchestration with visual workflow definition.
399
+ - **Axon Framework (Java):** Event-driven microservices framework with built-in saga support. Integrates with Axon Server for event routing.
400
+
401
+ ### Observability (prerequisite, not optional)
402
+
403
+ - **OpenTelemetry:** Vendor-neutral standard for distributed tracing, metrics, and logs. Instrument services once, export to any backend.
404
+ - **Jaeger / Zipkin:** Open-source distributed tracing backends. Visualize request flows across services.
405
+ - **Grafana + Prometheus + Loki:** Metrics (Prometheus), dashboards (Grafana), and logs (Loki). The open-source observability stack.
406
+ - **Datadog / New Relic / Honeycomb:** Commercial observability platforms with distributed tracing, APM, and log aggregation.
407
+
408
+ ### Contract testing
409
+
410
+ - **Pact:** Consumer-driven contract testing. The consumer defines expected API behavior; the provider verifies it. Prevents breaking changes across service boundaries.
411
+ - **Spring Cloud Contract:** Contract testing for Spring Boot services. Defines contracts as Groovy or YAML DSL; generates test stubs automatically.
412
+ - **Spectral:** OpenAPI linting. Enforces API design standards across services.
413
+
414
+ ---
415
+
416
+ ## Decision Tree
417
+
418
+ ```
419
+ Is there measurable pain caused by the monolith architecture?
420
+
421
+ ├── No measurable pain
422
+ │ └── STOP. Do not migrate. "It could be better" is not a reason.
423
+ │ A working monolith is a valuable asset, not a liability.
424
+
425
+ └── Yes, there is measurable pain
426
+
427
+ ├── Is the pain organizational (teams blocking each other)?
428
+ │ ├── Have you tried a modular monolith?
429
+ │ │ ├── No → Try modular monolith first. Enforce boundaries with
430
+ │ │ │ ArchUnit/Packwerk/Spring Modulith. Revisit in 6 months.
431
+ │ │ └── Yes, and teams are still blocked
432
+ │ │ └── How many developers share the codebase?
433
+ │ │ ├── < 20 → Modular monolith should suffice. Revisit
434
+ │ │ │ boundary definitions; the boundaries may be wrong.
435
+ │ │ ├── 20-50 → Selectively extract the 1-3 highest-friction
436
+ │ │ │ modules. Keep the rest as modular monolith.
437
+ │ │ └── > 50 → Incremental extraction via Strangler Fig.
438
+ │ │ Start with highest-value, lowest-difficulty module.
439
+ │ │
440
+ │ └── Is the pain technical (scaling, performance)?
441
+ │ ├── Does one component need dramatically more resources?
442
+ │ │ ├── Yes → Extract ONLY that component. Keep the rest monolithic.
443
+ │ │ └── No → Investigate scaling the monolith (horizontal scaling,
444
+ │ │ caching, read replicas) before decomposing.
445
+ │ │
446
+ │ └── Is the pain deployment-related?
447
+ │ ├── Deploys are slow because of test suite → Fix the test suite,
448
+ │ │ not the architecture. Parallel testing, test selection,
449
+ │ │ contract tests.
450
+ │ ├── Deploys are slow because of build time → Modular build
451
+ │ │ (Nx, Bazel, Gradle module cache). Not a reason for services.
452
+ │ └── Deploys are risky because of coupling → Modular monolith
453
+ │ with enforced boundaries. Extract only if coupling persists
454
+ │ after modularization.
455
+
456
+ └── Is the pain regulatory/compliance?
457
+ └── Extract ONLY the regulated component to an isolated service.
458
+ Keep everything else monolithic.
459
+
460
+ DEFAULT POSITION: Do not migrate. The burden of proof is on migration,
461
+ not on staying monolithic.
462
+ ```
463
+
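For teams that want the tree above in a reviewable, testable form, its top-level branches can be encoded as a small function. This is a simplified sketch: the `Pain` enum, the parameter names, and the returned strings are assumptions, and the three deployment sub-branches are collapsed into a single recommendation.

```python
from enum import Enum


class Pain(Enum):
    NONE = "none"
    ORGANIZATIONAL = "organizational"
    SCALING = "scaling"
    DEPLOYMENT = "deployment"
    COMPLIANCE = "compliance"


def recommend(
    pain: Pain,
    *,
    tried_modular_monolith: bool = False,
    developers: int = 0,
    hotspot_component: bool = False,
) -> str:
    """Map measured pain to the recommendation given by the decision tree."""
    if pain is Pain.NONE:
        return "Do not migrate"
    if pain is Pain.COMPLIANCE:
        return "Extract only the regulated component"
    if pain is Pain.SCALING:
        if hotspot_component:
            return "Extract only the hot component"
        return "Scale the monolith first"
    if pain is Pain.DEPLOYMENT:
        return "Fix tests and builds, then modularize"
    # Organizational pain: teams blocking each other.
    if not tried_modular_monolith:
        return "Try a modular monolith first"
    if developers < 20:
        return "Revisit module boundaries"
    if developers <= 50:
        return "Extract the 1-3 highest-friction modules"
    return "Incremental extraction via Strangler Fig"
```

The default arguments mirror the tree's default position: with no measured pain supplied, the answer is to stay on the monolith.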
464
+ ---
465
+
466
+ ## Implementation Sketch
467
+
468
+ ### Strangler Fig with API Gateway (Kong/Nginx)
469
+
470
+ ```nginx
471
+ # nginx.conf — Strangler Fig routing
472
+ # Phase 1: Route notifications to new service, everything else to monolith
473
+
474
+ upstream monolith {
475
+ server monolith.internal:8080;
476
+ }
477
+
478
+ upstream notifications_service {
479
+ server notifications.internal:8081;
480
+ }
481
+
482
+ server {
483
+ listen 80;
484
+
485
+ # Extracted capability: notifications
486
+ location /api/notifications/ {
487
+ proxy_pass http://notifications_service;
488
+ proxy_set_header X-Request-ID $request_id;
489
+ proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
490
+ }
491
+
492
+ # Everything else: still the monolith
493
+ location / {
494
+ proxy_pass http://monolith;
495
+ proxy_set_header X-Request-ID $request_id;
496
+ proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
497
+ }
498
+ }
499
+ ```
500
+
501
+ ### Anti-corruption layer in the monolith
502
+
503
+ ```java
504
+ // Inside the monolith: ACL that delegates to the new service
505
+ // Replaces the old in-process NotificationService
506
+
507
+ @Service
508
+ public class NotificationServiceProxy implements NotificationService {
509
+
510
+ private final WebClient webClient;
511
+ private final CircuitBreaker circuitBreaker;
512
+
513
+ @Override
514
+ public void sendNotification(NotificationRequest request) {
515
+ // Transform monolith's domain model to new service's API contract
516
+ NewServiceNotificationDTO dto = NewServiceNotificationDTO.builder()
517
+ .recipientId(request.getUserId().toString())
518
+ .channel(mapChannel(request.getType()))
519
+ .templateId(mapTemplate(request.getTemplateName()))
520
+ .variables(request.getTemplateVariables())
521
+ .build();
522
+
523
+ circuitBreaker.run(() ->
524
+ webClient.post()
525
+ .uri("/api/notifications/send")
526
+ .bodyValue(dto)
527
+ .retrieve()
528
+ .toBodilessEntity()
529
+ .block(Duration.ofSeconds(5))
530
+ , throwable -> {
531
+ // Fallback: log and optionally fall back to monolith's implementation
532
+ log.warn("Notification service unavailable, falling back", throwable);
533
+ fallbackNotificationService.sendNotification(request);
534
+ return null;
535
+ });
536
+ }
537
+ }
538
+ ```
539
+
540
+ ### Database decomposition with Debezium CDC
541
+
542
+ ```yaml
543
+ # docker-compose.yml — Debezium CDC pipeline
544
+ # Captures changes from monolith's notifications tables
545
+ # Publishes to Kafka for the new notifications service to consume
546
+
547
+ services:
548
+ zookeeper:
549
+ image: confluentinc/cp-zookeeper:7.5.0
550
+ environment:
551
+ ZOOKEEPER_CLIENT_PORT: 2181
552
+
553
+ kafka:
554
+ image: confluentinc/cp-kafka:7.5.0
555
+ depends_on: [zookeeper]
556
+ environment:
557
+ KAFKA_BROKER_ID: 1
558
+ KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
559
+ KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
560
+
561
+ debezium:
562
+ image: debezium/connect:2.4
563
+ depends_on: [kafka]
564
+ environment:
565
+ BOOTSTRAP_SERVERS: kafka:9092
566
+ GROUP_ID: monolith-cdc
567
+ CONFIG_STORAGE_TOPIC: cdc-configs
568
+ OFFSET_STORAGE_TOPIC: cdc-offsets
569
+ STATUS_STORAGE_TOPIC: cdc-status
570
+
571
+ # Register the connector after Debezium starts:
572
+ # curl -X POST http://debezium:8083/connectors -H "Content-Type: application/json" -d @connector.json
573
+ ```
574
+
575
+ ```json
576
+ // connector.json — Debezium connector for monolith's notifications tables
577
+ {
578
+ "name": "monolith-notifications-connector",
579
+ "config": {
580
+ "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
581
+ "database.hostname": "monolith-db",
582
+ "database.port": "5432",
583
+ "database.user": "debezium",
584
+ "database.password": "${CDC_DB_PASSWORD}",
585
+ "database.dbname": "monolith",
586
+ "schema.include.list": "notifications",
587
+ "table.include.list": "notifications.templates,notifications.delivery_log",
588
+ "topic.prefix": "monolith",
589
+ "plugin.name": "pgoutput",
590
+ "slot.name": "debezium_notifications",
591
+ "transforms": "route",
592
+ "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
593
+ "transforms.route.regex": "monolith.notifications.(.*)",
594
+ "transforms.route.replacement": "notifications.$1"
595
+ }
596
+ }
597
+ ```
598
+
599
+ ### Saga pattern for cross-service operations
600
+
601
+ ```java
602
+ // Order creation saga — orchestrated approach using Temporal
603
+ // Replaces what was a single ACID transaction in the monolith
604
+
605
+ @WorkflowInterface
606
+ public interface CreateOrderSaga {
607
+ @WorkflowMethod
608
+ OrderResult createOrder(CreateOrderCommand command);
609
+ }
610
+
611
+ @WorkflowImpl(taskQueues = "order-saga")
612
+ public class CreateOrderSagaImpl implements CreateOrderSaga {
613
+
614
+ private final PaymentActivities payment = Workflow.newActivityStub(
615
+ PaymentActivities.class,
616
+ ActivityOptions.newBuilder()
617
+ .setStartToCloseTimeout(Duration.ofSeconds(30))
618
+ .setRetryOptions(RetryOptions.newBuilder()
619
+ .setMaximumAttempts(3)
620
+ .build())
621
+ .build()
622
+ );
623
+
624
+ private final InventoryActivities inventory = Workflow.newActivityStub(
625
+ InventoryActivities.class,
626
+ ActivityOptions.newBuilder()
627
+ .setStartToCloseTimeout(Duration.ofSeconds(30))
628
+ .build()
629
+ );
630
+
631
+ @Override
632
+ public OrderResult createOrder(CreateOrderCommand command) {
633
+ // Step 1: Reserve inventory
634
+ ReservationResult reservation;
635
+ try {
636
+ reservation = inventory.reserveStock(command.getItems());
637
+ } catch (Exception e) {
638
+ return OrderResult.failed("Inventory reservation failed: " + e.getMessage());
639
+ }
640
+
641
+ // Step 2: Charge payment
642
+ PaymentResult paymentResult;
643
+ try {
644
+ paymentResult = payment.chargePayment(command.getPaymentDetails());
645
+ } catch (Exception e) {
646
+ // COMPENSATE: Release inventory reservation
647
+ inventory.releaseReservation(reservation.getReservationId());
648
+ return OrderResult.failed("Payment failed: " + e.getMessage());
649
+ }
650
+
651
+ // Step 3: Confirm order
652
+ try {
653
+ return OrderResult.success(command.getOrderId(), reservation, paymentResult);
654
+ } catch (Exception e) {
655
+ // COMPENSATE: Refund payment and release inventory
656
+ payment.refundPayment(paymentResult.getTransactionId());
657
+ inventory.releaseReservation(reservation.getReservationId());
658
+ return OrderResult.failed("Order confirmation failed: " + e.getMessage());
659
+ }
660
+ }
661
+ }
662
+ ```
663
+
664
+ ### Canary deployment with Istio traffic splitting
665
+
666
+ ```yaml
667
+ # istio-virtual-service.yaml
668
+ # Gradually shift traffic from monolith to new notifications service
669
+
670
+ apiVersion: networking.istio.io/v1beta1
671
+ kind: VirtualService
672
+ metadata:
673
+ name: notifications-routing
674
+ spec:
675
+ hosts:
676
+ - notifications.internal
677
+ http:
678
+ - match:
679
+ - uri:
680
+ prefix: /api/notifications
681
+ route:
682
+ # Phase 1: 95% monolith, 5% new service
683
+ - destination:
684
+ host: monolith
685
+ port:
686
+ number: 8080
687
+ weight: 95
688
+ - destination:
689
+ host: notifications-service
690
+ port:
691
+ number: 8081
692
+ weight: 5
693
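+ # Retries only mask transient failures; they do not roll traffic back. Automate rollback (e.g. error rate > 1%) with a controller such as Flagger or Argo Rollouts, or by re-applying weights 100/0.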
694
+ retries:
695
+ attempts: 2
696
+ perTryTimeout: 3s
697
+ ---
698
+ # After validating Phase 1 metrics, update weights:
699
+ # Phase 2: weight 75/25
700
+ # Phase 3: weight 50/50
701
+ # Phase 4: weight 0/100 (full migration)
702
+ ```
703
+
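The decision between promoting to the next phase and rolling back can be made explicit in a small gate function. The sketch below is one possible shape, assuming error rate and p99 latency are already collected for both backends. The `Metrics` type, the thresholds, and the function name are illustrative assumptions, not Istio features.

```python
from dataclasses import dataclass

# Percent of traffic sent to the new service in each phase (5, 25, 50, 100).
PHASES = [5, 25, 50, 100]


@dataclass(frozen=True)
class Metrics:
    error_rate: float       # fraction of failed requests, e.g. 0.004 for 0.4%
    p99_latency_ms: float


def next_weight(
    current: int,
    baseline: Metrics,
    canary: Metrics,
    max_error_delta: float = 0.01,
    max_latency_ratio: float = 1.2,
) -> int:
    """Return the next traffic weight for the new service, or 0 to roll back."""
    degraded = (
        canary.error_rate - baseline.error_rate > max_error_delta
        or canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio
    )
    if degraded:
        return 0  # route everything back to the monolith
    later = [w for w in PHASES if w > current]
    return later[0] if later else current
```

The returned weight would then be written into the VirtualService, by hand or by automation, with the monolith receiving the remainder.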
704
+ ### Feature parity validation
705
+
706
+ ```python
707
+ # parallel_run_validator.py
708
+ # Run both monolith and new service, compare outputs
709
+ # Used during migration to validate feature parity before shifting traffic
710
+
711
+ import asyncio
712
+ import hashlib
713
+ import json
714
+ from dataclasses import dataclass
715
+ from datetime import datetime
716
+
717
+ @dataclass
718
+ class ComparisonResult:
719
+ request_id: str
720
+ timestamp: datetime
721
+ monolith_status: int
722
+ service_status: int
723
+ bodies_match: bool
724
+ monolith_latency_ms: float
725
+ service_latency_ms: float
726
+ divergence_details: str | None = None
727
+
728
+ async def parallel_run(request, monolith_client, service_client) -> ComparisonResult:
729
+ """Send same request to both monolith and new service, compare results."""
730
+
731
+ monolith_task = asyncio.create_task(monolith_client.send(request))
732
+ service_task = asyncio.create_task(service_client.send(request))
733
+
734
+ monolith_resp, service_resp = await asyncio.gather(
735
+ monolith_task, service_task  # no return_exceptions: the code below needs real responses, so let failures propagate
736
+ )
737
+
738
+ # Always return monolith response to the caller (source of truth)
739
+ # Log comparison for analysis
740
+
741
+ monolith_body = normalize_json(monolith_resp.body)
742
+ service_body = normalize_json(service_resp.body)
743
+
744
+ bodies_match = monolith_body == service_body
745
+
746
+ result = ComparisonResult(
747
+ request_id=request.id,
748
+ timestamp=datetime.utcnow(),
749
+ monolith_status=monolith_resp.status,
750
+ service_status=service_resp.status,
751
+ bodies_match=bodies_match,
752
+ monolith_latency_ms=monolith_resp.latency_ms,
753
+ service_latency_ms=service_resp.latency_ms,
754
+ divergence_details=diff(monolith_body, service_body) if not bodies_match else None,
755
+ )
756
+
757
+ # Emit metrics for monitoring dashboard
758
+ metrics.increment("parallel_run.total")
759
+ if not bodies_match:
760
+ metrics.increment("parallel_run.divergence")
761
+ logger.warning(f"Response divergence: {result.divergence_details}")
762
+
763
+ return result
764
+
765
+
766
+ def normalize_json(body: bytes) -> str:
767
+ """Normalize JSON for comparison (sort keys, remove volatile fields)."""
768
+ data = json.loads(body)
769
+ # Remove fields that are expected to differ (timestamps, request IDs)
770
+ for field in ["timestamp", "request_id", "trace_id"]:
771
+ data.pop(field, None)
772
+ return json.dumps(data, sort_keys=True)
773
+ ```
774
+
775
+ ---
776
+
777
+ ## Cross-References
778
+
779
+ - **monolith:** The starting architecture. Most monoliths should be improved, not replaced. See `architecture/patterns/monolith.md`.
780
+ - **modular-monolith:** The recommended intermediate step. Try this before microservices. See `architecture/patterns/modular-monolith.md`.
781
+ - **microservices:** The target architecture — but only for the components that genuinely need it. See `architecture/patterns/microservices.md`.
782
+ - **domain-driven-design:** Bounded contexts define extraction candidates. Event Storming discovers domain boundaries. See `architecture/foundations/domain-driven-design.md`.
783
+ - **saga-pattern:** Replaces ACID transactions that spanned what are now separate services. Essential for maintaining data consistency after decomposition.
784
+ - **event-driven:** Asynchronous communication between services reduces coupling and enables eventual consistency. See `architecture/patterns/event-driven.md`.
785
+ - **coupling-and-cohesion:** The theoretical foundation for why decomposition boundaries matter. High cohesion within services, low coupling between them. See `architecture/foundations/coupling-and-cohesion.md`.
786
+ - **cqrs-event-sourcing:** Useful for services that need to build read models from events published by other services. See `architecture/patterns/cqrs-event-sourcing.md`.
787
+
788
+ ---
789
+
790
+ *Researched: 2026-03-08 | Sources: [Sam Newman — Monolith to Microservices (O'Reilly)](https://samnewman.io/books/monolith-to-microservices/) | [Sam Newman — Monolith Decomposition Patterns](https://samnewman.io/talks/monolith-decomposition-patterns/) | [Sam Newman — Decomposing a Monolith Does Not Require Microservices (InfoQ)](https://www.infoq.com/news/2020/05/monolith-decomposition-newman/) | [Segment — Goodbye Microservices: From 100s of Problem Children to 1 Superstar](https://segment.com/blog/goodbye-microservices/) | [InfoQ — Why Segment Went Back to a Monolith](https://www.infoq.com/news/2020/04/microservices-back-again/) | [Amazon Prime Video Microservices to Monolith Case Study](https://newsletter.systemdesign.one/p/prime-video-microservices) | [Adrian Cockcroft — What is there to learn from the Prime Video story](https://adrianco.medium.com/so-many-bad-takes-what-is-there-to-learn-from-the-prime-video-microservices-to-monolith-story-4bd0970423d4) | [AWS — Strangler Fig Pattern](https://docs.aws.amazon.com/prescriptive-guidance/latest/cloud-design-patterns/strangler-fig.html) | [Microsoft Azure — Strangler Fig Pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/strangler-fig) | [Chris Richardson — Strangler Fig Application Pattern](https://microservices.io/patterns/refactoring/strangler-application.html) | [Confluent — Strangler Fig Pattern](https://developer.confluent.io/patterns/compositional-patterns/strangler-fig/) | [Agoda Engineering — Breaking Down Monolith](https://medium.com/agoda-engineering/leading-with-clients-our-journey-to-microservices-from-a-graphql-monolith-252b8baa69af) | [DEV Community — From Monolith to Modular Monolith to Microservices](https://dev.to/sepehr/from-monolith-to-modular-monolith-to-microservices-realistic-migration-patterns-36f2) | [Java Code Geeks — Microservices vs. Modular Monoliths in 2025](https://www.javacodegeeks.com/2025/12/microservices-vs-modular-monoliths-in-2025-when-each-approach-wins.html) | [TechTarget — 4 Microservices Antipatterns That Ruin Migration](https://www.techtarget.com/searchapparchitecture/tip/4-deadly-microservices-antipatterns-that-ruin-migration) | [Gremlin — Is Your Microservice a Distributed Monolith?](https://www.gremlin.com/blog/is-your-microservice-a-distributed-monolith) | [CircleCI — Monolith to Microservices Migration Strategies](https://circleci.com/blog/monolith-to-microservices-migration-strategies/) | [Google Cloud — Refactor a Monolith into Microservices](https://cloud.google.com/architecture/microservices-architecture-refactoring-monoliths) | [AWS — Decomposing Monoliths into Microservices](https://docs.aws.amazon.com/prescriptive-guidance/latest/modernization-decomposing-monoliths/welcome.html)*