@umacloud/knowledge 1.0.1

Files changed (418)
  1. package/00-governance/governance-capabilities.md +557 -0
  2. package/00-governance/knowledge-map.md +39 -0
  3. package/00-governance/maintenance-policy.md +76 -0
  4. package/00-governance/review-checklist.md +81 -0
  5. package/README.md +13 -0
  6. package/ai/01-standards/agent-development-complete.md +691 -0
  7. package/ai/01-standards/llm-application-complete.md +488 -0
  8. package/ai/01-standards/mlops-complete.md +798 -0
  9. package/ai/01-standards/prompt-engineering-complete.md +646 -0
  10. package/ai/01-standards/rag-architecture-complete.md +649 -0
  11. package/ai/02-playbooks/llm-evaluation-playbook.md +847 -0
  12. package/ai/03-checklists/ai-project-checklist.md +215 -0
  13. package/ai/04-antipatterns/ai-antipatterns.md +661 -0
  14. package/ai/05-cases/case-rag-production.md +147 -0
  15. package/ai/06-glossary/ai-glossary.md +162 -0
  16. package/ai/agent-evaluation-benchmark.md +53 -0
  17. package/ai/ai-agent-memory-context-management.md +41 -0
  18. package/ai/ai-cost-capacity-optimization-playbook.md +42 -0
  19. package/ai/ai-data-security-and-compliance-playbook.md +37 -0
  20. package/ai/ai-domain-index-and-checklist.md +40 -0
  21. package/ai/ai-governance-maturity-model.md +50 -0
  22. package/ai/ai-model-selection-and-routing-strategy.md +47 -0
  23. package/ai/ai-observability-and-oncall-runbook.md +52 -0
  24. package/ai/ai-rag-engineering-playbook.md +42 -0
  25. package/ai/ai-red-team-and-safety-evaluation.md +42 -0
  26. package/ai/ai-release-readiness-and-rollback-gate.md +42 -0
  27. package/ai/llm-agent-engineering-deep-dive.md +57 -0
  28. package/ai/prompt-and-tool-guardrails.md +52 -0
  29. package/api/01-standards/enterprise-api-standards.md +198 -0
  30. package/api/01-standards/rest-api-design-guide.md +63 -0
  31. package/api/02-playbooks/api-pagination-playbook.md +93 -0
  32. package/api/02-playbooks/graphql-production-playbook.md +176 -0
  33. package/api/03-checklists/api-review-checklist.md +55 -0
  34. package/api/04-antipatterns/api-antipatterns.md +112 -0
  35. package/architecture/01-standards/api-gateway-patterns.md +496 -0
  36. package/architecture/01-standards/cloud-native-patterns.md +644 -0
  37. package/architecture/01-standards/distributed-systems-patterns.md +591 -0
  38. package/architecture/01-standards/event-driven-architecture.md +595 -0
  39. package/architecture/01-standards/microservices-patterns-complete.md +968 -0
  40. package/architecture/01-standards/microservices-patterns.md +495 -0
  41. package/architecture/01-standards/system-design-interview.md +664 -0
  42. package/architecture/02-playbooks/microservices-patterns-playbook.md +137 -0
  43. package/architecture/02-playbooks/migration-playbook.md +780 -0
  44. package/architecture/02-playbooks/system-design-playbook.md +779 -0
  45. package/architecture/03-checklists/architecture-decision-checklist.md +297 -0
  46. package/architecture/04-antipatterns/architecture-antipatterns.md +417 -0
  47. package/architecture/05-cases/case-netflix-microservices.md +413 -0
  48. package/architecture/06-glossary/architecture-glossary.md +164 -0
  49. package/architecture/adr-template-and-examples.md +38 -0
  50. package/architecture/api-gateway-deep-dive.md +1291 -0
  51. package/architecture/configuration-management.md +1162 -0
  52. package/architecture/distributed-transactions.md +1220 -0
  53. package/architecture/microservices-complete.md +735 -0
  54. package/architecture/resilience-and-disaster-patterns.md +37 -0
  55. package/architecture/service-governance.md +1198 -0
  56. package/architecture/system-architecture-deep-dive.md +37 -0
  57. package/backend/01-standards/analytics-and-growth.md +65 -0
  58. package/backend/01-standards/api-and-error-conventions.md +120 -0
  59. package/backend/01-standards/application-layering-and-packaging.md +160 -0
  60. package/backend/01-standards/auth-implementation.md +104 -0
  61. package/backend/01-standards/backend-framework-idioms.md +74 -0
  62. package/backend/01-standards/background-jobs-and-async.md +66 -0
  63. package/backend/01-standards/caching-strategies-complete.md +390 -0
  64. package/backend/01-standards/config-and-observability.md +77 -0
  65. package/backend/01-standards/data-modeling-and-persistence.md +94 -0
  66. package/backend/01-standards/django-complete.md +1765 -0
  67. package/backend/01-standards/email-and-notifications.md +64 -0
  68. package/backend/01-standards/fastapi-complete.md +925 -0
  69. package/backend/01-standards/file-upload-and-storage.md +66 -0
  70. package/backend/01-standards/graphql-api-complete.md +416 -0
  71. package/backend/01-standards/llm-application-standard.md +78 -0
  72. package/backend/01-standards/message-queue-patterns.md +379 -0
  73. package/backend/01-standards/microservices-and-distributed.md +78 -0
  74. package/backend/01-standards/nestjs-complete.md +2167 -0
  75. package/backend/01-standards/payment-integration.md +80 -0
  76. package/backend/01-standards/rate-limiting-complete.md +451 -0
  77. package/backend/01-standards/realtime-and-websocket.md +65 -0
  78. package/backend/01-standards/search-and-filtering.md +64 -0
  79. package/backend/01-standards/spring-boot-complete.md +445 -0
  80. package/backend/02-playbooks/api-design-playbook.md +718 -0
  81. package/backend/02-playbooks/email-send-playbook.md +130 -0
  82. package/backend/02-playbooks/file-upload-s3-playbook.md +153 -0
  83. package/backend/02-playbooks/typescript-enterprise-playbook.md +133 -0
  84. package/backend/02-playbooks/websocket-realtime-playbook.md +154 -0
  85. package/backend/03-checklists/api-launch-checklist.md +189 -0
  86. package/backend/04-antipatterns/backend-antipatterns.md +1051 -0
  87. package/blockchain/01-standards/blockchain-basics.md +557 -0
  88. package/blockchain/01-standards/smart-contract-development.md +1315 -0
  89. package/cicd/01-standards/deployment-and-delivery-standard.md +96 -0
  90. package/cicd/01-standards/github-actions-complete.md +473 -0
  91. package/cicd/01-standards/release-and-store-submission.md +75 -0
  92. package/cicd/02-playbooks/cicd-pipeline-playbook.md +144 -0
  93. package/cicd/02-playbooks/release-management-playbook.md +605 -0
  94. package/cicd/03-checklists/pipeline-security-checklist.md +168 -0
  95. package/cicd/04-antipatterns/cicd-antipatterns.md +589 -0
  96. package/cicd/05-cases/case-deployment-automation.md +221 -0
  97. package/cicd/05-cases/case-gitops-transformation.md +212 -0
  98. package/cicd/06-glossary/cicd-glossary.md +114 -0
  99. package/cicd/cicd-blueprint-deep-dive.md +38 -0
  100. package/cicd/release-readiness-gate.md +37 -0
  101. package/cloud-native/01-standards/container-security.md +741 -0
  102. package/cloud-native/01-standards/kubernetes-complete.md +812 -0
  103. package/cloud-native/02-playbooks/api-gateway-playbook.md +155 -0
  104. package/cloud-native/02-playbooks/gitops-with-argocd.md +760 -0
  105. package/cloud-native/02-playbooks/k8s-troubleshooting-playbook.md +1942 -0
  106. package/cloud-native/02-playbooks/message-queue-playbook.md +129 -0
  107. package/cloud-native/02-playbooks/multicloud-governance.md +726 -0
  108. package/cloud-native/02-playbooks/serverless-patterns.md +788 -0
  109. package/cloud-native/02-playbooks/service-mesh-playbook.md +612 -0
  110. package/cloud-native/02-playbooks/terraform-iac-playbook.md +143 -0
  111. package/cloud-native/03-checklists/container-security-checklist.md +431 -0
  112. package/cloud-native/03-checklists/k8s-production-readiness-checklist.md +460 -0
  113. package/cloud-native/04-antipatterns/container-antipatterns.md +660 -0
  114. package/cloud-native/04-antipatterns/k8s-antipatterns.md +743 -0
  115. package/cloud-native/05-cases/case-k8s-migration.md +478 -0
  116. package/cloud-native/05-cases/case-k8s-scaling.md +642 -0
  117. package/cloud-native/05-cases/case-k8s-security-incident.md +397 -0
  118. package/cloud-native/06-glossary/cloud-native-glossary.md +337 -0
  119. package/cross-platform/01-standards/cross-platform-frameworks.md +83 -0
  120. package/cross-platform/01-standards/platform-selection-and-architecture.md +77 -0
  121. package/data/01-standards/elasticsearch-complete.md +2098 -0
  122. package/data/01-standards/postgresql-complete.md +1613 -0
  123. package/data/01-standards/redis-complete.md +1527 -0
  124. package/data/02-playbooks/database-optimization-playbook.md +403 -0
  125. package/data/02-playbooks/elasticsearch-production-playbook.md +132 -0
  126. package/data/03-checklists/database-launch-checklist.md +187 -0
  127. package/data/04-antipatterns/database-antipatterns.md +873 -0
  128. package/data/05-cases/case-database-migration.md +310 -0
  129. package/data/06-glossary/database-glossary.md +440 -0
  130. package/data/data-governance-and-modeling-deep-dive.md +39 -0
  131. package/data-engineering/01-standards/airflow-complete.md +523 -0
  132. package/data-engineering/01-standards/kafka-complete.md +1521 -0
  133. package/data-engineering/02-playbooks/spark-etl-playbook.md +496 -0
  134. package/data-engineering/03-checklists/pipeline-launch-checklist.md +194 -0
  135. package/data-engineering/04-antipatterns/data-pipeline-antipatterns.md +684 -0
  136. package/data-engineering/05-cases/case-real-time-pipeline.md +355 -0
  137. package/data-engineering/06-glossary/data-engineering-glossary.md +429 -0
  138. package/database/01-standards/database-schema-standards.md +147 -0
  139. package/database/02-playbooks/postgresql-optimization-quick.md +52 -0
  140. package/database/02-playbooks/postgresql-performance-optimization.md +58 -0
  141. package/database/02-playbooks/postgresql-production-playbook.md +146 -0
  142. package/database/02-playbooks/redis-caching-playbook.md +117 -0
  143. package/database/03-checklists/database-review-checklist.md +50 -0
  144. package/database/04-antipatterns/database-antipatterns.md +112 -0
  145. package/design/01-standards/ui-design-system-complete.md +423 -0
  146. package/design/02-playbooks/design-handoff-playbook.md +254 -0
  147. package/design/02-playbooks/design-review-playbook.md +388 -0
  148. package/design/03-checklists/design-review-checklist.md +246 -0
  149. package/design/04-antipatterns/design-antipatterns.md +378 -0
  150. package/design/05-cases/case-design-system-adoption.md +328 -0
  151. package/design/06-glossary/design-glossary.md +329 -0
  152. package/design/ui-full-lifecycle-cross-platform-playbook.md +571 -0
  153. package/design/ux-system-deep-dive.md +38 -0
  154. package/design-systems/00-craft-rules.md +71 -0
  155. package/design-systems/aesthetic-families.md +43 -0
  156. package/design-systems/anti-ai-slop.md +162 -0
  157. package/design-systems/bold-geometric.md +120 -0
  158. package/design-systems/brutalist-bold.md +103 -0
  159. package/design-systems/editorial-clean.md +109 -0
  160. package/design-systems/glass-aurora.md +108 -0
  161. package/design-systems/modern-minimal.md +145 -0
  162. package/design-systems/premium-luxury.md +106 -0
  163. package/design-systems/product-type-design-map.md +48 -0
  164. package/design-systems/soft-warm.md +123 -0
  165. package/design-systems/tech-utility.md +113 -0
  166. package/desktop/01-standards/desktop-app-standard.md +72 -0
  167. package/desktop/01-standards/desktop-design.md +71 -0
  168. package/development/00-governance/document-template.md +41 -0
  169. package/development/01-standards/api-versioning-strategies.md +432 -0
  170. package/development/01-standards/authentication-patterns-complete.md +479 -0
  171. package/development/01-standards/css-architecture-complete.md +550 -0
  172. package/development/01-standards/database-migration-strategies.md +484 -0
  173. package/development/01-standards/elasticsearch-complete.md +347 -0
  174. package/development/01-standards/git-complete.md +371 -0
  175. package/development/01-standards/golang-complete.md +1565 -0
  176. package/development/01-standards/graphql-complete.md +298 -0
  177. package/development/01-standards/javascript-bundlers-complete.md +469 -0
  178. package/development/01-standards/javascript-typescript-complete.md +528 -0
  179. package/development/01-standards/jest-complete.md +275 -0
  180. package/development/01-standards/linux-complete.md +234 -0
  181. package/development/01-standards/logging-observability-complete.md +526 -0
  182. package/development/01-standards/microservices-communication.md +502 -0
  183. package/development/01-standards/mongodb-complete.md +406 -0
  184. package/development/01-standards/oauth2-complete.md +285 -0
  185. package/development/01-standards/performance-optimization-complete.md +289 -0
  186. package/development/01-standards/playwright-complete.md +247 -0
  187. package/development/01-standards/postgresql-complete.md +456 -0
  188. package/development/01-standards/pytest-complete.md +340 -0
  189. package/development/01-standards/python-async-programming.md +902 -0
  190. package/development/01-standards/python-complete.md +956 -0
  191. package/development/01-standards/python-decorators-complete.md +799 -0
  192. package/development/01-standards/python-design-patterns.md +2854 -0
  193. package/development/01-standards/python-packaging-distribution.md +420 -0
  194. package/development/01-standards/python-testing-strategies.md +607 -0
  195. package/development/01-standards/python-web-frameworks-comparison.md +471 -0
  196. package/development/01-standards/redis-complete.md +317 -0
  197. package/development/01-standards/rest-api-complete.md +316 -0
  198. package/development/01-standards/rust-complete.md +578 -0
  199. package/development/01-standards/typescript-advanced-types.md +1513 -0
  200. package/development/01-standards/web-security-complete.md +292 -0
  201. package/development/02-playbooks/api-design-playbook.md +810 -0
  202. package/development/02-playbooks/database-migration-playbook.md +580 -0
  203. package/development/02-playbooks/debugging-playbook.md +692 -0
  204. package/development/02-playbooks/feature-delivery-playbook.md +430 -0
  205. package/development/02-playbooks/incident-hotfix-playbook.md +387 -0
  206. package/development/02-playbooks/performance-optimization-playbook.md +531 -0
  207. package/development/02-playbooks/performance-tuning-playbook.md +652 -0
  208. package/development/02-playbooks/refactor-playbook.md +403 -0
  209. package/development/02-playbooks/release-playbook.md +469 -0
  210. package/development/03-checklists/architecture-review-checklist.md +168 -0
  211. package/development/03-checklists/data-migration-checklist.md +157 -0
  212. package/development/03-checklists/oncall-handover-checklist.md +173 -0
  213. package/development/03-checklists/pr-checklist.md +158 -0
  214. package/development/03-checklists/production-readiness-checklist.md +190 -0
  215. package/development/03-checklists/release-readiness-checklist.md +154 -0
  216. package/development/03-checklists/security-review-checklist.md +182 -0
  217. package/development/04-antipatterns/api-antipatterns.md +657 -0
  218. package/development/04-antipatterns/architecture-antipatterns.md +686 -0
  219. package/development/04-antipatterns/backend-antipatterns.md +648 -0
  220. package/development/04-antipatterns/cicd-antipatterns.md +540 -0
  221. package/development/04-antipatterns/code-smell-antipatterns.md +571 -0
  222. package/development/04-antipatterns/data-antipatterns.md +658 -0
  223. package/development/04-antipatterns/database-antipatterns.md +578 -0
  224. package/development/04-antipatterns/frontend-antipatterns.md +635 -0
  225. package/development/04-antipatterns/reliability-antipatterns.md +700 -0
  226. package/development/04-antipatterns/security-antipatterns.md +747 -0
  227. package/development/05-cases/case-api-version-migration.md +428 -0
  228. package/development/05-cases/case-authorization-hardening.md +383 -0
  229. package/development/05-cases/case-bluegreen-rollback.md +466 -0
  230. package/development/05-cases/case-cache-snowball-protection.md +485 -0
  231. package/development/05-cases/case-ci-cd-pipeline.md +544 -0
  232. package/development/05-cases/case-database-scaling.md +500 -0
  233. package/development/05-cases/case-db-hotspot-optimization.md +487 -0
  234. package/development/05-cases/case-incident-mttr-reduction.md +563 -0
  235. package/development/05-cases/case-microservice-migration.md +375 -0
  236. package/development/05-cases/case-performance-optimization.md +406 -0
  237. package/development/05-cases/case-security-incident-response.md +345 -0
  238. package/development/06-glossary/full-stack-glossary.md +166 -0
  239. package/development/09-maturity/quarterly-audit-template.md +35 -0
  240. package/development/11-ui-excellence/ui-aesthetic-system.md +41 -0
  241. package/development/11-ui-excellence/ui-engineering-excellence.md +435 -0
  242. package/development/12-scenarios/development-scenarios-guide.md +565 -0
  243. package/development/13-implementation-assets/implementation-toolkit.md +282 -0
  244. package/development/13-implementation-assets/knowledge-gates-execution.md +43 -0
  245. package/development/14-full-lifecycle/software-lifecycle-gates.md +511 -0
  246. package/development/15-lifecycle-templates/project-templates-collection.md +791 -0
  247. package/development/api-contract-and-versioning-guide.md +36 -0
  248. package/development/api-governance-complete.md +43 -0
  249. package/development/backend-engineering-complete.md +43 -0
  250. package/development/code-review-quality-complete.md +43 -0
  251. package/development/concurrency-reliability-complete.md +43 -0
  252. package/development/database-engineering-complete.md +43 -0
  253. package/development/engineering-effectiveness-complete.md +43 -0
  254. package/development/engineering-standards-deep-dive.md +38 -0
  255. package/development/frontend-engineering-complete.md +43 -0
  256. package/development/performance-capacity-complete.md +43 -0
  257. package/development/refactor-migration-complete.md +42 -0
  258. package/development/refactoring-and-techdebt-playbook.md +37 -0
  259. package/development/security-in-development-complete.md +43 -0
  260. package/devops/01-standards/cicd-pipeline-complete.md +262 -0
  261. package/devops/01-standards/docker-complete.md +1490 -0
  262. package/devops/01-standards/github-actions-complete.md +337 -0
  263. package/devops/01-standards/kubernetes-complete.md +638 -0
  264. package/devops/01-standards/terraform-complete.md +2117 -0
  265. package/devops/02-playbooks/docker-compose-playbook.md +233 -0
  266. package/devops/02-playbooks/docker-k8s-production-playbook.md +186 -0
  267. package/devops/02-playbooks/docker-production-playbook.md +952 -0
  268. package/edge-iot/01-standards/edge-iot-complete.md +473 -0
  269. package/experts/architect/api-design.md +178 -0
  270. package/experts/architect/methodology.md +124 -0
  271. package/experts/architect/security.md +75 -0
  272. package/experts/backend-lead/methodology.md +216 -0
  273. package/experts/devops/methodology.md +160 -0
  274. package/experts/frontend-lead/methodology.md +178 -0
  275. package/experts/product-manager/industry/ecommerce.md +43 -0
  276. package/experts/product-manager/industry/saas.md +40 -0
  277. package/experts/product-manager/methodology.md +97 -0
  278. package/experts/qa-lead/methodology.md +123 -0
  279. package/experts/qa-lead/test-strategy.md +128 -0
  280. package/experts/uiux-designer/methodology.md +125 -0
  281. package/frontend/01-standards/accessibility-complete.md +532 -0
  282. package/frontend/01-standards/accessibility-standard.md +74 -0
  283. package/frontend/01-standards/admin-dashboard-and-crud.md +72 -0
  284. package/frontend/01-standards/design-tokens-complete.md +444 -0
  285. package/frontend/01-standards/forms-and-validation.md +77 -0
  286. package/frontend/01-standards/frontend-architecture-and-layering.md +119 -0
  287. package/frontend/01-standards/i18n-and-localization.md +65 -0
  288. package/frontend/01-standards/nextjs-complete.md +451 -0
  289. package/frontend/01-standards/react-complete.md +713 -0
  290. package/frontend/01-standards/react-hooks-complete-guide.md +1100 -0
  291. package/frontend/01-standards/react-hooks-complete.md +1171 -0
  292. package/frontend/01-standards/seo-and-web-vitals.md +77 -0
  293. package/frontend/01-standards/state-management-complete.md +444 -0
  294. package/frontend/01-standards/vue-complete.md +499 -0
  295. package/frontend/01-standards/vue3-complete.md +2002 -0
  296. package/frontend/01-standards/web-framework-best-practices.md +64 -0
  297. package/frontend/01-standards/web-performance-complete.md +495 -0
  298. package/frontend/02-playbooks/accessibility-a11y-playbook.md +161 -0
  299. package/frontend/02-playbooks/frontend-performance-playbook.md +707 -0
  300. package/frontend/02-playbooks/i18n-internationalization-playbook.md +120 -0
  301. package/frontend/02-playbooks/performance-optimization-playbook.md +163 -0
  302. package/frontend/02-playbooks/react-nextjs-production-playbook.md +167 -0
  303. package/frontend/02-playbooks/react-state-management-playbook.md +173 -0
  304. package/frontend/03-checklists/component-quality-checklist.md +166 -0
  305. package/frontend/03-checklists/frontend-launch-checklist.md +299 -0
  306. package/frontend/04-antipatterns/frontend-antipatterns.md +886 -0
  307. package/frontend/05-cases/case-performance-optimization.md +274 -0
  308. package/harmony/01-standards/harmonyos-arkts-standard.md +75 -0
  309. package/harmony/01-standards/harmonyos-design.md +65 -0
  310. package/high-quality-engineering-playbook.md +54 -0
  311. package/incident/01-standards/incident-response-complete.md +303 -0
  312. package/incident/02-playbooks/chaos-engineering-playbook.md +883 -0
  313. package/incident/02-playbooks/postmortem-playbook.md +398 -0
  314. package/incident/03-checklists/incident-readiness-checklist.md +181 -0
  315. package/incident/04-antipatterns/incident-antipatterns.md +490 -0
  316. package/incident/05-cases/case-cascade-failure.md +176 -0
  317. package/incident/06-glossary/incident-glossary.md +114 -0
  318. package/incident/postmortem-and-response-deep-dive.md +39 -0
  319. package/industries/ecommerce/ecommerce-complete.md +631 -0
  320. package/industries/education/education-complete.md +555 -0
  321. package/industries/fintech/fintech-complete.md +501 -0
  322. package/industries/gaming/gaming-complete.md +587 -0
  323. package/industries/healthcare/healthcare-complete.md +452 -0
  324. package/low-code/01-standards/low-code-complete.md +944 -0
  325. package/miniprogram/01-standards/ai-common-mistakes.md +61 -0
  326. package/miniprogram/01-standards/miniprogram-custom-navbar-capsule.md +77 -0
  327. package/miniprogram/01-standards/miniprogram-design.md +61 -0
  328. package/miniprogram/01-standards/miniprogram-standard.md +81 -0
  329. package/mobile/01-standards/android-material-design.md +70 -0
  330. package/mobile/01-standards/flutter-complete.md +384 -0
  331. package/mobile/01-standards/ios-design-hig.md +78 -0
  332. package/mobile/01-standards/mobile-app-standard.md +85 -0
  333. package/mobile/01-standards/react-native-complete.md +352 -0
  334. package/mobile/02-playbooks/mobile-cross-platform-playbook.md +175 -0
  335. package/mobile/02-playbooks/mobile-performance.md +473 -0
  336. package/mobile/03-checklists/mobile-release-checklist.md +234 -0
  337. package/mobile/04-antipatterns/mobile-antipatterns.md +798 -0
  338. package/mobile/05-cases/case-app-performance.md +500 -0
  339. package/mobile/05-cases/case-app-startup-optimization.md +218 -0
  340. package/mobile/06-glossary/mobile-glossary.md +484 -0
  341. package/observability/01-standards/observability-standards.md +103 -0
  342. package/observability/02-playbooks/prometheus-grafana-playbook.md +135 -0
  343. package/observability/02-playbooks/structured-logging-playbook.md +73 -0
  344. package/observability/03-checklists/observability-checklist.md +54 -0
  345. package/observability/04-antipatterns/observability-antipatterns.md +106 -0
  346. package/operations/01-standards/prometheus-monitoring-complete.md +1578 -0
  347. package/operations/02-playbooks/capacity-planning-playbook.md +620 -0
  348. package/operations/03-checklists/production-launch-checklist.md +365 -0
  349. package/operations/04-antipatterns/operations-antipatterns.md +664 -0
  350. package/operations/05-cases/case-sre-practices.md +581 -0
  351. package/operations/06-glossary/operations-glossary.md +120 -0
  352. package/operations/aiops-anomaly-detection.md +758 -0
  353. package/operations/capacity-planning.md +1061 -0
  354. package/operations/chaos-engineering.md +659 -0
  355. package/operations/incident-command-system.md +38 -0
  356. package/operations/observability-complete.md +442 -0
  357. package/operations/slo-sli-playbook.md +517 -0
  358. package/operations/sre-operations-deep-dive.md +39 -0
  359. package/package.json +8 -0
  360. package/performance/01-standards/performance-and-scalability.md +80 -0
  361. package/performance/01-standards/performance-standards.md +156 -0
  362. package/performance/02-playbooks/query-optimization-playbook.md +103 -0
  363. package/performance/03-checklists/performance-checklist.md +56 -0
  364. package/performance/04-antipatterns/performance-antipatterns.md +146 -0
  365. package/product/01-standards/product-management-complete.md +285 -0
  366. package/product/02-playbooks/feature-launch-playbook.md +207 -0
  367. package/product/02-playbooks/user-research-playbook.md +532 -0
  368. package/product/03-checklists/feature-launch-checklist.md +275 -0
  369. package/product/04-antipatterns/product-antipatterns.md +355 -0
  370. package/product/05-cases/case-mvp-to-scale.md +384 -0
  371. package/product/06-glossary/product-glossary.md +462 -0
  372. package/product/feature-prioritization-framework.md +40 -0
  373. package/product/kpi-and-metric-tree.md +37 -0
  374. package/product/product-discovery-and-prd-deep-dive.md +41 -0
  375. package/quantum/01-standards/quantum-complete.md +1186 -0
  376. package/security/01-standards/api-security-complete.md +511 -0
  377. package/security/01-standards/container-runtime-security.md +574 -0
  378. package/security/01-standards/data-protection-gdpr.md +543 -0
  379. package/security/01-standards/owasp-top10-complete.md +1890 -0
  380. package/security/01-standards/secure-coding-baseline.md +90 -0
  381. package/security/01-standards/supply-chain-security.md +441 -0
  382. package/security/01-standards/web-security-checklist.md +108 -0
  383. package/security/01-standards/zero-trust-architecture.md +521 -0
  384. package/security/02-playbooks/auth-sso-playbook.md +166 -0
  385. package/security/02-playbooks/incident-response-security-playbook.md +588 -0
  386. package/security/02-playbooks/owasp-api-security-playbook.md +129 -0
  387. package/security/02-playbooks/payment-integration-playbook.md +119 -0
  388. package/security/02-playbooks/penetration-testing-playbook.md +517 -0
  389. package/security/03-checklists/security-audit-checklist.md +356 -0
  390. package/security/04-antipatterns/security-coding-antipatterns.md +580 -0
  391. package/security/05-cases/case-log4shell-incident.md +537 -0
  392. package/security/05-cases/case-major-breaches.md +468 -0
  393. package/security/06-glossary/security-glossary.md +212 -0
  394. package/security/compliance-automation.md +993 -0
  395. package/security/container-security.md +680 -0
  396. package/security/devsecops-complete.md +426 -0
  397. package/security/sast-dast-sca.md +775 -0
  398. package/security/secrets-management.md +594 -0
  399. package/security/security-architecture-deep-dive.md +37 -0
  400. package/security/threat-modeling-stride-playbook.md +40 -0
  401. package/seed-templates/auth-system.md +59 -0
  402. package/seed-templates/blog-content.md +94 -0
  403. package/seed-templates/dashboard.md +89 -0
  404. package/seed-templates/docs-site.md +73 -0
  405. package/seed-templates/e-commerce.md +50 -0
  406. package/seed-templates/saas-landing.md +92 -0
  407. package/seed-templates/settings-page.md +51 -0
  408. package/testing/01-standards/test-strategy-and-layering.md +83 -0
  409. package/testing/01-standards/testing-strategy-complete.md +422 -0
  410. package/testing/01-standards/unit-testing-best-practices.md +118 -0
  411. package/testing/02-playbooks/e2e-testing-playbook.md +988 -0
  412. package/testing/02-playbooks/testing-strategy-playbook.md +126 -0
  413. package/testing/03-checklists/test-strategy-checklist.md +208 -0
  414. package/testing/04-antipatterns/testing-antipatterns.md +718 -0
  415. package/testing/05-cases/case-testing-transformation.md +300 -0
  416. package/testing/06-glossary/testing-glossary.md +110 -0
  417. package/testing/risk-based-test-matrix.md +36 -0
  418. package/testing/testing-strategy-deep-dive.md +37 -0
@@ -0,0 +1,500 @@
+ ---
+ id: case-database-scaling
+ title: "Case Study: Evolving a Database from a Single Instance to Read/Write Splitting to Sharding"
+ domain: development
+ category: 05-cases
+ difficulty: intermediate
+ tags: [agent, case, checklist, database, development, scaling, 元数据]
+ quality_score: 70
+ last_updated: 2026-06-15
+ ---
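+ # Case Study: Evolving a Database from a Single Instance to Read/Write Splitting to Sharding
+
+ ## Metadata
+
+ | Field | Value |
+ |------|------|
+ | Industry | Social e-commerce platform |
+ | System scale | 30 million registered users, 4 million daily active users |
+ | Database | MySQL 8.0 |
+ | Data volume | 4.2 TB in total; the core table holds 800 million rows |
+ | Team size | 20 backend engineers, 3 DBAs |
+ | Evolution period | 3 years (2022-01 to 2024-12) |
+
+ ---
+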
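+ ## 1. Background
+
+ ### 1.1 Business Overview
+
+ A social e-commerce platform whose core business is users sharing products to earn commissions. The main data entities:
+
+ - **Users table (users)**: 30 million rows, growing by 8 million per year
+ - **Orders table (orders)**: 800 million rows, growing by 1.5 million per day
+ - **Products table (products)**: 5 million rows, growing by 2,000 per day
+ - **Commissions table (commissions)**: 600 million rows, growing by 1.2 million per day
+ - **Relations table (user_relations)**: 120 million rows (user invitation chains)
+
+ ### 1.2 Overview of the Three Stages
+
+ ```
+ Stage 1 (2022 Q1): single instance → master-slave replication (read/write splitting)
+   Trigger: read QPS reached 8,000; master CPU at 85%+
+
+ Stage 2 (2023 Q2): master-slave → vertical split (one database per business domain)
+   Trigger: connections on the single database hit the limit; frequent slow queries
+
+ Stage 3 (2024 Q1): vertical split → horizontal sharding (sharded databases and tables)
+   Trigger: orders table at 800 million rows; single-table queries timing out
+ ```
+
+ ---
+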
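+ ## 2. Stage 1: From a Single Instance to Read/Write Splitting
+
+ ### 2.1 Triggering Problems
+
+ In early 2022, with the business growing rapidly, the database hit serious bottlenecks:
+
+ | Metric | Threshold | Actual |
+ |------|------|--------|
+ | Read QPS | recommended < 5,000 | 8,200 |
+ | Write QPS | recommended < 2,000 | 1,800 |
+ | Master CPU | recommended < 70% | 85-92% |
+ | Slow queries per minute | recommended < 5 | 35 |
+ | Connections | limit 1,000 | 890 |
+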
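The table above can be read as a simple threshold check. The following is a minimal sketch, not part of the original case study: the metric keys are hypothetical names, and the lower bound of the 85-92% CPU range is used as the observed value, which is an assumption.

```python
# Thresholds and observed values copied from the table above.
# Assumption: the CPU range 85-92% is represented by its lower bound (85).
METRICS = {
    "read_qps": (5000, 8200),
    "write_qps": (2000, 1800),
    "master_cpu_pct": (70, 85),
    "slow_queries_per_min": (5, 35),
    "connections": (1000, 890),
}


def breached(metrics):
    """Return {name: actual / threshold} for metrics at or above their threshold."""
    return {
        name: round(actual / limit, 2)
        for name, (limit, actual) in metrics.items()
        if actual >= limit
    }


def headroom(metrics):
    """Return {name: remaining fraction of the threshold} for metrics still under it."""
    return {
        name: round((limit - actual) / limit, 2)
        for name, (limit, actual) in metrics.items()
        if actual < limit
    }


if __name__ == "__main__":
    print("breached:", breached(METRICS))
    print("headroom:", headroom(METRICS))
```

Read this way, three metrics are over their thresholds (read QPS, master CPU, slow queries), while write QPS and connections are still inside their limits with roughly 10% headroom each. That matches the choice of offloading reads first rather than splitting writes.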
+ ### 2.2 Design
+
+ ```
+ Architecture change:
+ Before: App → MySQL (Single Instance, 64 cores, 256 GB RAM)
+ After:  App → ProxySQL → MySQL Master  (writes)
+                        → MySQL Slave-1 (reads)
+                        → MySQL Slave-2 (reads)
+                        → MySQL Slave-3 (reads, reporting only)
+ ```
+
+ ### 2.3 Key Implementation Details
+
+ #### Middleware Selection
+
+ | Option | Pros | Cons | Chosen |
+ |------|------|------|------|
+ | Application-level read/write splitting | No middleware dependency | Highly invasive to application code | No |
+ | ProxySQL | Transparent proxy, flexible routing | Operational complexity | **Yes** |
+ | MySQL Router | Official MySQL product | Limited feature set | No |
+ | ShardingSphere-Proxy | Comprehensive feature set | Too heavyweight | No |
+
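+ #### ProxySQL Routing Rules
+
+ ```sql
+ -- Rules are evaluated in ascending rule_id order. active = 1 enables a rule;
+ -- apply = 1 stops evaluation at the first match.
+
+ -- Route writes to the Master (hostgroup 10)
+ INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
+ VALUES (1, 1, '^(INSERT|UPDATE|DELETE|REPLACE)', 10, 1);
+
+ -- Route report queries to the dedicated slave (hostgroup 30).
+ -- This rule must precede the generic SELECT rule, or it never matches.
+ INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, schemaname, apply)
+ VALUES (2, 1, '^SELECT.*report_', 30, 'analytics', 1);
+
+ -- Route all other reads to the slaves (hostgroup 20)
+ INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
+ VALUES (3, 1, '^SELECT', 20, 1);
+
+ -- Reads inside a transaction stay on the Master (read-your-writes consistency).
+ -- A flagIN = 1 rule only fires after an earlier rule sets flagOUT = 1, so this
+ -- relies on transaction_persistent = 1 in mysql_users instead.
+
+ LOAD MYSQL QUERY RULES TO RUNTIME;
+ SAVE MYSQL QUERY RULES TO DISK;
+ ```
+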
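Rule order matters for the routing above. The following is a simplified Python model of ordered, first-match rule evaluation, offered as a sketch rather than a description of ProxySQL's internals. It assumes every rule is active and stops at the first match (`apply = 1`); the `Rule` class, the `route` function, and the fallback `default_hostgroup` are hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    rule_id: int
    pattern: str
    hostgroup: int
    schema: Optional[str] = None  # None means the rule applies to any schema
    apply: bool = True            # stop evaluating once this rule matches


def route(rules: List[Rule], sql: str, schema: str, default_hostgroup: int = 10) -> int:
    """Return the hostgroup chosen for one statement.

    Rules are tried in ascending rule_id order. A matching rule sets the
    destination; evaluation stops there when the rule has apply=True.
    """
    destination = default_hostgroup
    for rule in sorted(rules, key=lambda r: r.rule_id):
        if rule.schema is not None and rule.schema != schema:
            continue
        if re.search(rule.pattern, sql, re.IGNORECASE):
            destination = rule.hostgroup
            if rule.apply:
                break
    return destination


# Narrow report rule placed before the generic SELECT rule.
RULES = [
    Rule(1, r"^(INSERT|UPDATE|DELETE|REPLACE)", 10),
    Rule(2, r"^SELECT.*report_", 30, schema="analytics"),
    Rule(3, r"^SELECT", 20),
]

# Same rules, but the generic SELECT rule has the lower rule_id.
MISORDERED = [
    Rule(1, r"^(INSERT|UPDATE|DELETE|REPLACE)", 10),
    Rule(2, r"^SELECT", 20),
    Rule(4, r"^SELECT.*report_", 30, schema="analytics"),
]

if __name__ == "__main__":
    query = "SELECT * FROM report_daily"
    print(route(RULES, query, "analytics"))       # report slave
    print(route(MISORDERED, query, "analytics"))  # generic slaves
```

Under first-match evaluation, a narrow pattern such as the report rule needs a lower `rule_id` than the generic `^SELECT` rule; otherwise report queries land on the general read slaves.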
109
+ #### 主从一致性处理
110
+
111
+ 核心问题:从库有复制延迟(通常 100-500ms),某些场景需要读到最新数据。
112
+
113
+ ```
114
+ 策略分类:
115
+ 1. 强一致读(写后立即读)→ 路由到 Master
116
+ 场景:下单后查看订单详情、修改资料后查看
117
+
118
+ 2. 会话一致读 → 3 秒内路由到 Master,之后路由到 Slave
119
+ 场景:用户操作后的页面刷新
120
+
121
+ 3. 最终一致读 → 路由到 Slave
122
+ 场景:商品列表、排行榜、搜索结果
123
+ ```
+
+ Application-layer implementation:
+
+ ```python
+ # Decorator that pins reads to the Master after a write
+ @route_to_master(duration=3)  # read from Master for the next 3 seconds
+ def create_order(order_data):
+     db.master.insert(order_data)
+     return db.query("SELECT * FROM orders WHERE id = %s", order_data.id)
+ ```
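
The snippet above depends on a `route_to_master` decorator that the source does not show. The following is a minimal sketch of how such a decorator could work, assuming a per-session "pinned until" timestamp and an injectable clock. The `Router` class, its `pin` and `choose` methods, and the `session_id` argument are hypothetical names for illustration, not the team's actual API.

```python
import functools
import time


class Router:
    """Hypothetical session-aware router: pins a session to the master after a write."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock        # injectable clock so the behaviour can be tested
        self._pinned_until = {}    # session_id -> time until which reads go to master

    def pin(self, session_id, duration):
        """Send this session's reads to the master for `duration` seconds."""
        self._pinned_until[session_id] = self._clock() + duration

    def choose(self, session_id):
        """Return 'master' while the session is pinned, otherwise 'slave'."""
        if self._clock() < self._pinned_until.get(session_id, float("-inf")):
            return "master"
        return "slave"


def route_to_master(router, duration):
    """Decorator: pin the caller's session before running the wrapped write function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(session_id, *args, **kwargs):
            # Pin first so that reads issued inside the function also hit the master.
            router.pin(session_id, duration)
            return func(session_id, *args, **kwargs)
        return wrapper
    return decorator
```

Pinning per session rather than globally keeps other users' reads on the slaves, which is what lets the master shed most of its read load.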
+
+ ### 2.4 Stage 1 Results
+
+ | Metric | Before | After |
+ |------|--------|--------|
+ | Master read QPS | 8,200 | 1,800 (writes + strongly consistent reads only) |
+ | Master CPU | 85-92% | 35-45% |
+ | Slow queries per minute | 35 | 3 |
+ | Read latency P99 | 450ms | 85ms |
+ | QPS the system can sustain | 10,000 | 35,000 |
+
+ **Implementation time**: 3 weeks (including load testing and gradual rollout)
+
+ ---
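+
+ ## 3. Stage 2: From Read/Write Splitting to Vertical Split
+
+ ### 3.1 Triggering Problems
+
+ By mid-2023, growing business complexity had pushed the master-slave architecture into new bottlenecks:
+
+ | Metric | Problem |
+ |------|----------|
+ | Connections | Master connections reached 800+ (limit 1,000); each business module holds its own connection pool |
+ | Lock contention | Highly concurrent writes to the orders and commissions tables caused severe row-lock waits |
+ | DDL risk | Adding a column to the 800-million-row table takes 4 hours, longer than the available maintenance window |
+ | Coupling | All business domains share one database, so a single slow query affects everything |
+ | Backup time | A full backup takes 6 hours and degrades slave performance |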
+
+ ### 3.2 Vertical Split Plan
+
+ The data is split into 4 independent databases by business domain:
+
+ ```
+ Before the split (single database all_in_one):
+ ├── users, user_profiles, user_settings
+ ├── orders, order_items, order_logs
+ ├── products, categories, product_images
+ ├── commissions, commission_rules, withdrawals
+ └── user_relations, invite_rewards
+
+ After the split:
+ ├── user_db: users, user_profiles, user_settings, user_relations
+ ├── order_db: orders, order_items, order_logs
+ ├── product_db: products, categories, product_images
+ └── finance_db: commissions, commission_rules, withdrawals, invite_rewards
+ ```
+
+ Each database has its own master-slave setup:
+
+ ```
+ user_db: Master + 2 Slave (read-heavy)
+ order_db: Master + 2 Slave (heavy reads and writes)
+ product_db: Master + 1 Slave (read-heavy)
+ finance_db: Master + 2 Slave (write-heavy, strict consistency requirements)
+ ```
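+
+ ### 3.3 Core Challenge: Reworking Cross-Database JOINs
+
+ **Problem**: before the split there were 34 JOIN queries spanning business domains.
+
+ **Rework strategies**:
+
+ ```
+ Strategy 1: Redundant columns (12 cases)
+   Store user_name and product_name redundantly in the orders table
+   Keeps high-frequency queries from crossing databases
+
+ Strategy 2: Application-layer aggregation (15 cases)
+   Query the order list → batch-query user info → assemble in memory
+   Use parallel queries to reduce latency
+
+ Strategy 3: Sync data into wide tables (5 cases)
+   Sync into ClickHouse wide tables via binlog
+   Reports and complex analytical queries go to ClickHouse
+
+ Strategy 4: Eliminate unnecessary JOINs (2 cases)
+   A review of business requirements found 2 JOINs that no longer served any purpose
+ ```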
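
Strategy 2 (application-layer aggregation) can be sketched as follows. This is an illustrative outline only: the three repository callables stand in for queries against order_db, user_db, and product_db, and their names and return shapes are assumptions rather than the team's code. Including product lookups alongside user lookups is also an assumption made to show the parallel step.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_orders_with_details(order_repo, user_repo, product_repo, limit):
    """Assemble order rows with user and product names without a cross-database JOIN."""
    orders = order_repo(limit)                        # 1. query the order list
    user_ids = sorted({o["user_id"] for o in orders})
    product_ids = sorted({o["product_id"] for o in orders})

    # 2. batch-query the other databases in parallel: one query each, not one per order
    with ThreadPoolExecutor(max_workers=2) as pool:
        users_future = pool.submit(user_repo, user_ids)
        products_future = pool.submit(product_repo, product_ids)
        users = users_future.result()
        products = products_future.result()

    # 3. assemble in memory
    return [
        {
            **o,
            "user_name": users.get(o["user_id"]),
            "product_name": products.get(o["product_id"]),
        }
        for o in orders
    ]
```

Deduplicating the IDs before the batch queries avoids the N+1 pattern that a naive per-order lookup would create.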
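+
+ ### 3.4 Cross-Database Transactions
+
+ **Problem**: the order flow touches order_db (create order) + finance_db (calculate commission) + product_db (deduct inventory).
+
+ **Solution**: eventual consistency based on a message queue
+
+ ```
+ Order flow:
+ 1. order_db: create order (PENDING) ← local transaction
+ 2. MQ: send "order created" message
+ 3. product_db: deduct inventory ← consume message, local transaction
+ 4. finance_db: calculate commission ← consume message, local transaction
+ 5. order_db: update order status (CONFIRMED) ← consume confirmation message
+
+ Compensation:
+ - Inventory deduction fails → send compensation message → cancel order
+ - Commission calculation fails → retry asynchronously 3 times → on failure, enter the manual-handling queue
+ - Scheduled reconciliation job: scan for timed-out PENDING orders every hour
+ ```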
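
The compensation rules above can be sketched in a few lines. This is a simplified, synchronous illustration under stated assumptions: the handler callables replace real message consumers, a commission failure does not cancel the order, and "retry 3 times" is read as three attempts in total. The function names and the `timeout_seconds` parameter are hypothetical.

```python
def process_order_created(order, deduct_inventory, calculate_commission,
                          manual_queue, max_attempts=3):
    """Handle an 'order created' message and return the resulting order status."""
    # Step 3: inventory deduction; on failure, compensate by cancelling the order.
    try:
        deduct_inventory(order)
    except Exception:
        return "CANCELLED"

    # Step 4: commission calculation; retry, then hand over to manual handling.
    for attempt in range(1, max_attempts + 1):
        try:
            calculate_commission(order)
            break
        except Exception:
            if attempt == max_attempts:
                manual_queue.append(order["id"])

    # Step 5: inventory is secured, so the order can be confirmed.
    return "CONFIRMED"


def find_timed_out_pending(orders, now, timeout_seconds):
    """Reconciliation scan: IDs of orders stuck in PENDING for longer than the timeout."""
    return [
        o["id"]
        for o in orders
        if o["status"] == "PENDING" and now - o["created_at"] > timeout_seconds
    ]
```

In a real deployment the consumers would also need to be idempotent, since a message queue may deliver the same message more than once.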
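+
+ ### 3.5 Data Migration Process
+
+ ```
+ Phase 1: Dual-write preparation (1 week)
+ - Create 4 new independent database instances
+ - Full sync + incremental catch-up via DTS (Data Transmission Service)
+
+ Phase 2: Dual-write verification (2 weeks)
+ - Application-layer dual writes: write the old database first, then the new one
+ - Reconciliation job: compare old and new databases for consistency every minute
+ - Inconsistent records go to a repair queue
+
+ Phase 3: Switch reads (1 week)
+ - Shift read traffic to the new databases gradually: 10% → 50% → 100%
+ - Compare the results returned by the old and new databases
+
+ Phase 4: Switch writes (3 days)
+ - Switch write traffic to the new databases
+ - Keep the old database as a read-only synced copy for rollback
+ - Decommission the old database after 7 days of observation
+ ```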
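
The dual-write and reconciliation steps of Phase 2 can be illustrated with plain dictionaries standing in for the old and new databases. This is a sketch under that assumption; the function names are hypothetical, and the old database is treated as the source of truth because writes have not been switched yet.

```python
def dual_write(old_db, new_db, repair_queue, key, row):
    """Write the old database first, then the new one; queue the key if the second write fails."""
    old_db[key] = row                 # the old database stays the source of truth
    try:
        new_db[key] = row
    except Exception:
        repair_queue.append(key)


def reconcile(old_db, new_db, repair_queue):
    """Queue every key whose row differs between the stores or is missing on one side."""
    for key in sorted(old_db.keys() | new_db.keys()):
        if old_db.get(key) != new_db.get(key) and key not in repair_queue:
            repair_queue.append(key)
    return repair_queue


def repair(old_db, new_db, repair_queue):
    """Drain the repair queue by copying the old database's state into the new one."""
    while repair_queue:
        key = repair_queue.pop(0)
        if key in old_db:
            new_db[key] = old_db[key]
        else:
            new_db.pop(key, None)
```

A production reconciliation job would compare in bounded batches (for example by primary-key range or update time) rather than scanning every key each minute.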
+
+ ### 3.6 Stage 2 Results
+
+ | Metric | Before | After |
+ |------|--------|--------|
+ | Connections per database | 800+ | Under 200 per database |
+ | Order write latency P99 | 380ms | 45ms |
+ | DDL change time | 4 hours | 15 minutes (smaller tables) |
+ | Full backup time | 6 hours | 45 minutes per database |
+ | Failure blast radius | All business domains | A single business domain |
+
+ **Implementation time**: 8 weeks
+
+ ---
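+
+ ## 4. Stage 3: From Vertical Split to Horizontal Sharding
+
+ ### 4.1 Triggering Problems
+
+ In early 2024, order_db became the new bottleneck:
+
+ | Metric | Problem |
+ |------|----------|
+ | Rows in the orders table | 800 million rows, 380GB in a single table |
+ | Write QPS | Peak 4,500 |
+ | Query latency | Per-user query P99 = 1.2s |
+ | Index size | 120GB, larger than memory, causing frequent disk I/O |
+ | DDL | Adding an index is estimated at 8 hours and cannot be run |
+
+ Meanwhile, the commissions table in finance_db had also reached 600 million rows.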
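+
+ ### 4.2 Sharding Strategy Design
+
+ #### Choosing the Shard Key
+
+ | Candidate shard key | Pros | Cons | Verdict |
+ |------------|------|------|------|
+ | order_id | Even data distribution | Per-user queries must fan out to every shard | No |
+ | user_id | Per-user queries stay on one shard | Data skew from large sellers | **Yes** |
+ | created_at | Friendly to range queries | Hotspot on the newest shard | No |
+ | user_id + month | Even distribution + range-friendly | High complexity | Fallback |
+
+ **user_id** was chosen as the shard key because:
+ 1. 95% of queries include a user_id condition
+ 2. When order details are looked up by order_id, the order_id itself encodes part of the user_id (a custom segment in the Snowflake algorithm)
+ 3. Data skew from large sellers is mitigated by using 1,024 virtual shards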
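+
+ #### Shard Scale
+
+ ```
+ orders table sharding:
+   Physical databases: 4 (order_db_0 ~ order_db_3)
+   Tables per database: 256 (orders_0000 ~ orders_0255)
+   Total shards: 1024 virtual shards
+   Routing rule: shard_id = user_id % 1024
+   Physical database mapping: db_id = shard_id / 256 (integer division)
+   Table mapping: table_id = shard_id % 256
+
+ commissions table sharding:
+   Same strategy as the orders table: sharded by user_id onto the same set of databases
+   Keeps a user's orders and commissions in the same physical database (allows local JOINs)
+ ```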
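
As a worked example of the routing arithmetic above, the sketch below maps a user ID to a database and table name. The rejection of negative or missing IDs and the `commissions_NNNN` table naming are assumptions added for illustration; the source only names the orders tables.

```python
SHARD_COUNT = 1024
DB_COUNT = 4
TABLES_PER_DB = SHARD_COUNT // DB_COUNT  # 256


def route(user_id, table="orders"):
    """Return (database name, table name) for a user."""
    if user_id is None or user_id < 0:
        # Assumed policy: reject values that cannot be routed deterministically.
        raise ValueError(f"invalid user_id: {user_id!r}")
    shard_id = user_id % SHARD_COUNT
    db_id = shard_id // TABLES_PER_DB
    table_id = shard_id % TABLES_PER_DB
    return f"order_db_{db_id}", f"{table}_{table_id:04d}"


# 123456789 % 1024 = 277, 277 // 256 = 1, 277 % 256 = 21
print(route(123456789))                  # ('order_db_1', 'orders_0021')
print(route(123456789, "commissions"))   # ('order_db_1', 'commissions_0021')
```

Because both tables use the same function of user_id, a user's orders and commissions always land in the same physical database.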
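+
+ ### 4.3 Choosing the Sharding Middleware
+
+ | Option | Pros | Cons | Chosen |
+ |------|------|------|------|
+ | ShardingSphere-JDBC | Native to Java, good performance | Java only | No (Go stack) |
+ | Vitess | Proven at YouTube | Very high operational complexity | No |
+ | In-house routing layer | Full control | Development and maintenance cost | **Yes** |
+ | CockroachDB | Automatic sharding | Switching databases is high-risk | No |
+
+ An in-house lightweight routing layer (Go middleware) was chosen because:
+ 1. The team works in Go, where the sharding middleware ecosystem is less mature than in Java
+ 2. The sharding rule is relatively simple (modulo on a single key), so heavyweight middleware is unnecessary
+ 3. An in-house layer gives full control over routing logic and error handling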
+
+ #### Core Code Design of the Routing Layer
+
+ ```go
+ // Requires: context, database/sql, fmt, golang.org/x/sync/errgroup.
+ // Row, scanRows and mergeResults are assumed to be defined elsewhere.
+ type ShardRouter struct {
+ 	shardCount int       // 1024
+ 	dbCount    int       // 4
+ 	dbPool     []*sql.DB // connection pools for the 4 physical databases
+ }
+
+ // Route maps a non-negative user ID to its physical database and table.
+ func (r *ShardRouter) Route(userID int64) (db *sql.DB, tableName string) {
+ 	shardID := userID % int64(r.shardCount)
+ 	dbIndex := shardID / int64(r.shardCount/r.dbCount)
+ 	tableIndex := shardID % int64(r.shardCount/r.dbCount)
+ 	return r.dbPool[dbIndex], fmt.Sprintf("orders_%04d", tableIndex)
+ }
+
+ func (r *ShardRouter) ScatterQuery(ctx context.Context, query string) ([]Row, error) {
+ 	// Queries without the shard key: query every database in parallel, then merge
+ 	g, ctx := errgroup.WithContext(ctx)
+ 	results := make([][]Row, r.dbCount)
+ 	for i := 0; i < r.dbCount; i++ {
+ 		i := i // capture the loop variable for the goroutine
+ 		g.Go(func() error {
+ 			// Within each database the 256 tables are combined with UNION ALL
+ 			rows, err := r.dbPool[i].QueryContext(ctx, query)
+ 			if err != nil {
+ 				return err
+ 			}
+ 			defer rows.Close()
+ 			// QueryContext returns *sql.Rows, so scan it into []Row first
+ 			results[i], err = scanRows(rows)
+ 			return err
+ 		})
+ 	}
+ 	if err := g.Wait(); err != nil {
+ 		return nil, err
+ 	}
+ 	return mergeResults(results), nil
+ }
+ ```
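
The source does not show `mergeResults`. One common approach, assuming each per-database query already returns rows sorted by the same key (for example via `ORDER BY ... LIMIT n`), is a k-way merge that stops once enough rows are collected. The sketch below illustrates that idea in Python; the function name and row shape are hypothetical.

```python
import heapq
from itertools import islice


def merge_sorted_results(per_db_rows, key, limit):
    """K-way merge of per-database result lists that are each already sorted by `key`."""
    return list(islice(heapq.merge(*per_db_rows, key=key), limit))


per_db = [
    [{"id": 1, "t": 1}, {"id": 4, "t": 7}],   # order_db_0
    [],                                       # order_db_1 returned nothing
    [{"id": 2, "t": 3}],                      # order_db_2
    [{"id": 3, "t": 5}, {"id": 5, "t": 9}],   # order_db_3
]
top3 = merge_sorted_results(per_db, key=lambda r: r["t"], limit=3)
print([r["id"] for r in top3])  # [1, 2, 3]
```

The merge is lazy, so only as many rows as `limit` requires are pulled from the per-database lists.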
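+
+ ### 4.4 Globally Unique ID Scheme
+
+ After sharding, database auto-increment IDs can no longer be relied on, so a custom Snowflake algorithm is used:
+
+ ```
+ Order ID layout (64 bit):
+ ┌─────────┬──────────┬─────────────┬──────────┐
+ │ 1 bit   │ 41 bit   │ 10 bit      │ 12 bit   │
+ │ sign bit│ timestamp│ user_id     │ sequence │
+ │         │ (ms)     │ low 10 bits │          │
+ └─────────┴──────────┴─────────────┴──────────┘
+
+ Advantages:
+ - The low 10 bits of user_id (equal to shard_id, since 1024 = 2^10) can be read directly from order_id → fast routing
+ - Time-ordered → friendly to B+ Tree indexes
+ - Up to 4,096 order IDs per millisecond for each 10-bit user_id value (shared by users with the same low 10 bits) → ample capacity
+ ```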
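
The bit layout above can be sketched as a small generator. This is an illustration, not the team's implementation: the custom epoch is an assumed value, and because the layout has no worker-ID segment, the sketch assumes a single generator process (several generators would need external coordination to avoid collisions).

```python
import threading
import time

EPOCH_MS = 1_640_995_200_000       # assumed custom epoch: 2022-01-01T00:00:00Z
USER_BITS = 10
SEQ_BITS = 12
USER_MASK = (1 << USER_BITS) - 1   # 0x3FF
SEQ_MASK = (1 << SEQ_BITS) - 1     # 0xFFF


class OrderIdGenerator:
    """41-bit millisecond timestamp | low 10 bits of user_id | 12-bit sequence."""

    def __init__(self, clock_ms=lambda: int(time.time() * 1000)):
        self._clock_ms = clock_ms
        self._lock = threading.Lock()
        self._last = {}  # user suffix -> (last timestamp, last sequence)

    def next_id(self, user_id):
        suffix = user_id & USER_MASK
        with self._lock:
            now = self._clock_ms()
            last_ts, last_seq = self._last.get(suffix, (-1, -1))
            if now < last_ts:
                raise RuntimeError("clock moved backwards")
            seq = last_seq + 1 if now == last_ts else 0
            if seq > SEQ_MASK:
                raise RuntimeError("sequence exhausted for this millisecond")
            self._last[suffix] = (now, seq)
        return ((now - EPOCH_MS) << (USER_BITS + SEQ_BITS)) | (suffix << SEQ_BITS) | seq


def shard_of(order_id):
    """Recover shard_id (user_id % 1024) from an order ID."""
    return (order_id >> SEQ_BITS) & USER_MASK
```

With `shard_of`, a lookup by order_id alone can be routed to a single shard instead of fanning out to all 1,024.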
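+
+ ### 4.5 Data Migration
+
+ ```
+ Phase 1: Build the new cluster (1 week)
+ - 4 MySQL instances, 32C 128G each
+ - Pre-create the 1024 shard tables
+
+ Phase 2: Full migration (3 days)
+ - In-house migration tool: read the old table by user_id → compute the shard → write to the new table
+ - 800 million rows, 16 parallel threads, 3 million rows per minute
+ - About 4.5 hours of total copy time
+
+ Phase 3: Incremental catch-up (2 days)
+ - Incremental sync based on binlog
+ - After catch-up, the lag between old and new data is < 1 second
+
+ Phase 4: Dual-write cutover (1 week)
+ - Switch reads first (gradual rollout 10% → 100%)
+ - Then switch writes (gradual rollout 1% → 100%)
+ - The reconciliation job runs continuously
+
+ Phase 5: Decommission the old database (2 weeks later)
+ - Decommission the old order_db after a stable observation period
+ ```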
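
The core step of the full migration (read a batch → compute the shard → write to the new table) and the copy-time estimate can be sketched as below. The migration tool itself is not shown in the source, so `plan_batch` and `full_copy_hours` are hypothetical helpers that only illustrate the grouping logic and the arithmetic.

```python
from collections import defaultdict


def plan_batch(rows, shard_count=1024, tables_per_db=256):
    """Group a batch of old-table rows by destination (db_id, table_id)."""
    plan = defaultdict(list)
    for row in rows:
        shard_id = row["user_id"] % shard_count
        plan[(shard_id // tables_per_db, shard_id % tables_per_db)].append(row)
    return dict(plan)


def full_copy_hours(total_rows, rows_per_minute):
    """Estimated wall-clock hours for the full copy at a steady throughput."""
    return total_rows / rows_per_minute / 60


# 800 million rows at 3 million rows/minute: about 4.4 hours,
# i.e. 3,125 rows per second for each of the 16 threads.
print(round(full_copy_hours(800_000_000, 3_000_000), 2))
print(3_000_000 / 16 / 60)
```

Grouping each batch by destination lets the tool issue one multi-row insert per target table instead of one insert per row.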
+
+ ### 4.6 Stage 3 Results
+
+ | Metric | Before sharding | After sharding |
+ |------|--------|--------|
+ | Rows per table | 800 million | 780,000 on average (1.2 million max) |
+ | Table size | 380GB | 370MB on average |
+ | Write QPS capacity | 4,500 | 20,000+ |
+ | Per-user query P99 | 1.2s | 15ms |
+ | Index size | 120GB | < 500MB per table (all in memory) |
+ | DDL execution time | 8 hours | < 3 seconds per table |
+
+ **Implementation time**: 6 weeks
+
+ ---
+
+ ## 5. Summary of the Three Stages
+
+ ### 5.1 Overall Comparison
+
+ | Metric | Early 2022 (single instance) | 2022 Q2 (read/write splitting) | 2023 Q4 (vertical split) | 2024 Q2 (horizontal sharding) |
+ |------|-------------------|---------------------|---------------------|---------------------|
+ | Supported QPS | 10,000 | 35,000 | 60,000 | 150,000+ |
+ | Database instances | 1 | 4 | 12 | 20 |
+ | Operational complexity | Low | Medium | Medium-high | High |
+ | Query P99 | 450ms | 85ms | 45ms | 15ms |
+ | Hardware cost per month (yuan) | 20,000 | 45,000 | 120,000 | 180,000 |
+ | Availability | 99.5% | 99.9% | 99.95% | 99.99% |
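+
+ ### 5.2 Cost-Benefit Analysis
+
+ ```
+ Database cost growth: 20,000 → 180,000 yuan per month (9x)
+ Business growth: daily active users 500,000 → 4 million (8x)
+ Monthly database cost per daily active user: 0.04 yuan → 0.045 yuan (roughly flat)
+ ```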
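
The per-user figures follow directly from the two totals. A short check of the arithmetic, using only the numbers quoted above (the function name is illustrative):

```python
def cost_per_dau(monthly_cost_yuan, daily_active_users):
    """Monthly database cost divided by daily active users."""
    return monthly_cost_yuan / daily_active_users


before = cost_per_dau(20_000, 500_000)       # 0.04 yuan
after = cost_per_dau(180_000, 4_000_000)     # 0.045 yuan
increase = after / before - 1                # 9x cost over 8x users = +12.5%
print(f"{before:.3f} -> {after:.3f} yuan ({increase:.1%} higher per user)")
```

Cost grew 9x while users grew 8x, so the per-user cost rose by 12.5%, which is what "roughly flat" refers to.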
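+
+ ---
+
+ ## 6. Lessons Learned
+
+ ### 6.1 What Went Right
+
+ 1. **Incremental evolution**: each step solved only the most pressing problem at the time, without over-engineering. Going straight to sharding at the start would have matched neither the team's capabilities nor the business scale
+ 2. **Zero data loss during migration**: all three migrations relied on dual writes + reconciliation to guarantee zero data loss
+ 3. **A carefully chosen shard key**: user_id as the shard key covers 95% of queries and avoids a large volume of cross-shard queries
+ 4. **Forward-looking ID design**: the custom Snowflake algorithm encodes routing information in the ID, which simplifies query routing
+
+ ### 6.2 What Went Wrong
+
+ 1. **Underestimated the cross-database JOIN rework**: estimated at 2 weeks, it took 4, because there were 60% more hidden cross-database dependencies than expected
+ 2. **Monitoring lagged behind**: unified cross-database monitoring was not set up promptly after the vertical split, which made troubleshooting less efficient
+ 3. **Missing operations tooling after sharding**: only after sharding did the team discover it lacked a batch DDL tool and a cross-shard query tool; building them ad hoc cost 2 weeks
+ 4. **Insufficient test coverage**: boundary cases after sharding (a shard key of 0, extreme values, NULL) were missed in initial testing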
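+
+ ### 6.3 Key Takeaways
+
+ - Database scaling is not done in one step; it should evolve incrementally, in pace with business growth
+ - Every architecture upgrade adds operational complexity, so operations tooling and team skills must improve alongside it
+ - Sharding is a last resort; before it, make full use of caching, read/write splitting, and vertical splitting
+ - Data migration is far riskier than code changes and requires solid reconciliation and rollback mechanisms
+
+ ---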
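+
+ ## Agent Checklist
+
+ When an AI Agent assists with a database scaling task, confirm each item below:
+
+ ### Read/Write Splitting
+ - [ ] **Replication lag**: Has the business impact of master-slave replication lag been assessed?
+ - [ ] **Routing policy**: Is it clear which queries go to the master and which go to the slaves?
+ - [ ] **Consistency**: Do read-after-write scenarios use master routing or a lag-aware mechanism?
+ - [ ] **Failover**: Are failed slaves removed automatically, and is there automatic failover when the master fails?
+
+ ### Vertical Split
+ - [ ] **Split boundaries**: Are database boundaries drawn clearly along business domains?
+ - [ ] **Cross-database dependencies**: Have all cross-database JOINs and transactions been identified and handled?
+ - [ ] **Data redundancy**: Are high-frequency cross-database queries resolved with redundant columns?
+ - [ ] **Distributed transactions**: Is there a consistency guarantee for cross-database writes?
+
+ ### Horizontal Sharding
+ - [ ] **Shard key**: Does the shard key cover most query conditions?
+ - [ ] **Data distribution**: Is data evenly distributed after sharding, with no hot shards?
+ - [ ] **Global ID**: Has a distributed unique ID scheme been designed?
+ - [ ] **Cross-shard queries**: Is there a plan for queries without the shard key (fan-out / wide table / ES)?
+ - [ ] **DDL tooling**: Is there a tool for running DDL in batches?
+ - [ ] **Scale-out plan**: Is there a plan for adding shards when the current count runs out (consistent hashing / virtual shards)?
+
+ ### General
+ - [ ] **Data migration**: Is there a complete migration plan with dual writes + reconciliation + rollback?
+ - [ ] **Load testing**: Were performance and capacity verified by load tests after migration?
+ - [ ] **Monitoring coverage**: Is there a unified cross-instance / cross-database monitoring dashboard?
+ - [ ] **Backup strategy**: Has the backup strategy been adjusted for the split / sharded layout?
+ - [ ] **Rollback plan**: Does each stage have explicit rollback steps and trigger conditions?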