@umacloud/knowledge 1.0.1

Files changed (418)
  1. package/00-governance/governance-capabilities.md +557 -0
  2. package/00-governance/knowledge-map.md +39 -0
  3. package/00-governance/maintenance-policy.md +76 -0
  4. package/00-governance/review-checklist.md +81 -0
  5. package/README.md +13 -0
  6. package/ai/01-standards/agent-development-complete.md +691 -0
  7. package/ai/01-standards/llm-application-complete.md +488 -0
  8. package/ai/01-standards/mlops-complete.md +798 -0
  9. package/ai/01-standards/prompt-engineering-complete.md +646 -0
  10. package/ai/01-standards/rag-architecture-complete.md +649 -0
  11. package/ai/02-playbooks/llm-evaluation-playbook.md +847 -0
  12. package/ai/03-checklists/ai-project-checklist.md +215 -0
  13. package/ai/04-antipatterns/ai-antipatterns.md +661 -0
  14. package/ai/05-cases/case-rag-production.md +147 -0
  15. package/ai/06-glossary/ai-glossary.md +162 -0
  16. package/ai/agent-evaluation-benchmark.md +53 -0
  17. package/ai/ai-agent-memory-context-management.md +41 -0
  18. package/ai/ai-cost-capacity-optimization-playbook.md +42 -0
  19. package/ai/ai-data-security-and-compliance-playbook.md +37 -0
  20. package/ai/ai-domain-index-and-checklist.md +40 -0
  21. package/ai/ai-governance-maturity-model.md +50 -0
  22. package/ai/ai-model-selection-and-routing-strategy.md +47 -0
  23. package/ai/ai-observability-and-oncall-runbook.md +52 -0
  24. package/ai/ai-rag-engineering-playbook.md +42 -0
  25. package/ai/ai-red-team-and-safety-evaluation.md +42 -0
  26. package/ai/ai-release-readiness-and-rollback-gate.md +42 -0
  27. package/ai/llm-agent-engineering-deep-dive.md +57 -0
  28. package/ai/prompt-and-tool-guardrails.md +52 -0
  29. package/api/01-standards/enterprise-api-standards.md +198 -0
  30. package/api/01-standards/rest-api-design-guide.md +63 -0
  31. package/api/02-playbooks/api-pagination-playbook.md +93 -0
  32. package/api/02-playbooks/graphql-production-playbook.md +176 -0
  33. package/api/03-checklists/api-review-checklist.md +55 -0
  34. package/api/04-antipatterns/api-antipatterns.md +112 -0
  35. package/architecture/01-standards/api-gateway-patterns.md +496 -0
  36. package/architecture/01-standards/cloud-native-patterns.md +644 -0
  37. package/architecture/01-standards/distributed-systems-patterns.md +591 -0
  38. package/architecture/01-standards/event-driven-architecture.md +595 -0
  39. package/architecture/01-standards/microservices-patterns-complete.md +968 -0
  40. package/architecture/01-standards/microservices-patterns.md +495 -0
  41. package/architecture/01-standards/system-design-interview.md +664 -0
  42. package/architecture/02-playbooks/microservices-patterns-playbook.md +137 -0
  43. package/architecture/02-playbooks/migration-playbook.md +780 -0
  44. package/architecture/02-playbooks/system-design-playbook.md +779 -0
  45. package/architecture/03-checklists/architecture-decision-checklist.md +297 -0
  46. package/architecture/04-antipatterns/architecture-antipatterns.md +417 -0
  47. package/architecture/05-cases/case-netflix-microservices.md +413 -0
  48. package/architecture/06-glossary/architecture-glossary.md +164 -0
  49. package/architecture/adr-template-and-examples.md +38 -0
  50. package/architecture/api-gateway-deep-dive.md +1291 -0
  51. package/architecture/configuration-management.md +1162 -0
  52. package/architecture/distributed-transactions.md +1220 -0
  53. package/architecture/microservices-complete.md +735 -0
  54. package/architecture/resilience-and-disaster-patterns.md +37 -0
  55. package/architecture/service-governance.md +1198 -0
  56. package/architecture/system-architecture-deep-dive.md +37 -0
  57. package/backend/01-standards/analytics-and-growth.md +65 -0
  58. package/backend/01-standards/api-and-error-conventions.md +120 -0
  59. package/backend/01-standards/application-layering-and-packaging.md +160 -0
  60. package/backend/01-standards/auth-implementation.md +104 -0
  61. package/backend/01-standards/backend-framework-idioms.md +74 -0
  62. package/backend/01-standards/background-jobs-and-async.md +66 -0
  63. package/backend/01-standards/caching-strategies-complete.md +390 -0
  64. package/backend/01-standards/config-and-observability.md +77 -0
  65. package/backend/01-standards/data-modeling-and-persistence.md +94 -0
  66. package/backend/01-standards/django-complete.md +1765 -0
  67. package/backend/01-standards/email-and-notifications.md +64 -0
  68. package/backend/01-standards/fastapi-complete.md +925 -0
  69. package/backend/01-standards/file-upload-and-storage.md +66 -0
  70. package/backend/01-standards/graphql-api-complete.md +416 -0
  71. package/backend/01-standards/llm-application-standard.md +78 -0
  72. package/backend/01-standards/message-queue-patterns.md +379 -0
  73. package/backend/01-standards/microservices-and-distributed.md +78 -0
  74. package/backend/01-standards/nestjs-complete.md +2167 -0
  75. package/backend/01-standards/payment-integration.md +80 -0
  76. package/backend/01-standards/rate-limiting-complete.md +451 -0
  77. package/backend/01-standards/realtime-and-websocket.md +65 -0
  78. package/backend/01-standards/search-and-filtering.md +64 -0
  79. package/backend/01-standards/spring-boot-complete.md +445 -0
  80. package/backend/02-playbooks/api-design-playbook.md +718 -0
  81. package/backend/02-playbooks/email-send-playbook.md +130 -0
  82. package/backend/02-playbooks/file-upload-s3-playbook.md +153 -0
  83. package/backend/02-playbooks/typescript-enterprise-playbook.md +133 -0
  84. package/backend/02-playbooks/websocket-realtime-playbook.md +154 -0
  85. package/backend/03-checklists/api-launch-checklist.md +189 -0
  86. package/backend/04-antipatterns/backend-antipatterns.md +1051 -0
  87. package/blockchain/01-standards/blockchain-basics.md +557 -0
  88. package/blockchain/01-standards/smart-contract-development.md +1315 -0
  89. package/cicd/01-standards/deployment-and-delivery-standard.md +96 -0
  90. package/cicd/01-standards/github-actions-complete.md +473 -0
  91. package/cicd/01-standards/release-and-store-submission.md +75 -0
  92. package/cicd/02-playbooks/cicd-pipeline-playbook.md +144 -0
  93. package/cicd/02-playbooks/release-management-playbook.md +605 -0
  94. package/cicd/03-checklists/pipeline-security-checklist.md +168 -0
  95. package/cicd/04-antipatterns/cicd-antipatterns.md +589 -0
  96. package/cicd/05-cases/case-deployment-automation.md +221 -0
  97. package/cicd/05-cases/case-gitops-transformation.md +212 -0
  98. package/cicd/06-glossary/cicd-glossary.md +114 -0
  99. package/cicd/cicd-blueprint-deep-dive.md +38 -0
  100. package/cicd/release-readiness-gate.md +37 -0
  101. package/cloud-native/01-standards/container-security.md +741 -0
  102. package/cloud-native/01-standards/kubernetes-complete.md +812 -0
  103. package/cloud-native/02-playbooks/api-gateway-playbook.md +155 -0
  104. package/cloud-native/02-playbooks/gitops-with-argocd.md +760 -0
  105. package/cloud-native/02-playbooks/k8s-troubleshooting-playbook.md +1942 -0
  106. package/cloud-native/02-playbooks/message-queue-playbook.md +129 -0
  107. package/cloud-native/02-playbooks/multicloud-governance.md +726 -0
  108. package/cloud-native/02-playbooks/serverless-patterns.md +788 -0
  109. package/cloud-native/02-playbooks/service-mesh-playbook.md +612 -0
  110. package/cloud-native/02-playbooks/terraform-iac-playbook.md +143 -0
  111. package/cloud-native/03-checklists/container-security-checklist.md +431 -0
  112. package/cloud-native/03-checklists/k8s-production-readiness-checklist.md +460 -0
  113. package/cloud-native/04-antipatterns/container-antipatterns.md +660 -0
  114. package/cloud-native/04-antipatterns/k8s-antipatterns.md +743 -0
  115. package/cloud-native/05-cases/case-k8s-migration.md +478 -0
  116. package/cloud-native/05-cases/case-k8s-scaling.md +642 -0
  117. package/cloud-native/05-cases/case-k8s-security-incident.md +397 -0
  118. package/cloud-native/06-glossary/cloud-native-glossary.md +337 -0
  119. package/cross-platform/01-standards/cross-platform-frameworks.md +83 -0
  120. package/cross-platform/01-standards/platform-selection-and-architecture.md +77 -0
  121. package/data/01-standards/elasticsearch-complete.md +2098 -0
  122. package/data/01-standards/postgresql-complete.md +1613 -0
  123. package/data/01-standards/redis-complete.md +1527 -0
  124. package/data/02-playbooks/database-optimization-playbook.md +403 -0
  125. package/data/02-playbooks/elasticsearch-production-playbook.md +132 -0
  126. package/data/03-checklists/database-launch-checklist.md +187 -0
  127. package/data/04-antipatterns/database-antipatterns.md +873 -0
  128. package/data/05-cases/case-database-migration.md +310 -0
  129. package/data/06-glossary/database-glossary.md +440 -0
  130. package/data/data-governance-and-modeling-deep-dive.md +39 -0
  131. package/data-engineering/01-standards/airflow-complete.md +523 -0
  132. package/data-engineering/01-standards/kafka-complete.md +1521 -0
  133. package/data-engineering/02-playbooks/spark-etl-playbook.md +496 -0
  134. package/data-engineering/03-checklists/pipeline-launch-checklist.md +194 -0
  135. package/data-engineering/04-antipatterns/data-pipeline-antipatterns.md +684 -0
  136. package/data-engineering/05-cases/case-real-time-pipeline.md +355 -0
  137. package/data-engineering/06-glossary/data-engineering-glossary.md +429 -0
  138. package/database/01-standards/database-schema-standards.md +147 -0
  139. package/database/02-playbooks/postgresql-optimization-quick.md +52 -0
  140. package/database/02-playbooks/postgresql-performance-optimization.md +58 -0
  141. package/database/02-playbooks/postgresql-production-playbook.md +146 -0
  142. package/database/02-playbooks/redis-caching-playbook.md +117 -0
  143. package/database/03-checklists/database-review-checklist.md +50 -0
  144. package/database/04-antipatterns/database-antipatterns.md +112 -0
  145. package/design/01-standards/ui-design-system-complete.md +423 -0
  146. package/design/02-playbooks/design-handoff-playbook.md +254 -0
  147. package/design/02-playbooks/design-review-playbook.md +388 -0
  148. package/design/03-checklists/design-review-checklist.md +246 -0
  149. package/design/04-antipatterns/design-antipatterns.md +378 -0
  150. package/design/05-cases/case-design-system-adoption.md +328 -0
  151. package/design/06-glossary/design-glossary.md +329 -0
  152. package/design/ui-full-lifecycle-cross-platform-playbook.md +571 -0
  153. package/design/ux-system-deep-dive.md +38 -0
  154. package/design-systems/00-craft-rules.md +71 -0
  155. package/design-systems/aesthetic-families.md +43 -0
  156. package/design-systems/anti-ai-slop.md +162 -0
  157. package/design-systems/bold-geometric.md +120 -0
  158. package/design-systems/brutalist-bold.md +103 -0
  159. package/design-systems/editorial-clean.md +109 -0
  160. package/design-systems/glass-aurora.md +108 -0
  161. package/design-systems/modern-minimal.md +145 -0
  162. package/design-systems/premium-luxury.md +106 -0
  163. package/design-systems/product-type-design-map.md +48 -0
  164. package/design-systems/soft-warm.md +123 -0
  165. package/design-systems/tech-utility.md +113 -0
  166. package/desktop/01-standards/desktop-app-standard.md +72 -0
  167. package/desktop/01-standards/desktop-design.md +71 -0
  168. package/development/00-governance/document-template.md +41 -0
  169. package/development/01-standards/api-versioning-strategies.md +432 -0
  170. package/development/01-standards/authentication-patterns-complete.md +479 -0
  171. package/development/01-standards/css-architecture-complete.md +550 -0
  172. package/development/01-standards/database-migration-strategies.md +484 -0
  173. package/development/01-standards/elasticsearch-complete.md +347 -0
  174. package/development/01-standards/git-complete.md +371 -0
  175. package/development/01-standards/golang-complete.md +1565 -0
  176. package/development/01-standards/graphql-complete.md +298 -0
  177. package/development/01-standards/javascript-bundlers-complete.md +469 -0
  178. package/development/01-standards/javascript-typescript-complete.md +528 -0
  179. package/development/01-standards/jest-complete.md +275 -0
  180. package/development/01-standards/linux-complete.md +234 -0
  181. package/development/01-standards/logging-observability-complete.md +526 -0
  182. package/development/01-standards/microservices-communication.md +502 -0
  183. package/development/01-standards/mongodb-complete.md +406 -0
  184. package/development/01-standards/oauth2-complete.md +285 -0
  185. package/development/01-standards/performance-optimization-complete.md +289 -0
  186. package/development/01-standards/playwright-complete.md +247 -0
  187. package/development/01-standards/postgresql-complete.md +456 -0
  188. package/development/01-standards/pytest-complete.md +340 -0
  189. package/development/01-standards/python-async-programming.md +902 -0
  190. package/development/01-standards/python-complete.md +956 -0
  191. package/development/01-standards/python-decorators-complete.md +799 -0
  192. package/development/01-standards/python-design-patterns.md +2854 -0
  193. package/development/01-standards/python-packaging-distribution.md +420 -0
  194. package/development/01-standards/python-testing-strategies.md +607 -0
  195. package/development/01-standards/python-web-frameworks-comparison.md +471 -0
  196. package/development/01-standards/redis-complete.md +317 -0
  197. package/development/01-standards/rest-api-complete.md +316 -0
  198. package/development/01-standards/rust-complete.md +578 -0
  199. package/development/01-standards/typescript-advanced-types.md +1513 -0
  200. package/development/01-standards/web-security-complete.md +292 -0
  201. package/development/02-playbooks/api-design-playbook.md +810 -0
  202. package/development/02-playbooks/database-migration-playbook.md +580 -0
  203. package/development/02-playbooks/debugging-playbook.md +692 -0
  204. package/development/02-playbooks/feature-delivery-playbook.md +430 -0
  205. package/development/02-playbooks/incident-hotfix-playbook.md +387 -0
  206. package/development/02-playbooks/performance-optimization-playbook.md +531 -0
  207. package/development/02-playbooks/performance-tuning-playbook.md +652 -0
  208. package/development/02-playbooks/refactor-playbook.md +403 -0
  209. package/development/02-playbooks/release-playbook.md +469 -0
  210. package/development/03-checklists/architecture-review-checklist.md +168 -0
  211. package/development/03-checklists/data-migration-checklist.md +157 -0
  212. package/development/03-checklists/oncall-handover-checklist.md +173 -0
  213. package/development/03-checklists/pr-checklist.md +158 -0
  214. package/development/03-checklists/production-readiness-checklist.md +190 -0
  215. package/development/03-checklists/release-readiness-checklist.md +154 -0
  216. package/development/03-checklists/security-review-checklist.md +182 -0
  217. package/development/04-antipatterns/api-antipatterns.md +657 -0
  218. package/development/04-antipatterns/architecture-antipatterns.md +686 -0
  219. package/development/04-antipatterns/backend-antipatterns.md +648 -0
  220. package/development/04-antipatterns/cicd-antipatterns.md +540 -0
  221. package/development/04-antipatterns/code-smell-antipatterns.md +571 -0
  222. package/development/04-antipatterns/data-antipatterns.md +658 -0
  223. package/development/04-antipatterns/database-antipatterns.md +578 -0
  224. package/development/04-antipatterns/frontend-antipatterns.md +635 -0
  225. package/development/04-antipatterns/reliability-antipatterns.md +700 -0
  226. package/development/04-antipatterns/security-antipatterns.md +747 -0
  227. package/development/05-cases/case-api-version-migration.md +428 -0
  228. package/development/05-cases/case-authorization-hardening.md +383 -0
  229. package/development/05-cases/case-bluegreen-rollback.md +466 -0
  230. package/development/05-cases/case-cache-snowball-protection.md +485 -0
  231. package/development/05-cases/case-ci-cd-pipeline.md +544 -0
  232. package/development/05-cases/case-database-scaling.md +500 -0
  233. package/development/05-cases/case-db-hotspot-optimization.md +487 -0
  234. package/development/05-cases/case-incident-mttr-reduction.md +563 -0
  235. package/development/05-cases/case-microservice-migration.md +375 -0
  236. package/development/05-cases/case-performance-optimization.md +406 -0
  237. package/development/05-cases/case-security-incident-response.md +345 -0
  238. package/development/06-glossary/full-stack-glossary.md +166 -0
  239. package/development/09-maturity/quarterly-audit-template.md +35 -0
  240. package/development/11-ui-excellence/ui-aesthetic-system.md +41 -0
  241. package/development/11-ui-excellence/ui-engineering-excellence.md +435 -0
  242. package/development/12-scenarios/development-scenarios-guide.md +565 -0
  243. package/development/13-implementation-assets/implementation-toolkit.md +282 -0
  244. package/development/13-implementation-assets/knowledge-gates-execution.md +43 -0
  245. package/development/14-full-lifecycle/software-lifecycle-gates.md +511 -0
  246. package/development/15-lifecycle-templates/project-templates-collection.md +791 -0
  247. package/development/api-contract-and-versioning-guide.md +36 -0
  248. package/development/api-governance-complete.md +43 -0
  249. package/development/backend-engineering-complete.md +43 -0
  250. package/development/code-review-quality-complete.md +43 -0
  251. package/development/concurrency-reliability-complete.md +43 -0
  252. package/development/database-engineering-complete.md +43 -0
  253. package/development/engineering-effectiveness-complete.md +43 -0
  254. package/development/engineering-standards-deep-dive.md +38 -0
  255. package/development/frontend-engineering-complete.md +43 -0
  256. package/development/performance-capacity-complete.md +43 -0
  257. package/development/refactor-migration-complete.md +42 -0
  258. package/development/refactoring-and-techdebt-playbook.md +37 -0
  259. package/development/security-in-development-complete.md +43 -0
  260. package/devops/01-standards/cicd-pipeline-complete.md +262 -0
  261. package/devops/01-standards/docker-complete.md +1490 -0
  262. package/devops/01-standards/github-actions-complete.md +337 -0
  263. package/devops/01-standards/kubernetes-complete.md +638 -0
  264. package/devops/01-standards/terraform-complete.md +2117 -0
  265. package/devops/02-playbooks/docker-compose-playbook.md +233 -0
  266. package/devops/02-playbooks/docker-k8s-production-playbook.md +186 -0
  267. package/devops/02-playbooks/docker-production-playbook.md +952 -0
  268. package/edge-iot/01-standards/edge-iot-complete.md +473 -0
  269. package/experts/architect/api-design.md +178 -0
  270. package/experts/architect/methodology.md +124 -0
  271. package/experts/architect/security.md +75 -0
  272. package/experts/backend-lead/methodology.md +216 -0
  273. package/experts/devops/methodology.md +160 -0
  274. package/experts/frontend-lead/methodology.md +178 -0
  275. package/experts/product-manager/industry/ecommerce.md +43 -0
  276. package/experts/product-manager/industry/saas.md +40 -0
  277. package/experts/product-manager/methodology.md +97 -0
  278. package/experts/qa-lead/methodology.md +123 -0
  279. package/experts/qa-lead/test-strategy.md +128 -0
  280. package/experts/uiux-designer/methodology.md +125 -0
  281. package/frontend/01-standards/accessibility-complete.md +532 -0
  282. package/frontend/01-standards/accessibility-standard.md +74 -0
  283. package/frontend/01-standards/admin-dashboard-and-crud.md +72 -0
  284. package/frontend/01-standards/design-tokens-complete.md +444 -0
  285. package/frontend/01-standards/forms-and-validation.md +77 -0
  286. package/frontend/01-standards/frontend-architecture-and-layering.md +119 -0
  287. package/frontend/01-standards/i18n-and-localization.md +65 -0
  288. package/frontend/01-standards/nextjs-complete.md +451 -0
  289. package/frontend/01-standards/react-complete.md +713 -0
  290. package/frontend/01-standards/react-hooks-complete-guide.md +1100 -0
  291. package/frontend/01-standards/react-hooks-complete.md +1171 -0
  292. package/frontend/01-standards/seo-and-web-vitals.md +77 -0
  293. package/frontend/01-standards/state-management-complete.md +444 -0
  294. package/frontend/01-standards/vue-complete.md +499 -0
  295. package/frontend/01-standards/vue3-complete.md +2002 -0
  296. package/frontend/01-standards/web-framework-best-practices.md +64 -0
  297. package/frontend/01-standards/web-performance-complete.md +495 -0
  298. package/frontend/02-playbooks/accessibility-a11y-playbook.md +161 -0
  299. package/frontend/02-playbooks/frontend-performance-playbook.md +707 -0
  300. package/frontend/02-playbooks/i18n-internationalization-playbook.md +120 -0
  301. package/frontend/02-playbooks/performance-optimization-playbook.md +163 -0
  302. package/frontend/02-playbooks/react-nextjs-production-playbook.md +167 -0
  303. package/frontend/02-playbooks/react-state-management-playbook.md +173 -0
  304. package/frontend/03-checklists/component-quality-checklist.md +166 -0
  305. package/frontend/03-checklists/frontend-launch-checklist.md +299 -0
  306. package/frontend/04-antipatterns/frontend-antipatterns.md +886 -0
  307. package/frontend/05-cases/case-performance-optimization.md +274 -0
  308. package/harmony/01-standards/harmonyos-arkts-standard.md +75 -0
  309. package/harmony/01-standards/harmonyos-design.md +65 -0
  310. package/high-quality-engineering-playbook.md +54 -0
  311. package/incident/01-standards/incident-response-complete.md +303 -0
  312. package/incident/02-playbooks/chaos-engineering-playbook.md +883 -0
  313. package/incident/02-playbooks/postmortem-playbook.md +398 -0
  314. package/incident/03-checklists/incident-readiness-checklist.md +181 -0
  315. package/incident/04-antipatterns/incident-antipatterns.md +490 -0
  316. package/incident/05-cases/case-cascade-failure.md +176 -0
  317. package/incident/06-glossary/incident-glossary.md +114 -0
  318. package/incident/postmortem-and-response-deep-dive.md +39 -0
  319. package/industries/ecommerce/ecommerce-complete.md +631 -0
  320. package/industries/education/education-complete.md +555 -0
  321. package/industries/fintech/fintech-complete.md +501 -0
  322. package/industries/gaming/gaming-complete.md +587 -0
  323. package/industries/healthcare/healthcare-complete.md +452 -0
  324. package/low-code/01-standards/low-code-complete.md +944 -0
  325. package/miniprogram/01-standards/ai-common-mistakes.md +61 -0
  326. package/miniprogram/01-standards/miniprogram-custom-navbar-capsule.md +77 -0
  327. package/miniprogram/01-standards/miniprogram-design.md +61 -0
  328. package/miniprogram/01-standards/miniprogram-standard.md +81 -0
  329. package/mobile/01-standards/android-material-design.md +70 -0
  330. package/mobile/01-standards/flutter-complete.md +384 -0
  331. package/mobile/01-standards/ios-design-hig.md +78 -0
  332. package/mobile/01-standards/mobile-app-standard.md +85 -0
  333. package/mobile/01-standards/react-native-complete.md +352 -0
  334. package/mobile/02-playbooks/mobile-cross-platform-playbook.md +175 -0
  335. package/mobile/02-playbooks/mobile-performance.md +473 -0
  336. package/mobile/03-checklists/mobile-release-checklist.md +234 -0
  337. package/mobile/04-antipatterns/mobile-antipatterns.md +798 -0
  338. package/mobile/05-cases/case-app-performance.md +500 -0
  339. package/mobile/05-cases/case-app-startup-optimization.md +218 -0
  340. package/mobile/06-glossary/mobile-glossary.md +484 -0
  341. package/observability/01-standards/observability-standards.md +103 -0
  342. package/observability/02-playbooks/prometheus-grafana-playbook.md +135 -0
  343. package/observability/02-playbooks/structured-logging-playbook.md +73 -0
  344. package/observability/03-checklists/observability-checklist.md +54 -0
  345. package/observability/04-antipatterns/observability-antipatterns.md +106 -0
  346. package/operations/01-standards/prometheus-monitoring-complete.md +1578 -0
  347. package/operations/02-playbooks/capacity-planning-playbook.md +620 -0
  348. package/operations/03-checklists/production-launch-checklist.md +365 -0
  349. package/operations/04-antipatterns/operations-antipatterns.md +664 -0
  350. package/operations/05-cases/case-sre-practices.md +581 -0
  351. package/operations/06-glossary/operations-glossary.md +120 -0
  352. package/operations/aiops-anomaly-detection.md +758 -0
  353. package/operations/capacity-planning.md +1061 -0
  354. package/operations/chaos-engineering.md +659 -0
  355. package/operations/incident-command-system.md +38 -0
  356. package/operations/observability-complete.md +442 -0
  357. package/operations/slo-sli-playbook.md +517 -0
  358. package/operations/sre-operations-deep-dive.md +39 -0
  359. package/package.json +8 -0
  360. package/performance/01-standards/performance-and-scalability.md +80 -0
  361. package/performance/01-standards/performance-standards.md +156 -0
  362. package/performance/02-playbooks/query-optimization-playbook.md +103 -0
  363. package/performance/03-checklists/performance-checklist.md +56 -0
  364. package/performance/04-antipatterns/performance-antipatterns.md +146 -0
  365. package/product/01-standards/product-management-complete.md +285 -0
  366. package/product/02-playbooks/feature-launch-playbook.md +207 -0
  367. package/product/02-playbooks/user-research-playbook.md +532 -0
  368. package/product/03-checklists/feature-launch-checklist.md +275 -0
  369. package/product/04-antipatterns/product-antipatterns.md +355 -0
  370. package/product/05-cases/case-mvp-to-scale.md +384 -0
  371. package/product/06-glossary/product-glossary.md +462 -0
  372. package/product/feature-prioritization-framework.md +40 -0
  373. package/product/kpi-and-metric-tree.md +37 -0
  374. package/product/product-discovery-and-prd-deep-dive.md +41 -0
  375. package/quantum/01-standards/quantum-complete.md +1186 -0
  376. package/security/01-standards/api-security-complete.md +511 -0
  377. package/security/01-standards/container-runtime-security.md +574 -0
  378. package/security/01-standards/data-protection-gdpr.md +543 -0
  379. package/security/01-standards/owasp-top10-complete.md +1890 -0
  380. package/security/01-standards/secure-coding-baseline.md +90 -0
  381. package/security/01-standards/supply-chain-security.md +441 -0
  382. package/security/01-standards/web-security-checklist.md +108 -0
  383. package/security/01-standards/zero-trust-architecture.md +521 -0
  384. package/security/02-playbooks/auth-sso-playbook.md +166 -0
  385. package/security/02-playbooks/incident-response-security-playbook.md +588 -0
  386. package/security/02-playbooks/owasp-api-security-playbook.md +129 -0
  387. package/security/02-playbooks/payment-integration-playbook.md +119 -0
  388. package/security/02-playbooks/penetration-testing-playbook.md +517 -0
  389. package/security/03-checklists/security-audit-checklist.md +356 -0
  390. package/security/04-antipatterns/security-coding-antipatterns.md +580 -0
  391. package/security/05-cases/case-log4shell-incident.md +537 -0
  392. package/security/05-cases/case-major-breaches.md +468 -0
  393. package/security/06-glossary/security-glossary.md +212 -0
  394. package/security/compliance-automation.md +993 -0
  395. package/security/container-security.md +680 -0
  396. package/security/devsecops-complete.md +426 -0
  397. package/security/sast-dast-sca.md +775 -0
  398. package/security/secrets-management.md +594 -0
  399. package/security/security-architecture-deep-dive.md +37 -0
  400. package/security/threat-modeling-stride-playbook.md +40 -0
  401. package/seed-templates/auth-system.md +59 -0
  402. package/seed-templates/blog-content.md +94 -0
  403. package/seed-templates/dashboard.md +89 -0
  404. package/seed-templates/docs-site.md +73 -0
  405. package/seed-templates/e-commerce.md +50 -0
  406. package/seed-templates/saas-landing.md +92 -0
  407. package/seed-templates/settings-page.md +51 -0
  408. package/testing/01-standards/test-strategy-and-layering.md +83 -0
  409. package/testing/01-standards/testing-strategy-complete.md +422 -0
  410. package/testing/01-standards/unit-testing-best-practices.md +118 -0
  411. package/testing/02-playbooks/e2e-testing-playbook.md +988 -0
  412. package/testing/02-playbooks/testing-strategy-playbook.md +126 -0
  413. package/testing/03-checklists/test-strategy-checklist.md +208 -0
  414. package/testing/04-antipatterns/testing-antipatterns.md +718 -0
  415. package/testing/05-cases/case-testing-transformation.md +300 -0
  416. package/testing/06-glossary/testing-glossary.md +110 -0
  417. package/testing/risk-based-test-matrix.md +36 -0
  418. package/testing/testing-strategy-deep-dive.md +37 -0
@@ -0,0 +1,485 @@
+ ---
+ id: case-cache-snowball-protection
+ title: "Case Study: Building a Cache Avalanche Protection System"
+ domain: development
+ category: 05-cases
+ difficulty: intermediate
+ tags: [agent, cache, case, checklist, development, protection, snowball, metadata]
+ quality_score: 70
+ last_updated: 2026-06-15
+ ---
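+ # Case Study: Building a Cache Avalanche Protection System
+
+ ## Metadata
+
+ | Field | Value |
+ |------|------|
+ | Industry | Live-streaming e-commerce platform |
+ | System scale | 6 million DAU, peak QPS 50,000 |
+ | Tech stack | Go + Redis Cluster + MySQL + Elasticsearch |
+ | Cache scale | Redis Cluster with 6 primaries and 6 replicas, 192 GB memory |
+ | Team size | 20 backend engineers, 4 SREs |
+ | Project duration | 5 weeks (2024-03 to 2024-04) |
+ | Trigger event | A cache avalanche that took the entire site down for 47 minutes |
+
+ ---
+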
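+ ## 1. Background
+
+ ### 1.1 Incident Recap
+
+ On March 8, 2024 (the March 8 Women's Day promotion), a severe site-wide outage began at 20:15:
+
+ ```
+ Timeline:
+ 20:00 Promotional live rooms go on air; traffic surges to 48,000 QPS
+ 20:12 A keys * command is run on one Redis Cluster node (operator error),
+ blocking that node for 8 seconds
+ 20:13 Requests for the 12,000 hot keys on that node all time out at once
+ 20:14 Clients retry → load on the Redis node spikes → requests pile up
+ 20:15 Large volumes of requests fall through to MySQL → MySQL CPU at 99% → slow queries pile up
+ 20:16 MySQL connection pool exhausted → all site APIs return 503
+ 20:17 Alerts fire; SRE steps in
+ 20:32 MySQL rate limiting applied + Redis node restarted
+ 20:45 Hot data warmed up manually
+ 21:02 Site fully recovered
+
+ Outage duration: 47 minutes
+ Business impact:
+ - Live rooms cut off; hosts and viewers unable to interact
+ - Estimated order loss of 2.8 million yuan
+ - Promotion results severely reduced
+ ```
+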
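+ ### 1.2 Root Cause Analysis
+
+ ```
+ Direct cause:
+ An operator ran the keys * command on a production Redis node
+
+ Underlying causes:
+ 1. No protection against cache avalanches
+ 2. Hot keys concentrated on a small number of Redis nodes
+ 3. No rate limiting after cache failure; requests fell straight through to MySQL
+ 4. No local cache as a fallback
+ 5. No timeout control on Redis access; blocking propagated to the application layer
+ 6. Insufficient operator permission controls; dangerous commands could be run in production
+ ```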
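Underlying cause 5 is the easiest one to picture in code. The sketch below is a hypothetical illustration, not the team's implementation: `getWithTimeout`, `slowStore`, and `demo` are invented names, and the 50 ms budget is an assumed value. It bounds a cache read with `context.WithTimeout`, so a node blocked for 8 seconds costs the caller at most the budget instead of stalling the request.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fetchFunc stands in for any cache read, such as a Redis GET (hypothetical type).
type fetchFunc func(ctx context.Context, key string) (string, error)

// getWithTimeout bounds a single cache read with a deadline so that a blocked
// cache node cannot stall the caller for longer than budget.
func getWithTimeout(parent context.Context, fetch fetchFunc, key string, budget time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(parent, budget)
	defer cancel()

	type result struct {
		val string
		err error
	}
	ch := make(chan result, 1) // buffered so the goroutine can always finish
	go func() {
		v, err := fetch(ctx, key)
		ch <- result{v, err}
	}()

	select {
	case r := <-ch:
		return r.val, r.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// slowStore simulates a node that is blocked for the given duration.
func slowStore(block time.Duration) fetchFunc {
	return func(ctx context.Context, key string) (string, error) {
		select {
		case <-time.After(block):
			return "value-of-" + key, nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// demo runs one read against a simulated node and reports the value and
// whether the read hit the deadline.
func demo(block, budget time.Duration) (string, bool) {
	v, err := getWithTimeout(context.Background(), slowStore(block), "product:detail:1", budget)
	return v, errors.Is(err, context.DeadlineExceeded)
}

func main() {
	_, timedOut := demo(8*time.Second, 50*time.Millisecond)
	fmt.Println("blocked node timed out:", timedOut)

	v, timedOut := demo(0, 50*time.Millisecond)
	fmt.Println(v, timedOut)
}
```

A timeout only caps the damage on the calling side. What the caller does after the deadline (serve a local copy, degrade, or shed load) is a separate decision.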
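+
+ ### 1.3 Existing Cache Architecture
+
+ ```
+ Cache architecture before the rework (single tier):
+
+ App → Redis Cluster (the only cache tier) → MySQL
+
+ Problems:
+ 1. All cache keys use a fixed TTL (1 hour), so large numbers of keys expire at the same moment
+ 2. Hot product and live-room data is concentrated on the same Redis node (hotspot skew)
+ 3. Cache misses query MySQL directly, with no mitigation at all
+ 4. With a single cache tier, a Redis failure means a site-wide failure
+ ```
+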
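+ ---
+
+ ## 2. Challenges
+
+ ### 2.1 Technical Challenges
+
+ 1. **Extremely high traffic peaks**: QPS exceeds 50,000 during promotions, so no protection mechanism may add significant latency
+ 2. **Unpredictable hotspots**: in live-commerce scenarios, the popularity of a best-selling product changes within seconds
+ 3. **Data freshness requirements**: product prices and stock must be real-time (cache TTLs cannot be too long)
+ 4. **Redis Cluster constraints**: simply adding more replicas is not an option (cost and consistency complexity)
+
+ ### 2.2 Business Constraints
+
+ 1. Hardening must be finished before the promotion season ends (a 5-week window)
+ 2. The rework must not affect existing business
+ 3. Live commerce has an unusual read/write pattern: extremely high read QPS over short periods plus bursty write QPS
+
+ ---
+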
102
+ ## 三、方案设计
103
+
104
+ ### 3.1 多级缓存架构
105
+
106
+ ```
107
+ 改造后的缓存架构(三级):
108
+
109
+ ┌──────────┐
110
+ │ Client │
111
+ └────┬─────┘
112
+
113
+ ┌────┴─────┐
114
+ │ API Layer│
115
+ └────┬─────┘
116
+
117
+ ┌──────────┴──────────┐
118
+ │ L1: Local Cache │ ← 进程内缓存
119
+ │ (Caffeine/Ristretto)│ TTL: 5-30s
120
+ │ 命中率: ~60% │ 容量: 每实例 256MB
121
+ └──────────┬──────────┘
122
+ │ 未命中
123
+ ┌──────────┴──────────┐
124
+ │ L2: Redis Cluster │ ← 分布式缓存
125
+ │ 命中率: ~35% │ TTL: 5min-1h (随机抖动)
126
+ └──────────┬──────────┘
127
+ │ 未命中
128
+ ┌──────────┴──────────┐
129
+ │ L3: MySQL + ES │ ← 数据源
130
+ │ 命中率: ~5% │
131
+ └─────────────────────┘
132
+ ```
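
The read path in the diagram can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the `Tier` interface, `MapTier`, `ErrMiss`, and `Lookup` are hypothetical names, the in-memory tier has no TTL, and any L2 error is treated like a miss.

```go
package main

import "errors"

// ErrMiss signals that a tier does not hold the key (hypothetical sentinel).
var ErrMiss = errors.New("cache miss")

// Tier is a hypothetical read/write interface shared by L1 and L2.
type Tier interface {
	Get(key string) (string, error)
	Set(key, value string)
}

// MapTier is an in-memory stand-in for either tier (no TTL, not goroutine-safe).
type MapTier map[string]string

func (m MapTier) Get(key string) (string, error) {
	if v, ok := m[key]; ok {
		return v, nil
	}
	return "", ErrMiss
}

func (m MapTier) Set(key, value string) { m[key] = value }

// Lookup walks L1 -> L2 -> source and backfills every tier that missed.
// It returns the value and the name of the layer that served it.
func Lookup(l1, l2 Tier, source func(string) (string, error), key string) (string, string, error) {
	if v, err := l1.Get(key); err == nil {
		return v, "L1", nil
	}
	if v, err := l2.Get(key); err == nil {
		l1.Set(key, v) // promote into the local tier
		return v, "L2", nil
	}
	v, err := source(key)
	if err != nil {
		return "", "", err
	}
	l2.Set(key, v)
	l1.Set(key, v)
	return v, "source", nil
}
```

Backfilling on the way out is what lets the next read stop at L1; a production version would also attach the per-tier TTLs shown in the diagram.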
+
+ ### 3.2 Six Protection Strategies
+
+ #### Strategy 1: Randomized TTL Jitter
+
+ ```go
+ import (
+ 	"context"
+ 	"math/rand"
+ 	"time"
+ )
+
+ // SetWithJitter keeps large numbers of keys from expiring at the same moment.
+ // rdb is the package-level go-redis client.
+ func SetWithJitter(ctx context.Context, key string, value interface{}, baseTTL time.Duration) error {
+ 	// Add 0-20% random jitter on top of the base TTL.
+ 	actualTTL := baseTTL
+ 	if maxJitter := int64(baseTTL) / 5; maxJitter > 0 { // rand.Int63n panics when n <= 0
+ 		actualTTL += time.Duration(rand.Int63n(maxJitter))
+ 	}
+ 	return rdb.Set(ctx, key, value, actualTTL).Err()
+ }
+
+ // TTL policy per data type
+ var ttlConfig = map[string]time.Duration{
+ 	"product:detail":  30 * time.Minute, // product details
+ 	"product:price":   5 * time.Minute,  // product price (updated often)
+ 	"product:stock":   1 * time.Minute,  // stock (changes at high frequency)
+ 	"livestream:info": 10 * time.Minute, // livestream room info
+ 	"user:profile":    1 * time.Hour,    // user profile (rarely changes)
+ }
+ ```
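+
+ #### Strategy 2: Automatic Hot-Key Detection and Local Caching
+
+ ```go
+ import (
+ 	"context"
+ 	"time"
+
+ 	"github.com/dgraph-io/ristretto"
+ 	"github.com/redis/go-redis/v9"
+ )
+
+ // HotKeyDetector promotes hot keys into an in-process cache.
+ type HotKeyDetector struct {
+ 	counter    *slidingWindow // sliding-window access counter
+ 	threshold  int64          // hot-key threshold: 100 QPS
+ 	localCache *ristretto.Cache
+ 	rdb        redis.UniversalClient
+ }
+
+ // OnAccess counts one access and promotes the key once it crosses the threshold.
+ func (d *HotKeyDetector) OnAccess(ctx context.Context, key string) {
+ 	count := d.counter.Increment(key)
+ 	if count <= d.threshold {
+ 		return
+ 	}
+ 	if _, found := d.localCache.Get(key); found {
+ 		return // already promoted; skip the extra Redis read
+ 	}
+ 	// Promote the hot key into the local cache automatically.
+ 	val, err := d.rdb.Get(ctx, key).Result()
+ 	if err == nil {
+ 		d.localCache.SetWithTTL(key, val, 1, 10*time.Second)
+ 		metrics.HotKeyPromotions.Inc()
+ 	}
+ }
+
+ func (d *HotKeyDetector) Get(ctx context.Context, key string) (interface{}, error) {
+ 	// Check the local cache first.
+ 	if val, found := d.localCache.Get(key); found {
+ 		metrics.LocalCacheHits.Inc()
+ 		return val, nil
+ 	}
+ 	// Then fall back to Redis.
+ 	return d.rdb.Get(ctx, key).Result()
+ }
+ ```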
+
+ #### Strategy 3: Mutually Exclusive Rebuild (Singleflight)
+
+ ```go
+ import (
+ 	"context"
+ 	"fmt"
+ 	"time"
+
+ 	"golang.org/x/sync/singleflight"
+ )
+
+ // Prevents cache breakdown: after a key expires, only one request rebuilds it.
+ var sf singleflight.Group
+
+ func GetProductDetail(ctx context.Context, productID int64) (*Product, error) {
+ 	key := fmt.Sprintf("product:detail:%d", productID)
+
+ 	// Try the cache first; an entry of an unexpected type is treated as a miss.
+ 	if cached, err := cache.Get(ctx, key); err == nil {
+ 		if p, ok := cached.(*Product); ok {
+ 			return p, nil
+ 		}
+ 	}
+
+ 	// Cache miss: use singleflight to prevent concurrent rebuilds.
+ 	result, err, _ := sf.Do(key, func() (interface{}, error) {
+ 		// Only one goroutine runs the database query.
+ 		product, err := db.GetProduct(ctx, productID)
+ 		if err != nil {
+ 			return nil, err
+ 		}
+ 		// Write back to the cache (best effort).
+ 		cache.SetWithJitter(ctx, key, product, 30*time.Minute)
+ 		return product, nil
+ 	})
+
+ 	if err != nil {
+ 		return nil, err
+ 	}
+ 	return result.(*Product), nil
+ }
+ ```
+
+ #### Strategy 4: Null-Value Caching (Against Cache Penetration)
+
+ ```go
+ import (
+ 	"context"
+ 	"errors"
+ 	"fmt"
+ 	"time"
+ )
+
+ // Cache an empty value for records that do not exist, so that malicious
+ // lookups cannot fall through to the DB.
+ func GetProductWithNullProtection(ctx context.Context, id int64) (*Product, error) {
+ 	key := fmt.Sprintf("product:detail:%d", id)
+
+ 	cached, err := cache.Get(ctx, key)
+ 	if err == nil {
+ 		if cached == nil {
+ 			return nil, ErrNotFound // null-value cache hit
+ 		}
+ 		return cached.(*Product), nil
+ 	}
+
+ 	product, err := db.GetProduct(ctx, id)
+ 	if errors.Is(err, ErrNotFound) {
+ 		// Cache the null value with a short TTL.
+ 		cache.Set(ctx, key, nil, 2*time.Minute)
+ 		return nil, ErrNotFound
+ 	}
+ 	if err != nil {
+ 		return nil, err // other errors are not cached
+ 	}
+
+ 	cache.SetWithJitter(ctx, key, product, 30*time.Minute)
+ 	return product, nil
+ }
+ ```
+
+ #### Strategy 5: Multi-Level Degradation
+
+ ```go
+ import (
+ 	"context"
+ 	"errors"
+ )
+
+ // DegradationChain is an ordered list of fallback strategies.
+ type DegradationChain struct {
+ 	strategies []DegradationStrategy
+ }
+
+ type DegradationStrategy interface {
+ 	Name() string
+ 	CanHandle(err error) bool
+ 	Handle(ctx context.Context, key string) (interface{}, error)
+ }
+
+ // The Name and CanHandle methods of the strategies below are omitted for brevity.
+
+ // Degradation strategy 1: read from a replica
+ type ReadReplicaFallback struct{}
+
+ func (f *ReadReplicaFallback) Handle(ctx context.Context, key string) (interface{}, error) {
+ 	return readReplicaDB.Query(ctx, key)
+ }
+
+ // Degradation strategy 2: return stale cached data
+ type StaleDataFallback struct{}
+
+ func (f *StaleDataFallback) Handle(ctx context.Context, key string) (interface{}, error) {
+ 	// The previous version of the data, kept in Redis under a separate key with a longer TTL.
+ 	staleKey := "stale:" + key
+ 	return rdb.Get(ctx, staleKey).Result()
+ }
+
+ // Degradation strategy 3: return a default value
+ type DefaultValueFallback struct{}
+
+ func (f *DefaultValueFallback) Handle(ctx context.Context, key string) (interface{}, error) {
+ 	return getDefaultValue(key), nil
+ }
+
+ // Usage: strategies are tried in the order they appear in degradationChain.strategies.
+ func GetWithDegradation(ctx context.Context, key string) (interface{}, error) {
+ 	val, err := cache.Get(ctx, key)
+ 	if err == nil {
+ 		return val, nil
+ 	}
+
+ 	for _, strategy := range degradationChain.strategies {
+ 		if !strategy.CanHandle(err) {
+ 			continue
+ 		}
+ 		fallbackVal, fallbackErr := strategy.Handle(ctx, key)
+ 		if fallbackErr == nil {
+ 			metrics.DegradationHits.WithLabelValues(strategy.Name()).Inc()
+ 			return fallbackVal, nil
+ 		}
+ 	}
+
+ 	return nil, errors.New("all degradation strategies exhausted")
+ }
+ ```
+
+ #### Strategy 6: Rate Limiting to Protect the Data Source
+
+ ```go
+ import (
+ 	"context"
+ 	"database/sql"
+
+ 	"golang.org/x/time/rate"
+ )
+
+ // Rate-limit MySQL queries so that a cache avalanche cannot overwhelm the database.
+ var dbLimiter = rate.NewLimiter(rate.Limit(2000), 500) // 2000 QPS, burst 500
+
+ func QueryDBWithRateLimit(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error) {
+ 	if !dbLimiter.Allow() {
+ 		metrics.DBRateLimited.Inc()
+ 		return nil, ErrRateLimited
+ 	}
+ 	return db.QueryContext(ctx, query, args...)
+ }
+ ```
+
+ ---
+
+ ## 4. Implementation Steps
+
+ ### 4.1 Phase 1: Emergency Hardening (Week 1)
+
+ ```
+ Day 1: TTL jitter
+   - Replace every fixed TTL with a jittered TTL
+   - Immediately removes the "simultaneous expiry" risk
+
+ Day 2: Singleflight integration
+   - Route every cache Get method through singleflight
+   - Prevents mass concurrent rebuilds on cache breakdown
+
+ Day 3: Redis timeout control
+   - Set a 200ms timeout on the Redis client (previously no timeout)
+   - On timeout, degrade instead of blocking
+
+ Day 4-5: Database rate limiting
+   - Add a rate limiter to MySQL queries
+   - When limited, return a degraded response instead of 503
+ ```
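+
+ ### 4.2 Phase 2: Local Cache Tier (Week 2-3)
+
+ ```
+ Week 2: Build the L1 local cache
+   - Adopt Ristretto as the in-process cache (a high-performance local cache library for Go)
+   - Allocate 256MB of memory per instance
+   - Onboard high-read data such as product details and livestream info to L1
+
+ Week 3: Automatic hot-key detection
+   - Deploy the hot-key detector
+   - Automatically promote keys above 100 QPS to the local cache
+   - Real-time hot-key monitoring dashboard in Grafana
+ ```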
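+
+ ### 4.3 Phase 3: Multi-Level Degradation (Week 4)
+
+ ```
+ Week 4:
+   - Implement the degradation chain (stale cache → replica → default value)
+   - Configure degradation strategies for core APIs
+   - Null-value caching against cache penetration
+
+ Key decision: which data may be degraded
+   May be degraded (stale data is acceptable):
+     ✅ Product details (title, images, description)
+     ✅ Livestream info (host info, introduction)
+     ✅ Review lists
+     ✅ Recommendation lists
+
+   Must not be degraded (must be real-time):
+     ❌ Product price (affects transaction amounts)
+     ❌ Stock quantity (overselling risk)
+     ❌ Order status
+     ❌ User balance
+ ```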
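
The classification above can be enforced in code before any stale fallback runs. The following is a minimal sketch; the key prefixes are illustrative guesses in the style of the TTL table (only `product:detail`, `product:price`, `product:stock`, and `livestream:info` appear in the case study), and `CanServeStale` is a hypothetical helper.

```go
package main

import "strings"

// staleAllowed lists key prefixes that may be served from stale data.
var staleAllowed = []string{
	"product:detail:",
	"livestream:info:",
	"review:list:",    // hypothetical prefix
	"recommend:list:", // hypothetical prefix
}

// realTimeOnly lists key prefixes that must never be served stale.
var realTimeOnly = []string{
	"product:price:",
	"product:stock:",
	"order:status:", // hypothetical prefix
	"user:balance:", // hypothetical prefix
}

// CanServeStale reports whether a key may be degraded to stale data.
// The deny list is checked first so it wins if prefixes ever overlap, and
// unknown prefixes default to real-time-only, the conservative choice.
func CanServeStale(key string) bool {
	for _, p := range realTimeOnly {
		if strings.HasPrefix(key, p) {
			return false
		}
	}
	for _, p := range staleAllowed {
		if strings.HasPrefix(key, p) {
			return true
		}
	}
	return false
}
```

A stale-data strategy's `CanHandle` could consult a check like this so that prices and stock fail fast instead of returning outdated values.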
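+
+ ### 4.4 Phase 4: Load-Test Validation and Operational Hardening (Week 5)
+
+ ```
+ Week 5:
+   Day 1-2: Cache avalanche load tests
+     - Scenario 1: bulk-delete 10,000 hot keys → validates singleflight + rate limiting
+     - Scenario 2: Redis node down → validates local cache + degradation chain
+     - Scenario 3: MySQL latency injection (500ms) → validates timeouts + degradation
+
+   Day 3: Redis operational hardening
+     - Disable dangerous commands on production Redis (keys, flushdb, flushall)
+     - Revoke Redis access; developers get read-only access
+     - Monitoring and alerting on the Redis slow log
+
+   Day 4-5: Monitoring and alerting
+     - Per-tier cache hit-rate monitoring (separate alerts for L1 and L2)
+     - Alert on degradation trigger count
+     - Alert on database rate-limit triggers
+     - Hot-key distribution visualization
+ ```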
+
+ ---
+
+ ## 5. Results
+
+ ### 5.1 Core Metrics Comparison
+
+ | Metric | Before | After |
+ |------|--------|--------|
+ | Overall cache hit rate | 85% | 95% (L1 60% + L2 35%) |
+ | Impact of a single Redis node failure | Site-wide outage | Latency up 20ms (L1 fallback) |
+ | MySQL peak QPS | 8,000 (during cache failure) | 800 (rate limiting + multi-level cache) |
+ | Hot-key response time | 5ms (Redis) | 0.1ms (local cache) |
+ | Avalanche recovery time | 47 minutes | Automatic recovery, no manual intervention |
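
Because the 95% figure is given as the sum of the L1 and L2 figures, the per-tier numbers appear to be shares of total read traffic rather than conditional hit rates. Under that reading, and with symbols $h_1$ and $h_{2\mid\text{miss}}$ introduced here only for illustration, the hit rate L2 achieves on the traffic that misses L1 works out as follows:

```latex
\begin{aligned}
h_{\text{total}} &= h_1 + (1 - h_1)\,h_{2\mid\text{miss}} \\
0.95 &= 0.60 + 0.40 \cdot h_{2\mid\text{miss}} \\
h_{2\mid\text{miss}} &= \frac{0.35}{0.40} = 0.875
\end{aligned}
```

So L2 answers about 87.5% of the requests it actually receives, and the remaining $1 - 0.95 = 5\%$ of reads reach the data source.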
+
+ ### 5.2 Load-Test Results
+
+ | Scenario | Before | After |
+ |------|--------|--------|
+ | Normal traffic, 50K QPS | P99: 45ms | P99: 32ms (local cache speedup) |
+ | Redis node down | Site-wide 503 | P99: 65ms (degraded service) |
+ | 10K keys expire at once | MySQL CPU 99%, timeouts | P99: 120ms, MySQL CPU 45% |
+ | Hot key at 100K QPS | Redis node CPU 95% | Served by the local cache; Redis unaffected |
+
+ ### 5.3 Cost
+
+ | Item | Value |
+ |------|------|
+ | Additional local cache memory | 256MB per instance x 20 instances = 5GB |
+ | Engineering effort | 3 people x 5 weeks |
+ | Redis configuration changes | CNY 0 (policy tuning, no added hardware) |
+ | Outage loss avoided | Estimated CNY 2.8 million per incident |
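+
+ ---
+
+ ## 6. Lessons Learned
+
+ ### 6.1 What Went Right
+
+ 1. **Multi-level caching is the fundamental fix**: a single cache tier can be no more available than that tier. With L1 + L2, a failure of either tier has a fallback
+ 2. **TTL jitter was the cheapest, highest-payoff change**: a one-line code change removed the core risk of simultaneous expiry
+ 3. **Singleflight is extremely effective**: in a cache-breakdown scenario it cuts MySQL load from N concurrent queries to 1
+ 4. **Tiered degradation**: separating degradable from non-degradable data avoids a one-size-fits-all policy
+ 5. **Load-test validation**: simulating three extreme scenarios uncovered 2 bugs in the degradation strategies
+
+ ### 6.2 What Went Wrong
+
+ 1. **Local cache consistency was underestimated**: local caches on different instances diverged, so the same user could see different prices after refreshing the page. This was later fixed by notifying every instance through Redis Pub/Sub to invalidate its local cache
+ 2. **Null-value cache TTL was initially too long**: it was set to 30 minutes, so for 30 minutes after a product was listed users still saw "product not found". It was later reduced to 2 minutes
+ 3. **Too many degradation logs**: every degradation wrote a WARN log line, and log volume grew 50x during the sale. Logging was later switched to sampling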
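
The sampling fix from the last item can be as small as a counter. The sketch below lets one event in every n through; `Sampler`, `NewSampler`, and `Allow` are hypothetical names, and the case study does not say which sampling scheme or rate was actually used.

```go
package main

import "sync/atomic"

// Sampler lets one event out of every n through. It is safe for concurrent use.
type Sampler struct {
	count uint64 // kept first so 64-bit atomic access is aligned on 32-bit platforms
	n     uint64
}

// NewSampler returns a sampler that allows 1 in n events; n == 0 is treated as 1.
func NewSampler(n uint64) *Sampler {
	if n == 0 {
		n = 1
	}
	return &Sampler{n: n}
}

// Allow reports whether the current event should be logged: the 1st,
// (n+1)th, (2n+1)th, ... calls return true.
func (s *Sampler) Allow() bool {
	return (atomic.AddUint64(&s.count, 1)-1)%s.n == 0
}
```

A degradation path would wrap its WARN line in `if sampler.Allow() { ... }` while still incrementing the metrics counter on every event, so the alerting signal stays exact even though the logs are thinned.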
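+
+ ### 6.3 Key Takeaways
+
+ - A cache is not an "accelerator" but a "protector". Cache strategy must be designed together with capacity strategy
+ - Never run `keys *`, `flushdb`, or `flushall` on production Redis
+ - Cache avalanche is a question of "when", not "if". A contingency plan is mandatory
+ - The local cache is the last line of defense against network jitter and middleware failures
+ - Degradation is not a failure but a design choice: a system should degrade gracefully rather than collapse outright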
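+
+ ---
+
+ ## Agent Checklist
+
+ When an AI agent assists in designing a cache protection system, it should confirm each item:
+
+ - [ ] **TTL policy**: do cache TTLs carry random jitter so that large numbers of keys do not expire together
+ - [ ] **Multi-level cache**: is there a local cache tier as a fallback for Redis
+ - [ ] **Cache breakdown**: when a hot key expires, does singleflight or a mutex prevent concurrent rebuilds
+ - [ ] **Cache penetration**: are nonexistent keys covered by null-value caching or a Bloom filter
+ - [ ] **Hot-key detection**: is there a mechanism that automatically detects and protects hot keys
+ - [ ] **Degradation strategy**: is there a tiered degradation plan for cache failures (stale data / replica / default value)
+ - [ ] **Data classification**: is degradable data distinguished from non-degradable data
+ - [ ] **Rate limiting**: is the data source (DB) protected by rate limiting when the cache fails
+ - [ ] **Timeout control**: does the Redis client have a reasonable timeout configured
+ - [ ] **Dangerous commands disabled**: are keys/flushdb/flushall and similar commands disabled on production Redis
+ - [ ] **Consistency**: how is data consistency between the local cache and Redis guaranteed
+ - [ ] **Monitoring and alerting**: are cache hit rate, degradation triggers, and hot keys monitored with alerts
+ - [ ] **Load-test validation**: have avalanche/breakdown/penetration scenarios been simulated and the protections verified
+ - [ ] **Warm-up**: is there a cache warm-up plan for cold starts and ahead of major sales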