@umacloud/knowledge 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (418)
  1. package/00-governance/governance-capabilities.md +557 -0
  2. package/00-governance/knowledge-map.md +39 -0
  3. package/00-governance/maintenance-policy.md +76 -0
  4. package/00-governance/review-checklist.md +81 -0
  5. package/README.md +13 -0
  6. package/ai/01-standards/agent-development-complete.md +691 -0
  7. package/ai/01-standards/llm-application-complete.md +488 -0
  8. package/ai/01-standards/mlops-complete.md +798 -0
  9. package/ai/01-standards/prompt-engineering-complete.md +646 -0
  10. package/ai/01-standards/rag-architecture-complete.md +649 -0
  11. package/ai/02-playbooks/llm-evaluation-playbook.md +847 -0
  12. package/ai/03-checklists/ai-project-checklist.md +215 -0
  13. package/ai/04-antipatterns/ai-antipatterns.md +661 -0
  14. package/ai/05-cases/case-rag-production.md +147 -0
  15. package/ai/06-glossary/ai-glossary.md +162 -0
  16. package/ai/agent-evaluation-benchmark.md +53 -0
  17. package/ai/ai-agent-memory-context-management.md +41 -0
  18. package/ai/ai-cost-capacity-optimization-playbook.md +42 -0
  19. package/ai/ai-data-security-and-compliance-playbook.md +37 -0
  20. package/ai/ai-domain-index-and-checklist.md +40 -0
  21. package/ai/ai-governance-maturity-model.md +50 -0
  22. package/ai/ai-model-selection-and-routing-strategy.md +47 -0
  23. package/ai/ai-observability-and-oncall-runbook.md +52 -0
  24. package/ai/ai-rag-engineering-playbook.md +42 -0
  25. package/ai/ai-red-team-and-safety-evaluation.md +42 -0
  26. package/ai/ai-release-readiness-and-rollback-gate.md +42 -0
  27. package/ai/llm-agent-engineering-deep-dive.md +57 -0
  28. package/ai/prompt-and-tool-guardrails.md +52 -0
  29. package/api/01-standards/enterprise-api-standards.md +198 -0
  30. package/api/01-standards/rest-api-design-guide.md +63 -0
  31. package/api/02-playbooks/api-pagination-playbook.md +93 -0
  32. package/api/02-playbooks/graphql-production-playbook.md +176 -0
  33. package/api/03-checklists/api-review-checklist.md +55 -0
  34. package/api/04-antipatterns/api-antipatterns.md +112 -0
  35. package/architecture/01-standards/api-gateway-patterns.md +496 -0
  36. package/architecture/01-standards/cloud-native-patterns.md +644 -0
  37. package/architecture/01-standards/distributed-systems-patterns.md +591 -0
  38. package/architecture/01-standards/event-driven-architecture.md +595 -0
  39. package/architecture/01-standards/microservices-patterns-complete.md +968 -0
  40. package/architecture/01-standards/microservices-patterns.md +495 -0
  41. package/architecture/01-standards/system-design-interview.md +664 -0
  42. package/architecture/02-playbooks/microservices-patterns-playbook.md +137 -0
  43. package/architecture/02-playbooks/migration-playbook.md +780 -0
  44. package/architecture/02-playbooks/system-design-playbook.md +779 -0
  45. package/architecture/03-checklists/architecture-decision-checklist.md +297 -0
  46. package/architecture/04-antipatterns/architecture-antipatterns.md +417 -0
  47. package/architecture/05-cases/case-netflix-microservices.md +413 -0
  48. package/architecture/06-glossary/architecture-glossary.md +164 -0
  49. package/architecture/adr-template-and-examples.md +38 -0
  50. package/architecture/api-gateway-deep-dive.md +1291 -0
  51. package/architecture/configuration-management.md +1162 -0
  52. package/architecture/distributed-transactions.md +1220 -0
  53. package/architecture/microservices-complete.md +735 -0
  54. package/architecture/resilience-and-disaster-patterns.md +37 -0
  55. package/architecture/service-governance.md +1198 -0
  56. package/architecture/system-architecture-deep-dive.md +37 -0
  57. package/backend/01-standards/analytics-and-growth.md +65 -0
  58. package/backend/01-standards/api-and-error-conventions.md +120 -0
  59. package/backend/01-standards/application-layering-and-packaging.md +160 -0
  60. package/backend/01-standards/auth-implementation.md +104 -0
  61. package/backend/01-standards/backend-framework-idioms.md +74 -0
  62. package/backend/01-standards/background-jobs-and-async.md +66 -0
  63. package/backend/01-standards/caching-strategies-complete.md +390 -0
  64. package/backend/01-standards/config-and-observability.md +77 -0
  65. package/backend/01-standards/data-modeling-and-persistence.md +94 -0
  66. package/backend/01-standards/django-complete.md +1765 -0
  67. package/backend/01-standards/email-and-notifications.md +64 -0
  68. package/backend/01-standards/fastapi-complete.md +925 -0
  69. package/backend/01-standards/file-upload-and-storage.md +66 -0
  70. package/backend/01-standards/graphql-api-complete.md +416 -0
  71. package/backend/01-standards/llm-application-standard.md +78 -0
  72. package/backend/01-standards/message-queue-patterns.md +379 -0
  73. package/backend/01-standards/microservices-and-distributed.md +78 -0
  74. package/backend/01-standards/nestjs-complete.md +2167 -0
  75. package/backend/01-standards/payment-integration.md +80 -0
  76. package/backend/01-standards/rate-limiting-complete.md +451 -0
  77. package/backend/01-standards/realtime-and-websocket.md +65 -0
  78. package/backend/01-standards/search-and-filtering.md +64 -0
  79. package/backend/01-standards/spring-boot-complete.md +445 -0
  80. package/backend/02-playbooks/api-design-playbook.md +718 -0
  81. package/backend/02-playbooks/email-send-playbook.md +130 -0
  82. package/backend/02-playbooks/file-upload-s3-playbook.md +153 -0
  83. package/backend/02-playbooks/typescript-enterprise-playbook.md +133 -0
  84. package/backend/02-playbooks/websocket-realtime-playbook.md +154 -0
  85. package/backend/03-checklists/api-launch-checklist.md +189 -0
  86. package/backend/04-antipatterns/backend-antipatterns.md +1051 -0
  87. package/blockchain/01-standards/blockchain-basics.md +557 -0
  88. package/blockchain/01-standards/smart-contract-development.md +1315 -0
  89. package/cicd/01-standards/deployment-and-delivery-standard.md +96 -0
  90. package/cicd/01-standards/github-actions-complete.md +473 -0
  91. package/cicd/01-standards/release-and-store-submission.md +75 -0
  92. package/cicd/02-playbooks/cicd-pipeline-playbook.md +144 -0
  93. package/cicd/02-playbooks/release-management-playbook.md +605 -0
  94. package/cicd/03-checklists/pipeline-security-checklist.md +168 -0
  95. package/cicd/04-antipatterns/cicd-antipatterns.md +589 -0
  96. package/cicd/05-cases/case-deployment-automation.md +221 -0
  97. package/cicd/05-cases/case-gitops-transformation.md +212 -0
  98. package/cicd/06-glossary/cicd-glossary.md +114 -0
  99. package/cicd/cicd-blueprint-deep-dive.md +38 -0
  100. package/cicd/release-readiness-gate.md +37 -0
  101. package/cloud-native/01-standards/container-security.md +741 -0
  102. package/cloud-native/01-standards/kubernetes-complete.md +812 -0
  103. package/cloud-native/02-playbooks/api-gateway-playbook.md +155 -0
  104. package/cloud-native/02-playbooks/gitops-with-argocd.md +760 -0
  105. package/cloud-native/02-playbooks/k8s-troubleshooting-playbook.md +1942 -0
  106. package/cloud-native/02-playbooks/message-queue-playbook.md +129 -0
  107. package/cloud-native/02-playbooks/multicloud-governance.md +726 -0
  108. package/cloud-native/02-playbooks/serverless-patterns.md +788 -0
  109. package/cloud-native/02-playbooks/service-mesh-playbook.md +612 -0
  110. package/cloud-native/02-playbooks/terraform-iac-playbook.md +143 -0
  111. package/cloud-native/03-checklists/container-security-checklist.md +431 -0
  112. package/cloud-native/03-checklists/k8s-production-readiness-checklist.md +460 -0
  113. package/cloud-native/04-antipatterns/container-antipatterns.md +660 -0
  114. package/cloud-native/04-antipatterns/k8s-antipatterns.md +743 -0
  115. package/cloud-native/05-cases/case-k8s-migration.md +478 -0
  116. package/cloud-native/05-cases/case-k8s-scaling.md +642 -0
  117. package/cloud-native/05-cases/case-k8s-security-incident.md +397 -0
  118. package/cloud-native/06-glossary/cloud-native-glossary.md +337 -0
  119. package/cross-platform/01-standards/cross-platform-frameworks.md +83 -0
  120. package/cross-platform/01-standards/platform-selection-and-architecture.md +77 -0
  121. package/data/01-standards/elasticsearch-complete.md +2098 -0
  122. package/data/01-standards/postgresql-complete.md +1613 -0
  123. package/data/01-standards/redis-complete.md +1527 -0
  124. package/data/02-playbooks/database-optimization-playbook.md +403 -0
  125. package/data/02-playbooks/elasticsearch-production-playbook.md +132 -0
  126. package/data/03-checklists/database-launch-checklist.md +187 -0
  127. package/data/04-antipatterns/database-antipatterns.md +873 -0
  128. package/data/05-cases/case-database-migration.md +310 -0
  129. package/data/06-glossary/database-glossary.md +440 -0
  130. package/data/data-governance-and-modeling-deep-dive.md +39 -0
  131. package/data-engineering/01-standards/airflow-complete.md +523 -0
  132. package/data-engineering/01-standards/kafka-complete.md +1521 -0
  133. package/data-engineering/02-playbooks/spark-etl-playbook.md +496 -0
  134. package/data-engineering/03-checklists/pipeline-launch-checklist.md +194 -0
  135. package/data-engineering/04-antipatterns/data-pipeline-antipatterns.md +684 -0
  136. package/data-engineering/05-cases/case-real-time-pipeline.md +355 -0
  137. package/data-engineering/06-glossary/data-engineering-glossary.md +429 -0
  138. package/database/01-standards/database-schema-standards.md +147 -0
  139. package/database/02-playbooks/postgresql-optimization-quick.md +52 -0
  140. package/database/02-playbooks/postgresql-performance-optimization.md +58 -0
  141. package/database/02-playbooks/postgresql-production-playbook.md +146 -0
  142. package/database/02-playbooks/redis-caching-playbook.md +117 -0
  143. package/database/03-checklists/database-review-checklist.md +50 -0
  144. package/database/04-antipatterns/database-antipatterns.md +112 -0
  145. package/design/01-standards/ui-design-system-complete.md +423 -0
  146. package/design/02-playbooks/design-handoff-playbook.md +254 -0
  147. package/design/02-playbooks/design-review-playbook.md +388 -0
  148. package/design/03-checklists/design-review-checklist.md +246 -0
  149. package/design/04-antipatterns/design-antipatterns.md +378 -0
  150. package/design/05-cases/case-design-system-adoption.md +328 -0
  151. package/design/06-glossary/design-glossary.md +329 -0
  152. package/design/ui-full-lifecycle-cross-platform-playbook.md +571 -0
  153. package/design/ux-system-deep-dive.md +38 -0
  154. package/design-systems/00-craft-rules.md +71 -0
  155. package/design-systems/aesthetic-families.md +43 -0
  156. package/design-systems/anti-ai-slop.md +162 -0
  157. package/design-systems/bold-geometric.md +120 -0
  158. package/design-systems/brutalist-bold.md +103 -0
  159. package/design-systems/editorial-clean.md +109 -0
  160. package/design-systems/glass-aurora.md +108 -0
  161. package/design-systems/modern-minimal.md +145 -0
  162. package/design-systems/premium-luxury.md +106 -0
  163. package/design-systems/product-type-design-map.md +48 -0
  164. package/design-systems/soft-warm.md +123 -0
  165. package/design-systems/tech-utility.md +113 -0
  166. package/desktop/01-standards/desktop-app-standard.md +72 -0
  167. package/desktop/01-standards/desktop-design.md +71 -0
  168. package/development/00-governance/document-template.md +41 -0
  169. package/development/01-standards/api-versioning-strategies.md +432 -0
  170. package/development/01-standards/authentication-patterns-complete.md +479 -0
  171. package/development/01-standards/css-architecture-complete.md +550 -0
  172. package/development/01-standards/database-migration-strategies.md +484 -0
  173. package/development/01-standards/elasticsearch-complete.md +347 -0
  174. package/development/01-standards/git-complete.md +371 -0
  175. package/development/01-standards/golang-complete.md +1565 -0
  176. package/development/01-standards/graphql-complete.md +298 -0
  177. package/development/01-standards/javascript-bundlers-complete.md +469 -0
  178. package/development/01-standards/javascript-typescript-complete.md +528 -0
  179. package/development/01-standards/jest-complete.md +275 -0
  180. package/development/01-standards/linux-complete.md +234 -0
  181. package/development/01-standards/logging-observability-complete.md +526 -0
  182. package/development/01-standards/microservices-communication.md +502 -0
  183. package/development/01-standards/mongodb-complete.md +406 -0
  184. package/development/01-standards/oauth2-complete.md +285 -0
  185. package/development/01-standards/performance-optimization-complete.md +289 -0
  186. package/development/01-standards/playwright-complete.md +247 -0
  187. package/development/01-standards/postgresql-complete.md +456 -0
  188. package/development/01-standards/pytest-complete.md +340 -0
  189. package/development/01-standards/python-async-programming.md +902 -0
  190. package/development/01-standards/python-complete.md +956 -0
  191. package/development/01-standards/python-decorators-complete.md +799 -0
  192. package/development/01-standards/python-design-patterns.md +2854 -0
  193. package/development/01-standards/python-packaging-distribution.md +420 -0
  194. package/development/01-standards/python-testing-strategies.md +607 -0
  195. package/development/01-standards/python-web-frameworks-comparison.md +471 -0
  196. package/development/01-standards/redis-complete.md +317 -0
  197. package/development/01-standards/rest-api-complete.md +316 -0
  198. package/development/01-standards/rust-complete.md +578 -0
  199. package/development/01-standards/typescript-advanced-types.md +1513 -0
  200. package/development/01-standards/web-security-complete.md +292 -0
  201. package/development/02-playbooks/api-design-playbook.md +810 -0
  202. package/development/02-playbooks/database-migration-playbook.md +580 -0
  203. package/development/02-playbooks/debugging-playbook.md +692 -0
  204. package/development/02-playbooks/feature-delivery-playbook.md +430 -0
  205. package/development/02-playbooks/incident-hotfix-playbook.md +387 -0
  206. package/development/02-playbooks/performance-optimization-playbook.md +531 -0
  207. package/development/02-playbooks/performance-tuning-playbook.md +652 -0
  208. package/development/02-playbooks/refactor-playbook.md +403 -0
  209. package/development/02-playbooks/release-playbook.md +469 -0
  210. package/development/03-checklists/architecture-review-checklist.md +168 -0
  211. package/development/03-checklists/data-migration-checklist.md +157 -0
  212. package/development/03-checklists/oncall-handover-checklist.md +173 -0
  213. package/development/03-checklists/pr-checklist.md +158 -0
  214. package/development/03-checklists/production-readiness-checklist.md +190 -0
  215. package/development/03-checklists/release-readiness-checklist.md +154 -0
  216. package/development/03-checklists/security-review-checklist.md +182 -0
  217. package/development/04-antipatterns/api-antipatterns.md +657 -0
  218. package/development/04-antipatterns/architecture-antipatterns.md +686 -0
  219. package/development/04-antipatterns/backend-antipatterns.md +648 -0
  220. package/development/04-antipatterns/cicd-antipatterns.md +540 -0
  221. package/development/04-antipatterns/code-smell-antipatterns.md +571 -0
  222. package/development/04-antipatterns/data-antipatterns.md +658 -0
  223. package/development/04-antipatterns/database-antipatterns.md +578 -0
  224. package/development/04-antipatterns/frontend-antipatterns.md +635 -0
  225. package/development/04-antipatterns/reliability-antipatterns.md +700 -0
  226. package/development/04-antipatterns/security-antipatterns.md +747 -0
  227. package/development/05-cases/case-api-version-migration.md +428 -0
  228. package/development/05-cases/case-authorization-hardening.md +383 -0
  229. package/development/05-cases/case-bluegreen-rollback.md +466 -0
  230. package/development/05-cases/case-cache-snowball-protection.md +485 -0
  231. package/development/05-cases/case-ci-cd-pipeline.md +544 -0
  232. package/development/05-cases/case-database-scaling.md +500 -0
  233. package/development/05-cases/case-db-hotspot-optimization.md +487 -0
  234. package/development/05-cases/case-incident-mttr-reduction.md +563 -0
  235. package/development/05-cases/case-microservice-migration.md +375 -0
  236. package/development/05-cases/case-performance-optimization.md +406 -0
  237. package/development/05-cases/case-security-incident-response.md +345 -0
  238. package/development/06-glossary/full-stack-glossary.md +166 -0
  239. package/development/09-maturity/quarterly-audit-template.md +35 -0
  240. package/development/11-ui-excellence/ui-aesthetic-system.md +41 -0
  241. package/development/11-ui-excellence/ui-engineering-excellence.md +435 -0
  242. package/development/12-scenarios/development-scenarios-guide.md +565 -0
  243. package/development/13-implementation-assets/implementation-toolkit.md +282 -0
  244. package/development/13-implementation-assets/knowledge-gates-execution.md +43 -0
  245. package/development/14-full-lifecycle/software-lifecycle-gates.md +511 -0
  246. package/development/15-lifecycle-templates/project-templates-collection.md +791 -0
  247. package/development/api-contract-and-versioning-guide.md +36 -0
  248. package/development/api-governance-complete.md +43 -0
  249. package/development/backend-engineering-complete.md +43 -0
  250. package/development/code-review-quality-complete.md +43 -0
  251. package/development/concurrency-reliability-complete.md +43 -0
  252. package/development/database-engineering-complete.md +43 -0
  253. package/development/engineering-effectiveness-complete.md +43 -0
  254. package/development/engineering-standards-deep-dive.md +38 -0
  255. package/development/frontend-engineering-complete.md +43 -0
  256. package/development/performance-capacity-complete.md +43 -0
  257. package/development/refactor-migration-complete.md +42 -0
  258. package/development/refactoring-and-techdebt-playbook.md +37 -0
  259. package/development/security-in-development-complete.md +43 -0
  260. package/devops/01-standards/cicd-pipeline-complete.md +262 -0
  261. package/devops/01-standards/docker-complete.md +1490 -0
  262. package/devops/01-standards/github-actions-complete.md +337 -0
  263. package/devops/01-standards/kubernetes-complete.md +638 -0
  264. package/devops/01-standards/terraform-complete.md +2117 -0
  265. package/devops/02-playbooks/docker-compose-playbook.md +233 -0
  266. package/devops/02-playbooks/docker-k8s-production-playbook.md +186 -0
  267. package/devops/02-playbooks/docker-production-playbook.md +952 -0
  268. package/edge-iot/01-standards/edge-iot-complete.md +473 -0
  269. package/experts/architect/api-design.md +178 -0
  270. package/experts/architect/methodology.md +124 -0
  271. package/experts/architect/security.md +75 -0
  272. package/experts/backend-lead/methodology.md +216 -0
  273. package/experts/devops/methodology.md +160 -0
  274. package/experts/frontend-lead/methodology.md +178 -0
  275. package/experts/product-manager/industry/ecommerce.md +43 -0
  276. package/experts/product-manager/industry/saas.md +40 -0
  277. package/experts/product-manager/methodology.md +97 -0
  278. package/experts/qa-lead/methodology.md +123 -0
  279. package/experts/qa-lead/test-strategy.md +128 -0
  280. package/experts/uiux-designer/methodology.md +125 -0
  281. package/frontend/01-standards/accessibility-complete.md +532 -0
  282. package/frontend/01-standards/accessibility-standard.md +74 -0
  283. package/frontend/01-standards/admin-dashboard-and-crud.md +72 -0
  284. package/frontend/01-standards/design-tokens-complete.md +444 -0
  285. package/frontend/01-standards/forms-and-validation.md +77 -0
  286. package/frontend/01-standards/frontend-architecture-and-layering.md +119 -0
  287. package/frontend/01-standards/i18n-and-localization.md +65 -0
  288. package/frontend/01-standards/nextjs-complete.md +451 -0
  289. package/frontend/01-standards/react-complete.md +713 -0
  290. package/frontend/01-standards/react-hooks-complete-guide.md +1100 -0
  291. package/frontend/01-standards/react-hooks-complete.md +1171 -0
  292. package/frontend/01-standards/seo-and-web-vitals.md +77 -0
  293. package/frontend/01-standards/state-management-complete.md +444 -0
  294. package/frontend/01-standards/vue-complete.md +499 -0
  295. package/frontend/01-standards/vue3-complete.md +2002 -0
  296. package/frontend/01-standards/web-framework-best-practices.md +64 -0
  297. package/frontend/01-standards/web-performance-complete.md +495 -0
  298. package/frontend/02-playbooks/accessibility-a11y-playbook.md +161 -0
  299. package/frontend/02-playbooks/frontend-performance-playbook.md +707 -0
  300. package/frontend/02-playbooks/i18n-internationalization-playbook.md +120 -0
  301. package/frontend/02-playbooks/performance-optimization-playbook.md +163 -0
  302. package/frontend/02-playbooks/react-nextjs-production-playbook.md +167 -0
  303. package/frontend/02-playbooks/react-state-management-playbook.md +173 -0
  304. package/frontend/03-checklists/component-quality-checklist.md +166 -0
  305. package/frontend/03-checklists/frontend-launch-checklist.md +299 -0
  306. package/frontend/04-antipatterns/frontend-antipatterns.md +886 -0
  307. package/frontend/05-cases/case-performance-optimization.md +274 -0
  308. package/harmony/01-standards/harmonyos-arkts-standard.md +75 -0
  309. package/harmony/01-standards/harmonyos-design.md +65 -0
  310. package/high-quality-engineering-playbook.md +54 -0
  311. package/incident/01-standards/incident-response-complete.md +303 -0
  312. package/incident/02-playbooks/chaos-engineering-playbook.md +883 -0
  313. package/incident/02-playbooks/postmortem-playbook.md +398 -0
  314. package/incident/03-checklists/incident-readiness-checklist.md +181 -0
  315. package/incident/04-antipatterns/incident-antipatterns.md +490 -0
  316. package/incident/05-cases/case-cascade-failure.md +176 -0
  317. package/incident/06-glossary/incident-glossary.md +114 -0
  318. package/incident/postmortem-and-response-deep-dive.md +39 -0
  319. package/industries/ecommerce/ecommerce-complete.md +631 -0
  320. package/industries/education/education-complete.md +555 -0
  321. package/industries/fintech/fintech-complete.md +501 -0
  322. package/industries/gaming/gaming-complete.md +587 -0
  323. package/industries/healthcare/healthcare-complete.md +452 -0
  324. package/low-code/01-standards/low-code-complete.md +944 -0
  325. package/miniprogram/01-standards/ai-common-mistakes.md +61 -0
  326. package/miniprogram/01-standards/miniprogram-custom-navbar-capsule.md +77 -0
  327. package/miniprogram/01-standards/miniprogram-design.md +61 -0
  328. package/miniprogram/01-standards/miniprogram-standard.md +81 -0
  329. package/mobile/01-standards/android-material-design.md +70 -0
  330. package/mobile/01-standards/flutter-complete.md +384 -0
  331. package/mobile/01-standards/ios-design-hig.md +78 -0
  332. package/mobile/01-standards/mobile-app-standard.md +85 -0
  333. package/mobile/01-standards/react-native-complete.md +352 -0
  334. package/mobile/02-playbooks/mobile-cross-platform-playbook.md +175 -0
  335. package/mobile/02-playbooks/mobile-performance.md +473 -0
  336. package/mobile/03-checklists/mobile-release-checklist.md +234 -0
  337. package/mobile/04-antipatterns/mobile-antipatterns.md +798 -0
  338. package/mobile/05-cases/case-app-performance.md +500 -0
  339. package/mobile/05-cases/case-app-startup-optimization.md +218 -0
  340. package/mobile/06-glossary/mobile-glossary.md +484 -0
  341. package/observability/01-standards/observability-standards.md +103 -0
  342. package/observability/02-playbooks/prometheus-grafana-playbook.md +135 -0
  343. package/observability/02-playbooks/structured-logging-playbook.md +73 -0
  344. package/observability/03-checklists/observability-checklist.md +54 -0
  345. package/observability/04-antipatterns/observability-antipatterns.md +106 -0
  346. package/operations/01-standards/prometheus-monitoring-complete.md +1578 -0
  347. package/operations/02-playbooks/capacity-planning-playbook.md +620 -0
  348. package/operations/03-checklists/production-launch-checklist.md +365 -0
  349. package/operations/04-antipatterns/operations-antipatterns.md +664 -0
  350. package/operations/05-cases/case-sre-practices.md +581 -0
  351. package/operations/06-glossary/operations-glossary.md +120 -0
  352. package/operations/aiops-anomaly-detection.md +758 -0
  353. package/operations/capacity-planning.md +1061 -0
  354. package/operations/chaos-engineering.md +659 -0
  355. package/operations/incident-command-system.md +38 -0
  356. package/operations/observability-complete.md +442 -0
  357. package/operations/slo-sli-playbook.md +517 -0
  358. package/operations/sre-operations-deep-dive.md +39 -0
  359. package/package.json +8 -0
  360. package/performance/01-standards/performance-and-scalability.md +80 -0
  361. package/performance/01-standards/performance-standards.md +156 -0
  362. package/performance/02-playbooks/query-optimization-playbook.md +103 -0
  363. package/performance/03-checklists/performance-checklist.md +56 -0
  364. package/performance/04-antipatterns/performance-antipatterns.md +146 -0
  365. package/product/01-standards/product-management-complete.md +285 -0
  366. package/product/02-playbooks/feature-launch-playbook.md +207 -0
  367. package/product/02-playbooks/user-research-playbook.md +532 -0
  368. package/product/03-checklists/feature-launch-checklist.md +275 -0
  369. package/product/04-antipatterns/product-antipatterns.md +355 -0
  370. package/product/05-cases/case-mvp-to-scale.md +384 -0
  371. package/product/06-glossary/product-glossary.md +462 -0
  372. package/product/feature-prioritization-framework.md +40 -0
  373. package/product/kpi-and-metric-tree.md +37 -0
  374. package/product/product-discovery-and-prd-deep-dive.md +41 -0
  375. package/quantum/01-standards/quantum-complete.md +1186 -0
  376. package/security/01-standards/api-security-complete.md +511 -0
  377. package/security/01-standards/container-runtime-security.md +574 -0
  378. package/security/01-standards/data-protection-gdpr.md +543 -0
  379. package/security/01-standards/owasp-top10-complete.md +1890 -0
  380. package/security/01-standards/secure-coding-baseline.md +90 -0
  381. package/security/01-standards/supply-chain-security.md +441 -0
  382. package/security/01-standards/web-security-checklist.md +108 -0
  383. package/security/01-standards/zero-trust-architecture.md +521 -0
  384. package/security/02-playbooks/auth-sso-playbook.md +166 -0
  385. package/security/02-playbooks/incident-response-security-playbook.md +588 -0
  386. package/security/02-playbooks/owasp-api-security-playbook.md +129 -0
  387. package/security/02-playbooks/payment-integration-playbook.md +119 -0
  388. package/security/02-playbooks/penetration-testing-playbook.md +517 -0
  389. package/security/03-checklists/security-audit-checklist.md +356 -0
  390. package/security/04-antipatterns/security-coding-antipatterns.md +580 -0
  391. package/security/05-cases/case-log4shell-incident.md +537 -0
  392. package/security/05-cases/case-major-breaches.md +468 -0
  393. package/security/06-glossary/security-glossary.md +212 -0
  394. package/security/compliance-automation.md +993 -0
  395. package/security/container-security.md +680 -0
  396. package/security/devsecops-complete.md +426 -0
  397. package/security/sast-dast-sca.md +775 -0
  398. package/security/secrets-management.md +594 -0
  399. package/security/security-architecture-deep-dive.md +37 -0
  400. package/security/threat-modeling-stride-playbook.md +40 -0
  401. package/seed-templates/auth-system.md +59 -0
  402. package/seed-templates/blog-content.md +94 -0
  403. package/seed-templates/dashboard.md +89 -0
  404. package/seed-templates/docs-site.md +73 -0
  405. package/seed-templates/e-commerce.md +50 -0
  406. package/seed-templates/saas-landing.md +92 -0
  407. package/seed-templates/settings-page.md +51 -0
  408. package/testing/01-standards/test-strategy-and-layering.md +83 -0
  409. package/testing/01-standards/testing-strategy-complete.md +422 -0
  410. package/testing/01-standards/unit-testing-best-practices.md +118 -0
  411. package/testing/02-playbooks/e2e-testing-playbook.md +988 -0
  412. package/testing/02-playbooks/testing-strategy-playbook.md +126 -0
  413. package/testing/03-checklists/test-strategy-checklist.md +208 -0
  414. package/testing/04-antipatterns/testing-antipatterns.md +718 -0
  415. package/testing/05-cases/case-testing-transformation.md +300 -0
  416. package/testing/06-glossary/testing-glossary.md +110 -0
  417. package/testing/risk-based-test-matrix.md +36 -0
  418. package/testing/testing-strategy-deep-dive.md +37 -0
@@ -0,0 +1,442 @@
+ ---
+ id: observability-complete
+ title: observability-complete
+ domain: operations
+ category: observability-complete.md
+ difficulty: intermediate
+ tags: [complete, dashboard, observability, operations, 告警设计, 实施清单, 常见失败模式, 技术选型]
+ quality_score: 70
+ last_updated: 2026-06-15
+ ---
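+ # Author: Excellent (11964948@qq.com)
+ # Feature: Complete observability system
+ # Purpose: Provide a complete implementation guide for the three pillars: logs, metrics, and traces
+ # Created: 2026-03-20
+ # Last modified: 2026-03-20
+
+ ## Goal
+ Establish end-to-end observability in production so that faults can be located quickly, performance bottlenecks identified precisely, and user experience measured quantitatively.
+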
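+ ## Scope
+ - All production microservices, databases, middleware, and infrastructure
+ - End-to-end tracing of critical business flows
+ - Unified observability across multi-cloud and hybrid-cloud environments
+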
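+ ## Core Concepts: The Three Pillars
+
+ ### 1. Logs
+ **Definition**: Discrete, timestamped event records
+
+ **Log levels**:
+ - ERROR: System errors and exceptions that require immediate handling
+ - WARN: Potential problems, degraded behavior, or values approaching a threshold
+ - INFO: Key business flows and state changes
+ - DEBUG: Debugging information (disabled by default in production)
+ - TRACE: Detailed execution paths (used only for fault diagnosis)
+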
+ **Structured log format**:
+ ```json
+ {
+   "timestamp": "2026-03-20T10:30:45.123Z",
+   "level": "ERROR",
+   "service": "order-service",
+   "trace_id": "abc123",
+   "span_id": "def456",
+   "user_id": "user_789",
+   "message": "Payment processing failed",
+   "error": {
+     "type": "PaymentGatewayException",
+     "code": "GATEWAY_TIMEOUT",
+     "stack": "..."
+   },
+   "context": {
+     "order_id": "order_123",
+     "amount": 99.99,
+     "currency": "CNY"
+   },
+   "environment": "production",
+   "version": "2.1.3"
+ }
+ ```
+
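A log line in this shape can be produced with Python's standard `logging` module. The sketch below is one possible formatter, not a prescribed implementation: the class name `JsonFormatter`, the helper `build_logger`, and the choice to pass optional fields through `extra=` are all assumptions made for illustration.

```python
import json
import logging
from datetime import datetime, timezone

# Fields a caller may attach through logging's `extra=` argument
OPTIONAL_FIELDS = ("trace_id", "span_id", "user_id", "error", "context")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def __init__(self, service, environment, version):
        super().__init__()
        self.static_fields = {
            "service": service,
            "environment": environment,
            "version": version,
        }

    def format(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            # UTC, millisecond precision, "Z" suffix as in the sample above
            "timestamp": created.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            # Python names this level WARNING; the format above uses WARN
            "level": "WARN" if record.levelname == "WARNING" else record.levelname,
            "message": record.getMessage(),
            **self.static_fields,
        }
        for field in OPTIONAL_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry, ensure_ascii=False)


def build_logger(service="order-service", environment="production", version="2.1.3"):
    """Return a logger that writes structured JSON to stderr."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter(service, environment, version))
    logger = logging.getLogger(service)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `build_logger().error("Payment processing failed", extra={"trace_id": "abc123", "error": {"code": "GATEWAY_TIMEOUT"}})` then emits one JSON object per event, with absent optional fields simply omitted.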
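+ **Log collection strategy**:
+ - Collect container and host logs with Fluentd, Fluent Bit, or Filebeat
+ - Ship all logs to a single backend: Elasticsearch or Loki
+ - Retention: keep ERROR/WARN for 90 days and INFO for 30 days; enable DEBUG/TRACE only temporarily during an incident
+ - Sampling: in high-traffic scenarios, sample INFO logs at 10%; ERROR/WARN must always be collected at 100%
+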
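The sampling rule can be enforced at the source with a `logging.Filter`. This is a minimal sketch under the assumption that sampling happens inside the application rather than in the collector; `SamplingFilter` and its `rng` parameter are illustrative names, and the filter applies the same rate to everything below WARNING.

```python
import logging
import random


class SamplingFilter(logging.Filter):
    """Keep every WARNING-or-higher record; keep lower levels with probability `rate`."""

    def __init__(self, rate=0.10, rng=None):
        super().__init__()
        self.rate = rate
        # An injectable random source keeps the behavior reproducible in tests
        self.rng = rng or random.Random()

    def filter(self, record):
        if record.levelno >= logging.WARNING:
            return True  # WARN and ERROR are never dropped
        return self.rng.random() < self.rate
```

Attaching it with `handler.addFilter(SamplingFilter(0.10))` leaves the logger calls in the application unchanged, so the rate can be tuned without touching business code.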
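+ **Log query best practices**:
+ ```text
+ # Lucene-style query syntax (as used in Kibana); adapt it to your log backend
+ # All logs across the call chain for one trace_id
+ trace_id:"abc123"
+
+ # Error logs for one service
+ service:"order-service" AND level:"ERROR"
+
+ # Activity logs for a specific user
+ user_id:"user_789" AND level:("INFO" OR "ERROR")
+
+ # Aggregation: group errors by type (the pipe is pseudocode; Lucene has no stats operator)
+ level:"ERROR" | stats count by error.type
+ ```
+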
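When no query backend is at hand, for example while inspecting a downloaded log file, the same filters can be approximated in a few lines of Python. The helper names `parse_lines`, `select`, and `errors_by_type` are hypothetical, and the sketch assumes one JSON object per line in the structured format described earlier.

```python
import json
from collections import Counter


def parse_lines(lines):
    """Parse one JSON log object per line, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def select(entries, **equals):
    """Return entries whose top-level fields match every keyword argument."""
    return [
        entry for entry in entries
        if all(entry.get(key) == value for key, value in equals.items())
    ]


def errors_by_type(entries):
    """Count ERROR entries grouped by error.type."""
    return Counter(
        entry.get("error", {}).get("type", "unknown")
        for entry in entries
        if entry.get("level") == "ERROR"
    )
```

For instance, `select(entries, trace_id="abc123")` mirrors the first query and `select(entries, service="order-service", level="ERROR")` mirrors the second.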
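+ ### 2. Metrics
+ **Definition**: Aggregatable, numeric measurements of system state
+
+ **Metric types**:
+ - Counter: monotonically increasing, resetting only when the process restarts (e.g., total requests, total errors)
+ - Gauge: can go up or down (e.g., current connections, memory usage)
+ - Histogram: distribution statistics collected in buckets (e.g., request latency, response size)
+ - Summary: quantile statistics (e.g., P50/P95/P99 latency)
+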
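To make the differences concrete, the first three types can be modeled in plain Python. This is a teaching sketch, not the Prometheus client library: the classes are hypothetical and omit labels, thread safety, and the Summary type, which computes quantiles on the client.

```python
import bisect


class Counter:
    """A value that only ever goes up."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """A value that can be set, raised, or lowered."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


class Histogram:
    """Observations counted per bucket; the last slot is the +Inf bucket."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        self.counts = [0] * (len(self.upper_bounds) + 1)
        self.sum = 0.0

    def observe(self, value):
        self.sum += value
        # Index of the first bucket whose upper bound is >= value
        self.counts[bisect.bisect_left(self.upper_bounds, value)] += 1

    def cumulative(self):
        """Cumulative counts: entry i counts observations <= upper_bounds[i]."""
        total, result = 0, []
        for count in self.counts:
            total += count
            result.append(total)
        return result
```

The cumulative form is what a quantile estimate over buckets works from: each bucket answers "how many observations were at most this large".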
+ **RED metrics (request-driven services)**:
+ - Rate: requests per second (RPS)
+ - Errors: share of failed requests (4xx/5xx/timeouts)
+ - Duration: distribution of request processing time (P50/P95/P99)
+
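As a worked example, the three RED values can be derived from raw request records. The `Request` record and the `red` function are assumptions for illustration, and percentiles use the nearest-rank method over exact samples, which differs from the bucket-based estimate a metrics backend would return.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    status: int
    duration_s: float
    timed_out: bool = False


def percentile(sorted_values, p):
    """Nearest-rank percentile of a non-empty sorted list; p in (0, 100]."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def red(requests, window_s):
    """Rate, error ratio, and latency percentiles for one observation window."""
    if not requests:
        return {"rate_rps": 0.0, "error_ratio": 0.0, "p50": None, "p95": None, "p99": None}
    durations = sorted(r.duration_s for r in requests)
    # 4xx, 5xx, and timeouts all count as failures, as listed above
    failed = sum(1 for r in requests if r.status >= 400 or r.timed_out)
    return {
        "rate_rps": len(requests) / window_s,
        "error_ratio": failed / len(requests),
        "p50": percentile(durations, 50),
        "p95": percentile(durations, 95),
        "p99": percentile(durations, 99),
    }
```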
+ **USE metrics (resources)**:
+ - Utilization: percentage of a resource in use (CPU/memory/disk/network)
+ - Saturation: degree of queuing or waiting (load average, run queue)
+ - Errors: count of error events (network errors, disk errors)
+
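The arithmetic behind each USE value is simple enough to sketch for a CPU. The function names and the choice of load average per core as the saturation measure are assumptions; other resources need their own definitions of "busy" and "queued".

```python
def utilization_pct(busy_seconds, interval_seconds, cores):
    """Share of available CPU time that was actually used, in percent."""
    return busy_seconds / (interval_seconds * cores) * 100


def saturation(load_average, cores):
    """Runnable tasks per core; values above 1.0 suggest work is queuing."""
    return load_average / cores


def error_delta(previous_total, current_total):
    """New errors since the last reading of a monotonically increasing counter."""
    if current_total < previous_total:
        # The counter was reset (for example by a reboot), so count from zero
        return current_total
    return current_total - previous_total
```

For example, 30 busy CPU-seconds in a 60-second interval on 2 cores is 25% utilization, and a load average of 4.0 on those 2 cores gives a saturation of 2.0.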
+ **Core business metrics**:
+ ```yaml
+ # E-commerce transaction system
+ - Order creation success rate
+ - Payment success rate
+ - Order fulfillment time (P95/P99)
+ - Inventory accuracy
+ - Cart conversion rate
+
+ # User system
+ - Login success rate
+ - API response time (P95/P99)
+ - Active users (DAU/MAU)
+ - User retention rate
+
+ # Infrastructure
+ - Container CPU utilization
+ - Container memory utilization
+ - Network throughput
+ - Disk I/O latency
+ - Database connection pool utilization
+ ```
+
125
+ **Metrics collection architecture**:
+ ```
+ Application (Prometheus SDK)
+ -> Prometheus Server (pull model)
+ -> Remote Write (long-term storage)
+ -> Grafana (visualization)
+ -> Alertmanager (alerting)
+ ```
+
+ **PromQL query examples**:
+ ```promql
+ # HTTP 5xx error rate (percent)
+ sum(rate(http_requests_total{status=~"5.."}[5m])) /
+ sum(rate(http_requests_total[5m])) * 100
+
+ # P95 latency
+ histogram_quantile(0.95,
+   sum(rate(http_request_duration_seconds_bucket[5m])) by (le)
+ )
+
+ # Container memory utilization (percent)
+ container_memory_usage_bytes{container="order-service"} /
+ container_spec_memory_limit_bytes{container="order-service"} * 100
+
+ # Predicted free disk space 4 hours from now, extrapolated from the last hour
+ # (a result below 0 means the disk is projected to fill up within that window)
+ predict_linear(node_filesystem_free_bytes[1h], 4*3600)
+ ```
+
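To make the P95 query above less opaque, the following is a simplified sketch of the bucket interpolation that a `histogram_quantile`-style estimate relies on: find the bucket containing the target rank, then interpolate linearly inside it. The function signature and the `(upper_bound, cumulative_count)` input shape are assumptions for this example; the real Prometheus implementation handles additional edge cases that are not reproduced here.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound, contain at least one finite
    bucket, and end with (math.inf, total_count).
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[idx]
    if math.isinf(upper):
        # The rank falls into the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower, prev_count = (0.0, 0.0) if idx == 0 else buckets[idx - 1]
    if count == prev_count:
        return lower  # empty bucket, nothing to interpolate
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies that matter, which is worth keeping in mind when choosing buckets around an SLO threshold.
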
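+ ### 3. Traces
+ **Definition**: the complete execution path of a request through a distributed system
+
+ **Core concepts**:
+ - Trace: the full journey of a single request
+ - Span: a unit of work performed by a single service
+ - Span Context: context propagated across services (trace_id, span_id, baggage)
+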
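One common wire format for carrying the Span Context between services is the W3C Trace Context `traceparent` header, which has the shape `version-trace_id-span_id-flags`. The sketch below parses such a header and builds the header for an outgoing call. The names `SpanContext`, `parse_traceparent`, and `child_traceparent` are hypothetical, only version `00` is handled, and baggage (which travels in a separate header) is out of scope.

```python
import re
import secrets
from typing import NamedTuple, Optional

# version (2 hex) - trace_id (32 hex) - span_id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class SpanContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[SpanContext]:
    """Return the parsed context, or None if the header is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # All-zero IDs and version "ff" are invalid.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def child_traceparent(parent: SpanContext) -> str:
    """Build the header for an outgoing call: same trace_id, fresh span_id."""
    span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    flags = "01" if parent.sampled else "00"
    return f"00-{parent.trace_id}-{span_id}-{flags}"
```

In a real service the OpenTelemetry SDK's propagators normally do this work; the sketch is only meant to show what actually crosses the service boundary.
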
+ **Distributed tracing implementation**:
+ ```python
+ # Python OpenTelemetry example (a Flask app is assumed from the route decorator)
+ from flask import Flask, request
+ from opentelemetry import trace
+
+ app = Flask(__name__)
+ tracer = trace.get_tracer(__name__)
+
+ # payment_service and inventory_service are application-specific clients
+ # that are assumed to be defined elsewhere.
+
+ @app.route("/checkout")
+ def checkout():
+     user_id = request.args.get("user_id", "")
+     order_id = request.args.get("order_id", "")
+     with tracer.start_as_current_span("checkout") as span:
+         span.set_attribute("user_id", user_id)
+         span.set_attribute("order_id", order_id)
+
+         # Call the payment service in its own child span
+         with tracer.start_as_current_span("call_payment_service"):
+             payment_result = payment_service.charge()
+
+         # Call the inventory service in its own child span
+         with tracer.start_as_current_span("call_inventory_service"):
+             inventory_result = inventory_service.reserve()
+
+         return {"status": "success"}
+ ```
+
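+ **Trace sampling strategies**:
+ - Head sampling: decide at the start of a trace whether to sample it (probabilistic sampling)
+ - Tail sampling: decide after a trace completes whether to keep it, based on rules (errors and slow requests are always kept)
+ - Hybrid: sample normal traffic at 1% and keep 100% of errors and slow requests
+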
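The hybrid strategy above can be sketched as two small functions: a deterministic head-sampling decision derived from the trace ID, and a tail rule that always keeps errors and slow traces. The `FinishedTrace` type, the function names, and the 2-second slow threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FinishedTrace:
    trace_id: str            # 32 lowercase hex characters
    has_error: bool
    duration_seconds: float


def head_sample(trace_id: str, ratio: float) -> bool:
    """Probabilistic head sampling keyed on the trace ID.

    Using the ID instead of a random draw means every service reaches
    the same decision for the same trace.
    """
    bound = int(ratio * (1 << 64))
    return int(trace_id[-16:], 16) < bound


def tail_keep(trace: FinishedTrace,
              slow_threshold_seconds: float = 2.0,
              baseline_ratio: float = 0.01) -> bool:
    """Tail rule: keep all errors and slow traces, plus a 1% baseline of the rest."""
    if trace.has_error or trace.duration_seconds > slow_threshold_seconds:
        return True
    return head_sample(trace.trace_id, baseline_ratio)
```

Tail sampling requires buffering whole traces until they finish, so it is usually done in a collector tier rather than inside each service.
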
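+ **Trace analysis scenarios**:
+ - Call-chain visualization: identify the critical path and bottleneck services
+ - Dependency graph: generate the service topology automatically
+ - Performance hotspot analysis: locate the longest-running spans
+ - Error propagation tracking: see how an error propagates through the call chain
+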
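+ ## Technology selection
+
+ ### Logging system
+ **Comparison**:
+ | System | Strengths | Weaknesses | Best suited for |
+ |------|------|------|----------|
+ | Elasticsearch + Kibana (ELK) | Full-featured, mature ecosystem | High resource consumption, slow queries | Large-scale log analysis |
+ | Loki + Grafana | Lightweight, unified with Prometheus | Limited query capability | Small to medium scale, cost-sensitive |
+ | Splunk | Enterprise-grade features, strong visualization | Expensive | Enterprise production environments |
+
+ **Recommended**:
+ - Small and medium teams: Loki + Grafana (unified with monitoring)
+ - Large teams: Elasticsearch + Kibana (full-featured)
+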
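+ ### Metrics system
+ **Comparison**:
+ | System | Strengths | Weaknesses | Best suited for |
+ |------|------|------|----------|
+ | Prometheus + Grafana | Open-source standard, powerful queries | Single-node storage; high availability needs an extra solution | General-purpose monitoring |
+ | VictoriaMetrics | High performance, long-term storage | Smaller community | Large-scale time-series data |
+ | Datadog | SaaS, full-stack monitoring | Expensive | Fast enterprise rollout |
+ | InfluxDB | High write throughput | Ecosystem weaker than Prometheus | IoT / real-time analytics |
+
+ **Recommended**:
+ - Self-hosted: Prometheus + Thanos/Cortex (long-term storage plus high availability)
+ - SaaS: Datadog/Dynatrace (fast rollout)
+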
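+ ### Tracing system
+ **Comparison**:
+ | System | Strengths | Weaknesses | Best suited for |
+ |------|------|------|----------|
+ | Jaeger | Open-source standard, originated at Uber | Fairly basic UI | Microservice tracing |
+ | Zipkin | Lightweight, easy to get started | Fewer features | Small to medium scale |
+ | Tempo + Grafana | Unified with Loki/Prometheus | Relatively new | Unified observability |
+ | SkyWalking | Strong APM features, non-intrusive to applications | Smaller community | Java/PHP applications |
+
+ **Recommended**:
+ - Cloud native: Jaeger + OpenTelemetry (standardized)
+ - Unified platform: Grafana Tempo (unified with logs and metrics)
+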
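+ ## Implementation checklist
+
+ ### Phase 1: Baseline coverage (0-1 months)
+ - [ ] Deploy log collection (Fluentd/Filebeat -> Elasticsearch/Loki)
+ - [ ] Deploy metrics collection (Prometheus + Node Exporter + application SDK)
+ - [ ] Deploy a tracing system (Jaeger/Tempo + OpenTelemetry SDK)
+ - [ ] Create baseline dashboards:
+   - Service health (UP/DOWN)
+   - Core business metrics (QPS, error rate, latency)
+   - Infrastructure metrics (CPU, memory, disk, network)
+ - [ ] Configure baseline alert rules (service down, error-rate spike, low disk space)
+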
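+ ### Phase 2: Deep integration (1-3 months)
+ - [ ] Application-level instrumentation:
+   - HTTP/API request metrics
+   - Database query metrics
+   - Cache hit rate
+   - External service call metrics
+ - [ ] Distributed tracing integration:
+   - Cross-service Trace Context propagation
+   - Span instrumentation for databases, caches, and message queues
+   - Automatic tagging of error spans
+ - [ ] Structured logging migration:
+   - Unified log format (JSON)
+   - Trace Context injection (trace_id/span_id)
+   - Masking of sensitive information
+ - [ ] Systematic dashboards:
+   - Business wallboard (conversion rate, revenue, active users)
+   - Technical wallboard (SLI/SLO, capacity, cost)
+   - On-call wallboard (alerts, incidents, on-call staff)
+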
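+ ### Phase 3: Intelligent operations (3-6 months)
+ - [ ] Anomaly detection:
+   - Dynamic thresholds based on historical data
+   - AI/ML anomaly detection algorithms
+ - [ ] Capacity forecasting:
+   - Resource usage trend forecasting
+   - Autoscaling recommendations
+ - [ ] Root cause analysis:
+   - Correlated analysis across logs, metrics, and traces
+   - Automated fault diagnosis reports
+ - [ ] AIOps integration:
+   - Intelligent alert noise reduction
+   - Automatic fault recovery (self-healing)
+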
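+ ## Alert design
+
+ ### Alert severity levels
+ - P0 (critical): core business outage, risk of data loss
+   - Response time: <5 minutes
+   - Notification: phone call + SMS + IM
+ - P1 (severe): partial feature degradation, severe performance drop
+   - Response time: <15 minutes
+   - Notification: SMS + IM
+ - P2 (warning): potential risk, resources approaching thresholds
+   - Response time: <1 hour
+   - Notification: IM + email
+ - P3 (notice): non-urgent, can be handled later
+   - Response time: <24 hours
+   - Notification: email
+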
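The severity table above maps naturally onto a small lookup structure that a notification router could consult. This is a minimal sketch; `SeverityPolicy`, `POLICIES`, `route`, and `is_overdue` are hypothetical names, and the channel strings are placeholders for whatever integrations are actually in use.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Dict, Tuple


@dataclass(frozen=True)
class SeverityPolicy:
    response_within: timedelta   # response-time target for this level
    channels: Tuple[str, ...]    # notification channels, in order


# Values mirror the severity levels described in the text.
POLICIES: Dict[str, SeverityPolicy] = {
    "P0": SeverityPolicy(timedelta(minutes=5), ("phone", "sms", "im")),
    "P1": SeverityPolicy(timedelta(minutes=15), ("sms", "im")),
    "P2": SeverityPolicy(timedelta(hours=1), ("im", "email")),
    "P3": SeverityPolicy(timedelta(hours=24), ("email",)),
}


def route(severity: str) -> SeverityPolicy:
    """Look up the policy for a severity label, failing loudly on unknown labels."""
    try:
        return POLICIES[severity]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}") from None


def is_overdue(severity: str, elapsed: timedelta) -> bool:
    """True once an unacknowledged alert has reached its response-time target."""
    return elapsed >= route(severity).response_within
```

Keeping the policy in data rather than in branching code makes it easy to review against the written policy and to test.
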
+ ### Alert rule examples
+ ```yaml
+ # Service availability alert
+ - alert: ServiceDown
+   expr: up{job="order-service"} == 0
+   for: 2m
+   labels:
+     severity: P0
+   annotations:
+     summary: "Service down: {{ $labels.job }}"
+     description: "Instance {{ $labels.instance }} has been down for more than 2 minutes"
+
+ # Error rate alert
+ - alert: HighErrorRate
+   expr: |
+     sum(rate(http_requests_total{status=~"5.."}[5m])) by (service) /
+     sum(rate(http_requests_total[5m])) by (service) > 0.05
+   for: 5m
+   labels:
+     severity: P1
+   annotations:
+     summary: "High error rate on service {{ $labels.service }}"
+     description: "Error rate {{ $value | humanizePercentage }} exceeds the 5% threshold"
+
+ # Latency alert
+ - alert: HighLatency
+   expr: |
+     histogram_quantile(0.95,
+       sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service)
+     ) > 2
+   for: 10m
+   labels:
+     severity: P1
+   annotations:
+     summary: "High P95 latency on service {{ $labels.service }}"
+     description: "P95 latency of {{ $value }} seconds exceeds the 2-second threshold"
+ ```
+
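+ ### Alert noise reduction
+ - Grouping: merge related alerts into a single notification
+ - Inhibition: higher-priority alerts suppress lower-priority ones
+ - Silencing: mute alerts during planned maintenance windows
+ - Deduplication: collapse identical alerts to avoid flooding the channel
+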
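Three of the techniques above (deduplication, inhibition, grouping) can be sketched in a few lines. Alertmanager provides these features through configuration; the functions below are hypothetical and only illustrate the logic, with alerts modeled as plain label dictionaries and inhibition scoped per `service` as an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Alert = Dict[str, str]  # a label set, e.g. {"alertname": ..., "service": ..., "severity": ...}


def deduplicate(alerts: Iterable[Alert]) -> List[Alert]:
    """Drop alerts whose label set is identical to one already seen."""
    seen = set()
    unique = []
    for alert in alerts:
        fingerprint = tuple(sorted(alert.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(alert)
    return unique


def inhibit(alerts: List[Alert], source_severity: str = "P0") -> List[Alert]:
    """While a service has a P0 firing, suppress its lower-severity alerts."""
    muted = {a["service"] for a in alerts if a["severity"] == source_severity}
    return [a for a in alerts
            if a["severity"] == source_severity or a["service"] not in muted]


def group(alerts: Iterable[Alert],
          by: Tuple[str, ...] = ("service",)) -> Dict[Tuple[str, ...], List[Alert]]:
    """Bundle alerts that share the `by` labels into one notification each."""
    groups: Dict[Tuple[str, ...], List[Alert]] = defaultdict(list)
    for alert in alerts:
        groups[tuple(alert.get(label, "") for label in by)].append(alert)
    return dict(groups)
```

Applied in the order deduplicate, inhibit, group, a burst of alerts from one failing service collapses into a single notification headed by its most severe alert.
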
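+ ## Dashboard design principles
+
+ ### Information hierarchy
+ 1. **L1 - Business view**: business health, user experience, core KPIs
+ 2. **L2 - Service view**: service SLI/SLO, error rate, latency, throughput
+ 3. **L3 - Infrastructure view**: CPU, memory, disk, network, middleware
+ 4. **L4 - Detail view**: logs, traces, debugging information
+
+ ### Visualization best practices
+ - Use color coding: green (healthy), yellow (warning), red (abnormal)
+ - Add threshold lines that clearly mark the normal, warning, and abnormal ranges
+ - Time range picker: support quick switching between 1h/6h/24h/7d/30d
+ - Linked variables: support filtering by service, environment, and version
+ - Chart annotations: annotate key events (deployments, incidents, scale-ups)
+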
+ ### Dashboard template
+ ```yaml
+ # Service health dashboard
+ - title: Service Health Overview
+   panels:
+     - Service status (Stat panel)
+     - QPS (Time Series)
+     - Error rate (Time Series)
+     - P95/P99 latency (Time Series)
+     - Active instances (Gauge)
+     - Recent alerts (Table)
+ ```
+
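+ ## Common failure modes
+
+ ### 1. Logging problems
+ - **Log volume drives storage cost out of control**: no sampling, DEBUG logs retained in full
+ - **Inconsistent log formats**: each team defines its own format, so logs cannot be queried uniformly
+ - **Sensitive information leakage**: no masking, so logs contain passwords, keys, or personal data
+ - **Missing Trace Context**: logs cannot be correlated with traces, which slows down troubleshooting
+
+ ### 2. Metrics problems
+ - **Metric explosion**: high-cardinality labels (such as user_id or request_id) degrade TSDB performance
+ - **Missing business metrics**: only technical metrics exist, so business health is not visible
+ - **Alert storms**: poorly chosen thresholds and no noise reduction lead to alert fatigue
+ - **Too many disorganized dashboards**: without hierarchy and governance, key information is hard to find
+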
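The metric explosion problem above comes down to multiplication: the number of time series for one metric is bounded by the product of the distinct values of each label. A minimal sketch of that arithmetic follows; the function name and the example label counts are made up for illustration.

```python
from math import prod
from typing import Dict


def estimate_series(label_cardinalities: Dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


# Bounded labels keep the series count manageable.
bounded = {"service": 20, "status": 5, "method": 4}
# Adding a per-user label multiplies the count by the number of users.
with_user_id = {**bounded, "user_id": 100_000}

print(estimate_series(bounded))       # 400
print(estimate_series(with_user_id))  # 40000000
```

This is why identifiers such as user_id or request_id usually belong in logs or span attributes rather than in metric labels.
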
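+ ### 3. Tracing problems
+ - **Poorly chosen sampling rate**: normal requests are sampled too sparsely and traces for critical errors are lost
+ - **Spans too coarse**: key steps have no span of their own, so bottlenecks cannot be located
+ - **Lost Trace Context**: cross-service propagation fails and the trace chain breaks
+ - **Trace retention too short**: historical problems cannot be investigated
+
+ ### 4. Systemic problems
+ - **Observability silos**: logs, metrics, and traces live in three separate systems that cannot be correlated
+ - **Lack of documentation and training**: team members do not know how to use the tools, so the value of observability goes unrealized
+ - **Runaway cost**: no resource planning or optimization, so storage and query costs are too high
+ - **Ignoring performance impact**: the observability components themselves become a performance bottleneck
+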
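+ ## Acceptance criteria
+
+ ### Functional acceptance
+ - [ ] Log collection covers 100% of services
+ - [ ] Core metrics (RED/USE) cover 100% of services
+ - [ ] Distributed tracing is integrated in 100% of services
+ - [ ] Key business flows are visible end to end in traces
+ - [ ] Logs, metrics, and traces can be correlated through Trace Context
+
+ ### Performance acceptance
+ - [ ] Log collection latency < 10 seconds
+ - [ ] Metrics collection latency < 30 seconds
+ - [ ] Trace collection latency < 60 seconds
+ - [ ] Dashboard load time < 5 seconds
+ - [ ] Query response time < 10 seconds (for 90% of requests)
+
+ ### Cost acceptance
+ - [ ] Log storage cost < $X/GB/month
+ - [ ] Metrics storage cost < $Y/million samples/month
+ - [ ] Trace storage cost < $Z/million spans/month
+ - [ ] Total observability cost < 10% of infrastructure cost
+
+ ### Availability acceptance
+ - [ ] Logging system availability >= 99.9%
+ - [ ] Metrics system availability >= 99.9%
+ - [ ] Tracing system availability >= 99.5%
+ - [ ] Data retention meets compliance requirements
+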
+ ## References
+
+ ### Open-source tools
+ - OpenTelemetry: unified observability standard https://opentelemetry.io/
+ - Prometheus: metrics collection and alerting https://prometheus.io/
+ - Grafana: unified visualization platform https://grafana.com/
+ - Jaeger: distributed tracing https://www.jaegertracing.io/
+ - Fluentd: log collection https://www.fluentd.org/
+
+ ### Best practices
+ - Google SRE Book: https://sre.google/books/
+ - Observability Engineering (O'Reilly)
+ - Distributed Tracing in Practice (O'Reilly)
+
+ ### Cloud services
+ - AWS X-Ray, CloudWatch
+ - Azure Monitor, Application Insights
+ - Google Cloud Operations Suite
+ - Datadog, Dynatrace, New Relic