@umacloud/knowledge 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (418)
  1. package/00-governance/governance-capabilities.md +557 -0
  2. package/00-governance/knowledge-map.md +39 -0
  3. package/00-governance/maintenance-policy.md +76 -0
  4. package/00-governance/review-checklist.md +81 -0
  5. package/README.md +13 -0
  6. package/ai/01-standards/agent-development-complete.md +691 -0
  7. package/ai/01-standards/llm-application-complete.md +488 -0
  8. package/ai/01-standards/mlops-complete.md +798 -0
  9. package/ai/01-standards/prompt-engineering-complete.md +646 -0
  10. package/ai/01-standards/rag-architecture-complete.md +649 -0
  11. package/ai/02-playbooks/llm-evaluation-playbook.md +847 -0
  12. package/ai/03-checklists/ai-project-checklist.md +215 -0
  13. package/ai/04-antipatterns/ai-antipatterns.md +661 -0
  14. package/ai/05-cases/case-rag-production.md +147 -0
  15. package/ai/06-glossary/ai-glossary.md +162 -0
  16. package/ai/agent-evaluation-benchmark.md +53 -0
  17. package/ai/ai-agent-memory-context-management.md +41 -0
  18. package/ai/ai-cost-capacity-optimization-playbook.md +42 -0
  19. package/ai/ai-data-security-and-compliance-playbook.md +37 -0
  20. package/ai/ai-domain-index-and-checklist.md +40 -0
  21. package/ai/ai-governance-maturity-model.md +50 -0
  22. package/ai/ai-model-selection-and-routing-strategy.md +47 -0
  23. package/ai/ai-observability-and-oncall-runbook.md +52 -0
  24. package/ai/ai-rag-engineering-playbook.md +42 -0
  25. package/ai/ai-red-team-and-safety-evaluation.md +42 -0
  26. package/ai/ai-release-readiness-and-rollback-gate.md +42 -0
  27. package/ai/llm-agent-engineering-deep-dive.md +57 -0
  28. package/ai/prompt-and-tool-guardrails.md +52 -0
  29. package/api/01-standards/enterprise-api-standards.md +198 -0
  30. package/api/01-standards/rest-api-design-guide.md +63 -0
  31. package/api/02-playbooks/api-pagination-playbook.md +93 -0
  32. package/api/02-playbooks/graphql-production-playbook.md +176 -0
  33. package/api/03-checklists/api-review-checklist.md +55 -0
  34. package/api/04-antipatterns/api-antipatterns.md +112 -0
  35. package/architecture/01-standards/api-gateway-patterns.md +496 -0
  36. package/architecture/01-standards/cloud-native-patterns.md +644 -0
  37. package/architecture/01-standards/distributed-systems-patterns.md +591 -0
  38. package/architecture/01-standards/event-driven-architecture.md +595 -0
  39. package/architecture/01-standards/microservices-patterns-complete.md +968 -0
  40. package/architecture/01-standards/microservices-patterns.md +495 -0
  41. package/architecture/01-standards/system-design-interview.md +664 -0
  42. package/architecture/02-playbooks/microservices-patterns-playbook.md +137 -0
  43. package/architecture/02-playbooks/migration-playbook.md +780 -0
  44. package/architecture/02-playbooks/system-design-playbook.md +779 -0
  45. package/architecture/03-checklists/architecture-decision-checklist.md +297 -0
  46. package/architecture/04-antipatterns/architecture-antipatterns.md +417 -0
  47. package/architecture/05-cases/case-netflix-microservices.md +413 -0
  48. package/architecture/06-glossary/architecture-glossary.md +164 -0
  49. package/architecture/adr-template-and-examples.md +38 -0
  50. package/architecture/api-gateway-deep-dive.md +1291 -0
  51. package/architecture/configuration-management.md +1162 -0
  52. package/architecture/distributed-transactions.md +1220 -0
  53. package/architecture/microservices-complete.md +735 -0
  54. package/architecture/resilience-and-disaster-patterns.md +37 -0
  55. package/architecture/service-governance.md +1198 -0
  56. package/architecture/system-architecture-deep-dive.md +37 -0
  57. package/backend/01-standards/analytics-and-growth.md +65 -0
  58. package/backend/01-standards/api-and-error-conventions.md +120 -0
  59. package/backend/01-standards/application-layering-and-packaging.md +160 -0
  60. package/backend/01-standards/auth-implementation.md +104 -0
  61. package/backend/01-standards/backend-framework-idioms.md +74 -0
  62. package/backend/01-standards/background-jobs-and-async.md +66 -0
  63. package/backend/01-standards/caching-strategies-complete.md +390 -0
  64. package/backend/01-standards/config-and-observability.md +77 -0
  65. package/backend/01-standards/data-modeling-and-persistence.md +94 -0
  66. package/backend/01-standards/django-complete.md +1765 -0
  67. package/backend/01-standards/email-and-notifications.md +64 -0
  68. package/backend/01-standards/fastapi-complete.md +925 -0
  69. package/backend/01-standards/file-upload-and-storage.md +66 -0
  70. package/backend/01-standards/graphql-api-complete.md +416 -0
  71. package/backend/01-standards/llm-application-standard.md +78 -0
  72. package/backend/01-standards/message-queue-patterns.md +379 -0
  73. package/backend/01-standards/microservices-and-distributed.md +78 -0
  74. package/backend/01-standards/nestjs-complete.md +2167 -0
  75. package/backend/01-standards/payment-integration.md +80 -0
  76. package/backend/01-standards/rate-limiting-complete.md +451 -0
  77. package/backend/01-standards/realtime-and-websocket.md +65 -0
  78. package/backend/01-standards/search-and-filtering.md +64 -0
  79. package/backend/01-standards/spring-boot-complete.md +445 -0
  80. package/backend/02-playbooks/api-design-playbook.md +718 -0
  81. package/backend/02-playbooks/email-send-playbook.md +130 -0
  82. package/backend/02-playbooks/file-upload-s3-playbook.md +153 -0
  83. package/backend/02-playbooks/typescript-enterprise-playbook.md +133 -0
  84. package/backend/02-playbooks/websocket-realtime-playbook.md +154 -0
  85. package/backend/03-checklists/api-launch-checklist.md +189 -0
  86. package/backend/04-antipatterns/backend-antipatterns.md +1051 -0
  87. package/blockchain/01-standards/blockchain-basics.md +557 -0
  88. package/blockchain/01-standards/smart-contract-development.md +1315 -0
  89. package/cicd/01-standards/deployment-and-delivery-standard.md +96 -0
  90. package/cicd/01-standards/github-actions-complete.md +473 -0
  91. package/cicd/01-standards/release-and-store-submission.md +75 -0
  92. package/cicd/02-playbooks/cicd-pipeline-playbook.md +144 -0
  93. package/cicd/02-playbooks/release-management-playbook.md +605 -0
  94. package/cicd/03-checklists/pipeline-security-checklist.md +168 -0
  95. package/cicd/04-antipatterns/cicd-antipatterns.md +589 -0
  96. package/cicd/05-cases/case-deployment-automation.md +221 -0
  97. package/cicd/05-cases/case-gitops-transformation.md +212 -0
  98. package/cicd/06-glossary/cicd-glossary.md +114 -0
  99. package/cicd/cicd-blueprint-deep-dive.md +38 -0
  100. package/cicd/release-readiness-gate.md +37 -0
  101. package/cloud-native/01-standards/container-security.md +741 -0
  102. package/cloud-native/01-standards/kubernetes-complete.md +812 -0
  103. package/cloud-native/02-playbooks/api-gateway-playbook.md +155 -0
  104. package/cloud-native/02-playbooks/gitops-with-argocd.md +760 -0
  105. package/cloud-native/02-playbooks/k8s-troubleshooting-playbook.md +1942 -0
  106. package/cloud-native/02-playbooks/message-queue-playbook.md +129 -0
  107. package/cloud-native/02-playbooks/multicloud-governance.md +726 -0
  108. package/cloud-native/02-playbooks/serverless-patterns.md +788 -0
  109. package/cloud-native/02-playbooks/service-mesh-playbook.md +612 -0
  110. package/cloud-native/02-playbooks/terraform-iac-playbook.md +143 -0
  111. package/cloud-native/03-checklists/container-security-checklist.md +431 -0
  112. package/cloud-native/03-checklists/k8s-production-readiness-checklist.md +460 -0
  113. package/cloud-native/04-antipatterns/container-antipatterns.md +660 -0
  114. package/cloud-native/04-antipatterns/k8s-antipatterns.md +743 -0
  115. package/cloud-native/05-cases/case-k8s-migration.md +478 -0
  116. package/cloud-native/05-cases/case-k8s-scaling.md +642 -0
  117. package/cloud-native/05-cases/case-k8s-security-incident.md +397 -0
  118. package/cloud-native/06-glossary/cloud-native-glossary.md +337 -0
  119. package/cross-platform/01-standards/cross-platform-frameworks.md +83 -0
  120. package/cross-platform/01-standards/platform-selection-and-architecture.md +77 -0
  121. package/data/01-standards/elasticsearch-complete.md +2098 -0
  122. package/data/01-standards/postgresql-complete.md +1613 -0
  123. package/data/01-standards/redis-complete.md +1527 -0
  124. package/data/02-playbooks/database-optimization-playbook.md +403 -0
  125. package/data/02-playbooks/elasticsearch-production-playbook.md +132 -0
  126. package/data/03-checklists/database-launch-checklist.md +187 -0
  127. package/data/04-antipatterns/database-antipatterns.md +873 -0
  128. package/data/05-cases/case-database-migration.md +310 -0
  129. package/data/06-glossary/database-glossary.md +440 -0
  130. package/data/data-governance-and-modeling-deep-dive.md +39 -0
  131. package/data-engineering/01-standards/airflow-complete.md +523 -0
  132. package/data-engineering/01-standards/kafka-complete.md +1521 -0
  133. package/data-engineering/02-playbooks/spark-etl-playbook.md +496 -0
  134. package/data-engineering/03-checklists/pipeline-launch-checklist.md +194 -0
  135. package/data-engineering/04-antipatterns/data-pipeline-antipatterns.md +684 -0
  136. package/data-engineering/05-cases/case-real-time-pipeline.md +355 -0
  137. package/data-engineering/06-glossary/data-engineering-glossary.md +429 -0
  138. package/database/01-standards/database-schema-standards.md +147 -0
  139. package/database/02-playbooks/postgresql-optimization-quick.md +52 -0
  140. package/database/02-playbooks/postgresql-performance-optimization.md +58 -0
  141. package/database/02-playbooks/postgresql-production-playbook.md +146 -0
  142. package/database/02-playbooks/redis-caching-playbook.md +117 -0
  143. package/database/03-checklists/database-review-checklist.md +50 -0
  144. package/database/04-antipatterns/database-antipatterns.md +112 -0
  145. package/design/01-standards/ui-design-system-complete.md +423 -0
  146. package/design/02-playbooks/design-handoff-playbook.md +254 -0
  147. package/design/02-playbooks/design-review-playbook.md +388 -0
  148. package/design/03-checklists/design-review-checklist.md +246 -0
  149. package/design/04-antipatterns/design-antipatterns.md +378 -0
  150. package/design/05-cases/case-design-system-adoption.md +328 -0
  151. package/design/06-glossary/design-glossary.md +329 -0
  152. package/design/ui-full-lifecycle-cross-platform-playbook.md +571 -0
  153. package/design/ux-system-deep-dive.md +38 -0
  154. package/design-systems/00-craft-rules.md +71 -0
  155. package/design-systems/aesthetic-families.md +43 -0
  156. package/design-systems/anti-ai-slop.md +162 -0
  157. package/design-systems/bold-geometric.md +120 -0
  158. package/design-systems/brutalist-bold.md +103 -0
  159. package/design-systems/editorial-clean.md +109 -0
  160. package/design-systems/glass-aurora.md +108 -0
  161. package/design-systems/modern-minimal.md +145 -0
  162. package/design-systems/premium-luxury.md +106 -0
  163. package/design-systems/product-type-design-map.md +48 -0
  164. package/design-systems/soft-warm.md +123 -0
  165. package/design-systems/tech-utility.md +113 -0
  166. package/desktop/01-standards/desktop-app-standard.md +72 -0
  167. package/desktop/01-standards/desktop-design.md +71 -0
  168. package/development/00-governance/document-template.md +41 -0
  169. package/development/01-standards/api-versioning-strategies.md +432 -0
  170. package/development/01-standards/authentication-patterns-complete.md +479 -0
  171. package/development/01-standards/css-architecture-complete.md +550 -0
  172. package/development/01-standards/database-migration-strategies.md +484 -0
  173. package/development/01-standards/elasticsearch-complete.md +347 -0
  174. package/development/01-standards/git-complete.md +371 -0
  175. package/development/01-standards/golang-complete.md +1565 -0
  176. package/development/01-standards/graphql-complete.md +298 -0
  177. package/development/01-standards/javascript-bundlers-complete.md +469 -0
  178. package/development/01-standards/javascript-typescript-complete.md +528 -0
  179. package/development/01-standards/jest-complete.md +275 -0
  180. package/development/01-standards/linux-complete.md +234 -0
  181. package/development/01-standards/logging-observability-complete.md +526 -0
  182. package/development/01-standards/microservices-communication.md +502 -0
  183. package/development/01-standards/mongodb-complete.md +406 -0
  184. package/development/01-standards/oauth2-complete.md +285 -0
  185. package/development/01-standards/performance-optimization-complete.md +289 -0
  186. package/development/01-standards/playwright-complete.md +247 -0
  187. package/development/01-standards/postgresql-complete.md +456 -0
  188. package/development/01-standards/pytest-complete.md +340 -0
  189. package/development/01-standards/python-async-programming.md +902 -0
  190. package/development/01-standards/python-complete.md +956 -0
  191. package/development/01-standards/python-decorators-complete.md +799 -0
  192. package/development/01-standards/python-design-patterns.md +2854 -0
  193. package/development/01-standards/python-packaging-distribution.md +420 -0
  194. package/development/01-standards/python-testing-strategies.md +607 -0
  195. package/development/01-standards/python-web-frameworks-comparison.md +471 -0
  196. package/development/01-standards/redis-complete.md +317 -0
  197. package/development/01-standards/rest-api-complete.md +316 -0
  198. package/development/01-standards/rust-complete.md +578 -0
  199. package/development/01-standards/typescript-advanced-types.md +1513 -0
  200. package/development/01-standards/web-security-complete.md +292 -0
  201. package/development/02-playbooks/api-design-playbook.md +810 -0
  202. package/development/02-playbooks/database-migration-playbook.md +580 -0
  203. package/development/02-playbooks/debugging-playbook.md +692 -0
  204. package/development/02-playbooks/feature-delivery-playbook.md +430 -0
  205. package/development/02-playbooks/incident-hotfix-playbook.md +387 -0
  206. package/development/02-playbooks/performance-optimization-playbook.md +531 -0
  207. package/development/02-playbooks/performance-tuning-playbook.md +652 -0
  208. package/development/02-playbooks/refactor-playbook.md +403 -0
  209. package/development/02-playbooks/release-playbook.md +469 -0
  210. package/development/03-checklists/architecture-review-checklist.md +168 -0
  211. package/development/03-checklists/data-migration-checklist.md +157 -0
  212. package/development/03-checklists/oncall-handover-checklist.md +173 -0
  213. package/development/03-checklists/pr-checklist.md +158 -0
  214. package/development/03-checklists/production-readiness-checklist.md +190 -0
  215. package/development/03-checklists/release-readiness-checklist.md +154 -0
  216. package/development/03-checklists/security-review-checklist.md +182 -0
  217. package/development/04-antipatterns/api-antipatterns.md +657 -0
  218. package/development/04-antipatterns/architecture-antipatterns.md +686 -0
  219. package/development/04-antipatterns/backend-antipatterns.md +648 -0
  220. package/development/04-antipatterns/cicd-antipatterns.md +540 -0
  221. package/development/04-antipatterns/code-smell-antipatterns.md +571 -0
  222. package/development/04-antipatterns/data-antipatterns.md +658 -0
  223. package/development/04-antipatterns/database-antipatterns.md +578 -0
  224. package/development/04-antipatterns/frontend-antipatterns.md +635 -0
  225. package/development/04-antipatterns/reliability-antipatterns.md +700 -0
  226. package/development/04-antipatterns/security-antipatterns.md +747 -0
  227. package/development/05-cases/case-api-version-migration.md +428 -0
  228. package/development/05-cases/case-authorization-hardening.md +383 -0
  229. package/development/05-cases/case-bluegreen-rollback.md +466 -0
  230. package/development/05-cases/case-cache-snowball-protection.md +485 -0
  231. package/development/05-cases/case-ci-cd-pipeline.md +544 -0
  232. package/development/05-cases/case-database-scaling.md +500 -0
  233. package/development/05-cases/case-db-hotspot-optimization.md +487 -0
  234. package/development/05-cases/case-incident-mttr-reduction.md +563 -0
  235. package/development/05-cases/case-microservice-migration.md +375 -0
  236. package/development/05-cases/case-performance-optimization.md +406 -0
  237. package/development/05-cases/case-security-incident-response.md +345 -0
  238. package/development/06-glossary/full-stack-glossary.md +166 -0
  239. package/development/09-maturity/quarterly-audit-template.md +35 -0
  240. package/development/11-ui-excellence/ui-aesthetic-system.md +41 -0
  241. package/development/11-ui-excellence/ui-engineering-excellence.md +435 -0
  242. package/development/12-scenarios/development-scenarios-guide.md +565 -0
  243. package/development/13-implementation-assets/implementation-toolkit.md +282 -0
  244. package/development/13-implementation-assets/knowledge-gates-execution.md +43 -0
  245. package/development/14-full-lifecycle/software-lifecycle-gates.md +511 -0
  246. package/development/15-lifecycle-templates/project-templates-collection.md +791 -0
  247. package/development/api-contract-and-versioning-guide.md +36 -0
  248. package/development/api-governance-complete.md +43 -0
  249. package/development/backend-engineering-complete.md +43 -0
  250. package/development/code-review-quality-complete.md +43 -0
  251. package/development/concurrency-reliability-complete.md +43 -0
  252. package/development/database-engineering-complete.md +43 -0
  253. package/development/engineering-effectiveness-complete.md +43 -0
  254. package/development/engineering-standards-deep-dive.md +38 -0
  255. package/development/frontend-engineering-complete.md +43 -0
  256. package/development/performance-capacity-complete.md +43 -0
  257. package/development/refactor-migration-complete.md +42 -0
  258. package/development/refactoring-and-techdebt-playbook.md +37 -0
  259. package/development/security-in-development-complete.md +43 -0
  260. package/devops/01-standards/cicd-pipeline-complete.md +262 -0
  261. package/devops/01-standards/docker-complete.md +1490 -0
  262. package/devops/01-standards/github-actions-complete.md +337 -0
  263. package/devops/01-standards/kubernetes-complete.md +638 -0
  264. package/devops/01-standards/terraform-complete.md +2117 -0
  265. package/devops/02-playbooks/docker-compose-playbook.md +233 -0
  266. package/devops/02-playbooks/docker-k8s-production-playbook.md +186 -0
  267. package/devops/02-playbooks/docker-production-playbook.md +952 -0
  268. package/edge-iot/01-standards/edge-iot-complete.md +473 -0
  269. package/experts/architect/api-design.md +178 -0
  270. package/experts/architect/methodology.md +124 -0
  271. package/experts/architect/security.md +75 -0
  272. package/experts/backend-lead/methodology.md +216 -0
  273. package/experts/devops/methodology.md +160 -0
  274. package/experts/frontend-lead/methodology.md +178 -0
  275. package/experts/product-manager/industry/ecommerce.md +43 -0
  276. package/experts/product-manager/industry/saas.md +40 -0
  277. package/experts/product-manager/methodology.md +97 -0
  278. package/experts/qa-lead/methodology.md +123 -0
  279. package/experts/qa-lead/test-strategy.md +128 -0
  280. package/experts/uiux-designer/methodology.md +125 -0
  281. package/frontend/01-standards/accessibility-complete.md +532 -0
  282. package/frontend/01-standards/accessibility-standard.md +74 -0
  283. package/frontend/01-standards/admin-dashboard-and-crud.md +72 -0
  284. package/frontend/01-standards/design-tokens-complete.md +444 -0
  285. package/frontend/01-standards/forms-and-validation.md +77 -0
  286. package/frontend/01-standards/frontend-architecture-and-layering.md +119 -0
  287. package/frontend/01-standards/i18n-and-localization.md +65 -0
  288. package/frontend/01-standards/nextjs-complete.md +451 -0
  289. package/frontend/01-standards/react-complete.md +713 -0
  290. package/frontend/01-standards/react-hooks-complete-guide.md +1100 -0
  291. package/frontend/01-standards/react-hooks-complete.md +1171 -0
  292. package/frontend/01-standards/seo-and-web-vitals.md +77 -0
  293. package/frontend/01-standards/state-management-complete.md +444 -0
  294. package/frontend/01-standards/vue-complete.md +499 -0
  295. package/frontend/01-standards/vue3-complete.md +2002 -0
  296. package/frontend/01-standards/web-framework-best-practices.md +64 -0
  297. package/frontend/01-standards/web-performance-complete.md +495 -0
  298. package/frontend/02-playbooks/accessibility-a11y-playbook.md +161 -0
  299. package/frontend/02-playbooks/frontend-performance-playbook.md +707 -0
  300. package/frontend/02-playbooks/i18n-internationalization-playbook.md +120 -0
  301. package/frontend/02-playbooks/performance-optimization-playbook.md +163 -0
  302. package/frontend/02-playbooks/react-nextjs-production-playbook.md +167 -0
  303. package/frontend/02-playbooks/react-state-management-playbook.md +173 -0
  304. package/frontend/03-checklists/component-quality-checklist.md +166 -0
  305. package/frontend/03-checklists/frontend-launch-checklist.md +299 -0
  306. package/frontend/04-antipatterns/frontend-antipatterns.md +886 -0
  307. package/frontend/05-cases/case-performance-optimization.md +274 -0
  308. package/harmony/01-standards/harmonyos-arkts-standard.md +75 -0
  309. package/harmony/01-standards/harmonyos-design.md +65 -0
  310. package/high-quality-engineering-playbook.md +54 -0
  311. package/incident/01-standards/incident-response-complete.md +303 -0
  312. package/incident/02-playbooks/chaos-engineering-playbook.md +883 -0
  313. package/incident/02-playbooks/postmortem-playbook.md +398 -0
  314. package/incident/03-checklists/incident-readiness-checklist.md +181 -0
  315. package/incident/04-antipatterns/incident-antipatterns.md +490 -0
  316. package/incident/05-cases/case-cascade-failure.md +176 -0
  317. package/incident/06-glossary/incident-glossary.md +114 -0
  318. package/incident/postmortem-and-response-deep-dive.md +39 -0
  319. package/industries/ecommerce/ecommerce-complete.md +631 -0
  320. package/industries/education/education-complete.md +555 -0
  321. package/industries/fintech/fintech-complete.md +501 -0
  322. package/industries/gaming/gaming-complete.md +587 -0
  323. package/industries/healthcare/healthcare-complete.md +452 -0
  324. package/low-code/01-standards/low-code-complete.md +944 -0
  325. package/miniprogram/01-standards/ai-common-mistakes.md +61 -0
  326. package/miniprogram/01-standards/miniprogram-custom-navbar-capsule.md +77 -0
  327. package/miniprogram/01-standards/miniprogram-design.md +61 -0
  328. package/miniprogram/01-standards/miniprogram-standard.md +81 -0
  329. package/mobile/01-standards/android-material-design.md +70 -0
  330. package/mobile/01-standards/flutter-complete.md +384 -0
  331. package/mobile/01-standards/ios-design-hig.md +78 -0
  332. package/mobile/01-standards/mobile-app-standard.md +85 -0
  333. package/mobile/01-standards/react-native-complete.md +352 -0
  334. package/mobile/02-playbooks/mobile-cross-platform-playbook.md +175 -0
  335. package/mobile/02-playbooks/mobile-performance.md +473 -0
  336. package/mobile/03-checklists/mobile-release-checklist.md +234 -0
  337. package/mobile/04-antipatterns/mobile-antipatterns.md +798 -0
  338. package/mobile/05-cases/case-app-performance.md +500 -0
  339. package/mobile/05-cases/case-app-startup-optimization.md +218 -0
  340. package/mobile/06-glossary/mobile-glossary.md +484 -0
  341. package/observability/01-standards/observability-standards.md +103 -0
  342. package/observability/02-playbooks/prometheus-grafana-playbook.md +135 -0
  343. package/observability/02-playbooks/structured-logging-playbook.md +73 -0
  344. package/observability/03-checklists/observability-checklist.md +54 -0
  345. package/observability/04-antipatterns/observability-antipatterns.md +106 -0
  346. package/operations/01-standards/prometheus-monitoring-complete.md +1578 -0
  347. package/operations/02-playbooks/capacity-planning-playbook.md +620 -0
  348. package/operations/03-checklists/production-launch-checklist.md +365 -0
  349. package/operations/04-antipatterns/operations-antipatterns.md +664 -0
  350. package/operations/05-cases/case-sre-practices.md +581 -0
  351. package/operations/06-glossary/operations-glossary.md +120 -0
  352. package/operations/aiops-anomaly-detection.md +758 -0
  353. package/operations/capacity-planning.md +1061 -0
  354. package/operations/chaos-engineering.md +659 -0
  355. package/operations/incident-command-system.md +38 -0
  356. package/operations/observability-complete.md +442 -0
  357. package/operations/slo-sli-playbook.md +517 -0
  358. package/operations/sre-operations-deep-dive.md +39 -0
  359. package/package.json +8 -0
  360. package/performance/01-standards/performance-and-scalability.md +80 -0
  361. package/performance/01-standards/performance-standards.md +156 -0
  362. package/performance/02-playbooks/query-optimization-playbook.md +103 -0
  363. package/performance/03-checklists/performance-checklist.md +56 -0
  364. package/performance/04-antipatterns/performance-antipatterns.md +146 -0
  365. package/product/01-standards/product-management-complete.md +285 -0
  366. package/product/02-playbooks/feature-launch-playbook.md +207 -0
  367. package/product/02-playbooks/user-research-playbook.md +532 -0
  368. package/product/03-checklists/feature-launch-checklist.md +275 -0
  369. package/product/04-antipatterns/product-antipatterns.md +355 -0
  370. package/product/05-cases/case-mvp-to-scale.md +384 -0
  371. package/product/06-glossary/product-glossary.md +462 -0
  372. package/product/feature-prioritization-framework.md +40 -0
  373. package/product/kpi-and-metric-tree.md +37 -0
  374. package/product/product-discovery-and-prd-deep-dive.md +41 -0
  375. package/quantum/01-standards/quantum-complete.md +1186 -0
  376. package/security/01-standards/api-security-complete.md +511 -0
  377. package/security/01-standards/container-runtime-security.md +574 -0
  378. package/security/01-standards/data-protection-gdpr.md +543 -0
  379. package/security/01-standards/owasp-top10-complete.md +1890 -0
  380. package/security/01-standards/secure-coding-baseline.md +90 -0
  381. package/security/01-standards/supply-chain-security.md +441 -0
  382. package/security/01-standards/web-security-checklist.md +108 -0
  383. package/security/01-standards/zero-trust-architecture.md +521 -0
  384. package/security/02-playbooks/auth-sso-playbook.md +166 -0
  385. package/security/02-playbooks/incident-response-security-playbook.md +588 -0
  386. package/security/02-playbooks/owasp-api-security-playbook.md +129 -0
  387. package/security/02-playbooks/payment-integration-playbook.md +119 -0
  388. package/security/02-playbooks/penetration-testing-playbook.md +517 -0
  389. package/security/03-checklists/security-audit-checklist.md +356 -0
  390. package/security/04-antipatterns/security-coding-antipatterns.md +580 -0
  391. package/security/05-cases/case-log4shell-incident.md +537 -0
  392. package/security/05-cases/case-major-breaches.md +468 -0
  393. package/security/06-glossary/security-glossary.md +212 -0
  394. package/security/compliance-automation.md +993 -0
  395. package/security/container-security.md +680 -0
  396. package/security/devsecops-complete.md +426 -0
  397. package/security/sast-dast-sca.md +775 -0
  398. package/security/secrets-management.md +594 -0
  399. package/security/security-architecture-deep-dive.md +37 -0
  400. package/security/threat-modeling-stride-playbook.md +40 -0
  401. package/seed-templates/auth-system.md +59 -0
  402. package/seed-templates/blog-content.md +94 -0
  403. package/seed-templates/dashboard.md +89 -0
  404. package/seed-templates/docs-site.md +73 -0
  405. package/seed-templates/e-commerce.md +50 -0
  406. package/seed-templates/saas-landing.md +92 -0
  407. package/seed-templates/settings-page.md +51 -0
  408. package/testing/01-standards/test-strategy-and-layering.md +83 -0
  409. package/testing/01-standards/testing-strategy-complete.md +422 -0
  410. package/testing/01-standards/unit-testing-best-practices.md +118 -0
  411. package/testing/02-playbooks/e2e-testing-playbook.md +988 -0
  412. package/testing/02-playbooks/testing-strategy-playbook.md +126 -0
  413. package/testing/03-checklists/test-strategy-checklist.md +208 -0
  414. package/testing/04-antipatterns/testing-antipatterns.md +718 -0
  415. package/testing/05-cases/case-testing-transformation.md +300 -0
  416. package/testing/06-glossary/testing-glossary.md +110 -0
  417. package/testing/risk-based-test-matrix.md +36 -0
  418. package/testing/testing-strategy-deep-dive.md +37 -0
@@ -0,0 +1,496 @@
+ ---
+ id: spark-etl-playbook
+ title: The Complete Guide to Spark ETL Development
+ domain: data-engineering
+ category: 02-playbooks
+ difficulty: intermediate
+ tags: [data-engineering, etl, etl-workflow, playbook, spark, references, learning-path, performance-optimization]
+ quality_score: 70
+ last_updated: 2026-06-15
+ ---
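+ # The Complete Guide to Spark ETL Development
+
+ ## Overview
+ Apache Spark is a fast, general-purpose engine for large-scale data processing. It supports batch processing, stream processing, SQL queries, and machine learning. This guide covers best practices for ETL (Extract-Transform-Load) with Spark.
+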
+ ## Core Concepts
+
+ ### 1. RDD vs DataFrame vs Dataset
+
+ **RDD (Resilient Distributed Dataset)**:
+ - Low-level API
+ - No schema
+ - Functional programming style
+
+ ```python
+ from pyspark import SparkContext
+
+ sc = SparkContext.getOrCreate()
+
+ # Create an RDD
+ rdd = sc.parallelize([1, 2, 3, 4, 5])
+
+ # Transformations (chained: square first, then filter the squares)
+ squared = rdd.map(lambda x: x ** 2)
+ filtered = squared.filter(lambda x: x > 10)  # keeps 16 and 25
+ ```
+
+ **DataFrame**:
+ - High-level API
+ - Has a schema
+ - SQL-style operations
+
+ ```python
+ from pyspark.sql import SparkSession
+
+ spark = SparkSession.builder.appName("ETL").getOrCreate()
+
+ # Create a DataFrame
+ df = spark.createDataFrame(
+     [(1, "Alice", 30), (2, "Bob", 25)],
+     ["id", "name", "age"]
+ )
+
+ # SQL query
+ df.createOrReplaceTempView("users")
+ result = spark.sql("SELECT * FROM users WHERE age > 25")
+ ```
+
+ **Dataset**:
+ - A type-safe DataFrame
+ - Scala/Java first (PySpark exposes the DataFrame API only)
+
+ ### 2. Lazy Evaluation
+ Spark defers execution until it encounters an action.
+
+ ```python
+ # Transformations (lazy): nothing is computed here
+ mapped = df.select("name", "age").filter(df.age > 20)
+
+ # Action (triggers execution)
+ result = mapped.collect()  # the computation runs now
+ ```
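To see the transformation/action split without a cluster, the sketch below imitates it in plain Python. It is only an analogy, not how Spark is implemented; `LazyPipeline` and its methods are hypothetical names chosen to mirror the DataFrame calls above.

```python
class LazyPipeline:
    """Toy stand-in for a DataFrame: records steps, runs them only on collect()."""

    def __init__(self, source, log):
        self._source = source
        self._steps = []
        self._log = log  # records when real work happens

    def _with_step(self, step):
        clone = LazyPipeline(self._source, self._log)
        clone._steps = self._steps + [step]
        return clone

    def select(self, *cols):
        # Transformation: only remembered, nothing is computed yet.
        return self._with_step(lambda rows: ({c: r[c] for c in cols} for r in rows))

    def filter(self, pred):
        # Transformation: also just recorded.
        return self._with_step(lambda rows: (r for r in rows if pred(r)))

    def collect(self):
        # Action: the recorded steps finally run, in order.
        self._log.append("executed")
        rows = iter(self._source)
        for step in self._steps:
            rows = step(rows)
        return list(rows)


log = []
people = [{"id": 1, "name": "Alice", "age": 30}, {"id": 2, "name": "Bob", "age": 18}]
mapped = LazyPipeline(people, log).select("name", "age").filter(lambda r: r["age"] > 20)
steps_run_before_action = list(log)  # still empty: no action has been called
result = mapped.collect()
```

Building `mapped` costs nothing; only `collect()` walks the data. In Spark the same deferral lets the optimizer see the whole plan before any work is scheduled.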
+
+ ### 3. Partitioning and Parallelism
+
+ ```python
+ # Check the number of partitions (getNumPartitions is a method, so call it)
+ print(df.rdd.getNumPartitions())
+
+ # Repartition (full shuffle)
+ repartitioned = df.repartition(10)
+
+ # Coalesce (reduce the number of partitions)
+ coalesced = df.coalesce(5)
+ ```
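The difference between the two calls is easier to see on plain lists. The following is a simplified model under stated assumptions: partitions are Python lists, `repartition` deals rows out round-robin, and `coalesce` merges neighbouring partitions. Spark's actual placement logic is more involved, so treat the helper functions as hypothetical illustrations.

```python
def repartition(partitions, n):
    """Full shuffle: every row may move; rows are dealt out round-robin."""
    rows = [row for part in partitions for row in part]
    return [rows[i::n] for i in range(n)]


def coalesce(partitions, n):
    """Merge existing partitions into n groups without redistributing rows."""
    if n >= len(partitions):
        # Like DataFrame.coalesce, this never increases the partition count.
        return [list(p) for p in partitions]
    merged = [[] for _ in range(n)]
    for index, part in enumerate(partitions):
        merged[index * n // len(partitions)].extend(part)
    return merged


parts = [[1, 2, 3, 4], [5], [6], [7]]  # skewed input
balanced = repartition(parts, 2)
merged = coalesce(parts, 2)
```

Here `repartition` evens out the skew at the cost of moving every row, while `coalesce` is cheaper but inherits the imbalance of the partitions it merges.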
+
+ ## ETL Workflow
+
+ ### 1. Extract
+
+ #### Reading from a File System
+
+ ```python
+ # CSV
+ df = spark.read.csv("hdfs://path/to/file.csv", header=True, inferSchema=True)
+
+ # JSON
+ df = spark.read.json("hdfs://path/to/file.json")
+
+ # Parquet (recommended)
+ df = spark.read.parquet("hdfs://path/to/file.parquet")
+
+ # Partitioned data: filters on partition columns limit which directories are read
+ df = spark.read.parquet("hdfs://path/to/data") \
+     .filter("year=2024") \
+     .filter("month=03")
+ ```
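Filtering on partition columns pays off because partitioned datasets are usually laid out as `key=value` directories, so whole directories can be skipped. The sketch below models that pruning step in plain Python. The file paths and the helpers `partition_values` and `prune` are hypothetical, and real pruning happens inside Spark's planner rather than in user code.

```python
def partition_values(path):
    """Extract key=value directory segments from a Hive-style path."""
    return dict(seg.split("=", 1) for seg in path.split("/") if "=" in seg)


def prune(paths, **wanted):
    """Keep only files whose partition directories match every requested value."""
    return [
        p for p in paths
        if all(partition_values(p).get(k) == v for k, v in wanted.items())
    ]


files = [
    "data/year=2023/month=12/part-0000.parquet",
    "data/year=2024/month=02/part-0000.parquet",
    "data/year=2024/month=03/part-0000.parquet",
    "data/year=2024/month=03/part-0001.parquet",
]
to_scan = prune(files, year="2024", month="03")
```

Only two of the four files survive, which is the saving a partition filter aims for: the other directories are never opened.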
+
+ #### Reading from a Database
+
+ ```python
+ # JDBC
+ df = spark.read \
+     .format("jdbc") \
+     .option("url", "jdbc:postgresql://localhost:5432/db") \
+     .option("dbtable", "users") \
+     .option("user", "user") \
+     .option("password", "password") \
+     .load()
+
+ # With predicate pushdown: the subquery is executed by the database
+ df = spark.read \
+     .format("jdbc") \
+     .option("url", "jdbc:postgresql://localhost:5432/db") \
+     .option("dbtable", "(SELECT * FROM users WHERE active=true) AS users") \
+     .option("user", "user") \
+     .option("password", "password") \
+     .load()
+ ```
130
+
131
+ #### Reading from Kafka
+
+ ```python
+ # Streaming read
+ df = spark \
+     .readStream \
+     .format("kafka") \
+     .option("kafka.bootstrap.servers", "localhost:9092") \
+     .option("subscribe", "topic1,topic2") \
+     .load()
+
+ # Batch read over a bounded offset range
+ df = spark \
+     .read \
+     .format("kafka") \
+     .option("kafka.bootstrap.servers", "localhost:9092") \
+     .option("subscribe", "topic1") \
+     .option("startingOffsets", "earliest") \
+     .option("endingOffsets", "latest") \
+     .load()
+ ```
+
+ ### 2. Transform
+
+ #### Basic transformations
+
+ ```python
+ # Select columns
+ selected = df.select("name", "age")
+
+ # Filter rows
+ filtered = df.filter(df.age > 25)
+
+ # Deduplicate on a key
+ distinct = df.dropDuplicates(["user_id"])
+
+ # Sort
+ sorted_df = df.orderBy(df.age.desc())
+
+ # Rename a column
+ renamed = df.withColumnRenamed("old_name", "new_name")
+
+ # Add a column
+ with_new = df.withColumn("age_plus_10", df.age + 10)
+ ```
+
+ #### Aggregation
+
+ ```python
+ # Grouped aggregation (dict form: column -> aggregate function)
+ agg_df = df.groupBy("department").agg(
+     {"salary": "avg", "age": "max"}
+ )
+
+ # Multiple aggregations with aliases
+ # (import the module as F so avg/max do not shadow Python built-ins)
+ from pyspark.sql import functions as F
+
+ agg_df = df.groupBy("department").agg(
+     F.avg("salary").alias("avg_salary"),
+     F.max("age").alias("max_age"),
+     F.count("*").alias("count")
+ )
+
+ # Window functions
+ from pyspark.sql.window import Window
+
+ window_spec = Window.partitionBy("department").orderBy(F.col("salary").desc())
+
+ ranked = df.withColumn("rank", F.rank().over(window_spec))
+ ```
+
+ #### Join
+
+ ```python
+ # Inner join
+ joined = df1.join(df2, df1.id == df2.user_id, "inner")
+
+ # Left join
+ left_joined = df1.join(df2, df1.id == df2.user_id, "left")
+
+ # Broadcast join (for a small table)
+ from pyspark.sql.functions import broadcast
+
+ joined = df1.join(broadcast(small_df), df1.id == small_df.id)
+
+ # Join on multiple columns
+ joined = df1.join(df2, ["id", "date"])
+ ```
+
+ #### UDF (user-defined function)
+
+ ```python
+ from pyspark.sql.functions import udf
+ from pyspark.sql.types import StringType
+
+ @udf(returnType=StringType())
+ def categorize_age(age):
+     # Null ages arrive as None; comparing None with an int would raise
+     if age is None:
+         return None
+     if age < 18:
+         return "minor"
+     elif age < 65:
+         return "adult"
+     else:
+         return "senior"
+
+ df = df.withColumn("age_category", categorize_age(df.age))
+
+ # Pandas UDF (usually faster: data is exchanged in Arrow batches)
+ import pandas as pd
+ from pyspark.sql.functions import pandas_udf
+
+ @pandas_udf("string")
+ def categorize_age_udf(age_series: pd.Series) -> pd.Series:
+     # Same three categories as categorize_age above
+     return age_series.apply(
+         lambda age: "minor" if age < 18 else "adult" if age < 65 else "senior"
+     )
+ ```
+
+ ### 3. Load
+
+ #### Writing to a file system
+
+ ```python
+ # Parquet (recommended)
+ df.write.parquet("hdfs://path/to/output", mode="overwrite")
+
+ # Partitioned write
+ df.write.partitionBy("year", "month").parquet("hdfs://path/to/output")
+
+ # CSV
+ df.write.csv("hdfs://path/to/output.csv", header=True, mode="overwrite")
+
+ # JSON
+ df.write.json("hdfs://path/to/output.json", mode="overwrite")
+ ```
+
+ #### Writing to a database
+
+ ```python
+ # The credentials are placeholders; do not hard-code real ones
+ df.write \
+     .format("jdbc") \
+     .option("url", "jdbc:postgresql://localhost:5432/db") \
+     .option("dbtable", "output_table") \
+     .option("user", "user") \
+     .option("password", "password") \
+     .mode("append") \
+     .save()
+ ```
+
+ #### Writing to Kafka
+
+ ```python
+ # Streaming write (df must expose a `value` column, optionally a `key` column)
+ query = df \
+     .writeStream \
+     .format("kafka") \
+     .option("kafka.bootstrap.servers", "localhost:9092") \
+     .option("topic", "output_topic") \
+     .option("checkpointLocation", "/path/to/checkpoint") \
+     .start()
+
+ # Batch write
+ df \
+     .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)") \
+     .write \
+     .format("kafka") \
+     .option("kafka.bootstrap.servers", "localhost:9092") \
+     .option("topic", "output_topic") \
+     .save()
+ ```
+
+ ## Performance optimization
+
+ ### 1. Caching
+
+ ```python
+ # Cache the DataFrame (materialized lazily, on the first action)
+ df.cache()
+
+ # Persist with an explicit storage level: memory first, spilling to disk
+ from pyspark import StorageLevel
+
+ df.persist(StorageLevel.MEMORY_AND_DISK)
+
+ # Release the cache
+ df.unpersist()
+ ```
+
+ ### 2. Partition tuning
+
+ ```python
+ # Check the number of partitions
+ print(df.rdd.getNumPartitions())
+
+ # Increase the partition count, hash-partitioned by user_id
+ repartitioned = df.repartition(100, "user_id")
+
+ # Reduce the partition count
+ coalesced = df.coalesce(10)
+
+ # Inspect partitioning: tag each row with the ID of the partition it lives in
+ from pyspark.sql.functions import spark_partition_id
+
+ with_partition_id = df.withColumn("partition_id", spark_partition_id())
+ ```
+
+ ### 3. Broadcast variables
+
+ ```python
+ # Broadcast a small dataset to every executor
+ broadcast_var = spark.sparkContext.broadcast({"key": "value"})
+
+ # Use it inside a function that runs on the executors
+ def my_udf(value):
+     return broadcast_var.value.get(value)
+
+ # Clean up the cached copies on the executors
+ broadcast_var.unpersist()
+ ```
+
+ ### 4. Accumulators
+
+ ```python
+ # Create an accumulator
+ acc = spark.sparkContext.accumulator(0)
+
+ def add_to_acc(value):
+     acc.add(value)
+
+ # Use it: foreach is an action, so the updates are applied when it runs
+ df.foreach(lambda row: add_to_acc(row["amount"]))
+
+ # Read the total on the driver
+ print(acc.value)
+ ```
+
+ ## Monitoring and tuning
+
+ ### 1. Spark UI
+
+ Open `http://localhost:4040` to view:
+ - Jobs
+ - Stages
+ - Storage
+ - Environment
+ - Executors
+
+ ### 2. Execution plans
+
+ ```python
+ # Extended mode: prints the logical plans (parsed, analyzed, optimized) and the physical plan
+ df.explain(True)
+
+ # Default mode: prints the physical plan only
+ df.explain()
+ ```
+
+ ### 3. Memory tuning
+
+ ```bash
+ # spark-submit options (etl_job.py is a placeholder script name)
+ spark-submit \
+   --executor-memory 8G \
+   --executor-cores 4 \
+   --driver-memory 4G \
+   --conf spark.sql.shuffle.partitions=200 \
+   etl_job.py
+ ```
+
+ ## Best practices
+
+ ### ✅ DO
+
+ 1. **Use the Parquet format**
+    ```python
+    # ✅ Columnar storage, good read performance
+    df.write.parquet("output")
+    ```
+
+ 2. **Partition sensibly**
+    ```python
+    # ✅ Partition by the fields that queries filter on most often
+    df.write.partitionBy("date", "hour").parquet("output")
+    ```
+
+ 3. **Use broadcast joins**
+    ```python
+    # ✅ Broadcast the small table
+    joined = large_df.join(broadcast(small_df), "id")
+    ```
+
+ 4. **Cache data that is reused**
+    ```python
+    # ✅ Data used more than once: the first action fills the cache, later ones read from it
+    df.cache()
+    df.count()
+    df.show()
+    ```
+
+ ### ❌ DON'T
+
+ 1. **Don't use collect() to fetch large datasets**
+    ```python
+    # ❌ Pulls every row to the driver and can run it out of memory
+    data = df.collect()
+
+    # ✅ Bound the result with limit
+    data = df.limit(1000).collect()
+    ```
+
+ 2. **Don't over-partition**
+    ```python
+    # ❌ Too many small partitions
+    df.repartition(1000)
+
+    # ✅ A partition count sized to the data volume
+    df.repartition(200)
+    ```
+
+ 3. **Don't use a row count as an emptiness check**
+    ```python
+    # ❌ Triggers a full computation
+    if df.count() > 0:
+        df.show()
+
+    # ✅ Use head(1) (or take(1)): it fetches at most one row
+    if df.head(1):
+        df.show()
+    ```
+
+ ## Learning path
+
+ ### Beginner (1-2 weeks)
+ 1. Spark architecture and RDD basics
+ 2. DataFrame and SQL
+ 3. Basic ETL operations
+
+ ### Intermediate (2-3 weeks)
+ 1. Aggregation and window functions
+ 2. UDFs and performance optimization
+ 3. Stream processing (Structured Streaming)
+
+ ### Advanced (2-4 weeks)
+ 1. Custom data sources
+ 2. Cluster tuning
+ 3. Production deployment
+
+ ### Expert (ongoing)
+ 1. Performance tuning and troubleshooting
+ 2. Multi-tenant resource management
+ 3. Real-time Lambda architecture
+
+ ## References
+
+ ### Official documentation
+ - [Spark documentation](https://spark.apache.org/docs/latest/)
+ - [PySpark documentation](https://spark.apache.org/docs/latest/api/python/)
+
+ ### Tutorials
+ - [Spark programming guide](https://spark.apache.org/docs/latest/rdd-programming-guide.html)
+ - [Spark SQL guide](https://spark.apache.org/docs/latest/sql-programming-guide.html)
+
+ ---
+
+ **Knowledge ID**: `spark-etl-playbook`
+ **Domain**: data-engineering
+ **Type**: playbooks
+ **Difficulty**: intermediate
+ **Quality score**: 90
+ **Maintainer**: data-team@umadev.com
+ **Last updated**: 2026-03-28
@@ -0,0 +1,194 @@
+ ---
+ id: pipeline-launch-checklist
+ title: Data Pipeline Launch Checklist
+ domain: data-engineering
+ category: 03-checklists
+ difficulty: intermediate
+ tags: [agent, alerting, checklist, data-engineering, launch, pipeline, quality, 上线前最终确认]
+ quality_score: 70
+ last_updated: 2026-06-15
+ ---
+ # Data Pipeline Launch Checklist
+
+ ## Overview
+
+ This checklist supports a systematic review of a data pipeline before it moves from the development environment to production. It covers six dimensions: data quality, reliability, monitoring, security, performance, and operations. It applies to batch pipelines (Spark / Airflow) and streaming pipelines (Kafka / Flink / Spark Streaming).
+
+ Any MUST-level item that has not passed blocks the launch.
+
+ ---
+
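+ ## 1. Data Quality
+
+ ### MUST
+
+ - [ ] Input schema validation is implemented (field names, types, non-null constraints)
+ - [ ] Output schema matches the contract with downstream consumers
+ - [ ] A null-handling strategy is defined (drop / fill with a default / flag as anomalous)
+ - [ ] Duplicate detection and deduplication logic is implemented
+ - [ ] Type conversions handle errors (for example, a failed string-to-date conversion does not crash the pipeline)
+ - [ ] Volume check: the ratio of output rows to input rows stays within a reasonable range
+ - [ ] Value-range checks on key business fields (for example, amount > 0, dates within a reasonable range)
+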
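Several of the MUST items above (schema validation, null handling, value ranges, row-count ratio) can be sketched in plain Python. This is a minimal illustration, not a prescribed implementation: the `SCHEMA` fields, the `validate_record` and `row_ratio_ok` helpers, and the 0.95 to 1.0 ratio band are all assumptions made up for the example.

```python
from datetime import date

# Hypothetical schema: field name -> (expected type, nullable)
SCHEMA = {
    "order_id": (str, False),
    "amount": (float, False),
    "order_date": (date, False),
    "coupon": (str, True),
}


def validate_record(record):
    """Return a list of violations; an empty list means the record is valid."""
    errors = []
    for field, (expected_type, nullable) in SCHEMA.items():
        if field not in record:
            errors.append(f"missing field: {field}")
            continue
        value = record[field]
        if value is None:
            if not nullable:
                errors.append(f"null not allowed: {field}")
            continue
        if not isinstance(value, expected_type):
            errors.append(f"wrong type: {field}")
    # Value-range check on a key business field (only when the type is right).
    amount = record.get("amount")
    if isinstance(amount, float) and amount <= 0:
        errors.append("amount must be > 0")
    return errors


def row_ratio_ok(input_rows, output_rows, min_ratio=0.95, max_ratio=1.0):
    """Check that the output/input row ratio stays inside the expected band."""
    if input_rows == 0:
        return output_rows == 0
    return min_ratio <= output_rows / input_rows <= max_ratio


good = {"order_id": "A1", "amount": 9.5, "order_date": date(2024, 3, 1), "coupon": None}
bad = {"order_id": None, "amount": -1.0, "order_date": "2024-03-01"}
```

Returning a list of violations instead of raising on the first one lets the caller decide per record whether to drop it, fill defaults, or flag it.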
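+ ### SHOULD
+
+ - [ ] Data quality metrics are exported to the monitoring system (null rate, deduplication rate, anomaly rate)
+ - [ ] Data lineage is recorded: the source and transformation logic of every field
+ - [ ] Freshness check: data timestamps do not exceed the expected delay
+ - [ ] Distribution anomaly detection (statistical distribution compared against a historical baseline)
+ - [ ] Data quality dashboards are in place (Grafana / DataDog / Great Expectations)
+ - [ ] Quality checks are automated with tools such as Great Expectations / Deequ / Soda
+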
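+ ## 2. Reliability
+
+ ### MUST
+
+ - [ ] Idempotency: re-running the pipeline does not produce duplicate data
+ - [ ] Retry on failure is configured (retry count, interval, backoff strategy)
+ - [ ] A dead letter queue (DLQ) is configured: messages that cannot be processed go to the DLQ instead of being dropped
+ - [ ] Checkpoint / offset management is implemented (Kafka offset / Flink checkpoint / Spark checkpoint)
+ - [ ] Recovery from the point of failure has been verified (no need to rerun from scratch)
+ - [ ] A degradation strategy is defined for when an upstream source is unavailable (wait / skip / use cached data)
+ - [ ] Writes use transactions or atomic operations (to avoid inconsistency from partial writes)
+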
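The retry and DLQ items above can be sketched as follows. This is a simplified, in-memory illustration: `process_with_retry`, the list standing in for the dead letter queue, and the default of 3 attempts with a 0.01 s base delay are assumptions for the example, not values taken from the checklist.

```python
import time


def process_with_retry(message, handler, dead_letters,
                       max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Try handler(message) up to max_attempts times, then route to the DLQ."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except Exception as exc:  # broad on purpose: any failure is retried
            if attempt == max_attempts:
                # Do not drop the message: keep it together with the last error.
                dead_letters.append({"message": message, "error": str(exc)})
                return None
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Usage: one handler that fails twice before succeeding, one that always fails.
calls = {"n": 0}


def flaky(msg):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return msg.upper()


def always_fails(msg):
    raise ValueError("bad payload")


dlq = []
delays = []  # capture the backoff delays instead of actually sleeping
ok_result = process_with_retry("a", flaky, dlq, sleep=delays.append)
failed_result = process_with_retry("b", always_fails, dlq, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the backoff schedule testable without real waiting; a production setup would more likely rely on the retry and DLQ features of the scheduler or message broker in use.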
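+ ### SHOULD
+
+ - [ ] Exactly-once semantics are implemented (or at-least-once plus idempotent deduplication)
+ - [ ] Backpressure handling is implemented
+ - [ ] The pipeline supports graceful shutdown (it finishes the current batch before stopping)
+ - [ ] A compatibility strategy exists for upstream and downstream schema changes (schema evolution)
+ - [ ] Disaster recovery failover has been verified (switching to a standby cluster when the primary cluster fails)
+ - [ ] Backfill scripts are prepared and tested
+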
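The "at-least-once plus idempotent deduplication" option can be illustrated with a keyed upsert. In this sketch a plain dict stands in for the sink, and `idempotent_write` and the `event_id` key are hypothetical names; a real sink would typically use a primary key with upsert or merge semantics.

```python
def idempotent_write(store, records, key="event_id"):
    """Upsert records into store (a dict) keyed by a unique business key.

    Re-delivered records overwrite themselves instead of creating duplicates.
    Returns the number of keys that were new.
    """
    new_keys = 0
    for record in records:
        k = record[key]
        if k not in store:
            new_keys += 1
        store[k] = record
    return new_keys


batch = [
    {"event_id": "e1", "amount": 10},
    {"event_id": "e2", "amount": 20},
    {"event_id": "e1", "amount": 10},  # duplicate delivery inside the batch
]
sink = {}
first_run = idempotent_write(sink, batch)
second_run = idempotent_write(sink, batch)  # simulated retry of the whole batch
```

Running the batch twice leaves the sink unchanged, which is the property the idempotency items in this section ask to verify.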
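+ ## 3. Monitoring & Alerting
+
+ ### MUST
+
+ - [ ] Pipeline run status is monitored (running / succeeded / failed / delayed)
+ - [ ] Latency monitoring: end-to-end latency above the SLA triggers an alert
+ - [ ] Backlog monitoring: alerts on unprocessed message count / Kafka consumer lag
+ - [ ] Error-rate monitoring: a processing failure rate > 1% triggers an alert
+ - [ ] Process liveness monitoring (immediate alert on crash / OOM)
+ - [ ] Output volume monitoring: zero output rows or a sudden drop > 50% triggers an alert
+ - [ ] Resource usage monitoring: alerts when CPU / memory / disk exceed thresholds
+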
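The numeric thresholds above (error rate > 1%, output drop > 50%, latency over SLA) can be expressed as a small rule evaluator. This is a sketch under assumptions: the metric names, the `evaluate_alerts` function, and the 300-second SLA are invented for the example; only the 1% and 50% thresholds come from the checklist.

```python
def evaluate_alerts(metrics, baseline_output_rows,
                    max_error_rate=0.01, max_output_drop=0.5, sla_seconds=300):
    """Return the names of the alerts that should fire for one pipeline run."""
    alerts = []
    processed = metrics["processed"]
    if processed and metrics["failed"] / processed > max_error_rate:
        alerts.append("error_rate")
    output_rows = metrics["output_rows"]
    if output_rows == 0:
        alerts.append("empty_output")
    elif baseline_output_rows and (
        (baseline_output_rows - output_rows) / baseline_output_rows > max_output_drop
    ):
        alerts.append("output_drop")
    if metrics["end_to_end_seconds"] > sla_seconds:
        alerts.append("latency_sla")
    return alerts


healthy = {"processed": 10_000, "failed": 20,
           "output_rows": 9_900, "end_to_end_seconds": 120}
degraded = {"processed": 10_000, "failed": 250,
            "output_rows": 4_000, "end_to_end_seconds": 900}
idle = {"processed": 0, "failed": 0, "output_rows": 0, "end_to_end_seconds": 1}
```

In practice these rules would usually live in the monitoring system's own alert configuration; the function only makes the threshold logic explicit.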
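+ ### SHOULD
+
+ - [ ] Grafana / Datadog dashboards are created, including:
+   - Throughput (records/sec)
+   - End-to-end latency (P50 / P95 / P99)
+   - Consumer lag trend
+   - Error rate trend
+   - Resource usage trend
+ - [ ] Alert severity levels: P1 (pipeline stopped / data loss), P2 (latency over target), P3 (resource alerts)
+ - [ ] Alert deduplication: the same problem does not alert repeatedly (set a silence window)
+ - [ ] A daily pipeline run report is generated automatically (success rate, latency, data volume)
+ - [ ] SLA dashboard: shows the SLA attainment rate of each pipeline
+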
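The silence window mentioned above can be sketched as a per-key suppression check. `AlertSuppressor`, the alert keys, and the 600-second window are hypothetical choices for this illustration; alerting tools generally provide their own grouping and silencing features.

```python
class AlertSuppressor:
    """Forward an alert only if the same key has not fired within the window."""

    def __init__(self, silence_seconds=600):
        self.silence_seconds = silence_seconds
        self._last_sent = {}  # alert key -> timestamp of the last forwarded alert

    def should_send(self, key, now):
        last = self._last_sent.get(key)
        if last is not None and now - last < self.silence_seconds:
            return False  # still inside the silence window
        self._last_sent[key] = now
        return True


suppressor = AlertSuppressor(silence_seconds=600)
decisions = [
    suppressor.should_send("orders_pipeline:lag", now=0),
    suppressor.should_send("orders_pipeline:lag", now=120),  # repeat, silenced
    suppressor.should_send("orders_pipeline:oom", now=130),  # different problem
    suppressor.should_send("orders_pipeline:lag", now=700),  # window has elapsed
]
```

The window is measured from the last forwarded alert, so a problem that persists is re-announced once per window instead of being silenced forever.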
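+ ## 4. Security
+
+ ### MUST
+
+ - [ ] Sensitive data is encrypted in transit (Kafka SSL / HTTPS / TLS)
+ - [ ] Sensitive data is encrypted at rest or masked (PII fields: name, phone number, national ID number)
+ - [ ] Data access control is in place (Kafka ACL / HDFS permissions / database permissions)
+ - [ ] Passwords and keys are managed through a secret manager (not hard-coded in code or configuration)
+ - [ ] The account that runs the pipeline follows the principle of least privilege
+ - [ ] A data retention policy is defined (expired data is cleaned up automatically)
+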
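Two of the items above, keeping secrets out of code and masking PII, can be sketched with the standard library. The helper names (`get_secret`, `mask_phone`, `pseudonymize`) and the masking format are assumptions for illustration; the sketch also assumes the secret manager injects secrets as environment variables, which is one common pattern but not the only one.

```python
import hashlib
import os


def get_secret(name):
    """Read a secret injected by the runtime; fail fast if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"secret {name} is not set")
    return value


def mask_phone(phone):
    """Keep the first 3 and last 4 characters and mask the rest."""
    if len(phone) < 8:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


def pseudonymize(value, salt):
    """Replace an identifier with a salted SHA-256 digest (stable per salt)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
```

Masking suits values shown to people, while the salted digest keeps a stable join key without exposing the raw identifier; the salt itself has to be treated as a secret.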
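+ ### SHOULD
+
+ - [ ] Data is labeled by classification level (public / internal / confidential / top secret)
+ - [ ] Access to sensitive data is audit-logged
+ - [ ] GDPR / Personal Information Protection Law (PIPL) compliance review is complete
+ - [ ] Cross-region data transfer compliance review is complete (if cross-border scenarios exist)
+ - [ ] Regular security scans: dependency vulnerability checks
+ - [ ] A process for handling data deletion requests is defined (right to be forgotten)
+
+ ## 5. Performance
+
+ ### MUST
+
+ - [ ] Throughput meets business needs (processing rate ≥ data production rate)
+ - [ ] End-to-end latency meets the SLA (batch < N hours, streaming < N seconds)
+ - [ ] Load testing is complete: the pipeline still processes normally at 2x peak traffic
+ - [ ] Resource allocation is tuned (no over-allocation that wastes cost, no under-allocation that creates bottlenecks)
+ - [ ] Large-volume scenarios are verified (performance degradation stays acceptable when the data volume doubles)
+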
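+ ### SHOULD
+
+ - [ ] Partitioning strategy is tuned (reasonable Kafka partition count, Spark partition size of 128-256 MB)
+ - [ ] Serialization format is chosen and verified (Avro / Protobuf / Parquet preferred over JSON / CSV)
+ - [ ] Compression is enabled (Snappy / LZ4 / Zstd)
+ - [ ] A small-file compaction strategy is implemented (to avoid the HDFS small-files problem)
+ - [ ] Data skew is detected and handled (Spark: salting / repartition)
+ - [ ] Cold-start performance is verified (the time to work through the backlog after a restart is acceptable)
+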
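The 128-256 MB partition-size guideline above translates into a partition count by simple arithmetic. The helper name `target_partitions` and the 50 GiB dataset are assumptions for this worked example.

```python
import math


def target_partitions(total_bytes, target_partition_bytes=128 * 1024**2,
                      min_partitions=1):
    """Number of partitions so that each holds roughly target_partition_bytes."""
    if total_bytes <= 0:
        return min_partitions
    return max(min_partitions, math.ceil(total_bytes / target_partition_bytes))


GIB = 1024**3
n_128 = target_partitions(50 * GIB)                  # 128 MB target per partition
n_256 = target_partitions(50 * GIB, 256 * 1024**2)   # 256 MB target per partition
```

For 50 GiB of data this gives between 200 and 400 partitions; the result could then be passed to a call such as `df.repartition(n)`.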
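+ ## 6. Operations
+
+ ### MUST
+
+ - [ ] Deployment is automated through CI/CD (no manual deployment)
+ - [ ] Configuration is separated from code (configuration files / environment variables)
+ - [ ] Pipeline versions are traceable to code versions (Git tag / image version)
+ - [ ] A rollback plan is defined and verified
+ - [ ] A runbook is written and covers the handling steps for common failures
+ - [ ] The on-call rotation is scheduled
+
+ ### SHOULD
+
+ - [ ] The pipeline supports parameterized runs (date range, data source, and output target are configurable)
+ - [ ] The log level can be changed dynamically (switch DEBUG/INFO without a restart)
+ - [ ] Pipeline dependencies are visualized (DAG diagram)
+ - [ ] A capacity planning document is written (data growth forecast + scaling plan)
+ - [ ] Cost monitoring: compute cost is trackable per pipeline
+ - [ ] Regular review: pipeline run quality and cost are reviewed monthly
+
+ ---
+
+ ## Additional Checks for Batch Pipelines
+
+ - [ ] Scheduling is configured correctly (Airflow DAG / Cron / cloud scheduler)
+ - [ ] Task dependencies are correct (downstream tasks do not start when an upstream task fails)
+ - [ ] SLA timeout: a task that runs past its expected duration triggers an alert
+ - [ ] The historical backfill plan has been tested
+ - [ ] Parallelism is set reasonably (it does not exceed cluster resource limits)
+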
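+ ## Additional Checks for Streaming Pipelines
+
+ - [ ] The consumer group is configured correctly
+ - [ ] The checkpoint interval is set reasonably (too frequent hurts performance, too sparse slows recovery)
+ - [ ] A watermark strategy is defined (to handle out-of-order events)
+ - [ ] Window functions are configured correctly (window size, slide interval, allowed lateness)
+ - [ ] State size is monitored (to keep unbounded state growth from causing OOM)
+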
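How a watermark and allowed lateness interact with windows can be shown with a simplified model. This is not the Flink or Spark API: `TumblingWindowCounter` is a hypothetical class, event times are plain integers in seconds, and the watermark is taken to be the largest event time seen minus the allowed lateness.

```python
from collections import defaultdict


class TumblingWindowCounter:
    """Count events per fixed window, dropping events behind the watermark."""

    def __init__(self, window_seconds=60, allowed_lateness=30):
        self.window_seconds = window_seconds
        self.allowed_lateness = allowed_lateness
        self.max_event_time = 0
        self.counts = defaultdict(int)  # window start -> event count
        self.dropped = 0

    @property
    def watermark(self):
        # The watermark trails the largest event time seen so far.
        return self.max_event_time - self.allowed_lateness

    def add(self, event_time):
        self.max_event_time = max(self.max_event_time, event_time)
        window_start = event_time - event_time % self.window_seconds
        window_end = window_start + self.window_seconds
        if window_end <= self.watermark:
            self.dropped += 1  # the window is already closed: too late to count
            return False
        self.counts[window_start] += 1
        return True


counter = TumblingWindowCounter(window_seconds=60, allowed_lateness=30)
for t in [5, 50, 70, 58, 130, 40]:
    counter.add(t)
```

The event at time 58 arrives out of order but is still counted because its window has not closed, while the event at time 40 arrives after the watermark has passed its window and is dropped. The sketch never evicts closed windows; a real job would, which is what the state-size item above is about.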
+ ---
+
+ ## Final Pre-Launch Confirmation
+
+ | No. | Check | Status |
+ |-----|-------|--------|
+ | 1 | All MUST-level items pass | [ ] |
+ | 2 | End-to-end tests pass in the staging environment | [ ] |
+ | 3 | Data quality validation passes | [ ] |
+ | 4 | Monitoring dashboards and alerts are configured | [ ] |
+ | 5 | Runbook is written and reviewed by the team | [ ] |
+ | 6 | Rollback plan is verified | [ ] |
+ | 7 | Downstream consumers are notified and have confirmed compatibility | [ ] |
+
+ ---
+
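+ ## Agent Checklist
+
+ The following are hard constraints that an AI agent must verify item by item when reviewing a data pipeline launch:
+
+ - [ ] Confirm that schema validation logic exists in the pipeline (input + output)
+ - [ ] Confirm idempotency: trigger the pipeline manually twice and verify that the output contains no duplicates
+ - [ ] Confirm failure recovery: simulate a mid-run failure, restart, and verify that the data is complete with no duplicates
+ - [ ] Confirm monitoring and alerting: an alert arrives within 5 minutes of the pipeline stopping
+ - [ ] Confirm that passwords and keys do not appear in plaintext in code or configuration files
+ - [ ] Confirm that throughput test results meet the SLA
+ - [ ] Confirm that consumer lag / processing latency metrics are visible in the monitoring system
+ - [ ] Confirm that the rollback plan document exists and its steps are clear
+ - [ ] Confirm that the data retention policy is configured (expired data is cleaned up automatically)
+ - [ ] Generate the launch checklist report and attach it to the release ticket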