@umacloud/knowledge 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (418)
  1. package/00-governance/governance-capabilities.md +557 -0
  2. package/00-governance/knowledge-map.md +39 -0
  3. package/00-governance/maintenance-policy.md +76 -0
  4. package/00-governance/review-checklist.md +81 -0
  5. package/README.md +13 -0
  6. package/ai/01-standards/agent-development-complete.md +691 -0
  7. package/ai/01-standards/llm-application-complete.md +488 -0
  8. package/ai/01-standards/mlops-complete.md +798 -0
  9. package/ai/01-standards/prompt-engineering-complete.md +646 -0
  10. package/ai/01-standards/rag-architecture-complete.md +649 -0
  11. package/ai/02-playbooks/llm-evaluation-playbook.md +847 -0
  12. package/ai/03-checklists/ai-project-checklist.md +215 -0
  13. package/ai/04-antipatterns/ai-antipatterns.md +661 -0
  14. package/ai/05-cases/case-rag-production.md +147 -0
  15. package/ai/06-glossary/ai-glossary.md +162 -0
  16. package/ai/agent-evaluation-benchmark.md +53 -0
  17. package/ai/ai-agent-memory-context-management.md +41 -0
  18. package/ai/ai-cost-capacity-optimization-playbook.md +42 -0
  19. package/ai/ai-data-security-and-compliance-playbook.md +37 -0
  20. package/ai/ai-domain-index-and-checklist.md +40 -0
  21. package/ai/ai-governance-maturity-model.md +50 -0
  22. package/ai/ai-model-selection-and-routing-strategy.md +47 -0
  23. package/ai/ai-observability-and-oncall-runbook.md +52 -0
  24. package/ai/ai-rag-engineering-playbook.md +42 -0
  25. package/ai/ai-red-team-and-safety-evaluation.md +42 -0
  26. package/ai/ai-release-readiness-and-rollback-gate.md +42 -0
  27. package/ai/llm-agent-engineering-deep-dive.md +57 -0
  28. package/ai/prompt-and-tool-guardrails.md +52 -0
  29. package/api/01-standards/enterprise-api-standards.md +198 -0
  30. package/api/01-standards/rest-api-design-guide.md +63 -0
  31. package/api/02-playbooks/api-pagination-playbook.md +93 -0
  32. package/api/02-playbooks/graphql-production-playbook.md +176 -0
  33. package/api/03-checklists/api-review-checklist.md +55 -0
  34. package/api/04-antipatterns/api-antipatterns.md +112 -0
  35. package/architecture/01-standards/api-gateway-patterns.md +496 -0
  36. package/architecture/01-standards/cloud-native-patterns.md +644 -0
  37. package/architecture/01-standards/distributed-systems-patterns.md +591 -0
  38. package/architecture/01-standards/event-driven-architecture.md +595 -0
  39. package/architecture/01-standards/microservices-patterns-complete.md +968 -0
  40. package/architecture/01-standards/microservices-patterns.md +495 -0
  41. package/architecture/01-standards/system-design-interview.md +664 -0
  42. package/architecture/02-playbooks/microservices-patterns-playbook.md +137 -0
  43. package/architecture/02-playbooks/migration-playbook.md +780 -0
  44. package/architecture/02-playbooks/system-design-playbook.md +779 -0
  45. package/architecture/03-checklists/architecture-decision-checklist.md +297 -0
  46. package/architecture/04-antipatterns/architecture-antipatterns.md +417 -0
  47. package/architecture/05-cases/case-netflix-microservices.md +413 -0
  48. package/architecture/06-glossary/architecture-glossary.md +164 -0
  49. package/architecture/adr-template-and-examples.md +38 -0
  50. package/architecture/api-gateway-deep-dive.md +1291 -0
  51. package/architecture/configuration-management.md +1162 -0
  52. package/architecture/distributed-transactions.md +1220 -0
  53. package/architecture/microservices-complete.md +735 -0
  54. package/architecture/resilience-and-disaster-patterns.md +37 -0
  55. package/architecture/service-governance.md +1198 -0
  56. package/architecture/system-architecture-deep-dive.md +37 -0
  57. package/backend/01-standards/analytics-and-growth.md +65 -0
  58. package/backend/01-standards/api-and-error-conventions.md +120 -0
  59. package/backend/01-standards/application-layering-and-packaging.md +160 -0
  60. package/backend/01-standards/auth-implementation.md +104 -0
  61. package/backend/01-standards/backend-framework-idioms.md +74 -0
  62. package/backend/01-standards/background-jobs-and-async.md +66 -0
  63. package/backend/01-standards/caching-strategies-complete.md +390 -0
  64. package/backend/01-standards/config-and-observability.md +77 -0
  65. package/backend/01-standards/data-modeling-and-persistence.md +94 -0
  66. package/backend/01-standards/django-complete.md +1765 -0
  67. package/backend/01-standards/email-and-notifications.md +64 -0
  68. package/backend/01-standards/fastapi-complete.md +925 -0
  69. package/backend/01-standards/file-upload-and-storage.md +66 -0
  70. package/backend/01-standards/graphql-api-complete.md +416 -0
  71. package/backend/01-standards/llm-application-standard.md +78 -0
  72. package/backend/01-standards/message-queue-patterns.md +379 -0
  73. package/backend/01-standards/microservices-and-distributed.md +78 -0
  74. package/backend/01-standards/nestjs-complete.md +2167 -0
  75. package/backend/01-standards/payment-integration.md +80 -0
  76. package/backend/01-standards/rate-limiting-complete.md +451 -0
  77. package/backend/01-standards/realtime-and-websocket.md +65 -0
  78. package/backend/01-standards/search-and-filtering.md +64 -0
  79. package/backend/01-standards/spring-boot-complete.md +445 -0
  80. package/backend/02-playbooks/api-design-playbook.md +718 -0
  81. package/backend/02-playbooks/email-send-playbook.md +130 -0
  82. package/backend/02-playbooks/file-upload-s3-playbook.md +153 -0
  83. package/backend/02-playbooks/typescript-enterprise-playbook.md +133 -0
  84. package/backend/02-playbooks/websocket-realtime-playbook.md +154 -0
  85. package/backend/03-checklists/api-launch-checklist.md +189 -0
  86. package/backend/04-antipatterns/backend-antipatterns.md +1051 -0
  87. package/blockchain/01-standards/blockchain-basics.md +557 -0
  88. package/blockchain/01-standards/smart-contract-development.md +1315 -0
  89. package/cicd/01-standards/deployment-and-delivery-standard.md +96 -0
  90. package/cicd/01-standards/github-actions-complete.md +473 -0
  91. package/cicd/01-standards/release-and-store-submission.md +75 -0
  92. package/cicd/02-playbooks/cicd-pipeline-playbook.md +144 -0
  93. package/cicd/02-playbooks/release-management-playbook.md +605 -0
  94. package/cicd/03-checklists/pipeline-security-checklist.md +168 -0
  95. package/cicd/04-antipatterns/cicd-antipatterns.md +589 -0
  96. package/cicd/05-cases/case-deployment-automation.md +221 -0
  97. package/cicd/05-cases/case-gitops-transformation.md +212 -0
  98. package/cicd/06-glossary/cicd-glossary.md +114 -0
  99. package/cicd/cicd-blueprint-deep-dive.md +38 -0
  100. package/cicd/release-readiness-gate.md +37 -0
  101. package/cloud-native/01-standards/container-security.md +741 -0
  102. package/cloud-native/01-standards/kubernetes-complete.md +812 -0
  103. package/cloud-native/02-playbooks/api-gateway-playbook.md +155 -0
  104. package/cloud-native/02-playbooks/gitops-with-argocd.md +760 -0
  105. package/cloud-native/02-playbooks/k8s-troubleshooting-playbook.md +1942 -0
  106. package/cloud-native/02-playbooks/message-queue-playbook.md +129 -0
  107. package/cloud-native/02-playbooks/multicloud-governance.md +726 -0
  108. package/cloud-native/02-playbooks/serverless-patterns.md +788 -0
  109. package/cloud-native/02-playbooks/service-mesh-playbook.md +612 -0
  110. package/cloud-native/02-playbooks/terraform-iac-playbook.md +143 -0
  111. package/cloud-native/03-checklists/container-security-checklist.md +431 -0
  112. package/cloud-native/03-checklists/k8s-production-readiness-checklist.md +460 -0
  113. package/cloud-native/04-antipatterns/container-antipatterns.md +660 -0
  114. package/cloud-native/04-antipatterns/k8s-antipatterns.md +743 -0
  115. package/cloud-native/05-cases/case-k8s-migration.md +478 -0
  116. package/cloud-native/05-cases/case-k8s-scaling.md +642 -0
  117. package/cloud-native/05-cases/case-k8s-security-incident.md +397 -0
  118. package/cloud-native/06-glossary/cloud-native-glossary.md +337 -0
  119. package/cross-platform/01-standards/cross-platform-frameworks.md +83 -0
  120. package/cross-platform/01-standards/platform-selection-and-architecture.md +77 -0
  121. package/data/01-standards/elasticsearch-complete.md +2098 -0
  122. package/data/01-standards/postgresql-complete.md +1613 -0
  123. package/data/01-standards/redis-complete.md +1527 -0
  124. package/data/02-playbooks/database-optimization-playbook.md +403 -0
  125. package/data/02-playbooks/elasticsearch-production-playbook.md +132 -0
  126. package/data/03-checklists/database-launch-checklist.md +187 -0
  127. package/data/04-antipatterns/database-antipatterns.md +873 -0
  128. package/data/05-cases/case-database-migration.md +310 -0
  129. package/data/06-glossary/database-glossary.md +440 -0
  130. package/data/data-governance-and-modeling-deep-dive.md +39 -0
  131. package/data-engineering/01-standards/airflow-complete.md +523 -0
  132. package/data-engineering/01-standards/kafka-complete.md +1521 -0
  133. package/data-engineering/02-playbooks/spark-etl-playbook.md +496 -0
  134. package/data-engineering/03-checklists/pipeline-launch-checklist.md +194 -0
  135. package/data-engineering/04-antipatterns/data-pipeline-antipatterns.md +684 -0
  136. package/data-engineering/05-cases/case-real-time-pipeline.md +355 -0
  137. package/data-engineering/06-glossary/data-engineering-glossary.md +429 -0
  138. package/database/01-standards/database-schema-standards.md +147 -0
  139. package/database/02-playbooks/postgresql-optimization-quick.md +52 -0
  140. package/database/02-playbooks/postgresql-performance-optimization.md +58 -0
  141. package/database/02-playbooks/postgresql-production-playbook.md +146 -0
  142. package/database/02-playbooks/redis-caching-playbook.md +117 -0
  143. package/database/03-checklists/database-review-checklist.md +50 -0
  144. package/database/04-antipatterns/database-antipatterns.md +112 -0
  145. package/design/01-standards/ui-design-system-complete.md +423 -0
  146. package/design/02-playbooks/design-handoff-playbook.md +254 -0
  147. package/design/02-playbooks/design-review-playbook.md +388 -0
  148. package/design/03-checklists/design-review-checklist.md +246 -0
  149. package/design/04-antipatterns/design-antipatterns.md +378 -0
  150. package/design/05-cases/case-design-system-adoption.md +328 -0
  151. package/design/06-glossary/design-glossary.md +329 -0
  152. package/design/ui-full-lifecycle-cross-platform-playbook.md +571 -0
  153. package/design/ux-system-deep-dive.md +38 -0
  154. package/design-systems/00-craft-rules.md +71 -0
  155. package/design-systems/aesthetic-families.md +43 -0
  156. package/design-systems/anti-ai-slop.md +162 -0
  157. package/design-systems/bold-geometric.md +120 -0
  158. package/design-systems/brutalist-bold.md +103 -0
  159. package/design-systems/editorial-clean.md +109 -0
  160. package/design-systems/glass-aurora.md +108 -0
  161. package/design-systems/modern-minimal.md +145 -0
  162. package/design-systems/premium-luxury.md +106 -0
  163. package/design-systems/product-type-design-map.md +48 -0
  164. package/design-systems/soft-warm.md +123 -0
  165. package/design-systems/tech-utility.md +113 -0
  166. package/desktop/01-standards/desktop-app-standard.md +72 -0
  167. package/desktop/01-standards/desktop-design.md +71 -0
  168. package/development/00-governance/document-template.md +41 -0
  169. package/development/01-standards/api-versioning-strategies.md +432 -0
  170. package/development/01-standards/authentication-patterns-complete.md +479 -0
  171. package/development/01-standards/css-architecture-complete.md +550 -0
  172. package/development/01-standards/database-migration-strategies.md +484 -0
  173. package/development/01-standards/elasticsearch-complete.md +347 -0
  174. package/development/01-standards/git-complete.md +371 -0
  175. package/development/01-standards/golang-complete.md +1565 -0
  176. package/development/01-standards/graphql-complete.md +298 -0
  177. package/development/01-standards/javascript-bundlers-complete.md +469 -0
  178. package/development/01-standards/javascript-typescript-complete.md +528 -0
  179. package/development/01-standards/jest-complete.md +275 -0
  180. package/development/01-standards/linux-complete.md +234 -0
  181. package/development/01-standards/logging-observability-complete.md +526 -0
  182. package/development/01-standards/microservices-communication.md +502 -0
  183. package/development/01-standards/mongodb-complete.md +406 -0
  184. package/development/01-standards/oauth2-complete.md +285 -0
  185. package/development/01-standards/performance-optimization-complete.md +289 -0
  186. package/development/01-standards/playwright-complete.md +247 -0
  187. package/development/01-standards/postgresql-complete.md +456 -0
  188. package/development/01-standards/pytest-complete.md +340 -0
  189. package/development/01-standards/python-async-programming.md +902 -0
  190. package/development/01-standards/python-complete.md +956 -0
  191. package/development/01-standards/python-decorators-complete.md +799 -0
  192. package/development/01-standards/python-design-patterns.md +2854 -0
  193. package/development/01-standards/python-packaging-distribution.md +420 -0
  194. package/development/01-standards/python-testing-strategies.md +607 -0
  195. package/development/01-standards/python-web-frameworks-comparison.md +471 -0
  196. package/development/01-standards/redis-complete.md +317 -0
  197. package/development/01-standards/rest-api-complete.md +316 -0
  198. package/development/01-standards/rust-complete.md +578 -0
  199. package/development/01-standards/typescript-advanced-types.md +1513 -0
  200. package/development/01-standards/web-security-complete.md +292 -0
  201. package/development/02-playbooks/api-design-playbook.md +810 -0
  202. package/development/02-playbooks/database-migration-playbook.md +580 -0
  203. package/development/02-playbooks/debugging-playbook.md +692 -0
  204. package/development/02-playbooks/feature-delivery-playbook.md +430 -0
  205. package/development/02-playbooks/incident-hotfix-playbook.md +387 -0
  206. package/development/02-playbooks/performance-optimization-playbook.md +531 -0
  207. package/development/02-playbooks/performance-tuning-playbook.md +652 -0
  208. package/development/02-playbooks/refactor-playbook.md +403 -0
  209. package/development/02-playbooks/release-playbook.md +469 -0
  210. package/development/03-checklists/architecture-review-checklist.md +168 -0
  211. package/development/03-checklists/data-migration-checklist.md +157 -0
  212. package/development/03-checklists/oncall-handover-checklist.md +173 -0
  213. package/development/03-checklists/pr-checklist.md +158 -0
  214. package/development/03-checklists/production-readiness-checklist.md +190 -0
  215. package/development/03-checklists/release-readiness-checklist.md +154 -0
  216. package/development/03-checklists/security-review-checklist.md +182 -0
  217. package/development/04-antipatterns/api-antipatterns.md +657 -0
  218. package/development/04-antipatterns/architecture-antipatterns.md +686 -0
  219. package/development/04-antipatterns/backend-antipatterns.md +648 -0
  220. package/development/04-antipatterns/cicd-antipatterns.md +540 -0
  221. package/development/04-antipatterns/code-smell-antipatterns.md +571 -0
  222. package/development/04-antipatterns/data-antipatterns.md +658 -0
  223. package/development/04-antipatterns/database-antipatterns.md +578 -0
  224. package/development/04-antipatterns/frontend-antipatterns.md +635 -0
  225. package/development/04-antipatterns/reliability-antipatterns.md +700 -0
  226. package/development/04-antipatterns/security-antipatterns.md +747 -0
  227. package/development/05-cases/case-api-version-migration.md +428 -0
  228. package/development/05-cases/case-authorization-hardening.md +383 -0
  229. package/development/05-cases/case-bluegreen-rollback.md +466 -0
  230. package/development/05-cases/case-cache-snowball-protection.md +485 -0
  231. package/development/05-cases/case-ci-cd-pipeline.md +544 -0
  232. package/development/05-cases/case-database-scaling.md +500 -0
  233. package/development/05-cases/case-db-hotspot-optimization.md +487 -0
  234. package/development/05-cases/case-incident-mttr-reduction.md +563 -0
  235. package/development/05-cases/case-microservice-migration.md +375 -0
  236. package/development/05-cases/case-performance-optimization.md +406 -0
  237. package/development/05-cases/case-security-incident-response.md +345 -0
  238. package/development/06-glossary/full-stack-glossary.md +166 -0
  239. package/development/09-maturity/quarterly-audit-template.md +35 -0
  240. package/development/11-ui-excellence/ui-aesthetic-system.md +41 -0
  241. package/development/11-ui-excellence/ui-engineering-excellence.md +435 -0
  242. package/development/12-scenarios/development-scenarios-guide.md +565 -0
  243. package/development/13-implementation-assets/implementation-toolkit.md +282 -0
  244. package/development/13-implementation-assets/knowledge-gates-execution.md +43 -0
  245. package/development/14-full-lifecycle/software-lifecycle-gates.md +511 -0
  246. package/development/15-lifecycle-templates/project-templates-collection.md +791 -0
  247. package/development/api-contract-and-versioning-guide.md +36 -0
  248. package/development/api-governance-complete.md +43 -0
  249. package/development/backend-engineering-complete.md +43 -0
  250. package/development/code-review-quality-complete.md +43 -0
  251. package/development/concurrency-reliability-complete.md +43 -0
  252. package/development/database-engineering-complete.md +43 -0
  253. package/development/engineering-effectiveness-complete.md +43 -0
  254. package/development/engineering-standards-deep-dive.md +38 -0
  255. package/development/frontend-engineering-complete.md +43 -0
  256. package/development/performance-capacity-complete.md +43 -0
  257. package/development/refactor-migration-complete.md +42 -0
  258. package/development/refactoring-and-techdebt-playbook.md +37 -0
  259. package/development/security-in-development-complete.md +43 -0
  260. package/devops/01-standards/cicd-pipeline-complete.md +262 -0
  261. package/devops/01-standards/docker-complete.md +1490 -0
  262. package/devops/01-standards/github-actions-complete.md +337 -0
  263. package/devops/01-standards/kubernetes-complete.md +638 -0
  264. package/devops/01-standards/terraform-complete.md +2117 -0
  265. package/devops/02-playbooks/docker-compose-playbook.md +233 -0
  266. package/devops/02-playbooks/docker-k8s-production-playbook.md +186 -0
  267. package/devops/02-playbooks/docker-production-playbook.md +952 -0
  268. package/edge-iot/01-standards/edge-iot-complete.md +473 -0
  269. package/experts/architect/api-design.md +178 -0
  270. package/experts/architect/methodology.md +124 -0
  271. package/experts/architect/security.md +75 -0
  272. package/experts/backend-lead/methodology.md +216 -0
  273. package/experts/devops/methodology.md +160 -0
  274. package/experts/frontend-lead/methodology.md +178 -0
  275. package/experts/product-manager/industry/ecommerce.md +43 -0
  276. package/experts/product-manager/industry/saas.md +40 -0
  277. package/experts/product-manager/methodology.md +97 -0
  278. package/experts/qa-lead/methodology.md +123 -0
  279. package/experts/qa-lead/test-strategy.md +128 -0
  280. package/experts/uiux-designer/methodology.md +125 -0
  281. package/frontend/01-standards/accessibility-complete.md +532 -0
  282. package/frontend/01-standards/accessibility-standard.md +74 -0
  283. package/frontend/01-standards/admin-dashboard-and-crud.md +72 -0
  284. package/frontend/01-standards/design-tokens-complete.md +444 -0
  285. package/frontend/01-standards/forms-and-validation.md +77 -0
  286. package/frontend/01-standards/frontend-architecture-and-layering.md +119 -0
  287. package/frontend/01-standards/i18n-and-localization.md +65 -0
  288. package/frontend/01-standards/nextjs-complete.md +451 -0
  289. package/frontend/01-standards/react-complete.md +713 -0
  290. package/frontend/01-standards/react-hooks-complete-guide.md +1100 -0
  291. package/frontend/01-standards/react-hooks-complete.md +1171 -0
  292. package/frontend/01-standards/seo-and-web-vitals.md +77 -0
  293. package/frontend/01-standards/state-management-complete.md +444 -0
  294. package/frontend/01-standards/vue-complete.md +499 -0
  295. package/frontend/01-standards/vue3-complete.md +2002 -0
  296. package/frontend/01-standards/web-framework-best-practices.md +64 -0
  297. package/frontend/01-standards/web-performance-complete.md +495 -0
  298. package/frontend/02-playbooks/accessibility-a11y-playbook.md +161 -0
  299. package/frontend/02-playbooks/frontend-performance-playbook.md +707 -0
  300. package/frontend/02-playbooks/i18n-internationalization-playbook.md +120 -0
  301. package/frontend/02-playbooks/performance-optimization-playbook.md +163 -0
  302. package/frontend/02-playbooks/react-nextjs-production-playbook.md +167 -0
  303. package/frontend/02-playbooks/react-state-management-playbook.md +173 -0
  304. package/frontend/03-checklists/component-quality-checklist.md +166 -0
  305. package/frontend/03-checklists/frontend-launch-checklist.md +299 -0
  306. package/frontend/04-antipatterns/frontend-antipatterns.md +886 -0
  307. package/frontend/05-cases/case-performance-optimization.md +274 -0
  308. package/harmony/01-standards/harmonyos-arkts-standard.md +75 -0
  309. package/harmony/01-standards/harmonyos-design.md +65 -0
  310. package/high-quality-engineering-playbook.md +54 -0
  311. package/incident/01-standards/incident-response-complete.md +303 -0
  312. package/incident/02-playbooks/chaos-engineering-playbook.md +883 -0
  313. package/incident/02-playbooks/postmortem-playbook.md +398 -0
  314. package/incident/03-checklists/incident-readiness-checklist.md +181 -0
  315. package/incident/04-antipatterns/incident-antipatterns.md +490 -0
  316. package/incident/05-cases/case-cascade-failure.md +176 -0
  317. package/incident/06-glossary/incident-glossary.md +114 -0
  318. package/incident/postmortem-and-response-deep-dive.md +39 -0
  319. package/industries/ecommerce/ecommerce-complete.md +631 -0
  320. package/industries/education/education-complete.md +555 -0
  321. package/industries/fintech/fintech-complete.md +501 -0
  322. package/industries/gaming/gaming-complete.md +587 -0
  323. package/industries/healthcare/healthcare-complete.md +452 -0
  324. package/low-code/01-standards/low-code-complete.md +944 -0
  325. package/miniprogram/01-standards/ai-common-mistakes.md +61 -0
  326. package/miniprogram/01-standards/miniprogram-custom-navbar-capsule.md +77 -0
  327. package/miniprogram/01-standards/miniprogram-design.md +61 -0
  328. package/miniprogram/01-standards/miniprogram-standard.md +81 -0
  329. package/mobile/01-standards/android-material-design.md +70 -0
  330. package/mobile/01-standards/flutter-complete.md +384 -0
  331. package/mobile/01-standards/ios-design-hig.md +78 -0
  332. package/mobile/01-standards/mobile-app-standard.md +85 -0
  333. package/mobile/01-standards/react-native-complete.md +352 -0
  334. package/mobile/02-playbooks/mobile-cross-platform-playbook.md +175 -0
  335. package/mobile/02-playbooks/mobile-performance.md +473 -0
  336. package/mobile/03-checklists/mobile-release-checklist.md +234 -0
  337. package/mobile/04-antipatterns/mobile-antipatterns.md +798 -0
  338. package/mobile/05-cases/case-app-performance.md +500 -0
  339. package/mobile/05-cases/case-app-startup-optimization.md +218 -0
  340. package/mobile/06-glossary/mobile-glossary.md +484 -0
  341. package/observability/01-standards/observability-standards.md +103 -0
  342. package/observability/02-playbooks/prometheus-grafana-playbook.md +135 -0
  343. package/observability/02-playbooks/structured-logging-playbook.md +73 -0
  344. package/observability/03-checklists/observability-checklist.md +54 -0
  345. package/observability/04-antipatterns/observability-antipatterns.md +106 -0
  346. package/operations/01-standards/prometheus-monitoring-complete.md +1578 -0
  347. package/operations/02-playbooks/capacity-planning-playbook.md +620 -0
  348. package/operations/03-checklists/production-launch-checklist.md +365 -0
  349. package/operations/04-antipatterns/operations-antipatterns.md +664 -0
  350. package/operations/05-cases/case-sre-practices.md +581 -0
  351. package/operations/06-glossary/operations-glossary.md +120 -0
  352. package/operations/aiops-anomaly-detection.md +758 -0
  353. package/operations/capacity-planning.md +1061 -0
  354. package/operations/chaos-engineering.md +659 -0
  355. package/operations/incident-command-system.md +38 -0
  356. package/operations/observability-complete.md +442 -0
  357. package/operations/slo-sli-playbook.md +517 -0
  358. package/operations/sre-operations-deep-dive.md +39 -0
  359. package/package.json +8 -0
  360. package/performance/01-standards/performance-and-scalability.md +80 -0
  361. package/performance/01-standards/performance-standards.md +156 -0
  362. package/performance/02-playbooks/query-optimization-playbook.md +103 -0
  363. package/performance/03-checklists/performance-checklist.md +56 -0
  364. package/performance/04-antipatterns/performance-antipatterns.md +146 -0
  365. package/product/01-standards/product-management-complete.md +285 -0
  366. package/product/02-playbooks/feature-launch-playbook.md +207 -0
  367. package/product/02-playbooks/user-research-playbook.md +532 -0
  368. package/product/03-checklists/feature-launch-checklist.md +275 -0
  369. package/product/04-antipatterns/product-antipatterns.md +355 -0
  370. package/product/05-cases/case-mvp-to-scale.md +384 -0
  371. package/product/06-glossary/product-glossary.md +462 -0
  372. package/product/feature-prioritization-framework.md +40 -0
  373. package/product/kpi-and-metric-tree.md +37 -0
  374. package/product/product-discovery-and-prd-deep-dive.md +41 -0
  375. package/quantum/01-standards/quantum-complete.md +1186 -0
  376. package/security/01-standards/api-security-complete.md +511 -0
  377. package/security/01-standards/container-runtime-security.md +574 -0
  378. package/security/01-standards/data-protection-gdpr.md +543 -0
  379. package/security/01-standards/owasp-top10-complete.md +1890 -0
  380. package/security/01-standards/secure-coding-baseline.md +90 -0
  381. package/security/01-standards/supply-chain-security.md +441 -0
  382. package/security/01-standards/web-security-checklist.md +108 -0
  383. package/security/01-standards/zero-trust-architecture.md +521 -0
  384. package/security/02-playbooks/auth-sso-playbook.md +166 -0
  385. package/security/02-playbooks/incident-response-security-playbook.md +588 -0
  386. package/security/02-playbooks/owasp-api-security-playbook.md +129 -0
  387. package/security/02-playbooks/payment-integration-playbook.md +119 -0
  388. package/security/02-playbooks/penetration-testing-playbook.md +517 -0
  389. package/security/03-checklists/security-audit-checklist.md +356 -0
  390. package/security/04-antipatterns/security-coding-antipatterns.md +580 -0
  391. package/security/05-cases/case-log4shell-incident.md +537 -0
  392. package/security/05-cases/case-major-breaches.md +468 -0
  393. package/security/06-glossary/security-glossary.md +212 -0
  394. package/security/compliance-automation.md +993 -0
  395. package/security/container-security.md +680 -0
  396. package/security/devsecops-complete.md +426 -0
  397. package/security/sast-dast-sca.md +775 -0
  398. package/security/secrets-management.md +594 -0
  399. package/security/security-architecture-deep-dive.md +37 -0
  400. package/security/threat-modeling-stride-playbook.md +40 -0
  401. package/seed-templates/auth-system.md +59 -0
  402. package/seed-templates/blog-content.md +94 -0
  403. package/seed-templates/dashboard.md +89 -0
  404. package/seed-templates/docs-site.md +73 -0
  405. package/seed-templates/e-commerce.md +50 -0
  406. package/seed-templates/saas-landing.md +92 -0
  407. package/seed-templates/settings-page.md +51 -0
  408. package/testing/01-standards/test-strategy-and-layering.md +83 -0
  409. package/testing/01-standards/testing-strategy-complete.md +422 -0
  410. package/testing/01-standards/unit-testing-best-practices.md +118 -0
  411. package/testing/02-playbooks/e2e-testing-playbook.md +988 -0
  412. package/testing/02-playbooks/testing-strategy-playbook.md +126 -0
  413. package/testing/03-checklists/test-strategy-checklist.md +208 -0
  414. package/testing/04-antipatterns/testing-antipatterns.md +718 -0
  415. package/testing/05-cases/case-testing-transformation.md +300 -0
  416. package/testing/06-glossary/testing-glossary.md +110 -0
  417. package/testing/risk-based-test-matrix.md +36 -0
  418. package/testing/testing-strategy-deep-dive.md +37 -0
@@ -0,0 +1,2117 @@
+ ---
+ id: terraform-complete
+ title: Terraform Complete Guide
+ domain: devops
+ category: 01-standards
+ difficulty: intermediate
+ tags: [complete, devops, terraform, hands-on-configuration, workflow, core-concepts, overview, module-design]
+ quality_score: 70
+ last_updated: 2026-06-15
+ ---
+ # Terraform Complete Guide
+
+ > Document version: v1.0 | Last updated: 2026-03-28 | Applies to: Terraform 1.6+ / OpenTofu 1.6+
+
+ ---
+
+ ## Table of Contents
+
+ 1. [Overview](#概述)
+ 2. [Core Concepts](#核心概念)
+ 3. [HCL Syntax in Detail](#hcl-语法详解)
+ 4. [State Management](#状态管理)
+ 5. [Module Design](#模块设计)
+ 6. [Workflow](#工作流)
+ 7. [AWS Hands-On Configuration](#aws-实战配置)
+ 8. [Security](#安全)
+ 9. [CI/CD Integration](#cicd-集成)
+ 10. [Team Collaboration](#团队协作)
+ 11. [Performance Optimization](#性能优化)
+ 12. [Common Pitfalls and Anti-Patterns](#常见陷阱与反模式)
+ 13. [Agent Checklist](#agent-checklist)
+
+ ---
+
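+ ## Overview
+
+ ### IaC Philosophy
+
+ Infrastructure as Code (IaC) expresses the definition, deployment, and management of infrastructure entirely as code, so that infrastructure is managed the same way as application code. Core principles:
+
+ - **Declarative definition**: describe the desired state rather than the steps to reach it; Terraform computes the difference and applies the changes
+ - **Version control**: every infrastructure change is tracked in Git, making it auditable and reversible
+ - **Repeatability**: the same code produces consistent results in any environment, eliminating drift caused by manual configuration
+ - **Self-documenting**: the code itself documents the infrastructure, so no separate configuration inventory has to be maintained
+ - **Collaboration-friendly**: changes are reviewed through PRs, and the team shares a single infrastructure definition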
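+
+ To make the declarative principle concrete, here is a minimal sketch. The bucket name and the `example_logs` label are hypothetical and not taken from this guide. The file states only the desired end state; `terraform plan` compares it with the recorded state and proposes whatever create, update, or delete operations are needed to converge.
+
+ ```hcl
+ # Hypothetical example: declares that one S3 bucket with these tags should exist.
+ resource "aws_s3_bucket" "example_logs" {
+   # Placeholder name (assumption); S3 bucket names must be globally unique
+   bucket = "example-project-logs"
+
+   tags = {
+     ManagedBy = "terraform"
+   }
+ }
+ ```
+
+ Under these assumptions, editing the `tags` map and re-running `terraform plan` should show an in-place update, while deleting the block should plan a destroy.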
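+
+ ### Terraform vs. Alternatives
+
+ | Dimension | Terraform / OpenTofu | Pulumi | CloudFormation | Ansible |
+ |------|---------------------|--------|----------------|---------|
+ | Language | HCL (declarative DSL) | General-purpose languages such as Python/TS/Go/C# | JSON/YAML | YAML (procedural) |
+ | Cloud support | Multi-cloud (3000+ providers) | Multi-cloud | AWS only | Multi-cloud (focused on configuration management) |
+ | State management | Explicit state file | Managed / self-managed state | Managed by AWS | Stateless (idempotent modules) |
+ | Learning curve | Medium (HCL-specific syntax) | Low (reuses languages the team already knows) | Medium-high (tied to AWS) | Low (YAML orchestration) |
+ | Plan/Preview | Native support via `terraform plan` | `pulumi preview` | Change Sets | `--check` mode |
+ | Module ecosystem | Large module catalog in the Terraform Registry | Pulumi Registry | AWS nested stacks / Modules | Ansible Galaxy |
+ | Best suited for | Infrastructure orchestration (network/compute/storage) | Complex logic + infrastructure | AWS-only environments | Configuration management + application deployment |
+ | License | BSL 1.1 (Terraform) / MPL 2.0 (OpenTofu) | Apache 2.0 | Closed source | GPL 3.0 |
+
+ **Selection guidance**:
+ - Multi-cloud / hybrid-cloud infrastructure orchestration → Terraform / OpenTofu
+ - Team with a strong Python/TS background and complex logic → Pulumi
+ - AWS-only and already deeply invested in AWS services → CloudFormation
+ - Server configuration management + application deployment → Ansible
+ - Terraform and Ansible are often combined: Terraform handles the infrastructure layer and Ansible handles the configuration layer
+
+ ---
+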
69
+ ## 核心概念
70
+
71
+ ### Provider
72
+
73
+ Provider 是 Terraform 与外部 API 交互的插件。每个云平台、SaaS 服务或内部系统都有对应的 Provider。
74
+
75
+ ```hcl
76
+ # 声明 Provider 及版本约束
77
+ terraform {
78
+ required_version = ">= 1.6.0"
79
+
80
+ required_providers {
81
+ aws = {
82
+ source = "hashicorp/aws"
83
+ version = "~> 5.40" # 允许 5.40.x 补丁更新
84
+ }
85
+ random = {
86
+ source = "hashicorp/random"
87
+ version = ">= 3.6.0, < 4.0.0"
88
+ }
89
+ }
90
+ }
91
+
92
+ # 配置 Provider
93
+ provider "aws" {
94
+ region = var.aws_region
95
+
96
+ default_tags {
97
+ tags = {
98
+ Environment = var.environment
99
+ ManagedBy = "terraform"
100
+ Project = var.project_name
101
+ }
102
+ }
103
+ }
104
+
105
+ # Additional provider instance (alias)
106
+ provider "aws" {
107
+ alias = "us_west"
108
+ region = "us-west-2"
109
+ }
110
+ ```
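The `us_west` alias above only takes effect once a resource or module selects it explicitly. A minimal sketch, assuming a hypothetical replica bucket and a hypothetical second VPC module call (the names `replica` and `dr_vpc` and the CIDR are illustrative and not part of the original configuration):

```hcl
# Resource pinned to the aliased provider (hypothetical bucket)
resource "aws_s3_bucket" "replica" {
  provider = aws.us_west
  bucket   = "${var.project_name}-${var.environment}-replica"
}

# Child module whose AWS resources should be created in us-west-2
module "dr_vpc" {
  source = "./modules/vpc"

  # Map the module's default "aws" provider to the aliased instance
  providers = {
    aws = aws.us_west
  }

  vpc_cidr     = "10.1.0.0/16"
  environment  = var.environment
  project_name = var.project_name
}
```

Resources that do not set `provider` keep using the default (un-aliased) `aws` configuration.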
111
+
112
+ ### Resource
113
+
114
+ A Resource is the core object Terraform manages; each one represents a single infrastructure component.
115
+
116
+ ```hcl
117
+ resource "aws_instance" "web" {
118
+ ami = data.aws_ami.ubuntu.id
119
+ instance_type = var.instance_type
120
+ subnet_id = aws_subnet.public[0].id
121
+
122
+ vpc_security_group_ids = [aws_security_group.web.id]
123
+
124
+ root_block_device {
125
+ volume_size = 20
126
+ volume_type = "gp3"
127
+ encrypted = true
128
+ }
129
+
130
+ tags = {
131
+ Name = "${var.project_name}-web"
132
+ }
133
+
134
+ lifecycle {
135
+ create_before_destroy = true
136
+ prevent_destroy = false
137
+ ignore_changes = [ami] # AMI updates are handled by a separate process
138
+ }
139
+ }
140
+ ```
141
+
142
+ ### Data Source
143
+
144
+ A Data Source queries external information; it does not create resources.
145
+
146
+ ```hcl
147
+ # Look up the latest Ubuntu AMI
148
+ data "aws_ami" "ubuntu" {
149
+ most_recent = true
150
+ owners = ["099720109477"] # Canonical
151
+
152
+ filter {
153
+ name = "name"
154
+ values = ["ubuntu/images/hvm-ssd/ubuntu-*-22.04-amd64-server-*"]
155
+ }
156
+
157
+ filter {
158
+ name = "virtualization-type"
159
+ values = ["hvm"]
160
+ }
161
+ }
162
+
163
+ # Look up the current AWS account identity
164
+ data "aws_caller_identity" "current" {}
165
+
166
+ # Look up availability zones
167
+ data "aws_availability_zones" "available" {
168
+ state = "available"
169
+ }
170
+ ```
171
+
172
+ ### Variable
173
+
174
+ Input variables define a module's parameterized interface.
175
+
176
+ ```hcl
177
+ variable "environment" {
178
+ description = "Deployment environment (dev/staging/prod)"
179
+ type = string
180
+
181
+ validation {
182
+ condition = contains(["dev", "staging", "prod"], var.environment)
183
+ error_message = "Environment must be one of: dev, staging, prod."
184
+ }
185
+ }
186
+
187
+ variable "instance_count" {
188
+ description = "Number of EC2 instances to create"
189
+ type = number
190
+ default = 2
191
+
192
+ validation {
193
+ condition = var.instance_count >= 1 && var.instance_count <= 20
194
+ error_message = "Instance count must be between 1 and 20."
195
+ }
196
+ }
197
+
198
+ variable "allowed_cidrs" {
199
+ description = "List of CIDR blocks allowed to access the service"
200
+ type = list(string)
201
+ default = []
202
+ }
203
+
204
+ variable "tags" {
205
+ description = "Additional tags to apply to all resources"
206
+ type = map(string)
207
+ default = {}
208
+ }
209
+ ```
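Values for these variables are typically supplied from a variable definitions file rather than edited into the declarations. A minimal sketch, assuming a hypothetical `prod.tfvars` file (the file name and all values are illustrative):

```hcl
# prod.tfvars (hypothetical file; values are examples only)
environment    = "prod"            # must pass the dev/staging/prod validation
instance_count = 3                 # must be between 1 and 20
allowed_cidrs  = ["203.0.113.0/24"]

tags = {
  CostCenter = "platform"
}
```

It would be passed with `terraform plan -var-file=prod.tfvars`; a value that violates a `validation` block fails at plan time with the declared `error_message`.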
210
+
211
+ ### Output
212
+
213
+ Output values expose a module's computed results so other modules can reference them or so they can be shown to the user.
214
+
215
+ ```hcl
216
+ output "instance_ids" {
217
+ description = "IDs of created EC2 instances"
218
+ value = aws_instance.web[*].id
219
+ }
220
+
221
+ output "load_balancer_dns" {
222
+ description = "DNS name of the load balancer"
223
+ value = aws_lb.main.dns_name
224
+ }
225
+
226
+ output "database_endpoint" {
227
+ description = "RDS instance endpoint"
228
+ value = aws_db_instance.main.endpoint
229
+ sensitive = true # Hidden in CLI output
230
+ }
231
+ ```
232
+
233
+ ### Local
234
+
235
+ Local values hold intermediate computations and simplify repeated expressions.
236
+
237
+ ```hcl
238
+ locals {
239
+ # Merge common tags
240
+ common_tags = merge(var.tags, {
241
+ Environment = var.environment
242
+ ManagedBy = "terraform"
243
+ Project = var.project_name
244
+ })
245
+
246
+ # Choose the instance type by environment
247
+ instance_type = {
248
+ dev = "t3.micro"
249
+ staging = "t3.small"
250
+ prod = "t3.medium"
251
+ }[var.environment]
252
+
253
+ # Availability zone list
254
+ azs = slice(data.aws_availability_zones.available.names, 0, 3)
255
+
256
+ # CIDR calculation
257
+ private_subnets = [for i, az in local.azs : cidrsubnet(var.vpc_cidr, 8, i)]
258
+ public_subnets = [for i, az in local.azs : cidrsubnet(var.vpc_cidr, 8, i + 100)]
259
+ }
260
+ ```
261
+
262
+ ### Module
263
+
264
+ A module is a reusable package of Terraform configuration that encapsulates a set of related resources.
265
+
266
+ ```hcl
267
+ # Call a local module
268
+ module "vpc" {
269
+ source = "./modules/vpc"
270
+
271
+ vpc_cidr = "10.0.0.0/16"
272
+ environment = var.environment
273
+ project_name = var.project_name
274
+ }
275
+
276
+ # Call a Registry module
277
+ module "eks" {
278
+ source = "terraform-aws-modules/eks/aws"
279
+ version = "~> 20.0"
280
+
281
+ cluster_name = "${var.project_name}-${var.environment}"
282
+ cluster_version = "1.29"
283
+ vpc_id = module.vpc.vpc_id
284
+ subnet_ids = module.vpc.private_subnet_ids
285
+ }
286
+ ```
287
+
288
+ ---
289
+
290
+ ## HCL Syntax in Detail
291
+
292
+ ### Type System
293
+
294
+ HCL has a complete type system that supports both primitive and complex types.
295
+
296
+ ```hcl
297
+ # Primitive types
298
+ variable "name" { type = string }
299
+ variable "replica_count" { type = number } # "count" is a reserved variable name and cannot be declared
300
+ variable "enabled" { type = bool }
301
+
302
+ # Collection types
303
+ variable "cidrs" { type = list(string) }
304
+ variable "ports" { type = set(number) }
305
+ variable "tags" { type = map(string) }
306
+
307
+ # Structural types
308
+ variable "database" {
309
+ type = object({
310
+ engine = string
311
+ engine_version = string
312
+ instance_class = string
313
+ storage_gb = number
314
+ multi_az = bool
315
+ backup_retention = optional(number, 7) # Optional attribute with a default value
316
+ })
317
+ }
318
+
319
+ # Tuple type (fixed length, heterogeneous)
320
+ variable "rule" {
321
+ type = tuple([string, number, string])
322
+ # Example: ["tcp", 443, "0.0.0.0/0"]
323
+ }
324
+
325
+ # any type (the concrete type is inferred from the value supplied)
326
+ variable "flexible_input" {
327
+ type = any
328
+ default = null
329
+ }
330
+ ```
331
+
332
+ ### Expressions
333
+
334
+ ```hcl
335
+ # String interpolation (count.index is only valid when the resource sets count)
+ resource "aws_instance" "app" {
+ count = var.instance_count
337
+ tags = {
338
+ Name = "${var.project_name}-${var.environment}-app-${count.index + 1}"
339
+ }
340
+ }
341
+
342
+ # Multi-line string (heredoc)
343
+ resource "aws_iam_policy" "example" {
344
+ policy = <<-EOT
345
+ {
346
+ "Version": "2012-10-17",
347
+ "Statement": [
348
+ {
349
+ "Effect": "Allow",
350
+ "Action": "s3:GetObject",
351
+ "Resource": "arn:aws:s3:::${var.bucket_name}/*"
352
+ }
353
+ ]
354
+ }
355
+ EOT
356
+ }
357
+
358
+ # Referencing other resources
359
+ resource "aws_security_group_rule" "ingress" {
360
+ security_group_id = aws_security_group.web.id # Reference an attribute of another resource
361
+ source_security_group_id = module.alb.security_group_id # Reference a module output
362
+ }
363
+ ```
364
+
365
+ ### Built-in Functions
366
+
367
+ ```hcl
368
+ locals {
369
+ # String functions
370
+ upper_env = upper(var.environment) # "PROD"
371
+ name_parts = split("-", var.resource_name) # ["my", "app"]
372
+ joined = join(", ", var.cidrs) # "10.0.0.0/8, 172.16.0.0/12"
373
+ trimmed = trimspace(" hello ") # "hello"
374
+ replaced = replace(var.name, "/", "-") # Replace characters
375
+
376
+ # Numeric functions
377
+ max_val = max(5, 12, 9) # 12
378
+ min_val = min(var.min_size, 10) # Take the smaller value
379
+ ceiling = ceil(7.3) # 8
380
+
381
+ # Collection functions
382
+ flat_list = flatten([var.public_subnets, var.private_subnets])
383
+ unique_list = distinct(var.regions)
384
+ merged_map = merge(local.default_tags, var.extra_tags)
385
+ keys_list = keys(var.tags)
386
+ values_list = values(var.tags)
387
+ sorted = sort(var.names)
388
+ lookup_val = lookup(var.instance_map, "web", "t3.micro") # Map lookup with a default value
389
+ contains_it = contains(var.allowed_envs, "prod")
390
+
391
+ # Encoding functions
392
+ json_policy = jsonencode({
393
+ Version = "2012-10-17"
394
+ Statement = [{ Effect = "Allow", Action = "*", Resource = "*" }]
395
+ })
396
+ b64_encoded = base64encode("hello world")
397
+ yaml_config = yamlencode({ key = "value", list = [1, 2, 3] })
398
+
399
+ # Filesystem functions
400
+ user_data = file("${path.module}/scripts/init.sh")
401
+ template = templatefile("${path.module}/templates/config.tpl", {
402
+ db_host = aws_db_instance.main.address
403
+ db_port = aws_db_instance.main.port
404
+ })
405
+
406
+ # IP network functions
407
+ subnet_cidr = cidrsubnet("10.0.0.0/16", 8, 1) # "10.0.1.0/24"
408
+ host_ip = cidrhost("10.0.1.0/24", 5) # "10.0.1.5"
409
+
410
+ # Type conversion
411
+ str_to_num = tonumber("42")
412
+ num_to_str = tostring(42)
413
+ to_set = toset(["a", "b", "a"]) # Removes duplicates
414
+ }
415
+ ```
416
+
417
+ ### Dynamic Blocks
418
+
419
+ ```hcl
420
+ # Generate security group rules dynamically
421
+ resource "aws_security_group" "web" {
422
+ name = "${var.project_name}-web-sg"
423
+ description = "Security group for web tier"
424
+ vpc_id = module.vpc.vpc_id
425
+
426
+ dynamic "ingress" {
427
+ for_each = var.ingress_rules
428
+ content {
429
+ description = ingress.value.description
430
+ from_port = ingress.value.port
431
+ to_port = ingress.value.port
432
+ protocol = ingress.value.protocol
433
+ cidr_blocks = ingress.value.cidr_blocks
434
+ }
435
+ }
436
+
437
+ egress {
438
+ from_port = 0
439
+ to_port = 0
440
+ protocol = "-1"
441
+ cidr_blocks = ["0.0.0.0/0"]
442
+ }
443
+ }
444
+
445
+ # The corresponding variable definition
446
+ variable "ingress_rules" {
447
+ type = list(object({
448
+ description = string
449
+ port = number
450
+ protocol = string
451
+ cidr_blocks = list(string)
452
+ }))
453
+ default = [
454
+ { description = "HTTPS", port = 443, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
455
+ { description = "HTTP", port = 80, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
456
+ ]
457
+ }
458
+ ```
459
+
460
+ ### for_each vs count
461
+
462
+ ```hcl
463
+ # --- count: simple iteration over a number ---
464
+ resource "aws_instance" "worker" {
465
+ count = var.worker_count
466
+
467
+ ami = data.aws_ami.ubuntu.id
468
+ instance_type = "t3.medium"
469
+ subnet_id = element(module.vpc.private_subnet_ids, count.index)
470
+
471
+ tags = {
472
+ Name = "${var.project_name}-worker-${count.index + 1}"
473
+ }
474
+ }
475
+
476
+ # --- for_each: iteration over a collection (recommended) ---
+ # Use a map to give each instance its own configuration
478
+ variable "instances" {
479
+ type = map(object({
480
+ instance_type = string
481
+ subnet_tier = string
482
+ }))
483
+ default = {
484
+ web = { instance_type = "t3.small", subnet_tier = "public" }
485
+ api = { instance_type = "t3.medium", subnet_tier = "private" }
486
+ worker = { instance_type = "t3.large", subnet_tier = "private" }
487
+ }
488
+ }
489
+
490
+ resource "aws_instance" "app" {
491
+ for_each = var.instances
492
+
493
+ ami = data.aws_ami.ubuntu.id
494
+ instance_type = each.value.instance_type
495
+ subnet_id = each.value.subnet_tier == "public" ? module.vpc.public_subnet_ids[0] : module.vpc.private_subnet_ids[0]
496
+
497
+ tags = {
498
+ Name = "${var.project_name}-${each.key}"
499
+ Role = each.key
500
+ }
501
+ }
502
+
503
+ # --- for expressions ---
504
+ locals {
505
+ # List comprehension
506
+ instance_ids = [for inst in aws_instance.app : inst.id]
507
+
508
+ # Map comprehension
509
+ instance_ip_map = { for k, inst in aws_instance.app : k => inst.private_ip }
510
+
511
+ # Comprehension with a filter
512
+ public_instances = { for k, v in var.instances : k => v if v.subnet_tier == "public" }
513
+
514
+ # Nested comprehension
515
+ sg_rules = flatten([
516
+ for name, config in var.services : [
517
+ for port in config.ports : {
518
+ name = name
519
+ port = port
520
+ }
521
+ ]
522
+ ])
523
+ }
524
+ ```
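One reason `for_each` is usually preferred: `count` addresses instances by position (`aws_instance.worker[0]`, `aws_instance.worker[1]`, and so on), so removing an element from the middle of the underlying list shifts every later index and Terraform plans changes for instances that were not meant to change. With `for_each`, each instance is addressed by a stable key. A sketch under the assumption of a hypothetical `worker_names` variable and a hypothetical `worker_keyed` resource name:

```hcl
# Hypothetical variable: each element becomes a stable instance key
variable "worker_names" {
  description = "Stable keys for worker instances"
  type        = set(string)
  default     = ["alpha", "beta", "gamma"]
}

resource "aws_instance" "worker_keyed" {
  for_each = var.worker_names

  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.medium"
  subnet_id     = module.vpc.private_subnet_ids[0]

  tags = {
    Name = "${var.project_name}-worker-${each.key}"
  }
}

# Addresses are aws_instance.worker_keyed["alpha"], ["beta"], ["gamma"].
# Removing "beta" from worker_names affects only that one instance.
```

`count` remains a reasonable choice for identical, interchangeable instances and for the 0-or-1 conditional pattern.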
525
+
526
+ ### Conditional Expressions
527
+
528
+ ```hcl
529
+ # Ternary operator
530
+ resource "aws_instance" "app" {
531
+ instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
532
+ }
533
+
534
+ # Conditionally create a resource (count)
535
+ resource "aws_cloudwatch_metric_alarm" "high_cpu" {
536
+ count = var.environment == "prod" ? 1 : 0
537
+
538
+ alarm_name = "${var.project_name}-high-cpu"
539
+ namespace = "AWS/EC2"
540
+ metric_name = "CPUUtilization"
541
+ # ...
542
+ }
543
+
544
+ # Conditionally create a resource (for_each)
545
+ resource "aws_route53_record" "alias" {
546
+ for_each = var.create_dns ? toset(["main"]) : toset([])
547
+
548
+ zone_id = var.zone_id
549
+ name = var.domain_name
550
+ type = "A"
551
+ # ...
552
+ }
553
+
554
+ # Conditional output
555
+ output "bastion_ip" {
556
+ value = var.enable_bastion ? aws_instance.bastion[0].public_ip : null
557
+ }
558
+ ```
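For the 0-or-1 `count` pattern, the built-in `one()` function is an alternative to the ternary in the output: it returns `null` for an empty list and the single element otherwise. A minimal sketch, assuming a hypothetical bastion definition (the instance arguments and the output name are illustrative):

```hcl
# Hypothetical bastion host that exists only when enable_bastion is true
resource "aws_instance" "bastion" {
  count = var.enable_bastion ? 1 : 0

  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  subnet_id     = module.vpc.public_subnet_ids[0]
}

output "bastion_public_ip" {
  description = "Public IP of the bastion host, or null when it is disabled"
  # The splat yields a list with zero or one element; one() unwraps it
  value = one(aws_instance.bastion[*].public_ip)
}
```

This keeps the condition in one place (the `count` argument) instead of repeating it in every expression that reads the resource.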
559
+
560
+ ---
561
+
562
+ ## State Management
563
+
564
+ ### Remote State
565
+
566
+ **State files contain sensitive information and must never be committed to Git.** Production environments must use a remote backend.
567
+
568
+ ```hcl
569
+ # S3 backend (recommended on AWS)
570
+ terraform {
571
+ backend "s3" {
572
+ bucket = "mycompany-terraform-state"
573
+ key = "prod/vpc/terraform.tfstate"
574
+ region = "ap-northeast-1"
575
+ encrypt = true
576
+ dynamodb_table = "terraform-state-lock" # State Locking
577
+ # Enable versioning on the bucket itself (see the bootstrap resources below) so state can be rolled back
578
+ }
579
+ }
580
+
581
+ # Create the backend infrastructure (bootstrap, managed separately)
582
+ resource "aws_s3_bucket" "terraform_state" {
583
+ bucket = "mycompany-terraform-state"
584
+
585
+ lifecycle {
586
+ prevent_destroy = true
587
+ }
588
+ }
589
+
590
+ resource "aws_s3_bucket_versioning" "terraform_state" {
591
+ bucket = aws_s3_bucket.terraform_state.id
592
+ versioning_configuration {
593
+ status = "Enabled"
594
+ }
595
+ }
596
+
597
+ resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
598
+ bucket = aws_s3_bucket.terraform_state.id
599
+ rule {
600
+ apply_server_side_encryption_by_default {
601
+ sse_algorithm = "aws:kms"
602
+ }
603
+ }
604
+ }
605
+
606
+ resource "aws_dynamodb_table" "terraform_lock" {
607
+ name = "terraform-state-lock"
608
+ billing_mode = "PAY_PER_REQUEST"
609
+ hash_key = "LockID"
610
+
611
+ attribute {
612
+ name = "LockID"
613
+ type = "S"
614
+ }
615
+ }
616
+ ```
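Once state lives in a remote backend, another configuration can read its root-level outputs through the built-in `terraform_remote_state` data source. A minimal sketch, assuming the VPC configuration stored at the key above exports `vpc_id` and `private_subnet_ids` outputs (those output names and the subnet group resource are assumptions for illustration):

```hcl
# Read outputs from the VPC configuration's state (read-only)
data "terraform_remote_state" "vpc" {
  backend = "s3"

  config = {
    bucket = "mycompany-terraform-state"
    key    = "prod/vpc/terraform.tfstate"
    region = "ap-northeast-1"
  }
}

# Hypothetical consumer of those outputs
resource "aws_db_subnet_group" "shared" {
  name       = "${var.project_name}-shared-db-subnet"
  subnet_ids = data.terraform_remote_state.vpc.outputs.private_subnet_ids
}
```

Anyone who can read the remote state can read everything in it, so access to the state bucket should be scoped as tightly as access to the secrets it may contain.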
617
+
618
+ ### State Locking
619
+
620
+ State locking prevents concurrent operations by multiple people from corrupting the state.
621
+
622
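+ - **S3 Backend**: locking via a DynamoDB table (Terraform 1.10+ also offers an S3-native lock file through `use_lockfile`)
+ - **GCS Backend**: native locking support
+ - **Terraform Cloud / Enterprise**: built-in lock management
+ - **Force unlock** (only when a stale lock is left behind): `terraform force-unlock <LOCK_ID>`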
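Newer Terraform releases can lock S3 state without a DynamoDB table by writing a lock object next to the state file. A sketch of that variant, assuming Terraform 1.10 or newer (where the `use_lockfile` argument was introduced) and the same bucket and key as in the earlier example:

```hcl
terraform {
  backend "s3" {
    bucket  = "mycompany-terraform-state"
    key     = "prod/vpc/terraform.tfstate"
    region  = "ap-northeast-1"
    encrypt = true

    # S3-native locking; assumes Terraform >= 1.10.
    # No DynamoDB table is required with this option.
    use_lockfile = true
  }
}
```

After changing backend arguments, `terraform init -reconfigure` (or `-migrate-state` when the storage location changes) is needed before the next plan.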
626
+
627
+ ### State Migration
628
+
629
+ ```bash
630
+ # Migrate from local state to S3
+ # 1. Add the backend "s3" configuration
+ # 2. Run init with state migration
633
+ terraform init -migrate-state
634
+
635
+ # Migrate between backends non-interactively (-force-copy answers "yes" to the copy prompts)
636
+ terraform init -migrate-state -force-copy
637
+
638
+ # Export the current state
639
+ terraform state pull > terraform.tfstate.backup
640
+
641
+ # Push a state file (use with caution: it overwrites the remote state)
642
+ terraform state push terraform.tfstate
643
+ ```
644
+
645
+ ### State Operations
646
+
647
+ ```bash
648
+ # List all resources
649
+ terraform state list
650
+
651
+ # Show details of one resource
652
+ terraform state show aws_instance.web
653
+
654
+ # Move a resource (rename it or move it into a module)
655
+ terraform state mv aws_instance.web aws_instance.app
656
+ terraform state mv aws_instance.web module.compute.aws_instance.web
657
+
658
+ # Remove from state (the real resource is not deleted)
659
+ terraform state rm aws_instance.legacy
660
+
661
+ # Import an existing resource into Terraform management
662
+ terraform import aws_instance.web i-1234567890abcdef0
663
+
664
+ # Terraform 1.5+ declarative import (HCL, not a shell command; put it in a .tf file):
+ #   import {
+ #     to = aws_instance.web
+ #     id = "i-1234567890abcdef0"
+ #   }
669
+
670
+ # taint: mark a resource for recreation on the next apply
671
+ terraform taint aws_instance.web
672
+
673
+ # untaint: clear the taint mark
674
+ terraform untaint aws_instance.web
675
+
676
+ # Terraform 0.15.2+ recommends -replace instead of the deprecated taint command
677
+ terraform apply -replace="aws_instance.web"
678
+ ```
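The `state mv` and `state rm` commands above also have declarative counterparts that live in configuration and go through the normal plan and review cycle. A minimal sketch, assuming Terraform 1.1+ for `moved` and 1.7+ for `removed`, and reusing the resource addresses from the commands above:

```hcl
# Declarative equivalent of:
#   terraform state mv aws_instance.web aws_instance.app
moved {
  from = aws_instance.web
  to   = aws_instance.app
}

# Declarative equivalent of:
#   terraform state rm aws_instance.legacy
# The resource block for aws_instance.legacy must be deleted from the
# configuration at the same time.
removed {
  from = aws_instance.legacy

  lifecycle {
    destroy = false # forget the resource without destroying it
  }
}
```

Because these blocks are ordinary code, the change shows up in `terraform plan` and in PR review, which the imperative state commands do not.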
679
+
680
+ ---
681
+
682
+ ## Module Design
683
+
684
+ ### Standard Module Structure
685
+
686
+ ```
687
+ modules/
+ └── vpc/
+     ├── main.tf          # Core resource definitions
+     ├── variables.tf     # Input variables
+     ├── outputs.tf       # Output values
+     ├── versions.tf      # Provider version constraints
+     ├── locals.tf        # Local values
+     ├── data.tf          # Data sources
+     ├── README.md        # Module documentation (generated by terraform-docs)
+     └── examples/        # Usage examples
+         └── simple/
+             └── main.tf
699
+ ```
700
+
701
+ ### Module Input and Output Design
702
+
703
+ ```hcl
704
+ # modules/vpc/variables.tf
705
+ variable "vpc_cidr" {
706
+ description = "CIDR block for the VPC"
707
+ type = string
708
+
709
+ validation {
710
+ condition = can(cidrhost(var.vpc_cidr, 0))
711
+ error_message = "Must be a valid CIDR block."
712
+ }
713
+ }
714
+
715
+ variable "enable_nat_gateway" {
716
+ description = "Whether to create NAT Gateway for private subnets"
717
+ type = bool
718
+ default = true
719
+ }
720
+
721
+ variable "single_nat_gateway" {
722
+ description = "Use a single NAT Gateway instead of one per AZ (cost saving for non-prod)"
723
+ type = bool
724
+ default = false
725
+ }
726
+
727
+ # modules/vpc/outputs.tf
728
+ output "vpc_id" {
729
+ description = "The ID of the VPC"
730
+ value = aws_vpc.main.id
731
+ }
732
+
733
+ output "private_subnet_ids" {
734
+ description = "List of private subnet IDs"
735
+ value = aws_subnet.private[*].id
736
+ }
737
+
738
+ output "public_subnet_ids" {
739
+ description = "List of public subnet IDs"
740
+ value = aws_subnet.public[*].id
741
+ }
742
+
743
+ output "nat_gateway_ips" {
744
+ description = "Elastic IPs of the NAT Gateways"
745
+ value = aws_eip.nat[*].public_ip
746
+ }
747
+ ```
748
+
749
+ ### Version Constraints
750
+
751
+ ```hcl
752
+ # Exact version
753
+ module "vpc" {
754
+ source = "terraform-aws-modules/vpc/aws"
755
+ version = "5.5.3"
756
+ }
757
+
758
+ # Minor version range (pessimistic constraint)
759
+ module "eks" {
760
+ source = "terraform-aws-modules/eks/aws"
761
+ version = "~> 20.8" # >= 20.8.0, < 21.0.0
762
+ }
763
+
764
+ # Explicit version range
765
+ module "rds" {
766
+ source = "terraform-aws-modules/rds/aws"
767
+ version = ">= 6.0.0, < 7.0.0"
768
+ }
769
+
770
+ # Git source (pinned to a version tag)
771
+ module "custom" {
772
+ source = "git::https://github.com/myorg/terraform-modules.git//modules/custom?ref=v2.1.0"
773
+ }
774
+ ```
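The `~>` operator lets only the rightmost version component that is written increase, so the number of components changes the range. A short sketch contrasting the two forms (the provider versions are illustrative):

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Three components: only the patch number may increase
      version = "~> 5.40.0" # >= 5.40.0, < 5.41.0
    }
    random = {
      source = "hashicorp/random"
      # Two components: the minor number may increase
      version = "~> 3.6" # >= 3.6.0, < 4.0.0
    }
  }
}
```

Whichever constraint is used, committing `.terraform.lock.hcl` keeps the exact provider versions selected by `terraform init` consistent across machines and CI.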
775
+
776
+ ### Module Composition Pattern
777
+
778
+ ```hcl
779
+ # The root module composes several child modules
780
+ module "networking" {
781
+ source = "./modules/vpc"
782
+
783
+ vpc_cidr = var.vpc_cidr
784
+ environment = var.environment
785
+ enable_nat_gateway = var.environment == "prod"
786
+ single_nat_gateway = var.environment != "prod"
787
+ }
788
+
789
+ module "database" {
790
+ source = "./modules/rds"
791
+
792
+ vpc_id = module.networking.vpc_id
793
+ subnet_ids = module.networking.private_subnet_ids
794
+ instance_class = "db.${local.instance_type}" # RDS instance classes carry a db. prefix (e.g. db.t3.micro)
795
+ environment = var.environment
796
+ multi_az = var.environment == "prod"
797
+ }
798
+
799
+ module "compute" {
800
+ source = "./modules/ecs"
801
+
802
+ vpc_id = module.networking.vpc_id
803
+ private_subnet_ids = module.networking.private_subnet_ids
804
+ public_subnet_ids = module.networking.public_subnet_ids
805
+ db_endpoint = module.database.endpoint
806
+ environment = var.environment
807
+ }
808
+ ```
809
+
810
+ ---
811
+
812
+ ## Workflow
813
+
814
+ ### Core Commands
815
+
816
+ ```bash
817
+ # Initialize: download providers and modules, initialize the backend
+ terraform init
+ terraform init -upgrade # Upgrade providers and modules to the newest versions the constraints allow
+ terraform init -reconfigure # Reconfigure the backend, ignoring any existing configuration
+
+ # Format
+ terraform fmt # Format the current directory
+ terraform fmt -recursive # Format all subdirectories recursively
+ terraform fmt -check # Check formatting only (for CI)
+
+ # Validate
+ terraform validate # Syntax and type checks (does not contact remote services)
+
+ # Plan
+ terraform plan # Preview changes
+ terraform plan -out=tfplan # Save the plan to a file (recommended for CI/CD)
+ terraform plan -target=aws_instance.web # Plan only a specific resource
+ terraform plan -var="environment=prod" # Pass a variable
+
+ # Apply
+ terraform apply # Apply after interactive confirmation
+ terraform apply tfplan # Apply a saved plan (no second confirmation)
+ terraform apply -auto-approve # Skip confirmation (automation pipelines only)
+ terraform apply -replace="aws_instance.web" # Force recreation of the given resource
+
+ # Destroy
+ terraform destroy # Destroy all resources
+ terraform destroy -target=aws_instance.web # Destroy only the given resource
+
+ # Outputs
+ terraform output # Show all outputs
+ terraform output -json # Output as JSON
+ terraform output database_endpoint # Show a single output
+
+ # Dependency graph
+ terraform graph | dot -Tpng > graph.png # Render the resource dependency graph (requires Graphviz)
853
+ ```
854
+
855
+ ### Workspace
856
+
857
+ Workspaces manage the state of multiple environments within a single configuration (a lightweight approach).
858
+
859
+ ```bash
860
+ # Create and switch to a new workspace
861
+ terraform workspace new dev
862
+ terraform workspace new staging
863
+ terraform workspace new prod
864
+
865
+ # List all workspaces
866
+ terraform workspace list
867
+
868
+ # Switch workspace
869
+ terraform workspace select prod
870
+
871
+ # Reference the current workspace in configuration (HCL, in main.tf; not a shell command):
+ #   locals {
+ #     env = terraform.workspace
+ #   }
877
+
878
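+ **Note**: For complex projects, prefer directory separation (`environments/dev/`, `environments/prod/`) over workspaces, because different environments often have different resource combinations.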
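With directory separation, each environment directory is its own root module with its own backend key and calls the shared modules. A minimal sketch of what `environments/prod/main.tf` might look like, assuming the module layout and state bucket used earlier in this document (the state key and the literal values are illustrative):

```hcl
# environments/prod/main.tf (hypothetical layout)
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "prod/terraform.tfstate" # one state key per environment
    region         = "ap-northeast-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

module "networking" {
  source = "../../modules/vpc"

  vpc_cidr     = "10.0.0.0/16"
  environment  = "prod"
  project_name = var.project_name
}
```

A matching `environments/dev/main.tf` would use a different `key` and can call a different set of modules, which a single workspace-based configuration cannot express as directly.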
879
+
880
+ ---
881
+
882
+ ## AWS Configuration in Practice
883
+
884
+ ### Complete VPC Configuration
885
+
886
+ ```hcl
887
+ resource "aws_vpc" "main" {
888
+ cidr_block = var.vpc_cidr
889
+ enable_dns_hostnames = true
890
+ enable_dns_support = true
891
+
892
+ tags = merge(local.common_tags, {
893
+ Name = "${var.project_name}-${var.environment}-vpc"
894
+ })
895
+ }
896
+
897
+ # Public subnets (multi-AZ)
898
+ resource "aws_subnet" "public" {
899
+ count = length(local.azs)
900
+
901
+ vpc_id = aws_vpc.main.id
902
+ cidr_block = local.public_subnets[count.index]
903
+ availability_zone = local.azs[count.index]
904
+ map_public_ip_on_launch = true
905
+
906
+ tags = merge(local.common_tags, {
907
+ Name = "${var.project_name}-public-${local.azs[count.index]}"
908
+ Tier = "public"
909
+ "kubernetes.io/role/elb" = "1" # Subnet discovery tag for internet-facing load balancers on EKS
910
+ })
911
+ }
912
+
913
+ # Private subnets (multi-AZ)
914
+ resource "aws_subnet" "private" {
915
+ count = length(local.azs)
916
+
917
+ vpc_id = aws_vpc.main.id
918
+ cidr_block = local.private_subnets[count.index]
919
+ availability_zone = local.azs[count.index]
920
+
921
+ tags = merge(local.common_tags, {
922
+ Name = "${var.project_name}-private-${local.azs[count.index]}"
923
+ Tier = "private"
924
+ "kubernetes.io/role/internal-elb" = "1"
925
+ })
926
+ }
927
+
928
+ # Internet Gateway
929
+ resource "aws_internet_gateway" "main" {
930
+ vpc_id = aws_vpc.main.id
931
+
932
+ tags = merge(local.common_tags, {
933
+ Name = "${var.project_name}-igw"
934
+ })
935
+ }
936
+
937
+ # NAT Gateway (one per AZ for production high availability, or a single shared one when single_nat_gateway is true)
938
+ resource "aws_eip" "nat" {
939
+ count = var.single_nat_gateway ? 1 : length(local.azs)
940
+ domain = "vpc"
941
+
942
+ tags = merge(local.common_tags, {
943
+ Name = "${var.project_name}-nat-eip-${count.index + 1}"
944
+ })
945
+ }
946
+
947
+ resource "aws_nat_gateway" "main" {
948
+ count = var.single_nat_gateway ? 1 : length(local.azs)
949
+
950
+ allocation_id = aws_eip.nat[count.index].id
951
+ subnet_id = aws_subnet.public[count.index].id
952
+
953
+ tags = merge(local.common_tags, {
954
+ Name = "${var.project_name}-nat-${count.index + 1}"
955
+ })
956
+
957
+ depends_on = [aws_internet_gateway.main]
958
+ }
959
+
960
+ # Route tables
961
+ resource "aws_route_table" "public" {
962
+ vpc_id = aws_vpc.main.id
963
+
964
+ route {
965
+ cidr_block = "0.0.0.0/0"
966
+ gateway_id = aws_internet_gateway.main.id
967
+ }
968
+
969
+ tags = merge(local.common_tags, { Name = "${var.project_name}-public-rt" })
970
+ }
971
+
972
+ resource "aws_route_table" "private" {
973
+ count = var.single_nat_gateway ? 1 : length(local.azs)
974
+ vpc_id = aws_vpc.main.id
975
+
976
+ route {
977
+ cidr_block = "0.0.0.0/0"
978
+ nat_gateway_id = aws_nat_gateway.main[var.single_nat_gateway ? 0 : count.index].id
979
+ }
980
+
981
+ tags = merge(local.common_tags, { Name = "${var.project_name}-private-rt-${count.index + 1}" })
982
+ }
983
+
984
+ resource "aws_route_table_association" "public" {
985
+ count = length(local.azs)
986
+ subnet_id = aws_subnet.public[count.index].id
987
+ route_table_id = aws_route_table.public.id
988
+ }
989
+
990
+ resource "aws_route_table_association" "private" {
991
+ count = length(local.azs)
992
+ subnet_id = aws_subnet.private[count.index].id
993
+ route_table_id = aws_route_table.private[var.single_nat_gateway ? 0 : count.index].id
994
+ }
995
+ ```
996
+
997
+ ### RDS Configuration
998
+
999
+ ```hcl
1000
+ resource "aws_db_subnet_group" "main" {
1001
+ name = "${var.project_name}-db-subnet"
1002
+ subnet_ids = aws_subnet.private[*].id
1003
+
1004
+ tags = local.common_tags
1005
+ }
1006
+
1007
+ resource "aws_db_instance" "main" {
1008
+ identifier = "${var.project_name}-${var.environment}"
1009
+
1010
+ engine = "postgres"
1011
+ engine_version = "16.2"
1012
+ instance_class = var.environment == "prod" ? "db.r6g.large" : "db.t3.micro"
1013
+ allocated_storage = 20
1014
+ max_allocated_storage = var.environment == "prod" ? 200 : 50 # Upper limit for storage autoscaling (GiB)
1015
+
1016
+ db_name = var.db_name
1017
+ username = var.db_username
1018
+ password = var.db_password # Prefer managing this through Secrets Manager
1019
+
1020
+ db_subnet_group_name = aws_db_subnet_group.main.name
1021
+ vpc_security_group_ids = [aws_security_group.db.id]
1022
+
1023
+ multi_az = var.environment == "prod"
1024
+ storage_encrypted = true
1025
+ deletion_protection = var.environment == "prod"
1026
+ skip_final_snapshot = var.environment != "prod"
1027
+ final_snapshot_identifier = var.environment == "prod" ? "${var.project_name}-final-snapshot" : null
1028
+
1029
+ backup_retention_period = var.environment == "prod" ? 35 : 7
1030
+ backup_window = "03:00-04:00"
1031
+ maintenance_window = "Mon:04:00-Mon:05:00"
1032
+
1033
+ performance_insights_enabled = var.environment == "prod"
1034
+
1035
+ tags = local.common_tags
1036
+
1037
+ lifecycle {
1038
+ ignore_changes = [password] # The password is managed externally
1039
+ }
1040
+ }
1041
+ ```
1042
+
1043
+ ### S3 Configuration
1044
+
1045
+ ```hcl
1046
+ resource "aws_s3_bucket" "assets" {
1047
+ bucket = "${var.project_name}-${var.environment}-assets"
1048
+
1049
+ tags = local.common_tags
1050
+ }
1051
+
1052
+ resource "aws_s3_bucket_versioning" "assets" {
1053
+ bucket = aws_s3_bucket.assets.id
1054
+ versioning_configuration {
1055
+ status = "Enabled"
1056
+ }
1057
+ }
1058
+
1059
+ resource "aws_s3_bucket_server_side_encryption_configuration" "assets" {
1060
+ bucket = aws_s3_bucket.assets.id
1061
+ rule {
1062
+ apply_server_side_encryption_by_default {
1063
+ sse_algorithm = "AES256"
1064
+ }
1065
+ bucket_key_enabled = true # S3 Bucket Keys only apply to SSE-KMS; this has no effect with AES256
1066
+ }
1067
+ }
1068
+
1069
+ resource "aws_s3_bucket_public_access_block" "assets" {
1070
+ bucket = aws_s3_bucket.assets.id
1071
+
1072
+ block_public_acls = true
1073
+ block_public_policy = true
1074
+ ignore_public_acls = true
1075
+ restrict_public_buckets = true
1076
+ }
1077
+
1078
+ resource "aws_s3_bucket_lifecycle_configuration" "assets" {
1079
+ bucket = aws_s3_bucket.assets.id
1080
+
1081
+ rule {
1082
+ id = "transition-to-ia"
1083
+ status = "Enabled"
1084
+
1085
+ transition {
1086
+ days = 90
1087
+ storage_class = "STANDARD_IA"
1088
+ }
1089
+
1090
+ transition {
1091
+ days = 180
1092
+ storage_class = "GLACIER"
1093
+ }
1094
+
1095
+ noncurrent_version_expiration {
1096
+ noncurrent_days = 90
1097
+ }
1098
+ }
1099
+ }
1100
+ ```
1101
+
1102
+ ### IAM Configuration
1103
+
1104
+ ```hcl
1105
+ # ECS Task Role
1106
+ resource "aws_iam_role" "ecs_task" {
1107
+ name = "${var.project_name}-${var.environment}-ecs-task"
1108
+
1109
+ assume_role_policy = jsonencode({
1110
+ Version = "2012-10-17"
1111
+ Statement = [
1112
+ {
1113
+ Action = "sts:AssumeRole"
1114
+ Effect = "Allow"
1115
+ Principal = {
1116
+ Service = "ecs-tasks.amazonaws.com"
1117
+ }
1118
+ }
1119
+ ]
1120
+ })
1121
+
1122
+ tags = local.common_tags
1123
+ }
1124
+
1125
+ # Least-privilege policy
1126
+ resource "aws_iam_role_policy" "ecs_task" {
1127
+ name = "${var.project_name}-ecs-task-policy"
1128
+ role = aws_iam_role.ecs_task.id
1129
+
1130
+ policy = jsonencode({
1131
+ Version = "2012-10-17"
1132
+ Statement = [
1133
+ {
1134
+ Effect = "Allow"
1135
+ Action = [
1136
+ "s3:GetObject",
1137
+ "s3:PutObject",
1138
+ "s3:ListBucket"
1139
+ ]
1140
+ Resource = [
1141
+ aws_s3_bucket.assets.arn,
1142
+ "${aws_s3_bucket.assets.arn}/*"
1143
+ ]
1144
+ },
1145
+ {
1146
+ Effect = "Allow"
1147
+ Action = [
1148
+ "secretsmanager:GetSecretValue"
1149
+ ]
1150
+ Resource = [
1151
+ "arn:aws:secretsmanager:${var.aws_region}:${data.aws_caller_identity.current.account_id}:secret:${var.project_name}/*"
1152
+ ]
1153
+ }
1154
+ ]
1155
+ })
1156
+ }
1157
+ ```
1158
+
1159
+ ### ALB Configuration
1160
+
1161
+ ```hcl
1162
+ resource "aws_lb" "main" {
1163
+ name = "${var.project_name}-${var.environment}-alb"
1164
+ internal = false
1165
+ load_balancer_type = "application"
1166
+ security_groups = [aws_security_group.alb.id]
1167
+ subnets = aws_subnet.public[*].id
1168
+
1169
+ enable_deletion_protection = var.environment == "prod"
1170
+
1171
+ access_logs {
1172
+ bucket = aws_s3_bucket.alb_logs.id
1173
+ prefix = "alb"
1174
+ enabled = true
1175
+ }
1176
+
1177
+ tags = local.common_tags
1178
+ }
1179
+
1180
+ resource "aws_lb_listener" "https" {
1181
+ load_balancer_arn = aws_lb.main.arn
1182
+ port = 443
1183
+ protocol = "HTTPS"
1184
+ ssl_policy = "ELBSecurityPolicy-TLS13-1-2-2021-06"
1185
+ certificate_arn = var.acm_certificate_arn
1186
+
1187
+ default_action {
1188
+ type = "forward"
1189
+ target_group_arn = aws_lb_target_group.app.arn
1190
+ }
1191
+ }
1192
+
1193
+ resource "aws_lb_listener" "http_redirect" {
1194
+ load_balancer_arn = aws_lb.main.arn
1195
+ port = 80
1196
+ protocol = "HTTP"
1197
+
1198
+ default_action {
1199
+ type = "redirect"
1200
+ redirect {
1201
+ port = "443"
1202
+ protocol = "HTTPS"
1203
+ status_code = "HTTP_301"
1204
+ }
1205
+ }
1206
+ }
1207
+
1208
+ resource "aws_lb_target_group" "app" {
1209
+ name = "${var.project_name}-${var.environment}-tg"
1210
+ port = 8080
1211
+ protocol = "HTTP"
1212
+ vpc_id = aws_vpc.main.id
1213
+ target_type = "ip"
1214
+
1215
+ health_check {
1216
+ enabled = true
1217
+ healthy_threshold = 3
1218
+ unhealthy_threshold = 3
1219
+ timeout = 5
1220
+ interval = 30
1221
+ path = "/health"
1222
+ matcher = "200"
1223
+ }
1224
+
1225
+ deregistration_delay = 60
1226
+
1227
+ tags = local.common_tags
1228
+ }
1229
+ ```
1230
+
1231
+ ### EKS Configuration
1232
+
1233
+ ```hcl
1234
+ module "eks" {
1235
+ source = "terraform-aws-modules/eks/aws"
1236
+ version = "~> 20.8"
1237
+
1238
+ cluster_name = "${var.project_name}-${var.environment}"
1239
+ cluster_version = "1.29"
1240
+
1241
+ vpc_id = module.networking.vpc_id
1242
+ subnet_ids = module.networking.private_subnet_ids
1243
+
1244
+ # Cluster endpoint access control
1245
+ cluster_endpoint_public_access = true
1246
+ cluster_endpoint_private_access = true
1247
+ cluster_endpoint_public_access_cidrs = var.environment == "prod" ? var.allowed_cidrs : ["0.0.0.0/0"]
1248
+
1249
+ # Cluster add-ons
1250
+ cluster_addons = {
1251
+ coredns = {
1252
+ most_recent = true
1253
+ }
1254
+ kube-proxy = {
1255
+ most_recent = true
1256
+ }
1257
+ vpc-cni = {
1258
+ most_recent = true
1259
+ service_account_role_arn = module.vpc_cni_irsa.iam_role_arn
1260
+ }
1261
+ }
1262
+
1263
+ # Managed node groups
1264
+ eks_managed_node_groups = {
1265
+ general = {
1266
+ instance_types = ["t3.large"]
1267
+ capacity_type = var.environment == "prod" ? "ON_DEMAND" : "SPOT"
1268
+ min_size = var.environment == "prod" ? 3 : 1
1269
+ max_size = var.environment == "prod" ? 10 : 3
1270
+ desired_size = var.environment == "prod" ? 3 : 1
1271
+
1272
+ labels = {
1273
+ Environment = var.environment
1274
+ NodeGroup = "general"
1275
+ }
1276
+
1277
+ tags = local.common_tags
1278
+ }
1279
+ }
1280
+
1281
+ tags = local.common_tags
1282
+ }
1283
+ ```
1284
+
1285
+ ---
1286
+
1287
+ ## Security
1288
+
1289
+ ### Secrets Management
1290
+
1291
+ ```hcl
1292
+ # Never hard-code passwords in code; use AWS Secrets Manager
1293
+ resource "aws_secretsmanager_secret" "db_password" {
1294
+ name = "${var.project_name}/${var.environment}/db-password"
1295
+ recovery_window_in_days = var.environment == "prod" ? 30 : 0
1296
+ tags = local.common_tags
1297
+ }
1298
+
1299
+ resource "aws_secretsmanager_secret_version" "db_password" {
1300
+ secret_id = aws_secretsmanager_secret.db_password.id
1301
+ secret_string = random_password.db.result
1302
+ }
1303
+
1304
+ resource "random_password" "db" {
1305
+ length = 32
1306
+ special = true
1307
+ override_special = "!#$%^&*()-_=+"
1308
+ }
1309
+
1310
+ # Reference it from RDS
1311
+ resource "aws_db_instance" "main" {
1312
+ # ...
1313
+ password = random_password.db.result
1314
+ # ...
1315
+ }
1316
+ ```
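Note that `random_password.db.result` is still written to the Terraform state in plaintext. One alternative is to let RDS create and rotate the master password in Secrets Manager itself, so Terraform never handles the value. A minimal sketch, assuming an AWS provider version that supports `manage_master_user_password` (the resource name `managed_secret` and the sizing values are illustrative):

```hcl
resource "aws_db_instance" "managed_secret" {
  identifier        = "${var.project_name}-${var.environment}-managed"
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20

  username = var.db_username
  # RDS generates the password and stores it in Secrets Manager;
  # do not set the password argument together with this option.
  manage_master_user_password = true

  storage_encrypted   = true
  skip_final_snapshot = true
}

output "db_master_secret_arn" {
  description = "ARN of the Secrets Manager secret that RDS manages"
  value       = aws_db_instance.managed_secret.master_user_secret[0].secret_arn
}
```

Applications then read the credential from the secret ARN at runtime, which fits the `secretsmanager:GetSecretValue` permission granted in the IAM example.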
1317
+
1318
+ ### Sensitive Variables
1319
+
1320
+ ```hcl
1321
+ variable "db_password" {
1322
+ description = "Database master password"
1323
+ type = string
1324
+ sensitive = true # Redacted in plan / apply output; the value is still stored in plaintext in state
1325
+ }
1326
+
1327
+ output "db_connection_string" {
1328
+ value = "postgresql://${var.db_username}:${var.db_password}@${aws_db_instance.main.endpoint}/${var.db_name}"
1329
+ sensitive = true
1330
+ }
1331
+ ```
1332
+
1333
+ ### Vault Integration
1334
+
1335
+ ```hcl
1336
+ provider "vault" {
1337
+ address = var.vault_address
1338
+ # Authenticate via the VAULT_TOKEN environment variable or AppRole
1339
+ }
1340
+
1341
+ data "vault_generic_secret" "db" {
1342
+ path = "secret/data/${var.environment}/database"
1343
+ }
1344
+
1345
+ resource "aws_db_instance" "main" {
1346
+ # ...
1347
+ username = data.vault_generic_secret.db.data["username"]
1348
+ password = data.vault_generic_secret.db.data["password"]
1349
+ }
1350
+ ```
1351
+
1352
+ ### Policy as Code
1353
+
1354
+ #### Sentinel (Terraform Cloud / Enterprise)
1355
+
1356
+ ```python
1357
+ # sentinel/enforce-encryption.sentinel
1358
+ import "tfplan/v2" as tfplan
1359
+
1360
+ # Require encryption on S3 buckets
1361
+ s3_buckets = filter tfplan.resource_changes as _, rc {
1362
+ rc.type is "aws_s3_bucket" and
1363
+ rc.mode is "managed" and
1364
+ (rc.change.actions contains "create" or rc.change.actions contains "update")
1365
+ }
1366
+
1367
+ encryption_enforced = rule {
1368
+ all s3_buckets as _, bucket {
1369
+ bucket.change.after.server_side_encryption_configuration is not null
1370
+ }
1371
+ }
1372
+
1373
+ main = rule {
1374
+ encryption_enforced
1375
+ }
1376
+ ```
1377
+
1378
+ #### OPA (Open Policy Agent)
1379
+
1380
+ ```rego
1381
+ # policy/enforce_tags.rego
1382
+ package terraform.analysis
1383
+
1384
+ import input as tfplan
1385
+
1386
+ # Every resource must carry the Environment and ManagedBy tags
1387
+ required_tags := ["Environment", "ManagedBy"]
1388
+
1389
+ deny[msg] {
1390
+ resource := tfplan.resource_changes[_]
1391
+ resource.change.actions[_] == "create"
1392
+ tags := object.get(resource.change.after, "tags", {}) # a missing tags attribute counts as no tags
1393
+ required_tag := required_tags[_]
1394
+ not tags[required_tag]
1395
+ msg := sprintf("Resource '%s' is missing required tag '%s'", [resource.address, required_tag])
1396
+ }
1397
+
1398
+ # Forbid oversized instance types
1399
+ deny[msg] {
1400
+ resource := tfplan.resource_changes[_]
1401
+ resource.type == "aws_instance"
1402
+ resource.change.actions[_] == "create"
1403
+ instance_type := resource.change.after.instance_type
1404
+ startswith(instance_type, "x1")
1405
+ msg := sprintf("Instance type '%s' is not allowed (too expensive). Use t3/m6i family.", [instance_type])
1406
+ }
1407
+ ```
1408
+
1409
+ ### RBAC (Terraform Cloud)
1410
+
1411
+ ```hcl
1412
+ # Team permission configuration
1413
+ resource "tfe_team" "developers" {
1414
+ name = "developers"
1415
+ organization = var.tfc_organization
1416
+ }
1417
+
1418
+ resource "tfe_team_access" "developers_dev" {
1419
+ access = "write" # plan + apply
1420
+ team_id = tfe_team.developers.id
1421
+ workspace_id = tfe_workspace.dev.id
1422
+ }
1423
+
1424
+ resource "tfe_team_access" "developers_prod" {
1425
+ access = "plan" # plan only, no apply
1426
+ team_id = tfe_team.developers.id
1427
+ workspace_id = tfe_workspace.prod.id
1428
+ }
1429
+ ```
1430
+
1431
+ ---
1432
+
1433
+ ## CI/CD integration
1434
+
1435
+ ### GitHub Actions
1436
+
1437
+ ```yaml
+ # .github/workflows/terraform.yml
+ name: Terraform
+
+ on:
+   pull_request:
+     paths:
+       - 'infrastructure/**'
+   push:
+     branches: [main]
+     paths:
+       - 'infrastructure/**'
+
+ permissions:
+   id-token: write # OIDC
+   contents: read
+   pull-requests: write
+
+ jobs:
+   plan:
+     name: Terraform Plan
+     runs-on: ubuntu-latest
+     if: github.event_name == 'pull_request'
+     defaults:
+       run:
+         working-directory: infrastructure/
+
+     steps:
+       - uses: actions/checkout@v4
+
+       - uses: aws-actions/configure-aws-credentials@v4
+         with:
+           role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
+           aws-region: ap-northeast-1
+
+       - uses: hashicorp/setup-terraform@v3
+         with:
+           terraform_version: 1.7.4
+
+       - name: Terraform Init
+         run: terraform init -no-color
+
+       - name: Terraform Format Check
+         run: terraform fmt -check -recursive
+
+       - name: Terraform Validate
+         run: terraform validate -no-color
+
+       - name: Terraform Plan
+         id: plan
+         run: terraform plan -no-color -out=tfplan
+         continue-on-error: true
+
+       - name: Comment PR
+         uses: actions/github-script@v7
+         env:
+           PLAN: ${{ steps.plan.outputs.stdout }}
+         with:
+           script: |
+             // The plan text is read from the environment so that backticks or
+             // "${...}" inside it cannot break this template literal.
+             const output = `#### Terraform Plan 📋
+             \`\`\`
+             ${process.env.PLAN}
+             \`\`\`
+             *Pushed by: @${{ github.actor }}*`;
+
+             await github.rest.issues.createComment({
+               issue_number: context.issue.number,
+               owner: context.repo.owner,
+               repo: context.repo.repo,
+               body: output
+             });
+
+       - name: Plan Status
+         if: steps.plan.outcome == 'failure'
+         run: exit 1
+
+   apply:
+     name: Terraform Apply
+     runs-on: ubuntu-latest
+     if: github.ref == 'refs/heads/main' && github.event_name == 'push'
+     # Manual approval happens only if protection rules are configured on this environment
+     environment: production
+     defaults:
+       run:
+         working-directory: infrastructure/
+
+     steps:
+       - uses: actions/checkout@v4
+
+       - uses: aws-actions/configure-aws-credentials@v4
+         with:
+           role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
+           aws-region: ap-northeast-1
+
+       - uses: hashicorp/setup-terraform@v3
+         with:
+           terraform_version: 1.7.4
+
+       - name: Terraform Init
+         run: terraform init -no-color
+
+       - name: Terraform Apply
+         run: terraform apply -auto-approve -no-color
+ ```
1538
+
1539
+ ### GitLab CI
1540
+
1541
+ ```yaml
+ # .gitlab-ci.yml
+ stages:
+   - validate
+   - plan
+   - apply
+
+ variables:
+   TF_ROOT: infrastructure
+   TF_STATE_NAME: default
+
+ .terraform_base:
+   image:
+     name: hashicorp/terraform:1.7.4
+     entrypoint: [""] # the image's entrypoint is the terraform binary; clear it so job scripts run in a shell
+   before_script:
+     - cd ${TF_ROOT}
+     - terraform init -no-color
+
+ validate:
+   extends: .terraform_base
+   stage: validate
+   script:
+     - terraform fmt -check -recursive
+     - terraform validate -no-color
+   rules:
+     - changes:
+         - infrastructure/**/*
+
+ plan:
+   extends: .terraform_base
+   stage: plan
+   script:
+     - terraform plan -no-color -out=tfplan
+   artifacts:
+     paths:
+       - ${TF_ROOT}/tfplan
+     expire_in: 1 week
+   rules:
+     - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
+       changes:
+         - infrastructure/**/*
+     # Also plan on main so the apply job has a saved plan to consume
+     - if: '$CI_COMMIT_BRANCH == "main"'
+       changes:
+         - infrastructure/**/*
+
+ apply:
+   extends: .terraform_base
+   stage: apply
+   script:
+     - terraform apply -auto-approve -no-color tfplan
+   dependencies:
+     - plan
+   rules:
+     - if: '$CI_COMMIT_BRANCH == "main"'
+       changes:
+         - infrastructure/**/*
+       when: manual
+   environment:
+     name: production
+ ```
1597
+
1598
+ ### Atlantis
1599
+
1600
+ ```yaml
+ # atlantis.yaml
+ version: 3
+ projects:
+   - name: vpc
+     dir: infrastructure/vpc
+     workspace: default
+     terraform_version: v1.7.4
+     autoplan:
+       when_modified:
+         - "*.tf"
+         - "../modules/vpc/**"
+       enabled: true
+     apply_requirements:
+       - approved  # the PR must be approved
+       - mergeable # the PR must be mergeable
+
+   - name: app
+     dir: infrastructure/app
+     workspace: default
+     terraform_version: v1.7.4
+     autoplan:
+       when_modified:
+         - "*.tf"
+         - "../modules/ecs/**"
+       enabled: true
+     apply_requirements:
+       - approved
+       - mergeable
+ ```
1630
+
1631
+ ### Terraform Cloud
1632
+
1633
+ ```hcl
1634
+ # Use Terraform Cloud as the backend
1635
+ terraform {
1636
+ cloud {
1637
+ organization = "mycompany"
1638
+
1639
+ workspaces {
1640
+ tags = ["app:myproject", "env:prod"]
1641
+ }
1642
+ }
1643
+ }
1644
+ ```
1645
+
1646
+ ---
1647
+
1648
+ ## Team collaboration
+
+ ### Recommended directory layout
1651
+
1652
+ ```
+ infrastructure/
+ ├── environments/
+ │   ├── dev/
+ │   │   ├── main.tf            # calls modules with dev parameters
+ │   │   ├── variables.tf
+ │   │   ├── outputs.tf
+ │   │   ├── terraform.tfvars   # dev variable values
+ │   │   └── backend.tf         # dev state configuration
+ │   ├── staging/
+ │   │   ├── main.tf
+ │   │   ├── variables.tf
+ │   │   ├── outputs.tf
+ │   │   ├── terraform.tfvars
+ │   │   └── backend.tf
+ │   └── prod/
+ │       ├── main.tf
+ │       ├── variables.tf
+ │       ├── outputs.tf
+ │       ├── terraform.tfvars
+ │       └── backend.tf
+ ├── modules/                   # reusable modules
+ │   ├── vpc/
+ │   ├── ecs/
+ │   ├── rds/
+ │   ├── s3/
+ │   └── iam/
+ ├── policies/                  # OPA / Sentinel policies
+ │   ├── enforce_tags.rego
+ │   └── enforce_encryption.rego
+ └── scripts/
+     ├── bootstrap.sh           # creates the S3 backend + DynamoDB lock table
+     └── import.sh              # bulk-imports existing resources
+ ```
1686
+
1687
+ ### Naming conventions
1688
+
1689
+ ```hcl
1690
+ # Resource naming: <project>-<environment>-<component>-<qualifier>
+ # e.g. myapp-prod-web-sg, myapp-dev-db-primary
+
+ # Terraform resource names (in-code identifiers): underscores, clear meaning
+ resource "aws_security_group" "web_ingress" { } # good
+ resource "aws_security_group" "sg1" { } # bad: carries no meaning
+ resource "aws_security_group" "web-ingress-sg" { } # bad: HCL identifiers use underscores
+
+ # Variable names: underscores, noun phrases
+ variable "vpc_cidr" { }
+ variable "enable_nat_gateway" { } # enable_ prefix for booleans
+ variable "instance_count" { } # _count suffix for quantities
+
+ # Output names: underscores, describing the exported value
+ output "vpc_id" { }
+ output "private_subnet_ids" { }
+ output "database_endpoint" { }
+
+ # Module names: underscores
+ module "networking" { }
+ module "application_cluster" { }
1711
+ ```
1712
+
1713
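+ ### Code review checklist
+
+ 1. **Plan output review**: every PR must include its `terraform plan` output
+ 2. **Destructive changes**: watch for `destroy` and `replace` actions, especially on databases and persistent storage
+ 3. **Security review**: check that security group rules and IAM policies follow least privilege
+ 4. **Tag completeness**: every resource must carry the Environment, ManagedBy, and Project tags
+ 5. **Hardcoded values**: no hardcoded IP addresses, passwords, or account IDs
+ 6. **State impact**: check whether a resource address change (rename or move into a module) would force a rebuild
+ 7. **Provider versions**: confirm the version constraints are reasonable and will not pull in breaking updates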
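+
+ For the tag-completeness item, one way to lower the review burden is to set the tags once at the provider level. The following is a minimal sketch, assuming the AWS provider's `default_tags` block and that `var.aws_region`, `var.environment`, and `var.project_name` are declared elsewhere; the `"terraform"` value is only an illustration.
+
+ ```hcl
+ # Sketch only: the tag values here are assumptions, not taken from the original guide.
+ provider "aws" {
+   region = var.aws_region
+
+   # Tags merged into every taggable resource created through this provider
+   default_tags {
+     tags = {
+       Environment = var.environment
+       ManagedBy   = "terraform"
+       Project     = var.project_name
+     }
+   }
+ }
+ ```
+
+ Tags applied this way show up in each resource's `tags_all` attribute rather than `tags`, so a policy that inspects only `tags` would likely need to check `tags_all` instead.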
1722
+
1723
+ ### Change management
+
+ ```bash
+ # A safe change workflow
+ # 1. Create a feature branch
+ git checkout -b feat/add-redis-cluster
+
+ # 2. Edit the configuration
+ # 3. Validate locally
+ terraform fmt -check -recursive
+ terraform validate
+ terraform plan -out=tfplan
+
+ # 4. Review the plan output
+ # 5. Open a PR with the plan output attached
+ # 6. Code review + plan review
+ # 7. Merge to main, which triggers apply automatically (or after manual approval)
+
+ # Rolling back an emergency change
+ # Option 1: git revert + apply
+ git revert HEAD
+ terraform apply
+
+ # Option 2: restore an earlier state version (S3 versioning)
+ # Note: this rewinds only Terraform's record, not the real infrastructure
+ aws s3api list-object-versions --bucket mycompany-terraform-state --prefix prod/app/terraform.tfstate
+ aws s3api get-object --bucket mycompany-terraform-state --key prod/app/terraform.tfstate --version-id <version-id> terraform.tfstate.restore
+ # state push rejects a file whose serial is lower than the remote's; raise "serial" in the restored file first
+ terraform state push terraform.tfstate.restore
1750
+ ```
1751
+
1752
+ ---
1753
+
1754
+ ## Performance optimization
+
+ ### Controlling parallelism
+
+ ```bash
+ # The default parallelism is 10; adjust it to fit API limits
+ terraform apply -parallelism=20 # raise it (many resources)
+ terraform apply -parallelism=3 # lower it (API throttling)
1762
+ ```
1763
+
1764
+ ### Targeted Apply
1765
+
1766
+ ```bash
1767
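+ # Operate on specific resources only (for debugging or emergency fixes)
+ terraform plan -target=module.networking
+ terraform apply -target=aws_instance.web
+
+ # Note: a targeted apply covers the target and its dependencies but ignores everything else,
+ # so use it only as a temporary measure and run a full terraform plan afterwards to confirm nothing is out of sync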
1773
+ ```
1774
+
1775
+ ### Module splitting strategy
1776
+
1777
+ ```
+ # Split state by lifecycle to narrow the scope of plan/apply
+
+ infrastructure/
+ ├── networking/      # low change frequency: VPC/Subnet/NAT
+ │   └── terraform.tfstate
+ ├── data-stores/     # low change frequency: RDS/ElastiCache/S3
+ │   └── terraform.tfstate
+ ├── compute/         # high change frequency: ECS/EC2/ASG
+ │   └── terraform.tfstate
+ ├── monitoring/      # medium change frequency: CloudWatch/Alarms
+ │   └── terraform.tfstate
+ └── dns/             # low change frequency: Route53/ACM
+     └── terraform.tfstate
+ ```
1792
+
1793
+ To reference values across states, use the `terraform_remote_state` data source:
1794
+
1795
+ ```hcl
1796
+ # compute/main.tf: reads the outputs of the networking state
1797
+ data "terraform_remote_state" "networking" {
1798
+ backend = "s3"
1799
+ config = {
1800
+ bucket = "mycompany-terraform-state"
1801
+ key = "prod/networking/terraform.tfstate"
1802
+ region = "ap-northeast-1"
1803
+ }
1804
+ }
1805
+
1806
+ resource "aws_instance" "web" {
1807
+ subnet_id = data.terraform_remote_state.networking.outputs.private_subnet_ids[0]
1808
+ }
1809
+ ```
1810
+
1811
+ ### Optimizing large deployments
+
+ ```bash
+ # Skip the refresh step with -refresh=false (only when state is known to match reality)
+ terraform plan -refresh=false
+
+ # Save the plan to a file, then inspect it with terraform show
+ terraform plan -out=tfplan
+ terraform show -json tfplan | jq '.resource_changes | length'
+
+ # Provider mirror (speeds up init)
+ # Contents of the CLI configuration file ~/.terraformrc (HCL, not shell):
1823
+ provider_installation {
1824
+ filesystem_mirror {
1825
+ path = "/usr/share/terraform/providers"
1826
+ include = ["registry.terraform.io/hashicorp/*"]
1827
+ }
1828
+ direct {
1829
+ exclude = ["registry.terraform.io/hashicorp/*"]
1830
+ }
1831
+ }
1832
+ ```
1833
+
1834
+ ---
1835
+
1836
+ ## Common pitfalls and antipatterns
+
+ ### 1. State drift
+
+ **Problem**: a resource was changed by hand in the console, so the Terraform state no longer matches reality.
+
+ ```bash
+ # Detect drift
+ terraform plan -refresh-only
+
+ # Sync the state to reality (does not modify resources)
+ terraform apply -refresh-only
+
+ # Prevention: forbid manual changes; route every change through Terraform
+ # Prevention: run drift detection on a schedule (built into Terraform Cloud)
1851
+ ```
1852
+
1853
+ ### 2. Circular dependencies
+
+ **Problem**: resource A depends on an attribute of B, and B in turn depends on an attribute of A.
+
+ ```hcl
+ # Wrong: security groups that reference each other
+ resource "aws_security_group" "a" {
+   ingress {
+     security_groups = [aws_security_group.b.id] # A depends on B
+   }
+ }
+
+ resource "aws_security_group" "b" {
+   ingress {
+     security_groups = [aws_security_group.a.id] # B depends on A: a cycle
+   }
+ }
+
+ # Right: break the cycle with standalone rule resources
1872
+ resource "aws_security_group" "a" {
1873
+ name = "sg-a"
1874
+ vpc_id = aws_vpc.main.id
1875
+ }
1876
+
1877
+ resource "aws_security_group" "b" {
1878
+ name = "sg-b"
1879
+ vpc_id = aws_vpc.main.id
1880
+ }
1881
+
1882
+ resource "aws_security_group_rule" "a_from_b" {
1883
+ type = "ingress"
1884
+ security_group_id = aws_security_group.a.id
1885
+ source_security_group_id = aws_security_group.b.id
1886
+ from_port = 443
1887
+ to_port = 443
1888
+ protocol = "tcp"
1889
+ }
1890
+
1891
+ resource "aws_security_group_rule" "b_from_a" {
1892
+ type = "ingress"
1893
+ security_group_id = aws_security_group.b.id
1894
+ source_security_group_id = aws_security_group.a.id
1895
+ from_port = 8080
1896
+ to_port = 8080
1897
+ protocol = "tcp"
1898
+ }
1899
+ ```
1900
+
1901
+ ### 3. Provider version pinning
+
+ **Problem**: without pinned versions, team members run different provider versions and get inconsistent plans.
+
+ ```hcl
+ # Wrong: no version constraint
1907
+ terraform {
1908
+ required_providers {
1909
+ aws = {
1910
+ source = "hashicorp/aws"
1911
+ }
1912
+ }
1913
+ }
1914
+
1915
+ # Right: explicit version constraint + committed .terraform.lock.hcl
1916
+ terraform {
1917
+ required_providers {
1918
+ aws = {
1919
+ source = "hashicorp/aws"
1920
+ version = "~> 5.40"
1921
+ }
1922
+ }
1923
+ }
1924
+
1925
+ # .terraform.lock.hcl must be committed to Git
+ # To update the lock file: terraform init -upgrade
1927
+ ```
1928
+
1929
+ ### 4. The count vs for_each trap
+
+ ```hcl
+ # Wrong: count over a list; removing a middle element rebuilds every resource after it
1933
+ variable "subnets" {
1934
+ default = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
1935
+ }
1936
+
1937
+ resource "aws_subnet" "bad" {
1938
+ count = length(var.subnets)
1939
+ cidr_block = var.subnets[count.index] # remove the 2nd item and the 3rd shifts to index 1, forcing a rebuild
1940
+ }
1941
+
1942
+ # Right: for_each over a map/set; keys are stable and independent of order
1943
+ variable "subnets" {
1944
+ default = {
1945
+ az-a = "10.0.1.0/24"
1946
+ az-b = "10.0.2.0/24"
1947
+ az-c = "10.0.3.0/24"
1948
+ }
1949
+ }
1950
+
1951
+ resource "aws_subnet" "good" {
1952
+ for_each = var.subnets
1953
+ cidr_block = each.value # removing az-b affects only az-b; az-a and az-c are untouched
1954
+ }
1955
+ ```
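+
+ When the input has to stay a list, the same key stability can usually be obtained by converting it to a set. A minimal sketch, assuming a hypothetical `subnet_cidrs` variable and an `aws_vpc.main` resource defined elsewhere:
+
+ ```hcl
+ # Sketch only: variable and resource names are illustrative.
+ variable "subnet_cidrs" {
+   type    = list(string)
+   default = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
+ }
+
+ resource "aws_subnet" "by_cidr" {
+   # toset() makes each CIDR its own instance key, so removing one entry
+   # does not shift the addresses of the others
+   for_each   = toset(var.subnet_cidrs)
+   vpc_id     = aws_vpc.main.id
+   cidr_block = each.value # for a set, each.key and each.value are the same
+ }
+ ```
+
+ The trade-off is that the CIDR itself becomes the resource key, so changing a CIDR replaces that subnet; a map with descriptive keys avoids tying the address to the value.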
1956
+
1957
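+ ### 5. Oversized state files
+
+ **Problem**: with every resource in a single state, plan/apply becomes very slow and the blast radius is large.
+
+ **Fix**: split by lifecycle and change frequency (see the performance optimization section). Around 50 to 200 resources per state is a reasonable target.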
1962
+
1963
+ ### 6. Hardcoded provider configuration
+
+ ```hcl
+ # Wrong: hardcoded region and account ID
1967
+ provider "aws" {
1968
+ region = "ap-northeast-1"
1969
+ }
1970
+
1971
+ resource "aws_iam_role" "bad" {
1972
+ assume_role_policy = jsonencode({
1973
+ Statement = [{
1974
+ Principal = { AWS = "arn:aws:iam::123456789012:root" } # hardcoded
1975
+ }]
1976
+ })
1977
+ }
1978
+
1979
+ # Right: parameterized
1980
+ provider "aws" {
1981
+ region = var.aws_region
1982
+ }
1983
+
1984
+ data "aws_caller_identity" "current" {}
1985
+
1986
+ resource "aws_iam_role" "good" {
1987
+ assume_role_policy = jsonencode({
1988
+ Statement = [{
1989
+ Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
1990
+ }]
1991
+ })
1992
+ }
1993
+ ```
1994
+
1995
+ ### 7. Neglecting lifecycle rules
+
+ ```hcl
+ # Stateful resources such as databases must set prevent_destroy
1999
+ resource "aws_db_instance" "main" {
2000
+ # ...
2001
+ lifecycle {
2002
+ prevent_destroy = true # guards against accidental destroy
2003
+ }
2004
+ }
2005
+
2006
+ # Use create_before_destroy for blue-green style replacements
2007
+ resource "aws_instance" "web" {
2008
+ # ...
2009
+ lifecycle {
2010
+ create_before_destroy = true # create the new instance before deleting the old one
2011
+ }
2012
+ }
2013
+ ```
2014
+
2015
+ ### 8. Leaking sensitive data
+
+ ```hcl
+ # Wrong: storing a password in terraform.tfvars and committing it to Git
+ # terraform.tfvars
+ db_password = "my-secret-password" # never do this
+
+ # Right: use an environment variable
+ # export TF_VAR_db_password="my-secret-password"
+
+ # Right: use Secrets Manager / Vault
2026
+ data "aws_secretsmanager_secret_version" "db" {
2027
+ secret_id = aws_secretsmanager_secret.db.id
2028
+ }
2029
+
2030
+ # .gitignore must include:
+ # *.tfstate
+ # *.tfstate.*
+ # *.tfvars (if it holds sensitive values)
2034
+ # .terraform/
2035
+ ```
2036
+
2037
+ ### 9. Refactoring without moved blocks
+
+ ```hcl
+ # Terraform 1.1+ can rename resources safely with moved blocks,
+ # avoiding destroy + create and keeping the existing resources
2042
+ moved {
2043
+ from = aws_instance.web
2044
+ to = aws_instance.application
2045
+ }
2046
+
2047
+ moved {
2048
+ from = aws_instance.app
2049
+ to = module.compute.aws_instance.app
2050
+ }
2051
+ ```
2052
+
2053
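+ ### 10. Ignoring the plan output
+
+ **Antipattern**: running `terraform apply -auto-approve` directly without reviewing the plan.
+
+ **Rules**:
+ - Development: review a local `terraform plan` before applying
+ - Production: CI generates the plan → the plan output is reviewed in the PR → the saved plan is applied after human approval
+ - Always pair `terraform plan -out=tfplan` with `terraform apply tfplan` so that what runs is exactly the plan that was reviewed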
2061
+
2062
+ ---
2063
+
2064
+ ## Agent Checklist
2065
+
2066
+ The checklist below is for AI agents to use when generating or reviewing Terraform code:
+
+ ### Basics
+ - [ ] `terraform fmt` passes
+ - [ ] `terraform validate` passes
+ - [ ] Every provider declares a version constraint
+ - [ ] `.terraform.lock.hcl` is committed to Git
+ - [ ] `.gitignore` covers `*.tfstate`, `.terraform/`, and sensitive `.tfvars` files
2074
+
2075
+ ### Security
+ - [ ] No hardcoded passwords, keys, or account IDs
+ - [ ] Sensitive variables are marked `sensitive = true`
+ - [ ] Passwords are managed through Secrets Manager / Vault
+ - [ ] S3 buckets have encryption enabled and public access blocked
+ - [ ] IAM policies follow least privilege
+ - [ ] Security group rules have no unnecessary `0.0.0.0/0` ingress
+ - [ ] RDS sets `storage_encrypted = true`
2083
+
2084
+ ### State management
+ - [ ] A remote backend is used (S3/GCS/Terraform Cloud)
+ - [ ] State locking is enabled (DynamoDB/native)
+ - [ ] The state bucket has versioning enabled
+ - [ ] State is split sensibly by lifecycle
2089
+
2090
+ ### Module design
+ - [ ] Modules have a complete `variables.tf` + `outputs.tf`
+ - [ ] Variables have a `description` and a `type`
+ - [ ] Key variables have a `validation` block
+ - [ ] `for_each` is used instead of `count` (except for plain numeric repetition)
+ - [ ] Module versions use a `~>` constraint
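+
+ As an illustration of the `description`, `type`, and `validation` items, a variable declaration might look like the sketch below. The allowed values are an assumption based on the dev/staging/prod environment directories shown earlier:
+
+ ```hcl
+ # Sketch only: the allowed values are assumed, adjust them to the project's environments.
+ variable "environment" {
+   description = "Deployment environment name"
+   type        = string
+
+   validation {
+     # Reject any value outside the known environments at plan time
+     condition     = contains(["dev", "staging", "prod"], var.environment)
+     error_message = "The environment must be one of: dev, staging, prod."
+   }
+ }
+ ```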
2096
+
2097
+ ### Production readiness
+ - [ ] Stateful resources such as databases set `prevent_destroy`
+ - [ ] Every resource carries the Environment / ManagedBy / Project tags
+ - [ ] Production RDS enables `multi_az` and a sufficient `backup_retention_period`
+ - [ ] ALB uses a TLS 1.3 security policy
+ - [ ] HTTP redirects to HTTPS automatically
+ - [ ] EKS node groups set a sensible `min_size` / `max_size`
+ - [ ] Production has one NAT Gateway per AZ
2105
+
2106
+ ### CI/CD
+ - [ ] PRs run `terraform plan` automatically
+ - [ ] The plan output is posted as a PR comment
+ - [ ] Production apply requires human approval
+ - [ ] Authentication uses OIDC rather than long-lived access keys
+ - [ ] The plan file is saved as a CI artifact
2112
+
2113
+ ### Change safety
+ - [ ] `destroy` / `replace` actions in the plan have been reviewed
+ - [ ] Resource renames use `moved` blocks
+ - [ ] `-auto-approve` is not used (except in automated pipelines)
+ - [ ] Large changes are rolled out in batches