dojo.md 0.2.2 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (196)
  1. package/courses/GENERATION_LOG.md +29 -0
  2. package/courses/api-documentation-writing/course.yaml +12 -0
  3. package/courses/api-documentation-writing/scenarios/level-1/authentication-basics.yaml +46 -0
  4. package/courses/api-documentation-writing/scenarios/level-1/data-types-formats.yaml +45 -0
  5. package/courses/api-documentation-writing/scenarios/level-1/endpoint-description.yaml +45 -0
  6. package/courses/api-documentation-writing/scenarios/level-1/error-documentation.yaml +45 -0
  7. package/courses/api-documentation-writing/scenarios/level-1/first-documentation-shift.yaml +47 -0
  8. package/courses/api-documentation-writing/scenarios/level-1/getting-started-guide.yaml +42 -0
  9. package/courses/api-documentation-writing/scenarios/level-1/pagination-docs.yaml +51 -0
  10. package/courses/api-documentation-writing/scenarios/level-1/request-parameters.yaml +46 -0
  11. package/courses/api-documentation-writing/scenarios/level-1/request-response-examples.yaml +48 -0
  12. package/courses/api-documentation-writing/scenarios/level-1/status-codes.yaml +45 -0
  13. package/courses/api-documentation-writing/scenarios/level-2/error-patterns.yaml +48 -0
  14. package/courses/api-documentation-writing/scenarios/level-2/intermediate-documentation-shift.yaml +48 -0
  15. package/courses/api-documentation-writing/scenarios/level-2/oauth-documentation.yaml +47 -0
  16. package/courses/api-documentation-writing/scenarios/level-2/openapi-specification.yaml +46 -0
  17. package/courses/api-documentation-writing/scenarios/level-2/rate-limiting-docs.yaml +45 -0
  18. package/courses/api-documentation-writing/scenarios/level-2/request-body-schemas.yaml +46 -0
  19. package/courses/api-documentation-writing/scenarios/level-2/schema-definitions.yaml +41 -0
  20. package/courses/api-documentation-writing/scenarios/level-2/swagger-redoc-rendering.yaml +43 -0
  21. package/courses/api-documentation-writing/scenarios/level-2/validation-documentation.yaml +47 -0
  22. package/courses/api-documentation-writing/scenarios/level-2/versioning-changelog.yaml +42 -0
  23. package/courses/api-documentation-writing/scenarios/level-3/advanced-documentation-shift.yaml +43 -0
  24. package/courses/api-documentation-writing/scenarios/level-3/api-style-guide.yaml +40 -0
  25. package/courses/api-documentation-writing/scenarios/level-3/code-samples-multilang.yaml +40 -0
  26. package/courses/api-documentation-writing/scenarios/level-3/content-architecture.yaml +47 -0
  27. package/courses/api-documentation-writing/scenarios/level-3/deprecation-communication.yaml +44 -0
  28. package/courses/api-documentation-writing/scenarios/level-3/interactive-api-explorer.yaml +42 -0
  29. package/courses/api-documentation-writing/scenarios/level-3/migration-guides.yaml +42 -0
  30. package/courses/api-documentation-writing/scenarios/level-3/sdk-documentation.yaml +40 -0
  31. package/courses/api-documentation-writing/scenarios/level-3/webhook-documentation.yaml +48 -0
  32. package/courses/api-documentation-writing/scenarios/level-3/websocket-sse-docs.yaml +47 -0
  33. package/courses/api-documentation-writing/scenarios/level-4/api-changelog-management.yaml +44 -0
  34. package/courses/api-documentation-writing/scenarios/level-4/api-governance-standards.yaml +41 -0
  35. package/courses/api-documentation-writing/scenarios/level-4/api-product-strategy.yaml +41 -0
  36. package/courses/api-documentation-writing/scenarios/level-4/developer-portal-design.yaml +48 -0
  37. package/courses/api-documentation-writing/scenarios/level-4/docs-as-code.yaml +41 -0
  38. package/courses/api-documentation-writing/scenarios/level-4/documentation-localization.yaml +46 -0
  39. package/courses/api-documentation-writing/scenarios/level-4/documentation-metrics.yaml +45 -0
  40. package/courses/api-documentation-writing/scenarios/level-4/documentation-testing.yaml +41 -0
  41. package/courses/api-documentation-writing/scenarios/level-4/expert-documentation-shift.yaml +45 -0
  42. package/courses/api-documentation-writing/scenarios/level-4/multi-audience-docs.yaml +46 -0
  43. package/courses/api-documentation-writing/scenarios/level-5/ai-powered-documentation.yaml +44 -0
  44. package/courses/api-documentation-writing/scenarios/level-5/api-first-documentation.yaml +45 -0
  45. package/courses/api-documentation-writing/scenarios/level-5/api-marketplace-docs.yaml +42 -0
  46. package/courses/api-documentation-writing/scenarios/level-5/board-api-strategy.yaml +48 -0
  47. package/courses/api-documentation-writing/scenarios/level-5/documentation-program-strategy.yaml +42 -0
  48. package/courses/api-documentation-writing/scenarios/level-5/documentation-team-structure.yaml +47 -0
  49. package/courses/api-documentation-writing/scenarios/level-5/dx-competitive-advantage.yaml +46 -0
  50. package/courses/api-documentation-writing/scenarios/level-5/ecosystem-documentation.yaml +45 -0
  51. package/courses/api-documentation-writing/scenarios/level-5/industry-documentation-patterns.yaml +46 -0
  52. package/courses/api-documentation-writing/scenarios/level-5/master-documentation-shift.yaml +46 -0
  53. package/courses/code-review-feedback-writing/course.yaml +12 -0
  54. package/courses/code-review-feedback-writing/scenarios/level-1/approve-vs-request-changes.yaml +48 -0
  55. package/courses/code-review-feedback-writing/scenarios/level-1/asking-questions.yaml +50 -0
  56. package/courses/code-review-feedback-writing/scenarios/level-1/clear-comment-writing.yaml +45 -0
  57. package/courses/code-review-feedback-writing/scenarios/level-1/constructive-tone.yaml +43 -0
  58. package/courses/code-review-feedback-writing/scenarios/level-1/first-review-shift.yaml +46 -0
  59. package/courses/code-review-feedback-writing/scenarios/level-1/giving-praise.yaml +44 -0
  60. package/courses/code-review-feedback-writing/scenarios/level-1/nitpick-etiquette.yaml +44 -0
  61. package/courses/code-review-feedback-writing/scenarios/level-1/providing-context.yaml +46 -0
  62. package/courses/code-review-feedback-writing/scenarios/level-1/reviewing-small-prs.yaml +43 -0
  63. package/courses/code-review-feedback-writing/scenarios/level-1/style-vs-logic.yaml +48 -0
  64. package/courses/code-review-feedback-writing/scenarios/level-2/architectural-feedback.yaml +52 -0
  65. package/courses/code-review-feedback-writing/scenarios/level-2/intermediate-review-shift.yaml +46 -0
  66. package/courses/code-review-feedback-writing/scenarios/level-2/performance-feedback.yaml +50 -0
  67. package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-breaking-changes.yaml +44 -0
  68. package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-complex-prs.yaml +43 -0
  69. package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-documentation.yaml +47 -0
  70. package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-error-handling.yaml +50 -0
  71. package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-tests.yaml +53 -0
  72. package/courses/code-review-feedback-writing/scenarios/level-2/security-review-comments.yaml +50 -0
  73. package/courses/code-review-feedback-writing/scenarios/level-2/suggesting-alternatives.yaml +42 -0
  74. package/courses/code-review-feedback-writing/scenarios/level-3/advanced-review-shift.yaml +48 -0
  75. package/courses/code-review-feedback-writing/scenarios/level-3/api-design-review.yaml +47 -0
  76. package/courses/code-review-feedback-writing/scenarios/level-3/cross-team-review.yaml +45 -0
  77. package/courses/code-review-feedback-writing/scenarios/level-3/database-migration-review.yaml +48 -0
  78. package/courses/code-review-feedback-writing/scenarios/level-3/design-pattern-feedback.yaml +48 -0
  79. package/courses/code-review-feedback-writing/scenarios/level-3/mentoring-through-review.yaml +46 -0
  80. package/courses/code-review-feedback-writing/scenarios/level-3/production-incident-review.yaml +42 -0
  81. package/courses/code-review-feedback-writing/scenarios/level-3/reviewing-senior-code.yaml +47 -0
  82. package/courses/code-review-feedback-writing/scenarios/level-3/reviewing-unfamiliar-code.yaml +43 -0
  83. package/courses/code-review-feedback-writing/scenarios/level-3/speed-vs-thoroughness.yaml +46 -0
  84. package/courses/code-review-feedback-writing/scenarios/level-4/automated-review-strategy.yaml +44 -0
  85. package/courses/code-review-feedback-writing/scenarios/level-4/expert-review-shift.yaml +46 -0
  86. package/courses/code-review-feedback-writing/scenarios/level-4/review-culture-design.yaml +41 -0
  87. package/courses/code-review-feedback-writing/scenarios/level-4/review-guidelines-standards.yaml +45 -0
  88. package/courses/code-review-feedback-writing/scenarios/level-4/review-load-balancing.yaml +39 -0
  89. package/courses/code-review-feedback-writing/scenarios/level-4/review-metrics.yaml +39 -0
  90. package/courses/code-review-feedback-writing/scenarios/level-4/review-process-optimization.yaml +48 -0
  91. package/courses/code-review-feedback-writing/scenarios/level-4/scaling-review-process.yaml +45 -0
  92. package/courses/code-review-feedback-writing/scenarios/level-4/security-review-standards.yaml +41 -0
  93. package/courses/code-review-feedback-writing/scenarios/level-4/training-reviewers.yaml +42 -0
  94. package/courses/code-review-feedback-writing/scenarios/level-5/board-quality-metrics.yaml +44 -0
  95. package/courses/code-review-feedback-writing/scenarios/level-5/knowledge-transfer-at-scale.yaml +42 -0
  96. package/courses/code-review-feedback-writing/scenarios/level-5/ma-review-alignment.yaml +50 -0
  97. package/courses/code-review-feedback-writing/scenarios/level-5/master-review-shift.yaml +49 -0
  98. package/courses/code-review-feedback-writing/scenarios/level-5/review-competitive-advantage.yaml +48 -0
  99. package/courses/code-review-feedback-writing/scenarios/level-5/review-organizational-learning.yaml +46 -0
  100. package/courses/code-review-feedback-writing/scenarios/level-5/review-roi-analysis.yaml +51 -0
  101. package/courses/code-review-feedback-writing/scenarios/level-5/review-velocity-impact.yaml +44 -0
  102. package/courses/code-review-feedback-writing/scenarios/level-5/scaling-reviews-100-plus.yaml +45 -0
  103. package/courses/code-review-feedback-writing/scenarios/level-5/toxic-culture-transformation.yaml +46 -0
  104. package/courses/technical-rfc-writing/course.yaml +11 -0
  105. package/courses/technical-rfc-writing/scenarios/level-1/first-rfc-shift.yaml +45 -0
  106. package/courses/technical-rfc-writing/scenarios/level-1/implementation-planning.yaml +47 -0
  107. package/courses/technical-rfc-writing/scenarios/level-1/open-questions.yaml +46 -0
  108. package/courses/technical-rfc-writing/scenarios/level-1/problem-statement.yaml +41 -0
  109. package/courses/technical-rfc-writing/scenarios/level-1/proposing-solutions.yaml +49 -0
  110. package/courses/technical-rfc-writing/scenarios/level-1/rfc-structure.yaml +41 -0
  111. package/courses/technical-rfc-writing/scenarios/level-1/risks-and-mitigations.yaml +43 -0
  112. package/courses/technical-rfc-writing/scenarios/level-1/scoping-an-rfc.yaml +49 -0
  113. package/courses/technical-rfc-writing/scenarios/level-1/success-metrics.yaml +43 -0
  114. package/courses/technical-rfc-writing/scenarios/level-1/writing-for-audience.yaml +42 -0
  115. package/courses/technical-rfc-writing/scenarios/level-2/risk-assessment-matrix.yaml +43 -0
  116. package/courses/technical-rfc-writing/scenarios/level-2/technical-design-detail.yaml +42 -0
  117. package/courses/technical-rfc-writing/scenarios/level-2/trade-off-analysis.yaml +43 -0
  118. package/courses/terraform-infrastructure-setup/scenarios/level-1/first-debugging-shift.yaml +66 -0
  119. package/courses/terraform-infrastructure-setup/scenarios/level-1/plan-output-reading.yaml +71 -0
  120. package/courses/terraform-infrastructure-setup/scenarios/level-1/resource-creation-failures.yaml +54 -0
  121. package/courses/terraform-infrastructure-setup/scenarios/level-1/resource-references.yaml +70 -0
  122. package/courses/terraform-infrastructure-setup/scenarios/level-1/state-file-basics.yaml +73 -0
  123. package/courses/terraform-infrastructure-setup/scenarios/level-1/terraform-fmt-validate.yaml +58 -0
  124. package/courses/terraform-infrastructure-setup/scenarios/level-2/count-vs-for-each.yaml +58 -0
  125. package/courses/terraform-infrastructure-setup/scenarios/level-2/dependency-management.yaml +80 -0
  126. package/courses/terraform-infrastructure-setup/scenarios/level-2/intermediate-debugging-shift.yaml +66 -0
  127. package/courses/terraform-infrastructure-setup/scenarios/level-2/lifecycle-rules.yaml +51 -0
  128. package/courses/terraform-infrastructure-setup/scenarios/level-2/locals-and-expressions.yaml +58 -0
  129. package/courses/terraform-infrastructure-setup/scenarios/level-2/module-structure.yaml +75 -0
  130. package/courses/terraform-infrastructure-setup/scenarios/level-2/provisioner-pitfalls.yaml +64 -0
  131. package/courses/terraform-infrastructure-setup/scenarios/level-2/remote-state-backend.yaml +55 -0
  132. package/courses/terraform-infrastructure-setup/scenarios/level-2/terraform-import.yaml +55 -0
  133. package/courses/terraform-infrastructure-setup/scenarios/level-2/workspace-management.yaml +51 -0
  134. package/courses/terraform-infrastructure-setup/scenarios/level-3/advanced-debugging-shift.yaml +63 -0
  135. package/courses/terraform-infrastructure-setup/scenarios/level-3/api-rate-limiting.yaml +50 -0
  136. package/courses/terraform-infrastructure-setup/scenarios/level-3/conditional-resources.yaml +66 -0
  137. package/courses/terraform-infrastructure-setup/scenarios/level-3/drift-detection.yaml +66 -0
  138. package/courses/terraform-infrastructure-setup/scenarios/level-3/dynamic-blocks.yaml +71 -0
  139. package/courses/terraform-infrastructure-setup/scenarios/level-3/large-scale-refactoring.yaml +59 -0
  140. package/courses/terraform-infrastructure-setup/scenarios/level-3/multi-provider-config.yaml +69 -0
  141. package/courses/terraform-infrastructure-setup/scenarios/level-3/state-surgery.yaml +57 -0
  142. package/courses/terraform-infrastructure-setup/scenarios/level-3/terraform-cloud-enterprise.yaml +59 -0
  143. package/courses/terraform-infrastructure-setup/scenarios/level-3/terraform-debugging.yaml +51 -0
  144. package/courses/terraform-infrastructure-setup/scenarios/level-4/blast-radius-management.yaml +51 -0
  145. package/courses/terraform-infrastructure-setup/scenarios/level-4/cicd-pipeline-design.yaml +50 -0
  146. package/courses/terraform-infrastructure-setup/scenarios/level-4/compliance-as-code.yaml +46 -0
  147. package/courses/terraform-infrastructure-setup/scenarios/level-4/cost-estimation-governance.yaml +42 -0
  148. package/courses/terraform-infrastructure-setup/scenarios/level-4/expert-debugging-shift.yaml +51 -0
  149. package/courses/terraform-infrastructure-setup/scenarios/level-4/iac-organization-strategy.yaml +45 -0
  150. package/courses/terraform-infrastructure-setup/scenarios/level-4/incident-response-iac.yaml +47 -0
  151. package/courses/terraform-infrastructure-setup/scenarios/level-4/infrastructure-testing.yaml +41 -0
  152. package/courses/terraform-infrastructure-setup/scenarios/level-4/module-registry-design.yaml +45 -0
  153. package/courses/terraform-infrastructure-setup/scenarios/level-4/multi-account-strategy.yaml +57 -0
  154. package/courses/terraform-infrastructure-setup/scenarios/level-5/board-infrastructure-investment.yaml +53 -0
  155. package/courses/terraform-infrastructure-setup/scenarios/level-5/disaster-recovery-iac.yaml +47 -0
  156. package/courses/terraform-infrastructure-setup/scenarios/level-5/enterprise-iac-transformation.yaml +48 -0
  157. package/courses/terraform-infrastructure-setup/scenarios/level-5/iac-technology-evolution.yaml +49 -0
  158. package/courses/terraform-infrastructure-setup/scenarios/level-5/ma-infrastructure-consolidation.yaml +54 -0
  159. package/courses/terraform-infrastructure-setup/scenarios/level-5/master-debugging-shift.yaml +53 -0
  160. package/courses/terraform-infrastructure-setup/scenarios/level-5/multi-cloud-strategy.yaml +49 -0
  161. package/courses/terraform-infrastructure-setup/scenarios/level-5/platform-engineering.yaml +47 -0
  162. package/courses/terraform-infrastructure-setup/scenarios/level-5/regulatory-compliance-automation.yaml +47 -0
  163. package/courses/terraform-infrastructure-setup/scenarios/level-5/terraform-vs-alternatives.yaml +46 -0
  164. package/dist/cli/commands/generate.d.ts.map +1 -1
  165. package/dist/cli/commands/generate.js +2 -1
  166. package/dist/cli/commands/generate.js.map +1 -1
  167. package/dist/cli/commands/train.d.ts.map +1 -1
  168. package/dist/cli/commands/train.js +6 -3
  169. package/dist/cli/commands/train.js.map +1 -1
  170. package/dist/cli/index.js +9 -6
  171. package/dist/cli/index.js.map +1 -1
  172. package/dist/cli/run-demo.js +3 -2
  173. package/dist/cli/run-demo.js.map +1 -1
  174. package/dist/engine/model-utils.d.ts +6 -0
  175. package/dist/engine/model-utils.d.ts.map +1 -1
  176. package/dist/engine/model-utils.js +28 -1
  177. package/dist/engine/model-utils.js.map +1 -1
  178. package/dist/engine/training.d.ts.map +1 -1
  179. package/dist/engine/training.js +4 -3
  180. package/dist/engine/training.js.map +1 -1
  181. package/dist/evaluator/judge.d.ts +7 -1
  182. package/dist/evaluator/judge.d.ts.map +1 -1
  183. package/dist/evaluator/judge.js +50 -11
  184. package/dist/evaluator/judge.js.map +1 -1
  185. package/dist/generator/course-generator.d.ts.map +1 -1
  186. package/dist/generator/course-generator.js +4 -3
  187. package/dist/generator/course-generator.js.map +1 -1
  188. package/dist/mcp/server.d.ts.map +1 -1
  189. package/dist/mcp/server.js +7 -3
  190. package/dist/mcp/server.js.map +1 -1
  191. package/dist/mcp/session-manager.d.ts.map +1 -1
  192. package/dist/mcp/session-manager.js +3 -2
  193. package/dist/mcp/session-manager.js.map +1 -1
  194. package/dist/types/index.d.ts +1 -1
  195. package/dist/types/index.d.ts.map +1 -1
  196. package/package.json +1 -1
@@ -0,0 +1,66 @@
+meta:
+  id: conditional-resources
+  level: 3
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Create conditional resources — use count, for_each, and ternary expressions to conditionally create infrastructure based on environment or feature flags"
+  tags: [Terraform, conditional, feature-flags, count, ternary, advanced]
+
+state: {}
+
+trigger: |
+  Your infrastructure needs different resources per environment:
+  - Prod: NAT Gateway (expensive), multi-AZ RDS, CloudFront
+  - Dev: NAT Instance (cheap), single-AZ RDS, no CloudFront
+  - All: VPC, subnets, security groups
+
+  Your initial attempt:
+
+  ```hcl
+  resource "aws_nat_gateway" "main" {
+    count = var.environment == "prod" ? 1 : 0
+    # ...
+  }
+
+  resource "aws_instance" "nat" {
+    count = var.environment == "prod" ? 0 : 1
+    # ...
+  }
+
+  resource "aws_db_instance" "main" {
+    multi_az = var.environment == "prod" ? true : false
+    # ...
+  }
+
+  resource "aws_cloudfront_distribution" "cdn" {
+    count = var.environment == "prod" ? 1 : 0
+    origin {
+      domain_name = aws_lb.web.dns_name
+    }
+  }
+  ```
+
+  Problem: other resources reference aws_nat_gateway.main.id but it
+  might not exist (count = 0):
+  ```
+  Error: Invalid index
+  aws_nat_gateway.main[0] does not exist
+  ```
+
+  Task: Explain conditional resource creation patterns, how to
+  safely reference conditional resources, feature flags in Terraform,
+  and environment-specific configuration strategies.
+
+assertions:
+  - type: llm_judge
+    criteria: "Conditional creation patterns are explained — count = condition ? 1 : 0 is the primary pattern. When count = 0, resource doesn't exist. Reference safely: one(concat(aws_nat_gateway.main[*].id, aws_instance.nat[*].id)) or try(aws_nat_gateway.main[0].id, null). Splat expression: aws_nat_gateway.main[*].id returns empty list when count = 0 (no error). Conditional with for_each: for_each = var.enable_cdn ? { cdn = true } : {} — resource exists when map is non-empty. Ternary for attributes: multi_az = var.environment == 'prod' (direct boolean)"
+    weight: 0.35
+    description: "Conditional patterns"
+  - type: llm_judge
+    criteria: "Safe referencing is covered — problem: aws_nat_gateway.main[0].id errors when count = 0. Solutions: (1) use splat: aws_nat_gateway.main[*].id (returns list, empty if count=0). (2) use try(): try(aws_nat_gateway.main[0].id, ''). (3) use one(): one(aws_nat_gateway.main[*].id) (returns single value or null). (4) use locals to abstract: locals { nat_gateway_id = length(aws_nat_gateway.main) > 0 ? aws_nat_gateway.main[0].id : null }. Then reference local.nat_gateway_id. For route tables: use coalesce() or conditional in the referencing resource's count"
+    weight: 0.35
+    description: "Safe referencing"
+  - type: llm_judge
+    criteria: "Feature flags and environment strategy are practical — feature flags: variable 'features' { type = object({ enable_cdn = bool, enable_waf = bool, enable_monitoring = bool }), default = { enable_cdn = false, enable_waf = false, enable_monitoring = true } }. Each feature conditionally creates resources. Environment strategy: (1) simple: ternary on var.environment, (2) structured: locals map with per-environment settings: locals { env_config = { prod = { nat_type = 'gateway', rds_multi_az = true }, dev = { nat_type = 'instance', rds_multi_az = false } } }. Reference: local.env_config[var.environment].rds_multi_az. Keeps all environment differences in one place"
+    weight: 0.30
+    description: "Feature flags"
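The safe-referencing pattern named in the criteria above can be sketched as follows. This is a minimal illustration, not part of the package: the variables `nat_eip_allocation_id`, `public_subnet_id`, and `private_route_table_id` are hypothetical, and the route's `count` reuses the same known-at-plan-time condition rather than the gateway's ID so that Terraform can resolve it during planning.

```hcl
variable "environment" {
  type    = string
  default = "dev"
}

# Hypothetical inputs, declared only to keep the sketch self-contained.
variable "nat_eip_allocation_id" { type = string }
variable "public_subnet_id" { type = string }
variable "private_route_table_id" { type = string }

locals {
  # One place that decides whether the NAT Gateway exists.
  use_nat_gateway = var.environment == "prod"
}

resource "aws_nat_gateway" "main" {
  count         = local.use_nat_gateway ? 1 : 0
  allocation_id = var.nat_eip_allocation_id
  subnet_id     = var.public_subnet_id
}

resource "aws_route" "private_default" {
  # Same condition as the gateway, so this route never points at a missing object.
  count                  = local.use_nat_gateway ? 1 : 0
  route_table_id         = var.private_route_table_id
  destination_cidr_block = "0.0.0.0/0"

  # one() returns the single element of a zero- or one-element list, or null if empty,
  # which avoids the "Invalid index" error from aws_nat_gateway.main[0].
  nat_gateway_id = one(aws_nat_gateway.main[*].id)
}
```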
@@ -0,0 +1,66 @@
+meta:
+  id: drift-detection
+  level: 3
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Detect and remediate infrastructure drift — diagnose out-of-band changes, implement continuous drift detection, and establish drift prevention workflows"
+  tags: [Terraform, drift, detection, remediation, refresh, advanced]
+
+state: {}
+
+trigger: |
+  Your security team ran an emergency script that modified security
+  groups directly in AWS. Monday morning, terraform plan shows:
+
+  ```
+  Note: Objects have changed outside of Terraform
+
+  Terraform detected the following changes made outside of Terraform:
+
+    # aws_security_group.web has been changed
+    ~ resource "aws_security_group" "web" {
+        ~ ingress = [
+            # New rule added outside Terraform:
+            + {
+                cidr_blocks = ["0.0.0.0/0"]
+                from_port = 22
+                to_port = 22
+                protocol = "tcp"
+              }
+          ]
+      }
+
+    # aws_instance.web has been changed
+    ~ resource "aws_instance" "web" {
+        ~ instance_type = "t3.micro" -> "t3.large"
+      }
+
+    # aws_db_instance.main has been changed
+    ~ resource "aws_db_instance" "main" {
+        ~ backup_retention_period = 7 -> 30
+      }
+
+  Plan: 0 to add, 3 to change, 0 to destroy.
+  ```
+
+  Terraform wants to revert ALL changes. But some changes were
+  intentional (backup retention increase) and others were dangerous
+  (SSH open to 0.0.0.0/0).
+
+  Task: Explain drift detection mechanics, selective drift handling
+  (accept some, revert others), continuous drift detection strategies,
+  and prevention workflows to minimize drift.
+
+assertions:
+  - type: llm_judge
+    criteria: "Drift detection mechanics are explained — Terraform refreshes state before plan/apply by querying cloud APIs. Compares: real infrastructure vs state vs configuration. Drift = real infrastructure differs from state. Plan shows drift as 'Objects have changed outside of Terraform'. Terraform then plans to reconcile real infrastructure with configuration (not state). terraform plan -refresh-only: shows drift without planning changes. terraform apply -refresh-only: updates state to match reality without changing infrastructure. terraform refresh (deprecated): same as apply -refresh-only"
+    weight: 0.35
+    description: "Detection mechanics"
+  - type: llm_judge
+    criteria: "Selective drift handling is covered — to accept some drift and revert others: (1) SSH rule (dangerous): revert by applying — Terraform removes the unauthorized 0.0.0.0/0 SSH rule. (2) Instance type change: if intentional, update .tf file to t3.large, plan shows no changes. (3) Backup retention: if intentional, update .tf to backup_retention_period = 30. Workflow: review each drift item, decide accept or revert, update code for accepted changes, apply to revert rejected changes. ignore_changes lifecycle: for attributes managed externally (not recommended as default — masks problems)"
+    weight: 0.35
+    description: "Selective handling"
+  - type: llm_judge
+    criteria: "Prevention and continuous detection are practical — continuous detection: scheduled terraform plan -detailed-exitcode in CI (exit code 2 = drift detected, alert the team). Terraform Cloud: drift detection feature runs plans automatically. Prevention: (1) all changes through Terraform (enforce via IAM — deny console modifications), (2) SCP policies in AWS Organizations to restrict manual changes, (3) AWS Config rules to detect non-Terraform changes, (4) establish an emergency change process that includes updating Terraform code after. Culture: drift is a process problem, not a tool problem"
+    weight: 0.30
+    description: "Prevention"
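Selective handling as described in the criteria above comes down to editing configuration for the drift you accept and leaving it untouched for the drift you want reverted. The fragments below are a sketch under that reading, not part of the package; the other resource arguments are elided, and the `ignore_changes` case assumes some external process legitimately owns the instance size.

```hcl
# Accepted drift: bring the configuration in line with what is already live.
resource "aws_db_instance" "main" {
  # ... other arguments unchanged ...

  backup_retention_period = 30 # was 7; the out-of-band increase is kept
}

# Externally managed attribute: use sparingly, since it hides future drift.
resource "aws_instance" "web" {
  # ... other arguments unchanged ...
  instance_type = "t3.micro"

  lifecycle {
    # Assumption: another process is allowed to resize this instance.
    ignore_changes = [instance_type]
  }
}

# Rejected drift needs no edit: aws_security_group.web stays as written,
# and the next apply removes the 0.0.0.0/0 SSH rule.
```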
@@ -0,0 +1,71 @@
+meta:
+  id: dynamic-blocks
+  level: 3
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Use dynamic blocks — generate repetitive configuration blocks from variables, handle nested dynamics, and avoid over-engineering"
+  tags: [Terraform, dynamic-blocks, for-each, iteration, DRY, advanced]
+
+state: {}
+
+trigger: |
+  Your security group configuration has grown unwieldy with 20 inline
+  ingress rules hardcoded:
+
+  ```hcl
+  resource "aws_security_group" "web" {
+    name = "web-sg"
+
+    ingress {
+      from_port = 80
+      to_port = 80
+      protocol = "tcp"
+      cidr_blocks = ["0.0.0.0/0"]
+    }
+    ingress {
+      from_port = 443
+      to_port = 443
+      protocol = "tcp"
+      cidr_blocks = ["0.0.0.0/0"]
+    }
+    ingress {
+      from_port = 22
+      to_port = 22
+      protocol = "tcp"
+      cidr_blocks = ["10.0.0.0/8"]
+    }
+    # ... 17 more rules
+  }
+  ```
+
+  You want to make this data-driven using variables and dynamic blocks.
+  But your first attempt creates confusing, deeply nested code:
+
+  ```hcl
+  dynamic "ingress" {
+    for_each = var.rules
+    content {
+      dynamic "self" { # ERROR: can't nest dynamic within dynamic
+        # for the same block type
+      }
+    }
+  }
+  ```
+
+  Task: Explain dynamic blocks, when to use them, nested dynamic
+  blocks, iterator naming, the content block, and when dynamic blocks
+  hurt readability (over-engineering).
+
+assertions:
+  - type: llm_judge
+    criteria: "Dynamic blocks are explained — dynamic 'ingress' { for_each = var.ingress_rules, content { from_port = ingress.value.from_port, to_port = ingress.value.to_port, protocol = ingress.value.protocol, cidr_blocks = ingress.value.cidr_blocks } }. The iterator name defaults to the block label (ingress). Custom iterator: dynamic 'ingress' { iterator = rule, for_each = ..., content { from_port = rule.value.port } }. for_each accepts: list, set, map. Within content, use iterator.key and iterator.value. Dynamic blocks can only generate repeatable nested blocks, not top-level arguments"
+    weight: 0.35
+    description: "Dynamic blocks"
+  - type: llm_judge
+    criteria: "Nested dynamics and practical patterns are covered — nested dynamic blocks: dynamic blocks can be nested when a block contains another repeatable block. Example: aws_security_group with dynamic ingress that has dynamic cidr_blocks (though cidr_blocks is actually a list attribute, not a block). Real nested example: aws_lb_listener_rule with dynamic condition { dynamic host_header { ... } }. Variable structure: use list of objects for simple rules, map of objects when you need keys. Flatten complex structures with locals before passing to dynamic blocks"
+    weight: 0.35
+    description: "Nested dynamics"
+  - type: llm_judge
+    criteria: "Over-engineering warnings are given — when NOT to use dynamic blocks: (1) fewer than 3-4 rules — just write them out, (2) deeply nested dynamics (3+ levels) — becomes unreadable, (3) when separate resources are clearer (aws_security_group_rule instead of inline ingress blocks). Dynamic blocks trade readability for DRY — sometimes repetition is more maintainable. Alternative to dynamic: use for_each on separate resource blocks (aws_security_group_rule with for_each). This is often clearer and gives each rule its own lifecycle. Rule of thumb: if a new team member can't understand it in 30 seconds, simplify"
+    weight: 0.30
+    description: "Over-engineering"
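For reference, the data-driven form that the first criterion describes might look like the sketch below. It is not taken from the package: the variable name `ingress_rules` follows the criteria text, the default values mirror the three rules shown in the trigger, and the custom iterator name `rule` is an illustrative choice.

```hcl
variable "ingress_rules" {
  type = list(object({
    from_port   = number
    to_port     = number
    protocol    = string
    cidr_blocks = list(string)
  }))
  default = [
    { from_port = 80, to_port = 80, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
    { from_port = 443, to_port = 443, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
    { from_port = 22, to_port = 22, protocol = "tcp", cidr_blocks = ["10.0.0.0/8"] },
  ]
}

resource "aws_security_group" "web" {
  name = "web-sg"

  # One ingress block is generated per element of var.ingress_rules.
  dynamic "ingress" {
    for_each = var.ingress_rules
    iterator = rule # without this line the iterator would be named "ingress"

    content {
      from_port   = rule.value.from_port
      to_port     = rule.value.to_port
      protocol    = rule.value.protocol
      cidr_blocks = rule.value.cidr_blocks
    }
  }
}
```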
@@ -0,0 +1,59 @@
1
+ meta:
2
+ id: large-scale-refactoring
3
+ level: 3
4
+ course: terraform-infrastructure-setup
5
+ type: output
6
+ description: "Refactor large Terraform codebases — split monoliths into modules, migrate between state files, and use moved blocks for safe resource reorganization"
7
+ tags: [Terraform, refactoring, modules, moved-blocks, migration, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your organization's Terraform codebase has grown organically over
13
+ 3 years into a monolith:
14
+
15
+ ```
16
+ infrastructure/
17
+ ├── main.tf (3500 lines, 180 resources)
18
+ ├── variables.tf (800 lines, 95 variables)
19
+ ├── outputs.tf (200 lines)
20
+ └── terraform.tfstate (25MB, all resources in one state)
21
+ ```
22
+
23
+ Problems:
24
+ - terraform plan takes 8 minutes (refreshes all 180 resources)
25
+ - Any change risks all resources (blast radius = everything)
26
+ - 5 teams touch the same files, causing merge conflicts
27
+ - Lock contention: only one person can run terraform at a time
28
+
29
+ Target architecture:
30
+ ```
31
+ infrastructure/
32
+ ├── foundation/ (VPC, DNS, IAM — Platform team)
33
+ │ └── terraform.tfstate
34
+ ├── database/ (RDS, ElastiCache — Database team)
35
+ │ └── terraform.tfstate
36
+ ├── compute/ (ECS, ALB — App team)
37
+ │ └── terraform.tfstate
38
+ ├── monitoring/ (CloudWatch, Alarms — SRE team)
39
+ │ └── terraform.tfstate
40
+ └── modules/ (Shared modules)
41
+ ```
42
+
43
+ Task: Design the migration strategy from monolith to modular
44
+ Terraform, covering state splitting, moved blocks, cross-state
45
+ references, testing the migration, and rollback planning.
46
+
47
+ assertions:
48
+ - type: llm_judge
49
+ criteria: "Migration strategy is phased — Phase 1: catalog all resources by team/domain. Phase 2: create module structure and write configurations for each domain. Phase 3: use moved blocks within the monolith to reorganize into modules (no state split yet). Phase 4: split state files using state mv or state rm + import. Phase 5: establish cross-state references using terraform_remote_state data sources. Each phase is independently verifiable: plan should show no changes after each phase. Never do everything at once — incremental migration with verification"
50
+ weight: 0.35
51
+ description: "Migration strategy"
52
+ - type: llm_judge
53
+ criteria: "State splitting mechanics are covered — approach 1 (state mv): (1) backup state, (2) create new backend configs, (3) terraform state mv resources to new state files. Approach 2 (state rm + import): (1) remove resources from monolith state, (2) import into new domain state files. Approach 3 (manual): (1) state pull, (2) edit JSON to split resources, (3) state push to new backends. Cross-state references: foundation outputs VPC ID, compute reads it via terraform_remote_state. IAM and dependency order: foundation first (VPC, IAM), then database (needs VPC), then compute (needs both)"
54
+ weight: 0.35
55
+ description: "State splitting"
56
+ - type: llm_judge
57
+ criteria: "Testing and rollback are practical — testing: after each migration step, terraform plan must show zero changes in all state files. If plan shows changes, something was migrated incorrectly — fix before proceeding. Rollback: keep the original monolith state backup throughout migration. If anything goes wrong, restore from backup and restart the phase. Timeline: for 180 resources, plan 2-4 weeks. Risk mitigation: migrate non-production first, then production during maintenance window. Communication: notify all teams of the plan, freeze non-essential changes during migration"
58
+ weight: 0.30
59
+ description: "Testing and rollback"
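The cross-state reference named in the criteria above (foundation outputs the VPC ID, compute reads it) could look roughly like the sketch below. The bucket name, state key, and security group are hypothetical, and the two halves live in different root modules.

```hcl
# foundation/outputs.tf: expose the VPC ID to other state files.
output "vpc_id" {
  value = aws_vpc.main.id
}

# compute/data.tf: read the foundation state (read-only access is enough).
data "terraform_remote_state" "foundation" {
  backend = "s3"
  config = {
    bucket = "example-terraform-state" # hypothetical bucket name
    key    = "foundation/terraform.tfstate"
    region = "us-east-1"
  }
}

# compute/main.tf: consume the output instead of hard-coding the ID.
resource "aws_security_group" "app" {
  name   = "app" # hypothetical resource for illustration
  vpc_id = data.terraform_remote_state.foundation.outputs.vpc_id
}
```

Only values declared as outputs in the foundation configuration are visible to the compute configuration, which keeps the dependency between teams explicit.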
@@ -0,0 +1,69 @@
1
+ meta:
2
+ id: multi-provider-config
3
+ level: 3
4
+ course: terraform-infrastructure-setup
5
+ type: output
6
+ description: "Configure multi-provider setups — manage multi-region, multi-account, and multi-cloud deployments with provider aliases and assume_role"
7
+ tags: [Terraform, providers, multi-region, multi-account, cross-account, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your organization needs infrastructure across multiple AWS accounts
13
+ and regions:
14
+
15
+ ```
16
+ Production Account (111111111111) - us-east-1
17
+ Staging Account (222222222222) - us-east-1
18
+ DR Account (111111111111) - us-west-2
19
+ Shared Services (333333333333) - us-east-1
20
+ ```
21
+
22
+ Your Terraform configuration:
23
+
24
+ ```hcl
25
+ provider "aws" {
26
+ region = "us-east-1"
27
+ }
28
+
29
+ provider "aws" {
30
+ alias = "dr"
31
+ region = "us-west-2"
32
+ }
33
+
34
+ provider "aws" {
35
+ alias = "staging"
36
+ region = "us-east-1"
37
+ assume_role {
38
+ role_arn = "arn:aws:iam::222222222222:role/TerraformRole"
39
+ }
40
+ }
41
+ ```
42
+
43
+ Error when deploying to staging:
44
+ ```
45
+ Error: error configuring Terraform AWS Provider: IAM Role
46
+ (arn:aws:iam::222222222222:role/TerraformRole) cannot be assumed.
47
+
48
+ There are a number of possible causes:
49
+ - The credentials used do not have permission to assume the role
50
+ - The role's trust policy does not allow the current identity
51
+ ```
52
+
53
+ Task: Explain multi-provider configuration, assume_role for
54
+ cross-account access, passing providers to modules, provider
55
+ configuration best practices, and debugging cross-account issues.
56
+
57
+ assertions:
58
+ - type: llm_judge
59
+ criteria: "Multi-provider setup is explained — provider aliases allow multiple configurations of the same provider. Default provider (no alias) used when provider isn't specified on a resource. Aliased providers: specify with provider = aws.dr on each resource. assume_role: Terraform assumes an IAM role in another account. Requirements: (1) trust policy on target role must allow the source account/role, (2) source must have sts:AssumeRole permission, (3) external_id for additional security. The error: trust policy or permissions issue — check both sides"
60
+ weight: 0.35
61
+ description: "Multi-provider setup"
62
+ - type: llm_judge
63
+ criteria: "Provider passing to modules is covered — modules inherit the default provider but not aliased providers. Pass explicitly: module 'dr_vpc' { source = './modules/vpc', providers = { aws = aws.dr } }. Module must declare required providers: terraform { required_providers { aws = { source = 'hashicorp/aws' } } }. For modules needing multiple providers: providers = { aws = aws, aws.secondary = aws.dr }, with the module declaring configuration_aliases = [aws.secondary] in its required_providers entry. Anti-pattern: configuring providers inside modules — always configure in root and pass down"
64
+ weight: 0.35
65
+ description: "Module providers"
66
+ - type: llm_judge
67
+ criteria: "Cross-account debugging is practical — debugging assume_role: (1) verify trust policy on target role allows the source identity, (2) verify source has sts:AssumeRole permission, (3) check for external_id requirement, (4) test manually: aws sts assume-role --role-arn ... (5) enable TF_LOG=DEBUG to see the exact API call. IAM role trust policy must include the specific ARN (account, user, or role). Session duration: default 1 hour, can increase with the assume_role duration argument (duration_seconds in older AWS provider versions). MFA: if required, must be handled outside Terraform. Best practice: use separate state files per account for blast radius isolation"
68
+ weight: 0.30
69
+ description: "Cross-account debugging"
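As a minimal sketch of passing an aliased provider into a module, assuming a hypothetical `./modules/vpc` module that needs a second AWS configuration named `aws.secondary`:

```hcl
# modules/vpc/versions.tf (hypothetical module): declare the alias
# the module expects to receive from its caller.
terraform {
  required_providers {
    aws = {
      source                = "hashicorp/aws"
      configuration_aliases = [aws.secondary]
    }
  }
}

# Root module: providers are configured here, never inside the module.
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "dr"
  region = "us-west-2"
}

module "vpc" {
  source = "./modules/vpc"

  # Map the module's provider names to root configurations.
  providers = {
    aws           = aws
    aws.secondary = aws.dr
  }
}
```

Inside the module, resources that should land in the DR region would set `provider = aws.secondary`; everything else uses the default `aws` configuration passed in.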
@@ -0,0 +1,57 @@
1
+ meta:
2
+ id: state-surgery
3
+ level: 3
4
+ course: terraform-infrastructure-setup
5
+ type: output
6
+ description: "Perform state surgery — use state mv, rm, pull, push for complex migrations, module extraction, and resource address changes"
7
+ tags: [Terraform, state, migration, state-mv, state-rm, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your monolithic Terraform configuration with 200 resources needs
13
+ to be split into separate modules. Current flat structure:
14
+
15
+ ```hcl
16
+ # main.tf (2000 lines)
17
+ resource "aws_vpc" "main" { ... }
18
+ resource "aws_subnet" "public" { ... }
19
+ resource "aws_instance" "web" { ... }
20
+ resource "aws_db_instance" "db" { ... }
21
+ ```
22
+
23
+ Target: split into modules/networking, modules/compute, modules/database.
24
+
25
+ Attempt 1 — Just move code into modules:
26
+ ```
27
+ $ terraform plan
28
+ # aws_vpc.main will be destroyed
29
+ # module.networking.aws_vpc.main will be created
30
+ # aws_instance.web will be destroyed
31
+ # module.compute.aws_instance.web will be created
32
+ # aws_db_instance.db will be destroyed (!!!)
33
+ # module.database.aws_db_instance.db will be created
34
+ Plan: 6 to add, 0 to change, 6 to destroy.
35
+ ```
36
+
37
+ All resources will be destroyed and recreated — unacceptable for
38
+ production! The database would be lost.
39
+
40
+ Task: Explain state surgery operations (mv, rm, pull, push),
41
+ how to migrate resources between modules without recreation,
42
+ moved blocks (Terraform 1.1+), state backup best practices,
43
+ and complex migration strategies.
44
+
45
+ assertions:
46
+ - type: llm_judge
47
+ criteria: "State mv migration is explained — terraform state mv moves a resource from one address to another in state without modifying infrastructure. To migrate to modules: terraform state mv aws_vpc.main module.networking.aws_vpc.main, terraform state mv aws_instance.web module.compute.aws_instance.web, etc. After all moves: terraform plan should show no changes. Always backup state first: terraform state pull > backup.tfstate. State mv is atomic per resource — if interrupted, some resources moved, others not. Plan carefully and script the moves"
48
+ weight: 0.35
49
+ description: "State mv"
50
+ - type: llm_judge
51
+ criteria: "Moved blocks are covered as the modern alternative — moved { from = aws_vpc.main, to = module.networking.aws_vpc.main }. Benefits over state mv: (1) declarative and code-reviewable, (2) handled during plan/apply, (3) no manual state manipulation, (4) applied consistently to every state that uses the configuration. Multiple moved blocks can coexist. Moved blocks are not removed automatically; they can be deleted from the configuration once every state that uses it has applied them. Supports: resource address changes, module refactoring, count to for_each migration. Terraform 1.1+ required. Preferred over state mv for most migrations"
52
+ weight: 0.35
53
+ description: "Moved blocks"
54
+ - type: llm_judge
55
+ criteria: "State rm and complex operations are practical — terraform state rm: removes resource from state without destroying it. Use when: (1) resource should no longer be managed by Terraform, (2) moving resource to different state file, (3) removing accidentally imported resource. terraform state pull/push: download/upload entire state file. Use for: manual state repair, migrating between backends, debugging. Complex migration: for splitting state files, (1) state pull, (2) manipulate JSON, (3) state push to new backend. Always: backup before surgery, verify with plan after, use -dry-run where available"
56
+ weight: 0.30
57
+ description: "Complex operations"
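For the scenario above, the moved blocks might be sketched as follows, placed in the root module next to the new `module` blocks. The module names follow the target layout (modules/networking, modules/compute); only three of the resources are shown.

```hcl
# Each block tells Terraform (1.1+) that an existing object in state
# now lives at a new address, so the plan reports a move rather than
# a destroy followed by a create.
moved {
  from = aws_vpc.main
  to   = module.networking.aws_vpc.main
}

moved {
  from = aws_subnet.public
  to   = module.networking.aws_subnet.public
}

moved {
  from = aws_instance.web
  to   = module.compute.aws_instance.web
}
```

With these in place, `terraform plan` should list the resources as moved and show nothing to add or destroy for them. If it still proposes a destroy, a `from` or `to` address likely does not match.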
@@ -0,0 +1,59 @@
1
+ meta:
2
+ id: terraform-cloud-enterprise
3
+ level: 3
4
+ course: terraform-infrastructure-setup
5
+ type: output
6
+ description: "Use Terraform Cloud/Enterprise — configure remote execution, VCS integration, workspace management, and Sentinel policies"
7
+ tags: [Terraform, Terraform-Cloud, Enterprise, remote-execution, Sentinel, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your team is migrating from local Terraform execution to Terraform
13
+ Cloud. Current pain points:
14
+ - Engineers run terraform from laptops with different provider versions
15
+ - No audit trail of who applied what
16
+ - State files stored in S3 with overly permissive access
17
+ - No policy enforcement (anyone can create m5.24xlarge instances)
18
+
19
+ Migration configuration:
20
+
21
+ ```hcl
22
+ terraform {
23
+ cloud {
24
+ organization = "acme-corp"
25
+ workspaces {
26
+ name = "production"
27
+ }
28
+ }
29
+ }
30
+ ```
31
+
32
+ After migration:
33
+ ```
34
+ $ terraform plan
35
+
36
+ Running plan in Terraform Cloud. Output will stream here.
37
+
38
+ Error: Terraform Cloud returned an unexpected error
39
+ UNAUTHORIZED: You are not authorized to perform this action.
40
+ ```
41
+
42
+ Task: Explain Terraform Cloud features (remote execution, VCS
43
+ integration, workspace management), Sentinel policies for
44
+ governance, migration from local/S3 to Terraform Cloud, and
45
+ when to use Cloud vs Enterprise vs self-hosted.
46
+
47
+ assertions:
48
+ - type: llm_judge
49
+ criteria: "Terraform Cloud features are explained — remote execution: plan and apply run on Terraform Cloud's infrastructure (consistent environment, no laptop dependencies). VCS integration: connect to GitHub/GitLab, automatic plans on PRs, apply on merge. Workspace management: each workspace has its own state, variables, and permissions. Variable sets: share variables across workspaces. Run triggers: chain workspaces (VPC workspace triggers EKS workspace). The auth error: need to run terraform login first, or set TF_TOKEN_app_terraform_io environment variable. Team permissions control who can plan vs apply"
50
+ weight: 0.35
51
+ description: "Cloud features"
52
+ - type: llm_judge
53
+ criteria: "Sentinel policies are covered — Sentinel: policy-as-code framework for governance. Policy sets: attach to workspaces. Enforcement levels: advisory (warn), soft-mandatory (override with approval), hard-mandatory (no override). Example policies: restrict instance types (no m5.24xlarge), require tags on all resources, enforce encryption, restrict regions. Policy workflow: plan → cost estimation → Sentinel check → apply (cost estimation runs first so policies can evaluate the estimate). Policies written in Sentinel language (not HCL). OPA (Open Policy Agent) also supported as alternative"
54
+ weight: 0.35
55
+ description: "Sentinel policies"
56
+ - type: llm_judge
57
+ criteria: "Migration and comparison are practical — migration from S3: (1) add cloud block to config, (2) terraform login, (3) terraform init to migrate state. Cloud vs Enterprise vs self-hosted: Cloud (SaaS, free tier available, easiest setup), Enterprise (self-hosted, air-gapped support, custom agents), self-hosted agents with Cloud (hybrid — control plane in Cloud, execution on your infrastructure). When Cloud: most teams. When Enterprise: regulatory requirements for air-gapped, very large scale, custom integrations. Cost: Cloud free for small teams, Enterprise starts at $70K+/year"
58
+ weight: 0.30
59
+ description: "Migration and comparison"
@@ -0,0 +1,51 @@
1
+ meta:
2
+ id: terraform-debugging
3
+ level: 3
4
+ course: terraform-infrastructure-setup
5
+ type: output
6
+ description: "Debug Terraform with TF_LOG — use log levels, provider-specific debugging, crash logs, and systematic troubleshooting for complex failures"
7
+ tags: [Terraform, debugging, TF_LOG, crash-logs, troubleshooting, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ A terraform apply fails with a cryptic error that gives no useful
13
+ information:
14
+
15
+ ```
16
+ Error: error creating ECS Service (my-service): InvalidParameterException:
17
+ Unable to assume the provided role.
18
+
19
+ with aws_ecs_service.web,
20
+ on ecs.tf line 15, in resource "aws_ecs_service" "web":
21
+ 15: resource "aws_ecs_service" "web" {
22
+ ```
23
+
24
+ The IAM role exists and looks correct. You need to dig deeper.
25
+
26
+ You also encounter a Terraform crash:
27
+ ```
28
+ !!!!!!!!!!!!!!!!!!!!!!!!!!! TERRAFORM CRASH !!!!!!!!!!!!!!!!!!!!!!!!!
29
+
30
+ Terraform crashed! This is always indicative of a bug within
31
+ Terraform or a provider. Crash log saved to: crash.log
32
+ ```
33
+
34
+ Task: Explain Terraform debugging techniques, TF_LOG levels and
35
+ environment variables, provider-specific debugging, crash log
36
+ analysis, and systematic troubleshooting methodology for complex
37
+ infrastructure failures.
38
+
39
+ assertions:
40
+ - type: llm_judge
41
+ criteria: "TF_LOG debugging is explained — levels (most to least verbose): TRACE, DEBUG, INFO, WARN, ERROR. Set: TF_LOG=DEBUG terraform apply. Save to file: TF_LOG_PATH=./debug.log. Component-specific: TF_LOG_CORE=WARN TF_LOG_PROVIDER=DEBUG (provider operations verbose, core quiet). The ECS error: TF_LOG=DEBUG reveals the actual API request/response — likely IAM role trust policy doesn't include ecs.amazonaws.com, or there's an IAM propagation delay. DEBUG shows: HTTP requests, API responses, retry attempts, timing. TRACE shows everything including internal state operations"
42
+ weight: 0.35
43
+ description: "TF_LOG debugging"
44
+ - type: llm_judge
45
+ criteria: "Crash logs and provider debugging are covered — crash log: contains Go stack trace, panic message, provider version. Report to: provider GitHub issues if provider crash, Terraform core GitHub if core crash. Include: Terraform version, provider versions, sanitized config, crash.log. Provider debugging: check provider changelog for known bugs, try upgrading/downgrading provider version, reproduce with minimal configuration. AWS-specific: decode authorization failure messages with aws sts decode-authorization-message. Eventual consistency: IAM changes can take seconds to propagate — add depends_on or retry"
46
+ weight: 0.35
47
+ description: "Crash and provider"
48
+ - type: llm_judge
49
+ criteria: "Systematic troubleshooting is practical — methodology: (1) read the error message carefully (resource, file, line), (2) check provider documentation for the resource, (3) enable TF_LOG=DEBUG and search for the actual API error, (4) reproduce with minimal configuration (isolate the issue), (5) check for known issues on provider GitHub, (6) verify cloud-side (correct permissions, quotas, resource limits). Common hidden causes: IAM propagation delay, API rate limiting (429 errors hidden in retries), eventual consistency, stale provider cache (terraform init -upgrade). terraform plan -refresh-only to verify state matches reality"
50
+ weight: 0.30
51
+ description: "Troubleshooting method"
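If the debug log does point to the trust policy, as the criteria above suggest is likely, the fix might resemble this sketch. The role name is hypothetical, and the service principal is the one named in the criteria; the principal a real role needs depends on how ECS uses it.

```hcl
# Trust policy that lets the ECS service assume the role.
data "aws_iam_policy_document" "ecs_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ecs.amazonaws.com"] # principal taken from the criteria above
    }
  }
}

resource "aws_iam_role" "ecs_service" {
  name               = "ecs-service-role" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.ecs_assume.json
}
```

Comparing the rendered JSON of this policy with the role's trust relationship in the AWS console is a quick way to confirm whether the "Unable to assume the provided role" error comes from the trust policy or from propagation delay.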
@@ -0,0 +1,51 @@
1
+ meta:
2
+ id: blast-radius-management
3
+ level: 4
4
+ course: terraform-infrastructure-setup
5
+ type: output
6
+ description: "Manage Terraform blast radius — design state boundaries, implement approval workflows, and prevent large-scale outages from single changes"
7
+ tags: [Terraform, blast-radius, state-separation, approvals, risk, expert]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ A single terraform apply destroyed your production database, two
13
+ load balancers, and a VPN connection. Root cause: all 300 resources
14
+ were in one state file. The engineer intended to modify a CloudWatch
15
+ alarm but a provider upgrade changed the behavior of unrelated
16
+ resources.
17
+
18
+ Impact:
19
+ - 4 hours of downtime
20
+ - Database restored from backup (30 minutes of data loss)
21
+ - Post-mortem found: blast radius = 300 resources per apply
22
+ - Board asked: "How do we prevent this from happening again?"
23
+
24
+ Current state architecture:
25
+ ```
26
+ Single state: 300 resources
27
+ - VPC, subnets, NAT gateways
28
+ - RDS, ElastiCache
29
+ - ECS services, ALBs
30
+ - CloudWatch, SNS, SQS
31
+ - IAM roles, policies
32
+ - S3 buckets, CloudFront
33
+ ```
34
+
35
+ Task: Design the blast radius management strategy covering: state
36
+ file boundaries, change classification (risk levels), approval
37
+ workflows, provider upgrade safety, and recovery procedures.
38
+
39
+ assertions:
40
+ - type: llm_judge
41
+ criteria: "State boundaries reduce blast radius — split 300 resources into isolated state files: foundation (VPC, subnets, NAT — rarely changes, ~20 resources), database (RDS, ElastiCache — critical, ~10 resources), compute (ECS, ALB — frequently changes, ~50 resources), messaging (SQS, SNS — moderate, ~30 resources), monitoring (CloudWatch, alarms — frequent, ~40 resources), IAM (roles, policies — sensitive, ~30 resources), CDN (CloudFront, S3 — moderate, ~20 resources). Each state file limits the blast radius. Maximum 50-80 resources per state. Cross-state references via terraform_remote_state"
42
+ weight: 0.35
43
+ description: "State boundaries"
44
+ - type: llm_judge
45
+ criteria: "Change classification and approvals are defined — risk levels: Low (monitoring, tags, non-destructive updates — auto-approve in CI), Medium (security group changes, scaling modifications — 1 approval), High (database changes, network topology, IAM — 2 approvals + change window), Critical (provider upgrades, state operations, foundation changes — team lead + SRE approval). Implement via: Terraform Cloud workspace-level permissions, GitHub environment protection rules, or Atlantis apply requirements. Provider upgrades: pin exact versions, upgrade in dev first, review changelog for breaking changes, upgrade one state file at a time"
46
+ weight: 0.35
47
+ description: "Classification and approvals"
48
+ - type: llm_judge
49
+ criteria: "Recovery procedures are practical — immediate response: (1) don't run terraform apply again, (2) assess damage scope from state and CloudTrail, (3) restore from backups (RDS snapshots, S3 versioning). Recovery: (1) if resources destroyed but state intact: terraform apply recreates, (2) if state corrupted: restore from S3 versioned state backup. Prevention: prevent_destroy on databases and critical resources, separate state files limit collateral damage, terraform plan -detailed-exitcode in CI signals that a plan contains changes (exit code 2) and inspecting the JSON plan from terraform show -json catches unexpected destroys, plan output review required before apply. Provider upgrades: test in isolated environment first, upgrade one service domain at a time, maintain rollback plan (pin to previous version)"
50
+ weight: 0.30
51
+ description: "Recovery"
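Two of the prevention measures named above, exact provider pinning and `prevent_destroy`, can be sketched in a few lines. The version number, identifiers, and database settings are hypothetical placeholders rather than recommendations.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 5.31.0" # hypothetical version, pinned exactly
    }
  }
}

resource "aws_db_instance" "main" {
  identifier                  = "example-db" # hypothetical values throughout
  engine                      = "postgres"
  instance_class              = "db.t3.micro"
  allocated_storage           = 20
  username                    = "app"
  manage_master_user_password = true
  final_snapshot_identifier   = "example-db-final"

  lifecycle {
    # Any plan that would destroy this resource fails with an error
    # instead of proceeding.
    prevent_destroy = true
  }
}
```

Note that `prevent_destroy` only protects the resource while the block stays in the configuration; deleting the resource block entirely removes the guard too, so plan review is still needed.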
@@ -0,0 +1,50 @@
1
+ meta:
2
+ id: cicd-pipeline-design
3
+ level: 4
4
+ course: terraform-infrastructure-setup
5
+ type: output
6
+ description: "Design CI/CD pipelines for Terraform — implement GitOps workflows with Atlantis, GitHub Actions, or Terraform Cloud for safe infrastructure deployment"
7
+ tags: [Terraform, CI/CD, GitOps, Atlantis, GitHub-Actions, expert]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your team deploys Terraform from individual laptops. Last month:
13
+ - An engineer applied to production instead of staging (wrong workspace)
14
+ - Two engineers ran apply simultaneously, causing state corruption
15
+ - An apply failed halfway but no one noticed for 3 hours
16
+ - No record of who deployed what or when
17
+
18
+ You need to design a CI/CD pipeline for Terraform that prevents
19
+ all of these issues. Options on the table:
20
+
21
+ 1. GitHub Actions with custom workflow
22
+ 2. Atlantis (pull request automation)
23
+ 3. Terraform Cloud/Enterprise
24
+ 4. Spacelift
25
+
26
+ Requirements:
27
+ - Plan on every PR
28
+ - Apply only after approval and merge
29
+ - Environment protection (can't accidentally apply to prod)
30
+ - Cost estimation before apply
31
+ - Security scanning (tfsec/checkov)
32
+ - Slack notifications for plan/apply results
33
+
34
+ Task: Design the CI/CD pipeline for Terraform, compare the tool
35
+ options, show a complete workflow from code change to production
36
+ deployment, and address security considerations.
37
+
38
+ assertions:
39
+ - type: llm_judge
40
+ criteria: "Complete pipeline workflow is designed — code change → PR opened → automated pipeline: (1) terraform fmt -check (formatting), (2) terraform validate (syntax), (3) tfsec/checkov scan (security), (4) terraform plan (preview changes), (5) Infracost estimate (cost), (6) post results as PR comment. On merge to main: (7) terraform plan again (detect drift since PR), (8) approval gate (manual for prod), (9) terraform apply, (10) post-apply verification, (11) Slack notification. Environment promotion: dev auto-apply, staging auto-apply, prod manual approval"
41
+ weight: 0.35
42
+ description: "Pipeline workflow"
43
+ - type: llm_judge
44
+ criteria: "Tool comparison is practical — Atlantis: open-source, PR automation, self-hosted, lightweight. Best for: teams wanting simple PR-based workflow. GitHub Actions: flexible, native GitHub integration, custom workflows. Best for: teams already on GitHub wanting full control. Terraform Cloud: managed service, built-in Sentinel, cost estimation, team management. Best for: organizations wanting managed solution. Spacelift: multi-tool support, advanced policies, drift detection. Best for: enterprises with complex requirements. Recommendation depends on: team size, budget, compliance needs, multi-tool requirements"
45
+ weight: 0.35
46
+ description: "Tool comparison"
47
+ - type: llm_judge
48
+ criteria: "Security considerations are covered — credentials: use OIDC for cloud authentication (no static keys in CI). GitHub Actions: aws-actions/configure-aws-credentials with OIDC. State access: CI role has minimal permissions (plan role vs apply role). Secrets: never echo credentials, use GitHub encrypted secrets or Terraform Cloud variables. Branch protection: require PR reviews, no direct pushes to main. Environment protection: GitHub environments with required reviewers for prod. Audit: log all plan/apply with outputs. Network: CI runner in private network if accessing private resources"
49
+ weight: 0.30
50
+ description: "Security"