dojo.md 0.2.1 → 0.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (152)

package/courses/terraform-infrastructure-setup/scenarios/level-2/dependency-management.yaml ADDED

@@ -0,0 +1,80 @@
+meta:
+  id: dependency-management
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Manage resource dependencies — debug dependency graphs, resolve circular references, use depends_on correctly, and understand the DAG"
+  tags: [Terraform, dependencies, DAG, circular, depends-on, intermediate]
+state: {}
+trigger: |
+  Your infrastructure has a complex dependency chain that's causing
+  issues during apply:
+  ```hcl
+  resource "aws_iam_role" "lambda" {
+    name = "lambda-role"
+    assume_role_policy = jsonencode({...})
+  }
+  resource "aws_iam_role_policy" "lambda" {
+    role   = aws_iam_role.lambda.name
+    policy = jsonencode({
+      Statement = [{
+        Action   = "s3:GetObject"
+        Resource = aws_s3_bucket.data.arn
+      }]
+    })
+  }
+  resource "aws_lambda_function" "processor" {
+    function_name = "processor"
+    role          = aws_iam_role.lambda.arn
+    handler       = "index.handler"
+    runtime       = "nodejs18.x"
+    filename      = "lambda.zip"
+  }
+  resource "aws_s3_bucket_notification" "trigger" {
+    bucket = aws_s3_bucket.data.id
+    lambda_function {
+      lambda_function_arn = aws_lambda_function.processor.arn
+      events              = ["s3:ObjectCreated:*"]
+    }
+  }
+  resource "aws_lambda_permission" "s3" {
+    action        = "lambda:InvokeFunction"
+    function_name = aws_lambda_function.processor.function_name
+    principal     = "s3.amazonaws.com"
+    source_arn    = aws_s3_bucket.data.arn
+  }
+  ```
+  Error:
+  ```
+  Error: error creating S3 Bucket Notification: Unable to validate
+  the following destination configurations: Lambda function ARN
+  The Lambda function doesn't have permission to be invoked by S3 yet
+  (the permission resource hasn't been created).
+  ```
+  Task: Explain the Terraform dependency graph (DAG), implicit vs
+  explicit dependencies, how to debug dependency ordering issues,
+  terraform graph command, and depends_on best practices.
+assertions:
+  - type: llm_judge
+    criteria: "DAG and implicit dependencies are explained — Terraform builds a directed acyclic graph (DAG) from resource references. Implicit: when resource A uses resource B's attribute, A depends on B. The chain: role → role_policy (references role.name), role → lambda (references role.arn), bucket → notification (references bucket.id), lambda → notification (references lambda.arn). The error: s3_bucket_notification depends on lambda (implicit) but NOT on lambda_permission (no attribute reference). S3 notification tries to verify the Lambda ARN but permission doesn't exist yet"
+    weight: 0.35
+    description: "DAG explained"
+  - type: llm_judge
+    criteria: "The fix uses depends_on correctly — add depends_on = [aws_lambda_permission.s3] to the aws_s3_bucket_notification resource. This creates an explicit dependency where no implicit one exists. depends_on is needed because the notification resource doesn't reference any attribute of the permission resource, but the permission must exist for the notification to succeed. terraform graph: visualize dependencies with terraform graph | dot -Tsvg > graph.svg. Look for missing edges that represent real-world dependencies"
+    weight: 0.35
+    description: "Fix with depends_on"
+  - type: llm_judge
+    criteria: "depends_on best practices are covered — use depends_on sparingly: prefer implicit dependencies (reference attributes). depends_on forces sequential creation (reduces parallelism). Common scenarios needing depends_on: IAM permissions before resources that need them, DNS records before health checks, network resources before resources placed in them (when ID isn't directly referenced). Anti-pattern: depends_on everywhere 'just in case' — slows down apply. Use terraform graph to verify dependency order before adding depends_on. Module-level depends_on: depends_on on module blocks waits for entire module to complete"
+    weight: 0.30
+    description: "Best practices"
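
The fix described in the second assertion above can be sketched as follows. This is a minimal fragment, assuming it replaces the `aws_s3_bucket_notification.trigger` block from the scenario's trigger and that the other resources it references (`aws_s3_bucket.data`, `aws_lambda_function.processor`, `aws_lambda_permission.s3`) exist as in the scenario.

```hcl
resource "aws_s3_bucket_notification" "trigger" {
  bucket = aws_s3_bucket.data.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.processor.arn
    events              = ["s3:ObjectCreated:*"]
  }

  # Explicit edge in the graph: nothing in this block references an
  # attribute of the permission, so Terraform cannot infer the ordering.
  depends_on = [aws_lambda_permission.s3]
}
```

Running `terraform graph` before and after the change should show the new edge from the notification to the permission.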

package/courses/terraform-infrastructure-setup/scenarios/level-2/intermediate-debugging-shift.yaml ADDED

@@ -0,0 +1,66 @@
+meta:
+  id: intermediate-debugging-shift
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Combined intermediate shift — handle module errors, state corruption, and lifecycle issues during a busy infrastructure day"
+  tags: [Terraform, troubleshooting, combined, shift-simulation, intermediate]
+state: {}
+trigger: |
+  Three issues hit your team today:
+  Issue 1 — Module version conflict:
+  ```
+  $ terraform init -upgrade
+  Error: Failed to query available provider packages
+  Module "vpc" (source: terraform-aws-modules/vpc/aws, version 5.5.0)
+  requires aws provider >= 5.30.0, but the root module constrains
+  aws to ~> 4.0.
+  ```
+  The VPC module was upgraded but the root module still pins an old
+  provider version.
+  Issue 2 — State file corruption:
+  ```
+  $ terraform plan
+  Error: Failed to load state: unsupported state file format
+  The state file could not be loaded. Terraform detected that the
+  state file is not a supported format.
+  ```
+  Someone manually edited the state file and introduced a JSON syntax
+  error.
+  Issue 3 — Unexpected resource replacement:
+  ```
+  $ terraform plan
+  # aws_db_instance.main must be replaced
+  -/+ resource "aws_db_instance" "main" {
+      ~ engine_version = "14.9" -> "14.11" # forces replacement
+    }
+  ```
+  Updating the PostgreSQL minor version triggers a full database
+  replacement (destroy + create) instead of an in-place upgrade!
+  Task: Diagnose and resolve all three issues with proper explanations
+  and preventive measures.
+assertions:
+  - type: llm_judge
+    criteria: "Issue 1 (provider version conflict) is resolved — the module requires aws >= 5.30.0 but root constrains to ~> 4.0 (allows 4.x only). Fix options: (1) update root provider constraint to ~> 5.0 (breaking changes between v4 and v5 — review changelog), (2) pin module to an older version compatible with aws ~> 4.0, (3) gradually migrate by testing with aws ~> 5.0 in dev first. Prevention: use version ranges that allow updates (~> 5.0 not = 5.5.0), test module upgrades in non-prod first, terraform init -upgrade in CI to catch conflicts early"
+    weight: 0.35
+    description: "Version conflict"
+  - type: llm_judge
+    criteria: "Issue 2 (state corruption) is resolved — state file is JSON, manual edits can break it. Recovery: (1) if using S3 backend with versioning: restore previous version from S3 version history. (2) Otherwise restore a known-good backup copy (for example one saved earlier with terraform state pull) using terraform state push. (3) If no backup: terraform import all resources again (last resort). Prevention: NEVER manually edit state files — use terraform state mv, terraform state rm, and terraform import instead. Enable S3 versioning on state bucket. Set up automated state backups. Use DynamoDB locking to prevent concurrent writes"
+    weight: 0.35
+    description: "State corruption"
+  - type: llm_judge
+    criteria: "Issue 3 (unexpected replacement) is resolved — some RDS parameter changes force replacement instead of in-place modification. The engine_version change: check AWS provider docs for which attributes force replacement. Fix options: (1) set allow_major_version_upgrade = true when the change is a major-version upgrade, and apply_immediately = true to apply the change without waiting for the maintenance window, (2) check if the provider version has a bug (some versions incorrectly force replacement), (3) use lifecycle { ignore_changes = [engine_version] } and manage upgrades outside Terraform. Prevention: always review plan for -/+ changes, test infrastructure changes in dev before prod, pin exact engine versions and upgrade deliberately"
+    weight: 0.30
+    description: "Unexpected replacement"
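
Fix option (1) for Issue 1 above could look like the sketch below. It assumes the root module declares its provider requirements in a `terraform` block and that the team has reviewed the v4 to v5 changelog. The constraint `~> 5.30` is one possible choice that satisfies the module's `>= 5.30.0` requirement; it is not taken from the scenario.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Allows 5.30 and later 5.x releases, but not 6.0.
      # This replaces the old "~> 4.0" constraint named in the error.
      version = "~> 5.30"
    }
  }
}
```

After changing the constraint, `terraform init -upgrade` re-resolves the provider against both the root and module requirements.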

package/courses/terraform-infrastructure-setup/scenarios/level-2/lifecycle-rules.yaml ADDED

@@ -0,0 +1,51 @@
+meta:
+  id: lifecycle-rules
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Use lifecycle meta-arguments — configure create_before_destroy, prevent_destroy, ignore_changes, and replace_triggered_by for safe resource management"
+  tags: [Terraform, lifecycle, create-before-destroy, prevent-destroy, ignore-changes, intermediate]
+state: {}
+trigger: |
+  Three lifecycle issues in your infrastructure:
+  Issue 1 — Zero-downtime deployment:
+  Changing the launch template for an ASG causes recreation. During the
+  gap between destroy and create, no instances serve traffic:
+  ```
+  -/+ aws_launch_template.web (forces replacement)
+  ```
+  Issue 2 — Accidental database deletion:
+  A developer runs terraform destroy targeting a specific module but
+  accidentally includes the RDS instance:
+  ```
+  aws_db_instance.production: Destroying... [id=prod-db]
+  aws_db_instance.production: Destruction complete
+  ```
+  Issue 3 — Auto-scaling interference:
+  Terraform keeps resetting the ASG desired_capacity back to 2, undoing
+  auto-scaling that scaled to 5 during peak traffic:
+  ```
+  ~ desired_capacity = 5 -> 2
+  ```
+  Task: Explain all lifecycle meta-arguments, when to use each,
+  common pitfalls, and real-world patterns for safe resource management.
+assertions:
+  - type: llm_judge
+    criteria: "create_before_destroy is explained — lifecycle { create_before_destroy = true }: creates the new resource before destroying the old one. Fixes Issue 1: new launch template created first, ASG updated to use it, then old template destroyed. Use for: resources that must have zero downtime (load balancers, launch templates, security groups). Pitfall: some resources have unique constraints (names must be unique) — use name_prefix instead of name to allow both to exist simultaneously. Not all resources support this — some have global uniqueness requirements"
+    weight: 0.35
+    description: "create_before_destroy"
+  - type: llm_judge
+    criteria: "prevent_destroy and ignore_changes are explained — prevent_destroy = true: Terraform errors if any operation would destroy the resource. Fixes Issue 2: protects databases, S3 buckets, encryption keys. To actually destroy: remove prevent_destroy first, then destroy. ignore_changes = [desired_capacity]: tells Terraform to ignore changes to specific attributes. Fixes Issue 3: auto-scaling can change desired_capacity without Terraform reverting it. Use ignore_changes for: attributes managed by external systems (auto-scaling, external controllers). ignore_changes = all: ignore all attribute changes (resource only managed for creation)"
+    weight: 0.35
+    description: "prevent and ignore"
+  - type: llm_judge
+    criteria: "replace_triggered_by and patterns are covered — replace_triggered_by = [aws_ami.latest.id]: force replacement when a different resource changes. Use for: instances that should be recreated when AMI updates. Lifecycle patterns: databases always get prevent_destroy, ASGs get ignore_changes on desired_capacity, stateless resources get create_before_destroy for zero downtime. Pitfall: ignore_changes can mask drift — use sparingly and document why. Lifecycle rules are meta-arguments, not resource arguments — they go in the lifecycle {} block inside the resource"
+    weight: 0.30
+    description: "Patterns"
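
The three lifecycle patterns named in the assertions above can be sketched together. The resource names follow the scenario; the variables, sizes, and database settings are hypothetical values added only so the fragment is self-contained.

```hcl
# Hypothetical inputs, declared only so the sketch is self-contained.
variable "ami_id" { type = string }
variable "subnet_ids" { type = list(string) }
variable "db_password" {
  type      = string
  sensitive = true
}

# Issue 1: build the replacement template before the old one is destroyed.
resource "aws_launch_template" "web" {
  name_prefix   = "web-" # name_prefix avoids a name clash while both exist
  image_id      = var.ami_id
  instance_type = "t3.micro"

  lifecycle {
    create_before_destroy = true
  }
}

# Issue 3: let auto-scaling own desired_capacity after creation.
resource "aws_autoscaling_group" "web" {
  min_size            = 2
  max_size            = 10
  desired_capacity    = 2
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  lifecycle {
    ignore_changes = [desired_capacity]
  }
}

# Issue 2: any plan that would destroy this resource fails with an error.
resource "aws_db_instance" "production" {
  identifier        = "prod-db"
  engine            = "postgres"
  instance_class    = "db.t3.medium"
  allocated_storage = 50
  username          = "appadmin"
  password          = var.db_password

  lifecycle {
    prevent_destroy = true
  }
}
```

Note that `prevent_destroy` only protects the resource while the block remains in the configuration; deleting the whole resource block removes the guard as well.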

package/courses/terraform-infrastructure-setup/scenarios/level-2/locals-and-expressions.yaml ADDED

@@ -0,0 +1,58 @@
+meta:
+  id: locals-and-expressions
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Use locals and expressions effectively — simplify complex configurations with local values, for expressions, conditionals, and built-in functions"
+  tags: [Terraform, locals, expressions, functions, conditionals, intermediate]
+state: {}
+trigger: |
+  Your Terraform configuration has duplicated logic everywhere:
+  ```hcl
+  resource "aws_instance" "web" {
+    tags = {
+      Name        = "web-${var.environment}-${var.region}"
+      Environment = var.environment
+      Project     = var.project
+      ManagedBy   = "terraform"
+      CostCenter  = var.environment == "prod" ? "CC-100" : "CC-200"
+    }
+  }
+  resource "aws_s3_bucket" "data" {
+    tags = {
+      Name        = "data-${var.environment}-${var.region}"
+      Environment = var.environment
+      Project     = var.project
+      ManagedBy   = "terraform"
+      CostCenter  = var.environment == "prod" ? "CC-100" : "CC-200"
+    }
+  }
+  # Same tags repeated on 20 more resources...
+  ```
+  You also need to transform a list of subnet CIDRs into a map keyed
+  by availability zone, and conditionally create resources based on
+  environment.
+  Task: Explain locals (local values), Terraform expressions (for,
+  conditionals, splat), built-in functions (lookup, merge, concat,
+  flatten, try), and how to use them to simplify configurations.
+assertions:
+  - type: llm_judge
+    criteria: "Locals are explained — locals { common_tags = { Environment = var.environment, Project = var.project, ManagedBy = 'terraform', CostCenter = var.environment == 'prod' ? 'CC-100' : 'CC-200' } }. Reference: tags = merge(local.common_tags, { Name = 'web-server' }). Locals compute values once and reuse them. Use for: common tags, computed names, complex expressions used multiple times. Locals vs variables: variables are inputs from outside, locals are computed inside the configuration. Don't overuse locals — if a value is used once, just inline it"
+    weight: 0.35
+    description: "Locals"
+  - type: llm_judge
+    criteria: "Expressions are covered — for expression: [for s in var.subnets : s.cidr] (list), {for s in var.subnets : s.az => s.cidr} (map). Conditional: var.create_vpc ? 1 : 0 with count for conditional resource creation. Splat: aws_instance.web[*].id gets all IDs. Ternary: condition ? true_val : false_val. for with filtering: [for s in var.subnets : s.cidr if s.public]. String templates: 'Hello ${var.name}'. Heredoc: <<-EOT for multi-line strings"
+    weight: 0.35
+    description: "Expressions"
+  - type: llm_judge
+    criteria: "Built-in functions are practical — merge(): combine maps (merge(local.common_tags, local.extra_tags)). lookup(): safe map access with default (lookup(var.amis, var.region, 'ami-default')). concat(): join lists. flatten(): flatten nested lists. try(): return first non-error value (try(var.config.setting, 'default')). coalesce(): first non-null value. keys()/values(): extract from maps. length(): collection size. format/formatlist(): string formatting. cidrsubnet(): calculate subnet CIDRs. templatefile(): render template files. Type conversion: toset(), tomap(), tolist(), tonumber(), tostring()"
+    weight: 0.30
+    description: "Functions"
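
A minimal sketch of the refactor the assertions above describe, assuming a `subnets` variable shaped as a list of objects with `az` and `cidr` fields (that shape is an assumption; the scenario only mentions "a list of subnet CIDRs").

```hcl
variable "environment" { type = string }
variable "region" { type = string }
variable "project" { type = string }

# Assumed shape for the subnet list; not specified in the scenario.
variable "subnets" {
  type = list(object({
    az   = string
    cidr = string
  }))
}

locals {
  # Defined once, reused on every resource.
  common_tags = {
    Environment = var.environment
    Project     = var.project
    ManagedBy   = "terraform"
    CostCenter  = var.environment == "prod" ? "CC-100" : "CC-200"
  }

  # for expression producing a map keyed by availability zone.
  # Terraform rejects duplicate keys, so this assumes one subnet per AZ.
  subnet_cidr_by_az = { for s in var.subnets : s.az => s.cidr }
}

resource "aws_s3_bucket" "data" {
  # merge() layers the resource-specific Name tag over the shared tags.
  tags = merge(local.common_tags, {
    Name = "data-${var.environment}-${var.region}"
  })
}
```

Each of the remaining resources would then carry only its own `Name` tag inside `merge()`.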

package/courses/terraform-infrastructure-setup/scenarios/level-2/module-structure.yaml ADDED

@@ -0,0 +1,75 @@
+meta:
+  id: module-structure
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Design Terraform modules — create reusable module structure, handle input/output contracts, and debug module composition errors"
+  tags: [Terraform, modules, reusability, composition, inputs, intermediate]
+state: {}
+trigger: |
+  Your team copies and pastes VPC configuration across 8 environments.
+  Every copy has drifted slightly. You're tasked with creating a reusable
+  VPC module, but your first attempt has issues:
+  ```
+  modules/vpc/
+  ├── main.tf
+  ├── variables.tf
+  └── outputs.tf
+  ```
+  ```hcl
+  # modules/vpc/main.tf
+  resource "aws_vpc" "this" {
+    cidr_block = var.cidr_block
+    tags       = var.tags
+  }
+  resource "aws_subnet" "public" {
+    count             = length(var.public_subnets)
+    vpc_id            = aws_vpc.this.id
+    cidr_block        = var.public_subnets[count.index]
+    availability_zone = var.azs[count.index]
+  }
+  ```
+  ```hcl
+  # Root module calling it
+  module "vpc" {
+    source = "./modules/vpc"
+    # forgot to pass required variables
+  }
+  resource "aws_instance" "web" {
+    subnet_id = module.vpc.public_subnet_ids[0]
+  }
+  ```
+  Errors:
+  ```
+  Error: Missing required argument
+    The argument "cidr_block" is required, but no definition was found.
+  Error: Unsupported attribute
+    module.vpc.public_subnet_ids is object with no attribute "public_subnet_ids"
+  ```
+  Task: Explain Terraform module design, required vs optional variables
+  with defaults, output contracts between modules, module sources
+  (local, registry, git), and module composition best practices.
+assertions:
+  - type: llm_judge
+    criteria: "Module structure is explained — a module is a directory with .tf files: main.tf (resources), variables.tf (inputs), outputs.tf (exported values), versions.tf (required providers). Modules encapsulate reusable infrastructure. Required variables have no default — callers must provide them. The error: cidr_block has no default in the module, and the root module didn't pass it. Fix: module 'vpc' { source = './modules/vpc', cidr_block = '10.0.0.0/16', public_subnets = [...], azs = [...], tags = {...} }. Module outputs must be explicitly declared — the second error is because public_subnet_ids wasn't defined in outputs.tf"
+    weight: 0.35
+    description: "Module structure"
+  - type: llm_judge
+    criteria: "Module sources are covered — local: source = './modules/vpc'. Terraform Registry: source = 'terraform-aws-modules/vpc/aws', version = '~> 5.0'. Git: source = 'git::https://github.com/org/modules.git//vpc?ref=v1.0.0'. S3: source = 's3::https://bucket.s3.amazonaws.com/modules/vpc.zip'. Best practice: use versioned sources (git tags, registry versions) for stability. Local modules for project-specific code. Registry modules for common patterns (VPC, EKS, RDS). Pin versions: never use unversioned git sources in production"
+    weight: 0.35
+    description: "Module sources"
+  - type: llm_judge
+    criteria: "Composition best practices are practical — module contract: clearly document required vs optional inputs with descriptions. Use validation blocks on variables. Output everything consumers might need. Keep modules focused (one concern per module). Avoid deeply nested modules (max 2-3 levels). Module composition: root module calls multiple child modules, passes outputs between them. Example: module.vpc.vpc_id → module.eks.vpc_id. Don't put provider configuration in modules — pass it from root. Use for_each on modules to create multiple instances"
+    weight: 0.30
+    description: "Best practices"
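
Both errors in the scenario above trace back to the module's input and output contract. The sketch below shows one way to complete it; the defaults, CIDR ranges, availability zones, and tag values are illustrative assumptions, and the three sections belong in the files named in the comments.

```hcl
# modules/vpc/variables.tf
variable "cidr_block" {
  type        = string
  description = "CIDR range for the VPC (required, no default)"
}

variable "public_subnets" {
  type    = list(string)
  default = []
}

variable "azs" {
  type    = list(string)
  default = []
}

variable "tags" {
  type    = map(string)
  default = {}
}

# modules/vpc/outputs.tf
# Without this declaration the root module cannot read the subnet IDs.
output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}

# Root module
module "vpc" {
  source         = "./modules/vpc"
  cidr_block     = "10.0.0.0/16"
  public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  azs            = ["us-east-1a", "us-east-1b"]
  tags           = { Name = "example-vpc" }
}
```

With the output declared, `module.vpc.public_subnet_ids[0]` in the root module resolves as the scenario intended.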

package/courses/terraform-infrastructure-setup/scenarios/level-2/provisioner-pitfalls.yaml ADDED

@@ -0,0 +1,64 @@
+meta:
+  id: provisioner-pitfalls
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Avoid provisioner pitfalls — understand why provisioners are a last resort, debug connection failures, and use better alternatives"
+  tags: [Terraform, provisioners, remote-exec, local-exec, user-data, intermediate]
+state: {}
+trigger: |
+  A developer uses remote-exec to install software on EC2 instances:
+  ```hcl
+  resource "aws_instance" "web" {
+    ami           = "ami-0c55b159cbfafe1f0"
+    instance_type = "t3.micro"
+    key_name      = "deploy-key"
+    provisioner "remote-exec" {
+      inline = [
+        "sudo apt-get update",
+        "sudo apt-get install -y nginx",
+        "sudo systemctl enable nginx"
+      ]
+      connection {
+        type        = "ssh"
+        user        = "ubuntu"
+        private_key = file("~/.ssh/deploy-key.pem")
+        host        = self.public_ip
+      }
+    }
+  }
+  ```
+  Problems encountered:
+  ```
+  Error: timeout - last error: dial tcp 54.1.2.3:22:
+  connect: connection refused
+  Error: remote-exec provisioner error
+  Status: 100 (apt-get returned non-zero)
+  ```
+  The instance was created but the provisioner failed. Now the resource
+  is tainted and will be destroyed and recreated on next apply.
+  Task: Explain provisioners (local-exec, remote-exec, file), why
+  they're problematic, connection debugging, taint behavior on failure,
+  and better alternatives (user_data, Packer, Ansible).
+assertions:
+  - type: llm_judge
+    criteria: "Provisioner types and problems are explained — local-exec: runs command on the machine running Terraform. remote-exec: runs command on the created resource via SSH/WinRM. file: copies files to the resource. Problems: (1) not declarative — can't detect drift, (2) failure taints the resource (forces recreation), (3) non-idempotent (running twice may break), (4) SSH connection is fragile (timing, security groups, key issues), (5) Terraform can't model what provisioners do in state. The connection refused error: instance not ready yet when SSH attempted, or security group doesn't allow port 22"
+    weight: 0.35
+    description: "Provisioner problems"
+  - type: llm_judge
+    criteria: "Debugging and taint behavior are covered — connection debugging: (1) check security group allows SSH from Terraform runner, (2) instance may not be ready (cloud-init still running), (3) wrong SSH user for AMI (ubuntu vs ec2-user vs admin), (4) private key permissions or format issues. Taint on failure: when provisioner fails, Terraform marks the resource as tainted. Next apply: destroy and recreate the entire instance. This is wasteful and disruptive. on_failure = continue skips the error (resource not tainted but may be misconfigured). on_failure = fail (default) taints the resource"
+    weight: 0.35
+    description: "Debugging and taint"
+  - type: llm_judge
+    criteria: "Better alternatives are recommended — (1) user_data: cloud-init script runs on first boot, no SSH needed, retries handled by OS, idempotent with proper scripting. (2) Packer: build pre-configured AMIs with all software installed, Terraform just launches them. (3) Ansible/Chef/Puppet: proper configuration management tools run after Terraform creates infrastructure. (4) AWS Systems Manager Run Command: no SSH needed, agents already on Amazon AMIs. Best practice: Terraform creates infrastructure, other tools configure it. Use provisioners only as absolute last resort"
+    weight: 0.30
+    description: "Alternatives"
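
Alternative (1) above, `user_data`, might look like the following. It reuses the AMI ID and instance type from the scenario and assumes the AMI is an Ubuntu image that runs cloud-init; the script runs as root on first boot, so no SSH connection or `sudo` is needed.

```hcl
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  # Executed by cloud-init on first boot. The <<- heredoc form strips the
  # leading indentation, so the shebang lands at the start of the script.
  user_data = <<-EOT
    #!/bin/bash
    apt-get update
    apt-get install -y nginx
    systemctl enable nginx
  EOT
}
```

A failure inside the script does not taint the instance, because Terraform only sees the instance being created, so health checks or other monitoring are needed to catch a broken boot.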

package/courses/terraform-infrastructure-setup/scenarios/level-2/remote-state-backend.yaml ADDED

@@ -0,0 +1,55 @@
+meta:
+  id: remote-state-backend
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Configure remote state backend — set up S3+DynamoDB locking, migrate from local to remote state, and troubleshoot backend errors"
+  tags: [Terraform, state, backend, S3, DynamoDB, locking, intermediate]
+state: {}
+trigger: |
+  Your team has been using local state files. Problems are mounting:
+  - Two engineers ran apply simultaneously, corrupting the state
+  - A laptop crash lost the state file for the staging environment
+  - No one knows who changed what or when
+  You're migrating to S3 backend with DynamoDB locking. During migration:
+  ```
+  $ terraform init -migrate-state
+  Initializing the backend...
+  Do you want to copy existing state to the new backend?
+    Enter a value: yes
+  Error: Failed to save state
+  Error saving state: failed to upload state: AccessDenied:
+  Access Denied
+  ```
+  After fixing permissions, another error:
+  ```
+  Error: Error acquiring the state lock
+  Error message: ConditionalCheckFailedException
+  ```
+  Task: Explain remote state setup (S3 + DynamoDB), state migration
+  from local to remote, state locking mechanics, backend configuration
+  options, and how to share state across teams using terraform_remote_state.
+assertions:
+  - type: llm_judge
+    criteria: "S3 backend setup is complete — backend configuration: bucket (S3 bucket name), key (state file path), region, encrypt = true (SSE), dynamodb_table (for locking). Bootstrap: create S3 bucket with versioning enabled, create DynamoDB table with LockID partition key (String). The AccessDenied error: IAM policy needs s3:GetObject, s3:PutObject, s3:ListBucket on the bucket, plus dynamodb:GetItem, dynamodb:PutItem, dynamodb:DeleteItem on the lock table. State migration: terraform init -migrate-state copies local state to S3"
+    weight: 0.35
+    description: "Backend setup"
+  - type: llm_judge
+    criteria: "State locking mechanics are explained — DynamoDB locking: before any state-modifying operation, Terraform writes a lock record to DynamoDB with a unique ID, who holds it, and operation type. If lock exists, operation fails with ConditionalCheckFailedException. Lock released after operation completes. Stale locks: if process crashes, lock remains. Fix: terraform force-unlock <LOCK_ID> (only if holder process is confirmed dead). S3 versioning: enables state recovery if corruption occurs. terraform state pull to download, terraform state push to upload manually"
+    weight: 0.35
+    description: "Locking mechanics"
+  - type: llm_judge
+    criteria: "Cross-team state sharing is covered — terraform_remote_state data source reads another team's state outputs: data 'terraform_remote_state' 'network' { backend = 's3', config = { bucket = '...', key = 'network/terraform.tfstate' } }. Reference: data.terraform_remote_state.network.outputs.vpc_id. State file organization: separate state per environment and per team/service to limit blast radius. Pattern: s3://state-bucket/<env>/<service>/terraform.tfstate. IAM: restrict which teams can read/write which state files using S3 bucket policies"
+    weight: 0.30
+    description: "State sharing"
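
The backend settings and the `terraform_remote_state` lookup described in the assertions above can be sketched in one consuming configuration. The bucket name, table name, region, and key paths are hypothetical; the bucket and lock table must already exist before `terraform init`.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state" # hypothetical bucket name
    key            = "prod/app/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks" # table with a LockID string key
  }
}

# Read-only view of another team's outputs, stored under a different key.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "prod/network/terraform.tfstate"
    region = "us-east-1"
  }
}

output "vpc_id" {
  value = data.terraform_remote_state.network.outputs.vpc_id
}
```

Only values that the network configuration declares as root-level outputs are visible through `outputs`.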

package/courses/terraform-infrastructure-setup/scenarios/level-2/terraform-import.yaml ADDED

@@ -0,0 +1,55 @@
+meta:
+  id: terraform-import
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Import existing resources into Terraform — bring console-created infrastructure under IaC management without recreation"
+  tags: [Terraform, import, existing-resources, migration, IaC-adoption, intermediate]
+state: {}
+trigger: |
+  Your company has 50 AWS resources created via console that need to
+  come under Terraform management. You start importing:
+  ```
+  $ terraform import aws_instance.legacy i-0abc123def456
+  aws_instance.legacy: Importing from ID "i-0abc123def456"...
+  aws_instance.legacy: Import prepared!
+    Imported aws_instance.legacy
+  $ terraform plan
+  # aws_instance.legacy will be updated in-place
+  ~ resource "aws_instance" "legacy" {
+      ~ ami           = "ami-current123" -> "ami-wrong456"
+      ~ instance_type = "t3.large" -> "t3.micro"
+      ~ tags          = {
+          - "ManagedBy" = "console" -> null
+          + "Environment" = "prod"
+        }
+      # ... 20 more attribute changes
+    }
+  ```
+  The plan shows changes because your .tf file doesn't match the
+  actual resource configuration!
+  Task: Explain terraform import workflow, how to write matching
+  configuration after import, the new import block syntax (Terraform 1.5+),
+  bulk import strategies, and common import pitfalls.
+assertions:
+  - type: llm_judge
+    criteria: "Import workflow is explained — (1) write a resource block in .tf file, (2) run terraform import <address> <id>, (3) run terraform plan to see differences, (4) update .tf file to match actual configuration until plan shows no changes. The plan shows changes because the .tf config was written with incorrect values — must match reality exactly. Use terraform state show aws_instance.legacy to see imported attributes and copy them to .tf file. Iterate: import → plan → fix config → plan again until clean"
+    weight: 0.35
+    description: "Import workflow"
+  - type: llm_judge
+    criteria: "Import block syntax (1.5+) is covered — import { to = aws_instance.legacy, id = 'i-0abc123def456' }. Benefits over CLI import: (1) can be code-reviewed in PRs, (2) terraform plan -generate-config-out=generated.tf generates matching configuration automatically, (3) supports for_each for bulk imports (Terraform 1.7+). The generated config is a starting point — review and clean up. Import blocks can be removed after a successful apply. This is the recommended approach for Terraform 1.5+"
+    weight: 0.35
+    description: "Import blocks"
+  - type: llm_judge
+    criteria: "Bulk import and pitfalls are practical — bulk strategy: (1) inventory all resources (AWS Config, resource explorer), (2) write import blocks for all, (3) generate config, (4) review and customize, (5) apply. Pitfalls: (1) some resources don't support import (check provider docs), (2) imported resources may reference other non-imported resources, (3) secrets in imported state (passwords, keys) — rotate after import, (4) complex resources (EKS, RDS) may need manual config adjustment. Tools: terraformer for auto-generating both config and import commands from existing cloud infrastructure"
+    weight: 0.30
+    description: "Bulk import"
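
The import block form described above, written out with the address and instance ID from the scenario. This is a sketch assuming Terraform 1.5 or later and that no `resource "aws_instance" "legacy"` block exists yet, since configuration generation only applies to resources without existing configuration.

```hcl
# imports.tf
import {
  to = aws_instance.legacy
  id = "i-0abc123def456"
}
```

Running `terraform plan -generate-config-out=generated.tf` then writes a candidate resource block to `generated.tf`, which should be reviewed and trimmed before `terraform apply`.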

package/courses/terraform-infrastructure-setup/scenarios/level-2/workspace-management.yaml ADDED

@@ -0,0 +1,51 @@
+meta:
+  id: workspace-management
+  level: 2
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Manage Terraform workspaces — understand workspace isolation, limitations, and when to use workspaces vs directory-based environments"
+  tags: [Terraform, workspaces, environments, isolation, multi-env, intermediate]
+state: {}
+trigger: |
+  Your team uses workspaces for environment separation:
+  ```
+  $ terraform workspace list
+    default
+    dev
+    staging
+  * prod
+  $ terraform workspace select staging
+  Switched to workspace "staging".
+  $ terraform plan
+  # Accidentally planning against staging with prod-sized resources
+  # because terraform.tfvars has prod values!
+  ```
+  A developer selected the wrong workspace and applied prod-sized
+  (expensive) resources to staging. Monthly bill spiked by $5K.
+  Another issue: you can't use different provider versions for dev
+  vs prod because all workspaces share the same root module.
+  Task: Explain Terraform workspaces, their proper use cases,
+  limitations, the terraform.workspace variable, and compare
+  workspaces vs directory-based environment separation.
+assertions:
+  - type: llm_judge
+    criteria: "Workspaces are explained — workspaces create separate state files within the same backend. terraform workspace new dev creates a new workspace. State stored at: env:/dev/terraform.tfstate. terraform.workspace variable returns current workspace name — use for conditional logic: locals { instance_type = terraform.workspace == 'prod' ? 't3.large' : 't3.micro' }. Workspaces share the same code, providers, and backend config. The accident: wrong workspace selected, wrong tfvars applied because workspaces don't enforce which tfvars to use"
+    weight: 0.35
+    description: "Workspaces explained"
+  - type: llm_judge
+    criteria: "Limitations are clearly stated — workspaces are NOT ideal for environment separation because: (1) same code for all environments (can't have different provider versions), (2) easy to apply to wrong workspace (no guardrails), (3) shared backend makes cross-environment blast radius larger, (4) terraform.workspace checks scattered through code make it hard to read. Proper use cases: testing infrastructure changes (ephemeral workspaces), feature branch environments, temporary environments for demos. NOT for: long-lived dev/staging/prod separation"
+    weight: 0.35
+    description: "Limitations"
+  - type: llm_judge
+    criteria: "Directory-based alternative is recommended — separate directories per environment: environments/dev/, environments/staging/, environments/prod/. Each has its own backend config, tfvars, and can pin different module/provider versions. Benefits: complete isolation, clear boundaries, can't accidentally apply to wrong environment, independent state. Shared logic through modules: environments call common modules with environment-specific variables. Terraform Cloud workspaces are different from CLI workspaces — they provide true isolation with separate runs, variables, and permissions"
+    weight: 0.30
+    description: "Directory alternative"
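
The contrast these assertions draw between workspace conditionals and directory-based separation can be sketched as follows. This is a minimal illustration, not part of the package: the `main.tf` file name, the module path `../../modules/app`, and the module inputs are hypothetical, and only the `locals` expression comes from the assertion text.

```hcl
# Option A: workspace-driven sizing, as quoted in the first assertion.
# Every environment shares this file; only terraform.workspace differs.
locals {
  instance_type = terraform.workspace == "prod" ? "t3.large" : "t3.micro"
}

# Option B: directory-based layout, e.g. environments/prod/main.tf.
# The module path and input names below are hypothetical.
module "app" {
  source = "../../modules/app"

  environment   = "prod"
  instance_type = "t3.large"
}
```

In Option B the environment is fixed by the directory the command runs in, so nothing in the code depends on which workspace happens to be selected.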

package/courses/terraform-infrastructure-setup/scenarios/level-3/advanced-debugging-shift.yaml ADDED

@@ -0,0 +1,63 @@
+meta:
+  id: advanced-debugging-shift
+  level: 3
+  course: terraform-infrastructure-setup
+  type: output
+  description: "Combined advanced shift — handle state surgery, cross-account issues, and drift remediation in a single complex incident"
+  tags: [Terraform, troubleshooting, combined, shift-simulation, advanced]
+state: {}
+trigger: |
+  Three interconnected issues escalate during your on-call shift:
+  Issue 1 — State divergence after team split:
+  Team A split their resources from the shared state last week using
+  state mv. But they missed moving aws_iam_policy.shared, which is
+  now in both state files:
+  ```
+  Team A state: contains aws_iam_policy.shared
+  Team B state: contains aws_iam_policy.shared (original)
+  Team A runs apply → modifies the policy
+  Team B runs apply → overwrites Team A's changes
+  ```
+  The same resource is managed by two state files!
+  Issue 2 — Cross-account module failure:
+  A module deployed to the DR account references resources in the
+  primary account via terraform_remote_state:
+  ```
+  Error: error reading S3 bucket: AccessDenied
+  ```
+  The DR account's Terraform role doesn't have cross-account
+  permission to read the primary account's state bucket.
+  Issue 3 — Drift from security incident response:
+  The security team disabled public access on 15 S3 buckets during
+  an incident. Now terraform plan shows 15 buckets will have public
+  access re-enabled (reverting the security fix):
+  ```
+  ~ resource "aws_s3_bucket_public_access_block" "data" {
+      ~ block_public_acls       = true -> false
+      ~ block_public_policy     = true -> false
+    }
+  ```
+  Task: Resolve all three issues with proper procedures and
+  establish preventive measures.
+assertions:
+  - type: llm_judge
+    criteria: "Issue 1 (dual-managed resource) is resolved: a resource tracked in two state files is extremely dangerous because both configurations try to manage it and the last writer wins. Fix: (1) immediately run terraform state rm aws_iam_policy.shared against Team A's state (they don't own it) and remove the matching resource block from Team A's configuration so the next apply does not try to recreate it. (2) If Team A needs its own policy: create a separate policy for Team A. (3) If the policy is shared: move it to a shared-infrastructure state that both teams read via terraform_remote_state. Prevention: during state splits, verify that no resource address appears in more than one state by running terraform state list against every state, sorting the output (terraform state list | sort), and comparing the lists"
+    weight: 0.35
+    description: "Dual-managed resource"
+  - type: llm_judge
+    criteria: "Issue 2 (cross-account state access) is resolved: the DR account's Terraform role needs read access to the primary account's state bucket. Fix: (1) add a bucket policy on the primary account's state bucket that allows the DR account's role to call s3:GetObject on the state object (the S3 backend typically also needs s3:ListBucket on the bucket), (2) or configure assume_role in the terraform_remote_state data source so it assumes a role in the primary account. Alternative: use Terraform Cloud with cross-workspace state sharing (no S3 permissions needed). Prevention: when setting up multi-account infrastructure, plan state access patterns upfront and document the IAM requirements"
+    weight: 0.35
+    description: "Cross-account access"
+  - type: llm_judge
+    criteria: "Issue 3 (security drift) is resolved: the security fix was correct and the Terraform code is out of date. Fix: update all 15 S3 bucket configurations to set block_public_acls = true and block_public_policy = true. This aligns the code with the desired secure state. Do NOT apply the current plan (it would revert the security fix). Workflow: (1) update the .tf files, (2) run plan and verify it shows no changes, (3) commit the code. Prevention: security changes should always be reflected in Terraform code. Establish a process: the security team opens a PR with the Terraform changes instead of only modifying resources in the console"
+    weight: 0.30
+    description: "Security drift"
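
The fixes described for Issue 2 and Issue 3 might look roughly like the sketch below. It is illustrative only: the bucket name, state key, region, account ID, and role name are placeholders, and the `aws_s3_bucket.data` reference assumes a bucket resource with that name exists in the configuration. The nested `assume_role` object assumes a recent Terraform release; older releases took `role_arn` directly in `config`.

```hcl
# Issue 2: read the primary account's state by assuming a role there.
# Bucket, key, region, and role ARN are placeholders.
data "terraform_remote_state" "primary" {
  backend = "s3"

  config = {
    bucket = "primary-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"

    assume_role = {
      role_arn = "arn:aws:iam::111111111111:role/terraform-state-reader"
    }
  }
}

# Issue 3: bring the code in line with the security team's change.
resource "aws_s3_bucket_public_access_block" "data" {
  bucket = aws_s3_bucket.data.id # assumes a bucket resource named "data"

  block_public_acls   = true
  block_public_policy = true
}
```

Once the second block matches what the security team set by hand, `terraform plan` should report no changes for these two attributes.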