cem_acpt 0.11.2 → 0.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (36)

checksums.yaml +4 -4
data/Gemfile.lock +1 -1
data/README.md +93 -0
data/docs/ARCHITECTURE.md +10 -16
data/exe/cem_acpt_scan +16 -0
data/lib/cem_acpt/cli.rb +24 -0
data/lib/cem_acpt/config/base.rb +10 -3
data/lib/cem_acpt/config/cem_acpt_scan.rb +112 -0
data/lib/cem_acpt/config.rb +1 -0
data/lib/cem_acpt/platform/gcp.rb +6 -9
data/lib/cem_acpt/provision/terraform/linux.rb +52 -6
data/lib/cem_acpt/provision/terraform.rb +147 -10
data/lib/cem_acpt/scan/daemon_client.rb +91 -0
data/lib/cem_acpt/scan/errors.rb +44 -0
data/lib/cem_acpt/scan/result.rb +89 -0
data/lib/cem_acpt/scan.rb +17 -0
data/lib/cem_acpt/test_data.rb +59 -3
data/lib/cem_acpt/test_runner/log_formatter/scan_result_formatter.rb +72 -0
data/lib/cem_acpt/test_runner/log_formatter.rb +4 -1
data/lib/cem_acpt/test_runner.rb +103 -5
data/lib/cem_acpt/version.rb +1 -1
data/lib/cem_acpt.rb +18 -0
data/lib/terraform/gcp/linux/main.tf +129 -1
data/lib/terraform/gcp/linux/scan/scan_service.rb +148 -0
data/lib/terraform/gcp/linux/scan/scan_service.service +12 -0
data/lib/terraform/gcp/windows/main.tf +1 -1
data/lib/terraform/image/gcp/linux/main.tf +1 -1
data/specifications/CEM-6511.md +286 -0
data/specifications/CEM-6720.md +187 -0
data/specifications/CEM-6759.md +168 -0
data/specifications/CEM-6760.md +120 -0
data/specifications/CEM-6761.md +136 -0
data/specifications/CEM-6762.md +163 -0
data/specifications/CEM-6765.md +101 -0
metadata +23 -4
data/lib/cem_acpt/provision.rb +0 -20

data/specifications/CEM-6511.md ADDED

@@ -0,0 +1,286 @@
+# CEM-6511 — Automated benchmark scans (`cem_acpt_scan`)
+## Summary
+Add a third binary, `cem_acpt_scan`, alongside `cem_acpt` and `cem_acpt_image`. It provisions Linux test nodes the same way the acceptance-test runner does, applies the acceptance-test case's `manifest.pp`, then runs a real benchmark scan against the node — OpenSCAP for STIG profiles, CIS-CAT Pro for CIS profiles — instead of Goss assertions. The scan score is compared against a configurable per-test threshold; an exit code is returned accordingly. The motivation is AI-enablement: agents need a way to validate compliance changes against the actual benchmark, not just developer-authored Goss tests.
+## Functional behavior
+### 1. Entry point and CLI
+A new shim `exe/cem_acpt_scan` is added — a 16-line copy of `exe/cem_acpt` that calls `CemAcpt::Cli.parse_opts_for(:cem_acpt_scan)` and `CemAcpt.run(:cem_acpt_scan, ...)`. The gemspec auto-registers it via `bindir = 'exe'` (`cem_acpt.gemspec:26-27`); no gemspec edit is required.
+`CemAcpt.run` (`lib/cem_acpt.rb:24-40`) gains a third arm, `when :cem_acpt_scan`, that calls a new private `run_cem_acpt_scan(options)` mirroring `run_cem_acpt`'s shape — build config, `initialize_logger!`, instantiate `CemAcpt::TestRunner::Runner` in scan mode, exit with `runner.exit_code`. `new_config` (`lib/cem_acpt.rb:44-53`) gains a matching arm that returns `Config::CemAcptScan`.
+`Cli.parse_opts_for` (`lib/cem_acpt/cli.rb:27-95`) gains a `when :cem_acpt_scan` block adding scan-specific flags. The shared block (lines 97-185) is unchanged — `--config`, `--CI`, `--quiet`, `--verbose`, `--log-file`, `--trace`, `-Y`, `-X`, `-V` are inherited automatically.
+Scan-specific flags:
+| Flag | Effect |
+| --- | --- |
+| `-m, --module-dir DIR` | Path to consuming module root. Same semantics as `cem_acpt`. |
+| `-t, --tests TESTS` | Comma-separated list of acceptance-test case names to scan. |
+| `-D, --no-destroy-nodes` | Same semantics as `cem_acpt`. |
+| `--scanner SCANNER` | Override scanner choice. One of `openscap`, `ciscat`. Default: derived from the test-case framework (`stig_*` → openscap, `cis_*` → ciscat). |
+| `--threshold N` | Override the global pass/fail score threshold (float, 0–100). Default: from config. |
+| `--scan-output FILE` | Save normalized JSON report to `FILE`. For multi-case runs, fan out to `FILE.<test_case_name>.json`. Without this flag, JSON is printed to stdout only. |
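The flag table above could be wired up roughly as follows. This is a minimal sketch using Ruby's standard `OptionParser`, not the gem's actual `Cli.parse_opts_for` code: the helper name `parse_scan_flags`, the option hash keys, and the explicit 0-100 range check are assumptions made for illustration, and the shared flags are left out.

```ruby
require 'optparse'

# Hypothetical helper: parses only the scan-specific flags from the table.
# The real Cli.parse_opts_for also adds the shared flags, omitted here.
def parse_scan_flags(argv)
  options = { destroy_nodes: true }
  parser = OptionParser.new do |opts|
    opts.on('-m', '--module-dir DIR', 'Path to consuming module root') { |v| options[:module_dir] = v }
    # The Array acceptor splits the argument on commas.
    opts.on('-t', '--tests TESTS', Array, 'Comma-separated test case names') { |v| options[:tests] = v }
    opts.on('-D', '--no-destroy-nodes', 'Keep nodes after the run') { options[:destroy_nodes] = false }
    # Passing a list restricts the accepted values.
    opts.on('--scanner SCANNER', %w[openscap ciscat], 'Override scanner choice') { |v| options[:scanner] = v }
    opts.on('--threshold N', Float, 'Pass/fail score threshold (0-100)') do |v|
      # Assumed validation: reject values outside the documented range.
      raise OptionParser::InvalidArgument, v.to_s unless (0.0..100.0).cover?(v)

      options[:threshold] = v
    end
    opts.on('--scan-output FILE', 'Save normalized JSON report to FILE') { |v| options[:scan_output] = v }
  end
  parser.parse(argv)
  options
end
```

Called as `parse_scan_flags(%w[-t cis_a,stig_b --threshold 75.5])`, the sketch returns a hash with `:tests` as an array and `:threshold` as a float; an unknown `--scanner` value raises `OptionParser::InvalidArgument`.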
+### 2. Config schema
+A new `lib/cem_acpt/config/cem_acpt_scan.rb` defines `CemAcpt::Config::CemAcptScan < Base` following the `Config::CemAcpt` and `Config::CemAcptImage` pattern (`lib/cem_acpt/config/cem_acpt.rb:8-81`):
+```ruby
+module CemAcpt
+  module Config
+    class CemAcptScan < Base
+      VALID_KEYS = %i[
+        cem_acpt_scan
+        module_dir
+        node_data
+        test_data
+        tests
+      ].freeze
+      def env_var_prefix
+        'CEM_ACPT_SCAN'
+      end
+      def defaults
+        {
+          cem_acpt_scan: {
+            scanner: nil,                       # nil = auto-detect from framework
+            threshold: 80.0,                    # global default pass threshold
+            test_thresholds: {},                # { 'cis_rhel-8_firewalld_server_2' => 75.0 }
+            scan_output: nil,                   # nil = stdout only
+            cis_cat_pro_source: nil,            # local path or gs:// URI; required for ciscat scans
+            cis_cat_pro_license: nil,           # local path or gs:// URI; required for ciscat scans.
+                                                # Separate from cis_cat_pro_source so license rotation
+                                                # does not require rebuilding the assessor bundle.
+            daemon: { port: 8081, ready_timeout: 60 },
+            profiles: {
+              openscap: {
+                # 'stig_rhel-8_firewalld_v1r12' => 'xccdf_org.ssgproject.content_profile_stig'
+              },
+              cis_cat: {
+                # 'cis_rhel-8_firewalld_server_2' => 'xccdf_org.cisecurity.benchmarks_profile_Level_2_-_Server'
+              },
+            },
+          },
+          module_dir: Dir.pwd,
+          node_data: {},
+          test_data: { for_each: { collection: %w[puppet8] } },
+          tests: [],
+          # ... shared defaults inherited from Base via merge
+        }
+      end
+    end
+  end
+end
+```
+`lib/cem_acpt/config.rb` gets a new `require_relative 'config/cem_acpt_scan'`. The new schema is registered via the same `add_static_options!` / `valid_keys` / `validate_config!` machinery in `Config::Base` — no changes to `base.rb` are needed.
+### 3. Runner reuse with a new `:scan` action group
+`cem_acpt_scan` reuses `CemAcpt::TestRunner::Runner` (`lib/cem_acpt/test_runner.rb:50-94`) — the lifecycle (`pre_provision_test_nodes` → `provision_test_nodes` → `run_tests` → `clean_up`) is shared. Only `configure_actions` (`test_runner.rb:120-143`) diverges based on which command instantiated the runner.
+The runner gains a tiny dispatch in `configure_actions`:
+```ruby
+def configure_actions
+  if config.is_a?(CemAcpt::Config::CemAcptScan)
+    configure_scan_actions
+  else
+    configure_acceptance_actions   # the existing :goss + :bolt block
+  end
+end
+```
+`configure_scan_actions` registers a single sync `:scan` action group that, per host, hits the on-node scan daemon at `http://<host>:<config.daemon.port>/scan`, parses the response, builds a normalized result object, appends it to `@results`, and writes the JSON output (stdout always; `--scan-output FILE` if set).
+### 4. On-node scanner daemon
+`Provision::Linux` gains `scan_provision_commands` (`lib/cem_acpt/provision/terraform/linux.rb`), a sibling of the existing `provision_commands` selected by a new `scan_mode:` keyword on `provision_commands_wrapper`. In scan mode the Goss install and Goss systemd units are skipped — only the scan-daemon machinery is installed. The on-node commands are:
+1. Install the consuming Puppet module (same `puppet module install` as acceptance mode).
+2. Install `webrick` via `puppet gem install` and start `log_service.rb` (existing logging pattern, reused).
+3. Install OpenSCAP via `install_scanner_packages_command`: `sudo dnf install -y openscap-scanner scap-security-guide || sudo apt-get install -y libopenscap8 ssg-base ssg-debderived`. Installed unconditionally — the OR-chain succeeds on whichever package manager is present, regardless of which scanner the test case resolves to.
+4. Install a headless Java runtime via `install_java_command`: `sudo dnf install -y java-11-openjdk-headless || sudo apt-get install -y default-jre-headless`. Installed unconditionally — CIS-CAT Pro's `Assessor-CLI.sh` is a Java app, and conditionalizing on the resolved scanner is not worth the wiring.
+5. Drop `/opt/cem_acpt/scan/scan_service.rb` (a small WEBrick server analogous to `log_service.rb`), install its systemd unit (`scan_service.service`), then run `daemon-reload`, start, and enable the unit. The service exposes:
+   - `GET /health` — returns 200 once the daemon is up. Polled by the host until ready.
+   - `GET /scan` — reads scanner + profile + level + datastream from `/opt/cem_acpt/scan/scan_config.json` (written at provision time by the `scan_config_upload` null_resource — see §5), invokes the chosen scanner CLI synchronously, returns normalized JSON. The CIS-CAT Pro path expects the assessor at `/opt/cis-cat-pro/Assessor-CLI.sh` and the license bundle's contents at `/opt/cis-cat-pro/license/` — both placed by the `cis_cat_pro_upload` null_resource.
+6. Run `puppet apply` on the test-case manifest (existing `apply_command`).
+The CIS-CAT Pro bundle is **not** extracted as part of these commands. Extraction lives inside the `cis_cat_pro_upload` null_resource (§5) so the assessor tarball is guaranteed to be on disk before extraction runs — a previous iteration did extraction here and hit a race where the file did not yet exist.
+This mirrors the existing Goss pattern: install at provision, run as a systemd service, query over HTTP from the host. The difference is the source of truth for scan parameters — a state file uploaded by Terraform — rather than baked-in assertions.
+### 5. Terraform template reuse
+`lib/terraform/gcp/linux/main.tf` is reused as-is — no `lib/terraform/scan/` fork. The `node_data` object's schema gains five optional fields (defaulting to empty strings so acceptance-mode runs are unaffected):
+- `cis_cat_pro_bundle` (local host path) — assessor archive to upload.
+- `cis_cat_pro_format` — `"tar.gz"` or `"zip"`, detected from the source extension by `Provision::Terraform#archive_format`.
+- `cis_cat_pro_license` (local host path) — license archive to upload.
+- `cis_cat_pro_license_format` — `"tar.gz"` or `"zip"`, detected from the license source extension using the same `archive_format` helper.
+- `scan_config_json` (rendered JSON string) — the per-node scan parameters consumed by the on-node daemon.
+Two new `null_resource` blocks ship the scan-mode artifacts to the node. Each filters with `for_each` over `var.node_data` so it is a no-op for any node whose relevant field is empty — acceptance-mode runs do not require any of these resources to even plan.
+**`null_resource.scan_config_upload`** (filtered on `scan_config_json != ""`) — opens a fresh SSH connection to the instance, `mkdir -p /opt/cem_acpt/scan` and `chown ${var.username}:${var.username}` so the subsequent `provisioner "file"` can write without sudo, then drops `scan_config.json` with `content =` (no source file). Triggers on `google_compute_instance.acpt-test-node[each.key].id` so it re-runs whenever the instance is replaced.
+**`null_resource.cis_cat_pro_upload`** (filtered on `cis_cat_pro_bundle != ""`) — opens a fresh SSH connection and runs four provisioners *in declared order*. Provisioner execution within a single resource is sequential, which is the ordering guarantee we rely on for both the assessor extraction and the license extraction:
+1. `provisioner "file"` — upload the assessor archive to `/opt/cem_acpt/cis-cat-pro.${cis_cat_pro_format}`.
+2. `provisioner "remote-exec"` — extract the assessor into `/opt/cis-cat-pro/`. Ternary on `cis_cat_pro_format`: `tar.gz` uses `tar -xzf … --strip-components=1`; `zip` installs `unzip` (OR-chained `dnf || apt-get`), stages into a temp dir, then `shopt -s dotglob && mv */* /opt/cis-cat-pro/` to flatten the vendor's single top-level directory (including dotfiles). `chmod +x Assessor-CLI.sh`.
+3. `provisioner "file"` — upload the license archive to `/opt/cem_acpt/cis-cat-pro-license.${cis_cat_pro_license_format}`.
+4. `provisioner "remote-exec"` — `sudo mkdir -p /opt/cis-cat-pro/license/` (the vendor bundle does not always ship this directory), then extract into it. Ternary on `cis_cat_pro_license_format`: `tar.gz` uses `tar -xzf … -C /opt/cis-cat-pro/license/` (no `--strip-components` — the vendor's archive contents are dropped in as-packaged, including any subdirectories); `zip` uses `unzip -q -o … -d /opt/cis-cat-pro/license/`. Permissions on the extracted contents are left at the vendor's archive defaults — to be revisited after the first end-to-end scan confirms what the assessor actually wants.
+`Provision::Terraform#provision_node_data` (`lib/cem_acpt/provision/terraform.rb`) populates the five scan-mode fields when `Config::CemAcptScan` is the active config and the resolved scanner for a given node is `ciscat`. Two helper methods handle source resolution:
+- `resolve_cis_cat_pro_source!` — local paths pass through via `File.expand_path`; `gs://` URIs are pulled to `working_dir` with `gcloud storage cp`. The cached filename preserves the source extension so `archive_format` works downstream.
+- `resolve_cis_cat_pro_license!` — same shape, applied to `cem_acpt_scan.cis_cat_pro_license`. Raises `CemAcpt::Scan::LicenseNotFoundError` if unset when any test case resolves to ciscat.
+The host-side `archive_format` helper handles both source and license — `.tar.gz`, `.tgz`, `.zip` accepted (case-insensitive); anything else raises before provisioning starts.
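The extension check described above might look like the sketch below. It is a hypothetical stand-in for `Provision::Terraform#archive_format`: the spec only says which extensions are accepted and that the resulting format is `"tar.gz"` or `"zip"`, so mapping `.tgz` to `"tar.gz"` and raising `ArgumentError` are assumptions.

```ruby
# Hypothetical stand-in for Provision::Terraform#archive_format.
# Returns 'tar.gz' or 'zip'; the error class is an assumption.
def archive_format(path)
  case File.basename(path).downcase # downcase makes the match case-insensitive
  when /\.tar\.gz\z/, /\.tgz\z/ then 'tar.gz'
  when /\.zip\z/ then 'zip'
  else
    raise ArgumentError, "Unsupported archive format: #{path} (expected .tar.gz, .tgz, or .zip)"
  end
end
```

Because the same helper serves both the assessor bundle and the license archive, a bad extension on either one fails on the host before any Terraform call is made.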
+### 6. Acceptance-test discovery in scan mode
+`CemAcpt::TestData::Fetcher` (`lib/cem_acpt/test_data.rb`) is updated so the discovery predicate accommodates scan mode:
+- `find_acceptance_tests!` (`test_data.rb:72-77`) selects directories that contain `manifest.pp`. The current `goss.yaml` predicate becomes specific to acceptance mode.
+- The per-test preflight checks (`test_data.rb:45-46`) require `manifest.pp` always; `goss.yaml` is required only in acceptance mode.
+- The mode is derived from the config class (`config.is_a?(Config::CemAcptScan)` → scan mode).
+Per-test scanner profile/level resolution:
+1. Parse the dir name with the existing `name_pattern_vars` regex (`framework`, `image_fam`, `image_version`, `firewall`, `profile`, `level`).
+2. Resolve `framework` to a scanner: `cis` → `ciscat`, `stig` → `openscap`. Overridable via `--scanner`.
+3. Look up the scanner profile id in `cem_acpt_scan.profiles.<scanner>` keyed by full dir name. If missing, fail fast with a clear error pointing at the config key.
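The three resolution steps above can be sketched as follows. This is an illustration under stated assumptions, not the gem's code: the framework is taken from the text before the first underscore rather than from the real `name_pattern_vars` regex, `ProfileNotFoundError` is a local stand-in for `CemAcpt::Scan::ProfileNotFoundError`, and `resolve_scan` is a hypothetical name. Note that the scanner is called `ciscat` while its profile map lives under the `cis_cat` config key.

```ruby
# Local stand-in for CemAcpt::Scan::ProfileNotFoundError.
class ProfileNotFoundError < StandardError; end

SCANNER_FOR_FRAMEWORK = { 'cis' => 'ciscat', 'stig' => 'openscap' }.freeze
# Scanner name => key under cem_acpt_scan.profiles.
PROFILE_KEY_FOR_SCANNER = { 'ciscat' => :cis_cat, 'openscap' => :openscap }.freeze

# Hypothetical helper: resolves scanner and profile id for one test-case dir name.
def resolve_scan(test_case, scan_config, override: nil)
  # Assumption: simplified parsing; the real code uses the name_pattern_vars regex.
  framework = test_case.split('_', 2).first
  scanner = override || SCANNER_FOR_FRAMEWORK.fetch(framework) do
    raise ArgumentError, "No scanner known for framework '#{framework}'"
  end
  key = PROFILE_KEY_FOR_SCANNER.fetch(scanner)
  profile = scan_config.dig(:profiles, key, test_case)
  raise ProfileNotFoundError, "Missing cem_acpt_scan.profiles.#{key}['#{test_case}']" if profile.nil?

  { scanner: scanner, profile: profile }
end
```

Raising with the full config path in the message matches the spec's requirement that a lookup miss points the operator at the key to add.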
+### 7. Score evaluation and exit code
+After all `:scan` actions finish, `Runner#process_test_results` evaluates each result against its threshold:
+- `cem_acpt_scan.test_thresholds[<test_case_name>]` if set, otherwise `cem_acpt_scan.threshold`.
+- A run passes if every test case scored ≥ its threshold. Otherwise `runner.exit_code = 1`.
+The pass/fail line per test case is logged through the existing `logger.info`/`logger.error` channels and renders correctly under both `--CI` and plain modes (`lib/cem_acpt/logging/formatter.rb:75-94`).
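A minimal sketch of the threshold rule in this section, assuming results are plain hashes with `:test_case` and `:score` keys and that the `cem_acpt_scan` config section is a symbol-keyed hash. The helper names `threshold_for` and `exit_code_for` are hypothetical and do not come from `Runner#process_test_results`.

```ruby
# Per-test override if present, otherwise the global threshold.
def threshold_for(test_case, scan_config)
  scan_config.fetch(:test_thresholds, {}).fetch(test_case, scan_config.fetch(:threshold))
end

# 0 when every test case meets its threshold, 1 otherwise.
def exit_code_for(results, scan_config)
  results.all? { |r| r[:score] >= threshold_for(r[:test_case], scan_config) } ? 0 : 1
end
```

With `threshold: 80.0` and `test_thresholds: { 'cis_rhel-8_firewalld_server_2' => 75.0 }`, a score of 78.0 passes for that test case but would fail for any case without an override.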
+## Input/Output Contracts
+**Inputs**
+- `tests:` config key — list of acceptance-test directory names (same as `cem_acpt`).
+- `cem_acpt_scan.cis_cat_pro_source` — local file path or `gs://` URI to the CIS-CAT Pro bundle. Required when any selected test case resolves to scanner `ciscat`.
+- `cem_acpt_scan.cis_cat_pro_license` — local file path or `gs://` URI to the CIS-CAT Pro license bundle (`.tar.gz` or `.zip`). Required when any selected test case resolves to scanner `ciscat`. Separate from `cis_cat_pro_source` so license rotation does not require rebundling the assessor.
+- `cem_acpt_scan.profiles.openscap` / `cem_acpt_scan.profiles.cis_cat` — required mappings from test-case dir name to scanner profile id.
+- `cem_acpt_scan.threshold` (float, 0–100) — global default pass threshold. Default `80.0`.
+- `cem_acpt_scan.test_thresholds` (map) — per-test-case override thresholds.
+**Outputs**
+- Stdout — normalized JSON for each test case scanned, plus the existing logger output (CI-formatted under `-I`).
+- `--scan-output FILE` — single test case writes JSON to `FILE`; multi-case runs fan out to `FILE.<test_case>.json` (no overwrite of `FILE` itself).
+- Exit code — `0` if all scans met their thresholds, `1` if any scan fell below its threshold. Infrastructure and scanner errors propagate as non-zero exit codes, as they do today.
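The `--scan-output` naming rule can be expressed in a few lines. This sketch assumes a hypothetical helper `scan_output_path` that receives the flag value, the test-case name, and the number of test cases in the run.

```ruby
# Returns the file to write for one test case, or nil when --scan-output is unset.
def scan_output_path(file, test_case, total_cases)
  return nil if file.nil?

  # The fan-out suffix applies only when more than one test case is scanned.
  total_cases > 1 ? "#{file}.#{test_case}.json" : file
end
```

A two-case run with `--scan-output report.json` therefore writes `report.json.<case>.json` for each case and never touches `report.json` itself.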
+**Normalized JSON shape**
+```json
+{
+  "test_case": "cis_rhel-8_firewalld_server_2",
+  "scanner": "ciscat",
+  "profile": "xccdf_org.cisecurity.benchmarks_profile_Level_2_-_Server",
+  "score": 87.4,
+  "threshold": 80.0,
+  "passed_count": 187,
+  "failed_count": 27,
+  "not_applicable_count": 14,
+  "error_count": 0,
+  "rules": [
+    { "id": "...", "title": "...", "severity": "high", "result": "pass" }
+  ]
+}
+```
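The sample figures are consistent with a pass rate computed over passed plus failed rules, leaving not-applicable and error counts out of the denominator. That reading is inferred from the sample numbers only; the spec does not define the formula, and each scanner reports its own native score. The helper name and the zero-total fallback below are assumptions.

```ruby
# Hypothetical pass-rate helper; the formula is inferred from the sample JSON.
def pass_rate(passed, failed)
  total = passed + failed
  return 0.0 if total.zero? # assumed fallback when nothing was evaluated

  (passed.to_f / total * 100).round(1)
end

pass_rate(187, 27) # => 87.4
```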
+## Edge cases
+- **Mixed scanners in one run.** `tests:` may resolve to a mix of `openscap` and `ciscat` cases. Each test case is scanned with the right tool independently. CIS-CAT Pro bundle upload is skipped on nodes whose resolved scanner is `openscap`.
+- **Multi-host fan-out.** Same as `cem_acpt` today — one instance per test case. Scan actions are synchronous (`async: false`); each host's `GET /scan` is sequential within the runner's single thread.
+- **Daemon readiness.** The `:scan` action waits up to `cem_acpt_scan.daemon.ready_timeout` seconds for the daemon's `/health` endpoint to return 200 before issuing `/scan`. Times out with a clear error if the systemd unit failed to start.
+- **`--scan-output FILE` with one test.** The bare path `FILE` is used. The fan-out suffix is only applied when more than one test case is scanned in the run.
+- **Windows test case requested.** The runner detects `os_family_for(test) == :windows` (`test_runner.rb:71`) and raises `"Windows scanning is not supported by cem_acpt_scan in this version"` before provisioning.
+- **No matching test case for `tests:` entry.** Same hard error as `cem_acpt` today (`test_data.rb:38`).
+- **Profile lookup miss.** If a test case has no entry in `cem_acpt_scan.profiles.<scanner>`, fail before provisioning with a message naming the missing key.
+- **License missing for ciscat run.** If `cem_acpt_scan.cis_cat_pro_license` is unset (or empty) when any test case resolves to scanner `ciscat`, raise `CemAcpt::Scan::LicenseNotFoundError` in `pre_provision_test_nodes` before any Terraform call. OpenSCAP-only runs ignore the key entirely.
+- **License source unreachable.** A `gs://` URI that 404s under `gcloud storage cp`, or a local path that does not exist, propagates through the existing `Utils::Shell.run_cmd` / `File.expand_path` error paths and surfaces as a host-side provisioning error — same channel as an unreachable `cis_cat_pro_source`.
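The daemon-readiness wait described in the list above could be implemented as a deadline-bounded polling loop. The sketch below is an assumption about shape only: `wait_until_ready`, its `interval` and `sleeper` parameters, and the local `DaemonNotReadyError` class (standing in for `CemAcpt::Scan::DaemonNotReadyError`) are illustrative names. The probe is passed as a block so the loop can be exercised without a network.

```ruby
# Local stand-in for CemAcpt::Scan::DaemonNotReadyError.
class DaemonNotReadyError < StandardError; end

# Calls the block until it returns truthy or `timeout` seconds have elapsed.
def wait_until_ready(timeout:, interval: 1.0, sleeper: ->(s) { sleep(s) })
  # A monotonic clock is unaffected by wall-clock adjustments.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise DaemonNotReadyError, "scan daemon not ready after #{timeout}s"
    end

    sleeper.call(interval)
  end
end

# Possible usage against the /health endpoint (host and port are placeholders):
#   require 'net/http'
#   wait_until_ready(timeout: 60) do
#     Net::HTTP.get_response(URI("http://#{host}:8081/health")).is_a?(Net::HTTPOK)
#   rescue StandardError
#     false
#   end
```

Treating connection errors as "not ready yet" in the probe keeps a slow-starting systemd unit from failing the run before the timeout expires.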
+## Constraints / Invariants
+- Reuses `Provision::Terraform`, `lib/terraform/gcp/linux/main.tf`, `Config::Base`, `TestRunner::Runner`, and `Logging` unchanged in shape — no parallel implementations.
+- `cem_acpt`'s existing `:goss` and `:bolt` action paths must remain functionally identical. Verified by `bundle exec rake spec` continuing to pass.
+- RuboCop must remain clean (200-char line limit, the existing `.rubocop.yml`).
+- No new runtime gem dependency (the daemon mirrors Goss's WEBrick pattern; `webrick` is already installed by `provision_commands`).
+- The single-instance-per-test-case provisioning model is preserved — `cem_acpt_scan` does not introduce parallel hosts per test or sharing across cases.
+## Error handling
+- Subsystem-specific error classes live in a new `lib/cem_acpt/scan/errors.rb`, following the repo convention (`docs/ARCHITECTURE.md` §convention; `lib/cem_acpt/bolt/errors.rb` precedent). Defined: `CemAcpt::Scan::DaemonNotReadyError`, `CemAcpt::Scan::ProfileNotFoundError`, `CemAcpt::Scan::ScannerInvocationError`, `CemAcpt::Scan::LicenseNotFoundError`.
+- Provisioning errors (Terraform failures, SSH key setup, manifest apply) flow through the existing `Runner#run`'s `rescue StandardError` (`test_runner.rb:78-82`) and are reported with the existing `TestResults` plumbing.
+- Cleanup runs unconditionally via the existing `ensure` (`test_runner.rb:83-90`), respecting `--no-destroy-nodes`.
+## Non-goals
+- **Windows scanning.** A follow-up ticket adds CIS-CAT Pro on Windows; this story errors out cleanly on Windows test cases.
+- **AWS / non-GCP platforms.** Same single-platform constraint as `cem_acpt` today.
+- **Trend reporting / scan history.** The normalized JSON is the artifact; downstream consumers (CI dashboards, AI agents) own historical analysis.
+- **Image-builder integration.** `cem_acpt_image` is not modified to bake scanners into images. Scanner install is a provision-time concern only.
+- **Severity-weighted scoring.** Scores are scanner-native pass-rate; severity weighting is out of scope.
+- **A `:scan` action group registered alongside `:goss` / `:bolt`.** Scan mode is a different command, not a sub-action of acceptance runs.
+- **Bucket-side automation of CIS-CAT Pro source and license distribution.** The operator chooses the artifacts' location (local path or a `gs://` URI of their making) and is responsible for keeping the bucket populated. A follow-up story under CEM-6508 may add managed-bucket automation — uploading on a developer's behalf, sharing one bucket across the team — but for this story the artifacts are operator-managed and the runner only consumes them.
+## Acceptance criteria
+- [ ] `bundle exec exe/cem_acpt_scan -h` prints help text including `--scanner`, `--threshold`, `--scan-output`, `-m`, `-t`, `-D`.
+- [ ] `bundle exec exe/cem_acpt_scan -Y` prints the merged `Config::CemAcptScan` as YAML; `-X` explains config-source provenance.
+- [ ] Running `bundle exec exe/cem_acpt_scan -t cis_rhel-8_firewalld_server_2 -m <module>` against a configured GCP project provisions a node, applies the manifest, runs CIS-CAT Pro, prints the scanner-native pass percentage on stdout, and exits 0 if the score is ≥ the threshold.
+- [ ] `--scan-output report.json` writes the normalized JSON to `report.json` for a single-case run; `report.json.<case>.json` fan-out works for multi-case.
+- [ ] `cem_acpt_scan` exits non-zero when any test case scores below its threshold; per-case threshold overrides via `cem_acpt_scan.test_thresholds` are honored.
+- [ ] `cem_acpt`'s existing `:goss` + `:bolt` action paths still work — `bundle exec exe/cem_acpt -t <case>` is unchanged in behavior.
+- [ ] A Windows test case (resolved via `os_family_for`) causes `cem_acpt_scan` to fail before provisioning with a clear error.
+- [ ] A missing `cem_acpt_scan.profiles.<scanner>` entry fails before provisioning with a clear error naming the missing key.
+- [ ] A missing `cem_acpt_scan.cis_cat_pro_license` for a ciscat-resolving test case fails before provisioning with a `LicenseNotFoundError` naming the missing key. OpenSCAP-only runs do not require the key.
+- [ ] A ciscat run with a valid license (either `.tar.gz` or `.zip` form) extracts the contents into `/opt/cis-cat-pro/license/` on the node, and the assessor produces a non-error scan result that the host parses into a `Scan::Result`.
+- [ ] `bundle exec rake spec` passes.
+- [ ] `rubocop` is clean.
+## Files touched
+- `exe/cem_acpt_scan` (new) — 16-line shim, copy of `exe/cem_acpt`.
+- `lib/cem_acpt.rb` (`:24-40`, `:44-53`, plus new `run_cem_acpt_scan`) — dispatch + entry-point method.
+- `lib/cem_acpt/cli.rb` (`:27-95`) — new `when :cem_acpt_scan` block for scan-specific flags.
+- `lib/cem_acpt/config.rb` — `require_relative 'config/cem_acpt_scan'`.
+- `lib/cem_acpt/config/cem_acpt_scan.rb` (new) — `Config::CemAcptScan`. Includes new `cis_cat_pro_license` default key alongside `cis_cat_pro_source`.
+- `lib/cem_acpt/test_data.rb` (`:45-46`, `:72-77`) — make `goss.yaml` optional in scan mode; preflight checks scoped by mode.
+- `lib/cem_acpt/test_runner.rb` (`:120-143`) — `configure_actions` dispatches between `configure_acceptance_actions` (existing) and `configure_scan_actions` (new); `process_test_results` honors per-case thresholds in scan mode.
+- `lib/cem_acpt/provision/terraform.rb` — populate the scan-mode `node_data` fields (`cis_cat_pro_bundle`, `cis_cat_pro_format`, `cis_cat_pro_license`, `cis_cat_pro_license_format`, `scan_config_json`) when `Config::CemAcptScan` is active. New helpers: `resolve_cis_cat_pro_source!`, `resolve_cis_cat_pro_license!`, `archive_format` (shared by both).
+- `lib/cem_acpt/provision/terraform/linux.rb` (`:29-48`) — append scanner-install commands when scan mode is active.
+- `lib/cem_acpt/scan/` (new) — scan subsystem: `daemon_client.rb` (HTTP client), `errors.rb` (`DaemonNotReadyError`, `ProfileNotFoundError`, `ScannerInvocationError`, `LicenseNotFoundError`), `result.rb` (normalized JSON), `scan_service_template.rb` (the Ruby served on-node, embedded as a string template).
+- `lib/terraform/gcp/linux/main.tf` — five new optional fields on the `node_data` object schema (`cis_cat_pro_bundle`, `cis_cat_pro_format`, `cis_cat_pro_license`, `cis_cat_pro_license_format`, `scan_config_json`), all defaulting to empty strings. Two new `null_resource` blocks (`scan_config_upload`, `cis_cat_pro_upload`) filter on those fields via `for_each` so they are no-ops in acceptance mode. The `cis_cat_pro_upload` resource declares four sequential provisioners — assessor upload, assessor extraction, license upload, license extraction — relying on Terraform's in-resource provisioner ordering for the upload-before-extract guarantee.
+- `spec/` — new specs mirroring the new files; existing specs touched where their fixtures change.
+## Test plan
+1. **`bundle exec rake spec`** — full unit-test suite. New specs cover:
+   - `Config::CemAcptScan` defaults, env-var loading (`CEM_ACPT_SCAN_*`), validation.
+   - `Runner#configure_actions` dispatch (acceptance vs scan).
+   - The new scan action: HTTP client mocked via WebMock or stubbed `Async::HTTP::Internet`. Verifies parsing, threshold evaluation, exit-code wiring.
+   - Discovery predicate change in `TestData::Fetcher` for both modes.
+   - Profile-mapping lookup miss → `ProfileNotFoundError`.
+   - Daemon-not-ready → `DaemonNotReadyError`.
+   - `resolve_cis_cat_pro_license!`: local path passes through unchanged; `gs://` URI is pulled to `working_dir` via stubbed `Utils::Shell.run_cmd` and preserves the source extension; missing key while any test case resolves to ciscat raises `LicenseNotFoundError`.
+   - `archive_format`: existing assessor-source coverage extended to assert the same helper accepts a license path argument — no separate `license_format` helper exists.
+2. **`bundle exec exe/cem_acpt_scan -Y`** — merged config dump round-trips correctly with both the new schema and existing shared keys.
+3. **`bundle exec exe/cem_acpt -t <case>`** — regression check that `:goss` + `:bolt` paths behave unchanged on a representative test case.
+4. **`rubocop`** — clean.
+End-to-end runs against real GCP nodes are not part of this story's CI; manual sanity checks are the assignee's responsibility before merging. Those manual checks cover both license-archive formats (`.tar.gz` and `.zip`) and confirm that the extracted contents under `/opt/cis-cat-pro/license/` allow the assessor to run without a licensing error.

data/specifications/CEM-6720.md ADDED

@@ -0,0 +1,187 @@
+# CEM-6720 — Remove provisioner factory in favor of direct instantiation
+Implements [RFC 0011](../docs/rfcs/0011-provisioner-factory-consistency.md).
+## Summary
+`Config::Base#add_static_options!` force-sets `provisioner = 'terraform'`
+after every other config source has been merged. The `Provision.new_provisioner`
+factory still branches on `config.get('provisioner')` and raises for unknown
+values — so the factory looks pluggable but is not. RFC 0011 recommends the
+smaller change: remove the factory and the static set, and have `TestRunner`
+instantiate `Provision::Terraform` directly. Re-introduce the factory when a
+concrete second implementation is in flight.
+## Functional behavior
+### 1. Delete `lib/cem_acpt/provision.rb`
+The whole 20-line file is the factory. Delete it outright.
+`Provision::Terraform` declares its own `module CemAcpt; module Provision`
+namespace at `lib/cem_acpt/provision/terraform.rb:6-7`, so the
+`CemAcpt::Provision` module remains reachable as long as the consumer
+requires `cem_acpt/provision/terraform` directly.
+The module-level `include CemAcpt::Logging` in `provision.rb` exists only
+to support the `logger.debug` call inside `new_provisioner`; both go away
+with the file.
+### 2. Update the lone caller in `TestRunner`
+`lib/cem_acpt/test_runner.rb:225-229`:
+```ruby
+# Before
+def provision_test_nodes
+  logger.info('CemAcpt::TestRunner') { 'Provisioning test nodes...' }
+  @provisioner = CemAcpt::Provision.new_provisioner(config, @run_data)
+  @provisioner.provision
+end
+# After
+def provision_test_nodes
+  logger.info('CemAcpt::TestRunner') { 'Provisioning test nodes...' }
+  @provisioner = CemAcpt::Provision::Terraform.new(config, @run_data)
+  @provisioner.provision
+end
+```
+Add the corresponding `require_relative 'provision/terraform'` near the
+top of `test_runner.rb`. Variable name `@provisioner` and method names
+`provisioner_output` / `destroy_test_nodes` stay — they describe the
+role, not the factory.
+### 3. Drop the `provisioner` static set in `Config::Base`
+`lib/cem_acpt/config/base.rb:260-270`:
+```ruby
+# Before
+def add_static_options!(config)
+  config.dset('user_config.dir', user_config_dir)
+  add_config_explanation('user_config.dir', 'static value')
+  config.dset('user_config.file', user_config_file)
+  add_config_explanation('user_config.file', 'static value')
+  config.dset('provisioner', 'terraform')
+  add_config_explanation('provisioner', 'static value')
+  config.dset('terraform.dir', terraform_dir)
+  add_config_explanation('terraform.dir', 'static value')
+  set_third_party_env_vars!(config)
+end
+# After
+def add_static_options!(config)
+  config.dset('user_config.dir', user_config_dir)
+  add_config_explanation('user_config.dir', 'static value')
+  config.dset('user_config.file', user_config_file)
+  add_config_explanation('user_config.file', 'static value')
+  config.dset('terraform.dir', terraform_dir)
+  add_config_explanation('terraform.dir', 'static value')
+  set_third_party_env_vars!(config)
+end
+```
+`provisioner` is also dropped from `BASE_VALID_KEYS` at
+`lib/cem_acpt/config/base.rb:46-62`, since nothing reads it any more
+and leaving it there would let users set a key that has no effect.
+### 4. Update `docs/ARCHITECTURE.md`
+- **Line 65** (file-tree): drop the `provision.rb # provisioner factory`
+  entry; the factory file is gone.
+- **§3 lines ~170-175**: in step 6a, drop `provisioner = 'terraform'`
+  from the list of framework-owned static keys (now three keys, not
+  four).
+- **§7 lines 403-410**: rewrite the opening paragraph of "Provisioner:
+  Terraform" so it no longer references the factory or the static
+  override. Replacement reads roughly: "There is currently one
+  provisioner. `TestRunner#provision_test_nodes` instantiates
+  `Provision::Terraform` directly. A factory was removed in CEM-6720;
+  re-introduce one when there is a concrete second implementation."
+- **§19 item 1 (lines 1020-1023)**: delete the item entirely.
+  Renumber: current item 2 (Platform constant cache, already resolved)
+  becomes item 1; current item 3 (`lib/terraform/image/gcp/windows/`
+  empty) becomes item 2.
+## Inputs / outputs / contracts
+- No public-API surface change. `Provision.new_provisioner` had no
+  out-of-tree callers in this repo, and the only in-tree caller is
+  rewired to `Provision::Terraform.new`.
+- `Provision::Terraform.new(config, run_data)` — unchanged signature
+  and behavior. The factory was a thin pass-through.
+## Edge cases
+- **`-Y` output no longer shows a `provisioner` key.** The static set
+  was the only thing populating it, and `provisioner` is removed from
+  `BASE_VALID_KEYS`, so any leftover value supplied by env/user/runtime
+  config is dropped during `validate_config!`. Anyone scripting against
+  `-Y` for a `provisioner:` line gets nothing instead of `terraform`.
+  Acceptable: no documented contract on that key, and the change is
+  consistent with the framework no longer caring.
+- **`-X` output no longer shows the `provisioner` key explanation.**
+  Same root cause; same acceptable trade-off.
+- **A user who sets `provisioner: foo` in a config file or
+  `CEM_ACPT_PROVISIONER=foo`** previously got the value silently
+  overridden by the static set, then the factory raised
+  `ArgumentError`. Now the value is silently dropped by
+  `validate_config!` (with a `warn` "Config key 'provisioner' is not
+  usable with this command") and the runner proceeds with Terraform.
+  This is a small behavior change but matches the intended state — the
+  framework is honest that the key has no effect.
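The drop-with-warning behavior described in the last bullet can be pictured with a minimal sketch. This is not the gem's `validate_config!`: `drop_unusable_keys`, its default key list, and the `warnings` collector are hypothetical names used only for illustration, and the key list is a made-up subset.

```ruby
# Returns a copy of +config+ holding only recognised top-level keys.
# The default +valid_keys+ is a hypothetical subset; the real
# BASE_VALID_KEYS list is longer. Each rejected key adds one message to
# +warnings+ (an assumption: the real code warns through its own logger).
def drop_unusable_keys(config, valid_keys: %i[user_config terraform log_level], warnings: [])
  config.each_with_object({}) do |(key, value), kept|
    if valid_keys.include?(key.to_sym)
      kept[key.to_sym] = value
    else
      warnings << "Config key '#{key}' is not usable with this command"
    end
  end
end

warnings = []
kept = drop_unusable_keys({ 'provisioner' => 'foo', 'log_level' => 'debug' }, warnings: warnings)
# kept now holds only :log_level; warnings holds one message naming 'provisioner'.
```

Under these assumptions the run continues with whatever keys remain, which is the "silently dropped, runner proceeds with Terraform" outcome the bullet describes.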
+## Constraints / invariants
+- RuboCop must remain clean (200-char line limit, the existing
+  permissive `.rubocop.yml`).
+- `bundle exec rake spec` must pass.
+- The Linux and Windows test-node provisioning flows continue to work
+  end-to-end.
+## Error handling
+- Removing `lib/cem_acpt/provision.rb`: file deletion, nothing to handle.
+- The `ArgumentError, "Unknown provisioner #{...}"` raise path inside
+  the deleted factory is gone. There is no longer any "unknown
+  provisioner" error condition to handle, because there is no longer
+  any dispatch.
+## Non-goals
+- Adding a real second provisioner (Pulumi, Packer, local-only).
+  Re-introducing the factory is explicitly deferred until that exists.
+- Adding a `--provisioner NAME` CLI flag. RFC 0011's alternative path
+  is rejected — we are not preserving the extension surface.
+- Touching the image-builder path (`Provision::TerraformCmd`,
+  `ImageBuilder::TerraformBuilder`). It does not go through the
+  factory and is unaffected.
+- Any other `docs/ARCHITECTURE.md` §19 item.
+## Acceptance criteria
+- [ ] `lib/cem_acpt/provision.rb` deleted; `git grep new_provisioner
+      lib/ exe/ spec/` returns no matches.
+- [ ] `lib/cem_acpt/test_runner.rb` instantiates
+      `CemAcpt::Provision::Terraform` directly and `require`s
+      `cem_acpt/provision/terraform`.
+- [ ] `Config::Base#add_static_options!` no longer sets `provisioner`;
+      `provisioner` removed from `BASE_VALID_KEYS`.
+- [ ] `docs/ARCHITECTURE.md` updated: file-tree line 65, §3 step 6a
+      static-keys list, §7 opening paragraph, §19 item 1 deleted with
+      remaining items renumbered.
+- [ ] `bundle exec rake spec` passes.
+- [ ] `rubocop` is clean against the changed files.
+## Test plan
+- Existing specs around `Provision::Terraform` (`terraform_cmd_spec.rb`,
+  `os_data_spec.rb`, `windows_spec.rb`) are unaffected: the factory
+  had no direct test coverage to migrate, and the
+  `Provision::Terraform.new` signature is unchanged.
+- `spec/cem_acpt/test_runner_spec.rb` stubs `provisioner_output` rather
+  than the construction path, so it remains green without modification.
+- Confirm `bundle exec exe/cem_acpt -Y` and `-X` run cleanly and no
+  longer reference `provisioner`.
+- No new spec files are added; the change is a removal, and the live
+  end-to-end paths exercise it implicitly.

data/specifications/CEM-6759.md ADDED

@@ -0,0 +1,168 @@
+# CEM-6759 — Wire `-b <benchmark>` flag through CIS-CAT Pro scan pipeline
+## Summary
+`run_ciscat` in `lib/terraform/gcp/linux/scan/scan_service.rb` never passes `-b <benchmark-xccdf-path>` to `Assessor-CLI.sh`. CIS-CAT Pro v4 requires an explicit benchmark to disambiguate profile IDs that exist across many OS benchmarks (RHEL 8, RHEL 9, Ubuntu LTS, etc.). Without it the assessor falls back to a `sessions.properties` file, fails to load it, and exits with status 0 without scanning, so the run silently reports empty results. This change wires a new `cem_acpt_scan.benchmarks.cis_cat` config map through `scan_post_processing!`, `scan_config_json_for`, and `run_ciscat` so that `-b /opt/cis-cat-pro/benchmarks/<filename>` is passed on every ciscat invocation.
+## Functional Behavior
+### 1. `lib/cem_acpt/scan/errors.rb` — new `BenchmarkNotFoundError`
+Add alongside `ProfileNotFoundError`:
+```ruby
+# Raised when a test case has no entry in `cem_acpt_scan.benchmarks.cis_cat`
+# and so no benchmark XCCDF filename can be resolved. The message carries the
+# missing config key so the operator can fix it without re-running.
+class BenchmarkNotFoundError < StandardError
+  def initialize(test_case, scanner, config_key)
+    super("No benchmark configured for test case '#{test_case}' (scanner: #{scanner}). Set '#{config_key}' in cem_acpt config.")
+  end
+end
+```
+### 2. `lib/cem_acpt/config/cem_acpt_scan.rb` — add `benchmarks` key
+Add `benchmarks: { cis_cat: {} }` alongside `profiles` in the `cem_acpt_scan` defaults hash:
+```ruby
+cem_acpt_scan: {
+  # ... existing keys ...
+  profiles: {
+    openscap: {},
+    cis_cat: {},
+  },
+  benchmarks: {
+    cis_cat: {},
+  },
+},
+```
+### 3. `lib/cem_acpt/test_data.rb` — `scan_post_processing!`
+After the existing profile resolution, resolve the benchmark for ciscat scans and raise `BenchmarkNotFoundError` early if missing:
+```ruby
+# before:
+test_data[:scan] = {
+  scanner: scanner,
+  profile: profile,
+  level: test_data[:level] || test_data['level'],
+}
+# after:
+benchmark = nil
+if scanner == :ciscat
+  benchmarks = @config.get('cem_acpt_scan.benchmarks.cis_cat') || {}
+  benchmark = benchmarks[test_data[:test_name]] || benchmarks[test_data[:test_name].to_sym]
+  if benchmark.nil? || benchmark.to_s.empty?
+    raise CemAcpt::Scan::BenchmarkNotFoundError.new(
+      test_data[:test_name], scanner,
+      "cem_acpt_scan.benchmarks.cis_cat.#{test_data[:test_name]}"
+    )
+  end
+end
+test_data[:scan] = {
+  scanner: scanner,
+  profile: profile,
+  level: test_data[:level] || test_data['level'],
+  benchmark: benchmark,
+}
+```
+`benchmark` is `nil` for openscap — it flows through harmlessly and is never read by `run_openscap`.
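The lookup above accepts both string and symbol keys and fails fast for ciscat only. A self-contained sketch of the same resolution logic follows. `resolve_benchmark` is a hypothetical helper (the spec inlines this logic in `scan_post_processing!`), the error class is a local stand-in for `CemAcpt::Scan::BenchmarkNotFoundError`, and the test-case name and filename in the usage lines are made up.

```ruby
# Local stand-in for CemAcpt::Scan::BenchmarkNotFoundError.
class BenchmarkNotFoundError < StandardError
  def initialize(test_case, scanner, config_key)
    super("No benchmark configured for test case '#{test_case}' (scanner: #{scanner}). Set '#{config_key}' in cem_acpt config.")
  end
end

# Hypothetical helper: returns the benchmark filename for a ciscat test
# case, nil for any other scanner, and raises when the ciscat entry is
# missing or empty.
def resolve_benchmark(benchmarks, test_name, scanner)
  return nil unless scanner == :ciscat

  benchmark = benchmarks[test_name.to_s] || benchmarks[test_name.to_sym]
  if benchmark.nil? || benchmark.to_s.empty?
    raise BenchmarkNotFoundError.new(test_name, scanner, "cem_acpt_scan.benchmarks.cis_cat.#{test_name}")
  end

  benchmark
end

# Made-up test-case name and filename, for illustration only.
benchmarks = { 'example_rhel8_case' => 'example-benchmark-xccdf.xml' }
resolve_benchmark(benchmarks, :example_rhel8_case, :ciscat) # returns "example-benchmark-xccdf.xml"
resolve_benchmark(benchmarks, 'anything', :openscap)        # returns nil
```

One difference from the spec's snippet is that the sketch calls `to_s` before the string lookup, so a symbol test name also matches a string-keyed map; whether the real config map needs that is an assumption.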
+### 4. `lib/cem_acpt/provision/terraform.rb` — `scan_config_json_for`
+Add `'benchmark'` to the generated JSON:
+```ruby
+JSON.generate(
+  'scanner'    => cfg[:scanner].to_s,
+  'profile'    => cfg[:profile].to_s,
+  'level'      => cfg[:level],
+  'datastream' => cfg[:datastream],
+  'benchmark'  => cfg[:benchmark],
+)
+```
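Because `benchmark` is `nil` for openscap, it helps to see what ends up in `scan_config.json`. The sketch below mirrors the `JSON.generate` call with a hypothetical free-standing `scan_config_json` helper and made-up values. Ruby's `json` library serializes `nil` as `null`, so a consumer that parses the file gets `nil` back for that key.

```ruby
require 'json'

# Hypothetical free-standing version of the JSON-building step.
def scan_config_json(cfg)
  JSON.generate(
    'scanner'    => cfg[:scanner].to_s,
    'profile'    => cfg[:profile].to_s,
    'level'      => cfg[:level],
    'datastream' => cfg[:datastream],
    'benchmark'  => cfg[:benchmark]
  )
end

# Made-up values, for illustration only.
openscap = JSON.parse(scan_config_json({ scanner: :openscap, profile: 'example_profile' }))
ciscat   = JSON.parse(scan_config_json({ scanner: :ciscat, profile: 'example_profile', level: 1,
                                         benchmark: 'example-benchmark-xccdf.xml' }))

openscap['benchmark'] # nil (written to the file as JSON null)
ciscat['benchmark']   # "example-benchmark-xccdf.xml"
```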
+### 5. `lib/terraform/gcp/linux/scan/scan_service.rb` — `run_ciscat` and `perform_scan`
+Update `run_ciscat` signature to accept `benchmark` and append `-b` when present:
+```ruby
+# before:
+def run_ciscat(profile, level)
+  ...
+  cmd += ['-l', level.to_s] if level
+# after:
+def run_ciscat(profile, level, benchmark)
+  ...
+  cmd += ['-b', "/opt/cis-cat-pro/benchmarks/#{benchmark}"] if benchmark
+  cmd += ['-l', level.to_s] if level
+```
+Update the `perform_scan` call site:
+```ruby
+# before:
+run_ciscat(cfg['profile'], cfg['level'])
+# after:
+run_ciscat(cfg['profile'], cfg['level'], cfg['benchmark'])
+```
+## Input/Output Contracts
+`run_ciscat` gains a third positional argument `benchmark` (String or nil). When non-nil it adds `-b /opt/cis-cat-pro/benchmarks/<benchmark>` to the assessor command, ahead of `-l`. The benchmark XCCDF files ship inside the assessor bundle at `/opt/cis-cat-pro/benchmarks/`; the filename is the operator-supplied value from the config map. Everything downstream (`parse_ciscat_json`, the error handling, the `/scan` HTTP response) is unchanged.
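The flag ordering in this contract can be checked with a small sketch. `append_scan_flags` is a hypothetical helper, and `base_cmd` stands in for the part of `run_ciscat` that the spec elides with `...`; only the `-b` and `-l` handling is taken from the snippet above.

```ruby
# Hypothetical helper: appends the optional flags to an already-built
# assessor command. The benchmark directory is the hardcoded prefix
# named in this spec; everything else about the command is assumed.
def append_scan_flags(base_cmd, level: nil, benchmark: nil)
  cmd = base_cmd.dup
  cmd += ['-b', "/opt/cis-cat-pro/benchmarks/#{benchmark}"] if benchmark
  cmd += ['-l', level.to_s] if level
  cmd
end

# Made-up filename, for illustration only.
with_benchmark    = append_scan_flags(['Assessor-CLI.sh'], level: 1, benchmark: 'example-benchmark-xccdf.xml')
without_benchmark = append_scan_flags(['Assessor-CLI.sh'], level: 1)
```

Building the command as an array rather than a single string keeps the filename a single argument even if it contains spaces, assuming the command is later executed without a shell.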
+## Constraints / Invariants
+- `/opt/cis-cat-pro/benchmarks/` is the hardcoded prefix in `scan_service.rb`. The config value is the filename only (e.g. `CIS_Red_Hat_Enterprise_Linux_8_Benchmark_v4.0.0-xccdf.xml`). This matches the directory layout inside the official CIS-CAT Pro assessor bundle.
+- `benchmarks.cis_cat` is strictly per-test-case — no global fallback. A missing entry raises `BenchmarkNotFoundError` before any node is provisioned, matching the existing `ProfileNotFoundError` behavior.
+- OpenSCAP is unaffected: `run_openscap` uses `--datastream`, not `-b`, and `benchmark` is `nil` for openscap test cases.
+- `-b` is appended before `-l` in the command array to match the assessor's documented flag ordering.
+## Non-Goals
+- **Auto-discovery of benchmark filenames.** The operator supplies the exact filename; no filesystem scan of `/opt/cis-cat-pro/benchmarks/` at provision time.
+- **Benchmark file presence validation.** If the filename is wrong the assessor will error on the node; that is surfaced via CEM-6761's stderr capture.
+- **Global benchmark default.** All entries must be explicit per-test-case.
+- **OpenSCAP changes.** `run_openscap` and the openscap config schema are unchanged.
+- **Terraform HCL changes.** The benchmark flows through `scan_config.json` (already uploaded by the `scan_config_upload` null_resource); no new Terraform provisioner is needed.
+## Tests
+**`spec/cem_acpt/scan/errors_spec.rb`** — add a `BenchmarkNotFoundError` describe block mirroring the `ProfileNotFoundError` block: one example asserting the error message contains the test-case name, one asserting the config key is present.
+**`spec/cem_acpt/scan/test_data_scan_spec.rb`** — extend the existing scan-mode context:
+- Happy path: `data.first[:scan][:benchmark]` equals the configured filename when `benchmarks.cis_cat` is populated.
+- Missing benchmark: raises `BenchmarkNotFoundError` with a message matching `cem_acpt_scan.benchmarks.cis_cat.<test_name>`.
+- Openscap test case: `benchmark` is `nil` in `test_data[:scan]` (no `BenchmarkNotFoundError` raised).
+**`spec/terraform/gcp/linux/scan/scan_service_spec.rb`** — add a text-grep example asserting `'-b'` and `"/opt/cis-cat-pro/benchmarks/"` appear in `run_ciscat`.
+## Acceptance Criteria
+- [ ] `run_ciscat` passes `-b /opt/cis-cat-pro/benchmarks/<benchmark>` when `cfg['benchmark']` is non-nil.
+- [ ] `scan_post_processing!` raises `BenchmarkNotFoundError` (with the missing config key in the message) when the `benchmarks.cis_cat` entry is absent for a ciscat test case.
+- [ ] `scan_post_processing!` does not raise for openscap test cases (benchmark is not required).
+- [ ] `scan_config_json_for` writes `'benchmark'` into `scan_config.json`.
+- [ ] `cem_acpt_scan.benchmarks.cis_cat` defaults to `{}` in `CemAcptScan` config.
+- [ ] `bundle exec rake spec` passes.
+## Files Touched
+- `lib/cem_acpt/scan/errors.rb` — `BenchmarkNotFoundError`.
+- `lib/cem_acpt/config/cem_acpt_scan.rb` — `benchmarks.cis_cat` default.
+- `lib/cem_acpt/test_data.rb` — benchmark resolution and error in `scan_post_processing!`.
+- `lib/cem_acpt/provision/terraform.rb` — `'benchmark'` in `scan_config_json_for`.
+- `lib/terraform/gcp/linux/scan/scan_service.rb` — `-b` flag in `run_ciscat`, updated `perform_scan` call.
+- `spec/fixtures/config_testing/user_config_dir/terraform/gcp/linux/scan/scan_service.rb` — fixture mirror.
+- `spec/fixtures/config_testing/user_config_dir/terraform_checksum.txt` — updated checksum.
+- `spec/cem_acpt/scan/errors_spec.rb` — `BenchmarkNotFoundError` examples.
+- `spec/cem_acpt/scan/test_data_scan_spec.rb` — benchmark resolution and error examples.
+- `spec/terraform/gcp/linux/scan/scan_service_spec.rb` — text-grep for `-b` flag.
+- `specifications/CEM-6759.md` — this file.