cem_acpt 0.11.2 → 0.12.0

@@ -0,0 +1,286 @@
1
+ # CEM-6511 — Automated benchmark scans (`cem_acpt_scan`)
2
+
3
+ ## Summary
4
+
5
+ Add a third binary, `cem_acpt_scan`, alongside `cem_acpt` and `cem_acpt_image`. It provisions Linux test nodes the same way the acceptance-test runner does, applies the acceptance-test case's `manifest.pp`, then runs a real benchmark scan against the node (OpenSCAP for STIG profiles, CIS-CAT Pro for CIS profiles) instead of Goss assertions. Each scan score is compared against a configurable threshold (a global default with optional per-test overrides), and the process exits `0` if every scan meets its threshold and `1` otherwise. The motivation is AI-enablement: agents need a way to validate compliance changes against the actual benchmark, not just developer-authored Goss tests.
6
+
7
+ ## Functional behavior
8
+
9
+ ### 1. Entry point and CLI
10
+
11
+ A new shim `exe/cem_acpt_scan` is added — a 16-line copy of `exe/cem_acpt` that calls `CemAcpt::Cli.parse_opts_for(:cem_acpt_scan)` and `CemAcpt.run(:cem_acpt_scan, ...)`. The gemspec auto-registers it via `bindir = 'exe'` (`cem_acpt.gemspec:26-27`); no gemspec edit is required.
12
+
13
+ `CemAcpt.run` (`lib/cem_acpt.rb:24-40`) gains a third arm, `when :cem_acpt_scan`, that calls a new private `run_cem_acpt_scan(options)` mirroring `run_cem_acpt`'s shape — build config, `initialize_logger!`, instantiate `CemAcpt::TestRunner::Runner` in scan mode, exit with `runner.exit_code`. `new_config` (`lib/cem_acpt.rb:44-53`) gains a matching arm that returns `Config::CemAcptScan`.
14
+
15
+ `Cli.parse_opts_for` (`lib/cem_acpt/cli.rb:27-95`) gains a `when :cem_acpt_scan` block adding scan-specific flags. The shared block (lines 97-185) is unchanged — `--config`, `--CI`, `--quiet`, `--verbose`, `--log-file`, `--trace`, `-Y`, `-X`, `-V` are inherited automatically.
16
+
17
+ Scan-specific flags:
18
+
19
+ | Flag | Effect |
20
+ | --- | --- |
21
+ | `-m, --module-dir DIR` | Path to consuming module root. Same semantics as `cem_acpt`. |
22
+ | `-t, --tests TESTS` | Comma-separated list of acceptance-test case names to scan. |
23
+ | `-D, --no-destroy-nodes` | Same semantics as `cem_acpt`. |
24
+ | `--scanner SCANNER` | Override scanner choice. One of `openscap`, `ciscat`. Default: derived from the test-case framework (`stig_*` → openscap, `cis_*` → ciscat). |
25
+ | `--threshold N` | Override the global pass/fail score threshold (float, 0–100). Default: from config. |
26
+ | `--scan-output FILE` | Save normalized JSON report to `FILE`. For multi-case runs, fan out to `FILE.<test_case_name>.json`. Without this flag, JSON is printed to stdout only. |
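
The scan-specific flags map naturally onto Ruby's standard `OptionParser`. The sketch below is illustrative only: `parse_scan_flags` is a hypothetical helper, not the actual `when :cem_acpt_scan` block in `Cli.parse_opts_for`, and `-D` is left out for brevity.

```ruby
require 'optparse'

# Hypothetical helper: parses only the scan-specific flags from the table above.
def parse_scan_flags(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.on('-m', '--module-dir DIR', 'Path to consuming module root') { |dir| options[:module_dir] = dir }
    # The Array acceptor splits the comma-separated value.
    opts.on('-t', '--tests TESTS', Array, 'Test case names to scan') { |tests| options[:tests] = tests }
    # An array of strings restricts the argument to those values.
    opts.on('--scanner SCANNER', %w[openscap ciscat], 'Override scanner choice') { |s| options[:scanner] = s }
    opts.on('--threshold N', Float, 'Pass/fail score threshold (0-100)') do |n|
      raise OptionParser::InvalidArgument, "#{n} is outside 0-100" unless (0.0..100.0).cover?(n)

      options[:threshold] = n
    end
    opts.on('--scan-output FILE', 'Save normalized JSON report to FILE') { |f| options[:scan_output] = f }
  end
  parser.parse(argv) # non-destructive; leaves argv untouched
  options
end
```

Keys that are absent from the returned hash signal "not overridden", so the config defaults (for example the `80.0` threshold) would still apply.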
27
+
28
+ ### 2. Config schema
29
+
30
+ A new `lib/cem_acpt/config/cem_acpt_scan.rb` defines `CemAcpt::Config::CemAcptScan < Base` following the `Config::CemAcpt` and `Config::CemAcptImage` pattern (`lib/cem_acpt/config/cem_acpt.rb:8-81`):
31
+
32
+ ```ruby
33
+ module CemAcpt
34
+ module Config
35
+ class CemAcptScan < Base
36
+ VALID_KEYS = %i[
37
+ cem_acpt_scan
38
+ module_dir
39
+ node_data
40
+ test_data
41
+ tests
42
+ ].freeze
43
+
44
+ def env_var_prefix
45
+ 'CEM_ACPT_SCAN'
46
+ end
47
+
48
+ def defaults
49
+ {
50
+ cem_acpt_scan: {
51
+ scanner: nil, # nil = auto-detect from framework
52
+ threshold: 80.0, # global default pass threshold
53
+ test_thresholds: {}, # { 'cis_rhel-8_firewalld_server_2' => 75.0 }
54
+ scan_output: nil, # nil = stdout only
55
+ cis_cat_pro_source: nil, # local path or gs:// URI; required for ciscat scans
56
+ cis_cat_pro_license: nil, # local path or gs:// URI; required for ciscat scans.
57
+ # Separate from cis_cat_pro_source so license rotation
58
+ # does not require rebuilding the assessor bundle.
59
+ daemon: { port: 8081, ready_timeout: 60 },
60
+ profiles: {
61
+ openscap: {
62
+ # 'stig_rhel-8_firewalld_v1r12' => 'xccdf_org.ssgproject.content_profile_stig'
63
+ },
64
+ cis_cat: {
65
+ # 'cis_rhel-8_firewalld_server_2' => 'xccdf_org.cisecurity.benchmarks_profile_Level_2_-_Server'
66
+ },
67
+ },
68
+ },
69
+ module_dir: Dir.pwd,
70
+ node_data: {},
71
+ test_data: { for_each: { collection: %w[puppet8] } },
72
+ tests: [],
73
+ # ... shared defaults inherited from Base via merge
74
+ }
75
+ end
76
+ end
77
+ end
78
+ end
79
+ ```
80
+
81
+ `lib/cem_acpt/config.rb` gets a new `require_relative 'config/cem_acpt_scan'`. The new schema is registered via the same `add_static_options!` / `valid_keys` / `validate_config!` machinery in `Config::Base` — no changes to `base.rb` are needed.
82
+
83
+ ### 3. Runner reuse with a new `:scan` action group
84
+
85
+ `cem_acpt_scan` reuses `CemAcpt::TestRunner::Runner` (`lib/cem_acpt/test_runner.rb:50-94`) — the lifecycle (`pre_provision_test_nodes` → `provision_test_nodes` → `run_tests` → `clean_up`) is shared. Only `configure_actions` (`test_runner.rb:120-143`) diverges based on which command instantiated the runner.
86
+
87
+ The runner gains a tiny dispatch in `configure_actions`:
88
+
89
+ ```ruby
90
+ def configure_actions
91
+ if config.is_a?(CemAcpt::Config::CemAcptScan)
92
+ configure_scan_actions
93
+ else
94
+ configure_acceptance_actions # the existing :goss + :bolt block
95
+ end
96
+ end
97
+ ```
98
+
99
+ `configure_scan_actions` registers a single sync `:scan` action group that, per host, calls the on-node scan daemon at `http://<host>:<cem_acpt_scan.daemon.port>/scan`, parses the response, builds a normalized result object, appends it to `@results`, and writes the JSON output (always to stdout, and also to the `--scan-output FILE` path if set).
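
A minimal sketch of the host-side request flow, assuming a plain `Net::HTTP` client: poll `/health` until it returns 200, then call `/scan` and parse the body. The method name `fetch_scan_result`, the `poll_interval` parameter, and the injectable `http_get` lambda are all hypothetical, and the error class is a local stand-in for `CemAcpt::Scan::DaemonNotReadyError`.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Local stand-in for CemAcpt::Scan::DaemonNotReadyError.
class DaemonNotReadyError < StandardError; end

# Polls GET /health until it returns 200, then issues GET /scan and parses the JSON body.
# `http_get` is injectable so the flow can be exercised without a live node.
def fetch_scan_result(host, port:, ready_timeout:, poll_interval: 2,
                      http_get: ->(uri) { Net::HTTP.get_response(uri) })
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + ready_timeout
  health = URI("http://#{host}:#{port}/health")
  loop do
    ready = begin
      http_get.call(health).code.to_i == 200
    rescue SystemCallError, SocketError, Timeout::Error
      false # daemon is not listening yet; keep polling
    end
    break if ready

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise DaemonNotReadyError, "scan daemon on #{host}:#{port} not ready after #{ready_timeout}s"
    end

    sleep(poll_interval)
  end
  response = http_get.call(URI("http://#{host}:#{port}/scan"))
  JSON.parse(response.body)
end
```

Because `/scan` runs the scanner synchronously, a real client would probably also need a generous read timeout on that second request; the sketch leaves the `Net::HTTP` defaults in place.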
100
+
101
+ ### 4. On-node scanner daemon
102
+
103
+ `Provision::Linux` gains `scan_provision_commands` (`lib/cem_acpt/provision/terraform/linux.rb`), a sibling of the existing `provision_commands` selected by a new `scan_mode:` keyword on `provision_commands_wrapper`. In scan mode the Goss install and Goss systemd units are skipped — only the scan-daemon machinery is installed. The on-node commands are:
104
+
105
+ 1. Install the consuming Puppet module (same `puppet module install` as acceptance mode).
106
+ 2. Install `webrick` via `puppet gem install` and start `log_service.rb` (existing logging pattern, reused).
107
+ 3. Install OpenSCAP via `install_scanner_packages_command`: `sudo dnf install -y openscap-scanner scap-security-guide || sudo apt-get install -y libopenscap8 ssg-base ssg-debderived`. Installed unconditionally — the OR-chain succeeds on whichever package manager is present, regardless of which scanner the test case resolves to.
108
+ 4. Install a headless Java runtime via `install_java_command`: `sudo dnf install -y java-11-openjdk-headless || sudo apt-get install -y default-jre-headless`. Installed unconditionally — CIS-CAT Pro's `Assessor-CLI.sh` is a Java app, and conditionalizing on the resolved scanner is not worth the wiring.
109
+ 5. Drop `/opt/cem_acpt/scan/scan_service.rb` (a small WEBrick server analogous to `log_service.rb`), install its systemd unit (`scan_service.service`), then run `daemon-reload` and start and enable the unit. The service exposes:
110
+ - `GET /health` — returns 200 once the daemon is up. Polled by the host until ready.
111
+ - `GET /scan` — reads scanner + profile + level + datastream from `/opt/cem_acpt/scan/scan_config.json` (written at provision time by the `scan_config_upload` null_resource — see §5), invokes the chosen scanner CLI synchronously, returns normalized JSON. The CIS-CAT Pro path expects the assessor at `/opt/cis-cat-pro/Assessor-CLI.sh` and the license bundle's contents at `/opt/cis-cat-pro/license/` — both placed by the `cis_cat_pro_upload` null_resource.
112
+ 6. Run `puppet apply` on the test-case manifest (existing `apply_command`).
113
+
114
+ The CIS-CAT Pro bundle is **not** extracted as part of these commands. Extraction lives inside the `cis_cat_pro_upload` null_resource (§5) so the assessor tarball is guaranteed to be on disk before extraction runs — a previous iteration did extraction here and hit a race where the file did not yet exist.
115
+
116
+ This mirrors the existing Goss pattern: install at provision, run as a systemd service, query over HTTP from the host. The difference is the source of truth for scan parameters — a state file uploaded by Terraform — rather than baked-in assertions.
117
+
118
+ ### 5. Terraform template reuse
119
+
120
+ `lib/terraform/gcp/linux/main.tf` is reused as-is — no `lib/terraform/scan/` fork. The `node_data` object's schema gains five optional fields (defaulting to empty strings so acceptance-mode runs are unaffected):
121
+
122
+ - `cis_cat_pro_bundle` (local host path) — assessor archive to upload.
123
+ - `cis_cat_pro_format` — `"tar.gz"` or `"zip"`, detected from the source extension by `Provision::Terraform#archive_format`.
124
+ - `cis_cat_pro_license` (local host path) — license archive to upload.
125
+ - `cis_cat_pro_license_format` — `"tar.gz"` or `"zip"`, detected from the license source extension using the same `archive_format` helper.
126
+ - `scan_config_json` (rendered JSON string) — the per-node scan parameters consumed by the on-node daemon.
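
To illustrate how the empty-string defaults keep acceptance-mode nodes unaffected, here is a rough Ruby sketch of building the five fields for one node. The helper name `scan_node_data_fields`, its arguments, and the contents of the scan parameters hash are assumptions made for illustration; the real logic lives in `Provision::Terraform#provision_node_data`.

```ruby
require 'json'

# Acceptance-mode defaults: every scan field is an empty string.
EMPTY_SCAN_FIELDS = {
  cis_cat_pro_bundle: '',
  cis_cat_pro_format: '',
  cis_cat_pro_license: '',
  cis_cat_pro_license_format: '',
  scan_config_json: ''
}.freeze

# Hypothetical builder. `scan_params` is whatever the on-node daemon needs
# (scanner, profile, ...); `cis_cat_pro` is nil for nodes that do not use ciscat.
def scan_node_data_fields(scan_params, cis_cat_pro: nil)
  fields = EMPTY_SCAN_FIELDS.merge(scan_config_json: JSON.generate(scan_params))
  return fields if cis_cat_pro.nil?

  fields.merge(
    cis_cat_pro_bundle: cis_cat_pro.fetch(:bundle),
    cis_cat_pro_format: cis_cat_pro.fetch(:format),
    cis_cat_pro_license: cis_cat_pro.fetch(:license),
    cis_cat_pro_license_format: cis_cat_pro.fetch(:license_format)
  )
end
```

A node whose `cis_cat_pro_bundle` stays empty would simply be filtered out of any Terraform `for_each` that selects on that field.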
127
+
128
+ Two new `null_resource` blocks ship the scan-mode artifacts to the node. Each filters with `for_each` over `var.node_data` so it is a no-op for any node whose relevant field is empty — acceptance-mode runs do not require any of these resources to even plan.
129
+
130
+ **`null_resource.scan_config_upload`** (filtered on `scan_config_json != ""`) — opens a fresh SSH connection to the instance, `mkdir -p /opt/cem_acpt/scan` and `chown ${var.username}:${var.username}` so the subsequent `provisioner "file"` can write without sudo, then drops `scan_config.json` with `content =` (no source file). Triggers on `google_compute_instance.acpt-test-node[each.key].id` so it re-runs whenever the instance is replaced.
131
+
132
+ **`null_resource.cis_cat_pro_upload`** (filtered on `cis_cat_pro_bundle != ""`) — opens a fresh SSH connection and runs four provisioners *in declared order*. Provisioner execution within a single resource is sequential, which is the ordering guarantee we rely on for both the assessor extraction and the license extraction:
133
+
134
+ 1. `provisioner "file"` — upload the assessor archive to `/opt/cem_acpt/cis-cat-pro.${cis_cat_pro_format}`.
135
+ 2. `provisioner "remote-exec"` — extract the assessor into `/opt/cis-cat-pro/`. Ternary on `cis_cat_pro_format`: `tar.gz` uses `tar -xzf … --strip-components=1`; `zip` installs `unzip` (OR-chained `dnf || apt-get`), stages into a temp dir, then `shopt -s dotglob && mv */* /opt/cis-cat-pro/` to flatten the vendor's single top-level directory (including dotfiles). `chmod +x Assessor-CLI.sh`.
136
+ 3. `provisioner "file"` — upload the license archive to `/opt/cem_acpt/cis-cat-pro-license.${cis_cat_pro_license_format}`.
137
+ 4. `provisioner "remote-exec"` — `sudo mkdir -p /opt/cis-cat-pro/license/` (the vendor bundle does not always ship this directory), then extract into it. Ternary on `cis_cat_pro_license_format`: `tar.gz` uses `tar -xzf … -C /opt/cis-cat-pro/license/` (no `--strip-components` — the vendor's archive contents are dropped in as-packaged, including any subdirectories); `zip` uses `unzip -q -o … -d /opt/cis-cat-pro/license/`. Permissions on the extracted contents are left at the vendor's archive defaults — to be revisited after the first end-to-end scan confirms what the assessor actually wants.
138
+
139
+ `Provision::Terraform#provision_node_data` (`lib/cem_acpt/provision/terraform.rb`) populates the scan-mode fields when `Config::CemAcptScan` is the active config: `scan_config_json` for every scan node (the on-node daemon reads it regardless of scanner), and the four `cis_cat_pro_*` fields only when the resolved scanner for a given node is `ciscat`. Two helper methods handle source resolution:
140
+
141
+ - `resolve_cis_cat_pro_source!` — local paths pass through via `File.expand_path`; `gs://` URIs are pulled to `working_dir` with `gcloud storage cp`. The cached filename preserves the source extension so `archive_format` works downstream.
142
+ - `resolve_cis_cat_pro_license!` — same shape, applied to `cem_acpt_scan.cis_cat_pro_license`. Raises `CemAcpt::Scan::LicenseNotFoundError` if unset when any test case resolves to ciscat.
143
+
144
+ The host-side `archive_format` helper handles both source and license — `.tar.gz`, `.tgz`, `.zip` accepted (case-insensitive); anything else raises before provisioning starts.
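
A possible shape for the extension check, written as a standalone sketch. It assumes `.tgz` is reported as `"tar.gz"` (the format fields only allow `"tar.gz"` or `"zip"`), and the error class is a hypothetical placeholder because the spec only says the helper raises.

```ruby
# Hypothetical error class; the spec does not name the one actually raised.
class UnsupportedArchiveError < StandardError; end

# Maps an archive path to the format string passed to Terraform.
def archive_format(path)
  case path.to_s.downcase # case-insensitive match on the extension
  when /\.tar\.gz\z/, /\.tgz\z/ then 'tar.gz'
  when /\.zip\z/ then 'zip'
  else raise UnsupportedArchiveError, "unsupported archive extension: #{path}"
  end
end
```

The same call works for both the assessor source and the license path, which is why no separate license helper is needed.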
145
+
146
+ ### 6. Acceptance-test discovery in scan mode
147
+
148
+ `CemAcpt::TestData::Fetcher` (`lib/cem_acpt/test_data.rb`) is updated so the discovery predicate accommodates scan mode:
149
+
150
+ - `find_acceptance_tests!` (`test_data.rb:72-77`) selects directories that contain `manifest.pp`. The current `goss.yaml` predicate becomes specific to acceptance mode.
151
+ - The per-test preflight checks (`test_data.rb:45-46`) require `manifest.pp` always; `goss.yaml` is required only in acceptance mode.
152
+ - The mode is derived from the config class (`config.is_a?(Config::CemAcptScan)` → scan mode).
153
+
154
+ Per-test scanner profile/level resolution:
155
+
156
+ 1. Parse the dir name with the existing `name_pattern_vars` regex (`framework`, `image_fam`, `image_version`, `firewall`, `profile`, `level`).
157
+ 2. Resolve `framework` to a scanner: `cis` → `ciscat`, `stig` → `openscap`. Overridable via `--scanner`.
158
+ 3. Look up the scanner profile id in `cem_acpt_scan.profiles.<scanner>` (`profiles.openscap` or `profiles.cis_cat`), keyed by full dir name. If missing, fail fast with a clear error pointing at the config key.
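
The three resolution steps can be sketched roughly as follows. This is not the real implementation: a naive prefix split stands in for the `name_pattern_vars` regex, the mapping from the scanner name `ciscat` to the config key `cis_cat` is an assumption based on the defaults hash, and the error class is a local stand-in for `CemAcpt::Scan::ProfileNotFoundError`.

```ruby
# Local stand-in for CemAcpt::Scan::ProfileNotFoundError.
class ProfileNotFoundError < StandardError; end

FRAMEWORK_SCANNERS = { 'cis' => 'ciscat', 'stig' => 'openscap' }.freeze
# Assumption: scanner names map onto the profile keys shown in the config defaults.
PROFILE_KEYS = { 'ciscat' => :cis_cat, 'openscap' => :openscap }.freeze

def resolve_scan_target(test_case, profiles, scanner_override: nil)
  framework = test_case.split('_', 2).first # stand-in for name_pattern_vars
  scanner = scanner_override || FRAMEWORK_SCANNERS[framework]
  raise ArgumentError, "cannot derive a scanner from framework '#{framework}'" if scanner.nil?

  key = PROFILE_KEYS.fetch(scanner)
  profile = profiles.fetch(key, {})[test_case]
  if profile.nil?
    raise ProfileNotFoundError, "no profile for '#{test_case}'; set cem_acpt_scan.profiles.#{key}"
  end

  { scanner: scanner, profile: profile }
end
```

Raising here, before any node exists, is what lets a profile lookup miss fail ahead of provisioning.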
159
+
160
+ ### 7. Score evaluation and exit code
161
+
162
+ After all `:scan` actions finish, `Runner#process_test_results` evaluates each result against its threshold:
163
+
164
+ - `cem_acpt_scan.test_thresholds[<test_case_name>]` if set, otherwise `cem_acpt_scan.threshold`.
165
+ - A run passes if every test case scored ≥ its threshold. Otherwise `runner.exit_code = 1`.
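
As a worked illustration of the rule above, the following sketch resolves a per-test threshold and derives the exit code. The method names and the result hash shape (`:test_case`, `:score`) are assumptions for the example, not the actual `Runner#process_test_results` code.

```ruby
# Per-test override if present, otherwise the global threshold.
def threshold_for(test_case, scan_config)
  scan_config.fetch(:test_thresholds, {}).fetch(test_case) { scan_config.fetch(:threshold) }
end

# 0 when every result meets its threshold, 1 otherwise.
def scan_exit_code(results, scan_config)
  passed = results.all? { |r| r[:score] >= threshold_for(r[:test_case], scan_config) }
  passed ? 0 : 1
end
```

With a global threshold of `80.0` and an override of `75.0` for one case, a score of `76.0` on that case passes while `79.9` on any other case fails the run.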
166
+
167
+ The pass/fail line per test case is logged through the existing `logger.info`/`logger.error` channels and renders correctly under both `--CI` and plain modes (`lib/cem_acpt/logging/formatter.rb:75-94`).
168
+
169
+ ## Input/Output Contracts
170
+
171
+ **Inputs**
172
+
173
+ - `tests:` config key — list of acceptance-test directory names (same as `cem_acpt`).
174
+ - `cem_acpt_scan.cis_cat_pro_source` — local file path or `gs://` URI to the CIS-CAT Pro bundle. Required when any selected test case resolves to scanner `ciscat`.
175
+ - `cem_acpt_scan.cis_cat_pro_license` — local file path or `gs://` URI to the CIS-CAT Pro license bundle (`.tar.gz` or `.zip`). Required when any selected test case resolves to scanner `ciscat`. Separate from `cis_cat_pro_source` so license rotation does not require rebundling the assessor.
176
+ - `cem_acpt_scan.profiles.openscap` / `cem_acpt_scan.profiles.cis_cat` — required mappings from test-case dir name to scanner profile id.
177
+ - `cem_acpt_scan.threshold` (float, 0–100) — global default pass threshold. Default `80.0`.
178
+ - `cem_acpt_scan.test_thresholds` (map) — per-test-case override thresholds.
179
+
180
+ **Outputs**
181
+
182
+ - Stdout: normalized JSON for each test case scanned, plus the existing logger output (CI-formatted under `--CI`).
183
+ - `--scan-output FILE`: a single-case run writes JSON to `FILE`; multi-case runs fan out to `FILE.<test_case>.json` and never write `FILE` itself.
184
+ - Exit code: `0` if all scans met their thresholds, `1` if any scan fell below its threshold. Infrastructure and scanner errors propagate as non-zero exit codes, as they do today.
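
The `--scan-output` naming rule is small enough to show directly. The helper name `scan_output_paths` is hypothetical; the behavior follows the rule stated above (bare `FILE` for one case, `FILE.<test_case>.json` for several).

```ruby
# Returns a { test_case_name => output_path } map for a --scan-output value.
def scan_output_paths(file, test_cases)
  # A single test case writes to the bare path.
  return { test_cases.first => file } if test_cases.size == 1

  # Several test cases fan out; FILE itself is never written.
  test_cases.to_h { |name| [name, "#{file}.#{name}.json"] }
end
```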
185
+
186
+ **Normalized JSON shape**
187
+
188
+ ```json
189
+ {
190
+ "test_case": "cis_rhel-8_firewalld_server_2",
191
+ "scanner": "ciscat",
192
+ "profile": "xccdf_org.cisecurity.benchmarks_profile_Level_2_-_Server",
193
+ "score": 87.4,
194
+ "threshold": 80.0,
195
+ "passed_count": 187,
196
+ "failed_count": 27,
197
+ "not_applicable_count": 14,
198
+ "error_count": 0,
199
+ "rules": [
200
+ { "id": "...", "title": "...", "severity": "high", "result": "pass" }
201
+ ]
202
+ }
203
+ ```
204
+
205
+ ## Edge cases
206
+
207
+ - **Mixed scanners in one run.** `tests:` may resolve to a mix of `openscap` and `ciscat` cases. Each test case is scanned with the right tool independently. CIS-CAT Pro bundle upload is skipped on nodes whose resolved scanner is `openscap`.
208
+ - **Multi-host fan-out.** Same as `cem_acpt` today — one instance per test case. Scan actions are synchronous (`async: false`); each host's `GET /scan` is sequential within the runner's single thread.
209
+ - **Daemon readiness.** The `:scan` action waits up to `cem_acpt_scan.daemon.ready_timeout` seconds for the daemon's `/health` endpoint to return 200 before issuing `/scan`. Times out with a clear error if the systemd unit failed to start.
210
+ - **`--scan-output FILE` with one test.** The bare path `FILE` is used. The fan-out suffix is only applied when more than one test case is scanned in the run.
211
+ - **Windows test case requested.** The runner detects `os_family_for(test) == :windows` (`test_runner.rb:71`) and raises `"Windows scanning is not supported by cem_acpt_scan in this version"` before provisioning.
212
+ - **No matching test case for `tests:` entry.** Same hard error as `cem_acpt` today (`test_data.rb:38`).
213
+ - **Profile lookup miss.** If a test case has no entry in `cem_acpt_scan.profiles.<scanner>`, fail before provisioning with a message naming the missing key.
214
+ - **License missing for ciscat run.** If `cem_acpt_scan.cis_cat_pro_license` is unset (or empty) when any test case resolves to scanner `ciscat`, raise `CemAcpt::Scan::LicenseNotFoundError` in `pre_provision_test_nodes` before any Terraform call. Openscap-only runs ignore the key entirely.
215
+ - **License source unreachable.** A `gs://` URI that 404s under `gcloud storage cp`, or a local path that does not exist, propagates through the existing `Utils::Shell.run_cmd` / `File.expand_path` error paths and surfaces as a host-side provisioning error — same channel as an unreachable `cis_cat_pro_source`.
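
The license preflight described above might look roughly like this. It is a sketch under assumptions: `check_license!` is a hypothetical name, the config is modeled as a plain hash, and the error class is a local stand-in for `CemAcpt::Scan::LicenseNotFoundError`.

```ruby
# Local stand-in for CemAcpt::Scan::LicenseNotFoundError.
class LicenseNotFoundError < StandardError; end

# Raises before any provisioning if a ciscat run has no license configured.
def check_license!(scan_config, resolved_scanners)
  # Openscap-only runs ignore the key entirely.
  return unless resolved_scanners.include?('ciscat')

  license = scan_config[:cis_cat_pro_license]
  return unless license.nil? || license.to_s.strip.empty?

  raise LicenseNotFoundError,
        'cem_acpt_scan.cis_cat_pro_license must be set when a test case resolves to ciscat'
end
```

Whether the configured path or URI is actually reachable is deliberately not checked here; that failure surfaces later through the source-resolution helpers.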
216
+
217
+ ## Constraints / Invariants
218
+
219
+ - Reuses `Provision::Terraform`, `lib/terraform/gcp/linux/main.tf`, `Config::Base`, `TestRunner::Runner`, and `Logging` unchanged in shape — no parallel implementations.
220
+ - `cem_acpt`'s existing `:goss` and `:bolt` action paths must remain functionally identical. Verified by `bundle exec rake spec` continuing to pass.
221
+ - RuboCop must remain clean (200-char line limit, the existing `.rubocop.yml`).
222
+ - No new runtime gem dependency (the daemon mirrors the existing WEBrick pattern used by `log_service.rb`; `webrick` is already installed by `provision_commands`).
223
+ - The single-instance-per-test-case provisioning model is preserved — `cem_acpt_scan` does not introduce parallel hosts per test or sharing across cases.
224
+
225
+ ## Error handling
226
+
227
+ - Subsystem-specific error classes live in a new `lib/cem_acpt/scan/errors.rb`, following the repo convention (`docs/ARCHITECTURE.md` §convention; `lib/cem_acpt/bolt/errors.rb` precedent). Defined: `CemAcpt::Scan::DaemonNotReadyError`, `CemAcpt::Scan::ProfileNotFoundError`, `CemAcpt::Scan::ScannerInvocationError`, `CemAcpt::Scan::LicenseNotFoundError`.
228
+ - Provisioning errors (Terraform failures, SSH key setup, manifest apply) flow through the existing `Runner#run`'s `rescue StandardError` (`test_runner.rb:78-82`) and are reported with the existing `TestResults` plumbing.
229
+ - Cleanup runs unconditionally via the existing `ensure` (`test_runner.rb:83-90`), respecting `--no-destroy-nodes`.
230
+
231
+ ## Non-goals
232
+
233
+ - **Windows scanning.** A follow-up ticket adds CIS-CAT Pro on Windows; this story errors out cleanly on Windows test cases.
234
+ - **AWS / non-GCP platforms.** Same single-platform constraint as `cem_acpt` today.
235
+ - **Trend reporting / scan history.** The normalized JSON is the artifact; downstream consumers (CI dashboards, AI agents) own historical analysis.
236
+ - **Image-builder integration.** `cem_acpt_image` is not modified to bake scanners into images. Scanner install is a provision-time concern only.
237
+ - **Severity-weighted scoring.** Scores are scanner-native pass-rate; severity weighting is out of scope.
238
+ - **A `:scan` action group registered alongside `:goss` / `:bolt`.** Scan mode is a different command, not a sub-action of acceptance runs.
239
+ - **Bucket-side automation of CIS-CAT Pro source and license distribution.** The operator chooses the artifacts' location (local path or a `gs://` URI of their making) and is responsible for keeping the bucket populated. A follow-up story under CEM-6508 may add managed-bucket automation — uploading on a developer's behalf, sharing one bucket across the team — but for this story the artifacts are operator-managed and the runner only consumes them.
240
+
241
+ ## Acceptance criteria
242
+
243
+ - [ ] `bundle exec exe/cem_acpt_scan -h` prints help text including `--scanner`, `--threshold`, `--scan-output`, `-m`, `-t`, `-D`.
244
+ - [ ] `bundle exec exe/cem_acpt_scan -Y` prints the merged `Config::CemAcptScan` as YAML; `-X` explains config-source provenance.
245
+ - [ ] Running `bundle exec exe/cem_acpt_scan -t cis_rhel-8_firewalld_server_2 -m <module>` against a configured GCP project provisions a node, applies the manifest, runs CIS-CAT Pro, prints normalized JSON containing the scanner-native pass percentage on stdout, and exits 0 if score ≥ threshold.
246
+ - [ ] `--scan-output report.json` writes the normalized JSON to `report.json` for a single-case run; `report.json.<case>.json` fan-out works for multi-case.
247
+ - [ ] `cem_acpt_scan` exits non-zero when any test case scores below its threshold; per-case threshold overrides via `cem_acpt_scan.test_thresholds` are honored.
248
+ - [ ] `cem_acpt`'s existing `:goss` + `:bolt` action paths still work — `bundle exec exe/cem_acpt -t <case>` is unchanged in behavior.
249
+ - [ ] A Windows test case (resolved via `os_family_for`) causes `cem_acpt_scan` to fail before provisioning with a clear error.
250
+ - [ ] A missing `cem_acpt_scan.profiles.<scanner>` entry fails before provisioning with a clear error naming the missing key.
251
+ - [ ] A missing `cem_acpt_scan.cis_cat_pro_license` for a ciscat-resolving test case fails before provisioning with a `LicenseNotFoundError` naming the missing key. Openscap-only runs do not require the key.
252
+ - [ ] A ciscat run with a valid license (either `.tar.gz` or `.zip` form) extracts the contents into `/opt/cis-cat-pro/license/` on the node, and the assessor produces a non-error scan result that the host parses into a `Scan::Result`.
253
+ - [ ] `bundle exec rake spec` passes.
254
+ - [ ] `rubocop` is clean.
255
+
256
+ ## Files touched
257
+
258
+ - `exe/cem_acpt_scan` (new) — 16-line shim, copy of `exe/cem_acpt`.
259
+ - `lib/cem_acpt.rb` (`:24-40`, `:44-53`, plus new `run_cem_acpt_scan`) — dispatch + entry-point method.
260
+ - `lib/cem_acpt/cli.rb` (`:27-95`) — new `when :cem_acpt_scan` block for scan-specific flags.
261
+ - `lib/cem_acpt/config.rb` — `require_relative 'config/cem_acpt_scan'`.
262
+ - `lib/cem_acpt/config/cem_acpt_scan.rb` (new) — `Config::CemAcptScan`. Includes new `cis_cat_pro_license` default key alongside `cis_cat_pro_source`.
263
+ - `lib/cem_acpt/test_data.rb` (`:45-46`, `:72-77`) — make `goss.yaml` optional in scan mode; preflight checks scoped by mode.
264
+ - `lib/cem_acpt/test_runner.rb` (`:120-143`) — `configure_actions` dispatches between `configure_acceptance_actions` (existing) and `configure_scan_actions` (new); `process_test_results` honors per-case thresholds in scan mode.
265
+ - `lib/cem_acpt/provision/terraform.rb` — populate the scan-mode `node_data` fields (`cis_cat_pro_bundle`, `cis_cat_pro_format`, `cis_cat_pro_license`, `cis_cat_pro_license_format`, `scan_config_json`) when `Config::CemAcptScan` is active. New helpers: `resolve_cis_cat_pro_source!`, `resolve_cis_cat_pro_license!`, `archive_format` (shared by both).
266
+ - `lib/cem_acpt/provision/terraform/linux.rb` (`:29-48`) — append scanner-install commands when scan mode is active.
267
+ - `lib/cem_acpt/scan/` (new) — scan subsystem: `daemon_client.rb` (HTTP client), `errors.rb` (`DaemonNotReadyError`, `ProfileNotFoundError`, `ScannerInvocationError`, `LicenseNotFoundError`), `result.rb` (normalized JSON), `scan_service_template.rb` (the Ruby served on-node, embedded as a string template).
268
+ - `lib/terraform/gcp/linux/main.tf` — five new optional fields on the `node_data` object schema (`cis_cat_pro_bundle`, `cis_cat_pro_format`, `cis_cat_pro_license`, `cis_cat_pro_license_format`, `scan_config_json`), all defaulting to empty strings. Two new `null_resource` blocks (`scan_config_upload`, `cis_cat_pro_upload`) filter on those fields via `for_each` so they are no-ops in acceptance mode. The `cis_cat_pro_upload` resource declares four sequential provisioners — assessor upload, assessor extraction, license upload, license extraction — relying on Terraform's in-resource provisioner ordering for the upload-before-extract guarantee.
269
+ - `spec/` — new specs mirroring the new files; existing specs touched where their fixtures change.
270
+
271
+ ## Test plan
272
+
273
+ 1. **`bundle exec rake spec`** — full unit-test suite. New specs cover:
274
+ - `Config::CemAcptScan` defaults, env-var loading (`CEM_ACPT_SCAN_*`), validation.
275
+ - `Runner#configure_actions` dispatch (acceptance vs scan).
276
+ - The new scan action: HTTP client mocked via WebMock or stubbed `Async::HTTP::Internet`. Verifies parsing, threshold evaluation, exit-code wiring.
277
+ - Discovery predicate change in `TestData::Fetcher` for both modes.
278
+ - Profile-mapping lookup miss → `ProfileNotFoundError`.
279
+ - Daemon-not-ready → `DaemonNotReadyError`.
280
+ - `resolve_cis_cat_pro_license!`: local path passes through unchanged; `gs://` URI is pulled to `working_dir` via stubbed `Utils::Shell.run_cmd` and preserves the source extension; missing key while any test case resolves to ciscat raises `LicenseNotFoundError`.
281
+ - `archive_format`: existing assessor-source coverage extended to assert the same helper accepts a license path argument — no separate `license_format` helper exists.
282
+ 2. **`bundle exec exe/cem_acpt_scan -Y`** — merged config dump round-trips correctly with both the new schema and existing shared keys.
283
+ 3. **`bundle exec exe/cem_acpt -t <case>`** — regression check that `:goss` + `:bolt` paths behave unchanged on a representative test case.
284
+ 4. **`rubocop`** — clean.
285
+
286
+ End-to-end runs against real GCP nodes are not part of this story's CI; manual sanity checks are the assignee's responsibility before merging. Those manual checks cover both license-archive formats (`.tar.gz` and `.zip`) and confirm that the extracted contents under `/opt/cis-cat-pro/license/` allow the assessor to run without a licensing error.
@@ -0,0 +1,187 @@
1
+ # CEM-6720 — Remove provisioner factory in favor of direct instantiation
2
+
3
+ Implements [RFC 0011](../docs/rfcs/0011-provisioner-factory-consistency.md).
4
+
5
+ ## Summary
6
+
7
+ `Config::Base#add_static_options!` force-sets `provisioner = 'terraform'`
8
+ after every other config source has been merged. The `Provision.new_provisioner`
9
+ factory still branches on `config.get('provisioner')` and raises for unknown
10
+ values — so the factory looks pluggable but is not. RFC 0011 recommends the
11
+ smaller change: remove the factory and the static set, and have `TestRunner`
12
+ instantiate `Provision::Terraform` directly. Re-introduce the factory when a
13
+ concrete second implementation is in flight.
14
+
15
+ ## Functional behavior
16
+
17
+ ### 1. Delete `lib/cem_acpt/provision.rb`
18
+
19
+ The whole 20-line file is the factory. Delete it outright.
20
+ `Provision::Terraform` declares its own `module CemAcpt; module Provision`
21
+ namespace at `lib/cem_acpt/provision/terraform.rb:6-7`, so the
22
+ `CemAcpt::Provision` module remains reachable as long as the consumer
23
+ requires `cem_acpt/provision/terraform` directly.
24
+
25
+ The module-level `include CemAcpt::Logging` in `provision.rb` exists only
26
+ to support the `logger.debug` call inside `new_provisioner`; both go away
27
+ with the file.
28
+
29
+ ### 2. Update the lone caller in `TestRunner`
30
+
31
+ `lib/cem_acpt/test_runner.rb:225-229`:
32
+
33
+ ```ruby
34
+ # Before
35
+ def provision_test_nodes
36
+ logger.info('CemAcpt::TestRunner') { 'Provisioning test nodes...' }
37
+ @provisioner = CemAcpt::Provision.new_provisioner(config, @run_data)
38
+ @provisioner.provision
39
+ end
40
+
41
+ # After
42
+ def provision_test_nodes
43
+ logger.info('CemAcpt::TestRunner') { 'Provisioning test nodes...' }
44
+ @provisioner = CemAcpt::Provision::Terraform.new(config, @run_data)
45
+ @provisioner.provision
46
+ end
47
+ ```
48
+
49
+ Add the corresponding `require_relative 'provision/terraform'` near the
50
+ top of `test_runner.rb`. Variable name `@provisioner` and method names
51
+ `provisioner_output` / `destroy_test_nodes` stay — they describe the
52
+ role, not the factory.
53
+
54
+ ### 3. Drop the `provisioner` static set in `Config::Base`
55
+
56
+ `lib/cem_acpt/config/base.rb:260-270`:
57
+
58
+ ```ruby
59
+ # Before
60
+ def add_static_options!(config)
61
+ config.dset('user_config.dir', user_config_dir)
62
+ add_config_explanation('user_config.dir', 'static value')
63
+ config.dset('user_config.file', user_config_file)
64
+ add_config_explanation('user_config.file', 'static value')
65
+ config.dset('provisioner', 'terraform')
66
+ add_config_explanation('provisioner', 'static value')
67
+ config.dset('terraform.dir', terraform_dir)
68
+ add_config_explanation('terraform.dir', 'static value')
69
+ set_third_party_env_vars!(config)
70
+ end
71
+
72
+ # After
73
+ def add_static_options!(config)
74
+ config.dset('user_config.dir', user_config_dir)
75
+ add_config_explanation('user_config.dir', 'static value')
76
+ config.dset('user_config.file', user_config_file)
77
+ add_config_explanation('user_config.file', 'static value')
78
+ config.dset('terraform.dir', terraform_dir)
79
+ add_config_explanation('terraform.dir', 'static value')
80
+ set_third_party_env_vars!(config)
81
+ end
82
+ ```
83
+
84
+ `provisioner` is also dropped from `BASE_VALID_KEYS` at
85
+ `lib/cem_acpt/config/base.rb:46-62`, since nothing reads it any more
86
+ and leaving it there would let users set a key that has no effect.
87
+
88
+ ### 4. Update `docs/ARCHITECTURE.md`
89
+
90
+ - **Line 65** (file-tree): drop the `provision.rb # provisioner factory`
91
+ entry; the factory file is gone.
92
+ - **§3 lines ~170-175**: in step 6a, drop `provisioner = 'terraform'`
93
+ from the list of framework-owned static keys (now three keys, not
94
+ four).
95
+ - **§7 lines 403-410**: rewrite the opening paragraph of "Provisioner:
96
+ Terraform" so it no longer references the factory or the static
97
+ override. Replacement reads roughly: "There is currently one
98
+ provisioner. `TestRunner#provision_test_nodes` instantiates
99
+ `Provision::Terraform` directly. A factory was removed in CEM-6720;
100
+ re-introduce one when there is a concrete second implementation."
101
+ - **§19 item 1 (lines 1020-1023)**: delete the item entirely.
102
+ Renumber: current item 2 (Platform constant cache, already resolved)
103
+ becomes item 1; current item 3 (`lib/terraform/image/gcp/windows/`
104
+ empty) becomes item 2.
105
+
106
+ ## Inputs / outputs / contracts
107
+
108
+ - No public-API surface change. `Provision.new_provisioner` has no known
109
+   callers outside this repo, and the only in-tree caller is
110
+   rewired to `Provision::Terraform.new`.
111
+ - `Provision::Terraform.new(config, run_data)` — unchanged signature
112
+ and behavior. The factory was a thin pass-through.
113
+
114
+ ## Edge cases
115
+
116
+ - **`-Y` output no longer shows a `provisioner` key.** The static set
117
+ was the only thing populating it, and `provisioner` is removed from
118
+ `BASE_VALID_KEYS`, so any leftover value supplied by env/user/runtime
119
+ config is dropped during `validate_config!`. Anyone scripting against
120
+ `-Y` for a `provisioner:` line gets nothing instead of `terraform`.
121
+ Acceptable: no documented contract on that key, and the change is
122
+ consistent with the framework no longer caring.
123
+ - **`-X` output no longer shows the `provisioner` key explanation.**
124
+ Same root cause; same acceptable trade-off.
125
+ - **A user who sets `provisioner: foo` in a config file or
126
+ `CEM_ACPT_PROVISIONER=foo`** previously got the value silently
127
+ overridden by the static set, then the factory raised
128
+ `ArgumentError`. Now the value is silently dropped by
129
+ `validate_config!` (with a `warn` "Config key 'provisioner' is not
130
+ usable with this command") and the runner proceeds with Terraform.
131
+ This is a small behavior change but matches the intended state — the
132
+ framework is honest that the key has no effect.
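
The drop-and-warn behavior described in the last edge case can be sketched as follows. This is a minimal illustration under stated assumptions, not the real `validate_config!`: the helper name `drop_unusable_keys` and the two-entry key list are hypothetical, and only the warning text is taken from this spec.

```ruby
# Hypothetical, shortened stand-in for BASE_VALID_KEYS (the real list lives in
# lib/cem_acpt/config/base.rb and is longer). `provisioner` is deliberately absent.
VALID_KEYS_SKETCH = %i[user_config terraform].freeze

# Returns a copy of `config` without keys that are not in `valid_keys`,
# warning once per dropped key. Hypothetical helper, for illustration only.
def drop_unusable_keys(config, valid_keys)
  config.each_with_object({}) do |(key, value), kept|
    if valid_keys.include?(key.to_sym)
      kept[key] = value
    else
      warn "Config key '#{key}' is not usable with this command"
    end
  end
end

user_supplied = { 'provisioner' => 'foo', 'terraform' => { 'dir' => '/tmp/terraform' } }
puts drop_unusable_keys(user_supplied, VALID_KEYS_SKETCH).inspect
```

In this sketch the run continues with the remaining keys, which matches the "silently dropped, runner proceeds" outcome described above.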
133
+
134
+ ## Constraints / invariants
135
+
136
+ - RuboCop must remain clean (200-char line limit, the existing
137
+ permissive `.rubocop.yml`).
138
+ - `bundle exec rake spec` must pass.
139
+ - The Linux and Windows test-node provisioning flows continue to work
140
+ end-to-end.
141
+
142
+ ## Error handling
143
+
144
+ - Removing `lib/cem_acpt/provision.rb`: file deletion, nothing to handle.
145
+ - The `ArgumentError, "Unknown provisioner #{...}"` raise path inside
146
+ the deleted factory is gone. There is no longer any "unknown
147
+ provisioner" error condition to handle, because there is no longer
148
+ any dispatch.
149
+
150
+ ## Non-goals
151
+
152
+ - Adding a real second provisioner (Pulumi, Packer, local-only).
153
+ Re-introducing the factory is explicitly deferred until that exists.
154
+ - Adding a `--provisioner NAME` CLI flag. RFC 0011's alternative path
155
+ is rejected — we are not preserving the extension surface.
156
+ - Touching the image-builder path (`Provision::TerraformCmd`,
157
+ `ImageBuilder::TerraformBuilder`). It does not go through the
158
+ factory and is unaffected.
159
+ - Any other `docs/ARCHITECTURE.md` §19 item.
160
+
161
+ ## Acceptance criteria
162
+
163
+ - [ ] `lib/cem_acpt/provision.rb` deleted; `git grep new_provisioner
164
+ lib/ exe/ spec/` returns no matches.
165
+ - [ ] `lib/cem_acpt/test_runner.rb` instantiates
166
+ `CemAcpt::Provision::Terraform` directly and `require`s
167
+ `cem_acpt/provision/terraform`.
168
+ - [ ] `Config::Base#add_static_options!` no longer sets `provisioner`;
169
+ `provisioner` removed from `BASE_VALID_KEYS`.
170
+ - [ ] `docs/ARCHITECTURE.md` updated: file-tree line 65, §3 step 6a
171
+ static-keys list, §7 opening paragraph, §19 item 1 deleted with
172
+ remaining items renumbered.
173
+ - [ ] `bundle exec rake spec` passes.
174
+ - [ ] `rubocop` is clean against the changed files.
175
+
176
+ ## Test plan
177
+
178
+ - Existing specs around `Provision::Terraform` (`terraform_cmd_spec.rb`,
179
+ `os_data_spec.rb`, `windows_spec.rb`) are unaffected — the factory
180
+ had no direct test coverage to migrate, and the `Provision::Terraform.new`
181
+ signature is unchanged.
182
+ - `spec/cem_acpt/test_runner_spec.rb` stubs `provisioner_output` rather
183
+ than the construction path, so it remains green without modification.
184
+ - Confirm `bundle exec exe/cem_acpt -Y` and `-X` run cleanly and no
185
+ longer reference `provisioner`.
186
+ - No new spec files are added; the change is a removal, and the live
187
+ end-to-end paths exercise it implicitly.
@@ -0,0 +1,168 @@
1
+ # CEM-6759 — Wire `-b <benchmark>` flag through CIS-CAT Pro scan pipeline
2
+
3
+ ## Summary
4
+
5
+ `run_ciscat` in `lib/terraform/gcp/linux/scan/scan_service.rb` never passes `-b <benchmark-xccdf-path>` to `Assessor-CLI.sh`. CIS-CAT Pro v4 requires an explicit benchmark to disambiguate profile IDs that exist across many OS benchmarks (RHEL 8, RHEL 9, Ubuntu LTS, etc.). Without it the assessor falls back to a `sessions.properties` file, fails to load it, and exits zero without scanning — silently reporting empty results. Wire a new `cem_acpt_scan.benchmarks.cis_cat` config map through `scan_post_processing!`, `scan_config_json_for`, and `run_ciscat` so that `-b /opt/cis-cat-pro/benchmarks/<filename>` is passed on every ciscat invocation.
6
+
7
+ ## Functional Behavior
8
+
9
+ ### 1. `lib/cem_acpt/scan/errors.rb` — new `BenchmarkNotFoundError`
10
+
11
+ Add alongside `ProfileNotFoundError`:
12
+
13
+ ```ruby
+ # Raised when a test case has no entry in `cem_acpt_scan.benchmarks.cis_cat`
+ # and so no benchmark XCCDF filename can be resolved. The message carries the
+ # missing config key so the operator can fix it without re-running.
+ class BenchmarkNotFoundError < StandardError
+   def initialize(test_case, scanner, config_key)
+     super("No benchmark configured for test case '#{test_case}' (scanner: #{scanner}). Set '#{config_key}' in cem_acpt config.")
+   end
+ end
+ ```
23
+
24
+ ### 2. `lib/cem_acpt/config/cem_acpt_scan.rb` — add `benchmarks` key
25
+
26
+ Add `benchmarks: { cis_cat: {} }` alongside `profiles` in the `cem_acpt_scan` defaults hash:
27
+
28
+ ```ruby
+ cem_acpt_scan: {
+   # ... existing keys ...
+   profiles: {
+     openscap: {},
+     cis_cat: {},
+   },
+   benchmarks: {
+     cis_cat: {},
+   },
+ },
+ ```
40
+
41
+ ### 3. `lib/cem_acpt/test_data.rb` — `scan_post_processing!`
42
+
43
+ After the existing profile resolution, resolve the benchmark for ciscat scans and raise `BenchmarkNotFoundError` early if missing:
44
+
45
+ ```ruby
+ # before:
+ test_data[:scan] = {
+   scanner: scanner,
+   profile: profile,
+   level: test_data[:level] || test_data['level'],
+ }
+
+ # after:
+ benchmark = nil
+ if scanner == :ciscat
+   benchmarks = @config.get('cem_acpt_scan.benchmarks.cis_cat') || {}
+   benchmark = benchmarks[test_data[:test_name]] || benchmarks[test_data[:test_name].to_sym]
+   if benchmark.nil? || benchmark.to_s.empty?
+     raise CemAcpt::Scan::BenchmarkNotFoundError.new(
+       test_data[:test_name], scanner,
+       "cem_acpt_scan.benchmarks.cis_cat.#{test_data[:test_name]}"
+     )
+   end
+ end
+
+ test_data[:scan] = {
+   scanner: scanner,
+   profile: profile,
+   level: test_data[:level] || test_data['level'],
+   benchmark: benchmark,
+ }
+ ```
73
+
74
+ `benchmark` is `nil` for openscap — it flows through harmlessly and is never read by `run_openscap`.
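
The resolution rule above can be exercised in isolation. The following is a minimal sketch, assuming a plain Hash in place of `@config.get('cem_acpt_scan.benchmarks.cis_cat')`; the helper `resolve_benchmark`, the top-level copy of the error class, and the test-case names are hypothetical and exist only for illustration.

```ruby
# Stand-in copy of the error class from section 1 (the real one is namespaced
# under CemAcpt::Scan).
class BenchmarkNotFoundError < StandardError
  def initialize(test_case, scanner, config_key)
    super("No benchmark configured for test case '#{test_case}' (scanner: #{scanner}). Set '#{config_key}' in cem_acpt config.")
  end
end

# Hypothetical helper: resolves the benchmark filename for one test case.
# `benchmarks` stands in for the `cem_acpt_scan.benchmarks.cis_cat` map.
def resolve_benchmark(scanner, test_name, benchmarks)
  return nil unless scanner == :ciscat # openscap never needs a benchmark

  map = benchmarks || {}
  # Accept both String and Symbol keys, as the spec's lookup does.
  benchmark = map[test_name] || map[test_name.to_sym]
  if benchmark.nil? || benchmark.to_s.empty?
    raise BenchmarkNotFoundError.new(test_name, scanner, "cem_acpt_scan.benchmarks.cis_cat.#{test_name}")
  end

  benchmark
end

# Made-up test-case names; the filename is the example given under Constraints.
benchmarks = { 'cis_rhel8_example' => 'CIS_Red_Hat_Enterprise_Linux_8_Benchmark_v4.0.0-xccdf.xml' }
puts resolve_benchmark(:ciscat, 'cis_rhel8_example', benchmarks)
puts resolve_benchmark(:openscap, 'stig_rhel8_example', benchmarks).inspect

begin
  resolve_benchmark(:ciscat, 'cis_rhel9_example', benchmarks)
rescue BenchmarkNotFoundError => e
  puts e.message
end
```

The early `return nil` for non-ciscat scanners is equivalent to the `if scanner == :ciscat` guard in the spec; it is written as a guard clause here only to keep the helper flat.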
75
+
76
+ ### 4. `lib/cem_acpt/provision/terraform.rb` — `scan_config_json_for`
77
+
78
+ Add `'benchmark'` to the generated JSON:
79
+
80
+ ```ruby
+ JSON.generate(
+   'scanner' => cfg[:scanner].to_s,
+   'profile' => cfg[:profile].to_s,
+   'level' => cfg[:level],
+   'datastream' => cfg[:datastream],
+   'benchmark' => cfg[:benchmark],
+ )
+ ```
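
To make the resulting payload concrete, here is a small sketch of what the generated JSON looks like for a ciscat and an openscap test case. The helper `scan_config_json_sketch` is a hypothetical stand-in that copies only the hash shape shown above, and the profile, datastream, and filename values are made up.

```ruby
require 'json'

# Hypothetical stand-in for `scan_config_json_for`; only the hash shape is
# taken from the spec.
def scan_config_json_sketch(cfg)
  JSON.generate(
    'scanner' => cfg[:scanner].to_s,
    'profile' => cfg[:profile].to_s,
    'level' => cfg[:level],
    'datastream' => cfg[:datastream],
    'benchmark' => cfg[:benchmark],
  )
end

ciscat = scan_config_json_sketch(scanner: :ciscat, profile: 'example_profile', level: 1, benchmark: 'example-xccdf.xml')
openscap = scan_config_json_sketch(scanner: :openscap, profile: 'example_stig_profile', datastream: '/path/to/example-ds.xml')

puts ciscat
puts openscap
```

Because `JSON.generate` serializes `nil` as `null`, the `benchmark` key is always present in the file. A reader that parses the file and looks up `cfg['benchmark']` gets `nil` back for openscap, which is what lets the value flow through without special handling.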
89
+
90
+ ### 5. `lib/terraform/gcp/linux/scan/scan_service.rb` — `run_ciscat` and `perform_scan`
91
+
92
+ Update the `run_ciscat` signature to accept `benchmark`, and add `-b` to the command when it is present:
93
+
94
+ ```ruby
+ # before:
+ def run_ciscat(profile, level)
+   ...
+   cmd += ['-l', level.to_s] if level
+
+ # after:
+ def run_ciscat(profile, level, benchmark)
+   ...
+   cmd += ['-b', "/opt/cis-cat-pro/benchmarks/#{benchmark}"] if benchmark
+   cmd += ['-l', level.to_s] if level
+ ```
106
+
107
+ Update the `perform_scan` call site:
108
+
109
+ ```ruby
110
+ # before:
111
+ run_ciscat(cfg['profile'], cfg['level'])
112
+
113
+ # after:
114
+ run_ciscat(cfg['profile'], cfg['level'], cfg['benchmark'])
115
+ ```
116
+
117
+ ## Input/Output Contracts
118
+
119
+ `run_ciscat` gains a third positional argument, `benchmark` (String or nil). When it is non-nil, `run_ciscat` adds `-b /opt/cis-cat-pro/benchmarks/<benchmark>` to the assessor command, immediately before `-l`. The benchmark XCCDF files ship inside the assessor bundle at `/opt/cis-cat-pro/benchmarks/`; the filename is the operator-supplied value from the config map. Everything downstream (`parse_ciscat_json`, the error handling, the `/scan` HTTP response) is unchanged.
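
The flag-ordering part of this contract can be sketched on its own. The spec elides the base assessor command, so the sketch below builds only the optional tail. `ciscat_optional_flags` and `BENCHMARK_DIR` are hypothetical names introduced for illustration.

```ruby
# Hardcoded prefix from the spec; the constant name itself is hypothetical.
BENCHMARK_DIR = '/opt/cis-cat-pro/benchmarks'

# Hypothetical helper: returns only the optional flags that `run_ciscat`
# appends. The base command is not reproduced because the spec elides it.
def ciscat_optional_flags(level, benchmark)
  flags = []
  flags += ['-b', "#{BENCHMARK_DIR}/#{benchmark}"] if benchmark # filename only, prefix is fixed
  flags += ['-l', level.to_s] if level
  flags
end

puts ciscat_optional_flags(1, 'CIS_Red_Hat_Enterprise_Linux_8_Benchmark_v4.0.0-xccdf.xml').inspect
puts ciscat_optional_flags(1, nil).inspect
```

Keeping each flag and its value as separate array elements means the command can be passed to a process-spawning call without shell quoting, assuming the real `cmd` array is used that way.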
120
+
121
+ ## Constraints / Invariants
122
+
123
+ - `/opt/cis-cat-pro/benchmarks/` is the hardcoded prefix in `scan_service.rb`. The config value is the filename only (e.g. `CIS_Red_Hat_Enterprise_Linux_8_Benchmark_v4.0.0-xccdf.xml`). This matches the directory layout inside the official CIS-CAT Pro assessor bundle.
124
+ - `benchmarks.cis_cat` is strictly per-test-case — no global fallback. A missing entry raises `BenchmarkNotFoundError` before any node is provisioned, matching the existing `ProfileNotFoundError` behavior.
125
+ - OpenSCAP is unaffected: `run_openscap` uses `--datastream`, not `-b`, and `benchmark` is `nil` for openscap test cases.
126
+ - `-b` is appended before `-l` in the command array to match the assessor's documented flag ordering.
127
+
128
+ ## Non-Goals
129
+
130
+ - **Auto-discovery of benchmark filenames.** The operator supplies the exact filename; no filesystem scan of `/opt/cis-cat-pro/benchmarks/` at provision time.
131
+ - **Benchmark file presence validation.** If the filename is wrong the assessor will error on the node; that is surfaced via CEM-6761's stderr capture.
132
+ - **Global benchmark default.** All entries must be explicit per-test-case.
133
+ - **OpenSCAP changes.** `run_openscap` and the openscap config schema are unchanged.
134
+ - **Terraform HCL changes.** The benchmark flows through `scan_config.json` (already uploaded by the `scan_config_upload` null_resource); no new Terraform provisioner is needed.
135
+
136
+ ## Tests
137
+
138
+ **`spec/cem_acpt/scan/errors_spec.rb`** — add a `BenchmarkNotFoundError` describe block mirroring the `ProfileNotFoundError` block: one example asserting the error message contains the test-case name, one asserting the config key is present.
139
+
140
+ **`spec/cem_acpt/scan/test_data_scan_spec.rb`** — extend the existing scan-mode context:
141
+ - Happy path: `data.first[:scan][:benchmark]` equals the configured filename when `benchmarks.cis_cat` is populated.
142
+ - Missing benchmark: raises `BenchmarkNotFoundError` with a message matching `cem_acpt_scan.benchmarks.cis_cat.<test_name>`.
143
+ - Openscap test case: `benchmark` is `nil` in `test_data[:scan]` (no `BenchmarkNotFoundError` raised).
144
+
145
+ **`spec/terraform/gcp/linux/scan/scan_service_spec.rb`** — add a text-grep example asserting `'-b'` and `"/opt/cis-cat-pro/benchmarks/"` appear in `run_ciscat`.
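
A text-grep check of the kind described for `scan_service_spec.rb` might look like the sketch below. It is written in plain Ruby rather than RSpec so it runs standalone. The `method_source_text` helper, its regex-based extraction, and the `SAMPLE_SOURCE` text are all assumptions made for illustration; the real spec would read the actual `scan_service.rb` file instead.

```ruby
# Made-up source text standing in for scan_service.rb; method bodies are
# abbreviated and are not the real implementation.
SAMPLE_SOURCE = <<~'RUBY'
  def run_openscap(profile, datastream)
    # ...
  end

  def run_ciscat(profile, level, benchmark)
    # ...
    cmd += ['-b', "/opt/cis-cat-pro/benchmarks/#{benchmark}"] if benchmark
    cmd += ['-l', level.to_s] if level
  end

  def perform_scan(cfg)
    run_ciscat(cfg['profile'], cfg['level'], cfg['benchmark'])
  end
RUBY

# Hypothetical helper: returns the text from `def <name>` up to the next `def`
# at the same indentation (or end of text). Crude, but enough for a grep test.
def method_source_text(source, name)
  match = source.match(/^([ \t]*)def #{Regexp.escape(name)}\b.*?(?=^\1def |\z)/m)
  match && match[0]
end

body = method_source_text(SAMPLE_SOURCE, 'run_ciscat')
puts body.include?("'-b'")
puts body.include?('/opt/cis-cat-pro/benchmarks/')
```

Scoping the grep to the method body rather than the whole file avoids a false pass if `-b` appears elsewhere in the service.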
146
+
147
+ ## Acceptance Criteria
148
+
149
+ - [ ] `run_ciscat` passes `-b /opt/cis-cat-pro/benchmarks/<benchmark>` when `cfg['benchmark']` is non-nil.
150
+ - [ ] `scan_post_processing!` raises `BenchmarkNotFoundError` (with the missing config key in the message) when the `benchmarks.cis_cat` entry is absent for a ciscat test case.
151
+ - [ ] `scan_post_processing!` does not raise for openscap test cases (benchmark is not required).
152
+ - [ ] `scan_config_json_for` writes `'benchmark'` into `scan_config.json`.
153
+ - [ ] `cem_acpt_scan.benchmarks.cis_cat` defaults to `{}` in `CemAcptScan` config.
154
+ - [ ] `bundle exec rake spec` passes.
155
+
156
+ ## Files Touched
157
+
158
+ - `lib/cem_acpt/scan/errors.rb` — `BenchmarkNotFoundError`.
159
+ - `lib/cem_acpt/config/cem_acpt_scan.rb` — `benchmarks.cis_cat` default.
160
+ - `lib/cem_acpt/test_data.rb` — benchmark resolution and error in `scan_post_processing!`.
161
+ - `lib/cem_acpt/provision/terraform.rb` — `'benchmark'` in `scan_config_json_for`.
162
+ - `lib/terraform/gcp/linux/scan/scan_service.rb` — `-b` flag in `run_ciscat`, updated `perform_scan` call.
163
+ - `spec/fixtures/config_testing/user_config_dir/terraform/gcp/linux/scan/scan_service.rb` — fixture mirror.
164
+ - `spec/fixtures/config_testing/user_config_dir/terraform_checksum.txt` — updated checksum.
165
+ - `spec/cem_acpt/scan/errors_spec.rb` — `BenchmarkNotFoundError` examples.
166
+ - `spec/cem_acpt/scan/test_data_scan_spec.rb` — benchmark resolution and error examples.
167
+ - `spec/terraform/gcp/linux/scan/scan_service_spec.rb` — text-grep for `-b` flag.
168
+ - `specifications/CEM-6759.md` — this file.