cem_acpt 0.11.0 → 0.11.2
- checksums.yaml +4 -4
- data/.gitignore +8 -0
- data/.worktreeinclude +1 -0
- data/CLAUDE.md +64 -25
- data/Gemfile.lock +1 -1
- data/README.md +20 -7
- data/docs/ARCHITECTURE.md +1042 -0
- data/docs/rfcs/0000-template.md +54 -0
- data/docs/rfcs/0001-fix-bolt-missing-skip-path.md +105 -0
- data/docs/rfcs/0002-fix-default-character-substitutions.md +119 -0
- data/docs/rfcs/0003-windows-image-builder-template.md +110 -0
- data/docs/rfcs/0004-image-name-truncation-off-by-one.md +108 -0
- data/docs/rfcs/0005-os-dispatch-replace-windows-heuristic.md +117 -0
- data/docs/rfcs/0006-configurable-windows-bucket.md +96 -0
- data/docs/rfcs/0007-logging-quiet-and-typos.md +121 -0
- data/docs/rfcs/0008-namespace-platform-classes.md +110 -0
- data/docs/rfcs/0009-bolt-log-formatter-cleanup.md +111 -0
- data/docs/rfcs/0010-dead-code-cleanup.md +83 -0
- data/docs/rfcs/0011-provisioner-factory-consistency.md +89 -0
- data/docs/rfcs/README.md +34 -0
- data/lib/cem_acpt/cli.rb +10 -1
- data/lib/cem_acpt/config/cem_acpt.rb +4 -1
- data/lib/cem_acpt/image_builder/errors.rb +24 -0
- data/lib/cem_acpt/image_builder/provision_commands.rb +15 -3
- data/lib/cem_acpt/image_builder.rb +29 -2
- data/lib/cem_acpt/image_name_builder.rb +8 -1
- data/lib/cem_acpt/platform/gcp.rb +112 -106
- data/lib/cem_acpt/platform.rb +21 -19
- data/lib/cem_acpt/provision/terraform/linux.rb +1 -1
- data/lib/cem_acpt/provision/terraform/os_data.rb +23 -0
- data/lib/cem_acpt/provision/terraform/windows.rb +7 -1
- data/lib/cem_acpt/provision/terraform.rb +20 -16
- data/lib/cem_acpt/test_runner/log_formatter/bolt_summary_results_formatter.rb +2 -1
- data/lib/cem_acpt/test_runner/log_formatter.rb +0 -1
- data/lib/cem_acpt/test_runner.rb +21 -8
- data/lib/cem_acpt/utils/winrm_runner.rb +4 -3
- data/lib/cem_acpt/utils.rb +0 -12
- data/lib/cem_acpt/version.rb +1 -1
- data/lib/cem_acpt.rb +19 -7
- data/lib/terraform/gcp/linux/main.tf +6 -1
- data/lib/terraform/image/gcp/linux/main.tf +8 -1
- data/specifications/CEM-6713.md +165 -0
- data/specifications/CEM-6714.md +271 -0
- data/specifications/CEM-6715.md +133 -0
- data/specifications/CEM-6716.md +160 -0
- data/specifications/CEM-6717.md +239 -0
- data/specifications/CEM-6718.md +120 -0
- data/specifications/CEM-6719.md +173 -0
- metadata +26 -11
- data/.claude/settings.local.json +0 -7
- data/lib/cem_acpt/action_result.rb +0 -91
- data/lib/cem_acpt/puppet_helpers.rb +0 -38
- data/lib/cem_acpt/test_runner/log_formatter/bolt_error_formatter.rb +0 -65
- data/lib/cem_acpt/test_runner/log_formatter/bolt_output_formatter.rb +0 -54
@@ -0,0 +1,54 @@
+# RFC NNNN: Title
+
+- **Status:** Proposed | Accepted | Rejected | Implemented | Withdrawn
+- **Author:** TBD
+- **Created:** YYYY-MM-DD
+- **Priority:** Critical | High | Medium | Low
+- **Category:** Bug | Refactor | Feature | Cleanup
+- **Affected components:** `lib/...`
+
+## Summary
+
+One paragraph describing the change at a glance. A reader who only reads
+this section should walk away knowing what is being proposed and why.
+
+## Background
+
+The relevant context: how the affected code works today, where it lives,
+and which sections of [`docs/ARCHITECTURE.md`](../ARCHITECTURE.md) describe
+it. Cite specific files and line numbers where helpful.
+
+## Problem
+
+State the problem precisely. Include reproduction steps, error messages,
+or pointers to the offending lines of code. Be explicit about the user-
+visible symptom (e.g. "crashes with `NoMethodError`", "silently produces
+wrong output", "is dead code that confuses contributors").
+
+## Proposal
+
+Describe the proposed change. Use code snippets, function signatures,
+or before/after diffs where they clarify the intent. If multiple files
+need to change, list them.
+
+## Alternatives Considered
+
+Other ways to solve the problem and why they were not chosen.
+
+## Risks & Migration
+
+- Backwards-compatibility implications.
+- Whether the change is user-visible (CLI, config schema, exit codes,
+  log output) and whether existing configs continue to work.
+- Required follow-ups in the consuming Puppet modules (`sce_linux`,
+  `sce_windows`, etc.) if any.
+
+## Acceptance Criteria
+
+A checklist a reviewer can run through to confirm the RFC has been
+implemented:
+
+- [ ] Code change merged.
+- [ ] RSpec coverage added or updated.
+- [ ] `docs/ARCHITECTURE.md` updated if architecture-level facts change.
+- [ ] CHANGELOG / version bump (if user-visible).
@@ -0,0 +1,105 @@
+# RFC 0001: Fix the bolt-missing skip path
+
+- **Status:** Implemented
+- **Author:** Heston Snodgrass
+- **Created:** 2026-04-30
+- **Implemented:** 2026-05-01 (CEM-6710, PR #71)
+- **Priority:** Critical
+- **Category:** Bug
+- **Affected components:** `lib/cem_acpt/test_runner.rb`, `lib/cem_acpt/actions.rb`
+
+## Summary
+
+When the `bolt` binary is not on the user's PATH the runner is supposed
+to gracefully skip the `:bolt` action group. The current rescue path
+calls two methods that do not exist on the relevant objects, so instead
+of skipping it raises `NoMethodError` and aborts the entire test run
+before any nodes are provisioned.
+
+## Background
+
+`Runner#setup_bolt` (`lib/cem_acpt/test_runner.rb:200-218`) constructs
+`Bolt::TestRunner` and calls `setup!`, which in turn shells out to
+`bolt task show`. If the binary is missing, `Utils::Shell.run_cmd`
+raises `CemAcpt::ShellCommandNotFoundError`. The current rescue is:
+
+```ruby
+rescue CemAcpt::ShellCommandNotFoundError => e
+  logger.warning('CemAcpt::TestRunner') { e.message }
+  logger.warning('CemAcpt::TestRunner') { 'Adding Bolt action to ignore list...' }
+  CemAcpt::Actions.config.ignore << :bolt
+  return
+end
+```
+
+[`docs/ARCHITECTURE.md` §8 and §19 (item 7)](../ARCHITECTURE.md#19-open-questions--observed-dead-code)
+already flag this path as broken.
+
+## Problem
+
+Two defects in three lines:
+
+1. **`logger.warning` is not defined.** `CemAcpt::Logging::Logger`
+   defines `warn` (`lib/cem_acpt/logging.rb:116`), inherited from
+   `::Logger`. Calling `warning` raises `NoMethodError` immediately.
+2. **`CemAcpt::Actions.config.ignore` is not defined.** `ActionConfig`
+   exposes only `groups`, `only`, and `except`
+   (`lib/cem_acpt/actions.rb:67-118`). Even after fixing the typo at
+   #1, this line raises a second `NoMethodError`.
+
+The rescue therefore never completes successfully. A user without
+`bolt` installed cannot run `cem_acpt` at all, despite the README and
+ARCHITECTURE doc both describing the behavior as graceful.
+
+## Proposal
+
+Use the existing `except` mechanism, which already filters actions in
+`ActionGroup#filter_actions` (`lib/cem_acpt/actions.rb:52-60`):
+
+```ruby
+rescue CemAcpt::ShellCommandNotFoundError => e
+  logger.warn('CemAcpt::TestRunner') { e.message }
+  logger.warn('CemAcpt::TestRunner') { 'Bolt binary not found on PATH; skipping :bolt action.' }
+  CemAcpt::Actions.config.except = (CemAcpt::Actions.config.except + [:bolt]).uniq
+  return
+end
+```
+
+Notes:
+
+- `except=` is the public setter; using it preserves the
+  `ArgumentError` guard on non-Array inputs.
+- Going via `(except + [:bolt]).uniq` keeps any user-supplied
+  `--except-actions` entries intact.
+- Switch both `logger.warning` calls to `logger.warn`.
+
+Alternatively, if "ignore" is preferred semantically, add an `ignore`
+accessor to `ActionConfig` and have `filter_actions` honor it. This is
+a larger surface change for no behavioral benefit over `except`.
+
+## Alternatives Considered
+
+- **Add a dedicated `ActionConfig#ignore` accessor.** Spelling matches
+  the original intent in the rescue but introduces a third filter list
+  with semantics identical to `except`.
+- **Probe `bolt` up front in `Cli`/`Config` and short-circuit there.**
+  Cleaner separation but moves the policy decision away from the place
+  that actually decides to register the action group. Defer.
+
+## Risks & Migration
+
+- No config change. No CLI surface change.
+- Users who currently work around the bug by always having `bolt`
+  installed are unaffected. Users who hit it today hit a hard crash;
+  after the change they get a `WARN` log and the test run continues
+  with the `:goss` group only.
+
+## Acceptance Criteria
+
+- [x] `lib/cem_acpt/test_runner.rb:206-208` updated as above.
+- [x] RSpec test added to `spec/cem_acpt/test_runner_spec.rb` that
+      stubs `Bolt::TestRunner#setup!` to raise
+      `ShellCommandNotFoundError` and asserts the run continues with
+      `:bolt` filtered out of the registered action groups.
+- [ ] Manual smoke test on a host without `bolt` installed: full run
+      completes with exit code reflecting only Goss results.
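Editor's sketch for RFC 0001: the block below isolates the proposed rescue behaviour so it can be run without the gem. `SketchActionConfig`, `SketchCommandNotFound`, and the `setup_bolt` helper are hypothetical stand-ins (assumptions, not the real `CemAcpt` classes); only `Logger#warn` with a progname and a block is the standard-library API the RFC relies on.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for ActionConfig; only the `except` list is modeled.
class SketchActionConfig
  attr_reader :except

  def initialize
    @except = []
  end

  # Models the guard the RFC mentions: non-Array input raises ArgumentError.
  def except=(value)
    raise ArgumentError, 'except must be an Array' unless value.is_a?(Array)

    @except = value
  end
end

# Hypothetical stand-in for CemAcpt::ShellCommandNotFoundError.
class SketchCommandNotFound < StandardError; end

# Runs the given setup block; if the binary is missing, warn and
# add :bolt to the except list without dropping existing entries.
def setup_bolt(config, logger)
  yield
  :bolt_ready
rescue SketchCommandNotFound => e
  logger.warn('Sketch::TestRunner') { e.message }
  logger.warn('Sketch::TestRunner') { 'Bolt binary not found on PATH; skipping :bolt action.' }
  config.except = (config.except + [:bolt]).uniq
  :bolt_skipped
end

# Usage: a user-supplied entry survives, and :bolt is appended once.
config = SketchActionConfig.new
config.except = [:custom]
log_io = StringIO.new
setup_bolt(config, Logger.new(log_io)) { raise SketchCommandNotFound, 'bolt: command not found' }
p config.except
```

Going through the setter rather than mutating the array in place is what keeps the type guard in play, which appears to be the point of the RFC's first note.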
@@ -0,0 +1,119 @@
+# RFC 0002: Fix malformed default `image_name_builder.character_substitutions`
+
+- **Status:** Implemented
+- **Author:** Heston Snodgrass
+- **Created:** 2026-04-30
+- **Implemented:** 2026-05-01 (CEM-6711, [PR #72](https://github.com/puppetlabs/cem_acpt/pull/72))
+- **Priority:** Critical
+- **Category:** Bug
+- **Affected components:** `lib/cem_acpt/config/cem_acpt.rb`, `lib/cem_acpt/image_name_builder.rb`
+
+## Summary
+
+The default value for `image_name_builder.character_substitutions` in
+`Config::CemAcpt::DEFAULTS` is a single 2-item array, but
+`ImageNameBuilder#character_substitutions` expects an *array of*
+2-item arrays. Iterating the default produces `nil` second arguments
+to `String#gsub!`, which raises `TypeError` and aborts the run.
+
+## Background
+
+`Config::CemAcpt#defaults` (`lib/cem_acpt/config/cem_acpt.rb:30-34`):
+
+```ruby
+image_name_builder: {
+  character_substitutions: ['_', '-'],
+  parts: ['cem-acpt', '$image_fam', '$collection', '$firewall'],
+  join_with: '-',
+},
+```
+
+`ImageNameBuilder#character_substitutions`
+(`lib/cem_acpt/image_name_builder.rb:85-93`):
+
+```ruby
+def character_substitutions(name)
+  return name unless @config[:character_substitutions]
+
+  subbed_name = name
+  @config[:character_substitutions].each do |char_sub|
+    subbed_name.gsub!(char_sub[0], char_sub[1])
+  end
+  subbed_name
+end
+```
+
+The class-level docstring (line 16) and `sample_config.yaml` both
+describe the value as "An array of 2-item arrays" (e.g.
+`[['_', '-'], ['.', '-']]`). [`docs/ARCHITECTURE.md` §19 (item 10)](../ARCHITECTURE.md#19-open-questions--observed-dead-code)
+flags this as a default-vs-shape mismatch.
+
+## Problem
+
+With the default config:
+
+```ruby
+config[:image_name_builder][:character_substitutions] # => ['_', '-']
+  .each do |char_sub|
+    # iter 1: char_sub == '_', char_sub[0] == '_', char_sub[1] == nil
+    subbed_name.gsub!('_', nil) # => TypeError: no implicit conversion of nil into String
+  end
+```
+
+The bug is masked because most users override `image_name_builder` with
+the documented nested form. Anyone relying on defaults — including a
+fresh contributor running `cem_acpt -Y` and then a real run — hits the
+crash.
+
+## Proposal
+
+Change the default to the documented nested shape:
+
+```ruby
+image_name_builder: {
+  character_substitutions: [['_', '-']],
+  ...
+}
+```
+
+Additionally, add a defensive shape check in `ImageNameBuilder` so a
+future malformed override fails fast with a clear message instead of
+the same `TypeError`:
+
+```ruby
+def character_substitutions(name)
+  return name unless @config[:character_substitutions]
+
+  subs = Array(@config[:character_substitutions])
+  unless subs.all? { |s| s.is_a?(Array) && s.size == 2 }
+    raise ArgumentError,
+          'image_name_builder.character_substitutions must be an array of 2-item arrays'
+  end
+  ...
+end
+```
+
+## Alternatives Considered
+
+- **Auto-promote a flat 2-item array to `[['_', '-']]`.** Tempting but
+  ambiguous — `['_', '-', '.', '-']` would be even more confusing.
+- **Drop the default entirely.** Would force everyone to override.
+  The default is genuinely useful (GCP image names disallow `_`); keep
+  it, just spell it correctly.
+
+## Risks & Migration
+
+- Any user who happened to depend on the current (broken) default
+  cannot exist, since the default crashes.
+- A user who has correctly specified the nested form is unaffected by
+  the default change but will get the new validation error if their
+  override is malformed.
+
+## Acceptance Criteria
+
+- [x] Default updated to `[['_', '-']]`.
+- [x] Validation added to `ImageNameBuilder#character_substitutions`.
+- [x] RSpec test confirms a default-config `ImageNameBuilder.build`
+      runs without error.
+- [x] RSpec test confirms a flat 2-item array now raises a clear
+      `ArgumentError` rather than `TypeError`.
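Editor's sketch for RFC 0002: a standalone version of the shape check, written as a free function so it runs outside the gem. The name `apply_character_substitutions` and its two-argument signature are assumptions for illustration; the real method reads `@config`. Unlike the original, this sketch uses non-mutating `gsub`, which is a deliberate substitution and not what the RFC prescribes.

```ruby
# Applies each [from, to] pair in order and returns a new string.
# Raises ArgumentError when the substitutions are not an array of 2-item arrays.
def apply_character_substitutions(name, substitutions)
  return name if substitutions.nil?

  subs = Array(substitutions)
  unless subs.all? { |s| s.is_a?(Array) && s.size == 2 }
    raise ArgumentError,
          'image_name_builder.character_substitutions must be an array of 2-item arrays'
  end

  # reduce threads the partially substituted name through every pair.
  subs.reduce(name) { |acc, (from, to)| acc.gsub(from, to) }
end

# Usage: the documented nested form works; the flat form now fails fast.
puts apply_character_substitutions('cem_acpt_rhel.8', [['_', '-'], ['.', '-']])

begin
  apply_character_substitutions('cem_acpt', ['_', '-'])
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

The flat default `['_', '-']` fails the `all?` check because each element is a `String`, not an `Array`, so the caller sees the descriptive `ArgumentError` rather than a `TypeError` from deep inside `gsub`.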
@@ -0,0 +1,110 @@
+# RFC 0003: Ship a Windows image-builder Terraform template (or fail clearly)
+
+- **Status:** Short term implemented (CEM-6712); long term still proposed
+- **Author:** TBD
+- **Created:** 2026-04-30
+- **Updated:** 2026-05-01
+- **Priority:** Critical
+- **Category:** Bug
+- **Affected components:** `lib/terraform/image/gcp/windows/`, `lib/cem_acpt/image_builder.rb`, `lib/cem_acpt/config/cem_acpt_image.rb`
+
+## Summary
+
+`lib/terraform/image/gcp/windows/` currently contains only an empty
+`.keep` file — there is no `main.tf`. The image builder routes any
+Windows entry in `config.images` to this directory and runs
+`terraform init` against it, which fails with a confusing "no
+configuration files" error rather than a meaningful one.
+
+## Background
+
+The image builder has parallel Linux and Windows working trees:
+
+```
+lib/terraform/image/gcp/
+├── linux/ # main.tf, provisioners, etc.
+└── windows/ # only .keep
+```
+
+`ImageBuilder::TerraformBuilder#in_os_dir` (`lib/cem_acpt/image_builder.rb`)
+`Dir.chdir`s to each populated tree and runs the same Terraform CLI
+choreography described in [`docs/ARCHITECTURE.md` §12](../ARCHITECTURE.md#12-cem_acpt_image-image-builder-lifecycle).
+[§19 (item 9)](../ARCHITECTURE.md#19-open-questions--observed-dead-code)
+flags the missing template.
+
+## Problem
+
+There are two distinct user-visible symptoms:
+
+1. A user who configures any Windows image entry (or doesn't pass
+   `--no-windows`) gets a `terraform init` failure with no actionable
+   error message.
+2. There is no signal in the repo that Windows image building is
+   unsupported — the directory and the code path that routes to it
+   both exist.
+
+## Proposal
+
+Two complementary changes.
+
+### Short term: clear, early failure (implemented in CEM-6712)
+
+`ImageBuilder::TerraformBuilder#assert_template_present!` checks for
+`main.tf` under `File.join(@image_terraform_dir, os_str)` and, if it
+is missing, raises `ImageBuilder::MissingTemplateError` (defined in
+`lib/cem_acpt/image_builder/errors.rb`) with a message that names the
+missing path and points to `--no-#{os_str}`.
+
+It is invoked from `run` for every OS bucket in `image_types`,
+*before* the working directory is created and before `terraform init`,
+so the misconfiguration is caught even in `--dry-run` mode.
+
+### Long term: ship the template
+
+Author `lib/terraform/image/gcp/windows/main.tf` modeled on the Linux
+template. The Windows image-build node needs to:
+
+- accept the same vars (`platform_data`, `node_data`, `provision_commands`,
+  `puppet_auth_token`),
+- create a `google_compute_instance` with the appropriate Windows
+  base image and a metadata block enabling WinRM,
+- wait for the instance to come up, then run `provision_commands` over
+  WinRM (the existing `Utils::WinRMRunner` is already set up for this
+  in the test path; the image-build path can reuse the same
+  PowerShell shape).
+
+A separate follow-up RFC may be appropriate for the WinRM-from-Terraform
+choreography, since the current Windows test path runs PowerShell from
+Ruby rather than from Terraform.
+
+## Alternatives Considered
+
+- **Delete `lib/terraform/image/gcp/windows/` and reject Windows
+  entries in config.** Cleaner short term, but kills the option of
+  ever building Windows images.
+- **Lean on packer instead of Terraform for Windows.** A larger
+  architectural change; out of scope here.
+
+## Risks & Migration
+
+- Short-term change is purely additive (a new error message). No
+  existing config breaks.
+- Long-term template is feature work. Should be gated behind tests
+  before being declared supported.
+
+## Acceptance Criteria
+
+Short-term (CEM-6712, complete):
+
+- [x] `ImageBuilder::TerraformBuilder#assert_template_present!` raises
+      `ImageBuilder::MissingTemplateError` if an OS bucket has images
+      configured without a template on disk.
+- [x] `spec/cem_acpt/image_builder_spec.rb` asserts the error message
+      names the missing template path and points to `--no-<os>`.
+
+Long-term (separate PR(s)):
+
+- [ ] `main.tf` exists under `lib/terraform/image/gcp/windows/`.
+- [ ] An end-to-end Windows image build succeeds in CI.
+- [ ] [`docs/ARCHITECTURE.md` §12](../ARCHITECTURE.md#12-cem_acpt_image-image-builder-lifecycle)
+      and §19 are updated.
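Editor's sketch for RFC 0003: one way the short-term template check could look, assuming only what the RFC states (look for `main.tf` under `File.join(image_terraform_dir, os_str)`, raise an error naming the path and `--no-<os>`). `SketchMissingTemplateError` and the free-function signature are hypothetical; the exact message wording in the gem is not shown in this diff and is invented here.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for ImageBuilder::MissingTemplateError.
class SketchMissingTemplateError < StandardError; end

# Returns the template path when it exists; otherwise raises with a
# message that names the missing file and the opt-out flag.
def assert_template_present!(image_terraform_dir, os_str)
  template = File.join(image_terraform_dir, os_str, 'main.tf')
  return template if File.file?(template)

  raise SketchMissingTemplateError,
        "No Terraform template found at #{template}. " \
        "Add one or pass --no-#{os_str} to skip #{os_str} images."
end

# Usage: recreate the layout from the RFC in a temp dir and check both buckets.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'linux'))
  FileUtils.mkdir_p(File.join(root, 'windows'))
  FileUtils.touch(File.join(root, 'linux', 'main.tf'))
  FileUtils.touch(File.join(root, 'windows', '.keep'))

  assert_template_present!(root, 'linux')
  begin
    assert_template_present!(root, 'windows')
  rescue SketchMissingTemplateError => e
    puts e.message
  end
end
```

Because the check only touches the filesystem, it can run before any working directory or `terraform init` call, which is how the RFC says a dry run still catches the misconfiguration.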
@@ -0,0 +1,108 @@
+# RFC 0004: Fix image-name truncation off-by-one
+
+- **Status:** Implemented
+- **Author:** TBD
+- **Created:** 2026-04-30
+- **Implemented:** 2026-05-04 (CEM-6713)
+- **Priority:** High
+- **Category:** Bug
+- **Affected components:** `lib/cem_acpt/image_builder.rb`
+
+## Summary
+
+`ImageBuilder::TerraformBuilder#image_name_from_image_family` produces
+an image name up to 65 characters long. GCE image names are limited
+to 63 characters (RFC 1035 label rule). The slice was clearly intended
+to enforce the limit but uses an inclusive-range off-by-one and the
+wrong cap. As written it enforces neither 63 nor 64 and can silently
+produce names that GCE rejects.
+
+## Background
+
+`lib/cem_acpt/image_builder.rb:112-114`:
+
+```ruby
+def image_name_from_image_family(image_family)
+  "#{image_family}-v#{@start_time.to_i}"[0..64]
+end
+```
+
+`String#[0..64]` is **inclusive** on both ends — it returns up to 65
+characters. [`docs/ARCHITECTURE.md` §12 / §19 (item 11)](../ARCHITECTURE.md#19-open-questions--observed-dead-code)
+note the off-by-one.
+
+GCE rules (per [GCP docs][1]): image name must match
+`[a-z]([-a-z0-9]*[a-z0-9])?`, length 1-63.
+
+[1]: https://cloud.google.com/compute/docs/reference/rest/v1/images
+
+## Problem
+
+- An `image_family` of length ≥ 52 plus `-v` (2 chars) plus a 10-digit
+  unix timestamp (12 chars total) produces a name of 64+ chars,
+  which the current slice leaves at 64 or 65 chars, a length GCE
+  rejects.
+- Independently of the length limit, the truncation can land on the `-v`
+  separator or mid-timestamp, producing a value that collides with
+  another build in the same family.
+
+## Proposal
+
+Use an exclusive cap and align with GCE's documented 63-char limit:
+
+```ruby
+GCE_IMAGE_NAME_MAX = 63
+
+def image_name_from_image_family(image_family)
+  full = "#{image_family}-v#{@start_time.to_i}"
+  return full if full.length <= GCE_IMAGE_NAME_MAX
+
+  # Preserve the timestamp suffix; truncate the family.
+  suffix = "-v#{@start_time.to_i}"
+  prefix = image_family[0, GCE_IMAGE_NAME_MAX - suffix.length]
+  prefix.sub(/-+\z/, '') + suffix
+end
+```
+
+This guarantees:
+
+- Length ≤ 63.
+- The timestamp suffix is intact, so two consecutive builds in the
+  same family never collide.
+- Trailing dashes from a clipped prefix are removed (GCE rejects names
+  ending in `-`).
+
+Add a unit test covering: short name unchanged, exact 63-char name
+unchanged, 64-char overflow truncated cleanly, prefix-ending-in-dash
+case.
+
+## Alternatives Considered
+
+- **Refuse to build if `image_family` itself is over the limit.**
+  Catches user error but doesn't help when timestamps push a
+  borderline name over.
+- **Hash the family into a fixed prefix.** Stable but loses
+  human-readability of image names.
+
+## Risks & Migration
+
+- User-visible change: image names that were previously clipped at 65
+  chars (and silently failed at GCE) will now succeed.
+- No config schema change.
+
+## Acceptance Criteria
+
+- [x] `image_name_from_image_family` enforces a 63-char cap.
+- [x] Generated names never end in `-`.
+- [x] RSpec coverage for the four cases above.
+- [x] [`docs/ARCHITECTURE.md` §12](../ARCHITECTURE.md#12-cem_acpt_image-image-builder-lifecycle)
+      updated to reflect the corrected behavior.
+
+## Implementation
+
+Implemented in CEM-6713. `GCE_IMAGE_NAME_MAX = 63` is defined on
+`TerraformBuilder`; `image_name_from_image_family` clips the family
+portion when needed and preserves the `-v<unix_ts>` suffix verbatim.
+RSpec coverage lives in
+[`spec/cem_acpt/image_builder_spec.rb`](../../spec/cem_acpt/image_builder_spec.rb)
+under `#image_name_from_image_family`.
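Editor's sketch for RFC 0004: the arithmetic is easier to see with the old and new behaviour side by side. Both functions below take the timestamp as an explicit argument (an assumption made so the example is deterministic; the gem reads `@start_time`), and the names `legacy_image_name` and `capped_image_name` are hypothetical.

```ruby
GCE_IMAGE_NAME_MAX = 63

# Old behaviour: an inclusive 0..64 range keeps up to 65 characters.
def legacy_image_name(image_family, timestamp)
  "#{image_family}-v#{timestamp}"[0..64]
end

# Proposed behaviour: clip the family, keep the "-v<timestamp>" suffix whole,
# and drop any dashes left dangling at the end of the clipped family.
def capped_image_name(image_family, timestamp)
  suffix = "-v#{timestamp}"
  full = "#{image_family}#{suffix}"
  return full if full.length <= GCE_IMAGE_NAME_MAX

  prefix = image_family[0, GCE_IMAGE_NAME_MAX - suffix.length]
  prefix.sub(/-+\z/, '') + suffix
end

timestamp = 1_700_000_000        # 10 digits, so the suffix is 12 chars
long_family = 'a' * 60           # 60 + 12 = 72 chars before truncation

puts legacy_image_name(long_family, timestamp).length   # 65
puts capped_image_name(long_family, timestamp).length   # 63
puts capped_image_name(long_family, timestamp).end_with?("-v#{timestamp}")
```

With a 12-character suffix, the family is allowed at most 51 characters, which matches the boundary the RFC's Problem section works from.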
@@ -0,0 +1,117 @@
+# RFC 0005: Replace `tests.first.include?('windows')` with explicit OS dispatch
+
+- **Status:** Implemented
+- **Author:** TBD
+- **Created:** 2026-04-30
+- **Implemented:** 2026-05-05 (CEM-6714)
+- **Priority:** High
+- **Category:** Refactor
+- **Affected components:** `lib/cem_acpt/test_runner.rb`, `lib/cem_acpt/provision/terraform/`, `lib/cem_acpt/test_data.rb`
+
+## Summary
+
+The test runner decides whether to enter the Windows code path with
+`config.get('tests').first.include?('windows')`. This is a fragile
+substring match against an arbitrary test name and breaks down the
+moment a single run mixes OS targets or a test name happens to contain
+the substring. Replace it with an OS classification derived from the
+already-existing `Provision::OsData` machinery.
+
+## Background
+
+`lib/cem_acpt/test_runner.rb:65`:
+
+```ruby
+if config.get('tests').first.include? 'windows'
+  upload_module_to_bucket
+  @instance_names_ips.each { |k, v| ... WinRMRunner ... }
+end
+```
+
+`Provision::Terraform` already classifies the run by inspecting
+`@provision_data[:test_data].first[:test_name]` against
+`Linux.valid_names` and `Windows.valid_names` to pick a backend
+(see [`docs/ARCHITECTURE.md` §7](../ARCHITECTURE.md#7-provisioner-terraform)).
+That logic is the source of truth for "which OS family does this run
+target"; the runner is reinventing — and weakening — it.
+
+## Problem
+
+1. **Fragile match.** `tests.first.include?('windows')` is a
+   substring check. A test named `cis_rhel-8_firewalld_windowserver_2`
+   would falsely route to the Windows path.
+2. **Only inspects the first test.** A run with a mix of
+   `windows-2022_…` and `rhel-8_…` test names silently mishandles all
+   tests after the first.
+3. **Couples the runner to test-name spelling.** Renaming the
+   convention (`win` vs `windows`) breaks the heuristic.
+4. **Duplicates logic.** `Provision::OsData.use_for?` already answers
+   "is this a Windows test name?" correctly.
+
+## Proposal
+
+Introduce a single helper in `Provision::OsData` (or surface the one
+that exists) and consume it from the runner:
+
+```ruby
+# lib/cem_acpt/provision/terraform/os_data.rb
+def self.os_family_for(test_name)
+  return :windows if Windows.use_for?(test_name)
+  return :linux if Linux.use_for?(test_name)
+  raise Error, "Cannot determine OS family for test name: #{test_name}"
+end
+```
+
+Then in the runner:
+
+```ruby
+windows_tests, linux_tests = config.get('tests').partition do |t|
+  CemAcpt::Provision::OsData.os_family_for(t) == :windows
+end
+
+unless windows_tests.empty?
+  raise 'Mixed Windows and Linux runs are not supported' unless linux_tests.empty?
+  upload_module_to_bucket
+  @instance_names_ips.each do |name, info|
+    win_node = ...
+    win_node.run
+  end
+end
+```
+
+This:
+
+- moves the OS decision to the place that already owns it,
+- explicitly surfaces the "mixed run" case rather than silently
+  doing the wrong thing,
+- removes the substring-match foot-gun.
+
+A follow-up could lift this further: per-test OS dispatch (so a single
+`cem_acpt` invocation can mix Linux and Windows tests). That is
+larger; this RFC is just about closing the obvious correctness gap.
+
+## Alternatives Considered
+
+- **Add an explicit `os: windows` config knob.** Possible, but the
+  test names already encode the OS and the provisioner already
+  parses them.
+- **Tag tests with metadata at discovery time.** Cleaner long term;
+  out of scope for this RFC.
+
+## Risks & Migration
+
+- Behavior change for the (unsupported) mixed-OS case: it now raises
+  rather than silently mis-routing. This is the desired change.
+- A test directory whose name does not match either `Linux.valid_names`
+  or `Windows.valid_names` would now fail fast at preflight rather than
+  later in provisioning. This too is desired.
+
+## Acceptance Criteria
+
+- [ ] `Provision::OsData.os_family_for` (or equivalent) added with
+      RSpec coverage.
+- [ ] `test_runner.rb` no longer references `tests.first.include?`.
+- [ ] RSpec test demonstrates a mixed-OS config produces a clear
+      error.
+- [ ] [`docs/ARCHITECTURE.md` §4 / §13](../ARCHITECTURE.md#4-cem_acpt-test-runner-lifecycle)
+      updated.
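Editor's sketch for RFC 0005: a self-contained illustration of classifying test names by OS family instead of by substring. Everything named `Sketch...` is hypothetical. In particular, the `FAMILIES` table and the rule "the OS is the second underscore-separated segment, up to the first dash" are assumptions inferred from the example names in the RFC, not the gem's actual `valid_names` or `use_for?` logic.

```ruby
module SketchOsData
  # Assumed OS name lists; the real ones live in Linux.valid_names and
  # Windows.valid_names, which this diff does not show.
  FAMILIES = {
    windows: %w[windows],
    linux: %w[rhel alma oel rocky ubuntu]
  }.freeze

  # Extracts the OS token (e.g. "rhel" from "cis_rhel-8_...") and maps it
  # to a family; unknown names raise instead of being guessed.
  def self.os_family_for(test_name)
    os_name = test_name.split('_')[1].to_s.split('-').first
    FAMILIES.each { |family, names| return family if names.include?(os_name) }
    raise ArgumentError, "Cannot determine OS family for test name: #{test_name}"
  end
end

# Classifies a whole run and rejects mixed Windows/Linux test lists.
def classify_run(tests)
  windows, linux = tests.partition { |t| SketchOsData.os_family_for(t) == :windows }
  unless windows.empty? || linux.empty?
    raise ArgumentError, 'Mixed Windows and Linux runs are not supported'
  end

  windows.empty? ? :linux : :windows
end

# Usage: the RFC's false-positive name now classifies as Linux.
tricky = 'cis_rhel-8_firewalld_windowserver_2'
puts tricky.include?('windows')            # the old heuristic says true
puts SketchOsData.os_family_for(tricky)    # token-based lookup says linux
```

Matching on a parsed token rather than the raw string is what removes the false positive; the mixed-run guard then makes the unsupported case explicit instead of letting the first test decide for all of them.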