@htekdev/actions-debugger 1.0.2 → 1.0.4
- package/errors/caching-artifacts/artifact-download-no-artifacts-found.yml +118 -0
- package/errors/concurrency-timing/job-stuck-waiting-for-runner.yml +105 -0
- package/errors/concurrency-timing/matrix-fail-fast-sibling-cancellation.yml +113 -0
- package/errors/concurrency-timing/timeout-minutes-job-killed.yml +107 -0
- package/errors/known-unsolved/github-step-summary-size-limit.yml +112 -0
- package/errors/known-unsolved/job-maximum-execution-time.yml +127 -0
- package/errors/runner-environment/macos-14-sonoma-eol.yml +89 -0
- package/errors/runner-environment/macos-latest-to-macos-26.yml +127 -0
- package/errors/runner-environment/powershell-74-to-76-upgrade.yml +112 -0
- package/errors/runner-environment/service-container-unhealthy.yml +126 -0
- package/errors/runner-environment/windows-latest-vs2026-migration.yml +131 -0
- package/errors/silent-failures/hashfiles-empty-string-cache-collision.yml +96 -0
- package/errors/triggers/environment-protection-rules-silent-block.yml +105 -0
- package/errors/yaml-syntax/env-context-unavailable-job-level.yml +109 -0
- package/package.json +1 -1
@@ -0,0 +1,127 @@
+id: known-unsolved-009
+title: "Job Killed After Maximum Execution Time (6h Hosted / 35-Day Workflow)"
+category: known-unsolved
+severity: limitation
+tags:
+  - timeout
+  - execution-time
+  - job-limits
+  - platform-limit
+  - self-hosted
+  - workflow-duration
+  - limitation
+patterns:
+  - regex: "The job running has exceeded the maximum execution time"
+    flags: "i"
+  - regex: "exceeded the maximum (?:time|execution time)"
+    flags: "i"
+  - regex: "job .* exceeded .* maximum"
+    flags: "i"
+error_messages:
+  - "The job running on runner GitHub Actions X has exceeded the maximum execution time of 360 minutes."
+  - "The job running has exceeded the maximum execution time"
+root_cause: |
+  GitHub Actions enforces hard platform-level execution time limits that cannot be
+  overridden or extended by workflow configuration. These limits exist to protect
+  shared infrastructure and prevent runaway jobs from consuming unlimited resources.
+
+  **GitHub-hosted runner limits:**
+  - Maximum job execution time: **6 hours** (360 minutes)
+  - Maximum workflow run time: **35 days** (across all jobs, including queued time)
+  - Default `timeout-minutes` when not set: **360 minutes** (6 hours)
+
+  **Self-hosted runner limits:**
+  - Maximum job execution time: **5 days** (7,200 minutes) by default
+  - Maximum workflow run time: **35 days** (same as hosted)
+  - Self-hosted limits can be customized in enterprise plans via org/enterprise policies
+
+  **When limits are hit:**
+  - The runner process is sent a SIGTERM (graceful) then SIGKILL (forced) after a grace period
+  - The job is marked CANCELLED (not FAILED) in the UI
+  - The log message "The job running has exceeded the maximum execution time" appears in
+    the runner log (may be visible in the step logs depending on where the runner was killed)
+  - Any `post:` steps for active actions (e.g., cache save, artifact upload) are skipped
+  - No email notification is sent to the repo owner about the cancellation
+
+  **Why this is a limitation, not just misconfiguration:**
+  - Setting `timeout-minutes` above 360 (6 hours) does not extend the GitHub-hosted 6h cap
+  - The workflow `timeout-minutes` field cannot override the platform cap on GitHub-hosted runners
+  - Jobs requiring more than 6 hours on GitHub-hosted runners have NO supported path without
+    migrating to self-hosted or restructuring the job into multiple shorter sequential jobs
+fix: |
+  There is no way to extend the GitHub-hosted runner 6-hour job cap. Options:
+
+  1. **Break the job into smaller sequential jobs** — split long-running work (e.g., build
+     artifacts first, test in separate parallel jobs, deploy last). Each job has its own
+     6-hour budget.
+
+  2. **Migrate to self-hosted runners** — self-hosted runners support up to 5-day jobs.
+     Use actions-runner-controller (ARC) or cloud auto-scaling for elastic capacity.
+
+  3. **Optimize the slow step** — profile build/test times; parallelize with matrix
+     strategy; use incremental builds or test sharding to reduce per-job duration.
+
+  4. **Use caching aggressively** — `actions/cache` reduces download/build time between
+     runs, but does not extend limits.
+fix_code:
+  - language: yaml
+    label: "Split a long job into sequential jobs to stay within 6h per job"
+    code: |
+      jobs:
+        build:
+          runs-on: ubuntu-latest
+          timeout-minutes: 120  # 2h budget for build
+          steps:
+            - uses: actions/checkout@v4
+            - name: Build
+              run: make build-release
+            - name: Upload build artifact
+              id: upload
+              uses: actions/upload-artifact@v4
+              with:
+                name: release-build
+                path: dist/
+
+        # Separate job — gets its own 6h budget
+        test:
+          needs: build
+          runs-on: ubuntu-latest
+          timeout-minutes: 180  # 3h budget for tests
+          steps:
+            - uses: actions/checkout@v4  # the Makefile lives in the repository
+            - uses: actions/download-artifact@v4
+              with:
+                name: release-build  # same name the build job uploaded
+                path: dist/
+            - run: make test-full
+  - language: yaml
+    label: "Self-hosted runner for jobs requiring more than 6 hours"
+    code: |
+      jobs:
+        long-running-job:
+          # Self-hosted runners support up to 5-day job duration
+          runs-on: [self-hosted, linux, x64]
+          timeout-minutes: 2880  # 48h — only possible on self-hosted
+          steps:
+            - uses: actions/checkout@v4
+            - name: Long-running process
+              run: ./scripts/full-dataset-processing.sh
+prevention:
+  - "Set explicit `timeout-minutes` on every job — don't rely on the implicit 6h GitHub-hosted cap as your only safeguard."
+  - "Profile job duration regularly and alert when a job's P99 duration approaches 80% of its timeout budget."
+  - "Parallelize test suites using matrix strategy or `actions/github-script` dynamic matrix generation to reduce per-job time."
+  - "Use self-hosted runners for any workflow that legitimately requires more than 2-3 hours per job (e.g., large model training, full database rebuild, exhaustive integration tests)."
+  - "Be aware that post-run actions (cache save, artifact upload) will NOT execute if the parent job is killed for exceeding the time limit."
+docs:
+  - url: "https://docs.github.com/en/actions/administering-github-actions/usage-limits-billing-and-administration#usage-limits"
+    label: "Usage limits: job execution time and workflow run time"
+  - url: "https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners#usage-limits"
+    label: "Self-hosted runner usage limits"
+  - url: "https://github.com/orgs/community/discussions/48790"
+    label: "Community: Workflow run time limit 35 days"
+  - url: "https://github.com/orgs/community/discussions/150900"
+    label: "Community: Job cancellation after 6 hours"
+  - url: "https://stackoverflow.com/questions/70187174/github-actions-self-hosted-runner-the-job-running-has-exceeded-the-maximum-exe"
+    label: "Stack Overflow: The job running has exceeded the maximum execution time"
+  - url: "https://github.com/actions/actions-runner-controller"
+    label: "Actions Runner Controller (ARC) — Kubernetes-based self-hosted runner auto-scaling"
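Reviewer sketch (not part of the package): the `known-unsolved-009` entry lists matrix parallelization and test sharding as an option but ships no `fix_code` block for it. A minimal illustration follows. It assumes a hypothetical `make test-shard` target that reads `SHARD_INDEX` and `SHARD_TOTAL` from the environment; the target and both variable names are invented for this example.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 90  # explicit per-shard budget, well under the 6-hour cap
    strategy:
      fail-fast: false   # let the remaining shards finish if one fails
      matrix:
        shard: [0, 1, 2, 3]
    steps:
      - uses: actions/checkout@v4
      - name: Run one quarter of the test suite
        env:
          SHARD_INDEX: ${{ matrix.shard }}  # hypothetical variable read by the test runner
          SHARD_TOTAL: 4                    # hypothetical variable; must match the matrix size
        run: make test-shard                # hypothetical Makefile target
```

Each matrix entry is scheduled as its own job, so each shard gets a separate time budget instead of sharing one.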
@@ -0,0 +1,89 @@
+id: runner-environment-018
+title: "macOS 14 Sonoma Runner Deprecation — EOL November 2026"
+category: runner-environment
+severity: warning
+tags:
+  - macos
+  - deprecation
+  - runner-image
+  - eol
+  - migration
+patterns:
+  - regex: "##\\[error\\]This request was rejected because.*macos-14"
+    flags: "i"
+  - regex: "Image.*macos-14.*deprecated|macos-14.*no longer supported"
+    flags: "i"
+  - regex: "The requested image.*macos-14.*not available"
+    flags: "i"
+error_messages:
+  - "##[error]This request was rejected because the runner label 'macos-14' is no longer supported."
+  - "The macOS 14 image has been retired. Please update to macos-15 or macos-latest."
+root_cause: |
+  GitHub announced the deprecation of `macOS 14 Sonoma` runner images on the following schedule:
+  - **Deprecation begins**: July 6, 2026 — longer queue times during peak hours, brownout periods
+  - **Full retirement**: November 2, 2026 — jobs using `macos-14` will fail permanently
+
+  GitHub maintains only the latest two stable macOS major versions. Since macOS 26 Tahoe is now
+  GA on GitHub Actions, macOS 14 is the oldest and must retire.
+
+  **Brownout schedule** (jobs deliberately failed during these windows to force migration):
+  - October 5, 14:00 UTC – October 6, 00:00 UTC
+  - October 12, 14:00 UTC – October 13, 00:00 UTC
+  - October 16–30, 14:00 UTC – next day 00:00 UTC (escalating weekly)
+
+  Affected labels: `macos-14`, `macos-14-large`, `macos-14-xlarge`.
+fix: |
+  Update your workflow's `runs-on` label to a supported macOS version:
+
+  | Old label | Replace with |
+  |-----------|--------------|
+  | `macos-14` | `macos-latest` or `macos-15` or `macos-26` |
+  | `macos-14-large` | `macos-latest-large` or `macos-15-large` |
+  | `macos-14-xlarge` | `macos-latest-xlarge` or `macos-15-xlarge` or `macos-26-xlarge` |
+
+  If your workflow depends on macOS 14-specific software versions (e.g., Xcode 15, older
+  Python/Ruby), test carefully against macOS 15 before switching `macos-latest`.
+  See runner-environment-017 for macOS 15 → 26 migration notes if moving to `macos-latest`.
+fix_code:
+  - language: yaml
+    label: "Migrate from macos-14 to macos-15 (conservative) or macos-latest"
+    code: |
+      jobs:
+        build:
+          # Before:
+          # runs-on: macos-14
+
+          # Conservative: macos-15 (similar software stack, no OpenSSL jump)
+          runs-on: macos-15
+
+          # OR accept latest (currently macos-26 after June 15 2026):
+          # runs-on: macos-latest
+
+          steps:
+            - uses: actions/checkout@v4
+  - language: yaml
+    label: "Strategy matrix: test across multiple macOS versions during migration"
+    code: |
+      jobs:
+        build:
+          strategy:
+            matrix:
+              os: [macos-15, macos-26]
+          runs-on: ${{ matrix.os }}
+          steps:
+            - uses: actions/checkout@v4
+            - name: Build and test
+              run: make test
+prevention:
+  - "Subscribe to actions/runner-images GitHub Issues announcements label to receive deprecation notices well in advance."
+  - "Avoid pinning to specific macOS major versions (`macos-14`, `macos-15`) in long-lived workflows; use `macos-latest` and test proactively against the next version."
+  - "Run a matrix strategy against `macos-latest` and your pinned version to detect incompatibilities before they cause production failures."
+  - "Search your organization's workflows periodically for deprecated runner labels using: `gh search code 'runs-on: macos-14' --owner YOUR_ORG`."
+docs:
+  - url: "https://github.com/actions/runner-images/issues/13518"
+    label: "GitHub Announcement: macOS 14 Sonoma deprecation timeline"
+  - url: "https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources"
+    label: "Supported GitHub-hosted runner labels"
+source:
+  article: "https://htek.dev/articles/github-actions-debugging-guide"
+  section: "Runner deprecation and EOL"
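Reviewer sketch (not part of the package): the `runner-environment-018` entry recommends searching periodically for deprecated runner labels. Within a single repository, a scheduled workflow could do the same check. The sketch below assumes it is saved as `.github/workflows/audit-runner-labels.yml`; that filename is hypothetical, and it is excluded from the search because the file contains the pattern it looks for.

```yaml
name: audit-runner-labels
on:
  schedule:
    - cron: "0 6 * * 1"  # Mondays at 06:00 UTC
  workflow_dispatch:
jobs:
  find-retired-labels:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Fail if any workflow still references a macos-14 label
        run: |
          # grep exits 0 when it finds a match, so any hit fails the job.
          # --exclude skips this file (assumed filename), which would otherwise match itself.
          if grep -rnE --exclude=audit-runner-labels.yml 'macos-14(-large|-xlarge)?' .github/workflows; then
            echo "::error::Found workflows that still use a macos-14 label"
            exit 1
          fi
```

The check is textual, so it also flags labels that appear only in comments; that seems an acceptable trade-off for a migration reminder.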
@@ -0,0 +1,127 @@
+id: runner-environment-017
+title: "macos-latest Label Now Points to macOS 26 Tahoe"
+category: runner-environment
+severity: error
+tags:
+  - macos
+  - runner-image
+  - breaking-change
+  - openssl
+  - xcode
+  - migration
+patterns:
+  - regex: "Error opening configuration file.*openssl"
+    flags: "i"
+  - regex: "SSL_CTX_new.*failed|SSL_connect.*SYSCALL error"
+    flags: "i"
+  - regex: "Could not find Xcode.*version.*16\\.\\d"
+    flags: "i"
+  - regex: "ruby.*requires.*Ruby (2|3\\.0|3\\.1|3\\.2|3\\.3)"
+    flags: "i"
+  - regex: "dyld.*Library not loaded.*libssl\\.1\\.1"
+    flags: "i"
+error_messages:
+  - "dyld[]: Library not loaded: /usr/local/opt/openssl@1.1/lib/libssl.1.1.dylib"
+  - "Could not find Xcode version '16.4'"
+  - "SSL_connect returned=1 errno=0 state=error: certificate verify failed"
+  - "Your Ruby version is 3.4.x, but your Gemfile specified ~> 3.3.0"
+  - "npm warn old lockfile"
+root_cause: |
+  The `macos-latest` label in GitHub Actions was migrated to point to macOS 26 Tahoe
+  beginning June 15, 2026 (completing by July 15, 2026). Previously it pointed to macOS 15
+  Sequoia. This migration includes several major software version changes that silently break
+  workflows:
+
+  - **OpenSSL**: 1.1.1w → 3.6.2 — The biggest breaking change. Many C/C++ projects,
+    Ruby gems (openssl), and Python packages that link against the system OpenSSL dynamically
+    will fail at runtime because libssl.1.1.dylib no longer ships with macOS 26.
+  - **Xcode**: Default changes from 16.4 to 26.4.1 (Xcode 26 series). Workflows pinning
+    `xcode-version: '16.4'` or relying on Clang 17 will break — Clang/LLVM jumps from 17 → 21.
+  - **Ruby**: 3.3.x → 3.4.x — Minor version bump can cause Gemfile constraint failures and
+    gem native extension compilation issues.
+  - **Node.js**: Default moves from 22 → 24 (though both images include Node 24 in cached tools).
+  - **npm**: 10.x → 11.x — Major npm version. Package-lock.json format changes possible.
+  - **Homebrew LLVM**: 18 → 20 (major version jump for workflows using `llvm@18` explicitly).
+fix: |
+  **Immediate mitigation:** Pin to `macos-15` to restore previous behavior while you migrate:
+
+  ```yaml
+  runs-on: macos-15
+  ```
+
+  **Proper fix** — address each breaking dependency:
+
+  1. **OpenSSL**: Brew-install a pinned OpenSSL version and set library paths:
+     ```bash
+     brew install openssl@1.1
+     export LDFLAGS="-L$(brew --prefix openssl@1.1)/lib"
+     export CPPFLAGS="-I$(brew --prefix openssl@1.1)/include"
+     ```
+     Or migrate to OpenSSL 3.x compatible code.
+
+  2. **Xcode**: Pin the Xcode version explicitly using `maxim-lobanov/setup-xcode`:
+     ```yaml
+     - uses: maxim-lobanov/setup-xcode@v1
+       with:
+         xcode-version: '16.4'
+     ```
+     Or migrate your project to Xcode 26.
+
+  3. **Ruby**: Update your Gemfile to accept 3.4.x (`~> 3.4`) or use `ruby/setup-ruby` to
+     pin a specific version:
+     ```yaml
+     - uses: ruby/setup-ruby@v1
+       with:
+         ruby-version: '3.3'
+     ```
+fix_code:
+  - language: yaml
+    label: "Pin to macos-15 for immediate rollback"
+    code: |
+      jobs:
+        build:
+          runs-on: macos-15  # pinned until macos-26 migration complete
+          steps:
+            - uses: actions/checkout@v4
+  - language: yaml
+    label: "Full macos-26 compatible workflow — pin Xcode, Ruby, and OpenSSL"
+    code: |
+      jobs:
+        build:
+          runs-on: macos-latest  # now macos-26
+          steps:
+            - uses: actions/checkout@v4
+
+            # Pin Xcode version explicitly
+            - uses: maxim-lobanov/setup-xcode@v1
+              with:
+                xcode-version: '26.4'
+
+            # Pin Ruby if needed
+            - uses: ruby/setup-ruby@v1
+              with:
+                ruby-version: '3.3'
+                bundler-cache: true
+
+            # If OpenSSL 1.1 is needed, install and export paths
+            - name: Install OpenSSL 1.1
+              run: |
+                brew install openssl@1.1
+                echo "LDFLAGS=-L$(brew --prefix openssl@1.1)/lib" >> "$GITHUB_ENV"
+                echo "CPPFLAGS=-I$(brew --prefix openssl@1.1)/include" >> "$GITHUB_ENV"
+prevention:
+  - "Avoid relying on `macos-latest` for builds that depend on specific system library versions (OpenSSL, LLVM). Pin to a concrete label like `macos-15`."
+  - "Subscribe to the actions/runner-images repository announcements to get advance notice of `macos-latest` label migrations."
+  - "Use `ruby/setup-ruby`, `actions/setup-node`, and `actions/setup-python` to pin language runtimes instead of relying on runner image defaults."
+  - "Test your macOS workflows against the new image early by substituting `macos-26` before the `macos-latest` migration completes."
+  - "Audit all dylib/framework dependencies — anything linking to libssl.1.1 must be migrated to OpenSSL 3.x or brew-pinned."
+docs:
+  - url: "https://github.com/actions/runner-images/issues/14167"
+    label: "GitHub Announcement: macos-latest will use macos-26 in June 2026"
+  - url: "https://github.com/actions/runner-images/blob/main/images/macos/macos-26-arm64-Readme.md"
+    label: "macOS 26 arm64 image README — full software list"
+  - url: "https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners"
+    label: "About GitHub-hosted runners — supported runner labels"
+source:
+  article: "https://htek.dev/articles/github-actions-debugging-guide"
+  section: "Runner image migrations"
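Reviewer sketch (not part of the package): the `runner-environment-017` entry advises auditing dylib dependencies for `libssl.1.1` but does not show how. One possible check runs on the old image, where the library still resolves, and inspects a built binary with `otool -L`. The build command and the `build/myapp` path are hypothetical placeholders.

```yaml
jobs:
  audit-openssl-linkage:
    runs-on: macos-15  # old image, so OpenSSL 1.1 linkage is still visible
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make build  # hypothetical build command
      - name: Fail if the binary links against OpenSSL 1.1
        run: |
          # otool -L lists the shared libraries a Mach-O binary loads at runtime.
          if otool -L build/myapp | grep -E 'lib(ssl|crypto)\.1\.1'; then
            echo "::error::build/myapp links against OpenSSL 1.1"
            exit 1
          fi
```

A hit here would indicate that the binary is likely to fail with the `dyld` error quoted in the entry once the job moves to `macos-26`.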
@@ -0,0 +1,112 @@
+id: runner-environment-019
+title: "PowerShell 7.4 → 7.6 LTS Upgrade Breaks pwsh Scripts"
+category: runner-environment
+severity: warning
+tags:
+  - powershell
+  - pwsh
+  - runner-image
+  - breaking-change
+  - windows
+  - linux
+patterns:
+  - regex: "The term 'ThreadJob\\\\Start-ThreadJob' is not recognized"
+    flags: "i"
+  - regex: "Cannot bind parameter.*ChildPath.*Cannot convert.*Object\\[\\].*String"
+    flags: "i"
+  - regex: "WildcardPattern.*backtick|escape.*pattern.*unexpected"
+    flags: "i"
+  - regex: "New-EventLog.*source.*trailing|event source.*not found"
+    flags: "i"
+error_messages:
+  - "The term 'ThreadJob\\Start-ThreadJob' is not recognized as a name of a cmdlet, function, script file, or executable program."
+  - "Cannot bind parameter 'ChildPath'. Cannot convert the 'System.Object[]' value of type 'System.Object[]' to type 'System.String'."
+  - "Start-ThreadJob : The term 'ThreadJob\\Start-ThreadJob' is not recognized"
+root_cause: |
+  GitHub Actions upgraded PowerShell from 7.4.x to 7.6 LTS on all runner images beginning
+  June 8, 2026 (completing June 15, 2026). This affects every runner OS: ubuntu-22.04,
+  ubuntu-24.04, ubuntu-slim, macos-14/15/26, windows-2022/2025/2025-vs2026.
+
+  PowerShell 7.6 is built on .NET 10 (7.4 was on .NET 8). The documented breaking changes are:
+
+  1. **ThreadJob module renamed**: `ThreadJob` module is now `Microsoft.PowerShell.ThreadJob`.
+     Scripts calling `ThreadJob\Start-ThreadJob` with the old module-qualified name will throw
+     a "not recognized" error. The cmdlet itself (`Start-ThreadJob`) still works without prefix.
+
+  2. **`Join-Path -ChildPath` now accepts `string[]`**: Parameter binding changed from `string`
+     to `string[]`. Existing code that passes multiple -ChildPath args may see binding errors
+     depending on how arguments were constructed.
+
+  3. **`WildcardPattern.Escape` now correctly escapes lone backticks**: Scripts that relied on
+     the previous (incorrect) behavior of backtick handling in wildcard patterns may produce
+     different results.
+
+  4. **Event source name trailing space removed**: New-EventLog / Write-EventLog source names
+     no longer have a trailing space. Scripts that matched exact event source names including
+     the trailing space (e.g., `"MySource "`) will fail to match.
+
+  5. **.NET 10 runtime**: Any `pwsh` script using .NET types or reflection that was tested
+     against .NET 8 behavior may see subtle differences.
+fix: |
+  **ThreadJob module name** — the most common breaking change:
+  Replace `ThreadJob\Start-ThreadJob` with `Microsoft.PowerShell.ThreadJob\Start-ThreadJob`,
+  or just call `Start-ThreadJob` without the module qualifier.
+
+  **Event source name** — if you match exact source names:
+  Trim any trailing spaces from source name comparisons.
+
+  **Pin PowerShell version** if you need time to migrate (not recommended long-term):
+  Install a specific pwsh version in your workflow before running scripts.
+
+  **Test locally**: Install PowerShell 7.6 (`winget install Microsoft.PowerShell`) and run
+  your scripts to catch any remaining issues before they surface in CI.
+fix_code:
+  - language: yaml
+    label: "Fix ThreadJob module-qualified name"
+    code: |
+      # In your PowerShell script — change this:
+      # ThreadJob\Start-ThreadJob -ScriptBlock { ... }
+      # To either:
+      # Start-ThreadJob -ScriptBlock { ... }  # simplest fix
+      # Microsoft.PowerShell.ThreadJob\Start-ThreadJob -ScriptBlock { ... }  # fully qualified
+  - language: yaml
+    label: "Pin PowerShell version as a temporary workaround"
+    code: |
+      jobs:
+        build:
+          runs-on: ubuntu-latest
+          steps:
+            - uses: actions/checkout@v4
+
+            # Pin pwsh to 7.4.x temporarily while migrating
+            - name: Install PowerShell 7.4
+              run: |
+                wget -q "https://github.com/PowerShell/PowerShell/releases/download/v7.4.10/powershell_7.4.10-1.deb_amd64.deb"
+                sudo dpkg -i powershell_7.4.10-1.deb_amd64.deb
+
+            - name: Run PowerShell script
+              shell: pwsh
+              run: ./scripts/build.ps1
+  - language: yaml
+    label: "Fix event source name trailing-space match"
+    code: |
+      # Before (broken on 7.6):
+      # if ($event.Source -eq "MyApp ") { ... }
+      #
+      # After (works on 7.4 and 7.6):
+      # if ($event.Source.Trim() -eq "MyApp") { ... }
+prevention:
+  - "Subscribe to actions/runner-images announcements for PowerShell upgrade notices well before they ship."
+  - "Always use unqualified cmdlet names (`Start-ThreadJob`) rather than module-qualified names (`ThreadJob\\Start-ThreadJob`) to avoid module-rename breakage."
+  - "Run your PowerShell scripts through PSScriptAnalyzer with the latest rule set after any PS version upgrade."
+  - "Test pwsh workflows in a matrix with the previous and new PS version during runner image transition periods."
+docs:
+  - url: "https://github.com/actions/runner-images/issues/14150"
+    label: "GitHub Announcement: PowerShell 7.4 → 7.6 upgrade on all runner images"
+  - url: "https://learn.microsoft.com/en-us/powershell/scripting/whats-new/what-s-new-in-powershell-76"
+    label: "PowerShell 7.6 release notes and breaking changes"
+  - url: "https://learn.microsoft.com/en-us/powershell/scripting/install/powershell-support-lifecycle"
+    label: "PowerShell support lifecycle"
+source:
+  article: "https://htek.dev/articles/github-actions-debugging-guide"
+  section: "Runner image tool version changes"
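Reviewer sketch (not part of the package): the `runner-environment-019` entry names the old module-qualified `ThreadJob\Start-ThreadJob` call as the most common break. A repository could scan its own scripts for that call before the runner upgrade reaches it. The job below is an illustration only; the job name and the choice to scan `*.ps1` and `*.psm1` files are assumptions.

```yaml
jobs:
  audit-pwsh-scripts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Report pwsh version and flag old ThreadJob-qualified calls
        shell: pwsh
        run: |
          Write-Host "PowerShell version: $($PSVersionTable.PSVersion)"
          # The lookbehind skips the new name, Microsoft.PowerShell.ThreadJob\Start-ThreadJob,
          # because there the character before "ThreadJob" is a dot.
          $hits = Get-ChildItem -Recurse -File -Include *.ps1, *.psm1 |
            Select-String -Pattern '(?<![\w.])ThreadJob\\Start-ThreadJob'
          if ($hits) {
            $hits | ForEach-Object { Write-Host "$($_.Path):$($_.LineNumber): $($_.Line.Trim())" }
            exit 1
          }
```

The scan is purely textual, so it also reports commented-out calls; those are usually worth cleaning up in the same pass.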
@@ -0,0 +1,126 @@
+id: runner-environment-021
+title: "Service Container Marked Unhealthy — Health Check Timeout"
+category: runner-environment
+severity: error
+tags:
+  - service-container
+  - healthcheck
+  - docker
+  - timeout
+  - postgres
+  - redis
+  - rabbitmq
+patterns:
+  - regex: "service is unhealthy"
+    flags: "i"
+  - regex: "Failed to initialize.*service is unhealthy"
+    flags: "i"
+  - regex: "##\\[error\\]Failed to initialize.*service"
+    flags: "i"
+  - regex: "container_id.*unhealthy"
+    flags: "i"
+error_messages:
+  - "##[error]Failed to initialize, rabbitmq service is unhealthy."
+  - "##[error]Failed to initialize, postgres service is unhealthy."
+  - "##[error]Failed to initialize, redis service is unhealthy."
+  - "unhealthy"
+  - "service is starting, waiting 29 seconds before checking again."
+root_cause: |
+  GitHub Actions checks the Docker HEALTHCHECK status of service containers
+  before allowing dependent job steps to run. If the container does not
+  transition to `healthy` within the runner's fixed retry window, the job
+  fails with "service is unhealthy".
+
+  This commonly occurs because:
+  1. **No options specified** — Docker uses the image's built-in HEALTHCHECK,
+     which may be missing, too aggressive, or unsuitable for the CI environment.
+  2. **Startup time** — services like PostgreSQL, RabbitMQ, or Elasticsearch
+     take longer to initialize on GitHub-hosted runners than on local machines,
+     and the default health-check interval/retries expire before they're ready.
+  3. **Wrong health-check command** — a network ping health check may fail if
+     the service port isn't yet bound even though the process is running.
+  4. **Missing `--health-start-period`** — without a start period, Docker counts
+     health-check failures from container start, before the service has had time
+     to initialize.
+
+  Documented in actions/example-services issue #3.
+fix: |
+  Add `options:` to the service container definition with explicit health-check
+  parameters suited to the service and GitHub-hosted runner environment:
+
+  - `--health-cmd`: Use a service-native health check command, not a TCP probe.
+    Examples:
+      PostgreSQL: `pg_isready -U postgres`
+      Redis: `redis-cli ping`
+      MySQL: `mysqladmin ping -h localhost`
+      RabbitMQ: `rabbitmq-diagnostics -q ping`
+
+  - `--health-interval 10s`: Check every 10 seconds
+  - `--health-timeout 5s`: Allow up to 5 seconds per check
+  - `--health-retries 5`: Retry up to 5 times before marking unhealthy
+  - `--health-start-period 30s`: Give 30 seconds before counting failures
+
+  If the image lacks a health check and the options approach is insufficient,
+  add an explicit wait step after job start using the service label name to
+  poll readiness with a loop.
+fix_code:
+  - language: yaml
+    label: "PostgreSQL service container with proper health check"
+    code: |
+      jobs:
+        test:
+          runs-on: ubuntu-latest
+          services:
+            postgres:
+              image: postgres:16
+              env:
+                POSTGRES_PASSWORD: postgres
+                POSTGRES_DB: testdb
+              ports:
+                - 5432:5432
+              options: >-
+                --health-cmd "pg_isready -U postgres"
+                --health-interval 10s
+                --health-timeout 5s
+                --health-retries 5
+                --health-start-period 30s
+          steps:
+            - run: psql postgresql://postgres:postgres@localhost:5432/testdb -c "SELECT 1"
+  - language: yaml
+    label: "Redis service container with proper health check"
+    code: |
+      services:
+        redis:
+          image: redis:7
+          ports:
+            - 6379:6379
+          options: >-
+            --health-cmd "redis-cli ping"
+            --health-interval 10s
+            --health-timeout 5s
+            --health-retries 5
+  - language: yaml
+    label: "Fallback — explicit wait step if health check unreliable"
+    code: |
+      steps:
+        - name: Wait for PostgreSQL to be ready
+          run: |
+            until pg_isready -h localhost -p 5432 -U postgres; do
+              echo "Waiting for postgres..."
+              sleep 2
+            done
+          timeout-minutes: 2
+prevention:
+  - "Always specify `options:` with `--health-cmd`, `--health-interval`, `--health-retries`, and `--health-start-period` for every service container."
+  - "Use service-native health check commands (pg_isready, redis-cli ping) rather than generic TCP probes."
+  - "Add `--health-start-period` of 20-30 seconds for services with slow initialization (PostgreSQL, Elasticsearch, RabbitMQ)."
+  - "Test health check timing locally in Docker before relying on it in CI: `docker run --health-cmd '...' --health-interval 5s <image>`."
+docs:
+  - url: "https://docs.github.com/en/actions/using-containerized-services/about-service-containers"
+    label: "About service containers"
+  - url: "https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idservicesservice_idoptions"
+    label: "Workflow syntax — services options"
+  - url: "https://github.com/actions/example-services/issues/3"
+    label: "actions/example-services #3 — Service container health check questions and fixes"
+  - url: "https://stackoverflow.com/questions/66763353/how-to-health-check-a-service-in-github"
+    label: "Stack Overflow — How to health check a service in GitHub Actions"
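Reviewer sketch (not part of the package): the `runner-environment-021` entry lists `mysqladmin ping -h localhost` as a health command but its `fix_code` covers only PostgreSQL and Redis. A MySQL variant using the same option set might look like the following. The image tag, the credentials, and the database name are placeholder assumptions.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      mysql:
        image: mysql:8            # assumed tag
        env:
          MYSQL_ROOT_PASSWORD: root  # placeholder credential for CI only
          MYSQL_DATABASE: testdb     # placeholder database name
        ports:
          - 3306:3306
        options: >-
          --health-cmd "mysqladmin ping -h localhost"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
          --health-start-period 30s
    steps:
      # 127.0.0.1 forces a TCP connection; "localhost" would make the client try a Unix socket.
      - run: mysql -h 127.0.0.1 -P 3306 -uroot -proot -e "SELECT 1" testdb
```

`mysqladmin ping` reports success as soon as the server answers, even when authentication is refused, which is why the health command needs no password.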