@htekdev/actions-debugger 1.0.126 → 1.0.128

@@ -0,0 +1,172 @@
1
+ id: caching-artifacts-075
2
+ title: '`actions/cache` entries saved by GitHub-hosted runners are not accessible from self-hosted runners — cache lookup returns miss despite matching key'
3
+ category: caching-artifacts
4
+ severity: error
5
+ tags:
6
+ - actions-cache
7
+ - self-hosted
8
+ - github-hosted
9
+ - cache-miss
10
+ - cross-runner
11
+ - ACTIONS_CACHE_URL
12
+ - cache-backend
13
+ patterns:
14
+ - regex: 'Cache not found for input keys'
15
+ flags: 'i'
16
+ - regex: 'No cache found.*self.hosted'
17
+ flags: 'i'
18
+ - regex: 'ACTIONS_CACHE_URL.*self.hosted'
19
+ flags: 'i'
20
+ - regex: 'cache.*miss.*self.hosted|self.hosted.*cache.*miss'
21
+ flags: 'i'
22
+ error_messages:
23
+ - "Cache not found for input keys: my-cache-key-abc123"
24
+ - "Warning: Cache restore failed."
25
+ - "No cache found for key my-cache-key on self-hosted runner"
26
+ root_cause: |
27
+ `actions/cache` stores and retrieves cache entries via a **runner-specific
28
+ `ACTIONS_CACHE_URL` endpoint** that is injected into each job by the Actions
29
+ infrastructure. The cache URL for a GitHub-hosted runner points to GitHub's
30
+ hosted cache service, while the cache URL for a self-hosted runner may point to
31
+ a different endpoint (e.g., GHES cache, Actions Runner Controller cache proxy,
32
+ or a completely different ACTIONS_CACHE_URL configured by the runner operator).
33
+
34
+ When a GitHub-hosted runner saves a cache entry:
35
+ - The entry is stored at GitHub's hosted cache service endpoint.
36
+ - The entry IS visible via the GitHub Cache REST API and the repository's
37
+ Actions → Caches UI.
38
+
39
+ When a self-hosted runner tries to restore the same key:
40
+ - The runner uses its own `ACTIONS_CACHE_URL`, which may point to a different
41
+ service.
42
+ - The cache lookup returns "Cache not found" even though the entry is visible
43
+ in the UI.
44
+ - No error is surfaced — the action silently reports a cache miss and the job
45
+ continues.
46
+
47
+ This is commonly encountered when:
48
+ 1. **ARC (Actions Runner Controller) on Kubernetes**: some ARC deployments front
49
+ GitHub's cache API with a proxy (for example a `gha-cache-server` sidecar); the
50
+ proxy's cache scope or URL can differ from the GitHub-hosted `ACTIONS_CACHE_URL`.
51
+ 2. **GHES (GitHub Enterprise Server)** — On-prem GHES cache is a separate
52
+ service from github.com hosted cache.
53
+ 3. **Mixed runner pools** — Some jobs run on GitHub-hosted (`ubuntu-latest`)
54
+ and others on self-hosted runners; they do not share cache entries.
55
+ 4. **Self-hosted runners with custom ACTIONS_CACHE_URL** — Operators configure
56
+ a custom cache backend (e.g., Artifactory, Nexus) that doesn't contain entries
57
+ from GitHub's hosted cache.
58
+
59
+ Related: actions/cache#1595 (open since April 2025).
60
+
61
+ The cache entries ARE there (visible via REST API) but are stored at a different
62
+ cache service endpoint than the one the self-hosted runner queries.
63
+ fix: |
64
+ **Option 1 (Recommended) — Ensure all cache-sharing jobs use the same runner type:**
65
+ If a job saves a cache on a GitHub-hosted runner, ensure the job that needs to
66
+ restore it also runs on a GitHub-hosted runner. Do not rely on cross-runner-type
67
+ cache sharing.
68
+
69
+ **Option 2 — Configure a shared external cache backend:**
70
+ For mixed runner environments, configure all runners (GitHub-hosted and self-hosted)
71
+ to use a shared external cache backend. This requires customizing `ACTIONS_CACHE_URL`
72
+ on your self-hosted runners to point to the same service.
73
+
74
+ **Option 3 — Re-populate cache from self-hosted runners:**
75
+ Have the self-hosted runner job save its own cache entry with the same key on first
76
+ miss. Subsequent self-hosted runner runs will hit this entry. The GitHub-hosted and
77
+ self-hosted entries may diverge if they have different paths.
78
+
79
+ **Option 4 — Use artifact-based sharing instead of cache:**
80
+ If the data must cross runner types, use `actions/upload-artifact` and
81
+ `actions/download-artifact` instead of cache. Artifacts are stored and retrieved
82
+ via the same GitHub API endpoint regardless of runner type.
83
+ fix_code:
84
+ - language: yaml
85
+ label: 'Broken — cache saved by GitHub-hosted runner, restored by self-hosted (misses)'
86
+ code: |
+       jobs:
+         build:
+           runs-on: ubuntu-latest    # GitHub-hosted runner saves cache
+           steps:
+             - uses: actions/checkout@v4
+             - uses: actions/cache@v4
+               with:
+                 path: ~/.npm
+                 key: npm-${{ hashFiles('**/package-lock.json') }}
+             - run: npm ci
+
+         test:
+           runs-on: self-hosted      # ✗ Self-hosted runner cannot access
+           needs: build              #   github-hosted runner's cache entry
+           steps:
+             # Checkout first: hashFiles() returns an empty string when no file matches,
+             # which would change the key and cause a miss for an unrelated reason
+             - uses: actions/checkout@v4
+             - uses: actions/cache@v4
+               with:
+                 path: ~/.npm
+                 key: npm-${{ hashFiles('**/package-lock.json') }}
+               # → Always "Cache not found for input keys: npm-..."
+             - run: npm test
108
+
109
+ - language: yaml
110
+ label: 'Fixed — both jobs on the same runner type share cache entries'
111
+ code: |
+       jobs:
+         build:
+           runs-on: ubuntu-latest    # ✓ Both jobs use GitHub-hosted runners
+           steps:
+             - uses: actions/checkout@v4
+             - uses: actions/cache@v4
+               with:
+                 path: ~/.npm
+                 key: npm-${{ hashFiles('**/package-lock.json') }}
+             - run: npm ci
+
+         test:
+           runs-on: ubuntu-latest    # ✓ Same runner type = same cache backend
+           needs: build
+           steps:
+             # Checkout first so hashFiles() sees package-lock.json and yields the same key
+             - uses: actions/checkout@v4
+             - uses: actions/cache@v4
+               with:
+                 path: ~/.npm
+                 key: npm-${{ hashFiles('**/package-lock.json') }}
+               # ✓ Cache hit: saved by github-hosted, restored by github-hosted
+             - run: npm test
133
+
134
+ - language: yaml
135
+ label: 'Alternative — use upload-artifact / download-artifact for cross-runner sharing'
136
+ code: |
+       jobs:
+         build:
+           runs-on: ubuntu-latest    # GitHub-hosted builds the artifacts
+           steps:
+             - uses: actions/checkout@v4
+             - run: npm ci && npm run build
+             - uses: actions/upload-artifact@v4
+               with:
+                 name: build-output
+                 path: dist/
+
+         test:
+           runs-on: self-hosted      # ✓ Artifacts are accessible from any runner type
+           needs: build
+           steps:
+             # `npm test` needs the sources and dependencies; only dist/ comes from the artifact
+             - uses: actions/checkout@v4
+             - uses: actions/download-artifact@v4
+               with:
+                 name: build-output
+                 path: dist/
+             - run: npm ci
+             - run: npm test
157
+
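+   # Illustrative sketch for Option 3, not taken from actions/cache#1595: the self-hosted
+   # job maintains its own entry. Adding `runner.environment` to the key is an assumption
+   # made here so that the two runner types never expect to share an entry.
+   - language: yaml
+     label: 'Sketch (Option 3): self-hosted job saves and restores its own runner-scoped entry'
+     code: |
+       jobs:
+         test:
+           runs-on: self-hosted
+           steps:
+             - uses: actions/checkout@v4
+             - uses: actions/cache@v4
+               with:
+                 path: ~/.npm
+                 # runner.environment is `github-hosted` or `self-hosted`
+                 key: npm-${{ runner.environment }}-${{ hashFiles('**/package-lock.json') }}
+             # The first run misses and saves in the post step; later runs on this runner type hit
+             - run: npm ci
+             - run: npm test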
158
+ prevention:
159
+ - 'Treat `actions/cache` entries as scoped to the runner type that created them — GitHub-hosted and self-hosted runners do not share a cache backend by default.'
160
+ - 'Design workflows so that cache-save and cache-restore jobs run on the same runner type (both GitHub-hosted or both self-hosted).'
161
+ - 'When mixing runner types, use `actions/upload-artifact` + `actions/download-artifact` for data that must cross the runner boundary.'
162
+ - 'Verify cache access in ARC (Actions Runner Controller) setups: a cache proxy such as gha-cache-server, if deployed, may have a different scope from the GitHub-hosted cache.'
163
+ - 'Inspect `ACTIONS_CACHE_URL` in debug logs (set the `ACTIONS_STEP_DEBUG` and `ACTIONS_RUNNER_DEBUG` repository secrets or variables to `true`) to confirm both runners point to the same cache endpoint.'
164
+ docs:
165
+ - url: 'https://github.com/actions/cache/issues/1595'
166
+ label: 'actions/cache#1595: GitHub-hosted runner cache not found from self-hosted runner (open 2025)'
167
+ - url: 'https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache'
168
+ label: 'GitHub Docs: Cache access restrictions'
169
+ - url: 'https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows'
170
+ label: 'GitHub Docs: Caching dependencies to speed up workflows'
171
+ - url: 'https://github.com/actions/actions-runner-controller/blob/main/docs/gha-runner-scale-set-controller/README.md'
172
+ label: 'ARC docs: Runner Scale Set — cache proxy configuration'
@@ -0,0 +1,141 @@
1
+ id: known-unsolved-074
2
+ title: '`hashFiles()` has a hardcoded 120-second timeout and fails the job on large repos with many files'
3
+ category: known-unsolved
4
+ severity: limitation
5
+ tags:
6
+ - hashFiles
7
+ - timeout
8
+ - cache-key
9
+ - large-repo
10
+ - actions-cache
11
+ - 120-seconds
12
+ - known-limitation
13
+ patterns:
14
+ - regex: 'hashFiles\(.+\) couldn''t finish within 120 seconds'
15
+ flags: 'i'
16
+ - regex: 'hashFiles.*couldn''t finish'
17
+ flags: 'i'
18
+ - regex: 'The template is not valid.*hashFiles.*couldn''t finish'
19
+ flags: 'i'
20
+ error_messages:
21
+ - "Error: The template is not valid. .github/workflows/ci.yml (Line: N, Col: M): hashFiles('Assets/**', 'Packages/**', 'ProjectSettings/**') couldn't finish within 120 seconds."
22
+ - "hashFiles('**/pom.xml') couldn't finish within 120 seconds."
23
+ - "hashFiles('**/Podfile.lock') couldn't finish within 120 seconds."
24
+ root_cause: |
25
+ The `hashFiles()` expression function has a **hardcoded 120-second timeout** inside the
26
+ GitHub Actions runner. When hashing many files — large game projects, monorepos with
27
+ thousands of dependencies, or deeply nested directory trees — the hash process (a separate
28
+ Node.js subprocess) can exceed this limit.
29
+
30
+ The timeout is set in the runner source:
31
+ `src/Runner.Worker/Expressions/HashFilesFunction.cs` as a fixed constant. There is no
32
+ workflow-level configuration to extend or remove it. Pull Request actions/runner#1844,
33
+ opened in April 2022 to make the timeout configurable, has remained unmerged for years.
34
+
35
+ **Common triggers:**
36
+ - Unity game projects hashing `Assets/**`, `Packages/**`, `ProjectSettings/**` (tens of
37
+ thousands of binary and text assets).
38
+ - Java monorepos hashing `**/pom.xml` across hundreds of modules.
39
+ - iOS projects hashing `**/Podfile.lock` where CocoaPods creates large lock files.
40
+ - `actions/cache` Post step re-computing `hashFiles()` at the **end** of the job (to
41
+ check whether the cache should be saved), by which point many new files may exist in
42
+ the hashed path, making the second hash slower than the first.
43
+ - Self-hosted Windows runners, which tend to have slower file system scan speeds than
44
+ Linux GitHub-hosted runners for large directory trees.
45
+
46
+ The error surfaces as a template evaluation failure when the runner evaluates the step's
47
+ inputs, before that step executes, so the job fails there with no diagnostic beyond the 120-second message.
48
+
49
+ Related: actions/runner#1840 (12 reactions, open since April 2022, last active May 2024).
50
+ fix: |
51
+ There is **no native fix** — `hashFiles()` timeout is hardcoded and cannot be increased
52
+ in workflow YAML.
53
+
54
+ **Workaround 1 — Pre-compute the hash in a shell step:**
55
+ Run a shell command to hash the target files and write the result to a single file, then
56
+ use `hashFiles()` on only that single file. The shell command runs without the 120-second
57
+ runner constraint, and hashing one file is nearly instant.
58
+
59
+ **Workaround 2 — Use the last Git commit hash for the target directory:**
60
+ `git log` is much faster than file-content hashing. If cache invalidation on the last
61
+ commit touching the directory is acceptable (rather than on file-content change), use
62
+ `git log -1 --pretty=format:%H -- <directory>` to generate the key.
63
+
64
+ **Workaround 3 — Narrow the glob pattern:**
65
+ Instead of `Assets/**`, narrow to a specific file type:
66
+ `Assets/**/*.meta` or `Assets/**/*.asset`. Fewer files = faster hash = less chance of
67
+ timeout. Acceptable when only certain file types are meaningful for cache invalidation.
68
+
69
+ **Workaround 4 — Reduce what goes in the hashed path:**
70
+ If you control the directory structure, move frequently-updated large binary files out
71
+ of the hashed path so the glob matches fewer files.
72
+ fix_code:
73
+ - language: yaml
74
+ label: 'Broken — hashFiles() times out on large directory tree'
75
+ code: |
76
+ - uses: actions/cache@v4
77
+ with:
78
+ path: Library
79
+ # ✗ hashFiles() will time out if Assets/**, Packages/**, ProjectSettings/**
80
+ # contains thousands of files (e.g. large Unity project)
81
+ key: Library-${{ hashFiles('Assets/**', 'Packages/**', 'ProjectSettings/**') }}
82
+
83
+ - language: yaml
84
+ label: 'Fixed — pre-compute hash in a shell step, then hashFiles() on single file'
85
+ code: |
+       - name: Generate cache key from directory contents
+         run: |
+           # NUL-delimited so file names containing spaces are handled correctly
+           find Assets Packages ProjectSettings -type f -print0 \
+             | sort -z | xargs -0 sha256sum > unity-hash-input.txt
+           # On a Windows self-hosted runner, use PowerShell instead:
+           #   Get-ChildItem -Recurse -File Assets,Packages,ProjectSettings |
+           #     Sort-Object FullName |
+           #     ForEach-Object { (Get-FileHash -Algorithm SHA256 $_.FullName).Hash } |
+           #     Out-File unity-hash-input.txt -Encoding utf8
+
+       - uses: actions/cache@v4
+         with:
+           path: Library
+           # ✓ hashFiles() on a single pre-computed file is nearly instant.
+           # The file is written to the workspace because hashFiles() only matches files
+           # inside GITHUB_WORKSPACE; a /tmp path would hash to an empty string.
+           key: Library-${{ hashFiles('unity-hash-input.txt') }}
102
+
103
+ - language: yaml
104
+ label: 'Alternative — use last Git commit hash touching the cached directory'
105
+ code: |
+       # A shallow clone (the actions/checkout default) makes HEAD look like the last
+       # commit for every path, so fetch full history first
+       - uses: actions/checkout@v4
+         with:
+           fetch-depth: 0
+
+       - name: Get cache key from last commit touching packages
+         run: |
+           echo "PKG_HASH=$(git log -1 --pretty=format:%H -- packages/)" >> "$GITHUB_ENV"
+
+       - uses: actions/cache@v4
+         with:
+           path: ~/.m2/repository
+           # ✓ git log is fast regardless of how many files are in packages/
+           # Note: invalidates on commit, not on file-content change
+           key: maven-${{ runner.os }}-${{ env.PKG_HASH }}
116
+
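+   # Sketch of a variant of the Git-based approach (an addition, not from the linked
+   # issue): `git rev-parse HEAD:<dir>` prints the tree object ID of a tracked directory.
+   # The ID changes only when tracked content under that directory changes, and it works
+   # in a shallow clone. The `packages` path and the `PKG_TREE` name are placeholders.
+   - language: yaml
+     label: 'Sketch: key on the Git tree hash of the cached directory'
+     code: |
+       - uses: actions/checkout@v4
+
+       - name: Get tree hash of packages/
+         run: echo "PKG_TREE=$(git rev-parse HEAD:packages)" >> "$GITHUB_ENV"
+
+       - uses: actions/cache@v4
+         with:
+           path: ~/.m2/repository
+           # Untracked or ignored files under packages/ do not affect this key
+           key: maven-${{ runner.os }}-${{ env.PKG_TREE }}
+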
117
+ - language: yaml
118
+ label: 'Alternative — narrow glob to only meaningful file types'
119
+ code: |
120
+ - uses: actions/cache@v4
121
+ with:
122
+ path: Library
123
+ # ✓ Narrow glob: only hash .meta files instead of all assets
124
+ # Fewer matched files = faster hash = less risk of timeout
125
+ key: Library-${{ hashFiles('Assets/**/*.meta', 'Packages/**/package.json') }}
126
+
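+   # Sketch (an addition): avoid hashFiles() in the key by exposing a shell-computed
+   # digest as a step output. The step id `tree-hash` and the output name `hash` are
+   # placeholders chosen for this example.
+   - language: yaml
+     label: 'Sketch: pass a shell-computed digest to the cache key as a step output'
+     code: |
+       - id: tree-hash
+         run: |
+           # Hash every file in sorted path order, then hash the resulting list
+           HASH=$(find Assets Packages ProjectSettings -type f -print0 \
+             | sort -z | xargs -0 sha256sum | sha256sum | cut -d' ' -f1)
+           echo "hash=$HASH" >> "$GITHUB_OUTPUT"
+
+       - uses: actions/cache@v4
+         with:
+           path: Library
+           key: Library-${{ steps.tree-hash.outputs.hash }}
+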
127
+ prevention:
128
+ - 'For any project with more than ~10,000 files in the hashed path, pre-compute the hash in a shell step instead of using `hashFiles()` directly in the cache key expression.'
129
+ - 'Estimate the hashing cost before relying on `hashFiles()`: time an equivalent `find ... | sha256sum` over the same paths in a `run:` step, because the timeout occurs during expression evaluation, before the step executes.'
130
+ - 'On Windows self-hosted runners, file system scan is slower — lower the threshold for when to switch to a pre-computed hash (≥5,000 files).'
131
+ - 'Watch the `actions/cache` Post step — it re-evaluates `hashFiles()` at job end to decide whether to save. If you add files to the hashed path during the job, the Post step can time out even when the Pre step succeeded.'
132
+ - 'Upvote actions/runner#1844 — the PR to make the timeout configurable. No merge timeline as of June 2026.'
133
+ docs:
134
+ - url: 'https://github.com/actions/runner/issues/1840'
135
+ label: 'actions/runner#1840: hashFiles() couldn''t finish within 120 seconds (12 reactions, open since 2022)'
136
+ - url: 'https://github.com/actions/runner/pull/1844'
137
+ label: 'actions/runner#1844: PR to make hashFiles() timeout configurable (unmerged)'
138
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/evaluate-expressions-in-workflows-and-actions#hashfiles'
139
+ label: 'GitHub Docs: hashFiles() expression function'
140
+ - url: 'https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows'
141
+ label: 'GitHub Docs: Caching dependencies to speed up workflows'
@@ -0,0 +1,147 @@
1
+ id: known-unsolved-075
2
+ title: '`matrix` context unavailable in job-level `if:` condition when matrix is dynamically generated from upstream job outputs'
3
+ category: known-unsolved
4
+ severity: limitation
5
+ tags:
6
+ - matrix
7
+ - dynamic-matrix
8
+ - if-condition
9
+ - job-level
10
+ - fromJSON
11
+ - needs-outputs
12
+ patterns:
13
+ - regex: 'Unrecognized named-value.*matrix.*Located at position'
14
+ flags: 'i'
15
+ - regex: 'matrix.*context.*not.*available.*if|if.*matrix.*unrecognized'
16
+ flags: 'i'
17
+ error_messages:
18
+ - "Unrecognized named-value: 'matrix'. Located at position 26 within expression: contains(inputs.SCHEMAS, matrix.customer.schema)"
19
+ - "The workflow is not valid. ... Unrecognized named-value: 'matrix'."
20
+ root_cause: |
21
+ When a job uses `strategy.matrix` with values derived from an upstream job's outputs
22
+ (e.g. `fromJson(needs.set-matrix.outputs.matrix)`), the `matrix` context is **not**
23
+ available in the job's own `if:` condition.
24
+
25
+ The root cause is evaluation ordering: GitHub Actions evaluates job-level `if:`
26
+ conditions **before** resolving the dynamic matrix values from upstream job outputs.
27
+ At the time `if:` is checked, the specific matrix combination (e.g. `matrix.schema`,
28
+ `matrix.os`) has not yet been bound.
29
+
+ This limitation does NOT affect:
+ - Step-level `if:` conditions inside the job (matrix IS available there)
+ - Job-level keys evaluated after matrix expansion, such as `runs-on`, `env`, and `name`
+ - Downstream jobs reading this job's outputs (normal needs chain)
+
+ Static inline matrices are affected as well: `jobs.<job_id>.if` only receives the
+ `github`, `needs`, `vars`, and `inputs` contexts, never `matrix`.
34
+
35
+ **Workaround does not exist at job level**: There is no supported way to filter
36
+ individual matrix combinations at the job `if:` level when the matrix is dynamic.
37
+ The matrix `include`/`exclude` keys accept expressions, but the `matrix` context is not available under `strategy`, so they cannot express such a condition either.
38
+
39
+ The only workaround is to move the filtering logic inside a step using
40
+ `if: condition` at the step level, or to generate a pre-filtered matrix in the
41
+ upstream job so that no filtering is needed at the consumer job level.
42
+
43
+ Source: actions/runner#1985 (64 reactions, open since 2022)
44
+ Community discussion: https://github.community/t/matrix-cannot-be-used-in-jobs-level-if/17177
45
+ fix: |
46
+ **Option 1 (recommended): Filter inside the upstream matrix-generation job**
47
+
48
+ Produce a matrix JSON that only includes the combinations you want to run. This is
49
+ the cleanest approach — no filtering needed in the consumer job.
50
+
51
+ **Option 2: Use step-level `if:` instead of job-level `if:`**
52
+
53
+ Put the condition on the steps instead. The job still starts for every matrix entry,
54
+ but steps whose condition is false are skipped, so filtered entries finish cleanly. This wastes a job slot but works.
55
+
56
+ **Option 3: Use `continue-on-error: true` with an early-exit pattern**
57
+
58
+ Not recommended — harder to distinguish real failures from filtered runs.
59
+ fix_code:
60
+ - language: yaml
61
+ label: "Broken — matrix context in job-level if with dynamic matrix"
62
+ code: |
63
+ jobs:
64
+ set-matrix:
65
+ runs-on: ubuntu-latest
66
+ outputs:
67
+ matrix: ${{ steps.gen.outputs.matrix }}
68
+ steps:
69
+ - id: gen
70
+ run: |
71
+ echo 'matrix={"customer":[{"schema":"prod"},{"schema":"staging"}]}' >> "$GITHUB_OUTPUT"
72
+
73
+ deploy:
74
+ needs: set-matrix
75
+ runs-on: ubuntu-latest
76
+ strategy:
77
+ matrix: ${{ fromJson(needs.set-matrix.outputs.matrix) }}
78
+ # ❌ FAILS: "Unrecognized named-value: 'matrix'"
79
+ if: contains(inputs.SCHEMAS, matrix.customer.schema)
80
+ steps:
81
+ - run: echo "Deploying ${{ matrix.customer.schema }}"
82
+
83
+ - language: yaml
84
+ label: "Fixed Option 1 — pre-filter in the matrix-generation job"
85
+ code: |
+       jobs:
+         set-matrix:
+           runs-on: ubuntu-latest
+           outputs:
+             matrix: ${{ steps.gen.outputs.matrix }}
+           steps:
+             - id: gen
+               # ✅ Generate only the matrix entries that should run
+               env:
+                 # Pass the input through env instead of interpolating it into the script
+                 SCHEMAS: ${{ inputs.SCHEMAS }}
+               run: |
+                 # Keep entries whose schema is in the comma-separated SCHEMAS input.
+                 # IN() takes a stream, so the split array is expanded with [].
+                 MATRIX=$(jq -cn --arg schemas "$SCHEMAS" \
+                   '[{"schema":"prod"},{"schema":"staging"}]
+                    | map(select(.schema | IN($schemas | split(",")[])))
+                    | {customer: .}')
+                 echo "matrix=$MATRIX" >> "$GITHUB_OUTPUT"
+
+         deploy:
+           needs: set-matrix
+           runs-on: ubuntu-latest
+           strategy:
+             matrix: ${{ fromJson(needs.set-matrix.outputs.matrix) }}
+           # ✅ No job-level if needed: matrix is already filtered
+           steps:
+             - run: echo "Deploying ${{ matrix.customer.schema }}"
111
+
112
+ - language: yaml
113
+ label: "Fixed Option 2 — move filtering to step-level if"
114
+ code: |
115
+ jobs:
116
+ deploy:
117
+ needs: set-matrix
118
+ runs-on: ubuntu-latest
119
+ strategy:
120
+ matrix: ${{ fromJson(needs.set-matrix.outputs.matrix) }}
121
+ # ✅ No job-level if — allow all matrix entries through
122
+ steps:
123
+ # Log filtered entries; `exit 0` ends only this step, so every later step repeats the condition
124
+ - name: Check if this schema should deploy
125
+ # ✅ matrix context IS available in step-level if conditions
126
+ if: "!contains(inputs.SCHEMAS, matrix.customer.schema)"
127
+ run: |
128
+ echo "Skipping schema ${{ matrix.customer.schema }}"
129
+ exit 0
130
+
131
+ - name: Deploy
132
+ if: contains(inputs.SCHEMAS, matrix.customer.schema)
133
+ run: echo "Deploying ${{ matrix.customer.schema }}"
134
+
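+   # Sketch (an addition): when the pre-filtered list can be empty, the consumer job
+   # would receive a matrix vector with no values, which is expected to fail. A job-level
+   # `if:` on the `needs` context, which is available there, skips the job instead.
+   # The `count` output name is a placeholder chosen for this example.
+   - language: yaml
+     label: "Sketch: skip the consumer job when the filtered matrix is empty"
+     code: |
+       jobs:
+         set-matrix:
+           runs-on: ubuntu-latest
+           outputs:
+             matrix: ${{ steps.gen.outputs.matrix }}
+             count: ${{ steps.gen.outputs.count }}
+           steps:
+             - id: gen
+               env:
+                 SCHEMAS: ${{ inputs.SCHEMAS }}
+               run: |
+                 MATRIX=$(jq -cn --arg schemas "$SCHEMAS" \
+                   '[{"schema":"prod"},{"schema":"staging"}]
+                    | map(select(.schema | IN($schemas | split(",")[])))
+                    | {customer: .}')
+                 echo "matrix=$MATRIX" >> "$GITHUB_OUTPUT"
+                 echo "count=$(jq '.customer | length' <<< "$MATRIX")" >> "$GITHUB_OUTPUT"
+
+         deploy:
+           needs: set-matrix
+           # Only `needs` is referenced here, never `matrix`
+           if: needs.set-matrix.outputs.count != '0'
+           runs-on: ubuntu-latest
+           strategy:
+             matrix: ${{ fromJson(needs.set-matrix.outputs.matrix) }}
+           steps:
+             - run: echo "Deploying ${{ matrix.customer.schema }}"
+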
135
+ prevention:
136
+ - "When you need per-combination filtering, pre-filter the matrix JSON in the generation step rather than relying on job-level `if:` with matrix context."
137
+ - "Use step-level `if:` conditions (not job-level) when you must reference `matrix.*` context for filtering — step-level conditions evaluate after matrix expansion."
138
+ - "Check GitHub docs for 'Context availability' to confirm which contexts are available at each level before authoring complex conditional logic."
139
+ docs:
140
+ - url: 'https://github.com/actions/runner/issues/1985'
141
+ label: 'actions/runner#1985 — Unrecognized named-value: matrix in job if conditional (64 reactions)'
142
+ - url: 'https://github.community/t/matrix-cannot-be-used-in-jobs-level-if/17177'
143
+ label: 'GitHub Community: matrix cannot be used in jobs level if'
144
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/using-a-matrix-for-your-jobs'
145
+ label: 'GitHub Docs: Using a matrix for your jobs'
146
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/contexts#context-availability'
147
+ label: 'GitHub Docs: Context availability table'
@@ -0,0 +1,146 @@
1
+ id: runner-environment-236
2
+ title: '`ubuntu-slim` runner has Docker CLI but no Docker daemon — docker build/pull and Dockerfile container actions fail'
3
+ category: runner-environment
4
+ severity: error
5
+ tags:
6
+ - ubuntu-slim
7
+ - docker
8
+ - docker-daemon
9
+ - container-action
10
+ - docker-build
11
+ - dockerfile
12
+ - lightweight-runner
13
+ patterns:
14
+ - regex: 'Cannot connect to the Docker daemon at unix:///var/run/docker\.sock'
15
+ flags: 'i'
16
+ - regex: 'docker\.sock.*no such file or directory'
17
+ flags: 'i'
18
+ - regex: 'ERROR: Cannot connect to the Docker daemon'
19
+ flags: 'i'
20
+ - regex: 'ubuntu-slim.*docker|docker.*ubuntu-slim'
21
+ flags: 'i'
22
+ error_messages:
23
+ - "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"
24
+ - "ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock: connect: no such file or directory"
25
+ - "Docker build failed with exit code 1, back off 3.588 seconds before retry."
26
+ - "failed to connect to the docker API at unix:///var/run/docker.sock; check if the path is correct and if the daemon is running: dial unix /var/run/docker.sock: connect: no such file or directory"
27
+ - "/usr/bin/docker build ... ERROR: Cannot connect to the Docker daemon"
28
+ root_cause: |
29
+ The `ubuntu-slim` GitHub-hosted runner image (added January 2026 via runner-images
30
+ PR#13542) is a **lightweight Ubuntu image intentionally stripped of the Docker daemon**.
31
+ It is designed for fast, low-overhead CI jobs (linting, unit tests, scripting) that
32
+ do not need container tooling.
33
+
34
+ **What ubuntu-slim DOES have:**
35
+ - Docker CLI (`docker` command is on `PATH`)
36
+ - Standard shell tooling, git, curl, Node.js, Python
37
+
38
+ **What ubuntu-slim does NOT have:**
39
+ - Docker daemon (no `/var/run/docker.sock` socket)
40
+ - Docker daemon service (not running — not even stopped, just not installed)
41
+ - BuildKit / buildx (requires daemon)
42
+
43
+ This affects several common patterns:
44
+ 1. **`docker build` / `docker run` / `docker pull` in `run:` steps** — All fail
45
+ immediately because there is no daemon to dispatch commands to.
46
+ 2. **Container actions** (`uses: some-org/docker-action@v1` where the action has a
47
+ `Dockerfile`) — The Actions runner tries to build the action's `Dockerfile` using
48
+ the host Docker daemon before executing the step. On `ubuntu-slim`, this fails during
49
+ job setup with "Cannot connect to the Docker daemon."
50
+ 3. **`docker/setup-buildx-action`** — Fails because BuildKit requires a Docker daemon.
51
+ 4. **Service containers** (`services:` block using Docker images) — Fail because
52
+ pulling and starting service containers requires the Docker daemon.
53
+
54
+ GitHub has closed requests to add Docker daemon to ubuntu-slim as "not_planned"
55
+ (runner-images#13583 Feb 2026), confirming this is **by design**.
56
+
57
+ This is distinct from `runner-environment-030` (ubuntu-arm64-docker-not-preinstalled),
58
+ which covers ARM64 partner runners where Docker CLI itself is missing. On ubuntu-slim,
59
+ Docker CLI IS present — only the daemon is absent.
60
+ fix: |
61
+ **Option 1 (Recommended) — Switch to `ubuntu-latest` or `ubuntu-24.04`:**
62
+ If your workflow uses Docker in any form (docker build, container actions, service
63
+ containers), use a full Ubuntu image. Docker daemon is pre-installed and running
64
+ on `ubuntu-latest` and `ubuntu-24.04`.
65
+
66
+ **Option 2 — Move Docker-dependent steps into a separate job:**
67
+ If you want to use ubuntu-slim for most of a workflow but have one Docker step, you
68
+ cannot run that step on ubuntu-slim. Restructure into two jobs: one on ubuntu-slim for
69
+ fast non-Docker steps, one on ubuntu-latest for Docker steps.
70
+
71
+ **Option 3 — Point the Docker CLI at a remote daemon (advanced, not recommended):**
72
+ A `dind` (Docker-in-Docker) service container cannot help, because service containers themselves need the host daemon. Setting `DOCKER_HOST`
73
+ to a daemon on another machine works in principle, but it is complex, slow, and defeats the purpose of ubuntu-slim.
74
+
75
+ **Option 4 — Replace Dockerfile container actions with JS/composite alternatives:**
76
+ Some Docker-based third-party actions have JavaScript equivalents that don't require
77
+ a daemon. Check if the action you're using has a non-Docker version.
78
+ fix_code:
79
+ - language: yaml
80
+ label: 'Broken — docker commands and container actions fail on ubuntu-slim'
81
+ code: |
82
+ jobs:
83
+ build:
84
+ runs-on: ubuntu-slim # ✗ No Docker daemon available
85
+
86
+ steps:
87
+ - uses: actions/checkout@v4
88
+
89
+ # ✗ Fails: "Cannot connect to the Docker daemon at unix:///var/run/docker.sock"
90
+ - name: Build Docker image
91
+ run: docker build -t myapp:latest .
92
+
93
+ # ✗ Also fails: action uses Dockerfile, requires Docker daemon to build
94
+ - name: Run Docker-based action
95
+ uses: some-org/docker-based-action@v2
96
+
97
+ - language: yaml
98
+ label: 'Fixed — use ubuntu-latest for any workflow that needs Docker'
99
+ code: |
100
+ jobs:
101
+ build:
102
+ runs-on: ubuntu-latest # ✓ Docker daemon pre-installed and running
103
+
104
+ steps:
105
+ - uses: actions/checkout@v4
106
+
107
+ # ✓ Works: Docker daemon available on ubuntu-latest
108
+ - name: Build Docker image
109
+ run: docker build -t myapp:latest .
110
+
111
+ # ✓ Works: container actions can build their Dockerfiles
112
+ - name: Run Docker-based action
113
+ uses: some-org/docker-based-action@v2
114
+
115
+ - language: yaml
116
+ label: 'Pattern — split jobs to use ubuntu-slim for fast steps, ubuntu-latest for Docker'
117
+ code: |
118
+ jobs:
119
+ lint:
120
+ runs-on: ubuntu-slim # ✓ Fast, no Docker needed for lint
121
+ steps:
122
+ - uses: actions/checkout@v4
123
+ - run: npm run lint
124
+
125
+ docker-build:
126
+ runs-on: ubuntu-latest # ✓ Docker available for build/push steps
127
+ needs: lint
128
+ steps:
129
+ - uses: actions/checkout@v4
130
+ - run: docker build -t myregistry/myapp:latest .
131
+ - run: docker push myregistry/myapp:latest
132
+
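+   # Sketch (an addition): a guard step that fails fast with a clear message when no
+   # daemon is reachable. It relies on `docker info` exiting non-zero in that case.
+   - language: yaml
+     label: 'Sketch: fail fast when the runner has no Docker daemon'
+     code: |
+       - name: Check for a Docker daemon
+         run: |
+           if ! docker info > /dev/null 2>&1; then
+             echo "::error::No Docker daemon on this runner. Move Docker steps to ubuntu-latest."
+             exit 1
+           fi
+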
133
+ prevention:
134
+ - 'Before switching to `ubuntu-slim`, audit your workflow for `docker` commands, Dockerfile-based `uses:` steps, and `services:` blocks — any of these require a Docker daemon.'
135
+ - 'Container actions (actions with a `Dockerfile`) fail during job setup on ubuntu-slim, so the error appears before any step runs. Check the "Set up job" section of the logs.'
136
+ - 'Check the ubuntu-slim software list at https://github.com/actions/runner-images to confirm Docker daemon is not present before migrating workflows.'
137
+ - 'Service containers defined in the `services:` block also require the Docker daemon, so they fail on ubuntu-slim as well.'
138
+ docs:
139
+ - url: 'https://github.com/actions/runner-images/issues/13583'
140
+ label: 'runner-images#13583: Docker pull failed on ubuntu-slim (closed as not_planned — by design)'
141
+ - url: 'https://github.com/actions/runner-images/pull/13542'
142
+ label: 'runner-images PR#13542: Add ubuntu-slim image (Jan 2026)'
143
+ - url: 'https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners'
144
+ label: 'GitHub Docs: About GitHub-hosted runners'
145
+ - url: 'https://docs.github.com/en/actions/creating-actions/creating-a-docker-container-action'
146
+ label: 'GitHub Docs: Creating a Docker container action'
@@ -0,0 +1,146 @@
1
+ id: runner-environment-237
2
+ title: 'Container job `$HOME` is hardcoded to `/github/home` — Docker images built with `/root` home break silently'
3
+ category: runner-environment
4
+ severity: error
5
+ tags:
6
+ - container
7
+ - HOME
8
+ - docker
9
+ - self-hosted
10
+ - environment-variable
11
+ - tool-cache
12
+ patterns:
13
+ - regex: '\$HOME.*github/home'
14
+ flags: 'i'
15
+ - regex: 'Could not find.*\$HOME.*plugin|command not found.*HOME.*github'
16
+ flags: 'i'
17
+ - regex: 'no such file.*github/home|Permission denied.*github/home'
18
+ flags: 'i'
19
+ error_messages:
20
+ - "The reason `bluebase plugins` doesn't work is because it depends on `$HOME` pointing to `/root` but now GitHub Actions has changed it to `/github/home`."
21
+ - "Error: Could not find plugin at /github/home/.cache/@bluebase"
22
+ - "rustc: command not found"
23
+ - "cargo: command not found"
24
+ - "go: command not found"
25
+ root_cause: |
26
+ When using `jobs.<name>.container`, the GitHub Actions runner mounts a host volume at
27
+ `/github/home` and unconditionally overrides the `HOME` environment variable to point there,
28
+ regardless of:
29
+ - The Docker image's configured user or home directory
30
+ - Any `env: HOME:` value set in the container spec
31
+ - The container image's `ENV HOME` instruction
32
+
33
+ This is hard-coded in ContainerOperationProvider.cs:
34
+ `-v "/runner/work/_temp/_github_home":"/github/home"`
35
+ `HOME=/github/home`
36
+
37
+ The volume mount overwrites whatever was at `/github/home` inside the container, and
38
+ the forced HOME env var points to that empty/host-controlled directory.
39
+
40
+ **Impact on pre-installed tools**: Any CLI tool or plugin manager that stores
41
+ state relative to `$HOME` during the Docker image build (e.g. `npm global`, Rust's
42
+ `cargo`, Go binaries in `~/go/bin`, Homebrew cellar, Python user installs at
43
+ `~/.local`) will no longer find its data because HOME now points to `/github/home`
44
+ instead of `/root` or the image user's home.
45
+
46
+ Source: actions/runner#863 (124 reactions, open since 2021)
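+
+ To observe the override directly, a throwaway job along these lines can help. It is a
+ sketch that assumes a stock `ubuntu:22.04` image (where `getent` is available); the job
+ and step names are illustrative:
+
+ ```yaml
+ jobs:
+   inspect-home:
+     runs-on: ubuntu-latest
+     container:
+       image: ubuntu:22.04
+     steps:
+       - name: Compare runtime HOME with the image user's home
+         run: |
+           # HOME as injected by the runner for this container job
+           echo "HOME=$HOME"
+           # Home directory recorded in the image's /etc/passwd for the current user
+           echo "passwd home=$(getent passwd "$(id -u)" | cut -d: -f6)"
+           # Contents of the directory HOME points to
+           ls -la "$HOME"
+ ```
+
+ If the two paths differ, anything the image installed under the second one is not where
+ tools that resolve paths through `$HOME` will look.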
47
+ fix: |
48
+ **Option 1 (recommended): Prefix affected commands with `HOME=/root`**
49
+
50
+ Temporarily reset HOME to the original image home for each step that relies on
51
+ pre-installed tools. Do NOT set HOME permanently in the workflow — the runner
52
+ depends on `/github/home` for internal state.
53
+
54
+ **Option 2: Rebuild the Docker image so tools do not live under `$HOME`**
55
+
56
+ Setting `ENV HOME=/github/home` in the Dockerfile and installing tools there does not
57
+ work: the runtime volume mount hides whatever the image placed at `/github/home`.
58
+ Install tools to a HOME-independent location such as `/usr/local` instead.
59
+
60
+ **Option 3: Copy tool state in a setup step**
61
+
62
+ Add an entrypoint or a workflow step that copies `/root/.config`, `/root/.cache`,
63
+ `/root/go`, etc. to `/github/home` before the steps that need them run.
64
+
65
+ **Option 4: Avoid container jobs for images with pre-installed tools**
66
+
67
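+ Use a Docker action (`uses: docker://image`) instead of `jobs.<name>.container`.
68
+ Check `$HOME` there first: GitHub's docs also list `/github/home` as a static directory for Docker container actions, so the same override may apply.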
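+
+ Option 3 can be sketched as a first step in the job. This assumes the image was built as
+ root, so its tool state lives under `/root`; the image name and the directory list are
+ illustrative and should be adjusted to what the image actually contains:
+
+ ```yaml
+ jobs:
+   build:
+     runs-on: ubuntu-latest
+     container:
+       image: my-rust-tools:latest   # hypothetical image, as in the examples in this entry
+     steps:
+       - name: Copy image-time tool state into the runtime HOME
+         run: |
+           # Copy each directory that exists under /root into /github/home
+           for dir in .cargo .rustup .config .cache go; do
+             if [ -e "/root/$dir" ]; then
+               cp -a "/root/$dir" "$HOME/"
+             fi
+           done
+           # Make the copied cargo binaries visible to later steps
+           echo "$HOME/.cargo/bin" >> "$GITHUB_PATH"
+       - uses: actions/checkout@v4
+       - run: cargo build --release
+ ```
+
+ The copy runs on every job, so this suits small plugin caches better than large toolchains.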
69
+ fix_code:
70
+ - language: yaml
71
+ label: "Broken — pre-installed cargo/rust not found in container job"
72
+ code: |
73
+ jobs:
74
+ build:
75
+ runs-on: ubuntu-latest
76
+ container:
77
+ image: my-rust-tools:latest # Built with cargo installed at /root/.cargo
78
+ steps:
79
+ - uses: actions/checkout@v4
80
+ - name: Build
81
+ run: cargo build --release # ❌ FAILS: cargo not found — HOME=/github/home
82
+
83
+ - language: yaml
84
+ label: "Fixed — reset HOME per-step for pre-built tool invocations"
85
+ code: |
86
+ jobs:
87
+ build:
88
+ runs-on: ubuntu-latest
89
+ container:
90
+ image: my-rust-tools:latest
91
+ steps:
92
+ - uses: actions/checkout@v4
93
+ - name: Build
94
+ run: HOME=/root cargo build --release # ✅ HOME temporarily reset to where cargo is
95
+
96
+ # Or use env: at the step level:
97
+ - name: Run tests
98
+ env:
99
+ HOME: /root
100
+ run: cargo test
101
+
102
+ - language: yaml
103
+ label: "Fixed — rebuild Docker image with tools installed outside $HOME"
104
+ code: |
105
+ # Dockerfile: install tools outside $HOME so the /github/home mount cannot hide them
106
+ FROM ubuntu:22.04
107
+
108
+ # curl and CA certificates download rustup; gcc provides the linker cargo needs
109
+ RUN apt-get update \
110
+     && apt-get install -y --no-install-recommends ca-certificates curl gcc libc6-dev \
111
+     && rm -rf /var/lib/apt/lists/*
112
+
113
+ # Installing under /github/home does NOT work: the runtime volume mount hides
114
+ # anything the image placed there. Use fixed, HOME-independent locations instead.
115
+ ENV RUSTUP_HOME=/usr/local/rustup \
116
+     CARGO_HOME=/usr/local/cargo \
117
+     PATH=/usr/local/cargo/bin:${PATH}
118
+
119
+ RUN curl https://sh.rustup.rs -sSf | sh -s -- -y --no-modify-path --default-toolchain stable \
120
+     && chmod -R a+w "$RUSTUP_HOME" "$CARGO_HOME"
121
+
122
+ - language: yaml
123
+ label: "Alternative — use Docker action instead of container job (verify $HOME first)"
124
+ code: |
125
+ jobs:
126
+ build:
127
+ runs-on: ubuntu-latest
128
+ steps:
129
+ - uses: actions/checkout@v4
130
+ - name: Build in custom container
131
+ uses: docker://my-rust-tools:latest # Print $HOME here first; it may also be /github/home
132
+ with:
133
+ args: cargo build --release
134
+
135
+ prevention:
136
+ - "When building Docker images for use in `jobs.<name>.container`, install binaries to `/usr/local/bin` or another PATH location not dependent on `$HOME`."
137
+ - "Test container images locally by running `docker run -e HOME=/github/home <image> <command>` to reproduce the GHA HOME override before using them in workflows."
138
+ - "Before switching to Docker actions (`uses: docker://image`) for an image that relies on a specific HOME directory, print `$HOME` inside the action: GitHub's docs also list `/github/home` for Docker container actions."
139
+ - "Read actions/runner#863 before publishing a Docker image intended for use as a GHA container job."
140
+ docs:
141
+ - url: 'https://github.com/actions/runner/issues/863'
142
+ label: 'actions/runner#863 — HOME is overridden for containers (124 reactions, open since 2021)'
143
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/running-jobs-in-a-container'
144
+ label: 'GitHub Docs: Running jobs in a container'
145
+ - url: 'https://stackoverflow.com/questions/58516181/missing-installed-dependencies-when-docker-image-is-used'
146
+ label: 'Stack Overflow: Missing installed dependencies when Docker image is used (Score: 3)'
@@ -0,0 +1,151 @@
1
+ id: runner-environment-238
2
+ title: 'Self-hosted runner with Docker container steps creates root-owned files in workspace — `git clean` fails on next run'
3
+ category: runner-environment
4
+ severity: error
5
+ tags:
6
+ - self-hosted
7
+ - docker
8
+ - container
9
+ - workspace
10
+ - permissions
11
+ - checkout
12
+ - root-owned
13
+ patterns:
14
+ - regex: 'warning: could not open directory.*Permission denied'
15
+ flags: 'i'
16
+ - regex: 'failed to remove.*Directory not empty|rm.*cannot remove.*Permission denied'
17
+ flags: 'i'
18
+ - regex: 'Unable to clean or reset the repository.*recreated'
19
+ flags: 'i'
20
+ error_messages:
21
+ - "warning: could not open directory 'foo/': Permission denied"
22
+ - "warning: failed to remove foo/: Directory not empty"
23
+ - "##[warning]Unable to clean or reset the repository. The repository will be recreated instead."
24
+ - "##[error]Command failed: rm -rf \"/home/runner/_work/repo/repo/foo\""
25
+ - "rm: cannot remove '/home/runner/_work/repo/repo/foo': Permission denied"
26
+ - "error: failed to run 'git clean -ffdx': exit code 1"
27
+ root_cause: |
28
+ When a GitHub Actions workflow on a **self-hosted runner** uses a Docker-based step
29
+ (either `container:` jobs or `uses: docker://` actions), files written to the
30
+ workspace during that step are owned by `root:root` — because most Docker containers
31
+ run as root by default.
32
+
33
+ The self-hosted runner process runs as a non-root user (e.g., `github-runner`, `ubuntu`).
34
+ On the **next workflow run**, `actions/checkout` attempts to clean the workspace by
35
+ running `git clean -ffdx`. The runner user can see the root-owned files but lacks
36
+ permission to delete them, so `git clean` reports the paths and exits non-zero.
37
+ The checkout action (not Git) then falls back to deleting the workspace contents
38
+ itself with `rm -rf`, which fails with `Permission denied` for the same reason.
39
+
40
+ The checkout action logs a warning "Unable to clean or reset the repository. The
41
+ repository will be recreated instead." — then `rm -rf` of the entire workspace also
42
+ fails because root-owned directories are nested inside. The job fails permanently
43
+ until a human manually removes the root-owned files on the runner host.
44
+
45
+ **Why GitHub-hosted runners don't hit this**: GitHub-hosted runners are ephemeral —
46
+ the workspace is destroyed after every job, so there's no cross-run persistence of
47
+ root-owned files. Self-hosted runners persist the workspace by default.
48
+
49
+ Source: actions/runner#434 (131 reactions, open since 2020)
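+
+ To confirm that leftover ownership is the cause, a diagnostic job like the following can
+ list the offending paths. It is a sketch that assumes GNU `find` on the runner host (for
+ `-printf`); the job and step names are illustrative:
+
+ ```yaml
+ jobs:
+   inspect-workspace:
+     runs-on: [self-hosted, linux]
+     steps:
+       - name: List workspace entries not owned by the runner user
+         run: |
+           # Runs before any checkout so leftovers from the previous run are still present
+           find "${{ github.workspace }}" ! -user "$(id -un)" -printf '%u:%g %p\n' | head -n 20
+ ```
+
+ Each output line shows the owning user and group followed by the path, which makes it
+ easy to see which container step produced the files.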
50
+ fix: |
51
+ **Option 1 (recommended): Run the container as the same UID as the host runner user**
52
+
53
+ Pass `--user $(id -u):$(id -g)` to `docker run`, or the literal numeric IDs in `container.options`
54
+ (which is not shell-expanded), so files are written with the runner's UID/GID. Most images support this without modification.
55
+
56
+ **Option 2: Add a cleanup step before checkout**
57
+
58
+ Add a step at the start of the workflow that removes workspace contents using sudo.
59
+ Requires configuring passwordless sudo for the runner user.
60
+
61
+ **Option 3: Configure the container to run as root but add a post-step chown**
62
+
63
+ After container steps, add a step that runs `sudo chown -R $USER:$USER .` to
64
+ reclaim ownership. Still requires sudo access.
65
+
66
+ **Option 4: Use ephemeral (one-shot) self-hosted runners**
67
+
68
+ Ephemeral runners create a fresh workspace per job. No cross-run file persistence
69
+ means root-owned files from Docker steps never accumulate.
70
+
71
+ **Option 5: Set `clean: false` on checkout and handle cleanup yourself**
72
+
73
+ Disable automatic workspace cleaning in checkout and add your own robust cleanup
74
+ step that can handle permission errors.
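+
+ Option 3 can be sketched as a final step that always runs. This assumes the runner user
+ has passwordless sudo for `chown`; the `node:20` image and build commands are the same
+ illustrative ones used elsewhere in this entry:
+
+ ```yaml
+ jobs:
+   build:
+     runs-on: [self-hosted, linux]
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Build in container (runs as root)
+         run: |
+           docker run --rm \
+             -v "${{ github.workspace }}:/work" \
+             -w /work \
+             node:20 \
+             sh -c "npm ci && npm run build"
+
+       - name: Reclaim workspace ownership
+         if: always()   # run even when the build step fails
+         run: sudo chown -R "$(id -u):$(id -g)" "${{ github.workspace }}"
+ ```
+
+ The `if: always()` guard matters because a failed build is exactly the case where
+ root-owned files would otherwise be left behind for the next run.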
75
+ fix_code:
76
+ - language: yaml
77
+ label: "Fixed — run Docker container as host runner's UID"
78
+ code: |
79
+ jobs:
80
+ build:
81
+ runs-on: [self-hosted, linux]
82
+ container:
83
+ image: node:20
84
+ options: '--user 1001:1001' # ✅ Match runner UID — no root-owned files
85
+
86
+ steps:
87
+ - uses: actions/checkout@v4
88
+ - run: npm ci && npm run build
89
+
90
+ - language: yaml
91
+ label: "Fixed — run container as current user dynamically"
92
+ code: |
93
+ jobs:
94
+ build:
95
+ runs-on: [self-hosted, linux]
96
+ steps:
97
+ - uses: actions/checkout@v4
98
+
99
+ - name: Build in container (current user)
100
+ run: |
101
+ docker run --rm \
102
+ --user "$(id -u):$(id -g)" \
103
+ -v "${{ github.workspace }}:/work" \
104
+ -w /work \
105
+ node:20 \
106
+ sh -c "npm ci && npm run build"
107
+ # ✅ Files written inside container owned by host runner user
108
+
109
+ - language: yaml
110
+ label: "Fixed — cleanup step before checkout to handle pre-existing root-owned files"
111
+ code: |
112
+ jobs:
113
+ build:
114
+ runs-on: [self-hosted, linux]
115
+ steps:
116
+ # ✅ Cleanup root-owned files before checkout
117
+ - name: Cleanup workspace
118
+ run: |
119
+ if [ -d "${{ github.workspace }}" ]; then
120
+ sudo chown -R "$USER:$USER" "${{ github.workspace }}" || true
121
+ sudo rm -rf "${{ github.workspace }}"
122
+ fi
123
+ # Requires: echo "runner ALL=(ALL) NOPASSWD: /bin/chown, /bin/rm" | sudo tee /etc/sudoers.d/runner
124
+
125
+ - uses: actions/checkout@v4
126
+
127
+ - language: yaml
128
+ label: "Fixed — register self-hosted runner as ephemeral (--ephemeral flag)"
129
+ code: |
130
+ # Register the runner with --ephemeral so each job gets a fresh workspace:
131
+ # ./config.sh --url https://github.com/org/repo --token TOKEN --ephemeral
132
+ #
133
+ # Or for ARC (Actions Runner Controller):
134
+ # spec:
135
+ # ephemeral: true # Each job pod is destroyed after completion
136
+
137
+ prevention:
138
+ - "Always run Docker container steps with `--user $(id -u):$(id -g)` on self-hosted runners to match the host runner's UID/GID."
139
+ - "Use ephemeral self-hosted runners to eliminate cross-run workspace persistence entirely."
140
+ - "Add a workspace cleanup step at the start of workflows that use Docker container steps to proactively clear root-owned files."
141
+ - "After setting up a self-hosted runner, test with a workflow that uses a `container:` job and verify subsequent runs can clean up without errors."
142
+ - "When using ARC (Actions Runner Controller), enable `ephemeral: true` on the RunnerDeployment spec."
143
+ docs:
144
+ - url: 'https://github.com/actions/runner/issues/434'
145
+ label: 'actions/runner#434 — Self-hosted runner with Docker step creates root-owned files (131 reactions)'
146
+ - url: 'https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners'
147
+ label: 'GitHub Docs: About self-hosted runners'
148
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/running-jobs-in-a-container'
149
+ label: 'GitHub Docs: Running jobs in a container'
150
+ - url: 'https://stackoverflow.com/questions/76407664/files-owned-by-rootroot-when-using-actions-checkout-on-self-hosted-runner'
151
+ label: 'Stack Overflow: Files owned by root:root when using actions/checkout on self-hosted runner'
@@ -0,0 +1,162 @@
1
+ id: silent-failures-119
2
+ title: '`github.ref` and `GITHUB_REF` intermittently empty on `release` event — workflows that tag-check silently fail or exit 1'
3
+ category: silent-failures
4
+ severity: silent-failure
5
+ tags:
6
+ - github.ref
7
+ - GITHUB_REF
8
+ - release
9
+ - release-event
10
+ - intermittent
11
+ - tag
12
+ - empty
13
+ patterns:
14
+ - regex: 'github\.ref.*release|GITHUB_REF.*release.*empty'
15
+ flags: 'i'
16
+ - regex: 'refs/tags.*empty.*release|release.*github_ref.*blank'
17
+ flags: 'i'
18
+ - regex: 'startsWith\(github\.ref.*refs/tags.*false.*release'
19
+ flags: 'i'
20
+ error_messages:
21
+ - "if [[ \"\" == refs/tags/release/* ]] ; then"
22
+ - "::error ::Could not determine the version number using ref "
23
+ - "GITHUB_REF: ''"
24
+ - "github.ref evaluates to empty string on release event"
25
+ root_cause: |
26
+ When a workflow is triggered by an `on: release` event (e.g. `published`, `created`,
27
+ `prereleased`), the `github.ref` context value and `GITHUB_REF` environment variable
28
+ are **intermittently empty** — sometimes populated correctly, sometimes empty string.
29
+
30
+ The bug is non-deterministic: re-running the same workflow sometimes succeeds (ref
31
+ populated) and sometimes fails (ref empty). It is more frequent when:
32
+ - The release is created immediately after tagging
33
+ - Multiple releases fire in quick succession
34
+ - The workflow is called as a reusable workflow with `github.ref` forwarded
35
+
36
+ **Root cause**: A race condition or internal state propagation issue in the GitHub
37
+ Actions platform where the `release` event payload is dispatched before the ref
38
+ metadata is fully resolved in all runtime components. The `github.event.release`
39
+ object IS populated correctly in the same runs that have an empty `github.ref`.
40
+
41
+ **Impact**: Workflows that use `github.ref` to extract tag names (e.g.
42
+ `startsWith(github.ref, 'refs/tags/')`, or `echo $GITHUB_REF | cut -c11-`) will
43
+ silently get an empty string, causing:
44
+ - Conditional steps to be skipped (release-only deployments don't run)
45
+ - Version parsing to produce empty strings (publishing incorrect artifacts)
46
+ - Shell scripts to `exit 1` on empty-ref checks
47
+
48
+ This issue was introduced/regressed in August 2023 (runner ~v2.307) and remains
49
+ intermittently reproducible as of April/August 2025.
50
+
51
+ Source: actions/runner#2788 (65 reactions, open since 2023)
52
+ fix: |
53
+ **Always use `github.event.release.tag_name` for release tag extraction** instead of
54
+ parsing `github.ref`. The `github.event.release` object is consistently populated
55
+ even when `github.ref` is empty.
56
+
57
+ For step-level `if:` conditions that check whether a workflow was triggered by a
58
+ release event, use `github.event_name == 'release'` instead of
59
+ `startsWith(github.ref, 'refs/tags/')`.
60
+
61
+ If you need the full `refs/tags/v1.2.3` ref format, construct it from the event:
62
+ `refs/tags/${{ github.event.release.tag_name }}`
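+
+ The same idea applies when the tag is forwarded to a reusable workflow: pass the value
+ from the event payload rather than `github.ref`. The sketch below assumes a called
+ workflow at `.github/workflows/publish.yml` with an input named `tag`; both names are
+ hypothetical:
+
+ ```yaml
+ # .github/workflows/release.yml (caller)
+ on:
+   release:
+     types: [published]
+
+ jobs:
+   publish:
+     uses: ./.github/workflows/publish.yml   # hypothetical reusable workflow
+     with:
+       tag: ${{ github.event.release.tag_name }}   # forwarded instead of github.ref
+ ---
+ # .github/workflows/publish.yml (called workflow)
+ on:
+   workflow_call:
+     inputs:
+       tag:
+         required: true
+         type: string
+
+ jobs:
+   publish:
+     runs-on: ubuntu-latest
+     steps:
+       - run: echo "Publishing ${{ inputs.tag }}"
+ ```
+
+ Marking the input as `required` makes a missing tag fail at call time instead of
+ producing an empty version string later.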
63
+ fix_code:
64
+ - language: yaml
65
+ label: "Broken — version extraction via github.ref (intermittently empty)"
66
+ code: |
67
+ on:
68
+ release:
69
+ types: [published]
70
+
71
+ jobs:
72
+ publish:
73
+ runs-on: ubuntu-latest
74
+ steps:
75
+ - name: Extract version
76
+ run: |
77
+ # ❌ BROKEN: github.ref is intermittently empty on release events
78
+ VERSION="${{ github.ref }}"
79
+ if [[ "$VERSION" == refs/tags/v* ]]; then
80
+ TAG="${VERSION#refs/tags/}"
81
+ else
82
+ echo "::error ::Could not determine version from ref: $VERSION"
83
+ exit 1
84
+ fi
85
+
86
+ # Also fragile:
87
+ - if: startsWith(github.ref, 'refs/tags/') # ❌ sometimes false when ref is empty
88
+ run: echo "Publishing release"
89
+
90
+ - language: yaml
91
+ label: "Fixed — use github.event.release.tag_name (always populated)"
92
+ code: |
93
+ on:
94
+ release:
95
+ types: [published]
96
+
97
+ jobs:
98
+ publish:
99
+ runs-on: ubuntu-latest
100
+ steps:
101
+ - name: Extract version
102
+ run: |
103
+ # ✅ FIXED: event.release.tag_name is always populated correctly
104
+ TAG="${{ github.event.release.tag_name }}"
105
+ echo "Publishing version: $TAG"
106
+ echo "VERSION=$TAG" >> "$GITHUB_ENV"
107
+
108
+ # For if: conditions, use event_name check instead of ref check:
109
+ - if: github.event_name == 'release' # ✅ Always correct
110
+ run: echo "Publishing release ${{ github.event.release.tag_name }}"
111
+
112
+ # If you need the refs/tags/... format, construct it:
113
+ - name: Construct full ref
114
+ run: |
115
+ FULL_REF="refs/tags/${{ github.event.release.tag_name }}"
116
+ echo "Full ref: $FULL_REF"
117
+
118
+ - language: yaml
119
+ label: "Fixed — defensive workaround using both sources"
120
+ code: |
121
+ on:
122
+ release:
123
+ types: [published]
124
+
125
+ jobs:
126
+ publish:
127
+ runs-on: ubuntu-latest
128
+ steps:
129
+ - name: Determine tag with fallback
130
+ id: tag
131
+ run: |
132
+ # Primary: event object (always populated)
133
+ TAG="${{ github.event.release.tag_name }}"
134
+
135
+ # Fallback: parse from github.ref if event is empty (unlikely)
136
+ if [ -z "$TAG" ] && [ -n "${{ github.ref }}" ]; then
137
+ TAG="${{ github.ref }}"
138
+ TAG="${TAG#refs/tags/}"
139
+ fi
140
+
141
+ if [ -z "$TAG" ]; then
142
+ echo "::error ::Could not determine release tag from either source"
143
+ exit 1
144
+ fi
145
+ echo "tag=$TAG" >> "$GITHUB_OUTPUT"
146
+
147
+ - run: echo "Publishing ${{ steps.tag.outputs.tag }}"
148
+
149
+ prevention:
150
+ - "Never parse release tags from `github.ref` — always use `github.event.release.tag_name` in release-triggered workflows."
151
+ - "Use `github.event_name == 'release'` in `if:` conditions rather than `startsWith(github.ref, 'refs/tags/')` to avoid empty-ref false negatives."
152
+ - "Add explicit error messages when tag extraction returns empty string so intermittent failures are visible rather than silent."
153
+ - "If a release workflow fails unexpectedly, check whether `github.ref` was empty by adding `run: echo \"ref=${{ github.ref }}\"; echo \"tag=${{ github.event.release.tag_name }}\"`."
154
+ docs:
155
+ - url: 'https://github.com/actions/runner/issues/2788'
156
+ label: 'actions/runner#2788 — github.ref is empty for workflows triggered by release (65 reactions)'
157
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/events-that-trigger-workflows#release'
158
+ label: 'GitHub Docs: Events that trigger workflows — release'
159
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/contexts#github-context'
160
+ label: 'GitHub Docs: github context'
161
+ - url: 'https://github.com/orgs/community/discussions/64528'
162
+ label: 'GitHub Community: github.ref empty on release event (discussion)'
@@ -0,0 +1,173 @@
1
+ id: triggers-073
2
+ title: '`on: workflow_run: branches:` filter matches head_branch of triggering run — PR merge triggers are silently skipped when head_branch is the PR branch, not the merge target'
3
+ category: triggers
4
+ severity: silent-failure
5
+ tags:
6
+ - workflow_run
7
+ - branches-filter
8
+ - head-branch
9
+ - pr-merge
10
+ - silent-skip
11
+ - trigger-not-firing
12
+ - deploy-workflow
13
+ patterns:
14
+ - regex: 'workflow_run:[\s\S]{0,200}branches:'
15
+ flags: 'im'
16
+ - regex: 'github\.event\.workflow_run\.head_branch'
17
+ flags: 'i'
18
+ - regex: 'workflow_run.*branches.*\[.*main.*\]'
19
+ flags: 'i'
20
+ error_messages:
21
+ - "workflow not triggering after PR merge"
22
+ - "workflow_run with branches filter not firing on merge to main"
23
+ root_cause: |
24
+ The `branches:` filter in an `on: workflow_run:` trigger compares against
25
+ `github.event.workflow_run.head_branch` — the branch the **source workflow** ran on.
26
+
27
+ When a pull request is opened, the source workflow (e.g., CI) runs with
28
+ `head_branch` = the feature branch name (e.g., `feat/my-change`). When GitHub
29
+ handles the PR merge, it creates a push to `main`, but it **reuses the PR's existing
30
+ check suite** rather than creating a new check run for the merge commit. As a result,
31
+ the `workflow_run` event that fires after the merge has:
32
+
33
+ ```
34
+ github.event.workflow_run.head_branch = "feat/my-change" # PR branch, not "main"
35
+ github.event.workflow_run.head_sha = "<merge-commit-sha>"
36
+ ```
37
+
38
+ If a downstream workflow has `branches: [main]`, this filter is evaluated against
39
+ `head_branch` = `"feat/my-change"`. The filter does NOT match — the downstream
40
+ workflow **silently never runs** after the PR merge.
41
+
42
+ This is counterintuitive because developers expect `branches: [main]` to mean
43
+ "trigger my deployment workflow only when CI ran on the main branch," not "only
44
+ when the source workflow's head_branch exactly equals main." The distinction matters
45
+ specifically for PR merges, where the push IS to main but the check run is attributed
46
+ to the PR branch.
47
+
48
+ **Who is affected:**
49
+ - Any deploy-on-merge pattern using `workflow_run` with `branches: [main]` or
50
+ `branches: [master]`
51
+ - Workflows that use `workflow_run` to chain CI → deploy only for the default branch
52
+
53
+ **Note:** Workflows triggered by a direct push to `main` (not a PR merge) DO have
54
+ `head_branch = "main"` and ARE correctly triggered by the `branches: [main]` filter.
55
+ The problem is specific to PR merges that go through the PR check-suite flow.
56
+
57
+ Source: observed in production deployments (commit Skretzo/shortest-path@8254fb4),
58
+ GitHub docs issue github/docs#42813 (Feb 2026).
59
+ fix: |
60
+ **Remove `branches:` from the `workflow_run` trigger** and instead enforce the branch
61
+ restriction in a **job-level `if:` condition** that explicitly checks
62
+ `github.event.workflow_run.head_branch`:
63
+
64
+ ```yaml
65
+ on:
66
+ workflow_run:
67
+ workflows: ["CI"]
68
+ types: [completed]
69
+ # ✗ Remove: branches: [main]
70
+
71
+ jobs:
72
+ deploy:
73
+ # ✓ Check head_branch here instead — this sees the actual branch
74
+ if: >-
75
+ github.event.workflow_run.conclusion == 'success' &&
76
+ github.event.workflow_run.head_branch == 'main'
77
+ ```
78
+
79
+ This approach works for both direct pushes to main AND PR merges to main,
80
+ because the `if:` condition is evaluated at job execution time with the full
81
+ `workflow_run` event payload, while the `branches:` filter is evaluated at
82
+ trigger time and uses different matching semantics.
83
+
84
+ Alternatively, use the `workflow_run` trigger WITHOUT a branches filter and
85
+ rely entirely on job conditions to control which branches proceed to deployment.
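+
+ Before committing to a branch guard, it can help to log what the triggering run actually
+ reports. The following is a diagnostic sketch, assuming the source workflow is named
+ `CI` as in the examples in this entry; the job and variable names are illustrative:
+
+ ```yaml
+ on:
+   workflow_run:
+     workflows: ["CI"]
+     types: [completed]
+
+ jobs:
+   inspect:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Log fields of the triggering run
+         env:
+           HEAD_BRANCH: ${{ github.event.workflow_run.head_branch }}
+           HEAD_SHA: ${{ github.event.workflow_run.head_sha }}
+           SOURCE_EVENT: ${{ github.event.workflow_run.event }}
+           CONCLUSION: ${{ github.event.workflow_run.conclusion }}
+         run: |
+           echo "head_branch=$HEAD_BRANCH"
+           echo "head_sha=$HEAD_SHA"
+           echo "event=$SOURCE_EVENT"
+           echo "conclusion=$CONCLUSION"
+ ```
+
+ Comparing the logged `head_branch` for a direct push and for a PR merge shows which
+ value a `branches:` filter or an `if:` condition would have been matched against.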
86
+ fix_code:
87
+ - language: yaml
88
+ label: 'Broken — branches: [main] silently blocks PR merge triggers'
89
+ code: |
90
+ # .github/workflows/deploy.yml
91
+ on:
92
+ workflow_run:
93
+ workflows: ["CI"]
94
+ types: [completed]
95
+ branches: [main] # ✗ Matches head_branch of triggering run
96
+ # PR merges have head_branch = PR branch name
97
+ # → deploy never fires after PR merges to main
98
+
99
+ jobs:
100
+ deploy:
101
+ runs-on: ubuntu-latest
102
+ if: github.event.workflow_run.conclusion == 'success'
103
+ steps:
104
+ - run: ./deploy.sh
105
+
106
+ - language: yaml
107
+ label: 'Fixed — remove branches filter; check head_branch in job condition'
108
+ code: |
109
+ # .github/workflows/deploy.yml
110
+ on:
111
+ workflow_run:
112
+ workflows: ["CI"]
113
+ types: [completed]
114
+ # ✓ No branches filter — let all workflow_run events through
115
+
116
+ jobs:
117
+ deploy:
118
+ runs-on: ubuntu-latest
119
+ # ✓ Check head_branch in the if: condition
120
+ # head_branch == 'main' correctly identifies merges to main
121
+ if: >-
122
+ github.event.workflow_run.conclusion == 'success' &&
123
+ github.event.workflow_run.head_branch == 'main'
124
+ steps:
125
+ - uses: actions/checkout@v4
126
+ with:
127
+ ref: ${{ github.event.workflow_run.head_sha }}
128
+ - run: ./deploy.sh
129
+
130
+ - language: yaml
131
+ label: 'Pattern — full deploy-on-merge via workflow_run with branch guard'
132
+ code: |
133
+ # .github/workflows/deploy.yml
134
+ # Runs after CI completes successfully on any branch,
135
+ # but only deploys when it ran on main (direct push OR PR merge)
136
+ name: Deploy
137
+
138
+ on:
139
+ workflow_run:
140
+ workflows: ["CI"]
141
+ types: [completed]
142
+
143
+ jobs:
144
+ deploy:
145
+ name: Deploy to production
146
+ runs-on: ubuntu-latest
147
+ if: >-
148
+ github.event.workflow_run.conclusion == 'success' &&
149
+ github.event.workflow_run.head_branch == 'main'
150
+
151
+ steps:
152
+ - uses: actions/checkout@v4
153
+ with:
154
+ # ✓ Check out the exact commit that CI ran on
155
+ ref: ${{ github.event.workflow_run.head_sha }}
156
+
157
+ - name: Deploy
158
+ run: |
159
+ echo "Deploying commit ${{ github.event.workflow_run.head_sha }}"
160
+ ./deploy.sh
161
+
162
+ prevention:
163
+ - 'Never use `branches:` in `on: workflow_run:` if your deployment pattern relies on PR merges — always enforce branch restrictions in `jobs.<id>.if` conditions instead.'
164
+ - 'Test the pattern by creating a PR, merging it, and checking the Actions tab — the `workflow_run` event should appear even without a `branches` filter.'
165
+ - 'Use `github.event.workflow_run.head_branch` in the `if:` condition to distinguish deploy-worthy runs from feature-branch CI runs.'
166
+ - 'Document this behavior in your workflow comments so future contributors understand why the `branches:` filter is not used.'
167
+ docs:
168
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#workflow_run'
169
+ label: 'GitHub Docs: workflow_run trigger — branches filter behavior'
170
+ - url: 'https://github.com/github/docs/issues/42813'
171
+ label: 'github/docs#42813: Clarify branches filter on workflow_run (Feb 2026)'
172
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#using-data-from-the-triggering-workflow'
173
+ label: 'GitHub Docs: Using data from the triggering workflow'
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@htekdev/actions-debugger",
3
- "version": "1.0.126",
3
+ "version": "1.0.128",
4
4
  "description": "65+ real GitHub Actions errors, queryable by agents. CLI + MCP server + Copilot skills + error database.",
5
5
  "type": "module",
6
6
  "main": "./dist/index.js",