@htekdev/actions-debugger 1.0.131 → 1.0.133
- package/errors/caching-artifacts/caching-artifacts-078.yml +100 -0
- package/errors/caching-artifacts/caching-artifacts-079.yml +95 -0
- package/errors/concurrency-timing/concurrency-timing-061.yml +138 -0
- package/errors/known-unsolved/known-unsolved-076.yml +142 -0
- package/errors/known-unsolved/known-unsolved-077.yml +140 -0
- package/errors/runner-environment/runner-environment-246.yml +120 -0
- package/errors/silent-failures/silent-failures-121.yml +140 -0
- package/errors/triggers/triggers-074.yml +142 -0
- package/package.json +1 -1
|
@@ -0,0 +1,100 @@
|
|
|
1
|
+
id: caching-artifacts-078
|
|
2
|
+
title: '`upload-artifact/merge` aborts with "Failed to DeleteArtifact: (404) Not Found" when a delete is retried after a transient server error'
|
|
3
|
+
category: caching-artifacts
|
|
4
|
+
severity: error
|
|
5
|
+
tags:
|
|
6
|
+
- upload-artifact
|
|
7
|
+
- merge
|
|
8
|
+
- delete-merged
|
|
9
|
+
- 404
|
|
10
|
+
- idempotent-delete
|
|
11
|
+
- artifact-merge
|
|
12
|
+
patterns:
|
|
13
|
+
- regex: 'Failed to DeleteArtifact.*non-retryable error.*404|DeleteArtifact.*404.*Not Found'
|
|
14
|
+
flags: 'i'
|
|
15
|
+
- regex: 'Received non-retryable error.*Failed request.*404.*artifact not found'
|
|
16
|
+
flags: 'i'
|
|
17
|
+
error_messages:
|
|
18
|
+
- "Error: Failed to DeleteArtifact: Received non-retryable error: Failed request: (404) Not Found: artifact not found"
|
|
19
|
+
- "Attempt 1 of 5 failed with error: Unexpected token '<', \"<!DOCTYPE \"... is not valid JSON. Retrying request in 3000 ms..."
|
|
20
|
+
root_cause: |
|
|
21
|
+
When using `actions/upload-artifact/merge@v6` (or later) with `delete-merged: true`,
|
|
22
|
+
the action deletes each source artifact after merging them into the combined artifact.
|
|
23
|
+
If a delete request encounters a transient server error (e.g., a malformed HTML response
|
|
24
|
+
instead of JSON — indicated by "Unexpected token '<'"), the action retries the delete.
|
|
25
|
+
|
|
26
|
+
If the first delete request was actually processed by the server before the error was
|
|
27
|
+
returned to the client, the artifact is already gone. The retry then receives a 404
|
|
28
|
+
Not Found, which `DeleteArtifact` treats as a **non-retryable fatal error** and
|
|
29
|
+
immediately aborts the job.
|
|
30
|
+
|
|
31
|
+
Because `delete-merged: true` is destructive — source artifacts have already been
|
|
32
|
+
deleted by this point — retrying the entire job from scratch is impossible, as the
|
|
33
|
+
source artifacts no longer exist.
|
|
34
|
+
|
|
35
|
+
Root bug: `DeleteArtifact` should treat 404 as a success (idempotent delete — the
|
|
36
|
+
artifact not existing IS the desired state). This is tracked in upload-artifact#751
|
|
37
|
+
(open, Jan 2026). No server-side fix has been released as of June 2026.
|
|
38
|
+
fix: |
|
|
39
|
+
Option 1 — Add `continue-on-error: true` to the merge step so that the 404 error
|
|
40
|
+
does not fail the overall job. Verify that the merge artifact was created successfully
|
|
41
|
+
before relying on this workaround, since `continue-on-error` also swallows real
|
|
42
|
+
failures.
|
|
43
|
+
|
|
44
|
+
Option 2 — Set `delete-merged: false` and handle deletion of source artifacts in a
|
|
45
|
+
separate step with explicit error handling (e.g., a `gh api --method DELETE` call and
|
|
46
|
+
a conditional step that ignores 404 responses).
|
|
47
|
+
|
|
48
|
+
Option 3 — Wrap the merge step in a retry loop using `nick-invision/retry` or a
|
|
49
|
+
shell retry, but note that re-running the merge after partial deletion will fail because
|
|
50
|
+
source artifacts are gone. Prevention is better than recovery here.
|
|
51
|
+
|
|
52
|
+
Monitor upload-artifact#751 for an upstream fix that makes 404 on DeleteArtifact
|
|
53
|
+
idempotent/non-fatal.
|
|
54
|
+
fix_code:
|
|
55
|
+
- language: yaml
|
|
56
|
+
label: 'continue-on-error workaround — prevents 404 from failing the job'
|
|
57
|
+
code: |
|
|
58
|
+
- name: Merge artifacts
|
|
59
|
+
uses: actions/upload-artifact/merge@v6
|
|
60
|
+
# ⚠️ Workaround: continue-on-error so a DeleteArtifact 404 does not fail the job.
|
|
61
|
+
# Also swallows real upload failures — add a downstream step to verify the
|
|
62
|
+
# merged artifact exists if reliability matters.
|
|
63
|
+
continue-on-error: true
|
|
64
|
+
with:
|
|
65
|
+
name: merged-results
|
|
66
|
+
pattern: partial-results-*
|
|
67
|
+
delete-merged: true
|
|
68
|
+
- language: yaml
|
|
69
|
+
label: 'delete-merged: false with manual deletion that ignores 404'
|
|
70
|
+
code: |
|
|
71
|
+
- name: Merge artifacts (no auto-delete)
|
|
72
|
+
uses: actions/upload-artifact/merge@v6
|
|
73
|
+
with:
|
|
74
|
+
name: merged-results
|
|
75
|
+
pattern: partial-results-*
|
|
76
|
+
delete-merged: false # ✅ Suppress automatic deletion
|
|
77
|
+
|
|
78
|
+
# Manually delete source artifacts, ignoring 404 responses (already deleted)
|
|
79
|
+
- name: Delete source artifacts
|
|
80
|
+
env:
|
|
81
|
+
GH_TOKEN: ${{ github.token }}
|
|
82
|
+
run: |
|
|
83
|
+
for artifact_id in $(gh api repos/${{ github.repository }}/actions/runs/${{ github.run_id }}/artifacts \
|
|
84
|
+
--jq '.artifacts[] | select(.name | startswith("partial-results-")) | .id'); do
|
|
85
|
+
gh api --method DELETE \
|
|
86
|
+
repos/${{ github.repository }}/actions/artifacts/$artifact_id \
|
|
87
|
+
--silent || echo "Could not delete artifact $artifact_id (likely already deleted, 404); ignoring"
|
|
88
|
+
done
|
|
89
|
+
prevention:
|
|
90
|
+
- 'Avoid `delete-merged: true` in critical CI pipelines until upload-artifact#751 is fixed — use manual deletion with 404 tolerance instead'
|
|
91
|
+
- 'Add `continue-on-error: true` to merge steps as a temporary workaround for the 404 abort'
|
|
92
|
+
- 'Keep source artifacts available by using `delete-merged: false` until the upstream fix lands'
|
|
93
|
+
- 'Monitor actions/upload-artifact#751 for a release that treats DeleteArtifact 404 as success'
|
|
94
|
+
docs:
|
|
95
|
+
- url: 'https://github.com/actions/upload-artifact/issues/751'
|
|
96
|
+
label: 'upload-artifact#751: DeleteArtifact should not consider a 404 response an error (open, Jan 2026)'
|
|
97
|
+
- url: 'https://github.com/actions/upload-artifact/blob/main/merge/README.md'
|
|
98
|
+
label: 'actions/upload-artifact merge action README — delete-merged option'
|
|
99
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/storing-workflow-data-as-artifacts'
|
|
100
|
+
label: 'GitHub Docs: Storing workflow data as artifacts'
|
|
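The failure sequence described in caching-artifacts-078 can be modelled outside of GitHub Actions. The Python below is a minimal sketch, not the action's real code: `FlakyStore`, `NotFound`, `TransientError`, and `delete_idempotent` are hypothetical names, and the fake store simulates a delete that is applied server-side while the client receives a garbled reply. It illustrates why treating a 404 on the retry as success makes the delete idempotent.

```python
class NotFound(Exception):
    """Hypothetical stand-in for an HTTP 404 response."""


class TransientError(Exception):
    """Hypothetical stand-in for a malformed (HTML instead of JSON) response."""


class FlakyStore:
    """Fake artifact store: the first delete is applied but reported as failed."""

    def __init__(self, names):
        self.names = set(names)
        self.calls = 0

    def delete(self, name):
        self.calls += 1
        if name not in self.names:
            raise NotFound(name)
        self.names.remove(name)  # the delete is processed server-side...
        if self.calls == 1:
            # ...but the reply to the first request is garbled
            raise TransientError("Unexpected token '<'")


def delete_idempotent(store, name, attempts=5):
    """Retry transient errors; treat 404 as success because the artifact is gone."""
    for _ in range(attempts):
        try:
            store.delete(name)
            return "deleted"
        except NotFound:
            return "already-gone"
        except TransientError:
            continue
    raise RuntimeError(f"could not delete {name} after {attempts} attempts")


store = FlakyStore(["partial-results-1"])
result = delete_idempotent(store, "partial-results-1")
```

With a client that treats `NotFound` as fatal instead, the second attempt in this model would abort even though the artifact had been removed, which is the behaviour the entry reports.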
@@ -0,0 +1,95 @@
|
|
|
1
|
+
id: caching-artifacts-079
|
|
2
|
+
title: '`setup-node@v5` auto-enables caching when `packageManager` is set in `package.json` — post-run fails with "Path Validation Error" if dependencies are not installed'
|
|
3
|
+
category: caching-artifacts
|
|
4
|
+
severity: error
|
|
5
|
+
tags:
|
|
6
|
+
- setup-node
|
|
7
|
+
- caching
|
|
8
|
+
- packageManager
|
|
9
|
+
- path-validation
|
|
10
|
+
- post-run
|
|
11
|
+
- setup-node-v5
|
|
12
|
+
- automatic-caching
|
|
13
|
+
patterns:
|
|
14
|
+
- regex: 'Path Validation Error.*Path.*specified.*action.*caching.*do.*not exist'
|
|
15
|
+
flags: 'i'
|
|
16
|
+
- regex: 'Path.*specified.*action.*caching.*do.{0,5}not exist.*no cache'
|
|
17
|
+
flags: 'i'
|
|
18
|
+
error_messages:
|
|
19
|
+
- "Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved."
|
|
20
|
+
root_cause: |
|
|
21
|
+
Starting with `actions/setup-node@v5`, automatic caching is enabled **by default**
|
|
22
|
+
whenever a `packageManager` field is present in the project's `package.json` file —
|
|
23
|
+
even if the workflow does not explicitly configure caching via the `cache:` input.
|
|
24
|
+
|
|
25
|
+
This means jobs that merely run linting, type-checking, or any step that does not
|
|
26
|
+
execute `npm install` / `yarn install` / `pnpm install` will still attempt to cache
|
|
27
|
+
the package manager's dependency directory in the post-run cleanup step. Because no
|
|
28
|
+
`install` was run, the expected cache paths (e.g., `~/.npm`, `~/.yarn/cache`,
|
|
29
|
+
`~/.pnpm-store`) do not exist. The toolkit's path validation then emits:
|
|
30
|
+
"Path Validation Error: Path(s) specified in the action for caching do(es) not exist"
|
|
31
|
+
and **fails the job step**, turning a previously green run red.
|
|
32
|
+
|
|
33
|
+
**Why this surprises developers:**
|
|
34
|
+
- The workflow never explicitly enables caching — the presence of `packageManager`
|
|
35
|
+
in `package.json` is the invisible trigger.
|
|
36
|
+
- The error appears in the **post-run** cleanup step, not in the setup step where
|
|
37
|
+
caching is configured, making it harder to trace.
|
|
38
|
+
- Workflows that have always worked (no `cache:` input) start failing after upgrading
|
|
39
|
+
from `setup-node@v4` to `setup-node@v5`.
|
|
40
|
+
|
|
41
|
+
**Resolution:** `setup-node@v6` (released Oct 2025) changed the default so automatic
|
|
42
|
+
caching only applies to npm; yarn/pnpm caching is now opt-in. Upgrading to v6 or
|
|
43
|
+
explicitly disabling the cache in v5 resolves the issue.
|
|
44
|
+
Source: setup-node#1363 (5 reactions, closed Oct 2025 via v6 release).
|
|
45
|
+
fix: |
|
|
46
|
+
Option 1 (recommended) — Upgrade to `actions/setup-node@v6`. The v6 release limits
|
|
47
|
+
automatic caching to npm only; yarn and pnpm caching require explicit `cache: yarn`
|
|
48
|
+
or `cache: pnpm` input.
|
|
49
|
+
|
|
50
|
+
Option 2 — Disable automatic caching in v5 by setting `package-manager-cache: false`.
|
|
51
|
+
Use this if you cannot upgrade to v6 immediately.
|
|
52
|
+
|
|
53
|
+
Option 3 — Ensure dependencies are installed in every job that uses `setup-node@v5`.
|
|
54
|
+
If the job only needs Node for tooling (lint, type-check, etc.), run a minimal
|
|
55
|
+
`npm ci --ignore-scripts` or equivalent before the setup post-run fires.
|
|
56
|
+
fix_code:
|
|
57
|
+
- language: yaml
|
|
58
|
+
label: 'Upgrade to setup-node@v6 — fixes automatic caching defaults'
|
|
59
|
+
code: |
|
|
60
|
+
# ✅ Recommended: v6 only auto-caches npm, not yarn/pnpm
|
|
61
|
+
- uses: actions/setup-node@v6
|
|
62
|
+
with:
|
|
63
|
+
node-version: '22'
|
|
64
|
+
# No cache: input needed — auto-caching is npm-only in v6
|
|
65
|
+
- language: yaml
|
|
66
|
+
label: 'Disable automatic caching in v5 with package-manager-cache: false'
|
|
67
|
+
code: |
|
|
68
|
+
# ✅ Workaround for setup-node@v5: suppress auto-caching
|
|
69
|
+
- uses: actions/setup-node@v5
|
|
70
|
+
with:
|
|
71
|
+
node-version: '22'
|
|
72
|
+
package-manager-cache: false # Disables packageManager-triggered auto-cache
|
|
73
|
+
- run: npx eslint src/
|
|
74
|
+
- language: yaml
|
|
75
|
+
label: 'Opt in to caching explicitly for jobs that do install dependencies'
|
|
76
|
+
code: |
|
|
77
|
+
# ✅ Jobs that do install: use explicit cache input instead of relying on auto-detect
|
|
78
|
+
- uses: actions/setup-node@v6
|
|
79
|
+
with:
|
|
80
|
+
node-version: '22'
|
|
81
|
+
cache: 'npm' # Explicit opt-in — cache only when deps are installed
|
|
82
|
+
- run: npm ci
|
|
83
|
+
- run: npm test
|
|
84
|
+
prevention:
|
|
85
|
+
- 'Upgrade to `setup-node@v6` which fixes the over-eager auto-caching of yarn/pnpm in v5'
|
|
86
|
+
- 'In v5, always set `package-manager-cache: false` for jobs that do not run `install`'
|
|
87
|
+
- 'Check `package.json` for a `packageManager` field — its presence triggers auto-caching in v5 even without an explicit `cache:` input'
|
|
88
|
+
- 'Run lint, type-check, and audit jobs without setup-node caching to keep them fast and avoid phantom cache failures'
|
|
89
|
+
docs:
|
|
90
|
+
- url: 'https://github.com/actions/setup-node/issues/1363'
|
|
91
|
+
label: 'setup-node#1363: v5 fails with Path Validation Error in Post Run steps (5 reactions, resolved in v6)'
|
|
92
|
+
- url: 'https://github.com/actions/setup-node/releases/tag/v6.0.0'
|
|
93
|
+
label: 'setup-node v6.0.0 release notes — auto-caching limited to npm'
|
|
94
|
+
- url: 'https://github.com/actions/setup-node/blob/main/docs/advanced-usage.md#caching-packages-data'
|
|
95
|
+
label: 'setup-node docs: Caching packages data'
|
|
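The prevention advice in caching-artifacts-079 to check `package.json` for a `packageManager` field can be automated. The following Python is a rough sketch under stated assumptions: `auto_cache_trigger` is a hypothetical helper, and it only inspects the field. It does not reproduce the detection logic inside `setup-node` itself.

```python
import json


def auto_cache_trigger(package_json_text):
    """Return the package manager name from `packageManager`, or None if absent."""
    data = json.loads(package_json_text)
    value = data.get("packageManager")
    if not isinstance(value, str) or not value:
        return None
    # The field looks like "pnpm@9.1.0"; keep only the name before "@".
    return value.split("@", 1)[0]


with_field = auto_cache_trigger('{"name": "app", "packageManager": "pnpm@9.1.0"}')
without_field = auto_cache_trigger('{"name": "app"}')
```

A non-`None` result is a hint that jobs which skip the install step may need `package-manager-cache: false` on v5.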
@@ -0,0 +1,138 @@
|
|
|
1
|
+
id: concurrency-timing-061
|
|
2
|
+
title: '`github.workflow` in reusable workflow (`workflow_call`) returns caller''s name — shared concurrency group causes deadlock'
|
|
3
|
+
category: concurrency-timing
|
|
4
|
+
severity: error
|
|
5
|
+
tags:
|
|
6
|
+
- concurrency
|
|
7
|
+
- reusable-workflow
|
|
8
|
+
- workflow-call
|
|
9
|
+
- github-workflow-context
|
|
10
|
+
- deadlock
|
|
11
|
+
- concurrency-group
|
|
12
|
+
patterns:
|
|
13
|
+
- regex: 'Canceling since a deadlock for concurrency group .+ was detected between .+ and .+'
|
|
14
|
+
flags: 'i'
|
|
15
|
+
- regex: 'deadlock.*concurrency group.*detected'
|
|
16
|
+
flags: 'i'
|
|
17
|
+
error_messages:
|
|
18
|
+
- "Canceling since a deadlock for concurrency group 'CI-refs/heads/main' was detected between 'CI workflow' and 'deploy'"
|
|
19
|
+
- "Canceling since a deadlock for concurrency group 'test-refs/heads/master' was detected between 'test workflow' and 'deploy'"
|
|
20
|
+
root_cause: |
|
|
21
|
+
When a caller workflow invokes a reusable workflow via `uses:` (the `workflow_call` trigger),
|
|
22
|
+
the `github.workflow` context variable inside the CALLEE resolves to the CALLER's workflow
|
|
23
|
+
name — not the callee's own filename. This is documented behavior but it directly undermines
|
|
24
|
+
the widely recommended concurrency group pattern of `${{ github.workflow }}-${{ github.ref }}`.
|
|
25
|
+
|
|
26
|
+
If both the caller and the callee define:
|
|
27
|
+
```yaml
|
|
28
|
+
concurrency:
|
|
29
|
+
group: ${{ github.workflow }}-${{ github.ref }}
|
|
30
|
+
cancel-in-progress: true
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
Both workflows compute the SAME group key (because `github.workflow` is the caller's name in
|
|
34
|
+
both). GitHub detects that two workflow instances are competing for the same concurrency slot
|
|
35
|
+
in a way that cannot resolve — one is waiting for the other which is waiting for the first —
|
|
36
|
+
treats it as a deadlock, and cancels one of the runs.
|
|
37
|
+
|
|
38
|
+
**Why this is especially confusing:**
|
|
39
|
+
The fix recommended for general concurrency group collisions (`${{ github.workflow }}-${{ github.ref }}`)
|
|
40
|
+
is itself the cause of this deadlock when applied naively to reusable workflows. Teams that
|
|
41
|
+
copy the "safe" concurrency pattern from caller to callee introduce the bug.
|
|
42
|
+
|
|
43
|
+
**Real-world evidence:**
|
|
44
|
+
- actions/runner#3205: `github.workflow` returns parent workflow name in child workflows
|
|
45
|
+
- vergil-project/vergil-actions#176: ci-push.yml calls ci.yml via workflow_call; both use
|
|
46
|
+
`${{ github.workflow }}-${{ github.ref }}`; deadlock detected on every push
|
|
47
|
+
- github/gh-aw#35173: workflow_call fan-out cancellations due to shared concurrency namespace
|
|
48
|
+
- SO #78101326 (6 upvotes, 1738 views): "GitHub Actions concurrency deadlock" — 6 answers
|
|
49
|
+
confirming the root cause is `github.workflow` returning the caller name
|
|
50
|
+
fix: |
|
|
51
|
+
Use `github.workflow_ref` instead of `github.workflow` in the CALLEE (reusable) workflow.
|
|
52
|
+
`github.workflow_ref` contains the full workflow file path (e.g.
|
|
53
|
+
`.github/workflows/deploy.yml@refs/heads/main`), which is unique per callee file and
|
|
54
|
+
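does NOT inherit the caller's name. Confirm this in a test run before relying on it: GitHub's reusable-workflow docs state that the `github` context in a called workflow is always associated with the caller.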
|
|
55
|
+
|
|
56
|
+
**Preferred approach:** Define concurrency ONLY in the caller workflow and remove it from the
|
|
57
|
+
callee entirely. The callee runs as part of the caller's run and does not need its own
|
|
58
|
+
concurrency group. This is the cleanest fix and matches GitHub's recommendations for
|
|
59
|
+
reusable workflow design.
|
|
60
|
+
|
|
61
|
+
**Avoid:** Defining concurrency in both caller and callee with the same expression — this
|
|
62
|
+
is the root cause of the deadlock regardless of which context variable is used for the group
|
|
63
|
+
key, unless the keys are guaranteed to differ.
|
|
64
|
+
fix_code:
|
|
65
|
+
- language: yaml
|
|
66
|
+
label: 'Callee (reusable workflow): use github.workflow_ref — unique per callee file'
|
|
67
|
+
code: |
|
|
68
|
+
# .github/workflows/deploy.yml (callee — reusable workflow)
|
|
69
|
+
name: Deploy
|
|
70
|
+
|
|
71
|
+
on:
|
|
72
|
+
workflow_call:
|
|
73
|
+
|
|
74
|
+
# ✅ github.workflow_ref includes the full path, NOT the caller's name
|
|
75
|
+
concurrency:
|
|
76
|
+
group: ${{ github.workflow_ref }}-${{ github.ref }}
|
|
77
|
+
cancel-in-progress: true
|
|
78
|
+
|
|
79
|
+
jobs:
|
|
80
|
+
deploy:
|
|
81
|
+
runs-on: ubuntu-latest
|
|
82
|
+
steps:
|
|
83
|
+
- uses: actions/checkout@v4
|
|
84
|
+
- run: ./deploy.sh
|
|
85
|
+
- language: yaml
|
|
86
|
+
label: 'Preferred: define concurrency only in the caller, none in the callee'
|
|
87
|
+
code: |
|
|
88
|
+
# .github/workflows/ci.yml (caller workflow)
|
|
89
|
+
name: CI
|
|
90
|
+
|
|
91
|
+
on:
|
|
92
|
+
push:
|
|
93
|
+
branches: [main]
|
|
94
|
+
|
|
95
|
+
# Concurrency defined only in the caller
|
|
96
|
+
concurrency:
|
|
97
|
+
group: ${{ github.workflow }}-${{ github.ref }}
|
|
98
|
+
cancel-in-progress: true
|
|
99
|
+
|
|
100
|
+
jobs:
|
|
101
|
+
test:
|
|
102
|
+
runs-on: ubuntu-latest
|
|
103
|
+
steps:
|
|
104
|
+
- run: npm test
|
|
105
|
+
|
|
106
|
+
deploy:
|
|
107
|
+
needs: test
|
|
108
|
+
# ✅ No concurrency block here — the reusable workflow has no concurrency group
|
|
109
|
+
uses: ./.github/workflows/deploy.yml
|
|
110
|
+
|
|
111
|
+
---
|
|
112
|
+
# .github/workflows/deploy.yml (callee — reusable workflow)
|
|
113
|
+
name: Deploy
|
|
114
|
+
|
|
115
|
+
on:
|
|
116
|
+
workflow_call:
|
|
117
|
+
|
|
118
|
+
# ✅ No concurrency block — inherit scheduling from caller
|
|
119
|
+
jobs:
|
|
120
|
+
deploy:
|
|
121
|
+
runs-on: ubuntu-latest
|
|
122
|
+
steps:
|
|
123
|
+
- uses: actions/checkout@v4
|
|
124
|
+
- run: ./deploy.sh
|
|
125
|
+
prevention:
|
|
126
|
+
- "Never use `github.workflow` in a reusable workflow's concurrency group — it resolves to the caller's name, not the callee's"
|
|
127
|
+
- "Use `github.workflow_ref` in callee workflows if a concurrency group is truly needed there"
|
|
128
|
+
- "Prefer defining concurrency only in the caller when calling reusable workflows via `uses:` — keep callee stateless"
|
|
129
|
+
- "Audit all reusable workflows for concurrency groups that use `github.workflow` and replace with `github.workflow_ref`"
|
|
130
|
+
docs:
|
|
131
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/control-the-concurrency-of-workflows-and-jobs'
|
|
132
|
+
label: 'GitHub Docs: Control the concurrency of workflows and jobs'
|
|
133
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/accessing-contextual-information-about-workflow-runs#github-context'
|
|
134
|
+
label: 'GitHub Docs: github context — github.workflow_ref field'
|
|
135
|
+
- url: 'https://github.com/actions/runner/issues/3205'
|
|
136
|
+
label: 'actions/runner#3205: github.workflow returns parent name in child workflows (closed not_planned)'
|
|
137
|
+
- url: 'https://stackoverflow.com/questions/78101326/github-actions-concurrency-deadlock'
|
|
138
|
+
label: 'SO #78101326: GitHub Actions concurrency deadlock in reusable workflows (6 upvotes, 1738 views)'
|
|
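The key collision described in concurrency-timing-061 comes down to string equality. The Python below is an illustrative sketch: `group_key` is a hypothetical helper that mirrors the `${{ github.workflow }}-${{ github.ref }}` expression, and the hardcoded `deploy-` prefix is an assumption about one way to make the callee's key differ, not something the entry prescribes.

```python
def group_key(github_workflow, github_ref):
    """Mirror the expression `${{ github.workflow }}-${{ github.ref }}`."""
    return f"{github_workflow}-{github_ref}"


# Inside a called workflow, `github.workflow` is the caller's name ("CI"),
# so both levels evaluate the expression with the same inputs.
caller_key = group_key("CI", "refs/heads/main")
callee_key = group_key("CI", "refs/heads/main")

# A hypothetical callee-specific key: a literal prefix avoids the collision.
callee_fixed_key = f"deploy-{group_key('CI', 'refs/heads/main')}"
```

Both computed keys equal `CI-refs/heads/main`, the group name that appears in the sample error message, while the prefixed key is distinct.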
@@ -0,0 +1,142 @@
|
|
|
1
|
+
id: known-unsolved-076
|
|
2
|
+
title: '`env` context unavailable in `strategy.matrix` — workflow env vars cannot be used in matrix definitions'
|
|
3
|
+
category: known-unsolved
|
|
4
|
+
severity: limitation
|
|
5
|
+
tags:
|
|
6
|
+
- matrix
|
|
7
|
+
- env-context
|
|
8
|
+
- strategy
|
|
9
|
+
- fromJSON
|
|
10
|
+
- dynamic-matrix
|
|
11
|
+
- context-availability
|
|
12
|
+
- known-limitation
|
|
13
|
+
patterns:
|
|
14
|
+
- regex: 'Error when evaluating .+strategy.+ for job'
|
|
15
|
+
flags: 'i'
|
|
16
|
+
- regex: 'strategy.*matrix.*\$\{\{.*env\.'
|
|
17
|
+
flags: 'i'
|
|
18
|
+
- regex: 'Unrecognized named-value: .+env.+ \(Line: \d+'
|
|
19
|
+
flags: 'i'
|
|
20
|
+
error_messages:
|
|
21
|
+
- "Error when evaluating 'strategy' for job 'build'. .github/workflows/deploy.yaml (Line: 10, Col: 15): Error parsing fromJson"
|
|
22
|
+
- "Error when evaluating 'strategy' for job 'test'. Input string '3.11' is not a valid number. Path '[0]'"
|
|
23
|
+
- "Error when evaluating 'strategy' for job 'matrix-job': Object reference not set to an instance of an object"
|
|
24
|
+
root_cause: |
|
|
25
|
+
The `env` context — workflow-level environment variables defined under the top-level `env:` key
|
|
26
|
+
— is NOT available during evaluation of `strategy.matrix`. GitHub Actions evaluates the
|
|
27
|
+
`strategy:` block server-side when the job is scheduled, before it is dispatched to a runner and before environment
|
|
28
|
+
variable interpolation occurs. Any expression like `${{ env.VERSIONS_JSON }}` or
|
|
29
|
+
`${{ fromJSON(env.VERSIONS_JSON) }}` inside `strategy.matrix` resolves to an empty string
|
|
30
|
+
or produces a parse error.
|
|
31
|
+
|
|
32
|
+
**GitHub's documented context availability:**
|
|
33
|
+
Per the official context availability table, `env` is explicitly NOT listed as available in
|
|
34
|
+
`jobs.<job_id>.strategy`. Available contexts there are only: `github`, `needs`, `vars`, and `inputs`.
|
|
36
|
+
|
|
37
|
+
**Common failure patterns:**
|
|
38
|
+
- Using `${{ fromJSON(env.MY_VERSIONS) }}` to share a JSON array between matrix and other steps
|
|
39
|
+
- Setting `env: DEPLOY_ENVS: '["staging","production"]'` at workflow level and referencing it
|
|
40
|
+
in the matrix include/exclude blocks
|
|
41
|
+
- Using workflow-level env vars to DRY up matrix configs shared with `run:` script logic
|
|
42
|
+
|
|
43
|
+
**Why this surprises developers:**
|
|
44
|
+
- `env` IS available inside `jobs.<job_id>.steps.*` — the limitation is specific to `strategy`
|
|
45
|
+
- The error message "Error parsing fromJson" doesn't mention that `env` context is unavailable
|
|
46
|
+
- The empty-string resolution (when no parse error) causes a silent empty matrix, not an error
|
|
47
|
+
|
|
48
|
+
Sources: actions/runner#480 (281 👍, open since 2020 — most-upvoted env context limitation);
|
|
49
|
+
SO#74072206 (10 upvotes, 19,053 views); SO#76039616 (6 upvotes, 10,482 views)
|
|
50
|
+
fix: |
|
|
51
|
+
Three workarounds exist, in order of preference:
|
|
52
|
+
|
|
53
|
+
**Option 1 — Use `vars` context (org/repo-level variables)**
|
|
54
|
+
Org-level and repo-level variables (`vars.*`) ARE available in `strategy.matrix`.
|
|
55
|
+
Move shared values to repository Settings → Secrets and variables → Variables instead of
|
|
56
|
+
workflow-level `env:`.
|
|
57
|
+
|
|
58
|
+
**Option 2 — Compute matrix in a previous job's output (most flexible)**
|
|
59
|
+
Use a dedicated setup job that outputs the matrix JSON, then reference it in dependent
|
|
60
|
+
jobs via `fromJSON(needs.setup.outputs.matrix)`. This also enables dynamic matrix
|
|
61
|
+
generation from files, APIs, or scripts.
|
|
62
|
+
|
|
63
|
+
**Option 3 — Hardcode values directly in strategy.matrix**
|
|
64
|
+
Duplicate the values in `strategy.matrix` explicitly. This violates DRY but
|
|
65
|
+
is the simplest fix when values rarely change.
|
|
66
|
+
fix_code:
|
|
67
|
+
- language: yaml
|
|
68
|
+
label: 'Option 1: Use vars context — repo/org variables ARE available in strategy'
|
|
69
|
+
code: |
|
|
70
|
+
# In repo Settings → Secrets and variables → Variables, create:
|
|
71
|
+
# Name: PYTHON_VERSIONS Value: ["3.10","3.11","3.12"]
|
|
72
|
+
|
|
73
|
+
jobs:
|
|
74
|
+
test:
|
|
75
|
+
strategy:
|
|
76
|
+
matrix:
|
|
77
|
+
# ✅ vars context IS available in strategy.matrix
|
|
78
|
+
python-version: ${{ fromJSON(vars.PYTHON_VERSIONS) }}
|
|
79
|
+
runs-on: ubuntu-latest
|
|
80
|
+
steps:
|
|
81
|
+
- uses: actions/setup-python@v5
|
|
82
|
+
with:
|
|
83
|
+
python-version: ${{ matrix.python-version }}
|
|
84
|
+
- run: python -m pytest
|
|
85
|
+
- language: yaml
|
|
86
|
+
label: 'Option 2: Compute matrix in a setup job, pass via needs outputs'
|
|
87
|
+
code: |
|
|
88
|
+
jobs:
|
|
89
|
+
setup:
|
|
90
|
+
runs-on: ubuntu-latest
|
|
91
|
+
outputs:
|
|
92
|
+
matrix: ${{ steps.set-matrix.outputs.matrix }}
|
|
93
|
+
steps:
|
|
94
|
+
- id: set-matrix
|
|
95
|
+
run: |
|
|
96
|
+
# Build matrix from any source: env var, file, API, conditional logic
|
|
97
|
+
echo 'matrix={"python-version":["3.10","3.11","3.12"]}' >> "$GITHUB_OUTPUT"
|
|
98
|
+
|
|
99
|
+
test:
|
|
100
|
+
needs: setup
|
|
101
|
+
strategy:
|
|
102
|
+
matrix:
|
|
103
|
+
# ✅ fromJSON(needs.*.outputs.*) IS available in strategy.matrix
|
|
104
|
+
${{ fromJSON(needs.setup.outputs.matrix) }}
|
|
105
|
+
runs-on: ubuntu-latest
|
|
106
|
+
steps:
|
|
107
|
+
- uses: actions/setup-python@v5
|
|
108
|
+
with:
|
|
109
|
+
python-version: ${{ matrix.python-version }}
|
|
110
|
+
- run: python -m pytest
|
|
111
|
+
- language: yaml
|
|
112
|
+
label: 'Anti-pattern: env context in strategy.matrix causes parse error'
|
|
113
|
+
code: |
|
|
114
|
+
# ❌ Workflow-level env vars are NOT available in strategy.matrix
|
|
115
|
+
|
|
116
|
+
env:
|
|
117
|
+
PYTHON_VERSIONS: '["3.10","3.11","3.12"]'
|
|
118
|
+
|
|
119
|
+
jobs:
|
|
120
|
+
test:
|
|
121
|
+
strategy:
|
|
122
|
+
matrix:
|
|
123
|
+
# This causes: "Error when evaluating 'strategy' for job 'test'"
|
|
124
|
+
python-version: ${{ fromJSON(env.PYTHON_VERSIONS) }}
|
|
125
|
+
runs-on: ubuntu-latest
|
|
126
|
+
steps:
|
|
127
|
+
- run: python --version
|
|
128
|
+
prevention:
|
|
129
|
+
- "Use `vars.*` (repo/org variables from Settings) instead of workflow `env:` for values shared between matrix and steps"
|
|
130
|
+
- "Generate dynamic matrices in a dedicated setup job and pass via `needs.*.outputs.*`"
|
|
131
|
+
- "Consult the GitHub Actions context availability table before using any context in `strategy` — not all contexts are available there"
|
|
132
|
+
- "actionlint will flag `env` context usage in strategy blocks — add it to your CI pipeline or pre-commit hooks"
|
|
133
|
+
- "Remember: `env` IS available in `steps.*.run`, `steps.*.with`, and most step-level keys; it is NOT available in job-level keys such as `strategy`, `concurrency`, `runs-on`, and `if`"
|
|
134
|
+
docs:
|
|
135
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/accessing-contextual-information-about-workflow-runs#context-availability'
|
|
136
|
+
label: 'GitHub Docs: Context availability table (env not listed for strategy)'
|
|
137
|
+
- url: 'https://github.com/actions/runner/issues/480'
|
|
138
|
+
label: 'actions/runner#480: Workflow level env does not work in all fields (281 upvotes, open since 2020)'
|
|
139
|
+
- url: 'https://stackoverflow.com/questions/74072206/github-actions-use-variables-in-matrix-definition'
|
|
140
|
+
label: 'SO #74072206: Use variables in matrix definition (10 upvotes, 19,053 views)'
|
|
141
|
+
- url: 'https://stackoverflow.com/questions/76039616/how-to-read-in-an-array-of-strings-to-matrix-in-github-actions'
|
|
142
|
+
label: 'SO #76039616: Read array of strings into matrix (6 upvotes, 10,482 views)'
|
|
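Option 2 in known-unsolved-076 relies on a setup step emitting the matrix as a single line of JSON. The Python below is a small sketch of that step's logic, assuming a hypothetical helper named `matrix_output_line`. It only builds the `name=value` line and leaves writing it to the file named by `GITHUB_OUTPUT` to the caller.

```python
import json


def matrix_output_line(versions):
    """Build the `matrix=<json>` line a setup step would append to $GITHUB_OUTPUT."""
    matrix = {"python-version": list(versions)}
    # Compact separators keep the value on one line, which the simple
    # `name=value` output form requires.
    return "matrix=" + json.dumps(matrix, separators=(",", ":"))


line = matrix_output_line(["3.10", "3.11", "3.12"])
```

In a workflow step, this line would be appended to the file whose path is in the `GITHUB_OUTPUT` environment variable, and a dependent job would read it with `fromJSON(needs.setup.outputs.matrix)`.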
@@ -0,0 +1,140 @@
|
|
|
1
|
+
id: known-unsolved-077
|
|
2
|
+
title: '`fromJSON()` array nested inside a matrix object does not expand — resolves as literal "Array" or raw JSON string'
|
|
3
|
+
category: known-unsolved
|
|
4
|
+
severity: silent-failure
|
|
5
|
+
tags:
|
|
6
|
+
- matrix
|
|
7
|
+
- fromJSON
|
|
8
|
+
- dynamic-matrix
|
|
9
|
+
- object
|
|
10
|
+
- array-expansion
|
|
11
|
+
- known-limitation
|
|
12
|
+
- silent-failure
|
|
13
|
+
patterns:
|
|
14
|
+
- regex: 'fromJSON\(needs\.\w+\.outputs\.'
|
|
15
|
+
flags: 'i'
|
|
16
|
+
- regex: 'matrix\.\w+\.\w+.*\bArray\b|\bArray\b.*matrix\.'
|
|
17
|
+
flags: 'i'
|
|
18
|
+
error_messages:
|
|
19
|
+
- "echo x86_64 Array"
|
|
20
|
+
- 'matrix.component.distro = Array'
|
|
21
|
+
- 'matrix.component.distro = ["el8","el9"]'
|
|
22
|
+
root_cause: |
|
|
23
|
+
GitHub Actions supports expanding a JSON array into matrix rows when `fromJSON()` is
|
|
24
|
+
used **at the top level** of a matrix dimension:
|
|
25
|
+
```yaml
|
|
26
|
+
strategy:
|
|
27
|
+
matrix:
|
|
28
|
+
version: ${{ fromJSON(needs.setup.outputs.versions) }} # ✅ expands to N rows
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
However, when the array is nested inside a matrix **object** (i.e., as a value within
|
|
32
|
+
one of the items of a matrix dimension), `fromJSON()` does NOT expand it into rows.
|
|
33
|
+
Instead, the expression resolves to:
|
|
34
|
+
- The literal string `"Array"` when `fromJSON()` is used (type-1 pattern)
|
|
35
|
+
- The raw JSON string (e.g., `["el8","el9"]`) when the output is used directly (type-2 pattern)
|
|
36
|
+
|
|
37
|
+
**Failing patterns:**
|
|
38
|
+
```yaml
|
|
39
|
+
# Type 1 — fromJSON inside a matrix object value
|
|
40
|
+
matrix:
|
|
41
|
+
component:
|
|
42
|
+
- name: rhel
|
|
43
|
+
distro: ${{ fromJSON(needs.setup.outputs.rpms) }} # ❌ becomes "Array"
|
|
44
|
+
|
|
45
|
+
# Type 2 — raw JSON string inside a matrix object value
|
|
46
|
+
matrix:
|
|
47
|
+
component:
|
|
48
|
+
- name: rhel
|
|
49
|
+
distro: ${{ needs.setup.outputs.rpms }} # ❌ becomes the raw JSON string
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
The expansion mechanism only applies when the entire matrix dimension value is a
|
|
53
|
+
`fromJSON()` call returning an array. Nesting breaks the expansion because the runner
|
|
54
|
+
evaluates the object first and cannot retroactively fan out a single object item into
|
|
55
|
+
multiple rows.
|
|
56
|
+
|
|
57
|
+
This is a documented limitation. GitHub's matrix engine does not support nested array
|
|
58
|
+
expansion within object-typed matrix dimensions. The issue is tracked in
|
|
59
|
+
actions/runner#3794 (open since Apr 2025, no planned fix).
|
|
60
|
+
|
|
61
|
+
**Why this is confusing:**
|
|
62
|
+
The `fromJSON()` approach works perfectly when the array IS the entire dimension:
|
|
63
|
+
`matrix: version: ${{ fromJSON(needs.setup.outputs.versions) }}`. Developers naturally
|
|
64
|
+
try to reuse the same pattern inside object dimensions and get silently wrong values —
|
|
65
|
+
the job runs to completion but with a single row containing "Array" instead of N rows.
|
|
66
|
+
fix: |
|
|
67
|
+
Restructure the matrix so that the dynamic array is the TOP-LEVEL matrix dimension,
|
|
68
|
+
not nested inside an object. Move any additional object properties into the
|
|
69
|
+
`matrix.include` block to pair them with each array element.
|
|
70
|
+
|
|
71
|
+
If you need multiple outputs to compose a matrix, pre-process them into a combined
|
|
72
|
+
JSON array in an upstream job step and pass the combined output as a single
|
|
73
|
+
`fromJSON()` expression.
|
|
74
|
+
fix_code:
|
|
75
|
+
- language: yaml
|
|
76
|
+
label: 'Move the dynamic array to the top-level dimension — this expands correctly'
|
|
77
|
+
code: |
|
|
78
|
+
jobs:
|
|
79
|
+
setup:
|
|
80
|
+
runs-on: ubuntu-latest
|
|
81
|
+
outputs:
|
|
82
|
+
rpms: ${{ steps.distros.outputs.rpms }}
|
|
83
|
+
debs: ${{ steps.distros.outputs.debs }}
|
|
84
|
+
steps:
|
|
85
|
+
- id: distros
|
|
86
|
+
run: |
|
|
87
|
+
echo 'rpms=["el8","el9"]' >> "$GITHUB_OUTPUT"
|
|
88
|
+
echo 'debs=["focal","jammy","noble"]' >> "$GITHUB_OUTPUT"
|
|
89
|
+
|
|
90
|
+
build:
|
|
91
|
+
needs: setup
|
|
92
|
+
runs-on: ubuntu-latest
|
|
93
|
+
strategy:
|
|
94
|
+
matrix:
|
|
95
|
+
# ✅ distro is the top-level dimension — fromJSON expands to N rows
|
|
96
|
+
distro: ${{ fromJSON(needs.setup.outputs.rpms) }}
|
|
97
|
+
steps:
|
|
98
|
+
- run: echo "Building for ${{ matrix.distro }}"
|
|
99
|
+
- language: yaml
|
|
100
|
+
label: 'Pre-combine multiple arrays into a single include list in the upstream job'
|
|
101
|
+
code: |
|
|
102
|
+
jobs:
|
|
103
|
+
setup:
|
|
104
|
+
runs-on: ubuntu-latest
|
|
105
|
+
outputs:
|
|
106
|
+
matrix: ${{ steps.build-matrix.outputs.matrix }}
|
|
107
|
+
steps:
|
|
108
|
+
- id: build-matrix
|
|
109
|
+
run: |
|
|
110
|
+
# Build a combined include list from multiple arrays
|
|
111
|
+
matrix=$(python3 -c "
|
|
112
|
+
import json, sys
|
|
113
|
+
rpms = ['el8', 'el9']
|
|
114
|
+
debs = ['focal', 'jammy', 'noble']
|
|
115
|
+
include = [{'family': 'rhel', 'distro': d} for d in rpms] + \
|
|
116
|
+
[{'family': 'debian', 'distro': d} for d in debs]
|
|
117
|
+
print(json.dumps({'include': include}))
|
|
118
|
+
")
|
|
119
|
+
echo "matrix=$matrix" >> "$GITHUB_OUTPUT"
|
|
120
|
+
|
|
121
|
+
build:
|
|
122
|
+
needs: setup
|
|
123
|
+
runs-on: ubuntu-latest
|
|
124
|
+
strategy:
|
|
125
|
+
# ✅ fromJSON on a single combined matrix object with include list
|
|
126
|
+
matrix: ${{ fromJSON(needs.setup.outputs.matrix) }}
|
|
127
|
+
steps:
|
|
128
|
+
- run: echo "Building ${{ matrix.family }} for ${{ matrix.distro }}"
|
|
129
|
+
prevention:
|
|
130
|
+
- 'Never use `fromJSON()` as the value of a property inside a matrix object — it will not expand to multiple rows'
|
|
131
|
+
- 'Only use `fromJSON()` when the array IS the entire dimension value (e.g., `matrix.version: ${{ fromJSON(...) }}`)'
|
|
132
|
+
- 'When combining multi-dimensional dynamic matrices, pre-compute a combined `include` list in an upstream job'
|
|
133
|
+
- 'Verify matrix expansion by adding a `- run: echo ${{ toJSON(matrix) }}` debug step; "Array" in the output confirms the bug'
|
|
134
|
+
docs:
|
|
135
|
+
- url: 'https://github.com/actions/runner/issues/3794'
|
|
136
|
+
label: 'actions/runner#3794: Array outputs not understood by matrix when nested inside object (open, Apr 2025)'
|
|
137
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/running-variations-of-jobs-in-a-workflow#using-a-matrix-strategy'
|
|
138
|
+
label: 'GitHub Docs: Using a matrix strategy'
|
|
139
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/evaluate-expressions-in-workflows-and-actions#fromjson'
|
|
140
|
+
label: 'GitHub Docs: fromJSON expression function'
|
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
id: runner-environment-246
|
|
2
|
+
title: 'Container `options:` empty string from matrix causes "The template is not valid. Unexpected value ''''" in runner >= 2.331.0'
|
|
3
|
+
category: runner-environment
|
|
4
|
+
severity: error
|
|
5
|
+
tags:
|
|
6
|
+
- container
|
|
7
|
+
- matrix
|
|
8
|
+
- container-options
|
|
9
|
+
- runner-regression
|
|
10
|
+
- template-validation
|
|
11
|
+
- runner-2331
|
|
12
|
+
patterns:
|
|
13
|
+
- regex: 'The template is not valid.*Unexpected value'
|
|
14
|
+
flags: 'i'
|
|
15
|
+
- regex: '\.github/workflows/\S+ \(Line: \d+, Col: \d+\): Unexpected value'
|
|
16
|
+
flags: 'i'
|
|
17
|
+
error_messages:
|
|
18
|
+
- "Error: The template is not valid. .github/workflows/deploy.yml (Line: 42, Col: 15): Unexpected value ''"
|
|
19
|
+
- "Error: The template is not valid. .github/workflows/ci.yml (Line: 18, Col: 9): Unexpected value ''"
|
|
20
|
+
root_cause: |
|
|
21
|
+
In runner version 2.331.0, a regression was introduced in template validation for
|
|
22
|
+
container job `options:` fields. When a workflow uses a matrix to conditionally set
|
|
23
|
+
container options — so that some matrix combinations have no container options
|
|
24
|
+
(`options: ''` or `options: ${{ matrix.container_options }}` where the value is
|
|
25
|
+
unset/empty) — the runner's template evaluator now rejects the empty string with
|
|
26
|
+
"Unexpected value ''" and marks the job as failed before it even starts.
|
|
27
|
+
|
|
28
|
+
Prior to runner 2.331.0, an empty `options:` string was silently accepted and the
|
|
29
|
+
container job ran without extra Docker options. The 2.331.0 release tightened template
|
|
30
|
+
validation for the `container.options` field, rejecting any expression that evaluates
|
|
31
|
+
to an empty string at runtime.
|
|
32
|
+
|
|
33
|
+
**Common trigger pattern:**
|
|
34
|
+
```yaml
|
|
35
|
+
strategy:
|
|
36
|
+
matrix:
|
|
37
|
+
include:
|
|
38
|
+
- os: ubuntu
|
|
39
|
+
container_options: "--memory=2g"
|
|
40
|
+
- os: alpine
|
|
41
|
+
# No container_options key — evaluates to empty string
|
|
42
|
+
jobs:
|
|
43
|
+
build:
|
|
44
|
+
container:
|
|
45
|
+
image: myapp:latest
|
|
46
|
+
options: ${{ matrix.container_options }} # empty for alpine row
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
This pattern worked on runner 2.330.0 but fails on 2.331.0+ with "Unexpected value ''".
|
|
50
|
+
The issue is tracked in actions/runner#4204 (open, labeled bug, Jan 2026).
|
|
51
|
+
fix: |
|
|
52
|
+
Add a fallback non-empty value to the options expression using the `||` operator.
|
|
53
|
+
A single space `' '` is enough to satisfy the validator while having no effect on
|
|
54
|
+
the container invocation, since a whitespace-only option string adds no Docker arguments.
|
|
55
|
+
|
|
56
|
+
Replace:
|
|
57
|
+
options: ${{ matrix.container_options }}
|
|
58
|
+
|
|
59
|
+
With:
|
|
60
|
+
options: ${{ matrix.container_options || ' ' }}
|
|
61
|
+
|
|
62
|
+
Alternatively, restructure the matrix to always provide a defined (non-empty)
|
|
63
|
+
`container_options` value, using a neutral default like `'--label=placeholder'`
|
|
64
|
+
for rows that don't need real options. This is more explicit and survives future
|
|
65
|
+
validator changes.
|
|
66
|
+
fix_code:
|
|
67
|
+
- language: yaml
|
|
68
|
+
label: 'Fallback to single space so template validator accepts an empty matrix value'
|
|
69
|
+
code: |
|
|
70
|
+
strategy:
|
|
71
|
+
matrix:
|
|
72
|
+
include:
|
|
73
|
+
- name: with-limits
|
|
74
|
+
container_options: "--memory=2g --cpus=1"
|
|
75
|
+
- name: no-limits
|
|
76
|
+
# container_options intentionally absent
|
|
77
|
+
|
|
78
|
+
jobs:
|
|
79
|
+
build:
|
|
80
|
+
runs-on: ubuntu-latest
|
|
81
|
+
container:
|
|
82
|
+
image: myapp:latest
|
|
83
|
+
# ✅ Use || ' ' to avoid empty-string rejection in runner >= 2.331.0
|
|
84
|
+
options: ${{ matrix.container_options || ' ' }}
|
|
85
|
+
steps:
|
|
86
|
+
- uses: actions/checkout@v4
|
|
87
|
+
- run: make build
|
|
88
|
+
- language: yaml
|
|
89
|
+
label: 'Explicit neutral default in the matrix row avoids the ambiguity entirely'
|
|
90
|
+
code: |
|
|
91
|
+
strategy:
|
|
92
|
+
matrix:
|
|
93
|
+
include:
|
|
94
|
+
- name: with-limits
|
|
95
|
+
container_options: "--memory=2g --cpus=1"
|
|
96
|
+
- name: no-limits
|
|
97
|
+
# ✅ Always set a value: a dummy label is applied to the container but changes nothing else
|
|
98
|
+
container_options: "--label=no-extra-opts"
|
|
99
|
+
|
|
100
|
+
jobs:
|
|
101
|
+
build:
|
|
102
|
+
runs-on: ubuntu-latest
|
|
103
|
+
container:
|
|
104
|
+
image: myapp:latest
|
|
105
|
+
options: ${{ matrix.container_options }}
|
|
106
|
+
steps:
|
|
107
|
+
- uses: actions/checkout@v4
|
|
108
|
+
- run: make build
|
|
109
|
+
prevention:
|
|
110
|
+
- 'Never let `container.options` evaluate to an empty string — always provide a neutral fallback with `|| '' ''` or a dummy label'
|
|
111
|
+
- 'Test matrix workflows with every combination, including rows that skip optional matrix keys like `container_options`'
|
|
112
|
+
- 'Pin runner version in self-hosted setups to detect regressions before rolling out to all jobs'
|
|
113
|
+
- 'Track actions/runner#4204 for a permanent fix; apply the `|| '' ''` workaround in the interim'
|
|
114
|
+
docs:
|
|
115
|
+
- url: 'https://github.com/actions/runner/issues/4204'
|
|
116
|
+
label: 'actions/runner#4204: "The template is not valid" when container.options is not set in matrix (open, regression in 2.331.0)'
|
|
117
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/running-jobs-in-a-container'
|
|
118
|
+
label: 'GitHub Docs: Running jobs in a container — options field'
|
|
119
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#jobsjob_idcontaineroptions'
|
|
120
|
+
label: 'GitHub Docs: jobs.<job_id>.container.options'
|
|
@@ -0,0 +1,140 @@
|
|
|
1
|
+
id: silent-failures-121
|
|
2
|
+
title: '`workflow_run: types: [completed]` triggers on ALL conclusions — deploy runs even when CI failed or was cancelled'
|
|
3
|
+
category: silent-failures
|
|
4
|
+
severity: silent-failure
|
|
5
|
+
tags:
|
|
6
|
+
- workflow-run
|
|
7
|
+
- conclusion
|
|
8
|
+
- silent-failure
|
|
9
|
+
- deploy-on-failure
|
|
10
|
+
- ci-cd
|
|
11
|
+
- workflow-trigger
|
|
12
|
+
- conclusion-check
|
|
13
|
+
patterns:
|
|
14
|
+
- regex: 'on:\s*\n\s+workflow_run:'
|
|
15
|
+
flags: 'im'
|
|
16
|
+
- regex: 'types:\s*[\[\s]*completed[\]\s]*\n'
|
|
17
|
+
flags: 'i'
|
|
18
|
+
error_messages:
|
|
19
|
+
- "# No error message — the deploy workflow runs silently even when upstream CI failed or was cancelled"
|
|
20
|
+
root_cause: |
|
|
21
|
+
The `workflow_run` event with `types: [completed]` fires whenever the referenced workflow
|
|
22
|
+
finishes — regardless of its conclusion. A completed workflow can have any of these conclusions:
|
|
23
|
+
`success`, `failure`, `cancelled`, `skipped`, `timed_out`, `action_required`, `neutral`,
|
|
24
|
+
or `stale`. GitHub fires `workflow_run: completed` for ALL of these states.
|
|
25
|
+
|
|
26
|
+
Developers commonly use `workflow_run` to chain a deploy or release workflow after a CI
|
|
27
|
+
workflow passes. The intuitive reading of "completed" is "finished successfully," but GitHub's
|
|
28
|
+
semantics are "finished with any outcome":
|
|
29
|
+
|
|
30
|
+
```yaml
|
|
31
|
+
# ❌ This runs deploy even when CI failed, was cancelled, or was skipped
|
|
32
|
+
on:
|
|
33
|
+
workflow_run:
|
|
34
|
+
workflows: [CI]
|
|
35
|
+
types: [completed]
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
Without an explicit `if:` conclusion check, the deploy workflow runs unconditionally after CI —
|
|
39
|
+
including when CI failed mid-run, was cancelled by a force-push, or was skipped by a path
|
|
40
|
+
filter. This silently deploys broken code (or publishes, releases, or notifies) whenever CI
|
|
41
|
+
does not succeed.
|
|
42
|
+
|
|
43
|
+
**Why this is a silent failure:**
|
|
44
|
+
- No error appears — the triggered workflow runs normally and reports "success"
|
|
45
|
+
- The CI failure and deploy completion appear as separate workflow runs — easy to miss
|
|
46
|
+
- GitHub does not add a default conclusion filter — the developer must add it manually
|
|
47
|
+
- The mistake is natural: "trigger when CI is done" sounds equivalent to "trigger when CI passes"
|
|
48
|
+
|
|
49
|
+
Sources: GitHub Docs (workflow_run event, conclusion field); SO#62750603 (159 upvotes,
|
|
50
|
+
152,034 views — top answer recommends adding conclusion check); `ahmadnassri/action-workflow-run-wait`
|
|
51
|
+
(39 stars — third-party action created specifically to work around this limitation)
|
|
52
|
+
fix: |
|
|
53
|
+
Add an `if:` condition at the JOB level (not workflow level) to check
|
|
54
|
+
`github.event.workflow_run.conclusion == 'success'`. Job-level conditions are evaluated at
|
|
55
|
+
runtime after the event fires, so this reliably gates deployment on CI success.
|
|
56
|
+
|
|
57
|
+
GitHub Actions has no workflow-level `if:` key, so the check must be set on each job that
|
|
58
|
+
should be gated. Leave cleanup or notification jobs that must run regardless of conclusion ungated.
|
|
59
|
+
|
|
60
|
+
For workflows that need to handle both success and failure (e.g., notify Slack of failures),
|
|
61
|
+
use separate jobs with different `if:` checks.
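
As an illustration of mixing gated and ungated jobs in one downstream workflow, the sketch below gates `deploy` on success and leaves a `report` job without a conclusion check so it runs for every completed CI run. The job names and `./deploy.sh` are placeholders, not part of any real project:

```yaml
name: Post-CI

on:
  workflow_run:
    workflows: [CI]
    types: [completed]

jobs:
  deploy:
    # Gated: only runs when the upstream CI run succeeded
    if: github.event.workflow_run.conclusion == 'success'
    runs-on: ubuntu-latest
    steps:
      - run: ./deploy.sh

  report:
    # Ungated: runs for success, failure, cancelled, and every other conclusion
    runs-on: ubuntu-latest
    steps:
      - run: echo "CI run ${{ github.event.workflow_run.id }} concluded with ${{ github.event.workflow_run.conclusion }}"
```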
|
|
62
|
+
fix_code:
|
|
63
|
+
- language: yaml
|
|
64
|
+
label: 'Add conclusion check at the job level — most common fix'
|
|
65
|
+
code: |
|
|
66
|
+
name: Deploy
|
|
67
|
+
|
|
68
|
+
on:
|
|
69
|
+
workflow_run:
|
|
70
|
+
workflows: [CI]
|
|
71
|
+
types: [completed]
|
|
72
|
+
|
|
73
|
+
jobs:
|
|
74
|
+
deploy:
|
|
75
|
+
# ✅ Only deploy when CI succeeded
|
|
76
|
+
if: github.event.workflow_run.conclusion == 'success'
|
|
77
|
+
runs-on: ubuntu-latest
|
|
78
|
+
steps:
|
|
79
|
+
- uses: actions/checkout@v4
|
|
80
|
+
with:
|
|
81
|
+
# In a workflow_run job, checkout defaults to the default branch; pin to the commit CI ran on
|
|
82
|
+
ref: ${{ github.event.workflow_run.head_sha }}
|
|
83
|
+
- run: ./deploy.sh
|
|
84
|
+
- language: yaml
|
|
85
|
+
label: 'Branch by conclusion — deploy on success, notify on failure'
|
|
86
|
+
code: |
|
|
87
|
+
name: Post-CI Actions
|
|
88
|
+
|
|
89
|
+
on:
|
|
90
|
+
workflow_run:
|
|
91
|
+
workflows: [CI]
|
|
92
|
+
types: [completed]
|
|
93
|
+
|
|
94
|
+
jobs:
|
|
95
|
+
deploy:
|
|
96
|
+
if: github.event.workflow_run.conclusion == 'success'
|
|
97
|
+
runs-on: ubuntu-latest
|
|
98
|
+
steps:
|
|
99
|
+
- run: ./deploy.sh
|
|
100
|
+
|
|
101
|
+
notify-failure:
|
|
102
|
+
if: >-
|
|
103
|
+
github.event.workflow_run.conclusion == 'failure' ||
|
|
104
|
+
github.event.workflow_run.conclusion == 'timed_out'
|
|
105
|
+
runs-on: ubuntu-latest
|
|
106
|
+
steps:
|
|
107
|
+
- name: Notify team of CI failure
|
|
108
|
+
run: |
|
|
109
|
+
echo "CI failed on branch ${{ github.event.workflow_run.head_branch }}"
|
|
110
|
+
# Send Slack/Teams notification here
|
|
111
|
+
- language: yaml
|
|
112
|
+
label: 'Anti-pattern: missing conclusion check silently deploys broken code'
|
|
113
|
+
code: |
|
|
114
|
+
# ❌ BAD: Runs even when CI failed, was cancelled, or was skipped by path filter
|
|
115
|
+
on:
|
|
116
|
+
workflow_run:
|
|
117
|
+
workflows: [CI]
|
|
118
|
+
types: [completed] # "completed" means ANY outcome, not just success
|
|
119
|
+
|
|
120
|
+
jobs:
|
|
121
|
+
deploy:
|
|
122
|
+
# No if: conclusion check — deploys broken code on CI failure!
|
|
123
|
+
runs-on: ubuntu-latest
|
|
124
|
+
steps:
|
|
125
|
+
- run: ./deploy.sh
|
|
126
|
+
prevention:
|
|
127
|
+
- "Always add `if: github.event.workflow_run.conclusion == 'success'` to deploy/release jobs triggered by workflow_run"
|
|
128
|
+
- "Remember: 'completed' means the upstream workflow finished — NOT that it succeeded; always check conclusion explicitly"
|
|
129
|
+
- "Use separate jobs with different `if: conclusion ==` checks for success vs failure vs cancelled handling"
|
|
130
|
+
- "Consider adding a linting step or actionlint check that flags workflow_run jobs without a conclusion check"
|
|
131
|
+
- "Test the failure path by intentionally failing CI and verifying the downstream workflow does NOT deploy"
|
|
132
|
+
docs:
|
|
133
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#workflow_run'
|
|
134
|
+
label: 'GitHub Docs: workflow_run event — conclusion field and types'
|
|
135
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/accessing-contextual-information-about-workflow-runs#github-context'
|
|
136
|
+
label: 'GitHub Docs: github.event.workflow_run context properties'
|
|
137
|
+
- url: 'https://stackoverflow.com/questions/62750603/github-actions-trigger-another-action-after-one-action-is-completed'
|
|
138
|
+
label: 'SO #62750603: Trigger action after completion — top answer adds conclusion check (159 upvotes, 152k views)'
|
|
139
|
+
- url: 'https://github.com/ahmadnassri/action-workflow-run-wait'
|
|
140
|
+
label: 'ahmadnassri/action-workflow-run-wait — third-party action created to wait for successful workflow_run'
|
|
@@ -0,0 +1,142 @@
|
|
|
1
|
+
id: triggers-074
|
|
2
|
+
title: '`workflow_run: branches:` filter silently never triggers when upstream was started by `schedule` or `workflow_dispatch` — `head_branch` is null'
|
|
3
|
+
category: triggers
|
|
4
|
+
severity: silent-failure
|
|
5
|
+
tags:
|
|
6
|
+
- workflow-run
|
|
7
|
+
- branches-filter
|
|
8
|
+
- head-branch
|
|
9
|
+
- schedule
|
|
10
|
+
- workflow-dispatch
|
|
11
|
+
- 'null'
|
|
12
|
+
- silent-failure
|
|
13
|
+
- trigger-missing
|
|
14
|
+
patterns:
|
|
15
|
+
- regex: 'on:\s*\n\s+workflow_run:\s*\n(.|\n)*?branches:\s*\n'
|
|
16
|
+
flags: 'im'
|
|
17
|
+
- regex: 'workflow_run.*branches.*schedule\s*$|schedule.*workflow_run.*branches'
|
|
18
|
+
flags: 'im'
|
|
19
|
+
error_messages:
|
|
20
|
+
- "# No error message — the downstream workflow simply never runs when upstream was triggered by schedule or workflow_dispatch"
|
|
21
|
+
root_cause: |
|
|
22
|
+
The `workflow_run` trigger accepts a `branches:` filter that limits which upstream workflow
|
|
23
|
+
runs trigger the downstream workflow, based on the branch the upstream run is associated with.
|
|
24
|
+
The filter is compared against `github.event.workflow_run.head_branch`.
|
|
25
|
+
|
|
26
|
+
**The problem:** When the upstream workflow is triggered by a `schedule` or `workflow_dispatch`
|
|
27
|
+
event, GitHub sets `head_branch` to `null` (not a branch name). The `branches:` filter then
|
|
28
|
+
evaluates `null` against each pattern in the list — and null never matches any branch name
|
|
29
|
+
pattern, including `[main]`, `['**']`, or any glob. The downstream workflow silently never
|
|
30
|
+
triggers, even though the upstream ran successfully on main.
|
|
31
|
+
|
|
32
|
+
**Event types that produce null head_branch:**
|
|
33
|
+
- `schedule` — cron-triggered workflows have no associated branch context
|
|
34
|
+
- `workflow_dispatch` — manually dispatched workflows MAY have null head_branch in some cases
|
|
35
|
+
|
|
36
|
+
**Pull request triggers correctly populate head_branch** (the PR head branch name), and
|
|
37
|
+
`push` triggers also correctly populate it (the pushed branch name). The null behavior is
|
|
38
|
+
specific to schedule and dispatch events on the upstream.
|
|
39
|
+
|
|
40
|
+
**Why this is a silent failure:**
|
|
41
|
+
- No error is raised — the downstream workflow simply does not appear in the Actions UI for that run
|
|
42
|
+
- The upstream workflow runs and completes successfully
|
|
43
|
+
- The `branches:` filter looks correct — `[main]` is the right branch name
|
|
44
|
+
- The issue only manifests for schedule/dispatch-triggered upstream runs; if CI also triggers
|
|
45
|
+
on push to main, push-triggered runs DO reach the downstream workflow
|
|
46
|
+
|
|
47
|
+
**Example scenario:** A deploy workflow runs after CI. CI has `on: [push, schedule]`.
|
|
48
|
+
The `branches: [main]` filter in the deploy's workflow_run block works for push events but
|
|
49
|
+
silently skips all scheduled CI runs.
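
The scenario can be sketched as two workflow files. The file names and the cron expression are illustrative assumptions; only the trigger blocks matter here:

```yaml
# .github/workflows/ci.yml (hypothetical upstream)
name: CI
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 3 * * *'   # nightly run, assumed for illustration
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test
---
# .github/workflows/deploy.yml (hypothetical downstream)
name: Deploy
on:
  workflow_run:
    workflows: [CI]
    types: [completed]
    branches: [main]   # matches push-triggered CI runs; per this entry, not the nightly ones
jobs:
  deploy:
    if: github.event.workflow_run.conclusion == 'success'
    runs-on: ubuntu-latest
    steps:
      - run: ./deploy.sh
```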
|
|
50
|
+
|
|
51
|
+
Sources: GitHub Docs (workflow_run, head_branch null for schedule); community/discussions
|
|
52
|
+
related to workflow_run not triggering after schedule; documented in
|
|
53
|
+
`workflow-run-head-branch-null-schedule-dispatch-concurrency.yml` (concurrency variant)
|
|
54
|
+
fix: |
|
|
55
|
+
Remove the `branches:` filter from the `workflow_run` trigger entirely and instead use a
|
|
56
|
+
job-level `if:` condition to check the branch after the event fires. The `if:` condition
|
|
57
|
+
evaluates at runtime and can handle null gracefully.
|
|
58
|
+
|
|
59
|
+
For schedule-triggered upstreams, use `github.event.workflow_run.head_branch == 'main' ||
|
|
60
|
+
github.event.workflow_run.head_branch == null` (null means it ran from the default branch).
|
|
61
|
+
|
|
62
|
+
Alternatively, ignore `head_branch` and check in a job step whether `github.event.workflow_run.head_sha` is reachable from the branch you deploy from.
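
One possible shape for the `head_sha` check, offered as a sketch rather than a confirmed recipe: check out `main` with full history and use `git merge-base --is-ancestor` to test whether the upstream commit is contained in it. The step id `on-main` and the `on_main` output name are hypothetical:

```yaml
jobs:
  deploy:
    if: github.event.workflow_run.conclusion == 'success'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: main
          fetch-depth: 0   # full history is needed for the ancestry check
      - name: Check whether the upstream commit is on main
        id: on-main
        env:
          HEAD_SHA: ${{ github.event.workflow_run.head_sha }}
        run: |
          # Exit status 0 means HEAD_SHA is an ancestor of (or equal to) the tip of main
          if git merge-base --is-ancestor "$HEAD_SHA" HEAD; then
            echo "on_main=true" >> "$GITHUB_OUTPUT"
          else
            echo "on_main=false" >> "$GITHUB_OUTPUT"
          fi
      - if: steps.on-main.outputs.on_main == 'true'
        run: ./deploy.sh
```

This does not depend on `head_branch` at all, at the cost of a full-history checkout. A commit that is unknown to the repository (for example from a fork) makes the `git` command fail, which the `else` branch treats as "not on main".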
|
|
63
|
+
fix_code:
|
|
64
|
+
- language: yaml
|
|
65
|
+
label: 'Remove branches filter and add job-level if condition — handles schedule and dispatch'
|
|
66
|
+
code: |
|
|
67
|
+
name: Deploy After CI
|
|
68
|
+
|
|
69
|
+
on:
|
|
70
|
+
workflow_run:
|
|
71
|
+
workflows: [CI]
|
|
72
|
+
types: [completed]
|
|
73
|
+
# ❌ Do NOT use branches: here if CI runs on schedule or workflow_dispatch
|
|
74
|
+
# branches: [main] # This silently skips all schedule-triggered CI runs
|
|
75
|
+
|
|
76
|
+
jobs:
|
|
77
|
+
deploy:
|
|
78
|
+
# ✅ Use job-level if: instead of trigger-level branches:
|
|
79
|
+
# null head_branch means schedule/dispatch from default branch — include it
|
|
80
|
+
if: >-
|
|
81
|
+
github.event.workflow_run.conclusion == 'success' &&
|
|
82
|
+
(github.event.workflow_run.head_branch == 'main' ||
|
|
83
|
+
github.event.workflow_run.head_branch == null)
|
|
84
|
+
runs-on: ubuntu-latest
|
|
85
|
+
steps:
|
|
86
|
+
- uses: actions/checkout@v4
|
|
87
|
+
- run: ./deploy.sh
|
|
88
|
+
- language: yaml
|
|
89
|
+
label: 'When upstream is push-only: branches filter works fine (head_branch is always set)'
|
|
90
|
+
code: |
|
|
91
|
+
name: Deploy After CI (push-only upstream)
|
|
92
|
+
|
|
93
|
+
on:
|
|
94
|
+
workflow_run:
|
|
95
|
+
workflows: [CI]
|
|
96
|
+
types: [completed]
|
|
97
|
+
# ✅ branches: filter works correctly ONLY when upstream runs on push or pull_request
|
|
98
|
+
# Because push and pull_request events always set head_branch
|
|
99
|
+
branches: [main]
|
|
100
|
+
|
|
101
|
+
jobs:
|
|
102
|
+
deploy:
|
|
103
|
+
if: github.event.workflow_run.conclusion == 'success'
|
|
104
|
+
runs-on: ubuntu-latest
|
|
105
|
+
steps:
|
|
106
|
+
- uses: actions/checkout@v4
|
|
107
|
+
- run: ./deploy.sh
|
|
108
|
+
|
|
109
|
+
# ⚠️ If CI also has on: schedule, remove the branches: filter and move the branch check into the job-level if:
|
|
110
|
+
- language: yaml
|
|
111
|
+
label: 'Anti-pattern: branches filter with schedule-triggered upstream — downstream never fires'
|
|
112
|
+
code: |
|
|
113
|
+
# ❌ BAD: Upstream CI has on: [push, schedule]
|
|
114
|
+
# But deploy's branches filter silently skips all scheduled CI runs
|
|
115
|
+
|
|
116
|
+
name: Deploy After CI
|
|
117
|
+
|
|
118
|
+
on:
|
|
119
|
+
workflow_run:
|
|
120
|
+
workflows: [CI]
|
|
121
|
+
types: [completed]
|
|
122
|
+
branches: [main] # null (from schedule) never matches 'main' — deploy skipped!
|
|
123
|
+
|
|
124
|
+
jobs:
|
|
125
|
+
deploy:
|
|
126
|
+
if: github.event.workflow_run.conclusion == 'success'
|
|
127
|
+
runs-on: ubuntu-latest
|
|
128
|
+
steps:
|
|
129
|
+
- run: ./deploy.sh # Never runs after scheduled CI
|
|
130
|
+
prevention:
|
|
131
|
+
- "Avoid `branches:` filter in workflow_run if the upstream workflow can be triggered by `schedule` or `workflow_dispatch`"
|
|
132
|
+
- "Use job-level `if:` conditions with explicit null handling instead of trigger-level branch filters"
|
|
133
|
+
- "Check the upstream workflow's triggers — if it includes `schedule` or `workflow_dispatch`, `head_branch` will be null for those runs"
|
|
134
|
+
- "Test by triggering the upstream workflow manually and verifying the downstream workflow appears in the Actions UI"
|
|
135
|
+
- "When null head_branch is acceptable (schedule from default branch), use `head_branch == 'main' || head_branch == null`"
|
|
136
|
+
docs:
|
|
137
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#workflow_run'
|
|
138
|
+
label: 'GitHub Docs: workflow_run event — branches filter and head_branch field'
|
|
139
|
+
- url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/accessing-contextual-information-about-workflow-runs#github-context'
|
|
140
|
+
label: 'GitHub Docs: github.event.workflow_run — head_branch property (null for schedule/dispatch)'
|
|
141
|
+
- url: 'https://github.com/orgs/community/discussions/26238'
|
|
142
|
+
label: 'GitHub Community: workflow_run not triggering — head_branch null discussion'
|
package/package.json
CHANGED