@htekdev/actions-debugger 1.0.113 → 1.0.115

Files changed (34)
  1. package/errors/caching-artifacts/cache-corrupt-on-cancel-during-restore-save-always.yml +136 -0
  2. package/errors/caching-artifacts/restore-keys-asterisk-literal-not-glob.yml +107 -0
  3. package/errors/concurrency-timing/concurrency-timing-053.yml +83 -0
  4. package/errors/concurrency-timing/pull-request-review-shared-concurrency-cancels-ci.yml +131 -0
  5. package/errors/known-unsolved/github-script-esm-not-supported.yml +111 -0
  6. package/errors/known-unsolved/job-outputs-string-only-no-array-object.yml +142 -0
  7. package/errors/known-unsolved/known-unsolved-062.yml +87 -0
  8. package/errors/known-unsolved/runner-rest-api-busy-false-broker-state-desync.yml +102 -0
  9. package/errors/permissions-auth/oidc-immutable-sub-claim-new-repo-trust-policy-mismatch.yml +122 -0
  10. package/errors/permissions-auth/permissions-auth-064.yml +122 -0
  11. package/errors/permissions-auth/permissions-auth-065.yml +97 -0
  12. package/errors/permissions-auth/permissions-auth-066.yml +129 -0
  13. package/errors/permissions-auth/upload-code-coverage-missing-code-quality-write-permission.yml +94 -0
  14. package/errors/runner-environment/arc-kubernetes-checkout-circular-json-container-hook.yml +101 -0
  15. package/errors/runner-environment/cache-restore-windows-runner-silent-crash.yml +130 -0
  16. package/errors/runner-environment/git-248-fetch-tags-shallow-clone-regression.yml +100 -0
  17. package/errors/runner-environment/javascript-actions-alpine-arm64-not-supported.yml +121 -0
  18. package/errors/runner-environment/runner-environment-188.yml +96 -0
  19. package/errors/runner-environment/runner-environment-191.yml +147 -0
  20. package/errors/runner-environment/runner-environment-192.yml +144 -0
  21. package/errors/runner-environment/runner-environment-193.yml +136 -0
  22. package/errors/runner-environment/runner-environment-194.yml +86 -0
  23. package/errors/runner-environment/runner-environment-199.yml +93 -0
  24. package/errors/runner-environment/setup-python-macos-self-hosted-symlink-permission-denied.yml +94 -0
  25. package/errors/runner-environment/setup-python-windows-self-hosted-no-admin-install-fails.yml +101 -0
  26. package/errors/silent-failures/checkout-v6-clean-false-deletes-workspace-on-repo-change.yml +119 -0
  27. package/errors/silent-failures/queue-max-silently-ignored-with-cancel-in-progress.yml +109 -0
  28. package/errors/silent-failures/silent-failures-102.yml +141 -0
  29. package/errors/silent-failures/silent-failures-104.yml +119 -0
  30. package/errors/triggers/triggers-069.yml +100 -0
  31. package/errors/yaml-syntax/continue-on-error-inputs-composite-action-unexpected-value.yml +110 -0
  32. package/errors/yaml-syntax/yaml-syntax-068.yml +137 -0
  33. package/errors/yaml-syntax/yaml-syntax-069.yml +118 -0
  34. package/package.json +1 -1
@@ -0,0 +1,142 @@
1
+ id: known-unsolved-060
2
+ title: 'Job outputs are strings only — arrays, objects, and booleans must be manually JSON-serialized'
3
+ category: known-unsolved
4
+ severity: limitation
5
+ tags:
6
+ - job-outputs
7
+ - outputs
8
+ - string
9
+ - fromJSON
10
+ - array
11
+ - known-limitation
12
+ - GITHUB_OUTPUT
13
+ patterns:
14
+ - regex: 'GITHUB_OUTPUT'
15
+ flags: 'i'
16
+ - regex: 'needs\.[a-z_-]+\.outputs\.'
17
+ flags: 'i'
18
+ - regex: 'fromJSON\s*\('
19
+ flags: 'i'
20
+ error_messages:
21
+ - '(no runtime error — arrays or objects written to GITHUB_OUTPUT are silently converted to strings like "Array" or "[object Object]" if not JSON-serialized before writing)'
22
+ root_cause: |
23
+ GitHub Actions job outputs use the GITHUB_OUTPUT file protocol which only supports
24
+ string key-value pairs. There is no native type system for job outputs — every
25
+ value passed through GITHUB_OUTPUT is stored and transmitted as a plain string,
26
+ regardless of the producing language or shell.
27
+
28
+ This becomes a problem in several common scenarios:
29
+
30
+ 1. Passing arrays: A bash array or newline-separated list written to GITHUB_OUTPUT
31
+ arrives as a plain string (for example space-separated for "${arr[*]}"), not a JSON array.
32
+ A downstream fromJSON() in strategy.matrix then fails or yields the wrong matrix.
33
+
34
+ 2. Passing objects: With the plain key=value form, a JSON object must be written as a
35
+ compact single-line string. Multi-line JSON written this way breaks the line format and
36
+ corrupts the output, causing a parse error or silently reading only the first
37
+ line as the value. Multi-line values need the name<<DELIMITER heredoc form instead.
38
+
39
+ 3. Boolean comparisons: Outputs written as "true" or "false" are strings, not
40
+ booleans. Comparing ${{ needs.x.outputs.flag == true }} (boolean literal)
41
+ silently evaluates to false; only ${{ needs.x.outputs.flag == 'true' }} (string)
42
+ or ${{ fromJSON(needs.x.outputs.flag) }} works correctly.
43
+
44
+ This is a known platform limitation. The GitHub Actions team has acknowledged the
45
+ string-only constraint in multiple community discussions. There is no native typed
46
+ output support on the public roadmap as of mid-2026, and no timeline has been given
47
+ for adding array or object output types.
48
+ fix: |
49
+ Serialize arrays and objects to compact single-line JSON before writing to
50
+ GITHUB_OUTPUT, then deserialize with fromJSON() in consuming jobs.
51
+
52
+ Key rule: use jq -c to produce compact (one-line) JSON — multi-line JSON
53
+ embedded in GITHUB_OUTPUT silently truncates at the first newline.
54
+
55
+ For booleans: write the literal string "true"/"false" and compare with == 'true'
56
+ in if: conditions, or wrap with fromJSON() to get a native boolean for contexts
57
+ that require one (e.g., matrix include).
58
+ fix_code:
59
+ - language: yaml
60
+ label: 'Write and consume a JSON array as a job output'
61
+ code: |
62
+ jobs:
63
+ generate-matrix:
64
+ runs-on: ubuntu-latest
65
+ outputs:
66
+ targets: ${{ steps.set-matrix.outputs.targets }}
67
+ steps:
68
+ - id: set-matrix
69
+ run: |
70
+ # Use jq -c for compact single-line JSON — multi-line breaks GITHUB_OUTPUT
71
+ TARGETS=$(echo '[{"env":"staging"},{"env":"prod"}]' | jq -c .)
72
+ echo "targets=$TARGETS" >> "$GITHUB_OUTPUT"
73
+
74
+ deploy:
75
+ needs: generate-matrix
76
+ strategy:
77
+ matrix:
78
+ target: ${{ fromJSON(needs.generate-matrix.outputs.targets) }}
79
+ runs-on: ubuntu-latest
80
+ steps:
81
+ - run: echo "Deploying to ${{ matrix.target.env }}"
82
+ - language: yaml
83
+ label: 'Boolean output — write as string, compare correctly downstream'
84
+ code: |
85
+ jobs:
86
+ check-changes:
87
+ runs-on: ubuntu-latest
88
+ outputs:
89
+ has_changes: ${{ steps.diff.outputs.has_changes }}
90
+ steps:
91
+ - id: diff
92
+ run: |
93
+ if git diff --quiet HEAD~1; then
94
+ echo "has_changes=false" >> "$GITHUB_OUTPUT"
95
+ else
96
+ echo "has_changes=true" >> "$GITHUB_OUTPUT"
97
+ fi
98
+
99
+ build:
100
+ needs: check-changes
101
+ # Compare as string literal — NOT: == true (boolean comparison silently fails)
102
+ if: needs.check-changes.outputs.has_changes == 'true'
103
+ runs-on: ubuntu-latest
104
+ steps:
105
+ - run: make build
106
+ - language: yaml
107
+ label: 'Multi-line JSON — always compact with jq -c before writing'
108
+ code: |
109
+ jobs:
110
+ gather-config:
111
+ runs-on: ubuntu-latest
112
+ outputs:
113
+ config: ${{ steps.read-config.outputs.config }}
114
+ steps:
115
+ - uses: actions/checkout@v4
116
+ - id: read-config
117
+ run: |
118
+ # jq -c converts any JSON (even pretty-printed) to a single compact line
119
+ CONFIG=$(jq -c . deploy-config.json)
120
+ echo "config=$CONFIG" >> "$GITHUB_OUTPUT"
121
+
122
+ deploy:
123
+ needs: gather-config
124
+ runs-on: ubuntu-latest
125
+ steps:
126
+ - env: { CONFIG: '${{ needs.gather-config.outputs.config }}' }
127
+ run: |
128
+ echo "Region: $(echo "$CONFIG" | jq -r .region)"
129
+ prevention:
130
+ - 'Always pipe JSON values through jq -c before writing to GITHUB_OUTPUT — compact single-line format prevents silent value truncation at embedded newlines'
131
+ - 'Compare boolean-string outputs with == ''true'' (string) not == true (boolean) in if: conditions; or use fromJSON() to convert to a native boolean'
132
+ - 'Never write a raw bash array to GITHUB_OUTPUT — it expands to a space-separated string; always serialize through jq -c first'
133
+ - 'Add a debug step that prints the raw output value using cat $GITHUB_OUTPUT before a consuming job runs, to confirm the serialized form looks correct'
134
+ docs:
135
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/passing-information-between-jobs'
136
+ label: 'GitHub Docs: Passing information between jobs'
137
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#setting-an-output-parameter'
138
+ label: 'GitHub Docs: Setting an output parameter (GITHUB_OUTPUT)'
139
+ - url: 'https://github.com/orgs/community/discussions/17245'
140
+ label: 'GitHub Community: Job outputs only support strings — feature request for typed outputs'
141
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/evaluate-expressions-in-workflows-and-actions#fromjson'
142
+ label: 'GitHub Docs: fromJSON expression function'
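The entry above warns against writing a raw bash array to GITHUB_OUTPUT but does not show the serialization step. The following is a minimal sketch of one way to do it, assuming `jq` is available on the runner. The job names, step id, and service names are hypothetical and not taken from the package.

```yaml
jobs:
  list-services:
    runs-on: ubuntu-latest
    outputs:
      services: ${{ steps.collect.outputs.services }}
    steps:
      - id: collect
        run: |
          # Hypothetical service names; replace with your own list.
          services=(api web worker)
          # jq -R turns each input line into a JSON string;
          # jq -sc slurps those strings into one compact array.
          json=$(printf '%s\n' "${services[@]}" | jq -R . | jq -sc .)
          echo "services=$json" >> "$GITHUB_OUTPUT"

  build:
    needs: list-services
    runs-on: ubuntu-latest
    strategy:
      matrix:
        service: ${{ fromJSON(needs.list-services.outputs.services) }}
    steps:
      - run: echo "Building ${{ matrix.service }}"
```

This sketch treats each array element as one line, so it is not suitable for elements that themselves contain newlines.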
@@ -0,0 +1,87 @@
1
+ id: known-unsolved-062
2
+ title: 'workflow_run chains are limited to one level — a workflow_run-triggered workflow cannot trigger another downstream workflow via workflow_run'
3
+ category: known-unsolved
4
+ severity: limitation
5
+ tags:
6
+ - workflow-run
7
+ - chaining
8
+ - pipeline
9
+ - limitation
10
+ - no-fix
11
+ - event-trigger
12
+ patterns:
13
+ - regex: 'on:\s*\n\s+workflow_run:'
14
+ flags: 'i'
15
+ error_messages: []
16
+ root_cause: |
17
+ GitHub Actions explicitly prevents workflow_run events from chaining more than one
18
+ level deep. A workflow triggered by workflow_run CANNOT itself use workflow_run as
19
+ an on: trigger to fire a third downstream workflow.
20
+
21
+ From GitHub documentation: "A workflow triggered by a workflow_run event can only
22
+ be triggered by a workflow that is not itself triggered by a workflow_run event."
23
+
24
+ This restriction prevents infinite trigger loops but also prevents building linear
25
+ CI/CD pipelines using workflow_run chaining alone. Multi-stage pipelines of the form
26
+ Build (push) → Test (workflow_run) → Deploy (workflow_run) → Notify (workflow_run)
27
+ fail silently at the second hop: the Deploy and Notify workflows never appear in the
28
+ Actions tab and no error is raised anywhere.
29
+
30
+ There is no runtime error, no annotation, and no warning. The on: workflow_run
31
+ trigger on the downstream workflow is simply never evaluated.
32
+ fix: |
33
+ Replace the second-hop workflow_run trigger with an explicit dispatch from the
34
+ first downstream workflow:
35
+
36
+ Option 1 — repository_dispatch: use the GitHub REST API from a job step in workflow
37
+ B to POST to /repos/{owner}/{repo}/dispatches with a custom event_type. Workflow C
38
+ listens on on: repository_dispatch with a matching types: filter.
39
+
40
+ Option 2 — workflow_dispatch: use gh workflow run from a step in workflow B to
41
+ directly trigger workflow C by filename. Requires a GitHub token with actions:write.
42
+
43
+ Option 3 — Consolidate: merge the second and third workflows into a single workflow
44
+ with job dependencies (needs:) eliminating the cross-workflow hop entirely.
45
+ fix_code:
46
+ - language: yaml
47
+ label: 'Does NOT work: workflow_run cannot chain more than one level deep'
48
+ code: |
49
+ # Workflow C — this trigger is never evaluated when Workflow B is
50
+ # itself triggered by workflow_run
51
+ on:
52
+ workflow_run:
53
+ workflows: ["B - Integration Tests"]
54
+ types: [completed]
55
+ - language: yaml
56
+ label: 'Fix: dispatch workflow C via repository_dispatch from workflow B'
57
+ code: |
58
+ # In workflow B (intermediate workflow, triggered by workflow_run):
59
+ jobs:
60
+ dispatch-downstream:
61
+ runs-on: ubuntu-latest
62
+ if: ${{ github.event.workflow_run.conclusion == 'success' }}
63
+ steps:
64
+ - name: Trigger workflow C via repository_dispatch
65
+ run: |
66
+ gh api repos/${{ github.repository }}/dispatches \
67
+ --method POST \
68
+ --field event_type=run-deploy \
69
+ --raw-field "client_payload[run_id]=${{ github.event.workflow_run.id }}"
70
+ env:
71
+ GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
72
+
73
+ # In workflow C:
74
+ on:
75
+ repository_dispatch:
76
+ types: [run-deploy]
77
+ prevention:
78
+ - 'Design CI/CD pipelines assuming workflow_run allows only one hop; use repository_dispatch or workflow_dispatch for any second-level chaining'
79
+ - 'Prefer consolidating multi-stage pipelines into a single workflow with job dependencies (needs:) when the stages always execute in sequence'
80
+ - 'When a downstream workflow never appears in the Actions tab after merging, verify whether its on: trigger is workflow_run and whether its upstream workflow is also workflow_run-triggered'
81
+ docs:
82
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#workflow_run'
83
+ label: 'GitHub Docs: workflow_run event — one-level-deep restriction'
84
+ - url: 'https://docs.github.com/en/rest/repos/repos#create-a-repository-dispatch-event'
85
+ label: 'GitHub REST API: Create a repository dispatch event'
86
+ - url: 'https://cli.github.com/manual/gh_workflow_run'
87
+ label: 'GitHub CLI: gh workflow run (alternative dispatch method)'
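Option 2 in the entry above (workflow_dispatch via `gh workflow run`) has no accompanying example. The following is a minimal sketch under stated assumptions: the target file name `deploy.yml` and the input name `upstream_run_id` are hypothetical, and the dispatch targets the default branch because no `--ref` is passed.

```yaml
# Workflow B (intermediate workflow, triggered by workflow_run)
jobs:
  dispatch-downstream:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    permissions:
      actions: write   # required to create a workflow_dispatch event
    steps:
      - name: Trigger workflow C via workflow_dispatch
        run: |
          gh workflow run deploy.yml \
            --repo "$GITHUB_REPOSITORY" \
            --raw-field upstream_run_id="$UPSTREAM_RUN_ID"
        env:
          GH_TOKEN: ${{ github.token }}
          UPSTREAM_RUN_ID: ${{ github.event.workflow_run.id }}
---
# Workflow C (hypothetical file .github/workflows/deploy.yml)
on:
  workflow_dispatch:
    inputs:
      upstream_run_id:
        description: Run ID of the workflow that requested this deploy
        required: false
        type: string
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Requested by run ${{ inputs.upstream_run_id }}"
```

Passing the run ID through an environment variable rather than interpolating it into the script keeps the shell command free of expression expansion.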
@@ -0,0 +1,102 @@
1
+ id: known-unsolved-063
2
+ title: 'REST API reports runner busy:false while broker shows runner actively executing a job'
3
+ category: known-unsolved
4
+ severity: silent-failure
5
+ tags:
6
+ - self-hosted
7
+ - runner
8
+ - autoscaler
9
+ - rest-api
10
+ - broker
11
+ - state-desync
12
+ - v2-flow
13
+ - job-killed
14
+ patterns:
15
+ - regex: 'busy.*false.*runner.*executing|runner.*busy.*false.*job'
16
+ flags: 'i'
17
+ - regex: '"busy"\s*:\s*false'
18
+ flags: 'i'
19
+ error_messages:
20
+ - '"busy": false'
21
+ - 'GET /repos/{owner}/{repo}/actions/runners/{id} → {"busy": false}'
22
+ root_cause: |
23
+ On non-ephemeral self-hosted runners using the V2 broker flow
24
+ (`broker.actions.githubusercontent.com`), a state desynchronization exists between
25
+ the broker service and the GitHub REST API:
26
+
27
+ - The broker correctly tracks runner state in real-time: after picking up Job B, the
28
+ runner reports `JobState: Busy` to the broker and renews its job lease every 60s.
29
+ - However, `GET /repos/{owner}/{repo}/actions/runners/{runner_id}` (the public REST
30
+ API) continues to return `"busy": false` during the early phase of job execution.
31
+ The REST API state may only update after the runner's next periodic sync, which
32
+ can lag 30–120 seconds behind the broker state.
33
+
34
+ Auto-scaling tools that rely on the REST API to identify idle runners (e.g.,
35
+ `github-aws-runners/terraform-aws-github-runner`, KEDA GitHub Actions scaler,
36
+ custom Lambda/CloudFunction scalers) interpret `busy: false` as "runner is idle and
37
+ safe to terminate." This causes the autoscaler to terminate an EC2/GCE/Azure instance
38
+ mid-job — killing the runner process with no Actions-level error and marking the job
39
+ as failed with a runner disconnection error.
40
+
41
+ From the affected job's perspective, the log ends mid-step with "The runner has
42
+ received a shutdown signal" or the job times out. There is no annotation indicating
43
+ the root cause was an autoscaler decision based on stale REST API data.
44
+
45
+ No GitHub-side fix is available as of June 2026. The REST API does not expose a
46
+ real-time busy status consistent with the broker. Open at actions/runner#4422.
47
+ fix: |
48
+ There is no complete fix — this is a known state inconsistency in the GitHub platform.
49
+
50
+ Workarounds (choose one based on your autoscaling setup):
51
+
52
+ 1. **Switch to ephemeral JIT runners (recommended)**: Use JIT tokens and terminate
53
+ runners after exactly one job. There is no window for autoscalers to misidentify
54
+ a running job as idle because each runner is registered for one job and deregistered when it ends.
55
+
56
+ 2. **Add a grace period before termination**: When your autoscaler sees `busy: false`,
57
+ wait 2–3 minutes and re-poll before actually terminating. This covers the lag
58
+ between broker state and REST API state.
59
+
60
+ 3. **Poll job status instead of runner status**: Use
61
+ `GET /repos/{owner}/{repo}/actions/runs?status=in_progress` to check for in-progress workflow runs
62
+ before terminating any runner, rather than relying on per-runner `busy` status.
63
+
64
+ 4. **Use runner labels + job assignment**: If your autoscaler assigns specific runners
65
+ to specific jobs via labels, you can cross-reference queued/in-progress job
66
+ assignments against runner IDs before terminating.
67
+ fix_code:
68
+ - language: yaml
69
+ label: 'Example: Switch to ephemeral JIT runners (removes the desync window entirely)'
70
+ code: |
71
+ # Use JIT runner registration in your autoscaler
72
+ # Each runner handles exactly one job — busy/idle desync cannot occur
73
+ # See: https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions#using-just-in-time-runners
74
+
75
+ # In your autoscaler provisioning logic:
76
+ # POST /repos/{owner}/{repo}/actions/runners/generate-jit-config
77
+ # → Use the returned jit_config to start an ephemeral runner
78
+ # → Runner auto-deregisters after job completes — no stale REST state possible
79
+ - language: yaml
80
+ label: 'Example: Grace period before termination (pseudocode for an autoscaler script)'
81
+ code: |
82
+ # In your autoscaler Lambda/script, before terminating an instance:
83
+ # 1. GET /repos/{owner}/{repo}/actions/runners/{runner_id}
84
+ # 2. If busy == false, wait 2–3 minutes (longer than the observed lag)
85
+ # 3. Re-poll: GET /repos/{owner}/{repo}/actions/runners/{runner_id}
86
+ # 4. Only terminate if STILL busy == false after the grace period
87
+
88
+ # This covers the broker→REST lag window (~30-120s observed in practice)
89
+ prevention:
90
+ - "Prefer ephemeral JIT runners for any workload where mid-job termination would be costly; the broker-REST desync window is zero for single-job-per-runner setups."
91
+ - "Never terminate a runner instance based solely on a single REST API `busy: false` reading — always double-check with a grace period or secondary signal."
92
+ - "Monitor for jobs that end with 'runner has received a shutdown signal' — this is a reliable indicator that a runner was terminated externally mid-job."
93
+ - "If using terraform-aws-github-runner or similar, check whether the tool version has built-in grace periods for the busy-state lag."
94
+ docs:
95
+ - url: 'https://github.com/actions/runner/issues/4422'
96
+ label: 'actions/runner#4422 — /runners REST API reports busy:false for active runner'
97
+ - url: 'https://docs.github.com/en/rest/actions/self-hosted-runners'
98
+ label: 'REST API: Self-hosted runners'
99
+ - url: 'https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions#using-just-in-time-runners'
100
+ label: 'Just-in-time runners documentation'
101
+ - url: 'https://github.com/github-aws-runners/terraform-aws-github-runner'
102
+ label: 'terraform-aws-github-runner (commonly affected autoscaler)'
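The grace-period workaround in the entry above is given only as comments. As a rough illustration, the same double-poll logic can be sketched as a manually dispatched workflow step. This is a sketch, not how any particular autoscaler implements it. It assumes a hypothetical secret `RUNNER_ADMIN_TOKEN` holding a token that is allowed to read self-hosted runner state, and a hypothetical `runner_id` input.

```yaml
name: Runner idle check (sketch)
on:
  workflow_dispatch:
    inputs:
      runner_id:
        description: Numeric ID of the self-hosted runner to check
        required: true
        type: string
jobs:
  confirm-idle:
    runs-on: ubuntu-latest
    steps:
      - name: Poll busy twice, separated by a grace period
        env:
          GH_TOKEN: ${{ secrets.RUNNER_ADMIN_TOKEN }}   # hypothetical secret
          RUNNER_ID: ${{ inputs.runner_id }}
        run: |
          # Read the busy flag for one runner from the REST API.
          poll() { gh api "repos/$GITHUB_REPOSITORY/actions/runners/$RUNNER_ID" --jq .busy; }
          first=$(poll)
          if [ "$first" = "false" ]; then
            sleep 150   # grace period longer than the reported 30-120 s lag
            second=$(poll)
          else
            second=$first
          fi
          if [ "$first" = "false" ] && [ "$second" = "false" ]; then
            echo "Runner $RUNNER_ID idle on both polls; candidate for termination."
          else
            echo "Runner $RUNNER_ID busy on at least one poll; do not terminate."
          fi
```

In a real autoscaler the same two-reading rule would live in the scaler's own code; the workflow form is used here only to stay in the language of the entry's examples.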
@@ -0,0 +1,122 @@
1
+ id: permissions-auth-067
2
+ title: "OIDC immutable sub claim format for new repos breaks cloud trust policies"
3
+ category: permissions-auth
4
+ severity: error
5
+ tags:
6
+ - oidc
7
+ - sub-claim
8
+ - aws
9
+ - azure
10
+ - gcp
11
+ - immutable
12
+ - trust-policy
13
+ - new-repo
14
+ patterns:
15
+ - regex: 'Not authorized to perform sts:AssumeRoleWithWebIdentity'
16
+ flags: 'i'
17
+ - regex: 'AccessDenied.*AssumeRoleWithWebIdentity'
18
+ flags: 'i'
19
+ - regex: 'WorkloadIdentityPool.*rejected|token validation failed'
20
+ flags: 'i'
21
+ - regex: 'Credentials could not be loaded.*Could not load credentials from any providers'
22
+ flags: 'i'
23
+ error_messages:
24
+ - "Not authorized to perform sts:AssumeRoleWithWebIdentity"
25
+ - "Error: Credentials could not be loaded, please check your action inputs: Could not load credentials from any providers"
26
+ - "AuthorizationError: Token validation failed: subject claim does not match trust policy"
27
+ - "WorkloadIdentityPool: token was rejected: sub does not match condition"
28
+ root_cause: |
29
+ Starting June 18, 2026, GitHub automatically applies a new **immutable subject
30
+ claim format** to all OIDC tokens issued for repositories created or renamed on
31
+ or after that date. The new format appends numeric IDs to the owner and repo
32
+ names:
33
+
34
+ Old format: repo:my-org/my-repo:ref:refs/heads/main
35
+ New format: repo:my-org-123456/my-repo-456789:ref:refs/heads/main
36
+
37
+ Cloud provider trust policies (AWS IAM `StringEquals`, GCP Workload Identity
38
+ Federation condition, Azure Federated Identity) that were copied from docs or
39
+ examples using only the human-readable `repo:OWNER/REPO:*` pattern will never
40
+ match the new immutable format. The claim is technically valid — GitHub issued it
41
+ correctly — but the trust policy simply rejects it, producing an
42
+ authorization error.
43
+
44
+ Existing repositories are NOT affected automatically; they must explicitly opt in
45
+ via the organization or repository OIDC settings. Only repos created on or after
46
+ June 18, 2026 and repos renamed or transferred on or after that date receive the new
47
+ format unconditionally.
48
+
49
+ This is distinct from the repo-rename scenario (permissions-auth-019, which covers
50
+ mutable names changing) and the environment-block scenario (permissions-auth-054,
51
+ which covers the sub claim changing format when an `environment:` key is added).
52
+ This entry covers the baseline mismatch for freshly created repos.
53
+ fix: |
54
+ Update your cloud provider trust policy to use the new immutable sub claim
55
+ format. Use the GitHub OIDC preview API to inspect the exact subject claim your
56
+ repository will produce before updating the policy:
57
+
58
+ GET /repos/{owner}/{repo}/actions/oidc/customization/sub
59
+
60
+ For AWS IAM, replace the StringEquals condition value with the new format
61
+ (including numeric IDs), or switch to StringLike with a wildcard:
62
+
63
+ repo:my-org-*/my-repo-*:ref:refs/heads/main
64
+
65
+ For GCP Workload Identity Federation, update the attribute condition string.
66
+ For Azure Federated Identity, update the subject field in the credential.
67
+
68
+ Alternatively, use GitHub's custom subject claim feature to define a simplified
69
+ subject that is stable across formats (e.g., only include `repository` and
70
+ `ref` fields that won't change).
71
+ fix_code:
72
+ - language: yaml
73
+ label: "AWS — update IAM trust policy StringEquals to new immutable format"
74
+ code: |
75
+ # In your AWS IAM trust policy JSON (not YAML), update the condition:
76
+ # OLD — will fail for repos created after June 18, 2026:
77
+ # "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:ref:refs/heads/main"
78
+ #
79
+ # NEW: use StringLike, with wildcards only for the numeric ID suffixes:
80
+ # "token.actions.githubusercontent.com:sub": "repo:my-org-*/my-repo-*:ref:refs/heads/main"
81
+ #
82
+ # Or use the exact immutable format shown in the OIDC preview API response.
83
+ #
84
+ # Alternatively, switch to a custom subject claim in GitHub OIDC settings
85
+ # that only includes fields you control.
86
+ - language: yaml
87
+ label: "GitHub Actions workflow is unchanged — the issue is in the cloud provider trust policy"
88
+ code: |
89
+ jobs:
90
+ deploy:
91
+ permissions:
92
+ id-token: write
93
+ contents: read
94
+ runs-on: ubuntu-latest
95
+ steps:
96
+ - uses: aws-actions/configure-aws-credentials@v4
97
+ with:
98
+ role-to-assume: arn:aws:iam::123456789012:role/my-role
99
+ aws-region: us-east-1
100
+ # If the above step fails with "Not authorized to perform
101
+ # sts:AssumeRoleWithWebIdentity", the trust policy sub claim
102
+ # condition does not match the new immutable format.
103
+ # Update the IAM trust policy — not this workflow file.
104
+ prevention:
105
+ - "Use StringLike with a wildcard pattern in cloud trust policies instead of
106
+ StringEquals with an exact sub claim value — this accommodates both old and
107
+ new immutable formats."
108
+ - "After creating a new repo, call the GitHub OIDC preview API
109
+ (GET /repos/{owner}/{repo}/actions/oidc/customization/sub) to see the
110
+ exact sub claim format before configuring the cloud trust policy."
111
+ - "Consider using GitHub's custom subject claim feature to pin a simplified,
112
+ stable sub claim structure that does not change with naming format updates."
113
+ - "Existing repos can opt in to the new format in repository or organization
114
+ OIDC settings — test in a staging environment first to confirm trust policies
115
+ are updated before enabling in production."
116
+ docs:
117
+ - url: "https://github.blog/changelog/2026-04-23-immutable-subject-claims-for-github-actions-oidc-tokens/"
118
+ label: "GitHub Changelog: Immutable subject claims for GitHub Actions OIDC tokens (April 23, 2026)"
119
+ - url: "https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/about-security-hardening-with-openid-connect"
120
+ label: "About security hardening with OpenID Connect"
121
+ - url: "https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/customizing-the-subject-claims-for-an-organization-or-repository"
122
+ label: "Customizing the subject claims for an organization or repository"
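When a trust policy rejects a token, it helps to see the exact `sub` value GitHub issued for the job. The following is a minimal debugging sketch, assuming a Linux runner with `curl`, `jq`, and GNU `base64`. The job name is hypothetical. It requests an ID token through the runner-provided `ACTIONS_ID_TOKEN_REQUEST_URL` and `ACTIONS_ID_TOKEN_REQUEST_TOKEN` variables and prints only the decoded `sub` claim, never the token itself.

```yaml
jobs:
  inspect-sub:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
    steps:
      - name: Print the sub claim of the issued OIDC token
        run: |
          token=$(curl -sSf \
            -H "Authorization: bearer $ACTIONS_ID_TOKEN_REQUEST_TOKEN" \
            "$ACTIONS_ID_TOKEN_REQUEST_URL" | jq -r .value)
          # The JWT payload is the second dot-separated segment, base64url-encoded.
          payload=$(echo "$token" | cut -d. -f2 | tr '_-' '/+')
          # Restore the padding that base64url omits.
          while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="$payload="; done
          echo "$payload" | base64 -d | jq -r .sub
```

The printed value can then be compared character by character with the condition in the cloud provider's trust policy.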
@@ -0,0 +1,122 @@
1
+ id: permissions-auth-064
2
+ title: 'GitHub OIDC Provider Intermittent 500 — Transient Token Request Failures'
3
+ category: permissions-auth
4
+ severity: error
5
+ tags:
6
+ - oidc
7
+ - intermittent
8
+ - 500
9
+ - token-request
10
+ - transient
11
+ - retry
12
+ patterns:
13
+ - regex: 'Request to OIDC provider failed with status 500'
14
+ flags: 'i'
15
+ - regex: 'Unable to get OIDC token.*500'
16
+ flags: 'i'
17
+ - regex: 'Failed to get ID token.*status.*500'
18
+ flags: 'i'
19
+ error_messages:
20
+ - "Error: Unable to get OIDC token: Error: Request to OIDC provider failed with status 500"
21
+ - "Request to OIDC provider failed with status 500"
22
+ - "Error: Failed to get ID token: Request failed with status code 500"
23
+ root_cause: |
24
+ GitHub's OIDC token endpoint (`https://token.actions.githubusercontent.com`) occasionally
25
+ returns HTTP **500 Internal Server Error** responses due to transient infrastructure issues
26
+ on GitHub's side. The failure is server-side and non-deterministic — workflows that ran
27
+ successfully hours ago may fail today, and re-running the identical job minutes later often
28
+ succeeds.
29
+
30
+ **This error is distinct from other OIDC failures:**
31
+
32
+ | Error | Cause | Fix |
33
+ |---|---|---|
34
+ | `Unable to get ACTIONS_ID_TOKEN_REQUEST_URL env variable` | Missing `id-token: write` permission | Add permission to workflow |
35
+ | `OIDC: Could not parse JWT` | Malformed sub-claim or trust config | Fix cloud provider OIDC trust config |
36
+ | `429 Too Many Requests` | Rate limiting (large parallel matrix) | Add delays, reduce parallelism |
37
+ | **`status 500`** | **Transient GitHub infrastructure failure** | **Retry the job** |
38
+
39
+ The 500 occurs **after** the permission and environment variable checks pass. The OIDC
40
+ endpoint is reachable but GitHub's token minting service returned an internal error.
41
+
42
+ Common patterns where 500s surface:
43
+ - Workflows that run OIDC auth in every push/PR commit (high frequency)
44
+ - Parallel matrix jobs that all request tokens simultaneously
45
+ - Periods of GitHub infrastructure maintenance or elevated load
46
+ fix: |
47
+ **Option 1 — Re-run the failed job.** The 500 is stateless; simply triggering a re-run
48
+ from the GitHub Actions UI is usually sufficient for non-blocking situations.
49
+
50
+ **Option 2 — Add a retry wrapper** around the cloud authentication step using an action
51
+ like `Wandalen/wretry.action` (wraps another action) or `nick-fields/retry` (wraps shell commands only). Wrap only the OIDC-dependent step
52
+ so that a transient 500 triggers an automatic retry without failing the entire workflow.
53
+
54
+ **Option 3 — Shell retry loop** for custom OIDC token fetches. Use a loop that retries
55
+ the request up to 3 times with a short backoff before failing.
56
+
57
+ There is no client-side configuration that prevents the server 500 — retrying is the
58
+ only mitigation.
59
+ fix_code:
60
+ - language: yaml
61
+ label: 'Retry wrapper with nick-fields/retry'
62
+ code: |
63
+ jobs:
64
+ deploy:
65
+ runs-on: ubuntu-latest
66
+ permissions:
67
+ id-token: write
68
+ contents: read
69
+ steps:
70
+ - uses: actions/checkout@v4
71
+
72
+ # nick-fields/retry retries a shell command; it cannot wrap a `uses:` step
73
+ - name: Configure AWS credentials (with retry)
74
+ uses: nick-fields/retry@v3
75
+ with:
76
+ max_attempts: 3
77
+ retry_wait_seconds: 15
78
+ timeout_minutes: 5
79
+ command: >
80
+ aws sts get-caller-identity # Or any OIDC-gated step
81
+
82
+ # Direct usage example — if your action supports retries natively
83
+ - name: Authenticate to AWS
84
+ uses: aws-actions/configure-aws-credentials@v4
85
+ with:
86
+ role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
87
+ aws-region: us-east-1
88
+
89
+ - language: yaml
90
+ label: 'Shell retry loop for custom OIDC token fetch'
91
+ code: |
92
+ steps:
93
+ - name: Fetch OIDC token with retry
94
+ run: |
95
+ MAX_ATTEMPTS=3
96
+ for i in $(seq 1 $MAX_ATTEMPTS); do
97
+ echo "Attempt $i of $MAX_ATTEMPTS..."
98
+ TOKEN=$(curl -sf \
99
+ -H "Authorization: bearer $ACTIONS_ID_TOKEN_REQUEST_TOKEN" \
100
+ "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=sts.amazonaws.com" \
101
+ | jq -er '.value') && [ -n "$TOKEN" ] && {
102
+ echo "OIDC_TOKEN=${TOKEN}" >> "$GITHUB_ENV"
103
+ echo "Token acquired."
104
+ exit 0
105
+ }
106
+ echo "Attempt $i failed."
107
+ [ $i -lt $MAX_ATTEMPTS ] && sleep $((10 * i))
108
+ done
109
+ echo "All $MAX_ATTEMPTS attempts failed."
110
+ exit 1
111
+ prevention:
112
+ - 'Add retry logic to all OIDC-based cloud auth steps — a transient 500 should not block a deployment that can safely retry'
113
+ - 'Distinguish error types in runbooks: 500 = transient (retry), 403 = permission error (fix config), 429 = rate limited (add delay)'
114
+ - 'Monitor https://www.githubstatus.com/ when 500s appear in multiple unrelated workflows simultaneously — it often indicates a GitHub Actions incident'
115
+ - 'For critical pipelines, implement automatic re-run-on-failure so OIDC 500s recover without manual intervention (e.g., a workflow_run-triggered workflow that calls gh run rerun --failed)'
116
+ docs:
117
+ - url: 'https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/about-security-hardening-with-openid-connect'
118
+ label: 'About security hardening with OpenID Connect — GitHub Actions docs'
119
+ - url: 'https://www.githubstatus.com/'
120
+ label: 'GitHub Status — check for active GitHub Actions incidents'
121
+ - url: 'https://github.com/nick-fields/retry'
122
+ label: 'nick-fields/retry — retry wrapper action for GitHub Actions'
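The prevention list above mentions automatic re-run-on-failure without showing it. The following is a minimal sketch of one possible shape, assuming the deployment workflow is named "Deploy" (a hypothetical name). It re-runs only the failed jobs and stops after the third attempt. Note that it reacts to any failure, not only OIDC 500s, so a real setup may want to inspect the logs first.

```yaml
name: Rerun failed deploy (sketch)
on:
  workflow_run:
    workflows: ["Deploy"]   # hypothetical workflow name
    types: [completed]
jobs:
  rerun:
    # Retry at most twice: attempts 1 and 2 may trigger a re-run, attempt 3 may not.
    if: ${{ github.event.workflow_run.conclusion == 'failure' && github.event.workflow_run.run_attempt < 3 }}
    runs-on: ubuntu-latest
    permissions:
      actions: write   # required to re-run a workflow run
    steps:
      - name: Re-run only the failed jobs
        run: gh run rerun "$RUN_ID" --repo "$GITHUB_REPOSITORY" --failed
        env:
          GH_TOKEN: ${{ github.token }}
          RUN_ID: ${{ github.event.workflow_run.id }}
```

Capping on `run_attempt` keeps a persistent failure from re-running indefinitely.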