@htekdev/actions-debugger 1.0.34 → 1.0.35

@@ -0,0 +1,108 @@
+ id: 'caching-artifacts-030'
+ title: "Cache Service 429 Rate Limit During Upload Causes EBADF File Stream Crash"
+ category: caching-artifacts
+ severity: error
+ tags:
+   - cache
+   - rate-limit
+   - 429
+   - ebadf
+   - post-step
+   - upload
+   - toolkit
+ patterns:
+   - regex: 'Cache service responded with 429'
+     flags: 'i'
+   - regex: 'Cache upload failed because file read failed with EBADF'
+     flags: 'i'
+   - regex: 'Failed to save:.*429'
+     flags: 'i'
+ error_messages:
+   - "Warning: Failed to save: Cache service responded with 429 during upload chunk."
+   - "Error: Cache upload failed because file read failed with EBADF: bad file descriptor, read"
+ root_cause: |
+   When the GitHub cache service rate-limits a cache upload request with HTTP 429
+   Too Many Requests, the actions/toolkit cache implementation does not cleanly
+   handle the rate-limit response. It emits a warning but continues attempting
+   to read from the underlying file stream. Because the HTTP upload connection was
+   already torn down after the 429, the file stream is left in a bad state. A
+   subsequent read on the orphaned stream raises EBADF (bad file descriptor), which
+   surfaces as a hard crash in the cleanup or post step of any action that uses
+   the toolkit cache library, including actions/setup-java, actions/setup-node,
+   actions/setup-python, and direct actions/cache usage.
+
+   The root cause is missing retry-with-backoff logic for 429 responses in the
+   toolkit's cache upload path. 429 responses include a Retry-After header that
+   the toolkit ignores entirely.
+
+   Most commonly triggered on large matrix builds (20+ concurrent jobs) where many
+   jobs save large caches simultaneously and collectively exhaust the cache service
+   rate limit. Individual jobs that hit the rate limit fail with the EBADF crash
+   rather than retrying or gracefully degrading.
+
+   Open since November 2023 (actions/toolkit#1589). Multiple large open-source
+   projects have reported it: apache/beam, techmatters/terraso-mobile-client,
+   synapsecns/sanguine, and others.
+ fix: |
+   No upstream fix is available: this has been an open bug in actions/toolkit
+   since November 2023. The rate-limit retry path is not implemented.
+
+   Workarounds:
+   1. Re-run the failed job from the GitHub Actions UI. By the time of the
+      re-run, the cache service rate limit has usually recovered and the
+      upload succeeds.
+   2. Reduce concurrent cache saves: cap large matrix builds with
+      strategy.max-parallel so that cache uploads are staggered.
+   3. Stay on the latest release of actions/cache (v4 or later), which has
+      the most recent reliability fixes. GitHub occasionally ships partial
+      fixes.
+   4. Save the cache in an explicit actions/cache/save step with
+      if: always() and continue-on-error: true. The step may still warn on
+      429, but a failed save no longer fails the job.
+ fix_code:
+   - language: yaml
+     label: "Limit concurrent cache saves with max-parallel to avoid rate limiting"
+     code: |
+       jobs:
+         build:
+           strategy:
+             matrix:
+               os: [ubuntu-latest, macos-latest, windows-latest]
+               node: [18, 20, 22]
+             max-parallel: 4 # Stagger cache saves; avoid 20+ simultaneous uploads
+           runs-on: ${{ matrix.os }}
+           steps:
+             - uses: actions/checkout@v4
+             - uses: actions/setup-node@v4
+               with:
+                 node-version: ${{ matrix.node }}
+                 cache: npm
+             - run: npm ci
+             - run: npm test
+   - language: yaml
+     label: "Accept cache-save failure gracefully with continue-on-error"
+     code: |
+       steps:
+         - uses: actions/cache/restore@v4
+           with:
+             path: ~/.m2/repository
+             key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
+         - name: Build
+           run: mvn --batch-mode package
+         - name: Save cache (continue even if 429 occurs)
+           uses: actions/cache/save@v4
+           continue-on-error: true # Prevents EBADF crash from failing the job
+           with:
+             path: ~/.m2/repository
+             key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
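+   # A hedged sketch of automating workaround 1 (re-running the failed job).
+   # Assumptions not taken from this entry: the monitored workflow is named
+   # "CI" (hypothetical) and up to two automatic re-runs are acceptable.
+   # It re-runs failed jobs after any failure, not only after a 429.
+   - language: yaml
+     label: "Sketch: automatically re-run failed jobs (assumes a workflow named CI)"
+     code: |
+       on:
+         workflow_run:
+           workflows: ["CI"] # hypothetical workflow name
+           types: [completed]
+
+       permissions:
+         actions: write
+
+       jobs:
+         rerun-failed:
+           runs-on: ubuntu-latest
+           # Retry only failures, and stop after the third attempt
+           if: >-
+             github.event.workflow_run.conclusion == 'failure' &&
+             github.event.workflow_run.run_attempt < 3
+           steps:
+             - name: Re-run failed jobs
+               env:
+                 GH_TOKEN: ${{ github.token }}
+                 GH_REPO: ${{ github.repository }}
+               run: gh run rerun ${{ github.event.workflow_run.id }} --failed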
+ prevention:
+   - "Use strategy.max-parallel to limit concurrent matrix jobs and stagger cache upload timing."
+   - "Prefer actions/cache@v4 (latest) which includes the most recent reliability patches."
+   - "Add continue-on-error: true to explicit cache save steps to prevent EBADF from failing the workflow."
+   - "Monitor large matrix builds for recurring 429 errors: they indicate you need to reduce concurrency or shard differently."
+ docs:
+   - url: "https://github.com/actions/toolkit/issues/1589"
+     label: "actions/toolkit#1589: Cache upload does not handle 429 error (open since Nov 2023, 6 reactions)"
+   - url: "https://github.com/actions/setup-java/issues/543"
+     label: "actions/setup-java#543: Transient 429 error fail upload cache cause workflow failure"
+   - url: "https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/caching-dependencies-to-speed-up-workflows"
+     label: "GitHub Docs: Caching dependencies to speed up workflows"
@@ -0,0 +1,144 @@
+ id: 'known-unsolved-032'
+ title: "Ejecting a PR from the Merge Queue Does Not Cancel Its Running Workflow Runs"
+ category: known-unsolved
+ severity: limitation
+ tags:
+   - merge-queue
+   - merge_group
+   - cancellation
+   - orphaned-runs
+   - ci-minutes
+   - limitation
+   - waste
+ patterns:
+   - regex: 'gh-readonly-queue/'
+     flags: 'i'
+   - regex: 'on:\s*\n\s*merge_group'
+     flags: 'im'
+ error_messages:
+   - "Workflow run continues after PR is ejected from the merge queue"
+   - "No automatic cancellation for gh-readonly-queue/... runs on PR removal"
+ root_cause: |
+   When GitHub's merge queue ejects a PR (due to a failing required check,
+   a merge conflict with another PR in the batch, or manual removal), GitHub
+   does NOT automatically cancel the workflow runs that were started for the
+   merge group batch that contained that PR.
+
+   Those runs continue to execute and consume CI minutes even though the
+   associated PR will never be merged via that queue entry. In active
+   repositories with large merge queues (especially monorepos or high-velocity
+   teams), a single failed check can cause a wave of orphaned runs as PRs are
+   rebatched and re-queued, multiplying wasted CI time.
+
+   Merge group workflow runs are triggered on ephemeral
+   `gh-readonly-queue/<base-branch>/pr-<number>-<sha>` refs. When the queue
+   ejects a PR, the ref is deleted but in-progress runs are not signalled.
+
+   GitHub has acknowledged this as working-as-designed behavior: workflow run
+   cancellation on merge queue ejection must be managed by the repository owner.
+
+   Source: dotCMS/core#34592 (GitHub merge queue orphaned workflow runs waste
+   CI resources, Feb 2026, open).
+ fix: |
+   No built-in automatic cancellation mechanism exists. Available workarounds:
+
+   Option 1 - Scoped concurrency group per merge queue entry:
+   Set a concurrency group scoped to the workflow and the merge group ref. This
+   prevents a single PR from accumulating multiple parallel runs as it is
+   rebatched, but does NOT cancel runs when the PR is ejected.
+
+   Option 2 - Differentiated cancel-in-progress by event:
+   Use cancel-in-progress only for pull_request events (not merge_group events)
+   to avoid cancelling sibling PRs in the same batch while still cancelling
+   redundant PR-branch runs.
+
+   Option 3 - External cleanup script:
+   A separate monitoring workflow on schedule or repository_dispatch can call
+   the Actions API to cancel in-progress runs on refs that no longer exist.
+   This is operationally complex but achieves true cleanup.
+ fix_code:
+   - language: yaml
+     label: "Differentiated concurrency: cancel PR runs but not merge queue runs"
+     code: |
+       on:
+         push:
+           branches: [main]
+         pull_request:
+         merge_group:
+
+       concurrency:
+         # Include workflow name to avoid cross-workflow cancellation
+         group: ${{ github.workflow }}-${{ github.ref }}
+         # Cancel duplicate PR branch runs, but do NOT cancel merge queue runs
+         # (cancelling merge_group runs ejects sibling PRs from the queue)
+         cancel-in-progress: ${{ github.event_name == 'pull_request' }}
+
+       jobs:
+         ci:
+           runs-on: ubuntu-latest
+           steps:
+             - uses: actions/checkout@v4
+             - run: npm test
+ - language: yaml
83
+ label: "Monitoring workflow to cancel orphaned merge queue runs (advanced)"
84
+ code: |
85
+ # This workflow runs periodically and cancels in-progress runs
86
+ # on merge queue refs that no longer exist as active queue entries.
87
+ # Requires: contents: read, actions: write
88
+ on:
89
+ schedule:
90
+ - cron: '*/15 * * * *' # Every 15 minutes
91
+ workflow_dispatch:
92
+
93
+ permissions:
94
+ actions: write
95
+ contents: read
96
+
97
+ jobs:
98
+ cleanup-orphaned-runs:
99
+ runs-on: ubuntu-latest
100
+ steps:
101
+ - name: Cancel orphaned merge queue runs
102
+ uses: actions/github-script@v7
103
+ with:
104
+ script: |
105
+ const runs = await github.rest.actions.listWorkflowRunsForRepo({
106
+ owner: context.repo.owner,
107
+ repo: context.repo.repo,
108
+ status: 'in_progress',
109
+ per_page: 100,
110
+ });
111
+ for (const run of runs.data.workflow_runs) {
112
+ if (run.head_branch?.startsWith('gh-readonly-queue/')) {
113
+ // Verify the queue ref still exists
114
+ try {
115
+ await github.rest.git.getRef({
116
+ owner: context.repo.owner,
117
+ repo: context.repo.repo,
118
+ ref: `heads/${run.head_branch}`,
119
+ });
120
+ } catch (e) {
121
+ if (e.status === 404) {
122
+ // Ref gone — cancel the orphaned run
123
+ await github.rest.actions.cancelWorkflowRun({
124
+ owner: context.repo.owner,
125
+ repo: context.repo.repo,
126
+ run_id: run.id,
127
+ });
128
+ console.log(`Cancelled orphaned run ${run.id} for ${run.head_branch}`);
129
+ }
130
+ }
131
+ }
132
+ }
133
+ prevention:
134
+ - "Accept that some CI minutes will be wasted on ejected PRs — this is a known platform constraint with no first-class solution."
135
+ - "Monitor total merge queue depth in high-velocity repos; if the queue frequently rebatches, orphaned runs accumulate quickly."
136
+ - "Use differentiated cancel-in-progress (disabled for merge_group events) to at least avoid accidentally ejecting sibling PRs while managing PR-branch redundancy."
137
+ - "Consider a periodic cleanup workflow using the Actions API to cancel in-progress runs on deleted merge queue refs."
138
+ docs:
139
+ - url: "https://github.com/dotCMS/core/issues/34592"
140
+ label: "dotCMS/core#34592: Merge queue orphaned workflow runs waste CI resources (open, Feb 2026)"
141
+ - url: "https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/managing-a-merge-queue"
142
+ label: "GitHub Docs: Managing a merge queue"
143
+ - url: "https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#merge_group"
144
+ label: "GitHub Docs: merge_group event"
@@ -0,0 +1,132 @@
+ id: 'triggers-029'
+ title: "issue_comment Trigger Runs in Default Branch Context: PR Code Not Checked Out"
+ category: triggers
+ severity: silent-failure
+ tags:
+   - issue_comment
+   - pull_request
+   - checkout
+   - default-branch
+   - context
+   - ref
+   - pr-comment
+ patterns:
+   - regex: 'on:\s*\n\s*issue_comment'
+     flags: 'im'
+   - regex: 'github\.event\.issue\.pull_request'
+     flags: 'i'
+ error_messages:
+   - "Warning: This commit is not necessarily the head of this branch."
+   - "issue_comment event fires in the default branch context, not the PR branch"
+ root_cause: |
+   When a comment is posted on a pull request, GitHub fires the `issue_comment`
+   event in the **default branch context**, not in the PR branch context.
+   Specifically:
+   - github.ref = refs/heads/main (or the repo default branch)
+   - github.sha = the HEAD commit of the default branch
+   - github.event.pull_request is undefined (issue_comment doesn't include PR data)
+
+   A workflow triggered by `issue_comment` that runs `actions/checkout` without
+   an explicit `ref:` will silently check out the **default branch**, not the
+   PR's code. The workflow appears to succeed, but it ran against stale or
+   unrelated code, not the PR changes being discussed.
+
+   This is a well-known footgun, tracked in actions/checkout#331 (67 reactions).
+   Developers commonly add /deploy, /test, or /approve slash command workflows
+   on PR comments, expecting the workflow to run against the PR's code.
+
+   A second consequence: `github.event.pull_request` is undefined in this context.
+   In a workflow expression, `github.event.pull_request.head.sha` evaluates to an
+   empty string. Script code that dereferences the payload (for example in
+   actions/github-script) fails with
+   "Cannot read properties of undefined (reading 'head')".
+
+   Source: actions/checkout#331 (67 reactions, open since Aug 2020).
+ fix: |
+   Explicitly retrieve the PR details from the GitHub API and check out the PR
+   head commit using the pull request number from the issue comment event payload.
+
+   The PR number is available at: github.event.issue.number
+   (issue_comment events on PRs use the issue number, which matches the PR number)
+
+   Two approaches:
+   1. Use actions/github-script to fetch the PR head SHA, then checkout with
+      that explicit ref.
+   2. Use pull_request_target instead of issue_comment when the workflow can be
+      triggered by pull request activity rather than by a comment. It exposes
+      github.event.pull_request.head.sha directly, but be aware of the security
+      implications (pull_request_target runs with write permissions in the base
+      repo context, even for fork PRs).
+
+   Always gate on `github.event.issue.pull_request` in an if: condition to
+   distinguish PR comments from plain issue comments.
+ fix_code:
+   - language: yaml
+     label: "Checkout PR head commit from issue_comment event"
+     code: |
+       on:
+         issue_comment:
+           types: [created]
+
+       jobs:
+         run-on-pr-comment:
+           runs-on: ubuntu-latest
+           # Only run on PR comments, not plain issue comments
+           if: github.event.issue.pull_request != null
+           steps:
+             - name: Get PR head SHA
+               id: pr
+               uses: actions/github-script@v7
+               with:
+                 script: |
+                   const pr = await github.rest.pulls.get({
+                     owner: context.repo.owner,
+                     repo: context.repo.repo,
+                     pull_number: context.issue.number,
+                   });
+                   core.setOutput('head_sha', pr.data.head.sha);
+                   core.setOutput('head_ref', pr.data.head.ref);
+
+             - uses: actions/checkout@v4
+               with:
+                 ref: ${{ steps.pr.outputs.head_sha }}
+
+             - name: Run tests on PR code
+               run: npm test
+   - language: yaml
+     label: "Slash command pattern: only act on specific PR comment text"
+     code: |
+       on:
+         issue_comment:
+           types: [created]
+
+       jobs:
+         slash-command:
+           runs-on: ubuntu-latest
+           if: |
+             github.event.issue.pull_request != null &&
+             contains(github.event.comment.body, '/deploy')
+           steps:
+             - name: Get PR head SHA
+               id: pr
+               uses: actions/github-script@v7
+               with:
+                 script: |
+                   const pr = await github.rest.pulls.get({
+                     owner: context.repo.owner,
+                     repo: context.repo.repo,
+                     pull_number: context.issue.number,
+                   });
+                   core.setOutput('head_sha', pr.data.head.sha);
+             - uses: actions/checkout@v4
+               with:
+                 ref: ${{ steps.pr.outputs.head_sha }}
+             - run: ./scripts/deploy.sh
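+   # A hedged variant of the slash command pattern. Two assumptions that are
+   # not part of this entry: only repository owners, members, and
+   # collaborators may trigger the command, and checking out the
+   # refs/pull/<number>/head ref is acceptable in place of an API lookup.
+   # That ref follows the PR head, so it can move between comment and checkout.
+   - language: yaml
+     label: "Sketch: restrict /deploy to trusted commenters and check out refs/pull/<number>/head"
+     code: |
+       on:
+         issue_comment:
+           types: [created]
+
+       jobs:
+         slash-command:
+           runs-on: ubuntu-latest
+           if: >-
+             github.event.issue.pull_request != null &&
+             startsWith(github.event.comment.body, '/deploy') &&
+             contains(fromJSON('["OWNER", "MEMBER", "COLLABORATOR"]'), github.event.comment.author_association)
+           steps:
+             - uses: actions/checkout@v4
+               with:
+                 # github.event.issue.number is the PR number for PR comments
+                 ref: refs/pull/${{ github.event.issue.number }}/head
+             - run: ./scripts/deploy.sh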
+ prevention:
+   - "Always add `if: github.event.issue.pull_request != null` to distinguish PR comments from issue comments."
+   - "Never rely on github.ref or github.sha in issue_comment workflows: they point to the default branch, not the PR."
+   - "Use actions/github-script to fetch the PR head SHA via the pulls.get API, then pass it to actions/checkout as ref."
+   - "Consider pull_request_target for PR-triggered workflows, but audit for untrusted code execution risk first."
+ docs:
+   - url: "https://github.com/actions/checkout/issues/331"
+     label: "actions/checkout#331: Any way to checkout PR from issue_comment event? (67 reactions)"
+   - url: "https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#issue_comment"
+     label: "GitHub Docs: issue_comment event trigger"
+   - url: "https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#pull_request_target"
+     label: "GitHub Docs: pull_request_target event (alternative with write access)"
@@ -0,0 +1,125 @@
+ id: 'yaml-syntax-032'
+ title: "jobs.<id>.result Always Returns Empty String in on.workflow_call.outputs.<output_name>.value"
+ category: yaml-syntax
+ severity: silent-failure
+ tags:
+   - reusable-workflow
+   - workflow_call
+   - outputs
+   - jobs-context
+   - result
+   - expression
+   - empty-string
+ patterns:
+   - regex: 'jobs\.\w+\.result'
+     flags: 'i'
+   - regex: 'on:\s*\n\s*workflow_call:\s*\n[\s\S]*?outputs:'
+     flags: 'im'
+ error_messages:
+   - "jobs.<job_id>.result evaluates to empty string in on.workflow_call.outputs value"
+   - "needs.<reusable_workflow>.outputs.<output> is empty string instead of success/failure/skipped"
+ root_cause: |
+   In a reusable workflow, the expression `${{ jobs.<id>.result }}` silently
+   evaluates to an empty string when referenced inside an
+   `on.workflow_call.outputs.<output_name>.value` expression.
+
+   Despite the GitHub documentation stating that the `jobs` context is available
+   in that expression scope, direct property access to the `.result` field of a
+   job object returns empty string rather than the expected outcome value
+   (`success`, `failure`, `cancelled`, or `skipped`).
+
+   This causes downstream caller workflows that test the reusable workflow's
+   output (e.g., `needs.reusable.outputs.my-result == 'skipped'`) to silently
+   receive an empty string. Conditions that gate follow-up jobs on the result
+   of the reusable workflow never evaluate to true.
+
+   Open since January 2024 (actions/runner#3087, 11 reactions). The issue
+   affects all GitHub-hosted runners. A workaround exists using
+   `fromJSON(toJSON(jobs.<id>)).result` which forces the expression through JSON
+   serialization and correctly surfaces the result value.
+
+   Note: This is distinct from the more common mistake of referencing outputs
+   without declaring them in the job's outputs: block. The bug occurs even when
+   the job produces no step outputs and you simply want to surface whether the
+   job ran, was skipped, or failed.
+ fix: |
+   Two approaches:
+
+   Option 1 - fromJSON/toJSON workaround (quickest):
+   Replace `${{ jobs.build.result }}` with
+   `${{ fromJSON(toJSON(jobs.build)).result }}`. The JSON round-trip forces
+   full evaluation of the jobs context object and correctly returns the result
+   string.
+
+   Option 2 - Surface result via step output (most reliable):
+   Add an explicit step with `if: always()` that writes `job.status` to
+   GITHUB_OUTPUT. Reference the step output in the job's outputs block and
+   in the top-level workflow_call outputs. This avoids the expression evaluation
+   quirk entirely. Because a skipped job runs no steps, this approach reports
+   `success`, `failure`, or `cancelled`, and leaves the output empty when the
+   job is skipped.
+ fix_code:
+   - language: yaml
+     label: "Broken pattern vs fromJSON/toJSON workaround"
+     code: |
+       # BROKEN: jobs.build.result returns empty string
+       on:
+         workflow_call:
+           outputs:
+             job-result:
+               value: ${{ jobs.build.result }} # Always empty string
+
+       # FIXED: fromJSON/toJSON forces expression evaluation
+       on:
+         workflow_call:
+           outputs:
+             job-result:
+               value: ${{ fromJSON(toJSON(jobs.build)).result }} # Returns success/failure/skipped
+   - language: yaml
+     label: "Preferred fix: surface result via explicit step output"
+     code: |
+       on:
+         workflow_call:
+           outputs:
+             job-result:
+               description: "The build job outcome"
+               value: ${{ jobs.build.outputs.result }}
+
+       jobs:
+         build:
+           runs-on: ubuntu-latest
+           outputs:
+             result: ${{ steps.capture-result.outputs.result }}
+           steps:
+             - uses: actions/checkout@v4
+             - name: Build
+               run: npm run build
+
+             - name: Capture job result
+               id: capture-result
+               if: always()
+               run: echo "result=${{ job.status }}" >> "$GITHUB_OUTPUT"
+   - language: yaml
+     label: "Caller workflow checking reusable output"
+     code: |
+       jobs:
+         reusable:
+           uses: ./.github/workflows/build.yml
+           secrets: inherit
+
+         deploy:
+           needs: reusable
+           runs-on: ubuntu-latest
+           # This condition now works correctly with either fix above
+           if: needs.reusable.outputs.job-result == 'success'
+           steps:
+             - run: echo "Deploying after successful build"
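+   # A minimal sketch of the integration check suggested under prevention,
+   # reusing the build.yml path and job-result output from the caller example.
+   # The assert-output job name and JOB_RESULT variable are hypothetical.
+   - language: yaml
+     label: "Sketch: fail the caller when the reusable workflow output is empty"
+     code: |
+       jobs:
+         reusable:
+           uses: ./.github/workflows/build.yml
+
+         assert-output:
+           needs: reusable
+           runs-on: ubuntu-latest
+           # Run even when the reusable workflow failed
+           if: always()
+           steps:
+             - name: Check that job-result is not empty
+               env:
+                 JOB_RESULT: ${{ needs.reusable.outputs.job-result }}
+               run: |
+                 if [ -z "$JOB_RESULT" ]; then
+                   echo "::error::job-result output is empty"
+                   exit 1
+                 fi
+                 echo "job-result=$JOB_RESULT"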
+ prevention:
+   - "Never rely on direct `jobs.<id>.result` access in workflow_call top-level outputs; use the fromJSON(toJSON()) workaround or explicit step outputs."
+   - "Add integration tests for reusable workflows that verify output values are non-empty strings."
+   - "Use job.status (available inside step runs) rather than jobs.<id>.result (the problematic context) when capturing job outcome."
+ docs:
+   - url: "https://github.com/actions/runner/issues/3087"
+     label: "actions/runner#3087: Cannot access jobs.<id>.result from on.workflow_call.outputs (open since Jan 2024, 11 reactions)"
+   - url: "https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/passing-information-between-jobs"
+     label: "GitHub Docs: Passing information between jobs"
+   - url: "https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/contexts#jobs-context"
+     label: "GitHub Docs: jobs context (documents result property)"
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@htekdev/actions-debugger",
-   "version": "1.0.34",
+   "version": "1.0.35",
    "description": "65+ real GitHub Actions errors, queryable by agents. CLI + MCP server + Copilot skills + error database.",
    "type": "module",
    "main": "./dist/index.js",