@htekdev/actions-debugger 1.0.114 → 1.0.115

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,83 @@
1
+ id: concurrency-timing-053
2
+ title: 'Concurrency pending slot overflow: third concurrent run silently cancels the already-queued second run when cancel-in-progress is false'
3
+ category: concurrency-timing
4
+ severity: silent-failure
5
+ tags:
6
+ - concurrency
7
+ - cancel-in-progress
8
+ - pending
9
+ - queue
10
+ - silent-cancellation
11
+ - overflow
12
+ patterns:
13
+ - regex: 'Canceling since a higher priority waiting run was found'
14
+ flags: 'i'
15
+ - regex: 'This run was automatically cancelled'
16
+ flags: 'i'
17
+ error_messages:
18
+ - 'Canceling since a higher priority waiting run was found'
19
+ - 'This run was automatically cancelled'
20
+ root_cause: |
21
+ GitHub Actions concurrency groups with cancel-in-progress: false allow at most one
22
+ run to be in-progress and one run to be pending (queued) simultaneously per group.
23
+ This queue depth is exactly 1, not unlimited.
24
+
25
+ When a third run arrives while run 1 is in-progress and run 2 is pending:
26
+ - Run 2 (the pending run) is immediately and silently cancelled
27
+ - Run 3 takes run 2's pending slot
28
+
29
+ The user who triggered run 2 typically sees it flip from "Queued" to "Cancelled"
30
+ with no notification and no failure alert. From their perspective their commit's CI
31
+ simply disappeared.
32
+
33
+ This behavior is documented in GitHub docs but surprises teams that expect a FIFO
34
+ queue of unlimited depth. The concurrency feature is a mutex with a single
35
+ waiting-room slot, not a job scheduler queue.
36
+
37
+ Common scenarios where this silently drops a queued run:
38
+ - Rapid-fire merges to a protected branch with slow integration tests
39
+ - Multiple developers pushing within seconds of each other to the same branch
40
+ - Automated commits (dependency updates, release bots) arriving while CI runs
41
+ fix: |
42
+ Option 1 — Accept the overflow: intended behavior for fast-merge scenarios where only
43
+ the LATEST commit needs CI. No change needed.
44
+
45
+ Option 2 — Widen the concurrency key so each commit gets its own slot:
46
+ group: ${{ github.workflow }}-${{ github.sha }}
47
+ This disables cancellation entirely, but it also disables serialization: runs for different commits can execute in parallel.
48
+
49
+ Option 3 — Use cancel-in-progress: true explicitly if "latest wins" is the desired
50
+ semantics. In-progress runs cancel rather than queued runs disappearing silently.
51
+
52
+ Option 4 — Queue at the runner group level by using a self-hosted runner group with
53
+ a concurrency limit to provide true multi-run queuing.
54
+ fix_code:
55
+ - language: yaml
56
+ label: 'Common mistake: expecting cancel-in-progress: false to queue all pending runs indefinitely'
57
+ code: |
58
+ concurrency:
59
+ group: ${{ github.workflow }}-${{ github.ref }}
60
+ cancel-in-progress: false
61
+ # Only 1 run can be pending; a 3rd arriving run silently cancels the queued 2nd
62
+ - language: yaml
63
+ label: 'Option A: per-commit group key — every run completes, no cancellation at all'
64
+ code: |
65
+ concurrency:
66
+ group: ${{ github.workflow }}-${{ github.sha }}
67
+ cancel-in-progress: false
68
+ - language: yaml
69
+ label: 'Option B: cancel-in-progress: true — explicit latest-wins, in-progress runs cancelled not pending ones'
70
+ code: |
71
+ concurrency:
72
+ group: ${{ github.workflow }}-${{ github.ref }}
73
+ cancel-in-progress: true
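+   # A minimal sketch under stated assumptions: cancel-in-progress accepts an
+   # expression, so one group definition can give latest-wins behaviour on feature
+   # branches while leaving in-progress runs on the default branch alone. The ref
+   # refs/heads/main is an assumption; on that branch the single pending slot
+   # described above still applies.
+   - language: yaml
+     label: 'Option C (sketch): latest-wins on feature branches, never cancel in-progress runs on main'
+     code: |
+       concurrency:
+         group: ${{ github.workflow }}-${{ github.ref }}
+         # true on every ref except main, where an in-progress run is left to finish
+         cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}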
74
+ prevention:
75
+ - 'Understand that cancel-in-progress: false does not create an unlimited queue — it allows exactly one pending run per concurrency group key'
76
+ - 'For workflows where no commit should be skipped, use per-commit group keys (${{ github.sha }}) to guarantee every run completes; runs then execute in parallel, so this does not suit deployments that must be serialized'
77
+ - 'Monitor the Actions tab during rapid-push periods to verify queued runs are completing, not silently disappearing'
78
+ - 'Prefer cancel-in-progress: true when only the latest result matters; the cancellation is explicit and visible rather than silent'
79
+ docs:
80
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/using-concurrency'
81
+ label: 'GitHub Docs: Using concurrency'
82
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/using-concurrency#example-only-cancel-in-progress-jobs-or-runs-for-the-current-workflow'
83
+ label: 'GitHub Docs: Concurrency — one pending slot per group'
@@ -0,0 +1,87 @@
1
+ id: known-unsolved-062
2
+ title: 'workflow_run chains are limited to one level — a workflow_run-triggered workflow cannot trigger another downstream workflow via workflow_run'
3
+ category: known-unsolved
4
+ severity: limitation
5
+ tags:
6
+ - workflow-run
7
+ - chaining
8
+ - pipeline
9
+ - limitation
10
+ - no-fix
11
+ - event-trigger
12
+ patterns:
13
+ - regex: 'on:\s*\n\s+workflow_run:'
14
+ flags: 'i'
15
+ error_messages: []
16
+ root_cause: |
17
+ GitHub Actions explicitly prevents workflow_run events from chaining more than one
18
+ level deep. A workflow triggered by workflow_run CANNOT itself use workflow_run as
19
+ an on: trigger to fire a third downstream workflow.
20
+
21
+ From GitHub documentation: "A workflow triggered by a workflow_run event can only
22
+ be triggered by a workflow that is not itself triggered by a workflow_run event."
23
+
24
+ This restriction prevents infinite trigger loops but also prevents building linear
25
+ CI/CD pipelines using workflow_run chaining alone. Multi-stage pipelines of the form
26
+ Build (push) → Test (workflow_run) → Deploy (workflow_run) → Notify (workflow_run)
27
+ fail silently at the second hop: the Deploy and Notify workflows never appear in the
28
+ Actions tab and no error is raised anywhere.
29
+
30
+ There is no runtime error, no annotation, and no warning. The on: workflow_run
31
+ trigger on the downstream workflow is simply never evaluated.
32
+ fix: |
33
+ Replace the second-hop workflow_run trigger with an explicit dispatch from the
34
+ first downstream workflow:
35
+
36
+ Option 1 — repository_dispatch: use the GitHub REST API from a job step in workflow
37
+ B to POST to /repos/{owner}/{repo}/dispatches with a custom event_type. Workflow C
38
+ listens on on: repository_dispatch with a matching types: filter.
39
+
40
+ Option 2 — workflow_dispatch: use gh workflow run from a step in workflow B to
41
+ directly trigger workflow C by filename. Requires a GitHub token with actions:write.
42
+
43
+ Option 3 — Consolidate: merge the second and third workflows into a single workflow
44
+ with job dependencies (needs:) eliminating the cross-workflow hop entirely.
45
+ fix_code:
46
+ - language: yaml
47
+ label: 'Does NOT work: workflow_run cannot chain more than one level deep'
48
+ code: |
49
+ # Workflow C — this trigger is never evaluated when Workflow B is
50
+ # itself triggered by workflow_run
51
+ on:
52
+ workflow_run:
53
+ workflows: ["B - Integration Tests"]
54
+ types: [completed]
55
+ - language: yaml
56
+ label: 'Fix: dispatch workflow C via repository_dispatch from workflow B'
57
+ code: |
58
+ # In workflow B (intermediate workflow, triggered by workflow_run):
59
+ jobs:
60
+ dispatch-downstream:
61
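+ runs-on: ubuntu-latest
+ permissions: { contents: write } # creating a repository_dispatch event needs write access to contents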
62
+ if: ${{ github.event.workflow_run.conclusion == 'success' }}
63
+ steps:
64
+ - name: Trigger workflow C via repository_dispatch
65
+ run: |
66
+ gh api repos/${{ github.repository }}/dispatches \
67
+ --method POST \
68
+ --raw-field event_type=run-deploy \
69
+ --raw-field 'client_payload[run_id]=${{ github.event.workflow_run.id }}'
70
+ env:
71
+ GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
72
+
73
+ # In workflow C:
74
+ on:
75
+ repository_dispatch:
76
+ types: [run-deploy]
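+   # A hedged sketch of Option 2. The file name deploy.yml and the input name
+   # upstream_run_id are hypothetical; gh workflow run with --repo and -f is the
+   # standard GitHub CLI way to create a workflow_dispatch event.
+   - language: yaml
+     label: 'Sketch: dispatch workflow C via workflow_dispatch (gh workflow run) from workflow B'
+     code: |
+       # In workflow B (intermediate workflow, triggered by workflow_run):
+       jobs:
+         dispatch-downstream:
+           runs-on: ubuntu-latest
+           if: ${{ github.event.workflow_run.conclusion == 'success' }}
+           permissions:
+             actions: write   # needed to create a workflow_dispatch event
+           steps:
+             - name: Trigger workflow C via workflow_dispatch
+               run: |
+                 gh workflow run deploy.yml \
+                   --repo "$GITHUB_REPOSITORY" \
+                   -f upstream_run_id="${{ github.event.workflow_run.id }}"
+               env:
+                 GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
+       # In workflow C (.github/workflows/deploy.yml):
+       on:
+         workflow_dispatch:
+           inputs:
+             upstream_run_id:
+               description: 'ID of the workflow B run that requested this deploy'
+               required: false
+               type: string
+   # A hedged sketch of Option 3. The workflow name "A - Build", the job names, and
+   # the echo steps are placeholders standing in for real test and deploy commands.
+   - language: yaml
+     label: 'Sketch: consolidate test and deploy into one workflow with needs:'
+     code: |
+       on:
+         workflow_run:
+           workflows: ["A - Build"]
+           types: [completed]
+       jobs:
+         test:
+           if: ${{ github.event.workflow_run.conclusion == 'success' }}
+           runs-on: ubuntu-latest
+           steps:
+             - run: echo "run integration tests here"
+         deploy:
+           needs: test   # starts only after the test job succeeds
+           runs-on: ubuntu-latest
+           steps:
+             - run: echo "deploy here"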
77
+ prevention:
78
+ - 'Design CI/CD pipelines assuming workflow_run allows only one hop; use repository_dispatch or workflow_dispatch for any second-level chaining'
79
+ - 'Prefer consolidating multi-stage pipelines into a single workflow with job dependencies (needs:) when the stages always execute in sequence'
80
+ - 'When a downstream workflow never appears in the Actions tab after merging, verify whether its on: trigger is workflow_run and whether its upstream workflow is also workflow_run-triggered'
81
+ docs:
82
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#workflow_run'
83
+ label: 'GitHub Docs: workflow_run event — one-level-deep restriction'
84
+ - url: 'https://docs.github.com/en/rest/repos/repos#create-a-repository-dispatch-event'
85
+ label: 'GitHub REST API: Create a repository dispatch event'
86
+ - url: 'https://cli.github.com/manual/gh_workflow_run'
87
+ label: 'GitHub CLI: gh workflow run (alternative dispatch method)'
@@ -0,0 +1,102 @@
1
+ id: known-unsolved-063
2
+ title: 'REST API reports runner busy:false while broker shows runner actively executing a job'
3
+ category: known-unsolved
4
+ severity: silent-failure
5
+ tags:
6
+ - self-hosted
7
+ - runner
8
+ - autoscaler
9
+ - rest-api
10
+ - broker
11
+ - state-desync
12
+ - v2-flow
13
+ - job-killed
14
+ patterns:
15
+ - regex: 'busy.*false.*runner.*executing|runner.*busy.*false.*job'
16
+ flags: 'i'
17
+ - regex: '"busy"\s*:\s*false'
18
+ flags: 'i'
19
+ error_messages:
20
+ - '"busy": false'
21
+ - 'GET /repos/{owner}/{repo}/actions/runners/{id} → {"busy": false}'
22
+ root_cause: |
23
+ On non-ephemeral self-hosted runners using the V2 broker flow
24
+ (`broker.actions.githubusercontent.com`), a state desynchronization exists between
25
+ the broker service and the GitHub REST API:
26
+
27
+ - The broker correctly tracks runner state in real-time: after picking up Job B, the
28
+ runner reports `JobState: Busy` to the broker and renews its job lease every 60s.
29
+ - However, `GET /repos/{owner}/{repo}/actions/runners/{runner_id}` (the public REST
30
+ API) continues to return `"busy": false` during the early phase of job execution.
31
+ The REST API state may only update after the runner's next periodic sync, which
32
+ can lag 30–120 seconds behind the broker state.
33
+
34
+ Auto-scaling tools that rely on the REST API to identify idle runners (e.g.,
35
+ `github-aws-runners/terraform-aws-github-runner`, KEDA GitHub Actions scaler,
36
+ custom Lambda/CloudFunction scalers) interpret `busy: false` as "runner is idle and
37
+ safe to terminate." This causes the autoscaler to terminate an EC2/GCE/Azure instance
38
+ mid-job — killing the runner process with no Actions-level error and marking the job
39
+ as failed with a runner disconnection error.
40
+
41
+ From the affected job's perspective, the log ends mid-step with "The runner has
42
+ received a shutdown signal" or the job times out. There is no annotation indicating
43
+ the root cause was an autoscaler decision based on stale REST API data.
44
+
45
+ No GitHub-side fix is available as of June 2026. The REST API does not expose a
46
+ real-time busy status consistent with the broker. Open at actions/runner#4422.
47
+ fix: |
48
+ There is no complete fix — this is a known state inconsistency in the GitHub platform.
49
+
50
+ Workarounds (choose one based on your autoscaling setup):
51
+
52
+ 1. **Switch to ephemeral JIT runners (recommended)**: Use JIT tokens and terminate
53
+ runners after exactly one job. An autoscaler never has to judge whether such a runner
54
+ is idle, because the runner is removed from the repository as soon as its job ends.
55
+
56
+ 2. **Add a grace period before termination**: When your autoscaler sees `busy: false`,
57
+ wait 2–3 minutes and re-poll before actually terminating. This covers the lag
58
+ between broker state and REST API state.
59
+
60
+ 3. **Poll job status instead of runner status**: Use
61
+ `GET /repos/{owner}/{repo}/actions/runs` to check for `in_progress` workflow runs
62
+ before terminating any runner, rather than relying on per-runner `busy` status.
63
+
64
+ 4. **Use runner labels + job assignment**: If your autoscaler assigns specific runners
65
+ to specific jobs via labels, you can cross-reference queued/in-progress job
66
+ assignments against runner IDs before terminating.
67
+ fix_code:
68
+ - language: yaml
69
+ label: 'Example: Switch to ephemeral JIT runners (removes the desync window entirely)'
70
+ code: |
71
+ # Use JIT runner registration in your autoscaler
72
+ # Each runner handles exactly one job — busy/idle desync cannot occur
73
+ # See: https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions#using-just-in-time-runners
74
+
75
+ # In your autoscaler provisioning logic:
76
+ # POST /repos/{owner}/{repo}/actions/runners/generate-jit-config
77
+ # → Use the returned jit_config to start an ephemeral runner
78
+ # → Runner auto-deregisters after job completes — no stale REST state possible
79
+ - language: yaml
80
+ label: 'Example: Grace period before termination (autoscaler pseudocode)'
81
+ code: |
82
+ # In your autoscaler Lambda/script, before terminating an instance:
83
+ # 1. GET /repos/{owner}/{repo}/actions/runners/{runner_id}
84
+ # 2. If busy == false, wait 2 minutes
85
+ # 3. Re-poll: GET /repos/{owner}/{repo}/actions/runners/{runner_id}
86
+ # 4. Only terminate if STILL busy == false after the grace period
87
+
88
+ # This covers the broker→REST lag window (~30-120s observed in practice)
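+   # A hedged sketch of workaround 3. REPO and the runner ID argument are placeholders;
+   # it assumes the gh CLI is installed and authenticated, reads only the first page of
+   # results, and has not been checked against the broker lag described above.
+   - language: bash
+     label: 'Sketch: check in-progress jobs for a runner ID before terminating it'
+     code: |
+       #!/usr/bin/env bash
+       set -euo pipefail
+       REPO="octo-org/octo-repo"   # placeholder owner/repo
+       RUNNER_ID="$1"              # numeric runner ID the autoscaler wants to terminate
+
+       # List in-progress workflow runs, then look for a job assigned to this runner
+       for run_id in $(gh api "repos/$REPO/actions/runs?status=in_progress" --jq '.workflow_runs[].id'); do
+         hits=$(gh api "repos/$REPO/actions/runs/$run_id/jobs" \
+           --jq "[.jobs[] | select(.status == \"in_progress\" and .runner_id == $RUNNER_ID)] | length")
+         if [ "$hits" -gt 0 ]; then
+           echo "runner $RUNNER_ID is executing a job in run $run_id; do not terminate"
+           exit 1
+         fi
+       done
+       echo "no in-progress job found for runner $RUNNER_ID"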
89
+ prevention:
90
+ - "Prefer ephemeral JIT runners for any workload where mid-job termination would be costly; the broker-REST desync window is zero for single-job-per-runner setups."
91
+ - "Never terminate a runner instance based solely on a single REST API `busy: false` reading — always double-check with a grace period or secondary signal."
92
+ - "Monitor for jobs that end with 'runner has received a shutdown signal' — this is a reliable indicator that a runner was terminated externally mid-job."
93
+ - "If using terraform-aws-github-runner or similar, check whether the tool version has built-in grace periods for the busy-state lag."
94
+ docs:
95
+ - url: 'https://github.com/actions/runner/issues/4422'
96
+ label: 'actions/runner#4422 — /runners REST API reports busy:false for active runner'
97
+ - url: 'https://docs.github.com/en/rest/actions/self-hosted-runners'
98
+ label: 'REST API: Self-hosted runners'
99
+ - url: 'https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions#using-just-in-time-runners'
100
+ label: 'Just-in-time runners documentation'
101
+ - url: 'https://github.com/github-aws-runners/terraform-aws-github-runner'
102
+ label: 'terraform-aws-github-runner (commonly affected autoscaler)'
@@ -0,0 +1,94 @@
1
+ id: permissions-auth-068
2
+ title: 'upload-code-coverage action fails with 403 — missing code-quality:write permission'
3
+ category: permissions-auth
4
+ severity: error
5
+ tags:
6
+ - permissions
7
+ - code-quality
8
+ - upload-code-coverage
9
+ - github-token
10
+ - 403
11
+ - fine-grained
12
+ patterns:
13
+ - regex: 'Resource not accessible by integration'
14
+ flags: 'i'
15
+ - regex: 'Upload failed.*403|HTTP 403.*code.coverage|code.coverage.*403'
16
+ flags: 'i'
17
+ - regex: 'code-quality.*write|code_quality.*write'
18
+ flags: 'i'
19
+ error_messages:
20
+ - '{"message":"Resource not accessible by integration","documentation_url":"https://docs.github.com/rest"}'
21
+ - 'Error: Upload failed: HTTP 403 Forbidden'
22
+ - 'HTTP Status: 403'
23
+ root_cause: |
24
+ GitHub's code coverage upload API (introduced May 2026 as part of Code Quality for
25
+ pull requests) requires the new fine-grained permission `code-quality:write` on the
26
+ calling token. The default GITHUB_TOKEN in GitHub Actions workflows has this permission
27
+ set to `none` unless explicitly granted.
28
+
29
+ When `actions/upload-code-coverage` calls the coverage upload API without this
30
+ permission, GitHub returns HTTP 403 "Resource not accessible by integration". Because
31
+ `code-quality:write` is a newly introduced permission class (not present in older
32
+ workflow permission schemas), developers familiar with the standard permissions
33
+ (contents, issues, pull-requests, etc.) don't know to add it.
34
+
35
+ This affects workflows that do not specify `permissions:` at all (the default token
36
+ permissions come from the repository or organization setting, and `code-quality` stays
37
+ `none` either way), as well as workflows that set `permissions: {}` or a restrictive block.
38
+ fix: |
39
+ Add `code-quality: write` to the `permissions` block of the job that runs
40
+ `actions/upload-code-coverage`. This grants the GITHUB_TOKEN the required scope to
41
+ call the code coverage upload API.
42
+
43
+ Note: `code-quality:` is a job-level permission. It cannot be set as a global
44
+ `GITHUB_TOKEN` permission through repository settings — it must be declared in the
45
+ workflow YAML per-job.
46
+ fix_code:
47
+ - language: yaml
48
+ label: 'Add code-quality:write to the coverage upload job'
49
+ code: |
50
+ jobs:
51
+ test:
52
+ runs-on: ubuntu-latest
53
+ permissions:
54
+ contents: read
55
+ code-quality: write # Required for upload-code-coverage
56
+ steps:
57
+ - uses: actions/checkout@v4
58
+
59
+ - name: Run tests and generate coverage
60
+ run: pytest --cov=src --cov-report=xml
61
+
62
+ - name: Upload code coverage
63
+ uses: actions/upload-code-coverage@v1
64
+ with:
65
+ file: coverage.xml
66
+ language: Python
67
+ - language: yaml
68
+ label: 'Minimal permissions block if no others are needed'
69
+ code: |
70
+ jobs:
71
+ coverage:
72
+ runs-on: ubuntu-latest
73
+ permissions:
74
+ code-quality: write
75
+ steps:
76
+ - uses: actions/upload-code-coverage@v1
77
+ with:
78
+ file: cobertura.xml
79
+ language: Java
80
+ label: code-coverage/jacoco
81
+ prevention:
82
+ - "Whenever you add actions/upload-code-coverage to a workflow, immediately add `code-quality: write` to the job's permissions block."
83
+ - "Use a linter or policy-as-code tool (e.g., Poutine, StepSecurity) that validates required permissions against known action requirements."
84
+ - "If your org requires `permissions: {}` at the workflow level for security hardening, remember that `code-quality: write` must still be declared per-job."
85
+ - "Check the GitHub Changelog periodically — new actions introduce new permission classes that aren't reflected in older documentation or IDE auto-complete."
86
+ docs:
87
+ - url: 'https://github.blog/changelog/2026-05-26-code-coverage-in-pull-requests-is-now-in-public-preview/'
88
+ label: 'GitHub Changelog: Code coverage in pull requests (May 26, 2026)'
89
+ - url: 'https://github.com/actions/upload-code-coverage'
90
+ label: 'actions/upload-code-coverage repository'
91
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/controlling-permissions-for-github_token'
92
+ label: 'Controlling permissions for GITHUB_TOKEN'
93
+ - url: 'https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/enabling-features-for-your-repository/managing-github-actions-settings-for-a-repository'
94
+ label: 'GitHub Actions permissions documentation'
@@ -0,0 +1,93 @@
1
+ id: runner-environment-199
2
+ title: 'Windows self-hosted runner V2 broker listener stops polling after first job (2.334.0)'
3
+ category: runner-environment
4
+ severity: error
5
+ tags:
6
+ - self-hosted
7
+ - windows
8
+ - runner-v2
9
+ - broker
10
+ - listener
11
+ - stuck
12
+ - message-queue
13
+ - v2-flow
14
+ patterns:
15
+ - regex: 'Get messages has been cancelled using local token source\. Continue to get messages with new status'
16
+ flags: 'i'
17
+ - regex: 'BrokerMessageListener.*cancel|cancel.*BrokerMessageListener'
18
+ flags: 'i'
19
+ error_messages:
20
+ - 'Get messages has been cancelled using local token source. Continue to get messages with new status.'
21
+ root_cause: |
22
+ On Windows self-hosted runners running v2.334.0 with the V2 broker flow enabled
23
+ (`useV2Flow: true`, connecting to `broker.actions.githubusercontent.com`), the
24
+ `BrokerMessageListener` stops issuing GET /message requests after the first job
25
+ completes and the runner transitions from Busy→Online (idle).
26
+
27
+ The root cause appears to be a race condition in the V2 state machine reset path: when the
28
+ Busy→Online transition fires, the existing `localTokenSource` for the polling loop
29
+ is cancelled (logging "Get messages has been cancelled using local token source.
30
+ Continue to get messages with new status."), but the replacement polling loop does
31
+ not start, possibly due to a missing null-check or async continuation bug on Windows. OAuth
32
+ token refreshes continue normally (masking the hang — credentials are not the issue),
33
+ but no new GET /message requests are dispatched.
34
+
35
+ The runner UI shows the runner as "Idle" indefinitely. Any jobs queued after the
36
+ first run sit in "Queued" status and never start until the runner service is manually
37
+ restarted. The issue is specific to Windows (the identical code path on Linux/macOS
38
+ appears unaffected) and requires a service start + exactly one job completion to
39
+ trigger. Reproduced reliably across multiple self-hosted Windows environments.
40
+
41
+ Reported against runner v2.334.0, commit f1995ede5d885c997d13d8eca5467c4ce97fe69c.
42
+ No fix available as of June 2026. Open at actions/runner#4444.
43
+ fix: |
44
+ Immediate workaround: restart the runner service after each job, or add an automated
45
+ watchdog that monitors the runner diagnostics log for the stale-listener pattern and
46
+ triggers a service restart.
47
+
48
+ Longer-term: downgrade to runner v2.333.x (configured with --disableupdate so it does
49
+ not auto-update back to 2.334.0) until actions/runner#4444 is resolved. Check the
50
+ runner release notes for a fix to the V2 BrokerMessageListener state-reset path.
51
+ fix_code:
52
+ - language: powershell
53
+ label: 'PowerShell watchdog snippet to restart runner service on listener hang (add to a monitoring script)'
54
+ code: |
55
+ # Watchdog: detect stale BrokerMessageListener and restart runner service
56
+ # Run this as a scheduled task every 5 minutes on the runner host
57
+
58
+ $diagDir = "C:\actions-runner\_diag"
59
+ $logFile = Get-ChildItem $diagDir -Filter "Runner_*.log" | Sort-Object LastWriteTime -Descending | Select-Object -First 1
60
+ $lastMsg = (Get-Content $logFile.FullName -Tail 200 | Select-String "BrokerMessageListener") | Select-Object -Last 1
61
+
62
+ # If the last BrokerMessageListener entry is the cancellation message and its UTC
63
+ # timestamp (assumed log format "[yyyy-MM-dd HH:mm:ssZ ...") is over 10 minutes old, restart
64
+ if ("$lastMsg" -match '^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})Z.*cancelled using local token source') {
65
+ $entryTime = [datetime]::ParseExact($Matches[1], 'yyyy-MM-dd HH:mm:ss', [cultureinfo]::InvariantCulture)
66
+ if ([datetime]::UtcNow - $entryTime -gt [TimeSpan]::FromMinutes(10)) {
67
+ Restart-Service actions.runner.*
68
+ }
69
+ }
70
+ - language: yaml
71
+ label: 'Pin runner version to 2.333.x in ARC or self-hosted runner config'
72
+ code: |
73
+ # For ARC (Actions Runner Controller) — pin runner version in values.yaml
74
+ githubConfigSecret: pre-defined-secret
75
+ template:
76
+ spec:
77
+ containers:
78
+ - name: runner
79
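+ image: ghcr.io/actions/actions-runner:2.333.0 # Pin below 2.334.0 until #4444 is fixed
80
+ command: ["/home/runner/run.sh"]
81
+ # The runner version comes from the image tag; this assumes a 2.333.0 tag is published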
82
+ prevention:
83
+ - "Subscribe to the actions/runner release feed to catch regression fixes; downgrade immediately when a V2-flow listener regression is reported."
84
+ - "For Windows runner pools, add a service watchdog (Task Scheduler or Prometheus alert) that detects listener staleness via the _diag/Runner_*.log pattern."
85
+ - "Consider switching to ephemeral JIT runners on Windows — ephemeral runners cannot hit this bug because each runner handles exactly one job then exits."
86
+ - "Monitor runner queue depth in your autoscaler; an idle runner with > 0 queued jobs is a reliable signal the listener has stalled."
87
+ docs:
88
+ - url: 'https://github.com/actions/runner/issues/4444'
89
+ label: 'actions/runner#4444 — Listener stops polling broker after first job (2.334.0, Windows, V2 flow)'
90
+ - url: 'https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners'
91
+ label: 'About self-hosted runners'
92
+ - url: 'https://github.com/actions/runner/releases'
93
+ label: 'Actions Runner release notes'
@@ -0,0 +1,94 @@
1
+ id: runner-environment-198
2
+ title: 'setup-python fails on macOS self-hosted runner with "ln: python3XX: Permission denied" — symlink created without sudo after sudo install'
3
+ category: runner-environment
4
+ severity: error
5
+ tags:
6
+ - setup-python
7
+ - self-hosted
8
+ - macos
9
+ - symlink
10
+ - permissions
11
+ - sudo
12
+ patterns:
13
+ - regex: 'ln:.*python3\d+.*Permission denied'
14
+ flags: 'i'
15
+ - regex: 'Error:.*ln:.*python.*Permission denied'
16
+ flags: 'i'
17
+ error_messages:
18
+ - 'Error: ln: python314: Permission denied'
19
+ - 'Error: ln: python312: Permission denied'
20
+ - 'Error: ln: python313: Permission denied'
21
+ root_cause: |
22
+ The setup-python macOS installer script (setup.sh) uses sudo to install the Python
23
+ framework into /Library/Frameworks/ — a root-owned directory. However, the
24
+ subsequent step that creates symlinks inside that directory (e.g., linking
25
+ python3.14 → python3) runs WITHOUT sudo, causing a permission error when the
26
+ runner's CI user is a standard (non-root) account even if sudo is available for
27
+ the install step.
28
+
29
+ The error appears as:
30
+ Error: ln: python314: Permission denied
31
+
32
+ This affects macOS self-hosted runners (EC2, bare-metal, VM) where the runner
33
+ service is configured as a non-admin user. GitHub-hosted macOS runners are
34
+ unaffected because they grant passwordless sudo to the runner user.
35
+
36
+ A fix was proposed in actions/python-versions PR #384 to use sudo for the symlink
37
+ creation step as well, but is not yet merged/released as of mid-2026.
38
+ fix: |
39
+ Short-term workaround: Grant the runner service user passwordless sudo on the
40
+ macOS runner machine, or run the runner as a user with write access to
41
+ /Library/Frameworks/ and /usr/local/bin.
42
+
43
+ Safer workaround: Pre-install Python on the macOS runner machine using the
44
+ official Python.org pkg installer (which creates symlinks correctly as root), then
45
+ call that interpreter directly and skip setup-python. setup-python resolves versions
46
+ from the runner tool cache, so a PATH-only install does not stop it running setup.sh.
47
+
48
+ Enterprise workaround: Use a custom RUNNER_TOOL_CACHE path that is owned by the
49
+ runner user and set python-version to a version already cached there, bypassing
50
+ the framework installer entirely.
51
+ fix_code:
52
+ - language: yaml
53
+ label: 'Workaround: use pre-installed Python to avoid the installer'
54
+ code: |
55
+ jobs:
56
+ build:
57
+ runs-on: [self-hosted, macos]
58
+ steps:
59
+ # If Python 3.14 is pre-installed system-wide (via pkg installer as root), call it
60
+ # directly. setup-python reuses only versions found in the runner tool cache, so
61
+ # it would still invoke setup.sh for an interpreter that is only on PATH.
62
+ - name: Use the pre-installed interpreter
63
+ run: |
64
+ python3.14 --version
65
+ - language: yaml
66
+ label: 'Workaround: grant passwordless sudo to runner user via /etc/sudoers (NOT recommended for production)'
67
+ code: |
68
+ # Add to /etc/sudoers on the macOS runner machine (use visudo):
69
+ # runner-user ALL=(ALL) NOPASSWD: ALL
70
+ #
71
+ # This allows setup-python's installer to use sudo for both the framework
72
+ # install and symlink creation steps.
73
+ #
74
+ # In your workflow, no changes are needed — setup-python works normally once
75
+ # the runner user has the required sudo permissions.
76
+ jobs:
77
+ build:
78
+ runs-on: [self-hosted, macos]
79
+ steps:
80
+ - uses: actions/setup-python@v5
81
+ with:
82
+ python-version: '3.14'
83
+ prevention:
84
+ - 'Use pre-installed Python on macOS self-hosted runners to avoid the setup.sh installer and its sudo/symlink requirements'
85
+ - 'When provisioning macOS self-hosted runners, ensure the runner user has passwordless sudo or pre-install all required Python versions'
86
+ - 'Track actions/python-versions#384 for an upstream fix that adds sudo to the symlink creation step'
87
+ - 'Test setup-python on macOS self-hosted runners in a staging environment before deploying to production'
88
+ docs:
89
+ - url: 'https://github.com/actions/setup-python/issues/1301'
90
+ label: 'actions/setup-python#1301: Permission denied when creating symlinks on macOS self-hosted runners (2026)'
91
+ - url: 'https://github.com/actions/python-versions/pull/384'
92
+ label: 'actions/python-versions PR #384: fix symlink creation to use sudo (pending)'
93
+ - url: 'https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners#self-hosted-runner-security'
94
+ label: 'GitHub Docs: Self-hosted runner security'
@@ -0,0 +1,101 @@
1
+ id: runner-environment-197
2
+ title: 'setup-python fails on Windows self-hosted runner without administrator permissions — InstallAllUsers=1 MSI installer requires admin'
3
+ category: runner-environment
4
+ severity: error
5
+ tags:
6
+ - setup-python
7
+ - self-hosted
8
+ - windows
9
+ - admin-permissions
10
+ - msi-installer
11
+ patterns:
12
+ - regex: 'Error happened during Python installation'
13
+ flags: 'i'
14
+ - regex: 'InstallAllUsers=1'
15
+ flags: 'i'
16
+ - regex: 'setup-python.*self.hosted.*windows.*fail'
17
+ flags: 'i'
18
+ error_messages:
19
+ - 'Error happened during Python installation'
20
+ root_cause: |
21
+ The setup-python action installs Python using the official MSI installer with the
22
+ flag InstallAllUsers=1. This flag installs Python into the shared runner tool cache
23
+ (RUNNER_TOOL_CACHE) using the DefaultAllUsersTargetDir argument, which requires
24
+ administrator privileges to write to the shared system-level installation path.
25
+
26
+ When the GitHub Actions runner service is running under a standard user account
27
+ without local administrator permissions, the Python MSI installer fails because it
28
+ cannot write to the required system paths, resulting in:
29
+
30
+ Error happened during Python installation
31
+
32
+ This affects Windows self-hosted runners configured as a Windows service under a
33
+ non-admin account. GitHub-hosted Windows runners are unaffected because their jobs
34
+ run under an account with administrator rights. Linux and macOS self-hosted runners are unaffected.
35
+
36
+ Note: this is by design per the action maintainers — InstallAllUsers=1 is required
37
+ to correctly populate the shared RUNNER_TOOL_CACHE. A per-user install mode
38
+ (InstallAllUsers=0) is under consideration as a future enhancement.
39
+ fix: |
40
+ Primary fix: Configure the Windows runner service to run as a local administrator
41
+ account. In Windows Services (services.msc), change the "Log On" account for the
42
+ GitHub Actions Runner service to an account with local admin rights, or install the
43
+ runner using the .\config.cmd --runasservice flow with an admin account.
44
+
45
+ Workaround 1: Pre-install the required Python version system-wide on the runner
46
+ machine and call it directly, skipping setup-python. setup-python resolves versions
47
+ from RUNNER_TOOL_CACHE, so a copy that is only on PATH does not prevent the MSI install.
48
+
49
+ Workaround 2: Ensure the runner is run interactively (not as a service) under an
50
+ account with admin rights during initial Python installation, then cache the tool.
51
+
52
+ Workaround 3: Use actions/setup-python with a self-hosted runner image that already
53
+ has the required Python version in RUNNER_TOOL_CACHE, bypassing the MSI install
54
+ step entirely.
55
+ fix_code:
56
+ - language: yaml
57
+ label: 'Workflow: no changes needed — fix is on the runner machine configuration'
58
+ code: |
59
+   # Fix requires reconfiguring the Windows runner service account:
60
+   # 1. Open Services (services.msc) on the runner machine
61
+   # 2. Find "GitHub Actions Runner (<name>)"
62
+   # 3. Right-click → Properties → Log On tab
63
+   # 4. Change to an account with local administrator privileges
64
+   # 5. Restart the service
65
+   #
66
+   # After reconfiguring, setup-python works normally:
67
+   jobs:
68
+     build:
69
+       runs-on: [self-hosted, windows]
70
+       steps:
71
+         - uses: actions/setup-python@v5
72
+           with:
73
+             python-version: '3.12'
74
+         - run: python --version
75
+ - language: yaml
76
+ label: 'Workaround: use pre-installed Python from PATH if runner has it installed system-wide'
77
+ code: |
78
+   jobs:
79
+     build:
80
+       runs-on: [self-hosted, windows]
81
+       steps:
82
+         # Skip setup-python entirely if Python is pre-installed system-wide
83
+         - name: Verify Python available
84
+           run: python --version
85
+         # OR use setup-python only to add to PATH, not install:
86
+         - uses: actions/setup-python@v5
87
+           with:
88
+             python-version: '3.12'
89
+             # If Python 3.12 is already in RUNNER_TOOL_CACHE, no install occurs
90
+ prevention:
91
+ - 'Always run Windows self-hosted runner services under a local administrator account — setup-python requires admin rights to install into the shared tool cache'
92
+ - 'Test setup-python on self-hosted runners with the same service account before deploying to production workloads'
93
+ - 'Use pre-installed Python on Windows self-hosted runners to avoid the MSI installer entirely when admin rights cannot be granted'
94
+ - 'Document the runner service account requirements in your team''s self-hosted runner setup guide'
95
+ docs:
96
+ - url: 'https://github.com/actions/setup-python/issues/1308'
97
+ label: 'actions/setup-python#1308: fails on self-hosted runners without admin permissions (2026)'
98
+ - url: 'https://github.com/actions/setup-python/blob/main/docs/advanced-usage.md#windows'
99
+ label: 'setup-python docs: Windows advanced usage and self-hosted runner configuration'
100
+ - url: 'https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/configuring-the-self-hosted-runner-application-as-a-service'
101
+ label: 'GitHub Docs: Configuring the self-hosted runner as a service'
@@ -0,0 +1,100 @@
1
+ id: triggers-069
2
+ title: 'push tags: filter silently stops all branch-push triggers — branches: filter must be added separately to also run on branch commits'
3
+ category: triggers
4
+ severity: silent-failure
5
+ tags:
6
+ - push
7
+ - tags-filter
8
+ - branches
9
+ - silent-failure
10
+ - trigger-missing
11
+ - filter-whitelist
12
+ patterns:
13
+ - regex: 'on:\s*\n\s+push:\s*\n\s+tags:'
14
+ flags: 'i'
15
+ error_messages: []
16
+ root_cause: |
17
+ When a workflow specifies on: push: with only a tags: filter and no branches: filter,
18
+ ALL branch push events are silently excluded. The workflow fires exclusively on tag
19
+ pushes matching the tags: pattern.
20
+
21
+ This contradicts the common mental model that tags: adds an extra "also trigger on
22
+ these tags" condition on top of the default "trigger on all branch pushes" behavior.
23
+ In reality, any filter key under on: push: (branches:, tags:, branches-ignore:,
24
+ tags-ignore:, paths:) replaces the implicit "trigger on everything" default with a
25
+ whitelist. Specifying tags: without branches: means only tags match.
26
+
27
+ A workflow with:
28
+   on:
29
+     push:
30
+       tags:
31
+         - 'v*'
32
+
33
+ will never fire when code is pushed to main, feature branches, or any other branch.
34
+ If the team also needs CI on commits to main or pull request merges, a separate
35
+ branches: filter is required in the same on: push: block.
36
+
37
+ This is the inverse of the documented "branches: filter does not block tag pushes"
38
+ behavior (triggers-055). Both stem from the same whitelist-per-filter-key design.
39
+ fix: |
40
+ Add an explicit branches: filter alongside tags: to restore branch-push triggering:
41
+
42
+   on:
43
+     push:
44
+       branches:
45
+         - main
46
+       tags:
47
+         - 'v*'
48
+
49
+ If only tag-based triggering is intended (e.g., a release workflow), the original
50
+ configuration is correct. Document the intent clearly so future maintainers do not
51
+ accidentally add branches: thinking CI was "broken".
52
+
53
+ To trigger on ALL branch pushes plus specific tags, keep both filters in one
54
+ on: push: block and use a catch-all branch pattern: branches: ['**'] alongside
55
+ tags: ['v*']. An unfiltered push trigger cannot be combined with a tags: filter,
56
+ because on: accepts only one push: key and any filter turns it into a whitelist.
57
+ fix_code:
58
+ - language: yaml
59
+ label: 'Bug: tags: filter silently excludes all branch-push events'
60
+ code: |
61
+   on:
62
+     push:
63
+       tags:
64
+         - 'v*'
65
+   # Workflow NEVER runs on branch pushes — only fires when a v* tag is pushed
66
+ - language: yaml
67
+ label: 'Fix: add branches: filter to also run on branch commits'
68
+ code: |
69
+   on:
70
+     push:
71
+       branches:
72
+         - main
73
+         - 'release/**'
74
+       tags:
75
+         - 'v*'
76
+
77
+   jobs:
78
+     ci:
79
+       runs-on: ubuntu-latest
80
+       steps:
81
+         - uses: actions/checkout@v4
82
+         - run: make test
83
+ - language: yaml
84
+ label: 'Intentional tag-only release workflow (correct if branch CI is handled in a separate workflow)'
85
+ code: |
86
+   # release.yml — intentionally runs only on release tags
87
+   on:
88
+     push:
89
+       tags:
90
+         - 'v[0-9]+.[0-9]+.[0-9]+'
91
+ prevention:
92
+ - 'Remember that any filter key under on: push: creates a whitelist — adding tags: without branches: stops all branch-push triggers silently'
93
+ - 'Keep release tag workflows in a dedicated file (release.yml) separate from branch-CI workflows (ci.yml) to avoid accidentally merging their trigger logic'
94
+ - 'After adding a tags: filter to an existing workflow, push a test commit to a branch and verify the workflow appears in the Actions tab'
95
+ - 'Use nektos/act or GitHub Actions VS Code extension to preview which events match a workflow trigger before merging'
96
+ docs:
97
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#push'
98
+ label: 'GitHub Docs: push event filters'
99
+ - url: 'https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#onpushbranchestagsbranches-ignoretags-ignore'
100
+ label: 'GitHub Docs: Workflow syntax — on.push filter keys'
@@ -0,0 +1,110 @@
1
+ id: yaml-syntax-070
2
+ title: "continue-on-error expression using inputs.* silently resolves null at composite action call site \u2014 \"Unexpected value ''\""
3
+ category: yaml-syntax
4
+ severity: error
5
+ tags:
6
+ - continue-on-error
7
+ - composite-actions
8
+ - inputs-context
9
+ - expression-scoping
10
+ - template-validation
11
+ patterns:
12
+ - regex: 'Unexpected value\s+''''$'
13
+ flags: 'i'
14
+ - regex: 'error.*determining.*continue on error'
15
+ flags: 'i'
16
+ - regex: 'The template is not valid.*Unexpected value\s+'''
17
+ flags: 'i'
18
+ error_messages:
19
+ - "Error: .github/workflows/test.yml (Line: N, Col: N): Unexpected value ''"
20
+ - 'Error: The step failed and an error occurred when attempting to determine whether to continue on error.'
21
+ - "Error: The template is not valid. .github/workflows/test.yml (Line: N, Col: N): Unexpected value ''"
22
+ root_cause: |
23
+ When continue-on-error on a uses: step that calls a composite action contains an
24
+ expression referencing inputs.*, the expression is evaluated inside the composite
25
+ action context — not the calling workflow context. Inside the composite action,
26
+ inputs refers to the composite action's own declared inputs, not the caller's
27
+ workflow_dispatch inputs or other workflow-level inputs.
28
+
29
+ If the composite action does not declare that input, inputs.my_flag resolves to
30
+ null, which becomes an empty string in the expression context. The runner then
31
+ tries to parse the empty string as a boolean and emits:
32
+
33
+ Error: Unexpected value ''
34
+
35
+ followed by:
36
+
37
+ Error: The step failed and an error occurred when attempting to determine
38
+ whether to continue on error.
39
+
40
+ The workflow then halts at that step regardless of what the caller intended.
41
+
42
+ This context-crossing is unique to uses: steps (composite actions). For run:
43
+ steps in the same job, inputs.my_flag evaluates in the workflow context and
44
+ works correctly with continue-on-error.
45
+ fix: |
46
+ Replace inputs.* with github.event.inputs.* in the continue-on-error expression
47
+ when calling composite actions from a workflow_dispatch event. github.event.inputs
48
+ is evaluated in the workflow/runner context before the composite action is entered,
49
+ so it resolves correctly.
50
+
51
+ Alternatively, capture the input as a job-level env or step output and reference
52
+ the env var or step output in the continue-on-error expression.
53
+ fix_code:
54
+ - language: yaml
55
+ label: 'Bug: inputs.continue resolves to null inside composite action context'
56
+ code: |
57
+   on:
58
+     workflow_dispatch:
59
+       inputs:
60
+         continue:
61
+           type: boolean
62
+           default: false
63
+
64
+   jobs:
65
+     test:
66
+       runs-on: ubuntu-latest
67
+       steps:
68
+         - uses: my-org/fail-action@main
69
+           # inputs.continue resolves to null here — "Unexpected value ''"
70
+           continue-on-error: ${{ github.event_name == 'workflow_dispatch' && inputs.continue }}
71
+ - language: yaml
72
+ label: 'Fix: use github.event.inputs.* instead of inputs.* in continue-on-error on uses: steps'
73
+ code: |
74
+   on:
75
+     workflow_dispatch:
76
+       inputs:
77
+         continue:
78
+           type: boolean
79
+           default: false
80
+
81
+   jobs:
82
+     test:
83
+       runs-on: ubuntu-latest
84
+       steps:
85
+         - uses: my-org/fail-action@main
86
+           # github.event.inputs evaluates in workflow context, not composite context
87
+           continue-on-error: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.continue == 'true' }}
88
+ - language: yaml
89
+ label: 'Alternative fix: pre-evaluate the flag into an env var'
90
+ code: |
91
+   jobs:
92
+     test:
93
+       runs-on: ubuntu-latest
94
+       env:
95
+         SHOULD_CONTINUE: ${{ inputs.continue }}
96
+       steps:
97
+         - uses: my-org/fail-action@main
98
+           continue-on-error: ${{ env.SHOULD_CONTINUE == 'true' }}
99
+ prevention:
100
+ - 'Never reference inputs.* in continue-on-error on a uses: step — use github.event.inputs.* for workflow_dispatch inputs or env.* for pre-evaluated values'
101
+ - 'Remember that expression contexts differ between run: steps (workflow context) and uses: composite action steps (composite context)'
102
+ - 'Test continue-on-error with expressions locally using nektos/act before merging to catch context-crossing issues'
103
+ - 'If a composite action needs conditional behavior, declare it as an explicit composite input and pass the value from the caller'
104
+ docs:
105
+ - url: 'https://github.com/actions/runner/issues/2418'
106
+ label: 'GitHub runner#2418: continue-on-error with inputs variables in composite actions (bug, 12 reactions)'
107
+ - url: 'https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/evaluate-expressions-in-workflows-and-actions#context-availability'
108
+ label: 'GitHub Docs: Expression context availability'
109
+ - url: 'https://docs.github.com/en/actions/sharing-automations/creating-actions/creating-a-composite-action'
110
+ label: 'GitHub Docs: Creating a composite action'
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@htekdev/actions-debugger",
3
- "version": "1.0.114",
3
+ "version": "1.0.115",
4
4
  "description": "65+ real GitHub Actions errors, queryable by agents. CLI + MCP server + Copilot skills + error database.",
5
5
  "type": "module",
6
6
  "main": "./dist/index.js",