trace-to-skill 0.1.96 → 0.1.98
- package/README.md +8 -4
- package/dist/src/benchmark.js +12 -0
- package/dist/src/benchmark.js.map +1 -1
- package/dist/src/demo.js +17 -0
- package/dist/src/demo.js.map +1 -1
- package/dist/src/init.js +3 -3
- package/dist/src/issueMap.js +6 -0
- package/dist/src/issueMap.js.map +1 -1
- package/dist/src/rules.js +35 -4
- package/dist/src/rules.js.map +1 -1
- package/dist/src/types.d.ts +1 -1
- package/docs/BENCHMARK.md +3 -1
- package/docs/CODEX_GITHUB_ISSUE_PAIN_MAP.md +32 -31
- package/docs/CODEX_ISSUE_MAP.md +5 -2
- package/docs/CODEX_ISSUE_RADAR.md +22 -22
- package/docs/DEMO.md +29 -23
- package/docs/FAILURE_TAXONOMY.md +10 -2
- package/docs/OPENAI_OSS_BRIEF.md +5 -5
- package/docs/SCORECARD.md +4 -2
- package/docs/USE_CASES.md +27 -1
- package/fixtures/codex-cli-no-response.md +46 -0
- package/fixtures/codex-remote-connection.md +24 -0
- package/fixtures/github-codex-issues-export.json +39 -0
- package/llms.txt +4 -2
- package/package.json +9 -1
- package/schemas/analysis-result.schema.json +1 -0
package/docs/CODEX_GITHUB_ISSUE_PAIN_MAP.md
CHANGED
@@ -1,9 +1,9 @@
 # GitHub Issue Pain Map
 
-Generated: 2026-06-01T03:
+Generated: 2026-06-01T03:41:41.220Z
 
-Issues analyzed: **
-Matched issues: **
+Issues analyzed: **17**
+Matched issues: **16**
 Unmatched issues: **1**
 
 This report maps GitHub issues onto deterministic `trace-to-skill` failure classes. Fetch a repository directly with `--repo`, or export issues with `gh issue list` / `gh search issues` and pass the JSON file.
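The report header above counts issues as matched or unmatched against deterministic failure classes. The following is a minimal sketch of that kind of mapping, assuming a plain keyword lookup over issue titles and bodies. The class names come from the report; the keyword lists, the `classify_issue` and `summarize` helpers, and the sample records are hypothetical and are not taken from the package's `rules.js`.

```python
import json
from collections import Counter

# Hypothetical keyword rules. The class names appear in the report;
# the keywords are illustrative assumptions, not the package's real rules.
RULES = {
    "codex_token_burn": ["burning tokens", "usage limits"],
    "codex_thinking_hang": ["hangs indefinitely", "no response generated"],
    "codex_remote_connection": ["remote development"],
}


def classify_issue(issue):
    """Return every failure class whose keywords occur in the title or body."""
    text = f"{issue.get('title', '')} {issue.get('body') or ''}".lower()
    return [kind for kind, words in RULES.items() if any(w in text for w in words)]


def summarize(issues):
    """Count analyzed, matched, and unmatched issues, plus hits per class."""
    by_kind = Counter()
    unmatched = []
    for issue in issues:
        kinds = classify_issue(issue)
        if kinds:
            by_kind.update(kinds)
        else:
            unmatched.append(issue["number"])
    return {
        "analyzed": len(issues),
        "matched": len(issues) - len(unmatched),
        "unmatched": unmatched,
        "by_kind": dict(by_kind),
    }


# Shaped like `gh issue list --json number,title,body` output (field set assumed).
SAMPLE_ISSUES = json.loads("""
[
  {"number": 14593, "title": "Burning tokens very fast", "body": ""},
  {"number": 14048, "title": "Codex CLI hangs indefinitely on all prompts", "body": ""},
  {"number": 99999, "title": "Add a fun launch animation", "body": ""}
]
""")

if __name__ == "__main__":
    print(summarize(SAMPLE_ISSUES))
```

Because the lookup is a pure function of the issue text, the same export always yields the same counts, which is the property the word "deterministic" in the report points at.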
@@ -19,10 +19,12 @@ gh issue list --repo openai/codex --state all --limit 100 --json number,title,bo
 
 | Priority | Kind | Severity | Issues | Comments | Reactions | Example |
 | ---: | --- | --- | ---: | ---: | ---: | --- |
+| 1895 | `codex_remote_connection` | high | 1 | 176 | 851 | [#10450 Remote Development in Codex Desktop App](https://github.com/openai/codex/issues/10450) |
 | 1051 | `codex_token_burn` | high | 2 | 918 | 53 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
+| 508 | `weak_evidence` | medium | 17 | 2186 | 1094 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
+| 434 | `codex_thinking_hang` | high | 2 | 201 | 103 | [#14048 All models - Codex CLI hangs indefinitely on all prompts, no response generated](https://github.com/openai/codex/issues/14048) |
 | 409 | `codex_auth_verification` | high | 2 | 346 | 18 | [#20161 Phone number verification doesn't work](https://github.com/openai/codex/issues/20161) |
 | 304 | `codex_model_routing_mismatch` | high | 3 | 231 | 18 | [#11189 GPT-5.3-Codex being routed to GPT-5.2](https://github.com/openai/codex/issues/11189) |
-| 268 | `weak_evidence` | medium | 14 | 1809 | 140 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
 | 257 | `codex_context_visibility` | high | 3 | 168 | 26 | [#23794 Codex Desktop no longer shows visible context/token usage indicator](https://github.com/openai/codex/issues/23794) |
 | 202 | `premature_completion` | high | 1 | 169 | 8 | [#1243 "Sign in With ChatGPT" functionality needs to be robust against all account types](https://github.com/openai/codex/issues/1243) |
 | 137 | `codex_remote_compact` | high | 1 | 90 | 15 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) |
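The priority table in this hunk is regular enough to read back into records, which helps when comparing two generated reports. This is a rough sketch of a row parser, assuming the seven-column layout shown above and tolerating a leading diff marker. The `parse_row` and `rank` helpers are hypothetical and are not part of `trace-to-skill`.

```python
import re

# One pain-map row: priority, `kind`, severity, three counts, then a markdown link.
ROW = re.compile(
    r"^\|\s*(?P<priority>\d+)\s*\|\s*`(?P<kind>[^`]+)`\s*\|\s*(?P<severity>\w+)\s*\|"
    r"\s*(?P<issues>\d+)\s*\|\s*(?P<comments>\d+)\s*\|\s*(?P<reactions>\d+)\s*\|"
    r"\s*\[#(?P<example>\d+)\s.*\]\((?P<url>[^)]+)\)\s*\|$"
)


def parse_row(line):
    """Parse one table row, tolerating a leading diff marker (+, -, or space)."""
    match = ROW.match(line.lstrip("+- ").rstrip())
    if match is None:
        return None  # header, separator, or a truncated row
    row = match.groupdict()
    for key in ("priority", "issues", "comments", "reactions", "example"):
        row[key] = int(row[key])
    return row


def rank(lines):
    """Return the parsed rows sorted by priority score, highest first."""
    rows = [row for row in map(parse_row, lines) if row is not None]
    return sorted(rows, key=lambda row: row["priority"], reverse=True)


SAMPLE_LINES = [
    " | ---: | --- | --- | ---: | ---: | ---: | --- |",
    " | 1051 | `codex_token_burn` | high | 2 | 918 | 53 | "
    "[#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |",
    "+| 1895 | `codex_remote_connection` | high | 1 | 176 | 851 | "
    "[#10450 Remote Development in Codex Desktop App](https://github.com/openai/codex/issues/10450) |",
]

if __name__ == "__main__":
    for row in rank(SAMPLE_LINES):
        print(row["priority"], row["kind"], row["example"])
```

The sketch only reads the printed scores; how the priority score is derived from issues, comments, and reactions is not stated in this diff, so no formula is assumed here.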
@@ -34,14 +36,24 @@ gh issue list --repo openai/codex --state all --limit 100 --json number,title,bo
 
 | Rank | Next artifact | Why now | Command |
 | ---: | --- | --- | --- |
-| 1 |
-| 2 |
-| 3 |
-| 4 |
-| 5 |
+| 1 | Remote connection fixture and SSH workspace evidence report | 1 issue(s), 176 comment(s), severity high; top signal: codex_remote_connection. | `trace-to-skill codex-report ./runs --output openai-codex-remote-connection.md` |
+| 2 | Usage evidence fixture and support-ready token report | 2 issue(s), 918 comment(s), severity high; top signal: codex_token_burn. | `trace-to-skill usage-evidence ./usage-notes.md --output usage-evidence.md` |
+| 3 | Codex-ready issue report and failure fixture | 2 issue(s), 201 comment(s), severity high; top signal: codex_thinking_hang. | `trace-to-skill codex-report ./runs --output openai-codex-issue.md` |
+| 4 | Auth verification fixture and login support report | 2 issue(s), 346 comment(s), severity high; top signal: codex_auth_verification. | `trace-to-skill codex-report ./runs --output openai-codex-auth-issue.md` |
+| 5 | Model-routing fixture and SSE evidence report | 3 issue(s), 231 comment(s), severity high; top signal: codex_model_routing_mismatch. | `trace-to-skill codex-report ./runs --output openai-codex-model-routing.md` |
 
 ## Suggested Next Actions
 
+### codex_remote_connection
+
+Priority score: 1895. 1 issue(s), 176 comment(s).
+
+Example issues:
+- [#10450 Remote Development in Codex Desktop App](https://github.com/openai/codex/issues/10450) (176 comments; labels: enhancement, app)
+
+Evidence rule prompts:
+- When reporting Codex remote connection failures, capture Codex Desktop version, remote Codex CLI/app-server version, local OS, remote OS/architecture, SSH target alias from `~/.ssh/config`, whether `[features].remote_connections = true` is set, Settings > Connections visibility, selected host/path, remote workspace path, whether the remote filesystem is the source of truth, exact tunnel/app-server error, codex-server pid and restart result, `ps -ef | rg 'codex app-server|openai.chatgpt.*/codex'` evidence if available, remote PATH/auth/proxy/API reachability, model list differences versus local, fs/getMetadata or folder listing errors, ForwardAgent/proxy requirements, and whether reconnect/resume or a clean host works.
+
 ### codex_token_burn
 
 Priority score: 1051. 2 issue(s), 918 comment(s).
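The remote-connection evidence prompt added in this hunk is effectively a checklist, so a report can be checked for gaps before it is filed. Below is a small sketch under stated assumptions: the field keys are hypothetical names paraphrasing a subset of the prompt, and `missing_evidence` is an illustrative helper, not a `trace-to-skill` function.

```python
# Hypothetical field keys, paraphrasing a subset of the evidence prompt above.
REMOTE_CONNECTION_FIELDS = [
    "desktop_version",
    "remote_cli_version",
    "local_os",
    "remote_os_arch",
    "ssh_alias",
    "remote_connections_flag",
    "connections_visible",
    "remote_workspace_path",
    "tunnel_error",
    "reconnect_result",
]


def missing_evidence(report, required=REMOTE_CONNECTION_FIELDS):
    """List required fields that are absent, None, or blank in a report dict."""
    missing = []
    for field in required:
        value = report.get(field)
        # False is valid evidence (for example, a feature flag that is off).
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


# Placeholder values only; no real versions or hosts are implied.
SAMPLE_REPORT = {
    "desktop_version": "x.y.z",
    "local_os": "macOS",
    "ssh_alias": "devbox",
    "remote_connections_flag": False,
    "tunnel_error": "   ",
}

if __name__ == "__main__":
    print("Still needed:", ", ".join(missing_evidence(SAMPLE_REPORT)))
```

Treating `False` as present is a deliberate choice here: "the flag is not set" is itself one of the facts the prompt asks for.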
@@ -53,6 +65,17 @@ Example issues:
 Evidence rule prompts:
 - When reporting Codex token burn, capture plan/workspace, client and version, model and reasoning/speed settings, fast-mode/large-context/subagent/review flags, recent /status and usage-dashboard deltas, local token totals including cached input/output/reasoning if available, background process ids and write_stdin poll cadence, compaction attempts and failures, retry/tool-loop counts, whether the app was idle, and a minimal reproduction with before/after usage percentages.
 
+### codex_thinking_hang
+
+Priority score: 434. 2 issue(s), 201 comment(s).
+
+Example issues:
+- [#14048 All models - Codex CLI hangs indefinitely on all prompts, no response generated](https://github.com/openai/codex/issues/14048) (131 comments; labels: bug, agent)
+- [#7156 Codex hangs during cli command execution](https://github.com/openai/codex/issues/7156) (70 comments; labels: bug, CLI)
+
+Evidence rule prompts:
+- When reporting Codex thinking or CLI no-response hangs, capture app/CLI/extension version, OS/terminal such as WSL, model and reasoning/speed settings, subscription/workspace, turn/thread id, prompt timestamp, whether the prompt is accepted but no streaming output/error/timeout appears, status bar or usage percent such as 100% left, `turn/start` or `task_started` timestamp, last successful tool-call output, first `response_item` or assistant timestamp if it eventually appears, `RUST_LOG`/SSE evidence including unhandled responses events, transport (`responses_http` or websocket), `time.busy`/`time.idle` close metrics, reconnect or stream-disconnect lines, status incident link or cluster mitigation note if relevant, MCP/subagent state, whether stop/Ctrl+C/interrupt works, and whether a new thread, logout/login, downgrade, API billing path, or minimal config without MCPs recovers.
+
 ### codex_auth_verification
 
 Priority score: 409. 2 issue(s), 346 comment(s).
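The hang prompt above asks for the `turn/start` or `task_started` timestamp and the first `response_item` timestamp; the gap between them is the number a maintainer usually wants. Here is a minimal sketch of that calculation, assuming a JSONL log where each event has `type` and `timestamp` keys. That log shape and the `first_response_gap` helper are assumptions for illustration; only the event names come from the prompt.

```python
import json
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing "Z" is normalized for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def first_response_gap(jsonl_text):
    """Seconds from the first turn start to the first response_item.

    Returns None when no response_item follows a start event, which is
    the no-response case described in the prompt.
    """
    started = None
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        kind = event.get("type")
        if kind in ("turn/start", "task_started") and started is None:
            started = parse_ts(event["timestamp"])
        elif kind == "response_item" and started is not None:
            return (parse_ts(event["timestamp"]) - started).total_seconds()
    return None


# Assumed log shape; timestamps are made up for the example.
SLOW_TURN = "\n".join([
    '{"type": "task_started", "timestamp": "2026-06-01T03:00:00.000Z"}',
    '{"type": "response_item", "timestamp": "2026-06-01T03:04:10.500Z"}',
])
HUNG_TURN = '{"type": "task_started", "timestamp": "2026-06-01T03:00:00.000Z"}'

if __name__ == "__main__":
    print("slow turn gap (s):", first_response_gap(SLOW_TURN))
    print("hung turn gap:", first_response_gap(HUNG_TURN))
```

A `None` result separates "never answered" from "answered late", which the prompt treats as different evidence (no output at all versus a long gap before the first `response_item`).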
@@ -76,28 +99,6 @@ Example issues:
 Evidence rule prompts:
 - When reporting Codex model-routing mismatches, capture the Codex app/CLI/extension version, subscription/workspace, selected model from config.toml, TUI, command flag, or UI, actual server-side model from SSE `response.created` / `response.model`, the exact `RUST_LOG` or trace command used, timestamp, account or verification state without secrets, whether API and Codex routes differ, whether a warning/fallback notice appeared, and a minimal one-prompt reproduction with redacted logs.
 
-### codex_context_visibility
-
-Priority score: 257. 3 issue(s), 168 comment(s).
-
-Example issues:
-- [#23794 Codex Desktop no longer shows visible context/token usage indicator](https://github.com/openai/codex/issues/23794) (160 comments; labels: bug, context, app)
-- [#23591 Reimplement visible context/token usage indicator in Codex Desktop App](https://github.com/openai/codex/issues/23591) (7 comments; labels: enhancement, rate-limits, context, app)
-- [#24710 Codex Desktop: hidden context indicator still blocks long-session context management](https://github.com/openai/codex/issues/24710) (1 comments; labels: enhancement, context, app)
-
-Evidence rule prompts:
-- When reporting Codex context-visibility regressions, capture Codex Desktop version, OS, surface, screenshot or short recording of the chat input area, whether the prior context/token indicator or tooltip was visible before the update, exact UI route where it disappeared, local session metadata showing context/window pressure if available, `/status` output if relevant, compaction timing, whether CLI/TUI still exposes a statusline, and how the missing indicator affects long-session decisions.
-
-### premature_completion
-
-Priority score: 202. 1 issue(s), 169 comment(s).
-
-Example issues:
-- [#1243 "Sign in With ChatGPT" functionality needs to be robust against all account types](https://github.com/openai/codex/issues/1243) (169 comments; labels: none)
-
-Evidence rule prompts:
-- Before claiming completion, run the relevant validation command or clearly state the exact validation that could not be run and why.
-
 ## Unmatched Issues
 
 - [#99999 Add a fun launch animation](https://github.com/openai/codex/issues/99999) (0 comments; labels: enhancement)
package/docs/CODEX_ISSUE_MAP.md
CHANGED
@@ -35,7 +35,7 @@ npx trace-to-skill lsp-audit . --format json
 | Undo, rewind, and pre-agent checkpoint needs | users want `/undo` or `/rewind`, double-Esc only rewinds chat state, untracked/gitignored files are not protected by commits, and manual recovery needs reviewable pre-agent evidence | workspace checkpoint bundle | `trace-to-skill checkpoint . --output .trace-to-skill/checkpoints/before-codex` before agent work |
 | Model routing mismatch | selected `gpt-5.3-codex` in `config.toml`, TUI, or `--model`, but SSE `response.created` / `response.model` shows `gpt-5.2`, silent fallback, no warning, no fallback notice | `codex_model_routing_mismatch` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo model-routing-mismatch` |
 | Latency regressions | GPT-5.5 Fast feels like Standard, simple tasks take 10-20+ minutes, pre-first-token or thinking stalls, slow search/read/compaction, hours for small code changes | `codex_latency_regression` | `trace-to-skill codex-report ./runs` |
-| Thinking or stream hangs | accepted turn, completed local tool output, no
+| Thinking or stream hangs | accepted turn, completed local tool output, CLI prompt accepted but no streaming output/error/timeout, status bar `100% left`, unhandled responses events, terminal command execution stuck, long gap before first `response_item`, `time.busy` milliseconds with `time.idle` hundreds of seconds, Stop/Ctrl+C cannot interrupt, status incident or cluster-reroute note, subagent parent stuck | `codex_thinking_hang` | `trace-to-skill codex-report ./runs`, `trace-to-skill demo thinking-hang`, or `trace-to-skill demo cli-no-response` |
 | Clipboard, paste, and generated attachment regressions | `Copy as Markdown` missing, Copy menu only exports metadata, long pasted prompts become `Pasted text.txt`, generated attachments cannot preview/edit/revert, `/goal` ignores non-empty fileAttachments | `codex_clipboard_attachment` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo clipboard-attachment` |
 | Deeplink, OAuth callback, and external launch regressions | `codex://oauth_callback?code=...` fails, `Unable to find Electron app`, `app\oauth_callback?code=...`, notification `type=click&tag=...` becomes an app path, AppX/MSIX protocol evidence, `codex app .` only focuses | `codex_deeplink_launch` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo deeplink-launch` |
 | App connector auth cache and stale link regressions | `401 Reauthentication required`, `refresh token was revoked`, stale `link_*`, `isAccessible: false`, `codex_apps_tools` or `codex_app_directory` cache regeneration keeps broken connector state | `codex_connector_auth_cache` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo connector-auth-cache` |
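The "Model routing mismatch" row in this hunk compares the selected model with the one reported in the SSE `response.created` event. The sketch below shows one way to do that comparison, assuming standard `event:` / `data:` SSE framing and a JSON payload that carries the model at `response.model`. The `served_model` and `routing_mismatch` helpers are hypothetical, and the exact-string comparison is an assumption that may be too strict if a server reports a dated or suffixed model name.

```python
import json


def served_model(sse_text):
    """Return response.model from the first `response.created` event, or None."""
    event_name = None
    for line in sse_text.splitlines():
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:") and event_name == "response.created":
            payload = json.loads(line[len("data:"):].strip())
            return payload.get("response", {}).get("model")
        elif not line.strip():
            event_name = None  # a blank line ends the current SSE event
    return None


def routing_mismatch(selected, sse_text):
    """True when the stream names a model different from the selected one."""
    actual = served_model(sse_text)
    return actual is not None and actual != selected


# Redacted, hand-written stream; the model names are the ones quoted in the row.
SAMPLE_SSE = (
    "event: response.created\n"
    'data: {"type": "response.created", "response": {"model": "gpt-5.2"}}\n'
    "\n"
)

if __name__ == "__main__":
    print("served:", served_model(SAMPLE_SSE))
    print("mismatch:", routing_mismatch("gpt-5.3-codex", SAMPLE_SSE))
```

Returning `False` when no `response.created` event is found keeps "no evidence" separate from "evidence of a mismatch", which matters for a report that is meant to be support-ready.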
@@ -61,6 +61,7 @@ npx trace-to-skill lsp-audit . --format json
 | Sandbox and permission blockers | Windows sandbox setup refresh, `os error 740`, ACL/ownership drift, approval-mode mismatch | `sandbox_permission` | `trace-to-skill analyze ./runs` |
 | Auth and connectivity failures | `token_exchange_failed`, `auth.openai.com/oauth/token`, missing CA certificates, proxy/TLS, IPv6, Cloudflare, stream disconnects | `codex_connectivity` | `trace-to-skill codex-report ./runs` |
 | Sign-in and account verification failures | phone verification, SMS/OTP, SSO, ChatGPT sign-in account routing, organization/workspace verification, extension chat initialization | `codex_auth_verification` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo auth-verification` |
+| Remote connection and SSH workspace failures | Codex Desktop remote development, Settings > Connections hidden, `[features].remote_connections`, SSH hosts from `~/.ssh/config`, remote filesystem source of truth, local tunnel not ready, stale remote Codex version, codex-server/app-server restart, fs/getMetadata folder listing timeouts, ForwardAgent or local API proxy needs | `codex_remote_connection` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo remote-connection` |
 | Remote-control routing failures | `Waiting for desktop`, `Directory: Unavailable`, stale listener/enrollment, `127.0.0.1:14567`, empty backend environments | `codex_remote_control` | `trace-to-skill codex-report ./runs` |
 | MCP runtime failures | `user cancelled MCP tool call`, `unsupported call: mcp__...__...`, namespace/serverName loss, `Transport closed` | `codex_mcp_runtime` | `trace-to-skill codex-report ./runs` |
 | Plugin runtime and bundled capability failures | Computer Use native pipe path unavailable, Browser/Computer Use settings fail, plugin/list `unknown variant 'vertical'`, stale plugin cache downgrades | `codex_plugin_runtime` | `trace-to-skill codex-report ./runs` |
@@ -84,13 +85,14 @@ npx trace-to-skill lsp-audit . --format json
 - Include `session-audit` output, largest rollout JSONL sizes, largest line sizes, parse-error counts, `session_index.jsonl` line count, bloated title byte/signal counts, unindexed rollout thread count, recoverable `codex resume <id>` commands, hashed project groups, and state-file presence for resume/session-state failures.
 - Include `diagnostics-bundle` output when the issue spans config plus local session/history state, or when you need one metadata-only folder that excludes raw logs, SQLite databases, raw config, and transcripts.
 - Include pre-first-token, thinking, tool, search, read, and compaction timings plus model/speed settings for latency regressions.
-- Include `turn/start`, `task_started`, last successful tool output, first `response_item` timestamp, `responses_http` or websocket evidence, `time.busy` / `time.idle`, MCP/subagent state, stop/interrupt behavior, and minimal-config recovery evidence for Thinking hangs.
+- Include `turn/start`, `task_started`, last successful tool output, accepted prompt/no output/no error/no timeout, status bar `100% left`, first `response_item` timestamp, `RUST_LOG` SSE evidence, unhandled responses events, `responses_http` or websocket evidence, `time.busy` / `time.idle`, status incident or cluster note, MCP/subagent state, stop/Ctrl+C/interrupt behavior, and minimal-config recovery evidence for Thinking or CLI no-response hangs.
 - Include exact Copy menu items, paste source size, generated attachment name/path/size, visible editor text, `pasted-text-attachments.json` or fileAttachments metadata, `/goal` or submit path, preview/edit/revert actions, and clipboard payload format for clipboard/attachment regressions.
 - Include exact redacted `codex://` URI shape, connector/plugin, browser, error dialog text, app running state, AppX/MSIX protocol registration evidence such as AppUserModelID and DelegateExecute, HKCU/HKCR `codex` keys, command-line arguments, `Start-Process "codex://test"` repro, and repair/reinstall/re-register attempts for deeplink/OAuth launch regressions.
 - Include connector/plugin name and id, installed plugin root, exact Codex Apps tool name, 401/reauth text, `link_*` id before and after reconnect/cache regeneration, `isAccessible` state, redacted `codex_apps_tools`/`codex_app_directory` metadata, ChatGPT app page state, and external MCP workaround result for connector auth-cache regressions.
 - Include fork source and forked thread ids, fork timestamp, fork boundary marker, `input_tokens` and `cached_input_tokens` before/after fork, `prompt_cache_key` before/after, cache hit rate, duplicated parent-turn/tool-transcript examples, whether new files were read before token growth, compaction state, subagent/`fork_context` history, minimal repro steps, and non-fork control result for context-fork bloat.
 - Include Codex Desktop/app/CLI version, MultiAgentV2 state, OS, model, parent thread id, child thread ids, exact `spawn_agent` arguments, `fork_turns`, role/profile, whether `multi_tool_use.parallel` or same-turn parallel spawning was used, redacted child rollout line order, first user/task message, assistant/commentary envelope lines, sibling prompt excerpts, `wait_agent` and `close_agent` results, unexpected child tool calls, and sequential single-child versus parallel-child controls for subagent prompt leakage.
 - Include effective `CODEX_HOME`, config files considered, redacted MCP sections, trust/profile/default-permissions state, `codex mcp list/get`, CLI-versus-Desktop/VS Code comparison, loaded config path/log lines, WSL/remote/SSH state, and restart/reload/new-conversation results for MCP discovery mismatches.
+- Include Codex Desktop version, remote Codex CLI/app-server version, local OS, remote OS/architecture, SSH alias from `~/.ssh/config`, `[features].remote_connections = true`, Settings > Connections visibility, selected host/path, remote workspace path, remote filesystem source-of-truth expectation, tunnel/app-server error, codex-server pid/restart evidence, remote PATH/auth/proxy/API reachability, model-list differences, fs/getMetadata/folder listing errors, ForwardAgent/proxy needs, and reconnect/resume behavior for remote connection reports.
 - Include Codex version, MCP server name, transport URL without secrets, initialize/tools/list/tools/call results, HTTP status, `Content-Type`, SSE event framing, JSON-RPC message shape, session id before/after restart, auth/OAuth expectations, User-Agent/header requirements, parse/deserialize error, another-client comparison, and reconnect/reinitialize behavior for Streamable HTTP MCP reports.
 - Include app/CLI/extension version, OS, surface, shell/Desktop route, `[features].hooks`, redacted `hooks.json`, hook event type, matcher, handler command/name, expected versus observed fire count, duplicate event ids, deprecation warning text, trust state, live edit/rate-limit/auto-restore timing, Code Mode `exec` versus normal CLI comparison, linked-worktree cwd, Hooks settings UI evidence, and restart/reload/new-session behavior for hooks reports.
 - Include terminal emulator/version, shell, WSL/SSH/tmux/Zellij state, streaming state, exact scroll action, viewport snap behavior, first missing or duplicated line id, raw log/transcript proof, terminal capture, numbered-line harness/control output, terminal dimensions/scrollback settings, and `/resume` or transcript recovery behavior for terminal-output integrity reports.
@@ -119,6 +121,7 @@ npx trace-to-skill lsp-audit . --format json
 - Tool-call integrity and rollback failures: https://github.com/openai/codex/issues/25399, https://github.com/openai/codex/issues/25380, https://github.com/openai/codex/issues/25426, https://github.com/openai/codex/issues/7291
 - Undo, rewind, and pre-agent checkpoint needs: https://github.com/openai/codex/issues/9203, https://github.com/openai/codex/issues/11626
 - Latency regressions: https://github.com/openai/codex/issues/24422, https://github.com/openai/codex/issues/21527, https://github.com/openai/codex/issues/11984, https://github.com/openai/codex/issues/12161
+- CLI no-response and command execution hangs: https://github.com/openai/codex/issues/14048, https://github.com/openai/codex/issues/7156
 - Deeplink, OAuth callback, and external launch regressions: https://github.com/openai/codex/issues/25203, https://github.com/openai/codex/issues/25231, https://github.com/openai/codex/issues/25368, https://github.com/openai/codex/issues/25333
 - App connector auth cache and stale link regressions: https://github.com/openai/codex/issues/24675, https://github.com/openai/codex/issues/25443
 - Context fork bloat and prompt-cache lineage loss: https://github.com/openai/codex/issues/25467, https://github.com/openai/codex/issues/24704, https://github.com/openai/codex/issues/24150, https://github.com/openai/codex/issues/13491, https://github.com/openai/codex/issues/24281
package/docs/CODEX_ISSUE_RADAR.md
CHANGED
@@ -1,10 +1,10 @@
 # GitHub Issue Pain Map
 
-Generated: 2026-06-01T03:
+Generated: 2026-06-01T03:41:42.598Z
 
 Issues analyzed: **46**
-Matched issues: **
-Unmatched issues: **
+Matched issues: **23**
+Unmatched issues: **23**
 
 This report maps GitHub issues onto deterministic `trace-to-skill` failure classes. Fetch a repository directly with `--repo`, or export issues with `gh issue list` / `gh search issues` and pass the JSON file.
 
@@ -21,26 +21,26 @@ gh issue list --repo openai/codex --state all --limit 100 --json number,title,bo
 | ---: | --- | --- | ---: | ---: | ---: | --- |
 | 2438 | `codex_token_burn` | high | 4 | 1151 | 620 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
 | 2221 | `weak_evidence` | medium | 46 | 4755 | 7794 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
+| 1895 | `codex_remote_connection` | high | 1 | 176 | 851 | [#10450 Remote Development in Codex Desktop App](https://github.com/openai/codex/issues/10450) |
 | 884 | `sensitive_file_access` | high | 1 | 75 | 396 | [#2847 A way to exclude sensitive files](https://github.com/openai/codex/issues/2847) |
 | 805 | `codex_auth_verification` | high | 3 | 436 | 166 | [#20161 Phone number verification doesn't work](https://github.com/openai/codex/issues/20161) |
 | 631 | `codex_context_visibility` | high | 1 | 160 | 227 | [#23794 Codex Desktop no longer shows visible context/token usage indicator](https://github.com/openai/codex/issues/23794) |
 | 442 | `codex_tool_call_integrity` | high | 1 | 61 | 182 | [#2998 IDE-integrated diff / approval](https://github.com/openai/codex/issues/2998) |
+| 434 | `codex_thinking_hang` | high | 2 | 201 | 103 | [#14048 All models — Codex CLI hangs indefinitely on all prompts, no response generated](https://github.com/openai/codex/issues/14048) |
 | 376 | `codex_remote_compact` | high | 2 | 147 | 101 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) |
 | 376 | `context_compaction` | high | 2 | 147 | 101 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) |
 | 351 | `codex_terminal_output_integrity` | high | 1 | 66 | 134 | [#2558 Codex client output truncated when scrolling in Zellij](https://github.com/openai/codex/issues/2558) |
 | 324 | `codex_model_routing_mismatch` | high | 1 | 169 | 69 | [#11189 GPT-5.3-Codex being routed to GPT-5.2](https://github.com/openai/codex/issues/11189) |
-| 261 | `premature_completion` | high | 1 | 60 | 92 | [#2448 Codex CLI: Plus users hitting usage limits extremely quickly compared to competitors](https://github.com/openai/codex/issues/2448) |
-| 247 | `codex_connectivity` | high | 2 | 192 | 14 | [#12764 The codex cli giving: 401 unauthorized](https://github.com/openai/codex/issues/12764) |
 
 ## Maintainer Roadmap
 
 | Rank | Next artifact | Why now | Command |
 | ---: | --- | --- | --- |
 | 1 | Usage evidence fixture and support-ready token report | 4 issue(s), 1151 comment(s), severity high; top signal: codex_token_burn. | `trace-to-skill usage-evidence ./usage-notes.md --output usage-evidence.md` |
-| 2 |
-| 3 |
-| 4 |
-| 5 |
+| 2 | Remote connection fixture and SSH workspace evidence report | 1 issue(s), 176 comment(s), severity high; top signal: codex_remote_connection. | `trace-to-skill codex-report ./runs --output openai-codex-remote-connection.md` |
+| 3 | Privacy/safety guardrail and redacted support bundle | 1 issue(s), 75 comment(s), severity high; top signal: sensitive_file_access. | `trace-to-skill diagnostics-bundle ~/.codex --output codex-diagnostics` |
+| 4 | Auth verification fixture and login support report | 3 issue(s), 436 comment(s), severity high; top signal: codex_auth_verification. | `trace-to-skill codex-report ./runs --output openai-codex-auth-issue.md` |
+| 5 | Context visibility fixture and Desktop UI evidence report | 1 issue(s), 160 comment(s), severity high; top signal: codex_context_visibility. | `trace-to-skill codex-report ./runs --output openai-codex-context-visibility.md` |
 
 ## Suggested Next Actions
 
@@ -56,6 +56,16 @@ Example issues:
 Evidence rule prompts:
 - When reporting Codex token burn, capture plan/workspace, client and version, model and reasoning/speed settings, fast-mode/large-context/subagent/review flags, recent /status and usage-dashboard deltas, local token totals including cached input/output/reasoning if available, background process ids and write_stdin poll cadence, compaction attempts and failures, retry/tool-loop counts, whether the app was idle, and a minimal reproduction with before/after usage percentages.
 
+### codex_remote_connection
+
+Priority score: 1895. 1 issue(s), 176 comment(s).
+
+Example issues:
+- [#10450 Remote Development in Codex Desktop App](https://github.com/openai/codex/issues/10450) (176 comments; labels: enhancement, app)
+
+Evidence rule prompts:
+- When reporting Codex remote connection failures, capture Codex Desktop version, remote Codex CLI/app-server version, local OS, remote OS/architecture, SSH target alias from `~/.ssh/config`, whether `[features].remote_connections = true` is set, Settings > Connections visibility, selected host/path, remote workspace path, whether the remote filesystem is the source of truth, exact tunnel/app-server error, codex-server pid and restart result, `ps -ef | rg 'codex app-server|openai.chatgpt.*/codex'` evidence if available, remote PATH/auth/proxy/API reachability, model list differences versus local, fs/getMetadata or folder listing errors, ForwardAgent/proxy requirements, and whether reconnect/resume or a clean host works.
+
 ### sensitive_file_access
 
 Priority score: 884. 1 issue(s), 75 comment(s).
@@ -88,27 +98,17 @@ Example issues:
 Evidence rule prompts:
 - When reporting Codex context-visibility regressions, capture Codex Desktop version, OS, surface, screenshot or short recording of the chat input area, whether the prior context/token indicator or tooltip was visible before the update, exact UI route where it disappeared, local session metadata showing context/window pressure if available, `/status` output if relevant, compaction timing, whether CLI/TUI still exposes a statusline, and how the missing indicator affects long-session decisions.
 
-### codex_tool_call_integrity
-
-Priority score: 442. 1 issue(s), 61 comment(s).
-
-Example issues:
-- [#2998 IDE-integrated diff / approval](https://github.com/openai/codex/issues/2998) (61 comments; labels: enhancement, extension)
-
-Evidence rule prompts:
-- When reporting Codex tool-call integrity failures, capture the exact tool input and output, app/CLI/extension version, OS/IDE, workspace git state, affected file path and whether it already existed or was a symlink, diff before/after, tool_call_id sequence, durable thread state for subagents, rollback/revert attempts, and whether a clean repo reproduction fails the same way.
-
 ## Unmatched Issues
 
 - [#10410 Codex Desktop App: macOS Intel (x86_64) support](https://github.com/openai/codex/issues/10410) (190 comments; labels: enhancement, app)
-- [#10450 Remote Development in Codex Desktop App](https://github.com/openai/codex/issues/10450) (176 comments; labels: enhancement, app)
-- [#14048 All models — Codex CLI hangs indefinitely on all prompts, no response generated](https://github.com/openai/codex/issues/14048) (131 comments; labels: bug, agent)
 - [#2604 Subagent Support](https://github.com/openai/codex/issues/2604) (103 comments; labels: enhancement, subagent)
 - [#12564 Allow renaming task/thread titles to improve history navigation](https://github.com/openai/codex/issues/12564) (77 comments; labels: enhancement, extension)
 - [#2860 Unusable on Windows due to permission ask for every shell command](https://github.com/openai/codex/issues/2860) (77 comments; labels: bug, windows-os)
 - [#2109 Event Hooks](https://github.com/openai/codex/issues/2109) (76 comments; labels: enhancement, hooks)
 - [#2796 BUG: VSCode IDE Plugin on SSH Connection: "Failed to load tasks."](https://github.com/openai/codex/issues/2796) (71 comments; labels: bug, extension)
 - [#16231 High CPU usage on macOS after updating Codex in VS Code extension to 26.325.31654](https://github.com/openai/codex/issues/16231) (71 comments; labels: bug, extension, regression, performance)
-- [#7156 Codex hangs during cli command execution](https://github.com/openai/codex/issues/7156) (70 comments; labels: bug, CLI)
 - [#4313 Extension for JetBrains IDEs (PyCharm, IntelliJ, etc.)](https://github.com/openai/codex/issues/4313) (70 comments; labels: enhancement)
 - [#13041 WebSocket upgrade succeeds then server closes with 1008 Policy (falls back to HTTPS)](https://github.com/openai/codex/issues/13041) (70 comments; labels: bug, connectivity)
+- [#11701 Subagent configuration and orchestration](https://github.com/openai/codex/issues/11701) (69 comments; labels: enhancement, subagent)
+- [#11023 Codex desktop app for Linux](https://github.com/openai/codex/issues/11023) (68 comments; labels: enhancement, app)
+- [#6172 Hitting rate limits](https://github.com/openai/codex/issues/6172) (66 comments; labels: bug, codex-web, rate-limits)
package/docs/DEMO.md
CHANGED
|
@@ -1,10 +1,10 @@
|
|
|
1
1
|
# trace-to-skill Demo
|
|
2
2
|
|
|
3
|
-
Scenario: **Codex
|
|
3
|
+
Scenario: **Codex CLI no-response or all-model hang**
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Codex CLI accepts prompts but produces no streaming output, no error, no timeout, or hangs during command execution.
|
|
6
6
|
|
|
7
|
-
Fixture: `fixtures/codex-
|
|
7
|
+
Fixture: `fixtures/codex-cli-no-response.md`
|
|
8
8
|
|
|
9
9
|
This is a packaged public fixture, so you can try the project without collecting a private trace first.
|
|
10
10
|
|
|
@@ -14,7 +14,7 @@ This is a packaged public fixture, so you can try the project without collecting
|
|
|
14
14
|
|
|
15
15
|
Score: **75/100**
|
|
16
16
|
|
|
17
|
-
Likely failure class: **Codex
|
|
17
|
+
Likely failure class: **Codex thinking or stream hang (codex_thinking_hang, high)**
|
|
18
18
|
|
|
19
19
|
Agent workflow needs clearer verification, instruction, or security hardening before broad reuse.
|
|
20
20
|
|
|
@@ -23,25 +23,25 @@ Agent workflow needs clearer verification, instruction, or security hardening be
|
|
|
23
23
|
```md
|
|
24
24
|
### What happened?
|
|
25
25
|
|
|
26
|
-
trace-to-skill detected Codex
|
|
26
|
+
trace-to-skill detected Codex thinking or stream hang (codex_thinking_hang). Codex can accept a turn, finish local tool calls, or keep a Responses request open while the UI/CLI remains on Thinking or Working with no streamed follow-up, making users interrupt healthy runs or lose long-session context.
|
|
27
27
|
|
|
28
28
|
### Detected failure class
|
|
29
29
|
|
|
30
|
-
-
|
|
30
|
+
- codex_thinking_hang: Codex thinking or stream hang (high)
|
|
31
31
|
|
|
32
32
|
### Evidence
|
|
33
33
|
|
|
34
|
-
#### Codex
|
|
35
|
-
- fixtures/codex-
|
|
36
|
-
- fixtures/codex-
|
|
37
|
-
- fixtures/codex-
|
|
38
|
-
- fixtures/codex-
|
|
39
|
-
- fixtures/codex-
|
|
40
|
-
- fixtures/codex-
|
|
34
|
+
#### Codex thinking or stream hang
|
|
35
|
+
- fixtures/codex-cli-no-response.md:1 - # Codex CLI No-Response Hang
|
|
36
|
+
- fixtures/codex-cli-no-response.md:3 - Public issue cluster: All models - Codex CLI hangs indefinitely on all prompts, no response generated.
|
|
37
|
+
- fixtures/codex-cli-no-response.md:7 - - Codex CLI accepts prompts and displays them, but no streaming output begins.
|
|
38
|
+
- fixtures/codex-cli-no-response.md:8 - - All models tested, including `gpt-5.4 high`, `gpt-5.3-codex`, and `gpt-5.1-codex-max`, show no response, no error, and no timeout.
|
|
39
|
+
- fixtures/codex-cli-no-response.md:9 - - The status bar remains `gpt-5.4 high - 100% left`; no tokens are being consumed while the prompt is stuck.
|
|
40
|
+
- fixtures/codex-cli-no-response.md:10 - - A `status.openai.com/incidents` status incident note says Codex CLI hanging or no response may come from unhealthy clusters and rerouted traffic.
|
|
41
41
|
|
|
42
42
|
### Diagnostics to attach
|
|
43
43
|
|
|
44
|
-
- When reporting Codex
|
|
44
|
+
- When reporting Codex thinking or CLI no-response hangs, capture app/CLI/extension version, OS/terminal such as WSL, model and reasoning/speed settings, subscription/workspace, turn/thread id, prompt timestamp, whether the prompt is accepted but no streaming output/error/timeout appears, status bar or usage percent such as 100% left, `turn/start` or `task_started` timestamp, last successful tool-call output, first `response_item` or assistant timestamp if it eventually appears, `RUST_LOG`/SSE evidence including unhandled responses events, transport (`responses_http` or websocket), `time.busy`/`time.idle` close metrics, reconnect or stream-disconnect lines, status incident link or cluster mitigation note if relevant, MCP/subagent state, whether stop/Ctrl+C/interrupt works, and whether a new thread, logout/login, downgrade, API billing path, or minimal config without MCPs recovers.
|
|
45
45
|
|
|
46
46
|
### Privacy
|
|
47
47
|
|
|
@@ -50,23 +50,25 @@ trace-to-skill detected Codex context or token usage indicator missing (codex_co
|
|
|
50
50
|
|
|
51
51
|
## Findings
|
|
52
52
|
|
|
53
|
-
### 1. Codex
|
|
53
|
+
### 1. Codex thinking or stream hang
|
|
54
54
|
|
|
55
55
|
Severity: **high**
|
|
56
56
|
|
|
57
|
-
|
|
57
|
+
Codex can accept a turn, finish local tool calls, or keep a Responses request open while the UI/CLI remains on Thinking or Working with no streamed follow-up, making users interrupt healthy runs or lose long-session context.
|
|
58
58
|
|
|
59
59
|
Evidence:
|
|
60
|
-
- `fixtures/codex-
|
|
61
|
-
- `fixtures/codex-
|
|
62
|
-
- `fixtures/codex-
|
|
63
|
-
- `fixtures/codex-
|
|
64
|
-
- `fixtures/codex-
|
|
65
|
-
- `fixtures/codex-
|
|
60
|
+
- `fixtures/codex-cli-no-response.md:1` # Codex CLI No-Response Hang
|
|
61
|
+
- `fixtures/codex-cli-no-response.md:3` Public issue cluster: All models - Codex CLI hangs indefinitely on all prompts, no response generated.
|
|
62
|
+
- `fixtures/codex-cli-no-response.md:7` - Codex CLI accepts prompts and displays them, but no streaming output begins.
|
|
63
|
+
- `fixtures/codex-cli-no-response.md:8` - All models tested, including `gpt-5.4 high`, `gpt-5.3-codex`, and `gpt-5.1-codex-max`, show no response, no error, and no timeout.
|
|
64
|
+
- `fixtures/codex-cli-no-response.md:9` - The status bar remains `gpt-5.4 high - 100% left`; no tokens are being consumed while the prompt is stuck.
|
|
65
|
+
- `fixtures/codex-cli-no-response.md:10` - A `status.openai.com/incidents` status incident note says Codex CLI hanging or no response may come from unhealthy clusters and rerouted traffic.
|
|
66
|
+
- `fixtures/codex-cli-no-response.md:12` - In another report, Codex hangs during terminal command execution; basic shell commands get stuck, it does half the job then stuck, and the VS Code client remains on Thinking or Working.
|
|
67
|
+
- `fixtures/codex-cli-no-response.md:17` The `codex exec --sandbox read-only --model gpt-5.3-codex 'ping'` run has no output and hangs after MCP startup with `unhandled responses event` SSE lines:
|
|
66
68
|
|
|
67
69
|
Suggested rule:
|
|
68
70
|
|
|
69
|
-
> When reporting Codex
|
|
71
|
+
> When reporting Codex thinking or CLI no-response hangs, capture app/CLI/extension version, OS/terminal such as WSL, model and reasoning/speed settings, subscription/workspace, turn/thread id, prompt timestamp, whether the prompt is accepted but no streaming output/error/timeout appears, status bar or usage percent such as 100% left, `turn/start` or `task_started` timestamp, last successful tool-call output, first `response_item` or assistant timestamp if it eventually appears, `RUST_LOG`/SSE evidence including unhandled responses events, transport (`responses_http` or websocket), `time.busy`/`time.idle` close metrics, reconnect or stream-disconnect lines, status incident link or cluster mitigation note if relevant, MCP/subagent state, whether stop/Ctrl+C/interrupt works, and whether a new thread, logout/login, downgrade, API billing path, or minimal config without MCPs recovers.
|
|
70
72
|
|
|
71
73
|
|
|
72
74
|
## Reporter Notes
|
|
@@ -97,6 +99,8 @@ Suggested rule:
|
|
|
97
99
|
- `terminal-output-integrity`: Terminal scrollback, streamed output, or transcript rendering drops, overwrites, truncates, or makes lines inaccessible.
|
|
98
100
|
- `subagent-lifecycle`: Completed, closed, stale, or interrupted subagents diverge between UI, live registry, persisted state, quota, and parent discoverability.
|
|
99
101
|
- `usage-bucket-confusion`: Usage popovers show 5h and weekly percentages without clear remaining/used, rolling/calendar, or account/workspace scope.
|
|
102
|
+
- `context-visibility`: Desktop context or token usage indicators disappear, leaving long-session compaction pressure invisible.
|
|
103
|
+
- `remote-connection`: Desktop remote SSH workspaces, Settings > Connections, remote app-server, tunnel, or remote filesystem evidence breaks.
|
|
100
104
|
- `token-burn`: Usage drains from background polling, idle activity, compaction loops, retries, or cached-heavy turns.
|
|
101
105
|
- `patch-overwrite`: `apply_patch` accepts `*** Add File` for an existing path, turning a create operation into a silent overwrite.
|
|
102
106
|
- `sensitive-files`: Secrets, local credentials, production env files, or private databases enter agent context.
|
|
@@ -112,6 +116,7 @@ trace-to-skill demo subagent-prompt-leakage
|
|
|
112
116
|
trace-to-skill demo windows-helper-path
|
|
113
117
|
trace-to-skill demo patch-overwrite
|
|
114
118
|
trace-to-skill demo thinking-hang
|
|
119
|
+
trace-to-skill demo cli-no-response
|
|
115
120
|
trace-to-skill demo clipboard-attachment
|
|
116
121
|
trace-to-skill demo deeplink-launch
|
|
117
122
|
trace-to-skill demo connector-auth-cache
|
|
@@ -122,6 +127,7 @@ trace-to-skill demo terminal-output-integrity
|
|
|
122
127
|
trace-to-skill demo subagent-lifecycle
|
|
123
128
|
trace-to-skill demo usage-bucket-confusion
|
|
124
129
|
trace-to-skill demo context-visibility
|
|
130
|
+
trace-to-skill demo remote-connection
|
|
125
131
|
trace-to-skill demo file-tree-ui
|
|
126
132
|
trace-to-skill demo usage-reset-drift
|
|
127
133
|
```
|
package/docs/FAILURE_TAXONOMY.md
CHANGED
|
@@ -52,9 +52,9 @@ The fix is to capture app/CLI/extension version, model and speed/reasoning setti
|
|
|
52
52
|
|
|
53
53
|
Codex can accept a turn, finish a local tool call, or keep a Responses request open while the UI or CLI remains on Thinking/Working with no streamed assistant follow-up. This is more specific than general latency: the session appears structurally accepted but the next visible assistant event never arrives or arrives after a very long gap.
|
|
54
54
|
|
|
55
|
-
Common signals include `turn/start`, `task_started`, a successful tool output followed by no next assistant action, a long gap before the first `response_item`, `model_client.stream_responses_api` close lines where `time.busy` is milliseconds but `time.idle` is hundreds of seconds, `responses_http` or websocket reconnects, Stop/Ctrl+C not interrupting the stuck turn, a subagent parent thread waiting while a child remains active, and minimal `config.toml` without MCPs changing the behavior.
|
|
55
|
+
Common signals include `turn/start`, `task_started`, a successful tool output followed by no next assistant action, a CLI prompt accepted with no streaming output, no error, and no timeout, status bar `100% left` with no tokens consumed, `codex exec --sandbox read-only --model ... 'ping'` stopping after `mcp startup: no servers`, `unhandled responses event` SSE lines, terminal command execution hanging, a status incident or unhealthy-cluster reroute note, a long gap before the first `response_item`, `model_client.stream_responses_api` close lines where `time.busy` is milliseconds but `time.idle` is hundreds of seconds, `responses_http` or websocket reconnects, Stop/Ctrl+C not interrupting the stuck turn, a subagent parent thread waiting while a child remains active, and minimal `config.toml` without MCPs changing the behavior.
|
|
56
56
|
|
|
57
|
-
The fix is to capture Codex version, OS, model and speed/reasoning settings, turn or thread id, prompt timestamp, last successful tool output, first `response_item` timestamp, transport evidence, `time.busy` / `time.idle`, reconnect or stream-close lines, MCP/subagent lifecycle state, stop/interrupt behavior, and whether a new thread or minimal config recovers.
|
|
57
|
+
The fix is to capture Codex app/CLI/extension version, OS/terminal such as WSL, model and speed/reasoning settings, subscription/workspace, turn or thread id, prompt timestamp, whether the prompt is accepted but no output/error/timeout appears, status bar or usage percent, last successful tool output, first `response_item` timestamp, `RUST_LOG`/SSE evidence, transport evidence, `time.busy` / `time.idle`, reconnect or stream-close lines, status incident link or cluster mitigation note, MCP/subagent lifecycle state, stop/Ctrl+C/interrupt behavior, and whether a new thread, logout/login, downgrade, API billing path, or minimal config recovers.
|
|
58
58
|
|
|
59
59
|
## Codex Clipboard Attachment
|
|
60
60
|
|
|
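The `time.busy` / `time.idle` signal described in the thinking-hang entry above can be illustrated with a minimal sketch. It assumes the two durations have already been extracted from a `model_client.stream_responses_api` close line and converted to seconds. The `StreamClose` type, the `looksLikeIdleHang` function, and both thresholds are hypothetical names and values chosen for illustration; they are not part of trace-to-skill.

```typescript
// Hypothetical shape: durations already parsed from a stream close line, in seconds.
interface StreamClose {
  busySeconds: number;
  idleSeconds: number;
}

// Assumed thresholds for illustration only: busy time in the millisecond range
// and idle time in the hundreds of seconds, as the taxonomy text describes.
const MAX_BUSY_SECONDS = 1;
const MIN_IDLE_SECONDS = 100;

// Returns true when the stream did almost no work but stayed open for a long time.
function looksLikeIdleHang(close: StreamClose): boolean {
  return close.busySeconds < MAX_BUSY_SECONDS && close.idleSeconds >= MIN_IDLE_SECONDS;
}

console.log(looksLikeIdleHang({ busySeconds: 0.004, idleSeconds: 312 }));
console.log(looksLikeIdleHang({ busySeconds: 8.2, idleSeconds: 0.5 }));
```

A check like this only flags a candidate; the remaining evidence in the entry (transport, reconnect lines, interrupt behavior) is still needed to tell a hang from ordinary latency.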
@@ -160,6 +160,14 @@ Common signals include `visible context/token usage indicator`, `context-window
|
|
|
160
160
|
|
|
161
161
|
The fix is to capture Codex Desktop version, OS, surface, screenshot or short recording of the chat input area, whether the prior context/token indicator or tooltip was visible before the update, exact UI route where it disappeared, local session metadata showing context/window pressure if available, `/status` output if relevant, compaction timing, whether CLI/TUI still exposes a statusline, and how the missing indicator affects long-session decisions.
|
|
162
162
|
|
|
163
|
+
## Codex Remote Connection Or SSH Workspace Failure
|
|
164
|
+
|
|
165
|
+
Codex Desktop remote connections can fail after the feature exists: the app may not show Settings > Connections, the wrong feature flag may be set, the SSH host may connect but the local tunnel or remote app-server may be unhealthy, the remote filesystem may fail to list, or the remote Codex version/model list may be stale.
|
|
166
|
+
|
|
167
|
+
Common signals include `Remote Development in Codex Desktop App`, `Remote SSH`, remote workspaces as the single source of truth, `remote_connections = true`, `remote_control = true` used as a mistaken flag, Settings > Connections missing, `local tunnel not ready`, stale remote Codex versions, `codex-server` or app-server restarts, `fs/getMetadata` folder listing timeouts, ForwardAgent requirements, proxying Codex API traffic through the local machine, and tmux-like reconnect/resume expectations.
|
|
168
|
+
|
|
169
|
+
The fix is to capture Codex Desktop version, remote Codex CLI/app-server version, local OS, remote OS/architecture, SSH alias from `~/.ssh/config`, whether `[features].remote_connections = true` is set, Settings > Connections visibility, selected host/path, remote workspace path, whether the remote filesystem is the source of truth, exact tunnel/app-server error, codex-server pid and restart result, remote PATH/auth/proxy/API reachability, model list differences versus local, fs/getMetadata or folder listing errors, ForwardAgent/proxy requirements, and whether reconnect/resume or a clean host works.
|
|
170
|
+
|
|
163
171
|
## Codex Subagent Prompt Leakage
|
|
164
172
|
|
|
165
173
|
Codex MultiAgentV2 child agents can fail the task boundary even when the parent asks for isolated children. When `spawn_agent` with `fork_turns: "none"` records the delegated task as an assistant/commentary JSON envelope, or a same-turn parallel child sees a sibling prompt, independent review, QA, and security lanes are no longer independent.
|
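The remote-connection entry above distinguishes `[features].remote_connections = true` from the mistaken `remote_control = true` flag. A minimal sketch of that check follows, assuming the config is available as plain text. It is a line-based scan rather than a real TOML parser, it does not handle dotted keys such as `features.remote_connections`, and it treats `remote_control = true` in any table as the mistaken flag, which is an assumption. `FlagStatus` and `checkRemoteConnectionsFlag` are hypothetical names, not trace-to-skill APIs.

```typescript
// Hypothetical helper: a line-based scan, not a full TOML parser.
type FlagStatus = "enabled" | "mistaken_flag" | "missing";

function checkRemoteConnectionsFlag(configText: string): FlagStatus {
  let inFeatures = false;
  let sawMistakenFlag = false;
  for (const rawLine of configText.split(/\r?\n/)) {
    // Drop trailing comments and surrounding whitespace.
    const line = rawLine.replace(/#.*$/, "").trim();
    if (line.startsWith("[")) {
      // Track whether the current table is [features].
      inFeatures = line === "[features]";
      continue;
    }
    // Assumption: remote_control = true counts as the mistaken flag in any table.
    if (/^remote_control\s*=\s*true$/.test(line)) sawMistakenFlag = true;
    if (inFeatures && /^remote_connections\s*=\s*true$/.test(line)) return "enabled";
  }
  return sawMistakenFlag ? "mistaken_flag" : "missing";
}

console.log(checkRemoteConnectionsFlag("[features]\nremote_connections = true\n"));
console.log(checkRemoteConnectionsFlag("[features]\nremote_control = true\n"));
```

Recording the returned status alongside Settings > Connections visibility keeps the two observations separate in a report.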
package/docs/OPENAI_OSS_BRIEF.md
CHANGED
|
@@ -3,14 +3,14 @@
|
|
|
3
3
|
| Field | Value |
|
|
4
4
|
| --- | --- |
|
|
5
5
|
| Repository | https://github.com/grnbtqdbyx-create/trace-to-skill |
|
|
6
|
-
| Package | trace-to-skill@0.1.
|
|
6
|
+
| Package | trace-to-skill@0.1.98 |
|
|
7
7
|
| License | Apache-2.0 |
|
|
8
8
|
| Codex readiness | ready (100/100) |
|
|
9
|
-
| Benchmark | pass,
|
|
9
|
+
| Benchmark | pass, 43 cases |
|
|
10
10
|
|
|
11
11
|
## Why This Repository Qualifies
|
|
12
12
|
|
|
13
|
-
trace-to-skill helps open-source maintainers adopt Codex safely by turning failed coding-agent runs into evidence-backed rules, reusable workflows, CI gates, and a weekly Codex Issue Radar for live GitHub issue demand. It supports real maintenance work: PR review, issue triage, release quality, MCP risk, prompt-injection defense, privacy-preserving trace sharing, and repeat failure reduction. The repository is ready, scores 100/100 on the local Codex readiness doctor, and ships a deterministic benchmark with
|
|
13
|
+
trace-to-skill helps open-source maintainers adopt Codex safely by turning failed coding-agent runs into evidence-backed rules, reusable workflows, CI gates, and a weekly Codex Issue Radar for live GitHub issue demand. It supports real maintenance work: PR review, issue triage, release quality, MCP risk, prompt-injection defense, privacy-preserving trace sharing, and repeat failure reduction. The repository is ready, scores 100/100 on the local Codex readiness doctor, and ships a deterministic benchmark with 43 public fixture cases.
|
|
14
14
|
|
|
15
15
|
### 500-Character Version
|
|
16
16
|
|
|
@@ -27,10 +27,10 @@ API credits would power optional maintainer workflows on top of the local determ
|
|
|
27
27
|
## Evidence
|
|
28
28
|
|
|
29
29
|
- Public repository: https://github.com/grnbtqdbyx-create/trace-to-skill
|
|
30
|
-
- One-command package: npx trace-to-skill@0.1.
|
|
30
|
+
- One-command package: npx trace-to-skill@0.1.98
|
|
31
31
|
- Open-source license: Apache-2.0
|
|
32
32
|
- Codex readiness doctor: ready, 100/100, 0 failed checks.
|
|
33
|
-
- Public fixture benchmark: pass,
|
|
33
|
+
- Public fixture benchmark: pass, 43 cases.
|
|
34
34
|
- GitHub issue demand mining: issue-map fetches or reads piped GitHub CLI issue JSON, then ranks OpenAI/Codex issues by failure class, comments, reactions, evidence gaps, and Maintainer Roadmap next artifacts.
|
|
35
35
|
- Weekly Codex Issue Radar: init --issue-map-repo owner/name scaffolds a scheduled Action that fetches live GitHub issues and publishes the pain map to the job summary or a stable tracking issue comment.
|
|
36
36
|
- Maintainer control: generated rules are suggestions, evidence is line-linked, and secrets can be redacted before sharing.
|
package/docs/SCORECARD.md
CHANGED
|
@@ -9,7 +9,7 @@ Status: **pass**
|
|
|
9
9
|
| Failed doctor checks | 0 |
|
|
10
10
|
| Critical findings | 0 |
|
|
11
11
|
| Built-in benchmark | pass |
|
|
12
|
-
| Benchmark cases |
|
|
12
|
+
| Benchmark cases | 43 |
|
|
13
13
|
|
|
14
14
|
## Doctor Summary
|
|
15
15
|
|
|
@@ -34,6 +34,7 @@ This benchmark runs the public fixture pack that ships with the repository and p
|
|
|
34
34
|
| Codex selected model differs from actual routed model | `fixtures/codex-model-routing-mismatch.md` | 75 | 2 | 0 | `codex_model_routing_mismatch`, `weak_evidence` | pass |
|
|
35
35
|
| Codex model and runtime latency regression | `fixtures/codex-latency-regression.md` | 75 | 2 | 0 | `codex_latency_regression`, `weak_evidence` | pass |
|
|
36
36
|
| Codex thinking and stream hang | `fixtures/codex-thinking-hang.md` | 75 | 2 | 0 | `codex_thinking_hang`, `weak_evidence` | pass |
|
|
37
|
+
| Codex CLI no-response and command execution hang | `fixtures/codex-cli-no-response.md` | 75 | 2 | 0 | `codex_thinking_hang`, `weak_evidence` | pass |
|
|
37
38
|
| Codex clipboard, paste, and attachment workflow regression | `fixtures/codex-clipboard-attachment.md` | 75 | 2 | 0 | `codex_clipboard_attachment`, `weak_evidence` | pass |
|
|
38
39
|
| Codex deeplink, OAuth callback, and external launch regression | `fixtures/codex-deeplink-launch.md` | 50 | 4 | 0 | `codex_deeplink_launch`, `codex_remote_control`, `hallucinated_file`, `weak_evidence` | pass |
|
|
39
40
|
| Codex app connector auth cache and stale link regression | `fixtures/codex-connector-auth-cache.md` | 75 | 2 | 0 | `codex_connector_auth_cache`, `weak_evidence` | pass |
|
|
@@ -54,9 +55,10 @@ This benchmark runs the public fixture pack that ships with the repository and p
|
|
|
54
55
|
| Codex MCP discovery and config-scope mismatch | `fixtures/codex-mcp-discovery-mismatch.md` | 75 | 2 | 0 | `codex_mcp_discovery_mismatch`, `weak_evidence` | pass |
|
|
55
56
|
| Codex plugin runtime and bundled capability failure | `fixtures/codex-plugin-runtime.md` | 59 | 3 | 0 | `codex_plugin_runtime`, `codex_windows_helper_path`, `weak_evidence` | pass |
|
|
56
57
|
| Codex file tree and workspace navigation UI failure | `fixtures/codex-file-tree-ui.md` | 75 | 2 | 0 | `codex_file_tree_ui`, `weak_evidence` | pass |
|
|
57
|
-
| Codex session resume and state failure | `fixtures/codex-session-state.md` |
|
|
58
|
+
| Codex session resume and state failure | `fixtures/codex-session-state.md` | 75 | 2 | 0 | `codex_session_state`, `weak_evidence` | pass |
|
|
58
59
|
| Codex usage bucket scope and percentage confusion | `fixtures/codex-usage-bucket-confusion.md` | 59 | 3 | 0 | `codex_token_burn`, `codex_usage_bucket_confusion`, `weak_evidence` | pass |
|
|
59
60
|
| Codex context or token usage indicator missing | `fixtures/codex-context-visibility.md` | 75 | 2 | 0 | `codex_context_visibility`, `weak_evidence` | pass |
|
|
61
|
+
| Codex remote connection or SSH workspace failure | `fixtures/codex-remote-connection.md` | 75 | 2 | 0 | `codex_remote_connection`, `weak_evidence` | pass |
|
|
60
62
|
| Codex token burn and usage-drain loop | `fixtures/codex-token-burn.md` | 75 | 2 | 0 | `codex_token_burn`, `weak_evidence` | pass |
|
|
61
63
|
| Codex resource leak and runaway process | `fixtures/codex-resource-leak.md` | 75 | 2 | 0 | `codex_resource_leak`, `weak_evidence` | pass |
|
|
62
64
|
| Codex tool-call integrity and rollback failure | `fixtures/codex-tool-call-integrity.md` | 43 | 4 | 0 | `codex_resource_leak`, `codex_subagent_lifecycle`, `codex_tool_call_integrity`, `weak_evidence` | pass |
|
package/docs/USE_CASES.md
CHANGED
|
@@ -59,7 +59,7 @@ What it proves:
|
|
|
59
59
|
Recommended CI surface:
|
|
60
60
|
|
|
61
61
|
```yaml
|
|
62
|
-
- uses: grnbtqdbyx-create/trace-to-skill@v0.1.
|
|
62
|
+
- uses: grnbtqdbyx-create/trace-to-skill@v0.1.98
|
|
63
63
|
with:
|
|
64
64
|
mode: all
|
|
65
65
|
doctor-threshold: "85"
|
|
@@ -348,6 +348,32 @@ This catches signals such as a missing visible context/token usage indicator, hi
|
|
|
348
348
|
|
|
349
349
|
Include Codex Desktop version, OS, surface, screenshot or short recording of the chat input area, whether the prior context/token indicator or tooltip was visible before the update, exact UI route where it disappeared, local session metadata showing context/window pressure if available, `/status` output if relevant, compaction timing, whether CLI/TUI still exposes a statusline, and how the missing indicator affects long-session decisions.
|
|
350
350
|
|
|
351
|
+
## 18.2. Codex Remote Connection Evidence
|
|
352
|
+
|
|
353
|
+
Use this when Codex Desktop remote SSH workspaces, Settings > Connections, remote app-server, tunnel, model list, or remote filesystem browsing fails.
|
|
354
|
+
|
|
355
|
+
```bash
|
|
356
|
+
npx trace-to-skill demo remote-connection
|
|
357
|
+
npx trace-to-skill codex-report ./runs --output openai-codex-remote-connection.md
|
|
358
|
+
```
|
|
359
|
+
|
|
360
|
+
This catches signals such as `Remote Development in Codex Desktop App`, missing Settings > Connections, `[features].remote_connections = true`, mistaken `remote_control = true`, SSH hosts from `~/.ssh/config`, remote filesystem source-of-truth expectations, `local tunnel not ready`, stale remote Codex versions, `codex-server` restart evidence, `fs/getMetadata` timeouts while listing remote folders, ForwardAgent needs, and local-machine proxy expectations for remote hosts that cannot reach the Codex API directly.
|
|
361
|
+
|
|
362
|
+
Include Codex Desktop version, remote Codex CLI/app-server version, local OS, remote OS/architecture, selected SSH host/path, whether the remote filesystem is the source of truth, exact tunnel/app-server/folder-listing/model-list/auth/proxy error, process evidence such as `ps -ef | rg 'codex app-server|openai.chatgpt.*/codex'` when available, and whether killing codex-server, reinstalling remote Codex, reconnecting, or trying a clean host changes the result.
|
|
363
|
+
|
|
364
|
+
## 18.3. Codex CLI No-Response Evidence
|
|
365
|
+
|
|
366
|
+
Use this when Codex CLI accepts prompts but produces no streaming output, no error, no timeout, or hangs during command execution.
|
|
367
|
+
|
|
368
|
+
```bash
|
|
369
|
+
npx trace-to-skill demo cli-no-response
|
|
370
|
+
npx trace-to-skill codex-report ./runs --output openai-codex-cli-no-response.md
|
|
371
|
+
```
|
|
372
|
+
|
|
373
|
+
This catches signals such as all-model hangs, `gpt-5.4 high - 100% left`, no tokens consumed, simple prompts like `Hello` or `ping` never producing output, `codex exec --sandbox read-only --model gpt-5.3-codex 'ping'`, `mcp startup: no servers`, unhandled responses events, CLI/VS Code stuck on Thinking or Working, terminal command execution hangs, status incidents, unhealthy cluster reroutes, and Ctrl+C or `/exit` delays.
|
|
374
|
+
|
|
375
|
+
Include CLI/app/extension version, OS/terminal/WSL, subscription/workspace, model and reasoning/speed settings, prompt timestamp, exact prompt, whether the prompt was accepted but no stream/error/timeout appeared, status bar or usage percent, `RUST_LOG` SSE snippets, transport, first `response_item` or assistant timestamp if it appears later, reconnect or stream-disconnect lines, status incident link, and recovery attempts such as new thread, downgrade, logout/login, API billing path, or minimal config without MCPs.
|
|
376
|
+
|
|
351
377
|
## 19. Codex File Tree UI Evidence
|
|
352
378
|
|
|
353
379
|
Use this when Codex Desktop cannot reveal project files through the native file tree, folder icon, floating file panel, or built-in preview.
|
package/fixtures/codex-cli-no-response.md
|
@@ -0,0 +1,46 @@
|
|
|
1
|
+
# Codex CLI No-Response Hang
|
|
2
|
+
|
|
3
|
+
Public issue cluster: All models - Codex CLI hangs indefinitely on all prompts, no response generated.
|
|
4
|
+
|
|
5
|
+
## Symptoms
|
|
6
|
+
|
|
7
|
+
- Codex CLI accepts prompts and displays them, but no streaming output begins.
|
|
8
|
+
- All models tested, including `gpt-5.4 high`, `gpt-5.3-codex`, and `gpt-5.1-codex-max`, show no response, no error, and no timeout.
|
|
9
|
+
- The status bar remains `gpt-5.4 high - 100% left`; no tokens are being consumed while the prompt is stuck.
|
|
10
|
+
- A `status.openai.com/incidents` status incident note says Codex CLI hanging or no response may come from unhealthy clusters and rerouted traffic.
|
|
11
|
+
- Simple greetings, questions, codebase analysis, and every message hang the same way.
|
|
12
|
+
- In another report, Codex hangs during terminal command execution; basic shell commands get stuck, it does half the job then stuck, and the VS Code client remains on Thinking or Working.
|
|
13
|
+
- `/exit`, Stop, or Ctrl+C does not respond for minutes, so the user has to kill the Codex process.
|
|
14
|
+
|
|
15
|
+
## Minimal Reproduction
|
|
16
|
+
|
|
17
|
+
The `codex exec --sandbox read-only --model gpt-5.3-codex 'ping'` run has no output and hangs after MCP startup with `unhandled responses event` SSE lines:
|
|
18
|
+
|
|
19
|
+
```bash
|
|
20
|
+
RUST_LOG='codex_api::sse::responses=trace' codex exec --sandbox read-only --model gpt-5.3-codex 'ping'
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
Observed output stops after:
|
|
24
|
+
|
|
25
|
+
```text
|
|
26
|
+
mcp startup: no servers
|
|
27
|
+
unhandled responses event: response.in_progress
|
|
28
|
+
unhandled responses event: response.content_part.added
|
|
29
|
+
unhandled responses event: response.output_text.done
|
|
30
|
+
unhandled responses event: response.content_part.done
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
## Status and Recovery Notes
|
|
34
|
+
|
|
35
|
+
- A collaborator note linked a status incident and said unhealthy clusters were rerouted.
|
|
36
|
+
- Users still reported Codex is down, Reconnecting, stream disconnected before completion, and no response in both CLI and VS Code after the incident note.
|
|
37
|
+
- Downgrading, starting a new thread, logout/login, API billing path, or a minimal config without MCPs should be recorded as separate recovery attempts.
|
|
38
|
+
|
|
39
|
+
## Evidence Checklist
|
|
40
|
+
|
|
41
|
+
- CLI/app/extension version and whether it is Terminal, WSL, VS Code, or Desktop.
|
|
42
|
+
- OS, shell, subscription/workspace, selected model, reasoning effort, and speed/service tier.
|
|
43
|
+
- Prompt timestamp, exact prompt, and whether the prompt is accepted but no streaming output, error, or timeout appears.
|
|
44
|
+
- Status bar or usage percent such as `100% left`, first assistant timestamp if it eventually appears, and whether tokens were consumed.
|
|
45
|
+
- `RUST_LOG` SSE snippets, transport (`responses_http` or websocket), first `response_item`, unhandled responses events, reconnect or stream-disconnect lines, and status incident link.
|
|
46
|
+
- Stop, Ctrl+C, `/exit`, forced kill, new thread, downgrade, logout/login, API billing path, and minimal-config recovery results.
|
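The minimal reproduction in the fixture above stops after `mcp startup: no servers` and a run of `unhandled responses event` lines. As a rough sketch, a captured log could be summarized as follows, assuming the lines resemble the fixture excerpt (real `RUST_LOG` output may carry timestamps or module prefixes, which this scan tolerates only before the marker text). `HangSignals` and `scanCapturedLog` are hypothetical names introduced here, not part of trace-to-skill or the Codex CLI.

```typescript
// Hypothetical summary shape for a captured `codex exec` log.
interface HangSignals {
  sawMcpStartup: boolean;
  unhandledEvents: string[];
}

function scanCapturedLog(logText: string): HangSignals {
  const signals: HangSignals = { sawMcpStartup: false, unhandledEvents: [] };
  const marker = "unhandled responses event:";
  for (const rawLine of logText.split(/\r?\n/)) {
    const line = rawLine.trim();
    const markerAt = line.indexOf(marker);
    if (line.includes("mcp startup:")) {
      signals.sawMcpStartup = true;
    } else if (markerAt >= 0) {
      // Keep only the event name that follows the marker.
      signals.unhandledEvents.push(line.slice(markerAt + marker.length).trim());
    }
  }
  return signals;
}

// Sample input copied from the fixture's observed output.
const captured = [
  "mcp startup: no servers",
  "unhandled responses event: response.in_progress",
  "unhandled responses event: response.content_part.added",
  "unhandled responses event: response.output_text.done",
  "unhandled responses event: response.content_part.done",
].join("\n");

console.log(scanCapturedLog(captured));
```

Attaching the list of event names to a report, together with the prompt timestamp, makes it easier to see where the stream stopped producing visible output.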