trace-to-skill 0.1.93 → 0.1.95

@@ -1,10 +1,10 @@
  # GitHub Issue Pain Map
 
- Generated: 2026-06-01T01:41:36.247Z
+ Generated: 2026-06-01T02:57:23.552Z
 
- Issues analyzed: **78**
- Matched issues: **43**
- Unmatched issues: **35**
+ Issues analyzed: **11**
+ Matched issues: **10**
+ Unmatched issues: **1**
 
  This report maps GitHub issues onto deterministic `trace-to-skill` failure classes. Fetch a repository directly with `--repo`, or export issues with `gh issue list` / `gh search issues` and pass the JSON file.
 
@@ -12,97 +12,89 @@ This report maps GitHub issues onto deterministic `trace-to-skill` failure class
  trace-to-skill issue-map --repo openai/codex --output codex-issue-map.md
  gh issue list --repo openai/codex --state open --limit 100 --json number,title,body,url,labels,comments,createdAt,updatedAt > codex-issues.json
  trace-to-skill issue-map codex-issues.json --output codex-issue-map.md
+ gh issue list --repo openai/codex --state all --limit 100 --json number,title,body,url,labels,comments,updatedAt | trace-to-skill issue-map - --format json
  ```
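The JSON exported by the `gh issue list` commands above can be inspected before it is handed to `issue-map`. A minimal Python sketch, assuming the export is a JSON array in which each issue carries `number`, `title`, and a `comments` list with one entry per comment; `rank_by_comments` is a hypothetical helper, not part of `trace-to-skill`, and the sample data is illustrative:

```python
import json


def rank_by_comments(issues, limit=5):
    """Return (number, title, comment_count) tuples, most-discussed first."""
    rows = []
    for issue in issues:
        comments = issue.get("comments", [])
        # Assumption: `comments` is a list of comment objects; accept a plain count too.
        count = len(comments) if isinstance(comments, list) else int(comments)
        rows.append((issue["number"], issue["title"], count))
    return sorted(rows, key=lambda row: row[2], reverse=True)[:limit]


# Illustrative sample shaped like the assumed export (not real issue data).
sample = json.loads("""[
  {"number": 1, "title": "first", "comments": [{"body": "a"}, {"body": "b"}]},
  {"number": 2, "title": "second", "comments": []},
  {"number": 3, "title": "third", "comments": [{"body": "c"}]}
]""")
ranked = rank_by_comments(sample)
```

For a real export, load the file with `json.load(open("codex-issues.json"))` and pass the result to `rank_by_comments`.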
 
  ## Top Clusters
 
  | Priority | Kind | Severity | Issues | Comments | Reactions | Example |
  | ---: | --- | --- | ---: | ---: | ---: | --- |
- | 2286 | `codex_token_burn` | high | 7 | 843 | 683 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) | |
- | 1694 | `weak_evidence` | medium | 78 | 3327 | 5229 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) | |
- | 1111 | `premature_completion` | high | 10 | 310 | 347 | [#3962 Play a sound when Codex finishes a prompt / task](https://github.com/openai/codex/issues/3962) | |
- | 884 | `sensitive_file_access` | high | 1 | 75 | 396 | [#2847 A way to exclude sensitive files](https://github.com/openai/codex/issues/2847) | |
- | 564 | `codex_remote_compact` | high | 6 | 211 | 143 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) | |
- | 514 | `codex_tool_call_integrity` | high | 2 | 103 | 192 | [#2998 IDE-integrated diff / approval](https://github.com/openai/codex/issues/2998) | |
- | 481 | `context_compaction` | high | 5 | 186 | 119 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) | |
- | 393 | `codex_windows_helper_path` | high | 5 | 148 | 94 | [#18258 Codex app on macOS shows 'Computer Use plugin unavailable'](https://github.com/openai/codex/issues/18258) | |
- | 391 | `sandbox_permission` | high | 5 | 160 | 87 | [#10601 Sandbox setup error on Windows](https://github.com/openai/codex/issues/10601) | |
- | 298 | `codex_latency_regression` | high | 4 | 105 | 73 | [#24422 GPT-5.5 Fast suddenly feels as slow as Standard, with long thinking/context/search stalls](https://github.com/openai/codex/issues/24422) | |
- | 287 | `codex_approval_friction` | high | 3 | 96 | 77 | [#4212 Windows approval “Allow for this session” isn’t remembered](https://github.com/openai/codex/issues/4212) | |
- | 249 | `codex_plugin_runtime` | high | 3 | 86 | 63 | [#18258 Codex app on macOS shows 'Computer Use plugin unavailable'](https://github.com/openai/codex/issues/18258) | |
+ | 1051 | `codex_token_burn` | high | 2 | 918 | 53 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
+ | 409 | `codex_auth_verification` | high | 2 | 346 | 18 | [#20161 Phone number verification doesn't work](https://github.com/openai/codex/issues/20161) |
+ | 304 | `codex_model_routing_mismatch` | high | 3 | 231 | 18 | [#11189 GPT-5.3-Codex being routed to GPT-5.2](https://github.com/openai/codex/issues/11189) |
+ | 234 | `weak_evidence` | medium | 11 | 1641 | 114 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
+ | 202 | `premature_completion` | high | 1 | 169 | 8 | [#1243 "Sign in With ChatGPT" functionality needs to be robust against all account types](https://github.com/openai/codex/issues/1243) |
+ | 137 | `codex_remote_compact` | high | 1 | 90 | 15 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) |
+ | 137 | `context_compaction` | high | 1 | 90 | 15 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) |
+ | 88 | `codex_mcp_discovery_mismatch` | high | 1 | 55 | 8 | [#6465 MCP servers not detected in Codex VS Code extension but working in Codex CLI](https://github.com/openai/codex/issues/6465) |
+ | 22 | `codex_usage_bucket_confusion` | high | 1 | 1 | 2 | [#25471 Codex usage popover shows confusing remaining percentages for 5h vs weekly buckets](https://github.com/openai/codex/issues/25471) |
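The priority scores of the high-severity rows in the table above are consistent with a simple linear combination: `7 + 10 × issues + comments + 2 × reactions`. This is an inference from the published numbers, not a documented formula, and the medium-severity `weak_evidence` row does not fit it. A short Python check over values copied from the added rows:

```python
def inferred_priority(issues, comments, reactions):
    # Inferred from the table values; not taken from trace-to-skill's source.
    return 7 + 10 * issues + comments + 2 * reactions


# (kind, priority, issues, comments, reactions) for the high-severity rows.
high_rows = [
    ("codex_token_burn", 1051, 2, 918, 53),
    ("codex_auth_verification", 409, 2, 346, 18),
    ("codex_model_routing_mismatch", 304, 3, 231, 18),
    ("premature_completion", 202, 1, 169, 8),
    ("codex_remote_compact", 137, 1, 90, 15),
    ("context_compaction", 137, 1, 90, 15),
    ("codex_mcp_discovery_mismatch", 88, 1, 55, 8),
    ("codex_usage_bucket_confusion", 22, 1, 1, 2),
]
mismatches = [
    kind
    for kind, priority, issues, comments, reactions in high_rows
    if inferred_priority(issues, comments, reactions) != priority
]

# The medium-severity row (234 | 11 | 1641 | 114) follows some other rule.
weak_evidence_fits = inferred_priority(11, 1641, 114) == 234
```

An empty `mismatches` list means every high-severity row matches the inferred expression; how `weak_evidence` is scored cannot be determined from this diff.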
+
+ ## Maintainer Roadmap
+
+ | Rank | Next artifact | Why now | Command |
+ | ---: | --- | --- | --- |
+ | 1 | Usage evidence fixture and support-ready token report | 2 issue(s), 918 comment(s), severity high; top signal: codex_token_burn. | `trace-to-skill usage-evidence ./usage-notes.md --output usage-evidence.md` |
+ | 2 | Auth verification fixture and login support report | 2 issue(s), 346 comment(s), severity high; top signal: codex_auth_verification. | `trace-to-skill codex-report ./runs --output openai-codex-auth-issue.md` |
+ | 3 | Model-routing fixture and SSE evidence report | 3 issue(s), 231 comment(s), severity high; top signal: codex_model_routing_mismatch. | `trace-to-skill codex-report ./runs --output openai-codex-model-routing.md` |
+ | 4 | Codex-ready issue report and failure fixture | 1 issue(s), 169 comment(s), severity high; top signal: premature_completion. | `trace-to-skill codex-report ./runs --output openai-codex-issue.md` |
+ | 5 | Compaction/session regression fixture and Codex issue report | 1 issue(s), 90 comment(s), severity high; top signal: codex_remote_compact. | `trace-to-skill codex-report ./runs --output openai-codex-issue.md` |
 
  ## Suggested Next Actions
 
  ### codex_token_burn
 
- Priority score: 2286. 7 issue(s), 843 comment(s).
+ Priority score: 1051. 2 issue(s), 918 comment(s).
 
  Example issues:
  - [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) (593 comments; labels: bug, rate-limits)
- - [#19464 Support 1M token context for GPT-5.5 in Codex](https://github.com/openai/codex/issues/19464) (132 comments; labels: enhancement, context)
- - [#19585 Pro weekly usage limit depletes unusually fast on 5.5, worsened by unstable context compaction](https://github.com/openai/codex/issues/19585) (25 comments; labels: bug, rate-limits, context)
+ - [#13568 Usage dropping too quickly](https://github.com/openai/codex/issues/13568) (325 comments; labels: bug, rate-limits)
 
  Evidence rule prompts:
  - When reporting Codex token burn, capture plan/workspace, client and version, model and reasoning/speed settings, fast-mode/large-context/subagent/review flags, recent /status and usage-dashboard deltas, local token totals including cached input/output/reasoning if available, background process ids and write_stdin poll cadence, compaction attempts and failures, retry/tool-loop counts, whether the app was idle, and a minimal reproduction with before/after usage percentages.
 
- ### weak_evidence
+ ### codex_auth_verification
 
- Priority score: 1694. 78 issue(s), 3327 comment(s).
+ Priority score: 409. 2 issue(s), 346 comment(s).
 
  Example issues:
- - [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) (593 comments; labels: bug, rate-limits)
- - [#19464 Support 1M token context for GPT-5.5 in Codex](https://github.com/openai/codex/issues/19464) (132 comments; labels: enhancement, context)
- - [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) (90 comments; labels: bug, context)
+ - [#20161 Phone number verification doesn't work](https://github.com/openai/codex/issues/20161) (177 comments; labels: bug, auth)
+ - [#1243 "Sign in With ChatGPT" functionality needs to be robust against all account types](https://github.com/openai/codex/issues/1243) (169 comments; labels: none)
 
  Evidence rule prompts:
- - Final responses must include the exact validation evidence used to prove the change, not only a summary of intent.
+ - When reporting Codex sign-in or account-verification failures, capture the Codex app/CLI/extension version, surface, OS, account type without secrets, workspace or organization context, SSO provider, whether the flow is ChatGPT sign-in, phone/SMS/OTP verification, or extension chat initialization, exact redacted error text, timestamps, whether another device/browser/account works, logout/login attempts, and screenshots with phone numbers, tokens, and email addresses redacted.
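The redaction requirement in the prompt above can be partly automated for text evidence. A minimal Python sketch, assuming simple regular expressions are acceptable as a first pass; the patterns and the `redact` helper are illustrative, not part of `trace-to-skill`. They do not cover tokens and will miss unusual formats, so redacted output still needs a manual review:

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumption: a phone number is an optional "+" followed by 10 or more digits,
# optionally separated by single spaces, dots, or dashes.
PHONE = re.compile(r"\+?\d(?:[ .-]?\d){9,}")


def redact(text):
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)


# Illustrative input, not taken from a real report.
line = "OTP not received for +1 555-010-9999, account jane.doe@example.com"
redacted = redact(line)
```

The 10-digit minimum is a deliberate choice: it leaves ISO dates such as `2026-06-01` untouched, which matters because the same prompt asks for timestamps to be kept.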
 
- ### premature_completion
+ ### codex_model_routing_mismatch
 
- Priority score: 1111. 10 issue(s), 310 comment(s).
+ Priority score: 304. 3 issue(s), 231 comment(s).
 
  Example issues:
- - [#3962 Play a sound when Codex finishes a prompt / task](https://github.com/openai/codex/issues/3962) (50 comments; labels: enhancement, extension)
- - [#7291 Bug report: VSCode extension failed to revert the changes](https://github.com/openai/codex/issues/7291) (42 comments; labels: bug, extension)
- - [#18341 Mac app shows persistent blurred/translucent overlay below composer](https://github.com/openai/codex/issues/18341) (34 comments; labels: bug, app)
+ - [#11189 GPT-5.3-Codex being routed to GPT-5.2](https://github.com/openai/codex/issues/11189) (169 comments; labels: bug, CLI)
+ - [#11561 GPT-5.3-Codex being routed to GPT-5.2](https://github.com/openai/codex/issues/11561) (47 comments; labels: bug, CLI)
+ - [#11842 GPT-5.3-Codex being routed to GPT-5.2](https://github.com/openai/codex/issues/11842) (15 comments; labels: bug, CLI)
 
  Evidence rule prompts:
- - Before claiming completion, run the relevant validation command or clearly state the exact validation that could not be run and why.
+ - When reporting Codex model-routing mismatches, capture the Codex app/CLI/extension version, subscription/workspace, selected model from config.toml, TUI, command flag, or UI, actual server-side model from SSE `response.created` / `response.model`, the exact `RUST_LOG` or trace command used, timestamp, account or verification state without secrets, whether API and Codex routes differ, whether a warning/fallback notice appeared, and a minimal one-prompt reproduction with redacted logs.
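The comparison this prompt asks for, selected model versus the model named in the SSE `response.created` event, can be sketched in Python. The event layout below (an `event:` line followed by a `data:` line whose JSON carries `response.model`) is an assumption about the captured log; `served_model` is a hypothetical helper, and the sample stream is illustrative rather than a real trace:

```python
import json


def served_model(sse_text):
    """Return response.model from the first response.created event, or None."""
    event = None
    for raw in sse_text.splitlines():
        line = raw.strip()
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:") and event == "response.created":
            payload = json.loads(line[len("data:"):].strip())
            return payload.get("response", {}).get("model")
    return None


# Illustrative stream in the assumed layout.
sample = (
    "event: response.created\n"
    'data: {"type": "response.created", "response": {"model": "gpt-5.2"}}\n'
    "\n"
)
selected = "gpt-5.3-codex"  # e.g. the value set in config.toml or passed via --model
actual = served_model(sample)
mismatch = actual is not None and actual != selected
```

Returning `None` when no `response.created` event is found keeps "no evidence captured" distinct from a confirmed mismatch.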
 
- ### sensitive_file_access
+ ### premature_completion
 
- Priority score: 884. 1 issue(s), 75 comment(s).
+ Priority score: 202. 1 issue(s), 169 comment(s).
 
  Example issues:
- - [#2847 A way to exclude sensitive files](https://github.com/openai/codex/issues/2847) (75 comments; labels: enhancement, sandbox)
+ - [#1243 "Sign in With ChatGPT" functionality needs to be robust against all account types](https://github.com/openai/codex/issues/1243) (169 comments; labels: none)
 
  Evidence rule prompts:
- - Before running an agent, exclude sensitive files such as .env, private keys, package auth files, cloud credentials, local databases, and production secret manifests; share only minimal redacted excerpts when maintainer-approved.
+ - Before claiming completion, run the relevant validation command or clearly state the exact validation that could not be run and why.
 
  ### codex_remote_compact
 
- Priority score: 564. 6 issue(s), 211 comment(s).
+ Priority score: 137. 1 issue(s), 90 comment(s).
 
  Example issues:
  - [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) (90 comments; labels: bug, context)
- - [#9211 Error running remote compact task: timeout waiting for child process to exit](https://github.com/openai/codex/issues/9211) (27 comments; labels: bug, context)
- - [#10823 Unable to compact the context in a VERY long running session](https://github.com/openai/codex/issues/10823) (26 comments; labels: bug, context)
 
  Evidence rule prompts:
  - When reporting Codex remote compact failures, capture app/CLI/extension version, OS, model and reasoning/speed mode, provider config without secrets, exact /compact or auto-compact error, `responses/compact` endpoint shape, timeout values such as tcp_user_timeout or stream_idle_timeout_ms, context/token level before compaction, whether lowering reasoning/speed changes behavior, whether local fallback or a new session recovers, and related thread/feedback ids.
 
  ## Unmatched Issues
 
- - [#16231 High CPU usage on macOS after updating Codex in VS Code extension to 26.325.31654](https://github.com/openai/codex/issues/16231) (71 comments; labels: bug, extension, regression, performance)
- - [#13041 WebSocket upgrade succeeds then server closes with 1008 Policy (falls back to HTTPS)](https://github.com/openai/codex/issues/13041) (70 comments; labels: bug, connectivity)
- - [#11023 Codex desktop app for Linux](https://github.com/openai/codex/issues/11023) (68 comments; labels: enhancement, app)
- - [#13993 Support standalone Windows installer (`codex-setup.exe`)](https://github.com/openai/codex/issues/13993) (58 comments; labels: enhancement, windows-os, app, User Request, Feature)
- - [#8745 LSP integration (auto-detect + auto-install) for Codex CLI](https://github.com/openai/codex/issues/8745) (52 comments; labels: enhancement, agent)
- - [#12661 Markdown file:// links open in default browser (Edge) instead of VS Code editor](https://github.com/openai/codex/issues/12661) (46 comments; labels: bug, windows-os, extension)
- - [#9203 Please make "/undo" back](https://github.com/openai/codex/issues/9203) (46 comments; labels: enhancement, TUI, session)
- - [#6020 MCP client for `X` failed to start: handshaking with MCP server failed: connection closed: initialize response](https://github.com/openai/codex/issues/6020) (40 comments; labels: bug, mcp)
- - [#16857 High GPU usage while the app is “thinking” due to tiny useless animation](https://github.com/openai/codex/issues/16857) (36 comments; labels: bug, app, performance)
- - [#3141 Allow GPU access inside sandbox](https://github.com/openai/codex/issues/3141) (35 comments; labels: enhancement, sandbox)
- - [#3355 Error sending request for url (https://chatgpt.com/backend-api/codex/responses) after macbook sleeps](https://github.com/openai/codex/issues/3355) (35 comments; labels: bug, connectivity)
- - [#2153 ChatGPT integration](https://github.com/openai/codex/issues/2153) (33 comments; labels: enhancement, app, User Request, Feature)
-
+ - [#99999 Add a fun launch animation](https://github.com/openai/codex/issues/99999) (0 comments; labels: enhancement)
@@ -32,11 +32,13 @@ npx trace-to-skill lsp-audit . --format json
  | Process polling and high-CPU evidence | Windows `powershell.exe` / `pwsh` child process loops, `Get-CimInstance Win32_Process`, `Win32_PerfFormattedData_PerfProc_Process`, stale `chat_processes.json`, helper/renderer CPU samples | process audit receipt | `trace-to-skill process-audit ./process-notes.md` |
  | Tool-call integrity and rollback failures | `apply_patch` overwrites an existing `Add File` target, unmatched `tool_call_id`, `close_agent` hangs, failed revert/undo, unsafe diff application | `codex_tool_call_integrity` | `trace-to-skill codex-report ./runs` or `trace-to-skill guard-patch ./change.patch --root .` |
  | Undo, rewind, and pre-agent checkpoint needs | users want `/undo` or `/rewind`, double-Esc only rewinds chat state, untracked/gitignored files are not protected by commits, and manual recovery needs reviewable pre-agent evidence | workspace checkpoint bundle | `trace-to-skill checkpoint . --output .trace-to-skill/checkpoints/before-codex` before agent work |
+ | Model routing mismatch | selected `gpt-5.3-codex` in `config.toml`, TUI, or `--model`, but SSE `response.created` / `response.model` shows `gpt-5.2`, silent fallback, no warning, no fallback notice | `codex_model_routing_mismatch` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo model-routing-mismatch` |
  | Latency regressions | GPT-5.5 Fast feels like Standard, simple tasks take 10-20+ minutes, pre-first-token or thinking stalls, slow search/read/compaction, hours for small code changes | `codex_latency_regression` | `trace-to-skill codex-report ./runs` |
  | Thinking or stream hangs | accepted turn, completed local tool output, no streamed assistant follow-up, long gap before first `response_item`, `time.busy` milliseconds with `time.idle` hundreds of seconds, Stop/Ctrl+C cannot interrupt, subagent parent stuck | `codex_thinking_hang` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo thinking-hang` |
  | Clipboard, paste, and generated attachment regressions | `Copy as Markdown` missing, Copy menu only exports metadata, long pasted prompts become `Pasted text.txt`, generated attachments cannot preview/edit/revert, `/goal` ignores non-empty fileAttachments | `codex_clipboard_attachment` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo clipboard-attachment` |
  | Deeplink, OAuth callback, and external launch regressions | `codex://oauth_callback?code=...` fails, `Unable to find Electron app`, `app\oauth_callback?code=...`, notification `type=click&tag=...` becomes an app path, AppX/MSIX protocol evidence, `codex app .` only focuses | `codex_deeplink_launch` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo deeplink-launch` |
  | App connector auth cache and stale link regressions | `401 Reauthentication required`, `refresh token was revoked`, stale `link_*`, `isAccessible: false`, `codex_apps_tools` or `codex_app_directory` cache regeneration keeps broken connector state | `codex_connector_auth_cache` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo connector-auth-cache` |
+ | Sign-in and account verification failures | phone number verification fails, SMS/OTP code not received, `invalid_phone_number`, `Sign in With ChatGPT` account-type routing, SSO/workspace/organization verification confusion, VS Code extension `Error starting conversation` during chat initialization | `codex_auth_verification` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo auth-verification` |
  | Context fork bloat and prompt-cache lineage loss | conversation fork carries full parent transcript, duplicate context blocks after fork boundary, `input_tokens` and `cached_input_tokens` jump before new files are read, `prompt_cache_key` changes despite inherited content, cache hit rate drops, `fork_context` history leaks into child context | `codex_context_fork_bloat` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo context-fork-bloat` |
  | Subagent prompt leakage and child-task boundary failure | `spawn_agent` with `fork_turns: "none"` records the delegated task as assistant/commentary, child rollout contains `recipient` / `trigger_turn` JSON envelopes, same-turn parallel children see sibling prompts, workers call tools from leaked sibling prompts, `wait_agent` or `close_agent` completes despite wrong task | `codex_subagent_prompt_leakage` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo subagent-prompt-leakage` |
  | MCP discovery and config-scope mismatches | CLI `/mcp` works but VS Code/Desktop has no `mcp__*` tools, project `.codex/config.toml` ignored, `codex mcp get` says `No MCP server named`, WSL opens Windows `config.toml`, `CODEX_HOME` differs | `codex_mcp_discovery_mismatch` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo mcp-discovery-mismatch` |
@@ -53,9 +55,11 @@ npx trace-to-skill lsp-audit . --format json
  | LSP auto-detect readiness | Codex users want language-aware navigation, diagnostics, references, rename, or install guidance before edits | language-server metadata | `trace-to-skill lsp-audit . --format json` |
  | Context compaction failures | `Error running remote compact task`, `context_length_exceeded`, compaction loops, `responses/compact` stream disconnects | `context_compaction` | `trace-to-skill analyze ./runs` |
  | Latest-turn drift | Codex answers an older prompt, repeats a previous response, redoes an already fixed task, forgets recent edits after compaction, or leaks raw tool payload text | `codex_latest_turn_drift` | `trace-to-skill codex-report ./runs` |
+ | Model routing mismatch | selected model differs from actual server-side `response.model`, silent fallback, or missing model-unavailable warning | `codex_model_routing_mismatch` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo model-routing-mismatch` |
  | Session resume, project history, and state failures | `codex resume` picker freezes, Desktop project/search/sidebar hides existing threads, large rollout JSONL, short or stale `session_index.jsonl`, transcript-like sidebar title chunks, unindexed rollout thread ids, `Could not load archived chats`, `state_5.sqlite`, `thread_goals` | `codex_session_state` | `trace-to-skill codex-report ./runs` or `trace-to-skill session-audit ~/.codex --format json` |
  | Sandbox and permission blockers | Windows sandbox setup refresh, `os error 740`, ACL/ownership drift, approval-mode mismatch | `sandbox_permission` | `trace-to-skill analyze ./runs` |
  | Auth and connectivity failures | `token_exchange_failed`, `auth.openai.com/oauth/token`, missing CA certificates, proxy/TLS, IPv6, Cloudflare, stream disconnects | `codex_connectivity` | `trace-to-skill codex-report ./runs` |
+ | Sign-in and account verification failures | phone verification, SMS/OTP, SSO, ChatGPT sign-in account routing, organization/workspace verification, extension chat initialization | `codex_auth_verification` | `trace-to-skill codex-report ./runs` or `trace-to-skill demo auth-verification` |
  | Remote-control routing failures | `Waiting for desktop`, `Directory: Unavailable`, stale listener/enrollment, `127.0.0.1:14567`, empty backend environments | `codex_remote_control` | `trace-to-skill codex-report ./runs` |
  | MCP runtime failures | `user cancelled MCP tool call`, `unsupported call: mcp__...__...`, namespace/serverName loss, `Transport closed` | `codex_mcp_runtime` | `trace-to-skill codex-report ./runs` |
  | Plugin runtime and bundled capability failures | Computer Use native pipe path unavailable, Browser/Computer Use settings fail, plugin/list `unknown variant 'vertical'`, stale plugin cache downgrades | `codex_plugin_runtime` | `trace-to-skill codex-report ./runs` |
@@ -1,10 +1,10 @@
  # GitHub Issue Pain Map
 
- Generated: 2026-06-01T02:37:07.997Z
+ Generated: 2026-06-01T02:57:24.901Z
 
  Issues analyzed: **46**
- Matched issues: **15**
- Unmatched issues: **31**
+ Matched issues: **19**
+ Unmatched issues: **27**
 
  This report maps GitHub issues onto deterministic `trace-to-skill` failure classes. Fetch a repository directly with `--repo`, or export issues with `gh issue list` / `gh search issues` and pass the JSON file.
 
@@ -20,17 +20,17 @@ gh issue list --repo openai/codex --state all --limit 100 --json number,title,bo
  | Priority | Kind | Severity | Issues | Comments | Reactions | Example |
  | ---: | --- | --- | ---: | ---: | ---: | --- |
  | 2438 | `codex_token_burn` | high | 4 | 1151 | 620 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
- | 2221 | `weak_evidence` | medium | 46 | 4755 | 7792 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
+ | 2221 | `weak_evidence` | medium | 46 | 4755 | 7793 | [#14593 Burning tokens very fast](https://github.com/openai/codex/issues/14593) |
  | 884 | `sensitive_file_access` | high | 1 | 75 | 396 | [#2847 A way to exclude sensitive files](https://github.com/openai/codex/issues/2847) |
+ | 805 | `codex_auth_verification` | high | 3 | 436 | 166 | [#20161 Phone number verification doesn't work](https://github.com/openai/codex/issues/20161) |
  | 442 | `codex_tool_call_integrity` | high | 1 | 61 | 182 | [#2998 IDE-integrated diff / approval](https://github.com/openai/codex/issues/2998) |
  | 376 | `codex_remote_compact` | high | 2 | 147 | 101 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) |
  | 376 | `context_compaction` | high | 2 | 147 | 101 | [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) |
  | 351 | `codex_terminal_output_integrity` | high | 1 | 66 | 134 | [#2558 Codex client output truncated when scrolling in Zellij](https://github.com/openai/codex/issues/2558) |
+ | 324 | `codex_model_routing_mismatch` | high | 1 | 169 | 69 | [#11189 GPT-5.3-Codex being routed to GPT-5.2](https://github.com/openai/codex/issues/11189) |
  | 261 | `premature_completion` | high | 1 | 60 | 92 | [#2448 Codex CLI: Plus users hitting usage limits extremely quickly compared to competitors](https://github.com/openai/codex/issues/2448) |
  | 247 | `codex_connectivity` | high | 2 | 192 | 14 | [#12764 The codex cli giving: 401 unauthorized](https://github.com/openai/codex/issues/12764) |
  | 181 | `codex_latest_turn_drift` | high | 1 | 58 | 53 | [#8648 Codex replies to earlier messages instead of latest one in conversations](https://github.com/openai/codex/issues/8648) |
- | 168 | `codex_resource_leak` | high | 1 | 97 | 27 | [#10432 High GPU usage (70–90%) on macOS with Codex app](https://github.com/openai/codex/issues/10432) |
- | 128 | `codex_mcp_discovery_mismatch` | high | 1 | 55 | 28 | [#6465 MCP servers not detected in Codex VS Code extension (but working in Codex CLI)](https://github.com/openai/codex/issues/6465) |
 
  ## Maintainer Roadmap
 
@@ -38,9 +38,9 @@ gh issue list --repo openai/codex --state all --limit 100 --json number,title,bo
  | ---: | --- | --- | --- |
  | 1 | Usage evidence fixture and support-ready token report | 4 issue(s), 1151 comment(s), severity high; top signal: codex_token_burn. | `trace-to-skill usage-evidence ./usage-notes.md --output usage-evidence.md` |
  | 2 | Privacy/safety guardrail and redacted support bundle | 1 issue(s), 75 comment(s), severity high; top signal: sensitive_file_access. | `trace-to-skill diagnostics-bundle ~/.codex --output codex-diagnostics` |
- | 3 | Patch safety fixture and pre-agent checkpoint workflow | 1 issue(s), 61 comment(s), severity high; top signal: codex_tool_call_integrity. | `trace-to-skill checkpoint . --output .trace-to-skill/checkpoints/before-codex` |
- | 4 | Compaction/session regression fixture and Codex issue report | 2 issue(s), 147 comment(s), severity high; top signal: codex_remote_compact. | `trace-to-skill codex-report ./runs --output openai-codex-issue.md` |
- | 5 | Compaction/session regression fixture and Codex issue report | 2 issue(s), 147 comment(s), severity high; top signal: context_compaction. | `trace-to-skill codex-report ./runs --output openai-codex-issue.md` |
+ | 3 | Auth verification fixture and login support report | 3 issue(s), 436 comment(s), severity high; top signal: codex_auth_verification. | `trace-to-skill codex-report ./runs --output openai-codex-auth-issue.md` |
+ | 4 | Patch safety fixture and pre-agent checkpoint workflow | 1 issue(s), 61 comment(s), severity high; top signal: codex_tool_call_integrity. | `trace-to-skill checkpoint . --output .trace-to-skill/checkpoints/before-codex` |
+ | 5 | Compaction/session regression fixture and Codex issue report | 2 issue(s), 147 comment(s), severity high; top signal: codex_remote_compact. | `trace-to-skill codex-report ./runs --output openai-codex-issue.md` |
 
  ## Suggested Next Actions
 
@@ -66,28 +66,29 @@ Example issues:
  Evidence rule prompts:
  - Before running an agent, exclude sensitive files such as .env, private keys, package auth files, cloud credentials, local databases, and production secret manifests; share only minimal redacted excerpts when maintainer-approved.
 
- ### codex_tool_call_integrity
+ ### codex_auth_verification
 
- Priority score: 442. 1 issue(s), 61 comment(s).
+ Priority score: 805. 3 issue(s), 436 comment(s).
 
  Example issues:
- - [#2998 IDE-integrated diff / approval](https://github.com/openai/codex/issues/2998) (61 comments; labels: enhancement, extension)
+ - [#20161 Phone number verification doesn't work](https://github.com/openai/codex/issues/20161) (177 comments; labels: bug, auth)
+ - [#1243 "Sign in With ChatGPT" functionality needs to be robust against all account types](https://github.com/openai/codex/issues/1243) (169 comments; labels: none)
+ - [#2841 “Error starting conversation” in new Codex VS Code extension when initializing a chat](https://github.com/openai/codex/issues/2841) (90 comments; labels: bug, windows-os, extension)
 
  Evidence rule prompts:
- - When reporting Codex tool-call integrity failures, capture the exact tool input and output, app/CLI/extension version, OS/IDE, workspace git state, affected file path and whether it already existed or was a symlink, diff before/after, tool_call_id sequence, durable thread state for subagents, rollback/revert attempts, and whether a clean repo reproduction fails the same way.
+ - When reporting Codex sign-in or account-verification failures, capture the Codex app/CLI/extension version, surface, OS, account type without secrets, workspace or organization context, SSO provider, whether the flow is ChatGPT sign-in, phone/SMS/OTP verification, or extension chat initialization, exact redacted error text, timestamps, whether another device/browser/account works, logout/login attempts, and screenshots with phone numbers, tokens, and email addresses redacted.
 
- ### codex_remote_compact
+ ### codex_tool_call_integrity
 
- Priority score: 376. 2 issue(s), 147 comment(s).
+ Priority score: 442. 1 issue(s), 61 comment(s).
 
  Example issues:
- - [#14860 Error running remote compact task](https://github.com/openai/codex/issues/14860) (90 comments; labels: bug, context)
- - [#9544 Error running remote compact task: stream disconnected before completion](https://github.com/openai/codex/issues/9544) (57 comments; labels: bug, context)
+ - [#2998 IDE-integrated diff / approval](https://github.com/openai/codex/issues/2998) (61 comments; labels: enhancement, extension)
 
  Evidence rule prompts:
- - When reporting Codex remote compact failures, capture app/CLI/extension version, OS, model and reasoning/speed mode, provider config without secrets, exact /compact or auto-compact error, `responses/compact` endpoint shape, timeout values such as tcp_user_timeout or stream_idle_timeout_ms, context/token level before compaction, whether lowering reasoning/speed changes behavior, whether local fallback or a new session recovers, and related thread/feedback ids.
+ - When reporting Codex tool-call integrity failures, capture the exact tool input and output, app/CLI/extension version, OS/IDE, workspace git state, affected file path and whether it already existed or was a symlink, diff before/after, tool_call_id sequence, durable thread state for subagents, rollback/revert attempts, and whether a clean repo reproduction fails the same way.
 
- ### context_compaction
+ ### codex_remote_compact
 
  Priority score: 376. 2 issue(s), 147 comment(s).
 
@@ -96,19 +97,19 @@ Example issues:
  - [#9544 Error running remote compact task: stream disconnected before completion](https://github.com/openai/codex/issues/9544) (57 comments; labels: bug, context)
 
  Evidence rule prompts:
- - When Codex compaction fails, capture the compact error, model/app version, thread state, and whether the session is recoverable before continuing or reporting success.
+ - When reporting Codex remote compact failures, capture app/CLI/extension version, OS, model and reasoning/speed mode, provider config without secrets, exact /compact or auto-compact error, `responses/compact` endpoint shape, timeout values such as tcp_user_timeout or stream_idle_timeout_ms, context/token level before compaction, whether lowering reasoning/speed changes behavior, whether local fallback or a new session recovers, and related thread/feedback ids.
 
  ## Unmatched Issues
 
  - [#10410 Codex Desktop App: macOS Intel (x86_64) support](https://github.com/openai/codex/issues/10410) (190 comments; labels: enhancement, app)
- - [#20161 Phone number verification doesn't work](https://github.com/openai/codex/issues/20161) (177 comments; labels: bug, auth)
  - [#10450 Remote Development in Codex Desktop App](https://github.com/openai/codex/issues/10450) (176 comments; labels: enhancement, app)
- - [#1243 "Sign in With ChatGPT" functionality needs to be robust against all account types](https://github.com/openai/codex/issues/1243) (169 comments; labels: none)
- - [#11189 GPT-5.3-Codex being routed to GPT-5.2](https://github.com/openai/codex/issues/11189) (169 comments; labels: bug, CLI)
  - [#23794 Codex Desktop no longer shows visible context/token usage indicator](https://github.com/openai/codex/issues/23794) (160 comments; labels: bug, context, app)
  - [#14048 All models — Codex CLI hangs indefinitely on all prompts, no response generated](https://github.com/openai/codex/issues/14048) (131 comments; labels: bug, agent)
  - [#2604 Subagent Support](https://github.com/openai/codex/issues/2604) (103 comments; labels: enhancement, subagent)
- - [#2841 “Error starting conversation” in new Codex VS Code extension when initializing a chat](https://github.com/openai/codex/issues/2841) (90 comments; labels: bug, windows-os, extension)
  - [#12564 Allow renaming task/thread titles to improve history navigation](https://github.com/openai/codex/issues/12564) (77 comments; labels: enhancement, extension)
  - [#2860 Unusable on Windows due to permission ask for every shell command](https://github.com/openai/codex/issues/2860) (77 comments; labels: bug, windows-os)
  - [#2109 Event Hooks](https://github.com/openai/codex/issues/2109) (76 comments; labels: enhancement, hooks)
+ - [#2796 BUG: VSCode IDE Plugin on SSH Connection: "Failed to load tasks."](https://github.com/openai/codex/issues/2796) (71 comments; labels: bug, extension)
+ - [#16231 High CPU usage on macOS after updating Codex in VS Code extension to 26.325.31654](https://github.com/openai/codex/issues/16231) (71 comments; labels: bug, extension, regression, performance)
+ - [#7156 Codex hangs during cli command execution](https://github.com/openai/codex/issues/7156) (70 comments; labels: bug, CLI)
+ - [#4313 Extension for JetBrains IDEs (PyCharm, IntelliJ, etc.)](https://github.com/openai/codex/issues/4313) (70 comments; labels: enhancement)
package/docs/DEMO.md CHANGED
@@ -1,10 +1,10 @@
  # trace-to-skill Demo
 
- Scenario: **Codex subagent lifecycle**
+ Scenario: **Codex selected model differs from actual routed model**
 
- Completed, closed, stale, or interrupted subagents diverge between UI, live registry, persisted state, quota, and parent discoverability.
+ Codex shows one selected model while SSE response evidence shows a different server-side model was used.
 
- Fixture: `fixtures/codex-subagent-lifecycle.md`
+ Fixture: `fixtures/codex-model-routing-mismatch.md`
 
  This is a packaged public fixture, so you can try the project without collecting a private trace first.
 
@@ -14,7 +14,7 @@ This is a packaged public fixture, so you can try the project without collecting
 
  Score: **75/100**
 
- Likely failure class: **Codex subagent lifecycle or state reconciliation failure (codex_subagent_lifecycle, high)**
+ Likely failure class: **Codex selected model differs from actual routed model (codex_model_routing_mismatch, high)**
 
  Agent workflow needs clearer verification, instruction, or security hardening before broad reuse.
 
@@ -23,25 +23,24 @@ Agent workflow needs clearer verification, instruction, or security hardening be
  ```md
  ### What happened?
 
- trace-to-skill detected Codex subagent lifecycle or state reconciliation failure (codex_subagent_lifecycle). When completed, closed, stale, or interrupted subagents remain visible, keep quota slots, lose parent discoverability, or diverge between UI, live registry, and persisted spawn-edge state, long-running Codex sessions become hard to trust or recover.
+ trace-to-skill detected Codex selected model differs from actual routed model (codex_model_routing_mismatch). Silent model fallback, misrouting, or response.model mismatch makes Codex model access, benchmarks, billing expectations, and user trust hard to debug unless reports preserve both the selected model and the actual server-side model evidence.
 
  ### Detected failure class
 
- - codex_subagent_lifecycle: Codex subagent lifecycle or state reconciliation failure (high)
+ - codex_model_routing_mismatch: Codex selected model differs from actual routed model (high)
 
  ### Evidence
 
- #### Codex subagent lifecycle or state reconciliation failure
- - fixtures/codex-subagent-lifecycle.md:16 - Completed or closed subagents remain visible in the Subagents panel.
- - fixtures/codex-subagent-lifecycle.md:17 - The app shows stale subagent cards after close/readback reports no live agent handle.
- - fixtures/codex-subagent-lifecycle.md:18 - The visible subagent count grows very large; the panel can show Show 67 more or 100+ stale entries.
- - fixtures/codex-subagent-lifecycle.md:20 - It is unclear which subagents are active versus stale UI/cache entries.
- - fixtures/codex-subagent-lifecycle.md:27 - thread_spawn_edges status count: closed=549, open=0
- - fixtures/codex-subagent-lifecycle.md:28 - After restarting Codex Desktop multiple times, the Subagents panel still visually shows stale subagent cards.
+ #### Codex selected model differs from actual routed model
+ - fixtures/codex-model-routing-mismatch.md:5 - - GPT-5.3-Codex is being routed to GPT-5.2.
+ - fixtures/codex-model-routing-mismatch.md:6 - - Both `config.toml` and the TUI are set to `gpt-5.3-codex`, but SSE captures show the actual `response.model` is `gpt-5.2-2025-12-11`.
+ - fixtures/codex-model-routing-mismatch.md:7 - - Running `RUST_LOG='codex_tui::chatwidget=info,codex_api::sse::responses=trace' codex` and sending a prompt shows `response.created` with `response.model=gpt-5.2-2025-12-11`.
+ - fixtures/codex-model-routing-mismatch.md:9 - - The user sees no warning or fallback notice that a different model version is being used internally.
+ - fixtures/codex-model-routing-mismatch.md:10 - - Some reports mention ChatGPT Pro, WSL, macOS, recent CLI versions, and verification briefly restoring GPT-5.3-Codex before silently rerouting back to GPT-5.2.
 
  ### Diagnostics to attach
 
- - When reporting Codex subagent lifecycle failures, capture Codex app/CLI/extension version, OS, surface, model, subscription/workspace, root thread id, subagent ids/nicknames/roles, spawn/close/list commands or UI actions, close_agent results, list_agents or /agents output, thread_spawn_edges status counts, agent registry or max_threads/quota evidence, recent-list/sidebar behavior, whether child threads are archived or shown as top-level conversations, last-progress/heartbeat or halt reason, MCP server state for subagents, compaction/resume timing, screenshot or redacted UI state, whether restart/reload/new thread clears it, and whether stale agents are UI-only or still block new spawns.
+ - When reporting Codex model-routing mismatches, capture the Codex app/CLI/extension version, subscription/workspace, selected model from config.toml, TUI, command flag, or UI, actual server-side model from SSE `response.created` / `response.model`, the exact `RUST_LOG` or trace command used, timestamp, account or verification state without secrets, whether API and Codex routes differ, whether a warning/fallback notice appeared, and a minimal one-prompt reproduction with redacted logs.
 
  ### Privacy
 
@@ -50,25 +49,22 @@ trace-to-skill detected Codex subagent lifecycle or state reconciliation failure
 
  ## Findings
 
- ### 1. Codex subagent lifecycle or state reconciliation failure
+ ### 1. Codex selected model differs from actual routed model
 
  Severity: **high**
 
- When completed, closed, stale, or interrupted subagents remain visible, keep quota slots, lose parent discoverability, or diverge between UI, live registry, and persisted spawn-edge state, long-running Codex sessions become hard to trust or recover.
+ Silent model fallback, misrouting, or response.model mismatch makes Codex model access, benchmarks, billing expectations, and user trust hard to debug unless reports preserve both the selected model and the actual server-side model evidence.
 
  Evidence:
- - `fixtures/codex-subagent-lifecycle.md:16` Completed or closed subagents remain visible in the Subagents panel.
- - `fixtures/codex-subagent-lifecycle.md:17` The app shows stale subagent cards after close/readback reports no live agent handle.
- - `fixtures/codex-subagent-lifecycle.md:18` The visible subagent count grows very large; the panel can show Show 67 more or 100+ stale entries.
- - `fixtures/codex-subagent-lifecycle.md:20` It is unclear which subagents are active versus stale UI/cache entries.
- - `fixtures/codex-subagent-lifecycle.md:27` thread_spawn_edges status count: closed=549, open=0
- - `fixtures/codex-subagent-lifecycle.md:28` After restarting Codex Desktop multiple times, the Subagents panel still visually shows stale subagent cards.
- - `fixtures/codex-subagent-lifecycle.md:36` Codex subagents have been going stale and refusing to close for the past week.
- - `fixtures/codex-subagent-lifecycle.md:47` Long sessions with stale subagents may hold MCP connections or leave connection lifecycle state unclear.
+ - `fixtures/codex-model-routing-mismatch.md:5` - GPT-5.3-Codex is being routed to GPT-5.2.
+ - `fixtures/codex-model-routing-mismatch.md:6` - Both `config.toml` and the TUI are set to `gpt-5.3-codex`, but SSE captures show the actual `response.model` is `gpt-5.2-2025-12-11`.
+ - `fixtures/codex-model-routing-mismatch.md:7` - Running `RUST_LOG='codex_tui::chatwidget=info,codex_api::sse::responses=trace' codex` and sending a prompt shows `response.created` with `response.model=gpt-5.2-2025-12-11`.
+ - `fixtures/codex-model-routing-mismatch.md:9` - The user sees no warning or fallback notice that a different model version is being used internally.
+ - `fixtures/codex-model-routing-mismatch.md:10` - Some reports mention ChatGPT Pro, WSL, macOS, recent CLI versions, and verification briefly restoring GPT-5.3-Codex before silently rerouting back to GPT-5.2.
 
  Suggested rule:
 
- > When reporting Codex subagent lifecycle failures, capture Codex app/CLI/extension version, OS, surface, model, subscription/workspace, root thread id, subagent ids/nicknames/roles, spawn/close/list commands or UI actions, close_agent results, list_agents or /agents output, thread_spawn_edges status counts, agent registry or max_threads/quota evidence, recent-list/sidebar behavior, whether child threads are archived or shown as top-level conversations, last-progress/heartbeat or halt reason, MCP server state for subagents, compaction/resume timing, screenshot or redacted UI state, whether restart/reload/new thread clears it, and whether stale agents are UI-only or still block new spawns.
+ > When reporting Codex model-routing mismatches, capture the Codex app/CLI/extension version, subscription/workspace, selected model from config.toml, TUI, command flag, or UI, actual server-side model from SSE `response.created` / `response.model`, the exact `RUST_LOG` or trace command used, timestamp, account or verification state without secrets, whether API and Codex routes differ, whether a warning/fallback notice appeared, and a minimal one-prompt reproduction with redacted logs.
 
 
  ## Reporter Notes
@@ -91,10 +87,12 @@ Suggested rule:
  - `clipboard-attachment`: Copy as Markdown, long-paste conversion, or generated Pasted text.txt attachments break prompt and report workflows.
  - `deeplink-launch`: OAuth callbacks, notification clicks, mobile links, or `codex app <path>` external activation fail to route into Codex.
  - `connector-auth-cache`: App connectors keep stale `link_*` auth or discovery metadata after reauth-required responses.
+ - `auth-verification`: Phone verification, ChatGPT sign-in account routing, or extension chat initialization blocks Codex before a usable session starts.
  - `mcp-discovery-mismatch`: MCP servers work in CLI or one config scope but are absent in Desktop, VS Code, WSL, or project-local sessions.
  - `mcp-streamable-http`: Streamable HTTP or SSE MCP servers pass initialize or tools/list but fail parsing, handshakes, auth gating, stale sessions, or reconnects.
  - `hooks-runtime`: Hooks duplicate, stop firing, warn about stale config, skip surfaces, or become hard to manage in Desktop settings.
  - `terminal-output-integrity`: Terminal scrollback, streamed output, or transcript rendering drops, overwrites, truncates, or makes lines inaccessible.
+ - `subagent-lifecycle`: Completed, closed, stale, or interrupted subagents diverge between UI, live registry, persisted state, quota, and parent discoverability.
  - `usage-bucket-confusion`: Usage popovers show 5h and weekly percentages without clear remaining/used, rolling/calendar, or account/workspace scope.
  - `token-burn`: Usage drains from background polling, idle activity, compaction loops, retries, or cached-heavy turns.
  - `patch-overwrite`: `apply_patch` accepts `*** Add File` for an existing path, turning a create operation into a silent overwrite.
@@ -32,6 +32,14 @@ Common signals include `responds to an earlier message`, `ignoring my latest mes
 
  The fix is to capture app/CLI/extension version, model and reasoning effort, context-window percent or token counts, compaction timing, the exact latest user request, the stale earlier request or response it answered instead, thread or feedback id, whether resending the same message fixes it, and any raw internal tool payload leaked into the chat UI.
 
+ ## Codex Selected Model Differs From Actual Routed Model
+
+ Codex shows one selected model in config, TUI, CLI flags, or UI, but the actual server-side response uses a different model. This is different from generic latency: the key evidence is a selected-model versus `response.model` mismatch or silent fallback.
+
+ Common signals include `GPT-5.3-Codex being routed to GPT-5.2`, `config.toml` and TUI set to `gpt-5.3-codex` while SSE `response.created` shows `response.model=gpt-5.2-2025-12-11`, `RUST_LOG=codex_api::sse::responses=trace`, `codex exec --model gpt-5.3-codex`, no warning, no fallback notice, and verification briefly restoring access before a silent reroute.
+
+ The fix is to capture Codex app/CLI/extension version, subscription/workspace, selected model from `config.toml`, TUI, command flag, or UI, actual server-side model from SSE `response.created` / `response.model`, the exact `RUST_LOG` or trace command used, timestamp, account or verification state without secrets, whether API and Codex routes differ, whether a warning/fallback notice appeared, and a minimal one-prompt reproduction with redacted logs.
+
  ## Codex Latency Regression
 
  Codex can regress from fast interactive work into long pre-first-token stalls, extended thinking, slow read/search orchestration, compaction delays, or model routing that makes a fast mode feel like a standard or higher-reasoning mode.
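The selected-model versus `response.model` comparison described in the added "Codex Selected Model Differs From Actual Routed Model" section can be sketched in Python. This is a minimal illustration, not part of trace-to-skill: the function names `routed_models` and `find_mismatches` are hypothetical, the sample log line is invented to mirror the fixture evidence, and the sketch assumes the trace output carries a literal `response.model=<name>` token and that a dated suffix such as `-2025-12-11` marks a snapshot of the same model.

```python
import re

# Assumption: trace lines contain a literal "response.model=<name>" token, as
# quoted in the fixture evidence. Real Codex log output may be shaped differently.
MODEL_RE = re.compile(r"response\.model=([A-Za-z0-9._-]+)")


def routed_models(log_text: str) -> list[str]:
    """Return each distinct server-side model named in the log, in first-seen order."""
    seen: list[str] = []
    for match in MODEL_RE.finditer(log_text):
        model = match.group(1).lower()
        if model not in seen:
            seen.append(model)
    return seen


def find_mismatches(selected: str, log_text: str) -> list[str]:
    """List routed models that are neither the selected model nor a dated snapshot of it."""
    # Assumption: "<selected>-YYYY-MM-DD" counts as the same model.
    same_model = re.compile(re.escape(selected.lower()) + r"(-\d{4}-\d{2}-\d{2})?")
    return [m for m in routed_models(log_text) if not same_model.fullmatch(m)]


if __name__ == "__main__":
    # Hypothetical log excerpt modeled on the fixture evidence.
    sample_log = "response.created response.model=gpt-5.2-2025-12-11\n"
    print(find_mismatches("gpt-5.3-codex", sample_log))
```

An empty list only means no mismatch was visible in that log; it does not show that routing was correct, so a report should still attach the redacted trace itself.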
@@ -72,6 +80,14 @@ Common signals include `401: "Server returned 401: 'Reauthentication required'"`
 
  The fix is to capture app and CLI versions, OS, connector/plugin name and id, installed plugin root, exact Codex Apps tool name, error text, redacted cache metadata, `link_*` before and after reconnect/cache regeneration, `isAccessible` state, restart/remove/re-add/sign-in/cache-clear attempts, ChatGPT app page state, and whether an external MCP workaround succeeds.
 
+ ## Codex Sign-In Or Account Verification Failure
+
+ Codex first-party sign-in, phone verification, ChatGPT account routing, workspace/organization verification, or extension chat initialization fails before the user reaches a usable session.
+
+ Common signals include phone number verification not working, SMS/OTP code not received, `invalid_phone_number`, `Sign in With ChatGPT` account-type edge cases, Plus/Pro/Teams/Enterprise routing confusion, SSO requiring unexpected phone verification, and VS Code extension `Error starting conversation` while initializing a chat after sign-in.
+
+ The fix is to capture Codex app/CLI/extension version, surface, OS, account type without secrets, workspace or organization context, SSO provider, whether the flow is ChatGPT sign-in, phone/SMS/OTP verification, or extension chat initialization, exact redacted error text, timestamps, whether another device/browser/account works, logout/login attempts, and screenshots with phone numbers, tokens, and email addresses redacted.
+
  ## Codex Approval Friction
 
  Codex approval UX can fail even when the sandbox itself works. The common pattern is that a user chooses `Approve for this session`, `Always`, or an MCP/tool trust setting, but Codex keeps asking for approval, makes the user babysit every step, or pushes them toward `Full Access` just to get useful work done.
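The redaction step that the added "Codex Sign-In Or Account Verification Failure" section asks for (phone numbers, tokens, and email addresses removed from error text) might look like the following Python sketch. It is an illustration under stated assumptions, not the redaction trace-to-skill performs: the `redact` helper and its patterns are hypothetical, phone numbers are matched only when written with a leading `+`, and tokens only when they follow the word `Bearer`.

```python
import re

# Assumption: these patterns are illustrative and deliberately narrow.
# Phone numbers must start with "+" so that timestamps such as
# 2026-06-01T02:57:23 are left intact for the report.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[email]"),
    (re.compile(r"\bbearer\s+[A-Za-z0-9._~+/=-]{8,}", re.IGNORECASE), "Bearer [token]"),
    (re.compile(r"\+\d[\d\s().-]{6,}\d"), "[phone]"),
]


def redact(text: str) -> str:
    """Replace emails, bearer tokens, and '+'-prefixed phone numbers with placeholders."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    # Hypothetical error text; the phone number and address are made up.
    print(redact("invalid_phone_number for +1 (415) 555-0100, account jane.doe@example.com"))
```

Pattern-based redaction misses anything it has no pattern for, such as phone numbers without a `+` or secrets inside screenshots, so a manual review before posting is still needed.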
@@ -3,14 +3,14 @@
  | Field | Value |
  | --- | --- |
  | Repository | https://github.com/grnbtqdbyx-create/trace-to-skill |
- | Package | trace-to-skill@0.1.93 |
+ | Package | trace-to-skill@0.1.95 |
  | License | Apache-2.0 |
  | Codex readiness | ready (100/100) |
- | Benchmark | pass, 38 cases |
+ | Benchmark | pass, 40 cases |
 
  ## Why This Repository Qualifies
 
- trace-to-skill helps open-source maintainers adopt Codex safely by turning failed coding-agent runs into evidence-backed rules, reusable workflows, CI gates, and a weekly Codex Issue Radar for live GitHub issue demand. It supports real maintenance work: PR review, issue triage, release quality, MCP risk, prompt-injection defense, privacy-preserving trace sharing, and repeat failure reduction. The repository is ready, scores 100/100 on the local Codex readiness doctor, and ships a deterministic benchmark with 38 public fixture cases.
+ trace-to-skill helps open-source maintainers adopt Codex safely by turning failed coding-agent runs into evidence-backed rules, reusable workflows, CI gates, and a weekly Codex Issue Radar for live GitHub issue demand. It supports real maintenance work: PR review, issue triage, release quality, MCP risk, prompt-injection defense, privacy-preserving trace sharing, and repeat failure reduction. The repository is ready, scores 100/100 on the local Codex readiness doctor, and ships a deterministic benchmark with 40 public fixture cases.
 
  ### 500-Character Version
 
@@ -27,10 +27,10 @@ API credits would power optional maintainer workflows on top of the local determ
  ## Evidence
 
  - Public repository: https://github.com/grnbtqdbyx-create/trace-to-skill
- - One-command package: npx trace-to-skill@0.1.93
+ - One-command package: npx trace-to-skill@0.1.95
  - Open-source license: Apache-2.0
  - Codex readiness doctor: ready, 100/100, 0 failed checks.
- - Public fixture benchmark: pass, 38 cases.
+ - Public fixture benchmark: pass, 40 cases.
  - GitHub issue demand mining: issue-map fetches or reads piped GitHub CLI issue JSON, then ranks OpenAI/Codex issues by failure class, comments, reactions, evidence gaps, and Maintainer Roadmap next artifacts.
  - Weekly Codex Issue Radar: init --issue-map-repo owner/name scaffolds a scheduled Action that fetches live GitHub issues and publishes the pain map to the job summary or a stable tracking issue comment.
  - Maintainer control: generated rules are suggestions, evidence is line-linked, and secrets can be redacted before sharing.
package/docs/SCORECARD.md CHANGED
@@ -9,7 +9,7 @@ Status: **pass**
  | Failed doctor checks | 0 |
  | Critical findings | 0 |
  | Built-in benchmark | pass |
- | Benchmark cases | 38 |
+ | Benchmark cases | 40 |
 
  ## Doctor Summary
 
@@ -31,11 +31,13 @@ This benchmark runs the public fixture pack that ships with the repository and p
  | Codex conversation fork context bloat | `fixtures/codex-context-fork-bloat.md` | 59 | 3 | 0 | `codex_context_fork_bloat`, `codex_thinking_hang`, `weak_evidence` | pass |
  | Codex subagent prompt leakage or boundary failure | `fixtures/codex-subagent-prompt-leakage.md` | 75 | 2 | 0 | `codex_subagent_prompt_leakage`, `weak_evidence` | pass |
  | Codex latest-turn drift after compaction | `fixtures/codex-latest-turn-drift.md` | 59 | 3 | 0 | `codex_latest_turn_drift`, `premature_completion`, `weak_evidence` | pass |
+ | Codex selected model differs from actual routed model | `fixtures/codex-model-routing-mismatch.md` | 75 | 2 | 0 | `codex_model_routing_mismatch`, `weak_evidence` | pass |
  | Codex model and runtime latency regression | `fixtures/codex-latency-regression.md` | 75 | 2 | 0 | `codex_latency_regression`, `weak_evidence` | pass |
  | Codex thinking and stream hang | `fixtures/codex-thinking-hang.md` | 75 | 2 | 0 | `codex_thinking_hang`, `weak_evidence` | pass |
  | Codex clipboard, paste, and attachment workflow regression | `fixtures/codex-clipboard-attachment.md` | 75 | 2 | 0 | `codex_clipboard_attachment`, `weak_evidence` | pass |
  | Codex deeplink, OAuth callback, and external launch regression | `fixtures/codex-deeplink-launch.md` | 50 | 4 | 0 | `codex_deeplink_launch`, `codex_remote_control`, `hallucinated_file`, `weak_evidence` | pass |
  | Codex app connector auth cache and stale link regression | `fixtures/codex-connector-auth-cache.md` | 75 | 2 | 0 | `codex_connector_auth_cache`, `weak_evidence` | pass |
+ | Codex sign-in and account verification failure | `fixtures/codex-auth-verification.md` | 75 | 2 | 0 | `codex_auth_verification`, `weak_evidence` | pass |
  | Codex approval persistence and MCP approval friction | `fixtures/codex-approval-friction.md` | 59 | 3 | 0 | `codex_approval_friction`, `sandbox_permission`, `weak_evidence` | pass |
  | Codex sandbox permission failure | `fixtures/sandbox-permission.md` | 59 | 3 | 0 | `codex_windows_helper_path`, `sandbox_permission`, `weak_evidence` | pass |
  | Codex Windows helper and bundled tool path failure | `fixtures/codex-windows-helper-path.md` | 43 | 4 | 0 | `codex_plugin_runtime`, `codex_windows_helper_path`, `sandbox_permission`, `weak_evidence` | pass |
@@ -43,7 +45,7 @@ This benchmark runs the public fixture pack that ships with the repository and p
  | Codex remote-control route health failure | `fixtures/codex-remote-control.md` | 75 | 2 | 0 | `codex_remote_control`, `weak_evidence` | pass |
  | Codex terminal output and scrollback integrity failure | `fixtures/codex-terminal-output-integrity.md` | 75 | 2 | 0 | `codex_terminal_output_integrity`, `weak_evidence` | pass |
  | Codex subagent lifecycle and state reconciliation failure | `fixtures/codex-subagent-lifecycle.md` | 75 | 2 | 0 | `codex_subagent_lifecycle`, `weak_evidence` | pass |
- | Codex quota mismatch | `fixtures/quota-mismatch.md` | 59 | 3 | 0 | `codex_usage_reset_drift`, `quota_mismatch`, `weak_evidence` | pass |
+ | Codex quota mismatch | `fixtures/quota-mismatch.md` | 43 | 4 | 0 | `codex_token_burn`, `codex_usage_reset_drift`, `quota_mismatch`, `weak_evidence` | pass |
  | MCP config with secret exposure | `fixtures/mcp-risk.json` | 59 | 2 | 1 | `mcp_risk`, `secret_exposure` | pass |
  | Sensitive file access in agent context | `fixtures/sensitive-file-access.md` | 75 | 2 | 0 | `sensitive_file_access`, `weak_evidence` | pass |
  | Codex MCP runtime failure | `fixtures/codex-mcp-runtime.md` | 75 | 2 | 0 | `codex_mcp_runtime`, `weak_evidence` | pass |