@researai/deepscientist 1.5.16 → 1.5.17
- package/README.md +66 -23
- package/bin/ds.js +550 -19
- package/docs/en/00_QUICK_START.md +65 -5
- package/docs/en/01_SETTINGS_REFERENCE.md +1 -1
- package/docs/en/09_DOCTOR.md +14 -3
- package/docs/en/15_CODEX_PROVIDER_SETUP.md +12 -3
- package/docs/en/21_LOCAL_MODEL_BACKENDS_GUIDE.md +283 -0
- package/docs/en/91_DEVELOPMENT.md +237 -0
- package/docs/en/README.md +7 -3
- package/docs/zh/00_QUICK_START.md +54 -5
- package/docs/zh/01_SETTINGS_REFERENCE.md +1 -1
- package/docs/zh/09_DOCTOR.md +15 -4
- package/docs/zh/15_CODEX_PROVIDER_SETUP.md +12 -3
- package/docs/zh/21_LOCAL_MODEL_BACKENDS_GUIDE.md +281 -0
- package/docs/zh/README.md +7 -3
- package/install.sh +46 -4
- package/package.json +2 -1
- package/pyproject.toml +1 -1
- package/src/deepscientist/__init__.py +1 -1
- package/src/deepscientist/bridges/connectors.py +8 -2
- package/src/deepscientist/codex_cli_compat.py +185 -72
- package/src/deepscientist/config/service.py +154 -6
- package/src/deepscientist/daemon/api/handlers.py +130 -25
- package/src/deepscientist/daemon/api/router.py +5 -0
- package/src/deepscientist/daemon/app.py +446 -22
- package/src/deepscientist/diagnostics/__init__.py +6 -0
- package/src/deepscientist/diagnostics/runner_failures.py +130 -0
- package/src/deepscientist/doctor.py +207 -3
- package/src/deepscientist/prompts/builder.py +22 -4
- package/src/deepscientist/quest/service.py +413 -13
- package/src/deepscientist/runners/codex.py +59 -14
- package/src/deepscientist/shared.py +19 -0
- package/src/prompts/contracts/shared_interaction.md +3 -2
- package/src/prompts/system.md +13 -0
- package/src/prompts/system_copilot.md +13 -0
- package/src/tui/package.json +1 -1
- package/src/ui/dist/assets/{AiManusChatView-COFACy7V.js → AiManusChatView-Bv-Z8YpU.js} +44 -44
- package/src/ui/dist/assets/{AnalysisPlugin-DnSm0GZn.js → AnalysisPlugin-BCKAfjba.js} +1 -1
- package/src/ui/dist/assets/{CliPlugin-CvwCmDQ5.js → CliPlugin-BCKcpc35.js} +4 -4
- package/src/ui/dist/assets/{CodeEditorPlugin-cOqSa0xq.js → CodeEditorPlugin-DbOfSJ8K.js} +1 -1
- package/src/ui/dist/assets/{CodeViewerPlugin-itb0tltR.js → CodeViewerPlugin-CbaFRrUU.js} +3 -3
- package/src/ui/dist/assets/{DocViewerPlugin-DqKkiCI6.js → DocViewerPlugin-DAjLVeQD.js} +3 -3
- package/src/ui/dist/assets/{GitCommitViewerPlugin-DVgNHBCS.js → GitCommitViewerPlugin-CIUqbUDO.js} +1 -1
- package/src/ui/dist/assets/{GitDiffViewerPlugin-DxL2ezFG.js → GitDiffViewerPlugin-CQACjoAA.js} +1 -1
- package/src/ui/dist/assets/{GitSnapshotViewer-B_RQm1YZ.js → GitSnapshotViewer-0r4nLPke.js} +1 -1
- package/src/ui/dist/assets/{ImageViewerPlugin-tHqlXY3n.js → ImageViewerPlugin-nBOmI2v_.js} +3 -3
- package/src/ui/dist/assets/{LabCopilotPanel-ClMbq5Yu.js → LabCopilotPanel-BHxOxF4z.js} +1 -1
- package/src/ui/dist/assets/{LabPlugin-L_SuE8ow.js → LabPlugin-BKoZGs95.js} +1 -1
- package/src/ui/dist/assets/{LatexPlugin-B495DTXC.js → LatexPlugin-ZwtV8pIp.js} +1 -1
- package/src/ui/dist/assets/{MarkdownViewerPlugin-DG28-61B.js → MarkdownViewerPlugin-DKqVfKyW.js} +3 -3
- package/src/ui/dist/assets/{MarketplacePlugin-BiOGT-Kj.js → MarketplacePlugin-BwxStZ9D.js} +1 -1
- package/src/ui/dist/assets/{NotebookEditor-C-4Kt1p9.js → NotebookEditor-BEQhaQbt.js} +1 -1
- package/src/ui/dist/assets/{NotebookEditor-CVsj8h_T.js → NotebookEditor-DB9N_T9q.js} +23 -23
- package/src/ui/dist/assets/{PdfLoader-CASDQmxJ.js → PdfLoader-eWBONbQP.js} +1 -1
- package/src/ui/dist/assets/{PdfMarkdownPlugin-BFhwoKsY.js → PdfMarkdownPlugin-D22YOZL3.js} +1 -1
- package/src/ui/dist/assets/{PdfViewerPlugin-DcOzU9vd.js → PdfViewerPlugin-c-RK9DLM.js} +3 -3
- package/src/ui/dist/assets/{SearchPlugin-CHj7M58O.js → SearchPlugin-CxF9ytAx.js} +1 -1
- package/src/ui/dist/assets/{TextViewerPlugin-CB4DYfWO.js → TextViewerPlugin-C5xqeeUH.js} +2 -2
- package/src/ui/dist/assets/{VNCViewer-CjlbyCB3.js → VNCViewer-BoLGLnHz.js} +1 -1
- package/src/ui/dist/assets/{bot-CFkZY-JP.js → bot-DREQOxzP.js} +1 -1
- package/src/ui/dist/assets/{chevron-up-Dq5ofbht.js → chevron-up-C9Qpx4DE.js} +1 -1
- package/src/ui/dist/assets/{code-DLC6G24T.js → code-WlFHE7z_.js} +1 -1
- package/src/ui/dist/assets/{file-content-Dv4LoZec.js → file-content-BZMz3RYp.js} +1 -1
- package/src/ui/dist/assets/{file-diff-panel-Denq-lC3.js → file-diff-panel-CQhw0jS2.js} +1 -1
- package/src/ui/dist/assets/{file-socket-Cu4Qln7Y.js → file-socket-CfQPKQKj.js} +1 -1
- package/src/ui/dist/assets/{git-commit-horizontal-BUh6G52n.js → git-commit-horizontal-DxZ8DCZh.js} +1 -1
- package/src/ui/dist/assets/{image-B9HUUddG.js → image-Bgl4VIyx.js} +1 -1
- package/src/ui/dist/assets/{index-Cgla8biy.css → index-BpV6lusQ.css} +1 -1
- package/src/ui/dist/assets/{index-Gbl53BNp.js → index-CBNVuWcP.js} +363 -363
- package/src/ui/dist/assets/{index-wQ7RIIRd.js → index-CwNu1aH4.js} +1 -1
- package/src/ui/dist/assets/{index-B2B1sg-M.js → index-DrUnlf6K.js} +1 -1
- package/src/ui/dist/assets/{index-DRyx7vAc.js → index-NW-h8VzN.js} +1 -1
- package/src/ui/dist/assets/{pdf-effect-queue-ZtnHFCAi.js → pdf-effect-queue-J8OnM0jE.js} +1 -1
- package/src/ui/dist/assets/{popover-DL6h35vr.js → popover-CLc0pPP8.js} +1 -1
- package/src/ui/dist/assets/{project-sync-CsX08Qno.js → project-sync-C9IdzdZW.js} +1 -1
- package/src/ui/dist/assets/{select-DvmXt1yY.js → select-Cs2PmzwL.js} +1 -1
- package/src/ui/dist/assets/{sigma-7jpXazui.js → sigma-ClKcHAXm.js} +1 -1
- package/src/ui/dist/assets/{trash-xA7kFt8i.js → trash-DwpbFr3w.js} +1 -1
- package/src/ui/dist/assets/{useCliAccess-DsMwDjOp.js → useCliAccess-NQ8m0Let.js} +1 -1
- package/src/ui/dist/assets/{wrap-text-CwMn-iqb.js → wrap-text-BC-Hltpd.js} +1 -1
- package/src/ui/dist/assets/{zoom-out-R-GWEhzS.js → zoom-out-E_gaeAxL.js} +1 -1
- package/src/ui/dist/index.html +2 -2
@@ -0,0 +1,130 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class FailureDiagnosis:
+    code: str
+    problem: str
+    why: str
+    guidance: tuple[str, ...]
+    retriable: bool
+    matched_text: str | None = None
+
+
+_MODEL_UNAVAILABLE_MARKERS = (
+    "unknown model",
+    "invalid model",
+    "model not found",
+    "unsupported model",
+    "model is not available",
+    "not authorized to use model",
+    "you do not have access",
+    "access to model",
+    "model access",
+    "unrecognized model",
+)
+
+
+def _build_haystack(*values: object) -> str:
+    return "\n".join(str(value or "") for value in values if str(value or "").strip())
+
+
+def _contains(text: str, marker: str) -> bool:
+    return marker in text.lower()
+
+
+def diagnose_runner_failure(
+    *,
+    runner_name: str,
+    summary: str = "",
+    stderr_text: str = "",
+    output_text: str = "",
+) -> FailureDiagnosis | None:
+    haystack = _build_haystack(summary, stderr_text, output_text)
+    lower = haystack.lower()
+    normalized_runner = str(runner_name or "").strip().lower()
+
+    if (
+        "tool call result does not follow tool call (2013)" in lower
+        or "tool result's tool id" in lower
+    ):
+        return FailureDiagnosis(
+            code="minimax_tool_result_sequence_error",
+            problem="MiniMax rejected the tool result sequence.",
+            why=(
+                "The tool result did not immediately follow the corresponding tool call, "
+                "or the tool result referenced a tool call id that was no longer valid."
+            ),
+            guidance=(
+                "Keep each tool result immediately after its matching tool call.",
+                "Do not insert an extra assistant message between a tool call and its tool result.",
+                "For MiniMax chat-wire sessions, serialize tool use one call at a time.",
+            ),
+            retriable=False,
+            matched_text="2013",
+        )
+
+    if (
+        "invalid function arguments json string" in lower
+        or "failed to parse tool call arguments" in lower
+        or "trailing characters at line 1 column" in lower
+    ):
+        return FailureDiagnosis(
+            code="chat_wire_tool_argument_parse_error",
+            problem="The runner emitted malformed tool-call arguments.",
+            why=(
+                "The tool-call arguments were not a single valid JSON object. "
+                "This usually happens when multiple tool calls are batched into one response "
+                "or when the arguments string contains trailing characters."
+            ),
+            guidance=(
+                "Serialize tool calls one at a time instead of batching multiple MCP calls together.",
+                "Make sure each tool call emits exactly one complete JSON object for its arguments.",
+                "If this is a MiniMax chat-wire path, stay on the serialized single-tool compatibility route.",
+            ),
+            retriable=False,
+            matched_text="tool-call arguments",
+        )
+
+    if "missing environment variable" in lower:
+        return FailureDiagnosis(
+            code="provider_env_missing",
+            problem="A required provider environment variable is missing.",
+            why="The configured model provider expects an API key or env var that was not present in the runner environment.",
+            guidance=(
+                "Set the required key in `~/DeepScientist/config/runners.yaml` under `runners.codex.env`.",
+                "If you launch from a shell, export the provider key in that same shell before starting `ds`.",
+            ),
+            retriable=False,
+            matched_text="missing environment variable",
+        )
+
+    if any(marker in lower for marker in _MODEL_UNAVAILABLE_MARKERS):
+        return FailureDiagnosis(
+            code="runner_model_unavailable",
+            problem="The configured runner model is not available.",
+            why="The selected provider or Codex account could not access the requested model id.",
+            guidance=(
+                "Set `model: inherit` for provider-backed Codex profiles unless the provider explicitly supports the model id.",
+                "If you need a fixed model, verify that the same model works in plain `codex exec` before retrying DeepScientist.",
+            ),
+            retriable=False,
+            matched_text="model unavailable",
+        )
+
+    if normalized_runner == "codex" and "invalid params" in lower and "bad_request_error" in lower:
+        return FailureDiagnosis(
+            code="provider_invalid_params",
+            problem="The provider rejected the request parameters.",
+            why="The upstream provider returned a deterministic request-shape error instead of a transient transport failure.",
+            guidance=(
+                "Inspect the immediately preceding tool call / tool result sequence for protocol ordering or JSON-shape mistakes.",
+                "Do not keep retrying the same request until the request payload or provider config is corrected.",
+            ),
+            retriable=False,
+            matched_text="invalid params",
+        )
+
+    return None
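The marker matching added in `diagnostics/runner_failures.py` can be illustrated with a minimal, self-contained sketch. It is not the package's code: `Diagnosis`, `MODEL_MARKERS`, and `classify` are hypothetical stand-ins, only two of the five branches are reproduced, and the marker list is a subset. It assumes the same approach as the hunk above: join the non-empty inputs, lowercase once, and return the first branch whose substring matches.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnosis:  # hypothetical, trimmed-down stand-in for FailureDiagnosis
    code: str
    retriable: bool


# Subset of the markers listed in the diff, kept lowercase for matching.
MODEL_MARKERS = ("unknown model", "model not found", "unsupported model")


def classify(summary: str = "", stderr_text: str = "") -> Diagnosis | None:
    # Join the non-empty inputs and lowercase once before matching.
    parts = [part for part in (summary, stderr_text) if part.strip()]
    haystack = "\n".join(parts).lower()
    # Branch order matters: the first matching pattern wins.
    if "missing environment variable" in haystack:
        return Diagnosis(code="provider_env_missing", retriable=False)
    if any(marker in haystack for marker in MODEL_MARKERS):
        return Diagnosis(code="runner_model_unavailable", retriable=False)
    # Unknown failures are left unclassified rather than guessed at.
    return None
```

Returning `None` for unmatched text lets a caller fall back to its own generic summary, which is how the `doctor.py` hunks in this diff use the result.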
@@ -1,5 +1,6 @@
 from __future__ import annotations
 
+from datetime import UTC, datetime, timedelta
 import os
 import socket
 import subprocess
@@ -13,9 +14,13 @@ from urllib.request import Request, urlopen
 
 from .bash_exec.shells import build_exec_shell_launch, build_terminal_shell_launch
 from .config import ConfigManager
+from .diagnostics import diagnose_runner_failure
 from .home import ensure_home_layout
 from .runtime_tools import RuntimeToolService
-from .shared import resolve_runner_binary, utc_now
+from .shared import read_json, read_jsonl_tail, resolve_runner_binary, utc_now
+
+
+_RUNTIME_FAILURE_LOOKBACK = timedelta(hours=24)
 
 
 def _browser_ui_url(host: str, port: int) -> str:
@@ -42,6 +47,10 @@ def _make_check(
     errors: list[str] | None = None,
     guidance: list[str] | None = None,
     details: dict[str, Any] | None = None,
+    problem: str | None = None,
+    why: str | None = None,
+    fix: list[str] | None = None,
+    evidence: list[str] | None = None,
 ) -> dict[str, Any]:
     normalized_warnings = list(warnings or [])
     normalized_errors = list(errors or [])
@@ -55,6 +64,10 @@ def _make_check(
         "errors": normalized_errors,
         "guidance": list(guidance or []),
         "details": dict(details or {}),
+        "problem": str(problem or "").strip() or None,
+        "why": str(why or "").strip() or None,
+        "fix": [str(line) for line in (fix or []) if str(line).strip()],
+        "evidence": [str(line) for line in (evidence or []) if str(line).strip()],
     }
 
 
@@ -273,6 +286,13 @@ def _check_codex(config_manager: ConfigManager) -> dict[str, Any]:
     probe_warnings = [str(value) for value in probe.get("warnings") or []]
     probe_guidance = [str(value) for value in probe.get("guidance") or []]
     summary = str(probe.get("summary") or "Codex startup probe completed.")
+    probe_details = probe.get("details") if isinstance(probe.get("details"), dict) else {}
+    diagnosis = diagnose_runner_failure(
+        runner_name="codex",
+        summary="\n".join([summary, *probe_errors]),
+        stderr_text=str(probe_details.get("stderr_excerpt") or ""),
+        output_text=str(probe_details.get("stdout_excerpt") or ""),
+    )
     if probe.get("ok"):
         return _make_check(
             check_id="codex",
@@ -284,17 +304,188 @@ def _check_codex(config_manager: ConfigManager) -> dict[str, Any]:
         )
     if not probe_guidance:
         probe_guidance = [
-            "Run `codex
+            "Run `codex login` (or just `codex`) manually once and complete login, then retry `ds doctor`.",
         ]
     return _make_check(
         check_id="codex",
         label="Codex CLI",
         ok=False,
-        summary=summary,
+        summary=diagnosis.problem if diagnosis is not None else summary,
         warnings=probe_warnings,
         errors=probe_errors or ["Codex startup probe did not succeed."],
         guidance=probe_guidance,
         details={"resolved_binary": resolved_binary},
+        problem=diagnosis.problem if diagnosis is not None else None,
+        why=diagnosis.why if diagnosis is not None else None,
+        fix=list(diagnosis.guidance) if diagnosis is not None else None,
+        evidence=(
+            [f"matched: {diagnosis.matched_text}"] if diagnosis is not None and diagnosis.matched_text else None
+        ),
+    )
+
+
+def _parse_timestamp(value: object) -> datetime | None:
+    normalized = str(value or "").strip()
+    if not normalized:
+        return None
+    candidate = normalized.replace("Z", "+00:00")
+    try:
+        parsed = datetime.fromisoformat(candidate)
+    except ValueError:
+        return None
+    if parsed.tzinfo is None:
+        parsed = parsed.replace(tzinfo=UTC)
+    return parsed.astimezone(UTC)
+
+
+def _read_runtime_failure_record(home: Path) -> dict[str, Any] | None:
+    quests_root = home / "quests"
+    if not quests_root.exists():
+        return None
+
+    latest: dict[str, Any] | None = None
+    latest_at: datetime | None = None
+    cutoff = datetime.now(UTC) - _RUNTIME_FAILURE_LOOKBACK
+    interesting_types = {
+        "runner.turn_error",
+        "runner.turn_retry_exhausted",
+        "quest.runtime_auto_resume_suppressed",
+    }
+
+    for quest_root in sorted(quests_root.glob("*/")):
+        events = read_jsonl_tail(quest_root / ".ds" / "events.jsonl", 300)
+        for event in reversed(events):
+            event_type = str(event.get("type") or "").strip()
+            if event_type not in interesting_types:
+                continue
+            created_at = _parse_timestamp(event.get("created_at"))
+            if created_at is None or created_at < cutoff:
+                continue
+            run_id = str(event.get("run_id") or "").strip() or None
+            stderr_text = ""
+            output_text = ""
+            if run_id:
+                run_root = quest_root / ".ds" / "runs" / run_id
+                result_payload = read_json(run_root / "result.json", {})
+                if isinstance(result_payload, dict):
+                    stderr_text = str(result_payload.get("stderr_text") or "").strip()
+                    output_text = str(result_payload.get("output_text") or "").strip()
+                stderr_path = run_root / "stderr.txt"
+                if not stderr_text and stderr_path.exists():
+                    try:
+                        stderr_text = stderr_path.read_text(encoding="utf-8")
+                    except OSError:
+                        stderr_text = ""
+            candidate = {
+                "quest_id": quest_root.name,
+                "run_id": run_id,
+                "event_type": event_type,
+                "summary": str(event.get("summary") or "").strip(),
+                "created_at": created_at.isoformat(),
+                "stderr_text": stderr_text,
+                "output_text": output_text,
+                "recent_attempts": event.get("recent_attempts"),
+            }
+            if latest_at is None or created_at > latest_at:
+                latest = candidate
+                latest_at = created_at
+            break
+    return latest
+
+
+def _check_recent_runtime_failures(home: Path) -> dict[str, Any]:
+    record = _read_runtime_failure_record(home)
+    if record is None:
+        return _make_check(
+            check_id="recent_runtime_failures",
+            label="Recent runtime failures",
+            ok=True,
+            summary="No recent quest runtime failures were found.",
+        )
+
+    event_type = str(record.get("event_type") or "").strip()
+    quest_id = str(record.get("quest_id") or "").strip() or None
+    run_id = str(record.get("run_id") or "").strip() or None
+    summary = str(record.get("summary") or "").strip()
+    details = {
+        "quest_id": quest_id,
+        "run_id": run_id,
+        "event_type": event_type,
+        "observed_at": record.get("created_at"),
+    }
+
+    if event_type == "quest.runtime_auto_resume_suppressed":
+        recent_attempts = int(record.get("recent_attempts") or 0)
+        return _make_check(
+            check_id="recent_runtime_failures",
+            label="Recent runtime failures",
+            ok=True,
+            summary="DeepScientist recently suppressed auto-resume to avoid a crash loop.",
+            warnings=["Automatic continuation was paused after repeated recovery attempts in a short window."],
+            guidance=[
+                "Inspect the most recent failing runner path before using `/resume` again.",
+                "If the failure was a provider-side 400/protocol error, fix that request path first instead of retrying immediately.",
+            ],
+            details=details,
+            problem="Automatic crash recovery was suppressed.",
+            why="The same quest hit repeated recovery attempts in a short window, so DeepScientist parked it instead of looping forever.",
+            fix=[
+                "Open the latest failing quest logs and identify the deterministic runner/provider error.",
+                "Resume manually only after the underlying runner or provider issue is corrected.",
+            ],
+            evidence=[
+                *( [f"quest: {quest_id}"] if quest_id else [] ),
+                f"recent recovery attempts: {recent_attempts}",
+            ],
+        )
+
+    diagnosis = diagnose_runner_failure(
+        runner_name="codex",
+        summary=summary,
+        stderr_text=str(record.get("stderr_text") or ""),
+        output_text=str(record.get("output_text") or ""),
+    )
+    if diagnosis is None:
+        return _make_check(
+            check_id="recent_runtime_failures",
+            label="Recent runtime failures",
+            ok=True,
+            summary="A recent quest runtime failure was found.",
+            warnings=[summary or "The latest quest run failed, but doctor could not classify it precisely yet."],
+            guidance=[
+                "Open the latest run stderr and events journal for the failing quest.",
+                "If the same failure repeats, capture the run_id and provider response text before retrying again.",
+            ],
+            details=details,
+            problem="A recent quest run failed.",
+            why="Doctor found a recent runtime failure event but could not match it to a known deterministic error pattern.",
+            fix=[
+                "Inspect the failing run's stderr and provider response text.",
+                "If the error is deterministic, avoid burning the retry budget until the request shape or config is fixed.",
+            ],
+            evidence=[
+                *( [f"quest: {quest_id}"] if quest_id else [] ),
+                *( [f"run: {run_id}"] if run_id else [] ),
+                *( [f"summary: {summary}"] if summary else [] ),
+            ],
+        )
+
+    return _make_check(
+        check_id="recent_runtime_failures",
+        label="Recent runtime failures",
+        ok=True,
+        summary=diagnosis.problem,
+        warnings=[summary] if summary and summary != diagnosis.problem else [],
+        guidance=list(diagnosis.guidance),
+        details=details,
+        problem=diagnosis.problem,
+        why=diagnosis.why,
+        fix=list(diagnosis.guidance),
+        evidence=[
+            *( [f"quest: {quest_id}"] if quest_id else [] ),
+            *( [f"run: {run_id}"] if run_id else [] ),
+            *( [f"matched: {diagnosis.matched_text}"] if diagnosis.matched_text else [] ),
+        ],
     )
 
 
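The timestamp handling and 24-hour lookback in the hunk above can be sketched in isolation. This is an illustrative sketch, not the package's code: `parse_timestamp`, `is_recent`, and `LOOKBACK` are hypothetical names, and `timezone.utc` is used in place of `datetime.UTC` so the snippet also runs on Python versions before 3.11. It assumes what the diff shows: a trailing `Z` is rewritten to `+00:00`, naive values are treated as UTC, and events older than the cutoff are skipped.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(hours=24)  # mirrors _RUNTIME_FAILURE_LOOKBACK in the diff


def parse_timestamp(value: object) -> datetime | None:
    text = str(value or "").strip()
    if not text:
        return None
    try:
        # Older fromisoformat versions reject a literal "Z" suffix.
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Naive timestamps are assumed to already be in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def is_recent(value: object, now: datetime) -> bool:
    # An event counts as recent when it parses and is not older than the cutoff.
    parsed = parse_timestamp(value)
    return parsed is not None and parsed >= now - LOOKBACK
```

Passing `now` explicitly, rather than calling `datetime.now` inside the helper, is a choice made for this sketch so the window can be tested deterministically; the diff computes the cutoff once per scan instead.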
@@ -491,6 +682,7 @@ def run_doctor(home: Path, *, repo_root: Path) -> dict[str, Any]:
         _check_config_validation(config_manager),
         _check_runner_support(config_manager),
         _check_codex(config_manager),
+        _check_recent_runtime_failures(home),
         _check_latex_runtime(home),
         _check_bundles(repo_root),
         _check_ui_port(home, config_manager),
@@ -519,6 +711,18 @@ def render_doctor_report(report: dict[str, Any]) -> str:
         status = str(item.get("status") or "ok").upper()
         icon = {"OK": "[ok]", "WARN": "[warn]", "ERROR": "[fail]"}.get(status, "[info]")
         lines.append(f"{icon} {item.get('label')}: {item.get('summary')}")
+        problem = str(item.get("problem") or "").strip()
+        why = str(item.get("why") or "").strip()
+        fix_lines = [str(line) for line in item.get("fix") or [] if str(line).strip()]
+        evidence_lines = [str(line) for line in item.get("evidence") or [] if str(line).strip()]
+        if problem:
+            lines.append(f" problem: {problem}")
+        if why:
+            lines.append(f" why: {why}")
+        for line in fix_lines:
+            lines.append(f" fix: {line}")
+        for line in evidence_lines:
+            lines.append(f" evidence: {line}")
         for warning in item.get("warnings") or []:
             lines.append(f" warning: {warning}")
         for error in item.get("errors") or []:
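The report rendering added above can be tried out with a small stand-alone sketch. It is an approximation under stated assumptions: `render_check` is a hypothetical helper that renders a single check dict instead of a whole report, and the two-space indent before each label is a guess, since the exact whitespace inside the original f-strings is not recoverable from this diff view.

```python
from __future__ import annotations

from typing import Any


def render_check(item: dict[str, Any]) -> list[str]:
    status = str(item.get("status") or "ok").upper()
    icon = {"OK": "[ok]", "WARN": "[warn]", "ERROR": "[fail]"}.get(status, "[info]")
    lines = [f"{icon} {item.get('label')}: {item.get('summary')}"]
    # Optional diagnosis fields are printed only when they carry text.
    problem = str(item.get("problem") or "").strip()
    why = str(item.get("why") or "").strip()
    if problem:
        lines.append(f"  problem: {problem}")
    if why:
        lines.append(f"  why: {why}")
    # Blank fix and evidence entries are dropped; the rest are printed in order.
    for key in ("fix", "evidence"):
        for entry in item.get(key) or []:
            if str(entry).strip():
                lines.append(f"  {key}: {entry}")
    return lines
```

For example, a check dict with `status="warn"`, a `problem`, and one `fix` entry yields the headline followed by one `problem:` line and one `fix:` line.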
@@ -1192,6 +1192,12 @@ class PromptBuilder:
                 "- collaboration_mode: user-directed copilot",
                 "- freeform_task_rule: if the user asks for a concrete research task, solve that task directly before introducing stage-routing language.",
                 "- requested_skill_hint_rule: in copilot mode, treat `requested_skill` as a lightweight routing hint, not as an instruction to default into `decision` for ordinary direct tasks.",
+                "- turn_self_routing_rule: before substantial work, classify the current turn as `direct_answer`, `direct_action`, `stage_continue`, or `route_decision`.",
+                "- direct_answer_rule: if the user mainly wants an answer or clarification, answer with the narrowest sufficient context and avoid reading large stage state unless needed.",
+                "- direct_action_rule: if the user mainly wants one concrete task, execute the smallest useful unit first and do not expand into background research continuation in the same turn unless the user asked for it.",
+                "- stage_continue_rule: if the user mainly wants the quest to keep moving, continue from the active durable stage state after acknowledging the request.",
+                "- route_decision_rule: switch into `decision`-style reasoning only when safe continuation depends on a real route, scope, cost, branch, or scientific-direction judgment.",
+                "- decision_skill_escalation_rule: if a turn upgrades into `route_decision`, explicitly read the `decision` skill before substantial route-changing work.",
                 "- response_pattern: say what changed -> say what it means -> say what happens next",
                 "- mailbox_protocol: artifact.interact(include_recent_inbound_messages=True) remains the queued human-message mailbox and should be checked whenever human continuity matters.",
                 "- planning_rule: before non-trivial execution, make the immediate plan explicit and keep the first step small.",
@@ -1201,13 +1207,18 @@ class PromptBuilder:
                 "- git_tool_mandate: for git work inside the current quest repository or worktree, prefer `artifact.git(...)` before raw shell git commands.",
                 "- git_test_rule: if the user wants a generic git smoke test rather than a quest-repo mutation, use `bash_exec(...)` in an isolated scratch repository.",
                 "- decision_entry_rule: use `decision` only for real route, scope, cost, branch, or scientific-direction judgments; do not default to it for ordinary repo, code, environment, or execution tasks.",
+                "- micro_task_stop_rule: after finishing a `direct_answer` or `direct_action` turn, report the result plainly and wait instead of auto-continuing.",
                 "- stop_rule: once the current requested unit is done, send a concise update and wait for the next message or `/resume`.",
                 "- escalation_rule: if a route change materially affects cost, scope, or scientific direction, ask before proceeding.",
             ]
             if chinese_turn:
-                lines.append(
+                lines.append(
+                    "- tone_hint: 使用自然、礼貌、专业、带一点活泼感的中文;像靠谱又主动汇报进展的研究搭子,不要冷冰冰或官话腔;对真实好消息可自然用“都搞定啦”“有结果了”这种轻微庆祝开头,但下一句要立刻说清具体结果。"
+                )
             else:
-                lines.append(
+                lines.append(
+                    "- tone_hint: use concise, natural, warm English, lead with the conclusion, and avoid sounding cold, bureaucratic, or log-like."
+                )
             return "\n".join(lines)
         bound_conversations = snapshot.get("bound_conversations") or []
         need_research_paper = self._need_research_paper(snapshot)
@@ -1224,6 +1235,12 @@ class PromptBuilder:
             f"- standard_profile: {standard_profile if launch_mode == 'standard' else 'n/a'}",
             f"- custom_profile: {custom_profile if launch_mode == 'custom' else 'n/a'}",
             "- collaboration_mode: long-horizon, continuity-first, artifact-aware",
+            "- user_turn_self_routing_rule: on a fresh user message, first classify the turn as `direct_answer`, `direct_action`, `stage_continue`, or `route_decision` before reading additional skills or large quest context.",
+            "- direct_answer_rule: if the user mainly wants an answer or clarification, answer with the narrowest sufficient context and avoid reading large stage state unless needed.",
+            "- direct_action_rule: if the user mainly wants one concrete task, execute the smallest useful unit first and do not silently expand into broader autonomous continuation in the same turn unless the user asked for it.",
+            "- stage_continue_rule: if the user is clearly asking to continue quest progress, resume from the active durable stage state.",
+            "- route_decision_rule: open `decision`-style reasoning only when safe continuation genuinely depends on a real route, scope, cost, branch, or scientific-direction judgment.",
+            "- decision_skill_escalation_rule: if a fresh user-message turn upgrades into `route_decision`, explicitly read the `decision` skill before substantial route-changing work.",
             "- response_pattern: say what changed -> say what it means -> say what happens next",
             "- interaction_protocol: first message may be plain conversation; after that, treat artifact.interact threads and mailbox polls as the main continuity spine across TUI, web, and connectors",
             "- shared_interaction_contract_precedence: use the shared interaction contract as the default user-facing cadence; the rules below add runtime-specific execution behavior instead of restating the same chat cadence",
@@ -1251,6 +1268,7 @@ class PromptBuilder:
             "- example_and_numbers_protocol: when it materially improves understanding, include one short example or 1 to 3 key numbers or comparisons instead of relying only on vague adjectives such as better, slower, or more stable.",
             "- omission_protocol: for ordinary user-facing updates, omit file paths, file names, artifact ids, branch/worktree ids, session ids, raw commands, raw logs, and internal tool names unless the user asked for them or needs them to act",
             "- compaction_protocol: ordinary artifact.interact progress updates should usually fit in 2 to 4 short sentences and should not read like a monitoring transcript or execution diary",
+            "- micro_task_stop_rule: after a fresh user-message turn that was only `direct_answer` or `direct_action`, finish that unit and do not silently turn the same turn into a broader autonomous stage pass unless the user asked for it.",
             "- watchdog_payload_protocol: if a tool result includes `watchdog_notes`, `progress_watchdog_note`, `visibility_watchdog_note`, or `state_change_watchdog_note`, treat that as an action item to inspect state and decide whether a fresh user-visible update is actually needed; do not emit duplicate progress by reflex",
             "- human_progress_shape_protocol: ordinary progress updates should usually make three things explicit in human language: the current task, the main difficulty or latest real progress, and the concrete next measure you will take",
             "- stage_contract_protocol: stage-specific plan/checklist rules, milestone rules, literature rules, and writing rules belong in the requested skill; do not expect this runtime block to restate them",
@@ -1292,14 +1310,14 @@ class PromptBuilder:
         if chinese_turn:
             lines.extend(
                 [
-                    "- tone_hint:
+                    "- tone_hint: 使用自然、礼貌、专业、带一点活泼感的中文;必要时可自然称呼用户为“老师”,但不要每句重复;像靠谱又主动汇报进展的研究搭子,避免冷冰冰、官话化、机械模板腔;对真实好消息可自然用“都搞定啦”“有结果了”这种轻微庆祝开头,但下一句要立刻说清结果。",
                     "- connector_reply_hint: 在聊天面里优先简明说明当前状态、下一步动作、预计回传内容。",
                 ]
             )
         else:
             lines.extend(
                 [
-                    "- tone_hint: use a polite, professional,
+                    "- tone_hint: use a polite, professional, warm English tone; avoid sounding cold, bureaucratic, or like a monitoring log.",
                     "- connector_reply_hint: keep chat replies concise but operational, with explicit next steps and evidence targets.",
                 ]
             )