@researai/deepscientist 1.5.16 → 1.5.17

Files changed (82)
  1. package/README.md +66 -23
  2. package/bin/ds.js +550 -19
  3. package/docs/en/00_QUICK_START.md +65 -5
  4. package/docs/en/01_SETTINGS_REFERENCE.md +1 -1
  5. package/docs/en/09_DOCTOR.md +14 -3
  6. package/docs/en/15_CODEX_PROVIDER_SETUP.md +12 -3
  7. package/docs/en/21_LOCAL_MODEL_BACKENDS_GUIDE.md +283 -0
  8. package/docs/en/91_DEVELOPMENT.md +237 -0
  9. package/docs/en/README.md +7 -3
  10. package/docs/zh/00_QUICK_START.md +54 -5
  11. package/docs/zh/01_SETTINGS_REFERENCE.md +1 -1
  12. package/docs/zh/09_DOCTOR.md +15 -4
  13. package/docs/zh/15_CODEX_PROVIDER_SETUP.md +12 -3
  14. package/docs/zh/21_LOCAL_MODEL_BACKENDS_GUIDE.md +281 -0
  15. package/docs/zh/README.md +7 -3
  16. package/install.sh +46 -4
  17. package/package.json +2 -1
  18. package/pyproject.toml +1 -1
  19. package/src/deepscientist/__init__.py +1 -1
  20. package/src/deepscientist/bridges/connectors.py +8 -2
  21. package/src/deepscientist/codex_cli_compat.py +185 -72
  22. package/src/deepscientist/config/service.py +154 -6
  23. package/src/deepscientist/daemon/api/handlers.py +130 -25
  24. package/src/deepscientist/daemon/api/router.py +5 -0
  25. package/src/deepscientist/daemon/app.py +446 -22
  26. package/src/deepscientist/diagnostics/__init__.py +6 -0
  27. package/src/deepscientist/diagnostics/runner_failures.py +130 -0
  28. package/src/deepscientist/doctor.py +207 -3
  29. package/src/deepscientist/prompts/builder.py +22 -4
  30. package/src/deepscientist/quest/service.py +413 -13
  31. package/src/deepscientist/runners/codex.py +59 -14
  32. package/src/deepscientist/shared.py +19 -0
  33. package/src/prompts/contracts/shared_interaction.md +3 -2
  34. package/src/prompts/system.md +13 -0
  35. package/src/prompts/system_copilot.md +13 -0
  36. package/src/tui/package.json +1 -1
  37. package/src/ui/dist/assets/{AiManusChatView-COFACy7V.js → AiManusChatView-Bv-Z8YpU.js} +44 -44
  38. package/src/ui/dist/assets/{AnalysisPlugin-DnSm0GZn.js → AnalysisPlugin-BCKAfjba.js} +1 -1
  39. package/src/ui/dist/assets/{CliPlugin-CvwCmDQ5.js → CliPlugin-BCKcpc35.js} +4 -4
  40. package/src/ui/dist/assets/{CodeEditorPlugin-cOqSa0xq.js → CodeEditorPlugin-DbOfSJ8K.js} +1 -1
  41. package/src/ui/dist/assets/{CodeViewerPlugin-itb0tltR.js → CodeViewerPlugin-CbaFRrUU.js} +3 -3
  42. package/src/ui/dist/assets/{DocViewerPlugin-DqKkiCI6.js → DocViewerPlugin-DAjLVeQD.js} +3 -3
  43. package/src/ui/dist/assets/{GitCommitViewerPlugin-DVgNHBCS.js → GitCommitViewerPlugin-CIUqbUDO.js} +1 -1
  44. package/src/ui/dist/assets/{GitDiffViewerPlugin-DxL2ezFG.js → GitDiffViewerPlugin-CQACjoAA.js} +1 -1
  45. package/src/ui/dist/assets/{GitSnapshotViewer-B_RQm1YZ.js → GitSnapshotViewer-0r4nLPke.js} +1 -1
  46. package/src/ui/dist/assets/{ImageViewerPlugin-tHqlXY3n.js → ImageViewerPlugin-nBOmI2v_.js} +3 -3
  47. package/src/ui/dist/assets/{LabCopilotPanel-ClMbq5Yu.js → LabCopilotPanel-BHxOxF4z.js} +1 -1
  48. package/src/ui/dist/assets/{LabPlugin-L_SuE8ow.js → LabPlugin-BKoZGs95.js} +1 -1
  49. package/src/ui/dist/assets/{LatexPlugin-B495DTXC.js → LatexPlugin-ZwtV8pIp.js} +1 -1
  50. package/src/ui/dist/assets/{MarkdownViewerPlugin-DG28-61B.js → MarkdownViewerPlugin-DKqVfKyW.js} +3 -3
  51. package/src/ui/dist/assets/{MarketplacePlugin-BiOGT-Kj.js → MarketplacePlugin-BwxStZ9D.js} +1 -1
  52. package/src/ui/dist/assets/{NotebookEditor-C-4Kt1p9.js → NotebookEditor-BEQhaQbt.js} +1 -1
  53. package/src/ui/dist/assets/{NotebookEditor-CVsj8h_T.js → NotebookEditor-DB9N_T9q.js} +23 -23
  54. package/src/ui/dist/assets/{PdfLoader-CASDQmxJ.js → PdfLoader-eWBONbQP.js} +1 -1
  55. package/src/ui/dist/assets/{PdfMarkdownPlugin-BFhwoKsY.js → PdfMarkdownPlugin-D22YOZL3.js} +1 -1
  56. package/src/ui/dist/assets/{PdfViewerPlugin-DcOzU9vd.js → PdfViewerPlugin-c-RK9DLM.js} +3 -3
  57. package/src/ui/dist/assets/{SearchPlugin-CHj7M58O.js → SearchPlugin-CxF9ytAx.js} +1 -1
  58. package/src/ui/dist/assets/{TextViewerPlugin-CB4DYfWO.js → TextViewerPlugin-C5xqeeUH.js} +2 -2
  59. package/src/ui/dist/assets/{VNCViewer-CjlbyCB3.js → VNCViewer-BoLGLnHz.js} +1 -1
  60. package/src/ui/dist/assets/{bot-CFkZY-JP.js → bot-DREQOxzP.js} +1 -1
  61. package/src/ui/dist/assets/{chevron-up-Dq5ofbht.js → chevron-up-C9Qpx4DE.js} +1 -1
  62. package/src/ui/dist/assets/{code-DLC6G24T.js → code-WlFHE7z_.js} +1 -1
  63. package/src/ui/dist/assets/{file-content-Dv4LoZec.js → file-content-BZMz3RYp.js} +1 -1
  64. package/src/ui/dist/assets/{file-diff-panel-Denq-lC3.js → file-diff-panel-CQhw0jS2.js} +1 -1
  65. package/src/ui/dist/assets/{file-socket-Cu4Qln7Y.js → file-socket-CfQPKQKj.js} +1 -1
  66. package/src/ui/dist/assets/{git-commit-horizontal-BUh6G52n.js → git-commit-horizontal-DxZ8DCZh.js} +1 -1
  67. package/src/ui/dist/assets/{image-B9HUUddG.js → image-Bgl4VIyx.js} +1 -1
  68. package/src/ui/dist/assets/{index-Cgla8biy.css → index-BpV6lusQ.css} +1 -1
  69. package/src/ui/dist/assets/{index-Gbl53BNp.js → index-CBNVuWcP.js} +363 -363
  70. package/src/ui/dist/assets/{index-wQ7RIIRd.js → index-CwNu1aH4.js} +1 -1
  71. package/src/ui/dist/assets/{index-B2B1sg-M.js → index-DrUnlf6K.js} +1 -1
  72. package/src/ui/dist/assets/{index-DRyx7vAc.js → index-NW-h8VzN.js} +1 -1
  73. package/src/ui/dist/assets/{pdf-effect-queue-ZtnHFCAi.js → pdf-effect-queue-J8OnM0jE.js} +1 -1
  74. package/src/ui/dist/assets/{popover-DL6h35vr.js → popover-CLc0pPP8.js} +1 -1
  75. package/src/ui/dist/assets/{project-sync-CsX08Qno.js → project-sync-C9IdzdZW.js} +1 -1
  76. package/src/ui/dist/assets/{select-DvmXt1yY.js → select-Cs2PmzwL.js} +1 -1
  77. package/src/ui/dist/assets/{sigma-7jpXazui.js → sigma-ClKcHAXm.js} +1 -1
  78. package/src/ui/dist/assets/{trash-xA7kFt8i.js → trash-DwpbFr3w.js} +1 -1
  79. package/src/ui/dist/assets/{useCliAccess-DsMwDjOp.js → useCliAccess-NQ8m0Let.js} +1 -1
  80. package/src/ui/dist/assets/{wrap-text-CwMn-iqb.js → wrap-text-BC-Hltpd.js} +1 -1
  81. package/src/ui/dist/assets/{zoom-out-R-GWEhzS.js → zoom-out-E_gaeAxL.js} +1 -1
  82. package/src/ui/dist/index.html +2 -2
@@ -0,0 +1,130 @@
+ from __future__ import annotations
+
+ from dataclasses import dataclass
+
+
+ @dataclass(frozen=True)
+ class FailureDiagnosis:
+     code: str
+     problem: str
+     why: str
+     guidance: tuple[str, ...]
+     retriable: bool
+     matched_text: str | None = None
+
+
+ _MODEL_UNAVAILABLE_MARKERS = (
+     "unknown model",
+     "invalid model",
+     "model not found",
+     "unsupported model",
+     "model is not available",
+     "not authorized to use model",
+     "you do not have access",
+     "access to model",
+     "model access",
+     "unrecognized model",
+ )
+
+
+ def _build_haystack(*values: object) -> str:
+     return "\n".join(str(value or "") for value in values if str(value or "").strip())
+
+
+ def _contains(text: str, marker: str) -> bool:
+     return marker in text.lower()
+
+
+ def diagnose_runner_failure(
+     *,
+     runner_name: str,
+     summary: str = "",
+     stderr_text: str = "",
+     output_text: str = "",
+ ) -> FailureDiagnosis | None:
+     haystack = _build_haystack(summary, stderr_text, output_text)
+     lower = haystack.lower()
+     normalized_runner = str(runner_name or "").strip().lower()
+
+     if (
+         "tool call result does not follow tool call (2013)" in lower
+         or "tool result's tool id" in lower
+     ):
+         return FailureDiagnosis(
+             code="minimax_tool_result_sequence_error",
+             problem="MiniMax rejected the tool result sequence.",
+             why=(
+                 "The tool result did not immediately follow the corresponding tool call, "
+                 "or the tool result referenced a tool call id that was no longer valid."
+             ),
+             guidance=(
+                 "Keep each tool result immediately after its matching tool call.",
+                 "Do not insert an extra assistant message between a tool call and its tool result.",
+                 "For MiniMax chat-wire sessions, serialize tool use one call at a time.",
+             ),
+             retriable=False,
+             matched_text="2013",
+         )
+
+     if (
+         "invalid function arguments json string" in lower
+         or "failed to parse tool call arguments" in lower
+         or "trailing characters at line 1 column" in lower
+     ):
+         return FailureDiagnosis(
+             code="chat_wire_tool_argument_parse_error",
+             problem="The runner emitted malformed tool-call arguments.",
+             why=(
+                 "The tool-call arguments were not a single valid JSON object. "
+                 "This usually happens when multiple tool calls are batched into one response "
+                 "or when the arguments string contains trailing characters."
+             ),
+             guidance=(
+                 "Serialize tool calls one at a time instead of batching multiple MCP calls together.",
+                 "Make sure each tool call emits exactly one complete JSON object for its arguments.",
+                 "If this is a MiniMax chat-wire path, stay on the serialized single-tool compatibility route.",
+             ),
+             retriable=False,
+             matched_text="tool-call arguments",
+         )
+
+     if "missing environment variable" in lower:
+         return FailureDiagnosis(
+             code="provider_env_missing",
+             problem="A required provider environment variable is missing.",
+             why="The configured model provider expects an API key or env var that was not present in the runner environment.",
+             guidance=(
+                 "Set the required key in `~/DeepScientist/config/runners.yaml` under `runners.codex.env`.",
+                 "If you launch from a shell, export the provider key in that same shell before starting `ds`.",
+             ),
+             retriable=False,
+             matched_text="missing environment variable",
+         )
+
+     if any(marker in lower for marker in _MODEL_UNAVAILABLE_MARKERS):
+         return FailureDiagnosis(
+             code="runner_model_unavailable",
+             problem="The configured runner model is not available.",
+             why="The selected provider or Codex account could not access the requested model id.",
+             guidance=(
+                 "Set `model: inherit` for provider-backed Codex profiles unless the provider explicitly supports the model id.",
+                 "If you need a fixed model, verify that the same model works in plain `codex exec` before retrying DeepScientist.",
+             ),
+             retriable=False,
+             matched_text="model unavailable",
+         )
+
+     if normalized_runner == "codex" and "invalid params" in lower and "bad_request_error" in lower:
+         return FailureDiagnosis(
+             code="provider_invalid_params",
+             problem="The provider rejected the request parameters.",
+             why="The upstream provider returned a deterministic request-shape error instead of a transient transport failure.",
+             guidance=(
+                 "Inspect the immediately preceding tool call / tool result sequence for protocol ordering or JSON-shape mistakes.",
+                 "Do not keep retrying the same request until the request payload or provider config is corrected.",
+             ),
+             retriable=False,
+             matched_text="invalid params",
+         )
+
+     return None
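The hunk above (apparently the new `diagnostics/runner_failures.py`, judging by the +130 count in the file list) classifies failures by lowercasing the combined text once and testing ordered substring markers. A minimal, self-contained sketch of that pattern follows. The `Diagnosis` class, the `RULES` table, and `classify` are hypothetical stand-ins for illustration, not the package's API, and only two of the markers from the diff are reproduced.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnosis:
    """Hypothetical, reduced stand-in for FailureDiagnosis."""

    code: str
    retriable: bool


# Hypothetical rule table: (diagnosis code, markers), checked in order.
RULES: tuple[tuple[str, tuple[str, ...]], ...] = (
    ("provider_env_missing", ("missing environment variable",)),
    ("runner_model_unavailable", ("unknown model", "model not found")),
)


def classify(*texts: str) -> Diagnosis | None:
    """Join the non-empty inputs, lowercase once, return the first matching rule."""
    haystack = "\n".join(t for t in texts if t.strip()).lower()
    for code, markers in RULES:
        if any(marker in haystack for marker in markers):
            # Every rule in the diff is marked non-retriable; the sketch does the same.
            return Diagnosis(code=code, retriable=False)
    return None


result = classify("Codex startup failed", "Error: Model not found: gpt-x")
print(result.code if result else "unclassified")
```

Because the rules are checked in order, text that contains markers for several rules resolves to the earliest one, which is why the diff places its most specific checks first.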
@@ -1,5 +1,6 @@
  from __future__ import annotations

+ from datetime import UTC, datetime, timedelta
  import os
  import socket
  import subprocess
@@ -13,9 +14,13 @@ from urllib.request import Request, urlopen

  from .bash_exec.shells import build_exec_shell_launch, build_terminal_shell_launch
  from .config import ConfigManager
+ from .diagnostics import diagnose_runner_failure
  from .home import ensure_home_layout
  from .runtime_tools import RuntimeToolService
- from .shared import resolve_runner_binary, utc_now
+ from .shared import read_json, read_jsonl_tail, resolve_runner_binary, utc_now
+
+
+ _RUNTIME_FAILURE_LOOKBACK = timedelta(hours=24)


  def _browser_ui_url(host: str, port: int) -> str:
@@ -42,6 +47,10 @@ def _make_check(
      errors: list[str] | None = None,
      guidance: list[str] | None = None,
      details: dict[str, Any] | None = None,
+     problem: str | None = None,
+     why: str | None = None,
+     fix: list[str] | None = None,
+     evidence: list[str] | None = None,
  ) -> dict[str, Any]:
      normalized_warnings = list(warnings or [])
      normalized_errors = list(errors or [])
@@ -55,6 +64,10 @@ def _make_check(
          "errors": normalized_errors,
          "guidance": list(guidance or []),
          "details": dict(details or {}),
+         "problem": str(problem or "").strip() or None,
+         "why": str(why or "").strip() or None,
+         "fix": [str(line) for line in (fix or []) if str(line).strip()],
+         "evidence": [str(line) for line in (evidence or []) if str(line).strip()],
      }


@@ -273,6 +286,13 @@ def _check_codex(config_manager: ConfigManager) -> dict[str, Any]:
      probe_warnings = [str(value) for value in probe.get("warnings") or []]
      probe_guidance = [str(value) for value in probe.get("guidance") or []]
      summary = str(probe.get("summary") or "Codex startup probe completed.")
+     probe_details = probe.get("details") if isinstance(probe.get("details"), dict) else {}
+     diagnosis = diagnose_runner_failure(
+         runner_name="codex",
+         summary="\n".join([summary, *probe_errors]),
+         stderr_text=str(probe_details.get("stderr_excerpt") or ""),
+         output_text=str(probe_details.get("stdout_excerpt") or ""),
+     )
      if probe.get("ok"):
          return _make_check(
              check_id="codex",
@@ -284,17 +304,188 @@ def _check_codex(config_manager: ConfigManager) -> dict[str, Any]:
          )
      if not probe_guidance:
          probe_guidance = [
-             "Run `codex --login` (or `codex`) manually once and complete login, then retry `ds doctor`.",
+             "Run `codex login` (or just `codex`) manually once and complete login, then retry `ds doctor`.",
          ]
      return _make_check(
          check_id="codex",
          label="Codex CLI",
          ok=False,
-         summary=summary,
+         summary=diagnosis.problem if diagnosis is not None else summary,
          warnings=probe_warnings,
          errors=probe_errors or ["Codex startup probe did not succeed."],
          guidance=probe_guidance,
          details={"resolved_binary": resolved_binary},
+         problem=diagnosis.problem if diagnosis is not None else None,
+         why=diagnosis.why if diagnosis is not None else None,
+         fix=list(diagnosis.guidance) if diagnosis is not None else None,
+         evidence=(
+             [f"matched: {diagnosis.matched_text}"] if diagnosis is not None and diagnosis.matched_text else None
+         ),
+     )
+
+
+ def _parse_timestamp(value: object) -> datetime | None:
+     normalized = str(value or "").strip()
+     if not normalized:
+         return None
+     candidate = normalized.replace("Z", "+00:00")
+     try:
+         parsed = datetime.fromisoformat(candidate)
+     except ValueError:
+         return None
+     if parsed.tzinfo is None:
+         parsed = parsed.replace(tzinfo=UTC)
+     return parsed.astimezone(UTC)
+
+
+ def _read_runtime_failure_record(home: Path) -> dict[str, Any] | None:
+     quests_root = home / "quests"
+     if not quests_root.exists():
+         return None
+
+     latest: dict[str, Any] | None = None
+     latest_at: datetime | None = None
+     cutoff = datetime.now(UTC) - _RUNTIME_FAILURE_LOOKBACK
+     interesting_types = {
+         "runner.turn_error",
+         "runner.turn_retry_exhausted",
+         "quest.runtime_auto_resume_suppressed",
+     }
+
+     for quest_root in sorted(quests_root.glob("*/")):
+         events = read_jsonl_tail(quest_root / ".ds" / "events.jsonl", 300)
+         for event in reversed(events):
+             event_type = str(event.get("type") or "").strip()
+             if event_type not in interesting_types:
+                 continue
+             created_at = _parse_timestamp(event.get("created_at"))
+             if created_at is None or created_at < cutoff:
+                 continue
+             run_id = str(event.get("run_id") or "").strip() or None
+             stderr_text = ""
+             output_text = ""
+             if run_id:
+                 run_root = quest_root / ".ds" / "runs" / run_id
+                 result_payload = read_json(run_root / "result.json", {})
+                 if isinstance(result_payload, dict):
+                     stderr_text = str(result_payload.get("stderr_text") or "").strip()
+                     output_text = str(result_payload.get("output_text") or "").strip()
+                 stderr_path = run_root / "stderr.txt"
+                 if not stderr_text and stderr_path.exists():
+                     try:
+                         stderr_text = stderr_path.read_text(encoding="utf-8")
+                     except OSError:
+                         stderr_text = ""
+             candidate = {
+                 "quest_id": quest_root.name,
+                 "run_id": run_id,
+                 "event_type": event_type,
+                 "summary": str(event.get("summary") or "").strip(),
+                 "created_at": created_at.isoformat(),
+                 "stderr_text": stderr_text,
+                 "output_text": output_text,
+                 "recent_attempts": event.get("recent_attempts"),
+             }
+             if latest_at is None or created_at > latest_at:
+                 latest = candidate
+                 latest_at = created_at
+             break
+     return latest
+
+
+ def _check_recent_runtime_failures(home: Path) -> dict[str, Any]:
+     record = _read_runtime_failure_record(home)
+     if record is None:
+         return _make_check(
+             check_id="recent_runtime_failures",
+             label="Recent runtime failures",
+             ok=True,
+             summary="No recent quest runtime failures were found.",
+         )
+
+     event_type = str(record.get("event_type") or "").strip()
+     quest_id = str(record.get("quest_id") or "").strip() or None
+     run_id = str(record.get("run_id") or "").strip() or None
+     summary = str(record.get("summary") or "").strip()
+     details = {
+         "quest_id": quest_id,
+         "run_id": run_id,
+         "event_type": event_type,
+         "observed_at": record.get("created_at"),
+     }
+
+     if event_type == "quest.runtime_auto_resume_suppressed":
+         recent_attempts = int(record.get("recent_attempts") or 0)
+         return _make_check(
+             check_id="recent_runtime_failures",
+             label="Recent runtime failures",
+             ok=True,
+             summary="DeepScientist recently suppressed auto-resume to avoid a crash loop.",
+             warnings=["Automatic continuation was paused after repeated recovery attempts in a short window."],
+             guidance=[
+                 "Inspect the most recent failing runner path before using `/resume` again.",
+                 "If the failure was a provider-side 400/protocol error, fix that request path first instead of retrying immediately.",
+             ],
+             details=details,
+             problem="Automatic crash recovery was suppressed.",
+             why="The same quest hit repeated recovery attempts in a short window, so DeepScientist parked it instead of looping forever.",
+             fix=[
+                 "Open the latest failing quest logs and identify the deterministic runner/provider error.",
+                 "Resume manually only after the underlying runner or provider issue is corrected.",
+             ],
+             evidence=[
+                 *( [f"quest: {quest_id}"] if quest_id else [] ),
+                 f"recent recovery attempts: {recent_attempts}",
+             ],
+         )
+
+     diagnosis = diagnose_runner_failure(
+         runner_name="codex",
+         summary=summary,
+         stderr_text=str(record.get("stderr_text") or ""),
+         output_text=str(record.get("output_text") or ""),
+     )
+     if diagnosis is None:
+         return _make_check(
+             check_id="recent_runtime_failures",
+             label="Recent runtime failures",
+             ok=True,
+             summary="A recent quest runtime failure was found.",
+             warnings=[summary or "The latest quest run failed, but doctor could not classify it precisely yet."],
+             guidance=[
+                 "Open the latest run stderr and events journal for the failing quest.",
+                 "If the same failure repeats, capture the run_id and provider response text before retrying again.",
+             ],
+             details=details,
+             problem="A recent quest run failed.",
+             why="Doctor found a recent runtime failure event but could not match it to a known deterministic error pattern.",
+             fix=[
+                 "Inspect the failing run's stderr and provider response text.",
+                 "If the error is deterministic, avoid burning the retry budget until the request shape or config is fixed.",
+             ],
+             evidence=[
+                 *( [f"quest: {quest_id}"] if quest_id else [] ),
+                 *( [f"run: {run_id}"] if run_id else [] ),
+                 *( [f"summary: {summary}"] if summary else [] ),
+             ],
+         )
+
+     return _make_check(
+         check_id="recent_runtime_failures",
+         label="Recent runtime failures",
+         ok=True,
+         summary=diagnosis.problem,
+         warnings=[summary] if summary and summary != diagnosis.problem else [],
+         guidance=list(diagnosis.guidance),
+         details=details,
+         problem=diagnosis.problem,
+         why=diagnosis.why,
+         fix=list(diagnosis.guidance),
+         evidence=[
+             *( [f"quest: {quest_id}"] if quest_id else [] ),
+             *( [f"run: {run_id}"] if run_id else [] ),
+             *( [f"matched: {diagnosis.matched_text}"] if diagnosis.matched_text else [] ),
+         ],
      )


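The timestamp handling in the hunk above accepts ISO 8601 strings with a trailing `Z`, treats naive values as UTC, and ignores events older than the 24-hour lookback. A minimal sketch of that logic, under two assumptions: the names `parse_timestamp`, `is_recent`, and `LOOKBACK` are illustrative only, and `timezone.utc` is used in place of the `datetime.UTC` alias that the diff imports, since that alias only exists on Python 3.11 and later.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Mirrors the 24-hour value of _RUNTIME_FAILURE_LOOKBACK in the diff.
LOOKBACK = timedelta(hours=24)


def parse_timestamp(value: object) -> datetime | None:
    """Parse an ISO 8601 string into an aware UTC datetime, or return None."""
    text = str(value or "").strip()
    if not text:
        return None
    try:
        # Rewrite a trailing "Z" so older fromisoformat versions accept it.
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Naive timestamps are assumed to already be in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def is_recent(value: object, now: datetime) -> bool:
    """True when the timestamp parses and is not older than the lookback window."""
    parsed = parse_timestamp(value)
    return parsed is not None and parsed >= now - LOOKBACK


now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
print(is_recent("2024-05-01T18:30:00Z", now))
```

Passing `now` explicitly keeps the helper deterministic in tests; the diff instead calls `datetime.now(UTC)` once per scan.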
@@ -491,6 +682,7 @@ def run_doctor(home: Path, *, repo_root: Path) -> dict[str, Any]:
          _check_config_validation(config_manager),
          _check_runner_support(config_manager),
          _check_codex(config_manager),
+         _check_recent_runtime_failures(home),
          _check_latex_runtime(home),
          _check_bundles(repo_root),
          _check_ui_port(home, config_manager),
@@ -519,6 +711,18 @@ def render_doctor_report(report: dict[str, Any]) -> str:
          status = str(item.get("status") or "ok").upper()
          icon = {"OK": "[ok]", "WARN": "[warn]", "ERROR": "[fail]"}.get(status, "[info]")
          lines.append(f"{icon} {item.get('label')}: {item.get('summary')}")
+         problem = str(item.get("problem") or "").strip()
+         why = str(item.get("why") or "").strip()
+         fix_lines = [str(line) for line in item.get("fix") or [] if str(line).strip()]
+         evidence_lines = [str(line) for line in item.get("evidence") or [] if str(line).strip()]
+         if problem:
+             lines.append(f" problem: {problem}")
+         if why:
+             lines.append(f" why: {why}")
+         for line in fix_lines:
+             lines.append(f" fix: {line}")
+         for line in evidence_lines:
+             lines.append(f" evidence: {line}")
          for warning in item.get("warnings") or []:
              lines.append(f" warning: {warning}")
          for error in item.get("errors") or []:
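The report hunk above prints the new `problem`, `why`, `fix`, and `evidence` fields ahead of the existing warnings. A minimal sketch of how one check dictionary might render, assuming a hypothetical `render_check` helper and a two-space indent (the exact leading whitespace in the original f-strings is not recoverable from this diff):

```python
from __future__ import annotations

from typing import Any


def render_check(item: dict[str, Any]) -> list[str]:
    """Render one doctor check as text lines, structured fields before warnings."""
    status = str(item.get("status") or "ok").upper()
    icon = {"OK": "[ok]", "WARN": "[warn]", "ERROR": "[fail]"}.get(status, "[info]")
    lines = [f"{icon} {item.get('label')}: {item.get('summary')}"]
    problem = str(item.get("problem") or "").strip()
    why = str(item.get("why") or "").strip()
    if problem:
        lines.append(f"  problem: {problem}")
    if why:
        lines.append(f"  why: {why}")
    # fix and evidence are lists; blank entries are skipped, as in the diff.
    for key in ("fix", "evidence"):
        for entry in item.get(key) or []:
            if str(entry).strip():
                lines.append(f"  {key}: {entry}")
    for warning in item.get("warnings") or []:
        lines.append(f"  warning: {warning}")
    return lines


example = {
    "status": "warn",
    "label": "Recent runtime failures",
    "summary": "A required provider environment variable is missing.",
    "why": "The runner environment lacked an expected key.",
    "fix": ["Export the provider key before starting `ds`."],
    "evidence": ["matched: missing environment variable"],
}
print("\n".join(render_check(example)))
```

Missing or empty fields simply produce no line, so checks that predate the new fields render exactly as before.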
@@ -1192,6 +1192,12 @@ class PromptBuilder:
                  "- collaboration_mode: user-directed copilot",
                  "- freeform_task_rule: if the user asks for a concrete research task, solve that task directly before introducing stage-routing language.",
                  "- requested_skill_hint_rule: in copilot mode, treat `requested_skill` as a lightweight routing hint, not as an instruction to default into `decision` for ordinary direct tasks.",
+                 "- turn_self_routing_rule: before substantial work, classify the current turn as `direct_answer`, `direct_action`, `stage_continue`, or `route_decision`.",
+                 "- direct_answer_rule: if the user mainly wants an answer or clarification, answer with the narrowest sufficient context and avoid reading large stage state unless needed.",
+                 "- direct_action_rule: if the user mainly wants one concrete task, execute the smallest useful unit first and do not expand into background research continuation in the same turn unless the user asked for it.",
+                 "- stage_continue_rule: if the user mainly wants the quest to keep moving, continue from the active durable stage state after acknowledging the request.",
+                 "- route_decision_rule: switch into `decision`-style reasoning only when safe continuation depends on a real route, scope, cost, branch, or scientific-direction judgment.",
+                 "- decision_skill_escalation_rule: if a turn upgrades into `route_decision`, explicitly read the `decision` skill before substantial route-changing work.",
                  "- response_pattern: say what changed -> say what it means -> say what happens next",
                  "- mailbox_protocol: artifact.interact(include_recent_inbound_messages=True) remains the queued human-message mailbox and should be checked whenever human continuity matters.",
                  "- planning_rule: before non-trivial execution, make the immediate plan explicit and keep the first step small.",
@@ -1201,13 +1207,18 @@ class PromptBuilder:
                  "- git_tool_mandate: for git work inside the current quest repository or worktree, prefer `artifact.git(...)` before raw shell git commands.",
                  "- git_test_rule: if the user wants a generic git smoke test rather than a quest-repo mutation, use `bash_exec(...)` in an isolated scratch repository.",
                  "- decision_entry_rule: use `decision` only for real route, scope, cost, branch, or scientific-direction judgments; do not default to it for ordinary repo, code, environment, or execution tasks.",
+                 "- micro_task_stop_rule: after finishing a `direct_answer` or `direct_action` turn, report the result plainly and wait instead of auto-continuing.",
                  "- stop_rule: once the current requested unit is done, send a concise update and wait for the next message or `/resume`.",
                  "- escalation_rule: if a route change materially affects cost, scope, or scientific direction, ask before proceeding.",
              ]
              if chinese_turn:
-                 lines.append("- tone_hint: 使用自然、礼貌、专业的中文,先解释结论,再说明下一步。")
+                 lines.append(
+                     "- tone_hint: 使用自然、礼貌、专业、带一点活泼感的中文;像靠谱又主动汇报进展的研究搭子,不要冷冰冰或官话腔;对真实好消息可自然用“都搞定啦”“有结果了”这种轻微庆祝开头,但下一句要立刻说清具体结果。"
+                 )
              else:
-                 lines.append("- tone_hint: use concise, natural, professional English and lead with the conclusion.")
+                 lines.append(
+                     "- tone_hint: use concise, natural, warm English, lead with the conclusion, and avoid sounding cold, bureaucratic, or log-like."
+                 )
              return "\n".join(lines)
          bound_conversations = snapshot.get("bound_conversations") or []
          need_research_paper = self._need_research_paper(snapshot)
@@ -1224,6 +1235,12 @@ class PromptBuilder:
              f"- standard_profile: {standard_profile if launch_mode == 'standard' else 'n/a'}",
              f"- custom_profile: {custom_profile if launch_mode == 'custom' else 'n/a'}",
              "- collaboration_mode: long-horizon, continuity-first, artifact-aware",
+             "- user_turn_self_routing_rule: on a fresh user message, first classify the turn as `direct_answer`, `direct_action`, `stage_continue`, or `route_decision` before reading additional skills or large quest context.",
+             "- direct_answer_rule: if the user mainly wants an answer or clarification, answer with the narrowest sufficient context and avoid reading large stage state unless needed.",
+             "- direct_action_rule: if the user mainly wants one concrete task, execute the smallest useful unit first and do not silently expand into broader autonomous continuation in the same turn unless the user asked for it.",
+             "- stage_continue_rule: if the user is clearly asking to continue quest progress, resume from the active durable stage state.",
+             "- route_decision_rule: open `decision`-style reasoning only when safe continuation genuinely depends on a real route, scope, cost, branch, or scientific-direction judgment.",
+             "- decision_skill_escalation_rule: if a fresh user-message turn upgrades into `route_decision`, explicitly read the `decision` skill before substantial route-changing work.",
              "- response_pattern: say what changed -> say what it means -> say what happens next",
              "- interaction_protocol: first message may be plain conversation; after that, treat artifact.interact threads and mailbox polls as the main continuity spine across TUI, web, and connectors",
              "- shared_interaction_contract_precedence: use the shared interaction contract as the default user-facing cadence; the rules below add runtime-specific execution behavior instead of restating the same chat cadence",
@@ -1251,6 +1268,7 @@ class PromptBuilder:
              "- example_and_numbers_protocol: when it materially improves understanding, include one short example or 1 to 3 key numbers or comparisons instead of relying only on vague adjectives such as better, slower, or more stable.",
              "- omission_protocol: for ordinary user-facing updates, omit file paths, file names, artifact ids, branch/worktree ids, session ids, raw commands, raw logs, and internal tool names unless the user asked for them or needs them to act",
              "- compaction_protocol: ordinary artifact.interact progress updates should usually fit in 2 to 4 short sentences and should not read like a monitoring transcript or execution diary",
+             "- micro_task_stop_rule: after a fresh user-message turn that was only `direct_answer` or `direct_action`, finish that unit and do not silently turn the same turn into a broader autonomous stage pass unless the user asked for it.",
              "- watchdog_payload_protocol: if a tool result includes `watchdog_notes`, `progress_watchdog_note`, `visibility_watchdog_note`, or `state_change_watchdog_note`, treat that as an action item to inspect state and decide whether a fresh user-visible update is actually needed; do not emit duplicate progress by reflex",
              "- human_progress_shape_protocol: ordinary progress updates should usually make three things explicit in human language: the current task, the main difficulty or latest real progress, and the concrete next measure you will take",
              "- stage_contract_protocol: stage-specific plan/checklist rules, milestone rules, literature rules, and writing rules belong in the requested skill; do not expect this runtime block to restate them",
@@ -1292,14 +1310,14 @@ class PromptBuilder:
          if chinese_turn:
              lines.extend(
                  [
-                     "- tone_hint: 使用自然、礼貌、专业、偏正式的中文;必要时可自然称呼用户为“老师”,但不要每句重复;避免机械模板腔。",
+                     "- tone_hint: 使用自然、礼貌、专业、带一点活泼感的中文;必要时可自然称呼用户为“老师”,但不要每句重复;像靠谱又主动汇报进展的研究搭子,避免冷冰冰、官话化、机械模板腔;对真实好消息可自然用“都搞定啦”“有结果了”这种轻微庆祝开头,但下一句要立刻说清结果。",
                      "- connector_reply_hint: 在聊天面里优先简明说明当前状态、下一步动作、预计回传内容。",
                  ]
              )
          else:
              lines.extend(
                  [
-                     "- tone_hint: use a polite, professional, gentlemanly English tone.",
+                     "- tone_hint: use a polite, professional, warm English tone; avoid sounding cold, bureaucratic, or like a monitoring log.",
                      "- connector_reply_hint: keep chat replies concise but operational, with explicit next steps and evidence targets.",
                  ]
              )