@researai/deepscientist 1.5.15 → 1.5.17
- package/README.md +385 -104
- package/bin/ds.js +1241 -110
- package/docs/en/00_QUICK_START.md +100 -19
- package/docs/en/01_SETTINGS_REFERENCE.md +34 -1
- package/docs/en/02_START_RESEARCH_GUIDE.md +7 -0
- package/docs/en/05_TUI_GUIDE.md +6 -0
- package/docs/en/06_RUNTIME_AND_CANVAS.md +4 -3
- package/docs/en/09_DOCTOR.md +25 -8
- package/docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +63 -13
- package/docs/en/15_CODEX_PROVIDER_SETUP.md +37 -11
- package/docs/en/19_EXTERNAL_CONTROLLER_GUIDE.md +226 -0
- package/docs/en/19_LOCAL_BROWSER_AUTH.md +70 -0
- package/docs/en/20_WORKSPACE_MODES_GUIDE.md +250 -0
- package/docs/en/21_LOCAL_MODEL_BACKENDS_GUIDE.md +283 -0
- package/docs/en/91_DEVELOPMENT.md +237 -0
- package/docs/en/README.md +24 -2
- package/docs/zh/00_QUICK_START.md +89 -19
- package/docs/zh/01_SETTINGS_REFERENCE.md +34 -1
- package/docs/zh/02_START_RESEARCH_GUIDE.md +7 -0
- package/docs/zh/05_TUI_GUIDE.md +6 -0
- package/docs/zh/09_DOCTOR.md +26 -9
- package/docs/zh/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +63 -13
- package/docs/zh/15_CODEX_PROVIDER_SETUP.md +37 -11
- package/docs/zh/19_EXTERNAL_CONTROLLER_GUIDE.md +226 -0
- package/docs/zh/19_LOCAL_BROWSER_AUTH.md +68 -0
- package/docs/zh/20_WORKSPACE_MODES_GUIDE.md +251 -0
- package/docs/zh/21_LOCAL_MODEL_BACKENDS_GUIDE.md +281 -0
- package/docs/zh/README.md +24 -2
- package/install.sh +46 -4
- package/package.json +2 -1
- package/pyproject.toml +1 -1
- package/src/deepscientist/__init__.py +1 -1
- package/src/deepscientist/acp/envelope.py +6 -0
- package/src/deepscientist/artifact/service.py +647 -22
- package/src/deepscientist/bash_exec/service.py +234 -9
- package/src/deepscientist/bridges/connectors.py +8 -2
- package/src/deepscientist/cli.py +115 -19
- package/src/deepscientist/codex_cli_compat.py +367 -22
- package/src/deepscientist/config/models.py +2 -1
- package/src/deepscientist/config/service.py +183 -13
- package/src/deepscientist/daemon/api/handlers.py +255 -31
- package/src/deepscientist/daemon/api/router.py +9 -0
- package/src/deepscientist/daemon/app.py +1146 -105
- package/src/deepscientist/diagnostics/__init__.py +6 -0
- package/src/deepscientist/diagnostics/runner_failures.py +130 -0
- package/src/deepscientist/doctor.py +207 -3
- package/src/deepscientist/gitops/__init__.py +10 -1
- package/src/deepscientist/gitops/diff.py +129 -0
- package/src/deepscientist/gitops/service.py +4 -1
- package/src/deepscientist/mcp/server.py +39 -0
- package/src/deepscientist/prompts/builder.py +275 -34
- package/src/deepscientist/quest/layout.py +15 -2
- package/src/deepscientist/quest/service.py +707 -55
- package/src/deepscientist/quest/stage_views.py +6 -1
- package/src/deepscientist/runners/codex.py +143 -43
- package/src/deepscientist/shared.py +19 -0
- package/src/deepscientist/skills/__init__.py +2 -2
- package/src/deepscientist/skills/installer.py +196 -5
- package/src/deepscientist/skills/registry.py +66 -0
- package/src/prompts/connectors/qq.md +18 -8
- package/src/prompts/connectors/weixin.md +16 -6
- package/src/prompts/contracts/shared_interaction.md +14 -2
- package/src/prompts/system.md +23 -5
- package/src/prompts/system_copilot.md +56 -0
- package/src/skills/analysis-campaign/SKILL.md +1 -0
- package/src/skills/baseline/SKILL.md +8 -0
- package/src/skills/decision/SKILL.md +8 -0
- package/src/skills/experiment/SKILL.md +8 -0
- package/src/skills/figure-polish/SKILL.md +1 -0
- package/src/skills/finalize/SKILL.md +1 -0
- package/src/skills/idea/SKILL.md +1 -0
- package/src/skills/intake-audit/SKILL.md +8 -0
- package/src/skills/mentor/SKILL.md +217 -0
- package/src/skills/mentor/references/correction-rules.md +210 -0
- package/src/skills/mentor/references/knowledge-profile.md +91 -0
- package/src/skills/mentor/references/persona-profile.md +138 -0
- package/src/skills/mentor/references/taste-profile.md +128 -0
- package/src/skills/mentor/references/thought-style-profile.md +138 -0
- package/src/skills/mentor/references/work-profile.md +289 -0
- package/src/skills/mentor/references/workflow-profile.md +240 -0
- package/src/skills/optimize/SKILL.md +1 -0
- package/src/skills/rebuttal/SKILL.md +1 -0
- package/src/skills/review/SKILL.md +1 -0
- package/src/skills/scout/SKILL.md +8 -0
- package/src/skills/write/SKILL.md +1 -0
- package/src/tui/dist/app/AppContainer.js +19 -11
- package/src/tui/dist/index.js +4 -1
- package/src/tui/dist/lib/api.js +33 -3
- package/src/tui/package.json +1 -1
- package/src/ui/dist/assets/AiManusChatView-Bv-Z8YpU.js +204 -0
- package/src/ui/dist/assets/AnalysisPlugin-BCKAfjba.js +1 -0
- package/src/ui/dist/assets/CliPlugin-BCKcpc35.js +109 -0
- package/src/ui/dist/assets/CodeEditorPlugin-DbOfSJ8K.js +2 -0
- package/src/ui/dist/assets/CodeViewerPlugin-CbaFRrUU.js +270 -0
- package/src/ui/dist/assets/DocViewerPlugin-DAjLVeQD.js +7 -0
- package/src/ui/dist/assets/GitCommitViewerPlugin-CIUqbUDO.js +1 -0
- package/src/ui/dist/assets/GitDiffViewerPlugin-CQACjoAA.js +6 -0
- package/src/ui/dist/assets/GitSnapshotViewer-0r4nLPke.js +30 -0
- package/src/ui/dist/assets/ImageViewerPlugin-nBOmI2v_.js +26 -0
- package/src/ui/dist/assets/LabCopilotPanel-BHxOxF4z.js +14 -0
- package/src/ui/dist/assets/LabPlugin-BKoZGs95.js +22 -0
- package/src/ui/dist/assets/LatexPlugin-ZwtV8pIp.js +25 -0
- package/src/ui/dist/assets/MarkdownViewerPlugin-DKqVfKyW.js +128 -0
- package/src/ui/dist/assets/MarketplacePlugin-BwxStZ9D.js +13 -0
- package/src/ui/dist/assets/NotebookEditor-BEQhaQbt.js +81 -0
- package/src/ui/dist/assets/{NotebookEditor-CccQYZjX.css → NotebookEditor-BHH8rdGj.css} +1 -1
- package/src/ui/dist/assets/NotebookEditor-BOr3x3Ej.css +1 -0
- package/src/ui/dist/assets/NotebookEditor-DB9N_T9q.js +361 -0
- package/src/ui/dist/assets/PdfLoader-Cy5jtWrr.css +1 -0
- package/src/ui/dist/assets/PdfLoader-eWBONbQP.js +16 -0
- package/src/ui/dist/assets/PdfMarkdownPlugin-D22YOZL3.js +1 -0
- package/src/ui/dist/assets/PdfViewerPlugin-c-RK9DLM.js +17 -0
- package/src/ui/dist/assets/PdfViewerPlugin-nwwE-fjJ.css +1 -0
- package/src/ui/dist/assets/SearchPlugin-CxF9ytAx.js +16 -0
- package/src/ui/dist/assets/SearchPlugin-DA4en4hK.css +1 -0
- package/src/ui/dist/assets/TextViewerPlugin-C5xqeeUH.js +54 -0
- package/src/ui/dist/assets/VNCViewer-BoLGLnHz.js +11 -0
- package/src/ui/dist/assets/bot-DREQOxzP.js +6 -0
- package/src/ui/dist/assets/browser-CTB2jwNe.js +8 -0
- package/src/ui/dist/assets/chevron-up-C9Qpx4DE.js +6 -0
- package/src/ui/dist/assets/code-WlFHE7z_.js +6 -0
- package/src/ui/dist/assets/file-content-BZMz3RYp.js +1 -0
- package/src/ui/dist/assets/file-diff-panel-CQhw0jS2.js +1 -0
- package/src/ui/dist/assets/file-jump-queue-DA-SdG__.js +1 -0
- package/src/ui/dist/assets/file-socket-CfQPKQKj.js +1 -0
- package/src/ui/dist/assets/git-commit-horizontal-DxZ8DCZh.js +6 -0
- package/src/ui/dist/assets/image-Bgl4VIyx.js +6 -0
- package/src/ui/dist/assets/index-BpV6lusQ.css +33 -0
- package/src/ui/dist/assets/index-CBNVuWcP.js +2496 -0
- package/src/ui/dist/assets/index-CwNu1aH4.js +11 -0
- package/src/ui/dist/assets/index-DrUnlf6K.js +1 -0
- package/src/ui/dist/assets/index-NW-h8VzN.js +1 -0
- package/src/ui/dist/assets/monaco-CiHMMNH_.js +1 -0
- package/src/ui/dist/assets/pdf-effect-queue-J8OnM0jE.js +6 -0
- package/src/ui/dist/assets/plugin-monaco-C8UgLomw.js +19 -0
- package/src/ui/dist/assets/plugin-notebook-HbW2K-1c.js +169 -0
- package/src/ui/dist/assets/plugin-pdf-CR8hgQBV.js +357 -0
- package/src/ui/dist/assets/plugin-terminal-MXFIPun8.js +227 -0
- package/src/ui/dist/assets/popover-CLc0pPP8.js +1 -0
- package/src/ui/dist/assets/project-sync-C9IdzdZW.js +1 -0
- package/src/ui/dist/assets/select-Cs2PmzwL.js +11 -0
- package/src/ui/dist/assets/sigma-ClKcHAXm.js +6 -0
- package/src/ui/dist/assets/trash-DwpbFr3w.js +11 -0
- package/src/ui/dist/assets/useCliAccess-NQ8m0Let.js +1 -0
- package/src/ui/dist/assets/useFileDiffOverlay-FuhcnKiw.js +1 -0
- package/src/ui/dist/assets/wrap-text-BC-Hltpd.js +11 -0
- package/src/ui/dist/assets/zoom-out-E_gaeAxL.js +11 -0
- package/src/ui/dist/index.html +5 -2
- package/src/ui/dist/assets/AiManusChatView-DDjbFnbt.js +0 -26597
- package/src/ui/dist/assets/AnalysisPlugin-Yb5IdmaU.js +0 -123
- package/src/ui/dist/assets/CliPlugin-e64sreyu.js +0 -31037
- package/src/ui/dist/assets/CodeEditorPlugin-C4D2TIkU.js +0 -427
- package/src/ui/dist/assets/CodeViewerPlugin-BVoNZIvC.js +0 -905
- package/src/ui/dist/assets/DocViewerPlugin-CLChbllo.js +0 -278
- package/src/ui/dist/assets/GitDiffViewerPlugin-C4xeFyFQ.js +0 -2661
- package/src/ui/dist/assets/ImageViewerPlugin-OiMUAcLi.js +0 -500
- package/src/ui/dist/assets/LabCopilotPanel-BjD2ThQF.js +0 -4104
- package/src/ui/dist/assets/LabPlugin-DQPg-NrB.js +0 -2677
- package/src/ui/dist/assets/LatexPlugin-CI05XAV9.js +0 -1792
- package/src/ui/dist/assets/MarkdownViewerPlugin-DpeBLYZf.js +0 -308
- package/src/ui/dist/assets/MarketplacePlugin-DolE58Q2.js +0 -413
- package/src/ui/dist/assets/NotebookEditor-7Qm2rSWD.js +0 -4214
- package/src/ui/dist/assets/NotebookEditor-C1kWaxKi.js +0 -84873
- package/src/ui/dist/assets/NotebookEditor-C3VQ7ylN.css +0 -1405
- package/src/ui/dist/assets/PdfLoader-BfOHw8Zw.js +0 -25468
- package/src/ui/dist/assets/PdfLoader-C-Y707R3.css +0 -49
- package/src/ui/dist/assets/PdfMarkdownPlugin-BulDREv1.js +0 -409
- package/src/ui/dist/assets/PdfViewerPlugin-C-daaOaL.js +0 -3095
- package/src/ui/dist/assets/PdfViewerPlugin-DQ11QcSf.css +0 -3627
- package/src/ui/dist/assets/SearchPlugin-CjpaiJ3A.js +0 -741
- package/src/ui/dist/assets/SearchPlugin-DDMrGDkh.css +0 -379
- package/src/ui/dist/assets/TextViewerPlugin-BxIyqPQC.js +0 -472
- package/src/ui/dist/assets/VNCViewer-HAg9mF7M.js +0 -18821
- package/src/ui/dist/assets/awareness-C0NPR2Dj.js +0 -292
- package/src/ui/dist/assets/bot-0DYntytV.js +0 -21
- package/src/ui/dist/assets/browser-BAcuE0Xj.js +0 -2895
- package/src/ui/dist/assets/code-B20Slj_w.js +0 -17
- package/src/ui/dist/assets/file-content-DT24KFma.js +0 -377
- package/src/ui/dist/assets/file-diff-panel-DK13YPql.js +0 -92
- package/src/ui/dist/assets/file-jump-queue-r5XKgJEV.js +0 -16
- package/src/ui/dist/assets/file-socket-B4T2o4nR.js +0 -58
- package/src/ui/dist/assets/function-B5QZkkHC.js +0 -1895
- package/src/ui/dist/assets/image-DSeR_sDS.js +0 -18
- package/src/ui/dist/assets/index-BrFje2Uk.js +0 -120
- package/src/ui/dist/assets/index-BwRJaoTl.js +0 -25
- package/src/ui/dist/assets/index-D_E4281X.js +0 -221322
- package/src/ui/dist/assets/index-DnYB3xb1.js +0 -159
- package/src/ui/dist/assets/index-G7AcWcMu.css +0 -12594
- package/src/ui/dist/assets/monaco-LExaAN3Y.js +0 -623
- package/src/ui/dist/assets/pdf-effect-queue-BJk5okWJ.js +0 -47
- package/src/ui/dist/assets/pdf_viewer-e0g1is2C.js +0 -8206
- package/src/ui/dist/assets/popover-D3Gg_FoV.js +0 -476
- package/src/ui/dist/assets/project-sync-C_ygLlVU.js +0 -297
- package/src/ui/dist/assets/select-CpAK6uWm.js +0 -1690
- package/src/ui/dist/assets/sigma-DEccaSgk.js +0 -22
- package/src/ui/dist/assets/square-check-big-uUfyVsbD.js +0 -17
- package/src/ui/dist/assets/trash-CXvwwSe8.js +0 -32
- package/src/ui/dist/assets/useCliAccess-Bnop4mgR.js +0 -957
- package/src/ui/dist/assets/useFileDiffOverlay-B8eUAX0I.js +0 -53
- package/src/ui/dist/assets/wrap-text-9vbOBpkW.js +0 -35
- package/src/ui/dist/assets/yjs-DncrqiZ8.js +0 -11243
- package/src/ui/dist/assets/zoom-out-BgVMmOW4.js +0 -34
@@ -0,0 +1,128 @@
# Mentor Taste Profile

This file captures the user's stable product and UI preferences.

## Product taste

### 1. Real systems should feel coherent

The user strongly prefers products that feel like one system, not many stitched demos.

Good signs:

- shared data contracts
- shared viewers
- shared interaction metaphors
- one obvious truth source
- one route to inspect the same object across surfaces

Bad signs:

- duplicate pages with slightly different semantics
- one-off dialogs that bypass the normal model
- multiple status models for the same run
- patch-specific UI that does not compose with the workspace

### 2. Explanatory power beats decoration

The product should help the user understand:

- where they are
- what is happening
- what changed
- what they can do next

Interfaces that look stylish but hide state or route are not aligned with this profile.

## UI taste

### Visual direction

Stable user preference:

- low saturation
- Morandi family palettes
- restrained contrast
- clear hierarchy
- minimal but intentional chrome

This does not mean "flat and boring".
It means the interface should feel composed rather than noisy.

### Layout

Preferred layout traits:

- obvious primary area
- clean secondary controls
- low duplication
- enough whitespace to read state
- tabs and dialogs that map to real objects

### Motion

Motion should:

- reveal status
- smooth transitions
- preserve continuity

Motion should not:

- distract
- simulate fake progress
- hide slow data paths
- force a new metaphor when the system already has one

### Surface-specific taste

#### Workspace and copilot

- timelines should preserve order and truth
- tool calls should feel inspectable, not magical
- inputs should make current mode obvious
- tabs should open real objects, not decorative placeholders
- progress surfaces should separate:
  - completed
  - running
  - blocked
  - next checkpoint
- user-facing updates should be concise but evidence-bearing, not generic reassurance
- user-facing detail should be inspectable without unnecessarily exposing private identifiers or secret-bearing payloads

#### Settings and flows

- use stepwise flows when the contract is sequential
- keep forms explicit
- explain why a setting exists, what it enables, and what risk it carries

#### Research canvases

- branch-centric or evidence-centric views are preferred over arbitrary node spam
- every visible node should map to something durable
- opening a node should expose the actual evidence chain
- if a paper or experiment is claimed complete, the UI should make it easy to inspect the actual supporting files, result records, and mapping into the outline or evidence ledger

## Taste anti-goals

Avoid:

- generic AI cards everywhere
- unrelated gradients and random color accents
- fake complexity
- too many nested frames
- long explanatory paragraphs in the interface
- a beautiful shell over a confused model
- a progress badge that hides the real blocker
- polished finality cues when the underlying route is still being contested by the user
- exposing raw secret or identity-bearing fields when a cleaner semantic label would suffice

## Fast checks

When judging a UI route, ask:

1. Does this expose the real object or invent a presentation-only object?
2. Does this make the user's next action clearer?
3. Does this reduce duplication?
4. Does the visual treatment support the underlying state?
5. Would this still make sense after the backend evolves?

@@ -0,0 +1,138 @@
# Mentor Thought Style Profile

This file captures the user's stable problem-solving style at the thinking layer.

## Core reasoning habits

### 1. Start from the real object, not the nearest symptom

The user consistently prefers a reasoning chain like:

1. what is the actual object under discussion?
2. what is the actual source of truth for that object?
3. what contract is supposed to govern it?
4. where is the first observable mismatch?
5. what is the smallest convergent fix?

This applies to:

- UI bugs
- workflow confusion
- prompt / skill drift
- experiment bookkeeping
- paper readiness questions

### 2. Distinguish four layers explicitly

The user repeatedly rewards answers that separate:

- symptom
- mechanism
- contract
- route

Bad answers collapse them together.
Good answers identify which layer is actually broken.

### 3. Treat user suspicion as evidence, not annoyance

When the user says:

- a concrete implementation detail looks wrong
- a runtime parameter looks inconsistent with the requested target
- a visible page still has not changed
- a deliverable seems to be missing promised evidence

the default reasoning stance should be:

- assume there may be a real mismatch
- verify the claim directly
- then decide whether the claim is true, false, or only partially true

### 4. Separate plan mode from execution mode

The user uses two distinct modes:

- planning mode:
  - think carefully
  - review code first
  - give a route
  - wait for approval
- execution mode:
  - start modifying
  - keep momentum
  - run tests
  - keep updating progress

Good mentor calibration must preserve the difference.

### 5. Prefer structured decomposition over broad brainstorming

The user does not usually want "many possible ideas" unless explicitly asking for ideation.

Default preferred pattern:

- one main route
- one or two clear alternatives if needed
- one reason each alternative is weaker

For codebase exploration and audit tasks, a second preferred pattern appears repeatedly:

- define the exact directories or modules to inspect
- define the exact categories of findings to return
- define the exact output shape expected back

This is not bureaucracy.
It is how the user keeps large explorations bounded and auditable.

### 6. Think in contracts and invariants

Across Codex, Claude Code, and DeepScientist quest material, the user repeatedly reasons in terms of:

- what should always be true
- what must be persisted
- what must stay in sync
- what transitions are allowed

This means the mentor profile should routinely ask:

- what invariant is being violated?
- what contract is missing?
- what layer is stale?

### 7. Do not let language hide uncertainty

The user tolerates uncertainty when it is explicit.
The user dislikes false confidence dressed up as smooth prose.

Good phrasing:

- explicit distinction between verified and unverified claims
- explicit distinction between likely code path and confirmed live state
- explicit distinction between control-surface updates and measured result changes

Bad phrasing:

- vague reassurance in place of verification
- broad health claims without checking the user's stated concern

## Preferred analytical outputs

The strongest answers usually do at least three of these:

- identify the truth source
- identify the stale or misleading layer
- define the acceptance gate
- define the next checkpoint
- explain why one route is smaller and safer than another
- preserve scope boundaries instead of wandering across the whole codebase
- return the requested structure rather than an impressive but differently shaped answer

## Thought anti-patterns

Avoid:

- over-explaining the same point after the user has moved on
- defending a previous conclusion after the user changed the goal
- mistaking narrative continuity for actual verification
- treating the latest file edit as proof that the runtime or UI changed

@@ -0,0 +1,289 @@
# Mentor Work Profile

This file captures the user's stable technical standards.

## Core operating principles

### 1. Architecture before patching

- First identify the real system boundary.
- Then identify the real truth source.
- Only after that choose an implementation route.

Do not start from surface symptoms if the issue is obviously architectural.

### 2. Prefer one convergent model

The user consistently prefers:

- one timeline model instead of split message vs tool models
- one viewer instead of many partial viewers
- one target contract instead of ad hoc special cases
- one durable research route instead of parallel undocumented paths

If several systems overlap, the mentor default is to merge or thin them rather than add a fourth layer.

### 3. Backend truth first, UI second

The UI is not allowed to invent semantics that the backend cannot justify.

Good frontend work should reflect:

- the actual status model
- the actual event order
- the actual branch or worktree lineage
- the actual artifact or connector contract

If the UI looks nice but misstates truth, the result is still wrong.

### 4. Durable state matters

The user consistently values:

- Git as durable lineage
- worktrees and branches as explicit divergence
- artifacts as durable route and result records
- files as stable state
- prompts and skills as visible workflow control

Whenever state matters, make it durable and inspectable.

### 5. Verification is part of the implementation

The user does not treat "implemented" as complete unless it is also:

- inspected
- tested
- run
- compared against the intended contract

Preferred pattern:

1. identify the exact failure mode
2. patch the real leverage point
3. run the smallest useful verification
4. say what remains uncertain

Quest dialogue evidence adds one more hard rule:

5. if the user points to a specific suspected mismatch, verify that specific mismatch directly before giving a broader health summary

### 6. Prefer minimal surface expansion

Before adding:

- a new page
- a new endpoint
- a new tool
- a new state object
- a new workflow

first ask whether the current system can be extended more cleanly.

This is especially important for research and workspace flows:

- do not add a second paper viewer if the real fix is to make the existing one load the right durable object
- do not add a second experiment-tracking contract if the current artifact protocol can express it
- do not add a second progress system if the real issue is that the current progress messages are underspecified

### 7. Make IDs, paths, and route objects explicit

The user repeatedly pushes for:

- explicit ids
- explicit path rules
- explicit route transitions
- explicit query interfaces when agent-generated ids are needed

If an agent must reference something, the system should expose a reliable way to query it.
Do not leave critical references to guesswork.

Quest conversations also show a practical preference:

- when reporting durable work, prefer exact paths and exact run / experiment / idea ids
- especially when the user is checking whether a result is real, current, or already mapped into the paper contract

Privacy boundary:

- use exact ids and exact paths for internal debugging and local workspace truth checks
- but do not surface secrets, connector conversation ids, personal handles, access tokens, or unnecessary workstation-specific absolute paths in outward-facing summaries
- if a relative path or semantic object id is sufficient, prefer that over exposing a raw local machine path

### 8. Prompt, skill, MCP, and UI must agree

The user strongly prefers systems where:

- the prompt says the same workflow the skill expects
- the skill uses the same tool contracts the MCP server exposes
- the UI renders the same objects the backend actually persists

If these layers diverge, fix the divergence instead of documenting around it.

### 8A. Privacy-preserving truth reporting

The user wants strong traceability, but not accidental data leakage.

Preferred rule:

- keep private truth accessible to the runtime
- expose only the minimum necessary identifier to the user-facing surface

Examples:

- okay: run id, idea id, experiment id, semantic relative path, sanitized metric summary
- avoid by default: raw connector ids, phone-like ids, OpenID-like ids, API keys, bearer tokens, machine-specific personal paths, or copied private messages that are not needed for the current task

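The privacy-preserving reporting rule can be illustrated with a small redaction pass applied to outward-facing summaries. This is a minimal sketch and not part of the package: the names `sanitize_summary` and `relativize`, the `workspace_root` parameter, and both regex patterns (including the `sk-` key prefix) are assumptions chosen for illustration, and real secret formats vary between providers.

```python
import re
from pathlib import PurePosixPath

# Hypothetical patterns; real token and key formats differ between providers.
_BEARER = re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]+")
_API_KEY = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}")
# Absolute POSIX-style path that does not start in the middle of a word.
_ABS_PATH = re.compile(r"(?<![\w.])/(?:[\w.-]+/)*[\w.-]+")


def relativize(path: str, workspace_root: str) -> str:
    """Return a workspace-relative path when possible, else only the file name."""
    p = PurePosixPath(path)
    try:
        return str(p.relative_to(workspace_root))
    except ValueError:
        # Path lies outside the workspace: keep only the final component.
        return p.name


def sanitize_summary(text: str, workspace_root: str) -> str:
    """Mask token-like strings and shorten absolute paths in a summary."""
    text = _BEARER.sub("Bearer [redacted]", text)
    text = _API_KEY.sub("[redacted-key]", text)
    return _ABS_PATH.sub(lambda m: relativize(m.group(0), workspace_root), text)
```

Run ids and similar semantic identifiers pass through unchanged, which matches the "okay" examples above; anything path-like or token-like is reduced before it reaches a user-facing surface.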
### 9. Current-turn user instruction overrides stale route assumptions

The user accepts strong route guidance, but not stubbornness.

If the durable state says:

- the paper is ready
- the route is sufficient
- finalize is justified

but the current-turn user instruction clearly says:

- continue experiments
- keep exploring
- expand evidence
- write a fuller paper after more supplementary work

then the active contract has changed.
Do not keep arguing from the old contract.
Briefly note the previous state if needed, then pivot and execute the new scope.

### 10. Progress reporting must answer the real uncertainty

The user consistently values progress updates that make four things explicit:

- what is already done
- what is currently running
- what is blocked or still unknown
- what exact next checkpoint will change the decision

When the user asks about performance, execution, or delivery readiness, add a fifth requirement:

- the concrete acceptance metric or quantitative gate

Examples:

- actual batch size
- completed task count
- whether the run is `20/20`
- whether the paper is already `9.5` to `10` pages
- which supplementary experiments are already mapped into the outline or evidence ledger

Avoid vague health summaries like:

- generic "healthy" language
- generic "still progressing" language
- generic "not stalled" language

when the real user question is about:

- whether a bug exists
- whether batch size is correct
- whether an experiment really landed
- whether the paper actually includes the promised evidence

### 11. Do not confuse control-surface updates with substantive progress

Updating:

- `PLAN.md`
- `CHECKLIST.md`
- `status.md`
- `SUMMARY.md`
- review ledgers
- experiment matrices

is useful, but it is not the same as substantive research progress unless it accompanies at least one of:

- a new measured result
- a new validated comparison
- a new manuscript delta
- a real route change
- a newly durable contract correction

When only the control surface changed, say so plainly.

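The progress-reporting and control-surface rules together imply a fixed shape for a progress update. The sketch below is one hypothetical way to encode it; the `ProgressUpdate` class, its field names, and the `render` output format are assumptions made for illustration and do not come from the package.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProgressUpdate:
    """Hypothetical record: four required items plus an optional gate."""
    done: List[str]
    running: List[str]
    blocked_or_unknown: List[str]
    next_checkpoint: str
    acceptance_gate: Optional[str] = None  # e.g. "20/20 tasks completed"
    # New measured results, validated comparisons, manuscript deltas,
    # route changes, or durable contract corrections behind this update.
    substantive_changes: List[str] = field(default_factory=list)

    @property
    def control_surface_only(self) -> bool:
        # True when nothing substantive accompanies the update.
        return not self.substantive_changes

    def render(self) -> str:
        lines = [
            "Done: " + ("; ".join(self.done) or "nothing yet"),
            "Running: " + ("; ".join(self.running) or "nothing"),
            "Blocked/unknown: " + ("; ".join(self.blocked_or_unknown) or "nothing"),
            "Next checkpoint: " + self.next_checkpoint,
        ]
        if self.acceptance_gate:
            lines.append("Acceptance gate: " + self.acceptance_gate)
        if self.control_surface_only:
            lines.append("Note: only control files changed; no new result.")
        return "\n".join(lines)
```

Making `next_checkpoint` a required field and deriving the control-surface note from an empty `substantive_changes` list means an update cannot silently omit either item.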
## Research-system preferences

### 1. Shared protocol over stage-specific hacks

Review, rebuttal, supplementary experiments, connector delivery, and long-running monitoring should use shared protocols whenever possible.

The user dislikes:

- special rebuttal-only experiment systems
- separate terminal protocols when `bash_exec` could be extended
- new UI panes that duplicate an existing viewer with slightly different semantics

### 2. State machine clarity

The user prefers workflows that can answer:

- what stage are we in?
- what artifact made that stage durable?
- what event allows the next transition?
- what is blocked on user input vs autonomous continuation?

### 3. Human-visible continuity

The system should keep the user informed with meaningful progress, not hidden black-box execution.

Preferred pattern:

- short status while work is ongoing
- richer milestone when route or trust state changes
- explicit decision when continuation is non-trivial

Quest dialogue also shows a strong continuation preference:

- if the user clearly signals continuation, default to continuing the active task rather than re-arguing why the current route is reasonable
- explain only what is needed to keep the continuation trustworthy

### 4. Hard operational constraints are first-class contract items

When the user specifies:

- concurrency targets
- batch size
- throughput expectations
- page-count targets
- required appendix or supplementary evidence count
- endpoint usage rules

those are not decorative preferences.
Treat them as acceptance gates that must be checked explicitly.

## What good output looks like

Good work for this profile usually has these traits:

- conclusion-first
- specific file or contract references
- one main route, not five equal options
- clear tradeoffs
- root cause identified
- minimal but sufficient implementation plan
- verification strategy attached
- durable file and state updates synchronized to the same conclusion
- direct answer to the user's actual concern before broader background
- clear distinction between "control files updated" and "new result obtained"
- quantitative acceptance checks when the user asked for performance or completeness
- enough evidence to be auditable without exposing unnecessary private identifiers

## What to avoid

- broad abstract advice with no leverage point
- polishing symptoms while state models remain wrong
- adding more UI around weak backend contracts
- hand-wavy references to ids, paths, or "the latest object"
- saying "done" before the route is actually verified
- using durable state as a shield against the user's new request
- repeating the same monitoring narrative after the user has already identified a likely bug
- treating user-specified performance or completeness targets as soft suggestions
- overfitting the distilled mentor profile to concrete private quest details instead of reusable standards