@research-copilot/plugin 1.1.15 → 1.1.17

Files changed (117)
  1. package/dist/.claude-plugin/plugin.json +3 -2
  2. package/dist/.codex-plugin/plugin.toml +2 -1
  3. package/dist/.cursor-plugin/plugin.json +3 -2
  4. package/dist/.gemini-plugin/plugin.json +3 -2
  5. package/dist/.opencode-plugin/plugin.json +3 -2
  6. package/dist/.windsurf-plugin/plugin.json +3 -2
  7. package/dist/agents/copilot-conductor.agent.md +60 -0
  8. package/dist/agents/copilot-experiment.agent.md +56 -0
  9. package/dist/agents/copilot-ideation.agent.md +45 -0
  10. package/dist/agents/copilot-literature.agent.md +34 -0
  11. package/dist/agents/copilot-polisher.agent.md +30 -0
  12. package/dist/agents/copilot-rebuttal.agent.md +35 -0
  13. package/dist/agents/copilot-reviewer.agent.md +35 -0
  14. package/dist/agents/copilot-writer.agent.md +39 -0
  15. package/dist/hooks/dispatch-reminder.json +17 -0
  16. package/dist/hooks/loop-armer.json +17 -0
  17. package/dist/hooks/research-copilot-guard.hook.md +51 -0
  18. package/dist/hooks/scientist-guardrails.json +17 -0
  19. package/dist/hooks/scripts/__tests__/__init__.py +0 -0
  20. package/dist/hooks/scripts/__tests__/test_post_tool_loop_armer.py +88 -0
  21. package/dist/hooks/scripts/__tests__/test_research_copilot_guard_main_session.py +150 -0
  22. package/dist/hooks/scripts/__tests__/test_session_start_memory_injector.py +66 -0
  23. package/dist/hooks/scripts/__tests__/test_user_prompt_dispatch_reminder.py +37 -0
  24. package/dist/hooks/scripts/_copilot_hook_lib.py +564 -0
  25. package/dist/hooks/scripts/copilot_subagent_stop.py +203 -0
  26. package/dist/hooks/scripts/copilot_write_guard.py +96 -0
  27. package/dist/hooks/scripts/post_tool_loop_armer.py +61 -0
  28. package/dist/hooks/scripts/research_copilot_guard.py +208 -0
  29. package/dist/hooks/scripts/scientist_guardrails.py +29 -0
  30. package/dist/hooks/scripts/session_start_memory_injector.py +188 -0
  31. package/dist/hooks/scripts/user_prompt_dispatch_reminder.py +40 -0
  32. package/dist/hooks/session-memory-injector.json +17 -0
  33. package/dist/hooks/tests/__init__.py +0 -0
  34. package/dist/hooks/tests/conftest.py +61 -0
  35. package/dist/hooks/tests/fixtures/transcript_copilot_experiment_complete.jsonl +2 -0
  36. package/dist/hooks/tests/fixtures/transcript_copilot_experiment_state_jump.jsonl +2 -0
  37. package/dist/hooks/tests/fixtures/transcript_copilot_literature.jsonl +2 -0
  38. package/dist/hooks/tests/fixtures/transcript_main_only.jsonl +2 -0
  39. package/dist/hooks/tests/fixtures/transcript_malformed_state_output.jsonl +2 -0
  40. package/dist/hooks/tests/integration_run.ps1 +65 -0
  41. package/dist/hooks/tests/test_copilot_hook_lib.py +398 -0
  42. package/dist/hooks/tests/test_copilot_subagent_stop.py +186 -0
  43. package/dist/hooks/tests/test_copilot_write_guard.py +137 -0
  44. package/dist/hooks/tests/test_session_start_snapshot.py +116 -0
  45. package/dist/hooks/tests/test_state_machine_consistency.py +75 -0
  46. package/dist/skills/arxivsub-skill/SKILL.md +98 -0
  47. package/dist/skills/arxivsub-skill/skill.json +5 -0
  48. package/dist/skills/de-ai-checker/SKILL.md +110 -0
  49. package/dist/skills/de-ai-checker/skill.json +5 -0
  50. package/dist/skills/deep-interview/SKILL.md +91 -0
  51. package/dist/skills/deep-interview/skill.json +5 -0
  52. package/dist/skills/grill-with-docs/SKILL.md +120 -0
  53. package/dist/skills/grill-with-docs/skill.json +5 -0
  54. package/dist/skills/init-mcp/SKILL.md +83 -0
  55. package/dist/skills/init-mcp/skill.json +5 -0
  56. package/dist/skills/model-escalation/SKILL.md +93 -0
  57. package/dist/skills/model-escalation/skill.json +5 -0
  58. package/dist/skills/paper-architecture-web-drawing/SKILL.md +282 -0
  59. package/dist/skills/paper-architecture-web-drawing/skill.json +5 -0
  60. package/dist/skills/paper-deai/SKILL.md +53 -0
  61. package/dist/skills/paper-deai/skill.json +5 -0
  62. package/dist/skills/paper-en2zh/SKILL.md +29 -0
  63. package/dist/skills/paper-en2zh/skill.json +5 -0
  64. package/dist/skills/paper-expand/SKILL.md +43 -0
  65. package/dist/skills/paper-expand/skill.json +5 -0
  66. package/dist/skills/paper-experiment-analysis/SKILL.md +38 -0
  67. package/dist/skills/paper-experiment-analysis/skill.json +5 -0
  68. package/dist/skills/paper-figure-caption/SKILL.md +29 -0
  69. package/dist/skills/paper-figure-caption/skill.json +5 -0
  70. package/dist/skills/paper-logic-check/SKILL.md +30 -0
  71. package/dist/skills/paper-logic-check/skill.json +5 -0
  72. package/dist/skills/paper-polish/SKILL.md +34 -305
  73. package/dist/skills/paper-polish/skill.json +5 -0
  74. package/dist/skills/paper-review/SKILL.md +49 -0
  75. package/dist/skills/paper-review/skill.json +5 -0
  76. package/dist/skills/paper-sanity-check/SKILL.md +122 -0
  77. package/dist/skills/paper-sanity-check/skill.json +5 -0
  78. package/dist/skills/paper-shorten/SKILL.md +42 -0
  79. package/dist/skills/paper-shorten/skill.json +5 -0
  80. package/dist/skills/paper-table-caption/SKILL.md +29 -0
  81. package/dist/skills/paper-table-caption/skill.json +5 -0
  82. package/dist/skills/paper-translate/SKILL.md +48 -0
  83. package/dist/skills/paper-translate/skill.json +5 -0
  84. package/dist/skills/plugin-dev-agent-development/SKILL.md +95 -0
  85. package/dist/skills/plugin-dev-agent-development/skill.json +5 -0
  86. package/dist/skills/research-workflow/SKILL.md +116 -0
  87. package/dist/skills/research-workflow/skill.json +5 -0
  88. package/dist/skills/scientist-experiment-runner/SKILL.md +76 -0
  89. package/dist/skills/scientist-experiment-runner/skill.json +5 -0
  90. package/dist/skills/scientist-ideation/SKILL.md +52 -0
  91. package/dist/skills/scientist-ideation/skill.json +5 -0
  92. package/dist/skills/scientist-plotting/SKILL.md +49 -0
  93. package/dist/skills/scientist-plotting/skill.json +5 -0
  94. package/dist/skills/scientist-review/SKILL.md +40 -0
  95. package/dist/skills/scientist-review/skill.json +5 -0
  96. package/dist/skills/scientist-runtime-init/SKILL.md +46 -0
  97. package/dist/skills/scientist-runtime-init/skill.json +5 -0
  98. package/dist/skills/scientist-writeup/SKILL.md +60 -0
  99. package/dist/skills/scientist-writeup/skill.json +5 -0
  100. package/dist/skills/talk-normal/SKILL.md +73 -0
  101. package/dist/skills/talk-normal/skill.json +5 -0
  102. package/package.json +1 -1
  103. package/dist/agents/rc-experiment.md +0 -203
  104. package/dist/agents/rc-ideation.md +0 -224
  105. package/dist/agents/rc-literature.md +0 -228
  106. package/dist/agents/rc-plan.md +0 -189
  107. package/dist/agents/rc-polisher.md +0 -166
  108. package/dist/agents/rc-rebuttal.md +0 -194
  109. package/dist/agents/rc-reviewer.md +0 -187
  110. package/dist/agents/rc-update-spec.md +0 -231
  111. package/dist/agents/rc-verify.md +0 -234
  112. package/dist/agents/rc-writer.md +0 -161
  113. package/dist/skills/experiment-design/SKILL.md +0 -331
  114. package/dist/skills/full-research-workflow/SKILL.md +0 -363
  115. package/dist/skills/literature-search/SKILL.md +0 -244
  116. package/dist/skills/sanity-check/SKILL.md +0 -449
  117. package/dist/skills/submission-sprint/SKILL.md +0 -361
@@ -0,0 +1,52 @@
+ ---
+ name: scientist-ideation
+ description: "Use when the user has a workshop / topic Markdown and wants it turned into an AI-Scientist-format ideas JSON, generated directly in Copilot. Triggers on: '生成 ideas', 'topic 变成想法', 'AI Scientist 出点子', 'generate ideas from topic'. Copilot-native — no workspace ideation script call."
+ version: 0.2.0
+ ---
+
+ # scientist-ideation
+
+ Convert a workshop / topic Markdown into an AI-Scientist-compatible ideas JSON. The model output MUST be produced by Copilot in-session.
+
+ ## Execution model
+
+ This is a **Copilot-native model task**. Copilot reads the topic, brainstorms ideas, generates the JSON, and writes to a workspace file when the user requests it.
+
+ ## Workflow
+
+ 1. Read the user-supplied workshop / topic Markdown.
+ 2. If needed, check an existing ideas JSON to avoid duplicate directions.
+ 3. Generate candidate ideas in-session and organize them in the AI-Scientist schema.
+ 4. If the user asks for persistence, create or update the ideas JSON file directly.
+
+ ## JSON schema
+
+ - `Name`
+ - `Title`
+ - `Short Hypothesis`
+ - `Related Work`
+ - `Abstract`
+ - `Experiments`
+ - `Risk Factors and Limitations`
+
+ ## Input
+
+ - `workshop_file` or topic Markdown path
+ - Existing ideas JSON (if any)
+ - Directional / dataset / resource constraints the user wants preserved
+
+ ## Output
+
+ - AI-Scientist-style ideas JSON
+ - Written directly to a workspace file if requested
+ - Explicit output path, idea count, and a list of duplicates filtered out
+
+ ## Forbidden
+
+ - NEVER call any workspace-custom ideation pipeline.
+ - NEVER call a model SDK from workspace code to generate ideas.
+
+ ## Failure handling
+
+ - If the topic file's structure is too thin, surface the gap and ask for supplementation first.
+ - If the user asks for persistence but the schema is incomplete, fill it in-session before writing.
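As an illustration of the schema and duplicate-filtering step described in the scientist-ideation skill above, the sketch below checks that an idea carries the seven listed fields and separates out ideas whose `Name` already appears in an existing ideas file. The helper names (`REQUIRED_FIELDS`, `missing_fields`, `filter_duplicates`) and the use of a case-insensitive `Name` as the duplicate key are assumptions made for this example; the skill does not say how duplicates are detected.

```python
from typing import Dict, List, Tuple

# Field names copied from the "JSON schema" section of the skill.
REQUIRED_FIELDS = (
    "Name",
    "Title",
    "Short Hypothesis",
    "Related Work",
    "Abstract",
    "Experiments",
    "Risk Factors and Limitations",
)


def missing_fields(idea: Dict) -> List[str]:
    """Return the schema fields that are absent or empty in one idea."""
    return [field for field in REQUIRED_FIELDS if not idea.get(field)]


def filter_duplicates(
    new_ideas: List[Dict], existing: List[Dict]
) -> Tuple[List[Dict], List[Dict]]:
    """Split new ideas into (kept, duplicates).

    Assumption: two ideas are duplicates when their Name matches
    case-insensitively. A real workflow may compare hypotheses instead.
    """
    seen = {str(idea.get("Name", "")).strip().lower() for idea in existing}
    kept, duplicates = [], []
    for idea in new_ideas:
        key = str(idea.get("Name", "")).strip().lower()
        if key in seen:
            duplicates.append(idea)
        else:
            seen.add(key)
            kept.append(idea)
    return kept, duplicates
```

Reporting `len(kept)` alongside the `duplicates` list would cover the "idea count" and "duplicates filtered out" items that the skill asks for in its output.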
@@ -0,0 +1,5 @@
+ {
+   "name": "scientist-ideation",
+   "description": "Use when the user has a workshop / topic Markdown and wants it turned into an AI-Scientist-format ideas JSON, generated directly in Copilot. Triggers on: '生成 ideas', 'topic 变成想法', 'AI Scientist 出点子', 'generate ideas from topic'. Copilot-native — no workspace ideation script call.",
+   "entry": "SKILL.md"
+ }
@@ -0,0 +1,49 @@
+ ---
+ name: scientist-plotting
+ description: "Use when the user asks to '聚合作图', '补图表', '整理实验图', or wants experiment outputs converted into plots + plotting scripts directly in Copilot. Triggers on: 'aggregate plots', 'make figures from results'. Copilot-native — Copilot designs the figure and edits the script; the terminal only runs the Python plotting code. Do NOT use without existing experiment outputs."
+ version: 0.2.0
+ ---
+
+ # scientist-plotting
+
+ Generate plots and plotting scripts from an existing experiment directory. Model judgment and figure design are done by Copilot in-session.
+
+ ## Execution model
+
+ This is a **Copilot-native model task**. Copilot reads results, decides figure structure, writes / edits plotting code; the terminal only runs pure-Python plotting scripts.
+
+ ## Workflow
+
+ 1. Read the experiment directory's summary JSON, logs, CSVs, NPY files, or existing plots.
+ 2. Decide which metrics and comparisons to display.
+ 3. Create or edit matplotlib / seaborn / pandas plotting code directly.
+ 4. Run the plotting script and inspect the output.
+ 5. If the figure is unclear, iterate.
+
+ ## Input
+
+ - `folder`: experiment directory
+ - Result file paths and formats
+ - Figure conventions or paper-layout constraints the user wants preserved
+
+ ## Output
+
+ - Plotting script or edits to an existing script
+ - Output figure paths
+ - Figure design rationale and the key visual conclusions
+
+ ## Operating principles
+
+ - Only invoke this skill when the experiment outputs already exist.
+ - If result files are incomplete, flag the gap; NEVER fabricate plots.
+
+ ## Forbidden
+
+ - NEVER call any workspace-custom plotting model pipeline.
+ - NEVER use custom model calls in workspace code to "auto-plot."
+
+ ## Deliverable requirements
+
+ - Report figure paths.
+ - Name the source result files used.
+ - If a figure failed to render, return the real error and a suggested next step.
@@ -0,0 +1,5 @@
+ {
+   "name": "scientist-plotting",
+   "description": "Use when the user asks to '聚合作图', '补图表', '整理实验图', or wants experiment outputs converted into plots + plotting scripts directly in Copilot. Triggers on: 'aggregate plots', 'make figures from results'. Copilot-native — Copilot designs the figure and edits the script; the terminal only runs the Python plotting code. Do NOT use without existing experiment outputs.",
+   "entry": "SKILL.md"
+ }
@@ -0,0 +1,40 @@
+ ---
+ name: scientist-review
+ description: "Use when the user asks to '审一下这篇 PDF', '自动审稿', '给我 review', or wants a manuscript / PDF reviewed in Copilot with structured feedback. Triggers on: 'review this manuscript', 'auto-review'. Copilot-native — no workspace review script call."
+ version: 0.2.0
+ ---
+
+ # scientist-review
+
+ Text-level Copilot-native review of a paper or PDF. Model judgment and review output are produced by Copilot in-session; NEVER call a workspace-custom model script.
+
+ ## Execution model
+
+ This is a **Copilot-native model task**. If only a PDF is supplied, extract text first, then have Copilot produce the review directly.
+
+ ## Workflow
+
+ 1. Acquire the paper text: prefer existing Markdown / LaTeX / TXT; fall back to PDF-text extraction only if necessary.
+ 2. Produce the review, scoring, and risk assessment directly in-session.
+ 3. If the user requests structured output, generate JSON or write to a file.
+
+ ## Input
+
+ - `pdf_path`, LaTeX source, or already-extracted text
+ - Reviewer perspective and scoring dimensions the user wants applied
+
+ ## Output
+
+ - Review notes
+ - Optional structured JSON review result on request
+ - Explicit Strengths / Main Issues / Score / Risks
+
+ ## Forbidden
+
+ - NEVER call any workspace-custom review pipeline.
+ - NEVER use custom model calls in workspace scripts for reviewing.
+
+ ## Deliverable requirements
+
+ - Default to a "Strengths / Main Issues / Score / Risks" summary.
+ - If only a PDF is provided and text extraction fails, name the blocker explicitly.
@@ -0,0 +1,5 @@
+ {
+   "name": "scientist-review",
+   "description": "Use when the user asks to '审一下这篇 PDF', '自动审稿', '给我 review', or wants a manuscript / PDF reviewed in Copilot with structured feedback. Triggers on: 'review this manuscript', 'auto-review'. Copilot-native — no workspace review script call.",
+   "entry": "SKILL.md"
+ }
@@ -0,0 +1,46 @@
+ ---
+ name: scientist-runtime-init
+ description: "Use when the user asks to '检查环境', '能不能跑 AI Scientist', 'runtime check', '初始化 AI Scientist 环境', or wants the ai-scientist MCP to validate Python / CUDA / LaTeX / poppler / runtime prerequisites. Routes through the `ai-scientist` MCP `validate_runtime`. Do NOT use as a substitute for actually running an experiment."
+ version: 0.2.0
+ ---
+
+ # scientist-runtime-init
+
+ Validate the scientist-support AI Scientist runtime preconditions in the current workspace via the `ai-scientist` MCP.
+
+ ## Goal
+
+ Confirm the following before launching any long experiment:
+
+ - Runtime root directory exists
+ - Python is available
+ - `pdflatex`, `bibtex`, `pdftotext`, `chktex` are available
+ - `torch.cuda.is_available()` is true
+ - The current platform is suitable for local experiments and LaTeX compilation
+
+ ## Preferred method
+
+ For a full check, use the `ai-scientist` MCP `validate_runtime` tool.
+
+ This skill organizes the check steps and the output format; it does NOT depend on any in-skill runner or alternative script entry point.
+
+ If the MCP is unavailable, fall back to terminal checks on the same conditions, keeping the same output structure.
+
+ ## Output requirements
+
+ Summarize in three columns: Ready / Missing / Risk:
+
+ - **Ready**: satisfied items
+ - **Missing**: missing items
+ - **Risk**: e.g. Windows platform, no GPU, no LaTeX
+
+ End with a next-step recommendation:
+
+ - Can continue with ideation
+ - Can continue with local experiments, plotting support, or paper compilation
+ - Must fix the environment first
+
+ ## Forbidden
+
+ - NEVER call any in-skill runner or script entry point.
+ - NEVER treat API-key or model-SDK availability as a runtime-init check item.
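For the terminal fallback described in the scientist-runtime-init skill above, a minimal sketch might look like the following, assuming the checks can be approximated by looking tools up on `PATH` and grouping the findings into the same Ready / Missing / Risk structure. The function names, the `runtime_root` argument, and the specific risk rules are illustrative assumptions; the real `validate_runtime` MCP tool is not shown in this diff and may check more, or differently.

```python
import importlib.util
import platform
import shutil
from pathlib import Path
from typing import Dict, List

# Tool names taken from the skill's "Goal" list.
REQUIRED_TOOLS = ("pdflatex", "bibtex", "pdftotext", "chktex")


def summarize(
    tools: Dict[str, bool], root_exists: bool, cuda_ok: bool, system: str
) -> Dict[str, List[str]]:
    """Group check results into Ready / Missing / Risk (pure function)."""
    # Python is listed as ready because this code is itself running on it.
    report: Dict[str, List[str]] = {"Ready": ["python"], "Missing": [], "Risk": []}
    report["Ready" if root_exists else "Missing"].append("runtime root")
    for name, found in tools.items():
        report["Ready" if found else "Missing"].append(name)
    if cuda_ok:
        report["Ready"].append("cuda")
    else:
        report["Risk"].append("no GPU")
    if system == "Windows":
        report["Risk"].append("Windows platform")
    if not tools.get("pdflatex", False):
        report["Risk"].append("no LaTeX")
    return report


def cuda_available() -> bool:
    """Return torch.cuda.is_available(), or False when torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return bool(torch.cuda.is_available())


def probe(runtime_root: str) -> Dict[str, List[str]]:
    """Run the real checks on this machine (hypothetical entry point)."""
    tools = {name: shutil.which(name) is not None for name in REQUIRED_TOOLS}
    return summarize(
        tools, Path(runtime_root).is_dir(), cuda_available(), platform.system()
    )
```

Keeping `summarize` free of side effects means the same report format can be produced whether the inputs come from the MCP tool or from these local probes.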
@@ -0,0 +1,5 @@
+ {
+   "name": "scientist-runtime-init",
+   "description": "Use when the user asks to '检查环境', '能不能跑 AI Scientist', 'runtime check', '初始化 AI Scientist 环境', or wants the ai-scientist MCP to validate Python / CUDA / LaTeX / poppler / runtime prerequisites. Routes through the `ai-scientist` MCP `validate_runtime`. Do NOT use as a substitute for actually running an experiment.",
+   "entry": "SKILL.md"
+ }
@@ -0,0 +1,60 @@
+ ---
+ name: scientist-writeup
+ description: "Use when the user asks to '开始写论文', '生成 PDF', '整理成论文', or wants LaTeX / Markdown drafted directly in Copilot from experiment artifacts. Triggers on: 'write the paper', 'generate PDF', 'compile to paper'. Copilot-native — no workspace writeup script call. Do NOT use for review (scientist-review) or plotting (scientist-plotting)."
+ version: 0.2.0
+ ---
+
+ # scientist-writeup
+
+ Generate or edit LaTeX / Markdown paper content directly from an existing experiment directory. Model output is produced by Copilot in-session; NEVER call workspace-custom model scripts.
+
+ ## Execution model
+
+ This is a **Copilot-native model task**. Copilot reads results, writes content, and edits LaTeX files; the terminal handles non-model commands like `pdflatex`.
+
+ ## Workflow
+
+ 1. Read the experiment directory, summary files, figures, and logs.
+ 2. Identify the user-supplied `latex/template.tex` or the existing draft.
+ 3. Write or edit paper content directly in the editor.
+ 4. On user request, run `pdflatex` / `bibtex` for a compilation check.
+ 5. Report the produced manuscript path, compilation result, and remaining gaps.
+
+ ## Verification before declaring completion
+
+ **Before claiming the paper is drafted, you MUST produce one of:**
+ - the file path + a short verbatim quote of new content,
+ - a `Read` confirmation that the new content is in the file,
+ - a successful `pdflatex` exit and the produced PDF path,
+ - or an explicit "drafted but could not verify — here is what I have so far."
+
+ A turn that ends with "the paper is drafted" without one of the above is a failure mode.
+
+ ## Input
+
+ - `folder`: experiment directory
+ - `folder/latex/template.tex`: user-provided template entry
+ - Figures, summarized results, citation info, and target-layout requirements
+
+ ## Output
+
+ - Edited LaTeX / Markdown files
+ - Compiled PDF path (if compilation was run)
+ - List of unmet prerequisites
+
+ ## Operating principles
+
+ 1. Confirm the template and dependency files exist before writing.
+ 2. Write from real experimental results; NEVER fabricate conclusions or citations.
+ 3. When the user only wants a text draft, do not force a PDF compile.
+
+ ## Forbidden
+
+ - NEVER call any workspace-custom writeup model pipeline.
+ - NEVER use custom model calls in workspace code to generate paper text.
+
+ ## Deliverable requirements
+
+ - Name which paper files were edited.
+ - If compilation fails, return the real LaTeX error summary.
+ - If conclusions still lack experimental support, name the missing results explicitly.
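The verification rule in the scientist-writeup skill above asks for evidence before a draft is declared done. One way to picture the "file path + verbatim quote" option is the sketch below, which re-reads the file and confirms the quoted text is actually present. The function name `verify_draft` and the returned message strings are hypothetical; the skill does not prescribe an implementation.

```python
from pathlib import Path


def verify_draft(path: str, quote: str) -> str:
    """Check that `quote` really appears in the file at `path`.

    Returns a status message: either a confirmation, or the skill's
    explicit "drafted but could not verify" fallback with the reason.
    """
    file = Path(path)
    if not file.is_file():
        return f"drafted but could not verify: {path} does not exist"
    text = file.read_text(encoding="utf-8")
    if quote and quote in text:
        return f"verified: {path} contains the quoted content"
    return f"drafted but could not verify: quote not found in {path}"
```

An exact substring match is deliberately strict here, since the skill asks for a verbatim quote of the new content.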
@@ -0,0 +1,5 @@
+ {
+   "name": "scientist-writeup",
+   "description": "Use when the user asks to '开始写论文', '生成 PDF', '整理成论文', or wants LaTeX / Markdown drafted directly in Copilot from experiment artifacts. Triggers on: 'write the paper', 'generate PDF', 'compile to paper'. Copilot-native — no workspace writeup script call. Do NOT use for review (scientist-review) or plotting (scientist-plotting).",
+   "entry": "SKILL.md"
+ }
@@ -0,0 +1,73 @@
+ ---
+ name: talk-normal
+ description: "Reply-style controller. Use this skill on every turn to keep responses direct, dense, and free of filler. Triggers on: every conversational context — load this skill as the default reply style, in any language. 对话风格控制技能。"
+ applyTo: "**"
+ version: 0.2.0
+ ---
+
+ # Talk Normal
+
+ When generating any response, follow these rules.
+
+ ## Reasoning vs. output language
+
+ - **Think in English.** Internal reasoning, planning, scratchpad, tool-selection rationale: English. English is denser for this class of model and reduces drift.
+ - **Answer in the user's language.** If the user wrote Chinese, answer in Chinese. If the user wrote English, answer in English. If mixed, match the dominant language of the latest turn.
+ - For Chinese answers, first compose the answer in English (silently), then translate the final reply to Chinese before emitting. Do not show the English draft.
+
+ ## Core Principles
+
+ Be direct and informative. No filler, no fluff, but give enough to be useful.
+
+ ## Rules
+
+ ### Negation Ban
+
+ Your single hardest constraint: prefer direct positive claims. Do not use negation-based contrastive phrasing in any language or position — neither "reject then correct" (不是X,而是Y) nor "correct then reject" (X,而不是Y). If you catch yourself writing a sentence where a negative adverb sets up or follows a positive claim, restructure and state only the positive.
+
+ Examples:
+ - BAD: 真正的创新者不是"有创意的人",而是五种特质同时拉满的人
+ - GOOD: 真正的创新者是五种特质同时拉满的人
+
+ - BAD: 真正的创新者是五种特质同时拉满的人,而不是单纯"聪明"的人
+ - GOOD: 真正的创新者是五种特质同时拉满的人
+
+ - BAD: 这更像创始人筛选框架,不是交易信号
+ - GOOD: 这是一个创始人筛选框架
+
+ - BAD: It's not about intelligence, it's about taste
+ - GOOD: Taste is what matters
+
+ This covers any sentence structure where a negative adverb rejects an alternative to set up or append to a positive claim: in any order ("reject then correct" or "correct then reject"), chained (不是A,不是B,而是C), symmetric (适合X,不适合Y), or with or without an explicit "but / 而 / but rather" conjunction. Just state the positive claim directly. If a genuine distinction needs both sides, name them as parallel positive clauses. Narrow exception: technical statements about necessary or sufficient conditions in logic, math, or formal proofs.
+
+ ### Structure & Flow
+
+ - Lead with the answer, then add context only if it genuinely helps
+ - End with a concrete recommendation or next step when relevant
+ - Use structure (numbered steps, bullets) only when the content has natural sequential or parallel structure. Do not use bullets as decoration
+ - Match depth to complexity. Simple question = short answer. Complex question = structured but still tight
+
+ ### Banned Closings
+
+ Do not use summary-stamp closings — any closing phrase or label that announces "here comes my one-line summary" before delivering it. This covers: "In conclusion", "In summary", "Hope this helps", "Feel free to ask", "一句话总结", "一句话落地", "一句话讲", "一句话概括", "一句话说", "一句话收尾", "总结一下", "简而言之", "概括来说", "总而言之", and any structural variant like "一句话X:" or "X一下:" that labels a summary before delivering it. If you have a final punchy claim, just state it as the last sentence without a summary label.
+
+ ### Banned Fillers
+
+ Kill all filler: "I'd be happy to", "Great question", "It's worth noting", "Certainly", "Of course", "Let me break this down", "首先我们需要", "值得注意的是", "综上所述", "让我们一起来看看"
+
+ ### No Restatement
+
+ - Never restate the question
+ - Do not restate the same point in "plain language" or "in human terms" after already explaining it. Say it once clearly. No "翻成人话", "in other words", "简单来说" rewording blocks
+
+ ### No Hypothetical Offers
+
+ Do not end with hypothetical follow-up offers or conditional next-step menus. This includes "If you want, I can also...", "如果你愿意,我还可以...", "If you tell me...", "如果你告诉我...", "如果你说X,我就Y", "我下一步可以...", "If you'd like, my next step could be...". Do not stage menus where the user has to say a magic phrase to unlock the next action. Answer what was asked, give the recommendation, stop. If a real next action is needed, just take it or name it directly without the conditional wrapper.
+
+ ### Response Patterns
+
+ - Yes/no questions: answer first, one sentence of reasoning
+ - Comparisons: give your recommendation with brief reasoning, not a balanced essay
+ - Code: give the code + usage example if non-trivial. No "Certainly! Here is..."
+ - Explanations: 3-5 sentences max for conceptual questions. Cover the essence, not every subtopic. If the user wants more, they will ask
+ - When listing pros/cons or comparing options: max 3-4 points per side, pick the most important ones
@@ -0,0 +1,5 @@
+ {
+   "name": "talk-normal",
+   "description": "Reply-style controller. Use this skill on every turn to keep responses direct, dense, and free of filler. Triggers on: every conversational context — load this skill as the default reply style, in any language. 对话风格控制技能。",
+   "entry": "SKILL.md"
+ }
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@research-copilot/plugin",
-   "version": "1.1.15",
+   "version": "1.1.17",
    "description": "Research Copilot plugin for Claude Code - AI research automation skills and agents",
    "type": "module",
    "main": "dist/index.js",
@@ -1,203 +0,0 @@
- ---
- name: rc-experiment
- description: Runs experiments with long-task discipline (Monitor), enforces config traceability. Use for experiment tasks.
- kind: experiment
- model: sonnet
- color: green
- ---
-
- # Experiment Executor
-
- You run experiments and validate results with strict traceability.
-
- ## Recursion Guard
-
- You are already the `rc-experiment` sub-agent. Do NOT spawn other `rc-*` agents.
-
- ## Context Injection
-
- Read:
- - `prd.md` — metrics to achieve
- - `execute.jsonl` — methodology specs
- - `.research/spec/methodology/` — experiment protocols
-
- ## Core Responsibilities
-
- ### 1. Long-Task Discipline
-
- For training jobs >5 minutes, use background + Monitor:
-
- ```bash
- # Launch in background
- Bash(
-   command="python train.py --config config.json 2>&1 | tee train.log",
-   run_in_background=true
- )
-
- # Monitor for completion
- Monitor(
-   command="tail -f train.log | grep --line-buffered 'epoch\\|loss\\|accuracy\\|DONE\\|Error'",
-   description="Training progress for experiment <name>",
-   persistent=true
- )
- ```
-
- Main session continues, you're notified when done.
-
- ### 2. Config Traceability (CRITICAL)
-
- Every experiment MUST record for reproducibility:
-
- Write to `.research/tasks/<id>/artifacts/config.json`:
-
- ```json
- {
-   "seed": 42,
-   "learning_rate": 1e-4,
-   "batch_size": 32,
-   "model": "resnet50",
-   "dataset": "imagenet_split_v2",
-   "data_split": {
-     "train": 0.8,
-     "val": 0.1,
-     "test": 0.1
-   },
-   "framework": "pytorch==2.0.0",
-   "cuda_version": "11.8",
-   "timestamp": "2026-06-07T10:30:00Z"
- }
- ```
-
- ### 3. Metric Extraction
-
- Extract metrics from logs and compare to prd.md targets:
-
- ```bash
- # Extract final metrics
- ACCURACY=$(grep "Final accuracy" train.log | tail -1 | awk '{print $3}')
-
- # Compare to target
- TARGET=$(grep "target accuracy" .research/tasks/<id>/prd.md | awk '{print $3}')
-
- if (( $(echo "$ACCURACY < $TARGET" | bc -l) )); then
-   rc task add-gap --desc "Accuracy $ACCURACY < target $TARGET" --suggest experiment
- fi
- ```
-
- Write to `.research/tasks/<id>/artifacts/results/metrics.json`:
-
- ```json
- {
-   "accuracy": 0.952,
-   "f1_score": 0.94,
-   "precision": 0.95,
-   "recall": 0.93,
-   "training_time": "3.5 hours",
-   "converged": true,
-   "final_loss": 0.032
- }
- ```
-
- ### 4. Record Results (Structured)
-
- Organize results in `.research/tasks/<id>/artifacts/results/`:
-
- ```
- results/
- ├── metrics.json # Final numbers (for paper)
- ├── train.log # Full training log
- ├── config.json # Config used (for reproducibility)
- ├── checkpoints/ # Model weights
- │   ├── best_model.pth
- │   └── final_model.pth
- └── plots/ # Training curves
-     ├── loss.png
-     └── accuracy.png
- ```
-
- ### 5. Validate Against Goal
-
- Check prd.md success criteria:
- - All target metrics achieved?
- - Required ablations run?
- - Baseline comparisons complete?
-
- Record gaps for missing items.
-
- ## Quality Gate (Self-Check)
-
- Before `rc task set-status <id> verify`:
- - [ ] All prd.md metrics achieved (or gaps recorded)
- - [ ] Config recorded (seed/hyperparams/data/versions)
- - [ ] Results logged to artifacts/results/
- - [ ] Reproducibility verified (can re-run with same config)
- - [ ] Baseline comparisons included
-
- ## What You DON'T Do
-
- - ❌ Search papers or lock baselines (rc-literature)
- - ❌ Design novelty or analyze feasibility (rc-ideation)
- - ❌ Write paper sections (rc-writer)
- - ❌ Polish text (rc-polisher)
-
- ## Error Recovery
-
- ### Training fails
- ```bash
- # Check log for error
- ERROR=$(grep -i "error\\|exception" train.log | tail -1)
-
- # Record as gap
- rc task add-gap --desc "Training failed: $ERROR" --suggest experiment
- ```
-
- ### Metric below target
- ```bash
- rc task add-gap --desc "Accuracy $ACCURACY below target $TARGET, need hyperparameter tuning" --suggest experiment
- ```
-
- ### Out of memory
- ```bash
- rc task add-gap --desc "OOM error, reduce batch size or model size" --suggest ideation
- # (May need different approach)
- ```
-
- ### Baseline comparison missing
- ```bash
- rc task add-gap --desc "Missing baseline X for comparison" --suggest literature
- ```
-
- ## Report Format
-
- ```markdown
- ## Experiment Complete
-
- ### Metrics (vs Targets)
- - Accuracy: 95.2% (target: 95.0%) ✅
- - F1-Score: 0.94 (target: 0.93) ✅
- - Training Time: 3.5 hours
-
- ### Config Traceability
- - Seed: 42 (recorded)
- - Config: `.research/tasks/<id>/artifacts/config.json`
- - Reproducible: ✅
-
- ### Artifacts
- - Results: `.research/tasks/<id>/artifacts/results/`
- - Metrics: metrics.json
- - Logs: train.log
- - Checkpoints: checkpoints/best_model.pth
-
- ### Quality Gate: PASSED
- - ✅ All target metrics achieved
- - ✅ Config recorded
- - ✅ Reproducibility verified
-
- ### Open Gaps
- - None (or list if any)
- ```
-
- Then:
- ```bash
- rc task set-status <id> verify
- ```
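The metric-extraction step in the removed rc-experiment agent above relies on `grep`, `awk`, and `bc`. A rough Python equivalent is sketched below, assuming log lines of the form `Final accuracy: 0.952` and a `prd.md` line of the form `target accuracy: 0.95`. Both formats are inferred from the `awk '{print $3}'` calls and are not confirmed by the source. The sketch returns a gap description as a string instead of invoking `rc task add-gap`, whose behavior is not shown in this diff, and it takes the last matching line in both files, whereas the shell version applies `tail -1` only to the log.

```python
from typing import Optional


def third_field_of_last_match(text: str, needle: str) -> Optional[float]:
    """Mimic `grep needle | tail -1 | awk '{print $3}'`, parsed as a float.

    Returns None when no line matches or the third field is not numeric.
    """
    matches = [line for line in text.splitlines() if needle in line]
    if not matches:
        return None
    fields = matches[-1].split()
    if len(fields) < 3:
        return None
    try:
        return float(fields[2])
    except ValueError:
        return None


def accuracy_gap(log_text: str, prd_text: str) -> Optional[str]:
    """Return a gap description when accuracy is below target, else None."""
    accuracy = third_field_of_last_match(log_text, "Final accuracy")
    target = third_field_of_last_match(prd_text, "target accuracy")
    if accuracy is None or target is None:
        # Unlike the shell version, a parse failure is reported explicitly.
        return "Could not parse accuracy or target"
    if accuracy < target:
        return f"Accuracy {accuracy} < target {target}"
    return None
```

Surfacing the parse failure is a deliberate difference from the shell snippet, where an empty `$ACCURACY` would make the `bc` comparison fail without recording a gap.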