@research-copilot/plugin 1.1.15 → 1.1.17

Files changed (117)
  1. package/dist/.claude-plugin/plugin.json +3 -2
  2. package/dist/.codex-plugin/plugin.toml +2 -1
  3. package/dist/.cursor-plugin/plugin.json +3 -2
  4. package/dist/.gemini-plugin/plugin.json +3 -2
  5. package/dist/.opencode-plugin/plugin.json +3 -2
  6. package/dist/.windsurf-plugin/plugin.json +3 -2
  7. package/dist/agents/copilot-conductor.agent.md +60 -0
  8. package/dist/agents/copilot-experiment.agent.md +56 -0
  9. package/dist/agents/copilot-ideation.agent.md +45 -0
  10. package/dist/agents/copilot-literature.agent.md +34 -0
  11. package/dist/agents/copilot-polisher.agent.md +30 -0
  12. package/dist/agents/copilot-rebuttal.agent.md +35 -0
  13. package/dist/agents/copilot-reviewer.agent.md +35 -0
  14. package/dist/agents/copilot-writer.agent.md +39 -0
  15. package/dist/hooks/dispatch-reminder.json +17 -0
  16. package/dist/hooks/loop-armer.json +17 -0
  17. package/dist/hooks/research-copilot-guard.hook.md +51 -0
  18. package/dist/hooks/scientist-guardrails.json +17 -0
  19. package/dist/hooks/scripts/__tests__/__init__.py +0 -0
  20. package/dist/hooks/scripts/__tests__/test_post_tool_loop_armer.py +88 -0
  21. package/dist/hooks/scripts/__tests__/test_research_copilot_guard_main_session.py +150 -0
  22. package/dist/hooks/scripts/__tests__/test_session_start_memory_injector.py +66 -0
  23. package/dist/hooks/scripts/__tests__/test_user_prompt_dispatch_reminder.py +37 -0
  24. package/dist/hooks/scripts/_copilot_hook_lib.py +564 -0
  25. package/dist/hooks/scripts/copilot_subagent_stop.py +203 -0
  26. package/dist/hooks/scripts/copilot_write_guard.py +96 -0
  27. package/dist/hooks/scripts/post_tool_loop_armer.py +61 -0
  28. package/dist/hooks/scripts/research_copilot_guard.py +208 -0
  29. package/dist/hooks/scripts/scientist_guardrails.py +29 -0
  30. package/dist/hooks/scripts/session_start_memory_injector.py +188 -0
  31. package/dist/hooks/scripts/user_prompt_dispatch_reminder.py +40 -0
  32. package/dist/hooks/session-memory-injector.json +17 -0
  33. package/dist/hooks/tests/__init__.py +0 -0
  34. package/dist/hooks/tests/conftest.py +61 -0
  35. package/dist/hooks/tests/fixtures/transcript_copilot_experiment_complete.jsonl +2 -0
  36. package/dist/hooks/tests/fixtures/transcript_copilot_experiment_state_jump.jsonl +2 -0
  37. package/dist/hooks/tests/fixtures/transcript_copilot_literature.jsonl +2 -0
  38. package/dist/hooks/tests/fixtures/transcript_main_only.jsonl +2 -0
  39. package/dist/hooks/tests/fixtures/transcript_malformed_state_output.jsonl +2 -0
  40. package/dist/hooks/tests/integration_run.ps1 +65 -0
  41. package/dist/hooks/tests/test_copilot_hook_lib.py +398 -0
  42. package/dist/hooks/tests/test_copilot_subagent_stop.py +186 -0
  43. package/dist/hooks/tests/test_copilot_write_guard.py +137 -0
  44. package/dist/hooks/tests/test_session_start_snapshot.py +116 -0
  45. package/dist/hooks/tests/test_state_machine_consistency.py +75 -0
  46. package/dist/skills/arxivsub-skill/SKILL.md +98 -0
  47. package/dist/skills/arxivsub-skill/skill.json +5 -0
  48. package/dist/skills/de-ai-checker/SKILL.md +110 -0
  49. package/dist/skills/de-ai-checker/skill.json +5 -0
  50. package/dist/skills/deep-interview/SKILL.md +91 -0
  51. package/dist/skills/deep-interview/skill.json +5 -0
  52. package/dist/skills/grill-with-docs/SKILL.md +120 -0
  53. package/dist/skills/grill-with-docs/skill.json +5 -0
  54. package/dist/skills/init-mcp/SKILL.md +83 -0
  55. package/dist/skills/init-mcp/skill.json +5 -0
  56. package/dist/skills/model-escalation/SKILL.md +93 -0
  57. package/dist/skills/model-escalation/skill.json +5 -0
  58. package/dist/skills/paper-architecture-web-drawing/SKILL.md +282 -0
  59. package/dist/skills/paper-architecture-web-drawing/skill.json +5 -0
  60. package/dist/skills/paper-deai/SKILL.md +53 -0
  61. package/dist/skills/paper-deai/skill.json +5 -0
  62. package/dist/skills/paper-en2zh/SKILL.md +29 -0
  63. package/dist/skills/paper-en2zh/skill.json +5 -0
  64. package/dist/skills/paper-expand/SKILL.md +43 -0
  65. package/dist/skills/paper-expand/skill.json +5 -0
  66. package/dist/skills/paper-experiment-analysis/SKILL.md +38 -0
  67. package/dist/skills/paper-experiment-analysis/skill.json +5 -0
  68. package/dist/skills/paper-figure-caption/SKILL.md +29 -0
  69. package/dist/skills/paper-figure-caption/skill.json +5 -0
  70. package/dist/skills/paper-logic-check/SKILL.md +30 -0
  71. package/dist/skills/paper-logic-check/skill.json +5 -0
  72. package/dist/skills/paper-polish/SKILL.md +34 -305
  73. package/dist/skills/paper-polish/skill.json +5 -0
  74. package/dist/skills/paper-review/SKILL.md +49 -0
  75. package/dist/skills/paper-review/skill.json +5 -0
  76. package/dist/skills/paper-sanity-check/SKILL.md +122 -0
  77. package/dist/skills/paper-sanity-check/skill.json +5 -0
  78. package/dist/skills/paper-shorten/SKILL.md +42 -0
  79. package/dist/skills/paper-shorten/skill.json +5 -0
  80. package/dist/skills/paper-table-caption/SKILL.md +29 -0
  81. package/dist/skills/paper-table-caption/skill.json +5 -0
  82. package/dist/skills/paper-translate/SKILL.md +48 -0
  83. package/dist/skills/paper-translate/skill.json +5 -0
  84. package/dist/skills/plugin-dev-agent-development/SKILL.md +95 -0
  85. package/dist/skills/plugin-dev-agent-development/skill.json +5 -0
  86. package/dist/skills/research-workflow/SKILL.md +116 -0
  87. package/dist/skills/research-workflow/skill.json +5 -0
  88. package/dist/skills/scientist-experiment-runner/SKILL.md +76 -0
  89. package/dist/skills/scientist-experiment-runner/skill.json +5 -0
  90. package/dist/skills/scientist-ideation/SKILL.md +52 -0
  91. package/dist/skills/scientist-ideation/skill.json +5 -0
  92. package/dist/skills/scientist-plotting/SKILL.md +49 -0
  93. package/dist/skills/scientist-plotting/skill.json +5 -0
  94. package/dist/skills/scientist-review/SKILL.md +40 -0
  95. package/dist/skills/scientist-review/skill.json +5 -0
  96. package/dist/skills/scientist-runtime-init/SKILL.md +46 -0
  97. package/dist/skills/scientist-runtime-init/skill.json +5 -0
  98. package/dist/skills/scientist-writeup/SKILL.md +60 -0
  99. package/dist/skills/scientist-writeup/skill.json +5 -0
  100. package/dist/skills/talk-normal/SKILL.md +73 -0
  101. package/dist/skills/talk-normal/skill.json +5 -0
  102. package/package.json +1 -1
  103. package/dist/agents/rc-experiment.md +0 -203
  104. package/dist/agents/rc-ideation.md +0 -224
  105. package/dist/agents/rc-literature.md +0 -228
  106. package/dist/agents/rc-plan.md +0 -189
  107. package/dist/agents/rc-polisher.md +0 -166
  108. package/dist/agents/rc-rebuttal.md +0 -194
  109. package/dist/agents/rc-reviewer.md +0 -187
  110. package/dist/agents/rc-update-spec.md +0 -231
  111. package/dist/agents/rc-verify.md +0 -234
  112. package/dist/agents/rc-writer.md +0 -161
  113. package/dist/skills/experiment-design/SKILL.md +0 -331
  114. package/dist/skills/full-research-workflow/SKILL.md +0 -363
  115. package/dist/skills/literature-search/SKILL.md +0 -244
  116. package/dist/skills/sanity-check/SKILL.md +0 -449
  117. package/dist/skills/submission-sprint/SKILL.md +0 -361
@@ -0,0 +1,137 @@
+ """Tests for copilot_write_guard.py (PreToolUse, Rule 2)."""
+ from __future__ import annotations
+
+ import json
+ from io import StringIO
+ from pathlib import Path
+
+ import pytest
+
+ import copilot_write_guard as guard
+ import _copilot_hook_lib as lib
+
+
+ def _run(monkeypatch, payload: dict, workspace: Path) -> dict:
+     monkeypatch.setattr("sys.stdin", StringIO(json.dumps(payload)))
+     monkeypatch.chdir(workspace)
+     out = StringIO()
+     monkeypatch.setattr("sys.stdout", out)
+     guard.real_main()
+     return json.loads(out.getvalue().strip().splitlines()[-1])
+
+
+ class TestScope:
+     def test_main_agent_allowed(self, monkeypatch, workspace, fixtures_dir, payload_builder):
+         p = payload_builder("Write",
+                             {"file_path": str(workspace / ".copilot" / "state.md")},
+                             str(fixtures_dir / "transcript_main_only.jsonl"))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "allow"
+
+     def test_unknown_agent_allowed(self, monkeypatch, workspace, tmp_path, payload_builder):
+         t = tmp_path / "trans.jsonl"
+         t.write_text('{"role":"assistant","metadata":{"subagent_type":"general-purpose"}}\n')
+         p = payload_builder("Write",
+                             {"file_path": str(workspace / ".copilot" / "state.md")},
+                             str(t))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "allow"
+
+
+ class TestOwnedAllow:
+     def test_literature_writes_own(self, monkeypatch, workspace, fixtures_dir, payload_builder):
+         p = payload_builder("Write",
+                             {"file_path": str(workspace / ".copilot" / "literature.md")},
+                             str(fixtures_dir / "transcript_copilot_literature.jsonl"))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "allow"
+
+     def test_unrelated_scratch_allowed(self, monkeypatch, workspace, fixtures_dir, payload_builder):
+         p = payload_builder("Write",
+                             {"file_path": str(workspace / "scratch" / "note.txt")},
+                             str(fixtures_dir / "transcript_copilot_literature.jsonl"))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "allow"
+
+
+ class TestForbiddenDeny:
+     def test_literature_writing_ideas_denied(self, monkeypatch, workspace, fixtures_dir, payload_builder):
+         p = payload_builder("Write",
+                             {"file_path": str(workspace / ".copilot" / "ideas.md")},
+                             str(fixtures_dir / "transcript_copilot_literature.jsonl"))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "deny"
+         log = (workspace / ".copilot" / "__violations.log").read_text(encoding="utf-8")
+         assert "HARD" in log and "DENY" in log
+
+     def test_literature_editing_state_denied(self, monkeypatch, workspace, fixtures_dir, payload_builder):
+         p = payload_builder("Edit",
+                             {"file_path": str(workspace / ".copilot" / "state.md")},
+                             str(fixtures_dir / "transcript_copilot_literature.jsonl"))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "deny"
+
+
+ class TestOverride:
+     def test_skip_owned_check_allows(self, monkeypatch, workspace, fixtures_dir, payload_builder):
+         (workspace / ".copilot" / ".guard_override").write_text(
+             "copilot-literature: skip-owned-check until 2099-01-01T00:00:00Z\n",
+             encoding="utf-8")
+         p = payload_builder("Write",
+                             {"file_path": str(workspace / ".copilot" / "ideas.md")},
+                             str(fixtures_dir / "transcript_copilot_literature.jsonl"))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "allow"
+
+     def test_env_var_off_allows(self, monkeypatch, workspace, fixtures_dir, payload_builder):
+         monkeypatch.setenv("COPILOT_HOOK_GUARD", "off")
+         p = payload_builder("Write",
+                             {"file_path": str(workspace / ".copilot" / "ideas.md")},
+                             str(fixtures_dir / "transcript_copilot_literature.jsonl"))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "allow"
+
+
+ class TestSafeMain:
+     def test_empty_stdin_falls_open(self, monkeypatch, workspace):
+         monkeypatch.setattr("sys.stdin", StringIO(""))
+         monkeypatch.chdir(workspace)
+         out = StringIO()
+         monkeypatch.setattr("sys.stdout", out)
+         guard.real_main()
+         d = json.loads(out.getvalue().strip().splitlines()[-1])
+         assert d["hookSpecificOutput"]["permissionDecision"] == "allow"
+
+
+ class TestHandoffSpecial:
+     def _writer_transcript(self, tmp_path: Path) -> str:
+         t = tmp_path / "writer.jsonl"
+         t.write_text('{"role":"assistant","metadata":{"subagent_type":"copilot-writer"}}\n')
+         return str(t)
+
+     def _ideation_transcript(self, tmp_path: Path) -> str:
+         t = tmp_path / "ideation.jsonl"
+         t.write_text('{"role":"assistant","metadata":{"subagent_type":"copilot-ideation"}}\n')
+         return str(t)
+
+     def test_writer_edit_handoff_allowed(self, monkeypatch, workspace, tmp_path, payload_builder):
+         p = payload_builder("Edit",
+                             {"file_path": str(workspace / ".copilot" / "handoff.md")},
+                             self._writer_transcript(tmp_path))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "allow"
+
+     def test_writer_write_handoff_denied(self, monkeypatch, workspace, tmp_path, payload_builder):
+         p = payload_builder("Write",
+                             {"file_path": str(workspace / ".copilot" / "handoff.md")},
+                             self._writer_transcript(tmp_path))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "deny"
+         assert "append" in d["hookSpecificOutput"]["permissionDecisionReason"].lower()
+
+     def test_ideation_writing_handoff_denied(self, monkeypatch, workspace, tmp_path, payload_builder):
+         p = payload_builder("Edit",
+                             {"file_path": str(workspace / ".copilot" / "handoff.md")},
+                             self._ideation_transcript(tmp_path))
+         d = _run(monkeypatch, p, workspace)
+         assert d["hookSpecificOutput"]["permissionDecision"] == "deny"
+ """Tests for snapshot-writing side-effect of session_start_memory_injector."""
+ from __future__ import annotations
+
+ import json
+ import subprocess
+ import sys
+ from pathlib import Path
+
+ import pytest
+
+ import _copilot_hook_lib as lib
+
+
+ def _run_injector(workspace: Path) -> tuple[str, int]:
+     script = Path(__file__).resolve().parent.parent / "scripts" / "session_start_memory_injector.py"
+     proc = subprocess.run(
+         [sys.executable, str(script)],
+         cwd=str(workspace),
+         capture_output=True,
+         text=True,
+         timeout=10,
+     )
+     return proc.stdout, proc.returncode
+
+
+ class TestSnapshotWriting:
+     def test_writes_for_existing_handoff(self, workspace, handoff_writer):
+         f = workspace / ".copilot" / "literature.md"
+         handoff_writer(f, last_updated="2026-05-24T10:00:00Z",
+                        written_by="copilot-literature")
+         stdout, rc = _run_injector(workspace)
+         assert rc == 0
+         snap = lib.read_snapshot(workspace)
+         assert snap.get("literature.md") == "2026-05-24T10:00:00Z"
+
+     def test_overwrites_each_run(self, workspace, handoff_writer):
+         f = workspace / ".copilot" / "literature.md"
+         handoff_writer(f, last_updated="2026-05-24T08:00:00Z")
+         _run_injector(workspace)
+         f.write_text("\n## __HANDOFF__\n- last_updated: 2026-05-24T12:00:00Z\n",
+                      encoding="utf-8")
+         _run_injector(workspace)
+         snap = lib.read_snapshot(workspace)
+         assert snap.get("literature.md") == "2026-05-24T12:00:00Z"
+
+     def test_snapshot_file_created(self, workspace, handoff_writer):
+         f = workspace / ".copilot" / "literature.md"
+         handoff_writer(f, last_updated="2026-05-24T10:00:00Z")
+         _run_injector(workspace)
+         assert (workspace / ".copilot" / ".session_snapshot.json").is_file()
+
+
+ import datetime
+
+
+ class TestViolationsSummary:
+     def _write_log(self, workspace, records):
+         log = workspace / ".copilot" / "__violations.log"
+         log.write_text("\n".join(json.dumps(r) for r in records) + "\n",
+                        encoding="utf-8")
+
+     def test_empty_log_no_summary(self, workspace, handoff_writer):
+         handoff_writer(workspace / ".copilot" / "literature.md",
+                        last_updated="2026-05-24T10:00:00Z")
+         stdout, rc = _run_injector(workspace)
+         assert "Last 24h" not in stdout
+
+     def test_recent_blocks_summarized(self, workspace, handoff_writer):
+         handoff_writer(workspace / ".copilot" / "literature.md",
+                        last_updated="2026-05-24T10:00:00Z")
+         now = datetime.datetime.now(datetime.timezone.utc)
+         recent = (now - datetime.timedelta(hours=1)).isoformat().replace("+00:00", "Z")
+         self._write_log(workspace, [
+             {"ts": recent, "sev": "HARD", "kind": "BLOCK", "agent": "copilot-literature", "detail": "x"},
+             {"ts": recent, "sev": "HARD", "kind": "BLOCK", "agent": "copilot-literature", "detail": "y"},
+             {"ts": recent, "sev": "HARD", "kind": "RELEASE", "agent": "copilot-literature", "detail": "z"},
+             {"ts": recent, "sev": "SOFT", "kind": "WARN", "agent": "copilot-experiment", "detail": "w"},
+         ])
+         stdout, rc = _run_injector(workspace)
+         assert "Last 24h" in stdout
+         assert "2 HARD" in stdout
+         assert "1 SOFT" in stdout
+
+     def test_old_entries_ignored(self, workspace, handoff_writer):
+         handoff_writer(workspace / ".copilot" / "literature.md",
+                        last_updated="2026-05-24T10:00:00Z")
+         now = datetime.datetime.now(datetime.timezone.utc)
+         old = (now - datetime.timedelta(hours=48)).isoformat().replace("+00:00", "Z")
+         self._write_log(workspace, [
+             {"ts": old, "sev": "HARD", "kind": "BLOCK", "agent": "x", "detail": "old"},
+         ])
+         stdout, rc = _run_injector(workspace)
+         assert "1 HARD" not in stdout
+
+
+ def test_injects_conductor_protocol(tmp_path, monkeypatch, capsys):
+     import session_start_memory_injector as inj
+     (tmp_path / ".copilot").mkdir()
+     # Seed a real __HANDOFF__ block so `blocks` is non-empty and main() reaches
+     # the injection point (it does NOT inject on the early "no blocks" return).
+     (tmp_path / ".copilot" / "state.md").write_text(
+         "## __HANDOFF__\n- last_updated: 2026-05-24T10:00:00Z\n"
+         "- written_by: conductor\n- key_facts:\n - S1 in progress\n"
+         "- next_owner: (none)\n",
+         encoding="utf-8",
+     )
+     monkeypatch.chdir(tmp_path)
+     # Point the protocol lookup at a temp file. NOTE: this test runs the injector
+     # IN-PROCESS (not via subprocess like the other tests in this file) precisely
+     # so conductor_protocol_path can be monkeypatched — do not "fix" it to subprocess.
+     proto = tmp_path / "CONDUCTOR-PROTOCOL.md"
+     proto.write_text("# Conductor Protocol\nYou are the conductor.\n", encoding="utf-8")
+     monkeypatch.setattr(inj, "conductor_protocol_path", lambda: proto)
+     inj.main()
+     out = capsys.readouterr().out
+     assert "You are the conductor" in out
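Note: the `TestViolationsSummary` cases in this hunk imply a 24-hour summary over a JSON Lines log, where three recent HARD records (two `BLOCK`, one `RELEASE`) are reported as "2 HARD". A minimal sketch of a summariser consistent with that follows. The function name `summarize_violations`, the output wording beyond the asserted substrings, and the choice to skip `RELEASE` records are assumptions, not the injector's actual code.

```python
from __future__ import annotations

import datetime
import json


def summarize_violations(log_text: str, now: datetime.datetime,
                         window_hours: int = 24) -> str:
    """Summarise recent JSONL violation records; return '' when none are recent."""
    cutoff = now - datetime.timedelta(hours=window_hours)
    counts = {"HARD": 0, "SOFT": 0}
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
            # fromisoformat() needs an explicit offset instead of a trailing "Z".
            ts = datetime.datetime.fromisoformat(rec["ts"].replace("Z", "+00:00"))
            is_recent = ts >= cutoff
        except (ValueError, KeyError, TypeError, AttributeError):
            continue  # skip malformed records rather than fail the hook
        # Assumption: RELEASE records mark a lifted block and are not counted.
        if not is_recent or rec.get("kind") == "RELEASE":
            continue
        sev = rec.get("sev")
        if sev in counts:
            counts[sev] += 1
    if not any(counts.values()):
        return ""
    return f"Last {window_hours}h: {counts['HARD']} HARD, {counts['SOFT']} SOFT"
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the window logic testable without the real-time timestamps the subprocess tests have to generate.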
@@ -0,0 +1,75 @@
+ """Meta-test: STATE_MACHINE dict must match each agent.md state table."""
+ from __future__ import annotations
+
+ import re
+ from pathlib import Path
+
+ import pytest
+
+ import _copilot_hook_lib as lib
+
+ AGENTS_DIR = Path(__file__).resolve().parent.parent.parent / "agents"
+
+ AGENT_FILES = {
+     "conductor": AGENTS_DIR.parent / "CONDUCTOR-PROTOCOL.md",
+     "copilot-literature": AGENTS_DIR / "copilot-literature.agent.md",
+     "copilot-ideation": AGENTS_DIR / "copilot-ideation.agent.md",
+     "copilot-experiment": AGENTS_DIR / "copilot-experiment.agent.md",
+     "copilot-writer": AGENTS_DIR / "copilot-writer.agent.md",
+     "copilot-polisher": AGENTS_DIR / "copilot-polisher.agent.md",
+     "copilot-reviewer": AGENTS_DIR / "copilot-reviewer.agent.md",
+     "copilot-rebuttal": AGENTS_DIR / "copilot-rebuttal.agent.md",
+ }
+
+
+ def _parse_state_table(text: str) -> dict[str, list[str]]:
+     """Best-effort: find a markdown table whose header contains 状态 and
+     可能的下一状态, parse first-column (state) and last-column (next states).
+
+     Last column format: `[A, B, C]` or `[A]` or `[]`. Whitespace tolerated.
+     """
+     lines = text.splitlines()
+     result: dict[str, list[str]] = {}
+     in_table = False
+     for line in lines:
+         s = line.strip()
+         if s.startswith("|") and "状态" in s and "可能的下一状态" in s:
+             in_table = True
+             continue
+         if not in_table:
+             continue
+         if not s.startswith("|"):
+             if result:
+                 break  # table ended
+             continue
+         if set(s.replace("|", "").strip()) <= set("-: "):
+             continue  # separator row like |---|---|
+         cols = [c.strip() for c in s.strip("|").split("|")]
+         if len(cols) < 5:
+             continue
+         state = cols[0]
+         m = re.search(r"\[([^\]]*)\]", cols[-1])
+         if not m:
+             continue
+         inner = m.group(1).strip()
+         next_states = [x.strip() for x in inner.split(",") if x.strip()] if inner else []
+         result[state] = next_states
+     return result
+
+
+ @pytest.mark.parametrize("agent,path", list(AGENT_FILES.items()))
+ def test_state_machine_matches_agent_md(agent: str, path: Path):
+     assert path.is_file(), f"agent.md missing: {path}"
+     parsed = _parse_state_table(path.read_text(encoding="utf-8"))
+     coded = lib.STATE_MACHINE.get(agent, {})
+     if not parsed:
+         pytest.xfail(f"could not parse state table in {path.name}")
+     for state, next_states in parsed.items():
+         assert state in coded, (
+             f"{agent}: state '{state}' in agent.md but missing from STATE_MACHINE")
+         assert set(next_states) == set(coded[state]), (
+             f"{agent}: state '{state}' drift — agent.md={next_states} vs "
+             f"coded={coded[state]}")
+     for state in coded.keys():
+         assert state in parsed, (
+             f"{agent}: state '{state}' in STATE_MACHINE but missing from agent.md")
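Note: the meta-test in this hunk relies on a specific table shape: a header containing 状态 ("state") and 可能的下一状态 ("possible next states"), at least five columns, and a bracketed list in the last column. The sketch below shows a table of that shape and a compact drift check. The sample states, the middle column names `a`/`b`/`c`, and the helper names `parse_rows` and `drift` are hypothetical; the real agent files and `lib.STATE_MACHINE` are not shown in this diff excerpt.

```python
from __future__ import annotations

import re

# Hypothetical state table in the shape the meta-test expects.
SAMPLE = """\
| 状态 | a | b | c | 可能的下一状态 |
|---|---|---|---|---|
| IDLE | - | - | - | [SEARCHING] |
| SEARCHING | - | - | - | [DONE, IDLE] |
| DONE | - | - | - | [] |
"""


def parse_rows(text: str) -> dict[str, list[str]]:
    """Map each state (first column) to its next states (last column)."""
    table: dict[str, list[str]] = {}
    for raw in text.splitlines():
        cols = [c.strip() for c in raw.strip().strip("|").split("|")]
        m = re.search(r"\[([^\]]*)\]", cols[-1])
        if len(cols) < 5 or not m:
            continue  # header, separator, or malformed row
        table[cols[0]] = [x.strip() for x in m.group(1).split(",") if x.strip()]
    return table


def drift(parsed: dict[str, list[str]], coded: dict[str, list[str]]) -> list[str]:
    """Return human-readable differences between documented and coded tables."""
    problems = []
    for state in sorted(set(parsed) | set(coded)):
        if state not in coded:
            problems.append(f"{state}: missing from code")
        elif state not in parsed:
            problems.append(f"{state}: missing from docs")
        elif set(parsed[state]) != set(coded[state]):
            problems.append(f"{state}: next states differ")
    return problems
```

Comparing next states as sets, as the test itself does, means the order inside the brackets does not count as drift.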
@@ -0,0 +1,98 @@
+ ---
+ name: arxivsub-skill
+ description: "Use whenever the user wants to search for academic papers, find recent research, look up conference publications, or explore literature on any AI / ML / CV topic. Routes through the `arxivsub-search` MCP server (arXiv + CVPR / ICCV / ICLR / ICML / NeurIPS / AAAI / MICCAI). Triggers on: \"find papers on X\", \"what are the latest papers about Y\", \"search arXiv for Z\", \"any recent work on W\", '论文检索', '最新研究'."
+ version: 0.2.0
+ ---
+
+ # arxivsub-skill
+
+ Search academic papers via the arXivSub API through MCP.
+
+ ## Language Rule
+
+ **Always respond in the same language the user is using.**
+
+ ## Step 0: Authentication
+
+ The `arxivsub-search` MCP server reads the API key automatically from the environment or a `.env` file at the workspace root. Never ask the user for it unless the MCP call fails with a missing-key or auth error, and never pass it as a chat parameter.
+
+ If the MCP tool reports `missing_api_key` or an auth failure involving `ARXIVSUB_SKILL_KEY`, tell the user (in their language) to set it up via **one** of:
+
+ 1. Export as a shell environment variable (add to `~/.zshrc` or `~/.bashrc` for persistence):
+    ```
+    export ARXIVSUB_SKILL_KEY=your_key_here
+    ```
+ 2. Add to a `.env` file in the working directory:
+    ```
+    ARXIVSUB_SKILL_KEY=your_key_here
+    ```
+
+ The user's API key is found on the Skills page of the arXivSub website.
+
+ ## Step 1: Show search parameters and execute
+
+ Before calling the API, briefly show the interpreted parameters in one line (in the user's language), then proceed without waiting:
+
+ > Searching: query=`"..."`, locations=`[...]`, time=`...`, limit=`...`
+
+ Pause only if the search intent is genuinely ambiguous (e.g. a term that could mean multiple very different topics).
+
+ ## Step 2: Call MCP search
+
+ Call `arxivsub-search.search_papers` with the interpreted parameters. Omit `arxiv_days` / `conference_years` if not applicable.
+
+ Expected MCP arguments:
+
+ ```json
+ {
+   "query": "<search terms>",
+   "locations": ["arxiv", "CVPR", "NeurIPS"],
+   "limit": 10,
+   "arxiv_days": 7,
+   "conference_years": [2024, 2025],
+   "language": "en"
+ }
+ ```
+
+ The MCP tool returns full paper details and `quota_remaining` in structured content. Do not create temp JSON files.
+
+ ## Step 3: Filter and rank
+
+ From the returned list, select the **top 5–10** using:
+ 1. **Relevance first** — how directly the paper addresses the query
+ 2. **Recency as tiebreaker** — among equally relevant papers, prefer the most recent
+
+ ## Step 4: Fetch full details and respond
+
+ Compose the response from the returned details. **Never mention files, scripts, temp files, or internal mechanics.**
+
+ ### Output structure (translate headers to the user's language)
+
+ **[Research Findings]** — Synthesize insights. Answer the user's question directly.
+
+ **[Recommended Papers]** — For each paper, write a substantive description (not just `what_about`). Cover: what problem it solves, the key method or contribution, and notable results or significance. Typically 3–5 sentences.
+
+ ```
+ **[Title]**
+ 📍 [conference / arXiv] · [year if available]
+ 👥 [first_author] ([first_aff]) · [last_author] ([last_aff])
+ 📄 [synthesized description based on full paper details]
+ 🔗 [pdf_url]
+ ```
+
+ At the bottom, show quota in the user's language as a footnote:
+ English: `Daily quota remaining: N searches` / Chinese: `当日剩余搜索额度:N 次`
+
+ ## Key rules
+
+ - `locations` is **case-sensitive**: `arxiv`, `CVPR`, `ICCV`, `ICLR`, `ICML`, `NeurIPS`, `AAAI`, `MICCAI`
+ - Show parameters as a one-liner before calling the MCP; only ask for confirmation if intent is ambiguous
+ - If the MCP returns an error, classify and handle:
+   - **Retryable** (network timeout, transient server error): inform the user; offer to retry
+   - **Needs user intervention**:
+     - Quota exhausted → tell the user the daily quota is used up; do not retry
+     - Auth failure / missing key → go to Step 0
+     - Empty results → suggest broadening the query (fewer locations, wider date range, looser terms)
+     - JSON parse failure / malformed response → report as an unexpected error; ask the user to try again later
+ - NEVER output raw JSON or expose internal mechanics
+ - NEVER call local skill scripts; all network access MUST go through MCP
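Note: the error-handling rule in the skill's "Key rules" list amounts to a lookup from error kind to a single action. A minimal sketch of that mapping follows. Only `missing_api_key` appears in the skill text; every other error code, the `Action` enum, and the `classify` helper are hypothetical names chosen for illustration, and the real MCP server may report errors differently.

```python
from enum import Enum


class Action(Enum):
    OFFER_RETRY = "inform the user and offer to retry"
    STOP_QUOTA = "report that the daily quota is used up; do not retry"
    SETUP_KEY = "go to Step 0 (key setup)"
    BROADEN_QUERY = "suggest broadening the query"
    REPORT_UNEXPECTED = "report an unexpected error; ask to try again later"


# Hypothetical error codes; only `missing_api_key` is named in the skill text.
_RULES = {
    "timeout": Action.OFFER_RETRY,
    "server_error": Action.OFFER_RETRY,
    "quota_exhausted": Action.STOP_QUOTA,
    "missing_api_key": Action.SETUP_KEY,
    "auth_failed": Action.SETUP_KEY,
    "empty_results": Action.BROADEN_QUERY,
    "malformed_response": Action.REPORT_UNEXPECTED,
}


def classify(error_code: str) -> Action:
    """Map an error code to a handling action; unknown codes count as unexpected."""
    return _RULES.get(error_code, Action.REPORT_UNEXPECTED)
```

Defaulting unknown codes to the "unexpected error" action keeps the skill from retrying blindly on failures it cannot interpret.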
@@ -0,0 +1,5 @@
+ {
+   "name": "arxivsub-skill",
+   "description": "Use whenever the user wants to search for academic papers, find recent research, look up conference publications, or explore literature on any AI / ML / CV topic. Routes through the `arxivsub-search` MCP server (arXiv + CVPR / ICCV / ICLR / ICML / NeurIPS / AAAI / MICCAI). Triggers on: \"find papers on X\", \"what are the latest papers about Y\", \"search arXiv for Z\", \"any recent work on W\", '论文检索', '最新研究'.",
+   "entry": "SKILL.md"
+ }
@@ -0,0 +1,110 @@
+ ---
+ name: de-ai-checker
+ description: "Validation skill for de-AI quality checking. Use AFTER polishing to verify that AI patterns have been removed and text reads naturally. Triggers on: 'check de-AI quality', 'verify humanization', 'validate polish', 'de-AI checker', '检查去AI效果'. This is a validation-gate compatible wrapper around paper-deai."
+ version: 1.0.0
+ ---
+
+ # De-AI Checker — Post-polish validation
+
+ Run **after** polishing is complete to verify that AI patterns have been successfully removed and the text reads naturally.
+
+ ## When this skill fires
+
+ Fire automatically when:
+ - The polisher agent reaches the VERIFYING state and needs validation-gate clearance
+ - User explicitly requests de-AI quality verification
+ - After any `paper-deai` rewriting pass to confirm quality
+
+ Also fires on user request: "检查去AI效果" / "check de-AI quality" / "verify the polish worked"
+
+ ## Purpose
+
+ This skill serves as a **validation-gate compatible checker** that:
+ 1. Verifies AI patterns have been removed
+ 2. Confirms text naturalness and academic register
+ 3. Provides pass/fail validation for capability gates
+ 4. Delegates actual rewriting to `paper-deai` if issues are found
+
+ ## Procedure
+
+ ### Step 1 — Read the polished text
+
+ Load the text that was just polished. This should be LaTeX content from the target sections.
+
+ ### Step 2 — Run AI pattern detection
+
+ Check for common AI fingerprints:
+
+ | Pattern Category | What to check |
+ |---|---|
+ | **Over-used vocabulary** | Scan for the AI-over-used word list: leverage, delve, tapestry, endeavor, underscore, etc. (full list in `paper-deai` skill) |
+ | **Mechanical connectives** | Look for: "First and foremost", "It is worth noting that", "Moreover", "Furthermore" at sentence starts |
+ | **List format** | Check for `\item` usage where prose would be more natural |
+ | **Excessive dashes** | Count em-dashes (`---` and `—`) — more than 1 per paragraph is suspicious |
+ | **Tense errors** | Verify background/prior-art uses present perfect ("have achieved") not simple present |
+ | **Emphasis abuse** | Check for bold/italics used for emphasis rather than structure |
+
+ ### Step 3 — Score and decide
+
+ Calculate a naturalness score based on:
+ - AI vocabulary count (each instance: -10 points)
+ - Mechanical connectives (each instance: -5 points)
+ - List format in prose sections (each `\item`: -8 points)
+ - Excessive dashes (each beyond 1/paragraph: -3 points)
+ - Tense errors (each instance: -7 points)
+
+ **Pass threshold**: Score ≥ 85/100
+
+ ### Step 4 — Output validation report
+
+ ```markdown
+ ## De-AI validation report — <section>
+ - Date: <YYYY-MM-DD>
+ - Text reviewed: <file:line range>
+ - Naturalness score: <score>/100
+ - Status: [PASS] or [FAIL]
+ - Issues found:
+   - AI vocabulary: <count> instances → <list with line numbers>
+   - Mechanical connectives: <count> instances → <list>
+   - List format: <count> items → <list>
+   - Excessive dashes: <count> → <list>
+   - Tense errors: <count> → <list>
+ - Recommendation: <"Validation passed" or "Re-run paper-deai on sections: <list>">
+ ```
+
+ ### Step 5 — Auto-fix if requested
+
+ If the validation fails and the user/agent requests auto-fix:
+ 1. Call the `paper-deai` skill with the problematic sections
+ 2. Re-run this validation checker
+ 3. Report final status
+
+ ## Output format
+
+ - **Part 1 [Validation Report]**: The structured report above
+ - **Part 2 [Gate Status]**: One of:
+   - `[VALIDATION_GATE_PASS]` — text is natural, no AI patterns detected
+   - `[VALIDATION_GATE_FAIL]` — AI patterns remain, re-polish required
+
+ ## Integration with capability gates
+
+ This skill satisfies the `validation-gate` requirement because:
+ - Name matches `*-checker` pattern
+ - Provides explicit pass/fail status
+ - Can be called via `Skill(skill='de-ai-checker')`
+
+ ## Hard constraints
+
+ - **Post-polish only** — if no polished text exists, exit and recommend running `paper-deai` first
+ - **Read before checking** — every issue must cite file:line, not "in general"
+ - **Honest scoring** — do not inflate scores to declare victory
+ - **Delegate rewriting** — this skill validates; `paper-deai` rewrites
+ - **One pass per invocation** — do not loop; if re-check is needed, user/agent re-invokes
+ - **Explicit gate status** — always output `[VALIDATION_GATE_PASS]` or `[VALIDATION_GATE_FAIL]`
+
+ ## Relationship to paper-deai
+
+ - `paper-deai`: **Executor** — rewrites text to remove AI patterns
+ - `de-ai-checker`: **Validator** — checks quality and provides gate clearance
+
+ Use `paper-deai` for rewriting, use `de-ai-checker` for validation.
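Note: Step 3 of the de-AI checker lists per-instance penalties and a pass threshold but not the arithmetic. A minimal sketch follows, assuming the score starts at 100, penalties are summed, and the result is clamped at 0; the skill text does not state the starting value or the clamp, and the names `PENALTIES`, `naturalness_score`, and `gate_status` are illustrative.

```python
from __future__ import annotations

# Per-instance penalties taken from Step 3 of the skill.
PENALTIES = {
    "ai_vocabulary": 10,
    "mechanical_connectives": 5,
    "list_items": 8,
    "excess_dashes": 3,
    "tense_errors": 7,
}
PASS_THRESHOLD = 85


def naturalness_score(counts: dict[str, int]) -> int:
    """Deduct penalties from 100, clamped at 0 (start value and clamp are assumed)."""
    deduction = sum(PENALTIES[k] * counts.get(k, 0) for k in PENALTIES)
    return max(0, 100 - deduction)


def gate_status(score: int) -> str:
    """Translate a score into the gate marker the skill must always emit."""
    if score >= PASS_THRESHOLD:
        return "[VALIDATION_GATE_PASS]"
    return "[VALIDATION_GATE_FAIL]"
```

Under these assumptions, one flagged AI word plus one mechanical connective gives 100 - 10 - 5 = 85, which still passes, while two flagged AI words give 80 and fail.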
@@ -0,0 +1,5 @@
+ {
+   "name": "de-ai-checker",
+   "description": "Validation skill for de-AI quality checking. Use AFTER polishing to verify that AI patterns have been removed and text reads naturally. Triggers on: 'check de-AI quality', 'verify humanization', 'validate polish', 'de-AI checker', '检查去AI效果'. This is a validation-gate compatible wrapper around paper-deai.",
+   "entry": "SKILL.md"
+ }
@@ -0,0 +1,91 @@
1
+ ---
2
+ name: deep-interview
3
+ description: "Pre-plan Socratic clarification. Use BEFORE drafting any plan (ideation direction, experiment Goal anchor, rebuttal strategy, pipeline routing) to expose hidden assumptions, lock topology, and converge on scope. Walks the decision tree one branch at a time with a recommended answer per question and an explicit ambiguity score, stopping only when the score crosses the threshold. Triggers on: '帮我厘清需求', '我有个想法不太明确', '在写计划前先聊清楚', '深度访谈', 'clarify before planning', 'deep interview', 'requirements interview', 'scope before plan'."
4
+ version: 0.2.0
5
+ ---
6
+
7
+ # Deep Interview — Pre-plan clarification gate
8
+
9
+ Run **before** any plan is drafted. Purpose: turn a vague user ask into a crystallised spec the downstream planner can execute literally. This skill does NOT write the plan itself — when the ambiguity score crosses the threshold, hand off to the appropriate agent / planning skill.
10
+
11
+ ## When this skill fires
12
+
13
+ Fire automatically when the next action is one of:
14
+
15
+ - Drafting the **Goal anchor** in `.copilot/experiments.md` (first experiment dispatch)
16
+ - Picking a **research direction** in `.copilot/ideas.md` (ideation Step A)
17
+ - Deciding the **rebuttal response-strategy** per reviewer comment
18
+ - Choosing the **routing template** for the conductor (full pipeline / submission sprint / custom sequence)
19
+ - Any other moment where a sub-agent is about to commit a plan to disk
20
+
21
+ Skip if a `## Goal anchor` (or equivalent immutable plan block) already exists — do not re-interview about a settled commitment.
22
+
23
+ ## Operating principles
24
+
25
+ 1. **One question per round.** Each round ships exactly one question + your recommended answer + a one-sentence reason. Never dump 3–5 parallel questions.
26
+ 2. **Read before you ask.** If the answer is in `.copilot/{state, literature, ideas, experiments, handoff, decisions}.md`, workspace tex/code/logs, or recent reviewer rounds — read first, then ask only what remains.
27
+ 3. **Topology first.** Round 0 enumerates 1–6 independent components of the ask (e.g. for "submission sprint": writer pass / polish / review / rebuttal-prep — pick which are in scope). Lock the component list before drilling into any single one.
28
+ 4. **Rotate across components.** When more than one component is active, rotate questions across them so that no single component reaches clarity while its siblings stay vague.
29
+ 5. **Score every round.** After each answer, score four dimensions on 0–1:
30
+ - **goal** — does the user know the *what*?
31
+ - **constraints** — compute / data / deadline / venue / word-limit bounds named?
32
+ - **criteria** — falsification + success criteria explicit?
33
+ - **context** — is the existing `.copilot/` / codebase state cross-referenced? (brownfield only)
34
+ - Ambiguity = `1 − (goal·0.35 + constraints·0.25 + criteria·0.25 + context·0.15)` for brownfield; for greenfield, drop the context term and renormalise the remaining weights.
35
+ 6. **Stop condition.** Exit when ambiguity ≤ **0.2** (default; the conductor may override) **or** after 20 rounds (hard cap — force exit and flag the residual gap).
36
+ 7. **Challenge injections.** At rounds 4 / 6 / 8 (once each), inject one of:
37
+ - **contrarian** — "what would a top-venue reviewer call your weakest assumption here?"
38
+ - **simplifier** — "which of these branches can we drop without breaking the core claim?"
39
+ - **ontologist** — "what is the single noun phrase this contribution renames or invents? where else does that noun already mean something different?"
40
+
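The scoring rule and stop condition in principles 5 and 6 can be sketched as below. This is an illustration under stated assumptions, not the skill's implementation: the function names are hypothetical, "renormalise" is read as dividing by the sum of the remaining weights (0.85), and "after 20 rounds" is read as `round_number >= 20`.

```python
from typing import Optional

# Weights quoted from principle 5; they sum to 1.0 for the brownfield case.
WEIGHTS = {"goal": 0.35, "constraints": 0.25, "criteria": 0.25, "context": 0.15}


def ambiguity(goal: float, constraints: float, criteria: float,
              context: Optional[float] = None) -> float:
    """Ambiguity in [0, 1]; pass context=None for a greenfield ask.

    Assumption: greenfield renormalisation divides by the sum of the
    remaining weights, so a perfect score still yields ambiguity 0.
    """
    scores = {"goal": goal, "constraints": constraints, "criteria": criteria}
    if context is not None:
        scores["context"] = context
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    total_weight = sum(WEIGHTS[name] for name in scores)
    clarity = sum(WEIGHTS[name] * value for name, value in scores.items())
    return 1.0 - clarity / total_weight


def should_stop(score: float, round_number: int,
                threshold: float = 0.2, max_rounds: int = 20) -> bool:
    """Stop at or below the threshold, or once the hard cap is reached."""
    return score <= threshold or round_number >= max_rounds
```

For example, a greenfield ask scored 0.9 / 0.8 / 0.7 on goal, constraints and criteria has clarity (0.315 + 0.2 + 0.175) / 0.85, so its ambiguity is a little under 0.19 and the interview would exit at the default threshold.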
41
+ ## Round template
42
+
43
+ ```
44
+ Round <N> (ambiguity: <prev_score> → after this round: <new_score>)
45
+
46
+ Question: <one question>
47
+ My recommendation: <answer + one-sentence reason; cite file path / line / `.copilot/...` if the basis is in the repo>
48
+ What I will do once you answer: <next round's branch, or "exit and hand off to <agent>">
49
+ ```
50
+
51
+ When invoked from an agent, use `AskUserQuestion` to render the question — never freeform-ask in main response text.
52
+
53
+ ## Output: the crystallised spec
54
+
55
+ When the stop condition fires, emit:
56
+
57
+ ```markdown
58
+ ## Deep-interview spec — <slug>
59
+ - Date: <YYYY-MM-DD>
60
+ - Topology (locked at Round 0): <component list>
61
+ - Goal: <one sentence>
62
+ - Constraints: <compute / data / deadline / venue / word-limit, each with a number>
63
+ - Success criterion: <how we know it worked>
64
+ - Falsification criterion: <how we know it failed>
65
+ - Residual ambiguity: <final score> on dimensions <list any below 0.6>
66
+ - Hand off to: <agent or skill that will draft the plan>
67
+ ```
68
+
69
+ Write this block to the file the downstream planner will read:
70
+
71
+ | Downstream planner | Spec lands in |
72
+ |---|---|
73
+ | `@copilot-experiment` Step 1 (Goal anchor) | `.copilot/experiments.md` (above the Goal anchor) |
74
+ | `@copilot-ideation` Step B | `.copilot/ideas.md` `## User preferences` block |
75
+ | `@copilot-rebuttal` Step 2 (per-comment strategy) | top of `rebuttal/round-N.md` |
76
+ | conductor pipeline routing | `.copilot/decisions.md` |
77
+
78
+ Do not write the plan itself in this step — only the spec.
79
+
80
+ ## Hand-off
81
+
82
+ End with one line stating which agent / skill picks up next, e.g. "→ hand off to `@copilot-experiment` to draft the Goal anchor from this spec." After hand-off, the next agent **must** run `grill-with-docs` once its plan is drafted, to gap-check the plan against `.copilot/` documentation and glossary.
83
+
84
+ ## Hard constraints
85
+
86
+ - **One question per round** — no parallel question dumps
87
+ - **Read first, ask second** — never ask about facts already in `.copilot/` or the workspace
88
+ - **No plan writing** — only the spec block; the downstream agent writes the plan
89
+ - **Do not skip Round 0 topology** — single-component asks still benefit from explicit enumeration
90
+ - **Honest ambiguity scores** — neither inflate to declare victory nor deflate to keep drilling
91
+ - **Exit at hard cap** — at round 20, exit and flag the residual gap; do not loop forever
@@ -0,0 +1,5 @@
1
+ {
2
+ "name": "deep-interview",
3
+ "description": "Pre-plan Socratic clarification. Use BEFORE drafting any plan (ideation direction, experiment Goal anchor, rebuttal strategy, pipeline routing) to expose hidden assumptions, lock topology, and converge on scope. Walks the decision tree one branch at a time with a recommended answer per question and an explicit ambiguity score, stopping only when the score crosses the threshold. Triggers on: '帮我厘清需求', '我有个想法不太明确', '在写计划前先聊清楚', '深度访谈', 'clarify before planning', 'deep interview', 'requirements interview', 'scope before plan'.",
4
+ "entry": "SKILL.md"
5
+ }