@research-copilot/plugin 1.1.15 → 1.1.17
- package/dist/.claude-plugin/plugin.json +3 -2
- package/dist/.codex-plugin/plugin.toml +2 -1
- package/dist/.cursor-plugin/plugin.json +3 -2
- package/dist/.gemini-plugin/plugin.json +3 -2
- package/dist/.opencode-plugin/plugin.json +3 -2
- package/dist/.windsurf-plugin/plugin.json +3 -2
- package/dist/agents/copilot-conductor.agent.md +60 -0
- package/dist/agents/copilot-experiment.agent.md +56 -0
- package/dist/agents/copilot-ideation.agent.md +45 -0
- package/dist/agents/copilot-literature.agent.md +34 -0
- package/dist/agents/copilot-polisher.agent.md +30 -0
- package/dist/agents/copilot-rebuttal.agent.md +35 -0
- package/dist/agents/copilot-reviewer.agent.md +35 -0
- package/dist/agents/copilot-writer.agent.md +39 -0
- package/dist/hooks/dispatch-reminder.json +17 -0
- package/dist/hooks/loop-armer.json +17 -0
- package/dist/hooks/research-copilot-guard.hook.md +51 -0
- package/dist/hooks/scientist-guardrails.json +17 -0
- package/dist/hooks/scripts/__tests__/__init__.py +0 -0
- package/dist/hooks/scripts/__tests__/test_post_tool_loop_armer.py +88 -0
- package/dist/hooks/scripts/__tests__/test_research_copilot_guard_main_session.py +150 -0
- package/dist/hooks/scripts/__tests__/test_session_start_memory_injector.py +66 -0
- package/dist/hooks/scripts/__tests__/test_user_prompt_dispatch_reminder.py +37 -0
- package/dist/hooks/scripts/_copilot_hook_lib.py +564 -0
- package/dist/hooks/scripts/copilot_subagent_stop.py +203 -0
- package/dist/hooks/scripts/copilot_write_guard.py +96 -0
- package/dist/hooks/scripts/post_tool_loop_armer.py +61 -0
- package/dist/hooks/scripts/research_copilot_guard.py +208 -0
- package/dist/hooks/scripts/scientist_guardrails.py +29 -0
- package/dist/hooks/scripts/session_start_memory_injector.py +188 -0
- package/dist/hooks/scripts/user_prompt_dispatch_reminder.py +40 -0
- package/dist/hooks/session-memory-injector.json +17 -0
- package/dist/hooks/tests/__init__.py +0 -0
- package/dist/hooks/tests/conftest.py +61 -0
- package/dist/hooks/tests/fixtures/transcript_copilot_experiment_complete.jsonl +2 -0
- package/dist/hooks/tests/fixtures/transcript_copilot_experiment_state_jump.jsonl +2 -0
- package/dist/hooks/tests/fixtures/transcript_copilot_literature.jsonl +2 -0
- package/dist/hooks/tests/fixtures/transcript_main_only.jsonl +2 -0
- package/dist/hooks/tests/fixtures/transcript_malformed_state_output.jsonl +2 -0
- package/dist/hooks/tests/integration_run.ps1 +65 -0
- package/dist/hooks/tests/test_copilot_hook_lib.py +398 -0
- package/dist/hooks/tests/test_copilot_subagent_stop.py +186 -0
- package/dist/hooks/tests/test_copilot_write_guard.py +137 -0
- package/dist/hooks/tests/test_session_start_snapshot.py +116 -0
- package/dist/hooks/tests/test_state_machine_consistency.py +75 -0
- package/dist/skills/arxivsub-skill/SKILL.md +98 -0
- package/dist/skills/arxivsub-skill/skill.json +5 -0
- package/dist/skills/de-ai-checker/SKILL.md +110 -0
- package/dist/skills/de-ai-checker/skill.json +5 -0
- package/dist/skills/deep-interview/SKILL.md +91 -0
- package/dist/skills/deep-interview/skill.json +5 -0
- package/dist/skills/grill-with-docs/SKILL.md +120 -0
- package/dist/skills/grill-with-docs/skill.json +5 -0
- package/dist/skills/init-mcp/SKILL.md +83 -0
- package/dist/skills/init-mcp/skill.json +5 -0
- package/dist/skills/model-escalation/SKILL.md +93 -0
- package/dist/skills/model-escalation/skill.json +5 -0
- package/dist/skills/paper-architecture-web-drawing/SKILL.md +282 -0
- package/dist/skills/paper-architecture-web-drawing/skill.json +5 -0
- package/dist/skills/paper-deai/SKILL.md +53 -0
- package/dist/skills/paper-deai/skill.json +5 -0
- package/dist/skills/paper-en2zh/SKILL.md +29 -0
- package/dist/skills/paper-en2zh/skill.json +5 -0
- package/dist/skills/paper-expand/SKILL.md +43 -0
- package/dist/skills/paper-expand/skill.json +5 -0
- package/dist/skills/paper-experiment-analysis/SKILL.md +38 -0
- package/dist/skills/paper-experiment-analysis/skill.json +5 -0
- package/dist/skills/paper-figure-caption/SKILL.md +29 -0
- package/dist/skills/paper-figure-caption/skill.json +5 -0
- package/dist/skills/paper-logic-check/SKILL.md +30 -0
- package/dist/skills/paper-logic-check/skill.json +5 -0
- package/dist/skills/paper-polish/SKILL.md +34 -305
- package/dist/skills/paper-polish/skill.json +5 -0
- package/dist/skills/paper-review/SKILL.md +49 -0
- package/dist/skills/paper-review/skill.json +5 -0
- package/dist/skills/paper-sanity-check/SKILL.md +122 -0
- package/dist/skills/paper-sanity-check/skill.json +5 -0
- package/dist/skills/paper-shorten/SKILL.md +42 -0
- package/dist/skills/paper-shorten/skill.json +5 -0
- package/dist/skills/paper-table-caption/SKILL.md +29 -0
- package/dist/skills/paper-table-caption/skill.json +5 -0
- package/dist/skills/paper-translate/SKILL.md +48 -0
- package/dist/skills/paper-translate/skill.json +5 -0
- package/dist/skills/plugin-dev-agent-development/SKILL.md +95 -0
- package/dist/skills/plugin-dev-agent-development/skill.json +5 -0
- package/dist/skills/research-workflow/SKILL.md +116 -0
- package/dist/skills/research-workflow/skill.json +5 -0
- package/dist/skills/scientist-experiment-runner/SKILL.md +76 -0
- package/dist/skills/scientist-experiment-runner/skill.json +5 -0
- package/dist/skills/scientist-ideation/SKILL.md +52 -0
- package/dist/skills/scientist-ideation/skill.json +5 -0
- package/dist/skills/scientist-plotting/SKILL.md +49 -0
- package/dist/skills/scientist-plotting/skill.json +5 -0
- package/dist/skills/scientist-review/SKILL.md +40 -0
- package/dist/skills/scientist-review/skill.json +5 -0
- package/dist/skills/scientist-runtime-init/SKILL.md +46 -0
- package/dist/skills/scientist-runtime-init/skill.json +5 -0
- package/dist/skills/scientist-writeup/SKILL.md +60 -0
- package/dist/skills/scientist-writeup/skill.json +5 -0
- package/dist/skills/talk-normal/SKILL.md +73 -0
- package/dist/skills/talk-normal/skill.json +5 -0
- package/package.json +1 -1
- package/dist/agents/rc-experiment.md +0 -203
- package/dist/agents/rc-ideation.md +0 -224
- package/dist/agents/rc-literature.md +0 -228
- package/dist/agents/rc-plan.md +0 -189
- package/dist/agents/rc-polisher.md +0 -166
- package/dist/agents/rc-rebuttal.md +0 -194
- package/dist/agents/rc-reviewer.md +0 -187
- package/dist/agents/rc-update-spec.md +0 -231
- package/dist/agents/rc-verify.md +0 -234
- package/dist/agents/rc-writer.md +0 -161
- package/dist/skills/experiment-design/SKILL.md +0 -331
- package/dist/skills/full-research-workflow/SKILL.md +0 -363
- package/dist/skills/literature-search/SKILL.md +0 -244
- package/dist/skills/sanity-check/SKILL.md +0 -449
- package/dist/skills/submission-sprint/SKILL.md +0 -361
@@ -1,331 +0,0 @@
----
-name: experiment-design
-description: Designs and launches experiments. Handles long-running tasks via Monitor. Use for experiment tasks.
-triggers:
-  - "design experiment"
-  - "run training"
-  - "launch experiment"
-  - "run experiments"
-  - "execute experiments"
----
-
-# Experiment Design
-
-Orchestrate experiment design and execution: create task → dispatch @rc-experiment → monitor long-running jobs → verify results.
-
-## When to Use
-
-Use this skill when:
-- User asks to "design experiment for X"
-- User wants to "run training" or "launch experiment"
-- User needs to "execute experiments" for research
-- Standalone experiment execution (NOT part of full pipeline)
-
-Do NOT use when:
-- Part of full-research-workflow (that handles experiments internally)
-- User only wants to analyze existing results (use data-analysis skill)
-
-## Task-First Protocol
-
-Before starting, check if task exists:
-
-```powershell
-# Check for existing experiment task
-$taskFile = "C:\PythonProject\research_copilot\.rc\tasks\experiment-design.json"
-if (Test-Path $taskFile) {
-    $task = Get-Content $taskFile | ConvertFrom-Json
-    Write-Host "Found existing task: $($task.experiment_name)"
-} else {
-    Write-Host "No existing task. Will create one."
-}
-```
-
-Create task if needed:
-
-```powershell
-# Create experiment design task
-rc task create --type experiment --name "user experiment name" --output .rc/tasks/experiment-design.json
-```
-
-## Auto-Context Loading
-
-Read task context and relevant specs before orchestration:
-
-```powershell
-# Load task context
-$taskFile = "C:\PythonProject\research_copilot\.rc\tasks\experiment-design.json"
-if (Test-Path $taskFile) {
-    $task = Get-Content $taskFile | ConvertFrom-Json
-
-    Write-Host "Experiment Name: $($task.experiment_name)"
-    Write-Host "Target Metrics: $($task.target_metrics)"
-    Write-Host "Baseline Comparisons: $($task.baseline_count)"
-
-    # Load PRD if referenced
-    if ($task.prd_file -and (Test-Path $task.prd_file)) {
-        $prd = Get-Content $task.prd_file -Raw
-        Write-Host "PRD loaded: $($task.prd_file)"
-
-        # Extract metrics requirements from PRD
-        $metricsMatch = [regex]::Match($prd, '## Target Metrics\s+([\s\S]+?)(?=\n##|\z)')
-        if ($metricsMatch.Success) {
-            Write-Host "Target metrics from PRD:"
-            Write-Host $metricsMatch.Groups[1].Value
-        }
-    }
-
-    # Load methodology specs if available
-    $methodFile = "C:\PythonProject\research_copilot\.rc\methodology.md"
-    if (Test-Path $methodFile) {
-        $method = Get-Content $methodFile -Raw
-        Write-Host "Methodology loaded: $methodFile"
-    }
-}
-```
-
-## Orchestration Logic
-
-Execute experiment via agent dispatch with long-task support:
-
-```powershell
-# Start task
-rc task start experiment-design
-
-# Dispatch to experiment agent
-Write-Host "Dispatching to @rc-experiment agent..."
-
-# Agent will:
-# 1. Parse experiment requirements from PRD
-# 2. Design experiment configuration (hyperparameters, seeds, splits)
-# 3. Set up experiment directory structure
-# 4. Launch training jobs (using Monitor for long-running tasks)
-# 5. Collect metrics and results
-# 6. Compare against baselines
-# 7. Generate result tables and figures
-# 8. Save all artifacts to .rc/experiments/
-
-# Check status (agent runs autonomously)
-$status = rc task status experiment-design
-Write-Host "Task status: $status"
-```
-
-## Long-Task Handling
-
-The experiment executor uses `Bash(run_in_background=true)` + `Monitor` for training jobs:
-
-```powershell
-# Example: Monitor long-running training job
-# The agent will set this up internally, shown here for reference
-
-# Start training in background
-$jobId = Start-Job -ScriptBlock {
-    python train.py --config config.yaml --output experiments/run_001/
-}
-
-# Monitor training progress via log file
-Monitor -description "Training experiment run_001" `
-    -persistent $true `
-    -command @"
-tail -f experiments/run_001/train.log | grep --line-buffered 'epoch=\|loss=\|accuracy=\|FINISHED\|ERROR\|FAILED'
-"@
-
-# Agent continues other work while training runs
-# Will be notified when training completes or errors occur
-```
-
-Key points for long-task handling:
-- Training jobs run in background using `run_in_background=true`
-- Monitor streams progress updates (metrics, errors) without blocking
-- Agent can launch multiple experiments in parallel
-- Each experiment logs to separate directory for isolation
-- Failed experiments recorded with error details for debugging
-
-## Verification and Quality Gates
-
-Verify results before marking complete:
-
-```powershell
-# Verify experiment outputs
-$expDir = "C:\PythonProject\research_copilot\.rc\experiments"
-
-# Check experiment config
-$configFile = Join-Path $expDir "config.yaml"
-if (Test-Path $configFile) {
-    $config = Get-Content $configFile -Raw | ConvertFrom-Yaml
-    Write-Host "Config loaded: seed=$($config.seed), epochs=$($config.epochs)"
-} else {
-    Write-Host "WARNING: No config file found"
-}
-
-# Check results file
-$resultsFile = Join-Path $expDir "results.json"
-if (Test-Path $resultsFile) {
-    $results = Get-Content $resultsFile | ConvertFrom-Json
-    Write-Host "Found results for $($results.runs.Count) runs"
-
-    # Verify all required metrics present
-    $requiredMetrics = @('accuracy', 'f1_score', 'precision', 'recall')
-    foreach ($metric in $requiredMetrics) {
-        if (-not $results.metrics.$metric) {
-            Write-Host "WARNING: Missing metric: $metric"
-        }
-    }
-} else {
-    Write-Host "WARNING: No results file found"
-}
-
-# Check baseline comparisons
-$comparisonFile = Join-Path $expDir "baseline_comparison.json"
-if (Test-Path $comparisonFile) {
-    $comparison = Get-Content $comparisonFile | ConvertFrom-Json
-    Write-Host "Compared against $($comparison.baselines.Count) baselines"
-} else {
-    Write-Host "WARNING: No baseline comparison found"
-}
-
-# Check reproducibility artifacts
-$artifactsDir = Join-Path $expDir "artifacts"
-if (Test-Path $artifactsDir) {
-    $artifacts = Get-ChildItem $artifactsDir -Recurse
-    Write-Host "Saved $($artifacts.Count) reproducibility artifacts"
-} else {
-    Write-Host "WARNING: No artifacts directory"
-}
-```
-
-Complete task if quality gates pass:
-
-```powershell
-# Mark task complete
-rc task complete experiment-design
-```
-
-## Quality Gates
-
-Verify before marking complete:
-
-```powershell
-# Quality gate checks
-$passed = $true
-
-# Gate 1: All metrics from PRD achieved
-$resultsFile = "C:\PythonProject\research_copilot\.rc\experiments\results.json"
-if (Test-Path $resultsFile) {
-    $results = Get-Content $resultsFile | ConvertFrom-Json
-    $prdFile = "C:\PythonProject\research_copilot\.rc\prd.md"
-
-    if (Test-Path $prdFile) {
-        $prd = Get-Content $prdFile -Raw
-        $metricsMatch = [regex]::Matches($prd, '(\w+):\s*≥\s*([\d.]+)')
-
-        foreach ($match in $metricsMatch) {
-            $metricName = $match.Groups[1].Value
-            $targetValue = [double]$match.Groups[2].Value
-            $actualValue = $results.metrics.$metricName
-
-            if ($actualValue -lt $targetValue) {
-                Write-Host "FAIL: $metricName = $actualValue < $targetValue (target)"
-                $passed = $false
-            }
-        }
-    }
-} else {
-    Write-Host "FAIL: No results file"
-    $passed = $false
-}
-
-# Gate 2: Results logged to artifacts/results/
-$artifactsDir = "C:\PythonProject\research_copilot\.rc\experiments\artifacts\results"
-if (Test-Path $artifactsDir) {
-    $resultFiles = Get-ChildItem $artifactsDir -Filter "*.json"
-    if ($resultFiles.Count -eq 0) {
-        Write-Host "FAIL: No result files in artifacts/results/"
-        $passed = $false
-    }
-} else {
-    Write-Host "FAIL: artifacts/results/ directory missing"
-    $passed = $false
-}
-
-# Gate 3: Config and seed recorded for reproducibility
-$configFile = "C:\PythonProject\research_copilot\.rc\experiments\config.yaml"
-if (Test-Path $configFile) {
-    $config = Get-Content $configFile -Raw
-    if ($config -notmatch 'seed:\s*\d+') {
-        Write-Host "FAIL: No seed recorded in config"
-        $passed = $false
-    }
-} else {
-    Write-Host "FAIL: No config file for reproducibility"
-    $passed = $false
-}
-
-if ($passed) {
-    Write-Host "All quality gates passed"
-} else {
-    Write-Host "Quality gates failed - review needed"
-}
-```
-
-## Report Format
-
-Generate final report:
-
-```markdown
-# Experiment Design Report
-
-**Experiment**: {experiment_name}
-**Date**: {date}
-**Runs Completed**: {run_count}
-**Status**: {status}
-
-## Configuration
-
-- **Seed**: {seed}
-- **Epochs**: {epochs}
-- **Batch Size**: {batch_size}
-- **Learning Rate**: {learning_rate}
-
-## Results
-
-| Metric | Value | Target | Status |
-|--------|-------|--------|--------|
-| Accuracy | {accuracy} | {target_accuracy} | {✓/✗} |
-| F1 Score | {f1_score} | {target_f1} | {✓/✗} |
-| Precision | {precision} | {target_precision} | {✓/✗} |
-| Recall | {recall} | {target_recall} | {✓/✗} |
-
-## Baseline Comparisons
-
-| Baseline | Metric | Their Score | Our Score | Improvement |
-|----------|--------|-------------|-----------|-------------|
-| {name} | {metric} | {baseline_score} | {our_score} | {delta}% |
-
-## Reproducibility
-
-All artifacts saved to: `.rc/experiments/artifacts/`
-
-- **Config**: `config.yaml` (seed={seed})
-- **Checkpoints**: `checkpoints/run_{id}/`
-- **Logs**: `logs/train.log`
-- **Results**: `artifacts/results/results_{timestamp}.json`
-
-## Next Steps
-
-- [ ] Verify results meet all PRD requirements
-- [ ] Generate figures for paper
-- [ ] Update paper draft with results
-- [ ] Archive experiment for future reference
-```
-
-## Example Usage
-
-```
-User: "Design and run experiments for the sentiment analysis model"
-
-Skill: [Reads PRD, creates task, dispatches @rc-experiment]
-Skill: [Agent launches training via Monitor, continues other work]
-Skill: [Receives notification when training completes]
-Skill: [Verifies metrics, generates report]
-Skill: "Experiment completed. Achieved accuracy=0.94 (target: 0.90). Results saved to .rc/experiments/."
-```
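
Reviewer note: Gate 1 of the removed experiment-design skill extracts `metric: ≥ value` thresholds from the PRD and compares them with the metrics in `results.json`. The sketch below restates that check in Python (the language of the hook scripts added in this release) purely for illustration. The function name `check_prd_thresholds` and the sample data are hypothetical, the numeric pattern is slightly stricter than the original `[\d.]+`, and a metric absent from the results is reported as an explicit failure, which is an assumption rather than documented behavior.

```python
import re

# "<metric>: ≥ <number>", modelled on the pattern in the removed PowerShell gate.
# The number part is tightened so that float() cannot fail on input like "1.2.3".
THRESHOLD_RE = re.compile(r"(\w+):\s*≥\s*(\d+(?:\.\d+)?)")


def check_prd_thresholds(prd_text: str, metrics: dict) -> list:
    """Return failure messages; an empty list means every threshold was met."""
    failures = []
    for name, target_text in THRESHOLD_RE.findall(prd_text):
        target = float(target_text)
        actual = metrics.get(name)
        if actual is None:
            # Assumption: a metric the PRD requires but the run never reported fails the gate.
            failures.append(f"FAIL: {name} missing from results")
        elif actual < target:
            failures.append(f"FAIL: {name} = {actual} < {target} (target)")
    return failures


if __name__ == "__main__":
    # Hypothetical PRD excerpt and results, for illustration only.
    prd = "## Target Metrics\naccuracy: ≥ 0.90\nf1_score: ≥ 0.85\nrecall: ≥ 0.80\n"
    results = {"accuracy": 0.94, "f1_score": 0.81}
    for message in check_prd_thresholds(prd, results):
        print(message)
```

Returning a list of messages instead of a single boolean keeps every failed threshold visible in one pass, which mirrors how the removed script printed one `FAIL` line per metric.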
@@ -1,363 +0,0 @@
----
-name: full-research-workflow
-description: |
-  Orchestrate a complete research workflow from literature review to camera-ready paper.
-  Calls `rc` CLI commands to create tasks, dispatches agents for execution, verifies quality gates.
-  Follows Trellis philosophy: Task-first, Action-before-asking, Single source of truth.
-triggers:
-  - "run the full research workflow"
-  - "orchestrate research from start to finish"
-  - "execute the complete research pipeline"
-  - "run all research stages"
----
-
-# Full Research Workflow
-
-Orchestrates a complete research workflow: literature → ideation → experiment → writing → polish → review → rebuttal.
-
-## When to Use
-
-Use this skill when:
-- User wants to run the entire research pipeline end-to-end
-- User says "run the full workflow", "execute all stages", "orchestrate research"
-- A `prd.md` exists with research goals and you need to execute all stages
-
-Do NOT use when:
-- User wants to run a single stage (use stage-specific skills instead)
-- No `prd.md` exists (guide user to create one first)
-
-## Task-First Principle
-
-If `prd.md` exists and has clear goals, proceed directly to Stage 1. Only ask questions if critical information is missing.
-
-## Auto-Context (Action-First)
-
-**Step 0**: Before asking clarifying questions, read the context files that might contain answers:
-
-```powershell
-# Check for PRD and existing state
-if (Test-Path prd.md) { Get-Content prd.md } else { "No PRD found" }
-rc task list --json 2>$null || "No tasks yet"
-if (Test-Path execute.jsonl) { "Execution log exists" } else { "No execution log" }
-if ((Test-Path baselines/) -and (Test-Path venue/)) { "Support dirs ready" } else { "No support dirs" }
-```
-
-**What to look for**:
-- `prd.md`: Research goals, target venue, constraints
-- `execute.jsonl`: Previous execution history, failed stages
-- Task list: Which stages are already complete/in-progress
-- Support dirs (`baselines/`, `venue/`): Whether infrastructure is ready
-
-**Decision tree**:
-1. If `prd.md` missing → Ask user to create one or provide research goals
-2. If `prd.md` exists but tasks already in progress → Resume from last incomplete stage
-3. If `prd.md` exists and no tasks → Start fresh from Stage 1
-
-Only ask clarifying questions if critical information is genuinely missing from these files.
-
-## Workflow Stages
-
-### Stage 1: Literature Review
-
-```powershell
-# Create literature review task and capture ID
-$output = rc task create `
-    --kind literature `
-    --title "Literature Review" `
-    --goal "Survey state-of-the-art in [research area from prd.md]"
-$LIT_TASK_ID = ($output | Select-String 'Task (\d+)').Matches.Groups[1].Value
-
-# Dispatch agent to execute
-rc agent dispatch --task-id $LIT_TASK_ID --agent literature-agent
-
-# Verify deliverables exist
-if (-not (Test-Path lit_review.md)) { "ERROR: lit_review.md missing" }
-if (-not (Test-Path references/)) { "ERROR: references/ missing" }
-
-# Mark complete
-rc task complete --task-id $LIT_TASK_ID
-```
-
-Quality gate: `lit_review.md` must exist with 15+ references, clear gaps identified.
-
-### Stage 2: Ideation
-
-```powershell
-# Verify literature task is complete
-$status = rc task status $LIT_TASK_ID
-if ($status -notmatch "completed") {
-    "ERROR: Literature task not completed yet"
-    exit 1
-}
-
-# Create ideation task (depends on literature)
-$output = rc task create `
-    --kind ideation `
-    --title "Ideation" `
-    --goal "Generate research ideas addressing gaps from literature" `
-    --depends-on $LIT_TASK_ID
-$IDEA_TASK_ID = ($output | Select-String 'Task (\d+)').Matches.Groups[1].Value
-
-# Dispatch agent
-rc agent dispatch --task-id $IDEA_TASK_ID --agent ideation-agent
-
-# Verify deliverables
-if (-not (Test-Path ideas.md)) { "ERROR: ideas.md missing" }
-if (-not (Select-String -Path ideas.md -Pattern "idea_" -Quiet)) { "ERROR: No structured ideas found" }
-
-# Mark complete
-rc task complete --task-id $IDEA_TASK_ID
-```
-
-Quality gate: `ideas.md` must contain 3+ structured ideas with novelty scores.
-
-### Stage 3: Experiment
-
-```powershell
-# Verify ideation task is complete
-$status = rc task status $IDEA_TASK_ID
-if ($status -notmatch "completed") {
-    "ERROR: Ideation task not completed yet"
-    exit 1
-}
-
-# Create experiment task (depends on ideation)
-$output = rc task create `
-    --kind experiment `
-    --title "Experiment Execution" `
-    --goal "Implement and run experiments for selected idea" `
-    --depends-on $IDEA_TASK_ID
-$EXP_TASK_ID = ($output | Select-String 'Task (\d+)').Matches.Groups[1].Value
-
-# Dispatch agent
-rc agent dispatch --task-id $EXP_TASK_ID --agent experiment-agent
-
-# Verify deliverables
-if (-not (Test-Path results.json)) { "ERROR: results.json missing" }
-if (-not (Test-Path figures/)) { "ERROR: figures/ missing" }
-if (-not (Test-Path experiment_log.md)) { "ERROR: experiment_log.md missing" }
-
-# Mark complete
-rc task complete --task-id $EXP_TASK_ID
-```
-
-Quality gate: `results.json` must have baseline comparisons, `figures/` must contain plots.
-
-### Stage 4: Writing
-
-```powershell
-# Verify experiment task is complete
-$status = rc task status $EXP_TASK_ID
-if ($status -notmatch "completed") {
-    "ERROR: Experiment task not completed yet"
-    exit 1
-}
-
-# Create writing task (depends on experiment)
-$output = rc task create `
-    --kind writing `
-    --title "Draft Paper" `
-    --goal "Write paper draft following venue template" `
-    --depends-on $EXP_TASK_ID
-$WRITE_TASK_ID = ($output | Select-String 'Task (\d+)').Matches.Groups[1].Value
-
-# Dispatch agent
-rc agent dispatch --task-id $WRITE_TASK_ID --agent writing-agent
-
-# Verify deliverables
-if (-not (Test-Path paper_draft.tex)) { "ERROR: paper_draft.tex missing" }
-pdflatex paper_draft.tex >$null 2>&1
-if ($LASTEXITCODE -ne 0) { "WARN: LaTeX compilation failed" }
-
-# Mark complete
-rc task complete --task-id $WRITE_TASK_ID
-```
-
-Quality gate: `paper_draft.tex` must compile, all sections present, references formatted.
-
-### Stage 5: Polish
-
-```powershell
-# Verify writing task is complete
-$status = rc task status $WRITE_TASK_ID
-if ($status -notmatch "completed") {
-    "ERROR: Writing task not completed yet"
-    exit 1
-}
-
-# Create polish task (depends on writing)
-$output = rc task create `
-    --kind polish `
-    --title "Polish Paper" `
-    --goal "Refine writing, check formatting, validate claims" `
-    --depends-on $WRITE_TASK_ID
-$POLISH_TASK_ID = ($output | Select-String 'Task (\d+)').Matches.Groups[1].Value
-
-# Dispatch agent
-rc agent dispatch --task-id $POLISH_TASK_ID --agent polish-agent
-
-# Verify deliverables
-if (-not (Test-Path paper_polished.tex)) { "ERROR: paper_polished.tex missing" }
-if (-not (Select-String -Path paper_polished.tex -Pattern "\\cite{" -Quiet)) { "WARN: No citations found" }
-
-# Mark complete
-rc task complete --task-id $POLISH_TASK_ID
-```
-
-Quality gate: Paper length within venue limits, all claims cited, figures captioned.
-
-### Stage 6: Review
-
-```powershell
-# Verify polish task is complete
-$status = rc task status $POLISH_TASK_ID
-if ($status -notmatch "completed") {
-    "ERROR: Polish task not completed yet"
-    exit 1
-}
-
-# Create review task (depends on polish)
-$output = rc task create `
-    --kind review `
-    --title "Internal Review" `
-    --goal "Simulate peer review, identify weaknesses" `
-    --depends-on $POLISH_TASK_ID
-$REVIEW_TASK_ID = ($output | Select-String 'Task (\d+)').Matches.Groups[1].Value
-
-# Dispatch agent
-rc agent dispatch --task-id $REVIEW_TASK_ID --agent review-agent
-
-# Verify deliverables
-if (-not (Test-Path review_report.md)) { "ERROR: review_report.md missing" }
-if (-not (Select-String -Path review_report.md -Pattern "weakness_" -Quiet)) { "ERROR: No structured weaknesses" }
-
-# Mark complete
-rc task complete --task-id $REVIEW_TASK_ID
-```
-
-Quality gate: `review_report.md` must identify 3+ weaknesses with severity scores.
-
-### Stage 7: Rebuttal (Optional)
-
-```powershell
-# Only run if review found critical issues
-if (Select-String -Path review_report.md -Pattern "severity: critical" -Quiet) {
-    # Verify review task is complete
-    $status = rc task status $REVIEW_TASK_ID
-    if ($status -notmatch "completed") {
-        "ERROR: Review task not completed yet"
-        exit 1
-    }
-
-    $output = rc task create `
-        --kind rebuttal `
-        --title "Address Review Comments" `
-        --goal "Revise paper based on review feedback" `
-        --depends-on $REVIEW_TASK_ID
-    $REBUTTAL_TASK_ID = ($output | Select-String 'Task (\d+)').Matches.Groups[1].Value
-
-    rc agent dispatch --task-id $REBUTTAL_TASK_ID --agent rebuttal-agent
-
-    if (-not (Test-Path paper_revised.tex)) { "ERROR: paper_revised.tex missing" }
-    if (-not (Test-Path rebuttal_response.md)) { "ERROR: rebuttal_response.md missing" }
-
-    rc task complete --task-id $REBUTTAL_TASK_ID
-}
-```
-
-Quality gate: All critical weaknesses addressed, rebuttal response provided.
-
-## Error Recovery
-
-### Executor Fails
-
-If `rc agent dispatch` fails or agent reports errors:
-
-1. Check `execute.jsonl` for the failure record
-2. Read the error message and context
-3. Record a gap in the task notes:
-   ```powershell
-   rc task update --task-id $TASK_ID `
-     --add-note "Agent failed: [error]. Manual intervention needed."
-   ```
-4. Ask user: "The [stage] agent failed with error: [error]. Would you like me to retry with different parameters, or handle this manually?"
-
-### Quality Gate Fails
-
-If deliverables are missing or malformed:
-
-1. Set task status back to `in_progress`:
-   ```powershell
-   rc task update --task-id $TASK_ID --status in_progress
-   ```
-2. Re-dispatch agent with explicit instructions:
-   ```powershell
-   rc agent dispatch --task-id $TASK_ID `
-     --agent <agent-name> `
-     --instruction "Previous run failed quality gate: [details]. Focus on [missing deliverable]."
-   ```
-3. If fails twice, ask user for guidance
-
-### MCP Unavailable
-
-If `rc` CLI commands fail because MCP server is not running:
-
-1. Record the gap: "MCP server unavailable, cannot create tasks programmatically"
-2. Provide manual fallback:
-   ```powershell
-   # Manual task creation
-   New-Item -ItemType Directory -Force -Path .research/tasks/
-   @"
-   {
-     "kind": "literature",
-     "title": "Literature Review",
-     "status": "pending"
-   }
-   "@ | Out-File -FilePath .research/tasks/literature.json -Encoding utf8
-   ```
-3. Ask user: "The `rc` CLI is unavailable. Should I proceed with manual task tracking?"
-
-## Report Format
-
-After all stages complete (or when blocked), provide a summary:
-
-```
-Research Workflow Summary
-========================
-
-Completed Stages:
-✓ Stage 1: Literature Review (lit_review.md, 18 references)
-✓ Stage 2: Ideation (ideas.md, 4 ideas generated)
-✓ Stage 3: Experiment (results.json, 3 baselines compared)
-✓ Stage 4: Writing (paper_draft.tex, 8 pages)
-✓ Stage 5: Polish (paper_polished.tex, formatting validated)
-✓ Stage 6: Review (review_report.md, 2 critical weaknesses found)
-✗ Stage 7: Rebuttal (skipped - no critical issues OR in progress)
-
-Deliverables:
-- Paper: paper_polished.tex (or paper_revised.tex if rebuttal ran)
-- Figures: figures/ (N plots)
-- References: references/ (M papers)
-- Execution log: execute.jsonl (full trace)
-
-Quality Gates Passed:
-- Literature: 15+ references ✓
-- Ideation: 3+ ideas ✓
-- Experiment: Baseline comparisons ✓
-- Writing: LaTeX compiles ✓
-- Polish: Within page limits ✓
-- Review: Weaknesses identified ✓
-
-Next Steps:
-- Review paper_polished.tex
-- Address any remaining review comments manually
-- Prepare submission package
-```
-
-If any stage failed, include:
-
-```
-Blockers:
-- Stage X failed: [error message]
-- Manual intervention needed: [specific action]
-```
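
Reviewer note: the removed full-research-workflow skill chose its starting point with a three-branch decision tree (no `prd.md`, tasks already in progress, or a fresh start) and then ran stages strictly in order, each one gated on the previous task being `completed`. The sketch below condenses that control flow into Python for illustration only. The function `next_action`, the returned action strings, and the `task_status` mapping from stage name to status are hypothetical and not part of the `rc` CLI. The optional rebuttal stage is left out because the skill ran it only when the review reported a critical issue.

```python
# Stage order taken from the removed skill; rebuttal is conditional and omitted here.
REQUIRED_STAGES = ["literature", "ideation", "experiment", "writing", "polish", "review"]


def next_action(prd_exists: bool, task_status: dict) -> str:
    """Pick the next step from the PRD presence and the per-stage task status."""
    if not prd_exists:
        # Branch 1: without a PRD there are no research goals to execute.
        return "ask-user-for-prd"
    if not task_status:
        # Branch 3: PRD present and no tasks yet, so start from Stage 1.
        return "start:literature"
    # Branch 2: resume from the first stage that is not completed.
    for stage in REQUIRED_STAGES:
        if task_status.get(stage) != "completed":
            return f"resume:{stage}"
    return "report"


if __name__ == "__main__":
    # Hypothetical status snapshot: literature is done, ideation is still running.
    print(next_action(True, {"literature": "completed", "ideation": "in_progress"}))
```

Scanning the stages in order enforces the same dependency chain the skill expressed through `--depends-on`: a later stage is never chosen while an earlier one is incomplete.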