get-research-done 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (127)
  1. package/LICENSE +21 -0
  2. package/README.md +560 -0
  3. package/agents/grd-architect.md +789 -0
  4. package/agents/grd-codebase-mapper.md +738 -0
  5. package/agents/grd-critic.md +1065 -0
  6. package/agents/grd-debugger.md +1203 -0
  7. package/agents/grd-evaluator.md +948 -0
  8. package/agents/grd-executor.md +784 -0
  9. package/agents/grd-explorer.md +2063 -0
  10. package/agents/grd-graduator.md +484 -0
  11. package/agents/grd-integration-checker.md +423 -0
  12. package/agents/grd-phase-researcher.md +641 -0
  13. package/agents/grd-plan-checker.md +745 -0
  14. package/agents/grd-planner.md +1386 -0
  15. package/agents/grd-project-researcher.md +865 -0
  16. package/agents/grd-research-synthesizer.md +256 -0
  17. package/agents/grd-researcher.md +2361 -0
  18. package/agents/grd-roadmapper.md +605 -0
  19. package/agents/grd-verifier.md +778 -0
  20. package/bin/install.js +1294 -0
  21. package/commands/grd/add-phase.md +207 -0
  22. package/commands/grd/add-todo.md +193 -0
  23. package/commands/grd/architect.md +283 -0
  24. package/commands/grd/audit-milestone.md +277 -0
  25. package/commands/grd/check-todos.md +228 -0
  26. package/commands/grd/complete-milestone.md +136 -0
  27. package/commands/grd/debug.md +169 -0
  28. package/commands/grd/discuss-phase.md +86 -0
  29. package/commands/grd/evaluate.md +1095 -0
  30. package/commands/grd/execute-phase.md +339 -0
  31. package/commands/grd/explore.md +258 -0
  32. package/commands/grd/graduate.md +323 -0
  33. package/commands/grd/help.md +482 -0
  34. package/commands/grd/insert-phase.md +227 -0
  35. package/commands/grd/insights.md +231 -0
  36. package/commands/grd/join-discord.md +18 -0
  37. package/commands/grd/list-phase-assumptions.md +50 -0
  38. package/commands/grd/map-codebase.md +71 -0
  39. package/commands/grd/new-milestone.md +721 -0
  40. package/commands/grd/new-project.md +1008 -0
  41. package/commands/grd/pause-work.md +134 -0
  42. package/commands/grd/plan-milestone-gaps.md +295 -0
  43. package/commands/grd/plan-phase.md +525 -0
  44. package/commands/grd/progress.md +364 -0
  45. package/commands/grd/quick-explore.md +236 -0
  46. package/commands/grd/quick.md +309 -0
  47. package/commands/grd/remove-phase.md +349 -0
  48. package/commands/grd/research-phase.md +200 -0
  49. package/commands/grd/research.md +681 -0
  50. package/commands/grd/resume-work.md +40 -0
  51. package/commands/grd/set-profile.md +106 -0
  52. package/commands/grd/settings.md +136 -0
  53. package/commands/grd/update.md +172 -0
  54. package/commands/grd/verify-work.md +219 -0
  55. package/get-research-done/config/default.json +15 -0
  56. package/get-research-done/references/checkpoints.md +1078 -0
  57. package/get-research-done/references/continuation-format.md +249 -0
  58. package/get-research-done/references/git-integration.md +254 -0
  59. package/get-research-done/references/model-profiles.md +73 -0
  60. package/get-research-done/references/planning-config.md +94 -0
  61. package/get-research-done/references/questioning.md +141 -0
  62. package/get-research-done/references/tdd.md +263 -0
  63. package/get-research-done/references/ui-brand.md +160 -0
  64. package/get-research-done/references/verification-patterns.md +612 -0
  65. package/get-research-done/templates/DEBUG.md +159 -0
  66. package/get-research-done/templates/UAT.md +247 -0
  67. package/get-research-done/templates/archive-reason.md +195 -0
  68. package/get-research-done/templates/codebase/architecture.md +255 -0
  69. package/get-research-done/templates/codebase/concerns.md +310 -0
  70. package/get-research-done/templates/codebase/conventions.md +307 -0
  71. package/get-research-done/templates/codebase/integrations.md +280 -0
  72. package/get-research-done/templates/codebase/stack.md +186 -0
  73. package/get-research-done/templates/codebase/structure.md +285 -0
  74. package/get-research-done/templates/codebase/testing.md +480 -0
  75. package/get-research-done/templates/config.json +35 -0
  76. package/get-research-done/templates/context.md +283 -0
  77. package/get-research-done/templates/continue-here.md +78 -0
  78. package/get-research-done/templates/critic-log.md +288 -0
  79. package/get-research-done/templates/data-report.md +173 -0
  80. package/get-research-done/templates/debug-subagent-prompt.md +91 -0
  81. package/get-research-done/templates/decision-log.md +58 -0
  82. package/get-research-done/templates/decision.md +138 -0
  83. package/get-research-done/templates/discovery.md +146 -0
  84. package/get-research-done/templates/experiment-readme.md +104 -0
  85. package/get-research-done/templates/graduated-script.md +180 -0
  86. package/get-research-done/templates/iteration-summary.md +234 -0
  87. package/get-research-done/templates/milestone-archive.md +123 -0
  88. package/get-research-done/templates/milestone.md +115 -0
  89. package/get-research-done/templates/objective.md +271 -0
  90. package/get-research-done/templates/phase-prompt.md +567 -0
  91. package/get-research-done/templates/planner-subagent-prompt.md +117 -0
  92. package/get-research-done/templates/project.md +184 -0
  93. package/get-research-done/templates/requirements.md +231 -0
  94. package/get-research-done/templates/research-project/ARCHITECTURE.md +204 -0
  95. package/get-research-done/templates/research-project/FEATURES.md +147 -0
  96. package/get-research-done/templates/research-project/PITFALLS.md +200 -0
  97. package/get-research-done/templates/research-project/STACK.md +120 -0
  98. package/get-research-done/templates/research-project/SUMMARY.md +170 -0
  99. package/get-research-done/templates/research.md +529 -0
  100. package/get-research-done/templates/roadmap.md +202 -0
  101. package/get-research-done/templates/scorecard.json +113 -0
  102. package/get-research-done/templates/state.md +287 -0
  103. package/get-research-done/templates/summary.md +246 -0
  104. package/get-research-done/templates/user-setup.md +311 -0
  105. package/get-research-done/templates/verification-report.md +322 -0
  106. package/get-research-done/workflows/complete-milestone.md +756 -0
  107. package/get-research-done/workflows/diagnose-issues.md +231 -0
  108. package/get-research-done/workflows/discovery-phase.md +289 -0
  109. package/get-research-done/workflows/discuss-phase.md +433 -0
  110. package/get-research-done/workflows/execute-phase.md +657 -0
  111. package/get-research-done/workflows/execute-plan.md +1844 -0
  112. package/get-research-done/workflows/list-phase-assumptions.md +178 -0
  113. package/get-research-done/workflows/map-codebase.md +322 -0
  114. package/get-research-done/workflows/resume-project.md +307 -0
  115. package/get-research-done/workflows/transition.md +556 -0
  116. package/get-research-done/workflows/verify-phase.md +628 -0
  117. package/get-research-done/workflows/verify-work.md +596 -0
  118. package/hooks/dist/grd-check-update.js +61 -0
  119. package/hooks/dist/grd-statusline.js +84 -0
  120. package/package.json +47 -0
  121. package/scripts/audit-help-commands.sh +115 -0
  122. package/scripts/build-hooks.js +42 -0
  123. package/scripts/verify-all-commands.sh +246 -0
  124. package/scripts/verify-architect-warning.sh +35 -0
  125. package/scripts/verify-insights-mode.sh +40 -0
  126. package/scripts/verify-quick-mode.sh +20 -0
  127. package/scripts/verify-revise-data-routing.sh +139 -0
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Lex Christopherson
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,560 @@
1
+ <div align="center">
2
+
3
+ # GET RESEARCH DONE (GRD)
4
+
5
+ **A recursive, agentic framework for ML research with hypothesis-driven experimentation for Claude Code.**
6
+
7
+ **Structured ML experimentation with scientific rigor — from hypothesis to validated conclusion, with a Critic agent enforcing skepticism at every step.**
8
+
9
+ [![npm version](https://img.shields.io/npm/v/get-research-done?style=for-the-badge&logo=npm&logoColor=white&color=CB3837)](https://www.npmjs.com/package/get-research-done)
10
+ [![npm downloads](https://img.shields.io/npm/dm/get-research-done?style=for-the-badge&logo=npm&logoColor=white&color=CB3837)](https://www.npmjs.com/package/get-research-done)
11
+ [![Discord](https://img.shields.io/badge/Discord-Join%20Server-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/5JJgD5svVS)
12
+ [![GitHub stars](https://img.shields.io/github/stars/glittercowboy/get-research-done?style=for-the-badge&logo=github&color=181717)](https://github.com/glittercowboy/get-research-done)
13
+ [![License](https://img.shields.io/badge/license-MIT-blue?style=for-the-badge)](LICENSE)
14
+
15
+ <br>
16
+
17
+ ```bash
18
+ npx get-research-done
19
+ ```
20
+
21
+ **Works on Mac, Windows, and Linux.**
22
+
23
+ <br>
24
+
25
+ ![GRD Install](assets/terminal.svg)
26
+
27
+ <br>
28
+
29
+ *"If you know clearly what you want, this WILL build it for you. No bs."*
30
+
31
+ *"I've done SpecKit, OpenSpec and Taskmaster — this has produced the best results for me."*
32
+
33
+ *"By far the most powerful addition to my Claude Code. Nothing over-engineered. Literally just gets shit done."*
34
+
35
+ <br>
36
+
37
+ **Trusted by engineers at Amazon, Google, Shopify, and Webflow.**
38
+
39
+ [Why I Built This](#why-i-built-this) · [How It Works](#how-it-works) · [Commands](#commands) · [Why It Works](#why-it-works)
40
+
41
+ </div>
42
+
43
+ ---
44
+
45
+ ## Why I Built This
46
+
47
+ ML research has a reproducibility crisis. Experiments are ad-hoc, hypotheses are vague, validation is subjective, and insights get lost.
48
+
49
+ I've watched researchers spend weeks on experiments with fundamental flaws: data leakage baked into features, "95% accuracy" on shifted distributions, negative results deleted rather than preserved. The problem isn't capability — it's structure.
50
+
51
+ So I built GRD. It's the framework that makes ML research systematic:
52
+
53
+ - **Data-first philosophy** — Explore your data before forming hypotheses
54
+ - **Testable hypotheses** — Falsification criteria, success metrics, baseline requirements
55
+ - **Automated skepticism** — Critic agent catches leakage, overfitting, and logical errors
56
+ - **Recursive validation** — Results contradict the data? System routes back to exploration
57
+ - **Human-in-the-loop gates** — You make final calls on validation and archival
58
+ - **Negative result preservation** — Failed hypotheses are valuable knowledge
59
+
60
+ The complexity is in the system, not in your workflow. You run five commands: `/grd:explore`, `/grd:architect`, `/grd:research`, `/grd:evaluate`, `/grd:graduate`. The agents handle the rest.
61
+
62
+ — **Ulmentflam**
63
+
64
+ ---
65
+
66
+ ## Who This Is For
67
+
68
+ ML researchers and practitioners who want structured experimentation with hypothesis-driven workflows — without building custom research infrastructure from scratch.
69
+
70
+ ---
71
+
72
+ ## Getting Started
73
+
74
+ ```bash
75
+ npx get-research-done
76
+ ```
77
+
78
+ The installer prompts you to choose:
79
+ 1. **Runtime** — Claude Code, OpenCode, or both
80
+ 2. **Location** — Global (all projects) or local (current project only)
81
+
82
+ Verify with `/grd:help` inside your Claude Code or OpenCode interface.
83
+
84
+ ### Staying Updated
85
+
86
+ GRD evolves fast. Update periodically:
87
+
88
+ ```bash
89
+ npx get-research-done@latest
90
+ ```
91
+
92
+ <details>
93
+ <summary><strong>Non-interactive Install (Docker, CI, Scripts)</strong></summary>
94
+
95
+ ```bash
96
+ # Claude Code
97
+ npx get-research-done --claude --global # Install to ~/.claude/
98
+ npx get-research-done --claude --local # Install to ./.claude/
99
+
100
+ # OpenCode (open source, free models)
101
+ npx get-research-done --opencode --global # Install to ~/.opencode/
102
+
103
+ # Both runtimes
104
+ npx get-research-done --both --global # Install to both directories
105
+ ```
106
+
107
+ Use `--global` (`-g`) or `--local` (`-l`) to skip the location prompt.
108
+ Use `--claude`, `--opencode`, or `--both` to skip the runtime prompt.
109
+
110
+ </details>
111
+
112
+ <details>
113
+ <summary><strong>Development Installation</strong></summary>
114
+
115
+ Clone the repository and run the installer locally:
116
+
117
+ ```bash
118
+ git clone https://github.com/glittercowboy/get-research-done.git
119
+ cd get-research-done
120
+ node bin/install.js --claude --local
121
+ ```
122
+
123
+ Installs to `./.claude/` for testing modifications before contributing.
124
+
125
+ </details>
126
+
127
+ ### Recommended: Skip Permissions Mode
128
+
129
+ GRD is designed for frictionless automation. Run Claude Code with:
130
+
131
+ ```bash
132
+ claude --dangerously-skip-permissions
133
+ ```
134
+
135
+ > [!TIP]
136
+ > This is how GRD is intended to be used — stopping to approve `date` and `git commit` 50 times defeats the purpose.
137
+
138
+ <details>
139
+ <summary><strong>Alternative: Granular Permissions</strong></summary>
140
+
141
+ If you prefer not to use that flag, add this to your project's `.claude/settings.json`:
142
+
143
+ ```json
144
+ {
145
+ "permissions": {
146
+ "allow": [
147
+ "Bash(date:*)",
148
+ "Bash(echo:*)",
149
+ "Bash(cat:*)",
150
+ "Bash(ls:*)",
151
+ "Bash(mkdir:*)",
152
+ "Bash(wc:*)",
153
+ "Bash(head:*)",
154
+ "Bash(tail:*)",
155
+ "Bash(sort:*)",
156
+ "Bash(grep:*)",
157
+ "Bash(tr:*)",
158
+ "Bash(git add:*)",
159
+ "Bash(git commit:*)",
160
+ "Bash(git status:*)",
161
+ "Bash(git log:*)",
162
+ "Bash(git diff:*)",
163
+ "Bash(git tag:*)"
164
+ ]
165
+ }
166
+ }
167
+ ```
168
+
169
+ </details>
170
+
171
+ ---
172
+
173
+ ## How It Works
174
+
175
+ GRD follows a recursive validation loop: **Explore → Architect → Research → Evaluate → Graduate**. The Critic agent enforces skepticism at every step, routing experiments back to earlier phases when issues are detected.
176
+
177
+ > **Already have ML code?** Run `/grd:map-codebase` first. It spawns parallel agents to analyze your models, datasets, metrics, and experiment patterns. Then `/grd:new-project` knows your research context.
178
+
179
+ ### 1. Data Reconnaissance
180
+
181
+ ```
182
+ /grd:explore ./data/train.csv
183
+ ```
184
+
185
+ **Understand your data before forming hypotheses.**
186
+
187
+ The Explorer agent profiles your dataset:
188
+
189
+ - **Distributions** — Feature statistics, class balance, outliers
190
+ - **Missing data patterns** — MCAR/MAR/MNAR analysis
191
+ - **Leakage detection** — High-confidence warnings for temporal/feature leakage
192
+ - **Data quality** — Anomalies that could invalidate experiments
193
+
194
+ The output grounds all downstream work in data reality.
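One simple form a feature-leakage warning could take is flagging features that are almost perfectly correlated with the target. The sketch below is illustrative only: `pearson`, `flag_leaky_features`, the 0.98 threshold, and the toy columns are assumptions made for this example, not the Explorer agent's actual logic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def flag_leaky_features(features, target, threshold=0.98):
    """Return names of features whose |correlation| with the target meets the threshold."""
    return sorted(
        name
        for name, values in features.items()
        if abs(pearson(values, target)) >= threshold
    )


# Hypothetical toy data: `refund_issued` is recorded after the outcome, so it leaks.
features = {
    "age": [23, 45, 31, 52, 38, 27],
    "refund_issued": [0, 1, 0, 1, 1, 0],
}
churned = [0, 1, 0, 1, 1, 0]
print(flag_leaky_features(features, churned))  # prints ['refund_issued']
```

A correlation screen like this only catches the most blatant leaks. Temporal leakage usually needs a check on when each feature becomes available relative to the label.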
195
+
196
+ **Creates:** `.planning/DATA_REPORT.md`
197
+
198
+ ---
199
+
200
+ ### 2. Hypothesis Synthesis
201
+
202
+ ```
203
+ /grd:architect
204
+ ```
205
+
206
+ **Transform data insights into testable hypotheses.**
207
+
208
+ The Architect agent reads your DATA_REPORT.md and proposes hypotheses with:
209
+
210
+ - **Testable claims** — What you're trying to prove
211
+ - **Success metrics** — Weighted metrics with thresholds
212
+ - **Falsification criteria** — What would disprove the hypothesis
213
+ - **Baseline requirements** — What you're comparing against
214
+
215
+ The Architect collaborates iteratively — propose, explain reasoning, refine based on your feedback.
216
+
217
+ **Creates:** `.planning/OBJECTIVE.md`
218
+
219
+ ---
220
+
221
+ ### 3. Recursive Validation Loop
222
+
223
+ ```
224
+ /grd:research baseline
225
+ ```
226
+
227
+ **Implement experiments with automated skeptical review.**
228
+
229
+ The Researcher agent:
230
+
231
+ 1. **Creates isolated run** — `experiments/run_001_baseline/` with complete snapshot
232
+ 2. **Implements experiment** — Code, config, data references
233
+ 3. **Spawns Critic** — Automated skeptic reviews for logical errors, leakage, overfitting
234
+
235
+ The Critic returns one of four verdicts:
236
+
237
+ | Verdict | Meaning | Routing |
238
+ |---------|---------|---------|
239
+ | `PROCEED` | Logic sound, results align with data | → Evaluator |
240
+ | `REVISE_METHOD` | Logical error, bad hyperparams | → Back to Researcher |
241
+ | `REVISE_DATA` | Anomalous results, potential leakage | → Back to Explorer |
242
+ | `ESCALATE` | Ambiguous failure | → Human decision |
243
+
244
+ If `REVISE_METHOD`, continue with:
245
+ ```
246
+ /grd:research --continue
247
+ ```
248
+
249
+ The loop iterates until PROCEED (default limit: 5 iterations).
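The routing in the verdict table can be sketched as a small loop. This is a minimal illustration, assuming the four verdicts and the default limit of 5 described above; `Verdict`, `run_loop`, and the returned route names are hypothetical and not taken from the GRD agents.

```python
from enum import Enum


class Verdict(Enum):
    PROCEED = "PROCEED"
    REVISE_METHOD = "REVISE_METHOD"
    REVISE_DATA = "REVISE_DATA"
    ESCALATE = "ESCALATE"


def run_loop(critic_verdicts, iteration_limit=5):
    """Walk through Critic verdicts and return where the experiment is routed next."""
    for iteration, verdict in enumerate(critic_verdicts, start=1):
        if iteration > iteration_limit:
            break
        if verdict is Verdict.PROCEED:
            return "evaluator"
        if verdict is Verdict.REVISE_DATA:
            return "explorer"
        if verdict is Verdict.ESCALATE:
            return "human"
        # REVISE_METHOD: stay with the Researcher and try again
    return "human"  # limit reached (or verdicts exhausted) without PROCEED


print(run_loop([Verdict.REVISE_METHOD, Verdict.PROCEED]))  # prints evaluator
```

In this sketch, hitting the iteration limit is treated the same as `ESCALATE`: control passes to a human rather than looping further.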
250
+
251
+ **Creates:** `experiments/run_NNN/` with code, config, logs, `CRITIC_LOG.md`, `SCORECARD.json`
252
+
253
+ ---
254
+
255
+ ### 4. Human Evaluation Gate
256
+
257
+ ```
258
+ /grd:evaluate
259
+ ```
260
+
261
+ **Review evidence and make the final call.**
262
+
263
+ After the Critic approves and the Evaluator runs its benchmarks, you review the evidence package:
264
+
265
+ - **SCORECARD.json** — Quantitative metrics vs thresholds
266
+ - **CRITIC_LOG.md** — What passed validation and why
267
+ - **OBJECTIVE.md** — Original hypothesis for comparison
268
+ - **DATA_REPORT.md** — Data characteristics for context
269
+
270
+ Three decisions:
271
+
272
+ | Decision | Meaning | Next Step |
273
+ |----------|---------|-----------|
274
+ | **Seal** | Hypothesis validated | Ready for production/publication |
275
+ | **Iterate** | Continue experimenting | `/grd:research --continue` |
276
+ | **Archive** | Abandon hypothesis | Preserved as negative result |
277
+
278
+ Archived hypotheses are kept in `experiments/archive/` — negative results are valuable too.
279
+
280
+ **Creates:** `DECISION.md`, `human_eval/decision_log.md`
281
+
282
+ ---
283
+
284
+ ### 5. Notebook Graduation
285
+
286
+ ```
287
+ /grd:graduate notebooks/exploration/baseline.ipynb
288
+ ```
289
+
290
+ **Convert validated notebooks to production scripts.**
291
+
292
+ After a notebook passes Critic validation:
293
+
294
+ 1. **Validates requirements** — Random seeds set, parameters cell tagged
295
+ 2. **Converts to Python** — Via nbconvert with metadata header
296
+ 3. **Places in `src/experiments/`** — Ready for production use
297
+ 4. **Generates refactoring checklist** — Manual cleanup guide
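The first step, validating requirements, could look roughly like the check below. It is a sketch under stated assumptions: it reads the standard notebook JSON layout (`cells`, `cell_type`, `metadata.tags`, `source`), and `check_notebook` with its crude `seed(` text match is a hypothetical stand-in, not the Graduator's real validation.

```python
import json


def check_notebook(notebook_json):
    """Return a list of problems that would block graduation (empty means ready)."""
    nb = json.loads(notebook_json)
    code_cells = [c for c in nb.get("cells", []) if c.get("cell_type") == "code"]

    has_parameters_cell = any(
        "parameters" in cell.get("metadata", {}).get("tags", [])
        for cell in code_cells
    )
    # Crude text match; a real check would likely parse the code instead.
    sources = ["".join(cell.get("source", [])) for cell in code_cells]
    has_seed = any("seed(" in src for src in sources)

    problems = []
    if not has_parameters_cell:
        problems.append("no code cell tagged 'parameters'")
    if not has_seed:
        problems.append("no random seed call found")
    return problems


# Hypothetical two-cell notebook used only for illustration.
notebook = json.dumps({
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["parameters"]},
         "source": ["learning_rate = 0.01\n", "seed = 42\n"]},
        {"cell_type": "code", "metadata": {},
         "source": ["import random\n", "random.seed(seed)\n"]},
    ]
})
print(check_notebook(notebook))  # prints []
```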
298
+
299
+ **Creates:** `src/experiments/{script_name}.py`
300
+
301
+ ---
302
+
303
+ ### The Recursive Loop in Action
304
+
305
+ ```
306
+ /grd:explore ./data/ # Profile data
307
+ /grd:architect # Form hypothesis
308
+ /grd:research baseline # Implement + Critic review
309
+ → REVISE_METHOD # Critic finds issue
310
+ /grd:research --continue # Fix and retry
311
+ → PROCEED # Critic approves
312
+ /grd:evaluate # Human reviews evidence
313
+ → Seal # Hypothesis validated
314
+ /grd:graduate notebook.ipynb # Graduate to script
315
+ ```
316
+
317
+ The power is in the routing. If results contradict the data profile, `REVISE_DATA` sends you back to `/grd:explore`. The system is self-correcting.
318
+
319
+ ---
320
+
321
+ ## Why It Works
322
+
323
+ ### Data-First Philosophy
324
+
325
+ ML research fails when hypotheses aren't grounded in data reality. GRD enforces **data reconnaissance before hypothesis formation**:
326
+
327
+ 1. **Explorer** profiles your data — distributions, outliers, class balance, leakage risks
328
+ 2. **Architect** reads DATA_REPORT.md before proposing hypotheses
329
+ 3. **Critic** validates experiments against data characteristics
330
+
331
+ No more "95% accuracy" on shifted distributions. No more spending weeks on experiments with data leakage baked in.
332
+
333
+ ### Recursive Validation Loop
334
+
335
+ Research is non-linear. Results often invalidate assumptions. GRD's Critic agent has **three automated exit paths** (a fourth verdict, `ESCALATE`, defers ambiguous failures to a human decision):
336
+
337
+ | Exit Code | Meaning | Routing |
338
+ |-----------|---------|---------|
339
+ | `PROCEED` | Logic sound, aligns with data | → Evaluator → Human gate |
340
+ | `REVISE_METHOD` | Logical error, bad approach | → Back to Researcher |
341
+ | `REVISE_DATA` | Data quality concern | → Back to Explorer |
342
+
343
+ When results contradict the data profile, the system forces a return to the data layer. This is the core innovation over linear workflow tools.
344
+
345
+ ### Context Engineering
346
+
347
+ GRD structures context so Claude can reason effectively:
348
+
349
+ | Artifact | Purpose |
350
+ |----------|---------|
351
+ | `DATA_REPORT.md` | Living data profile — distributions, leakage warnings, anomalies |
352
+ | `OBJECTIVE.md` | Testable hypothesis — what, why, metrics, falsification criteria |
353
+ | `CRITIC_LOG.md` | Validation history — verdicts, confidence, recommendations |
354
+ | `SCORECARD.json` | Quantitative results — metrics vs thresholds, composite score |
355
+ | `experiments/run_NNN/` | Isolated snapshots — code, config, logs, outputs per iteration |
356
+
357
+ Each run is a complete, reproducible snapshot. No context rot. No lost experiments.
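To make "metrics vs thresholds, composite score" concrete, here is one way such a score could be computed. This is a sketch only: the weighted-mean formula, the `spec` layout, the assumption that higher is better for every metric, and the name `composite_score` are illustrative and are not the documented `SCORECARD.json` schema.

```python
def composite_score(metrics, spec):
    """Weighted mean of metric values, plus the names of metrics below threshold.

    `spec` maps metric name -> {"weight": float, "threshold": float}.
    Assumes higher is better for every metric.
    """
    total_weight = sum(entry["weight"] for entry in spec.values())
    weighted_sum = sum(metrics[name] * entry["weight"] for name, entry in spec.items())
    failed = sorted(
        name for name, entry in spec.items() if metrics[name] < entry["threshold"]
    )
    return weighted_sum / total_weight, failed


# Hypothetical metrics and thresholds.
spec = {
    "f1": {"weight": 0.5, "threshold": 0.5},
    "recall": {"weight": 0.5, "threshold": 0.75},
}
metrics = {"f1": 0.75, "recall": 0.5}
score, failed = composite_score(metrics, spec)
print(score, failed)  # prints 0.625 ['recall']
```

Reporting the failed metrics alongside the score matters because a healthy composite can hide a single metric that misses its threshold, as `recall` does here.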
358
+
359
+ ### Agent Roles
360
+
361
+ | Agent | Responsibility | Output |
362
+ |-------|----------------|--------|
363
+ | **Explorer** | Data reconnaissance, leakage detection | `DATA_REPORT.md` |
364
+ | **Architect** | Hypothesis synthesis, success criteria | `OBJECTIVE.md` |
365
+ | **Researcher** | Implementation, experiment execution | `experiments/run_NNN/` |
366
+ | **Critic** | Skeptical validation, routing decisions | `CRITIC_LOG.md` |
367
+ | **Evaluator** | Quantitative benchmarking | `SCORECARD.json` |
368
+ | **Graduator** | Notebook-to-script conversion | `src/experiments/` |
369
+
370
+ The Researcher spawns Critic automatically. You don't orchestrate — you just run `/grd:research` and the loop handles itself.
371
+
372
+ ### Human-in-the-Loop Gates
373
+
374
+ Automated skepticism catches obvious errors. But **humans make final calls**:
375
+
376
+ - **Low confidence PROCEED** — Critic shows concerns, you decide whether to continue
377
+ - **Iteration limit reached** — After 5 attempts, you review and choose direction
378
+ - **Evaluate gate** — You see full evidence package before Seal/Iterate/Archive
379
+
380
+ The system makes it **harder to deceive yourself**, not easier to ship models.
381
+
382
+ ### Negative Result Preservation
383
+
384
+ Failed hypotheses are valuable. When you Archive:
385
+
386
+ - Final run preserved in `experiments/archive/`
387
+ - `ARCHIVE_REASON.md` captures why it failed
388
+ - `ITERATION_SUMMARY.md` shows what was tried
389
+ - Future researchers won't repeat the same mistakes
390
+
391
+ Insufficient skepticism causes most ML research failures. GRD makes skepticism structural.
392
+
393
+ ---
394
+
395
+ ## Commands
396
+
397
+ ### Research Loop (Core Workflow)
398
+
399
+ | Command | What it does |
400
+ |---------|--------------|
401
+ | `/grd:explore [path]` | Data reconnaissance — profile distributions, detect leakage, identify anomalies |
402
+ | `/grd:architect [direction]` | Hypothesis synthesis — create testable OBJECTIVE.md with falsification criteria |
403
+ | `/grd:research [description]` | Recursive validation — implement experiment, Critic review, routing |
404
+ | `/grd:research --continue` | Continue after REVISE_METHOD verdict |
405
+ | `/grd:evaluate [run_name]` | Human decision gate — Seal / Iterate / Archive |
406
+ | `/grd:graduate <notebook>` | Graduate validated notebook to production script |
407
+
408
+ ### Project Setup
409
+
410
+ | Command | What it does |
411
+ |---------|--------------|
412
+ | `/grd:new-project` | Initialize project with questioning → research → requirements |
413
+ | `/grd:map-codebase` | Analyze existing codebase before new-project |
414
+
415
+ ### Navigation
416
+
417
+ | Command | What it does |
418
+ |---------|--------------|
419
+ | `/grd:progress` | Where am I? What's next? |
420
+ | `/grd:help` | Show all commands and usage guide |
421
+ | `/grd:update` | Update GRD with changelog preview |
422
+ | `/grd:join-discord` | Join the GRD Discord community |
423
+
424
+ ### Session
425
+
426
+ | Command | What it does |
427
+ |---------|--------------|
428
+ | `/grd:pause-work` | Create handoff when stopping mid-experiment |
429
+ | `/grd:resume-work` | Restore from last session |
430
+
431
+ ### Utilities
432
+
433
+ | Command | What it does |
434
+ |---------|--------------|
435
+ | `/grd:settings` | Configure model profile and workflow agents |
436
+ | `/grd:set-profile <profile>` | Switch model profile (quality/balanced/budget) |
437
+ | `/grd:add-todo [desc]` | Capture idea for later |
438
+ | `/grd:check-todos` | List pending todos |
439
+ | `/grd:debug [desc]` | Systematic debugging with persistent state |
440
+ | `/grd:quick` | Execute ad-hoc experiment with GRD guarantees |
441
+
442
+ ---
443
+
444
+ ## Configuration
445
+
446
+ GRD stores project settings in `.planning/config.json`. Configure during `/grd:new-project` or update later with `/grd:settings`.
447
+
448
+ ### Core Settings
449
+
450
+ | Setting | Options | Default | What it controls |
451
+ |---------|---------|---------|------------------|
452
+ | `mode` | `yolo`, `interactive` | `interactive` | Auto-approve vs confirm at each step |
453
+ | `iteration_limit` | 1-10 | `5` | Max Researcher → Critic loops before human gate |
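The defaults and ranges in this table can be expressed as a small validation step. The sketch assumes both settings sit at the top level of the JSON; the actual layout of `.planning/config.json` is not shown here, and `load_core_settings` is a hypothetical helper.

```python
import json

DEFAULTS = {"mode": "interactive", "iteration_limit": 5}
ALLOWED_MODES = {"yolo", "interactive"}


def load_core_settings(raw_json):
    """Merge user settings over the defaults and reject out-of-range values."""
    settings = {**DEFAULTS, **json.loads(raw_json)}

    if settings["mode"] not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")

    limit = settings["iteration_limit"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 10:
        raise ValueError("iteration_limit must be an integer from 1 to 10")

    return settings


print(load_core_settings('{"mode": "yolo"}'))
```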
454
+
455
+ ### Model Profiles
456
+
457
+ Control which Claude model each agent uses. Balance quality vs token spend.
458
+
459
+ | Profile | Explorer/Architect | Researcher | Critic/Evaluator |
460
+ |---------|-------------------|------------|------------------|
461
+ | `quality` | Opus | Opus | Sonnet |
462
+ | `balanced` (default) | Sonnet | Sonnet | Sonnet |
463
+ | `budget` | Sonnet | Haiku | Haiku |
464
+
465
+ Switch profiles:
466
+ ```
467
+ /grd:set-profile budget
468
+ ```
469
+
470
+ Or configure via `/grd:settings`.
471
+
472
+ ### Workflow Agents
473
+
474
+ | Setting | Default | What it does |
475
+ |---------|---------|--------------|
476
+ | `workflow.research` | `true` | Domain research before project setup |
477
+ | `workflow.plan_check` | `true` | Verifies experiment design before execution |
478
+ | `workflow.verifier` | `true` | Confirms hypothesis criteria after Critic approval |
479
+
480
+ ### Execution
481
+
482
+ | Setting | Default | What it controls |
483
+ |---------|---------|------------------|
484
+ | `commit_docs` | `true` | Track `.planning/` in git |
485
+
486
+ ---
487
+
488
+ ## Troubleshooting
489
+
490
+ **Commands not found after install?**
491
+ - Restart Claude Code to reload slash commands
492
+ - Verify files exist in `~/.claude/commands/grd/` (global) or `./.claude/commands/grd/` (local)
493
+
494
+ **Commands not working as expected?**
495
+ - Run `/grd:help` to verify installation
496
+ - Re-run `npx get-research-done` to reinstall
497
+
498
+ **Updating to the latest version?**
499
+ ```bash
500
+ npx get-research-done@latest
501
+ ```
502
+
503
+ **Using Docker or containerized environments?**
504
+
505
+ If file reads fail with tilde paths (`~/.claude/...`), set `CLAUDE_CONFIG_DIR` before installing:
506
+ ```bash
507
+ CLAUDE_CONFIG_DIR=/home/youruser/.claude npx get-research-done --global
508
+ ```
509
+ This ensures absolute paths are used instead of `~`, which may not expand correctly in containers.
510
+
511
+ ### Uninstalling
512
+
513
+ To remove GRD completely:
514
+
515
+ ```bash
516
+ # Global installs
517
+ npx get-research-done --claude --global --uninstall
518
+ npx get-research-done --opencode --global --uninstall
519
+
520
+ # Local installs (current project)
521
+ npx get-research-done --claude --local --uninstall
522
+ npx get-research-done --opencode --local --uninstall
523
+ ```
524
+
525
+ This removes all GRD commands, agents, hooks, and settings while preserving your other configurations.
526
+
527
+ ---
528
+
529
+ ## Community Ports
530
+
531
+ | Project | Platform | Description |
532
+ |---------|----------|-------------|
533
+ | [grd-opencode](https://github.com/rokicool/grd-opencode) | OpenCode | GRD adapted for OpenCode CLI |
534
+ | [grd-gemini](https://github.com/uberfuzzy/grd-gemini) | Gemini CLI | GRD adapted for Google's Gemini CLI |
535
+
536
+ ---
537
+
538
+ ## Star History
539
+
540
+ <a href="https://star-history.com/#glittercowboy/get-research-done&Date">
541
+ <picture>
542
+ <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=glittercowboy/get-research-done&type=Date&theme=dark" />
543
+ <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=glittercowboy/get-research-done&type=Date" />
544
+ <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=glittercowboy/get-research-done&type=Date" />
545
+ </picture>
546
+ </a>
547
+
548
+ ---
549
+
550
+ ## License
551
+
552
+ MIT License. See [LICENSE](LICENSE) for details.
553
+
554
+ ---
555
+
556
+ <div align="center">
557
+
558
+ **Claude Code is powerful. GRD makes ML research systematic.**
559
+
560
+ </div>