get-research-done 1.1.0

Files changed (127)
  1. package/LICENSE +21 -0
  2. package/README.md +560 -0
  3. package/agents/grd-architect.md +789 -0
  4. package/agents/grd-codebase-mapper.md +738 -0
  5. package/agents/grd-critic.md +1065 -0
  6. package/agents/grd-debugger.md +1203 -0
  7. package/agents/grd-evaluator.md +948 -0
  8. package/agents/grd-executor.md +784 -0
  9. package/agents/grd-explorer.md +2063 -0
  10. package/agents/grd-graduator.md +484 -0
  11. package/agents/grd-integration-checker.md +423 -0
  12. package/agents/grd-phase-researcher.md +641 -0
  13. package/agents/grd-plan-checker.md +745 -0
  14. package/agents/grd-planner.md +1386 -0
  15. package/agents/grd-project-researcher.md +865 -0
  16. package/agents/grd-research-synthesizer.md +256 -0
  17. package/agents/grd-researcher.md +2361 -0
  18. package/agents/grd-roadmapper.md +605 -0
  19. package/agents/grd-verifier.md +778 -0
  20. package/bin/install.js +1294 -0
  21. package/commands/grd/add-phase.md +207 -0
  22. package/commands/grd/add-todo.md +193 -0
  23. package/commands/grd/architect.md +283 -0
  24. package/commands/grd/audit-milestone.md +277 -0
  25. package/commands/grd/check-todos.md +228 -0
  26. package/commands/grd/complete-milestone.md +136 -0
  27. package/commands/grd/debug.md +169 -0
  28. package/commands/grd/discuss-phase.md +86 -0
  29. package/commands/grd/evaluate.md +1095 -0
  30. package/commands/grd/execute-phase.md +339 -0
  31. package/commands/grd/explore.md +258 -0
  32. package/commands/grd/graduate.md +323 -0
  33. package/commands/grd/help.md +482 -0
  34. package/commands/grd/insert-phase.md +227 -0
  35. package/commands/grd/insights.md +231 -0
  36. package/commands/grd/join-discord.md +18 -0
  37. package/commands/grd/list-phase-assumptions.md +50 -0
  38. package/commands/grd/map-codebase.md +71 -0
  39. package/commands/grd/new-milestone.md +721 -0
  40. package/commands/grd/new-project.md +1008 -0
  41. package/commands/grd/pause-work.md +134 -0
  42. package/commands/grd/plan-milestone-gaps.md +295 -0
  43. package/commands/grd/plan-phase.md +525 -0
  44. package/commands/grd/progress.md +364 -0
  45. package/commands/grd/quick-explore.md +236 -0
  46. package/commands/grd/quick.md +309 -0
  47. package/commands/grd/remove-phase.md +349 -0
  48. package/commands/grd/research-phase.md +200 -0
  49. package/commands/grd/research.md +681 -0
  50. package/commands/grd/resume-work.md +40 -0
  51. package/commands/grd/set-profile.md +106 -0
  52. package/commands/grd/settings.md +136 -0
  53. package/commands/grd/update.md +172 -0
  54. package/commands/grd/verify-work.md +219 -0
  55. package/get-research-done/config/default.json +15 -0
  56. package/get-research-done/references/checkpoints.md +1078 -0
  57. package/get-research-done/references/continuation-format.md +249 -0
  58. package/get-research-done/references/git-integration.md +254 -0
  59. package/get-research-done/references/model-profiles.md +73 -0
  60. package/get-research-done/references/planning-config.md +94 -0
  61. package/get-research-done/references/questioning.md +141 -0
  62. package/get-research-done/references/tdd.md +263 -0
  63. package/get-research-done/references/ui-brand.md +160 -0
  64. package/get-research-done/references/verification-patterns.md +612 -0
  65. package/get-research-done/templates/DEBUG.md +159 -0
  66. package/get-research-done/templates/UAT.md +247 -0
  67. package/get-research-done/templates/archive-reason.md +195 -0
  68. package/get-research-done/templates/codebase/architecture.md +255 -0
  69. package/get-research-done/templates/codebase/concerns.md +310 -0
  70. package/get-research-done/templates/codebase/conventions.md +307 -0
  71. package/get-research-done/templates/codebase/integrations.md +280 -0
  72. package/get-research-done/templates/codebase/stack.md +186 -0
  73. package/get-research-done/templates/codebase/structure.md +285 -0
  74. package/get-research-done/templates/codebase/testing.md +480 -0
  75. package/get-research-done/templates/config.json +35 -0
  76. package/get-research-done/templates/context.md +283 -0
  77. package/get-research-done/templates/continue-here.md +78 -0
  78. package/get-research-done/templates/critic-log.md +288 -0
  79. package/get-research-done/templates/data-report.md +173 -0
  80. package/get-research-done/templates/debug-subagent-prompt.md +91 -0
  81. package/get-research-done/templates/decision-log.md +58 -0
  82. package/get-research-done/templates/decision.md +138 -0
  83. package/get-research-done/templates/discovery.md +146 -0
  84. package/get-research-done/templates/experiment-readme.md +104 -0
  85. package/get-research-done/templates/graduated-script.md +180 -0
  86. package/get-research-done/templates/iteration-summary.md +234 -0
  87. package/get-research-done/templates/milestone-archive.md +123 -0
  88. package/get-research-done/templates/milestone.md +115 -0
  89. package/get-research-done/templates/objective.md +271 -0
  90. package/get-research-done/templates/phase-prompt.md +567 -0
  91. package/get-research-done/templates/planner-subagent-prompt.md +117 -0
  92. package/get-research-done/templates/project.md +184 -0
  93. package/get-research-done/templates/requirements.md +231 -0
  94. package/get-research-done/templates/research-project/ARCHITECTURE.md +204 -0
  95. package/get-research-done/templates/research-project/FEATURES.md +147 -0
  96. package/get-research-done/templates/research-project/PITFALLS.md +200 -0
  97. package/get-research-done/templates/research-project/STACK.md +120 -0
  98. package/get-research-done/templates/research-project/SUMMARY.md +170 -0
  99. package/get-research-done/templates/research.md +529 -0
  100. package/get-research-done/templates/roadmap.md +202 -0
  101. package/get-research-done/templates/scorecard.json +113 -0
  102. package/get-research-done/templates/state.md +287 -0
  103. package/get-research-done/templates/summary.md +246 -0
  104. package/get-research-done/templates/user-setup.md +311 -0
  105. package/get-research-done/templates/verification-report.md +322 -0
  106. package/get-research-done/workflows/complete-milestone.md +756 -0
  107. package/get-research-done/workflows/diagnose-issues.md +231 -0
  108. package/get-research-done/workflows/discovery-phase.md +289 -0
  109. package/get-research-done/workflows/discuss-phase.md +433 -0
  110. package/get-research-done/workflows/execute-phase.md +657 -0
  111. package/get-research-done/workflows/execute-plan.md +1844 -0
  112. package/get-research-done/workflows/list-phase-assumptions.md +178 -0
  113. package/get-research-done/workflows/map-codebase.md +322 -0
  114. package/get-research-done/workflows/resume-project.md +307 -0
  115. package/get-research-done/workflows/transition.md +556 -0
  116. package/get-research-done/workflows/verify-phase.md +628 -0
  117. package/get-research-done/workflows/verify-work.md +596 -0
  118. package/hooks/dist/grd-check-update.js +61 -0
  119. package/hooks/dist/grd-statusline.js +84 -0
  120. package/package.json +47 -0
  121. package/scripts/audit-help-commands.sh +115 -0
  122. package/scripts/build-hooks.js +42 -0
  123. package/scripts/verify-all-commands.sh +246 -0
  124. package/scripts/verify-architect-warning.sh +35 -0
  125. package/scripts/verify-insights-mode.sh +40 -0
  126. package/scripts/verify-quick-mode.sh +20 -0
  127. package/scripts/verify-revise-data-routing.sh +139 -0
package/get-research-done/templates/roadmap.md
@@ -0,0 +1,202 @@
1
+ # Roadmap Template
2
+
3
+ Template for `.planning/ROADMAP.md`.
4
+
5
+ ## Initial Roadmap (v1.0 Greenfield)
6
+
7
+ ```markdown
8
+ # Roadmap: [Project Name]
9
+
10
+ ## Overview
11
+
12
+ [One paragraph describing the journey from start to finish]
13
+
14
+ ## Phases
15
+
16
+ **Phase Numbering:**
17
+ - Integer phases (1, 2, 3): Planned milestone work
18
+ - Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
19
+
20
+ Decimal phases appear between their surrounding integers in numeric order.
21
+
22
+ - [ ] **Phase 1: [Name]** - [One-line description]
23
+ - [ ] **Phase 2: [Name]** - [One-line description]
24
+ - [ ] **Phase 3: [Name]** - [One-line description]
25
+ - [ ] **Phase 4: [Name]** - [One-line description]
26
+
27
+ ## Phase Details
28
+
29
+ ### Phase 1: [Name]
30
+ **Goal**: [What this phase delivers]
31
+ **Depends on**: Nothing (first phase)
32
+ **Requirements**: [REQ-01, REQ-02, REQ-03]
33
+ **Success Criteria** (what must be TRUE):
34
+ 1. [Observable behavior from user perspective]
35
+ 2. [Observable behavior from user perspective]
36
+ 3. [Observable behavior from user perspective]
37
+ **Plans**: [Number of plans, e.g., "3 plans" or "TBD"]
38
+
39
+ Plans:
40
+ - [ ] 01-01: [Brief description of first plan]
41
+ - [ ] 01-02: [Brief description of second plan]
42
+ - [ ] 01-03: [Brief description of third plan]
43
+
44
+ ### Phase 2: [Name]
45
+ **Goal**: [What this phase delivers]
46
+ **Depends on**: Phase 1
47
+ **Requirements**: [REQ-04, REQ-05]
48
+ **Success Criteria** (what must be TRUE):
49
+ 1. [Observable behavior from user perspective]
50
+ 2. [Observable behavior from user perspective]
51
+ **Plans**: [Number of plans]
52
+
53
+ Plans:
54
+ - [ ] 02-01: [Brief description]
55
+ - [ ] 02-02: [Brief description]
56
+
57
+ ### Phase 2.1: Critical Fix (INSERTED)
58
+ **Goal**: [Urgent work inserted between phases]
59
+ **Depends on**: Phase 2
60
+ **Success Criteria** (what must be TRUE):
61
+ 1. [What the fix achieves]
62
+ **Plans**: 1 plan
63
+
64
+ Plans:
65
+ - [ ] 02.1-01: [Description]
66
+
67
+ ### Phase 3: [Name]
68
+ **Goal**: [What this phase delivers]
69
+ **Depends on**: Phase 2
70
+ **Requirements**: [REQ-06, REQ-07, REQ-08]
71
+ **Success Criteria** (what must be TRUE):
72
+ 1. [Observable behavior from user perspective]
73
+ 2. [Observable behavior from user perspective]
74
+ 3. [Observable behavior from user perspective]
75
+ **Plans**: [Number of plans]
76
+
77
+ Plans:
78
+ - [ ] 03-01: [Brief description]
79
+ - [ ] 03-02: [Brief description]
80
+
81
+ ### Phase 4: [Name]
82
+ **Goal**: [What this phase delivers]
83
+ **Depends on**: Phase 3
84
+ **Requirements**: [REQ-09, REQ-10]
85
+ **Success Criteria** (what must be TRUE):
86
+ 1. [Observable behavior from user perspective]
87
+ 2. [Observable behavior from user perspective]
88
+ **Plans**: [Number of plans]
89
+
90
+ Plans:
91
+ - [ ] 04-01: [Brief description]
92
+
93
+ ## Progress
94
+
95
+ **Execution Order:**
96
+ Phases execute in numeric order: 1 → 2 → 2.1 → 2.2 → 3 → 3.1 → 4
97
+
98
+ | Phase | Plans Complete | Status | Completed |
99
+ |-------|----------------|--------|-----------|
100
+ | 1. [Name] | 0/3 | Not started | - |
101
+ | 2. [Name] | 0/2 | Not started | - |
102
+ | 3. [Name] | 0/2 | Not started | - |
103
+ | 4. [Name] | 0/1 | Not started | - |
104
+ ```
105
+
106
+ <guidelines>
107
+ **Initial planning (v1.0):**
108
+ - Phase count depends on depth setting (quick: 3-5, standard: 5-8, comprehensive: 8-12)
109
+ - Each phase delivers something coherent
110
+ - Phases can have 1+ plans (split if >3 tasks or multiple subsystems)
111
+ - Plans use naming: {phase}-{plan}-PLAN.md (e.g., 01-02-PLAN.md)
112
+ - No time estimates (this isn't enterprise PM)
113
+ - Progress table updated by execute workflow
114
+ - Plan count can be "TBD" initially, refined during planning
115
+
116
+ **Success criteria:**
117
+ - 2-5 observable behaviors per phase (from user's perspective)
118
+ - Cross-checked against requirements during roadmap creation
119
+ - Flow downstream to `must_haves` in plan-phase
120
+ - Verified by verify-phase after execution
121
+ - Format: "User can [action]" or "[Thing] works/exists"
122
+
123
+ **After milestones ship:**
124
+ - Collapse completed milestones in `<details>` tags
125
+ - Add new milestone sections for upcoming work
126
+ - Keep continuous phase numbering (never restart at 01)
127
+ </guidelines>
128
+
129
+ <status_values>
130
+ - `Not started` - Haven't begun
131
+ - `In progress` - Currently working
132
+ - `Complete` - Done (add completion date)
133
+ - `Deferred` - Pushed to later (with reason)
134
+ </status_values>
135
+
136
+ ## Milestone-Grouped Roadmap (After v1.0 Ships)
137
+
138
+ After completing the first milestone, reorganize the roadmap into milestone groupings:
139
+
140
+ ```markdown
141
+ # Roadmap: [Project Name]
142
+
143
+ ## Milestones
144
+
145
+ - ✅ **v1.0 MVP** - Phases 1-4 (shipped YYYY-MM-DD)
146
+ - 🚧 **v1.1 [Name]** - Phases 5-6 (in progress)
147
+ - 📋 **v2.0 [Name]** - Phases 7-10 (planned)
148
+
149
+ ## Phases
150
+
151
+ <details>
152
+ <summary>✅ v1.0 MVP (Phases 1-4) - SHIPPED YYYY-MM-DD</summary>
153
+
154
+ ### Phase 1: [Name]
155
+ **Goal**: [What this phase delivers]
156
+ **Plans**: 3 plans
157
+
158
+ Plans:
159
+ - [x] 01-01: [Brief description]
160
+ - [x] 01-02: [Brief description]
161
+ - [x] 01-03: [Brief description]
162
+
163
+ [... remaining v1.0 phases ...]
164
+
165
+ </details>
166
+
167
+ ### 🚧 v1.1 [Name] (In Progress)
168
+
169
+ **Milestone Goal:** [What v1.1 delivers]
170
+
171
+ #### Phase 5: [Name]
172
+ **Goal**: [What this phase delivers]
173
+ **Depends on**: Phase 4
174
+ **Plans**: 2 plans
175
+
176
+ Plans:
177
+ - [ ] 05-01: [Brief description]
178
+ - [ ] 05-02: [Brief description]
179
+
180
+ [... remaining v1.1 phases ...]
181
+
182
+ ### 📋 v2.0 [Name] (Planned)
183
+
184
+ **Milestone Goal:** [What v2.0 delivers]
185
+
186
+ [... v2.0 phases ...]
187
+
188
+ ## Progress
189
+
190
+ | Phase | Milestone | Plans Complete | Status | Completed |
191
+ |-------|-----------|----------------|--------|-----------|
192
+ | 1. Foundation | v1.0 | 3/3 | Complete | YYYY-MM-DD |
193
+ | 2. Features | v1.0 | 2/2 | Complete | YYYY-MM-DD |
194
+ | 5. Security | v1.1 | 0/2 | Not started | - |
195
+ ```
196
+
197
+ **Notes:**
198
+ - Milestone emoji: ✅ shipped, 🚧 in progress, 📋 planned
199
+ - Completed milestones collapsed in `<details>` for readability
200
+ - Current/future milestones expanded
201
+ - Continuous phase numbering (01-99)
202
+ - Progress table includes milestone column
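
The phase-numbering rule in the roadmap template (decimal phases slot between their surrounding integers) and the `{phase}-{plan}-PLAN.md` naming convention can be sketched in a few lines of Python. This is an illustration only, not code from the package: `phase_sort_key` and `plan_filename` are hypothetical helper names, the zero-padding is inferred from the template's `01-02` and `02.1-01` examples, and the decimal part is assumed to be a sequence number (so 2.10 would sort after 2.2) rather than a fraction.

```python
def phase_sort_key(phase: str) -> tuple[int, int]:
    """Split a phase id such as '2.1' into (2, 1); integer phases get minor 0."""
    major, _, minor = phase.partition(".")
    return int(major), int(minor) if minor else 0


def plan_filename(phase: str, plan: int) -> str:
    """Build a plan file name such as '01-02-PLAN.md' or '02.1-01-PLAN.md'."""
    major, dot, minor = phase.partition(".")
    # Pad the integer part and the plan number to two digits (assumed convention).
    return f"{int(major):02d}{dot}{minor}-{plan:02d}-PLAN.md"


# Phases listed out of order, including inserted decimal phases.
phases = ["3", "2.2", "1", "2", "4", "2.1", "3.1"]
ordered = sorted(phases, key=phase_sort_key)
print(" → ".join(ordered))
print(plan_filename("1", 2))
print(plan_filename("2.1", 1))
```

Sorting on an integer pair rather than on `float(phase)` keeps the ordering stable if a phase ever receives ten or more insertions.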
package/get-research-done/templates/scorecard.json
@@ -0,0 +1,113 @@
1
+ {
2
+ "$schema": "http://json-schema.org/draft-07/schema#",
3
+ "$id": "grd-scorecard-v1",
4
+ "title": "GRD Experiment Scorecard",
5
+ "description": "Quantitative evaluation results for a validated experiment run",
6
+
7
+ "run_id": "{{run_NNN_description}}",
8
+ "timestamp": "{{ISO8601_timestamp}}",
9
+ "objective_ref": ".planning/OBJECTIVE.md",
10
+ "hypothesis": "{{brief_hypothesis_statement}}",
11
+ "iteration": "{{iteration_number}}",
12
+ "data_version": "{{sha256_hash_of_data}}",
13
+
14
+ "evaluation": {
15
+ "strategy": "{{k-fold|stratified-k-fold|time-series-split|holdout}}",
16
+ "k": "{{number_of_folds_or_null}}",
17
+ "test_size": "{{proportion_or_null}}",
18
+ "random_state": 42,
19
+ "folds_completed": "{{number_of_folds_executed}}"
20
+ },
21
+
22
+ "metrics": {
23
+ "{{metric_name}}": {
24
+ "mean": "{{float_mean_across_folds}}",
25
+ "std": "{{float_std_across_folds}}",
26
+ "per_fold": ["{{fold_1_value}}", "{{fold_2_value}}", "..."],
27
+ "threshold": "{{threshold_from_objective}}",
28
+ "comparison": "{{>=|<=|==}}",
29
+ "weight": "{{0.0-1.0_from_objective}}",
30
+ "result": "{{PASS|FAIL}}"
31
+ },
32
+ "{{additional_metrics}}": {
33
+ "...": "..."
34
+ }
35
+ },
36
+
37
+ "composite_score": "{{weighted_average_of_all_metrics}}",
38
+ "composite_threshold": "{{from_objective_or_default_0.5}}",
39
+ "overall_result": "{{PASS|FAIL}}",
40
+
41
+ "baseline_comparison": {
42
+ "experiment_score": "{{weighted_composite_score}}",
43
+ "baselines": [
44
+ {
45
+ "name": "{{primary_baseline_name}}",
46
+ "type": "primary",
47
+ "source": "{{own_implementation|literature_citation}}",
48
+ "score": "{{baseline_composite_score}}",
49
+ "experiment_score": "{{experiment_composite_score}}",
50
+ "improvement": "{{float_experiment_minus_baseline}}",
51
+ "improvement_pct": "{{percentage_string}}",
52
+ "significant": "{{true|false|not_tested}}",
53
+ "run_path": "{{experiments/run_NNN_baseline/}}",
54
+ "note": "{{optional_note_for_literature_baselines}}"
55
+ },
56
+ {
57
+ "name": "{{secondary_baseline_name}}",
58
+ "type": "secondary",
59
+ "source": "{{own_implementation|literature_citation}}",
60
+ "score": "{{baseline_composite_score}}",
61
+ "improvement": "{{float_improvement}}",
62
+ "improvement_pct": "{{percentage_string}}",
63
+ "significant": "{{true|false|not_tested}}",
64
+ "run_path": "{{experiments/run_NNN_secondary/|null_if_literature}}"
65
+ }
66
+ ],
67
+ "primary_baseline": "{{primary_baseline_name_or_null}}",
68
+ "secondary_baselines": ["{{secondary_name_1}}", "{{secondary_name_2}}"],
69
+ "warnings": ["{{warning_if_any_baseline_unavailable}}"]
70
+ },
71
+
72
+ "baseline_validation": {
73
+ "researcher_validated": "{{true|false}}",
74
+ "evaluator_validated": "{{true|false}}",
75
+ "validation_skipped": "{{true_if_skip_baseline_used|false}}",
76
+ "data_hash_match": "{{true_if_same_data|false_with_warning}}",
77
+ "notes": ["{{validation_notes_if_any}}"]
78
+ },
79
+
80
+ "confidence_interval": {
81
+ "composite_lower": "{{float_lower_bound}}",
82
+ "composite_upper": "{{float_upper_bound}}",
83
+ "confidence_level": 0.95,
84
+ "method": "{{bootstrap|t_distribution}}"
85
+ },
86
+
87
+ "provenance": {
88
+ "code_snapshot": "experiments/{{run_id}}/code/",
89
+ "config_file": "experiments/{{run_id}}/config.yaml",
90
+ "logs": "experiments/{{run_id}}/logs/",
91
+ "outputs": "experiments/{{run_id}}/outputs/"
92
+ },
93
+
94
+ "critic_summary": {
95
+ "verdict": "PROCEED",
96
+ "confidence": "{{HIGH|MEDIUM|LOW}}",
97
+ "log_path": "experiments/{{run_id}}/CRITIC_LOG.md"
98
+ },
99
+
100
+ "ready_for_human_review": true,
101
+ "next_phase": "Phase 5: Human Evaluation Gate",
102
+
103
+ "_notes": {
104
+ "description": "This template shows the structure and placeholder format for SCORECARD.json",
105
+ "usage": "grd-evaluator agent populates this template with actual evaluation results",
106
+ "weights_constraint": "All metric weights must sum to 1.0",
107
+ "baseline_structure": "baselines array supports multiple comparisons: first is primary (required), rest are secondary (optional)",
108
+ "baseline_types": "primary = required for experiment to proceed; secondary = optional additional comparisons",
109
+ "baseline_validation": "Tracks both Researcher (start) and Evaluator (end) validation states",
110
+ "mlflow_integration": "These fields are also logged to MLflow if available",
111
+ "phase_5_requirement": "ready_for_human_review: true signals Phase 5 can proceed"
112
+ }
113
+ }
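
The scorecard's `_notes` state that metric weights must sum to 1.0, and the placeholders describe `composite_score` as a weighted average compared against `composite_threshold` (default 0.5). The sketch below shows one way those rules might be applied. It is not the grd-evaluator's actual logic: the function names, the float tolerance, the use of each metric's `mean` as its score, and the `>=` test for the overall result are all assumptions, and the metric values are made up for illustration.

```python
import math
import operator

# Comparison strings allowed by the template's "comparison" field.
COMPARATORS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


def metric_result(mean: float, threshold: float, comparison: str) -> str:
    """Return 'PASS' or 'FAIL' for a single metric."""
    return "PASS" if COMPARATORS[comparison](mean, threshold) else "FAIL"


def composite_score(metrics: dict) -> float:
    """Weighted sum of metric means; rejects weights that do not sum to 1.0."""
    total_weight = sum(m["weight"] for m in metrics.values())
    if not math.isclose(total_weight, 1.0, abs_tol=1e-9):
        raise ValueError(f"metric weights sum to {total_weight}, expected 1.0")
    return sum(m["mean"] * m["weight"] for m in metrics.values())


# Hypothetical metrics, shaped like entries of the template's "metrics" object.
metrics = {
    "accuracy": {"mean": 0.85, "threshold": 0.80, "comparison": ">=", "weight": 0.75},
    "f1": {"mean": 0.70, "threshold": 0.75, "comparison": ">=", "weight": 0.25},
}

per_metric = {
    name: metric_result(m["mean"], m["threshold"], m["comparison"])
    for name, m in metrics.items()
}
score = composite_score(metrics)
composite_threshold = 0.5  # default named in the template
overall_result = "PASS" if score >= composite_threshold else "FAIL"
print(per_metric, round(score, 4), overall_result)
```

A weighted sum of raw means is only meaningful when the metrics share a comparable scale; the template does not say whether values are normalized first, so treat that as an open question.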
package/get-research-done/templates/state.md
@@ -0,0 +1,287 @@
1
+ # State Template
2
+
3
+ <!-- STATE.md template v2.0 - GRD research loop tracking -->
4
+
5
+ Template for `.planning/STATE.md` — the project's living memory.
6
+
7
+ ---
8
+
9
+ ## File Template
10
+
11
+ ```markdown
12
+ # Project State
13
+
14
+ ## Project Reference
15
+
16
+ See: .planning/PROJECT.md (updated [date])
17
+
18
+ **Core value:** [One-liner from PROJECT.md Core Value section]
19
+ **Current focus:** [Current phase name]
20
+
21
+ ## Current Position
22
+
23
+ Phase: [X] of [Y] ([Phase name])
24
+ Plan: [A] of [B] in current phase
25
+ Status: [Ready to plan / Planning / Ready to execute / In progress / Phase complete]
26
+ Last activity: [YYYY-MM-DD] — [What happened]
27
+
28
+ Progress: [░░░░░░░░░░] 0%
29
+
30
+ ## Research Loop State
31
+
32
+ **Active Hypothesis:** {{hypothesis_id_or_none}}
33
+ **Objective:** {{brief_hypothesis_statement}}
34
+ **Status:** {{not_started|in_progress|pending_review|archived}}
35
+
36
+ ### Current Iteration
37
+
38
+ - **Iteration:** {{N}} of {{limit}} (default limit: 5)
39
+ - **Current Run:** experiments/{{run_NNN_description}}
40
+ - **Phase:** {{researcher|critic|evaluator|human_review}}
41
+ - **Data Revisions:** {{data_revision_count}} of {{data_revision_limit}} (default limit: 2)
42
+
43
+ ### Loop History
44
+
45
+ | Iteration | Run | Verdict | Confidence | Metrics Summary |
46
+ |-----------|-----|---------|------------|-----------------|
47
+ | 1 | run_001_baseline | REVISE_METHOD | MEDIUM | acc=0.72 |
48
+ | 2 | run_002_tuned | PROCEED | HIGH | acc=0.85 |
49
+
50
+ ### Verdict Trend
51
+
52
+ - **Pattern:** {{improving|stagnant|degrading|mixed}}
53
+ - **Consecutive same verdicts:** {{N}}
54
+ - **Last 3 verdicts:** {{verdict1, verdict2, verdict3}}
55
+
56
+ ### Human Decisions
57
+
58
+ | Timestamp | Decision | Rationale |
59
+ |-----------|----------|-----------|
60
+ | {{timestamp}} | {{Continue|Archive|Reset|Escalate}} | {{user_rationale}} |
61
+
62
+ ### Data Revisions
63
+
64
+ Track REVISE_DATA cycles within current hypothesis:
65
+
66
+ | Iteration | Concerns | Explorer Result | Action Taken |
67
+ |-----------|----------|-----------------|--------------|
68
+ | {{N}} | {{concern_list}} | {{result_summary}} | {{action}} |
69
+
70
+ **Data Revision Limits:**
71
+ - Current count: {{data_revision_count}} of {{data_revision_limit}}
72
+ - If limit reached: Escalate to human (data quality may be insufficient for hypothesis)
73
+
74
+ ## Performance Metrics
75
+
76
+ **Velocity:**
77
+ - Total plans completed: [N]
78
+ - Average duration: [X] min
79
+ - Total execution time: [X.X] hours
80
+
81
+ **By Phase:**
82
+
83
+ | Phase | Plans | Total | Avg/Plan |
84
+ |-------|-------|-------|----------|
85
+ | - | - | - | - |
86
+
87
+ **Recent Trend:**
88
+ - Last 5 plans: [durations]
89
+ - Trend: [Improving / Stable / Degrading]
90
+
91
+ *Updated after each plan completion*
92
+
93
+ ## Accumulated Context
94
+
95
+ ### Decisions
96
+
97
+ Decisions are logged in PROJECT.md Key Decisions table.
98
+ Recent decisions affecting current work:
99
+
100
+ - [Phase X]: [Decision summary]
101
+ - [Phase Y]: [Decision summary]
102
+
103
+ ### Research Decisions
104
+
105
+ | Decision | Iteration | Impact |
106
+ |----------|-----------|--------|
107
+ | {{decision_description}} | {{N}} | {{what_changed}} |
108
+
109
+ ### Pending Todos
110
+
111
+ [From .planning/todos/pending/ — ideas captured during sessions]
112
+
113
+ None yet.
114
+
115
+ ### Blockers/Concerns
116
+
117
+ [Issues that affect future work]
118
+
119
+ None yet.
120
+
121
+ ### Research Blockers
122
+
123
+ - **Current:** {{blocker_or_none}}
124
+ - **Requires:** {{human_action|data_fix|method_change}}
125
+
131
+ ## Session Continuity
132
+
133
+ Last session: [YYYY-MM-DD HH:MM]
134
+ Stopped at: [Description of last completed action]
135
+ Resume file: [Path to .continue-here*.md if exists, otherwise "None"]
136
+
137
+ ## Research Loop History
138
+
139
+ **Active Loop:** [N/A - no active research loop]
140
+ **Loop Status:** [idle/exploring/synthesizing/validating/complete]
141
+
142
+ | Loop | Started | Focus Area | Status | Outcome |
143
+ |------|---------|------------|--------|---------|
144
+ | - | - | - | - | - |
145
+
146
+ **Current Loop Progress:**
147
+ - [ ] Data reconnaissance (Explorer)
148
+ - [ ] Hypothesis synthesis (Architect)
149
+ - [ ] Implementation (Researcher)
150
+ - [ ] Validation (Critic)
151
+ - [ ] Evaluation (Evaluator)
152
+
153
+ **Loop Notes:**
154
+ _Notes from current research iteration appear here_
155
+ ```
156
+
157
+ <purpose>
158
+
159
+ STATE.md is the project's short-term memory spanning all phases and sessions.
160
+
161
+ **Problem it solves:** Information is captured in summaries, issues, and decisions but not systematically consumed. Sessions start without context.
162
+
163
+ **Solution:** A single, small file that:
164
+ - Is read first in every workflow
165
+ - Is updated after every significant action
166
+ - Contains a digest of accumulated context
167
+ - Enables instant session restoration
168
+
169
+ </purpose>
170
+
171
+ <lifecycle>
172
+
173
+ **Creation:** After ROADMAP.md is created (during init)
174
+ - Reference PROJECT.md (read it for current context)
175
+ - Initialize empty accumulated context sections
176
+ - Set position to "Phase 1 ready to plan"
177
+
178
+ **Reading:** First step of every workflow
179
+ - progress: Present status to user
180
+ - plan: Inform planning decisions
181
+ - execute: Know current position
182
+ - transition: Know what's complete
183
+
184
+ **Writing:** After every significant action
185
+ - execute: After SUMMARY.md created
186
+ - Update position (phase, plan, status)
187
+ - Note new decisions (detail in PROJECT.md)
188
+ - Add blockers/concerns
189
+ - transition: After phase marked complete
190
+ - Update progress bar
191
+ - Clear resolved blockers
192
+ - Refresh Project Reference date
193
+
194
+ </lifecycle>
195
+
196
+ <sections>
197
+
198
+ ### Project Reference
199
+ Points to PROJECT.md for full context. Includes:
200
+ - Core value (the ONE thing that matters)
201
+ - Current focus (which phase)
202
+ - Last update date (triggers re-read if stale)
203
+
204
+ Claude reads PROJECT.md directly for requirements, constraints, and decisions.
205
+
206
+ ### Current Position
207
+ Where we are right now:
208
+ - Phase X of Y — which phase
209
+ - Plan A of B — which plan within phase
210
+ - Status — current state
211
+ - Last activity — what happened most recently
212
+ - Progress bar — visual indicator of overall completion
213
+
214
+ Progress calculation: (completed plans) / (total plans across all phases) × 100%
215
+
216
+ ### Performance Metrics
217
+ Track velocity to understand execution patterns:
218
+ - Total plans completed
219
+ - Average duration per plan
220
+ - Per-phase breakdown
221
+ - Recent trend (improving/stable/degrading)
222
+
223
+ Updated after each plan completion.
224
+
225
+ ### Accumulated Context
226
+
227
+ **Decisions:** Reference to PROJECT.md Key Decisions table, plus recent decisions summary for quick access. Full decision log lives in PROJECT.md.
228
+
229
+ **Pending Todos:** Ideas captured via /grd:add-todo
230
+ - Count of pending todos
231
+ - Reference to .planning/todos/pending/
232
+ - Brief list if few, count if many (e.g., "5 pending todos — see /grd:check-todos")
233
+
234
+ **Blockers/Concerns:** From "Next Phase Readiness" sections
235
+ - Issues that affect future work
236
+ - Prefix with originating phase
237
+ - Cleared when addressed
238
+
239
+ ### Session Continuity
240
+ Enables instant resumption:
241
+ - When the last session took place
242
+ - What was completed last
243
+ - Whether a .continue-here file exists to resume from
244
+
245
+ ### Research Loop History
246
+ Tracks recursive validation cycles (STATE-01 requirement):
247
+ - **Active Loop**: Which research loop is currently running (or N/A)
248
+ - **Loop Status**: Current stage (idle/exploring/synthesizing/validating/complete)
249
+ - **Loop Table**: History of completed and ongoing loops with outcomes
250
+ - **Current Loop Progress**: Checklist tracking which agents have contributed
251
+ - **Loop Notes**: Insights, decisions, and findings from the current iteration
252
+
253
+ When a research loop starts (future phases), this section tracks:
254
+ - Explorer's data reconnaissance
255
+ - Architect's hypothesis synthesis
256
+ - Researcher's implementation
257
+ - Critic's validation challenges
258
+ - Evaluator's metric assessments
259
+
260
+ This enables the recursive "hypothesis → experiment → validate → refine" cycle that distinguishes GRD from linear development workflows.
261
+
262
+ ### Data Revisions Table
263
+
264
+ Tracks REVISE_DATA cycles within the current hypothesis:
265
+ - **Iteration**: Which experiment iteration triggered data revision
266
+ - **Concerns**: Summary of data concerns from Critic (truncated)
267
+ - **Explorer Result**: Outcome of re-analysis (addressed, critical issue, etc.)
268
+ - **Action Taken**: What happened next (loop continues, escalated, etc.)
269
+
270
+ Data revisions are tracked separately from method revisions because:
271
+ - Data issues are more fundamental than hyperparameter tuning
272
+ - Lower limit (default 2) prevents infinite data loops
273
+ - Multiple data revisions suggest hypothesis may not be viable with current data
274
+
275
+ </sections>
276
+
277
+ <size_constraint>
278
+
279
+ Keep STATE.md under 100 lines.
280
+
281
+ It's a DIGEST, not an archive. If accumulated context grows too large:
282
+ - Keep only 3-5 recent decisions in summary (full log in PROJECT.md)
283
+ - Keep only active blockers, remove resolved ones
284
+
285
+ The goal is "read once, know where we are" — if it's too long, that fails.
286
+
287
+ </size_constraint>
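
The "Current Position" section of the state template defines progress as completed plans divided by total plans across all phases, shown as a ten-cell bar. A minimal Python sketch of that calculation follows. The function name `progress_line`, the floor rounding, and the `█` glyph for filled cells are assumptions; the template itself only shows the empty `░` bar at 0%.

```python
def progress_line(completed_plans: int, total_plans: int, width: int = 10) -> str:
    """Render the STATE.md progress line from plan counts."""
    if total_plans <= 0:
        percent = 0  # nothing planned yet, so report 0% rather than divide by zero
    else:
        percent = 100 * completed_plans // total_plans  # floor to a whole percent
    filled = width * percent // 100
    return f"Progress: [{'█' * filled}{'░' * (width - filled)}] {percent}%"


print(progress_line(0, 8))
print(progress_line(5, 8))
print(progress_line(8, 8))
```

Integer arithmetic is used throughout so the displayed percentage never depends on floating-point rounding.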