groove-dev 0.27.134 → 0.27.136
- package/moe-training/client/domain-tagger.js +1 -1
- package/moe-training/scripts/retag-delegate-yield.js +303 -0
- package/moe-training/test/shared/envelope-schema.test.js +3 -3
- package/node_modules/@groove-dev/cli/package.json +1 -1
- package/node_modules/@groove-dev/daemon/package.json +1 -1
- package/node_modules/@groove-dev/daemon/src/adaptive.js +77 -0
- package/node_modules/@groove-dev/daemon/src/api.js +35 -5
- package/node_modules/@groove-dev/daemon/src/journalist.js +28 -12
- package/node_modules/@groove-dev/daemon/src/model-lab.js +53 -76
- package/node_modules/@groove-dev/daemon/src/process.js +91 -2
- package/node_modules/@groove-dev/daemon/src/rotator.js +45 -3
- package/node_modules/@groove-dev/gui/dist/assets/{index-Dozp69tK.js → index-BrZHF7pK.js} +1770 -1766
- package/node_modules/@groove-dev/gui/dist/assets/index-DIfiwdKl.css +1 -0
- package/node_modules/@groove-dev/gui/dist/index.html +2 -2
- package/node_modules/@groove-dev/gui/package.json +1 -1
- package/node_modules/@groove-dev/gui/src/components/agents/agent-chat.jsx +60 -18
- package/node_modules/@groove-dev/gui/src/components/agents/agent-feed.jsx +42 -20
- package/node_modules/@groove-dev/gui/src/components/agents/agent-file-tree.jsx +1 -1
- package/node_modules/@groove-dev/gui/src/components/agents/workspace-mode.jsx +1 -1
- package/node_modules/@groove-dev/gui/src/components/chat/chat-messages.jsx +2 -22
- package/node_modules/@groove-dev/gui/src/components/editor/code-editor.jsx +9 -9
- package/node_modules/@groove-dev/gui/src/components/editor/file-tree.jsx +1 -1
- package/node_modules/@groove-dev/gui/src/components/editor/terminal.jsx +7 -0
- package/node_modules/@groove-dev/gui/src/components/lab/chat-playground.jsx +59 -51
- package/node_modules/@groove-dev/gui/src/components/lab/lab-assistant.jsx +48 -48
- package/node_modules/@groove-dev/gui/src/components/lab/metrics-panel.jsx +39 -38
- package/node_modules/@groove-dev/gui/src/components/lab/parameter-panel.jsx +4 -5
- package/node_modules/@groove-dev/gui/src/components/lab/preset-manager.jsx +11 -11
- package/node_modules/@groove-dev/gui/src/components/lab/runtime-config.jsx +66 -62
- package/node_modules/@groove-dev/gui/src/components/lab/system-prompt-editor.jsx +13 -13
- package/node_modules/@groove-dev/gui/src/components/layout/breadcrumb-bar.jsx +1 -1
- package/node_modules/@groove-dev/gui/src/components/preview/preview-workspace.jsx +62 -22
- package/node_modules/@groove-dev/gui/src/components/ui/slider.jsx +16 -17
- package/node_modules/@groove-dev/gui/src/components/ui/table-tree.jsx +38 -0
- package/node_modules/@groove-dev/gui/src/stores/groove.js +23 -9
- package/node_modules/@groove-dev/gui/src/views/editor.jsx +1 -1
- package/node_modules/@groove-dev/gui/src/views/model-lab.jsx +101 -87
- package/node_modules/moe-training/client/domain-tagger.js +1 -1
- package/node_modules/moe-training/scripts/retag-delegate-yield.js +303 -0
- package/node_modules/moe-training/test/shared/envelope-schema.test.js +3 -3
- package/package.json +1 -1
- package/packages/cli/package.json +1 -1
- package/packages/daemon/package.json +1 -1
- package/packages/daemon/src/adaptive.js +77 -0
- package/packages/daemon/src/api.js +35 -5
- package/packages/daemon/src/journalist.js +28 -12
- package/packages/daemon/src/model-lab.js +53 -76
- package/packages/daemon/src/process.js +91 -2
- package/packages/daemon/src/rotator.js +45 -3
- package/packages/gui/dist/assets/{index-Dozp69tK.js → index-BrZHF7pK.js} +1770 -1766
- package/packages/gui/dist/assets/index-DIfiwdKl.css +1 -0
- package/packages/gui/dist/index.html +2 -2
- package/packages/gui/package.json +1 -1
- package/packages/gui/src/components/agents/agent-chat.jsx +60 -18
- package/packages/gui/src/components/agents/agent-feed.jsx +42 -20
- package/packages/gui/src/components/agents/agent-file-tree.jsx +1 -1
- package/packages/gui/src/components/agents/workspace-mode.jsx +1 -1
- package/packages/gui/src/components/chat/chat-messages.jsx +2 -22
- package/packages/gui/src/components/editor/code-editor.jsx +9 -9
- package/packages/gui/src/components/editor/file-tree.jsx +1 -1
- package/packages/gui/src/components/editor/terminal.jsx +7 -0
- package/packages/gui/src/components/lab/chat-playground.jsx +59 -51
- package/packages/gui/src/components/lab/lab-assistant.jsx +48 -48
- package/packages/gui/src/components/lab/metrics-panel.jsx +39 -38
- package/packages/gui/src/components/lab/parameter-panel.jsx +4 -5
- package/packages/gui/src/components/lab/preset-manager.jsx +11 -11
- package/packages/gui/src/components/lab/runtime-config.jsx +66 -62
- package/packages/gui/src/components/lab/system-prompt-editor.jsx +13 -13
- package/packages/gui/src/components/layout/breadcrumb-bar.jsx +1 -1
- package/packages/gui/src/components/preview/preview-workspace.jsx +62 -22
- package/packages/gui/src/components/ui/slider.jsx +16 -17
- package/packages/gui/src/components/ui/table-tree.jsx +38 -0
- package/packages/gui/src/stores/groove.js +23 -9
- package/packages/gui/src/views/editor.jsx +1 -1
- package/packages/gui/src/views/model-lab.jsx +101 -87
- package/plan_files/DELEGATE_YIELD_TRAINING_TAGS.md +135 -0
- package/plan_files/session-quality-rotation-fixes.md +218 -0
- package/test.py +571 -0
- package/node_modules/@groove-dev/gui/dist/assets/index-BgQL4bNl.css +0 -1
- package/packages/gui/dist/assets/index-BgQL4bNl.css +0 -1
- /package/{AGENT_ORCHESTRATION.md → plan_files/AGENT_ORCHESTRATION.md} +0 -0
- /package/{DYNAMIC_LEAF_ARCH.md → plan_files/DYNAMIC_LEAF_ARCH.md} +0 -0
- /package/{EMBEDDING_DIAGNOSTIC.md → plan_files/EMBEDDING_DIAGNOSTIC.md} +0 -0
- /package/{EMBEDDING_SERVICE_BUILD_PLAN.md → plan_files/EMBEDDING_SERVICE_BUILD_PLAN.md} +0 -0
- /package/{MOE_TRAINING_PIPELINE.md → plan_files/MOE_TRAINING_PIPELINE.md} +0 -0
@@ -0,0 +1,218 @@
Session Quality & Preemptive Rotation Fixes: Build Doc
========================================================

Problem
-------
Long-running Claude sessions (30+ hours) degrade: freezing mid-sentence, losing context, output quality dropping. GROOVE's rotation system triggers AFTER failure instead of BEFORE.

Root Cause
----------
Claude Code provider is flagged managesOwnContext = true (packages/daemon/src/providers/claude-code.js:40). This causes rotator.js line 183 to SKIP all context threshold, token ceiling, and hard ceiling checks. The only rotation path left is quality-based scoring, and it misses the specific degradation patterns.

Evidence from fullstack-10 (training team, 11-day session):
- 2.5M tokens, 247% context usage, ZERO quality rotations triggered
- All 7 rotations were natural_compaction or manual
- Output collapsed to 1-8 tokens per response (messages 1484-1510)
- Thinking blocks truncated mid-sentence at 8 tokens
- 5 consecutive identical python3 commands (agent stuck in loop)
- Cache read tokens dropped from 150K to 0 twice (context resets)
- 80 rate limit events
- Tool-call-only responses with zero explanatory text in late session


Files Modified
--------------

1. packages/daemon/src/process.js
   - Change 1: Incomplete response detection in _handleAgentOutput
   - Change 5: Cache reset detection (cacheReadTokens drop tracking)
   - Change 7: Intro context size warning after buildFullPrompt()

2. packages/daemon/src/rotator.js
   - Change 3: Compaction-aware rotation (compaction count tracking, age decay extension)
   - Change 4: Truncation-triggered immediate rotation
   - Change 8: Fix duplicate rotation history entries from recordNaturalCompaction()

3. packages/daemon/src/adaptive.js
   - Change 2: New signals in extractSignals() and scoreSession()

4. packages/daemon/src/journalist.js
   - Change 6: Handoff brief compression and deduplication


Change Details
--------------

CHANGE 1: Incomplete Response Detection (process.js)

What: Detect mid-sentence truncation and abandoned tool calls.

Where: _handleAgentOutput method, triggered on output.type === 'activity' && output.subtype === 'assistant'.

Logic:
- Check if last text block ends without terminal punctuation (. ? ! or closing code fence/brace)
- Track if tool_use was emitted without a matching tool_result in next turn
- Set registry fields: truncationSuspected (bool), consecutiveTruncations (int)
- 2+ consecutive truncations = hard degradation signal
- Emit classifier event: { type: 'error', subtype: 'truncated_response' }
- Reduce stall threshold from 5min to 2min for truncation-flagged agents

Edge cases to watch:
- Short valid responses like "OK" or "Done" end with a period; these should not trigger
- Tool-use-only responses (no text block at all) need special handling: check if it's a normal tool call or a degraded one by looking at whether the tool_use has meaningful input
- Must not trigger on the very first response of a session
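
The punctuation check and the consecutive counter described above could be sketched roughly as follows. This is a minimal illustration, not the code in process.js: the helper names `endsCleanly` and `updateTruncationState` are hypothetical, and tool-use-only responses (passed here as `null`) are simply left to separate handling.

```javascript
// Sketch only: helper names are hypothetical, not GROOVE's actual code.
const CODE_FENCE = '`'.repeat(3);

// True when the text ends with terminal punctuation, a closing brace,
// or a closing code fence.
function endsCleanly(text) {
  const t = text.trimEnd();
  if (t.length === 0) return true; // nothing to judge
  return /[.?!}]$/.test(t) || t.endsWith(CODE_FENCE);
}

// Returns the next truncation state for one assistant response.
// lastText is the last text block, or null for a tool-use-only response.
function updateTruncationState(state, lastText, isFirstResponse) {
  if (isFirstResponse) {
    // Never flag the very first response of a session.
    return { truncationSuspected: false, consecutiveTruncations: 0, hardDegradation: false };
  }
  if (lastText === null) {
    // Tool-use-only response: left unchanged here, needs its own check.
    return { ...state };
  }
  const truncated = !endsCleanly(lastText);
  const consecutive = truncated ? state.consecutiveTruncations + 1 : 0;
  return {
    truncationSuspected: truncated,
    consecutiveTruncations: consecutive,
    hardDegradation: consecutive >= 2, // 2+ in a row = hard degradation signal
  };
}
```

Keeping the state update a pure function makes it easy to unit test apart from the registry writes and the classifier event.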


CHANGE 2: Output Velocity Decay Signals (adaptive.js)

What: Four new quality signals in extractSignals() with corresponding scoring in scoreSession().

New signals:
- outputLengthDecay: Compare avg output tokens in last 5 turns vs first 5. If <50%, signal=1. Score: -10 pts.
- toolOutputVolume: Sum tool result character lengths. >300KB = signal 1 (-5 pts). >600KB = signal 2 (-10 pts).
- turnLatencyTrend: Compare avg timestamp gaps in last 10 vs first 10 entries. If >2x, signal=1. Score: -5 pts.
- bashRepetition: Count consecutive identical Bash commands. 3+ identical = signal 1 (-8 pts).

Where in extractSignals(): After the existing fileChurn and errorTrend calculations.

Where in scoreSession(): After the existing errorTrend scoring block.

Edge cases to watch:
- outputLengthDecay needs enough turns to compare (require at least 10 entries with output tokens)
- bashRepetition should normalize commands (trim whitespace) before comparing
- toolOutputVolume counts cumulative session total, not per-turn
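
A rough sketch of the four signals and their penalties, written as standalone functions rather than as the actual extractSignals() and scoreSession() code. Two points are assumptions made for illustration: "KB" is read as 1,000 characters, and turnLatencyTrend requires at least 20 timestamps so the first and last windows do not overlap.

```javascript
// Sketch only: standalone helpers, not the real adaptive.js internals.
const avg = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
const gaps = (ts) => ts.slice(1).map((t, i) => t - ts[i]);

// 1 when the last 5 turns average under 50% of the first 5 turns.
function outputLengthDecay(outputTokens) {
  if (outputTokens.length < 10) return 0; // not enough turns to compare
  const first = avg(outputTokens.slice(0, 5));
  const last = avg(outputTokens.slice(-5));
  return first > 0 && last < first * 0.5 ? 1 : 0;
}

// Cumulative session total of tool result characters (assumes 1KB = 1000 chars).
function toolOutputVolume(totalChars) {
  if (totalChars > 600000) return 2;
  if (totalChars > 300000) return 1;
  return 0;
}

// 1 when recent gaps between entries are more than 2x the early gaps.
function turnLatencyTrend(timestamps) {
  if (timestamps.length < 20) return 0; // assumed minimum
  const early = avg(gaps(timestamps.slice(0, 10)));
  const late = avg(gaps(timestamps.slice(-10)));
  return early > 0 && late > early * 2 ? 1 : 0;
}

// 1 when 3 or more consecutive Bash commands are identical after trimming.
function bashRepetition(commands) {
  let longest = 0;
  let run = 0;
  let prev = null;
  for (const raw of commands) {
    const cmd = raw.trim();
    run = cmd === prev ? run + 1 : 1;
    prev = cmd;
    if (run > longest) longest = run;
  }
  return longest >= 3 ? 1 : 0;
}

// Total points to subtract from the session score for these signals.
function velocityPenalty(signals) {
  let penalty = 0;
  if (signals.outputLengthDecay === 1) penalty += 10;
  if (signals.toolOutputVolume === 1) penalty += 5;
  if (signals.toolOutputVolume === 2) penalty += 10;
  if (signals.turnLatencyTrend === 1) penalty += 5;
  if (signals.bashRepetition === 1) penalty += 8;
  return penalty;
}
```

With all four signals at their maximum, this subtracts 33 points (10 + 10 + 5 + 8).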


CHANGE 3: Compaction-Aware Rotation (rotator.js)

What: Track compaction events per agent and use them to progressively tighten rotation criteria.

Where:
- recordNaturalCompaction(): increment this.compactionCounts Map
- check(): new block BEFORE quality rotation (around line 240), applies to ALL providers
- scoreLiveSession(): extended age decay

Logic:
- this.compactionCounts = new Map() in constructor
- recordNaturalCompaction increments count for the agent
- In check(): if the agent's compaction count >= 5, force rotate with reason 'compaction_ceiling'
- In check(): if the agent's compaction count >= 3, use effectiveQualityThreshold = 55 instead of 40
- scoreLiveSession() extended age decay: >7200s=-15, >14400s=-20, >28800s=-25

Edge cases to watch:
- Compaction count must reset when agent is rotated (new agent ID = fresh count)
- Don't increment count for the duplicate "unknown" provider entries (see Change 8)
- The compaction ceiling check should respect MIN_AGE_SEC to avoid killing brand-new agents
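
The counting and threshold logic above might look something like this in isolation. The class `CompactionTracker` and the function `ageDecayPenalty` are hypothetical names for illustration; the real logic lives in the rotator's constructor, recordNaturalCompaction(), check() and scoreLiveSession(). The value of MIN_AGE_SEC is not given in this doc, so it is passed in as a parameter, and the age tiers are assumed to be non-cumulative with no penalty at or below 7200s.

```javascript
// Sketch only: a standalone model of the compaction-aware checks.
const BASE_QUALITY_THRESHOLD = 40;
const TIGHTENED_QUALITY_THRESHOLD = 55;
const COMPACTION_TIGHTEN_AT = 3;
const COMPACTION_CEILING = 5;

class CompactionTracker {
  constructor(minAgeSec) {
    this.compactionCounts = new Map(); // agentId -> compaction count
    this.minAgeSec = minAgeSec; // stands in for MIN_AGE_SEC
  }

  recordNaturalCompaction(agentId) {
    const next = (this.compactionCounts.get(agentId) || 0) + 1;
    this.compactionCounts.set(agentId, next);
    return next;
  }

  // Decide whether to force a rotation and which quality threshold applies.
  evaluate(agentId, ageSec) {
    const count = this.compactionCounts.get(agentId) || 0;
    const qualityThreshold =
      count >= COMPACTION_TIGHTEN_AT ? TIGHTENED_QUALITY_THRESHOLD : BASE_QUALITY_THRESHOLD;
    if (count >= COMPACTION_CEILING && ageSec >= this.minAgeSec) {
      return { rotate: true, reason: 'compaction_ceiling', qualityThreshold };
    }
    return { rotate: false, reason: null, qualityThreshold };
  }

  // Called when the agent is rotated; the replacement starts from zero.
  reset(agentId) {
    this.compactionCounts.delete(agentId);
  }
}

// Extended age decay, as points to subtract (tiers assumed non-cumulative).
function ageDecayPenalty(ageSec) {
  if (ageSec > 28800) return 25;
  if (ageSec > 14400) return 20;
  if (ageSec > 7200) return 15;
  return 0;
}
```

Since a rotated agent gets a new ID, the explicit reset() is mainly housekeeping so the Map does not grow without bound.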


CHANGE 4: Truncation-Triggered Rotation (rotator.js)

What: Immediate rotation on consecutive truncated responses.

Where: check() method, BEFORE the quality rotation block.

Logic:
- Read agent.truncationSuspected and agent.consecutiveTruncations from registry
- consecutiveTruncations >= 2: force rotate, reason 'incomplete_response', bypass cooldown
- truncationSuspected true (single): lower effective quality threshold to 55

Edge cases to watch:
- Must still respect the singleTask skip for Codex
- Must still respect the idle check (don't rotate while agent is actively producing output)
- Clear truncation flags on the NEW agent after rotation
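
As a sketch of how this decision could be structured inside check(), assuming the registry fields from Change 1 are available on the agent record. The function names and the `singleTask` / `isIdle` option names are placeholders for whatever the rotator already uses for those checks.

```javascript
// Sketch only: a pure decision function; option names are hypothetical.
function truncationRotationDecision(agent, opts = {}) {
  const { singleTask = false, isIdle = true, baseThreshold = 40 } = opts;
  const none = { rotate: false, reason: null, bypassCooldown: false };

  // Single-task providers are skipped entirely.
  if (singleTask) return { ...none, qualityThreshold: baseThreshold };

  // A single suspected truncation only tightens the quality threshold.
  const qualityThreshold = agent.truncationSuspected ? 55 : baseThreshold;

  // Two or more in a row forces rotation, but only while the agent is idle.
  if ((agent.consecutiveTruncations || 0) >= 2 && isIdle) {
    return { rotate: true, reason: 'incomplete_response', bypassCooldown: true, qualityThreshold };
  }
  return { ...none, qualityThreshold };
}

// Flags the replacement agent should start with after a rotation.
function freshTruncationFlags() {
  return { truncationSuspected: false, consecutiveTruncations: 0 };
}
```

Returning the effective threshold together with the rotate decision lets the existing quality rotation block reuse it without reading the flags a second time.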


CHANGE 5: Cache Reset Detection (process.js)

What: Detect sudden drops in cache read tokens as a context quality signal.

Where: _handleAgentOutput, near the existing contextUsage tracking block (around line 1383).

Logic:
- Add Map: this._prevCacheRead (agentId -> last cacheReadTokens value)
- On each output with cacheReadTokens: compare to previous value
- If previous > 50,000 AND current < previous * 0.5: cache reset detected
- Set registry field: cacheResetDetected = true
- Emit classifier event: { type: 'error', subtype: 'cache_reset' }
- Rotator treats cacheResetDetected same as truncationSuspected

Edge cases to watch:
- Cache builds up from 0 at session start, so the check must require previous > 50K minimum
- Natural compaction causes a cache drop too, but recordNaturalCompaction already handles that, so don't double-count
- If the provider doesn't report cacheReadTokens (non-Claude providers), skip this check
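
The drop detection can be illustrated with a small standalone class. `CacheResetDetector` and the `compactionJustRecorded` option are hypothetical; in process.js this would sit next to the contextUsage tracking and use whatever signal already marks a natural compaction.

```javascript
// Sketch only: models this._prevCacheRead and the reset rule from the list above.
class CacheResetDetector {
  constructor() {
    this._prevCacheRead = new Map(); // agentId -> last cacheReadTokens value
  }

  // Returns true when this observation looks like a cache reset.
  observe(agentId, cacheReadTokens, { compactionJustRecorded = false } = {}) {
    // Providers that do not report cacheReadTokens are skipped.
    if (typeof cacheReadTokens !== 'number') return false;

    const prev = this._prevCacheRead.get(agentId);
    this._prevCacheRead.set(agentId, cacheReadTokens);

    // A natural compaction already explains the drop; do not double-count.
    if (compactionJustRecorded) return false;

    return prev !== undefined && prev > 50000 && cacheReadTokens < prev * 0.5;
  }
}
```

The previous value is updated before any early return so that the next comparison is always made against the most recent reading.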


CHANGE 6: Handoff Brief Compression (journalist.js)

What: Cap brief size and deduplicate content across rotations.

Where: generateHandoffBrief() method (line 851+) and buildRotationSynthesisPrompt() (line 436+).

Changes:
- Hard cap: 5000 chars total for the assembled brief. Truncate from the bottom (rotation history and original task get cut first).
- Replace inline discoveries with pointer to .groove/memory/agent-discoveries.jsonl
- Cap original task section to 500 chars
- Drop 'Recent User Messages' for quality_degradation rotations
- Reduce rotation history from 3 entries to 1
- buildRotationSynthesisPrompt: reduce input cap from 30K to 15K, response limit from 2000 to 1500
- Dedup: hash discovery entries and skip any that appear in the rotation history section

Priority order when truncating to fit 5000 char cap:
1. Unresolved Errors (keep)
2. User Constraints (keep)
3. Last 5 Tool Calls (keep)
4. Session Summary (truncate to 1500 chars if needed)
5. Rotation History (1 entry, truncate to 500 chars if needed)
6. Original Task (truncate to 500 chars)
7. Everything else (drop)
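
One possible way to assemble the brief in priority order under the 5000 char cap is sketched below. The section keys, the `## ` header format and the function name `assembleBrief` are all assumptions for illustration; the actual layout produced by generateHandoffBrief() is not shown in this doc. As a simplification, the per-section limits are always applied rather than only "if needed".

```javascript
// Sketch only: priority-ordered assembly with a hard character cap.
const BRIEF_CAP = 5000;

// Highest priority first; anything not listed here is dropped.
const SECTION_ORDER = [
  { key: 'unresolvedErrors', title: 'Unresolved Errors', max: Infinity },
  { key: 'userConstraints', title: 'User Constraints', max: Infinity },
  { key: 'lastToolCalls', title: 'Last 5 Tool Calls', max: Infinity },
  { key: 'sessionSummary', title: 'Session Summary', max: 1500 },
  { key: 'rotationHistory', title: 'Rotation History', max: 500 },
  { key: 'originalTask', title: 'Original Task', max: 500 },
];

function assembleBrief(sections, cap = BRIEF_CAP) {
  const parts = [];
  let used = 0;
  for (const { key, title, max } of SECTION_ORDER) {
    const body = sections[key];
    if (!body) continue;
    const header = `## ${title}\n`; // hypothetical header format
    // Space left for this section's text, after its header and trailing newline.
    const remaining = cap - used - header.length - 1;
    if (remaining <= 0) break; // lower-priority sections are cut first
    const text = body.slice(0, Math.min(max, remaining));
    const part = header + text + '\n';
    parts.push(part);
    used += part.length;
  }
  return parts.join('');
}
```

Because sections are emitted in priority order and the loop stops once the cap is reached, rotation history and the original task are the first to be shortened or dropped.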


CHANGE 7: Intro Context Size Warning (process.js or introducer.js)

What: Warn when injected context is too large, optionally truncate.

Where: After buildFullPrompt() in the spawn flow, or in the introducer where intro context is assembled.

Logic:
- Measure the prompt size with Buffer.byteLength(fullPrompt) (this returns bytes; use fullPrompt.length if the limits below are meant as chars)
- If > 8000 chars: console.warn with agent name and size
- Config field: maxIntroContextChars (default 10000)
- If exceeded: truncate agent.introContext (not the task prompt) to fit
- Truncation should preserve the first and last sections, cutting middle content

Edge cases to watch:
- Don't truncate the task prompt, only the GROOVE-injected intro context
- The warning should include which agent and what size, so the user can debug
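
A middle-cutting truncation along these lines could be sketched as follows. The helper names, the marker text and the choice to measure with `.length` (characters, not bytes) are assumptions, and the cap is applied to the intro context alone, which is one reading of maxIntroContextChars.

```javascript
// Sketch only: keeps the head and tail of the text and cuts the middle.
function truncateMiddle(text, maxChars, marker = '\n[... truncated ...]\n') {
  if (text.length <= maxChars) return text;
  if (maxChars <= marker.length) return text.slice(0, maxChars);
  const keep = maxChars - marker.length;
  const head = Math.ceil(keep / 2);
  const tail = keep - head;
  // slice(-0) would return the whole string, so guard the tail.
  return text.slice(0, head) + marker + (tail > 0 ? text.slice(-tail) : '');
}

// Returns the (possibly truncated) intro context plus any warning messages.
// The task prompt is passed through untouched.
function fitIntroContext(agentName, introContext, taskPrompt, config = {}) {
  const maxIntroContextChars = config.maxIntroContextChars ?? 10000;
  const warnings = [];
  const total = introContext.length + taskPrompt.length;
  if (total > 8000) {
    warnings.push(`${agentName}: injected prompt is ${total} chars`);
  }
  return {
    introContext: truncateMiddle(introContext, maxIntroContextChars),
    taskPrompt,
    warnings,
  };
}
```

A caller would pass each entry of `warnings` to console.warn, which keeps the sizing logic free of side effects.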


CHANGE 8: Fix Duplicate Rotation History Entries (rotator.js)

What: recordNaturalCompaction() produces two entries per event: one real, one "unknown" with 0 tokens.

Where: recordNaturalCompaction() method (line 484+).

Evidence: rotation-history.json shows paired entries like:
  { agentName: "planner-9", provider: "claude-code", oldTokens: 4358, ... }
  { agentName: "planner-9", provider: "unknown", oldTokens: 0, ... }

Root cause: Likely the broadcast from recordNaturalCompaction triggers a listener that calls it again, or the provider lookup fails on the second invocation. Debug by checking what calls recordNaturalCompaction; it's called from process.js _handleAgentOutput (line 1392). Check if the same stdout event produces two parseOutput results.

Fix: Add a dedup guard: if the last entry in rotationHistory has the same agentId and a timestamp within 1 second, skip. Or fix the upstream double-call.
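
The dedup guard option might be sketched like this, assuming each history entry carries an `agentId` and a numeric `timestamp` in milliseconds (the entry shape beyond the fields shown in the evidence is an assumption, as is the function name).

```javascript
// Sketch only: returns false when the new entry duplicates the last one.
function shouldRecordCompaction(rotationHistory, entry, windowMs = 1000) {
  const last = rotationHistory[rotationHistory.length - 1];
  if (!last) return true; // empty history, nothing to compare against
  const sameAgent = last.agentId === entry.agentId;
  const closeInTime = Math.abs(entry.timestamp - last.timestamp) <= windowMs;
  return !(sameAgent && closeInTime);
}
```

A caller would check this before pushing to rotationHistory and before incrementing the compaction count, so the "unknown" duplicate does not inflate the count used in Change 3.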


Testing
-------
- npm test from packages/daemon/ must pass with all changes
- npm run build from packages/gui/ must compile
- Manual test: spawn a long-running agent and verify compaction counting works
- Manual test: send a large file read and verify toolOutputVolume signal fires
- Verify rotation-history.json no longer contains duplicate entries

Rollback
--------
All changes are in 4 files in packages/daemon/src/. If any change causes issues:
- Revert the specific file to its pre-change state via git checkout
- Changes are independent enough that any single change can be reverted without affecting the others, EXCEPT:
  - Change 4 depends on Change 1 (truncation detection feeds truncation rotation)
  - Change 3 depends on recordNaturalCompaction (Change 8 touches the same method)
  - If reverting Change 8, also verify Change 3's compaction counting still works