clementine-agent 1.0.88 → 1.0.90
- package/README.md +13 -0
- package/dist/agent/assistant.js +12 -6
- package/dist/agent/decision-reflection.d.ts +65 -0
- package/dist/agent/decision-reflection.js +240 -0
- package/dist/gateway/cron-scheduler.js +18 -2
- package/dist/tools/decision-reflection-tools.d.ts +17 -0
- package/dist/tools/decision-reflection-tools.js +83 -0
- package/dist/tools/mcp-server.js +2 -0
- package/package.json +1 -1
- package/vault/00-System/CRON.md +16 -0
package/README.md
CHANGED
|
@@ -585,6 +585,19 @@ The dashboard's "The Office" page shows each agent as an animated desk station w
|
|
|
585
585
|
- Channel assignment, model badge, project badge, tool count
|
|
586
586
|
- Edit and "Let Go" (delete) actions
|
|
587
587
|
|
|
588
|
+
### Decision-loop reflection
|
|
589
|
+
|
|
590
|
+
Each agent's autonomous decisions are recorded to a proactive ledger (action chosen, signal source, eventual outcome). The `decision_reflection` MCP tool reads that ledger, computes per-action success rates, and surfaces calibration patterns:
|
|
591
|
+
|
|
592
|
+
- "act_now success rate is 33% — many autonomous actions did not advance"
|
|
593
|
+
- "Queue-heavy bias: 12 queued vs 2 act_now — engine is being conservative"
|
|
594
|
+
- "Zero ask_user despite active autonomous work"
|
|
595
|
+
- Plus concrete tuning suggestions
|
|
596
|
+
|
|
597
|
+
By default the report is saved to `vault/00-System/agents/<slug>/reflections/<date>.md`. Pass `append_to_memory: true` to also write a compact summary into the agent's `working-memory.md` so the next heartbeat tick reads it as prompt context — that's how agents self-tune without code changes.
|
|
598
|
+
|
|
599
|
+
The shipped `vault/00-System/CRON.md` template includes a `weekly-decision-reflection` job (Sundays 9am) that runs reflection for the daemon and every active specialist.
|
|
600
|
+
|
|
588
601
|
### Per-agent heartbeats
|
|
589
602
|
|
|
590
603
|
Each specialist (Ross / Sasha / your hires) gets their own autonomous heartbeat scheduler alongside Clementine's. The cycle:
|
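As a rough illustration of the calibration patterns quoted in the README section above, the sketch below reproduces two of the checks with the thresholds that appear in `decision-reflection.js` in this same diff. The helper name `describeBias` and its input shape are assumptions made for this example; they are not part of the package.

```javascript
// Hypothetical helper (not part of clementine-agent): reproduces the
// "queue-heavy" and "zero ask_user" checks using the thresholds in this diff.
function describeBias(counts) {
  const { act_now: actNow, queue, ask_user: askUser } = counts;
  const notes = [];
  // Queue-heavy: more than 3x as many queued decisions as act_now decisions.
  if (queue > actNow * 3 && actNow > 0) {
    notes.push(`Queue-heavy bias: ${queue} queued vs ${actNow} act_now`);
  }
  // No owner questions despite at least 5 autonomous decisions.
  if (askUser === 0 && actNow + queue >= 5) {
    notes.push('Zero ask_user despite active autonomous work');
  }
  return notes;
}

// The README's own example: 12 queued vs 2 act_now, and no ask_user decisions.
const biasNotes = describeBias({ act_now: 2, queue: 12, ask_user: 0 });
console.log(biasNotes);
```

With the README's numbers, 12 is greater than 2 × 3, so the queue-heavy note fires; the second note fires because 14 autonomous decisions were made without a single `ask_user`.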
package/dist/agent/assistant.js
CHANGED
|
@@ -1788,12 +1788,18 @@ You have a cost budget per message — not a hard turn limit. Work until the tas
|
|
|
1788
1788
|
const supportsThinking = !resolvedModel.includes('haiku');
|
|
1789
1789
|
const needsThinking = !isHeartbeat && (isPlanStep || isUnleashed || !isCron);
|
|
1790
1790
|
const computedThinking = thinking ?? (supportsThinking && needsThinking ? { type: 'adaptive' } : undefined);
|
|
1791
|
-
//
|
|
1792
|
-
//
|
|
1793
|
-
//
|
|
1794
|
-
//
|
|
1795
|
-
//
|
|
1796
|
-
|
|
1791
|
+
// ── taskBudget: don't pass to the SDK ─────────────────────────
|
|
1792
|
+
// The Anthropic API now rejects `taskBudget` for both Haiku AND Sonnet
|
|
1793
|
+
// ("This model does not support user-configurable task budgets" — 400).
|
|
1794
|
+
// We previously gated by !haiku, but that left Sonnet crons (e.g.,
|
|
1795
|
+
// ross-the-sdr:reply-detection) failing on every run. Cost is
|
|
1796
|
+
// informational on a Claude subscription anyway — `maxTurns` and the
|
|
1797
|
+
// wall-clock cap (`maxHours` for unleashed) are the actual brakes.
|
|
1798
|
+
//
|
|
1799
|
+
// computedTaskBudget is still computed below for any future telemetry
|
|
1800
|
+
// path that wants to log "soft target" values, but it is intentionally
|
|
1801
|
+
// never passed into sdkOptions.
|
|
1802
|
+
const supportsTaskBudget = false;
|
|
1797
1803
|
// 1M context beta: enable for Sonnet when toggled and context-heavy work benefits
|
|
1798
1804
|
const isSonnet = resolvedModel.includes('sonnet');
|
|
1799
1805
|
const computedBetas = ENABLE_1M_CONTEXT && isSonnet
|
|
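The comment block in the hunk above says the computed budget is kept for telemetry but never passed into `sdkOptions`. A minimal sketch of that gating, assuming a hypothetical `buildSdkOptions` helper and option names (the real `sdkOptions` shape is not shown in this diff):

```javascript
// Hypothetical option builder; names and shape are assumptions for illustration.
function buildSdkOptions({ model, maxTurns, computedTaskBudget }) {
  // Mirrors the diff: the flag is hard-coded to false, so the budget is never sent.
  const supportsTaskBudget = false;
  return {
    model,
    maxTurns,
    // The conditional spread adds the key only when the flag is true.
    ...(supportsTaskBudget && computedTaskBudget !== undefined
      ? { taskBudget: computedTaskBudget }
      : {}),
  };
}

// 'sonnet' is a placeholder model string, not a real model identifier.
const sketchOptions = buildSdkOptions({ model: 'sonnet', maxTurns: 20, computedTaskBudget: 5 });
console.log(Object.keys(sketchOptions));
```

Omitting the key entirely, rather than sending `undefined` or `0`, is one way to avoid tripping the 400 response the comment describes; whether the package does exactly this is not visible here.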
package/dist/agent/decision-reflection.d.ts
@@ -0,0 +1,65 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Clementine TypeScript — Per-agent decision-loop reflection.
|
|
3
|
+
*
|
|
4
|
+
* Reads an agent's slice of the proactive decision ledger and produces
|
|
5
|
+
* a calibration report:
|
|
6
|
+
*
|
|
7
|
+
* - How many decisions per action (act_now / queue / ask_user / snooze / ignore)
|
|
8
|
+
* - Of those, how many had recorded outcomes
|
|
9
|
+
* - Success rate (advanced / withOutcomes) per action
|
|
10
|
+
* - Top signal sources by volume
|
|
11
|
+
* - Plain-English patterns + concrete tuning suggestions
|
|
12
|
+
*
|
|
13
|
+
* The report is meant to land in the agent's working-memory so it
|
|
14
|
+
* shapes their next heartbeat tick — they read their own track record
|
|
15
|
+
* and self-correct without code changes.
|
|
16
|
+
*
|
|
17
|
+
* Analysis only: reads the ledger but writes nothing. The MCP tool wrapper handles
|
|
18
|
+
* file writes (history) and working-memory updates separately.
|
|
19
|
+
*/
|
|
20
|
+
import type { ProactiveAction, ProactiveDecision, ProactiveSource } from './proactive-engine.js';
|
|
21
|
+
export interface ActionBucket {
|
|
22
|
+
decided: number;
|
|
23
|
+
withOutcomes: number;
|
|
24
|
+
advanced: number;
|
|
25
|
+
blocked: number;
|
|
26
|
+
failed: number;
|
|
27
|
+
/** advanced / withOutcomes * 100, or null if no outcomes recorded yet. */
|
|
28
|
+
successRatePct: number | null;
|
|
29
|
+
}
|
|
30
|
+
export interface DecisionReflection {
|
|
31
|
+
slug: string;
|
|
32
|
+
windowDays: number;
|
|
33
|
+
totalDecisions: number;
|
|
34
|
+
byAction: Record<ProactiveAction, ActionBucket>;
|
|
35
|
+
topSources: Array<{
|
|
36
|
+
source: ProactiveSource;
|
|
37
|
+
count: number;
|
|
38
|
+
}>;
|
|
39
|
+
patterns: string[];
|
|
40
|
+
suggestions: string[];
|
|
41
|
+
generatedAt: string;
|
|
42
|
+
}
|
|
43
|
+
/**
|
|
44
|
+
* Read the ledger, filter by agent + window, compute the calibration
|
|
45
|
+
* stats. Returns a DecisionReflection ready to format.
|
|
46
|
+
*
|
|
47
|
+
* Agent matching: a record belongs to `slug` when context.owner === slug
|
|
48
|
+
* OR context.goalId resolves to a goal whose owner is slug. For now we
|
|
49
|
+
* match only on context.owner — slug-by-goal is more expensive (would
|
|
50
|
+
* need listAllGoals) and not worth it until owner is consistently set.
|
|
51
|
+
*/
|
|
52
|
+
export declare function analyzeAgentDecisions(slug: string, windowDays?: number, opts?: {
|
|
53
|
+
ledgerPath?: string;
|
|
54
|
+
now?: Date;
|
|
55
|
+
}): DecisionReflection;
|
|
56
|
+
export declare function formatReflectionReport(r: DecisionReflection): string;
|
|
57
|
+
/**
|
|
58
|
+
* Compact summary for working-memory append. Skips the full table and
|
|
59
|
+
* keeps just the patterns + suggestions so the agent's prompt doesn't
|
|
60
|
+
* get bloated with raw stats.
|
|
61
|
+
*/
|
|
62
|
+
export declare function formatReflectionSummary(r: DecisionReflection): string;
|
|
63
|
+
/** Internal type alias used by tests / tool. */
|
|
64
|
+
export type { ProactiveDecision };
|
|
65
|
+
//# sourceMappingURL=decision-reflection.d.ts.map
|
|
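To make the `ActionBucket` fields declared above concrete, here is a small sketch that tallies a list of outcome statuses into that shape and applies the documented formula (`advanced / withOutcomes * 100`, or `null` with no outcomes). The helper `toBucket` is hypothetical, and `'blocked-on-user'` is an invented status; the implementation file in this diff only shows that blocked statuses start with `blocked-`.

```javascript
// Hypothetical helper: tallies outcome statuses into the ActionBucket shape.
// `decided` counts every decision; `statuses` lists only recorded outcomes.
function toBucket(decided, statuses) {
  const bucket = {
    decided,
    withOutcomes: statuses.length,
    advanced: 0,
    blocked: 0,
    failed: 0,
    successRatePct: null,
  };
  for (const status of statuses) {
    if (status === 'advanced') bucket.advanced++;
    else if (status.startsWith('blocked-')) bucket.blocked++;
    else if (status === 'failed') bucket.failed++;
  }
  // The rate is only defined once at least one outcome has been recorded.
  if (bucket.withOutcomes > 0) {
    bucket.successRatePct = Math.round((bucket.advanced / bucket.withOutcomes) * 100);
  }
  return bucket;
}

// 5 decisions, 3 of which have outcomes: 1 advanced, 1 blocked, 1 failed.
const sampleBucket = toBucket(5, ['advanced', 'blocked-on-user', 'failed']);
console.log(sampleBucket.successRatePct);
```

One advanced outcome out of three recorded outcomes rounds to 33, which matches the "33%" example quoted in the README.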
package/dist/agent/decision-reflection.js
@@ -0,0 +1,240 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Clementine TypeScript — Per-agent decision-loop reflection.
|
|
3
|
+
*
|
|
4
|
+
* Reads an agent's slice of the proactive decision ledger and produces
|
|
5
|
+
* a calibration report:
|
|
6
|
+
*
|
|
7
|
+
* - How many decisions per action (act_now / queue / ask_user / snooze / ignore)
|
|
8
|
+
* - Of those, how many had recorded outcomes
|
|
9
|
+
* - Success rate (advanced / withOutcomes) per action
|
|
10
|
+
* - Top signal sources by volume
|
|
11
|
+
* - Plain-English patterns + concrete tuning suggestions
|
|
12
|
+
*
|
|
13
|
+
* The report is meant to land in the agent's working-memory so it
|
|
14
|
+
* shapes their next heartbeat tick — they read their own track record
|
|
15
|
+
* and self-correct without code changes.
|
|
16
|
+
*
|
|
17
|
+
* Analysis only: reads the ledger but writes nothing. The MCP tool wrapper handles
|
|
18
|
+
* file writes (history) and working-memory updates separately.
|
|
19
|
+
*/
|
|
20
|
+
import { recentDecisions } from './proactive-ledger.js';
|
|
21
|
+
// ── Constants ────────────────────────────────────────────────────────
|
|
22
|
+
const ALL_ACTIONS = ['act_now', 'queue', 'ask_user', 'snooze', 'ignore'];
|
|
23
|
+
const LOW_SUCCESS_THRESHOLD = 50; // < 50% success → flag as miscalibrated
|
|
24
|
+
const HIGH_VOLUME_THRESHOLD = 20; // >= 20 decisions in window → "active loop"
|
|
25
|
+
const DORMANT_THRESHOLD = 0; // 0 decisions → "dormant"
|
|
26
|
+
function emptyBucket() {
|
|
27
|
+
return { decided: 0, withOutcomes: 0, advanced: 0, blocked: 0, failed: 0, successRatePct: null };
|
|
28
|
+
}
|
|
29
|
+
function emptyByAction() {
|
|
30
|
+
const out = {};
|
|
31
|
+
for (const a of ALL_ACTIONS)
|
|
32
|
+
out[a] = emptyBucket();
|
|
33
|
+
return out;
|
|
34
|
+
}
|
|
35
|
+
// ── Analysis ─────────────────────────────────────────────────────────
|
|
36
|
+
/**
|
|
37
|
+
* Read the ledger, filter by agent + window, compute the calibration
|
|
38
|
+
* stats. Returns a DecisionReflection ready to format.
|
|
39
|
+
*
|
|
40
|
+
* Agent matching: a record belongs to `slug` when context.owner === slug
|
|
41
|
+
* OR context.goalId resolves to a goal whose owner is slug. For now we
|
|
42
|
+
* match only on context.owner — slug-by-goal is more expensive (would
|
|
43
|
+
* need listAllGoals) and not worth it until owner is consistently set.
|
|
44
|
+
*/
|
|
45
|
+
export function analyzeAgentDecisions(slug, windowDays = 7, opts) {
|
|
46
|
+
const now = opts?.now ?? new Date();
|
|
47
|
+
const sinceMs = windowDays * 24 * 60 * 60 * 1000;
|
|
48
|
+
const records = recentDecisions({ sinceMs }, opts?.ledgerPath ? { filePath: opts.ledgerPath, now } : { now });
|
|
49
|
+
// Filter to records relevant to this agent. Inclusion rule:
|
|
50
|
+
// - context.owner === slug (preferred — explicitly attributed)
|
|
51
|
+
// - else if slug === 'clementine': include records with no owner
|
|
52
|
+
// (the daemon's own decisions default to no-owner)
|
|
53
|
+
const isClementine = slug === 'clementine';
|
|
54
|
+
const mine = records.filter((r) => {
|
|
55
|
+
const owner = r.context.owner;
|
|
56
|
+
if (owner)
|
|
57
|
+
return owner === slug;
|
|
58
|
+
return isClementine; // unowned → Clementine
|
|
59
|
+
});
|
|
60
|
+
// Outcome records share the original decision's id and have an
|
|
61
|
+
// `outcome` field. Index by id so we can pair decisions with their
|
|
62
|
+
// outcomes in one pass.
|
|
63
|
+
const outcomeById = new Map();
|
|
64
|
+
for (const r of mine) {
|
|
65
|
+
if (r.outcome)
|
|
66
|
+
outcomeById.set(r.id, r.outcome.status);
|
|
67
|
+
}
|
|
68
|
+
const byAction = emptyByAction();
|
|
69
|
+
const sourceCounts = new Map();
|
|
70
|
+
let totalDecisions = 0;
|
|
71
|
+
for (const r of mine) {
|
|
72
|
+
if (r.outcome)
|
|
73
|
+
continue; // outcomes are indexed; only count original decisions here
|
|
74
|
+
totalDecisions++;
|
|
75
|
+
const action = r.decision.action;
|
|
76
|
+
const bucket = byAction[action];
|
|
77
|
+
bucket.decided++;
|
|
78
|
+
const outcomeStatus = outcomeById.get(r.id);
|
|
79
|
+
if (outcomeStatus !== undefined) {
|
|
80
|
+
bucket.withOutcomes++;
|
|
81
|
+
if (outcomeStatus === 'advanced')
|
|
82
|
+
bucket.advanced++;
|
|
83
|
+
else if (typeof outcomeStatus === 'string' && outcomeStatus.startsWith('blocked-'))
|
|
84
|
+
bucket.blocked++;
|
|
85
|
+
else if (outcomeStatus === 'failed')
|
|
86
|
+
bucket.failed++;
|
|
87
|
+
}
|
|
88
|
+
const src = r.decision.source;
|
|
89
|
+
sourceCounts.set(src, (sourceCounts.get(src) ?? 0) + 1);
|
|
90
|
+
}
|
|
91
|
+
// Compute success rates
|
|
92
|
+
for (const a of ALL_ACTIONS) {
|
|
93
|
+
const b = byAction[a];
|
|
94
|
+
b.successRatePct = b.withOutcomes > 0 ? Math.round((b.advanced / b.withOutcomes) * 100) : null;
|
|
95
|
+
}
|
|
96
|
+
const topSources = [...sourceCounts.entries()]
|
|
97
|
+
.map(([source, count]) => ({ source, count }))
|
|
98
|
+
.sort((a, b) => b.count - a.count)
|
|
99
|
+
.slice(0, 5);
|
|
100
|
+
const { patterns, suggestions } = derivePatterns({ slug, totalDecisions, byAction, topSources, windowDays });
|
|
101
|
+
return {
|
|
102
|
+
slug,
|
|
103
|
+
windowDays,
|
|
104
|
+
totalDecisions,
|
|
105
|
+
byAction,
|
|
106
|
+
topSources,
|
|
107
|
+
patterns,
|
|
108
|
+
suggestions,
|
|
109
|
+
generatedAt: now.toISOString(),
|
|
110
|
+
};
|
|
111
|
+
}
|
|
112
|
+
/**
|
|
113
|
+
* Derive plain-English patterns + tuning suggestions from the stats.
|
|
114
|
+
* Each pattern is a one-line observation; each suggestion is concrete
|
|
115
|
+
* advice the agent (or human) can act on.
|
|
116
|
+
*/
|
|
117
|
+
function derivePatterns(input) {
|
|
118
|
+
const patterns = [];
|
|
119
|
+
const suggestions = [];
|
|
120
|
+
if (input.totalDecisions === DORMANT_THRESHOLD) {
|
|
121
|
+
patterns.push(`No decisions recorded in the last ${input.windowDays} days.`);
|
|
122
|
+
suggestions.push('Agent is dormant. Either no signals are reaching the heartbeat, or every signal is being deduped. Check the proactive ledger directly to confirm.');
|
|
123
|
+
return { patterns, suggestions };
|
|
124
|
+
}
|
|
125
|
+
if (input.totalDecisions >= HIGH_VOLUME_THRESHOLD) {
|
|
126
|
+
patterns.push(`High decision volume: ${input.totalDecisions} decisions in ${input.windowDays} days.`);
|
|
127
|
+
}
|
|
128
|
+
const actNow = input.byAction.act_now;
|
|
129
|
+
if (actNow.withOutcomes >= 3 && (actNow.successRatePct ?? 100) < LOW_SUCCESS_THRESHOLD) {
|
|
130
|
+
patterns.push(`act_now success rate is ${actNow.successRatePct}% (${actNow.advanced}/${actNow.withOutcomes}). Low — many autonomous actions did not advance.`);
|
|
131
|
+
suggestions.push('Raise the urgency threshold for act_now (in proactive-engine.ts decideGoalAdvancement / decideDailyPlanPriority) — currently firing too aggressively. Consider requiring urgency >= 5 instead of >= 4.');
|
|
132
|
+
}
|
|
133
|
+
const queue = input.byAction.queue;
|
|
134
|
+
if (queue.withOutcomes >= 3 && (queue.successRatePct ?? 100) < LOW_SUCCESS_THRESHOLD) {
|
|
135
|
+
patterns.push(`Queued items are not landing: ${queue.advanced}/${queue.withOutcomes} advanced (${queue.successRatePct}%).`);
|
|
136
|
+
suggestions.push('Queued work is going stale. Review the work-queue dwell time — items may be timing out before the heartbeat runs them.');
|
|
137
|
+
}
|
|
138
|
+
if (queue.decided > actNow.decided * 3 && actNow.decided > 0) {
|
|
139
|
+
patterns.push(`Queue-heavy bias: ${queue.decided} queued vs ${actNow.decided} act_now. The engine is being conservative.`);
|
|
140
|
+
suggestions.push('If most queued items eventually advance manually, consider lowering the queue→act_now threshold or expanding act_now eligibility.');
|
|
141
|
+
}
|
|
142
|
+
const blockedTotal = ALL_ACTIONS.reduce((sum, a) => sum + input.byAction[a].blocked, 0);
|
|
143
|
+
if (blockedTotal >= 3) {
|
|
144
|
+
patterns.push(`${blockedTotal} decisions ended blocked (waiting on user/external).`);
|
|
145
|
+
suggestions.push('Frequent blocking suggests the agent is hitting questions only the owner can answer. Review whether earlier ask_user prompts would clear the blockage faster.');
|
|
146
|
+
}
|
|
147
|
+
const askUserCount = input.byAction.ask_user.decided;
|
|
148
|
+
if (askUserCount === 0 && actNow.decided + queue.decided >= 5) {
|
|
149
|
+
patterns.push('Zero ask_user decisions despite active autonomous work.');
|
|
150
|
+
suggestions.push('Agent never asked for owner input over the window. Either everything is unambiguously autonomous (good) or the agent is over-deciding without clarity (suspect when blocked outcomes are non-trivial).');
|
|
151
|
+
}
|
|
152
|
+
const failedTotal = ALL_ACTIONS.reduce((sum, a) => sum + input.byAction[a].failed, 0);
|
|
153
|
+
if (failedTotal >= 3) {
|
|
154
|
+
patterns.push(`${failedTotal} decisions ended in failed outcomes.`);
|
|
155
|
+
suggestions.push('Multiple failures — check cron logs for the failing job names. May indicate a tool that needs maintenance, a budget cap being hit, or a misconfigured trigger.');
|
|
156
|
+
}
|
|
157
|
+
if (patterns.length === 0) {
|
|
158
|
+
patterns.push('No notable patterns detected — calibration looks healthy for this window.');
|
|
159
|
+
}
|
|
160
|
+
return { patterns, suggestions };
|
|
161
|
+
}
|
|
162
|
+
// ── Markdown formatter ───────────────────────────────────────────────
|
|
163
|
+
const ACTION_LABELS = {
|
|
164
|
+
act_now: 'Doing now',
|
|
165
|
+
queue: 'Queued',
|
|
166
|
+
ask_user: 'Needs you',
|
|
167
|
+
snooze: 'Snoozed',
|
|
168
|
+
ignore: 'Skipped',
|
|
169
|
+
};
|
|
170
|
+
export function formatReflectionReport(r) {
|
|
171
|
+
const lines = [];
|
|
172
|
+
lines.push(`# Decision reflection — ${r.slug}`);
|
|
173
|
+
lines.push('');
|
|
174
|
+
lines.push(`Generated: ${r.generatedAt} `);
|
|
175
|
+
lines.push(`Window: last ${r.windowDays} day(s) `);
|
|
176
|
+
lines.push(`Total decisions: **${r.totalDecisions}**`);
|
|
177
|
+
lines.push('');
|
|
178
|
+
if (r.totalDecisions === 0) {
|
|
179
|
+
lines.push('## No decisions in window');
|
|
180
|
+
lines.push('');
|
|
181
|
+
for (const p of r.patterns)
|
|
182
|
+
lines.push(`- ${p}`);
|
|
183
|
+
if (r.suggestions.length > 0) {
|
|
184
|
+
lines.push('');
|
|
185
|
+
lines.push('### Suggestions');
|
|
186
|
+
for (const s of r.suggestions)
|
|
187
|
+
lines.push(`- ${s}`);
|
|
188
|
+
}
|
|
189
|
+
return lines.join('\n');
|
|
190
|
+
}
|
|
191
|
+
lines.push('## By action');
|
|
192
|
+
lines.push('');
|
|
193
|
+
lines.push('| Action | Decided | Outcomes | Advanced | Blocked | Failed | Success |');
|
|
194
|
+
lines.push('|---|---:|---:|---:|---:|---:|---:|');
|
|
195
|
+
for (const a of ALL_ACTIONS) {
|
|
196
|
+
const b = r.byAction[a];
|
|
197
|
+
if (b.decided === 0)
|
|
198
|
+
continue;
|
|
199
|
+
const pct = b.successRatePct === null ? '—' : `${b.successRatePct}%`;
|
|
200
|
+
lines.push(`| ${ACTION_LABELS[a]} (${a}) | ${b.decided} | ${b.withOutcomes} | ${b.advanced} | ${b.blocked} | ${b.failed} | ${pct} |`);
|
|
201
|
+
}
|
|
202
|
+
if (r.topSources.length > 0) {
|
|
203
|
+
lines.push('');
|
|
204
|
+
lines.push('## Top sources');
|
|
205
|
+
for (const s of r.topSources) {
|
|
206
|
+
lines.push(`- **${s.source}**: ${s.count}`);
|
|
207
|
+
}
|
|
208
|
+
}
|
|
209
|
+
lines.push('');
|
|
210
|
+
lines.push('## Patterns');
|
|
211
|
+
for (const p of r.patterns)
|
|
212
|
+
lines.push(`- ${p}`);
|
|
213
|
+
if (r.suggestions.length > 0) {
|
|
214
|
+
lines.push('');
|
|
215
|
+
lines.push('## Suggestions');
|
|
216
|
+
for (const s of r.suggestions)
|
|
217
|
+
lines.push(`- ${s}`);
|
|
218
|
+
}
|
|
219
|
+
return lines.join('\n');
|
|
220
|
+
}
|
|
221
|
+
/**
|
|
222
|
+
* Compact summary for working-memory append. Skips the full table and
|
|
223
|
+
* keeps just the patterns + suggestions so the agent's prompt doesn't
|
|
224
|
+
* get bloated with raw stats.
|
|
225
|
+
*/
|
|
226
|
+
export function formatReflectionSummary(r) {
|
|
227
|
+
const lines = [];
|
|
228
|
+
lines.push(`### Self-reflection (${r.generatedAt.slice(0, 10)}, last ${r.windowDays}d, ${r.totalDecisions} decisions)`);
|
|
229
|
+
lines.push('');
|
|
230
|
+
for (const p of r.patterns)
|
|
231
|
+
lines.push(`- ${p}`);
|
|
232
|
+
if (r.suggestions.length > 0) {
|
|
233
|
+
lines.push('');
|
|
234
|
+
lines.push('**Tuning suggestions:**');
|
|
235
|
+
for (const s of r.suggestions)
|
|
236
|
+
lines.push(`- ${s}`);
|
|
237
|
+
}
|
|
238
|
+
return lines.join('\n');
|
|
239
|
+
}
|
|
240
|
+
//# sourceMappingURL=decision-reflection.js.map
|
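The owner filter and the two-pass outcome pairing in `analyzeAgentDecisions` above can be tried on in-memory rows. This is a sketch under assumptions: the sample records, the source names `'goal'` and `'inbox'`, and the helper `pairDecisions` are invented for illustration, and the real ledger rows may carry more fields.

```javascript
// Sample ledger rows (invented). An outcome row reuses the id of the decision
// it resolves and carries an `outcome` field.
const sampleRecords = [
  { id: 'd1', context: { owner: 'ross' }, decision: { action: 'act_now', source: 'goal' } },
  { id: 'd1', context: { owner: 'ross' }, outcome: { status: 'advanced' } },
  { id: 'd2', context: {}, decision: { action: 'queue', source: 'inbox' } },
  { id: 'd3', context: { owner: 'sasha' }, decision: { action: 'queue', source: 'inbox' } },
];

function pairDecisions(records, slug) {
  // Owned rows must match the slug; unowned rows fall to the daemon ("clementine").
  const mine = records.filter((r) =>
    r.context.owner ? r.context.owner === slug : slug === 'clementine');
  // Pass 1: index outcome status by decision id.
  const outcomeById = new Map();
  for (const r of mine) {
    if (r.outcome) outcomeById.set(r.id, r.outcome.status);
  }
  // Pass 2: keep the original decisions and attach their outcome, if any.
  return mine
    .filter((r) => !r.outcome)
    .map((r) => ({ id: r.id, action: r.decision.action, outcome: outcomeById.get(r.id) ?? null }));
}

const rossDecisions = pairDecisions(sampleRecords, 'ross');
const daemonDecisions = pairDecisions(sampleRecords, 'clementine');
console.log(rossDecisions, daemonDecisions);
```

Because outcome rows are indexed before decisions are counted, the order of rows in the ledger does not matter for pairing.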
|
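The low-success checks in `derivePatterns` above combine a minimum sample size with a null-safe rate comparison. A minimal sketch of that predicate, assuming a hypothetical `isMiscalibrated` helper that is not in the package:

```javascript
const LOW_SUCCESS_THRESHOLD = 50;

// Hypothetical predicate mirroring the act_now and queue checks in derivePatterns.
function isMiscalibrated(bucket) {
  // Require at least 3 recorded outcomes; a null rate defaults to 100 so it never flags.
  return bucket.withOutcomes >= 3 && (bucket.successRatePct ?? 100) < LOW_SUCCESS_THRESHOLD;
}

console.log(isMiscalibrated({ withOutcomes: 3, successRatePct: 33 }));
console.log(isMiscalibrated({ withOutcomes: 2, successRatePct: 0 }));
```

The sample-size floor keeps a single bad outcome from producing a tuning suggestion, and a rate of exactly 50 is not flagged because the comparison is strict.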
package/dist/gateway/cron-scheduler.js
CHANGED
@@ -457,11 +457,27 @@ export class CronScheduler {
|
|
|
457
457
|
// phase updates for deep-mode runs get routed back to the originating
|
|
458
458
|
// session instead of fanning out to every registered channel.
|
|
459
459
|
const isDeepMode = (jobName) => jobName.startsWith('deep-');
|
|
460
|
-
// Wire up push notifications for unleashed task completions
|
|
460
|
+
// Wire up push notifications for unleashed task completions.
|
|
461
|
+
//
|
|
462
|
+
// This callback is only meant for AD-HOC unleashed tasks (chat-triggered
|
|
463
|
+
// "deep mode" follow-ups that didn't go through a registered job). Three
|
|
464
|
+
// other paths already own their own delivery and would otherwise produce
|
|
465
|
+
// double-dispatches:
|
|
466
|
+
//
|
|
467
|
+
// 1. deep-mode (`deep-*`) → router handles delivery via _deliverDeepResult
|
|
468
|
+
// 2. background tasks (`bg:*`) → processBackgroundTasks dispatches result
|
|
469
|
+
// 3. registered cron jobs → cron-runner success path dispatches at line ~1115
|
|
470
|
+
//
|
|
471
|
+
// Each path is gated below; without the guards Sasha's morning brief was
|
|
472
|
+
// landing twice in her Discord channel.
|
|
461
473
|
this.gateway.setUnleashedCompleteCallback((jobName, result) => {
|
|
462
474
|
this.completedJobs.set(jobName, Date.now());
|
|
463
475
|
if (isDeepMode(jobName))
|
|
464
|
-
return; //
|
|
476
|
+
return; // (1) deep-mode router
|
|
477
|
+
if (jobName.startsWith('bg:'))
|
|
478
|
+
return; // (2) background-task dispatcher
|
|
479
|
+
if (this.jobs.some(j => j.name === jobName))
|
|
480
|
+
return; // (3) registered cron job
|
|
465
481
|
if (result && result !== '__NOTHING__') {
|
|
466
482
|
const slug = jobName.includes(':') ? jobName.split(':')[0] : undefined;
|
|
467
483
|
// Strip system metadata for clean conversational delivery
|
|
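The three guards added to the callback above can be read as a single predicate. The sketch below assumes a hypothetical `isAdHocUnleashed` function and invented job names; the real callback uses `this.jobs` and returns early instead of returning a boolean.

```javascript
// Hypothetical standalone version of the three guards in the callback.
function isAdHocUnleashed(jobName, registeredJobs) {
  if (jobName.startsWith('deep-')) return false; // (1) deep-mode router delivers
  if (jobName.startsWith('bg:')) return false; // (2) background-task dispatcher delivers
  if (registeredJobs.some((j) => j.name === jobName)) return false; // (3) cron runner delivers
  return true; // only ad-hoc unleashed tasks notify from this callback
}

// Invented job names for illustration.
const registeredJobs = [{ name: 'sasha:morning-brief' }];
console.log(isAdHocUnleashed('sasha:morning-brief', registeredJobs));
console.log(isAdHocUnleashed('ross:follow-up', registeredJobs));
```

A registered job such as the morning brief is skipped here because the cron runner already dispatches its result, which is the double delivery the comment describes.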
package/dist/tools/decision-reflection-tools.d.ts
@@ -0,0 +1,17 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Clementine TypeScript — Decision-loop reflection MCP tool.
|
|
3
|
+
*
|
|
4
|
+
* `decision_reflection` runs the analysis from agent/decision-reflection.ts
|
|
5
|
+
* for an agent and surfaces the result as a markdown report. Optionally
|
|
6
|
+
* persists the report to the agent's reflections history and/or appends
|
|
7
|
+
* a summary to their working-memory so the next heartbeat tick reads it
|
|
8
|
+
* as prompt context.
|
|
9
|
+
*
|
|
10
|
+
* Intended usage:
|
|
11
|
+
* - Manual call by the owner ("Clementine, reflect on your week")
|
|
12
|
+
* - Weekly cron job that calls this for each active agent
|
|
13
|
+
* - On-demand by an agent when they suspect they're miscalibrated
|
|
14
|
+
*/
|
|
15
|
+
import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
|
|
16
|
+
export declare function registerDecisionReflectionTools(server: McpServer): void;
|
|
17
|
+
//# sourceMappingURL=decision-reflection-tools.d.ts.map
|
|
package/dist/tools/decision-reflection-tools.js
@@ -0,0 +1,83 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Clementine TypeScript — Decision-loop reflection MCP tool.
|
|
3
|
+
*
|
|
4
|
+
* `decision_reflection` runs the analysis from agent/decision-reflection.ts
|
|
5
|
+
* for an agent and surfaces the result as a markdown report. Optionally
|
|
6
|
+
* persists the report to the agent's reflections history and/or appends
|
|
7
|
+
* a summary to their working-memory so the next heartbeat tick reads it
|
|
8
|
+
* as prompt context.
|
|
9
|
+
*
|
|
10
|
+
* Intended usage:
|
|
11
|
+
* - Manual call by the owner ("Clementine, reflect on your week")
|
|
12
|
+
* - Weekly cron job that calls this for each active agent
|
|
13
|
+
* - On-demand by an agent when they suspect they're miscalibrated
|
|
14
|
+
*/
|
|
15
|
+
import { existsSync, mkdirSync, readFileSync, writeFileSync } from 'node:fs';
|
|
16
|
+
import path from 'node:path';
|
|
17
|
+
import { z } from 'zod';
|
|
18
|
+
import { analyzeAgentDecisions, formatReflectionReport, formatReflectionSummary, } from '../agent/decision-reflection.js';
|
|
19
|
+
import { ACTIVE_AGENT_SLUG, AGENTS_DIR, logger, textResult } from './shared.js';
|
|
20
|
+
function reflectionsDir(slug) {
|
|
21
|
+
return path.join(AGENTS_DIR, slug, 'reflections');
|
|
22
|
+
}
|
|
23
|
+
function workingMemoryPath(slug) {
|
|
24
|
+
return path.join(AGENTS_DIR, slug, 'working-memory.md');
|
|
25
|
+
}
|
|
26
|
+
function todayStamp() {
|
|
27
|
+
return new Date().toISOString().slice(0, 10);
|
|
28
|
+
}
|
|
29
|
+
/** Append a summary block to working-memory.md, creating the file if needed. */
|
|
30
|
+
function appendToWorkingMemory(slug, summary) {
|
|
31
|
+
const file = workingMemoryPath(slug);
|
|
32
|
+
mkdirSync(path.dirname(file), { recursive: true });
|
|
33
|
+
let existing = '';
|
|
34
|
+
if (existsSync(file))
|
|
35
|
+
existing = readFileSync(file, 'utf-8');
|
|
36
|
+
const appended = (existing.endsWith('\n') || existing === '' ? existing : existing + '\n')
|
|
37
|
+
+ '\n'
|
|
38
|
+
+ summary
|
|
39
|
+
+ '\n';
|
|
40
|
+
writeFileSync(file, appended);
|
|
41
|
+
}
|
|
42
|
+
export function registerDecisionReflectionTools(server) {
|
|
43
|
+
server.tool('decision_reflection', 'Run a self-reflection on your recent autonomous decisions: read the proactive ledger, compute success rates per action type, identify miscalibration patterns, and surface concrete tuning suggestions. Use to spot when you are over-acting, under-acting, or going dormant. Optionally writes a summary to working-memory so your next heartbeat tick reads it as context.', {
|
|
44
|
+
slug: z.string().optional().describe('Agent slug to reflect on. Defaults to the calling agent (or "clementine" for the daemon).'),
|
|
45
|
+
window_days: z.number().optional().describe('Window in days to analyze. Default 7. Range 1-90.'),
|
|
46
|
+
save_to_history: z.boolean().optional().describe('Persist the full report to vault/00-System/agents/<slug>/reflections/<date>.md (default true).'),
|
|
47
|
+
append_to_memory: z.boolean().optional().describe('Append a compact summary to working-memory.md so the next tick reads it (default false). Be deliberate — repeated appends bloat the prompt.'),
|
|
48
|
+
}, async ({ slug, window_days, save_to_history, append_to_memory }) => {
|
|
49
|
+
const targetSlug = slug || ACTIVE_AGENT_SLUG || 'clementine';
|
|
50
|
+
const window = Math.max(1, Math.min(90, typeof window_days === 'number' ? window_days : 7));
|
|
51
|
+
const persistHistory = save_to_history !== false; // default true
|
|
52
|
+
const updateMemory = append_to_memory === true; // default false (explicit opt-in)
|
|
53
|
+
const reflection = analyzeAgentDecisions(targetSlug, window);
|
|
54
|
+
const report = formatReflectionReport(reflection);
|
|
55
|
+
const writes = [];
|
|
56
|
+
if (persistHistory) {
|
|
57
|
+
try {
|
|
58
|
+
const dir = reflectionsDir(targetSlug);
|
|
59
|
+
mkdirSync(dir, { recursive: true });
|
|
60
|
+
const file = path.join(dir, `${todayStamp()}.md`);
|
|
61
|
+
writeFileSync(file, report);
|
|
62
|
+
writes.push(`Saved to ${file}`);
|
|
63
|
+
}
|
|
64
|
+
catch (err) {
|
|
65
|
+
logger.warn({ err, slug: targetSlug }, 'Failed to save reflection history');
|
|
66
|
+
writes.push(`Failed to save history: ${String(err).slice(0, 200)}`);
|
|
67
|
+
}
|
|
68
|
+
}
|
|
69
|
+
if (updateMemory) {
|
|
70
|
+
try {
|
|
71
|
+
appendToWorkingMemory(targetSlug, formatReflectionSummary(reflection));
|
|
72
|
+
writes.push(`Appended summary to ${workingMemoryPath(targetSlug)}`);
|
|
73
|
+
}
|
|
74
|
+
catch (err) {
|
|
75
|
+
logger.warn({ err, slug: targetSlug }, 'Failed to append reflection to working-memory');
|
|
76
|
+
writes.push(`Failed to update working-memory: ${String(err).slice(0, 200)}`);
|
|
77
|
+
}
|
|
78
|
+
}
|
|
79
|
+
const footer = writes.length > 0 ? '\n\n---\n' + writes.map((w) => `- ${w}`).join('\n') : '';
|
|
80
|
+
return textResult(report + footer);
|
|
81
|
+
});
|
|
82
|
+
}
|
|
83
|
+
//# sourceMappingURL=decision-reflection-tools.js.map
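The tool handler above resolves its optional arguments into a clamped window and two booleans with opposite defaults. A minimal sketch of that resolution, assuming a hypothetical `resolveReflectionOptions` helper (the package inlines this logic in the handler):

```javascript
// Hypothetical helper mirroring the handler's argument defaults.
function resolveReflectionOptions(args) {
  return {
    // Default 7 days, clamped to the documented 1-90 range.
    windowDays: Math.max(1, Math.min(90, typeof args.window_days === 'number' ? args.window_days : 7)),
    // Defaults to true: only an explicit false disables the history file.
    persistHistory: args.save_to_history !== false,
    // Defaults to false: only an explicit true appends to working-memory.
    updateMemory: args.append_to_memory === true,
  };
}

console.log(resolveReflectionOptions({}));
console.log(resolveReflectionOptions({ window_days: 365, append_to_memory: true }));
```

The asymmetric defaults mean a bare call saves a report but leaves the agent's prompt context untouched, which fits the tool description's warning that repeated appends bloat the prompt.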
|
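The string handling inside `appendToWorkingMemory` above is easier to follow without the file I/O. The sketch below isolates it as a pure function; `appendBlock` is a hypothetical name, and the real function also reads and writes `working-memory.md`.

```javascript
// Pure string version of the append step (no file access).
function appendBlock(existing, summary) {
  // Make sure existing content ends with a newline before adding the block.
  const base = existing.endsWith('\n') || existing === '' ? existing : existing + '\n';
  // A blank line separates the summary from earlier content.
  return base + '\n' + summary + '\n';
}

console.log(JSON.stringify(appendBlock('notes', '### Self-reflection')));
```

Content with or without a trailing newline ends up with the same result, so repeated appends stay separated by exactly one blank line.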
package/dist/tools/mcp-server.js
CHANGED
|
@@ -28,6 +28,7 @@ import { registerArtifactTools } from './artifact-tools.js';
|
|
|
28
28
|
import { registerBrainTools } from './brain-tools.js';
|
|
29
29
|
import { registerAgentHeartbeatTools } from './agent-heartbeat-tools.js';
|
|
30
30
|
import { registerBackgroundTaskTools } from './background-task-tools.js';
|
|
31
|
+
import { registerDecisionReflectionTools } from './decision-reflection-tools.js';
|
|
31
32
|
// ── Server ──────────────────────────────────────────────────────────────
|
|
32
33
|
const serverName = (env['ASSISTANT_NAME'] ?? 'Clementine').toLowerCase() + '-tools';
|
|
33
34
|
const server = new McpServer({ name: serverName, version: '1.0.0' });
|
|
@@ -43,6 +44,7 @@ registerArtifactTools(server);
|
|
|
43
44
|
registerBrainTools(server);
|
|
44
45
|
registerAgentHeartbeatTools(server);
|
|
45
46
|
registerBackgroundTaskTools(server);
|
|
47
|
+
registerDecisionReflectionTools(server);
|
|
46
48
|
// ── Main ────────────────────────────────────────────────────────────────
|
|
47
49
|
async function main() {
|
|
48
50
|
// Initialize memory store and run full sync on startup
|
package/package.json
CHANGED
package/vault/00-System/CRON.md
CHANGED
|
@@ -37,6 +37,21 @@ jobs:
|
|
|
37
37
|
4. Write a brief summary of the day in today's daily note under ## Summary
|
|
38
38
|
tier: 1
|
|
39
39
|
enabled: true
|
|
40
|
+
|
|
41
|
+
- name: weekly-decision-reflection
|
|
42
|
+
schedule: "0 9 * * 0"
|
|
43
|
+
prompt: >
|
|
44
|
+
Run a self-reflection on the past week's autonomous decisions.
|
|
45
|
+
1. Call `decision_reflection` with window_days=7, save_to_history=true, append_to_memory=true.
|
|
46
|
+
This reads the proactive ledger, computes per-action success rates, identifies
|
|
47
|
+
miscalibration patterns, and writes a tuning note to your working-memory so the
|
|
48
|
+
next heartbeat tick reads it as context.
|
|
49
|
+
2. For each specialist on the team (use `team_list` to enumerate), also run
|
|
50
|
+
`decision_reflection` with their slug, save_to_history=true, append_to_memory=true.
|
|
51
|
+
3. Briefly summarize in today's daily note under "## Decision reflection" — list each
|
|
52
|
+
agent's headline pattern (e.g., "Ross: act_now success 33%, raise threshold").
|
|
53
|
+
tier: 1
|
|
54
|
+
enabled: true
|
|
40
55
|
tags:
|
|
41
56
|
- system
|
|
42
57
|
- cron
|
|
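The `weekly-decision-reflection` job above uses the schedule `"0 9 * * 0"`. As a reading aid, the sketch below splits a five-field cron expression into named fields. `splitCron` is a hypothetical helper, it does no validation, and which timezone the scheduler applies is not shown in this diff.

```javascript
// Minimal five-field cron splitter for illustration; it does not validate
// ranges, lists, or step values the way a real cron library would.
function splitCron(expr) {
  const [minute, hour, dayOfMonth, month, dayOfWeek] = expr.trim().split(/\s+/);
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

const weeklySchedule = splitCron('0 9 * * 0');
// dayOfWeek "0" is Sunday in standard cron, so this fires at 09:00 every Sunday.
console.log(weeklySchedule);
```

This matches the "9:00 AM Sundays" entry in the job table and the "Sundays 9am" wording in the README.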
@@ -53,6 +68,7 @@ Scheduled tasks that run automatically at specific times. Edit the frontmatter a
|
|
|
53
68
|
| morning-briefing | 8:00 AM daily | Comprehensive morning briefing |
|
|
54
69
|
| weekly-review | 6:00 PM Fridays | Weekly summary + planning |
|
|
55
70
|
| daily-memory-cleanup | 10:00 PM daily | Promote daily facts to long-term memory |
|
|
71
|
+
| weekly-decision-reflection | 9:00 AM Sundays | Per-agent self-tuning from proactive ledger |
|
|
56
72
|
|
|
57
73
|
## Schedule Syntax
|
|
58
74
|
|