@ducci/jarvis 1.0.33 → 1.0.35
- package/docs/findings/017-looping-intervention-and-lossy-checkpoint.md +110 -0
- package/package.json +2 -1
- package/src/channels/telegram/index.js +26 -0
- package/src/scripts/onboarding.js +173 -91
- package/src/server/agent.js +72 -13
- package/src/server/config.js +12 -6
- package/src/server/provider.js +149 -0
package/docs/findings/017-looping-intervention-and-lossy-checkpoint.md
ADDED
@@ -0,0 +1,110 @@
+# Finding 017: Looping Intervention and Lossy Checkpoint
+
+**Date:** 2026-03-04
+**Severity:** High — agent burned 40 iterations (4 full runs) on a structurally impossible task without escaping; concrete facts like file paths regressed between handoff runs
+**Status:** Fixed
+
+---
+
+## Observed Session
+
+Session from a remote server. Model: `nvidia/nemotron-3-nano-30b-a3b:free`. User requested a ZAP security scanning workflow (create project dir, write README + scan.sh, run a test scan).
+
+| Entry | Trigger | Status | Iterations | Notes |
+|-------|---------|--------|------------|-------|
+| 1 | "hello" | ok | 0 | greeting |
+| 2 | ZAP task | checkpoint_reached | 10 | ZAP daemon lock blocked scan |
+| 3 | handoff resume | checkpoint_reached | 10 | same lock, same result |
+| 4 | zero-progress | intervention_required | 0 | correctly detected |
+| 5 | "Ok can you do it?" | checkpoint_reached | 10 | loop restarted, same result |
+| 6 | handoff resume | checkpoint_reached | 10 | same lock again |
+| 7 | zero-progress | intervention_required | 0 | detected again, session abandoned |
+
+Total: 40 wasted iterations. Additionally, between Entry 2 and Entry 6, the agent changed the project path from `/root/.jarvis/projects/cybersecurity` to `/root/projects/cybersecurity`, causing `list_dir` to fail at the start of that run.
+
+---
+
+## Root Cause 1: Zero-Progress Detection Resets Across User Messages
+
+### What happened
+
+`previousRemaining` is a local variable initialized to `null` on every call to `_runHandleChat`. Zero-progress detection requires two identical `checkpoint.remaining` values, but both must occur within the same invocation. When `intervention_required` fires and the user sends a new message, `_runHandleChat` is called fresh with `previousRemaining = null`. The detection resets.
+
+The result: each new user message grants the agent 2 full runs (20 iterations) before zero-progress fires again — regardless of how many times the cycle has already repeated. The intervention mechanism correctly identifies "stuck" but provides no structural escape when the user replies.
+
+### Fix
+
+1. When zero-progress fires, persist `session.metadata.lastCheckpointRemaining = currentRemaining`.
+2. In `_runHandleChat`, initialize `previousRemaining` from `session.metadata.lastCheckpointRemaining` (cleared after reading). Zero-progress now fires after just one run (10 iterations) on the next user message if the agent produces the same remaining.
+3. Inject a note into `userMessageWithContext` when `lastCheckpointRemaining` was set:
+
+```
+[System: This task previously hit zero-progress and required intervention. If the user has given new direction or clarification, follow it. Otherwise, immediately explain what specific obstacle is blocking progress — do not resume the same failing approach.]
+```
+
+This gives the agent explicit guidance to respond to what the user actually asked instead of blindly resuming.
+
+**File:** `src/server/agent.js` — `_runHandleChat`
+
+---
+
+## Root Cause 2: Checkpoint Loses Concrete Facts Between Runs
+
+### What happened
+
+The checkpoint schema had `progress`, `remaining`, and `failedApproaches` — all natural-language prose. Concrete facts the agent discovers during a run (file paths, binary locations, config values) are paraphrased or omitted when the agent writes the summary. On resume, the model reconstructs these facts from vague prose and sometimes gets them wrong.
+
+In this session, the project directory `/root/.jarvis/projects/cybersecurity` was written correctly in runs 1–3 but reconstructed as `/root/projects/cybersecurity` in run 6. The first action of that run was `list_dir` on the wrong path, which failed.
+
+### Fix
+
+1. Added a `state` field to `WRAP_UP_NOTE`'s checkpoint schema: a flat key-value JSON object for concrete facts confirmed by tool output (file paths created, binary locations found, config values discovered).
+
+2. After each handoff, merge `run.checkpoint.state` into `session.metadata.checkpointState` (later runs overwrite earlier values for the same key).
+
+3. When building `resumeContent` for the next handoff run, inject the accumulated state:
+
+```
+[System: Known facts from previous runs:
+- projectDir: /root/.jarvis/projects/cybersecurity
+- zapBinary: /snap/bin/zaproxy
+- scanScriptPath: /root/.jarvis/projects/cybersecurity/scan.sh]
+```
+
+4. When re-entering after zero-progress (via `wasZeroProgress`), also include the state in the `userMessageWithContext` injection so facts are available immediately on the first run.
+
+5. `session.metadata.checkpointState` is reset on each new user message (same lifecycle as `failedApproaches`) to avoid stale facts from previous tasks leaking into new ones.
+
+**File:** `src/server/agent.js` — `WRAP_UP_NOTE`, checkpoint normalization, `_runHandleChat`
+
+---
+
+## Interaction Between the Two Fixes
+
+The zero-progress note injected into `userMessageWithContext` now also includes the `priorCheckpointState` (captured before the metadata reset), so the agent that receives the note also has the concrete facts it needs if the user does provide new direction. This means both fixes compound: the agent is told it was stuck AND given the facts it needs to act on whatever the user says.
+
+---
+
+## What Was Not Changed
+
+- The existing exact-match loop detector (`loopTracker`) — unchanged
+- The consecutive failure detector — unchanged
+- The `maxHandoffs` cap — unchanged
+- The `hasConsecutiveModelErrors` escalation — unchanged
+- The context strip on checkpoint/intervention — unchanged
+
+---
+
+## Files Changed
+
+| File | Change |
+|------|--------|
+| `src/server/agent.js` | `WRAP_UP_NOTE` — added `state` field to checkpoint schema |
+| `src/server/agent.js` | Checkpoint normalization — normalize `cp.state` to `{}` if missing/malformed |
+| `src/server/agent.js` | `_runHandleChat` — capture `wasZeroProgress`, `priorCheckpointRemaining`, `priorCheckpointState` before metadata reset |
+| `src/server/agent.js` | `_runHandleChat` — inject zero-progress note + prior state into `userMessageWithContext` when applicable |
+| `src/server/agent.js` | `_runHandleChat` — reset `lastCheckpointRemaining` and `checkpointState` on new user message |
+| `src/server/agent.js` | `_runHandleChat` — initialize `previousRemaining` from `priorCheckpointRemaining` |
+| `src/server/agent.js` | Zero-progress block — persist `session.metadata.lastCheckpointRemaining` |
+| `src/server/agent.js` | Handoff accumulation — merge `run.checkpoint.state` into `session.metadata.checkpointState` |
+| `src/server/agent.js` | `resumeContent` — inject accumulated `checkpointState` as known facts |
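Taken together, the finding's two fixes amount to a little persisted metadata plus string formatting. The sketch below is illustrative only: the helper names and the bare `session` object are hypothetical stand-ins, while the metadata keys (`lastCheckpointRemaining`, `checkpointState`) and the injected-note shape come from the finding itself.

```javascript
// Sketch of the two mechanisms from Finding 017. Hypothetical helper names;
// only the metadata keys and message shapes mirror the finding.

// Root Cause 1 fix: persist the stuck "remaining" so detection survives a new
// user message instead of resetting with the local previousRemaining variable.
function onZeroProgress(session, currentRemaining) {
  session.metadata.lastCheckpointRemaining = currentRemaining;
}

function initPreviousRemaining(session) {
  // Read once, then clear: zero-progress can fire after a single run next turn.
  const prior = session.metadata.lastCheckpointRemaining || null;
  session.metadata.lastCheckpointRemaining = null;
  return prior;
}

// Root Cause 2 fix: accumulate concrete facts; later runs win on the same key.
function mergeCheckpointState(session, runState) {
  session.metadata.checkpointState = {
    ...(session.metadata.checkpointState || {}),
    ...(runState || {}),
  };
}

function formatKnownFacts(state) {
  const lines = Object.entries(state).map(([k, v]) => `- ${k}: ${v}`);
  return lines.length
    ? `[System: Known facts from previous runs:\n${lines.join('\n')}]`
    : '';
}

const session = { metadata: {} };
onZeroProgress(session, 'run the ZAP test scan');
mergeCheckpointState(session, { projectDir: '/root/.jarvis/projects/cybersecurity' });
mergeCheckpointState(session, { zapBinary: '/snap/bin/zaproxy' });

console.log(initPreviousRemaining(session)); // 'run the ZAP test scan'
console.log(session.metadata.lastCheckpointRemaining); // null
console.log(formatKnownFacts(session.metadata.checkpointState));
```

Because `initPreviousRemaining` clears the key as it reads it, a session that recovers normally does not keep re-triggering the stuck-task note on later turns.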
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@ducci/jarvis",
-  "version": "1.0.
+  "version": "1.0.35",
   "description": "A fully automated agent system that lives on a server.",
   "main": "./src/index.js",
   "type": "module",
@@ -36,6 +36,7 @@
   },
   "license": "ISC",
   "dependencies": {
+    "@anthropic-ai/sdk": "^0.39.0",
     "@grammyjs/runner": "^2.0.3",
     "chalk": "^5.6.2",
     "commander": "^14.0.3",
package/src/channels/telegram/index.js
CHANGED
@@ -1,6 +1,7 @@
 import { Bot } from 'grammy';
 import { run } from '@grammyjs/runner';
 import { handleChat } from '../../server/agent.js';
+import { loadSession } from '../../server/sessions.js';
 import { load, save } from './sessions.js';
 
 export async function startTelegramChannel(config) {
@@ -13,8 +14,33 @@ export async function startTelegramChannel(config) {
 
   await bot.api.setMyCommands([
     { command: 'new', description: 'Start a fresh session' },
+    { command: 'usage', description: 'Show token usage for the current session' },
   ]);
 
+  bot.command('usage', async (ctx) => {
+    const userId = ctx.from?.id;
+    if (!allowedUserIds.includes(userId)) return;
+
+    const chatId = ctx.chat.id;
+    const sessionId = sessions[chatId];
+    if (!sessionId) {
+      await ctx.reply('No active session. Send a message to start one.');
+      return;
+    }
+
+    const session = await loadSession(sessionId);
+    const u = session?.metadata?.tokenUsage;
+    if (!u || (u.prompt === 0 && u.completion === 0)) {
+      await ctx.reply('No token usage recorded for this session yet.');
+      return;
+    }
+
+    const total = u.prompt + u.completion;
+    await ctx.reply(
+      `Token usage for current session:\nIn: ${u.prompt.toLocaleString()}\nOut: ${u.completion.toLocaleString()}\nTotal: ${total.toLocaleString()}`
+    );
+  });
+
   bot.command('new', async (ctx) => {
     const userId = ctx.from?.id;
     if (!allowedUserIds.includes(userId)) return;
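The `/usage` handler above reads `session.metadata.tokenUsage`, which the agent populates through the `accumulateUsage` helper added in the agent.js diff further down. A self-contained sketch of that accumulation (the `result` objects mimic an OpenAI-style completion payload):

```javascript
// Mirrors accumulateUsage from the agent.js diff: sum prompt/completion token
// counts across model calls, tolerating missing usage blocks.
function accumulateUsage(accum, result) {
  const u = result?.usage;
  if (!u) return;
  accum.prompt += u.prompt_tokens || 0;
  accum.completion += u.completion_tokens || 0;
}

const accum = { prompt: 0, completion: 0 };
accumulateUsage(accum, { usage: { prompt_tokens: 1200, completion_tokens: 340 } });
accumulateUsage(accum, { usage: { prompt_tokens: 900 } }); // missing completion_tokens counts as 0
accumulateUsage(accum, null);                              // no usage block: ignored
console.log(accum); // { prompt: 2100, completion: 340 }
```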
package/src/scripts/onboarding.js
CHANGED
@@ -52,17 +52,13 @@ function saveSettings(settings) {
   fs.writeFileSync(settingsFile, JSON.stringify(settings, null, 2), 'utf8');
 }
 
-async function
+async function fetchOpenRouterModels(apiKey) {
   console.log(chalk.blue('Fetching models from OpenRouter...'));
   try {
     const response = await fetch('https://openrouter.ai/api/v1/models', {
-      headers: {
-        'Authorization': `Bearer ${apiKey}`
-      }
+      headers: { 'Authorization': `Bearer ${apiKey}` }
     });
-    if (!response.ok) {
-      throw new Error(`HTTP error! status: ${response.status}`);
-    }
+    if (!response.ok) throw new Error(`HTTP error! status: ${response.status}`);
     const data = await response.json();
     return data.data;
   } catch (error) {
@@ -71,55 +67,114 @@ async function fetchModels(apiKey) {
   }
 }
 
+const ANTHROPIC_MODELS_FALLBACK = [
+  { id: 'claude-opus-4-6', description: 'Most capable' },
+  { id: 'claude-sonnet-4-6', description: 'Balanced' },
+  { id: 'claude-3-5-sonnet-20241022', description: 'Balanced (stable)' },
+  { id: 'claude-haiku-4-5-20251001', description: 'Fast & cheap' },
+  { id: 'claude-3-5-haiku-20241022', description: 'Fast & cheap (stable)' },
+];
+
+async function fetchAnthropicModels(apiKey) {
+  try {
+    const response = await fetch('https://api.anthropic.com/v1/models', {
+      headers: { 'x-api-key': apiKey, 'anthropic-version': '2023-06-01' }
+    });
+    if (!response.ok) throw new Error(`HTTP error! status: ${response.status}`);
+    const data = await response.json();
+    return data.data.map(m => ({ id: m.id, description: '' }));
+  } catch {
+    return ANTHROPIC_MODELS_FALLBACK;
+  }
+}
+
 async function run() {
   ensureDirectories();
 
   console.log(chalk.green.bold('\n=== Jarvis Setup ===\n'));
 
+  let settings = loadSettings();
+
+  // --- PROVIDER STEP ---
+  const { provider } = await inquirer.prompt([
+    {
+      type: 'list',
+      name: 'provider',
+      message: 'Which AI provider do you want to use?',
+      choices: [
+        { name: 'OpenRouter (access many models via one key)', value: 'openrouter' },
+        { name: 'Anthropic Direct (use your Anthropic API key)', value: 'anthropic' },
+      ],
+      default: settings.provider || 'openrouter',
+    }
+  ]);
+
   // --- API KEY STEP ---
-  let
-  let apiKey = existingKey;
+  let apiKey;
 
-  if (
-  const
-
-
-
-
-
-
+  if (provider === 'anthropic') {
+    const existingKey = loadEnvVar('ANTHROPIC_API_KEY');
+    apiKey = existingKey;
+
+    if (existingKey) {
+      const { keepKey } = await inquirer.prompt([
+        {
+          type: 'confirm',
+          name: 'keepKey',
+          message: 'An ANTHROPIC_API_KEY is already configured. Do you want to keep it?',
+          default: true,
+        }
+      ]);
+      if (!keepKey) apiKey = null;
+    }
+
+    if (!apiKey) {
+      const { newKey } = await inquirer.prompt([
+        {
+          type: 'password',
+          name: 'newKey',
+          message: 'Enter your Anthropic API key:',
+          validate: (input) => input.length >= 10 || 'API key must be at least 10 characters long.',
+        }
+      ]);
+      apiKey = newKey;
+      saveEnvVar('ANTHROPIC_API_KEY', apiKey);
+      console.log(chalk.green('Anthropic API key saved.'));
+    }
+  } else {
+    const existingKey = loadEnvVar('OPENROUTER_API_KEY');
+    apiKey = existingKey;
+
+    if (existingKey) {
+      const { keepKey } = await inquirer.prompt([
+        {
+          type: 'confirm',
+          name: 'keepKey',
+          message: 'An OPENROUTER_API_KEY is already configured. Do you want to keep it?',
+          default: true,
+        }
+      ]);
+      if (!keepKey) apiKey = null;
+    }
 
-  if (!
+    if (!apiKey) {
       const { newKey } = await inquirer.prompt([
         {
           type: 'password',
           name: 'newKey',
           message: 'Enter your OpenRouter API key:',
-          validate: (input) => input.length >= 10 || 'API key must be at least 10 characters long.'
+          validate: (input) => input.length >= 10 || 'API key must be at least 10 characters long.',
         }
       ]);
       apiKey = newKey;
      saveEnvVar('OPENROUTER_API_KEY', apiKey);
-      console.log(chalk.green('API key
+      console.log(chalk.green('OpenRouter API key saved.'));
     }
-  } else {
-    const { newKey } = await inquirer.prompt([
-      {
-        type: 'password',
-        name: 'newKey',
-        message: 'Enter your OpenRouter API key:',
-        validate: (input) => input.length >= 10 || 'API key must be at least 10 characters long.'
-      }
-    ]);
-    apiKey = newKey;
-    saveEnvVar('OPENROUTER_API_KEY', apiKey);
-    console.log(chalk.green('API key saved.'));
   }
 
   // --- MODEL SELECTION STEP ---
-
-  let selectedModel = settings.selectedModel;
+  // Reset model selection when switching providers
+  let selectedModel = settings.provider === provider ? settings.selectedModel : null;
 
   if (selectedModel) {
     const { keepModel } = await inquirer.prompt([
@@ -129,88 +184,115 @@ async function run() {
         message: `Current model is ${chalk.yellow(selectedModel)}. Do you want to keep it or change it?`,
         choices: [
           { name: 'Keep current model', value: true },
-          { name: 'Change model', value: false }
+          { name: 'Change model', value: false },
         ]
       }
     ]);
-
-    if (!keepModel) {
-      selectedModel = null;
-    }
+    if (!keepModel) selectedModel = null;
  }
 
   if (!selectedModel) {
-
-
-
-
-
-
-
-
-      ]
-    }
-  ]);
+    if (provider === 'anthropic') {
+      console.log(chalk.blue('Fetching available Claude models...'));
+      const models = await fetchAnthropicModels(apiKey);
+      const choices = models.map(m => ({
+        name: m.description ? `${m.id} ${chalk.dim(m.description)}` : m.id,
+        value: m.id,
+      }));
+      choices.push({ name: 'Enter model ID manually', value: '__manual__' });
 
-
-    const { manualModel } = await inquirer.prompt([
+      const { browsedModel } = await inquirer.prompt([
         {
-        type: '
-        name: '
-        message: '
-
+          type: 'list',
+          name: 'browsedModel',
+          message: 'Select a Claude model:',
+          choices,
+          pageSize: 20,
         }
       ]);
-
-
-    const models = await fetchModels(apiKey);
-    if (models.length === 0) {
-      console.log(chalk.yellow('Falling back to manual entry due to fetch failure.'));
+
+      if (browsedModel === '__manual__') {
        const { manualModel } = await inquirer.prompt([
          {
            type: 'input',
            name: 'manualModel',
-            message: 'Enter
-            validate: (input) => input.trim().length > 0 || 'Model ID cannot be empty.'
+            message: 'Enter Anthropic model ID (e.g., claude-sonnet-4-6):',
+            validate: (input) => input.trim().length > 0 || 'Model ID cannot be empty.',
          }
        ]);
        selectedModel = manualModel.trim();
      } else {
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-    }
-
-    const { browsedModel } = await inquirer.prompt([
+        selectedModel = browsedModel;
+      }
+    } else {
+      const { modelSelectionMethod } = await inquirer.prompt([
+        {
+          type: 'list',
+          name: 'modelSelectionMethod',
+          message: 'How would you like to select a model?',
+          choices: [
+            { name: 'Browse OpenRouter models', value: 'browse' },
+            { name: 'Enter model ID manually', value: 'manual' },
+          ]
+        }
+      ]);
+
+      if (modelSelectionMethod === 'manual') {
+        const { manualModel } = await inquirer.prompt([
          {
-        type: '
-        name: '
-        message: '
-
-        pageSize: 20
+            type: 'input',
+            name: 'manualModel',
+            message: 'Enter OpenRouter model ID (e.g., anthropic/claude-3.5-sonnet):',
+            validate: (input) => input.trim().length > 0 || 'Model ID cannot be empty.',
          }
        ]);
-      selectedModel =
+        selectedModel = manualModel.trim();
+      } else {
+        const models = await fetchOpenRouterModels(apiKey);
+        if (models.length === 0) {
+          console.log(chalk.yellow('Falling back to manual entry due to fetch failure.'));
+          const { manualModel } = await inquirer.prompt([
+            {
+              type: 'input',
+              name: 'manualModel',
+              message: 'Enter OpenRouter model ID:',
+              validate: (input) => input.trim().length > 0 || 'Model ID cannot be empty.',
+            }
+          ]);
+          selectedModel = manualModel.trim();
+        } else {
+          models.sort((a, b) => {
+            const isFreeA = a.pricing && parseFloat(a.pricing.prompt) === 0 && parseFloat(a.pricing.completion) === 0;
+            const isFreeB = b.pricing && parseFloat(b.pricing.prompt) === 0 && parseFloat(b.pricing.completion) === 0;
+            if (isFreeA && !isFreeB) return -1;
+            if (!isFreeA && isFreeB) return 1;
+            return a.id.localeCompare(b.id);
+          });
+          const choices = models.map(m => {
+            const isFree = m.pricing && parseFloat(m.pricing.prompt) === 0 && parseFloat(m.pricing.completion) === 0;
+            return { name: `${m.id} ${isFree ? chalk.green('(Free)') : ''}`, value: m.id };
+          });
+          const { browsedModel } = await inquirer.prompt([
+            {
+              type: 'list',
+              name: 'browsedModel',
+              message: 'Select a model:',
+              choices,
+              pageSize: 20,
+            }
+          ]);
+          selectedModel = browsedModel;
+        }
      }
    }
  }
 
+  const previousProvider = settings.provider || 'openrouter';
+  settings.provider = provider;
   settings.selectedModel = selectedModel;
-
-
+  // Reset fallback to provider-appropriate default when switching providers or on first run
+  if (!settings.fallbackModel || previousProvider !== provider) {
+    settings.fallbackModel = provider === 'anthropic' ? 'claude-haiku-4-5-20251001' : 'openrouter/free';
  }
  if (settings.maxIterations === undefined) {
    settings.maxIterations = 10;
package/src/server/agent.js
CHANGED
@@ -1,5 +1,5 @@
 import crypto from 'crypto';
-import
+import { createClient } from './provider.js';
 import { loadSystemPrompt, resolveSystemPrompt } from './config.js';
 import { loadSession, saveSession, createSession } from './sessions.js';
 import { loadTools, getToolDefinitions, executeTool } from './tools.js';
@@ -20,17 +20,25 @@ Respond with your normal JSON, but add a checkpoint field:
   "checkpoint": {
     "progress": "What has been fully completed — only include items confirmed by tool output (e.g., successful exec with exit code 0, or verified by ls/cat). Do not report planned steps as completed.",
     "remaining": "What still needs to be done to finish the task — as a plain text string, never an array or object.",
-    "failedApproaches": ["Concise description of each approach that was tried and failed, e.g. 'downloading subfinder via curl from GitHub releases — connection reset'. Omit array entries for things that succeeded. Leave as empty array if nothing failed."]
+    "failedApproaches": ["Concise description of each approach that was tried and failed, e.g. 'downloading subfinder via curl from GitHub releases — connection reset'. Omit array entries for things that succeeded. Leave as empty array if nothing failed."],
+    "state": {"factKey": "factValue — concrete facts confirmed by tool output this run: file paths created, binary locations found, config values discovered. Use short stable keys, e.g. projectDir, zapBinary, scanScriptPath. Omit or use {} if nothing concrete was discovered."}
   }
 }
 
-The checkpoint field will be used to automatically resume the task in the next run. failedApproaches is injected into the next run so the agent does not waste iterations repeating strategies that already failed. remaining must be a plain text string. failedApproaches must be a JSON array of strings.]`;
+The checkpoint field will be used to automatically resume the task in the next run. failedApproaches is injected into the next run so the agent does not waste iterations repeating strategies that already failed. state is injected verbatim so the next run does not need to rediscover file paths or binary locations. remaining must be a plain text string. failedApproaches must be a JSON array of strings. state must be a flat JSON object.]`;
 
 // Serializes concurrent requests for the same session. Maps sessionId to the
 // tail of the current request chain (a Promise that resolves when the last
 // queued request finishes).
 const sessionQueues = new Map();
 
+function accumulateUsage(accum, result) {
+  const u = result?.usage;
+  if (!u) return;
+  accum.prompt += u.prompt_tokens || 0;
+  accum.completion += u.completion_tokens || 0;
+}
+
 async function callModel(client, model, messages, tools) {
   const params = { model, messages };
   if (tools && tools.length > 0) {
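The `sessionQueues` comment above describes serializing concurrent requests per session by chaining promises. The queueing code itself is outside this diff, so the following is a hypothetical sketch of the pattern only (the `enqueue` helper and its behavior are assumptions, not Jarvis code):

```javascript
// Per-session serialization sketch: each session id maps to the tail of a
// promise chain; a new request runs only after the previous one settles.
const sessionQueues = new Map();

function enqueue(sessionId, task) {
  const tail = sessionQueues.get(sessionId) || Promise.resolve();
  const next = tail.then(task, task); // run even if the previous request failed
  sessionQueues.set(sessionId, next.catch(() => {})); // keep the chain alive on errors
  return next;
}

// Two requests for the same session execute strictly in order.
(async () => {
  const order = [];
  await enqueue('s1', async () => { order.push('first'); });
  await enqueue('s1', async () => { order.push('second'); });
  console.log(order); // [ 'first', 'second' ]
})();
```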
@@ -92,7 +100,7 @@ function hasConsecutiveModelErrors(messages) {
  * Runs a single agent loop up to maxIterations.
  * Returns { iteration, response, logSummary, status, runToolCalls, checkpoint }.
  */
-async function runAgentLoop(client, config, session, prepareMessages) {
+async function runAgentLoop(client, config, session, prepareMessages, usageAccum) {
   let tools = await loadTools();
   let toolDefs = getToolDefinitions(tools);
   let iteration = 0;
@@ -116,6 +124,7 @@ async function runAgentLoop(client, config, session, prepareMessages) {
       : base;
     try {
       modelResult = await callModelWithFallback(client, config, preparedMessages, toolDefs);
+      accumulateUsage(usageAccum, modelResult);
     } catch (e) {
       return {
         iteration,
@@ -279,6 +288,7 @@ async function runAgentLoop(client, config, session, prepareMessages) {
         { role: 'user', content: 'You returned an empty response. ' + FORMAT_NUDGE },
       ];
       const nudgeResult = await callModelWithFallback(client, config, emptyNudge, []);
+      accumulateUsage(usageAccum, nudgeResult);
       const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
       // Persist nudge text before parsing — if JSON parse throws, content still
       // carries the model's best-effort text so the !parsed handler can show it
@@ -297,6 +307,7 @@ async function runAgentLoop(client, config, session, prepareMessages) {
       // Step 1: retry with fallback model
       try {
         const fallbackResult = await callModel(client, config.fallbackModel, preparedMessages, toolDefs);
+        accumulateUsage(usageAccum, fallbackResult);
         const fallbackContent = fallbackResult.choices[0]?.message?.content || '';
         parsed = JSON.parse(fallbackContent);
         content = fallbackContent;
@@ -305,6 +316,7 @@ async function runAgentLoop(client, config, session, prepareMessages) {
         try {
           const nudgeMessages = [...preparedMessages, { role: 'user', content: FORMAT_NUDGE }];
           const nudgeResult = await callModelWithFallback(client, config, nudgeMessages, toolDefs);
+          accumulateUsage(usageAccum, nudgeResult);
           const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
           parsed = JSON.parse(nudgeContent);
           content = nudgeContent;
@@ -345,6 +357,7 @@ async function runAgentLoop(client, config, session, prepareMessages) {
     let wrapUpResult;
     try {
       wrapUpResult = await callModelWithFallback(client, config, wrapUpMessages, []);
+      accumulateUsage(usageAccum, wrapUpResult);
     } catch (e) {
       return {
         iteration,
@@ -381,6 +394,7 @@ async function runAgentLoop(client, config, session, prepareMessages) {
       try {
         const nudgeMessages = [...wrapUpMessages, { role: 'user', content: FORMAT_NUDGE }];
         const nudgeResult = await callModelWithFallback(client, config, nudgeMessages, []);
+        accumulateUsage(usageAccum, nudgeResult);
         const nudgeContent = nudgeResult.choices[0]?.message?.content || '';
         parsedWrapUp = JSON.parse(nudgeContent);
         wrapUpContent = nudgeContent;
@@ -414,6 +428,9 @@ async function runAgentLoop(client, config, session, prepareMessages) {
         typeof item === 'string' ? item : JSON.stringify(item)
       );
     }
+    if (typeof cp.state !== 'object' || cp.state === null || Array.isArray(cp.state)) {
+      cp.state = {};
+    }
     return {
       iteration,
       response,
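The `cp.state` guard added in the hunk above can be exercised in isolation. A small sketch (`normalizeCheckpointState` is a hypothetical wrapper around the inline guard, not a function in the diff):

```javascript
// Mirrors the cp.state guard: any non-object (array, null, missing) state is
// coerced to {} so later merging into checkpointState never throws.
function normalizeCheckpointState(cp) {
  if (typeof cp.state !== 'object' || cp.state === null || Array.isArray(cp.state)) {
    cp.state = {};
  }
  return cp;
}

console.log(normalizeCheckpointState({ state: ['not', 'an', 'object'] }).state); // {}
console.log(normalizeCheckpointState({ state: null }).state);                    // {}
console.log(normalizeCheckpointState({ state: { projectDir: '/tmp/p' } }).state);
```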
@@ -468,10 +485,7 @@ export async function handleChat(config, requestSessionId, userMessage) {
  * session lock.
  */
 async function _runHandleChat(config, sessionId, userMessage) {
-  const client =
-    baseURL: 'https://openrouter.ai/api/v1',
-    apiKey: config.apiKey,
-  });
+  const client = createClient(config);
 
   const systemPromptTemplate = loadSystemPrompt();
   let session = await loadSession(sessionId);
@@ -480,6 +494,11 @@ async function _runHandleChat(config, sessionId, userMessage) {
     session = createSession(systemPromptTemplate);
   }
 
+  // Capture persisted state BEFORE resetting metadata so we can inject it below.
+  const wasZeroProgress = !!session.metadata.lastCheckpointRemaining;
+  const priorCheckpointRemaining = session.metadata.lastCheckpointRemaining || null;
+  const priorCheckpointState = session.metadata.checkpointState || {};
+
   // Preserve accumulated failedApproaches in conversation history before resetting
   // so the model retains knowledge of what failed in the previous batch of handoff runs.
   let userMessageWithContext = userMessage;
@@ -487,10 +506,24 @@ async function _runHandleChat(config, sessionId, userMessage) {
     userMessageWithContext += `\n\n[System: The following approaches were tried and failed in previous runs — consider them exhausted:\n${session.metadata.failedApproaches.map((a, i) => `${i + 1}. ${a}`).join('\n')}]`;
   }
 
+  // If this message follows a zero-progress intervention, tell the agent explicitly so
+  // it responds to the user's input instead of blindly resuming the same failing approach.
+  if (wasZeroProgress) {
+    const stateLines = Object.entries(priorCheckpointState).map(([k, v]) => `- ${k}: ${v}`);
+    let note = `\n\n[System: This task previously hit zero-progress and required intervention. If the user has given new direction or clarification, follow it. Otherwise, immediately explain what specific obstacle is blocking progress — do not resume the same failing approach.`;
+    if (stateLines.length > 0) {
+      note += `\n\nKnown facts from previous run:\n${stateLines.join('\n')}`;
+    }
+    note += `]`;
+    userMessageWithContext += note;
+  }
+
   // Append user message and reset handoff state
   session.messages.push({ role: 'user', content: userMessageWithContext });
   session.metadata.handoffCount = 0;
   session.metadata.failedApproaches = [];
+  session.metadata.lastCheckpointRemaining = null;
+  session.metadata.checkpointState = {};
 
   // Resolves {{user_info}} in system prompt at runtime (never persisted)
   function prepareMessages(messages) {
@@ -503,11 +536,15 @@ async function _runHandleChat(config, sessionId, userMessage) {
   }
 
   const allToolCalls = [];
+  const usageAccum = { prompt: 0, completion: 0 };
   let finalResponse = '';
   let finalLogSummary = '';
   let finalStatus = 'ok';
-  // Tracks checkpoint.remaining from the previous handoff run to detect zero progress
-  let previousRemaining = null;
+  // Tracks checkpoint.remaining from the previous handoff run to detect zero progress.
+  // Initialized from persisted metadata so detection works across user messages too —
+  // if the agent was stuck before and produces the same remaining again on the next
+  // user turn, zero-progress fires after just one run instead of two.
+  let previousRemaining = priorCheckpointRemaining;
 
   try {
     // Handoff loop
@@ -532,7 +569,7 @@ async function _runHandleChat(config, sessionId, userMessage) {
     }
 
     const runStartIndex = session.messages.length;
-    const run = await runAgentLoop(client, config, session, prepareMessages);
+    const run = await runAgentLoop(client, config, session, prepareMessages, usageAccum);
     allToolCalls.push(...run.runToolCalls);
 
     if (run.status !== 'checkpoint_reached') {
@@ -590,6 +627,15 @@ async function _runHandleChat(config, sessionId, userMessage) {
        session.metadata.failedApproaches.push(...run.checkpoint.failedApproaches);
      }
 
+      // Merge concrete facts from this run's checkpoint.state into session metadata.
+      // Later runs overwrite earlier values for the same key (newer discoveries win).
+      if (run.checkpoint.state && Object.keys(run.checkpoint.state).length > 0) {
+        session.metadata.checkpointState = {
+          ...(session.metadata.checkpointState || {}),
+          ...run.checkpoint.state,
+        };
+      }
+
      // Zero-progress detection: if checkpoint.remaining is identical to the previous
      // handoff's remaining, the agent completed a full run without making any progress.
      // Stop immediately rather than burning more iterations on a stuck task.
@@ -599,6 +645,10 @@ async function _runHandleChat(config, sessionId, userMessage) {
        finalLogSummary = 'Zero progress detected: task state unchanged after a full run. Human intervention required.';
        finalStatus = 'intervention_required';
 
+        // Persist so that the next user message initializes previousRemaining from this
+        // value — zero-progress will then fire after just one run instead of two.
+        session.metadata.lastCheckpointRemaining = currentRemaining;
+
        await appendLog(sessionId, {
          iteration: 0,
          model: config.selectedModel,
@@ -642,13 +692,17 @@ async function _runHandleChat(config, sessionId, userMessage) {
 
      // Resume with checkpoint.remaining as new prompt.
      // Guard against null/undefined in case the model omitted the field.
-      //
-      //
+      // Inject the full accumulated failedApproaches and concrete state so the agent
+      // has complete memory of what failed and what was already discovered.
      let resumeContent = run.checkpoint.remaining || 'Continue with the task.';
      const allFailedApproaches = session.metadata.failedApproaches || [];
      if (allFailedApproaches.length > 0) {
        resumeContent += `\n\n[System: The following approaches were tried and failed in previous runs — do not repeat them:\n${allFailedApproaches.map((a, i) => `${i + 1}. ${a}`).join('\n')}]`;
      }
+      const stateToInject = session.metadata.checkpointState || {};
+      if (Object.keys(stateToInject).length > 0) {
+        resumeContent += `\n\n[System: Known facts from previous runs:\n${Object.entries(stateToInject).map(([k, v]) => `- ${k}: ${v}`).join('\n')}]`;
+      }
      session.messages.push({ role: 'user', content: resumeContent });
    }
  } catch (e) {
@@ -664,6 +718,11 @@ async function _runHandleChat(config, sessionId, userMessage) {
    });
    throw e;
  } finally {
+    // Accumulate token usage into session metadata so /usage can read it
+    if (!session.metadata.tokenUsage) session.metadata.tokenUsage = { prompt: 0, completion: 0 };
+    session.metadata.tokenUsage.prompt += usageAccum.prompt;
+    session.metadata.tokenUsage.completion += usageAccum.completion;
+
    // Always persist the session — even if an unexpected error occurred.
    // A failed save must not mask the original error.
    try {
package/src/server/config.js
CHANGED
@@ -31,21 +31,27 @@ export function ensureDirectories() {
 export function loadConfig() {
   dotenv.config({ path: PATHS.envFile });
 
-  const apiKey = process.env.OPENROUTER_API_KEY;
-  if (!apiKey) {
-    throw new Error('OPENROUTER_API_KEY not found. Run `jarvis setup` first.');
-  }
-
   if (!fs.existsSync(PATHS.settingsFile)) {
     throw new Error('settings.json not found. Run `jarvis setup` first.');
   }
 
   const settings = JSON.parse(fs.readFileSync(PATHS.settingsFile, 'utf8'));
+  const provider = settings.provider || 'openrouter';
+
+  let apiKey;
+  if (provider === 'anthropic') {
+    apiKey = process.env.ANTHROPIC_API_KEY;
+    if (!apiKey) throw new Error('ANTHROPIC_API_KEY not found. Run `jarvis setup` first.');
+  } else {
+    apiKey = process.env.OPENROUTER_API_KEY;
+    if (!apiKey) throw new Error('OPENROUTER_API_KEY not found. Run `jarvis setup` first.');
+  }
 
   return {
+    provider,
     apiKey,
     selectedModel: settings.selectedModel,
-    fallbackModel: settings.fallbackModel || 'openrouter/free',
+    fallbackModel: settings.fallbackModel || (provider === 'anthropic' ? 'claude-haiku-4-5-20251001' : 'openrouter/free'),
     maxIterations: settings.maxIterations || 10,
     maxHandoffs: settings.maxHandoffs || 5,
     port: settings.port || 18008,
package/src/server/provider.js
ADDED
@@ -0,0 +1,149 @@
+import OpenAI from 'openai';
+import Anthropic from '@anthropic-ai/sdk';
+
+// Convert OpenAI tool definitions to Anthropic format
+function openAIToolsToAnthropic(tools) {
+  if (!tools || tools.length === 0) return [];
+  return tools.map(t => ({
+    name: t.function.name,
+    description: t.function.description || '',
+    input_schema: t.function.parameters || { type: 'object', properties: {}, required: [] },
+  }));
+}
+
+// Convert OpenAI message history to Anthropic format.
+// Key differences:
+// - system message becomes a separate `system` param, not part of messages
+// - assistant tool_calls → content array with tool_use blocks
+// - role:'tool' messages → content array with tool_result blocks inside a user message
+// - Anthropic requires strict user/assistant alternation; consecutive user messages
+//   (e.g. tool results followed by a system note) are merged
+function openAIMessagesToAnthropic(messages) {
+  let system;
+  let rest = messages;
+
+  if (messages[0]?.role === 'system') {
+    system = messages[0].content;
+    rest = messages.slice(1);
+  }
+
+  const result = [];
+
+  for (let i = 0; i < rest.length; i++) {
+    const msg = rest[i];
+
+    if (msg.role === 'user') {
+      const last = result[result.length - 1];
+      if (last && last.role === 'user') {
+        // Merge into previous user message to maintain strict alternation
+        const newPart = { type: 'text', text: typeof msg.content === 'string' ? msg.content : JSON.stringify(msg.content) };
+        if (typeof last.content === 'string') {
+          last.content = [{ type: 'text', text: last.content }, newPart];
+        } else {
+          last.content.push(newPart);
+        }
+      } else {
+        result.push({ role: 'user', content: msg.content });
+      }
+
+    } else if (msg.role === 'assistant') {
+      const content = [];
+      if (msg.content) content.push({ type: 'text', text: msg.content });
+      if (msg.tool_calls) {
+        for (const tc of msg.tool_calls) {
+          let input = {};
+          try { input = JSON.parse(tc.function.arguments || '{}'); } catch { /* ignore */ }
+          content.push({ type: 'tool_use', id: tc.id, name: tc.function.name, input });
+        }
+      }
+      result.push({ role: 'assistant', content: content.length > 0 ? content : [{ type: 'text', text: '' }] });
+
+      // Collect following tool-result messages into a single user message
+      const toolResults = [];
+      while (i + 1 < rest.length && rest[i + 1].role === 'tool') {
+        i++;
+        toolResults.push({
+          type: 'tool_result',
+          tool_use_id: rest[i].tool_call_id,
+          content: rest[i].content,
+        });
+      }
+      if (toolResults.length > 0) {
+        result.push({ role: 'user', content: toolResults });
+      }
+
+    }
+    // role:'tool' messages that were not consumed above are skipped (shouldn't happen)
+  }
+
+  return { system, messages: result };
+}
+
+// Normalize an Anthropic response to the shape agent.js expects from the OpenAI SDK
+function anthropicResponseToOpenAI(response) {
+  const textParts = response.content.filter(c => c.type === 'text');
+  const toolParts = response.content.filter(c => c.type === 'tool_use');
+
+  const text = textParts.map(t => t.text).join('') || null;
+  const toolCalls = toolParts.length > 0
+    ? toolParts.map(t => ({
+      id: t.id,
+      type: 'function',
+      function: { name: t.name, arguments: JSON.stringify(t.input) },
+    }))
+    : undefined;
+
+  return {
+    choices: [{
+      message: {
+        role: 'assistant',
+        content: toolCalls ? null : text,
+        ...(toolCalls && { tool_calls: toolCalls }),
+      },
+      finish_reason: response.stop_reason === 'tool_use' ? 'tool_calls' : 'stop',
+    }],
+    usage: {
+      prompt_tokens: response.usage?.input_tokens ?? 0,
+      completion_tokens: response.usage?.output_tokens ?? 0,
+      total_tokens: (response.usage?.input_tokens ?? 0) + (response.usage?.output_tokens ?? 0),
+    },
+  };
+}
+
+// Build an Anthropic adapter that exposes the same interface as the OpenAI SDK client
+function createAnthropicClient(apiKey) {
+  const anthropic = new Anthropic({ apiKey });
+
+  return {
+    chat: {
+      completions: {
+        create: async ({ model, messages, tools }) => {
+          const { system, messages: anthropicMessages } = openAIMessagesToAnthropic(messages);
+          const anthropicTools = openAIToolsToAnthropic(tools);
+
+          const params = {
+            model,
+            max_tokens: 8096,
+            messages: anthropicMessages,
+          };
+          if (system) params.system = system;
+          if (anthropicTools.length > 0) params.tools = anthropicTools;
+
+          const response = await anthropic.messages.create(params);
+          return anthropicResponseToOpenAI(response);
+        },
+      },
+    },
+  };
+}
+
+export function createClient(config) {
+  if (config.provider === 'anthropic') {
+    return createAnthropicClient(config.apiKey);
+  }
+  // Default: OpenRouter (OpenAI-compatible)
+  return new OpenAI({
+    baseURL: 'https://openrouter.ai/api/v1',
+    apiKey: config.apiKey,
+  });
+}
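The response-normalization half of the adapter can be exercised without either SDK. The block below is a condensed re-statement of `anthropicResponseToOpenAI` (named `normalize` here), applied to a hand-built object imitating the Anthropic SDK's response shape; no API call is made and the sample ids and tool name are invented for illustration.

```javascript
// Condensed sketch of provider.js's anthropicResponseToOpenAI: text blocks
// become message.content, tool_use blocks become OpenAI-style tool_calls,
// and stop_reason 'tool_use' maps to finish_reason 'tool_calls'.
function normalize(response) {
  const text = response.content.filter(c => c.type === 'text').map(t => t.text).join('') || null;
  const toolParts = response.content.filter(c => c.type === 'tool_use');
  const toolCalls = toolParts.length > 0
    ? toolParts.map(t => ({ id: t.id, type: 'function', function: { name: t.name, arguments: JSON.stringify(t.input) } }))
    : undefined;
  return {
    choices: [{
      message: { role: 'assistant', content: toolCalls ? null : text, ...(toolCalls && { tool_calls: toolCalls }) },
      finish_reason: response.stop_reason === 'tool_use' ? 'tool_calls' : 'stop',
    }],
  };
}

// A fake Anthropic-style response containing one text and one tool_use block.
const out = normalize({
  stop_reason: 'tool_use',
  content: [
    { type: 'text', text: 'Running the scan.' },
    { type: 'tool_use', id: 'tu_1', name: 'bash', input: { command: 'ls' } },
  ],
});
```

When tool calls are present, `content` is forced to `null` to match what the OpenAI SDK returns, so the agent loop's existing branching works unchanged for both providers.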