agentpage 0.0.15 → 0.0.16
- package/README.md +30 -7
- package/dist/index.mjs +67 -7
- package/dist/index.mjs.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -148,6 +148,7 @@ In each round the AI does not "guess the page from memory" but, based on the latest snapshot, selects
|
|
|
148
148
|
- The model can return the following in its text output:
|
|
149
149
|
- `REMAINING: <remaining content>`: there is still work to continue
|
|
150
150
|
- `REMAINING: DONE`: no remaining tasks are left
|
|
151
|
+
- Note: in a `tool_calls` round the model may return an empty `content`; this does not mean the task is finished.
|
|
151
152
|
|
|
152
153
|
### 3) Batch execution without chaining across DOM changes
|
|
153
154
|
|
|
@@ -180,6 +181,7 @@ In each round the AI does not "guess the page from memory" but, based on the latest snapshot, selects
|
|
|
180
181
|
|
|
181
182
|
- `Current remaining instruction` (the current remaining task)
|
|
182
183
|
- `Previous round planned task array` (tasks executed in the previous round)
|
|
184
|
+
- `Previous round model output (normalized)` (normalized summary of the previous round's model output)
|
|
183
185
|
- `Latest DOM snapshot` (the current snapshot)
|
|
184
186
|
|
|
185
187
|
Notes:
|
|
@@ -195,6 +197,9 @@ In each round the AI does not "guess the page from memory" but, based on the latest snapshot, selects
|
|
|
195
197
|
- `REMAINING: <new remaining instruction>`
|
|
196
198
|
- or `REMAINING: DONE`
|
|
197
199
|
|
|
200
|
+
Implementation details:
|
|
201
|
+
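- If a round returns `tool_calls` with an empty `content`, the loop still advances state from the tool execution results and does not treat the empty text as a completion signal.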
|
|
202
|
+
|
|
198
203
|
### 3) Per-round execution and state progression
|
|
199
204
|
|
|
200
205
|
The loop handles this round's response as follows:
|
|
@@ -205,7 +210,11 @@ The loop handles this round's response as follows:
|
|
|
205
210
|
4. Refresh the snapshot and enter the next round
|
|
206
211
|
5. Update the task text for the next round:
|
|
207
212
|
- Prefer `REMAINING` when it is present
|
|
208
|
-
- If missing `REMAINING
|
|
213
|
+
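- If `REMAINING` is missing and actions were executed this round: advance heuristically by dropping executed steps from the linearized task (so the full original task is not repeated)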
|
|
214
|
+
- If `REMAINING` is missing and the round made no progress: keep the current task unchanged (protocol fallback)
|
|
215
|
+
6. If remaining is unfinished and no tool calls were returned:
|
|
216
|
+
- Do not stop immediately
|
|
217
|
+
- Inject a strict `Protocol violation` hint into the next round, requiring the model to either return executable tool calls or reply with exactly `REMAINING: DONE`
|
|
209
218
|
|
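The update rules in step 5 can be sketched as below. `reduceRemainingHeuristically` follows the helper added to `dist/index.mjs` in this release (the punctuation set is approximated from the diff); `parseRemaining` and `nextRemaining` are hypothetical names used only for illustration, and the package's own `parseRemainingInstruction` may behave differently.

```javascript
// Hypothetical parser: takes everything after the first "REMAINING:" marker.
// Assumption: the real parseRemainingInstruction may differ.
function parseRemaining(text) {
  const match = (text ?? "").match(/REMAINING\s*:\s*([\s\S]*)$/i);
  if (!match) return { found: false, value: "" };
  const value = match[1].trim();
  // "DONE" means nothing is left, represented here as an empty string.
  return { found: true, value: /^done$/i.test(value) ? "" : value };
}

// Splits a linear instruction into steps and drops as many as were executed.
function reduceRemainingHeuristically(currentInstruction, executedCount) {
  if (!currentInstruction.trim() || executedCount <= 0) return currentInstruction;
  const parts = currentInstruction
    .replace(/\s+/g, " ")
    .replace(/(->|=>|→)/g, " 然后 ")
    .replace(/[,,。;;]/g, " 然后 ")
    .split(/\s*(?:然后|再|并且|并|接着|随后|之后)\s*/g)
    .map((part) => part.trim())
    .filter(Boolean);
  if (parts.length <= 1) return currentInstruction;
  const nextParts = parts.slice(Math.min(executedCount, parts.length));
  return nextParts.length === 0 ? "" : nextParts.join(" -> ");
}

// Hypothetical combinator for a round that executed tool calls.
function nextRemaining(current, modelText, executedCount) {
  const parsed = parseRemaining(modelText);
  if (parsed.found) return { remaining: parsed.value, stalled: false };
  const reduced = reduceRemainingHeuristically(current, executedCount);
  // No REMAINING line and nothing could be trimmed: keep the task and flag the round.
  return { remaining: reduced, stalled: reduced === current };
}

console.log(nextRemaining("click login, fill name, submit", "", 1));
```

Flagging a stalled round rather than silently continuing matches the diff, where an unchanged heuristic result sets `roundHasError = true`.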
|
210
219
|
### 3.1) Element-not-found retry flow (Not-found Retry Dialogue)
|
|
211
220
|
|
|
@@ -223,7 +232,7 @@ The loop handles this round's response as follows:
|
|
|
223
232
|
|
|
224
233
|
### 4) Stop conditions
|
|
225
234
|
|
|
226
|
-
-
|
|
235
|
+
- No tool calls and remaining is complete (or an explicit `REMAINING: DONE`)
|
|
227
236
|
- Natural convergence after `REMAINING: DONE`
|
|
228
237
|
- The repeated-batch spin guard fires
|
|
229
238
|
- `maxRounds` is reached
|
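The no-tool-call branch of these stop conditions can be sketched as a small decision function. The condition follows the check visible in `executeAgentLoop` in this diff (`remainingInstruction.trim().length > 0 && round < maxRounds - 1`); the function name and the returned labels are hypothetical.

```javascript
// Hypothetical helper: decides what the loop does after a model reply.
// "execute" = run the tool calls, "repair" = inject a Protocol violation hint
// and retry, "stop" = emit the final reply and leave the loop.
function decideAfterModelReply({ toolCallCount, remaining, round, maxRounds }) {
  if (toolCallCount > 0) return "execute";
  const unfinished = remaining.trim().length > 0;
  // The last allowed round cannot be repaired, so it falls through to "stop".
  if (unfinished && round < maxRounds - 1) return "repair";
  return "stop";
}

console.log(decideAfterModelReply({ toolCallCount: 0, remaining: "fill name", round: 0, maxRounds: 5 }));
```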
|
@@ -274,6 +283,7 @@ The loop handles this round's response as follows:
|
|
|
274
283
|
- `Current remaining instruction`
|
|
275
284
|
- `Done steps (do NOT repeat)`
|
|
276
285
|
- `Previous round planned task array`
|
|
286
|
+
- `Previous round model output (normalized)`
|
|
277
287
|
- `Latest DOM snapshot`
|
|
278
288
|
|
|
279
289
|
This layer is the dynamic context that changes every round.
|
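The "normalized" model output in this layer comes from the `normalizeModelOutput` helper added to `dist/index.mjs` in this release. The function below is reproduced from the diff with line breaks added; the sample inputs are made up for illustration.

```javascript
// Keep the REMAINING line if there is one; otherwise keep only a short
// excerpt (first paragraph, at most 220 characters) of the model text.
const normalizeModelOutput = (text) => {
  if (!text) return "";
  const trimmed = text.trim();
  if (!trimmed) return "";
  const remainingMatch = trimmed.match(/REMAINING\s*:\s*([\s\S]*)$/i);
  if (remainingMatch) return `REMAINING: ${remainingMatch[1].trim()}`;
  return (trimmed.split(/\n\s*\n/)[0]?.trim() ?? trimmed).slice(0, 220);
};

// Illustrative inputs (not taken from the package).
console.log(normalizeModelOutput("Clicked the button.\nREMAINING: fill the form"));
console.log(normalizeModelOutput("First paragraph.\n\nSecond paragraph."));
```

Note that everything after the first `REMAINING:` marker is kept, so a model reply that continues past that line carries the extra text into the next round.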
|
@@ -285,7 +295,8 @@ The loop handles this round's response as follows:
|
|
|
285
295
|
- The first round uses the `initialSnapshot` injected by the frontend
|
|
286
296
|
- Refresh the snapshot after each round's execution
|
|
287
297
|
- Advance `remainingInstruction`
|
|
288
|
-
- `REMAINING`
|
|
298
|
+
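- When `REMAINING` is missing and actions were executed this round: advance heuristically by dropping executed steps from the linearized task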
|
|
299
|
+
- When `REMAINING` is missing and the round made no progress: keep the current remaining
|
|
289
300
|
- Guard against idle rounds, repetition, and infinite loops
|
|
290
301
|
- DOM-changing actions force a round break (wait for the next round's fresh snapshot)
|
|
291
302
|
|
|
@@ -384,7 +395,7 @@ sequenceDiagram
|
|
|
384
395
|
The main flow lives in `src/core/agent-loop/index.ts`:
|
|
385
396
|
|
|
386
397
|
1. Ensure the current snapshot is available
|
|
387
|
-
2.
|
|
398
|
+
2. Build the compact message (remaining + execution history + previous round's model output + latest snapshot)
|
|
388
399
|
3. Call the AI
|
|
389
400
|
4. Execute tool calls and record the trace
|
|
390
401
|
5. Run the protection mechanisms
|
|
@@ -392,20 +403,24 @@ sequenceDiagram
|
|
|
392
403
|
|
|
393
404
|
### Progressive execution state (new)
|
|
394
405
|
|
|
395
|
-
`src/core/agent-loop/index.ts` internally maintains
|
|
406
|
+
`src/core/agent-loop/index.ts` internally maintains 5 key pieces of state:
|
|
396
407
|
- `remainingInstruction`: the text still to be consumed in the current round (initially the user's original input)
|
|
397
408
|
- `previousRoundTasks`: the array of tasks executed in the previous round
|
|
409
|
+
- `previousRoundPlannedTasks`: the batch planned by the model in the previous round (before execution)
|
|
410
|
+
- `previousRoundModelOutput`: normalized summary of the previous round's model output (recorded after execution and fed into the next round)
|
|
398
411
|
- `lastPlannedBatchKey`: used to detect whether two consecutive rounds planned exactly the same task batch
|
|
399
412
|
|
|
400
413
|
Stop rules:
|
|
401
|
-
-
|
|
414
|
+
- If the model returns no tool calls and remaining is unfinished → do not stop; enter a protocol-repair round
|
|
415
|
+
- If the model returns no tool calls and remaining is complete (or `REMAINING: DONE`) → stop
|
|
402
416
|
- If two consecutive rounds plan the same task batch and the previous round had no errors → terminate automatically to prevent spinning
|
|
403
417
|
- If the model text contains `REMAINING: DONE`, the next round usually becomes a summary with no tool calls and the loop ends
|
|
404
418
|
|
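The repeated-batch rule can be sketched as follows. The key construction (`JSON.stringify` over each call's `name` and `input`) follows `plannedBatchKey` in `dist/index.mjs`; the tracker object, its method name, and the exact threshold are assumptions, since the termination code itself is not part of this diff.

```javascript
// Hypothetical tracker for "same batch planned twice in a row".
function createBatchTracker() {
  let lastKey = "";
  let sameCount = 0;
  return {
    // Returns true when the loop should stop because the batch repeated
    // and the previous round finished without errors.
    shouldStop(toolCalls, lastRoundHadError) {
      // Note: JSON.stringify is sensitive to property order inside input.
      const key = JSON.stringify(toolCalls.map((tc) => ({ name: tc.name, input: tc.input })));
      sameCount = key === lastKey ? sameCount + 1 : 0;
      lastKey = key;
      return sameCount >= 1 && !lastRoundHadError;
    },
  };
}

const tracker = createBatchTracker();
const batch = [{ name: "dom", input: { action: "click", selector: "#a1b2c" } }];
console.log(tracker.shouldStop(batch, false));
console.log(tracker.shouldStop(batch, false));
```

Skipping the stop when the previous round had errors leaves room for a legitimate retry of the same batch.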
|
405
419
|
### Compact message structure
|
|
406
420
|
|
|
407
421
|
Built by `messages.ts`; core semantics:
|
|
408
|
-
-
|
|
422
|
+
- Round 0: the user's original task + the first-round snapshot
|
|
423
|
+
- Round 1+: remaining task + done steps + previous round's planned batch + previous round's normalized model output + latest snapshot
|
|
409
424
|
- Done steps: actions already completed (to avoid repetition)
|
|
410
425
|
- Execution context + latest snapshot: what is currently actionable
|
|
411
426
|
|
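A minimal sketch of the Round 0 versus Round 1+ split described above. The section labels are taken from the README and the bundled code; `buildRoundMessage`, `numbered`, and the parameter names are hypothetical and much simpler than the package's `buildCompactMessages`.

```javascript
// Hypothetical helper: numbers a list of steps as "1. ...", "2. ...".
function numbered(items) {
  return items.map((item, index) => `${index + 1}. ${item}`);
}

// Hypothetical builder for the user message of one round.
function buildRoundMessage({ round, userTask, remaining, doneSteps = [], plannedTasks = [], modelOutput = "", snapshot }) {
  if (round === 0) {
    // Round 0: original task plus the first snapshot.
    return [userTask, "", "## Current page snapshot", snapshot].join("\n");
  }
  // Round 1+: remaining task, history of the previous round, then the latest snapshot.
  const parts = [`Current remaining instruction: ${remaining}`];
  if (doneSteps.length > 0) parts.push("", "Done steps (do NOT repeat):", ...numbered(doneSteps));
  if (plannedTasks.length > 0) parts.push("", "Previous round planned task array:", ...numbered(plannedTasks));
  if (modelOutput) parts.push("", "Previous round model output (normalized):", modelOutput);
  parts.push("", "## Latest DOM snapshot", snapshot);
  return parts.join("\n");
}

console.log(buildRoundMessage({
  round: 1,
  userTask: "Log in and open settings",
  remaining: "open settings",
  doneSteps: ["dom.click #a1b2c"],
  plannedTasks: ["dom.click #a1b2c"],
  modelOutput: "REMAINING: open settings",
  snapshot: "<snapshot 1>",
}));
```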
|
@@ -458,6 +473,14 @@ sequenceDiagram
|
|
|
458
473
|
|
|
459
474
|
Tools are exposed to the model through `ToolRegistry`, and execution results are returned in a standardized form.
|
|
460
475
|
|
|
476
|
+
### Playwright alignment notes (current implementation)
|
|
477
|
+
|
|
478
|
+
- `dom.click`: uses a more complete click event chain (`pointerdown/mousedown/pointerup/mouseup/click`).
|
|
479
|
+
- `dom.select_option`: supports `value/label/index`; the result explicitly returns `value + label`.
|
|
480
|
+
- `dom.fill`: not allowed on incompatible input types such as `checkbox/radio/file/button/submit/reset`.
|
|
481
|
+
- `wait.wait_for_selector`: supports `state=attached|visible|hidden|detached` (default `attached`).
|
|
482
|
+
- Richer runtime state in snapshots: `select val`, `option selected`, `checked`, `disabled`, and `readonly` are visible, which reduces repeated actions.
|
|
483
|
+
|
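Two of the rules above, the `dom.fill` type guard and the `dom.select_option` lookup by `value/label/index`, can be illustrated without a browser. This is a sketch over plain objects: `canFill`, `resolveOption`, and the precedence among `value`, `label`, and `index` are assumptions, and the blocked-type list contains only the types the README names explicitly.

```javascript
// Input types a text-style fill should refuse (list taken from the README note;
// the package may block more types than these).
const NON_FILLABLE_TYPES = new Set(["checkbox", "radio", "file", "button", "submit", "reset"]);

// Hypothetical guard: a missing type is treated as a plain text input.
function canFill(inputType) {
  return !NON_FILLABLE_TYPES.has(String(inputType ?? "text").toLowerCase());
}

// Hypothetical resolver over plain { value, label } records.
// Assumed precedence: value, then label, then index.
function resolveOption(options, { value, label, index }) {
  let found;
  if (value !== undefined) found = options.find((o) => o.value === value);
  else if (label !== undefined) found = options.find((o) => o.label === label);
  else if (index !== undefined) found = options[index];
  // Return both value and label explicitly, as the README describes.
  return found ? { value: found.value, label: found.label } : null;
}

const sizes = [{ value: "s", label: "Small" }, { value: "l", label: "Large" }];
console.log(canFill("checkbox"), resolveOption(sizes, { label: "Large" }));
```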
|
461
484
|
---
|
|
462
485
|
|
|
463
486
|
## Extension and customization
|
package/dist/index.mjs
CHANGED
|
@@ -162,7 +162,7 @@ function formatToolResultBrief(result) {
|
|
|
162
162
|
* - `previousRoundTasks`: array of tasks executed in the previous round, to avoid re-planning them.
|
|
163
163
|
* - The message asks the model to output `REMAINING: ...` or `REMAINING: DONE` for the next round to consume.
|
|
164
164
|
*/
|
|
165
|
-
function buildCompactMessages(userMessage, trace, latestSnapshot, currentUrl, history, remainingInstruction, previousRoundTasks) {
|
|
165
|
+
function buildCompactMessages(userMessage, trace, latestSnapshot, currentUrl, history, remainingInstruction, previousRoundTasks, previousRoundModelOutput, previousRoundPlannedTasks, protocolViolationHint) {
|
|
166
166
|
const messages = history ? [...history] : [];
|
|
167
167
|
const allowAgentUiInteraction = isExplicitAgentUiRequest(userMessage);
|
|
168
168
|
const activeInstruction = remainingInstruction && remainingInstruction.trim() ? remainingInstruction.trim() : userMessage;
|
|
@@ -176,6 +176,7 @@ function buildCompactMessages(userMessage, trace, latestSnapshot, currentUrl, hi
|
|
|
176
176
|
];
|
|
177
177
|
if (currentUrl) parts.push("", `URL: ${currentUrl}`);
|
|
178
178
|
if (latestSnapshot) parts.push("", "## Current page snapshot", "Apply task-reduction model directly from this snapshot. Do NOT restate the task.", "Use hash IDs (e.g. #a1b2c) from the snapshot as selector params.", "Do NOT call page_info (get_url/get_title/query_all/snapshot).", "Batch independent visible actions in one round.", "If action changes DOM (open modal/navigate), stop that batch and continue next round.", "For dropdown/select fields, use dom with action=select_option (or fill on a select).", allowAgentUiInteraction ? "User explicitly asked to operate AutoPilot UI. You may interact with chat input/send/dock only as requested." : "Do NOT interact with any AI chat UI elements (chat input, send button, dock). Only operate on the actual page content.", "Output one line: REMAINING: <new remaining task after this round> or REMAINING: DONE", wrapSnapshot(latestSnapshot));
|
|
179
|
+
if (protocolViolationHint) parts.push("", protocolViolationHint);
|
|
179
180
|
messages.push({
|
|
180
181
|
role: "user",
|
|
181
182
|
content: parts.join("\n")
|
|
@@ -215,6 +216,8 @@ function buildCompactMessages(userMessage, trace, latestSnapshot, currentUrl, hi
|
|
|
215
216
|
if (hasErrors) contextParts.push("", "The last step failed. Retry with a different approach, or skip and continue with other visible targets.");
|
|
216
217
|
else contextParts.push("", "If the goal is fully done, reply with a short summary (no tool calls).");
|
|
217
218
|
if (previousRoundTasks && previousRoundTasks.length > 0) contextParts.push("", "Previous round planned task array (already executed):", ...previousRoundTasks.map((task, index) => `${index + 1}. ${task}`));
|
|
219
|
+
if (previousRoundPlannedTasks && previousRoundPlannedTasks.length > 0) contextParts.push("", "Previous round model planned task array (before execution):", ...previousRoundPlannedTasks.map((task, index) => `${index + 1}. ${task}`));
|
|
220
|
+
if (previousRoundModelOutput) contextParts.push("", "Previous round model output (normalized, for task reduction input):", previousRoundModelOutput);
|
|
218
221
|
contextParts.push("", "After this round, include one plain text line:", "REMAINING: <new remaining instruction after this-round actions>", "or REMAINING: DONE");
|
|
219
222
|
const lastEntry = trace[trace.length - 1];
|
|
220
223
|
if (hasToolError(lastEntry.result)) {
|
|
@@ -222,6 +225,7 @@ function buildCompactMessages(userMessage, trace, latestSnapshot, currentUrl, hi
|
|
|
222
225
|
if (stripped && stripped.length < 300) contextParts.push("", "Last error: " + stripped);
|
|
223
226
|
}
|
|
224
227
|
if (currentUrl) contextParts.push("", `URL: ${currentUrl}`);
|
|
228
|
+
if (protocolViolationHint) contextParts.push("", protocolViolationHint);
|
|
225
229
|
if (latestSnapshot) contextParts.push("", "## Latest DOM snapshot", "Use hash IDs from this snapshot. Do NOT call page_info — this is already the latest.", wrapSnapshot(latestSnapshot));
|
|
226
230
|
messages.push({
|
|
227
231
|
role: "user",
|
|
@@ -385,9 +389,12 @@ async function executeAgentLoop(params) {
|
|
|
385
389
|
let outputTokens = 0;
|
|
386
390
|
let remainingInstruction = message.trim();
|
|
387
391
|
let previousRoundTasks = [];
|
|
392
|
+
let previousRoundPlannedTasks = [];
|
|
393
|
+
let previousRoundModelOutput = "";
|
|
388
394
|
let lastPlannedBatchKey = "";
|
|
389
395
|
let consecutiveSamePlannedBatch = 0;
|
|
390
396
|
let lastRoundHadError = false;
|
|
397
|
+
let protocolViolationHint;
|
|
391
398
|
let recoveryCount = 0;
|
|
392
399
|
let redundantInterceptCount = 0;
|
|
393
400
|
let pendingNotFoundRetry;
|
|
@@ -449,6 +456,20 @@ async function executeAgentLoop(params) {
|
|
|
449
456
|
return `${tc.name}:${inputText}`;
|
|
450
457
|
});
|
|
451
458
|
/**
|
|
459
|
+
* 规范化模型文本输出(中)/ Normalize model text for next-round input (EN).
|
|
460
|
+
*
|
|
461
|
+
* 优先保留 REMAINING 行;否则截断首段文本,避免长篇规划污染下一轮输入。
|
|
462
|
+
* Prefer REMAINING line; otherwise keep a short excerpt to avoid long planning spillover.
|
|
463
|
+
*/
|
|
464
|
+
const normalizeModelOutput = (text) => {
|
|
465
|
+
if (!text) return "";
|
|
466
|
+
const trimmed = text.trim();
|
|
467
|
+
if (!trimmed) return "";
|
|
468
|
+
const remainingMatch = trimmed.match(/REMAINING\s*:\s*([\s\S]*)$/i);
|
|
469
|
+
if (remainingMatch) return `REMAINING: ${remainingMatch[1].trim()}`;
|
|
470
|
+
return (trimmed.split(/\n\s*\n/)[0]?.trim() ?? trimmed).slice(0, 220);
|
|
471
|
+
};
|
|
472
|
+
/**
|
|
452
473
|
* 判定动作是否会触发 DOM 结构变化(中)/ Whether action may cause DOM-shape change (EN).
|
|
453
474
|
*
|
|
454
475
|
* Once triggered, the round must be cut short and the loop continues with the next round's fresh snapshot.
|
|
@@ -490,8 +511,8 @@ async function executeAgentLoop(params) {
|
|
|
490
511
|
/**
|
|
491
512
|
* 推进下一轮描述(中)/ Derive next-round instruction from model text (EN).
|
|
492
513
|
*
|
|
493
|
-
* 优先 REMAINING
|
|
494
|
-
* Priority: REMAINING protocol first; otherwise
|
|
514
|
+
* 优先 REMAINING 协议;若未提供,则保持当前 remaining 不变。
|
|
515
|
+
* Priority: REMAINING protocol first; otherwise keep current remaining instruction unchanged.
|
|
495
516
|
*/
|
|
496
517
|
const deriveNextInstruction = (text, currentInstruction) => {
|
|
497
518
|
const parsed = parseRemainingInstruction(text);
|
|
@@ -504,12 +525,26 @@ async function executeAgentLoop(params) {
|
|
|
504
525
|
hasRemainingProtocol: false
|
|
505
526
|
};
|
|
506
527
|
};
|
|
528
|
+
/**
|
|
529
|
+
* 启发式任务剔除(中)/ Heuristic remaining reduction for linear instructions (EN).
|
|
530
|
+
*
|
|
531
|
+
* 在 REMAINING 缺失但本轮有执行动作时,按“线性片段”剔除已执行步数,避免下一轮继续携带整段原任务。
|
|
532
|
+
* When REMAINING is missing but actions were executed, drop executed step count from a linearized instruction.
|
|
533
|
+
*/
|
|
534
|
+
const reduceRemainingHeuristically = (currentInstruction, executedCount) => {
|
|
535
|
+
if (!currentInstruction.trim() || executedCount <= 0) return currentInstruction;
|
|
536
|
+
const parts = currentInstruction.replace(/\s+/g, " ").replace(/(->|=>|→)/g, " 然后 ").replace(/[,,。;;]/g, " 然后 ").split(/\s*(?:然后|再|并且|并|接着|随后|之后)\s*/g).map((part) => part.trim()).filter(Boolean);
|
|
537
|
+
if (parts.length <= 1) return currentInstruction;
|
|
538
|
+
const nextParts = parts.slice(Math.min(executedCount, parts.length));
|
|
539
|
+
if (nextParts.length === 0) return "";
|
|
540
|
+
return nextParts.join(" -> ");
|
|
541
|
+
};
|
|
507
542
|
for (let round = 0; round < maxRounds; round++) {
|
|
508
543
|
callbacks?.onRound?.(round);
|
|
509
544
|
usedRounds = round + 1;
|
|
510
545
|
if (!pageContext.latestSnapshot) await refreshSnapshot();
|
|
511
546
|
const effectivePrompt = stripSnapshotFromPrompt(systemPrompt);
|
|
512
|
-
const chatMessages = buildCompactMessages(message, fullToolTrace, pageContext.latestSnapshot, pageContext.currentUrl, history, remainingInstruction, previousRoundTasks);
|
|
547
|
+
const chatMessages = buildCompactMessages(message, fullToolTrace, pageContext.latestSnapshot, pageContext.currentUrl, history, remainingInstruction, previousRoundTasks, previousRoundModelOutput, previousRoundPlannedTasks, protocolViolationHint);
|
|
513
548
|
if (pendingNotFoundRetry && pendingNotFoundRetry.tasks.length > 0) chatMessages.push({
|
|
514
549
|
role: "user",
|
|
515
550
|
content: [
|
|
@@ -528,8 +563,7 @@ async function executeAgentLoop(params) {
|
|
|
528
563
|
});
|
|
529
564
|
inputTokens += response.usage?.inputTokens ?? 0;
|
|
530
565
|
outputTokens += response.usage?.outputTokens ?? 0;
|
|
531
|
-
const
|
|
532
|
-
remainingInstruction = nextInstructionState.nextInstruction;
|
|
566
|
+
const parsedInstructionState = deriveNextInstruction(response.text, remainingInstruction);
|
|
533
567
|
if (!response.toolCalls || response.toolCalls.length === 0) {
|
|
534
568
|
if (pendingNotFoundRetry) {
|
|
535
569
|
const unresolvedHint = response.text?.toLowerCase() ?? "";
|
|
@@ -545,10 +579,29 @@ async function executeAgentLoop(params) {
|
|
|
545
579
|
}
|
|
546
580
|
pendingNotFoundRetry = void 0;
|
|
547
581
|
}
|
|
582
|
+
if (parsedInstructionState.hasRemainingProtocol) remainingInstruction = parsedInstructionState.nextInstruction;
|
|
583
|
+
if (remainingInstruction.trim().length > 0 && round < maxRounds - 1) {
|
|
584
|
+
protocolViolationHint = [
|
|
585
|
+
"Protocol violation in previous round:",
|
|
586
|
+
"- Remaining task is not DONE, but no tool calls were returned.",
|
|
587
|
+
"This round MUST do one of:",
|
|
588
|
+
"1) Return actionable tool calls for visible targets; or",
|
|
589
|
+
"2) If truly complete, return a short summary and EXACTLY `REMAINING: DONE`.",
|
|
590
|
+
"Do NOT output planning/explaining text."
|
|
591
|
+
].join("\n");
|
|
592
|
+
lastRoundHadError = true;
|
|
593
|
+
await refreshSnapshot();
|
|
594
|
+
continue;
|
|
595
|
+
}
|
|
548
596
|
finalReply = response.text ?? "";
|
|
549
597
|
if (finalReply) callbacks?.onText?.(finalReply);
|
|
550
598
|
break;
|
|
551
599
|
}
|
|
600
|
+
protocolViolationHint = void 0;
|
|
601
|
+
const plannedTasksCurrentRound = buildTaskArray(response.toolCalls.map((tc) => ({
|
|
602
|
+
name: tc.name,
|
|
603
|
+
input: tc.input
|
|
604
|
+
})));
|
|
552
605
|
const plannedBatchKey = JSON.stringify(response.toolCalls.map((tc) => ({
|
|
553
606
|
name: tc.name,
|
|
554
607
|
input: tc.input
|
|
@@ -617,9 +670,16 @@ async function executeAgentLoop(params) {
|
|
|
617
670
|
tasks: roundMissingTasks
|
|
618
671
|
};
|
|
619
672
|
else pendingNotFoundRetry = void 0;
|
|
620
|
-
if (
|
|
673
|
+
if (parsedInstructionState.hasRemainingProtocol) remainingInstruction = parsedInstructionState.nextInstruction;
|
|
674
|
+
else {
|
|
675
|
+
const nextByHeuristic = reduceRemainingHeuristically(remainingInstruction, executedTaskCalls.length);
|
|
676
|
+
if (nextByHeuristic !== remainingInstruction) remainingInstruction = nextByHeuristic;
|
|
677
|
+
else roundHasError = true;
|
|
678
|
+
}
|
|
679
|
+
previousRoundModelOutput = parsedInstructionState.hasRemainingProtocol ? normalizeModelOutput(response.text) : `REMAINING: ${remainingInstruction || "DONE"}`;
|
|
621
680
|
lastRoundHadError = roundHasError;
|
|
622
681
|
previousRoundTasks = buildTaskArray(executedTaskCalls);
|
|
682
|
+
previousRoundPlannedTasks = plannedTasksCurrentRound;
|
|
623
683
|
const idleResult = detectIdleLoop(executedTaskCalls.map((tc) => tc.name), consecutiveReadOnlyRounds);
|
|
624
684
|
if (idleResult === -1) {
|
|
625
685
|
finalReply = response.text || "任务已完成。";
|