@possumtech/rummy 0.5.0 → 2.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.env.example +21 -5
- package/PLUGINS.md +389 -194
- package/README.md +25 -8
- package/SPEC.md +850 -373
- package/bin/demo.js +166 -0
- package/bin/rummy.js +9 -3
- package/biome/no-fallbacks.grit +50 -0
- package/lang/en.json +2 -2
- package/migrations/001_initial_schema.sql +88 -37
- package/package.json +6 -4
- package/service.js +50 -9
- package/src/agent/AgentLoop.js +460 -330
- package/src/agent/ContextAssembler.js +4 -4
- package/src/agent/Entries.js +655 -0
- package/src/agent/ProjectAgent.js +30 -18
- package/src/agent/TurnExecutor.js +229 -421
- package/src/agent/XmlParser.js +99 -33
- package/src/agent/budget.js +56 -0
- package/src/agent/errors.js +22 -0
- package/src/agent/httpStatus.js +39 -0
- package/src/agent/known_checks.sql +8 -4
- package/src/agent/known_queries.sql +9 -13
- package/src/agent/known_store.sql +275 -125
- package/src/agent/materializeContext.js +102 -0
- package/src/agent/runs.sql +10 -7
- package/src/agent/schemes.sql +14 -3
- package/src/agent/turns.sql +9 -9
- package/src/hooks/HookRegistry.js +6 -5
- package/src/hooks/Hooks.js +44 -3
- package/src/hooks/PluginContext.js +29 -21
- package/src/{server → hooks}/RpcRegistry.js +2 -1
- package/src/hooks/RummyContext.js +135 -35
- package/src/hooks/ToolRegistry.js +21 -16
- package/src/llm/LlmProvider.js +64 -90
- package/src/llm/errors.js +21 -0
- package/src/plugins/ask_user/README.md +1 -1
- package/src/plugins/ask_user/ask_user.js +37 -12
- package/src/plugins/ask_user/ask_userDoc.js +2 -25
- package/src/plugins/ask_user/ask_userDoc.md +10 -0
- package/src/plugins/budget/README.md +27 -25
- package/src/plugins/budget/budget.js +260 -88
- package/src/plugins/cp/README.md +2 -2
- package/src/plugins/cp/cp.js +29 -11
- package/src/plugins/cp/cpDoc.js +2 -15
- package/src/plugins/cp/cpDoc.md +7 -0
- package/src/plugins/engine/README.md +2 -2
- package/src/plugins/engine/engine.sql +4 -4
- package/src/plugins/engine/turn_context.sql +10 -10
- package/src/plugins/env/README.md +20 -5
- package/src/plugins/env/env.js +45 -6
- package/src/plugins/env/envDoc.js +2 -23
- package/src/plugins/env/envDoc.md +13 -0
- package/src/plugins/error/README.md +16 -0
- package/src/plugins/error/error.js +151 -0
- package/src/plugins/file/README.md +6 -6
- package/src/plugins/file/file.js +15 -2
- package/src/plugins/get/README.md +1 -1
- package/src/plugins/get/get.js +103 -48
- package/src/plugins/get/getDoc.js +2 -32
- package/src/plugins/get/getDoc.md +36 -0
- package/src/plugins/hedberg/README.md +1 -2
- package/src/plugins/hedberg/hedberg.js +8 -4
- package/src/plugins/hedberg/matcher.js +16 -17
- package/src/plugins/hedberg/normalize.js +0 -48
- package/src/plugins/helpers.js +42 -2
- package/src/plugins/index.js +146 -123
- package/src/plugins/instructions/README.md +35 -9
- package/src/plugins/instructions/instructions.js +122 -9
- package/src/plugins/instructions/instructions.md +25 -0
- package/src/plugins/instructions/instructions_104.md +7 -0
- package/src/plugins/instructions/instructions_105.md +46 -0
- package/src/plugins/instructions/instructions_106.md +0 -0
- package/src/plugins/instructions/instructions_107.md +0 -0
- package/src/plugins/instructions/instructions_108.md +8 -0
- package/src/plugins/instructions/protocol.js +12 -0
- package/src/plugins/known/README.md +2 -2
- package/src/plugins/known/known.js +67 -36
- package/src/plugins/known/knownDoc.js +2 -17
- package/src/plugins/known/knownDoc.md +8 -0
- package/src/plugins/log/README.md +48 -0
- package/src/plugins/log/log.js +109 -0
- package/src/plugins/mv/README.md +2 -2
- package/src/plugins/mv/mv.js +55 -22
- package/src/plugins/mv/mvDoc.js +2 -18
- package/src/plugins/mv/mvDoc.md +10 -0
- package/src/plugins/ollama/README.md +15 -0
- package/src/{llm/OllamaClient.js → plugins/ollama/ollama.js} +40 -18
- package/src/plugins/openai/README.md +17 -0
- package/src/plugins/openai/openai.js +120 -0
- package/src/plugins/openrouter/README.md +27 -0
- package/src/plugins/openrouter/openrouter.js +121 -0
- package/src/plugins/persona/README.md +20 -0
- package/src/plugins/persona/persona.js +9 -16
- package/src/plugins/policy/README.md +21 -0
- package/src/plugins/policy/policy.js +29 -14
- package/src/plugins/prompt/README.md +1 -1
- package/src/plugins/prompt/prompt.js +58 -16
- package/src/plugins/rm/README.md +1 -1
- package/src/plugins/rm/rm.js +56 -12
- package/src/plugins/rm/rmDoc.js +2 -20
- package/src/plugins/rm/rmDoc.md +13 -0
- package/src/plugins/rpc/README.md +2 -2
- package/src/plugins/rpc/rpc.js +515 -296
- package/src/plugins/set/README.md +1 -1
- package/src/plugins/set/set.js +318 -75
- package/src/plugins/set/setDoc.js +2 -35
- package/src/plugins/set/setDoc.md +22 -0
- package/src/plugins/sh/README.md +28 -5
- package/src/plugins/sh/sh.js +50 -6
- package/src/plugins/sh/shDoc.js +2 -23
- package/src/plugins/sh/shDoc.md +13 -0
- package/src/plugins/skill/README.md +23 -0
- package/src/plugins/skill/skill.js +14 -18
- package/src/plugins/stream/README.md +101 -0
- package/src/plugins/stream/stream.js +290 -0
- package/src/plugins/telemetry/README.md +1 -1
- package/src/plugins/telemetry/telemetry.js +129 -80
- package/src/plugins/think/README.md +1 -1
- package/src/plugins/think/think.js +12 -0
- package/src/plugins/think/thinkDoc.js +2 -15
- package/src/plugins/think/thinkDoc.md +7 -0
- package/src/plugins/unknown/README.md +3 -3
- package/src/plugins/unknown/unknown.js +47 -19
- package/src/plugins/unknown/unknownDoc.js +2 -21
- package/src/plugins/unknown/unknownDoc.md +11 -0
- package/src/plugins/update/README.md +1 -1
- package/src/plugins/update/update.js +67 -5
- package/src/plugins/update/updateDoc.js +2 -30
- package/src/plugins/update/updateDoc.md +8 -0
- package/src/plugins/xai/README.md +23 -0
- package/src/{llm/XaiClient.js → plugins/xai/xai.js} +58 -37
- package/src/server/ClientConnection.js +64 -37
- package/src/server/SocketServer.js +23 -10
- package/src/server/protocol.js +11 -0
- package/src/sql/v_model_context.sql +27 -31
- package/src/sql/v_run_log.sql +9 -14
- package/EXCEPTIONS.md +0 -46
- package/FIDELITY_CONTRACT.md +0 -172
- package/src/agent/KnownStore.js +0 -337
- package/src/agent/ResponseHealer.js +0 -241
- package/src/llm/OpenAiClient.js +0 -100
- package/src/llm/OpenRouterClient.js +0 -100
- package/src/plugins/budget/recovery.js +0 -47
- package/src/plugins/instructions/preamble.md +0 -45
- package/src/plugins/performed/README.md +0 -15
- package/src/plugins/performed/performed.js +0 -45
- package/src/plugins/previous/README.md +0 -16
- package/src/plugins/previous/previous.js +0 -56
- package/src/plugins/progress/README.md +0 -16
- package/src/plugins/progress/progress.js +0 -43
- package/src/plugins/summarize/README.md +0 -19
- package/src/plugins/summarize/summarize.js +0 -32
- package/src/plugins/summarize/summarizeDoc.js +0 -27
package/src/llm/LlmProvider.js
CHANGED
@@ -1,46 +1,30 @@
 import msg from "../agent/messages.js";
-import
-
-
-
-
+import {
+  ContextExceededError,
+  isContextExceededMessage,
+  isTransientMessage,
+} from "./errors.js";
+
+const MAX_TRANSIENT_RETRIES = 3;
+
+/**
+ * Thin dispatcher over the LLM provider registry (`hooks.llm.providers`).
+ * Resolves the model alias via the DB, finds the highest-priority provider
+ * whose `matches()` returns true, and delegates. Wraps the call with
+ * transient-error retry and surfaces context-exceeded as a typed
+ * ContextExceededError.
+ *
+ * Vendor-specific HTTP is owned by per-vendor plugins under
+ * `src/plugins/{openai,ollama,xai,openrouter,...}/`. Adding a new vendor
+ * is a matter of adding a plugin — no changes here.
+ */
 export default class LlmProvider {
   #db;
-  #
-  #ollama;
-  #openAi;
-  #xai;
+  #hooks;
 
-  constructor(db) {
+  constructor(db, hooks) {
     this.#db = db;
-
-
-  #getOpenRouter() {
-    this.#openRouter ??= new OpenRouterClient(process.env.OPENROUTER_API_KEY);
-    return this.#openRouter;
-  }
-
-  #getOllama() {
-    this.#ollama ??= new OllamaClient(process.env.OLLAMA_BASE_URL);
-    return this.#ollama;
-  }
-
-  #getOpenAi() {
-    if (!this.#openAi) {
-      const baseUrl = process.env.OPENAI_BASE_URL;
-      if (!baseUrl) throw new Error(msg("error.openai_base_url_missing"));
-      this.#openAi = new OpenAiClient(baseUrl, process.env.OPENAI_API_KEY);
-    }
-    return this.#openAi;
-  }
-
-  #getXai() {
-    if (!this.#xai) {
-      const baseUrl = process.env.XAI_BASE_URL;
-      if (!baseUrl) throw new Error(msg("error.xai_base_url_missing"));
-      this.#xai = new XaiClient(baseUrl, process.env.XAI_API_KEY);
-    }
-    return this.#xai;
+    this.#hooks = hooks;
   }
 
   async resolve(alias) {
@@ -49,6 +33,10 @@ export default class LlmProvider {
     throw new Error(msg("error.model_alias_unknown", { alias }));
   }
 
+  #selectProvider(modelAlias) {
+    return this.#hooks.llm.providers.find((p) => p.matches(modelAlias));
+  }
+
   async completion(messages, model, options = {}) {
     const resolvedModel = await this.resolve(model);
 
@@ -59,68 +47,54 @@ export default class LlmProvider {
       : undefined);
     const resolvedOptions = { ...options, temperature };
 
-
-
-
-
-
-        resolvedOptions,
-      );
-    }
-
-    if (resolvedModel.startsWith("openai/")) {
-      const localModel = resolvedModel.replace("openai/", "");
-      return this.#getOpenAi().completion(
-        messages,
-        localModel,
-        resolvedOptions,
+    const provider = this.#selectProvider(resolvedModel);
+    if (!provider) {
+      throw new Error(
+        `No LLM provider registered for model "${resolvedModel}". ` +
+          `Check your RUMMY_* env vars or register a provider plugin.`,
       );
     }
 
-
-
-
+    for (let attempt = 0; ; attempt++) {
+      try {
+        return await provider.completion(
+          messages,
+          resolvedModel,
+          resolvedOptions,
+        );
+      } catch (err) {
+        if (isContextExceededMessage(err.message)) {
+          throw new ContextExceededError(err.message, { cause: err });
+        }
+        if (
+          isTransientMessage(err.message) &&
+          attempt < MAX_TRANSIENT_RETRIES
+        ) {
+          const delay = 1000 * 2 ** attempt;
+          await new Promise((r) => setTimeout(r, delay));
+          continue;
+        }
+        throw err;
+      }
     }
-
-    return this.#getOpenRouter().completion(
-      messages,
-      resolvedModel,
-      resolvedOptions,
-    );
   }
 
   async getContextSize(model) {
-
-    if (
-      const row = await this.#db.get_model_by_alias.get({ alias: model });
-      if (row?.context_length) return row.context_length;
-    }
+    const row = await this.#db.get_model_by_alias.get({ alias: model });
+    if (row?.context_length) return row.context_length;
 
-    // Fall back to API query
     const resolvedModel = await this.resolve(model);
-
-    if (
-
-
-
-      size = await this.#getOpenAi().getContextSize(resolvedModel);
-    } else if (resolvedModel.startsWith("x.ai/")) {
-      const localModel = resolvedModel.replace("x.ai/", "");
-      size = await this.#getXai().getContextSize(localModel);
-    } else {
-      size = await this.#getOpenRouter().getContextSize(resolvedModel);
-    }
-
-    // Cache back to DB for next time
-    if (this.#db && size) {
-      await this.#db.update_model_context_length
-        .run({
-          alias: model,
-          context_length: size,
-        })
-        .catch(() => {});
+    const provider = this.#selectProvider(resolvedModel);
+    if (!provider) {
+      throw new Error(
+        `No LLM provider registered for model "${resolvedModel}".`,
+      );
     }
-
+    const size = await provider.getContextSize(resolvedModel);
+    await this.#db.update_model_context_length.run({
+      alias: model,
+      context_length: size,
+    });
     return size;
   }
 }
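The registry lookup the diff introduces (`#selectProvider`) can be sketched in isolation. The two provider objects below are illustrative stubs, not the real ollama/openrouter plugins; only the find-first-match logic comes from the diff:

```javascript
// Stand-in registry mirroring hooks.llm.providers. Array order is
// priority order; the first provider whose matches() passes wins,
// exactly as in the new LlmProvider#selectProvider.
const providers = [
  { name: "ollama", matches: (m) => m.startsWith("ollama/") },
  { name: "openrouter", matches: () => true }, // stub catch-all, lowest priority
];

function selectProvider(modelAlias) {
  return providers.find((p) => p.matches(modelAlias));
}

console.log(selectProvider("ollama/llama3").name); // "ollama"
console.log(selectProvider("x.ai/grok-2").name);   // "openrouter"
```

Because `find` returns `undefined` when nothing matches, the caller's `if (!provider)` guard is what turns a registry miss into the "No LLM provider registered" error.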
package/src/llm/errors.js
ADDED
@@ -0,0 +1,21 @@
+export class ContextExceededError extends Error {
+  constructor(message, { cause } = {}) {
+    super(message);
+    this.name = "ContextExceededError";
+    if (cause) this.cause = cause;
+  }
+}
+
+const CONTEXT_EXCEEDED_PATTERN =
+  /\b(context.*(size|length|limit)|token.*(limit|exceed)|too.*(long|large))\b/i;
+
+export function isContextExceededMessage(message) {
+  return CONTEXT_EXCEEDED_PATTERN.test(String(message));
+}
+
+const TRANSIENT_PATTERN =
+  /\b(503|429|timeout|ECONNREFUSED|ECONNRESET|unavailable)\b/i;
+
+export function isTransientMessage(message) {
+  return TRANSIENT_PATTERN.test(String(message));
+}
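The two message classifiers are pure regex tests, so their behavior is easy to pin down. The regexes below are copied verbatim from the new errors.js; the sample messages are made up for illustration:

```javascript
// Classifier regexes copied verbatim from the new errors.js.
const CONTEXT_EXCEEDED_PATTERN =
  /\b(context.*(size|length|limit)|token.*(limit|exceed)|too.*(long|large))\b/i;
const TRANSIENT_PATTERN =
  /\b(503|429|timeout|ECONNREFUSED|ECONNRESET|unavailable)\b/i;

const isContextExceededMessage = (m) => CONTEXT_EXCEEDED_PATTERN.test(String(m));
const isTransientMessage = (m) => TRANSIENT_PATTERN.test(String(m));

console.log(isContextExceededMessage("prompt exceeds the context length limit")); // true
console.log(isTransientMessage("upstream returned 429, retry later"));            // true
console.log(isTransientMessage("invalid API key"));                               // false
```

In `LlmProvider.completion` the context-exceeded check runs before the transient check, so a message matching both patterns is surfaced as a `ContextExceededError` rather than retried.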
package/src/plugins/ask_user/ask_user.js
CHANGED
@@ -1,5 +1,7 @@
 import docs from "./ask_userDoc.js";
 
+const LOG_ACTION_RE = /^log:\/\/turn_\d+\/(\w+)\//;
+
 export default class AskUser {
   #core;
 
@@ -7,28 +9,50 @@ export default class AskUser {
     this.#core = core;
     core.registerScheme();
     core.on("handler", this.handler.bind(this));
-    core.on("
-    core.on("
+    core.on("visible", this.full.bind(this));
+    core.on("summarized", this.summary.bind(this));
     core.filter("instructions.toolDocs", async (docsMap) => {
       docsMap.ask_user = docs;
       return docsMap;
     });
+    core.on("proposal.accepted", this.#onResolved.bind(this));
+    core.on("proposal.rejected", this.#onResolved.bind(this));
+  }
+
+  async #onResolved(ctx) {
+    const m = LOG_ACTION_RE.exec(ctx.path);
+    if (m?.[1] !== "ask_user") return;
+    if (!ctx.output) return;
+    const turn = (await ctx.db.get_run_by_id.get({ id: ctx.runId })).next_turn;
+    await ctx.entries.set({
+      runId: ctx.runId,
+      turn,
+      path: ctx.path,
+      body: ctx.resolvedBody,
+      attributes: { ...ctx.attrs, answer: ctx.output },
+    });
   }
 
   async handler(entry, rummy) {
     const { entries: store, sequence: turn, runId, loopId } = rummy;
+    // XmlParser resolved question/options from attr-or-body already.
     const { question, options: rawOptions } = entry.attributes;
 
-
-
-
-
-
-
-
-
+    let options = [];
+    if (rawOptions) {
+      const delimiter = rawOptions.includes(";") ? ";" : ",";
+      options = rawOptions
+        .split(delimiter)
+        .map((o) => o.trim())
+        .filter(Boolean);
+    }
 
-    await store.
+    await store.set({
+      runId,
+      turn,
+      path: entry.resultPath,
+      body: entry.body,
+      state: "proposed",
       attributes: { question, options },
       loopId,
     });
@@ -44,6 +68,7 @@ export default class AskUser {
 
   summary(entry) {
     const { question, answer } = entry.attributes;
-
+    if (answer) return `${question} → ${answer}`;
+    return question;
   }
 }
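The new options parse in `handler` sniffs the delimiter, with semicolons winning over commas. Extracted here as a standalone function for illustration (the name `parseOptions` is hypothetical; the body is the logic from the diff):

```javascript
// Delimiter sniffing as added to ask_user.handler: prefer ";" when
// present, otherwise fall back to ",", then trim and drop empties.
function parseOptions(rawOptions) {
  if (!rawOptions) return [];
  const delimiter = rawOptions.includes(";") ? ";" : ",";
  return rawOptions
    .split(delimiter)
    .map((o) => o.trim())
    .filter(Boolean);
}

console.log(parseOptions("Mocha; Jest; Node Native")); // ["Mocha", "Jest", "Node Native"]
console.log(parseOptions("staging,production"));       // ["staging", "production"]
console.log(parseOptions(undefined));                  // []
```

The semicolon preference lets individual options contain commas, matching the `option1; option2; ...` shape the tool doc advertises.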
package/src/plugins/ask_user/ask_userDoc.js
CHANGED
@@ -1,26 +1,3 @@
-
-// Text goes to the model. Rationale stays in source.
-// Changing ANY line requires reading ALL rationales first.
-const LINES = [
-  [
-    '## <ask_user question="[Question?]">[option1; option2; ...]</ask_user> - Ask the user a question',
-  ],
-  [
-    "* YOU SHOULD use for decisions, preferences, or approvals the user must make",
-    "Positive framing. Shows what ask_user IS for.",
-  ],
-  [
-    "* YOU SHOULD use <get></get> to find information before asking the user",
-    "Gentle redirect. Encourages self-sufficiency.",
-  ],
-  [
-    'Example: <ask_user question="Which test framework?">Mocha; Jest; Node Native</ask_user>',
-    "Preference decision. Model truly cannot know this without asking.",
-  ],
-  [
-    'Example: <ask_user question="Deploy to staging or production?">staging; production</ask_user>',
-    "Consequential action. High-stakes choice.",
-  ],
-];
+import { loadDoc } from "../helpers.js";
 
-export default
+export default loadDoc(import.meta.url, "ask_userDoc.md");
package/src/plugins/ask_user/ask_userDoc.md
ADDED
@@ -0,0 +1,10 @@
+## <ask_user question="[Question?]">[option1; option2; ...]</ask_user> - Ask the user a question
+
+* YOU SHOULD ONLY use for decisions, preferences, or approvals the user must make
+<!-- Positive framing. Shows what ask_user IS for. -->
+
+Example: <ask_user question="Which test framework?">Mocha; Jest; Node Native</ask_user>
+<!-- Preference decision. Model truly cannot know this without asking. -->
+
+Example: <ask_user question="Deploy to staging or production?">staging; production</ask_user>
+<!-- Consequential action. High-stakes choice. -->
package/src/plugins/budget/README.md
CHANGED
@@ -1,41 +1,43 @@
-# budget
+# budget {#budget_plugin}
 
 Context ceiling enforcement.
 
 ## Design
 
-Ceiling = `floor(contextSize × 0.9)
-operating room for graceful overflow
-tools run uninterrupted. Enforcement
+Ceiling = `floor(contextSize × RUMMY_BUDGET_CEILING)` (default 0.9). The
+10% headroom is the system's operating room for graceful overflow
+handling. No per-write gating — tools run uninterrupted. Enforcement
+happens at boundaries.
 
 ## Enforcement Points
 
-1. **Pre-LLM enforce** (`budget.enforce`): checks assembled context
-   before the LLM call. If over ceiling → Prompt Demotion
-   the incoming prompt).
+1. **Pre-LLM enforce** (`hooks.budget.enforce`): checks assembled context
+   before the LLM call. If over ceiling on turn 1 → Prompt Demotion
+   (demote the incoming prompt, re-materialize, re-check). Runs in the
+   headroom if that fits. On non-first turns or still-over after
+   Prompt Demotion, emits a 413 error via `hooks.error.log` so the
+   strike system treats the overflow as a turn-level event.
 
-2. **Post-dispatch Turn Demotion
-
-   (
-
-
-
+2. **Post-dispatch Turn Demotion** (`hooks.budget.postDispatch`): after
+   all tools dispatch, re-materialize and check. If over ceiling →
+   demote ALL visible entries from this turn (status < 400, status
+   preserved — demotion only changes visibility). Emits a 413 error
+   with the 50% rule directive as its message; the error entry is
+   what the model sees next turn.
 
-3. **LLM rejection** (`isContextExceeded`): turn-1
-   drift causes LLM to reject. Same
-
-4. **AgentLoop recovery**: pre-LLM 413 that Prompt Demotion can't
-   resolve. Batch-demote all full entries, budget entry, model gets
-   recovery turns. 3 strikes without progress → hard 413 to client.
-   Only path where 413 reaches the client.
+3. **LLM rejection** (`isContextExceeded` in TurnExecutor): turn-1
+   token estimate drift causes LLM to reject. Same 413 error path as
+   pre-LLM overflow.
 
 ## Files
 
-- **budget.js** — Plugin.
-
+- **budget.js** — Plugin. Enforce + postDispatch methods exposed on
+  `core.hooks.budget`.
 
 ## Registration
 
-- **Hook**: `hooks.budget.enforce` — pre-LLM ceiling check
-
-
+- **Hook**: `hooks.budget.enforce` — pre-LLM ceiling check + first-turn
+  Prompt Demotion.
+- **Hook**: `hooks.budget.postDispatch` — post-dispatch re-check + Turn
+  Demotion. Emits 413 errors through the unified error channel; there
+  is no separate `budget://` scheme.