prjct-cli 1.9.0 → 1.10.0
- package/CHANGELOG.md +75 -1
- package/core/__tests__/agentic/prompt-assembly.test.ts +298 -0
- package/core/__tests__/agentic/prompt-builder.test.ts +2 -2
- package/core/__tests__/agentic/token-budget.test.ts +294 -0
- package/core/agentic/anti-hallucination.ts +124 -0
- package/core/agentic/environment-block.ts +102 -0
- package/core/agentic/injection-validator.ts +16 -0
- package/core/agentic/prompt-builder.ts +339 -167
- package/core/agentic/token-budget.ts +226 -0
- package/core/services/context-selector.ts +8 -2
- package/dist/bin/prjct.mjs +435 -188
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,80 @@
 # Changelog
 
+## [1.10.0] - 2026-02-08
+
+### Features
+
+- redesign prompt assembly with correct section ordering + anti-hallucination (PRJ-301) (#152)
+- add coordinated global token budget (PRJ-266) (#151)
+
+
+## [1.12.0] - 2026-02-07
+
+### Features
+- **Prompt Assembly Redesign**: Correct section ordering based on research of 25+ system prompts (PRJ-301)
+- **Environment Block**: Structured `<env>` block with project, git, platform, runtime, and model metadata
+- **Anti-Hallucination Block**: Explicit availability/unavailability grounding injected BEFORE task context
+- **Token Efficiency Directive**: Conciseness rules appended to every prompt
+
+### Implementation Details
+Redesigned `prompt-builder.ts` section ordering to follow research-backed pattern:
+Identity → Environment → Ground Truth → Capabilities → Constraints → Task Context → Task → Output Schema → Efficiency
+
+Key changes:
+- New `environment-block.ts`: Generates `<env>` XML block with auto-detected runtime, platform, and normalized names
+- New `anti-hallucination.ts`: Generates constraints block from sealed analysis (available tech, absent domains, grounding rules)
+- Moved template content (task instructions) to section 7 — LLM knows identity, env, and rules before reading task
+- Anti-hallucination block placed at section 5 (before task context), replacing old `RULES (CRITICAL)` at the end
+- Added `buildEfficiencyDirective()` with conciseness rules (max 4 lines, no preamble/postamble)
+- Exported `PROMPT_SECTION_ORDER` constant and `SectionPriority` type for budget trimming
+- Kept `buildCriticalRules()` as fallback when project context unavailable
+
+### Learnings
+- Zod `.default()` only applies during `.parse()` — raw object construction skips defaults, use `??` fallback
+- Renaming prompt section headers breaks existing test assertions — always update test matchers
+- Template position matters: placing task instructions after constraints improves LLM grounding
+
+### Test Plan
+
+#### For QA
+1. Run `bun test` — all 719 tests pass (0 failures)
+2. Run `bun run build` — build succeeds
+3. Verify `<env>` block appears in generated prompts before constraints
+4. Verify `CONSTRAINTS (Read Before Acting)` appears before template content
+5. Verify `OUTPUT RULES` section appears at end of prompt
+6. Check `AVAILABLE` and `NOT PRESENT` lists reflect project tech stack
+7. Run `prjct sync` — prompt assembly still works end-to-end
+
+#### For Users
+Prompts sent to AI models are now structured with research-backed section ordering, reducing hallucinations and improving response conciseness. No user action required — improvements are automatic.
+
+## [1.11.0] - 2026-02-07
+
+### Features
+- **Token Budget Coordinator**: Centralized token budget management across all context-loading components (PRJ-266)
+
+### Implementation Details
+Created `TokenBudgetCoordinator` class that manages the global token budget based on model context windows. Key features:
+- Model context window registry (Claude 200K, Gemini 1M) with automatic budget calculation
+- Input/output budget split: 65% input, 35% reserved for output
+- Priority-based allocation: state (P1) > injection context (P2) > file content (P3)
+- Request/record API for usage tracking with overflow detection
+- Integrated into `injection-validator.ts`, `prompt-builder.ts`, and `context-selector.ts`
+- Backward compatible: falls back to existing defaults when no coordinator is set
+
+### Test Plan
+
+#### For QA
+1. Create coordinator with `'sonnet'` → input budget = 130K, output reserve = 70K
+2. Create with `'2.5-pro'` (Gemini) → input budget = 650K (5x Claude)
+3. Request tokens up to allocation limit → verify grants are capped
+4. Exhaust a category budget → verify subsequent requests return 0
+5. Verify `budgetsFromCoordinator()` uses coordinator's injection allocation
+6. Run full test suite → all 705 tests pass
+
+#### For Users
+Token budgets are now centrally coordinated based on the model's context window. Larger models get proportionally larger budgets automatically. No breaking changes.
+
 ## [1.9.0] - 2026-02-07
 
 ### Features
@@ -11,7 +86,6 @@
 
 - replace keyword domain detection with LLM semantic classification (PRJ-299) (#148)
 
-
 ## [1.10.0] - 2026-02-07
 
 ### Features
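The section ordering described in the 1.12.0 changelog entry can be pictured as a fixed list that the builder walks from first to last. The following is a minimal sketch under stated assumptions, not the package's `prompt-builder.ts`: `assemblePrompt` and `SectionMap` are hypothetical names, and only the nine section keys are taken from the `PROMPT_SECTION_ORDER` assertion in this diff.

```typescript
// Section keys as asserted by the PROMPT_SECTION_ORDER test in this diff.
const SECTION_ORDER = [
  'identity',
  'environment',
  'ground_truth',
  'capabilities',
  'constraints',
  'task_context',
  'task',
  'output_schema',
  'efficiency',
] as const

type SectionKey = (typeof SECTION_ORDER)[number]

// Hypothetical shape: any section may be missing.
type SectionMap = Partial<Record<SectionKey, string>>

// Hypothetical helper: emit the sections that are present in the fixed
// order, regardless of the order in which the caller supplied them.
function assemblePrompt(sections: SectionMap): string {
  return SECTION_ORDER.map((key) => sections[key])
    .filter((text): text is string => typeof text === 'string' && text.length > 0)
    .join('\n\n')
}

// Sections are passed out of order on purpose.
const examplePrompt = assemblePrompt({
  task: '## Flow\nStep 1',
  efficiency: 'OUTPUT RULES: Be concise.',
  identity: 'TASK: My Task',
  constraints: 'CONSTRAINTS (Read Before Acting)',
  environment: '<env>\nproject: my-app\n</env>',
})
```

Driving the assembly from a single ordered constant means a position check such as "constraints before template content" only depends on that constant, which is the property the ordering tests in this diff probe with `indexOf`.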
package/core/__tests__/agentic/prompt-assembly.test.ts
@@ -0,0 +1,298 @@
+/**
+ * Prompt Assembly Tests (PRJ-301)
+ *
+ * Tests for the redesigned prompt assembly:
+ * - Section ordering (Identity → Env → Ground Truth → ... → Efficiency)
+ * - Environment block generation
+ * - Anti-hallucination block generation
+ * - Token efficiency directive
+ * - Budget trimming with priorities
+ */
+
+import { beforeEach, describe, expect, it } from 'bun:test'
+import {
+  buildAntiHallucinationBlock,
+  type ProjectGroundTruth,
+} from '../../agentic/anti-hallucination'
+import { buildEnvironmentBlock, type EnvironmentBlockInput } from '../../agentic/environment-block'
+import promptBuilder, { PROMPT_SECTION_ORDER } from '../../agentic/prompt-builder'
+
+// =============================================================================
+// Environment Block
+// =============================================================================
+
+describe('Environment Block (PRJ-301)', () => {
+  it('should generate <env> block with all fields', () => {
+    const input: EnvironmentBlockInput = {
+      projectName: 'my-app',
+      projectPath: '/home/user/my-app',
+      isGitRepo: true,
+      gitBranch: 'feature/login',
+      platform: 'darwin',
+      runtime: 'bun',
+      date: '2026-02-07',
+      model: 'opus',
+      provider: 'claude',
+    }
+
+    const block = buildEnvironmentBlock(input)
+
+    expect(block).toContain('<env>')
+    expect(block).toContain('</env>')
+    expect(block).toContain('project: my-app')
+    expect(block).toContain('path: /home/user/my-app')
+    expect(block).toContain('git: true')
+    expect(block).toContain('branch: feature/login')
+    expect(block).toContain('platform: macOS')
+    expect(block).toContain('runtime: bun')
+    expect(block).toContain('date: 2026-02-07')
+    expect(block).toContain('model: opus')
+    expect(block).toContain('provider: claude')
+  })
+
+  it('should omit undefined fields', () => {
+    const input: EnvironmentBlockInput = {
+      projectName: 'my-app',
+      projectPath: '/test',
+    }
+
+    const block = buildEnvironmentBlock(input)
+
+    expect(block).toContain('project: my-app')
+    expect(block).toContain('path: /test')
+    expect(block).not.toContain('model:')
+    expect(block).not.toContain('provider:')
+    expect(block).not.toContain('branch:')
+  })
+
+  it('should normalize platform names', () => {
+    expect(
+      buildEnvironmentBlock({ projectName: 'x', projectPath: '/x', platform: 'darwin' })
+    ).toContain('platform: macOS')
+    expect(
+      buildEnvironmentBlock({ projectName: 'x', projectPath: '/x', platform: 'linux' })
+    ).toContain('platform: Linux')
+    expect(
+      buildEnvironmentBlock({ projectName: 'x', projectPath: '/x', platform: 'win32' })
+    ).toContain('platform: Windows')
+  })
+
+  it('should auto-detect runtime and date when not provided', () => {
+    const block = buildEnvironmentBlock({ projectName: 'x', projectPath: '/x' })
+
+    // Should have a runtime (bun or node)
+    expect(block).toMatch(/runtime: (bun|node)/)
+    // Should have a date in YYYY-MM-DD format
+    expect(block).toMatch(/date: \d{4}-\d{2}-\d{2}/)
+  })
+})
+
+// =============================================================================
+// Anti-Hallucination Block
+// =============================================================================
+
+describe('Anti-Hallucination Block (PRJ-301)', () => {
+  it('should generate constraints block with availability', () => {
+    const truth: ProjectGroundTruth = {
+      projectPath: '/home/user/my-app',
+      language: 'TypeScript',
+      framework: 'Hono',
+      techStack: ['Hono', 'Zod', 'Vitest'],
+      domains: {
+        hasFrontend: false,
+        hasBackend: true,
+        hasDatabase: false,
+        hasTesting: true,
+        hasDocker: false,
+      },
+      fileCount: 292,
+      availableAgents: ['backend', 'testing'],
+    }
+
+    const block = buildAntiHallucinationBlock(truth)
+
+    // Section header
+    expect(block).toContain('CONSTRAINTS (Read Before Acting)')
+
+    // Availability
+    expect(block).toContain('AVAILABLE in this project: TypeScript, Hono, Zod, Vitest')
+
+    // Unavailability (no frontend, no database, no docker)
+    expect(block).toContain('NOT PRESENT:')
+    expect(block).toContain('Frontend (UI/components)')
+    expect(block).toContain('Database (SQL/ORM)')
+    expect(block).toContain('Docker/containers')
+
+    // Should NOT list present domains as absent
+    expect(block).not.toContain('NOT PRESENT: Backend')
+    expect(block).not.toContain('NOT PRESENT: Testing')
+
+    // Agents
+    expect(block).toContain('AGENTS: backend, testing')
+
+    // Grounding rules
+    expect(block).toContain('SCOPE: Only files in `/home/user/my-app` are accessible.')
+    expect(block).toContain('Do NOT infer or guess paths')
+    expect(block).toContain('NEVER assume a library is available')
+    expect(block).toContain('trust this section')
+
+    // File count
+    expect(block).toContain('292 files in project')
+  })
+
+  it('should handle minimal input', () => {
+    const truth: ProjectGroundTruth = {
+      projectPath: '/test',
+    }
+
+    const block = buildAntiHallucinationBlock(truth)
+
+    expect(block).toContain('CONSTRAINTS')
+    expect(block).toContain('SCOPE: Only files in `/test` are accessible.')
+    // Should not have AVAILABLE or NOT PRESENT lines
+    expect(block).not.toContain('AVAILABLE in this project:')
+    expect(block).not.toContain('NOT PRESENT:')
+  })
+
+  it('should not duplicate framework in techStack listing', () => {
+    const truth: ProjectGroundTruth = {
+      projectPath: '/test',
+      language: 'TypeScript',
+      framework: 'Next.js',
+      techStack: ['Next.js', 'React', 'Tailwind'],
+    }
+
+    const block = buildAntiHallucinationBlock(truth)
+
+    // Next.js should appear once (from framework), not duplicated from techStack
+    const matches = block.match(/Next\.js/g)
+    expect(matches?.length).toBe(1)
+  })
+})
+
+// =============================================================================
+// Section Ordering
+// =============================================================================
+
+describe('Prompt Section Ordering (PRJ-301)', () => {
+  let builder: typeof promptBuilder
+
+  beforeEach(() => {
+    builder = promptBuilder
+    builder.resetContext()
+  })
+
+  it('should define correct section order constant', () => {
+    expect(PROMPT_SECTION_ORDER).toEqual([
+      'identity',
+      'environment',
+      'ground_truth',
+      'capabilities',
+      'constraints',
+      'task_context',
+      'task',
+      'output_schema',
+      'efficiency',
+    ])
+  })
+
+  it('should place environment block before constraints', async () => {
+    const template = {
+      frontmatter: { description: 'Test', 'allowed-tools': ['Read'] },
+      content: '## Instructions\nDo the thing',
+    }
+    const context = { projectPath: '/test', files: ['a.js'] }
+
+    const prompt = await builder.build(template, context, {})
+
+    const envPos = prompt.indexOf('<env>')
+    const constraintsPos = prompt.indexOf('CONSTRAINTS')
+    expect(envPos).toBeGreaterThan(-1)
+    expect(constraintsPos).toBeGreaterThan(-1)
+    expect(envPos).toBeLessThan(constraintsPos)
+  })
+
+  it('should place constraints before task template content', async () => {
+    const template = {
+      frontmatter: { description: 'Test' },
+      content: '## UNIQUE_TEMPLATE_MARKER\nFollow these steps',
+    }
+    const context = { projectPath: '/test', files: ['a.js'] }
+
+    const prompt = await builder.build(template, context, {})
+
+    const constraintsPos = prompt.indexOf('CONSTRAINTS')
+    const templatePos = prompt.indexOf('UNIQUE_TEMPLATE_MARKER')
+    expect(constraintsPos).toBeGreaterThan(-1)
+    expect(templatePos).toBeGreaterThan(-1)
+    expect(constraintsPos).toBeLessThan(templatePos)
+  })
+
+  it('should place identity (TASK:) at the beginning', async () => {
+    const template = {
+      frontmatter: { description: 'My Task' },
+      content: '## Flow\nStep 1',
+    }
+    const context = { projectPath: '/test' }
+
+    const prompt = await builder.build(template, context, {})
+
+    const taskPos = prompt.indexOf('TASK: My Task')
+    expect(taskPos).toBeLessThan(50)
+  })
+
+  it('should place efficiency directive at the end', async () => {
+    const template = {
+      frontmatter: { description: 'Test' },
+      content: '## Flow\nStep 1',
+    }
+    const context = { projectPath: '/test' }
+
+    const prompt = await builder.build(template, context, {})
+
+    const efficiencyPos = prompt.indexOf('OUTPUT RULES')
+    const executePos = prompt.indexOf('EXECUTE:')
+    expect(efficiencyPos).toBeGreaterThan(-1)
+    expect(executePos).toBeGreaterThan(-1)
+    // Should be in the last ~300 chars of the prompt
+    expect(prompt.length - executePos).toBeLessThan(300)
+  })
+})
+
+// =============================================================================
+// Token Efficiency Directive
+// =============================================================================
+
+describe('Token Efficiency Directive (PRJ-301)', () => {
+  let builder: typeof promptBuilder
+
+  beforeEach(() => {
+    builder = promptBuilder
+    builder.resetContext()
+  })
+
+  it('should include efficiency rules in every prompt', async () => {
+    const template = {
+      frontmatter: { description: 'Test' },
+      content: '## Flow\nStep 1',
+    }
+    const context = { projectPath: '/test' }
+
+    const prompt = await builder.build(template, context, {})
+
+    expect(prompt).toContain('OUTPUT RULES')
+    expect(prompt).toContain('Be concise')
+    expect(prompt).toContain('No preamble')
+    expect(prompt).toContain('No postamble')
+    expect(prompt).toContain('EXECUTE:')
+  })
+
+  it('should build efficiency directive as standalone method', () => {
+    const directive = builder.buildEfficiencyDirective()
+
+    expect(directive).toContain('Maximum 4 lines')
+    expect(directive).toContain('No preamble')
+    expect(directive).toContain('Prefer structured output')
+    expect(directive).toContain('EXECUTE:')
+  })
+})
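The environment-block tests above pin down an output shape: a `<env>` wrapper, one `key: value` line per field, undefined fields dropped, and Node-style platform ids mapped to display names. The sketch below is one way to satisfy those assertions, assuming nothing about the real `environment-block.ts`. `sketchEnvironmentBlock`, `EnvInput`, and `PLATFORM_NAMES` are hypothetical names, and the runtime/date auto-detection that the last test covers is deliberately left out so the output stays deterministic.

```typescript
// Hypothetical input shape, modelled on the fields the tests pass in.
interface EnvInput {
  projectName: string
  projectPath: string
  isGitRepo?: boolean
  gitBranch?: string
  platform?: string
  runtime?: string
  date?: string
  model?: string
  provider?: string
}

// Assumed mapping from platform ids to the display names the tests expect.
const PLATFORM_NAMES: { [id: string]: string | undefined } = {
  darwin: 'macOS',
  linux: 'Linux',
  win32: 'Windows',
}

function sketchEnvironmentBlock(input: EnvInput): string {
  // Unknown platform ids pass through unchanged (an assumption of this sketch).
  const platform =
    input.platform === undefined ? undefined : (PLATFORM_NAMES[input.platform] ?? input.platform)

  const fields: Array<[string, string | boolean | undefined]> = [
    ['project', input.projectName],
    ['path', input.projectPath],
    ['git', input.isGitRepo],
    ['branch', input.gitBranch],
    ['platform', platform],
    ['runtime', input.runtime],
    ['date', input.date],
    ['model', input.model],
    ['provider', input.provider],
  ]

  // Drop fields that were not provided, then render "key: value" lines.
  const lines = fields
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${key}: ${value}`)

  return ['<env>', ...lines, '</env>'].join('\n')
}
```

Keeping the fields in a single ordered array makes the "omit undefined" rule a one-line filter rather than a chain of conditionals.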
package/core/__tests__/agentic/prompt-builder.test.ts
@@ -107,7 +107,7 @@ describe('PromptBuilder', () => {
 
     const prompt = await builder.build(template, context, state)
 
-    expect(prompt).toContain('
+    expect(prompt).toContain('STACK')
     expect(prompt).toContain('Node.js')
   })
 
@@ -187,7 +187,7 @@ describe('PromptBuilder', () => {
     expect(prompt).toContain('TASK:')
     expect(prompt).toContain('TOOLS:')
     expect(prompt).toContain('Flow')
-    expect(prompt).toContain('
+    expect(prompt).toContain('CONSTRAINTS (Read Before Acting)')
     expect(prompt).toContain('## FILES:')
   })
 
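The updated assertion looks for the `CONSTRAINTS (Read Before Acting)` header, which the prompt-assembly tests in this diff associate with `buildAntiHallucinationBlock`. As a rough illustration of how such a block could be derived from project facts, here is a sketch that reproduces only the strings those tests assert. `sketchConstraintsBlock` and `GroundTruth` are hypothetical names, the line order is a guess, the labels for the backend and testing domains are not visible in this diff and are therefore omitted, and the remaining grounding rules are left out.

```typescript
// Hypothetical input shape, modelled on the fields the tests pass in.
interface GroundTruth {
  projectPath: string
  language?: string
  framework?: string
  techStack?: string[]
  domains?: {
    hasFrontend?: boolean
    hasBackend?: boolean
    hasDatabase?: boolean
    hasTesting?: boolean
    hasDocker?: boolean
  }
  fileCount?: number
  availableAgents?: string[]
}

// Only the three labels that appear in the test assertions.
const DOMAIN_LABELS = [
  ['hasFrontend', 'Frontend (UI/components)'],
  ['hasDatabase', 'Database (SQL/ORM)'],
  ['hasDocker', 'Docker/containers'],
] as const

function sketchConstraintsBlock(truth: GroundTruth): string {
  const lines: string[] = ['CONSTRAINTS (Read Before Acting)']

  // Language, then framework, then the rest of the stack, with repeats removed.
  const candidates = [truth.language, truth.framework, ...(truth.techStack ?? [])]
  const available = Array.from(new Set(candidates)).filter(
    (name): name is string => typeof name === 'string' && name.length > 0
  )
  if (available.length > 0) lines.push(`AVAILABLE in this project: ${available.join(', ')}`)

  // List a domain as absent only when it is explicitly flagged false.
  const domains = truth.domains
  if (domains) {
    const absent = DOMAIN_LABELS.filter(([flag]) => domains[flag] === false).map(([, label]) => label)
    if (absent.length > 0) lines.push(`NOT PRESENT: ${absent.join(', ')}`)
  }

  if (truth.availableAgents && truth.availableAgents.length > 0) {
    lines.push(`AGENTS: ${truth.availableAgents.join(', ')}`)
  }
  if (truth.fileCount !== undefined) lines.push(`${truth.fileCount} files in project`)
  lines.push(`SCOPE: Only files in \`${truth.projectPath}\` are accessible.`)

  return lines.join('\n')
}
```

Deduplicating through a `Set` is what keeps a framework such as `Next.js` from appearing twice when it is also listed in `techStack`.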
package/core/__tests__/agentic/token-budget.test.ts
@@ -0,0 +1,294 @@
+/**
+ * Token Budget Coordinator Tests
+ *
+ * Tests for:
+ * - Budget calculation from model context windows
+ * - Priority-based allocation (state > injection > files)
+ * - Token request/record tracking
+ * - Overflow detection and prevention
+ * - Different model allocations
+ *
+ * @see PRJ-266
+ */
+
+import { describe, expect, it } from 'bun:test'
+import { budgetsFromCoordinator } from '../../agentic/injection-validator'
+import {
+  calculateInputBudget,
+  calculateOutputReserve,
+  getContextWindow,
+  INPUT_RATIO,
+  MODEL_CONTEXT_WINDOWS,
+  TokenBudgetCoordinator,
+} from '../../agentic/token-budget'
+
+// =============================================================================
+// getContextWindow
+// =============================================================================
+
+describe('getContextWindow', () => {
+  it('should return 200K for Claude models', () => {
+    expect(getContextWindow('opus')).toBe(200_000)
+    expect(getContextWindow('sonnet')).toBe(200_000)
+    expect(getContextWindow('haiku')).toBe(200_000)
+  })
+
+  it('should return 1M for Gemini models', () => {
+    expect(getContextWindow('2.5-pro')).toBe(1_000_000)
+    expect(getContextWindow('2.5-flash')).toBe(1_000_000)
+  })
+
+  it('should return default for unknown models', () => {
+    expect(getContextWindow('unknown-model')).toBe(MODEL_CONTEXT_WINDOWS.default)
+  })
+
+  it('should return default when no model specified', () => {
+    expect(getContextWindow()).toBe(MODEL_CONTEXT_WINDOWS.default)
+    expect(getContextWindow(undefined)).toBe(MODEL_CONTEXT_WINDOWS.default)
+  })
+
+  it('should support full model IDs', () => {
+    expect(getContextWindow('claude-opus-4-6')).toBe(200_000)
+    expect(getContextWindow('claude-sonnet-4.5')).toBe(200_000)
+  })
+})
+
+// =============================================================================
+// calculateInputBudget / calculateOutputReserve
+// =============================================================================
+
+describe('calculateInputBudget', () => {
+  it('should return 65% of context window', () => {
+    const budget = calculateInputBudget('sonnet')
+    expect(budget).toBe(Math.floor(200_000 * INPUT_RATIO))
+    expect(budget).toBe(130_000)
+  })
+
+  it('should return larger budget for Gemini models', () => {
+    const budget = calculateInputBudget('2.5-pro')
+    expect(budget).toBe(Math.floor(1_000_000 * INPUT_RATIO))
+    expect(budget).toBe(650_000)
+  })
+})
+
+describe('calculateOutputReserve', () => {
+  it('should reserve 35% for output', () => {
+    const reserve = calculateOutputReserve('sonnet')
+    expect(reserve).toBe(200_000 - 130_000)
+    expect(reserve).toBe(70_000)
+  })
+})
+
+// =============================================================================
+// TokenBudgetCoordinator — Allocation
+// =============================================================================
+
+describe('TokenBudgetCoordinator allocation', () => {
+  it('should create coordinator with default model', () => {
+    const coordinator = new TokenBudgetCoordinator()
+    const allocation = coordinator.getAllocation()
+
+    expect(allocation.contextWindow).toBe(200_000)
+    expect(allocation.inputBudget).toBe(130_000)
+    expect(allocation.outputReserve).toBe(70_000)
+  })
+
+  it('should distribute budget across three categories', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const allocation = coordinator.getAllocation()
+
+    expect(allocation.state).toBeGreaterThan(0)
+    expect(allocation.injection).toBeGreaterThan(0)
+    expect(allocation.files).toBeGreaterThan(0)
+  })
+
+  it('should allocate all input budget (no waste)', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const allocation = coordinator.getAllocation()
+
+    const totalAllocated = allocation.state + allocation.injection + allocation.files
+    expect(totalAllocated).toBe(allocation.inputBudget)
+  })
+
+  it('should give files the largest allocation', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const allocation = coordinator.getAllocation()
+
+    expect(allocation.files).toBeGreaterThan(allocation.state)
+    expect(allocation.files).toBeGreaterThan(allocation.injection)
+  })
+
+  it('should give state minimum 1500 tokens', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const allocation = coordinator.getAllocation()
+
+    expect(allocation.state).toBeGreaterThanOrEqual(1_500)
+  })
+
+  it('should give injection minimum 8000 tokens', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const allocation = coordinator.getAllocation()
+
+    expect(allocation.injection).toBeGreaterThanOrEqual(8_000)
+  })
+
+  it('should give larger allocations for Gemini models', () => {
+    const claudeCoord = new TokenBudgetCoordinator('sonnet')
+    const geminiCoord = new TokenBudgetCoordinator('2.5-pro')
+
+    const claudeAlloc = claudeCoord.getAllocation()
+    const geminiAlloc = geminiCoord.getAllocation()
+
+    expect(geminiAlloc.files).toBeGreaterThan(claudeAlloc.files)
+    expect(geminiAlloc.inputBudget).toBeGreaterThan(claudeAlloc.inputBudget)
+  })
+})
+
+// =============================================================================
+// TokenBudgetCoordinator — Request/Record
+// =============================================================================
+
+describe('TokenBudgetCoordinator request/record', () => {
+  it('should grant tokens up to allocation', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const stateAlloc = coordinator.getAllocationFor('state')
+
+    const granted = coordinator.request('state', 500)
+    expect(granted).toBe(500)
+
+    const usage = coordinator.getUsage('state')
+    expect(usage.used).toBe(500)
+    expect(usage.remaining).toBe(stateAlloc - 500)
+  })
+
+  it('should cap grants at available budget', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const stateAlloc = coordinator.getAllocationFor('state')
+
+    // Request more than allocated
+    const granted = coordinator.request('state', stateAlloc + 1000)
+    expect(granted).toBe(stateAlloc)
+  })
+
+  it('should track cumulative usage', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+
+    coordinator.request('state', 500)
+    coordinator.request('state', 300)
+
+    const usage = coordinator.getUsage('state')
+    expect(usage.used).toBe(800)
+  })
+
+  it('should return 0 when budget exhausted', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const stateAlloc = coordinator.getAllocationFor('state')
+
+    coordinator.request('state', stateAlloc)
+    const granted = coordinator.request('state', 100)
+    expect(granted).toBe(0)
+  })
+
+  it('should record usage independently', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+
+    coordinator.record('files', 5000)
+    const usage = coordinator.getUsage('files')
+    expect(usage.used).toBe(5000)
+  })
+
+  it('should track total remaining across categories', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const inputBudget = coordinator.inputBudget
+
+    coordinator.request('state', 500)
+    coordinator.request('injection', 2000)
+    coordinator.request('files', 10000)
+
+    expect(coordinator.totalRemaining).toBe(inputBudget - 500 - 2000 - 10000)
+  })
+})
+
+// =============================================================================
+// TokenBudgetCoordinator — Overflow Detection
+// =============================================================================
+
+describe('TokenBudgetCoordinator overflow detection', () => {
+  it('should not be over budget initially', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    expect(coordinator.isOverBudget).toBe(false)
+  })
+
+  it('should detect overflow when usage exceeds input budget', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const inputBudget = coordinator.inputBudget
+
+    // Force overflow via record (bypasses allocation cap)
+    coordinator.record('files', inputBudget + 1)
+    expect(coordinator.isOverBudget).toBe(true)
+  })
+
+  it('should prevent overflow via request mechanism', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+
+    // Exhaust state budget
+    const stateAlloc = coordinator.getAllocationFor('state')
+    const granted = coordinator.request('state', stateAlloc + 50000)
+
+    // Should only get what was allocated
+    expect(granted).toBe(stateAlloc)
+    expect(coordinator.isOverBudget).toBe(false)
+  })
+})
+
+// =============================================================================
+// budgetsFromCoordinator integration
+// =============================================================================
+
+describe('budgetsFromCoordinator', () => {
+  it('should create injection budgets from coordinator', () => {
+    const coordinator = new TokenBudgetCoordinator('sonnet')
+    const budgets = budgetsFromCoordinator(coordinator)
+
+    expect(budgets.totalPrompt).toBe(coordinator.getAllocationFor('injection'))
+    // Per-section budgets remain at defaults
+    expect(budgets.autoContext).toBe(500)
+    expect(budgets.agentContent).toBe(400)
+    expect(budgets.skillContent).toBe(500)
+    expect(budgets.stateData).toBe(1000)
+    expect(budgets.memories).toBe(600)
+  })
+
+  it('should give larger injection budget for Gemini models', () => {
+    const claudeBudgets = budgetsFromCoordinator(new TokenBudgetCoordinator('sonnet'))
+    const geminiBudgets = budgetsFromCoordinator(new TokenBudgetCoordinator('2.5-pro'))
+
+    expect(geminiBudgets.totalPrompt).toBeGreaterThan(claudeBudgets.totalPrompt)
+  })
+})
+
+// =============================================================================
+// Different models get different allocations
+// =============================================================================
+
+describe('model-specific allocations', () => {
+  it('should give Gemini 5x the file budget of Claude', () => {
+    const claude = new TokenBudgetCoordinator('sonnet')
+    const gemini = new TokenBudgetCoordinator('2.5-pro')
+
+    const claudeFiles = claude.getAllocationFor('files')
+    const geminiFiles = gemini.getAllocationFor('files')
+
+    // Gemini has 1M context vs Claude 200K = 5x
+    expect(geminiFiles / claudeFiles).toBeCloseTo(5, 0)
+  })
+
+  it('should maintain 65/35 split across all models', () => {
+    for (const model of ['opus', 'sonnet', 'haiku', '2.5-pro', '2.5-flash']) {
+      const coordinator = new TokenBudgetCoordinator(model)
+      const allocation = coordinator.getAllocation()
+      const ratio = allocation.inputBudget / allocation.contextWindow
+      expect(ratio).toBeCloseTo(INPUT_RATIO, 2)
+    }
+  })
+})
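Taken together, the token-budget tests describe a small accounting model: a per-model context window, a 65/35 input/output split, three categories that exhaust the input budget, grants capped per category, and a `record` path that bypasses the cap. The sketch below illustrates that model only; it is not the package's `token-budget.ts`. `BudgetSketch`, `WINDOWS`, and `allocationFor` are hypothetical names, and the 5% / 20% shares for state and injection are invented for illustration (the tests only fix the 1,500 and 8,000 token minimums and that files receive the largest share).

```typescript
// Context windows and the 0.65 ratio are taken from the test expectations.
const WINDOWS: { [model: string]: number | undefined } = {
  sonnet: 200_000,
  '2.5-pro': 1_000_000,
}
const DEFAULT_WINDOW = 200_000
const INPUT_SHARE = 0.65

type Category = 'state' | 'injection' | 'files'

class BudgetSketch {
  readonly inputBudget: number
  readonly outputReserve: number
  private readonly allocation: Record<Category, number>
  private readonly used: Record<Category, number> = { state: 0, injection: 0, files: 0 }

  constructor(model?: string) {
    const contextWindow = (model === undefined ? undefined : WINDOWS[model]) ?? DEFAULT_WINDOW
    this.inputBudget = Math.floor(contextWindow * INPUT_SHARE)
    this.outputReserve = contextWindow - this.inputBudget

    // Assumed shares; only the minimums come from the tests.
    const state = Math.max(1_500, Math.floor(this.inputBudget * 0.05))
    const injection = Math.max(8_000, Math.floor(this.inputBudget * 0.2))
    // Files take the remainder, so the three categories sum to inputBudget.
    this.allocation = { state, injection, files: this.inputBudget - state - injection }
  }

  allocationFor(category: Category): number {
    return this.allocation[category]
  }

  // Grant at most what is left in the category and count it as used.
  request(category: Category, tokens: number): number {
    const remaining = Math.max(0, this.allocation[category] - this.used[category])
    const granted = Math.min(tokens, remaining)
    this.used[category] += granted
    return granted
  }

  // Record usage after the fact; this path is not capped.
  record(category: Category, tokens: number): void {
    this.used[category] += tokens
  }

  get isOverBudget(): boolean {
    return this.used.state + this.used.injection + this.used.files > this.inputBudget
  }
}
```

Separating `request` (capped) from `record` (uncapped) is what lets the overflow test force `isOverBudget` to `true` while the request path alone can never exceed the input budget.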