promptloom 1.0.0

package/LICENSE ADDED

MIT License

Copyright (c) 2025 PeanutSplash

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED

# promptloom

Weave production-grade LLM prompts with cache boundaries, tool injection, and token budgeting.

Reverse-engineered from [Claude Code](https://claude.ai/code)'s 7-layer prompt architecture — the same patterns Anthropic uses internally to assemble system prompts for their 500K+ line CLI tool.

## Why

Every LLM app stitches prompts together from pieces. Most do it with string concatenation. Claude Code does it with a **compiler** — multi-zone cache scoping, conditional sections, per-tool prompt injection, deferred tool loading, and token budget tracking.

**promptloom** extracts these battle-tested patterns into a zero-dependency library.

| Problem | How promptloom solves it |
|---------|--------------------------|
| Changing one section breaks the prompt cache → wasted money | **Multi-zone scoping** — each zone gets its own cache scope (`global`, `org`, or `null`) |
| Tool descriptions scattered everywhere | **Tool registry** with session-level prompt caching and stable ordering |
| Too many tools bloat the system prompt | **Deferred tools** — marked tools are excluded from the prompt, loaded on demand |
| Sections only relevant to some models/environments | **Conditional sections** — `when` predicates gate inclusion per compile context |
| No idea how many tokens the prompt costs | **Token estimation** built into every `compile()` call |
| Different API providers need different formats | **Multi-provider output** — `toAnthropic()`, `toOpenAI()`, `toBedrock()` |

## Install

```bash
bun add promptloom
```

## Quick Start

```ts
import { PromptCompiler } from 'promptloom'

const pc = new PromptCompiler()

// ── Zone 1: Attribution header (no cache) ──
pc.zone(null)
pc.static('attribution', 'x-billing-org: org-123')

// ── Zone 2: Static rules (globally cacheable) ──
pc.zone('global')
pc.static('identity', 'You are a code review bot.')
pc.static('rules', 'Only comment on bugs, not style.')

// ── Zone 3: Dynamic context (session-specific, no cache) ──
pc.zone(null)
pc.dynamic('diff', async () => {
  const diff = await getCurrentDiff()
  return `Review this diff:\n${diff}`
})

// Conditional section — only included for Opus models
pc.static('thinking', 'Use extended thinking for complex reviews.', {
  when: (ctx) => ctx.model?.includes('opus') ?? false,
})

// ── Tools (inline + deferred) ──
pc.tool({
  name: 'post_comment',
  prompt: 'Post a review comment on a specific line of code.',
  inputSchema: {
    type: 'object',
    properties: {
      file: { type: 'string' },
      line: { type: 'number' },
      body: { type: 'string' },
    },
    required: ['file', 'line', 'body'],
  },
  order: 1, // explicit ordering for cache stability
})

pc.tool({
  name: 'web_search',
  prompt: 'Search the web for context.',
  inputSchema: { type: 'object', properties: { query: { type: 'string' } } },
  deferred: true, // excluded from the prompt, loaded on demand
})

// ── Compile (with context for conditional sections) ──
const result = await pc.compile({ model: 'claude-opus-4-6' })

result.blocks        // CacheBlock[] — one per zone, with scope annotations
result.tools         // CompiledTool[] — inline tools only
result.deferredTools // CompiledTool[] — deferred tools (with defer_loading: true)
result.tokens        // { systemPrompt, tools, deferredTools, total }
result.text          // Full prompt as a single string
```

## Use with APIs

### Anthropic

```ts
import Anthropic from '@anthropic-ai/sdk'
import { PromptCompiler, toAnthropic } from 'promptloom'

const pc = new PromptCompiler()
// ... add zones, sections, and tools ...

const result = await pc.compile({ model: 'claude-sonnet-4-6' })
const { system, tools } = toAnthropic(result) // cache-annotated blocks + tool schemas

const response = await new Anthropic().messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 4096,
  system, // TextBlockParam[] with cache_control
  tools, // includes deferred tools with defer_loading: true
  messages: [{ role: 'user', content: 'Review this PR' }],
})
```

### OpenAI

```ts
import OpenAI from 'openai'
import { PromptCompiler, toOpenAI } from 'promptloom'

const pc = new PromptCompiler()
// ... add zones, sections, and tools ...

const result = await pc.compile()
const { system, tools } = toOpenAI(result) // single string + function tools

const response = await new OpenAI().chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: system },
    { role: 'user', content: 'Review this PR' },
  ],
  tools,
})
```

### AWS Bedrock

```ts
import { PromptCompiler, toBedrock } from 'promptloom'

const result = await pc.compile()
const { system, toolConfig } = toBedrock(result) // cachePoint + toolSpec format

// Use with @aws-sdk/client-bedrock-runtime ConverseCommand
```

## Core Concepts

### Zones: Multi-Block Cache Scoping

Claude Code uses a single `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` to split the prompt into two blocks. promptloom generalizes this to **N zones** — each zone compiles into a separate `CacheBlock` with its own cache scope.

```ts
pc.zone(null) // Zone 1: no-cache (attribution headers)
pc.static('header', 'x-billing: org-123')

pc.zone('global') // Zone 2: globally cacheable (identity, rules)
pc.static('identity', 'You are Claude Code.')
pc.static('rules', 'Follow safety protocols.')

pc.zone('org') // Zone 3: org-level cacheable
pc.static('org_rules', 'Company-specific guidelines.')

pc.zone(null) // Zone 4: session-specific (dynamic context)
pc.dynamic('git', async () => `Branch: ${await getBranch()}`)
```

This compiles to 4 `CacheBlock`s:

```
┌─────────────────────────────┐
│ x-billing: org-123          │  Block 1  scope=null    (no cache)
├─────────────────────────────┤
│ You are Claude Code.        │  Block 2  scope=global  (cross-org cache)
│ Follow safety protocols.    │
├─────────────────────────────┤
│ Company-specific guidelines │  Block 3  scope=org     (org-level cache)
├─────────────────────────────┤
│ Branch: main                │  Block 4  scope=null    (session-specific)
└─────────────────────────────┘
```

The `boundary()` method is kept for backward compatibility — it's equivalent to `zone(null)` when `enableGlobalCache` is true.

### Conditional Sections

In Claude Code, sections are gated on `feature('FLAG')`, `process.env.USER_TYPE`, and model capabilities. promptloom uses `when` predicates:

```ts
// Only for Opus models
pc.static('thinking_guide', 'Use extended thinking for complex tasks.', {
  when: (ctx) => ctx.model?.includes('opus') ?? false,
})

// Only when MCP servers are connected
pc.dynamic('mcp', async () => fetchMCPInstructions(), {
  when: (ctx) => (ctx.mcpServers as string[])?.length > 0,
})

// Only for internal users
pc.static('internal_tools', 'You have access to internal APIs.', {
  when: (ctx) => ctx.userType === 'internal',
})

// Predicates are evaluated at compile time
const result = await pc.compile({
  model: 'claude-opus-4-6',
  mcpServers: ['figma', 'slack'],
  userType: 'internal',
})
```

### Tool Prompt Injection

Every tool carries its own LLM-facing "user manual", resolved once per session and cached:

```ts
pc.tool({
  name: 'Bash',
  prompt: async () => {
    const sandbox = await detectSandbox()
    return `Execute shell commands.\n${sandbox ? 'Running in sandbox.' : ''}`
  },
  inputSchema: { /* ... */ },
  order: 1, // explicit ordering for cache stability
})
```

### Deferred Tools

When you have many tools (Claude Code has 42+), most aren't needed every turn. Deferred tools are excluded from the system prompt and discovered on demand:

```ts
pc.tool({
  name: 'web_search',
  prompt: 'Search the web for information.',
  inputSchema: { /* ... */ },
  deferred: true, // not in the system prompt, loaded via tool search
})

const result = await pc.compile()
result.tools // inline tools only
result.deferredTools // deferred tools (with defer_loading: true)
result.tokens.total // does NOT count deferred tools
```

### Tool Ordering for Cache Stability

Reordering tools changes the serialized bytes, breaking the prompt cache. Use `order` for deterministic sorting:

```ts
pc.tool({ name: 'bash', prompt: '...', inputSchema: {}, order: 1 })
pc.tool({ name: 'read', prompt: '...', inputSchema: {}, order: 2 })
pc.tool({ name: 'edit', prompt: '...', inputSchema: {}, order: 3 })
// Tools without `order` come last, in insertion order
```
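
One way to picture the deterministic sort (a standalone sketch with assumed names, not promptloom's internals): tools with an explicit `order` sort ascending, and everything else keeps its insertion order at the end.

```typescript
// Hypothetical sketch of order-stable tool sorting (illustrative only):
// explicit `order` values sort ascending; tools without one fall to the
// end, preserving insertion order via an explicit index tiebreak.
interface ToolEntry {
  name: string
  order?: number
}

function sortTools(tools: ToolEntry[]): ToolEntry[] {
  return tools
    .map((tool, insertionIndex) => ({ tool, insertionIndex }))
    .sort((a, b) => {
      const ao = a.tool.order ?? Number.POSITIVE_INFINITY
      const bo = b.tool.order ?? Number.POSITIVE_INFINITY
      // Tiebreak on insertion index so the result is deterministic
      // even on engines with an unstable sort.
      return ao !== bo ? ao - bo : a.insertionIndex - b.insertionIndex
    })
    .map(({ tool }) => tool)
}
```

The explicit index tiebreak matters: `Array.prototype.sort` is stable in modern engines, but carrying the insertion index keeps the serialization byte-identical everywhere.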

### Token Budget

#### Estimation

Every `compile()` call includes token estimates:

```ts
const result = await pc.compile()
result.tokens.systemPrompt  // ~350 tokens
result.tokens.tools         // ~200 tokens (inline only)
result.tokens.deferredTools // ~100 tokens (not counted in total)
result.tokens.total         // ~550 tokens (systemPrompt + tools)
```
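
These numbers come from a cheap byte-based heuristic, not a real tokenizer. As a standalone sketch of the idea (assumed names and rounding, not promptloom's actual code), the bytes/4 estimate and its file-type-aware variant might look like:

```typescript
// Sketch of a byte-based token heuristic (illustrative only): plain text
// averages roughly 4 bytes per token, while punctuation-dense formats
// like JSON average closer to 2 bytes per token.
function estimateTokensSketch(text: string): number {
  const bytes = new TextEncoder().encode(text).length
  return Math.ceil(bytes / 4)
}

function estimateTokensForFileTypeSketch(text: string, filename: string): number {
  const bytes = new TextEncoder().encode(text).length
  const divisor = filename.endsWith('.json') ? 2 : 4
  return Math.ceil(bytes / divisor)
}
```

Heuristics like this are fine for budgeting decisions; for billing-accurate counts you still need the provider's tokenizer.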

#### Budget Tracking

For long-running agent loops:

```ts
import { createBudgetTracker, checkBudget } from 'promptloom'

const tracker = createBudgetTracker()
const decision = checkBudget(tracker, currentTokens, { budget: 100_000 })

if (decision.action === 'continue') {
  // Inject decision.nudgeMessage to keep the model working
} else {
  // decision.reason: 'budget_reached' | 'diminishing_returns'
}
```

#### Budget Parsing from Natural Language

Parse user-specified budgets like Claude Code does:

```ts
import { parseTokenBudget } from 'promptloom'

parseTokenBudget('+500k')           // 500_000
parseTokenBudget('spend 2M tokens') // 2_000_000
parseTokenBudget('+1.5b')           // 1_500_000_000
parseTokenBudget('hello world')     // null
```
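
The behavior implied by those examples can be sketched in a few lines (a hypothetical implementation inferred from the examples above; promptloom's real parser may handle more cases): find the first number-plus-suffix pair, scale by the k/m/b multiplier, and return `null` when nothing matches.

```typescript
// Hypothetical budget parser sketch (illustrative only): matches forms
// like "+500k", "2M tokens", "+1.5b"; case-insensitive suffixes scale
// by thousand / million / billion. No match → null.
function parseTokenBudgetSketch(input: string): number | null {
  const match = input.match(/(\d+(?:\.\d+)?)\s*([kmb])\b/i)
  if (!match) return null
  const multipliers: Record<string, number> = { k: 1e3, m: 1e6, b: 1e9 }
  return Math.round(parseFloat(match[1]) * multipliers[match[2].toLowerCase()])
}
```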

## API Reference

### `PromptCompiler`

| Method | Description |
|--------|-------------|
| `zone(scope)` | Start a new cache zone (`'global'`, `'org'`, or `null`) |
| `boundary()` | Shorthand for `zone(null)` when `enableGlobalCache` is true |
| `static(name, content, options?)` | Add a static section. `options.when` for conditional inclusion |
| `dynamic(name, compute, options?)` | Add a dynamic section (recomputed every `compile()`) |
| `tool(def)` | Register a tool. Set `deferred: true` for on-demand loading, `order` for sort stability |
| `compile(context?)` | Compile everything → `CompileResult`. Context is passed to `when` predicates |
| `clearCache()` | Clear all section + tool caches |
| `clearSectionCache()` | Clear only the section cache |
| `clearToolCache()` | Clear only the tool cache |
| `sectionCount` | Number of registered sections (excludes zone markers) |
| `toolCount` | Number of registered tools (inline + deferred) |
| `listSections()` | List sections with their types (`static`, `dynamic`, `zone`) |
| `listTools()` | List registered tool names |

### `CompileResult`

| Field | Type | Description |
|-------|------|-------------|
| `blocks` | `CacheBlock[]` | One block per zone, with `cacheScope` annotations |
| `tools` | `CompiledTool[]` | Inline tool schemas with resolved descriptions |
| `deferredTools` | `CompiledTool[]` | Deferred tool schemas (with `defer_loading: true`) |
| `tokens` | `TokenEstimate` | `{ systemPrompt, tools, deferredTools, total }` |
| `text` | `string` | Full prompt as a single joined string |

### Provider Formatters

```ts
import { toAnthropic, toOpenAI, toBedrock } from 'promptloom'

toAnthropic(result) // { system: TextBlockParam[], tools: AnthropicTool[] }
toOpenAI(result)    // { system: string, tools: { type: 'function', function }[] }
toBedrock(result)   // { system: BedrockSystemBlock[], toolConfig: { tools } }
```

### Standalone Utilities

```ts
import {
  // Token estimation
  estimateTokens,            // Rough estimate (bytes / 4)
  estimateTokensForFileType, // File-type-aware (JSON = bytes / 2)

  // Budget
  createBudgetTracker, // Create a new tracker
  checkBudget,         // Check budget → continue or stop
  parseTokenBudget,    // Parse "+500k" → 500_000

  // Low-level (for custom compilers)
  splitAtBoundary, // Split text at sentinel → CacheBlock[]
  section,         // Create a static Section object
  dynamicSection,  // Create a dynamic Section object
  defineTool,      // Create a ToolDef with fail-closed defaults
  SectionCache,    // Section cache class
  ToolCache,       // Tool cache class
  resolveSections, // Resolve sections against the cache
  compileTool,     // Compile a single tool
  compileTools,    // Compile all tools
} from 'promptloom'
```
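
To make the sentinel-split idea concrete, here is a minimal sketch (assumed types and behavior; promptloom's actual `splitAtBoundary` signature may differ): everything before the first sentinel becomes a cacheable block, everything after it a non-cached one.

```typescript
// Minimal sketch of sentinel-based block splitting (illustrative only):
// text before the sentinel is globally cacheable; text after it is
// session-specific and gets no cache scope.
interface CacheBlockSketch {
  text: string
  cacheScope: 'global' | null
}

function splitAtBoundarySketch(text: string, sentinel: string): CacheBlockSketch[] {
  const [head, ...rest] = text.split(sentinel)
  const blocks: CacheBlockSketch[] = [{ text: head, cacheScope: 'global' }]
  if (rest.length > 0) {
    // Everything after the first sentinel is dynamic: no cache.
    blocks.push({ text: rest.join(sentinel), cacheScope: null })
  }
  return blocks
}
```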

## Background: Claude Code's Prompt Architecture

This library extracts patterns from Claude Code's source (leaked via unstripped source maps in March 2025). The key insight: **Anthropic treats prompts as compiler output, not handwritten text.**

Their system prompt is assembled from 7+ layers:

1. **Identity** — who the AI is
2. **System** — tool execution context, hooks, compression
3. **Doing Tasks** — code style, security, collaboration rules
4. **Actions** — risk-aware execution, reversibility
5. **Using Tools** — tool preference guidance, parallel execution
6. **Tone & Style** — conciseness, formatting rules
7. **Dynamic context** — git status, CLAUDE.md files, user memory, MCP server instructions

Layers 1-6 are **static** (globally cacheable). Layer 7+ is **dynamic** (session-specific). The boundary between them is a literal sentinel string that the API layer uses to annotate cache scopes.

Sections are conditionally included based on feature flags (`feature('TOKEN_BUDGET')`), user type (`process.env.USER_TYPE === 'ant'`), and model capabilities. Each of their 42+ tools carries its own `prompt.ts`, and tools above a context threshold are deferred (loaded via `ToolSearchTool` on demand).

promptloom gives you all of these primitives.

## License

MIT