promptpilot 0.1.2 → 0.1.4
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +227 -105
- package/dist/cli.d.ts +9 -0
- package/dist/cli.js +679 -35
- package/dist/cli.js.map +1 -1
- package/dist/index.d.ts +35 -1
- package/dist/index.js +446 -31
- package/dist/index.js.map +1 -1
- package/package.json +4 -2
package/README.md
CHANGED
|
@@ -1,30 +1,38 @@
|
|
|
1
1
|
# promptpilot
|
|
2
2
|
|
|
3
|
-
`promptpilot` is a
|
|
3
|
+
`promptpilot` is a code-first TypeScript package that sits between your app or CLI workflow and a downstream LLM. It optimizes prompts locally through Ollama, keeps lightweight session memory, compresses stale context, and can route each request to the best allowed downstream model for the job.
|
|
4
4
|
|
|
5
|
-
It is designed for
|
|
5
|
+
It is designed for agentic coding workflows first. If a prompt is ambiguous, PromptPilot biases toward coding-capable and tool-capable models. Non-coding tasks like email, support, summarization, and chat are still supported when the prompt makes that intent clear.
|
|
6
6
|
|
|
7
7
|
## Why local Ollama
|
|
8
8
|
|
|
9
|
-
- It keeps
|
|
10
|
-
- It
|
|
11
|
-
- It
|
|
12
|
-
- It
|
|
13
|
-
- It uses
|
|
9
|
+
- It keeps optimization and routing close to your machine.
|
|
10
|
+
- It uses a small local model before you send anything to a stronger remote model.
|
|
11
|
+
- It avoids paying remote-token costs for every prompt rewrite.
|
|
12
|
+
- It works well on laptops with limited memory by preferring small Ollama models.
|
|
13
|
+
- It uses a local Qwen router when multiple small local models are available.
|
|
14
|
+
|
|
15
|
+
Default local preference is:
|
|
16
|
+
|
|
17
|
+
- `qwen2.5:3b`
|
|
18
|
+
- `phi3:mini`
|
|
19
|
+
- `llama3.2:3b`
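The preference order above amounts to a first-match scan over the models Ollama reports as installed. A minimal sketch of that idea (the `pickLocalModel` helper is illustrative, not part of the package API):

```typescript
// Hypothetical sketch of the documented preference order; not the
// package's actual implementation.
const LOCAL_PREFERENCE = ["qwen2.5:3b", "phi3:mini", "llama3.2:3b"];

// Return the first preferred model that is installed locally, or
// undefined so the caller can fall back to deterministic shaping.
function pickLocalModel(installed: string[]): string | undefined {
  return LOCAL_PREFERENCE.find((name) => installed.includes(name));
}
```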
|
|
14
20
|
|
|
15
21
|
## What it does
|
|
16
22
|
|
|
17
|
-
- Accepts a raw prompt plus optional metadata.
|
|
23
|
+
- Accepts a raw prompt plus optional task metadata.
|
|
18
24
|
- Persists session context across turns.
|
|
19
|
-
- Retrieves relevant prior context
|
|
20
|
-
-
|
|
21
|
-
- Preserves critical instructions and constraints.
|
|
25
|
+
- Retrieves and compresses relevant prior context.
|
|
26
|
+
- Preserves pinned constraints and user intent.
|
|
22
27
|
- Estimates token usage before and after optimization.
|
|
23
|
-
-
|
|
24
|
-
-
|
|
28
|
+
- Routes to a caller-supplied downstream model allowlist.
|
|
29
|
+
- Returns a selected target plus a ranked top 3 when routing is enabled.
|
|
30
|
+
- Outputs plain prompt text for shell pipelines or JSON for tooling/debugging.
|
|
25
31
|
|
|
26
32
|
## Quick start
|
|
27
33
|
|
|
34
|
+
Local repo workflow:
|
|
35
|
+
|
|
28
36
|
```bash
|
|
29
37
|
npm install
|
|
30
38
|
npm run build
|
|
@@ -34,30 +42,64 @@ promptpilot optimize "explain binary search simply" --plain
|
|
|
34
42
|
promptpilot optimize "continue my study guide" --session dsa --save-context --plain | claude
|
|
35
43
|
```
|
|
36
44
|
|
|
37
|
-
|
|
45
|
+
Install from npm:
|
|
38
46
|
|
|
39
47
|
```bash
|
|
40
48
|
npm install -g promptpilot
|
|
41
49
|
```
|
|
42
50
|
|
|
43
|
-
|
|
51
|
+
Run `promptpilot` with no arguments in an interactive terminal to open the CLI welcome screen:
|
|
44
52
|
|
|
45
|
-
```
|
|
46
|
-
|
|
47
|
-
|
|
53
|
+
```text
|
|
54
|
+
PromptPilot v0.1.x
|
|
55
|
+
┌──────────────────────────────────────────────────────────────────────────────┐
|
|
56
|
+
│ Welcome back │
|
|
57
|
+
│ │
|
|
58
|
+
│ .-''''-. Launchpad │
|
|
59
|
+
│ .' .--. '. Run promptpilot optimize "..." │
|
|
60
|
+
│ / / oo \ \ Pipe directly into Claude with | claude│
|
|
61
|
+
│ | \_==_/ | │
|
|
62
|
+
│ \ \_/ \_/ / Custom local model │
|
|
63
|
+
│ '._/|__|\_.' Use --model promptpilot-compressor │
|
|
64
|
+
│ │
|
|
65
|
+
│ /Users/you/project Commands │
|
|
66
|
+
│ optimize optimize and route prompts │
|
|
67
|
+
│ --help show the full CLI reference │
|
|
68
|
+
└──────────────────────────────────────────────────────────────────────────────┘
|
|
48
69
|
```
|
|
49
70
|
|
|
50
|
-
Install
|
|
71
|
+
Install one or two small Ollama models so the local router has options:
|
|
51
72
|
|
|
52
73
|
```bash
|
|
53
|
-
|
|
54
|
-
|
|
74
|
+
ollama pull qwen2.5:3b
|
|
75
|
+
ollama pull phi3:mini
|
|
55
76
|
```
|
|
56
77
|
|
|
78
|
+
## Core behavior
|
|
79
|
+
|
|
80
|
+
PromptPilot has two distinct routing layers.
|
|
81
|
+
|
|
82
|
+
1. Local optimizer routing
|
|
83
|
+
|
|
84
|
+
- Explicit `ollamaModel` or `--model` always wins.
|
|
85
|
+
- If exactly one suitable small local model exists, it uses that model directly.
|
|
86
|
+
- If multiple suitable small local models exist, a local Qwen router chooses between them.
|
|
87
|
+
- If routing cannot complete, PromptPilot falls back to deterministic prompt shaping instead of making a static guess.
|
|
88
|
+
|
|
89
|
+
2. Downstream target routing
|
|
90
|
+
|
|
91
|
+
- The caller provides the allowed downstream targets.
|
|
92
|
+
- If one target is supplied, PromptPilot selects it directly.
|
|
93
|
+
- If multiple targets are supplied, a local Qwen router ranks them and selects the top target.
|
|
94
|
+
- Routing is code-first by default: ambiguous prompts bias toward coding-capable and agentic targets.
|
|
95
|
+
- If downstream routing fails, PromptPilot still returns an optimized prompt but does not invent a target.
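The downstream selection rules reduce to a small decision function. This is a hedged sketch assuming a minimal target shape (the real router is an LLM call, approximated here by a caller-supplied ranking):

```typescript
// Sketch of the documented downstream selection rules; `label` mirrors
// the provider:model labels used elsewhere in this README.
interface Target {
  label: string; // e.g. "openai:gpt-5-codex"
}

function selectTarget(
  allowed: Target[],
  rankedByRouter?: Target[] // undefined when local routing failed
): Target | undefined {
  if (allowed.length === 1) return allowed[0]; // single target: use it directly
  return rankedByRouter?.[0]; // multiple targets: top-ranked, or no target at all
}
```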
|
|
96
|
+
|
|
57
97
|
## Library usage
|
|
58
98
|
|
|
99
|
+
### Basic optimization
|
|
100
|
+
|
|
59
101
|
```ts
|
|
60
|
-
import { createOptimizer
|
|
102
|
+
import { createOptimizer } from "promptpilot";
|
|
61
103
|
|
|
62
104
|
const optimizer = createOptimizer({
|
|
63
105
|
provider: "ollama",
|
|
@@ -66,22 +108,92 @@ const optimizer = createOptimizer({
|
|
|
66
108
|
});
|
|
67
109
|
|
|
68
110
|
const result = await optimizer.optimize({
|
|
69
|
-
prompt: "help me
|
|
70
|
-
task: "
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
111
|
+
prompt: "help me debug this failing CI job",
|
|
112
|
+
task: "code",
|
|
113
|
+
preset: "code",
|
|
114
|
+
sessionId: "ci-fix",
|
|
115
|
+
saveContext: true
|
|
116
|
+
});
|
|
117
|
+
|
|
118
|
+
console.log(result.finalPrompt);
|
|
119
|
+
console.log(result.model);
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
### Code-first downstream routing
|
|
123
|
+
|
|
124
|
+
```ts
|
|
125
|
+
import { createOptimizer } from "promptpilot";
|
|
126
|
+
|
|
127
|
+
const optimizer = createOptimizer({
|
|
128
|
+
provider: "ollama",
|
|
129
|
+
host: "http://localhost:11434",
|
|
130
|
+
contextStore: "local"
|
|
74
131
|
});
|
|
75
132
|
|
|
76
|
-
|
|
133
|
+
const result = await optimizer.optimize({
|
|
134
|
+
prompt: "rewrite this prompt for a coding refactor task",
|
|
135
|
+
task: "code",
|
|
136
|
+
preset: "code",
|
|
137
|
+
availableTargets: [
|
|
138
|
+
{
|
|
139
|
+
provider: "anthropic",
|
|
140
|
+
model: "claude-sonnet",
|
|
141
|
+
label: "anthropic:claude-sonnet",
|
|
142
|
+
capabilities: ["coding", "writing"],
|
|
143
|
+
costRank: 2
|
|
144
|
+
},
|
|
145
|
+
{
|
|
146
|
+
provider: "openai",
|
|
147
|
+
model: "gpt-4.1-mini",
|
|
148
|
+
label: "openai:gpt-4.1-mini",
|
|
149
|
+
capabilities: ["writing", "chat"],
|
|
150
|
+
costRank: 1
|
|
151
|
+
},
|
|
152
|
+
{
|
|
153
|
+
provider: "openai",
|
|
154
|
+
model: "gpt-5-codex",
|
|
155
|
+
label: "openai:gpt-5-codex",
|
|
156
|
+
capabilities: ["coding", "agentic", "tool_use", "debugging"],
|
|
157
|
+
costRank: 3
|
|
158
|
+
}
|
|
159
|
+
],
|
|
160
|
+
routingPriority: "cheapest_adequate",
|
|
161
|
+
targetHints: ["coding", "agentic", "refactor"],
|
|
162
|
+
workloadBias: "code_first",
|
|
163
|
+
debug: true
|
|
164
|
+
});
|
|
165
|
+
|
|
166
|
+
console.log(result.selectedTarget);
|
|
167
|
+
console.log(result.rankedTargets);
|
|
168
|
+
console.log(result.routingReason);
|
|
169
|
+
```
|
|
77
170
|
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
171
|
+
### Lightweight writing still works
|
|
172
|
+
|
|
173
|
+
```ts
|
|
174
|
+
const result = await optimizer.optimize({
|
|
175
|
+
prompt: "write a short internship follow-up email",
|
|
176
|
+
task: "email",
|
|
177
|
+
preset: "email",
|
|
178
|
+
availableTargets: [
|
|
179
|
+
{
|
|
180
|
+
provider: "anthropic",
|
|
181
|
+
model: "claude-sonnet",
|
|
182
|
+
label: "anthropic:claude-sonnet",
|
|
183
|
+
capabilities: ["coding", "writing"],
|
|
184
|
+
costRank: 2
|
|
185
|
+
},
|
|
186
|
+
{
|
|
187
|
+
provider: "openai",
|
|
188
|
+
model: "gpt-4.1-mini",
|
|
189
|
+
label: "openai:gpt-4.1-mini",
|
|
190
|
+
capabilities: ["writing", "email", "chat"],
|
|
191
|
+
costRank: 1
|
|
192
|
+
}
|
|
193
|
+
]
|
|
82
194
|
});
|
|
83
195
|
|
|
84
|
-
console.log(
|
|
196
|
+
console.log(result.selectedTarget);
|
|
85
197
|
```
|
|
86
198
|
|
|
87
199
|
## Claude CLI usage
|
|
@@ -89,37 +201,45 @@ console.log(oneOff.finalPrompt);
|
|
|
89
201
|
Plain shell output:
|
|
90
202
|
|
|
91
203
|
```bash
|
|
92
|
-
promptpilot optimize "help me
|
|
204
|
+
promptpilot optimize "help me debug this failing CI job" --task code --preset code --plain
|
|
93
205
|
```
|
|
94
206
|
|
|
95
|
-
|
|
207
|
+
Pipe directly into Claude CLI:
|
|
96
208
|
|
|
97
209
|
```bash
|
|
98
|
-
promptpilot optimize "
|
|
210
|
+
promptpilot optimize "continue working on this refactor" --session repo-refactor --save-context --plain | claude
|
|
99
211
|
```
|
|
100
212
|
|
|
101
|
-
|
|
213
|
+
Route against an allowlist of downstream targets:
|
|
102
214
|
|
|
103
215
|
```bash
|
|
104
|
-
|
|
216
|
+
promptpilot optimize "rewrite this prompt for a coding refactor task" \
|
|
217
|
+
--task code \
|
|
218
|
+
--preset code \
|
|
219
|
+
--target anthropic:claude-sonnet \
|
|
220
|
+
--target openai:gpt-4.1-mini \
|
|
221
|
+
--target openai:gpt-5-codex \
|
|
222
|
+
--target-hint coding \
|
|
223
|
+
--target-hint refactor \
|
|
224
|
+
--json --debug
|
|
105
225
|
```
|
|
106
226
|
|
|
107
|
-
|
|
227
|
+
Use stdin in a pipeline:
|
|
108
228
|
|
|
109
229
|
```bash
|
|
110
|
-
|
|
230
|
+
cat notes.txt | promptpilot optimize --task summarization --plain | claude
|
|
111
231
|
```
|
|
112
232
|
|
|
113
|
-
|
|
233
|
+
Save context between calls:
|
|
114
234
|
|
|
115
235
|
```bash
|
|
116
|
-
promptpilot optimize "
|
|
236
|
+
promptpilot optimize "continue my debugger plan" --session ci-fix --save-context --plain
|
|
117
237
|
```
|
|
118
238
|
|
|
119
|
-
|
|
239
|
+
Clear a session:
|
|
120
240
|
|
|
121
241
|
```bash
|
|
122
|
-
promptpilot optimize --session
|
|
242
|
+
promptpilot optimize --session ci-fix --clear-session
|
|
123
243
|
```
|
|
124
244
|
|
|
125
245
|
Node `child_process` example:
|
|
@@ -127,68 +247,67 @@ Node `child_process` example:
|
|
|
127
247
|
```ts
|
|
128
248
|
import { spawn } from "node:child_process";
|
|
129
249
|
|
|
130
|
-
const
|
|
250
|
+
const promptpilot = spawn("promptpilot", [
|
|
131
251
|
"optimize",
|
|
132
|
-
"continue
|
|
252
|
+
"continue working on this repo refactor",
|
|
133
253
|
"--session",
|
|
134
|
-
"
|
|
254
|
+
"repo-refactor",
|
|
255
|
+
"--save-context",
|
|
135
256
|
"--plain"
|
|
136
257
|
]);
|
|
137
258
|
|
|
138
259
|
const claude = spawn("claude", [], { stdio: ["pipe", "inherit", "inherit"] });
|
|
139
|
-
|
|
260
|
+
promptpilot.stdout.pipe(claude.stdin);
|
|
140
261
|
```
|
|
141
262
|
|
|
142
263
|
## Session context
|
|
143
264
|
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
If you do not pass `ollamaModel` or `--model`, `promptpilot` asks Ollama which models are installed and lets a small local Qwen router choose the best small optimizer model for the current prompt. It does not statically rank multiple candidate models anymore. If a suitable Qwen router model is not available when multiple small candidates exist, it falls back to deterministic heuristic prompt optimization instead of making a static model-choice guess. If only oversized local models are available, it also falls back to deterministic heuristic optimization instead of silently using a heavy model.
|
|
265
|
+
If you pass a `sessionId`, PromptPilot stores session entries in a local store. The default store is JSON under `~/.promptpilot/sessions`. SQLite is also supported when `node:sqlite` or `better-sqlite3` is available.
|
|
147
266
|
|
|
148
267
|
Each session stores:
|
|
149
268
|
|
|
150
|
-
-
|
|
151
|
-
-
|
|
152
|
-
-
|
|
153
|
-
-
|
|
154
|
-
-
|
|
155
|
-
-
|
|
156
|
-
-
|
|
269
|
+
- user prompts
|
|
270
|
+
- optimized prompts
|
|
271
|
+
- final prompts
|
|
272
|
+
- extracted constraints
|
|
273
|
+
- summaries
|
|
274
|
+
- timestamps
|
|
275
|
+
- optional tags
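A session entry covering the fields listed above could be modeled like this. The field names and the constructor are assumptions for illustration; the actual on-disk JSON schema is not documented here:

```typescript
// Assumed shape for one stored session entry; the real schema under
// ~/.promptpilot/sessions may differ.
interface SessionEntry {
  userPrompt: string;
  optimizedPrompt: string;
  finalPrompt: string;
  constraints: string[];
  summary?: string;
  timestamp: number; // epoch milliseconds
  tags?: string[];
}

// Hypothetical helper that builds a minimal entry for a new turn.
function newEntry(userPrompt: string, finalPrompt: string): SessionEntry {
  return {
    userPrompt,
    optimizedPrompt: finalPrompt,
    finalPrompt,
    constraints: [],
    timestamp: Date.now(),
  };
}
```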
|
|
157
276
|
|
|
158
277
|
Context retrieval prefers:
|
|
159
278
|
|
|
160
|
-
-
|
|
161
|
-
-
|
|
162
|
-
-
|
|
163
|
-
-
|
|
164
|
-
-
|
|
279
|
+
- pinned constraints
|
|
280
|
+
- task goals
|
|
281
|
+
- recent relevant turns
|
|
282
|
+
- named entities and recurring references
|
|
283
|
+
- stored summaries when budgets are tight
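One way to express that preference order is a weighted score, where pinned material always outranks recency. This sketch captures the spirit of the heuristic and is not the shipped scoring function:

```typescript
// Illustrative retrieval scoring; weights are assumptions.
interface StoredTurn {
  pinned: boolean;         // pinned constraint or task goal
  ageTurns: number;        // 0 = most recent turn
  mentionsEntity: boolean; // shares a named entity with the new prompt
  isSummary: boolean;
}

// Higher score = retrieved first. Pinned material always outranks
// recency; summaries only win when everything else is stale.
function retrievalScore(t: StoredTurn): number {
  let score = 0;
  if (t.pinned) score += 100;
  score += Math.max(0, 10 - t.ageTurns); // recency decay
  if (t.mentionsEntity) score += 5;
  if (t.isSummary) score += 1;
  return score;
}
```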
|
|
165
284
|
|
|
166
285
|
## Token reduction
|
|
167
286
|
|
|
168
|
-
|
|
287
|
+
PromptPilot estimates token usage for:
|
|
169
288
|
|
|
170
|
-
-
|
|
171
|
-
-
|
|
172
|
-
-
|
|
289
|
+
- the new prompt
|
|
290
|
+
- retrieved context
|
|
291
|
+
- the final composed prompt
|
|
173
292
|
|
|
174
|
-
|
|
293
|
+
Budgets:
|
|
175
294
|
|
|
176
295
|
- `maxInputTokens`
|
|
177
296
|
- `maxContextTokens`
|
|
178
297
|
- `maxTotalTokens`
|
|
179
298
|
|
|
180
|
-
When
|
|
299
|
+
When the composed input would exceed a budget, PromptPilot compresses or summarizes old context, preserves high-signal instructions, and drops low-value context before composing the final prompt.
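The trimming step can be sketched as a greedy pass over context chunks. This is an illustrative approximation of the documented behavior, not the package's compressor (which summarizes instead of dropping outright):

```typescript
// Illustrative budget fitting; chunk shape is an assumption.
interface Chunk {
  text: string;
  tokens: number;
  pinned: boolean; // high-signal instructions are never dropped
}

// Keep pinned chunks first, then fill the remaining budget with the
// newest unpinned chunks (newest last in the input array).
function fitToBudget(chunks: Chunk[], maxContextTokens: number): Chunk[] {
  const pinned = chunks.filter((c) => c.pinned);
  let used = pinned.reduce((sum, c) => sum + c.tokens, 0);
  const kept = [...pinned];
  for (const c of [...chunks].reverse()) {
    if (c.pinned) continue;
    if (used + c.tokens > maxContextTokens) continue;
    kept.push(c);
    used += c.tokens;
  }
  return kept;
}
```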
|
|
181
300
|
|
|
182
301
|
## CLI
|
|
183
302
|
|
|
184
303
|
```bash
|
|
185
|
-
promptpilot optimize "
|
|
304
|
+
promptpilot optimize "rewrite this prompt for a coding refactor task"
|
|
186
305
|
```
|
|
187
306
|
|
|
188
307
|
Supported flags:
|
|
189
308
|
|
|
190
309
|
- `--session <id>`
|
|
191
|
-
- `--model <name>`
|
|
310
|
+
- `--model <name>`
|
|
192
311
|
- `--mode <mode>`
|
|
193
312
|
- `--task <task>`
|
|
194
313
|
- `--tone <tone>`
|
|
@@ -198,6 +317,13 @@ Supported flags:
|
|
|
198
317
|
- `--max-length <n>`
|
|
199
318
|
- `--tag <value>` repeatable
|
|
200
319
|
- `--pin-constraint <text>` repeatable
|
|
320
|
+
- `--target <provider:model>` repeatable
|
|
321
|
+
- `--target-hint <value>` repeatable
|
|
322
|
+
- `--routing-priority <cheapest_adequate|best_quality|fastest_adequate>`
|
|
323
|
+
- `--routing-top-k <n>`
|
|
324
|
+
- `--workload-bias <code_first>`
|
|
325
|
+
- `--no-routing`
|
|
326
|
+
- `--host <url>`
|
|
201
327
|
- `--store <local|sqlite>`
|
|
202
328
|
- `--storage-dir <path>`
|
|
203
329
|
- `--sqlite-path <path>`
|
|
@@ -206,13 +332,14 @@ Supported flags:
|
|
|
206
332
|
- `--debug`
|
|
207
333
|
- `--save-context`
|
|
208
334
|
- `--no-context`
|
|
335
|
+
- `--clear-session`
|
|
209
336
|
- `--max-total-tokens <n>`
|
|
210
337
|
- `--max-context-tokens <n>`
|
|
211
338
|
- `--max-input-tokens <n>`
|
|
212
|
-
- `--
|
|
339
|
+
- `--timeout <ms>`
|
|
213
340
|
- `--bypass-optimization`
|
|
214
341
|
|
|
215
|
-
If no prompt
|
|
342
|
+
If no positional prompt is provided, `promptpilot optimize` reads the raw prompt from stdin.
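Reading the prompt from stdin follows the usual Node stream pattern. A minimal sketch compatible with the `readStdin(stdin?)` dependency shape declared in `dist/cli.d.ts`:

```typescript
import { Readable } from "node:stream";

// Collect the whole input stream into one string.
async function readStdin(
  stdin: NodeJS.ReadStream | Readable = process.stdin
): Promise<string> {
  const chunks: Buffer[] = [];
  for await (const chunk of stdin) {
    chunks.push(Buffer.from(chunk));
  }
  return Buffer.concat(chunks).toString("utf8");
}
```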
|
|
216
343
|
|
|
217
344
|
## Public API
|
|
218
345
|
|
|
@@ -225,6 +352,19 @@ Main exports:
|
|
|
225
352
|
- `FileSessionStore`
|
|
226
353
|
- `SQLiteSessionStore`
|
|
227
354
|
|
|
355
|
+
Useful result fields:
|
|
356
|
+
|
|
357
|
+
- `optimizedPrompt`
|
|
358
|
+
- `finalPrompt`
|
|
359
|
+
- `selectedTarget`
|
|
360
|
+
- `rankedTargets`
|
|
361
|
+
- `routingReason`
|
|
362
|
+
- `routingWarnings`
|
|
363
|
+
- `provider`
|
|
364
|
+
- `model`
|
|
365
|
+
- `estimatedTokensBefore`
|
|
366
|
+
- `estimatedTokensAfter`
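The two estimate fields make it easy to report savings. A small hypothetical helper (the interface here lists only the two result fields it uses):

```typescript
interface TokenEstimates {
  estimatedTokensBefore: number;
  estimatedTokensAfter: number;
}

// Percent of estimated tokens saved by optimization, rounded to an integer.
function tokenSavingsPercent(r: TokenEstimates): number {
  if (r.estimatedTokensBefore === 0) return 0;
  return Math.round(
    (100 * (r.estimatedTokensBefore - r.estimatedTokensAfter)) /
      r.estimatedTokensBefore
  );
}
```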
|
|
367
|
+
|
|
228
368
|
Supported modes:
|
|
229
369
|
|
|
230
370
|
- `clarity`
|
|
@@ -244,41 +384,23 @@ Supported presets:
|
|
|
244
384
|
- `summarization`
|
|
245
385
|
- `chat`
|
|
246
386
|
|
|
247
|
-
##
|
|
387
|
+
## Why the default model was chosen
|
|
248
388
|
|
|
249
|
-
|
|
250
|
-
src/
|
|
251
|
-
index.ts
|
|
252
|
-
types.ts
|
|
253
|
-
errors.ts
|
|
254
|
-
cli.ts
|
|
255
|
-
core/
|
|
256
|
-
optimizer.ts
|
|
257
|
-
ollamaClient.ts
|
|
258
|
-
systemPrompt.ts
|
|
259
|
-
contextManager.ts
|
|
260
|
-
tokenEstimator.ts
|
|
261
|
-
contextCompressor.ts
|
|
262
|
-
storage/
|
|
263
|
-
fileSessionStore.ts
|
|
264
|
-
sqliteSessionStore.ts
|
|
265
|
-
utils/
|
|
266
|
-
validation.ts
|
|
267
|
-
logger.ts
|
|
268
|
-
json.ts
|
|
269
|
-
test/
|
|
270
|
-
```
|
|
389
|
+
`qwen2.5:3b` is the default local preference because it offers a practical balance of:
|
|
271
390
|
|
|
272
|
-
|
|
391
|
+
- good instruction following
|
|
392
|
+
- strong enough reasoning for prompt optimization
|
|
393
|
+
- acceptable memory use on laptops
|
|
394
|
+
- good performance for code-first workflows
|
|
273
395
|
|
|
274
|
-
|
|
396
|
+
`phi3:mini` remains a useful lightweight option for shorter non-coding rewrites when it is installed locally and the Qwen router selects it.
|
|
275
397
|
|
|
276
398
|
## Future improvements
|
|
277
399
|
|
|
278
|
-
-
|
|
279
|
-
-
|
|
280
|
-
-
|
|
281
|
-
-
|
|
282
|
-
-
|
|
283
|
-
-
|
|
284
|
-
-
|
|
400
|
+
- semantic retrieval for context
|
|
401
|
+
- better token counting by target model
|
|
402
|
+
- prompt scoring
|
|
403
|
+
- local embeddings for relevance search
|
|
404
|
+
- response-aware context updates
|
|
405
|
+
- cache layer
|
|
406
|
+
- benchmark suite
|
package/dist/cli.d.ts
CHANGED
|
@@ -3,6 +3,8 @@ import { createOptimizer } from './index.js';
|
|
|
3
3
|
|
|
4
4
|
type CliWriter = {
|
|
5
5
|
write(message: string): void;
|
|
6
|
+
isTTY?: boolean;
|
|
7
|
+
columns?: number;
|
|
6
8
|
};
|
|
7
9
|
interface CliIO {
|
|
8
10
|
stdout: CliWriter;
|
|
@@ -12,6 +14,13 @@ interface CliIO {
|
|
|
12
14
|
interface CliDependencies {
|
|
13
15
|
createOptimizer: typeof createOptimizer;
|
|
14
16
|
readStdin: (stdin?: NodeJS.ReadStream) => Promise<string>;
|
|
17
|
+
getCliInfo?: (stdout: CliWriter) => {
|
|
18
|
+
cwd: string;
|
|
19
|
+
version: string;
|
|
20
|
+
color: boolean;
|
|
21
|
+
columns?: number;
|
|
22
|
+
user?: string;
|
|
23
|
+
};
|
|
15
24
|
}
|
|
16
25
|
declare function runCli(argv: string[], io?: CliIO, dependencies?: CliDependencies): Promise<number>;
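A caller supplying the new optional `getCliInfo` dependency might implement it like this. The field values (version string, `USER` env var) are assumptions; only the returned shape comes from the declaration above:

```typescript
// Hypothetical implementation of the optional getCliInfo dependency;
// the returned fields match the shape declared in cli.d.ts.
type CliWriter = {
  write(message: string): void;
  isTTY?: boolean;
  columns?: number;
};

function getCliInfo(stdout: CliWriter) {
  return {
    cwd: process.cwd(),
    version: "0.1.4", // assumed; read from package.json in practice
    color: Boolean(stdout.isTTY), // disable ANSI color when not a TTY
    columns: stdout.columns,
    user: process.env.USER,
  };
}
```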
|
|
17
26
|
|