@fastpaca/cria 1.5.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (70)
  1. package/README.md +269 -72
  2. package/dist/dsl/builder.d.ts +101 -41
  3. package/dist/dsl/builder.d.ts.map +1 -1
  4. package/dist/dsl/builder.js +238 -48
  5. package/dist/dsl/builder.js.map +1 -1
  6. package/dist/dsl/index.d.ts +12 -6
  7. package/dist/dsl/index.d.ts.map +1 -1
  8. package/dist/dsl/index.js +10 -0
  9. package/dist/dsl/index.js.map +1 -1
  10. package/dist/dsl/strategies.d.ts +2 -1
  11. package/dist/dsl/strategies.d.ts.map +1 -1
  12. package/dist/dsl/strategies.js +5 -4
  13. package/dist/dsl/strategies.js.map +1 -1
  14. package/dist/dsl/summary.d.ts +2 -1
  15. package/dist/dsl/summary.d.ts.map +1 -1
  16. package/dist/dsl/summary.js.map +1 -1
  17. package/dist/dsl/vector-search.d.ts +8 -41
  18. package/dist/dsl/vector-search.d.ts.map +1 -1
  19. package/dist/dsl/vector-search.js +10 -62
  20. package/dist/dsl/vector-search.js.map +1 -1
  21. package/dist/eval/index.d.ts +6 -3
  22. package/dist/eval/index.d.ts.map +1 -1
  23. package/dist/eval/judge.d.ts +2 -1
  24. package/dist/eval/judge.d.ts.map +1 -1
  25. package/dist/eval/judge.js.map +1 -1
  26. package/dist/index.d.ts +10 -5
  27. package/dist/index.d.ts.map +1 -1
  28. package/dist/index.js +6 -3
  29. package/dist/index.js.map +1 -1
  30. package/dist/instrumentation/otel.d.ts.map +1 -1
  31. package/dist/instrumentation/otel.js +124 -11
  32. package/dist/instrumentation/otel.js.map +1 -1
  33. package/dist/memory/chroma/index.js +4 -4
  34. package/dist/protocols/chat-completions.d.ts +38 -0
  35. package/dist/protocols/chat-completions.d.ts.map +1 -0
  36. package/dist/protocols/chat-completions.js +94 -0
  37. package/dist/protocols/chat-completions.js.map +1 -0
  38. package/dist/protocols/responses.d.ts +80 -0
  39. package/dist/protocols/responses.d.ts.map +1 -0
  40. package/dist/protocols/responses.js +148 -0
  41. package/dist/protocols/responses.js.map +1 -0
  42. package/dist/provider.d.ts +107 -0
  43. package/dist/provider.d.ts.map +1 -0
  44. package/dist/provider.js +60 -0
  45. package/dist/provider.js.map +1 -0
  46. package/dist/providers/ai-sdk.d.ts +21 -6
  47. package/dist/providers/ai-sdk.d.ts.map +1 -1
  48. package/dist/providers/ai-sdk.js +83 -43
  49. package/dist/providers/ai-sdk.js.map +1 -1
  50. package/dist/providers/anthropic.d.ts +28 -8
  51. package/dist/providers/anthropic.d.ts.map +1 -1
  52. package/dist/providers/anthropic.js +314 -69
  53. package/dist/providers/anthropic.js.map +1 -1
  54. package/dist/providers/openai.d.ts +67 -25
  55. package/dist/providers/openai.d.ts.map +1 -1
  56. package/dist/providers/openai.js +365 -89
  57. package/dist/providers/openai.js.map +1 -1
  58. package/dist/render.d.ts +3 -2
  59. package/dist/render.d.ts.map +1 -1
  60. package/dist/render.js +61 -13
  61. package/dist/render.js.map +1 -1
  62. package/dist/types.d.ts +49 -40
  63. package/dist/types.d.ts.map +1 -1
  64. package/dist/types.js +2 -21
  65. package/dist/types.js.map +1 -1
  66. package/package.json +20 -17
  67. package/dist/testing/plaintext.d.ts +0 -28
  68. package/dist/testing/plaintext.d.ts.map +0 -1
  69. package/dist/testing/plaintext.js +0 -71
  70. package/dist/testing/plaintext.js.map +0 -1
package/README.md CHANGED
@@ -1,13 +1,7 @@
  <h1 align="center">Cria</h1>

- > **Note:** Cria is under active development. We're iterating heavily and the API may change before 2.0. Use in production at your own discretion.
-
  <p align="center">
- <i>Your prompts deserve the same structure as your code.</i>
- </p>
-
- <p align="center">
- <b><i>Cria turns prompts into composable components with explicit roles and strategies, and works with your existing environment & frameworks.</i></b>
+ TypeScript prompt architecture for fast-moving teams and engineers.
  </p>

  <p align="center">
@@ -22,34 +16,159 @@
  </a>
  </p>

- Cria is a lightweight prompt composition library for structured prompt engineering. Build prompts as components, keep behavior predictable, and reuse the same structure across providers. Runs on Node, Deno, Bun, and Edge; adapters require their SDKs.
+ The LLM space moves fast. New models drop often. Providers change APIs. Better vector stores emerge. New memory systems appear. **Your prompts shouldn't break every time the stack evolves.**
+
+ Cria is prompt architecture as code: keep the same prompt logic and swap the building blocks underneath when you need to upgrade.
+
+ ```ts
+ import { cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/openai";
+ import OpenAI from "openai";
+
+ const client = new OpenAI();
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a research assistant.")
+   .summary(conversation, { id: "history", store: memory })
+   .vectorSearch({ store, query, limit: 8 })
+   .user(query)
+   .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
+ ```
+
+ ## Why Cria?
+
+ When you run LLM features in production, you need to:
+
+ 1. **Build prompts that last** — Swap providers, models, memory, or retrieval without rewriting prompt logic. A/B test components as the stack evolves.
+ 2. **Test like code** — Evaluate prompts with LLM-as-a-judge. Run tests in CI. Catch drift when you swap building blocks.
+ 3. **Inspect what runs** — See exactly what gets sent to the model. Debug token budgets. Spot when your RAG input pollutes the context. *(Local DevTools-style inspector: planned)*
+
+ Cria gives you composable prompt blocks, explicit token budgets, and building blocks you can easily customise and adapt, so you can move fast without breaking prompts.
+
+ ## What you get
+
+ | Capability | Status |
+ | --- | --- |
+ | Component swapping via adapters | ✅ |
+ | Memory + vector search adapters | ✅ |
+ | Token budgeting | ✅ |
+ | Fit & compaction controls | ✅ |
+ | Conversation summaries | ✅ |
+ | OpenTelemetry integration | ✅ |
+ | Prompt eval/test helpers | ✅ |
+ | Local prompt inspector (DevTools-style) | planned |
+
+ ## Quick start
+
+ ```bash
+ npm install @fastpaca/cria
+ ```
+
+ ```ts
+ import { cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/openai";
+ import OpenAI from "openai";
+
+ const client = new OpenAI();
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a helpful assistant.")
+   .user("What is the capital of France?")
+   .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
+ ```
+
+ ## Core patterns
+
+ <details>
+ <summary><strong>RAG with vector search</strong></summary>

  ```ts
  const messages = await cria
    .prompt(provider)
    .system("You are a research assistant.")
-   .vectorSearch({ store, query: question, limit: 10 })
-   .providerScope(provider, (p) =>
-     p.summary(conversation, { store: memory }).last(conversation, { N: 20 })
-   )
-   .user(question)
-   .render({ budget: 200_000 });
+   .vectorSearch({ store: qdrant, query, limit: 10 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Summarize long conversation history</strong></summary>
+
+ ```ts
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a helpful assistant.")
+   .summary(conversation, { id: "conv", store: redis, priority: 2 })
+   .last(conversation, { n: 20 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Token budgeting and compaction</strong></summary>
+
+ ```ts
+ const messages = await cria
+   .prompt(provider)
+   .system(SYSTEM_PROMPT)
+   // Dropped first when the budget is tight
+   .omit(examples, { priority: 3 })
+   // Summaries run ad hoc once we hit budget limits
+   .summary(conversation, { id: "conv", store: redis, priority: 2 })
+   // Must be retained, but limited to 10 entries
+   .vectorSearch({ store: qdrant, query, limit: 10 })
+   .user(query)
+   // 128k token budget; once we hit it, strategies run based
+   // on priority & usage (e.g. summaries will trigger).
+   .render({ budget: 128_000 });
  ```

- Start with **[Quickstart](docs/quickstart.md)**, then use **[Docs](docs/README.md)** to jump to the right how-to.
+ </details>
+
+ <details>
+ <summary><strong>Evaluate prompts like code</strong></summary>
+
+ ```ts
+ import { c, cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/ai-sdk";
+ import { createJudge } from "@fastpaca/cria/eval";
+ import { openai } from "@ai-sdk/openai";
+
+ const judge = createJudge({
+   target: createProvider(openai("gpt-4o")),
+   evaluator: createProvider(openai("gpt-4o-mini")),
+ });
+
+ const prompt = await cria
+   .prompt()
+   .system("You are a helpful customer support agent.")
+   .user("How do I update my payment method?")
+   .build();

- ## Use Cria when you need...
+ await judge(prompt).toPass(c`Provides clear, actionable steps`);
+ ```

- - **Need RAG?** Call `.vectorSearch({ store, query })`.
- - **Need a summary for long conversations?** Use `.summary(...)`.
- - **Need to cap history but keep structure?** Use `Last(...)`.
- - **Need to drop optional context when the context window is full?** Use `.omit(...)`.
- - **Using AI SDK?** Plug and play with `@fastpaca/cria/ai-sdk`!
+ </details>

- ## Providers
+ ## Works with

  <details>
- <summary><strong>OpenAI Chat Completions</strong></summary>
+ <summary><strong>OpenAI (Chat Completions)</strong></summary>

  ```ts
  import OpenAI from "openai";
@@ -57,18 +176,22 @@ import { createProvider } from "@fastpaca/cria/openai";
  import { cria } from "@fastpaca/cria";

  const client = new OpenAI();
- const provider = createProvider(client, "gpt-4o-mini");
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
  const messages = await cria
    .prompt(provider)
    .system("You are helpful.")
    .user(userQuestion)
-   .render({ budget });
- const response = await client.chat.completions.create({ model: "gpt-4o-mini", messages });
+   .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
  ```
+
  </details>

  <details>
- <summary><strong>OpenAI Responses</strong></summary>
+ <summary><strong>OpenAI (Responses)</strong></summary>

  ```ts
  import OpenAI from "openai";
@@ -76,14 +199,18 @@ import { createResponsesProvider } from "@fastpaca/cria/openai";
  import { cria } from "@fastpaca/cria";

  const client = new OpenAI();
- const provider = createResponsesProvider(client, "gpt-5-nano");
+ const model = "gpt-5-nano";
+ const provider = createResponsesProvider(client, model);
+
  const input = await cria
    .prompt(provider)
    .system("You are helpful.")
    .user(userQuestion)
-   .render({ budget });
- const response = await client.responses.create({ model: "gpt-5-nano", input });
+   .render({ budget: 128_000 });
+
+ const response = await client.responses.create({ model, input });
  ```
+
  </details>

  <details>
@@ -95,14 +222,18 @@ import { createProvider } from "@fastpaca/cria/anthropic";
  import { cria } from "@fastpaca/cria";

  const client = new Anthropic();
- const provider = createProvider(client, "claude-haiku-4-5");
+ const model = "claude-sonnet-4";
+ const provider = createProvider(client, model);
+
  const { system, messages } = await cria
    .prompt(provider)
    .system("You are helpful.")
    .user(userQuestion)
-   .render({ budget });
- const response = await client.messages.create({ model: "claude-haiku-4-5", system, messages });
+   .render({ budget: 128_000 });
+
+ const response = await client.messages.create({ model, system, messages });
  ```
+
  </details>

  <details>
@@ -114,73 +245,139 @@ import { cria } from "@fastpaca/cria";
  import { generateText } from "ai";

  const provider = createProvider(model);
+
  const messages = await cria
    .prompt(provider)
    .system("You are helpful.")
    .user(userQuestion)
-   .render({ budget });
+   .render({ budget: 128_000 });
+
  const { text } = await generateText({ model, messages });
  ```
+
  </details>

- ## Evaluation (LLM-as-a-judge)
+ <details>
+ <summary><strong>Redis (conversation summaries)</strong></summary>

- Use the `@fastpaca/cria/eval` entrypoint for judge-style evaluation helpers.
+ ```ts
+ import { RedisStore } from "@fastpaca/cria/memory/redis";
+ import type { StoredSummary } from "@fastpaca/cria";
+
+ const store = new RedisStore<StoredSummary>({
+   host: "localhost",
+   port: 6379,
+ });
+
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a helpful assistant.")
+   .summary(conversation, { id: "conv-123", store, priority: 2 })
+   .last(conversation, { n: 20 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Postgres (conversation summaries)</strong></summary>

  ```ts
- import { c, cria } from "@fastpaca/cria";
- import { createProvider } from "@fastpaca/cria/ai-sdk";
- import { createJudge } from "@fastpaca/cria/eval";
- import { openai } from "@ai-sdk/openai";
+ import { PostgresStore } from "@fastpaca/cria/memory/postgres";
+ import type { StoredSummary } from "@fastpaca/cria";

- const judge = createJudge({
-   target: createProvider(openai("gpt-4o")),
-   evaluator: createProvider(openai("gpt-4o-mini")),
+ const store = new PostgresStore<StoredSummary>({
+   connectionString: "postgres://user:pass@localhost/mydb",
  });

- const prompt = await cria
-   .prompt()
-   .system("You are a helpful customer support agent.")
-   .user("How do I update my payment method?")
-   .build();
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a helpful assistant.")
+   .summary(conversation, { id: "conv-123", store, priority: 2 })
+   .last(conversation, { n: 20 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Chroma (vector search)</strong></summary>
+
+ ```ts
+ import { ChromaClient } from "chromadb";
+ import { ChromaStore } from "@fastpaca/cria/memory/chroma";

- await judge(prompt).toPass(c`Helpfulness in addressing the user's question`);
+ const client = new ChromaClient({ path: "http://localhost:8000" });
+ const collection = await client.getOrCreateCollection({ name: "my-docs" });
+
+ const store = new ChromaStore({
+   collection,
+   embed: async (text) => await getEmbedding(text),
+ });
+
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a research assistant.")
+   .vectorSearch({ store, query, limit: 10 })
+   .user(query)
+   .render({ budget: 128_000 });
  ```

- ## Roadmap
+ </details>

- **Done**
+ <details>
+ <summary><strong>Qdrant (vector search)</strong></summary>

- - [x] Fluent DSL and priority-based eviction
- - [x] Components: Region, Message, Truncate, Omit, Last, Summary, VectorSearch, ToolCall, ToolResult, Reasoning, Examples, CodeBlock, Separator
- - [x] Providers: OpenAI (Chat Completions + Responses), Anthropic, AI SDK
- - [x] AI SDK helpers: Messages component, DEFAULT_PRIORITIES
- - [x] Memory: InMemoryStore, Redis, Postgres, Chroma, Qdrant
- - [x] Observability: render hooks, validation schemas, OpenTelemetry
- - [x] Prompt eval / testing functionality
+ ```ts
+ import { QdrantClient } from "@qdrant/js-client-rest";
+ import { QdrantStore } from "@fastpaca/cria/memory/qdrant";

- **Planned**
+ const client = new QdrantClient({ url: "http://localhost:6333" });

- - [ ] Next.js adapter
- - [ ] GenAI semantic conventions for OpenTelemetry
- - [ ] Visualization tool
+ const store = new QdrantStore({
+   client,
+   collectionName: "my-docs",
+   embed: async (text) => await getEmbedding(text),
+ });

- ## Contributing
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a research assistant.")
+   .vectorSearch({ store, query, limit: 10 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```

- - Issues and PRs are welcome.
- - Keep changes small and focused.
- - If you add a feature, include a short example or doc note.
+ </details>

- ## Support
+ ## Documentation

- - Open a GitHub issue for bugs or feature requests.
- - For quick questions, include a minimal repro or snippet.
+ - [Quickstart](docs/quickstart.md)
+ - [RAG / vector search](docs/how-to/rag.md)
+ - [Summarize long history](docs/how-to/summarize-history.md)
+ - [Fit & compaction](docs/how-to/fit-and-compaction.md)
+ - [Prompt evaluation](docs/how-to/prompt-evaluation.md)
+ - [Full documentation](docs/README.md)

  ## FAQ

- - **Does this replace my LLM SDK?** No - Cria builds prompt structures. You still use your SDK to call the model.
- - **How do I tune token budgets?** Pass `budget` to `render()` and set priorities on regions; see [docs/how-to/fit-and-compaction.md](docs/how-to/fit-and-compaction.md).
- - **Is this production-ready?** Not yet! It is a work in progress and you should test it out before you run this in production.
+ **What does Cria output?**
+ Prompt structures/messages (via a provider adapter). You pass the rendered output into your existing LLM SDK call.
+
+ **What works out of the box?**
+ Provider adapters for OpenAI (Chat Completions + Responses), Anthropic, and the Vercel AI SDK; store adapters for Redis, Postgres, Chroma, and Qdrant.
+
+ **How do I validate component swaps?**
+ Swap via adapters, diff the rendered prompt output, and run prompt evals/tests to catch drift.
+
+ **What's the API stability?**
+ We use Cria in production, but the API may change before 2.0. Pin versions and follow the changelog.
+
+ ## Contributing
+
+ Issues and PRs welcome. Keep changes small and focused.

  ## License