@fastpaca/cria 1.6.0 → 1.7.0

Files changed (47)
  1. package/README.md +273 -83
  2. package/dist/dsl/builder.d.ts +69 -41
  3. package/dist/dsl/builder.d.ts.map +1 -1
  4. package/dist/dsl/builder.js +122 -43
  5. package/dist/dsl/builder.js.map +1 -1
  6. package/dist/dsl/index.d.ts +4 -5
  7. package/dist/dsl/index.d.ts.map +1 -1
  8. package/dist/dsl/index.js.map +1 -1
  9. package/dist/dsl/strategies.d.ts +2 -1
  10. package/dist/dsl/strategies.d.ts.map +1 -1
  11. package/dist/dsl/strategies.js +5 -4
  12. package/dist/dsl/strategies.js.map +1 -1
  13. package/dist/dsl/vector-search.d.ts +8 -41
  14. package/dist/dsl/vector-search.d.ts.map +1 -1
  15. package/dist/dsl/vector-search.js +10 -62
  16. package/dist/dsl/vector-search.js.map +1 -1
  17. package/dist/eval/index.d.ts +3 -3
  18. package/dist/eval/index.d.ts.map +1 -1
  19. package/dist/index.d.ts +2 -2
  20. package/dist/index.d.ts.map +1 -1
  21. package/dist/index.js.map +1 -1
  22. package/dist/instrumentation/otel.d.ts.map +1 -1
  23. package/dist/instrumentation/otel.js +124 -11
  24. package/dist/instrumentation/otel.js.map +1 -1
  25. package/dist/provider.d.ts +16 -3
  26. package/dist/provider.d.ts.map +1 -1
  27. package/dist/provider.js +4 -4
  28. package/dist/provider.js.map +1 -1
  29. package/dist/providers/anthropic.d.ts +6 -4
  30. package/dist/providers/anthropic.d.ts.map +1 -1
  31. package/dist/providers/anthropic.js +152 -26
  32. package/dist/providers/anthropic.js.map +1 -1
  33. package/dist/providers/openai.d.ts +36 -21
  34. package/dist/providers/openai.d.ts.map +1 -1
  35. package/dist/providers/openai.js +56 -27
  36. package/dist/providers/openai.js.map +1 -1
  37. package/dist/render.d.ts.map +1 -1
  38. package/dist/render.js +53 -8
  39. package/dist/render.js.map +1 -1
  40. package/dist/types.d.ts +38 -0
  41. package/dist/types.d.ts.map +1 -1
  42. package/dist/types.js.map +1 -1
  43. package/package.json +2 -1
  44. package/dist/testing/plaintext.d.ts +0 -29
  45. package/dist/testing/plaintext.d.ts.map +0 -1
  46. package/dist/testing/plaintext.js +0 -106
  47. package/dist/testing/plaintext.js.map +0 -1
package/README.md CHANGED
@@ -1,13 +1,7 @@
  <h1 align="center">Cria</h1>
 
- > **Note:** Cria is under active development. We're iterating heavily and the API may change before 2.0. Use in production at your own discretion.
-
  <p align="center">
- <i>Your prompts deserve the same structure as your code.</i>
- </p>
-
- <p align="center">
- <b><i>Cria turns prompts into composable components with explicit roles and strategies, and works with your existing environment & frameworks.</i></b>
+ TypeScript prompt architecture for fast-moving teams and engineers.
  </p>
 
  <p align="center">
@@ -22,32 +16,159 @@
  </a>
  </p>
 
- Cria is a lightweight prompt composition library for structured prompt engineering. Build prompts as components, keep behavior predictable, and reuse the same structure across providers. Runs on Node, Deno, Bun, and Edge; adapters require their SDKs.
+ The LLM space moves fast. New models drop often. Providers change APIs. Better vector stores emerge. New memory systems drop. **Your prompts shouldn't break every time the stack evolves.**
+
+ Cria is prompt architecture as code. Same prompt logic, swap the building blocks underneath when you need to upgrade.
+
+ ```ts
+ import { cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/openai";
+ import OpenAI from "openai";
+
+ const client = new OpenAI();
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
+ const messages = await cria
+ .prompt(provider)
+ .system("You are a research assistant.")
+ .summary(conversation, { id: "history", store: memory })
+ .vectorSearch({ store, query, limit: 8 })
+ .user(query)
+ .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
+ ```
+
+ ## Why Cria?
+
+ When you run LLM features in production, you need to:
+
+ 1. **Build prompts that last** — Swap providers, models, memory, or retrieval without rewriting prompt logic. A/B test components as the stack evolves.
+ 2. **Test like code** — Evaluate prompts with LLM-as-a-judge. Run tests in CI. Catch drift when you swap building blocks.
+ 3. **Inspect what runs** — See exactly what gets sent to the model. Debug token budgets. See when your RAG input messes up the context. *(Local DevTools-style inspector: planned)*
+
+ Cria gives you composable prompt blocks, explicit token budgets, and building blocks you can easily customise and adapt so you move fast without breaking prompts.
+
+ ## What you get
+
+ | Capability | Status |
+ | --- | --- |
+ | Component swapping via adapters | ✅ |
+ | Memory + vector search adapters | ✅ |
+ | Token budgeting | ✅ |
+ | Fit & compaction controls | ✅ |
+ | Conversation summaries | ✅ |
+ | OpenTelemetry integration | ✅ |
+ | Prompt eval/test helpers | ✅ |
+ | Local prompt inspector (DevTools-style) | planned |
+
+ ## Quick start
+
+ ```bash
+ npm install @fastpaca/cria
+ ```
+
+ ```ts
+ import { cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/openai";
+ import OpenAI from "openai";
+
+ const client = new OpenAI();
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
+ const messages = await cria
+ .prompt(provider)
+ .system("You are a helpful assistant.")
+ .user("What is the capital of France?")
+ .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
+ ```
+
+ ## Core patterns
+
+ <details>
+ <summary><strong>RAG with vector search</strong></summary>
 
  ```ts
  const messages = await cria
  .prompt(provider)
  .system("You are a research assistant.")
- .vectorSearch({ store, query: question, limit: 10 })
- .summary(conversation, { id: "history", store: memory, priority: 2 })
- .user(question)
- .render({ budget: 200_000 });
+ .vectorSearch({ store: qdrant, query, limit: 10 })
+ .user(query)
+ .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Summarize long conversation history</strong></summary>
+
+ ```ts
+ const messages = await cria
+ .prompt(provider)
+ .system("You are a helpful assistant.")
+ .summary(conversation, { id: "conv", store: redis, priority: 2 })
+ .last(conversation, { n: 20 })
+ .user(query)
+ .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Token budgeting and compaction</strong></summary>
+
+ ```ts
+ const messages = await cria
+ .prompt(provider)
+ .system(SYSTEM_PROMPT)
+ // Dropped first when budget is tight
+ .omit(examples, { priority: 3 })
+ // Summaries are run ad-hoc once we hit budget limits
+ .summary(conversation, { id: "conv", store: redis, priority: 2 })
+ // Sacred, need to retain but limit to only 10 entries
+ .vectorSearch({ store: qdrant, query, limit: 10 })
+ .user(query)
+ // 128k token budget, once we hit the budget strategies
+ // will run based on priority & usage (e.g. summaries will
+ // trigger).
+ .render({ budget: 128_000 });
  ```
 
- Start with **[Quickstart](docs/quickstart.md)**, then use **[Docs](docs/README.md)** to jump to the right how-to.
+ </details>
+
+ <details>
+ <summary><strong>Evaluate prompts like code</strong></summary>
+
+ ```ts
+ import { c, cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/ai-sdk";
+ import { createJudge } from "@fastpaca/cria/eval";
+ import { openai } from "@ai-sdk/openai";
+
+ const judge = createJudge({
+ target: createProvider(openai("gpt-4o")),
+ evaluator: createProvider(openai("gpt-4o-mini")),
+ });
+
+ const prompt = await cria
+ .prompt()
+ .system("You are a helpful customer support agent.")
+ .user("How do I update my payment method?")
+ .build();
 
- ## Use Cria when you need...
+ await judge(prompt).toPass(c`Provides clear, actionable steps`);
+ ```
 
- - **Need RAG?** Call `.vectorSearch({ store, query })`.
- - **Need a summary for long conversations?** Use `.summary(...)`.
- - **Need to cap history but keep structure?** Use `.last(...)`.
- - **Need to drop optional context when the context window is full?** Use `.omit(...)`.
- - **Using AI SDK?** Plug and play with `@fastpaca/cria/ai-sdk`!
+ </details>
 
- ## Providers
+ ## Works with
 
  <details>
- <summary><strong>OpenAI Chat Completions</strong></summary>
+ <summary><strong>OpenAI (Chat Completions)</strong></summary>
 
  ```ts
  import OpenAI from "openai";
@@ -55,18 +176,22 @@ import { createProvider } from "@fastpaca/cria/openai";
  import { cria } from "@fastpaca/cria";
 
  const client = new OpenAI();
- const provider = createProvider(client, "gpt-4o-mini");
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
  const messages = await cria
  .prompt(provider)
  .system("You are helpful.")
  .user(userQuestion)
- .render({ budget });
- const response = await client.chat.completions.create({ model: "gpt-4o-mini", messages });
+ .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
  ```
+
  </details>
 
  <details>
- <summary><strong>OpenAI Responses</strong></summary>
+ <summary><strong>OpenAI (Responses)</strong></summary>
 
  ```ts
  import OpenAI from "openai";
@@ -74,14 +199,18 @@ import { createResponsesProvider } from "@fastpaca/cria/openai";
  import { cria } from "@fastpaca/cria";
 
  const client = new OpenAI();
- const provider = createResponsesProvider(client, "gpt-5-nano");
+ const model = "gpt-5-nano";
+ const provider = createResponsesProvider(client, model);
+
  const input = await cria
  .prompt(provider)
  .system("You are helpful.")
  .user(userQuestion)
- .render({ budget });
- const response = await client.responses.create({ model: "gpt-5-nano", input });
+ .render({ budget: 128_000 });
+
+ const response = await client.responses.create({ model, input });
  ```
+
  </details>
 
  <details>
@@ -93,14 +222,18 @@ import { createProvider } from "@fastpaca/cria/anthropic";
  import { cria } from "@fastpaca/cria";
 
  const client = new Anthropic();
- const provider = createProvider(client, "claude-haiku-4-5");
+ const model = "claude-sonnet-4";
+ const provider = createProvider(client, model);
+
  const { system, messages } = await cria
  .prompt(provider)
  .system("You are helpful.")
  .user(userQuestion)
- .render({ budget });
- const response = await client.messages.create({ model: "claude-haiku-4-5", system, messages });
+ .render({ budget: 128_000 });
+
+ const response = await client.messages.create({ model, system, messages });
  ```
+
  </details>
 
  <details>
@@ -112,82 +245,139 @@ import { cria } from "@fastpaca/cria";
  import { generateText } from "ai";
 
  const provider = createProvider(model);
+
  const messages = await cria
  .prompt(provider)
  .system("You are helpful.")
  .user(userQuestion)
- .render({ budget });
+ .render({ budget: 128_000 });
+
  const { text } = await generateText({ model, messages });
  ```
+
  </details>
 
- ## Evaluation (LLM-as-a-judge)
+ <details>
+ <summary><strong>Redis (conversation summaries)</strong></summary>
+
+ ```ts
+ import { RedisStore } from "@fastpaca/cria/memory/redis";
+ import type { StoredSummary } from "@fastpaca/cria";
+
+ const store = new RedisStore<StoredSummary>({
+ host: "localhost",
+ port: 6379,
+ });
+
+ const messages = await cria
+ .prompt(provider)
+ .system("You are a helpful assistant.")
+ .summary(conversation, { id: "conv-123", store, priority: 2 })
+ .last(conversation, { n: 20 })
+ .user(query)
+ .render({ budget: 128_000 });
+ ```
+
+ </details>
 
- Use the `@fastpaca/cria/eval` entrypoint for judge-style evaluation helpers.
+ <details>
+ <summary><strong>Postgres (conversation summaries)</strong></summary>
 
  ```ts
- import { c, cria } from "@fastpaca/cria";
- import { createProvider } from "@fastpaca/cria/ai-sdk";
- import { createJudge } from "@fastpaca/cria/eval";
- import { openai } from "@ai-sdk/openai";
+ import { PostgresStore } from "@fastpaca/cria/memory/postgres";
+ import type { StoredSummary } from "@fastpaca/cria";
 
- const judge = createJudge({
- target: createProvider(openai("gpt-4o")),
- evaluator: createProvider(openai("gpt-4o-mini")),
+ const store = new PostgresStore<StoredSummary>({
+ connectionString: "postgres://user:pass@localhost/mydb",
  });
 
- const prompt = await cria
- .prompt()
- .system("You are a helpful customer support agent.")
- .user("How do I update my payment method?")
- .build();
+ const messages = await cria
+ .prompt(provider)
+ .system("You are a helpful assistant.")
+ .summary(conversation, { id: "conv-123", store, priority: 2 })
+ .last(conversation, { n: 20 })
+ .user(query)
+ .render({ budget: 128_000 });
+ ```
 
- await judge(prompt).toPass(c`Helpfulness in addressing the user's question`);
+ </details>
+
+ <details>
+ <summary><strong>Chroma (vector search)</strong></summary>
+
+ ```ts
+ import { ChromaClient } from "chromadb";
+ import { ChromaStore } from "@fastpaca/cria/memory/chroma";
+
+ const client = new ChromaClient({ path: "http://localhost:8000" });
+ const collection = await client.getOrCreateCollection({ name: "my-docs" });
+
+ const store = new ChromaStore({
+ collection,
+ embed: async (text) => await getEmbedding(text),
+ });
+
+ const messages = await cria
+ .prompt(provider)
+ .system("You are a research assistant.")
+ .vectorSearch({ store, query, limit: 10 })
+ .user(query)
+ .render({ budget: 128_000 });
  ```
 
- ## Roadmap
-
- **Done**
-
- - [x] Fluent DSL and priority-based eviction
- - [x] Providers/Integrations
- - [x] OpenAI (Chat Completions + Responses)
- - [x] Anthropic
- - [x] AI SDK
- - [x] Memory:
- - [x] Key Value Stores for Summaries
- - [x] Redis
- - [x] Postgres
- - [x] Vector Store / RAG
- - [x] Chroma
- - [x] Qdrant
- - [x] Observability
- - [x] render hooks
- - [x] OpenTelemetry
- - [x] Prompt eval / testing functionality
-
- **Planned**
-
- - [ ] Next.js adapter
- - [ ] Visualization tool
- - [ ] Seamless provider integration (type system, no hoops)
+ </details>
 
- ## Contributing
+ <details>
+ <summary><strong>Qdrant (vector search)</strong></summary>
+
+ ```ts
+ import { QdrantClient } from "@qdrant/js-client-rest";
+ import { QdrantStore } from "@fastpaca/cria/memory/qdrant";
+
+ const client = new QdrantClient({ url: "http://localhost:6333" });
+
+ const store = new QdrantStore({
+ client,
+ collectionName: "my-docs",
+ embed: async (text) => await getEmbedding(text),
+ });
+
+ const messages = await cria
+ .prompt(provider)
+ .system("You are a research assistant.")
+ .vectorSearch({ store, query, limit: 10 })
+ .user(query)
+ .render({ budget: 128_000 });
+ ```
 
- - Issues and PRs are welcome.
- - Keep changes small and focused.
- - If you add a feature, include a short example or doc note.
+ </details>
 
- ## Support
+ ## Documentation
 
- - Open a GitHub issue for bugs or feature requests.
- - For quick questions, include a minimal repro or snippet.
+ - [Quickstart](docs/quickstart.md)
+ - [RAG / vector search](docs/how-to/rag.md)
+ - [Summarize long history](docs/how-to/summarize-history.md)
+ - [Fit & compaction](docs/how-to/fit-and-compaction.md)
+ - [Prompt evaluation](docs/how-to/prompt-evaluation.md)
+ - [Full documentation](docs/README.md)
 
  ## FAQ
 
- - **Does this replace my LLM SDK?** No - Cria builds prompt structures. You still use your SDK to call the model.
- - **How do I tune token budgets?** Pass `budget` to `render()` and set priorities on regions; see [docs/how-to/fit-and-compaction.md](docs/how-to/fit-and-compaction.md).
- - **Is this production-ready?** Not yet! It is a work in progress and you should test it out before you run this in production.
+ **What does Cria output?**
+ Prompt structures/messages (via a provider adapter). You pass the rendered output into your existing LLM SDK call.
+
+ **What works out of the box?**
+ Provider adapters for OpenAI (Chat Completions + Responses), Anthropic, and Vercel AI SDK; store adapters for Redis, Postgres, Chroma, and Qdrant.
+
+ **How do I validate component swaps?**
+ Swap via adapters, diff the rendered prompt output, and run prompt eval/tests to catch drift.
+
+ **What's the API stability?**
+ We use Cria in production, but the API may change before 2.0. Pin versions and follow the changelog.
+
+ ## Contributing
+
+ Issues and PRs welcome. Keep changes small and focused.
 
  ## License