@fastpaca/cria 1.5.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (70)
  1. package/README.md +269 -72
  2. package/dist/dsl/builder.d.ts +101 -41
  3. package/dist/dsl/builder.d.ts.map +1 -1
  4. package/dist/dsl/builder.js +238 -48
  5. package/dist/dsl/builder.js.map +1 -1
  6. package/dist/dsl/index.d.ts +12 -6
  7. package/dist/dsl/index.d.ts.map +1 -1
  8. package/dist/dsl/index.js +10 -0
  9. package/dist/dsl/index.js.map +1 -1
  10. package/dist/dsl/strategies.d.ts +2 -1
  11. package/dist/dsl/strategies.d.ts.map +1 -1
  12. package/dist/dsl/strategies.js +5 -4
  13. package/dist/dsl/strategies.js.map +1 -1
  14. package/dist/dsl/summary.d.ts +2 -1
  15. package/dist/dsl/summary.d.ts.map +1 -1
  16. package/dist/dsl/summary.js.map +1 -1
  17. package/dist/dsl/vector-search.d.ts +8 -41
  18. package/dist/dsl/vector-search.d.ts.map +1 -1
  19. package/dist/dsl/vector-search.js +10 -62
  20. package/dist/dsl/vector-search.js.map +1 -1
  21. package/dist/eval/index.d.ts +6 -3
  22. package/dist/eval/index.d.ts.map +1 -1
  23. package/dist/eval/judge.d.ts +2 -1
  24. package/dist/eval/judge.d.ts.map +1 -1
  25. package/dist/eval/judge.js.map +1 -1
  26. package/dist/index.d.ts +10 -5
  27. package/dist/index.d.ts.map +1 -1
  28. package/dist/index.js +6 -3
  29. package/dist/index.js.map +1 -1
  30. package/dist/instrumentation/otel.d.ts.map +1 -1
  31. package/dist/instrumentation/otel.js +124 -11
  32. package/dist/instrumentation/otel.js.map +1 -1
  33. package/dist/memory/chroma/index.js +4 -4
  34. package/dist/protocols/chat-completions.d.ts +38 -0
  35. package/dist/protocols/chat-completions.d.ts.map +1 -0
  36. package/dist/protocols/chat-completions.js +94 -0
  37. package/dist/protocols/chat-completions.js.map +1 -0
  38. package/dist/protocols/responses.d.ts +80 -0
  39. package/dist/protocols/responses.d.ts.map +1 -0
  40. package/dist/protocols/responses.js +148 -0
  41. package/dist/protocols/responses.js.map +1 -0
  42. package/dist/provider.d.ts +107 -0
  43. package/dist/provider.d.ts.map +1 -0
  44. package/dist/provider.js +60 -0
  45. package/dist/provider.js.map +1 -0
  46. package/dist/providers/ai-sdk.d.ts +21 -6
  47. package/dist/providers/ai-sdk.d.ts.map +1 -1
  48. package/dist/providers/ai-sdk.js +83 -43
  49. package/dist/providers/ai-sdk.js.map +1 -1
  50. package/dist/providers/anthropic.d.ts +28 -8
  51. package/dist/providers/anthropic.d.ts.map +1 -1
  52. package/dist/providers/anthropic.js +314 -69
  53. package/dist/providers/anthropic.js.map +1 -1
  54. package/dist/providers/openai.d.ts +67 -25
  55. package/dist/providers/openai.d.ts.map +1 -1
  56. package/dist/providers/openai.js +365 -89
  57. package/dist/providers/openai.js.map +1 -1
  58. package/dist/render.d.ts +3 -2
  59. package/dist/render.d.ts.map +1 -1
  60. package/dist/render.js +61 -13
  61. package/dist/render.js.map +1 -1
  62. package/dist/types.d.ts +49 -40
  63. package/dist/types.d.ts.map +1 -1
  64. package/dist/types.js +2 -21
  65. package/dist/types.js.map +1 -1
  66. package/package.json +20 -17
  67. package/dist/testing/plaintext.d.ts +0 -28
  68. package/dist/testing/plaintext.d.ts.map +0 -1
  69. package/dist/testing/plaintext.js +0 -71
  70. package/dist/testing/plaintext.js.map +0 -1
package/README.md CHANGED
@@ -1,13 +1,7 @@
  <h1 align="center">Cria</h1>

- > **Note:** Cria is under active development. We're iterating heavily and the API may change before 2.0. Use in production at your own discretion.
-
  <p align="center">
- <i>Your prompts deserve the same structure as your code.</i>
- </p>
-
- <p align="center">
- <b><i>Cria turns prompts into composable components with explicit roles and strategies, and works with your existing environment & frameworks.</i></b>
+ TypeScript prompt architecture for fast-moving teams and engineers.
  </p>

  <p align="center">
@@ -22,34 +16,159 @@
  </a>
  </p>

- Cria is a lightweight prompt composition library for structured prompt engineering. Build prompts as components, keep behavior predictable, and reuse the same structure across providers. Runs on Node, Deno, Bun, and Edge; adapters require their SDKs.
+ The LLM space moves fast. New models drop often. Providers change APIs. Better vector stores emerge. New memory systems appear. **Your prompts shouldn't break every time the stack evolves.**
+
+ Cria is prompt architecture as code: keep the same prompt logic and swap the building blocks underneath when you need to upgrade.
+
+ ```ts
+ import { cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/openai";
+ import OpenAI from "openai";
+
+ const client = new OpenAI();
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a research assistant.")
+   .summary(conversation, { id: "history", store: memory })
+   .vectorSearch({ store, query, limit: 8 })
+   .user(query)
+   .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
+ ```
+
+ ## Why Cria?
+
+ When you run LLM features in production, you need to:
+
+ 1. **Build prompts that last** — Swap providers, models, memory, or retrieval without rewriting prompt logic. A/B test components as the stack evolves.
+ 2. **Test like code** — Evaluate prompts with LLM-as-a-judge. Run tests in CI. Catch drift when you swap building blocks.
+ 3. **Inspect what runs** — See exactly what gets sent to the model. Debug token budgets. Spot when your RAG input pollutes the context. *(Local DevTools-style inspector: planned)*
+
+ Cria gives you composable prompt blocks, explicit token budgets, and building blocks you can easily customise and adapt, so you can move fast without breaking prompts.
+
+ ## What you get
+
+ | Capability | Status |
+ | --- | --- |
+ | Component swapping via adapters | ✅ |
+ | Memory + vector search adapters | ✅ |
+ | Token budgeting | ✅ |
+ | Fit & compaction controls | ✅ |
+ | Conversation summaries | ✅ |
+ | OpenTelemetry integration | ✅ |
+ | Prompt eval/test helpers | ✅ |
+ | Local prompt inspector (DevTools-style) | planned |
+
+ ## Quick start
+
+ ```bash
+ npm install @fastpaca/cria
+ ```
+
+ ```ts
+ import { cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/openai";
+ import OpenAI from "openai";
+
+ const client = new OpenAI();
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a helpful assistant.")
+   .user("What is the capital of France?")
+   .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
+ ```
+
+ ## Core patterns
+
+ <details>
+ <summary><strong>RAG with vector search</strong></summary>

  ```ts
  const messages = await cria
    .prompt(provider)
    .system("You are a research assistant.")
-   .vectorSearch({ store, query: question, limit: 10 })
-   .providerScope(provider, (p) =>
-     p.summary(conversation, { store: memory }).last(conversation, { N: 20 })
-   )
-   .user(question)
-   .render({ budget: 200_000 });
+   .vectorSearch({ store: qdrant, query, limit: 10 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Summarize long conversation history</strong></summary>
+
+ ```ts
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a helpful assistant.")
+   .summary(conversation, { id: "conv", store: redis, priority: 2 })
+   .last(conversation, { n: 20 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Token budgeting and compaction</strong></summary>
+
+ ```ts
+ const messages = await cria
+   .prompt(provider)
+   .system(SYSTEM_PROMPT)
+   // Dropped first when the budget is tight
+   .omit(examples, { priority: 3 })
+   // Summaries run ad hoc once we hit budget limits
+   .summary(conversation, { id: "conv", store: redis, priority: 2 })
+   // Must be retained, but limited to 10 entries
+   .vectorSearch({ store: qdrant, query, limit: 10 })
+   .user(query)
+   // 128k token budget; once we hit it, strategies run based
+   // on priority & usage (e.g. summaries will trigger).
+   .render({ budget: 128_000 });
  ```

- Start with **[Quickstart](docs/quickstart.md)**, then use **[Docs](docs/README.md)** to jump to the right how-to.
+ </details>
+
+ <details>
+ <summary><strong>Evaluate prompts like code</strong></summary>
+
+ ```ts
+ import { c, cria } from "@fastpaca/cria";
+ import { createProvider } from "@fastpaca/cria/ai-sdk";
+ import { createJudge } from "@fastpaca/cria/eval";
+ import { openai } from "@ai-sdk/openai";
+
+ const judge = createJudge({
+   target: createProvider(openai("gpt-4o")),
+   evaluator: createProvider(openai("gpt-4o-mini")),
+ });
+
+ const prompt = await cria
+   .prompt()
+   .system("You are a helpful customer support agent.")
+   .user("How do I update my payment method?")
+   .build();

- ## Use Cria when you need...
+ await judge(prompt).toPass(c`Provides clear, actionable steps`);
+ ```

- - **Need RAG?** Call `.vectorSearch({ store, query })`.
- - **Need a summary for long conversations?** Use `.summary(...)`.
- - **Need to cap history but keep structure?** Use `Last(...)`.
- - **Need to drop optional context when the context window is full?** Use `.omit(...)`.
- - **Using AI SDK?** Plug and play with `@fastpaca/cria/ai-sdk`!
+ </details>

- ## Providers
+ ## Works with

  <details>
- <summary><strong>OpenAI Chat Completions</strong></summary>
+ <summary><strong>OpenAI (Chat Completions)</strong></summary>

  ```ts
  import OpenAI from "openai";
@@ -57,18 +176,22 @@ import { createProvider } from "@fastpaca/cria/openai";
  import { cria } from "@fastpaca/cria";

  const client = new OpenAI();
- const provider = createProvider(client, "gpt-4o-mini");
+ const model = "gpt-5-nano";
+ const provider = createProvider(client, model);
+
  const messages = await cria
    .prompt(provider)
    .system("You are helpful.")
    .user(userQuestion)
-   .render({ budget });
- const response = await client.chat.completions.create({ model: "gpt-4o-mini", messages });
+   .render({ budget: 128_000 });
+
+ const response = await client.chat.completions.create({ model, messages });
  ```
+
  </details>

  <details>
- <summary><strong>OpenAI Responses</strong></summary>
+ <summary><strong>OpenAI (Responses)</strong></summary>

  ```ts
  import OpenAI from "openai";
@@ -76,14 +199,18 @@ import { createResponsesProvider } from "@fastpaca/cria/openai";
  import { cria } from "@fastpaca/cria";

  const client = new OpenAI();
- const provider = createResponsesProvider(client, "gpt-5-nano");
+ const model = "gpt-5-nano";
+ const provider = createResponsesProvider(client, model);
+
  const input = await cria
    .prompt(provider)
    .system("You are helpful.")
    .user(userQuestion)
-   .render({ budget });
- const response = await client.responses.create({ model: "gpt-5-nano", input });
+   .render({ budget: 128_000 });
+
+ const response = await client.responses.create({ model, input });
  ```
+
  </details>

  <details>
@@ -95,14 +222,18 @@ import { createProvider } from "@fastpaca/cria/anthropic";
  import { cria } from "@fastpaca/cria";

  const client = new Anthropic();
- const provider = createProvider(client, "claude-haiku-4-5");
+ const model = "claude-sonnet-4";
+ const provider = createProvider(client, model);
+
  const { system, messages } = await cria
    .prompt(provider)
    .system("You are helpful.")
    .user(userQuestion)
-   .render({ budget });
- const response = await client.messages.create({ model: "claude-haiku-4-5", system, messages });
+   .render({ budget: 128_000 });
+
+ const response = await client.messages.create({ model, system, messages });
  ```
+
  </details>

  <details>
@@ -114,73 +245,139 @@ import { cria } from "@fastpaca/cria";
  import { generateText } from "ai";

  const provider = createProvider(model);
+
  const messages = await cria
    .prompt(provider)
    .system("You are helpful.")
    .user(userQuestion)
-   .render({ budget });
+   .render({ budget: 128_000 });
+
  const { text } = await generateText({ model, messages });
  ```
+
  </details>

- ## Evaluation (LLM-as-a-judge)
+ <details>
+ <summary><strong>Redis (conversation summaries)</strong></summary>

- Use the `@fastpaca/cria/eval` entrypoint for judge-style evaluation helpers.
+ ```ts
+ import { RedisStore } from "@fastpaca/cria/memory/redis";
+ import type { StoredSummary } from "@fastpaca/cria";
+
+ const store = new RedisStore<StoredSummary>({
+   host: "localhost",
+   port: 6379,
+ });
+
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a helpful assistant.")
+   .summary(conversation, { id: "conv-123", store, priority: 2 })
+   .last(conversation, { n: 20 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Postgres (conversation summaries)</strong></summary>

  ```ts
- import { c, cria } from "@fastpaca/cria";
- import { createProvider } from "@fastpaca/cria/ai-sdk";
- import { createJudge } from "@fastpaca/cria/eval";
- import { openai } from "@ai-sdk/openai";
+ import { PostgresStore } from "@fastpaca/cria/memory/postgres";
+ import type { StoredSummary } from "@fastpaca/cria";

- const judge = createJudge({
-   target: createProvider(openai("gpt-4o")),
-   evaluator: createProvider(openai("gpt-4o-mini")),
+ const store = new PostgresStore<StoredSummary>({
+   connectionString: "postgres://user:pass@localhost/mydb",
  });

- const prompt = await cria
-   .prompt()
-   .system("You are a helpful customer support agent.")
-   .user("How do I update my payment method?")
-   .build();
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a helpful assistant.")
+   .summary(conversation, { id: "conv-123", store, priority: 2 })
+   .last(conversation, { n: 20 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```
+
+ </details>
+
+ <details>
+ <summary><strong>Chroma (vector search)</strong></summary>
+
+ ```ts
+ import { ChromaClient } from "chromadb";
+ import { ChromaStore } from "@fastpaca/cria/memory/chroma";

- await judge(prompt).toPass(c`Helpfulness in addressing the user's question`);
+ const client = new ChromaClient({ path: "http://localhost:8000" });
+ const collection = await client.getOrCreateCollection({ name: "my-docs" });
+
+ const store = new ChromaStore({
+   collection,
+   embed: async (text) => await getEmbedding(text),
+ });
+
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a research assistant.")
+   .vectorSearch({ store, query, limit: 10 })
+   .user(query)
+   .render({ budget: 128_000 });
  ```

- ## Roadmap
+ </details>

- **Done**
+ <details>
+ <summary><strong>Qdrant (vector search)</strong></summary>

- - [x] Fluent DSL and priority-based eviction
- - [x] Components: Region, Message, Truncate, Omit, Last, Summary, VectorSearch, ToolCall, ToolResult, Reasoning, Examples, CodeBlock, Separator
- - [x] Providers: OpenAI (Chat Completions + Responses), Anthropic, AI SDK
- - [x] AI SDK helpers: Messages component, DEFAULT_PRIORITIES
- - [x] Memory: InMemoryStore, Redis, Postgres, Chroma, Qdrant
- - [x] Observability: render hooks, validation schemas, OpenTelemetry
- - [x] Prompt eval / testing functionality
+ ```ts
+ import { QdrantClient } from "@qdrant/js-client-rest";
+ import { QdrantStore } from "@fastpaca/cria/memory/qdrant";

- **Planned**
+ const client = new QdrantClient({ url: "http://localhost:6333" });

- - [ ] Next.js adapter
- - [ ] GenAI semantic conventions for OpenTelemetry
- - [ ] Visualization tool
+ const store = new QdrantStore({
+   client,
+   collectionName: "my-docs",
+   embed: async (text) => await getEmbedding(text),
+ });

- ## Contributing
+ const messages = await cria
+   .prompt(provider)
+   .system("You are a research assistant.")
+   .vectorSearch({ store, query, limit: 10 })
+   .user(query)
+   .render({ budget: 128_000 });
+ ```

- - Issues and PRs are welcome.
- - Keep changes small and focused.
- - If you add a feature, include a short example or doc note.
+ </details>

- ## Support
+ ## Documentation

- - Open a GitHub issue for bugs or feature requests.
- - For quick questions, include a minimal repro or snippet.
+ - [Quickstart](docs/quickstart.md)
+ - [RAG / vector search](docs/how-to/rag.md)
+ - [Summarize long history](docs/how-to/summarize-history.md)
+ - [Fit & compaction](docs/how-to/fit-and-compaction.md)
+ - [Prompt evaluation](docs/how-to/prompt-evaluation.md)
+ - [Full documentation](docs/README.md)

  ## FAQ

- - **Does this replace my LLM SDK?** No - Cria builds prompt structures. You still use your SDK to call the model.
- - **How do I tune token budgets?** Pass `budget` to `render()` and set priorities on regions; see [docs/how-to/fit-and-compaction.md](docs/how-to/fit-and-compaction.md).
- - **Is this production-ready?** Not yet! It is a work in progress and you should test it out before you run this in production.
+ **What does Cria output?**
+ Prompt structures/messages (via a provider adapter). You pass the rendered output into your existing LLM SDK call.
+
+ **What works out of the box?**
+ Provider adapters for OpenAI (Chat Completions + Responses), Anthropic, and the Vercel AI SDK; store adapters for Redis, Postgres, Chroma, and Qdrant.
+
+ **How do I validate component swaps?**
+ Swap via adapters, diff the rendered prompt output, and run prompt evals/tests to catch drift.
+
+ **What's the API stability?**
+ We use Cria in production, but the API may change before 2.0. Pin versions and follow the changelog.
+
+ ## Contributing
+
+ Issues and PRs welcome. Keep changes small and focused.

  ## License