ai 6.0.30 → 6.0.32
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +13 -0
- package/dist/index.js +1 -1
- package/dist/index.mjs +1 -1
- package/dist/internal/index.js +1 -1
- package/dist/internal/index.mjs +1 -1
- package/docs/00-introduction/index.mdx +76 -0
- package/docs/02-foundations/01-overview.mdx +43 -0
- package/docs/02-foundations/02-providers-and-models.mdx +163 -0
- package/docs/02-foundations/03-prompts.mdx +620 -0
- package/docs/02-foundations/04-tools.mdx +160 -0
- package/docs/02-foundations/05-streaming.mdx +62 -0
- package/docs/02-foundations/index.mdx +43 -0
- package/docs/02-getting-started/00-choosing-a-provider.mdx +110 -0
- package/docs/02-getting-started/01-navigating-the-library.mdx +85 -0
- package/docs/02-getting-started/02-nextjs-app-router.mdx +556 -0
- package/docs/02-getting-started/03-nextjs-pages-router.mdx +542 -0
- package/docs/02-getting-started/04-svelte.mdx +627 -0
- package/docs/02-getting-started/05-nuxt.mdx +566 -0
- package/docs/02-getting-started/06-nodejs.mdx +512 -0
- package/docs/02-getting-started/07-expo.mdx +766 -0
- package/docs/02-getting-started/08-tanstack-start.mdx +583 -0
- package/docs/02-getting-started/index.mdx +44 -0
- package/docs/03-agents/01-overview.mdx +96 -0
- package/docs/03-agents/02-building-agents.mdx +367 -0
- package/docs/03-agents/03-workflows.mdx +370 -0
- package/docs/03-agents/04-loop-control.mdx +350 -0
- package/docs/03-agents/05-configuring-call-options.mdx +286 -0
- package/docs/03-agents/index.mdx +40 -0
- package/docs/03-ai-sdk-core/01-overview.mdx +33 -0
- package/docs/03-ai-sdk-core/05-generating-text.mdx +600 -0
- package/docs/03-ai-sdk-core/10-generating-structured-data.mdx +662 -0
- package/docs/03-ai-sdk-core/15-tools-and-tool-calling.mdx +1102 -0
- package/docs/03-ai-sdk-core/16-mcp-tools.mdx +375 -0
- package/docs/03-ai-sdk-core/20-prompt-engineering.mdx +144 -0
- package/docs/03-ai-sdk-core/25-settings.mdx +198 -0
- package/docs/03-ai-sdk-core/30-embeddings.mdx +247 -0
- package/docs/03-ai-sdk-core/31-reranking.mdx +218 -0
- package/docs/03-ai-sdk-core/35-image-generation.mdx +341 -0
- package/docs/03-ai-sdk-core/36-transcription.mdx +173 -0
- package/docs/03-ai-sdk-core/37-speech.mdx +167 -0
- package/docs/03-ai-sdk-core/40-middleware.mdx +480 -0
- package/docs/03-ai-sdk-core/45-provider-management.mdx +349 -0
- package/docs/03-ai-sdk-core/50-error-handling.mdx +149 -0
- package/docs/03-ai-sdk-core/55-testing.mdx +218 -0
- package/docs/03-ai-sdk-core/60-telemetry.mdx +313 -0
- package/docs/03-ai-sdk-core/65-devtools.mdx +107 -0
- package/docs/03-ai-sdk-core/index.mdx +88 -0
- package/docs/04-ai-sdk-ui/01-overview.mdx +44 -0
- package/docs/04-ai-sdk-ui/02-chatbot.mdx +1313 -0
- package/docs/04-ai-sdk-ui/03-chatbot-message-persistence.mdx +535 -0
- package/docs/04-ai-sdk-ui/03-chatbot-resume-streams.mdx +263 -0
- package/docs/04-ai-sdk-ui/03-chatbot-tool-usage.mdx +682 -0
- package/docs/04-ai-sdk-ui/04-generative-user-interfaces.mdx +389 -0
- package/docs/04-ai-sdk-ui/05-completion.mdx +186 -0
- package/docs/04-ai-sdk-ui/08-object-generation.mdx +344 -0
- package/docs/04-ai-sdk-ui/20-streaming-data.mdx +397 -0
- package/docs/04-ai-sdk-ui/21-error-handling.mdx +190 -0
- package/docs/04-ai-sdk-ui/21-transport.mdx +174 -0
- package/docs/04-ai-sdk-ui/24-reading-ui-message-streams.mdx +104 -0
- package/docs/04-ai-sdk-ui/25-message-metadata.mdx +152 -0
- package/docs/04-ai-sdk-ui/50-stream-protocol.mdx +477 -0
- package/docs/04-ai-sdk-ui/index.mdx +64 -0
- package/docs/05-ai-sdk-rsc/01-overview.mdx +45 -0
- package/docs/05-ai-sdk-rsc/02-streaming-react-components.mdx +209 -0
- package/docs/05-ai-sdk-rsc/03-generative-ui-state.mdx +279 -0
- package/docs/05-ai-sdk-rsc/03-saving-and-restoring-states.mdx +105 -0
- package/docs/05-ai-sdk-rsc/04-multistep-interfaces.mdx +282 -0
- package/docs/05-ai-sdk-rsc/05-streaming-values.mdx +158 -0
- package/docs/05-ai-sdk-rsc/06-loading-state.mdx +273 -0
- package/docs/05-ai-sdk-rsc/08-error-handling.mdx +96 -0
- package/docs/05-ai-sdk-rsc/09-authentication.mdx +42 -0
- package/docs/05-ai-sdk-rsc/10-migrating-to-ui.mdx +722 -0
- package/docs/05-ai-sdk-rsc/index.mdx +58 -0
- package/docs/06-advanced/01-prompt-engineering.mdx +96 -0
- package/docs/06-advanced/02-stopping-streams.mdx +184 -0
- package/docs/06-advanced/03-backpressure.mdx +173 -0
- package/docs/06-advanced/04-caching.mdx +169 -0
- package/docs/06-advanced/05-multiple-streamables.mdx +68 -0
- package/docs/06-advanced/06-rate-limiting.mdx +60 -0
- package/docs/06-advanced/07-rendering-ui-with-language-models.mdx +213 -0
- package/docs/06-advanced/08-model-as-router.mdx +120 -0
- package/docs/06-advanced/09-multistep-interfaces.mdx +115 -0
- package/docs/06-advanced/09-sequential-generations.mdx +55 -0
- package/docs/06-advanced/10-vercel-deployment-guide.mdx +117 -0
- package/docs/06-advanced/index.mdx +11 -0
- package/docs/07-reference/01-ai-sdk-core/01-generate-text.mdx +2142 -0
- package/docs/07-reference/01-ai-sdk-core/02-stream-text.mdx +3215 -0
- package/docs/07-reference/01-ai-sdk-core/03-generate-object.mdx +780 -0
- package/docs/07-reference/01-ai-sdk-core/04-stream-object.mdx +1140 -0
- package/docs/07-reference/01-ai-sdk-core/05-embed.mdx +190 -0
- package/docs/07-reference/01-ai-sdk-core/06-embed-many.mdx +171 -0
- package/docs/07-reference/01-ai-sdk-core/06-rerank.mdx +309 -0
- package/docs/07-reference/01-ai-sdk-core/10-generate-image.mdx +227 -0
- package/docs/07-reference/01-ai-sdk-core/11-transcribe.mdx +138 -0
- package/docs/07-reference/01-ai-sdk-core/12-generate-speech.mdx +214 -0
- package/docs/07-reference/01-ai-sdk-core/15-agent.mdx +203 -0
- package/docs/07-reference/01-ai-sdk-core/16-tool-loop-agent.mdx +449 -0
- package/docs/07-reference/01-ai-sdk-core/17-create-agent-ui-stream.mdx +148 -0
- package/docs/07-reference/01-ai-sdk-core/18-create-agent-ui-stream-response.mdx +168 -0
- package/docs/07-reference/01-ai-sdk-core/18-pipe-agent-ui-stream-to-response.mdx +144 -0
- package/docs/07-reference/01-ai-sdk-core/20-tool.mdx +196 -0
- package/docs/07-reference/01-ai-sdk-core/22-dynamic-tool.mdx +175 -0
- package/docs/07-reference/01-ai-sdk-core/23-create-mcp-client.mdx +410 -0
- package/docs/07-reference/01-ai-sdk-core/24-mcp-stdio-transport.mdx +68 -0
- package/docs/07-reference/01-ai-sdk-core/25-json-schema.mdx +94 -0
- package/docs/07-reference/01-ai-sdk-core/26-zod-schema.mdx +109 -0
- package/docs/07-reference/01-ai-sdk-core/27-valibot-schema.mdx +55 -0
- package/docs/07-reference/01-ai-sdk-core/28-output.mdx +342 -0
- package/docs/07-reference/01-ai-sdk-core/30-model-message.mdx +415 -0
- package/docs/07-reference/01-ai-sdk-core/31-ui-message.mdx +246 -0
- package/docs/07-reference/01-ai-sdk-core/32-validate-ui-messages.mdx +101 -0
- package/docs/07-reference/01-ai-sdk-core/33-safe-validate-ui-messages.mdx +113 -0
- package/docs/07-reference/01-ai-sdk-core/40-provider-registry.mdx +182 -0
- package/docs/07-reference/01-ai-sdk-core/42-custom-provider.mdx +121 -0
- package/docs/07-reference/01-ai-sdk-core/50-cosine-similarity.mdx +52 -0
- package/docs/07-reference/01-ai-sdk-core/60-wrap-language-model.mdx +59 -0
- package/docs/07-reference/01-ai-sdk-core/61-wrap-image-model.mdx +64 -0
- package/docs/07-reference/01-ai-sdk-core/65-language-model-v2-middleware.mdx +46 -0
- package/docs/07-reference/01-ai-sdk-core/66-extract-reasoning-middleware.mdx +68 -0
- package/docs/07-reference/01-ai-sdk-core/67-simulate-streaming-middleware.mdx +71 -0
- package/docs/07-reference/01-ai-sdk-core/68-default-settings-middleware.mdx +80 -0
- package/docs/07-reference/01-ai-sdk-core/69-add-tool-input-examples-middleware.mdx +155 -0
- package/docs/07-reference/01-ai-sdk-core/70-extract-json-middleware.mdx +147 -0
- package/docs/07-reference/01-ai-sdk-core/70-step-count-is.mdx +84 -0
- package/docs/07-reference/01-ai-sdk-core/71-has-tool-call.mdx +120 -0
- package/docs/07-reference/01-ai-sdk-core/75-simulate-readable-stream.mdx +94 -0
- package/docs/07-reference/01-ai-sdk-core/80-smooth-stream.mdx +145 -0
- package/docs/07-reference/01-ai-sdk-core/90-generate-id.mdx +43 -0
- package/docs/07-reference/01-ai-sdk-core/91-create-id-generator.mdx +89 -0
- package/docs/07-reference/01-ai-sdk-core/index.mdx +159 -0
- package/docs/07-reference/02-ai-sdk-ui/01-use-chat.mdx +446 -0
- package/docs/07-reference/02-ai-sdk-ui/02-use-completion.mdx +179 -0
- package/docs/07-reference/02-ai-sdk-ui/03-use-object.mdx +178 -0
- package/docs/07-reference/02-ai-sdk-ui/31-convert-to-model-messages.mdx +230 -0
- package/docs/07-reference/02-ai-sdk-ui/32-prune-messages.mdx +108 -0
- package/docs/07-reference/02-ai-sdk-ui/40-create-ui-message-stream.mdx +151 -0
- package/docs/07-reference/02-ai-sdk-ui/41-create-ui-message-stream-response.mdx +113 -0
- package/docs/07-reference/02-ai-sdk-ui/42-pipe-ui-message-stream-to-response.mdx +73 -0
- package/docs/07-reference/02-ai-sdk-ui/43-read-ui-message-stream.mdx +57 -0
- package/docs/07-reference/02-ai-sdk-ui/46-infer-ui-tools.mdx +99 -0
- package/docs/07-reference/02-ai-sdk-ui/47-infer-ui-tool.mdx +75 -0
- package/docs/07-reference/02-ai-sdk-ui/50-direct-chat-transport.mdx +333 -0
- package/docs/07-reference/02-ai-sdk-ui/index.mdx +89 -0
- package/docs/07-reference/03-ai-sdk-rsc/01-stream-ui.mdx +767 -0
- package/docs/07-reference/03-ai-sdk-rsc/02-create-ai.mdx +90 -0
- package/docs/07-reference/03-ai-sdk-rsc/03-create-streamable-ui.mdx +91 -0
- package/docs/07-reference/03-ai-sdk-rsc/04-create-streamable-value.mdx +48 -0
- package/docs/07-reference/03-ai-sdk-rsc/05-read-streamable-value.mdx +78 -0
- package/docs/07-reference/03-ai-sdk-rsc/06-get-ai-state.mdx +50 -0
- package/docs/07-reference/03-ai-sdk-rsc/07-get-mutable-ai-state.mdx +70 -0
- package/docs/07-reference/03-ai-sdk-rsc/08-use-ai-state.mdx +26 -0
- package/docs/07-reference/03-ai-sdk-rsc/09-use-actions.mdx +42 -0
- package/docs/07-reference/03-ai-sdk-rsc/10-use-ui-state.mdx +35 -0
- package/docs/07-reference/03-ai-sdk-rsc/11-use-streamable-value.mdx +46 -0
- package/docs/07-reference/03-ai-sdk-rsc/20-render.mdx +262 -0
- package/docs/07-reference/03-ai-sdk-rsc/index.mdx +67 -0
- package/docs/07-reference/04-stream-helpers/01-ai-stream.mdx +89 -0
- package/docs/07-reference/04-stream-helpers/02-streaming-text-response.mdx +79 -0
- package/docs/07-reference/04-stream-helpers/05-stream-to-response.mdx +108 -0
- package/docs/07-reference/04-stream-helpers/07-openai-stream.mdx +77 -0
- package/docs/07-reference/04-stream-helpers/08-anthropic-stream.mdx +79 -0
- package/docs/07-reference/04-stream-helpers/09-aws-bedrock-stream.mdx +91 -0
- package/docs/07-reference/04-stream-helpers/10-aws-bedrock-anthropic-stream.mdx +96 -0
- package/docs/07-reference/04-stream-helpers/10-aws-bedrock-messages-stream.mdx +96 -0
- package/docs/07-reference/04-stream-helpers/11-aws-bedrock-cohere-stream.mdx +93 -0
- package/docs/07-reference/04-stream-helpers/12-aws-bedrock-llama-2-stream.mdx +93 -0
- package/docs/07-reference/04-stream-helpers/13-cohere-stream.mdx +78 -0
- package/docs/07-reference/04-stream-helpers/14-google-generative-ai-stream.mdx +85 -0
- package/docs/07-reference/04-stream-helpers/15-hugging-face-stream.mdx +84 -0
- package/docs/07-reference/04-stream-helpers/16-langchain-adapter.mdx +98 -0
- package/docs/07-reference/04-stream-helpers/16-llamaindex-adapter.mdx +70 -0
- package/docs/07-reference/04-stream-helpers/17-mistral-stream.mdx +81 -0
- package/docs/07-reference/04-stream-helpers/18-replicate-stream.mdx +83 -0
- package/docs/07-reference/04-stream-helpers/19-inkeep-stream.mdx +80 -0
- package/docs/07-reference/04-stream-helpers/index.mdx +103 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-api-call-error.mdx +30 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-download-error.mdx +27 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-empty-response-body-error.mdx +24 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-invalid-argument-error.mdx +26 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-invalid-data-content-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-invalid-data-content.mdx +26 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-invalid-message-role-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-invalid-prompt-error.mdx +47 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-invalid-response-data-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-invalid-tool-approval-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-invalid-tool-input-error.mdx +27 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-json-parse-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-load-api-key-error.mdx +24 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-load-setting-error.mdx +24 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-message-conversion-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-no-content-generated-error.mdx +24 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-no-image-generated-error.mdx +36 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-no-object-generated-error.mdx +43 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-no-speech-generated-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-no-such-model-error.mdx +26 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-no-such-provider-error.mdx +28 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-no-such-tool-error.mdx +26 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-no-transcript-generated-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-retry-error.mdx +27 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-too-many-embedding-values-for-call-error.mdx +27 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-tool-call-not-found-for-approval-error.mdx +26 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-tool-call-repair-error.mdx +28 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-type-validation-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/ai-unsupported-functionality-error.mdx +25 -0
- package/docs/07-reference/05-ai-sdk-errors/index.mdx +38 -0
- package/docs/07-reference/index.mdx +34 -0
- package/docs/08-migration-guides/00-versioning.mdx +46 -0
- package/docs/08-migration-guides/24-migration-guide-6-0.mdx +823 -0
- package/docs/08-migration-guides/25-migration-guide-5-0-data.mdx +882 -0
- package/docs/08-migration-guides/26-migration-guide-5-0.mdx +3427 -0
- package/docs/08-migration-guides/27-migration-guide-4-2.mdx +99 -0
- package/docs/08-migration-guides/28-migration-guide-4-1.mdx +14 -0
- package/docs/08-migration-guides/29-migration-guide-4-0.mdx +1157 -0
- package/docs/08-migration-guides/36-migration-guide-3-4.mdx +14 -0
- package/docs/08-migration-guides/37-migration-guide-3-3.mdx +64 -0
- package/docs/08-migration-guides/38-migration-guide-3-2.mdx +46 -0
- package/docs/08-migration-guides/39-migration-guide-3-1.mdx +168 -0
- package/docs/08-migration-guides/index.mdx +22 -0
- package/docs/09-troubleshooting/01-azure-stream-slow.mdx +33 -0
- package/docs/09-troubleshooting/02-client-side-function-calls-not-invoked.mdx +22 -0
- package/docs/09-troubleshooting/03-server-actions-in-client-components.mdx +40 -0
- package/docs/09-troubleshooting/04-strange-stream-output.mdx +36 -0
- package/docs/09-troubleshooting/05-streamable-ui-errors.mdx +16 -0
- package/docs/09-troubleshooting/05-tool-invocation-missing-result.mdx +106 -0
- package/docs/09-troubleshooting/06-streaming-not-working-when-deployed.mdx +31 -0
- package/docs/09-troubleshooting/06-streaming-not-working-when-proxied.mdx +31 -0
- package/docs/09-troubleshooting/06-timeout-on-vercel.mdx +60 -0
- package/docs/09-troubleshooting/07-unclosed-streams.mdx +34 -0
- package/docs/09-troubleshooting/08-use-chat-failed-to-parse-stream.mdx +26 -0
- package/docs/09-troubleshooting/09-client-stream-error.mdx +25 -0
- package/docs/09-troubleshooting/10-use-chat-tools-no-response.mdx +32 -0
- package/docs/09-troubleshooting/11-use-chat-custom-request-options.mdx +149 -0
- package/docs/09-troubleshooting/12-typescript-performance-zod.mdx +46 -0
- package/docs/09-troubleshooting/12-use-chat-an-error-occurred.mdx +59 -0
- package/docs/09-troubleshooting/13-repeated-assistant-messages.mdx +73 -0
- package/docs/09-troubleshooting/14-stream-abort-handling.mdx +73 -0
- package/docs/09-troubleshooting/14-tool-calling-with-structured-outputs.mdx +48 -0
- package/docs/09-troubleshooting/15-abort-breaks-resumable-streams.mdx +55 -0
- package/docs/09-troubleshooting/15-stream-text-not-working.mdx +33 -0
- package/docs/09-troubleshooting/16-streaming-status-delay.mdx +63 -0
- package/docs/09-troubleshooting/17-use-chat-stale-body-data.mdx +141 -0
- package/docs/09-troubleshooting/18-ontoolcall-type-narrowing.mdx +66 -0
- package/docs/09-troubleshooting/19-unsupported-model-version.mdx +50 -0
- package/docs/09-troubleshooting/20-no-object-generated-content-filter.mdx +72 -0
- package/docs/09-troubleshooting/30-model-is-not-assignable-to-type.mdx +21 -0
- package/docs/09-troubleshooting/40-typescript-cannot-find-namespace-jsx.mdx +24 -0
- package/docs/09-troubleshooting/50-react-maximum-update-depth-exceeded.mdx +39 -0
- package/docs/09-troubleshooting/60-jest-cannot-find-module-ai-rsc.mdx +22 -0
- package/docs/09-troubleshooting/index.mdx +11 -0
- package/package.json +7 -3
|
@@ -0,0 +1,198 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: Settings
|
|
3
|
+
description: Learn how to configure the AI SDK.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Settings
|
|
7
|
+
|
|
8
|
+
Large language models (LLMs) typically provide settings to augment their output.
|
|
9
|
+
|
|
10
|
+
All AI SDK functions support the following common settings in addition to the model, the [prompt](./prompts), and additional provider-specific settings:
|
|
11
|
+
|
|
12
|
+
```ts highlight="3-5"
|
|
13
|
+
const result = await generateText({
|
|
14
|
+
model: __MODEL__,
|
|
15
|
+
maxOutputTokens: 512,
|
|
16
|
+
temperature: 0.3,
|
|
17
|
+
maxRetries: 5,
|
|
18
|
+
prompt: 'Invent a new holiday and describe its traditions.',
|
|
19
|
+
});
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
<Note>
|
|
23
|
+
Some providers do not support all common settings. If you use a setting with a
|
|
24
|
+
provider that does not support it, a warning will be generated. You can check
|
|
25
|
+
the `warnings` property in the result object to see if any warnings were
|
|
26
|
+
generated.
|
|
27
|
+
</Note>
|
|
28
|
+
|
|
29
|
+
### `maxOutputTokens`
|
|
30
|
+
|
|
31
|
+
Maximum number of tokens to generate.
|
|
32
|
+
|
|
33
|
+
### `temperature`
|
|
34
|
+
|
|
35
|
+
Temperature setting.
|
|
36
|
+
|
|
37
|
+
The value is passed through to the provider. The range depends on the provider and model.
|
|
38
|
+
For most providers, `0` means almost deterministic results, and higher values mean more randomness.
|
|
39
|
+
|
|
40
|
+
It is recommended to set either `temperature` or `topP`, but not both.
|
|
41
|
+
|
|
42
|
+
<Note>In AI SDK 5.0, temperature is no longer set to `0` by default.</Note>
|
|
43
|
+
|
|
44
|
+
### `topP`
|
|
45
|
+
|
|
46
|
+
Nucleus sampling.
|
|
47
|
+
|
|
48
|
+
The value is passed through to the provider. The range depends on the provider and model.
|
|
49
|
+
For most providers, nucleus sampling is a number between 0 and 1.
|
|
50
|
+
E.g. 0.1 would mean that only tokens with the top 10% probability mass are considered.
|
|
51
|
+
|
|
52
|
+
It is recommended to set either `temperature` or `topP`, but not both.
|
|
53
|
+
|
|
54
|
+
### `topK`
|
|
55
|
+
|
|
56
|
+
Only sample from the top K options for each subsequent token.
|
|
57
|
+
|
|
58
|
+
Used to remove "long tail" low probability responses.
|
|
59
|
+
Recommended for advanced use cases only. You usually only need to use `temperature`.
|
|
60
|
+
|
|
61
|
+
### `presencePenalty`
|
|
62
|
+
|
|
63
|
+
The presence penalty affects how likely the model is to repeat information that is already in the prompt.
|
|
64
|
+
|
|
65
|
+
The value is passed through to the provider. The range depends on the provider and model.
|
|
66
|
+
For most providers, `0` means no penalty.
|
|
67
|
+
|
|
68
|
+
### `frequencyPenalty`
|
|
69
|
+
|
|
70
|
+
The frequency penalty affects how likely the model is to repeatedly use the same words or phrases.
|
|
71
|
+
|
|
72
|
+
The value is passed through to the provider. The range depends on the provider and model.
|
|
73
|
+
For most providers, `0` means no penalty.
|
|
74
|
+
|
|
75
|
+
### `stopSequences`
|
|
76
|
+
|
|
77
|
+
The stop sequences to use for stopping the text generation.
|
|
78
|
+
|
|
79
|
+
If set, the model will stop generating text when one of the stop sequences is generated.
|
|
80
|
+
Providers may have limits on the number of stop sequences.
|
|
81
|
+
|
|
82
|
+
### `seed`
|
|
83
|
+
|
|
84
|
+
The seed (integer) to use for random sampling.
|
|
85
|
+
If set and supported by the model, calls will generate deterministic results.
|
|
86
|
+
|
|
87
|
+
### `maxRetries`
|
|
88
|
+
|
|
89
|
+
Maximum number of retries. Set to 0 to disable retries. Default: `2`.
|
|
90
|
+
|
|
91
|
+
### `abortSignal`
|
|
92
|
+
|
|
93
|
+
An optional abort signal that can be used to cancel the call.
|
|
94
|
+
|
|
95
|
+
The abort signal can e.g. be forwarded from a user interface to cancel the call,
|
|
96
|
+
or to define a timeout using `AbortSignal.timeout`.
|
|
97
|
+
|
|
98
|
+
#### Example: AbortSignal.timeout
|
|
99
|
+
|
|
100
|
+
```ts
|
|
101
|
+
const result = await generateText({
|
|
102
|
+
model: __MODEL__,
|
|
103
|
+
prompt: 'Invent a new holiday and describe its traditions.',
|
|
104
|
+
abortSignal: AbortSignal.timeout(5000), // 5 seconds
|
|
105
|
+
});
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
### `timeout`
|
|
109
|
+
|
|
110
|
+
An optional timeout in milliseconds. The call will be aborted if it takes longer than the specified duration.
|
|
111
|
+
|
|
112
|
+
This is a convenience parameter that creates an abort signal internally. It can be used alongside `abortSignal` — if both are provided, the call will abort when either condition is met.
|
|
113
|
+
|
|
114
|
+
You can specify the timeout either as a number (milliseconds) or as an object with `totalMs`, `stepMs`, and/or `chunkMs` properties:
|
|
115
|
+
|
|
116
|
+
- `totalMs`: The total timeout for the entire call including all steps.
|
|
117
|
+
- `stepMs`: The timeout for each individual step (LLM call). This is useful for multi-step generations where you want to limit the time spent on each step independently.
|
|
118
|
+
- `chunkMs`: The timeout between stream chunks (streaming only). The call will abort if no new chunk is received within this duration. This is useful for detecting stalled streams.
|
|
119
|
+
|
|
120
|
+
#### Example: 5 second timeout (number format)
|
|
121
|
+
|
|
122
|
+
```ts
|
|
123
|
+
const result = await generateText({
|
|
124
|
+
model: __MODEL__,
|
|
125
|
+
prompt: 'Invent a new holiday and describe its traditions.',
|
|
126
|
+
timeout: 5000, // 5 seconds
|
|
127
|
+
});
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
#### Example: 5 second total timeout (object format)
|
|
131
|
+
|
|
132
|
+
```ts
|
|
133
|
+
const result = await generateText({
|
|
134
|
+
model: __MODEL__,
|
|
135
|
+
prompt: 'Invent a new holiday and describe its traditions.',
|
|
136
|
+
timeout: { totalMs: 5000 }, // 5 seconds
|
|
137
|
+
});
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
#### Example: 10 second step timeout
|
|
141
|
+
|
|
142
|
+
```ts
|
|
143
|
+
const result = await generateText({
|
|
144
|
+
model: __MODEL__,
|
|
145
|
+
prompt: 'Invent a new holiday and describe its traditions.',
|
|
146
|
+
timeout: { stepMs: 10000 }, // 10 seconds per step
|
|
147
|
+
});
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
#### Example: Combined total and step timeout
|
|
151
|
+
|
|
152
|
+
```ts
|
|
153
|
+
const result = await generateText({
|
|
154
|
+
model: __MODEL__,
|
|
155
|
+
prompt: 'Invent a new holiday and describe its traditions.',
|
|
156
|
+
timeout: {
|
|
157
|
+
totalMs: 60000, // 60 seconds total
|
|
158
|
+
stepMs: 10000, // 10 seconds per step
|
|
159
|
+
},
|
|
160
|
+
});
|
|
161
|
+
```
|
|
162
|
+
|
|
163
|
+
#### Example: Per-chunk timeout for streaming (streamText only)
|
|
164
|
+
|
|
165
|
+
```ts
|
|
166
|
+
const result = streamText({
|
|
167
|
+
model: __MODEL__,
|
|
168
|
+
prompt: 'Invent a new holiday and describe its traditions.',
|
|
169
|
+
timeout: { chunkMs: 5000 }, // abort if no chunk received for 5 seconds
|
|
170
|
+
});
|
|
171
|
+
```
|
|
172
|
+
|
|
173
|
+
### `headers`
|
|
174
|
+
|
|
175
|
+
Additional HTTP headers to be sent with the request. Only applicable for HTTP-based providers.
|
|
176
|
+
|
|
177
|
+
You can use the request headers to provide additional information to the provider,
|
|
178
|
+
depending on what the provider supports. For example, some observability providers support
|
|
179
|
+
headers such as `Prompt-Id`.
|
|
180
|
+
|
|
181
|
+
```ts
|
|
182
|
+
import { generateText } from 'ai';
|
|
183
|
+
__PROVIDER_IMPORT__;
|
|
184
|
+
|
|
185
|
+
const result = await generateText({
|
|
186
|
+
model: __MODEL__,
|
|
187
|
+
prompt: 'Invent a new holiday and describe its traditions.',
|
|
188
|
+
headers: {
|
|
189
|
+
'Prompt-Id': 'my-prompt-id',
|
|
190
|
+
},
|
|
191
|
+
});
|
|
192
|
+
```
|
|
193
|
+
|
|
194
|
+
<Note>
|
|
195
|
+
The `headers` setting is for request-specific headers. You can also set
|
|
196
|
+
`headers` in the provider configuration. These headers will be sent with every
|
|
197
|
+
request made by the provider.
|
|
198
|
+
</Note>
|
|
@@ -0,0 +1,247 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: Embeddings
|
|
3
|
+
description: Learn how to embed values with the AI SDK.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Embeddings
|
|
7
|
+
|
|
8
|
+
Embeddings are a way to represent words, phrases, or images as vectors in a high-dimensional space.
|
|
9
|
+
In this space, similar words are close to each other, and the distance between words can be used to measure their similarity.
|
|
10
|
+
|
|
11
|
+
## Embedding a Single Value
|
|
12
|
+
|
|
13
|
+
The AI SDK provides the [`embed`](/docs/reference/ai-sdk-core/embed) function to embed single values, which is useful for tasks such as finding similar words
|
|
14
|
+
or phrases, or clustering text.
|
|
15
|
+
You can use it with embeddings models, e.g. `openai.embeddingModel('text-embedding-3-large')` or `mistral.embeddingModel('mistral-embed')`.
|
|
16
|
+
|
|
17
|
+
```tsx
|
|
18
|
+
import { embed } from 'ai';
|
|
19
|
+
import { openai } from '@ai-sdk/openai';
|
|
20
|
+
|
|
21
|
+
// 'embedding' is a single embedding object (number[])
|
|
22
|
+
const { embedding } = await embed({
|
|
23
|
+
model: 'openai/text-embedding-3-small',
|
|
24
|
+
value: 'sunny day at the beach',
|
|
25
|
+
});
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
## Embedding Many Values
|
|
29
|
+
|
|
30
|
+
When loading data, e.g. when preparing a data store for retrieval-augmented generation (RAG),
|
|
31
|
+
it is often useful to embed many values at once (batch embedding).
|
|
32
|
+
|
|
33
|
+
The AI SDK provides the [`embedMany`](/docs/reference/ai-sdk-core/embed-many) function for this purpose.
|
|
34
|
+
Similar to `embed`, you can use it with embeddings models,
|
|
35
|
+
e.g. `openai.embeddingModel('text-embedding-3-large')` or `mistral.embeddingModel('mistral-embed')`.
|
|
36
|
+
|
|
37
|
+
```tsx
|
|
38
|
+
import { openai } from '@ai-sdk/openai';
|
|
39
|
+
import { embedMany } from 'ai';
|
|
40
|
+
|
|
41
|
+
// 'embeddings' is an array of embedding objects (number[][]).
|
|
42
|
+
// It is sorted in the same order as the input values.
|
|
43
|
+
const { embeddings } = await embedMany({
|
|
44
|
+
model: 'openai/text-embedding-3-small',
|
|
45
|
+
values: [
|
|
46
|
+
'sunny day at the beach',
|
|
47
|
+
'rainy afternoon in the city',
|
|
48
|
+
'snowy night in the mountains',
|
|
49
|
+
],
|
|
50
|
+
});
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
## Embedding Similarity
|
|
54
|
+
|
|
55
|
+
After embedding values, you can calculate the similarity between them using the [`cosineSimilarity`](/docs/reference/ai-sdk-core/cosine-similarity) function.
|
|
56
|
+
This is useful to e.g. find similar words or phrases in a dataset.
|
|
57
|
+
You can also rank and filter related items based on their similarity.
|
|
58
|
+
|
|
59
|
+
```ts highlight={"2,10"}
|
|
60
|
+
import { openai } from '@ai-sdk/openai';
|
|
61
|
+
import { cosineSimilarity, embedMany } from 'ai';
|
|
62
|
+
|
|
63
|
+
const { embeddings } = await embedMany({
|
|
64
|
+
model: 'openai/text-embedding-3-small',
|
|
65
|
+
values: ['sunny day at the beach', 'rainy afternoon in the city'],
|
|
66
|
+
});
|
|
67
|
+
|
|
68
|
+
console.log(
|
|
69
|
+
`cosine similarity: ${cosineSimilarity(embeddings[0], embeddings[1])}`,
|
|
70
|
+
);
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
## Token Usage
|
|
74
|
+
|
|
75
|
+
Many providers charge based on the number of tokens used to generate embeddings.
|
|
76
|
+
Both `embed` and `embedMany` provide token usage information in the `usage` property of the result object:
|
|
77
|
+
|
|
78
|
+
```ts highlight={"4,9"}
|
|
79
|
+
import { openai } from '@ai-sdk/openai';
|
|
80
|
+
import { embed } from 'ai';
|
|
81
|
+
|
|
82
|
+
const { embedding, usage } = await embed({
|
|
83
|
+
model: 'openai/text-embedding-3-small',
|
|
84
|
+
value: 'sunny day at the beach',
|
|
85
|
+
});
|
|
86
|
+
|
|
87
|
+
console.log(usage); // { tokens: 10 }
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
## Settings
|
|
91
|
+
|
|
92
|
+
### Provider Options
|
|
93
|
+
|
|
94
|
+
Embedding model settings can be configured using `providerOptions` for provider-specific parameters:
|
|
95
|
+
|
|
96
|
+
```ts highlight={"5-9"}
|
|
97
|
+
import { openai } from '@ai-sdk/openai';
|
|
98
|
+
import { embed } from 'ai';
|
|
99
|
+
|
|
100
|
+
const { embedding } = await embed({
|
|
101
|
+
model: 'openai/text-embedding-3-small',
|
|
102
|
+
value: 'sunny day at the beach',
|
|
103
|
+
providerOptions: {
|
|
104
|
+
openai: {
|
|
105
|
+
dimensions: 512, // Reduce embedding dimensions
|
|
106
|
+
},
|
|
107
|
+
},
|
|
108
|
+
});
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
### Parallel Requests
|
|
112
|
+
|
|
113
|
+
The `embedMany` function supports parallel processing with a configurable `maxParallelCalls` setting to optimize performance:
|
|
114
|
+
|
|
115
|
+
```ts highlight={"4"}
|
|
116
|
+
import { openai } from '@ai-sdk/openai';
|
|
117
|
+
import { embedMany } from 'ai';
|
|
118
|
+
|
|
119
|
+
const { embeddings, usage } = await embedMany({
|
|
120
|
+
maxParallelCalls: 2, // Limit parallel requests
|
|
121
|
+
model: 'openai/text-embedding-3-small',
|
|
122
|
+
values: [
|
|
123
|
+
'sunny day at the beach',
|
|
124
|
+
'rainy afternoon in the city',
|
|
125
|
+
'snowy night in the mountains',
|
|
126
|
+
],
|
|
127
|
+
});
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
### Retries
|
|
131
|
+
|
|
132
|
+
Both `embed` and `embedMany` accept an optional `maxRetries` parameter of type `number`
|
|
133
|
+
that you can use to set the maximum number of retries for the embedding process.
|
|
134
|
+
It defaults to `2` retries (3 attempts in total). You can set it to `0` to disable retries.
|
|
135
|
+
|
|
136
|
+
```ts highlight={"7"}
|
|
137
|
+
import { openai } from '@ai-sdk/openai';
|
|
138
|
+
import { embed } from 'ai';
|
|
139
|
+
|
|
140
|
+
const { embedding } = await embed({
|
|
141
|
+
model: 'openai/text-embedding-3-small',
|
|
142
|
+
value: 'sunny day at the beach',
|
|
143
|
+
maxRetries: 0, // Disable retries
|
|
144
|
+
});
|
|
145
|
+
```
|
|
146
|
+
|
|
147
|
+
### Abort Signals and Timeouts
|
|
148
|
+
|
|
149
|
+
Both `embed` and `embedMany` accept an optional `abortSignal` parameter of
|
|
150
|
+
type [`AbortSignal`](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal)
|
|
151
|
+
that you can use to abort the embedding process or set a timeout.
|
|
152
|
+
|
|
153
|
+
```ts highlight={"7"}
|
|
154
|
+
import { openai } from '@ai-sdk/openai';
|
|
155
|
+
import { embed } from 'ai';
|
|
156
|
+
|
|
157
|
+
const { embedding } = await embed({
|
|
158
|
+
model: 'openai/text-embedding-3-small',
|
|
159
|
+
value: 'sunny day at the beach',
|
|
160
|
+
abortSignal: AbortSignal.timeout(1000), // Abort after 1 second
|
|
161
|
+
});
|
|
162
|
+
```
|
|
163
|
+
|
|
164
|
+
### Custom Headers
|
|
165
|
+
|
|
166
|
+
Both `embed` and `embedMany` accept an optional `headers` parameter of type `Record<string, string>`
|
|
167
|
+
that you can use to add custom headers to the embedding request.
|
|
168
|
+
|
|
169
|
+
```ts highlight={"7"}
|
|
170
|
+
import { openai } from '@ai-sdk/openai';
|
|
171
|
+
import { embed } from 'ai';
|
|
172
|
+
|
|
173
|
+
const { embedding } = await embed({
|
|
174
|
+
model: 'openai/text-embedding-3-small',
|
|
175
|
+
value: 'sunny day at the beach',
|
|
176
|
+
headers: { 'X-Custom-Header': 'custom-value' },
|
|
177
|
+
});
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
## Response Information
|
|
181
|
+
|
|
182
|
+
Both `embed` and `embedMany` return response information that includes the raw provider response:
|
|
183
|
+
|
|
184
|
+
```ts highlight={"4,9"}
|
|
185
|
+
import { openai } from '@ai-sdk/openai';
|
|
186
|
+
import { embed } from 'ai';
|
|
187
|
+
|
|
188
|
+
const { embedding, response } = await embed({
|
|
189
|
+
model: 'openai/text-embedding-3-small',
|
|
190
|
+
value: 'sunny day at the beach',
|
|
191
|
+
});
|
|
192
|
+
|
|
193
|
+
console.log(response); // Raw provider response
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
## Embedding Middleware
|
|
197
|
+
|
|
198
|
+
You can enhance embedding models, e.g. to set default values, using
|
|
199
|
+
`wrapEmbeddingModel` and `EmbeddingModelV3Middleware`.
|
|
200
|
+
|
|
201
|
+
Here is an example that uses the built-in `defaultEmbeddingSettingsMiddleware`:
|
|
202
|
+
|
|
203
|
+
```ts
|
|
204
|
+
import {
|
|
205
|
+
customProvider,
|
|
206
|
+
defaultEmbeddingSettingsMiddleware,
|
|
207
|
+
embed,
|
|
208
|
+
wrapEmbeddingModel,
|
|
209
|
+
gateway,
|
|
210
|
+
} from 'ai';
|
|
211
|
+
|
|
212
|
+
const embeddingModelWithDefaults = wrapEmbeddingModel({
|
|
213
|
+
model: gateway.embeddingModel('google/gemini-embedding-001'),
|
|
214
|
+
middleware: defaultEmbeddingSettingsMiddleware({
|
|
215
|
+
settings: {
|
|
216
|
+
providerOptions: {
|
|
217
|
+
google: {
|
|
218
|
+
outputDimensionality: 256,
|
|
219
|
+
taskType: 'CLASSIFICATION',
|
|
220
|
+
},
|
|
221
|
+
},
|
|
222
|
+
},
|
|
223
|
+
}),
|
|
224
|
+
});
|
|
225
|
+
```
|
|
226
|
+
|
|
227
|
+
## Embedding Providers & Models
|
|
228
|
+
|
|
229
|
+
Several providers offer embedding models:
|
|
230
|
+
|
|
231
|
+
| Provider | Model | Embedding Dimensions |
|
|
232
|
+
| ----------------------------------------------------------------------------------------- | ------------------------------- | -------------------- |
|
|
233
|
+
| [OpenAI](/providers/ai-sdk-providers/openai#embedding-models) | `text-embedding-3-large` | 3072 |
|
|
234
|
+
| [OpenAI](/providers/ai-sdk-providers/openai#embedding-models) | `text-embedding-3-small` | 1536 |
|
|
235
|
+
| [OpenAI](/providers/ai-sdk-providers/openai#embedding-models) | `text-embedding-ada-002` | 1536 |
|
|
236
|
+
| [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai#embedding-models) | `gemini-embedding-001` | 3072 |
|
|
237
|
+
| [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai#embedding-models) | `text-embedding-004` | 768 |
|
|
238
|
+
| [Mistral](/providers/ai-sdk-providers/mistral#embedding-models) | `mistral-embed` | 1024 |
|
|
239
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-english-v3.0` | 1024 |
|
|
240
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-multilingual-v3.0` | 1024 |
|
|
241
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-english-light-v3.0` | 384 |
|
|
242
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-multilingual-light-v3.0` | 384 |
|
|
243
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-english-v2.0` | 4096 |
|
|
244
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-english-light-v2.0` | 1024 |
|
|
245
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#embedding-models) | `embed-multilingual-v2.0` | 768 |
|
|
246
|
+
| [Amazon Bedrock](/providers/ai-sdk-providers/amazon-bedrock#embedding-models) | `amazon.titan-embed-text-v1` | 1536 |
|
|
247
|
+
| [Amazon Bedrock](/providers/ai-sdk-providers/amazon-bedrock#embedding-models) | `amazon.titan-embed-text-v2:0` | 1024 |
|
|
@@ -0,0 +1,218 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: Reranking
|
|
3
|
+
description: Learn how to rerank documents with the AI SDK.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Reranking
|
|
7
|
+
|
|
8
|
+
Reranking is a technique used to improve search relevance by reordering a set of documents based on their relevance to a query.
|
|
9
|
+
Unlike embedding-based similarity search, reranking models are specifically trained to understand the relationship between queries and documents,
|
|
10
|
+
often producing more accurate relevance scores.
|
|
11
|
+
|
|
12
|
+
## Reranking Documents
|
|
13
|
+
|
|
14
|
+
The AI SDK provides the [`rerank`](/docs/reference/ai-sdk-core/rerank) function to rerank documents based on their relevance to a query.
|
|
15
|
+
You can use it with reranking models, e.g. `cohere.reranking('rerank-v3.5')` or `bedrock.reranking('cohere.rerank-v3-5:0')`.
|
|
16
|
+
|
|
17
|
+
```tsx
|
|
18
|
+
import { rerank } from 'ai';
|
|
19
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
20
|
+
|
|
21
|
+
const documents = [
|
|
22
|
+
'sunny day at the beach',
|
|
23
|
+
'rainy afternoon in the city',
|
|
24
|
+
'snowy night in the mountains',
|
|
25
|
+
];
|
|
26
|
+
|
|
27
|
+
const { ranking } = await rerank({
|
|
28
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
29
|
+
documents,
|
|
30
|
+
query: 'talk about rain',
|
|
31
|
+
topN: 2, // Return top 2 most relevant documents
|
|
32
|
+
});
|
|
33
|
+
|
|
34
|
+
console.log(ranking);
|
|
35
|
+
// [
|
|
36
|
+
// { originalIndex: 1, score: 0.9, document: 'rainy afternoon in the city' },
|
|
37
|
+
// { originalIndex: 0, score: 0.3, document: 'sunny day at the beach' }
|
|
38
|
+
// ]
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
## Working with Object Documents
|
|
42
|
+
|
|
43
|
+
Reranking also supports structured documents (JSON objects), making it ideal for searching through databases, emails, or other structured content:
|
|
44
|
+
|
|
45
|
+
```tsx
|
|
46
|
+
import { rerank } from 'ai';
|
|
47
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
48
|
+
|
|
49
|
+
const documents = [
|
|
50
|
+
{
|
|
51
|
+
from: 'Paul Doe',
|
|
52
|
+
subject: 'Follow-up',
|
|
53
|
+
text: 'We are happy to give you a discount of 20% on your next order.',
|
|
54
|
+
},
|
|
55
|
+
{
|
|
56
|
+
from: 'John McGill',
|
|
57
|
+
subject: 'Missing Info',
|
|
58
|
+
text: 'Sorry, but here is the pricing information from Oracle: $5000/month',
|
|
59
|
+
},
|
|
60
|
+
];
|
|
61
|
+
|
|
62
|
+
const { ranking, rerankedDocuments } = await rerank({
|
|
63
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
64
|
+
documents,
|
|
65
|
+
query: 'Which pricing did we get from Oracle?',
|
|
66
|
+
topN: 1,
|
|
67
|
+
});
|
|
68
|
+
|
|
69
|
+
console.log(rerankedDocuments[0]);
|
|
70
|
+
// { from: 'John McGill', subject: 'Missing Info', text: '...' }
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
## Understanding the Results
|
|
74
|
+
|
|
75
|
+
The `rerank` function returns a comprehensive result object:
|
|
76
|
+
|
|
77
|
+
```ts
|
|
78
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
79
|
+
import { rerank } from 'ai';
|
|
80
|
+
|
|
81
|
+
const { ranking, rerankedDocuments, originalDocuments } = await rerank({
|
|
82
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
83
|
+
documents: ['sunny day at the beach', 'rainy afternoon in the city'],
|
|
84
|
+
query: 'talk about rain',
|
|
85
|
+
});
|
|
86
|
+
|
|
87
|
+
// ranking: sorted array of { originalIndex, score, document }
|
|
88
|
+
// rerankedDocuments: documents sorted by relevance (convenience property)
|
|
89
|
+
// originalDocuments: original documents array
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
Each item in the `ranking` array contains:
|
|
93
|
+
|
|
94
|
+
- `originalIndex`: Position in the original documents array
|
|
95
|
+
- `score`: Relevance score (typically 0-1, where higher is more relevant)
|
|
96
|
+
- `document`: The original document
|
|
97
|
+
|
|
98
|
+
## Settings
|
|
99
|
+
|
|
100
|
+
### Top-N Results
|
|
101
|
+
|
|
102
|
+
Use `topN` to limit the number of results returned. This is useful for retrieving only the most relevant documents:
|
|
103
|
+
|
|
104
|
+
```ts highlight={"7"}
|
|
105
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
106
|
+
import { rerank } from 'ai';
|
|
107
|
+
|
|
108
|
+
const { ranking } = await rerank({
|
|
109
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
110
|
+
documents: ['doc1', 'doc2', 'doc3', 'doc4', 'doc5'],
|
|
111
|
+
query: 'relevant information',
|
|
112
|
+
topN: 3, // Return only top 3 most relevant documents
|
|
113
|
+
});
|
|
114
|
+
```
|
|
115
|
+
|
|
116
|
+
### Provider Options
|
|
117
|
+
|
|
118
|
+
Reranking model settings can be configured using `providerOptions` for provider-specific parameters:
|
|
119
|
+
|
|
120
|
+
```ts highlight={"8-12"}
|
|
121
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
122
|
+
import { rerank } from 'ai';
|
|
123
|
+
|
|
124
|
+
const { ranking } = await rerank({
|
|
125
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
126
|
+
documents: ['sunny day at the beach', 'rainy afternoon in the city'],
|
|
127
|
+
query: 'talk about rain',
|
|
128
|
+
providerOptions: {
|
|
129
|
+
cohere: {
|
|
130
|
+
maxTokensPerDoc: 1000, // Limit tokens per document
|
|
131
|
+
},
|
|
132
|
+
},
|
|
133
|
+
});
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
### Retries
|
|
137
|
+
|
|
138
|
+
The `rerank` function accepts an optional `maxRetries` parameter of type `number`
|
|
139
|
+
that you can use to set the maximum number of retries for the reranking process.
|
|
140
|
+
It defaults to `2` retries (3 attempts in total). You can set it to `0` to disable retries.
|
|
141
|
+
|
|
142
|
+
```ts highlight={"7"}
|
|
143
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
144
|
+
import { rerank } from 'ai';
|
|
145
|
+
|
|
146
|
+
const { ranking } = await rerank({
|
|
147
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
148
|
+
documents: ['sunny day at the beach', 'rainy afternoon in the city'],
|
|
149
|
+
query: 'talk about rain',
|
|
150
|
+
maxRetries: 0, // Disable retries
|
|
151
|
+
});
|
|
152
|
+
```
|
|
153
|
+
|
|
154
|
+
### Abort Signals and Timeouts
|
|
155
|
+
|
|
156
|
+
The `rerank` function accepts an optional `abortSignal` parameter of
|
|
157
|
+
type [`AbortSignal`](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal)
|
|
158
|
+
that you can use to abort the reranking process or set a timeout.
|
|
159
|
+
|
|
160
|
+
```ts highlight={"7"}
|
|
161
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
162
|
+
import { rerank } from 'ai';
|
|
163
|
+
|
|
164
|
+
const { ranking } = await rerank({
|
|
165
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
166
|
+
documents: ['sunny day at the beach', 'rainy afternoon in the city'],
|
|
167
|
+
query: 'talk about rain',
|
|
168
|
+
abortSignal: AbortSignal.timeout(5000), // Abort after 5 seconds
|
|
169
|
+
});
|
|
170
|
+
```
|
|
171
|
+
|
|
172
|
+
### Custom Headers
|
|
173
|
+
|
|
174
|
+
The `rerank` function accepts an optional `headers` parameter of type `Record<string, string>`
|
|
175
|
+
that you can use to add custom headers to the reranking request.
|
|
176
|
+
|
|
177
|
+
```ts highlight={"7"}
|
|
178
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
179
|
+
import { rerank } from 'ai';
|
|
180
|
+
|
|
181
|
+
const { ranking } = await rerank({
|
|
182
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
183
|
+
documents: ['sunny day at the beach', 'rainy afternoon in the city'],
|
|
184
|
+
query: 'talk about rain',
|
|
185
|
+
headers: { 'X-Custom-Header': 'custom-value' },
|
|
186
|
+
});
|
|
187
|
+
```
|
|
188
|
+
|
|
189
|
+
## Response Information
|
|
190
|
+
|
|
191
|
+
The `rerank` function returns response information that includes the raw provider response:
|
|
192
|
+
|
|
193
|
+
```ts highlight={"4,10"}
|
|
194
|
+
import { cohere } from '@ai-sdk/cohere';
|
|
195
|
+
import { rerank } from 'ai';
|
|
196
|
+
|
|
197
|
+
const { ranking, response } = await rerank({
|
|
198
|
+
model: cohere.reranking('rerank-v3.5'),
|
|
199
|
+
documents: ['sunny day at the beach', 'rainy afternoon in the city'],
|
|
200
|
+
query: 'talk about rain',
|
|
201
|
+
});
|
|
202
|
+
|
|
203
|
+
console.log(response); // { id, timestamp, modelId, headers, body }
|
|
204
|
+
```
|
|
205
|
+
|
|
206
|
+
## Reranking Providers & Models
|
|
207
|
+
|
|
208
|
+
Several providers offer reranking models:
|
|
209
|
+
|
|
210
|
+
| Provider | Model |
|
|
211
|
+
| ----------------------------------------------------------------------------- | ------------------------------------- |
|
|
212
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#reranking-models) | `rerank-v3.5` |
|
|
213
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#reranking-models) | `rerank-english-v3.0` |
|
|
214
|
+
| [Cohere](/providers/ai-sdk-providers/cohere#reranking-models) | `rerank-multilingual-v3.0` |
|
|
215
|
+
| [Amazon Bedrock](/providers/ai-sdk-providers/amazon-bedrock#reranking-models) | `amazon.rerank-v1:0` |
|
|
216
|
+
| [Amazon Bedrock](/providers/ai-sdk-providers/amazon-bedrock#reranking-models) | `cohere.rerank-v3-5:0` |
|
|
217
|
+
| [Together.ai](/providers/ai-sdk-providers/togetherai#reranking-models) | `Salesforce/Llama-Rank-v1` |
|
|
218
|
+
| [Together.ai](/providers/ai-sdk-providers/togetherai#reranking-models) | `mixedbread-ai/Mxbai-Rerank-Large-V2` |
|