@x12i/ai-gateway 9.3.5 → 9.4.0

package/README.md CHANGED
@@ -1,58 +1,21 @@
1
1
  # @x12i/ai-gateway
2
2
 
3
- Unified gateway for LLM provider routing and management with production-ready features: context propagation, usage tier tracking, activity tracking, and comprehensive metadata. Built on top of `@x12i/ai-providers-router` with integrations for `@x12i/x-models`, `@x12i/activix` (see `package.json` for pinned versions), and **`@x12i/logxer`** for structured logging.
3
+ Unified gateway for LLM provider routing, structured logging, optional Activix activity persistence, and cost/model resolution via **@x12i/ai-tools** v2. Built on **@x12i/ai-providers-router**, **@x12i/logxer**, **@x12i/rendrix** (templates), and **@x12i/flex-md** (output-format hints and max-token lookup).
4
4
 
5
- ## Mandatory runtime identity (v9+)
6
-
7
- Every **`invoke`** / **`invokeChat`** request **must** include **`identity`**: the full runtime envelope from the **upstream client** (not invented inside the gateway).
8
-
9
- - **`identity.jobId`** and **`identity.taskId`** are **only** taken from that upstream object. The gateway **never** generates, rewrites, or back-fills them from deprecated top-level `jobId` / `taskId` fields.
10
- - If `identity` is missing or `jobId` / `taskId` are empty, the gateway logs **`warn`** via Logxer (`missingRuntimeIdentityObject` / `missingUpstreamIdentityFields`) when a logger is configured, and still attaches the merged envelope so the rest of the pipeline can proceed.
11
- - The same merged object is **`request.identity`**, forwarded to the router, returned as **`response.metadata.identity`**, and persisted on Activix as **`runContext`** (same reference as `request.identity`).
12
-
13
- See [Identity contract](./docs/IDENTITY_OBJECT_CONTRACT.md) and [Logger initialization](./docs/LOGGER_INITIALIZATION.md).
5
+ ## What this package does
14
6
 
15
- ### AI invoke payload (`gateway.invoke`)
7
+ | Area | Behavior |
8
+ |------|----------|
9
+ | **Routing** | Registers providers (or lazy-registers from env), invokes the router with merged model config, retries, and optional fallback chain. |
10
+ | **`invoke()`** | Builds messages from instructions + prompt templates + `workingMemory`; requires runtime **identity** and **actionType** / **actionRef**. |
11
+ | **`invokeChat()`** | Raw chat-style requests; no instruction builder or action classification. |
12
+ | **Cost** | Forwards router `costStatus` when present; otherwise prices via **@x12i/ai-tools** open-assets catalogs (`calculateFromRecord`). |
13
+ | **Activix** | Optional Mongo-backed activity rows (`ai-actions`, `bad-requests`, `skill-executions`) with root billing fields and `outer` I/O. |
14
+ | **Trace mode** | `diagnostics.mode === 'trace'` adds `metadata.attempts[]`, `metadata.usage`, and per-attempt billing when priced. |
16
15
 
17
- Use a single object typed as **`AIInvokeRequest`** (exported alias: **`AIRequest`**). Besides **`identity`** / **`aiRequestId`** / **`agentId`** / **`instructions`**, every **`invoke()`** call **must** include:
16
+ Pinned dependency versions are in `package.json` (currently **Activix ^7.2**, **ai-tools ^2**).
18
17
 
19
- - **`actionType`**: **`'skill'`** | **`'preSkill'`** | **`'postSkill'`** — whether this call is the main skill or a pre/post hook.
20
- - **`actionRef`**: non-empty string — stable reference for the action (for example the skill id).
21
-
22
- The gateway copies these onto **`request.identity`** for **`runContext`** (Activix) and onto activity documents when tracking is enabled. **`invokeChat()`** does **not** use these fields.
23
-
24
- ## Features
25
-
26
- - **🔀 Provider Routing**: Dynamic provider registration and automatic routing with fallback support
27
- - **📊 Context Propagation**: `aiRequestId` and identity propagation for distributed tracing (see [Identity contract](./docs/IDENTITY_OBJECT_CONTRACT.md))
28
- - **⚡ Usage Tier Tracking**: RPM/TPM limit enforcement via `@x12i/x-models`
29
- - **📈 Activity Tracking**: Comprehensive activity logging via `@x12i/activix` v7 (xronox-activitix), fixed Mongo collections `ai-actions` / `bad-requests`, validated root-level **`outer` / `inner`** I/O plus **`runContext`** for Activix 7
30
- - **📝 Structured Logging**: Production-ready logging via **`@x12i/logxer`** (LogMeta `jobId` / `sessionId` / `correlationId`, optional `debugKind`, default `runtimeIdentity` from env) with diagnostic tracing for instruction resolution and propagation debugging
31
- - **📋 Rich Metadata**: Detailed execution metadata (latency, tokens, model, cost, `aiRequestId`, `identity`)
32
- - **🏥 Health Checks**: Monitor provider health and availability
33
- - **🔄 Request/Response Interceptors**: Modify requests and responses
34
- - **🔍 Auto-Discovery**: Automatically discover and register installed provider packages
35
- - **🎯 Object Type Output Support**: Parse responses into typed inference outputs (classification, extraction, Q&A, etc.) via `@x12i/outputs-library`
36
- - **✅ Enhanced Schema Validation**: Strict/non-strict validation modes, automatic schema resolution from instruction metadata, and graceful outputs library fallback (v1.7.0+)
37
- - **✅ Graceful Outputs Library Error Handling**: Automatic fallback parsing, clear error detection, and parsing method metadata (v1.7.1+)
38
- - **✅ Guaranteed Consistent Structure**: Always returns consistent structure at all levels (content, parsedContent, parsedOutput) - JSON is always JSON, text is always text, structures are forced when needed (v1.7.4+)
39
- - **📋 Automatic Output Schema Guidance**: Automatically extends instructions with JSON schema expectations when outputType/schema is available (v2.1.1+)
40
- - **🔍 Output Structure Audit**: Automatically audits response structure against schema - identifies missing/extra fields, always available when schema exists (v2.1.1+)
41
- - **🔄 Automatic Retry**: Intelligent retry logic for network errors, server errors (5xx), and throttling (429) with exponential backoff
42
- - **📚 Content Resolver (nx-content)**: Resolve instruction keys, prompt keys, and instructions blocks from local folder or git repo via **nx-content**. See [Content Resolver — Upstream Guide](./CONTENT_RESOLVER_UPSTREAM_GUIDE.md).
43
- - **📋 Instruction Metadata API**: Fetch structured metadata (outputType, schema, validation rules) for metadata-driven inference systems (v1.6.9+)
44
- - **Post-processing & interceptors**: Request-level `transformations` hooks were **removed**; transform outputs in your app or use router interceptors (see [Response transformation hooks](#7-response-transformation-hooks-removed-from-request))
45
- - **🔧 Custom/Dynamic Instructions Mode**: Use instructions that already contain full JSON schema - no schema formatting added, instructions used exactly as provided (v3.0.4+)
46
- - **🤖 Response Repair Fallback**: In `mode=prod`, performs a minimal in-gateway repair attempt for malformed JSON/Markdown responses (logs a warning when used). In `mode=debug`, parsing failures hard-fail for maximum visibility.
47
- - **📊 Response Fix Metadata**: Track when and how responses were fixed, including fix strategy, confidence, and warnings (v3.0.4+)
48
- - **🔍 Instruction Optimizer**: Use AI to analyze and fix poorly-written instructions - meta-feature that improves instruction quality (v3.0.4+)
49
- - **🧪 Instruction Testing**: Test instructions by running them and analyzing if responses match expected format (v3.0.4+)
50
- - **📝 Multiple Output Modes**: Support for JSON output, structured text output, and two-step conversion (v3.0.5+)
51
- - **📋 Dual Instruction Formats**: Support for JSON schema instructions and structured text format specifications (v3.0.5+)
52
- - **📚 Standard Object Types**: Reference standard object types by name (e.g., `'sentiment-analysis'`) instead of defining schemas manually - includes examples, validation, and structured text instructions (v3.0.6+)
53
- - **🔍 Auto-Extraction of Output Formats**: Automatically extract output format specifications from instruction templates when using structured-text mode - no need to manually specify `flexMdFormat` or `primaryObjectType` (v3.3.3+)
54
- - **✅ Output Format Validation**: Validates output format specifications using flex-md SDK before sending to LLM, with configurable minimum compliance level (L0-L3)
55
- - **📋 Contract hints (`StructuredTextSpec`)**: Optional field on **`ChatRequest`** for typing / forward compatibility; dedicated activity **`contractOutput`** persistence is **not** implemented in the gateway runtime — validate **`parsedContent`** / **`content`** in your app or use interceptors (see **section 9** under Setup Guide)
18
+ ---
56
19
 
57
20
  ## Installation
58
21
 
@@ -60,89 +23,46 @@ The gateway copies these onto **`request.identity`** for **`runContext`** (Activ
60
23
  npm install @x12i/ai-gateway
61
24
  ```
62
25
 
63
- **📚 Documentation**: After installation, documentation is available in:
64
- - `node_modules/@x12i/ai-gateway/CONTENT_RESOLVER_UPSTREAM_GUIDE.md` - **Content resolver (nx-content)**: config, keys, local/git, upstream checklist
65
- - `node_modules/@x12i/ai-gateway/docs/IDENTITY_OBJECT_CONTRACT.md` - **Identity contract** for Activix (`sessionId` + `instance`)
66
- - `node_modules/@x12i/ai-gateway/docs/AI_GATEWAY_INVOKE_EXECUTION_METADATA.md` - **Invoke metadata**, cost/billing (G8), output contract (G6), Activix completion fields
67
- - `node_modules/@x12i/ai-gateway/docs/LOGGER_INITIALIZATION.md` - **Required reading**: How to properly initialize logger
68
- - `node_modules/@x12i/ai-gateway/TROUBLESHOOTING.md` - Troubleshooting guide
69
- - `node_modules/@x12i/ai-gateway/TROUBLESHOOTING_TOOLBOX.md` - Diagnostic tools
70
- - `node_modules/@x12i/ai-gateway/INTEGRATION_GUIDANCE.md` - Integration guidance
71
-
72
- **🔧 Troubleshooting Helpers**: Import diagnostic functions directly:
73
- ```typescript
74
- import { validateAIRequest, diagnoseRequest, formatDiagnostic } from '@x12i/ai-gateway';
75
- ```
76
-
77
- **🔍 Debugging**: Enable detailed request logging and comprehensive diagnostic tracing:
78
-
79
- ```bash
80
- export AI_GATEWAY_DEBUG=true # Basic request logging
81
- export AI_GATEWAY_DEBUG_REQUEST=true # Detailed request structure
82
- export FLEX_MD_MIN_COMPLIANCE_LEVEL=L0 # Output format validation level (L0/L1/L2/L3, default: L0)
83
- ```
84
-
85
- This logs the exact request structure received by `invoke()`, including property descriptors, which is critical for debugging validation errors (for example missing **`actionType`**, **`actionRef`**, or **`identity.jobId`** / **`identity.taskId`**).
26
+ Peer/provider packages (for example `@x12i/ai-provider-openai`) are installed by your application when you register those providers.
86
27
 
87
- ### 🔍 Advanced Diagnostic Logging
28
+ ---
88
29
 
89
- The gateway includes comprehensive diagnostic logging for instruction resolution and propagation debugging. When debug logging is enabled, the following diagnostic events are logged:
90
-
91
- **Phase 2 Instruction Resolution:**
92
- - `instructions.phase2.validation_inputs` - Logs comparison inputs for key echo validation
93
- - `instructions.phase2.resolution_result` - Logs resolution status, source, and attempts
30
+ ## Mandatory runtime identity (v9+)
94
31
 
95
- **Instruction Propagation Chain:**
96
- - `instructions.propagation.autoExtract.entry` - Instruction hash at auto-extraction
97
- - `instructions.propagation.constructMessages.entry` - Instruction hash at message construction
98
- - `instructions.propagation.providerInvoke.entry` - System prompt hash at LLM invocation
32
+ Every **`invoke()`** / **`invokeChat()`** request **must** include **`identity`** from the upstream client (the gateway does not invent `jobId` / `taskId`).
99
33
 
100
- **Resolution Detection:**
101
- - `instructions.constructMessages.entry` - Detects if constructMessages receives resolved instructions
34
+ - Missing or empty `identity.jobId` / `identity.taskId` → **warn** logs when a logger is configured; the call may still proceed with the merged envelope.
35
+ - The same object is **`request.identity`**, **`response.metadata.identity`**, router context, and Activix **`runContext`**.
102
36
 
103
- **Gate Checks:**
104
- - `gate.activityStart.precheck` - Validates instructions before activity tracking
105
- - `gate.llmInvoke.precheck` - Validates instructions before LLM calls
37
+ See [Identity contract](./docs/IDENTITY_OBJECT_CONTRACT.md).
106
38
 
107
- **Error Handling:**
108
- - `badRequest.written` - Confirms bad request path execution
39
+ ### Action classification (`invoke()` only)
109
40
 
110
- **Benefits:**
111
- - **Trace ID**: Each request gets a stable trace ID for correlation across all logs
112
- - **Hash Chain**: Track instruction content changes through the entire pipeline
113
- - **Fail-Safe Gating**: Verify that invalid instructions are properly rejected
114
- - **Resolution Audit**: Detect double-resolution or propagation failures
41
+ | Field | Required | Values / notes |
42
+ |-------|----------|----------------|
43
+ | **`actionType`** | Yes | `'skill'` \| `'preSkill'` \| `'postSkill'` |
44
+ | **`actionRef`** | Yes | Non-empty stable id (e.g. skill path) |
115
45
 
116
- All diagnostic logs include `traceId`, `jobId`, and `agentId` for correlation. Content is safely redacted (first 80 chars only) with hashes for comparison.
46
+ Copied onto **`identity`** for Activix and activity metadata. Not used by **`invokeChat()`**.
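The identity and action-classification rules in this section can be sketched as a small pre-flight check. This is only an illustration: `checkInvokeRequest`, `InvokeLike`, and `CheckResult` are hypothetical names that the package does not export, and treating a bad `actionType` / `actionRef` as a blocking error (rather than a warning) is an assumption based on the "Required" column above.

```typescript
// Hypothetical pre-flight check; none of these names come from @x12i/ai-gateway.
interface InvokeLike {
  identity?: { jobId?: string; taskId?: string; [key: string]: unknown };
  actionType?: string;
  actionRef?: string;
}

interface CheckResult {
  errors: string[];   // assumed to block invoke()
  warnings: string[]; // documented as warn-level logs only
}

const ACTION_TYPES: readonly string[] = ['skill', 'preSkill', 'postSkill'];

function checkInvokeRequest(req: InvokeLike): CheckResult {
  const errors: string[] = [];
  const warnings: string[] = [];

  // Missing identity or empty jobId / taskId: warn, but let the call proceed.
  if (!req.identity) {
    warnings.push('identity is missing');
  } else {
    if (!req.identity.jobId) warnings.push('identity.jobId is empty');
    if (!req.identity.taskId) warnings.push('identity.taskId is empty');
  }

  // actionType must be one of the three documented values.
  if (!req.actionType || !ACTION_TYPES.includes(req.actionType)) {
    errors.push("actionType must be 'skill', 'preSkill', or 'postSkill'");
  }
  // actionRef must be a non-empty string.
  if (typeof req.actionRef !== 'string' || req.actionRef.trim() === '') {
    errors.push('actionRef must be a non-empty string');
  }
  return { errors, warnings };
}
```

A caller could run this before `gateway.invoke(...)` and log `warnings` while rejecting on `errors`. `invokeChat()` requests would skip the `actionType` / `actionRef` part, since those fields are not used there.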
117
47
 
118
- ## Quick Start
48
+ ---
119
49
 
120
- ### Basic Usage with Enhanced Gateway
50
+ ## Quick start
121
51
 
122
52
  ```typescript
123
53
  import { AIGateway } from '@x12i/ai-gateway';
124
- import { OpenAIProvider } from '@x12i/ai-provider-openai';
125
- import { GrokProvider } from '@x12i/ai-provider-grok';
126
54
 
127
- // Create enhanced gateway
128
55
  const gateway = new AIGateway({
129
- defaultProvider: 'openai',
130
- fallbackChain: ['grok'],
131
- usageTier: 'tier-3', // RPM/TPM limits
56
+ defaultProvider: 'openrouter',
57
+ enableLogging: true,
132
58
  enableActivityTracking: true,
133
- enableUsageTracking: true,
134
- enableLogging: true
59
+ aiTools: {
60
+ enabled: true,
61
+ resolveModels: true,
62
+ calculateCost: true
63
+ }
135
64
  });
136
65
 
137
- // Register providers
138
- gateway.register(new OpenAIProvider({
139
- apiKey: process.env.OPENAI_API_KEY
140
- }));
141
- gateway.register(new GrokProvider({
142
- apiKey: process.env.GROK_API_KEY
143
- }));
144
-
145
- // Invoke with mandatory runtime identity + action classification (Activix / tracing)
146
66
  const response = await gateway.invoke({
147
67
  aiRequestId: 'call-001',
148
68
  agentId: 'agent-456',
@@ -159,4107 +79,191 @@ const response = await gateway.invoke({
159
79
  agentId: 'agent-456'
160
80
  },
161
81
  workingMemory: { input: 'Hello!' },
162
- config: { model: 'gpt-4o-mini', provider: 'openai' }
163
- // … primaryObjectType / objectTypes as needed for your flow (`invoke()` does not use client `messages`; use `invokeChat()` for raw transcripts)
164
- });
165
-
166
- // Response includes comprehensive metadata (including `identity`)
167
- console.log(response.metadata);
168
- ```
169
-
170
- ### Using Base Router (Direct Access)
171
-
172
- ```typescript
173
- import { LLMProviderRouter } from '@x12i/ai-gateway';
174
-
175
- // Use base router if you don't need enhanced features
176
- const router = new LLMProviderRouter({
177
- defaultProvider: 'openai',
178
- fallbackChain: ['grok']
179
- });
180
-
181
- router.register(new OpenAIProvider({
182
- apiKey: process.env.OPENAI_API_KEY
183
- }));
184
-
185
- const response = await router.invoke({
186
- messages: [{ role: 'user', content: 'Hello!' }]
187
- });
188
- ```
189
-
190
- ### Provider registration and OpenRouter (no manual register required)
191
-
192
- If you only use the gateway (e.g. via `@woroces/ai-tasks`) and do not call `gateway.register()` or configure the router yourself:
193
-
194
- - **OpenRouter:** Set `OPEN_ROUTER_KEY` or `OPENROUTER_API_KEY` in the environment and do not set `USE_OPENROUTER=false`. The gateway enables OpenRouter mode so the router can route without any registered provider (requires router support). Load `.env` before any code that creates the gateway; if the gateway is created by another package (e.g. ai-skills) before env is loaded, pass the key explicitly: `openrouter: { apiKey: process.env.OPEN_ROUTER_KEY ?? process.env.OPENROUTER_API_KEY }` in the gateway config.
195
- - **Direct providers:** Set the relevant API key (e.g. `OPENAI_API_KEY`, `GROK_API_KEY`). The gateway **lazy-auto-registers** these on first `invoke()`/`invokeChat()`, so you do not need to call `autoRegisterProviders` or `register()`.
196
-
197
- If you see **"No provider specified and no providers registered"** or **"Provider not registered: openrouter"**, set `OPEN_ROUTER_KEY` (or another provider’s API key), ensure `.env` is loaded before the process that creates the gateway, or pass `openrouter: { apiKey }` in the gateway config. See [TROUBLESHOOTING.md](./TROUBLESHOOTING.md#issue-no-provider-specified-and-no-providers-registered) for details.
198
-
199
- ## Setup Guide
200
-
201
- This guide shows where and how to configure each functionality of the AI Gateway.
202
-
203
- ### Configuration Overview
204
-
205
- The AI Gateway can be configured in multiple ways:
206
- 1. **Gateway Constructor** - Main configuration when creating the gateway
207
- 2. **JSON Default Files** - Default configurations loaded from `src/defaults/` (model-config.json, instructions-blocks.json)
208
- 3. **Environment Variables** - Via `nx-config2` (for logging and other settings)
209
- - `FLEX_MD_MIN_COMPLIANCE_LEVEL` - Minimum flex-md compliance level for output format validation (default: `L0`). Valid values: `L0`, `L1`, `L2`, `L3`. See [Output Format Validation](#output-format-validation) section for details. When set to `L0` (default), no format validation is required. When set to `L1` or higher, format specifications are required in instructions and validation errors will reject requests.
210
- 4. **Request-Level** - Override gateway defaults per request
211
-
212
- ### 1. Logging Configuration (@x12i/logxer)
213
-
214
- **Logger initialization:** The gateway uses **`@x12i/logxer`**. Pass a **`Logxer`** from **`createLogxer`**, or omit `logger` and let the gateway build a default (still Logxer-based). See **[Logger Initialization Guide](./docs/LOGGER_INITIALIZATION.md)** for mandatory **`identity`** on requests and **`LogMeta`** / **`debugKind`** usage.
215
-
216
- **Where to configure:**
217
- - Gateway constructor: `enableLogging`, `packageName`, `logger`
218
- - Environment variables: **`{PREFIX}_LOGS_LEVEL`** (canonical per-package level), legacy **`{PREFIX}_LOG_LEVEL`**, plus **`{PREFIX}_LOG_FORMAT`**, file sinks, unified logger, etc., as documented for **`@x12i/logxer`**.
219
-
220
- **How to configure:**
221
-
222
- ```typescript
223
- import { AIGateway } from '@x12i/ai-gateway';
224
- import { createLogxer } from '@x12i/logxer';
225
-
226
- // Create logger once at application startup
227
- const logger = createLogxer(
228
- { packageName: 'MY_APP', envPrefix: 'MY_APP', debugNamespace: 'my-app' },
229
- {
230
- logLevel: 'info', // verbose|debug|info|warn|error
231
- logFormat: 'json', // text|json|yaml|table
232
- logToFile: true,
233
- logFilePath: '/var/log/app.log',
234
- enableUnifiedLogger: true,
235
- unifiedLogger: {
236
- transports: { papertrail: true },
237
- service: 'my-app',
238
- env: 'production'
239
- },
240
- runtimeIdentity: {
241
- service: 'my-app',
242
- env: process.env.NODE_ENV,
243
- version: process.env.npm_package_version
244
- }
245
- }
246
- );
247
-
248
- const gateway = new AIGateway({
249
- enableLogging: true,
250
- logger,
251
- packageName: 'MY_APP'
252
- });
253
- ```
254
-
255
- **Why use the same Logxer?**
256
- - Consistent log format across your application
257
- - Unified logging destination
258
- - **`LogMeta`** correlation (`jobId`, `sessionId`, `correlationId`, `debugKind`, `runtimeIdentity`) matches gateway and Activix fields
259
- - Single point of control for log levels
260
-
261
- **Per-package log level (`@x12i/logxer`):**
262
-
263
- - **Canonical:** `{PREFIX}_LOGS_LEVEL` — same `envPrefix` as **`createLogxer`** / your `packageName` (e.g. `MY_APP` → `MY_APP_LOGS_LEVEL`).
264
- - **Legacy:** `{PREFIX}_LOG_LEVEL` is used only if `{PREFIX}_LOGS_LEVEL` is **not** set.
265
- - **Default** when both are unset: **`warn`** (not `info`, not silent).
266
- - **Silence** this package’s diagnostics: `off`, `none`, or `silent` (case-insensitive).
267
- - **Values:** `off` \| `none` \| `silent` \| `error` \| `warn` \| `info` \| `debug` \| `verbose`.
268
-
269
- If the gateway builds the default logger and you omit `packageName`, the prefix is **`AI_GATEWAY`** → e.g. **`AI_GATEWAY_LOGS_LEVEL`**.
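The precedence described in the removed text above (canonical variable, then legacy variable, then `warn`) can be sketched as follows. This is an illustrative re-implementation, not the `@x12i/logxer` source: `resolveLogLevel` and `isSilenced` are hypothetical names, and two behaviors are assumptions: matching is case-insensitive for every level (the text only states it for the silence values), and unrecognized values fall back to `warn`.

```typescript
// Illustrative only; not taken from @x12i/logxer.
const LEVELS = ['off', 'none', 'silent', 'error', 'warn', 'info', 'debug', 'verbose'] as const;
type Level = (typeof LEVELS)[number];

function resolveLogLevel(
  env: Record<string, string | undefined>,
  prefix = 'AI_GATEWAY' // documented default prefix when packageName is omitted
): Level {
  // The canonical variable wins; the legacy one is read only when it is unset.
  const raw = env[`${prefix}_LOGS_LEVEL`] ?? env[`${prefix}_LOG_LEVEL`];
  if (raw === undefined) return 'warn'; // documented default
  const normalized = raw.trim().toLowerCase();
  // Assumption: unknown values fall back to the default level.
  return (LEVELS as readonly string[]).includes(normalized) ? (normalized as Level) : 'warn';
}

// 'off', 'none', and 'silent' all silence the package's diagnostics.
function isSilenced(level: Level): boolean {
  return level === 'off' || level === 'none' || level === 'silent';
}
```

In a Node process this would typically be called as `resolveLogLevel(process.env, 'MY_APP')`.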
270
-
271
- **Environment variables (examples):**
272
- ```bash
273
- MY_APP_LOGS_LEVEL=info # raise verbosity (or use debug / verbose)
274
- MY_APP_LOGS_LEVEL=off # silence this package’s logs
275
- # MY_APP_LOG_LEVEL=info # legacy; ignored if MY_APP_LOGS_LEVEL is set
276
-
277
- MY_APP_LOG_FORMAT=json # text|json|yaml|table
278
- MY_APP_LOG_TO_FILE=true
279
- MY_APP_LOG_FILE=/var/log/app.log
280
- MY_APP_LOG_TO_UNIFIED=true
281
- DEBUG=my-app # elevates verbose/debug when package is not fully silent
282
- ```
283
-
284
- **Diagnostic logging:** When debug level is enabled, the gateway emits diagnostic logs for instruction resolution and propagation through the same Logxer pipeline.
285
-
286
- **What happens if `logger` is not provided:** The gateway creates a default **`Logxer`** via **`createLogxer`**. For production, prefer passing your app’s logger so levels, transports, and **`runtimeIdentity`** stay aligned with the rest of your stack.
287
-
288
- ### 2. Activity Tracking Configuration (xronox-activitix via @x12i/activix v7)
289
-
290
- **Activix version:** This gateway targets **`@x12i/activix` v7.x** (built on `@x12i/xronox-store`). The dependency range is declared in `package.json` (currently `^7.1.0`). Activity I/O is stored at the **document root** as **`outer`** (and optional **`inner`**); the deprecated nested **`structure`** wrapper is not used.
291
-
292
- **Where to configure:**
293
- - Gateway constructor: `enableActivityTracking`, `activityTracker`
294
- - **Environment variables** (auto-configured when no custom tracker provided):
295
- - `MONGO_URI` (required) - MongoDB connection string
296
- - `MONGO_LOGS_DB` or `MONGO_DB` (required) - Database name
297
-
298
- **✅ Centralized configuration**
299
-
300
- The gateway reads **Mongo connection** settings from the environment, but **collection names are package constants**: they are defined only in `src/config/activity-tracking-config.ts` and **cannot** be overridden by env vars. If you pass your own **`Activix`** instance, register collections whose **`name`** values match these strings exactly or persistence and dashboards will disagree.
301
-
302
- | Mongo collection | Typical `activityType` | What gets stored |
303
- |------------------|------------------------|------------------|
304
- | **`ai-actions`** | `gateway-invocation` | Normal LLM runs after validation: full request/config snapshots, lifecycle, **`runContext`**, and Activix **`outer`** I/O for each gateway invocation. |
305
- | **`skill-executions`** | `skill-execution` | Skill-specific lifecycle rows when skill execution tracking is used (separate from the main gateway row). |
306
- | **`bad-requests`** | `bad-request` | Failures **before** `startActivity` (validation, configuration, format extraction, etc.). |
307
-
308
- **What the gateway sends into Activix (lifecycle)**
309
-
310
- `ActivityManager` drives **`@x12i/activix` v7** with a **two-phase** API:
311
-
312
- 1. **`startRecord`** — Inserts a new document with **`status: 'started'`**, **`startTime`**, **`runContext`** (same object as **`request.identity`**), root **`request`** / **`config`** snapshots, gateway metadata (e.g. **`activityType`**, **`aiRequestId`**), and the initial **`outer`** fragment (see below). Activix returns **`activityId`** (prefix **`act-`**, configured as the collection **`primaryKey`**); that id is used for all later updates — **not** `jobId`.
313
- 2. **`completeRecord`** or **`failRecord`** — Patches the **same** document by **`activityId`**. On success, adds **`response`**, **`endTime`**, **`duration`**, root **`cost`** / **`costUsd`** / **`costStatus`**, sets **`outer.output`** to the completion payload, merges billing into **`outer.metadata`**, and when priced or unpriced with usage, sets Activix **`outer.cost`** (`usd`, `tokens`, `provider`, `model`, optional `details`). Failure adds error details (and may attach **`outer.output`** for certain failure modes such as response parsing).
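The two-phase lifecycle described above can be modeled in memory to show how one document is created and then patched by its `activityId`. This sketch does not call `@x12i/activix`: `InMemoryActivities` is a hypothetical class, the status values `'completed'` and `'failed'` are assumptions (the text only names `'started'`), and real documents carry many more fields.

```typescript
// In-memory model of the start/complete/fail lifecycle; it does not use @x12i/activix.
interface ActivityDoc {
  activityId: string;
  status: 'started' | 'completed' | 'failed'; // last two values are assumed
  startTime: number;
  endTime?: number;
  duration?: number;
  runContext: Record<string, unknown>;
  outer: { input: unknown; output?: unknown };
  error?: string;
}

class InMemoryActivities {
  private docs = new Map<string, ActivityDoc>();
  private counter = 0;

  // Phase 1: insert a new document and hand back its activityId.
  startRecord(runContext: Record<string, unknown>, input: unknown, now = Date.now()): string {
    const activityId = `act-${++this.counter}`;
    this.docs.set(activityId, {
      activityId, status: 'started', startTime: now, runContext, outer: { input },
    });
    return activityId;
  }

  // Phase 2 (success): patch the same document by activityId.
  completeRecord(activityId: string, output: unknown, now = Date.now()): ActivityDoc {
    const doc = this.mustGet(activityId);
    doc.status = 'completed';
    doc.endTime = now;
    doc.duration = now - doc.startTime;
    doc.outer.output = output;
    return doc;
  }

  // Phase 2 (failure): patch the same document with error details.
  failRecord(activityId: string, error: string, now = Date.now()): ActivityDoc {
    const doc = this.mustGet(activityId);
    doc.status = 'failed';
    doc.endTime = now;
    doc.duration = now - doc.startTime;
    doc.error = error;
    return doc;
  }

  private mustGet(activityId: string): ActivityDoc {
    const doc = this.docs.get(activityId);
    if (!doc) throw new Error(`Unknown activityId: ${activityId}`);
    return doc;
  }
}
```

The point of the model is that later updates are keyed by the returned `activityId`, never by `jobId`, so several activities can share one job without overwriting each other.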
314
-
315
- **How a document is shaped (reading `ai-actions` in Mongo)**
316
-
317
- - **`runContext`**: Canonical correlation BSON object — the merged gateway **`identity`** (`jobId`, `taskId`, `sessionId`, `instance`, `aiRequestId`, optional graph/skill linkage, `actionType` / `actionRef` on **`invoke()`**, plus any extra keys your upstream put on **`identity`**).
318
- - **Root-level copies** of common identity fields may appear beside **`runContext`** for convenient indexing; treat **`runContext`** as the full envelope when in doubt.
319
- - **`request`**: Structured snapshot only — **`raw`** / **`parsed`** instructions, context, prompt; **`messages`**; **`workingMemory`** (template/user payload). There is **no** separate legacy **`input`** field on this object; use **`workingMemory`**.
320
- - **`config`**: `model`, `provider`, `temperature`, `maxTokens`, **`rawConfig`** (exact router config).
321
- - **`outer`**: Activix v7 **validated I/O** at the document root. At **start**, **`outer.input`** contains **`activityType`** and the same **`request`** snapshot as root **`request`** when a body exists (`{ activityType, request }`). At **success**, **`outer.output`** matches the **`response`** object written on completion; **`outer.metadata`** mirrors routing and billing from **`response.metadata`** (`modelUsed`, `provider`, `cost`, `costUsd`, `costStatus`, optional `costBreakdown`); **`outer.cost`** holds the canonical Activix cost object when usage or price is known (see [Cost reporting](#cost-reporting-invoke-response--activix-run-analysis-g8) below). Root **`request`** / **`response`** support querying and older tooling; **`outer`** satisfies Activix’s envelope — so the same logical request snapshot can appear both at **`request`** and under **`outer.input.request`** by design. Large provider blobs (**`response.content.fullResponse`**) and size limits are described in [Activities outer duplication & payload controls](./docs/ACTIVITIES_OUTER_DUPLICATION.md).
322
-
323
- **Environment variable priority (Activix / Mongo — implemented in `@x12i/activix`, not in `activity-tracking-config.ts`):**
324
- - **Mongo URI**: `MONGO_LOGS_URI` if set, otherwise **`MONGO_URI`**. If neither is set, Activix cannot use the database.
325
- - **Mongo database name** (where collections such as **`ai-actions`** live): `ACTIVIX_DB_NAME` → `MONGO_AI_LOGS_DB` → **`MONGO_LOGS_DB`** → **`MONGO_DB`** → default **`activitix`** if none are set.
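The lookup order above can be expressed as a short helper, which is useful when checking which database Activix would pick in a given environment. This is a sketch of the documented order only: `firstSet` and `resolveActivixMongo` are hypothetical names, and treating an empty string as "not set" is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Return the first variable in `names` that has a non-empty value.
// Assumption: empty strings count as unset.
function firstSet(env: Env, names: string[]): string | undefined {
  for (const name of names) {
    const value = env[name];
    if (value !== undefined && value !== '') return value;
  }
  return undefined;
}

function resolveActivixMongo(env: Env): { uri: string | undefined; dbName: string } {
  // URI: MONGO_LOGS_URI first, then MONGO_URI; undefined means no database backend.
  const uri = firstSet(env, ['MONGO_LOGS_URI', 'MONGO_URI']);
  // Database name: four variables in order, then the documented default.
  const dbName =
    firstSet(env, ['ACTIVIX_DB_NAME', 'MONGO_AI_LOGS_DB', 'MONGO_LOGS_DB', 'MONGO_DB']) ??
    'activitix';
  return { uri, dbName };
}
```

Calling `resolveActivixMongo(process.env)` before constructing the gateway shows whether the expected URI and database name are visible at that point in startup.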
326
-
327
- **Why you might not see `ai-actions` in Mongo**
328
-
329
- 1. **`storageMode: 'automatic'` (gateway default)** — On startup Activix **pings** Mongo with the URI above. If the URI is missing or the ping fails, it **silently falls back** to filesystem storage under **`./playground/`** (see `playground/collections/ai-actions/`). No Mongo collection is created in that mode. Check logs for **`Activity tracking persistence backend ready`**: `storageBackend` must be **`database`** for Mongo.
330
- 2. **Lazy collections** — MongoDB normally creates a collection on the **first insert**. Until at least one activity **`startRecord`** succeeds against Mongo, **`ai-actions`** may not exist.
331
- 3. **Wrong database in Compass / shell** — If you did not set `MONGO_LOGS_DB` / `MONGO_DB`, Activix uses the default database **`activitix`**, not your application’s primary DB.
332
- 4. **`.env` load order** — Environment variables must be available **before** `new AIGateway(...)`. If another package constructs the gateway before `dotenv.config()`, Activix may never see `MONGO_URI`.
333
-
334
- **⚠️ CRITICAL: correlation and identity**
335
-
336
- - **`aiRequestId`** (required on each gateway request): Primary correlation id for this LLM call; the gateway does **not** invent a `jobId` for you.
337
- - **Run context** (Activix BSON field `runContext`): Same object as **`request.identity`** (including required upstream **`jobId`** and **`taskId`**), plus `sessionId` and nested `instance: { instanceId, type }` when present; see [Identity contract](./docs/IDENTITY_OBJECT_CONTRACT.md).
338
- - **`jobTypeId`**, **`taskTypeId`**: Optional aggregation / grouping fields (unchanged semantics).
339
- - **Each activity**: Gets its own **unique database record** with unique `_id` (MongoDB ObjectId).
340
- - **Two-phase tracking**: `startActivity()` creates a new record; `logSuccess()` / `logFailure()` update the same record by that record’s id.
341
- - **Payload shape**: Root **`request`** and **`outer.input.request`** both snapshot the same gateway request (Activix `outer` envelope). Large provider blobs on completion live under **`content.fullResponse`** unless you opt out — see [Activities outer duplication & payload controls](./docs/ACTIVITIES_OUTER_DUPLICATION.md).
342
-
343
- **Runtime objects observability (debug only):**
344
-
345
- `@x12i/ai-gateway` exports `runtimeObjects` for runtime diagnostics. This package is a leaf runtime package, so `runtimeObjects?.packagesRuntimeObjects` is always `[]`.
346
-
347
- Runtime objects are available only in debug mode:
348
-
349
- ```env
350
- mode=debug
351
- ```
352
-
353
- `debug` is the default when `mode` is omitted. In production, use:
354
-
355
- ```env
356
- mode=prod
357
- ```
358
-
359
- When `mode=prod`, `runtimeObjects` is `undefined`.
360
-
361
- ```typescript
362
- import { runtimeObjects } from '@x12i/ai-gateway';
363
-
364
- const activities = await runtimeObjects?.activixClient?.getJobActivities({ jobId });
365
- const logs = await runtimeObjects?.logxerClient?.getJobLogs({ jobId });
366
- ```
367
-
368
- The gateway only exposes official queryable clients. It exposes `activixClient` only when the effective Activix client already implements `getJobActivities()`, and `logxerClient` only when the effective Logxer client already implements `getJobLogs()`. The gateway does not query Mongo, Logxer storage, or private package internals to emulate missing query APIs.
369
-
370
- See [Runtime Objects Observability Methodology](./docs/RUNTIME_OBJECTS_OBSERVABILITY.md) for the reusable package-level contract.
371
-
372
- ### Model catalog resolution and defaults (`@x12i/ai-tools`)
373
-
374
- Before each invoke, the gateway can normalize caller `config.model` / `modelConfig` via the **ai-models** Catalox catalog (`@x12i/ai-tools`). After invoke, when the router leaves cost **unpriced**, the gateway may compute USD from the same catalog.
375
-
376
- **Environment variables:**
377
-
378
- | Variable | Purpose |
379
- |----------|---------|
380
- | `AI_GATEWAY_DEFAULT_MODEL` | Default model when none is provided, or when resolution fails in **`mode=prod`**. Supports `provider/model` (e.g. `openrouter/openai/gpt-5-nano`) or a bare model id. |
381
- | `mode` / `MODE` | `prod` — unresolved models fall back to the default chain (with **Logxer `warn`**). `dev` / `debug` / omitted — unresolved models throw **`ModelResolutionError`**. |
382
-
383
- **Default model priority** (prod fallback only): `AI_GATEWAY_DEFAULT_MODEL` → `src/defaults/model-config.json` `defaultModel` → code constant `gpt-5-nano`.
384
-
385
- **Logxer warnings** on default substitution include structured fields: `reason` (`no_model_provided`, `model_resolution_failed`, `ai_tools_unavailable`), `defaultSource` (`env`, `model-config.json`, `code`), `originalModel`, `defaultModel`, and `mode`.
386
-
387
- Catalox/Firebase credentials are required for catalog bootstrap (same as `@x12i/ai-tools` — see that package’s README). Disable with `aiTools: { enabled: false }` on `GatewayConfig`, or inject `aiTools.catalox` for tests.
388
-
389
- **GatewayConfig (optional overrides):**
390
-
391
- ```typescript
392
- const gateway = new AIGateway({
393
- mode: 'prod', // or 'dev' | 'debug' — overrides process.env.mode
394
- aiTools: {
395
- enabled: true,
396
- resolveModels: true,
397
- calculateCost: true,
398
- costIncludeBreakdown: false,
399
- cacheTtlMs: 60_000,
400
- // catalox: injectedCataloxInstance,
401
- },
82
+ config: { model: 'openai/gpt-4o-mini', provider: 'openrouter' }
402
83
  });
403
- ```
404
-
405
- #### Cost reporting (invoke response + Activix, Run Analysis G8)
406
-
407
- Billing is resolved once per successful **`invoke()`** / **`invokeChat()`** via **`resolveCostCompletionWithAiTools`** (see [`docs/AI_GATEWAY_INVOKE_EXECUTION_METADATA.md`](./docs/AI_GATEWAY_INVOKE_EXECUTION_METADATA.md)):
408
-
409
- | Layer | Fields |
410
- |--------|--------|
411
- | **Router** (`@x12i/ai-providers-router`) | Preferred source: **`metadata.costStatus`** (`priced` \| `unpriced`), **`metadata.costUsd`** / **`metadata.cost`** when priced |
412
- | **Gateway response** | Same slice on **`response.metadata`**: **`costStatus`**, **`costUsd`**, **`cost`**, optional **`costBreakdown`** (when **`aiTools.calculateCost`** and catalog pricing apply and the router left cost unpriced) |
413
- | **Activix activity (on `logSuccess`)** | Root **`cost`**, **`costUsd`**, **`costStatus`**; **`outer.metadata`** mirror; **`outer.cost`** (`usd`, `tokens` with `input`/`output`/`total`, `provider`, `model`, `details.costStatus`, optional `details.costBreakdown`) |
414
-
415
- **`costStatus` semantics:**
416
84
 
417
- - **`priced`** — **`costUsd`** / **`cost`** is a finite USD amount for this call (from the router or from **`@x12i/ai-tools`** catalog **`CostCalculator`** when the router did not price).
418
- - **`unpriced`** — Token usage was recorded but no authoritative USD price was available (explicit router **`unpriced`** is never overridden by catalog).
419
- - Omitted — No non-zero token usage (no billing signal).
420
-
421
- Requires **`enableActivityTracking: true`** (default when Mongo/env is configured) for Activix persistence; invoke metadata is always set on the gateway response regardless.
422
-
423
- **Tests before release:**
424
-
425
- ```bash
426
- npm run build
427
- npm test # integration (tsx)
428
- npm run test:ai-tools # unit: mode, defaults, cost helper
429
- npm run test:live # LIVE: catalog + invoke (needs .env + Firebase + LLM key)
430
- npm run test:real:comprehensive # optional: compiled real router matrix + npm test
85
+ console.log(response.content, response.metadata?.costUsd, response.metadata?.tokens);
431
86
  ```
432
87
 
433
- See [`.env.example`](./.env.example) for `AI_GATEWAY_DEFAULT_MODEL`, `mode`, provider keys, and Firebase/Catalox variables.
434
-
435
- **Recommended (auto-configured from environment variables):**
436
-
437
- ```typescript
438
- import { AIGateway } from '@x12i/ai-gateway';
439
- import { OpenAIProvider } from '@x12i/ai-provider-openai';
440
-
441
- // Set environment variables (before creating the gateway):
442
- // MONGO_URI=mongodb://localhost:27017
443
- // MONGO_LOGS_DB=my-logs-db # optional; default DB name for Activix is "activitix"
444
-
445
- const gateway = new AIGateway({
446
- enableActivityTracking: true, // default: true
447
- // Activix is auto-configured; writes use collections ai-actions / bad-requests / skill-executions
448
- });
88
+ ### Providers without manual `register()`
449
89
 
450
- gateway.register(new OpenAIProvider({
451
- apiKey: process.env.OPENAI_API_KEY
452
- }));
453
- ```
90
+ - **OpenRouter:** Set `OPEN_ROUTER_KEY` or `OPENROUTER_API_KEY` (unless `USE_OPENROUTER=false`). The gateway can lazy-register on first invoke.
91
+ - **Direct providers:** Set `OPENAI_API_KEY`, `GROK_API_KEY`, etc. Same lazy registration.
454
92
 
455
- **Advanced (custom Activix v7 instance):**
93
+ Load `.env` before constructing the gateway if another package creates it first.
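
The environment rules in the two bullets above can be sketched as a small standalone function. This is an illustration only, not the gateway's code: `detectLazyProviders` and the returned provider names are hypothetical, and the sketch covers just the keys named in the text (the "etc." providers are left out).

```typescript
// Hypothetical helper, NOT exported by @x12i/ai-gateway.
// It mirrors the documented env rules for lazy provider registration.
type Env = Record<string, string | undefined>;

function detectLazyProviders(env: Env): string[] {
  const providers: string[] = [];

  // Either variable name is accepted for the OpenRouter key.
  const openRouterKey = env.OPEN_ROUTER_KEY || env.OPENROUTER_API_KEY;

  // OpenRouter is skipped only when USE_OPENROUTER is explicitly "false".
  if (openRouterKey && env.USE_OPENROUTER !== 'false') {
    providers.push('openrouter');
  }

  // Direct providers: one key per provider (names assumed for illustration).
  if (env.OPENAI_API_KEY) providers.push('openai');
  if (env.GROK_API_KEY) providers.push('grok');

  return providers;
}
```

In an application this would be called with `process.env` after `.env` has been loaded, which is why load order matters when another package constructs the gateway.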
456
94
 
457
- If you pass your own `Activix`, configure **the same collection names** the gateway expects so routing matches persistence:
95
+ ### Base router only
458
96
 
459
97
  ```typescript
460
- import { AIGateway } from '@x12i/ai-gateway';
461
- import { Activix } from '@x12i/activix';
462
-
463
- const statusValues = {
464
- started: 'started',
465
- inProgress: 'in_progress',
466
- completed: 'success',
467
- failed: 'failed',
468
- timeout: 'timeout'
469
- };
470
-
471
- const activityTracker = new Activix({
472
- collections: [
473
- { name: 'ai-actions', statusValues },
474
- { name: 'skill-executions', statusValues },
475
- { name: 'bad-requests', statusValues }
476
- ]
477
- });
98
+ import { LLMProviderRouter } from '@x12i/ai-gateway';
478
99
 
479
- const gateway = new AIGateway({
480
- enableActivityTracking: true, // default: true
481
- activityTracker, // plug in custom tracker
482
- });
100
+ const router = new LLMProviderRouter({ defaultProvider: 'openai' });
101
+ // register providers, then router.invoke({ messages: [...] })
483
102
  ```
484
103
 
485
- When the gateway constructs Activix internally, each collection uses **`primaryKey: 'activityId'`** and **`primaryKeyPrefix: 'act-'`**. If you supply a custom **`Activix`**, align with that (or your **`activityId`** values will not match operational tooling).
486
-
487
- **What gets tracked (persisted when DB is configured):**
488
- - **Identity**: Fields aligned with **`request.identity`** / Activix **`runContext`**: **`aiRequestId`**, upstream **`jobId`** and **`taskId`**, `sessionId`, `instance`, plus optional `jobTypeId`, `agentId`, `taskTypeId`, etc., as provided
489
- - **Timing**: `startTime`, `endTime`, `duration`, `status` (`started|success|failed`)
490
- - **Request data**: Stored in **`request`** (raw/parsed prompts, **`messages`**, **`workingMemory`**) and mirrored under **`outer.input.request`** when Activix **`outer`** is populated — see table above
491
- - **Config data**: Stored in **`config`** (model, provider, temperature, maxTokens, **`rawConfig`**)
492
- - **Response data**: Stored in **`response`** on completion (content, metadata, optional **`fullResponse`** per diagnostics)
493
- - **Activix I/O**: Root **`outer`** — **`outer.input`** at start, **`outer.output`** on success (and some failure paths)
494
- - **Cost / billing**: On success, root **`cost`**, **`costUsd`**, **`costStatus`**, plus **`outer.metadata`** and **`outer.cost`** (same values as **`response.metadata`** from the invoke path — router passthrough or catalog pricing via **`@x12i/ai-tools`**)
495
-
496
- **Best Practices for Type IDs:**
497
- - **`jobTypeId`**: Use MD5 hash of your job type string (e.g., `MD5('data-processing-job')`) for consistent job-level aggregation
498
- - **`taskTypeId`**: Use MD5 hash of your task/instruction text (e.g., `MD5('What is the capital of France?')`) for consistent task-level aggregation
499
- - If `taskTypeId` is not provided, it's auto-generated from the pre-parsed instructions MD5 hash
500
- - Same type = same hash = easy aggregation and tracking across multiple jobs/tasks
501
-
502
- **Key design points:**
503
- - ✅ Each activity = separate database record with unique `_id`
504
- - ✅ **`aiRequestId`** = per-request correlation (required); **`jobId`** / **`taskId`** come from upstream **`identity`** (required on each request; see v9+ contract above)
505
- - ✅ Request data sent once in `startActivity()` (creates new record)
506
- - ✅ Response data sent once in `logSuccess()` (updates same record by `_id`)
507
-
508
- **Default:** Activity tracking is enabled by default; without DB config it will log but not persist.
104
+ ---
509
105
 
510
- **✅ Activix v7 integration**
106
+ ## Configuration
511
107
 
512
- 1. **Configuration** (`activity-tracking-config.ts`):
513
- - Mongo connection from env; **collection names** `ai-actions`, `bad-requests`, and `skill-executions` are **fixed literals** (not env-driven) for consistency across deployments.
514
-
515
- 2. **Lifecycle** (`@x12i/activix` v7):
516
- - ✅ `startRecord` / `completeRecord` / `failRecord` (two-phase lifecycle)
517
- - ✅ Status transitions: `started` → `success` or `failed` (per your `statusValues` mapping)
518
- - ✅ Persistence via xronox-store queue semantics (see Activix package docs)
519
-
520
- 3. **Testing** (ai-gateway):
521
- - Standalone test available (`npm run test:activities:standalone`) that bypasses config parsing issues
522
- - Tests activity lifecycle end-to-end: creation → completion → database persistence
523
- - See `.tests/TESTING_GUIDE.md` for complete testing documentation
524
-
525
- **See**: `.reports/new/ACTIVITY_LIFECYCLE_IMPROVEMENTS_VERIFICATION_REPORT.md` for complete verification details.
526
-
527
- #### Skill Execution Tracking
528
-
529
- When executing skills (instruction keys starting with `skills/`), the gateway automatically tracks skill executions separately from gateway invocations. Skill executions are stored in the `skill-executions` collection and support parent-child relationships.
530
-
531
- **Required Fields:**
532
- - `instructions`: Skill instruction key (e.g., `skills/professional-answer`)
533
- - `inferenceType`: Recommended - type of inference (e.g., `question-answer`, `classification`)
534
-
535
- **Optional Fields:**
536
- - `masterSkillActivityId`: Parent skill activity ID (when a skill calls another skill)
537
- - `skillId`: Skill identifier (auto-detected from instruction key, can be overridden)
538
- - `masterSkillId`: Parent skill identifier (when a skill calls another skill)
539
-
540
- **Example: Basic Skill Execution**
541
-
542
- ```typescript
543
- const aiRequestId = 'skill-pa-1';
544
- const response = await gateway.invoke({
545
- aiRequestId,
546
- agentId: 'my-agent',
547
- actionType: 'skill',
548
- actionRef: 'skills/professional-answer',
549
- instructions: 'skills/professional-answer',
550
- prompt: '{{question}}',
551
- workingMemory: { question: 'What is...' },
552
- inferenceType: 'question-answer', // Recommended
553
- skillId: 'skills/professional-answer', // Optional, auto-detected from instructions
554
- primaryObjectType: 'professional-answer',
555
- identity: {
556
- sessionId: 's1',
557
- instance: { instanceId: 'my-agent', type: 'test' },
558
- aiRequestId,
559
- jobId: 'job-123',
560
- taskId: 'task-1',
561
- agentId: 'my-agent'
562
- },
563
- config: { model: 'gpt-4o', provider: 'openai' }
564
- });
108
+ ### Gateway constructor (common flags)
565
109
 
566
- // Get the activity ID for linking child skills
567
- const activityId = response.metadata?.activityId;
568
- ```
110
+ | Option | Default | Purpose |
111
+ |--------|---------|---------|
112
+ | `enableLogging` | `true` | Logxer pipeline |
113
+ | `logger` | built-in | Pass your app `createLogxer()` instance |
114
+ | `enableActivityTracking` | `true` | Activix persistence (needs Mongo env when no `activityTracker`) |
115
+ | `activityTracker` | — | Custom `Activix` instance (collection names must still match package constants) |
116
+ | `enableUsageTracking` | `true` | In-process usage tier helper |
117
+ | `aiTools` | see below | Model resolution + catalog pricing |
118
+ | `mode` | `'debug'` | `'dev'` \| `'debug'` \| `'prod'` — affects strict model resolution |
119
+ | `diagnostics` | — | `{ mode: 'trace' }` for rich `metadata.attempts` / `metadata.usage` |
120
+ | `retry` / `rateLimit` | from `defaults/model-config.json` | Router retry and between-call spacing |
569
121
 
570
- **Example: Nested Skill Execution (Skill Calling Another Skill)**
122
+ Defaults load from `defaults/model-config.json`, `instructions-blocks.json`, and `template-rendering.json` (copied into `dist/` on build).
571
123
 
572
- ```typescript
573
- const parentAiRequestId = 'skill-parent-1';
574
- // Parent skill execution
575
- const parentResponse = await gateway.invoke({
576
- aiRequestId: parentAiRequestId,
577
- agentId: 'my-agent',
578
- actionType: 'skill',
579
- actionRef: 'skills/parent-skill',
580
- instructions: 'skills/parent-skill',
581
- prompt: '{{input}}',
582
- workingMemory: { input: '…' },
583
- inferenceType: 'analysis',
584
- skillId: 'skills/parent-skill',
585
- identity: {
586
- sessionId: 's1',
587
- instance: { instanceId: 'my-agent', type: 'test' },
588
- aiRequestId: parentAiRequestId,
589
- jobId: 'job-123',
590
- taskId: 'task-parent',
591
- agentId: 'my-agent'
592
- },
593
- config: { model: 'gpt-4o', provider: 'openai' }
594
- });
124
+ ### Environment (selected)
595
125
 
596
- const childAiRequestId = 'skill-child-1';
597
- // When parent skill calls child skill, pass parent's activityId and skillId
598
- const childResponse = await gateway.invoke({
599
- aiRequestId: childAiRequestId,
600
- agentId: 'my-agent',
601
- actionType: 'skill',
602
- actionRef: 'skills/child-skill',
603
- instructions: 'skills/child-skill',
604
- prompt: '{{input}}',
605
- workingMemory: { input: '…' },
606
- inferenceType: 'question-answer',
607
- skillId: 'skills/child-skill',
608
- masterSkillActivityId: parentResponse.metadata?.activityId, // ✅ Parent's activity ID
609
- masterSkillId: 'skills/parent-skill', // ✅ Parent's skill ID
610
- identity: {
611
- sessionId: 's1',
612
- instance: { instanceId: 'my-agent', type: 'test' },
613
- aiRequestId: childAiRequestId,
614
- jobId: 'job-123', // ✅ Same jobId (links activities)
615
- taskId: 'task-child',
616
- agentId: 'my-agent'
617
- },
618
- config: { model: 'gpt-4o', provider: 'openai' }
619
- });
620
- ```
126
+ | Variable | Role |
127
+ |----------|------|
128
+ | `MONGO_URI`, `MONGO_LOGS_DB` / `MONGO_DB` | Activix when no custom tracker |
129
+ | `AI_GATEWAY_DEFAULT_MODEL` | Default model slug (`provider/model` or OpenRouter id) |
130
+ | `mode` / `MODE` | Operational mode (`dev`, `debug`, `prod`) |
131
+ | `AI_GATEWAY_LOGS_LEVEL` | Log level when using default logger (`AI_GATEWAY` prefix) |
132
+ | `FLEX_MD_MIN_COMPLIANCE_LEVEL` | `L0`–`L3` output-format validation (default `L0`) |
133
+ | Provider API keys | OpenRouter, OpenAI, etc. |
621
134
 
622
- **Automatic Tracking:**
135
+ Logging details: [Logger initialization](./docs/LOGGER_INITIALIZATION.md).
623
136
 
624
- The gateway automatically:
625
- - Detects skill executions from instruction keys starting with `skills/`
626
- - Connects to instruction metadata (key, version) from content resolution
627
- - Routes to `skill-executions` collection (ActivityManager + Activix handle routing)
628
- - Returns `activityId` in response metadata for linking child skills
629
- - Supports parent-child skill relationships via `masterSkillActivityId` and `masterSkillId`
137
+ ---
630
138
 
631
- **For detailed integration guides, see:**
632
- - [Skill Execution Client Integration Guide](./docs/SKILL_EXECUTION_CLIENT_INTEGRATION.md)
633
- - [AI Gateway Integration Guide: Skill Requests](./docs/AI_GATEWAY_INTEGRATION_SKILL_REQUESTS.md)
634
- - [Extending AI Activities with Skill Fields](./docs/EXTENDING_AI_ACTIVITIES_WITH_SKILL_FIELDS.md)
139
+ ## @x12i/ai-tools v2 (models + cost)
635
140
 
636
- ### 3. Usage Tracking Configuration (x-models)
141
+ - **No Catalox / Firestore** — catalogs come from ai-tools open-assets JSON (optional `bundledOnly`).
142
+ - **`aiTools.enabled`** — bootstrap catalog client + calculator.
143
+ - **`aiTools.resolveModels`** — `mergeConfig()` resolves model ids (strict in **`mode: 'dev'`**).
144
+ - **`aiTools.calculateCost`** — prices usage before Activix `completeRecord` when the router did not mark the call priced.
637
145
 
638
- **Where to configure:**
639
- - Gateway constructor: `enableUsageTracking`, `usageTier`
640
- - JSON defaults: `src/defaults/model-config.json` (not used for usage tracking)
146
+ Gateway helpers (also exported): `resolveCostCompletionWithAiTools`, `buildTraceUsageSummary`, `enrichTraceAttemptsWithBilling`.
641
147
 
642
- **How to configure:**
148
+ ---
643
149
 
644
- ```typescript
645
- const gateway = new AIGateway({
646
- enableUsageTracking: true, // Default: true
647
- usageTier: 'tier-3' // RPM/TPM limits: 'tier-1' | 'tier-2' | 'tier-3'
648
- });
649
- ```
150
+ ## Activity tracking (@x12i/activix 7.2)
650
151
 
651
- **Note:** If `@x12i/x-models` is not available or has export issues, usage tracking will gracefully degrade with warnings logged.
152
+ When tracking is enabled and no custom tracker is supplied, the gateway constructs Activix with fixed collection names (see `src/config/activity-tracking-config.ts`):
652
153
 
653
- **Default:** Usage tracking is enabled by default with `usageTier: 'tier-3'`
154
+ | Collection | Typical use |
155
+ |------------|-------------|
156
+ | `ai-actions` | Normal gateway invocations |
157
+ | `bad-requests` | Validation / pre-start failures |
158
+ | `skill-executions` | Skill-specific rows |
654
159
 
655
- ### 4. Default Model and Engine Configuration
160
+ **Lifecycle:** `startRecord` → `completeRecord` / `failRecord` keyed by **`activityId`** (not `jobId`).
656
161
 
657
- **Where to configure:**
658
- 1. **JSON Defaults** (lowest priority): `src/defaults/model-config.json`
659
- 2. **Gateway Constructor** (medium priority): `defaultModel`, `defaultEngine`
660
- 3. **Request Config** (highest priority): `request.config.model`, `request.config.provider`
162
+ **Successful completion** (no duplicate billing on `outer.metadata`):
661
163
 
662
- **How to configure:**
164
+ - Root: `cost`, `costUsd`, `costStatus`, **`metadata`** (routing + billing mirror for Activix 7.x)
165
+ - `outer.metadata`: routing only (`modelUsed`, `provider`, …)
166
+ - `outer.cost`: Activix cost shape (`usd`, `tokens`, `provider`, `model`, `details`)
167
+ - `response.metadata`: same billing slice as returned to callers
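
As a rough picture of the completion layout listed above, the sketch below builds a record in which billing fields live at the root (and in the root `metadata` mirror) while `outer.metadata` stays routing-only. The function and interface names are hypothetical, and the exact fields Activix persists may differ; only the field names quoted in the list come from the README.

```typescript
// Hypothetical shapes for illustration; not the package's exported types.
interface RoutingSlice { provider: string; modelUsed: string; }
interface BillingSlice { cost?: number; costUsd?: number; costStatus?: 'priced' | 'unpriced'; }
interface TokenUsage { input: number; output: number; total: number; }

function buildCompletionSketch(routing: RoutingSlice, billing: BillingSlice, tokens: TokenUsage) {
  return {
    // Root billing fields.
    ...billing,
    // Root metadata mirrors routing + billing.
    metadata: { ...routing, ...billing },
    outer: {
      // Routing only: no billing keys are copied here.
      metadata: { ...routing },
      // Activix cost shape.
      cost: {
        usd: billing.costUsd,
        tokens,
        provider: routing.provider,
        model: routing.modelUsed,
        details: { costStatus: billing.costStatus },
      },
    },
  };
}

const record = buildCompletionSketch(
  { provider: 'openrouter', modelUsed: 'openai/gpt-4o-mini' },
  { cost: 0.002, costUsd: 0.002, costStatus: 'priced' },
  { input: 120, output: 40, total: 160 },
);
```

Keeping billing out of `outer.metadata` is what the "no duplicate billing" note refers to: a consumer reads USD from the root fields or `outer.cost`, never from two competing metadata copies.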
663
168
 
664
- **Step 1: Create/Edit JSON defaults** (`src/defaults/model-config.json`):
665
- ```json
666
- {
667
- "defaultModel": "gpt-4o",
668
- "defaultEngine": "openai",
669
- "temperature": 0.7,
670
- "maxTokens": 2000,
671
- "topP": 1.0,
672
- "frequencyPenalty": 0.0,
673
- "presencePenalty": 0.0
674
- }
675
- ```
169
+ When **`aiTools.calculateCost`** is on and you do not pass `activityTracker`, Activix **`autoCost`** is enabled with **`overwriteOuterCost: false`** so gateway-computed cost wins.
676
170
 
677
- **Step 2: Gateway constructor:**
678
- ```typescript
679
- const gateway = new AIGateway({
680
- defaultModel: 'gpt-5-nano', // Overrides JSON default
681
- defaultEngine: 'openai', // Overrides JSON default
682
- temperature: 0.9, // Overrides JSON default
683
- maxTokens: 4000 // Overrides JSON default
684
- });
685
- ```
171
+ Mongo env: `MONGO_URI` + `MONGO_LOGS_DB` or `MONGO_DB`.
686
172
 
687
- **Step 3: Request-level override:**
688
- ```typescript
689
- const aiRequestId = 'cfg-1';
690
- const response = await gateway.invoke({
691
- aiRequestId,
692
- agentId: 'agent-456',
693
- actionType: 'skill',
694
- actionRef: 'skills/helpful',
695
- instructions: 'You are helpful',
696
- prompt: '{{input}}',
697
- workingMemory: { input: 'Hello' },
698
- identity: {
699
- sessionId: 's1',
700
- instance: { instanceId: 'agent-456', type: 'test' },
701
- aiRequestId,
702
- jobId: 'job-123',
703
- taskId: 'task-1',
704
- agentId: 'agent-456'
705
- },
706
- config: {
707
- model: 'gpt-4o', // Overrides gateway default
708
- provider: 'openai', // Overrides gateway default
709
- temperature: 0.5 // Overrides gateway default
710
- }
711
- });
712
- ```
173
+ ---
713
174
 
714
- **Priority Order:**
715
- 1. Request config (highest)
716
- 2. Gateway constructor config
717
- 3. JSON defaults (lowest)
175
+ ## Response metadata and cost
718
176
 
719
- ### 5. InstructionsBlocks Configuration
177
+ On every successful **`invoke()`**:
720
178
 
721
- **Where to configure:**
722
- 1. **JSON Defaults** (lowest priority): `src/defaults/instructions-blocks.json`
723
- 2. **Content resolver (nx-content)** (medium priority): blocks under e.g. `blocks/{blockName}/{agentId}` (see [Content Resolver — Upstream Guide](./CONTENT_RESOLVER_UPSTREAM_GUIDE.md))
724
- 3. **Gateway Constructor** (highest priority): `instructionsBlocks` object
179
+ - **`metadata.provider`**, **`modelUsed`**, **`maxTokensRequested`**, **`effectiveModelConfig`**
180
+ - **`metadata.tokens`**, **`costStatus`**, **`costUsd`** when usage exists and pricing applies
725
181
 
726
- **How to configure:**
182
+ Full contract: [AI Gateway invoke execution metadata](./docs/AI_GATEWAY_INVOKE_EXECUTION_METADATA.md).
727
183
 
728
- **Step 1: Create/Edit JSON defaults** (`src/defaults/instructions-blocks.json`):
729
- ```json
730
- {
731
- "input-prefix": "Please process the following input:",
732
- "default-prompt": "You are a helpful assistant."
733
- }
734
- ```
184
+ ### Trace diagnostics
735
185
 
736
- **Step 2: Gateway constructor:**
737
186
  ```typescript
738
- const gateway = new AIGateway({
739
- instructionsBlocks: {
740
- 'input-prefix': 'Custom prefix from config:', // Overrides JSON default
741
- 'custom-block': 'Custom block content'
742
- }
187
+ await gateway.invoke({
188
+ ...request,
189
+ diagnostics: { mode: 'trace' }
743
190
  });
744
191
  ```
745
192
 
746
- **Step 3: Content Registry** (if available):
747
- - Store blocks at paths supported by nx-content (e.g. `blocks/{blockName}/{agentId}` or with taskTypeId). When content resolver is configured (via `contentRegistryConfig` or env vars), instructions blocks are resolved from local or git.
748
-
749
- **Priority Order** (highest to lowest):
750
- 1. Gateway constructor `instructionsBlocks` (highest priority)
751
- 2. Content registry with `taskTypeId` (if taskTypeId provided)
752
- 3. Content registry without `taskTypeId`
753
- 4. JSON defaults (lowest priority)
754
-
755
- **See**: [Content Resolver — Upstream Guide](./CONTENT_RESOLVER_UPSTREAM_GUIDE.md) for configuration, env vars, key vs text rule, and checklist.
756
-
757
- ### 6. Instruction Resolution (Content Resolver / nx-content)
758
-
759
- **How it works:** The gateway uses **nx-content** to resolve content. Instruction type is determined by whitespace:
760
-
761
- - **No spaces** → **Key** (resolved from local folder or git)
762
- - **Has spaces** → **Literal text** (used as-is)
763
-
764
- There is **no** option to override this; the spaces rule is the only decision point.
765
-
766
- **Configuration:** Pass `contentRegistryConfig` (or legacy `contentRegistry`) when creating the gateway:
767
-
768
- ```typescript
769
- // Local content only
770
- const gateway = new AIGateway({
771
- contentRegistryConfig: {
772
- localPath: '.metadata' // or absolute path
773
- }
774
- });
193
+ Adds **`metadata.attempts`**, **`metadata.usage`**, **`metadata.requestIds`**, and per-attempt **`costUsd`** / **`costStatus`** after catalog enrichment.
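
One way to consume the per-attempt billing fields is to total the priced attempts and count the unpriced ones. The sketch below is self-contained and uses a local `TraceAttempt` type that is an assumption; the real attempt objects likely carry more fields, and the gateway's own `buildTraceUsageSummary` helper may already cover this.

```typescript
// Local, assumed shape: only the two billing fields named in the README.
interface TraceAttempt {
  costUsd?: number;
  costStatus?: 'priced' | 'unpriced';
}

function summarizeAttemptCost(attempts: TraceAttempt[] = []) {
  let pricedUsd = 0;
  let unpricedCount = 0;

  for (const attempt of attempts) {
    if (
      attempt.costStatus === 'priced' &&
      typeof attempt.costUsd === 'number' &&
      Number.isFinite(attempt.costUsd)
    ) {
      // Only finite USD amounts on priced attempts are summed.
      pricedUsd += attempt.costUsd;
    } else if (attempt.costStatus === 'unpriced') {
      // Usage was recorded but no USD price was available.
      unpricedCount += 1;
    }
  }

  return { pricedUsd, unpricedCount };
}
```

A caller would pass `response.metadata?.attempts` from a trace-mode invoke; attempts with no `costStatus` are ignored by this sketch.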
775
194
 
776
- // Local + Git (mode: 'dev' = local wins, 'prod' = git wins)
777
- const gateway = new AIGateway({
778
- contentRegistryConfig: {
779
- localPath: '.metadata',
780
- mode: 'dev',
781
- github: {
782
- repo: process.env.GITHUB_REPO_URL,
783
- token: process.env.GITHUB_TOKEN,
784
- branch: 'main'
785
- }
786
- }
787
- });
788
- ```
195
+ ---
789
196
 
790
- **Environment variables** (used when no explicit config): `CONTENT_REGISTRY_LOCAL_ROOT`, `CONTENT_REGISTRY_MODE`, `GITHUB_REPO_URL`, `GITHUB_TOKEN`, `CONTENT_REGISTRY_GIT_BRANCH`.
197
+ ## Operational modes
791
198
 
792
- **Behavior:**
793
- - Keys (no spaces) → resolved from nx-content (local or git); **never** sent as message content
794
- - Literal text (has spaces) → used as-is
795
- - Unresolvable key → error; no LLM call
199
+ | Mode | Model resolution | Notes |
200
+ |------|------------------|-------|
201
+ | `dev` | Strict: unknown models fail at `mergeConfig` | Best for CI / local |
202
+ | `debug` | Lenient defaults | Default when env unset |
203
+ | `prod` | Falls back to configured default model when resolution fails | See `src/gateway-mode.ts` |
796
204
 
797
- **File layout:** e.g. `skills/<name>.instructions.md`, `skills/<name>.prompt.md` under the content root. See [Content Resolver — Upstream Guide](./CONTENT_RESOLVER_UPSTREAM_GUIDE.md) for full layout, checklist, and diagnostics.
205
+ Set via constructor `mode` or env `mode` / `MODE`.
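
The precedence described here (constructor value first, then the environment, then `debug`) can be sketched as follows. This is a standalone illustration, not the contents of `src/gateway-mode.ts`: the function name is hypothetical, and checking lower-case `mode` before `MODE`, ignoring case, and treating unknown values as `debug` are all assumptions.

```typescript
type GatewayMode = 'dev' | 'debug' | 'prod';

// Hypothetical resolver; the real logic lives in the package source.
function resolveGatewayMode(
  configMode: GatewayMode | undefined,
  env: Record<string, string | undefined>,
): GatewayMode {
  // A constructor-supplied mode overrides the environment.
  if (configMode) return configMode;

  // Assumption: lower-case `mode` is consulted before `MODE`.
  const raw = (env.mode || env.MODE || '').trim().toLowerCase();
  if (raw === 'dev' || raw === 'debug' || raw === 'prod') return raw;

  // `debug` is the documented default when nothing is set.
  return 'debug';
}
```

Calling `resolveGatewayMode(undefined, process.env)` in a deployment that sets `MODE=prod` would select the fallback-to-default behavior from the table, while passing `'dev'` explicitly keeps strict resolution regardless of the environment.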
798
206
 
799
- ### 7. Template Parsing Configuration (workingMemory)
207
+ ---
800
208
 
801
- **Where to configure:**
802
- - Request-level: `workingMemory` object
209
+ ## Testing
803
210
 
804
- **How to configure:**
211
+ | Script | What it runs |
212
+ |--------|----------------|
213
+ | `npm test` | All unit/integration tests in `.tests/run-all.js` (tsx, no network) |
214
+ | `npm run test:ai-tools` | ai-tools + cost + trace helper unit tests |
215
+ | `npm run test:ai-tools:live` | Real invoke + dev strict model check (needs API key) |
216
+ | `npm run test:flex-md-parsing` | flex-md parsing scenarios |
217
+ | `npm run test:flex-md-esm-regression` | ESM build regression for flex-md |
218
+ | `npm run test:prepublish` | `build` + `npm test` |
805
219
 
806
- ```typescript
807
- const aiRequestId = 'tpl-1';
808
- const response = await gateway.invoke({
809
- aiRequestId,
810
- agentId: 'agent-456',
811
- actionType: 'skill',
812
- actionRef: 'skills/professional-answer',
813
- // Instructions can be a key (resolved from content resolver) or text (parsed as template)
814
- instructions: 'professional-answer.instructions', // Key with suffix
815
- // OR: instructions: 'You are a {{role}} assistant.', // Text with template variables
816
- context: 'User is working on {{project}} project.',
817
- // Prompts can be a key (resolved from content resolver) or text (parsed as template)
818
- prompt: 'professional-answer.prompt', // Key with suffix
819
- // OR: prompt: 'Analyze this {{type}}: {{input}}', // Text with template variables
820
- workingMemory: {
821
- role: 'helpful',
822
- project: 'AI Gateway',
823
- type: 'product review',
824
- input: 'This is a review'
825
- },
826
- identity: {
827
- sessionId: 's1',
828
- instance: { instanceId: 'agent-456', type: 'test' },
829
- aiRequestId,
830
- jobId: 'job-123',
831
- taskId: 'task-1',
832
- agentId: 'agent-456'
833
- },
834
- config: { model: 'gpt-4o', provider: 'openai' }
835
- });
836
- ```
220
+ Live tests use `LIVE_TEST_PROVIDER` / `LIVE_TEST_MODEL` (default `openrouter` + `openai/gpt-4o-mini`). Set `LIVE_SKIP_INVOKE=1` to skip the LLM call.
837
221
 
838
- **What gets parsed:**
839
- - `instructions` - Resolved from content resolver (nx-content) if it's a key (no spaces), or parsed as template if text
840
- - `context` - Parsed as template with `workingMemory`
841
- - `prompt` - Resolved from content resolver if it's a key (no spaces), or parsed as template if text
842
- - All parsed using `@x12i/rendrix` (v4+) with `workingMemory`, `shortTermMemory`, `experienceMemory`, `knowledgeMemory`
222
+ ---
843
223
 
844
- **Rendrix (@x12i/rendrix) v4 template protocol**
224
+ ## Documentation index
845
225
 
846
- - Simple placeholders `{{name}}` or `{{a.b.c}}` are **required** (MUST): if resolution is **`undefined`** after the usual memory merge, rendering throws **`TemplateResolutionError`** from the parser. The gateway **rethrows** that error (it is not converted into a silent fallback).
847
- - Values that **do not** throw include **`null`**, empty string **`""`**, **`0`**, and **`false`**.
848
- - **Optional** placeholders: `{{path |}}` (empty if missing) or `{{path | fallback text}}` (literal fallback when missing).
849
- - Helpers, blocks, `{{file:...}}`, `{{json ...}}`, etc. follow the parser’s own rules; the MUST/optional rules above apply to plain path mustaches.
850
- - For full parser API details, see **`@x12i/rendrix`** README / `CHANGELOG.md`.
226
+ | Document | Topic |
227
+ |----------|--------|
228
+ | [IDENTITY_OBJECT_CONTRACT.md](./docs/IDENTITY_OBJECT_CONTRACT.md) | Identity / `runContext` |
229
+ | [AI_GATEWAY_INVOKE_EXECUTION_METADATA.md](./docs/AI_GATEWAY_INVOKE_EXECUTION_METADATA.md) | Metadata, cost, trace, Activix completion |
230
+ | [LOGGER_INITIALIZATION.md](./docs/LOGGER_INITIALIZATION.md) | Logxer setup |
231
+ | [flex-md-compliance.md](./docs/flex-md-compliance.md) | Output format levels |
232
+ | [PROMPT_TEMPLATE_USAGE.md](./docs/PROMPT_TEMPLATE_USAGE.md) | Rendrix templates |
233
+ | [UPSTREAM_TEMPLATE_RENDERING_AND_PARSER_V4.md](./docs/UPSTREAM_TEMPLATE_RENDERING_AND_PARSER_V4.md) | Parser v4 |
234
+ | [RUNTIME_OBJECTS_OBSERVABILITY.md](./docs/RUNTIME_OBJECTS_OBSERVABILITY.md) | Runtime object keys |
235
+ | [GRAPH_EXECUTION_SUPPORT.md](./docs/GRAPH_EXECUTION_SUPPORT.md) | Graph / node identity |
236
+ | [DUAL_PACKAGE_SETUP_GUIDE.md](./docs/DUAL_PACKAGE_SETUP_GUIDE.md) | ESM + CJS publish layout |
851
237
 
852
- **Gateway template options (passthrough)**
238
+ ---
853
239
 
854
- - **`GatewayConfig.templateRendering`** — default `TemplateRenderOptions` for every `invoke()` render path (merged after packaged **`src/defaults/template-rendering.json`**, which ships with **`subPathSearch.enabled: false`**). Your gateway config overrides that JSON.
855
- - **`templateRenderOptions` on the request** (`ChatRequest` / `AIInvokeRequest`) — merged on top of the gateway default for that call only (per-field override; `subPathSearch` fields merge with request winning).
856
- - **Smart Input (optional shorthand on the request)** — top-level **`smartInput`** (`SmartInputConfig`) and **`smartInputRenderOptions`** (`SmartInputRenderOptions`) are merged into the same Rendrix options object **after** `GatewayConfig.templateRendering` and **before** `templateRenderOptions`. If you set the same field both as a shorthand and inside **`templateRenderOptions`**, the nested **`templateRenderOptions`** value wins. Templates use the **`{{smartInput}}`** insertion macro (see **`@x12i/rendrix`**). Types **`SmartInputConfig`** and **`SmartInputRenderOptions`** are re-exported from this package.
857
- - For programmatic merges (tests or wrappers), use **`mergeGatewayAndRequestTemplateRenderOptions`** (same rules as `buildMessages`).
858
- - Supported fields match the parser: **`templateId`**, **`subPathSearch`** (`enabled`, `roots`), **`silentMissingMustTokens`** (legacy Handlebars-style silence for missing MUST paths), **`smartInput`**, **`smartInputRenderOptions`**.
859
- - **Sub-path root priority:** `subPathSearch.roots` is an **ordered** list. The parser tries roots in **array order**; **the first root that resolves the leaf path wins** (see ISSUE-005). There is no separate “priority” field—the order of `roots` *is* the priority. Omit `roots` when `enabled` is true to use **`@x12i/rendrix`** packaged defaults.
240
+ ## Troubleshooting helpers
860
241
 
861
242
  ```typescript
862
- // Example: prefer execution.*, then input.*, then inputs.* when a full path misses
863
- new AIGateway({
864
- templateRendering: {
865
- subPathSearch: {
866
- enabled: true,
867
- roots: ['execution', 'input', 'inputs']
868
- }
869
- }
870
- });
243
+ import { validateAIRequest, diagnoseRequest, formatDiagnostic } from '@x12i/ai-gateway';
871
244
  ```
872
- - **Memory overlay priority** for Rendrix resolution (when memories are supplied): **`shortTermMemory`** → **`workingMemory`** → **`experienceMemory`** → **`knowledgeMemory`**. (Request-level **`templateTokens`** was removed; put overrides in **`workingMemory`** or resolver-backed memories.)
873
- - Root **`config.defaults.json`** may include a **`templateRendering`** block for apps that merge this file into `GatewayConfig`. Packaged **`template-rendering.json`** includes a sample **`roots`** order (used when you turn **`enabled`** on; while **`enabled`** is **`false`**, roots are ignored by the parser).
874
-
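A minimal sketch of the root-order rule described for `subPathSearch.roots`, assuming memory is a plain nested object. `getPath` and `resolveWithRoots` are hypothetical helpers written for illustration only. They are not part of `@x12i/rendrix` or this package, and the real parser may differ in details such as how it treats `null` values.

```typescript
type Memory = Record<string, unknown>;

// Walk a dotted path through nested objects; undefined when any segment is missing.
function getPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split('.')) {
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Hypothetical helper: try the full path first, then each root in array order.
function resolveWithRoots(memory: Memory, path: string, roots: string[]): unknown {
  const direct = getPath(memory, path);
  if (direct !== undefined) return direct;
  for (const root of roots) {
    const value = getPath(memory, `${root}.${path}`);
    if (value !== undefined) return value; // the first root that resolves wins
  }
  return undefined;
}

const memory: Memory = {
  input: { topic: 'from input' },
  execution: { topic: 'from execution' },
};

// 'execution' is listed before 'input', so its value is returned.
const resolved = resolveWithRoots(memory, 'topic', ['execution', 'input', 'inputs']);
console.log(resolved);
```

Swapping the first two roots would return the `input` value instead, which is why the array order serves as the priority.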
875
- **Template-Based Prompts:**
876
- - Prompts work exactly like instructions - both can be resolved using explicit keys. See [Content Resolver — Upstream Guide](./CONTENT_RESOLVER_UPSTREAM_GUIDE.md).
877
- - Both instructions and prompts receive the same memory context for template rendering
878
- - Use explicit keys with suffixes: `professional-answer.instructions` and `professional-answer.prompt`
879
- - See [Prompt Template Usage Guide](./docs/PROMPT_TEMPLATE_USAGE.md) for details
880
-
881
- **Note:** Requires `@x12i/rendrix` **^4.x** (already a dependency).
882
245
 
246
+ Enable request logging:
883
247
 
884
- ### 9. Provider Registration
885
-
886
- **Where to configure:**
887
- - **Automatic (Recommended)**: Environment variables - providers auto-register on gateway creation
888
- - **Manual**: Runtime: `gateway.register(provider)`
889
-
890
- **Automatic Registration (v4.0.7+):**
891
-
892
- Providers are automatically registered based on environment variables when the gateway is created:
893
-
894
- ```typescript
895
- // Set environment variables in .env:
896
- // OPENAI_API_KEY=sk-...
897
- // GROK_API_KEY=xai-...
898
- // ANTHROPIC_API_KEY=sk-ant-... (optional)
899
- // GOOGLE_API_KEY=... (optional)
900
-
901
- const gateway = new AIGateway({
902
- defaultProvider: 'openai',
903
- fallbackChain: ['grok']
904
- });
905
-
906
- // Providers are automatically registered! No manual registration needed.
907
- // The gateway will log which providers were auto-registered.
248
+ ```bash
249
+ export AI_GATEWAY_DEBUG=true
250
+ export AI_GATEWAY_DEBUG_REQUEST=true
251
+ export FLEX_MD_MIN_COMPLIANCE_LEVEL=L0
908
252
  ```
909
253
 
910
- **📋 Configuration Reference:**
911
-
912
- See `.env.example` in the project root for a comprehensive guide to all environment variables, including:
913
- - Provider API keys (required/optional)
914
- - MongoDB/activity tracking configuration
915
- - Content registry setup (S3, GitHub, Redis)
916
- - Internal system actions configuration
917
- - Logging and debugging options
918
-
919
- Copy `.env.example` to `.env` and fill in your values.
920
-
921
- **Supported Providers (Auto-Registration):**
922
-
923
- - **OpenAI**: `OPENAI_API_KEY` → Auto-registers `openai` provider
924
- - **Grok**: `GROK_API_KEY` → Auto-registers `grok` provider
925
- - **Anthropic**: `ANTHROPIC_API_KEY` → Auto-registers `anthropic` provider (if package installed)
926
- - **Google**: `GOOGLE_API_KEY` → Auto-registers `google` provider (if package installed)
927
- - **Cohere**: `COHERE_API_KEY` → Auto-registers `cohere` provider (if package installed)
928
- - **Mistral**: `MISTRAL_API_KEY` → Auto-registers `mistral` provider (if package installed)
929
-
930
- **Manual Registration (Optional):**
931
-
932
- You can still manually register providers if needed:
254
+ ---
933
255
 
934
- ```typescript
935
- import { OpenAIProvider } from '@x12i/ai-provider-openai';
936
- import { GrokProvider } from '@x12i/ai-provider-grok';
937
-
938
- const gateway = new AIGateway({
939
- defaultProvider: 'openai',
940
- fallbackChain: ['grok']
941
- });
256
+ ## Build and publish
942
257
 
943
- // Manual registration (optional - auto-registration handles this if env vars are set)
944
- gateway.register(new OpenAIProvider({
945
- apiKey: process.env.OPENAI_API_KEY
946
- }));
947
-
948
- gateway.register(new GrokProvider({
949
- apiKey: process.env.GROK_API_KEY
950
- }));
258
+ ```bash
259
+ npm run build # ESM + CJS + defaults copy + CJS verify
260
+ npm run test:prepublish
951
261
  ```
952
262
 
953
- **Note:** Auto-registration only registers providers that:
954
- 1. Have their API key set in environment variables
955
- 2. Have their provider package installed (e.g., `@x12i/ai-provider-openai`)
263
+ Published files: `dist/`, `dist-cjs/`, `config.defaults.json`, `README.md`.
956
264
 
957
- If a provider package is not installed, auto-registration will skip it gracefully (with a debug log for optional providers, warning for required ones).
958
-
959
- ### 10. Complete Configuration Example
960
-
961
- ```typescript
962
- import { AIGateway } from '@x12i/ai-gateway';
963
- import { createLogxer } from '@x12i/logxer';
964
- import { Activix } from '@x12i/activix';
965
- import { OpenAIProvider } from '@x12i/ai-provider-openai';
966
-
967
- // 1. Configure activity tracker (and reuse its logger)
968
- // Single source of truth: set up the logger once, pass it to the tracker,
969
- // then reuse the same logger for the gateway.
970
- const logger = createLogxer(
971
- { packageName: 'MY_APP', envPrefix: 'MY_APP', debugNamespace: 'my-app' },
972
- {
973
- logLevel: 'info',
974
- logFormat: 'json',
975
- enableUnifiedLogger: true
976
- }
977
- );
978
-
979
- const statusValues = {
980
- started: 'started',
981
- inProgress: 'in_progress',
982
- completed: 'success',
983
- failed: 'failed',
984
- timeout: 'timeout'
985
- };
986
-
987
- const activityTracker = new Activix({
988
- collections: [
989
- { name: 'ai-actions', statusValues },
990
- { name: 'skill-executions', statusValues },
991
- { name: 'bad-requests', statusValues }
992
- ]
993
- });
994
-
995
- // 2. Create gateway with all configurations
996
- const gateway = new AIGateway({
997
- // Provider routing
998
- defaultProvider: 'openai',
999
- defaultModel: 'gpt-4o',
1000
- defaultEngine: 'openai',
1001
- fallbackChain: ['grok'],
1002
-
1003
- // Usage tracking
1004
- enableUsageTracking: true,
1005
- usageTier: 'tier-3',
1006
-
1007
- // Activity tracking
1008
- enableActivityTracking: true,
1009
- activityTracker: activityTracker,
1010
-
1011
- // Logging
1012
- enableLogging: true,
1013
- packageName: 'MY_APP',
1014
- logger: logger,
1015
-
1016
- // Content resolver (local and/or git)
1017
- contentRegistryConfig: {
1018
- localPath: '.metadata',
1019
- // optional: mode: 'prod', github: { repo: process.env.GITHUB_REPO_URL, token: process.env.GITHUB_TOKEN }
1020
- },
1021
-
1022
- // InstructionsBlocks
1023
- instructionsBlocks: {
1024
- 'input-prefix': 'Custom prefix:'
1025
- },
1026
-
1027
- // LLM defaults
1028
- temperature: 0.7,
1029
- maxTokens: 2000
1030
- });
1031
-
1032
- // 3. Register providers
1033
- gateway.register(new OpenAIProvider({
1034
- apiKey: process.env.OPENAI_API_KEY
1035
- }));
1036
- ```
1037
-
1038
- ### Configuration Priority Summary
1039
-
1040
- For each configuration option, priority is (highest to lowest):
1041
- 1. **Request-level config** (in `invoke()` call)
1042
- 2. **Gateway constructor config**
1043
- 3. **Content resolver (nx-content)** (for instructionsBlocks only)
1044
- 4. **JSON defaults** (from `src/defaults/`)
1045
-
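The priority order above can be sketched as a layered merge in which the highest-priority layer that defines a key wins. This is an illustration under stated assumptions, not the gateway's actual merge code: `resolveByPriority` is a hypothetical helper, the option names are taken from the examples in this README, and the content-resolver layer (which applies to `instructionsBlocks` only) is left out for brevity.

```typescript
type Layer = Record<string, unknown>;

// Hypothetical helper: layers are passed highest priority first.
function resolveByPriority(layersHighToLow: Layer[]): Layer {
  const resolved: Layer = {};
  // Apply the lowest-priority layer first so higher layers overwrite it.
  for (const layer of [...layersHighToLow].reverse()) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined) resolved[key] = value; // undefined never masks a lower layer
    }
  }
  return resolved;
}

const jsonDefaults: Layer = { temperature: 0.7, maxTokens: 2000 };
const constructorConfig: Layer = { maxTokens: 1000, model: 'gpt-4o' };
const requestConfig: Layer = { model: 'gpt-5-nano', temperature: undefined };

const effective = resolveByPriority([requestConfig, constructorConfig, jsonDefaults]);
console.log(effective);
```

Here `model` comes from the request, `maxTokens` from the constructor, and `temperature` falls through to the JSON defaults because the request leaves it undefined.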
1046
- ## Enhanced Gateway Features
1047
-
1048
- Throughout this section, **`gateway.invoke()`** examples must include **`aiRequestId`**, **`identity`** (with **`jobId`** / **`taskId`**), **`actionType`**, **`actionRef`**, and usually **`prompt`** + **`workingMemory`** unless noted. Older snippets that show only **`jobId`** / **`input`** / **`parseOptions`** / **`validateOutputSchema`** / **`transformations`** are outdated relative to the current [`AIInvokeRequest`](#chatrequest-and-airequest--aiinvokerequest) type.
1049
-
1050
- ### 1. Context Propagation (identity / job / task)
1051
-
1052
- Correlation flows through **`request.identity`**: **`identity.jobId`** and **`identity.taskId`** are supplied by the upstream client, forwarded to the router, echoed in **`response.metadata.identity`**, and persisted as Activix **`runContext`**. For **`invoke()`**, also set **`actionType`** and **`actionRef`** (see [AI invoke payload](#ai-invoke-payload-gatewayinvoke)).
1053
-
1054
- ```typescript
1055
- const aiRequestId = 'call-001';
1056
- const response = await gateway.invoke({
1057
- aiRequestId,
1058
- agentId: 'agent-456',
1059
- actionType: 'skill',
1060
- actionRef: 'skills/example',
1061
- instructions: 'You are a helpful assistant.',
1062
- prompt: '{{input}}',
1063
- workingMemory: { input: 'What is AI?' },
1064
- identity: {
1065
- sessionId: 'run-1',
1066
- instance: { instanceId: 'agent-456', type: 'ai-reasoner' },
1067
- aiRequestId,
1068
- jobId: 'job-123',
1069
- taskId: 'task-789',
1070
- agentId: 'agent-456'
1071
- },
1072
- config: { model: 'gpt-5-nano', provider: 'openai' }
1073
- });
1074
- ```
1075
-
1076
- **Note:** On **`invoke()`**, provider **`messages`** are produced by the internal message builder from **`instructions`** / **`prompt`** / **`context`** (not from a client **`messages`** array). Use **`invokeChat()`** if you need to pass a raw **`messages`** array.
1077
-
1078
- **Request requirements (`invoke`):**
1079
- - **Required**: `aiRequestId`, `agentId`, `instructions`, `identity` (with upstream `jobId` / `taskId`), `actionType`, `actionRef`; structured flows typically need `prompt` + `workingMemory` for templates (see message builder).
1080
- - **Optional**: `context`, `messages`, graph/skill linkage fields, `config` / `modelConfig`.
1081
- - Do **not** use top-level **`input`** — use **`workingMemory.input`** (and **`{{input}}`** in templates).
1082
-
1083
- **Benefits:**
1084
- - Stable tracing across logs and Activix
1085
- - Clear separation between chat (`invokeChat`) and structured invoke (`invoke`)
1086
-
1087
- ### 2. Usage Tier Tracking (RPM/TPM Limits)
1088
-
1089
- The gateway integrates with `@x12i/x-models` to enforce usage tier limits and prevent rate limit errors.
1090
-
1091
- ```typescript
1092
- import { AIGateway, getTierInfo } from '@x12i/ai-gateway';
1093
-
1094
- // Initialize with usage tier
1095
- const gateway = new AIGateway({
1096
- usageTier: 'tier-3', // 5,000 RPM, 2M TPM
1097
- enableUsageTracking: true
1098
- });
1099
-
1100
- // Get tier information
1101
- const tierInfo = getTierInfo('tier-3');
1102
- console.log(`RPM Limit: ${tierInfo?.rpm}, TPM Limit: ${tierInfo?.tpm}`);
1103
-
1104
- // Gateway automatically:
1105
- // - Records every request to x-models
1106
- // - Calculates RPM/TPM consumption
1107
- // - Logs consumption percentages
1108
- // - Prevents exceeding tier limits
1109
- ```
1110
-
1111
- **Available Tiers:**
1112
- - `tier-1`: 500 RPM, 500K TPM
1113
- - `tier-2`: 5,000 RPM, 1M TPM
1114
- - `tier-3`: 5,000 RPM, 2M TPM (default)
1115
- - `tier-4`: 10,000 RPM, 4M TPM
1116
- - `tier-5`: 15,000 RPM, 40M TPM
1117
-
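As a rough illustration of what "consumption percentages" means for the tiers listed above, the sketch below computes RPM and TPM usage against a tier's limits. The limits are copied from the list above. `tierConsumption` and its parameters are hypothetical and do not reflect how `@x12i/x-models` or `getTierInfo` are implemented.

```typescript
type TierLimits = { rpm: number; tpm: number };

// Limits copied from the "Available Tiers" list above.
const TIER_LIMITS: Record<string, TierLimits> = {
  'tier-1': { rpm: 500, tpm: 500_000 },
  'tier-2': { rpm: 5_000, tpm: 1_000_000 },
  'tier-3': { rpm: 5_000, tpm: 2_000_000 },
  'tier-4': { rpm: 10_000, tpm: 4_000_000 },
  'tier-5': { rpm: 15_000, tpm: 40_000_000 },
};

// Hypothetical helper: percentage of the tier's per-minute budget already used.
function tierConsumption(tier: string, requestsLastMinute: number, tokensLastMinute: number) {
  const limits = TIER_LIMITS[tier];
  if (!limits) throw new Error(`unknown tier: ${tier}`);
  const rpmPercent = (requestsLastMinute / limits.rpm) * 100;
  const tpmPercent = (tokensLastMinute / limits.tpm) * 100;
  return { rpmPercent, tpmPercent, withinLimits: rpmPercent <= 100 && tpmPercent <= 100 };
}

const usage = tierConsumption('tier-3', 2_500, 500_000);
console.log(usage);
```

A caller could use `withinLimits` to delay or reject a request before it reaches the provider.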
1118
- ### 3. Activity Tracking (xronox-activitix via @x12i/activix v7)
1119
-
1120
- The gateway uses **`@x12i/activix` v7** (xronox-activitix) for full lifecycle logging. Recommended: enable MongoDB persistence so tracking is automatic. Writes use the **fixed** Mongo collection names **`ai-actions`**, **`bad-requests`**, and **`skill-executions`** (literal strings from `activity-tracking-config.ts`; see **section 2** for what lands in each collection and how documents are shaped).
1121
-
1122
- #### ⚠️ CRITICAL: correlation, identity, and unique record ids
1123
-
1124
- **IMPORTANT DESIGN CONCEPTS:**
1125
-
1126
- 1. **Per-request correlation**
1127
- - **`aiRequestId`** (required): One id per gateway invocation; used as the primary leaf correlation field (stored on the activity row and inside Activix `runContext`).
1128
- - **`identity.jobId`** and **`identity.taskId`** (required): Taken only from the upstream **`identity`** object; the gateway does not invent them.
1129
- - **`jobTypeId`**, **`taskTypeId`**: Optional aggregation fields (same ideas as before).
1130
- - **Activity**: Each individual LLM request is a separate **activity** with its own unique record.
1131
-
1132
- 2. **`activityId` + Mongo `_id` (not `jobId`)**
1133
- - Each row gets Mongo **`_id`** and an Activix **`activityId`** (e.g. **`act-…`**, the collection **`primaryKey`**). **`completeRecord` / `failRecord`** address the row by **`activityId`** returned from **`startRecord`**.
1134
- - **`jobId`** / **`taskId`** mirror upstream **`identity`** for correlation only; many rows may share a **`jobId`**.
1135
-
1136
- 3. **Two-phase tracking (Activix v7)**
1137
- - **Phase 1 (start)**: Creates a NEW database document
1138
- - Sends **`runContext`**, **`request`**, **`config`**, **`startTime`**, **`status: 'started'`**, plus Activix **`outer.input`** (wraps **`activityType`** and the same **`request`** snapshot when present — see section 2).
1139
- - Returns **`activityId`** (and record payload) for phase 2.
1140
- - **Phase 2 (complete / fail)**: Updates the SAME document by **`activityId`**
1141
- - Success: **`response`**, root **`cost`** / **`costUsd`** / **`costStatus`**, **`endTime`**, **`duration`**, **`status`**, **`outer.output`** (completion payload), **`outer.metadata`** (routing + billing mirror), and **`outer.cost`** when usage or price is known (see [Cost reporting](#cost-reporting-invoke-response--activix-run-analysis-g8)).
1142
- - Failure: error payload and timing; optional **`response`** / **`outer.output`** only for specific failure kinds.
1143
-
1144
- 4. **Structured fields vs Activix `outer` (v2.6.0+):**
1145
- - LLM request fields live under root **`request`** (not as loose keys on the document root). Config under **`config`**; completion payload under **`response`**.
1146
- - **`outer`** duplicates the logical **request** snapshot under **`outer.input.request`** when applicable — required for Activix v7 validation — so root **`request`** and **`outer.input.request`** align by design ([details](./docs/ACTIVITIES_OUTER_DUPLICATION.md)).
1147
-
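The two-phase pattern described above (create a row on start, then update the same row by `activityId`) can be sketched with an in-memory map. This is a stand-in for illustration only: `startRow`, `completeRow`, and the counter-based id are hypothetical, and none of it is the `@x12i/activix` API.

```typescript
type ActivityRow = {
  activityId: string;
  status: 'started' | 'success' | 'failed';
  startTime: number;
  endTime?: number;
  duration?: number;
  runContext: { aiRequestId: string; jobId: string; taskId: string };
  response?: unknown;
};

const rows = new Map<string, ActivityRow>();
let counter = 0;

// Phase 1: create a NEW row and hand back its activityId.
function startRow(runContext: ActivityRow['runContext'], startTime: number): string {
  const activityId = `act-${++counter}`;
  rows.set(activityId, { activityId, status: 'started', startTime, runContext });
  return activityId;
}

// Phase 2: update the SAME row, addressed by activityId (never by jobId).
function completeRow(activityId: string, response: unknown, endTime: number): void {
  const row = rows.get(activityId);
  if (!row) throw new Error(`unknown activityId: ${activityId}`);
  row.status = 'success';
  row.response = response;
  row.endTime = endTime;
  row.duration = endTime - row.startTime;
}

// Two calls share a jobId but produce two separate rows.
const first = startRow({ aiRequestId: 'req-001', jobId: 'job-123', taskId: 'task-001' }, 1000);
const second = startRow({ aiRequestId: 'req-002', jobId: 'job-123', taskId: 'task-002' }, 1200);
completeRow(first, { content: 'ok' }, 1731);
console.log(rows.get(first), rows.get(second));
```

Because completion is keyed on `activityId`, finishing the first call leaves the second row untouched even though both carry the same `jobId`.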
1148
- **Example: same logical job, three LLM calls**
1149
-
1150
- Each call must have a **distinct `aiRequestId`** and a full **`identity`** (including **`jobId`** and **`taskId`** from upstream). Use the same **`identity.jobId`** (and distinct **`taskId`** per call, or your own convention) if you want to group rows in Mongo.
1151
-
1152
- ```typescript
1153
- import * as crypto from 'crypto';
1154
-
1155
- function md5(text: string): string {
1156
- return crypto.createHash('md5').update(text).digest('hex');
1157
- }
1158
-
1159
- const jobTypeId = md5('data-processing-job');
1160
-
1161
- await gateway.invoke({
1162
- aiRequestId: 'req-001',
1163
- agentId: 'agent-1',
1164
- actionType: 'skill',
1165
- actionRef: 'skills/example',
1166
- instructions: '…',
1167
- prompt: '{{input}}',
1168
- workingMemory: { input: '…' },
1169
- jobTypeId,
1170
- identity: {
1171
- sessionId: 'sess-1',
1172
- instance: { instanceId: 'inst-1', type: 'gateway' },
1173
- aiRequestId: 'req-001',
1174
- jobId: 'job-123',
1175
- taskId: 'task-001',
1176
- agentId: 'agent-1'
1177
- },
1178
- config: { model: 'gpt-5-nano', provider: 'openai' }
1179
- });
1180
-
1181
- await gateway.invoke({
1182
- aiRequestId: 'req-002',
1183
- agentId: 'agent-1',
1184
- actionType: 'skill',
1185
- actionRef: 'skills/example',
1186
- instructions: '…',
1187
- prompt: '{{input}}',
1188
- workingMemory: { input: '…' },
1189
- jobTypeId,
1190
- identity: {
1191
- sessionId: 'sess-1',
1192
- instance: { instanceId: 'inst-1', type: 'gateway' },
1193
- aiRequestId: 'req-002',
1194
- jobId: 'job-123',
1195
- taskId: 'task-002',
1196
- agentId: 'agent-1'
1197
- },
1198
- config: { model: 'gpt-5-nano', provider: 'openai' }
1199
- });
1200
-
1201
- // Query in Mongo (main collection name is ai-actions):
1202
- // db.getCollection('ai-actions').find({ 'runContext.aiRequestId': 'req-001' })
1203
- // db.getCollection('ai-actions').find({ 'runContext.jobId': 'job-123' })
1204
- ```
1205
-
1206
- #### Configuration
1207
-
1208
- ```typescript
1209
- import { Activix } from '@x12i/activix';
1210
-
1211
- const statusValues = {
1212
- started: 'started',
1213
- inProgress: 'in_progress',
1214
- completed: 'success',
1215
- failed: 'failed',
1216
- timeout: 'timeout'
1217
- };
1218
-
1219
- const activityTracker = new Activix({
1220
- collections: [
1221
- { name: 'ai-actions', statusValues },
1222
- { name: 'skill-executions', statusValues },
1223
- { name: 'bad-requests', statusValues }
1224
- ]
1225
- });
1226
-
1227
- const gateway = new AIGateway({
1228
- enableActivityTracking: true,
1229
- activityTracker
1230
- });
1231
-
1232
- // Auto-persisted by the tracker:
1233
- // - Each activity creates a new record with unique _id
1234
- // - Start/end/duration, status (started|success|failed)
1235
- // - Provider, model, cost
1236
- // - Request/response metadata, errors
1237
- // - Correlation via runContext (and mirrored top-level fields); optional jobId for grouping
1238
- ```
1239
-
1240
- #### Database Record Structure
1241
-
1242
- Example shape for a completed row in **`ai-actions`** (`activityType: 'gateway-invocation'`). **`skill-executions`** / **`bad-requests`** share the same Activix lifecycle pattern but different **`activityType`** and fields.
1243
-
1244
- ```typescript
1245
- {
1246
- // Mongo primary key
1247
- _id: ObjectId('693970636e8d0f171e4aa528'),
1248
-
1249
- // Activix logical row key (returned from startRecord; used by completeRecord / failRecord)
1250
- activityId: 'act-63f0357e-b2fc-4038-8f94-a9c7fa8fb892',
1251
-
1252
- activityType: 'gateway-invocation',
1253
-
1254
- // Activix v7: canonical correlation BSON object `runContext` (same reference as `request.identity`, merged with gateway fields)
1255
- runContext: {
1256
- sessionId: 'sess-1',
1257
- instance: { instanceId: 'gw-1', type: 'gateway' },
1258
- aiRequestId: 'req-abc',
1259
- jobId: 'job-123',
1260
- jobTypeId: 'xyz789...',
1261
- agentId: 'agent-456',
1262
- taskId: 'task-789',
1263
- taskTypeId: 'abc123...',
1264
- graphId: 'graph-456',
1265
- nodeId: 'node-789',
1266
- masterSkillId: '...',
1267
- masterSkillActivityId: '...'
1268
- },
1269
- // Mirrored / denormalized top-level fields may also appear from the gateway payload (query either as needed)
1270
- aiRequestId: 'req-abc',
1271
- sessionId: 'sess-1',
1272
- instance: { instanceId: 'gw-1', type: 'gateway' },
1273
- jobId: 'job-123',
1274
- jobTypeId: 'xyz789...',
1275
- agentId: 'agent-456',
1276
- taskId: 'task-789',
1277
- taskTypeId: 'abc123...',
1278
- graphId: 'graph-456',
1279
- nodeId: 'node-789',
1280
-
1281
- // Activix v7 root-level I/O tier — see section 2 for semantics
1282
- // startRecord: outer.input ≈ { activityType, request } (request matches root `request` when present)
1283
- // completeRecord: outer.output ← same object as root `response` on success
1284
- outer: {
1285
- input: { activityType: 'gateway-invocation', request: { /* same snapshot as root request */ } },
1286
- output: { /* success: gateway activity response (content, parsed, metadata, usage) */ },
1287
- metadata: {
1288
- modelUsed: 'openai/gpt-5-nano-2025-08-07',
1289
- provider: 'openrouter',
1290
- cost: 0.0000348,
1291
- costUsd: 0.0000348,
1292
- costStatus: 'priced'
1293
- },
1294
- cost: {
1295
- usd: 0.0000348,
1296
- unit: 'USD',
1297
- tokens: { input: 16, output: 85, total: 101 },
1298
- provider: 'openrouter',
1299
- model: 'openai/gpt-5-nano-2025-08-07',
1300
- details: { costStatus: 'priced' /* optional costBreakdown when aiTools.costIncludeBreakdown */ }
1301
- }
1302
- },
1303
- // inner: optional step array for multi-step flows (see @x12i/activix docs)
1304
-
1305
- // Timing
1306
- startTime: 1765372020804,
1307
- endTime: 1765372021535, // Added by logSuccess
1308
- duration: 731, // Added by logSuccess
1309
- status: 'success', // Updated by logSuccess (was 'started')
1310
-
1311
- // Request data (from startActivity - ONLY in request object)
1312
- request: {
1313
- raw: {
1314
- instructions, // Original instructions (before template parsing)
1315
- context, // Original context (before template parsing)
1316
- prompt // Original prompt (before template parsing)
1317
- },
1318
- parsed: {
1319
- instructions, // Parsed instructions (after template parsing with workingMemory)
1320
- context, // Parsed context (after template parsing with workingMemory)
1321
- prompt // Parsed prompt (after template parsing with workingMemory)
1322
- },
1323
- messages: [...], // Final constructed messages array
1324
- workingMemory: {...} // Template/user payload (e.g. { input: '…' } for '{{input}}'); not a separate root `input` field
1325
- },
1326
-
1327
- // Config data (from startActivity - ONLY in config object)
1328
- config: {
1329
- model: 'gpt-5-',
1330
- provider: 'openai',
1331
- temperature: 0.7,
1332
- maxTokens: 1000,
1333
- rawConfig: {...}
1334
- },
1335
-
1336
- // Response data (from logSuccess - ONLY in response object)
1337
- response: {
1338
- content: "...",
1339
- metadata: {...}
1340
- },
1341
-
1342
- // Billing (from logSuccess — mirrors response.metadata from invoke)
1343
- cost: 0.0000348,
1344
- costUsd: 0.0000348,
1345
- costStatus: 'priced',
1346
-
1347
- // Metadata
1348
- createdAt: Date,
1349
- updatedAt: Date
1350
- }
1351
- ```
1352
-
1353
- **Key points:**
1354
- - ✅ Each activity = separate Mongo document (**`_id`**) with stable **`activityId`** (`act-…`) for Activix APIs
1355
- - ✅ **`aiRequestId`** = per-request correlation (required on invoke)
1356
- - ✅ **`runContext.jobId`** / **`runContext.taskId`** = upstream identity (required on invoke since v9+)
1357
- - ✅ Request/config sent at **start**; response/timing/billing (`cost`, `costUsd`, `costStatus`, `outer.cost`) at **complete**
1358
- - ✅ Updates target **`activityId`** from **`startRecord`**, not **`jobId`**
1359
-
1360
- #### Retry Tracking (@x12i/activix v7)
1361
-
1362
- The gateway automatically retries network errors, server errors (5xx), and throttling (429) with exponential backoff. Retry attempts are tracked and stored in activity records.
1363
-
1364
- **Retry Metadata Structure:**
1365
-
1366
- ```typescript
1367
- // Success case - retry metadata in response.metadata.retries
1368
- {
1369
- response: {
1370
- metadata: {
1371
- retries: {
1372
- count: 2, // Number of retry attempts
1373
- attempts: [
1374
- {
1375
- attempt: 1, // 1-based attempt number
1376
- timestamp: 1234567890, // When retry occurred
1377
- error: "fetch failed", // Error message
1378
- errorType: "network", // Error classification
1379
- delayMs: 1000 // Delay before retry
1380
- },
1381
- {
1382
- attempt: 2,
1383
- timestamp: 1234568890,
1384
- error: "fetch failed",
1385
- errorType: "network",
1386
- delayMs: 2000
1387
- }
1388
- ]
1389
- }
1390
- }
1391
- }
1392
- }
1393
-
1394
- // Failure case - retry count in error message
1395
- {
1396
- status: "failed",
1397
- error: "Grok API network error: fetch failed [Retries: 3]"
1398
- }
1399
- ```
1400
-
1401
- **Error Types:**
1402
- - `network`: Network errors (fetch failed, DNS, connectivity)
1403
- - `http-429`: Throttling/rate limiting
1404
- - `http-5xx`: Server errors (500, 502, 503, etc.)
1405
- - `timeout`: Timeout errors
1406
-
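A minimal sketch of how a failure might map onto the error types above and how an exponential backoff delay could be computed. The 1000 ms base delay and the doubling factor are assumptions inferred from the `delayMs` values in the example metadata (1000, then 2000). `classifyRetryError` and `backoffDelayMs` are hypothetical names, not gateway exports.

```typescript
type RetryErrorType = 'network' | 'http-429' | 'http-5xx' | 'timeout';

// Hypothetical classifier: returns undefined for failures this sketch does not retry.
function classifyRetryError(failure: { status?: number; timedOut?: boolean }): RetryErrorType | undefined {
  if (failure.timedOut) return 'timeout';
  if (failure.status === undefined) return 'network'; // no HTTP response was received
  if (failure.status === 429) return 'http-429';
  if (failure.status >= 500 && failure.status <= 599) return 'http-5xx';
  return undefined; // e.g. 400 or 401
}

// Exponential backoff: the delay doubles with each 1-based attempt number.
function backoffDelayMs(attempt: number, baseDelayMs = 1000): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

console.log(classifyRetryError({ status: 503 }), backoffDelayMs(1), backoffDelayMs(2));
```

Production retry loops often add random jitter to the delay so that many clients do not retry in lockstep; it is omitted here to keep the values deterministic.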
1407
- **Querying Activities with Retries:**
1408
-
1409
- ```typescript
1410
- // Query activities that had retries (assumes `db` is a MongoDB Node.js driver `Db`; gateway rows live in `ai-actions`)
1411
- const activitiesWithRetries = await db.collection('ai-actions').find({
1412
- 'response.metadata.retries.count': { $gt: 0 }
1413
- }).toArray();
1414
-
1415
- // Query activities with network errors that were retried
1416
- const networkRetries = await db.collection('ai-actions').find({
1417
- 'response.metadata.retries.attempts.errorType': 'network'
1418
- }).toArray();
1419
-
1420
- // Query activities that failed after retries
1421
- const failedAfterRetries = await db.collection('ai-actions').find({
1422
- status: 'failed',
1423
- error: /\[Retries: \d+\]/
1424
- }).toArray();
1425
- ```
1426
-
1427
- **Requirements:**
1428
- - `@x12i/activix` required for retry tracking metadata persistence
1429
- - Backward compatible: Works with older versions (retry metadata just won't be stored)
1430
-
1431
- ### 4. Response Structure (v2.1.0+)
1432
-
1433
- The gateway returns a comprehensive response structure that captures the full lifecycle: raw provider response, gateway normalization, inference parsing, and calculated metrics.
1434
-
1435
- #### Complete Response Structure
1436
-
1437
- ```typescript
1438
- const aiRequestId = 'resp-shape-1';
1439
- const response = await gateway.invoke({
1440
- aiRequestId,
1441
- agentId: 'agent-456',
1442
- actionType: 'skill',
1443
- actionRef: 'skills/qa',
1444
- instructions: 'You are a helpful assistant.',
1445
- prompt: '{{input}}',
1446
- workingMemory: { input: 'What is AI?' },
1447
- identity: {
1448
- sessionId: 's1',
1449
- instance: { instanceId: 'agent-456', type: 'test' },
1450
- aiRequestId,
1451
- jobId: 'job-123',
1452
- taskId: 'task-1',
1453
- agentId: 'agent-456'
1454
- },
1455
- config: { model: 'gpt-5-nano', provider: 'openai' }
1456
- });
1457
-
1458
- // Response structure:
1459
- {
1460
- // ============================================
1461
- // Raw Provider Response (from router)
1462
- // ============================================
1463
- content: string, // Normalized string (always present)
1464
- rawText?: string, // Original raw text from provider (before parsing)
1465
-
1466
- // Raw content from provider (if preserved)
1467
- // Note: response.content is normalized, rawContent would be in routerResponse
1468
-
1469
- // ============================================
1470
- // Gateway Normalization & Parsing
1471
- // ============================================
1472
- parsedContent?: TContent, // Parsed JSON object/array (if content was JSON)
1473
-
1474
- metadata: {
1475
- // Content type classification
1476
- contentType?: 'string' | 'object' | 'array' | 'null',
1477
-
1478
- // ============================================
1479
- // Gateway Calculated Metrics
1480
- // ============================================
1481
- jobId?: string, // Job ID for correlation
1482
- latencyMs: number, // Execution time in milliseconds
1483
- tokens: {
1484
- prompt: number, // Input tokens
1485
- completion: number, // Output tokens
1486
- total: number, // Total tokens
1487
- // Cache token support (if available)
1488
- cacheInputTokens?: number,
1489
- cacheOutputTokens?: number,
1490
- cacheTotalTokens?: number
1491
- },
1492
- model?: string, // Model ID used (e.g., 'gpt-4o', 'claude-sonnet-4')
1493
- modelUsed?: string, // Resolved/served model id (when distinct from request model)
1494
- provider?: string, // Provider used (e.g., 'openai', 'anthropic')
1495
- costStatus?: 'priced' | 'unpriced', // Billing state (Run Analysis G8)
1496
- costUsd?: number, // USD when costStatus === 'priced' (preferred field)
1497
- cost?: number, // USD mirror of costUsd when priced
1498
- costBreakdown?: { // Optional when aiTools catalog pricing runs (calculateCost + breakdown)
1499
- promptCostUsd?: number;
1500
- completionCostUsd?: number;
1501
- // ...other breakdown keys from @x12i/ai-tools
1502
- },
1503
-
1504
- // ============================================
1505
- // Inference Output Parsing (if inferenceType provided)
1506
- // ============================================
1507
- parsedOutput?: unknown, // Typed inference output (classification, Q&A, etc.)
1508
- inferenceType?: string, // Inference type used (e.g., 'classification')
1509
- outputValidationErrors?: string[], // Schema validation errors (if validation enabled)
1510
-
1511
- // ============================================
1512
- // Provider Metadata (from router)
1513
- // ============================================
1514
- // Additional metadata from provider response
1515
- // (merged from routerResponse.metadata)
1516
- },
1517
-
1518
- // ============================================
1519
- // Usage Information (from router)
1520
- // ============================================
1521
- usage?: {
1522
- cost?: number, // Cost from provider
1523
- // Additional usage fields from provider
1524
- }
1525
- }
1526
- ```
1527
-
1528
- #### Response Structure Breakdown
1529
-
1530
- **1. Raw Provider Response:**
1531
- - `content` - Normalized string (always present, never "[object Object]")
1532
- - `rawText` - Original raw text from provider (preserved if available)
1533
- - `usage` - Usage information from provider (cost, tokens if available)
1534
- - Provider metadata merged into `response.metadata`
1535
-
1536
- **2. Gateway Normalization & Parsing:**
1537
- - `parsedContent` - Parsed JSON object/array (if content was JSON)
1538
- - `metadata.contentType` - Type classification: `'string' | 'object' | 'array' | 'null'`
1539
-
1540
- **3. Inference Output Parsing** (if `inferenceType` provided in request):
1541
- - `metadata.parsedOutput` - Typed inference output (classification, question-answer, extraction, etc.)
1542
- - `metadata.inferenceType` - Inference type used
1543
- - `metadata.outputValidationErrors` - Schema validation errors (if validation enabled)
1544
-
1545
- **4. Gateway Calculated Metrics:**
1546
- - `metadata.jobId` - Job ID for correlation
1547
- - `metadata.latencyMs` - Request duration in milliseconds
1548
- - `metadata.tokens` - Token breakdown (prompt, completion, total, cache tokens)
1549
- - `metadata.costStatus` - `priced` | `unpriced` (see [Cost reporting](#cost-reporting-invoke-response--activix-run-analysis-g8))
1550
- - `metadata.costUsd` / `metadata.cost` - USD when priced
1551
- - `metadata.costBreakdown` - Optional catalog breakdown when `aiTools.calculateCost` applies
1552
- - `metadata.model` / `metadata.modelUsed` - Model id used
1553
- - `metadata.provider` - Provider used
1554
-
1555
- #### Example: Full Response
1556
-
1557
- ```typescript
1558
- const aiRequestId = 'cls-1';
1559
- const response = await gateway.invoke({
1560
- aiRequestId,
1561
- agentId: 'agent-456',
1562
- actionType: 'skill',
1563
- actionRef: 'skills/sentiment',
1564
- instructions: 'Classify sentiment',
1565
- prompt: '{{input}}',
1566
- workingMemory: { input: 'I love this product!' },
1567
- inferenceType: 'classification',
1568
- identity: {
1569
- sessionId: 's1',
1570
- instance: { instanceId: 'agent-456', type: 'test' },
1571
- aiRequestId,
1572
- jobId: 'job-123',
1573
- taskId: 'task-1',
1574
- agentId: 'agent-456'
1575
- },
1576
- config: { model: 'gpt-5-nano', provider: 'openai' }
1577
- });
1578
-
1579
- // Complete response structure:
1580
- {
1581
- // Normalized content (always string)
1582
- content: '{"label":"positive","confidence":0.95}',
1583
-
1584
- // Raw text from provider
1585
- rawText: '{"label":"positive","confidence":0.95}',
1586
-
1587
- // Parsed JSON (if content was JSON)
1588
- parsedContent: { label: 'positive', confidence: 0.95 },
1589
-
1590
- metadata: {
1591
- // Content classification
1592
- contentType: 'object',
1593
-
1594
- // Gateway metrics
1595
- jobId: 'job-123',
1596
- latencyMs: 1250,
1597
- tokens: {
1598
- prompt: 100,
1599
- completion: 50,
1600
- total: 150
1601
- },
1602
- modelUsed: 'gpt-5-mini',
1603
- provider: 'openai',
1604
- costStatus: 'priced',
1605
- costUsd: 0.002,
1606
- cost: 0.002,
1607
-
1608
- // Inference output (parsed)
1609
- parsedOutput: {
1610
- label: 'positive',
1611
- confidence: 0.95
1612
- },
1613
- inferenceType: 'classification',
1614
- outputValidationErrors: undefined // No validation errors
1615
- }
1616
- }
1617
- ```
1618
-
1619
- **Note:** The response structure captures the full lifecycle from raw provider response through gateway normalization to final parsed inference output, providing complete observability and traceability.
1620
-
1621
- ### 5. Structured Logging
1622
-
1623
- The gateway uses **`@x12i/logxer`** for structured logging with **`LogMeta`** correlation. See [Logger initialization](./docs/LOGGER_INITIALIZATION.md).
1624
-
1625
- ```typescript
1626
- import { createLogxer } from '@x12i/logxer';
1627
-
1628
- const gateway = new AIGateway({
1629
- enableLogging: true,
1630
- packageName: 'MY_APP',
1631
- logger: createLogxer(
1632
- { packageName: 'MY_APP', envPrefix: 'MY_APP', debugNamespace: 'my-app' },
1633
- {
1634
- logLevel: 'info',
1635
- logFormat: 'json',
1636
- enableUnifiedLogger: true
1637
- }
1638
- )
1639
- });
1640
-
1641
- // All operations are automatically logged:
1642
- // - Request initiation with jobId
1643
- // - Provider/model selection
1644
- // - Usage consumption
1645
- // - Success/failure with full context
1646
- ```
1647
-
1648
- ### 6. Object Type Output Support (@x12i/outputs-library)
1649
-
1650
- The gateway integrates with `@x12i/outputs-library` to parse LLM responses into typed inference outputs (classification, question-answer, extraction, etc.).
1651
-
1652
- > **Current request model:** **`parseOptions`**, **`validateOutputSchema`**, and **`strictValidation`** are **not** fields on **`ChatRequest`** / **`AIInvokeRequest`** anymore (removed). **`inferenceType`** may still be recorded for **Activix** / activity metadata. The primary **`invoke()`** response path uses **flex-md extraction** into **`parsedContent`**; for **`@x12i/outputs-library`** workflows, parse **`response.content`** / **`parsedContent`** in your app or follow instruction metadata (**`InstructionMetadata.parseOptions`** applies to catalog entries, not the invoke payload).
1653
- > The **code samples** in subsections below still show legacy request shapes for illustration — adjust them to include **`aiRequestId`**, **`identity`**, **`actionType`**, **`actionRef`**, **`prompt`**, **`workingMemory`**, and omit removed fields.
1654
-
1655
- #### Overview
1656
-
1657
- When **`@x12i/outputs-library`** is used (directly or via future gateway wiring), specifying an **`inferenceType`** can classify parsing intent. Response typing and validation should align with your integration layer and **`EnhancedLLMResponse.metadata`** (e.g. **`parsedContent`**, **`parsingMethod`**).
1658
-
1659
- #### Installation
1660
-
1661
- ```bash
1662
- npm install @x12i/outputs-library
1663
- ```
1664
-
1665
- **Note**: `@x12i/outputs-library` is automatically installed as a dependency.
1666
-
1667
- **Dependency Resolution**: The gateway includes npm overrides to resolve version conflicts between the outputs library and content-registry. The integration uses dynamic imports for graceful degradation - if the outputs library is not available, the gateway will continue to work (parsing will be skipped with a warning).
1668
-
1669
- **Installation**: The package.json includes overrides to handle version conflicts automatically. If you still encounter issues:
1670
-
1671
- ```bash
1672
- npm install --legacy-peer-deps
1673
- ```
1674
-
1675
- **Note for Package Maintainers**: The `@x12i/outputs-library` package should update its peer dependency from `@xronoces/content-registry@^1.0.0` to `@xronoces/content-registry@>=1.0.0` or `^1.0.0 || >=2.7.0` to support both versions. See `DEPENDENCY_RESOLUTION.md` for details.
1676
-
1677
- #### Supported Inference Types
1678
-
1679
- - `classification` - Classify content into predefined categories
1680
- - `question-answer` - Answer questions based on context
1681
- - `extraction` - Extract structured data from unstructured text
1682
- - `summarization` - Generate summaries of content
1683
- - `risk-assessment` - Assess risks with scores and factors
1684
- - `recommendation` - Generate recommendations with priorities
1685
- - `transformation` - Transform data between formats
1686
-
1687
- #### Basic usage (`invoke` + `inferenceType`)
1688
-
1689
- `parseOptions` / `validateOutputSchema` / `strictValidation` are **not** request fields. Supply **`aiRequestId`**, **`identity`**, **`actionType`**, **`actionRef`**, **`prompt`**, **`workingMemory`**, and optionally **`inferenceType`** for activity metadata.
1690
-
1691
- ```typescript
1692
- const aiRequestId = 'cls-1';
1693
- const identity = {
1694
- sessionId: 's1',
1695
- instance: { instanceId: 'agent-456', type: 'test' },
1696
- aiRequestId,
1697
- jobId: 'job-123',
1698
- taskId: 'task-1',
1699
- agentId: 'agent-456'
1700
- };
1701
-
1702
- const response = await gateway.invoke({
1703
- aiRequestId,
1704
- agentId: 'agent-456',
1705
- actionType: 'skill',
1706
- actionRef: 'skills/classification',
1707
- instructions: 'Classify the sentiment of the text.',
1708
- prompt: '{{input}}',
1709
- workingMemory: { input: 'This product is amazing!' },
1710
- inferenceType: 'classification',
1711
- identity,
1712
- config: { model: 'gpt-4o', provider: 'openai' }
1713
- });
1714
- ```
1715
-
1716
- Use the same shape for **`question-answer`**, **`extraction`**, etc., changing **`inferenceType`**, **`instructions`**, and **`workingMemory`**. Parse or validate structured output with **`@x12i/outputs-library`** in your process if needed.
1717
-
1718
-
1719
- #### Response Structure
1720
-
1721
- When `inferenceType` is provided, the response includes:
1722
-
1723
- ```typescript
1724
- {
1725
- content: string; // Normalized content (ALWAYS a string - JSON objects are stringified)
1726
- rawText?: string; // Original raw text (always a string when present)
1727
-   parsedContent?: object | unknown[];  // Parsed JSON content (ALWAYS object/array when JSON, forced structure)
1728
- metadata: {
1729
- // ... standard metadata ...
1730
- parsedOutput?: unknown; // Typed inference output (ALWAYS present with consistent structure when inferenceType provided)
1731
- inferenceType?: string; // The inference type used
1732
- outputValidationErrors?: string[]; // Validation errors (if enabled)
1733
- outputValidationPassed?: boolean; // Whether validation passed (v1.7.0+)
1734
- outputSchema?: Record<string, unknown>; // Schema used for validation (v1.7.0+)
1735
- outputsLibraryAvailable?: boolean; // Whether outputs library was used (v1.7.0+)
1736
- parsingMethod?: 'outputs-library' | 'json-parse' | 'raw'; // Parsing method used (v1.7.0+)
1737
- isFallback?: boolean; // Whether fallback structure was used (v1.7.4+)
1738
- outputAudit?: { // Structure audit results (v2.1.1+)
1739
- hasAllRequiredFields: boolean;
1740
- missingRequiredFields?: string[];
1741
- extraFields?: string[];
1742
- matchingFields?: string[];
1743
- responseFieldCount: number;
1744
- schemaFieldCount: number;
1745
- structureMatches: boolean;
1746
- };
1747
- }
1748
- }
1749
- ```
1750
-
1751
- **Guaranteed Structure Consistency (v1.7.4+):**
1752
-
1753
- The gateway ensures structures are always consistent:
1754
-
1755
- - **`content`**: Always a string (objects/arrays are JSON.stringified)
1756
- - **`parsedContent`**: Always an object or array when content is JSON (forced structure)
1757
- - **`parsedOutput`**: Always has type-specific structure when `inferenceType` is provided (never undefined)
1758
-
1759
- **Forced Structure Examples:**
1760
-
1761
- ```typescript
1762
- // Invalid JSON string → Wrapped in object
1763
- // Input: "not valid json"
1764
- // parsedContent: { text: "not valid json", _wrapped: true }
1765
-
1766
- // Plain text → Wrapped in object if expecting JSON
1767
- // Input: "Hello world"
1768
- // parsedContent: { text: "Hello world", _wrapped: true }
1769
-
1770
- // Object → Always preserved as object
1771
- // Input: { key: "value" }
1772
- // parsedContent: { key: "value" }
1773
-
1774
- // Array → Always preserved as array
1775
- // Input: [1, 2, 3]
1776
- // parsedContent: [1, 2, 3]
1777
- ```
1778
-
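The forced-structure rules above can be sketched as a small helper. This is a minimal illustration, not the gateway's implementation: `toParsedContent` and the `ParsedContent` type are hypothetical names, and wrapping JSON primitives (for example `"42"`) in the `{ text, _wrapped }` envelope is an assumption.

```typescript
// Hypothetical helper; `_wrapped` mirrors the marker shown in the examples above.
type ParsedContent = Record<string, unknown> | unknown[];

function toParsedContent(input: unknown): ParsedContent {
  if (typeof input === 'string') {
    try {
      const parsed: unknown = JSON.parse(input);
      // Only objects and arrays count as structured content.
      if (parsed !== null && typeof parsed === 'object') {
        return parsed as ParsedContent;
      }
    } catch {
      // Not valid JSON: fall through to the wrapper below.
    }
    return { text: input, _wrapped: true };
  }
  if (input !== null && typeof input === 'object') {
    // Objects and arrays are preserved as-is.
    return input as ParsedContent;
  }
  // Assumption: other primitives are stringified and wrapped.
  return { text: String(input), _wrapped: true };
}
```

Callers can check `_wrapped` to tell provider-supplied JSON apart from text that was wrapped to keep the structure consistent.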
1779
- #### Using Outputs Library Directly
1780
-
1781
- You can also use the outputs library utilities directly:
1782
-
1783
- ```typescript
1784
- import {
1785
- ResponseParser,
1786
- SchemaValidator,
1787
- SchemaRegistry,
1788
- type ClassificationOutput
1789
- } from '@x12i/ai-gateway';
1790
-
1791
- // Parse manually
1792
- const output = ResponseParser.parse<ClassificationOutput>(
1793
- rawText,
1794
- parsedContent,
1795
- contentType,
1796
- 'classification',
1797
- { classes: ['positive', 'negative'] }
1798
- );
1799
-
1800
- // Validate against schema
1801
- const schema = SchemaRegistry.getSchema('classification');
1802
- const validator = new SchemaValidator();
1803
- if (!validator.validate(output, schema)) {
1804
- console.error('Validation errors:', validator.getErrors());
1805
- }
1806
- ```
1807
-
1808
- #### Automatic Output Schema Guidance (v2.1.1+)
1809
-
1810
- To get a JSON response that conforms to a specific schema, provide the output object schema. The gateway then extends the instructions with explicit expectations about the JSON structure.
1811
-
1812
- **How It Works:**
1813
-
1814
- 1. **Provide `inferenceType`** in your request, OR
1815
- 2. **Use an instruction key** that has `outputSchema` in its metadata
1816
-
1817
- The gateway will:
1818
- - Automatically fetch the schema from instruction metadata (if using content-registry)
1819
- - Resolve `outputObjectPrefix` from `instructions-blocks.json`
1820
- - Append detailed schema guidance to instructions, including:
1821
- - Field descriptions and types
1822
- - Required vs optional fields
1823
- - Full JSON schema in a code block
1824
-
1825
- **Example:**
1826
-
1827
- ```typescript
1828
- // Instruction metadata has:
1829
- // {
1830
- // instructionKey: 'extraction/user-data',
1831
- // outputSchema: {
1832
- // type: 'object',
1833
- // properties: {
1834
- // name: { type: 'string', description: 'User full name' },
1835
- // email: { type: 'string', description: 'User email address' },
1836
- // age: { type: 'number', description: 'User age' }
1837
- // },
1838
- // required: ['name', 'email']
1839
- // }
1840
- // }
1841
-
1842
- const aiRequestId = 'schema-guide-1';
1843
- const response = await gateway.invoke({
1844
- aiRequestId,
1845
- agentId: 'agent-456',
1846
- actionType: 'skill',
1847
- actionRef: 'skills/extraction-user-data',
1848
- instructions: 'extraction/user-data', // Instruction key with outputSchema
1849
- prompt: '{{input}}',
1850
- workingMemory: { input: 'John Doe, john@example.com, 30 years old' },
1851
- inferenceType: 'extraction',
1852
- identity: {
1853
- sessionId: 's1',
1854
- instance: { instanceId: 'agent-456', type: 'test' },
1855
- aiRequestId,
1856
- jobId: 'job-123',
1857
- taskId: 'task-1',
1858
- agentId: 'agent-456'
1859
- },
1860
- config: { model: 'gpt-4o', provider: 'openai' }
1861
- });
1862
-
1863
- // Instructions automatically extended with:
1864
- // "You must respond with a single valid JSON object that strictly conforms to the schema provided below:
1865
- //
1866
- // Expected fields:
1867
- // - name (string) [REQUIRED]: User full name
1868
- // - email (string) [REQUIRED]: User email address
1869
- // - age (number) [optional]: User age
1870
- //
1871
- // Full JSON Schema:
1872
- // ```json
1873
- // { ... schema ... }
1874
- // ```"
1875
- ```
1876
-
1877
- **Configuration:**
1878
-
1879
- Add `outputObjectPrefix` to `instructions-blocks.json`:
1880
-
1881
- ```json
1882
- {
1883
- "outputObjectPrefix": "You must respond with a single valid JSON object that strictly conforms to the schema provided below:"
1884
- }
1885
- ```
1886
-
1887
- This prefix is automatically appended to instructions when `outputType` or `outputSchema` is available.
1888
-
1889
- #### Schema validation (request flags removed)
1890
-
1891
- **`validateOutputSchema`** and **`strictValidation`** are **not** on **`AIInvokeRequest`** / **`ChatRequest`** anymore. Validate **`response.parsedContent`** or **`response.content`** in your application (for example **`SchemaValidator`** from **`@x12i/outputs-library`**).
1892
-
1893
- Instruction metadata can still carry **`outputSchema`** for documentation and instruction rendering via the content registry; that does **not** restore the old request-level validation switches.
1894
-
1895
-
1896
- #### Output Structure Audit (v2.1.1+)
1897
-
1898
- The gateway automatically audits the structure of parsed output against the expected schema when a schema is available. This provides detailed analysis of what fields are present, missing, or extra - without affecting the response itself.
1899
-
1900
- **Automatic Audit:**
1901
-
1902
- When `outputSchema` is available (from instruction metadata or provided directly), the gateway automatically:
1903
- - Compares the response structure against the schema
1904
- - Identifies missing required fields
1905
- - Identifies extra fields (in response but not in schema)
1906
- - Reports matching fields
1907
- - Provides structure match status
1908
-
1909
- **Audit Results:**
1910
-
1911
- ```typescript
1912
- const aiRequestId = 'audit-1';
1913
- const response = await gateway.invoke({
1914
- aiRequestId,
1915
- agentId: 'agent-456',
1916
- actionType: 'skill',
1917
- actionRef: 'skills/extraction-user-data',
1918
- instructions: 'extraction/user-data',
1919
- prompt: '{{input}}',
1920
- workingMemory: { input: 'John Doe, john@example.com' },
1921
- inferenceType: 'extraction',
1922
- identity: {
1923
- sessionId: 's1',
1924
- instance: { instanceId: 'agent-456', type: 'test' },
1925
- aiRequestId,
1926
- jobId: 'job-123',
1927
- taskId: 'task-1',
1928
- agentId: 'agent-456'
1929
- },
1930
- config: { model: 'gpt-4o', provider: 'openai' }
1931
- });
1932
-
1933
- // When present, `outputAudit` is informational (see types / runtime behavior)
1934
- console.log(response.metadata.outputAudit);
1935
- // {
1936
- // hasAllRequiredFields: true,
1937
- // missingRequiredFields: undefined,
1938
- // extraFields: ['timestamp'], // Extra field not in schema
1939
- // matchingFields: ['name', 'email'],
1940
- // responseFieldCount: 3,
1941
- // schemaFieldCount: 2,
1942
- // structureMatches: false // false because of extra field
1943
- // }
1944
- ```
1945
-
1946
- **Audit Metadata:**
1947
-
1948
- - `outputAudit.hasAllRequiredFields` - Whether all required fields are present
1949
- - `outputAudit.missingRequiredFields` - Array of missing required field names
1950
- - `outputAudit.extraFields` - Array of fields in response but not in schema
1951
- - `outputAudit.matchingFields` - Array of fields present in both response and schema
1952
- - `outputAudit.responseFieldCount` - Total fields in response
1953
- - `outputAudit.schemaFieldCount` - Total fields expected in schema
1954
- - `outputAudit.structureMatches` - Whether structure matches exactly (all required present, no extra fields)
1955
-
1956
- **Note:** Audit is always performed when a schema is available - it's informational only and doesn't affect the response. Clients can use it for monitoring, logging, or quality checks.
1957
-
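To make the audit fields concrete, here is a minimal sketch of how such a report could be computed from a flat object schema. It is an illustration under stated assumptions, not the gateway's audit code: `auditOutput` and `ObjectSchema` are hypothetical names, only top-level fields are compared, and empty arrays are reported as `undefined` to match the example output above.

```typescript
// Hypothetical flat schema shape (top-level properties only).
interface ObjectSchema {
  properties: Record<string, unknown>;
  required?: string[];
}

interface OutputAudit {
  hasAllRequiredFields: boolean;
  missingRequiredFields?: string[];
  extraFields?: string[];
  matchingFields?: string[];
  responseFieldCount: number;
  schemaFieldCount: number;
  structureMatches: boolean;
}

function auditOutput(output: Record<string, unknown>, schema: ObjectSchema): OutputAudit {
  const schemaFields = Object.keys(schema.properties);
  const responseFields = Object.keys(output);
  const missing = (schema.required ?? []).filter((f) => !responseFields.includes(f));
  const extra = responseFields.filter((f) => !schemaFields.includes(f));
  const matching = responseFields.filter((f) => schemaFields.includes(f));
  return {
    hasAllRequiredFields: missing.length === 0,
    missingRequiredFields: missing.length > 0 ? missing : undefined,
    extraFields: extra.length > 0 ? extra : undefined,
    matchingFields: matching.length > 0 ? matching : undefined,
    responseFieldCount: responseFields.length,
    schemaFieldCount: schemaFields.length,
    // Exact match: every required field present and no extra fields.
    structureMatches: missing.length === 0 && extra.length === 0,
  };
}

// Example with a hypothetical two-field schema and one extra response field.
const exampleAudit = auditOutput(
  { name: 'John Doe', email: 'john@example.com', timestamp: 1 },
  { properties: { name: {}, email: {} }, required: ['name', 'email'] }
);
```

Because the audit is informational, a result like this is best routed to logging or monitoring rather than used to reject the response.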
1958
- **Graceful Outputs Library Handling:**
1959
-
1960
- When the outputs library is not available:
1961
- - Gateway automatically falls back to JSON parsing
1962
- - Response metadata indicates parsing method used (`outputsLibraryAvailable`, `parsingMethod`)
1963
- - Clear warnings are logged (not generic errors)
1964
- - Request continues with fallback parsing
1965
-
1966
- ```typescript
1967
- // Check parsing method in response
1968
- if (response.metadata.parsingMethod === 'json-parse') {
1969
- console.log('Outputs library unavailable, used JSON parsing fallback');
1970
- }
1971
- if (!response.metadata.outputsLibraryAvailable) {
1972
- console.log('Outputs library was not available for this request');
1973
- }
1974
-
1975
- // Check if fallback structure was used
1976
- if (response.metadata.isFallback) {
1977
- console.log('Parsing failed, using fallback structure');
1978
- // parsedOutput will have _fallback: true and _raw fields
1979
- const fallbackOutput = response.metadata.parsedOutput as any;
1980
- if (fallbackOutput._fallback) {
1981
- console.log('Raw response:', fallbackOutput._raw);
1982
- }
1983
- }
1984
- ```
1985
-
1986
- **Guaranteed Consistent Structure (v1.7.4+):**
1987
-
1988
- The gateway ensures **consistent structure at all levels**:
1989
-
1990
- 1. **Content Level**: `content` is always a string (JSON objects are stringified)
1991
- 2. **Parsed Content Level**: `parsedContent` is always an object/array when content is JSON (forced structure)
1992
- 3. **Parsed Output Level**: `parsedOutput` always has type-specific structure when `inferenceType` is provided
1993
-
1994
- **Forced Structure Conversion:**
1995
- - **JSON → Object/Array**: Always parsed into object/array structure (even if invalid JSON, wrapped)
1996
- - **Text → String**: Always kept as string in `content` field
1997
- - **Object → JSON String**: Always stringified in `content` field
1998
- - **Array → JSON String**: Always stringified in `content` field
1999
-
2000
- When `inferenceType` is provided, the gateway **always** returns a consistent structure in `parsedOutput`, even when parsing fails:
2001
-
2002
- ```typescript
2003
- // Success case
2004
- {
2005
- metadata: {
2006
- parsedOutput: {
2007
- extracted: { ... } // ✅ Structured output
2008
- },
2009
- isFallback: false
2010
- }
2011
- }
2012
-
2013
- // Failure case - guaranteed structure
2014
- {
2015
- metadata: {
2016
- parsedOutput: {
2017
- extracted: parsedContent || {}, // ✅ Always present
2018
- _fallback: true, // ✅ Indicates parsing failed
2019
- _raw: rawText // ✅ Original response preserved
2020
- },
2021
- isFallback: true,
2022
- parsingMethod: 'json-parse'
2023
- }
2024
- }
2025
- ```
2026
-
2027
- **Type-Specific Fallback Structures:**
2028
-
2029
- Each inference type has a standard fallback structure:
2030
-
2031
- - **Extraction**: `{ extracted: {}, _fallback: true, _raw: string }`
2032
- - **Classification**: `{ classes: [], confidence: {}, _fallback: true, _raw: string }`
2033
- - **Question-Answer**: `{ answer: string, confidence?: number, _fallback: true, _raw: string }`
2034
- - **Summarization**: `{ summary: string, keyPoints: [], _fallback: true, _raw: string }`
2035
- - **Sentiment**: `{ sentiment: string, score: number, _fallback: true, _raw: string }`
2036
-
2037
- This ensures consumers can always safely access expected fields without null checks:
2038
-
2039
- ```typescript
2040
- // ✅ Safe - structure is guaranteed
2041
- const output = response.metadata.parsedOutput as ExtractionOutput;
2042
- const extracted = output.extracted; // Always present, never undefined
2043
-
2044
- // Check if it's a fallback
2045
- if (output._fallback) {
2046
- console.warn('Parsing failed, using fallback structure');
2047
- console.log('Raw response:', output._raw);
2048
- }
2049
- ```
2050
-
2051
- **Fallback Parsing Priority:**
2052
-
2053
- The gateway tries the following parsing methods in order, falling back to the next one when a method is unavailable or fails:
2054
-
2055
- 1. **Outputs Library** (preferred) - Full parsing with type inference
2056
- 2. **JSON Parse** - If `parsedContent` is available as object/array
2057
- 3. **Raw Content** - Last resort, returns raw text/object
2058
-
2059
- **Error Detection:**
2060
-
2061
- The gateway automatically detects and handles outputs library errors:
2062
- - **Module Not Found**: Clear warning logged, fallback parsing used
2063
- - **Export Missing**: Clear warning logged, fallback parsing used
2064
- - **Version Mismatch**: Clear warning logged, fallback parsing used
2065
-
2066
- All errors are logged with clear messages indicating the issue and the fallback method used.
2067
-
2068
- #### Benefits
2069
-
2070
- - **Type Safety**: Full TypeScript support with typed outputs
2071
- - **Consistency**: Standardized object structures across all SDKs
2072
- - **Validation**: Built-in schema validation with strict/non-strict modes (v1.7.0+)
2073
- - **Metadata-Driven**: Automatic schema resolution from instruction metadata (v1.7.0+)
2074
- - **Consistent Structure**: Guaranteed consistent `parsedOutput` structure even when parsing fails (v1.7.4+)
2075
- - **Flexibility**: Works with any inference type
2076
- - **Integration**: Seamlessly integrated into gateway responses
2077
- - **Graceful Degradation**: Automatic fallback when outputs library unavailable (v1.7.0+)
2078
-
2079
- ### 7. Response transformation hooks (removed from request)
2080
-
2081
- **Current behavior:** Request-level `transformations` and the `ResponseTransformationConfig` type are **no longer** part of the gateway request model. They were removed in favor of a smaller public surface.
2082
-
2083
- **What to use instead:**
2084
-
2085
- - Post-process `EnhancedLLMResponse` in your application (e.g. map `parsedContent` / `content`).
2086
- - Use router **request/response interceptors** from `@x12i/ai-providers-router` where appropriate.
2087
- - For instruction-catalog defaults, `InstructionMetadata` may still carry its own `parseOptions`; that is metadata about an instruction definition, **not** a field on `ChatRequest` / `AIInvokeRequest`.
2088
-
2089
- Historical examples in older docs or forks that show `transformations` on `invoke()` payloads should be ignored or migrated.
2090
-
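As one way to post-process a response in application code, the sketch below maps `parsedContent` / `content` to a single label. It assumes only the two response fields named above; `ResponseLike` is a local stand-in for the relevant part of `EnhancedLLMResponse`, and `extractLabel` is a hypothetical helper.

```typescript
// Local stand-in for the fields of EnhancedLLMResponse used here (assumed shape).
interface ResponseLike {
  content: string;
  parsedContent?: unknown;
}

// Prefer a structured `label` field; otherwise fall back to the trimmed text content.
function extractLabel(response: ResponseLike): string {
  const parsed = response.parsedContent;
  if (parsed !== null && typeof parsed === 'object' && 'label' in parsed) {
    const label = (parsed as { label: unknown }).label;
    if (typeof label === 'string') {
      return label.toUpperCase();
    }
  }
  return response.content.trim();
}

const structuredLabel = extractLabel({
  content: '{"label":"positive","confidence":0.95}',
  parsedContent: { label: 'positive', confidence: 0.95 },
});
const plainLabel = extractLabel({ content: '  neutral  ' });
```

Keeping this mapping in the application leaves the request payload free of transformation settings, which fits the smaller request surface described above.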
2091
- ### 8. Content Registry Integration (Instruction Keys)
2092
-
2093
- The gateway integrates with `@xronoces/content-registry` to fetch instructions by key instead of hardcoding them. This enables centralized content management, versioning, and zero-deploy updates.
2094
-
2095
- #### Overview: Two Modes for Instructions
2096
-
2097
- The gateway supports **two modes** for providing instructions to LLMs:
2098
-
2099
- 1. **Text Mode (Default)**: Instructions are actual text strings embedded in code
2100
- 2. **Key Mode (New)**: Instructions are keys that reference content stored in `@xronoces/content-registry`
2101
-
2102
- Both modes work seamlessly - the gateway automatically detects which mode you're using based on the instruction format.
2103
-
2104
- #### Installation
2105
-
2106
- ```bash
2107
- npm install @xronoces/content-registry
2108
- ```
2109
-
2110
- **Note**: `@xronoces/content-registry` (v2.14.0+) is automatically installed as a dependency. This version includes simplified APIs for easier integration.
2111
-
2112
- **Configuration**: Content registry is used when `contentRegistryConfig` is provided in the gateway configuration.
2113
-
2114
- **For Internal Use**: Content-registry auto-initializes from environment variables if available, enabling instructionsBlocks resolution without requiring explicit configuration.
2115
-
2116
- **See**: [Content Resolver — Upstream Guide](./CONTENT_RESOLVER_UPSTREAM_GUIDE.md) for configuration, environment variables, content layout, and upstream checklist.
2117
-
2118
- #### Configuration
2119
-
2120
- ```typescript
2121
- import { AIGateway } from '@x12i/ai-gateway';
2122
-
2123
- const gateway = new AIGateway({
2124
- defaultProvider: 'openai',
2125
- // Enable content-registry mode
2126
- enableContentRegistry: true,
2127
- // Configure content registry
2128
- contentRegistryConfig: {
2129
- s3Bucket: 'my-content-registry-bucket',
2130
- cacheTTL: 3600, // Cache content for 1 hour
2131
- redis: {
2132
- host: 'localhost',
2133
- port: 6379,
2134
- password: process.env.REDIS_PASSWORD
2135
- },
2136
- // ... other content-registry config options
2137
- }
2138
- });
2139
- ```
2140
-
2141
- #### Mode 1: Text Mode (Automatic - Has Spaces)
2142
-
2143
- **How it works**: Automatically detected when instructions contain whitespace. Used as literal text without any resolution.
2144
-
2145
- **When used automatically**:
2146
- - Instructions contain spaces, tabs, or newlines
2147
- - No content registry lookup performed
2148
- - Fast processing with no external dependencies
2149
-
2150
- **Example**:
2151
- ```typescript
2152
- // Text mode - instructions as plain text (structured invoke path resolves templates from workingMemory)
2153
- const aiRequestId = 'text-mode-1';
2154
- const response = await gateway.invoke({
2155
- aiRequestId,
2156
- agentId: 'agent-1',
2157
- actionType: 'skill',
2158
- actionRef: 'skills/classification',
2159
- instructions:
2160
- 'You are a helpful assistant that classifies text into categories: positive, negative, or neutral.',
2161
- prompt: '{{userLine}}',
2162
- workingMemory: { userLine: 'Classify: "This product is amazing!"' },
2163
- identity: {
2164
- sessionId: 's1',
2165
- instance: { instanceId: 'agent-1', type: 'test' },
2166
- aiRequestId,
2167
- jobId: 'job-123',
2168
- taskId: 'task-1',
2169
- agentId: 'agent-1'
2170
- },
2171
- config: { model: 'gpt-4o', provider: 'openai' }
2172
- });
2173
- ```
2174
-
2175
- **Behavior**:
2176
- - Gateway uses the text as-is
2177
- - No content-registry lookup
2178
- - Fast (no network calls)
2179
-
2180
- #### Mode 2: Key Mode (Automatic - No Spaces)
2181
-
2182
- **How it works**: Automatically detected when instructions contain no whitespace. Keys are resolved from content registry.
2183
-
2184
- **When used automatically**:
2185
- - Instructions contain no spaces, tabs, or newlines
2186
- - Content is fetched from configured content registry
2187
- - Supports template variables and dynamic content
2188
- - Fail-closed: unresolvable keys become bad requests
2189
-
2190
- **Example**:
2191
- ```typescript
2192
- // Key mode — pass the instruction key on `instructions`; user text via `prompt` + `workingMemory`
2193
- const aiRequestId = 'key-mode-1';
2194
- const response = await gateway.invoke({
2195
- aiRequestId,
2196
- agentId: 'agent-456',
2197
- actionType: 'skill',
2198
- actionRef: 'skills/classification',
2199
- instructions: 'classification/basic', // ✅ Key - resolved from content-registry
2200
- prompt: '{{userLine}}',
2201
- workingMemory: {
2202
- userLine: 'Classify: "This product is amazing!"',
2203
- task: 'classification',
2204
- context: 'product reviews'
2205
- },
2206
- identity: {
2207
- sessionId: 's1',
2208
- instance: { instanceId: 'agent-456', type: 'test' },
2209
- aiRequestId,
2210
- jobId: 'job-123',
2211
- taskId: 'task-1',
2212
- agentId: 'agent-456'
2213
- },
2214
- config: { model: 'gpt-4o', provider: 'openai' }
2215
- });
2216
- ```
2217
-
2218
- **What happens behind the scenes**:
2219
- 1. Gateway detects `'classification/basic'` is a key (not plain text)
2220
- 2. Creates ContentResolver with hierarchical patterns for instruction resolution
2221
- 3. ContentResolver resolves content using configurable fallback chains:
2222
- - `content/instructions/{contentId}` (primary pattern)
2223
- - Handles consolidated vs individual file resolution
2224
- 4. Content-registry returns raw template content with full metadata envelope
2225
- 5. Instructions parser renders template with **`workingMemory`** (and tier memories) using Rendrix
2226
- 6. Gateway returns rendered instruction text with metadata
2227
- 7. Gateway sends the resolved instruction to the LLM
2228
-
2229
- **Note:** **`gateway.invoke()`** builds provider **`messages`** from **`instructions`** / **`prompt`** / **`context`** via the message builder; it does **not** append a raw **`messages`** array from the request. For a pre-built transcript, use **`gateway.invokeChat()`** (see below).
2230
-
2231
- **Template-Based Prompts (Same as Instructions)**:
2232
- Prompts work exactly like instructions - both can be resolved from content-registry using explicit keys with suffixes:
2233
-
2234
- ```typescript
2235
- const aiRequestId = 'tpl-key-1';
2236
- const response = await gateway.invoke({
2237
- aiRequestId,
2238
- agentId: 'agent-456',
2239
- actionType: 'skill',
2240
- actionRef: 'skills/professional-answer',
2241
- instructions: 'professional-answer.instructions', // Key with suffix
2242
- prompt: 'professional-answer.prompt', // Key with suffix
2243
- workingMemory: {
2244
- input: 'What is the capital of France?',
2245
- taskDescription: 'Answer questions professionally'
2246
- },
2247
- identity: {
2248
- sessionId: 's1',
2249
- instance: { instanceId: 'agent-456', type: 'test' },
2250
- aiRequestId,
2251
- jobId: 'job-123',
2252
- taskId: 'task-1',
2253
- agentId: 'agent-456'
2254
- },
2255
- config: { model: 'gpt-4o', provider: 'openai' }
2256
- });
2257
- ```
2258
-
2259
- **File Structure in Content-Registry**:
2260
- ```
2261
- .metadata/
2262
- content/
2263
- instructions/
2264
- professional-answer.instructions.md
2265
- professional-answer.prompt.md
2266
- ```
2267
-
2268
- **Key Features**:
2269
- - Both instructions and prompts use the same memory context (`workingMemory`, `shortTermMemory`, `experienceMemory`, `knowledgeMemory`)
2270
- - Both use the same template rendering system (Rendrix)
2271
- - Both support the same key detection logic (no spaces = key, has spaces = text)
2272
- - Postfix handling (`.instructions` and `.prompt`) is done upstream by AI skills
2273
-
2274
- See [Prompt Template Usage Guide](./docs/PROMPT_TEMPLATE_USAGE.md) for complete documentation.
2275
-
2276
- **Behavior**:
2277
- - Gateway resolves keys automatically using ContentResolver
2278
- - Requires `contentRegistryConfig` to be set
2279
- - Uses hierarchical content resolution with configurable patterns
2280
- - Template rendering handled by dedicated instructions parser
2281
- - Supports complex variable substitution (workingMemory, shortTermMemory, etc.)
2282
- - Keys don't include file extensions (e.g., `classification/basic` not `classification/basic.md`)
2283
- - Access to YAML front matter and version metadata
2284
- - Enhanced error handling with detailed resolution information
2285
-
2286
- #### Content Resolution Patterns
2287
-
2288
- Instruction keys are resolved using configurable hierarchical patterns through the ContentResolver. The resolver tries multiple path patterns in priority order.
2289
-
2290
- **Resolution Pattern Examples**:
2291
- ```typescript
2292
- // For instruction key 'classification/basic':
2293
- // Pattern 1: content/instructions/classification/basic
2294
- // Tries: classification/basic.md, classification/basic.txt, etc.
2295
-
2296
- // For instructionsBlocks 'input' with agentId 'agent-1':
2297
- // Pattern 1: content/instructions/agent-1/input
2298
- // Pattern 2: content/instructions/shared/input
2299
- ```
2300
-
2301
- **How Resolution Works**:
2302
- 1. Key is provided: `"classification/basic"` (no file extension)
2303
- 2. ContentResolver applies hierarchical patterns with `{contentType}`, `{contentId}`, etc.
2304
- 3. Tries patterns in priority order until content is found
2305
- 4. Returns raw template content with metadata
2306
- 5. Instructions parser renders templates with Rendrix
2307
- 6. Final rendered content sent to LLM
2308
-
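The priority-ordered lookup described above can be sketched as follows. This is an illustration only: `resolveContent` is a hypothetical function, the `Map` stands in for the content registry, and the `{agentId}` placeholder is an assumption inferred from the `agent-1` example path.

```typescript
// Hypothetical resolver: expands each pattern and returns the first hit.
function resolveContent(
  patterns: string[],
  vars: Record<string, string | undefined>,
  store: Map<string, string>
): { path: string; content: string } | undefined {
  for (const pattern of patterns) {
    // Replace {placeholders}; unknown placeholders are left untouched.
    const path = pattern.replace(/\{(\w+)\}/g, (match, name: string) => vars[name] ?? match);
    const content = store.get(path);
    if (content !== undefined) {
      return { path, content };
    }
  }
  return undefined; // Nothing matched: the caller decides how to fail.
}

// In-memory stand-in for the registry, holding only the shared block.
const registry = new Map<string, string>([
  ['content/instructions/shared/input', 'Shared input block'],
]);

const resolved = resolveContent(
  ['content/instructions/{agentId}/{contentId}', 'content/instructions/shared/{contentId}'],
  { agentId: 'agent-1', contentId: 'input' },
  registry
);
```

Here the agent-specific path misses, so the lookup falls through to the shared path, which is the behavior the two-pattern example above describes.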
2309
- **Key Detection Rules**:
2310
- The gateway automatically detects if a string is a key or plain text:
2311
-
2312
- ```typescript
2313
- // ✅ Detected as KEYS (will be resolved from content-registry):
2314
- 'classification/basic' // Path-like structure
2315
- 'extraction/entities' // Contains slashes
2316
- 'instructions/path/to/content' // Already normalized format
2317
- 'short-key' // Short, no spaces
2318
-
2319
- // ❌ Treated as TEXT (used as-is):
2320
- 'You are a helpful assistant...'       // Contains spaces
2321
- 'This is a very long instruction that exceeds 200 characters and contains multiple sentences with spaces and newlines...' // Too long
2322
- 'Instruction with spaces' // Contains spaces (unless very short)
2323
- ```
2324
-
2325
- **Detection Logic**:
2326
- - ✅ **Key if**: Contains `/` (path-like) OR (short length < 100 chars AND no spaces)
2327
- - ❌ **Text if**: Contains newlines OR length > 200 chars OR has spaces (for longer strings)
2328
-
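The detection rules above can be approximated with a small classifier. This is a sketch under stated assumptions, not the gateway's detector: `classifyInstruction` is a hypothetical name, and because the rules leave some cases open, it assumes that any whitespace makes the value text (matching the "no spaces = key, has spaces = text" summary), even when the value contains `/`.

```typescript
type InstructionKind = 'key' | 'text';

// Hypothetical classifier following the documented thresholds (100 / 200 chars).
function classifyInstruction(value: string): InstructionKind {
  // Newlines or very long strings are always treated as literal text.
  if (/[\r\n]/.test(value) || value.length > 200) {
    return 'text';
  }
  // Assumption: whitespace takes precedence over a path-like shape.
  if (/\s/.test(value)) {
    return 'text';
  }
  // Path-like values are keys regardless of length (up to the 200-char limit).
  if (value.includes('/')) {
    return 'key';
  }
  // Short values without spaces are keys; longer ones are treated as text.
  return value.length < 100 ? 'key' : 'text';
}
```

With this sketch, `'classification/basic'` and `'short-key'` classify as keys, while `'Instruction with spaces'` classifies as text.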
2329
- #### Template Variables (Content Registry)
2330
-
2331
- When using instruction keys, you can pass variables for template rendering. Template rendering is handled by Rendrix after fetching content from content-registry.
2332
-
2333
- **How it works**:
2334
- 1. Content stored in registry uses template syntax: `{{variableName}}`
2335
- 2. Gateway fetches content from content-registry using `fetchContent()`
2336
- 3. Gateway uses Rendrix to render the template with your variables
2337
- 4. Resolved instruction is used in the LLM request
2338
-
2339
- **Example**:
2340
-
2341
- **Content stored in registry** (`instructions/classification/basic`):
2342
- ```
2343
- You are a {{task}} expert.
2344
- Classify text into the following categories: {{classes}}.
2345
- Provide confidence scores and reasoning.
2346
- ```
2347
-
2348
- **Code**:
2349
- ```typescript
2350
- const aiRequestId = 'tpl-vars-1';
2351
- const response = await gateway.invoke({
2352
- aiRequestId,
2353
- agentId: 'agent-456',
2354
- actionType: 'skill',
2355
- actionRef: 'skills/classification',
2356
- instructions: 'classification/basic',
2357
- prompt: '{{userLine}}',
2358
- workingMemory: {
2359
- userLine: 'Classify: "This product is amazing!"',
2360
- task: 'classification',
2361
- classes: 'positive, negative, neutral'
2362
- },
2363
- identity: {
2364
- sessionId: 's1',
2365
- instance: { instanceId: 'agent-456', type: 'test' },
2366
- aiRequestId,
2367
- jobId: 'job-123',
2368
- taskId: 'task-1',
2369
- agentId: 'agent-456'
2370
- },
2371
- config: { model: 'gpt-4o', provider: 'openai' }
2372
- });
2373
- ```
2374
-
2375
- **Resolved instruction** (what the LLM actually receives):
2376
- ```
2377
- You are a classification expert.
2378
- Classify text into the following categories: positive, negative, neutral.
2379
- Provide confidence scores and reasoning.
2380
- ```
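The substitution step in the example above can be sketched as follows. The `renderTemplate` function is a hypothetical stand-in written for illustration; it is not Rendrix's API, and it assumes plain `{{name}}` placeholders with unknown names left untouched.

```typescript
// Hypothetical stand-in for the rendering step; Rendrix's real API is not shown here.
function renderTemplate(template: string, variables: Record<string, string>): string {
  // Replace each {{name}} placeholder; names without a value are left as-is.
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (placeholder: string, key: string) => {
    const value = variables[key];
    return value !== undefined ? value : placeholder;
  });
}

const storedTemplate = [
  'You are a {{task}} expert.',
  'Classify text into the following categories: {{classes}}.',
  'Provide confidence scores and reasoning.'
].join('\n');

const renderedInstruction = renderTemplate(storedTemplate, {
  task: 'classification',
  classes: 'positive, negative, neutral'
});
console.log(renderedInstruction);
```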
2381
-
2382
- **Benefits**:
2383
- - Same instruction template, different variables
2384
- - Dynamic instruction generation
2385
- - A/B testing with different variable values
2386
- - Centralized template management
2387
-
2388
- #### Mixed Mode (multi-turn chat vs registry keys)
2389
-
2390
- **Registry keys** are resolved on **`gateway.invoke()`** when you pass the key on **`instructions`** / **`prompt`** (see key-mode example above). **`gateway.invoke()`** does **not** consume a client-supplied **`messages`** array for the provider call.
2391
-
2392
- For **multi-turn** transcripts (system/user/assistant/user, arbitrary order), use **`gateway.invokeChat()`**. Pass **`instructions: ''`** when your **`messages`** array already includes the system prompt, so the gateway does not prepend a second system message. Message bodies are sent **literally**—if you need a registry key as the system prompt, resolve it first (e.g. **`InstructionResolver`** / **`getInstructionMetadata`**) and put the rendered text in **`messages`**.
2393
-
2394
- ```typescript
2395
- const aiRequestId = 'mix-chat-1';
2396
- const response = await gateway.invokeChat({
2397
- aiRequestId,
2398
- agentId: 'agent-1',
2399
- instructions: '',
2400
- messages: [
2401
- {
2402
- role: 'system',
2403
- content:
2404
- '…system prompt text (resolve content-registry keys beforehand if needed)…'
2405
- },
2406
- { role: 'user', content: 'You are analyzing product reviews.' },
2407
- { role: 'assistant', content: 'I understand.' },
2408
- { role: 'user', content: 'Classify: "This product is amazing!"' }
2409
- ],
2410
- workingMemory: { task: 'classification' },
2411
- identity: {
2412
- sessionId: 's1',
2413
- instance: { instanceId: 'agent-1', type: 'test' },
2414
- aiRequestId,
2415
- jobId: 'job-123',
2416
- taskId: 'task-1',
2417
- agentId: 'agent-1'
2418
- },
2419
- config: { model: 'gpt-4o', provider: 'openai' }
2420
- });
2421
- ```
2422
-
2423
- **Use cases**:
2424
- - Multi-turn chat while keeping **`workingMemory`** / **`templateRenderOptions`** available on the chat path
2425
- - Single-shot registry-backed flows → prefer **`invoke()`** with **`instructions`** as the key
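One way to apply the "pass an empty `instructions` when `messages` already carries the system prompt" rule is a small helper like the one below. It is only a sketch: `instructionsForChat` and the `ChatMessage` shape are hypothetical names for illustration, not exports of this package.

```typescript
// Hypothetical message shape, limited to the three roles mentioned above.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Returns '' when the transcript already has a system message, so that
// no second system prompt would be prepended; otherwise returns the prompt.
function instructionsForChat(messages: ChatMessage[], systemPrompt: string): string {
  return messages.some((message) => message.role === 'system') ? '' : systemPrompt;
}

const transcript: ChatMessage[] = [
  { role: 'system', content: 'You are analyzing product reviews.' },
  { role: 'user', content: 'Classify: "This product is amazing!"' }
];
console.log(JSON.stringify(instructionsForChat(transcript, 'Fallback system prompt')));
```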
2426
-
2427
- #### Content Registry Accessor Methods
2428
-
2429
- The gateway exposes content-registry methods for pre-validation, batch operations, and debugging without making LLM calls:
2430
-
2431
- ```typescript
2432
- const gateway = new AIGateway({
2433
- enableContentRegistry: true,
2434
- contentRegistryConfig: { github: { ... } }
2435
- });
2436
-
2437
- // Get the underlying content-registry instance
2438
- const registry = gateway.getContentRegistry();
2439
- if (registry) {
2440
- // Use registry methods directly
2441
- const content = await registry.getByPath({
2442
- pathKey: 'classification/basic',
2443
- category: 'instruction'
2444
- });
2445
- }
2446
-
2447
- // Resolve an instruction key with variables (without making LLM call)
2448
- const resolved = await gateway.resolveInstructionKey('classification/basic', {
2449
- task: 'classification',
2450
- classes: 'positive, negative, neutral'
2451
- });
2452
- console.log(resolved); // "You are a classification expert. Classify text into: positive, negative, neutral."
2453
-
2454
- // Check if an instruction key exists
2455
- const exists = await gateway.hasInstructionKey('classification/basic');
2456
- if (exists) {
2457
- // Key exists, safe to use
2458
- }
2459
-
2460
- // Get instruction metadata (structured metadata for metadata-driven systems)
2461
- const metadata = await gateway.getInstructionMetadata('classification/basic');
2462
- if (metadata) {
2463
- console.log('Output type:', metadata.outputType); // 'classification'
2464
- console.log('Required variables:', metadata.requiredVariables); // ['task', 'classes']
2465
- console.log('Schema:', metadata.outputSchema); // JSONSchema7
2466
- console.log('Validation rules:', metadata.validationRules);
2467
- console.log('Output mapping:', metadata.defaultOutputMapping);
2468
- console.log('Parse options:', metadata.parseOptions);
2469
- console.log('Version:', metadata.version);
2470
- }
2471
- ```
2472
-
2473
- **Benefits:**
2474
- - **Pre-validation**: Check if keys exist before making requests
2475
- - **Batch operations**: Resolve multiple keys in parallel
2476
- - **Debugging**: Inspect resolved instructions without LLM calls
2477
- - **Consistency**: Use same content-registry instance as gateway
2478
-
2479
- **Note**: If `contentRegistryConfig` is not set, these methods return an empty value (`undefined` from `getContentRegistry()`, `null` from `getInstructionMetadata()`) or throw an error.
2480
-
2481
- #### Instruction Metadata API (v1.6.9+)
2482
-
2483
- The `getInstructionMetadata()` method returns structured metadata from content-registry, enabling fully metadata-driven inference systems. This is the primary API for accessing instruction configuration without making LLM calls.
2484
-
2485
- **Interface:**
2486
-
2487
- ```typescript
2488
- interface InstructionMetadata {
2489
- instructionKey: string; // The instruction key (e.g., 'classification/basic')
2490
- outputType?: string; // Inference type for parsing (e.g., 'classification')
2491
- outputSchema?: Record<string, unknown>; // JSONSchema7 for validation
2492
- requiredVariables?: string[]; // Required template variables
2493
- optionalVariables?: string[]; // Optional template variables
2494
- validationRules?: ValidationRule[]; // Validation rules
2495
- defaultOutputMapping?: Record<string, string>; // Output field mapping
2496
- parseOptions?: { // Default parse options
2497
- question?: string;
2498
- classes?: string[];
2499
- [key: string]: unknown;
2500
- };
2501
- version?: string; // Instruction version
2502
- [key: string]: unknown; // Additional metadata preserved
2503
- }
2504
- ```
2505
-
2506
- **Use Cases:**
2507
-
2508
- 1. **Metadata-Driven Inference**: Fetch metadata to configure inference without hardcoding
2509
- 2. **Variable Validation**: Check required/optional variables before making requests
2510
- 3. **Schema Validation**: Get output schema for validation
2511
- 4. **Output Mapping**: Apply default field mappings from metadata
2512
- 5. **Parse Configuration**: Get default parse options for outputs library
2513
-
2514
- **Example: Metadata-Driven System**
2515
-
2516
- ```typescript
2517
- const gateway = new AIGateway({
2518
- enableContentRegistry: true,
2519
- contentRegistryConfig: { github: { ... } }
2520
- });
2521
-
2522
- // Fetch instruction metadata
2523
- const metadata = await gateway.getInstructionMetadata('classification/basic');
2524
-
2525
- if (metadata) {
2526
- // Validate required variables
2527
- const requiredVars = metadata.requiredVariables || [];
2528
- const providedVars = Object.keys(variables); // `variables`: your template variables object, declared elsewhere in your app
2529
- const missingVars = requiredVars.filter(v => !providedVars.includes(v));
2530
-
2531
- if (missingVars.length > 0) {
2532
- throw new Error(`Missing required variables: ${missingVars.join(', ')}`);
2533
- }
2534
-
2535
- // Use metadata to configure inference (parsing options live on metadata for your app, not on the request)
2536
- const aiRequestId = `md-${Date.now()}`;
2537
- const response = await gateway.invoke({
2538
- aiRequestId,
2539
- agentId: 'agent-1',
2540
- actionType: 'skill',
2541
- actionRef: metadata.instructionKey || 'skills/classification',
2542
- instructions: 'classification/basic',
2543
- prompt: '{{input}}',
2544
- workingMemory: { input: 'This is great!', ...variables },
2545
- inferenceType: metadata.outputType,
2546
- identity: {
2547
- sessionId: 's1',
2548
- instance: { instanceId: 'agent-1', type: 'test' },
2549
- aiRequestId,
2550
- jobId: 'job-1',
2551
- taskId: 'task-1',
2552
- agentId: 'agent-1'
2553
- },
2554
- config: { model: 'gpt-4o', provider: 'openai' }
2555
- });
2556
-
2557
- // Apply output mapping if provided
2558
- let output = response.metadata.parsedOutput;
2559
- if (metadata.defaultOutputMapping && output) {
2560
- const mapped: Record<string, any> = {};
2561
- for (const [key, mappedKey] of Object.entries(metadata.defaultOutputMapping)) {
2562
- if (key in output) {
2563
- mapped[mappedKey] = (output as any)[key];
2564
- }
2565
- }
2566
- output = { ...output, ...mapped };
2567
- }
2568
- }
2569
- ```
2570
-
2571
- **Metadata Storage in Content-Registry:**
2572
-
2573
- Store metadata in your content-registry files:
2574
-
2575
- ```json
2576
- {
2577
- "messages": [
2578
- {
2579
- "role": "system",
2580
- "content": "You are a {{task}} expert. Classify text into: {{classes}}."
2581
- }
2582
- ],
2583
- "metadata": {
2584
- "outputType": "classification",
2585
- "outputSchema": {
2586
- "type": "object",
2587
- "properties": {
2588
- "class": { "type": "string" },
2589
- "confidence": { "type": "number" }
2590
- }
2591
- },
2592
- "requiredVariables": ["task", "classes"],
2593
- "optionalVariables": ["context"],
2594
- "validationRules": [
2595
- {
2596
- "name": "confidence-threshold",
2597
- "type": "range",
2598
- "config": { "min": 0, "max": 1 }
2599
- }
2600
- ],
2601
- "defaultOutputMapping": {
2602
- "class": "label",
2603
- "confidence": "score"
2604
- },
2605
- "parseOptions": {
2606
- "classes": ["positive", "negative", "neutral"]
2607
- },
2608
- "version": "v1.0.0"
2609
- }
2610
- }
2611
- ```
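To make the `defaultOutputMapping` entry above concrete, here is a minimal sketch of applying it to a parsed result. The helper name `applyOutputMapping` is hypothetical; the merge behavior (mapped fields added alongside the original ones) mirrors the metadata-driven example earlier in this section and may differ from what your application needs.

```typescript
// Hypothetical helper: copies each mapped source field to its target name.
function applyOutputMapping(
  output: Record<string, unknown>,
  mapping: Record<string, string>
): Record<string, unknown> {
  const mapped: Record<string, unknown> = {};
  for (const sourceKey of Object.keys(mapping)) {
    const targetKey = mapping[sourceKey];
    if (targetKey !== undefined && sourceKey in output) {
      mapped[targetKey] = output[sourceKey];
    }
  }
  // Original fields are kept; mapped fields are added next to them.
  return { ...output, ...mapped };
}

const mappedOutput = applyOutputMapping(
  { class: 'positive', confidence: 0.95 },
  { class: 'label', confidence: 'score' }
);
console.log(mappedOutput);
```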
2612
-
2613
- **Return Value:**
2614
-
2615
- - Returns `InstructionMetadata | null` (not `undefined`)
2616
- - Returns `null` if content-registry is not enabled
2617
- - Returns `null` if instruction key not found
2618
- - Preserves all additional metadata fields from content-registry
2619
-
2620
- #### Complete Example: Using Content Registry
2621
-
2622
- **Step 1: Store content in content-registry**
2623
-
2624
- ```typescript
2625
- import { ContentRegistry } from '@xronoces/content-registry';
2626
-
2627
- const registry = new ContentRegistry({
2628
- s3Bucket: 'my-content-bucket',
2629
- // ... config
2630
- });
2631
-
2632
- // Store instruction content
2633
- await registry.createPrompt({
2634
- id: 'instructions/classification/basic',
2635
- name: 'Basic Classification',
2636
- version: 'v1.0.0',
2637
- messages: [
2638
- {
2639
- role: 'system',
2640
- content: 'You are a {{task}} expert. Classify text into: {{classes}}.'
2641
- }
2642
- ]
2643
- });
2644
- ```
2645
-
2646
- **Step 2: Use key mode in gateway**
2647
-
2648
- ```typescript
2649
- const gateway = new AIGateway({
2650
- enableContentRegistry: true,
2651
- contentRegistryConfig: {
2652
- s3Bucket: 'my-content-bucket',
2653
- // ... same config as registry
2654
- }
2655
- });
2656
-
2657
- // Use instruction key on invoke() (resolved via content-registry)
2658
- const aiRequestId = 'registry-step2-1';
2659
- const response = await gateway.invoke({
2660
- aiRequestId,
2661
- agentId: 'agent-1',
2662
- actionType: 'skill',
2663
- actionRef: 'skills/classification',
2664
- instructions: 'classification/basic',
2665
- prompt: '{{userLine}}',
2666
- workingMemory: {
2667
- userLine: 'Classify: "This is great!"',
2668
- task: 'classification',
2669
- classes: 'positive, negative, neutral'
2670
- },
2671
- identity: {
2672
- sessionId: 's1',
2673
- instance: { instanceId: 'agent-1', type: 'test' },
2674
- aiRequestId,
2675
- jobId: 'job-123',
2676
- taskId: 'task-1',
2677
- agentId: 'agent-1'
2678
- },
2679
- config: { model: 'gpt-4o', provider: 'openai' }
2680
- });
2681
- ```
2682
-
2683
- **Step 3: Update content without code changes**
2684
-
2685
- ```typescript
2686
- // Update instruction in content-registry (no code deployment needed!)
2687
- await registry.createPrompt({
2688
- id: 'instructions/classification/basic',
2689
- version: 'v1.1.0', // New version
2690
- messages: [
2691
- {
2692
- role: 'system',
2693
- content: 'You are an advanced {{task}} expert. Classify text with high accuracy into: {{classes}}.'
2694
- }
2695
- ]
2696
- });
2697
-
2698
- // Gateway automatically uses latest version
2699
- // No code changes needed!
2700
- ```
2701
-
2702
- #### Benefits
2703
-
2704
- - **Centralized Management**: All instructions stored in one place (S3 + MongoDB via content-registry)
2705
- - **Versioning**: Track changes and rollback if needed (content-registry supports versioning)
2706
- - **Zero-Deploy Updates**: Update instructions without code changes
2707
- - **A/B Testing**: Test different instruction versions
2708
- - **Template Support**: Use Handlebars-style `{{variable}}` templates with variables, rendered by Rendrix
2709
- - **Caching**: Instructions are cached for performance (Redis via content-registry)
2710
- - **Migration**: Prefer **`invoke()`** + **`instructions`** keys for registry-backed single-shot flows; multi-turn **`messages`** arrays belong on **`invokeChat()`** (see Mixed Mode). Request payloads require **`identity`**, **`aiRequestId`**, and (for **`invoke()`**) **`actionType`** / **`actionRef`** — top-level **`input`** is rejected.
2711
- - **Automatic Detection**: Gateway automatically detects keys vs text - no manual mode switching
2712
-
2713
- #### Error Handling
2714
-
2715
- If a key cannot be resolved, `invoke()` fails closed during message build; it does not fall back to sending the key as raw text:
2716
-
2717
- ```typescript
2718
- // If key 'invalid/key' doesn't exist in registry, invoke() fails closed during message build
2719
- // (bad-request logging when activity tracking is enabled)—there is no silent fallback to raw text.
2720
- const aiRequestId = 'registry-err-1';
2721
- try {
2722
- await gateway.invoke({
2723
- aiRequestId,
2724
- agentId: 'agent-1',
2725
- actionType: 'skill',
2726
- actionRef: 'skills/invalid',
2727
- instructions: 'invalid/key',
2728
- prompt: '{{input}}',
2729
- workingMemory: { input: 'hello' },
2730
- identity: {
2731
- sessionId: 's1',
2732
- instance: { instanceId: 'agent-1', type: 'test' },
2733
- aiRequestId,
2734
- jobId: 'job-123',
2735
- taskId: 'task-1',
2736
- agentId: 'agent-1'
2737
- },
2738
- config: { model: 'gpt-4o', provider: 'openai' }
2739
- });
2740
- } catch (e) {
2741
- // Expect resolution / instruction errors; see TROUBLESHOOTING.md
2742
- }
2743
- ```
2744
-
2745
- #### Advanced: Custom Instruction Resolver
2746
-
2747
- You can also use the `InstructionResolver` directly:
2748
-
2749
- ```typescript
2750
- import { InstructionResolver } from '@x12i/ai-gateway';
2751
-
2752
- const resolver = new InstructionResolver({
2753
- enableContentRegistry: true,
2754
- contentRegistryConfig: {
2755
- s3Bucket: 'my-bucket',
2756
- // ... config
2757
- }
2758
- });
2759
-
2760
- // Resolve a key
2761
- const instructions = await resolver.resolveInstructions(
2762
- 'classification/basic',
2763
- { task: 'classification' }
2764
- );
2765
-
2766
- console.log(instructions); // Resolved instruction text
2767
- ```
2768
-
2769
- ### 9. Contract hints (`StructuredTextSpec`) vs gateway parsing
2770
-
2771
- **`ChatRequest`** still includes an optional **`expectedSchema?: StructuredTextSpec`** field for typing and forward compatibility:
2772
-
2773
- ```typescript
2774
- interface StructuredTextSpec {
2775
- description: string;
2776
- formatHint?: string;
2777
- }
2778
- ```
2779
-
2780
- **Current gateway behavior:** dedicated **contract-output** processing (persisting **`content.contractOutput`** / **`content.outputStatus`** on activities from **`expectedSchema`**) is **not** implemented on **`invoke()`** / **`invokeChat()`** — see comment in `gateway.ts` (“Contract output processing removed”). Passing **`expectedSchema`** does not, by itself, trigger extra persistence or validation inside this package.
2781
-
2782
- **Recommended patterns:**
2783
-
2784
- - Rely on **flex-md extraction** into **`response.parsedContent`** / **`response.content`** and validate in your app (e.g. **`@x12i/outputs-library`**, JSON Schema, or custom checks).
2785
- - Use **response interceptors** if you need centralized post-processing or compliance logging.
2786
- - Encode output shape in **instructions** / instruction metadata (**`outputSchema`**) so the model sees the contract; enforce it application-side.
2787
-
2788
- ## API Reference
2789
-
2790
- ### AIGateway
2791
-
2792
- Enhanced gateway class with production-ready features.
2793
-
2794
- #### Constructor
2795
-
2796
- ```typescript
2797
- const gateway = new AIGateway(config?: GatewayConfig);
2798
- ```
2799
-
2800
- **GatewayConfig Options:**
2801
-
2802
- ```typescript
2803
- interface GatewayConfig extends RouterConfig {
2804
- usageTier?: UsageTier; // Default: 'tier-3'
2805
- enableActivityTracking?: boolean; // Default: true
2806
- enableUsageTracking?: boolean; // Default: true
2807
- enableLogging?: boolean; // Default: true
2808
- contentRegistryConfig?: any; // Content registry config for instruction key resolution
2809
- logger?: import('@x12i/logxer').Logxer; // Custom Logxer (from createLogxer)
2810
- activityTracker?: Activix; // Custom Activix v7 instance (match collection names: ai-actions, bad-requests, skill-executions)
2811
- packageName?: string; // Default: 'AI_GATEWAY'
2812
- // ... all RouterConfig options
2813
- }
2814
- ```
2815
-
2816
- #### Methods
2817
-
2818
- - `register(provider: LLMProviderInterface): void` - Register a provider
2819
- - `unregister(providerName: LLMProvider): void` - Unregister a provider
2820
- - `getProvider(providerName: LLMProvider)` - Get a provider instance
2821
- - `listProviders(): LLMProvider[]` - List all registered providers
2822
- - `invoke(request: AIInvokeRequest): Promise<EnhancedLLMResponse>` - Structured / skill path; requires **`actionType`**, **`actionRef`**, **`identity`**, **`aiRequestId`**, **`instructions`**, etc. (`AIRequest` is a type alias)
2823
- - `invokeChat(request: ChatRequest): Promise<EnhancedLLMResponse>` - Chat / conversational path (no **`actionType`** / **`actionRef`**)
2824
- - `setDefaultProvider(provider: LLMProvider): void` - Set the default provider
2825
- - `setFallbackChain(chain: LLMProvider[]): void` - Configure fallback chain
2826
- - `addRequestInterceptor(interceptor: RequestInterceptor): void` - Add request interceptor
2827
- - `addResponseInterceptor(interceptor: ResponseInterceptor): void` - Add response interceptor
2828
- - `checkHealth(provider: LLMProvider): Promise<HealthCheckResult>` - Check provider health
2829
- - `checkAllHealth(): Promise<Map<LLMProvider, HealthCheckResult>>` - Check all providers' health
2830
- - `getRouter(): LLMProviderRouter` - Get underlying router instance
2831
- - `getLogger(): Logxer` - Get logger instance (`@x12i/logxer`)
2832
- - `setActivityManager(activityManager)` - Advanced/test hook to replace the internal activity manager; typical apps inject **`Activix` via `activityTracker`** in the constructor instead
2833
- - `getContentRegistry(): ContentRegistry | undefined` - Get underlying content-registry instance (returns undefined if not enabled)
2834
- - `resolveInstructionKey(key: string, variables?: Record<string, unknown>): Promise<string>` - Resolve instruction key with variables (without making LLM call)
2835
- - `hasInstructionKey(key: string): Promise<boolean>` - Check if instruction key exists in content-registry
2836
- - `getInstructionMetadata(key: string): Promise<InstructionMetadata | null>` - Get structured instruction metadata from content-registry (v1.6.9+)
2837
- - `generateTaskTypeId(instructions: string): Promise<string>` - Generate taskTypeId from instructions (instance method)
2838
- - `static generateTaskTypeId(instructions: string, options?: { contentRegistryConfig?, agentId? }): Promise<string>` - Generate taskTypeId from instructions (static method)
2839
- - `optimizeInstructions(originalInstructions: string, options?: OptimizeInstructionsOptions): Promise<InstructionOptimizationResult>` - Optimize AI instructions using AI (v3.0.4+)
2840
- - `testInstructions(instructions: string, testInput: string, expectedSchema?: Record<string, unknown>, options?: TestInstructionsOptions): Promise<TestInstructionsResult>` - Test instructions by running them and analyzing responses (v3.0.4+)
2841
- - `generateRequestReport(request: AIRequest): Promise<RequestReport>` - Generate comprehensive request report with validation, examples, and structured text information (v3.0.6+)
2842
-
2843
- ### ChatRequest and AIRequest / AIInvokeRequest
2844
-
2845
- **Naming**
2846
-
2847
- - **`ChatRequest`** — pass to **`gateway.invokeChat()`**. Conversational path; optional **`expectedSchema`** (`StructuredTextSpec`) is a typed contract hint only and does not, by itself, trigger validation inside the gateway.
2848
- - **`AIInvokeRequest`** — pass to **`gateway.invoke()`**. Canonical structured/skill path.
2849
- - **`AIRequest`** — **type alias** for **`AIInvokeRequest`** (keeps existing imports working).
2850
-
2851
- **Mandatory on `invoke()` (`AIInvokeRequest`)**
2852
-
2853
- | Field | Meaning |
2854
- |-------|--------|
2855
- | **`actionType`** | **`'skill'`** \| **`'preSkill'`** \| **`'postSkill'`** — classifies the invocation for tracing and Activix. |
2856
- | **`actionRef`** | Non-empty string reference for the action (e.g. **`skills/my-skill`**). Recorded with the activity. |
2857
-
2858
- These are validated by **`validateAIRequest`**, merged into **`request.identity`** ( **`runContext`** for Activix ), and duplicated on activity root fields **`actionType`** / **`actionRef`** when activity tracking runs.
2859
-
2860
- **Mandatory on both paths**
2861
-
2862
- - **`aiRequestId`**, **`agentId`**, **`instructions`**, **`identity`** (upstream **`identity.jobId`** and **`identity.taskId`** — see [Mandatory runtime identity](#mandatory-runtime-identity-v9)).
2863
- - Top-level **`input`** is **not** supported; use **`workingMemory`** (e.g. **`workingMemory.input`**) for template data.
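A caller-side pre-flight check for the mandatory fields listed above might look like the sketch below. It is not the package's `validateAIRequest`; `listInvokeProblems` is a hypothetical helper, and whether the gateway rejects or only warns for each problem is not covered here.

```typescript
const ACTION_TYPES = ['skill', 'preSkill', 'postSkill'];

// Hypothetical helper: lists problems with an invoke() payload before sending it.
function listInvokeProblems(request: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of ['aiRequestId', 'agentId', 'instructions']) {
    if (typeof request[field] !== 'string') problems.push(`missing ${field}`);
  }
  const identity = request['identity'];
  if (typeof identity !== 'object' || identity === null) {
    problems.push('missing identity');
  } else {
    const envelope = identity as Record<string, unknown>;
    for (const field of ['jobId', 'taskId']) {
      const value = envelope[field];
      if (typeof value !== 'string' || value === '') problems.push(`missing identity.${field}`);
    }
  }
  if (ACTION_TYPES.indexOf(String(request['actionType'])) === -1) problems.push('invalid actionType');
  const actionRef = request['actionRef'];
  if (typeof actionRef !== 'string' || actionRef === '') problems.push('missing actionRef');
  if ('input' in request) problems.push('top-level input is not supported; use workingMemory');
  return problems;
}

console.log(listInvokeProblems({ aiRequestId: 'req-1', agentId: 'agent-1', input: 'hello' }));
```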
2864
-
2865
- **Removed from the shared request model (`BaseLLMRequest`)**
2866
-
2867
- Not accepted on gateway requests anymore:
2868
-
2869
- - **`taskConfig`**, **`templateTokens`**, **`validateOutputSchema`**, **`strictValidation`**, **`transformations`**, **`parseOptions`** (request-level).
2870
-
2871
- Use **`InstructionMetadata.parseOptions`** only as **instruction-catalog metadata**, not as invoke payload fields.
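When migrating older callers, it can help to scan payloads for the removed fields before they reach the gateway. The sketch below assumes nothing about the gateway's own handling; `findRemovedFields` is a hypothetical helper built only from the field list above.

```typescript
// Field names taken from the "removed from the shared request model" list.
const REMOVED_REQUEST_FIELDS = [
  'taskConfig',
  'templateTokens',
  'validateOutputSchema',
  'strictValidation',
  'transformations',
  'parseOptions'
] as const;

// Hypothetical helper: returns the removed fields still present on a payload.
function findRemovedFields(request: Record<string, unknown>): string[] {
  return REMOVED_REQUEST_FIELDS.filter((field) => field in request);
}

const legacyPayload = { instructions: 'classification/basic', parseOptions: {}, taskConfig: {} };
console.log(findRemovedFields(legacyPayload)); // [ 'taskConfig', 'parseOptions' ]
```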
2872
-
2873
- **Exports**
2874
-
2875
- ```typescript
2876
- import type {
2877
- ChatRequest,
2878
- AIInvokeRequest,
2879
- AIRequest, // alias of AIInvokeRequest
2880
- GatewayActionType, // 'skill' | 'preSkill' | 'postSkill'
2881
- ActivityIdentity,
2882
- EnhancedLLMResponse
2883
- } from '@x12i/ai-gateway';
2884
- ```
2885
-
2886
- **Structured output helpers**
2887
-
2888
- Structured flows often use **`primaryObjectType`** / **`objectTypes`** from the router **`LLMRequest`** intersection (type guards in message building). They are **not** duplicated here verbatim — see **`src/types.ts`** and **`LLMRequest`** from **`@x12i/ai-providers-router`**.
2889
-
2890
- **Key differences**
2891
-
2892
- - **`invokeChat(request)`** — `ChatRequest`; does **not** require **`actionType`** / **`actionRef`**.
2893
- - **`invoke(request)`** — `AIInvokeRequest`; **requires** **`actionType`** and **`actionRef`**.
2894
-
2895
- #### StructuredTextSpec (chat / contract hints)
2896
-
2897
- Used by **`ChatRequest.expectedSchema`** when you attach optional contract structured-text hints:
2898
-
2899
- ```typescript
2900
- interface StructuredTextSpec {
2901
- description: string;
2902
- formatHint?: string;
2903
- }
2904
- ```
2905
-
2906
- **Output Modes (v3.0.5+):**
2907
-
2908
- The gateway supports three output modes, each with two instruction formats:
2909
-
2910
- 1. **JSON Output Mode** (`outputMode: 'json'` - default):
2911
- - Expects JSON response from LLM
2912
- - Parses response as JSON object
2913
- - Works with both instruction formats
2914
-
2915
- 2. **Structured Text Output Mode** (`outputMode: 'structured-text'`):
2916
- - Expects structured text (not JSON) - can be stories, narratives, free-form text
2917
- - Does NOT parse as JSON
2918
- - Content remains as string
2919
- - Works with both instruction formats
2920
- - **Auto-Extraction**: When `outputMode: 'structured-text'` is set and no explicit format (`flexMdFormat` or `primaryObjectType`) is provided, the gateway automatically extracts and validates the output format specification from the instruction template's "OUTPUT FORMAT" section (v3.3.3+)
2921
-
2922
- 3. **Two-Step Conversion Mode** (`outputMode: 'two-step'`):
2923
- - Step 1: Get structured text response
2924
- - Step 2: Automatically convert structured text to JSON using internal conversion instructions
2925
- - Returns JSON response (from step 2)
2926
-
2927
- **Instruction Formats:**
2928
-
2929
- 1. **JSON Schema Format** (`instructionFormat: 'json-schema'` - default):
2930
- - Instructions include JSON schema embedded
2931
- - Current behavior - schema is injected into instructions
2932
-
2933
- 2. **Structured Text Spec Format** (`instructionFormat: 'structured-text-spec'`):
2934
- - Instructions include free-form description of desired output structure
2935
- - Example: "a story with character, setting, and conflict"
2936
- - Requires `structuredTextSpec` field
2937
-
2938
- **Minimal AIInvokeRequest example (custom `primaryObjectType`):**
2939
- ```typescript
2940
- const aiRequestId = 'req-1';
2941
- const response = await gateway.invoke({
2942
- aiRequestId,
2943
- agentId: 'agent-1',
2944
- actionType: 'skill',
2945
- actionRef: 'skills/professional-answer',
2946
- instructions: 'professional-answer/general', // Content registry key
2947
- prompt: '{{input}}',
2948
- workingMemory: { input: 'What is the capital of France?' },
2949
- identity: {
2950
- sessionId: 'run-1',
2951
- instance: { instanceId: 'agent-1', type: 'ai-reasoner' },
2952
- aiRequestId,
2953
- jobId: 'job-123',
2954
- taskId: 'task-1',
2955
- agentId: 'agent-1'
2956
- },
2957
- primaryObjectType: {
2958
- type: 'professional-answer',
2959
- whenToUse: 'For professional Q&A responses'
2960
- },
2961
- modelConfig: {
2962
- model: 'gpt-4o',
2963
- provider: 'openai'
2964
- }
2965
- });
2966
- ```
2967
-
2968
- **Using modelConfig for model selection:**
2969
- ```typescript
2970
- const aiRequestId = 'req-2';
2971
- const response = await gateway.invoke({
2972
- aiRequestId,
2973
- agentId: 'agent-1',
2974
- actionType: 'skill',
2975
- actionRef: 'skills/sentiment',
2976
- instructions: 'Analyze the data',
2977
- prompt: '{{input}}',
2978
- workingMemory: { input: 'sample' },
2979
- identity: {
2980
- sessionId: 'run-1',
2981
- instance: { instanceId: 'agent-1', type: 'ai-reasoner' },
2982
- aiRequestId,
2983
- jobId: 'job-123',
2984
- taskId: 'task-1',
2985
- agentId: 'agent-1'
2986
- },
2987
- primaryObjectType: 'sentiment-analysis',
2988
- modelConfig: {
2989
- model: 'gpt-4-turbo',
2990
- provider: 'openai',
2991
- temperature: 0.7,
2992
- maxTokens: 2000,
2993
- topP: 0.9
2994
- }
2995
- });
2996
- ```
2997
-
2998
- **modelConfig overrides `config`:**
2999
- ```typescript
3000
- const aiRequestId = 'req-3';
3001
- const response = await gateway.invoke({
3002
- aiRequestId,
3003
- agentId: 'agent-1',
3004
- actionType: 'skill',
3005
- actionRef: 'skills/classification',
3006
- instructions: 'Process request',
3007
- prompt: '{{input}}',
3008
- workingMemory: { input: 'sample' },
3009
- identity: {
3010
- sessionId: 'run-1',
3011
- instance: { instanceId: 'agent-1', type: 'ai-reasoner' },
3012
- aiRequestId,
3013
- jobId: 'job-123',
3014
- taskId: 'task-1',
3015
- agentId: 'agent-1'
3016
- },
3017
- primaryObjectType: 'classification',
3018
- config: {
3019
- model: 'gpt-3.5-turbo',
3020
- temperature: 0.5
3021
- },
3022
- modelConfig: {
3023
- model: 'gpt-4-turbo',
3024
- temperature: 0.9
3025
- }
3026
- });
3027
- ```
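The precedence in the example above can be pictured as a merge in which `modelConfig` wins. This sketch assumes a field-by-field (shallow) merge; the README only states that `modelConfig` overrides `config`, so the exact merge behavior, the `ModelSettings` type, and the helper name are assumptions for illustration.

```typescript
// Hypothetical settings shape using the fields shown in the examples above.
interface ModelSettings {
  model?: string;
  provider?: string;
  temperature?: number;
  maxTokens?: number;
  topP?: number;
}

// Assumed behavior: fields present on modelConfig replace those on config.
function effectiveModelSettings(config: ModelSettings, modelConfig: ModelSettings): ModelSettings {
  return { ...config, ...modelConfig };
}

const effectiveSettings = effectiveModelSettings(
  { model: 'gpt-3.5-turbo', temperature: 0.5 },
  { model: 'gpt-4-turbo', temperature: 0.9 }
);
console.log(effectiveSettings); // { model: 'gpt-4-turbo', temperature: 0.9 }
```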
3028
-
3029
- **Standard object type example (v3.0.6+):**
3030
- ```typescript
3031
- const aiRequestId = 'req-4';
3032
- const response = await gateway.invoke({
3033
- aiRequestId,
3034
- agentId: 'agent-1',
3035
- actionType: 'skill',
3036
- actionRef: 'skills/sentiment-analysis',
3037
- instructions: 'Analyze the sentiment of this text',
3038
- prompt: '{{input}}',
3039
- workingMemory: { input: 'I love this product!' },
3040
- identity: {
3041
- sessionId: 'run-1',
3042
- instance: { instanceId: 'agent-1', type: 'ai-reasoner' },
3043
- aiRequestId,
3044
- jobId: 'job-123',
3045
- taskId: 'task-1',
3046
- agentId: 'agent-1'
3047
- },
3048
- primaryObjectType: 'sentiment-analysis',
3049
- config: {
3050
- model: 'gpt-5-nano',
3051
- provider: 'openai'
3052
- }
3053
- });
3054
-
3055
- // Standard types include:
3056
- // - Pre-defined schemas
3057
- // - Few-shot examples (automatically used for better results)
3058
- // - Validation instructions
3059
- // - Structured text formatting guidelines
3060
- ```
3061
-
3062
- **Available Standard Types:**
3063
- - `sentiment-analysis` - Sentiment analysis with confidence and reasoning
3064
- - `classification` - General text classification
3065
- - `extraction` - Structured data extraction
3066
- - `question-answer` - Question answering
3067
- - `email-extraction` - Email address extraction
3068
- - `person-extraction` - Person information extraction
3069
- - `location-extraction` - Location name extraction
3070
- - `weather-report` - Weather information structure
3071
- - And 10+ more types (see `@x12i/outputs-library` documentation)
3072
-
3073
- **Note**: Standard types require content-registry to be configured (`contentRegistryConfig`). Custom types work without content-registry.
3074
-
3075
- ### Output Format Validation
3076
-
3077
- **Output Format Validation:**
3078
-
3079
- The gateway validates output format specifications from instruction templates before sending the request to the LLM. Instructions should include an "OUTPUT FORMAT" section that describes the expected output structure.
3080
-
3081
- **Important: Communication Flow with LLM**
3082
-
3083
- The gateway uses **flex-md (Markdown-based format)** for all LLM communication, while returning proper JavaScript objects to your code:
3084
-
3085
- 1. **To LLM**: Instructions should include output format specifications - the gateway validates these but does not inject them
3086
- 2. **From LLM**: LLM responds with flex-md (Markdown) containing structured data in fenced blocks
3087
- 3. **Gateway Processing**: Gateway automatically extracts structured data from flex-md (Markdown) fenced blocks using flex-md SDK
3088
- 4. **To Your Code**: Gateway returns parsed JavaScript objects in `response.parsedContent` - you always get clean objects, never raw Markdown
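Steps 2 to 4 of the flow above can be illustrated with a simplified extractor. This is not the flex-md SDK: `extractFencedJson` is a hypothetical helper that handles only the first fenced block and assumes its body is JSON, whereas the gateway is described as using several fallback methods.

```typescript
// Build the fence marker programmatically to keep this example easy to embed.
const FENCE = '`'.repeat(3);

// Hypothetical helper: parses the body of the first fenced block as JSON.
function extractFencedJson(markdown: string): unknown {
  const start = markdown.indexOf(FENCE);
  if (start === -1) return undefined;
  const bodyStart = markdown.indexOf('\n', start);
  if (bodyStart === -1) return undefined;
  const end = markdown.indexOf(FENCE, bodyStart);
  if (end === -1) return undefined;
  try {
    return JSON.parse(markdown.slice(bodyStart + 1, end));
  } catch {
    return undefined; // Body was not valid JSON.
  }
}

const modelReply = [
  'Here is the result:',
  FENCE + 'markdown',
  '{"sentiment": "positive", "confidence": 0.95}',
  FENCE
].join('\n');

const parsedReply = extractFencedJson(modelReply) as { sentiment: string; confidence: number };
console.log(parsedReply.sentiment, parsedReply.confidence);
```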
3089
-
3090
- **Why flex-md (Markdown) Instead of Direct JSON?**
3091
-
3092
- - **flex-md is Markdown-based**: We use flex-md format, which is built on Markdown, for all LLM communication
3093
- - **Better LLM Compliance**: LLMs are more reliable at following Markdown format instructions than raw JSON
3094
- - **Flexibility**: Allows LLMs to structure responses in Markdown while embedding structured data
3095
- - **Robust Extraction**: Multiple fallback methods ensure structured data is extracted from Markdown even if format varies slightly
3096
- - **Consistent API**: Your code always receives clean JavaScript objects, regardless of how the LLM formatted its Markdown response
3097
-
3098
- **Output Format Validation Process:**
3099
-
3100
- 1. Gateway extracts output format from instruction templates (if present)
3101
- 2. Validates format specification using flex-md SDK
3102
- 3. Checks compliance level against minimum required level (from `FLEX_MD_MIN_COMPLIANCE_LEVEL` environment variable)
3103
- 4. If validation fails and minimum level > L0, request is rejected with error
3104
- 5. If validation fails and minimum level is L0, request continues with warning
3105
- 6. Output format information is included in response metadata and activity tracking
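The reject-or-warn rule in steps 4 and 5 can be read as a small decision gate. The following is a minimal sketch under stated assumptions, not the gateway's implementation: `ComplianceLevel`, `Decision`, `parseMinLevel`, and `decideOnValidation` are hypothetical names, and the validity check itself (done by the flex-md SDK) is reduced to a boolean. Only the `FLEX_MD_MIN_COMPLIANCE_LEVEL` variable, the L0 to L3 levels, and the L0 default come from this README.

```typescript
// Hypothetical sketch of the reject-or-warn decision described above.
type ComplianceLevel = 'L0' | 'L1' | 'L2' | 'L3';

type Decision =
  | { action: 'proceed' }
  | { action: 'warn'; reason: string }
  | { action: 'reject'; reason: string };

const LEVELS: ComplianceLevel[] = ['L0', 'L1', 'L2', 'L3'];

// Parse the environment value, falling back to L0 (the documented default).
function parseMinLevel(raw: string | undefined): ComplianceLevel {
  return LEVELS.includes(raw as ComplianceLevel) ? (raw as ComplianceLevel) : 'L0';
}

// formatIsValid stands in for the SDK's validation result (assumption).
function decideOnValidation(formatIsValid: boolean, minLevel: ComplianceLevel): Decision {
  if (formatIsValid) return { action: 'proceed' };
  const reason = 'output format specification is missing or invalid';
  // At L0 a failed validation only warns; at L1 and above it rejects.
  return minLevel === 'L0' ? { action: 'warn', reason } : { action: 'reject', reason };
}
```

In a Node.js process this could be called as `decideOnValidation(isValid, parseMinLevel(process.env.FLEX_MD_MIN_COMPLIANCE_LEVEL))`.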
3106
-
3107
- **Example:**
3108
-
3109
- ```typescript
3110
- const aiRequestId = 'flex-md-ex-1';
3111
- const response = await gateway.invoke({
3112
- aiRequestId,
3113
- agentId: 'agent-1',
3114
- actionType: 'skill',
3115
- actionRef: 'skills/sentiment',
3116
- instructions: 'Classify the sentiment of this text',
3117
- prompt: '{{input}}',
3118
- workingMemory: { input: 'This product is amazing!' },
3119
- identity: {
3120
- sessionId: 's1',
3121
- instance: { instanceId: 'agent-1', type: 'test' },
3122
- aiRequestId,
3123
- jobId: 'job-123',
3124
- taskId: 'task-1',
3125
- agentId: 'agent-1'
3126
- },
3127
- primaryObjectType: {
3128
- type: 'classification',
3129
- schema: {
3130
- type: 'object',
3131
- properties: {
3132
- sentiment: {
3133
- type: 'string',
3134
- enum: ['positive', 'negative', 'neutral'],
3135
- description: 'The sentiment classification'
3136
- },
3137
- confidence: {
3138
- type: 'number',
3139
- minimum: 0,
3140
- maximum: 1,
3141
- description: 'Confidence score'
3142
- }
3143
- },
3144
- required: ['sentiment', 'confidence']
3145
- },
3146
- whenToUse: 'For sentiment classification',
3147
- description: 'Simple sentiment classification'
3148
- },
3149
- config: { model: 'gpt-5-nano', provider: 'openai' }
3150
- });
3151
-
3152
- // What the LLM receives (system message - using flex-md/Markdown format):
3153
- // OUTPUT FORMAT:
3154
- // Output Format: classification
3155
- // Simple sentiment classification
3156
- //
3157
- // Return structured data with the following structure:
3158
- // [Schema with properties and descriptions]
3159
- // IMPORTANT: Return your answer as structured data inside the markdown fenced block.
3160
- //
3161
- // What the LLM might return (flex-md/Markdown):
3162
- // ```markdown
3163
- // {"sentiment": "positive", "confidence": 0.95}
3164
- // ```
3165
- //
3166
- // What you receive (response.parsedContent):
3167
- // { sentiment: "positive", confidence: 0.95 } // Clean JavaScript object (not Markdown)
3168
- ```
3169
-
3170
- **Structured Data Extraction Process:**
3171
-
3172
- The gateway uses flex-md (Markdown-based) for communication and extracts structured data from Markdown responses using multiple methods (in order of preference):
3173
-
3174
- 1. **flex-md SDK**: Uses `extractFencedBlocks()` and `detectJsonAll()` to extract structured data from flex-md (Markdown) fenced blocks
3175
- 2. **Manual Regex Fallback**: Extracts structured data from Markdown fenced blocks if flex-md SDK methods fail
3176
- 3. **Response Repair (prod only)**: Uses a minimal in-gateway fallback repair (warn-logged) to recover JSON from malformed Markdown/prose. In `mode=debug`, failures throw immediately.
3177
-
3178
- **Result**: You always receive clean JavaScript objects in `response.parsedContent`, never raw Markdown text.
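To illustrate the fallback idea, here is a minimal sketch of methods 2 and 3 only (regex extraction, then a naive repair). It does not call the flex-md SDK, and `tryParse`, `extractStructuredData`, and the regular expression are hypothetical, chosen for illustration rather than taken from the gateway's source.

```typescript
// Hypothetical sketch: regex fallback first, then a naive brace-span repair.
const TICKS = '`'.repeat(3); // built at runtime so this example's own fence stays intact
const FENCE = new RegExp(TICKS + '[a-zA-Z]*[ \\t]*\\r?\\n([\\s\\S]*?)' + TICKS, 'g');

// Returns the parsed value, or undefined when the text is not valid JSON.
function tryParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

function extractStructuredData(raw: string): unknown {
  // Tier 1: try the contents of every fenced block, in order.
  FENCE.lastIndex = 0;
  let match: RegExpExecArray | null;
  while ((match = FENCE.exec(raw)) !== null) {
    const parsed = tryParse(match[1].trim());
    if (parsed !== undefined) return parsed;
  }
  // Tier 2 (repair): take the span from the first "{" to the last "}".
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start !== -1 && end > start) {
    return tryParse(raw.slice(start, end + 1));
  }
  return undefined;
}
```

A production repair step would need to handle arrays, nested prose, and multiple candidate objects; the sketch only shows the ordering of the fallbacks.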
3179
-
3180
- **Output Format Validation Configuration:**
3181
-
3182
- Set the minimum compliance level using the `FLEX_MD_MIN_COMPLIANCE_LEVEL` environment variable:
3183
-
3184
- ```bash
3185
- export FLEX_MD_MIN_COMPLIANCE_LEVEL=L0 # Default: allows anything
3186
- export FLEX_MD_MIN_COMPLIANCE_LEVEL=L1 # Requires section headings
3187
- export FLEX_MD_MIN_COMPLIANCE_LEVEL=L2 # Requires fenced block + sections
3188
- export FLEX_MD_MIN_COMPLIANCE_LEVEL=L3 # Maximum strictness
3189
- ```
3190
-
3191
- - **L0 (default)**: No format validation required - allows any output format
3192
- - **L1+**: Format specification required in instructions - validation errors will reject requests if format is missing or invalid
3193
-
3194
- **Transforming Objects to Instruction Blocks:**
3195
-
3196
- The gateway automatically transforms instruction blocks (from `instructions-blocks.json` or content-registry) into system message instructions. This allows you to configure instruction behavior declaratively without hardcoding.
3197
-
3198
- **Supported Features:**
3199
-
3200
- 1. **Nested Instruction Blocks**: Access nested properties using dot notation
3201
- - `input.inputPrefix` - Prefix added before user input
3202
- - `reinforcement.inputAlreadyProvided` - Reinforcement rules
3203
-
3204
- 2. **Template Resolution**: Instruction blocks can contain template variables that get resolved with request context
3205
- - `{{taskDescription}}` - Replaced with task description
3206
- - `{{input}}` - Replaced with actual input value
3207
-
3208
- 3. **Compliance Level Enforcement**: Use `flexMdComplianceLevel` (L0-L3) to automatically apply appropriate markdown enforcement rules
3209
- - L0: Plain Markdown (minimum structure)
3210
- - L1: Sectioned Markdown (headings required)
3211
- - L2: Single-container Markdown (fenced block required) - **Default for `flexMdComplianceLevel`**
3212
- - L3: Fully-typed Markdown (strict formatting rules)
3213
-
3214
- 4. **Automatic Block Selection**: The gateway automatically selects the right instruction blocks based on:
3215
- - Request type (AIRequest vs ChatRequest)
3216
- - Presence of `primaryObjectType` or `flexMdFormat`
3217
- - Compliance level specified
3218
- - Agent ID and Task Type ID (for content-registry resolution)
3219
-
3220
- **Example: Output Format in Instructions**
3221
-
3222
- Instructions should include output format specifications:
3223
-
3224
- ```markdown
3225
- ## Task
3226
- Classify the sentiment of the input text.
3227
-
3228
- ## OUTPUT FORMAT
3229
- Return your answer in the following structure:
3230
-
3231
- Sentiment:
3232
- - One of: positive, negative, neutral
3233
-
3234
- Confidence:
3235
- - A number between 0 and 1
3236
-
3237
- Reasoning:
3238
- - Brief explanation of your classification
3239
- ```
3240
-
3241
- The gateway will:
3242
- 1. Extract and validate the output format from instructions
3243
- 2. Check compliance level against `FLEX_MD_MIN_COMPLIANCE_LEVEL`
3244
- 3. Include format information in response metadata (`response.metadata.outputFormat`)
3245
- 4. Track format information in activity records
3246
-
3246
- For compliance level L2, the gateway also retrieves the `output.complianceLevels.L2` rules,
3247
- applies `markdownEnforcement`, `structuralRule`, and `immediateComplianceRule`,
3248
- and adds these to the system message.
3250
-
3251
- **Example: Custom Instruction Blocks Structure**
3252
-
3253
- The `instructions-blocks.json` file supports nested objects that get automatically resolved:
3254
-
3255
- ```json
3256
- {
3257
- "input": {
3258
- "inputPrefix": "INPUT DATA (process this now):",
3259
- "inputRecognitionRule": "The user message below is the complete input to process."
3260
- },
3261
- "output": {
3262
- "complianceLevels": {
3263
- "L2": {
3264
- "markdownEnforcement": "Return your entire answer inside a single ```markdown fenced block.",
3265
- "structuralRule": "No conversational text before or after the fenced block.",
3266
- "immediateComplianceRule": "Immediately return your answer inside a single ```markdown fenced block."
3267
- }
3268
- }
3269
- },
3270
- "reinforcement": {
3271
- "inputAlreadyProvided": "The input has been provided in the user message. Process it immediately.",
3272
- "noConversation": "You are not having a conversation."
3273
- }
3274
- }
3275
- ```
3276
-
3277
- **Resolution Priority:**
3278
-
3279
- 1. **Content Registry** (highest): `instructions/{agentId}/{taskTypeId}/{blockPath}`
3280
- 2. **Content Registry (agent-level)**: `instructions/{agentId}/{blockPath}`
3281
- 3. **Default JSON File**: `src/defaults/instructions-blocks.json`
3282
- 4. **Hardcoded Fallbacks**: Built-in defaults if file loading fails
3283
-
3284
- **How Blocks Are Transformed:**
3285
-
3286
- - **System Message**: Instruction blocks are combined with user instructions, OUTPUT FORMAT, and compliance rules
3287
- - **User Message**: Input blocks (like `inputPrefix`) are prepended to user input
3288
- - **Template Variables**: All `{{variable}}` placeholders are resolved using request context
3289
- - **Nested Access**: Dot notation (e.g., `output.complianceLevels.L2`) automatically traverses nested objects
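The dot-notation lookup and `{{variable}}` substitution described above can be sketched as follows. This is a hedged illustration, not the gateway's resolver: `Blocks`, `getBlock`, and `resolveTemplate` are hypothetical names, and leaving unknown placeholders untouched is a choice made for this sketch rather than documented behavior.

```typescript
// Hypothetical helpers; the gateway's real resolver is not shown in this README.
type Blocks = { [key: string]: string | Blocks };

// Walk a dot-separated path such as "output.complianceLevels.L2".
function getBlock(blocks: Blocks, path: string): string | Blocks | undefined {
  let current: string | Blocks | undefined = blocks;
  for (const key of path.split('.')) {
    // Stop as soon as the path runs past a leaf or a missing key.
    if (current === undefined || typeof current === 'string') return undefined;
    current = current[key];
  }
  return current;
}

// Replace {{name}} placeholders; unknown names are left untouched (sketch choice).
function resolveTemplate(text: string, context: Record<string, string>): string {
  return text.replace(/\{\{\s*(\w+)\s*\}\}/g, (whole: string, name: string) =>
    Object.prototype.hasOwnProperty.call(context, name) ? context[name] : whole
  );
}

const blocks: Blocks = {
  input: { inputPrefix: 'INPUT DATA (process this now):' },
  reinforcement: { noConversation: 'You are not having a conversation.' }
};

// Prepend the input block to the resolved user input, as described above.
const prefix = getBlock(blocks, 'input.inputPrefix');
const userMessage = `${prefix}\n${resolveTemplate('{{input}}', { input: 'This product is amazing!' })}`;
```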
3290
-
3291
- **AIInvokeRequest with object types / schema (invoke):**
3292
- ```typescript
3293
- const aiRequestId = 'pa-1';
3294
- const response = await gateway.invoke({
3295
- aiRequestId,
3296
- agentId: 'agent-1',
3297
- actionType: 'skill',
3298
- actionRef: 'skills/professional-answer',
3299
- instructions: 'professional-answer/general',
3300
- prompt: '{{input}}',
3301
- workingMemory: { input: 'User question here' },
3302
- inferenceType: 'professional-answer',
3303
- identity: {
3304
- sessionId: 's1',
3305
- instance: { instanceId: 'agent-1', type: 'test' },
3306
- aiRequestId,
3307
- jobId: 'job-123',
3308
- taskId: 'task-1',
3309
- agentId: 'agent-1'
3310
- },
3311
- objectTypes: [
3312
- {
3313
- type: 'professional-answer',
3314
- whenToUse: 'For professional Q&A responses',
3315
- description: 'Professional answer JSON structure',
3316
- schema: {
3317
- type: 'object',
3318
- properties: {
3319
- answer: { type: 'string' },
3320
- reasoning: { type: 'string' },
3321
- citations: { type: 'array', items: { type: 'string' } }
3322
- },
3323
- required: ['answer']
3324
- }
3325
- }
3326
- ],
3327
- config: {
3328
- model: 'gpt-4o',
3329
- provider: 'openai'
3330
- }
3331
- });
3332
- // Validate against schema in your app if needed — `validateOutputSchema` is not a request field.
3333
- ```
3334
-
3335
- **Content registry + AIInvokeRequest (recommended):**
3336
-
3337
- When using content-registry with **`invoke()`**, the gateway typically:
3338
- 1. Resolves instruction keys from content-registry
3339
- 2. Fetches instruction metadata (**`outputType`**, **`outputSchema`**, **`parseOptions`** on **metadata**, not on the request body)
3340
- 3. May configure provider JSON modes via merged **`config`** (provider-specific)
3341
-
3342
- ```typescript
3343
- const aiRequestId = 'pa-2';
3344
- const response = await gateway.invoke({
3345
- aiRequestId,
3346
- agentId: 'agent-1',
3347
- actionType: 'skill',
3348
- actionRef: 'skills/professional-answer',
3349
- instructions: 'professional-answer/general',
3350
- prompt: '{{input}}',
3351
- workingMemory: { input: 'User question' },
3352
- inferenceType: 'professional-answer',
3353
- identity: {
3354
- sessionId: 's1',
3355
- instance: { instanceId: 'agent-1', type: 'test' },
3356
- aiRequestId,
3357
- jobId: 'job-123',
3358
- taskId: 'task-1',
3359
- agentId: 'agent-1'
3360
- },
3361
- objectTypes: [
3362
- {
3363
- type: 'professional-answer',
3364
- whenToUse: 'For professional Q&A responses',
3365
- schema: instructionMetadata?.outputSchema || { // instructionMetadata: assumed to be loaded from content-registry beforehand (not shown)
3366
- type: 'object',
3367
- properties: { answer: { type: 'string' } }
3368
- }
3369
- }
3370
- ],
3371
- config: {
3372
- model: 'gpt-4o',
3373
- provider: 'openai'
3374
- }
3375
- });
3376
- ```
3377
-
3378
- **Output Modes Examples (v3.0.5+):**
3379
-
3380
- **1. JSON Output Mode (Default):**
3381
- ```typescript
3382
- const aiRequestId = 'out-json-1';
3383
- const response = await gateway.invoke({
3384
- aiRequestId,
3385
- agentId: 'agent-1',
3386
- actionType: 'skill',
3387
- actionRef: 'skills/qa',
3388
- instructions: 'Answer the question',
3389
- prompt: '{{input}}',
3390
- workingMemory: { input: 'What is the capital of France?' },
3391
- identity: {
3392
- sessionId: 's1',
3393
- instance: { instanceId: 'agent-1', type: 'test' },
3394
- aiRequestId,
3395
- jobId: 'job-123',
3396
- taskId: 'task-1',
3397
- agentId: 'agent-1'
3398
- },
3399
- primaryObjectType: {
3400
- type: 'question-answer',
3401
- schema: {
3402
- answer: { type: 'string' },
3403
- confidence: { type: 'number' }
3404
- },
3405
- whenToUse: 'For Q&A responses'
3406
- },
3407
- outputMode: 'json', // Default - expects JSON response
3408
- instructionFormat: 'json-schema', // Default - includes JSON schema in instructions
3409
- config: { model: 'gpt-5-nano', provider: 'openai' }
3410
- });
3411
- // response.parsedContent will be a JSON object
3412
- ```
3413
-
3414
- **2. Structured Text Output Mode:**
3415
- ```typescript
3416
- const aiRequestId = 'out-st-1';
3417
- const response = await gateway.invoke({
3418
- aiRequestId,
3419
- agentId: 'story-teller',
3420
- actionType: 'skill',
3421
- actionRef: 'skills/story',
3422
- instructions: 'Write a short story',
3423
- prompt: '{{input}}',
3424
- workingMemory: { input: 'Create a story about a brave knight' },
3425
- identity: {
3426
- sessionId: 's1',
3427
- instance: { instanceId: 'story-teller', type: 'test' },
3428
- aiRequestId,
3429
- jobId: 'job-123',
3430
- taskId: 'task-1',
3431
- agentId: 'story-teller'
3432
- },
3433
- primaryObjectType: {
3434
- type: 'story',
3435
- whenToUse: 'For narrative content'
3436
- },
3437
- outputMode: 'structured-text', // Expects structured text, not JSON
3438
- instructionFormat: 'structured-text-spec',
3439
- structuredTextSpec: {
3440
- description: 'A story with a character, setting, conflict, and resolution',
3441
- formatHint: 'markdown' // Optional
3442
- },
3443
- config: { model: 'gpt-5-nano', provider: 'openai' }
3444
- });
3445
- // response.content will be structured text (string)
3446
- // response.parsedContent will be undefined
3447
- ```
3448
-
3449
- **3. Two-Step Conversion Mode:**
3450
- ```typescript
3451
- const aiRequestId = 'out-2step-1';
3452
- const response = await gateway.invoke({
3453
- aiRequestId,
3454
- agentId: 'story-converter',
3455
- actionType: 'skill',
3456
- actionRef: 'skills/story-convert',
3457
- instructions: 'Write a detailed story',
3458
- prompt: '{{input}}',
3459
- workingMemory: { input: 'Create a story about space exploration' },
3460
- identity: {
3461
- sessionId: 's1',
3462
- instance: { instanceId: 'story-converter', type: 'test' },
3463
- aiRequestId,
3464
- jobId: 'job-123',
3465
- taskId: 'task-1',
3466
- agentId: 'story-converter'
3467
- },
3468
- primaryObjectType: {
3469
- type: 'story',
3470
- schema: {
3471
- character: { type: 'string' },
3472
- setting: { type: 'string' },
3473
- conflict: { type: 'string' },
3474
- resolution: { type: 'string' }
3475
- },
3476
- whenToUse: 'For structured stories'
3477
- },
3478
- outputMode: 'two-step', // Step 1: Get structured text, Step 2: Convert to JSON
3479
- instructionFormat: 'structured-text-spec',
3480
- structuredTextSpec: {
3481
- description: 'A story with character, setting, conflict, and resolution'
3482
- },
3483
- // Optional: Custom conversion instructions
3484
- // conversionInstructions: 'Your custom conversion instructions here',
3485
- config: { model: 'gpt-5-nano', provider: 'openai' }
3486
- });
3487
- // Step 1: LLM generates structured text
3488
- // Step 2: Gateway automatically converts structured text to JSON
3489
- // response.parsedContent will be a JSON object with the story structure
3490
- ```
3491
-
3492
- **Instruction Format Combinations:**
3493
-
3494
- - **JSON Schema + JSON Mode** (default): Instructions include JSON schema, expects JSON response
3495
- - **JSON Schema + Structured Text Mode**: Instructions include JSON schema, but expects structured text (schema used as reference)
3496
- - **Structured Text Spec + JSON Mode**: Instructions describe structure in free-form, expects JSON response
3497
- - **Structured Text Spec + Structured Text Mode**: Instructions describe structure in free-form, expects structured text
3498
- - **Structured Text Spec + Two-Step Mode**: Instructions describe structure in free-form, gets structured text first, then converts to JSON
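The five combinations listed above can be captured in a small lookup table. This is a sketch under assumptions: `DOCUMENTED`, `isDocumentedCombination`, and `expectsParsedContent` are hypothetical helpers, and this README does not say whether the gateway rejects a pairing that is not listed (for example JSON Schema with two-step mode). The `parsedContent` rule mirrors the comments in the three output-mode examples.

```typescript
type OutputMode = 'json' | 'structured-text' | 'two-step';
type InstructionFormat = 'json-schema' | 'structured-text-spec';

// Pairings listed in this README; anything else is simply "not documented here".
const DOCUMENTED: Record<InstructionFormat, OutputMode[]> = {
  'json-schema': ['json', 'structured-text'],
  'structured-text-spec': ['json', 'structured-text', 'two-step']
};

function isDocumentedCombination(format: InstructionFormat, mode: OutputMode): boolean {
  return DOCUMENTED[format].includes(mode);
}

// Per the examples above: 'json' and 'two-step' yield parsedContent,
// while 'structured-text' leaves it undefined.
function expectsParsedContent(mode: OutputMode): boolean {
  return mode !== 'structured-text';
}
```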
3499
-
3500
- **Troubleshooting**: If **`invoke()`** fails validation:
3501
- 1. Confirm **`actionType`**, **`actionRef`**, **`aiRequestId`**, and full **`identity`** (**`jobId`** / **`taskId`**) are set
3502
- 2. Remove top-level **`input`** — use **`prompt`** + **`workingMemory.input`**
3503
- 3. Use **`validateAIRequest()`** / **`assertValidAIRequest()`** before sending
3504
- 4. Enable **`AI_GATEWAY_DEBUG_REQUEST=true`** to inspect the request at the entry point
3505
- 5. See [TROUBLESHOOTING.md](./TROUBLESHOOTING.md#airequest-validation-issues) for detailed solutions
3506
-
3507
- #### Task Type ID (taskTypeId)
3508
-
3509
- The `taskTypeId` field is used to identify recurring tasks that you perform repeatedly with the same instructions but different context, prompts, or inputs. This is especially useful when processing large batches of data (e.g., 15k records) with the same question/instruction pattern.
3510
-
3511
- **Auto-Generation Behavior:**
3512
-
3513
- If `taskTypeId` is not provided in the request, the gateway automatically generates it as an MD5 hash of the **pre-parsed instructions** (after resolving from content-registry if it's a key, but before template parsing with workingMemory). This ensures:
3514
-
3515
- - **Consistency**: All requests with the same base instruction text get the same `taskTypeId`
3516
- - **Automatic**: No need to manually calculate or provide hashes
3517
- - **Stable**: The hash is based on the instruction text itself, not variable content
3518
-
3519
- **Manual Generation (Helper Method):**
3520
-
3521
- For maximum control and consistency, you can use the gateway's helper method to compute `taskTypeId`:
3522
-
3523
- ```typescript
3524
- // Static method (requires content-registry config if instructions is a key)
3525
- const staticTaskTypeId = await AIGateway.generateTaskTypeId('classification/basic', {
3525
- contentRegistryConfig: { github: { /* ... */ } },
3526
- agentId: 'agent-1'
3527
- });
3528
-
3529
- // Instance method (uses gateway's configured content-registry)
3530
- const gateway = new AIGateway({ enableContentRegistry: true, /* ... */ });
3531
- const instanceTaskTypeId = await gateway.generateTaskTypeId('classification/basic');
3533
-
3534
- // Example: Processing 15k records with the same question
3535
- const question = "What is the sentiment of this text?";
3536
- const taskTypeId = await gateway.generateTaskTypeId(question);
3537
-
3538
- // All 15k requests will have the same taskTypeId
3539
- for (const record of records) {
3540
- const aiRequestId = `sentiment-${record.id}`;
3541
- await gateway.invoke({
3542
- aiRequestId,
3543
- agentId: 'sentiment-analyzer',
3544
- actionType: 'skill',
3545
- actionRef: 'skills/sentiment-batch',
3546
- instructions: question,
3547
- prompt: '{{input}}',
3548
- workingMemory: { input: record.text },
3549
- taskTypeId, // Same for all records
3550
- identity: {
3551
- sessionId: 'batch-1',
3552
- instance: { instanceId: 'sentiment-analyzer', type: 'batch' },
3553
- aiRequestId,
3554
- jobId: `job-${record.id}`,
3555
- taskId: `task-${record.id}`,
3556
- agentId: 'sentiment-analyzer'
3557
- },
3558
- primaryObjectType: 'sentiment-analysis',
3559
- config: { model: 'gpt-5-nano', provider: 'openai' }
3560
- });
3561
- }
3562
- ```
3563
-
3564
- **What Exactly is Hashed:**
3565
-
3566
- The hash is computed from the **pre-parsed instructions**:
3567
- 1. If `instructions` is a key (e.g., `'classification/basic'`), it's resolved from content-registry first
3568
- 2. The resolved instruction text (or original text if not a key) is then hashed
3569
- 3. Template parsing with `workingMemory` happens **after** hash generation, so variable values don't affect the hash
3570
-
3571
- This ensures that:
3572
- - Instruction keys resolve to the same hash across services
3573
- - Template variables don't affect taskTypeId consistency
3574
- - The same instruction text always produces the same taskTypeId
3575
-
3576
- **How It Works:**
3577
-
3578
- 1. If `taskTypeId` is provided → Uses the provided value
3579
- 2. If `taskTypeId` is not provided → Gateway automatically:
3580
- - Resolves instructions from content-registry if it's a key (without workingMemory)
3581
- - Computes MD5 hash of the resolved instruction text
3582
- - Uses the hash as `taskTypeId`
3583
- - Logs the auto-generated value for debugging
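The auto-generation rule can be approximated with Node's built-in `crypto` module. This is a minimal sketch, not the gateway's code: `sketchTaskTypeId` is a hypothetical helper, and the hex encoding and absence of any whitespace normalization are assumptions. Use `generateTaskTypeId()` when the value must match what the gateway computes.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: hash the instruction text before any workingMemory substitution.
function sketchTaskTypeId(resolvedInstructions: string): string {
  return createHash('md5').update(resolvedInstructions, 'utf8').digest('hex');
}

// The template is hashed as-is, so every record in a batch shares one ID.
const template = 'Classify the sentiment of {{input}}';
const idForRecordA = sketchTaskTypeId(template);
const idForRecordB = sketchTaskTypeId(template);
```

Because the placeholder is still unresolved when the hash is taken, `idForRecordA` and `idForRecordB` are identical regardless of each record's input.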
3584
-
3585
- **Use Cases:**
3586
-
3587
- - **Batch Processing**: Process thousands of records with the same instruction/question
3588
- - **Task Grouping**: Group related activities in activity tracking
3589
- - **Content Registry**: Use `taskTypeId` in content-registry paths: `instructions/{agentId}/{taskTypeId}/{blockName}`
3590
- - **Analytics**: Track performance and costs by task type
3591
-
3592
- ### EnhancedLLMResponse
3593
-
3594
- Extended response interface with comprehensive metadata.
3595
-
3596
- ```typescript
3597
- interface EnhancedLLMResponse<TContent = unknown> extends LLMResponse {
3598
- content: string; // Normalized content (always string)
3599
- rawText?: string; // Original raw text before fixing (v3.0.4+)
3600
- parsedContent?: TContent; // Parsed JSON content
3601
- metadata: {
3602
- jobId?: string;
3603
- latencyMs: number;
3604
- tokens: {
3605
- prompt: number;
3606
- completion: number;
3607
- total: number;
3608
- };
3609
- // Output mode metadata (v3.0.5+)
3610
- outputMode?: 'json' | 'structured-text' | 'two-step';
3611
- instructionFormat?: 'json-schema' | 'structured-text-spec';
3612
- isTwoStepConversion?: boolean;
3613
- structuredTextStep?: 'first' | 'second';
3614
- model?: string;
3615
- provider?: string;
3616
- cost?: number;
3617
- // Response fixer metadata (v3.0.4+)
3618
- responseWasFixed?: boolean; // Whether a response repair fallback was applied
3619
- responseFixApplied?: string; // Which fix strategy was used
3620
- responseFixConfidence?: number; // Confidence level (0-1)
3621
- responseFixWarnings?: string[]; // Warnings about the fix
3622
- [key: string]: any;
3623
- };
3624
- }
3625
- ```
3626
-
3627
- ## Advanced Usage
3628
-
3629
- ### Custom Activix instance
3630
-
3631
- Use the same **`collections`** names the gateway writes to (`ai-actions`, `skill-executions`, `bad-requests`) and the same **`statusValues`** mapping as in section 2.
3632
-
3633
- ```typescript
3634
- import { Activix } from '@x12i/activix';
3635
-
3636
- const statusValues = {
3637
- started: 'started',
3638
- inProgress: 'in_progress',
3639
- completed: 'success',
3640
- failed: 'failed',
3641
- timeout: 'timeout'
3642
- };
3643
-
3644
- const customTracker = new Activix({
3645
- collections: [
3646
- { name: 'ai-actions', statusValues },
3647
- { name: 'skill-executions', statusValues },
3648
- { name: 'bad-requests', statusValues }
3649
- ]
3650
- });
3651
-
3652
- const gateway = new AIGateway({
3653
- activityTracker: customTracker,
3654
- enableActivityTracking: true
3655
- });
3656
- ```
3657
-
3658
- ### Custom Logger
3659
-
3660
- ```typescript
3661
- import { createLogxer } from '@x12i/logxer';
3662
-
3663
- const customLogger = createLogxer(
3664
- { packageName: 'MY_APP', envPrefix: 'MY_APP', debugNamespace: 'my-app' },
3665
- {
3666
- logLevel: 'debug',
3667
- logFormat: 'json',
3668
- enableUnifiedLogger: true,
3669
- unifiedLogger: {
3670
- transports: { papertrail: true },
3671
- service: 'ai-gateway',
3672
- env: 'production'
3673
- }
3674
- }
3675
- );
3676
-
3677
- const gateway = new AIGateway({
3678
- logger: customLogger,
3679
- enableLogging: true
3680
- });
3681
- ```
3682
-
3683
- ### Usage Consumption Monitoring
3684
-
3685
- ```typescript
3686
- import { calculateAggregateConsumption } from '@x12i/ai-gateway';
3687
-
3688
- // After making requests, check aggregate consumption
3689
- const aggregate = calculateAggregateConsumption('openai');
3690
-
3691
- if (aggregate) {
3692
-   console.log(`Total requests: ${aggregate.totalRequests}`);
3693
-   console.log(`RPM consumption: ${aggregate.rpmConsumption.toFixed(2)}%`);
3694
-   console.log(`TPM consumption: ${aggregate.tpmConsumption.toFixed(2)}%`);
3695
-
3696
-   // Alert if approaching limits
3697
-   if (aggregate.rpmConsumption > 80) {
3698
-     console.warn('⚠️ Approaching RPM limit!');
3699
-   }
3700
- }
3701
- ```
3702
-
3703
- ### Health Checks
3704
-
3705
- ```typescript
3706
- // Check single provider
3707
- const health = await gateway.checkHealth('openai');
3708
- console.log(health.healthy, health.latencyMs);
3709
-
3710
- // Check all providers
3711
- const allHealth = await gateway.checkAllHealth();
3712
- for (const [provider, result] of allHealth) {
3713
-   console.log(`${provider}: ${result.healthy ? 'healthy' : 'unhealthy'}`);
3714
-   if (!result.healthy) {
3715
-     console.error(`Error: ${result.error}`);
3716
-   }
3717
- }
3718
- ```
3719
-
3720
- ### Interceptors
3721
-
3722
- ```typescript
3723
- // Add request interceptor
3724
- gateway.addRequestInterceptor(async (request, provider) => {
3725
-   console.log(`Request to ${provider}:`, request);
3726
-   // Modify request if needed
3727
-   return request;
3728
- });
3729
-
3730
- // Add response interceptor
3731
- gateway.addResponseInterceptor(async (response, provider) => {
3732
-   console.log(`Response from ${provider}:`, response);
3733
-   // Modify response if needed
3734
-   return response;
3735
- });
3736
- ```
3737
-
3738
- ## Integration Examples
3739
-
3740
- ### With Agent Framework
3741
-
3742
- ```typescript
3743
- import { AIGateway } from '@x12i/ai-gateway';
3744
-
3745
- const gateway = new AIGateway({
3746
- defaultProvider: 'openai',
3747
- usageTier: 'tier-3',
3748
- enableActivityTracking: true
3749
- });
3750
-
3751
- // In your agent execution
3752
- async function executeAgentTask(task: Task, jobId: string) {
3753
- const aiRequestId = `agent-${task.id}-${Date.now()}`;
3754
- // Use invokeChat() for chat requests without structured output
3755
- const response = await gateway.invokeChat({
3756
- aiRequestId,
3757
- agentId: task.agentId,
3758
- instructions: task.instructions,
3759
- prompt: '{{input}}',
3760
- workingMemory: { input: task.input },
3761
- taskTypeId: task.typeId, // Or use MD5 hash of question/instruction for consistent identification
3762
- identity: {
3763
- sessionId: jobId,
3764
- instance: { instanceId: task.agentId, type: 'agent' },
3765
- aiRequestId,
3766
- jobId,
3767
- taskId: task.id,
3768
- agentId: task.agentId
3769
- }
3770
- });
3771
-
3772
- return {
3773
- content: response.content,
3774
- metadata: response.metadata // Includes jobId, latency, tokens, etc.
3775
- };
3776
- }
3777
- ```
3778
-
3779
- ### With x-models for Smart Selection
3780
-
3781
- ```typescript
3782
- import { AIGateway } from '@x12i/ai-gateway';
3783
- import { registry } from '@x12i/x-models';
3784
-
3785
- // Select optimal model
3786
- const model = registry.selectModel({
3787
- strategy: 'cheapest',
3788
- capabilities: { toolCalling: true },
3789
- minContext: 16000
3790
- });
3791
-
3792
- if (model) {
3793
- const gateway = new AIGateway({
3794
- defaultProvider: model.provider
3795
- });
3796
-
3797
- const aiRequestId = 'xmodels-1';
3798
- const response = await gateway.invokeChat({
3799
- aiRequestId,
3800
- agentId: 'agent-1',
3801
- instructions: 'You are a helpful assistant',
3802
- prompt: '{{input}}',
3803
- workingMemory: { input: 'Hello!' },
3804
- identity: {
3805
- sessionId: 's1',
3806
- instance: { instanceId: 'agent-1', type: 'test' },
3807
- aiRequestId,
3808
- jobId: 'job-123',
3809
- taskId: 'task-1',
3810
- agentId: 'agent-1'
3811
- },
3812
- config: {
3813
- model: model.id
3814
- }
3815
- });
3816
- }
3817
- ```
3818
-
3819
- ### Unified Reasoning API (OpenRouter Reasoning)
3820
-
3821
- The gateway supports unified reasoning/thoughts configuration through the OpenRouter Reasoning API. This provides normalized reasoning capabilities across all providers.
3822
-
3823
- #### Request Configuration
3824
-
3825
- ```typescript
3826
- const aiRequestId = 'reasoning-example';
3827
- const response = await gateway.invokeChat({
3828
- aiRequestId,
3829
- agentId: 'agent-1',
3830
- instructions: 'Solve this step-by-step',
3831
- prompt: '{{input}}',
3832
- workingMemory: { input: 'What is 15 * 23?' },
3833
- identity: {
3834
- sessionId: 's1',
3835
- instance: { instanceId: 'agent-1', type: 'test' },
3836
- aiRequestId,
3837
- jobId: 'job-reason',
3838
- taskId: 'task-reason',
3839
- agentId: 'agent-1'
3840
- },
3841
- config: {
3842
- provider: 'openai',
3843
- model: 'gpt-5-nano',
3844
- // Unified reasoning configuration
3845
- reasoning: {
3846
- effort: 'high', // 'none' | 'low' | 'medium' | 'high' | 'xhigh'
3847
- visibility: 'trace', // 'none' | 'summary' | 'trace'
3848
- onUnsupported: 'downgrade' // 'error' | 'downgrade' | 'ignore'
3849
- }
3850
- }
3851
- });
3852
- ```
3853
-
3854
- **Reasoning Configuration Options:**
3855
-
3856
- - **`effort`**: Controls reasoning depth
3857
- - `'none'`: No reasoning (default)
3858
- - `'low'`: Basic reasoning
3859
- - `'medium'`: Moderate reasoning
3860
- - `'high'`: Deep reasoning
3861
- - `'xhigh'`: Maximum reasoning depth
3862
-
3863
- - **`visibility`**: Controls what reasoning data is returned
3864
- - `'none'`: No reasoning in response (default)
3865
- - `'summary'`: Human-readable reasoning summary
3866
- - `'trace'`: Detailed reasoning trace
3867
-
3868
- - **`onUnsupported`**: Behavior when provider doesn't support requested reasoning
3869
- - `'error'`: Throw error (default)
3870
- - `'downgrade'`: Automatically downgrade to supported level
3871
- - `'ignore'`: Proceed without reasoning
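The three `onUnsupported` behaviors can be sketched as a single function. This is an illustration under assumptions, not the gateway's logic: `applyEffort` and the idea of a single `maxSupported` level per model are hypothetical, and pairing `'downgrade'` with `EFFORT_NORMALIZED` and `'ignore'` with `EFFORT_IGNORED` is a guess based on the warning codes in the response type.

```typescript
type Effort = 'none' | 'low' | 'medium' | 'high' | 'xhigh';
type OnUnsupported = 'error' | 'downgrade' | 'ignore';

const ORDER: Effort[] = ['none', 'low', 'medium', 'high', 'xhigh'];

// Hypothetical: choose the effort to apply given the highest level a model supports.
function applyEffort(
  requested: Effort,
  maxSupported: Effort,
  onUnsupported: OnUnsupported = 'error' // documented default
): { applied: Effort; warning?: string } {
  if (ORDER.indexOf(requested) <= ORDER.indexOf(maxSupported)) {
    return { applied: requested };
  }
  switch (onUnsupported) {
    case 'downgrade':
      return { applied: maxSupported, warning: 'EFFORT_NORMALIZED' };
    case 'ignore':
      return { applied: 'none', warning: 'EFFORT_IGNORED' };
    default:
      throw new Error(`Reasoning effort "${requested}" is not supported`);
  }
}
```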
3872
-
3873
- #### Response Structure
3874
-
3875
- ```typescript
3876
- interface EnhancedLLMResponse {
3877
- content: string;
3878
- // Unified reasoning response object (not array)
3879
- reasoning?: {
3880
- requested: {
3881
- effort?: 'none' | 'low' | 'medium' | 'high' | 'xhigh';
3882
- visibility?: 'none' | 'summary' | 'trace';
3883
- };
3884
- applied: {
3885
- effort?: 'none' | 'low' | 'medium' | 'high';
3886
- visibility?: 'none' | 'summary' | 'trace';
3887
- };
3888
- artifacts?: {
3889
- summary?: { text: string; format: string };
3890
- trace?: { chunks: any[] };
3891
- encrypted?: Array<{
3892
- id: string;
3893
- format: string;
3894
- type?: string;
3895
- [key: string]: any;
3896
- }>;
3897
- };
3898
- availability?: {
3899
- supportsEffort?: boolean;
3900
- supportsSummary?: boolean;
3901
- supportsTrace?: boolean;
3902
- supportsEncrypted?: boolean;
3903
- };
3904
- warnings?: Array<{
3905
- code: 'EFFORT_NORMALIZED' | 'VISIBILITY_DOWNGRADED' | 'EFFORT_IGNORED' | 'REASONING_UNSUPPORTED';
3906
- message: string;
3907
- }>;
3908
- };
3909
- // ... other response fields
3910
- }
3911
- ```
3912
-
3913
- #### Usage Examples
3914
-
3915
- **Basic Reasoning Request:**
3916
- ```typescript
3917
- const aiRequestId = 'basic-reasoning';
3918
- const response = await gateway.invokeChat({
3919
- aiRequestId,
3920
- agentId: 'agent-1',
3921
- instructions: 'Explain your reasoning step by step',
3922
- prompt: '{{input}}',
3923
- workingMemory: { input: 'Why is the sky blue?' },
3924
- identity: {
3925
- sessionId: 's1',
3926
- instance: { instanceId: 'agent-1', type: 'test' },
3927
- aiRequestId,
3928
- jobId: 'job-reason',
3929
- taskId: 'task-basic',
3930
- agentId: 'agent-1'
3931
- },
3932
- config: {
3933
- provider: 'openai',
3934
- model: 'gpt-5-nano',
3935
- reasoning: {
3936
- effort: 'medium',
3937
- visibility: 'summary'
3938
- }
3939
- }
3940
- });
3941
-
3942
- // Access reasoning data
3943
- console.log('Response:', response.content);
3944
- console.log('Reasoning effort applied:', response.reasoning?.applied.effort);
3945
- console.log('Reasoning summary:', response.reasoning?.artifacts?.summary?.text);
3946
- ```
3947
-
3948
- **Encrypted Reasoning Continuity:**
3949
- ```typescript
3950
- // First request with encrypted trace
3951
- const aiRequestId1 = 'continuity-1';
3952
- const response1 = await gateway.invokeChat({
3953
- aiRequestId: aiRequestId1,
3954
- agentId: 'agent-1',
3955
- instructions: 'Solve this complex problem',
3956
- prompt: '{{input}}',
3957
- workingMemory: { input: 'Calculate the trajectory of a satellite' },
3958
- identity: {
3959
- sessionId: 's1',
3960
- instance: { instanceId: 'agent-1', type: 'test' },
3961
- aiRequestId: aiRequestId1,
3962
- jobId: 'job-cont',
3963
- taskId: 'task-c1',
3964
- agentId: 'agent-1'
3965
- },
3966
- config: {
3967
- provider: 'openai',
3968
- model: 'o1-preview', // OpenAI o-series models support encrypted traces
3969
- reasoning: {
3970
- effort: 'xhigh',
3971
- visibility: 'trace'
3972
- }
3973
- }
3974
- });
3975
-
3976
- // Check for encrypted artifacts
3977
- const encryptedArtifacts = response1.reasoning?.artifacts?.encrypted;
3978
- if (encryptedArtifacts && encryptedArtifacts.length > 0) {
3979
- console.log(`Found ${encryptedArtifacts.length} encrypted reasoning artifacts`);
3980
-
3981
- // Second request with continuity (encrypted artifacts would be passed back)
3982
- // Note: Continuity input format depends on provider-specific implementation
3983
- const aiRequestId2 = 'continuity-2';
3984
- const response2 = await gateway.invokeChat({
3985
- aiRequestId: aiRequestId2,
3986
- agentId: 'agent-1',
3987
- instructions: 'Continue from previous reasoning',
3988
- prompt: '{{input}}',
3989
- workingMemory: { input: 'Now apply this to Mars orbit' },
3990
- identity: {
3991
- sessionId: 's1',
3992
- instance: { instanceId: 'agent-1', type: 'test' },
3993
- aiRequestId: aiRequestId2,
3994
- jobId: 'job-cont',
3995
- taskId: 'task-c2',
3996
- agentId: 'agent-1'
3997
- },
3998
- config: {
3999
- provider: 'openai',
4000
- model: 'o1-preview',
4001
- reasoning: {
4002
- effort: 'xhigh',
4003
- visibility: 'trace'
4004
- }
4005
- // reasoningContinuity: encryptedArtifacts // Format depends on provider
4006
- }
4007
- });
4008
- }
4009
- ```
4010
-
4011
- **Provider Support Notes:**
4012
-
4013
- - **Encrypted reasoning traces**: Currently supported for `openai/o*` models via OpenRouter
4014
- - **Reasoning effort levels**: Varies by provider and model capabilities
4015
- - **Automatic fallback**: When `onUnsupported: 'downgrade'` is set, the gateway automatically adjusts to supported reasoning levels
4016
-
4017
- **Model Recommendations for Reasoning:**
4018
-
4019
- - **High reasoning**: `openai/o1-preview`, `openai/o1-mini`, `openai/gpt-4o` with reasoning config
4020
- - **Standard models**: `openai/gpt-5-nano`, `anthropic/claude-3.5-sonnet` (reasoning support varies)
4021
-
4022
- ## Troubleshooting
4023
-
4024
- ### Quick Reference
4025
-
4026
- **📚 Documentation** (included in the npm package and available after `npm install @x12i/ai-gateway`):
4027
- - **[TROUBLESHOOTING.md](./TROUBLESHOOTING.md)** - Comprehensive troubleshooting guide with test cases for all common issues
4028
- - **[TROUBLESHOOTING_TOOLBOX.md](./TROUBLESHOOTING_TOOLBOX.md)** - Diagnostic utilities API and usage examples
4029
- - **[INTEGRATION_GUIDANCE.md](./INTEGRATION_GUIDANCE.md)** - Official patterns, integration examples, and debugging steps
4030
-
4031
- **Location after install**: `node_modules/@x12i/ai-gateway/TROUBLESHOOTING.md` (and other .md files)
4032
-
4033
- **🔧 Troubleshooting Helper Functions** (exported from SDK - use immediately, no file reading needed):
4034
- ```typescript
4035
- import {
4036
- validateAIRequest, // Validate request before sending
4037
- diagnoseRequest, // Get comprehensive diagnostics
4038
- formatDiagnostic, // Format diagnostics as text
4039
- assertValidAIRequest, // Throw if invalid (for testing)
4040
- extractJSON, // Extract JSON from text/markdown
4041
- supportsJSONMode // Check provider JSON mode support
4042
- } from '@x12i/ai-gateway';
4043
- ```
4044
-
4045
- **See**: [TROUBLESHOOTING_TOOLBOX.md](./TROUBLESHOOTING_TOOLBOX.md) for complete API documentation.
4046
-
4047
- ### Common Issue: `invoke()` validation (`actionType`, `actionRef`, `identity`, `input`)
4048
-
4049
- **Typical errors**: missing **`actionType`** / **`actionRef`**, missing **`identity.jobId`** / **`identity.taskId`**, or passing deprecated top-level **`input`** (use **`workingMemory.input`** with a **`prompt`** template).
4050
-
4051
- **Quick Fix Checklist**:
4052
-
4053
- 1. ✅ **Structured calls must use `invoke()` with `AIInvokeRequest` fields**
4054
- ```typescript
4055
- const aiRequestId = 'fix-1';
4056
- await gateway.invoke({
4057
- aiRequestId,
4058
- agentId: 'agent-1',
4059
- actionType: 'skill',
4060
- actionRef: 'skills/professional-answer',
4061
- instructions: 'professional-answer/general',
4062
- prompt: '{{input}}',
4063
- workingMemory: { input: 'User question' },
4064
- identity: {
4065
- sessionId: 's1',
4066
- instance: { instanceId: 'agent-1', type: 'test' },
4067
- aiRequestId,
4068
- jobId: 'job-123',
4069
- taskId: 'task-1',
4070
- agentId: 'agent-1'
4071
- },
4072
- primaryObjectType: {
4073
- type: 'professional-answer',
4074
- whenToUse: 'For professional Q&A responses'
4075
- },
4076
- config: { model: 'gpt-4o', provider: 'openai' }
4077
- });
4078
- ```
4079
-
4080
- 2. ✅ **Use troubleshooting helper to validate before sending**
4081
- ```typescript
4082
- import { assertValidAIRequest } from '@x12i/ai-gateway';
4083
-
4084
- // Throws if the request is invalid, so a bad request never reaches invoke()
4085
- assertValidAIRequest(request);
4086
- await gateway.invoke(request);
4087
- ```
4088
-
4089
- 3. ✅ **Enable debug logging to see request at entry point**
4090
- ```bash
4091
- export AI_GATEWAY_DEBUG_REQUEST=true
4092
- npm run your-test
4093
- ```
4094
-
4095
- **See**: [TROUBLESHOOTING.md](./TROUBLESHOOTING.md#airequest-validation-issues) for detailed solutions and test cases.
4096
-
4097
- ### Using Troubleshooting Tools
4098
-
4099
- ```typescript
4100
- import {
4101
- validateAIRequest,
4102
- diagnoseRequest,
4103
- formatDiagnostic,
4104
- supportsJSONMode
4105
- } from '@x12i/ai-gateway';
4106
-
4107
- // Validate request
4108
- const validation = validateAIRequest(request);
4109
- if (!validation.valid) {
4110
- console.error('Errors:', validation.errors);
4111
- }
4112
-
4113
- // Get diagnostics
4114
- const diagnostic = diagnoseRequest(request);
4115
- console.log(formatDiagnostic(diagnostic));
4116
- ```
4117
-
4118
- **See**: [TROUBLESHOOTING_TOOLBOX.md](./TROUBLESHOOTING_TOOLBOX.md) for complete API.
4119
-
4120
- ## Error Handling
4121
-
4122
- The gateway throws specific error types:
4123
-
4124
- - `ProviderNotFoundError`: When a requested provider is not registered
4125
- - `FallbackExhaustedError`: When all providers in the fallback chain have failed
4126
-
4127
- ```typescript
4128
- import { ProviderNotFoundError, FallbackExhaustedError } from '@x12i/ai-gateway';
4129
-
4130
- try {
4131
- const aiRequestId = 'chat-req-001';
4132
- const response = await gateway.invokeChat({
4133
- aiRequestId,
4134
- agentId: 'agent-1',
4135
- instructions: 'You are a helpful assistant',
4136
- prompt: '{{input}}',
4137
- workingMemory: { input: 'Hello!' },
4138
- identity: {
4139
- sessionId: 'sess-1',
4140
- instance: { instanceId: 'gw-1', type: 'gateway' },
4141
- aiRequestId,
4142
- jobId: 'job-123',
4143
- taskId: 'task-1',
4144
- agentId: 'agent-1'
4145
- }
4146
- });
4147
- } catch (error) {
4148
- if (error instanceof FallbackExhaustedError) {
4149
- console.error('All providers failed:', error.attempts);
4150
- // With enableActivityTracking, success/failure is recorded automatically inside the gateway
4151
- } else if (error instanceof ProviderNotFoundError) {
4152
- console.error('Provider not found:', error.message);
4153
- }
4154
- }
4155
- ```
4156
-
4157
- ## Package Integrations
4158
-
4159
- This package integrates with:
4160
-
4161
- - **@x12i/ai-providers-router**: Core routing functionality
4162
- - **@xronoces/content-registry**: Instruction key resolution and content management (optional)
4163
- - **@x12i/x-models**: Usage tier tracking and model metadata
4164
- - **@x12i/activix** v7 (`@x12i/xronox-store`): Activity logging and tracking (`^7.1.0` in this package)
4165
- - **@x12i/logxer**: Structured logging with correlation (`^4.2.1` in this package)
4166
- - **nx-config2**: Configuration management (via dependencies)
4167
-
4168
- ## Related Packages Status
4169
-
4170
- ### @x12i/ai-gateway
4171
-
4172
- **Fixes:**
4173
- - ✅ "[object Object]" bug when router returns objects
4174
- - ✅ Enhanced `extractRawText()` with multiple safety layers
4175
-
4176
- **Features:**
4177
- - ✅ Automatic recovery if "[object Object]" is detected
4178
- - ✅ Original object preserved in `parsedContent`
4179
- - ✅ Normalized content as JSON string (backward compatible)
4180
- - ✅ `metadata.contentType` for type indication
4181
- - ✅ Content-registry integration for instruction keys
4182
- - ✅ Centralized activity tracking configuration (v2.0.5+)
4183
- - ✅ Activity lifecycle verification and enhanced logging (v2.0.5+)
4184
-
4185
- ### @x12i/ai-providers-router
4186
-
4187
- **Fixes:**
4188
- - ✅ Dynamic import registration issue
4189
-
4190
- **Features:**
4191
- - ✅ Batch API support
4192
- - ✅ Enhanced provider management
4193
-
4194
- ### Integration
4195
-
4196
- The gateway fix handles cases where the router returns structured data (objects/arrays) instead of plain strings, ensuring:
4197
-
4198
- - **Backward compatibility** — `content` is always a string
4199
- - **Data preservation** — original object in `parsedContent`
4200
- - **Type safety** — `metadata.contentType` indicates the type
4201
-
4202
- ### Installation
4203
-
4204
- ```bash
4205
- npm install @x12i/ai-gateway
4206
- npm install @x12i/ai-providers-router
4207
- ```
4208
-
4209
- **Package compatibility:**
4210
- - Router handles dynamic imports and batch operations
4211
- - Gateway handles object responses from the router
4212
- - Gateway provides content normalization and type safety
4213
-
4214
- ## Testing
4215
-
4216
- ### Activity Tracking Tests
4217
-
4218
- **Standalone Test** (Recommended):
4219
- ```bash
4220
- npm run test:activities:standalone
4221
- ```
4222
-
4223
- This test bypasses config parsing issues and exercises the activity lifecycle end to end:
4224
- - ✅ Gateway initialization with activity tracking enabled
4225
- - ✅ Activity creation with `jobId`, `agentId`, `taskId`
4226
- - ✅ LLM invocation through gateway
4227
- - ✅ Activity status update from "started" to "success"
4228
- - ✅ Database persistence of activity records
4229
-
4230
- **Standard Test Suite**:
4231
- ```bash
4232
- npm test
4233
- ```
4234
-
4235
- **Note**: Some tests may be blocked by the `nx-config2` Postgres parsing issue. See `.reports/new/nx-config2-skip-config-sections-feature-request.md` for details.
4236
-
4237
- **See**: `.tests/TESTING_GUIDE.md` for complete testing documentation.
4238
-
4239
- ## Known Issues
4240
-
4241
- ### nx-config2 Postgres Parsing Issue
4242
-
4243
- **Status**: 🟡 **FEATURE REQUEST OPEN** - Not a bug in ai-gateway
4244
-
4245
- **Issue**: `nx-config2` attempts to parse Postgres configuration even when Postgres is not used, causing test failures.
4246
-
4247
- **Workaround**: Use the standalone test (`npm run test:activities:standalone`), which bypasses `nx-config2`.
4248
-
4249
- **Feature Request**: See `.reports/new/nx-config2-skip-config-sections-feature-request.md` for complete details. The request asks to:
4250
- 1. Fix/remove broken Postgres port parsing
4251
- 2. Add selective config loading by sections (from `.env`) and groups (from config map)
4252
- 3. Ensure ai-gateway and activix configs are covered in DEFAULT_CONFIG_MAP
4253
-
4254
- ## Requirements
4255
-
4256
- - Node.js >= 18.0.0
4257
- - TypeScript >= 5.0.0 (if using TypeScript)
265
+ ---
4258
266
 
4259
267
  ## License
4260
268
 
4261
- ISC
4262
-
4263
- ## Repository
4264
-
4265
- [GitHub](https://github.com/x12i/ai-gateway)
269
+ MIT — see package metadata.