@x12i/ai-gateway 10.0.4 → 10.0.5
- package/README.md +33 -9
- package/dist/ai-tools-client.js +18 -2
- package/dist/gateway-utils.d.ts +26 -1
- package/dist/gateway-utils.js +115 -49
- package/dist/gateway.js +4 -5
- package/dist/index.d.ts +1 -1
- package/dist/index.js +1 -1
- package/dist/types.d.ts +7 -0
- package/dist-cjs/ai-tools-client.cjs +18 -2
- package/dist-cjs/gateway-utils.cjs +115 -49
- package/dist-cjs/gateway-utils.d.ts +26 -1
- package/dist-cjs/gateway.cjs +4 -5
- package/dist-cjs/index.cjs +1 -1
- package/dist-cjs/index.d.ts +1 -1
- package/dist-cjs/types.d.ts +7 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -210,15 +210,39 @@ Exports: `GATEWAY_LOGXER_PACKAGE`, `GATEWAY_LOG_ENV_PREFIX`, `createGatewayLogge
 
 ## @x12i/ai-tools v2 (models + cost)
 
--
-- **`aiTools.enabled`** — bootstrap catalog client + calculator.
-- **`aiTools.resolveModels`** — `mergeConfig()` calls `resolveInvokeModel()` (catalog + OpenRouter/direct routing).
-- **`aiTools.modelsOnly`** — **`true` by default** — reject profile shortcuts (`cheapest`, `cheap/default`, …); pass concrete model ids only.
-- **`aiTools.calculateCost`** — prices usage before Activix `completeRecord` when the router did not mark the call priced.
+Engine-owned catalog bootstrap and post-call billing. Consumers read **`metadata.costUsd`** / **`costStatus`** only — no direct `@x12i/ai-tools` dependency for cost.
 
-
+### Resolution order (after every successful LLM call)
 
-
+| Step | Condition | Result |
+|------|-----------|--------|
+| A | Router/provider returned finite **`costUsd`** (or equivalent) | **`costStatus: "priced"`**, set cost |
+| B | Tokens + catalog pricing succeeds (`isAuthoritative`, not `unknownModel`, finite cost ≥ 0) | **`priced`** (+ optional breakdown) |
+| C | Tokens but no price | **`unpriced`** |
+| D | No usage | omit **`costUsd`** and **`costStatus`** |
+
+Step A always wins; explicit router **`costStatus: "unpriced"`** is never overridden by catalog.
+
+Implemented in **`resolveCostCompletionWithAiTools`** (delegates to **`CostCalculator.calculateFromRecord`** via **`buildGatewayPricingRecord`**). Target: move orchestrator to ai-tools as **`resolveInvokeBilling`** — see [AI_TOOLS_INVOKE_BILLING_ORCHESTRATOR_SPEC.md](./docs/upstream-reports/AI_TOOLS_INVOKE_BILLING_ORCHESTRATOR_SPEC.md).
+
+### `aiTools` config (aligned with funcx / generic engine contract)
+
+| Flag | Default | Purpose |
+|------|---------|---------|
+| **`enabled`** | `true` | Bootstrap **`AiModelsCatalogClient`** + **`CostCalculator`** |
+| **`calculateCost`** | `true` | Run post-call catalog pricing when router did not price |
+| **`resolveModels`** | `true` | **`mergeConfig()`** → **`resolveInvokeModel()`** |
+| **`modelsOnly`** | `true` | Reject profile shortcuts (`cheapest`, `cheap/default`, …) |
+| **`bundledOnly`** | `false` | Offline bundled catalogs only |
+| **`costIncludeBreakdown`** | `false` | Include prompt/completion breakdown on priced results |
+| **`catalogLane`** | `"text"` (ai-tools default) | Catalog lane for resolution + cost lookup (`text`, `image`, …) |
+| **`cacheTtlMs`** | ai-tools default (24h) | In-memory catalog cache TTL |
+
+- **No Catalox / Firestore** — catalogs come from ai-tools open-assets JSON (optional **`bundledOnly`**).
+
+Gateway exports the model orchestrator from `@x12i/ai-tools` ≥ **2.5.0** (`resolveInvokeModel`, …) — see [AI_TOOLS_INVOKE_MODEL_RESOLUTION_ORCHESTRATOR_SPEC.md](./docs/upstream-reports/AI_TOOLS_INVOKE_MODEL_RESOLUTION_ORCHESTRATOR_SPEC.md).
+
+Gateway billing helpers (also exported): `resolveCostCompletionWithAiTools`, `buildGatewayPricingRecord`, `catalogPricingSucceeded`, `ensureInvokeBillingCostStatus`, `buildTraceUsageSummary`, `enrichTraceAttemptsWithBilling`.
 
 ---
 
@@ -249,9 +273,9 @@ Mongo env: `MONGO_URI` + `MONGO_LOGS_DB` or `MONGO_DB`.
 
 ## Response metadata and cost
 
-On every successful **`invoke()`**:
+On every successful **`invoke()`** and **`invokeChat()`**:
 
-- **`metadata.provider`**, **`modelUsed`**, **`maxTokensRequested`**, **`effectiveModelConfig`**
+- **`metadata.provider`**, **`modelUsed`**, **`maxTokensRequested`**, **`effectiveModelConfig`** (invoke only)
 - **`metadata.tokens`**, **`costStatus`**, **`costUsd`** when usage exists and pricing applies
 
 Full contract: [AI Gateway invoke execution metadata](./docs/AI_GATEWAY_INVOKE_EXECUTION_METADATA.md).
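The resolution order that the README change documents (steps A to D) can be read as a small decision function. The sketch below is illustrative only: `resolveBillingSketch`, `catalogPriced`, `hasUsage`, and the input field names (`routerCostUsd`, `routerCostStatus`, `catalogResult`) are assumptions made for this example, not the gateway's actual signature. The real logic lives in `resolveCostCompletionWithAiTools`.

```javascript
// Hypothetical helper: true when any token counter is above zero.
function hasUsage(tokens) {
  if (!tokens) return false;
  return tokens.prompt > 0 || tokens.completion > 0 || tokens.total > 0;
}

// Hypothetical helper mirroring the Step B condition from the README table,
// plus the source check visible in the diffed catalogPricingSucceeded().
function catalogPriced(result) {
  if (!result || result.unknownModel || !result.isAuthoritative) return false;
  if (result.source === 'estimate-fallback' || result.source === 'local') return false;
  return typeof result.cost === 'number' && Number.isFinite(result.cost) && result.cost >= 0;
}

function resolveBillingSketch({ routerCostUsd, routerCostStatus, tokens, catalogResult }) {
  // Step A: a finite router/provider cost always wins.
  if (typeof routerCostUsd === 'number' && Number.isFinite(routerCostUsd)) {
    return { costStatus: 'priced', costUsd: routerCostUsd };
  }
  // An explicit router "unpriced" is never overridden by the catalog.
  if (routerCostStatus === 'unpriced') {
    return { costStatus: 'unpriced' };
  }
  // Step D: no usage, so both fields are omitted.
  if (!hasUsage(tokens)) {
    return {};
  }
  // Step B: tokens plus an authoritative catalog price.
  if (catalogPriced(catalogResult)) {
    return { costStatus: 'priced', costUsd: catalogResult.cost };
  }
  // Step C: tokens but no usable price.
  return { costStatus: 'unpriced' };
}
```

Consumers would then read only `costStatus` and `costUsd` from the returned object, which matches the README's statement that no direct `@x12i/ai-tools` dependency is needed for cost.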
package/dist/ai-tools-client.js
CHANGED
@@ -2,7 +2,7 @@
  * @x12i/ai-tools invoke client bootstrap for the gateway.
  * Model resolution orchestration lives in ai-tools ≥ 2.5.0 (`resolveInvokeModel`).
  */
-import { getAiToolsInvokeClient, resetAiToolsInvokeClientForTests as resetAiToolsInvokeClientForTestsUpstream, mapResolutionToRouterConfig, buildInvokeModelResolverOptions, } from '@x12i/ai-tools';
+import { getAiToolsInvokeClient, resetAiToolsInvokeClientForTests as resetAiToolsInvokeClientForTestsUpstream, mapResolutionToRouterConfig, buildInvokeModelResolverOptions, CostCalculator, } from '@x12i/ai-tools';
 import { gatewayLogDebug, withActivityIdentity } from './gateway-log-meta.js';
 import { resolvePreferOpenRouter } from './openrouter-routing.js';
 export { resolveInvokeModel, applyOpenRouterInvokePolicy, buildInvokeModelResolverOptions, enrichModelResolutionError, mapResolutionToRouterConfig, ModelProfileUnroutableError, ModelProfileInputRejectedError, MODEL_PROFILE_UNROUTABLE, getAiToolsInvokeClient, resetAiToolsInvokeClientForTests as resetAiToolsInvokeClientForTestsUpstream, createAiToolsInvokeClient, } from '@x12i/ai-tools';
@@ -13,7 +13,22 @@ function invokeClientOptions(config) {
         cacheTtlMs: config.aiTools?.cacheTtlMs,
         ...(config.aiTools?.bundledOnly ? { bundledOnly: true } : {}),
         ...(config.aiTools?.costIncludeBreakdown ? { costIncludeBreakdown: true } : {}),
-        cacheKey: `${config.aiTools?.cacheTtlMs ?? ''}:${config.aiTools?.costIncludeBreakdown ?? ''}:${config.aiTools?.bundledOnly ?? ''}`,
+        cacheKey: `${config.aiTools?.cacheTtlMs ?? ''}:${config.aiTools?.costIncludeBreakdown ?? ''}:${config.aiTools?.bundledOnly ?? ''}:${config.aiTools?.catalogLane ?? ''}`,
+    };
+}
+function withCatalogLaneCalculator(client, config) {
+    const lane = config.aiTools?.catalogLane;
+    if (!lane)
+        return client;
+    return {
+        ...client,
+        calculator: new CostCalculator(client.catalog, {
+            ...(config.aiTools?.costIncludeBreakdown ? { includeBreakdown: true } : {}),
+            resolverOptions: buildInvokeModelResolverOptions({
+                routingEnv: client.routingEnv,
+                catalogLane: lane
+            })
+        })
     };
 }
 /** @deprecated Use buildInvokeModelResolverOptions */
@@ -53,6 +68,7 @@ export async function getAiToolsClient(config, logger) {
         logger.debug('ai-tools catalog client ready', {
             debugKind: gatewayLogDebug.state,
         });
+        return withCatalogLaneCalculator(client, config);
     }
     return client;
 }
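The `cacheKey` change in `invokeClientOptions` appends `catalogLane` to the key, presumably so that clients configured for different lanes are not treated as the same cached client. A minimal sketch of the effect follows; `part`, `cacheKeyBefore`, and `cacheKeyAfter` are hypothetical helpers that only mirror the two template strings shown in the diff, and how ai-tools consumes the key is an assumption.

```javascript
// Sketch only: empty string stands in for a missing option, as in the diff.
const part = (value) => value ?? '';

// 10.0.4 shape: TTL, breakdown flag, bundledOnly.
function cacheKeyBefore(aiTools) {
  return `${part(aiTools?.cacheTtlMs)}:${part(aiTools?.costIncludeBreakdown)}:${part(aiTools?.bundledOnly)}`;
}

// 10.0.5 shape: the same three parts plus catalogLane.
function cacheKeyAfter(aiTools) {
  return `${cacheKeyBefore(aiTools)}:${part(aiTools?.catalogLane)}`;
}

const textLane = { cacheTtlMs: 60000, catalogLane: 'text' };
const imageLane = { cacheTtlMs: 60000, catalogLane: 'image' };

console.log(cacheKeyBefore(textLane) === cacheKeyBefore(imageLane)); // true
console.log(cacheKeyAfter(textLane) === cacheKeyAfter(imageLane));   // false
```

With the old key, two configurations that differ only by lane produce the same string; with the new key they do not.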
package/dist/gateway-utils.d.ts
CHANGED
@@ -94,13 +94,38 @@ export type ResolveCostCompletionOptions = {
     calculator?: CostCalculator | null;
     calculateCost?: boolean;
 };
-/**
+/** Optional cache/reasoning token fields for catalog pricing records. */
+export type InvokeUsageExtras = {
+    cached?: number;
+    cacheWrite?: number;
+    reasoning?: number;
+};
+/**
+ * Best-effort cache/reasoning token counts from router usage buckets
+ * (for {@link buildGatewayPricingRecord} / ai-tools {@link CostCalculator.calculateFromRecord}).
+ */
+export declare function extractUsageExtrasFromRouterResponse(routerResponse: unknown): InvokeUsageExtras;
+/**
+ * Whether ai-tools catalog pricing is authoritative enough for Step B (`priced`).
+ * Matches the generic engine contract: authoritative catalog hit with finite cost ≥ 0.
+ */
+export declare function catalogPricingSucceeded(result: AiCostResult): boolean;
+/** Record shape for {@link CostCalculator.calculateFromRecord} (shared engine contract). */
 export declare function buildGatewayPricingRecord(routerResponse: unknown, tokens: {
     prompt: number;
     completion: number;
     total: number;
 }, mergedConfig?: unknown): Record<string, unknown>;
 export declare function mapAiCostResultToResolvedActivityCost(base: ResolvedActivityCost, result: AiCostResult): ResolvedActivityCost;
+/**
+ * G8 safety net: token usage without a billing signal → `unpriced`.
+ * Used at invoke boundaries after {@link resolveCostCompletionWithAiTools}.
+ */
+export declare function ensureInvokeBillingCostStatus(billing: ResolvedActivityCost, tokens: {
+    prompt: number;
+    completion: number;
+    total: number;
+}): ResolvedActivityCost;
 /**
  * Router cost passthrough, then optional @x12i/ai-tools catalog pricing when still unpriced.
  */
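The new `extractUsageExtrasFromRouterResponse` declaration is described as "best-effort": it scans several usage buckets and accepts several spellings of each counter. The sketch below shows that first-match-wins behaviour under stated assumptions: `firstFiniteNumberSketch` is a hypothetical stand-in for the `firstFiniteNumber` helper (whose body is not part of this diff), and the sample response shape is invented for illustration.

```javascript
// Hypothetical stand-in: returns the first argument that is a finite number.
function firstFiniteNumberSketch(...candidates) {
  for (const candidate of candidates) {
    if (typeof candidate === 'number' && Number.isFinite(candidate)) return candidate;
  }
  return undefined;
}

// Simplified sketch of the bucket scan: usage, metadata.usage, metadata.tokens, raw usage.
function extractExtrasSketch(routerResponse) {
  if (routerResponse == null || typeof routerResponse !== 'object') return {};
  const meta = routerResponse.metadata ?? {};
  const raw = routerResponse.rawResponse ?? routerResponse.raw ?? {};
  const buckets = [routerResponse.usage, meta.usage, meta.tokens, raw.usage];
  const extras = {};
  for (const bucket of buckets) {
    if (bucket == null || typeof bucket !== 'object') continue;
    const found = {
      cached: firstFiniteNumberSketch(bucket.cached, bucket.cached_tokens, bucket.cachedTokens),
      cacheWrite: firstFiniteNumberSketch(bucket.cacheWrite, bucket.cache_write_tokens),
      reasoning: firstFiniteNumberSketch(bucket.reasoning, bucket.reasoning_tokens),
    };
    // Earlier buckets win: a value is only set if nothing was found before.
    for (const [key, value] of Object.entries(found)) {
      if (value !== undefined && extras[key] === undefined) extras[key] = value;
    }
  }
  return extras;
}

// Invented sample: `cached` appears in two buckets, the first one is kept.
const sampleExtras = extractExtrasSketch({
  usage: { cached_tokens: 12 },
  metadata: { tokens: { cached: 99, reasoning: 7 } },
  raw: { usage: { cache_write_tokens: 3 } },
});
console.log(sampleExtras); // { cached: 12, reasoning: 7, cacheWrite: 3 }
```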
package/dist/gateway-utils.js
CHANGED
@@ -108,6 +108,8 @@ export async function mergeConfig(request, config, logger, mergeOptions) {
             defaultProvider: config.defaultEngine,
             resolveModels: true,
             modelsOnly: config.aiTools?.modelsOnly !== false,
+            ...(config.aiTools?.catalogLane ? { catalogLane: config.aiTools.catalogLane } : {}),
+            ...(config.aiTools?.bundledOnly ? { bundledOnly: true } : {}),
         });
         merged.provider = resolved.router.provider;
         merged.model = resolved.router.model;
@@ -368,42 +370,119 @@ export function resolveCostCompletionForActivity(routerResponse, tokens) {
     }
     return resolveActivityCostCompletion(tokens, costUsd);
 }
-/**
+/**
+ * Best-effort cache/reasoning token counts from router usage buckets
+ * (for {@link buildGatewayPricingRecord} / ai-tools {@link CostCalculator.calculateFromRecord}).
+ */
+export function extractUsageExtrasFromRouterResponse(routerResponse) {
+    if (routerResponse == null || typeof routerResponse !== 'object')
+        return {};
+    const r = routerResponse;
+    const roots = [r.usage];
+    const meta = r.metadata != null && typeof r.metadata === 'object'
+        ? r.metadata
+        : undefined;
+    if (meta) {
+        roots.push(meta.usage, meta.tokens);
+    }
+    const raw = r.rawResponse ?? r.raw;
+    if (raw != null && typeof raw === 'object') {
+        roots.push(raw.usage);
+    }
+    const extras = {};
+    for (const bucket of roots) {
+        if (bucket == null || typeof bucket !== 'object')
+            continue;
+        const u = bucket;
+        const cached = firstFiniteNumber(u.cached, u.cached_tokens, u.cachedTokens, u.cache_read_tokens, u.cacheReadTokens);
+        const cacheWrite = firstFiniteNumber(u.cacheWrite, u.cache_write_tokens, u.cacheWriteTokens);
+        const reasoning = firstFiniteNumber(u.reasoning, u.reasoning_tokens, u.reasoningTokens);
+        if (cached !== undefined && extras.cached === undefined)
+            extras.cached = cached;
+        if (cacheWrite !== undefined && extras.cacheWrite === undefined)
+            extras.cacheWrite = cacheWrite;
+        if (reasoning !== undefined && extras.reasoning === undefined)
+            extras.reasoning = reasoning;
+    }
+    return extras;
+}
+/**
+ * Whether ai-tools catalog pricing is authoritative enough for Step B (`priced`).
+ * Matches the generic engine contract: authoritative catalog hit with finite cost ≥ 0.
+ */
+export function catalogPricingSucceeded(result) {
+    if (result.unknownModel)
+        return false;
+    if (!result.isAuthoritative)
+        return false;
+    if (result.source === 'estimate-fallback' || result.source === 'local')
+        return false;
+    if (typeof result.cost !== 'number' || !Number.isFinite(result.cost) || result.cost < 0) {
+        return false;
+    }
+    return true;
+}
+/** Record shape for {@link CostCalculator.calculateFromRecord} (shared engine contract). */
 export function buildGatewayPricingRecord(routerResponse, tokens, mergedConfig) {
-    const base = routerResponse != null && typeof routerResponse === 'object'
-        ? { ...routerResponse }
-        : {};
-    const meta = base.metadata != null && typeof base.metadata === 'object'
-        ? { ...base.metadata }
-        : {};
     const routing = pickInvokeRoutingMetadataSlice(routerResponse, mergedConfig);
+    const cfg = mergedConfig != null && typeof mergedConfig === 'object'
+        ? mergedConfig
+        : {};
+    const requestModel = typeof cfg.model === 'string'
+        ? cfg.model
+        : typeof routing.modelUsed === 'string'
+            ? routing.modelUsed
+            : undefined;
+    const modelUsed = routing.modelUsed ?? requestModel;
+    const provider = routing.provider ??
+        (typeof cfg.provider === 'string' ? cfg.provider : undefined) ??
+        'openrouter';
+    const usageExtras = extractUsageExtrasFromRouterResponse(routerResponse);
+    const tokenSlice = {
+        prompt: tokens.prompt,
+        completion: tokens.completion,
+        total: tokens.total,
+        ...usageExtras
+    };
     return {
-
+        model: modelUsed ?? requestModel ?? '',
+        ...(requestModel && modelUsed && requestModel !== modelUsed
+            ? { modelAlias: requestModel }
+            : {}),
+        ...(modelUsed ? { modelUsed, usedModel: modelUsed } : {}),
+        provider,
+        ...(provider || routing.region
+            ? {
+                routing: {
+                    provider,
+                    ...(routing.region ? { region: routing.region } : {})
+                }
+            }
+            : {}),
         usage: {
-
-
-
+            prompt_tokens: tokens.prompt,
+            completion_tokens: tokens.completion,
+            total_tokens: tokens.total,
+            ...(usageExtras.cached !== undefined ? { cachedTokensPrompt: usageExtras.cached } : {}),
+            ...(usageExtras.cached !== undefined ? { cachedTokensTotal: usageExtras.cached } : {})
         },
-        tokens,
+        promptTokens: tokens.prompt,
+        completionTokens: tokens.completion,
+        totalTokens: tokens.total,
+        tokens: tokenSlice,
         metadata: {
-
-
-            ...(routing.
-
-
-
+            provider,
+            ...(modelUsed ? { modelUsed, model: modelUsed } : {}),
+            ...(routing.maxTokensRequested !== undefined
+                ? { maxTokensRequested: routing.maxTokensRequested }
+                : {}),
+            tokens: tokenSlice
         },
         ...(mergedConfig != null ? { config: mergedConfig } : {})
     };
 }
 export function mapAiCostResultToResolvedActivityCost(base, result) {
-    if (result
-        return base.costStatus ? base : { ...base, costStatus: 'unpriced' };
-    }
-    if (typeof result.cost !== 'number' || !Number.isFinite(result.cost)) {
-        return base;
-    }
-    if (!result.isAuthoritative && result.source === 'estimate-fallback') {
+    if (!catalogPricingSucceeded(result)) {
         return base.costStatus ? base : { ...base, costStatus: 'unpriced' };
     }
     return {
@@ -412,6 +491,16 @@ export function mapAiCostResultToResolvedActivityCost(base, result) {
         ...(result.breakdown ? { costBreakdown: result.breakdown } : {})
     };
 }
+/**
+ * G8 safety net: token usage without a billing signal → `unpriced`.
+ * Used at invoke boundaries after {@link resolveCostCompletionWithAiTools}.
+ */
+export function ensureInvokeBillingCostStatus(billing, tokens) {
+    if (!billing.costStatus && hasNonZeroTokenUsage(tokens)) {
+        return { ...billing, costStatus: 'unpriced' };
+    }
+    return billing;
+}
 /**
  * Router cost passthrough, then optional @x12i/ai-tools catalog pricing when still unpriced.
  */
@@ -436,30 +525,7 @@ export async function resolveCostCompletionWithAiTools(routerResponse, tokens, o
         return mapAiCostResultToResolvedActivityCost(base, result);
     }
     catch {
-
-        const cfg = options.mergedConfig != null && typeof options.mergedConfig === 'object'
-            ? options.mergedConfig
-            : {};
-        const provider = routing.provider ?? cfg.provider;
-        const modelUsed = routing.modelUsed ?? cfg.model;
-        if (!provider || !modelUsed) {
-            return base;
-        }
-        try {
-            const result = await options.calculator.calculate({
-                tokens: {
-                    prompt: tokens.prompt,
-                    completion: tokens.completion,
-                    total: tokens.total
-                },
-                provider,
-                usedModel: modelUsed
-            });
-            return mapAiCostResultToResolvedActivityCost(base, result);
-        }
-        catch {
-            return base;
-        }
+        return ensureInvokeBillingCostStatus(base, tokens);
     }
 }
 function applyBillingToTraceAttempt(attempt, billing) {
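The rewritten `buildGatewayPricingRecord` derives the record's identity fields (`model`, `modelAlias`, `provider`) from the routing slice first and the merged config second, falling back to `'openrouter'` as provider. The sketch below isolates just that precedence. `pickPricingIdentitySketch` is a hypothetical helper written for this example, and the model ids are invented placeholders, not real catalog entries.

```javascript
// Sketch only: mirrors the precedence rules visible in the diff for the
// identity fields of the pricing record, nothing else.
function pickPricingIdentitySketch(routing, cfg) {
  const requestModel = typeof cfg.model === 'string'
    ? cfg.model
    : typeof routing.modelUsed === 'string'
      ? routing.modelUsed
      : undefined;
  // What the router reports as used wins over what was requested.
  const modelUsed = routing.modelUsed ?? requestModel;
  const provider = routing.provider
    ?? (typeof cfg.provider === 'string' ? cfg.provider : undefined)
    ?? 'openrouter';
  return {
    model: modelUsed ?? requestModel ?? '',
    // The requested id is kept as an alias only when it differs from the used id.
    ...(requestModel && modelUsed && requestModel !== modelUsed
      ? { modelAlias: requestModel }
      : {}),
    provider,
  };
}

// Router substituted a different model than the one requested.
const rerouted = pickPricingIdentitySketch(
  { modelUsed: 'vendor/model-b', provider: 'vendor' },
  { model: 'model-a', provider: 'openrouter' },
);
console.log(rerouted); // { model: 'vendor/model-b', modelAlias: 'model-a', provider: 'vendor' }

// No routing metadata at all: config model, default provider, no alias.
const fallback = pickPricingIdentitySketch({}, { model: 'model-a' });
console.log(fallback); // { model: 'model-a', provider: 'openrouter' }
```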
package/dist/gateway.js
CHANGED
@@ -11,7 +11,7 @@ import { resolveRetryConfig } from './gateway-defaults.js';
 import { buildMessages } from './message-builder.js';
 import { extractJsonFromFlexMd } from './flex-md-loader.js';
 import { enrichParsedContentForOutputContract, resolveOutputContractFieldKeys } from './output-contract-normalizer.js';
-import { attachGatewayInvokeRejectionMetadata, buildGatewayFallbackAttemptsFromTrace, buildInvokeRejectionMetadata, capActivityFullResponsePayload, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter,
+import { attachGatewayInvokeRejectionMetadata, buildGatewayFallbackAttemptsFromTrace, buildInvokeRejectionMetadata, capActivityFullResponsePayload, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter, DEFAULT_ACTIVITY_FULL_RESPONSE_MAX_CHARS, extractCostUsdFromRouterResponse, extractTokenUsageFromRouterResponse, mergeConfig, pickEffectiveModelConfigForMetadata, pickInvokeRoutingMetadataSlice, pickTraceMergedRouterConfig, resolveCostCompletionWithAiTools, ensureInvokeBillingCostStatus, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, tryExtractRouterLikePayloadFromErrorChain } from './gateway-utils.js';
 import { getAiToolsClient } from './ai-tools-client.js';
 import { autoRegisterProviders } from './gateway-provider-auto-register.js';
 import { setGatewayLastJobId, setGatewayRuntimeClients } from './runtime-objects.js';
@@ -135,11 +135,12 @@ export class AIGateway {
             });
             const metaChat = response?.metadata || {};
             const tokensChat = extractTokenUsageFromRouterResponse(response);
-
+            let costCompletionChat = await resolveCostCompletionWithAiTools(response, tokensChat, {
                 mergedConfig,
                 calculator: aiTools?.calculator ?? null,
                 calculateCost: this.config.aiTools?.calculateCost
             });
+            costCompletionChat = ensureInvokeBillingCostStatus(costCompletionChat, tokensChat);
             // Create enhanced response
             const enhancedResponse = {
                 content: response.content || '',
@@ -614,9 +615,7 @@ export class AIGateway {
                 calculator: aiTools?.calculator ?? null,
                 calculateCost: this.config.aiTools?.calculateCost
             });
-
-                costCompletion = { ...costCompletion, costStatus: 'unpriced' };
-            }
+            costCompletion = ensureInvokeBillingCostStatus(costCompletion, tokens);
             const routerMetaForCost = routerResponse?.metadata || {};
             const routingMetadataSlice = pickInvokeRoutingMetadataSlice(routerResponse, mergedConfig);
             const effectiveModelConfig = pickEffectiveModelConfigForMetadata(mergedConfig);
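Both `invoke()` and `invokeChat()` now pass their billing result through `ensureInvokeBillingCostStatus` after pricing. The sketch below illustrates why that is safe to apply unconditionally: it only fills in `unpriced` when tokens were used and no status exists yet. `hasNonZeroTokenUsageSketch` is a hypothetical stand-in for the gateway's `hasNonZeroTokenUsage` helper, whose body is not shown in this diff, and `ensureCostStatusSketch` mirrors the diffed function for illustration.

```javascript
// Hypothetical stand-in for hasNonZeroTokenUsage (real body not in the diff).
function hasNonZeroTokenUsageSketch(tokens) {
  return tokens.prompt > 0 || tokens.completion > 0 || tokens.total > 0;
}

// Mirrors the shape of ensureInvokeBillingCostStatus from the diff.
function ensureCostStatusSketch(billing, tokens) {
  if (!billing.costStatus && hasNonZeroTokenUsageSketch(tokens)) {
    return { ...billing, costStatus: 'unpriced' };
  }
  return billing;
}

const usedTokens = { prompt: 120, completion: 40, total: 160 };
const noTokens = { prompt: 0, completion: 0, total: 0 };

// Usage without any billing signal becomes "unpriced".
console.log(ensureCostStatusSketch({}, usedTokens)); // { costStatus: 'unpriced' }

// An existing status is left untouched.
console.log(ensureCostStatusSketch({ costStatus: 'priced', costUsd: 0.004 }, usedTokens));
// { costStatus: 'priced', costUsd: 0.004 }

// No usage: nothing is added, matching step D of the README table.
console.log(ensureCostStatusSketch({}, noTokens)); // {}
```

Because the function returns its input unchanged whenever a status already exists, calling it a second time at the invoke boundary (after `resolveCostCompletionWithAiTools` has already applied it on the error path) does not alter the result.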
package/dist/index.d.ts
CHANGED
@@ -17,7 +17,7 @@ export { AIGateway } from './gateway.js';
 export { InstructionNotFoundError, InstructionBackendError, ModelRequiredError, MaxTokensRequiredError } from './instruction-errors.js';
 export { autoRegisterProviders } from './gateway-provider-auto-register.js';
 export type { GatewayConfig, ProviderModelRef, ModelConfig, RetryConfig, ChatRequest, AIInvokeRequest, AIRequest, GatewayActionType, GatewayInvokeRejectionMetadata, GatewayFallbackAttempt, GatewayTraceRequestIds, GatewayTraceAttempt, GatewayTraceUsageSummary, GatewayTraceMergedConfig, EnhancedLLMResponse, InstructionMetadata, ValidationRule, TemplateRenderOptions, SmartInputConfig, SmartInputRenderOptions } from './types.js';
-export { attachGatewayInvokeRejectionMetadata, buildInvokeRejectionMetadata, tryExtractRouterLikePayloadFromErrorChain, tryExtractFallbackAttemptsFromErrorChain, pickRequestIdsFromRouterLike, resolveActivityCostCompletion, resolveCostCompletionForActivity, resolveCostCompletionWithAiTools, buildGatewayPricingRecord, mapAiCostResultToResolvedActivityCost, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, hasNonZeroTokenUsage, MODEL_PROFILE_UNROUTABLE, ModelProfileUnroutableError, ModelProfileInputRejectedError, buildGatewayFallbackAttemptsFromTrace, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter } from './gateway-utils.js';
+export { attachGatewayInvokeRejectionMetadata, buildInvokeRejectionMetadata, tryExtractRouterLikePayloadFromErrorChain, tryExtractFallbackAttemptsFromErrorChain, pickRequestIdsFromRouterLike, resolveActivityCostCompletion, resolveCostCompletionForActivity, resolveCostCompletionWithAiTools, buildGatewayPricingRecord, mapAiCostResultToResolvedActivityCost, catalogPricingSucceeded, ensureInvokeBillingCostStatus, extractUsageExtrasFromRouterResponse, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, hasNonZeroTokenUsage, MODEL_PROFILE_UNROUTABLE, ModelProfileUnroutableError, ModelProfileInputRejectedError, buildGatewayFallbackAttemptsFromTrace, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter } from './gateway-utils.js';
 export { getGatewayOperationalMode, isProdGatewayMode, parseModelProviderSpec } from './gateway-mode.js';
 export type { GatewayOperationalMode } from './gateway-mode.js';
 export { DEFAULT_ACTIVITY_FULL_RESPONSE_MAX_CHARS, GATEWAY_DEFAULT_FREQUENCY_PENALTY, GATEWAY_DEFAULT_PRESENCE_PENALTY, GATEWAY_DEFAULT_RETRY, GATEWAY_DEFAULT_TEMPERATURE, GATEWAY_DEFAULT_TOP_P, resolveRetryConfig } from './gateway-defaults.js';
package/dist/index.js
CHANGED
@@ -17,7 +17,7 @@ export * from '@x12i/ai-providers-router';
 export { AIGateway } from './gateway.js';
 export { InstructionNotFoundError, InstructionBackendError, ModelRequiredError, MaxTokensRequiredError } from './instruction-errors.js';
 export { autoRegisterProviders } from './gateway-provider-auto-register.js';
-export { attachGatewayInvokeRejectionMetadata, buildInvokeRejectionMetadata, tryExtractRouterLikePayloadFromErrorChain, tryExtractFallbackAttemptsFromErrorChain, pickRequestIdsFromRouterLike, resolveActivityCostCompletion, resolveCostCompletionForActivity, resolveCostCompletionWithAiTools, buildGatewayPricingRecord, mapAiCostResultToResolvedActivityCost, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, hasNonZeroTokenUsage, MODEL_PROFILE_UNROUTABLE, ModelProfileUnroutableError, ModelProfileInputRejectedError, buildGatewayFallbackAttemptsFromTrace, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter } from './gateway-utils.js';
+export { attachGatewayInvokeRejectionMetadata, buildInvokeRejectionMetadata, tryExtractRouterLikePayloadFromErrorChain, tryExtractFallbackAttemptsFromErrorChain, pickRequestIdsFromRouterLike, resolveActivityCostCompletion, resolveCostCompletionForActivity, resolveCostCompletionWithAiTools, buildGatewayPricingRecord, mapAiCostResultToResolvedActivityCost, catalogPricingSucceeded, ensureInvokeBillingCostStatus, extractUsageExtrasFromRouterResponse, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, hasNonZeroTokenUsage, MODEL_PROFILE_UNROUTABLE, ModelProfileUnroutableError, ModelProfileInputRejectedError, buildGatewayFallbackAttemptsFromTrace, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter } from './gateway-utils.js';
 export { getGatewayOperationalMode, isProdGatewayMode, parseModelProviderSpec } from './gateway-mode.js';
 export { DEFAULT_ACTIVITY_FULL_RESPONSE_MAX_CHARS, GATEWAY_DEFAULT_FREQUENCY_PENALTY, GATEWAY_DEFAULT_PRESENCE_PENALTY, GATEWAY_DEFAULT_RETRY, GATEWAY_DEFAULT_TEMPERATURE, GATEWAY_DEFAULT_TOP_P, resolveRetryConfig } from './gateway-defaults.js';
 export { contractSpecToFieldKeys, enrichParsedContentForOutputContract, resolveOutputContractFieldKeys } from './output-contract-normalizer.js';
package/dist/types.d.ts
CHANGED
@@ -9,6 +9,7 @@ type AIModel = string;
 export type UsageTier = string;
 import type { Activix } from '@x12i/activix';
 import type { SmartInputConfig, SmartInputRenderOptions, TemplateRenderOptions } from '@x12i/rendrix';
+import type { ProfileCatalogLane } from '@x12i/ai-profiles';
 import type { Logxer, PackageLogLevelsConfig } from '@x12i/logxer';
 /**
  * Diagnostics options for opt-in authoritative tracing.
@@ -415,6 +416,11 @@ export interface GatewayConfig extends Omit<RouterConfig, 'defaultEngine' | 'log
         cacheTtlMs?: number;
         /** Use bundled catalog JSON only (offline / tests). */
         bundledOnly?: boolean;
+        /**
+         * Catalog lane for model resolution and cost lookup (`text`, `image`, …).
+         * @default `"text"` in ai-tools when omitted.
+         */
+        catalogLane?: ProfileCatalogLane;
         /** @default true */
         resolveModels?: boolean;
         /**
@@ -424,6 +430,7 @@ export interface GatewayConfig extends Omit<RouterConfig, 'defaultEngine' | 'log
         modelsOnly?: boolean;
         /** @default true */
         calculateCost?: boolean;
+        /** @default false — when true, priced results may include prompt/completion breakdown. */
         costIncludeBreakdown?: boolean;
     };
     /**
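For orientation, the `aiTools` options touched by this release can be pictured as a plain object with the defaults listed in the README table. The sketch below is an assumption-laden illustration: `AI_TOOLS_DEFAULTS_SKETCH` and `withAiToolsDefaultsSketch` are hypothetical names, and the gateway itself applies its defaults inline (for example `modelsOnly !== false`) rather than through a helper like this. `catalogLane` and `cacheTtlMs` are deliberately left unset so that the ai-tools defaults apply.

```javascript
// Defaults as documented in the README table; not an object the package exports.
const AI_TOOLS_DEFAULTS_SKETCH = {
  enabled: true,
  calculateCost: true,
  resolveModels: true,
  modelsOnly: true,
  bundledOnly: false,
  costIncludeBreakdown: false,
};

// Hypothetical helper: caller-provided flags override the documented defaults.
function withAiToolsDefaultsSketch(aiTools = {}) {
  return { ...AI_TOOLS_DEFAULTS_SKETCH, ...aiTools };
}

// Example: price image-lane calls and ask for a prompt/completion breakdown.
const imageLaneConfig = withAiToolsDefaultsSketch({
  catalogLane: 'image',
  costIncludeBreakdown: true,
});
console.log(imageLaneConfig.catalogLane);          // image
console.log(imageLaneConfig.modelsOnly);           // true
console.log(imageLaneConfig.costIncludeBreakdown); // true
```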
}
|
|
407
|
+
return extras;
|
|
408
|
+
}
|
|
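The extraction loop above follows a "first bucket wins" rule: each usage bucket is scanned for several field-name aliases, and a value found in an earlier bucket is not overwritten by a later one. The sketch below reduces this to the cached-token field only. The body of `firstFiniteNumber` is an assumption, since the diff calls the helper but does not show it, and `pickCachedTokens` is a hypothetical name.

```javascript
// Assumed behaviour of the package's firstFiniteNumber helper (not shown in this diff):
// return the first argument that is a finite number, else undefined.
function firstFiniteNumber(...values) {
  for (const v of values) {
    if (typeof v === 'number' && Number.isFinite(v)) return v;
  }
  return undefined;
}

// Hypothetical reduction of extractUsageExtrasFromRouterResponse to one field.
function pickCachedTokens(buckets) {
  let cached;
  for (const bucket of buckets) {
    if (bucket == null || typeof bucket !== 'object') continue; // skip missing buckets
    const found = firstFiniteNumber(
      bucket.cached,
      bucket.cached_tokens,
      bucket.cachedTokens,
      bucket.cache_read_tokens,
      bucket.cacheReadTokens
    );
    if (found !== undefined && cached === undefined) cached = found; // first hit wins
  }
  return cached;
}

// Illustrative buckets in the order usage, metadata.usage, raw.usage.
const cachedFromUsage = pickCachedTokens([undefined, { cached_tokens: 120 }, { cachedTokens: 999 }]);
const cachedMissing = pickCachedTokens([{ prompt_tokens: 10 }]);

console.log(cachedFromUsage, cachedMissing);
```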
409
|
+
/**
|
|
410
|
+
* Whether ai-tools catalog pricing is authoritative enough for Step B (`priced`).
|
|
411
|
+
* Matches the generic engine contract: authoritative catalog hit with finite cost ≥ 0.
|
|
412
|
+
*/
|
|
413
|
+
export function catalogPricingSucceeded(result) {
|
|
414
|
+
if (result.unknownModel)
|
|
415
|
+
return false;
|
|
416
|
+
if (!result.isAuthoritative)
|
|
417
|
+
return false;
|
|
418
|
+
if (result.source === 'estimate-fallback' || result.source === 'local')
|
|
419
|
+
return false;
|
|
420
|
+
if (typeof result.cost !== 'number' || !Number.isFinite(result.cost) || result.cost < 0) {
|
|
421
|
+
return false;
|
|
422
|
+
}
|
|
423
|
+
return true;
|
|
424
|
+
}
|
|
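To make the gate concrete, the sketch below repeats the checks of `catalogPricingSucceeded` in condensed form and runs them against a few sample results. The sample objects are illustrative; in particular the `source` value `'catalog'` is an assumption, as the diff only names `'estimate-fallback'` and `'local'`.

```javascript
// Condensed copy of the checks shown in the diff.
function catalogPricingSucceeded(result) {
  if (result.unknownModel) return false;
  if (!result.isAuthoritative) return false;
  if (result.source === 'estimate-fallback' || result.source === 'local') return false;
  return typeof result.cost === 'number' && Number.isFinite(result.cost) && result.cost >= 0;
}

// Illustrative results; 'catalog' is a hypothetical source label.
const samples = {
  authoritative: { isAuthoritative: true, source: 'catalog', cost: 0.0021 },
  zeroCost: { isAuthoritative: true, source: 'catalog', cost: 0 },
  estimate: { isAuthoritative: false, source: 'estimate-fallback', cost: 0.002 },
  local: { isAuthoritative: true, source: 'local', cost: 0 },
  unknownModel: { unknownModel: true, isAuthoritative: true, source: 'catalog', cost: 0.01 },
  notFinite: { isAuthoritative: true, source: 'catalog', cost: NaN },
};

const verdicts = Object.fromEntries(
  Object.entries(samples).map(([name, result]) => [name, catalogPricingSucceeded(result)])
);

console.log(verdicts);
```

Note that a finite cost of `0` passes the gate, while a `'local'` source is rejected even when it is flagged authoritative.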
425
|
+
/** Record shape for {@link CostCalculator.calculateFromRecord} (shared engine contract). */
|
|
372
426
|
export function buildGatewayPricingRecord(routerResponse, tokens, mergedConfig) {
|
|
373
|
-
const base = routerResponse != null && typeof routerResponse === 'object'
|
|
374
|
-
? { ...routerResponse }
|
|
375
|
-
: {};
|
|
376
|
-
const meta = base.metadata != null && typeof base.metadata === 'object'
|
|
377
|
-
? { ...base.metadata }
|
|
378
|
-
: {};
|
|
379
427
|
const routing = pickInvokeRoutingMetadataSlice(routerResponse, mergedConfig);
|
|
428
|
+
const cfg = mergedConfig != null && typeof mergedConfig === 'object'
|
|
429
|
+
? mergedConfig
|
|
430
|
+
: {};
|
|
431
|
+
const requestModel = typeof cfg.model === 'string'
|
|
432
|
+
? cfg.model
|
|
433
|
+
: typeof routing.modelUsed === 'string'
|
|
434
|
+
? routing.modelUsed
|
|
435
|
+
: undefined;
|
|
436
|
+
const modelUsed = routing.modelUsed ?? requestModel;
|
|
437
|
+
const provider = routing.provider ??
|
|
438
|
+
(typeof cfg.provider === 'string' ? cfg.provider : undefined) ??
|
|
439
|
+
'openrouter';
|
|
440
|
+
const usageExtras = extractUsageExtrasFromRouterResponse(routerResponse);
|
|
441
|
+
const tokenSlice = {
|
|
442
|
+
prompt: tokens.prompt,
|
|
443
|
+
completion: tokens.completion,
|
|
444
|
+
total: tokens.total,
|
|
445
|
+
...usageExtras
|
|
446
|
+
};
|
|
380
447
|
return {
|
|
381
|
-
|
|
448
|
+
model: modelUsed ?? requestModel ?? '',
|
|
449
|
+
...(requestModel && modelUsed && requestModel !== modelUsed
|
|
450
|
+
? { modelAlias: requestModel }
|
|
451
|
+
: {}),
|
|
452
|
+
...(modelUsed ? { modelUsed, usedModel: modelUsed } : {}),
|
|
453
|
+
provider,
|
|
454
|
+
...(provider || routing.region
|
|
455
|
+
? {
|
|
456
|
+
routing: {
|
|
457
|
+
provider,
|
|
458
|
+
...(routing.region ? { region: routing.region } : {})
|
|
459
|
+
}
|
|
460
|
+
}
|
|
461
|
+
: {}),
|
|
382
462
|
usage: {
|
|
383
|
-
|
|
384
|
-
|
|
385
|
-
|
|
463
|
+
prompt_tokens: tokens.prompt,
|
|
464
|
+
completion_tokens: tokens.completion,
|
|
465
|
+
total_tokens: tokens.total,
|
|
466
|
+
...(usageExtras.cached !== undefined ? { cachedTokensPrompt: usageExtras.cached } : {}),
|
|
467
|
+
...(usageExtras.cached !== undefined ? { cachedTokensTotal: usageExtras.cached } : {})
|
|
386
468
|
},
|
|
387
|
-
tokens,
|
|
469
|
+
promptTokens: tokens.prompt,
|
|
470
|
+
completionTokens: tokens.completion,
|
|
471
|
+
totalTokens: tokens.total,
|
|
472
|
+
tokens: tokenSlice,
|
|
388
473
|
metadata: {
|
|
389
|
-
|
|
390
|
-
|
|
391
|
-
...(routing.
|
|
392
|
-
|
|
393
|
-
|
|
394
|
-
|
|
474
|
+
provider,
|
|
475
|
+
...(modelUsed ? { modelUsed, model: modelUsed } : {}),
|
|
476
|
+
...(routing.maxTokensRequested !== undefined
|
|
477
|
+
? { maxTokensRequested: routing.maxTokensRequested }
|
|
478
|
+
: {}),
|
|
479
|
+
tokens: tokenSlice
|
|
395
480
|
},
|
|
396
481
|
...(mergedConfig != null ? { config: mergedConfig } : {})
|
|
397
482
|
};
|
|
398
483
|
}
|
|
399
484
|
export function mapAiCostResultToResolvedActivityCost(base, result) {
|
|
400
|
-
if (result
|
|
401
|
-
return base.costStatus ? base : { ...base, costStatus: 'unpriced' };
|
|
402
|
-
}
|
|
403
|
-
if (typeof result.cost !== 'number' || !Number.isFinite(result.cost)) {
|
|
404
|
-
return base;
|
|
405
|
-
}
|
|
406
|
-
if (!result.isAuthoritative && result.source === 'estimate-fallback') {
|
|
485
|
+
if (!catalogPricingSucceeded(result)) {
|
|
407
486
|
return base.costStatus ? base : { ...base, costStatus: 'unpriced' };
|
|
408
487
|
}
|
|
409
488
|
return {
|
|
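The model fields in the new `buildGatewayPricingRecord` can be read as: prefer the model the router reports, fall back to the configured one, and emit `modelAlias` only when the two differ. The sketch below isolates that logic. `pickModelFields` is a hypothetical name and the model ids are placeholders.

```javascript
// Hypothetical extraction of the model-field logic from buildGatewayPricingRecord.
// cfgModel stands in for mergedConfig.model, routedModel for routing.modelUsed.
function pickModelFields(cfgModel, routedModel) {
  const requestModel = typeof cfgModel === 'string'
    ? cfgModel
    : typeof routedModel === 'string'
      ? routedModel
      : undefined;
  const modelUsed = routedModel ?? requestModel;
  return {
    model: modelUsed ?? requestModel ?? '',
    // The alias is recorded only when the request and the routed model disagree.
    ...(requestModel && modelUsed && requestModel !== modelUsed
      ? { modelAlias: requestModel }
      : {}),
    ...(modelUsed ? { modelUsed, usedModel: modelUsed } : {}),
  };
}

// Placeholder ids, not real catalog entries.
const aliased = pickModelFields('my-alias', 'vendor/model-v1');
const direct = pickModelFields('vendor/model-v1', 'vendor/model-v1');
const empty = pickModelFields(undefined, undefined);

console.log(aliased, direct, empty);
```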
@@ -412,6 +491,16 @@ export function mapAiCostResultToResolvedActivityCost(base, result) {
|
|
|
412
491
|
...(result.breakdown ? { costBreakdown: result.breakdown } : {})
|
|
413
492
|
};
|
|
414
493
|
}
|
|
494
|
+
/**
|
|
495
|
+
* G8 safety net: token usage without a billing signal → `unpriced`.
|
|
496
|
+
* Used at invoke boundaries after {@link resolveCostCompletionWithAiTools}.
|
|
497
|
+
*/
|
|
498
|
+
export function ensureInvokeBillingCostStatus(billing, tokens) {
|
|
499
|
+
if (!billing.costStatus && hasNonZeroTokenUsage(tokens)) {
|
|
500
|
+
return { ...billing, costStatus: 'unpriced' };
|
|
501
|
+
}
|
|
502
|
+
return billing;
|
|
503
|
+
}
|
|
415
504
|
/**
|
|
416
505
|
* Router cost passthrough, then optional @x12i/ai-tools catalog pricing when still unpriced.
|
|
417
506
|
*/
|
|
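The safety net added in this hunk can be exercised in isolation. In the sketch below, `ensureInvokeBillingCostStatus` is copied from the diff, while the body of `hasNonZeroTokenUsage` is an assumption: the package exports that helper, but this diff does not show its implementation. The billing objects are illustrative.

```javascript
// Assumed implementation; the real helper is exported by the package but not shown here.
function hasNonZeroTokenUsage(tokens) {
  return [tokens.prompt, tokens.completion, tokens.total].some(
    (n) => typeof n === 'number' && n > 0
  );
}

// Same logic as the function added in the diff.
function ensureInvokeBillingCostStatus(billing, tokens) {
  if (!billing.costStatus && hasNonZeroTokenUsage(tokens)) {
    return { ...billing, costStatus: 'unpriced' };
  }
  return billing;
}

const usedTokens = { prompt: 12, completion: 30, total: 42 };
const noTokens = { prompt: 0, completion: 0, total: 0 };

// Tokens were consumed but nothing priced the call: status becomes 'unpriced'.
const flagged = ensureInvokeBillingCostStatus({}, usedTokens);
// An existing status is left untouched.
const alreadyPriced = ensureInvokeBillingCostStatus({ costUsd: 0.001, costStatus: 'priced' }, usedTokens);
// No usage and no status: nothing is added.
const untouched = ensureInvokeBillingCostStatus({}, noTokens);

console.log(flagged, alreadyPriced, untouched);
```

Because the function returns its input unchanged when a status is already set, calling it a second time at the invoke boundary, as `gateway.cjs` does after `resolveCostCompletionWithAiTools`, does not alter an already resolved result.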
@@ -436,30 +525,7 @@ export async function resolveCostCompletionWithAiTools(routerResponse, tokens, o
|
|
|
436
525
|
return mapAiCostResultToResolvedActivityCost(base, result);
|
|
437
526
|
}
|
|
438
527
|
catch {
|
|
439
|
-
|
|
440
|
-
const cfg = options.mergedConfig != null && typeof options.mergedConfig === 'object'
|
|
441
|
-
? options.mergedConfig
|
|
442
|
-
: {};
|
|
443
|
-
const provider = routing.provider ?? cfg.provider;
|
|
444
|
-
const modelUsed = routing.modelUsed ?? cfg.model;
|
|
445
|
-
if (!provider || !modelUsed) {
|
|
446
|
-
return base;
|
|
447
|
-
}
|
|
448
|
-
try {
|
|
449
|
-
const result = await options.calculator.calculate({
|
|
450
|
-
tokens: {
|
|
451
|
-
prompt: tokens.prompt,
|
|
452
|
-
completion: tokens.completion,
|
|
453
|
-
total: tokens.total
|
|
454
|
-
},
|
|
455
|
-
provider,
|
|
456
|
-
usedModel: modelUsed
|
|
457
|
-
});
|
|
458
|
-
return mapAiCostResultToResolvedActivityCost(base, result);
|
|
459
|
-
}
|
|
460
|
-
catch {
|
|
461
|
-
return base;
|
|
462
|
-
}
|
|
528
|
+
return ensureInvokeBillingCostStatus(base, tokens);
|
|
463
529
|
}
|
|
464
530
|
}
|
|
465
531
|
function applyBillingToTraceAttempt(attempt, billing) {
|
package/dist-cjs/gateway-utils.d.ts
CHANGED
|
@@ -94,13 +94,38 @@ export type ResolveCostCompletionOptions = {
|
|
|
94
94
|
calculator?: CostCalculator | null;
|
|
95
95
|
calculateCost?: boolean;
|
|
96
96
|
};
|
|
97
|
-
/**
|
|
97
|
+
/** Optional cache/reasoning token fields for catalog pricing records. */
|
|
98
|
+
export type InvokeUsageExtras = {
|
|
99
|
+
cached?: number;
|
|
100
|
+
cacheWrite?: number;
|
|
101
|
+
reasoning?: number;
|
|
102
|
+
};
|
|
103
|
+
/**
|
|
104
|
+
* Best-effort cache/reasoning token counts from router usage buckets
|
|
105
|
+
* (for {@link buildGatewayPricingRecord} / ai-tools {@link CostCalculator.calculateFromRecord}).
|
|
106
|
+
*/
|
|
107
|
+
export declare function extractUsageExtrasFromRouterResponse(routerResponse: unknown): InvokeUsageExtras;
|
|
108
|
+
/**
|
|
109
|
+
* Whether ai-tools catalog pricing is authoritative enough for Step B (`priced`).
|
|
110
|
+
* Matches the generic engine contract: authoritative catalog hit with finite cost ≥ 0.
|
|
111
|
+
*/
|
|
112
|
+
export declare function catalogPricingSucceeded(result: AiCostResult): boolean;
|
|
113
|
+
/** Record shape for {@link CostCalculator.calculateFromRecord} (shared engine contract). */
|
|
98
114
|
export declare function buildGatewayPricingRecord(routerResponse: unknown, tokens: {
|
|
99
115
|
prompt: number;
|
|
100
116
|
completion: number;
|
|
101
117
|
total: number;
|
|
102
118
|
}, mergedConfig?: unknown): Record<string, unknown>;
|
|
103
119
|
export declare function mapAiCostResultToResolvedActivityCost(base: ResolvedActivityCost, result: AiCostResult): ResolvedActivityCost;
|
|
120
|
+
/**
|
|
121
|
+
* G8 safety net: token usage without a billing signal → `unpriced`.
|
|
122
|
+
* Used at invoke boundaries after {@link resolveCostCompletionWithAiTools}.
|
|
123
|
+
*/
|
|
124
|
+
export declare function ensureInvokeBillingCostStatus(billing: ResolvedActivityCost, tokens: {
|
|
125
|
+
prompt: number;
|
|
126
|
+
completion: number;
|
|
127
|
+
total: number;
|
|
128
|
+
}): ResolvedActivityCost;
|
|
104
129
|
/**
|
|
105
130
|
* Router cost passthrough, then optional @x12i/ai-tools catalog pricing when still unpriced.
|
|
106
131
|
*/
|
package/dist-cjs/gateway.cjs
CHANGED
|
@@ -11,7 +11,7 @@ import { resolveRetryConfig } from './gateway-defaults.js';
|
|
|
11
11
|
import { buildMessages } from './message-builder.js';
|
|
12
12
|
import { extractJsonFromFlexMd } from './flex-md-loader.js';
|
|
13
13
|
import { enrichParsedContentForOutputContract, resolveOutputContractFieldKeys } from './output-contract-normalizer.js';
|
|
14
|
-
import { attachGatewayInvokeRejectionMetadata, buildGatewayFallbackAttemptsFromTrace, buildInvokeRejectionMetadata, capActivityFullResponsePayload, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter,
|
|
14
|
+
import { attachGatewayInvokeRejectionMetadata, buildGatewayFallbackAttemptsFromTrace, buildInvokeRejectionMetadata, capActivityFullResponsePayload, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter, DEFAULT_ACTIVITY_FULL_RESPONSE_MAX_CHARS, extractCostUsdFromRouterResponse, extractTokenUsageFromRouterResponse, mergeConfig, pickEffectiveModelConfigForMetadata, pickInvokeRoutingMetadataSlice, pickTraceMergedRouterConfig, resolveCostCompletionWithAiTools, ensureInvokeBillingCostStatus, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, tryExtractRouterLikePayloadFromErrorChain } from './gateway-utils.js';
|
|
15
15
|
import { getAiToolsClient } from './ai-tools-client.js';
|
|
16
16
|
import { autoRegisterProviders } from './gateway-provider-auto-register.js';
|
|
17
17
|
import { setGatewayLastJobId, setGatewayRuntimeClients } from './runtime-objects.js';
|
|
@@ -135,11 +135,12 @@ export class AIGateway {
|
|
|
135
135
|
});
|
|
136
136
|
const metaChat = response?.metadata || {};
|
|
137
137
|
const tokensChat = extractTokenUsageFromRouterResponse(response);
|
|
138
|
-
|
|
138
|
+
let costCompletionChat = await resolveCostCompletionWithAiTools(response, tokensChat, {
|
|
139
139
|
mergedConfig,
|
|
140
140
|
calculator: aiTools?.calculator ?? null,
|
|
141
141
|
calculateCost: this.config.aiTools?.calculateCost
|
|
142
142
|
});
|
|
143
|
+
costCompletionChat = ensureInvokeBillingCostStatus(costCompletionChat, tokensChat);
|
|
143
144
|
// Create enhanced response
|
|
144
145
|
const enhancedResponse = {
|
|
145
146
|
content: response.content || '',
|
|
@@ -614,9 +615,7 @@ export class AIGateway {
|
|
|
614
615
|
calculator: aiTools?.calculator ?? null,
|
|
615
616
|
calculateCost: this.config.aiTools?.calculateCost
|
|
616
617
|
});
|
|
617
|
-
|
|
618
|
-
costCompletion = { ...costCompletion, costStatus: 'unpriced' };
|
|
619
|
-
}
|
|
618
|
+
costCompletion = ensureInvokeBillingCostStatus(costCompletion, tokens);
|
|
620
619
|
const routerMetaForCost = routerResponse?.metadata || {};
|
|
621
620
|
const routingMetadataSlice = pickInvokeRoutingMetadataSlice(routerResponse, mergedConfig);
|
|
622
621
|
const effectiveModelConfig = pickEffectiveModelConfigForMetadata(mergedConfig);
|
package/dist-cjs/index.cjs
CHANGED
|
@@ -17,7 +17,7 @@ export * from '@x12i/ai-providers-router';
|
|
|
17
17
|
export { AIGateway } from './gateway.js';
|
|
18
18
|
export { InstructionNotFoundError, InstructionBackendError, ModelRequiredError, MaxTokensRequiredError } from './instruction-errors.js';
|
|
19
19
|
export { autoRegisterProviders } from './gateway-provider-auto-register.js';
|
|
20
|
-
export { attachGatewayInvokeRejectionMetadata, buildInvokeRejectionMetadata, tryExtractRouterLikePayloadFromErrorChain, tryExtractFallbackAttemptsFromErrorChain, pickRequestIdsFromRouterLike, resolveActivityCostCompletion, resolveCostCompletionForActivity, resolveCostCompletionWithAiTools, buildGatewayPricingRecord, mapAiCostResultToResolvedActivityCost, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, hasNonZeroTokenUsage, MODEL_PROFILE_UNROUTABLE, ModelProfileUnroutableError, ModelProfileInputRejectedError, buildGatewayFallbackAttemptsFromTrace, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter } from './gateway-utils.js';
|
|
20
|
+
export { attachGatewayInvokeRejectionMetadata, buildInvokeRejectionMetadata, tryExtractRouterLikePayloadFromErrorChain, tryExtractFallbackAttemptsFromErrorChain, pickRequestIdsFromRouterLike, resolveActivityCostCompletion, resolveCostCompletionForActivity, resolveCostCompletionWithAiTools, buildGatewayPricingRecord, mapAiCostResultToResolvedActivityCost, catalogPricingSucceeded, ensureInvokeBillingCostStatus, extractUsageExtrasFromRouterResponse, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, hasNonZeroTokenUsage, MODEL_PROFILE_UNROUTABLE, ModelProfileUnroutableError, ModelProfileInputRejectedError, buildGatewayFallbackAttemptsFromTrace, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter } from './gateway-utils.js';
|
|
21
21
|
export { getGatewayOperationalMode, isProdGatewayMode, parseModelProviderSpec } from './gateway-mode.js';
|
|
22
22
|
export { DEFAULT_ACTIVITY_FULL_RESPONSE_MAX_CHARS, GATEWAY_DEFAULT_FREQUENCY_PENALTY, GATEWAY_DEFAULT_PRESENCE_PENALTY, GATEWAY_DEFAULT_RETRY, GATEWAY_DEFAULT_TEMPERATURE, GATEWAY_DEFAULT_TOP_P, resolveRetryConfig } from './gateway-defaults.js';
|
|
23
23
|
export { contractSpecToFieldKeys, enrichParsedContentForOutputContract, resolveOutputContractFieldKeys } from './output-contract-normalizer.js';
|
package/dist-cjs/index.d.ts
CHANGED
|
@@ -17,7 +17,7 @@ export { AIGateway } from './gateway.js';
|
|
|
17
17
|
export { InstructionNotFoundError, InstructionBackendError, ModelRequiredError, MaxTokensRequiredError } from './instruction-errors.js';
|
|
18
18
|
export { autoRegisterProviders } from './gateway-provider-auto-register.js';
|
|
19
19
|
export type { GatewayConfig, ProviderModelRef, ModelConfig, RetryConfig, ChatRequest, AIInvokeRequest, AIRequest, GatewayActionType, GatewayInvokeRejectionMetadata, GatewayFallbackAttempt, GatewayTraceRequestIds, GatewayTraceAttempt, GatewayTraceUsageSummary, GatewayTraceMergedConfig, EnhancedLLMResponse, InstructionMetadata, ValidationRule, TemplateRenderOptions, SmartInputConfig, SmartInputRenderOptions } from './types.js';
|
|
20
|
-
export { attachGatewayInvokeRejectionMetadata, buildInvokeRejectionMetadata, tryExtractRouterLikePayloadFromErrorChain, tryExtractFallbackAttemptsFromErrorChain, pickRequestIdsFromRouterLike, resolveActivityCostCompletion, resolveCostCompletionForActivity, resolveCostCompletionWithAiTools, buildGatewayPricingRecord, mapAiCostResultToResolvedActivityCost, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, hasNonZeroTokenUsage, MODEL_PROFILE_UNROUTABLE, ModelProfileUnroutableError, ModelProfileInputRejectedError, buildGatewayFallbackAttemptsFromTrace, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter } from './gateway-utils.js';
|
|
20
|
+
export { attachGatewayInvokeRejectionMetadata, buildInvokeRejectionMetadata, tryExtractRouterLikePayloadFromErrorChain, tryExtractFallbackAttemptsFromErrorChain, pickRequestIdsFromRouterLike, resolveActivityCostCompletion, resolveCostCompletionForActivity, resolveCostCompletionWithAiTools, buildGatewayPricingRecord, mapAiCostResultToResolvedActivityCost, catalogPricingSucceeded, ensureInvokeBillingCostStatus, extractUsageExtrasFromRouterResponse, buildTraceUsageSummary, enrichTraceAttemptsWithBilling, hasNonZeroTokenUsage, MODEL_PROFILE_UNROUTABLE, ModelProfileUnroutableError, ModelProfileInputRejectedError, buildGatewayFallbackAttemptsFromTrace, formatFallbackExhaustionMessage, logResolvedModelRouting, mapGatewayFallbackAttemptsToRouter } from './gateway-utils.js';
|
|
21
21
|
export { getGatewayOperationalMode, isProdGatewayMode, parseModelProviderSpec } from './gateway-mode.js';
|
|
22
22
|
export type { GatewayOperationalMode } from './gateway-mode.js';
|
|
23
23
|
export { DEFAULT_ACTIVITY_FULL_RESPONSE_MAX_CHARS, GATEWAY_DEFAULT_FREQUENCY_PENALTY, GATEWAY_DEFAULT_PRESENCE_PENALTY, GATEWAY_DEFAULT_RETRY, GATEWAY_DEFAULT_TEMPERATURE, GATEWAY_DEFAULT_TOP_P, resolveRetryConfig } from './gateway-defaults.js';
|
package/dist-cjs/types.d.ts
CHANGED
|
@@ -9,6 +9,7 @@ type AIModel = string;
|
|
|
9
9
|
export type UsageTier = string;
|
|
10
10
|
import type { Activix } from '@x12i/activix';
|
|
11
11
|
import type { SmartInputConfig, SmartInputRenderOptions, TemplateRenderOptions } from '@x12i/rendrix';
|
|
12
|
+
import type { ProfileCatalogLane } from '@x12i/ai-profiles';
|
|
12
13
|
import type { Logxer, PackageLogLevelsConfig } from '@x12i/logxer';
|
|
13
14
|
/**
|
|
14
15
|
* Diagnostics options for opt-in authoritative tracing.
|
|
@@ -415,6 +416,11 @@ export interface GatewayConfig extends Omit<RouterConfig, 'defaultEngine' | 'log
|
|
|
415
416
|
cacheTtlMs?: number;
|
|
416
417
|
/** Use bundled catalog JSON only (offline / tests). */
|
|
417
418
|
bundledOnly?: boolean;
|
|
419
|
+
/**
|
|
420
|
+
* Catalog lane for model resolution and cost lookup (`text`, `image`, …).
|
|
421
|
+
* @default `"text"` in ai-tools when omitted.
|
|
422
|
+
*/
|
|
423
|
+
catalogLane?: ProfileCatalogLane;
|
|
418
424
|
/** @default true */
|
|
419
425
|
resolveModels?: boolean;
|
|
420
426
|
/**
|
|
@@ -424,6 +430,7 @@ export interface GatewayConfig extends Omit<RouterConfig, 'defaultEngine' | 'log
|
|
|
424
430
|
modelsOnly?: boolean;
|
|
425
431
|
/** @default true */
|
|
426
432
|
calculateCost?: boolean;
|
|
433
|
+
/** @default false — when true, priced results may include prompt/completion breakdown. */
|
|
427
434
|
costIncludeBreakdown?: boolean;
|
|
428
435
|
};
|
|
429
436
|
/**
|