@blamejs/core 0.12.27 → 0.12.28
- package/CHANGELOG.md +2 -0
- package/README.md +1 -0
- package/index.js +1 -0
- package/lib/ai-capability.js +482 -0
- package/package.json +1 -1
- package/sbom.cdx.json +6 -6
package/CHANGELOG.md
CHANGED
@@ -8,6 +8,8 @@ upgrading across more than a few patches at a time.
 
 ## v0.12.x
 
+- v0.12.28 (2026-05-24) — **`b.ai.capability` — model-capability registry + cheapest-satisfying-model router.** `b.ai.capability.create({ models })` turns a fleet of AI model descriptors into a routing decision: given a set of requirements (context window, input/output modalities, tool use, structured output, reasoning tier, citation support, prompt-caching size), it picks the cheapest model that satisfies all of them. NIST AI RMF (AI 100-1) MAP 2.x requires documenting each model's capabilities and limitations; the Model Cards convention (Mitchell et al., 2019) formalizes that descriptor — this primitive makes the descriptor actionable. Routing to the cheapest sufficient model is a front-line defense against over-provisioning spend and composes directly with `b.ai.quota`'s `cost-usd` dimension (the chosen descriptor's rate feeds the budget charge); refusing to route a request to a model that cannot satisfy it (missing modality, too-small context window, no tool use) catches a capability mismatch before the inference call burns tokens on a guaranteed-bad result. Cost ranking uses a supplied `costBasis` (`{ inputTokens, outputTokens }`) for real per-call spend, else the sum of the per-1k rates; ties break by model id so the choice is deterministic across calls and nodes. **Added:** *`b.ai.capability.create({ models })` — capability registry + router* — Returns `{ describe, list, register, satisfies, route }`. A descriptor carries `maxContextTokens`, `maxOutputTokens`, `modalitiesIn` / `modalitiesOut` (arrays), `toolUse`, `structuredOutput`, `fineTunable`, `reasoningTier` (`none` / `basic` / `standard` / `advanced`, ordered), `citationSupport`, `promptCachingMaxTokens`, and the cost rates `costPer1kInputTokens` / `costPer1kOutputTokens`. Descriptors are validated + frozen at registration so a typo (negative cost, unknown reasoning tier, non-array modality list) surfaces at config time rather than as a silent mis-route. `describe(modelId)` returns the frozen descriptor; `register(modelId, descriptor)` adds or replaces one at runtime. · *`route({ requirements, fallback?, costBasis? })` — cheapest-satisfying selection* — Collects every model whose descriptor satisfies all requirements, then returns the cheapest (`{ modelId, descriptor, estimatedCost, reason }`). Requirements: `minContextTokens`, `minOutputTokens`, `modalitiesIn` / `modalitiesOut` (model must support every listed modality), `toolUse`, `structuredOutput`, `fineTunable`, `minReasoningTier` (tier ordering — `standard` is met by `standard` or `advanced`), `citationSupport`, `minPromptCachingTokens`. When no model matches, `fallback` (a registered model id) is returned with `reason: "fallback"`, or the call refuses with `aiCapability/no-candidate` if no fallback was supplied. Routing decisions emit `ai/capability-routed` / `ai/capability-fallback` / `ai/capability-no-candidate` through the drop-silent audit chain. · *`satisfies(modelId, requirements)` — precise capability-mismatch reasons* — Returns `{ ok, failures }` where each failure names the `requirement`, the `need`, and what the model `have`s — so a caller surfaces a precise reason (e.g. `minReasoningTier need advanced have basic`) instead of a bare boolean. Use it to explain a routing miss or to gate a request against a specific model before calling it.
+
- v0.12.27 (2026-05-24) — **`b.ai.quota` — per-tenant, per-model AI usage budgets with atomic consume-and-check.** `b.ai.quota.create(opts)` builds an enforcer that caps AI inference usage per `(tenant, model, dimension, period)` and defends OWASP LLM Top 10 2025 LLM10 (Unbounded Consumption) — the class that includes denial-of-wallet, where an attacker drives a high volume of pay-per-use inferences until the bill itself is the attack. Meter by `tokens`, `requests`, `cost-usd`, or `compute-hours` over a calendar-aligned UTC window (`second` through `month`). `consume(tenant, model, amount)` is a single atomic check-and-charge: under the default `hard` enforcement it reserves the amount only if it fits under the ceiling, otherwise it refuses without charging — the limit test and the charge are one indivisible operation, so there is no charge-then-refund window for a concurrent call to observe. The in-memory counter is per-process; multi-node deployments supply an `opts.store` adapter whose `reserve` (an atomic conditional test-and-charge — a Redis Lua script, a SQL `UPDATE ... WHERE used + :amt <= :limit RETURNING used`) and `add` are atomic on the shared backend to enforce one aggregate ceiling across the cluster without false denials under contention. Limit resolution is most-specific-first: `perTenantModel` over `perTenant` over `perModel` over the default `limit`; tenant and model identifiers are percent-encoded into the counter key so a hostile tenant name cannot collide with another tenant's budget. **Added:** *`b.ai.quota.create(opts)` — per-tenant AI usage-budget enforcer* — Returns `{ consume, check, snapshot, reset }` scoped to one `dimension` (`tokens` / `requests` / `cost-usd` / `compute-hours`) and one `period` (`second` / `minute` / `hour` / `day` / `week` (Monday-aligned) / `month` (1st-of-month), all UTC-aligned). `consume(tenant, model, amount, opts?)` returns `{ used, limit, remaining, allowed, exceeded, windowStart, resetsAt, ... }`. 
`check(tenant, model)` is the read-only snapshot. Spin up one enforcer per dimension you meter — a monthly `cost-usd` budget and a per-minute `tokens` burst cap coexist as two `create()` calls sharing one store. Defends OWASP LLM10:2025 Unbounded Consumption / denial-of-wallet; maps to NIST AI RMF (AI 100-1) MANAGE 2.x and EU AI Act Art. 15 (robustness / resource-exhaustion resilience). · *`hard` / `soft` / `warn` enforcement* — `hard` (default) refuses the over-budget call and throws `aiQuota/exceeded` without charging — the rejected reservation is refunded so the counter is untouched. `soft` admits the charge but reports `allowed: false` so the caller decides whether to honor it. `warn` admits and allows (advisory), flagging `exceeded: true`. A per-call `consume(..., { enforcement })` override lets one endpoint soften the mode for a trusted internal caller without a second enforcer. Every over-budget event emits `ai/quota-exceeded` through the drop-silent audit chain (`ai/quota-applied` on success), tagged with the active cluster node id for attribution. · *Cross-node aggregate budgets via `opts.store`* — The default counter is in-memory (per-process). Supply `opts.store` exposing atomic `reserve` / `add` / `get` / `reset` (a Redis Lua script, a shared SQL row) and the ceiling is enforced on the cluster-wide aggregate. `hard` mode goes through `reserve`, an atomic conditional test-and-charge that adds the amount only if it fits — so a concurrent over-budget call cannot transiently inflate the counter and falsely deny a smaller call that should fit. Per-tenant and per-model limit overrides (`perTenant` / `perModel` / `perTenantModel`) are validated at config time so a malformed cap surfaces at boot, not as a silent fall-through to the default.
- v0.12.26 (2026-05-24) — **`b.compliance` posture cascades — `eu-ai-act` + `ca-ab-853` + `cac-genai-label` POSTURE_DEFAULTS + backup encryption refusal.** Three new posture cascades wired into `b.compliance.POSTURE_DEFAULTS` + `KNOWN_POSTURES` + `REGIME_MAP` so operators globally pinning the EU AI Act / California AB-853 / China CAC GenAI postures get the right floors automatically: backupEncryptionRequired:true, auditChainSignedRequired:true, tlsMinVersion:TLSv1.3, requireVacuumAfterErase:true. `b.backup.bundleAdapterStorage` extends the encryption-required posture list to include the three new postures so `cryptoStrategy: "none"` is refused upfront under any of them (parity with HIPAA + PCI-DSS, which the operator surface has carried since v0.12.10). The canonical `eu-ai-act` posture is the production name; the legacy `ai-act` short name stays in KNOWN_POSTURES for back-compat with operators who pinned it pre-v0.12.26. **Added:** *`eu-ai-act` posture cascade — Regulation (EU) 2024/1689* — POSTURE_DEFAULTS entry: backupEncryptionRequired:true (Art. 12 logging + Art. 15 robustness/cybersecurity demand encryption-at-rest for high-risk system training logs), auditChainSignedRequired:true (Art. 12 + Art. 13 audit-chain integrity), tlsMinVersion:TLSv1.3, requireVacuumAfterErase:true (Art. 50(4) synthetic-content provenance — residual EXIF / metadata pointing at the generating model must be cleared on erase). REGIME_MAP entry under jurisdiction:"EU" domain:"ai-governance". KNOWN_POSTURES carries both `eu-ai-act` (canonical) and `ai-act` (legacy short name). · *`ca-ab-853` posture cascade — California AB-853 effective 2026* — Same encryption + audit floor as eu-ai-act; jurisdiction:"US-CA". Model-generated content watermarking + disclosure regime. Operators serving California traffic pin this posture for the AB-853 §22949.91 obligations the v0.12.12 deepfake primitive's crossWalk references. 
· *`cac-genai-label` posture cascade — China CAC GenAI Service Measures* — Synthetic-content labelling per Art. 12 + algorithm filing per Art. 4. Same backup encryption + signed audit chain floor. Operators serving Chinese traffic pin this posture so the bundleAdapterStorage refuses plaintext bundles and the disclosure primitive's `jurisdiction: "cn"` cross-walk produces the right legal-reference array. · *`bundleAdapterStorage` BACKUP_ENCRYPTION_REQUIRED_POSTURES extended* — `hipaa` + `pci-dss` (the v0.12.10 baseline) joined by the three AI postures. `cryptoStrategy: "none"` refused upfront under any of `eu-ai-act` / `ca-ab-853` / `cac-genai-label` with `backup/posture-requires-encryption`. Operators wiring backup storage in a regulated AI deployment now get the same posture-driven gate that the storage primitive has always applied to health + payment data.
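The `satisfies` contract described in the v0.12.28 entry can be illustrated with a minimal standalone sketch. This is not the package's implementation: `TIERS`, `checkRequirements`, and the sample descriptor are hypothetical names chosen for illustration, and only a subset of the requirements is covered. The `{ ok, failures }` shape, the `requirement` / `need` / `have` fields, and the tier ordering are taken from the changelog text.

```javascript
"use strict";

// Ordered reasoning tiers, lowest to highest (ordering taken from the changelog).
const TIERS = ["none", "basic", "standard", "advanced"];

// Hypothetical helper: compare one descriptor against a requirements object
// and collect every unmet requirement instead of returning a bare boolean.
function checkRequirements(descriptor, requirements) {
  const failures = [];
  if (requirements.minContextTokens != null &&
      descriptor.maxContextTokens < requirements.minContextTokens) {
    failures.push({ requirement: "minContextTokens",
      need: requirements.minContextTokens, have: descriptor.maxContextTokens });
  }
  if (requirements.minReasoningTier != null &&
      TIERS.indexOf(descriptor.reasoningTier) < TIERS.indexOf(requirements.minReasoningTier)) {
    failures.push({ requirement: "minReasoningTier",
      need: requirements.minReasoningTier, have: descriptor.reasoningTier });
  }
  // The model must support every listed input modality.
  for (const modality of requirements.modalitiesIn || []) {
    if (!descriptor.modalitiesIn.includes(modality)) {
      failures.push({ requirement: "modalitiesIn", need: modality, have: descriptor.modalitiesIn });
    }
  }
  if (requirements.toolUse === true && descriptor.toolUse !== true) {
    failures.push({ requirement: "toolUse", need: true, have: false });
  }
  return { ok: failures.length === 0, failures };
}

// Hypothetical descriptor for a small text-only model.
const small = { maxContextTokens: 32000, reasoningTier: "basic", modalitiesIn: ["text"], toolUse: false };
const result = checkRequirements(small, { minReasoningTier: "advanced", modalitiesIn: ["text", "image"] });
for (const f of result.failures) {
  console.log(f.requirement + " need " + JSON.stringify(f.need) + " have " + JSON.stringify(f.have));
}
// minReasoningTier need "advanced" have "basic"
// modalitiesIn need "image" have ["text"]
```

Returning the full failure list rather than stopping at the first miss lets a caller report every reason a model was rejected in one pass.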
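The cost ranking described in the v0.12.28 entry (real per-call spend when a `costBasis` is supplied, otherwise the sum of the per-1k rates, with ties broken by model id) can be sketched as follows. This is a standalone illustration under stated assumptions, not the package's code: `models`, `estimateCost`, and `pickCheapest` are hypothetical names, the only requirement modelled is `toolUse`, the tie-break is assumed to be ascending string order, and the sketch returns `null` where the real router would fall back or refuse.

```javascript
"use strict";

const TOKENS_PER_RATE_UNIT = 1000; // rates are quoted per 1k tokens

// Hypothetical fleet; rates are chosen to be exact in binary floating point.
const models = {
  m1: { toolUse: true, costPer1kInputTokens: 1, costPer1kOutputTokens: 0.5 },
  m2: { toolUse: true, costPer1kInputTokens: 1, costPer1kOutputTokens: 0.5 },
  m3: { toolUse: true, costPer1kInputTokens: 0.5, costPer1kOutputTokens: 2 },
  m4: { toolUse: false, costPer1kInputTokens: 0.125, costPer1kOutputTokens: 0.125 },
};

// With a cost basis: estimated spend for that request size.
// Without one: the sum of the two per-1k rates as a coarse proxy.
function estimateCost(descriptor, costBasis) {
  if (costBasis) {
    return (costBasis.inputTokens / TOKENS_PER_RATE_UNIT) * descriptor.costPer1kInputTokens +
           (costBasis.outputTokens / TOKENS_PER_RATE_UNIT) * descriptor.costPer1kOutputTokens;
  }
  return descriptor.costPer1kInputTokens + descriptor.costPer1kOutputTokens;
}

// Filter to the models that satisfy the requirement, then sort by cost and,
// on equal cost, by model id so the result does not depend on insertion order.
function pickCheapest(fleet, needsToolUse, costBasis) {
  const candidates = Object.keys(fleet)
    .filter((id) => !needsToolUse || fleet[id].toolUse === true)
    .map((id) => ({ modelId: id, estimatedCost: estimateCost(fleet[id], costBasis) }));
  candidates.sort((x, y) =>
    x.estimatedCost - y.estimatedCost ||
    (x.modelId < y.modelId ? -1 : x.modelId > y.modelId ? 1 : 0));
  return candidates.length > 0 ? candidates[0] : null;
}

const byRates = pickCheapest(models, true, null);
const bySpend = pickCheapest(models, true, { inputTokens: 8000, outputTokens: 1000 });
console.log(byRates.modelId, byRates.estimatedCost); // m1 1.5
console.log(bySpend.modelId, bySpend.estimatedCost); // m3 6
```

The two calls pick different models from the same fleet: ranked by rate sum, `m1` and `m2` tie at 1.5 and the id order selects `m1`, while an input-heavy cost basis makes `m3` cheaper because its input rate is lower. `m4` is the cheapest model overall but is never chosen when tool use is required.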
package/README.md
CHANGED
@@ -163,6 +163,7 @@ The framework bundles the surface a typical Node app reaches for. Every primitiv
 - **Agent identity** — A2A signed agent-card primitive (Linux Foundation Agentic AI Foundation v1.x, ML-DSA-87) (`b.a2a`)
 - **Content provenance** — C2PA 2.1 + California SB-942 / AB-853 manifest builder for AI-generated media (provider, model id + version, timestamp, content ID, signed) (`b.contentCredentials`)
 - **AI usage quotas** — per-tenant / per-model budgets metered by tokens / requests / cost-usd / compute-hours over calendar-aligned windows, with an atomic conditional reserve (no charge-then-refund race) + hard/soft/warn enforcement and an optional cross-node store; defends OWASP LLM10:2025 unbounded consumption / denial-of-wallet (`b.ai.quota`)
+- **AI capability routing** — model-capability registry (context window / modalities / tool use / reasoning tier / cost rates) + a router that picks the cheapest model satisfying a request's requirements, refusing capability mismatches before the inference call (NIST AI RMF MAP + Model Cards); composes with `b.ai.quota` cost budgets (`b.ai.capability`)
 ### Compliance regimes
 
 - **Posture coordinator** — `b.compliance` cascades operator-declared regime into retention / audit / db / cryptoField via POSTURE_DEFAULTS:
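The "atomic conditional reserve" in the AI usage quotas bullet can be illustrated with a small in-memory sketch. It is an assumption-laden illustration, not the `b.ai.quota` implementation: `createCounter`, `reserve`, `counterKey`, and the `:` separator are hypothetical. It relies on the fact that a synchronous function in a single Node.js process runs to completion, so the limit test and the charge cannot be interleaved with another call; a shared backend would need its own atomic operation, as the changelog notes.

```javascript
"use strict";

// Hypothetical in-memory counter. reserve() tests and charges in one
// synchronous step, so there is no charge-then-refund window to observe.
function createCounter() {
  const used = new Map();
  return {
    // Adds `amount` only if the total stays at or under `limit`.
    reserve(key, amount, limit) {
      const current = used.get(key) || 0;
      if (current + amount > limit) {
        // Refused: the counter is left untouched.
        return { allowed: false, used: current, remaining: limit - current };
      }
      used.set(key, current + amount);
      return { allowed: true, used: current + amount, remaining: limit - (current + amount) };
    },
  };
}

// Percent-encode both parts so an identifier containing the separator
// cannot produce the same key as a different (tenant, model) pair.
function counterKey(tenant, model) {
  return encodeURIComponent(tenant) + ":" + encodeURIComponent(model);
}

const counter = createCounter();
const key = counterKey("acme", "m1");
const first = counter.reserve(key, 800, 1000);
const second = counter.reserve(key, 300, 1000);
const third = counter.reserve(key, 200, 1000);
console.log(first.allowed, first.used);   // true 800
console.log(second.allowed, second.used); // false 800
console.log(third.allowed, third.used);   // true 1000
```

Because the rejected 300-unit call never touches the counter, the later 200-unit call still fits under the ceiling, which is the false-denial case the conditional reserve is meant to avoid.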
package/index.js
CHANGED
@@ -444,6 +444,7 @@ module.exports = {
     modelManifest: require("./lib/ai-model-manifest"),
     disclosure: require("./lib/ai-disclosure"),
     quota: require("./lib/ai-quota"),
+    capability: require("./lib/ai-capability"),
   },
   promisePool: require("./lib/promise-pool"),
   sdNotify: require("./lib/sd-notify"),
package/lib/ai-capability.js
@@ -0,0 +1,482 @@
+"use strict";
+/**
+ * @module b.ai.capability
+ * @nav AI
+ * @title AI capability routing
+ *
+ * @intro
+ * A capability registry + capability-aware router for AI model
+ * fleets. NIST AI RMF (AI 100-1) MAP 2.x requires documenting each
+ * model's capabilities and limitations; the Model Cards convention
+ * (Mitchell et al., 2019) formalizes that descriptor. This module
+ * turns those descriptors into a routing decision: given a set of
+ * requirements (context window, modalities, tool use, reasoning
+ * tier, …), pick the <em>cheapest</em> model in the fleet that
+ * satisfies all of them, or fall back deterministically.
+ *
+ * <code>b.ai.capability.create({ models })</code> builds a registry
+ * from operator-supplied descriptors and returns:
+ *
+ * - <code>describe(modelId)</code> — the frozen descriptor.
+ * - <code>list()</code> — every registered model id.
+ * - <code>register(modelId, descriptor)</code> — add / replace one.
+ * - <code>satisfies(modelId, requirements)</code> —
+ *   <code>{ ok, failures }</code> where each failure names the
+ *   requirement, the need, and what the model has.
+ * - <code>route({ requirements, fallback?, costBasis? })</code> —
+ *   the cheapest satisfying model, or the fallback, or a refusal.
+ *
+ * A descriptor carries: <code>maxContextTokens</code>,
+ * <code>maxOutputTokens</code>, <code>modalitiesIn</code> /
+ * <code>modalitiesOut</code> (arrays — e.g. <code>"text"</code>,
+ * <code>"image"</code>, <code>"audio"</code>, <code>"video"</code>),
+ * <code>toolUse</code>, <code>structuredOutput</code>,
+ * <code>fineTunable</code>, <code>reasoningTier</code>
+ * (<code>"none" | "basic" | "standard" | "advanced"</code>,
+ * ordered), <code>citationSupport</code>,
+ * <code>promptCachingMaxTokens</code>, and the cost rates
+ * <code>costPer1kInputTokens</code> / <code>costPer1kOutputTokens</code>.
+ *
+ * <strong>Routing picks the cheapest match.</strong> When a
+ * <code>costBasis</code> (<code>{ inputTokens, outputTokens }</code>)
+ * is supplied the router estimates the per-call cost and ranks by
+ * it; otherwise it ranks by the sum of the per-1k rates. Ties break
+ * by model id so the choice is deterministic. Routing to the
+ * cheapest sufficient model is the front-line defense against
+ * over-provisioning spend — it composes with
+ * <code>b.ai.quota</code>'s <code>cost-usd</code> dimension, where
+ * the chosen descriptor's rate feeds the budget charge.
+ *
+ * Refusing to route a request to a model that cannot satisfy it
+ * (missing modality, too-small context window, no tool use) catches
+ * a capability mismatch before the inference call burns tokens on a
+ * guaranteed-bad result.
+ *
+ * @card
+ * Capability registry + cheapest-satisfying-model router for AI
+ * model fleets (context / modalities / tool use / reasoning tier /
+ * cost). Composes with b.ai.quota cost budgets.
+ */
+
+var lazyRequire = require("./lazy-require");
+var validateOpts = require("./validate-opts");
+var { defineClass } = require("./framework-error");
+
+var AiCapabilityError = defineClass("AiCapabilityError", { alwaysPermanent: true });
+
+var audit = lazyRequire(function () { return require("./audit"); });
+
+// Ordered reasoning tiers — a requirement of `minReasoningTier:
+// "standard"` is satisfied by "standard" or "advanced", not "basic".
+var REASONING_TIERS = ["none", "basic", "standard", "advanced"];
+
+// Cost rates are quoted per 1000 tokens (industry convention; the
+// descriptor fields are costPer1kInputTokens / costPer1kOutputTokens).
+// Dividing a token count by this rate unit converts a per-1k rate into
+// the per-token multiplier — a rate denominator, not a byte size.
+var COST_RATE_TOKEN_UNIT = 1000; // allow:raw-byte-literal — per-1k-token cost-rate denominator, not a byte count
+
+var DESCRIPTOR_KEYS = [
+  "maxContextTokens", "maxOutputTokens", "modalitiesIn", "modalitiesOut",
+  "toolUse", "structuredOutput", "fineTunable", "reasoningTier",
+  "citationSupport", "promptCachingMaxTokens",
+  "costPer1kInputTokens", "costPer1kOutputTokens", "provider", "version",
+];
+
+var REQUIREMENT_KEYS = [
+  "minContextTokens", "minOutputTokens", "modalitiesIn", "modalitiesOut",
+  "toolUse", "structuredOutput", "fineTunable", "minReasoningTier",
+  "citationSupport", "minPromptCachingTokens",
+];
+
+function _isPositiveInt(n) {
+  return typeof n === "number" && isFinite(n) && n > 0 && Math.floor(n) === n;
+}
+function _isNonNegFinite(n) {
+  return typeof n === "number" && isFinite(n) && n >= 0;
+}
+function _isStringArray(a) {
+  if (!Array.isArray(a)) return false;
+  for (var i = 0; i < a.length; i++) {
+    if (typeof a[i] !== "string" || a[i].length === 0) return false;
+  }
+  return true;
+}
+
+// Normalize + validate one descriptor at registration time so a typo
+// (negative cost, unknown reasoning tier, non-array modality list)
+// surfaces at config time rather than as a silent mis-route.
+function _normalizeDescriptor(modelId, d) {
+  if (!d || typeof d !== "object" || Array.isArray(d)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: descriptor for '" + modelId + "' must be a plain object");
+  }
+  validateOpts(d, DESCRIPTOR_KEYS, "ai.capability descriptor['" + modelId + "']");
+
+  if (!_isPositiveInt(d.maxContextTokens)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.maxContextTokens must be a positive integer");
+  }
+  var maxOut = (d.maxOutputTokens == null) ? d.maxContextTokens : d.maxOutputTokens;
+  if (!_isPositiveInt(maxOut)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.maxOutputTokens must be a positive integer");
+  }
+
+  var modIn = (d.modalitiesIn == null) ? ["text"] : d.modalitiesIn;
+  var modOut = (d.modalitiesOut == null) ? ["text"] : d.modalitiesOut;
+  if (!_isStringArray(modIn) || !_isStringArray(modOut)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.modalitiesIn / modalitiesOut must be arrays of non-empty strings");
+  }
+
+  var tier = (d.reasoningTier == null) ? "standard" : d.reasoningTier;
+  if (REASONING_TIERS.indexOf(tier) === -1) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.reasoningTier must be one of " + REASONING_TIERS.join(" / "));
+  }
+
+  var cachingMax = (d.promptCachingMaxTokens == null) ? 0 : d.promptCachingMaxTokens;
+  var costIn = (d.costPer1kInputTokens == null) ? 0 : d.costPer1kInputTokens;
+  var costOut = (d.costPer1kOutputTokens == null) ? 0 : d.costPer1kOutputTokens;
+  if (!_isNonNegFinite(cachingMax) || !_isNonNegFinite(costIn) || !_isNonNegFinite(costOut)) {
+    throw new AiCapabilityError("aiCapability/bad-descriptor",
+      "ai.capability: '" + modelId + "'.promptCachingMaxTokens / costPer1kInputTokens / " +
+      "costPer1kOutputTokens must be non-negative finite numbers");
+  }
+
+  return Object.freeze({
+    modelId: modelId,
+    maxContextTokens: d.maxContextTokens,
+    maxOutputTokens: maxOut,
+    modalitiesIn: Object.freeze(modIn.slice()),
+    modalitiesOut: Object.freeze(modOut.slice()),
+    toolUse: d.toolUse === true,
+    structuredOutput: d.structuredOutput === true,
+    fineTunable: d.fineTunable === true,
+    reasoningTier: tier,
+    citationSupport: d.citationSupport === true,
+    promptCachingMaxTokens: cachingMax,
+    costPer1kInputTokens: costIn,
+    costPer1kOutputTokens: costOut,
+    provider: (typeof d.provider === "string") ? d.provider : null,
+    version: (typeof d.version === "string") ? d.version : null,
+  });
+}
+
+/**
+ * @primitive b.ai.capability.create
+ * @signature b.ai.capability.create(opts)
+ * @since 0.12.28
+ * @status stable
+ * @compliance soc2
+ * @related b.ai.quota.create, b.ai.modelManifest.build
+ *
+ * Build a capability registry + router from operator-supplied model
+ * descriptors. Returns <code>{ describe, list, register, satisfies,
+ * route }</code>. Pair it with <code>b.ai.quota</code>:
+ * <code>route()</code> picks the cheapest model that meets the
+ * request, and the chosen descriptor's cost rate feeds the
+ * <code>cost-usd</code> budget charge.
+ *
+ * @opts
+ *   {
+ *     models: { // required, ≥ 1 entry
+ *       [modelId: string]: {
+ *         maxContextTokens: number, // required, positive int
+ *         maxOutputTokens?: number, // default: maxContextTokens
+ *         modalitiesIn?: string[], // default: ["text"]
+ *         modalitiesOut?: string[], // default: ["text"]
+ *         toolUse?: boolean, // default: false
+ *         structuredOutput?: boolean, // default: false
+ *         fineTunable?: boolean, // default: false
+ *         reasoningTier?: string, // none|basic|standard|advanced
+ *         citationSupport?: boolean, // default: false
+ *         promptCachingMaxTokens?: number, // default: 0
+ *         costPer1kInputTokens?: number, // default: 0
+ *         costPer1kOutputTokens?: number, // default: 0
+ *         provider?: string,
+ *         version?: string,
+ *       }
+ *     },
+ *     audit?: boolean, // default: true (route decisions)
+ *   }
+ *
+ * @example
+ *   var fleet = b.ai.capability.create({
+ *     models: {
+ *       "haiku": { maxContextTokens: 200000, reasoningTier: "basic",
+ *         costPer1kInputTokens: 0.001, costPer1kOutputTokens: 0.005 },
+ *       "opus": { maxContextTokens: 200000, reasoningTier: "advanced",
+ *         toolUse: true, modalitiesIn: ["text", "image"],
+ *         costPer1kInputTokens: 0.015, costPer1kOutputTokens: 0.075 },
+ *     },
+ *   });
+ *   var pick = fleet.route({
+ *     requirements: { minContextTokens: 100000, toolUse: true,
+ *       modalitiesIn: ["text", "image"] },
+ *     costBasis: { inputTokens: 4000, outputTokens: 500 },
+ *   });
+ *   // → { modelId: "opus", descriptor: {...}, estimatedCost: 0.0975, reason: "cheapest-of-1" }
+ */
+function create(opts) {
+  validateOpts.requireObject(opts, "ai.capability.create", AiCapabilityError);
+  validateOpts(opts, ["models", "audit"], "ai.capability.create");
+
+  if (!opts.models || typeof opts.models !== "object" || Array.isArray(opts.models)) {
+    throw new AiCapabilityError("aiCapability/bad-models",
+      "ai.capability.create: models must be a plain object { modelId: descriptor }");
+  }
+  var ids = Object.keys(opts.models);
+  if (ids.length === 0) {
+    throw new AiCapabilityError("aiCapability/bad-models",
+      "ai.capability.create: models must declare at least one model");
+  }
+
+  var registry = new Map();
+  for (var i = 0; i < ids.length; i++) {
+    registry.set(ids[i], _normalizeDescriptor(ids[i], opts.models[ids[i]]));
+  }
+  var auditOn = opts.audit !== false;
+
+  function _emitAudit(action, outcome, metadata) {
+    if (!auditOn) return;
+    try {
+      audit().safeEmit({ action: action, outcome: outcome, metadata: metadata || {} });
+    } catch (_e) { /* audit best-effort — drop-silent */ }
+  }
+
+  function describe(modelId) {
+    var d = registry.get(modelId);
+    if (!d) {
+      throw new AiCapabilityError("aiCapability/unknown-model",
+        "ai.capability.describe: unknown model '" + modelId + "'");
+    }
+    return d;
+  }
+
+  function list() {
+    return Array.from(registry.keys());
+  }
+
+  function register(modelId, descriptor) {
+    validateOpts.requireNonEmptyString(modelId,
+      "ai.capability.register: modelId", AiCapabilityError, "aiCapability/bad-model");
+    registry.set(modelId, _normalizeDescriptor(modelId, descriptor));
+    return registry.get(modelId);
+  }
+
+  // Returns { ok, failures } — every unmet requirement names what was
+  // needed and what the model has, so a caller can surface a precise
+  // capability-mismatch reason instead of a bare boolean.
+  function _evaluate(descriptor, requirements) {
+    var failures = [];
+    function fail(requirement, need, have) {
+      failures.push({ requirement: requirement, need: need, have: have });
+    }
+    if (requirements.minContextTokens != null &&
+        descriptor.maxContextTokens < requirements.minContextTokens) {
+      fail("minContextTokens", requirements.minContextTokens, descriptor.maxContextTokens);
+    }
+    if (requirements.minOutputTokens != null &&
+        descriptor.maxOutputTokens < requirements.minOutputTokens) {
+      fail("minOutputTokens", requirements.minOutputTokens, descriptor.maxOutputTokens);
+    }
+    if (requirements.modalitiesIn != null) {
+      for (var a = 0; a < requirements.modalitiesIn.length; a++) {
+        if (descriptor.modalitiesIn.indexOf(requirements.modalitiesIn[a]) === -1) {
+          fail("modalitiesIn", requirements.modalitiesIn[a], descriptor.modalitiesIn);
+        }
+      }
+    }
+    if (requirements.modalitiesOut != null) {
+      for (var b = 0; b < requirements.modalitiesOut.length; b++) {
+        if (descriptor.modalitiesOut.indexOf(requirements.modalitiesOut[b]) === -1) {
+          fail("modalitiesOut", requirements.modalitiesOut[b], descriptor.modalitiesOut);
+        }
+      }
+    }
+    if (requirements.toolUse === true && descriptor.toolUse !== true) {
+      fail("toolUse", true, false);
+    }
+    if (requirements.structuredOutput === true && descriptor.structuredOutput !== true) {
+      fail("structuredOutput", true, false);
+    }
+    if (requirements.fineTunable === true && descriptor.fineTunable !== true) {
+      fail("fineTunable", true, false);
+    }
+    if (requirements.citationSupport === true && descriptor.citationSupport !== true) {
+      fail("citationSupport", true, false);
+    }
+    if (requirements.minReasoningTier != null &&
+        REASONING_TIERS.indexOf(descriptor.reasoningTier) <
+        REASONING_TIERS.indexOf(requirements.minReasoningTier)) {
+      fail("minReasoningTier", requirements.minReasoningTier, descriptor.reasoningTier);
+    }
+    if (requirements.minPromptCachingTokens != null &&
+        descriptor.promptCachingMaxTokens < requirements.minPromptCachingTokens) {
+      fail("minPromptCachingTokens", requirements.minPromptCachingTokens, descriptor.promptCachingMaxTokens);
+    }
+    return { ok: failures.length === 0, failures: failures };
+  }
+
|
|
323
|
+
function _validateRequirements(requirements) {
|
|
324
|
+
if (requirements == null) return {};
|
|
325
|
+
if (typeof requirements !== "object" || Array.isArray(requirements)) {
|
|
326
|
+
throw new AiCapabilityError("aiCapability/bad-requirements",
|
|
327
|
+
"ai.capability: requirements must be a plain object");
|
|
328
|
+
}
|
|
329
|
+
validateOpts(requirements, REQUIREMENT_KEYS, "ai.capability requirements");
|
|
330
|
+
if (requirements.minReasoningTier != null &&
|
|
331
|
+
REASONING_TIERS.indexOf(requirements.minReasoningTier) === -1) {
|
|
332
|
+
throw new AiCapabilityError("aiCapability/bad-requirements",
|
|
333
|
+
"ai.capability: minReasoningTier must be one of " + REASONING_TIERS.join(" / "));
|
|
334
|
+
}
|
|
335
|
+
if (requirements.modalitiesIn != null && !_isStringArray(requirements.modalitiesIn)) {
+      throw new AiCapabilityError("aiCapability/bad-requirements",
+        "ai.capability: requirements.modalitiesIn must be an array of non-empty strings");
+    }
+    if (requirements.modalitiesOut != null && !_isStringArray(requirements.modalitiesOut)) {
+      throw new AiCapabilityError("aiCapability/bad-requirements",
+        "ai.capability: requirements.modalitiesOut must be an array of non-empty strings");
+    }
+    // Numeric minimums are compared with `<` against the descriptor; a
+    // non-numeric value (NaN, "128k", a bad parse) makes that compare
+    // false and SILENTLY satisfies the requirement, so an undersized
+    // model could be selected. Reject non-finite / negative here so a
+    // malformed requirement fails fast instead of fail-open.
+    var numericMins = ["minContextTokens", "minOutputTokens", "minPromptCachingTokens"];
+    for (var ni = 0; ni < numericMins.length; ni++) {
+      var nk = numericMins[ni];
+      if (requirements[nk] != null && !_isNonNegFinite(requirements[nk])) {
+        throw new AiCapabilityError("aiCapability/bad-requirements",
+          "ai.capability: requirements." + nk + " must be a non-negative finite number");
+      }
+    }
+    // Boolean opt-in requirements are matched with `=== true`; a
+    // non-boolean (truthy 1, "false") would silently fail to require
+    // the capability. Reject non-booleans so the intent is explicit.
+    var booleanReqs = ["toolUse", "structuredOutput", "fineTunable", "citationSupport"];
+    for (var bi = 0; bi < booleanReqs.length; bi++) {
+      var bk = booleanReqs[bi];
+      if (requirements[bk] != null && typeof requirements[bk] !== "boolean") {
+        throw new AiCapabilityError("aiCapability/bad-requirements",
+          "ai.capability: requirements." + bk + " must be a boolean");
+      }
+    }
+    return requirements;
+  }
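The fail-open behavior that the comments above guard against can be reproduced in isolation. The sketch below is illustrative only: `isNonNegFinite` and `naiveFailsMinimum` are hypothetical stand-ins (the bodies of the module's own `_isNonNegFinite` and `_evaluate` are not part of this hunk), and the token counts are made up.

```javascript
// Hypothetical stand-in for the module's _isNonNegFinite; its real body is not shown in this diff.
function isNonNegFinite(v) {
  return typeof v === "number" && Number.isFinite(v) && v >= 0;
}

// Naive minimum check: the model is rejected only when its value is below the minimum.
function naiveFailsMinimum(descriptorValue, minimum) {
  return descriptorValue < minimum;
}

const smallModelWindow = 8000; // illustrative context window, in tokens

// A well-formed requirement rejects the undersized model.
const wellFormed = naiveFailsMinimum(smallModelWindow, 128000);

// Malformed requirements: any `<` comparison involving NaN is false,
// and "128k" coerces to NaN, so the undersized model is NOT rejected.
const withNaN = naiveFailsMinimum(smallModelWindow, NaN);
const withString = naiveFailsMinimum(smallModelWindow, "128k");

// Validating up front turns the silent pass into an explicit rejection.
const accepted = [NaN, "128k", -1, Infinity].map(isNonNegFinite);

console.log({ wellFormed, withNaN, withString, accepted });
```

Under these assumptions, only the well-formed requirement rejects the small model; the two malformed ones let it through, which is the case the validation loop is meant to close off.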
+
+  function satisfies(modelId, requirements) {
+    return _evaluate(describe(modelId), _validateRequirements(requirements));
+  }
+
+  // Per-call cost estimate. With a costBasis the estimate is the
+  // real per-call spend (input + output tokens at the model's rates);
+  // without one it is the sum of the per-1k rates — a stable proxy
+  // for "cheaper model" when the caller hasn't sized the request.
+  function _estimateCost(descriptor, costBasis) {
+    if (costBasis) {
+      var inTok = _isNonNegFinite(costBasis.inputTokens) ? costBasis.inputTokens : 0;
+      var outTok = _isNonNegFinite(costBasis.outputTokens) ? costBasis.outputTokens : 0;
+      return (inTok / COST_RATE_TOKEN_UNIT) * descriptor.costPer1kInputTokens +
+        (outTok / COST_RATE_TOKEN_UNIT) * descriptor.costPer1kOutputTokens;
+    }
+    return descriptor.costPer1kInputTokens + descriptor.costPer1kOutputTokens;
+  }
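To see why a `costBasis` can change which model ranks as cheapest, here is a small worked example. It assumes `COST_RATE_TOKEN_UNIT` is 1000 (suggested by the `costPer1k...` field names, but the constant's definition is not shown in this hunk). The two descriptors and their rates are invented for illustration.

```javascript
// Assumption: the token unit behind the "per 1k" rates is 1000.
const COST_RATE_TOKEN_UNIT = 1000;

// Standalone re-statement of the estimate shown in the diff, for illustration only.
function estimateCost(descriptor, costBasis) {
  if (costBasis) {
    // Absent or invalid fields count as 0 tokens on that side.
    const valid = (v) => Number.isFinite(v) && v >= 0;
    const inTok = valid(costBasis.inputTokens) ? costBasis.inputTokens : 0;
    const outTok = valid(costBasis.outputTokens) ? costBasis.outputTokens : 0;
    return (inTok / COST_RATE_TOKEN_UNIT) * descriptor.costPer1kInputTokens +
      (outTok / COST_RATE_TOKEN_UNIT) * descriptor.costPer1kOutputTokens;
  }
  // No basis: fall back to the sum of the two per-1k rates.
  return descriptor.costPer1kInputTokens + descriptor.costPer1kOutputTokens;
}

// Hypothetical descriptors: A has cheap input, B has cheap output.
const modelA = { costPer1kInputTokens: 0.5, costPer1kOutputTokens: 1.5 };
const modelB = { costPer1kInputTokens: 1.0, costPer1kOutputTokens: 0.75 };

// Without a basis, B looks cheaper (1.75 vs 2).
const noBasisA = estimateCost(modelA, null);
const noBasisB = estimateCost(modelB, null);

// For an input-heavy call, A is cheaper (5.75 vs 10.375).
const basis = { inputTokens: 10000, outputTokens: 500 };
const withBasisA = estimateCost(modelA, basis);
const withBasisB = estimateCost(modelB, basis);

console.log({ noBasisA, noBasisB, withBasisA, withBasisB });
```

The ranking flips between the two modes, which is presumably why the router accepts a `costBasis` when the caller knows the request's approximate size.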
+
+  function route(routeOpts) {
+    routeOpts = routeOpts || {};
+    validateOpts(routeOpts, ["requirements", "fallback", "costBasis"], "ai.capability.route");
+    var requirements = _validateRequirements(routeOpts.requirements);
+    var costBasis = null;
+    if (routeOpts.costBasis != null) {
+      if (typeof routeOpts.costBasis !== "object" || Array.isArray(routeOpts.costBasis)) {
+        throw new AiCapabilityError("aiCapability/bad-requirements",
+          "ai.capability.route: costBasis must be a plain object { inputTokens, outputTokens }");
+      }
+      validateOpts(routeOpts.costBasis, ["inputTokens", "outputTokens"],
+        "ai.capability.route costBasis");
+      // A malformed costBasis field silently underprices a candidate
+      // and biases the "cheapest" choice toward the wrong model — fail
+      // fast instead. An absent field is fine (treated as 0 tokens on
+      // that side); a present-but-non-numeric field is rejected.
+      var cbFields = ["inputTokens", "outputTokens"];
+      for (var ci = 0; ci < cbFields.length; ci++) {
+        var ck = cbFields[ci];
+        if (routeOpts.costBasis[ck] != null && !_isNonNegFinite(routeOpts.costBasis[ck])) {
+          throw new AiCapabilityError("aiCapability/bad-requirements",
+            "ai.capability.route: costBasis." + ck + " must be a non-negative finite number");
+        }
+      }
+      costBasis = routeOpts.costBasis;
+    }
+
+    // Collect every satisfying model, then pick the cheapest. Tie
+    // break by model id (lexicographic) so the choice is deterministic
+    // across calls and across nodes.
+    var candidates = [];
+    var modelIds = Array.from(registry.keys());
+    for (var i = 0; i < modelIds.length; i++) {
+      var d = registry.get(modelIds[i]);
+      if (_evaluate(d, requirements).ok) {
+        candidates.push({ modelId: modelIds[i], descriptor: d, cost: _estimateCost(d, costBasis) });
+      }
+    }
+    candidates.sort(function (x, y) {
+      if (x.cost !== y.cost) return x.cost - y.cost;
+      return x.modelId < y.modelId ? -1 : (x.modelId > y.modelId ? 1 : 0);
+    });
+
+    if (candidates.length > 0) {
+      var pick = candidates[0];
+      _emitAudit("ai/capability-routed", "allowed", {
+        modelId: pick.modelId, candidateCount: candidates.length,
+        estimatedCost: pick.cost, requirements: requirements,
+      });
+      return {
+        modelId: pick.modelId,
+        descriptor: pick.descriptor,
+        estimatedCost: pick.cost,
+        reason: "cheapest-of-" + candidates.length,
+      };
+    }
+
+    // No model satisfies the requirements.
+    if (routeOpts.fallback != null) {
+      var fb = registry.get(routeOpts.fallback);
+      if (!fb) {
+        throw new AiCapabilityError("aiCapability/unknown-model",
+          "ai.capability.route: fallback '" + routeOpts.fallback + "' is not a registered model");
+      }
+      _emitAudit("ai/capability-fallback", "allowed", {
+        modelId: routeOpts.fallback, requirements: requirements,
+      });
+      return {
+        modelId: routeOpts.fallback,
+        descriptor: fb,
+        estimatedCost: _estimateCost(fb, costBasis),
+        reason: "fallback",
+      };
+    }
+
+    _emitAudit("ai/capability-no-candidate", "denied", { requirements: requirements });
+    throw new AiCapabilityError("aiCapability/no-candidate",
+      "ai.capability.route: no registered model satisfies the requirements " +
+      "and no fallback was supplied");
+  }
+
+  return {
+    describe: describe,
+    list: list,
+    register: register,
+    satisfies: satisfies,
+    route: route,
+  };
+}
+
+module.exports = {
+  create: create,
+  REASONING_TIERS: REASONING_TIERS,
+  AiCapabilityError: AiCapabilityError,
+};
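The decision order in `route` (cheapest satisfying model, then fallback, then error) can be sketched without the rest of the module. This is a simplified illustration, not the package's implementation: `routeSketch`, the `satisfies` predicate argument, the model ids, and the precomputed `cost` fields are all hypothetical, and auditing and validation are omitted.

```javascript
// Simplified sketch of the routing decision. `satisfies` is a caller-supplied
// predicate standing in for the module's internal requirement evaluation.
function routeSketch(models, satisfies, fallbackId) {
  // filter() returns a new array, so sorting does not reorder `models`.
  const candidates = models.filter(satisfies).sort((x, y) => {
    if (x.cost !== y.cost) return x.cost - y.cost; // cheapest first
    // Tie-break by id so the pick does not depend on registration order.
    return x.modelId < y.modelId ? -1 : (x.modelId > y.modelId ? 1 : 0);
  });
  if (candidates.length > 0) {
    return { modelId: candidates[0].modelId, reason: "cheapest-of-" + candidates.length };
  }
  if (fallbackId != null) {
    const known = models.some((m) => m.modelId === fallbackId);
    if (!known) throw new Error("fallback '" + fallbackId + "' is not a registered model");
    return { modelId: fallbackId, reason: "fallback" };
  }
  throw new Error("no registered model satisfies the requirements");
}

// Hypothetical fleet: two equally priced tool-capable models and a cheaper one without tools.
const models = [
  { modelId: "model-b", cost: 2, toolUse: true },
  { modelId: "model-a", cost: 2, toolUse: true },
  { modelId: "model-c", cost: 1, toolUse: false },
];
const needsTools = (m) => m.toolUse === true;

const tie = routeSketch(models, needsTools);
const reversed = routeSketch(models.slice().reverse(), needsTools);
const viaFallback = routeSketch(models, () => false, "model-c");

console.log(tie, reversed, viaFallback);
```

As in the diff, the fallback is returned without being checked against the requirements, so a caller may want to treat `reason: "fallback"` as a signal that the chosen model is not known to satisfy them.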
package/package.json
CHANGED
package/sbom.cdx.json
CHANGED
@@ -2,10 +2,10 @@
   "$schema": "http://cyclonedx.org/schema/bom-1.5.schema.json",
   "bomFormat": "CycloneDX",
   "specVersion": "1.5",
-  "serialNumber": "urn:uuid:
+  "serialNumber": "urn:uuid:2bff79a1-ab38-4b20-8cdc-37b5e80872a3",
   "version": 1,
   "metadata": {
-    "timestamp": "2026-05-
+    "timestamp": "2026-05-24T16:44:14.267Z",
     "lifecycles": [
       {
         "phase": "build"
@@ -19,14 +19,14 @@
       }
     ],
     "component": {
-      "bom-ref": "@blamejs/core@0.12.
+      "bom-ref": "@blamejs/core@0.12.28",
       "type": "application",
       "name": "blamejs",
-      "version": "0.12.
+      "version": "0.12.28",
       "scope": "required",
       "author": "blamejs contributors",
       "description": "The Node framework that owns its stack.",
-      "purl": "pkg:npm/%40blamejs/core@0.12.
+      "purl": "pkg:npm/%40blamejs/core@0.12.28",
       "properties": [],
       "externalReferences": [
         {
@@ -54,7 +54,7 @@
   "components": [],
   "dependencies": [
     {
-      "ref": "@blamejs/core@0.12.
+      "ref": "@blamejs/core@0.12.28",
       "dependsOn": []
     }
   ]